The lost art of XML

https://marcosmagueta.com/blog/the-lost-art-of-xml/
36•Curiositry•1h ago

Comments

shadowgovt•1h ago
XML was abandoned because we realized bandwidth costs money and while it was too late to do anything about how verbose HTML is, we didn't have to repeat the mistake with our data transfer protocols.

Even with zipped payloads, it's just way unnecessarily chatty without being more readable.

_heimdall•1h ago
That doesn't match my memory, though it's been a while now!

I remember the arguments largely revolving around verbosity and the prevalence of JSON use in browsers.

That doesn't mean bandwidth wasn't a consideration, but I mostly remember hearing devs complain about how verbose or difficult to work with XML was.

johngossman•53m ago
Your memory is correct. Once compression was applied, the size on the wire was mostly a wash. Parsing costs were often greater but that's at the endpoints.
voidfunc•1h ago
OK, but XML is a pretty solid format for a lot of other stuff that doesn't necessarily need network transmission.
cosmotic•1h ago
The article addresses this.
howdyhowdyhowdy•55m ago
if bandwidth was a concern, JSON was a poor solution. XML compresses nicely and efficiently. Yes it can be verbose to the human eyes, but I don't know if bandwidth is the reason it's not used more often.
adgjlsfhk1•2m ago
JSON absolutely isn't perfect, but it's a spec that you can explain in ~5 minutes, mirrors common PL syntax for Dict/Array, and is pretty much superior to XML in every way.
_heimdall•1h ago
This is a debate I've had many times. XML, and REST, are extremely useful for certain types of use cases that you quite often run into online.

The industry abandoned both in favor of JSON and RPC for speed and perceived DX improvements, and because for a period of time everyone was in fact building only against their own servers.

There are plenty of examples over the last two decades of us having to reinvent solutions to the same problems that REST solved way back then though. MCP is the latest iteration of trying to shoehorn schemas and self-documenting APIs into a sea of JSON RPC.

striking•1h ago
I tried using XML on a lark the other day and realized that XSDs are actually somewhat load bearing. It's difficult to map data in XML to objects in your favorite programming language without the schema being known beforehand as lists of a single element are hard to distinguish from just a property of the overall object.

Maybe this is okay if you know your schema beforehand and are willing to write an XSD. My usecase relied on not knowing the schema. Despite my excitement to use a SAX-style parser, I tucked my tail between my legs and switched back to JSONL. Was I missing something?
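
The single-element-list ambiguity described above can be reproduced in a few lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `naive_to_dict` helper and the `<order>`/`<item>` documents are hypothetical and only illustrate what a schema-less mapper has to guess at.

```python
import xml.etree.ElementTree as ET


def naive_to_dict(elem):
    """Schema-less mapping: a repeated child tag becomes a list, a single one does not."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        value = naive_to_dict(child)
        if child.tag in result:
            # Second occurrence of the tag: promote the stored value to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# Same logical shape (an order holding items), different resulting types.
one = naive_to_dict(ET.fromstring("<order><item>apple</item></order>"))
two = naive_to_dict(ET.fromstring("<order><item>apple</item><item>pear</item></order>"))

print(one)  # 'item' maps to a plain string
print(two)  # 'item' maps to a list of strings
```

Without a schema saying that `item` may repeat, the mapper cannot tell that the first document holds a list of length one, so downstream code has to handle both shapes.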

mkozlows•58m ago
XML was designed as a document format, not a data structure serialization format. You're supposed to parse it into a DOM or similar format, not a bunch of strongly-typed objects. You definitely need some extra tooling if you're trying to do the latter, and yes, that's one of XSD's purposes.
froh•41m ago
that's underselling xml. xml is explicitly meant for data serialization and exchange, xsd reflects that, and it's the reason for jaxb Java xml binding tooling.

don't get me wrong: Json is superior in many aspects, xml is utterly overengineered.

but xml absolutely was _meant_ for data exchange, machine to machine.

mkozlows•31m ago
No. That use case was grafted onto it later. You can look at the original 1998 XML 1.0 spec first edition to see what people were saying at the time: https://www.w3.org/TR/1998/REC-xml-19980210#sec-origin-goals

Here's the bullet point from that verbatim:

  The design goals for XML are:

    XML shall be straightforwardly usable over the Internet.
    XML shall support a wide variety of applications.
    XML shall be compatible with SGML.
    It shall be easy to write programs which process XML documents.
    The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
    XML documents should be human-legible and reasonably clear.
    The XML design should be prepared quickly.
    The design of XML shall be formal and concise.
    XML documents shall be easy to create.
    Terseness in XML markup is of minimal importance.
Or heck, even more concisely from the abstract: "The Extensible Markup Language (XML) is a subset of SGML that is completely described in this document. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML."

It's always talking about documents. It was a way to serve up marked-up documents that didn't depend on using the specific HTML tag vocabulary. Everything else happened to it later, and was a bad idea.

mkozlows•30m ago
And as for JAXB, it was released in 2003, well into XML's decadent period. The original Java APIs for XML parsing were SAX and DOM, both of which are tag and document oriented.
zarzavat•50m ago
You have to use the right tool for the job.

XML is extensible markup, i.e. it's like HTML that can be applied to tasks outside of representing web pages. It's designed to be written by hand. It has comments! A good use for XML would be declaring a native UI: it's not HTML but it's like HTML.

JSON is a plain text serialization format. It's designed to be generated and consumed by computers whilst being readable by humans.

Neither is a configuration language but both have been abused as one.

froh•45m ago
there were tools that derive the schema from sample data

and relaxng is a human friendly schema syntax that has transformers from and to xsd.

g947o•1h ago
Is there anything new on this topic that has never been said before in 1000 other articles posted here?

I didn't see any.

rerdavies•34m ago
What's new is that they WANT to revert to the horror of XML. :-P
mkozlows•1h ago
This is performance art, right? The very first bullet point it starts with is extolling the merits of XSD. Even back in the day when XML was huge, XSD was widely recognized as a monstrosity and a boondoggle -- the real XMLheads were trying to make RELAX NG happen, but XSD got jammed through because it was needed for all those monstrous WS-* specs.

XML did some good things for its day, but no, we abandoned it for very good reasons.

froh•53m ago
xslt was a stripped down dsssl in xml syntax.

dsssl was the scheme based domain specific "document style semantics and specification language"

the syntax change was in the era of general lisp syntax bashing.

but to xml syntax? really? that was so surreal to me.

in_a_society•55m ago
Smells like an article from someone that didn’t really USE the XML ecosystem.

First, there is modeling ambiguity, too many ways to represent the same data structure. Which means you can’t parse into native structs but instead into a heavy DOM object and it sucks to interact with it.

Then, schemas sound great, until you run into DTD, XSD, and RelaxNG. Relax only exists because XSD is pretty much incomprehensible.

Then let’s talk about entity escaping and CDATA. And how you break entire parsers because CDATA is a separate incantation on the DOM.

And in practice, XML is always over engineered. It’s the AbstractFactoryProxyBuilder of data formats. SOAP and WSDL are great examples of this, vs looking at a JSON response and simply understanding what it is.

I worked with XML and all the tooling around it for a long time. Zero interest in going back. It’s not the angle brackets or the serialization efficiency. It’s all of the above brain damage.
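
To make the modeling-ambiguity point concrete, here is a small sketch with Python's standard `xml.etree.ElementTree`. The `<user>` documents are made up for illustration; they show three equally valid ways to encode the same value, each needing a different access path.

```python
import xml.etree.ElementTree as ET

# Three hypothetical encodings of "a user whose name is Ada".
as_attribute = ET.fromstring('<user name="Ada"/>')
as_child = ET.fromstring("<user><name>Ada</name></user>")
as_text = ET.fromstring("<user>Ada</user>")

# Each encoding is read through a different API call.
names = [
    as_attribute.get("name"),   # attribute lookup
    as_child.findtext("name"),  # text of a child element
    as_text.text,               # text of the element itself
]
print(names)
```

A consumer that does not know which convention the producer picked has to probe for all three, which is one reason generic mapping into native structs is awkward.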

mkozlows•44m ago
The part where it favorably mentioned namespaces also blew my mind. Namespaces were a constant pain point!
kenforthewin•51m ago
> This is insanity masquerading as pragmatism.

> This is not engineering. This is fashion masquerading as technical judgment.

The boring explanation is that AI wrote this. The more interesting theory is that folks are beginning to adopt the writing quirks of AI en masse.

kennethallen•41m ago
The fundamental reason JSON won over XML is that JSON maps exactly to universal data structures (lists and string-keyed maps) and XML does not.
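
As a rough illustration of that mapping, the sketch below uses Python's standard `json` module; the payload is invented for the example. Parsing yields plain dicts, lists, strings, and numbers with no schema and no intermediate tree.

```python
import json

# Hypothetical payload; the keys and values are illustrative only.
payload = '{"user": "Ada", "tags": ["math"], "age": 36}'

data = json.loads(payload)

# The parsed value is already made of native types.
print(type(data).__name__)          # dict
print(type(data["tags"]).__name__)  # list, even with a single element
print(data["age"] + 1)

# Serializing and parsing again gives back an equal structure.
round_tripped = json.loads(json.dumps(data))
```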
lighthouse1212•17m ago
XML was designed for documents; JSON for data structures. The 'lost art' framing implies we forgot something valuable, but what actually happened is we stopped using a document format for data serialization. That's not forgetting - that's learning. XML is still the right choice for its original domain (markup, documents with mixed content). It was never the right choice for API payloads and config files.
com2kid•14m ago
I remember spending hours just trying to properly define the XML schema I wanted to use.

Then if there were any problems in my XML, trying to decipher horrible errors to determine what I did wrong.

The docs sucked and were "enterprise grade", the examples sucked (either too complicated or too simple), and the tooling sucked.

I suspect it would be fine nowadays with LLMs to help, but back when it existed, XML was a huge hassle.

I once worked on a robotics project where a full 50% of the CPU was used for XML serialization and parsing. Made it hard to actually have the robot do anything. XML is violently wordy and parsing strings is expensive.

acabal•9m ago
XML lost because 1) the existence of attributes means a document cannot be automatically mapped to a basic language data structure like an array of strings, and 2) namespaces are an unmitigated hell to work with. Even just declaring a default namespace and doing nothing else immediately makes your day 10x harder.

These items make XML deeply tedious and annoying to ingest and manipulate. Plus, some major XML libraries, like lxml in Python, are extremely unintuitive in their implementation of DOM structures and manipulation. If ingesting and manipulating your markup language feels like an endless trudge through a fiery wasteland then don't be surprised when a simpler, more ergonomic alternative wins, even if its feature set is strictly inferior. And that's exactly what happened.

I say this having spent the last 10 years struggling with lxml specifically, and my entire 25 year career dealing with XML in some shape or form. I still routinely throw up my hands in frustration when having to use Python tooling to do what feels like what should be even the most basic XML task. Though xpath is nice.
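
The default-namespace pain is easy to show. The sketch below uses the standard library's `xml.etree.ElementTree` rather than lxml, and the tiny `<feed>` documents are made up for illustration; the prefix `a` in the lookup table is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

plain = ET.fromstring("<feed><title>Hello</title></feed>")
namespaced = ET.fromstring(f'<feed xmlns="{ATOM}"><title>Hello</title></feed>')

# Without a namespace, the obvious lookup works.
plain_hit = plain.find("title")

# With a default namespace declared, the same lookup silently finds nothing,
# because the element's real name is now "{http://www.w3.org/2005/Atom}title".
naive_hit = namespaced.find("title")

# The query has to carry the namespace, here through a prefix mapping.
qualified_hit = namespaced.find("a:title", {"a": ATOM})

print(plain_hit.text, naive_hit, qualified_hit.text)
```

Nothing about the document's visible structure changed, yet every query against it has to be rewritten.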

culebron21•8m ago
XML was a product of its time, when after almost 20 years of CPUs rapidly getting quicker, we contemplated that the size of data wouldn't matter, and data types wouldn't matter (hence XML doesn't have them, but after that JSON got them back) -- we expected languages with weak type systems to dominate forever, and that we would be working and thinking levels above all this, abstractly, and so on.

I remember XML proponents back then argued that it allows semantics -- although it was never clear how a non-human would understand and process it.

The funny thing about namespaces is that the prefix, by the XML docs, should be meaningless -- instead you should look at the URL of the namespace. It's like if we decide that in each document we may decide differently what "duck" means: it may mean bear, or fox, or whatever. (Or, probably we'd always mean bear, but by our wish, in one document could call it wolf, or capybara in another.) It feels like mathematical concepts -- coordinate spaces, numeric spaces with different number 1 and base space vectors -- applied to HTML. It may be useful in rare cases. But few can wrap their heads around it, and right from the start, most tools worked only with exactly named prefixes, and everyone had to follow this way.
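
The "prefix is meaningless" rule can be seen directly in a namespace-aware parser. This is a small sketch with Python's standard `xml.etree.ElementTree`; the `example.com` URIs and the prefixes are invented for the example.

```python
import xml.etree.ElementTree as ET

# Different prefixes, same namespace URI: these are the same element name.
doc_a = ET.fromstring('<x:duck xmlns:x="http://example.com/animals"/>')
doc_b = ET.fromstring('<bear:duck xmlns:bear="http://example.com/animals"/>')

# Same prefix as doc_a, different URI: this is a different element name.
doc_c = ET.fromstring('<x:duck xmlns:x="http://example.com/other"/>')

# ElementTree drops the prefix and stores "{uri}localname".
print(doc_a.tag)
print(doc_b.tag)
print(doc_c.tag)
```

Tools that match on the literal prefix string would treat `doc_a` and `doc_c` as the same and `doc_a` and `doc_b` as different, which is the opposite of what the namespace rules say.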

stmw•5m ago
There were efforts to make XML 1. more ergonomic and 2. more performant, and while (2) was largely successful, (1) never got there, unfortunately - but see https://github.com/yaml/sml-dev-archive for some history of just one of the discussions (sml-dev mailing list).
