Do you hate XML? (2010)

https://sigfrid-lundberg.se/entries/2010/07/hate_xml/

24•theanonymousone•2h ago

Comments

kccqzy•57m ago

In my opinion, the reason people hate XML is because of what M signifies: it is a markup language and most of the time we don’t need a markup language. Markup languages are great for rich text documents. They are just not a good fit for representing data. The markup-nature of XML introduces unnecessary choice in whether to use an attribute or a child element to represent data; for HTML such ambiguity doesn’t actually exist but for data it does. Consider this piece of XML from the Python docs:

    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>

Why is the country name an attribute but not the rank? Why is all the information about neighbors in attributes rather than in child elements?
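A minimal sketch of how that choice leaks into the reading code, assuming Python's standard `xml.etree.ElementTree` and the snippet above (the variable names are only illustrative): attributes and child elements need different access paths even though both just hold data.

```python
import xml.etree.ElementTree as ET

XML = """\
<country name="Liechtenstein">
    <rank>1</rank>
    <year>2008</year>
    <gdppc>141100</gdppc>
    <neighbor name="Austria" direction="E"/>
    <neighbor name="Switzerland" direction="W"/>
</country>
"""

country = ET.fromstring(XML)

# Attribute: looked up directly on the element.
name = country.get("name")

# Child element: located by tag, then its text is converted by hand.
rank = int(country.findtext("rank"))

# Repeated children whose data lives entirely in attributes.
neighbors = [(n.get("name"), n.get("direction")) for n in country.iter("neighbor")]

print(name, rank, neighbors)
```

The reader has to know, field by field, which of the two styles the document author picked.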

Furthermore, parsing JSON or YAML gives you an AST that consists of basic data types like lists and dictionaries. Parsing XML gives you an AST that requires a lot more effort to turn into data in your domain. Even on the web, very few people like to use the verbose XML DOM API (childNodes, nodeType, getElementsByTagName et al.); it is basically unheard of for anyone to use it outside the web, such as in Python, even though the DOM API has been in the Python standard library since forever (see https://github.com/python/cpython/blob/3.14/Lib/xml/dom/mini... for example).
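To make the comparison concrete, here is a rough sketch using only the standard library (`json` and `xml.dom.minidom`). The two small documents are made up for illustration; the point is how much hand-written walking the DOM side needs to reach the same plain data.

```python
import json
from xml.dom import minidom

doc_json = '{"name": "Liechtenstein", "rank": 1, "neighbors": ["Austria", "Switzerland"]}'
doc_xml = (
    '<country name="Liechtenstein"><rank>1</rank>'
    '<neighbor name="Austria"/><neighbor name="Switzerland"/></country>'
)

# JSON: the parse result is already dicts, lists, strings and ints.
from_json = json.loads(doc_json)

# XML via the DOM API: walk nodes and rebuild the same shape by hand.
root = minidom.parseString(doc_xml).documentElement
rank_node = root.getElementsByTagName("rank")[0]
from_xml = {
    "name": root.getAttribute("name"),
    "rank": int(rank_node.firstChild.data),  # text lives in a child Text node
    "neighbors": [n.getAttribute("name") for n in root.getElementsByTagName("neighbor")],
}

print(from_json == from_xml)
```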

actionfromafar•52m ago

YAML made me not hate XML.

SoftTalker•45m ago

Agreed. Among text based formats, nothing I hate more than YAML.

tonyedgecombe•33m ago

That and all the proprietary formats we had to deal with before XML came along.

wombatpm•25m ago

XML is like violence. If it’s not solving your problem, you need to use more.

cryptos•52m ago

Interesting point of view. JSON is also not the right thing to use in many scenarios, but it is the de-facto standard now. Maybe something like protobuf is the way to go.

sfn42•35m ago

Not really. In C# I use a parsing library for which I just write a class and then the library automatically serializes the JSON into an instance of that class.

I can do the same thing with XML. Of course it doesn't necessarily go that smoothly with all xml, but as long as the xml is fairly simple like a JSON document would be it's totally fine. It's only when you start to use all the features of xml that don't fit neatly into a class model that it starts to get annoying. But if JSON serves your needs then simple xml does as well. I wouldn't use it because JSON works just fine but it's not as bad as people make it seem, unless people make it really bad.
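A Python analogue of that workflow, as a sketch only (this is not the C# library mentioned above; `Country` and both helper functions are hypothetical): one class, filled from either JSON or "simple" XML that has one child element per field.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Country:
    name: str
    rank: int
    year: int


def from_json(text: str) -> Country:
    # JSON keys map straight onto the dataclass fields.
    return Country(**json.loads(text))


def from_simple_xml(text: str) -> Country:
    # Assumes one child element per field, no attributes, no mixed content.
    root = ET.fromstring(text)
    return Country(
        name=root.findtext("name"),
        rank=int(root.findtext("rank")),
        year=int(root.findtext("year")),
    )


a = from_json('{"name": "Liechtenstein", "rank": 1, "year": 2008}')
b = from_simple_xml(
    "<country><name>Liechtenstein</name><rank>1</rank><year>2008</year></country>"
)
print(a == b)
```

The XML side needs explicit type conversion because element text is always a string; as soon as attributes or mixed content show up, the mapping stops being this mechanical.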

crispyambulance•55m ago

XML was a good, well-intentioned idea.

The problem, IMHO, was the rampant "xml-abuse" in the naughts. WS-* standards and over-engineered garbage like SOAP ("complex object access protocol") made people loathe XML.

I did like JAXB in Java, XSLT, schemas, XPath. Never got into XSL, but it seemed like a good thing too. It worked best when your tooling manipulated it for you or at least helped you in an intelligent way. Much of the hate for XML came from situations where you had to deal with someone's over-the-top, one-size-fits-all schema without the benefit of tooling to at least hint you in the right direction.

It still survives in WPF and C# *.proj files. If it were just me, I would still use it for object serialization. But JSON is king now even though it's inferior.

dtech•15m ago

It's non-trivial to implement an XML parser in a secure way, and many stdlib ones are insecure by default. That should just not be a thing. XML has a bunch of vulnerabilities very specific to it: XXE is the most well-known one, but you also have a bunch of DoSes due to entity expansion, plus XPath injection, etc.
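A deliberately tiny sketch of the expansion mechanism, assuming Python's standard `xml.etree.ElementTree`: three nested entities turn a few bytes of input into nine copies of a string. Real attacks nest deeper so the growth is exponential; the payload here is harmless and only shows that the parser expands internal entities without being asked.

```python
import xml.etree.ElementTree as ET

# Each entity refers three times to the one before it: 1 -> 3 -> 9 copies.
PAYLOAD = """\
<!DOCTYPE r [
  <!ENTITY a "ha">
  <!ENTITY b "&a;&a;&a;">
  <!ENTITY c "&b;&b;&b;">
]>
<r>&c;</r>
"""

root = ET.fromstring(PAYLOAD)
expanded = root.text

print(len(expanded))  # 18 characters from a single &c; reference
```

Newer Expat builds add limits against extreme amplification, and the third-party `defusedxml` package is a commonly suggested way to reject DTDs and entities outright; both are worth checking against your own runtime rather than assumed.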

An object serialization format should not have a bunch of footguns and vulnerability categories specific to it.

cryptos•55m ago

At least XML is hated for the wrong reasons (e.g. verbosity, esthetics) most of the time. There was for sure an era where it was overused (see Apache Cocoon from 2006 https://en.wikipedia.org/wiki/Apache_Cocoon). But XML is still a pretty good format to exchange (and store) data and make sure the data conforms to a certain schema. JSON Schema in comparison is not nearly as powerful.

AnimalMuppet•40m ago

1. What, in your view, are the right reasons to hate XML?

2. To me, verbosity and aesthetics seem like perfectly valid reasons to hate XML. Once you learn S expressions, XML looks disgusting. They implemented half of Common Lisp in a markup language.

hackrmn•52m ago

Every time XML comes up, I feel obligated to share my opinion (I too wrote XML at the turn of the millennium, have seen it take off, and still occasionally witness it being excommunicated).

XML is verbose and therefore uglier than it ought to be. I think most of the haters hate it for that alone -- there's not much else to hate because you don't have to deal with the rest, it's not really imposed on you unless you really have to deal with someone else's XML application.

What do I mean? Well, the brackets thing and the necessity to repeat the name of every element twice, in correct (LIFO, last in first out) order, isn't great, admittedly.

What XML has, and what the dev-bro alternatives that have since flooded the void XML left still lack and keep reinventing, are namespaces, attributes, and interop built on those two. Sure you can write JSON and YAML (the latter deservingly being incredibly hard to parse correctly -- they tried to design a better XML but failed IMO) -- but these suck as meta-languages because there's not much "meta" there. JSON, for example, allows you to create properties and has a few types (kind of more than XML, really) but it leaves semantics up to you and namespaces are up to you to re-invent, poorly. If you think I am stretching the argument, see if you can represent an HTML document (no, not Markdown) with a JSON file.
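For a sense of what that challenge looks like in practice, here is one possible encoding, sketched with the standard library. The `[tag, attributes, children...]` layout and the `to_jsonable` helper are conventions invented for this example, which is rather the point: JSON itself says nothing about how to interleave text and elements.

```python
import json
import xml.etree.ElementTree as ET


def to_jsonable(elem):
    """Encode an element as [tag, attributes, child, child, ...]."""
    node = [elem.tag, dict(elem.attrib)]
    if elem.text:
        node.append(elem.text)  # text before the first child
    for child in elem:
        node.append(to_jsonable(child))
        if child.tail:
            node.append(child.tail)  # text between this child and the next
    return node


html = '<p class="intro">Hello <b>bold</b> and <i>italic</i> world</p>'
encoded = json.dumps(to_jsonable(ET.fromstring(html)))

print(encoded)
```

It works, but every consumer has to agree on the same ad hoc layout, and the result is arguably harder to read than the markup it replaces.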

YAML is a similar story, albeit with a few cool things like aliases. I think it's a better attempt to give the world a better XML, but the jury is still out on that one.

The killer thing with XML, for better and for worse, was the plethora of tools to work with it. I wrote a fair share of XSLT documents to transform data, back when there was momentum in XHTML, for example. XSLT barely supports JSON and it's not pretty. XPath cannot natively understand YAML -- unless you convert it to XML which I guess re-animates XML as some sort of Frankenstein's monster. And even if it were a [pretty] monster, dealing with an intermediate representation for that kind of purpose is a can of worms all of its own.

Ironically, nobody seems to hate HTML 5. Or React basically turned it into a greasy cogwheel nobody needs to look at. Because if you look at it, it's in my opinion an abomination even compared to XML (unpopular opinion) -- the parser is quirky and behaviour is defined by the standard per element type (i.e. some elements need a closing tag and some do not, and what happens if you forget a closing tag is element-specific; care to remember the set of rules to ensure your document renders to your liking?). It has no namespaces but it has "custom elements" which require a dash in the name as a poor man's namespace and you can't omit one, and now we have a Web of `x-spinner` and `x-carousel` because it turns out everyone rightfully wanted a default namespace but didn't get one. Anyway, it's all plumbing, right -- the idea of _writing_ HTML has largely come and gone. And I am digressing.

jjgreen•48m ago

It will be back like vinyl.

jolmg•46m ago

> developers must become domain experts [my emphasis] in a rich and complex space that is essentially unrelated to the application itself.

XML is a markup language, but most people that used it just needed a standard structured data format. In comes JSON which is more easily compatible with the object systems of various languages and in particular is compatible with Javascript syntax, and XML loses most of the people that used it.

As a markup language though, it seems pretty good. It's just that the number of people who actually need an extensible markup language is much smaller.

I do hate the strictness of it. The header

  <?xml version="1.0" encoding="UTF-8"?>

should be unnecessary. For a markup language, an already-made plain-text document should already count as XML. The tags should be something you can just sprinkle as you'd like to add contextual metadata.
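For what it's worth, a quick check with Python's standard `xml.etree.ElementTree` suggests the declaration is already optional for UTF-8 input; what the parser does insist on is a single root element, so bare plain text is still rejected. A small sketch (the sample strings are made up):

```python
import xml.etree.ElementTree as ET

with_decl = ET.fromstring(b'<?xml version="1.0" encoding="UTF-8"?><note>hi</note>')
without_decl = ET.fromstring(b"<note>hi</note>")

# Both parse to the same content; the declaration changed nothing here.
same = (with_decl.tag, with_decl.text) == (without_decl.tag, without_decl.text)

# Plain text with no root element is not well-formed XML.
try:
    ET.fromstring("just some plain text")
    plain_ok = True
except ET.ParseError:
    plain_ok = False

print(same, plain_ok)
```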

int_19h•46m ago

Honestly I miss it. As overengineered as it was, at least we had proper tooling for it, and while there were dialects in the associated tech (e.g. XML Schema vs RELAX NG vs Schematron) it was minor compared to the wild west that JSON is to this day.

reenorap•45m ago

I’ve hated XML since 2004. The worst part about it is the tags vs attributes fights. They both do the same thing and the only difference is preference. Having two ways of doing the same thing invites and incites religious positions and causes unnecessary fighting. There should be one, opinionated way of doing things so you avoid confusion.

rf15•42m ago

yeah it's not a good design to have tags have two sets of children: a Set of key-value children and then a List of tree object children.

jolmg•39m ago

> The worst part about it is the tags vs attributes fights. They both do the same thing and the only difference is preference.

They're not the same thing. If you look at it as the extensible markup language for documents that it is, "tags" (i.e. inner content) would be visible and "attributes" would not. If your XML document was processed by an application to convert to another type of document (PDF, etc.), and it didn't recognize a particular tag, it would be sensible for attributes to disappear, but inner content ("tags") to remain.
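A small sketch of that behavior, assuming Python's standard `xml.etree.ElementTree` and a made-up document: a processor that knows none of the tags can still emit the readable text, while every attribute silently drops out.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<para>See the <term id="t42" lang="en">extensible</term> markup '
    '<unknown flag="x">language</unknown>.</para>'
)

# itertext() yields inner content in document order and ignores attributes.
visible = "".join(doc.itertext())

print(visible)
```

Here `visible` ends up as the plain sentence, with `id`, `lang` and `flag` gone, which matches the "content is visible, attributes are metadata" reading.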

It only seems like a preference thing if you look at XML as a structured data format like JSON is.

edflsafoiewq•27m ago

In data structure terms, attributes do allow nodes to be decorated with additional information without forcing any change on existing parsers. In JSON, this would require swapping a value for an object, e.g. "str" -> {"value": "str", "attrib1": "..."}.
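A sketch of that difference using only the standard library; the documents and the `read_xml_title` / `read_json_title` helpers are hypothetical stand-ins for "an existing parser written before the extra information was added".

```python
import json
import xml.etree.ElementTree as ET


def read_xml_title(text):
    # Existing reader: only looks at the element's text.
    return ET.fromstring(text).findtext("title")


def read_json_title(text):
    # Existing reader: expects the value to be a plain string.
    return json.loads(text)["title"]


old_xml = "<book><title>Dune</title></book>"
new_xml = '<book><title lang="en">Dune</title></book>'  # decorated with an attribute

old_json = '{"title": "Dune"}'
new_json = '{"title": {"value": "Dune", "lang": "en"}}'  # value swapped for an object

xml_unchanged = read_xml_title(old_xml) == read_xml_title(new_xml)
json_unchanged = read_json_title(old_json) == read_json_title(new_json)

print(xml_unchanged, json_unchanged)
```

The XML reader keeps returning the same string, while the JSON reader now gets a dict where it used to get a string.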

mickeyp•42m ago

XML is unfairly maligned. Yes, people bought into it too much 26 years ago, but then you would too if you had to maintain someone else's massive packed struct dumped into a file and documented in a poorly-maintained Word document --- or worse, a brace of dumb IETF RFCs that contradict each other.

I am glad that younger generations are looking at it with fresh eyes. XML is a useful format; it has its place in your toolbox. Ignore the haters.

trueno•33m ago

i don't hate it. the declaration kind of annoys me from time to time, digging into attributes can be annoying, and it's obviously not the best form of structured data.

json is just easier for my brain at this point if it needs to go over http, but ive seen some pretty... poorly designed json structures.

csv is always a good time. love when i can just plop important data into a table and query away

hahahaa•26m ago

Aside: I love how "about me" is just another tag, and clicking it lists 3 blog posts.

On XML I don't hate. I hate wrapping my head around XSLT but that is more about my head. AI may make XSLTing more bearable as it happens? I did work with someone passionate about XSLT. Aaaand now I am doxxed.

I also think that in practice schemaless, i.e. JSON or "the schema is look at the code or some logs lol", won because fuck, let's face it, that is more fun.

davidpapermill•23m ago

Last year we chose XML as the basis for our document language.

It's been a good choice for designing a new language, but we've been really surprised by the poor quality of the available parsers. We figured it would be a solved problem, but we'll be writing our own at some point.

janci•19m ago

My reasons to hate XML:

- element vs attribute ambiguity

- model of the document does not fit nicely to programming model of structs, dicts and arrays

- too many complexities (entities, cdata, parser directives)

- cardinality unknown without schema (is that a single value, or an array that just happens to have one element)

- order of elements may or may not be significant depending on schema

- not really extensible if the original schema does not explicitly allow for extensibility

- some types of valid XML documents are not representable by a schema (e.g. any number of different elements in any order)

- verbosity

- namespace identifiers being URIs that may or may not be resolvable
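The cardinality point in particular is easy to reproduce. A sketch with the standard library, where `naive_to_dict` is a hypothetical schema-less converter of the kind people tend to write first: the shape of its output depends on how many children happened to be present.

```python
import xml.etree.ElementTree as ET


def naive_to_dict(elem):
    """Convert child elements to a dict, guessing cardinality from the data."""
    out = {}
    for child in elem:
        if child.tag in out:
            # Second occurrence: promote the existing value to a list.
            if not isinstance(out[child.tag], list):
                out[child.tag] = [out[child.tag]]
            out[child.tag].append(child.text)
        else:
            out[child.tag] = child.text
    return out


one = naive_to_dict(ET.fromstring("<c><neighbor>Austria</neighbor></c>"))
two = naive_to_dict(
    ET.fromstring("<c><neighbor>Austria</neighbor><neighbor>Switzerland</neighbor></c>")
)

print(one)
print(two)
```

With one neighbor the value is a string, with two it is a list, so downstream code has to handle both unless a schema says which one is intended.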

What I want for general data exchange is JSON with comments and sane namespaces.

Edit: line wraps

derriz•14m ago

I dislike it because it failed in such a fundamental way as a way to represent a document; you cannot, in general, reliably determine what characters the bytes in an XML file represent - the best a general XML processor can do is guess.
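One way to read this point, sketched with Python's standard `xml.etree.ElementTree` (the byte string is made up): the same bytes are rejected or accepted depending on an in-band declaration, which the parser has to trust because nothing else in the file says how to decode it.

```python
import xml.etree.ElementTree as ET

raw = b"<a>\xe9</a>"  # 0xE9 is "é" in ISO-8859-1 but not valid UTF-8 here

# Without a declaration the parser assumes UTF-8 and rejects the bytes.
try:
    ET.fromstring(raw)
    utf8_ok = True
except ET.ParseError:
    utf8_ok = False

# With a declaration, the very same payload bytes decode fine.
declared = ET.fromstring(b'<?xml version="1.0" encoding="ISO-8859-1"?>' + raw)

print(utf8_ok, declared.text)
```

If the declaration is missing or wrong, which is common when files are transcoded or concatenated, the processor is left guessing.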

hoppp•3m ago

I don't hate it but I definitely don't like it.

I write a lot of HTML already; I don't need XML in my life too.
