The problem, IMHO, was that rampant "xml-abuse" in the naughts. ws-* standards and over-engineered garbage like SOAP ("complex object access protocol") made people loathe XML.
I did like JAXB in Java, XLST, schemas, XPATH. Never got into XSL, but it seemed like good thing too. It worked best when your tooling manipulated it for you or at least helped you in an intelligent way. Much of the hate for XML came from situations where you had to deal with someone's over-the-top-one-size-fits all schema without the benefit of tooling to at least hint you in the right direction.
It still survives in WPF and c# *.proj files. If it were just me, I would still use it for object serialization. But json is king now even though it's inferior.
An object serialization format should not have a bunch of footguns and vulnerability categories specific to it.
2. To me, verbosity and aesthetics seem like perfectly valid reasons to hate XML. Once you learn S expressions, XML looks disgusting. They implemented half of Common Lisp in a markup language.
XML is verbose and therefore uglier than it ought to be. I think most of the haters hate it for that alone -- there's not much else to hate because you don't have to deal with the rest, it's not really imposed on you unless you really have to deal with someone else's XML application.
What do I mean? Well, the brackets thing and the necessity to repeat name of every element twice, in correct (LIFO, last in first out) order, isn't great, admittedly.
What XML has that the dev-bro alternatives that have flooded the void XML left since, haven't gotten and thus see being reinvented, are: namespaces, attributes and interop using the former two. Sure you can write JSON and YAML (the latter deservingly being incredibly hard to parse correctly -- they tried to design a better XML but failed IMO) -- but these suck as meta-languages because there's not much "meta" there. JSON, for example, allows you create properties and has a few types (kind of more than XML, really) but it leaves semantics up to you and namespaces are up to you to re-invent, poorly. If you think I am stretching the argument, see if you can represent an HTML document (no, not Markdown) with a JSON file.
YAML is a similar story, albeit with a few cool things like aliases. I think it's a better attempt to give the world a better XML, but the jury is still out on that one.
The killer thing with XML, for better and for worse, was plethora of tools to work with it. I wrote a fair share of XSLT documents to transform data, back when there was momentum in XHTML, for example. XSLT barely supports JSON and it's not pretty. XPath cannot natively understand YAML -- unless you convert it to XML which I guess re-animates XML as some sort of Frankenstein's monster. And even if it were a [pretty] monster, dealing with intermediate representation for the kind of purpose, is a can of worms all of its own.
Ironically nobody seems to hate HTML 5, seemingly. Or React basically turned it into a greasy cogwheel nobody needs to look at. Because if you look at it, it's in my opinion an abomination even compared to XML (unpopular opinion) -- the parser is quirky and behaviour is defined by the standard per element type (i.e. some elements need a closing tag and some do not, and what happens if you forget a closing tag is element-specific; care to remember the set of rules to ensure your document renders to your liking?). It has no namespaces but it has "custom elements" which require a dash in the name as poor' man's namespaces and you can't omit one, and now we have a Web of `x-spinner` and `x-carousel` because it turns out everyone rightfully wanted default namespace but didn't get one. Anyway, it's all plumbing, right -- the idea of _writing_ HTML has largely come and gone us by. And I am digressing.
XML is a markup language, but most people that used it just needed a standard structured data format. In comes JSON which is more easily compatible with the object systems of various languages and in particular is compatible with Javascript syntax, and XML loses most of the people that used it.
As a markup language though, it seems pretty good. It's just that the amount of people that actually need an extensible markup language is much smaller.
I do hate the strictness of it. The header
<?xml version="1.0" encoding="UTF-8"?>
should be unnecessary. For a markup language, an already-made plain-text document should already count as XML. The tags should be something you can just sprinkle as you'd like to add contextual metadata.They're not the same thing. If you look at it as the extensible markup language for documents that it is, "tags" (i.e. inner content) would be visible and "attributes" would not. If your XML document was processed by an application to convert to another type of document (PDF, etc.), and it didn't recognize a particular tag, it would be sensible for attributes to disappear, but inner content ("tags") to remain.
It's only seems like a preference thing if you look at XML as a structured data format like JSON is.
I am glad that younger generations are looking at it with fresh eyes. XML is a useful format; it has its place in your toolbox. Ignore the haters.
json is just easier for my brain at this point if it needs to go over http, but ive seen some pretty... poorly designed json structures.
csv is always a good time. love when i can just plop important data into a table and query away
On XML I don't hate. I hate wrapping my head around XSLT but that is more about my head. AI may make XSLTing more bearable as it happens? I did work with someone passionate about XSLT. Aaaand now I am doxxed.
I also thing in practice schemaless i.e. JSON or "the schema is look at the code or some logs lol" won because fuck let's face it that is more fun.
It's been a good choice for designing a new language, but we've been really surprised by the poor quality of the available parsers. We figured it would be a solved problem, but we'll be writing our own at some point.
- element vs attribute ambiguity
- model of the document does not fit nicely to programming model of structs, dicts and arrays
- too many complexities (entities, cdata, parser directives)
- cardinality unknown without schema (is that a single value, or an array that just happens to have one element)
- order of elements may or may not be significant depending on schema
- not really extensible if the original schema does not explicitly allow for extensibility
- some types of valid XML documents are not representable by a schema (e.g. any number of different elements in any order)
- verbosity
- namespace identifiers being URIs that may or may not be resolvable
What I want for general data exchange is JSON with comments and sane namespaces.
Edit: line wraps
I write a lot of html already I don't need xml too in my life.
This is much less common/less standardized with json.
So if name == "foobar" then read; else ignore. For a 500 GiB XML file that makes a difference.
As for your other point about an "AST" (it's actually just a DOM.) That's the the benefit? And you're in for a surprise when you learn that reaching into a deeply-nested JSON structure deserialised into whatever memory format most appropriate for your pet language is also an abstract data type that you act on with getters/accessors/what-have-yous that is in all but name a DOM.
And we do have tools to deal with it: XSLT for transformation. For querying? XPath.
Perhaps because it's an example of what is possible in XML and how to parse it, and not, in fact, a particularly good or canonical example of XML?
More precisely: in XML, elements (nodes) are named/labeled. ("node-labeled graph") In JSON, keys (edges) are named. ("edge-labeled graph")
In programming, we need names for the fields in our structures (edges between objects), so JSON is a much better match than XML (which needs contortions to handle this use case -- e.g. by having nesting levels alternate between element=node and element=edge). Only in some object-oriented cases (which derived class should the deserializer construct?) do you care about node labels -- but usually that's in addition to edge labels, so a "_type" key in JSON is still easier than XML.
Sure, this gives quite a few variations on how to serialize some data. But it's not like json's simpler approach would make data serialization universal, there are many different ways to encode the same thing.
You also need to distinguish the format itself from the various libraries that may be available to parse/process it in a given language. It doesn't always make sense, even when it's an option, to let the tail wag the dog and choose a language just because it has a nice library for something.
> Furthermore parsing JSON or YAML gives you the basic data types like lists and dictionaries
Well, maybe some library for some language does that, and if that is the language you are using, and that is all you need, then I suppose you are in luck. More generally you may want to use a format like XML or JSON to hold user defined types, which rather levels the playing field since there are few good libraries for this in any language,and you may need to roll your own (been there, done that).
As a document format, it's supposed to be hand-written by humans. If you have paragraphs between the opening tag and closing tag, it makes sense to let the reader know what they're seeing the closing of.
After deciding you do want to repeat the element name, the angle brackets make more sense. Otherwise, you can have a syntax like LaTeX's.
It's simple to parse like JSON, it has namespaces like XML (but better), and it doesn't require you to repeat the name of every element twice.
kccqzy•57m ago
Furthermore parsing JSON or YAML gives you an AST that consists of the basic data types like lists and dictionaries. Parsing XML gives you an AST that requires a lot more effort to turn into data in your domain. Even on the web, very few people like to use the verbose XML DOM API like childNodes, nodeType, getElementsByTagName et al; it is basically unheard of for anyone to use it outside the web such as in Python, despite that the DOM API is in the Python standard library since forever (see https://github.com/python/cpython/blob/3.14/Lib/xml/dom/mini... for example).
actionfromafar•52m ago
SoftTalker•45m ago
tonyedgecombe•33m ago
wombatpm•25m ago
cryptos•52m ago
sfn42•35m ago
I can do the same thing with XML. Of course it doesn't necessarily go that smoothly with all xml, but as long as the xml is fairly simple like a JSON document would be it's totally fine. It's only when you start to use all the features of xml that don't fit neatly into a class model that it starts to get annoying. But if JSON serves your needs then simple xml does as well. I wouldn't use it because JSON works just fine but it's not as bad as people make it seem, unless people make it really bad.