
Microsoft Office is using an artificially complex XML schema as a lock-in tool

https://blog.documentfoundation.org/blog/2025/07/18/artificially-complex-xml-schema-as-lock-in-tool/
126•firexcy•6h ago

Comments

khelavastr•6h ago
Does this person not understand XML serializers..?
yftsui•5h ago
Exactly, there is no data provided on why the author believes it is “too complex”, just one random person ranting.
ranger_danger•5h ago
You're not wrong, but it's funny that this same topic was posted just earlier today with a very different sentiment in the comments.

https://news.ycombinator.com/item?id=44606646

But if you dig hard enough, there's actually links to more evidence of why it is that complicated... so I don't think it was necessarily intentionally done as a method of lock-in, but where's the outrage in that? /s

"Complicated file format has legitimate reasons for being complicated" just doesn't have the same ring to it as a sensationalized accusation with no proof.

mjevans•5h ago
Weren't those reasons effectively...?

'special case everything we ever used to do in office so everything renders exactly the same'

Instead of offering some suitable placebo for properly rendering into a new format ONCE with those specific quirks fixed in place?

lozenge•3h ago
What would that look like?

"You have opened your Word 97 document in Office 2003. The quirks have been removed, so it might look different now. Check every page before saving as docx."

"You have pasted from a Word 97 document into an Office 2003 OOXML document. Some things will not work."

pessimizer•4h ago
I have no idea what you mean to express by this. I've never met an XML or SOAP truther, but are you really saying that because XML can be serialized, it's impossible for an XML schema to be artificially complex?

What is it about serializing XML that would optimize the expression of a data model?

constantcrying•2h ago
Do you not understand that there is a difference between parsing something and implementing a specification? These are totally separate things.

Obviously parsing the XML is trivial. What is not trivial is what you do with parsed XML and what the parsed structure represents.

wvenable•5h ago
> Unfortunately, while an XML schema can be simple, it can also be unnecessarily complex, bloated, convoluted and difficult to implement without specific knowledge of its features.

One could now use that exact sentence to describe the most popular open document format of all: HTML and CSS.

masa331•4h ago
Can you be more specific here? HTML and CSS can't be described like that in my opinion.

It is complex but not complicated. You can start with just a few small parts and get to a usable and clean document within hours from the first contact with the languages. The tags and rules are usually quite self-describing while concise and there are tons and tons of good docs and tools. The development of the standards is also open and you can peek there if you want to understand decisions and rationales.

alterom•3h ago
It's not about making a document.

It's about making software that would display a document in that format correctly.

I.e., a browser.

perching_aix•3h ago
The current HTML spec alone is a 1000+ page PDF, and I can't imagine the CSS spec being much shorter.

Wordsmithing your way around this doesn't make them any easier.

masa331•6m ago
Sure, the spec might be enormous, but you don't need to touch it at all to be productive quickly. In no HTML or CSS tutorial I've ever seen was there a reference to the spec, nor did I need to go there to solve something. And that in itself is more proof of how nicely it is designed. On the other hand, there are other document types or schemas where you absolutely have to go to the spec, because it is so cryptic, badly designed and not self-explaining that there is nothing else you can do.
perching_aix•3m ago
HTML and CSS tutorials are for authoring HTML and CSS documents, not for authoring HTML and CSS parsers and renderers.
yegle•2h ago
You could say the existing browser vendors pushed to make the HTML standard more complicated to the point that there's no chance for a newcomer to compete with the existing ones.
leonewton253•3h ago
Yeah, but those are open standards, whereas Microsoft is the only one with true knowledge of its XML.
rullelito•3h ago
This is similar in zero ways.
ddtaylor•5h ago
Again?
cranberryturkey•4h ago
The post is essentially reminding people that XML doesn’t magically equal openness. A schema can be “unnecessarily complex, bloated, convoluted and difficult to implement”, and in the case of Office 365 the spec runs to “over 8 000 pages” and uses deeply nested tags, overloaded elements and wildcards. The result is that only the vendor can feasibly implement it, which eliminates third‑party implementations and lets the vendor dictate terms. The rail‑control analogy in the article makes the point well.

What isn’t acknowledged is that a lot of that complexity isn’t purely malicious. OOXML had to capture decades of WordPerfect/Office binary formats, include every oddball feature ever shipped, and satisfy both backwards‑compatibility and ISO standardisation. A comprehensive schema will inevitably have “dozens or even hundreds of optional or overloaded elements” and long type hierarchies. That’s one reason why the spec is huge. Likewise, there’s a difference between a complicated but documented standard and a closed format—OOXML is published (you can go and download those 8 000 pages), and the parts of it that matter for basic interoperability are quite small compared with the full kitchen‑sink spec.

That doesn’t mean the criticism is wrong. The sheer size and complexity of OOXML mean that few free‑software developers can afford to implement more than a tiny subset. When the bar is that high, the practical effect is the same as lock‑in. For simple document exchange, OpenDocument is significantly leaner and easier to work with, and interoperability bodies like the EU have been encouraging governments to use it for years. The takeaway for anyone designing document formats today should be the same as the article’s closing line: complexity imprisons people; simplicity and clarity set them free.

charcircuit•2h ago
>that few free‑software developers can afford to.

Considering how little most free software makes they can't afford to do a lot of things. It's not a hard bar to hit.

mrweasel•1h ago
The complaint that OOXML was overly complex was a criticism when Microsoft first introduced the format, but as you point out, it needed to be able to handle decades of old formatting rules back then already. While I'm sure that there is stuff in the format that Microsoft made needlessly complex, one has to remember that they still need to be able to maintain the code, so throwing in too many roadblocks for open source developers would likely come back to haunt them. Still, we know they did just that with SMB, so why not with OOXML?

What surprises me is how well LibreOffice handles various file formats, not just OOXML. In some cases LibreOffice has the absolute best support for abandoned file formats. I'm not the one maintaining them, so it's easy enough for me to say "See, you managed just fine". It must be especially frustrating when you have the OpenDocument format, which does effectively the same thing, only simpler.

jiggawatts•4h ago
The opinion in the article misses something fundamental.

The complexity is not artificial, it is completely organic and natural.

It is incidental complexity born of decades of history, backwards compatibility, lip-service to openness, and regulatory compliance checkbox ticking. It wasn't purposefully added, it just happened.

Every large document-based application's file format is like this, no exceptions.

As a random example, Adobe Photoshop PSD files are famously horrific to parse, let alone interpret in any useful way. There are many, many other examples, I don't aim to single out any particular vendor.

All of this boils down to the simple fact that these file formats have no independent existence apart from their editor programs.

They're simply serialised application state, little better than memory-dumps. They encode every single feature the application has, directly. They must! Otherwise the feature states couldn't be saved. It's tautological. If it's in Word, Excel, PowerPoint, or any other Office app somewhere, it has to go into the files too.

There are layers and layers of this history and complex internal state that have to be represented in the file. Everything from compatibility flags, OLE embedding, macros, external data sources, incremental saves, the support for quirks of legacy printers that no longer exist, CMYK, document signing, document review notes, and on and on.

No extra complexity had to be added to the OOXML file formats, that's just a reflection of the complexity of Microsoft Office applications.

Simplicity was never engineered into these file formats. If it had been, it would have been a tremendous extra effort for zero gain to Microsoft.

Don't blame Microsoft for this either, because other vendors did the exact same thing, for the exact same pragmatic reasons.

Ekaros•1h ago
You might start with something simple, aiming for simplicity. Then you need to add more features. Eventually, after enough years, you will have lost the simplicity because you have that many features to support.

You might not add features, but that is most likely a losing proposition against competitors that have them. Generally, normal users each want some tiny subset of features, be it images, tables, internal links, comments, or versions.

bob1029•4h ago
This is a comical perspective to me. I've been ass-deep in core banking APIs where we generate service references from WSDL/XSDs. Some of the resulting codegen measures in the tens of megabytes for some files. I wouldn't even attempt to quantify the number of pages of documentation. And this is just for mid size US banking domain. Microsoft Office has to work literally everywhere for everything. The fact that it's only 8000 pages of documentation is likely a miracle.

If you're working with an XML schema that is served up in XSD format, using code gen is the best (only) path. I understand it's old and confusing to the new generation, but if you just do it the boomer way you can have the whole job done in like 15 minutes. Hand-coding to an XML interface would be like cutting a board with an unplugged circular saw.

piker•3h ago
While I generally agree, I don't think the author is complaining about the XML spec's complexity per se but rather that rendering the underlying structures to a page is hard.
jajko•2h ago
Yeah another b(w)anker dev here, complex xsds seem to be the baseline in industry as soon as the role of that spec escapes simple 1 server : 1 client use case.

One example I work with sometimes is almost 1MB of XSDs, and that's a rather small internal data tool. They even have a RESTful JSON variant but it's not used that much, and the complexity is roughly the same (you escape namespace hell, escaping XML chars etc., but then tooling around JSON is a bit less evolved). An XML-to-object mapping tool is a must.

perching_aix•2h ago
Interfacing sounds like only just half the battle though? Like, I don't understand why this is a counter-argument.
deknos•28m ago
It's not only about the XML itself, but that Microsoft really likes to change the standard any time open source catches up.

And most of the time they do not use their open standard, but the other document type.

The artificial vendor lock-in is real.

bob1029•4m ago
You can simply re-run your codegen on the newly published schema and review for compile-time errors.

We do this about once a quarter in the banking industry. It takes about an hour on average.

pessimizer•4h ago
Strange that this is getting traction again, and good on the people getting it out there. Saw something about "OOXML" make Google News the other day.

Having a debate about the quality of OOXML feels like a waste of time, though. This was all debated in public when Microsoft was making its proprietary products into national standards, and nobody on Microsoft's side debated the formats on the merits because there obviously weren't any, except a dubious backwards compatibility promise that was already being broken because MS Office couldn't even render OOXML properly. People trying to open old MS Office documents were advised to try OpenOffice.

They instead did the wise thing and just named themselves after their enemy ("Open Office? Well we have Office Open!"), offered massive discounts and giveaways to budget-strapped European countries for support, and directly suborned individual politicians.

Which means to me that it's potentially a winnable battle at some point in the future, but I don't know why now would be a better outcome than then. Maybe if you could trick MS into fighting with Google about it. Or just maybe, this latest media push is some submarine attempt by Google to start a new fight about file formats?

another_twist•3h ago
How hard would it be to generate a parser for this spec with AI code gen ?
choeger•3h ago
A parser is trivial. It's XML and you have a schema.

What you want is a compiler (e.g., into a different document format) or an interpreter (e.g., for running a search or a spell checker).

That's a task that's massively complicated because you cannot give an LLM the semantic definition of the XML and your target (both typically are under documented and under specified). Without that information, the LLM would almost certainly generate an incomplete or broken implementation.
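
To make the parse-versus-interpret gap concrete, here is a minimal Python sketch using only the standard library. The sample paragraph is invented for illustration and `run_is_bold` is a hypothetical helper, not part of any real OOXML library. Parsing is a single call; deciding whether a run is bold already requires knowing that `w:b` is a toggle whose `w:val` attribute can switch it off, and that an absent `w:b` means "look elsewhere".

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Invented sample: three runs with an explicit "on", explicit "off", and no toggle.
SAMPLE = f"""
<w:p xmlns:w="{W}">
  <w:r><w:rPr><w:b/></w:rPr><w:t>on</w:t></w:r>
  <w:r><w:rPr><w:b w:val="0"/></w:rPr><w:t>off</w:t></w:r>
  <w:r><w:t>unset</w:t></w:r>
</w:p>
"""


def run_is_bold(run):
    """Return True/False for an explicit toggle, None when the run is silent.

    None means the answer lives elsewhere (paragraph style, document
    defaults), which this sketch deliberately does not resolve.
    """
    b = run.find("w:rPr/w:b", NS)
    if b is None:
        return None
    val = b.get(f"{{{W}}}val", "true")
    return val not in ("0", "false", "off")


# Step 1, parsing: one call.
paragraph = ET.fromstring(SAMPLE)

# Step 2, interpretation: schema knowledge is already required.
result = [
    (r.findtext("w:t", default="", namespaces=NS), run_is_bold(r))
    for r in paragraph.findall("w:r", NS)
]
print(result)
```

The `None` case is where the real work starts: resolving it means walking the style hierarchy, which is the part no schema-driven parser gives you for free.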

constantcrying•2h ago
If a spec is "difficult to implement without specific knowledge of its features" it is ridiculous to assume an AI could do an adequate job.
danjc•3h ago
So, basically the same as Adobe with PDF
piker•3h ago
This is a dupe from: https://news.ycombinator.com/item?id=44606646 but I'll repeat what I said over there.

I feel qualified to opine on this as both a former power user of Word and someone building a word processor for lawyers from scratch[1]. I've spent hours poring over both the .doc and OOXML specs and implementing them. There's a pretty obvious journey visible in those specs, from 1984 when computers were underpowered with RAM rounding to zero, through the 00's when XML was the hot idea, to today when MSFT wants everyone on the cloud for life. Unlike, say, an IDE or generic text editor, where developers are excited to work on and dogfood the product via self-hosting, word processors are kind of boring and require separate testing/QA.

It's not "artificial", it's just complex.

MSFT has the deep pockets to fund that development and testing/QA. LibreOffice doesn't.

The business model is just screaming that GPL'd LibreOffice is toast.

[1] Plug: https://tritium.legal

unyttigfjelltol•1h ago
LO is at least as functional as some other market leading SaaS word processors. LO could spin their product into a cloud application and not at all be "toast", because people in separate walled gardens no longer expect interoperability.

As for complexity, an illustration: while using M365 I recently was confounded by a stretch of text that had background highlighting that was neither highlight markup, nor paragraph or style formatting. An AI turned me on to an obscure dialog for background shading at a text level, which explained the mystery. I've been a sophisticated user of M365 for decades and never encountered such a thing, nor do I have a clear idea of why anyone would use text-level background formatting in preference to the more obvious choices. Yet, there it is. With that kind of complexity and obscurity in the actual product, it's inevitable the file format would be convoluted and complex.

piker•58m ago
Agreed, but the point the author is missing is that complexity doesn't exist due to deliberate corporate lock-in, but because the product is 40 years old and has had 10-11 ways to do just about everything it does. Unfortunately, as your case illustrates, there are still documents in the wild that depend on these legacy features. So to render with 100% fidelity, you end up in a sprawling web of complexity. Microsoft can afford to navigate that web (and already owns it). It's nigh impossible for an open-source product to do so.
fithisux•3h ago
I have seen in the past the same claim for Bluetooth.

I think this needs to end and it is up to ordinary people to seek alternatives.

Apart from LibreOffice, we still have many other alternatives.

jpalomaki•3h ago
What we should really do is abandon the WYSIWYG approach to document editing. This inevitably leads into vendor lock in.

Instead of perfect looks, we should focus on the content. Formats like markdown are nice, because they force you to do this. The old way made sense 30 years ago when information was consumed on paper.

piker•2h ago
Not sure why you're getting so downvoted. It's a totally reasonable opinion in 2025 but it faces massive adoption headwind. People still cling to the idea of printing pages of documents even if it's increasingly rare (even, say in legal) for them to do so.
Arainach•1h ago
This is getting downvoted because it's ludicrous. Users want WYSIWYG: documents that look the same on the printed page, or when the people they share the document with open it.

"Interoperability" is something technical enthusiasts talk about and not something that users creating documents care about outside of the people they share with seeing exactly what was created.

nlitened•13m ago
as you say, tech enthusiasts only _talk_ about interoperability, but very very few actually care about it. Try interoperating between two pieces of code written in two different languages without spinning up an HTTP server or a separate virtual machine with a database system. Not even two different languages, just between two major versions of the same damn programming language.
bboygravity•2h ago
Yeah and let's all move to Arch Linux without any Window manager. And you have to write your own driver to use wifi.

Death to user-friendliness! Advanced users only! /s

fxtentacle•51m ago
Try to create a PDF report with collapsible subheadings in Excel. After you have learned the necessary MacroScript and JavaScript to pull that off, writing a Wi-Fi driver will feel like a joke in comparison.
constantcrying•2h ago
I don't think WYSIWYG is the issue here, WYSIWYG editors for markdown exist. It is the premise that document creation is about representing a piece of paper digitally.

For most documents nowadays it makes no sense to see them as a representation of physical paper. And the word paradigm of representing a document as if it were a piece of paper is obsolete in many areas where it is still being used.

Ironically Atlassian, with confluence, is a large force pushing companies away from documents as a representation of paper.

mawadev•2h ago
Do HTML WYSIWYG editors ever lead to vendor lock-in?
ZiiS•2h ago
Yes, Frontpage tried to lock reading to Internet Explorer and hosting to IIS several times. Even with the best will in the world switching editors lost fidelity.
nikanj•1h ago
HTML WYSIWYG editors lead to eldritch horrors, the likes of which haven’t been seen since someone tried parsing HTML with regex
sixtyj•1h ago
Wix, Webflow, WordPress… I have tried all wysiwyg editors (block editors). Oh my, what a mess in html, if you try to edit a file manually…
Ekaros•1h ago
Looking at Latex, I don't think hand tuning some parameters until you get right look in every single case is much better user experience...

Ofc, if we stop really caring what things look like we could save lot of energy and time. Just go back to pure HTML without any JavaScript or CSS...

lcnielsen•41m ago
> Looking at Latex, I don't think hand tuning some parameters until you get right look in every single case is much better user experience...

Having written many papers, reports and my entire Ph. D. thesis in Latex, and also moved between LaTeX classes/templates when changing journals... I'm inclined to agree to an extent. I think every layout system has a final hand-tweaking component (like inline HTML in markdown for example), but LaTeX has a very steep learning curve once you go beyond the basic series of plots and paragraphs. There are so many tricks and hacks for padding and shifting and adjusting your layout, and some of them are "right" and others are "wrong" for really quite esoteric reasons (like which abstraction layer they work at, or some priority logic).

Of course in the end it's extremely powerful and still my favourite markup language when I need something more powerful than markdown (although reStructuredText is not so bad either). But it's really for professionals with the time to learn a layout system.

Then again there are other advantages to writing out the layout, when it comes to archiving and accessibility, due to the structured information contained in the markup beyond what is rendered. arXiv makes a point about this and forces you to submit LaTeX without rendering the PDF, so that they can really preserve it.

DemocracyFTW2•40m ago
Latex is just as bad as WYSIWYG, pure unadulterated TeX is what real programmers use! Of course, real hardcore programmers just bathe their hard drives in showers of cosmic rays until just the right bits have flipped et voila! Document ready
homebrewer•6m ago
> if we stop really caring what things look like we could save lot of energy and time

Yet simple Markdown documents automatically converted into pdf by pandoc look ten times better than most MS Office documents I've had to deal with over the past couple of decades. Most MS Office users have very little knowledge of its capabilities and do things like adjusting text blocks with spaces, manually numbering figures (which results in broken references that lead to the wrong figure, or nowhere), manually applying styles to text instead of using style presets (resulting in similar things being differently styled), etc.

ozim•1h ago
The not-fun part is that focusing on content will lead you to a place where you get customers using BS arguments for power play.

You want to be able to do everything just right for the looks. Because there always will be someone negotiating down because your PDF report does not look right and they know a competitor who „does this heading exactly right”.

In theory if you have garbled content that is not acceptable of course, but small deviations should be tolerated.

Unfortunately we have all kinds of power games where you want exact looks. You don’t always have option to walk away from asshole customers nitpicking on BS issues.

jongjong•2h ago
Microsoft is using an artificially complex everything as a lock-in tool. I learned this many years ago when I learned how to create a window in C++ and it took around 100 lines of over-engineered code just to create an empty window on Windows.

Even TypeScript encourages artificial complexity of interfaces and creates lock-in, that's why Microsoft loves it. That's why they made it Turing Complete and why they don't want TypeScript to be made backwards compatible with JavaScript via the type annotations ECMAScript proposal. They want complex interfaces and they want all these complex interfaces to be locked into their tsc compiler which they control.

They love it when junior devs use obscure 'cutting edge' or 'enterprise grade' features of their APIs and disregard the benefits of simplicity and backwards compatibility.

flohofwoe•1h ago
I don't even think it's intentional, they had to come up with a file format which supports all the weird historical artefacts in the various Office tools. They didn't have the luxury to first come up with a clean file format and then write the tools around it.

And I bet they didn't switch to XML because it was superior to their old file formats, but simply because of the unbelievable XML hype that existed for a short time in the late 1990s and early 2000s.

Arainach•1h ago
An XML format, even one with a lot of cruft to handle legacy complexity, is absolutely easier to parse/interop with than a legacy binary format that was to a large degree a serialization of undocumented in-memory content.

OOXML was, if anything, an attempt to get ahead of requirements to have a documented interoperable format. I believe it was a consequence of legal settlements with the US or EU but am too tired at the moment to look up sources proving that.

mickeyp•1h ago
Sorry but XML is a good fit for this. Most people who've never used XML cannot ever fathom that it does actually do a number of things well.

Being able to layer markup with text before, inside elements, and after is especially important --- as anyone with HTML knowledge should know. Being able to namespace things so, you know, that OLE widget you pulled into your documents continue to work? Even more important. And that third-party compiled plugin your company uses for some obscure thing? Guess what. Its metadata gets correctly embedded and saved also, and in a way that is forward and backwards compatible with tooling that does not have said plugin installed.
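
A small Python sketch of those two properties, mixed content and namespaced extensions, using only the standard library. The `urn:example:plugin` namespace and the `x:widget` element are made up for illustration; this is not how Office itself stores plugin data, only a demonstration of the XML mechanics.

```python
import xml.etree.ElementTree as ET

# "urn:example:plugin" and <x:widget> are hypothetical, invented for this sketch.
DOC = (
    '<p xmlns:x="urn:example:plugin">'
    'before <b>inside</b> after'
    '<x:widget x:id="42"/>'
    '</p>'
)

p = ET.fromstring(DOC)
b = p.find("b")

# Mixed content: text before, inside, and after the child element.
pieces = (p.text, b.text, b.tail)

# A consumer that knows nothing about the plugin can still recognise the
# element as foreign (it carries a namespace) and skip over it...
unknown = [el.tag for el in p if el.tag.startswith("{")]

# ...and write it back out intact, so nothing is lost on save.
round_tripped = ET.fromstring(ET.tostring(p))
still_there = round_tripped.find("{urn:example:plugin}widget") is not None

print(pieces, unknown, still_there)
```

Note that ElementTree may rename the prefix on output (for example to `ns0`), but the namespace URI, which is what actually identifies the element, survives the round trip.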

So no, it wasn't 'hype'.

Neil44•1h ago
This format of XML in a zip with a docx extension came into existence in Office 2007
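
For anyone who has not looked inside one, the "XML in a zip" layout can be sketched in a few lines of Python with the standard library. The archive built here is a stand-in, not a valid .docx: a real file also carries `[Content_Types].xml` and relationship parts, which are omitted. The main body does live at `word/document.xml`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Invented one-paragraph body for illustration.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:r><w:t>Hello from inside the zip</w:t></w:r></w:p>'
    '</w:body></w:document>'
)

# Build a stand-in archive in memory (incomplete on purpose, see above).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("word/document.xml", DOCUMENT_XML)

# Reading it back is plain zip handling plus plain XML parsing.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    root = ET.fromstring(archive.read("word/document.xml"))

# Collect every text node (<w:t>) in document order.
text = "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))
print(names, text)
```

Pointing the read half at a real .docx path instead of `buffer` should work the same way, though real documents spread their content across many more parts.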
scarface_74•52m ago
There has been third-party support for importing and exporting Office documents as long as I can remember. It was part of Apple’s File Exchange extension in 1994. No one is locked into Office because of file formats.
lcnielsen•17m ago
> No one is locked into Office because of file formats

A lot of people are locked in because those import/export features are typically imperfect (or perhaps the documents themselves are) and will badly and often "invisibly" (to the non-Office user) break something.

scarface_74•4m ago
You could say the same about a web page or even Markdown…
catmanjan•48m ago
Does software that produces files have an obligation to provide interoperability?
happymellon•46m ago
When they have a monopoly, places like the EU will frown on purposefully breaking compatibility.

It's called antitrust.

graemep•14m ago
> When they have a monopoly, places like the EU will frown on purposefully breaking compatibility.

What exactly have they done about it?

jonathaneunice•4m ago
I wish this article had shown side-by-side examples. Back when I built document transformation tools as part of a publishing pipeline, the simplicity and clarity benefit of OpenDocument's XML over Microsoft's OOXML were *staggering* in practice. A beautiful, clean, logical approach vs beyond-Byzantine cruft and complexity at every turn.

I don't remember every element enough to render from memory, but ChatGPT's example feels about right:

OpenDocument

<text:p text:style-name="Para"> This is some <text:span text:style-name="Bold">bold text</text:span> in a paragraph. </text:p>

OOXML

  <w:p>
    <w:pPr>
      <w:pStyle w:val="Para"/>
    </w:pPr>
    <w:r>
      <w:t>This is some </w:t>
    </w:r>
    <w:r>
      <w:rPr>
        <w:b/>
      </w:rPr>
      <w:t>bold text</w:t>
    </w:r>
    <w:r>
      <w:t> in a paragraph.</w:t>
    </w:r>
  </w:p>

OpenDocument is not always 100% "simple," but it's logical and direct. Comprehensible on sight. OOXML is...something else entirely. Keep in mind the above are the simplest possible examples, not including named styles, footnotes, comments, change markup, and 247 other features commonly seen in commercial documents. The OpenDocument advantage increases at scale. In every way except breadth of adoption.
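
As a rough way to feel the difference from the consuming side, here is a Python sketch (standard library only) that pulls the same logical runs out of both snippets. The namespace declarations are added so the fragments parse on their own, the padding spaces inside the OpenDocument paragraph are trimmed, and `odf_runs` / `ooxml_runs` are hypothetical helper names. It handles only this one shape of paragraph and is in no way a general converter.

```python
import xml.etree.ElementTree as ET

T = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

ODF = (
    f'<text:p xmlns:text="{T}" text:style-name="Para">This is some '
    '<text:span text:style-name="Bold">bold text</text:span>'
    ' in a paragraph.</text:p>'
)
OOXML = (
    f'<w:p xmlns:w="{W}"><w:pPr><w:pStyle w:val="Para"/></w:pPr>'
    '<w:r><w:t>This is some </w:t></w:r>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>bold text</w:t></w:r>'
    '<w:r><w:t> in a paragraph.</w:t></w:r></w:p>'
)


def odf_runs(p):
    """Return (text, style_name) pairs read from mixed content."""
    runs = [(p.text or "", None)]
    for span in p:
        runs.append((span.text or "", span.get(f"{{{T}}}style-name")))
        runs.append((span.tail or "", None))
    return [r for r in runs if r[0]]


def ooxml_runs(p):
    """Return (text, has_bold_element) pairs, one per <w:r> run."""
    ns = {"w": W}
    out = []
    for r in p.findall("w:r", ns):
        text = "".join(t.text or "" for t in r.findall("w:t", ns))
        out.append((text, r.find("w:rPr/w:b", ns) is not None))
    return out


odf = odf_runs(ET.fromstring(ODF))
ooxml = ooxml_runs(ET.fromstring(OOXML))
print(odf)
print(ooxml)
```

Both end up as the same three runs. The OpenDocument side leans on XML's native mixed content and a named style, while the OOXML side wraps every stretch of text in its own run with its own property block. One further wrinkle, as far as I recall: Word expects `xml:space="preserve"` on a `w:t` with leading or trailing spaces, which the one-line example leaves out.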