Obviously all software benefits from correctness — but we all know that software certainly doesn’t need to be bug-free to make money!
And if the benefits of formal verification don't outweigh the (non-trivial!) costs (and critically, outweigh them in a business-person-legible way) then people won't put in the time and effort.
https://www.galois.com/articles/what-works-and-doesnt-sellin...
corresponding HN thread from earlier this week: https://news.ycombinator.com/item?id=44131324
As I mention in another comment, I ended up not going the TLA+ route. It isn't because TLA+ cannot offer this; it's because it's a heavy, opinionated investment. That is a good trade-off for some systems, like databases, but not for everything.
For most problems this is an assumed certainty.
Having done multiple TLA+ verification projects myself, here are some technical reasons:
1. It's actually surprisingly hard to write a good specification, i.e., to precisely and formally state what you want. Often it's not even practical, i.e., the specification becomes so complex that you end up nearly recreating the implementation. Kolmogorov complexity is also a thing for specifications, I suppose ;)

2. TLA+ in particular is mostly useful for concurrent/distributed systems.

3. Model checking (the method used to verify TLA+ models) hits limits pretty quickly. E.g., you may only be able to check your system for 2 threads, but forget about 3 threads.

4. The TLA+ tooling is very clunky by modern standards.
And the best part is, when you implement the spec, it works! At least by design.
It is quite a different task to implement models of existing systems in fine detail. But it can be done, e.g., https://arxiv.org/abs/2409.14301. This group in particular combined fine- and coarse-grained models during checking, allowing them to focus the checking effort only on the parts of the system they were interested in.
Model-based property testing, on the other hand, gets us a lot of the benefits but drops in more easily. There are great proptest frameworks available in the language we use, and the team is already experienced with them. TLA+ is much more exhaustive and powerful, but for the exact project I was looking at, the additional cost couldn't be justified by the additional benefit.
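To sketch what I mean, here's a minimal model-based property test in Rust using the proptest crate (the `Counter` type and its one-integer model are invented for the example; a real system is obviously more involved): generate random operation sequences, run them against both the implementation and a trivially-correct model, and assert they agree after every step.

    use proptest::prelude::*;

    // Implementation under test (a stand-in for the real system).
    #[derive(Default)]
    struct Counter { value: i64 }

    impl Counter {
        fn increment(&mut self) { self.value += 1; }
        fn reset(&mut self) { self.value = 0; }
        fn get(&self) -> i64 { self.value }
    }

    // Operations that both the implementation and the model understand.
    #[derive(Clone, Debug)]
    enum Op { Increment, Reset }

    proptest! {
        // Random operation sequences; proptest shrinks failures
        // to a minimal counterexample automatically.
        #[test]
        fn counter_matches_model(ops in proptest::collection::vec(
            prop_oneof![Just(Op::Increment), Just(Op::Reset)], 0..100
        )) {
            let mut real = Counter::default();
            let mut model: i64 = 0; // the "model" is just an integer
            for op in &ops {
                match op {
                    Op::Increment => { real.increment(); model += 1; }
                    Op::Reset => { real.reset(); model = 0; }
                }
                prop_assert_eq!(real.get(), model);
            }
        }
    }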
I believe we still have some TLA+ work moving forward inside the company but for my product I couldn't justify it.
My technique was slightly different, though. First, I ended up ignoring the initial states, because tests would often manually create fixtures that would serve as a starting point. So I only ended up checking that trace steps obey the transition predicate, which is weaker, but hey, all this is best-effort only anyways.

Second, I ended up using the Apalache tool instead of TLC; the reason being that my transitions were sometimes of the form "there exists a number between 1 and N". While model checking you would pick a small N, but in the test runs where the traces come from, the N was sometimes huge. TLC ends up enumerating all the possibilities between 1 and N, whereas Apalache translates the whole thing into an SMT formula, which is generally trivial to check. Apalache also has the side benefit of requiring type annotations (and providing a type checker), which makes the models a lot easier to refactor when the code changes.
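To make the step-checking part concrete, here's roughly what the check reduces to, sketched in plain Rust rather than TLA+ (in reality the transition predicate is the model's Next relation, evaluated by Apalache; the `State` fields and the rule inside `transition_allowed` are invented for illustration):

    /// One recorded state snapshot from a trace (fields are illustrative).
    #[derive(Debug, Clone, PartialEq)]
    struct State { counter: u64, lock_held: bool }

    /// Stand-in for the model's transition predicate:
    /// may the system step from `pre` to `post`?
    fn transition_allowed(pre: &State, post: &State) -> bool {
        // e.g. the counter may only grow while the lock is held,
        // and lock acquire/release steps leave the counter alone
        (pre.lock_held && post.counter >= pre.counter)
            || (pre.lock_held != post.lock_held && post.counter == pre.counter)
            || (pre == post)
    }

    /// Check every step of a recorded trace. The initial state is
    /// deliberately not checked, as described above, because test
    /// fixtures construct it manually.
    fn validate_trace(trace: &[State]) -> Result<(), usize> {
        for (i, pair) in trace.windows(2).enumerate() {
            if !transition_allowed(&pair[0], &pair[1]) {
                return Err(i); // index of the first offending step
            }
        }
        Ok(())
    }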
I also ended up creating a small library for helping instrument Rust programs to collect such logs. The idea here was to minimize the amount of logging statements in the production code, to keep it out of the way of the people working on the code base who aren't familiar with TLA. It's somewhat specific to our code base, and there's a fair amount of unsafe nastiness that works only because our code is of certain shape, but in case someone's interested, it's open source: https://github.com/dfinity/ic/tree/master/rs/tla_instrumenta....
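The rough shape of the idea, for the curious (this is not the actual API of the linked library, just a minimal sketch assuming serde/serde_json): snapshot the relevant state at each labeled action into a shared log, then dump the log as JSON for the offline model check.

    use serde::Serialize;
    use std::sync::Mutex;

    // One labeled state snapshot in the trace.
    #[derive(Serialize)]
    struct TraceEvent<S> {
        action: &'static str,
        state: S,
    }

    // A shared, append-only trace log.
    struct TraceLog<S> {
        events: Mutex<Vec<TraceEvent<S>>>,
    }

    impl<S: Serialize> TraceLog<S> {
        fn new() -> Self {
            TraceLog { events: Mutex::new(Vec::new()) }
        }

        // Called from instrumented code at each interesting action.
        fn record(&self, action: &'static str, state: S) {
            self.events.lock().unwrap().push(TraceEvent { action, state });
        }

        // Serialize the whole trace for the offline trace checker.
        fn dump(&self) -> String {
            serde_json::to_string(&*self.events.lock().unwrap()).unwrap()
        }
    }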
I wasn't aware of the 2024 paper they reference, though, so I'm curious to see what approach they took.
[1] https://www.mongodb.com/blog/post/engineering/conformance-ch...
magicalhippo•1d ago
Has it become a legacy product, or are there still good reasons for using it in a new project in 2025?
[1]: https://www.youtube.com/watch?v=b2F-DItXtZs
[2]: https://news.ycombinator.com/item?id=1636198
Magmalgebra•1d ago
It's not clear there ever was. Most of the big users I'm aware of, like Stripe, don't seem to have needed it and regretted the decision to adopt it. Big data didn't become a thing in the way people expected [0]. If you really did need the scalability of Mongo, you'd choose a NewSQL database like TiDB [1].
[0] https://motherduck.com/blog/big-data-is-dead/
[1] https://docs.pingcap.com/
threeseed•1d ago
b) Nobody was choosing MongoDB solely for performance. If you were, you would choose some in-memory K/V store. It was about it being the only well-supported document store that was also fast and scalable.
c) Stripe’s DocDB is an extension of MongoDB Community: https://stripe.com/blog/how-stripes-document-databases-suppo...
guappa•1d ago
computerfan494•1d ago
hobs•21h ago
threeseed•1d ago
So very far from being a legacy product.
I still use it for new projects because (a) Atlas is genuinely a solid offering with a great price point for startups and (b) schema-less datastores have become more of a necessity as our control of data has decreased, e.g. SaaS companies dictate their schema and we need to accommodate.
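To sketch the pattern (a hypothetical webhook handler with invented field names): persist whatever shape the upstream SaaS sends verbatim, and only pick out the fields you actually depend on.

    use serde_json::Value;

    // Accept whatever the upstream SaaS sends, read only the
    // fields we rely on, and store the payload as-is.
    fn handle_webhook(raw: &str) -> serde_json::Result<()> {
        let payload: Value = serde_json::from_str(raw)?;
        if let Some(id) = payload.get("id").and_then(Value::as_str) {
            println!("upserting document {id}");
            // persist `payload` verbatim in the document store here
        }
        Ok(())
    }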
anonymousDan•1d ago
MoreQARespect•1d ago
I'm still puzzled why people use it given that it's a database and there's nothing technical it ever did better than any of its competitors. The best that can be said for it is that it works about as well in some circumstances as competing databases do.
Aldipower•1d ago
lunarcave•1d ago
I swore off it for 10 years, but came back recently. And I'm pleasantly surprised by the developer experience of MongoDB Atlas (the cloud version).
You just have to keep in mind the common-sense best practices for developing with KV stores, and you'll be mostly alright.
Thaxll•1d ago
Fortnite runs on MongoDB, and it's one of the largest games on the planet.
15 years ago Postgres did not have proper replication, shocking right?
viraptor•1d ago
guappa•1d ago
mxey•1d ago
tomnipotent•1d ago
erulabs•1d ago
I wouldn't start a new project with MongoDB. I'd probably use ScyllaDB, and I'd spend months getting the data model just right while you launch and get paying customers.
VWWHFSfQ•20h ago
0x70dd•1d ago
HdS84•1d ago
serguzest•23h ago
otabdeveloper4•1d ago
We're in the process of migrating our legacy stuff to MongoDB right now.
If you want a high-availability database solution that "just works" and is open then you don't have many other options.
(Postgres and MySQL don't have automatic failover and recovery.)
guappa•1d ago
otabdeveloper4•1d ago
In the same sense that the GPL is more open than the MIT license; more viral requirements for openness are generally a good thing. I don't want Amazon and their ilk deploying hosted MongoDB clusters.
guappa•1d ago
otabdeveloper4•1h ago
Open source for me is a hedge against risks (especially political and/or legal ones). MongoDB's current license resolves that issue for me perfectly.
mxey•1d ago
MySQL Group Replication isn't automatic in all cases, like restarting after all nodes have failed, and it has some limitations, but it is built in.
Not saying that MongoDB ReplicaSet is bad; it has been working very well for us AFAICT.
otabdeveloper4•1h ago
MongoDB is entirely automatic: you just delete bad servers, reprovision them, and everything "just works" with no interruption.
arccy•1d ago
tgv•1d ago
I've written this before: if your data looks like trees, with some loose coupling between them, it's a good choice. And most data does look like trees.
It does place some extra duties on the backend. E.g., MongoDB doesn't propagate ("cascade") deletes. (This also happens to be a feature I dislike: not so long ago, a delete of an insignificant record triggered an avalanche of deletes in a PostgreSQL database, and restoring that took quite a bit of time.)
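Concretely, the backend ends up doing something like this (a sketch assuming the 2.x async API of the official mongodb Rust driver; collection and field names are invented). Note that without a multi-document transaction the two deletes aren't atomic:

    use mongodb::{bson::{doc, oid::ObjectId, Document}, Client};

    // Application-level "cascade": the database won't do it for us,
    // so delete the children explicitly, then the parent.
    async fn delete_with_children(
        client: &Client,
        parent_id: ObjectId,
    ) -> mongodb::error::Result<()> {
        let db = client.database("app");
        db.collection::<Document>("children")
            .delete_many(doc! { "parent_id": parent_id }, None)
            .await?;
        db.collection::<Document>("parents")
            .delete_one(doc! { "_id": parent_id }, None)
            .await?;
        Ok(())
    }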
Semaphor•1d ago
hobs•21h ago
Semaphor•12h ago
mxey•1d ago
tgv•23h ago
nailer•1d ago
I had an education startup a little while ago.
Courses had many cohorts, cohorts had many sessions.
It really was much nicer having a single tree structure for each course, appending new cohorts etc, rather than trying to represent this in a flat Postgres database.
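The document shape was essentially this (serde-style structs just to show the nesting; the field names are invented): one document per course, with the whole hierarchy inside it, so adding a cohort or session is a single append.

    use serde::{Deserialize, Serialize};

    // One document per course; the hierarchy nests inside it.
    #[derive(Serialize, Deserialize)]
    struct Course {
        name: String,
        cohorts: Vec<Cohort>,
    }

    #[derive(Serialize, Deserialize)]
    struct Cohort {
        starts: String, // e.g. an ISO-8601 date
        sessions: Vec<Session>,
    }

    #[derive(Serialize, Deserialize)]
    struct Session {
        topic: String,
        scheduled_at: String,
    }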
That said, I acknowledge and agree with other commenters' experiences of MongoDB and data loss in the past.
elcritch•23h ago
senderista•22h ago
Aldipower•1d ago