Is there a backstory between these guys / FDB?
https://innovation.ebayinc.com/stories/graphload-a-framework...
It looks like people are using it to build graph-related models.
I am looking at it and considering doing something similar for graph data sets, as well as using it as a transactionally safe key-value store for roaring bitmaps.
The consistency guarantees are phenomenal and writing software is much easier when you have strict serializability. Most people do not appreciate this because they do not understand the anomalies that you can get without strict serializable consistency.
The beginning of this video has some of that: https://youtu.be/Nrb3LN7X1Pg
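To make "anomalies" concrete, here is a toy sketch of one of them, write skew. It is plain Python, not FoundationDB code, and the on-call example and every name in it are made up for illustration. Two transactions each read both rows from the same snapshot and each write only their own row. A store that only checks write-write conflicts lets both commit and breaks the invariant. A store that also checks whether anything a transaction read has been overwritten since its snapshot aborts the second one.

```python
# Toy illustration of write skew; hypothetical names, not FoundationDB code.
# Invariant the application wants: at least one doctor stays on call.

def run(check_read_conflicts: bool) -> dict:
    db = {"alice": True, "bob": True}

    # Both transactions take their snapshot before either one commits.
    snap_a, snap_b = dict(db), dict(db)

    def try_go_off_call(snapshot, me):
        # Each transaction reads every row but writes only its own.
        on_call = [name for name, on in snapshot.items() if on]
        if len(on_call) >= 2:
            return {"reads": set(snapshot), "writes": {me: False}}
        return None

    txn_a = try_go_off_call(snap_a, "alice")
    txn_b = try_go_off_call(snap_b, "bob")

    committed_writes = set()
    for txn in (txn_a, txn_b):
        if txn is None:
            continue
        # Serializable-style check: abort if a key this transaction read
        # was written by a transaction that committed after its snapshot.
        if check_read_conflicts and txn["reads"] & committed_writes:
            continue
        db.update(txn["writes"])
        committed_writes |= set(txn["writes"])
    return db


print(run(check_read_conflicts=False))  # both commit, nobody is on call
print(run(check_read_conflicts=True))   # second transaction is aborted
```

The write sets are disjoint, so checking writes alone never catches this. Only tracking reads does.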
But data doesn't teleport except in demos. Rebalancing means streaming data across a network, consuming total network I/O, regardless of the distributed database.
Did you actually implement FDB, and was it better?
I mean, both kind of store data, and multiple users can change the data that is being stored. The story of what you'll get back and when (if ever), however, is rather different.
I would respectfully suggest that anyone who wants to comment in distributed database discussions should be familiar with https://jepsen.io/consistency/models and https://antithesis.com/resources/reliability_glossary/ and use the wording found there.
If your eyes glaze over, because there is a lot of complex stuff there, it is likely that your comments will not have much value.
Out of curiosity: what are the scale limits of FoundationDB? What kind of issues would it start to have? For example, could it store all of Discord's messages?
I see blog posts of Discord moving to Scylla and ElasticSearch, but I wonder if there would be any difficulties here.
https://foundationdb.github.io/fdb-record-layer/SQL_Referenc...
Also IIRC Apple uses FDB at tremendous scale:
https://read.engineerscodex.com/p/how-apple-built-icloud-to-...
"Swift as C++ Successor in FoundationDB" by Konrad Malawski (Strange Loop 2023)
https://github.com/apple/foundationdb/commit/e52fc3621fd5e41...
As someone who enjoys using C++ despite all its warts, I can imagine a few reasons, but it would nonetheless be an interesting read, in case that is public.
I guess that experience might also have had an impact on ongoing Swift 6+ features.
FoundationDB started development in the same year MongoDB launched but took nearly four years to reach the market. It's the rarely discussed dark side of great testing - you can end up with robust code nobody cares about because it arrives years after people decided they wanted it. Everyone went with what existed and learned to deal with its quirks. In this case they got lucky I guess that Apple saw the potential for iCloud and bought them out, but the people who had bet on FDB before then kinda lost. You really don't want your database to be bought and made fully private tech. MongoDB was open source at the start and went closed later but never disappeared, so whilst the license switch pissed people off it didn't fundamentally wreck MongoDB as a viable tech.
Database tech has a chicken and egg problem. Most people don't want to run their own infrastructure anymore. No clouds offer hosted FoundationDB, so people don't want to use it for that reason, which means there's no demand, so clouds don't offer it, ad infinitum. MongoDB was released around the start of the cloud era, just three years after AWS first launched, so that was less of an issue. Back then "cloud" just meant VMs and storage. And later Mongo built their own cloud offering.
FoundationDB does full strict serializability checks, which is expensive. One trick it uses to get acceptable performance is imposing a difficult programming model on the user. Keys and values must be small. Think individual fields of a JSON object, not objects themselves. Transactions also have very small limits in lifespan and size. You can't open a transaction and run a computation against your entire dataset in FoundationDB unless it's tiny. Everything has to complete in five seconds or else your transaction dies.
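As a rough sketch of what "individual fields of a JSON object" looks like in practice, the snippet below flattens a nested object into one small key/value pair per leaf field. The '/'-joined key encoding and the helper names are hypothetical. Real code would more likely use the bindings' tuple layer for keys. The size constants are the documented limits as I remember them (10,000-byte keys, 100,000-byte values), so treat them as assumptions and check the docs.

```python
import json

# Assumed limits, recalled from the FoundationDB docs; verify before relying on them.
MAX_KEY_BYTES = 10_000
MAX_VALUE_BYTES = 100_000


def flatten(prefix: tuple, obj) -> dict:
    """Turn a nested JSON-like object into {path_tuple: scalar} pairs."""
    out = {}
    if isinstance(obj, dict):
        for k, v in obj.items():
            out.update(flatten(prefix + (k,), v))
    elif isinstance(obj, list):
        for i, v in enumerate(obj):
            out.update(flatten(prefix + (i,), v))
    else:
        out[prefix] = obj  # leaf field: one key/value pair
    return out


def encode(pairs: dict) -> dict:
    """Hypothetical byte encoding: '/'-joined path as key, JSON scalar as value."""
    encoded = {}
    for path, value in pairs.items():
        key = "/".join(str(p) for p in path).encode()
        val = json.dumps(value).encode()
        if len(key) > MAX_KEY_BYTES or len(val) > MAX_VALUE_BYTES:
            raise ValueError(f"pair too large for key {key!r}")
        encoded[key] = val
    return encoded


user = {"name": "Ada", "tags": ["a", "b"], "address": {"city": "London"}}
for key, val in sorted(encode(flatten(("user", 42), user)).items()):
    print(key, val)
```

Updating one field then touches one small pair instead of rewriting the whole object, which is what keeps transactions small.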
Their website used to claim this timeout isn't even configurable. It's hard to know if that has changed, because the FoundationDB team at Apple don't care about marketing. Probably Apple don't care if anyone else uses it and only made it open source to make the team happy. Even quite average open source projects have better marketing. Their blog consists only of release announcements and the last one was in 2022. A casual visitor who didn't know better would think it had been abandoned years ago.
The scalability story is unclear. It doesn't matter for most people but the biggest FDB clusters are about 100T in size. Apple say they use it for iCloud but really they use a large fleet of FDB clusters with lots of in-house tooling for balancing and moving data between those clusters. Effectively they built another scaling layer on top of core FDB.
Even if you work through all of that, what you get is a key value store. Not really a database, it's more like the bottom layer of a database. That's why it's called FoundationDB. It's not meant to be used directly. There are layers that turn your actual data into key/value pairs in a way that offers features like schema handling, object serialization etc but they are language specific and not so well documented. Most devs on the backend will have ORMs or frameworks they already want to use, and Apple server-side is mostly a Java shop so there's a Java layer, but you can't just point Spring at an FDB cluster and go. For instance, there's no notion of a query, or a query planner or even indexes. You're expected to handle all that stuff using libraries in your app.
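To illustrate what "handle all that stuff in your app" means for indexes, here is a minimal sketch of a hand-maintained secondary index. ToyKV is an in-memory stand-in I made up, not the FDB API, and the key layout is just one plausible choice. On a real cluster the record write and the index write would go in the same transaction, which is what keeps them from drifting apart.

```python
class ToyKV:
    """In-memory stand-in for an ordered key-value store (hypothetical, not the FDB API)."""

    def __init__(self):
        self.data = {}

    def set(self, key: tuple, value):
        self.data[key] = value

    def get(self, key: tuple):
        return self.data.get(key)

    def clear(self, key: tuple):
        self.data.pop(key, None)

    def range(self, prefix: tuple):
        # Prefix scan in key order, standing in for a range read.
        keys = sorted(k for k in self.data if k[:len(prefix)] == prefix)
        return [(k, self.data[k]) for k in keys]


def save_user(kv: ToyKV, user_id: int, email: str) -> None:
    """Write the record and keep the email index in step with it."""
    old_email = kv.get(("user", user_id))
    if old_email is not None:
        kv.clear(("idx_email", old_email, user_id))  # drop the stale index entry
    kv.set(("user", user_id), email)
    kv.set(("idx_email", email, user_id), b"")  # index entry: the key carries all the data


def find_by_email(kv: ToyKV, email: str) -> list:
    """The 'query': a prefix scan over the index keys."""
    return [key[2] for key, _ in kv.range(("idx_email", email))]


kv = ToyKV()
save_user(kv, 1, "a@example.com")
save_user(kv, 2, "a@example.com")
save_user(kv, 1, "b@example.com")
print(find_by_email(kv, "a@example.com"))
print(find_by_email(kv, "b@example.com"))
```

Deciding which index to scan for a given question is the query planner's job, and here that job is yours too.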
So overall it's a highly solid bit of tech that solved a very small, very specific problem very well but years too late for anyone to care. Except for Apple. Good work, whichever Apple executive sponsored that deal!
It provides fantastic (strict serializable) consistency guarantees in a distributed database, which is extremely rare. It is a huge advantage, but sadly most people do not understand how badly most distributed databases are broken and don't even understand the concepts (https://antithesis.com/resources/reliability_glossary/) well enough to talk about the issues involved. See every discussion where someone mentions ACID.
It's hard to compete for mindshare when the concepts are difficult and every other database has a warm-and-fuzzy-feeling website saying that everything will be great (it usually won't).
Personally, I hope more people will start using it, and I hope to see more easy-to-use databases built on top of it (that's what it was designed for, really). In my experience with it, working with a fast distributed database that gives you strict serializable semantics right in your code is fantastic.
Hacker News is here too! From July 2012 (78 points, 72 comments): https://news.ycombinator.com/item?id=4294719
For a general introduction, I enjoyed the recent submission "How FoundationDB works and why it works": https://news.ycombinator.com/item?id=37552085 https://uvdn7.github.io/notes-on-the-foundationdb-paper/