FoundationDB: From idea to Apple acquisition [video]

https://www.youtube.com/watch?v=C1nZzQqcPZw

181•zdw•4d ago

Comments

jauntywundrkind•13h ago

Nice having this backstory (fantastic production value too, impressive start to this podcast). Disaggregating the responsibilities of the DB into multiple pieces just feels so logical, and helps make sure each piece can scale. Deterministic Simulation Testing gets mentioned in the video & was way ahead of its time here. https://apple.github.io/foundationdb/testing.html

Hacker News is here too! From July 2012 (78 points, 72 comments): https://news.ycombinator.com/item?id=4294719

For a general introduction, I enjoyed the recent submission How FoundationDB works and why it works: https://news.ycombinator.com/item?id=37552085 https://uvdn7.github.io/notes-on-the-foundationdb-paper/

vlovich123•10h ago

What a great story and really interesting courage to double-down on improving the testing even when a critical flaw that testing should have found was found. Wish that they had managed long enough for Snowflake to keep them alive, but then we wouldn't have Antithesis as a service so silver lining.

tptacek•9h ago

By "keep them alive", you mean the team, right? People are definitely still using FDB!

vlovich123•8h ago

The team pushing forward the vision. Using FDB is a fraction of the vision if you listen to them.

tptacek•7h ago

Makes sense, thanks! Antithesis is pretty neat, though (we use it for a distributed system thing here).

majestik•9h ago

I can't put my finger on it, but there's a weird tension between the two Daves in this video. Almost like Rosenthal is trying to impress or earn the praise of Scherer.

Is there a backstory between these guys / FDB?

Dave_Rosenthal•6h ago

Ha, well I met Scherer ~30 years ago in a high school math class and we’ve done three companies together, so you could say we’ve known each other for a bit :)

philosopher1234•9h ago

Does anyone know of cool things built with fdb? I’ve been aware of it for a while and it seems very cool but I haven’t seen a lot of details about how folks are using it.

pjd7•9h ago

https://www.youtube.com/watch?v=oYiFTBO67uU

https://innovation.ebayinc.com/stories/graphload-a-framework...

It looks like people are using it to build graph-related models on top of it.

I am looking at it & considering doing something similar for graph data sets, as well as using it as a transactionally safe key-value store for roaring bitmaps.

bpicolo•9h ago

Apple uses it for CloudKit. I'd say that's pretty cool. Snowflake uses it for their metadata layer. Datadog uses it for their system called Husky (https://www.datadoghq.com/blog/engineering/introducing-husky...)

pstuart•9h ago

An abandoned project I'd love to see resurrected is SQLite on fdb: https://github.com/losfair/mvsqlite

jwr•35m ago

I am moving my SaaS from RethinkDB to FoundationDB. It's a long-term project that needs to be done very carefully (thousands of people using the app), but the rewards are significant. Thanks to FoundationDB versionstamps, I'll be able to replace changefeeds with polling, simplifying the system, and also make things much faster along the way.

The consistency guarantees are phenomenal and writing software is much easier when you have strict serializability. Most people do not appreciate this because they do not understand the anomalies that you can get without strict serializable consistency.
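A toy sketch of the "replace changefeeds with polling" idea above, for readers who haven't seen the pattern. It assumes an in-memory log whose keys are monotonically increasing commit versions, standing in for versionstamped keys; `ChangeLog`, `read_after` and `poll` are hypothetical names for illustration, not the FoundationDB API.

```python
import bisect
import itertools


class ChangeLog:
    """In-memory stand-in for an ordered log keyed by commit version.

    Hypothetical helper: a real system would write versionstamped keys
    inside database transactions instead of using a local counter.
    """

    def __init__(self):
        self._versions = []  # commit versions, always in ascending order
        self._events = []    # event payloads, parallel to _versions
        self._counter = itertools.count(1)

    def append(self, event):
        """Record an event under the next version and return that version."""
        version = next(self._counter)
        self._versions.append(version)
        self._events.append(event)
        return version

    def read_after(self, last_seen):
        """Return (version, event) pairs with version strictly above last_seen."""
        start = bisect.bisect_right(self._versions, last_seen)
        return list(zip(self._versions[start:], self._events[start:]))


def poll(log, last_seen):
    """One polling step: fetch new events and advance the consumer's cursor."""
    changes = log.read_after(last_seen)
    if changes:
        last_seen = changes[-1][0]
    return [event for _, event in changes], last_seen
```

Each consumer only has to remember its own cursor (the last version it saw), so there is no server-side subscription state to manage. Because versions are totally ordered, a range read starting just after the cursor cannot skip an event.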

mannyv•10m ago

From what I understand one of the big IP ad tracking services (El Toro) is built on FoundationDB.

romanhn•9h ago

Posted about this in the past, but what really got FoundationDB on my radar was a demo at a developer conference, back in 2014-ish. They had the database running across a bunch of machines, with a visual showing their health and data distribution. One team member would then be turning machines on and off (or maybe unplugging them from the network) and you could see FDB effortlessly rebalancing the data across the available nodes. It was a very striking, impressive presentation (especially as we were dealing with the challenges of distributed Cassandra at the time).

The beginning of this video has some of that: https://youtu.be/Nrb3LN7X1Pg

AtlasBarfed•7h ago

So ... cassandra does that? I get the FDB demo probably made it look better and easier.

But data doesn't teleport except in demos. Rebalancing means streaming data across a network, consuming total network I/O, regardless of the distributed database.

Did you actually implement FDB, and was it better?

rapsey•5h ago

Cassandra only got reliable many years later. FDB was the gold standard that set the bar. They did not need Jepsen tests to implement it properly.

jwr•47m ago

Comparing Cassandra to FoundationDB is like comparing a spreadsheet in Google Sheets to PostgreSQL.

I mean, both kind of store data, and multiple users can change the data that is being stored. The story of what you'll get back and when (if ever), however, is rather different.

I would respectfully suggest that anyone that wants to comment in distributed database discussions should be familiar with https://jepsen.io/consistency/models and https://antithesis.com/resources/reliability_glossary/ and use the wording found there.

If your eyes glaze over, because there is a lot of complex stuff there, it is likely that your comments will not have much value.

Nican•9h ago

FoundationDB has been growing as my favorite database lately, even though it is only a key-value store.

Out of curiosity: what are the scale limits of FoundationDB? What kind of issues would it start to have? For example, would it be able to store all of Discord's messages?

I see blog posts of Discord moving to Scylla and ElasticSearch, but I wonder if there would be any difficulties here.

hardwaresofton•6h ago

Note that FDB can support other paradigms on top of KV

https://foundationdb.github.io/fdb-record-layer/SQL_Referenc...

Also IIRC Apple uses FDB at tremendous scale:

https://read.engineerscodex.com/p/how-apple-built-icloud-to-...

msy•8h ago

Does anyone know how widely FoundationDB is now being used at Apple? I know they run a huge Cassandra cluster, does this serve a different use case?

minitoar•6h ago

iCloud uses both.

ntqz•3h ago

My understanding is that CloudKit runs on it.

ethan_smith•1h ago

Apple uses FoundationDB extensively for iCloud services including CloudKit, with public documentation confirming it handles billions of operations per second across their infrastructure.

iangregson•3h ago

+1 really enjoyed this

pjmlp•3h ago

Nowadays being rewritten into Swift.

"Swift as C++ Successor in FoundationDB" by Konrad Malawski (Strange Loop 2023)

https://www.youtube.com/watch?v=ZQc9-seU-5k

jen20•2h ago

That was an experiment the team didn’t end up committing to - it’s been backed out. That said, it was a fascinating dive into the flexibility of Swift, and Konrad’s talk is excellent and worth watching.

https://github.com/apple/foundationdb/commit/e52fc3621fd5e41...

pjmlp•2h ago

Interesting, thanks for sharing, is there a rationale somewhere?

As someone that enjoys using C++ despite all its warts, I can imagine a few reasons, but it would nonetheless be an interesting read, in case that is public.

I guess that experience might also have had an impact on ongoing Swift 6+ features.

TobbenTM•1h ago

The Ladybird project started a similar journey, and indeed they are mainly waiting on Swift 6+ features as documented in their blockers issue: https://github.com/LadybirdBrowser/ladybird/issues/933

piokoch•2h ago

I've looked at FoundationDB and on paper it looks great. But it never got momentum like, say, MongoDB. Is this just a matter of hype, or is it not as great as advertised?

chrischen•2h ago

I think it wasn't as easy to use or get started with. There was a MongoDB compatibility layer but it wasn't maintained.

qcnguy•2h ago

Many reasons.

FoundationDB started development in the same year MongoDB launched but took nearly four years to reach the market. It's the rarely discussed dark side of great testing - you can end up with robust code nobody cares about because it arrives years after people decided they wanted it. Everyone went with what existed and learned to deal with its quirks. In this case they got lucky I guess that Apple saw the potential for iCloud and bought them out, but the people who had bet on FDB before then kinda lost. You really don't want your database to be bought and made fully private tech. MongoDB was open source at the start and went closed later but never disappeared, so whilst the license switch pissed people off it didn't fundamentally wreck MongoDB as a viable tech.

Database tech has a chicken and egg problem. Most people don't want to run their own infrastructure anymore. No clouds offer hosted FoundationDB, so people don't want to use it for that reason, which means there's no demand, so clouds don't offer it, ad infinitum. MongoDB was released around the start of the cloud era, just three years after AWS first launched, so that was less of an issue. Back then "cloud" just meant VMs and storage. And later Mongo built their own cloud offering.

FoundationDB does full strict serializability checks, which is expensive. One trick it uses to get acceptable performance is by imposing a difficult programming model on the user. Keys and values must be small. Think individual fields of a JSON object, not objects themselves. Transactions also have very small limits in lifespan and size. You can't open a transaction and run a computation against your entire dataset in FoundationDB unless it's tiny. Everything has to complete in five seconds or else your transaction dies.

Their website used to claim this timeout isn't even configurable; it's hard to know if that has changed because the FoundationDB team at Apple don't care about marketing. Probably Apple don't care if anyone else uses it and only made it open source to make the team happy. Even quite average open source projects have better marketing. Their blog consists only of release announcements and the last one was in 2022. A casual visitor who didn't know better would think it had been abandoned years ago.

The scalability story is unclear. It doesn't matter for most people but the biggest FDB clusters are about 100T in size. Apple say they use it for iCloud but really they use a large fleet of FDB clusters with lots of in-house tooling for balancing and moving data between those clusters. Effectively they built another scaling layer on top of core FDB.

Even if you work through all of that, what you get is a key value store. Not really a database, it's more like the bottom layer of a database. That's why it's called FoundationDB. It's not meant to be used directly. There are layers that turn your actual data into key/value pairs in a way that offers features like schema handling, object serialization etc but they are language specific and not so well documented. Most devs on the backend will have ORMs or frameworks they already want to use, and Apple server-side is mostly a Java shop so there's a Java layer, but you can't just point Spring at an FDB cluster and go. For instance, there's no notion of a query, or a query planner or even indexes. You're expected to handle all that stuff using libraries in your app.

So overall it's a highly solid bit of tech that solved a very small, very specific problem very well but years too late for anyone to care. Except for Apple. Good work, whichever Apple executive sponsored that deal!
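To make the "fields, not objects" and "indexes live in your app" points concrete, here is a minimal sketch of what a layer does. It assumes a toy in-memory store with tuple keys; `OrderedKV`, `save_user`, `load_user`, `find_by_email` and the key layout are hypothetical names for illustration, not the FoundationDB API or any real layer.

```python
class OrderedKV:
    """Toy ordered key-value store with tuple keys (hypothetical, not FDB)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get_range(self, prefix):
        """Return (key, value) pairs whose key starts with prefix, in key order."""
        matches = [(k, v) for k, v in self._data.items() if k[:len(prefix)] == prefix]
        return sorted(matches, key=lambda pair: pair[0])


def save_user(kv, user_id, record):
    """Store one small key per field, plus an index entry keyed by email.

    In a real database both kinds of write would go into a single
    transaction so the index can never disagree with the record.
    """
    for field, value in record.items():
        kv.set(("user", user_id, field), value)
    # The index entry carries everything in the key; the value is empty.
    kv.set(("idx_email", record["email"], user_id), b"")


def load_user(kv, user_id):
    """Reassemble a record from its per-field keys with one range read."""
    return {key[2]: value for key, value in kv.get_range(("user", user_id))}


def find_by_email(kv, email):
    """The 'query': a range read over the index prefix, returning user ids."""
    return [key[2] for key, _ in kv.get_range(("idx_email", email))]
```

This sketch never removes a stale index entry when an email changes. Handling that case, along with schema evolution and choosing which index to read, is the work a layer or query planner would otherwise do for you.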

jwr•40m ago

It is difficult to use by itself: the "foundation" in the name describes it quite well. It is a foundation that you build a database on. It fits my use case very well, for example, because I know my data model and usage patterns very well and I can integrate deeply with the database, but it's not a good match by itself for quick-and-dirty apps.

It provides fantastic (strict serializable) consistency guarantees in a distributed database, which is extremely rare. It is a huge advantage, but sadly most people do not understand how badly most distributed databases are broken and don't even understand the concepts (https://antithesis.com/resources/reliability_glossary/) well enough to talk about the issues involved. See every discussion where someone mentions ACID.

It's hard to compete for mindshare when the concepts are difficult and every other database has a warm-and-fuzzy-feeling website saying that everything will be great (it usually won't).

Personally, I hope more people will start using it, and I hope to see more easy-to-use databases built on top of it (that's what it was designed for, really). In my experience with it, working with a fast distributed database that gives you strict serializable semantics right in your code is fantastic.

gregoriol•1h ago

So if it has been acquired by Apple, it's a failure, isn't it? Most things acquired by Apple become unmaintained, change completely, or disappear. Being "open-source" here doesn't bring any guarantees to any third-party user about maintenance or long-term life. It should be a serious no-go indicator for anyone willing to build something with it.

dialup_sounds•1h ago

It was acquired ten years ago.
