
More databases should be single-threaded

https://blog.konsti.xyz/p/8c8a399f-8cfe-47dd-9278-9527105d07dc/
29•lawrencechen•4h ago

Comments

qouteall•2h ago
Real-world business requirements often need to read some data and then touch other data based on the result of that read.

That violates the "every transaction can only be in one shard" constraint.

For a specific business requirement it's possible to design clever sharding so that each transaction fits into one shard. However, new business requirements can emerge and invalidate it.

"Every transaction can only be in one shard" only works for simple business logic.

rowanG077•2h ago
In my experience, on most backends I have worked on, people don't use the facilities of their database. They simply hit the database two or more times. But that doesn't mean it's not possible to do better if you put more care into your queries. Most of the time, multiple transactions can be eliminated. So I don't agree that this is a business-requirement complexity problem. It's an "it works so it's good enough" problem, or a "lazy developer" problem, depending on how you want to frame it.
formerly_proven•2h ago
This (along with n+1) is somewhat encouraged in business applications due to the prevalence of the repository pattern.
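To make the n+1 point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` and `items` tables and both function names are made up for illustration; they are not from the article or from any particular repository-pattern library. The first function does what a naive repository layer tends to do (one query per parent row), the second gets the same data in one round trip.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
    INSERT INTO orders VALUES (1, 'acme'), (2, 'globex');
    INSERT INTO items VALUES (1, 1, 'bolt'), (2, 1, 'nut'), (3, 2, 'gear');
""")


def items_n_plus_one(conn):
    """One query for the orders, then one more query per order."""
    queries = 0
    result = {}
    orders = conn.execute("SELECT id FROM orders ORDER BY id").fetchall()
    queries += 1
    for (order_id,) in orders:
        rows = conn.execute(
            "SELECT sku FROM items WHERE order_id = ? ORDER BY id", (order_id,)
        ).fetchall()
        queries += 1
        result[order_id] = [sku for (sku,) in rows]
    return result, queries


def items_single_query(conn):
    """The same data in a single JOIN.

    Note: an inner JOIN drops orders that have no items; every order in
    the sample data has at least one, so both functions agree here.
    """
    result = {}
    rows = conn.execute(
        "SELECT o.id, i.sku FROM orders o "
        "JOIN items i ON i.order_id = o.id ORDER BY o.id, i.id"
    )
    for order_id, sku in rows:
        result.setdefault(order_id, []).append(sku)
    return result, 1


per_order, n_queries = items_n_plus_one(conn)
joined, one_query = items_single_query(conn)
```

With N orders the first version issues N + 1 queries, which is where the name comes from.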
SoftTalker•2h ago
Give each business or customer its own schema and you almost never need sharding.
n2d4•1h ago
Yes, but you could also flip it the other way around — make the business or customer your sharding key, and you'll only need to manage one schema!
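A rough sketch of what "customer as sharding key" could look like at the routing layer. The function name `shard_for` and the fixed shard count are assumptions for illustration, not anything described in the article.

```python
import hashlib


def shard_for(tenant_id: str, num_shards: int) -> int:
    """Map a tenant (business or customer) ID to a shard index.

    Uses SHA-256 rather than Python's built-in hash(), because hash()
    on strings is salted per process and would route the same tenant
    to different shards after a restart.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Every query for one tenant lands on the same shard.
acme_shard = shard_for("acme-corp", 16)
```

Plain modulo routing like this moves most tenants when `num_shards` changes, so a real deployment would probably want a lookup table or consistent hashing on top.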
n2d4•1h ago
I talk about these problems in the "How hard can sharding be?" section of the article — long story short, not all business requirements can be dealt with easily, but surprisingly many can if you choose a smart sharding key.

You can also still do optimistic concurrency across shards! That covers most of the remaining ones. Anything that requires anything more complex — sagas, 2PC, etc. — is relatively rare, and at scale, a traditional SQL OLTP will also struggle with those.

qouteall•1h ago
Thanks for the reply.

So in my understanding:

- Transactions that only touch one shard are simple.

- Transactions that read multiple shards but only write one shard can use simple optimistic concurrency control.

- Transactions that write (and read) multiple shards stay complex. They can be avoided by designing a smart sharding key (hard to do if the business requirement is complex).
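The second case in the list above might look roughly like this. Everything here is a hypothetical in-memory stand-in (`Shard`, `compare_and_set`, `recompute_total`), not an API from the article: each key carries a version, and the write only succeeds if the version on the written shard is unchanged since it was read.

```python
class Shard:
    """In-memory stand-in for one shard: key -> (version, value)."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # Missing keys read as version 0 with no value.
        return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, new_value):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            return False  # someone else wrote in between; caller retries
        self._data[key] = (current_version + 1, new_value)
        return True


def recompute_total(prices, carts, cart_id, max_retries=5):
    """Read from two shards, write to one, retrying on conflict."""
    for _ in range(max_retries):
        _, price = prices.read("widget")     # read-only shard
        version, cart = carts.read(cart_id)  # shard we will write to
        if price is None or cart is None:
            return False
        updated = {**cart, "total": price * cart["qty"]}
        if carts.compare_and_set(cart_id, version, updated):
            return True
    return False


prices, carts = Shard(), Shard()
prices.compare_and_set("widget", 0, 10)
carts.compare_and_set("c1", 0, {"qty": 3})
ok = recompute_total(prices, carts, "c1")
```

Note that the version check only covers the cart shard. The price read from the other shard is not re-validated at write time, so this sketch is only safe when a slightly stale read from the read-only shard is acceptable.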

n2d4•1h ago
That's right!

If you anticipate you will encounter the third type a lot, and you don't anticipate that you will need to shard either way, what I'm talking about here makes no sense for you.

qouteall•1h ago
Optimistic concurrency control that reads multiple shards cannot use a simple CAS. It probably needs something like two-phase commit.
hinkley•24m ago
Business people have a nasty habit of identifying two independent pieces of data you have and finding ideas to combine them to do something new. They aren’t happy until every piece of data is copied with every other piece and then they still aren’t happy because now everything is horrible because everything is coupled to everything.
kgeist•2h ago
From what I understand, the complexity stays there, it's just moved from the DB layer to the app layer (now I have to decide how to shard data, how to reshard, how to synchronize data across shards, how to run queries across shards without wildly inconsistent results), so as a developer I have more headaches now than before, when most of that was taken care of by the DB. I don't see why it's an improvement.

The author also mentions B2B and I'm not sure how it's going to work. I understand B2C where you can just say "1 user=1 single-threaded shard" because most user data is isolated/independent from other users. But with B2B, we have accounts ranging from 100 users per organization to 200k users per organization. Something tells me making a 200k account single-threaded isn't a good idea. On the other hand, artificially sharding inside an organization will lead to much more complex queries overall too, because usually a lot of business rules require joining different users' data within 1 org.

n2d4•1h ago
It's a different kind of complexity. Essentially, your app layer needs to shift from thinking about:

    - transaction serializability
    - atomicity
    - deadlocks (generally locks)
    - occ (unless you do VERY long tx, like a user checkout flow)
    - retries
    - scale, infrastructure, parameter tuning
towards thinking about

    - separating data into shards
    - sharding keys
    - cross-shard transactions
which can be sometimes easier, sometimes harder. I think there are a surprising amount of problems where it's much easier to think about sharding than about race conditions!

> But with B2B, we have accounts ranging from 100 users per organization to 200k users per organization.

You'd be surprised at how much traffic a single core (or machine) can handle; 200k users is absolutely within reach. At some point you'll need even more granular sharding (e.g. per user within an organization), but at that point you would need sharding anyway, no matter your DB.
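As a toy illustration of why the first list in the comment above (locks, deadlocks, retries) shrinks inside a single-threaded shard, here is a sketch where one worker thread owns all state and runs transactions strictly one after another. `SingleThreadedShard` and `increment` are hypothetical names, and this is an in-memory model, not how any particular database implements it.

```python
import queue
import threading


class SingleThreadedShard:
    """All state is touched by exactly one worker thread.

    Callers submit transactions (plain functions of the state dict);
    the worker runs them serially, so transaction bodies need no locks.
    """

    def __init__(self):
        self._state = {}
        self._inbox = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            txn, reply = self._inbox.get()
            if txn is None:  # shutdown sentinel
                break
            try:
                reply.put((True, txn(self._state)))
            except Exception as exc:
                # No rollback in this sketch: a transaction that fails
                # halfway leaves its partial writes in place.
                reply.put((False, exc))

    def execute(self, txn):
        """Submit a transaction and block until it has run."""
        reply = queue.Queue(maxsize=1)
        self._inbox.put((txn, reply))
        ok, result = reply.get()
        if not ok:
            raise result
        return result

    def close(self):
        self._inbox.put((None, None))
        self._worker.join()


def increment(state):
    # A read-modify-write that would race under concurrent threads
    # but is safe here because only the worker thread runs it.
    state["counter"] = state.get("counter", 0) + 1
    return state["counter"]


shard = SingleThreadedShard()
clients = [
    threading.Thread(target=lambda: [shard.execute(increment) for _ in range(100)])
    for _ in range(8)
]
for t in clients:
    t.start()
for t in clients:
    t.join()
final_count = shard.execute(lambda state: state["counter"])
shard.close()
```

The cost is that throughput per shard is capped by one thread, which is why the sharding questions in the second list take over.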

bawolff•31m ago
If you have to think about cross-shard transactions, then you have to think about all the things on your first list too, as they are complexities related to transactions. I fail to see how that could possibly be simpler.
n2d4•1m ago
Cross-shard transactions are only a tiny fraction of transactions. If the complexity of dealing with them is confined to some transactions instead of all of them, you're saving yourself a lot of headaches.

Actually, I'd argue a lot of apps can do entirely without cross-shard transactions! (eg. sharding by B2B orgs)

whizzter•1h ago
Yeah, mgmt (and more than anything, query tools) is gonna be a PITA.

But looking at it in a different way, say building something like Google Sheets.

One could place user-mgmt in one single-threaded database (even at 200k users you probably don't have too many administrators modifying it concurrently) whilst "documents" get their own database. I'm prototyping one such "document"-centric tool and the per-document DB thinking has come up; debugging users' problems could be as "simple" as cloning a SQLite file.

Now on the other hand if it's some ERP/CRM/etc system with tons of linked data that naturally won't fly.

Tool for the job.
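The per-document database idea above could be sketched like this with Python's built-in sqlite3 module. The `cells` table, the file naming scheme, and all function names are assumptions for illustration; `doc_id` is assumed to be a trusted, filename-safe string.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_document(root: Path, doc_id: str) -> sqlite3.Connection:
    """Open (or create) the SQLite file that belongs to one document."""
    conn = sqlite3.connect(root / f"{doc_id}.sqlite")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cells (ref TEXT PRIMARY KEY, value TEXT)"
    )
    return conn


def set_cell(conn: sqlite3.Connection, ref: str, value: str) -> None:
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT OR REPLACE INTO cells VALUES (?, ?)", (ref, value))


def clone_for_debugging(conn: sqlite3.Connection, target: Path) -> None:
    """Copy a document's database using SQLite's online backup API,
    which is safe to run while the source connection is in use."""
    copy = sqlite3.connect(target)
    try:
        conn.backup(copy)
    finally:
        copy.close()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    doc_a = open_document(root, "doc-a")
    doc_b = open_document(root, "doc-b")
    set_cell(doc_a, "A1", "42")

    clone_path = root / "doc-a-debug.sqlite"
    clone_for_debugging(doc_a, clone_path)

    clone = sqlite3.connect(clone_path)
    cloned_rows = clone.execute("SELECT ref, value FROM cells").fetchall()
    other_rows = doc_b.execute("SELECT ref, value FROM cells").fetchall()

    # Close everything before the temporary directory is removed.
    for c in (clone, doc_a, doc_b):
        c.close()
```

Each document's writes stay in its own file, so one document never contends with another, and the clone is a self-contained file that can be handed to whoever is debugging.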

ltbarcly3•1h ago
Wow that's a dumb take. The whole point of ACID is that you can get roughly the same result but have a system that can serve more than 1 user at a time.
codeslinger•1h ago
Disclaimer: ex-AWS here.

This article ends up making a compelling case for DynamoDB. It has the properties he describes wanting. Many, many systems inside of Amazon are built with DDB as the primary datastore. I don't know of any OSS commensurate to DDB, but it would be quite interesting for one to appear.

> "Every transaction can only be in one shard" only works for simple business logics.

You'd be quite surprised at what you can get out of this model. Check out the later chapters of the DynamoDB Book [1] for some neat examples.

[1] https://dynamodbbook.com/

raggi•1h ago
We are doing this, and it’s terrible. Having done both at scale this one is worse.
bawolff•33m ago
That's great if your shards are truly independent of each other, but if not then you just invented a custom transaction layer living in your application code, which sounds way way worse than the original problem.

And quite frankly, I think it is incredibly rare for the shards to be both fine-grained and independent in a typical OLTP DB use case.
