A low-end ARM processor (like a Raspberry Pi) can crank out 1000 requests a second with a CGI program handling the requests, using a single CPU core. Of course, this doesn't happen with traditional CGI (actual performance with traditional CGI will be more like 20-50/s or worse).
Like the stereotypical drivers of such vehicles, the industry has become so fat and stupid that an x86 system handling 500 requests/sec actually sounds impressive. Sadly, considering the bloated nature of modern stacks, it kinda is.
Honestly, unless you're bandwidth/uplink limited (e.g. running a CDN), a single machine will take you really far.
Also simpler systems tend to have better uptime/reliability. Doesn't get much simpler than a single box.
So when people say 1k is "highload" and requires a whole cluster, I'm not sure what to think of it. You can squeeze so much more out of a single fairly modest machine.
That's the other thing: AWS tends to have really dated SSDs.
Honestly, it's like the industry has jumped the shark. 1k is not a lot of load. It's like when people say a single writer means you can't be performant; it's the opposite most of the time. A single writer lets you batch, and batching is where the magic happens.
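To make the batching point concrete, here's a minimal sketch of the pattern (in Java, since that's what the article under discussion uses); the class name, the drain limit, and writeBatch are illustrative assumptions, not anything from the post:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of single-writer batching: many producers enqueue, one thread
// owns all writes and flushes whatever accumulated since the last commit.
public class SingleWriter implements Runnable {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    public void submit(String record) {
        queue.add(record); // producers never touch storage directly
    }

    @Override
    public void run() {
        List<String> batch = new ArrayList<>();
        try {
            while (!Thread.currentThread().isInterrupted()) {
                batch.add(queue.take());   // block until at least one record
                queue.drainTo(batch, 999); // grab whatever else is queued
                writeBatch(batch);         // one commit/fsync for the whole batch
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Illustrative placeholder: in a real system this would be a single
    // multi-row INSERT or one fsync'd append covering the whole batch.
    private void writeBatch(List<String> batch) {
        System.out.println("flushing " + batch.size() + " records");
    }
}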
With a test like this, you're really testing two different things:
1. How fast your database is
2. How fast your frontend is
Since the query is simple, your frontend is basically a DB access layer and should take no time. And since the table is indexed, the query should also take no time.
The only other interesting questions are whether the database can handle the number of connections and whether the storage can keep up. The app is using connection pools, but the actual size of the database machine is never mentioned... which is a problem. How big is the DB instance? A small instance could be crushed by 80 connections. A database on a hard drive may not be able to handle the load either (though since the data volume is small, it could be that everything ends up cached anyway).
So this is sort of interesting, but sort of not interesting.
Both the app and db are hosted on the same machine - they are sharing resources. This fact, the type of storage, and other details of the setup are covered in this section: https://binaryigor.com/how-many-http-requests-can-a-single-m...
I think you're right that I didn't mention the details of the db connection pool; they are here: https://github.com/BinaryIgor/code-examples/blob/master/sing...
Long story short, there's a Hikari connection pool with 10 initial connections, resizable to 20.
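For reference, that kind of pool is configured roughly like this with HikariCP (the JDBC URL and credentials here are placeholders; the real values are in the linked file):

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// Rough HikariCP equivalent of the pool described above; the JDBC URL and
// credentials are placeholders, not the repo's actual values.
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost:5432/app"); // placeholder
config.setUsername("app");     // placeholder
config.setPassword("secret");  // placeholder
config.setMinimumIdle(10);     // "initial 10 connections"
config.setMaximumPoolSize(20); // "resizable to 20"
var dataSource = new HikariDataSource(config);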
var issuedRequests = i + 1;
if (issuedRequests % REQUESTS_PER_SECOND == 0 && issuedRequests < REQUESTS) {
    System.out.println("%s, %d/%d requests were issued, waiting 1s before sending next batch..."
        .formatted(LocalDateTime.now(), issuedRequests, REQUESTS));
    Thread.sleep(1000);
}
don't take any conclusions away from this post, friends

Same with db - I wanted to see what kind of load a system (not just app) deployed to a single machine can handle.
It can obviously be optimized even further; I didn't try to do that in the article.
Suppose it takes 0.99s to send REQUESTS_PER_SECOND requests. Then you sleep for 1s. Result: You send REQUESTS_PER_SECOND requests every 1.99s. (If sending the batch of requests could take longer than a second, then the situation gets even worse.)
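A sketch of one way around that, pacing batches against a fixed schedule instead of sleeping a flat second after each one (sendBatch is a hypothetical stand-in for the request-issuing loop):

// Pace against absolute deadlines so send time doesn't stretch the interval.
var nextBatchAt = System.nanoTime();
for (int batch = 0; batch < REQUESTS / REQUESTS_PER_SECOND; batch++) {
    sendBatch(REQUESTS_PER_SECOND);  // hypothetical: issues one batch of requests
    nextBatchAt += 1_000_000_000L;   // next tick is 1s after the previous tick
    long sleepNanos = nextBatchAt - System.nanoTime();
    if (sleepNanos > 0) {
        Thread.sleep(sleepNanos / 1_000_000, (int) (sleepNanos % 1_000_000));
    } // if we're already late, send the next batch immediately
}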
The issue GP has with app and DB on the same box is a red herring -- that was explicitly the condition under test.
and, furthermore, if the application and DB are co-located on the same machine, you're commingling service loads, and definitely not measuring or capturing any kind of useful load numbers in the end
tl;dr is that these benchmarks/results are ultimately unsound; it's not about optimization, it's about validity
if you want to benchmark the application, then either you (a) mock the DB at as close to 0 cost as you can, or (b) point all application endpoints to the same shared (separate-machine) DB instance, and make sure each benchmark run executes exactly the same set of queries against a DB instance that is 100% equivalent across runs, resetting it in between each run
A picture would have been worth quite a bit more than a thousand words.
obviously at high load (1k+ TPS) plain servers are way cheaper than serverless, so the tradeoff can start to swing
That's barely more than a Raspberry Pi? (4 vs 8 cores.) Huge machines today have 20+ TB of RAM and hundreds of cores. Even top-end consumer machines can have 512GB of RAM!
I do agree with the author that single machines can scale far beyond what most orgs/companies need, but I think they may be underestimating how far that goes by orders of magnitude.
Single-core perf doubled every 8 years, multicore every 6 years, and GPUs every 3 years!
In 8 years, Ryzen went from 1166 Geekbench 6 single-core to 3398.
Is this common? Why not use the local filesystem? Actually, I thought that using anything other than the local filesystem for the database was a no-no. Am I missing something?
Block storage is meant to be reliable, so databases go there. Yes, it's slower, but you don't lose data.
Generally, the only time you want a local database in the cloud is if it's being used for short-lived data meaningful only to that particular instance in time.
Or it can work if your database rarely changes and you make regular backups that are easy to revert to, like for a blog.
Databases with high availability and robust storage were possible before the cloud.
I'm not saying it can't be done. But block storage is built for reliability in a way that ephemeral instances are not. There's a good reason why every guide will tell you to set your database up on block storage rather than an instance's local disk. If your instance fails, just spin up another instantly and reconnect to the same block storage.
I know that you can have significantly bigger machines; network-mounted DB storage, on the other hand, is not slow - it's designed specifically for this kind of use case.
also, it always feels like I need a second instance at the very least for redundancy, but then we have to ensure they're stateless and that batch jobs are sharded across them (or only run on one - see the sketch below), and again we hit an architecture explosion. I wish I were more comfortable just dropping a single Spring Boot instance on a VM and calling it a day; Spring Boot has a lot of bells and whistles and you can get pretty far without the architecture explosion, but it is almost inevitable.
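One comparatively light way out of the "batch jobs on only one instance" corner, sketched here with a Postgres advisory lock (the JDBC URL, credentials, and LOCK_KEY are illustrative assumptions, not anything from the thread):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Each instance tries to take the same session-level advisory lock before
// running the job; only one wins, and the lock is released on disconnect.
public class SingleRunnerJob {
    private static final long LOCK_KEY = 42L; // arbitrary app-wide job id (assumption)

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/app", "app", "secret")) { // placeholders
            try (PreparedStatement ps =
                     conn.prepareStatement("SELECT pg_try_advisory_lock(?)")) {
                ps.setLong(1, LOCK_KEY);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    if (!rs.getBoolean(1)) {
                        return; // another instance holds the lock; skip this run
                    }
                }
            }
            runBatchJob();
        }
    }

    private static void runBatchJob() {
        System.out.println("running the batch job on exactly one instance");
    }
}

This keeps every instance identical and stateless; the database arbitrates who runs the job, so there is no separate leader-election infrastructure to operate.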
Use One Big Server (2022) - https://news.ycombinator.com/item?id=45085029 - Aug 2025 (61 comments)