I understand there are practical reasons why you might want to just choose a concurrency and let it rip at a fixed warehouse size and say, “I ran TPC-C”, but you didn’t!
TPC-C, when run properly, is effectively an open-loop benchmark where the load scales with the dataset size: a fixed number of workers per warehouse (10 terminals per warehouse in the spec) each issue transactions at some rate. It’s designed to have a low level of built-in contention that comes from the frequency of cross-warehouse transactions. I don’t remember the exact rate, but I think it’s something like 10%.
The benchmark has an interesting property: if the system can keep up with the transaction load by processing transactions quickly, it remains a low-contention workload, but if it falls behind and transactions start to pile up, the number of contending transactions in flight increases. This leads to a non-linear degradation mode even beyond what normally happens with an open-loop benchmark: you hit some limit and performance falls off a cliff, because now you have to do even more work than just catching up on the backlog.
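To make the cliff concrete, here's a toy sketch of that feedback loop. It is not TPC-C and not anyone's real system: the function name, the per-tick model, and the `contention_penalty` knob are all made up for illustration, and the only assumption baked in is that effective throughput drops as more transactions are in flight.

```python
def simulate_open_loop(arrival_rate, base_capacity, contention_penalty, ticks):
    """Return the backlog after each tick of a toy open-loop run."""
    backlog = 0.0
    history = []
    for _ in range(ticks):
        # Open loop: new transactions arrive whether or not the old ones finished.
        backlog += arrival_rate
        # Assumption (hypothetical model): throughput shrinks as more
        # in-flight transactions contend with each other.
        capacity = base_capacity / (1.0 + contention_penalty * backlog)
        backlog -= min(backlog, capacity)
        history.append(backlog)
    return history


if __name__ == "__main__":
    for rate in (50, 60, 62, 90):
        final = simulate_open_loop(rate, 100.0, 0.01, 200)[-1]
        print(f"arrival rate {rate:>3}/tick -> backlog after 200 ticks: {final:.1f}")
```

With these made-up numbers the nominal capacity is 100 per tick, but the backlog starts growing without bound at an arrival rate of roughly 62 per tick, well short of nominal. Below that point the backlog stays at zero. That's the cliff.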
When you run without think time, you make the benchmark closed-loop. Also, because you’re varying the number of workers without changing the dataset size (because you have to vary something to make your pretty charts), you’re changing the odds that any given transaction lands on the same warehouse as another in-flight transaction. So you’ve got more contending transactions in general, but worse than that, because of Amdahl’s law, the uncontended transactions will fly through, so most workers will spend most of their time waiting on contended keys.
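A back-of-the-envelope sketch of both effects, assuming each worker picks a warehouse uniformly at random and that a contended transaction takes some fixed multiple of the uncontended time. Both assumptions, the function names, and the `slowdown` factor are mine, purely for illustration:

```python
def p_shared_warehouse(workers, warehouses):
    """Chance that at least one other worker is on the same warehouse
    as a given worker, assuming uniform random warehouse choice."""
    return 1.0 - (1.0 - 1.0 / warehouses) ** (workers - 1)


def time_share_contended(p_contended, slowdown):
    """Fraction of wall-clock time spent in contended transactions when a
    contended transaction takes `slowdown` times as long (hypothetical)."""
    contended = p_contended * slowdown
    return contended / (contended + (1.0 - p_contended))


if __name__ == "__main__":
    warehouses = 10  # fixed dataset size
    for workers in (1, 2, 8, 32, 128):
        p = p_shared_warehouse(workers, warehouses)
        print(f"{workers:>3} workers: P(shared warehouse) = {p:.2f}")
    # Even a small contended fraction dominates wall-clock time if it is slow.
    print(f"time share at p=0.10, 50x slowdown: {time_share_contended(0.10, 50):.2f}")
```

The first function shows why adding workers at a fixed warehouse count changes the workload rather than just the load. The second is the Amdahl-style point: the slow, contended fraction ends up owning most of the clock even when it's a minority of the transactions.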
If you took their exact same hardware and put the application+SQLite on the same box, you could literally chop 4 zeroes off these p99 latency figures. NVMe storage is unbelievably fast when it's utilized in the same machine that the application runs on.
> At PlanetScale, we give you a primary and two replicas spread across 3 availability zones (AZs) by default. Multi-AZ configurations are critical to have a highly-available database. The replicas can also be used to handle significant read load.
Imustaskforhelp•7h ago
Now, after that, they released their NVMe drive innovation, which I admit I am a little ignorant of.
Now, one of the reasons I hated PlanetScale was that it was exclusively MySQL. PostgreSQL is good tbh. But can it run Postgres extensions?
And also, regarding Convex using them: isn't Convex itself a database / a reactive database? I didn't know that Convex used some other database like Postgres underneath. Correct me if I am wrong, but from what I recall, they can also use SQLite etc.
Another point I'd like to raise is that AlloyDB is the cheapest in their benchmark apart from their own product.
And I wonder if there is some part of the results that they have omitted in order to look like the better product. I'd like to see third-party results too tbh.
I'd also love to see it being open source tbh. Neon/Supabase are open source fwiw. The closest open source thing I could find from PlanetScale is https://github.com/planetscale/migration-scripts, which is a shell script to migrate from Postgres to PlanetScale. At the time of writing, the most recent commit landed just 36 minutes ago, but I guess I'd genuinely like to tweak and self-host what makes their Postgres better IDK