Kafka, GraphQL... These are the two technologies where my first question is always this: does the person who championed/led this project still work here?
The answer is almost always "no, they got a new job after we launched".
Resume Architecture is a real thing. Meanwhile the people left behind have to deal with a monster...
They’ll be fine if you made something that works, even if it was a bit faddish. Make sure you take care of yourself along the way (they won’t).
My point stands: company loyalty tallies up to very little when you’re looking for your next job; no interviewer will care much to hear how you stood firm and ignored the siren song of tech and practices more modern than the ones you were handed (the tech and practices they’re hiring for).
The moment that reverses, I will start advising people not to skill up, as it will look bad in their resumes.
is there some reason GraphQL gets so much hate? it always feels to me like it's mostly just a normal RPC system but with some incredibly useful features (pipelining, and super easy to not request data you don't need), with obvious perf issues in code and obvious room for perf abuse because it's easy to allow callers to do N+1 nonsense.
so I can see why it's not popular to get stuck with for public APIs unless you have infinite money, it's relatively wide open for abuse, but private seems pretty useful because you can just smack the people abusing it. or is it more due to specific frameworks being frustrating, or stuff like costly parsing and serialization and difficult validation?
which I fully understand is more work than "it's super easy just X" which it gets presented as, but that's always the cost of super flexible things. does graphql (or the ecosystem, as that's part of daily life of using it) make that substantially worse somehow? because I've dealt with people using protobuf to avoid graphql, then trying to reimplement parts of its features, and the resulting API is always an utter abomination.
And yes, you don't want to use it for public APIs. But if you have private APIs that are so complex that you need a query language, and you still want to use those over web services, you are very likely doing something really wrong.
"check that the user matches the data they're requesting by comparing the context and request field by hand" is ultra common - there are some real benefits to having authorization baked into the language, but it seems very rare in practice (which is part of why it's often flawed, but following the overwhelming standard is hardly graphql's mistake imo). I'd personally think capabilities are a better model for this, but that seems likely pretty easy to chain along via headers?
Instead of having just one unknown thing to deal with before launch... let's pick as many new-to-us things as possible; that will surely increase the chances of success.
Personally I’d expect some kind of internal interface to abstract away and develop reusable components for such an external dependency, which readily enables having relational data stores mirroring the broker's functionality. Handy for testing and some specific local scenarios, and those database-backed stores can easily pull from the main cluster(s) later to mirror data as needed.
Postgres really is a startup's best friend most of the time. I'm building a new product that's going to deal with a good bit of reporting, and I began to look at OLAP DBs for it, but hesitated to leave PG. This kind of seals it for me (and of course the reference to the classic "Just Use Postgres for Everything" post helps) that I should Just Use Postgres (R).
On top of being easy to host and already being familiar with it, the resources out there for something like PG are near endless. Plus the team working on it is doing constant good work to make it even more impressive.
It's built on pgmq and not married to supabase (nearly everything is in the database).
Postgres is enough.
I've been a happy Postgres user for several decades. Postgres can do a lot! But like anything, don't rely on maxims to do your engineering for you.
Wouldn't OrioleDB solve that issue though?
It often doesn't.
[1] https://rubyonrails.org/2024/11/7/rails-8-no-paas-required
When you’re only doing hundreds or thousands of transactions to begin with, it doesn’t really have much of an impact out of the gate.
Of course there will be someone who will pull out something that won’t work but such examples can likely be found for anything.
We don’t need to fear simplification, it is easy to complicate later when the actual complexities reveal themselves.
And the thing is, a server from 10 years ago running postgres (with a backup) is enough for most applications to handle thousands of simultaneous users. Without even going into the kinds of optimization you are talking about. Adding ops complexity for the sake of scale on the exploratory phase of a product is a really bad idea when there's an alternative out there that can carry you until you have fit some market. (And for some markets, that's enough forever.)
Postgres isn’t meant to be a guaranteed permanent replacement.
It’s a common starting point for a simpler stack which can retain a greater deal of flexibility out of the box and increased velocity.
Starting with Postgres lets the bottlenecks reveal themselves, and then optimize from there.
Maybe it's a tweak to Postgres or its resources, or maybe it's time to consider a jump to Kafka.
> ...
> The other camp chases common sense
I don't really like these simplifications. Like one group obviously isn't just dumb, they're doing things for reasons you maybe don't understand. I don't know enough about data science to make a call, but I'm guessing there were reasons to use Kafka due to current hardware limits or scalability concerns, and while the issues may not be as present today that doesn't mean they used Kafka just because they heard a new word and wanted to repeat it.
https://github.com/mongomock/mongomock Extrapolating from my personal usage of this library to others, I'm starting to think that mongodb's 25 billion dollar valuation is partially based on this open source package :)
Otherwise SQLite :)
I prefer not to start with a nosql database and then undertake odysseys to make it into a relational database.
In my company most of our topics need to be consumed by more than one application/team, so this feature is a must have. Also, the ability to move the offset backwards or forwards programmatically has been a life saver many times.
Does Postgres support this functionality for their queues?
I don't know what kind of native support PG has for queue management, the assumption here is that a basic "kill the task as you see it" is usually good enough and the simplicity of writing and running a script far outweighs the development, infrastructure and devops costs of Kafka.
But obviously, whether you need stuff to happen in 15 seconds instead of 5 minutes, or 5 minutes instead of an hour is a business decision, along with understanding the growth pattern of the workload you happen to have.
Here is one: https://pgmq.github.io/pgmq/
Some others: https://github.com/dhamaniasad/awesome-postgres
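To give a feel for what that looks like in practice, here's a rough sketch of the pgmq workflow from Python. The function names follow the pgmq docs, but treat the exact signatures and return columns as an assumption and check the docs linked above; the queue name, payload and connection string are made up.

    import json
    import psycopg2

    def handle(message):
        print("processing", message)  # stand-in for real work

    conn = psycopg2.connect("dbname=app")  # hypothetical connection string
    cur = conn.cursor()

    # Create a queue and enqueue a message.
    cur.execute("SELECT pgmq.create(%s);", ("orders",))
    cur.execute("SELECT pgmq.send(%s, %s::jsonb);",
                ("orders", json.dumps({"order_id": 42})))
    conn.commit()

    # Read one message; it becomes invisible to other consumers for 30 seconds.
    cur.execute("SELECT msg_id, message FROM pgmq.read(%s, 30, 1);", ("orders",))
    for msg_id, message in cur.fetchall():
        handle(message)
        cur.execute("SELECT pgmq.delete(%s, %s);", ("orders", msg_id))
    conn.commit()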
Most of my professional life I have considered Postgres folks to be pretty smart… while I by chance happened to go with MySQL, and it became the RDBMS I think in by default.
Learning Postgres in depth recently has been okay, not much different than learning the tweaks for MSSQL, Oracle, or others. You just have to be willing to slow down a little for a bit and enjoy it instead of expecting to rush through everything.
Unless you're a five-man shop where everybody just agrees to use that one table, manages transactions right, handles retention with a cron job, YOLOs the clustering, etc. etc.
Performance is probably last on the list of reasons to choose Kafka over Postgres.
There are several implementations of queues to increase the chance of accomplishing what one is after. https://github.com/dhamaniasad/awesome-postgres
A naive approach with a sequence (or the serial type, which uses a sequence automatically) does not work. Transaction "one" gets number "123", transaction "two" gets number "124". Transaction "two" commits; now the table contains rows "122" and "124", and readers can start to process it. Then transaction "one" commits with its "123" number, but readers are already past "124". And transaction "one" might never commit for various reasons (e.g. the client just got power cut), so waiting for "123" forever does not cut it either.
Notifications can help with this approach, but then you can't restart old readers (and you don't need monotonic numbers at all).
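To make the race concrete, here's a toy simulation of it in plain Python (no database involved). The reader polls for "ids greater than the last one I saw" and permanently skips the row whose transaction committed late:

    # Toy illustration of why visibility order != allocation order breaks
    # "poll for ids greater than the last one I processed".
    committed = [122]        # rows visible to readers; 122 was committed earlier
    next_id = 122

    def nextval():
        """Allocate the next number, like nextval() on a sequence."""
        global next_id
        next_id += 1
        return next_id

    txn_one_id = nextval()   # transaction "one" allocates 123
    txn_two_id = nextval()   # transaction "two" allocates 124

    committed.append(txn_two_id)   # "two" commits first: table holds 122, 124

    last_seen = 0
    processed = [row for row in committed if row > last_seen]  # [122, 124]
    last_seen = max(processed)                                 # 124

    committed.append(txn_one_id)   # "one" finally commits its 123

    # Next poll: nothing is > 124, so row 123 is skipped forever.
    assert [row for row in committed if row > last_seen] == []
    print("processed:", processed, "- silently missed:", txn_one_id)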
Isn't it a bit of a white whale to think that a single solution can solve all one's subscriber problems? AFAIK even with Kafka this isn't completely watertight.
https://www.oreilly.com/library/view/designing-data-intensiv...
You can generate distributed monotonic number sequences with a Lamport Clock.
https://en.wikipedia.org/wiki/Lamport_timestamp
The wikipedia entry doesn't describe it as well as that book does.
It's not the end of the puzzle for distributed systems, but it gets you a long way there.
See also Vector clocks. https://en.wikipedia.org/wiki/Vector_clock
Edit: I've found these slides, which are a good primer for solving the issue, page 70 onwards "logical time":
https://ia904606.us.archive.org/32/items/distributed-systems...
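For reference, the Lamport clock itself is only a few lines. A minimal sketch in Python of the two rules (tick before every local event or send, and take max(local, received) + 1 on receive):

    class LamportClock:
        """Minimal Lamport logical clock."""

        def __init__(self):
            self.time = 0

        def tick(self):
            """Local event, including preparing to send a message."""
            self.time += 1
            return self.time

        def receive(self, msg_time):
            """On receive, jump past the sender's timestamp if it is ahead."""
            self.time = max(self.time, msg_time) + 1
            return self.time

    # Two nodes exchanging one message: timestamps respect causality, but
    # concurrent events can share a value, so the order is only partial and
    # ties are usually broken with a node id.
    a, b = LamportClock(), LamportClock()
    t_sent = a.tick()            # a stamps the outgoing message: 1
    b.tick()                     # an unrelated local event on b: 1
    t_recv = b.receive(t_sent)   # b receives: max(1, 1) + 1 = 2
    print(t_sent, t_recv)        # 1 2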
There are two poles.
1. Folks constantly adopting the new tech, whatever the motivation, and 2. "I learned a thing and shall never learn anything else, ever."
Of course nobody exists actually on either pole, but the closer you are to either, the less pragmatic you are likely to be.
I think it's still just 2 poles. However, I probably shouldn't have ascribed a motivation to the latter pole, as I purposely did not with the former.
Pole 2 is simply never adopt anything new ever, for whatever the motivation.
If you don't need what kafka offers, don't use it. But don't pretend you're on to something with your custom 5k msg/s PG setup.
This is literally the point the author is making.
"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize."
In this case, you do just need a single fuel truck. That's what it was built for. Avoiding a purpose-built tool to achieve the same result actually is wasteful. You don't need 288 cores to achieve 243,000 messages/second. You can do that kind of throughput with a Kafka-compatible service on a laptop.
[Disclosure: I work for Redpanda]
Kafka et al definitely have their place, but I think most people would be much better off reaching for a simpler queue system (or for some things, just using Postgres) unless you really need the advanced features.
You should be able to install it within minutes.
None of this applies to Redpanda.
Yet, to also be fair to the Kafka folks, ZooKeeper is no longer the default and hasn't been since April 2025, with the release of Apache Kafka 4.0:
"Kafka 4.0's completed transition to KRaft eliminates ZooKeeper (KIP-500), making clusters easier to operate at any scale."
Source: https://developer.confluent.io/newsletter/introducing-apache...
> This is literally the point the author is making.
Exactly! I just don't understand why HN invariably bubbles up the most dismissive comments to the top, the ones that don't even engage with the actual subject matter of the article!
https://www.youtube.com/watch?v=7CdM1WcuoLc
Getting even less than that throughput on 3x c7i.24xlarge (a total of 288 vCPUs) is bafflingly wasteful.
Just because you can do something with Postgres doesn't mean you should.
> 1. One camp chases buzzwords.
> 2. The other camp chases common sense
In this case, is "Postgres" just being used as a buzzword?
[Disclosure: I work for Redpanda; we provide a Kafka-compatible service.]
Kafka is a full on steaming solution.
Postgres isn’t a buzzword. It can be a capable placeholder until it’s outgrown. One can arrive at Kafka with a more informed run history from Postgres.
Freudian slip? ;)
"The use of fsync is essential for ensuring data consistency and durability in a replicated system. The post highlights the common misconception that replication alone can eliminate the need for fsync and demonstrates that the loss of unsynchronized data on a single node still can cause global data loss in a replicated non-Byzantine system."
However, for all that said, Redpanda is still blazingly fast.
https://www.redpanda.com/blog/why-fsync-is-needed-for-data-s...
Can you lose one Postgres instance?
When you have C++ code, the number of external folks who want to — and who can effectively, actively contribute to the code — drops considerably. Our "cousins in code," ScyllaDB last year announced they were moving to source-available because of the lack of OSS contributors:
> Moreover, we have been the single significant contributor of the source code. Our ecosystem tools have received a healthy amount of contributions, but not the core database. That makes sense. The ScyllaDB internal implementation is a C++, shard-per-core, future-promise code base that is extremely hard to understand and requires full-time devotion. Thus source-wise, in terms of the code, we operated as a full open-source-first project. However, in reality, we benefitted from this no more than as a source-available project.
Source: https://www.scylladb.com/2024/12/18/why-were-moving-to-a-sou...
People still want to get free utility out of the source-available code. Less commonly, they want to be able to see the code to understand it and potentially troubleshoot it. Yet asking for active contribution is, for almost all, a bridge too far.
(Notably, they're not arguing that open source reusers have been "unfair" to them and freeloaded on their effort, which was the key justification many others gave for relicensing their code under non-FLOSS terms.)
In case anyone here is looking for a fully-FLOSS contender that they may want to perhaps contribute to, there's the interesting project YugabyteDB https://github.com/yugabyte/yugabyte-db
(Personally, I have no issues with the AGPL and Stallman originally suggested this model to Qt IIRC, so I don't really mind the original split, but that is the modern intent of the strategy.)
As a maintainer of several free software projects, there are lots of issues with how projects are structured and user expectations, but I struggle to see how proprietary licenses help with that issue (I can see -- though don't entirely buy -- the argument that they help with certain business models, but that's a completely different topic). To be honest, I have no interest in actively seeking out proprietary software, but I'm certainly in the minority on that one.
AKA "Medium Data" ?
Never used Kafka myself, but we extensively use Redis queues with some scripts to ensure persistence, and we hit much higher throughputs than those on equivalent prod machines.
Same for Redis pubsubs, but those are just standard non-persistent pubsubs, so maybe that gives them an edge.
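For anyone curious, the usual "Redis list as a queue, plus a script for reliability" pattern is roughly the following (a sketch with redis-py; the key names and handler are made up, and newer Redis versions would use LMOVE or Streams instead):

    import redis

    def handle(job):
        print("processing", job)  # stand-in for real work

    r = redis.Redis()  # assumes a local Redis; adjust host/port as needed

    # Producer: push a job onto a list.
    r.lpush("jobs", '{"task": "resize", "id": 1}')

    # Consumer: atomically move the job to a per-worker "processing" list so it
    # isn't lost if the worker dies mid-task, then remove it once it succeeds.
    job = r.brpoplpush("jobs", "jobs:processing", timeout=5)
    if job is not None:
        handle(job)
        r.lrem("jobs:processing", 1, job)
        # A separate reaper would re-queue anything stuck in jobs:processing.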
zeek -> kafka -> logstash -> elastic
CPU is more tricky but I’m sure it can be shown somehow
It's very common to start adding more and more infra for use cases that, while they could technically be covered better with new stuff, can be served by already existing infrastructure, at least until you have proof that you need to grow it.
Is anyone actually reading the full article, or just reacting to the first unimpressive numbers you can find and then jumping on the first dismissive comment you can find here?
Benchmarking Kafka isn't the point here. The author isn't claiming that Postgres outperforms Kafka. The argument is that Postgres can handle modest messaging workloads well enough for teams that don't want the operational complexity of running Kafka.
Yes, the throughput is astoundingly low for such a powerful CPU, but that's precisely the point. Now you know how well or how badly Postgres performs on a beefy machine. You don't always need Kafka-level scale. The takeaway is that Postgres can be a practical choice if you already have it in place.
So rather than dismissing it over the first unimpressive number you find, maybe respond to the actual matter of TFA. Where's the line where Postgres stops being "good enough"? That would be something nice to talk about.
I'm glad the OP benchmarked on the 96 vCPU server. So now I know how well Postgres performs on a large CPU. Not very well. But if the OP had done their benchmark on a low CPU, I wouldn't have learned this.
Or they could have not mentioned Kafka at all and just demonstrated their pub/sub implementation with PG. They could have not tried to make it about the buzzword-chasing, resume-driven-engineering people vs. common-sense folks such as himself.
It's a fair point that if you already have a pgsql setup, and only need a few messages here and there, then pg is fine. But yeah, the 96 vcpu setup is absurd.
I might well be talking out of my arse but if you're going to implement pub/sub in Postgres, it'd be worth designing around its strengths and going back to basics on event sourcing.
Performance isn't a big deal for me. I had assumed that Kafka would give me things like decoupling, retry, dead-lettering, logging, schema validation, schema versioning, exactly once processing.
I like Postgres, and obviously I can write a queue ontop of it, but it seems like quite a lot of effort?
It’s mostly registering the Postgres database functions, which is a one-time task.
There are also pre-made Postgres extensions that already run the queue.
These days I would consider starting with self-hosted Supabase, which has Postgres ready to tweak.
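For a sense of the effort involved, the core of a hand-rolled Postgres queue really is one table and one FOR UPDATE SKIP LOCKED query; the retries, dead-lettering and schema validation mentioned above are the parts you end up building yourself. A minimal sketch (table and column names invented for illustration):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical connection string
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE IF NOT EXISTS jobs (
            id      bigserial PRIMARY KEY,
            payload jsonb NOT NULL,
            done_at timestamptz
        );
    """)
    conn.commit()

    # Producer
    cur.execute("INSERT INTO jobs (payload) VALUES (%s::jsonb);",
                ('{"task": "send_email"}',))
    conn.commit()

    # Consumer: claim one unfinished job. SKIP LOCKED lets many workers poll
    # the same table without blocking each other or double-processing a job.
    cur.execute("""
        SELECT id, payload
          FROM jobs
         WHERE done_at IS NULL
         ORDER BY id
         LIMIT 1
         FOR UPDATE SKIP LOCKED;
    """)
    row = cur.fetchone()
    if row:
        job_id, payload = row
        # ... do the work here, inside the same transaction ...
        cur.execute("UPDATE jobs SET done_at = now() WHERE id = %s;", (job_id,))
    conn.commit()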
Event-sourcing is when you buy something and get a receipt, you go stick it in a shoe-box for tax time.
A queue is you get given receipts, and you look at them in the correct order before throwing each one away.
If you plan on retaining your topics indefinitely, schema evolution can become painful since you can't update existing records. Changing the number of partitions in a topic is also painful, and choosing the number initially is a difficult choice. You might want to build your own infrastructure for rewriting a topic and directing new writes to the new topic without duplication.
Kafka isn't really a replacement for a database or anything high-level like a ledger. It's really a replicated log, which is a low-level primitive that will take significant work to build into something else.
MQTT -> Redpanda (for message logs and replay, etc) -> Postgres/Timescaledb (for data) + S3 (for archive)
(and possibly Flink/RisingWave/Arroyo somewhere in order to do some alerting/incrementally updated materialized views/ etc)
this seems "simple enough" (but I don't have any experience with Redpanda) but is indeed one more moving part compared to MQTT -> Postgres (as a queue) -> Postgres/Timescaledb + S3
Questions:
1. my "fear" would be that if I use the same Postgres for the queue and for my business database, the "message ingestion" part could block the "business" part sometimes (locks, etc)? Also perhaps when I want to update the schema of my database and not "stop" the inflow of messages, not sure if this would be easy?
2. also that since it would write messages in the queue and then delete them, there would be a lot of GC/Vacuuming to do, compared to my business database which is mostly append-only?
3. and if I split the "Postgres queue" from "Postgres database" as two different processes, of course I have "one less tech to learn", but I still have to get used to pgmq, integrate it, etc, is that really much easier than adding Redpanda?
4. I guess most Postgres queues are also "simple" and don't provide "fanout" for multiple things (eg I want to take one of my IoT message, clean it up, store it in my timescaledb, and also archive it to S3, and also run an alert detector on it, etc)
What would be the recommendation?
How is it common sense to try to re-implement Kafka in Postgres? You probably need something similar but simpler. Then implement that! But if you really need something like Kafka, then... use Kafka!
IMO the author is now making the same mistake as some Kafka evangelists that try to implement a database in Kafka.
Plug: If you need granular, durable streams in a serverless context, check out s2.dev
1) People constantly chasing the latest technology with no regard for whether it's appropriate for the situation.
2) People constantly trying to shoehorn their favourite technology into everything with no regard for whether it's appropriate for the situation.
Postgres has also been around for a long time, and a lot of people don't know everything it can do beyond what we normally think of as a database's job.
Appropriateness is a nice way to look at it, as long as it's clear whether it's really about the task or about personal preferences and interpretations, and being righteous towards others with them.
Customers rarely care about the backend or what it’s developed in, except maybe for developer products. It’s a great way to waste time though.
The third camp:
3) People who look at a task, then apply a tool appropriate for the task.
Use the tool that is appropriate for the job. It is trivial to write code to use these tools with LLMs these days, the software is mature enough to rarely cause problems, and tools built for a purpose will always be more performant.
The fact that there is no common library that implements the author's strategy is a good sign that there is not much demand for this.
Ease of use. In Ruby, if I want to use Kafka I can use karafka, or Redis streams via the redis library. Likewise, if Kafka is too complex to run, there are countless alternatives that work just as well; hell, even 0mq with client libraries.
Now with the Postgres version I have to write my own stuff, and I don't know where that's going to lead me.
Postgres is scalable, no one doubts that. But what people forget to mention is the ecosystem around certain tools.
https://github.com/dhamaniasad/awesome-postgres
There is at least a Python example here.
You need some sort of server-side logic to manage that, and the consumer heartbeats, and generation tracking, to make sure that only the "correct" instances can actually commit the new offsets. Distributed systems are hard, and Kafka goes through a lot of trouble to ensure that you don't fail to process a message.
Of course the implementation based off that is going to miss a bit.
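To illustrate the generation-tracking part (a toy sketch, not Kafka's actual protocol): the coordinator bumps a generation number on every rebalance, and an offset commit carrying a stale generation is rejected, so a consumer that lost its partitions can't clobber the new owner's progress.

    class GroupCoordinator:
        """Toy coordinator: generation fencing for offset commits."""

        def __init__(self):
            self.generation = 0
            self.offsets = {}  # partition -> committed offset

        def rebalance(self):
            """Membership changed (join/leave/missed heartbeat): new generation."""
            self.generation += 1
            return self.generation

        def commit(self, consumer_generation, partition, offset):
            if consumer_generation != self.generation:
                raise RuntimeError("stale generation: partitions were reassigned")
            self.offsets[partition] = offset

    coord = GroupCoordinator()
    gen_a = coord.rebalance()                    # consumer A joins, generation 1
    coord.commit(gen_a, partition=0, offset=42)  # accepted

    coord.rebalance()                            # consumer B joins, partitions move
    try:
        coord.commit(gen_a, partition=0, offset=43)  # A still holds generation 1
    except RuntimeError as err:
        print("commit rejected:", err)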
I think a bigger issue is the DBMS themselves getting feature after feature and becoming bloated and unfocused. Add the thing to Postgres because it is convenient! At least Postgres has a decent plugin approach. But I think more use cases might be served by standalone products than by add-ons.
Your DB on the other hand is usually a well-understood part of your system, and while scaling issues like that can cause problems, they're often fairly easy to predict- just unfortunate on timing. This means that while they'll disrupt, they're usually solved quickly, which you can't always say for additional systems.
1. Do you really need a queue? (Alternative: periodic polling of a DB)
2. What's your event volume and can it fit on one node for the foreseeable future, or even serverless compute (if not too expensive)? (Alternative: lightweight single-process web service, or several instances, on one node.)
3. If it can't fit on one node, do you really need a distributed queue? (Alternative: good ol' load balancing and REST API's, maybe with async semantics and retry semantics)
4. If you really do need a distributed queue, then you may as well use a distributed queue, such as Kafka. Even if you take on the complexity of managing a Kafka cluster, the programming and performance semantics are simpler to reason about than trying to shoehorn a distributed queue onto a SQL DB.
Doesn't PostgreSQL have transactional schema updates as a key feature? AIUI, you shouldn't be having any data loss as a result of such changes. It's also common to use views in order to simplify the management of such updates.
You can run into issues with scheduled queues (e.g. run this job in 5 minutes) since the tables will be bigger, you need an index, and you will create the garbage in the index at the point you are querying (jobs to run now). This is a spectacularly bad pattern for postgres at high volume.
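Concretely, the pattern being described is something like the following (names invented): the poller always scans the low end of the run_at index, and every completed job deleted from that same range leaves dead tuples and dead index entries right where the next poll will look, so vacuum has to work hardest exactly on the hot path.

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical connection string
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE IF NOT EXISTS scheduled_jobs (
            id      bigserial PRIMARY KEY,
            run_at  timestamptz NOT NULL,
            payload jsonb NOT NULL
        );
    """)
    cur.execute("CREATE INDEX IF NOT EXISTS scheduled_jobs_run_at_idx "
                "ON scheduled_jobs (run_at);")
    conn.commit()

    # The poller repeatedly hits the low end of the run_at index ...
    cur.execute("""
        SELECT id, payload
          FROM scheduled_jobs
         WHERE run_at <= now()
         ORDER BY run_at
         LIMIT 100
         FOR UPDATE SKIP LOCKED;
    """)
    due = cur.fetchall()

    # ... and every finished job is deleted from exactly that range, which is
    # where the dead tuples and dead index entries pile up at high volume.
    for job_id, _payload in due:
        cur.execute("DELETE FROM scheduled_jobs WHERE id = %s;", (job_id,))
    conn.commit()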
That sounds distributed to me, even if it wires different tech together to make it happen. Is there something about load balancing REST requests to different DB nodes that is less complicated than Kafka?
Ok so instead of running Kafka, we're going to spend development cycles building our own?
> Most of the time - yes. You should always default to Postgres until the constraints prove you wrong.
Interesting.
I've also been told by my seniors that I should go with PostgreSQL by default unless I have a good justification not to.
task: msg queue
software: kafka
hardware: m7i.xlarge (vCPUs: 4 Memory: 16 GiB)
payload: 2kb / msg
possible performance: ### - #### msgs / second
etc…
So many times I've found myself wondering: is this thing behaving within an order of magnitude of a correctly set-up version, so that I can decide whether I should leave it alone or spend more time on it?
You see this a lot in the tech world. "Why would you use Python, Python is slow" (objectively true, but does it matter for your high-value SaaS that gets 20 logins per day?)
I'm not saying they're useless, but if I see something like that lying around, it's more likely that someone put it there based on vibes rather than an actual engineering need. Postgres is good enough for OpenAI, chances are it's good enough for you.