HTAP is Dead

https://www.mooncake.dev/blog/htap-is-dead

159•moonikakiss•8mo ago

Comments

cwillu•8mo ago

That is the worst smooth scrolling hijack I've ever seen, and the whole site breaks if you disable javascript.

charcircuit•8mo ago

>Cursor is powered by a single-box Postgres instance

Why wouldn't it? The resources needed to run the backend of Cursor come from the compute for the AI models. Updating someone's quota in a database every few minutes is not going to be causing issues.

zhousun•8mo ago

there's actually a great read on this: cursor started with a distributed OLTP solution (yugabyte), and then fell back to RDS...

bcoates•8mo ago

In the nosql era the idea that you could run even the basics for a >1m user SaaS platform on an ordinary, free, single-node transactional SQL database would have been considered nuts.

LAC-Tech•8mo ago

wait we're not in the nosql era anymore?

dynamo and mongo are huge, redis and kafka (and their clones) are ubiquitous, etc etc

bcoates•8mo ago

Oh God people are still using Mongo in production? Why?

Kafka exists but is deeply obsolete and mostly marginalized outside of things with dependencies on the weird way it works (Debezium, etc)

I've always liked Redis but choosing it as a core tech on a new product in the last, say, 6 years is basically malpractice? 10 if you're uncharitable.

The thing these all have in common is having their economics and ergonomics absolutely shattered by SSDs and cluster-virtualization-by-default (i.e. cloud and on-prem pseudo-cloud). They're just artifacts of a very narrow window of history where a rack of big-ram servers was a reasonable way of pairing storage IOPS to network bandwidth.

Dynamo is and always was niche. Thriving in its niche, but a specialized tool for specialized jobs.

throwawaythekey•8mo ago

What's the better kafka/redis? For Mongo, I know you can just use your favorite relational database with JSON support if needed (PG/MySQL).

cebert•8mo ago

> What's the better kafka/redis?

If you are going to leverage caching I’d use the OSS Valkey over Redis. Based on the company’s past behavior, Redis is dead to me now.

bcoates•8mo ago

If you're already built around Redis I'd just keep using it, but if you're doing new development there's not so much a single drop-in replacement as a substantially better alternative for any given feature (and no particular advantage to having all your data in the "same Redis instance"). That said, 90+% of the time the answer is probably "transactional SQL database" or "message queue".

For Kafka, the answer is probably an object store, a message queue, a specialized logging system, an ordinary transactional database table, or whatever mechanism your chosen analytics DB uses for bulk input (probably S3 or equivalent these days). Or maybe just a REST interface in front of a filesystem. Unless of course you truly need to interface with a Kafka consumer/producer in which case you’re stuck with it (the actual reason I've seen for every Kafka deployment I've personally witnessed in recent history)

chatmasta•8mo ago

I work for a database company and of my ~100 customer meetings last year, only one of the notes mentions Mongo as software they use in production. Maybe it’s a different world or something, idk, but I don’t understand the use case.

If I’m ingesting unstructured data for search or “parse it later” purposes, I’ll choose OpenSearch (elastic). Otherwise I’m going PG by default and if I need analytics I’ll use Parquet or Delta and pick the query engine based on requirements.

I honestly cannot think of a use case where Mongo is the appropriate solution.

FridgeSeal•8mo ago

We’re not in the no-sql era anymore, because the prevailing marketing and “thought leadership” isn’t pushing these things _instead of_ a SQL database. They’re now _parts_ of a system, of which SQL DBs are still a very big part.

cyberax•8mo ago

> wait we're not in the nosql era anymore?

Kinda. It turned out that for the vast majority of users, a single Postgres instance on a reasonably large host is more than enough. Perhaps with a read replica as a hot standby.

You can easily get 1 million transactions per second from it (simple ones, granted). So why bother with NoSQL?

> redis and kafka (and their clones) are ubiquitous, etc etc

That's a bit different. Kafka is a message queue, and Redis is mostly used as a cache.

api•8mo ago

It may have been back then, though I'd argue that you could have done it back then with efficient code and a very well-tuned DB.

Today big boxes are big. Really big. Stuff like 128 cores, 1TB RAM, and dozens of terabytes of incredibly fast RAIDed flash storage is available out there.

They're also more reliable than they used to be. Hardware still fails of course, but it doesn't fail as often as OG spinning disk did.

bob1029•8mo ago

I've always been impressed by the architecture of the Hyperscale service tier of MSSQL in Azure. It is arguably a competitor in this area.

https://learn.microsoft.com/en-us/azure/azure-sql/database/h...

kagolaub•8mo ago

Anyone have any first-hand experience combining transactional and analytic workloads on this vs. Aurora, or something like CockroachDB? Seems like a major advantage of CockroachDB is being able to horizontally scale writes.

beoberha•8mo ago

Hyperscale/Aurora are definitely not competitors and it seems odd you got that premise from the article since it argues the complete opposite.

TOMDM•8mo ago

Terrible scrolling aside;

> pg_mooncake is a PostgreSQL extension adding columnstore tables with DuckDB execution for 1000x faster analytics. Columnstore tables are stored as Iceberg or Delta Lake tables in your Object Store. Maintained by Mooncake Labs, it is available on Neon Postgres.

Seems to summarise the reason this article exists.

Not that I really disagree with the premise or conclusion of the article itself.

ashvardanian•8mo ago

From a modern startup’s POV - fast pivots, fast feedback - it’s fair to say HTAP is “dead.” The market is sticky and slow-moving. But I’d argue that’s precisely why it’s still interesting: fewer teams can survive the long game, but the payoff can be disproportionate.

refset•8mo ago

I agree the opportunity is still there, although the long game keeps getting longer.

Prof. Viktor Leis suggested [0] that SQL itself - being so complex to implement and so ineffectively standardized - may be the biggest inhibitor to faster experimentation in the field of database startups. It's a shame there's no clear path to solving that problem directly.

[0] https://www.juxt.pro/blog/sane-query-languages-podcast/

ashvardanian•8mo ago

Absolutely agree! On the bright side, widespread adoption of Python-like general-purpose languages gives me hope that similar options will multiply in the DBMS space.

jarbaugh•8mo ago

I'm skeptical of this. The cost of maintaining the "disaggregated data stack" can be immense at scale. A database that can handle replication from a row-based transactional store to, for example, a columnar one that can support aggregations could really reduce the load on engineering teams.

My work involves a "disaggregated data stack" and a ton of work goes into orchestrating all the streaming, handling drift, etc between the transactional stores (hbase) and the various indexes like ES. For low-latency OLAP queries, the data lakes can't always meet the need either. I haven't gotten the chance to see an HTAP database in action at scale, but it sounds very promising.

skissane•8mo ago

> Back in the ’70s, one relational database did everything. Transactions (OLTP) during the day and reports after hours (OLAP). Databases like Oracle V2 and IBM DB2 ran OLTP and OLAP on the same system; largely because data sets still fit on a few disks and compute was costly.

The timeline is a bit off - Oracle V2 was released in the second half of 1979, so although it technically came out at the very end of the 1970s, it isn’t really representative of 1970s databases. Oracle V1 was never released commercially; it was used as an internal name while under development starting circa 1977, inside SDL (which renamed itself RSI in 1979, and then Oracle in 1983). Plus Larry Ellison wanted the first release to be version 2 because some people are hesitant to buy version 1 software. Oracle was named after a database project Ellison worked on for the CIA while employed at Ampex, although I’m not sure anyone can really know exactly how much the abandoned CIA database system had in common with Oracle V1/V2. It definitely took some ideas from the CIA project, but I’m not sure if it took any of the actual code.

The original DB2 for MVS (later OS/390 and now z/OS) was released in 1983. The first IBM RDBMS to ship as a generally available commercial product was SQL/DS in 1981 (for VM/CMS), which this century was renamed DB2 for VM/VSE. I believe DB2/400 (now renamed DB2 for IBM i) came out with the AS/400 and OS/400 in 1988, although possibly there was already some SQL support in S/38 in the preceding years. The DB2 most people nowadays would encounter, the Linux/AIX/Windows edition (DB2 LUW), is a descendant of OS/2 EE Database Manager, which I think came out in 1987. Anyway, my point - the various editions of DB2 all saw their initial releases in the 1980s, not the 1970s.

While relational technology was invented as a research concept in the 1970s (including the SQL query language, and several now largely forgotten competitors), in that decade its use was largely limited to research, along with a handful of commercial pilots. General commercial adoption of RDBMS technology didn’t happen until the 1980s.

The most common database technologies in the 1970s were flat file databases (such as ISAM and VSAM databases on IBM mainframes), hierarchical databases (such as IBM IMS), the CODASYL network model (e.g. IDS, IDMS), MUMPS (a key-value store with hierarchical keys), early versions of PICK, and inverted list databases (ADABAS, Model 204, Datacom). I think many (or even all) of these were more popular in the 1970s than any RDBMS. The first release of dBase came out in 1978 (albeit then called Vulcan, it wasn’t named dBase until 1980), but like Oracle, it falls into the category “technically released in late 1970s but didn’t become popular until the 1980s”.

refset•8mo ago

The HTAP vision was essentially built on the traditional notion that a database is a single 'place' where both transactions happen and complex queries run.

Rich Hickey argued [0] that place-orientation is bad and that a database should actually just be an immutable value which can be passed around freely. That's fairly in line with the conclusions of the post, although I think much more simplification of the disaggregated stack is possible.

[0] https://www.infoq.com/presentations/Deconstructing-Database/

pragmatic•8mo ago

https://learn.microsoft.com/en-us/sql/relational-databases/i...

HTAP in sql server for reference.

wejick•8mo ago

I would say compute and storage separation is the way to go, especially for hyperscaler offerings à la Aurora, Cosmos DB, and AlloyDB. Later, more open-source alternatives will catch up.

jandrewrogers•8mo ago

Most analytics workloads are bandwidth-bound if you are optimizing them at all. The major issue with disaggregated storage is that the storage bandwidth is terrible in the cloud. I can buy a server from Dell with 10x the usable storage bandwidth of the fastest environments in AWS and that will be reflected in workload performance. The lack of usable bandwidth even on huge instance types means most of that compute and memory is not doing much — you are forced to buy compute you don’t need to access mediocre bandwidth of which there is never enough. The economics are poor as a result.

This is an architectural decision of the cloud providers to some extent. Linux can drive well over 1 Tbps of direct-attached storage bandwidth on a modern server but that bandwidth is largely beyond the limits of cheap off-the-shelf networking that disaggregated storage is often running over.

justincormack•8mo ago

Object storage does scale out to that performance (via replication), but you do need to use multiple compute instances, as you only get, say, 100 Gb/s on each, which is low. You can also do some of the filtering in the API, which helps too.
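
As a rough back-of-the-envelope check on the bandwidth figures in this subthread, here is a minimal sketch of how long one full scan takes when the workload is purely bandwidth-bound. The 10 TB dataset size and the function name are made up for illustration; the 100 Gb/s per-instance and 1 Tb/s direct-attached figures are the ones quoted above.

```python
def scan_seconds(dataset_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Time to read the dataset once, assuming bandwidth is the only limit."""
    return dataset_bytes * 8 / bandwidth_bits_per_s


TB = 10**12    # bytes
GBIT = 10**9   # bits per second

dataset = 10 * TB  # hypothetical table size, not a number from the thread

# One cloud instance reading object storage at roughly 100 Gb/s.
per_instance = scan_seconds(dataset, 100 * GBIT)

# One server with roughly 1 Tb/s of direct-attached storage bandwidth.
direct_attached = scan_seconds(dataset, 1000 * GBIT)

# Instances needed to match the single server by scaling out.
instances_needed = (1000 * GBIT) // (100 * GBIT)

print(per_instance, direct_attached, instances_needed)
```

Under these assumptions the single instance needs ten times as long per scan, which is the gap both comments are describing: you either pay for extra instances you do not otherwise need, or you wait.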

pradn•8mo ago

On the data warehousing side, I think the story looks like this:

1) Cloud data warehouses like Redshift, Snowflake, and BigQuery proved to be quite good at handling very large datasets (petabytes) with very fast querying.

2) Customers of these proprietary solutions didn't want to be locked in. So many are drifting toward Iceberg tables on top of Parquet (columnar) data files.

Another "hidden" motive here is that Cloud object stores give you regional (multi-zonal) redundancy without having to pay extra inter-zonal fees. An OLTP database would likely have to pay this cost, as it likely won't be based purely on object stores - it'll need a fast durable medium (disk), if at least for the WAL or the hot pages. So here we see the topology of Cloud object stores being another reason forcing the split between OLTP and OLAP.

But what does this new world of open OLTP/OLAP technologies look like? Pretty complicated.

1) You'd probably run Postgres as your OLTP DB, as it's the default these days and scales quite well.

2) You'd set up an Iceberg/Parquet system for OLAP, probably on Cloud object stores.

3) Now you need to stream the changes from Postgres to Iceberg/Parquet. The canonical OSS way to do this is to set up a Kafka cluster with Kafka Connect. You use the Debezium CDC connector for Postgres to pull deltas, then write to Iceberg/Parquet using the Iceberg sink connector. This incurs extra compute, memory, network, and disk.

There's so many moving parts here. The ideal is likely a direct Postgres->Iceberg write flow built into Postgres. The pg_mooncake extension this company is offering also adds DuckDB-based querying, but that's likely not necessary if you plan to use Iceberg-compatible querying engines anyway.

Ideally, you have one plugin for purely streaming Postgres writes to Iceberg with some defined lag. That would cut out the third bullet above.

moonikakiss•8mo ago

totally agreed on 3. You're also missing the challenges of dealing with updates/deletes; and managing the many small files.

CDC from OLTP to Iceberg is extremely non-trivial.
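
To make the update/delete point concrete, here is a minimal sketch of what "materializing" a table from a change stream means. The event shape `(op, primary_key, row)` and the function name are hypothetical; real CDC feeds carry more metadata (LSNs, transaction ids, schema) that this ignores.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical event shape: (operation, primary key, changed columns).
Event = Tuple[str, int, Dict[str, Any]]
Table = Dict[int, Dict[str, Any]]


def apply_events(table: Table, events: List[Event]) -> Table:
    """Fold an ordered batch of change events into a keyed copy of the table."""
    for op, key, row in events:
        if op == "insert":
            table[key] = dict(row)
        elif op == "update":
            # Merge the changed columns over whatever we already hold.
            table[key] = {**table.get(key, {}), **row}
        elif op == "delete":
            table.pop(key, None)
        else:
            raise ValueError(f"unknown op: {op}")
    return table


events: List[Event] = [
    ("insert", 1, {"name": "a", "qty": 1}),
    ("insert", 2, {"name": "b", "qty": 5}),
    ("update", 1, {"qty": 2}),
    ("delete", 2, {}),
]

mirror = apply_events({}, events)
print(mirror)
```

With a mutable dict this is trivial. On immutable columnar files each update or delete has to become either a rewritten data file or a separate delete marker, which is roughly where the small-files and compaction work comes from.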

pradn•8mo ago

The small writes problem that Iceberg has is totally silly. They spend so much effort requiring a tree of metadata files, but you still need an ACID DB to manage the pointer to the latest tree. At that point, why not just move all that metadata to the DB itself? It’s not sooo massive in scale.

The current Iceberg architecture requires table reads to do so many small reads, of the files in the metadata tree.

The brand new DuckLake post makes all this clear.

https://duckdb.org/2025/05/27/ducklake.html

Still Iceberg will probably do just fine because every data warehousing vendor is adding support for it. Worse is better.

jgraettinger1•8mo ago

> There's so many moving parts here.

Yep. At the scope of a single table, append-only history is nice but you're often after a clone of your source table within Iceberg, materialized from insert/update/delete events with bounded latency.

There are also nuances like Postgres REPLICA IDENTITY and TOAST columns. Enabling REPLICA IDENTITY FULL amplifies your source DB WAL volume, but not having it means your CDC updates will clobber your unchanged TOAST values.

If you're moving multiple tables, ideally your multi-table source transactions map into corresponding Iceberg transactions.

Zooming out, there's the orchestration concern of propagating changes to table schema over time, or handling tables that come and go at the source DB, or adding new data sources, or handling sources without trivially mapped schema (legacy lakes / NoSQL / SaaS).

As an on-topic plug, my company tackles this problem. Postgres => Iceberg is a common use case.

[0] https://docs.estuary.dev/reference/Connectors/materializatio...

gjvc•8mo ago

can you explain this please "not having it means your CDC updates will clobber your unchanged TOAST values" ?

sgarland•8mo ago

They’re referring to this: https://debezium.io/blog/2019/10/08/handling-unchanged-postg...
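
A minimal sketch of the failure mode, assuming the change event carries a sentinel for a TOASTed column that was not modified. The sentinel string is what I recall as Debezium's default placeholder, so treat it as an assumption; the row contents and both function names are hypothetical.

```python
from typing import Any, Dict

# Assumed sentinel meaning "column unchanged, value not included in the event".
UNAVAILABLE = "__debezium_unavailable_value"

Row = Dict[str, Any]


def naive_apply(current: Row, update: Row) -> Row:
    """Write the event as-is: the sentinel overwrites the real value."""
    return {**current, **update}


def toast_aware_apply(current: Row, update: Row) -> Row:
    """Keep the previous value for any column the event marks as unavailable."""
    merged = dict(current)
    for column, value in update.items():
        if value == UNAVAILABLE:
            continue  # unchanged TOAST column: leave what we already have
        merged[column] = value
    return merged


current = {"id": 1, "title": "old", "body": "very long text..."}
update = {"id": 1, "title": "new", "body": UNAVAILABLE}

clobbered = naive_apply(current, update)
preserved = toast_aware_apply(current, update)

print(clobbered["body"])
print(preserved["body"])
```

The second variant only works if the sink can look up the previous row, which is the trade-off being described: either pay for that lookup downstream, or turn on REPLICA IDENTITY FULL and pay in WAL volume at the source.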

gunnarmorling•8mo ago

Funny timing, just took a fresh look at this topic in this new post earlier this week: https://www.morling.dev/blog/backfilling-postgres-toast-colu....

lmz•8mo ago

This may be helpful for you https://clickhouse.com/docs/integrations/clickpipes/postgres...

brightball•8mo ago

This is essentially what Crunchydata does with their Crunchydata Warehouse product. It’s really cool.

pradn•8mo ago

Their product looks promising. It looks like the Postgres schema and writes have to be "Iceberg-aware": special work to get around the fact that a small write results in a new, small Parquet file. That's not the end of the world - but perhaps ideally, you wouldn't be aware of Iceberg much at all when using Postgres. That might be a dream though.

Fully using Postgres without awareness of Iceberg would require full decoupling, and a translation layer in between (Debezium, etc). That comes with its own problems.

So perhaps some intimacy between the Postgres and Iceberg schemas is a good thing - especially to support transparent schema evolution.

DuckLake and CrunchyBridge both support SQL queries on the backing Iceberg tables. That's a good option. But a big part of the value of Iceberg comes in being able to read using Spark, Flink, etc.

BewareTheYiga•8mo ago

I'd argue the bigger value is keeping the data in one storage place and bringing the compute to it. Works especially well for Big Corp use cases where entire divisions of the corp go their own way. Throw in M&A activity and it is a good hedge for the unknown (i.e. you might be a Databricks and Azure shop and you just bought a Snowflake & AWS company). Keep the data in an open table format, and let everyone query using their preferred engine to their heart's desire.

pradn•8mo ago

There's two problems being discussed in this article and thread:

1) Combining OLTP and OLAP databases into one system

2) Using an open data format to be able to read/write from many systems (OLTP/Postgres, analytics engine/Spark)

> I'd argue the bigger value is keeping the data in one storage place and bringing the compute to it.

Yes, I agree with you. This observation is the idea behind #2, and why Iceberg has so much momentum now.

apwell23•8mo ago

Article is really messing up my browser so I couldn't read it on my phone. But HTAP never made sense to me because in my experience it's very rare that you'd need analytics on a single database. It's often a confluence of multiple data sources: streams, databases, CSVs, vendor-provided data.

pragmatic•8mo ago

Analytics on your hot OLTP data.

Like realtime dashboards/reports as the transactions are coming in.

Think of a SaaS with high usage.

The analytics you're referring to follow the slower-moving "ETL all the source data together, then analyze it" approach.

Different use cases.

hn_throwaway_99•8mo ago

> Most workloads don’t need distributed OLTP. Hardware got faster and cheaper. A single beefy machine can handle the majority of transactional workloads. Cursor and OpenAI are powered by a single-box Postgres instance. You’ll be just fine.

I thought this was such an important point. Sooooo many dev hours were spent figuring out how to do distributed writes, and for a lot of companies that work was never needed.

roncesvalles•8mo ago

I thought it was the weakest point. The need for a distributed DB is rarely performance, it's availability and durability.

davidgomes•8mo ago

But you can get more availability and more durability with much easier alternatives:

- Availability: spin up more read replicas.

- Durability: spin up more read replicas and also write to S3 asynchronously.

With Postgres on Neon, you can have both of these very easily. Same with Aurora.

(Disclaimer: I work at Neon)

_benedict•8mo ago

This doesn’t seem to provide higher write availability, and if the read replicas are consistent with the write replica this design must surely degrade write availability as it improves read availability, since the write replica must update all the read replicas.

This also doesn’t appear to describe a higher durability design at all by normal definitions (in the context of databases at least) if it’s async…?
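
A toy model of the durability objection, in case it helps. It is a sketch under heavy assumptions: one primary, one standby, no network, and class and function names invented for illustration. It only shows which acknowledged transactions the standby is missing if the primary dies at a given moment.

```python
from typing import List


class Primary:
    """Toy primary that acknowledges commits before or after replicating them."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.log: List[str] = []          # the primary's own log
        self.replica_log: List[str] = []  # what the standby has received
        self.pending: List[str] = []      # queued for shipping in async mode
        self.acked: List[str] = []        # commits the client was told succeeded

    def commit(self, txn: str) -> None:
        self.log.append(txn)
        if self.synchronous:
            self.replica_log.append(txn)  # wait for the standby, then ack
        else:
            self.pending.append(txn)      # ack first, ship later
        self.acked.append(txn)

    def ship(self) -> None:
        """Background replication catching up in async mode."""
        self.replica_log.extend(self.pending)
        self.pending.clear()


def lost_after_failover(primary: Primary) -> List[str]:
    """Acknowledged transactions the standby lacks if the primary dies now."""
    return [t for t in primary.acked if t not in primary.replica_log]


async_primary = Primary(synchronous=False)
async_primary.commit("t1")
async_primary.ship()
async_primary.commit("t2")  # acknowledged, not yet shipped

sync_primary = Primary(synchronous=True)
sync_primary.commit("t1")
sync_primary.commit("t2")

async_lost = lost_after_failover(async_primary)
sync_lost = lost_after_failover(sync_primary)
print(async_lost, sync_lost)
```

The flip side, which the toy does not model, is the availability cost: in synchronous mode a commit cannot be acknowledged while the standby is unreachable.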

davidgomes•8mo ago

Yeah, this is not about write availability, but as the OP/author points out, scaling that is not the bottleneck for most apps.

_benedict•8mo ago

I think you may have misunderstood the GP and are perhaps misusing terminology. You cannot meaningfully scale vertically to improve write availability, and if you care about availability a single machine (and often a primary/secondary setup) is insufficient.

Even if you only care about scaling reads, eventually the 1:N write:read replica ratio will become too costly to maintain, and long before you reach that point you likely sacrifice real-time isolation guarantees to maintain your write availability and throughput.

mrkeen•8mo ago

> you likely sacrifice real-time isolation guarantees to maintain your write availability and throughput

No worries there, in all likelihood isolation has probably been killed twice already. Once by running the DB on READ COMMITTED, and a second time by using an ORM like EF to read data into your application, fiddle with it in-RAM, and write the new (unrelated-to-what-was-read) data back to the DB.

In other words, we throw out all that performant 2010-2020 NoSQL & eventual consistency tech, and go back to good old fashioned SQL & ACID, because everyone knows SQL, and ACID is amazing. Then we use LINQ/EF instead because it turns out that no-one actually wants to touch SQL, and full isolation is too slow so that gets axed too.

sgarland•8mo ago

> You cannot meaningfully scale vertically to improve write availability

Disagree. Even if you limit yourself to the cloud, r7i/r8g.48xl gets you 192 vCPU / 1.5 TiB RAM. If you really want to get silly, x2iedn.32xl is 128 vCPU / 4 TiB RAM, and you get 3.8 TiB of local NVMe storage for temp tablespace. The money you’ll pay ($16.5K - $44K per month, depending on specific class) would pay for a similarly spec’d server in the same amount of time, though.

Which brings me to the novel concept of owning your own hardware. A quick look at Supermicro’s site shows a 2U w/ up to 1.92 PB of Gen5 NVMe, 8 TiB of RAM, and dual sockets. That would likely cost a wee bit more than a month of renting the aforementioned AWS VM, but a more reasonably spec’d one would not. Realistically, that much storage would be used as SDS for other DBs to use. NVMoF isn’t quite as fast as local disks, but it’s a hell of a lot faster than EBS et al.

The point is that you actually can vertically scale to stupidly high levels, it’s just that most companies have no idea how to run servers anymore.

> and if you care about availability a single machine (and often a primary/secondary setup) is insufficient.

Depending on your availability SLOs, of course, I think you’d find that a two-node setup (optionally having N read replicas) with one in standby would be quite sufficient. Speaking from personal experience on RDS (MySQL fronted with ProxySQL on K8s, load balanced with NLB), I experienced a single outage in two years. When it happened, no one noticed, it was so brief. Some notice-only alerts for 500s in Slack, but no pages went out.

_benedict•8mo ago

> If you really want to get silly, x2iedn.32xl is 128 vCPU / 4 TiB RAM, and you get 3.8 TiB of local NVMe

This doesn't affect availability - except insofar as unavailability might be caused by insufficient capacity, which is not the typical definition.

> Depending on your availability SLOs, of course

Yes, exactly. Which is the point the GP was making. You generally make the trade-off in question not for performance, but because you have SLOs demanding higher availability. If you do not have these SLOs, then of course you don't want to make that trade-off.

sgarland•8mo ago

> This doesn't affect availability - except insofar as unavailability might be caused by insufficient capacity, which is not the typical definition.

I agree, but it seemed to me that GP was using it as such: "You cannot meaningfully scale vertically to improve write availability"

jandrewrogers•8mo ago

The big caveat about these configurations is the amount of time it takes to rebuild a replica due to the quantity of storage per node that has to be pushed over the network. This is one of the low-key major advantages of disaggregated storage.

I prefer to design my own hardware infrastructure but there are many operational tradeoffs to consider.

roncesvalles•8mo ago

No loss of committed transactions is acceptable to any serious business.

>I work at Neon

In my opinion, distributed DB solutions without synchronous write replication are DOA. Apparently a good number of people don't share this opinion because there's a whole cottage industry around such solutions, but I would never touch them with a 10 foot stick.

hn_throwaway_99•8mo ago

I think you misunderstood his point (and mine). There are usually much better ways to support availability and durability than to have multiple simultaneous write servers. On the contrary, having multiple write servers is usually worse for availability and durability because of the complexity.

For example, look at how Google Cloud SQL's aptly named "High Availability" configuration supports high availability: 1 primary and 1 standby. The standby is synced to the primary, and the roles are switched if a failover occurs.

growlNark•8mo ago

Something tells me neither cursor nor openai need write workloads, so they would probably do just as fine using a flat file. I'm honestly curious what use either would have for queries that you couldn't get with a filesystem.

Certainly neither product has much obvious need for OLTP workloads. Hell, neither has any need for transactions at all. You're just paying them for raw CPU.

growlNark•8mo ago

Update: in my mind, this reflects analytics of queries. Just further reason to run your own models I guess....

hn_throwaway_99•8mo ago

It's not just analytics. ChatGPT saves all of your conversation history - I don't know if they save the full conversation text in postgres, but I'd assume they at least save conversation metadata there.

You may not want this from a privacy perspective, but as a user I find it to be a very useful feature, e.g. I can see my full history, I can easily share conversations with a share link (and it's the exact version of that conversation, not like a URL where contents can change).

physix•8mo ago

My takeaway about all this is that nobody really cares much about consistency or the cost to build and run lambda-like architectures.

jamesblonde•8mo ago

The second-to-last line is the summary - "The HTAP challenge of our time comes down to making the lakehouse real-time ready."

We are building this platform as well. There are 2 aspects to it - the "enterprise way" and the "greenfield way". The greenfield way will win out in 10-15 years, but unless you have capital to last that long, as a startup we need to go the Enterprise way first until we are big enough to go the unified HTAP-style way. The Lakehouse - open columnar data - is here to stay. It needs a better connection to OLTP than Kafka, but it will take time between A and B.

orefalo•8mo ago

I found the title amusing. This died.. right after inception.

Clearly, the objectives and limitations of OLAP and OLTP differ so much that merging the two domains is a fantasy.

It's like asking two people to view through the same lens.

thom•8mo ago

You cannot say HTAP is dead when the alternative is so much complexity and so many moving parts. Most enterprises are burning huge amounts of resources literally just shuffling data around for zero business value.

The dream is a single data mesh presenting an SQL userland where I can write and join data from across the business with high throughput and low latency. With that, I can kill off basically every microservice that exists, and work on stuff that matters at pace, instead of half of all projects being infrastructure churn. We are close but we are not there yet and I will be furious if people stop trying to reach this endgame.

RedShift1•8mo ago

Maybe GraphQL can be your savior?

Charon77•8mo ago

GraphQL is just a language much like SQL.

RedShift1•8mo ago

Yes but it provides a standardized way to deliver a unified interface to query your data, which is what OP is after?

010101010101•8mo ago

“With high throughput and low latency” are the operative requirements for OP and GQL doesn’t help with either on its own.

Yoric•8mo ago

The problem is generally not the surface language, but the underlying distribution/querying mechanism.

It is, of course, possible that SQL is too complex a language for this dream.

sgarland•8mo ago

> The dream is a single data mesh presenting an SQL userland where I can write and join data from across the business with high throughput and low latency.

That exists, and has for years: an extremely large DB loaded to the gills with RAM and local NVMe drives. Add some read replicas if you need them, similarly configured. Dedicate one for OLAP.

jandrewrogers•8mo ago

This doesn’t work quite as well as people assume. The first limit is simply size, you can only cram a few petabytes of NVMe in a server (before any redundancy) and many operational analytic workloads are quite a bit larger these days. One of the major advantages of disaggregated storage in theory is that it allows you to completely remove size limits. Many operational analytic workloads don’t need a lot of compute, just sparse on-demand access to vast amounts of data. With good selectivity (another open issue), you could get excellent performance in this configuration.

Ignoring the storage size limits, the real issue as you scale up is that the I/O schedulers, caching, and low-level storage engine mechanics in a large SQL database are not designed to operate efficiently on storage volumes this large. They will work technically, but scale quite a bit more poorly than people expect. The internals of SQL databases are (sensibly) optimized for working sets no larger than 10x RAM size, regardless of the storage size. This turns out to be the more practical limit for analytics in a scale-up system even if you have a JBOD of fast NVMe at your disposal.

layer8•8mo ago

> many operational analytic workloads are quite a bit larger these days.

What are the use cases where such workloads come up, aside from Google-level operations? Just trying to understand what we are talking about.

jandrewrogers•8mo ago

Sensor and telemetry analytics workloads in boring industrial sectors are all at this scale, even at companies that aren’t that large revenue-wise. TBs to PBs of new data per day.

layer8•8mo ago

What are these used for, to all have to be in a single unified database?

jandrewrogers•8mo ago

A large part of those workloads is stitching together a single derived model of operational reality and how different entities interact over time from the samples you get from each individual source. You need a running log of all entity behavior and interactions over time to look back on in order to contextualize what you see at the current point in time. Most of this is not pre-computable because the combinatorial state space is too large so every analytic query needs to be able to see across every relationship between sources that can be inferred.

It is essentially a spatial and/or graph analytic model evolving over time. Any non-trivial data model that captures dynamics in the physical world looks like this.

In fairness, all popular analytics platforms handle these workloads poorly regardless of whether they are vertically or horizontally scaled. These workloads usually cannot be cached even in theory, so performance and scalability come down to the sophistication of your scheduler design.

layer8•8mo ago

Thanks. It would be interesting to talk about the business specifics, but that would move into confidential territory I guess.

sgarland•8mo ago

It works to a certain point, yes, but I daresay that the overwhelming majority of OLTP needs are in the <= TB range, not PB. OLAP is its own beast, though I'll also say that most modern tech companies' schemas are hot garbage, full of denormalized tables for no good reason, JSON everywhere, etc., and thus the entire thing could be much, much smaller if the RDBMS was used as it was intended: relationally.

dehrmann•8mo ago

A sibling mentioned GraphQL. That works, but it was really built for clients interacting with Meta's Ent framework. The web layer is largely a monolith, and user objects are modeled as "ents," linked to each other, and stored in heavily cached MySQL. GraphQL exposes access to them.

physix•8mo ago

> You cannot say HTAP is dead when the alternative is so much complexity and so many moving parts. Most enterprises are burning huge amounts of resources literally just shuffling data around for zero business value.

We built an HTAP platform as a layer over Cassandra for precisely that reason round about when Gartner invented the term.

In finance and fintech, there are ample use cases where the need for transactional consistency and horizontal scalability to process and report on large volumes come together, and where the banks really struggle to meet requirements.

I dug out an old description of our platform, updated it a bit, and put it on Medium, in case anyone is interested: https://medium.com/@paul_42036/a-technical-description-of-th...

nubinetwork•8mo ago

Never heard of it. Maybe it's a good thing it's being considered dead... /s

mrkeen•8mo ago

> Cursor and OpenAI are powered by a single-box Postgres instance. You’ll be just fine.

Well no, not according to your own source:

  This setup consists of one primary database and dozens of replicas.

Are they just fine?

  There have been several instances in the past where issues related to PostgreSQL have led to outages of ChatGPT.

OK but let's pretend it's acceptable to have outages. It's fine apart from that?

  However, “write requests” have become a major bottleneck. OpenAI has implemented numerous optimizations in this area, such as offloading write loads wherever possible and avoiding the addition of new services to the primary database.

I feel that! I've been part of projects where we've finished building a feature, but didn't let customers have it because it affected the write path and broke other features.

It's been less than a week since someone in the company posted in Slack "we tried scaling up the db (Azure mssql) but it didn't fix the performance issues."

hobs•8mo ago

I don't understand why that's an acceptable answer when people don't understand the nature of the performance issue.

Network round trip? Scaling the instance aint gonna help. Row by agonizing row? Maybe some linear speedups as you get more IO, but cloud storage is pretty fucking slow. Terrible plan/table/indexing/statistics? Still gonna be bad with more grunt. Blocking and locking and deadlocking the problem? Speeding up might make it worse :)

If people have exponential problems they don't think "let's just get more machines", they think "let's measure and fix the damn thing", but for some reason it doesn't apply to most people's databases.

sgarland•8mo ago

> but for some reason it doesn't apply to most people's databases.

It’s because RDBMS effectively hasn’t changed in decades, and so requires fundamental knowledge of how computers work, and the ability to read dense technical docs. If those two clauses don’t seem related, go read the docs for HAProxy, or Linux man pages, or anything else ancient in the tech world. It used to be assumed that if you were operating complex software, you necessarily understood the concepts it was built on, and also that you could read dozens of pages of plaintext without flashy images and effects.

That’s not to say that all modern software assumes the user is an idiot, or has terrible docs – Django does neither, for example.

> Network round trip? Scaling the instance aint gonna help. Row by agonizing row? Maybe some linear speedups as you get more IO, but cloud storage is pretty fucking slow.

See previous statement re: fundamentals. “I need more IOPS!” You have a 1 msec read latency; it doesn’t matter how quickly it comes off the disk (never mind the fact that the query is probably in a single thread), you have the same bottleneck.
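A back-of-the-envelope sketch of that bottleneck, in Python. The 1 msec latency comes from the comment above; the 16,000 IOPS figure and the function names are assumptions made up for illustration. The point is that a single thread issuing one read at a time is capped by latency, and provisioned IOPS only help if enough requests are in flight (Little's law).

```python
def max_serial_reads_per_sec(latency_s: float) -> float:
    """Upper bound on reads/s for one thread that waits for each read to finish."""
    return 1.0 / latency_s


def required_concurrency(target_iops: float, latency_s: float) -> float:
    """Little's law: requests in flight = throughput * latency."""
    return target_iops * latency_s


latency = 0.001          # 1 msec per read, as in the comment above
provisioned_iops = 16_000  # assumed figure, purely illustrative

serial_cap = max_serial_reads_per_sec(latency)
needed = required_concurrency(provisioned_iops, latency)

print(f"single-threaded cap: {serial_cap:.0f} reads/s")
print(f"in-flight requests needed to use {provisioned_iops} IOPS: {needed:.0f}")
```

Under these assumptions the single-threaded query tops out at roughly 1,000 reads per second no matter how many IOPS the volume is rated for.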

ghc•8mo ago

> Django does neither, for example.

Django is "ancient" just like HAProxy. I deployed my first Django app at the end of 2005.

sgarland•8mo ago

Fair enough, I didn't know it was that old.

0xbadcafebee•8mo ago

Don't worry. All architectures get recycled eventually. Everything is new again.

One of the biggest problems with having more data is that it's just hard to manage. That's why cloud data warehouses are here to stay. They enable the "utility computing" of cloud compute providers, but for data. I don't think architecture is a serious consideration for most people using them, other than the idea that "we can just throw everything at it".

NewSQL didn't thrive because it isn't sexy enough. A thing doesn't succeed because it's a "superior technology", it survives if it's overwhelmingly more appealing than existing solutions. None of the NewSQL solutions are sufficiently sexier than old boring stable databases. This is the problem with every new database. I mean, sure, they're fun for a romp in the sheets; but are they gonna support your kids? Interest drops off once everyone realizes it's not overwhelmingly better than the old stuff. Humans are trend-seekers, but they also seek familiarity and safety.

teleforce•8mo ago

I think people need to realize that HTAP is not a technology but a set of database features, while relational is the real database technology.

It seems that people are now converging on this pseudo-math database solution, namely PostgreSQL with its battle-hardened object-relational technology, which is IMHO a local minimum [1].

The world needs a proper math-based universal solution for database technology, similar to relational. But this time around we need many more features; we want it all, including analytical, transactional, spreadsheet, graph, vector, signal, etc. On top of that we want a reliable distributed architecture. We simply cannot keep adding onto PostgreSQL indefinitely, because the complexity will be humongous and the solutions will become sub-optimal [2].

We need a strong database foundation with a solid mathematical basis, not unlike the original relational database technology.

The best candidate available now is D4M by the fine folks at MIT, which has been implemented in Matlab, Python and Julia [3]. Perhaps someone needs to write a C++, Dlang or Rust version of it for it to be widely accepted.

It's funny that the article starts by mentioning that its inspiration was the popular article on big data being dead, and by doing so prematurely dismisses the problem. The book on D4M, however, tackles the big data problem head-on by putting the exact terminology in its title [4].

[1] What’s the Difference Between MySQL and PostgreSQL?

https://aws.amazon.com/compare/the-difference-between-mysql-...

[2] Just Use Postgres!

https://www.manning.com/books/just-use-postgres

[3] D4M: Dynamic Distributed Dimensional Data Model:

https://d4m.mit.edu/

[4] Mathematics of Big Data: Spreadsheets, Databases, Matrices, and Graphs (MIT Lincoln Laboratory Series):

https://mitpress.mit.edu/9780262038393/mathematics-of-big-da...

maxmcd•8mo ago

I think the upcoming CedarDB is HTAP? https://cedardb.com/

Clickhouse performance for Postgres workloads?

databasegirl•8mo ago

https://planetscale.com/blog/what-is-htap

orefalo•8mo ago

I have compiled the following table to compare OLTP and OLAP

https://medium.com/@orefalo_66733/oltp-vs-olap-fb0441f57259

rubenvanwyk•8mo ago

One thing no one seems to notice is the rise of “Operational Warehouses” such as RisingWave or Materialize. A big ‘problem’ in OLAP, as the article mentions, is that people expect aggregations or analytic views on live data. These solutions solve it. In principle, this shows that just having incrementally maintained materialised views really goes a long way towards achieving the HTAP dream on a single DB.
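For anyone unfamiliar with the idea, here is a minimal Python sketch of what "incrementally maintained" means. It is a toy under stated assumptions: the `IncrementalSumView` class and the region keys are made up, and this is not how RisingWave or Materialize are implemented internally. It keeps the equivalent of `SELECT key, SUM(amount), COUNT(*) ... GROUP BY key` current by applying each insert or delete as a delta, instead of rescanning the base table.

```python
from collections import defaultdict


class IncrementalSumView:
    """Hypothetical view keeping SUM(amount) and COUNT(*) per key up to date."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def apply(self, key, amount, sign=1):
        """Apply one change to the view: sign=+1 for an insert, -1 for a delete."""
        self.totals[key] += sign * amount
        self.counts[key] += sign
        if self.counts[key] == 0:
            # The group has no rows left, so it disappears from the view.
            del self.totals[key]
            del self.counts[key]

    def read(self, key):
        """Return (sum, count) for a key without touching the base rows."""
        return self.totals.get(key, 0.0), self.counts.get(key, 0)


view = IncrementalSumView()
view.apply("eu", 10.0)           # insert
view.apply("eu", 5.0)            # insert
view.apply("us", 7.0)            # insert
view.apply("eu", 10.0, sign=-1)  # delete the first "eu" row

print(view.read("eu"))
print(view.read("us"))
```

Each write costs a constant amount of work for the view, and reads never scan the underlying rows, which is the property that makes analytic views on live data affordable.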

CurtMonash•8mo ago

I stopped reading early, when the article said that in the 1970s one big relational database did everything.

In fact, relational databases did nothing in the 1970s. They didn't even exist yet in commercial form.

My first prediction as an analyst from 1982 onwards was that "index-based" DBMS would take over from linked-list DBMS and flat files. (That was meant to cover both inverted-list and relational systems; I expected inverted-list DBMS to outperform relational ones for longer than they did.)
