I've shared many of those thoughts with their team directly out of love.
Also, that's Series D-E; money isn't real anymore.
Could you explain this? Is this commentary on voting power dilution or their Class A/B share rules?
I had the same thought the first time I heard about a 12M "seed" round.
Personally I’d just go to a colo center, buy a rack of Supermicro, and call it a day. No way that’s more expensive after a year (per public pricing).
Some of the default config options are weird and SSL is something that needs to be addressed. Overall, still one of the easier DBs to maintain.
Roughly speaking, Postgres is to SQLite what Clickhouse is to DuckDB.
OLTP -> Online Transaction Processing. Postgres and traditional RDBMS. Mainly focused on transactions and addressing specific rows. Queries like "show me all orders for customer X".
OLAP -> Online Analytical Processing. Clickhouse and other column-oriented databases. For analytical and calculation queries, like "show me the total value of all orders in March 2024". OLAP databases typically store data by column rather than row, and usually have optimizations for storage space and query speed based on that. As a tradeoff they're typically slower for OLTP-type queries. Often you'd bring in an OLAP DB like Clickhouse when you have a huge volume of data and your OLTP database is struggling to keep up.
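To make the contrast concrete, here's a rough sketch of the two query shapes (the orders table and columns are invented for illustration):

```sql
-- OLTP-style: fetch specific rows for one customer (row stores like Postgres shine here).
SELECT order_id, order_date, total
FROM orders
WHERE customer_id = 42;

-- OLAP-style: scan and aggregate one column across many rows
-- (column stores like Clickhouse shine here).
SELECT sum(total) AS march_revenue
FROM orders
WHERE order_date >= '2024-03-01' AND order_date < '2024-04-01';
```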
Here "Online" means results while connected to the system, not real time since there is no time requirement for results.
ClickHouse is designed so you can build dashboards with it. Other offline systems are designed so you can build reports with them that you send as PDFs over email.
The database is OLAP, whereas Postgres is an OLTP database. Essentially it's very fast at complex queries, and is targeted at analytics workloads.
ClickHouse spun out of Yandex & is open source, https://github.com/ClickHouse/clickhouse
Disclosure: I started at Citus & ended up at ClickHouse
Is it a surprise that OLTP is not efficient at aggregation and analytics?
There's nothing Clickhouse does that other OLAP DBs can't do, but the killer feature for us was just how trivially easy it was to replicate InnoDB data into Clickhouse and get great general performance out of the box. It was a very accessible option for a bunch of Rails developers who were moonlighting as DBAs in a small company.
The heart of Clickhouse is its table engines (they don't exist in Postgres): https://clickhouse.com/docs/engines/table-engines . The primary column (or columns) is ordered in some way, and adjacent values in memory are from the same column in the table. Index entries span wide areas (e.g. by default there's only one key record in the primary index for every 8192 rows) because most operations in Clickhouse are aggregate in nature. Inserts are also expected to be in bulk (They are initially a new physical part that is later merged into the main table structure). A single DELETE is an ALTER TABLE operation in the MergeTree engine. :)
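As a rough sketch of what that looks like in DDL (table and column names invented; index_granularity = 8192 is just the default spelled out):

```sql
CREATE TABLE events
(
    event_date  Date,
    user_id     UInt64,
    event_type  LowCardinality(String),
    value       Float64
)
ENGINE = MergeTree
-- Physical sort order on disk; queries that filter/aggregate along this ordering are fast.
ORDER BY (event_date, user_id)
-- One primary-index entry per 8192 rows (the default).
SETTINGS index_granularity = 8192;
```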
This structure allows it to literally crunch billions of values per second (brute force, not with pre-processing, erm, "tricks", although there is a lot of support for those in Clickhouse as well). I've had tables with hundreds of columns and 100+ billion rows that are nearly as performant as a million-row table if I can structure the query to work with the table's physical ordering.
Clickhouse recommends not using nullable fields because of the performance implications (it requires storing a bit somewhere for each value). That's how much they care about perf, and how close their memory layout stays to the raw data type. :)
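For illustration (names invented), the two options look like this; the usual advice is to prefer the plain column with a default:

```sql
-- Nullable wrapper: an extra null mask is stored alongside every value.
CREATE TABLE readings_nullable
(
    ts    DateTime,
    value Nullable(Float64)
)
ENGINE = MergeTree ORDER BY ts;

-- Plain column with a default: no null mask, faster scans, less storage.
CREATE TABLE readings_plain
(
    ts    DateTime,
    value Float64 DEFAULT 0
)
ENGINE = MergeTree ORDER BY ts;
```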
> They are initially a new physical part that is later merged into the main table structure
> A single DELETE is an ALTER TABLE operation
Can you explain these two further?
The reason I mentioned it is because it's a huge surprise to some people that... from the docs: "The ALTER TABLE prefix makes this syntax different from most other systems supporting SQL. It is intended to signify that unlike similar queries in OLTP databases this is a heavy operation not designed for frequent use. ALTER TABLE is considered a heavyweight operation that requires the underlying data to be merged before it is deleted."
There's also a "lightweight delete" available in many circumstances https://clickhouse.com/docs/sql-reference/statements/delete. Something really nice about the ClickHouse docs is that they devote quite a bit of text to describing the design and performance implications of using an operation. It reiterates the focus on performance that is pervasive across the product.
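Roughly, the two delete forms look like this (the table name is just an example):

```sql
-- Mutation: a heavyweight operation that rewrites the affected parts in the background.
ALTER TABLE orders DELETE WHERE customer_id = 42;

-- Lightweight delete: rows are masked at read time and physically removed during later merges.
DELETE FROM orders WHERE customer_id = 42;
```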
Edit: Per the other part of your question, why inserts create new parts and how they are merged is best described here https://clickhouse.com/docs/engines/table-engines/mergetree-...
It's fast, it's........ really fast!!
But you need to get comfortable with their extended SQL dialect, which forces you to think a little differently than with the usual SQL if you want to keep perf high.
It's a fun story.
Our first swag shipment with the new colours had just arrived, the founders were in one place together for one of the first times, the weather wasn't terrible in Amsterdam for one day.
Not a Pringles can. Rather, they were stuffed in a shipping box that came from a warehouse, manhandled by customs, and thrown onto them for the purpose of taking the photo.
#startuplife eh?
With $200/month I have a good database. $1-5M revenue?
p.s., It's also possible to break ClickHouse as you demonstrated. It used to be a lot easier.
When you use INSERT ... SELECT in ClickHouse you do need to pay attention to the generated table partitions, as they coexist in memory before flushing to storage. The usual approach is to break up the insert into chunks so you can control how many parts are generated or to adjust the partitioning in the target table.
It's possible the problem might be somehow related to this behavior but that's just conjecture. It's usually pretty easy to work around. Meanwhile if it's a bug it will probably get fixed quickly.
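A sketch of the chunking approach mentioned above (table and column names are made up): split the copy along the partitioning key so each statement touches a bounded number of parts, e.g. one month at a time.

```sql
INSERT INTO events_dest
SELECT * FROM events_src
WHERE toYYYYMM(event_date) = 202401;

INSERT INTO events_dest
SELECT * FROM events_src
WHERE toYYYYMM(event_date) = 202402;
-- ...and so on per month, or drive the loop from the client side.
```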
Unless you try to join tables in it, in which case it will immediately explode.
More seriously, it's a columnar data store, not a relational database. It'll definitely pretend to be "postgres but faster", but that's a very thin and very leaky facade. You want to do a massively complex set of selects and conditional sums over one table with 3B rows and terabytes of data? You'll get a result in tens of seconds without optimization. You want to join two tables that postgres could handle easily? You'll OOM a machine with a TB of memory.
So: good for very specific use cases. If you have those use cases, it's great! If you don't, use something else. Many large companies have those use cases.
Could you explain why you don't think ClickHouse is relational? The storage is an implementation detail. It affects how fast queries run but not the query model. Joins have already improved substantially and will continue to do so in the future.
They've made strides in the last year or two to implement more join algorithms and to re-order your joins automatically (including what's on the "left" and "right" of the join, which affects the performance of the algorithm).
Their release notes cover a lot of the highlights, and they have dedicated documentation regarding joins[1]. But we've made order-of-magnitude improvements before just by reordering our joins to align with how ClickHouse processes them.
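As a sketch of the ordering point (table names invented): with the default hash join ClickHouse builds the in-memory hash table from the right-hand table, so keeping the smaller table on the right is usually what you want.

```sql
-- Risky on big data: the huge fact table on the right gets pulled into memory.
SELECT c.country, sum(e.value) AS total
FROM countries AS c
INNER JOIN events AS e ON e.country_id = c.id
GROUP BY c.country;

-- Usually better: the small dimension table on the right.
SELECT c.country, sum(e.value) AS total
FROM events AS e
INNER JOIN countries AS c ON e.country_id = c.id
GROUP BY c.country;
```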
It’s important to structure your tables and queries in a way that aligns with the ordering keys, in order to optimize how much data needs to be loaded into RAM. You absolutely CANNOT just replicate your existing postgres DB and its primary keys or whatever over to CH. There are tricks like projections and incremental materialized views that can help to get the appropriate “lenses” for your queries. We use incremental MVs to, for example, continuously aggregate all-time stats about tens of billions of records. In general, for CH, space is cheap and RAM is expensive, so it’s better to duplicate a table’s data with a different ordering key than to make an inefficient query.
As long as the queries align with the ordering keys, it is insanely fast and able to enable analytics queries for truly massive amounts of data. We’ve been very impressed.
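A minimal sketch of the incremental MV pattern (all names invented, not our actual schema): the target table pre-aggregates per-user totals, and the materialized view keeps it updated as rows are inserted into the raw table.

```sql
-- Pre-aggregated target table; SummingMergeTree collapses rows with the same key on merge.
CREATE TABLE user_totals
(
    user_id UInt64,
    events  UInt64,
    value   Float64
)
ENGINE = SummingMergeTree
ORDER BY user_id;

-- Incremental MV: runs on every insert into `events` and writes partial aggregates to user_totals.
CREATE MATERIALIZED VIEW user_totals_mv TO user_totals AS
SELECT user_id, count() AS events, sum(value) AS value
FROM events
GROUP BY user_id;

-- At query time, aggregate again: a user's totals may be spread across parts
-- until background merges collapse them.
SELECT user_id, sum(events) AS events, sum(value) AS value
FROM user_totals
GROUP BY user_id;
```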
Clickhouse is great, but like any database, if you run it at scale someone must tend to it.
[0] https://fosdem.org/2025/schedule/event/fosdem-2025-5320-buil...
Disclaimer: I run Altinity.