> Deletes accumulate in tombstone files over time. Eventually we'd want to coalesce 100 small tombstone files into one, and/or rewrite a data file once a row group has >50% of its rows deleted, resulting in further compaction.
The bigger problem for me is that tombstones that remove rows can make reads quite inefficient, because they reduce the usefulness of min-max and Bloom filter indexes. They can also hurt vectorized query execution if you have to apply delete predicates within row groups. Finally, there are degenerate cases where the tombstones end up bigger than the compressed columns themselves.
Any assertion that this would be performant needs to be backed up by code. ClickHouse took many years to implement so-called lightweight deletes. It's a hard problem to solve in a performant way.
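To make the points above concrete, here's a minimal sketch (names, data layout, and the 50% threshold are illustrative, not any real engine's API) of position-based tombstones: why min-max stats computed over all rows, including deleted ones, prune less effectively, plus the coalesce and rewrite-trigger steps mentioned in the quote.

```python
from dataclasses import dataclass, field

@dataclass
class RowGroup:
    values: list                                  # one column's values, for illustration
    tombstone: set = field(default_factory=set)   # deleted row positions in this group

    def deleted_fraction(self) -> float:
        return len(self.tombstone) / len(self.values)

    def stale_min_max(self):
        # Stats still cover deleted rows, so deleted outliers keep the
        # range wide and the row group can't be pruned as aggressively.
        return min(self.values), max(self.values)

    def live_min_max(self):
        # What the stats would be after a rewrite drops deleted rows.
        live = [v for i, v in enumerate(self.values) if i not in self.tombstone]
        return min(live), max(live)

def coalesce(tombstone_files):
    """Merge many small tombstone files into one (union of positions)."""
    merged = set()
    for t in tombstone_files:
        merged |= t
    return merged

def should_rewrite(rg: RowGroup, threshold: float = 0.5) -> bool:
    """Trigger a data-file rewrite once more than half the rows are deleted."""
    return rg.deleted_fraction() > threshold

rg = RowGroup(values=[1, 100, 2, 99, 3], tombstone={1, 3})  # 100 and 99 deleted
print(rg.stale_min_max())   # (1, 100): deleted outliers still widen the range
print(rg.live_min_max())    # (1, 3): rewriting restores tight stats
print(should_rewrite(rg))   # False: only 40% deleted
```

This also shows why the read path gets slower: every scan has to intersect row positions with the tombstone set inside each row group, which is exactly the per-row filtering that vectorized execution tries to avoid.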
deepsun•4mo ago
For some reason they thought a hard-positioned, top-to-bottom SVG was somehow better than adding "white-space: pre" once in the CSS ¯\_(ツ)_/¯