How large a dataset can it tackle? I work with Parquet files spanning 300 million+ records (~800 MB per file) using DuckDB, and queries finish within seconds.
I'd be interested to see benchmarks against Parquet and Vortex. A DuckDB extension would be great as well.
Another comment already mentioned a comparison to Vortex, which has the same compression ratio and the same speeds you're claiming, but your compression is half of Parquet's. And if speed is the main goal you're going for, Python is an interesting choice. No hate, but def keep working on it. Would love to see more concrete benchmarks against various columnar store types.
arunkore2026•48m ago
inheritedwisdom•19m ago