Distributed SQL engine for ultra-wide tables

2•synsqlbythesea•4h ago
I ran into a practical limitation while working on ML feature engineering and multi-omics data.

At some point, the problem stops being “how many rows” and becomes “how many columns”. Thousands, then tens of thousands, sometimes more.

What I observed in practice:

- Standard SQL databases usually cap out around ~1,000–1,600 columns.
- Columnar formats like Parquet can handle width, but typically require Spark or Python pipelines.
- OLAP engines are fast, but tend to assume relatively narrow schemas.
- Feature stores often work around this by exploding data into joins or multiple tables.

At extreme width, metadata handling, query planning, and even SQL parsing become bottlenecks.

I experimented with a different approach:

- no joins
- no transactions
- columns distributed instead of rows
- SELECT as the primary operation

With this design, it’s possible to run native SQL selects on tables with hundreds of thousands to millions of columns, with predictable (sub-second) latency when accessing a subset of columns.
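The post does not describe the actual implementation, so the following is only a rough sketch of what "columns distributed instead of rows" could look like. The class name `ColumnShardedTable`, the hash-based placement, and the in-memory dictionaries standing in for servers are all assumptions made for illustration, not the author's design.

```python
import hashlib
from collections import defaultdict


class ColumnShardedTable:
    """Toy model: each column lives whole on one node (hypothetical layout)."""

    def __init__(self, num_nodes):
        # Each "node" is a dict mapping column name -> list of values.
        # In a real cluster these would be separate servers.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, column):
        # Stable hash of the column name picks the owning node (assumption).
        digest = hashlib.sha1(column.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def insert_column(self, name, values):
        # Adding a column touches exactly one node, whatever the table width.
        self.nodes[self._node_for(name)][name] = list(values)

    def select(self, columns, start=0, stop=None):
        # Group the requested columns by owning node so that each node is
        # visited once; columns that were not requested are never read.
        by_node = defaultdict(list)
        for col in columns:
            by_node[self._node_for(col)].append(col)

        fetched = {}
        for node_id, cols in by_node.items():
            store = self.nodes[node_id]
            for col in cols:
                fetched[col] = store[col][start:stop]

        # Stitch the per-column slices back into rows, in the requested order.
        n_rows = len(fetched[columns[0]]) if columns else 0
        return [tuple(fetched[col][i] for col in columns) for i in range(n_rows)]


if __name__ == "__main__":
    table = ColumnShardedTable(num_nodes=2)
    table.insert_column("a", [1, 2, 3])
    table.insert_column("b", [10, 20, 30])
    print(table.select(["b", "a"]))
```

In this sketch the cost of a SELECT depends on how many columns are requested and how many rows are sliced, not on the total number of columns in the table, which is the property the post is claiming.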

On a small cluster (2 servers, AMD EPYC, 128 GB RAM each), rough numbers look like:

- creating a 1M-column table: ~6 minutes
- inserting a single column with 1M values: ~2 seconds
- selecting ~60 columns over ~5,000 rows: ~1 second

I’m curious how others here approach ultra-wide datasets. Have you seen architectures that work cleanly at this width without resorting to heavy ETL or complex joins?

Comments

icsa•2h ago
> With this design, it’s possible to run native SQL selects on tables with hundreds of thousands to millions of columns, with predictable (sub-second) latency when accessing a subset of columns.

What is the design?
