Distributed SQL engine for ultra-wide tables

2•synsqlbythesea•1h ago
I ran into a practical limitation while working on ML feature engineering and multi-omics data.

At some point, the problem stops being “how many rows” and becomes “how many columns”. Thousands, then tens of thousands, sometimes more.

What I observed in practice:

- Standard SQL databases usually cap out around ~1,000–1,600 columns.
- Columnar formats like Parquet can handle width, but typically require Spark or Python pipelines.
- OLAP engines are fast, but tend to assume relatively narrow schemas.
- Feature stores often work around this by exploding data into joins or multiple tables.

At extreme width, metadata handling, query planning, and even SQL parsing become bottlenecks.

I experimented with a different approach:

- no joins
- no transactions
- columns distributed instead of rows
- SELECT as the primary operation
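To make "columns distributed instead of rows" concrete, here is a minimal in-memory sketch in Python. It is not the engine's actual code: the `ColumnShardedTable` class, the CRC32-based placement, and the `select` signature are all assumptions made for illustration. The point it shows is that a select over a few columns only touches the nodes that own those columns, however wide the table is.

```python
from collections import defaultdict
import zlib


class ColumnShardedTable:
    """Toy table that places whole columns on nodes (hypothetical design)."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        # One dict per simulated node: column name -> list of values.
        self.nodes = [dict() for _ in range(num_nodes)]

    def _node_for(self, column):
        # Stable hash, so a column always maps to the same node (assumed scheme).
        return zlib.crc32(column.encode("utf-8")) % self.num_nodes

    def insert_column(self, column, values):
        # A whole column lands on exactly one node; no row is ever split up.
        self.nodes[self._node_for(column)][column] = list(values)

    def select(self, columns, limit=None):
        # Group the requested columns by owning node; other nodes are not touched.
        by_node = defaultdict(list)
        for col in columns:
            by_node[self._node_for(col)].append(col)
        fetched = {}
        for node_id, cols in by_node.items():
            for col in cols:
                fetched[col] = self.nodes[node_id][col][:limit]
        # Stitch the column slices back into rows, in the requested order.
        return list(zip(*(fetched[c] for c in columns)))


table = ColumnShardedTable(num_nodes=2)
for i in range(1000):
    table.insert_column(f"f{i}", [i * 10 + r for r in range(5)])

rows = table.select(["f3", "f500", "f999"], limit=2)
print(rows)  # [(30, 5000, 9990), (31, 5001, 9991)]
```

With no joins and no transactions, the only coordination in this sketch is the final stitch of column slices into rows, which is why the cost tracks the number of selected columns rather than the total width.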

With this design, it’s possible to run native SQL selects on tables with hundreds of thousands to millions of columns, with predictable (sub-second) latency when accessing a subset of columns.

On a small cluster (2 servers, AMD EPYC, 128 GB RAM each), rough numbers look like:

- creating a 1M-column table: ~6 minutes
- inserting a single column with 1M values: ~2 seconds
- selecting ~60 columns over ~5,000 rows: ~1 second
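For a sense of scale, those rough numbers can be turned into per-second rates. This is only back-of-the-envelope arithmetic on the approximate figures above, not a separate measurement; the variable names are mine.

```python
# Figures quoted in the post (all approximate).
create_columns, create_seconds = 1_000_000, 6 * 60
insert_values, insert_seconds = 1_000_000, 2
select_cells, select_seconds = 60 * 5_000, 1

columns_created_per_s = create_columns / create_seconds
values_inserted_per_s = insert_values / insert_seconds
cells_selected_per_s = select_cells / select_seconds

print(f"{columns_created_per_s:,.0f} columns/s created")   # 2,778 columns/s created
print(f"{values_inserted_per_s:,.0f} values/s inserted")   # 500,000 values/s inserted
print(f"{cells_selected_per_s:,.0f} cells/s selected")     # 300,000 cells/s selected
```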

I’m curious how others here approach ultra-wide datasets. Have you seen architectures that work cleanly at this width without resorting to heavy ETL or complex joins?

Nick Shirley Exposed Minnesotas Billion Dollar Fraud Scandal [video]

https://www.youtube.com/watch?v=zF2a3aTfA9w
1•zahlman•1m ago•1 comments

Rams Owner Stan Kroenke Becomes Largest Private Landowner in US with 2.7M Acres

https://www.nytimes.com/2026/01/13/realestate/stan-kroenke-largest-private-landowner.html
1•bookofjoe•1m ago•1 comments

(informed?) Opinion: why boys struggle in class

https://www.wsj.com/opinion/why-boys-struggle-in-class-girls-recess-math-5fdeb6ce
1•gsf_emergency_6•2m ago•0 comments

Modder Runs PC in a Chest Freezer

https://www.youtube.com/watch?v=P4W8f-703rI
1•gsf_emergency_6•3m ago•0 comments

Anthropic is making a huge mistake

https://geohot.github.io//blog/jekyll/update/2026/01/15/anthropic-huge-mistake.html
1•swah•4m ago•0 comments

Finding bugs across the Python ecosystem with Claude and property-based testing

https://red.anthropic.com/2026/property-based-testing/
1•mmaaz•5m ago•0 comments

Show HN: CockroachDB Daily

https://cockroachdb-daily.doanything.app
1•RobinBrooksAgt•5m ago•0 comments

Brag Doc

https://www.bragdoc.io/
1•stmoreau•5m ago•0 comments

Mike Pompeo says history books should ignore Gaza's victims – Middle East Eye

https://www.middleeasteye.net/trending/mike-pompeo-says-history-books-should-ignore-gazas-victims
2•abdelhousni•6m ago•1 comments

Soumith Chintala Becomes CTO of Thinking Machines

https://twitter.com/miramurati/status/2011577319295692801
1•amrrs•7m ago•0 comments

Show HN: KernDB – Managed Postgres Under EU Jurisdiction (Germany)

https://kerndb.com
1•michael_si•7m ago•0 comments

Build trams. But build them well

https://marcochitti.substack.com/p/build-trams-but-build-them-well
1•decimalenough•9m ago•0 comments

Web Based AI Generated ePub Reader

https://github.com/ovidiuiliescu/EpubWebReader
1•ovvyblabla•9m ago•0 comments

BBC 1 sound received in Texas, November 1981 [video]

https://www.youtube.com/watch?v=7vVqHUSNgYY
1•austinallegro•10m ago•0 comments

Clawdbot – personal AI assistant in WhatsApp, Telegram, Discord, Slack

https://github.com/clawdbot/clawdbot
1•tin7in•11m ago•0 comments

The Last Question

http://www.thelastquestion.net/
1•morpheos137•12m ago•0 comments

Trump says 'anything less' than US control of Greenland is 'unacceptable'

https://www.cnn.com/2026/01/14/politics/greenland-trump-nato-denmark
3•doener•12m ago•2 comments

Hegseth wants to integrate Musk's Grok AI into military networks this month

https://arstechnica.com/ai/2026/01/hegseth-wants-to-integrate-musks-grok-ai-into-military-network...
3•nothrowaways•12m ago•0 comments

Gnome Mutter 50 Alpha Released with X11 Back End Removed

https://www.phoronix.com/news/GNOME-Mutter-Shell-50-Alpha
2•tlmbl•12m ago•0 comments

Ask HN: Form History Control is great. Why doesn't Firefox integrate it?

1•Openai2•13m ago•0 comments

Dremel: Interactive Analysis of Web-Scale Datasets (2010)

https://research.google/pubs/dremel-interactive-analysis-of-web-scale-datasets-2/
1•tosh•14m ago•0 comments

The Missing Innovation

https://suriya.cc/essays/02_missing_innovation/
1•suriya-ganesh•14m ago•0 comments

Ask HN: How do you find a non-US only, remote first job?

1•hxii•18m ago•0 comments

Cc-search: a skill to search Claude Code sessions

https://www.definite.app/blog/claude-code-search-skill
2•mritchie712•19m ago•0 comments

Tesla Sales now compared to last year

https://twitter.com/Microinteracti1/status/2011400771343135048
3•doener•20m ago•0 comments

Various pizzerias nearby The Pentagon are reporting above average traffic

https://twitter.com/PenPizzaReport/status/2011551004098252896
2•doener•20m ago•2 comments

Learning Creative Coding

https://stigmollerhansen.dk/resume/learning-creative-coding/
2•bj-rn•22m ago•0 comments

Show HN: I built a semantic search engine for video ("Ctrl+F" for mp4s)

https://www.matriq.video/
1•Daviduche03•23m ago•0 comments

X restricts Grok image generation capabilities for all users

https://twitter.com/Safety/status/2011573102485127562
2•anigbrowl•26m ago•0 comments

Trump Is Risking a Global Catastrophe

https://www.theatlantic.com/ideas/2026/01/trump-greenland-risk-global-conflict/685616/
6•perihelions•26m ago•2 comments