frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
258•theblazehen•2d ago•86 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
26•AlexeyBrin•1h ago•3 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
706•klaussilveira•15h ago•206 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
969•xnx•21h ago•558 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
69•jesperordrup•6h ago•31 comments

Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2504.12501
7•onurkanbkrc•48m ago•0 comments

Making geo joins faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
135•matheusalmeida•2d ago•35 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
45•speckx•4d ago•36 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
68•videotopia•4d ago•7 comments

Welcome to the Room – A lesson in leadership by Satya Nadella

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
39•kaonwarb•3d ago•30 comments

Ga68, a GNU Algol 68 Compiler

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
13•matt_d•3d ago•2 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
45•helloplanets•4d ago•46 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
240•isitcontent•16h ago•26 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
238•dmpetrov•16h ago•127 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
340•vecti•18h ago•149 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
506•todsacerdoti•23h ago•248 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
389•ostacke•22h ago•98 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
304•eljojo•18h ago•188 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
361•aktau•22h ago•186 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
428•lstoll•22h ago•284 comments

Cross-Region MSK Replication: K2K vs. MirrorMaker2

https://medium.com/lensesio/cross-region-msk-replication-a-comprehensive-performance-comparison-o...
3•andmarios•4d ago•1 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
71•kmm•5d ago•10 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
24•bikenaga•3d ago•11 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
96•quibono•4d ago•22 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
26•1vuio0pswjnm7•2h ago•16 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
271•i5heu•18h ago•219 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
34•romes•4d ago•3 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1079•cdrnsf•1d ago•462 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
64•gfortaine•13h ago•30 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
306•surprisetalk•3d ago•44 comments
Open in hackernews

An MVCC-like columnar table on S3 with constant-time deletes

https://www.shayon.dev/post/2025/277/an-mvcc-like-columnar-table-on-s3-with-constant-time-deletes/
45•shayonj•4mo ago

Comments

simlevesque•3mo ago
I just want to be able to append metadata to a Parquet file at the end without rewriting the whole file. Tombstones could be baked in the parquet file this way.

It does work with "one more file" but it's not good for performance.

shayonj•3mo ago
Yeah. Or just sub out the data with null bytes. Something like that could be nice too.
simlevesque•3mo ago
Are you familiar with Parquet ? you can't do that at all, you need to rewrite the whole file.
shayonj•3mo ago
Yeah , I poorly phrased it - I meant in an ideal situation with the benefits of parquet like columnar file structure. I very much understand that it’s not possible on parquet today for the reasons you mentioned and others.
cpard•3mo ago
That’s the whole reason of existence of Iceberg, Delta and Hudi right?

Not as easy as just appending metadata to a parquet file but in the other hand, parquet was never and probably shouldn’t be designed with that functionality in mind.

simonw•3mo ago
This is a really clever design.

The cost estimates are particularly notable: if they're right that's a cost of about $3/day for 6TB/day of written data, 2TB/day of deletes and 50K read queries.

Storing all those TBs of data in S3 is where the real cost lies. I think it costs $5520 to store 8TB*30 = 240TB in S3, and if you retain all data your monthly cost goes up by $5520 every month.

xyzzy_plugh•3mo ago
I think the idea is that the deletes would eventually be compacted, so it's ultimately half as much, but I digress.

The cost isn't that bad all things considered. Hot, durable and available data ain't that cheap, especially in the cloud. Self-hosting is within an order of magnitude.

shayonj•3mo ago
I think ideally you could map retention of cold data to file objects itself and using key space naming strategy and lifecycle rules, expire the data that is not needed, thus saving on the storage costs (as much as possible hopefully)
shayonj•3mo ago
Here is another take on deletes through by just updating the row groups in Parquet file through multi part upload and UploadPartCopy - https://www.shayon.dev/post/2025/285/mutable-atomic-deletes-...