
Iceberg, the right idea – the wrong spec – Part 2 of 2: The spec

https://www.database-doctor.com/posts/iceberg-is-wrong-2.html
35•lsuresh•6mo ago

Comments

ozgrakkurt•6mo ago
Great analysis of what Iceberg does, but I don’t agree with so much of the criticism.

It is very basic compared to a database, and even when you go into details of databases there are many things that don’t make sense in terms of doing the absolute best thing.

You could criticize Parquet in a similar way if you go through the spec, but because it is open and so popular, people are going to use it no matter what.

If you need more performance, efficiency, simplicity, etc., just don’t use Parquet, but have a conversion between your format and Parquet.

Or you can build on top of parquet with external indices, keeping metadata in memory and having a separate WAL for consistency.
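To make the "external indices, metadata in memory" idea concrete, here is a minimal sketch. It assumes the index is a per-file min/max range on a single integer key; the `FileStats` type, the `files_to_scan` helper, and the file names are all hypothetical and not part of Parquet or any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """In-memory index entry for one data file (hypothetical layout)."""
    path: str
    min_key: int
    max_key: int


def files_to_scan(index: list[FileStats], lo: int, hi: int) -> list[str]:
    """Return only the files whose [min_key, max_key] range overlaps [lo, hi]."""
    return [s.path for s in index if s.max_key >= lo and s.min_key <= hi]


# Hypothetical index kept outside the data files themselves.
index = [
    FileStats("part-0.parquet", 0, 99),
    FileStats("part-1.parquet", 100, 199),
    FileStats("part-2.parquet", 200, 299),
]

# A query for keys 150..210 only has to open two of the three files.
print(files_to_scan(index, 150, 210))
```

The WAL for consistency is left out here; this only shows how pruning can happen without opening any data file.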

Similarly it should be possible to build on top of iceberg spec to create something like a db server that is efficient.

It is unlikely for something so usable for so many use cases to be the technically pure and most sensible option.

dkdcio•6mo ago
I think this criticism is missing the order of magnitude aspect -- I agree, people do not choose the most technically pure option. But one that launches on day 1 that can be used in SQL or Python with a few lines of code, across any cloud provider, and it basically "just works" is an order of magnitude or more simpler than using Iceberg, at least in my experience in Python. It's always been odd how every non-JVM client for Iceberg has supported reads, but never writes...

People don't choose tech on technical purity, but they often choose on simplicity & ease of use

lsuresh•6mo ago
Yeah that's been our biggest issue in this ecosystem (the non-JVM clients). They can't do writes and are often far behind on feature parity with the blessed JVM clients.

fifilura•6mo ago
I am currently considering whether it is worth moving our stack from Hive-type tables to Iceberg. Iceberg is obviously technically more competent, but the Hive tables are just so nice because the data is almost orthogonal to the tables.

You can throw away a table and recreate it in minutes and vice versa you can edit the data and the table will adapt.

I am so used to this and I am worried about losing this flexibility with Iceberg.

Maybe a mix is the way to go.

TFA is very well written by the way. From my perspective I see Iceberg as Hive tables 2.0. Solving a lot of the Hive related problems but not all generic database problems. So all new features are positive for me.

But my only gripe is - is the added complexity worth it?

chojeen•6mo ago
I really don't get a lot of this criticism. For example, who is using iceberg with hundreds of concurrent committers, especially at the scale mentioned in the article (10k rows per second)? Using iceberg or any table format over object storage would be insane in that case. But for your typical spark application, you have one main writer (the spark driver) appending or merging a large number of records in > 1 minute microbatches and maybe a handful of maintenance jobs for compaction and retention; Iceberg's concurrency system works fine there.

If you have a use case like the one the author describes, maybe use an in-memory cloud database with tiered storage or a plain RDBMS. Iceberg (and similar formats) work great for the use cases for which they're designed.

RhysU•6mo ago
> But for your typical spark application, you have one main writer (the spark driver) appending or merging a large number of records...

The multi-writer architecture can't be proven scalable because a single writer doesn't cause it to fall over.

I have caused issues by using 500 concurrent writers on embarrassingly parallel workloads. I have watched people choose sharding schemes to accommodate Iceberg's metadata throughput NOT the natural/logical sharding of the underlying data.
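A toy model shows why many concurrent committers hurt under optimistic concurrency. This is a sketch under strong assumptions, not Iceberg's actual commit code: every writer tries to swap one shared version pointer, all pending writers read the same base version in lockstep, and only one swap can win per round. The function name `simulate_commits` is made up for illustration.

```python
def simulate_commits(num_writers: int) -> dict:
    """Round-based model of optimistic commits against one version pointer.

    Worst-case assumption: all pending writers start each round from the
    same base version, so exactly one compare-and-swap succeeds per round
    and every other writer has to retry.
    """
    version = 0
    pending = list(range(num_writers))
    attempts = 0
    rounds = 0
    while pending:
        rounds += 1
        base = version  # snapshot that every pending writer read
        still_pending = []
        for writer in pending:
            attempts += 1
            if version == base:
                version += 1  # swap succeeds for the first writer only
            else:
                still_pending.append(writer)  # conflict: retry next round
        pending = still_pending
    return {"version": version, "attempts": attempts, "rounds": rounds}


for n in (1, 10, 500):
    print(n, simulate_commits(n))
```

In this model the number of commit attempts grows as N(N+1)/2, so a single writer needs 1 attempt while 500 writers need 125250 between them. Real systems add backoff and the retries are not perfectly synchronized, so treat the numbers as an upper-bound illustration only.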

Last I half-knew (so check me), Spark may have done some funky stuff to work around the Iceberg shortcomings. That is useless if you're not using Spark. If scalability of the architecture requires a funky client in one language and a cooperative backend, we might as well be sticking HDF5 on Lustre. HDF5 on Lustre never fell over for me in the 1000+ embarrassingly parallel concurrent writer use case (massive HPC turbulence restart files with 32K concurrent writers per https://ieeexplore.ieee.org/abstract/document/6799149 )

bdangubic•6mo ago
if you use a tool for the use cases it's designed for, how are you gonna come up with a blog to bitch about it? :)

teleforce•6mo ago
>who is using iceberg with hundreds of concurrent committers, especially at the scale mentioned in the article (10k rows per second)? Using iceberg or any table format over object storage would be insane in that case

You could achieve 100M database inserts per second with D4M and Accumulo more than a decade ago, back in 2014 [1], and object storage is not necessary for that exercise.

Someone needs to come up with lakehouse systems based on D4M, it's long overdue.

D4M is also based on sound mathematics not unlike the venerable SQL [2].

[1] Achieving 100M database inserts per second using Apache Accumulo and D4M (2017 - 46 comments):

https://news.ycombinator.com/item?id=13465141

[2] Mathematics of Big Data: Spreadsheets, Databases, Matrices, and Graphs:

https://mitpress.mit.edu/9780262038393/mathematics-of-big-da...