Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
255•theblazehen•2d ago•85 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
26•AlexeyBrin•1h ago•2 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
706•klaussilveira•15h ago•206 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
969•xnx•21h ago•558 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
69•jesperordrup•6h ago•31 comments

Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2504.12501
7•onurkanbkrc•47m ago•0 comments

Making geo joins faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
135•matheusalmeida•2d ago•35 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
45•speckx•4d ago•36 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
68•videotopia•4d ago•7 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
39•kaonwarb•3d ago•30 comments

ga68, the GNU Algol 68 Compiler – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
13•matt_d•3d ago•2 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
45•helloplanets•4d ago•46 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
240•isitcontent•16h ago•26 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
237•dmpetrov•16h ago•126 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
340•vecti•18h ago•149 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
506•todsacerdoti•23h ago•247 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
389•ostacke•22h ago•98 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
303•eljojo•18h ago•188 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
361•aktau•22h ago•186 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
428•lstoll•22h ago•284 comments

Cross-Region MSK Replication: K2K vs. MirrorMaker2

https://medium.com/lensesio/cross-region-msk-replication-a-comprehensive-performance-comparison-o...
3•andmarios•4d ago•1 comment

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
71•kmm•5d ago•10 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
23•bikenaga•3d ago•11 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
96•quibono•4d ago•22 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
26•1vuio0pswjnm7•2h ago•16 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
271•i5heu•18h ago•219 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
34•romes•4d ago•3 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1079•cdrnsf•1d ago•461 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
64•gfortaine•13h ago•30 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
306•surprisetalk•3d ago•44 comments

Thundering herd problem: Preventing the stampede

https://distributed-computing-musings.com/2025/08/thundering-herd-problem-preventing-the-stampede/
49•pbardea•4mo ago

Comments

blakepelton•4mo ago
Some recent academic work suggests implementing caches directly in network switches. Tofino switches are programmable enough that academics can implement this today.

OrbitCache is one example, described in this paper: https://www.usenix.org/system/files/nsdi25-kim.pdf

It should solve the thundering herd problem, because the switch would "know" what outstanding cache misses it has pending, and the switch would park subsequent requests for the same key in switch memory until the reply comes back from the backend server. This has an advantage compared to a multi-threaded CPU-based cache, because it avoids performance overheads associated with multiple threads having to synchronize with each other to realize they are about to start a stampede.

A summary of OrbitCache will be published to my blog tomorrow. Here is a "draft link": https://danglingpointers.substack.com/p/4967f39c-7d6b-4486-a...

alfons_foobar•4mo ago
This sounds intriguing and terrible at the same time :D
fidotron•4mo ago
That's not completely unheard of. Cavium used to sell chips intended to go in switches and routers which had a sort of programmable hash function that could be applied to incoming packets, and then use the result of that hash to route the packets.

These days you'd have to assume someone somewhere has a neural net based router.

vlovich123•4mo ago
> This has an advantage compared to a multi-threaded CPU-based cache, because it avoids performance overheads associated with multiple threads having to synchronize with each other to realize they are about to start a stampede.

The switch presumably also has multiple cores which still need to do this work, no? Or is the claim that moving this synchronization to the router behind a network hop saves CPU cycles on the app server?

blakepelton•4mo ago
The relevant part of the switch hardware described by the OrbitCache paper doesn't have typical processor cores. Instead it is an "RMT" pipeline.

I wrote a brief description of RMT here: https://danglingpointers.substack.com/p/scaling-ip-lookup-to...

fidotron•4mo ago
This reads like LLM noise, with headings missing articles.

It also doesn't mention the most obvious solution to this problem: adding a random factor to retry timing during backoff, since a major cause is every client coming back at the precise instant a service becomes available again, only to knock it offline.
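A minimal Java sketch of that fix, capped exponential backoff with full jitter (class name and constants are illustrative, not from the article): each retry waits a uniformly random slice of the exponential window, so recovering clients spread out instead of returning in lockstep.

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

public class Backoff {
    // Full jitter: a random delay in [0, min(cap, base * 2^attempt)].
    public static long delayWithJitter(int attempt, long baseMillis,
                                       long capMillis, Random rng) {
        long window = Math.min(capMillis, baseMillis * (1L << Math.min(attempt, 30)));
        return (long) (rng.nextDouble() * window);
    }

    public static void main(String[] args) throws InterruptedException {
        Random rng = new Random();
        for (int attempt = 0; attempt < 5; attempt++) {
            long delay = delayWithJitter(attempt, 50, 5_000, rng);
            System.out.println("attempt " + attempt + ": sleeping " + delay + " ms");
            TimeUnit.MILLISECONDS.sleep(delay);
            // ...retry the request here; break on success...
        }
    }
}
```

Without the jitter (i.e. a fixed `base * 2^attempt` delay), all clients that failed together retry together, which just reschedules the stampede.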

glhaynes•4mo ago
I don't associate missing articles with LLMs; the people I've known who dropped them often were always ones for whom English was a second language.
raffraffraff•4mo ago
If anything I'd say that "excellent" formatting with fancy headings is the hallmark of an LLM
vlovich123•4mo ago
That works when you control the client, but not in a case like this, fetching state from a DB. For that you do in fact need some kind of locking mechanism, though it's better if it's implemented within the cache as a pass-through cache that parks all but one of the concurrent requests when an outbound fetch is needed, rather than letting them all hit the database directly. That's how Cloudflare's CDN works to minimize requests to the origin.
sriram_malhar•4mo ago
This particular example of thundering herd isn't convincing. First, the database has a cache too, and the first query would end up benefiting the other queries for the same key. The only extra overhead is of the network, which is something a distributed lock would also have.

I would think that in the rare instance of multiple concurrent requests for the same key where none of the caches have it cached, it might just be worth it to take the slightly increased hit (if any) of going to the db instead of complicating it further and slowing down everyone else with the same mechanism.

to11mtm•4mo ago
> The only extra overhead is of the network, which is something a distributed lock would also have.

Well, there's also the 'overhead' of connection pooling. I put it that way because I've definitely run into the case of a 'hot' key (i.e. imagine hundreds of users that all need to download the same set of data because they are in the same group). Next thing you know your connection pool is getting saturated with these requests.

To your point however, I've also had cases where frankly querying the database is always fast enough (i.e. simple lookup on a table small enough that the DB engine practically always has it in memory anyway) so a cache would just be wasted dev time.

jayd16•4mo ago
Yeah, honestly a read replica is usually a lot less bug prone than a custom rolled cache if you just need traffic off main instance.
chmod775•4mo ago
This is just how you should implement any (clientside) cache in a concurrent situation. It's the obvious and correct way. I expect you'll find this pattern implemented with promises in thousands of javascript/typescript codebases.

This query will probably find loads already: https://github.com/search?q=language%3Atypescript+%22new+Map...
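For reference, a minimal Java analog of that promise-map pattern, using the article's `CompletableFuture` in place of a JS promise (class and loader names are illustrative): the map stores the in-flight future itself, so later callers for the same key await the first load instead of stampeding the backend.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class CoalescingCache<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // the expensive backend call

    public CoalescingCache(Function<K, V> loader) { this.loader = loader; }

    public CompletableFuture<V> get(K key) {
        // computeIfAbsent makes the insert atomic: only one loader runs per key.
        CompletableFuture<V> f = cache.computeIfAbsent(key, k ->
            CompletableFuture.supplyAsync(() -> loader.apply(k)));
        // On failure, drop the entry (conditionally) so a later call can retry;
        // errors are not cached forever.
        f.whenComplete((v, err) -> { if (err != null) cache.remove(key, f); });
        return f;
    }
}
```

One design note: removing the entry only on failure means successful values stay cached indefinitely; a real cache would also need expiry or invalidation, which is out of scope for the sketch.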

Ciantic•4mo ago
I've stumbled on this twice now. Usually you can just use CDN caching, but I once solved it with Redis locks, and once by simply filling the cache periodically in the background.

If you can, it's easier to have every client fetch from the cache, and then have a cron job, e.g. every second, refresh the cache.

In CDNs, the feature that prevents this is called "Collapse Forwarding".
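A minimal sketch of the background-refresh approach the comment describes (class name and refresh period are illustrative): readers never trigger a load, so there is no cache miss to stampede on, at the cost of serving data up to one interval stale.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class RefreshAheadCache<V> {
    private final AtomicReference<V> value = new AtomicReference<>();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    public RefreshAheadCache(Supplier<V> loader, long periodMillis) {
        value.set(loader.get()); // warm the cache before any reader arrives
        // The "cron job": one background task refreshes the value on a fixed schedule.
        scheduler.scheduleAtFixedRate(() -> value.set(loader.get()),
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    public V get() { return value.get(); }      // always a cache hit
    public void close() { scheduler.shutdownNow(); }
}
```

This trades freshness for immunity to the stampede: however many readers arrive at once, the backend sees exactly one query per period.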

larkost•4mo ago
Some years back, at a previous employer, I had a related thundering herd problem: I was running an automated testing lab, and if a new job came in after a period of idleness, we would have 100+ computers all downloading 3 (or more) multi-gigabyte files at the same time (software-under-test, symbols files, and compiled tests).

To make matters worse, due to the budget for this lab, we had just three servers that the testing computers could download from. In the worst case the horrible snarl-up would cause computers to wait for as much as two hours before they got the materials needed to run the tests.

My solution was to use peer-to-peer BitTorrent (no trackers involved) with HTTP seeding. The .torrent files had no trackers listed but named the three servers as HTTP seeds, and the clients were all started with local peer discovery. So the first couple of computers to get the job would pull most/all of the file contents from our servers, and then the rest of the computers would wind up getting the file chunks mostly from their peers.

I did need to do some work so that the clients would first try a URL on the servers that would check for the .torrent file, and if it did not exist, build it (sending the clients a 503 code, causing them to wait a minute or two before retrying).

There are lots of things I would do differently if I rebuilt the system (write my own peer-to-peer code), but the result was that we rarely had systems waiting more than a few minutes to get full files. It took the thundering herd and made it its own solution.

achalshah•4mo ago
Uber built Kraken to solve the same problem with distributing images: https://github.com/uber/kraken
raffraffraff•4mo ago
Cool. That reminds me of an approach I took back in 2011 when implementing a Linux build/update system in a (small) bank. 8000 machines across hundreds of branches, no servers in the branches, no internet access, limited bandwidth. The goal was to wake one machine (WOL) which detects an update (via LDAP attribute) and then rsyncs the repo update + torrent file. Once complete, that machine would load the torrent, verify the synced files, update its version in LDAP and wake all of its peers. Each peer host would also query LDAP and detect the need to update, but also notice a peer with the latest version, so it would skip the repo rsync and just grab and load the torrent file. So a branch with hundreds of hosts would torrent the repo update pretty quickly to each other. Pretty cool tbh, you could PXE boot and rebuild a bunch of hosts remotely, and once built, any one of them could act as an installation point. I even used this to do a distribution change, switching from SLES to Ubuntu.
ciupicri•4mo ago

    if (Boolean.TRUE.equals(lockAcquired))
WTF?!
jedberg•4mo ago
Exponential backoff is the usual solution to thundering herd problems. The solution of in-app coordination could certainly help, by making sure each app only requests the data once instead of each thread, but at the end of the day, you still need exponential backoff.
ecoffey•4mo ago
Interesting! Reading the headline before the article, my brain immediately thought of "jitter".

I wonder if you could extend the `In-process synchronization` example so that the `CompletableFuture.supplyAsync()` thunk first does a random sleep (where the sleep time is bounded by an informed value based on the expensive query's execution time), then checks the cache again, and only proceeds with the rest of the example code if the cache is still empty.

That way you (stochastically) get some of the benefits of distributed locking w/o actually having to do distributed locking.

Of course that only works if you are ok adding in a bit of extra latency (which should be ok; you're already on the non-hot path), and that there still may be more than 1 query issued to fill the cache.
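A minimal Java sketch of that stochastic variant (the names `JitteredLoader` and `expensiveQuery` and the shared map are illustrative, not from the article): each caller sleeps a random slice of the expected query time, re-checks the shared cache, and only queries the backend if nobody else has filled it in the meantime.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

public class JitteredLoader {
    // Stand-in for the shared cache (e.g. Redis) in the article's example.
    static final ConcurrentHashMap<String, String> sharedCache = new ConcurrentHashMap<>();

    static CompletableFuture<String> load(String key, long maxSleepMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                // Random sleep bounded by the expected expensive-query time.
                Thread.sleep(ThreadLocalRandom.current().nextLong(maxSleepMillis + 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            String hit = sharedCache.get(key);   // someone else may have filled it
            if (hit != null) return hit;
            String fresh = expensiveQuery(key);  // hypothetical backend call
            sharedCache.put(key, fresh);
            return fresh;
        });
    }

    static String expensiveQuery(String key) { return "value-for-" + key; }
}
```

As the comment notes, this only thins the herd stochastically: more than one caller can still lose the race and query the backend, but with no distributed lock to operate.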