frontpage.

Start all of your commands with a comma

https://rhodesmill.org/brandon/2009/commands-with-comma/
197•theblazehen•2d ago•59 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
679•klaussilveira•14h ago•204 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
956•xnx•20h ago•553 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
126•matheusalmeida•2d ago•34 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
25•kaonwarb•3d ago•21 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
62•videotopia•4d ago•3 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
235•isitcontent•15h ago•25 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
41•jesperordrup•5h ago•20 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
228•dmpetrov•15h ago•122 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
332•vecti•17h ago•145 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
499•todsacerdoti•22h ago•243 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
384•ostacke•21h ago•96 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
360•aktau•21h ago•183 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
293•eljojo•17h ago•182 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
23•speckx•3d ago•11 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
414•lstoll•21h ago•280 comments

ga68, the GNU Algol 68 Compiler – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
6•matt_d•3d ago•1 comment

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
20•bikenaga•3d ago•10 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
66•kmm•5d ago•10 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
93•quibono•4d ago•22 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
260•i5heu•17h ago•203 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
33•romes•4d ago•3 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
10•1vuio0pswjnm7•1h ago•0 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
38•gmays•10h ago•13 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
61•gfortaine•12h ago•26 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1073•cdrnsf•1d ago•459 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
151•vmatsiiako•20h ago•72 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
292•surprisetalk•3d ago•43 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
156•SerCe•11h ago•144 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
187•limoce•3d ago•102 comments

Cloudflare incident on August 21, 2025

https://blog.cloudflare.com/cloudflare-incident-on-august-21-2025/
209•achalshah•5mo ago

Comments

iqfareez•5mo ago
Wild that one tenant’s cache-hit traffic could tip over Cloudflare’s interconnect capacity
immibis•5mo ago
You'd be surprised how low the capacity of a lot of internet links is. 10Gbps is common on smaller networks - let me rephrase that, a small to medium ISP might only have 10Gbps to each of most of their peering partners. Normally, traffic is distributed, going to different places, coming from different places, and each link is partially utilized. But unusual patterns can fill up one specific link.

10Gbps is old technology now and any real ISP can probably afford 40 or 100 - for hundreds of dollars per link. But they're going to deploy that on their most utilized links first, and only if their peering partner can also afford it and exchanges enough traffic to justify it. So the smallest connections are typically going to be 10. (Lower than 10 is too small to justify a point-to-point peering at all).

If you have 10Gbps fiber at home, you could congest one of these links all by yourself.

Now this is Cloudflare talking to AWS us-east-1, so they should have shitloads of capacity there, probably at least 8x100 or more. But considering that AWS is the kind of environment where you can spin up 800 servers for a few hours to perform a massively parallel task, it's not surprising that someone eventually created 800Gbps of traffic to the same place, or however much they have. Actually it's surprising it doesn't happen more often. Perhaps that's because AWS charges an arm and a leg for data transfer - 800Gbps of egress is $5-$9 per second.
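
A back-of-the-envelope check of that last figure, assuming the usual $0.05-$0.09/GB data-transfer-out pricing (actual tiers and discounts vary):

    #include <stdio.h>

    int main(void)
    {
        double gbps = 800.0;            /* sustained traffic, gigabits per second */
        double gb_per_sec = gbps / 8.0; /* = 100 gigabytes transferred every second */
        printf("low:  $%.2f/s\n", gb_per_sec * 0.05);
        printf("high: $%.2f/s\n", gb_per_sec * 0.09);
        return 0;
    }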

aianus•5mo ago
Downloading cached data from Cloudflare to AWS is free to the person doing the downloading if they use an Internet gateway
transitionnel•5mo ago
Future-proofing against inevitable things should be something we talk about more.

For instance, people will be scraping at a "growing" rate as they figure out how everything AI works. We might as well figure out some standard seeded data packages for training that ~all sources/sectors agree to make available as public torrents to reduce this type of problem.

[I realize this ask is currently idealistic, but it's an anchor point to negotiate from.]

tucnak•5mo ago
Hot take: 40Gbps is not a real rate; it's just four 10Gbps links stacked on top of one another in a trenchcoat!
ZWoz•5mo ago
That's true for several other speeds too. The first generation of 100GbE was 10x10GbE, the second generation was 4x25GbE. The first version of 200GbE was 25GbE-based, and so on.
themafia•5mo ago
That's what started the incident.

It was prolonged by the fact that Cloudflare didn't react correctly to withdrawn BGP routes to a major peer, that the secondary routes had reduced capacity due to unaddressed problems, and that basic nuisance rate limiting had to be done manually.

It seems like they just build huge peering pipes and basically just hope for the best. They've maybe gotten so used to this working that they'll let degraded "secondary" links persist for much longer than they should. It's the typical "Swiss Cheese" style of failure.

vlovich123•5mo ago
Wasn’t the problem exacerbated precisely by withdrawing a BGP link because all the same traffic is then forced over a smaller number of physical links?
miyuru•5mo ago
AWS us-east-1 is now taking down other providers.
inemesitaffia•5mo ago
Didn't even notice
yaboi3•5mo ago
Anyone want to tell Cloudflare that BGP advertisements at AWS are automated, and that their congested network directly caused the BGP withdrawals, as the automated system detected congestion and decreased traffic to remediate it?
grumple•5mo ago
It wouldn't surprise me if the BGP routes in the DCI PNI were manually configured, since this is probably one of the most direct and important connections. I would be surprised if Cloudflare didn't have firsthand knowledge of what happened with AWS during this incident.

I think the withdrawal approach by AWS would normally work, as this action should desaturate the connections. Just really unfortunate that this caused routing through a link that was at half capacity.

__float•5mo ago
The way I read the blog post, it seems they're very aware of that.

I imagine Cloudflare and AWS were on a Chime bridge while this all went down; they both have a lot at stake here.

erulabs•5mo ago
It’s gonna turn out it was one guy calling “pnpm install” on one fast machine with a 100Gbps uplink.
cluckindan•5mo ago
Can we stop with the 2015 jokes already?
chatmasta•5mo ago
I’ve actually had an npm install that failed on my ISP but succeeded with Cloudflare VPN and the OP comment was more or less the explanation.
BoorishBears•5mo ago
In 2015 it would have been "npm install"

(Thanks Rauch.)

__turbobrew__•5mo ago
> This system will allot network resources on a per-customer basis, creating a budget that, once exceeded, will prevent a customer's traffic from degrading the service for anyone else on the platform

How would this work practically? If a single client is overflowing the edge router queues, you are kind of screwed already? Even if you dropped all packets from that client, you would still need to process the packets to figure out which client they belong to before dropping them?

I guess you could do some shuffle sharding where a single client belongs to a few IP prefixes, and when that client misbehaves you withdraw those prefixes using BGP to essentially black hole the network routes for that client. If the shuffle sharding is done right, only the problem client will have issues, since other clients on the same prefixes are also sharded onto other prefixes that remain reachable.
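
A tiny sketch of that shuffle-sharding assignment, with a made-up pool of 16 prefixes and shards of size 2 (how clients actually fail over between the prefixes in their shard, e.g. via multiple DNS answers, is left out):

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_PREFIXES 16 /* hypothetical pool of announced prefixes */
    #define SHARD_SIZE    2 /* each customer is reachable via two of them */

    /* FNV-1a hash, just to derive a deterministic shard per customer. */
    static uint64_t fnv1a(const char *s)
    {
        uint64_t h = 1469598103934665603ULL;
        while (*s) {
            h ^= (uint8_t)*s++;
            h *= 1099511628211ULL;
        }
        return h;
    }

    /* Pick SHARD_SIZE distinct prefixes for a customer. Withdrawing exactly
       these prefixes black-holes the misbehaving customer, while a customer
       sharing one of them still has the other prefix in its shard. */
    static void customer_shard(const char *customer, int out[SHARD_SIZE])
    {
        uint64_t h = fnv1a(customer);
        out[0] = (int)(h % NUM_PREFIXES);
        out[1] = (int)((out[0] + 1 + (h / NUM_PREFIXES) % (NUM_PREFIXES - 1)) % NUM_PREFIXES);
    }

    int main(void)
    {
        const char *customers[] = { "tenant-a", "tenant-b", "tenant-c" };
        for (int i = 0; i < 3; i++) {
            int shard[SHARD_SIZE];
            customer_shard(customers[i], shard);
            printf("%s -> prefix %d and prefix %d\n", customers[i], shard[0], shard[1]);
        }
        return 0;
    }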

jeffbee•5mo ago
Perhaps they drop the client's flows on the host side.
__turbobrew__•5mo ago
I don’t understand? The issue is that a client/customer outside of Cloudflare's control DoSed one of their network links. Cloudflare has no control on the client side to implement rate limiting?
fusl•5mo ago
I think you misunderstand the flow of traffic here. The data flow, initiated by requests coming from AWS us-east-1, was Cloudflare towards AWS, not the other way around. Cloudflare can easily control where and how their egress traffic gets to the destination (as long as there are multiple paths towards the target) as well as rate limit that traffic to sane levels.
__turbobrew__•5mo ago
Ah, I see now. Yes, in that case they could just reply with 429 codes or not reply at all.
everfrustrated•5mo ago
I think you're overthinking this. Just having a rate limit per Cloudflare customer would go a long, long way.
milofeynman•5mo ago
It's load shedding, but weighted towards people abusing their quota, usually over some rolling weighted average. The benefit is that they're dropped immediately at the edge rather than holding sockets open or using compute/resources. It usually takes 30s-1m to kick in.
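
A toy version of such a per-customer budget - a plain fixed-rate token bucket rather than the rolling weighted average described above, with made-up numbers:

    #include <stdbool.h>
    #include <stdio.h>
    #include <time.h>

    /* Toy per-customer budget: a token bucket refilled at the customer's
       allotted egress rate. Traffic over budget is shed at the edge instead
       of holding sockets open or burning compute further down the stack. */
    struct budget {
        double tokens;      /* bytes currently available */
        double rate;        /* bytes per second allotted to this customer */
        double burst;       /* maximum bucket depth */
        double last_refill; /* monotonic time of the last refill, in seconds */
    };

    static double now_sec(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    /* Returns true if a response of `bytes` may be sent, false if it should
       be shed (answered with a 429, or simply not answered at all). */
    static bool budget_allow(struct budget *b, double bytes)
    {
        double t = now_sec();
        b->tokens += (t - b->last_refill) * b->rate;
        if (b->tokens > b->burst)
            b->tokens = b->burst;
        b->last_refill = t;

        if (b->tokens < bytes)
            return false;
        b->tokens -= bytes;
        return true;
    }

    int main(void)
    {
        /* 1 GB/s allotment with a 2 GB burst; purely illustrative numbers. */
        struct budget noisy = { 2e9, 1e9, 2e9, now_sec() };
        for (int i = 0; i < 5; i++)
            printf("1 GB response #%d: %s\n", i + 1,
                   budget_allow(&noisy, 1e9) ? "sent" : "shed");
        return 0;
    }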
Thorrez•5mo ago
In this specific case, it wasn't requests from the client that caused overload. It was the responses to those requests. So Cloudflare can avoid sending responses, and prevent the problem.

You're right that this doesn't solve all cases, but it would have prevented this case.

jcalvinowens•5mo ago
> Even if you dropped all packets from that client you would need to still process the packets to figure out what client they belong to before dropping the packets?

In modern Linux you can write BPF-XDP programs to drop traffic at the lowest level in the driver before any computation is spent on them at all. Nearly the first thing the driver does after getting new packets in the rx ring buffer is run your program on them.
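
A minimal sketch of that kind of XDP program - it drops IPv4 packets whose source address appears in a map populated from userspace, before the rest of the stack spends any cycles on them (the map name and layout here are made up for illustration, not anyone's production tooling):

    #include <linux/bpf.h>
    #include <linux/if_ether.h>
    #include <linux/ip.h>
    #include <bpf/bpf_endian.h>
    #include <bpf/bpf_helpers.h>

    /* Source addresses to drop; filled in from userspace via libbpf or bpftool. */
    struct {
        __uint(type, BPF_MAP_TYPE_HASH);
        __uint(max_entries, 1024);
        __type(key, __u32);  /* IPv4 source address */
        __type(value, __u8); /* presence means "drop" */
    } blocked_srcs SEC(".maps");

    SEC("xdp")
    int drop_blocked(struct xdp_md *ctx)
    {
        void *data = (void *)(long)ctx->data;
        void *data_end = (void *)(long)ctx->data_end;

        struct ethhdr *eth = data;
        if ((void *)(eth + 1) > data_end)
            return XDP_PASS;
        if (eth->h_proto != bpf_htons(ETH_P_IP))
            return XDP_PASS;

        struct iphdr *ip = (void *)(eth + 1);
        if ((void *)(ip + 1) > data_end)
            return XDP_PASS;

        __u32 saddr = ip->saddr;
        if (bpf_map_lookup_elem(&blocked_srcs, &saddr))
            return XDP_DROP; /* dropped straight out of the rx ring */

        return XDP_PASS;
    }

    char LICENSE[] SEC("license") = "GPL";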

__turbobrew__•5mo ago
Say you have a BPF-XDP program which processes the packet to figure out what client the packet is coming from and selectively drops those packets. Is that really going to be faster than just forwarding the packet from the edge router to the next hop? I find it hard to believe that running such a program would actually alleviate full queues when all the edge router is doing is just forwarding to the next hop?
jcalvinowens•5mo ago
Where is the queueing happening? Maybe I misunderstood.

I assumed you meant the hosts are queueing in the kernel because their userspace consumers can't keep up. In that case, XDP can help, because it can drop things out of the rx ring buffers before the network stack and later userspace spend cpu cycles processing them.

If you meant the router is queueing because it's receiving more traffic than the sum of its downstream link bandwidth, like a raw static spam-flood DDoS, I don't think the hosts can do anything about that.

senderista•5mo ago
There was definitely a recurring pattern at AWS where a single customer would trigger latent bugs/undercapacity resulting in outages. Postmortems would often recommend developing per-customer observability and mitigation.
md224•5mo ago
I'm having trouble understanding the second diagram in the article. I can make sense of a directed graph, but this one has thin horizontal lines with arrows leaving them in both directions. These lines look like dividers, not nodes, so I'm not sure how to interpret it.
dontdoxxme•5mo ago
I think the intention is to show the divide between Amazon's and Cloudflare's responsibility, over the piece of fibre linking their network devices together. It would have been clearer to continue the lines and just put a dotted divider between them I feel.
pm90•5mo ago
The only real long-term mitigation is to move to another AWS region; us-east-1 seems to suffer from all kinds of scaling challenges.
bastawhiz•5mo ago
There's nothing to suggest the link between Cloudflare and any other AWS region has more capacity or that there aren't more disruptive Cloudflare customers using those regions.
o11c•5mo ago
But there is absolutely something to suggest "if you only support one region for some tasks, you're going to have problems that other people don't have."
wavemode•5mo ago
yeah but us-east-1 is cursed
Hilift•5mo ago
> The incident was a result of a surge of traffic from a single customer that overloaded Cloudflare's links with AWS us-east-1. It was a network congestion event, not an attack or a BGP hijack.

And no one knew a single thing about it until the incident. That is the current state of the art in network management: let Cloudflare deal with it.

wferrell•5mo ago
I wonder which customer triggered this…
rdl•5mo ago
Also curious if it was legit, misconfigured, or attack traffic.
AtNightWeCode•5mo ago
Braze is my guess. They let customers do a lot of stuff with pushing and pulling data per user and I would guess every customer is in a sandbox. They were also impacted by the incident.
dpoloncsak•5mo ago
It sounds like anyone relying on Cloudflare and AWS us-east-1 was impacted. Not sure it's quite the smoking gun you're implying.
AtNightWeCode•5mo ago
It was the one company I found that was both impacted and could have caused it. They also don't play nice with others - it's even in their documentation. We looked into buying their services. They have also caused load problems in other clouds. But it was just a guess. Could be anything.
darkwater•5mo ago
Come on, some CF insider please create a throwaway account and tell us which client it was :)
fasteo•5mo ago
There are some missing y-axis labels that would be interesting to see