frontpage.
Show HN: Red Squares – GitHub outages as contributions

https://red-squares.cian.lol/
630•cianmm•5h ago•139 comments

The bottleneck was never the code

https://www.thetypicalset.com/blog/thoughts-on-coding-agents
238•Anon84•2d ago•167 comments

Setting up a Sun Ray server on OpenIndiana Hipster 2025.10

https://catstret.ch/202605/srss-hipster202510/
85•jandeboevrie•4h ago•18 comments

Agents can now create Cloudflare accounts, buy domains, and deploy

https://blog.cloudflare.com/agents-stripe-projects/
506•rolph•12h ago•276 comments

StarFighter 16-Inch

https://us.starlabs.systems/pages/starfighter
533•signa11•13h ago•269 comments

The Thinking Plant's Man (2025)

https://www.sciencehistory.org/stories/magazine/the-thinking-plants-man/
32•benbreen•1d ago•2 comments

CARA 2.0 – “I Built a Better Robot Dog”

https://www.aaedmusa.com/projects/cara2
321•hakonjdjohnsen•2d ago•44 comments

Reverse-engineering the 1998 Ultima Online demo server

https://draxinar.github.io/articles/2026-05-01-uodemo-reverse-engineering.html
151•notsentient•9h ago•32 comments

Knitting bullshit

https://katedaviesdesigns.com/2026/04/29/knitting-bullshit/
269•ColinEberhardt•10h ago•129 comments

Batteries Not Included, or Required, for These Smart Home Sensors

https://coe.gatech.edu/news/2026/04/batteries-not-included-or-required-these-smart-home-sensors
129•gnabgib•2d ago•49 comments

245TB Micron 6600 ION Data Center SSD Now Shipping

https://investors.micron.com/news-releases/news-release-details/industry-leading-245tb-micron-660...
174•neilfrndes•12h ago•114 comments

Wolfenstein 3D for Gameboy Color on custom cartridge (2016)

https://www.happydaze.se/wolf/
85•ksymph•1d ago•10 comments

Cat (YC S22) Seeks Fractional Engineer to Build AI-Native Growth Toolkit

https://www.coveragecat.com/careers/engineering/fractional-growth-engineer
1•botacode•3h ago

Show HN: Adam – An embeddable cross-platform AI agent library

https://github.com/sqliteai/adam
17•marcobambini•2h ago•1 comments

DNSSEC disruption affecting .de domains – Resolved

https://status.denic.de/pages/incident/592577eab611ce1e0d00046f/69fa60ef9d12f5057a974f38
718•warpspin•19h ago•377 comments

Multi-stroke text effect in CSS

https://yuanchuan.dev/multi-stroke-text-effect-in-css
189•cheeaun•10h ago•24 comments

YouTube, your RSS feeds are broken

https://openrss.org/blog/youtube-your-feeds-are-broken
248•veeti•14h ago•94 comments

Accelerating Gemma 4: faster inference with multi-token prediction drafters

https://blog.google/innovation-and-ai/technology/developers-tools/multi-token-prediction-gemma-4/
630•amrrs•23h ago•303 comments

Virtual violin produces realistic sounds

https://news.mit.edu/2026/mit-engineers-virtual-violin-produces-realistic-sounds-0429
50•gmays•3d ago•36 comments

CNN founder Ted Turner, a pioneer of cable TV news, dies at 87

https://www.cnn.com/2026/05/06/us/ted-turner-death
11•pseudolus•41m ago•1 comments

Write some software, give it away for free

https://nonogra.ph/write-some-software-give-it-away-for-free-05-05-2026
332•nohell•18h ago•226 comments

Computer Use is 45x more expensive than structured APIs

https://reflex.dev/blog/computer-use-is-45x-more-expensive-than-structured-apis/
447•palashawas•23h ago•246 comments

Shrinkflation Is Quietly Making All Gadgets Worse

https://gizmodo.com/shrinkflation-is-quietly-making-all-gadgets-worse-2000754565
39•cainxinth•2h ago•25 comments

Three Inverse Laws of AI

https://susam.net/inverse-laws-of-robotics.html
501•blenderob•1d ago•330 comments

EEVblog: The 555 Timer is 55 years old [video]

https://www.youtube.com/watch?v=6JhK8iCQuqI
313•brudgers•23h ago•83 comments

Building the deployment tool I wish I had

https://ruuda.nl/2026/deptool
9•ruuda•2h ago•3 comments

Our Continuation of MkDocs

https://github.com/orgs/ProperDocs/discussions/33
6•serhack_•18m ago•0 comments

Today I've made the difficult decision to reduce the size of Coinbase by ~14%

https://twitter.com/brian_armstrong/status/2051616759145185723
453•adrianmsmith•1d ago•705 comments

Google Chrome silently installs a 4 GB AI model on your device without consent

https://www.thatprivacyguy.com/blog/chrome-silent-nano-install/
1563•john-doe•1d ago•1039 comments

Why most product tours get skipped

https://productonboarding.com/articles/why-product-tours-get-skipped
195•pancomplex•18h ago•167 comments

Forget IPs: using cryptography to verify bot and agent traffic

https://blog.cloudflare.com/web-bot-auth/
80•todsacerdoti•11mo ago

Comments

PaulHoule•11mo ago
There is a lot of talk about AI training being a driver of bot activity, but I think AI inference is also a driver, in two ways.

(1) It's always been easy to write bots [1] [2]. If you knew beautifulsoup well you could often write a scraper in 10 minutes. Now people ask ChatGPT to write a scraper for them and have one ready in 15 minutes, so they're discovering how easy it is and that you don't have to limit yourself to public APIs, which are usually designed to limit access, not expand it.

(2) Instead of using content to train an AI you can feed it into an AI for inference. For instance, you can tell the AI to summarize pages or to extract specific facts from pages or to classify pages. It's increasingly possible to develop a workflow like: classify 30,000 RSS feed items, select 300 items that the user will probably find interesting, crawl those 300 pages looking for hyperlinks to scientific journal articles or other links that would be better to post, crawl those links to see if the journal articles are open access, weigh various factors to decide what's likely to be the best link, do specialized image extraction so I can make a good social post, etc. It's not too hard to do but it all comes falling down if the bot has to click on fire hydrants endlessly.

[1] Polite crawlers limit how many threads they have running against a single server. If you only have one thread per server you are unlikely to overload it. If you want to make a crawler with a large thread count that is crawling a large number of servers it can be a hassle to implement this, particularly if you want to maximize performance or run a large distributed crawler. However a lot of times I do a crawling project that targets one site or five sites or that maybe crawls 1000 documents a day and in those cases the single-threaded crawler is fine.

[2] For some reason, my management has always overestimated the work of building scrapers, I think because they've been burned by UI development which is always underestimated. The fact that UI development is such a bitch actually helps with crawler development -- you might be afraid that the target site is going to change but between the high cost of making changes and the fact that Google will trash your SEO if you change anything about your site, the target site won't change.
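The "one thread per server" rule from footnote [1] can be sketched in a few lines of Python. This is only an illustration, not PaulHoule's crawler: the class name `PerHostThrottle`, the `min_delay` parameter, and the idea of enforcing a fixed pause between hits to the same host are assumptions made for the example.

```python
import time
from urllib.parse import urlsplit


class PerHostThrottle:
    """Allow at most one request per host every `min_delay` seconds.

    Hypothetical helper for a single-threaded crawler; the clock and sleep
    functions are injectable so the behaviour can be checked without waiting.
    """

    def __init__(self, min_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time of the next request

    def wait(self, url):
        """Block until the host of `url` may be hit again; return seconds waited."""
        host = urlsplit(url).netloc.lower()
        now = self._clock()
        delay = max(0.0, self._next_allowed.get(host, now) - now)
        if delay:
            self._sleep(delay)
        # The next request to this host must come min_delay after this one.
        self._next_allowed[host] = now + delay + self.min_delay
        return delay
```

A crawler would call `throttle.wait(url)` right before each fetch. Requests to different hosts never wait on each other, which is why the single-threaded version is fine for a handful of sites.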

showerst•11mo ago
Agreed on all points except [2]. I run many scrapers and sites change _all the time_, often changing markup for seemingly random reasons. One government site I scrape changes ids and classes between camel and snake case every couple of weeks; it makes me wonder if it's a developer pulling a fast one on the client.
dboreham•11mo ago
Hearing this makes me suspect some tool auto-generates the ids and its config is getting changed every couple of weeks by some spaces vs tabs battle between devs.
superkuh•11mo ago
I do not think that more in-house cloudflare-only "standards" open washed through their IETF employees, both of which raise the friction to participation in the web even higher for actual humans, are the way to go. Especially setups which again rely on centralized CAs and have tiny expiring lifetimes. Seems like pretty soon there'll only be one or two browsers which can even hope to access sites behind cloudflare's infrastructure. They might as well just start releasing their own browser and the transformation to AOL will be complete.
ecb_penguin•11mo ago
> I do not think that more in-house cloudflare-only "standards" open washed through their IETF employees

As someone with multiple RFCs, this is the way it's always been done. Industry has a problem, there's some collaboration with other industry or academia, someone submits a draft RFC. People are free to adopt it or not. Sometimes there are competing proposals that are accepted, and sometimes the topic dies entirely.

> both of which raise the friction to participation in the web even higher for actual humans

Absolutely nothing wrong with this, as it's site owners that make the decision for their own sites. Yep, I do want some friction. The tradeoff saves me a ton of money. Heck, I could block most ASNs and email domains and still keep 99% of my customers.

> Seems like pretty soon there'll only be one or two browsers which can even hope to access sites behind cloudflare's infrastructure

This proposal is about bots identifying themselves through open HTTP headers.
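To make "identifying themselves through open HTTP headers" concrete, here is a rough Python sketch in the style of HTTP Message Signatures (RFC 9421). It is not the format from the Cloudflare post: HMAC-SHA256 with a shared secret is swapped in for Ed25519 because the Python standard library has no Ed25519, and the covered components, the `sig1` label, and the function names are assumptions for illustration.

```python
import base64
import hashlib
import hmac

# Components covered by the signature (an assumption for this sketch).
COVERED = '("@authority" "user-agent")'


def _signature_base(authority, user_agent, params):
    # One line per covered component, then the signature parameters line.
    return "\n".join([
        f'"@authority": {authority}',
        f'"user-agent": {user_agent}',
        f'"@signature-params": {params}',
    ])


def sign_request(authority, user_agent, key, key_id, created):
    """Bot side: return the two headers to attach to the request."""
    params = f'{COVERED};created={created};keyid="{key_id}";alg="hmac-sha256"'
    base = _signature_base(authority, user_agent, params)
    mac = hmac.new(key, base.encode("ascii"), hashlib.sha256).digest()
    return {
        "Signature-Input": f"sig1={params}",
        "Signature": f"sig1=:{base64.b64encode(mac).decode('ascii')}:",
    }


def verify_request(authority, user_agent, headers, key):
    """Server side: rebuild the signature base and compare the MACs."""
    params = headers["Signature-Input"].split("=", 1)[1]
    base = _signature_base(authority, user_agent, params)
    expected = hmac.new(key, base.encode("ascii"), hashlib.sha256).digest()
    sent = base64.b64decode(headers["Signature"].split("=", 1)[1].strip(":"))
    return hmac.compare_digest(expected, sent)
```

With an asymmetric scheme such as Ed25519 the server would hold only the bot's public key, so it could verify a request without being able to forge one. A shared-secret HMAC cannot give that property.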

superkuh•11mo ago
>This proposal is about bots identifying themselves through open HTTP headers.

The problem is that to CF, everything that isn't Chrome is a bot (only a slight exaggeration). So browsers that aren't made by large corporations wouldn't have this. It's like how CF uses CORS.

CORS isn't only CF, but it's an example of their requiring obscure things no one else really uses, and using them in weird ways that cause most browsers to be unable to comply. The HTTP header CA signing is yet another of these things. And weird modifications of TLS flags fall right in there too. It's basically Proof-of-Chrome via Gish Gallop of new "standards" they come up with.

>Absolutely nothing wrong with this, as it's site owners that make the decision for their own sites.

I agree. It's their choice. I am just laying out the consequences of these mostly uninformed choices. They won't be aware that they're blocking a large number of their actual human visitors initially. I've seen it play out again and again with sites and CF. Eventually the sites are doing so much work maintaining their whitelists of UAs and IPs that one wonders why they use CF at all if they're doing the job themselves.

And that's not even starting on the bad and aggressive defaults for CF free accounts. In the last month or two they have slightly improved this. So there's some hope. They know they are a problem because they're so big:

"It was a decision I could make because I’m the CEO of a major Internet infrastructure company." ... "Literally, I woke up in a bad mood and decided someone shouldn't be allowed on the Internet. No one should have that power." - Cloudflare CEO Matthew Prince

(ps. You made some good and valid points, re: IETF process status quo, personal choice, etc, it's not me doing the downvotes)

Sophira•11mo ago
There's another problem here that I haven't seen anyone talking about, and that's the futility of trying to distinguish between "good bots" and "bad bots".

The idea of Anubis is to stop bots that are meant to gather data for AI purposes. But you know who has a really big AI right now? Google. And you know who are the people who have the most bots indexing the web for their search engine? Yup, Google.

All these discussions have been assuming that Googlebot is a "good bot", but what exactly is stopping Google from using the data from Googlebot to feed Gemini? After all, nobody's going to block Googlebot, for obvious reasons.

At most, surely the only thing that blocking AI bots will do is stop locally-running bots, or stop OpenAI (because they don't have any other legitimate reason to be running bots over the web).

nubinetwork•11mo ago
Using IPs requires next to no CPU power... if I have to start picking apart HTTP requests and running algorithms on the traffic, I might as well not even run websites, including personal ones.
ecb_penguin•11mo ago
This already happens with TLS, JWT verification, etc.
molticrystal•11mo ago
You are right that IP checks are lightweight, though you miss that setting up TCP/IP handshakes is algorithm-heavy, but it’s transparent because hardware and kernel optimizations keep it light on the CPU. TLS encryption, through certificate checks, key exchanges, that whole negotiation, is a CPU-heavy activity, especially on servers. Most of that asymmetric crypto, like verifying certificates, isn’t helped much by hardware accelerators like AES-NI, which mainly help with session encryption. TLS is already tons of work, so HTTP Message Signatures and mTLS are like piling more hay on the stack, it’s extra work, but you’re already doing a lot at that point.

The real complaint should be about having to adopt another standard, and whether they’ll discriminate against applications like legacy RSS readers, since they’re considered a type of bot.

kbolino•11mo ago
IP bans are usually enforced before the TCP handshake proceeds: server receives SYN packet, checks source address against blocklist, and if blocked then drops it before proceeding any further in the TCP state diagram.
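The blocklist lookup itself is cheap, which is the point being made. A minimal Python sketch of the check follows. A real deployment would do this in the kernel firewall before the handshake, as described above, so this userspace version only illustrates the lookup. The networks listed are documentation ranges chosen for the example, and `is_blocked` is a hypothetical helper name.

```python
import ipaddress

# Example blocklist using reserved documentation ranges (assumed values).
BLOCKLIST = [
    ipaddress.ip_network(net)
    for net in ("203.0.113.0/24", "198.51.100.7/32", "2001:db8::/32")
]


def is_blocked(source_ip, blocklist=BLOCKLIST):
    """Return True if the packet's source address falls in any blocked range."""
    addr = ipaddress.ip_address(source_ip)
    # Membership is False when the address and network IP versions differ.
    return any(addr in net for net in blocklist)
```

For a large blocklist a linear scan would be replaced by a prefix tree or hash set, but the work per packet stays tiny compared with any cryptographic check.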
probably_wrong•11mo ago
Wasn't that the argument against https, namely, that it was too costly to run [1]? I also run fail2ban [2] on my servers and I rarely even notice it's there.

I'm not saying you should sit down with the iptables manual and start going through the logs, but I can see the idea taking off if all it takes is (say) one apt-get and two config lines.

[1] https://stackoverflow.com/questions/1035283/will-it-ever-be-...

[2] https://github.com/fail2ban/fail2ban

elithrar•11mo ago
IPs as identifiers aren’t great: in a world of both CGNAT (more shared IPs) and a ton of sketchy residential proxies, they’ve become poor proxies for identity of a “thing”.
kbolino•11mo ago
IPs are slowly getting worse as identifiers over time, but IP and IP range bans are like port-shifting SSH: you can often get a lot of defense against low-effort attacks for similarly low amounts of effort.
senectus1•11mo ago
This clever bunny did something very similar.. (but self hosted)

https://xeiaso.net/blog/2025/anubis/

I love the approach.. If I could be arsed blogging I'd probably set it up myself.

mshockwave•11mo ago
I might be wrong, but it seems like Anubis asks the _client_ to solve cryptographic challenges, while the approach Cloudflare described here asks the server to verify a (cryptographic) signature?
senectus1•11mo ago
The client that's scraping... yes. It shifts the load to the offending service. It's not a big issue if you're human, but if you're an AI scraping bot it's a heavy load on resources.
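The asymmetry described here (the client pays, the server checks cheaply) can be shown with a generic hashcash-style proof of work in Python. This is not Anubis's actual code. The challenge format, the function names, and the "leading zero bits of SHA-256" rule are assumptions for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Count the zero bits at the start of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge, difficulty_bits):
    """Client side: brute-force a nonce, about 2**difficulty_bits hashes on average."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce


def verify(challenge, nonce, difficulty_bits):
    """Server side: a single hash, however hard the puzzle was to solve."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty_bits
```

One visitor solving one puzzle barely notices the cost. A scraper fetching millions of pages pays it millions of times.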
ralferoo•11mo ago
Altcha is a similar thing: https://github.com/altcha-org/altcha

I recently implemented a very similar thing to its obfuscation via proof-of-work (https://altcha.org/docs/obfuscation/) in my C++ REST backend and flutter front-end, and use it for rate-limiting on APIs that allow creation of a new account or sending sign-up e-mails.

I have an authentication token that's then wrapped with AES-GCM using a random IV and the client is given the key, IV stem and a maximum count for the IV.

lockhead•11mo ago
This would help detect legit bots for sure, but as an origin you would still have the same issue as before, since you still need to be able to discern between "real" users and all the malicious traffic. The number of "good" bots is far smaller than that, and thanks to good behavior and transparent data they are much easier to identify even without this kind of thing. So to make real use of this, users would also need to do this, and suddenly "privacy hell" would be too kind a name for it.
Sophira•11mo ago
Taking this to its logical extreme, if it ended up getting used enough, then governments could be tempted to enforce its use.
drtgh•11mo ago
It does not sound extreme, unfortunately. Meanwhile the malicious traffic would keep up its activity with spoofed (and so on) certs, from the very beginning.
az09mugen•11mo ago
Totally agree, that's conceptually the same problem as robots.txt. As stated in https://www.robotstxt.org/faq/blockjustbad.html :

> But almost all bad robots ignore /robots.txt, making that pointless.
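The reason robots.txt cannot stop bad robots is that the check runs inside the client. A short Python sketch with the standard library's `urllib.robotparser` shows this. The robots.txt content, the agent names `goodbot` and `badbot`, and the `allowed` helper are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt, parsed from memory so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: badbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(user_agent, url):
    """What a polite crawler asks itself before fetching `url`."""
    return parser.can_fetch(user_agent, url)
```

Nothing forces a crawler to call `allowed` at all, or to report its real user agent when it does. A signature header has the same limit: it proves who a cooperating bot is and says nothing about traffic that declines to sign.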

unsolved73•11mo ago
Interesting proposal.

The current situation is getting worse day after day because everybody wants to ScRaPe 4lL Th3 W38!!

Verifying an Ed25519 signature is almost free on modern CPUs. I just wonder why they went with an obscure RFC for HTTP signatures instead of using plain JSON Web Tokens in a header.

JWTs are universal. Parsing this custom format will certainly lead to a few interesting bugs.

dboreham•11mo ago
The subtext surely is: "and we're going to charge for crawler traffic next".
ok123456•11mo ago
What about an SMTP proof of work extension? Smaller SMTP relays that would typically have a harder time sending mail could opt in to solve a problem to increase the chance of delivery. The difficulty of the problem could be inversely related to reputation.
ipdashc•11mo ago
https://en.wikipedia.org/wiki/Hashcash