I rebuilt my blog's cache. Bots are the audience now

https://hoeijmakers.net/thirty-years-of-caching-sorted-in-an-afternoon/
27•robhoeijmakers•3h ago

Comments

jdw64•3h ago
Personally, I think this is a good idea. But the core problem is this: How is a newcomer supposed to build reputation now? Without exaggerated business promises or capital, basic online reputation usually depends on writing. In fact, my own first step into freelancing came because someone found the articles on my Korean blog interesting. So the question is: if the subscribers are bots, what benefit do they actually give me? If bots become the readers, then what matters is whether they can provide any kind of symbolic capital or real capital. I can build caching with Redis without much difficulty, but I worry that if this continues, the result may simply be that LLMs learn from my writing while no benefit returns to me. People write partly to organize their thoughts, but also partly to gain symbolic capital. That is one reason why I write my own posts instead of using an LLM to write them for me.
johng•1h ago
What's worse is that they train on your content, and very often you don't even get an attribution link. So the end user never even knows it was your site that provided the information, and you never get a single clickthrough. It's not like the SERPs, where someone would click through, read your site, hopefully find it interesting and useful, and come back.

It's going to be a serious problem and I've already seen sites that are down 90% in traffic simply because AI is scraping them, answering the questions itself and never providing a linkback.

01284a7e•1h ago
I pulled all the websites I had. Some existed for more than a decade and made me hundreds of thousands of dollars. All that is left is bots that steal the value of my work. Until something changes, goodbye.
gbgarbeb•36m ago
This is like choosing to be an elementary school teacher and then quitting because it turns out your students for the year aren't your pets in perpetuity.
diatone•29m ago
If your students were growing up to subvert your line of work, sure. Pretty sure that’s not the case though!
pixl97•1h ago
>How is a newcomer supposed to build reputation now

Dead internet manifest.

nilirl•25m ago
I feel that pressure of not knowing how to definitively compete on the internet, especially when there's so much AI created noise.

I'm a copywriter and I used to get hired to write posts on behalf of founders on LinkedIn or for their company blog.

Now, the last three jobs I had were all focused on sending cold email.

ryandrake•16m ago
I think in general, "Writing on the Internet with the intent to make money" is effectively dead, or at least soon to be dead. AI+bots mean we now have the "infinite typewriter monkeys" from the thought experiment. With infinite supply, the price goes to zero.

We need to stop this treadmill of trying to "build reputation" and stop focusing on "symbolic capital" and "clout" and whatever else bloggers are going after. You're not going to get it, and even if you do, you're not going to be able to "monetize" it.

If you have a need to write, write. Maybe a handful of actual people will read it, maybe not. But, I wouldn't try to do it for a living. The reward will have to be the cathartic process of writing itself, and not in how much attention it gets, how much it "blows up" or how viral it gets.

jdw64•7m ago
I am not trying to make money from writing.

What I need is for my writing to spread enough that I can receive opportunities to have my programming ability evaluated.

The reason I write about programming is that, in the past, some readers found my programming essays interesting, and that led to chances for me to be tested. I had to leave graduate school because of financial problems, and I did not graduate from a prestigious university.

So this is not simply about monetizing writing. It is a struggle to receive opportunities. Those are fundamentally different things.

Some people may be happy writing things that nobody reads. But many people are happier when they can share their writing and let their values collide with those of others.

Hackbraten•2h ago
Why do I get just an empty page?
ksk23•1h ago
Caching gone wrong.. (Works for me)
consumer451•1h ago
Same here via VPN. No VPN, and I get the actual content.
robhoeijmakers•57m ago
Thanks. It seems to be very local/incidental. The page works from the locations I can test, but I’ll check whether one edge cache or request path served a bad response.
pavel_lishin•1h ago
> Not because I expect a person in Singapore to shave 200ms off their pageload, but because the next request for that page is more likely to come from a retrieval system than a browser, and the request after that, and the one after that.

Why do I care if I shave off 200ms from a crawler's request, instead of a human's?

rodw•55m ago
Page load time can impact index coverage (depth of crawl), freshness (revisit rate), and ranking.
m0rde•47m ago
From the post:

> If you care about how your content moves through the world now, including through AI systems, you have to care about caching. Not as a performance optimisation for human browsers, but as infrastructure for machine readership.

nothrabannosir•18m ago
That doesn’t answer the question at all, and I wonder if it’s actually true? A cache is not magic; it is, itself, just a static file server in the end. If I self-host a static website on an nginx box, do I actually need a cache to serve today’s crawlers?

The screenshot in the image says 3k req/day. That’s 2 requests per minute (amortized). At that rate, you can serve it with cgi and Perl.

Cache is only relevant if you have a lot of traffic AND dynamic pages, or if you care about latency (which is only relevant for humans).
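The amortized rate quoted above can be checked with a few lines of arithmetic. This is only a sketch: the 3,000 requests/day figure is the one cited from the screenshot, and the function name is illustrative.

```python
def requests_per_minute(requests_per_day: float) -> float:
    """Average (amortized) request rate per minute over a 24-hour day."""
    minutes_per_day = 24 * 60  # 1440 minutes
    return requests_per_day / minutes_per_day


rate = requests_per_minute(3_000)
print(f"{rate:.2f} requests/minute")  # roughly 2 per minute, as stated
```

An average says nothing about bursts, so real peak load could be higher than this figure suggests.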

Brybry•39m ago
The graphic in the article seems to be the only significant content.

Based on that I think it's more about requests from bots/scrapers having the greatest chance possible of hitting a cache before hitting the blog's origin/real host. Bots will hit some layer of Cloudflare first then they'll hit Fastly and then if not in Fastly they'll hit the Ghost blog's server.

To me, this makes a lot of sense if it's self-hosted but I also thought it was already the standard to shove your self-hosted blog behind a reverse-proxy and cache as much as possible.

And I'm not a professional web developer but all the extra caching layers for a static personal blog seem a bit overkill.

Aside from the graphic, the article is a lot of words about engaging with an LLM to get a full understanding of how caching works for their blog hosting and how it enabled them to change their setup for the better.

It's kind of hard to understand because there are no words about what they actually did or how what they actually did was better.
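The layered lookup described in this comment (Cloudflare first, then Fastly, then the Ghost origin) can be sketched roughly as follows. This is a toy model, not the author's setup: the layer names are labels only, the dict-based caches and the `origin` function are hypothetical, and real CDNs add TTLs, invalidation, and per-edge state that are left out here.

```python
from typing import Callable, Dict, List, Tuple

Cache = Dict[str, str]


def fetch_through_layers(
    path: str,
    layers: List[Tuple[str, Cache]],
    origin: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (body, served_by), checking each cache layer in order.

    On a hit, layers in front of the hit are backfilled. On a miss
    everywhere, the origin is called and every layer is filled.
    """
    for index, (name, cache) in enumerate(layers):
        if path in cache:
            body = cache[path]
            for _, outer_cache in layers[:index]:
                outer_cache[path] = body  # backfill layers that missed
            return body, name
    body = origin(path)
    for _, cache in layers:
        cache[path] = body
    return body, "origin"


origin_hits: List[str] = []


def origin(path: str) -> str:
    """Hypothetical stand-in for the blog server; records each hit."""
    origin_hits.append(path)
    return f"<html>{path}</html>"


layers: List[Tuple[str, Cache]] = [("cloudflare", {}), ("fastly", {})]
first = fetch_through_layers("/post", layers, origin)
second = fetch_through_layers("/post", layers, origin)
print(first[1], second[1], len(origin_hits))
```

In this model the origin is touched once per path, no matter how many bot requests follow.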

cullumsmith•1h ago
I simply block all AI crawlers with a user-agent check in nginx.conf.
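The idea of a user-agent check can be illustrated in Python (this is not nginx configuration, and not the commenter's actual rule). The token list is a short, incomplete assumption, and the function names are hypothetical.

```python
import re

# Illustrative, incomplete list of crawler tokens (an assumption,
# not the commenter's real list).
AI_CRAWLER_PATTERN = re.compile(
    r"GPTBot|ClaudeBot|CCBot|PerplexityBot", re.IGNORECASE
)


def is_blocked(user_agent: str) -> bool:
    """True if the User-Agent header contains a listed crawler token."""
    return bool(AI_CRAWLER_PATTERN.search(user_agent or ""))


def status_for(user_agent: str) -> int:
    """HTTP status a server might return: 403 for listed crawlers, else 200."""
    return 403 if is_blocked(user_agent) else 200


print(status_for("Mozilla/5.0 (compatible; GPTBot/1.0)"))
print(status_for("Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0"))
```

A check like this only catches crawlers that announce themselves in the User-Agent header.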
robhoeijmakers•55m ago
I started blocking some of them. But for now I want to improve visibility before further blocking or optimising. The dashboard helps with this.
microtonal•50m ago
I also block all AI crawlers. I am not sure why I should give them my content for them to rip it off and make money from it through training or agents. Sadly, a lot of AI companies are trying to make requests indistinguishable from regular browsers from residential connections, so unfortunately I have to use Cloudflare to block them.

Ideally I'd make the content available to crawlers for training open models, but that seems to be nearly impossible. It would be possible if other AI companies behaved.

Barbing•35m ago
>so unfortunately I have to use Cloudflare to block them.

That can’t block Grok, can it?

(You might have a fake iPhone or something visit your site if you ask Grok to retrieve information from it)

orf•45m ago
*some AI crawlers. Not many
steve_adams_86•25m ago
I went through a similar process recently. For a while I saw readership of my site gradually increasing, and eventually it became clear that it wasn't human beings.

I also used Claude to help me drill into what's going on. Bizarrely, about 80% of my traffic comes from Singapore, which the author mentioned. I don't know why. A lot of the traffic looks real; it stays for a while, clicks different links in different orders. But no one in Singapore has ever read a thing I've written on my site as far as I'm concerned.

I thought Cloudflare would help protect my site from bots, but it utterly fails. I'm not sure if they're too sophisticated or people overestimate how well CF works for these things. I paid for advanced features for a while and reverted to the free plan once I realized it made no difference. It's a great platform in general, but hasn't been great for allowing me to see how many humans actually read my content.

I know some do because they email me occasionally. If I had to guess, of the ~200 visits per week reported in analytics, around 15 are real.

chrismorgan•21m ago
I’m very confused about why you’d have such a complex cache arrangement. Sounds like you’re using Cloudflare and Fastly to do roughly the same thing. That sounds like a recipe for more expense and more problems.

For the sort of thing you’re doing, it should be as simple as “throw it behind Cloudflare/Fastly/Bunny/whichever private CDN you like” and that’s it.

Also the diagram near the end is pretty much incoherent. GenAI, I presume.

ssv445•19m ago
The core value of the internet was someone discovering you via your content. Agents as the primary consumer might look good for now, but we are definitely making the internet dead for many SMBs.
gostsamo•19m ago
The writership of the blog has also changed and seems to be mostly machine as well. It is painful to read something that lacks human presence on the other side.
yawnxyz•15m ago
my tiny blogs no one reads have been racking up huge Deno Deploy and Vercel bills ($40-50/mo each) bc I ran them "naked" without a cache or cloudflare or static builds - it didn't matter bc I got like hundreds of visitors a month. they were just hono or whatever api pulling from my backend which could be notion or airtable - super simple, though kind of slow

now I suddenly have 10k visitors a month hammering my apis and causing massive egress and cpu usage - so i had to get them behind cloudflare and now build everything statically - cut the costs back down from 90+ cpu hours to about 0.2 cpu hours a month

crazy times

(also, all done w/ claude code's help, or it would have taken a week for me to figure out)

faangguyindia•10m ago
This is why I don't use those serverless setups.

A $4 Hetzner VPS can serve tons of requests if you put Cloudflare in front of it.

I host my own runners for CI and artifact building on Hetzner VPSes (spun up on demand).

People are easily lured by pay-as-you-go plans on serverless and other cheap-to-get-started managed services and end up racking up huge bills.

This is the same reason I don't use Stackdriver or Cloud Monitoring and prefer a Grafana + Loki + Prometheus setup.

My setup cannot be misconfigured and end up racking up huge bills.

faangguyindia•12m ago
Yesterday I logged into cloudflare and found that Cloudflare had blocked chatgpt and claude from accessing my site. https://macrocodex.app

This is bad because there are fitness guides on my domain

https://macrocodex.app/guides which newbies often put into ChatGPT and ask it to simplify.

I enabled crawling for LLMs. There is a lot of misinformation in the fitness field, so it's better if LLMs get their content from people who at least have experience in the field.

ianberdin•12m ago
It is time to rewrite to X to optimize Y :)
