Why do I care if I shave off 200ms from a crawler's request, instead of a human's?
> If you care about how your content moves through the world now, including through AI systems, you have to care about caching. Not as a performance optimisation for human browsers, but as infrastructure for machine readership.
The screenshot in the image says 3k req/day. That’s about 2 requests per minute (amortized). At that rate, you can serve it with CGI and Perl.
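For anyone who wants to check the arithmetic, here is a quick sketch. The only input is the 3k req/day figure from the screenshot; the variable names are mine.

```python
# Back-of-the-envelope rate from the 3k req/day figure quoted above.
REQUESTS_PER_DAY = 3_000

per_minute = REQUESTS_PER_DAY / (24 * 60)        # minutes in a day
per_second = REQUESTS_PER_DAY / (24 * 60 * 60)   # seconds in a day
seconds_between = 1 / per_second                 # average gap between requests

print(f"{per_minute:.2f} req/min")                 # 2.08 req/min
print(f"{seconds_between:.1f} s between requests")  # 28.8 s between requests
```

This is an average, so real traffic will be burstier, but it gives the order of magnitude.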
Cache is only relevant if you have a lot of traffic AND dynamic pages, or if you care about latency (which is only relevant for humans).
Based on that I think it's more about giving requests from bots/scrapers the greatest possible chance of hitting a cache before hitting the blog's origin/real host. Bots will hit some layer of Cloudflare first, then Fastly, and only if the page isn't cached in Fastly will they hit the Ghost blog's server.
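A toy sketch of that lookup order, assuming two cache layers in front of an origin. `CacheLayer`, `fetch`, and `origin` are made-up names for illustration only; this ignores TTLs, cache keys, and everything else real CDNs do, and is not how Cloudflare or Fastly work internally.

```python
from typing import Callable


class CacheLayer:
    """Hypothetical cache layer: just a named dict of path -> body."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: dict[str, str] = {}


def fetch(path: str, layers: list[CacheLayer],
          origin: Callable[[str], str]) -> tuple[str, str]:
    """Return (body, served_by), checking layers outermost first."""
    for i, layer in enumerate(layers):
        if path in layer.store:
            body = layer.store[path]
            # Fill the layers in front of the one that had the page.
            for outer in layers[:i]:
                outer.store[path] = body
            return body, layer.name
    # Every layer missed: go to the origin and fill all layers.
    body = origin(path)
    for layer in layers:
        layer.store[path] = body
    return body, "origin"


origin_hits: list[str] = []


def origin(path: str) -> str:
    """Stand-in for the blog server; records each time it is hit."""
    origin_hits.append(path)
    return f"<html>{path}</html>"


layers = [CacheLayer("cloudflare"), CacheLayer("fastly")]
first = fetch("/post", layers, origin)
second = fetch("/post", layers, origin)
print(first[1])          # origin
print(second[1])         # cloudflare
print(len(origin_hits))  # 1
```

The point is only that the origin sees one request no matter how many bots ask for the same page afterwards.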
To me, this makes a lot of sense if it's self-hosted but I also thought it was already the standard to shove your self-hosted blog behind a reverse-proxy and cache as much as possible.
And I'm not a professional web developer but all the extra caching layers for a static personal blog seem a bit overkill.
Aside from the graphic, the article is a lot of words about engaging with an LLM to get a full understanding of how caching works for their blog hosting and how it enabled them to change their setup for the better.
It's kind of hard to understand because there are no words about what they actually did or how what they actually did was better.
Ideally I'd make the content available to crawlers for training open models, but that seems to be nearly impossible. It would be possible if other AI companies behaved.
That can’t block Grok, can it?
(You might have a fake iPhone or something visit your site if you ask Grok to retrieve information from it)
I also used Claude to help me drill into what's going on. Bizarrely, about 80% of my traffic comes from Singapore, which the author mentioned. I don't know why. A lot of the traffic looks real; it stays for a while, clicks different links in different orders. But no one in Singapore has ever read a thing I've written on my site as far as I know.
I thought Cloudflare would help protect my site from bots, but it utterly fails. I'm not sure if they're too sophisticated or people overestimate how well CF works for these things. I paid for advanced features for a while and reverted to the free plan once I realized it made no difference. It's a great platform in general, but hasn't been great for allowing me to see how many humans actually read my content.
I know some do because they email me occasionally. If I had to guess, of the ~200 visits per week reported in analytics, around 15 are real.
For the sort of thing you’re doing, it should be as simple as “throw it behind Cloudflare/Fastly/Bunny/whichever private CDN you like” and that’s it.
Also the diagram near the end is pretty much incoherent. GenAI, I presume.
now I suddenly have 10k visitors a month hammering my apis and causing massive egress and cpu usage - so i had to get them behind cloudflare and now build everything statically - cut the costs back down from 90+ cpu hours to about 0.2 cpu hours a month
crazy times
(also, all done w/ claude code's help, or it would have taken a week for me to figure out)
A $4 Hetzner VPS can serve tons of requests if you put Cloudflare in front of it
I host my own runners for CI and artifact building on Hetzner VPSes (spun up on demand).
People are easily lured by pay-as-you-go plans on serverless and other cheap-to-get-started managed services and end up racking up huge bills.
This is the same reason I don't use Stackdriver or Cloud Monitoring and prefer a Grafana + Loki + Prometheus setup
My setup cannot be misconfigured and end up racking up huge bills.
This is bad because there are fitness guides on my domain
https://macrocodex.app/guides which newbies often put into ChatGPT and ask it to simplify.
I enabled crawling for LLMs. There is a lot of misinformation in the fitness field, so it's better if LLMs get their content from people who at least have experience in the field
jdw64•3h ago
johng•1h ago
It's going to be a serious problem and I've already seen sites that are down 90% in traffic simply because AI is scraping them, answering the questions itself and never providing a link back.
01284a7e•1h ago
gbgarbeb•36m ago
diatone•29m ago
pixl97•1h ago
Dead internet manifest.
nilirl•25m ago
I'm a copywriter and I used to get hired to write posts on behalf of founders on LinkedIn or for their company blog.
Now, the last three jobs I had were all focused on sending cold email.
ryandrake•16m ago
We need to stop this treadmill of trying to "build reputation" and stop focusing on "symbolic capital" and "clout" and whatever else bloggers are going after. You're not going to get it, and even if you do, you're not going to be able to "monetize" it.
If you have a need to write, write. Maybe a handful of actual people will read it, maybe not. But, I wouldn't try to do it for a living. The reward will have to be the cathartic process of writing itself, and not in how much attention it gets, how much it "blows up" or how viral it gets.
jdw64•7m ago
What I need is for my writing to spread enough that I can receive opportunities to have my programming ability evaluated.
The reason I write about programming is that, in the past, some readers found my programming essays interesting, and that led to chances for me to be tested. I had to leave graduate school because of financial problems, and I did not graduate from a prestigious university.
So this is not simply about monetizing writing. It is a struggle to receive opportunities. Those are fundamentally different things.
Some people may be happy writing things that nobody reads. But many people are happier when they can share their writing and let their values collide with those of others.