They won an award for the paper, and the example they gave was a "holiday" search, where a hotel inserted its name and an airline wedged itself in as the best way to get there.
If I can find it again, I'll print its link and stick it all over the walls to make sure everybody knows what Google is up to.
Google even put the AI snippet above their ads, so you know how bad it stings.
IMO an LLM is just a superior technology to a search engine in that it can understand vague questions, collate information and translate from other languages. In a lot of cases what I want isn't to find a particular page but to obtain information, and an LLM gets closer to that ideal.
It's nowhere near perfect yet but I won't be surprised if search engines go extinct in a decade or so.
> and Reclaim Your Organic Traffic
Content:
> 1. Set Snippet Length to Zero with max-snippet:0
Sure, buddy, sure. Users are notorious for clicking a link in a search result without a description, right?
Header set X-Robots-Tag "noindex, nofollow, noarchive, nositelinkssearchbox, nosnippet, notranslate, noimageindex"
Of course, only the beeping Internet Archive totally ignored it and scraped my site. And now, despite me trying many times, they won't remove it.
It seems to mostly work, I also have Anubis in front of it now to keep the scrapers at bay.
(It's a personal diary website, started in 2000 before the term "blog" existed. I know it's public content, I just don't want it searchable public)
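For anyone wanting to sanity-check a header like the one above, here is a rough Python sketch that splits an X-Robots-Tag value into directives and reports what it asks crawlers to do. The function names are made up for illustration, and it assumes the simple comma-separated form only: it does not handle a user-agent prefix (e.g. `googlebot: noindex`) or `unavailable_after` dates that contain commas. It treats `max-snippet:0` as equivalent to `nosnippet`, which is how Google documents it. Whether a given crawler honors any of this is a separate question.

```python
from typing import Dict, Optional

# The header value quoted in the comment above.
HEADER = ("noindex, nofollow, noarchive, nositelinkssearchbox, "
          "nosnippet, notranslate, noimageindex")


def parse_x_robots_tag(value: str) -> Dict[str, Optional[str]]:
    """Split a header value into {directive: argument or None}.

    Simplification: assumes plain comma-separated directives with an
    optional "name:argument" form, and no user-agent prefix.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, sep, arg = part.partition(":")
        directives[name.strip()] = arg.strip() if sep else None
    return directives


def blocks_indexing(value: str) -> bool:
    """True if the header asks crawlers not to index the page."""
    d = parse_x_robots_tag(value)
    return "noindex" in d or "none" in d


def suppresses_snippet(value: str) -> bool:
    """True if the header asks for no text snippet in results."""
    d = parse_x_robots_tag(value)
    return "nosnippet" in d or d.get("max-snippet") == "0"


if __name__ == "__main__":
    print(blocks_indexing(HEADER))
    print(suppresses_snippet("max-snippet:0"))
```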
Look at the reason, and get mad at the right people.
It might be the archive themselves, but just be sure.
Why would you NOT want the Internet Archive to scrape your website? (I'm clueless - thank you)
Yes I could password protect it (and any really personal content is locked behind being logged in, AI hasn't scraped that) but I _like_ being able to share links with people without having to also share passwords.
I realise the HN crowd is very much "More eyeballs are better for business" but this isn't business. This is a tiny, 5 hits a month (that's not me writing it) website.
In all honesty, if you're hosting it on the internet, why is this a problem? If you didn't want it to be backed up, why is it publicly accessible at all? I'm glad the Internet Archive will keep hosting this content even when the original is long gone.
Let's say I'd read your website and wanted to look it up one day in the far future, only to find many years later the domain had expired, I'd be damn glad at least one organization had kept it readable.
Additionally, when I die, I want my website to go dark and that's that. It's a diary, it's very very mundane. My tech blog I post to, sure, I'm 200% happy to have that scraped/archived. For my diary, I keep very up-to-date offline copies that my family has access to, should I tip over tomorrow.
I realise this goes against the usual Internet wisdom, and I'm sure there's more than one Chinese AI/bot out there that's scraped it and I have zero control over. But where I allegedly do have control, I'd like to exercise it. I don't think that's an unfair/ridiculous request.
This feels like the wrong solution for wanting to be compensated for information.
I don't know what the solution is, because one often doesn't know if the information is worth paying for until after viewing it.
bitpush•1h ago