https://hackernoon.com/the-long-now-of-the-web-inside-the-in...
I wouldn't be surprised if it's AI.
It's time to come up with a term for blog posts that are just AI-augmented re-hashes of other people's writing.
Maybe blogslop.
* power budget dominates everything: I have access to a lot of rack hardware through old connections, but I don't want to fill my cabinet with an army of old machines, because they would blow my power budget for not much performance compared to my 9755. What disks does the IA use? One specific model, or a large variety like Backblaze?
* magnetic is bloody slow: I'm not the Internet Archive, so I'm just going to have a couple of machines with a few hundred TiB. I'm planning on making it all one big ZFS pool so I can deduplicate, but it seems like a single disk failure would doom me to a massive rebuild.
I'm sure I can work it out with a modern LLM, but maybe someone here has experience actually running massive storage for the use case where tomorrow's data is almost the same as today's. That is the case with the Internet Archive, where tomorrow's copy of wiki.roshangeorge.dev will look, even at the block level, like yesterday's copy.
The last time I built with multi-petabyte datasets we were still using Hadoop on HDFS, haha!
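To make the "tomorrow looks like today at the block level" idea concrete, here is a minimal sketch of block-level deduplication by content hash. It is a toy in-memory illustration, not how ZFS or the Internet Archive implements anything: `BlockStore`, `split_blocks`, and the 4 KiB `BLOCK_SIZE` are names and values I made up for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size chosen for the example


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class BlockStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> list[str]:
        """Store data and return its recipe: the ordered list of block hashes."""
        recipe = []
        for block in split_blocks(data):
            digest = hashlib.sha256(block).hexdigest()
            # Only a block whose hash is new costs any space.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from its recipe."""
        return b"".join(self.blocks[d] for d in recipe)

    def stored_bytes(self) -> int:
        """Total bytes actually held after deduplication."""
        return sum(len(b) for b in self.blocks.values())


# Yesterday's snapshot: 100 distinct blocks.
yesterday = b"".join(bytes([i]) * BLOCK_SIZE for i in range(100))
# Today's snapshot: identical except for block 50.
today_blocks = split_blocks(yesterday)
today_blocks[50] = bytes([200]) * BLOCK_SIZE
today = b"".join(today_blocks)

store = BlockStore()
recipe_yesterday = store.put(yesterday)
recipe_today = store.put(today)

print(len(yesterday) + len(today))  # logical size of both snapshots
print(store.stored_bytes())         # physical size after dedup
```

Two snapshots of 100 blocks each end up costing 101 blocks of storage, because only the changed block is new. The catch with fixed-size blocks is that an insertion near the start of a file shifts every later block boundary, so nothing after it matches any more; whether that matters depends on how the crawled data actually changes from day to day.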
tylerchilds•1h ago
https://en.wikipedia.org/wiki/Executive_Order_9066