(Also: "Accumulated power on time, hours:minutes 37451*:12, Manufactured in week 27 of year 2014" — I might want to replace these :D — * pretty sure that overflowed at 16 bit, they were powered on almost continuously & adding 65536 makes it 11.7 years.)
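To spell out that arithmetic, here is a minimal sketch. It assumes the power-on counter is an unsigned 16-bit hour count that wrapped exactly once, which is the guess above rather than anything the drive confirms.

```python
# Assumption: the power-on-hours counter is 16 bits wide and wrapped once.
REPORTED_HOURS = 37451           # value shown in the drive's report
WRAP = 2 ** 16                   # 65536, the range of a 16-bit counter
HOURS_PER_YEAR = 24 * 365.25


def power_on_years(reported_hours: int, wraps: int = 1) -> float:
    """Convert a possibly wrapped hour counter into years of power-on time."""
    return (reported_hours + wraps * WRAP) / HOURS_PER_YEAR


print(f"{power_on_years(REPORTED_HOURS):.1f} years")            # with one wrap
print(f"{power_on_years(REPORTED_HOURS, wraps=0):.1f} years")   # taken at face value
```

With one wrap this comes to about 11.7 years, which fits a drive built in mid-2014 and powered on almost continuously; at face value it would be only about 4.3 years.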
Meanwhile, if we slice the data up three ways to hell and back, /all/ we see is unexplainable variation - every point is unique.
This is where PCA is helpful - given our set of covariates, what combination of variables best explains the variation, and how much residual remains? If there's a lot of residual, we should look for other covariates. If it's a tiny residual, we don't care, and can work on optimizing the known major axes.
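As a rough illustration of the "how much residual remains" question, here is a minimal two-variable PCA sketch using only the standard library. The function name `pca_2d` and the sample numbers are made up for illustration; real drive data would have many more covariates and would normally go through a linear algebra library.

```python
import math


def pca_2d(xs, ys):
    """Return ((ratio1, ratio2), first_axis) for two variables.

    ratio1 is the share of total variance along the main axis;
    ratio2 is the residual share left over.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample covariance matrix [[a, b], [b, c]]
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix
    mean = (a + c) / 2
    spread = math.hypot((a - c) / 2, b)
    lam1, lam2 = mean + spread, mean - spread
    # Eigenvector belonging to the larger eigenvalue
    if b != 0:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    total = lam1 + lam2
    return (lam1 / total, lam2 / total), (vx / norm, vy / norm)


# Hypothetical data: two covariates that move almost in lockstep.
ratios, axis = pca_2d([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print("explained:", ratios[0], "residual:", ratios[1], "axis:", axis)
```

A residual share near zero means the known axis already explains nearly everything; a large one is the cue to go looking for other covariates.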
On top of that it seems like by the time there is a clear winner for reliability, the manufacturer no longer makes that particular model and the newer models are just not a part of the dataset yet. Basically, you can’t just go “Hitachi good, Seagate bad”. You have to look at specific models and there are what? Hundreds? Thousands?
I'd assume that a drive manufacturer does something similar, knowing which batch from which vendor the magnets, grease, or silicon all come from. You hope you never need to use these records to do any kind of forensic research, but the one time you do need them, it makes a huge difference. So many people making products similar to mine look at me with a tilted head while their eyes go wide and glaze over, as if I'm speaking an alien language, when I discuss lineage tracking.
I'm still not sure how to confidently store decent amounts of (personal) data for over 5 years without
1- giving to cloud,
2- burning to M-disk, or
3- replacing multiple HDD every 5 years on average
All whilst regularly checking for bitrot and not overwriting good files with corrupted ones. Who has the easy, self-service, cost-effective solution for basic, durable file storage? Synology? TrueNAS? Debian? UGreen?
(1) and (2) both have their annoyances, so (3) seems "best" still, but seems "too complex" for most? I'd consider myself pretty technical, and I'd say (3) presents real challenges if I don't want it to become a somewhat significant hobby.
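For the "checking for bitrot" part, one filesystem-independent approach is a checksum manifest. This is a minimal sketch, not a finished tool; the function names are made up for illustration. It flags files whose contents changed while their modification time did not, which suggests silent corruption rather than an intentional edit.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's relative path to (mtime_ns, sha256)."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            manifest[rel] = (os.stat(full).st_mtime_ns, sha256_of(full))
    return manifest


def suspected_bitrot(old, new):
    """Files whose hash changed although the modification time did not."""
    return sorted(
        rel
        for rel, (mtime, digest) in new.items()
        if rel in old and old[rel][0] == mtime and old[rel][1] != digest
    )
```

Running `suspected_bitrot(previous_manifest, build_manifest(root))` before each backup run, and refusing to sync anything it returns, is one way to avoid overwriting a good copy with a bad one.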
Not great for easy read access but other than that it might be decent storage.
AFAIK someone on reddit did the math and the break-even for tapes is between 50TB and 100TB. Any less and it's cheaper to get a bunch of hard drives.
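The shape of that break-even calculation is simple: tape has a large fixed cost (the drive) but cheaper media per TB. A minimal sketch follows; the prices in the example call are round placeholders, not real quotes and not the figures from the reddit post.

```python
def tape_break_even_tb(tape_drive_cost, tape_cost_per_tb, hdd_cost_per_tb):
    """Capacity (TB) at which tape drive + tape media costs the same as hard drives.

    Solves: tape_drive_cost + tape_cost_per_tb * x == hdd_cost_per_tb * x
    """
    if hdd_cost_per_tb <= tape_cost_per_tb:
        raise ValueError("tape media must be cheaper per TB for a break-even to exist")
    return tape_drive_cost / (hdd_cost_per_tb - tape_cost_per_tb)


# Placeholder prices purely for illustration.
print(tape_break_even_tb(tape_drive_cost=1200, tape_cost_per_tb=5, hdd_cost_per_tb=20))
```

Below the returned capacity hard drives are cheaper; above it tape wins, ignoring redundancy, drive replacement, and the value of your time.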
You can't buy those anymore. I've tried.
IIRC, the things currently marketed as MDisc are just regular BD-R discs (perhaps made to a higher standard, and maybe with a slower write speed programmed into them, but still regular BD-Rs).
1. Use ZFS with raidz
2. Scrub regularly to catch the bitrot
3. Park a small reasonably low-power computer at a friend's house across town or somewhere a little further out -- it can be single-disk or raidz1. Send ZFS snapshots to it using Tailscale or whatever. (And scrub that regularly, too.)
4. Bring over pizza or something from time to time.
As to brands: This method is independent of brand or distro.
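The steps above can be sketched as a dry run that only builds the commands, so nothing touches real disks. The pool, dataset, and host names are hypothetical placeholders, and the first replication would need a full `zfs send` without `-i`.

```python
import shlex

# Hypothetical names; substitute your own pool, dataset, and remote host.
POOL = "tank"
DATASET = "tank/data"
REMOTE = "backup-box"            # e.g. the friend's machine, reachable over Tailscale
REMOTE_DATASET = "backup/data"


def create_pool(devices):
    """Step 1: a raidz pool across the given devices."""
    return ["zpool", "create", POOL, "raidz", *devices]


def scrub():
    """Step 2: scrub the pool to detect (and repair where possible) bitrot."""
    return ["zpool", "scrub", POOL]


def snapshot(name):
    """Take a named snapshot of the dataset."""
    return ["zfs", "snapshot", f"{DATASET}@{name}"]


def replicate(prev, new):
    """Step 3: incremental send of prev..new, piped into zfs receive over ssh."""
    send = ["zfs", "send", "-i", f"{DATASET}@{prev}", f"{DATASET}@{new}"]
    recv = ["ssh", REMOTE, "zfs", "receive", REMOTE_DATASET]
    return f"{shlex.join(send)} | {shlex.join(recv)}"


if __name__ == "__main__":
    print(shlex.join(create_pool(["/dev/sda", "/dev/sdb", "/dev/sdc"])))
    print(shlex.join(scrub()))
    print(shlex.join(snapshot("2024-01-08")))
    print(replicate("2024-01-01", "2024-01-08"))
```

In practice the scrub and the snapshot-plus-send would each run from a scheduler such as cron; the pizza in step 4 is left as an exercise.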
Unless you're storing terabyte levels of data, surely it's more straightforward and more reliable to store on Backblaze or AWS Glacier? The only advantage of the DIY solution is if you value your time at zero and/or want to "homelab".
The time required to set this stuff up is...not very big.
Things like ZFS and Tailscale may sound daunting, but they're very light processes on even the most garbage-tier levels of vaguely-modern PC hardware and are simple to get working.
I have two raid1 pairs - "the old one" and "the new one" - plus a third drive the same size as the old pair. The new pair is always larger than the old pair; in the early days it was usually well over twice as big, but drive growth rates have slowed since then. About every three years I buy a new "new pair" plus a third drive, and downgrade the current "new pair" to be the "old pair". The old pair is my primary storage, and gets rsynced to a partition of the same size on the new pair. The remainder of the new pair is used for data I'm OK with not being backed up (umm, all my BitTorrented Linux isos...). The third drive is on a switched powerpoint: it spins up late Sunday night, rsyncs the data copy on the new pair, then powers back down for the week.
When building up initially, make a point of trying to stagger purchases and service entry dates. After that, chances are failures will be staggered as well, so you naturally get staggered service entry dates. You can likely hit better than 5 year time in service if you run until failure, and don't accumulate much additional storage.
But I just did a 5 year replacement, so I dunno. Not a whole lot of work to replace disks that work.
ZFS with a three way mirror will be incredibly unlikely to fail. You only need one drive for your data to survive.
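A back-of-the-envelope sketch of why: if each drive fails independently with some annual failure rate, the chance of losing all three in the same year is that rate cubed. The 2% rate below is a hypothetical placeholder, and the model ignores correlated failures (same batch, same power surge) and the fact that failed drives normally get replaced.

```python
def p_all_fail(annual_failure_rate, n_drives):
    """Probability that every drive in the mirror fails within the same year,
    assuming independent failures and no replacement of failed drives."""
    return annual_failure_rate ** n_drives


AFR = 0.02  # hypothetical 2% annual failure rate, for illustration only
for n in (1, 2, 3):
    print(n, "drive(s):", p_all_fail(AFR, n))
```

Under those assumptions a three-way mirror comes out at roughly 8 in a million per year, so correlated events rather than independent drive deaths are the realistic threat, which is what the separate backup copies are for.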
Then get a second setup exactly like this for your backup server. I use rsnapshot for that.
For your third copy you can use S3 like a block device, which means you can use an encrypted file system. Use FreeBSD for your base OS.
Disk Prices https://news.ycombinator.com/item?id=45587280 - 1 day ago, 67 comments
Hard drives you can conveniently buy as a consumer - yes. There's a difference.
thisislife2•3h ago
That said, one thing that I do find very attractive in Seagate HDDs now is that they are also offering free data recovery within the warranty period, with some models. Anybody who has lost data (i.e. idiots like me who didn't care about backups) and had to use such services knows how expensive they can be.
kvemkon•1h ago
But the warranty lasts only 5 years since the purchase of the drive, doesn't it?
thisislife2•1h ago