Corrupting a ZFS File on Purpose

https://oshogbo.com/blog/90/
54•zdw•2d ago

Comments

anonymous_user9•2d ago
> The DVA was correct, the sector math was correct, the dd command was correct. The right place, the wrong mental model.

God the intensity is tiresome. Whether or not it's AI slop, it's also bad writing. Things can be fun or interesting or worthwhile without being a harrowing battle of discovery!

ralferoo•2d ago
Hmmm, it's been a long long time since I actually had a failed drive (and also I don't use zfs), but from what I remember of my last failing drive 20 years ago, the drive was able to detect that sectors had been corrupted, and then failed the read rather than just returning silently corrupted data. If my memory is correct, replacing random bytes on disk wouldn't actually reflect the typical way data corruption manifests itself.

I always thought that the reason zfs did its extensive CRC checks was primarily to detect data corruption while it was in RAM or over the network, with a side effect that in the rare cases that data on disk got corrupted without the drive detecting it because the CRC was still valid, it'd also be spotted.

But anyway, it might be worth testing by replacing some of the disk images with actually truncated ones so that there are holes when reading, so that it returns an actual read error rather than junk data.

adrian_b•2d ago
The error-correcting codes used by HDDs/SSDs correct or detect the most frequent errors, but sometimes, when there are too many erroneous bits in a sector, they can mis-correct the data and then the HDD/SSD returns a corrupted sector without signaling any error.

I have seen this a few times on HDDs that had been used for cold storage of archival data for several years (around 5 years or even more). For each archive file, I had my own hash values, which allowed me to detect all such cases. I had duplicates of all such HDDs. Sometimes both HDD copies had a few silently corrupted sectors, but they were not in the same locations, so in all cases I could recover the corrupted files from their duplicates. If I had stored the archival data without redundancy, I would have lost it.

If you do not use hashes or other error-detecting codes for all your files, like I do, you may have had some failures in your HDDs without recognizing them, but such errors are much more likely to happen in files that have been stored for many years.
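The hash-plus-duplicate workflow described above can be sketched roughly as follows. This is only an illustration under assumptions, not the commenter's actual tooling: the function names (`sha256_of`, `build_manifest`, `repair_from_duplicate`), the choice of SHA-256, and the two-directory layout are all hypothetical. The manifest has to be built while the data is still known to be good.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def repair_from_duplicate(
    manifest: dict[str, str], primary: Path, duplicate: Path
) -> dict[str, str]:
    """Check both copies of every file against the manifest.

    A copy that fails its hash is overwritten from the other copy if that
    one still matches. Returns "ok", "repaired", or "lost" per file.
    """
    report = {}
    for rel, expected in manifest.items():
        a, b = primary / rel, duplicate / rel
        a_ok = a.is_file() and sha256_of(a) == expected
        b_ok = b.is_file() and sha256_of(b) == expected
        if a_ok and b_ok:
            report[rel] = "ok"
        elif a_ok or b_ok:
            good, bad = (a, b) if a_ok else (b, a)
            bad.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(good, bad)
            report[rel] = "repaired"
        else:
            # Both copies are damaged; only sub-file redundancy could help here.
            report[rel] = "lost"
    return report
```

As in the comment, recovery works as long as the two copies are not damaged in the same file; a file reported as "lost" is corrupted on both drives.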

ramses0•1d ago
And/Or: `*.par` files.

https://en.wikipedia.org/wiki/Parchive

wongarsu•49m ago
Or RAR files with recovery records. Same concept, but in one self-contained file instead of a number of sidecar files.

matja•1d ago
You're right that the ECC validation is very robust, but it only validates one small part: that the drive is reading back what it previously wrote. It does not show that the data was correct when it came into the drive, that the firmware handled it correctly, or even that it was written to the correct place (LBA) on the drive.

There have been times when some features of entire drive models were disabled in the Linux kernel because of buggy firmware that silently writes bad data (with correct ECC), so reading it back succeeds from the point of view of both the drive and the OS's block driver.

I was hit by this myself with the queued TRIM command firmware bug that affected all Samsung 840 EVO SSDs (Linux kernel commit 9a9324d3969678d44b330e1230ad2c8ae67acf81 if you want to look into the history) - the drive didn't report any errors, but ZFS kept reporting corruption, and kept on fixing it in the background.

throw0101c•1h ago
> I always thought that the reason zfs did its extensive CRC checks was primarily to detect data corruption while it was in RAM or over the network, with a side effect that in the rare cases that data on disk got corrupted without the drive detecting it because the CRC was still valid, it'd also be spotted.

Nope, it's always been about on-disk bit rot.

First off: drive firmware has been known to return the wrong LBA data. The OS asks for 123; the drive reads 234, verifies its drive-level CRC (which passes), and sends it up. The application gets a bundle of bits that's not correct. ZFS expects a certain checksum for that part of the tree/file, so when the data from LBA 234 is returned, it will not match the checksum recorded for 123.

Next, if you have RAID-1 and one drive has corrupted data, but you don't have higher-level FS checksums, how do you know which mirror has the correct data? They're different, but which is correct? With ZFS you know which block has the correct checksum, return that data to the application, and then use the correct data to repair the wrong one.
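Both points above can be shown with a toy model. This is a sketch under assumptions, not ZFS code or its on-disk format: `ToyMirror` and its methods are hypothetical names, the two "disks" are plain dictionaries, and SHA-256 stands in for whichever checksum a pool is configured with. The key idea it tries to capture is that the expected checksum is stored apart from the data it covers, so a block that is internally valid but comes from the wrong LBA still fails verification.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ToyMirror:
    """Two 'disks' (dicts of LBA -> bytes) plus checksums kept apart from the data."""

    def __init__(self) -> None:
        self.disks = [{}, {}]
        # LBA -> expected checksum; stands in for a checksum held by the parent block.
        self.expected = {}

    def write(self, lba: int, data: bytes) -> None:
        for disk in self.disks:
            disk[lba] = data
        self.expected[lba] = checksum(data)

    def read(self, lba: int) -> bytes:
        """Return the first copy whose checksum matches, then rewrite any bad copy."""
        want = self.expected[lba]
        good = None
        for disk in self.disks:
            data = disk.get(lba)
            if data is not None and checksum(data) == want:
                good = data
                break
        if good is None:
            raise OSError(f"no valid copy of LBA {lba}")
        for disk in self.disks:
            if disk.get(lba) != good:
                disk[lba] = good  # self-heal from the verified copy
        return good


mirror = ToyMirror()
mirror.write(123, b"block for 123")
mirror.write(234, b"block for 234")

# Simulate a misdirected read on disk 0: a request for LBA 123 yields LBA 234's
# data, which is intact on its own and would pass a drive-level check.
mirror.disks[0][123] = mirror.disks[0][234]

assert mirror.read(123) == b"block for 123"      # served from the good mirror
assert mirror.disks[0][123] == b"block for 123"  # bad copy was repaired
```

Without the separately stored checksum, the two copies would simply disagree and there would be no way to tell which one to trust.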

lanycrost•4h ago
I miss ZFS. I only had one chance to work with it in production and liked it very much. It has some performance overhead compared to journaling filesystems, but it is very well designed.

igtztorrero•35m ago
I always run my servers on a mirrored ZFS pool (RAID 1) across two NVMe drives, because when an NVMe drive fails, it fails completely. How can a file get corrupted during normal operation?
