Overall, comprehensive testing tools, techniques, and culture are grossly insufficient in FOSS: FreeBSD, Linux, and most other projects rely upon informal, under-documented, ad-hoc, meat-based scream testing rather than proper, formal verification of correctness. Although no one ever said high-confidence software engineering was easy, it's essential for avoiding entire classes of CVEs and unexpected operational bugs.
0: https://www.freebsd.org/releases/13.0R/relnotes/
1: https://lists.freebsd.org/pipermail/freebsd-fs/2018-December...
saurik•7h ago
(FWIW, I appreciate the performance impact of a full fix here might be brutal, but the suggestion of requiring boot-args opt-in for O_DIRECT in these cases should not have been ignored, as there are a ton of people who might not actively need or even be using O_DIRECT, and the people who do should be required to know what they are getting into.)
summa_tech•7h ago
vbezhenar•6h ago
saurik•6h ago
(Oh, unless you are maybe talking about something orthogonal to the fixes mentioned in the discussion thread, such as some property of the extra checksumming done by these filesystems? And so, even if the disks de-synchronize, maybe zfs will detect an error if it reads "the wrong one" off of the underlying MD RAID, rather than ending up with the other content?)
ludocode•4h ago
I run btrfs on top of mdraid in RAID6 so I can incrementally grow it while still having copy-on-write, checksums, snapshots, etc.
I hope that one day btrfs fixes its parity raid or bcachefs becomes stable enough to fully replace mdraid. In the meantime I'll continue using mdraid with a copy-on-write filesystem on top.
bananapub•3h ago
indeed out of date - that was merged a long time ago and shipped in a stable version earlier this year.
Polizeiposaune•4h ago
When the actual checksum of what was read from storage doesn't match the expected value, it will try reading alternate locations (if there are any), and it will write back the corrected block if it succeeds in reconstructing a block with the expected checksum.
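A minimal sketch of that read-verify-repair loop, for illustration only. It assumes each block has several in-memory replicas and uses CRC32 as a stand-in checksum; the function name `self_healing_read` and the list-of-replicas layout are hypothetical, and this is not how ZFS itself is implemented.

```python
import zlib


def self_healing_read(copies, expected_checksum):
    """Return the first replica whose checksum matches, repairing bad ones.

    copies: mutable list of bytes objects, one per replica (hypothetical layout).
    expected_checksum: checksum stored separately from the data, as an int.
    CRC32 is only a stand-in here, chosen because it is in the standard library.
    """
    good = None
    for data in copies:
        # Try each alternate location until one verifies.
        if zlib.crc32(data) == expected_checksum:
            good = data
            break
    if good is None:
        raise IOError("no replica matches the expected checksum")

    # Write the verified data back over any replica that failed verification.
    for i, data in enumerate(copies):
        if zlib.crc32(data) != expected_checksum:
            copies[i] = bytes(good)
    return bytes(good)


if __name__ == "__main__":
    block = b"payload"
    replicas = [b"pXyload", block]  # first replica is corrupted
    print(self_healing_read(replicas, zlib.crc32(block)))
    print(replicas[0] == block)
```

The point of the sketch is that the checksum lives apart from the data, so the reader can tell which replica is wrong instead of just noticing that two replicas differ.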
weinzierl•6h ago
No wonder O_DIRECT never saw much love.
"I hope some day we can just rip the damn disaster out."
-- Linus Torvalds, 2007
https://lkml.org/lkml/2007/1/10/235
jandrewrogers•5h ago
Something like O_DIRECT is critical for high-performance storage in software for well-understood reasons. It enables entire categories of optimization by breaking a kernel abstraction that is intrinsically unfit for purpose. There is no way to fix it in the kernel: the existence of the abstraction is the problem, as a matter of theory.
As a database performance enjoyer, I've been using O_DIRECT for 15+ years. Something like it will always exist because removing it would make some high-performance, high-scale software strictly worse.
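For anyone who hasn't used it, here is a rough sketch of what an O_DIRECT read looks like from user space on Linux. The helper name `read_direct` and the 4096-byte block size are assumptions for illustration; real code should query the device's logical block size, and some filesystems reject O_DIRECT entirely.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the device


def read_direct(path, offset=0, length=BLOCK):
    """Read `length` bytes at `offset`, bypassing the page cache (Linux only)."""
    # O_DIRECT generally requires offset, length, and buffer address to be aligned.
    if offset % BLOCK or length % BLOCK:
        raise ValueError("offset and length must be multiples of BLOCK")

    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        # An anonymous mmap is page-aligned, which satisfies the buffer alignment.
        buf = mmap.mmap(-1, length)
        try:
            n = os.preadv(fd, [buf], offset)
            return bytes(buf[:n])
        finally:
            buf.close()
    finally:
        os.close(fd)
```

The alignment bookkeeping is the price of admission: the application takes over the buffering and scheduling decisions the page cache would otherwise make for it.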
jeffbee•4h ago
vacuity•4h ago
tremon•3h ago
vacuity•2h ago
karmakaze•5h ago