Bcachefs may be headed out of the kernel

https://lwn.net/Articles/1027289/

150•ksec•7mo ago

Comments

guerrilla•7mo ago

The drama with Linux filesystems is just nuts... It never ends.

mschuster91•7mo ago

The stakes are the highest across the entire kernel. Data that's corrupt cannot (easily) be uncorrupted.

tpolzer•7mo ago

Bad drivers could brick (parts of) your hardware permanently.

Your data, on the other hand, you should have a backup of anyway.

quotemstr•7mo ago

At least Kent hasn't murdered his wife

Tostino•7mo ago

First thing that came to mind when I saw this drama.

msgodel•7mo ago

It's crazy people spend so much time paying attention to Hollywood celebrity drama.

Opens LKML archive hoping for another Linus rant.

rendaw•7mo ago

I'm sure there's just as much political all-star programmer fighting at Google/Apple/Microsoft/whatever too; it's just that this is done in public.

chasil•7mo ago

So the assertion is that users with (critical) data loss bugs need complete solutions for recovery and damage containment with all possible speed, and without this "last mile" effort, stability will never be achieved.

The objection is that even the tiniest bug-fix windows get everything but the kitchen sink.

These are both uncomfortable positions to occupy, without doubt.

koverstreet•7mo ago

No, the assertion is that the proper response to a bug often (and if it's high impact - always) involves a lot more than just the bugfix.

And the whole reason for a filesystem's existence is to store and maintain your data, so if that is what the patch is for, yes, it should be under consideration as a hotfix.

There's also the broader context: it's a major problem for stabilization if we can't properly support the people using it so they can keep testing.

More context: the kernel as a whole is based on fixed time tables and code review, which it needs because QA (especially automated testing) is extremely spotty. bcachefs's QA, both automated testing and community testing, is extremely good, and we've had bugfix patchsets either held up or turned into flamewars because of this mismatch entirely too many times.

WesolyKubeczek•7mo ago

> No, the assertion is that the proper response to a bug often (and if it's high impact - always) involves a lot more than just the bugfix.

Then what you do is you try to split your work in two. You could think of a stopgap measure or a workaround which is small, can be reviewed easily, and will reduce the impact of the bug while not being a "proper" fix, and prepare the "properer" fix when the merge window opens.

I would ask, since the bug has probably existed since the last stable release, how come it fell through the cracks and was only noticed recently? Could it be that not all setups are affected? If so, can't they live with it until the next merge window?

By making a "feature that fixes the bug for real", you greatly expand the area in which new, unknown bugs may land, with very little time to give it proper testing. This is inevitable, as is evident from the simple fact that the bug you were trying to fix exists. You can be good, but not that good. Nobody is that good. If anybody were that good, they wouldn't have had the bug in the first place.

If you have commercial clients who use your filesystem and you have contractual obligations to fix their bugs and keep their data intact, you could (I'd even say "should") maintain an out-of-tree version with its own release and bugfix schedule. This is IMO the only reasonable way to have it, because the kernel is a huge administrative machine with lots of people, and by mainlining stuff, you necessarily become co-dependent on the release schedule for the whole kernel. I think a conflict between kernel's release schedule and contractual obligations, if you have any, is only a matter of time.

koverstreet•7mo ago

> Then what you do is you try to split your work in two. You could think of a stopgap measure or a workaround which is small, can be reviewed easily, and will reduce the impact of the bug while not being a "proper" fix, and prepare the "properer" fix when the merge window opens.

That is indeed what I normally do. For example, 6.14 and 6.15 had people discovering btree iterator locking bugs (manifesting as assertion pops) while running evacuates on large filesystems (it's hard to test a sufficiently deep tree depth in virtual machine tests with our large btree nodes); some small hotfixes went out in rc kernels, but the majority of the work (a whole project to add assertions for path->should_be_locked, which should shut these down for good) waited until the 6.16 merge window.

That was for a less critical bug - your machine crashing is somewhat less severe than losing a filesystem.

In this case, we had a bug pop up in 6.15 where the link count in the VFS inode getting screwed up caused an inode to be deleted that shouldn't have been - a subvolume root - and then an untested repair path took out the entire subvolume.

Ouuuuch.

That's why the repair code was rushed; it had already gotten one filesystem back, and I'd just gotten another report of someone else hitting it - and for every bug report there are almost always more people who hit it and don't report it.

And considering that a lot of people running bcachefs now are getting it from distro kernels and don't know how to build kernels - that is why it was important to get this out quickly through the normal channels.

In addition, the patch wasn't risky, contrary to what Ted was saying. It's a code path that's very well covered by automated tests, including KASAN/UBSAN/lockdep variants - those would have exploded if this patch was incorrect.

When to ship a patch is always a judgement call, and part of how you make that call is how well your QA process can guarantee the patch is correct. Part of what was going on here is a disconnect between those of us who do make heavy use of modern QA infrastructure and those who do it the old school way, relying heavily on manual review and long testing periods for rc kernels.

WesolyKubeczek•7mo ago

> In this case, we had a bug pop up in 6.15 where the link count in the VFS inode getting screwed up caused an inode to be deleted that shouldn't have been - a subvolume root - and then an untested repair path took out the entire subvolume.

I would rather make sure this path was never hit for rc, to minimize the damage. The fact alone that it didn’t pop up until late in the 6.15 cycle could hint at some specific circumstances under which the bug manifested, and those could be described and avoided.

And I think there could be a mediocre way to get by until the next merge window in which a superior solution could be presented.

I don’t want to sound as if I’m an expert in how to do VFS, because I’m not. I’m, however, an expert in how to be “correcter than others”, which has gotten me kicked out of jobs before. I hope I have learned better since; at the time I was very, very stubborn (they wouldn’t have kicked me out otherwise).

Part of working with others is that you will likely go with solutions you deem more mediocre than the theoretically best one, or that experienced people will say “no” to your ideas or solutions and you accept it instead of seeking a quarrel, in spite of them obviously not understanding you (spoiler: this is not so). But if you show that you’re willing to work with others as a single unit, you will be listened to and appreciated, and concessions will be made for you too. This is not necessarily about the kernel; it’s about working in a team in general.

I don’t have a dog in this fight, but I’ve been “that guy” before and I regret it took me that long to realize this and mend my ways. Hope it doesn’t keep happening to you.

magicalhippo•7mo ago

While I absolutely think you're taking a stand in the wrong fights, like I don't see why you needed to push it so far on this hill in particular, I am sympathetic to your argument that experimental kernel modules like filesystems might need a different release approach at times.

At work we have our main application which also contains a lot of customer integrations. Our policy has been new features in trunk only, except if it's entirely contained inside a customer-specific integration module.

We do try to avoid it, but this does allow us to be flexible with regards to customer needs, while keeping the base application stable.

This new recovery feature was, as far as I could see, entirely contained within the bcachefs kernel code. Given the experimental status, as long as it was clearly communicated to users, I don't see a huge problem allowing such self-contained features during the RC phase.

Obviously a requirement must be that it doesn't break the build.

bombcar•7mo ago

I have seen modules and code scream at me that code needed something else - so a PR for the literal bugfix could include a message that says “RECOVERABLE SITUATION DETECTED - visit bcachefs.org/owmp for details”

Then you have details on how to obtain recovery tools. You’d only need it for one patch revision.

rewgs•7mo ago

Kent, it’s actually really simple: bcachefs is experimental. The group of people who are currently using bcachefs and who can’t wait for a data recovery tool that hasn’t existed until now contains precisely zero people.

You’re acting like bcachefs systems are storing Critical Data That Absolutely Cannot Be Lost. And yet at the same time it’s experimental. I’m just one user, but I can tell you that, even as excited as I am about bcachefs, I’m not touching it with a ten foot pole for anything beyond playing around until at least the experimental label is removed.

I imagine my position is not uncommon.

Please stop trying to die on this hill. Your project is really great and really important. I want it to succeed.

Just chill and let bug fixes be bug fixes and features be features.

koverstreet•7mo ago

I've recovered a _lot_ of data for users that didn't have backups.

It's all part of the job.

rewgs•7mo ago

Frankly, if you store important data on an experimental file system and don’t have backups, you deserve to lose it.

And I have to imagine that the people who are technical enough to not only use Linux but also use an experimental non-default file system, and yet don’t have backups of their data, are a vanishingly small group.

So no, I actually disagree: it’s not part of the job.

So again we arrive at the same place: this data recovery tool is not worth the drama.

It’s a feature, not a bug fix, and an incredibly unimportant one at that _at this stage of development_. If bcachefs weren’t experimental and were widely used, it would be a different story; I’d probably be in favor of bending the rules to get it in there faster. But that just isn’t where we are right now.

magicalhippo•7mo ago

If you have a lot of users who store data on an experimental filesystem and who don't back up said data, yet are not cool with data losses, I would say you have a serious communication issue at hand.

I have lost years of my work due to not having proper backups. I know the pain.

And I totally get you feel responsible for the data loss and want to help your users, I'm like that too with my code.

But this is an experiment. It's right there in the name.

If this feature really is as needed as you claim here, then getting it into the kernel is a mere side-issue.

In that case, your #1 priority should be to fix whatever is causing such users to install and use bcachefs without having a recovery plan that they have verified, and get existing users on the same level.

Because not doing so would be a huge disservice to those users who don't know better, and at worst borderline exploitative.

Writing recovery software is part of the job. Forcing it into the kernel to save users who are clearly not in any shape or form competent enough to partake in your experiment is definitely not part of the job.

Finding yourself in this position means something has gone very, very wrong.

Dylan16807•7mo ago

> Forcing it into the kernel to save users who are clearly not in any shape or form competent enough to partake in your experiment is definitely not part of the job.

That code needs to be there for non-experimental users later on.

The reason to push it in quickly is so it can get tested and iterated on. Saving a handful of experimental users is not the main benefit.

magicalhippo•7mo ago

> Saving a handful of experimental users is not the main benefit.

Kent himself argued otherwise here[1].

And even if that were the case, there's no need to take a stand on trying to get it into the kernel. If he gets booted out of the kernel tree, then the end result is the same for his users: they have to compile their own kernel. So it makes no sense to push this so hard.

[1]: https://www.phoronix.com/forums/forum/software/general-linux...

Dylan16807•7mo ago

> Kent himself argued otherwise here[1].

I said "main" for a reason. The current users that need to recover data are a part of the picture, but they're a few trees out of the forest.

"Please tell that to the users who lost data." is not arguing against what I said.

> And even if that were the case, there's no need to take a stand on trying to get it into the kernel. If he gets booted out of the kernel tree, then the end result is the same for his users: they have to compile their own kernel. So it makes no sense to push this so hard.

He's not giving an ultimatum here. The goal is to figure out something that works for everyone.

rewgs•7mo ago

> The reason to push it in quickly is so it can get tested and iterated on. Saving a handful of experimental users is not the main benefit.

Testing and iterating on code does not require making exceptions to the kernel development schedule.

Dylan16807•7mo ago

He makes a pretty good argument that the filesystem's never going to get done if there's only one iteration per kernel release.

I don't know what the best solution is, but it looks like it requires either exceptions or something else that gets around the schedule.

NekkoDroid•7mo ago

Well... there is the merge window where this should be added and then like 8 release candidates (as always: it depends) where he can iterate on the added code. So the statement of "only one iteration per kernel release" is just categorically wrong.

Dylan16807•7mo ago

"categorically wrong" is a rather uncharitable way to describe you and me using different connotations of the word "iterate". Especially when you pulled "on the added code" out of nowhere.

To stabilize the filesystem he needs to iterate on the code that has been there for a while, to add more debugging and fallbacks for error situations. In this case he wanted to add a new fallback.

gdevenyi•7mo ago

> You’re acting like bcachefs systems are storing Critical Data That Absolutely Cannot Be Lost.

It is to the user storing it.

rewgs•7mo ago

As I said in my reply to Kent: frankly, if you store important data on an experimental file system and don’t have backups, you deserve to lose it.

jethro_tell•7mo ago

Who’s using an experimental filesystem and risking critical data loss? Rule one of experimental file systems is to have a copy on a non-experimental file system.

bombcar•7mo ago

The biggest dirty secret of the IT world is that everyone knows you should have more backups than God, but everyone runs with an average of about zero.

jethro_tell•7mo ago

Sure, and when I go, 'I'm just going to slap this together and if it dies I'll rebuild it', I run on ext4 instead of an experimental service. If there is a reason that I need to run something 'experimental', you can bet your ass that I'm going to back things up.

shmerl•7mo ago

Maybe bcachefs should have been governed by a group of people, not a single person.

mananaysiempre•7mo ago

Committees are good-to-acceptable for keeping things going, but bad for initial design or anything requiring a coherent vision and taste. There are some examples of groups that straddled the boundary between a committee and a creative collaboration and produced good designs (Algol 60; RnRS for n ≤ 5; IIRC the design of ZFS was produced by a three-person team), but they are more of an exception, and the secret of tying together such groups remotely doesn’t seem to have been cracked. Even in the keeping things going department, a committee’s inbuilt and implicit self-preservation mechanisms can lead it to keep fiddling with things far longer than would be advisable.

shmerl•7mo ago

In this case it's more about keeping things in check and not letting one person with a habit of ignoring kernel development rules derail the whole project.

I'm not saying those concerns are wrong, but when it's causing fallout like being kicked out of the kernel, the downsides are clearly more severe than any potential benefits.

koverstreet•7mo ago

Actually, I think remote collaboration can work with the right medium and tools. For bcachefs, that's been IRC; we have an extremely active channel where we do a lot of collaborative debugging, design discussion, helping new users, etc.

I know a lot of people heavily use slack/discord these days, but personally I find the web interfaces way too busy. IRC all the way, for me.

But the problem of communicating effectively enough to produce a coherent design is very real - this goes back to Fred Brooks (Mythical Man Month). I think bcachefs turned out very well with the way the process has gone to date, and now that it's gotten bigger, with more distinct subsystems, I am very eagerly looking forward to the date when I can hand off ownership of some of those subsystems. Lately we've had some sharp developers getting involved - for the past several years it's been mainly users testing it (and some of them have gotten very good at debugging at this point).

So it's happening.

charcircuit•7mo ago

If Linux added a stable kernel module API, this wouldn't be such a huge problem, and it would be easy for bcachefs to ship as a kernel module with its own independent release schedule.

josephcsible•7mo ago

The slight benefit for out-of-tree module authors wouldn't be worth the negative effects on the rest of the kernel and on everyone else.

charcircuit•7mo ago

"slight benefit"? Having a working system after upgrading your kernel is not just a slight benefit. It's table stakes. Especially for something critical like a filesystem it should never break.

>negative effects on the rest of the kernel

Needing to design and support an API is not purely negative for kernel developers. It also gives a chance to have a proper interface for drivers to use and follow. Take a look at the Rust for Linux which keeps running into undocumented APIs that make little sense and are just whatever <insert most popular driver> does.

josephcsible•7mo ago

> Having a working system after upgrading your kernel is not just a slight benefit. It's table stakes.

We already have that, with the "don't break userspace" policy combined with all of the modules being in-tree.

> Needing to design and support an API is not purely negative for kernel developers.

Sure, it's not purely negative, but it's overall a big net negative.

> Take a look at the Rust for Linux which keeps running into undocumented APIs that make little sense and are just whatever <insert most popular driver> does.

That's an argument against a stable module API! Those things are getting fixed as they get found, but if we had a stable module API, we'd be stuck with them forever.

I recommend reading https://docs.kernel.org/process/stable-api-nonsense.html

charcircuit•7mo ago

>We already have that, with the "don't break userspace"

Bcachefs is not user space.

>with all of the modules being in-tree.

That is not true. There are out of tree modules such as ZFS.

>That's an argument against a stable module API!

My point was that there was 0 thought put into creating a good API. Additionally API could be evolved over time and have a support period if you care about being able to evolve it and deprecate the old one. And likely even with a better interface there is probably a way to make the old API still function.

josephcsible•7mo ago

> Bcachefs is not user space.

bcachefs is still in-tree.

> That is not true. There are out of tree modules such as ZFS.

ZFS could be in-tree in no time at all if Oracle would fix its license. And until they do that, it's not safe to use ZFS-on-Linux anyway, since Oracle could sue you for it.

> My point was that there was 0 thought put into creating a good API.

There is thought put into it: it's exactly what we need right now, because if what we need ever changes, we'll change the API too, thus avoiding YAGNI and similar problems.

> Additionally API could be evolved over time and have a support period if you care about being able to evolve it.

If a temporary "support period" is what you want, then just use the LTS kernels. That's already exactly what they give you.

> And likely even with a better interface there is probably a way to make the old API still function.

That's the big net negative I was mentioning and that https://docs.kernel.org/process/stable-api-nonsense.html talks about too. Sometimes there isn't a feasible way to support part of an old API anymore, and it's not worth holding the whole kernel back just for the out-of-tree modules.

yjftsjthsd-h•7mo ago

> ZFS could be in-tree in no time at all if Oracle would fix its license. And until they do that, it's not safe to use ZFS-on-Linux anyway, since Oracle could sue you for it.

IANAL, but I don't believe either of these things is true.

OpenZFS contains enough code not authored by Sun/Oracle that relicensing it now is effectively impossible.

OTOH, it is under the CDDL, which is a perfectly good open source license; AFAICT the problem, if one exists at all[0], only manifests when distributing the combination of CDDL (OpenZFS) and GPL (Linux) software. If you download CDDL software and compile it into GPL software yourself (say, with DKMS) then it should be fine because you aren't distributing it.

[0] This is a case where I'm going to really emphasize that I'm really not a lawyer and merely point out that e.g. Canonical's lawyers do seem to think CDDL+GPL is okay.

timschmidt•7mo ago

> it should be fine because you aren't distributing it.

Which excludes a vast amount of activity one might want to use Linux for which is otherwise allowed. Like selling a device with a Linux installation, distributing VM or system restore images, etc.

yjftsjthsd-h•7mo ago

Sure, I happily grant that the licensing situation is really annoying and restricts the set of safe actions. I only object to claims that all use of ZFS is legally risky.

skissane•7mo ago

> OpenZFS contains enough code not authored by Sun/Oracle that relicensing it now is effectively impossible.

I don't think so. Suppose Oracle did agree to put their code under GPLv2/CDDL dual licensing.

Then, I'm sure if you look at the non-Oracle contributors to OpenZFS, there's a few big ones and a long tail of smaller ones. Many of the big ones might be able and willing to follow Oracle's lead. Chasing down the smaller ones may be harder, but it is possible their contributions may be judged as sufficiently trivial to escape copyright protection. More substantive contributions from people who are unreachable (or unwilling/unable to consent to the relicensing) can pose a bigger issue, but it could be solved either by (a) intentionally rewriting their contributions from scratch; or (b) waiting: given enough time, there's a decent chance (a) will happen anyway just due to normal code churn, even if you don't do it intentionally for licensing reasons.

It would be a big, multi-year project, but one that other open source communities have successfully tackled, most notably LLVM – so I do think "effectively impossible" is too strong.

I think the biggest blocker is that, it is hard to motivate people to make the effort unless Oracle is on-board – and they've displayed no signs of willingness to change their position on this. I doubt Oracle will budge, but anything is possible.

Another possibility to consider – CDDL clause 4 allows the "license steward" (Sun Microsystems) [0] to release a new version, which automatically applies to all CDDL software unless the developers explicitly opt-out. I don't know if any of the OpenZFS developers have made such an explicit opt-out – but if they haven't, then Oracle could issue a new CDDL version adding a clause saying that if the covered work is ZFS or a derivative thereof, anyone is allowed to relicense it under GPLv2. Then you wouldn't even need to track down and get the consent of non-Oracle contributors. For a real historical example of something like this, witness how the FSF issued a new GFDL version just to help Wikipedia move from GFDL to Creative Commons licensing. But, again, even if this is legally possible, it is unlikely (but not impossible) that Oracle will ever cooperate in it.

Another blocker is that even if OpenZFS were relicensed as GPLv2/CDDL, that still wouldn't solve the issue that Torvalds is unlikely to agree to upstreaming it as part of the mainline Linux kernel – a massive code base written in a very different style, and having portability concerns (wanting to work on BSD/etc too) which Linux normally doesn't care about. Possibly if you forked OpenZFS, ripped out the cross-platform aspects, and rewrote it to be more like typical Linux kernel code, it might have a chance. But, will anyone be willing to make that massive investment of time and effort? And even assuming they succeeded, we'd now have two forks of ZFS (one in the Linux kernel, one for other operating systems), adding to the maintenance burden, and the risk they'd diverge over time would be high.

[0] Sun Microsystems still legally exists on paper, and probably will indefinitely, as an Oracle subsidiary – it has been renamed to Oracle America Inc – so Oracle has effectively inherited Sun's rights as CDDL license steward

charcircuit•7mo ago

>it's not safe to use ZFS-on-Linux anyway, since Oracle could sue you for it.

It's not against the license to use them together.

>If a temporary "support period" is what you want, then just use the LTS kernels. That's already exactly what they give you.

Only the Android one does. The regular LTS one has no such guarantee.

mustache_kimono•7mo ago

> And until they do that, it's not safe to use ZFS-on-Linux anyway, since Oracle could sue you for it.

This is clearly untrue. Upon what theory?

msgodel•7mo ago

Does your system have some critical out of tree driver? That should have been recompiled with the new kernel; if it wasn't, that sounds like a failure of whoever maintains the driver/kernel/distro (which may be you if you're building it yourself.)

homebrewer•7mo ago

It would also have a lot fewer FOSS drivers; neither we nor FreeBSD (which is often invoked in these complaints) would have amdgpu for example.

charcircuit•7mo ago

I would posit that making it easier to make drivers would actually have the opposite effect and result in more FOSS drivers.

>FreeBSD (which is often invoked in these complaints) would have amdgpu for example.

In such a hypothetical FreeBSD could reimplement the stable API of Linux.

throw0101d•7mo ago

> In such a hypothetical FreeBSD could reimplement the stable API of Linux.

Like it does with the userland API of Linux, which is stable:

* https://wiki.freebsd.org/Linuxulator

smcameron•7mo ago

No, every GPU vendor out there would prefer proprietary drivers, and with a stable ABI they could do it, and would; there is no question about it.

I worked for HP on storage drivers for a decade or so, and had there been a stable ABI, HP would have shipped proprietary storage drivers for everything. Even without a stable ABI, they shipped proprietary drivers at considerable effort, compiling for myriad different distro kernels. It was a nightmare, and good thing too, or there wouldn't be any open source drivers.

charcircuit•7mo ago

I never said they wouldn't. Having more and better drivers is a good thing for Linux users. It's okay for proprietary drivers to exist. The kernel isn't meant to be a vehicle to push the free software agenda.

msgodel•7mo ago

It's plenty easy to make drivers now, it's just hard to distribute them without sharing the source.

There is absolutely no good reason not to share driver source though so that's a terrible use case to optimize for.

Nextgrid•7mo ago

What's so bad about it? Windows to this day doesn't have FOSS drivers as standard and despite that is pretty successful. In practice, as long as a driver works it's fine for the vast majority of users, and you can always disassemble and binary-patch if really needed.

(it's not obvious that having to occasionally disassemble/patch closed-source drivers is worse than the collective effort wasted trying to get every single thing into the kernel and keep it up to date).

heavyset_go•7mo ago

The unstable interface is Linux's moat, and IMO, is the reason we're able to enjoy such a large ecosystem of hardware via open source operating systems.

zahlman•7mo ago

I'm afraid I don't follow your reasoning.

rcxdude•7mo ago

The interface churn in linux adds a strong incentive (on top of the GPL) to upstream drivers, i.e. publish them as open source. Not doing so tends to mean you get stuck on old versions. If it had a stable interface, hardware vendors would just release crappy binary blobs and they'd only be usable on linux, and not maintainable by anyone else (and hardware vendors don't generally maintain their drivers for long)

heavyset_go•7mo ago

With a stable driver API/ABI, vendors will just dump closed source drivers once and call it a day, or pull an Apple/Sony/Nintendo with FreeBSD, where you effectively get a closed source OS that supports your hardware.

An unstable interface means the driver source needs to be updated frequently; you can't just dump a .ko file online and expect it to work for however long the hardware lasts.

Easiest way to approach it is to attempt to upstream drivers, and potentially take advantage of free labor and maintenance in virtual perpetuity, which is good for all Linux users. If vendors don't want to spend the effort upstreaming drivers, but they need to support Linux, by necessity the drivers must be open source so they can be compiled against users' changing kernels. That's at least a step in the right direction, and should anyone want to make the effort, they're free to upstream drivers themselves.

shtripok•7mo ago

It is very difficult to get the driver included in the upstream. This is why almost none of the Chinese equipment manufacturers do this. And those that do do it for one or two models out of 2-3 dozen produced, and this process rarely takes less than 2-3 years. That is, by the time the device is included in the kernel, it is usually already out of production for a year or more.

So don't repeat these legends from 20 years ago. However, this may not have been true even 20 years ago.

heavyset_go•7mo ago

And yet, it's the case for plenty of manufacturers.

Yes, bad driver implementations that shit all over the kernel tree just to get hardware "working" should not be upstreamed. 99% of the time in these cases, Chinese equipment manufacturers can't be bothered to write acceptable code and it's a good thing it isn't mainlined and made someone else's problem.

dralley•7mo ago

I donate to Kent's patreon and I'm very enthusiastic about bcachefs.

However, Kent, if you read this: please just settle down and follow the rules. Quit deliberately antagonizing Linus. The constant drama is incredibly offputting. Don't jeopardize the entire future of bcachefs over the silliest and most temporary concerns.

If you absolutely must argue about some rule or other, then make that argument without having your opening move be to blatantly violate them and then complain when people call you out.

You were the one who wanted into the kernel despite many suggestions that it was too early. That comes with tradeoffs. You need to figure out how to live with that, at least for a year or two. Stop making your self-imposed problems everyone else's problems.

NewJazz•7mo ago

Seriously how hard is it to say "I'm unhappy users won't have access to this data recovery option but will postpone its inclusion until the next merge window". Yeah, maybe it sucks for users who want the new option or what have you, but like you said it is a temporary concern.

vbezhenar•7mo ago

Why does it suck for users? Those brave enough to use new filesystem, surely can use custom kernel for the time being, while merge effort is underway and vanilla kernel might not be the most stable option.

cwillu•7mo ago

I believe part of the problem is distributions including it in their installers without requiring any of the usual "type the words 'I KNOW WHAT I'M DOING' to proceed" warning gates that are otherwise typical.

torbid•7mo ago

It seems to me like the goal is to work around the user having to type that to use Bcachefs while implying that they will have with the standard gatekeepers to avoid any limits on adoption via quality checks.

bombcar•7mo ago

What distributions are including kernels so quickly without also including various patch sets applied? Even Gentoo layers patches on top of the recent kernels, unless you run the literal source; but if you can do that you can apply your own patch sets.

Big distros like RedHat or Ubuntu always roll patched kernels, and as far as I know they’re usually a bit long in the tooth.

int_19h•7mo ago

By this logic it should be kept out of the kernel entirely until it's stable enough for general use.

queenkjuul•7mo ago

Any distro shipping -rc releases to regular users can apply Kent's patches themselves if they think it's that important. But seriously, how many distros do that, and of those, how many of their users are on experimental filesystems?

thrtythreeforty•7mo ago

I did subscribe to his Patreon but I stopped because of this - vote with your wallet and all that. I would happily resubscribe if he can demonstrate he can work within the Linux development process. This isn't the first time this flavor of personality clash has come up.

Kent is absolutely technically capable of, and has the vision to, finally displace ext4, xfs, and zfs with a new filesystem that Does Not Lose Data. To jeopardize that by refusing to work within the well-established structure is madness.

mjevans•7mo ago

I think the others have been proven correct. It _is_ too early. It would have been better maintained in one of the other staging branches to brew and also possibly as a patch-set that could be added atop the vanilla branch.

baggy_trough•7mo ago

No matter how good the code is, Overstreet's behavior and the apparent bus factor of 1 leave me reluctant to investigate this technology.

dsp_person•7mo ago

Curious about this process. Can anyone submit patches to bcachefs and Kent is just the only one doing it? Is there a community with multiple contributors hacking on the features, or just Kent? If not, what could he do to grow this? And how does a single person receiving patreon donations affect the ability of a project like this to get past a bus factor of 1?

nolist_policy•7mo ago

Generally you need a maintainer for your subsystem who sends pull requests to Linus.

koverstreet•7mo ago

I take patches from quite a few people. If the patch looks good, I'll generally apply it.

And I encourage anyone who wants to contribute to join the IRC channel. It's not a one man show, I work with a lot of people there.

devwastaken•7mo ago

Good. There is no place for unstable developers in a stable kernel.

msgodel•7mo ago

The older I get the more I feel like anything other than the ExtantFS family is just silly.

The filesystem should do files, if you want something more complex do it in userspace. We even have FUSE if you want to use the Filesystem API with your crazy network database thing.

anonnon•7mo ago

> The older I get the more I feel like anything other than the ExtantFS family is just silly.

The extended (not extant) family (including ext4) don't support copy-on-write. Using them as your primary FS after 2020 (or even 2010) is like using a non-journaling file system after 2010 (or even 2001)--it's a non-negotiable feature at this point. Btrfs has been stable for a decade, and if you don't like or trust it, there's always ZFS, which has been stable 20 years now. Apple now has APFS, with CoW, on all their devices, while MSFT still treats ReFS as unstable, and Windows servers still rely heavily on NTFS.

msgodel•7mo ago

Again I don't really want the kernel managing a database for me like that, the few applications that need that can do it themselves just fine. (IME mostly just RDBMSs and Qemu.)

robotnikman•7mo ago

>Windows will at some point have ReFS

They seem to be slowly introducing it to the masses, Dev drives you set up on Windows automatically use ReFS

milkey_mouse•7mo ago

Hell, there's XFS if you love stability but want CoW.

josephcsible•7mo ago

XFS doesn't support whole-volume snapshots, which is the main reason I want CoW filesystems. And it also stands out as being basically the only filesystem that you can't arbitrarily shrink without needing to wipe and reformat.

leogao•7mo ago

you can always have an LVM layer for atomic snapshots

josephcsible•7mo ago

There are advantages to having the filesystem do the snapshots itself. For example, if you have a really big file that you keep deleting and restoring from a snapshot, you'll only pay the cost of the space once with Btrfs, but will pay it every time over with LVM.

shtripok•7mo ago

On some of my zfs servers, the number of snapshots (mostly periodic, rotated — hour, day, month, updates, data maintenance work) is 10-12 thousand. LVM can't do that.

kzrdude•7mo ago

there was the "old dog new tricks" xfs talk long time ago, but I suppose it was for fun and exploration and not really a sneak peek into snapshots

MertsA•7mo ago

You can shrink XFS, but only the realtime volume. All you need is xfs_db and a steady hand. I once had to pull this off for a shortened test program for a new server platform at Meta. Works great except some of those filesystems did somehow get this weird corruption around used space tracking that xfs_repair couldn't detect... It was mostly fine.

adrian_b•7mo ago

Many years ago, XFS did not support snapshots.

However, XFS has also supported snapshots for a long time.

See for example:

https://thelinuxcode.com/xfs-snapshot/

I am not sure what you mean by "whole-volume" snapshots, but I have not noticed any restrictions in the use of the XFS snapshots. As expected, they store a snapshot of the entire file system, which can be restored later.

In many decades of managing computers with all kinds of operating systems and file systems, on a variety of servers and personal computers, I have never had the need to shrink a file system. I cannot imagine how such a need can arise, except perhaps as a consequence of bad planning. It has also been many decades since I stopped using multiple partitions on a storage device, with the exception of bootable devices, which must have a dedicated partition for booting, conforming to the BIOS or UEFI expectations. For anything that was done in the ancient times with multiple partitions there are better alternatives now. With the exception of bootable USB sticks with live Linux or FreeBSD partitions, I use XFS on whole SSDs or HDDs (i.e. unpartitioned), regardless if they are internal or external, so there is never any need for changing the size of the file system.

Even so, copying a file system to an external device, reformatting the device and copying the file system back is not likely to be significantly slower than shrinking in place. In fact sometimes it can be faster and it has the additional benefit that the new copy of the file system will be defragmented.

Much more significant than the lack of shrinking ability, which may slow down a little something that occurs very seldom, is that both EXT4 and XFS are much faster for most applications than the other file systems available for Linux, so they are fast for the frequent operations. You may choose another file system for other reasons, but choosing it for making faster a very rare operation like shrinking is a very weak reason.

CoolCold•7mo ago

I definitely met several cases where support for shrinking would be beneficial - usually something about migrations and things like that, but yet I agree it's quite rare operation. Benefits come with lower amount of downtime window and/or expenses in time and duplicating systems.

I.e. back in ~ 2013-2014 while moving some baremetal Windows server into VMware, shrinking and then optimizing MFT helped to save AFAIR 2 hours of downtime window.

> except perhaps as a consequence of bad planning

Assuming people go to Clouds instead of physical servers because they may need to add 100 more nodes "suddenly" - selling point of Clouds is "avoid planning" - one may expect cases of need of shrinking are rising, not falling. It may be mitigated by different approaches of course - i.e. often it's easier to resetup VM, but yet.

adrian_b•7mo ago

I do not see the connection between shrinking and migrations.

In migrations you normally copy the file system elsewhere, to the cloud or to different computers, you do not shrink it in place, which is what XFS cannot do. Unlike with Windows, copying Linux file systems, including XFS, during migrations to different hardware is trivial and fast. The same is true for replicating a file system to a big set of computers.

Shrinking in place is normally needed only when you share a physical device between 2 different operating systems, which use incompatible file systems, e.g. Windows and Linux, and you discover that you did not partition well the physical device and you want to shrink the partition allocated for one of the operating systems, in order to be able to expand the partition allocated for the other operating system.

Sharing physical devices between Windows and any other operating systems comes with a lot of risks and disadvantages, so I strongly recommend against it. I have stopped sharing Windows disks decades ago. Now, if I want to use the same computer in Windows and in another operating system, e.g. Linux or FreeBSD, I install Windows on the internal SSD, and, when desired, I boot Linux or FreeBSD from an external SSD. Thus the problem of reallocating a shared SSD/HDD by shrinking a partition never arises.

CoolCold•7mo ago

> Now, if I want to use the same computer in Windows and in another operating system, e.g. Linux or FreeBSD, I install Windows on the internal SSD

I'm not sure I've ever seen any server which had dualboot of this sort - meaning production systems, not tests/homelabs of course. Usually it's either Linux or Windows, and never FreeBSD (it's dead basically, over last 15 years at least).

That sounds more like desktop/laptop usage case, where experimenting can happen and planning is out of equation, cuz it's well, experimenting.

> I do not see the connection between shrinking and migrations.

You may think of "changes" as a wider term in addition to migrations - be it changing underlying drives under DB partition or need to free some space in the VG and LV to be able to use LVM snapshots (a looooooot of systems I see allocate all the space in VG at once and then cannot use snapshots because there is literally no space available) or some webhosting like cPanel/Plesk managed need more/less space for /var/{mail,you_name_it} and so on.

Again, there could be more reallife stories with XFS, but well it was not an option in many cases in the past. Nowadays, at least in my bubble, it's usually something clustered and you can do migration/changes on node-by-node basis and downtime window is avoided on another level, not by FS means.

leogao•7mo ago

btrfs has eaten my data within the last decade. (not even because of the broken erasure coding, which I was careful to avoid!) not sure I'm willing to give it another chance. I'd much rather use zfs.

bombcar•7mo ago

I used reiserfs for awhile after I noticed it eating data (tail packing for the power loss) but quickly switched to xfs when it became available.

Speed is sometimes more important than absolute reliability, but it’s still an undesirable tradeoff.

NewJazz•7mo ago

CoW is an efficiency gain. Does it do anything to ensure data integrity, like journaling does? I think it is an unreasonable comparison you are making.

webstrand•7mo ago

I use CoW a lot just managing files. It's only an efficiency gain if you have enough space to do the data-copying operation. And that's not necessarily true in all cases.

Being able to quickly take a "backup" copy of some multi-gb directory tree before performing some potentially destructive operation on it is such a nice safety net to have.

It's also a handy way to backup file metadata, like mtime, without having to design a file format for mapping saved mtimes back to their host files.

anonnon•7mo ago

> CoW is an efficiency gain.

You're thinking of the optimization technique of CoW, as in what Linux does when spawning a new thread or forking a process. I'm talking about it in the context of only ever modifying copies of file system data and metadata blocks, for the purpose of ensuring file system integrity, even in the context of sudden power loss (EDIT: wrong link): https://www.qnx.com/developers/docs/8.0/com.qnx.doc.neutrino...

If anything, ordinary file IO is likely to be slightly slower on a CoW file system, due to it always having to copy a block before said block can be modified and updating block pointers.

throw0101d•7mo ago

> Does it do anything to ensure data integrity, like journaling does?

What kind of journaling though? By default ext4 only uses journaling for metadata updates, not data updates (see "ordered" mode in ext4(5)).

So if you have a (e.g.) 1000MB file, and you update 200MB in the middle of it, you can have a situation where the first 100MB is written out and the system dies with the other 100MB vanishing.

With a CoW, if the second 100MB is not written out and the file sync'd, then on system recovery you're back to the original file being completely intact. With ext4 in the default configuration you have a file that has both new-100MB and stale-100MB in the middle of it.

The updating of the file data and the metadata are two separate steps (by default) in ext4:

* https://www.baeldung.com/linux/ext-journal-modes

* https://michael.kjorling.se/blog/2024/ext4-defaulting-to-dat...

* https://fy.blackhats.net.au/blog/2024-08-13-linux-filesystem...

Whereas with a proper CoW (like ZFS), updates are ACID.
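A toy model of the difference, for what it's worth. This is only a sketch: the block list, the `crash_after` knob, and both function names are made up for illustration, and the real ext4 and ZFS write paths are far more involved.

```python
# Toy model: a "file" is a list of blocks, and a crash stops the write partway.

def overwrite_in_place(blocks, start, new_blocks, crash_after):
    """Update blocks in place; stop after `crash_after` blocks to mimic a power cut."""
    blocks = list(blocks)
    for i, b in enumerate(new_blocks):
        if i == crash_after:
            break  # power lost here, earlier blocks are already overwritten
        blocks[start + i] = b
    return blocks

def overwrite_cow(blocks, start, new_blocks, crash_after):
    """Stage the new version separately; switch to it only if every block landed."""
    staged = list(blocks)
    for i, b in enumerate(new_blocks):
        if i == crash_after:
            return list(blocks)  # crash before commit: the old version is still live
        staged[start + i] = b
    return staged  # commit: one switch from the old version to the new one

old = ["old"] * 10
new = ["new"] * 2

torn = overwrite_in_place(old, 4, new, crash_after=1)  # mixed new and stale blocks
safe = overwrite_cow(old, 4, new, crash_after=1)       # unchanged old file
done = overwrite_cow(old, 4, new, crash_after=2)       # no crash: fully updated
```

With `crash_after=1` the in-place version ends up with one new block sitting next to one stale block, while the CoW version still shows the old file in full.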

ryao•7mo ago

Large file writes are an exception in ZFS. They are broken into multiple transactions, which can go into multiple transaction groups, such that the updates are not ACID. You can see this in the code here:

https://github.com/openzfs/zfs/blob/6af8db61b1ea489ade2d5344...

Small writes on ZFS are ACID. If ZFS made large writes ACID, large writes could block the transaction group commit for arbitrarily long periods, which is why it does not. Just imagine writing a 1PB file. It would likely take a long time (days?) and it is just not reasonable to block the transaction group commit until it finishes.

That said, for your example, you will often have all of the writes go into the same transaction group commit, such that it becomes ACID, but this is not a strict guarantee. The maximum atomic write size on ZFS is 32MB, assuming alignment. If the write is not aligned to the record size, it will be smaller, as per:

https://github.com/openzfs/zfs/blob/6af8db61b1ea489ade2d5344...

ryao•7mo ago

In what way do you consider CoW to be an efficiency gain? Traditionally, it is considered more expensive due to write amplification. In place filesystems such as XFS tend to be more efficient in terms of IOPs and CoW filesystems need to do many tricks to be close to them.

As for ensuring data integrity, I cannot talk about other CoW filesystems, but ZFS has an atomic transaction commit that relies on CoW. In ZFS, your changes either happened or they did not happen. The entire file system is a giant merkle tree and every change requires that all nodes of the tree up to the root be rewritten. To minimize the penalty of CoW, these changes are aggregated into transaction groups that are then committed atomically. Thus, you simultaneously have both the old and new versions available, plus possibly more than just 1 old version. ZFS will start recycling space after a couple transaction group commits, but often, you can go further back in its history if needed after some catastrophic event, although ZFS makes no solid guarantee of this (until you fiddle with module parameter settings to prevent reclaim from being so aggressive).

If it counts for anything, I have hundreds of commits in OpenZFS, so I am fairly familiar with how ZFS works internally.
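To make the "rewrite every node up to the root" idea concrete, here is a minimal sketch of a two-leaf merkle tree updated copy-on-write style. The `Node`, `leaf`, `branch`, and `update_left_leaf` names are hypothetical and this is not the ZFS on-disk format, just the general shape of the technique.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)  # frozen: nodes are never modified after creation
class Node:
    left: Optional["Node"]
    right: Optional["Node"]
    data: bytes
    digest: str

def leaf(data: bytes) -> Node:
    return Node(None, None, data, hashlib.sha256(data).hexdigest())

def branch(left: Node, right: Node) -> Node:
    # A parent's digest covers its children's digests, so any change below
    # forces a new parent all the way up to the root.
    d = hashlib.sha256((left.digest + right.digest).encode()).hexdigest()
    return Node(left, right, b"", d)

def update_left_leaf(root: Node, data: bytes) -> Node:
    """Return a new root; the old root and the untouched right subtree are reused."""
    return branch(leaf(data), root.right)

old_root = branch(leaf(b"a"), leaf(b"b"))
new_root = update_left_leaf(old_root, b"c")
```

After the update both `old_root` and `new_root` are valid trees, which is the property that lets the old version stay readable until its space is reclaimed.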

tbrownaw•7mo ago

> The extended (not extant) family (including ext4)

I read that more as "we have filesystems at home, and also get off my lawn".

zahlman•7mo ago

... NTFS does copy-on-write?

... It does hard links? After checking: It does hard links.

... Why didn't any programs I had noticeably take advantage of that?

anonnon•7mo ago

> NTFS does copy-on-write?

No, it doesn't. Maybe you're thinking of shadow volume copies or something else. CoW files systems never modify data or metadata blocks directly, only modifying copies, with the root of the updated block pointer graph only updated after all other changes have been synced. Read this: https://www.qnx.com/developers/docs/8.0/com.qnx.doc.neutrino...

zahlman•7mo ago

>No, it doesn't. Maybe you're thinking of shadow volume copies or something else.

I was asking, because didn't know, and I thought the other person was implying that it did.

I know what copy-on-write is.

anonnon•7mo ago

The "other person" (only mention of NTFS) is me, here:

> while MSFT still treats ReFS as unstable, and Windows servers still rely heavily on NTFS.

By this I implied it's an embarrassment to MSFT that iOS devices have a better, more reliable file system (APFS) than even Windows servers now (having to rely on NTFS until ReFS is ready for prime time). If HN users and mods didn't tone-police so heavily, I could state things more frankly.

yjftsjthsd-h•7mo ago

I mean, I'd really like some sort of data error detection (and ideally correction). If a disk bitflips one of my files, ext* won't do anything about it.

timewizard•7mo ago

> some sort of data error detection (and ideally correction).

That's pretty much built into most mass storage devices already.

> If a disk bitflips one of my files

The likelihood and consequence of this occurring is in many situations not worth the overhead of adding additional ECC on top of what the drive does.

> ext* won't do anything about it.

What should it do? Blindly hand you the data without any indication that there's a problem with the underlying block? Without an fsck what mechanism do you suppose would manage these errors as they're discovered?

throw0101d•7mo ago

>> > some sort of data error detection (and ideally correction).

> That's pretty much built into most mass storage devices already.

And ZFS has shown that it is not sufficient (at least for some use-cases, perhaps less of a big deal for 'residential' users).

> The likelihood and consequence of this occurring is in many situations not worth the overhead of adding additional ECC on top of what the drive does.

Not worth it to whom? Not having the option available at all is the problem. I can do a zfs set checksum=off pool_name/dataset_name if I really want that extra couple percentage points of performance.

> Without an fsck what mechanism do you suppose would manage these errors as they're discovered?

Depends on the data involved: if it's part of the file system tree metadata there are often multiple copies even for a single disk on ZFS. So instead of the kernel consuming corrupted data and potentially panicking (or going off into the weeds) it can find a correct copy elsewhere.

If you're in a fancier configuration with some level of RAID, then there could be other copies of the data, or it could be rebuilt through ECC.

With ext*, LVM, and mdadm no such possibility exists because there are no checksums at any of those layers (perhaps if you glom on dm-integrity?).

And with ZFS one can set copies=2 on a per-dataset basis (perhaps just for /home?), and get multiple copies strewn across the disk: won't save you from a drive dying, but could save you from corruption.

yjftsjthsd-h•7mo ago

> (perhaps if you glom on dm-integrity?).

I looked at that, in hopes of being able to protect my data. Unfortunately, I considered this something of a fatal flaw:

> It uses journaling for guaranteeing write atomicity by default, which effectively halves the write speed.

- https://wiki.archlinux.org/title/Dm-integrity

timewizard•7mo ago

> it can find a correct copy elsewhere.

Which implies you can already correct errors through a simple majority mechanism.

> or it could be rebuilt through ECC.

So just by having the appropriate level of RAID you automatically solve the problem. Why is this in the fs layer then?

yjftsjthsd-h•7mo ago

> Which implies you can already correct errors through a simple majority mechanism.

I don't think so? You set copies=2, and the disk says that your file starts with 01010101, except that the second copy says your file starts with 01010100. How do you tell which one is right? For that matter, even with only one copy ex. ZFS can tell that what it has is wrong even if it can't fix it, and flagging the error is still useful.

> So just by having the appropriate level of RAID you automatically solve the problem. Why is this in the fs layer then?

Similarly, you shouldn't need RAID to catch problems, only (potentially) to correct them. I do agree that it doesn't necessarily have to be in the FS layer, but AFAIK Linux doesn't have any other layers that do a good job of it (as mentioned above, dm-integrity exists but halving the write speed is a pretty big problem).
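A rough sketch of why a stored checksum settles the "which copy is right" question where a plain two-way comparison cannot. The `read_with_verify` helper and the single-byte payloads are invented for illustration; SHA-256 stands in for whatever checksum a real filesystem would use.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def read_with_verify(copies, expected):
    """Return the first copy whose checksum matches; fail the read if none does."""
    for data in copies:
        if checksum(data) == expected:
            return data
    raise IOError("checksum mismatch on all copies")

good = bytes([0b01010101])
flipped = bytes([0b01010100])   # same byte with the last bit flipped
expected = checksum(good)       # recorded when the data was written

recovered = read_with_verify([flipped, good], expected)
```

With only the flipped copy available the same call raises `IOError`, which is the "can detect it even if it can't fix it" case.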

timewizard•7mo ago

> I don't think so?

The disk is going to report an uncorrected error for one of them.

throw0101d•7mo ago

> The disk is going to report an uncorrected error for one of them.

Empirical evidence has shown otherwise: I have regularly gotten checksum error reports that ZFS has complained about during a scrub.

The ZFS developers have said in interviews that disks, when asked for LBA 123, have returned the contents of LBA 234 (due to disk firmware bugs): the on-disk checksum for 234 is correct, and so the bits were passed up the stack, but that's not the data that the kernel/ZFS asked for. It is only by verifying at the file system layer that the problem was caught (because at the disk layer things were "fine").

A famous paper that used Google's large quantity of drives as a 'sample population' mentions file system-level checks:

* https://www.cs.toronto.edu/~bianca/papers/fast08.pdf

See also the Google File System paper (§5.2 Data Integrity):

* https://research.google/pubs/the-google-file-system/

Trusting drives is not wise.

shtripok•7mo ago

Let's revert your question: why should raid be a separate level at all?

throw0101d•7mo ago

> Why is this in the fs layer then?

Define "fs layer". ZFS has multiple layers with-in it:

The "file system" that most people interact with (for things like homedirs) is actually a layer with-in ZFS' architecture, and is called the ZFS POSIX layer (ZPL). It exposes a POSIX file system, and takes the 'traditional' Unix calls and creates objects. Those objects are passed to the Data Management Unit (DMU), which then passes them down to the Storage Pool Allocator (SPA) layer, which actually manages the striping, redundancy, etc.

* https://ibug.io/blog/2023/10/zfs-block-size/

There was a bit of a 'joke' back in the day about ZFS being a "layering violation" because it subsumed into itself RAID, volume management, and a file system, instead of having each in separate software packages:

* https://web.archive.org/web/20070508214221/https://blogs.sun...

* https://lildude.co.uk/zfs-rampant-layering-violation

The ZPL is not used all the time: one can create a block device ("zvol") and put swap or iSCSI on it. The Lustre folks have their own layer that hooks into the DMU and doesn't bother with POSIX semantics:

* https://wiki.lustre.org/ZFS_OSD_Hardware_Considerations

* https://www.eofs.eu/wp-content/uploads/2024/02/21_andreas_di...

ars•7mo ago

> The likelihood .. of this occurring

That's 10^14 bits for a consumer drive. That's just 12TB. A heavy user (lots of videos or games) would see a bit flip a couple times a year.

magicalhippo•7mo ago

I do monthly scrubs on my NAS, I have 8 14-20TB drives that are quite full.

According to that 10^14 metric I should see read errors just about every month. Except I have just about zero.

Current disks are ~4 years, runs 24/7, and excluding a bad cable incident I've had a single case of a read error (recoverable, thanks ZFS).

I suspect those URE numbers are made by the manufacturers figuring out they can be sure the disk will do 10^14, but they don't actually try to find the real number because 10^14 is good enough.
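Rough numbers behind that, assuming purely for illustration that each scrub reads eight 14 TB drives in full and that the spec means one unrecoverable error per 10^14 (or 10^15) bits read on average. The function name and the drive sizes are assumptions, not measurements.

```python
def expected_read_errors(bytes_read: int, bits_per_error: int) -> float:
    """Expected unrecoverable read errors if the spec were a true average rate."""
    return bytes_read * 8 / bits_per_error

TB = 10**12
scrub_bytes = 8 * 14 * TB  # assumed: eight 14 TB drives, read in full per scrub

per_scrub_1e14 = expected_read_errors(scrub_bytes, 10**14)
per_scrub_1e15 = expected_read_errors(scrub_bytes, 10**15)
bytes_per_error_1e14 = 10**14 // 8  # 12.5 TB read per expected error
```

Under those assumptions the 10^14 figure would predict roughly nine errors per scrub and the 10^15 figure a bit under one, so near-zero observed errors over years is well below either spec taken as an average.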

ars•7mo ago

If you are using enterprise drives those are 10^16, so that might explain it.

magicalhippo•7mo ago

Fair, newest ones are, but two of my older current drives are IronWolfs 16TB which are 10^15 in the specs[1], and they've been running for 5.4 years. Again without any read errors, monthly scrubs, and of course daily use.

And before that I have been using 8x WD Reds 3TB for 6-7 years, which have 10^14 in the specs[2], and had the same experience with those.

Yes smaller size, but I ran scrubbing on those biweekly, and over so many years?

[1]: https://www.seagate.com/files/www-content/datasheets/pdfs/ir...

[2]: https://documents.westerndigital.com/content/dam/doc-library...

ryao•7mo ago

> I suspect those URE numbers are made by the manufacturers figuring out they can be sure the disk will do 10^14, but they don't actually try to find the real number because 10^14 is good enough.

I am inclined to agree. However, I have one thought to the contrary. When a mechanical drive is failing, you tend to have debris inside the drive hitting the platters, causing damage that creates more debris, accelerating the drive’s eventual death, with read errors becoming increasingly common while it happens. When those are included in averages, the 10^14 might very well be accurate. I have not done any rigorous analysis to justify this thought and I do not have the data to be able to do that analysis. It is just something that occurs to me that might justify the 10^14 figure.

Dylan16807•7mo ago

I'm not really sure how you're supposed to interpret those error rates. The average read error probably has a lot more than 1 flipped bit, right? And if the average error affects 50 bits, then you'd expect 50x fewer errors? But I have no idea what the actual histogram looks like.

timewizard•7mo ago

Is that raw error rate or uncorrected error rate?

yjftsjthsd-h•7mo ago

To your first couple points: I trust hardware less than you.

> What should it do? Blindly hand you the data without any indication that there's a problem with the underlying block?

Well, that's what it does now, and I think that's a problem.

> Without an fsck what mechanism do you suppose would manage these errors as they're discovered?

Linux can fail a read, and IMHO should do so if it cannot return correct data. (I support the ability to override this and tell it to give you the corrupted data, but certainly not by default.) On ZFS, if a read fails its checksum, the OS will first try to get a valid copy (ex. from a mirror or if you've set copies=2), and then if the error can't be recovered then the file read fails and the system reports/records the failure, at which point the user should probably go do a full scrub (which for our purposes should probably count as fsck) and restore the affected file(s) from backup. (Or possibly go buy a new hard drive, depending on the extent of the problem.) I would consider that ideal.

eptcyka•7mo ago

Bitflips in my files? Well, there’s a high likelihood that the corruption won’t be too bad. Bit flips in the filesystem metadata? There’s a significant chance all of the data is lost.

msgodel•7mo ago

Anything important should really be stored in some sort of distributed system that uses eg merkle trees. If the file system also did that you'd be doing it twice which would be annoying.

Anything unimportant is really just being cached and it's probably fine if it gets corrupted.

heavyset_go•7mo ago

Transparent compression, checksumming, copy-on-write, snapshots and virtual subvolumes should be considered the minimum default feature set for new OS installations in TYOOL 2025.

You get that with APFS by default on macOS these days and those features come for free in btrfs, some in XFS, etc on Linux.

riobard•7mo ago

APFS checksums only fs metadata not user data which is a pita. Presumably because APFS is used on single drive systems and there’s no redundancy to recover from anyway. Still, not ideal.

vbezhenar•7mo ago

Apple trusts their hardware to do their own checksums properly. Modern SSD uses checksums and parity codes for blocks. SATA/NVMe include checksums for protocol frames. The only unreliable component is RAM, but FS checksums can't help here, because RAM bit likely will be flipped before checksum is calculated or after checksum is verified.

riobard•7mo ago

If they do trust their hardware, APFS won’t need to checksum fs metadata either, so I guess they don’t trust it well enough? Also I have external drives that are not Apple sanctioned to store files and I don’t trust them enough either, and there’s no choice of user data checksumming at all.

1over137•7mo ago

Apple does not care about your external non-Apple drives. In the slightest.

londons_explore•7mo ago

Most SSDs can't be trusted to maintain proper data ordering in the case of a sudden power off.

That makes checksums and journals of only marginal usefulness.

I wish some review website would have a robot plug and unplug the power cable in a test rig for a few weeks and rate which SSD manufacturers are robust to this stuff.

Quekid5•7mo ago

I'd say it makes checksums even more important so that you know whether something got corrupted immediately and not after a year (or whatever) has gone by and you actually need it.

londons_explore•7mo ago

The problem is that if the SSD suffers a power failure and reverts a 1 megabyte block of metadata to the way it was yesterday, the filesystem won't see that as corruption - since all the checksums will match.

Yet all the pointers in that metadata will point to data which no longer exists, and your filesystem will be destroyed.

Quekid5•7mo ago

So... metadata checksums? I mean if enough data/metadata gets corrupted you're SOL either way.

For example, in ZFS the metadata is checksummed such that pointers to the data carry the checksum for that data.
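A small sketch of why keeping the checksum in the parent pointer matters for the "block reverted to yesterday's contents" case. The dict layout and the helper names are hypothetical, and this is not the actual ZFS block pointer format.

```python
import hashlib

def digest(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()

def make_block(payload: bytes) -> dict:
    # A block that carries its own checksum alongside its payload.
    return {"payload": payload, "self_sum": digest(payload)}

def self_check(block: dict) -> bool:
    """Checksum stored inside the block: only proves the block is self-consistent."""
    return digest(block["payload"]) == block["self_sum"]

def parent_check(pointer: dict, block: dict) -> bool:
    """Checksum stored in the parent pointer: proves it is the block we expected."""
    return digest(block["payload"]) == pointer["child_sum"]

yesterday = make_block(b"metadata v1")
today = make_block(b"metadata v2")
pointer = {"child_sum": digest(today["payload"])}  # parent was updated for v2

on_disk = yesterday  # the drive silently handed back the stale version

stale_passes_self_check = self_check(on_disk)
stale_passes_parent_check = parent_check(pointer, on_disk)
```

The stale block passes its own check but fails the parent's, so the revert shows up as a checksum error instead of being followed as valid metadata. The same reasoning applies when a drive returns the contents of the wrong LBA.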

criticalfault•7mo ago

I've been following this for a while now.

Kent is in the wrong. Having a lead position in development I would kick Kent off the team.

One thing is to challenge things. What Kent is doing is something completely different. It is obvious he introduced a feature, not only a Bugfix.

If the rules are set in a way that rc1+ gets only Bugfixes, then this is absolutely clear what happens with the feature. Tolerating this once or twice is ok, but Kent is doing this all the time, testing Linus.

Linus is absolutely in the right to kick this out and it's Kent's fault if he does so.

Pet_Ant•7mo ago

Why take it out of the kernel? Why not just make someone responsible the maintainer so they can say "no, next release" to his shenanigans? It can't be the license.

nolist_policy•7mo ago

Kent can appoint a suitable maintainer if he wishes. That's his job, not Linus'.

criticalfault•7mo ago

This is for me unclear as well, but I'm saying I wouldn't hold it against Linus if he did this. And based on Kent's behavior he has full right to do so.

A way to handle this would be with one person (or more) in between Kent and Linus. And maybe a separate tree only for changes and fixes from bcachefs that those people in between would forward to Linus. A staging of sorts.

tliltocatl•7mo ago

Maintainers aren't getting paid and so cannot be "appointed". Someone must volunteer - and most people qualified and motivated enough are already doing something else.

timewizard•7mo ago

Presumably there would be an open call where people would nominate themselves for consideration. These are problems that have come up and been solved in human organizations for hundreds of years before the kernel even existed.

xorcist•7mo ago

There is no call. Anyone can volunteer at any time.

Software takes up no space and there is no scarcity. Theoretically there could be any number of maintainers and what gets uptake is the de facto upstream. That's what people refer to when they talk about free software development in terms of meritocracy.

timewizard•7mo ago

How would they know to volunteer? Are you saying I can perform a hostile volunteering to take over for a maintainer who does not want to give up the project? I don't think you understood what was meant.

cwillu•7mo ago

Anyone remotely suitable would be active on the lkml.

pmarreck•7mo ago

This can happen with primadonna devs who haven't had to collaborate in a team environment for a long time.

It's a damn shame too because bcachefs has some unique features/potential

rob_c•7mo ago

And a honking great bus factor of Kent deciding enough is enough and having a tantrum. You couldn't and shouldn't trust critical data to such a scenario

bombcar•7mo ago

There’s no harm doing it - if the thing actually works! Kent getting that lass metro pass wouldn’t cause your file system to immediately corrupt and delete itself.

What you want to avoid is becoming dependent on continued development of it - but unless you’re particularly using some specific feature of the file system that none other provide you’ll have time to migrate off it.

Even reiserfs didn’t cease to operate.

tremon•7mo ago

The reiserfs code was stable and in maintenance mode. All new development effort was going into reiser4, which absolutely did die off. IIRC a few developers (that were already working on it) tried to continue the development, but it was abandoned due to lack of support and funds.

In terms of maturity, bcachefs is closer to production quality than reiser4 was, but it's still closer to reiser4 than reiserfs in its lifecycle.

koverstreet•7mo ago

we're further along than btrfs in "will it keep my data"

tremon•7mo ago

Fair enough, I have no practical experience with bcachefs myself.

koverstreet•7mo ago

Fair :) I've been trying to keep this thing (relatively) quiet and low profile until it's completely done, but it's gotten hyped.

Data integrity, core IO paths, all the failure modes, all the crazy repair corner cases - these are the hard parts if you really want to get a filesystem right. These are what we've been taking our time on.

I can't claim 100% success rate w.r.t. data loss, but it's been phenomenally good, with some crazy stories of filesystems we've gotten back that you'd never expect - and then it just becomes the norm.

I love the crazy bug reports that really push the boundaries of our capabilities.

That's an attitude that reiserfs and btrfs never had, and when I am confident that it is 100% rock solid and bulletproof I'll be lifting the experimental label.

sroussey•7mo ago

> I have no practical experience with bcachefs myself.

Who does?

When the MySQL and Postgres projects recommend it, I’ll have a look.

koverstreet•7mo ago

That's still a ways off, but it is worth noting that bcachefs handles database workloads in cow mode with no issue.

jcalvinowens•7mo ago

> we're further along than btrfs in "will it keep my data"

Honestly Kent, this continuing baseless fearmongering from you about btrfs is absolutely disgusting.

It costs you. I was initially very interested in bcachefs, but I will never spend my time testing it or contributing to it as long as you continue to behave this way. I'm certain there are many many others who would nominally be very interested, but feel the same way I do.

Your filesystem charitably gets 0.001% the real world testing btrfs does. To claim it is more reliable than btrfs is ignorant and naive.

Maybe it actually is more reliable in the real world (press X to doubt...), but you can't possibly know yet, and you won't know for a long time.

rob_c•7mo ago

I'm happy to support that bcache may have a stable on disk format, but the lashing out at the alternatives is another example of behaviour I'd prefer to see dropped.

If your product is so great, it's its own advert. If it has problems, spend the limited person power fixing it, not attacking the opposition; this is what ciq have done, do better.

koverstreet•7mo ago

We have documented, in this very thread, issues with multi device setups that btrfs has that bcachefs does not - and btrfs developers ignoring these issues.

This isn't baseless fearmongering, this is "did you think through the failure modes when you were designing the basics".

This stuff comes up over, and over, and over.

Engineering things right matters, and part of that absolutely is comparing and documenting approaches and solutions to see what went right and what went wrong.

This isn't a popularity contest, and this isn't high school where we form into cliques and start slinging insults.

Come up with facts, documentation, analysis. That's what we do. I'm tired of these threads degenerating into flamewars.

kzrdude•7mo ago

(That's impressive, but the real world user pool is much smaller, isn't it? It still sounds more like a proud brag than something proven by workload.)

I am not a filesystems guy, but I was disappointed when I realized that btrfs did not have a good design for ENOSPC handling.

So I'm curious, does bcachefs design for a benign failure mode when out of space?

koverstreet•7mo ago

We have enough user reports of multi device testing that they put both bcachefs and btrfs through, where bcachefs consistently survives where btrfs does not. We have much better repair and recovery, with real defense in depth.

Now: I am not saying that bcachefs is yet trouble free enough for widespread deployment, we're still seeing cases where repair needs fairly minor work, but the filesystem may be offline while we get that fixed.

OTOH we also recover, regularly, from absolutely crazy scenarios involving hardware failure: flaky controllers, lightning strikes, I've seen cases where it looked like a head crashed - took out a whole bunch of btree nodes in similar LBA ranges.

IOW: the fundamentals are very solid, but keep checking back if you're wondering when it'll be ready for widespread deployment.

Milestones to watch for:

- 3 months with zero data loss or downtime events: I think we may get this soon, knock on wood

- stable backports starting: 6.17, knock on wood (or maybe we'll be out of the kernel and coming up with our own plan, who knows)

- weird application specific bugs squashed: these have been on the back burner, but there's some weird stuff to get to still (e.g. weird unlink behavior that affects docker and go builds, and the Rust people just reported something odd when building certain packages with mold).

And yes, we've always handled -ENOSPC gracefully.

motorest•7mo ago

> We have enough user reports of multi device testing that they put both bcachefs and btrfs through, where bcachefs consistently survives where btrfs does not. We have much better repair and recovery, with real defense in depth.

Are any of these claims verifiable, or even made by anyone other than yourself?

Frankly, without substantiating any claim or even providing any concrete evidence, it reads like trying to badmouth what you perceive as competitors in a desperate attempt to get some traction. Not cool.

koverstreet•7mo ago

There's a lot of user feedback out there, try the mailing lists.

rob_c•7mo ago

Yet few quantified benchmarks, few examples, few written up discussions and few technical documents to serve as a fixed reference.

I'm not saying everything needs to be a bullet point presentation but in the era of llms to help plug the gaps with this stuff, "look at the mailing list" or "watch irc" isn't much better than anecdotal.

Again, recovering data, laudable. Hell, maybe impressive to compare the situation to equivalents on other fs to discuss why this is better than X. But a simple "we get the data back compared to Y" just reads as Y bashing unless there's metrics, or clear technical reasons as to why you're superior.

I _want_ bcache to be better for numerous reasons. Everyone wins from a better product. But realistically getting there means telling some people that they need to wait.

Frankly if there is a need to support users who demand mainline access to the latest and greatest _NOW_. Adopt/appoint a supported os/distro and roll your own nightly packages into a simple repo. If you have to rush upstream because a user can't cope with "pull your kernel sources from https://...." They can't cope with compiling it correctly so there's little (if anything) to be gained from rushing into mainline next week constantly...

koverstreet•7mo ago

Benchmarks? I hope you mean detailed writeups on robustness, because performance is not a consideration yet.

I agree that more thoughtful analysis would be helpful, but I have to work with what I've got :)

One of the recent pull request threads had a user talking about how bcachefs development is being done "the old way", from the earlier days of Linux; less "structure", less process, so that we can move quickly and - critically - work effectively with users.

I liked that comparison, and I think that's a big part of why bcachefs has had the success it's had with users (it's a proven recipe! Linux did displace everything else, after all). And on top of that, we're doing it with engineering best practices that have advanced significantly since then. Automated testing, a codebase that heavily uses assertions (there's a lot I've said elsewhere about how to effectively use assertions; it's not preconditions/postconditions), runtime debugging, and more. It started too early to be written in Rust, that's really the only thing I'd change :)

People just need to be patient - this stuff takes time. The core design was done years ago, on disk format was frozen in 6.15, and now we're in a pretty hard freeze and doing nothing but fix bugs. The development process has been working well, it's been shaping up fast.

motorest•7mo ago

> Benchmarks? I hope you mean detailed writeups on robustness, because performance is not a consideration yet.

I don't think there is ambiguity. Either you have some way to objectively corroborate your personal claims, or you don't. Those are called Benchmarks. Performance is a specific type of benchmark test, but it's not the only one.

Either you have those or you don't. Making assertions and claims without benchmarks is not a confidence- or reputation-builder.

koverstreet•7mo ago

The benchmark here is the bug reports. Scan through the btrfs and bcachefs bug trackers.

Dylan16807•7mo ago

I haven't lost data on btrfs but I have broken half the partitions I made with it. The comparison doesn't feel baseless to me.

> 0.001% the real world testing

Statistics can be quite powerful. If you have a million installs, and your billion-install competitor has 100 problems per million installs, you can make some pretty strong statements about how you rate against that competitor. Just for easy example numbers.
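
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes problems arrive as a Poisson process with the same per-install exposure and the same reporting rate on both sides, which is a big assumption, and the observed count of 60 is invented purely for illustration.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), with each term computed in log space
    so large factorials do not overflow."""
    total = 0.0
    for i in range(k + 1):
        total += math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
    return total


competitor_rate = 100 / 1_000_000  # problems per install, from the example above
my_installs = 1_000_000
expected = competitor_rate * my_installs  # problems expected at the same rate

observed = 60  # hypothetical count, not a real figure
p = poisson_cdf(observed, expected)

print(f"expected about {expected:.0f}, observed {observed}, P(X <= observed) = {p:.2e}")
```

If that probability comes out tiny, "we're no better than the competitor" becomes hard to defend, but only to the extent that the two user populations and their bug-reporting habits really are comparable.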

pmarreck•7mo ago

I've bricked 2 drives to btrfs (one on Manjaro and one on Arch), and 0 to bcachefs.

Of course, I've never used bcachefs as a daily driver... nor as a filesystem in general... but I'd love to... lol

bombcar•7mo ago

From my experience as a [x,z]fs snob, "further along than butterfs" is damning with faint praise.

bigyabai•7mo ago

I have used BTRFS for 6 years on 5 drives without a single journaled corruption.

pmarreck•7mo ago

And I used it for 2 years on 2 drives on 2 different OS'es (Manjaro and Arch) and had 2 corruptions that bricked the drives. And they were root filesystems, so... that wasn't fun.

Statistics are funny like that.

taskforcegemini•7mo ago

bricked the drive as in the disk was physically defective? disks like to brick on their own, how sure are you btrfs was the reason?

int_19h•7mo ago

btrfs is used by numerous NAS providers at this point.

koverstreet•7mo ago

Do you know any that use it in multi device mode?

int_19h•7mo ago

Synology.

(Yes, I know they don't do it "the right way". It still works at the end of the day.)

koverstreet•7mo ago

No, we're talking about btrfs multi device mode, md doesn't count :)

int_19h•7mo ago

Are we? You did not qualify your original statement about btrfs "not keeping data" with that.

pmarreck•7mo ago

Hi, I believe I've contributed financially to your project in the past (mainly because zfs (licensing issues) and btrfs (unreliability, in my personal experience) need good competition... and btrfs bricked 2 of my drives once...)

You've built an amazing thing but speaking as another very opinionated dev on the outside who sometimes butts heads with other opinionated devs- please just defer to Linus on this so you can get this in the kernel. Or just have it pushed to the next kernel release (I realize this has already happened repeatedly). Just please don't add a significant feature in a bugfix cycle. And (as mercurial as we all know he is, and as valid as some of your concerns surely are), try to keep the peace with that guy. Do it for the sake of the project. I've been waiting for this to drop for literally years now. I care about it, and I know no one does more than you, and I swear that I know exactly what that's like. It's your brain-baby, the concrete instantiation of your blood/sweat/tears. But... Take a deep breath, swallow, trust the process.

The bands that produced some of the best music had HUGE tensions within them. But you can't let it explode (we all know that some bands did). I made that mistake a year ago, and lost a job I very much cared about. (Perhaps to a fault.) Don't be me. lol

rob_c•7mo ago

> There’s no harm doing it - if the thing actually works

This is the antiphrasis of good project management and stability.

No you want to avoid a static target in a dynamic environment that is unmaintained (such as an experimental fs in the kernel tree).

If it's static and unsupported, you'd end up unable to run this to recover disks on a machine with a ryzen9 processor that requires a minimum kernel version where the API/ABI have drifted so far that the old module won't compile or import.

If you can't afford to get your hands dirty and hack at the changing API when this has such a bus factor, DON'T USE IT.

Frankly, the argument you're making is the other side of "stick with ext2 since it works". It's probably going to die soon unless there's a community to support it (such as zfs, or ext4 in the kernel, or CEPH in hpc corporate spaces).

bgwalter•7mo ago

bcachefs is experimental and Kent writes in the LWN comments that nothing would get done if he didn't develop it this way. Filesystems are a massive undertaking and you can have all the rules you want. It doesn't help if nothing gets developed.

It would be interesting how strict the rules are in the Linux kernel for other people. Other projects have nepotistic structures where some developers can do what they want but others cannot.

Anyway, if Linus had developed the kernel with this kind of strictness from the beginning, maybe it wouldn't have taken off. I don't see why experimental features should follow the rules for stable features.

yjftsjthsd-h•7mo ago

If it's an experimental feature, then why not let changes go into the next version?

bgwalter•7mo ago

That is a valid objection, but I still think that for some huge and difficult features the month long pauses imposed by release cycles are absolutely detrimental.

Ideally they'd be developed outside the kernel until they are perfect, but Kent addresses this in his LWN comment: There is no funding/time to make that ideal scenario possible.

jethro_tell•7mo ago

He could release a patch that can be pulled by the people that need it.

If you’re using experimental file systems, I’d expect you to be pretty competent in being able to hold your own in a storage emergency, like compiling a kernel if that’s the way out.

This is a made up emergency, to break the rules.

eviks•7mo ago

The inconvenience of this process is also addressed by the dev, as is the different definition of experimental that you're using (though your expectation re kernel doesn't follow even without the mismatch in definitions)

rovr138•7mo ago

The kernel, even its bugs, should be stable (in that they shouldn't change unless it happens the correct way). If not, it starts introducing unexpected issues to users.

If someone's testing against these versions, adding their fixes and patches, stuff like this will break things for users. He can't assume all users will be regular desktop users, even on an experimental area of the code.

Things like 'RC' have meaning. Meaning that has been there for years. He can develop on a separate tree and users that want it can use it. This is used all over.

motorest•7mo ago

> The inconvenience of this process is also addressed by the dev, as is the different definition of experimental that you're using (...)

The only aspect of "experimental" that matters is what it means to the release process. If you can't meet that bar then debating semantics won't help either.

And by the way, the patch thread clearly stresses a) loss of data, b) the patch trying to sneak under the maintenance radar new features. That is the definition of unstable in anyone's book.

koverstreet•7mo ago

Experimental has no defined meaning with respect to the release process.

It's a signal to users - "watch out if you're going to use this thing"

motorest•7mo ago

> Experimental has no defined meaning with respect to the release process.

Nonsense. And to make matters worse, you're commenting as if trying to sneak features in bug fixes late on the release process has no impact on quality assurance.

queenkjuul•7mo ago

Yeah the argument breaks down for me -- this project is so stable and mainstream, it's being used by a huge community of technically lay Linux users who can't boot a home built kernel temporarily to run a data recovery?

And yet simultaneously, it's a bleeding edge experimental system that needs a license to break the Linux rules on account of its experimental nature?

I just don't see how there's a critical mass of casual users that can't handle a complicated data recovery (as in, i won't generally believe this to be true, and if it is, those users should probably stick with something more mature), AND the system is still so experimental and developing so quickly that introducing features outside the merge window should be considered uncontroversial (or even necessary, as Kent seems to sometimes argue)

Analemma_•7mo ago

This position seems so incoherent. If it’s so experimental, why is it in the mainline kernel? And why are fixes so critical they can’t wait for a merge window? Who is using an “experimental” filesystem for mission-critical work that also has to be on untested bleeding-edge code?

Like the sibling commenter, I suspect the word “experimental” is being used here to try and evade the rules that, somehow, every other part of the kernel manages to do just fine with.

koverstreet•7mo ago

No, you have to understand that filesystems are massive (decade+) projects, and one of the key things you have to do with anything that big that has to work that perfectly is a very gradual rollout, starting with the more risk tolerant users and gradually increasing to a wider and wider set of users.

We're very far along in that process now, but it's still marked as experimental because it is not quite ready for widespread deployment by just anyone. 6.16 is getting damn close, though.

That means a lot of our users now are people getting it from distro kernels, who often have never compiled a kernel before - nevertheless, they can and do report bugs.

And no matter where you are in the rollout, when you get bug reports you have to fix them and get the fixes out to users in a timely manner so that they can keep running, keep testing and reporting bugs.

It's a big loss if a user has to wait 3 months for a bugfix - they'll get frustrated and leave, and a big part of what I do is building a community that knows how the system works, how to help debug, and how to report those bugs.

A very common refrain I get is "it's experimental <expletive deleted>, why does it matter?" - and, well, the answer is getting fixes out in a timely manner matters just as much if not more if we want to get this thing done in a timely manner.

orbisvicis•7mo ago

Isn't this the point of DKMS, to decouple module code from kernel code?

koverstreet•7mo ago

Well, my hope when bcachefs was merged was for it to be a real kernel community project.

At the time it looked like that could happen - there was real interest from Redhat prior to merging. Sadly Redhat's involvement never translated into much code, and while upstreaming did get me a large influx of users - many of which have helped enormously with the QA and stabilization effort - the drama and controversies have kept developers away, so on the whole it's meant more work, pressure and stress for me.

So DKMS wouldn't be the worst route, at this point. It would be a real shame though, this close to taking the experimental label off, and an enormous hassle for users and distributions.

But it's not my call to make, of course. I just write code...

em-bee•7mo ago

> the drama and controversies have kept developers away

well there you have it. i am not saying the drama is your fault because it really doesn't matter whose fault it is. regardless of who is causing drama, your only chance to reduce drama (and stress for you) is to deescalate. even if linus is causing the drama, actually especially if linus is causing the drama (we all know that he doesn't have the most agreeable personality) you need to do things his way and work to earn his trust.

others here in the comments ask you to recognize and admit you are wrong, but i'd say no, you don't have to. this is not a matter of who is right. that's completely besides the point. it's a matter of politics and diplomacy. agree to disagree and move on. that is what i hope you can recognize. to accept defeat of an argument even if you are right, and to avoid causing arguments in the first place. it's like marriage. if you want to keep the relationship, you need to defer until you earn their trust. but unlike marriage you can't ask others to join you in relationship counseling. you have to do all the relationship work yourself.

i am rooting for you, and i look forward to the day that i can use bcachefs myself.

btw: is there any goal to support in place conversion from ext4 like btrfs supports? (and maybe even from btrfs :-)

koverstreet•7mo ago

We already have in place conversion :)

In place conversion from btrfs needs an FIEMAP extension, but ext4 and xfs are supported

webstrand•7mo ago

DKMS is an awful user experience, it's an easy way to render a system unbootable. I hope Linus doesn't force existing users, like me, down that path. It's why I avoid zfs, however good it may be.

mroche•7mo ago

DKMS isn't a "fire and forget" kind of tool, but it comes reasonably close most of the time. I would say it's a far cry from awful, though.

webstrand•7mo ago

I think my problem is that it's just close enough to being fire-and-forget that I forget how to do the recovery when it misfires. It usually seems to crop up when I'm on vacation or something and I don't have my tools.

yjftsjthsd-h•7mo ago

One of my machines runs root on ZFS via DKMS. I will grant that it is annoying, and it used to be worse, but I don't think it's been quite as bad as all that for a very long time. I would also argue that it's more acceptable for testing actively developed stuff that's getting the bugs worked out in order to work towards mainlining.

That said, I vaguely seem to recall that bcachefs ended up involving changes to other parts of the kernel to better support it; if that's true then DKMS is likely to be much more painful if not outright impossible. It's fine to compile one module (or even several) against a normal kernel, but the moment you have to patch parts of the "main" kernel it's gonna get messy.

krageon•7mo ago

ZFS should be avoided because it has too many dumb complete failure states (having run it in a real production storage environment), not because it's DKMS

cyberpunk•7mo ago

I’ve run racks and racks of it in prod also. What are these dumb complete failure states you mean?

queenkjuul•7mo ago

Yeah idk 5+ years now of ZFS on Linux for me without a single hiccup (other than me letting a drive die that wasn't properly backed up). The modules get rebuilt when the distro installs a new kernel. I've never once even had to think about it.

hamandcheese•7mo ago

I am sympathetic to your plight. I work on internal dev tools and being able to get changes out to users quickly is an incredible tool to be able to serve them well and earn (or keep) their trust.

It seems like you want that kind of fast turn around time in the kernel though, which strikes me as an impossible goal.

samus•7mo ago

IMHO, anybody willing to use a file system marked as experimental from a downstream kernel should be able to wait for the fix. If they need it faster they should be ready to compile their own kernel or seriously reevaluate their decision to adopt the FS.

The Kernel's pace is predictable, even billion $ corporates can live with it, and it's not like Linus hasn't accommodated you in the past. But continuing to do so will make people stop believing you are acting in good faith, and the outcome of that will be predictable.

This is simply what the development model is like in the Linux kernel. Exceptions happen, but they sometimes backfire and highlight why the rules matter in the first place, and therefore they are unlikely to change.

rob_c•7mo ago

> It's a big loss if a user has to wait 3 months for a bugfix

Either the bugfix is not serious and they can wait because the system is mature, or the fs is so unstable you shouldn't be pandering to the crowd that struggles with build deps and make install.

There is no in between, this is the situation. And the "but not all bugs are equal" argument doesn't stand.

I know if I read of a metadata bug getting fixed in ext4 or ZFS there's a very small chance of this causing my platter to evaporate. By definition of stable, if that was happening it would be hitting that one unfortunate guy (<0.001% of users) running a weird OS or hardware and that's just the luck of the draw.

If the fix is from a fs marked experimental, yes I kinda expect this could fry my data and hurt kittens. That's what experimental and stable mean. That means I expect up to 100% of users to be impacted under certain workflows or scenarios.

Everything outside of this is wasted energy and arguing with the tide.

krzyk•7mo ago

> It's a big loss if a user has to wait 3 months for a bugfix

Is the wait really 3 months away? I don't exactly know the release cycle, but for me kernels are released quite frequently, at least there are RCs sooner than 3 months. Just checked latest history and major releases are 2 months apart - and between them there are minor ones.

People using experimental features are quite aware how to get new experimental kernel sources.

baobun•7mo ago

DKMS as an option might be better than you imagine.

jjaksic•7mo ago

> It's a big loss if a user has to wait 3 months for a bugfix

This is incredibly short-sighted. You're talking about 1 user 3 months, and you think that's "big" ? I'd say it's a much bigger loss if the project gets kicked out because of one person's impatience. Then everybody will have to wait forever, how is that better?

If the fs is as good as you claim, then you better play by the rules and make sure the project survives and eventually goes GA. If it happens a few months later, then so be it. Think about the long term.

If you're worried about a single user leaving, then a much better strategy would be to explain to this user the Linux release timeline, or how to apply a patch, than to go up toe to toe against Linus.

And btw, squeezing a fix/feature in at the last minute in order to help one user is not as good as you think it is. Even if that one user appreciates your responsiveness, to everyone else it sends a message that the key dev is super impatient and unprofessional. So even if you manage to keep that one user, how many potential users are you losing by sending that message?

dataflow•7mo ago

> evade the rules that, somehow, every other part of the kernel manages to do just fine with

I have no context on the situation or understanding of what the right set of rules here is, but the difference between filesystems and other code is that bugs in the filesystem can cause silent, persistent corruption for user data, on top of all the other failure modes. Most other parts of the kernel don't have such a large persistence capability in case of failure. So I can understand if filesystems feel the need to be special.

samus•7mo ago

Yet the other filesystems seem fine with the rules. And the value proposition of Bcachefs precisely is that it doesn't eat your data. So, either the marketing is off, or it is far from ready to live with the quite predictable release pace of the Linux kernel.

dataflow•7mo ago

My impression as a total outsider here is that most (all?) other filesystems I'm aware of are either more mature - and generally not in active feature development - or they are not as popular, limiting the damage. Is this inaccurate?

I will also say that bcachefs's selling point - and probably a major reason people are so excited for it - is the amount of effort it puts into avoiding data corruption. Which tells you something about the perceived quality of other filesystems on Linux. Which means that saying "other filesystems seem fine with the rules" misses the very fact that people have seen too much data corruption with other filesystems and want something that prioritizes it higher.

dismalaf•7mo ago

Btrfs is still very much being developed, in the kernel and is quite popular.

koverstreet•7mo ago

Chris Mason moved on a long time ago, Josef seems to be spending most of his time on other things, and if you look at the commit history btrfs development has been moving pretty slowly for a long time.

It's a bad sign when the key founders leave like that, filesystems require a lot of coherence of design and institutional knowledge to be retained.

samus•7mo ago

Yes, Bcachefs is experimental and thus needs more fixes. Everyone understands that, and bugfixes are indeed totally fine to be merged at all times.

The problem is new features, general improvements, and fixes that are actually features or require nontrivial refactorings. I understand the temptation, but these carry significantly higher risks than a well-thought out bugfix and are thus not welcome outside the merge window. Most prior rows between Kent and Linus have been about such patches, sometimes surreptitiously mixed in with more benign fixes.

This time his argument is "think of the distro users", but this won't fly - those either accept that using an experimental FS has consequences, or Kent learns how to work with the kernel community instead of testing the boundaries of the rules in the name of all the oppressed kernel developers gatekept by Linus. Shipping bugfixes for bugfixes would be kinda embarrassing for everyone involved (Kent, Linus, and the distros!), and that's why Linus rejects his PRs.

motorest•7mo ago

> That is a valid objection, but I still think that for some huge and difficult features the month long pauses imposed by release cycles are absolutely detrimental.

I feel you're not answering the question, nor are you presenting any case in favor of forcing an exceptional release process for an unstable feature.

The "detrimental" claim is also void of any reason or substance. It's not detrimental to its users, as users know better than to roll out experimental file systems for critical data, and those hypothetical users who somehow are really really interested in using bleeding edge will already be building their own kernel for this reason alone. Neither scenario requires this code to be in the kernel, let alone exceptional release processes.

> Ideally they'd be developed outside the kernel until they are perfect, but Kent addresses this in his LWN comment: There is no funding/time to make that ideal scenario possible.

It is clear then that the code should be pulled from the kernel. If it isn't possible to maintain a feature with the regular release process, everyone stands to benefit by not shipping code that is impossible to stabilize.

bgwalter•7mo ago

> The "detrimental" claim is also void of any reason or substance.

Thanks for the compliments! Detrimental for development speed, not for the users.

cwillu•7mo ago

See the replies made by Josef Bacik and Theodore Ts'o.

https://lwn.net/ml/all/20250627144441.GA349175@fedora/#t

https://lwn.net/ml/all/20250628015934.GB4253@mit.edu/

rob_c•7mo ago

> Kent writes in the LWN comments

Unfortunately Kent spends a lot of time and effort defending Kent. I wish he would learn to take a step back and admit he's fallible and learn to play nice in the sandbox rather than wasting all of this time and effort. A simple "mea culpa" could smooth over a lot of the feathers he constantly ruffles.

queenkjuul•7mo ago

Because every upstream change carries a [however slight] nonzero chance that something unrelated breaks. That's why there's rules.

It doesn't follow that an experimental feature should jeopardize stable ones, however small the possibility.

redeeman•7mo ago

there have been several examples of other exceptions. we are talking data corruption here. Kent may not be the best communicator, but he cares about what matters. you'd rather see people lose their data than bend rules.

queenkjuul•7mo ago

People using experimental filesystems without backups already decided to lose their data.

Besides, nobody is forbidden from applying Kent's patches and building their own kernel to run a recovery.

redeeman•7mo ago

this is an elitist view. many people testing are not really able to do that, but ARE able to get the rc kernels from a distribution. They are not stupid, they know that it can kill their data. They most probably dont have anything mega important on it. Doesnt mean it cant be convenient. These people are part of the community and WANT to help bcachefs. They can install rc kernels, they can put data on, test in many scenarios. Do you not want to make life as easy for them as reasonably possible? we are talking a thing to RECOVER DATA, in a very self contained way.

But correct me if im wrong, but it appears you are saying: "well, maybe you were too stupid to read the experimental label, and well.. then you may also be too dumb to compile kents tree. The community doesnt want you, we just do not care to get help from plebs without these skills. buh-bye". Is this really what you want to convey?

queenkjuul•7mo ago

Why wouldn't the distro ship the patch, then? What distro is shipping vanilla rc kernels to casual users? Why are casual users using such a distro?

And genuinely, _genuinely_, who is using Linux and can't type three commands into a shell?

Life for them is already not as easy as possible, they've chosen to run an experimental filesystem without backups! If they need to run the recovery today, then getting into the rc doesn't help. If they're waiting for their distro to build the rc, they can't wait a few more weeks for the next release? If they really can't wait, Kent can send them a 5-line bash script to build them a new kernel.

Bottom line: if you operate without backups, you've earned your data loss. People being stupid doesn't need to be the Linux kernel's problem.

alphazard•7mo ago

> Having a lead position in development I would kick Kent off the team.

I've seen this sentiment a lot lately. That disagreeable top performers have to be disposed of because they are "toxic" or "problematic".

You aren't doing your job as a leader if this is your attitude to good engineers. Engineering is a field where a small amount of the people create a large amount of the value. You can either understand that, and take it upon yourself to integrate disagreeable yet high performing people into the team, paving over the rough patches yourself. Or you can oust them, and quite literally take a >50% productivity hit on your team.

A disagreeable person will take up more of your time as a manager, but a high performer is worth significantly more of your time. When these traits co-occur in the same person, the cost-benefit is complicated. The reason we talk about this problem a lot in tech is because it is legitimately a tough call, with errors in both directions. Wishing that the right move was always as simple as kicking someone off the team doesn't make it true, although it may relieve you from having to contend with the decision.

koverstreet•7mo ago

It's not one or the other.

Ideally, you teach people how to get along better together; I think of my job as manager (and I effectively do manage a large team these days) as one of teaching and fostering good communication.

hinkley•7mo ago

But if you have one “top performer” who gets in the way of every other person’s productivity and buy-in, they have to go. You can’t base an organization on a bus number of one.

viraptor•7mo ago

> Or you can oust them, and quite literally take a >50% productivity hit on your team.

In the short term, possibly. But do you think bcachefs is better in the current situation than if it moved at half the speed, but without conflict? By being out of kernel it will get less testing, fewer contributions, the main developer will get some time wasted on rebasing the patch set with every release, and distros are unlikely to expose bcachefs to the user any time soon. When you're working with an ecosystem / teams, a single person's performance really doesn't mean that much in the end. And occasionally Kent will still have to upstream some changes to interfaces - how likely is anyone to review/merge them quickly?

And now, what are the chances this will ever become more than a single person project really?

alphazard•7mo ago

It would be worse for bcachefs and the kernel if they parted ways. The Linux kernel does not have a feature complete alternative to APFS. Apple, of all companies, is beating the Linux kernel at filesystems. That hasn't happened before.

> When you're working with an ecosystem / teams, single person's performance really doesn't mean that much in the end.

This is demonstrably not true. Kent brought Bcachefs to fruition and got it upstreamed. Wireguard was also one guy. The cryptography used in both, also 1 guy. There's an argument to be made that given an elegant, well designed system, we should assume it came from a single or a few minds. But given a system that's been around for a while, you would be right to assume that a lot of people were/are involved in keeping it around.

viraptor•7mo ago

> Bcachefs to fruition and got it upstreamed

That may be reversed, so... wouldn't count it as a success yet. The project may not get popular adoption if people don't trust its future.

rob_c•7mo ago

> a feature complete alternative to APFS

Yet Linux has better tested NFS, CEPH, a stable ZFS target... I think the opposite is still true, Apple's golden goose of an fs is still basically their NTFS implementation

mort96•7mo ago

You aren't making your job easier as a leader by keeping assholes who insist on causing problems and not following established process.

eqvinox•7mo ago

A highly skilled but socially inept engineer is not a top performer. Interacting with others is part of their performance. Ultimately you need to look at the sum total of time, money, and outcome for the entire team; if firing a single "rockstar but asshole" developer allows the rest of your team to achieve the same productivity, you're still better off because you're saving both money and time on that person. Conversely if a single such developer can replace your entire team… sure, go for it.

In the extreme, if bcachefs gets removed from the kernel, the productivity outcome (depending on your measure) is actually zero.

[Ed.: also, honestly, if you need to hire a "babysitter" for such a highly skilled engineer, that is also a viable option & there shouldn't be a social stigma for that either. I wouldn't say it's the manager's job though, not to that degree at least.]

lll-o-lll•7mo ago

Difficult people are not “assholes”. Someone saying “your engineering practices are shoddy and the quality of your code is bad” does not make them an asshole. It makes them french maybe.

My point is that there are people low on the agreeableness scale, and they can often be exceptional engineers. You have to manage them, yes, but taking the easy way out of “i’ll sack anyone who’s prickly” will mean a shit team. You need people (or at least one), who will say “that won’t work because…” “this is bad because…” “this will fail because…”.

jjaksic•7mo ago

A person who points out flaws is not an asshole. An asshole is a person who breaks rules and who breaks trust. We're talking about the latter, not the former.

motorest•7mo ago

> My point is that there are people low on the agreeableness scale, and they can often be exceptional engineers.

Not true. Every single egregious asshole who is unable or unwilling to work in a team environment is quite bluntly incompetent. No degree of perceived hard skills mitigates how incompetent these types are. Engineering is predominantly a social activity. Team output is amplified by a collaborative environment. If you fail at basic tasks such as coordinating work and you manage to antagonize anyone around you then your net contribution to the project is negative.

hitekker•7mo ago

When the "top performer" destroys trust and fails to rebuild it, they shouldn't be on the team*

Skimming over the context, Kent seems to be lying by omission in PRs and distorting the history behind the PRs. Plus fighting with what is basically his tech lead who represents the team's norms, culture, and health. I also think he's fighting in this comment section right now.

Speaking as a former "brilliant jerk", I wouldn't trust the memory or intentions of a brilliant jerk. I wouldn't want to be looking over my shoulder for back-stabbing on my own team. I also wouldn't want my manager to get stuck in "I can fix him" mode because they're afraid of doing their flipping job: firing an employee who refuses to learn from their mistakes.

* I'm sympathetic to contexts when the team itself is bad and the performer is actually doing better. In that case: forget their trust, the performer should either remake the team (become the manager) or leave to do better work.

em-bee•7mo ago

> former "brilliant jerk"

what helped you recognize and change that?

hitekker•7mo ago

When another Brilliant Jerk struck me down in my own little kingdom, using similar weapons & tactics. The Golden Rule is a Ruler that cuts both ways, turns out.

TL;DR I became a Christian. God broke my heart and opened my eyes. Like showing how my politics & diplomacy were tricks to escape from much-needed heartbreak. Much-needed for shifting my will from evil to good... or just giving me the will to leave a terrible but lucrative job.

Proverbs 26:11 and all that. I know I can be a jerk again, the difference is that the fear of God is in me now. A true blessing.

brookst•7mo ago

It’s no different from giving up on someone who writes terrible code or creates git hell.

Sure, you talk to them. And sure, you explain what the problem is and treat them like an adult. But ultimately it is completely acceptable to give up.

Peoples’ potential matters to parents, and to mentors. A high-potential, low-performing person can be a project worth taking on, but they are not an obligation in the workplace, especially for someone as senior and time-constrained as Linus.

bombcar•7mo ago

If you as a manager can build trust with your high performance engineer with zero social skills, you can end up with a power combination. You protect the engineer from insane requirements and also protect the rest of the team/company from outbursts.

I’ve seen it time and time again, sometime so much so that hiring the engineer also means hiring his handler, and everyone knows it and is ok with it - even the engineer.

moomin•7mo ago

Counterpoint: every time I meet someone who is perceived this way, they’re definitely an asshole, but their “productivity” is often mostly corner-cutting. Other devs’ irritation with them is often conflating the technical unprofessionality with the team unprofessionality. Managers are lousy at actually judging the productivity in these situations. You 100% can ditch these people and your productivity will rise. You just won’t have some asshole claiming the credit for other people’s work anymore.

Funnily enough, I just tracked down a problem that significantly affected the calculation of how much money something cost down to an issue one of these geniuses introduced by thinking they were too good for regular, dull, due diligence in their development practices.

baobun•7mo ago

IMO it's very clear by reading a few threads that K is not just disagreeable but manipulative and disingenuous. Bordering on gaslighting at times.

As someone who might have fallen in your grumpy-disagreeable-senior bucket at times, that's a different story and not something I would accept.

> I've seen this sentiment a lot lately. That disagreeable top performers have to be disposed of because they are "toxic" or "problematic".

This is not really a relevant argument to this situation.

saghm•7mo ago

In other words, the rules don't apply to people who are "top performers"? This mentality will drive out all of the other people working for you, so even ignoring the obvious issues with how it enables all sorts of shitty behavior from certain people, you're going to cost yourself more from losing larger numbers of "lower performers" in the long run (unless you end up replenishing the numbers with people equally shitty or at least willing to tolerate shittiness, which I guess would explain the stories of literal cesspools that have cropped up over the years that otherwise are hard to even comprehend).

motorest•7mo ago

> You aren't doing your job as a leader if this is your attitude to good engineers.

This is precisely the mistake you are making: conflating egregious types who post code as "good engineers". They are not. They are incompetent.

There is no engineering activity that is not driven by teams. Being able to work effectively in a team environment is therefore a very basic and critical skill. Those who are unable to work in a team environment are lacking a very basic and critical skill. Those who fail at this basic skill to the point they are dubbed as "toxic" end up sabotaging your whole team, needlessly creating problems for everyone around them, and preventing any collaboration from taking place.

If this problem is introduced by a single team member, it is in everyone's best interest to just cut the cancer.

monkeyelite•7mo ago

What you’re saying makes sense but do we know the social convention about bugfix classification in Linux?

My job matches what you’re describing, but bug fix is widely interpreted. It basically means “managers don’t do anything stupid”.

If someone got in trouble using language you are desiring “the rules are clear and were broken”, i would feel they were singling someone out.

eviks•7mo ago

> One thing is to challenge things.

> If the rules are set in a way that rc1+ gets only Bugfixes

So it's not ok to challenge things like the substance of rules...

dottedmag•7mo ago

It is, but directly, not as a subversion.

I have had a similar experience with a team member who was quietly unhappy about a rule. Instead of raising a discussion about the rule (like the rest of the team members did) he tried to quietly ignore it in his work, usually via requesting reviews from less stringent reviewers.

As a result, after a while I started documenting every single instance of his sneaky rule-breakage, sending every instance straight to his manager, and the person was out pretty soon.

eviks•7mo ago

> It is, but directly, not as a subversion.

It is directly challenged in the very thread linked in the article (and likely before, the drama is ancient).

Also, there is no "less stringent reviewer", it's always been the same you!

So your example fails at both core points, yet your outcome is still the same happy firing!

At least for paid work you can just sprinkle $ to cover up such mistakes and find someone else, but wait, this is also not paid work!

dottedmag•7mo ago

I don't see it challenged before the MR.

Linus caught Kent when he tried to sneak non-bugfixes into an RC, and berated him.

After that (not before, this is a critical distinction) Kent said "I don't want to abide by the rules, because I have my concerns".

This is very similar to the situation I have described, except that in Linux it was Linus who was skipping reviews on Kent's code trusting him not to subvert the rules, and in the situation I described the team collectively was trusting each other not to subvert the rules.

krageon•7mo ago

You've explained everyone is unhappy with it and that you worked to get the one person who actually acted upon it fired. It's hilarious but in a pretty sad way that you're portraying this as an inevitability. It wasn't, it was just you. You had a choice, and you chose to do this. It wasn't inevitable.

dottedmag•7mo ago

I didn't make myself quite clear — the others were raising points on _other_ rules, and as a result we tuned the rules quite often, as we discovered what works better and what works worse.

Except one person.

yxhuvud•7mo ago

You don't challenge them by pretending they don't exist. That only makes you look like an asshole.

The proper way here would have been two pull requests, one with all the bugfixes, and one with the new feature with a cover letter motivating why an exception should happen. And if this happens often enough with sufficiently good backing motivations, then he may be able to convince people.

Guvante•7mo ago

"Pull requests aren't the time to talk about this" is only ever correct if the next part of the sentence is "because we already agreed" or some such.

Otherwise that is a red flag. Like pull requests are when discussions are had...

cwillu•7mo ago

And the linux kernel project has a long-established process, which includes not routinely landing major features post-merge-window without having a discussion first.

dataflow•7mo ago

I'm not sure how I feel on the larger picture, but I think I understand his view of why certain PRs aren't the place to talk about certain things.

It's because he views user data integrity as a more critical concern than the PR process or team dynamics - which, as a user, I don't fault him for. I think that in his mind, every hour/day/week spent debating things on a PR equals more people losing or corrupting data. This is not commonly the case with most PRs - it's specific to popular filesystems in active development.

What I don't necessarily buy is how to weigh this responsibility against the responsibility users take on when they use such an experimental FS in the first place. It's a tough question in my mind, and both sides have good points. And I also don't know anything about the relative safety vs. severity of each patch. But what I do understand is the motivation for not viewing these as generic PRs against generic codebases. So the idea that this is a red flag in this case just doesn't seem right to me, based on my current understanding.

koverstreet•7mo ago

No, it's mainly that tensions have been high between myself and Linus so I want that stuff done privately so it doesn't spill out into the community the way it has been :)

It gets to be a real distraction. Fortunately the people I work with have learned how to roll with it, so it's not nearly as bad as it used to be. Now it mainly shows up in forum comments where it doesn't really affect me and I can eat popcorn.

It is true that I don't want critical fixes being held up by angry arguing, but most pull requests, even fixes, aren't nearly so critical.

The main thing I keep hammering on is "the development process _matters_ if we want to get this done right", and user considerations are a big part of that.

Debugging issues that come up in the wild, and getting those fixes to users in a timely manner so they can keep testing and we can get all these crazy failure modes sorted out is a big part of that - if we want a filesystem that's truly bulletproof. I know I want that!

I've been spending the past week and a half mostly working with one user and his filesystem that's been through flaky dying controllers and now lightning strikes; ext4 even got corrupted on the same setup.

But we discovered some 6.16 regressions, got some more people involved staring at code and logs (a new guy spotted a big one), and another small pile of fixes are going out next week. And even with the 6.16 regressions (some nasty ones were found), it's looking like he didn't lose much, thanks in part to journal rewind.

This thing is turning into a tank.

All in a day's work...

dastbe•7mo ago

As a person who probably has one of the best vantage points on this, how was Apple able to get APFS out so quickly compared to filesystems in Linux like bcachefs?

koverstreet•7mo ago

I am curious about that myself, I know very little about apfs.

But Apple has historically been strong on organizing and supporting teams (see: their chip design), a filesystem sounds exactly like something they'd do well if they decided to give it the proper investment and support.

Where they seem to be falling down these days is software maintenance - many, many reports of MacOS getting buggier with every release. But a big, complicated, but well defined and self contained engineering project? That's their ballpark.

plorkyeran•7mo ago

APFS was the third filesystem designed by Dominic Giampaolo and fourth that he'd worked on, had a full team working on it, and had absurd testing resources thrown at it. It was set up to succeed in every way that a software project can be.

layer8•7mo ago

For some reason I always read this as “BCA chefs”.

kzrdude•7mo ago

today Kent posted another rc patch with a new filesystem option. But it was merged..

ajb•7mo ago

Yeah.. the thing is, suppose Kent was 100% right that this needed to be merged in a bugfix phase, even though it's not a bug fix. It's still a massive trust issue that he didn't flag up that the contents of his PR were well outside the expected.

That means Linus has to check each of his PRs assuming that it might be pushing the boundaries without warning.

No amount of post hoc justification gets you that trust back, not when this has happened multiple times now.

NewJazz•7mo ago

He mentioned it in his PR summary as a new option. About half of the summary of the original PR was talking about the new option and why it was important.

https://lore.kernel.org/linux-fsdevel/4xkggoquxqprvphz2hwnir...

ajb•7mo ago

I'm not saying he made a PR just saying "Fixes" like a rookie. What I'm saying is that in there should have been something along the lines of "heads up - I know this doesn't comply with the usual process for the following commits, here's why I think they should be given a waiver under these circumstances" followed by the justifications that appeared after Linus got upset.

The PR description would have been fine - if it had been in the right stage of the process.

gdgghhhhh•7mo ago

In this context, this is worth a read: https://hachyderm.io/@josefbacik/114755106269205960

wmf•7mo ago

A lot of open source volunteers can't really be replaced because there is no one willing to volunteer to maintain that thing. This is complicated by the fact that people mostly get credit for creating new projects and no credit for maintenance. Anyone who could take over bcachefs would probably be better off creating their own new filesystem.

ajb•7mo ago

Ehh. I don't think Kent is an arsehole. The problem with terms like "arsehole" is that they conflate a bunch of different issues. It doesn't really have much explanatory power. Someone who is difficult to work with can be that way for loads of different reasons: Ego, tunnel vision, stress, neuro divergence (of various kinds), commercial pressures, greed, etc etc.

There is always a point where you have to say "no I can't work with this person any more", but while you are still trying to it's worth trying to figure out why someone is behaving as they do.

skissane•7mo ago

> The problem with terms like "arsehole" is that they conflate a bunch of different issues.

Agree, plus I’d add: if we are going to criticise other people’s communication style/abilities or attitude, then using a vague, vulgar and hurtful slang term like “arsehole”/“asshole” (and similar slang such as “dick”, “prick”, etc) is an example of exhibiting the very thing one is complaining about in making the complaint, which is fundamentally hypocritical. One can state the same concerns in a more professional way, focusing on the details of the specific behaviour pattern not a vague term which can refer to lots of distinct behaviours (e.g. people with ASD traits who hurt the feelings of others because they honestly have trouble thinking about them, versus people with antisocial or narcissistic personality disorder traits who knowingly hurt the feelings of others because they enjoy doing so) - labelling the behaviour pattern not the person, acknowledging that it is entirely possibly due to an unintentional skills gap, (sub)culture clash, differences in life experiences, neurodiversity/neurodivergence/mental health/trauma, etc.

I also think it is helpful when criticising the flaws of others to try to relate them to one’s own, whenever possible - e.g. sometimes in the past I did X and from my perspective it looks like you are doing something similar. Hurtful labels are not encouraging that kind of self-reflectiveness at all, they promote the idea that “I’m one of the good ones but you are one of the bad ones”

bgwalter•7mo ago

People who go on holier-than-thou rants like that are usually extremely unpleasant to work with and will cancel you (as directly admitted in that post) if you contradict them on anything.

josefbacik•7mo ago

I’m framing this comment and putting it on my wall in my office.

heavensteeth•7mo ago

Whether or not you agree with Kent on this, you have to commend that he tends to be very active in discussing issues with the community in a fairly open, calm, and thought out way (at least from what I've seen).

Comparatively, I find subtweeting him from the sanctity of Mastodon, with a few insults and backhanded compliments thrown in for good measure, a bit low.

ars•7mo ago

This happened about a year ago as well: https://news.ycombinator.com/item?id=41407768

jagged-chisel•7mo ago

For the uninitiated:

bCacheFS, not BCA Chefs. I’m not clued into the kernel at this level so I racked my brain a bit.

zahlman•7mo ago

I had to think about it the first time, too.

skissane•7mo ago

Given the context is an OS kernel, even if I hadn't heard of "bcachefs" before (which I have), parsing it as "bcache-fs" seems obvious.

By contrast, I don't know what "BCA Chefs" is supposed to be. "BCA" could be many things: "Barbados Cricket Association", "Billiard Congress of America", "British Caving Association", "Business Council of Australia", among others. But what would any of them have to do with "Chefs"?

anonfordays•7mo ago

Linux needs a true answer to ZFS that's not btrfs. Sadly the ship has sailed for btrfs, after 15+ years it's still not something trustable.

Apparently bcachefs won't be the successor. Filesystem development for Linux needs a big shakeup.

bombcar•7mo ago

ZFS is good enough for 90% of people who need that so no real money is available for anything new.

Maybe a university could do it.

anonfordays•7mo ago

Indeed, and its inclusion in Ubuntu is fantastic. It's also showing its age, 20 years now. Tso, where are you when we need you most!?

bombcar•7mo ago

Or someday a file system will somehow piss off Linus and he’ll write one in a weekend or something ;)

XorNot•7mo ago

I mean, is it? It's a filesystem and it works. How is it "showing its age"?

em-bee•7mo ago

several people i know are using btrfs without problems for years now. i use it on half a dozen devices. what's your evidence that it is not trustable?

rcxdude•7mo ago

Many reports of data loss or even complete filesystem loss, often in very straightforward scenarios.

yjftsjthsd-h•7mo ago

In this case, some people using it and not having problems is much less interesting than some people that are having problems. As a former user who lost 2 root filesystems to BTRFS, I'm not touching it for a long time.

csnover•7mo ago

btrfs is OK for a single disk. All the raid modes are not good, not just the parity modes.

The biggest reason raid btrfs is not trustable is that it has no mechanism for correctly handling a temporary device loss. It will happily rejoin an array where one of the devices didn’t see all the writes. This gives a 1/N chance of returning corrupt data for nodatacow (due to read-balancing), and for all other data it will return corrupt data according to the probability of collision of the checksum. (The default is still crc32c, so high probability for many workloads.) It apparently has no problem even with joining together a split-brained filesystem (where the two halves got distinct writes) which will happily eat itself.
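To make the 1/N figure concrete, here is a toy model of a mirror with round-robin read balancing and no checksums (the nodatacow case). This is only an illustrative sketch: `ToyMirror` and `stale_read_fraction` are hypothetical names, and nothing here reflects how btrfs is actually implemented. One device misses a write while offline and then rejoins without any resync.

```python
from itertools import cycle


class ToyMirror:
    """Hypothetical N-way mirror: round-robin reads, no checksums."""

    def __init__(self, n_devices):
        self.devices = [dict() for _ in range(n_devices)]
        self.online = [True] * n_devices
        self._next = cycle(range(n_devices))

    def write(self, block, data):
        # Only devices that are currently online see the write.
        for dev, up in zip(self.devices, self.online):
            if up:
                dev[block] = data

    def read(self, block):
        # Read balancing: take the next online device in turn and
        # trust whatever it returns, since there is no checksum.
        while True:
            i = next(self._next)
            if self.online[i]:
                return self.devices[i][block]


def stale_read_fraction(n_devices=2, reads=1000):
    mirror = ToyMirror(n_devices)
    mirror.write(0, "old")
    mirror.online[-1] = False   # last device drops out temporarily
    mirror.write(0, "new")      # ...and misses this write
    mirror.online[-1] = True    # it rejoins with no resync
    results = [mirror.read(0) for _ in range(reads)]
    return results.count("old") / reads
```

In this model, one stale device out of N serves one read in N, so a two-device mirror hands back the old data half the time and a four-device mirror a quarter of the time.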

One of the shittier aspects of this is that it is not clearly communicated to application developers that btrfs with nodatacow offers less data integrity than ext4 with raid, so several vendors (systemd, postgres, libvirt) turn on nodatacow by default for their data, which then gets corrupted when this problem occurs, and users won’t even know until it is too late because they didn’t enable nodatacow.

The main dev knows this is a problem but they do seem quite committed to not taking any of it seriously, given that they were arguing about it at least seven years ago[0], it’s still not fixed, and now the attitude seems to just ignore anyone who brings it up again (it comes up probably once or twice a year on the ML). Just getting them to accept documentation changes to increase awareness of the risk was like pulling teeth. It is perhaps illustrative that when Synology decided to commit to btrfs they apparently created some abomination that threads btrfs csums through md raid for error correction instead of using btrfs raid.

It is very frustrating for me because a trivial stale device bitmap written to each device would fix it totally, and more intelligently using a write intent bitmap like md, but I had to be deliberately antagonistic on the ML for the main developer to even reply at all after yet another user was caught out losing data because of this. Even then, they just said I should not talk about things I don’t understand. As far as I can tell, this is because they thought “write intent bitmap” meant a specific implementation that does not work with zone append, and I was an unserious person for not saying “write intent log” or something more generic. (This is speculation, though—they refused to engage any more when I asked for clarification, and I am not a filesystem designer, so I might actually be wrong, though I’m not sure why everyone has to suffer because a rarefied few are using zoned storage.)
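For readers wondering what a stale device bitmap could look like, here is a minimal sketch under my own assumptions. `Superblock`, `record_write` and `assemble` are hypothetical names, not btrfs or md structures, and a real implementation would have to persist the mask before the write it covers. The idea: every surviving device records which peers missed writes, and on assembly any device marked stale by a peer must be resynced before it serves reads.

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    dev_id: int
    stale_mask: int = 0  # bit i set: device i missed writes this device saw


def record_write(superblocks, online_ids):
    """Mark every absent device as stale on each device that is online."""
    all_ids = {sb.dev_id for sb in superblocks}
    missing = all_ids - set(online_ids)
    mask = sum(1 << i for i in missing)
    for sb in superblocks:
        if sb.dev_id in online_ids:
            sb.stale_mask |= mask


def assemble(superblocks):
    """Return (trusted_ids, needs_resync_ids); refuse on split brain."""
    stale = 0
    for sb in superblocks:
        stale |= sb.stale_mask
    ids = [sb.dev_id for sb in superblocks]
    needs_resync = sorted(i for i in ids if (stale >> i) & 1)
    trusted = sorted(i for i in ids if not (stale >> i) & 1)
    if not trusted:
        # Every device saw writes that some other device missed.
        raise RuntimeError("split brain: no device has a complete history")
    return trusted, needs_resync
```

With two devices, a write that only device 0 sees makes `assemble` return `([0], [1])`. If device 1 then takes a write on its own, both devices are marked stale and assembly is refused rather than silently merged.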

A less serious but still unreasonable behaviour is that btrfs is designed to immediately go read-only if redundancy is lost, so even if you could write to the remaining good device(s), it will force you to lose anything still in transit/memory if you lose redundancy. (Except that it also doesn’t detect when a device drops through e.g. a dm layer, so you can actually ‘only’ have to deal with the much bigger first problem if you are using FDE or similar.) You could always mount with `-o degraded` to avoid this but then you are opening yourself up to inadvertently destroying your array due to the first problem if you have something like a backplane power issue.

Finally, unlike traditional raid, btrfs tools don’t make it possible to handle an online removal of an unhealthy device without risking data loss because in order to remove an unhealthy but extant device you must first reduce the redundancy of the array—but doing that will just cause btrfs to rebalance across all the devices, including the unhealthy one, and potentially taking corrupt data from the bad device and overwriting on the good device, or just losing the whole array if the unhealthy device fails totally during the two required rebalances.

There are some other issues where it becomes basically impossible to recover a filesystem that is very full because you cannot even delete files any more but I think this is similar on all CoW filesystems. This at least won’t eat data directly, but will cause downtime and expense to rebuild the filesystem.

The last time I was paying attention a few months ago, most of the work going into btrfs seemed to be all about improving performance and zoned devices. They won’t reply to any questions or offers for funding or personnel to complete work. It’s all very weird and unfortunate.

[0] https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg...

koverstreet•7mo ago

> The biggest reason raid btrfs is not trustable is that it has no mechanism for correctly handling a temporary device loss. It will happily rejoin an array where one of the devices didn’t see all the writes. This gives a 1/N chance of returning corrupt data for nodatacow (due to read-balancing), and for all other data it will return corrupt data according to the probability of collision of the checksum. (The default is still crc32c, so high probability for many workloads.) It apparently has no problem even with joining together a split-brained filesystem (where the two halves got distinct writes) which will happily eat itself.

That is just mind bogglingly inept. (And thanks, I hadn't heard THIS one before).

For nocow mode, there is a bloody simple solution: you just fall back to a cow write if you can't write to every replica. And considering you have to have the cow fallback anyways - maybe the data is compressed, or you just took a snapshot, or the replication level is different - you have to work really hard or be really inept to screw this one up.
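The fallback rule described above can be sketched as a single decision function. This is an illustration only, not bcachefs code: the `Extent` fields and `choose_write_mode` are hypothetical names, and the assumption is that an in-place (nocow) write is allowed only when none of the conditions listed in the comment apply.

```python
from dataclasses import dataclass
from enum import Enum


class WriteMode(Enum):
    IN_PLACE = "nocow"
    COPY_ON_WRITE = "cow"


@dataclass(frozen=True)
class Extent:
    replicas_online: int      # replicas we can write to right now
    replicas_total: int       # replicas the extent currently has
    replicas_wanted: int      # replication level requested for new writes
    compressed: bool = False
    shared_with_snapshot: bool = False


def choose_write_mode(extent: Extent, nocow_requested: bool) -> WriteMode:
    """Overwrite in place only when every replica would receive the write."""
    if not nocow_requested:
        return WriteMode.COPY_ON_WRITE
    if extent.replicas_online < extent.replicas_total:
        # A replica is unreachable: an in-place write would leave it stale.
        return WriteMode.COPY_ON_WRITE
    if extent.replicas_total != extent.replicas_wanted:
        return WriteMode.COPY_ON_WRITE
    if extent.compressed or extent.shared_with_snapshot:
        return WriteMode.COPY_ON_WRITE
    return WriteMode.IN_PLACE
```

The point of the sketch is that the degraded case is just one more branch next to the compression and snapshot checks that a nocow path already needs.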

I honestly have no idea how you'd get this wrong in cow mode. The whole point of a cow filesystem is that it makes these sorts of problems go away.

I'm not even going to go through the rest of the list, but suffice it to say - every single broken thing I've ever seen mentioned about btrfs multi device mode is fixed in bcachefs.

Every. Single. One. And it's not like I ever looked at btrfs for a list of things to make sure I got right, but every time someone mentions one of these things - I'll check the code if I don't remember, some of this code I wrote 10 years ago, but I have yet to see someone mention something broken about btrfs multi device mode that bcachefs doesn't get right.

It's honestly mind boggling.

koverstreet•7mo ago

> The last time I was paying attention a few months ago, most of the work going into btrfs seemed to be all about improving performance and zoned devices. They won’t reply to any questions or offers for funding or personnel to complete work. It’s all very weird and unfortunate.

By the way, if that was serious, bcachefs would love the help, and more people are joining the party.

I would love to find someone to take over erasure coding and finish it off.

csnover•7mo ago

In my case it was a last-ditch effort to get them to explain what was keeping them from making raid actually safe. Others have offered more concrete support more recently[0], I guess you could try reaching out to them, though I suppose they are interested in funding btrfs because they are using btrfs.

I share the sentiments of others in this discussion that I hope you are able to resolve the process issues so that bcachefs does become a viable long-term filesystem. There likely won’t be any funding from anyone ever if it looks like it’s going to get the boot. btrfs also has substantial project management issues (take a look at the graveyard of untriaged bug reports on kernel.org as one more example[1]), they just manage to keep theirs under the radar.

[0] https://lore.kernel.org/linux-btrfs/CAEFpDz+R3rLW8iujSd2m4jH...

[1] https://bugzilla.kernel.org/buglist.cgi?bug_status=__open__&...

koverstreet•7mo ago

Well, bcachefs has the safe, high performance problem solved.

But I really just don't know what to do if technology has become a popularity contest instead of about the technology :)

tobias3•7mo ago

The btrfs devs are mainly employed by Meta and SuSE and they only support single devices (I haven't looked up recently if SuSE supports multiple device fs). Meta probably uses zoned storage devices, so that is why they are focusing on that.

Unfortunately I don't think Patreon can fund the kind of talent you need to sustainably develop a file system.

That btrfs contains broken features is IMO 50/50 the fault of up-stream and the distributions. Distributions should patch out features that are broken (like btrfs multi-device support, direct IO) or clearly put it behind experimental flags. Up-stream is unfortunately incentivised to not do this, to get testers.

koverstreet•7mo ago

Patreon has never been my main source of funding. (It has been a very helpful backstop though!)

But I do badly need more funding, this would go better with a real team behind it. Right now I'm trying to find the money to bring Alan Huang on full time; he's fresh out of school but very sharp and motivated, and he's already been doing excellent work.

If anyone can help with that, hit me up :)

em-bee•7mo ago

i know it's not appropriate to complain about downvotes, but anonfordays responds to my question with an actual answer ( https://news.ycombinator.com/item?id=44468404 ) and more importantly with a link to the btrfs status page ( https://btrfs.readthedocs.io/en/latest/Status.html ) that i was not aware of (but as a btrfs user should have been) and you all downvote that to death. why? what possible disagreement could you have with that?

anonfordays•7mo ago

I did not down vote you, and my post was flagged or dead:

https://btrfs.readthedocs.io/en/latest/Status.html

The amount of "mostly OK" and still an "unstable" RAID6 implementation. Not going to trust a file system with "mostly OK" device replace. Anecdotally, you can search the LKML and here for tons of data loss stories.

em-bee•7mo ago

> my post was flagged or dead

yes, that is what i was complaining about. i wasn't talking about my post but yours. there is absolutely no reason for anyone to downvote your post.

tandr•7mo ago

I tried it as a FS for a data volume (200GB) on Linux a year ago, after reading how stable it is "now". The first hard crash made it unrecoverable no matter what I tried. Never again.

commandersaki•7mo ago

If btrfs implemented encryption, I'd consider it a suitable replacement for 90% of cases.

int_19h•7mo ago

People who need encryption mostly want full-disk encryption, so why not an encrypted volume with btrfs on top?

commandersaki•7mo ago

btrfs does volumes, it should also do volume encryption, why have an unnecessary layer in between?

int_19h•7mo ago

I'm not saying it wouldn't be nice to have, but given that the workaround is there, it doesn't sound like something that should contribute to btrfs being a "suitable replacement for 90% of cases".

zahlman•7mo ago

Does the filesystem actually need to be part of the kernel project to work? I can see where you'd need that for the root filesystem, but even then, couldn't one migrate an existing installation to a new partition with a different filesystem?

teekert•7mo ago

We have ZFS for that. What we want is something in kernel, ready to go, 100% supported on root of any Linux system with no license ambiguity. We want to replace ext4. Maybe btrfs can do it. I hear it has outgrown its rocky puberty.

alphazard•7mo ago

Bcachefs and Btrfs are not really competing with Ext4. There are basically 2 filesystem niches.

First niche is the full featured CoW filesystem; it has snapshots, detects and repairs corruption, transparent compression, all that good stuff.

The other niche is being an allocator of sectors. There's one storage device, divide it up amongst all these processes asking for storage. That's Ext4: an allocator of disk sectors, dressed up in a filesystem API. When you are running databases or VMs, all you want is an allocator of sectors. You don't want lots of stuff getting in the way of your writes. You don't want checksumming, you don't want your writes going to a new place every time. You just want write access to part of the disk.

teekert•7mo ago

I want COW everywhere because I want to revert to snapshots on my laptop as much as on my servers. I want it integrated into the bootloader and boot into snapshots, I want ransomware protection on my laptop, etc.

bombcar•7mo ago

I want the default file system everywhere to be CoW and snapshot-enabled (ideally with snapshots at the file and directory level) so that tooling starts to assume it is available and begins to use it.

int_19h•7mo ago

Funnily enough Windows went there with NTFS and full-fledged transactional FS and registry APIs with snapshots and rollbacks:

https://learn.microsoft.com/en-us/windows/win32/fileio/about...

But then for some reason it was all deprecated in Win11.

samus•7mo ago

At the same time, many VM and containerization solutions build these features themselves. It seems attractive to reuse the heavily optimized machinery that a COW filesystem offers. And indeed at least Docker can use Btrfs snapshots to create images. It has been a while since I looked into it though; no clue how mature and performant it is nowadays.

https://docs.docker.com/engine/storage/drivers/btrfs-driver/

kzrdude•7mo ago

Technological progress comes from us all being lifted to new levels and higher standards, I think. The Bcachefs/Btrfs niche is the standard we want/expect now and I think people are right to imagine it, develop it and make it happen.

gizmo686•7mo ago

Even the root filesystem can be FUSE if you want it to. The only thing that needs to be in the kernel is the initial root filesystem driver. Nowadays that is pretty much always just compressed CPIO (initrd). At that point user space can do pretty much whatever before doing a pivot_root operation to wherever.

tremon•7mo ago

It's not really initrd though; the kernel uses initramfs, which is more like tmpfs (emulating a filesystem in the VFS cache) than a ramdisk (a preallocated piece of memory that emulates a block device).

The files are still loaded from a compressed cpio archive though, and because of the initrd legacy that file is still called initrd in most distributions.
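As a small illustration of the "compressed cpio archive" point, the sketch below checks whether a blob looks like a gzip-compressed "newc" cpio archive by its magic bytes. The function names are made up for this example, and it makes simplifying assumptions: real initramfs images may use other compressors or have an uncompressed archive prepended, which this does not handle.

```python
import gzip

# ASCII magic at the start of a "newc" cpio header, without and with checksums.
NEWC_MAGICS = (b"070701", b"070702")
GZIP_MAGIC = b"\x1f\x8b"


def is_newc_cpio(data: bytes) -> bool:
    """True if the data begins with a newc-format cpio header magic."""
    return data[:6] in NEWC_MAGICS


def describe_initramfs(blob: bytes) -> str:
    """Classify a blob as plain cpio, gzip-compressed cpio, or unknown."""
    if is_newc_cpio(blob):
        return "uncompressed cpio"
    if blob[:2] == GZIP_MAGIC and is_newc_cpio(gzip.decompress(blob)):
        return "gzip-compressed cpio"
    return "unknown"
```

Reading the file your distribution names initrd into `describe_initramfs` would, under those assumptions, show that it is an archive to unpack rather than a block-device image.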

topspin•7mo ago

No, it does not. It might need to be part of the kernel to be included downstream in Linux distributions, if the file system developer fails to maintain distro buildable modules that don't eat people's data. Should developers provide workable modules, getting these modules into most distros is generally unhindered.

And no, it's not necessary even for root file systems. Linux can load modules, such as file system drivers, before it mounts root. That's what initramfs is about.

ZFSoL has thrived for 15 years, fostering several commercial empires, and has never been in Linus's mainline.

Bcachefs development may continue as Kent Overstreet wishes, and he need not squabble with Linus going forward. Seems like an entirely workable outcome. Kudos to Linus for a.) giving Kent a chance, despite known issues with Kent, and b.) making the difficult decision to reverse himself. Both of these decisions were correct.

What I learn from all of this is that Linus is still in the saddle and still making good calls. We are blessed.

samus•7mo ago

I think using an experimental file system is actually perfectly fine for the root partition, as long as it doesn't include `/home` and you keep a USB stick in your drawer so you can reinstall whenever it loses data.

I'd be highly conservative about using it for my home directory though. Or at least make a subfolder where all my really important files (legal documents, master thesis, etc.) go and mount that on another partition that uses a more conservative filesystem.

alphazard•7mo ago

This whole debacle is the perfect advertisement for microkernels. The only reason Kent needs to coordinate with Linus is because filesystems need to live in the kernel. FUSE is second class. Imagine how much easier this all would be if linux maintained a slowly evolving filesystem API, and all bcachefs had to do was keep up with it.

skissane•7mo ago

I don’t think FUSE is deliberately a “second class citizen”, it is simply that doing a filesystem in user space has a performance cost compared to doing it in the kernel-and that is a very tricky problem to solve. Even microkernels have this problem, it is just you don’t notice it as readily because a pure microkernel doesn’t offer in-kernel filesystems as a comparator - but if you take a microkernel and transform it into a hybrid kernel by moving filesystems (and block device drivers) into kernel space, like NeXT/Apple did in turning Mach into XNU, almost certainly you are going to see tangible performance gains. Maybe this is less true with more modern microkernel designs such as L4, but even there I suspect it is still true, even if not to quite the same extent.

I think the performance cost of FUSE compared to in-kernel filesystems is improving with time - FUSE with io_uring is a big step forward, but the immaturity of io_uring is an obstacle to its adoption, at least in the short-to-medium-term. I’m sure in the future we’ll see even further improvements in this area. But will we ever reach the Nirvana where FUSE equals the performance of in-kernel filesystems, or (maybe more realistically) the performance overhead has become so marginal nobody is bothered by it in practice? I’d like to think we eventually will, but it is far from certain.

koverstreet•7mo ago

There's no inherent reason why FUSE has to be noticeably slower for buffered IO, it just hasn't gotten nearly enough well-thought-out attention. But that's starting to change, there's a lot more interest these days in a faster FUSE.

Direct IO would be slower via FUSE, but L4 style IPC could solve that.

It would be an interesting proposition, although not my first choice for the direction I want to go in :)

skissane•7mo ago

I think the issue with any new physical filesystem, is even if it becomes mature, fully upstream as part of the mainline Linux kernel, and supported out-of-the-box by all the major distributions - still a lot of people are just never going to use it, because there is so much competition in that space (ext4, XFS, btrfs, etc), people are understandably quite conservative (fear of data loss due to bugs), and the fear that a less popular filesystem may end up being abandoned if something unexpected happens to its primary developer (see e.g. ReiserFS)

By contrast, improvements in performance of FUSE, L4-style IPC, could be much more widely beneficial-both for developers of new physical filesystems (by making possible in-user space implementations where they can iterate faster, get better API/ABI stability, easier adoption by end-users), but also for developers of numerous other pieces of software too

Of course, you personally are going to scratch the itch you want to scratch. But in terms of what’s most beneficial for the Linux ecosystem as a whole, I think FUSE improvements and L4-style IPC would deliver the most benefit per unit of effort

koverstreet•7mo ago

I agree about the benefit they'd offer, but the thing is - I already have a todo list that extends out to 2030, and FUSE is going to take a lot of work before it gets there: probably years, because it's going to be done incrementally on top of a big hodgepodge instead of being done right by someone willing to invest the time to get it right.

We've had someone show up claiming "I'm going to do FUSE right!" and it never happened, so - the incremental approach is probably best here. But it's going to take awhile.

toast0•7mo ago

Kernel modules exist. The Linux VFS is a slowly evolving filesystem API. Most Linux distributions boot with initramfs, so it's not hard to use a stable filesystem for the bootloader to read the kernel and initramfs which includes the driver for the experimental filesystem.

Sometimes a new filesystem needs changes to things in the kernel and the VFS API isn't enough, but often VFS is enough.

snvzz•7mo ago

It is indeed a mistake to target Linux, as it guarantees the majority of effort will be spent tracking Linux, rather than working on the filesystem itself.

There are far better options such as FUSE or the filesystem APIs in other operating systems like NetBSD, Haiku, Genode or even ReactOS (and Windows NT).

Some of the best filesystems such as OpenZFS, HAMMER2 or Lustre are developed outside of Linux.

holowoodman•7mo ago

FUSE is what a microkernel filesystem would look like. There are some optimisations that FUSE doesn't do, that microkernels usually have. In the most extreme form, L4 trims down communication primitives to the most efficient platform-specific ways of exchanging memory buffers. In all cases, microkernels and FUSE still need context switches for everything, and those are expensive. If you leave out the context switches, you don't have a microkernel anymore. This is what Windows did by pulling graphics drivers into the kernel, because context switches are slow.

So no. Microkernels have been tried. Microkernel-workalike filesystems are here with FUSE. They suck because microkernels suck when you need performance. Research has gone into different directions, like microkernels as hypervisors and for security, because it has become clear that the performance problems of microkernels are inherent and unfixable.

Dylan16807•7mo ago

> because it has become clear that the performance problems if microkernels are inherent and unfixable.

I don't get the impression that CPU designers have been putting a particularly large focus on making context switches fast. They try but they're busy doing everything else too. If context switches were constant I think silicon would make them go a lot faster.

holowoodman•7mo ago

One thing is that context switches have become less of a problem with the advent of multicore systems. But we are not at a place where you can have your dedicated filesystem core for each filesystem, so they are still relevant.

And yes, CPUs are generally optimized not for context switches, but for numerics and certain single-application benchmarks. So computations without any context switches are very fast. Application to Kernel switches are somewhat fast as well. But application-to-application or even worse application-kernel-privilegedFSprocess-kernel-application will be slow as hell, because those numbers never make it to the TOP500 or benchmark sites.

Context switches also have already been optimized too much, Spectre and Meltdown in all their variants are a certain kind of botched context switch. That means that certain optimizations are now out of the question, context switches will be optimized more carefully (if at all) in the future. So there isn't too much hope for a faster but still secure context switch.

I think the actual future in a decade or so will rather have more cores, fewer context switches and a more computer-network-like CPU. You will have your filesystem server process on the filesystem/disk-IO core and you will speak some filesystem protocol via shared memory over an on-CPU network.

ddtaylor•7mo ago

I fear posting this because it's YouTube content and I don't know of the creator very well beyond these videos, but I have been following this saga a bit from this creator:

https://www.youtube.com/@SavvyNik/videos

He gets a few words wrong because my understanding is he covers the topic in a more broad way, but most of his coverage seems objective and factual. He does have some opinions, but I think it's closer to journalism of the LKML than an opinion piece.

ncrmro•7mo ago

Lots of people are mentioning ZFS, which can’t do hibernation correctly, as ZFS will sometimes still do some writes after RAM has been suspended. Working hibernation, I feel, would complete the story of "here’s my mobile device that is snapshotted and backed up regularly."

I wonder where bcachefs stands with regard to mobile snapshots and hibernation.

koverstreet•7mo ago

Works fine - my main development laptop has been bcachefs for ~8 years, I suspend it all the time :)

I think there have been one or two bug reports in the past from rebalance not freezing in a timely manner (laptops don't usually use rebalance, that's usually a multi device thing), but I think they've been fixed. Send me a bug report if it's not :)

ncrmro•7mo ago

Suspend works fine on ZFS from what I’ve read, but not hibernate, where RAM is written to disk for full power off.

ryao•7mo ago

Linux’s power management APIs that handle these things are behind GPL symbol exports. In theory, it should be fairly simple to resolve this if it were not for that. Right now, if you want hibernation with ZFS, you should use FreeBSD.

eviks•7mo ago

Even if you're an absolute stickler for arbitrary guidelines, Linus can easily just enforce the rule and not merge, that's it! He already sees this FS as very experimental, so any subtle bugs remaining due to the dev not fixing them according to the process are acceptable. Inflating the drama and threatening complete removal is a hissy fit.

kzrdude•7mo ago

Linus didn't threaten removal. Removing it from the kernel is apparently a topic that came up in a non-public maintainer conversation where both Kent and Linus were participating.

eviks•7mo ago

What do you think the quote means then?

kzrdude•7mo ago

A threat is unilateral, and this is not unilateral.

teekert•7mo ago

It's such a shame. We don't quite trust btrfs (but it's probably fine!), we don't quite trust the ZFS license and the fact that it is not in the Kernel (but it's mostly fine!), so Bcachefs would be so nice to have. A modern FS that one uses on their Linux root (or anywhere), as confidently as one does ext4.

But what prevents it ultimately? This ... situation. It makes me sad.

I didn't follow the details but I know that Linus is a reasonable person, and Kent is very thorough and delivering quality. But even if Linus was too much on the conservative-side here (but who's to judge??), please Kent, just fall in line. The alternative is nothing. Go have a beer with Linus.

commandersaki•7mo ago

I just want an intree filesystem that does metadata and data checksumming, compression, encryption, and volumes/snapshots. Bcachefs was the one that ticked all of these. Such a shame to see it turn out this way.

rob_c•7mo ago

Good

Voultapher•7mo ago

I'm sure it can be annoying to collaborate with the other Linux people, especially if social interactions don't come easily to you - as I assume is the case for Kent - but if you want real adoption of your fs and improvement to the status quo, playing by their rules and being humble seems like the most reliable way to get there.

I say that as someone that has donated to the project. I want to have a real alternative to zfs, let's please get there.

CogitoCogito•7mo ago

Reading through Kent Overstreet's comments, it seems totally correct to kick bcachefs out of the kernel. His comments demonstrate very clearly that he's not able to work under the constraints of the current kernel development process.
