Oh no, what happened to Rust will save us from retarded legacy languages prone to memory corruption?
Rust Binder contains the following unsafe operation:
// SAFETY: A `NodeDeath` is never inserted into the death list
// of any node other than its owner, so it is either in this
// death list or in no death list.
unsafe { node_inner.death_list.remove(self) };
This operation is unsafe because when touching the prev/next pointers of
a list element, we have to ensure that no other thread is also touching
them in parallel. If the node is present in the list that `remove` is
called on, then that is fine because we have exclusive access to that
list. If the node is not in any list, then it's also ok. But if it's
present in a different list that may be accessed in parallel, then that
may be a data race on the prev/next pointers.
And unfortunately that is exactly what is happening here. In
Node::release, we:
1. Take the lock.
2. Move all items to a local list on the stack.
3. Drop the lock.
4. Iterate the local list on the stack.
Combined with threads using the unsafe remove method on the original
list, this leads to memory corruption of the prev/next pointers. This
leads to crashes like this one:(The nuance being that sometimes there's a lot of unsafe Rust, because some domains - like kernel programming - necessitate it. But this is still a better state of affairs than having no code be correct by construction, which is the reality with C.)
But as the adjacent commenter notes: having unsafe is not inherently a problem. You need unsafe Rust to interact with C and C++, because they're not safe by construction. This is a good thing!
If Rust doesn't live up to its lofty promises, then it changes the cost-benefit analysis. You might give up almost anything to eliminate all bugs, a lot to eliminate all memory bugs, but what would you give up to eliminate some bugs?
The cost-benefit argument for Rust has always been mediated by the fact that Rust will need to interact with (or include) unsafe code in some domains. Per above, that's an explicit goal of Rust: to provide sound abstractions over unsound primitives that can be used soundly by construction.
The short of it is that for fundamental computer science reasons the ability to always reject unsafe programs comes at the cost of sometimes being unable to verify that an actually-safe program is safe. You can deal with this either by accepting this tradeoff as it is and accepting that some actually-safe programs will be impossible to write, or you can add an escape hatch that the compiler is unable to check but allows you to write those unverifiable programs. Rust chose the latter approach.
> Kinda sounds a lock would make this safe?
There was a lock, but it looks like it didn't cover everything it needed to.
All bugs is typically a strawman typically only used by detractors. The correct claim is: safe Rust eliminates certain classes of bugs. I'd wager the design of std eliminates more (e.g. the different string types), but that doesn't really apply to the kernel.
Which is either 1) not true as evidenced by this bug or 2) a tautology whereby Rust eliminates all bugs that it eliminates.
> 1) not true as evidenced by this bug
Code used unsafe, putting us out of "safe" rust.
So arguably both camps are correct. Those who advocate Rust rewrites, and those who are against it too.
Then the safety comment can easily bias the reader into believing that the author has fully understood the problem and all edge cases.
That's roughly 100% of unsafe code because a lint in the compiler asks for it.
Classic Motte and Bailey. Rust is often said "if it compiles it runs". When that is obviously not the case, Rust evangelicals claim nobody actually means that and that Rust just eliminates memory bugs. And when that isn't even true, they try to mischaracterize it as "all bugs" when, no, people are expecting it to eliminate all memory bugs because that's what Rust people claim.
The real question is "does it provide this greater value for _less_ effort?"
The answer seems to be: "No."
> Since it was in an unsafe block, the error for sure was way easier to find within the codebase than in C. Everything that's not unsafe can be ruled out as a reason for race conditions and the usual memory handling mistakes - that's already a huge win.
The benefit of Rust is you can isolate the possible code that causes an XYZ to an unsafe block*. But that doesn't necessarily mean the error shown is directly related to the unsafe block. Like C++, triggering undefined behavior can in theory cause the program to do anything, including fail spectacularly within seemingly unrelated safe code.
* Excluding cases where safe things are actually possibly unsafe (like some incorrectly marked FFI)
When debugging, we care about where the assumptions we had were violated. Not where we observe a bad effect of these violated assumptions.
I think you get here yourself when you say:
> triggering undefined behavior can in theory cause the program to do anything, including fail spectacularly within seemingly unrelated safe code
The bug isn't where it failed spectacularly. It's where the C++ code triggered undefined behavior.
Put another way: if the undefined behavior _didn't_ cause a crash / corrupted data, the bug _still_ exists. We just haven't observed any bad effects from it.
I believe their point was that they only needed to audit only the unsafe blocks to find the actual root cause of the bug once they had an idea of the problematic area.
As for why there is unsafe in the kernel? There are things, especially in a kernel, that cannot be expressed in safe Rust.
Still, having smaller sections of unsafe is a boon because you isolate these locations of elevated power, meaning they are auditable and obvious. Rust also excels at wrapping unsafe in safe abstractions that are impossible to misuse. A common comparison point is that in C your entire program is effectively unsafe, whereas in Rust it's a subset.
The Rustonomicon makes it very clear that it is generally insufficient to only verify correctness of Rust-unsafe blocks. If the absence of UB in a Rust-unsafe block depends on Rust-not-unsafe code in the surrounding module, potentially the whole module has to be verified for correctness. And that assumes that the module has correct encapsulation, otherwise even more may have to be verified. And a single buggy change to Rust-not-unsafe code can cause UB, if a Rust-unsafe block somewhere depends on that code to be correct.
1: Marketing and social media brigading, as Linus put it.
2: Pattern matching and enums, which are genuinely good.
3: Rust has some trade-offs that are closer to C++ than C, and those trade-offs have advantages and disadvantages.
4: Module system.
5: More modern macros.
6: Other advantages.
Rust also has a large heap of drawbacks.
The author is thinking about "the error" as some source code that's incorrect. "Your error was not bringing gloves and a hat to the snowball fight" but you're thinking "the error" is some diagnostic result that shows there was a problem. "My error is that I'm covered in freezing snow".
Does that help?
If you look at the mentioned patches, the fixes are to code outside the described unsafe block, in Rust-not-unsafe code. It is perfectly possible to introduce UB through changes to "safe" Rust, if those changes end up violating some assumptions in some Rust-unsafe block somewhere.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux...
Another way to introduce UB in Rust-not-unsafe, is if no_std is used. In that case, a simple stack overflow can cause UB, no Rust-unsafe required.
Surprisingly many Rust developers do not understand these points. It may take some of the shine off of Rust, so some Rust fans refrain from explaining it properly. Which is not good.
If we're looking beyond one decade, then:
- As with all other utility, the future must be discounted compared to the present.
- A language that might look sterling today might fall behind tomorrow.
- Something something AI by 2035.
Kernels - and especially the Linux kernel - are high-performance systems that require lots of shared mutable state. Every driver is a glorified while loop waiting for an IRQ so it can copy a chunk of data from one shared mutable buffer to another shared mutable buffer. So there will need to be some level of unsafe in the code.
There's a fallacy that if 95% of the code is safe, and 5% is unsafe, then that code is only 5% as likely to contain memory errors as a comparable C program. But, to reiterate what another commenter said, and something I've predicted for a long time, the tendency for the "unsafe block" to become instrumented by the "safe block" will always exist. People will loosen the API contract between the "safe" and "unsafe" sides until an error in the "safe" side kicks off an error in the "unsafe" side.
pub(crate) fn release(&self) {
let mut guard = self.owner.inner.lock();
while let Some(work) = self.inner.access_mut(&mut guard).oneway_todo.pop_front() {
drop(guard);
work.into_arc().cancel();
guard = self.owner.inner.lock();
}
- let death_list = core::mem::take(&mut self.inner.access_mut(&mut guard).death_list);
- drop(guard);
- for death in death_list {
+ while let Some(death) = self.inner.access_mut(&mut guard).death_list.pop_front() {
+ drop(guard);
death.into_arc().set_dead();
+ guard = self.owner.inner.lock();
}
}
And here is the unsafe block mentioned in the commit message with some more context [3]: fn set_cleared(self: &DArc<Self>, abort: bool) -> bool {
// <snip>
// Remove death notification from node.
if needs_removal {
let mut owner_inner = self.node.owner.inner.lock();
let node_inner = self.node.inner.access_mut(&mut owner_inner);
// SAFETY: A `NodeDeath` is never inserted into the death list of any node other than
// its owner, so it is either in this death list or in no death list.
unsafe { node_inner.death_list.remove(self) };
}
needs_queueing
}
[0]: https://lore.kernel.org/linux-cve-announce/2025121614-CVE-20...[1]: https://github.com/torvalds/linux/commit/3e0ae02ba831da2b707...
[2]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...
[3]: https://github.com/torvalds/linux/blob/3e0ae02ba831da2b70790...
kahlonel•1h ago