Bonus: recent talk from Ralf Jung on his group's efforts to precisely specify Rust's operational semantics in executable form in a dialect of Rust: https://youtube.com/watch?v=yoeuW_dSe0o
> The problem with unsafe code is that it can do things like this:
fn main() {
let mut x = 42;
let ptr = &mut x as *mut i32;
let val = unsafe { write_both(&mut *ptr, &mut *ptr) };
println!("{val}");
}
No it can't? Using pointers to coexist multiple mutable references to the same variable is undefined behavior. Unless I'm just misunderstanding the point they're trying to make here."Unsafe code allows to express the following, which is UB:"
- Anything accepted by the borrow checker is legal
- Unsafe can express illegal / undefined behavior
- There's some set of rules, broader than what the borrow checker can check, that is still legal / defined behavior
The goal of this line of work is to precisely specify that set of rules. The outlines are clear (basically, no writable pointers should alias) but the details (interior pointers, invalidation of iterators, is it creating or using bad pointers that's bad, etc) are really hard. The previous paper in this series, on Stacked Borrows, was simpler but more restrictive, and real-world unsafe code often failed its rules (while still seeming correct). Tree Borrows is broader and allows more while still being provably safe.
Note that we have not yet proven this. :) I hope to one day prove that every program accepted by the borrow checker is compatible with TB, but right now, that is only a (very well-tested) conjecture.
> Given that aliasing optimizations are something that the Rust compiler developers clearly want to support, we need some way of “ruling out” counterexamples like the one above from consideration.
Yes, but which exact rule does it violate? What is the exact definition that says that it is UB? Tree Borrows is a proposal for exactly such a definition.
"code can do things like this" here means "you can write this code and compile it and run it and it will do something, and unless we have something like Tree Borrows we have no argument for why there would be anything wrong with this code".
You seem to have already accepted that we need something like Tree Borrows (i.e., we should say code like this is UB). This part of the paper is arguing why we need something like Tree Borrows. :)
You misunderstand the word "can". Yes, you can, in unsafe code, do that. And yes, that is undefined behaviour ;)
https://play.rust-lang.org/?version=stable&mode=debug&editio...
And one could say that they borrow from the tree some of their qualities. Sorry, couldn't resist.
That's before he went into politics, though.
All have different costs and capabilities across implementation, performance and developer experience.
Then we have what everyone else besides Rust is actually going for, the productivity of automatic resource management (regardless of how), coupled with one of the type systems above, only for performance critical code paths.
Doesn't matter if it's purely "syntaxical" because the language is garbage collected, just the fact of specifying what owns what and be explicit about multiple references is great imo.
Some sort of effects systems can already be simulated with Kotlin features too.
Programming language theory is so interesting!
I'd just like to interject for a moment. What you’re referring to as "affine types", is in fact, Uniqueness Types. The difference has to do with how they interact with unrestricted types. In Rust, these "unrestricted types" are references (which can be used multiple times due to implementing Copy).
Uniqueness types allow functions to place a constraint on the caller ("this argument cannot be aliased when you pass it to me"), but places no restriction on the callee. This is useful for Rust, because (among other reasons) if a value is not aliased you can free it and be sure that you're not leaving behind references to freed data.
Affine types are the opposite - they allow the caller to place a restriction on the callee ("I'm passing you this value, but you may use it at most once"), which is not something possible to express in Rust's type system, because the callee is always free to create a reference from its argument and pass that reference to multiple functions..
This is often explained via the "do not use more than once rule", but that's not the actual definition, and as your example shows, following that simplified explanation to the letter can cause confusion.
> because the callee is always free to create a reference from its argument and pass that reference to multiple functions..
Passing a reference is not the same thing as passing the actual value, so this does not contradict affinity.
Maybe a dumb question but couldn't you just run multiple implementations in parallel threads and whichever finishes first with a positive result wins?
It has migrated from a scope-based borrow checker to non-lexical borrow checker, and has next experimental Polonius implementation as an option. However, once the new implementation becomes production-ready, the old one gets discarded, because there's no reason to choose it. Borrow checking is fast, and the newer ones accept strictly more (correct) programs.
You also have Rc and RefCell types which give you greater flexibility at cost of some runtime checks.
"Rust", in this context, is "merely" "the usual invariants that people want" and "a suite of optimizations that assume those usual invariants, but not more or less".
[1] https://github.com/Voultapher/sort-research-rs/blob/main/wri... Miri column
[2] https://github.com/rust-lang/rust/blob/6b3ae3f6e45a33c2d95fa...
And it seams to not be the case on the stable compiler version?
fn write(x: &mut i32) {*x = 10}
fn main() {
let x = &mut 0;
let y = x as *mut i32;
//write(x); // this should use the mention implicit twophase borrow
*x = 10; // this should not and therefore be rejected by the compiler
unsafe {*y = 15 };
}
rustc itself has no reason to reject either version, because y is a *mut and thus has no borrow/lifetime relation to the &mut that x is, from a compile-time/typesystem perspective.
How true is this really?
Torvalds has argued for a long time that strict aliasing rules in C are more trouble than they're worth, I find his arguments compelling. Here's one of many examples: https://lore.kernel.org/all/CAHk-=wgq1DvgNVoodk7JKc6BuU1m9Un... (the entire thread worth reading if you find this sort of thing interesting)
Is Rust somehow fundamentally different? Based on limited experience, it seems not (at least, when unsafe is involved...).
But in the end, it's a trade-off, like everything in language design. (In life, really. ;) We think that in Rust we may have found a new sweet spot for this kind of optimizations. Time will tell whether we are right.
In the safe subset of Rust it's guaranteed in all cases. Even across libraries. Even in multi-threaded code.
C's aliasing is based on type alone, hence its other name "type based alias analysis" or TBAA.
In C you have a nuclear `restrict` that in my experience does anything only when applied to function arguments across clang & gcc, and type-based aliasing which is both not a generally-usable thing (don't have infinite different copies of the int64_t type (and probably wouldn't want such either)), and annoying (forces using memcpy if you want to reinterpret to a different type).
Whereas with Rust references you have finely-bounded lifetimes and spans and mutability, and it doesn't actually care about the "physical" types, so it is possible to reinterpret memory as both `&mut i32`/`&i32` and `&mut i64`/`&i64` and switch between the two for the same memory, writing/reading halves of/multiple values by the most bog standard Rust safe reads & writes, as long as the unsafe abstraction never gives you overlapping `&mut` references at the same time.
But like Linus, I've noticed it doesn't seem to make much difference outside of obvious narrow cases.
pvg•4h ago
https://news.ycombinator.com/item?id=22281205
https://news.ycombinator.com/item?id=17715399