There are open ideas for how to handle “view types” that express that you’re only borrowing specific fields of a struct, including Self, but they’re an ergonomic improvement, not a semantic power improvement.
Right, and even more to the point, there's another important property of Rust at play here: a function's signature should be the only thing necessary to typecheck the program; changes in the body of a function should not cause a caller to fail. This is why you can't infer types in function signatures and a variety of other restrictions.
It's easier to "abuse" in some languages with casts, and of course borrow checking is not common, but it also seems like just "typed function signatures 101".
Are there common exceptions to this out there, where you can call something that says it takes or returns one type but get back or send something entirely different?
I would personally consider null in Java to be an exception to this.
Much analysis is delayed until all templates are instantiated, with famously terrible consequences for error messages, compile times, and tools like IDEs and linters.
By contrast, Rust's monomorphization achieves many of the same goals, but is less of a headache to use because once the signature is satisfied, codegen isn't allowed to fail.
That's the whole point of Concepts, though.
> Here is the most famous implication of this rule: Rust does not infer function signatures. If it did, changing the body of the function would change its signature. While this is convenient in the small, it has massive ramifications.
Many languages violate this. As another commenter mentioned, C++ templates are one example. Rust even violates it a little - lifetime variance is inferred, not explicitly stated.
struct Point {
    x: f64,
    y: f64,
}
impl Point {
    fn x_mut(&mut self) -> &mut f64 {
        &mut self.x
    }
    fn y_mut(&mut self) -> &mut f64 {
        &mut self.y
    }
}
the returned references are, for the purposes of aliasing rules, references to the entire struct rather than to pieces of it. `x` and `y` are implementation details of the struct and not part of its public API. Yes, this is occasionally annoying but I think the inverse (the borrow checker looking into the implementations of functions, rather than their signature, and reasoning about private API details) would be more confusing.
I also disagree with the author that his rejected code:
fn main() {
    let mut point = Point { x: 1.0, y: 2.0 };
    let x_ref = point.x_mut();
    let y_ref = point.y_mut();
    *x_ref *= 2.0;
    *y_ref *= 2.0;
}
"doesn't even violate the spirit of Rust's ownership rules."
I think the spirit of Rust's ownership rules is quite clear that when calling a function whose signature is
fn f<'a>(param: &'a mut T1) -> &'a mut T2;
`param` is "locked" (i.e., no other references to it may exist) for the lifetime of the return value. This is clear once you start to think of Rust borrow-checking as compile-time reader-writer locks.
This is often necessary for correctness (because there are many scenarios where you need to be guaranteed exclusive access to an object beyond just wanting to satisfy the LLVM "noalias" rules) and is not just an implementation detail: the language would be fundamentally different if instead the borrow checker tried to loosen this requirement as much as it could while still respecting the aliasing rules at a per-field level.
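For what it's worth, the usual way to get both references while staying inside that signature-level rule is to hand them out from a single call. A minimal sketch, assuming a made-up helper `xy_mut` (and a made-up `double_both` to exercise it) that are not in the article:

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Hypothetical helper, not from the article: one exclusive borrow of
    // `self` is split into two disjoint field borrows inside a single body.
    // The signature itself tells the caller that both references are live
    // at the same time, so no peeking into the body is needed.
    fn xy_mut(&mut self) -> (&mut f64, &mut f64) {
        (&mut self.x, &mut self.y)
    }
}

// Doubles both coordinates through two simultaneously live `&mut f64`.
fn double_both(point: &mut Point) {
    let (x_ref, y_ref) = point.xy_mut();
    *x_ref *= 2.0;
    *y_ref *= 2.0;
}

fn main() {
    let mut point = Point { x: 1.0, y: 2.0 };
    double_both(&mut point);
    println!("{} {}", point.x, point.y);
}
```

`point` is still locked as a whole while either reference is alive, which is exactly the reader-writer-lock reading of the signature.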
He's basically talking about the rigidity that Rust's borrow checking imposes on a program's data design. Once you've got the program following all the rules, it can be extraordinarily difficult to make even a minor change without incurring a time-consuming and painful refactor.
This is an argument about the language's ergonomics, so it seems like a fair criticism.
Unfortunately, this behavior does sometimes occur with Send bounds in deeply nested async code, which is why I mostly refrain from using colored-function style asynchronous code at all in favor of explicit threadpool management, which the borrow checker excels at compared to every other language I've used.
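A small sketch of the mechanism behind the `Send` case, using a thread and a closure instead of async so it needs only the standard library. An `async fn` returns an opaque future type in the same way this function returns an opaque closure type, so the same leak applies there. The names `make_job` and `run_on_thread` are made up for illustration:

```rust
use std::thread;

// The signature promises only "some closure returning u32". Whether that
// closure is `Send` is written nowhere; the compiler reads it off the body.
fn make_job(n: u32) -> impl FnOnce() -> u32 {
    let data: Vec<u32> = (1..=n).collect();
    // Capturing an `Rc<Vec<u32>>` here instead would leave the signature
    // untouched but make the closure `!Send`, and `run_on_thread` below
    // would stop compiling.
    move || data.iter().sum()
}

// `thread::spawn` requires its argument to be `Send`, so this caller
// silently depends on the body of `make_job`.
fn run_on_thread(n: u32) -> u32 {
    thread::spawn(make_job(n)).join().expect("worker panicked")
}

fn main() {
    println!("{}", run_on_thread(4));
}
```

So this is one of the few places where a change inside a body can break a caller that never saw the signature change.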
I don't think that was ever the intent behind the borrow checker but it is definitely an outcome.
So yes, the borrow checker makes some code more awkward than it would be in GC languages, but the benefits are easily worth it and they stretch far beyond memory safety.
If a language is bad, but you must use it, then yes, learn it. But if the borrow checker is a source of pain in Rust, why not admit it needs work instead of saying that “it makes you better”?
I’m not going to start writing brainfuck because it makes me a better programmer.
The point is the borrow checker has already gone beyond the point where the benefits outweigh those annoyances.
It's like... Static typing. Obviously there are cases where you're like "I know the types are correct! Get out of my way compiler!" but static types are still vastly superior because of all the benefits they convey in spite of those occasional times when they get in the way.
I don't think a smarter borrow checker could solve most of the issues the author raises. The author wants borrow checking to be an interprocedural analysis, but it isn't one by design. Everything the borrow checker knows about a function is in its signature.
Allowing the borrow checker to peek inside of the body of local methods for the purposes of identifying partial borrows would fundamentally break the locality of the borrow checker, but I think that as long as that analysis is only extended to methods on the local trait impl, it could be done without too much fanfare. These two things would be relaxations of the borrow checker rules, making it smarter, if you will.
The author may have a point in the idea of borrowing record fields separately. It is possible if we assume that the fields are completely orthogonal and can be mutated independently without representing an incorrect state. It would be a good option to have.
But a doubly-linked list (or graph) just can't be safely represented in the existing reference semantics. Dropping a node would lead to a dangling pointer, or several. An RDBMS can handle that ("on delete set null"), because it requires a specific way to declare all such links, so that they can be updated (aka "foreign keys"). A program usually does not (Rc / Arc or shared_ptr provide a comparable capability though).
Of course a bidirectional link is a special case, because it always provides a backlink to the object that would have a dangling pointer. The problem is that the borrow checker does not know that these two pointers belonging to different structs are a pair. I wish Rust had direct support for such things, then, when one end of the bidirectional link dies, the borrow checker would unset the pointer on the reciprocal end. Linking the objects together would also be a special operation that atomically sets both pointers.
In a more general case, it would be interesting to have a way to declare some invariants over fields of a struct. A mutual pair of pointers would be one case, allowing / forbidding to borrow two fields at once would be another. But we're far from that.
Special-casing some kinds of pointers is not unheard of; almost every GC-based language offers weak references (and Java, also Soft and Phantom references). I don't see why Rust could not get a BidirectionalRef of sorts.
Until then, Arc or array with indexes seem to be the only guaranteed memory-safe approaches.
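For reference, the closest thing the standard library offers today is an owning `Rc` in one direction and a `Weak` in the other. A rough single-threaded sketch (the names `Node`, `Link`, `new_node`, `link`, `prev_name` and `next_name` are invented for this example; `Arc` and `std::sync::Weak` would be the multi-threaded analogue):

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

type Link = Rc<RefCell<Node>>;

struct Node {
    name: String,
    next: Option<Link>,        // owning forward link
    prev: Weak<RefCell<Node>>, // non-owning back link
}

fn new_node(name: &str) -> Link {
    Rc::new(RefCell::new(Node {
        name: name.to_string(),
        next: None,
        prev: Weak::new(),
    }))
}

// Sets both halves of the pair in one place. Nothing in the language
// enforces that the two fields stay in sync; that is still on the programmer.
fn link(first: &Link, second: &Link) {
    first.borrow_mut().next = Some(Rc::clone(second));
    second.borrow_mut().prev = Rc::downgrade(first);
}

fn prev_name(node: &Link) -> Option<String> {
    // `upgrade` returns `None` once the node behind the back link is gone.
    let prev = node.borrow().prev.upgrade()?;
    let name = prev.borrow().name.clone();
    Some(name)
}

fn next_name(node: &Link) -> Option<String> {
    let next = node.borrow().next.clone()?;
    let name = next.borrow().name.clone();
    Some(name)
}

fn main() {
    let a = new_node("a");
    let b = new_node("b");
    link(&a, &b);
    println!("{:?} {:?}", next_name(&a), prev_name(&b));
    drop(a); // the back link in `b` now behaves like "on delete set null"
    println!("{:?}", prev_name(&b));
}
```

The cost is runtime reference counts and `RefCell` checks, so it is a library workaround rather than the borrow-checker-level pairing wished for above.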
Also, in the whole article I could not find a single reason why the author chose Rust, but I suppose it's because of its memory efficiency, considering the idea of keeping large graphs in RAM. Strictly speaking, Go could be about as efficient, but it has other inflammation points. C++... well, I hope Rust is not painful enough to resort to that.
This is a moot statement. Here is a thought experiment that demonstrates the pointlessness of languages like Rust in terms of correctness.
Let's say your goal is ultimate correctness, i.e., for any possible input/initial state, the program produces a known and deterministic output.
You can choose 1 of 2 languages to write your program in:
First is standard C
Second is an absolutely strict programming language that incorporates not only Rust-style memory ownership, but also requires every single object to have a well-defined type that determines not only the set of values the object can have, but also the operations on that object, which produce other well-defined types. Basically, the idea is that if your program compiles, it's by definition correct.
The issue is, the time it takes to develop the program to be absolutely correct is about the same. In the first case with C, you would write your program with carefully designed memory allocation (something like mempool that allocates at the start), you would design unit tests, you would run valgrind, and so on.
In the second case, you would spend a lot more time carefully designing types and operations, leading to a lot of churn of code-compile-fix error-repeat, taking you way longer to develop the program.
You could argue that the programmer is somewhat incompetent (for example, forgets to run valgrind), so the second language will have a higher chance of being absolutely correct. However, the argument still holds: in the second language, a slightly incompetent programmer can be lazy and define wide-ranging types (similar to `any` in languages like TypeScript), leading to technical correctness, but logic bugs.
So in the end, it really doesn't matter which language you choose if you want ultimate correctness, because it's all up to the programmer. However, if your goal is rapid prototyping, and you can guarantee that your input is constrained to a certain range, then even though out-of-range input will lead to a memory bug or failure of some sort, programming in something like C is going to be more efficient, whereas the second language will force you to write a lot more code for basic things.
Working at a company with lots of systems written by former employees running in production… the advantages of Rust become starkly obvious. If it’s C++, I walk on eggshells. I have to become a Jedi master of the codebase before I can make any meaningful change, lest I become responsible for some disaster. If it’s Rust, I can just do stuff and I’ve never broken anything. Unit tests of business logic are all the QA I need. Other than that, if it compiles it works.
This is not true. Case in point: Java. Many times simpler than Rust, and large codebases are as horrible as C++ ones.
Without something like that, I think it just would have been impossible for Rust to gain enough momentum, and also attract the sort of people that made its culture what it is.
Otherwise, IMO Rust would have ended up just like D, a language that few people have ever used, but most people who have heard of it will say "apparently it's a better safer C++, but I'm not going to switch because I can technically do all that stuff in C++"
Hell, the early versions of the Rust compiler were written in OCaml...
Your list is at least missing PHP, TypeScript, Swift, Go, Lua, Ruby and Rust though.
But OCaml really doesn't belong anywhere close to this list.
OCaml runs software that billions use, is used by financial and defense firms, plus Facebook.
But Lua? By that metric I'm throwing in every language I've ever seen a job for...
R, Haskell, Odin, Lisp, etc...
Edit - this site is basically a meme at this point. Roblox is industrial strength but Facebook, Dassault and trading firms are "hobby". Lol.
Also, I'm not dissing Lua, there's just irony in calling Lua industrial but not OCaml...
Do realize that LuaJIT for years was bankrolled by corporations.
Lua, Bash ... these are birds of a feather. They are the glue holding things together all over the place. No one thinks about them but if they disappeared overnight a LOT of stuff would fall apart.
I kind of like that Ruby is still focusing on single developer/small team productivity.
Also I would argue the rust compiler started as a hobby project
> Yes, the first prototype of React was written in SML; we then moved onto OCaml.
> Jordan transcribed the prototype into JS for adoption; the SML version of React, however great it might be, would have died in obscurity. The Reason project's biggest goal is to show that OCaml is actually a viable, incremental and familiar-looking choice. We've been promoting this a lot but I guess one blog post and testimonial helps way more.
A version of React was built to run in ReasonML, which is a flavor of OCaml for the web, but Reason didn't even exist before React was fairly well established.
Hell, Facebook's own XHP's interface (plus PHP/Hack's execution model) is more conceptually relatable to React, and its initial development predates Jordan's time at Facebook. It wasn't JavaScript, but at the very least it defined rails for writing applications that used the DOM.
Also, OCaml had trouble with multithreading for quite some time, which was a limiting factor for many applications.
Facebook made a large effort to thrust OCaml into the limelight, and even wrote a nice alternative frontend (Reason). Sadly, it did not stick.
Old but funny comparison: http://adam.chlipala.net/mlcomp/
> 1/100th the momentum and community and resources of Go or Rust
I think even 1/100 would be pretty generous.
* Rust has a C++-flavored syntax, but OCaml has a relatively alien ML-flavored syntax.
* Rust has the backing of Mozilla, but I don't think OCaml had comparable industry backing. (Jane Street, maybe?)
I do not at all agree with this. Rust is by far the most complex language in terms of syntax that has ever become popular enough to compare it to anything.
But you use it more and see ActionScript-style function notation, functional-language semantics where you just "return" whatever was the last expression in a block, how structs have no bodies (and classes aren't a thing) and instead everything is implemented externally, and it starts to really become its own beast.
The difference between academic languages such as OCaml or Haskell and industry languages such as Java or C# is hundreds of millions of dollars in advertising. It's not limited to academia: plenty of languages from other horizons that weren't backed by companies with a vested interest in you using their language have failed too.
You should probably not infer too much from a language's success or failure.
Java and C# are the only ones that fit this. Go and Rust had some publicity from being associated with Google and Mozilla, but they both caught on without "millions of dollars in advertising" too. Endorsement by big companies like MS came much later for Rust, and Google only started devoting some PR to Go after several years of it already catching momentum.
Yes.
> in advertising
No, in hiring 500 compiler and tool developers, developing and supporting libraries, optimizing it for niche use cases.
No amount of advertising is going to propel Haskell to a mainstream language. If it wants to succeed (and let's be honest, it probably doesn't), it's going to need an investment of millions of developer-hours in libraries and tooling. No matter how pretty and elegant the language may be, if you have to reinvent the wheel every time you go beyond "hello world" you're going to think twice before considering it for production code.
If you look at it from that perspective, then Rust is the hobby language.
How different?
It's not about not having a C-like syntax (huge mainstream points lost), good momentum, and not having the early marketing clout that came from Rust being Mozilla's "hot new language".
Most of my smaller projects don't benefit so much from the statically proven compile-time guarantees that e.g. Rust with its borrow checker provides. They're simple enough to more-or-less exhaustively test. They also tend to have simple enough data models and/or lax enough latency requirements that garbage collectors aren't a drawback. C#? Kotlin? Java? JavaScript? ??? Doesn't matter. I'm writing them in Rust now, and I'm comfortable enough with the borrow checker that I don't feel it slows me down, but I wouldn't have learned Rust in the first place without a borrow checker to draw me in, and I respect when people choose to pass on the whole circus for similar projects.
The larger projects... for me they tend to be C++, and haven't been rewritten in Rust, so I'm tormented with a stream of bugs, a large portion of which would've been prevented - or at least made shallow - by Rust's borrow checker. Every single one of them taunts me with how theoretically preventable they are.
Except both of these things are that way for a reason.
The author talks about the pain of having to refactor because of the borrow checker. Everyone laments having to deal with errors in Go. These are features, not bugs. They are forcing functions to get you to behave like an adult when you write code.
Dealing with error conditions at "Google scale" means you need everyone to be a good citizen to keep signal to noise down. Go solves a very Google problem: don't let junior devs leave trash at the campsite, force them to be good boy scouts. It is Conway's law in action (and it is a good thing).
Rust's forced refactors make it hard to leave things dangling. It makes it hard to have weak design. If you have something "stable", from a product, design and functionality standpoint, then Rust is amazing. This is sort of antithetical to "go fast and break things" (use TypeScript or Python if you need this). It's antithetical to requirements written in the stand-up that change week to week, where your artifacts are pantomime and Post-it notes.
Could the borrow checker be better? Sure, and so could errors in Go. But most people would still find them a reason to complain even after their improvement. The features are a product of design goals.
Also, in my experience, the Rust maintainers generally err on the side of pragmatism rather than opinionatedness; language design decisions generally aren't driven by considerations like "this will force junior developers to adhere to the right discipline". Rust tries to be flexible, because people's requirements are flexible, especially in the domain of low-level programming. In general, they try to err on the side of letting you write your code however you want, subject to the constraints of the language's two overriding design goals (memory safety and precise programmer control over runtime behavior). The resulting language is in many ways less flexible than some more opinionated languages, but that's because meeting those design goals is inherently hard and forces compromises elsewhere (and because the language has limited development resources and a large-but-finite complexity budget), not because anyone views this as a positive in and of itself.
(The one arguable exception to this that I can think of is the lack of syntactic sugar for features like reference counting and fallible operations that are syntactically invisible in some other languages. That said, this is not just because some people are ideologically against them; they've been seriously considered and haven't been rejected outright, it's just that a new feature requires consensus in favor and dedicated resources to make it happen. "You can do the thing but it requires syntactic salt" is the default in Rust, because of its design, and in these cases the default has prevailed for now.)
I don't think this is a bad thing but it's a funny consequence that to become mainstream you have to (1) announce a cool new feature that isn't in other languages (2) eventually accept the feature is actually pretty niche and your average developer won't get it (3) sand off the weird features to make another "C but slightly better/different"
That is exactly how it was sold.
A safe C, or a nicer simpler Java.
Nobody cared about Erlang back then and nobody does today.
I write Erlang for a living.
It's never been "safe C" because it's garbage collected. Java is truly the comp because it's a great Grug language.
I also wrote some Erlang in the past, I really enjoy it and I was sad that Go didn't borrow more.
Nobody may have known they cared about Erlang, but those features sure made people pay attention.
Go's selling points are different: it takes a weekend to learn, and a week to become productive, it has a well-stocked standard library, it compiles quickly, runs quickly enough, and produces a single self-contained executable.
I would say that Go is mostly a better Modula-2 (with bits of Oberon); it's only better from the language standpoint because now it has type parameters, but GC definitely helps make writing it simpler.
There are numerous interviews with Rob Pike about the design of Go from when Go was still being developed, and Erlang doesn't come up in anything that I can find other than this interview from 2010 where someone asks Rob Pike a question involving Erlang and Rob replies by saying he thinks the two languages take fairly different approaches:
https://www.youtube.com/watch?v=3DtUzH3zoFo
It's at the 32 minute mark, but once again this is in response to someone asking a question.
Here are other interviews about Go, and once again in almost every interview I'd say Rob kind of insinuates he was motivated by a dislike of using C++ within Google to write highly parallel services, but not once is Erlang ever mentioned:
https://www.informit.com/articles/article.aspx?p=1623555
- Java (popular among people who went to college and learned all about OOP or places that had a lot of "enterprise" software development)
- Ruby on Rails (which was the hot new thing)
- Python or Perl to be the P in your LAMP stack
- C++ for "performance"
All of these were kitchen sink choices because they wound up needing to do everything. If you went back in time and said you were building a language that didn't do something incredibly common and got in the way of your work, no one would pick it up.
>More amorphous, but not less important is Rust's strong cultural affinity for correctness. For example, go to YouTube and click on some Rust conference channel. You'll see that a large fraction of the talks are on correctness, in some way or another. That's not something I see in Julia or Python conference talks.
And it creates an interesting chicken-and-egg situation. The borrow checker may indeed be too strict (and of course, has its edge cases and outright bugs), but its existence (rather than the utility it brings) may have in fact attracted and amassed an audience who cares about correctness above all else. Even if we abolished the borrow checker tomorrow, this audience may still maintain a certain style based on such principles, partly because the other utilities of Rust were built around it.
It's very intriguing. But like anything else trying to attract people, you need something new and flashy to get people in the door. Even for people who traditionally try to reject obvious sales pitches.
This is a real problem across the entire industry, and Rust is a particularly egregious example because you get to justify playing with the fun stimulating puzzle machine because safety—you don't want unsafe code, do you? Meanwhile there's very little consideration to whether the level of rigidity is justified in the problem domain. And Rust isn't alone here, devs snort lines of TypeScript rather than do real work for weeks on end.
Uh, no thanks.
> and get the same benefit.
Not quite.
Sometimes you can't afford that though, from web browsers to MCUs to hardware drivers to HFT.
With Rust, you're battling a compiler that has a very restrictive model, that you can't shut up. You will end up performing major refactors to implement what seem like trivial additions.
> But what's the point of the rules in this case, though? Here, the ownership rules does not prevent use after free, or double free, or data races, or any other bug. It's perfectly clear to a human that this code is fine and doesn't have any actual ownership issues
I mean, of course there is an obvious ownership issue with the code above: how are the destructors supposed to be run without freeing the Id object twice?
A more precise way to phrase what he's getting at would be something like "all types that _can_ implement `Copy` should do so automatically unless you opt out", which is not a crazy thing to want, but also not very important (the ergonomic effect of this papercut is pretty close to zero).
> A more precise way to phrase what he's getting at would be something like "all types that _can_ implement `Copy` should do so automatically unless you opt out", which is not a crazy thing to want,
From a memory safety PoV it's indeed entirely valid, but from a programming logic standpoint it sounds like a net regression. Rust's move semantics are such a bliss compared to the hidden copies you have in Go (Go not having pointer semantics by default is one of my biggest gripes with the language).
The only thing that changes if the type is Copy is that after executing that line, you are still allowed to use y.
Yes when an item is Copy-ed, you are still allowed to use it, but it means that you now have two independent copies of the same thing, and you may edit one, then use the other, and be surprised that it hasn't been updated. (When I briefly worked with Go, junior developers with mostly JavaScript or Python experience would fall into this trap all the time). And given that most languages nowadays have pointer semantics, having default copy types would lead to a very confusing situation: people would need to learn about value semantics AND about move semantics for objects with a destructor (including all collections).
No thanks. Rust is already complex enough for beginners to grasp.
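To make the two behaviours being argued about concrete, here is a tiny sketch. `Settings` and `copy_then_edit` are made-up names; the commented-out half shows the move case, which does not compile:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Settings {
    retries: u32,
}

// Returns (original, copy) after editing only the copy.
fn copy_then_edit() -> (Settings, Settings) {
    let y = Settings { retries: 3 };
    let mut x = y; // `Settings` is `Copy`, so this duplicates the value
    x.retries += 1; // edits the copy only
    (y, x) // `y` is still usable and still holds 3
}

// The same shape with a non-`Copy` type is a move instead:
//     let y = vec![3];
//     let mut x = y;
//     x.push(4);
//     y.len(); // error[E0382]: borrow of moved value: `y`

fn main() {
    let (y, x) = copy_then_edit();
    println!("{:?} {:?}", y, x);
}
```

With `Copy` you get the "two independent values" surprise described above; without it the compiler refuses the second use of `y` outright.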
This program fails to compile:
#[derive(Clone, Copy)]
struct S;
impl Drop for S {
    fn drop(&mut self) {}
}
fn main() {}
So, whenever you wanted to implement Drop you'd need to engage the escape hatch.
> all types that _can_ implement `Copy` should do so automatically unless you opt out
, which was explicitly intended to exclude types with destructors, not
> types should auto-derive `Copy` based purely on an analysis of their fields.
https://doc.rust-lang.org/std/primitive.pointer.html#impl-Co...
The borrow checker is certainly Rust’s claim to fame. And a critical reason why the language got popular and grew. But it’s probably not in my Top 10 favorite things about using Rust. And if Rust as it exists today existed without the borrow checker it’d be a great programming experience. Arguably even better than with the borrow checker.
Rust’s ergonomics, standardized cargo build system, crates.io ecosystem, and community commitment to good API design are probably my favorite things about Rust.
The borrow checker is usually fine. But it does require a staunch commitment to RAII, which is not fine. Rust is absolute garbage at arenas. No, bumpalo doesn’t count. So Rust w/ borrow checker is not strictly better than C. A Rust without a borrow checker would probably be strictly better than C and almost C++. Rust generics are mostly good, and C++ templates are mostly bad, but I do badly wish at times that Rust just had some damn template notation.
Mind explaining why? I have had good experiences with bumpalo.
My last attempt is I had a text file with a custom DSL. Pretend it’s JSON. I was parsing this into a collection of nodes. I wanted to dump the file into an arena. And then have all the nodes have &str living in and tied to the arena. I wanted zero unnecessary copies. This is trivially safe code.
I’m sure it’s possible. But it required an ungodly amount of ugly 'a lifetime markers and I eventually hit a wall where I simply could not get it to compile. It’s been a while so I forget the details.
I love Rust. But you really really have to embrace the RAII or your life is hell.
[1]: https://crates.io/crates/arcstr [2]: https://crates.io/crates/imstr
Let’s pretend I was in C. I would allocate one big flat segment of memory. I’d read the “JSON” text file into this block. Then I’d build an AST of nodes. Each node would be appended into the arena. Object nodes would contain a list of pointers to child nodes.
Once I built the AST of nested nodes of varying type I would treat it as constant. I’d use it for a few purposes. And then at some point I would free the chunk of memory in one go.
In C this is trivial. No string copies. No duplicated data. Just a bunch of dirty unsafe pointers. Writing this “safely” is very easy.
In Rust this is… maybe possible. But brutally difficult. I’m pretty good at Rust. I gave up. I don’t recall exactly what wall I hit.
I’m not saying it can’t be done. But I am saying it’s really hard and really gross. It’s radically easier to allocate lots of little Strings and Vecs and Box each nested value. And then free them all one-by-one.
use bumpalo::Bump;
use std::io::Read;

fn main() {
    let mut arena = Bump::new();
    loop {
        read_and_process_lines(&arena);
        // Frees everything allocated during this iteration in one go.
        arena.reset();
    }
}

#[derive(Debug)]
enum AstNode<'a> {
    Leaf(&'a str),
    Branch {
        line: &'a str,
        meta: usize,
        cons: &'a mut AstNode<'a>,
    },
}

fn read_and_process_lines(arena: &Bump) {
    let cap = 40;
    // The input buffer itself lives in the arena.
    let buf: &mut [u8] = arena.alloc_slice_fill_default(cap);
    let l = std::io::stdin().lock().read(buf).expect("reading stdin");
    let content: &str = std::str::from_utf8(&buf[..l]).unwrap();
    dbg!(content);
    let mut lines = content.lines();
    // Every node borrows its text from the arena-backed buffer: no string copies.
    let mut latest: &mut AstNode<'_> = arena.alloc(AstNode::Leaf(lines.next().unwrap()));
    for line in lines {
        latest = arena.alloc(AstNode::Branch { line, meta: 0, cons: latest });
    }
    println!("{latest:?}");
}
If you can get a full JSON parser working then maybe I’m just wrong. Arrays, objects with keys/values, etc.
I’d like to think I’m a decent Rust programmer. Maybe I just need to give it another crack and if I fail again turn it into a blog post…
* a very nice package manager
* Libraries written in it tend to be more modular and composable.
* You can more confidently compile projects without worrying too much about system differences or dependencies.
I think this is because:
* It came out during the Internet era.
* It's partially to do with how cargo by default encourages more use of existing libraries rather than reinventing the wheel or using custom/vendored forks of them.
* It doesn't have dynamic linking unless you use FFI. So Rust can still run into issues here but only when depending on non-Rust libraries.
In principle, the language already has raw pointers with the same expressive power as in C, and unlike references they don't have aliasing restrictions. That is, so long as you only use pointers to access data, this should be fine (in the sense of, it's as safe as doing the same thing in C or Zig).
Note that this last point is not the same as "so long as you don't use references" though! The problem is that aliasing rules apply to variables themselves: e.g., in safe Rust, taking a mutable reference to, say, a local variable and then writing directly to that variable is forbidden, so doing the same with raw pointers is UB. So if you want to be on the safe side, you must never work with variables directly: you must always take a pointer first and then do all reads and writes through it, which guarantees that it can be aliased.
However, this seems something that could be done in an easy mechanical transform. Basically a macro that would treat all & as &raw, and any `let mut x = ...` as something like `let mut x_storage = ...; let x = &raw mut x_storage` and then patch up all references to `x` in scope to `*x`.
The other problem is that stdlib assumes references, but in principle it should be possible to mechanically translate the whole thing as well...
And if you make it into a macro instead of patching the compiler directly, you can still use all the tooling, Cargo, LSP(?) etc.
I haven't written a ton of Rust so maybe my assumptions of what's possible are wrong, but it is an idea I've come back to a few times.
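Hand-expanding that transform for a single variable might look roughly like this. It is only a sketch of the idea, not an existing macro: `aliased_increments` is a made-up name, and `addr_of_mut!` is used in place of the `&raw mut` syntax so it also builds on older toolchains:

```rust
use std::ptr::addr_of_mut;

// What the hypothetical macro might emit for source that takes two
// "mutable references" to the same local and writes through both.
fn aliased_increments() -> i32 {
    let mut x_storage = 1;
    // `addr_of_mut!` produces a raw pointer without creating a `&mut` first.
    let x: *mut i32 = addr_of_mut!(x_storage);
    // Raw pointers are `Copy`, so `a` and `b` alias the same storage freely.
    let a = x;
    let b = x;
    // SAFETY: `x_storage` is alive for the whole function, and after `x` is
    // created it is only ever accessed through pointers derived from `x`.
    unsafe {
        *a += 1;
        *b += 10;
        *x
    }
}

fn main() {
    println!("{}", aliased_increments());
}
```

Note that the variable `x_storage` is never touched directly after the pointer is taken, which is the rule described above.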
I've been writing C++ for almost 30 years, and a few years of Rust. I sometimes struggle with the Rust borrow checker, and it's almost always my fault. I keep trying to write C++ in Rust, because I'm thinking in C++ instead of Rust.
The lesson is always the same. If you want to use language X, you must learn to write X, instead of writing language Y in X.
Using indexes (or node ids or opaque handles) in graph/tree implementations is a good idea both in C++ and in Rust. It makes serialization easier and faster. It allows you to use data structures where you can't have a pointer to a node. And it can also save memory, as pointers and separate memory allocations take a lot of space when you have billions of them. Like when working with human genomes.
From the post:
"The Rust community's whole thing is commitment to compiler-enforced correctness, and they built the borrowchecker on the premise that humans can't be trusted to handle references manually. When the same borrowchecker makes references unworkable, their solution is to... recommend that I manually manage them, with zero safety and zero language support?!? The irony is unreal."
No it doesn't. I just don't think the author understands the pitfalls of implementing something like a graph structure in a memory-unsafe language. The author doesn't write C, so I don't believe he has struggled with the pain of chasing a dangling pointer with valgrind.
There are plenty of libraries in C that eventually decided to use indexes instead of juggling pointers around because it's much harder to eventually introduce a use-after-free when dereferencing nodes this way.
Entity component systems were invented in 1998 which essentially implement this pattern. I don't find it ironic that the Rust compiler herds people towards a safe design that has been rediscovered again and again.
The borrow checker was introduced to statically verify memory safety. Using indices into graphs has been a memory safe option in languages like C for decades. I find his argument as valid as if someone said "I can't use goto? you expect me to manually run my cleanup code before I return?" Just because I took away your goto to make control flow easier it doesn't make it "ironic" if certain legitimate uses of goto are harder. Surely you wouldn't accept his argument for someone arguing for the return of goto in mainstream languages?
This is how the regex crate works internally and uses almost no `unsafe`.
That is not to say these languages are better. Intuition is just one trade off.
I feel that with practice, basic type checking is something that helps you rather than hinders you. It can be learned easily imo. People coming from js tend to have a hard time but that's understandable.
The borrow checker is not easily learned imo. It's always me running into a wall.
When I was going through its docs I was impressed with all those good ideas one after the other. The docs themselves are really good (high information density, and they practically read themselves).
If this is true for Rust, it's 10x more true for C++!
Lifetime issues are puzzles, yes, but boring and irritating ones.
But in C++? Select an appetizer, entree, and dessert (w/ bottomless breadsticks) from the Menu of Meta Programming. An endless festival of computer science sideshows living _in the language itself_ that juices the dopamine reward of figuring out a clever way of doing something.
People have compared Rust to C++ and others have argued that they really aren't alike, but I think it's in these puzzles that they are more alike than any other two languages. Even just reading rust code is a brain teaser for me!
I think this is why C and Zig get compared too. They apparently have roughly the same level of "fun problems" to solve.
The other end of the spectrum is something like gamedev: you write code that pretty explicitly has an end-date, and the actual shape of the program can change drastically during development (because it's a creative thing) so you very much don't want to slowly build up rigidity over time.
Both have rust-like flavor and neither has a borrow checker.
(It also needs some kind of reflection-like thing, either compile-time or runtime, so that there can be an equivalent of Rust's Serde, but at least they admit that that needs doing.)
Indices aren't simply "references but worse". There are some advantages:
- they are human readable
- they are just data, so can be trivially serialized/deserialized and retain their meaning
- you can make them smaller than 64 bits, saving memory and letting you keep more in cache
Also I don't see how they're unsafe. The array accesses are still bounds-checked and type-checked. Logical errors, sure I can see that. But where's the unsafety?
This goes not only for unchecked indexing but also e.g. transmuting based on a checked index into a &[u8] or such. If those indexes move in and out of your API and you do some kind of GC on your arrays / vectors, then you might run into indices being used after free, and now those SAFETY comments that previously felt pretty obvious, even trivial, may no longer be quite so safe to be around.
I've actually written about this previously w.r.t. the borrow checker and implementing a GC system based on indices / handles. My opinion was that unless you're putting in ironclad lifetimes on your indices, all assumptions based on indices must be always checked before use.
struct Id(u32);
fn main() {
    let id = Id(5);
    let mut v = vec![id]; // `id` is moved into the vector here
    println!("{}", id.0); // error[E0382]: borrow of moved value: `id`
}
isn't even legit in modern C++. That's just move semantics. When you move it, it's gone at the old name.
He does point out two significant problems in Rust. When you need to change a program, re-doing the ownership plumbing can be quite time-consuming. Losing a few days on that is a routine Rust experience. Rust forces you to pay for your technical debt up front in that area.
The other big problem is back references. Rust still lacks a good solution in that area. So often, you want A to own B, and B to be able to reference A. Rust will not allow that directly. There are three workarounds commonly used.
- Put all the items in an array and refer to them by index. Then write run-time code to manage all that. The Bevy game engine is an example of a large Rust system which does this. The trouble is that you've re-created dangling pointers, in the form of indices kept around after they are invalid. Now you have most of the problems of raw pointers. They will at least be an index to some structure of the right type, but that's all the guarantee you get. I've found bugs in that approach in Rust crates.
- Unsafe code with raw pointers. That seldom ends well. Crates which do that are almost the only time I've had to use a debugger on Rust code.
- Rc/RefCell/run-time ".borrow()". This moves all the checking to run time. It's safe, but you panic at run time if two things borrow the same item.
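To make the third workaround concrete, here is a minimal sketch of an owner with a back reference, assuming a made-up `Parent`/`Child` pair (none of these names come from a real crate). The child holds a `Weak` pointer back to its owner, and the last function shows the run-time check rejecting a second mutable borrow; with plain `.borrow_mut()` that same situation would panic instead.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical types: A owns B, and B can find its way back to A.
struct Parent {
    name: String,
    children: Vec<Rc<RefCell<Child>>>,
}

struct Child {
    // Weak, so the back reference does not keep the parent alive.
    parent: Weak<RefCell<Parent>>,
}

fn build() -> Rc<RefCell<Parent>> {
    let parent = Rc::new(RefCell::new(Parent {
        name: "A".to_string(),
        children: Vec::new(),
    }));
    let child = Rc::new(RefCell::new(Child {
        parent: Rc::downgrade(&parent),
    }));
    parent.borrow_mut().children.push(child);
    parent
}

// Follow the back reference from the first child to its owner.
fn parent_name_of_first_child(parent: &Rc<RefCell<Parent>>) -> Option<String> {
    let child = parent.borrow().children.first()?.clone();
    let owner = child.borrow().parent.upgrade()?;
    let name = owner.borrow().name.clone();
    Some(name)
}

// Two overlapping mutable borrows of the same cell: detected at run time.
fn overlapping_borrow_is_rejected(parent: &Rc<RefCell<Parent>>) -> bool {
    let _guard = parent.borrow_mut();
    parent.try_borrow_mut().is_err()
}
```

Nothing here is checked at compile time; whether two borrows overlap depends entirely on which scopes happen to be live when the code runs.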
This is a fundamental problem in Rust. I've mentioned this before. What's needed to fix this is an analyzer that checks the scope of explicit .borrow() and .borrow_mut() calls, and determines that all scopes for the same object are disjoint. This is not too hard conceptually if all the .borrow() calls produce locally scoped results. It does mean a full call chain analysis. It's a lot like static detection of deadlock, which is a known area of research [1] but something not seen in production yet.
I've discussed this with some of the Rust developers. The problem is generics. When you call a generic, the calling code has no idea what code the generic is going to generate. You don't know what it's going to borrow. You'd have to do this static analysis after generic expansion. Rust avoids that; generics either compile for all cases, or not at all. Such restricted generic expansion avoids the huge compile error messages from hell associated with C++ template instantiation fails. Post template expansion static analysis is thus considered undesirable.
Fixing that could be done with annotation, along the lines of "this function might borrow 'foo'". That rapidly gets clunky. People hate doing transitive closure by hand. Remember Java checked exceptions.
This is a good PhD topic for somebody in programming language theory. It's a well-known hard problem for which a solution would be useful. There's no easy general fix.
That's true, but as a runtime mitigation, adding a generational counter (maybe only in debug builds) to allocations can catch use-after-frees.
And at least it's less likely to be a security vulnerability, unless you put sensitive information inside one of these arrays.
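A generational counter of that kind could look roughly like the sketch below. `Arena`, `Slot` and `Handle` are illustrative names, not the API of any particular crate, and a real implementation would also need to think about generation overflow.

```rust
// A handle is an index plus the generation of the slot when it was issued.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Handle {
    index: usize,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

struct Arena<T> {
    slots: Vec<Slot<T>>,
    free: Vec<usize>,
}

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { slots: Vec::new(), free: Vec::new() }
    }

    // Reuse a freed slot if one exists, otherwise grow the vector.
    fn insert(&mut self, value: T) -> Handle {
        if let Some(index) = self.free.pop() {
            let slot = &mut self.slots[index];
            slot.value = Some(value);
            Handle { index, generation: slot.generation }
        } else {
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Handle { index: self.slots.len() - 1, generation: 0 }
        }
    }

    // Freeing bumps the generation, which invalidates every old handle.
    fn remove(&mut self, handle: Handle) -> Option<T> {
        let slot = self.slots.get_mut(handle.index)?;
        if slot.generation != handle.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation += 1;
        self.free.push(handle.index);
        Some(value)
    }

    // A stale handle yields None instead of whatever now lives in the slot.
    fn get(&self, handle: Handle) -> Option<&T> {
        let slot = self.slots.get(handle.index)?;
        if slot.generation == handle.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }
}
```

The cost is one extra integer per slot and one comparison per lookup; a handle kept around after `remove` is caught at the next `get` rather than silently reading the slot's new occupant.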
At the cost of making the use of the resulting heap significantly slower and larger than if you just wrote the thing in Java to begin with, though! The resulting instrumentation is likely to be isomorphic to GC's latency excursions, even.
This is the biggest issue that bugs me about Rust. It starts from a marketing position of "Safety With No Compromises" on runtime metrics like performance or whatever, then when things get hairy it's always "Well, here's a very reasonable compromise". We know how to compromise! The world is filled with very reasonably compromised memory-safe runtimes that are excellent choices for your new system.
Lower throughput, probably. But it introduces constant latency. It has some advantages over doing it in Java:
* You're never going to get latency spikes by adding a counter to each allocation slot.
* If you really want to, you can disable them in release builds and still not give up memory-safety, although you might get logical use-after-frees.
* You don't need to use such "compromises" for literally everything, just where it's needed.
> It starts from a marketing position of "Safety With No Compromises"
I haven't seen that marketing, but if it exists, sure, it's misleading. Yes, you have to compromise. But in my opinion, the compromises that Rust lets you make are meaningfully different from the compromises in other mainstream languages. Sometimes better, sometimes worse. Probably worse for most applications than a GC language, tbh.
The suggestion wasn't just the counter though. A counter by itself does nothing. At some point you need to iterate[1] through your set to identify[2] the unreferenced[3] blocks. And that has to be done with some kind of locking vs. the unrestricted contexts elsewhere trying to do their own allocation work. And that has costs.
Bottom line is that the response was isomorphic to "That's OK, you can work around it by writing a garbage collector". And... yeah. We have that, and it's better than this nonsense.
[1] "sweep", in the vernacular
[2] "collect", in some idioms
[3] Yup, "garbage"
https://docs.rs/generational-arena
If you're implementing a tracing garbage collector you obviously don't need any such counters to detect use-after-frees.
This is clearly a different compromise entirely to the one made by tracing garbage collection. I'm actually not sure how you confused the two.
The problem is that this mental model is entirely foreign to people who have worked in literally every other language, where pass by value (copy) or pass by reference are the way things work, always.
Exactly the opposite actually.
Rust has destructive move while modern C++ has nondestructive move.
So in Rust, an object is dead after you move out of it, and any further attempts to use it are a compiler-diagnosed error. In contrast, a C++ object remains alive after the move, and further use of it isn't forbidden by the language, although some or all uses might be forbidden by the specific user-provided move function - you'll have to consult the documentation for that move function to find out.
This article explains the difference well: https://www.foonathan.net/2017/09/destructive-move/
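A small Rust-only illustration of the two behaviours, with made-up function names: a plain move is destructive, while `std::mem::take` is about the closest Rust analogue to a C++-style move that leaves a valid but emptied object behind.

```rust
// Destructive move: after `let b = a;` the name `a` can no longer be used.
fn destructive_move() -> usize {
    let a = vec![1, 2, 3];
    let b = a;
    // Uncommenting the next line fails to compile (use of moved value: `a`).
    // let _ = a.len();
    b.len()
}

// Nondestructive, C++-like: `take` moves the contents out and leaves
// `Default::default()` (an empty Vec) in the source, which stays usable.
fn take_leaves_valid_source() -> (usize, usize) {
    let mut a = vec![1, 2, 3];
    let b = std::mem::take(&mut a);
    (a.len(), b.len())
}
```

`take_leaves_valid_source()` returns `(0, 3)`: the source is still alive, just empty, which is roughly the state a moved-from `std::vector` ends up in.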
Of course, designing for safety is quite complex and easy to get wrong. For example, Swift's "structured concurrency" is an attempt to provide additional abstractions to try to hide some complexity around life times and synchronization... but (personally) I think the results are even more confusing and volatile.
It doesn't need to be Rust: Rust's borrow checker has (mostly reasonable) limitations that eg. make some interprocedural things impossible while being possible within a single function (eg. &mut Vec<u32> and &mut u32 derived from it, both being used at the same time as shared references, and then one or the other being used as exclusive later). Maybe some other language will come in with a more powerful and omniscient borrow checker[^1], and leave Rust in the dust. It definitely can happen, and if it does then I suppose we'll enjoy that language then.
But: it is my opinion that a borrow checker is an absolutely massive thing in a (non-GC) programming language, and one that cannot be ignored in the future. (Though, Zig is proving me wrong here and it's doing a lot of really cool things. What memory safety vulnerabilities in the Ziglang world end up looking like remains to be seen.) Memory is always owned by some_one_, its validity is always determined by some_one_, and having that validity enforced by the language is absolutely priceless.
Wanting GC for some things is of course totally valid; just reach for a GC library for those cases, or if you think it's the right tool for the job then use a GC language.
[^1]: Or something even better that can replace the borrow checker; maybe Graydon Hoare's original idea of path based aliasing analysis would've been that? Who knows.
Imo a GC needs some cooperation from the language implementation, at least to find the rootset. Workarounds are either inefficient or unergonomic. I guess inefficient GC is fine in plenty of scenarios, though.
A huge part of the spirit of rust is fearless concurrency. The simple seeming false positive examples become non-trivial in concurrent code.
The author admits they don't write large concurrent programs - which clearly explains why they don't find much use in the borrow checker. So the problem isn't that Rust doesn't work for them - it's that a central language feature of Rust hampers them instead of helping them.
The conclusion for this article should have been: if you're like me and don't write concurrent programs, enums and matches are great. The language would work better for me if the arc/box syntax spam went away.
As a side note, if your code is a house of cards, it's probably because you prematurely optimized. A good way to get around this problem is to arc/box spam upfront with as little abstraction as possible, then profile, then optimize.
Probably I just haven't been writing very "advanced" rust programs in the sense of doing complicated things that require advanced usages of lifetimes and references. But having written rust professionally for 3 years now, I haven't encountered this once. Just putting this out there as another data point.
Of course, partial borrows would make things nicer. So would polonius (which I believe is supposed to resolve the "famous" issue the post mentions, and maybe allow self-referential structs a long way down the road). But it's very rare that I encounter a situation where I actually need these. (example: a much more common need for me is more powerful consteval.)
Before writing Rust professionally, I wrote OCaml professionally. To people who wish for "rust, but with a garbage collector", I suggest you use OCaml! The languages are extremely similar.
Maybe it's an idiom you already picked up in OCaml and did mostly right in Rust too?
You might have a point with my OCaml background though. I rarely use mutable references, since I prefer to write code in a functional style. That means I rarely am in a situation where I want to create a mutable reference but already have other references floating around.
Here's an example of some of my code: https://github.com/not-pizza/tysm/blob/main/src/chat_complet... . I wouldn't be surprised if there's not a mutable reference or lifetime specifier in this whole project
It's not super common though, especially if the code is not in the hot path which means you can just keep things simple and clone.
- Marking and sweeping cause latency spikes which may be unacceptable if your program must have millisecond responsiveness.
- GC happens intermittently, which means garbage accumulates until each collection, and so your program is overall less memory efficient.
With modern concurrent collectors like Java's ZGC, that's not the case any longer. They show sub-millisecond pause times and run concurrently. The trade-off is a higher CPU utilization and thus reduced overall throughput, which if and when it is a problem can oftentimes be mitigated by scaling out to more compute nodes.
Language support: You can implement extension traits on an integer so you can do things like current_node.next(v) (like if you have an integer named 'current_node' which is an index into a vector v of nodes) and customize how your next() works.
Also, I disagree there is 'zero safety', since the indexes are into a Rust vector, they are bounds checked by default when "dereferencing" the index into the vector (v[i]), and the checking is not that slow for the vast majority of use cases. If you go out of bounds, Rust will panic and tell you exactly where it panicked. If panicking is a problem you could theoretically have custom dereference code that does something more graceful than panic.
But with using indexes there is no corruption of memory outside of the vector where you are keeping your data, in other words there isn't a buffer overflow attack that allows for machine instructions to be overwritten with data, which is where a huge amount of vulnerabilities and hacks have come from over the past few decades. That's what is meant by 'safety' in general.
I know people stick in 'unsafe' to gain a few percent speed sometimes, but then it's unsafe rust by definition. I agree that unsafe rust is unsafe.
Also you can do silly optimization tricks like if you need to perform a single operation on the entire collection of nodes, you can parallelize it easily by iterating thru the vector without having to iterate through the data structure using next/prev leaf/branch whatever.
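A sketch of what that can look like, assuming a hypothetical `Node` that stores its successor as an index (the trait name `NodeIndex` and both helper functions are invented for the example):

```rust
// Hypothetical node type: links are indices into the backing vector.
struct Node {
    value: i32,
    next: Option<usize>,
}

// Extension trait so a plain integer can be "dereferenced" like a link.
trait NodeIndex {
    fn next(self, nodes: &[Node]) -> Option<usize>;
}

impl NodeIndex for usize {
    fn next(self, nodes: &[Node]) -> Option<usize> {
        // `get` is the non-panicking form of `nodes[self]`:
        // an out-of-range index yields None instead of a panic.
        nodes.get(self)?.next
    }
}

// Walk the list by following indices (assumes the links form no cycle).
fn sum_list(nodes: &[Node], start: usize) -> i32 {
    let mut total = 0;
    let mut current = Some(start);
    while let Some(i) = current {
        let Some(node) = nodes.get(i) else { break };
        total += node.value;
        current = i.next(nodes);
    }
    total
}

// Bulk operation: ignore the links and just scan the vector.
fn sum_all(nodes: &[Node]) -> i32 {
    nodes.iter().map(|n| n.value).sum()
}
```

With `get`, a bad index turns into `None` rather than a panic, which is one way to do the "something more graceful" mentioned above.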
This argument has a long history.
It is a widely used pattern in rust.
It is true that panics are memory safe, and there is nothing unsafe about having your own ref ids.
However, I believe that it's both fair and widely acknowledged that in general this approach is prone to bugs that cause panics for exactly this reason, and that's bad.
Just use Arc or Rc.
Or, an existing crate that implements a wrapper around it.
It's enormously unlikely that most applications need the performance of avoiding them, and very likely that if you are rolling your own, you'll get caught up by edge cases.
This is a prime example of a rust antipattern.
You shouldn't be implementing it in your application code.
If you try to write Java in Rust, you will fail. Rust is no different in this regard from Haskell, but method syntax feels so friendly that it doesn't register that this is a language you genuinely have to learn instead of picking up the basics in a couple hours and immediately start implementing `increment_counter`-style interfaces.
And this is an inexperienced take, no matter how eloquently it's written. You can see it immediately from the complaint about CS101 pointer-chasing graph structures, and apoplexy at the thought of index-based structures, when any serious graph should be written with an index-based adjacency list and writing your own nonintrusive collection types is pretty rare in normal code. Just Use Petgraph.
A beginner is told to 'Just' use borrow-splitting functions, and this feels like a hoop to jump through. This is because it's not the real answer. The real answer is that once you have properly learned Rust, once your reflexes are procedural instead of object-oriented, you stop running into the problem altogether; you automatically architect code so it doesn't come up (as often). The article mentions this point and says 'nuh uh', but everyone saying it is personally at this level; 'intermittent' Rust usage is not really a good Learning Environment.
The compiler changed the type of my variable based on its usage. Usage in code I didn't write. There was no warning about this (even with clippy). The program crashed at runtime.
I found this amusing because it doesn't happen in dynamic languages, and it doesn't happen in languages where you have to specify the types. But Rust, with its emphasis on safety, somehow lured me into this trap within the first 15 minutes of programming.
I found it more amusing because in my other attempts at Rust, the compiler rejected my code constantly (which was valid and worked fine), but then also silently modified my program without warning to crash at runtime.
I saw an article by the developers of the Flow language, which suffered from a similar issue until it was fixed. They called it Spooky Action at a Distance.
This being said, I like Rust and its goals overall. I just wish it was a little more explicit with the types, and a little more configurable on the compiler strictness side. Many of its errors are actually just warnings, depending on your program. It feels disrespectful for a compiler to insist it knows better than the programmer, and to refuse to even compile the program.
Many projects are written in Rust that would absolutely be fine in Go, Swift or a JVM language. And I don't understand: it is nicer to write in those other languages, why choose Rust?
On the other hand, Rust is a lot nicer than C/C++, so I see it as a valid alternative there: I'm a lot happier having to make the borrow-checker happy than tracking tricky memory errors in C.
I think this is a matter of preference. Nowadays I cannot stand environments like Java (or especially Kotlin). "Tricky memory errors" is in my opinion nicer than a borrow-checker refusing sound code. I guess I really hate 'magic'...
> It is nicer to write in those other languages, why choose Rust?
Honestly I don't think it is nicer to write in those other languages you mention. I might still prefer Rust if performance was removed from the equation entirely. That is just to say I think preference and experience matters just as much, if not more, than the language's memory model.
As someone who writes Rust professionally this sentence is sus. Typically, the borrow checker is somewhere between 10th and 100th in the list with regards to things I think about when programming. At the end of the day, you could in theory just wrap something in a reference counter if needed, but even that hasn't happened to me yet.
When my function gets an exclusive reference to an object, I know for sure that it won't be touched by the caller while I use it, but I can still mutate it freely. I never need to make deep copies of inputs defensively just in case the caller tries to keep a reference to somewhere in the object they've passed to my function.
And conversely, as a user of libraries, I can look at an API of any function and know whether it will only temporarily look at its arguments (and I can then modify or destroy them without consequences), or whether it keeps them, or whether they're shared between the caller and the callee.
All of this is especially important in multi-threaded code where a function holding on to a reference for too long, or mutating something unexpectedly, can cause painful-to-debug bugs. Once you know the limitations of the borrow checker, and how to work with or around them, it's not that hard. Dealing with a picky compiler is IMHO still preferable to dealing with mysterious bugs from unexpectedly-mutated state.
In a way, borrow checker also makes interfaces simpler. The rules may be restrictive, but the same rules apply to everything everywhere. I can learn them once, and then know what to expect from every API using references. There are no exceptions in libraries that try to be clever. There are no exceptions for single-threaded programs. There are no exceptions for DLLs. There are no exceptions for programs built with -fpointers-go-sideways. It may be tricky like a game of chess, but I only need to consider the rules of the game, and not odd stuff like whether my opponent glued pieces to the chessboard.
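As a tiny illustration of reading that contract off the signature (function names invented for the example): a shared borrow only looks, an exclusive borrow may mutate but gives the value back, and a by-value parameter keeps what it is given.

```rust
// `&[u32]`: only looks at the data; the caller keeps full use of it.
fn total(items: &[u32]) -> u32 {
    items.iter().sum()
}

// `&mut Vec<u32>`: may mutate, but nobody else can touch the vector
// during the call and nothing is retained afterwards.
fn double_all(items: &mut Vec<u32>) {
    for x in items.iter_mut() {
        *x *= 2;
    }
}

// `Vec<u32>` by value: the callee takes ownership; the caller's
// binding is dead after the call.
fn into_sorted(mut items: Vec<u32>) -> Vec<u32> {
    items.sort();
    items
}

fn demo() -> (u32, Vec<u32>) {
    let mut v = vec![3, 1, 2];
    let before = total(&v);
    double_all(&mut v);
    let sorted = into_sorted(v);
    // `v` cannot be used here any more: it was moved into `into_sorted`.
    (before, sorted)
}
```

None of this requires reading the function bodies; the three parameter types alone tell the caller what can happen to `v`.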
No, you could use destructuring. This doesn't work for all cases but it does for your examples without needing to derive copy or clone. Here's a more complex but also compelling example of the problem:
use std::collections::{BTreeMap, BTreeSet};

struct Graph {
    nodes: BTreeMap<u32, Node>,
}
struct Node {
    edges: Vec<u32>,
}
impl Graph {
    fn visit_mut(&mut self, visit: impl Fn(&mut Node, &mut Node)) {
        let mut visited = BTreeSet::new();
        let mut stack = vec![0];
        while let Some(id) = stack.pop() {
            if !visited.insert(id) { continue; }
            let curr = self.nodes.get_mut(&id).unwrap();
            for id in curr.edges.clone() {
                // Rejected (E0499): second mutable borrow of `self.nodes`
                // while `curr` is still in use.
                let next = self.nodes.get_mut(&id).unwrap();
                visit(curr, next);
                stack.push(id);
            }
        }
    }
}
We're doing everything in the "Rust" way here. We're using IDs instead of pointers. We're cloning a vec even if it's a bit excessive. But the bigger problem is we actually _do_ need to have multiple mutable references to two values owned by a collection that we know don't transitively reference the collection. We need to wrap these in a RefCell or UnsafeCell and unsafe { } block to actually get mutable references to the underlying data to correctly implement visit_mut().
This is a problem that shows up all the time when using collections, which Rust encourages within the ecosystem.
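For reference, a minimal sketch of the RefCell variant, under a few assumptions that are mine rather than the parent's: nodes get a made-up `weight` field so the visitor has something to mutate, missing ids are skipped, and self-loops are skipped because borrowing the same cell mutably twice would panic.

```rust
use std::cell::RefCell;
use std::collections::{BTreeMap, BTreeSet};

struct Node {
    edges: Vec<u32>,
    weight: u32, // illustrative payload, not in the original example
}

struct Graph {
    // Each node sits in its own RefCell, so the map is only borrowed shared.
    nodes: BTreeMap<u32, RefCell<Node>>,
}

impl Graph {
    fn visit_mut(&self, visit: impl Fn(&mut Node, &mut Node)) {
        let mut visited = BTreeSet::new();
        let mut stack = vec![0u32];
        while let Some(id) = stack.pop() {
            if !visited.insert(id) { continue; }
            let Some(curr_cell) = self.nodes.get(&id) else { continue };
            let edges = curr_cell.borrow().edges.clone();
            for next_id in edges {
                // A self-loop would borrow the same cell mutably twice.
                if next_id == id { continue; }
                let Some(next_cell) = self.nodes.get(&next_id) else { continue };
                let mut curr = curr_cell.borrow_mut();
                let mut next = next_cell.borrow_mut();
                visit(&mut *curr, &mut *next);
                stack.push(next_id);
            }
        }
    }
}

// Small graph 0 -> {1, 2}, 1 -> {2}; each visit adds curr's weight to next.
fn demo() -> (u32, u32) {
    let mut nodes = BTreeMap::new();
    nodes.insert(0, RefCell::new(Node { edges: vec![1, 2], weight: 1 }));
    nodes.insert(1, RefCell::new(Node { edges: vec![2], weight: 10 }));
    nodes.insert(2, RefCell::new(Node { edges: vec![], weight: 100 }));
    let graph = Graph { nodes };
    graph.visit_mut(|curr, next| next.weight += curr.weight);
    let w1 = graph.nodes[&1u32].borrow().weight;
    let w2 = graph.nodes[&2u32].borrow().weight;
    (w1, w2)
}
```

This compiles without `unsafe`, but the "these two nodes are distinct" argument has moved from the type system into a run-time check, which is exactly the trade being complained about.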
This is actually a learning lesson for the user to understand that the bugs one has seen in languages like c++ are inherent to using simple types.
The author goes about mentioning python. If you do change all your types to python equivalents, ref counted etc. Rust becomes as easy. But you don’t want to do that and so it becomes pain, but pain with a gain. You must decide if that gain is worth it.
From my point of view the issue is that rust defaults to be a system programming language. Meaning, simple types are written simple (i32, b32, mut ..), complex types are written complex (ref, arc, etc.). And because of that one wants to use the simple types, which makes the solutions complex.
Let’s imagine a rust dialect, where every type without annotation is ref counted and if you want the simple type you would have to annotate your types, the situation would change.
What one must realize is that verifiable correctness is hard; the simplicity of the given problematic examples is a clear indication of how close those screw-ups are even with very simple code. And that is exactly why we are still seeing issues in core C libs after decades of fixing them.
It might be surprising to some folks, but there is a lot of unsafe code in Rust, and a lot of that is in the standard’s data structure implementations.
Also —
The pain of lifetimes, common in network programming, shows up in async.
The model sort of keels over and becomes obtuse when every task requires ownership of its data with static lifetimes.
airstrike•8h ago
This is true both in theory and in practice, since you can write any program with a borrow checker that you can write without one.
TFA also dismisses all the advantages of the borrow checker and focuses on a narrow set of pain points of which every Rust developer is already aware. We still prefer those borrowing pain points over what we believe to be the much greater pain inflicted by other languages.
umanwizard•7h ago
ameliaquining•2h ago