Iirc the work on safe transmute also involves a sort of “any bit pattern” trait?
I’ve also dealt with the pain of implementing similar interfaces in Rust, and it really feels like you end up jumping through a ton of hoops (and in some of my cases, hurting performance) all to satisfy the abstract machine, at no benefit to the programmer or the application. It’s really a case of the abstract-machine cart leading the horse.
I totally understand not wanting to promise things get zeroed, but I don't really understand why full UB, instead of just "they have whatever value is initially in memory / the register / the compiler chose" is so much better.
Has anyone ever done a performance comparison between UB and freezing I wonder? I can't find one.
Also, an uninitialized value might be in a memory page that gets reclaimed and then mapped in again, in which case (because it hasn’t been written to) the OS doesn’t guarantee it will have the same value the second time. There was recently a bug discovered in one of the few algorithms that uses uninitialized values, because of this effect.
I would imagine there aren't that many cases where we read uninitialized memory without caring what value comes back. It would happen when reading in 8-byte blocks for alignment, but does it happen that much elsewhere?
The only way to really know is to test this. Compilers and their optimizations depend on a lot of things. Even the order and layout of instructions can matter due to the instruction cache. You can always go and make the guarantee later on, but undoing it would be impossible.
And it definitely does allow some optimisation. But probably nothing significant on modern out-of-order machines.
it kills _a lot_ of optimizations, leading to problematic perf. degradation
TL;DR: always freezing I/O buffers => generally fine, no issues; freezing all primitives => perf problem.
(At least in practice; in theory many optimizations might still be possible, but with a way higher analysis compute cost (like exponentially higher) and potentially needing more high-level information (so bad luck, C).)
Still, for I/O buffers of primitive-enough types, `frozen` is basically always just fine. (I also vaguely remember some discussion where people more involved in Rust core development seemed open to adding functionality like that, so it might still happen.)
To illustrate why frozen I/O buffers are just fine: some systems already (zero- or rand-)initialize all their I/O buffers anyway. A lot of systems reuse I/O buffers: they init them once on startup and then just continuously re-use them. And some OS setups (zero- or rand-)initialize all OS memory allocations (though that applies when the OS grants more memory to your in-process memory allocator, not to every language-level alloc call, and it doesn't remove UB for stack or register values at all (nor for various situations involving heap values either)).
So doing much more "costly" things than just freezing them is pretty much normal for I/O buffers.
Though, as mentioned, sometimes values are undefined at the hardware level (e.g. every read of the same location might return a different value), so they can't really be treated as frozen. It's a bit of a niche issue you probably won't run into wrt. I/O buffers, and I'm not sure how common it is on modern hardware, but it's still a thing.
But freezing primitives which majorly affect control flow both makes some optimizations impossible and makes others much harder to compute/check/find, potentially to the point where they're no longer viable.
This can involve (as in, freezing can prevent) some forms of dead-code elimination, some forms of inlining + unrolling + const propagation, etc. These are mostly (but not exclusively) micro-optimizations, but micro-optimizations that sum up and accumulate, leading to (potentially, but not always) major performance regressions. Frozen values also have some subtle interactions with floats and their different NaN values (a problem especially wrt. signaling NaNs).
Though I'm wondering if a variant of C/C++ where arrays of primitives are always treated as frozen (and there are no signaling NaNs) would have worked just fine without any noticeable perf. drawback. And if so, whether Rust should adopt this...
I've implemented what TFA calls the "double cursor" design for buffers at $dayjob, i.e. an underlying (ref-counted) [MaybeUninit<u8>] with two indices to track the filled, initialized and unfilled regions, plus an API to split the buffer into two non-overlapping handles, etc. It certainly required wrangling with UnsafeCell in non-trivial ways to make miri happy, but it performs no worse than the equivalent C code that just dealt with uint8_t* would have.
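A minimal sketch of that shape (names and details are mine, not the $dayjob code; the ref-counting and split-into-handles parts are omitted): storage of `[MaybeUninit<u8>]` plus cursors, with the unsafe confined to a single reinterpret cast:

```rust
use std::mem::MaybeUninit;

/// Toy "double cursor" buffer: 0..filled holds valid data,
/// filled..initialized has been written at least once (reusable scratch),
/// initialized..capacity has never been written.
struct DoubleCursorBuf {
    data: Box<[MaybeUninit<u8>]>,
    filled: usize,
    initialized: usize,
}

impl DoubleCursorBuf {
    fn with_capacity(cap: usize) -> Self {
        Self {
            data: vec![MaybeUninit::uninit(); cap].into_boxed_slice(),
            filled: 0,
            initialized: 0,
        }
    }

    /// The valid prefix, as a normal byte slice.
    fn filled(&self) -> &[u8] {
        // SAFETY: every byte in 0..filled was written via `push`.
        unsafe { &*(&self.data[..self.filled] as *const [MaybeUninit<u8>] as *const [u8]) }
    }

    /// Append `src`, advancing the cursors; never reads uninitialized bytes.
    fn push(&mut self, src: &[u8]) {
        let end = self.filled + src.len();
        assert!(end <= self.data.len(), "capacity exceeded");
        for (slot, &b) in self.data[self.filled..end].iter_mut().zip(src) {
            slot.write(b);
        }
        self.filled = end;
        self.initialized = self.initialized.max(end);
    }
}
```

A real version would hand the unfilled tail to `read()`-style writers; the point is that all the unsafety sits behind a small, checkable API.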
Some people simply aren't comfortable with it.
Currently, sound Rust code does not depend on the value of uninitialized memory whatsoever. Adding `freeze` means that it could. A vulnerability similar to Heartbleed, exposing secrets from freed memory, is impossible in sound Rust code without `freeze`, but theoretically possible with `freeze`.
Whether you consider this a realistic issue or not likely determines your stance on `freeze`. I personally don't think it's a big deal and have several algorithms which are fundamentally being slowed down by the lack of `freeze`, so I'd love it if we added it.
Abstractions like ReadBuf allow safe code to efficiently work with uninitialized buffers without risking exposure of random memory contents.
Go has similar characteristics.
$ cat bigbuf.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define DEFINITELY_BIG_ENOUGH 2U*1024*1024*1024
int main(int nargs, char ** args)
{
    char * definitely_big_enough = malloc(DEFINITELY_BIG_ENOUGH);
    if (nargs > 1)
    {
        memset(definitely_big_enough, 0, DEFINITELY_BIG_ENOUGH);
    }
    sprintf(definitely_big_enough, "%s", args[0]);
    return 0;
}
$ /usr/bin/time ./bigbuf
0.00user 0.00system 0:00.00elapsed 100%CPU (0avgtext+0avgdata 1356maxresident)k
0inputs+0outputs (0major+80minor)pagefaults 0swaps
$ /usr/bin/time ./bigbuf 1
0.05user 1.25system 0:01.31elapsed 99%CPU (0avgtext+0avgdata 2098468maxresident)k
0inputs+0outputs (0major+524369minor)pagefaults 0swaps
YMMV on different operating systems. Of course this is a program only an idiot would write, but things like caches are often significantly bigger than the median case, especially on Linux where you know there is overcommit.

People should really read more on safety semantics in Rust before making comments like this; it's quite annoying to bump into surface-level misunderstandings every time Rust is mentioned somewhere.
Rust is designed, and is being used, everywhere from top-of-the-line PCs to servers to microcontrollers to virtual machines in the browser.
Not all tradeoffs are acceptable to everyone all of the time
int* ptr = malloc(size);
if (ptr[offset] == 0) { }
The code was assuming that the value in an allocated buffer did not change.
However, it was pointed out in review that it could change with these steps:
1) The malloc allocates from a new memory page. This page is often not mapped to a physical page until written to.
2) The reads just return the default (often 0 value) as the page is not mapped.
3) Another allocation is made that is written to the same page. This maps the page to physical memory which then changes the value of the original allocation.
What could happen is that the UB in that code could result in it being compiled in a way that makes the comparison non-deterministic.
(*): ... or alternatively, we're not talking about regular userspace program but a higher privilege layer that is doing direct unpaged access, but I assume that's not the case since you're talking about malloc.
The closest thing to "conditionally returned to the kernel" is if the page had been given to madvise(MADV_FREE), but that would still not have the behavior they're talking about. Reading and writing would still produce the same content, either the original page content because the kernel hasn't released the page yet, or zero because the kernel has already released the page. Even if the order of operations is read -> kernel frees -> write, then that still doesn't match their story, because the read will produce the original page content, not zero.
That said, the code they're talking about is different from yours in that their code is specifically doing an out-of-bounds read. (They said "If you happen to allocate a string that's 128 bytes, and malloc happens to return an address to you that's 128 bytes away from the end of the page, you'll write the 128 bytes and the null terminator will be the first byte on the next page." So they're very clearly talking about the \0 being outside the allocation.)
So it is absolutely possible to have this setup: the string's allocation happens to be followed by a different allocation that is currently 0 -> the `data[size()] != '\0'` check is performed and succeeds -> `data` is returned to the caller -> whoever owns that following allocation writes a non-zero value to the first byte -> whoever called `c_str()` will now run off the end of the 128B string. This doesn't have anything to do with pages; it can happen within the bounds of a single page. It is also such an obvious out-of-bounds bug that it boggles my mind that it passed any sort of code review and required some sort of graybeard to point out.
He explicitly states that a 128-byte filename allocates 129 bytes. https://www.youtube.com/watch?v=kPR8h4-qZdk&t=1417s
Is this so inefficient? If your code is very sensitive to IO throughput, then it seems preferable to re-use buffers and pay the initialization once at startup.
Some years ago, I needed a buffer like this and one didn't exist, so I wrote one: https://crates.io/crates/fixed-buffer . I like that it's a plain struct with no type parameters.
It can be. If you have large buffers (tuned for throughput) that end up fulfilling lots of small requests, for example. And there's always the occasional article where someone rediscovers that replacing malloc + memset with calloc can have massive performance savings, because zeroing by the OS only occurs on first page fault (if it ever occurs), instead of as an O(N) operation on the whole buffer up front.
Which, if in the wrong loop, can quickly balloon from O(N) to O(scary).
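The same effect is visible from Rust: as an implementation detail of current std, `vec![0u8; n]` is specialized to use `alloc_zeroed` (calloc-style, so untouched pages can remain lazily-zeroed OS pages), while zeroing a `Vec::with_capacity` allocation by hand writes every byte eagerly. A sketch:

```rust
// Two ways to get an all-zero buffer of n bytes.

fn zeroed_fast(n: usize) -> Vec<u8> {
    // Specialized by std to alloc_zeroed / calloc: pages the program
    // never touches can stay as lazily-zeroed OS pages.
    vec![0u8; n]
}

fn zeroed_slow(n: usize) -> Vec<u8> {
    // Plain allocation followed by an eager O(n) write (memset) that
    // faults in every page up front.
    let mut v = Vec::with_capacity(n);
    v.resize(n, 0u8);
    v
}
```

Both return identical contents; the difference only shows up in page-fault counts and wall-clock time on large n, as the `/usr/bin/time` transcript above demonstrates for C.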
https://github.com/PSeitz/lz4_flex/issues/147
https://github.com/rust-lang/rust/issues/117545
If I'm reading that log-log plot right, that looks like a significantly worse than 100x slowdown on 1GB data sets. Avoiding init isn't the only solution, of course, but it was a solution.
> then it seems preferable to re-use buffers
Buffer reuse may be an option, but in code with complicated buffer ownership (e.g. transfering between threads, with the thread of origination not necessarily sticking around, etc.), one of the sanest methods of re-use may be to return said buffer to the allocator, or even OS.
> and pay the initialization once at startup.
Possibly a great option for long lived processes, possibly a terrible one for something you spawn via xargs.
90s_dev•5h ago
> Another thing is the difficulty of using uninitialized data in Rust. I do understand that this involves an attribute in clang which can then perform quite drastic optimizations based on it, but this makes my life as a programmer kind of difficult at times. When it comes to `MaybeUninit`, or the previous `mem::uninit()`, I feel like the complexity of compiler engineering is leaking into the programming language itself and I'd like to be shielded from that if possible. At the end of the day, what I'd love to do is declare an array in Rust, assign it no value, `read()` into it, and magically reading from said array is safe. That's roughly how it works in C, and I know that it's also UB there if you do it wrong, but one thing is different: It doesn't really ever occupy my mind as a problem. In Rust it does. [https://news.ycombinator.com/item?id=44036021]
electrograv•5h ago
UB doesn’t occupy the author’s mind when writing C, when it really should. This kind of lazy attitude to memory safety is precisely why so much C code is notoriously riddled with memory bugs and security vulnerabilities.
Arnavion•4h ago
It's not as painless as it could be, though, because many of the MaybeUninit<T> -> T conversion fns are unstable. E.g. the code in TFA needs `&mut [MaybeUninit<T>] -> &mut [T]`, but the slice-level `assume_init_mut()` is unstable. But reimplementing them is just a matter of copying the libstd impl, which in turn is usually just a straightforward reinterpret-cast one-liner.
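For example, a polyfill for the unstable slice-level `assume_init_mut` carries the same safety contract as the unstable fn and is exactly such a one-liner:

```rust
use std::mem::MaybeUninit;

/// Polyfill for the unstable `<[MaybeUninit<T>]>::assume_init_mut`.
///
/// SAFETY: the caller must guarantee every element of `s` has been
/// initialized; `MaybeUninit<T>` is layout-compatible with `T`.
unsafe fn slice_assume_init_mut<T>(s: &mut [MaybeUninit<T>]) -> &mut [T] {
    &mut *(s as *mut [MaybeUninit<T>] as *mut [T])
}
```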
vgatherps•4h ago
It’s strictly more complicated and slower than the obvious thing to do and only exists to satisfy the abstract machine.
codeflo•2h ago
There are two actual differences in this regard: C pointers are more ergonomic than Rust pointers. And Rust has an additional feature called references, which enable a lot more aggressive compiler optimizations, but which have the restriction that you can’t have a reference to uninitialized memory.
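Concretely (my illustration, not the parent's code): you stay on raw pointers while the memory is uninitialized, and only materialize a value or reference after initialization:

```rust
use std::mem::MaybeUninit;

fn init_slot() -> u32 {
    let mut slot = MaybeUninit::<u32>::uninit();
    // A raw-pointer write: no &u32 / &mut u32 to uninitialized memory
    // is ever materialized, so the reference rules are never in play.
    unsafe { slot.as_mut_ptr().write(7) };
    // Only now, after initialization, do we produce a real value.
    unsafe { slot.assume_init() }
}
```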
o11c•4h ago
Most other UB relates to data that you think you can do something with.
usefulcat•4h ago
It sounds like the more difficult problem here has to do with explaining to the compiler that read() is not being used unsafely.
ii41•5h ago
IIRC it's not that hard to convince the compiler to give you a safe buffer from a MaybeUninit. However, this type has really lengthy docs and makes you question everything you do with it. Thinking through all of this is painful, but it's not like you get to skip that thinking in C.
jcranmer•4h ago
And the most aggravating part of all of this is that the most common use case for uninitialized memory (the scenario being discussed both in the article here and in the comment you quote) is actually pretty easy to wrap in a reasonable, safe abstraction, so the fact that the current options require both unsafe code and potentially faulty duplication of value calculations doesn't make for a fun experience. (Also, the I/O traits predate MaybeUninit, which means the most common place where you'd want to work with uninitialized memory is one where you can't do it properly.)
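For context, the stable `Read::read` only accepts `&mut [u8]`, so the usual safe workaround is to zero the buffer up front; a minimal sketch (the 4096 size is arbitrary):

```rust
use std::io::Read;

fn read_some(mut r: impl Read) -> std::io::Result<Vec<u8>> {
    // Zeroing satisfies `read`'s stable `&mut [u8]` contract, at the
    // cost of exactly the memset the uninit-buffer designs try to avoid.
    let mut buf = vec![0u8; 4096];
    let n = r.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}
```

Since `&[u8]` implements `Read`, the sketch can be exercised against an in-memory slice.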