I see no mention of fsync/sync_all. That’s why your on-disk file system is acting as fast as your in-memory file system (for small tests). Both are effectively in-memory.
maxbond•3h ago
A Rust-specific danger is that, if you don't explicitly sync a file before dropping it, any errors from syncing are ignored. So if you care about atomicity, call e.g. `File::sync_all()`.
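A minimal sketch of that advice, assuming a hypothetical helper named `write_durably` (the name and the temp-file path are illustrative only). It calls `File::sync_all()` before the file goes out of scope, so a failure comes back as an `io::Result` instead of being discarded on drop:

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: writes `data` to `path` and surfaces sync errors
/// to the caller instead of leaving them to be ignored on drop.
fn write_durably(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(data)?;
    // Ask the OS to push file data and metadata to the storage device.
    // Any error is returned here, while we can still react to it.
    file.sync_all()?;
    Ok(())
}

fn main() -> io::Result<()> {
    // Illustrative path in the system temp directory.
    let path = std::env::temp_dir().join("write_durably_example.txt");
    write_durably(&path, b"hello")?;
    println!("{}", std::fs::read_to_string(&path)?);
    std::fs::remove_file(&path)?;
    Ok(())
}
```

This only covers the file itself. Whether the directory entry also needs syncing depends on the platform and on what guarantees you want.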
dezgeg•2h ago
Is that really Rust-specific? I would be really surprised if any other languages do fsync() in their destructor either.
maxbond•2h ago
To be clear `File::drop()` does sync, it just ignores errors (because `drop()` doesn't have a way of returning an error). It's not really Rust specific I guess, I just don't know off the top of my head what other languages behave this way.
aw1621107•55m ago
I believe C++'s fstreams also ignore errors on destruction for similar reasons.
I've wondered for a while what it'd take to eliminate such pitfalls in the "traditional" RAII approach. Something equivalent to deleting the "normal" RAII destructor and forcing consumption via a close() could be interesting, but I don't know how easy/hard that would be to pull off.
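One way to approximate this in Rust today, as a rough sketch: a hypothetical `CheckedFile` wrapper (not a std or crate type) whose `close()` consumes the value and returns the sync result. Rust has no linear types, so a forgotten `close()` cannot be made a compile error. The `Drop` impl here can only complain at runtime, and printing a warning is an assumption of this sketch; a panic or abort would be an alternative.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical wrapper that expects callers to finish with `close()`.
struct CheckedFile {
    // `None` once `close()` has taken the file out.
    file: Option<File>,
}

impl CheckedFile {
    fn create(path: &Path) -> io::Result<Self> {
        Ok(Self { file: Some(File::create(path)?) })
    }

    fn write_all(&mut self, data: &[u8]) -> io::Result<()> {
        // `close()` consumes `self`, so the file is always present here.
        self.file.as_mut().expect("file already closed").write_all(data)
    }

    /// Consumes the wrapper and hands any sync error back to the caller.
    fn close(mut self) -> io::Result<()> {
        let file = self.file.take().expect("file already closed");
        let result = file.sync_all();
        result
    }
}

impl Drop for CheckedFile {
    fn drop(&mut self) {
        // Runs on every drop, including at the end of `close()`; only warn
        // when the file was never taken out, i.e. `close()` was skipped.
        if self.file.is_some() {
            eprintln!("CheckedFile dropped without close(); sync errors may be lost");
        }
    }
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("checked_file_example.txt");
    let mut f = CheckedFile::create(&path)?;
    f.write_all(b"hello")?;
    f.close()?; // errors surface here instead of vanishing in drop
    std::fs::remove_file(&path)?;
    Ok(())
}
```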
the8472•2m ago
[delayed]
01HNNWZ0MV43FF•2h ago
For context - cppreference.com doesn't say anything about `fstream` syncing on drop, but it does have an explicit `sync` function. `QFile` from Qt doesn't even have a sync function, which I find odd.
aw1621107•29m ago
I had always assumed that fstream flushes on destruction, but after digging through the standard all I can conclude is that I'm confused.
According to the standard, fstream doesn't have an explicit destructor, but the standard says "It uses a basic_filebuf<charT, traits> object to control the associated sequences." ~basic_filebuf(), in turn, is defined to call close() (which I think flushes to disk?) and swallow exceptions.
However, I can't seem to find anything that explicitly ties the lifetime of the fstream to the corresponding basic_filebuf. fstream doesn't have an explicitly declared destructor and the standard doesn't require that the basic_filebuf is a member of fstream, so the obvious ways the file would be closed don't seem to be explicitly required. In addition, all of fstream's parents' destructors are specified to perform no operations on the underlying rdbuf(). Which leaves... I don't know?
cppreference says the underlying file is closed, though, which should flush it. And that's what I would expect for an RAII class! But I just can't seem to find the requirement...
silon42•2h ago
I'd almost never want to do fsync in normal code (unless implementing something transactional)... but I'd want an explicit close almost always (or drop should panic/abort).
znpy•1h ago
so the good old `sync; sync; sync;` ?
goodpoint•49m ago
This is not correct. Programming languages do not and should not call sync automatically.
the8472•7m ago
[delayed]
indirect•2h ago
I guess I wasn't sufficiently clear in the post, but the part I think is interesting is not that tmpfs and SSD bench at the same speed. I am aware of in-memory filesystem caches, and explicitly mention them twice in the last few paragraphs.
The interesting part, to me, was that using the vfs crate or the rsfs crate didn't produce any differences from using tmpfs or an SSD. In theory, those crates completely cut out the actual filesystem and the OS entirely. Somehow, avoiding all those syscalls didn't make it any faster? Not what I expected.
Anyway, if you have examples of in-process filesystem mocks that run faster than the in-memory filesystem cache, I'd love to hear about them.
eumon•2h ago
You may try /dev/shm for testing purposes. It is effectively an in-memory filesystem that Linux provides, and it is very performant.
j1elo•57m ago
> It turns out the intended primary use case of the crate is to store files inside Rust binaries but still have an API sort of like the filesystem API to interact with them. Unfortunately, that information is hidden away in a comment on a random GitHub issue, rather than included in the project readme.
A+ on technical prowess,
F- on being able to articulate a couple words about it on a text file.
kolektiv•38m ago
It always surprised me somewhat that there isn't a set of traits covering some kind of `fs`-like surface. It's not a trivial surface, but it's not huge either, and I've also found myself in a position of wanting to have multiple implementations of a filesystem-like structure (not even for the same reasons).
Tricky to make that kind of change to the std lib now, I appreciate, but it seems like an odd gap.
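To make the idea concrete, here is a very small sketch of what such a trait could look like. Everything in it (`FileSystem`, `RealFs`, `MemFs`, `save_greeting`) is hypothetical and covers only read and write. A real surface would also need directories, metadata, permissions, and so on.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical minimal filesystem surface; not a std or crate API.
trait FileSystem {
    fn read(&self, path: &Path) -> io::Result<Vec<u8>>;
    fn write(&mut self, path: &Path, data: &[u8]) -> io::Result<()>;
}

/// Delegates to the real filesystem through `std::fs`.
struct RealFs;

impl FileSystem for RealFs {
    fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        std::fs::read(path)
    }
    fn write(&mut self, path: &Path, data: &[u8]) -> io::Result<()> {
        std::fs::write(path, data)
    }
}

/// Keeps file contents in a map inside the process; no syscalls involved.
#[derive(Default)]
struct MemFs {
    files: HashMap<PathBuf, Vec<u8>>,
}

impl FileSystem for MemFs {
    fn read(&self, path: &Path) -> io::Result<Vec<u8>> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
    fn write(&mut self, path: &Path, data: &[u8]) -> io::Result<()> {
        self.files.insert(path.to_path_buf(), data.to_vec());
        Ok(())
    }
}

/// Code under test depends only on the trait, so either backend works.
fn save_greeting(fs: &mut impl FileSystem, path: &Path) -> io::Result<()> {
    fs.write(path, b"hello")
}

fn main() -> io::Result<()> {
    let mut mem = MemFs::default();
    save_greeting(&mut mem, Path::new("/greeting.txt"))?;
    println!("{:?}", mem.read(Path::new("/greeting.txt"))?);

    let mut real = RealFs;
    let path = std::env::temp_dir().join("fs_trait_example.txt");
    save_greeting(&mut real, &path)?;
    std::fs::remove_file(&path)?;
    Ok(())
}
```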
adastra22•1d ago