A user-space filesystem is not the first thing that comes to my mind when trying to get faster performance than NFS.
A well-configured, distributed Lustre setup will be very fast, BTW.
JuiceFS: https://juicefs.com/en/
Weka: https://www.weka.io/
You can get dozens of GB/s out of FUSE nowadays. This will improve even further in the near future as FUSE gains io_uring support for communication with the kernel (instead of read/write round trips on /dev/fuse).
I see the need for "sharing" in giving access to the data, but not for having it represented on the filesystem (other than giving the illusion of local dev).
This is why systems like Weka exist, and why Lustre is still being developed and polished. These systems reach tremendous speeds. This is not an exaggeration when you connect terabits of bandwidth to these storage systems.
This is not at all about NFS vs FUSE; it's about specific NFS providers vs specific FUSE filesystems with specific object store backends.
FUSE is just a way to have a filesystem not implemented in the kernel. I can have a FUSE driver that implements storage based on a rat trained to push a button in reaction to lights turning on, or basically anything else.
NFS is a specific networked filesystem.
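To make the point concrete, here is a minimal sketch using the fusepy bindings (nothing here comes from a real driver; the in-memory dict stands in for the rat, or whatever backend you like):

```python
# Minimal read-only FUSE filesystem: one in-memory file.
# Illustrative sketch using fusepy (pip install fusepy); the "storage"
# is a Python dict, but the callbacks could be backed by anything.
import errno
import stat
import sys

from fuse import FUSE, FuseOSError, Operations

DATA = {"/hello.txt": b"data can come from anywhere\n"}

class AnythingFS(Operations):
    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        if path in DATA:
            return {"st_mode": stat.S_IFREG | 0o444,
                    "st_nlink": 1, "st_size": len(DATA[path])}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", ".."] + [p.lstrip("/") for p in DATA]

    def read(self, path, size, offset, fh):
        # This is the whole trick: the kernel forwards the syscall to
        # this userspace callback, and we can answer it however we want.
        return DATA[path][offset:offset + size]

if __name__ == "__main__":
    FUSE(AnythingFS(), sys.argv[1], foreground=True, ro=True)
```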
That said, all of the open source NFS implementations are either missing this stuff or you'd have to implement it yourself, which would be a lot of work. NetApp Filers are crazy expensive and really annoying to administer. I'm not really surprised that the cloud NFS solutions are all expensive and slow, because truly *needing* NFS is a very niche thing (do you really need `flock(2)` to work in a distributed way?).
Modern-day NFS also has RDMA transports available from some vendors, and you can run it over InfiniBand for extra speed.
For read-only data (the original post is about serving model weights) you can also use iSCSI. This is how packages/binaries are served to nearly all Borg hosts at Google (most Borg hosts don't have any local disk whatsoever; when they need to run a given binary they mount the software image over iSCSI and then, I believe, mlock nearly all of the ELF sections).
NFS is a set of protocols for networked filesystems. You could just as well implement an NFS server that "implements storage based on a rat trained to push a button in reaction to lights turning on". Some people even argue it's a better way to do it than FUSE, because you get robust clients with caching included out of the box on most platforms. E.g. this is a library for building such an NFS server: https://github.com/xetdata/nfsserve
Math isn't mathing, but the salesperson implanted the idea. lol
Also, the third-party library we were calling to do the reconstruction had a limit on the number of threads reading in parallel when preloading projection image data (it was optimized for what was reasonable on local storage), so that favored azcopy.
I don't remember NFS ever coming out ahead.
So: benchmark, benchmark, benchmark, and see what possibilities you have for adapting the preloading behavior before choosing.
- NFS has the "pro" of being POSIX compliant, but I can't see how a FUSE device is different in this regard
- FUSE allegedly supports local caching and lazy loading, but why can't I cache or lazy load with an NFS share?
- NFS apparently has high infrastructure costs, but FUSE comes for free? The author then compares cloud offerings, which should make the infrastructure concerns moot
- the cost calculations don't even mention which provider is used (though you can guess which one) and seemingly don't include transfer costs
There's even more I can't be bothered to mention. Stay away from this post.
If you don't care about acceptable latency, metadata operations, indexing, finding files without already knowing the full path, or proper synchronisation between clients, then sure, mounting S3 over FUSE is nice (heck, I even use it myself), but it's not a replacement for NFS.
You could use S3 object storage with something like JuiceFS/SeaweedFS to make metadata operations acceptably fast (in the case of Redis-backed JuiceFS, lightning fast), but then you're no longer just using object storage and now have a critical database in your infrastructure to maintain.
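To illustrate why metadata operations hurt on raw object storage: a plain S3 bucket has no real readdir, so every directory listing becomes one or more paginated LIST round trips. A rough sketch with boto3 (the bucket and prefix are made up):

```python
# Sketch: emulating readdir on raw S3. Every directory listing is one
# or more paginated LIST requests over the network, which is why
# metadata-heavy workloads crawl without a separate metadata layer.
# Bucket and prefix are hypothetical.
import boto3

s3 = boto3.client("s3")

def readdir(bucket, prefix):
    names = []
    paginator = s3.get_paginator("list_objects_v2")
    # Delimiter="/" makes S3 group keys into pseudo-directories.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix,
                                   Delimiter="/"):
        names += [p["Prefix"] for p in page.get("CommonPrefixes", [])]
        names += [o["Key"] for o in page.get("Contents", [])]
    return names

# Each call is at least one network round trip; a local POSIX readdir
# is microseconds. JuiceFS-style designs avoid this by keeping the
# namespace in Redis/a database and using S3 only for chunk data.
print(readdir("example-bucket", "datasets/train/"))
```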
> Speed: Matching NVMe performance (5-10 GB/s) through kernel bypass and parallelization.
Say wha? Not sure how a userland application is supposed to 1) create a TCP connection to S3 or 2) respond to fopen without going through the kernel.
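For what it's worth, the "parallelization" half of that claim doesn't need any kernel bypass; it's just many concurrent ranged GETs over ordinary kernel sockets. A hedged sketch (the URL and chunk size are invented):

```python
# Sketch: concurrent ranged GETs over plain TCP -- no kernel bypass.
# Throughput scales with the number of parallel streams until the NIC
# or the server caps out. URL and sizes are hypothetical.
import concurrent.futures

import requests

URL = "https://example-bucket.s3.amazonaws.com/model.bin"  # hypothetical
CHUNK = 64 * 1024 * 1024  # 64 MiB per ranged request

def fetch_range(start, end):
    # Each worker holds its own TCP connection to the object store.
    r = requests.get(URL, headers={"Range": f"bytes={start}-{end}"},
                     timeout=60)
    r.raise_for_status()
    return start, r.content

def parallel_read(total_size, workers=16):
    ranges = [(off, min(off + CHUNK, total_size) - 1)
              for off in range(0, total_size, CHUNK)]
    buf = bytearray(total_size)
    with concurrent.futures.ThreadPoolExecutor(workers) as pool:
        for start, data in pool.map(lambda r: fetch_range(*r), ranges):
            buf[start:start + len(data)] = data
    return bytes(buf)
```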
They're in for a shock when they find out you can do NFS via FUSE too.
All that said, there's still a ton of room for NFS to be the backing store, but more importantly there's room for distributed filesystems with intelligent caching to displace all of this.
But if you scroll down, the article lists a few specific network filesystems using FUSE that were tested (JuiceFS, goofys, etc.).
I don't follow all of the reasoning, but I am not surprised by the conclusion. The newer FUSE-based network filesystems are built for modern cloud purposes, so they are better suited to the task.