A user-space filesystem is not the first thing that comes to my mind when trying to get faster performance than NFS.
A well-configured, distributed Lustre setup will be very fast, BTW.
JuiceFS: https://juicefs.com/en/
Weka: https://www.weka.io/
You can get dozens of GB/s out of FUSE nowadays. This will improve even further in the near future as FUSE gains io_uring support for communication with the kernel (instead of read/write round trips on /dev/fuse).
I see the need for "sharing" in giving access to the data, but not for having it represented on the filesystem (other than giving the illusion of local dev).
This is why systems like Weka exist, and why Lustre is still being developed and polished. These systems reach tremendous speeds. This is not an exaggeration when you connect terabits of bandwidth to these storage systems.
This is not at all about NFS vs FUSE; it's about specific NFS providers vs specific FUSE filesystems with specific object store backends.
FUSE is just a way to have a filesystem not implemented in the kernel. I can have a FUSE driver that implements storage based on a rat trained to push a button in reaction to lights turning on, or basically anything else.
NFS is a specific networked filesystem.
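To make the point concrete, here is a minimal sketch using the fusepy bindings (nothing here comes from a real driver; the in-memory dict stands in for the rat, or whatever backend you like):

```python
# Minimal read-only FUSE filesystem: one in-memory file.
# Illustrative sketch using fusepy (pip install fusepy); the "storage"
# is a Python dict, but the callbacks could be backed by anything.
import errno
import stat
import sys

from fuse import FUSE, FuseOSError, Operations

DATA = {"/hello.txt": b"data can come from anywhere\n"}

class AnythingFS(Operations):
    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        if path in DATA:
            return {"st_mode": stat.S_IFREG | 0o444,
                    "st_nlink": 1, "st_size": len(DATA[path])}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", ".."] + [p.lstrip("/") for p in DATA]

    def read(self, path, size, offset, fh):
        # This is the whole trick: the kernel forwards the syscall to
        # this userspace callback, and we can answer it however we want.
        return DATA[path][offset:offset + size]

if __name__ == "__main__":
    FUSE(AnythingFS(), sys.argv[1], foreground=True, ro=True)
```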
That said, all of the open source NFS implementations are either missing this stuff or you'd have to implement it yourself, which would be a lot of work. NetApp Filers are crazy expensive and really annoying to administer. I'm not really surprised that the cloud NFS solutions are all expensive and slow, because truly *needing* NFS is a very niche thing (do you really need `flock(2)` to work in a distributed way?).
Modern-day NFS also has RDMA transports available from some vendors, and you can run it over InfiniBand for extra speed.
For read-only data (the original post is about serving model weights) you can also use iSCSI. This is how packages/binaries are served to nearly all Borg hosts at Google (most Borg hosts don't have any local disk whatsoever; when they need to run a given binary they mount the software image over iSCSI and then, I believe, mlock nearly all of the ELF sections).
NFS is a set of protocols for networked filesystems. You could just as well implement an NFS server that "implements storage based on a rat trained to push a button in reaction to lights turning on". Some people even argue it's a better way to do it than FUSE, because you get robust clients with caching included out of the box on most platforms. E.g. this is a library for building such an NFS server: https://github.com/xetdata/nfsserve
Math isn't mathing, but the salesperson implanted the idea. lol
Also, the third-party library we were calling to do the reconstruction had a limit on the number of threads reading in parallel when preloading projection image data (it was optimized for what was reasonable on local storage), so that favored azcopy.
I don't remember NFS ever coming out ahead.
So: benchmark, benchmark, benchmark, and see what possibilities you have for adapting the preloading behavior before choosing.
- NFS has the "pro" of being POSIX compliant, but I can't see how a FUSE device is different in this regard
- FUSE allegedly supports local caching and lazy loading, but why can't I cache or lazy load with an NFS share?
- NFS apparently has high infrastructure costs, but FUSE comes for free? The author then compares cloud offerings, which should make the infrastructure concerns moot
- the cost calculations don't even mention which provider is used (though you can guess which one) and seemingly don't include transfer costs
There's even more I can't be bothered to mention. Stay away from this post.
If you don't care about acceptable latency, metadata operations, indexing, finding files without already knowing the full path, or proper synchronisation between clients, then sure, mounting S3 over FUSE is nice (heck, I even use it myself), but it's not a replacement for NFS.
You could use S3 object storage with something like JuiceFS/SeaweedFS to make metadata operations acceptably fast (in the case of Redis-backed JuiceFS, lightning fast), but then you're no longer just using object storage and now have a critical database in your infrastructure to maintain.
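To illustrate why metadata operations hurt on raw object storage: a plain S3 bucket has no real readdir, so every directory listing becomes one or more paginated LIST round trips. A rough sketch with boto3 (the bucket and prefix are made up):

```python
# Sketch: emulating readdir on raw S3. Every directory listing is one
# or more paginated LIST requests over the network, which is why
# metadata-heavy workloads crawl without a separate metadata layer.
# Bucket and prefix are hypothetical.
import boto3

s3 = boto3.client("s3")

def readdir(bucket, prefix):
    names = []
    paginator = s3.get_paginator("list_objects_v2")
    # Delimiter="/" makes S3 group keys into pseudo-directories.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix,
                                   Delimiter="/"):
        names += [p["Prefix"] for p in page.get("CommonPrefixes", [])]
        names += [o["Key"] for o in page.get("Contents", [])]
    return names

# Each call is at least one network round trip; a local POSIX readdir
# is microseconds. JuiceFS-style designs avoid this by keeping the
# namespace in Redis/a database and using S3 only for chunk data.
print(readdir("example-bucket", "datasets/train/"))
```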
> Speed: Matching NVMe performance (5-10 GB/s) through kernel bypass and parallelization.
Say wha? Not sure how a userland application is supposed to 1) create a TCP connection to S3 or 2) respond to fopen without going through the kernel.
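For what it's worth, the "parallelization" half of that claim doesn't need any kernel bypass; it's just many concurrent ranged GETs over ordinary kernel sockets. A hedged sketch (the URL and chunk size are invented):

```python
# Sketch: concurrent ranged GETs over plain TCP -- no kernel bypass.
# Throughput scales with the number of parallel streams until the NIC
# or the server caps out. URL and sizes are hypothetical.
import concurrent.futures

import requests

URL = "https://example-bucket.s3.amazonaws.com/model.bin"  # hypothetical
CHUNK = 64 * 1024 * 1024  # 64 MiB per ranged request

def fetch_range(start, end):
    # Each worker holds its own TCP connection to the object store.
    r = requests.get(URL, headers={"Range": f"bytes={start}-{end}"},
                     timeout=60)
    r.raise_for_status()
    return start, r.content

def parallel_read(total_size, workers=16):
    ranges = [(off, min(off + CHUNK, total_size) - 1)
              for off in range(0, total_size, CHUNK)]
    buf = bytearray(total_size)
    with concurrent.futures.ThreadPoolExecutor(workers) as pool:
        for start, data in pool.map(lambda r: fetch_range(*r), ranges):
            buf[start:start + len(data)] = data
    return bytes(buf)
```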
They're in for a shock when they find out you can do NFS via FUSE too.
All that said, there's still a ton of room for NFS to be the backing store, but more importantly there's room for distributed filesystems with intelligent caching to displace all of this.
But if you scroll down, the article lists a few specific network filesystems using FUSE that were tested (JuiceFS, goofys, etc.).
I don't follow all of the reasoning, but I am not surprised by the conclusion. The newer FUSE-based network filesystems are built for modern cloud purposes, so they are better suited to the task.