Good lord.
So... call me old and crotchety, but I'm not sure I trust someone to write a DFS like this who once thought NFS was a good idea. I'm sure it's fine, I just have bad memories.
Historically NFS has had many flaws on different OSes. Many of these issues appear to have been resolved over time, and I haven't seen it referred to as "Nightmare File System" in decades.
However, depending on many factors, NFS may still be a bad choice. In our setup, for example, using a large SQLite database over NFS turns out to be up to 10 times as slow as using a "real" disk.
The SQLite FAQs warn about bigger problems than slowness: https://www.sqlite.org/faq.html#q5
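Setting the locking issues aside, the slowness part is easy to sanity-check yourself. A minimal sketch, assuming you have two copies of the same database, one on an NFS mount and one on local disk (the paths below are hypothetical):

    import sqlite3, time

    def time_query(path, query="SELECT COUNT(*) FROM sqlite_master"):
        # open the database and time a single query end to end
        con = sqlite3.connect(path)
        try:
            start = time.perf_counter()
            con.execute(query).fetchall()
            return time.perf_counter() - start
        finally:
            con.close()

    # hypothetical paths: same database, one copy on NFS, one on local disk
    print("nfs:  ", time_query("/mnt/nfs/data.db"))
    print("local:", time_query("/var/tmp/data.db"))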
It was a long, long time ago that we were only using NFS; it ran on a Solaris machine running ZFS. It did its job at the very beginning, but you don't build up hundreds of petabytes of data on an NFS server.
We did try various solutions in between NFS and developing TernFS, both open source and proprietary. However, we didn't name them specifically in the blog post because there's little point in bad-mouthing what didn't work out for us.
For local filesystems, the average PC user shouldn't really care though. Just use whatever your installer defaults to. But this story is about a distributed filesystem.
I don't have great hopes for one capable of such massive scale being good and usable (low overhead, low complexity, low admin cost) in very small configurations, but we can always hope.
TernFS – An exabyte scale, multi-region distributed filesystem, 247 points, 4 days ago, https://news.ycombinator.com/item?id=45290245
Gluster was OK. We never pushed it very hard but it mostly just worked. Performance wasn't great but we encouraged users to use scratch space that was local to the node where their job was running anyway.
jauntywundrkind•4mo ago
Some notable constraints: files are immutable (write-once, never updated). Designed for files at least 2MB in size. Slow at directory creation/deletion. No permissions/access control.
jleahy•4mo ago
These limits aren't quite as strict as they first seem.
Our median file size is 2MB, which means 50% of our files are <2MB. Realistically, if you've got an exabyte of data with an average file size of a few kilobytes, then this is the wrong tool for the job (you need something more like a database), but otherwise it should be just fine. We actually have a nice little optimisation where very small files are stored inline in the metadata.
It works out of the box with "normal" tools like rsync, python, etc. despite the immutability. The reality is that most things don't actually modify files; even text editors tend to save a new version and rename it over the top. We had to update relatively little of our massive code base when switching over. For us that was a big win: moving to an S3-like interface would have required updating a lot of code.
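For anyone unfamiliar with that pattern, here's a minimal sketch of the usual write-new-then-rename approach (not TernFS-specific; the file name is just an example). The target is never modified in place, which is why it fits a write-once filesystem:

    import os, tempfile

    def atomic_write(path, data: bytes):
        # write to a temp file in the same directory, then rename over the target;
        # the original file is never modified in place
        dirname = os.path.dirname(path) or "."
        fd, tmp = tempfile.mkstemp(dir=dirname)
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)  # atomic rename on POSIX
        except BaseException:
            os.unlink(tmp)
            raise

    atomic_write("settings.json", b'{"version": 2}\n')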
Directory creation/deletion is "slow", currently limited to about 10,000 operations per second. We don't currently need to create more than 10,000 directories per second, so we just haven't prioritised improving that. There is an issue open, #28, which would get this up to 100,000 per second. This is the sort of thing that, like access control, I would love to have had in the initial open source release, but we prioritised open sourcing what we have over getting it perfect.
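If you want to check that kind of number on your own mount, a rough micro-benchmark looks something like this (the mount point is hypothetical):

    import os, shutil, time

    ROOT = "/mnt/ternfs/mkdir-bench"  # hypothetical path on the filesystem under test
    N = 10_000

    os.makedirs(ROOT, exist_ok=True)
    start = time.perf_counter()
    for i in range(N):
        os.mkdir(os.path.join(ROOT, f"d{i}"))
    elapsed = time.perf_counter() - start
    print(f"{N / elapsed:.0f} directory creations per second")
    shutil.rmtree(ROOT)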
em-bee•4mo ago
it is essentially copy-on-write exposed at the user level. the only issue is that this breaks hard links, so tools that rely on them are going to break. but yes, custom code should be easy to adapt.
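to make the hard-link point concrete, a small sketch (file names are just examples): after a write-new-then-rename "edit", the other link still points at the old inode:

    import os

    with open("a.txt", "w") as f:
        f.write("v1\n")
    os.link("a.txt", "b.txt")          # b.txt is a hard link to the same inode

    # "edit" a.txt the write-new-then-rename way
    with open("a.txt.tmp", "w") as f:
        f.write("v2\n")
    os.replace("a.txt.tmp", "a.txt")   # a.txt now names a brand new inode

    print(open("a.txt").read())        # v2
    print(open("b.txt").read())        # still v1: the old inode lives on under b.txt
    print(os.stat("a.txt").st_ino == os.stat("b.txt").st_ino)  # False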