We Are Doing Files Wrong (2021)

https://simonsafar.com/2021/we_are_doing_files_wrong/
5•Expurple•6mo ago

Comments

jll29•6mo ago
That's one of the few times I've read about a proposed innovation "in the spirit of UNIX" that was not already present in the original UNIX or one of its descendants.

  UNIX: Everything is a file.
  => A directory is a file.

  Parent post: Everything is a directory.
  => A file is a directory.
I.e., a switch from "There are files and special files called directories that are handled differently." to the recursive definition "There are files, which are made up of 0..n files (blobs) and 0..n subdirectories" - so file versus directory is just a VIEW.

Makes sense, and it would make traversal code for files with internal structure much easier to read and write.

Expurple•6mo ago
> to the recursive definition "There are files, which are made up of 0..n files (blobs) and 0..n subdirectories"

I think it's more like "a file node contains metadata, a binary blob of data (may be empty), and 0..n child files".

Agreed that this idea is very elegant and removes special cases, nodes become uniform. And the argument for reusing the OS(FS)-provided tree abstraction is compelling.
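To make the "uniform nodes" point concrete, here is a minimal Python sketch of that node model. The `Node` class, the `walk` helper and the example "photo" layout are all hypothetical names made up for illustration; no real filesystem exposes this API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Tuple


@dataclass
class Node:
    """Hypothetical uniform node: metadata, a blob (may be empty), 0..n children."""
    metadata: Dict[str, str] = field(default_factory=dict)
    data: bytes = b""
    children: Dict[str, "Node"] = field(default_factory=dict)


def walk(node: Node, path: str = "") -> Iterator[Tuple[str, Node]]:
    """Yield (path, node) for the node and all of its descendants, depth-first."""
    yield path or "/", node
    for name, child in node.children.items():
        yield from walk(child, f"{path}/{name}")


# Made-up example: an image stored as a node with two child nodes.
photo = Node(
    metadata={"type": "image"},
    children={
        "exif": Node(data=b"camera=hypothetical"),
        "pixels": Node(data=b"\x00\x01\x02"),
    },
)

paths = [p for p, _ in walk(photo)]
total_bytes = sum(len(n.data) for _, n in walk(photo))
```

The traversal has no "is this a file or a directory?" branch: every node may carry data and may have children, so one loop covers both cases.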

Although, I can imagine some performance concerns in the real world. If implemented naively and similarly to the existing Unixes, this model results in a lot of small fragmented blocks and separate syscalls+descriptors for dealing with each small file. Also, when the "tree" is actually a sequential array of nameless elements, there's some extra overhead involved with writing and storing made-up file names, as well as sorting by name when reading. This could be remedied by some new API. And a single tree implementation reused by everything could be more cache-friendly than having a userland parser for every "old" format in every application.

Anyway, this mental model is useful and I'd like to see and try out the "automounting" that the author describes.

Expurple•6mo ago
Can't edit the parent comment anymore, so I'll append my other thoughts here.

I remembered that the "automounting" already exists in some forms, and I really like these instances. When you click on an archive in a good file manager, it opens a "folder" view with the archive contents. The difference between an archive and a folder is arbitrary. It shouldn't exist and only complicates things for everybody. I assume that many applications today hand-code the logic for "if the user drags and drops a folder, we need to zip it before sending". Or they don't, and the user has to zip manually :)
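A rough sketch of what "the difference is arbitrary" could look like from an application's side, using only Python's standard `os` and `zipfile` modules. The `list_children` helper is a hypothetical name, and it only handles plain directories and zip archives.

```python
import os
import tempfile
import zipfile
from typing import List


def list_children(path: str) -> List[str]:
    """List the immediate children of a directory or of a zip archive."""
    if os.path.isdir(path):
        return sorted(os.listdir(path))
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Keep only the first path component of each member name.
            return sorted({name.split("/", 1)[0] for name in archive.namelist()})
    raise NotADirectoryError(path)


with tempfile.TemporaryDirectory() as tmp:
    # Build a small folder and an archive with the same contents.
    folder = os.path.join(tmp, "report")
    os.mkdir(folder)
    for name in ("a.txt", "b.txt"):
        with open(os.path.join(folder, name), "w") as f:
            f.write(name)

    archive_path = os.path.join(tmp, "report.zip")
    with zipfile.ZipFile(archive_path, "w") as archive:
        for name in ("a.txt", "b.txt"):
            archive.write(os.path.join(folder, name), arcname=name)

    folder_listing = list_children(folder)
    archive_listing = list_children(archive_path)
```

Both calls return the same listing, so the caller never has to know which kind of container it was given. With real automounting this branching would live in the OS instead of in every application.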

One could say that storing program data as "transparent" folders instead of "opaque" binary files is too much detail for the user. And users can accidentally damage something (e.g. delete one of the files inside) more easily. But I have a few counter-points:

1. Many applications already use folders, and they're doing fine.

2. File managers already open many filetypes in an editor by default (e.g. plain text, office docs), and these files are doing fine.

3. A good file manager should recognize most file types and do the more reasonable and safe thing. If JPEG is reimplemented as a folder, clicking on a JPEG should still open the image viewer instead of the folder view. That's already the case with Mac OS app bundles (the example from the original post).

4. Actually, now that folders have a "data" field, it can be used for storing arbitrary metadata. This is extremely powerful and can be used for HIDING extra details from casual users! E.g. there could be a standardized metadata header that hints to the file manager that it should treat the folder as an "opaque" file and not show the user the contents. Now, old applications that already used folders for their internal state can mark these folders as "opaque" and prevent casual users from messing with the contents! Users can still see, move and delete the folder as a whole (unlike hidden folders), and applications and advanced users keep uniform FS access.

Man, I really like this idea of arbitrary metadata for folders... It's not as necessary for files, because in practice you can just put the "metadata header" in the beginning of the main "data" (as many file formats do).

Expurple•6mo ago
> I can imagine some performance concerns in the real world.

A friend has pointed out that, for performance reasons, games already tend to sidestep the filesystem and bundle everything into container/archive files, like .pak [1]. It also reminded me of how compile times are often noticeably faster on Linux vs Windows, because its filesystem is better optimized for handling many small files.

Honestly, it's weird and makes me kind of sad. Filesystems are such an important, convenient and (mostly) standardized and portable abstraction. But to this day, it's often a bottleneck and too slow for some domains. It seems like there's a lot of missed opportunity here.

> when the "tree" is actually a sequential array of nameless elements, there's some extra overhead involved with writing and storing made-up file names, as well as sorting by name when reading. This could be remedied by some new API.

As I keep thinking about it, the fastest approach is still a sequential binary blob. It avoids indirection, fragmentation and needlessly storing separate metadata for each element. A VFS mount could still be implemented on top, for accessing elements through a filesystem interface (something like `my_array_file/0`, reporting the same metadata as the parent).
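A minimal sketch of that lookup, assuming fixed-size elements (little-endian 32-bit unsigned integers here, which is purely an assumption for the example). `pack_array` and `read_element` are hypothetical helpers, not part of any real VFS layer.

```python
import struct

# Assumption: every element is a fixed-size little-endian uint32.
RECORD = struct.Struct("<I")


def pack_array(values):
    """Store the elements back to back in one sequential blob."""
    return b"".join(RECORD.pack(v) for v in values)


def read_element(blob: bytes, virtual_path: str) -> int:
    """Resolve a path like 'my_array_file/2' to element 2 of the blob."""
    _, index_text = virtual_path.rsplit("/", 1)
    index = int(index_text)
    count = len(blob) // RECORD.size
    if not 0 <= index < count:
        raise FileNotFoundError(virtual_path)
    # The offset is computed directly, with no per-element metadata or lookup.
    (value,) = RECORD.unpack_from(blob, index * RECORD.size)
    return value


blob = pack_array([10, 20, 30])
first = read_element(blob, "my_array_file/0")
last = read_element(blob, "my_array_file/2")
```

The "file names" exist only in the path parser; the storage itself stays a flat array, which is the point of this variant.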

But if storing separate FS-level metadata for each element is actually desirable and indirection+fragmentation isn't a critical problem, we could use folders for arrays. As a partial optimization, we could introduce a special "ordered" folder type. It requires explicitly ordering the subfiles on write, so that later listing the children is sorted and fast by default.

Option 1. The ordering is a separate piece of metadata in the folder. Subfiles still have unique names that are not related to the ordering. This could be useful for the app, or could be unnecessary. POSIX apps (that don't know about "sorted folders") would still re-sort the subfiles by name and could get a different ordering as a result.

Option 2. The subfiles don't have user-provided names. For POSIX-compatibility, the VFS supports "artificial" file names like 0000000, 0000001, 0000002 (the number of digits according to the FS limits, sorts as text correctly) or 0, 1, 2 (prettier, independent of the FS limits and portable, but doesn't sort correctly as text).
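The sorting difference between the two naming schemes in Option 2 is easy to check in Python. The width of 7 digits matches the example names above but is otherwise an arbitrary choice, and both helper functions are made up for illustration.

```python
from typing import List


def padded_names(count: int, width: int = 7) -> List[str]:
    """Zero-padded artificial names: 0000000, 0000001, ..."""
    return [f"{i:0{width}d}" for i in range(count)]


def plain_names(count: int) -> List[str]:
    """Unpadded artificial names: 0, 1, 2, ..."""
    return [str(i) for i in range(count)]


padded = padded_names(12)
plain = plain_names(12)

# Text sort keeps zero-padded names in numeric order...
padded_text_sorted = sorted(padded)
# ...but puts "10" and "11" before "2" for unpadded names.
plain_text_sorted = sorted(plain)
# A tool that knows the names are numbers can sort by integer value instead.
plain_numeric_sorted = sorted(plain, key=int)
```

So a POSIX tool that sorts by name sees the intended order only with padded names; unpadded names need a consumer that sorts numerically.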

In some sense, this special folder type is against the spirit of the original article. It has special write restrictions that prevent treating every inode the same way (as an arbitrarily writable folder). But it's still in the spirit of the original in the sense of reusing the standard FS abstractions and features as much as possible.

[1] https://quakewiki.org/wiki/.pak
