Too Many Open Files

https://mattrighetti.com/2025/06/04/too-many-files-open

180•furkansahin•8mo ago

Comments

quotemstr•8mo ago

There's no reason to place an arbitrary cap on the number of file descriptors a process can have. It's neither necessary nor sufficient for limiting the amount of memory the kernel will allocate on behalf of a process. On every Linux system I use, I bump the FD limit to maximum everywhere.

kstrauser•8mo ago

You'd been downvoted, but I also wonder about that.

If you write a program that wants to have a million files open at once, you're almost certainly doing it wrong. Is there a real, inherent reason why the OS can't or shouldn't allow that, though?

quotemstr•8mo ago

> If you write a program that wants to have a million files open at once

A file descriptor is just the name of a kernel resource. Why shouldn't I be able to have a ton of inotify watches, sockets, dma_buf texture descriptors, or memfd file descriptors? Systems like DRM2 work around FD limits by using their own ID namespaces instead of file descriptors and make the system thereby uglier and more bug-prone. Some programs that regularly bump up against default FD limits are postgres, nginx, the docker daemon, watchman, and notoriously, JetBrains IDEs.

Why? Why do we live like this?

kstrauser•8mo ago

I honestly don’t know. Maybe there’s a great reason for it that would be obvious if I knew more about the low-level kernel details, but at the moment it eludes me.

Like, there’s not a limit on how many times you can call malloc() AFAIK, and the logic for limiting the number of those calls seems to be the same as for open files. “If you call malloc too many times, your program is buggy and you should fix it!” isn’t a thing, but yet allocating an open file is locked down hard.

hulitu•8mo ago

> Is there a real, inherent reason why the OS can't or shouldn't allow that, though?

Yes, because you are not alone in this universe. A user does usually run more than one program and all programs shall have access to resources (cpu time, memory, disk space).

kstrauser•8mo ago

But back to: why is that a problem? Why is there a limit on max open files such that process A opening one takes away from how many process B can open?

Dylan16807•8mo ago

People often run programs that are supposed to use 98% of the resources of the system. Go ahead and let the admin set a limit, but trying to preemptively set a "reasonable" limit causes a lot more problems than it solves.

Especially when most of these resources go back to memory. If you want a limit, limit memory. Don't make it overcomplicated.

toast0•8mo ago

How else am I supposed to service a million clients, other than having a million sockets?

This isn't a real issue though. Usually, you can just set the soft limit to the often much higher hard limit; at worst, you just have to reboot with a big number for max fds; too many open files is a clear indicator of a missing config, and off we go. The defaults limits are small and that usually works because most of the time a program opening 1M fds is broken.

Kind of annoying when Google decides their container optimized OS should go from soft and hard limits of 1M to soft limit 1024, hard limit 512k though.

jeffbee•8mo ago

A million FDs on a process is not weird. I used to run frontends with that many sockets, on Intel Clovertown Xeons. That was a machine that came out 20 years ago. There is absolutely no reason whatsoever that this would indicate "doing it wrong" in the year 2025.

gnulinux•8mo ago

It's not that it's a proxy for memory use, but FD is its own resource, I've seen many software with FD leak (i.e. they open a file and forget about it, if they need the file again, they open another FD) so this limit can be a method to tell that it's leaking. Whether that's a good idea/necessary depends on the application.

quotemstr•8mo ago

Then account for _kernel memory used_ by file descriptors and account it like any other ulimit. Don't impose a cap on file descriptors in particular. These caps distort program design throughout the ecosystem

Dwedit•8mo ago

It's not just memory, it's cleanup that the Kernel must perform when the process terminates.

quotemstr•8mo ago

So? Doesn't matter. Account for the thing you want to control directly. Don't put caps on bad proxies for the thing you want to control.

tedunangst•8mo ago

Don't leak things with limits.

quotemstr•8mo ago

It's not a "leak" if you're actually using it, you know.

jcalvinowens•8mo ago

I'm not really sure, but I've always assumed early primitive UNIX implementations didn't support dynamically allocating file descriptors. It's not uncommon to see a global fixed size array in toy OSs.

One downside to your approach is that kernel memory is not swappable in Linux: the OOM failure mode could be much nastier than leaking memory in userspace. But almost any code in the real world is going to allocate some memory in userspace to go along with the FD, that will cause an OOM first.

duped•8mo ago

ENOMEM is already one of the allowed error conditions of `open`. Classically you hit this if it's a pipe and you've hit the pipe-user-pages-hard limit. POSIX is a bit pedantic about this but Linux explicitly says the kernel may return this as a general failure when kernel memory limits are reached.

jcalvinowens•8mo ago

My guess is it was more about partitioning resources: if you have four daemons and 100 static global file descriptors, in the simplest case you probably want to limit each one to using 25. But I'm guessing, hopefully somebody who knows more than me will show up here :)

duped•8mo ago

No it's way simpler than that. The file descriptors are indices into an array containing the state of the file for the process. Limiting the max size of the array makes everything easier to implement correctly.

For example consider if you're opening/closing file descriptors concurrently. If the array never resizes the searches for free fds and close operations can happen without synchronization.

quotemstr•8mo ago

The Linux FD table's performance does not depend on assumptions of non-growth.

jcalvinowens•8mo ago

I meant the existence of ulimit was about partitioning resources.

Imagine a primitive UNIX with a global fixed size file descriptor probing hashtable indexed by FD+PID: that's more what I was getting at. I have no idea if such a thing really existed.

> If the array never resizes the searches for free fds and close operations can happen without synchronization.

No, you still have to (at the very least) serialize the lookups of the lowest available descriptor number if you care about complying with POSIX. In practice, you're almost certain to require more synchronization for other reasons. Threads share file descriptors.

The modern Linux implementation is not so terrible IMHO: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds...

JdeBP•8mo ago

This particular resource limit stuff was not in "primitive Unix". It was a novelty that came with 4.2BSD.

CactusRocket•8mo ago

It's always a good thing to have resource limits, to constrain runaway programs or guard against bugs. Low limits are unfortunate, but extremely high limits or unbounded resource acquisition can lead to many problems. I rather see "too many open files" than my entire machine freezing up, when a program misbehaves.

quotemstr•8mo ago

There's a certain kind of person who just likes limits and will never pass up an opportunity to defend and employ them.

For example, imagine a world in which Linux had a RLIMIT_CUMULATIVE_IO:

"How else am I supposed to prevent programs wearing out my flash? Of course we should have this limit"

"Of course a program should get SIGIO after doing too much IO. It'll encourage use of compression"

"This is a security feature, dumbass. Crypto-encrypters need to write encrypted files, right? If you limit a program to writing 100MB, it can't do that much damage"

Yet we don't have a cumulative write(2) limit and the world keeps spinning. It's the same way with the limits we do have --- file number limits, vm.max_map_count, VSIZE, and so on. They're relicts of a different time, yet the I Like Limits people will retroactively justify their existence and resist attempts to make them more suitable for the modern world.

eviks•8mo ago

Your entire machine won't freeze if you have a sensible limit of the direct cause of that freeze (which would be e.g. memory or CPU %, not some arbitrary number descriptors)

mattrighetti•8mo ago

> There's no reason to place an arbitrary cap on the number of file descriptors a process can have

I like to think that if something is there then there's a reason for it, it's just that I'm not that smart to see it :) jokes aside, I could see this as a security measure? A malware that tries to encrypt your whole filesystem in a single shot could be blocked or at least slowed down with this limit.

JdeBP•8mo ago

The name for the principle that you are roughly adhering to is Chesterton's Fence.

josephcsible•8mo ago

There's one unfortunate historical reason: passing FDs >=1024 to glibc's "select" function doesn't work, so it would be a breaking change to ever raise the default soft limit above that. It's perfectly fine for the default hard limit to be way higher, though, and for programs that don't use "select" (or that use the kernel syscall directly, but you should really just use poll or epoll instead) to raise their own soft limit to the hard limit.

Borg3•8mo ago

Well, if you want more, you can just set it via: #define FD_SETSIZE <value>

Just keep value sane ;) 4096 or 5120 should be okish.

josephcsible•8mo ago

FD_SETSIZE isn't something you can just redefine like that. You'd have to reimplement all the types, macros, etc. yourself.

quotemstr•8mo ago

Yes, it is in fact something you can just redefine yourself. That's not to say you should do it.

josephcsible•8mo ago

Right. You can but generally shouldn't. Like I said upthread, you're better off using poll or epoll.

nasretdinov•8mo ago

Yeah macOS has a very low default limit, and apparently it affects more than just cargo test, e.g. ClickHouse, and there's even a quite good article about how to increase it permanently: https://clickhouse.com/docs/development/build-osx

mhink•8mo ago

I actually tried another method for doing this not too long ago (adding `kern.maxfiles` and `kern.maxfilesperproc` to `/etc/sysctl.conf` with higher limits than the default) and it made my system extremely unstable after rebooting. I'm not entirely sure why, though.

css•8mo ago

I ran into this issue recently [0]. Apparently the integrated VSCode terminal sets its own (very high) cap by default, but other shells don't, so all of my testing in the VSCode shell "hid" the bug that other shells exposed.

[0]: https://github.com/ReagentX/imessage-exporter/issues/314#iss...

oatsandsugar•8mo ago

Yeah I ran into this too when testing a new feature. My colleague sent me this: https://apple.stackexchange.com/questions/32235/how-to-prope...

But I reckon its unreasonable for us to ask our users to know this, and we'll have to fix the underlying cause.

loeg•8mo ago

Also possible to have an fd leak when this error arises. Probably worth investigating a little if that might be the case.

gizmo686•8mo ago

Based on how the number of open files dropped down to 15 or lower after the error, I doubt the issue was caused by an actual leak.

I have had issues with not quite FD leaks, where we would open the same file a bunch of times for some tasks. It is not a leak because we close all of the FDs at the end of the task. In particular, this meant that it slipped past the explicit FD leak detection logic we had in our test harness. It also worked flawlessly in our long running stress tests.

For a while people assumed it was legit, because it only showed up on tasks that involved thousands of files, and the needed FD limit seemed to scale to the input file count.

loeg•8mo ago

Fair -- excessive-but-tracked fd use can still be a problem, even if they don't end up truly leaked.

trinix912•8mo ago

Brings back memories of setting FILES= in config.sys in MS-DOS. I've totally forgotten this can still be a problem nowadays!

Izkata•8mo ago

  lsof -p $(echo $$)

The subshell isn't doing anything useful here, could just be:

  lsof -p $$

zx8080•8mo ago

This code has AI smell

mattrighetti•8mo ago

Or writing blogs at 2AM is not a smart thing to do

codedokode•8mo ago

The problem with lsof is that it outputs lot of duplicates, for example:

- it outputs memory-mapped files whose descriptor was closed (with "mem" type)

- for multi-thread processes it repeats every file for every thread

For example my system has 400 000 lines in lsof output and it is really difficult to figure out which of them count against the system-wide limit.

AdmiralAsshat•8mo ago

Used to run into this error frequently with a piece of software I supported. I don't remember the specifics, but it was your basic C program to process a record-delimited datafile. A for-loop with an fopen that didn't have a corresponding fclose at the end of the loop. For a sufficiently large datafile, eventually we'd run out of file handles.

xorvoid•8mo ago

The real fun thing is when the same application is using “select()” and then somewhere else you open like 5000 files. Then you start getting weird crashes and eventually trace it down to the select bitset having a hardcoded max of 4096 entries and no bounds checking! Fun fun fun.

danadam•8mo ago

> trace it down to the select bitset having a hardcoded max of 4096

Did it change? Last time I checked it was 1024 (though it was long time ago).

> and no bounds checking!

_FORTIFY_SOURCE is not set? When I try to pass 1024 to FD_SET and FD_CLR on my (very old) machine I immediately get:

  *** buffer overflow detected ***: ./a.out terminated
  Aborted

(ok, with -O1 and higher)

xorvoid•8mo ago

You’re right. I think it ends up working out to a 4096 page on x86 machines, that’s probably what I remembered.

Yes, _FORTIFY_SOURCE is a fabulous idea. I was just a bit shocked it wasn’t checked without _FORTIFY_SOURCE. If you’re doing FD_SET/FD_CLR, you’re about to make an (expensive) syscall. Why do you care to elide a cheap not-taken branch that’ll save your bacon some day? The overhead is so incredibly negligible.

Anyways, seriously just use poll(). The select() syscall needs to go away for good.

reisse•8mo ago

You've had a good chance to really see 4096 descriptions in select() somewhere. The man is misleading because it refers to the stubbornly POSIX compliant glibc wrapper around actual syscall. Any sane modern kernel (Linux; FreeBSD; NT (although select() on NT is a very different beast); well, maybe except macOS, never had a chance to write network code there) supports passing the descriptor sets of arbitrary size to select(). It's mentioned further down in the man, in the BUGS section:

> POSIX allows an implementation to define an upper limit, advertised via the constant FD_SETSIZE, on the range of file descriptors that can be specified in a file descriptor set. The Linux kernel imposes no fixed limit, but the glibc implementation makes fd_set a fixed-size type, with FD_SETSIZE defined as 1024, and the FD_*() macros operating according to that limit.

The code I've had a chance to work with (it had its roots in the 90s-00s, therefore the select()) mostly used 2048 and 4096.

> Anyways, seriously just use poll().

Oh please don't. poll() should be in the same grave as select() really. Either use libev/libuv or go down the rabbit hole of what is the bleeding edge IO multiplexer for your platform (kqueue/epoll/IOCP/io_uring...).

moyix•8mo ago

I made a CTF challenge based on that lovely feature of select() :D You could use the out-of-bounds bitset memory corruption to flip bits in an RSA public key in a way that made it factorable, generate the corresponding private key, and use that to authenticate.

https://threadreaderapp.com/thread/1723398619313603068.html

malux85•8mo ago

Oh that’s clever!

StefanBatory•8mo ago

I love how you've made it Eva themed, my respect to you.

ape4•8mo ago

Yeah, the man page says:

    WARNING:  select()  can  monitor  only file descriptors numbers that are
       less than FD_SETSIZE (1024)—an unreasonably low limit  for  many  modern
       applications—and  this  limitation will not change.  All modern applica‐
       tions should instead use poll(2) or epoll(7), which do not  suffer  this
       limitation.

time4tea•8mo ago

You can recompile libc if you want to change the limit, or at least you could in the past.

o11c•8mo ago

You don't have to recompile, just do the following (at least on glibc):

  #include <sys/types.h> // pull in initial definition of __FD_SETSIZE
  #undef __FD_SETSIZE
  #define __FD_SETSIZE 32768 // or whatever
  #include <sys/select.h> // won't include the internal <bits/types.h> again

This is a rare case when `-Wsystem-headers` is useful to enable (and these days system headers are usually pretty clean) - it will catch if you accidentally define `__FD_SETSIZE` before the system does.

Note that `select` is still the nicest API in a lot of ways - `poll` wastes space gratuitously, `epoll` requires lots of finicky `modify` syscalls, and `io_uring` is frankly not sane.

That said:

* if you're only dealing with a couple FDs, use `poll`.

* it's not that hard to take a day and think about epoll write buffer management. You need to consider every combination of:

  epoll state is/isn't checking writability (you want to only change this lazily)
  on the previous/current iteration, was there nothing/something in the write buffer?
  prior actual write was would-block/actually-incomplete/spuriously-incomplete/complete
  current actual write ends up would-block/actually-incomplete/spuriously-incomplete/complete

There are many "correct" answers, but I suspect the optimal answer for epoll is something like: initially, write optimistically (and do this before the wait). If you fail to write anything at all, enable the kernel flag. For FDs that you've previously enabled the flag for, if you don't have anything to write this time, disable the flag; otherwise, don't actually write until after the wait (it is guaranteed to return immediately if the write would be allowed, after all, but you'll also get other events that happen to be ready). If you trust your event handlers to return quickly, you can defer any indicated writes until the next wait, otherwise do them before handling events.

You can see why people still use `select`.

o11c•8mo ago

Checking other libcs (note that "edit the header" is not that difficult to automate):

  bionic - must edit the header
  dietlibc - must edit the header
  glibc - undocumented but reliable, see the dance in the original post
  klibc - must edit <linux/posix_types.h> (which, note, sabotages glibc)
  MUSL - must edit the header
  newlib - documented in header, just `#define FD_SETSIZE` before you `#include <sys/select.h>`
  uclibc - as glibc (since it's a distant fork). Note that `poll.c` for old uclinux kernels is implemented in terms of `select` with dynamic `fd_set` sizing logic!

  freebsd - properly documented, just `#define FD_SETSIZE` first
  netbsd - properly documented, just `#define FD_SETSIZE` first
  openbsd - documented just in the header now (formerly in the man page too), just `#define FD_SETSIZE` first
  solaris - properly documented, just `#define FD_SETSIZE` first
  macos - properly documented, just `#define FD_SETSIZE` first
  winsock - properly documented, just `#define FD_SETSIZE` first, but note the API is not actually the same

reisse•8mo ago

Oh the real fun thing is when the select() is not even in your code! I remember having to integrate a closed-source third-party library vendored by an Australian fin(tech?) company which used select() internally, into a bigger application which really liked to open a lot of file descriptors. Their devs refused to rewrite it to use something more contemporary (it was 2019 iirc!), so we had to improvise.

In the end we came up with a hack to open 4k file descriptors into /dev/null on start, then open the real files and sockets necessary for our app, then close that /dev/null descriptors and initialize the library.

o11c•8mo ago

There's no need to actually do all the opening if you control the code.

You can do anything with `fcntl(F_DUPFD{,_CLOEXEC})` and `fdopen`.

reisse•8mo ago

If we had control of library code, we'd just get rid of select()...

Though we did use the dup trick in another case!

cryptonector•8mo ago

Or back in the days of Solaris 9 and under, 32-bit processes could not have stdio handles with file descriptor numbers larger than 255. Super double plus unfun when you got hit by that. Remember that, u/lukeh?

geocrasher•8mo ago

Back in the earlier days of web hosting we'd run into this with Apache. In fact, I have a note from 2014 (much later than the early days actually):

  ulimit -n 10000

  to set permanently:
  /etc/security/limits.conf
  \* - nofile 10000

bombcar•8mo ago

I seem to remember this was a big point of contention when threaded Apache (vs just forking a billion processes) appeared - that if you went from 20 processes to 4 processes of 5 threads each you could hit the ulimit.

But ... that's a bad memory from long ago and far away.

geocrasher•8mo ago

If you're old enough to remember those days, then you're also old enough to start forgetting them. I speak for myself of course :)

mzs•8mo ago

Is there no way limit the number of concurrently running tests as with make -j 128?

jeroenhd•8mo ago

I think there's something ironic about combining UNIX's "everything is a file" philosophy with a rule like "every process has a maximum amount of open files". Feels a bit like Windows programming back when GDI handles were a limited resource.

Nowadays Windows seems to have capped the max amount of file handles per process to 2^16 (or 8096 if you're using raw C rather than Windows APIs). However, as on Windows not everything is a file, the amount of open handles is limited "only by memory", so Windows programs can do a lot of things UNIX programs can't do anymore when the file handle limit has been reached.

CactusRocket•8mo ago

I actually think it's not ironic, but a synergy. If not everything is a file, you need to limit everything in their own specific way (because resource limits are always important, although it's convenient if they're configurable). If everything is a file, you just limit the maximum number of open files and you're done.

eddd-ddde•8mo ago

That's massively simplifying things however, every "file" uses resources in its own magical little way under the hood.

Brian_K_White•8mo ago

saying "everything is a file" is massively simplifying, so fair is fair

taeric•8mo ago

I'm not sure I see irony? I can somewhat get that it is awkward to have a limit that covers many use cases, but this feels a bit easier to reason about than having to check every possible thing you would want to limit.

Granted, I can agree it is frustrating to hit an overall limit if you have tuned lower limits.

jchw•8mo ago

I'm not even 100% certain there's really much of a specific reason why there has to be a low hard limit on file descriptors. I would guess that Windows NT handles take up more system resources since NT handles have a lot of things that file descriptors do not (e.g. ACLs).

Still, on the other hand, opening a lot of file descriptors will necessarily incur a lot of resource usage, so really if there's a more efficient way to do it, we should find it. That's definitely the case with the old way of doing inotify for recursive file watching; I believe most or all uses of inotify that work this way can now use fanotify instead much more efficiently (and kqueue exists on other UNIX-likes.)

In general having the limit be low is probably useful for sussing out issues like this though it definitely can result in a worse experience for users for a while...

> Feels a bit like Windows programming back when GDI handles were a limited resource.

IIRC it was also amusing because the limit was global (right?) and so you could have a handle leak cause the entire UI to go haywire. This definitely lead to some very interesting bugs for me over the years.

bombcar•8mo ago

> I'm not even 100% certain there's really much of a specific reason why there has to be a low hard limit on file descriptors.

There was. Even if a file handle is 128 bytes or so, on a system with only 10s or 100s of KB you wouldn't want it to get out of control. On multi-user especially, you don't want one process going nuts to open so many files that it eats all available kernel RAM.

Today, not so much though an out-of-control program is still out of control.

mrguyorama•8mo ago

The limit was global, so you could royally screw things up, but it was also a very high limit for the time, 65k GDI handles. In practice, hitting this before running out of hardware resources was unlikely, and basically required leaking the handles or doing something fantastically stupid (as was the style at the time). There was also a per process 10k GDI handle limit that could be modified, and Windows 2000 reduced the global limit to 16k.

It was the Windows 9x days, so of course you could also just royally screw things up by just writing to whatever memory or hardware you felt like, with few limits.

jchw•8mo ago

> It was the Windows 9x days, so of course you could also just royally screw things up by just writing to whatever memory or hardware you felt like, with few limits.

You say that, but when I actually tried I found that despite not actually having robust memory protection, it's not as though it's particularly straightforward. You certainly wouldn't do it by accident... I can't imagine, anyway.

0xbadcafebee•8mo ago

> I'm not even 100% certain there's really much of a specific reason why there has to be a low hard limit on file descriptors

Same reason disks have quotas and containers have cpu & memory limits: to keep one crappy program from doinking the whole system. In general it's seen as poor form to let your server crash just because somebody allowed infinite loops/resource use in their program.

A lot of people's desktops, servers, even networks, crashing is just a program that was allowed to take up too many resources. Limits/quotas help more than they hurt.

saagarjha•8mo ago

As long as you can lift them when it actually makes sense to do so.

kevincox•8mo ago

The reason for this limit, at least on modern systems, is that select() has a fixed limit (usually 1024). So it would cause issues if there was an fd higher than that.

The correct solution is basically 1. On startup every process should set the soft limit to the hard limit, 2. Don't use select ever 3. Before execing any processes set the limit back down (in case the thing you exec uses select)

This silly dance is explained in more detail here: https://0pointer.net/blog/file-descriptor-limits.html

muststopmyths•8mo ago

There is no "max amount of file handles per process" on Windows.

The C runtime has limitations as you indicated. The Win32 API does not.

File,Socket and other handles to NTOSKRNL objects (GDI is its own beast) are not limited by anything but available memory. some of the used memory is non-pageable in the kernel, and there is a limit to the non-pageable memory (1/8 of RAM, I think), so it's not as simple as RAM/(handlecount*storagecost per handle).

dwattttt•8mo ago

I mean, there's only 30 bits available for HANDLEs in the handle table, so you've got a limit there. You'd have to work pretty hard to reach it without running out of resources though.

gjvc•8mo ago

seems like the limits today have not been raised for a very long time.

database64128•8mo ago

This is one of the many things where Go just takes care of automatically. Since Go 1.19, if you import the os package, on startup, the open file soft limit will be raised to the hard limit: https://github.com/golang/go/commit/8427429c592588af8c49522c...

nritchie•8mo ago

Seems like a good idea but I do wonder what the cost is as the overhead of allocate the extra resource space (whatever it is) would be added to every Go application.

database64128•8mo ago

Raising the soft limit to the hard limit is also recommended by the author of systemd: https://0pointer.net/blog/file-descriptor-limits.html

I doubt the kernel would actually allocate the resource space upfront. Like SO_SNDBUF and SO_RCVBUF, it's probably only allocated when it's actually needed.

saagarjha•8mo ago

It's just a limit, you pay the cost when you open the file descriptor

raggi•8mo ago

        use std::io;
        
        #[cfg(unix)]
        fn raise_file_limit() -> io::Result<()> {
            use libc::{getrlimit, setrlimit, rlimit, RLIMIT_NOFILE};
            
            unsafe {
                let mut rlim = rlimit {
                    rlim_cur: 0,
                    rlim_max: 0,
                };
                
                if getrlimit(RLIMIT_NOFILE, &mut rlim) != 0 {
                    return Err(io::Error::last_os_error());
                }
                
                rlim.rlim_cur = rlim.rlim_max;
                
                if setrlimit(RLIMIT_NOFILE, &rlim) != 0 {
                    return Err(io::Error::last_os_error());
                }
            }
            
            Ok(())
        }

NooneAtAll3•8mo ago

...but that didn't solve the bug itself, did it?

what was causing so many open files?

pak9rabid•8mo ago

That's what I was wondering. Seems like a band-aid solution to an underlying problem.

JdeBP•8mo ago

> Another useful command to check for open file descriptors is lsof,

... but the one that comes with the operating system, on the BSDs, is fstat(1).

> 10u: Another file descriptor [...] likely used for additional terminal interactions.

The way that ZLE provides its user interface, and indeed what the Z shell does with the terminal in general, is quite interesting; and almost nothing like what one would expect from old books on the Bourne shell.

> it tries to open more files than the soft limit set by my shell

Your shell can change limits, but it isn't what is originally setting them. That is either the login program or the SSH daemon. On the BSDs, you can read about the configuration file that controls this in login.conf(5).

a_t48•8mo ago

Years ago I had the fun of hunting down a bug at 3am before a game launch. Randomly, we’d save the game and instead get an empty file. This is pretty much the worst thing a game can do (excepting wiping your hard drive, hello Bungie). Turned out some analytics framework was leaking network connections and thus stealing all our file handles. :(

LAC-Tech•8mo ago

what's a good default for modern 64 bit systems? I know there's some kind of table in the linux kernel.

1024 on my workstation. seems low.

jeffbee•8mo ago

1024 is absurdly low for a modern platform. This is just one of those traps that Linux and the other surviving unix-like operating systems leave laying around for their users to fall into. It is possible to exhaust all memory by opening a huge number of files, but there are lots of ways to exhaust all memory, so that's not a good reason. On a modern system, which typically have a minimum of 8GiB main memory and usually much more, you can safely set it to a million or whatever.

The only reason you'd want 1024 as a limit is if you intend to start a process that might have been naively written to use `select` without checking the limits.

MobiusHorizons•8mo ago

Someone elsewhere in the comments pointed out that 1024 is just the glibc limit, the syscall on at least Linux and possibly FreeBSD allows expanding it

LAC-Tech•8mo ago

any good guides on how to set this up? not tutorials on ulimit, there's a man page, but I note 1024 is one of those very square numbers, and wonder if they need to be power of two, or correlated to block size or blah blah.

kiitos•8mo ago

128000 is a reasonable baseline for server-class hosts.

gjvc•8mo ago

[delayed]

L3viathan•8mo ago

Nitpick, but:

> At its core, a file descriptor (often abbreviated as fd) is simply a positive integer

A _non-negative_ integer.

DougN7•8mo ago

Ok, I have to ask: the difference between “positive” and “non-negative” is …?

rkomorn•8mo ago

0 is non-negative but also... non-positive?

bregma•8mo ago

The difference is a classic off-by-one error that can lead to a security exploit.

cesarb•8mo ago

The integer numbers can be divided into "negative" (less than zero), zero, and "positive" (greater than zero). When one says "non-negative", they are excluding only the negative numbers, so zero is included; when one says "positive", zero is not included.

magicalhippo•8mo ago

Positive numbers are greater than zero[1], while non-negative numbers are greater or equal to zero.

[1]: https://en.wikipedia.org/wiki/List_of_types_of_numbers#Signe...

jkol36•8mo ago

Anyone need any programming done?

JackYoustra•8mo ago

htop has a setting where you can show how many fds you have open and how many remaining

eviks•8mo ago

> I couldn’t get the script to catch the exact moment the process reached the soft limits,

That's unfortunate, is there no way to subscribe to open file events instead of polling for status?

NetOpWibby•8mo ago

Sublime Text language parsers will crash if you have a lot of windows and files open for some long period of time.

I really should close all these windows but meh.

userbinator•8mo ago

but the solution is to just bump the soft limit of open file descriptors in my shell

As you can see the max value reached is around 1600, which is way above the previous limit of 256.

Without asking and answering "why the hell does it need to keep so many files open", I don't think that's a good way to do things. Raising the limit is justified only if there is a real reason why the process needs to have that many files open simultaneously.

saagarjha•8mo ago

As others have mentioned, macOS both picks low ulimits but it also has a fun little poorly-documented limit for sandboxed apps which is not queryable in any way. Unfortunately how this manifests is that you try to open a file in a sandboxed app and…it just fails. Somewhere around the few thousand range or so. If you have a sandboxed app, try opening a bunch of files and see how it handles it ;)

The rationale Apple gives is something about using kernel resources, but a normal file descriptor also uses kernel resources, so I'm leaning towards implementation laziness or legacy like many of the examples here.

gajjanag•8mo ago

There is a vast number of sysctl in xnu that have not really been re-examined in over 15 years. Many tunings date back to the spinning rust drive era (for example). There are plenty of examples like this.

Disclaimer: I worked at Apple and poked xnu a bit.

amelius•8mo ago

Why can't we just dynamically allocate file descriptors until memory is physically full?

I hate to say it but it sounds like the developers of Unix were being lazy here.

mbrumlow•8mo ago

Such a big blog post for “Mac OS has crap max file defaults”

We interfaced single-threaded C++ with multi-threaded Rust

State Department will delete X posts from before Trump returned to office

AI Skills Marketplace

Show HN: A fast TUI for managing Azure Key Vault secrets written in Rust

eInk UI Components in CSS

Discuss – Do AI agents deserve all the hype they are getting?

ChatGPT is changing how we ask stupid questions

Zig Package Manager Enhancements

Neutron Scans Reveal Hidden Water in Martian Meteorite

Deepfaking Orson Welles's Mangled Masterpiece

France's homegrown open source online office suite

SpaceX Delays Mars Plans to Focus on Moon

Jeremy Wade's Mighty Rivers

Show HN: MCP App to play backgammon with your LLM

AI Command and Staff–Operational Evidence and Insights from Wargaming

Show HN: CCBot – Control Claude Code from Telegram via tmux

Ask HN: Is the CoCo 3 the best 8 bit computer ever made?

Show HN: Convert your articles into videos in one click

Red Queen's Race

The Anthropic Hive Mind

A Horrible Conclusion

I spent $10k to automate my research at OpenAI with Codex

From Zero to Hero: A Spring Boot Deep Dive

Show HN: Solving NP-Complete Structures via Information Noise Subtraction (P=NP)

Cook New Emojis

Show HN: LoKey Typer – A calm typing practice app with ambient soundscapes

Long-Sought Proof Tames Some of Math's Unruliest Equations

Hacking the last Z80 computer – FOSDEM 2026 [video]

Browser-use for Node.js v0.2.0: TS AI browser automation parity with PY v0.5.11

Michael Pollan Says Humanity Is About to Undergo a Revolutionary Change

We interfaced single-threaded C++ with multi-threaded Rust

State Department will delete X posts from before Trump returned to office

AI Skills Marketplace

Show HN: A fast TUI for managing Azure Key Vault secrets written in Rust

eInk UI Components in CSS

Discuss – Do AI agents deserve all the hype they are getting?

ChatGPT is changing how we ask stupid questions

Zig Package Manager Enhancements

Neutron Scans Reveal Hidden Water in Martian Meteorite

Deepfaking Orson Welles's Mangled Masterpiece

France's homegrown open source online office suite

SpaceX Delays Mars Plans to Focus on Moon

Jeremy Wade's Mighty Rivers

Show HN: MCP App to play backgammon with your LLM

AI Command and Staff–Operational Evidence and Insights from Wargaming

Show HN: CCBot – Control Claude Code from Telegram via tmux

Ask HN: Is the CoCo 3 the best 8 bit computer ever made?

Show HN: Convert your articles into videos in one click

Red Queen's Race

The Anthropic Hive Mind

A Horrible Conclusion

I spent $10k to automate my research at OpenAI with Codex

From Zero to Hero: A Spring Boot Deep Dive

Show HN: Solving NP-Complete Structures via Information Noise Subtraction (P=NP)

Cook New Emojis

Show HN: LoKey Typer – A calm typing practice app with ambient soundscapes

Long-Sought Proof Tames Some of Math's Unruliest Equations

Hacking the last Z80 computer – FOSDEM 2026 [video]

Browser-use for Node.js v0.2.0: TS AI browser automation parity with PY v0.5.11

Michael Pollan Says Humanity Is About to Undergo a Revolutionary Change

Too Many Open Files

Comments