Running Out of Disk Space in Production

https://alt-romes.github.io/posts/2026-04-01-running-out-of-disk-space-on-launch.html
73•romes•3d ago

Comments

flanfly•2h ago
A neat trick I was told is to always have ballast files on your systems. Just a few GiB of zeros that you can delete in cases like this. This won't fix the problem, but will buy you time and free space for stuff like lock files so you can get a working system.
jaapz•2h ago
Love the simplicity and pragmatism of this solution
omarqureshi•2h ago
Surely a 50% warning alarm on disk usage covers this without manual intervention?
jcims•2h ago
If the alarms are reliably configured, confirmed to be working, low noise enough to be actioned, etc etc.

And of course there's nothing to say that both of these things can't be done simultaneously.

theshrike79•2h ago
Depends. A Kubernetes container might have only a few megabytes of disk space, because it shouldn't need it.

Except that one time when .NET decides that the incoming POST is over some magic limit and it doesn't do the processing in-memory like before, but instead has to write it to disk, crashing the whole pod. Fun times.

Also my Unraid NAS has two drives in "WARNING! 98% USED" alert state. One has 200GB of free space, the other 330GB. Percentages in integers don't work when the starting number is too big :)
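
The point about integer percentages on large drives can be illustrated with a check that combines a percentage floor with an absolute floor. This is only a sketch: the function names `free_kib` and `low_on_space` and the thresholds in the usage note are assumptions for illustration, not something any commenter posted.

```shell
#!/bin/sh
# free_kib PATH
# Prints the KiB available on the filesystem that holds PATH.
free_kib() {
    # -P forces the POSIX one-line-per-filesystem format, -k reports 1024-byte units.
    df -Pk "$1" | awk 'NR == 2 {print $4}'
}

# low_on_space FREE_KIB TOTAL_KIB MIN_FREE_KIB MIN_FREE_PCT
# Succeeds (exit status 0) only when free space is below BOTH floors:
# the percentage of the total and the absolute number of KiB.
# A huge drive at 98% used but with hundreds of GB free therefore stays quiet,
# while a tiny container volume is still caught by the same rule.
low_on_space() {
    free=$1 total=$2 min_free_kib=$3 min_free_pct=$4
    pct_floor=$(( total * min_free_pct / 100 ))
    [ "$free" -lt "$pct_floor" ] && [ "$free" -lt "$min_free_kib" ]
}
```

With hypothetical floors of 5% and 50 GiB (52428800 KiB), a drive of 10000000000 KiB with 200000000 KiB free is not reported, whereas a 1 GiB volume with 20000 KiB free is.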

dspillett•2h ago
If the alarm works. And it is actioned, not just snoozed too much or dismissed entirely.

Defence in depth is a good idea: proper alarms, and a secondary measure in case they don't have the intended effect.

pixl97•1h ago
Alarms are great, but when something goes wrong SSDs can fill up amazingly fast!
n4r9•1h ago
Surely there are pitfalls either way. A ballast file can be deleted too readily, or someone could forget to re-add it.
fifilura•2h ago
I did this too, but I also zipped the file; turns out it had a great packing ratio!
saagarjha•2h ago
Personally I just keep the file on a ramdisk so you can avoid having to fetch it from slow storage
3form•40m ago
Neat! I optimized for my own case, and I'm storing my ramdisk on SSD to gain persistence.
ninalanyon•2h ago
This is why I never empty the Rubbish Bin/Trash Can on my Linux laptop until the disk fills.
testplzignore•1h ago
Would another way be to drop the reserved space (typically 1% to 5% on an ext file system)?
bombcar•1h ago
Reserved space doesn't protect you against root, who is often the user to blame for the last used MB.
dspillett•1h ago
Similarly, I always leave some space unallocated on LVM volume groups. It means that I can temporarily expand a volume easily if needed.

It also leaves some space unused to help the wear-levelling on the SSDs underlying the RAID array that is the PV¹ for LVM. I'm not 100% sure this is needed any more², but I've not looked into it sufficiently, so until I do I'll keep the habit.

--------

[1] If there are multiple PVs, from different drives/arrays, in the VG, then you might need to manually skip a bit on each one because LVM will naturally fill one before using the next. Just allocate a small LV specially on each and don't use it. You can remove one or all of them and add the extents to the full LV if/when needed. Giving it a useful name also reminds you why that bit of space is carved out.

[2] Drives under-allocate by default, IIRC.

Chaosvex•1h ago
Similar to the old game development trick of hiding some memory away and then freeing it up near the end of development when the budget starts getting tight.
bombcar•1h ago
Some filesystems can be unable to delete a file if full. Something to be a bit worried about.
throw0101d•59m ago
> A neat trick I was told is to always have ballast files on your systems.

ZFS has a "reservation" mechanism that's handy:

> The minimum amount of space guaranteed to a dataset, not including its descendants. When the amount of space used is below this value, the dataset is treated as if it were taking up the amount of space specified by refreservation. The refreservation reservation is accounted for in the parent datasets' space used, and counts against the parent datasets' quotas and reservations.

* https://openzfs.github.io/openzfs-docs/man/master/7/zfsprops...

Quotas prevent users/groups/directories (ZFS datasets) from using too much space, but reservations ensure that particular areas always have a minimum amount set aside for them.

dijit•48m ago
I always called it a “bit-mass”. Like a thermal mass used in freezers in places where the power is not very stable.

I knew I didn’t invent the concept, as there are so many systems that cannot recover if the disk is totally full (a write may be required in many systems in order to execute an instruction to remove things gracefully).

The latest thing I found with this issue is Unreal Engine's Horde build system. It's so tightly coupled with caches, object files and database references that a manual clean-up is extremely difficult and likely to create an unstable system. But you can configure it to keep fewer build artefacts around, and then it will clear itself out gracefully, but it needs to be able to write to the disk to do it.

Now that I think about it, I don’t do this for inodes, but you can run out of those too and end up in a weird “out of disk” situation despite having lots of usable capacity left.

layer8•47m ago
Better fill those files with random bytes, to ensure the filesystem doesn’t apply some “I don’t actually have to store all-zero blocks” sparse-file optimization. To my knowledge no non-compressing file system currently does this, but who knows about the future.
ape4•33m ago
If I recall correctly:

    dd if=/dev/urandom of=/home/myrandomfile bs=1 count=N
freedomben•11m ago
Yep, btrfs will happily do this to you. I verified it the hard way
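
The ballast idea from this sub-thread, combined with the advice to use random bytes, could be sketched roughly as follows. The function names `make_ballast` and `allocated_kib` and the path in the usage note are hypothetical. A 1 MiB block size is used instead of `bs=1` purely to keep the write fast; that is an editorial assumption, not what was posted above.

```shell
#!/bin/sh
# make_ballast PATH SIZE_MIB
# Writes SIZE_MIB mebibytes of random data to PATH, then flushes it to disk.
# Random data is used so that a compressing or sparse-aware filesystem
# cannot store the file in less space than its nominal size.
make_ballast() {
    ballast_path=$1
    size_mib=$2
    # bs is given in plain bytes because GNU and BSD dd spell the "1M" suffix differently.
    dd if=/dev/urandom of="$ballast_path" bs=1048576 count="$size_mib" 2>/dev/null || return 1
    sync
}

# allocated_kib PATH
# Prints the space actually allocated for PATH in KiB, as reported by du.
# Comparing this with the requested size shows whether the filesystem
# really set the space aside.
allocated_kib() {
    du -k "$1" | awk '{print $1}'
}
```

For example, `make_ballast /var/ballast.bin 4096` would reserve about 4 GiB, and `allocated_kib /var/ballast.bin` should then report roughly 4194304. Deleting the file in an emergency returns that space.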
entropie•1h ago
> I rushed to run du -sh on everything I could, as that’s as good as I could manage.

I recently came across gdu [1] and have installed/used it on every machine since then.

[1]: https://github.com/dundee/gdu

Neil44•39m ago
I also discovered gdu recently. It's really good. It saves me running du -h --max-depth=1 | sort -h a million times trying to find where the space has gone while you're stressing about production being down.
illusive4080•31m ago
Have you used ncdu? I wonder how this compares.
NitpickLawyer•21m ago
I use dust for this, but gdu looks nice, I'll give it a try. Thanks for sharing.
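
On a machine where none of gdu, ncdu or dust is installed, the repeated `du | sort` step described above can be wrapped in a small helper. This is a minimal sketch that sticks to POSIX `du` options; the name `largest_entries` is made up for illustration.

```shell
#!/bin/sh
# largest_entries DIR [COUNT]
# Prints the COUNT largest direct children of DIR (default 10), smallest first,
# as "<size in KiB><TAB><path>" lines.
largest_entries() {
    dir=$1
    count=${2:-10}
    # -s summarises each argument, -k reports KiB, -x stays on one filesystem.
    # The second glob picks up hidden entries; errors from unmatched globs
    # or unreadable directories are discarded.
    du -skx "$dir"/* "$dir"/.[!.]* 2>/dev/null | sort -n | tail -n "$count"
}
```

Running `largest_entries /var 5` and then repeating it on the last path printed walks down towards whatever is eating the space.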
huijzer•46m ago
> Plausible Analytics, with a 8.5GB (clickhouse) database

And this is why I tried Plausible once and never looked back.

To get basic but effective analytics, use GoAccess and point it at the Caddy or Nginx logs. It’s written in C and thus barely uses memory. With a few hundred visits per day, the logs are currently 10 MB per day. Caddy will automatically truncate if logs go above 100 MB.

bdcravens•44m ago
I appreciate the last line

> Note: this was written fully by me, human.

brunoborges•23m ago
I remember a story of an Oracle Database customer who had production broken for days until an Oracle support escalation led to identifying the problem as mere "No disk space left".
dirkt•9m ago
If you run nginx anyway, why not serve static files from nginx? No need for temporary files, no extra disk space.

The authorization can probably be done somehow in nginx as well.
