Sticking something with a 2-second lifespan on disk to shoehorn it into the AWS serverless paradigm created problems and costs out of thin air here.
Moving at least partially to an in-memory solution is a good fix, though.
Regardless, I enjoyed the article and I appreciate that people are still finding ways to build systems tailored to their workflows.
Without cloud, saving a file is as simple as "with open(...) as f: f.write(data)" + adding a record to the DB. And no weird network issues to debug.
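For illustration, a minimal sketch of that flow, using sqlite3 as a stand-in for "the DB" (the table and file names are made up):

    import sqlite3

    def save_file(path: str, data: bytes, db_path: str = "app.db") -> None:
        # Write the bytes to local disk.
        with open(path, "wb") as f:
            f.write(data)
        # Record the file's location in the database (commits on exit).
        with sqlite3.connect(db_path) as conn:
            conn.execute("CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY)")
            conn.execute("INSERT OR REPLACE INTO files (path) VALUES (?)", (path,))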
Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...
Without on-prem, saving a file is as simple as s3.put_object()!
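Which, with boto3, is roughly this (bucket and key names are placeholders):

    import boto3

    s3 = boto3.client("s3")
    # One call; AWS handles the redundancy, topology, and HVAC.
    s3.put_object(Bucket="my-bucket", Key="uploads/file.bin", Body=b"...data...")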
> Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...
Wow, that's a lot to learn before using S3... I wonder how much that costs in salaries.
> With what network topology?
You don't need to care about this when using SSDs/HDDs.
> With what access policies?
Whichever are defined in your code; no restrictions, unlike with S3. No need to study complicated AWS documentation or navigate multiple consoles (this also costs you in salaries, by the way). No risk of leaking files due to misconfigured cloud services.
> With what backup strategy?
Automatically backed up with the rest of your server data; no need to spend time on this.
You do need to care when you move beyond a single server in a closet that runs your database, webserver and storage.
> No risk of leaking files due to misconfigured cloud services.
One misconfigured .htaccess file, for example, could result in leaking files.
> Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...
Most of these concerns can be addressed with ZFS[0] provided by FreeBSD systems hosted in triple-A data centers.
See also iSCSI[1].

[0] https://www.techradar.com/pro/security/the-south-korean-gove...
Question: How do you save a small fortune in cloud savings?
Answer: First start with a large fortune.
I think you mean a small fraction of 3 engineers. And small fractions aren't that small.
The end of the article has this:
> Consider custom infrastructure when you have both: sufficient scale for meaningful cost savings, and specific constraints that enable a simple solution. The engineering effort to build and maintain your system must be less than the infrastructure costs it eliminates. In our case, specific requirements (ephemeral storage, loss tolerance, S3 fallback) let us build something simple enough that maintenance costs stay low. Without both factors, stick with managed services.
Seems they were well aware of the tradeoffs.
Also, just take an old phone from your drawer full of old phones, slap some free camera app on it, zip tie a car phone mount to the crib, and boom you have a free baby monitor.
We found that implementing proper data durability (3+ replicas, corruption detection, automatic repair) added ~40% overhead to our initial estimates. The engineering time spent building and maintaining custom tooling for multi-region replication, access controls, and monitoring ended up being substantial - about 1.5 FTE over 18 months.
For high-throughput workloads (>500 req/s), we actually saw better cost efficiency with S3 due to their economies of scale on bandwidth. The breakeven point seems to be around 100-200TB of relatively static data with predictable access patterns. Below that, the operational overhead of running your own storage likely exceeds S3's markup.
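Back-of-envelope check on that breakeven claim; the price here is an assumption (roughly S3 Standard's us-east-1 list price), so verify against current AWS pricing:

    s3_per_gb_month = 0.023                    # ~S3 Standard list price; an assumption
    data_gb = 150 * 1000                       # midpoint of the 100-200 TB range
    annual_storage = data_gb * s3_per_gb_month * 12
    print(f"${annual_storage:,.0f}/year")      # ~$41,400/year, storage alone

At that midpoint the S3 storage bill alone is around $41k/year, so the ~1.5 FTE of engineering time mentioned above can easily exceed the S3 markup below that scale.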
The key is to be really honest about your use case. Are you truly at scale? Do you have the engineering resources to build AND maintain this long-term? Sometimes paying the AWS premium is worth it for the operational simplicity.
That said, the article seems to be more about optimizing their pipeline to reduce S3 usage by holding some objects in memory instead. That's very different from trying to build your own object store to replace S3.
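For illustration, a minimal sketch of that pattern as I read it: short-lived objects held in memory, with S3 as the loss-tolerant fallback. The class, TTL, and key names are my own invention, not the article's code:

    import time
    import boto3

    class EphemeralStore:
        def __init__(self, bucket: str, ttl_seconds: float = 2.0):
            self.bucket = bucket
            self.ttl = ttl_seconds
            self.cache = {}  # key -> (expiry_timestamp, bytes)
            self.s3 = boto3.client("s3")

        def put(self, key: str, data: bytes) -> None:
            # Hold the object in memory for its short lifespan; no disk, no PUT fees.
            self.cache[key] = (time.time() + self.ttl, data)

        def get(self, key: str) -> bytes:
            entry = self.cache.get(key)
            if entry is not None:
                expiry, data = entry
                if expiry > time.time():
                    return data
                del self.cache[key]  # expired; drop it
            # Fallback: fetch from S3 if the object was persisted there
            # (raises botocore.exceptions.ClientError if it never was).
            obj = self.s3.get_object(Bucket=self.bucket, Key=key)
            return obj["Body"].read()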
Have you ever thought of using a PostgreSQL DB (also on AWS) to store those files and using CDC to publish messages about those files to a Kafka topic? With your original approach, you need three AWS services: S3, Lambda, and SQS. This way, you need two: PostgreSQL and Kafka. I'm not sure how well this method works though :-)
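A rough sketch of the write path you're describing, assuming a bytea column and a Debezium-style connector watching the table (all names illustrative):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # connection string is illustrative
    with conn, conn.cursor() as cur:
        cur.execute(
            "CREATE TABLE IF NOT EXISTS files (key TEXT PRIMARY KEY, body BYTEA)"
        )
        cur.execute(
            "INSERT INTO files (key, body) VALUES (%s, %s) "
            "ON CONFLICT (key) DO UPDATE SET body = EXCLUDED.body",
            ("some/key", psycopg2.Binary(b"...payload...")),
        )
    # A CDC connector (e.g. Debezium) can then stream each insert from the
    # WAL to a Kafka topic, replacing the S3-event -> Lambda -> SQS chain.

One caveat: large binary payloads in Postgres tend to bloat the WAL and backups, which is part of why object stores exist in the first place.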