frontpage.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
623•klaussilveira•12h ago•182 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
925•xnx•18h ago•548 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
32•helloplanets•4d ago•24 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
109•matheusalmeida•1d ago•27 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
9•kaonwarb•3d ago•7 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
40•videotopia•4d ago•1 comment

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
219•isitcontent•13h ago•25 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
210•dmpetrov•13h ago•103 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
321•vecti•15h ago•143 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
369•ostacke•18h ago•94 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
358•aktau•19h ago•181 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
477•todsacerdoti•20h ago•232 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
272•eljojo•15h ago•160 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
402•lstoll•19h ago•271 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
85•quibono•4d ago•20 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
14•jesperordrup•2h ago•6 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
25•romes•4d ago•3 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
56•kmm•5d ago•3 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
12•bikenaga•3d ago•2 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
243•i5heu•15h ago•188 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
52•gfortaine•10h ago•21 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
140•vmatsiiako•17h ago•62 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
280•surprisetalk•3d ago•37 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1058•cdrnsf•22h ago•433 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
132•SerCe•8h ago•117 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
70•phreda4•12h ago•14 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
28•gmays•7h ago•10 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
176•limoce•3d ago•96 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
63•rescrv•20h ago•22 comments

WebView performance significantly slower than PWA

https://issues.chromium.org/issues/40817676
32•denysonique•9h ago•6 comments

Ptar: Replacing .tgz for petabyte-scale S3 archives

https://plakar.io/posts/2025-06-30/technical-deep-dive-into-.ptar-replacing-.tgz-for-petabyte-scale-s3-archives/
57•vcoisne•7mo ago

Comments

nemothekid•7mo ago
>By contrast, S3 buckets are rarely backed up (a rather short-sighted approach for mission-critical cloud data), and even one-off archives are rarely done.

This is a complete aside, but how often are people backing up data to something other than S3? What I mean is: if some piece of data is on S3, do people have a contingency for "S3 failing"?

S3 is so durable in my mind now that I really only imagine having an "S3 backup" if (1) I had an existing system (e.g. tapes), or (2) I need multi-cloud redundancy. Other than that, once something is in S3, I'm confident it's safe.

Obviously this trust was built over years (decades?) of reliability, and if your DRP requires alternatives, you should pursue them, but is anyone realistically paranoid about S3?

SteveNuts•7mo ago
Yes, I am paranoid about S3. Not only could a once-in-a-lifetime event happen, but an attacker could get in and delete all my data. Data could be accidentally deleted. Corrupted data could be written...
burnt-resistor•7mo ago
Then 3 steps.

1. Use tarsnap so there's an encryption and a management layer.

2. Use a second service so there's redundancy and no SPoF.

3. Keep cryptographic signatures (not hashes) of each backup job in something like a WORM blockchain KVS.
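
A minimal sketch of the signing part of step 3, assuming a pre-generated keypair and hypothetical file names (plain openssl here; the WORM store is simply wherever you then publish the .sig):

  $ openssl dgst -sha256 -sign backup-signing-key.pem \
      -out job-2025-07-08.sig backup-2025-07-08.tar
  $ openssl dgst -sha256 -verify backup-signing-pub.pem \
      -signature job-2025-07-08.sig backup-2025-07-08.tar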

nemothekid•7mo ago
>Data could be accidentally deleted. Corrupted data could be written...

You guys should really have versioning enabled. Someone could still delete your data and all the versions, but that would take real effort and would likely be malicious.
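
For reference, turning versioning on is a one-liner with the AWS CLI; the bucket name here is hypothetical:

  $ aws s3api put-bucket-versioning --bucket my-bucket \
      --versioning-configuration Status=Enabled
  $ aws s3api get-bucket-versioning --bucket my-bucket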

imglorp•7mo ago
Nobody mentioned the case where you get locked out of your cloud provider with no humans to speak to, or your account gets deleted by the algorithm. Both happen routinely and we only hear about it when the victim takes to the socials.
mrflop•7mo ago
That’s basically one of the reasons that led us to build Plakar.
tecleandor•7mo ago
But don't make the same mistake people make with RAID. "More durable" doesn't mean "backup".

What if somebody deletes the file? What if it gets corrupted by a problem in one of your processes? What if your API key falls into the wrong hands?

nemothekid•7mo ago
Yes - backups also protect against someone doing a `rm -rf /*` by accident. However, I don't think I've created an S3 bucket without versioning enabled for years. If someone deletes the file, or the file gets corrupted - I just restore a previous version.

I don't want to suggest that people should place all their eggs in one basket - it's obviously irresponsible. However, S3 (and versioning) has been the "final storage" for years now. I can only imagine a catastrophic situation like an entire S3 region blowing up. And I'm sure a disgruntled employee could do a lot of damage as well.
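
For concreteness, the "restore a previous version" step mentioned above looks roughly like this with the AWS CLI, assuming a versioned bucket and hypothetical object names; copying an older version over the latest (or deleting the delete marker) brings the object back:

  $ aws s3api list-object-versions --bucket my-bucket --prefix path/to/file
  $ aws s3api copy-object --bucket my-bucket --key path/to/file \
      --copy-source "my-bucket/path/to/file?versionId=PREVIOUS_VERSION_ID"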

joshka•7mo ago
Backups don't just protect you from durability issues. They protect you from accidental deletion, malware, and even just snapshots of what something looked like at a particular time, etc.

The context that this article suggests is that if your S3 bucket is your primary storage, then it's possible that you're not thinking about where the second copy of your data should belong.

nemothekid•7mo ago
>They protect you from accidental deletion, malware, and even just snapshots of what something looked like at a particular time, etc.

S3 with versioning enabled provides this. I'm not being naive when I say S3 really provides everything you might need. It's my observation over the last 13 years, dealing with tons of fires, that there has never been a situation where I couldn't retrieve something from S3.

Legally you might need an alternative. Going multi-cloud doesn't hurt - after all I do it. But practically? I don't think I would lose sleep if someone told me they only back up to S3.

icedchai•7mo ago
What if someone deletes a bucket? Then all your versioning is gone...
charcircuit•7mo ago
It doesn't let you.
icedchai•7mo ago
It can be done if you delete the versions. You’ll need to use the aws cli.
fpoling•7mo ago
It cannot be done if S3 objects use the object lock in compliance mode. Such objects cannot be altered in any way and the bucket cannot be deleted until the lock expires.
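
For reference, compliance-mode object lock is configured roughly like this with the AWS CLI (hypothetical bucket name; object lock has to be enabled when the bucket is created, and a 730-day default retention cannot then be shortened by anyone, including account admins):

  $ aws s3api create-bucket --bucket my-locked-bucket --object-lock-enabled-for-bucket
  $ aws s3api put-object-lock-configuration --bucket my-locked-bucket \
      --object-lock-configuration \
      '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":730}}}'
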
icedchai•7mo ago
Good to know! I’ve never used that feature.
fpoling•7mo ago
Note that with such a lock, mistakes can be costly. If you put several terabytes into S3 by mistake and set the compliance lock duration to 2 years, you will have to pay for that storage for 2 years.
icedchai•7mo ago
So not even Amazon can fix this? What if my company goes bankrupt with several TB locked up?
fpoling•7mo ago
If you close the account with Amazon, then yes, the data can be deleted. But typically, based on the contract, this will require notifying Amazon, will be extremely visible, and can be reverted.

If the company does not pay, then the company breaches its contract and Amazon can delete the data. But typically there would be a warning period.

tuckerman•7mo ago
Insider risk is a potential reason. If someone acquires root in your AWS account, having a backup might give you options for dealing with blackmail or even malicious deletion after it happens.
deathanatos•7mo ago
If someone acquires root in the AWS account, they likely then have access to the backups, too. Unless we're also assuming whatever is doing the backup runs in an alternate cloud and our attacker or insider somehow has access to only 1 of 2 clouds.

Possible, perhaps, but contrived.

coredog64•7mo ago
There's account root and then there's org root. Accounts are security boundaries, meaning you'd want your backups to at least be in another account within the org.
tuckerman•7mo ago
I think using a separate cloud with credentials stored in a safe (or the equivalent) isn’t that uncommon (worked places where we were nearly 100% AWS but had GCP for storing backups). You’d need to compromise/socially engineer a different set of people to get access to that.
Brian_K_White•7mo ago
And then Amazon kills your account. It doesn't matter how great their hardware and software is.
fpoling•7mo ago
There is a contractual obligation on Amazon's side. If they kill the account in violation of the contract, the court will force them to pay heavy damages.

Now, one can argue that courts take time and money and a company may not be able to afford such a risk even if it is theoretical. In that case, if the data is that important, it is stupid to keep it at AWS.

But then just write the data to tapes and store them in a bank vault or whatever.

treve•7mo ago
We can get everything back except data. It feels silly to take the risk of _not_ backing it up if you're somewhat established.
firesteelrain•7mo ago
My HOA uses a SmartNAS in addition to S3. And we aren’t a huge operation.
zzo38computer•7mo ago
I prefer to store backups on "write once read many" media, such as DVDs. However, having multiple backups would be helpful.
kjellsbells•7mo ago
Perhaps reframe the problem: not data loss because S3's technical infrastructure failed, but loss through one of the many other ways data can get zapped, or situations where you suddenly need a separate copy. For example:

- Employee goes rogue and nukes buckets.

- Code fault quietly deletes data, or doesn't store it like you thought.

- State entity demands access to data, and you'd rather give them a tape than your S3 keys.

I agree that with eleven nines or whatever it is of durability, a write to S3 is not going to disappoint you, but most data losses are more about policy and personnel than infrastructure failures.

coredog64•7mo ago
A fun one I've seen before: Your encrypted content reused a KMS key that was provisioned by a temporary CloudFormation stack and got torn down months ago.
foota•7mo ago
Accidental crypto shredding? Oof.
toomuchtodo•7mo ago
This is solved for by using versioning with MFA delete against deletion or corruption risk, and S3 export if you're required to provide a copy. Data can also be replicated to a write-only bucket in another account, with only the ability to replicate.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/MultiF...

https://docs.aws.amazon.com/AmazonS3/latest/userguide/object...
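
A sketch of the MFA-delete part with the AWS CLI, using hypothetical bucket and device values; once enabled, deleting object versions or suspending versioning requires the root user's MFA token:

  $ aws s3api put-bucket-versioning --bucket my-bucket \
      --versioning-configuration Status=Enabled,MFADelete=Enabled \
      --mfa "arn:aws:iam::111122223333:mfa/root-mfa-device 123456"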

xyzzy123•7mo ago
Yep, for many applications, versioning is the lightweight solve.

But.. AWS Backup is still nice, if a bit heavy. I like common workflows to restore all stuff (ddbs, managed dbs, buckets etc) to a common point in time. Also, one of the under-appreciated causes of massive data loss is subtly incorrect lifecycle policies. Backup can save you here even when other techniques may not.
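
The lifecycle foot-gun is easy to illustrate: a rule meant for, say, temporary uploads that ships with an empty prefix quietly expires the entire bucket after 30 days. A sketch with hypothetical names:

  $ cat lifecycle.json
  {
    "Rules": [
      {
        "ID": "expire-tmp-uploads",
        "Status": "Enabled",
        "Filter": { "Prefix": "" },
        "Expiration": { "Days": 30 }
      }
    ]
  }
  $ aws s3api put-bucket-lifecycle-configuration --bucket my-bucket \
      --lifecycle-configuration file://lifecycle.json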

mrflop•7mo ago
AWS Backup can get really pricey since you pay GB-month for every single restore point. Plakar only charges once for the initial backup and then for the small deltas on whatever cheap storage you pick.

Also, AWS Backup locks your snapshots into AWS vaults, whereas Plakar lets you push and pull backups to any backend—local disk, S3, another cloud, on-prem, etc.

xyzzy123•7mo ago
AWS Backup is a bit more nuanced than that; ideally the thing you want is N days of PITR (point in time recovery) and you want that across all your data stores (RDS dbs, buckets, dynamodb tables, etc etc), and you want to be able to restore them all to a common point in time. 7 or 30 days of PITR are common choices. It is ideal if you can perform a data restore in 1 operation since your hair may be on fire when you need to use it. In practice almost all your recovery will be from this.

The storage needed for this depends on the data change rate in your application; more or less, it works like a WAL in a DB. What is annoying is that you can't really control it (for obvious reasons), and less forgivably, AWS Backup is super opaque about how much is actually being used by what.

Retention of dailies / weeklies / monthlies is a different (usually compliance) concern (NOT operational, not really, if you have to restore from a monthly your business is probably already done for) and in an enterprise context you are generally prevented from using deltas for these due to enterprise policy or regulation (yeah I know it sounds crazy, reqs are getting really specific these days).

People on AWS don't generally care that they're locked in to AWS services (else.. they wouldn't be on AWS), and while cost is often a factor it is usually not the primary concern (else.. they would not be on AWS). What often IS a primary concern is knowing that their backup solution is covered under the enterprise tier AWS support they are already paying an absolute buttload for.

Also stuff like Vault lock "compliance mode" & "automated restore testing" are helpful in box-ticking scenarios.

Plakar looks awesome but I'm not sure AWS Backup customers are the right market to go for.

fpoling•7mo ago
S3 provides an object lock in compliance mode, where nobody at the organization, including its admins, can delete objects during the specified period.
mrflop•7mo ago
S3 buckets can just vanish for lots of reasons. With AWS’s shared-responsibility model, you’re the one who has to back up and protect your data, not AWS.
fpoling•7mo ago
The hold in compliance mode with AWS is an accepted way to persist data that a company is legally obliged to retain under US requirements.

And if your company has a sales contract with AWS, the buckets cannot just vanish, nor can AWS close the account at an arbitrary moment.

FooBarWidget•7mo ago
Or: AWS closes your account with a vague reason ("you violated our terms, we won't tell you which one") with no way to appeal.
hxtk•7mo ago
I’ve worked on a project with strict legal record-keeping requirements that had a plan for the primary AWS region literally getting nuked. But that was the only contingency in our book of plans that really required the S3 backup. We generally assumed that as long as the region still existed, S3 still had everything we put in it.

Of course, since we had the backups, restoration of individual objects would’ve been possible, but we would’ve needed to do it by hand.

Spooky23•7mo ago
AWS is an incredible company and S3 a best-in-class service. Blindly trust my business to their SLA? To everything with write access to data? Hell, no.
jamesfinlayson•7mo ago
I worked at a place that uses AWS Backup - which I assume under the hood uses S3.

The backups themselves were off-limits to regular employees though - only the team that managed AWS could edit or delete the backups.

winrid•7mo ago
If you zoom in on your site before the cookie banner pops up, you are left with just "Hi, we're cookies!" stuck on the screen and can't zoom back out.
msgodel•7mo ago
You don't even need a banner like this unless you have third-party cookies, which there is no good reason for.
chungy•7mo ago
Another similar archive format is WIM, the thing created by Microsoft for the Windows Vista (and newer) installer; an open source implementation is at: https://wimlib.net/

It offers similar deduplication, indexing, per-file compression, and versioning advantages.

mrflop•7mo ago
But it works only for Windows, right?
chungy•7mo ago
No, it works on many OSes. That's the point of linking to wimlib :)

It even supports Unix metadata!
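
If you want to poke at it, wimlib ships the wimlib-imagex tool; a rough sketch with hypothetical paths and image names (capture a directory into an image, append a second versioned image later, apply one back out):

  $ wimlib-imagex capture /srv/data backup.wim "data-v1"
  $ wimlib-imagex append  /srv/data backup.wim "data-v2"
  $ wimlib-imagex apply   backup.wim "data-v1" /mnt/restore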

gcr•7mo ago
How does this differ from zpaq and dwarFS?

Zpaq is quite mature and also handles deduplication, versioning, etc.
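
For comparison, zpaq's journaling archives work roughly like this (hypothetical paths); each add appends a deduplicated version, and -until lets you browse older ones:

  $ zpaq add backup.zpaq /srv/data
  $ zpaq add backup.zpaq /srv/data        # later run only appends changed blocks
  $ zpaq list backup.zpaq -until 1        # view the archive as of the first version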

jauntywundrkind•7mo ago
Or pax. https://linux.die.net/man/1/pax

Or eStargz. https://github.com/containerd/stargz-snapshotter

Or Nydus RAFS. https://github.com/dragonflyoss/nydus

Links for the zpaq and dwarFS you mentioned: https://www.mattmahoney.net/dc/zpaq.html https://github.com/mhx/dwarfs

tux1968•7mo ago
They mention in the article that some people don't want to install the full Plakar backup software just to read and write ptar archives; so a dedicated open-source tool is offered for download as of yesterday:

https://plakar.io/posts/2025-07-07/kapsul-a-tool-to-create-a...

throwaway127482•7mo ago
Direct link to GitHub: https://github.com/PlakarKorp/kapsul
ac29•7mo ago
Are people really using gzip in 2025 for new projects?

Zstd has been widely available for a long time. Debian, which is pretty conservative with new software, has shipped zstd since at least stretch (released 2017).

kazinator•7mo ago
I integrated gzip into TXR Lisp in 2022. I evaluated all the choices and went with that one because of:

- tiny code size;
- widely used standard;
- fast compression and decompression.

And it also beat Zstandard on compressing TXR Lisp .tlo files by a non-negligible margin. I can reproduce that today:

  $ zstd -o compiler.tlo.zstd stdlib/compiler.tlo
  stdlib/compiler.tlo  : 25.60%   (250146 =>  64037 bytes, compiler.tlo.zstd)
  $ gzip -c > compiler.tlo.gzip stdlib/compiler.tlo
  $ ls -l compiler.tlo.*
  -rw-rw-r-- 1 kaz kaz 60455 Jul  8 21:17 compiler.tlo.gzip
  -rw-rw-r-- 1 kaz kaz 64037 Jul  8 17:43 compiler.tlo.zstd

The .gzip file is 0.944 times the size of the .zstd file.

So for this use case, gzip is faster (only zstd's decompression is fast), compresses better, and has a way smaller code footprint.

jonas21•7mo ago
zstd uses a fairly low compression level by default. If you run with `zstd -19 -o compiler.tlo.zstd stdlib/compiler.tlo` you will probably get much better compression than gzip, even at its highest setting.

That said, the tiny code footprint of gzip can be a real benefit. And you can usually count on gzip being available as a system library on whatever platform you're targeting, while that's often not the case for zstd (on iOS, for example).

kazinator•7mo ago
Additional datapoints:

The Zopfli gzip-compatible compressor gets the file down to 54343. But zstd with level -19 beats that:

  -rw-rw-r-- 1 kaz kaz 54373 Jul  8 22:59 compiler.tlo.zopfli
  -rw-rw-r-- 1 kaz kaz 50102 Jul  8 17:43 compiler.tlo.zstd.19
I have no idea which is more CPU/memory intensive.

For applications in which compression speed is not important (data is being prepared once to be decompressed many times), if you want the best compression while sticking with gzip, Zopfli is the ticket.
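
Usage-wise Zopfli is a drop-in; a quick sketch (the iteration count is illustrative). It produces a standard .gz that any gzip can decompress, it just spends far longer compressing:

  $ zopfli --i1000 stdlib/compiler.tlo    # writes stdlib/compiler.tlo.gz, keeps the original
  $ gunzip -t stdlib/compiler.tlo.gz      # verifies with plain gzip tooling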

attentive•7mo ago
Try lzip. It's about 10 times faster than zopfli though it's not gzip compatible. And it beats zstd -19 on compression.
Quekid5•7mo ago
I believe the default compression setting for the zstd command is biased towards speed -- maybe try -9, -13 or even -22 (max, which should probably be fine for such a small file).

Not that it matters when the file is so small in the first place... I'm just saying you should be sure what you're 'benchmarking'.

Scaevolus•7mo ago
Having the entire backup as a single file is interesting, but does it matter?

Restic has a similar featureset (deduplicated encrypted backups), but almost certainly has better incremental performance for complex use cases like storing X daily backups, Y weekly backups, etc. At the same time, it struggles with RAM usage when handling even 1TB of data, and presumably ptar has better scaling at that size.
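
For the retention use case mentioned above, restic's workflow looks roughly like this (hypothetical repo and paths); forget --prune is what implements the X-dailies / Y-weeklies policy:

  $ restic -r s3:s3.amazonaws.com/my-backup-bucket init
  $ restic -r s3:s3.amazonaws.com/my-backup-bucket backup /srv/data
  $ restic -r s3:s3.amazonaws.com/my-backup-bucket forget \
      --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune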

mkroman•7mo ago
> At the same time, it struggles with RAM usage when handling even 1TB of data, and presumably ptar has better scaling at that size.

There's also rustic, which supposedly is optimized for memory: https://rustic.cli.rs/docs/

throwaway127482•7mo ago
Does this support content-defined chunking (CDC)?
mrflop•7mo ago
Yes, both ptar and plakar. If you want to read more about the internals: https://www.plakar.io/posts/2025-04-29/kloset-the-immutable-...
ahofmann•7mo ago
I'm trying to evaluate what plakar is. Is it like restic, Borgbackup, Kopia?
mrflop•7mo ago
Yes, Plakar works much like Restic and Kopia: it takes content-addressed, encrypted and deduplicated snapshots and offers efficient incremental backups via a simple CLI. Under the hood, its Kloset engine splits data into encrypted, compressed chunks. Plakar's main strengths:

- UI: In addition to a simple Unix-style CLI, Plakar provides a web interface and API for monitoring and browsing snapshots.

- Data-agnostic snapshots: Plakar’s Kloset engine captures any structured data—filesystems, databases, applications—not just files, by organizing them into self-describing snapshots.

- Source/target decoupling: You can back up from one system (e.g. a local filesystem) and restore to another (e.g. an S3 bucket) using pluggable source and target connectors.

- Universal storage backends: Storage connectors let you persist encrypted, compressed chunks to local filesystems, SFTP servers or S3-compatible object stores (and more)—all via a unified interface.

- Extreme scale with low RAM: A virtual filesystem with lazy loading and backpressure-aware parallelism keeps memory use minimal, even on very large datasets.

- Network- and egress-optimized: Advanced client-side deduplication and compression dramatically cut storage and network transfer costs—ideal for inter-cloud or cross-provider migrations.

- Online maintenance: You don't need to stop your backups to free some space.

ptar...

ahofmann•7mo ago
Thank you!