A reverse-delta backup strategy – obvious idea or bad idea?

12•datastack•7mo ago
I recently came up with a backup strategy that seems so simple I assume it must already exist, but I haven't seen it in any mainstream tools.

The idea is:

The latest backup (timestamped) always contains a full copy of the current source state.

Any previous backups are stored as deltas: files that were deleted or modified compared to the next (newer) version.

There are no version numbers, only timestamps. New versions can be inserted naturally.

Each time you back up:

1. Compare the current source with the latest backup.

2. For files that changed or were deleted: move them into a new delta folder (timestamped).

3. For new/changed files: copy them into the latest snapshot folder (only as needed).

4. Optionally rotate old deltas to keep history manageable.
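The four steps above can be sketched in Python. This is a minimal illustration, not a tested tool: the `latest/` and `backup-<timestamp>/` folder names follow the example later in the post, and step 4's rotation is omitted.

```python
import filecmp
import shutil
from datetime import datetime, timezone
from pathlib import Path

def backup(source: Path, dest: Path) -> None:
    """One reverse-delta pass: dest/latest mirrors source; files that were
    overwritten or deleted are moved into a timestamped delta folder."""
    latest = dest / "latest"
    latest.mkdir(parents=True, exist_ok=True)
    delta = dest / datetime.now(timezone.utc).strftime("backup-%Y-%m-%dT%H:%M:%S")

    src_files = {p.relative_to(source) for p in source.rglob("*") if p.is_file()}
    old_files = {p.relative_to(latest) for p in latest.rglob("*") if p.is_file()}

    def move_to_delta(rel: Path) -> None:
        target = delta / rel
        target.parent.mkdir(parents=True, exist_ok=True)  # delta dir created lazily
        shutil.move(latest / rel, target)

    # Step 2: files that were deleted or modified move into the delta folder.
    for rel in old_files - src_files:
        move_to_delta(rel)
    for rel in old_files & src_files:
        if not filecmp.cmp(source / rel, latest / rel, shallow=False):
            move_to_delta(rel)

    # Step 3: copy new/changed files into the latest snapshot (only as needed).
    for rel in src_files:
        target = latest / rel
        if not target.exists():
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source / rel, target)
```

Note that the first run produces no delta folder at all, since nothing is moved out of `latest/`.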

This means:

The latest backup is always a usable full snapshot (fast restore).

Previous versions can be reconstructed by applying reverse deltas.

If the source is intact, the system self-heals: corrupted backups are replaced on the next run.

Only one full copy is needed, like a versioned rsync mirror.

As time goes by, losing old versions is low-impact.

It's user-friendly, since the latest backup can be browsed with a regular file explorer.

Example:

Initial backup:

latest/
  a.txt  # "Hello"
  b.txt  # "World"

Next day, a.txt is changed and b.txt is deleted:

latest/
  a.txt  # "Hi"
backup-2024-06-27T14:00:00/
  a.txt  # "Hello"
  b.txt  # "World"

The newest version is always in latest/, and previous versions can be reconstructed by applying the deltas in reverse.
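A matching restore sketch (a hypothetical `restore` helper, same assumed layout): start from `latest/` and overlay every delta newer than the target time, newest first, so progressively older file versions win. One caveat this surfaces: the scheme records nothing for files that were newly added, so a reconstruction may contain files that did not exist yet at the target time.

```python
import shutil
from pathlib import Path

def restore(dest: Path, out: Path, as_of: str) -> None:
    """Rebuild the state as of `as_of` (ISO timestamp string): copy latest/,
    then overlay each newer delta folder, newest first, so that the last
    overlay applied holds the oldest (i.e. correct) version of each file."""
    shutil.copytree(dest / "latest", out)
    deltas = sorted(
        (d for d in dest.iterdir()
         if d.is_dir() and d.name.startswith("backup-") and d.name[7:] > as_of),
        key=lambda d: d.name,
        reverse=True,
    )
    for delta in deltas:
        for f in delta.rglob("*"):
            if f.is_file():
                target = out / f.relative_to(delta)
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(f, target)
    # Caveat: files created after `as_of` are not recorded anywhere, so they
    # survive into the reconstruction unless tombstone markers are added.
```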

I'm curious: has this been done before under another name? Are there edge cases I’m overlooking that make it impractical in real-world tools?

Would love your thoughts.

Comments

compressedgas•7mo ago
It works. Already implemented: https://rdiff-backup.net/ https://github.com/rdiff-backup/rdiff-backup

There are also other tools which have implemented reverse incremental backup, or backup with reverse deduplication, which store the most recent backup in contiguous form and fragment the older backups.
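For reference, rdiff-backup's classic command-line syntax (1.x/2.0-era; the paths here are placeholders) maps directly onto the scheme in the post: the destination is a browsable mirror of the latest state, and reverse increments live under `rdiff-backup-data/`.

```shell
# Back up: dest becomes a plain mirror, with reverse increments
# stored in dest/rdiff-backup-data/
rdiff-backup /home/user /mnt/backup

# List the available increments (restore points)
rdiff-backup --list-increments /mnt/backup

# Restore the tree as it was 10 days ago
rdiff-backup --restore-as-of 10D /mnt/backup /tmp/restored
```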

datastack•7mo ago
Thank you for bringing this to my attention. Knowing that there is a working product using this approach gives me confidence. I'm working on a simple backup app for my personal/family use, so good to know I'm not heading in the wrong direction
trod1234•7mo ago
These types of projects can easily get sidetracked without an overarching goal. Are you looking to do something specific?

An app (that requires remote infrastructure) seems a bit overkill, and if you're going through the hassle of doing that, you might as well set up the equivalent of what MS used to call the Modern Desktop Experience, which is how many enterprise-level customers have their systems configured now.

The core parts are a cloud-based IdP, storage, and a slipstreamed deployment image which, with network connectivity, pulls down the config and sets the desired state, replicating the workspace down as needed (with OneDrive).

The backup data layout/strategy/BCDR plan can then be automated from the workspace/IdP/cloud-storage backend with no user interaction/learning curve.

If hardware fails, you use the deployment image to enroll new hardware, login and replicate the user related state down, etc. Automation for recurring tasks can be matched up to the device lifecycle phases (Provision, Enrollment, Recovery, Migration, Retirement). This is basically done in a professional setup with EntraID/Autopilot MDM with MSO365 plans. You can easily set up equivalents but you have to write your own glue.

Most of that structure was taken from Linux grey beards ages ago, MS just made a lot of glue and put it in a nice package.

wmf•7mo ago
It seems like ZFS/Btrfs snapshots would do this.
HumanOstrich•7mo ago
No, they work the opposite way using copy-on-write.
wmf•7mo ago
"For files that changed or were deleted: move them into a new delta folder. For new/changed files: copy them into the latest snapshot folder." is just redneck copy-on-write. It's the same result but less efficient under the hood.
datastack•7mo ago
Nice to realize that this boils down to copy-on-write. Makes it easier to explain.
sandreas•7mo ago
Is there a reason NOT to use ZFS or BTRFS?

I mean, the idea sounds cool, but what are you missing? ZFS even works on Windows these days, and with tools like zrepl you can configure time-based snapshotting, auto-sync, and auto-cleanup.
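For comparison, the ZFS workflow is roughly the following (dataset and host names are placeholders; zrepl essentially automates these steps on a schedule):

```shell
# Take a read-only, point-in-time snapshot (near-instant, copy-on-write)
zfs snapshot tank/data@2024-06-27

# List snapshots; each is also browsable under .zfs/snapshot/<name>
zfs list -t snapshot

# Replicate incrementally to another pool or host
zfs send -i tank/data@2024-06-26 tank/data@2024-06-27 | \
    ssh backuphost zfs receive backup/data

# Roll the live dataset back to the most recent snapshot
zfs rollback tank/data@2024-06-27
```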

codingdave•7mo ago
The low-likelihood / high-impact edge case this does not handle is: "Oops, our data center blew up." An extreme scenario, but it turns your most recent backup into a single point of failure, because you cannot restore from the other backups without it.
datastack•7mo ago
This sounds more like a downside of single site backups
codingdave•7mo ago
Totally. Which is exactly what your post outlines. You said it yourself: "Only one full copy is needed." You would need to update your logic to have a 2nd copy pushed offsite at some point if you wanted to resolve this edge case.
ahazred8ta•7mo ago
For reference: a comprehensive backup + security plan for individuals https://nau.github.io/triplesec/
datastack•7mo ago
Great resource in general; I will look into whether it describes how to implement this backup scheme.
dr_kiszonka•7mo ago
It sounds like this method is I/O intensive as you are writing the complete image at every backup time. Theoretically, it could be problematic when dealing with large backups in terms of speed, hardware longevity, and write errors, and I am not sure how you would recover from such errors without also storing the first image. (Or I might be misunderstanding your idea. It is not my area.)
datastack•7mo ago
You can see in steps 2 and 3 that no full copy is written every time. It's only move operations to create the delta, plus copies of new or changed files, so it's quite minimal on I/O.
rawgabbit•7mo ago
What happens if in the process of all this read write rewrite, data is corrupted?
datastack•7mo ago
In this algorithm nothing is rewritten. A diff between source and latest is made, the changed or deleted files are archived to a folder, and the latest folder is updated from the source, like rsync. No more I/O than any other backup tool. Versions other than the latest one are never touched again.
jiggawatts•7mo ago
The more common approach now is incrementals forever with occasional synthetic full backups computed at the storage end. This minimises backup time and data movement.
datastack•7mo ago
I agree it seems more common. However, backup time and data movement should be equivalent if you follow the algo steps.

According to ChatGPT, the forward-delta approach is common because it can be implemented purely append-only, whereas reverse deltas require the last snapshot to be mutable. This doesn't work well for backup tapes.

Do you also think that the forward delta approach is a mere historical artifact?

Then again, perhaps backup tapes are still widely used; I have no idea, I am not in this field. If so, the reverse-delta approach would not work in industrial settings.

jiggawatts•7mo ago
Nobody[1] backs up directly to tape any more. It’s typically SSD to cheap disk with a copy to tape hours later.

This is more-or-less how most cloud backups work. You copy your “premium” SSD to something like a shingled spinning rust (SMR) that behaves almost like tape for writes but like a disk for reads. Then monthly this is compacted and/or archived to tape.

[1] For some values of nobody.

vrighter•7mo ago
I used to work on backup software. Our first version did exactly that. It was a selling point. We later switched approach to a deduplication based one.
datastack•7mo ago
Exciting!

Yes, the deduplicated approach is superior, if you can accept requiring dedicated software to read the data or can rely on a file system that supports it (like Unix with hard links).

I'm looking for a cross-platform solution that is simple and can restore files without any app (in case I didn't maintain my app for the next twenty years).

I'm curious if the software you were working on used a proprietary format, relied on Linux, or used some other method of deduplication.
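As an aside, the hard-link variant mentioned above is commonly done with rsync's `--link-dest` (paths here are placeholders): each dated folder browses like a full snapshot with any file manager, but unchanged files are hard links into the previous snapshot, so they consume no extra space.

```shell
# New snapshot for today; files identical to yesterday's snapshot are
# hard-linked rather than copied, so only changed files use new space.
rsync -a --delete \
    --link-dest=/backups/2024-06-26 \
    /source/ /backups/2024-06-27/
```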

vrighter•7mo ago
The deduplication in the product I worked on was implemented by me and a colleague of mine, in a custom format. The point of it was to do inline deduplication on a best-effort basis, i.e. handling the case where the system does NOT have enough memory to store hashes for every single block. This might have resulted in some duplicated data if you didn't have enough memory, instead of slowing down to a crawl by hitting the disk (spinning rust, at the time) for each block we wanted to deduplicate.
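A toy sketch of that best-effort idea (my own illustration, not the product's code): duplicates are detected only through a bounded in-memory hash index, and once it fills, new blocks are simply stored again rather than checked against an on-disk index.

```python
import hashlib

def dedup_stream(blocks, max_index_entries=1_000_000):
    """Best-effort inline deduplication: keep block digests in a bounded
    in-memory index; when it is full, unique-looking blocks are stored
    even if a duplicate may already exist (trading space for speed)."""
    index = {}    # digest -> id of a stored copy we can reference
    stored = []   # unique-ish block payloads actually written out
    refs = []     # per input block: index into `stored`
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        if digest in index:
            refs.append(index[digest])      # known duplicate: reference only
            continue
        stored.append(block)
        block_id = len(stored) - 1
        if len(index) < max_index_entries:  # remember it while memory allows
            index[digest] = block_id
        refs.append(block_id)
    return stored, refs
```

With a large enough index this deduplicates perfectly; with a small index it degrades gracefully into storing some duplicates instead of stalling on disk lookups.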
tacostakohashi•7mo ago
Sounds a bit like the NetApp .snapshot directory thing (which is no bad thing).
brudgers•7mo ago
In principle, deleting archived data is the opposite of backing up.

It is not clear what problem with existing backup strategies this solves.

I mean you can use a traditional delta backup tool and make one full copy of the current data separately with less chance for errors.

It seems too clever by half and it is not clear to me from the question what problem it solves. Good luck.

SoberPingu•7mo ago
That's how Plesk does its site backups.