The idea is:
The latest backup (timestamped) always contains a full copy of the current source state.
Any previous backups are stored as deltas: files that were deleted or modified compared to the next (newer) version.
There are no version numbers — just timestamps. New versions can be inserted naturally.
Each time you back up:
1. Compare the current source with the latest backup.
2. For files that changed or were deleted: move the old copies out of the latest snapshot into a new delta folder (timestamped).
3. For new/changed files: copy them from the source into the latest snapshot folder (only as needed).
4. Optionally rotate old deltas to keep history manageable.
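A minimal sketch of steps 1 to 3 in Python, to make the flow concrete. The function names (`relative_files`, `run_backup`), the `backup-<timestamp>` folder naming, and the byte-for-byte comparison are assumptions for illustration; a real tool might compare sizes, mtimes, or hashes instead, and rotation (step 4) is left out.

```python
import filecmp
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional, Set


def relative_files(root: Path) -> Set[Path]:
    """Return every regular file under root as a path relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def run_backup(source: Path, backup_root: Path, stamp: Optional[str] = None) -> Optional[Path]:
    """Sync latest/ with source; superseded files go into a new delta folder.

    Returns the delta folder, or None if nothing was changed or deleted.
    """
    latest = backup_root / "latest"
    latest.mkdir(parents=True, exist_ok=True)
    # Same timestamp style as the example in the post (colons are not valid
    # in Windows file names, so a real tool may want another separator).
    stamp = stamp or datetime.now().strftime("%Y-%m-%dT%H:%M:%S")
    delta = backup_root / f"backup-{stamp}"

    src_files = relative_files(source)
    old_files = relative_files(latest)
    moved = False

    # Step 2: old copies of changed or deleted files move into the delta folder.
    for rel in old_files:
        deleted = rel not in src_files
        changed = not deleted and not filecmp.cmp(source / rel, latest / rel, shallow=False)
        if deleted or changed:
            target = delta / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(latest / rel), str(target))
            moved = True

    # Step 3: new or changed files are now missing from latest/, so copy them in.
    for rel in src_files:
        dest = latest / rel
        if not dest.exists():
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source / rel, dest)

    return delta if moved else None
```

Calling `run_backup(Path("~/docs").expanduser(), Path("/mnt/backup"))` (both paths hypothetical) once a day would leave `latest/` as a browsable mirror and one `backup-…` folder per run that actually changed something.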
This means:
The latest backup is always a usable full snapshot (fast restore).
Previous versions can be reconstructed by applying reverse deltas.
If the source is intact, the system self-heals: corrupted backups are replaced on the next run.
Only one full copy is needed, like a versioned rsync mirror.
As time goes by, losing old versions is low-impact.
It's user friendly since the latest backup can be browsed through with regular file explorers.
Example:
Initial backup:
latest/
  a.txt   # "Hello"
  b.txt   # "World"
Next day, a.txt is changed and b.txt is deleted:
latest/
  a.txt   # "Hi"
backup-2024-06-27T14:00:00/
  a.txt   # "Hello"
  b.txt   # "World"
The newest version is always in latest/, and previous versions can be reconstructed by applying the deltas in reverse.
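A rough sketch of what "applying the deltas in reverse" could look like, assuming delta folders are named `backup-<ISO timestamp>` so that they sort chronologically by name. `restore_version` is a hypothetical helper, not part of any existing tool. It copies `latest/` and then overlays each delta from newest down to the requested one, so the oldest applied delta wins. One caveat with the layout as described: deltas only hold files that were changed or deleted, so a file that was added after the requested point would still appear in the restored tree unless additions are recorded somewhere (for example in a manifest).

```python
import shutil
from pathlib import Path


def restore_version(backup_root: Path, target_stamp: str, out_dir: Path) -> None:
    """Rebuild the state that existed just before delta `backup-<target_stamp>` was written.

    out_dir must not exist yet; it is created as a copy of latest/.
    """
    shutil.copytree(backup_root / "latest", out_dir)

    # Newest delta first; ISO-style timestamps sort correctly as plain strings.
    deltas = sorted(
        (d for d in backup_root.iterdir() if d.is_dir() and d.name.startswith("backup-")),
        key=lambda d: d.name,
        reverse=True,
    )

    target_name = f"backup-{target_stamp}"
    for delta in deltas:
        if delta.name < target_name:
            break  # older than the requested version, not needed
        # Overlay the older copies; files in the delta overwrite newer ones.
        shutil.copytree(delta, out_dir, dirs_exist_ok=True)
```

For the example above, `restore_version(root, "2024-06-27T14:00:00", Path("restored"))` would produce a folder with `a.txt` containing "Hello" and `b.txt` containing "World".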
I'm curious: has this been done before under another name? Are there edge cases I’m overlooking that make it impractical in real-world tools?
Would love your thoughts.
compressedgas•2d ago
There are also other tools which have implemented reverse incremental backup or backup with reverse deduplication which store the most recent backup in contiguous form and fragment the older backups.
datastack•1d ago
trod1234•1d ago
An app (that requires remote infrastructure) seems a bit overkill, and if you're going through the hassle of doing that you might as well set up the equivalent of what MS used to call the Modern Desktop Experience, which is how many enterprise-level customers have their systems configured now.
The core parts are a cloud-based IdP, storage, and a slipstreamed deployment image which, given network connectivity, pulls down the config and sets the desired state, replicating the workspace down as needed (with OneDrive).
Backup data layout/strategy/BCDR plan can then be automated from the workspace/IdP/cloud-storage backend with no user interaction/learning curve.
If hardware fails, you use the deployment image to enroll new hardware, log in and replicate the user-related state down, etc. Automation for recurring tasks can be matched up to the device lifecycle phases (Provision, Enrollment, Recovery, Migration, Retirement). This is basically what a professional setup does with EntraID/Autopilot MDM and MSO365 plans. You can easily set up equivalents, but you have to write your own glue.
Most of that structure was taken from Linux grey beards ages ago; MS just made a lot of glue and put it in a nice package.