Now, instead of chasing down different systems and backups, you can simply ensure your archival process runs regularly and you should be good.
The data archive serializes the deleted object, representing the schema as it was at that point in time.
But fast-forward through a few schema changes, and now your system has to migrate the archived objects to the current schema.
Of course, as always, it depends on the system and how the archive is used. That's just my experience. I can imagine that if there are more tools or features built around the archive, the situation might be different.
I think maintaining schema changes and migrations on archived objects can be tricky in its own ways, even when they're kept in the live tables with an 'archived_at' column, especially when objects span multiple tables with relationships. I've worked on migrations where really old archived objects just didn't make sense anymore in the new data model, and figuring out a safe migration became a difficult, error-prone project.
* It's obvious from the schema: if there's a `deleted_at` column, I know how to query the table correctly (vs. assuming rows are never DELETEd, or having to know which other table to look in); see the sketch after this list.
* One way to do things: analytics queries, admin pages, and everything else can all look at the same set of data, vs. having separate handling for historical data.
* DELETEs are likely fairly rare by volume for many use cases
* I haven't found soft-deleted rows to be a big performance issue. Intuitively this makes sense, since indexed lookups should be O(log N).
* Undoing is really easy, because all the relationships stay in place, vs data already being moved elsewhere (In practice, I haven't found much need for this kind of undo).
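To make that concrete, here's a rough Postgres-flavored sketch of the pattern; the `users` table and its columns are made up for illustration:

```sql
-- Hypothetical table with a soft-delete column.
CREATE TABLE users (
    id         BIGSERIAL PRIMARY KEY,
    email      TEXT NOT NULL,
    deleted_at TIMESTAMPTZ  -- NULL means the row is live
);

-- "Deleting" a user stamps the column instead of removing the row.
UPDATE users SET deleted_at = now() WHERE id = 42;

-- Normal reads filter the stamped rows out.
SELECT * FROM users WHERE deleted_at IS NULL;

-- Undo is just clearing the stamp; all relationships are still in place.
UPDATE users SET deleted_at = NULL WHERE id = 42;
```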
In most cases, I've really enjoyed going even further and making rows fully immutable, using a new row to handle updates. This makes it really easy to reference historical data.
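Roughly what I mean by immutable rows, as a sketch with a hypothetical `prices` table (Postgres syntax, and it ignores concurrency): an UPDATE becomes an INSERT of the next version.

```sql
-- Each change is a new row; old versions stay queryable forever.
CREATE TABLE prices (
    product_id BIGINT  NOT NULL,
    version    INT     NOT NULL,
    amount     NUMERIC NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (product_id, version)
);

-- "Updating" a price inserts the next version number.
INSERT INTO prices (product_id, version, amount)
SELECT 42, COALESCE(MAX(version), 0) + 1, 9.99
FROM prices WHERE product_id = 42;

-- The current value is simply the latest version per product.
SELECT DISTINCT ON (product_id) *
FROM prices
ORDER BY product_id, version DESC;
```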
If I were doing the logging approach described in the article, I'd use database triggers that keep a copy of every INSERTed/UPDATEd/DELETEd row in a duplicate table. That way it all stays in the same database: easy to query and easy to replicate elsewhere.
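Something along these lines, reusing the hypothetical `users` table from the first sketch (Postgres trigger syntax; the history table layout is just one option):

```sql
-- History table: one row per change, with a full JSON snapshot of the affected row.
CREATE TABLE users_history (
    history_id BIGSERIAL PRIMARY KEY,
    op         TEXT NOT NULL,                         -- 'INSERT' | 'UPDATE' | 'DELETE'
    changed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    row_data   JSONB NOT NULL
);

CREATE OR REPLACE FUNCTION log_users_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO users_history (op, row_data) VALUES (TG_OP, to_jsonb(OLD));
    ELSE
        INSERT INTO users_history (op, row_data) VALUES (TG_OP, to_jsonb(NEW));
    END IF;
    RETURN NULL;  -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_audit
AFTER INSERT OR UPDATE OR DELETE ON users
FOR EACH ROW EXECUTE FUNCTION log_users_change();
```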
All your other points make sense, given this assumption.
I've seen tables where 50-70% of the rows were soft-deleted, and it affected performance noticeably.
> Undoing is really easy
Depends on whether undoing ever actually happens, and whether the acts of deletion and undeletion require audit records anyway.
In short, there are cases when soft-deletion works well, and is a good approach. In other cases it does not, and is not. Analysis is needed before adopting it.
Strange. I've only ever heard of legal requirements preventing deletion of things you'd expect could be fully deleted (in case they're needed as evidence at trial or something).
cj•1h ago
This works especially well in cases where you don’t want to waste CPU/memory scanning soft-deleted records every time you do a lookup.
And it avoids situations where app/backend logic forgets to apply the “deleted: false” filter.
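Roughly the difference, with a made-up schema (Postgres syntax):

```sql
-- Hypothetical schema to illustrate the point.
CREATE TABLE users (id BIGINT PRIMARY KEY, email TEXT, deleted_at TIMESTAMPTZ);
CREATE TABLE users_archive (LIKE users INCLUDING ALL);

-- Soft delete: every read path has to remember the filter...
SELECT count(*) FROM users WHERE deleted_at IS NULL;
-- ...and the easy-to-write buggy version silently includes deleted users.
SELECT count(*) FROM users;

-- Hard delete + archive table: the live table only holds live rows,
-- so there is no filter to forget and no dead rows to scan past.
WITH moved AS (
    DELETE FROM users WHERE id = 42 RETURNING *
)
INSERT INTO users_archive SELECT * FROM moved;
SELECT count(*) FROM users;
```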