Ask HN: How do you deal with data backups in servers?

4•atomicnature•5h ago

Recently, we lost some data while migrating our servers because of a missing backup.

We thought we had something backed up, but that was not really the case.

We have multiple databases and apps, often each with its own data store.

How do you usually deal with server backups? What has worked for you and what has not?

Comments

Bender•4h ago

For me, both professionally and personally, having a manifest of all data that is neither part of the OS nor committed to a git repo (code artifacts, for example, are restored by code deployment) clearly defines what needs to be backed up. This must be tested routinely by restoring only what exists in the role-based manifest, following the role-based procedure, and doing QA testing on the restored nodes. Procedures will vary by role, but there must be a manifest that defines which directories contain live data. Each role must have its own clearly defined procedure for data restoration, and the role must be defined in the manifest. So, for example, DBAs will be responsible for writing their role-based procedure for primary and secondary databases. Ideally, role-based data should be neatly contained in a corporate-specific directory structure, meaning that every role could in theory be restored to a single node, without overlapping ports, for stand-alone QA testing on a developer laptop.
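To make the manifest idea concrete, here is a minimal Python sketch, not a definitive implementation. The `Role` fields, the helper names `missing_after_restore` and `port_conflicts`, and any paths you feed it are all hypothetical; the real manifest format would be whatever suits your environment.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Role:
    """One manifest entry (hypothetical layout): a role and its live data."""
    name: str
    data_dirs: list[str]                      # directories that hold live data
    ports: list[int] = field(default_factory=list)
    restore_procedure: str = ""               # pointer to the role's runbook


def missing_after_restore(role: Role, root: Path) -> list[str]:
    """Return manifest directories that do not exist under a restored root."""
    return [d for d in role.data_dirs if not (root / d.lstrip("/")).is_dir()]


def port_conflicts(roles: list[Role]) -> dict[int, list[str]]:
    """Return ports claimed by more than one role, mapped to the role names."""
    owners: dict[int, list[str]] = {}
    for role in roles:
        for port in role.ports:
            owners.setdefault(port, []).append(role.name)
    return {port: names for port, names in owners.items() if len(names) > 1}
```

A restore test would restore only the manifest directories into a scratch root, call `missing_after_restore` for each role, and then run QA against the result. `port_conflicts` is a quick check on whether every role could share a single node.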

Personally, I also like to keep a local snapshot of live/ephemeral data using rsnapshot so that I can quickly get a node back in service, assuming the backup volume (accessible only by root) has not been tainted or tampered with. OSSEC is one of the many tools that can checksum data and alert on tampering. AuditD with well-written rules is also useful for real-time monitoring. Anti-tampering is an entire topic by itself.
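The checksum idea can be illustrated in a few lines of Python. This is only a sketch of the general technique (hash every file, compare against a stored baseline), not how OSSEC works internally; the function names `checksum_tree` and `diff_checksums` are made up for this example.

```python
import hashlib
from pathlib import Path


def checksum_tree(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 hex digest."""
    sums = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            sums[str(path.relative_to(root))] = digest
    return sums


def diff_checksums(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files whose content changed, appeared, or vanished since baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "changed": sorted(k for k in shared if baseline[k] != current[k]),
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
    }
```

For this to mean anything, the baseline has to live somewhere an attacker on the node cannot rewrite it. Note that `read_bytes` loads whole files into memory, so large files would need chunked hashing.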

I like to keep these concepts outside of configuration management tools but design them so they can be easily pulled into said tools. This makes replacing a tool much easier. So if, for example, one's company wants to switch from Chef to Ansible for whatever reason, the process is already a known-known, allowing a quick semi-automated migration.

codegeek•3m ago

Some rules for backups that you must follow:

1. Backups must be stored offsite on a separate server (obvious, but surprisingly some people miss this).

2. Backups must be tested frequently. If you cannot test a backup, you don't have a backup.

3. Frequency depends on how critical your data is, your contract/SLA with your customer, etc. Ideally, you should be able to do a point-in-time restore (PITR) going back a certain number of hours/days/weeks.

4. Make sure to have notifications for backup failures. If a backup failed, you must be notified to correct it manually.

5. Bonus: Have a backup reconciliation script that runs separately to reconcile all backups for a given period.
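As a rough illustration of rule 5, a reconciliation check can be as simple as comparing the days you expected a backup against the days one actually exists. This sketch assumes a daily schedule and that you can already produce the set of dates with a completed backup; `missing_backup_days` is a hypothetical helper name.

```python
from datetime import date, timedelta


def missing_backup_days(backup_dates: set[date], start: date, end: date) -> list[date]:
    """Return every day in [start, end] that has no backup recorded."""
    total_days = (end - start).days + 1
    expected = (start + timedelta(days=i) for i in range(total_days))
    return [day for day in expected if day not in backup_dates]
```

Running this on a schedule separate from the backup job itself, and alerting on a non-empty result, catches the case where the backup job silently never ran and so never sent a failure notification.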
