- Container name specification for pre/post backup hooks (you can now target specific containers in multi-container pods)
- Proper error handling throughout the codebase (fixed unchecked errors that linters were complaining about)
- Added unit tests for core reconciliation logic
- Cleaner code structure following Go best practices
Still experimental, but getting more stable. The operator automates PV snapshots for StatefulSets using the Kubernetes VolumeSnapshot API, with backup policies defined as CRDs.

Looking for feedback on feature priorities: I'm trying to figure out what would make this actually useful beyond a learning project. Some ideas I'm considering:
- Application-aware backup hooks (e.g., MySQL flush before snapshot)
- Backup verification/validation
- Better observability (Prometheus metrics)
- Restore automation (currently manual)
- Multi-cluster snapshot replication
Which of these (or other features) would matter for real-world usage? Or is Velero already solving all these problems well enough that narrow-scope alternatives don't make sense? Appreciate any thoughts from folks running stateful workloads on Kubernetes.