Backup and Restore
Purpose
This document defines initial backup and restore expectations.
Backup and restore are mandatory because the platform stores long-term telemetry and customer data.
Data That Must Be Protected
- PostgreSQL relational data.
- TimescaleDB telemetry data.
- Audit logs.
- Export metadata.
- Configuration needed to restore deployments.
- Kubernetes secrets where applicable.
- Helm values for each environment.
- Container image versions.
Retention Requirements
Raw payloads:
- Approximately one month.
Normalized measurements:
- At least five years.
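The two retention windows above can be expressed as a simple age check. The following is a minimal sketch, not a prescribed implementation: the exact day counts are assumptions (the text only says "approximately one month" and "at least five years"), and `is_expired` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumed values: "approximately one month" is taken as 30 days, and
# "at least five years" is rounded up to 5 * 366 days so that leap years
# never shorten the window below five calendar years.
RAW_PAYLOAD_RETENTION = timedelta(days=30)
MEASUREMENT_RETENTION = timedelta(days=5 * 366)


def is_expired(recorded_at: datetime, retention: timedelta, now: datetime) -> bool:
    """Return True when a record is older than its retention window."""
    return now - recorded_at > retention


now = datetime(2030, 1, 1, tzinfo=timezone.utc)
two_months_old = now - timedelta(days=60)

# A 60-day-old raw payload is past its window; a 60-day-old measurement is not.
print(is_expired(two_months_old, RAW_PAYLOAD_RETENTION, now))
print(is_expired(two_months_old, MEASUREMENT_RETENTION, now))
```

Backups need to cover at least the same windows, otherwise a restore could not bring back data that is still within its retention period.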
Initial Backup Requirements
V1 must document:
- Backup frequency.
- Backup storage location.
- Restore procedure.
- Restore validation process.
- Responsible operator.
- Expected RPO.
- Expected RTO.
Recovery Point Objective (RPO) is the maximum acceptable data loss, expressed as a time window before the failure.
Initial target: to be decided.
Recovery Time Objective (RTO) is the maximum acceptable time to complete a restore.
Initial target: to be decided.
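One way to track which of the required items are still undecided is a small record with one field per item. This is only a sketch: the `BackupPlan` class, its field names, and the `platform-ops` value are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BackupPlan:
    """Hypothetical record of the items V1 must document."""
    backup_frequency: Optional[str] = None
    storage_location: Optional[str] = None
    restore_procedure: Optional[str] = None
    restore_validation: Optional[str] = None
    responsible_operator: Optional[str] = None
    expected_rpo: Optional[str] = None
    expected_rto: Optional[str] = None


def undecided_items(plan: BackupPlan) -> list[str]:
    """List the field names that still have no documented value."""
    return [f.name for f in fields(plan) if getattr(plan, f.name) is None]


# Placeholder value only; the responsible operator is not named in this document.
plan = BackupPlan(responsible_operator="platform-ops")
print(undecided_items(plan))
```

A check like this could run in CI so that V1 cannot ship while any item, including RPO and RTO, is still empty.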
Single-Server Warning
A single Kubernetes server is not a backup strategy.
Persistent volumes can fail.
Database files can be corrupted.
Operator mistakes can delete data.
A separate backup location is required.
Restore Testing
A backup is only useful if restore has been tested.
Restores should be tested regularly in a test or staging environment.
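A restore test needs a pass or fail criterion. One simple option, shown here as a sketch under assumptions, is to compare per-table row counts between the source and the restored database. The table names and counts are hypothetical, and row counts alone do not prove data integrity; they only catch missing tables and obvious truncation.

```python
def compare_row_counts(source: dict[str, int], restored: dict[str, int]) -> list[str]:
    """Return human-readable mismatches between source and restored tables."""
    problems = []
    for table, expected in sorted(source.items()):
        actual = restored.get(table)
        if actual is None:
            problems.append(f"{table}: missing after restore")
        elif actual != expected:
            problems.append(f"{table}: expected {expected} rows, found {actual}")
    return problems


# Hypothetical counts; in practice they would come from queries such as
# SELECT count(*) run against the source snapshot and the restored database.
source_counts = {"audit_log": 1200, "measurements": 50000}
restored_counts = {"audit_log": 1200, "measurements": 49990}

for problem in compare_row_counts(source_counts, restored_counts):
    print(problem)
```

An empty result means the counts match; any reported line should fail the restore test and be investigated before the backup is trusted.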
Open Decisions
- Backup tooling.
- Backup storage location.
- WAL archiving strategy.
- Full backup frequency.
- Restore testing frequency.
- RPO and RTO.