Disaster Recovery
Full recovery procedures for catastrophic failures.
Recovery Tiers
| Tier | Scenario | RTO |
|---|---|---|
| 1 | Single service failure | < 15 min |
| 2 | Data corruption | < 1 hour |
| 3 | Full host failure (T9 available) | < 4 hours |
| 4 | Complete site loss (B2 only) | < 24 hours |
Tier 1: Service Recovery
cd ~/ravenhelm/services/<service>
docker compose down
docker compose up -d
docker compose logs -f
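After the restart, it can help to confirm that every service defined in the compose file actually came back up. The following is a minimal sketch, not part of the original procedure: `not_running_services` is a hypothetical helper name, and it assumes a Docker Compose v2 release whose `ps` subcommand supports `--services` together with `--status`.

```shell
# Hypothetical helper (not from the original runbook).
# Prints every service defined in the compose file of the current
# directory that is NOT currently in the "running" state.
not_running_services() {
    # Services that Compose reports as running right now.
    running=$(docker compose ps --services --status running)

    # Walk all defined services and print those missing from the list.
    docker compose config --services | while IFS= read -r svc; do
        printf '%s\n' "$running" | grep -qxF -- "$svc" || printf '%s\n' "$svc"
    done
}
```

Run from the service directory, empty output means all defined services are running; any printed name is a service to inspect with `docker compose logs`.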
Tier 2: Data Recovery
# Stop service
cd ~/ravenhelm/services/<service>
docker compose down
# Restore from latest snapshot
source ~/.config/restic/homelab.env
restic restore latest --include ~/ravenhelm/data/<service> --target /
# Restart
docker compose up -d
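Because the restore above writes directly over the live data under `/`, one cautious option is to restore into a scratch directory first and inspect the result. This is a sketch under assumptions: `preview_restore` and the default scratch path `/tmp/restore-preview` are hypothetical, and the restic environment file is assumed to have been sourced already.

```shell
# Hypothetical helper (not from the original runbook).
# Restores one service's data from the latest snapshot into a scratch
# directory instead of overwriting the live data under /.
#   $1 = service name
#   $2 = scratch directory (optional; the default is an assumed location)
preview_restore() {
    service=$1
    scratch=${2:-/tmp/restore-preview}
    data_path="$HOME/ravenhelm/data/$service"

    mkdir -p "$scratch" || return 1

    # restic recreates the original absolute path below the target, so the
    # files end up under "$scratch$data_path".
    restic restore latest --include "$data_path" --target "$scratch"
}
```

For example, `preview_restore postgres` would leave the restored files under `/tmp/restore-preview` for comparison before running the real restore.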
Tier 3: Full Host Recovery (T9 Available)
1. Initial Setup
# Install dependencies
brew install colima docker docker-compose restic
# Start Colima
colima start --cpu 8 --memory 16 --disk 100
2. Configure 1Password
export OP_SERVICE_ACCOUNT_TOKEN="<token>"
op whoami
3. Restore from T9
source ~/.config/restic/homelab.env
restic restore latest --target /
4. Deploy Services
docker network create ravenhelm_net
# Deploy in order
for svc in traefik postgres redis zitadel; do
  # Subshell keeps the caller's working directory unchanged
  (cd ~/ravenhelm/services/"$svc" && docker compose up -d)
done
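Since the services are deployed in dependency order, it may be useful to stop at the first one that fails rather than continue. The sketch below is one way to do that, assuming a Docker Compose v2 release that supports `up --wait` (which blocks until the services are running or healthy). `deploy_in_order` is a hypothetical helper name.

```shell
# Hypothetical helper (not from the original runbook).
# Deploys the named services one after another and stops at the first
# service that fails to start.
deploy_in_order() {
    for svc in "$@"; do
        echo "deploying $svc"
        # The subshell keeps the caller's working directory unchanged.
        # --wait is assumed to be available (Docker Compose v2).
        (cd "$HOME/ravenhelm/services/$svc" && docker compose up -d --wait) || {
            echo "failed: $svc" >&2
            return 1
        }
    done
}
```

It could be called as `deploy_in_order traefik postgres redis zitadel`; a non-zero exit status means the last service printed did not come up.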
Tier 4: Full Site Recovery (B2 Only)
Follow the Tier 3 procedure, but in step 3 restore from the B2 repository:
# Use B2 config instead
source ~/.config/restic/b2.env
restic restore latest --target /
Note: Restoring from B2 is slower than the Tier 3 restore because all data must be downloaded over the network.
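The choice between Tier 3 and Tier 4 comes down to which repository is reachable. As an illustration only, the sketch below tries the T9 configuration first and falls back to B2. `choose_restic_env` is a hypothetical helper, and it assumes each env file sets everything restic needs (repository and password) as plain shell assignments.

```shell
# Hypothetical helper (not from the original runbook).
# Prints the path of the first restic env file whose repository can be
# reached, trying the T9 config before the B2 config.
choose_restic_env() {
    for env_file in "$HOME/.config/restic/homelab.env" \
                    "$HOME/.config/restic/b2.env"; do
        [ -r "$env_file" ] || continue
        # Probe in a subshell so the variables do not leak into the caller.
        # set -a exports whatever the env file assigns.
        if ( set -a; . "$env_file"; restic snapshots --latest 1 ) >/dev/null 2>&1; then
            printf '%s\n' "$env_file"
            return 0
        fi
    done
    return 1
}
```

A possible use is `source "$(choose_restic_env)"` followed by the restore command; a non-zero exit status means neither repository answered.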
Post-Recovery Checklist
- All containers running
- Traefik dashboard accessible
- TLS certificates valid
- PostgreSQL healthy
- Redis healthy
- All services accessible via HTTPS
- Backup cron jobs restored
- DNS working
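Parts of this checklist can be scripted. The sketch below covers the HTTPS and TLS items: by default `curl` rejects invalid or expired certificates, so one successful request shows both that the service is reachable and that its certificate validates. `check_https` is a hypothetical helper, and the URLs passed to it are placeholders, since the real hostnames are not listed here.

```shell
# Hypothetical helper (not from the original runbook).
# Requests each URL and reports whether it answered over HTTPS with a
# certificate that curl accepts. Returns 1 if any URL failed.
check_https() {
    failed=0
    for url in "$@"; do
        if curl --fail --silent --show-error --max-time 10 \
                --output /dev/null "$url"; then
            echo "ok   $url"
        else
            echo "FAIL $url"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_https https://traefik.example.internal https://auth.example.internal` (placeholder hostnames) prints one line per URL and exits non-zero if any check fails.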
Critical Assets Priority
- 1Password access
- Restic password
- ~/ravenhelm/secrets/.env
- ~/ravenhelm/data/postgres/
- Remaining data directories