Runbook: Decommission Service
Purpose
Safely remove a service from RavenmaskOS, including all related infrastructure components, secrets, and tracking documentation.
MCP Tools Available
This runbook can be automated using these Bifrost MCP tools:
| Step | Tool | Description |
|---|---|---|
| List containers | infra_docker_list_containers | Find service containers |
| Stop container | infra_docker_stop_container | Gracefully stop container |
| Remove container | infra_docker_remove_container | Remove stopped container |
| List volumes | infra_docker_list_volumes | Find service volumes |
| Remove volume | infra_docker_remove_volume | Remove Docker volume |
| List data dirs | infra_list_data_directories | Find service data directories |
| Remove data dir | infra_remove_data_directory | Remove service data |
| Restart container | infra_docker_restart_container | Restart Traefik to clear cached routes |
| List Zitadel apps | infra_zitadel_list_apps | Find OIDC applications |
| Delete Zitadel app | infra_zitadel_delete_app | Remove OIDC app |
| Trigger discovery | infra_trigger_discovery | Update CMDB after removal |
Executor: infrastructure (requires confirmation for destructive operations)
Prerequisites
- SSH access to odin (ravenhelm user)
- Service owner approval for decommissioning
- Backup of any important data (if needed)
- Incident/change ticket created in tracking system
Pre-Decommission Checklist
Before starting, verify:
- No active dependencies - Confirm no other services depend on this one
- Data backup - Export any data that needs retention
- User notification - Relevant users notified of deprecation
- Documentation - Note why service is being removed
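The dependency check can be partly scripted. The following is a minimal sketch, assuming each service keeps a docker-compose.yml or compose.yml under ~/ravenhelm/services/<name>/ (the layout used in the procedure below). find_dependents is a hypothetical helper, not an existing tool. It only finds textual mentions of the service name, so treat each hit as a lead to review rather than proof of a dependency.

```shell
# Hypothetical helper: print compose files of OTHER services that mention the given service.
# Usage: find_dependents <service-name> [services-dir]
find_dependents() {
  local service="$1" services_dir="${2:-$HOME/ravenhelm/services}"
  local file
  for file in "$services_dir"/*/docker-compose.yml "$services_dir"/*/compose.yml; do
    # An unmatched glob stays literal, so skip anything that is not a real file
    [ -f "$file" ] || continue
    # Skip the compose file of the service being decommissioned
    case "$file" in "$services_dir/$service/"*) continue ;; esac
    # -w whole-word match, -F fixed string (no regex), -q quiet
    if grep -qwF -- "$service" "$file"; then
      printf '%s\n' "$file"
    fi
  done
}
```

No output means no other compose file mentions the service by name. Dependencies expressed only through hostnames, environment files, or application config would still need a manual check.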
Procedure
Step 1: Stop and Remove Containers
MCP Tool: infra_docker_list_containers, infra_docker_stop_container, infra_docker_remove_container
# Navigate to service directory
cd ~/ravenhelm/services/<service-name>
# Stop and remove the service's containers and networks (volumes are kept);
# --remove-orphans also removes containers no longer defined in the compose file
docker compose down --remove-orphans
# Verify containers removed
docker ps -a | grep <service-name>
Tool invocation:
// List containers matching service name
{"tool": "infra_docker_list_containers", "arguments": {"name_filter": "<service-name>"}}
// Stop each container
{"tool": "infra_docker_stop_container", "arguments": {"container_id": "<container-id>"}}
// Remove each container
{"tool": "infra_docker_remove_container", "arguments": {"container_id": "<container-id>", "force": true}}
Step 2: Remove Docker Volumes (if applicable)
MCP Tool: infra_docker_list_volumes, infra_docker_remove_volume
# List service volumes
docker volume ls | grep <service-name>
# Remove volumes (CAUTION: data loss)
docker volume rm <volume-name>
Tool invocation:
// List volumes
{"tool": "infra_docker_list_volumes", "arguments": {"name_filter": "<service-name>"}}
// Remove volume (destructive!)
{"tool": "infra_docker_remove_volume", "arguments": {"volume_name": "<volume-name>", "force": false}}
Step 3: Remove Service Directory
cd ~/ravenhelm/services
rm -rf <service-name>
Note: Service directory removal is manual - not exposed via MCP for safety.
Step 4: Remove Data Directory
MCP Tool: infra_list_data_directories, infra_remove_data_directory
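# Backup first if needed (-C stores paths relative to ~/ravenhelm, so the archive restores cleanly from that directory)
mkdir -p ~/ravenhelm/backups
tar -czvf ~/ravenhelm/backups/<service-name>_$(date +%Y%m%d).tar.gz -C ~/ravenhelm data/<service-name>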
# Remove data
rm -rf ~/ravenhelm/data/<service-name>
Tool invocation:
// List data directories to find service
{"tool": "infra_list_data_directories", "arguments": {}}
// Remove data directory (destructive!)
{"tool": "infra_remove_data_directory", "arguments": {"path": "/Users/ravenhelm/ravenhelm/data/<service-name>", "force": true}}
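Before deleting data, it can help to confirm that the backup archive is actually readable. The following is a minimal sketch, assuming the ~/ravenhelm/data and ~/ravenhelm/backups layout used in this step. backup_data_dir is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: archive data/<service> and print the archive path only if it reads back.
# Usage: backup_data_dir <service-name> [ravenhelm-root]
backup_data_dir() {
  local service="$1" root="${2:-$HOME/ravenhelm}"
  local archive
  archive="$root/backups/${service}_$(date +%Y%m%d).tar.gz"
  mkdir -p "$root/backups"
  # -C makes the stored paths relative to $root (data/<service>/...)
  tar -czf "$archive" -C "$root" "data/$service" || return 1
  # Listing the archive forces a full read; require at least one entry
  if tar -tzf "$archive" | grep -q .; then
    printf '%s\n' "$archive"
  else
    return 1
  fi
}
```

Run the rm -rf only after this returns success, so that a failed or empty backup stops the removal.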
Step 5: Clean Up Secrets
# Edit secrets file and remove service-specific entries
nano ~/ravenhelm/secrets/.env
# Remove lines related to <service-name>
# Look for patterns like:
# - <SERVICE_NAME>_*
# - *_<SERVICE_NAME>_*
# - Comments mentioning the service
Note: Secret cleanup is manual for safety.
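To locate candidate entries before editing, a small helper can list matching variable names without printing their values. This is a sketch only: list_secret_keys is a hypothetical helper, and it assumes plain KEY=value lines and that the service name maps to an upper-case, underscore-separated prefix. It does not find comments that mention the service, so those still need a manual look.

```shell
# Hypothetical helper: print "line-number:KEY" for variables matching
# <SERVICE_NAME>_* or *_<SERVICE_NAME>_*; values are never printed.
# Usage: list_secret_keys <service-name> [env-file]
list_secret_keys() {
  local service="$1" env_file="${2:-$HOME/ravenhelm/secrets/.env}"
  local prefix
  # Normalise e.g. "my-service" to "MY_SERVICE"
  prefix=$(printf '%s' "$service" | tr '[:lower:]' '[:upper:]' | tr '-' '_')
  # cut keeps only the text before the first '=', which drops the secret value
  grep -nE "^([A-Za-z0-9_]*_)?${prefix}_[A-Za-z0-9_]*=" "$env_file" | cut -d= -f1
}
```

The printed line numbers can then be used to find and remove the entries in the editor.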
Step 6: Remove Traefik Configuration (if applicable)
If service had Traefik labels/routes:
MCP Tool: infra_docker_restart_container
# Traefik auto-discovers routes via container labels, so removing the container removes label-based routes
# For file-based routes, delete the service's entry from the Traefik dynamic config directory if present
ls ~/ravenhelm/services/traefik/dynamic/
# Restart Traefik to clear any cached routes
docker restart traefik
Tool invocation:
{"tool": "infra_docker_restart_container", "arguments": {"container_id": "traefik"}}
Step 7: Revoke/Remove Certificates
Let's Encrypt certs are managed automatically by Traefik. Once the route is removed:
# Certificates expire naturally (90 days)
# To preview Traefik's ACME storage without this service's certificate
# (this prints the filtered JSON to stdout; acme.json itself is not modified):
docker exec traefik cat /etc/traefik/acme.json | jq 'del(.letsencrypt.Certificates[] | select(.domain.main == "<service>.ravenhelm.dev"))'
Note: Manual cert revocation is rarely needed. Certs expire naturally.
Step 8: Remove Zitadel Application (if SSO-enabled)
MCP Tool: infra_zitadel_list_apps, infra_zitadel_delete_app
If service had Zitadel OIDC integration:
- Access Zitadel Admin: https://auth.ravenhelm.dev
- Navigate to: Organization → Projects → <project> → Applications
- Find the application and record its Client ID for the audit trail
- Delete the application
Tool invocation:
// List all apps to find the one to delete
{"tool": "infra_zitadel_list_apps", "arguments": {}}
// Delete the app (requires project_id and app_id)
{"tool": "infra_zitadel_delete_app", "arguments": {"project_id": "<project-id>", "app_id": "<app-id>"}}
Step 9: Remove DNS Records (if applicable)
If service had external DNS:
# Check current DNS
dig <service>.ravenhelm.dev
# Remove from DNS provider (Cloudflare/Route53)
# This is typically not needed for internal services using Traefik
Step 10: Clean Up Monitoring
Remove from Prometheus/Grafana:
# Remove scrape config from prometheus.yml if added
nano ~/ravenhelm/services/prometheus/prometheus.yml
# Remove any Grafana dashboards
# Delete from Grafana UI or remove from provisioning
Step 11: Update CMDB
MCP Tool: infra_trigger_discovery
If using Vidar, trigger discovery to update entity status:
curl -X POST https://vidar-api.ravenhelm.dev/api/v1/cmdb/discovery/trigger
Tool invocation:
{"tool": "infra_trigger_discovery", "arguments": {}}
Step 12: Update Documentation
Remove or archive references to the service:
# Update wiki
cd /tmp/ravenmaskos.wiki
# Remove service documentation pages
rm -f Features/Platform/<Service>.md
rm -f Troubleshooting/By-Service/<Service>.md
# Update index pages that reference the service
# Commit changes
git add -A
git commit -m "Decommission: Remove <service-name> documentation"
git push
GitLab MCP Tools available: gitlab_list_wiki_pages, gitlab_get_wiki_page
Verification
After completion, verify:
# No containers running
docker ps -a | grep <service-name>
# Expected: no output
# No volumes remaining
docker volume ls | grep <service-name>
# Expected: no output
# Service directory removed
ls ~/ravenhelm/services/<service-name>
# Expected: No such file or directory
# Data directory removed
ls ~/ravenhelm/data/<service-name>
# Expected: No such file or directory
# Route no longer served (if the service was exposed through Traefik)
curl -I https://<service-name>.ravenhelm.dev
# Expected: 404 or connection refused
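The container, volume, and directory checks above can be wrapped in a single pass/fail helper. The following is a minimal sketch, assuming the directory layout used in this runbook. verify_decommissioned is a hypothetical function name. Docker's name filters match substrings, so a short service name may also match unrelated containers or volumes.

```shell
# Hypothetical helper: returns 0 only if no containers, volumes, or directories remain.
# Usage: verify_decommissioned <service-name> [ravenhelm-root]
verify_decommissioned() {
  local service="$1" root="${2:-$HOME/ravenhelm}"
  local failures=0 dir

  # Containers in any state whose name contains the service name
  if [ -n "$(docker ps -a --filter "name=$service" --format '{{.Names}}')" ]; then
    echo "FAIL: containers still present for $service" >&2
    failures=$((failures + 1))
  fi

  # Volumes whose name contains the service name
  if [ -n "$(docker volume ls --filter "name=$service" --format '{{.Name}}')" ]; then
    echo "FAIL: volumes still present for $service" >&2
    failures=$((failures + 1))
  fi

  # Service and data directories must both be gone
  for dir in "$root/services/$service" "$root/data/$service"; do
    if [ -e "$dir" ]; then
      echo "FAIL: $dir still exists" >&2
      failures=$((failures + 1))
    fi
  done

  [ "$failures" -eq 0 ]
}
```

A non-zero exit status means at least one check failed. The messages on stderr name what is left to clean up.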
Automated Verification Script
// Run these tools to verify decommissioning
{"tool": "infra_docker_list_containers", "arguments": {"name_filter": "<service-name>"}}
// Expected: count = 0
{"tool": "infra_docker_list_volumes", "arguments": {"name_filter": "<service-name>"}}
// Expected: count = 0
{"tool": "infra_list_data_directories", "arguments": {}}
// Expected: <service-name> not in list
{"tool": "infra_trigger_discovery", "arguments": {}}
// Refresh CMDB to reflect changes
Post-Decommission
- Update tracking ticket - Mark as completed with summary
- Notify stakeholders - Confirm service removal
- Archive any backup - Move to long-term storage if needed
- Update homepage - Remove from service dashboard if listed
Rollback
If service needs to be restored:
- Restore data from backup:
cd ~/ravenhelm
tar -xzvf backups/<service-name>_<date>.tar.gz
- Recreate service directory from template or git history
- Restore secrets from 1Password or backup
- Deploy service:
cd ~/ravenhelm/services/<service-name>
docker compose up -d