Runbook: Decommission Service

Purpose

Safely remove a service from RavenmaskOS, including all related infrastructure components, secrets, and tracking documentation.

MCP Tools Available

This runbook can be automated using these Bifrost MCP tools:

| Step | Tool | Description |
| --- | --- | --- |
| List containers | infra_docker_list_containers | Find service containers |
| Stop container | infra_docker_stop_container | Gracefully stop container |
| Remove container | infra_docker_remove_container | Remove stopped container |
| List volumes | infra_docker_list_volumes | Find service volumes |
| Remove volume | infra_docker_remove_volume | Remove Docker volume |
| List data dirs | infra_list_data_directories | Find service data directories |
| Remove data dir | infra_remove_data_directory | Remove service data |
| List Zitadel apps | infra_zitadel_list_apps | Find OIDC applications |
| Delete Zitadel app | infra_zitadel_delete_app | Remove OIDC app |
| Trigger discovery | infra_trigger_discovery | Update CMDB after removal |

Executor: infrastructure (requires confirmation for destructive operations)

Prerequisites

  • SSH access to odin (ravenhelm user)
  • Service owner approval for decommissioning
  • Backup of any important data (if needed)
  • Incident/change ticket created in tracking system

Pre-Decommission Checklist

Before starting, verify:

  1. No active dependencies - Confirm no other services depend on this one
  2. Data backup - Export any data that needs retention
  3. User notification - Relevant users notified of deprecation
  4. Documentation - Note why service is being removed
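
The dependency check in item 1 can be partly scripted. The following is a minimal sketch, not an established part of this runbook: it assumes other services reference this one by name somewhere under `~/ravenhelm/services` (compose files, env files, configs). `find_dependents` is a hypothetical helper name, and a text search cannot catch dependencies that never mention the name, so treat an empty result as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Sketch: list files under a services root that mention a service name,
# skipping the service's own directory. The directory layout is an assumption.

find_dependents() {
  local services_root="$1" service="$2"
  # -r recurse, -l print matching file names only, -F treat the name as a
  # fixed string; "|| true" keeps "no matches" from counting as a failure
  grep -rlF --exclude-dir="$service" -- "$service" "$services_root" 2>/dev/null || true
}

# Usage (hypothetical service name):
#   find_dependents ~/ravenhelm/services my-service
```

Any file it prints is worth reading before continuing with the procedure.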

Procedure

Step 1: Stop and Remove Containers

MCP Tool: infra_docker_list_containers, infra_docker_stop_container, infra_docker_remove_container

# Navigate to service directory
cd ~/ravenhelm/services/<service-name>

# Stop and remove the service's containers and networks (volumes are kept)
docker compose down

# Also remove containers for services no longer defined in the compose file
docker compose down --remove-orphans

# Verify containers removed
docker ps -a | grep <service-name>

Tool invocation:

// List containers matching service name
{"tool": "infra_docker_list_containers", "arguments": {"name_filter": "<service-name>"}}

// Stop each container
{"tool": "infra_docker_stop_container", "arguments": {"container_id": "<container-id>"}}

// Remove each container
{"tool": "infra_docker_remove_container", "arguments": {"container_id": "<container-id>", "force": true}}

Step 2: Remove Docker Volumes (if applicable)

MCP Tool: infra_docker_list_volumes, infra_docker_remove_volume

# List service volumes
docker volume ls | grep <service-name>

# Remove volumes (CAUTION: data loss)
docker volume rm <volume-name>

Tool invocation:

// List volumes
{"tool": "infra_docker_list_volumes", "arguments": {"name_filter": "<service-name>"}}

// Remove volume (destructive!)
{"tool": "infra_docker_remove_volume", "arguments": {"volume_name": "<volume-name>", "force": false}}

Step 3: Remove Service Directory

cd ~/ravenhelm/services
rm -rf <service-name>

Note: Service directory removal is manual - not exposed via MCP for safety.

Step 4: Remove Data Directory

MCP Tool: infra_list_data_directories, infra_remove_data_directory

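# Backup first if needed
# (-C stores paths relative to ~/ravenhelm, so the archive restores cleanly from that directory)
tar -czvf ~/ravenhelm/backups/<service-name>_$(date +%Y%m%d).tar.gz -C ~/ravenhelm data/<service-name>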

# Remove data
rm -rf ~/ravenhelm/data/<service-name>

Tool invocation:

// List data directories to find service
{"tool": "infra_list_data_directories", "arguments": {}}

// Remove data directory (destructive!)
{"tool": "infra_remove_data_directory", "arguments": {"path": "/Users/ravenhelm/ravenhelm/data/<service-name>", "force": true}}

Step 5: Clean Up Secrets

# Edit secrets file and remove service-specific entries
nano ~/ravenhelm/secrets/.env

# Remove lines related to <service-name>
# Look for patterns like:
# - <SERVICE_NAME>_*
# - *_<SERVICE_NAME>_*
# - Comments mentioning the service

Note: Secret cleanup is manual for safety.
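
To find candidate lines before editing by hand, the patterns listed above can be turned into a search. This is a sketch under two assumptions: variable names use the service name upper-cased with hyphens turned into underscores, and comments mention the service by name. `secret_token` and `list_service_secrets` are hypothetical helpers. Values are redacted so secrets are not echoed to the terminal.

```shell
#!/usr/bin/env bash
# Sketch: show which lines of an env file relate to a service, without
# printing the secret values themselves.

# "my-service" -> "MY_SERVICE"
secret_token() {
  printf '%s' "$1" | tr 'a-z-' 'A-Z_'
}

list_service_secrets() {
  local env_file="$1" service="$2" token
  token=$(secret_token "$service")
  # -n prints line numbers; -i also catches comments mentioning the service;
  # sed replaces everything after the first '=' so values stay hidden
  grep -n -i -e "$token" -e "$service" -- "$env_file" | sed 's/=.*/=<redacted>/' || true
}

# Usage (hypothetical service name):
#   list_service_secrets ~/ravenhelm/secrets/.env my-service
```

The line numbers in the output point at the entries to review in the editor; the search itself changes nothing.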

Step 6: Remove Traefik Configuration (if applicable)

If service had Traefik labels/routes:

MCP Tool: infra_docker_restart_container

# Traefik auto-discovers routes via container labels, so removing the container removes them
# For file-based routes, remove the service's entry from Traefik's dynamic config, if present
ls ~/ravenhelm/services/traefik/dynamic/

# Restart Traefik to clear any cached routes
docker restart traefik

Tool invocation:

{"tool": "infra_docker_restart_container", "arguments": {"container_id": "traefik"}}

Step 7: Revoke/Remove Certificates

Let's Encrypt certs are managed automatically by Traefik. Once the route is removed:

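# Certificates will expire naturally (90 days)
# To preview the ACME storage with the service's certificate removed
# (this only prints the filtered JSON; acme.json itself is not modified):
docker exec traefik cat /etc/traefik/acme.json | jq 'del(.letsencrypt.Certificates[] | select(.domain.main == "<service>.ravenhelm.dev"))'
# To apply the removal, write the filtered JSON back to acme.json and restart Traefik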

Note: Manual cert revocation is rarely needed. Certs expire naturally.

Step 8: Remove Zitadel Application (if SSO-enabled)

MCP Tool: infra_zitadel_list_apps, infra_zitadel_delete_app

If service had Zitadel OIDC integration:

  1. Access Zitadel Admin: https://auth.ravenhelm.dev
  2. Navigate to: Organization → Projects → <project> → Applications
  3. Find the application and record its Client ID for the audit trail
  4. Delete the application

Tool invocation:

// List all apps to find the one to delete
{"tool": "infra_zitadel_list_apps", "arguments": {}}

// Delete the app (requires project_id and app_id)
{"tool": "infra_zitadel_delete_app", "arguments": {"project_id": "`<project-id>`", "app_id": "`<app-id>`"}}

Step 9: Remove DNS Records (if applicable)

If service had external DNS:

# Check current DNS
dig <service>.ravenhelm.dev

# Remove from DNS provider (Cloudflare/Route53)
# This is typically not needed for internal services using Traefik

Step 10: Clean Up Monitoring

Remove from Prometheus/Grafana:

# Remove scrape config from prometheus.yml if added
nano ~/ravenhelm/services/prometheus/prometheus.yml

# Remove any Grafana dashboards
# Delete from Grafana UI or remove from provisioning

Step 11: Update CMDB

MCP Tool: infra_trigger_discovery

If using Vidar, trigger discovery to update entity status:

curl -X POST https://vidar-api.ravenhelm.dev/api/v1/cmdb/discovery/trigger

Tool invocation:

{"tool": "infra_trigger_discovery", "arguments": {}}

Step 12: Update Documentation

Remove or archive references to the service:

# Update wiki
cd /tmp/ravenmaskos.wiki

# Remove service documentation pages
rm -f Features/Platform/<Service>.md
rm -f Troubleshooting/By-Service/<Service>.md

# Update index pages that reference the service
# Commit changes
git add -A
git commit -m "Decommission: Remove <service-name> documentation"
git push

GitLab MCP Tools available: gitlab_list_wiki_pages, gitlab_get_wiki_page

Verification

After completion, verify:

# No containers remaining (running or stopped)
docker ps -a | grep <service-name>
# Expected: no output

# No volumes remaining
docker volume ls | grep <service-name>
# Expected: no output

# Service directory removed
ls ~/ravenhelm/services/<service-name>
# Expected: No such file or directory

# Data directory removed
ls ~/ravenhelm/data/<service-name>
# Expected: No such file or directory

# Route no longer served (if external)
curl -I https://<service-name>.ravenhelm.dev
# Expected: 404, connection refused, or (if the DNS record was removed) a host resolution error
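
The two directory checks above can be combined into a single pass that fails loudly if anything is left behind. A minimal sketch, assuming only that "removed" means the path no longer exists; `verify_removed` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: report each path's status and return non-zero if any still exists.

verify_removed() {
  local path status=0
  for path in "$@"; do
    if [ -e "$path" ]; then
      echo "STILL PRESENT: $path"
      status=1
    else
      echo "removed: $path"
    fi
  done
  return "$status"
}

# Usage:
#   verify_removed ~/ravenhelm/services/<service-name> ~/ravenhelm/data/<service-name>
```

The non-zero exit status makes it usable as a gate in a larger script, for example before closing the tracking ticket.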

Automated Verification Script

// Run these tools to verify decommissioning
{"tool": "infra_docker_list_containers", "arguments": {"name_filter": "<service-name>"}}
// Expected: count = 0

{"tool": "infra_docker_list_volumes", "arguments": {"name_filter": "<service-name>"}}
// Expected: count = 0

{"tool": "infra_list_data_directories", "arguments": {}}
// Expected: <service-name> not in list

{"tool": "infra_trigger_discovery", "arguments": {}}
// Refresh CMDB to reflect changes

Post-Decommission

  1. Update tracking ticket - Mark as completed with summary
  2. Notify stakeholders - Confirm service removal
  3. Archive any backup - Move to long-term storage if needed
  4. Update homepage - Remove from service dashboard if listed

Rollback

If service needs to be restored:

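  1. Restore from backup (tar extracts relative to the current directory, so list the archive with tar -tzf first to confirm its paths):

    cd ~/ravenhelm
    tar -xzvf backups/<service-name>_<date>.tar.gz

  2. Recreate service directory from template or git history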

  3. Restore secrets from 1Password or backup

  4. Deploy service:

    cd ~/ravenhelm/services/<service-name>
    docker compose up -d