Prometheus
Metrics collection and alerting system.
Overview
Prometheus scrapes metrics from the targets listed below every 15 seconds and stores them as time-series data.
| Property | Value |
|---|---|
| Image | prom/prometheus:latest |
| Container | prometheus |
| Port | 9090 (internal) |
| Config | ~/ravenhelm/services/prometheus/ |
| Data | ~/ravenhelm/data/prometheus/ |
Scrape Targets
| Target | Endpoint | Interval |
|---|---|---|
| Prometheus | localhost:9090 | 15s |
| Node Exporter | node-exporter:9100 | 15s |
| cAdvisor | cadvisor:8080 | 15s |
| PostgreSQL | postgres-exporter:9187 | 15s |
| Redis | redis-exporter:9121 | 15s |
| Traefik | traefik:8080 | 15s |
Quick Commands
# View logs
docker logs -f prometheus
# Restart
docker restart prometheus
# Check targets (scrape URL and health of each active target)
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {scrapeUrl, health}'
# Run query
curl -s 'http://localhost:9090/api/v1/query?query=up' | jq
Configuration
prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres-exporter:9187']
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']
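After editing prometheus.yml, it can help to validate the file before Prometheus picks it up. The sketch below is one possible way to do that, not necessarily how this deployment handles reloads. It assumes the container name `prometheus` from the table above and that the config is mounted at `/etc/prometheus/prometheus.yml` (the image default, not confirmed here). `reload_prometheus`, `PROM_CONTAINER`, and `PROM_CONFIG` are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# Assumptions: the container is named "prometheus" and the config is mounted
# at /etc/prometheus/prometheus.yml inside it. Override via the environment.
PROM_CONTAINER="${PROM_CONTAINER:-prometheus}"
PROM_CONFIG="${PROM_CONFIG:-/etc/prometheus/prometheus.yml}"

# Validate the config with promtool (shipped in the prom/prometheus image).
# Only if the check passes, send SIGHUP, which asks Prometheus to reload
# its configuration without restarting the container.
reload_prometheus() {
  docker exec "$PROM_CONTAINER" promtool check config "$PROM_CONFIG" || {
    echo "config check failed; not reloading" >&2
    return 1
  }
  docker kill --signal=HUP "$PROM_CONTAINER"
}
```

Calling `reload_prometheus` after a config change leaves the running instance untouched if the check fails, which is gentler than `docker restart prometheus` with a broken file.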
Common Queries
# CPU usage by container (percent of one core, summed across cores)
sum by (name) (rate(container_cpu_usage_seconds_total{name!=""}[5m])) * 100
# Memory usage by container (MiB)
container_memory_usage_bytes{name!=""} / 1024 / 1024
# Disk usage (percent used, per filesystem)
(1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100
# Container restart count (start-time changes over the past hour)
changes(container_start_time_seconds[1h])
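The queries above contain braces, quotes, and spaces, which tend to break when pasted straight into a URL as in the `query=up` example. A minimal sketch of a wrapper that lets curl do the percent-encoding follows. It assumes Prometheus is reachable at `http://localhost:9090` from wherever the script runs. `prom_query` and `PROM_URL` are hypothetical names, not part of this setup.

```shell
#!/usr/bin/env bash
# Assumption: Prometheus answers on this URL; override PROM_URL if not.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# prom_query EXPR: run an instant query and print the raw JSON response.
# -G sends the --data-urlencode payload as a URL query string on a GET
# request, so curl percent-encodes the special characters in EXPR.
prom_query() {
  curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, `prom_query 'container_memory_usage_bytes{name!=""} / 1024 / 1024' | jq '.data.result'` runs the memory query without any manual escaping.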
Troubleshooting
Issue: Target Down
Symptoms: Target shows as "DOWN" on the /targets page (http://localhost:9090/targets)
Diagnosis:
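# Check whether the exporter container is running (-a also lists stopped containers)
docker ps -a | grep -E 'exporter|cadvisor|traefik'
# Test endpoint directly (run from a container on the same Docker network;
# exporter hostnames typically do not resolve on the host)
curl http://<exporter>:<port>/metrics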
Solutions:
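- Verify the exporter container is running; start or restart it if not
- Check that the exporter and Prometheus are attached to the same Docker network
- Verify the port in prometheus.yml matches the port the exporter listens on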
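The checks above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested part of this setup. It assumes each exporter's container name matches its hostname in the Scrape Targets table (for example `redis-exporter`), and that the Prometheus image provides a BusyBox `wget`. `check_target` and `PROM_CONTAINER` are hypothetical names.

```shell
#!/usr/bin/env bash
# Assumption: the Prometheus container is named "prometheus".
PROM_CONTAINER="${PROM_CONTAINER:-prometheus}"

# check_target NAME PORT: report why a target might be down.
# Returns 0 if reachable, 1 if the container is not running,
# 2 if /metrics cannot be fetched from inside the Prometheus container.
check_target() {
  name="$1"
  port="$2"
  # 1. Is the exporter container running? (assumes container name == hostname)
  running="$(docker inspect -f '{{.State.Running}}' "$name" 2>/dev/null)"
  if [ "$running" != "true" ]; then
    echo "$name: container not running"
    return 1
  fi
  # 2. Can Prometheus reach the endpoint over the Docker network?
  #    Running wget inside the Prometheus container tests the same path
  #    the scraper uses (assumes wget exists in the image).
  if ! docker exec "$PROM_CONTAINER" wget -q -O /dev/null "http://$name:$port/metrics"; then
    echo "$name: /metrics unreachable on port $port"
    return 2
  fi
  echo "$name: ok"
}
```

For example, `check_target redis-exporter 9121` prints one status line. Exit code 2 points at a network or port mismatch rather than a stopped container.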