Grafana Dashboard Catalog
Complete reference for all dashboards and data sources in Grafana.
Overview
Grafana provides observability dashboards for the RavenmaskOS platform. All dashboards are provisioned from configuration files, and Grafana rescans those files every 30 seconds to pick up changes.
Dashboard Index
| Dashboard | UID | Tags | Purpose |
|---|---|---|---|
| System Overview | system-overview | system, containers, overview | Platform health at a glance |
| Docker Logs | docker-logs | docker, logs, loki | Container log search |
| Distributed Traces | traces | observability, tempo, traces | Request tracing |
Data Sources
| Source | Type | URL | Purpose |
|---|---|---|---|
| Prometheus | prometheus | http://prometheus:9090 | Metrics |
| Loki | loki | http://loki:3100 | Logs |
| Tempo | tempo | http://tempo:3200 | Traces |
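As an illustration of how the table above could map onto Grafana's data source provisioning layout, here is a minimal Python sketch. The names, types, and URLs come from the table; the `access: proxy` setting and the function name `build_datasource_provisioning` are assumptions, and the actual provisioning file used by this platform is not shown in this document.

```python
import json

# Names, types, and URLs are taken from the Data Sources table.
DATA_SOURCES = [
    ("Prometheus", "prometheus", "http://prometheus:9090"),
    ("Loki", "loki", "http://loki:3100"),
    ("Tempo", "tempo", "http://tempo:3200"),
]


def build_datasource_provisioning(sources):
    """Return a dict shaped like a Grafana data source provisioning file."""
    return {
        "apiVersion": 1,
        "datasources": [
            # access="proxy" is an assumption: the Grafana backend issues the requests.
            {"name": name, "type": ds_type, "access": "proxy", "url": url}
            for name, ds_type, url in sources
        ],
    }


if __name__ == "__main__":
    # JSON output is also valid YAML 1.2, so it can be saved with a .yaml extension.
    print(json.dumps(build_datasource_provisioning(DATA_SOURCES), indent=2))
```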
Dashboards
System Overview
URL: https://grafana.ravenhelm.dev/d/system-overview
UID: system-overview
Tags: system, containers, overview
Main platform health dashboard showing resource utilization and container status.
┌─────────────────────────────────────────────────────────────────────────────┐
│ SYSTEM OVERVIEW │
├─────────────────┬─────────────────┬─────────────────┬───────────────────────┤
│ CPU Usage │ Memory Usage │ Disk Usage │ Running Containers │
│ 15% │ 65% │ 45% │ 47 │
├─────────────────┴─────────────────┴─────────────────┴───────────────────────┤
│ Container CPU Usage (timeseries) │
│ ████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │
├──────────────────────────────────────────────────────────────────────────────┤
│ Container Memory Usage (timeseries) │
│ ██████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │
├──────────────────────────────────────────────────────────────────────────────┤
│ Container Network I/O (timeseries) │
│ ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │
├──────────────────────────────────────────────────────────────────────────────┤
│ Log Volume by Container (timeseries) │
│ ██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ │
├──────────────────────────────────────────────────────────────────────────────┤
│ Recent Errors (All Containers) │
│ [2025-01-03 10:23] norns-agent: ConnectionError: Redis connection refused │
│ [2025-01-03 10:45] bifrost-api: TimeoutError: Tool execution timed out │
└──────────────────────────────────────────────────────────────────────────────┘
Panels:
| Panel | Type | Data Source | Description |
|---|---|---|---|
| CPU Usage | stat | Prometheus | Current CPU utilization percentage |
| Memory Usage | stat | Prometheus | Current memory utilization percentage |
| Disk Usage | stat | Prometheus | Current disk utilization percentage |
| Running Containers | stat | Prometheus | Count of running Docker containers |
| Container CPU Usage | timeseries | Prometheus | CPU usage by container over time |
| Container Memory Usage | timeseries | Prometheus | Memory usage by container over time |
| Container Network I/O | timeseries | Prometheus | Network traffic by container |
| Log Volume by Container | timeseries | Loki | Log line count by container |
| Recent Errors (All Containers) | logs | Loki | Latest error-level log entries |
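The Prometheus-backed panels above can be approximated outside Grafana by calling the Prometheus HTTP API directly. The sketch below only builds the request URL for the instant query endpoint (`/api/v1/query`). The PromQL expression is hypothetical: it assumes cAdvisor-style container metrics are being scraped, which this document does not confirm, and the real panel queries may differ.

```python
from urllib.parse import urlencode

PROMETHEUS_URL = "http://prometheus:9090"  # from the Data Sources table

# Hypothetical PromQL: assumes cAdvisor-style metrics such as
# container_cpu_usage_seconds_total exist in this Prometheus instance.
CONTAINER_CPU_QUERY = "sum by (name) (rate(container_cpu_usage_seconds_total[5m]))"


def instant_query_url(promql, base_url=PROMETHEUS_URL):
    """Build a URL for the Prometheus instant query endpoint."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    print(instant_query_url(CONTAINER_CPU_QUERY))
```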
Docker Logs
URL: https://grafana.ravenhelm.dev/d/docker-logs
UID: docker-logs
Tags: docker, logs, loki
Log exploration dashboard for searching and filtering container logs.
Panels:
| Panel | Type | Data Source | Description |
|---|---|---|---|
| Live Logs | logs | Loki | Real-time log stream (auto-refresh) |
| Log Volume by Container | timeseries | Loki | Log volume breakdown by container |
| Log Stream | logs | Loki | Filterable log history |
Variables:
- container - Filter by container name
- level - Filter by log level (error, warn, info, debug)
Example Queries:
# All errors from a specific container
{container="norns-agent"} |= "error"
# HTTP 500 errors from traefik
{container="traefik"} |= "500" |~ "error|Error"
# Slow database queries
{container="postgres"} |= "duration"
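The same queries can be run outside the dashboard against Loki's HTTP API. The following sketch assembles a LogQL string from a container name and an optional search term, then builds a URL for the `/loki/api/v1/query_range` endpoint. The helper names are hypothetical, and the sketch does no escaping of quotes in its inputs, so it is only suitable for simple values like those in the examples above.

```python
from urllib.parse import urlencode

LOKI_URL = "http://loki:3100"  # from the Data Sources table


def build_logql(container, contains=None):
    """Build a LogQL query: a container selector plus an optional line filter."""
    query = f'{{container="{container}"}}'
    if contains:
        # |= keeps only log lines containing the given substring.
        query += f' |= "{contains}"'
    return query


def query_range_url(logql, limit=100, base_url=LOKI_URL):
    """Build a query_range URL; start and end are omitted, so Loki's defaults apply."""
    params = {"query": logql, "limit": limit}
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"


if __name__ == "__main__":
    print(query_range_url(build_logql("norns-agent", "error")))
```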
Distributed Traces
URL: https://grafana.ravenhelm.dev/d/traces
UID: traces
Tags: observability, tempo, traces
Distributed tracing dashboard for request flow analysis.
Panels:
| Panel | Type | Data Source | Description |
|---|---|---|---|
| Trace Rate | timeseries | Tempo | Traces per second over time |
| P95 Latency | timeseries | Tempo | 95th percentile request latency |
| Recent Traces | traces | Tempo | Searchable trace list with flamegraph |
Use Cases:
- Debug slow API requests
- Trace request flow across services
- Identify bottlenecks in multi-service calls
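For the first use case, slow requests can also be looked up through Tempo's search API rather than the dashboard. This sketch builds a `/api/search` URL filtered by service name and minimum duration. It assumes traces carry a `service.name` tag and that the service is named after its container (for example `bifrost-api`); neither is confirmed by this document.

```python
from urllib.parse import urlencode

TEMPO_URL = "http://tempo:3200"  # from the Data Sources table


def slow_trace_search_url(service, min_duration="1s", limit=20, base_url=TEMPO_URL):
    """Build a Tempo search URL for traces of one service above a duration threshold."""
    params = {
        # Assumption: spans are tagged with service.name.
        "tags": f"service.name={service}",
        "minDuration": min_duration,
        "limit": limit,
    }
    return f"{base_url}/api/search?{urlencode(params)}"


if __name__ == "__main__":
    print(slow_trace_search_url("bifrost-api", min_duration="2s"))
```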
Quick Access URLs
| Dashboard | Direct URL |
|---|---|
| System Overview | https://grafana.ravenhelm.dev/d/system-overview |
| Docker Logs | https://grafana.ravenhelm.dev/d/docker-logs |
| Distributed Traces | https://grafana.ravenhelm.dev/d/traces |
| Explore (Loki) | https://grafana.ravenhelm.dev/explore?orgId=1&left=["now-1h","now","Loki",{}] |
| Explore (Tempo) | https://grafana.ravenhelm.dev/explore?orgId=1&left=["now-1h","now","Tempo",{}] |
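The Explore links above embed a JSON array in the `left` query parameter. When generating such links from a script, it is safer to percent-encode that value. The sketch below reproduces the array layout from the table (time range start, time range end, data source name, empty query object). The function name is hypothetical, and other Grafana versions may expect a different Explore URL format.

```python
import json
from urllib.parse import quote

GRAFANA_URL = "https://grafana.ravenhelm.dev"


def explore_url(datasource, time_from="now-1h", time_to="now", org_id=1):
    """Build an Explore URL with a percent-encoded `left` parameter."""
    # Compact separators keep the JSON identical to the form shown in the table.
    left = json.dumps([time_from, time_to, datasource, {}], separators=(",", ":"))
    return f"{GRAFANA_URL}/explore?orgId={org_id}&left={quote(left, safe='')}"


if __name__ == "__main__":
    print(explore_url("Loki"))
    print(explore_url("Tempo", time_from="now-6h"))
```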
Provisioning
Dashboards are automatically provisioned from JSON files:
Location: /etc/grafana/provisioning/dashboards/
Update Interval: 30 seconds
| File | Dashboard |
|---|---|
| system-overview.json | System Overview |
| docker-logs.json | Docker Logs |
| traces.json | Distributed Traces |
To add a new dashboard:
- Create a JSON file in /Users/ravenhelm/ravenhelm/services/grafana/provisioning/dashboards/
- Restart Grafana or wait for auto-sync
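The steps above can be sketched as a small script that writes a skeleton dashboard file. The field names (`uid`, `title`, `tags`, `panels`) are part of Grafana's dashboard JSON model, while the `<uid>.json` file naming simply follows the pattern in the table above. The function name and the example UID are hypothetical; a real dashboard would normally be exported from the Grafana UI with its panels populated.

```python
import json
from pathlib import Path


def write_dashboard(directory, uid, title, tags):
    """Write a minimal, panel-less dashboard JSON file and return its path."""
    dashboard = {
        "uid": uid,
        "title": title,
        "tags": list(tags),
        "panels": [],  # empty placeholder; add panels in the UI or by hand
    }
    path = Path(directory) / f"{uid}.json"
    path.write_text(json.dumps(dashboard, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical example: writes to the current directory, not the real provisioning path.
    print(write_dashboard(".", "example-dashboard", "Example Dashboard", ["example"]))
```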
Alert Rules
No Grafana alert rules are currently configured. Alerting is handled instead by:
- n8n workflows (GitLab Pipeline → Alerting)
- Uptime Kuma (service health checks)
See Also
- Observability - Observability stack overview
- Loki - Log aggregation
- Prometheus - Metrics collection
- Tempo - Distributed tracing
- Check System Health - Health monitoring use case
- Query Logs - Log search use case