SLIs and SLOs
This page defines the service level indicators (SLIs) and service level objectives (SLOs) for RavenmaskOS.
Definitions
- SLI (Service Level Indicator): A quantifiable measure of service performance.
- SLO (Service Level Objective): A target for an SLI over a defined period.
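To make the relationship between the two terms concrete, here is a minimal sketch in which an SLI is a measured number and an SLO is a target that the number is compared against. The `SLO` class, the `error_rate_percent` helper, the `"30d"` window, and the request counts are all hypothetical names and values chosen for illustration; they are not part of RavenmaskOS.

```python
from dataclasses import dataclass
import operator


@dataclass(frozen=True)
class SLO:
    """A target for one SLI over a defined period (hypothetical structure)."""

    sli_name: str
    comparator: str  # "<" means the SLI must stay below target, ">" above it
    target: float
    window: str  # assumed label for the evaluation period, e.g. "30d"

    def is_met(self, measured: float) -> bool:
        """Return True when the measured SLI value satisfies the target."""
        ops = {"<": operator.lt, ">": operator.gt}
        return ops[self.comparator](measured, self.target)


def error_rate_percent(error_count: int, total_count: int) -> float:
    """SLI: percentage of requests that failed. Zero traffic counts as 0%."""
    if total_count == 0:
        return 0.0
    return 100.0 * error_count / total_count


# Illustrative numbers only: 42 failed requests out of 10,000.
error_rate_slo = SLO("error_rate_percent", "<", 1.0, "30d")
measured = error_rate_percent(error_count=42, total_count=10_000)
print(f"{measured:.2f}% error rate, SLO met: {error_rate_slo.is_met(measured)}")
```

Keeping the comparator on the SLO lets the same check cover "lower is better" indicators (latency, error rate) and "higher is better" ones (availability, job success rate).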
Baseline SLOs
| SLI | Target | Notes |
|---|---|---|
| Availability (core services) | 99.9% monthly | Traefik, Postgres, Redis |
| API latency (P99) | < 500 ms | Primary APIs |
| Error rate | < 1% | Share of responses with a 5xx status |
| Job success rate | > 98% | Background jobs and automations |
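An availability target translates directly into an error budget, which is the amount of downtime the target still permits. The sketch below works through that arithmetic; it assumes "monthly" means a 30-day window, which the table does not specify, and the function names are hypothetical. Under that assumption, 99.9% allows roughly 43.2 minutes of downtime per month.

```python
def error_budget_minutes(availability_target_percent: float, window_days: int = 30) -> float:
    """Downtime (in minutes) permitted by an availability target.

    window_days=30 is an assumed reading of "monthly".
    """
    window_minutes = window_days * 24 * 60
    allowed_unavailable_fraction = 1.0 - availability_target_percent / 100.0
    return window_minutes * allowed_unavailable_fraction


def budget_remaining_minutes(
    availability_target_percent: float,
    downtime_so_far_minutes: float,
    window_days: int = 30,
) -> float:
    """Error budget left in the window; a negative value means the SLO is breached."""
    budget = error_budget_minutes(availability_target_percent, window_days)
    return budget - downtime_so_far_minutes


# Core services target from the table: 99.9% over an assumed 30-day month.
print(round(error_budget_minutes(99.9), 1))
# Illustrative only: 10 minutes of downtime already used this window.
print(round(budget_remaining_minutes(99.9, downtime_so_far_minutes=10.0), 1))
```

The same calculation applies to any availability target: a 99.5% target over the same window permits 216 minutes.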
Examples
- Norns Agent: P99 latency < 750 ms; error rate < 1%
- Voice Gateway: P99 latency < 800 ms; availability > 99.5%
- Grafana: availability > 99.9%
Measuring SLIs
- Latency/Errors: Prometheus + Grafana dashboards
- Availability: Prometheus uptime checks + Blackbox Exporter
- LLM Metrics: Langfuse traces and usage metrics
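As one way to read an error-rate SLI out of Prometheus programmatically, the sketch below builds an instant query against the Prometheus HTTP API (`/api/v1/query`) and parses the response. The metric name `http_requests_total`, its `code` label, the 5-minute rate window, and the function names are assumptions; substitute whatever your services actually export.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumed metric and label names; adjust to match the exported metrics.
ERROR_RATIO_QUERY = (
    'sum(rate(http_requests_total{code=~"5.."}[5m]))'
    " / "
    "sum(rate(http_requests_total[5m]))"
)


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_scalar_result(payload: dict) -> Optional[float]:
    """Extract the first sample value from an instant-query response.

    Returns None when the query matched no series (for example, no traffic).
    """
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload.get('error')}")
    result = payload["data"]["result"]
    if not result:
        return None
    _timestamp, value = result[0]["value"]  # value arrives as a string
    return float(value)


def fetch_error_ratio(base_url: str, timeout: float = 5.0) -> Optional[float]:
    """Query Prometheus and return the current 5xx ratio (0.0 to 1.0)."""
    url = build_query_url(base_url, ERROR_RATIO_QUERY)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_scalar_result(json.load(response))
```

Calling `fetch_error_ratio("http://prometheus:9090")` (a hypothetical address) returns a ratio, so multiply by 100 before comparing it with a percentage target such as the 1% error-rate SLO.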