# Service Level Objectives
You can't improve reliability without measuring it. SLOs provide the framework.
## The Stack
**SLI (Service Level Indicator)**
What you measure: a quantitative signal of service behavior such as latency, error rate, or throughput.
**SLO (Service Level Objective)**
Your target for an SLI over a time window, e.g. "99.9% of requests succeed over 30 days."
**SLA (Service Level Agreement)**
A contract with users that commits you to a service level. It usually carries financial penalties if breached.
## Choosing Good SLIs
**User-Focused**
Measure what users care about, not what's easy to measure.
**Request Success Rate**
Most fundamental metric. Did it work?
**Request Latency**
How fast was it? Track percentiles (P50, P95, P99) rather than an average, which hides slow outliers.
**Availability**
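Is the service up? Usually expressed as the fraction of time, or of requests, in which the service responds successfully.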
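
As a rough illustration of the two request-level SLIs above, the sketch below computes a success rate and nearest-rank latency percentiles from a list of request records. The `Request` type, its field names, and the sample data are assumptions made for this example, not part of any particular monitoring tool.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical record of one request as the user experienced it."""
    ok: bool           # did the request succeed?
    latency_ms: float  # time taken to serve it


def success_rate(requests):
    """Fraction of requests that succeeded."""
    if not requests:
        raise ValueError("no requests in the window")
    return sum(r.ok for r in requests) / len(requests)


def latency_percentile(requests, pct):
    """Nearest-rank percentile of latency, with pct in (0, 100]."""
    if not requests:
        raise ValueError("no requests in the window")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(r.latency_ms for r in requests)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up sample: nine fast successes and one slow failure.
sample = [Request(True, ms) for ms in (12, 15, 14, 13, 16, 18, 11, 17, 19)]
sample.append(Request(False, 950))

print(success_rate(sample))            # 0.9
print(latency_percentile(sample, 50))  # 15
print(latency_percentile(sample, 99))  # 950
```

Note how the single slow request is invisible at P50 but dominates P99, which is why the tail percentiles matter.
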
## Error Budgets
**100% Uptime is the Wrong Target**
Every change carries some risk, so chasing perfect reliability leaves no room to ship anything new.
**Error Budget = 1 - SLO**
A 99.9% SLO leaves a 0.1% error budget, which is about 43 minutes of downtime in a 30-day month.
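
The arithmetic behind that figure can be checked with a short sketch. It assumes a 30-day window and treats the whole budget as downtime; the function names are illustrative only.

```python
def error_budget(slo):
    """Error budget as a fraction: 1 - SLO."""
    if not 0 < slo <= 1:
        raise ValueError("slo must be in (0, 1]")
    return 1 - slo


def allowed_downtime_minutes(slo, days=30):
    """Minutes of full downtime the budget permits over the window."""
    return error_budget(slo) * days * 24 * 60


for slo in (0.99, 0.999, 0.9999):
    minutes = allowed_downtime_minutes(slo)
    print(f"{slo:.2%} SLO -> {minutes:.1f} minutes per 30 days")
```

Each extra nine cuts the allowance by a factor of ten: roughly 432, 43.2, and 4.3 minutes for the three targets shown.
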
**Use Your Budget**
Spend it on shipping features and taking calculated risks. When it is exhausted, pause risky changes and focus on reliability.
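
One way to turn that rule into something checkable is to track how much of the budget is left in the current window. The sketch below assumes a request-based SLO; the function names and the policy strings are hypothetical, and a real policy would be agreed on by the team.

```python
def budget_remaining(slo, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative once overspent)."""
    allowed_failures = (1 - slo) * total_requests
    if allowed_failures <= 0:
        raise ValueError("no error budget: SLO is 100% or window is empty")
    return 1 - failed_requests / allowed_failures


def release_policy(remaining):
    """Illustrative policy: ship while budget remains, otherwise stabilize."""
    if remaining > 0:
        return "ship features"
    return "freeze features, focus on reliability"


# 99.9% SLO over 1,000,000 requests allows about 1,000 failures.
healthy = budget_remaining(0.999, 1_000_000, 400)     # about 0.6 left
overspent = budget_remaining(0.999, 1_000_000, 1500)  # about -0.5

print(release_policy(healthy))
print(release_policy(overspent))
```
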
## Implementing SLOs
- Start with loose SLOs
- Measure continuously
- Review and adjust
- Automate alerting
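
For the alerting step, one common approach (not the only one) is to alert on how fast the error budget is being consumed, often called the burn rate. The sketch below is a minimal version under assumed inputs: request counts for a recent window, and a threshold of 10 that is purely illustrative and would need tuning for a real service.

```python
def burn_rate(slo, total_requests, failed_requests):
    """Observed error rate divided by the budgeted error rate.

    1.0 means the budget is being spent exactly on schedule;
    higher values mean it will run out before the window ends.
    """
    if not 0 < slo < 1:
        raise ValueError("slo must be strictly between 0 and 1")
    if total_requests == 0:
        return 0.0  # no traffic, nothing burned
    return (failed_requests / total_requests) / (1 - slo)


def should_alert(slo, total_requests, failed_requests, threshold=10.0):
    """True when the budget is burning at or above the (assumed) threshold."""
    return burn_rate(slo, total_requests, failed_requests) >= threshold


# Against a 99.9% SLO: 0.5% errors burns about 5x, 2% errors about 20x.
print(should_alert(0.999, 10_000, 50))   # False
print(should_alert(0.999, 10_000, 200))  # True
```

Alerting on burn rate rather than on raw error counts keeps the alert tied to the SLO, so a change to the target automatically changes when the alert fires.
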

SLOs align engineering with user needs.
