IncidentOps Specification
Multi-tenant incident management API. The org context is embedded in the JWT, so URLs never carry an orgId.
Architecture
| Service | Stack | Purpose |
|---|---|---|
| api | FastAPI, asyncpg | REST API, JWT auth, RBAC |
| worker | Celery, Redis | Notifications, escalations |
| web | Next.js | Dashboard (future) |
Infrastructure: PostgreSQL, Redis, ingress-nginx, Helm/Skaffold
Auth
JWT Access Token Claims
sub: user_id (uuid)
org_id: active org (uuid)
org_role: admin | member | viewer
iss: issuer (configurable, default: incidentops)
aud: audience (configurable, default: incidentops-api)
jti: unique token ID (uuid)
iat: issued at (unix timestamp)
exp: expiration (unix timestamp)
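As an illustration of the claim set above, the following sketch builds the claims and signs them as a compact HS256 token using only the standard library. The helper names (`build_claims`, `encode_hs256`) are hypothetical and not part of the spec; a real service would more likely rely on an established JWT library, and this version only shows what ends up inside the token.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT compact format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_claims(user_id: str, org_id: str, org_role: str,
                 issuer: str = "incidentops",
                 audience: str = "incidentops-api",
                 expire_minutes: int = 15) -> dict:
    """Assemble the access-token claims listed above (hypothetical helper)."""
    now = int(time.time())
    return {
        "sub": user_id,
        "org_id": org_id,
        "org_role": org_role,  # admin | member | viewer
        "iss": issuer,
        "aud": audience,
        "jti": str(uuid.uuid4()),
        "iat": now,
        "exp": now + expire_minutes * 60,
    }


def encode_hs256(claims: dict, secret: str) -> str:
    """Sign the claims as header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

Because org_id and org_role travel inside the token, switching orgs means minting a new access token rather than changing a URL parameter.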
Refresh Token
- Opaque token returned in JSON (not cookie)
- Stored hashed in DB with active_org_id
- Rotated on refresh and org-switch
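The refresh-token rules above could be sketched roughly as follows. This is a minimal illustration, assuming SHA-256 as the hash and a plain dict standing in for the refresh_tokens table; the function names are hypothetical.

```python
import hashlib
import secrets


def hash_token(token: str) -> str:
    """Digest stored in the DB; the raw token is never persisted."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def new_refresh_token() -> tuple[str, str]:
    """Return (opaque token for the client, hex digest for the DB)."""
    token = secrets.token_urlsafe(32)
    return token, hash_token(token)


def rotate(store: dict[str, dict], presented: str, active_org_id: str) -> str:
    """Revoke the presented token and issue a replacement.

    `store` stands in for the refresh_tokens table (token hash -> row).
    Raises KeyError if the token is unknown or was already rotated.
    """
    store.pop(hash_token(presented))  # the old token cannot be reused
    token, digest = new_refresh_token()
    store[digest] = {"active_org_id": active_org_id}
    return token
```

Under this sketch, a refresh passes the stored active_org_id through unchanged, while an org switch passes the newly selected org.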
Endpoints
| Endpoint | Description |
|---|---|
| POST /v1/auth/register | Create user + default org, return tokens |
| POST /v1/auth/login | Authenticate, return tokens |
| POST /v1/auth/refresh | Rotate refresh token, mint new access token |
| POST /v1/auth/switch-org | Change active org, rotate tokens |
| POST /v1/auth/logout | Revoke refresh token |
Authorization
Roles
| Role | Permissions |
|---|---|
| viewer | Read-only |
| member | + create incidents, transitions, comments |
| admin | + manage members, notification targets |
Enforcement
- Role check via dependency injection
- Ownership check: resource org_id must match JWT org_id
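The two checks above might look like the sketch below. It is framework-free so it can run on its own; in the api service these functions would presumably be wrapped in FastAPI dependencies. The names `ROLE_RANK`, `Forbidden`, `require_role`, and `require_same_org` are illustrative assumptions, as is the choice to treat roles as a strict hierarchy.

```python
# Assumed ordering: each role includes the permissions of the ones below it.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2}


class Forbidden(Exception):
    """Assumed to be translated into an HTTP 403 by the API layer."""


def require_role(claims: dict, minimum: str) -> None:
    """Role check: the token's org_role must be at least `minimum`."""
    if ROLE_RANK.get(claims.get("org_role"), -1) < ROLE_RANK[minimum]:
        raise Forbidden(f"requires role {minimum} or higher")


def require_same_org(claims: dict, resource_org_id: str) -> None:
    """Ownership check: the resource's org_id must match the JWT org_id."""
    if claims.get("org_id") != resource_org_id:
        raise Forbidden("resource belongs to a different org")
```

A route marked (member+) would call `require_role(claims, "member")`, and any route that loads a resource by ID would also call `require_same_org` before returning it.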
API Routes
All under /v1. Auth required unless noted.
Org (implicit from JWT)
GET /org — current org summary
GET /org/members (admin)
GET /org/services
POST /org/services (member+)
GET /org/notification-targets (admin)
POST /org/notification-targets (admin)
Incidents
GET /incidents?status=&cursor=&limit=
POST /services/{serviceId}/incidents (member+)
GET /incidents/{incidentId}
GET /incidents/{incidentId}/events
POST /incidents/{incidentId}/transition (member+)
POST /incidents/{incidentId}/comment (member+)
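The spec does not define what the cursor parameter of GET /incidents contains. One common approach is keyset pagination with an opaque cursor; the sketch below assumes the cursor encodes the (created_at, id) pair of the last item returned and that results are ordered newest first. The field names, the default limit, and the response shape are all assumptions, and an in-memory list stands in for the incidents table.

```python
import base64
import json
from typing import Optional


def encode_cursor(created_at: str, incident_id: str) -> str:
    """Pack the sort key of the last returned row into an opaque string."""
    raw = json.dumps([created_at, incident_id]).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> tuple[str, str]:
    created_at, incident_id = json.loads(base64.urlsafe_b64decode(cursor))
    return created_at, incident_id


def list_incidents(rows: list[dict], status: Optional[str] = None,
                   cursor: Optional[str] = None, limit: int = 20) -> dict:
    """Return one page, newest first, plus the cursor for the next page."""
    if status:
        rows = [r for r in rows if r["status"] == status]
    rows = sorted(rows, key=lambda r: (r["created_at"], r["id"]), reverse=True)
    if cursor:
        after = decode_cursor(cursor)
        # Keep only rows strictly older than the last row of the previous page.
        rows = [r for r in rows if (r["created_at"], r["id"]) < after]
    page = rows[:limit]
    has_more = len(rows) > limit
    next_cursor = (encode_cursor(page[-1]["created_at"], page[-1]["id"])
                   if has_more else None)
    return {"items": page, "next_cursor": next_cursor}
```

Including the ID in the sort key keeps the ordering stable when two incidents share a timestamp.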
Health
GET /healthz — liveness
GET /readyz — readiness (postgres + redis)
Incident State Machine
- Transitions validated at application level
- Optimistic locking via version column
- All changes recorded in incident_events
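The three rules above can be combined in a single transition function. The sketch below is illustrative only: the spec does not enumerate the incident states, so the state names and allowed transitions in `ALLOWED` are hypothetical, as are the column names. It uses the standard-library sqlite3 module so it runs standalone, whereas the real service uses PostgreSQL through asyncpg.

```python
import sqlite3

# Hypothetical states and transitions; the spec does not list them.
ALLOWED = {
    "triggered": {"acknowledged", "resolved"},
    "acknowledged": {"mitigated", "resolved"},
    "mitigated": {"resolved"},
    "resolved": set(),
}


class InvalidTransition(Exception):
    pass


class VersionConflict(Exception):
    pass


def transition(db: sqlite3.Connection, incident_id: str,
               to_status: str, expected_version: int) -> int:
    """Apply a transition and return the new version.

    Assumes tables incidents(id, status, version) and
    incident_events(incident_id, event_type, payload).
    """
    row = db.execute("SELECT status FROM incidents WHERE id = ?",
                     (incident_id,)).fetchone()
    if row is None or to_status not in ALLOWED.get(row[0], set()):
        raise InvalidTransition(f"cannot move to {to_status}")
    with db:  # the update and the event commit or roll back together
        cur = db.execute(
            "UPDATE incidents SET status = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (to_status, incident_id, expected_version),
        )
        if cur.rowcount != 1:
            # Someone else changed the incident since the caller read it.
            raise VersionConflict("incident was modified concurrently")
        db.execute(
            "INSERT INTO incident_events (incident_id, event_type, payload) "
            "VALUES (?, 'status_changed', ?)",
            (incident_id, f"{row[0]}->{to_status}"),
        )
    return expected_version + 1
```

The `WHERE ... AND version = ?` clause is the optimistic lock: a stale writer matches zero rows and gets a conflict instead of silently overwriting a newer state.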
Database Schema
| Table | Purpose |
|---|---|
| users | User accounts |
| orgs | Organizations |
| org_members | User-org membership + role |
| services | Org-scoped services |
| incidents | Org-scoped incidents with version |
| incident_events | Append-only timeline |
| refresh_tokens | Token rotation + active org |
| notification_targets | Webhook/email/slack configs |
| notification_attempts | Delivery tracking (idempotent) |
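The schema does not spell out how notification_attempts achieves idempotency. One plausible approach, sketched here under the assumption of a unique constraint on (incident_id, target_id), is to let the database reject duplicate claims so a re-run task does not deliver twice. The column names and the `claim_delivery` helper are hypothetical; sqlite3 is used so the example runs standalone, and the same ON CONFLICT clause is also valid PostgreSQL.

```python
import sqlite3

# Assumed shape of the table; only the unique constraint matters here.
SCHEMA = """
CREATE TABLE notification_attempts (
    incident_id TEXT NOT NULL,
    target_id   TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'pending',
    UNIQUE (incident_id, target_id)
)
"""


def claim_delivery(db: sqlite3.Connection, incident_id: str, target_id: str) -> bool:
    """Return True only for the first caller to claim this delivery."""
    with db:
        cur = db.execute(
            "INSERT INTO notification_attempts (incident_id, target_id) "
            "VALUES (?, ?) ON CONFLICT (incident_id, target_id) DO NOTHING",
            (incident_id, target_id),
        )
    # rowcount is 0 when the row already existed and nothing was inserted.
    return cur.rowcount == 1
```

A worker would send the notification only when `claim_delivery` returns True, which makes task redelivery harmless.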
Background Jobs (Celery)
| Task | Queue | Purpose |
|---|---|---|
| incident_triggered | default | Fan-out to notification targets |
| send_webhook | default | HTTP POST with retry |
| escalate_if_unacked | critical | Delayed escalation (stretch) |
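The retry policy for send_webhook is not specified. As a rough sketch of "HTTP POST with retry", the code below retries with exponential backoff; the retry count, base delay, and cap are assumptions, and the HTTP call itself is injected as a callable so the logic stays independent of any HTTP client. In the worker this behaviour would normally be delegated to Celery's own task retry mechanism rather than a hand-written loop.

```python
import time
from typing import Callable


def backoff_delays(max_retries: int, base: float = 2.0, cap: float = 300.0) -> list[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * (2 ** i)) for i in range(max_retries)]


def send_with_retry(send: Callable[[], int], max_retries: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `send` until it returns a 2xx status or the retries run out.

    `send` performs the HTTP POST and returns the response status code.
    """
    for delay in [0.0] + backoff_delays(max_retries):
        if delay:
            sleep(delay)
        try:
            if 200 <= send() < 300:
                return True
        except OSError:
            pass  # network failure: counts as a failed attempt
    return False
```

Passing `sleep` as a parameter keeps the function testable without real waiting.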
Config (Environment)
| Variable | Required | Default |
|---|---|---|
| DATABASE_URL | Yes | (none) |
| REDIS_URL | No | redis://localhost:6379/0 |
| JWT_SECRET_KEY | Yes | (none) |
| JWT_ALGORITHM | No | HS256 |
| JWT_ISSUER | No | incidentops |
| JWT_AUDIENCE | No | incidentops-api |
| ACCESS_TOKEN_EXPIRE_MINUTES | No | 15 |
| REFRESH_TOKEN_EXPIRE_DAYS | No | 30 |
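A minimal sketch of loading this configuration, assuming a plain dataclass rather than whatever settings library the project actually uses. The variable names and defaults come from the table above; the `Settings` class and `load_settings` function are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    jwt_secret_key: str
    redis_url: str = "redis://localhost:6379/0"
    jwt_algorithm: str = "HS256"
    jwt_issuer: str = "incidentops"
    jwt_audience: str = "incidentops-api"
    access_token_expire_minutes: int = 15
    refresh_token_expire_days: int = 30


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, failing fast on missing required values."""
    missing = [k for k in ("DATABASE_URL", "JWT_SECRET_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required environment variables: {', '.join(missing)}")
    return Settings(
        database_url=env["DATABASE_URL"],
        jwt_secret_key=env["JWT_SECRET_KEY"],
        redis_url=env.get("REDIS_URL", Settings.redis_url),
        jwt_algorithm=env.get("JWT_ALGORITHM", Settings.jwt_algorithm),
        jwt_issuer=env.get("JWT_ISSUER", Settings.jwt_issuer),
        jwt_audience=env.get("JWT_AUDIENCE", Settings.jwt_audience),
        access_token_expire_minutes=int(
            env.get("ACCESS_TOKEN_EXPIRE_MINUTES", Settings.access_token_expire_minutes)),
        refresh_token_expire_days=int(
            env.get("REFRESH_TOKEN_EXPIRE_DAYS", Settings.refresh_token_expire_days)),
    )
```

Failing at startup when DATABASE_URL or JWT_SECRET_KEY is absent surfaces misconfiguration before the first request does.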
Development
Use uv for all Python operations:
Project Structure