# IncidentOps Specification

Multi-tenant incident management API. Org context embedded in JWT; no `orgId` in URLs.

## Architecture

| Service | Stack | Purpose |
|---------|-------|---------|
| **api** | FastAPI, asyncpg | REST API, JWT auth, RBAC |
| **worker** | Celery, Redis | Notifications, escalations |
| **web** | Next.js | Dashboard (future) |

**Infrastructure:** PostgreSQL, Redis, ingress-nginx, Helm/Skaffold

## Auth

### JWT Access Token Claims

- `sub`: user_id (uuid)
- `org_id`: active org (uuid)
- `org_role`: `admin | member | viewer`
- `iss`: issuer (configurable, default: `incidentops`)
- `aud`: audience (configurable, default: `incidentops-api`)
- `jti`: unique token ID (uuid)
- `iat`: issued at (unix timestamp)
- `exp`: expiration (unix timestamp)

### Refresh Token

- Opaque token returned in JSON (not cookie)
- Stored hashed in DB with `active_org_id`
- Rotated on refresh and org-switch

### Endpoints

| Endpoint | Description |
|----------|-------------|
| `POST /v1/auth/register` | Create user + default org, return tokens |
| `POST /v1/auth/login` | Authenticate, return tokens |
| `POST /v1/auth/refresh` | Rotate refresh token, mint new access token |
| `POST /v1/auth/switch-org` | Change active org, rotate tokens |
| `POST /v1/auth/logout` | Revoke refresh token |

## Authorization

### Roles

| Role | Permissions |
|------|-------------|
| viewer | Read-only |
| member | + create incidents, transitions, comments |
| admin | + manage members, notification targets |

### Enforcement

- Role check via dependency injection
- Ownership check: resource `org_id` must match JWT `org_id`

## API Routes

All under `/v1`. Auth required unless noted.

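
The role annotations on the routes in this section (`admin`, `member+`) rely on the two enforcement rules from the Authorization section: a role check and an ownership check. The following is a minimal sketch of those two checks in plain Python, not the project's actual code. `TokenContext`, `Forbidden`, `require_role`, and `require_same_org` are hypothetical names introduced for illustration, and the numeric role ranking is an assumption derived from the cumulative permissions in the Roles table.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4

# Assumption: roles are strictly cumulative (viewer < member < admin),
# as the "+" entries in the Roles table suggest.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2}


@dataclass(frozen=True)
class TokenContext:
    """Subset of access-token claims used for authorization (hypothetical type)."""

    user_id: UUID   # from the `sub` claim
    org_id: UUID    # from the `org_id` claim
    org_role: str   # from the `org_role` claim


class Forbidden(Exception):
    """Raised when a role or ownership check fails (hypothetical exception)."""


def require_role(ctx: TokenContext, minimum: str) -> None:
    """Role check: the caller's role must rank at least as high as `minimum`."""
    # Unknown roles rank below viewer, so they are always rejected.
    if ROLE_RANK.get(ctx.org_role, -1) < ROLE_RANK[minimum]:
        raise Forbidden(f"requires role {minimum} or higher")


def require_same_org(ctx: TokenContext, resource_org_id: UUID) -> None:
    """Ownership check: the resource's org_id must match the JWT org_id."""
    if resource_org_id != ctx.org_id:
        raise Forbidden("resource belongs to a different org")


if __name__ == "__main__":
    org = uuid4()
    ctx = TokenContext(user_id=uuid4(), org_id=org, org_role="member")
    require_role(ctx, "member")   # passes: member satisfies member+
    require_same_org(ctx, org)    # passes: same org
    try:
        require_role(ctx, "admin")
    except Forbidden as exc:
        print(f"denied: {exc}")
```

In a FastAPI application, checks like these would typically be wrapped in dependencies so that each route declares its minimum role, which matches the "role check via dependency injection" rule above.
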
### Org (implicit from JWT)

- `GET /org`: current org summary
- `GET /org/members` (admin)
- `GET /org/services`
- `POST /org/services` (member+)
- `GET /org/notification-targets` (admin)
- `POST /org/notification-targets` (admin)

### Incidents

- `GET /incidents?status=&cursor=&limit=`
- `POST /services/{serviceId}/incidents` (member+)
- `GET /incidents/{incidentId}`
- `GET /incidents/{incidentId}/events`
- `POST /incidents/{incidentId}/transition` (member+)
- `POST /incidents/{incidentId}/comment` (member+)

### Health

- `GET /healthz`: liveness
- `GET /readyz`: readiness (postgres + redis)

## Incident State Machine

```
Triggered → Acknowledged → Mitigated → Resolved
```

- Transitions validated at application level
- Optimistic locking via `version` column
- All changes recorded in `incident_events`

## Database Schema

| Table | Purpose |
|-------|---------|
| `users` | User accounts |
| `orgs` | Organizations |
| `org_members` | User-org membership + role |
| `services` | Org-scoped services |
| `incidents` | Org-scoped incidents with version |
| `incident_events` | Append-only timeline |
| `refresh_tokens` | Token rotation + active org |
| `notification_targets` | Webhook/email/slack configs |
| `notification_attempts` | Delivery tracking (idempotent) |

## Background Jobs (Celery)

| Task | Queue | Purpose |
|------|-------|---------|
| `incident_triggered` | default | Fan-out to notification targets |
| `send_webhook` | default | HTTP POST with retry |
| `escalate_if_unacked` | critical | Delayed escalation (stretch) |

## Config (Environment)

| Variable | Required | Default |
|----------|----------|---------|
| `DATABASE_URL` | Yes | - |
| `REDIS_URL` | No | `redis://localhost:6379/0` |
| `JWT_SECRET_KEY` | Yes | - |
| `JWT_ALGORITHM` | No | `HS256` |
| `JWT_ISSUER` | No | `incidentops` |
| `JWT_AUDIENCE` | No | `incidentops-api` |
| `ACCESS_TOKEN_EXPIRE_MINUTES` | No | `15` |
| `REFRESH_TOKEN_EXPIRE_DAYS` | No | `30` |

## Development

Use `uv` for all Python
operations:

```bash
# Install dependencies
uv sync

# Run tests
uv run pytest tests/

# Run the API server
uv run uvicorn app.main:app --reload

# Run migrations
uv run python migrations/migrate.py
```

## Project Structure

```
incidentops/
├── app/
│   ├── main.py           # FastAPI entry
│   ├── config.py         # pydantic-settings
│   ├── db.py             # asyncpg pool
│   ├── core/             # security, exceptions
│   ├── api/v1/           # route handlers
│   ├── schemas/          # pydantic models
│   ├── repositories/     # data access
│   └── services/         # business logic
├── worker/
│   ├── celery_app.py
│   └── tasks/
├── migrations/
│   └── *.sql + migrate.py
├── helm/
├── Dockerfile
├── docker-compose.yml
└── pyproject.toml
```
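
## Appendix: Transition Sketch

The rules in the Incident State Machine section (application-level validation plus optimistic locking via `version`) can be illustrated with a small in-memory sketch. It is not the project's implementation. The lowercase status strings, the restriction to adjacent steps only, and the names `Incident`, `apply_transition`, `InvalidTransition`, and `VersionConflict` are all assumptions made for illustration. A real implementation backed by PostgreSQL would more likely enforce the version check with a conditional `UPDATE` on the `version` column.

```python
from dataclasses import dataclass, replace

# Assumption: only the adjacent step in
# Triggered -> Acknowledged -> Mitigated -> Resolved is allowed.
ALLOWED_TRANSITIONS = {
    "triggered": {"acknowledged"},
    "acknowledged": {"mitigated"},
    "mitigated": {"resolved"},
    "resolved": set(),
}


@dataclass(frozen=True)
class Incident:
    """Minimal stand-in for a row in `incidents` (hypothetical type)."""

    id: str
    status: str
    version: int


class InvalidTransition(Exception):
    """The requested status is not reachable from the current one."""


class VersionConflict(Exception):
    """The caller's expected version is stale (optimistic lock failure)."""


def apply_transition(incident: Incident, to_status: str, expected_version: int) -> Incident:
    """Validate a transition and return the updated incident with a bumped version."""
    # Optimistic lock: reject writes based on an outdated read.
    if incident.version != expected_version:
        raise VersionConflict(
            f"expected version {expected_version}, found {incident.version}"
        )
    # Application-level validation of the state machine.
    if to_status not in ALLOWED_TRANSITIONS.get(incident.status, set()):
        raise InvalidTransition(f"{incident.status} -> {to_status} is not allowed")
    return replace(incident, status=to_status, version=incident.version + 1)


if __name__ == "__main__":
    incident = Incident(id="inc-1", status="triggered", version=1)
    acked = apply_transition(incident, "acknowledged", expected_version=1)
    print(acked)
```

Each successful transition would also append a row to `incident_events`, which this sketch leaves out.
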