incidentops/worker/celery_app.py
minhtrannhat 46ede7757d feat: add observability stack and background task infrastructure
Add OpenTelemetry instrumentation with distributed tracing and metrics:
- Structured JSON logging with trace context correlation
- Auto-instrumentation for FastAPI, asyncpg, httpx, redis
- OTLP exporter for traces and Prometheus metrics endpoint

Implement Celery worker and notification task system:
- Celery app with Redis/SQS broker support and configurable queues
- Notification tasks for incident fan-out, webhooks, and escalations
- Pluggable TaskQueue abstraction with in-memory driver for testing

Add Grafana observability stack (Loki, Tempo, Prometheus, Grafana):
- OpenTelemetry Collector for receiving OTLP traces and logs
- Tempo for distributed tracing backend
- Loki for log aggregation with Promtail DaemonSet
- Prometheus for metrics scraping with RBAC configuration
- Grafana with pre-provisioned datasources and API overview dashboard
- Helm templates for all observability components

Enhance application infrastructure:
- Global exception handlers with structured ErrorResponse schema
- Request logging middleware with timing metrics
- Health check updated to verify task queue connectivity
- Non-root user in Dockerfile for security
- Init containers in Helm deployments for dependency ordering
- Production Helm values with autoscaling and retention policies
2026-01-07 20:51:13 -05:00

"""Celery application configured for IncidentOps."""
from __future__ import annotations
from celery import Celery
from kombu import Queue
from app.config import settings
celery_app = Celery("incidentops")
celery_app.conf.update(
broker_url=settings.resolved_task_queue_broker_url,
task_default_queue=settings.task_queue_default_queue,
task_queues=(
Queue(settings.task_queue_default_queue),
Queue(settings.task_queue_critical_queue),
),
task_routes={
"worker.tasks.notifications.escalate_if_unacked": {
"queue": settings.task_queue_critical_queue
},
},
task_serializer="json",
accept_content=["json"],
timezone="UTC",
enable_utc=True,
)
if settings.task_queue_backend == "sqs":
celery_app.conf.broker_transport_options = {
"region": settings.aws_region or "us-east-1",
"visibility_timeout": settings.task_queue_visibility_timeout,
"polling_interval": settings.task_queue_polling_interval,
}
celery_app.autodiscover_tasks(["worker.tasks"])
__all__ = ["celery_app"]
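
The queue selection and the SQS-only transport options above can be exercised without Celery or a broker. The following is a minimal sketch, not the project's own code: `QueueSettings`, `build_transport_options`, `resolve_queue`, and every default value are hypothetical stand-ins, and only the field names and the `us-east-1` fallback come from the module above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class QueueSettings:
    """Hypothetical stand-in for app.config.settings; defaults are assumptions."""

    task_queue_backend: str = "redis"
    task_queue_default_queue: str = "default"
    task_queue_critical_queue: str = "critical"
    task_queue_visibility_timeout: int = 600
    task_queue_polling_interval: float = 1.0
    aws_region: str | None = None


def build_transport_options(cfg: QueueSettings) -> dict[str, Any]:
    """Return transport options for SQS, and an empty dict for any other backend."""
    if cfg.task_queue_backend != "sqs":
        return {}
    return {
        # Same fallback as the module above when no region is configured.
        "region": cfg.aws_region or "us-east-1",
        "visibility_timeout": cfg.task_queue_visibility_timeout,
        "polling_interval": cfg.task_queue_polling_interval,
    }


def resolve_queue(task_name: str, cfg: QueueSettings) -> str:
    """Mimic the routing table: routed tasks use their queue, others the default."""
    routes = {
        "worker.tasks.notifications.escalate_if_unacked": {
            "queue": cfg.task_queue_critical_queue
        },
    }
    return routes.get(task_name, {}).get("queue", cfg.task_queue_default_queue)


if __name__ == "__main__":
    cfg = QueueSettings(task_queue_backend="sqs")
    print(build_transport_options(cfg))
    print(resolve_queue("worker.tasks.notifications.escalate_if_unacked", cfg))
```

Keeping this logic in plain functions makes it easy to unit-test the configuration for each backend before handing the values to `celery_app.conf`.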