Add OpenTelemetry instrumentation with distributed tracing and metrics: - Structured JSON logging with trace context correlation - Auto-instrumentation for FastAPI, asyncpg, httpx, redis - OTLP exporter for traces and Prometheus metrics endpoint Implement Celery worker and notification task system: - Celery app with Redis/SQS broker support and configurable queues - Notification tasks for incident fan-out, webhooks, and escalations - Pluggable TaskQueue abstraction with in-memory driver for testing Add Grafana observability stack (Loki, Tempo, Prometheus, Grafana): - OpenTelemetry Collector for receiving OTLP traces and logs - Tempo for distributed tracing backend - Loki for log aggregation with Promtail DaemonSet - Prometheus for metrics scraping with RBAC configuration - Grafana with pre-provisioned datasources and API overview dashboard - Helm templates for all observability components Enhance application infrastructure: - Global exception handlers with structured ErrorResponse schema - Request logging middleware with timing metrics - Health check updated to verify task queue connectivity - Non-root user in Dockerfile for security - Init containers in Helm deployments for dependency ordering - Production Helm values with autoscaling and retention policies
42 lines
719 B
YAML
42 lines
719 B
YAML
auth_enabled: false
|
|
|
|
server:
|
|
http_listen_port: 3100
|
|
grpc_listen_port: 9096
|
|
|
|
common:
|
|
path_prefix: /loki
|
|
storage:
|
|
filesystem:
|
|
chunks_directory: /loki/chunks
|
|
rules_directory: /loki/rules
|
|
replication_factor: 1
|
|
ring:
|
|
kvstore:
|
|
store: inmemory
|
|
|
|
query_range:
|
|
results_cache:
|
|
cache:
|
|
embedded_cache:
|
|
enabled: true
|
|
max_size_mb: 100
|
|
|
|
schema_config:
|
|
configs:
|
|
- from: "2020-10-24"
|
|
store: tsdb
|
|
object_store: filesystem
|
|
schema: v13
|
|
index:
|
|
prefix: index_
|
|
period: 24h
|
|
|
|
ruler:
|
|
alertmanager_url: http://localhost:9093
|
|
|
|
limits_config:
|
|
retention_period: 168h # 7 days
|
|
allow_structured_metadata: true
|
|
volume_enabled: true
|