# Observability

> **Honest assessment**: observability in this service is **minimal** today. This document records what exists, what does not, and what the natural next steps are. It is intentionally short: there is not much to describe.

## What exists

### Logging

- **Provider**: ASP.NET Core defaults (Console + Debug providers via `Microsoft.Extensions.Logging`). No Serilog, no NLog, no structured logging.
- **Format**: plain text to stdout (Console provider). Docker collects via the standard container log stream; `docker logs missions` reveals everything.
- **Application-emitted logs** (only):

  | Source | Level | When | Message |
  |--------|-------|------|---------|
  | `06_http_conventions/ErrorHandlingMiddleware` | `ERROR` | Unhandled exception caught by the catch-all branch | `"Unhandled exception"` (with stack trace via `LogError(ex, ...)`) |

  No `INFO`, `DEBUG`, `WARN` logs are emitted by application code.
- **Framework logs**: `JwtBearerHandler` and ASP.NET Core's request pipeline log token-validation outcomes and request lifecycle at `Information` / `Debug` levels (`05_identity` § Logging). LinqToDB does NOT log SQL by default (`04_persistence` Caveats #7).

### Metrics

- **None.** No `Microsoft.Extensions.Diagnostics.Metrics` consumption, no Prometheus / OpenTelemetry exporters, no application-level counters.

### Tracing

- **None.** No `Activity` / `OpenTelemetry` instrumentation. No correlation IDs, no W3C `traceparent` propagation.

### Health endpoint

- `GET /health` → `{ "status": "healthy" }` (Flow F7). Confirms process liveness + HTTP pipeline serving. **Does NOT verify DB connectivity** today.

### Build-time metadata

- `AZAION_REVISION` env var baked from `CI_COMMIT_SHA` at build time (`Dockerfile`). Visible via `docker inspect` or `env` inside the running container, but **not surfaced via any HTTP endpoint** today.

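
For orientation, the current surface described above (default logging providers, one `ERROR` log from a catch-all handler, and a liveness-only `/health`) can be pictured with a sketch like the following. It is not the service's actual source: it assumes a `Microsoft.NET.Sdk.Web` project with implicit usings, the inline middleware is a hypothetical stand-in for `ErrorHandlingMiddleware`, and the `500` response is an assumption about what the real handler returns.

```csharp
// Illustrative sketch only; not the service's actual Program.cs.
var builder = WebApplication.CreateBuilder(args); // default logging providers, including Console and Debug

var app = builder.Build();

// Hypothetical stand-in for ErrorHandlingMiddleware: the only place
// application code writes a log entry.
app.Use(async (context, next) =>
{
    try
    {
        await next(context);
    }
    catch (Exception ex)
    {
        // Passing the exception makes the logger include the stack trace.
        app.Logger.LogError(ex, "Unhandled exception");

        // Assumption: the real middleware's response shape is not documented here.
        if (!context.Response.HasStarted)
        {
            context.Response.StatusCode = StatusCodes.Status500InternalServerError;
        }
    }
});

// Liveness only: no database call is made.
app.MapGet("/health", () => Results.Ok(new { status = "healthy" }));

app.Run();
```

With this shape, `docker logs missions` would show framework request-lifecycle lines plus the occasional `Unhandled exception` entry, and nothing else from application code.
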
## What does not exist (carry-forward)

### Correlation / request tracing

- No request ID generation (would be a roughly 10-line middleware: emit `X-Request-Id` if absent, propagate to logs).
- No client-supplied correlation ID propagation.
- No way to grep logs by anything other than timestamp.

### Structured logging

- Console-only plaintext means log aggregation (Loki / ELK / Splunk) has to parse free text. Switching to the built-in JSON console formatter (`AddJsonConsole()` from `Microsoft.Extensions.Logging.Console`) or to Serilog with a JSON sink would emit one JSON object per entry (timestamp, level, message, structured properties) and make downstream querying viable.

### Audit logging

- No per-request audit trail (which user / token did what). The JWT's `sub` / `user_id` claim is **not consumed** today (`05_identity` Caveats #2). Consuming it would unlock per-user attribution for vehicle changes and mission deletions.

### Application metrics

- No counters for: requests served, error rate, mission-delete cascade duration, `vehicles.is_default` race occurrences, JWT validation failure rate.
- No DB metrics: connection pool utilization, query latency p50 / p95 / p99, slow-query log.
- No SLO tracking: `architecture.md` § NFRs notes that no explicit latency budget is set.

### Distributed tracing

- The cross-service cascade (`missions` → `media` / `annotations` / `detection` rows in shared PG) would benefit from a single trace ID per cascade walk. Today no spans are recorded at all; the only unit of attribution is the HTTP request itself, which opaquely runs many SQL statements.

### DB liveness on `/health`

- Flow F7 § Future improvement notes the natural extension: `await db.ExecuteAsync("SELECT 1")` inside the `/health` handler. Today the migrator-at-startup is the only DB-availability gate.

## Natural next steps (in rough priority order)

1. **Request ID middleware, with the ID emitted in the `ErrorHandlingMiddleware` log and in a response header.** ~10 lines, immediately useful for production support.
2. **DB ping in `/health`**: flips `/health` from "process liveness" to "service readiness". A `SELECT 1` over a pooled connection should cost on the order of a millisecond per probe.
3. **Switch console logger to JSON formatter**: a one-off configuration change; downstream log aggregation becomes feasible.
4. **Surface `AZAION_REVISION` via a `/version` endpoint** (one line on top of the existing `MapGet`) so support knows which build is on the device without `docker inspect`.
5. **OpenTelemetry instrumentation**: once #1–4 are in, a minimal OTel exporter (HTTP server + LinqToDB SqlClient activity) would add tracing at little extra cost.

These are deferred to **post-rename** work (out of this Epic). The rename refactor (B5–B10) does not change observability, and the autodev BUILD pipeline (Steps 3–7 of the existing-code flow) will exercise the existing code as-is. Adding observability is a separate pass that should land alongside the testing work in autodev cycle 2 or later.
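
To make next steps 1, 3, and 4 concrete, here is a minimal sketch of how they might fit into a minimal-API `Program.cs`. It is an illustration under stated assumptions, not a proposed final implementation: it assumes a `Microsoft.NET.Sdk.Web` project with implicit usings, the `RequestId` scope key and the `/version` response shape are hypothetical, and the DB ping from step 2 is left out because it depends on how the LinqToDB connection is registered.

```csharp
// Sketch only; names marked "hypothetical" do not come from the service's source.
var builder = WebApplication.CreateBuilder(args);

// Step 3: switch the console provider to the built-in JSON formatter.
// IncludeScopes makes scope values (such as the request ID) appear in each entry.
builder.Logging.AddJsonConsole(options => options.IncludeScopes = true);

var app = builder.Build();

// Step 1: reuse a client-supplied X-Request-Id or generate one, echo it on
// the response, and attach it to every log entry written during the request.
app.Use(async (context, next) =>
{
    const string HeaderName = "X-Request-Id";

    // ToString() yields an empty string when the header is absent.
    string supplied = context.Request.Headers[HeaderName].ToString();
    string requestId = string.IsNullOrWhiteSpace(supplied)
        ? Guid.NewGuid().ToString("N")
        : supplied;

    context.Response.Headers[HeaderName] = requestId;

    // "RequestId" is a hypothetical scope key.
    using (app.Logger.BeginScope(new Dictionary<string, object> { ["RequestId"] = requestId }))
    {
        await next(context);
    }
});

// Step 4: expose the build revision baked into the image.
// The response shape is an assumption.
app.MapGet("/version", () => Results.Ok(new
{
    revision = Environment.GetEnvironmentVariable("AZAION_REVISION") ?? "unknown"
}));

app.Run();
```

One ordering detail matters for step 1: the logging scope only covers code that runs inside `next(context)`, so the request ID middleware has to be registered before the error-handling middleware. Otherwise the scope is already disposed by the time the `Unhandled exception` entry is written and the ID will be missing from it.
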