Oleksandr Bezdieniezhnykh 7025f4d075 refactor: enhance JWT authentication and CORS configuration
Updated JWT authentication to use configuration values instead of hardcoded secrets, improving security and flexibility. Enhanced CORS policy to conditionally allow origins based on configuration settings, with logging for permissive defaults. Updated README to reflect project renaming and clarify service context.
2026-05-14 19:48:25 +03:00


Observability

Honest assessment: observability in this service is minimal today. This document records what exists, what does not, and what the natural next steps are. It is intentionally short — there is not much to describe.

What exists

Logging

  • Provider: ASP.NET Core defaults (Console + Debug providers via Microsoft.Extensions.Logging). No Serilog, no NLog, no structured logging.

  • Format: plain text to stdout (Console provider). Docker collects via the standard container log stream; docker logs missions reveals everything.

  • Application-emitted logs (only):

    | Source | Level | When | Message |
    |---|---|---|---|
    | 06_http_conventions/ErrorHandlingMiddleware | ERROR | Unhandled exception caught by the catch-all branch | "Unhandled exception" (with stack trace via LogError(ex, ...)) |

    No INFO, DEBUG, WARN logs are emitted by application code.

  • Framework logs: JwtBearerHandler and ASP.NET Core's request pipeline log token-validation outcomes and request lifecycle at Information / Debug levels (05_identity § Logging). LinqToDB does NOT log SQL by default — 04_persistence Caveats #7.

Metrics

  • None. No Microsoft.Extensions.Diagnostics.Metrics consumption, no Prometheus / OpenTelemetry exporters, no application-level counters.

Tracing

  • None. No Activity / OpenTelemetry instrumentation. No correlation IDs, no W3C traceparent propagation.

Health endpoint

  • GET /health → { "status": "healthy" } (Flow F7). Confirms process liveness + HTTP pipeline serving. Does NOT verify DB connectivity today.

Build-time metadata

  • AZAION_REVISION env var baked from CI_COMMIT_SHA at build time (Dockerfile). Visible via docker inspect or env inside the running container, but not surfaced via any HTTP endpoint today.

What does not exist (carry-forward)

Correlation / request tracing

  • No request ID generation (would be a 5-line middleware: emit X-Request-Id if absent, propagate to logs).
  • No client-supplied correlation ID propagation.
  • No way to grep logs by anything other than timestamp.
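
The request ID middleware described above could look roughly like the following. This is a minimal sketch, not code from the service: the class name RequestIdMiddleware, the UseRequestId extension, and the "RequestId" logger category are all hypothetical. It reuses X-Request-Id when the client sends one, generates one otherwise, echoes it on the response, and opens a logging scope for the duration of the request.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;

// Hypothetical helper; nothing like this exists in the service today.
public static class RequestIdMiddleware
{
    public const string HeaderName = "X-Request-Id";

    // Reuse a client-supplied ID when present, otherwise mint a new one.
    public static string Resolve(string? incoming) =>
        string.IsNullOrWhiteSpace(incoming) ? Guid.NewGuid().ToString("N") : incoming.Trim();

    public static IApplicationBuilder UseRequestId(this IApplicationBuilder app) =>
        app.Use(async (HttpContext context, RequestDelegate next) =>
        {
            var id = Resolve(context.Request.Headers[HeaderName].ToString());
            context.Response.Headers[HeaderName] = id; // set before the response starts

            var logger = context.RequestServices
                .GetRequiredService<ILoggerFactory>()
                .CreateLogger("RequestId"); // category name is an assumption

            // The scope attaches RequestId to log entries written while it is open.
            // The console provider only prints scopes when IncludeScopes is enabled.
            using (logger.BeginScope(new Dictionary<string, object> { ["RequestId"] = id }))
            {
                await next(context);
            }
        });
}
```

For the ID to reach the ErrorHandlingMiddleware log entry, app.UseRequestId() would have to be registered before that middleware, so that the scope is still open when the exception is logged.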

Structured logging

  • Console-only plaintext means log aggregation (Loki / ELK / Splunk) has to parse free text. Switching to the built-in JSON console formatter (AddJsonConsole in Microsoft.Extensions.Logging.Console), or to Serilog with a JSON sink, would emit one JSON object per log entry (timestamp, level, category, message, plus structured state / properties) and make downstream querying viable.
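
As an illustration of how small the switch is with the built-in formatter, a Program.cs fragment might look like this. It is a sketch under the assumption that the service uses the standard WebApplication builder; the option values are illustrative choices, not settings taken from the codebase.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.Logging;

var builder = WebApplication.CreateBuilder(args);

// Switch the console provider from plain text to one JSON object per entry.
builder.Logging.AddJsonConsole(options =>
{
    options.IncludeScopes = true;    // include scope data (e.g. a request ID) in each entry
    options.UseUtcTimestamp = true;  // avoid device-local time zones in aggregated logs
    options.TimestampFormat = "O";   // round-trip ISO 8601 timestamp
});

var app = builder.Build();
app.Run();
```

The Debug provider is left untouched here; only the console output format changes.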

Audit logging

  • No per-request audit trail (which user / token did what). The JWT's sub / user_id claim is not consumed today — 05_identity Caveats #2. Adding it would unlock per-user attribution for vehicle changes and mission deletions.
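
Consuming the claim could start with a helper like the one below. This is a hedged sketch: AuditIdentity and ResolveActor are hypothetical names, and the fallback order is an assumption. The ClaimTypes.NameIdentifier lookup is included because JWT handlers in ASP.NET Core may remap sub to that claim type, depending on the inbound claim mapping configuration.

```csharp
using System.Security.Claims;

// Hypothetical helper for per-user attribution; not part of the current codebase.
public static class AuditIdentity
{
    // Returns the first identifying claim found, or "anonymous" when there is none.
    public static string ResolveActor(ClaimsPrincipal user) =>
        user.FindFirst("sub")?.Value
        ?? user.FindFirst(ClaimTypes.NameIdentifier)?.Value // "sub" after inbound claim mapping
        ?? user.FindFirst("user_id")?.Value
        ?? "anonymous";
}
```

A handler could then log something like logger.LogInformation("Audit {Actor} {Action} {Resource}", AuditIdentity.ResolveActor(context.User), "delete", missionId), where the message template and the missionId variable are again illustrative.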

Application metrics

  • No counters for: requests served, error rate, mission-delete cascade duration, vehicles.is_default race occurrences, JWT validation failure rate.
  • No DB metrics: connection pool utilization, query latency p50 / p95 / p99, slow-query log.
  • No SLO tracking — architecture.md § NFRs notes that no explicit latency budget is set.
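
A first set of application counters could be declared with System.Diagnostics.Metrics, as sketched below. The meter name, instrument names, and the MissionsMetrics class are assumptions made for illustration. Nothing is exported until a listener or exporter (for example OpenTelemetry) subscribes to the meter.

```csharp
using System.Diagnostics.Metrics;

// Hypothetical instrument definitions; all names are illustrative.
public static class MissionsMetrics
{
    public static readonly Meter Meter = new("Missions.Api", "1.0.0");

    // Monotonic counters for throughput and error rate.
    public static readonly Counter<long> RequestsServed =
        Meter.CreateCounter<long>("missions.requests.served");
    public static readonly Counter<long> RequestErrors =
        Meter.CreateCounter<long>("missions.requests.errors");

    // Histogram so that p50 / p95 / p99 can be derived downstream.
    public static readonly Histogram<double> DeleteCascadeMs =
        Meter.CreateHistogram<double>("missions.delete_cascade.duration", unit: "ms");
}
```

Call sites would then be one-liners such as MissionsMetrics.RequestsServed.Add(1) or MissionsMetrics.DeleteCascadeMs.Record(stopwatch.Elapsed.TotalMilliseconds).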

Distributed tracing

  • The cross-service cascade (missions → media / annotations / detection rows in shared PG) would benefit from a single trace ID per cascade walk. Today the whole cascade is at most one span (the HTTP request) that opaquely runs many SQL statements.
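
One way to break the cascade into child spans is System.Diagnostics.ActivitySource, sketched below. Everything here is hypothetical: the source name, the span names, the step list and its order, and the deleteRowsAsync delegate standing in for the real persistence calls. StartActivity returns null when no listener is attached, so the code is a no-op for tracing until an exporter is configured.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical tracing wrapper around the mission-delete cascade.
public static class CascadeTracing
{
    public static readonly ActivitySource Source = new("Missions.Api.Cascade");

    // Step labels mirror the dependent row groups named in this document;
    // the real table names and deletion order are assumptions.
    private static readonly string[] Steps = { "media", "annotations", "detection" };

    public static async Task DeleteWithSpansAsync(Guid missionId, Func<string, Task> deleteRowsAsync)
    {
        // Parent span for the whole cascade walk.
        using var cascade = Source.StartActivity("mission.delete_cascade");
        cascade?.SetTag("mission.id", missionId);

        foreach (var step in Steps)
        {
            // One child span per step; disposed at the end of each iteration.
            using var span = Source.StartActivity($"delete.{step}");
            await deleteRowsAsync(step);
        }
    }
}
```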

DB liveness on /health

  • Flow F7 § Future improvement notes the natural extension: await db.ExecuteAsync("SELECT 1") inside the /health handler. Today the migrator-at-startup is the only DB-availability gate.
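
The readiness variant could be structured as below. This is a sketch under stated assumptions: HealthCheck and ProbeAsync are hypothetical names, the "unhealthy" status string and the 503 code are choices made here rather than anything defined in Flow F7, and the database call is passed in as a delegate so the snippet does not depend on a particular LinqToDB overload.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical readiness probe; the DB call is injected as a delegate.
public static class HealthCheck
{
    public static async Task<(int StatusCode, string Status)> ProbeAsync(
        Func<CancellationToken, Task> pingDb, TimeSpan timeout)
    {
        // The timeout only takes effect if the delegate honours the token.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await pingDb(cts.Token);
            return (200, "healthy");
        }
        catch (Exception)
        {
            // Any failure (connection refused, timeout, cancelled) means not ready.
            return (503, "unhealthy");
        }
    }
}
```

Inside the /health handler, the delegate would wrap the db.ExecuteAsync("SELECT 1") call mentioned above, and the handler would return the status string with the matching HTTP status code.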

Natural next steps (in rough priority order)

  1. Request ID middleware: generate the ID, include it in the ErrorHandlingMiddleware log entry, and return it in a response header. ~10 lines, immediately useful for production support.
  2. DB ping in /health — flips /health from "process liveness" to "service readiness". Costs <1ms per probe.
  3. Switch console logger to JSON formatter: a small, fixed-cost configuration change; downstream log aggregation becomes feasible.
  4. Surface AZAION_REVISION via a /version endpoint (one line on top of the existing MapGet) so support knows which build is on the device without docker inspect.
  5. OpenTelemetry instrumentation: once #1–4 are in, a minimal OTel exporter (HTTP server + LinqToDB SqlClient activity) would give tracing for free.
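
As an example of how small step 4 is, a /version endpoint could be sketched as follows. The VersionEndpoint class, the MapVersion extension, the response shape, and the "unknown" fallback are assumptions; only the AZAION_REVISION variable name comes from this document.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

// Hypothetical endpoint exposing the build revision baked into the image.
public static class VersionEndpoint
{
    // The environment lookup is injected so the fallback logic can be checked in isolation.
    public static string ReadRevision(Func<string, string?> getEnv) =>
        getEnv("AZAION_REVISION") is { Length: > 0 } revision ? revision : "unknown";

    public static void MapVersion(this WebApplication app) =>
        app.MapGet("/version", () =>
            Results.Ok(new { revision = ReadRevision(Environment.GetEnvironmentVariable) }));
}
```

Calling app.MapVersion() next to the existing /health mapping would be enough to expose it.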

These are deferred to post-rename work (outside this Epic). The rename refactor (B5–B10) does not change observability, and the autodev BUILD pipeline (Steps 3–7 of existing-code flow) will exercise the existing code as-is. Adding observability is a separate pass that should land alongside the testing work in autodev cycle 2 or later.