mirror of
https://github.com/azaion/missions.git
synced 2026-06-21 09:31:06 +00:00
refactor: enhance JWT authentication and CORS configuration
Updated JWT authentication to use configuration values instead of hardcoded secrets, improving security and flexibility. Enhanced CORS policy to conditionally allow origins based on configuration settings, with logging for permissive defaults. Updated README to reflect project renaming and clarify service context.
@@ -0,0 +1,75 @@
# Observability

> **Honest assessment**: observability in this service is **minimal** today. This document records what exists, what does not, and what the natural next steps are. It is intentionally short because there is not much to describe.

## What exists

### Logging

- **Provider**: ASP.NET Core defaults (Console + Debug providers via `Microsoft.Extensions.Logging`). No Serilog, no NLog, no structured logging.
- **Format**: plain text to stdout (Console provider). Docker collects it via the standard container log stream, so `docker logs missions` shows everything.
- **Application-emitted logs** (one entry only):

| Source | Level | When | Message |
|--------|-------|------|---------|
| `06_http_conventions/ErrorHandlingMiddleware` | `ERROR` | Unhandled exception caught by the catch-all branch | `"Unhandled exception"` (with stack trace via `LogError(ex, ...)`) |

No `INFO`, `DEBUG`, or `WARN` logs are emitted by application code.

- **Framework logs**: `JwtBearerHandler` and ASP.NET Core's request pipeline log token-validation outcomes and request lifecycle at `Information` / `Debug` levels (`05_identity` § Logging). LinqToDB does NOT log SQL by default (`04_persistence` Caveats #7).

### Metrics

- **None.** No `Microsoft.Extensions.Diagnostics.Metrics` consumption, no Prometheus / OpenTelemetry exporters, no application-level counters.

### Tracing

- **None.** No `Activity` / `OpenTelemetry` instrumentation. No correlation IDs, no W3C `traceparent` propagation.

### Health endpoint

- `GET /health` → `{ "status": "healthy" }` (Flow F7). Confirms process liveness + HTTP pipeline serving. **Does NOT verify DB connectivity** today.

### Build-time metadata

- `AZAION_REVISION` env var baked from `CI_COMMIT_SHA` at build time (`Dockerfile`). Visible via `docker inspect` or `env` inside the running container, but **not surfaced via any HTTP endpoint** today.

## What does not exist (carry-forward)

### Correlation / request tracing

- No request ID generation. This would be a small middleware (roughly 10 lines): emit `X-Request-Id` if absent and propagate it to logs.
- No client-supplied correlation ID propagation.
- No way to grep logs by anything other than timestamp.

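
A minimal sketch of the request-ID middleware described above, assuming ASP.NET Core's conventional middleware shape. `RequestIdMiddleware`, its `Resolve` helper, the `RequestId` scope key, and the 64-character cap are hypothetical; none of them exist in the service today.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

// Hypothetical middleware: not part of the current codebase.
public sealed class RequestIdMiddleware
{
    public const string HeaderName = "X-Request-Id";
    private const int MaxLength = 64; // assumed cap on client-supplied IDs

    private readonly RequestDelegate _next;
    private readonly ILogger<RequestIdMiddleware> _logger;

    public RequestIdMiddleware(RequestDelegate next, ILogger<RequestIdMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    // Reuse a client-supplied ID when it looks sane, otherwise mint a new one.
    public static string Resolve(string? incoming)
    {
        var candidate = incoming?.Trim();
        return string.IsNullOrEmpty(candidate) || candidate.Length > MaxLength
            ? Guid.NewGuid().ToString("N")
            : candidate;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var requestId = Resolve(context.Request.Headers[HeaderName].ToString());

        // Echo the ID back so clients and support can quote it.
        context.Response.Headers[HeaderName] = requestId;

        // Everything logged further down the pipeline carries the ID as scope data.
        using (_logger.BeginScope(new Dictionary<string, object> { ["RequestId"] = requestId }))
        {
            await _next(context);
        }
    }
}
```

It would be registered with `app.UseMiddleware<RequestIdMiddleware>()` ahead of the error-handling middleware, so the scope is still active when the catch-all branch logs. Note that the default console formatter only prints scopes when `IncludeScopes` is enabled.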
### Structured logging

- Console-only plaintext means log aggregation (Loki / ELK / Splunk) has to parse free text. Switching to the built-in JSON console formatter (`AddJsonConsole()` from the `Microsoft.Extensions.Logging.Console` package), or to Serilog with a JSON sink, would emit one JSON object per log entry (timestamp, level, message, structured properties) and make downstream querying viable.

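
A sketch of the built-in option, assuming the service keeps `Microsoft.Extensions.Logging` and does not adopt Serilog. The `JsonLoggingSetup` class and `UseJsonConsole` method are hypothetical names; `ClearProviders` and `AddJsonConsole` are the framework's own extension methods.

```csharp
using Microsoft.Extensions.Logging;

// Hypothetical helper: not part of the current codebase.
public static class JsonLoggingSetup
{
    public static ILoggingBuilder UseJsonConsole(this ILoggingBuilder logging)
    {
        // Drops the default providers (including Debug) so each entry is emitted once.
        logging.ClearProviders();

        logging.AddJsonConsole(options =>
        {
            options.IncludeScopes = true;    // include scope data (e.g. a request ID) in each entry
            options.UseUtcTimestamp = true;
            options.TimestampFormat = "O";   // ISO 8601 round-trip format
        });

        return logging;
    }
}
```

In `Program.cs` this would be a single call, `builder.Logging.UseJsonConsole()`, made before `builder.Build()`.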
### Audit logging

- No per-request audit trail (which user / token did what). The JWT's `sub` / `user_id` claim is **not consumed** today (`05_identity` Caveats #2). Adding it would unlock per-user attribution for vehicle changes and mission deletions.

### Application metrics

- No counters for: requests served, error rate, mission-delete cascade duration, `vehicles.is_default` race occurrences, JWT validation failure rate.
- No DB metrics: connection pool utilization, query latency p50 / p95 / p99, slow-query log.
- No SLO tracking; `architecture.md` § NFRs notes that no explicit latency budget is set.

### Distributed tracing

- The cross-service cascade (`missions` → `media` / `annotations` / `detection` rows in shared PG) would benefit from a single trace ID per cascade walk. Today the cascade is a single HTTP request that opaquely runs many SQL statements; with no tracing in place, nothing links those statements together.

### DB liveness on `/health`

- Flow F7 § Future improvement notes the natural extension: `await db.ExecuteAsync("SELECT 1")` inside the `/health` handler. Today the migrator-at-startup is the only DB-availability gate.

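
A sketch of how that ping could be wrapped so a slow or unreachable database fails the probe instead of hanging it. It assumes .NET 6+ (for `Task.WaitAsync`). `HealthProbe`, the `ping` delegate, and the `"unhealthy"` status string are hypothetical; in this service `ping` would wrap the `db.ExecuteAsync("SELECT 1")` call mentioned above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: not part of the current codebase.
public static class HealthProbe
{
    public static async Task<(bool Healthy, string Status)> CheckAsync(
        Func<CancellationToken, Task> ping, TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            // WaitAsync stops waiting on timeout even if the ping ignores the token.
            await ping(cts.Token).WaitAsync(cts.Token);
            return (true, "healthy");
        }
        catch (Exception)
        {
            // Timeout, connection failure, or any other DB error.
            return (false, "unhealthy");
        }
    }
}
```

The `/health` handler could then return 200 with `{ "status": "healthy" }` as it does today, and a 503 with the unhealthy status when the ping fails, which is what turns the endpoint into a readiness check.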
## Natural next steps (in rough priority order)

1. **Request ID middleware, with the ID emitted in the `ErrorHandlingMiddleware` log and in a response header.** ~10 lines, immediately useful for production support.
2. **DB ping in `/health`**: flips `/health` from "process liveness" to "service readiness". Costs one `SELECT 1` round-trip per probe (estimated at <1 ms).
3. **Switch console logger to JSON formatter**: a small, self-contained change; downstream log aggregation becomes feasible.
4. **Surface `AZAION_REVISION` via a `/version` endpoint** (one line on top of the existing `MapGet`) so support knows which build is on the device without `docker inspect`.
5. **OpenTelemetry instrumentation**: once #1–4 are in, a minimal OTel exporter (HTTP server + LinqToDB SqlClient activity) would add tracing with little extra effort.

These are deferred to **post-rename** work (out of this Epic). The rename refactor (B5–B10) does not change observability, and the autodev BUILD pipeline (Steps 3–7 of existing-code flow) will exercise the existing code as-is. Adding observability is a separate pass that should land alongside the testing work in autodev cycle 2 or later.
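
For step 4, a minimal sketch of the `/version` endpoint, assuming the service uses minimal-API routing as the existing `MapGet` suggests. `VersionEndpoint`, `MapVersion`, the `revision` field name, and the `"unknown"` fallback are hypothetical.

```csharp
using System;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Routing;

// Hypothetical helper: not part of the current codebase.
public static class VersionEndpoint
{
    // Falls back to "unknown" when the variable was not baked into the image.
    public static string ResolveRevision(string? raw) =>
        string.IsNullOrWhiteSpace(raw) ? "unknown" : raw.Trim();

    public static IEndpointRouteBuilder MapVersion(this IEndpointRouteBuilder endpoints)
    {
        endpoints.MapGet("/version", () => Results.Ok(new
        {
            revision = ResolveRevision(Environment.GetEnvironmentVariable("AZAION_REVISION"))
        }));
        return endpoints;
    }
}
```

It would sit next to the existing `/health` mapping as `app.MapVersion()`.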