This commit captures everything produced during autodev existing-code Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec), together with the targeted auth + CORS re-sync triggered on 2026-05-14 when codebase drift was detected at Step 4 entry. None of this work was previously committed.

Step 1 (Document): 50+ _docs/02_document/ files (problem, solution, architecture, system flows, glossary, module-layout, per-component specs (01..06), modules, deployment, diagrams, data model, FINAL report, verification log, discovery).

Step 2 (Architecture Baseline): architecture_compliance_baseline.md. Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec): _docs/02_document/tests/* (67 scenarios across blackbox, security, resilience, resource-limit, performance), plus e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh, scripts/run-performance-tests.sh. Coverage 88% over the active scope (40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync: replaces the deleted in-house token issuer with a JWKS-verifier model. AuthController and TokenService removed; JwtExtensions switched from HS256 symmetric to ES256 over admin's JWKS. ConfigurationResolver and CorsConfigurationValidator added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01, SEC-02, SEC-03 marked Closed. One new testability risk recorded in architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified): ES256, JWKS, alg pinning
- src/Program.cs (modified): DI wiring for ConfigurationResolver and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted): no in-service issuance
- src/Services/TokenService.cs (deleted): same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new): required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec captures the change-spec for downstream services that consumed the now-deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>

Observability
Source of truth: src/Program.cs (no dedicated logging config files exist in repo).
Health check
app.MapGet("/health", () => Results.Ok(new { status = "healthy" }));
- Path: GET /health
- Auth: none (MapGet bypasses controller-level [Authorize]).
- Response: 200 { "status": "healthy" }
- Liveness only: this endpoint does not probe the DB, RabbitMQ, or filesystem. A pod can return healthy while the failsafe outbox is unable to publish or while DB connectivity is broken.
API documentation
- app.UseSwagger() and app.UseSwaggerUI() mounted unconditionally (ADR-005).
- Endpoints: /swagger/v1/swagger.json (OpenAPI), /swagger/index.html (UI).
- No version pinning of the OpenAPI document (Swashbuckle defaults).
Logging
- Default ASP.NET Core console logger. No appsettings.json overrides in repo.
- No structured logger (Serilog / NLog) configured.
- No correlation id middleware in repo (X-Request-Id not propagated).
Metrics
None configured today. Possible additions:
- prometheus-net exporter on /metrics.
- ASP.NET Core MetricsCollector (built-in HTTP / runtime counters).
Traces
None configured. OpenTelemetry SDK is not referenced in csproj.
Image revision stamp
The runtime container has AZAION_REVISION = $CI_COMMIT_SHA set as an env var (Dockerfile + Woodpecker pipeline). This makes "what's running?" diagnosable from inside the container with printenv AZAION_REVISION or by surfacing it in a future /info endpoint.
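As a rough illustration of the "future /info endpoint" idea, the sketch below builds a small payload from AZAION_REVISION. Only the env var name comes from this document; the BuildInfo helper, the payload shape, and the "unknown" fallback are assumptions, and the endpoint wiring is shown as a comment because no such route exists in src/Program.cs today.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds the payload a future /info endpoint might return.
// The env lookup is injected so the logic can be exercised without touching
// the real process environment.
static Dictionary<string, string> BuildInfo(Func<string, string?> getEnv)
{
    var revision = getEnv("AZAION_REVISION");
    return new Dictionary<string, string>
    {
        // Assumption: fall back to "unknown" when the image was built without the stamp.
        ["revision"] = string.IsNullOrWhiteSpace(revision) ? "unknown" : revision.Trim(),
    };
}

// Possible wiring in src/Program.cs (not present in the repo today):
// app.MapGet("/info", () => Results.Ok(BuildInfo(Environment.GetEnvironmentVariable)));

var info = BuildInfo(Environment.GetEnvironmentVariable);
Console.WriteLine($"revision={info["revision"]}");
```

Injecting the lookup keeps the helper testable; whether /info should require auth is a separate decision that this sketch does not address.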
Error visibility to clients
ErrorHandlingMiddleware maps exceptions to JSON { statusCode, message } with HTTP 400 / 404 / 409 / 500. Internal exception details are not leaked beyond the message string (confirm during Step 4 verification — make sure 500 paths do not echo stack traces).
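The mapping described above could look roughly like the following. This is a sketch and not the actual ErrorHandlingMiddleware: the exception types used here (ArgumentException, KeyNotFoundException, InvalidOperationException) are stand-ins from the base class library, since this document does not list which exceptions the real middleware handles. The point it illustrates is the Step 4 check: the 500 branch returns a fixed message and never echoes the exception text or stack trace.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical mapping from exception type to (status code, client-facing message).
// The exception types are assumptions; only the 400 / 404 / 409 / 500 set is from the doc.
static (int StatusCode, string Message) MapException(Exception ex) => ex switch
{
    ArgumentException => (400, ex.Message),
    KeyNotFoundException => (404, ex.Message),
    InvalidOperationException => (409, ex.Message),
    // Unknown failures get a generic message so internals are not leaked.
    _ => (500, "Internal server error"),
};

// Serializes the { statusCode, message } body shape described above.
static string ToJson(Exception ex)
{
    var (statusCode, message) = MapException(ex);
    return JsonSerializer.Serialize(new { statusCode, message });
}

Console.WriteLine(ToJson(new KeyNotFoundException("record not found")));
Console.WriteLine(ToJson(new Exception("internal detail that must not reach clients")));
```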
Open observability items
- Readiness vs liveness split: today there is one endpoint that does not check dependencies. A GET /ready that pings DB and (optionally) RabbitMQ would unblock proper rolling-update gates.
- Structured logs with request id correlation across HTTP + outbox drain + SSE.
- Outbox depth metric (COUNT(*) on annotations_queue_records): surfaces stuck-failsafe scenarios early.
- SSE subscriber count metric: visibility into connected UIs.
- Stream publish lag: time from outbox row insertion to RabbitMQ publish.
- Failure injection / chaos hooks: none today.
These are candidates for the deploy-and-retro phase of autodev (Steps 14–17 once the project enters Phase B).
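For the readiness vs liveness item, one possible shape for a GET /ready aggregator is sketched below. Everything here is an assumption: the CheckReadinessAsync helper, the probe names, and the 2-second timeout are illustrative, and the probes are stubs where real DB and RabbitMQ pings would go. The idea is that each dependency probe runs under a timeout, and any failure, exception, or timeout marks the service as not ready.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical aggregator: runs each named probe with a timeout and reports
// overall readiness plus per-dependency results.
static async Task<(bool Ready, Dictionary<string, bool> Checks)> CheckReadinessAsync(
    IReadOnlyDictionary<string, Func<Task<bool>>> probes, TimeSpan timeout)
{
    var results = new Dictionary<string, bool>();
    foreach (var (name, probe) in probes)
    {
        try
        {
            // WaitAsync throws TimeoutException if the probe does not finish in time.
            results[name] = await probe().WaitAsync(timeout);
        }
        catch (Exception)
        {
            // A throwing or timed-out probe counts as "dependency not ready".
            results[name] = false;
        }
    }
    return (results.Values.All(ok => ok), results);
}

// Stub probes; real ones would ping the DB and (optionally) RabbitMQ.
var probes = new Dictionary<string, Func<Task<bool>>>
{
    ["db"] = () => Task.FromResult(true),
    ["rabbitmq"] = () => Task.FromResult(false),
};

// Possible wiring in src/Program.cs (not present in the repo today):
// app.MapGet("/ready", async () =>
// {
//     var (ok, checks) = await CheckReadinessAsync(probes, TimeSpan.FromSeconds(2));
//     return ok ? Results.Ok(checks) : Results.Json(checks, statusCode: 503);
// });

var (ready, checks) = await CheckReadinessAsync(probes, TimeSpan.FromSeconds(2));
Console.WriteLine($"ready={ready} db={checks["db"]} rabbitmq={checks["rabbitmq"]}");
```

Keeping /health as a dependency-free liveness probe and adding /ready alongside it lets an orchestrator hold traffic without restarting a pod whose dependencies are only temporarily unavailable.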