chore: update configuration and Docker setup for JWT and test results

Enhanced the .gitignore to exclude test results and updated the Dockerfile to include a new entrypoint script for improved container initialization. Refactored JWT configuration to support additional parameters for automatic refresh intervals, ensuring better control over token management. Updated the ConfigurationResolver to enforce required environment variables without hardcoded fallbacks, enhancing security and flexibility.
Oleksandr Bezdieniezhnykh
2026-05-15 03:23:23 +03:00
parent 7025f4d075
commit 78dea8ebab
40 changed files with 1990 additions and 510 deletions
+47 -27
@@ -7,36 +7,45 @@
## 1. Authentication
**Mechanism**: JWT bearer (HS256) with **local validation only** — this service never calls back to the issuing `admin` service.
**Mechanism**: JWT bearer (**ECDSA-SHA256**) with public-key validation against `admin`'s JWKS endpoint. After the first successful JWKS fetch, request-path validation is purely local (no per-request call to `admin`).
**Trust model**: a single shared HMAC secret (`JWT_SECRET`) is provisioned to `admin` (issuer) and to every backend service on each edge device (validators). Rotation requires a coordinated re-deploy across all of them.
**Trust model**: `admin` holds the ECDSA **private** key; every backend service on each edge device validates with the corresponding **public** keys, fetched from `admin`'s JWKS endpoint (`JWT_JWKS_URL`). Rotation publishes a new `kid` in the JWKS; consumers pick it up at the next refresh tick — **NO coordinated redeploy** required (one of the primary operational wins over the legacy HS256 model).
**Validation parameters** (`Auth/JwtExtensions.cs`):
| Parameter | Value | Notes |
|-----------|-------|-------|
| Algorithm | HS256 (`SymmetricSecurityKey(UTF-8(JWT_SECRET))`) | Symmetric → asymmetric switch is suite-wide concern, not in this Epic |
| `ValidAlgorithms` | `[SecurityAlgorithms.EcdsaSha256]` | Algorithm pin — defends against HS256-confusion attacks (an attacker who learns the JWKS public key cannot forge tokens signed with `alg: HS256` using that key as the HMAC secret) |
| `IssuerSigningKeyResolver` | Pulls keys from cached `JsonWebKeySet` retrieved via `ConfigurationManager<JsonWebKeySet>` | Lazily fetches on first protected request; cache refreshes on the manager's default schedule matched against admin's `Cache-Control: public, max-age=3600` |
| JWKS HTTP transport | `HttpDocumentRetriever { RequireHttps = true }` | HTTPS-only — a misconfigured `JWT_JWKS_URL = http://...` fails at fetch time, not at startup config resolution |
| `ValidateLifetime` | `true` | Tokens with `exp` in the past are rejected |
| `ClockSkew` | `TimeSpan.FromMinutes(1)` | Tighter than .NET's 5-min default |
| `ValidateIssuer` | **`false`** | Known CMMC L2 finding (suite-tracked AZ-487/AZ-494); consistent with shared-secret trust |
| `ValidateAudience` | **`false`** | Same finding as above |
| `ValidateIssuerSigningKey` | `true` | Always required when `ValidateLifetime`/`ValidateIssuer` are set explicitly |
| `ClockSkew` | `TimeSpan.FromSeconds(30)` | Tighter than .NET's 5-min default AND tighter than the legacy 1-min setting |
| `ValidateIssuer` | **`true`** with `ValidIssuer = <resolved JWT_ISSUER>` | **CMMC L2 row 3 finding structurally fixed in this service's code.** Suite-level docs may still describe the legacy "disabled" model and have a separate sync task pending |
| `ValidateAudience` | **`true`** with `ValidAudience = <resolved JWT_AUDIENCE>` | Same as above |
| `ValidateIssuerSigningKey` | `true` (implicit via `IssuerSigningKeyResolver`) | Required for asymmetric validation |
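The non-cryptographic rows of the table (algorithm pin, lifetime with 30s skew, `iss`, `aud`) can be sketched in a language-neutral way. This is an illustrative Python sketch, not the service's C# code in `Auth/JwtExtensions.cs`: the function name, the reason strings, and the check order are assumptions, signature verification is deliberately left out, and `"ES256"` is the JOSE name behind `SecurityAlgorithms.EcdsaSha256`.

```python
import time

ALLOWED_ALGS = {"ES256"}      # JOSE identifier for ECDSA-SHA256
CLOCK_SKEW_SECONDS = 30       # mirrors ClockSkew = 30s from the table


def check_token_metadata(header, claims, issuer, audience, now=None):
    """Return None when all non-cryptographic checks pass, else a reason string.

    Hypothetical helper: signature verification against the JWKS is NOT shown.
    """
    now = time.time() if now is None else now

    # Algorithm pin: reject anything outside the allow-list before any key lookup.
    if header.get("alg") not in ALLOWED_ALGS:
        return "alg not allowed"

    # Lifetime: a token is tolerated up to CLOCK_SKEW_SECONDS past its exp.
    exp = claims.get("exp")
    if exp is None or now > exp + CLOCK_SKEW_SECONDS:
        return "expired"

    if claims.get("iss") != issuer:
        return "issuer mismatch"

    # The aud claim may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        return "audience mismatch"

    return None
```

In the service each of these rejections surfaces as a 401; only a missing `permissions=FL` claim on an otherwise valid token yields a 403.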
**Failure outcomes**:
| Condition | HTTP code |
|-----------|-----------|
| Missing `Authorization` header | 401 |
| Invalid signature | 401 |
| Expired token (with 1-min skew) | 401 |
| Token signed with old `JWT_SECRET` after rotation | 401 (until coordinated re-deploy + re-login) |
| Valid signature + lifetime, but missing `permissions=FL` claim | 403 |
| Missing `Authorization` header on `[Authorize]` route | 401 |
| Invalid signature (no public key in cached JWKS verifies the token) | 401 |
| Token `kid` not in cached JWKS (rotation lag before refresh tick) | 401 (resolved on next JWKS refresh) |
| Token `alg` not in `[EcdsaSha256]` (e.g. forged `alg: HS256`) | 401 (algorithm pin) |
| Expired token (with 30s skew) | 401 |
| `iss` claim ≠ `JWT_ISSUER` | 401 |
| `aud` claim ≠ `JWT_AUDIENCE` | 401 |
| Valid signature + lifetime + iss + aud, but missing `permissions=FL` claim | 403 |
| `JWT_JWKS_URL` uses `http://` (not `https://`) | 500 on first protected request (HTTPS-only retriever throws); NOT caught at startup |
| `admin` unreachable AT the time of the first protected request after cold start | 500 on that first request (synchronous JWKS fetch fails); resolves once admin is reachable |
**`admin` outage**: tokens issued before the outage continue to validate locally. This service does **not** require `admin` to be reachable for any flow. Once issued tokens expire, new logins fail at `admin`'s end (UI concern), but this service stays up.
**`admin` outage AFTER JWKS cached**: tokens issued before the outage continue to validate locally against the cached public keys. This service does **not** require `admin` to be reachable for any request-path flow once the JWKS cache is warm. Once issued tokens expire, new logins fail at `admin`'s end (UI concern), but this service stays up until the cache itself expires and a refresh fails.
**`admin` outage AT cold start**: the first protected request triggers a synchronous JWKS HTTPS GET; if `admin` is unreachable at that moment, the request fails 500. This is a **new failure mode** introduced by the ECDSA-JWKS switch and is the cost of the rotation-without-redeploy operational win.
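The two outage cases above follow from lazy, cached key retrieval. The Python sketch below models that behaviour with an injected fetch function and clock. It is a simplified stand-in for `ConfigurationManager<JsonWebKeySet>`, not its actual refresh algorithm; the class name, the 3600-second lifetime, and the dictionary key format are assumptions made for illustration.

```python
import time


class JwksCache:
    """Lazily fetched key set with a fixed lifetime (hypothetical model)."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch          # callable returning the key set; may raise
        self._ttl = ttl_seconds      # assumed cache lifetime
        self._clock = clock
        self._keys = None
        self._fetched_at = None

    def get_keys(self):
        now = self._clock()
        fresh = self._keys is not None and now - self._fetched_at < self._ttl
        if not fresh:
            # Cold start or expired cache: a synchronous fetch sits on the
            # request path. If it raises, the caller sees the error (the
            # service maps this to a 500) and the cache stays unchanged.
            keys = self._fetch()
            self._keys = keys
            self._fetched_at = now
        return self._keys
```

With a warm cache, `get_keys()` never calls `fetch`, which is why an `admin` outage is invisible to request handling until the cached entry ages out.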
## 2. Authorization
**Single named policy**: `"FL"`. Every controller action in `01_vehicle_catalog` and `02_mission_planning` carries `[Authorize(Policy = "FL")]`. The policy is satisfied by a `permissions` claim with value `"FL"`.
**Single named policy**: `"FL"`. Every controller action in `01_vehicle_catalog` and `02_mission_planning` carries `[Authorize(Policy = "FL")]`. The policy is built as `AuthorizationPolicyBuilder.RequireClaim("permissions", "FL")`, which is satisfied when any `permissions` claim on the principal equals `"FL"`. A multi-permission token (`permissions: ["FL", "SOMETHING_ELSE"]`) is therefore accepted.
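The "any matching claim" semantics can be illustrated with a small Python sketch. It models the decoded token payload as a dictionary and is not the ASP.NET Core policy evaluator itself; the function name is hypothetical.

```python
def has_permission(claims, required="FL"):
    """True when any `permissions` value in the payload equals `required`."""
    value = claims.get("permissions")
    if value is None:
        return False
    # A payload may carry one permission as a string or several as a list.
    values = value if isinstance(value, list) else [value]
    return required in values
```

An authenticated caller for whom this check is false receives 403 rather than 401, matching the failure-outcomes table in § 1.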
**Role → permission matrix** is suite-level (`../../suite/_docs/00_roles_permissions.md`); this service does NOT enforce roles, only the `FL` permission.
@@ -59,9 +68,9 @@
**Secrets management**:
- `JWT_SECRET` and `DATABASE_URL` are env vars (with hardcoded dev fallbacks). See § 6 below.
- No secret manager (Vault, AWS SM, K8s Secrets) — secrets are baked into the device's docker compose env at provisioning time.
- No runtime gate prevents startup with the dev fallback in production (ADR-005 carry-forward).
- This service no longer holds a JWT signing secret. It holds **only** public-key configuration (`JWT_ISSUER`, `JWT_AUDIENCE`, `JWT_JWKS_URL`) plus `DATABASE_URL`. A compromise of this service therefore no longer exposes any key that can sign tokens; the signing key lives only on `admin`.
- All four required values (`DATABASE_URL`, `JWT_ISSUER`, `JWT_AUDIENCE`, `JWT_JWKS_URL`) are resolved through `Infrastructure/ConfigurationResolver.cs` `ResolveRequiredOrThrow` (env-var-first, then `IConfiguration` key, else THROW). **No hardcoded dev fallbacks**: the ADR-005 "dev fallback secret silently accepted in production" failure mode is structurally eliminated. ADR-005 now only covers the unconditional-Swagger branch.
- No secret manager (Vault, AWS SM, K8s Secrets) — config values are baked into the device's docker compose env at provisioning time.
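The resolution order described above (environment variable first, then configuration key, otherwise throw) can be sketched as follows. This is an illustrative Python model, not the C# `ResolveRequiredOrThrow`; the exception type, the `Jwt:Issuer` key name in the usage note, and the choice to treat an empty string as missing are all assumptions.

```python
import os


class MissingConfigurationError(RuntimeError):
    """Raised when a required value has no source (hypothetical type)."""


def resolve_required(env_name, config_key, config, env=None):
    """Return the env var if set, else the config value, else raise.

    There is intentionally no default argument: a fallback value is exactly
    the failure mode this pattern removes.
    """
    env = os.environ if env is None else env
    value = env.get(env_name)
    if value:                      # assumption: empty string counts as unset
        return value
    value = config.get(config_key)
    if value:
        return value
    raise MissingConfigurationError(
        f"Required configuration missing: set {env_name} or {config_key}"
    )
```

For example, `resolve_required("JWT_ISSUER", "Jwt:Issuer", config)` either returns a configured issuer or stops startup with an error that names both sources.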
## 4. Input validation
@@ -80,9 +89,13 @@ This is a **carry-forward concern** — input-shape testing is not a security ga
## 5. CORS
**Open in every environment**: `AllowAnyOrigin / AllowAnyMethod / AllowAnyHeader` in `Program.cs`. Spec does not mandate a CORS policy.
**Gated by `Infrastructure/CorsConfigurationValidator.cs`** at startup:
The closed edge network behind the suite reverse proxy is the deployment-shape mitigation. Worth confirming on the **first production rollout** that the upstream proxy whitelists origins; if not, this is a finding to surface.
- In `Production` (case-insensitive match on `ASPNETCORE_ENVIRONMENT`): startup **THROWS `InvalidOperationException`** when `CorsConfig:AllowedOrigins` is empty AND `CorsConfig:AllowAnyOrigin != true`. The "open in all environments" failure mode is structurally eliminated.
- In non-Production environments: the same empty allow-list with `AllowAnyOrigin=false` falls back to permissive (`AllowAnyOrigin/Method/Header`) AND emits a `PermissiveDefaultWarning` startup log. The pre-B11 "all environments permissive" assumption no longer holds.
- An explicit `AllowedOrigins` list narrows CORS to those origins in every environment.
The closed edge network behind the suite reverse proxy is still the deployment-shape backstop, but the application now refuses to start in production without an explicit policy decision.
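The three startup rules above reduce to a small decision function. The Python sketch below is a model of that decision, not the C# `CorsConfigurationValidator`; the return shape is hypothetical, and two behaviours are assumptions the text does not state: an explicit allow-list takes precedence over `AllowAnyOrigin = true`, and an explicit `AllowAnyOrigin = true` emits no warning.

```python
def resolve_cors_policy(environment, allowed_origins, allow_any_origin):
    """Decide the CORS mode at startup, or raise in Production with no policy."""
    if allowed_origins:
        # Explicit allow-list narrows CORS in every environment.
        return {"mode": "allow_list", "origins": list(allowed_origins), "warning": False}

    if allow_any_origin:
        # Explicit opt-in to permissive CORS (assumed to be silent).
        return {"mode": "permissive", "origins": ["*"], "warning": False}

    if environment.lower() == "production":   # case-insensitive match
        raise RuntimeError(
            "CorsConfig:AllowedOrigins is empty and CorsConfig:AllowAnyOrigin is not true"
        )

    # Non-Production with no policy: permissive fallback plus a startup warning.
    return {"mode": "permissive", "origins": ["*"], "warning": True}
```

The point of the design is that permissive CORS in production can only result from an explicit configuration choice, never from an omission.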
## 6. Production-deploy footguns
@@ -90,17 +103,23 @@ These are explicit security-relevant risks the code carries today, all tracked a
| Footgun | Where | Mitigation |
|---------|-------|------------|
| **Dev fallback for `JWT_SECRET`** silently accepted in production if env var unset | `Program.cs` (no `IsDevelopment` gate, ADR-005) | Suite-level remediation pending; recommend "fail-fast at startup if `JWT_SECRET` is unset OR equals the well-known dev fallback" |
| **Dev fallback for `DATABASE_URL`** silently accepted in production if env var unset | `Program.cs` | Same pattern as `JWT_SECRET`; misconfigured deploy hits localhost Postgres on the device, which usually doesn't exist → process exits, but the failure mode is loud (crash) not silent |
| **Swagger UI mounted unconditionally** | `Program.cs` (no `IsDevelopment` gate, ADR-005) | Reverse-proxy-level allowlist on `/swagger` is the suite-level mitigation; verify on first production rollout |
| **CORS `AllowAnyOrigin/Method/Header`** in production | `Program.cs` | Reverse-proxy origin whitelist is the suite-level mitigation |
| **Cold-start dependency on `admin` reachability**: first protected request after a cold start triggers a synchronous JWKS HTTPS GET against `JWT_JWKS_URL`. If `admin` is unreachable at that moment, the request fails 500. Once cached, request-path is local-only | `Auth/JwtExtensions.cs` `ConfigurationManager<JsonWebKeySet>` | Document operational expectation; consider pre-warming the JWKS cache during startup if cold-start failure modes become disruptive |
| **`JWT_JWKS_URL` misconfigured as `http://`** passes startup config resolution but fails at first JWKS fetch → 500 | `Auth/JwtExtensions.cs` `HttpDocumentRetriever { RequireHttps = true }` | Detected at runtime, not startup; recommend a startup smoke check that validates the URL scheme before serving traffic |
| **Swagger UI mounted unconditionally** | `Program.cs` (no `IsDevelopment` gate, ADR-005 surviving branch) | Reverse-proxy-level allowlist on `/swagger` is the suite-level mitigation; verify on first production rollout |
| **CORS allow-list empty in non-Production** falls back to permissive (`AllowAnyOrigin/Method/Header`) with a startup warning | `Infrastructure/CorsConfigurationValidator.cs` | Set an explicit `CorsConfig:AllowedOrigins` for staging/dev too if permissive is undesirable. **Production fails fast**, so no remediation is needed there |
| **No HTTPS redirection** | `Program.cs` (no `app.UseHttpsRedirection()`) | Reverse proxy enforces TLS upstream |
| **Stack trace logged for unhandled 500s** | `Middleware/ErrorHandlingMiddleware.cs` `LogError(ex, ...)` | Stack is logged only — NOT returned in the HTTP response body (the 500 body is the generic `"Internal server error"` message from middleware) |
| **JWT `iss`/`aud` validation disabled** | `Auth/JwtExtensions.cs` | CMMC L2 row 3 finding; tracked at suite level under AZ-487 / AZ-494 |
| **Cascade-delete is NOT transaction-wrapped** (data-integrity, not auth) | `Services/MissionService.cs`, `Services/WaypointService.cs` (ADR-006) | One-line fix queued; recommended to land with B6 |
| **Hardcoded permission string `"FL"`** in feature controllers | `Controllers/{Vehicles,Missions}Controller.cs` | Risk: typo silently turns into permanent 403; mitigation by code review + `module-layout.md` |
| **Permission code `"FL"` retains legacy "Flight" wording** post-rename | `Auth/JwtExtensions.cs` | Fleet-wide auth change deferred (not in this Epic); TODO in `../../suite/_docs/00_roles_permissions.md` |
**Removed from this list** (previously listed, now structurally fixed by code, not by mitigation):
- ❌ Dev fallback for `JWT_SECRET`: the `JWT_SECRET` env var is no longer consulted; the resolver throws on missing required config.
- ❌ Dev fallback for `DATABASE_URL` — same resolver throws on missing required config.
- ❌ CORS `AllowAnyOrigin/Method/Header` in production — production startup throws on empty allow-list with `AllowAnyOrigin != true`.
- ❌ JWT `iss`/`aud` validation disabled — both are now validated; CMMC L2 row 3 finding structurally fixed in this service's code.
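The footgun table recommends a startup smoke check on the `JWT_JWKS_URL` scheme, since an `http://` value currently fails only at the first JWKS fetch. A minimal Python sketch of such a check is shown here; it does not exist in the service today, and the function name and example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def require_https(url, name="JWT_JWKS_URL"):
    """Return `url` unchanged if it is an absolute https URL, else raise."""
    parts = urlsplit(url)
    # urlsplit lowercases the scheme; an empty netloc means no host was given.
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"{name} must be an absolute https:// URL, got {url!r}")
    return url
```

Calling this once during startup, right after required-config resolution, would move the misconfiguration from a runtime 500 to a startup failure.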
## 7. Audit logging
**None at the application level today.** The only structured log emitted by application code is `_logger.LogError(ex, "Unhandled exception")` in `ErrorHandlingMiddleware` for 500s. There is:
@@ -114,7 +133,7 @@ Production incident response on this service today requires grep-by-timestamp co
## 8. Threat model summary (one-paragraph)
The deployment shape — closed edge network, single operator per device, suite reverse proxy enforcing TLS and origin allowlisting upstream, Watchtower restart on crash — is the **primary defence-in-depth layer** for everything not handled by HS256 JWT validation and the `FL` permission gate. The known weak points (dev fallbacks not gated on `IsDevelopment()`, no input validation, no application-level audit log, `iss`/`aud` not validated) are documented and tracked, with the most critical ones (CMMC L2 row 3, default-vehicle race) under suite-level or B-ticket Jira IDs. This Epic (rename + GPS-Denied removal) does **not** change the security posture; it preserves every current invariant.
The deployment shape — closed edge network, single operator per device, suite reverse proxy enforcing TLS and origin allowlisting upstream, Watchtower restart on crash — remains the primary defence-in-depth layer, but the application-layer auth posture has materially improved: ECDSA-SHA256 JWT validation against `admin`'s JWKS (with algorithm pinning, `iss`/`aud` validation, HTTPS-only JWKS fetch, and a 30s clock skew), production-gated CORS, and fail-fast required-config resolution have collectively eliminated the dev-fallback, iss/aud-disabled, and CORS-permissive-in-prod footguns that the legacy HS256 model carried. Residual application-level weak points (no input validation, no per-user audit, hardcoded `"FL"` string, cold-start dependency on `admin` reachability for the first protected request, non-transactional cascade delete) are documented and tracked. The CMMC L2 row 3 finding is structurally closed in this service's code; suite-level documentation may still describe the legacy posture and has a separate sync task pending.
## 9. References
@@ -123,10 +142,11 @@ The deployment shape — closed edge network, single operator per device, suite
| Auth registration | `Auth/JwtExtensions.cs` |
| Authorization attribute usage | `Controllers/AircraftsController.cs` (post-B6: `VehiclesController.cs`), `Controllers/FlightsController.cs` (post-B6: `MissionsController.cs`) |
| Error envelope (no stack-leak) | `Middleware/ErrorHandlingMiddleware.cs` |
| Env var resolution + dev fallbacks | `Program.cs` |
| Env / config resolution (fail-fast) | `Program.cs`, `Infrastructure/ConfigurationResolver.cs` |
| CORS validation | `Infrastructure/CorsConfigurationValidator.cs` |
| CMMC L2 scorecard | `../../suite/_docs/05_security/cmmc_l2_scorecard.md` |
| Roles & `FL` permission origin | `../../suite/_docs/00_roles_permissions.md` |
| ADR-005 (Swagger + dev fallbacks) | `_docs/02_document/architecture.md` § 8 |
| ADR-005 (Swagger unconditional, surviving branch) | `_docs/02_document/architecture.md` § 8 |
| ADR-002 (PascalCase wire shape) | `_docs/02_document/architecture.md` § 8 |
| Component identity description | `_docs/02_document/components/05_identity/description.md` |
| Component http-conventions description | `_docs/02_document/components/06_http_conventions/description.md` |