# Security Approach — Azaion.Missions

> **Status**: derived-from-code (autodev `/document` Step 6, 2026-05-14).
> All claims below trace to actual code, configuration, or a tracked suite-level finding. Items called out as "currently divergent" are intentional carry-forward — see `_docs/02_document/architecture.md` § 8 ADRs and `00_discovery.md` § Spec ↔ Code Divergences.

---

## 1. Authentication

**Mechanism**: JWT bearer (**ECDSA-SHA256**) with public-key validation against `admin`'s JWKS endpoint. After the first successful JWKS fetch, request-path validation is purely local (no per-request call to `admin`).

**Trust model**: `admin` holds the ECDSA **private** key; every backend service on each edge device validates with the corresponding **public** keys, fetched from `admin`'s JWKS endpoint (`JWT_JWKS_URL`). Rotation publishes a new `kid` in the JWKS; consumers pick it up at the next refresh tick — **NO coordinated redeploy** required (one of the primary operational wins over the legacy HS256 model).

**Validation parameters** (`Auth/JwtExtensions.cs`):

| Parameter | Value | Notes |
|-----------|-------|-------|
| `ValidAlgorithms` | `[SecurityAlgorithms.EcdsaSha256]` | Algorithm pin — defends against HS256-confusion attacks (an attacker who learns the JWKS public key cannot forge tokens signed with `alg: HS256` using that key as the HMAC secret) |
| `IssuerSigningKeyResolver` | Pulls keys from cached `JsonWebKeySet` retrieved via `ConfigurationManager<JsonWebKeySet>` | Lazily fetches on first protected request; cache refreshes on the manager's default schedule, matched against `admin`'s `Cache-Control: public, max-age=3600` |
| JWKS HTTP transport | `HttpDocumentRetriever { RequireHttps = true }` | HTTPS-only — a misconfigured `JWT_JWKS_URL = http://...` fails at fetch time, not at startup config resolution |
| `ValidateLifetime` | `true` | Tokens with `exp` in the past are rejected |
| `ClockSkew` | `TimeSpan.FromSeconds(30)` | Tighter than .NET's 5-min default AND tighter than the legacy 1-min setting |
| `ValidateIssuer` | **`true`** with `ValidIssuer = <resolved JWT_ISSUER>` | **CMMC L2 row 3 finding structurally fixed in this service's code.** Suite-level docs may still describe the legacy "disabled" model and have a separate sync task pending |
| `ValidateAudience` | **`true`** with `ValidAudience = <resolved JWT_AUDIENCE>` | Same as above |
| `ValidateIssuerSigningKey` | `true` (implicit via `IssuerSigningKeyResolver`) | Required for asymmetric validation |

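
A minimal sketch of how the parameters in the table might be assembled, assuming the `Microsoft.IdentityModel.Tokens` package is referenced. `BuildValidationParameters`, the `cachedJwksKeys` delegate, and the example issuer and audience values are hypothetical; the actual registration lives in `Auth/JwtExtensions.cs` and may be structured differently.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.IdentityModel.Tokens;

// Example values are assumptions, not the service's real configuration.
var parameters = BuildValidationParameters(
    issuer: "https://admin.example",
    audience: "missions",
    cachedJwksKeys: () => Array.Empty<SecurityKey>());

Console.WriteLine(parameters.ClockSkew); // 00:00:30

// Hypothetical helper mirroring the rows of the table above.
static TokenValidationParameters BuildValidationParameters(
    string issuer,
    string audience,
    Func<IEnumerable<SecurityKey>> cachedJwksKeys)
{
    return new TokenValidationParameters
    {
        // Algorithm pin: only ES256 signatures are accepted.
        ValidAlgorithms = new[] { SecurityAlgorithms.EcdsaSha256 },
        // Keys come from the locally cached JWKS, not from a per-request call.
        IssuerSigningKeyResolver = (token, securityToken, kid, validationParameters) => cachedJwksKeys(),
        ValidateLifetime = true,
        ClockSkew = TimeSpan.FromSeconds(30),
        ValidateIssuer = true,
        ValidIssuer = issuer,
        ValidateAudience = true,
        ValidAudience = audience,
    };
}
```

The key source is passed in as a delegate only to keep the sketch independent of how the JWKS cache is wired.
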
**Failure outcomes**:

| Condition | HTTP code |
|-----------|-----------|
| Missing `Authorization` header on `[Authorize]` route | 401 |
| Invalid signature (no public key in cached JWKS verifies the token) | 401 |
| Token `kid` not in cached JWKS (rotation lag before refresh tick) | 401 (resolved on next JWKS refresh) |
| Token `alg` ∉ `[EcdsaSha256]` (e.g. forged `alg: HS256`) | 401 (algorithm pin) |
| Expired token (with 30s skew) | 401 |
| `iss` claim ≠ `JWT_ISSUER` | 401 |
| `aud` claim ≠ `JWT_AUDIENCE` | 401 |
| Valid signature + lifetime + iss + aud, but missing `permissions=FL` claim | 403 |
| `JWT_JWKS_URL` uses `http://` (not `https://`) | 500 on first protected request (HTTPS-only retriever throws); NOT caught at startup |
| `admin` unreachable AT the time of the first protected request after cold start | 500 on that first request (synchronous JWKS fetch fails); resolves once `admin` is reachable |

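
To make the "expired token (with 30s skew)" row concrete, here is a simplified model of a skew-tolerant expiry check. It is an illustration of the rule, not the validation library's code; `IsExpired` and the sample timestamps are hypothetical.

```csharp
using System;

var skew = TimeSpan.FromSeconds(30);
var now = new DateTime(2026, 1, 1, 12, 0, 0, DateTimeKind.Utc);

Console.WriteLine(IsExpired(now.AddSeconds(-20), now, skew)); // False: still inside the 30s window
Console.WriteLine(IsExpired(now.AddSeconds(-45), now, skew)); // True: would be rejected with 401

// A token counts as expired only once `exp` is more than `clockSkew` in the past.
static bool IsExpired(DateTime expiresUtc, DateTime nowUtc, TimeSpan clockSkew)
    => expiresUtc < nowUtc - clockSkew;
```
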
**`admin` outage AFTER JWKS cached**: tokens issued before the outage continue to validate locally against the cached public keys. This service does **not** require `admin` to be reachable for any request-path flow once the JWKS cache is warm. Once issued tokens expire, new logins fail at `admin`'s end (UI concern), but this service stays up until the cache itself expires and a refresh fails.

**`admin` outage AT cold start**: the first protected request triggers a synchronous JWKS HTTPS GET; if `admin` is unreachable at that moment, the request fails 500. This is a **new failure mode** introduced by the ECDSA-JWKS switch and is the cost of the rotation-without-redeploy operational win.

## 2. Authorization

**Single named policy**: `"FL"`. Every controller action in `01_vehicle_catalog` and `02_mission_planning` carries `[Authorize(Policy = "FL")]`. The policy is built as `AuthorizationPolicyBuilder.RequireClaim("permissions", "FL")` — satisfied when ANY `permissions` claim on the principal equals `"FL"`. A multi-permission token (`permissions: ["FL", "SOMETHING_ELSE"]`) is accepted.

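
The "ANY `permissions` claim equals `"FL"`" rule can be illustrated with plain `System.Security.Claims` types. This sketch only models the matching rule; `PrincipalWith` and `SatisfiesFl` are hypothetical helpers, and the real gate is the `RequireClaim` policy described above.

```csharp
using System;
using System.Security.Claims;

Console.WriteLine(SatisfiesFl(PrincipalWith("FL", "SOMETHING_ELSE"))); // True
Console.WriteLine(SatisfiesFl(PrincipalWith("SOMETHING_ELSE")));       // False
Console.WriteLine(SatisfiesFl(PrincipalWith("fl")));                   // False: the value match is case-sensitive

// Builds an authenticated principal carrying one `permissions` claim per value.
static ClaimsPrincipal PrincipalWith(params string[] permissions)
{
    var identity = new ClaimsIdentity("Bearer");
    foreach (var permission in permissions)
    {
        identity.AddClaim(new Claim("permissions", permission));
    }
    return new ClaimsPrincipal(identity);
}

// True when at least one `permissions` claim has the exact value "FL".
static bool SatisfiesFl(ClaimsPrincipal user) => user.HasClaim("permissions", "FL");
```
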
**Role → permission matrix** is suite-level (`../../suite/_docs/00_roles_permissions.md`); this service does NOT enforce roles, only the `FL` permission.

**No per-method authz**: every protected endpoint has the same gate. There is no notion of "read-only operator" vs "full-access operator" inside this service.

**Hardcoded policy name carries legacy wording**: the string `"FL"` (originally "Flight") survives the rename to `missions`. Renaming the permission code is a fleet-wide auth change (would invalidate every issued token until new ones are minted) and is **NOT** in this Epic. Tracked as a TODO in `../../suite/_docs/00_roles_permissions.md`.

**Typo risk**: the `"FL"` string is repeated in feature controllers as a raw string. A typo silently turns into a permanent 403 with no compile-time detection. Mitigation: code review + the `module-layout.md` § Verification Needed #4 entry.

**No per-user attribution / audit**: the JWT's `sub` / user-id claim is parsed by `JwtBearerHandler` into `ClaimsPrincipal`, but nothing in this service consumes it. Logs are timestamp-only — incident reconstruction requires correlation by request time, not by user.

## 3. Data protection

**At rest**: PostgreSQL on-disk encryption is the device-level concern (suite-level, NOT this service). This service does NOT encrypt data at the column level.

**In transit**:

- The container `EXPOSE 8080` is **plain HTTP**. TLS termination is handled by the suite's edge reverse proxy (per `../../suite/_docs/00_top_level_architecture.md`).
- No `app.UseHttpsRedirection()` in this service. If the reverse proxy is misconfigured or absent, traffic between the operator UI and this service may be cleartext on the local edge network.

**Secrets management**:

- This service no longer holds a JWT signing secret. It holds **only** public-key configuration (`JWT_ISSUER`, `JWT_AUDIENCE`, `JWT_JWKS_URL`) plus `DATABASE_URL`. Because no signing key is stored here, a compromise of this service no longer affects token signing.
- All four required values (`DATABASE_URL`, `JWT_ISSUER`, `JWT_AUDIENCE`, `JWT_JWKS_URL`) are resolved through `Infrastructure/ConfigurationResolver.cs` → `ResolveRequiredOrThrow` (env-var-first, then `IConfiguration` key, else THROW). **No hardcoded dev fallbacks** — the ADR-005 "dev fallback secret silently accepted in production" failure mode is structurally eliminated. ADR-005 now only covers the unconditional-Swagger branch.
- No secret manager (Vault, AWS SM, K8s Secrets) — config values are baked into the device's docker compose env at provisioning time.

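
The env-var-first, then config key, else throw order can be sketched as follows. This is a stand-in under stated assumptions: the dictionary plays the role of `IConfiguration`, and the signature and error message are hypothetical rather than copied from `Infrastructure/ConfigurationResolver.cs`.

```csharp
using System;
using System.Collections.Generic;

// Deterministic demo setup: one value from the environment, one from config, one missing.
Environment.SetEnvironmentVariable("JWT_ISSUER", "https://admin.example");
Environment.SetEnvironmentVariable("JWT_AUDIENCE", null);
Environment.SetEnvironmentVariable("DATABASE_URL", null);
var config = new Dictionary<string, string> { ["JWT_AUDIENCE"] = "missions" };

Console.WriteLine(ResolveRequiredOrThrow("JWT_ISSUER", config));   // taken from the environment
Console.WriteLine(ResolveRequiredOrThrow("JWT_AUDIENCE", config)); // taken from the config fallback

try
{
    ResolveRequiredOrThrow("DATABASE_URL", config);
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // no fallback value: startup would stop here
}

// Environment variable first, then the config key, otherwise fail fast.
static string ResolveRequiredOrThrow(string key, IReadOnlyDictionary<string, string> config)
{
    var fromEnv = Environment.GetEnvironmentVariable(key);
    if (!string.IsNullOrWhiteSpace(fromEnv))
    {
        return fromEnv;
    }

    if (config.TryGetValue(key, out var fromConfig) && !string.IsNullOrWhiteSpace(fromConfig))
    {
        return fromConfig;
    }

    throw new InvalidOperationException($"Required configuration value '{key}' is not set.");
}
```
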
## 4. Input validation

**None at the application layer**. No `[Required]`, no `[Range]`, no min-length attributes; no custom validators. The following are all accepted by ASP.NET Core model binding without rejection:

| Bad input | Accepted today |
|-----------|---------------|
| `CreateVehicleRequest.Name = ""` | yes |
| `CreateVehicleRequest.BatteryCapacity = -1` | yes |
| `CreateVehicleRequest.Type = (VehicleType)999` | yes — int casts to enum without range check |
| `CreateWaypointRequest.OrderNum = -1` | yes |
| `CreateWaypointRequest.GeoPoint = null` (or all three of `Lat`/`Lon`/`Mgrs` null) | yes |
| `GetMissionsQuery.Page = -1` / `PageSize = 1_000_000` | yes — no bounds |

This is a **carry-forward concern** — input-shape testing is not a security gate today; the threat surface is mitigated by the closed edge network and authenticated single-operator workflow. Tightening is on the autodev backlog (Phase B feature cycle).

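
As a rough idea of what the backlog tightening could look like, the sketch below shows three of the missing checks using only base-class-library calls. Everything here is an assumption for illustration: the helper names, the `maxPageSize` of 100, the numeric type of `BatteryCapacity`, and the use of `DayOfWeek` as a stand-in for `VehicleType`.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(IsDefinedEnum((DayOfWeek)999));          // False: an out-of-range cast is detectable
Console.WriteLine(NormalizePaging(-1, 1_000_000));         // (1, 100)
Console.WriteLine(ValidateVehicle("", -1).Count);          // 2

// Rejects integer values that do not correspond to a declared enum member.
static bool IsDefinedEnum<TEnum>(TEnum value) where TEnum : struct, Enum
    => Enum.IsDefined(typeof(TEnum), value);

// Clamps paging inputs to a sane window instead of passing them straight to the query.
static (int Page, int PageSize) NormalizePaging(int page, int pageSize, int maxPageSize = 100)
    => (Math.Max(1, page), Math.Clamp(pageSize, 1, maxPageSize));

// Collects human-readable errors for the two vehicle fields listed in the table.
static List<string> ValidateVehicle(string name, double batteryCapacity)
{
    var errors = new List<string>();
    if (string.IsNullOrWhiteSpace(name))
    {
        errors.Add("Name must not be empty.");
    }
    if (batteryCapacity < 0)
    {
        errors.Add("BatteryCapacity must not be negative.");
    }
    return errors;
}
```
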
## 5. CORS

**Gated by `Infrastructure/CorsConfigurationValidator.cs`** at startup:

- In `Production` (case-insensitive match on `ASPNETCORE_ENVIRONMENT`): startup **THROWS `InvalidOperationException`** when `CorsConfig:AllowedOrigins` is empty AND `CorsConfig:AllowAnyOrigin != true`. The "open in all environments" failure mode is structurally eliminated.
- In non-Production environments: the same empty allow-list with `AllowAnyOrigin=false` falls back to permissive (`AllowAnyOrigin/Method/Header`) AND emits a `PermissiveDefaultWarning` startup log. The pre-B11 "all environments permissive" assumption no longer holds.
- An explicit `AllowedOrigins` list narrows CORS to those origins in every environment.

The closed edge network behind the suite reverse proxy is still the deployment-shape backstop, but the application now refuses to start in production without an explicit policy decision.

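
The three rules above amount to a small decision function. The sketch below is a model of that decision only; `DecideCorsMode` and its string results are hypothetical, and treating a non-empty allow-list as taking precedence over `AllowAnyOrigin` is an assumption that the validator's source would need to confirm.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(DecideCorsMode("Development", Array.Empty<string>(), false));         // PermissiveWithWarning
Console.WriteLine(DecideCorsMode("production", new[] { "https://ui.example" }, false)); // AllowList

try
{
    DecideCorsMode("PRODUCTION", Array.Empty<string>(), false);
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // production refuses to start without an explicit decision
}

// Returns "AllowList", "AllowAny", or "PermissiveWithWarning"; throws for an undecided Production config.
static string DecideCorsMode(string environment, IReadOnlyCollection<string> allowedOrigins, bool allowAnyOrigin)
{
    if (allowedOrigins.Count > 0)
    {
        return "AllowList"; // assumption: an explicit list wins over AllowAnyOrigin
    }
    if (allowAnyOrigin)
    {
        return "AllowAny";
    }
    if (string.Equals(environment, "Production", StringComparison.OrdinalIgnoreCase))
    {
        throw new InvalidOperationException(
            "CorsConfig:AllowedOrigins is empty and CorsConfig:AllowAnyOrigin is not true.");
    }
    return "PermissiveWithWarning";
}
```
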
## 6. Production-deploy footguns

These are explicit security-relevant risks the code carries today, all tracked at the suite level or as carry-forward:

| Footgun | Where | Mitigation |
|---------|-------|------------|
| **Cold-start dependency on `admin` reachability**: first protected request after a cold start triggers a synchronous JWKS HTTPS GET against `JWT_JWKS_URL`. If `admin` is unreachable at that moment, the request fails 500. Once cached, request-path is local-only | `Auth/JwtExtensions.cs` `ConfigurationManager<JsonWebKeySet>` | Document operational expectation; consider pre-warming the JWKS cache during startup if cold-start failure modes become disruptive |
| **`JWT_JWKS_URL` misconfigured as `http://`** passes startup config resolution but fails at first JWKS fetch → 500 | `Auth/JwtExtensions.cs` `HttpDocumentRetriever { RequireHttps = true }` | Detected at runtime, not startup; recommend a startup smoke check that validates the URL scheme before serving traffic |
| **Swagger UI mounted unconditionally** | `Program.cs` (no `IsDevelopment` gate, ADR-005 surviving branch) | Reverse-proxy-level allowlist on `/swagger` is the suite-level mitigation; verify on first production rollout |
| **CORS allow-list empty in non-Production** falls back to permissive (`AllowAnyOrigin/Method/Header`) with a startup warning | `Infrastructure/CorsConfigurationValidator.cs` | Document explicit `CorsConfig:AllowedOrigins` for staging/dev too if permissive is undesirable. **Production fails-fast** — no remediation needed there |
| **No HTTPS redirection** | `Program.cs` (no `app.UseHttpsRedirection()`) | Reverse proxy enforces TLS upstream |
| **Stack trace logged for unhandled 500s** | `Middleware/ErrorHandlingMiddleware.cs` `LogError(ex, ...)` | Stack is logged only — NOT returned in the HTTP response body (the 500 body is the generic `"Internal server error"` message from middleware) |
| **Cascade-delete is NOT transaction-wrapped** (data-integrity, not auth) | `Services/MissionService.cs`, `Services/WaypointService.cs` (ADR-006) | One-line fix queued; recommended to land with B6 |
| **Hardcoded permission string `"FL"`** in feature controllers | `Controllers/{Vehicles,Missions}Controller.cs` | Risk: typo silently turns into permanent 403; mitigation by code review + `module-layout.md` |
| **Permission code `"FL"` retains legacy "Flight" wording** post-rename | `Auth/JwtExtensions.cs` | Fleet-wide auth change deferred (not in this Epic); TODO in `../../suite/_docs/00_roles_permissions.md` |

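
The startup smoke check recommended for the `http://` footgun could be as small as the sketch below. `RequireHttpsUrl` is a hypothetical helper and the example URLs are placeholders; it is not part of the current code, which only fails at the first JWKS fetch.

```csharp
using System;

Console.WriteLine(RequireHttpsUrl("JWT_JWKS_URL", "https://admin.example/jwks.json").Host); // admin.example

try
{
    RequireHttpsUrl("JWT_JWKS_URL", "http://admin.example/jwks.json");
}
catch (InvalidOperationException ex)
{
    Console.WriteLine(ex.Message); // surfaced at startup instead of on the first protected request
}

// Accepts only absolute https:// URLs; anything else stops startup with a clear message.
static Uri RequireHttpsUrl(string name, string value)
{
    if (!Uri.TryCreate(value, UriKind.Absolute, out var uri) || uri.Scheme != Uri.UriSchemeHttps)
    {
        throw new InvalidOperationException($"{name} must be an absolute https:// URL.");
    }
    return uri;
}
```
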
**Removed from this list** (previously listed, now structurally fixed by code, not by mitigation):

- ❌ Dev fallback for `JWT_SECRET` — `JWT_SECRET` env var is no longer consulted; the resolver throws on missing required config.
- ❌ Dev fallback for `DATABASE_URL` — same resolver throws on missing required config.
- ❌ CORS `AllowAnyOrigin/Method/Header` in production — production startup throws on empty allow-list with `AllowAnyOrigin != true`.
- ❌ JWT `iss`/`aud` validation disabled — both are now validated; CMMC L2 row 3 finding structurally fixed in this service's code.

## 7. Audit logging

**None at the application level today.** The only structured log emitted by application code is `_logger.LogError(ex, "Unhandled exception")` in `ErrorHandlingMiddleware` for 500s. There is:

- **No correlation ID** per request
- **No per-user attribution** (the JWT user-id claim is not consumed)
- **No security-event log** (auth failures are logged by `JwtBearerHandler` at default ASP.NET Core levels — typically Information, not surfaced as a dedicated audit channel)
- **No data-access audit** (writes/deletes go directly through linq2db with no wrapper that emits an audit row)

Production incident response on this service today requires grep-by-timestamp correlation against the operator UI's logs and `admin`'s issuance logs.

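
For reference, per-user attribution would start from reading the subject claim off the `ClaimsPrincipal`. The sketch below is hypothetical (`DescribeCaller` does not exist in the service) and checks both `sub` and `ClaimTypes.NameIdentifier`, on the assumption that inbound claim mapping may have renamed the claim.

```csharp
using System;
using System.Security.Claims;

var identity = new ClaimsIdentity("Bearer");
identity.AddClaim(new Claim("sub", "operator-7")); // sample subject, not a real user id

Console.WriteLine(DescribeCaller(new ClaimsPrincipal(identity))); // caller=operator-7
Console.WriteLine(DescribeCaller(new ClaimsPrincipal()));         // caller=anonymous

// Produces a log-friendly caller tag from whichever subject claim is present.
static string DescribeCaller(ClaimsPrincipal user)
{
    var subject = user.FindFirst("sub")?.Value
                  ?? user.FindFirst(ClaimTypes.NameIdentifier)?.Value
                  ?? "anonymous";
    return $"caller={subject}";
}
```
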
## 8. Threat model summary (one-paragraph)

The deployment shape — closed edge network, single operator per device, suite reverse proxy enforcing TLS and origin allowlisting upstream, Watchtower restart on crash — remains the primary defence-in-depth layer, but the application-layer auth posture has materially improved: ECDSA-SHA256 JWT validation against `admin`'s JWKS (with algorithm pinning, `iss`/`aud` validation, HTTPS-only JWKS fetch, and a 30s clock skew), production-gated CORS, and fail-fast required-config resolution have collectively eliminated the dev-fallback, iss/aud-disabled, and CORS-permissive-in-prod footguns that the legacy HS256 model carried. Residual application-level weak points (no input validation, no per-user audit, hardcoded `"FL"` string, cold-start dependency on `admin` reachability for the first protected request, non-transactional cascade delete) are documented and tracked. The CMMC L2 row 3 finding is structurally closed in this service's code; suite-level documentation may still describe the legacy posture and has a separate sync task pending.

## 9. References

| Concern | File |
|---------|------|
| Auth registration | `Auth/JwtExtensions.cs` |
| Authorization attribute usage | `Controllers/AircraftsController.cs` (post-B6: `VehiclesController.cs`), `Controllers/FlightsController.cs` (post-B6: `MissionsController.cs`) |
| Error envelope (no stack-leak) | `Middleware/ErrorHandlingMiddleware.cs` |
| Env / config resolution (fail-fast) | `Program.cs`, `Infrastructure/ConfigurationResolver.cs` |
| CORS validation | `Infrastructure/CorsConfigurationValidator.cs` |
| CMMC L2 scorecard | `../../suite/_docs/05_security/cmmc_l2_scorecard.md` |
| Roles & `FL` permission origin | `../../suite/_docs/00_roles_permissions.md` |
| ADR-005 (Swagger unconditional, surviving branch) | `_docs/02_document/architecture.md` § 8 |
| ADR-002 (PascalCase wire shape) | `_docs/02_document/architecture.md` § 8 |
| Component identity description | `_docs/02_document/components/05_identity/description.md` |
| Component http-conventions description | `_docs/02_document/components/06_http_conventions/description.md` |