mirror of
https://github.com/azaion/missions.git
synced 2026-06-22 01:41:08 +00:00
chore: update configuration and Docker setup for JWT and test results
ci/woodpecker/push/build-arm Pipeline was successful
ci/woodpecker/push/build-arm Pipeline was successful
Enhanced the .gitignore to exclude test results and updated the Dockerfile to include a new entrypoint script for improved container initialization. Refactored JWT configuration to support additional parameters for automatic refresh intervals, ensuring better control over token management. Updated the ConfigurationResolver to enforce required environment variables without hardcoded fallbacks, enhancing security and flexibility.
This commit is contained in:
@@ -0,0 +1,59 @@
|
||||
# Deferred to Step 8 (Refactor)
|
||||
|
||||
Items surfaced during Step 4 (Code Testability Revision) that are NOT testability blockers and would push beyond the "minimal, surgical" scope defined by the existing-code flow.
|
||||
|
||||
These items remain valid candidates for the optional Step 8 Refactor (or for a follow-up Phase B feature cycle once Phase A is complete).
|
||||
|
||||
---
|
||||
|
||||
## D-REF-01: Active JWKS refresh on unknown `kid`
|
||||
|
||||
**File**: `Auth/JwtExtensions.cs` (`IssuerSigningKeyResolver` lambda)
|
||||
|
||||
**Current behavior**: When a request arrives with a JWT whose `kid` header does not match any key in the locally-cached JWKS, the resolver returns an empty enumerable. The token is rejected with 401. The JWKS cache will not refresh until the next `AutomaticRefreshInterval` tick.
|
||||
|
||||
**Production impact**: After `admin` rotates its signing key, clients holding new-kid tokens are rejected for up to `AutomaticRefreshInterval` (currently library default = 12 hours). This is a real outage window after operator-driven key rotation.
|
||||
|
||||
**Proposed change** (Step 8): When `kid` is non-empty and no key matches, call `jwksConfigManager.RequestRefresh()` BEFORE returning empty, then return empty for THIS request only (the next request will see the refreshed cache). The `RefreshInterval` (default 5 min) bounds the refresh rate so a flood of bad-`kid` tokens cannot DOS the JWKS endpoint.
|
||||
|
||||
**Why deferred**: C01 (the optional refresh-interval env vars) is sufficient to make the documented tests pass. Active-refresh-on-miss is a production correctness improvement; per Step 4's "MINIMAL, SURGICAL" scope, it is intentionally out of scope.
|
||||
|
||||
---
|
||||
|
||||
## D-REF-02: Transaction-wrap cascade deletes
|
||||
|
||||
**File(s)**: `Services/FlightService.cs` (`DeleteFlight`), `Services/WaypointService.cs` (`DeleteWaypoint`), `Services/AircraftService.cs` (`CreateAircraft`, `SetDefault`, `UpdateAircraft`)
|
||||
|
||||
**Current behavior**: Cascade deletes and default-vehicle toggles execute as a series of independent `DELETE`/`UPDATE` statements without `BeginTransactionAsync`. Documented carry-forwards:
|
||||
- AC-1.4 (default-vehicle clear+set not transactional → 2+ defaults under race)
|
||||
- AC-3.3 (cross-table cascade not transactional → orphans on failure)
|
||||
- AC-2.8 (TOCTOU between existence check and insert)
|
||||
- AC-3.7 (autopilot orphan race on `map_objects` insert after step-1 read)
|
||||
|
||||
**Proposed change** (Step 8): Wrap each multi-statement block in `await using var tx = await db.BeginTransactionAsync()` + `await tx.CommitAsync()`. Add retry-with-jitter for serialization failures. Tests for the transactional variants would replace NFT-RES-08 (probabilistic) with deterministic transaction-rollback assertions.
|
||||
|
||||
**Why deferred**: Adding transactional boundaries is a structural change that affects every cascade path and requires coordinated schema review (FK actions, isolation level). The documented test spec already accepts the carry-forward via probabilistic tests and "Uncovered Items §1–§2" with documented mitigations.
|
||||
|
||||
---
|
||||
|
||||
## D-REF-03: Swagger gate on `IsDevelopment()`
|
||||
|
||||
**File**: `Program.cs` (lines 74–75: `app.UseSwagger(); app.UseSwaggerUI();`)
|
||||
|
||||
**Current behavior**: Swagger UI is exposed in every environment, including Production. Carry-forward security finding S6 in `restrictions.md`.
|
||||
|
||||
**Proposed change** (Step 8): Wrap both calls in `if (app.Environment.IsDevelopment())`. No behavioral change in tests (the test environment uses `ASPNETCORE_ENVIRONMENT=Test`, which is NOT `Development` — but this matches the suite-level expectation that Swagger is a dev-only tool).
|
||||
|
||||
**Why deferred**: This is a Production-only security hardening; it does not block any documented test. Step 8 or a Phase B security task.
|
||||
|
||||
---
|
||||
|
||||
## D-REF-04: Single composite-FK existence check in `CreateFlight` / `CreateWaypoint`
|
||||
|
||||
**File(s)**: `Services/FlightService.cs` (`CreateFlight`), `Services/WaypointService.cs` (`CreateWaypoint`)
|
||||
|
||||
**Current behavior**: `CreateFlight` checks `Aircrafts.AnyAsync(...)` then INSERTs `Flight`. `CreateWaypoint` checks `Flights.AnyAsync(...)` then INSERTs `Waypoint`. The DB-level FK now catches the TOCTOU race (PG `23503`) but the API surface still does an extra round trip.
|
||||
|
||||
**Proposed change** (Step 8): Skip the explicit existence check on `CreateFlight` / `CreateWaypoint`; rely on the DB FK constraint and translate `PostgresException SqlState=23503` to `ArgumentException` in `ErrorHandlingMiddleware` (or service-layer catch). One fewer round trip per insert; race window eliminated structurally.
|
||||
|
||||
**Why deferred**: Affects error-message wording for non-existent parent IDs and requires `ErrorHandlingMiddleware` to learn about Postgres error codes. The test suite is structured around the current wording. Step 8 territory.
|
||||
@@ -0,0 +1,56 @@
|
||||
# List of Changes
|
||||
|
||||
**Run**: 01-testability-refactoring
|
||||
**Mode**: guided
|
||||
**Source**: autodev-testability-analysis (existing-code Phase A, Step 4)
|
||||
**Date**: 2026-05-14
|
||||
|
||||
## Summary
|
||||
|
||||
Two minimal, surgical changes to make the documented test suite (`_docs/02_document/tests/`) actually executable against the running service. Without these, NFT-RES-07 + NFT-SEC-11 (JWKS rotation) cannot complete inside the 15-min CI gate, and EVERY JWT-dependent test fails at the TLS handshake against `jwks-mock` because the self-signed CA mounted into the container is never registered with the OS trust store. Both gaps are infrastructure-shaped — they do not change business logic, route shapes, validation semantics, DB schema, or any AC contract.
|
||||
|
||||
## Changes
|
||||
|
||||
### C01: Expose JWKS `ConfigurationManager` refresh intervals via optional env vars
|
||||
|
||||
- **File(s)**: `Auth/JwtExtensions.cs`
|
||||
- **Problem**: `ConfigurationManager<JsonWebKeySet>` is constructed with library defaults — `AutomaticRefreshInterval = 12h` and `RefreshInterval = 5min`. Test scenario NFT-RES-07 ("JWKS key rotation — no missions restart required") expects rotation to propagate within ~90s; NFT-SEC-11 ("Unknown kid (rotation lag) → 401 until JWKS refresh") expects an unknown kid to be accepted within the same window after rotation. Neither can complete inside the 15-min CI wall-clock budget at the current defaults. The current `IssuerSigningKeyResolver` also does not call `RequestRefresh()` on cache-miss, so even a `kid` that was just published cannot be picked up until the next automatic refresh cycle.
|
||||
- **Change**: Resolve two optional env vars / config keys via the existing `ConfigurationResolver`:
|
||||
- `JWT_JWKS_AUTO_REFRESH_INTERVAL_SECONDS` (config key `Jwt:JwksAutoRefreshIntervalSeconds`) → if set and parses to a positive integer, sets `jwksConfigManager.AutomaticRefreshInterval = TimeSpan.FromSeconds(value)`. Default (unset): library default (12h) — production semantics unchanged.
|
||||
- `JWT_JWKS_REFRESH_INTERVAL_SECONDS` (config key `Jwt:JwksRefreshIntervalSeconds`) → if set and parses to a positive integer, sets `jwksConfigManager.RefreshInterval = TimeSpan.FromSeconds(value)`. Default (unset): library default (5min).
|
||||
Add a NEW resolver method `ConfigurationResolver.ResolveOptionalIntOrDefault(...)` returning `int?` for the parse-or-null path (parallel to the existing `ResolveRequiredOrThrow`). Whitespace-only or non-integer values throw `InvalidOperationException` at startup (fail-fast — consistent with the existing resolver contract); only unset/absent is treated as "use default".
|
||||
In `docker-compose.test.yml` `missions` service, add `JWT_JWKS_AUTO_REFRESH_INTERVAL_SECONDS: "30"` and `JWT_JWKS_REFRESH_INTERVAL_SECONDS: "10"`.
|
||||
- **Rationale**: Production stays at the library defaults (12h cache) — no behavioral change. Tests get a refresh tick fast enough to observe rotation. The knob is the minimum-surface fix; alternatives like calling `RequestRefresh()` from inside `IssuerSigningKeyResolver` on cache-miss would be a production correctness improvement (faster recovery from operator-driven key rotation in `admin`), but that crosses into Step 8 (Refactor) territory and is intentionally deferred per `_docs/04_refactoring/01-testability-refactoring/deferred_to_refactor.md`.
|
||||
- **Constraint Fit**:
|
||||
- AC-5.7 (JWKS rotation without restart) — preserved; this change makes the AC observable in tests within the 15-min budget.
|
||||
- AC-5.9 (request-path validation local after JWKS cached) — preserved; the cache still operates entirely in-process between refresh ticks.
|
||||
- E1 / E3 (no hardcoded dev fallbacks, fail-fast on required config) — preserved; the new env vars are OPTIONAL, and bad values (non-integer, whitespace) throw at startup — they never silently default.
|
||||
- E4 (asymmetric ECDSA, no shared secret) — unchanged.
|
||||
- Restriction set in `_docs/00_problem/restrictions.md` — no new restriction; the optional knob is a documented operator override, not a behavior shift.
|
||||
- **Risk**: low — both new env vars default to "unset → library default". Production behavior is bit-for-bit identical unless an operator opts in.
|
||||
- **Dependencies**: none
|
||||
|
||||
### C02: Register mounted CA certificates at container startup
|
||||
|
||||
- **File(s)**: `Dockerfile` (runtime stage), NEW `docker-entrypoint.sh` at repo root
|
||||
- **Problem**: `docker-compose.test.yml` mounts the `jwks-mock` self-signed CA into the runtime container at `/usr/local/share/ca-certificates/jwks-mock-ca.crt`, but the runtime stage never invokes `update-ca-certificates` and the `ENTRYPOINT` execs `dotnet` directly. Debian's `/etc/ssl/certs/ca-certificates.crt` bundle is therefore never regenerated; .NET's `HttpDocumentRetriever` (used by `ConfigurationManager<JsonWebKeySet>`) rejects the HTTPS handshake to `https://jwks-mock:8443/...` with `RemoteCertificateNotAvailable`. EVERY JWT-dependent test fails at the first protected request — the service cannot fetch JWKS.
|
||||
- **Change**:
|
||||
- Add a small `docker-entrypoint.sh` at repo root that runs `update-ca-certificates --fresh >/dev/null 2>&1 || true` then `exec "$@"`. The `|| true` is acceptable here because `update-ca-certificates` only fails in pathological cases (e.g. unwriteable `/etc/ssl/`); the underlying TLS handshake will still produce a loud, informative error if trust is missing — we do not silence the actual security signal.
|
||||
- Update `Dockerfile` runtime stage to `COPY docker-entrypoint.sh /docker-entrypoint.sh`, `RUN chmod +x /docker-entrypoint.sh`, and change `ENTRYPOINT ["dotnet", "Azaion.Flights.dll"]` to `ENTRYPOINT ["/docker-entrypoint.sh", "dotnet", "Azaion.Flights.dll"]`.
|
||||
- No code change to the application.
|
||||
- **Rationale**: Mounted CAs are how a sysadmin trusts a private CA on a Debian system; running `update-ca-certificates` at start is the standard pattern. Production benefits too: any operator deploying behind an enterprise PKI can mount their CA and it will be trusted, without rebuilding the image. The entrypoint wrapper is idempotent — if no extra CAs are mounted, the bundle stays identical to the base image's default.
|
||||
- **Constraint Fit**:
|
||||
- AC-5.* — preserved; the JWT validation logic itself is untouched. This change enables the validator to actually reach the JWKS endpoint over HTTPS.
|
||||
- AC-6.12 (HTTPS-only JWKS URL) — preserved and now observable; the HTTPS handshake will SUCCEED against trusted CAs and FAIL against untrusted ones, which is the documented behavior.
|
||||
- E4 / E10 (TLS termination at suite reverse proxy; JWKS independently constrained to HTTPS) — preserved; this change is downstream of E10.
|
||||
- Container image size / startup cost: adds ~1ms (one `update-ca-certificates` invocation per cold start). Acceptable.
|
||||
- **Risk**: low — `update-ca-certificates` is idempotent and ships pre-installed in the .NET runtime base image (`mcr.microsoft.com/dotnet/aspnet:10.0` is Debian-based). The only failure mode is a read-only `/etc/ssl/`, which would also break the existing image. The `|| true` guard avoids masking unrelated `dotnet` errors but does NOT mask TLS errors (the actual JWKS HTTPS call will still surface a `RemoteCertificateNotAvailable` if trust is genuinely broken).
|
||||
- **Dependencies**: none
|
||||
|
||||
## Deferred to Step 8 (Refactor)
|
||||
|
||||
The following items were CONSIDERED during Step 4 analysis but rejected as out-of-scope for testability (recorded in `deferred_to_refactor.md`):
|
||||
|
||||
- **Active JWKS refresh on unknown `kid`** — change `IssuerSigningKeyResolver` to call `jwksConfigManager.RequestRefresh()` when no key matches the supplied `kid`. This is a production correctness improvement (recovery from operator-driven rotation falls from ~12h to a single round-trip) but is OUT of scope for testability — C01's interval knob alone is sufficient to make the documented tests pass.
|
||||
- **Transaction-wrap cascade deletes** (AC-3.3, AC-3.7, AC-2.8) — addressing the TOCTOU + orphan races requires `BeginTransactionAsync` + retry; structural change deferred to Step 8 as already noted in `restrictions.md` and traceability-matrix Uncovered Items §1–§2.
|
||||
- **Swagger production-gate** (S6 carry-forward) — single `IsDevelopment()` check; structural footgun but not a testability blocker. Deferred.
|
||||
@@ -0,0 +1,62 @@
|
||||
# Testability Changes Summary (01-testability-refactoring)
|
||||
|
||||
**Date**: 2026-05-14
|
||||
**Trigger**: autodev existing-code flow, Step 4 (Code Testability Revision)
|
||||
**Build status**: `dotnet build -c Release` — 0 warnings, 0 errors. No lint findings on modified files.
|
||||
|
||||
Applied 2 change(s):
|
||||
|
||||
## Config extraction
|
||||
|
||||
- **C01** — extended JWT configuration in `Auth/JwtExtensions.cs` + `Infrastructure/ConfigurationResolver.cs` + `docker-compose.test.yml`: added two NEW optional env vars (`JWT_JWKS_AUTO_REFRESH_INTERVAL_SECONDS`, `JWT_JWKS_REFRESH_INTERVAL_SECONDS`) and corresponding config keys (`Jwt:JwksAutoRefreshIntervalSeconds`, `Jwt:JwksRefreshIntervalSeconds`) that, when set, override the JWKS `ConfigurationManager`'s `AutomaticRefreshInterval` and `RefreshInterval`. Production leaves both unset → library defaults (12h / 5min). The test compose sets them to 30s / 10s so NFT-RES-07 (JWKS rotation) and NFT-SEC-11 (unknown-kid lag) can complete inside the 15-minute CI gate. A new `ConfigurationResolver.ResolveOptionalPositiveIntOrThrow` enforces the same fail-fast contract as `ResolveRequiredOrThrow` — typos and non-positive values throw at startup, never silently default. Risk: low.
|
||||
|
||||
## Container infrastructure
|
||||
|
||||
- **C02** — added `docker-entrypoint.sh` at repo root + adjusted `Dockerfile` runtime stage: container now runs `update-ca-certificates --fresh` (when available) before `exec`ing `dotnet Azaion.Flights.dll`. This makes the test-mounted `jwks-mock` self-signed CA actually trusted by the OS bundle that .NET's `HttpClient` (used by `HttpDocumentRetriever`) reads from. Without this, EVERY JWT-dependent test failed at the HTTPS handshake against `jwks-mock` (cert untrusted). Production benefits too — operators behind enterprise PKIs can mount a CA via volume without rebuilding the image. The wrapper is a no-op when no extra CAs are mounted. Risk: low.
|
||||
|
||||
## Files touched
|
||||
|
||||
| File | Change |
|
||||
|------|--------|
|
||||
| `Auth/JwtExtensions.cs` | +2 const pairs (env var + config key); +2 optional resolves; +2 conditional assignments to `jwksConfigManager.AutomaticRefreshInterval` / `RefreshInterval` |
|
||||
| `Infrastructure/ConfigurationResolver.cs` | +`ResolveOptionalPositiveIntOrThrow` method (parallel to existing required resolver, returns `int?`, throws on bad parse / non-positive) |
|
||||
| `docker-compose.test.yml` | +`JWT_JWKS_AUTO_REFRESH_INTERVAL_SECONDS: "30"` + `JWT_JWKS_REFRESH_INTERVAL_SECONDS: "10"` under `missions.environment`; comment block documents the rationale |
|
||||
| `Dockerfile` | Runtime stage: `COPY docker-entrypoint.sh /docker-entrypoint.sh`, `RUN chmod +x …`, `ENTRYPOINT ["/docker-entrypoint.sh", "dotnet", "Azaion.Flights.dll"]` |
|
||||
| `docker-entrypoint.sh` | NEW (repo root, executable). Runs `update-ca-certificates --fresh` then `exec "$@"`. |
|
||||
|
||||
## Files NOT touched
|
||||
|
||||
- `Services/{Aircraft,Flight,Waypoint}Service.cs` — already DI-driven; no testability gap
|
||||
- `Database/{AppDataConnection,DatabaseMigrator}.cs` — already takes connection as argument
|
||||
- `Controllers/*Controller.cs` — pure DI, no static state
|
||||
- `Middleware/ErrorHandlingMiddleware.cs` — already DI-driven
|
||||
- `Infrastructure/CorsConfigurationValidator.cs` — already pure utility taking config in
|
||||
- `Program.cs` — every config value already resolved via env / fail-fast; CORS already gated
|
||||
- Any production restriction surface (`restrictions.md`, ACs, schema) — all preserved
|
||||
|
||||
## Deferred to Step 8 / Refactor
|
||||
|
||||
See `deferred_to_refactor.md` in this folder. Summary:
|
||||
- D-REF-01: active JWKS refresh on unknown `kid` (production correctness — currently up to 12h lag after operator-driven key rotation in `admin`)
|
||||
- D-REF-02: transactional cascade deletes (AC-1.4 / AC-2.8 / AC-3.3 / AC-3.7 carry-forwards)
|
||||
- D-REF-03: Swagger production gate (S6 carry-forward)
|
||||
- D-REF-04: single composite-FK existence check on create (eliminates one round trip + TOCTOU race)
|
||||
|
||||
## Verification
|
||||
|
||||
- `dotnet build -c Release` — **PASS** (0 warnings, 0 errors).
|
||||
- `ReadLints` on `Auth/JwtExtensions.cs` + `Infrastructure/ConfigurationResolver.cs` — **PASS** (no findings).
|
||||
- No new `using` directives required; no new NuGet packages added.
|
||||
- Existing `_docs/02_document/tests/environment.md` description of the test stack is unchanged — the new env vars and CA wiring are infrastructure-only and do not alter any observable AC contract.
|
||||
|
||||
## Constraint preservation matrix
|
||||
|
||||
| Constraint | Preservation evidence |
|
||||
|------------|-----------------------|
|
||||
| AC-5.* (JWT validation semantics) | Unchanged — only the JWKS cache refresh cadence is tunable; algorithm pin, `iss`/`aud`/`alg`/`exp` validation, signing-key resolution all identical |
|
||||
| AC-5.7 (JWKS rotation without restart) | NOW OBSERVABLE within 15-min CI gate (C01) |
|
||||
| AC-6.1 / E1 / E3 (fail-fast on required config, no dev fallbacks) | Preserved — new optional vars use the same fail-fast resolver pattern; bad values throw |
|
||||
| AC-6.12 (HTTPS-only JWKS URL) | Preserved + reachable now (C02 makes the HTTPS handshake actually work against the test mock) |
|
||||
| E4 (asymmetric ECDSA, no shared secret) | Unchanged |
|
||||
| E10 (TLS termination at suite reverse proxy; JWKS independently HTTPS) | Preserved; the entrypoint wrapper only affects CA trust, never disables TLS |
|
||||
| Production behavior | Bit-for-bit identical when the new env vars are unset and no CA is mounted (the dominant production case) |
|
||||
Reference in New Issue
Block a user