Azaion.Missions — System Flows
NOTE (forward-looking): route prefixes, identifiers, and cascade chains in this document reflect the post-rename, post-GPS-Denied-removal state. Today's source still uses [Route("flights")], [Route("aircrafts")], and Aircraft* / Flight* filenames, and the cascade still touches orthophotos + gps_corrections. Renames + cascade shrink are tracked under Jira AZ-EPIC children B5 / B6 / B7 / B8. Per-flow .md files under diagrams/flows/ follow the same convention.
Flow Inventory
| # | Flow Name | Trigger | Primary Components | Criticality |
|---|---|---|---|---|
| F1 | Vehicle CRUD | Operator UI HTTP | 01_vehicle_catalog → 04_persistence | High |
| F2 | Mission create / read / update | Operator UI HTTP | 02_mission_planning → 04_persistence, with 01_vehicle_catalog existence check | High |
| F3 | Mission delete with cross-service cascade | Operator UI HTTP DELETE /missions/{id} | 02_mission_planning → 04_persistence (touches map_objects, media, annotations, detection, waypoints, missions) | Critical (data integrity; not transaction-wrapped today) |
| F4 | Waypoint create / read / update / delete | Operator UI HTTP | 02_mission_planning (WaypointService) → 04_persistence | High (delete is a cross-service cascade variant of F3) |
| F5 | JWT bearer validation | Every protected request | 05_identity (pipeline middleware) | Critical (cross-cutting; applies to every authenticated route) |
| F6 | Service startup + schema migration | Process start | 07_host → 04_persistence (DatabaseMigrator.Migrate) | High (one-shot per restart; B9 DROP runs here on legacy devices) |
| F7 | Health probe | Container orchestration / reverse proxy | 07_host (MapGet("/health")) | Medium |
Flow Dependencies
| Flow | Depends on | Shares data with |
|---|---|---|
| F1 | F6 (schema must exist) | F2 (mission references vehicle_id) |
| F2 | F1 (vehicle existence check on create / update), F5 (auth), F6 | F3 (deletion), F4 (waypoint owner) |
| F3 | F2 (a mission must exist to delete it), F5, F6 | F4 (waypoint cascade is a sub-walk), annotations + detection pipeline (cross-service tables) |
| F4 | F2 (mission must exist to nest waypoints under it), F5, F6 | F3 (mission delete also walks all waypoint sub-trees) |
| F5 | None | Every protected flow (F1–F4) |
| F6 | None | Every flow (no flow can run before the schema is in place) |
| F7 | None | None |
Cross-cutting concerns (apply to all HTTP flows)
These behaviors wrap every flow at the pipeline level. They are described once here rather than repeated in each flow:
- JWT bearer validation (F5). ASP.NET Core's JwtBearerHandler runs on every request marked [Authorize]. Validation is local ECDSA-SHA256 against admin's JWKS, which this service fetches lazily (on the first protected request after startup, not during startup itself) and caches via Microsoft.IdentityModel.Protocols.ConfigurationManager<JsonWebKeySet>; subsequent request-path validation does not call admin. The handler enforces iss == JWT_ISSUER, aud == JWT_AUDIENCE, exp (with 30-second clock skew), and pins alg to EcdsaSha256 (defends against HS256-confusion). Failures surface as 401 Unauthorized (no token, or invalid signature / claims / lifetime) or 403 Forbidden (token valid but missing the "FL" permission claim). See diagrams/flows/flow_jwt_validation.md for the sequence.
- Permission gate. Every controller action in 01_vehicle_catalog and 02_mission_planning carries [Authorize(Policy = "FL")]. The policy requirement is satisfied by a permissions claim equal to "FL". The policy NAME is referenced as a raw string in feature controllers, so a typo would silently turn into a permanent 403 (see module-layout.md § Verification Needed #4).
- Global exception → JSON middleware. ErrorHandlingMiddleware (06_http_conventions) is registered FIRST in the pipeline. It maps KeyNotFoundException → 404, ArgumentException → 400, InvalidOperationException → 409; everything else → 500 with the stack trace logged. Wire shape: entity / DTO bodies are PascalCase (suite-spec divergence; see architecture.md ADR-002); the global error envelope is already camelCase (an accidental match: the anonymous object literal new { statusCode, message } uses lowercase property names) but still lacks the spec's errors field.
- No correlation ID, no request-level audit trail. Logs are timestamp-only; supporting a production incident requires grep-by-timestamp.
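The exception-to-status mapping described above can be modeled in a few lines. This is an illustrative Python sketch, not the service's C# middleware: the three exception classes are hypothetical stand-ins named after the .NET types, to_error_response is an assumed helper name, and the generic 500 message is an assumption (the source only says the stack trace is logged).

```python
class KeyNotFoundException(Exception):
    """Stand-in for the .NET KeyNotFoundException."""


class ArgumentException(Exception):
    """Stand-in for the .NET ArgumentException."""


class InvalidOperationException(Exception):
    """Stand-in for the .NET InvalidOperationException."""


# Mapping taken from the text: 404 / 400 / 409, anything else is a 500.
STATUS_BY_EXCEPTION = {
    KeyNotFoundException: 404,
    ArgumentException: 400,
    InvalidOperationException: 409,
}


def to_error_response(exc: Exception) -> dict:
    """Build the camelCase error envelope { statusCode, message }."""
    status = STATUS_BY_EXCEPTION.get(type(exc), 500)
    # Assumption: unexpected errors get a generic message instead of str(exc).
    message = str(exc) if status != 500 else "Internal server error"
    return {"statusCode": status, "message": message}
```

Note that the envelope carries no errors field, which matches the gap against the spec called out above.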
Flow F1: Vehicle CRUD
See diagrams/flows/flow_vehicle_crud.md.
Description: Full CRUD over the vehicle catalog. Every endpoint requires the "FL" permission. The list endpoint is unpaginated by spec; create + update + setDefault contain the "exactly one default vehicle" exclusivity rule that is stricter than spec (see 01_vehicle_catalog Caveats #1 and Jira AZ-551 / B12 for the resolution decision).
Preconditions: Service is running; schema is in place (F6); caller holds a JWT with permissions=FL.
Key sequence steps (happy path, create with IsDefault=true):
1. UI → POST /vehicles { Name, VehicleType, IsDefault: true, ... } (with Authorization: Bearer <jwt>).
2. Pipeline → JWT validation (F5) + policy "FL" check.
3. VehiclesController.Create → VehicleService.CreateVehicle(req).
4. VehicleService.CreateVehicle:
   - If req.IsDefault == true: UPDATE vehicles SET is_default = FALSE WHERE is_default = TRUE (divergence from spec: clears every other row's default flag).
   - INSERT INTO vehicles VALUES (...).
5. Vehicle entity returned on the wire (PascalCase JSON).
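The clear-then-insert sequence in the create step can be reproduced with a small script. This is a sketch under stated assumptions: Python's sqlite3 stands in for PostgreSQL, the table has only the columns needed for the demonstration, and create_vehicle is a hypothetical name. The connection runs in autocommit mode so the two statements are independent, mirroring the absence of a transaction in the flow.

```python
import sqlite3


def create_vehicle(conn, name, is_default):
    """Run the two statements back to back, with no surrounding transaction."""
    if is_default:
        # Clears the default flag on every existing row.
        conn.execute("UPDATE vehicles SET is_default = 0 WHERE is_default = 1")
    cur = conn.execute(
        "INSERT INTO vehicles (name, is_default) VALUES (?, ?)",
        (name, int(is_default)),
    )
    return cur.lastrowid


# isolation_level=None puts sqlite3 in autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE vehicles ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, is_default INTEGER NOT NULL)"
)

create_vehicle(conn, "alpha", True)
create_vehicle(conn, "bravo", True)  # takes the default flag away from alpha

defaults = [row[0] for row in conn.execute(
    "SELECT name FROM vehicles WHERE is_default = 1")]
```

Run sequentially, only the most recently created default survives. Because nothing ties the UPDATE to the INSERT, two concurrent callers can interleave between them, which is the race listed in the error scenarios.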
Error scenarios:
| Error | Where | Detection | Recovery |
|---|---|---|---|
| Missing or invalid JWT | Pipeline | JwtBearerHandler | 401 Unauthorized; client refreshes the token at admin |
| JWT lacks "FL" permission | Pipeline | Policy evaluator | 403 Forbidden |
| DELETE of a vehicle referenced by any mission | VehicleService.DeleteVehicle | IsAny<Mission>(m => m.VehicleId == id) check returns true | InvalidOperationException → 409 Conflict |
| Vehicle not found by id | Get / Update / Delete / SetDefault | Entity lookup returns null | KeyNotFoundException → 404 Not Found |
| Race on default-set | VehicleService.CreateVehicle / UpdateVehicle / SetDefault | None (no transaction) | Race window can leave 2+ IsDefault=true rows or zero defaults; see 01_vehicle_catalog Caveats #1, tracked as B12 |
Flow F2: Mission create / read / update
See diagrams/flows/flow_mission_lifecycle.md.
Description: Mission CRUD excluding delete (delete is F3 because of the cascade complexity). Create / update validate that the referenced vehicle_id exists; list is paginated (the only paginated endpoint in this service).
Preconditions: Service is running; schema is in place (F6); caller holds a JWT with permissions=FL; for create / update with VehicleId, the referenced vehicle must exist (F1).
Key sequence steps (create):
1. UI → POST /missions { Name, VehicleId }.
2. Pipeline → JWT + "FL" (F5).
3. MissionsController.Create → MissionService.CreateMission(req).
4. MissionService.CreateMission:
   - SELECT 1 FROM vehicles WHERE id = @VehicleId (existence check, no transaction with the insert below).
   - If absent: throw ArgumentException("VehicleId not found") → 400 (spec says 404; minor divergence, see 02_mission_planning Caveats #8).
   - INSERT INTO missions (id, name, vehicle_id, created_date) VALUES (...).
5. Mission entity returned (PascalCase; LinqToDB does NOT eager-load [Association] navigation, so Mission.Vehicle and Mission.Waypoints serialize as null / []).
Error scenarios:
| Error | Where | Detection | Recovery |
|---|---|---|---|
| VehicleId does not exist on create | MissionService.CreateMission | Existence check returns false | ArgumentException → 400 Bad Request (spec wants 404) |
| VehicleId deleted between existence check and insert (TOCTOU) | MissionService.CreateMission | FK constraint rejects insert | Npgsql PostgresException → 500 Internal Server Error (UX gap; should map to 400) |
| Mission not found on read / update | MissionService.GetMission / UpdateMission | Entity lookup returns null | KeyNotFoundException → 404 |
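The TOCTOU row above is easier to see when the gap between the existence check and the insert is made explicit. The following Python sketch is a model only: sqlite3 with foreign keys enabled stands in for PostgreSQL, the between_check_and_insert hook is a hypothetical device for simulating a concurrent vehicle delete, and the returned strings are labels for the outcomes in the table rather than real HTTP responses.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE vehicles (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE missions ("
    "id INTEGER PRIMARY KEY, name TEXT, "
    "vehicle_id INTEGER NOT NULL REFERENCES vehicles(id))"
)
conn.execute("INSERT INTO vehicles (id) VALUES (1)")


def vehicle_exists(vehicle_id):
    row = conn.execute(
        "SELECT 1 FROM vehicles WHERE id = ?", (vehicle_id,)).fetchone()
    return row is not None


def create_mission(name, vehicle_id, between_check_and_insert=None):
    """Existence check, then insert, with nothing holding the two together."""
    if not vehicle_exists(vehicle_id):
        return "400 Bad Request"
    if between_check_and_insert is not None:
        between_check_and_insert()  # simulates another request winning the race
    try:
        conn.execute(
            "INSERT INTO missions (name, vehicle_id) VALUES (?, ?)",
            (name, vehicle_id),
        )
    except sqlite3.IntegrityError:
        # The FK constraint is the only safety net; today this surfaces as a 500.
        return "500 Internal Server Error"
    return "created"
```

Calling create_mission("survey", 1, lambda: conn.execute("DELETE FROM vehicles WHERE id = 1")) passes the check and then fails at the insert, which is the path that currently reaches the caller as a 500.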
Flow F3: Mission delete with cross-service cascade (most critical)
See diagrams/flows/flow_mission_cascade_delete.md.
Description: DELETE /missions/{id} walks the full ownership graph and tears down rows in tables this service does NOT own the schema for (media, annotations, detection) plus its own map_objects, waypoints, and missions. NOT transaction-wrapped today (architecture.md ADR-006); partial failure leaves orphans. After B7, the orthophotos + gps_corrections branches are gone — they belong to the separate gps-denied service which manages its own cleanup.
Preconditions: Mission exists (KeyNotFoundException → 404 otherwise); schema for the borrowed tables is present in postgres-local (in standard edge deployment, annotations and detection have run their migrations on the same DB).
Cascade order (strictly child-before-parent, FK-driven):
1. DELETE FROM map_objects WHERE mission_id = ? (autopilot writes; this service owns schema + cleanup)
2. SELECT id FROM waypoints WHERE mission_id = ? → waypointIds
3. If waypointIds.Any():
   SELECT id FROM media WHERE waypoint_id IN waypointIds → mediaIds
   SELECT id FROM annotations WHERE media_id IN mediaIds → annotationIds
   DELETE FROM detection WHERE annotation_id IN annotationIds (cross-service: detection pipeline)
   DELETE FROM annotations WHERE id IN annotationIds (cross-service: annotations)
   DELETE FROM media WHERE id IN mediaIds (cross-service: annotations)
4. DELETE FROM waypoints WHERE mission_id = ?
5. DELETE FROM missions WHERE id = ?
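The cascade order above can be exercised end to end with a small script. This is a sketch, not the service's LinqToDB code: Python's sqlite3 stands in for PostgreSQL, the tables carry only the key columns, and select_ids, delete_where_in, and delete_mission are hypothetical helper names. Like the real flow today, it runs without a wrapping transaction.

```python
import sqlite3

SCHEMA = """
CREATE TABLE missions    (id INTEGER PRIMARY KEY);
CREATE TABLE map_objects (id INTEGER PRIMARY KEY, mission_id INTEGER);
CREATE TABLE waypoints   (id INTEGER PRIMARY KEY, mission_id INTEGER);
CREATE TABLE media       (id INTEGER PRIMARY KEY, waypoint_id INTEGER);
CREATE TABLE annotations (id INTEGER PRIMARY KEY, media_id INTEGER);
CREATE TABLE detection   (id INTEGER PRIMARY KEY, annotation_id INTEGER);
INSERT INTO missions    VALUES (1), (2);
INSERT INTO map_objects VALUES (1, 1), (2, 2);
INSERT INTO waypoints   VALUES (10, 1), (20, 2);
INSERT INTO media       VALUES (100, 10), (200, 20);
INSERT INTO annotations VALUES (1000, 100), (2000, 200);
INSERT INTO detection   VALUES (10000, 1000), (20000, 2000);
"""


def select_ids(conn, table, column, values):
    """Return ids of rows in `table` whose `column` is in `values`."""
    if not values:
        return []  # skip the query: an empty IN () list is not portable SQL
    marks = ",".join("?" * len(values))
    sql = f"SELECT id FROM {table} WHERE {column} IN ({marks})"
    return [row[0] for row in conn.execute(sql, values)]


def delete_where_in(conn, table, column, values):
    if values:
        marks = ",".join("?" * len(values))
        conn.execute(f"DELETE FROM {table} WHERE {column} IN ({marks})", values)


def delete_mission(conn, mission_id):
    """Child-before-parent teardown, one statement at a time."""
    delete_where_in(conn, "map_objects", "mission_id", [mission_id])         # 1
    waypoint_ids = select_ids(conn, "waypoints", "mission_id", [mission_id])  # 2
    media_ids = select_ids(conn, "media", "waypoint_id", waypoint_ids)        # 3
    annotation_ids = select_ids(conn, "annotations", "media_id", media_ids)
    delete_where_in(conn, "detection", "annotation_id", annotation_ids)
    delete_where_in(conn, "annotations", "id", annotation_ids)
    delete_where_in(conn, "media", "id", media_ids)
    delete_where_in(conn, "waypoints", "mission_id", [mission_id])            # 4
    delete_where_in(conn, "missions", "id", [mission_id])                     # 5


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
conn.executescript(SCHEMA)
delete_mission(conn, 1)
```

Every statement is keyed on ids resolved at run time, so repeating delete_mission for the same id finds nothing left to delete and changes nothing. That property is what makes re-running a failed DELETE a partial fix.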
Error scenarios:
| Error | Where | Detection | Recovery |
|---|---|---|---|
| Mission not found | Step 0 (initial existence check) | Entity lookup | KeyNotFoundException → 404 |
| relation does not exist for media / annotations / detection | Step 3 | Npgsql PostgresException (42P01) | 500. Indicates annotations or detection pipeline never migrated on this device, which is an abnormal edge deployment. See 02_mission_planning Caveats #6 |
| Partial failure mid-cascade (network blip, lock timeout) | Any step | Npgsql exception | 500. Orphan rows left behind. Re-running the same DELETE is a partial fix: already-deleted children are no-ops, remaining children proceed (see ADR-006 carry-forward) |
| autopilot writes a map_object racing this delete | Step 1 vs. concurrent insert | None | The insert may succeed AFTER step 1 reads zero rows; the orphan row stays until the next mission delete or manual cleanup. Small race window in practice (single-operator workflow) |
Performance expectations:
| Metric | Target | Notes |
|---|---|---|
| End-to-end latency | <50ms typical | 4–7 sequential round-trips against local PostgreSQL on the same device |
| Throughput | 1 op / mission delete; not load-tested | Operator-paced; not a hot path |
| Orphan rate | 0 once transaction-wrap lands (ADR-006 carry-forward) | Today: non-zero on any failure mid-cascade |
Flow F4: Waypoint create / read / update / delete
See diagrams/flows/flow_waypoint_lifecycle.md.
Description: Waypoint CRUD nested under a mission (/missions/{id}/waypoints/*). Delete is a scoped variant of F3's cascade — it walks media / annotations / detection for one waypoint instead of all waypoints of a mission. Same NO-transaction caveat applies (ADR-006). UpdateWaypoint is a full overwrite of every field even though the request DTO looks "partial-shaped" (see 02_mission_planning Caveats #2). List is unpaginated by spec, ordered by OrderNum.
Preconditions: Parent mission exists (KeyNotFoundException → 404 otherwise).
Key sequence steps (delete one waypoint):
1. UI → DELETE /missions/{id}/waypoints/{wpId}.
2. Pipeline → JWT + "FL" (F5).
3. MissionsController.DeleteWaypoint → WaypointService.DeleteWaypoint(missionId, wpId).
4. WaypointService.DeleteWaypoint:
   - Verify waypoint exists with mission_id = @missionId AND id = @wpId → 404 if not.
   - Resolve mediaIds for this one waypoint, then annotationIds.
   - DELETE FROM detection WHERE annotation_id IN annotationIds.
   - DELETE FROM annotations WHERE id IN annotationIds.
   - DELETE FROM media WHERE id IN mediaIds.
   - DELETE FROM waypoints WHERE id = @wpId.
Error scenarios: identical to F3 scoped to one waypoint.
Flow F5: JWT bearer validation
See diagrams/flows/flow_jwt_validation.md.
Description: The cross-cutting auth flow that runs on every [Authorize] request. Signature validation is local against admin's cached JWKS public keys; the only call to admin is the JWKS fetch (once, lazily on the first protected request, plus refreshes on the default schedule). Request-path validation does NOT call admin.
Preconditions: JWT_ISSUER, JWT_AUDIENCE, and JWT_JWKS_URL all resolved at startup via ConfigurationResolver.ResolveRequiredOrThrow (any missing value aborts startup); the JWT bearer middleware was registered by AddJwtAuth(builder.Configuration) in 07_host.
Key sequence steps:
1. Request arrives at the ASP.NET Core pipeline with Authorization: Bearer <jwt>.
2. JwtBearerHandler:
   - Parse the token header.
   - Reject unless alg ∈ ValidAlgorithms (pinned to EcdsaSha256; defends against HS256-confusion).
   - Resolve signing key for the token's kid via the cached ConfigurationManager<JsonWebKeySet>. On a cold cache, this triggers a one-time HTTPS GET of JWT_JWKS_URL from admin.
   - Verify ECDSA-SHA256 signature against the matching public key.
   - Verify iss == JWT_ISSUER, aud == JWT_AUDIENCE, exp (with 30-second clock skew).
3. If algorithm, signature, claims, or lifetime fails: 401 Unauthorized (without ever invoking the controller).
4. If valid: parse claims into ClaimsPrincipal; attach to the request.
5. Authorization policy "FL" evaluator checks for a permissions claim with value "FL".
6. If absent: 403 Forbidden.
7. If present: forward to the controller action.
Error scenarios:
| Error | Where | Detection | Recovery |
|---|---|---|---|
| Missing Authorization header | Pipeline | JwtBearerHandler | 401 |
| Forged alg: HS256 token | Pipeline | ValidAlgorithms pin | 401. Pin defense; see 05_identity Caveats #6 |
| Invalid signature | Pipeline | ECDSA verify fails | 401 |
| iss ≠ JWT_ISSUER | Pipeline | ValidateIssuer = true | 401 |
| aud ≠ JWT_AUDIENCE | Pipeline | ValidateAudience = true | 401 |
| Expired token | Pipeline | ValidateLifetime (with 30s skew) | 401; client re-authenticates with admin |
| kid not in cached JWKS | IssuerSigningKeyResolver | No matching key | 401. Manager refreshes on default schedule; new kid becomes available there |
| admin unreachable on first JWKS fetch | HttpDocumentRetriever | HttpRequestException | First request fails 500. Subsequent requests retry on next refresh. Operationally: keep admin reachable from edge |
| permissions claim missing or not "FL" | Policy evaluator | Claim lookup | 403 |
| JWKS rotation on admin | ConfigurationManager refresh | Next scheduled refresh tick | No coordinated redeploy needed: new keys are picked up on refresh; old tokens with the old kid remain valid until expiry |
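The lazy-fetch and scheduled-refresh behavior behind several rows above can be approximated with a small cache. This Python sketch is a simplified model, not the Microsoft.IdentityModel ConfigurationManager: JwksCache and its parameters are hypothetical, the refresh interval is whatever the caller passes in, and keeping the previous keys when a refresh fails is a simplifying assumption rather than a statement about the library's exact behavior.

```python
import time


class JwksCache:
    """Caches {kid: public_key}; fetches lazily, then on a fixed interval."""

    def __init__(self, fetch, refresh_interval_seconds, clock=time.monotonic):
        self._fetch = fetch  # callable returning a dict of kid -> key
        self._interval = refresh_interval_seconds
        self._clock = clock
        self._keys = None    # nothing is fetched at construction time
        self._fetched_at = None

    def get_key(self, kid):
        now = self._clock()
        due = self._keys is None or now - self._fetched_at >= self._interval
        if due:
            try:
                self._keys = self._fetch()
                self._fetched_at = now
            except Exception:
                if self._keys is None:
                    raise  # cold cache and issuer unreachable: request fails
                # Assumption: otherwise keep serving the last known keys.
        # An unknown kid does not force a refresh; it is simply not found.
        return self._keys.get(kid)
```

In this model a newly rotated kid becomes resolvable only after the next refresh tick, while keys already in the cache keep validating old tokens until those tokens expire.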
Flow F6: Service startup + schema migration
See diagrams/flows/flow_startup_migration.md.
Description: One-time-per-process bootstrap. Program.cs builds the DI graph, runs DatabaseMigrator.Migrate(db) once, then starts serving HTTP. The migrator is idempotent (CREATE ... IF NOT EXISTS). After B9, the migrator additionally runs DROP TABLE IF EXISTS orthophotos; DROP TABLE IF EXISTS gps_corrections; once for fielded edge devices that previously ran the legacy schema.
Preconditions: DATABASE_URL, JWT_ISSUER, JWT_AUDIENCE, JWT_JWKS_URL all resolve via ConfigurationResolver.ResolveRequiredOrThrow (any missing value aborts startup — no hardcoded fallbacks); postgres-local is reachable; the azaion database exists. admin does NOT need to be reachable at this point — the JWKS fetch is lazy on the first protected request.
Key sequence steps:
1. Container starts → entrypoint dotnet Azaion.Missions.dll.
2. Program.cs resolves DATABASE_URL via ConfigurationResolver.ResolveRequiredOrThrow → ConvertPostgresUrl → Npgsql connection string.
3. Calls AddJwtAuth(builder.Configuration), which resolves JWT_ISSUER, JWT_AUDIENCE, JWT_JWKS_URL (each via ResolveRequiredOrThrow), wires the ConfigurationManager<JsonWebKeySet>, and registers JWT bearer + "FL" (+ legacy "GPS" until B7) policies. No network call yet.
4. Reads CorsConfig:AllowedOrigins + CorsConfig:AllowAnyOrigin; runs CorsConfigurationValidator.EnsureSafeForEnvironment (throws in Production with implicit-permissive config); registers the CORS policy (permissive OR WithOrigins).
5. Registers controllers, middleware, scoped AppDataConnection, scoped service classes.
6. Builds the host. If implicit-permissive CORS applies (non-Production, empty origins, AllowAnyOrigin=false), logs PermissiveDefaultWarning at startup. Opens a single startup scope and calls DatabaseMigrator.Migrate(db):
   - CREATE TABLE IF NOT EXISTS vehicles (...).
   - CREATE TABLE IF NOT EXISTS missions (...).
   - CREATE TABLE IF NOT EXISTS waypoints (...).
   - CREATE TABLE IF NOT EXISTS map_objects (...).
   - CREATE INDEX IF NOT EXISTS ix_missions_vehicle_id ... and similar.
   - B9 one-shot: DROP TABLE IF EXISTS orthophotos; DROP TABLE IF EXISTS gps_corrections;.
7. Registers ErrorHandlingMiddleware FIRST in the pipeline; mounts CORS, auth, controllers, MapGet("/health"), Swagger UI.
8. app.Run(): ready to serve HTTP on port 8080.
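The fail-fast configuration step at the start of this sequence can be sketched as follows. This is a Python model, not the service's ConfigurationResolver: the function names mirror the C# ones, but the env-var-before-config-key precedence, the example config key names, and the postgres:// input format accepted by convert_postgres_url are all assumptions. Host, Port, Database, Username, and Password are standard Npgsql connection-string keys.

```python
import os
from urllib.parse import urlsplit


class MissingConfigurationError(RuntimeError):
    """Raised when a required setting has no value anywhere."""


def resolve_required_or_throw(env_var, config_key, config, environ=None):
    """Return the setting or raise; there is deliberately no fallback value."""
    environ = os.environ if environ is None else environ
    # Assumed precedence: environment variable first, then the config key.
    value = environ.get(env_var) or config.get(config_key)
    if not value:
        raise MissingConfigurationError(
            f"Required setting missing: set env var {env_var} "
            f"or config key {config_key}"
        )
    return value


def convert_postgres_url(url):
    """Turn postgres://user:pass@host:port/db into key=value pairs."""
    parts = urlsplit(url)
    return (
        f"Host={parts.hostname};Port={parts.port or 5432};"
        f"Database={parts.path.lstrip('/')};"
        f"Username={parts.username};Password={parts.password}"
    )
```

Raising at resolution time, rather than substituting a default, is what turns a missing value into an immediate startup abort with a message that names both the env var and the config key.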
Error scenarios:
| Error | Where | Detection | Recovery |
|---|---|---|---|
| Missing DATABASE_URL / JWT_ISSUER / JWT_AUDIENCE / JWT_JWKS_URL | Step 2 or 3 | ResolveRequiredOrThrow throws InvalidOperationException | Process exits non-zero with a message naming the env var and config key. Watchtower restarts, but the new container hits the same failure until the value is provided |
| CORS misconfigured in Production (empty origins + AllowAnyOrigin != true) | Step 4 | EnsureSafeForEnvironment throws | Process exits with MissingOriginsMessage. Fix: set CorsConfig:AllowedOrigins or explicit AllowAnyOrigin=true |
| postgres-local unreachable | Step 6 | Npgsql connection failure | Process exits non-zero. Watchtower restarts the container; flight-gate prevents restart mid-mission |
| azaion database does not exist | Step 6 | Npgsql 3D000 (invalid_catalog_name) | Process exits. Operator must create the database (provisioning concern, not this service) |
| DROP TABLE IF EXISTS orthophotos fails because the table is being read by gps-denied | Step 6 (B9 one-shot) | Lock timeout | Process exits. Restart loop until gps-denied releases the lock, which should take moments. Out-of-band ordering: deploy gps-denied first so it has its own copy before missions drops the legacy tables |
| Migrator partial failure mid-statement | Step 6 | Npgsql exception | Process exits. Each statement is individually idempotent (IF NOT EXISTS) so the next startup retries safely |
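The idempotency that makes a restart safe after a partial failure can be demonstrated directly. In this Python sketch sqlite3 stands in for PostgreSQL, the column lists are placeholders (the real definitions are elided in this document), and migrate is a hypothetical name for the statement loop.

```python
import sqlite3

# Every statement is guarded, so re-running the whole list is harmless.
MIGRATION_STATEMENTS = [
    "CREATE TABLE IF NOT EXISTS vehicles (id INTEGER PRIMARY KEY)",
    "CREATE TABLE IF NOT EXISTS missions "
    "(id INTEGER PRIMARY KEY, vehicle_id INTEGER)",
    "CREATE TABLE IF NOT EXISTS waypoints "
    "(id INTEGER PRIMARY KEY, mission_id INTEGER)",
    "CREATE TABLE IF NOT EXISTS map_objects "
    "(id INTEGER PRIMARY KEY, mission_id INTEGER)",
    "CREATE INDEX IF NOT EXISTS ix_missions_vehicle_id ON missions (vehicle_id)",
    "DROP TABLE IF EXISTS orthophotos",      # legacy cleanup
    "DROP TABLE IF EXISTS gps_corrections",  # legacy cleanup
]


def migrate(conn):
    for statement in MIGRATION_STATEMENTS:
        conn.execute(statement)


conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit
# Simulate a fielded device that still carries one legacy table.
conn.execute("CREATE TABLE orthophotos (id INTEGER PRIMARY KEY)")

migrate(conn)  # first startup: creates tables, drops the legacy one
migrate(conn)  # second startup: every statement is a no-op

tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

After both runs the schema holds only the four owned tables, and the second run raises nothing even though gps_corrections never existed on this simulated device.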
Flow F7: Health probe
See diagrams/flows/flow_health_probe.md.
Description: GET /health returns { "status": "healthy" } with no auth. Used by container orchestration (Watchtower / docker compose healthcheck) and any reverse-proxy upstream check. Does NOT verify DB connectivity today — only confirms the process is up and the HTTP pipeline is serving.
Preconditions: HTTP pipeline is serving (i.e., app.Run() reached).
Key sequence steps:
1. Probe → GET /health (no Authorization header required).
2. MapGet("/health") returns Results.Ok(new { status = "healthy" }).
Error scenarios: none meaningful — if the pipeline is up the response is always 200. If the process is down, the probe fails at TCP-connect time and orchestration restarts it.
Future improvement (carry-forward): gate /health on a DB ping so flight-gate and reverse-proxy checks reflect actual readiness rather than process liveness. Today the migrator runs at startup and crashes the process on DB failure, which is a coarse but workable substitute.
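A readiness-style probe of the kind proposed here could look like the following. This is a Python sketch of the idea only: health and db_ping are hypothetical names, and the 503 status with an "unhealthy" body is an assumed convention, since the document does not specify the failure response.

```python
def health(db_ping):
    """Report healthy only if the database answers a trivial query."""
    try:
        db_ping()  # e.g. a callable that runs SELECT 1 against the database
    except Exception:
        # Assumption: signal "not ready" with 503 so the orchestrator can react.
        return 503, {"status": "unhealthy"}
    return 200, {"status": "healthy"}
```

Passing the ping in as a callable keeps the probe independent of the database driver and makes the failure path easy to test with a function that raises.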
Mermaid diagram conventions
Per the suite documentation conventions:
- Participants match components/[##]_[name] directories.
- Node IDs are camelCase, no spaces.
- Decision nodes use {Question?} format.
- Start / End stadia use ([label]).
- External systems (autopilot, annotations, detection pipeline, gps-denied, admin, the React ui) use [[label]] subroutine shape and live in their own subgraphs.
- No styling; let the renderer theme handle it.