Oleksandr Bezdieniezhnykh 78dea8ebab
2026-05-15 03:23:23 +03:00


Azaion.Missions — System Flows

NOTE (forward-looking): route prefixes, identifiers, and cascade chains in this document reflect the post-rename, post-GPS-Denied-removal state. Today's source still uses [Route("flights")], [Route("aircrafts")], Aircraft* / Flight* filenames, and the cascade still touches orthophotos + gps_corrections. Renames + cascade shrink tracked under Jira AZ-EPIC children B5 / B6 / B7 / B8. Per-flow .md files under diagrams/flows/ follow the same convention.

Flow Inventory

| # | Flow Name | Trigger | Primary Components | Criticality |
|---|-----------|---------|--------------------|-------------|
| F1 | Vehicle CRUD | Operator UI HTTP | 01_vehicle_catalog → 04_persistence | High |
| F2 | Mission create / read / update | Operator UI HTTP | 02_mission_planning → 04_persistence, with 01_vehicle_catalog existence check | High |
| F3 | Mission delete with cross-service cascade | Operator UI HTTP DELETE /missions/{id} | 02_mission_planning → 04_persistence (touches map_objects, media, annotations, detection, waypoints, missions) | Critical (data integrity; not transaction-wrapped today) |
| F4 | Waypoint create / read / update / delete | Operator UI HTTP | 02_mission_planning (WaypointService) → 04_persistence | High (delete is a cross-service cascade variant of F3) |
| F5 | JWT bearer validation | Every protected request | 05_identity (pipeline middleware) | Critical (cross-cutting; applies to every authenticated route) |
| F6 | Service startup + schema migration | Process start | 07_host → 04_persistence (DatabaseMigrator.Migrate) | High (one-shot per restart; B9 DROP runs here on legacy devices) |
| F7 | Health probe | Container orchestration / reverse proxy | 07_host (MapGet("/health")) | Medium |

Flow Dependencies

| Flow | Depends on | Shares data with |
|------|------------|------------------|
| F1 | F6 (schema must exist) | F2 (mission references vehicle_id) |
| F2 | F1 (vehicle existence check on create / update), F5 (auth), F6 | F3 (deletion), F4 (waypoint owner) |
| F3 | F2 (a mission must exist to delete it), F5, F6 | F4 (waypoint cascade is a sub-walk), annotations + detection pipeline (cross-service tables) |
| F4 | F2 (mission must exist to nest waypoints under it), F5, F6 | F3 (mission delete also walks all waypoint sub-trees) |
| F5 | None | Every protected flow (F1–F4) |
| F6 | None | Every flow (no flow can run before the schema is in place) |
| F7 | None | None |

Cross-cutting concerns (apply to all HTTP flows)

These behaviors wrap every flow at the pipeline level. They are described once here rather than repeated in each flow:

  1. JWT bearer validation (F5). ASP.NET Core's JwtBearerHandler runs on every request marked [Authorize]. Validation is local ECDSA-SHA256 against admin's JWKS, which this service fetches lazily on the first protected request (not during process startup) and caches via Microsoft.IdentityModel.Protocols.ConfigurationManager<JsonWebKeySet>; after that first fetch, request-path validation does not call admin, and the only further traffic to admin is the manager's scheduled refresh. The handler enforces iss == JWT_ISSUER, aud == JWT_AUDIENCE, exp (with 30-second clock skew), and pins alg to EcdsaSha256 (defends against HS256-confusion). Failures surface as 401 Unauthorized (token missing, or algorithm / signature / claims / lifetime invalid) or 403 Forbidden (token valid but missing the "FL" permission claim). See diagrams/flows/flow_jwt_validation.md for the sequence.
  2. Permission gate. Every controller action in 01_vehicle_catalog and 02_mission_planning carries [Authorize(Policy = "FL")]. The policy requirement is satisfied by a permissions claim equal to "FL". The policy NAME is referenced as a raw string in feature controllers — a typo would silently turn into a permanent 403 (see module-layout.md § Verification Needed #4).
  3. Global exception → JSON middleware. ErrorHandlingMiddleware (06_http_conventions) is registered FIRST in the pipeline. It maps KeyNotFoundException → 404, ArgumentException → 400, InvalidOperationException → 409; everything else → 500 with the stack trace logged. Wire shape: entity / DTO bodies are PascalCase (suite-spec divergence — see architecture.md ADR-002); the global error envelope is camelCase already (accidental match — anonymous object literal new { statusCode, message } uses lowercase property names) but still missing the spec's errors field.
  4. No correlation ID, no request-level audit trail. Logs are timestamp-only; supporting a production incident requires grep-by-timestamp.
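The exception-to-status mapping in item 3 can be modeled in a few lines. This is an illustrative Python sketch, not the service's C# middleware: Python built-ins stand in for the .NET exception types, `to_error_response` is a hypothetical name, and the 500 message text is an assumption (the source only says the stack trace is logged).

```python
# Illustrative model only; the real ErrorHandlingMiddleware is C#.
# Python built-ins stand in for the .NET exception types named above.
STATUS_BY_EXCEPTION = [
    (KeyError, 404),      # stands in for KeyNotFoundException
    (ValueError, 400),    # stands in for ArgumentException
    (RuntimeError, 409),  # stands in for InvalidOperationException
]


def to_error_response(exc):
    """Map an exception to the camelCase envelope { statusCode, message }."""
    for exc_type, status in STATUS_BY_EXCEPTION:
        if isinstance(exc, exc_type):
            message = str(exc.args[0]) if exc.args else exc_type.__name__
            return {"statusCode": status, "message": message}
    # Anything unmapped becomes a 500 (message text here is an assumption).
    return {"statusCode": 500, "message": "Internal Server Error"}


not_found = to_error_response(KeyError("Mission not found"))
conflict = to_error_response(RuntimeError("Vehicle is referenced by a mission"))
unexpected = to_error_response(OSError("connection reset"))
```

Note that the envelope carries only `statusCode` and `message`, which matches the gap called out above: there is no `errors` field.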

Flow F1: Vehicle CRUD

See diagrams/flows/flow_vehicle_crud.md.

Description: Full CRUD over the vehicle catalog. Every endpoint requires the "FL" permission. The list endpoint is unpaginated by spec; create + update + setDefault contain the "exactly one default vehicle" exclusivity rule that is stricter than spec (see 01_vehicle_catalog Caveats #1 and Jira AZ-551 / B12 for the resolution decision).

Preconditions: Service is running; schema is in place (F6); caller holds a JWT with permissions=FL.

Key sequence steps (happy path, create with IsDefault=true):

  1. UI → POST /vehicles { Name, VehicleType, IsDefault: true, ... } (with Authorization: Bearer <jwt>).
  2. Pipeline → JWT validation (F5) + policy "FL" check.
  3. VehiclesController.Create → VehicleService.CreateVehicle(req).
  4. VehicleService.CreateVehicle:
    • If req.IsDefault == true: UPDATE vehicles SET is_default = FALSE WHERE is_default = TRUE (divergence from spec — clears every other row's default flag).
    • INSERT INTO vehicles VALUES (...).
  5. Vehicle entity returned on the wire (PascalCase JSON).
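The clear-then-insert behavior in step 4 can be tried against an in-memory SQLite database. This is a minimal sketch under stated assumptions: the two-column `vehicles` schema and the `create_vehicle` helper are hypothetical, SQLite stands in for PostgreSQL, and the sketch runs both statements in one transaction for convenience, whereas the service is described as running them without one.

```python
import sqlite3


def create_vehicle(conn, name, is_default):
    """Clear any existing default, then insert the new row (step 4 above)."""
    with conn:  # one transaction in this sketch; the service runs these unwrapped
        if is_default:
            conn.execute("UPDATE vehicles SET is_default = 0 WHERE is_default = 1")
        cur = conn.execute(
            "INSERT INTO vehicles (name, is_default) VALUES (?, ?)",
            (name, int(is_default)),
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real table has more columns.
conn.execute(
    "CREATE TABLE vehicles (id INTEGER PRIMARY KEY, name TEXT, is_default INTEGER)"
)
create_vehicle(conn, "alpha", True)
create_vehicle(conn, "bravo", True)  # takes the default flag away from "alpha"
defaults = [
    row[0] for row in conn.execute("SELECT name FROM vehicles WHERE is_default = 1")
]
```

After both calls only the most recently created vehicle is still flagged as default, which is the exclusivity rule the description calls stricter than spec.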

Error scenarios:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Missing or invalid JWT | Pipeline | JwtBearerHandler | 401 Unauthorized; client refreshes the token at admin |
| JWT lacks "FL" permission | Pipeline | Policy evaluator | 403 Forbidden |
| DELETE of a vehicle referenced by any mission | VehicleService.DeleteVehicle | IsAny<Mission>(m => m.VehicleId == id) check returns true | InvalidOperationException → 409 Conflict |
| Vehicle not found by id | Get / Update / Delete / SetDefault | Entity lookup returns null | KeyNotFoundException → 404 Not Found |
| Race on default-set | VehicleService.CreateVehicle / UpdateVehicle / SetDefault | None (no transaction) | Race window can leave 2+ IsDefault=true rows or zero defaults; see 01_vehicle_catalog Caveats #1, tracked as B12 |

Flow F2: Mission create / read / update

See diagrams/flows/flow_mission_lifecycle.md.

Description: Mission CRUD excluding delete (delete is F3 because of the cascade complexity). Create / update validate that the referenced vehicle_id exists; list is paginated (the only paginated endpoint in this service).

Preconditions: Service is running; schema is in place (F6); caller holds a JWT with permissions=FL; for create / update with VehicleId, the referenced vehicle must exist (F1).

Key sequence steps (create):

  1. UI → POST /missions { Name, VehicleId }.
  2. Pipeline → JWT + "FL" (F5).
  3. MissionsController.Create → MissionService.CreateMission(req).
  4. MissionService.CreateMission:
    • SELECT 1 FROM vehicles WHERE id = @VehicleId (existence check, no transaction with the insert below).
    • If absent: throw ArgumentException("VehicleId not found") → 400 (spec says 404; minor divergence, see 02_mission_planning Caveats #8).
    • INSERT INTO missions (id, name, vehicle_id, created_date) VALUES (...).
  5. Mission entity returned (PascalCase; LinqToDB does NOT eager-load [Association] navigation, so Mission.Vehicle and Mission.Waypoints serialize as null / []).
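The check-then-insert sequence in step 4, and the foreign-key backstop that catches a vehicle deleted in between, can be sketched as follows. The schema, the `BadRequest` class, and `create_mission` are hypothetical stand-ins; SQLite replaces PostgreSQL, so the constraint failure shows up as `sqlite3.IntegrityError` rather than an Npgsql PostgresException.

```python
import sqlite3
import uuid


class BadRequest(Exception):
    """Hypothetical stand-in for ArgumentException (mapped to 400)."""


def create_mission(conn, name, vehicle_id):
    # Existence check: a separate statement, not atomic with the insert below.
    found = conn.execute(
        "SELECT 1 FROM vehicles WHERE id = ?", (vehicle_id,)
    ).fetchone()
    if found is None:
        raise BadRequest("VehicleId not found")
    # Insert: if the vehicle vanished after the check, the FK rejects the row.
    mission_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO missions (id, name, vehicle_id) VALUES (?, ?, ?)",
        (mission_id, name, vehicle_id),
    )
    return mission_id


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE vehicles (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE missions (id TEXT PRIMARY KEY, name TEXT,"
    " vehicle_id TEXT REFERENCES vehicles(id))"
)
conn.execute("INSERT INTO vehicles (id) VALUES ('v1')")
mission_id = create_mission(conn, "survey", "v1")
```

Calling `create_mission(conn, "survey", "missing")` raises `BadRequest` from the existence check. An insert that bypasses the check with an unknown vehicle id fails on the constraint instead, which is the path that currently surfaces as a 500.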

Error scenarios:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| VehicleId does not exist on create | MissionService.CreateMission | Existence check returns false | ArgumentException → 400 Bad Request (spec wants 404) |
| VehicleId deleted between existence check and insert (TOCTOU) | MissionService.CreateMission | FK constraint rejects insert | Npgsql PostgresException → 500 Internal Server Error (UX gap: should map to 400) |
| Mission not found on read / update | MissionService.GetMission / UpdateMission | Entity lookup returns null | KeyNotFoundException → 404 |

Flow F3: Mission delete with cross-service cascade (most critical)

See diagrams/flows/flow_mission_cascade_delete.md.

Description: DELETE /missions/{id} walks the full ownership graph and tears down rows in tables this service does NOT own the schema for (media, annotations, detection) plus its own map_objects, waypoints, and missions. NOT transaction-wrapped today (architecture.md ADR-006); partial failure leaves orphans. After B7, the orthophotos + gps_corrections branches are gone — they belong to the separate gps-denied service which manages its own cleanup.

Preconditions: Mission exists (KeyNotFoundException → 404 otherwise); schema for the borrowed tables is present in postgres-local (in standard edge deployment, annotations and detection have run their migrations on the same DB).

Cascade order (strictly child-before-parent, FK-driven):

1. DELETE FROM map_objects   WHERE mission_id = ?    (autopilot writes; this service owns schema + cleanup)
2. SELECT id FROM waypoints  WHERE mission_id = ?    → waypointIds
3. If waypointIds.Any():
     SELECT id FROM media         WHERE waypoint_id IN waypointIds → mediaIds
     SELECT id FROM annotations   WHERE media_id    IN mediaIds    → annotationIds
     DELETE FROM detection   WHERE annotation_id IN annotationIds  (cross-service: detection pipeline)
     DELETE FROM annotations WHERE id IN annotationIds              (cross-service: annotations)
     DELETE FROM media       WHERE id IN mediaIds                   (cross-service: annotations)
4. DELETE FROM waypoints WHERE mission_id = ?
5. DELETE FROM missions  WHERE id = ?

Error scenarios:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Mission not found | Step 0 (initial existence check) | Entity lookup | KeyNotFoundException → 404 |
| relation does not exist for media / annotations / detection | Step 3 | Npgsql PostgresException (42P01) | 500. Indicates annotations or detection pipeline never migrated on this device, an abnormal edge deployment. See 02_mission_planning Caveats #6 |
| Partial failure mid-cascade (network blip, lock timeout) | Any step | Npgsql exception | 500. Orphan rows left behind. Re-running the same DELETE is a partial fix: already-deleted children are no-ops, remaining children proceed (see ADR-006 carry-forward) |
| autopilot writes a map_object racing this delete | Step 1 vs. concurrent insert | None | The insert may succeed AFTER step 1 reads zero rows; the orphan row stays until the next mission delete or manual cleanup. Small race window in practice (single-operator workflow) |

Performance expectations:

| Metric | Target | Notes |
|--------|--------|-------|
| End-to-end latency | <50ms typical | 4–7 sequential round-trips against local PostgreSQL on the same device |
| Throughput | 1 op / mission delete; not load-tested | Operator-paced; not a hot path |
| Orphan rate | 0 once transaction-wrap lands (ADR-006 carry-forward) | Today: non-zero on any failure mid-cascade |

Flow F4: Waypoint create / read / update / delete

See diagrams/flows/flow_waypoint_lifecycle.md.

Description: Waypoint CRUD nested under a mission (/missions/{id}/waypoints/*). Delete is a scoped variant of F3's cascade — it walks media / annotations / detection for one waypoint instead of all waypoints of a mission. Same NO-transaction caveat applies (ADR-006). UpdateWaypoint is a full overwrite of every field even though the request DTO looks "partial-shaped" (see 02_mission_planning Caveats #2). List is unpaginated by spec, ordered by OrderNum.

Preconditions: Parent mission exists (KeyNotFoundException → 404 otherwise).

Key sequence steps (delete one waypoint):

  1. UI → DELETE /missions/{id}/waypoints/{wpId}.
  2. Pipeline → JWT + "FL" (F5).
  3. MissionsController.DeleteWaypoint → WaypointService.DeleteWaypoint(missionId, wpId).
  4. WaypointService.DeleteWaypoint:
    • Verify waypoint exists with mission_id = @missionId AND id = @wpId → 404 if not.
    • Resolve mediaIds for this one waypoint, then annotationIds.
    • DELETE FROM detection WHERE annotation_id IN annotationIds.
    • DELETE FROM annotations WHERE id IN annotationIds.
    • DELETE FROM media WHERE id IN mediaIds.
    • DELETE FROM waypoints WHERE id = @wpId.

Error scenarios: identical to F3 scoped to one waypoint.


Flow F5: JWT bearer validation

See diagrams/flows/flow_jwt_validation.md.

Description: The cross-cutting auth flow that runs on every [Authorize] request. Signature validation is local against admin's cached JWKS public keys; the only calls to admin are the JWKS fetch (lazy, triggered by the first protected request rather than by process start) and the ConfigurationManager's refreshes on its default schedule. Request-path validation does NOT call admin.

Preconditions: JWT_ISSUER, JWT_AUDIENCE, and JWT_JWKS_URL all resolved at startup via ConfigurationResolver.ResolveRequiredOrThrow (any missing value aborts startup); the JWT bearer middleware was registered by AddJwtAuth(builder.Configuration) in 07_host.

Key sequence steps:

  1. Request arrives at the ASP.NET Core pipeline with Authorization: Bearer <jwt>.
  2. JwtBearerHandler:
    • Parse the token header.
    • Reject unless alg ∈ ValidAlgorithms (pinned to EcdsaSha256 — defends against HS256-confusion).
    • Resolve signing key for the token's kid via the cached ConfigurationManager<JsonWebKeySet>. On a cold cache, this triggers a one-time HTTPS GET of JWT_JWKS_URL from admin.
    • Verify ECDSA-SHA256 signature against the matching public key.
    • Verify iss == JWT_ISSUER, aud == JWT_AUDIENCE, exp (with 30-second clock skew).
  3. If algorithm, signature, claims, or lifetime fails: 401 Unauthorized (without ever invoking the controller).
  4. If valid: parse claims into ClaimsPrincipal; attach to the request.
  5. Authorization policy "FL" evaluator checks for a permissions claim with value "FL".
  6. If absent: 403 Forbidden.
  7. If present: forward to the controller action.
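The decision order in steps 2 to 7 can be modeled as a pure function over an already-decoded token. This sketch deliberately omits signature verification and the JWKS lookup. `authorize` is a hypothetical name, returning 200 is shorthand for "forward to the controller", and accepting either a string or a list for `aud` and `permissions` is an assumption about the token shape. `ES256` is the JWT `alg` value for ECDSA with SHA-256.

```python
import time

ALLOWED_ALGS = {"ES256"}  # JWT alg name for ECDSA using P-256 and SHA-256
CLOCK_SKEW_SECONDS = 30


def _as_list(value):
    """Treat a single claim value and a list of values uniformly."""
    return value if isinstance(value, list) else [value]


def authorize(header, claims, issuer, audience, now=None):
    """Return 200 (proceed), 401, or 403 for a decoded token.

    Signature verification is out of scope for this sketch.
    """
    now = time.time() if now is None else now
    if header.get("alg") not in ALLOWED_ALGS:
        return 401  # algorithm pin: rejects HS256-confusion tokens
    if claims.get("iss") != issuer:
        return 401
    if audience not in _as_list(claims.get("aud")):
        return 401
    exp = claims.get("exp")
    if exp is None or now > exp + CLOCK_SKEW_SECONDS:
        return 401  # expired, allowing for the 30-second skew
    if "FL" not in _as_list(claims.get("permissions")):
        return 403  # authenticated, but lacks the required permission
    return 200


claims = {"iss": "admin", "aud": "missions", "exp": 1000, "permissions": "FL"}
ok = authorize({"alg": "ES256"}, claims, "admin", "missions", now=900)
forged = authorize({"alg": "HS256"}, claims, "admin", "missions", now=900)
within_skew = authorize({"alg": "ES256"}, claims, "admin", "missions", now=1020)
expired = authorize({"alg": "ES256"}, claims, "admin", "missions", now=1031)
```

The ordering matters for the status code: every authentication failure yields 401 before the permission check runs, so 403 is only reachable with an otherwise valid token.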

Error scenarios:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Missing Authorization header | Pipeline | JwtBearerHandler | 401 |
| Forged alg: HS256 token | Pipeline | ValidAlgorithms pin | 401. Pin defense; see 05_identity Caveats #6 |
| Invalid signature | Pipeline | ECDSA verify fails | 401 |
| iss ≠ JWT_ISSUER | Pipeline | ValidateIssuer = true | 401 |
| aud ≠ JWT_AUDIENCE | Pipeline | ValidateAudience = true | 401 |
| Expired token | Pipeline | ValidateLifetime (with 30s skew) | 401; client re-authenticates with admin |
| kid not in cached JWKS | IssuerSigningKeyResolver | No matching key | 401. Manager refreshes on default schedule; new kid becomes available there |
| admin unreachable on first JWKS fetch | HttpDocumentRetriever | HttpRequestException | First request fails 500. Subsequent requests retry on next refresh. Operationally: keep admin reachable from edge |
| permissions claim missing or not "FL" | Policy evaluator | Claim lookup | 403 |
| JWKS rotation on admin | ConfigurationManager refresh | next scheduled refresh tick | No coordinated redeploy needed: new keys are picked up on refresh; old tokens with the old kid remain valid until expiry |

Flow F6: Service startup + schema migration

See diagrams/flows/flow_startup_migration.md.

Description: One-time-per-process bootstrap. Program.cs builds the DI graph, runs DatabaseMigrator.Migrate(db) once, then starts serving HTTP. The migrator is idempotent (CREATE ... IF NOT EXISTS). After B9, the migrator additionally runs DROP TABLE IF EXISTS orthophotos; DROP TABLE IF EXISTS gps_corrections; once for fielded edge devices that previously ran the legacy schema.

Preconditions: DATABASE_URL, JWT_ISSUER, JWT_AUDIENCE, JWT_JWKS_URL all resolve via ConfigurationResolver.ResolveRequiredOrThrow (any missing value aborts startup — no hardcoded fallbacks); postgres-local is reachable; the azaion database exists. admin does NOT need to be reachable at this point — the JWKS fetch is lazy on the first protected request.

Key sequence steps:

  1. Container starts → entrypoint dotnet Azaion.Missions.dll.
  2. Program.cs resolves DATABASE_URL via ConfigurationResolver.ResolveRequiredOrThrow → ConvertPostgresUrl → Npgsql connection string.
  3. Calls AddJwtAuth(builder.Configuration), which resolves JWT_ISSUER, JWT_AUDIENCE, JWT_JWKS_URL (each via ResolveRequiredOrThrow), wires the ConfigurationManager<JsonWebKeySet>, and registers JWT bearer + "FL" (+ legacy "GPS" until B7) policies. No network call yet.
  4. Reads CorsConfig:AllowedOrigins + CorsConfig:AllowAnyOrigin; runs CorsConfigurationValidator.EnsureSafeForEnvironment (throws in Production with implicit-permissive config); registers the CORS policy (permissive OR WithOrigins).
  5. Registers controllers, middleware, scoped AppDataConnection, scoped service classes.
  6. Builds the host. If implicit-permissive CORS applies (non-Production, empty origins, AllowAnyOrigin=false), logs PermissiveDefaultWarning at startup. Opens a single startup scope and calls DatabaseMigrator.Migrate(db):
    • CREATE TABLE IF NOT EXISTS vehicles (...).
    • CREATE TABLE IF NOT EXISTS missions (...).
    • CREATE TABLE IF NOT EXISTS waypoints (...).
    • CREATE TABLE IF NOT EXISTS map_objects (...).
    • CREATE INDEX IF NOT EXISTS ix_missions_vehicle_id ... and similar.
    • B9 one-shot: DROP TABLE IF EXISTS orthophotos; DROP TABLE IF EXISTS gps_corrections;.
  7. Registers ErrorHandlingMiddleware FIRST in the pipeline; mounts CORS, auth, controllers, MapGet("/health"), Swagger UI.
  8. app.Run() — ready to serve HTTP on port 8080.
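The idempotency claim for step 6 is easy to check: running the same statements twice should leave the schema unchanged. This sketch uses SQLite in place of PostgreSQL, and the column lists are placeholders because the source elides them with (...). Only the statement shapes (CREATE ... IF NOT EXISTS, DROP TABLE IF EXISTS) come from the list above.

```python
import sqlite3

# Column definitions are placeholders; the source elides the real ones.
MIGRATION_STATEMENTS = [
    "CREATE TABLE IF NOT EXISTS vehicles (id TEXT PRIMARY KEY)",
    "CREATE TABLE IF NOT EXISTS missions (id TEXT PRIMARY KEY, vehicle_id TEXT)",
    "CREATE TABLE IF NOT EXISTS waypoints (id TEXT PRIMARY KEY, mission_id TEXT)",
    "CREATE TABLE IF NOT EXISTS map_objects (id TEXT PRIMARY KEY, mission_id TEXT)",
    "CREATE INDEX IF NOT EXISTS ix_missions_vehicle_id ON missions (vehicle_id)",
    # B9 one-shot: harmless on devices that never had the legacy tables.
    "DROP TABLE IF EXISTS orthophotos",
    "DROP TABLE IF EXISTS gps_corrections",
]


def migrate(conn):
    """Apply every statement in order; each one is safe to repeat."""
    for statement in MIGRATION_STATEMENTS:
        conn.execute(statement)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orthophotos (id TEXT)")  # simulate a legacy device
migrate(conn)
migrate(conn)  # a second startup: every statement is a no-op
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Because each statement stands alone, a crash between two of them leaves a state that the next startup completes, which is the recovery path the error table for this flow relies on.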

Error scenarios:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Missing DATABASE_URL / JWT_ISSUER / JWT_AUDIENCE / JWT_JWKS_URL | Step 2 or 3 | ResolveRequiredOrThrow throws InvalidOperationException | Process exits non-zero with a message naming the env var and config key. Watchtower restarts, but the new container hits the same failure until the value is provided |
| CORS misconfigured in Production (empty origins + AllowAnyOrigin != true) | Step 4 | EnsureSafeForEnvironment throws | Process exits with MissingOriginsMessage. Fix: set CorsConfig:AllowedOrigins or explicit AllowAnyOrigin=true |
| postgres-local unreachable | Step 6 | Npgsql connection failure | Process exits non-zero. Watchtower restarts the container; flight-gate prevents restart mid-mission |
| azaion database does not exist | Step 6 | Npgsql 3D000 (invalid_catalog_name) | Process exits. Operator must create the database (provisioning concern, not this service) |
| DROP TABLE IF EXISTS orthophotos fails because the table is being read by gps-denied | Step 6 (B9 one-shot) | Lock timeout | Process exits. Restart loop until gps-denied releases the lock, which should take moments. Out-of-band ordering: deploy gps-denied first so it has its own copy before missions drops the legacy tables |
| Migrator partial failure mid-statement | Step 6 | Npgsql exception | Process exits. Each statement is individually idempotent (IF NOT EXISTS) so the next startup retries safely |

Flow F7: Health probe

See diagrams/flows/flow_health_probe.md.

Description: GET /health returns { "status": "healthy" } with no auth. Used by container orchestration (Watchtower / docker compose healthcheck) and any reverse-proxy upstream check. Does NOT verify DB connectivity today — only confirms the process is up and the HTTP pipeline is serving.

Preconditions: HTTP pipeline is serving (i.e., app.Run() reached).

Key sequence steps:

  1. Probe → GET /health (no Authorization header required).
  2. MapGet("/health") returns Results.Ok(new { status = "healthy" }).

Error scenarios: none meaningful — if the pipeline is up the response is always 200. If the process is down, the probe fails at TCP-connect time and orchestration restarts it.

Future improvement (carry-forward): gate /health on a DB ping so flight-gate and reverse-proxy checks reflect actual readiness rather than process liveness. Today the migrator runs at startup and crashes the process on DB failure, which is a coarse but workable substitute.
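One possible shape for the DB-gated probe is sketched below. Nothing here exists in the service today: `health`, the 503 status, and the "unhealthy" body are all assumptions chosen for illustration, and SQLite stands in for the PostgreSQL ping.

```python
import sqlite3


def health(ping_db):
    """Return (status_code, body); ping_db is any callable that raises on failure."""
    try:
        ping_db()
    except Exception:
        # Assumed failure contract: 503 with an "unhealthy" status.
        return 503, {"status": "unhealthy"}
    return 200, {"status": "healthy"}


conn = sqlite3.connect(":memory:")


def ping():
    conn.execute("SELECT 1").fetchone()


up = health(ping)
conn.close()  # simulate the database going away
down = health(ping)
```

Passing the ping in as a callable keeps the probe handler free of connection details, so the same function could sit behind a liveness route (no ping) and a readiness route (with ping) if the two are ever split.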


Mermaid diagram conventions

Per the suite documentation conventions:

  • Participants match components/[##]_[name] directories.
  • Node IDs are camelCase, no spaces.
  • Decision nodes use {Question?} format.
  • Start / End stadia use ([label]).
  • External systems (autopilot, annotations, detection pipeline, gps-denied, admin, the React ui) use [[label]] subroutine shape and live in their own subgraphs.
  • No styling — let the renderer theme handle it.