# Azaion.Missions — System Flows

> **NOTE (forward-looking)**: route prefixes, identifiers, and cascade chains in this document reflect the **post-rename, post-GPS-Denied-removal** state. Today's source still uses `[Route("flights")]`, `[Route("aircrafts")]`, `Aircraft*` / `Flight*` filenames, and the cascade still touches `orthophotos` + `gps_corrections`. Renames + cascade shrink tracked under Jira AZ-EPIC children B5 / B6 / B7 / B8. Per-flow `.md` files under `diagrams/flows/` follow the same convention.

## Flow Inventory

| # | Flow Name | Trigger | Primary Components | Criticality |
|---|-----------|---------|--------------------|-------------|
| F1 | Vehicle CRUD | Operator UI HTTP | `01_vehicle_catalog` → `04_persistence` | High |
| F2 | Mission create / read / update | Operator UI HTTP | `02_mission_planning` → `04_persistence`, with `01_vehicle_catalog` existence check | High |
| F3 | Mission delete with cross-service cascade | Operator UI HTTP `DELETE /missions/{id}` | `02_mission_planning` → `04_persistence` (touches `map_objects`, `media`, `annotations`, `detection`, `waypoints`, `missions`) | **Critical** (data integrity; not transaction-wrapped today) |
| F4 | Waypoint create / read / update / delete | Operator UI HTTP | `02_mission_planning` (`WaypointService`) → `04_persistence` | High (delete is a cross-service cascade variant of F3) |
| F5 | JWT bearer validation | Every protected request | `05_identity` (pipeline middleware) | **Critical** (cross-cutting; applies to every authenticated route) |
| F6 | Service startup + schema migration | Process start | `07_host` → `04_persistence` (`DatabaseMigrator.Migrate`) | High (one-shot per restart; B9 `DROP` runs here on legacy devices) |
| F7 | Health probe | Container orchestration / reverse proxy | `07_host` (`MapGet("/health")`) | Medium |

## Flow Dependencies

| Flow | Depends on | Shares data with |
|------|------------|------------------|
| F1 | F6 (schema must exist) | F2 (mission references `vehicle_id`) |
| F2 | F1 (vehicle existence check on create / update), F5 (auth), F6 | F3 (deletion), F4 (waypoint owner) |
| F3 | F2 (a mission must exist to delete it), F5, F6 | F4 (waypoint cascade is a sub-walk), `annotations` + detection pipeline (cross-service tables) |
| F4 | F2 (mission must exist to nest waypoints under it), F5, F6 | F3 (mission delete also walks all waypoint sub-trees) |
| F5 | None | Every protected flow (F1–F4) |
| F6 | None | Every flow (no flow can run before the schema is in place) |
| F7 | None | None |

## Cross-cutting concerns (apply to all HTTP flows)

These behaviors wrap every flow at the pipeline level. They are described once here rather than repeated in each flow:

1. **JWT bearer validation (F5)**. ASP.NET Core's `JwtBearerHandler` runs on every request marked `[Authorize]`. Validation is local (HMAC-SHA256, i.e. HS256, with a secret shared with `admin`) — no network call to the issuer. Failures surface as `401 Unauthorized` (no token / invalid signature / expired) or `403 Forbidden` (token valid but missing the `"FL"` permission claim). See `diagrams/flows/flow_jwt_validation.md` for the sequence.
2. **Permission gate**. Every controller action in `01_vehicle_catalog` and `02_mission_planning` carries `[Authorize(Policy = "FL")]`. The policy requirement is satisfied by a `permissions` claim equal to `"FL"`. The policy NAME is referenced as a raw string in feature controllers — a typo would silently turn into a permanent 403 (see `module-layout.md` § Verification Needed #4).
3. **Global exception → JSON middleware**. `ErrorHandlingMiddleware` (`06_http_conventions`) is registered FIRST in the pipeline. It maps `KeyNotFoundException → 404`, `ArgumentException → 400`, `InvalidOperationException → 409`; everything else → 500 with the stack trace logged. Wire shape: entity / DTO bodies are PascalCase (suite-spec divergence — see `architecture.md` ADR-002); the global error envelope is already camelCase (accidental match — the anonymous object literal `new { statusCode, message }` uses lowercase property names) but is still missing the spec's `errors` field.
4. **No correlation ID, no request-level audit trail**. Logs are timestamp-only; investigating a production incident means grepping by timestamp.

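
The exception mapping in item 3 can be modeled in a few lines. This is a minimal Python sketch, not the service's C# middleware: Python's `KeyError`, `ValueError`, and `RuntimeError` stand in for `KeyNotFoundException`, `ArgumentException`, and `InvalidOperationException`; the function name `to_error_response` is hypothetical; and the generic 500 message is an assumption (the source only says the stack trace is logged).

```python
# Stand-in exception types (assumption: Python built-ins model the .NET ones).
STATUS_BY_EXCEPTION = (
    (KeyError, 404),      # models KeyNotFoundException
    (ValueError, 400),    # models ArgumentException
    (RuntimeError, 409),  # models InvalidOperationException
)


def to_error_response(exc: Exception) -> tuple:
    """Return (http_status, body) using the camelCase envelope {statusCode, message}."""
    for exc_type, status in STATUS_BY_EXCEPTION:
        if isinstance(exc, exc_type):
            message = str(exc.args[0]) if exc.args else exc_type.__name__
            return status, {"statusCode": status, "message": message}
    # Anything unmapped becomes a 500; the message text here is an assumption.
    return 500, {"statusCode": 500, "message": "Internal server error"}


not_found = to_error_response(KeyError("Vehicle not found"))
conflict = to_error_response(RuntimeError("Vehicle is referenced by a mission"))
unmapped = to_error_response(ZeroDivisionError())
```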
---

## Flow F1: Vehicle CRUD

See `diagrams/flows/flow_vehicle_crud.md`.

**Description**: Full CRUD over the vehicle catalog. Every endpoint requires the `"FL"` permission. The list endpoint is **unpaginated by spec**; create, update, and setDefault all enforce the "exactly one default vehicle" exclusivity rule, which is **stricter than spec** (see `01_vehicle_catalog` Caveats #1 and Jira AZ-551 / B12 for the resolution decision).

**Preconditions**: Service is running; schema is in place (F6); caller holds a JWT with `permissions=FL`.

**Key sequence steps** (happy path, create with `IsDefault=true`):

1. UI → `POST /vehicles { Name, VehicleType, IsDefault: true, ... }` (with `Authorization: Bearer <jwt>`).
2. Pipeline → JWT validation (F5) + policy `"FL"` check.
3. `VehiclesController.Create` → `VehicleService.CreateVehicle(req)`.
4. `VehicleService.CreateVehicle`:
   - If `req.IsDefault == true`: `UPDATE vehicles SET is_default = FALSE WHERE is_default = TRUE` *(divergence from spec — clears every other row's default flag)*.
   - `INSERT INTO vehicles VALUES (...)`.
5. `Vehicle` entity returned on the wire (PascalCase JSON).

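
Step 4 above is a clear-then-insert pair. The sketch below replays it against an in-memory SQLite database in Python, purely as an illustration: the table columns, the function name `create_vehicle`, and the `with conn:` transaction wrapper are assumptions (the source states there is no transaction today), and SQLite's single-writer model does not reproduce the concurrent race on PostgreSQL.

```python
import sqlite3
import uuid


def create_vehicle(conn: sqlite3.Connection, name: str, is_default: bool) -> str:
    """Clear any existing default, then insert the new row (step 4 of the flow)."""
    vehicle_id = str(uuid.uuid4())
    with conn:  # assumption: one transaction, commit on success, roll back on error
        if is_default:
            conn.execute("UPDATE vehicles SET is_default = 0 WHERE is_default = 1")
        conn.execute(
            "INSERT INTO vehicles (id, name, is_default) VALUES (?, ?, ?)",
            (vehicle_id, name, int(is_default)),
        )
    return vehicle_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicles (id TEXT PRIMARY KEY, name TEXT NOT NULL, is_default INTEGER NOT NULL)"
)
create_vehicle(conn, "alpha", True)
create_vehicle(conn, "bravo", True)   # takes the default flag away from "alpha"
create_vehicle(conn, "charlie", False)

default_names = [row[0] for row in conn.execute("SELECT name FROM vehicles WHERE is_default = 1")]
total = conn.execute("SELECT COUNT(*) FROM vehicles").fetchone()[0]
```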
**Error scenarios**:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Missing or invalid JWT | Pipeline | `JwtBearerHandler` | `401 Unauthorized`; client refreshes the token at `admin` |
| JWT lacks `"FL"` permission | Pipeline | Policy evaluator | `403 Forbidden` |
| `DELETE` of a vehicle referenced by any mission | `VehicleService.DeleteVehicle` | `IsAny<Mission>(m => m.VehicleId == id)` check returns true | `InvalidOperationException` → `409 Conflict` |
| Vehicle not found by id | Get / Update / Delete / SetDefault | Entity lookup returns null | `KeyNotFoundException` → `404 Not Found` |
| Race on default-set | `VehicleService.CreateVehicle` / `UpdateVehicle` / `SetDefault` | None (no transaction) | Race window can leave 2+ `IsDefault=true` rows or zero defaults — see `01_vehicle_catalog` Caveats #1, tracked as B12 |

---

## Flow F2: Mission create / read / update

See `diagrams/flows/flow_mission_lifecycle.md`.

**Description**: Mission CRUD excluding delete (delete is F3 because of the cascade complexity). Create / update validate that the referenced `vehicle_id` exists; list is paginated (the only paginated endpoint in this service).

**Preconditions**: Service is running; schema is in place (F6); caller holds a JWT with `permissions=FL`; for create / update with `VehicleId`, the referenced vehicle must exist (F1).

**Key sequence steps** (create):

1. UI → `POST /missions { Name, VehicleId }`.
2. Pipeline → JWT + `"FL"` (F5).
3. `MissionsController.Create` → `MissionService.CreateMission(req)`.
4. `MissionService.CreateMission`:
   - `SELECT 1 FROM vehicles WHERE id = @VehicleId` (existence check, no transaction with the insert below).
   - If absent: throw `ArgumentException("VehicleId not found")` → `400` *(spec says `404`; minor divergence — see `02_mission_planning` Caveats #8)*.
   - `INSERT INTO missions (id, name, vehicle_id, created_date) VALUES (...)`.
5. `Mission` entity returned (PascalCase; LinqToDB does NOT eager-load `[Association]` navigation, so `Mission.Vehicle` and `Mission.Waypoints` serialize as `null` / `[]`).

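
The check-then-insert sequence in step 4 can be sketched as follows. This is an illustrative Python model over in-memory SQLite, not the service's LinqToDB code: the function name `create_mission` and the simplified columns are assumptions, and `ValueError` stands in for the `ArgumentException` that the service maps to `400`. As in the source, the existence check and the insert are two separate statements.

```python
import sqlite3
import uuid
from datetime import datetime, timezone


def create_mission(conn: sqlite3.Connection, name: str, vehicle_id: str) -> str:
    """Existence check first, then insert; the two statements are not atomic."""
    exists = conn.execute("SELECT 1 FROM vehicles WHERE id = ?", (vehicle_id,)).fetchone()
    if exists is None:
        raise ValueError("VehicleId not found")  # models ArgumentException -> 400
    mission_id = str(uuid.uuid4())
    created = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO missions (id, name, vehicle_id, created_date) VALUES (?, ?, ?, ?)",
        (mission_id, name, vehicle_id, created),
    )
    conn.commit()
    return mission_id


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE missions (id TEXT PRIMARY KEY, name TEXT, vehicle_id TEXT, created_date TEXT)"
)
conn.execute("INSERT INTO vehicles VALUES ('v1')")
conn.commit()

created_id = create_mission(conn, "survey-north", "v1")
try:
    create_mission(conn, "survey-south", "missing-vehicle")
    rejected = False
except ValueError:
    rejected = True
mission_count = conn.execute("SELECT COUNT(*) FROM missions").fetchone()[0]
```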
**Error scenarios**:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| `VehicleId` does not exist on create | `MissionService.CreateMission` | Existence check returns false | `ArgumentException` → `400 Bad Request` (spec wants `404`) |
| `VehicleId` deleted between existence check and insert (TOCTOU) | `MissionService.CreateMission` | FK constraint rejects insert | Npgsql `PostgresException` → `500 Internal Server Error` (UX gap — should map to `400`) |
| Mission not found on read / update | `MissionService.GetMission` / `UpdateMission` | Entity lookup returns null | `KeyNotFoundException` → `404` |

---

## Flow F3: Mission delete with cross-service cascade *(most critical)*

See `diagrams/flows/flow_mission_cascade_delete.md`.

**Description**: `DELETE /missions/{id}` walks the full ownership graph and tears down rows in tables this service does NOT own the schema for (`media`, `annotations`, `detection`) plus its own `map_objects`, `waypoints`, and `missions`. **NOT transaction-wrapped today** (`architecture.md` ADR-006); partial failure leaves orphans. After B7, the `orthophotos` + `gps_corrections` branches are gone — they belong to the separate `gps-denied` service which manages its own cleanup.

**Preconditions**: Mission exists (`KeyNotFoundException` → `404` otherwise); schema for the borrowed tables is present in `postgres-local` (in standard edge deployment, `annotations` and detection have run their migrations on the same DB).

**Cascade order** (strictly child-before-parent, FK-driven):

```
1. DELETE FROM map_objects WHERE mission_id = ? (autopilot writes; this service owns schema + cleanup)
2. SELECT id FROM waypoints WHERE mission_id = ? → waypointIds
3. If waypointIds.Any():
   SELECT id FROM media WHERE waypoint_id IN waypointIds → mediaIds
   SELECT id FROM annotations WHERE media_id IN mediaIds → annotationIds
   DELETE FROM detection WHERE annotation_id IN annotationIds (cross-service: detection pipeline)
   DELETE FROM annotations WHERE id IN annotationIds (cross-service: annotations)
   DELETE FROM media WHERE id IN mediaIds (cross-service: annotations)
4. DELETE FROM waypoints WHERE mission_id = ?
5. DELETE FROM missions WHERE id = ?
```

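
To make the ordering concrete, here is a small Python sketch that replays the cascade above against in-memory SQLite. It is a model under stated assumptions, not the service's code: the one-column-plus-parent schemas, the helper names `select_ids` / `delete_in` / `delete_mission`, and `KeyError` as a stand-in for `KeyNotFoundException` are all hypothetical. The `with conn:` block wraps the walk in a single transaction, which is the ADR-006 carry-forward rather than today's behavior.

```python
import sqlite3

SCHEMA = """
CREATE TABLE missions    (id TEXT PRIMARY KEY);
CREATE TABLE map_objects (id TEXT PRIMARY KEY, mission_id TEXT);
CREATE TABLE waypoints   (id TEXT PRIMARY KEY, mission_id TEXT);
CREATE TABLE media       (id TEXT PRIMARY KEY, waypoint_id TEXT);
CREATE TABLE annotations (id TEXT PRIMARY KEY, media_id TEXT);
CREATE TABLE detection   (id TEXT PRIMARY KEY, annotation_id TEXT);
"""
TABLES = ("missions", "map_objects", "waypoints", "media", "annotations", "detection")


def select_ids(conn, table, column, values):
    """Ids of rows in `table` whose `column` is in `values`; empty input skips the query."""
    if not values:
        return []
    marks = ",".join("?" * len(values))
    rows = conn.execute(f"SELECT id FROM {table} WHERE {column} IN ({marks})", values)
    return [row[0] for row in rows]


def delete_in(conn, table, column, values):
    """Delete rows in `table` whose `column` is in `values`; empty input is a no-op."""
    if values:
        marks = ",".join("?" * len(values))
        conn.execute(f"DELETE FROM {table} WHERE {column} IN ({marks})", values)


def delete_mission(conn, mission_id):
    found = conn.execute("SELECT 1 FROM missions WHERE id = ?", (mission_id,)).fetchone()
    if found is None:
        raise KeyError(mission_id)  # models KeyNotFoundException -> 404
    with conn:  # assumption: single transaction, so a failure rolls back every step
        conn.execute("DELETE FROM map_objects WHERE mission_id = ?", (mission_id,))  # step 1
        waypoint_ids = select_ids(conn, "waypoints", "mission_id", [mission_id])     # step 2
        media_ids = select_ids(conn, "media", "waypoint_id", waypoint_ids)           # step 3
        annotation_ids = select_ids(conn, "annotations", "media_id", media_ids)
        delete_in(conn, "detection", "annotation_id", annotation_ids)
        delete_in(conn, "annotations", "id", annotation_ids)
        delete_in(conn, "media", "id", media_ids)
        conn.execute("DELETE FROM waypoints WHERE mission_id = ?", (mission_id,))    # step 4
        conn.execute("DELETE FROM missions WHERE id = ?", (mission_id,))             # step 5


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for m in ("m1", "m2"):  # two missions, each with one row at every level
    conn.execute("INSERT INTO missions VALUES (?)", (m,))
    conn.execute("INSERT INTO map_objects VALUES (?, ?)", (f"o-{m}", m))
    conn.execute("INSERT INTO waypoints VALUES (?, ?)", (f"w-{m}", m))
    conn.execute("INSERT INTO media VALUES (?, ?)", (f"md-{m}", f"w-{m}"))
    conn.execute("INSERT INTO annotations VALUES (?, ?)", (f"a-{m}", f"md-{m}"))
    conn.execute("INSERT INTO detection VALUES (?, ?)", (f"d-{m}", f"a-{m}"))
conn.commit()

delete_mission(conn, "m1")
remaining = {t: [r[0] for r in conn.execute(f"SELECT id FROM {t}")] for t in TABLES}
```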
**Error scenarios**:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Mission not found | Step 0 (initial existence check) | Entity lookup | `KeyNotFoundException` → `404` |
| `relation does not exist` for `media` / `annotations` / `detection` | Step 3 | Npgsql `PostgresException` (`42P01`) | `500`. **Indicates `annotations` or detection pipeline never migrated on this device** — abnormal edge deployment. See `02_mission_planning` Caveats #6 |
| Partial failure mid-cascade (network blip, lock timeout) | Any step | Npgsql exception | `500`. **Orphan rows left behind**. Re-running the same DELETE is a partial fix — already-deleted children are no-ops, remaining children proceed (see ADR-006 carry-forward) |
| `autopilot` writes a `map_object` racing this delete | Step 1 vs. concurrent insert | None | The insert may succeed AFTER step 1 has already run; the orphan row stays until the next mission delete or manual cleanup. Small race window in practice (single-operator workflow) |

**Performance expectations**:

| Metric | Target | Notes |
|--------|--------|-------|
| End-to-end latency | <50ms typical | 4–7 sequential round-trips against local PostgreSQL on the same device |
| Throughput | 1 op / mission delete; not load-tested | Operator-paced; not a hot path |
| Orphan rate | 0 once transaction-wrap lands (ADR-006 carry-forward) | Today: non-zero on any failure mid-cascade |

---

## Flow F4: Waypoint create / read / update / delete

See `diagrams/flows/flow_waypoint_lifecycle.md`.

**Description**: Waypoint CRUD nested under a mission (`/missions/{id}/waypoints/*`). Delete is a scoped variant of F3's cascade — it walks `media` / `annotations` / `detection` for **one** waypoint instead of all waypoints of a mission. Same NO-transaction caveat applies (ADR-006). `UpdateWaypoint` is a **full overwrite** of every field even though the request DTO looks "partial-shaped" (see `02_mission_planning` Caveats #2). List is **unpaginated by spec**, ordered by `OrderNum`.

**Preconditions**: Parent mission exists (`KeyNotFoundException` → `404` otherwise).

**Key sequence steps** (delete one waypoint):

1. UI → `DELETE /missions/{id}/waypoints/{wpId}`.
2. Pipeline → JWT + `"FL"` (F5).
3. `MissionsController.DeleteWaypoint` → `WaypointService.DeleteWaypoint(missionId, wpId)`.
4. `WaypointService.DeleteWaypoint`:
   - Verify waypoint exists with `mission_id = @missionId AND id = @wpId` → 404 if not.
   - Resolve `mediaIds` for this one waypoint, then `annotationIds`.
   - `DELETE FROM detection WHERE annotation_id IN annotationIds`.
   - `DELETE FROM annotations WHERE id IN annotationIds`.
   - `DELETE FROM media WHERE id IN mediaIds`.
   - `DELETE FROM waypoints WHERE id = @wpId`.

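
The delete steps mirror F3, so the behavior in this flow that most benefits from an example is the full-overwrite update mentioned in the description. The Python sketch below contrasts it with the merge a partial-shaped DTO might suggest. Everything here is hypothetical apart from `OrderNum`: the `label` and `altitude_m` fields, the class names, and the assumption that an omitted field arrives as a null are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Waypoint:
    order_num: Optional[int]
    label: Optional[str]          # hypothetical field
    altitude_m: Optional[float]   # hypothetical field


@dataclass
class UpdateWaypointRequest:
    # Every field is optional, which is what makes the DTO look "partial-shaped".
    order_num: Optional[int] = None
    label: Optional[str] = None
    altitude_m: Optional[float] = None


def full_overwrite(existing: Waypoint, req: UpdateWaypointRequest) -> Waypoint:
    """Models the documented behavior: every field is replaced, omitted ones included."""
    return Waypoint(req.order_num, req.label, req.altitude_m)


def partial_merge(existing: Waypoint, req: UpdateWaypointRequest) -> Waypoint:
    """What a caller might expect instead: omitted fields keep their stored value."""
    return Waypoint(
        req.order_num if req.order_num is not None else existing.order_num,
        req.label if req.label is not None else existing.label,
        req.altitude_m if req.altitude_m is not None else existing.altitude_m,
    )


stored = Waypoint(order_num=3, label="gate", altitude_m=120.0)
request = UpdateWaypointRequest(label="gate-2")  # caller only intends to rename

overwritten = full_overwrite(stored, request)
merged = partial_merge(stored, request)
```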
**Error scenarios**: identical to F3, scoped to one waypoint.

---

## Flow F5: JWT bearer validation

See `diagrams/flows/flow_jwt_validation.md`.

**Description**: The cross-cutting auth flow that runs on every `[Authorize]` request. Validation is **local** — this service never calls the `admin` service that issued the token.

**Preconditions**: `JWT_SECRET` is set (or the dev fallback applies — see `architecture.md` ADR-005); the JWT bearer middleware was registered by `AddJwtAuth` in `07_host`.

**Key sequence steps**:

1. Request arrives at the ASP.NET Core pipeline with `Authorization: Bearer <jwt>`.
2. `JwtBearerHandler`:
   - Parse the token.
   - Verify HMAC-SHA256 signature with `SymmetricSecurityKey(UTF-8(JWT_SECRET))`.
   - Verify `lifetime` (`ClockSkew = 1 minute` — tighter than .NET's 5-minute default).
   - **Skip** `iss` / `aud` validation (`ValidateIssuer = false`, `ValidateAudience = false` — known CMMC L2 finding, suite-tracked under AZ-487 / AZ-494, see `05_identity` § Implementation Details).
3. If signature or lifetime fails: `401 Unauthorized` (without ever invoking the controller).
4. If valid: parse claims into `ClaimsPrincipal`; attach to the request.
5. Authorization policy `"FL"` evaluator checks for a `permissions` claim with value `"FL"`.
6. If absent: `403 Forbidden`.
7. If present: forward to the controller action.

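
The decision sequence above can be modeled with the Python standard library alone. This is a hedged sketch of local HS256 validation, not `JwtBearerHandler` itself: the helper names `sign` and `authorize` are hypothetical, only `exp` is checked (the real handler also evaluates `nbf`), `iss` / `aud` are skipped as in the source, and treating `permissions` as either a string or a list is an assumption about the token shape.

```python
import base64
import hashlib
import hmac
import json
import time

CLOCK_SKEW_SECONDS = 60  # mirrors ClockSkew = 1 minute


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _mac(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def sign(claims: dict, secret: str) -> str:
    """Mint an HS256 token (test helper; in the suite the issuer is `admin`)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url_encode(json.dumps(claims).encode("utf-8"))
    return f"{header}.{payload}.{_b64url_encode(_mac(f'{header}.{payload}', secret))}"


def authorize(token, secret: str, now=None) -> int:
    """Return 401, 403, or 200 following steps 2 to 7 of the flow."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except (AttributeError, ValueError):
        return 401  # missing or malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return 401
    if header.get("alg") != "HS256":
        return 401
    if not hmac.compare_digest(_mac(f"{header_b64}.{payload_b64}", secret), signature):
        return 401  # wrong secret or tampered token
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now > exp + CLOCK_SKEW_SECONDS:
        return 401  # expired beyond the allowed skew
    permissions = claims.get("permissions")
    granted = [permissions] if isinstance(permissions, str) else (permissions or [])
    return 200 if "FL" in granted else 403


good = sign({"permissions": "FL", "exp": 1000}, "shared-secret")
no_permission = sign({"permissions": "XX", "exp": 1000}, "shared-secret")
```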
**Error scenarios**:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Missing `Authorization` header | Pipeline | `JwtBearerHandler` | `401` |
| Invalid signature | Pipeline | HMAC verify fails | `401` |
| Expired token | Pipeline | `ValidateLifetime` (with 1min skew) | `401`; client re-authenticates with `admin` |
| Token signed with old `JWT_SECRET` (rotation) | Pipeline | HMAC verify fails | `401`; coordinated re-deploy across all backends sharing the secret + UI re-login |
| `permissions` claim missing or not `"FL"` | Policy evaluator | Claim lookup | `403` |
| `JWT_SECRET` is the well-known dev fallback in production | n/a (silent) | None at runtime | **Security risk** — any party with the fallback can mint accepted tokens. ADR-005 carry-forward; suite-level remediation pending |

---

## Flow F6: Service startup + schema migration

See `diagrams/flows/flow_startup_migration.md`.

**Description**: One-time-per-process bootstrap. `Program.cs` builds the DI graph, runs `DatabaseMigrator.Migrate(db)` once, then starts serving HTTP. The migrator is idempotent (`CREATE ... IF NOT EXISTS`). After B9, the migrator additionally runs `DROP TABLE IF EXISTS orthophotos; DROP TABLE IF EXISTS gps_corrections;` once for fielded edge devices that previously ran the legacy schema.

**Preconditions**: `DATABASE_URL` resolves (env or hardcoded dev fallback); `postgres-local` is reachable; the `azaion` database exists.

**Key sequence steps**:

1. Container starts → entrypoint `dotnet Azaion.Missions.dll`.
2. `Program.cs` reads `DATABASE_URL` → `ConvertPostgresUrl` → Npgsql connection string.
3. Reads `JWT_SECRET` → `AddJwtAuth(jwt)` (DI registration; no network).
4. Registers controllers, middleware, scoped `AppDataConnection`, scoped service classes.
5. Builds the host. Opens a single startup scope and calls `DatabaseMigrator.Migrate(db)`:
   - `CREATE TABLE IF NOT EXISTS vehicles (...)`.
   - `CREATE TABLE IF NOT EXISTS missions (...)`.
   - `CREATE TABLE IF NOT EXISTS waypoints (...)`.
   - `CREATE TABLE IF NOT EXISTS map_objects (...)`.
   - `CREATE INDEX IF NOT EXISTS ix_missions_vehicle_id ...` and similar.
   - **B9 one-shot**: `DROP TABLE IF EXISTS orthophotos; DROP TABLE IF EXISTS gps_corrections;`.
6. Registers `ErrorHandlingMiddleware` FIRST in the pipeline; mounts auth, controllers, `MapGet("/health")`, Swagger UI.
7. `app.Run()` — ready to serve HTTP on port 8080.

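
The idempotency claimed for step 5 is easy to demonstrate. The Python sketch below runs a guarded statement list twice against in-memory SQLite and checks that the second run changes nothing. The column definitions are placeholders (the source elides them as `(...)`), the function names are hypothetical, and SQLite stands in for PostgreSQL only because it accepts the same `IF NOT EXISTS` / `IF EXISTS` guards.

```python
import sqlite3

# Placeholder columns: the real table definitions are not shown in this document.
MIGRATION_STATEMENTS = [
    "CREATE TABLE IF NOT EXISTS vehicles (id TEXT PRIMARY KEY)",
    "CREATE TABLE IF NOT EXISTS missions (id TEXT PRIMARY KEY, vehicle_id TEXT)",
    "CREATE TABLE IF NOT EXISTS waypoints (id TEXT PRIMARY KEY, mission_id TEXT)",
    "CREATE TABLE IF NOT EXISTS map_objects (id TEXT PRIMARY KEY, mission_id TEXT)",
    "CREATE INDEX IF NOT EXISTS ix_missions_vehicle_id ON missions (vehicle_id)",
    # B9 one-shot: a no-op on devices that never had the legacy tables.
    "DROP TABLE IF EXISTS orthophotos",
    "DROP TABLE IF EXISTS gps_corrections",
]


def migrate(conn: sqlite3.Connection) -> None:
    """Run every statement in order; each one is safe to repeat."""
    for statement in MIGRATION_STATEMENTS:
        conn.execute(statement)
    conn.commit()


def table_names(conn: sqlite3.Connection) -> list:
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orthophotos (id TEXT)")  # simulate a legacy device
conn.commit()

migrate(conn)
after_first_run = table_names(conn)
migrate(conn)  # a restart repeats the same statements without error
after_second_run = table_names(conn)
```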
**Error scenarios**:

| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| `postgres-local` unreachable | Step 5 | Npgsql connection failure | Process exits non-zero. Watchtower restarts the container; `flight-gate` prevents restart mid-mission |
| `azaion` database does not exist | Step 5 | Npgsql `3D000` (`invalid_catalog_name`) | Process exits. Operator must create the database (provisioning concern, not this service) |
| `DROP TABLE IF EXISTS orthophotos` fails because the table is being read by `gps-denied` | Step 5 (B9 one-shot) | Lock timeout | Process exits. Restart loop until `gps-denied` releases the lock — should be moments. **Out-of-band ordering**: deploy `gps-denied` first so it has its own copy before `missions` drops the legacy tables |
| Migrator partial failure mid-statement | Step 5 | Npgsql exception | Process exits. Each statement is individually idempotent (`IF NOT EXISTS` on creates, `IF EXISTS` on drops) so the next startup retries safely |

---

## Flow F7: Health probe

See `diagrams/flows/flow_health_probe.md`.

**Description**: `GET /health` returns `{ "status": "healthy" }` with no auth. Used by container orchestration (Watchtower / docker compose healthcheck) and any reverse-proxy upstream check. Does NOT verify DB connectivity today — only confirms the process is up and the HTTP pipeline is serving.

**Preconditions**: HTTP pipeline is serving (i.e., `app.Run()` reached).

**Key sequence steps**:

1. Probe → `GET /health` (no `Authorization` header required).
2. `MapGet("/health")` returns `Results.Ok(new { status = "healthy" })`.

**Error scenarios**: none meaningful — if the pipeline is up the response is always 200. If the process is down, the probe fails at TCP-connect time and orchestration restarts it.

**Future improvement (carry-forward)**: gate `/health` on a DB ping so `flight-gate` and reverse-proxy checks reflect actual readiness rather than process liveness. Today the migrator runs at startup and crashes the process on DB failure, which is a coarse but workable substitute.

---

## Mermaid diagram conventions

Per the suite documentation conventions:

- **Participants** match `components/[##]_[name]` directories.
- **Node IDs** are camelCase, no spaces.
- **Decision nodes** use `{Question?}` format.
- **Start / End** stadia use `([label])`.
- **External systems** (`autopilot`, `annotations`, detection pipeline, `gps-denied`, `admin`, the React `ui`) use `[[label]]` subroutine shape and live in their own subgraphs.
- **No styling** — let the renderer theme handle it.