# Azaion.Missions — Architecture

> **NOTE (forward-looking)**: this document reflects the **post-rename, post-GPS-Denied-removal** target. Today's source still uses `Azaion.Flights` namespace, `Aircraft*`/`Flight*`/`Orthophoto*`/`GpsCorrection*` filenames, `[Route("aircrafts"|"flights")]`, and migrates 6 tables. The renames + drops are tracked under Jira AZ-EPIC + child tickets B5–B12 (see `_docs/_process_leftovers/2026-05-14_rename-flights-to-missions.md`). The doc IS the spec for that work.

## Architecture Vision

> **Status**: confirmed-by-user (autodev `/document` Step 4.5, 2026-05-14). Source-of-truth for "what this service is and why" — downstream skills (`/refactor`, `/decompose`, `/new-task`, `/code-review`) consume this section before reading the lower-level technical sections below.

`missions` is the **edge-tier .NET 10 REST service** that owns the **mission domain** of each Azaion deployment — vehicle inventory, mission plans, waypoint sequences, and the cross-service cascade-delete that keeps the rest of the edge stack consistent when missions or waypoints are removed. **Exactly one instance runs per device** (Jetson Orin / OrangePI / operator-PC) alongside sibling edge services (`annotations`, detection, `autopilot`, `gps-denied`, `ui`), all sharing **ONE local PostgreSQL with per-service table ownership enforced by convention**. JWTs are minted remotely by the central `admin` service using ECDSA-SHA256 and validated locally against `admin`'s JWKS, which this service fetches once at startup and caches; request-path validation is local and does not call `admin`. The dominant pattern is **thin controller → service → linq2db active-record over a per-request scoped `DataConnection`**, with **no repository abstraction** and **no in-process message queue / event bus**.

### Components & responsibilities (6 logical components, 1 csproj)

| # | Component | Responsibility |
|---|-----------|----------------|
| 01 | `01_vehicle_catalog` | Vehicle CRUD + "is_default" exclusivity (stricter than spec — B12 decision pending) |
| 02 | `02_mission_planning` | Mission + Waypoint CRUD + the cross-service cascade-delete walk (canonical owner of the full mission ownership graph) |
| 04 | `04_persistence` | `AppDataConnection` (LinqToDB) + `DatabaseMigrator` (`CREATE TABLE IF NOT EXISTS` for the 4 owned tables post-B7 + B9) |
| 05 | `05_identity` | `JwtExtensions`; ECDSA-SHA256 validation against admin's JWKS (cached locally); one `"FL"` policy (post-B7) |
| 06 | `06_http_conventions` | `ErrorHandlingMiddleware` + `PaginatedResponse<T>` + the unused `ErrorResponse` DTO |
| 07 | `07_host` | `Program.cs` composition root; runs migrator at startup; serves on port 8080 |

### Major data flows (7 — see `system-flows.md` for full sequences)

- **F1 Vehicle CRUD** — operator UI → vehicle service → DB.
- **F2 Mission create/read/update** — UI → mission service, with vehicle existence check.
- **F3 Mission delete + CASCADE** *(critical)* — walks across `annotations` + detection schemas; **not transaction-wrapped today** (ADR-006).
- **F4 Waypoint CRUD** — delete is a scoped F3 cascade.
- **F5 JWT bearer validation** — every protected request; local ECDSA-SHA256 against admin's JWKS (cached); `iss` and `aud` both validated; `alg` pinned to `EcdsaSha256` (defends against HS256-confusion). The CMMC L2 finding tracked under AZ-487 / AZ-494 is now structurally addressed in this service's code; the suite-level docs still describe the legacy HS256 model and have a sync task pending.
- **F6 Startup + schema migration** — `Program → DatabaseMigrator.Migrate → app.Run`.
- **F7 Health probe** — anonymous `GET /health`; process-liveness only.

### Architectural principles / non-negotiables (inferred from the code)

- **One PostgreSQL per device; per-service table ownership enforced by convention.** *[inferred-from: `../../suite/_docs/00_top_level_architecture.md` § Database Topology, `Database/AppDataConnection.cs`, `Database/DatabaseMigrator.cs`]*
- **Manual cascade-delete in code, NOT `ON DELETE CASCADE` in schema.** *[inferred-from: `Database/DatabaseMigrator.cs`, `FlightService.DeleteFlight` (`MissionService.DeleteMission` post-rename)]*
- **JWT validated locally against admin's public JWKS** (ECDSA-SHA256). The JWKS is fetched once at startup (via `Microsoft.IdentityModel.Protocols.ConfigurationManager<JsonWebKeySet>`) and refreshed on the default schedule; per-request validation is local. *[inferred-from: `Auth/JwtExtensions.cs`]*
- **Forward-only-additive schema bootstrap** (`CREATE TABLE IF NOT EXISTS`); B9's `DROP TABLE IF EXISTS` is the one explicit destructive step. *[inferred-from: `Database/DatabaseMigrator.cs`]*
- **Layer-organized layout** (`Controllers/`, `Services/`, `DTOs/`, `Enums/`), NOT feature-folders; one project / one root namespace; layering rules in `module-layout.md` enforced by convention not by the compiler. *[inferred-from: repository tree + `Azaion.Flights.csproj` (`Azaion.Missions.csproj` post-rename)]*
- **`gps-denied` is decoupled by design** — no runtime call in either direction; rows reference `mission_id` / `waypoint_id` as plain GUIDs in `gps-denied`'s own tables. *[inferred-from: ADR-007 + AZ-546 acceptance criteria]*
- **Watchtower-restart + `flight-gate` is the ONLY orchestration**; no Kubernetes; vertical scale only (one instance per device). *[inferred-from: `Dockerfile` + `../../suite/_docs/00_top_level_architecture.md`]*

### Carry-forward concerns (acknowledged, NOT in this Epic's scope)

These divergences from spec or known foot-guns are tracked in `00_discovery.md` § Spec ↔ Code Divergences and called out in component / module docs. They are deliberately deferred:

- PascalCase entity-body wire shape vs spec's camelCase (the *error envelope* is already camelCase by accidental match — see ADR-002).
- Cascade-delete is not transaction-wrapped (ADR-006); one-line fix to land opportunistically with B6.
- Swagger UI NOT gated on `IsDevelopment()` (ADR-005, scope reduced — the "dev fallback secrets" aspect is now obsolete; see ADR-005 below for details).
- `"FL"` policy code retains the legacy "Flight" wording even after the service rename — fleet-wide auth change, not in this Epic.
- `Geopoint` stored as 3 flat columns (`lat`, `lon`, `mgrs`) instead of spec's single auto-converting `string GPS`.
- F2 returns `400` instead of spec's `404` on a missing `VehicleId` (`ArgumentException` mapping).
- `ErrorResponse` DTO is dead on the wire and has the wrong shape (`List<string>?` instead of spec's `object?` keyed by field name).

## 1. System Context

**Problem being solved**: Provide the edge-tier (.NET) service that owns the **mission domain** of an Azaion deployment — vehicle inventory (Plane / Copter / UGV / GuidedMissile), mission plans, waypoint sequences, and the cross-service cascade-delete that keeps the rest of the edge stack consistent when missions or waypoints are removed. The service runs **on the device** (Jetson / OrangePI / operator-PC), one instance per device, and shares the local PostgreSQL with its sibling edge services.

**System boundaries**:

- **Inside the system**: the 6 components (`01_vehicle_catalog`, `02_mission_planning`, `04_persistence`, `05_identity`, `06_http_conventions`, `07_host`), their HTTP surface, and the migrator that owns 4 PostgreSQL tables (`vehicles`, `missions`, `waypoints`, `map_objects`).
- **Outside the system**: the central `admin` service (mints JWTs); the React `ui` (consumer); the `autopilot` service (writes `map_objects` via the same DB); the `annotations` service (owns `media` + `annotations` tables); the detection pipeline (owns `detection`); the new `gps-denied` service (owns `orthophotos` + `gps_corrections` — out of this repo as of B7).

**External systems**:

| System | Integration Type | Direction | Purpose |
|--------|------------------|-----------|---------|
| `admin` (.NET, central) | JWKS over HTTPS (outbound, startup + refresh) + JWT validation (inbound) | Outbound at startup; inbound on every request | Issues ECDSA-signed bearer tokens; this service fetches admin's public JWKS once at startup, caches it, and validates tokens locally thereafter. No per-request callback. JWKS rotation does not require a coordinated redeploy |
| Operator UI (React, edge) | REST (JSON over HTTP) | Inbound | All vehicle / mission / waypoint CRUD |
| `autopilot` (edge) | Shared DB (PostgreSQL on the same device) | Bidirectional | `autopilot` writes `map_objects` (this service owns the schema and cascade-deletes them); `autopilot` reads `missions` + `waypoints` to drive the vehicle |
| `annotations` (edge) | Shared DB | Outbound delete | `missions` cascade-deletes from `media` + `annotations` on mission/waypoint delete; `annotations` owns the schema |
| Detection pipeline (edge) | Shared DB | Outbound delete | Same pattern — `missions` cascade-deletes `detection` rows; pipeline owns the schema |
| `gps-denied` (separate edge service) | Shared DB (loose ref by GUID) | None at runtime | `gps-denied` rows reference `mission_id` / `waypoint_id` as plain GUIDs; no inbound HTTP call into `missions` and no outbound call from `missions` to `gps-denied` (decoupled by design after B7) |
| `postgres-local` (PostgreSQL 16+) | TCP | Outbound | Sole datastore. Shared with every other edge service on the same device |

## 2. Technology Stack

| Layer | Technology | Version | Rationale |
|-------|------------|---------|-----------|
| Language | C# | net10.0 | Suite-wide convention for backend services (per `../../suite/_docs/_repo-config.yaml`) |
| Web framework | ASP.NET Core (`Microsoft.NET.Sdk.Web`) | net10.0 | Built-in DI, middleware pipeline, attribute routing, JWT bearer auth |
| Data access | linq2db | 6.2.0 | Suite-wide ORM choice; explicit SQL escape hatch + attribute mapping; works well with the manual cascade pattern |
| Database driver | Npgsql | 10.0.2 | PostgreSQL native protocol driver |
| Schema bootstrap | linq2db raw `Execute` (`CREATE TABLE IF NOT EXISTS`) | — | Forward-only-additive; one `DROP TABLE IF EXISTS orthophotos / gps_corrections` block in B9 |
| Auth | `Microsoft.AspNetCore.Authentication.JwtBearer` + `Microsoft.IdentityModel.Protocols` | 10.0.5 | JWT bearer with ECDSA-SHA256 against admin's JWKS (cached via `ConfigurationManager<JsonWebKeySet>`); `iss`/`aud` validated; algorithm pinned |
| API docs | `Swashbuckle.AspNetCore` | 10.1.5 | Swagger UI + JSON spec (mounted unconditionally — see ADR-005) |
| HTTP error envelope | Custom `ErrorHandlingMiddleware` | — | Maps `KeyNotFoundException`/`ArgumentException`/`InvalidOperationException` → 404/400/409 (see ADR-002 and component `06_http_conventions` Caveats for divergences from suite spec) |
| Container | `mcr.microsoft.com/dotnet/aspnet:10.0` (multi-arch SDK build) | 10.0 | Matches edge target architectures (ARM64 dominant; AMD64 used for operator-PC) |
| CI | Woodpecker (`.woodpecker/build-arm.yml`) | — | Single docker-build-and-push job triggered on `[dev, stage, main]`; suite-standard runner |
| Hosting | Docker compose on each edge device | — | Service runs alongside `annotations`, `detection`, `autopilot`, `gps-denied`, `ui`, `postgres-local` per `../../suite/_docs/00_top_level_architecture.md` |
| Tests | **None present** | — | Tracked in `../../suite/_docs/_process_leftovers/2026-04-22_ci-unit-test-lane-missing-projects.md`; will be filled by the autodev BUILD pipeline (Steps 3 → 6) |

**Key constraints from discovery**:

- **No `src/` directory** — the .NET project sits at the repo root (`Azaion.Missions.csproj`, `Program.cs`). `coderule.mdc` says "follow the established directory structure", and the established structure here has no `src/`. This shape persists post-rename.
- **No per-component csproj** — there is one project, effectively one root namespace (`Azaion.Missions.*` post-B5). Components are logical groupings, not compilation units. Cross-component dependencies are checked by convention (per `module-layout.md` § Allowed Dependencies), not by compiler.
- **Layer-organized, not feature-organized layout** — `Controllers/`, `Services/`, `DTOs/`, `Enums/`, `Auth/`, `Middleware/`, `Database/` at the root. Component `Owns` globs are file-by-file lists across multiple top-level directories. See `module-layout.md` § Layout Rules.
- **One PostgreSQL shared with all edge services** — the per-service ownership pattern is the load-bearing convention (`../../suite/_docs/00_top_level_architecture.md` § Database Topology).
- **No automated tests** — every change today is human-reviewed only. Adding a `tests/Azaion.Missions.Tests/` sibling project is on the autodev backlog (Steps 5–7 of existing-code flow).

## 3. Deployment Model

**Environments**: Development (local `dotnet run` + local PostgreSQL), edge production (Docker compose on each device).

**Infrastructure**:

- **On-prem only** — every Azaion edge deployment is on customer-owned hardware (Jetson Orin / OrangePI / operator-PC). No managed cloud.
- **Container orchestration**: plain `docker compose` per device (see `../../suite/_infra/_compose/`). No Kubernetes.
- **Scaling**: vertical only — exactly one instance of `missions` per edge device, sized to the device. Horizontal scale-out of edge services is explicitly out of scope (each device is its own deployment).
- **Watchtower** restarts the container if it crashes; `flight-gate` (per `../../suite/_docs/00_top_level_architecture.md`) prevents container restart mid-mission.

**Image / port wiring** (post-B10):

- Image tag: `${REGISTRY_HOST}/azaion/missions:${BRANCH}-arm` (was `azaion/flights:*-arm` pre-B10).
- Container `EXPOSE 8080`; edge compose maps host port `5002:8080`.
- Entrypoint: `dotnet Azaion.Missions.dll` (was `Azaion.Flights.dll` pre-B5).
- Multi-arch build: `--platform=$BUILDPLATFORM`, `dotnet publish --os linux --arch $arch` so a single Dockerfile produces both ARM64 and AMD64 image variants.

**Environment-specific configuration**:

| Config | Development | Edge production |
|--------|-------------|-----------------|
| `DATABASE_URL` | Operator-supplied env var or `Database:Url` config key (e.g. `Host=localhost;Database=azaion;Username=postgres;Password=changeme`). **No hardcoded fallback** — `ConfigurationResolver.ResolveRequiredOrThrow` aborts startup if unset | `postgresql://postgres:${PG_LOCAL_PASSWORD}@postgres-local/azaion` (compose env) |
| `JWT_ISSUER` | Operator-supplied (e.g. `https://admin.azaion.dev/`). **Required at startup** | Set by Edge compose to the central admin issuer |
| `JWT_AUDIENCE` | Operator-supplied (e.g. `missions`). **Required at startup** | Set by Edge compose to this service's audience identifier |
| `JWT_JWKS_URL` | Operator-supplied HTTPS URL (e.g. `https://admin.azaion.dev/.well-known/jwks.json`). **Required at startup** + must be HTTPS (`HttpDocumentRetriever.RequireHttps = true`) | Set by Edge compose to admin's JWKS endpoint |
| `CorsConfig:AllowedOrigins` | Optional; defaults to `[]` (implicit-permissive policy + startup warning) | Required when `CorsConfig:AllowAnyOrigin != true` — startup THROWS in `Production` with empty origins |
| `CorsConfig:AllowAnyOrigin` | Optional; defaults to `false` | Optional; explicit opt-in if reverse-proxy already enforces origin checks |
| Logging | Console / Debug (ASP.NET Core defaults) + `PermissiveDefaultWarning` when implicit-permissive CORS applies | Console only (no Serilog / structured logging configured today) |
| Swagger | enabled | enabled (NOT gated on `IsDevelopment()` — see ADR-005) |
| CORS | Permissive fallback (with `PermissiveDefaultWarning` startup log) | Explicit allow-list via `CorsConfig:AllowedOrigins`, or explicit `AllowAnyOrigin=true` if reverse-proxy gates origins; implicit-permissive aborts startup |
| Migrator | runs at process start | runs at process start (idempotent `IF NOT EXISTS` + the one B9 `DROP TABLE IF EXISTS` block for legacy GPS-Denied tables on previously-deployed devices) |

For containerization details, CI pipeline structure, and observability, see `_docs/02_document/deployment/`.
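
The "required at startup, no hardcoded fallback" rows in the configuration table can be illustrated with a minimal sketch. This is not the contents of `Infrastructure/ConfigurationResolver.cs`: the class name, the parameter list, and the env-var-before-config-key precedence are assumptions made for illustration. Only the method name `ResolveRequiredOrThrow`, the `DATABASE_URL` / `Database:Url` pair, and the `InvalidOperationException` on a missing value come from this document.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the real resolver; shape and precedence are assumed.
public static class ConfigurationResolverSketch
{
    // Returns the first non-blank value: the environment variable, then the config key.
    // There is deliberately no default value: a missing setting throws, so the host
    // never starts with a silently substituted secret or URL.
    public static string ResolveRequiredOrThrow(
        string envVar,
        string configKey,
        Func<string, string?> readEnv,
        IReadOnlyDictionary<string, string?> config)
    {
        var fromEnv = readEnv(envVar);
        if (!string.IsNullOrWhiteSpace(fromEnv))
            return fromEnv;

        if (config.TryGetValue(configKey, out var fromConfig)
            && !string.IsNullOrWhiteSpace(fromConfig))
            return fromConfig;

        throw new InvalidOperationException(
            $"Required setting '{envVar}' (config key '{configKey}') is not set.");
    }
}
```

Under these assumptions a composition root would call it once per required key before building the host, for example `ResolveRequiredOrThrow("DATABASE_URL", "Database:Url", Environment.GetEnvironmentVariable, config)`. Passing the environment reader as a delegate keeps the sketch testable without touching process state.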
## 4. Data Model Overview

> Detailed entity column shapes live in `_docs/02_document/modules/entities.md`. Detailed cross-service ownership lives in `_docs/02_document/data_model.md`.

**Core entities** (post-B7 shape — 7 entity files, 4 owned tables + 3 borrowed read-only stubs):

| Entity | Description | Owned by component |
|--------|-------------|--------------------|
| `Vehicle` | Operator-managed inventory of mission-capable assets (Plane / Copter / UGV / GuidedMissile). 1 default at most by spec; code currently enforces "exactly one default" (see B12) | `01_vehicle_catalog` (logically); table schema in `04_persistence` |
| `Mission` | Planned mission; FK to a `Vehicle` | `02_mission_planning` (logically); table schema in `04_persistence` |
| `Waypoint` | Ordered geo-point inside a `Mission`; FK to `Mission` | `02_mission_planning` (logically); table schema in `04_persistence` |
| `MapObject` | H3-indexed detection projection written by `autopilot`; FK to `Mission` | `04_persistence` owns the schema; `autopilot` is the writer; this service cascade-deletes |
| `Media` | Borrowed read-only stub. Owned by `annotations`. Cascade-delete only | `04_persistence` declares the entity for ITable access; schema owned by `annotations` |
| `Annotation` | Borrowed read-only stub. Owned by `annotations`. Cascade-delete only | Same as `Media` |
| `Detection` | Borrowed read-only stub. Owned by detection pipeline. Cascade-delete only | Schema owned by detection pipeline; this service has cascade-delete responsibility only |

**Removed in B7+B9**: `Orthophoto` and `GpsCorrection` entities + tables. Now owned by the separate `gps-denied` service.

**Key relationships**:

- `Vehicle (1) ── (0..N) Mission` — `mission.vehicle_id → vehicle.id`. Existence-checked on `MissionService.CreateMission` / `UpdateMission` (the FK constraint is the safety net).
- `Mission (1) ── (0..N) Waypoint` — `waypoint.mission_id → mission.id`.
- `Mission (1) ── (0..N) MapObject` — `map_object.mission_id → mission.id`. Written by `autopilot`; cascade-deleted by `missions`.
- `Waypoint (1) ── (0..N) Media` (cross-service FK to `annotations`-owned table) — cascade-deleted by `missions`.
- `Media (1) ── (0..N) Annotation` (intra-`annotations` FK) — cascade-deleted by `missions` while walking the dependency graph.
- `Annotation (1) ── (0..N) Detection` (intra-detection FK) — cascade-deleted by `missions` while walking the dependency graph.

**No FK to `gps-denied` tables** — `orthophotos` / `gps_corrections` reference `mission_id` and `waypoint_id` as plain GUIDs in the `gps-denied` service's own tables. Cleanup of those rows is `gps-denied`'s own concern; this service does NOT cascade into them.
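
The relationships listed above imply a safe deletion order: every table that references a row must be cleared before the row itself. The sketch below derives that order from the foreign-key edges with a post-order walk. It is an in-memory illustration only; the class and method names are hypothetical, it issues no SQL, and the actual statement order inside `MissionService.DeleteMission` is not stated in this document. Starting the walk at `waypoints` instead of `missions` yields the narrower walk that a waypoint delete needs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: computes a child-before-parent table order.
public static class CascadeOrderSketch
{
    // (child table, parent table) pairs taken from the relationship list above.
    public static readonly (string Child, string Parent)[] ForeignKeys =
    {
        ("waypoints", "missions"),
        ("map_objects", "missions"),
        ("media", "waypoints"),
        ("annotations", "media"),
        ("detection", "annotations"),
    };

    // Post-order walk from the root table: a table is emitted only after every
    // table that references it, so deleting in this order never breaks an FK.
    public static List<string> DeleteOrder(string root)
    {
        var order = new List<string>();
        Visit(root, order, new HashSet<string>());
        return order;
    }

    private static void Visit(string table, List<string> order, HashSet<string> seen)
    {
        if (!seen.Add(table)) return; // already handled
        foreach (var (child, parent) in ForeignKeys)
            if (parent == table) Visit(child, order, seen);
        order.Add(table);
    }
}

// CascadeOrderSketch.DeleteOrder("missions")
//   -> detection, annotations, media, waypoints, map_objects, missions
// CascadeOrderSketch.DeleteOrder("waypoints")
//   -> detection, annotations, media, waypoints
```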
**Data flow summary**:

- `Operator UI → missions (HTTP)` — vehicle + mission + waypoint CRUD; the dominant inbound flow.
- `admin → operator UI → missions (JWT)` — admin mints token; UI carries it to every backend; this service validates locally (ECDSA-SHA256 against admin's cached JWKS).
- `autopilot → missions (DB read)` — `autopilot` reads `missions` + `waypoints` to drive the vehicle.
- `autopilot → missions (DB write)` — `autopilot` writes `map_objects`; this service owns the table schema and cascade-deletes them.
- `missions → annotations + detection (DB delete)` — cascade-delete walk during mission/waypoint delete; tears down `media`, `annotations`, `detection` rows in dependency order.

## 5. Integration Points

### Internal Communication

This service is a single .NET process. Components communicate via direct C# calls registered in DI (`07_host`). There is no in-process message queue, no RPC, no event bus.

| From | To | Protocol | Pattern | Notes |
|------|----|----------|---------|-------|
| `07_host` | `04_persistence`, `05_identity`, `06_http_conventions` | DI registration | Composition root | Wired once at startup |
| `01_vehicle_catalog` (controller → service) | `04_persistence` (`AppDataConnection`) | Direct C# call | Active-record over `ITable<Vehicle>` | Per-request scoped DB connection |
| `02_mission_planning` (controllers → services) | `04_persistence` (`AppDataConnection`) | Direct C# call | Active-record over `ITable<Mission>`, `ITable<Waypoint>`, plus cascade delete touching `MapObject`, `Media`, `Annotation`, `Detection` | Per-request scoped DB connection. **No transaction wraps cascade delete** — see `02_mission_planning` Caveats #1 |
| `02_mission_planning` (`MissionService`) | `01_vehicle_catalog` (existence) | Direct DB read against `vehicles` | Existence check | Cross-component, but reads via the shared `AppDataConnection`; no service-to-service call |
| `01_vehicle_catalog`, `02_mission_planning` (controllers) | `05_identity` (`"FL"` policy) | ASP.NET Core `[Authorize(Policy = "FL")]` attribute | Pipeline check | String-typed policy reference (see `module-layout.md` § Verification Needed #4) |
| `02_mission_planning` (`MissionService.GetMissions`) | `06_http_conventions` (`PaginatedResponse<T>`) | Direct C# type | DTO | Sole consumer of the paginated envelope |
| Every controller exception | `06_http_conventions` (`ErrorHandlingMiddleware`) | Pipeline interceptor | Exception → status mapping | Middleware is registered FIRST so it wraps everything |

### External Integrations

| External system | Protocol | Auth | Rate limits | Failure mode |
|-----------------|----------|------|-------------|--------------|
| Operator UI | REST (JSON over HTTP) | JWT bearer | None enforced | Standard HTTP error envelope (see ADR-002 for the suite-spec divergence still in code) |
| `admin` (token issuance) | JWKS over HTTPS at startup + refresh; none on the request path (this service validates tokens locally) | admin's public JWKS (ECDSA-SHA256); the legacy `JWT_SECRET` is no longer consulted | N/A | Rejected token → `401`. No per-request call to `admin`, so an `admin` outage does NOT take this service down once the JWKS is cached (until issued tokens expire) |
| `postgres-local` | PostgreSQL wire protocol via Npgsql | Username + password (`DATABASE_URL`) | Connection pool default (Npgsql) | Connection failure surfaces as an exception type other than `KeyNotFoundException`, so no 404 mapping applies; it falls through the middleware to 500. Migrator failure at startup crashes the process; Watchtower restarts the container |
| `autopilot` (DB-mediated) | Shared `postgres-local` | Same DB credentials | N/A | If `autopilot` writes a `map_object` referencing a deleted mission, the FK constraint rejects the insert. If a mission delete races with an `autopilot` write, the cascade may leave one row of `map_objects` that the next mission delete would reject — small race window, no data corruption |
| `annotations`, detection pipeline (DB-mediated, schema borrowing) | Shared `postgres-local` | Same DB credentials | N/A | If `annotations` is absent at deploy time, the cascade walks `media` / `annotations` and gets `relation does not exist` → 500. In standard edge deployment all services are present (suite compose stack) — see `02_mission_planning` Caveats #6 |
| `gps-denied` (post-B7) | None — no runtime coupling. `gps-denied` owns its own tables and references `mission_id` / `waypoint_id` as plain GUIDs | N/A | N/A | Decoupled by design |

## 6. Non-Functional Requirements

> Numbers below are observable from code + Dockerfile + Woodpecker; the spec (`../../suite/_docs/02_missions.md`) does not state explicit SLOs. Where targets are inferred, that is called out.

| Requirement | Target | Measurement | Priority |
|-------------|--------|-------------|----------|
| Availability | Best-effort per-device; no multi-instance HA per device | One container per device; restart-on-crash via Watchtower; `flight-gate` prevents restart mid-mission per `../../suite/_docs/00_top_level_architecture.md` | High (per-device) |
| Latency (p95) | **Not specified.** Code uses synchronous linq2db queries with one DB round-trip per operation; cascade delete has up to 7 sequential SELECTs/DELETEs. On a local PostgreSQL on the same device this is single-digit ms typical | `/health` and CRUD endpoints; no explicit latency budget | Medium (inferred) |
| Throughput | **Not specified.** Edge deployment is one operator + one or two background consumers (`autopilot`, `ui`); load is operator-paced not load-tested | — | Low (inferred) |
| Data retention | No retention policy in this service. Data persists in `postgres-local` until manually deleted via the API or device wipe | — | — |
| Recovery (RPO/RTO) | RPO = device-local backup cadence (suite-level concern, not this service); RTO ≈ container restart time (~10s) | Watchtower restart on crash | Medium |
| Scalability | One instance per edge device; horizontal scale-out NOT supported | — | — (out of scope) |
| Cascade delete atomicity | **Currently violated** — `MissionService.DeleteMission` and `WaypointService.DeleteWaypoint` are NOT wrapped in a transaction (see `02_mission_planning` Caveats #1). Partial failure leaves orphan rows in `media` / `annotations` / `detection` / `map_objects`. Fix is one-line (`db.BeginTransactionAsync`) | Carry-forward improvement | High (data integrity) |
| API spec conformance | **Currently divergent** on entity/DTO wire shape (PascalCase vs spec camelCase) and on error envelope's missing `errors` field; the unused `ErrorResponse` DTO has wrong `Errors` shape (see ADR-002). Note: the error envelope's property names are already camelCase (accidental match) | Manual diff against `../../suite/_docs/00_top_level_architecture.md` § Error Response Format + § Pagination | High (cross-service contract) |
| Health endpoint | `GET /health` returns `{ status: "healthy" }` in <10ms | `Program.cs` `MapGet` | High (used by container orchestration) |

## 7. Security Architecture

**Authentication**: JWT bearer with **ECDSA-SHA256** signature validation. Tokens are minted by the central `admin` service (which holds the ECDSA private key) and validated locally by `05_identity` against admin's public JWKS document. The JWKS is fetched once at startup via `Microsoft.IdentityModel.Protocols.ConfigurationManager<JsonWebKeySet>` against `JWT_JWKS_URL` (HTTPS only — `HttpDocumentRetriever.RequireHttps = true`) and refreshed on the manager's default schedule. After the initial fetch, request-path validation is local; no per-request callback to `admin`. Validation enforces `iss == JWT_ISSUER`, `aud == JWT_AUDIENCE`, `exp` (with 30-second clock skew), and pins `alg` to `EcdsaSha256` to defend against the HS256-confusion attack. **JWKS rotation does NOT require a coordinated redeploy** — consumers pick up new keys on the next refresh tick, and old tokens signed with the previous `kid` remain valid until their natural expiry. The CMMC L2 finding (`../../suite/_docs/05_security/cmmc_l2_scorecard.md` row 3) about missing `iss`/`aud` validation is structurally fixed in this service's code; the suite-level docs still describe the legacy HS256 model and have a sync task pending (drift recorded in `_docs/02_document/05_drift_findings_2026-05-14.md`).
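
The algorithm pin described above can be illustrated in isolation. The sketch below only inspects the token header and rejects anything that does not declare `ES256` (the JWT name for ECDSA with SHA-256); it performs no signature, `iss`, `aud`, or `exp` checks, so it is not a substitute for the JwtBearer validation in `Auth/JwtExtensions.cs`. The class and method names are hypothetical. The point it shows is that a token re-signed with HS256, using the public key as an HMAC secret, is refused before any key is consulted.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical illustration of an algorithm pin; header inspection only.
public static class AlgPinSketch
{
    // True only when the token has three segments and its header declares alg = ES256.
    public static bool HeaderDeclaresEs256(string token)
    {
        var segments = token.Split('.');
        if (segments.Length != 3) return false;
        try
        {
            using var header = JsonDocument.Parse(DecodeBase64Url(segments[0]));
            return header.RootElement.ValueKind == JsonValueKind.Object
                && header.RootElement.TryGetProperty("alg", out var alg)
                && alg.ValueKind == JsonValueKind.String
                && alg.GetString() == "ES256";
        }
        catch (Exception ex) when (ex is FormatException || ex is JsonException)
        {
            return false; // malformed base64url or malformed JSON
        }
    }

    // JWT segments use the URL-safe alphabet without padding; restore both
    // before handing the text to the standard base64 decoder.
    private static string DecodeBase64Url(string segment)
    {
        var s = segment.Replace('-', '+').Replace('_', '/');
        s = s.PadRight(s.Length + (4 - s.Length % 4) % 4, '=');
        return Encoding.UTF8.GetString(Convert.FromBase64String(s));
    }
}
```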
**Authorization**: Single named policy `"FL"`, gated by a `permissions` claim value. Every controller route in `01_vehicle_catalog` and `02_mission_planning` carries `[Authorize(Policy = "FL")]`. The role → permission matrix lives in `../../suite/_docs/00_roles_permissions.md`. Note: the policy code `"FL"` carries the legacy "Flight" name even after the service rename to `missions`; renaming the permission code is a fleet-wide auth change (would invalidate every issued token until new ones are minted) and is **NOT** in this Epic's scope. Tracked as a TODO in `../../suite/_docs/00_roles_permissions.md`.

**Data protection**:

- **At rest**: PostgreSQL on-disk encryption is the device-level concern (suite-level, not this service). This service does not encrypt data at the column level.
- **In transit**: TLS termination is the reverse proxy's responsibility. This service does NOT enforce HTTPS redirection. The container `EXPOSE 8080` is plain HTTP; the upstream reverse proxy adds TLS. The JWKS fetch is independently constrained to HTTPS by `HttpDocumentRetriever { RequireHttps = true }`.
- **Secrets management**: Four required env vars (`DATABASE_URL`, `JWT_ISSUER`, `JWT_AUDIENCE`, `JWT_JWKS_URL`) plus optional CORS keys flow through `Infrastructure/ConfigurationResolver.cs` → `ResolveRequiredOrThrow`. **There are no hardcoded fallbacks**; a missing required value aborts startup with `InvalidOperationException` before the host is built. A production deploy that forgets `JWT_JWKS_URL` cannot silently accept tokens — it fails fast. The legacy `JWT_SECRET` env var is no longer consulted.

**Audit logging**: None at the application level. The only structured log emitted by app code is `06_http_conventions`' middleware `LogError(ex, "Unhandled exception")` for unhandled 500s, plus `Program.cs`' `PermissiveDefaultWarning` when implicit-permissive CORS applies. There is no per-request audit trail, no correlation ID, and no per-user attribution (the JWT's user-id claim is not consumed — see `05_identity` Caveats #2).

**Input validation**: None. No `[Required]` attributes, no range checks. Empty `Name`, negative `BatteryCapacity`, invalid enum int values are accepted on input. Carry-forward improvement; not in this Epic's scope.

**CORS**: Gated by `Infrastructure/CorsConfigurationValidator.cs`. In `Production` (case-insensitive match on `ASPNETCORE_ENVIRONMENT`) an empty `CorsConfig:AllowedOrigins` with `CorsConfig:AllowAnyOrigin != true` aborts startup. In non-Production environments, an empty allow-list with `AllowAnyOrigin=false` falls back to permissive (`AllowAnyOrigin/Method/Header`) and emits the `PermissiveDefaultWarning` startup log. Explicit `AllowAnyOrigin=true` always applies permissive without warning. The previous "permissive in all environments" model no longer holds.
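
The CORS rules in the paragraph above reduce to a small decision function. The sketch below encodes them as a pure function so the branches are easy to read and test. It is not the code of `Infrastructure/CorsConfigurationValidator.cs`: the enum, the names, and the choice of `InvalidOperationException` for the startup abort are assumptions.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical outcome type for the sketch.
public enum CorsMode { AllowList, Permissive, PermissiveWithWarning }

public static class CorsPolicySketch
{
    public static CorsMode Decide(
        string? environmentName,
        IReadOnlyCollection<string> allowedOrigins,
        bool allowAnyOrigin)
    {
        // Explicit opt-in always wins and logs no warning.
        if (allowAnyOrigin) return CorsMode.Permissive;

        // A non-empty allow-list is the normal production configuration.
        if (allowedOrigins.Count > 0) return CorsMode.AllowList;

        // Empty allow-list without opt-in: fatal in Production, tolerated elsewhere.
        var isProduction = string.Equals(
            environmentName, "Production", StringComparison.OrdinalIgnoreCase);
        if (isProduction)
            throw new InvalidOperationException(
                "CorsConfig:AllowedOrigins is empty and CorsConfig:AllowAnyOrigin is not true.");

        return CorsMode.PermissiveWithWarning; // caller emits PermissiveDefaultWarning
    }
}
```

Keeping the decision separate from the ASP.NET Core policy registration means the Production abort can be unit-tested without building a host, which matters while the repository has no test project.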
## 8. Key Architectural Decisions

> ADR numbering reflects what is implemented today (post-rename, post-B7). Items called out as "currently divergent" are intentional carry-forward — they are implemented choices that diverge from the suite spec; tightening them is suite-level work, not part of this Epic.

### ADR-001: One PostgreSQL per edge device, shared by all edge services

**Context**: Each edge device runs ~6 backend services (this one + `annotations`, detection, `autopilot`, `gps-denied`, plus the React `ui`). Each service needs persistent storage; running ~6 separate Postgres instances per device is operationally heavy.

**Decision**: Run ONE `postgres-local` per device. Every service connects to it; every service migrates only the tables it owns (this service owns `vehicles`, `missions`, `waypoints`, `map_objects` post-B7+B9). Cross-service reads / cascade deletes happen through ITable accessors against the shared schema.

**Alternatives considered**:

1. **One Postgres per service** — rejected: 6× the operational overhead per device for no real isolation gain (services run on the same OS anyway).
|
||
2. **SQLite per service** — rejected: cross-service queries (cascade delete walking from `mission` to `media` to `annotation` to `detection`) require a single transactional database; SQLite-per-service would require a coordination layer.
|
||
|
||
**Consequences**:
|
||
|
||
- Cross-service cascade-delete is physically possible and atomic *within one DB connection* (the transaction-wrap is a one-line carry-forward — see ADR-006).
|
||
- Schema ownership boundary is enforced by **convention**, not by access control. Any service could write to any table; the rule "only owners write" is upheld by code review.
|
||
- If `annotations` is absent from a deployment, this service's cascade-delete fails on `relation does not exist`. Standard edge compose includes all services; this is acceptable.
|
||
|
||
### ADR-002: PascalCase wire shape on entity bodies (currently divergent from suite spec)

**Context**: The suite spec (`../../suite/_docs/00_top_level_architecture.md` § Error Response Format + § Pagination) mandates camelCase JSON across all .NET services. Code today emits PascalCase for entity / DTO responses (`Vehicle`, `Mission`, `Waypoint`, `PaginatedResponse<Mission>`) via System.Text.Json defaults: entity property names are PascalCase and no `JsonNamingPolicy.CamelCase` is configured. **Exception (accidental match)**: the global error envelope IS already camelCase, because `ErrorHandlingMiddleware` writes an anonymous object literal `new { statusCode = ..., message }` whose property names are lowercase-first by construction; `System.Text.Json` preserves them as-is.

**Decision (current, carry-forward)**: Keep PascalCase entity bodies until a coordinated suite-wide camelCase migration. Adding `JsonSerializerOptions.PropertyNamingPolicy = JsonNamingPolicy.CamelCase` would flip every endpoint's wire shape simultaneously; the UI and `autopilot` consumers would need to be updated in lock-step.

**Alternatives considered**:

1. **Fix unilaterally now.** Rejected for this Epic: would break the UI without a coordinated cutover.
2. **Per-route override.** Rejected: all-or-nothing is the cleaner cutover.

**Consequences**:

- Entity/DTO HTTP responses do NOT match the suite spec on case style.
- The error envelope DOES match spec on case (camelCase) but still misses the `errors` field; the `ErrorResponse` DTO is dead on the wire (middleware writes the anonymous object instead) and its `Errors` field shape (`List<string>?`) doesn't match spec (`object?` keyed by field name). Both carry forward until the migration.
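The naming behavior behind this ADR can be reproduced with `System.Text.Json` alone. The sketch below calls `JsonSerializer` directly; it does not show how the service wires its MVC JSON options. `VehicleSketch` and `WireShapeSketch` are hypothetical names; the `Name` and `BatteryCapacity` properties are borrowed from the Input validation note in Section 7 purely as an example.

```csharp
using System;
using System.Text.Json;

// Hypothetical stand-in for an entity; not the service's Vehicle class.
public sealed record VehicleSketch(string Name, int BatteryCapacity);

public static class WireShapeSketch
{
    // No naming policy: property names are written exactly as declared.
    public static string Default(object value) =>
        JsonSerializer.Serialize(value);

    // With the policy the suite spec asks for: first letter is lower-cased.
    public static string CamelCase(object value) =>
        JsonSerializer.Serialize(value, new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase
        });
}

public static class WireShapeDemo
{
    public static void Run()
    {
        var vehicle = new VehicleSketch("UAV-1", 5000);

        // {"Name":"UAV-1","BatteryCapacity":5000}
        Console.WriteLine(WireShapeSketch.Default(vehicle));

        // {"name":"UAV-1","batteryCapacity":5000}
        Console.WriteLine(WireShapeSketch.CamelCase(vehicle));

        // Anonymous object with lowercase-first names: already camelCase
        // without any policy, which is the "accidental match" of the error envelope.
        // {"statusCode":500,"message":"boom"}
        Console.WriteLine(WireShapeSketch.Default(new { statusCode = 500, message = "boom" }));
    }
}
```

Because the policy is set on the serializer options rather than per type, enabling it changes every serialized body at once, which is why the ADR treats the cutover as all-or-nothing.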
### ADR-003: Manual cascade-delete in code, not `ON DELETE CASCADE` in schema

**Context**: Mission deletion has to clean up rows across multiple tables, some of which are owned by other services (`media` / `annotations` / `detection`). Schema-level `ON DELETE CASCADE` would force the foreign service's schema to encode this service's lifecycle.

**Decision**: This service owns the cascade walk. `MissionService.DeleteMission` deletes in dependency order: `map_objects` → resolve `waypoint_ids` → resolve `media_ids` and `annotation_ids` → `detection` → `annotations` → `media` → `waypoints` → `missions`.

**Alternatives considered**:

1. **`ON DELETE CASCADE` at the schema level.** Rejected: would require the `annotations` service to encode this service's domain in its own migration. Schema becomes coupled to consumer.
2. **Soft-delete + tombstone everywhere.** Rejected: read paths everywhere would have to filter; the spec does not require it.

**Consequences**:

- The cascade walk lives in one place (`MissionService.DeleteMission` + `WaypointService.DeleteWaypoint`).
- It is **not transaction-wrapped today** (see ADR-006); the fix is one line and is carried forward.
- If `gps-denied` ever adds rows that need cleanup on mission delete, that's `gps-denied`'s concern (it owns the tables and the lifecycle). This service does not extend its cascade.
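The dependency order in the Decision above (resolve ids top-down, delete leaf-first) can be illustrated against in-memory lists. This is a sketch of the ordering only, not `MissionService.DeleteMission`: the row types, the `EdgeStoreSketch` container, and the foreign-key shapes (media pointing at a waypoint, annotation at media, detection at an annotation) are all assumptions, and the real service issues SQL through linq2db rather than list operations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows; foreign-key shapes are assumed for illustration.
public sealed record MissionRow(Guid Id);
public sealed record MapObjectRow(Guid Id, Guid MissionId);
public sealed record WaypointRow(Guid Id, Guid MissionId);
public sealed record MediaRow(Guid Id, Guid WaypointId);
public sealed record AnnotationRow(Guid Id, Guid MediaId);
public sealed record DetectionRow(Guid Id, Guid AnnotationId);

// Stand-in for the shared database: one list per table.
public sealed class EdgeStoreSketch
{
    public List<MissionRow> Missions { get; } = new();
    public List<MapObjectRow> MapObjects { get; } = new();
    public List<WaypointRow> Waypoints { get; } = new();
    public List<MediaRow> Media { get; } = new();
    public List<AnnotationRow> Annotations { get; } = new();
    public List<DetectionRow> Detections { get; } = new();
}

public static class CascadeSketch
{
    public static void DeleteMission(EdgeStoreSketch db, Guid missionId)
    {
        // 1. map_objects hang directly off the mission.
        db.MapObjects.RemoveAll(o => o.MissionId == missionId);

        // 2. Resolve ids top-down while the parent rows still exist.
        HashSet<Guid> waypointIds = db.Waypoints
            .Where(w => w.MissionId == missionId).Select(w => w.Id).ToHashSet();
        HashSet<Guid> mediaIds = db.Media
            .Where(m => waypointIds.Contains(m.WaypointId)).Select(m => m.Id).ToHashSet();
        HashSet<Guid> annotationIds = db.Annotations
            .Where(a => mediaIds.Contains(a.MediaId)).Select(a => a.Id).ToHashSet();

        // 3. Delete leaf-first so no row ever outlives its parent.
        db.Detections.RemoveAll(d => annotationIds.Contains(d.AnnotationId));
        db.Annotations.RemoveAll(a => annotationIds.Contains(a.Id));
        db.Media.RemoveAll(m => mediaIds.Contains(m.Id));
        db.Waypoints.RemoveAll(w => waypointIds.Contains(w.Id));
        db.Missions.RemoveAll(m => m.Id == missionId);
    }
}
```

Resolving all ids before the first child delete matters: once `media` rows are gone, the annotation ids can no longer be derived from them.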
### ADR-004: Schema bootstrap via `CREATE TABLE IF NOT EXISTS` (no migration tool)

**Context**: Edge deployments are restart-driven (Watchtower picks up new images); each container start runs the migrator. A heavy migration tool (Flyway, EF Core migrations) adds dependencies and complexity.

**Decision**: `DatabaseMigrator.Migrate` runs additive `CREATE TABLE IF NOT EXISTS` + `CREATE INDEX IF NOT EXISTS` for the 4 owned tables. The B9 ticket adds a one-shot `DROP TABLE IF EXISTS orthophotos; DROP TABLE IF EXISTS gps_corrections;` block for fielded devices that previously ran the legacy schema.

**Alternatives considered**:

1. **EF Core / Flyway.** Rejected: adds a build dependency and a state table for what is currently a 4-table schema with no column drops or type changes.
2. **External SQL scripts.** Rejected: harder to keep aligned with code-side entity changes; deployment becomes two-step.

**Consequences**:

- Column drops / type changes / constraint changes will require manual SQL or a future migration tool. The B9 `DROP` is the one explicit destructive step in the migrator's history.
- No version table; the migrator is idempotent and runs every startup.
- Acceptable today; will become a real problem if the schema starts evolving frequently.
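To make the "idempotent, runs every startup" property concrete, here is a minimal sketch of what such a statement list could look like. It is not the contents of `DatabaseMigrator.Migrate`: the `MigratorSketch` name, the column lists, the index name, and the placement of the `DROP` block before the `CREATE` block are all placeholders chosen for illustration. Only the table names and the `IF EXISTS` / `IF NOT EXISTS` guards come from the Decision above.

```csharp
using System;
using System.Collections.Generic;

public static class MigratorSketch
{
    // Returns statements in execution order. Each one is guarded, so running
    // the whole list on every container start leaves an up-to-date schema untouched.
    public static IReadOnlyList<string> BuildStatements() => new[]
    {
        // One-shot cleanup for devices that ran the legacy schema (B9);
        // a no-op once the tables are gone.
        "DROP TABLE IF EXISTS orthophotos;",
        "DROP TABLE IF EXISTS gps_corrections;",

        // Placeholder column lists: the real columns are not shown in this sketch.
        "CREATE TABLE IF NOT EXISTS vehicles (id uuid PRIMARY KEY);",
        "CREATE TABLE IF NOT EXISTS missions (id uuid PRIMARY KEY);",
        "CREATE TABLE IF NOT EXISTS waypoints (id uuid PRIMARY KEY, mission_id uuid NOT NULL);",
        "CREATE TABLE IF NOT EXISTS map_objects (id uuid PRIMARY KEY, mission_id uuid NOT NULL);",

        // Hypothetical index; CREATE INDEX IF NOT EXISTS is equally re-runnable.
        "CREATE INDEX IF NOT EXISTS ix_waypoints_mission_id ON waypoints (mission_id);",
    };
}
```

The limitation named in the Consequences is visible here: `CREATE TABLE IF NOT EXISTS` skips a table that already exists, so a changed column list in a later release would never be applied to a fielded device.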
### ADR-005: Swagger NOT gated on `IsDevelopment()` (scope reduced — dev-fallback secrets obsoleted)

**Context**: ASP.NET Core's idiomatic pattern gates Swagger UI and dev-only convenience features on `app.Environment.IsDevelopment()`. The original form of this ADR also covered hardcoded dev fallbacks for `JWT_SECRET` / `DATABASE_URL`; that aspect is now obsolete after the introduction of `Infrastructure/ConfigurationResolver.cs` (fail-fast `ResolveRequiredOrThrow`). The only remaining gap is Swagger.

**Decision (current, carry-forward)**: Leave Swagger UI mounted unconditionally. Swagger UI is useful on edge devices for one-off operator debugging through the local network. There is no hardcoded dev fallback for any secret today.

**Alternatives considered**:

1. **Gate Swagger on `IsDevelopment()` (or on `ASPNETCORE_ENVIRONMENT != "Production"`).** Preferred long-term; out of this Epic.
2. **Add a Swagger security scheme so the UI knows how to attach `Authorization: Bearer ...`.** Usability improvement; out of this Epic.

**Consequences**:

- Swagger UI is exposed on every deployment. The reverse proxy may or may not whitelist it; verify on first production rollout.
- The "production silently boots with the dev secret" risk no longer exists: `JWT_ISSUER`, `JWT_AUDIENCE`, `JWT_JWKS_URL`, and `DATABASE_URL` are all required, and `ResolveRequiredOrThrow` aborts startup with `InvalidOperationException` if any is missing. The CMMC L2 row-3 finding (HS256 + missing `iss`/`aud`) is also structurally addressed by the ECDSA + JWKS + iss/aud-validation model; see Section 7 above.
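The fail-fast behavior described in the Consequences above can be sketched in a few lines. This is an illustration of the pattern, not the code in `Infrastructure/ConfigurationResolver.cs`: the signature of `ResolveRequiredOrThrow`, the injectable `lookup` delegate, the whitespace check, and the error message are assumptions. Only the method name, the four variable names, and the `InvalidOperationException` type come from this document.

```csharp
#nullable enable
using System;

public static class ConfigurationResolverSketch
{
    // Hypothetical signature. The lookup delegate is injected so the rule can be
    // exercised without touching the process environment.
    public static string ResolveRequiredOrThrow(string key, Func<string, string?> lookup)
    {
        string? value = lookup(key);

        // No fallback branch: a missing or blank value is always fatal.
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Required configuration value '{key}' is missing.");
        }

        return value;
    }

    // Convenience overload that reads the process environment.
    public static string ResolveRequiredOrThrow(string key) =>
        ResolveRequiredOrThrow(key, Environment.GetEnvironmentVariable);

    // Resolving all four required values up front means the host is never built
    // with a partially configured identity or database setup.
    public static (string DatabaseUrl, string Issuer, string Audience, string JwksUrl)
        ResolveAll(Func<string, string?> lookup) => (
            ResolveRequiredOrThrow("DATABASE_URL", lookup),
            ResolveRequiredOrThrow("JWT_ISSUER", lookup),
            ResolveRequiredOrThrow("JWT_AUDIENCE", lookup),
            ResolveRequiredOrThrow("JWT_JWKS_URL", lookup));
}
```

The absence of any default-value parameter is the point of the design: there is no code path through which a development value could reach a production host.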
### ADR-006: Cascade-delete is NOT transaction-wrapped (carry-forward)

**Context**: `MissionService.DeleteMission` and `WaypointService.DeleteWaypoint` issue 4–7 sequential `DELETE` statements across tables. Without a transaction, partial failure leaves orphan rows.

**Decision (current, carry-forward)**: Today the cascade runs autocommit-per-statement. Wrapping in `db.BeginTransactionAsync()` is one extra line and will land as part of the broader testability / refactor pass after the rename Epic.

**Alternatives considered**:

1. **Wrap now in B6.** Possible, but B6 is a rename, not a behavior change. The transaction wrap is a separate one-line concern that can either ride along (cheap) or land standalone.
2. **Saga / outbox pattern.** Overkill for an in-process, one-DB cascade.

**Consequences**:

- Partial cascade failure leaves orphan rows in `media` / `annotations` / `detection` / `map_objects` / `waypoints`. The next mission delete or `autopilot` write may surface the inconsistency as an FK violation.
- **Recommended**: include the transaction wrap when B6 lands; it is a one-line change that materially raises the data-integrity floor.
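The recommended wrap has the following general shape. The sketch deliberately uses plain ADO.NET (`System.Data.Common`) rather than the linq2db `DataConnection` the service uses, so it stays self-contained; the `CascadeTransactionSketch` name, the `missionId` parameter name, and the idea of passing pre-built `DELETE` statements are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;
using System.Threading.Tasks;

public static class CascadeTransactionSketch
{
    // Runs every statement inside one transaction on an already-open connection.
    // Either all deletes are committed or none are.
    public static async Task RunAsync(
        DbConnection connection,
        IReadOnlyList<string> deleteStatements,
        Guid missionId)
    {
        await using DbTransaction tx = await connection.BeginTransactionAsync();
        try
        {
            foreach (string sql in deleteStatements)
            {
                await using DbCommand cmd = connection.CreateCommand();
                cmd.Transaction = tx;
                cmd.CommandText = sql;

                // Hypothetical parameter name; each statement is assumed to
                // reference it in the provider's own placeholder syntax.
                DbParameter p = cmd.CreateParameter();
                p.ParameterName = "missionId";
                p.Value = missionId;
                cmd.Parameters.Add(p);

                await cmd.ExecuteNonQueryAsync();
            }

            await tx.CommitAsync();
        }
        catch
        {
            // Any failure undoes the deletes that already ran, so no orphan rows remain.
            await tx.RollbackAsync();
            throw;
        }
    }
}
```

With the rollback in place, a failure at the fourth of seven statements leaves the database exactly as it was before the delete started, which is the integrity gain the Recommended bullet refers to.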
### ADR-007: GPS-Denied moved out of this repo (B7 + B9)

**Context**: The pre-rename `flights` repo had a `03_gps_denied` component covering orthophoto upload + live-GPS / GPS-correction endpoints. Per `../../suite/_docs/11_gps_denied.md` and the rename plan, GPS-Denied is its own domain (orthorectification of satellite imagery; correction of GPS drift in denied environments) and does not belong inside the mission-planning service.

**Decision**: Delete `Database/Entities/Orthophoto.cs`, `Database/Entities/GpsCorrection.cs`, the corresponding DTOs/controllers/services, the `"GPS"` policy, and the cascade branches that referenced `orthophotos` / `gps_corrections`. Add a one-shot `DROP TABLE IF EXISTS` block to the migrator for fielded devices.

**Alternatives considered**:

1. **Keep GPS-Denied in this repo, behind a feature flag.** Rejected: the new `gps-denied` service has different scaling and deployment concerns (heavier disk for orthos, separate update cadence).
2. **Leave the schema, drop only the API.** Rejected: leaves dead tables on every device with no ownership; cleanup later would be harder.

**Consequences**:

- 9 entity files → 7 entity files. 6 owned tables → 4 owned tables.
- `MissionService.DeleteMission` cascade chain shrinks (no `orthophotos` / `gps_corrections` branch). One less foot-gun.
- `gps-denied` references `mission_id` / `waypoint_id` as plain GUIDs in its own tables. **No runtime coupling** between the two services: `gps-denied` is responsible for cleaning up its own rows when missions are deleted (its own concern, its own decision).
### ADR-008: One project, one root namespace (no per-component csproj)

**Context**: Some .NET solutions split each component into its own csproj for compile-time enforcement of "no upward dependencies". This service has 6 logical components but one csproj.

**Decision**: Keep one project (`Azaion.Missions.csproj` post-B5), one effective root namespace (`Azaion.Missions.*`). Layering rules in `module-layout.md` § Allowed Dependencies are enforced by **convention** (and by the autodev `code-review` Phase 7), not by the compiler.

**Alternatives considered**:

1. **Per-component csproj.** Rejected for this codebase: 6 csprojs in a service this small has more solution-management overhead than it has value. Cross-component types are referenced directly, not through public APIs.
2. **Shared `Common` project + per-component projects.** Rejected: same overhead as #1, plus the cross-cutting concerns (`Auth/`, `Middleware/`) are tiny and don't warrant their own DLL.

**Consequences**:

- A `using` that crosses a layering boundary (for example, a lower layer referencing a higher one) still compiles, because everything lives in one assembly. Code review + the layering table in `module-layout.md` are the safety net.
- Solution remains easy for one engineer to navigate.
- If the service ever splits in two, the move to a per-project structure would be a separate refactor (not part of this Epic).