Oleksandr Bezdieniezhnykh 1f634c2604
Update demo replay validation and testing documentation
- Modified the autodev state to reflect the current testing phase and details of the new `jetson-e2e` tests.
- Enhanced the "How to Test" documentation to provide clearer instructions on the demo replay validation process, including video and tlog alignment steps.
- Updated architectural documentation to include the new demo replay operator flow and its dependencies.
- Documented the removal of deprecated auto-sync features and clarified the operator-facing UI for replay validation.
- Added new entries in the dependencies table for upcoming tasks related to the demo replay flow.

These changes improve clarity and usability for operators and developers working with the demo replay system.
2026-06-20 11:24:43 +03:00

# GPS-Denied Onboard Pose Estimation — System Flows
> Date: 2026-05-09 (Plan Phase 2a — initial draft).
> Companion document to `architecture.md`. Component IDs (C1, C2, … C13) match the architecture's intent-level decomposition; concrete interfaces are defined in Step 3.
>
> Diagram conventions follow `.cursor/skills/plan/templates/system-flows.md` § Mermaid Diagram Conventions: component-named participants, camelCase node IDs, `{Question?}` decisions, `([label])` start/end, `[[label]]` for external systems, no inline styling.
## Flow Inventory
| # | Flow Name | Trigger | Primary Components | Criticality |
|---|-----------|---------|--------------------|-------------|
| F1 | Pre-flight cache provisioning | Operator runs C12 cache-build CLI on workstation with `--flight-id <Guid>` (online) or `--flight-file <path>` (offline). The flight was previously authored in the parent-suite Mission Planner UI (`suite/ui`) and persisted to the parent-suite `flights` REST service | C12 (operator), C12 `FlightsApiClient` (operator-side, AZ-489), [[`flights` REST service]], C11 `TileDownloader`, [[`satellite-provider`]], C10, C6, C7 | High |
| F2 | Takeoff load | Companion boot detected by FC `MAV_STATE` ARMED OR companion process start with armed FC | C10, C7, C8 (signing handshake), C5 `set_takeoff_origin` (operator-origin warm-start, AZ-490), C13 | High |
| F3 | Steady-state per-frame estimation | Nav camera frame received (3 Hz nominal) | C1, C2, C2.5, C3, C3.5, C4, C5, C8 (out), C13 | High |
| F4 | Mid-flight tile generation + local cache write | Successful satellite-anchored frame with quality metadata above threshold | C5, C6, C13 (no C8/C11 path — C11 `TileUploader` is not loaded in the airborne image) | High |
| F5 | Visual blackout + spoofed-GPS failsafe | Camera unusable AND/OR FC GPS reports denial/spoof | C1, C5, C8, C13 (degraded-mode escalation per AC-NEW-8) | High |
| F6 | Sharp-turn / disconnected-segment re-localization | Frame-to-frame registration fails for ≥ 1 frame (AC-3.2 / AC-3.3) | C1, C2, C2.5, C3, C3.5, C4, C5, C8, C13; optionally operator (AC-3.4) | High |
| F7 | Spoofing-promotion via EKF source-set switch | FC reports GPS denial/spoof while companion estimate is healthy | C5, C8, [[ArduPilot Plane FC]] | High |
| F8 | Companion reboot recovery | Companion process restart while FC remains armed | C8 (FC IMU pose ingest), C5, C10 (warm-cache verify), C13 | Medium |
| F9 | GCS telemetry stream | Per-frame estimate available + GCS link healthy | C5, C8, [[QGroundControl]] | Medium |
| F10 | Post-landing tile upload | Operator triggers C12 `PostLandingUploadOrchestrator`; orchestrator confirms `flight_footer.clean_shutdown == True` and invokes C11 `TileUploader` | C12 `PostLandingUploadOrchestrator` (operator-side; reads FDR footer), C11 `TileUploader` (operator-side), C6 (read), [[`satellite-provider`]] (D-PROJ-2 endpoint, planned) | High |
| F11 | Demo replay validation (operator) | Operator uploads `(video, tlog, calibration)` in suite UI; aligns timelines; runs full GPS-denied replay verdict | [[`suite/ui`]] (AZ-897), `replay_api` (AZ-973), `replay_input` (AZ-970–972), C12 `seed-cache-from-tlog` (AZ-974), C11 route seed, C10, airborne replay (`config.mode=replay`) | High |
## Flow Dependencies
| Flow | Depends on | Shares data with |
|------|-----------|------------------|
| F1 | none | F2 (Manifest, EngineCacheEntry, Tile cache, FAISS index) |
| F2 | F1 (cache + engines + manifests on disk; SHA-256 content-hash gate) | F3 (warmed pipeline state) |
| F3 | F2 (warm pipeline) | F4 (PoseEstimate for tile gen), F9 (downsampled summary), F11 (smoothing extends F3 internally — see F3 § Notes), F13 FDR (cross-cutting) |
| F4 | F3 (PoseEstimate + quality metadata) | F10 (uploaded tiles), F13 FDR |
| F5 | F3 (last trusted state), F8 (FC IMU prior) | F8 if covariance trips fail-threshold |
| F6 | F3 (frame-to-frame failure detection) | F3 resumes once anchor recovers |
| F7 | F3 (companion estimate health), F8 IMU prior | F3 (becomes primary FC source after switch) |
| F8 | F1 + F2 (warm cache survives reboot via content-hash verify) | F3 (resumes once warm), F5 (degraded mode if recovery fails) |
| F9 | F3 | n/a (read-only outbound) |
| F10 | F4 (locally-saved tiles), C13 `flight_footer` written on clean shutdown, parent-suite D-PROJ-2 endpoint availability | F1 of the next flight (uploaded tiles enter the basemap once promoted to `trusted`) |
| F11 | F1 route-driven variant (AZ-974) OR warm cache; E-DEMO-REPLAY (AZ-265) | F1 (corridor cache), replay JSONL + map artifacts consumed by suite UI |
**Cross-cutting**: F13 FDR-write is not a flow per se — every flow above has an FDR write side-effect. AC-NEW-3 requires every payload class (estimate, IMU, MAVLink, mid-flight tile, system health, failed-tile thumbnail) to be present; rollover is logged, never silent.
---
## Flow F1: Pre-flight cache provisioning
### Description
The operator builds (or refreshes) the per-mission cache before takeoff. F1 has **three phases** sequenced by C12 OperatorTool:
- **Phase 0 — Flight resolve (C12 `FlightsApiClient`, AZ-489)**: read the operator-authored `Flight` (ordered waypoints + altitudes) either from the parent-suite `flights` REST service (`--flight-id <Guid>`) or from a local JSON export (`--flight-file <path>`). Compute the bounding box as the envelope of waypoint lat/lon plus a configurable buffer (default 1 km). Extract `Flight.waypoints[0].(lat, lon, alt)` as the **takeoff origin**. Both are passed downstream as `BuildRequest` fields.
- **Phase 1 — Tile download (C11 `TileDownloader` — bbox-driven, production path)**: fetch tiles from `satellite-provider` for the bbox computed in Phase 0 via `POST /api/satellite/tiles/inventory` (bulk lookup of `(z,x,y)` coords per `tile-inventory.md` v1.0.0 / AZ-505) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch for inventory entries with `present=true`); apply sector-classified freshness rules (AC-NEW-6) and resolution gate (RESTRICT-SAT-4); write tile rows + JPEGs into C6. Auth: JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` accepts self-signed certs.
- **Phase 2 — Cache artifact build (C10 CacheProvisioner)**: read the populated C6 store; compile/deserialize TRT engines via C7; batch-generate descriptors via the C2 backbone; atomically write the FAISS HNSW index with SHA-256 sidecars; write the Manifest hashing model + calibration + corpus + sector classification **+ takeoff origin** (D-C10-1 idempotence; ADR-010).
This flow is offline and not time-critical. **Only Phase 0 reaches `flights` REST, and only Phase 1 reaches `satellite-provider`**. Both phases run on the operator workstation, which is the only host that holds TLS + service-internal credentials. The companion never reaches either service directly (Principle #9, denied-environment operation).
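The Phase 0 arithmetic (bounding box as the waypoint envelope plus a buffer, takeoff origin from the first waypoint) can be sketched as follows. This is a minimal illustration, not the `FlightsApiClient` implementation: the `Waypoint` class and both function names are hypothetical, the Earth is treated as a sphere, and flights that cross the antimeridian or approach the poles are not handled.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


@dataclass(frozen=True)
class Waypoint:
    """Hypothetical stand-in for one waypoint of the Flight DTO."""
    lat: float
    lon: float
    alt: float


def bbox_with_buffer(waypoints, buffer_m=1000.0):
    """Return (min_lat, min_lon, max_lat, max_lon) padded by buffer_m metres."""
    if not waypoints:
        raise ValueError("flight has zero waypoints; cannot derive bbox")
    lats = [w.lat for w in waypoints]
    lons = [w.lon for w in waypoints]
    # Convert the buffer from metres to degrees of latitude.
    dlat = math.degrees(buffer_m / EARTH_RADIUS_M)
    # A degree of longitude shrinks with cos(latitude); use the latitude
    # farthest from the equator so the buffer is never under-sized.
    widest_lat = max(abs(min(lats)), abs(max(lats)))
    dlon = math.degrees(
        buffer_m / (EARTH_RADIUS_M * math.cos(math.radians(widest_lat)))
    )
    return (min(lats) - dlat, min(lons) - dlon, max(lats) + dlat, max(lons) + dlon)


def takeoff_origin(waypoints):
    """Return (lat, lon, alt) of the first waypoint."""
    first = waypoints[0]
    return (first.lat, first.lon, first.alt)
```

With the default 1 km buffer, the latitude padding is roughly 0.009 degrees; the longitude padding is larger at mid latitudes because meridians converge.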
#### Phase 1 variant — route-driven seeding (cycle 3 — Epic AZ-835 / AZ-836 + AZ-838 + AZ-839)
A tlog-driven alternative to bbox download lets the operator pre-commit the cache to the precise corridor the drone actually flew. **Production bindings** (Epic AZ-969): C12 `seed-cache-from-tlog` (AZ-974) and the `replay_api` demo job (AZ-973) call the same `operator_replay.cache_seed` module. The e2e fixture `operator_pre_flight_setup` (AZ-839) is a thin wrapper over that production path — not a parallel implementation.
Phase-1 sub-steps in the route-driven variant (replaces the bbox download for that invocation):
1. **Extract corridor from tlog**: `replay_input.tlog_route.extract_route_from_tlog(tlog, *, max_waypoints=10)` (AZ-836). Trims pre-takeoff stationary frames, then coarsens the GPS trace to ≤ `max_waypoints` waypoints via Douglas-Peucker in WGS-84 with great-circle distance. Returns a frozen+slots `RouteSpec(waypoints, suggested_region_size_meters, source_tlog, source_segment, total_distance_meters)`; canonical home `_types/route.py` (AZ-845).
2. **Submit to satellite-provider**: `c11_tile_manager.route_client.SatelliteProviderRouteClient.seed_route(spec)` (AZ-838). Pre-emptively validates against the AZ-809 `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges) BEFORE the HTTP POST. Then POSTs `/api/satellite/route` with `requestMaps=true&createTilesZip=false` and polls `GET /api/satellite/route/{id}` every 5 s for at most 60 attempts until `mapsReady=true` (terminal-success) or a terminal-failure status (`{failed, error, rejected}`). Returns a `RouteSeedResult(route_id, terminal_status, maps_ready, tile_count, elapsed_ms, submitted_payload_sha256)`.
3. **Populate C6 via C11**: enumerate the route's tile coverage locally from `(waypoints, suggested_region_size_meters)`; invoke `tile_downloader.HttpTileDownloader.download_for_bbox` (existing C11 download path) to pull every corridor tile into C6.
4. **Build FAISS index via C10**: run `DescriptorBatcher` against the populated C6 using the NetVLAD backbone (per `c2_vpr/config.py:67` default); verify sidecar triple-consistency (`.index` + `.sha256` + `.meta.json`) per AZ-306; mismatch raises `IndexUnavailableError`.
5. **Yield `PopulatedC6Cache`**: `(cache_root, tile_store_path, faiss_index_path, faiss_sidecar_sha256_path, faiss_sidecar_meta_path, route_spec, tile_count, elapsed_seconds)`. Backed by a docker named volume that survives across pytest sessions in the same compose run.
Cold-start budget on Tier-2 Jetson: ≤ 5 min (first invocation, full materialisation + descriptor batching); warm: ≤ 30 s (named-volume reuse).
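The validate-then-poll behaviour of the route-seeding step can be sketched as below. This is an illustration under assumptions, not the `SatelliteProviderRouteClient` code: the three exception classes are local stand-ins that only borrow the names used in this document, `fetch_status` is a hypothetical callable returning the parsed JSON of one `GET /api/satellite/route/{id}` response, and the lat/lon limits (±90 / ±180) are assumed because the exact ranges are not stated here.

```python
import time

TERMINAL_FAILURES = {"failed", "error", "rejected"}


class RouteValidationError(ValueError): ...
class RouteTerminalFailureError(RuntimeError): ...
class RouteTransientError(RuntimeError): ...


def validate_route(points, region_size_m, zoom):
    """Check the request against the documented bounds before any HTTP call."""
    errors = []
    if not 2 <= len(points) <= 500:
        errors.append("points: expected 2..500 entries")
    if not 100 <= region_size_m <= 10_000:
        errors.append("regionSizeMeters: expected 100..10000")
    if not 0 <= zoom <= 22:
        errors.append("zoomLevel: expected 0..22")
    for i, (lat, lon) in enumerate(points):
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            errors.append(f"points[{i}]: lat/lon out of range")
    if errors:
        # Report every violated field at once rather than the first only.
        raise RouteValidationError("; ".join(errors))


def poll_until_ready(fetch_status, *, interval_s=5.0, max_attempts=60,
                     sleep=time.sleep):
    """Poll until mapsReady is true, a terminal failure, or budget exhaustion."""
    last = None
    for attempt in range(max_attempts):
        last = fetch_status()
        if last.get("mapsReady") is True:
            return last
        if last.get("status") in TERMINAL_FAILURES:
            raise RouteTerminalFailureError(last)
        if attempt < max_attempts - 1:
            sleep(interval_s)  # no sleep after the final attempt
    raise RouteTransientError(f"poll budget exhausted; last status: {last}")
```

Injecting `sleep` keeps the loop testable without waiting the full 5 minutes that 60 attempts at 5 s would take.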
### Preconditions
- Operator workstation has network reach to `satellite-provider` (TLS + service-internal API key).
- Operator has classified the operational area (`active_conflict | stable_rear`) — drives the freshness threshold (AC-8.2 / AC-NEW-6).
- **Mission already authored in the parent-suite Mission Planner UI (`suite/ui`)** and persisted to the parent-suite `flights` REST service. Operator knows the `Flight` GUID (online path) OR has a JSON export of the same DTO shape on disk (offline path).
- Camera calibration JSON for the deployed unit is available (`adti20.<unit-id>.json` from D-PROJ-1 hybrid).
- Companion is connected to the operator workstation (USB or Ethernet) and writable.
- Free space on the companion's NVM covers the projected cache size (≤ 10 GB per AC-8.3).
### Sequence Diagram
```mermaid
sequenceDiagram
participant Operator
participant C12OperatorTool as C12 Operator Tool (workstation)
participant FlightsClient as C12 FlightsApiClient (workstation, AZ-489)
participant FlightsApi as [[flights REST service]] (.NET 8)
participant C11TileDownloader as C11 TileDownloader (workstation)
participant SatelliteProvider as [[satellite-provider]] (.NET 8)
participant C6TileStore as C6 TileStore + DescriptorIndex (Postgres + filesystem + FAISS)
participant C10Provisioner as C10 CacheProvisioner (companion)
participant C7Inference as C7 InferenceRuntime
participant C2Backbone as C2 VPR backbone (TensorRT)
Operator->>C12OperatorTool: build_cache --flight-id GUID [--flight-file PATH] sector_class calibration_file
alt online (flight-id)
C12OperatorTool->>FlightsClient: fetch_flight(GUID)
FlightsClient->>FlightsApi: GET /flights/{id} + GET /flights/{id}/waypoints
FlightsApi-->>FlightsClient: Flight DTO (waypoints, altitudes)
else offline (flight-file)
C12OperatorTool->>FlightsClient: load_flight_file(PATH)
FlightsClient->>FlightsClient: parse JSON into FlightDto
end
FlightsClient->>FlightsClient: bbox = envelope(waypoints.lat, waypoints.lon) + buffer
FlightsClient->>FlightsClient: takeoff_origin = waypoints[0].(lat, lon, alt)
FlightsClient-->>C12OperatorTool: (bbox, takeoff_origin, flight_id)
C12OperatorTool->>C11TileDownloader: download_tiles_for_area(bbox, zooms, sector_class)
C11TileDownloader->>SatelliteProvider: POST /api/satellite/tiles/inventory (bulk z,x,y lookup)
SatelliteProvider-->>C11TileDownloader: per-entry present:true|false + metadata
C11TileDownloader->>SatelliteProvider: GET /tiles/{z}/{x}/{y} (one per present:true entry)
SatelliteProvider-->>C11TileDownloader: Tile JPEG body
C11TileDownloader->>C11TileDownloader: filter by AC-NEW-6 freshness + RESTRICT-SAT-4 resolution
C11TileDownloader->>C6TileStore: write tiles to ./tiles/{zoomLevel}/{x}/{y}.jpg + Postgres rows (source='googlemaps')
C11TileDownloader-->>C12OperatorTool: DownloadBatchReport (counts, freshness summary)
C12OperatorTool->>C10Provisioner: build_cache_artifacts(bbox, zooms, sector_class, calibration, takeoff_origin, flight_id)
C10Provisioner->>C7Inference: load VPR backbone ONNX
C7Inference-->>C10Provisioner: TRT engine compiled (cached per SM/JP/TRT/precision tuple)
C10Provisioner->>C2Backbone: per-tile descriptor generation (batched on Jetson, reads tiles from C6)
C2Backbone-->>C10Provisioner: descriptor matrix (FP16/INT8 per D-C7-1)
C10Provisioner->>C6TileStore: faiss.write_index (HNSW) + atomicwrites + SHA-256 content-hash
C10Provisioner->>C10Provisioner: write Manifest (hash of model + calibration + corpus + sector_class + takeoff_origin)
C10Provisioner-->>C12OperatorTool: BuildReport (counts, hashes)
C12OperatorTool-->>Operator: PASS / FAIL summary
```
### Flowchart
```mermaid
flowchart TD
Start([Operator invokes C12 build with --flight-id or --flight-file]) --> ResolveFlight[C12 FlightsApiClient fetches Flight by GUID or reads JSON export]
ResolveFlight --> FlightOk{Flight resolved + at least 1 waypoint?}
FlightOk -->|no| RefuseBuild[Refuse build with explicit error to operator]
FlightOk -->|yes| ComputeBbox["Compute bbox as envelope of waypoint lat/lon + buffer and take waypoints[0] as takeoff origin"]
ComputeBbox --> Classify[Operator classifies sector active_conflict OR stable_rear]
Classify --> InvokeC11[C12 invokes C11 TileDownloader with computed bbox]
InvokeC11 --> Download["C11 POST /api/satellite/tiles/inventory then GET /tiles/{z}/{x}/{y}"]
Download --> FreshnessFilter{Freshness ok per AC-8.2 + AC-NEW-6?}
FreshnessFilter -->|stale and stable_rear| RejectOrDowngrade[Reject or downgrade tile]
FreshnessFilter -->|stale and active_conflict| RejectOrDowngrade
FreshnessFilter -->|fresh| ResolutionGate{Resolution >= 0.5 m/px per RESTRICT-SAT-4?}
RejectOrDowngrade --> ResolutionGate
ResolutionGate -->|fail| RejectRes[Reject and report]
ResolutionGate -->|pass| WriteTiles[C11 writes tiles to filesystem + Postgres]
WriteTiles --> InvokeC10[C12 invokes C10 build_cache_artifacts]
RejectRes --> Done
InvokeC10 --> CompileEngines[C10 compiles or reuses TRT engines via C7 InferenceRuntime]
CompileEngines --> EngineCacheHit{EngineCacheEntry already valid for SM JP TRT precision tuple?}
EngineCacheHit -->|yes D-C10-6| ReuseEngine[Reuse cached engine and INT8 calibration cache]
EngineCacheHit -->|no| BuildEngine[Polygraphy or trtexec or IBuilderConfig hybrid build]
ReuseEngine --> Descriptors
BuildEngine --> Descriptors[C10 batches each tile through C2 backbone for descriptors]
Descriptors --> WriteIndex[faiss.write_index HNSW + atomicwrites + SHA-256 content-hash]
WriteIndex --> WriteManifest[Write Manifest with hash of model + calibration + corpus + sector_class + takeoff_origin]
WriteManifest --> ManifestHashCheck{Idempotence check D-C10-1: same manifest hash as last build?}
ManifestHashCheck -->|same| SkipRebuild[Skip rebuild and emit no-op report]
ManifestHashCheck -->|different| Done([Provisioning complete; cache + engines + manifest staged])
SkipRebuild --> Done
RefuseBuild --> Done
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 0a | Operator | C12 | (`flight_id` OR `flight_file`, `zoom_levels`, `sector_class`, `calibration_path`) | CLI args / GUI form |
| 0b | C12 `FlightsApiClient` (online) | `flights` REST | `GET /flights/{id}` + `GET /flights/{id}/waypoints` | HTTPS GET |
| 0c | `flights` REST | C12 `FlightsApiClient` | `Flight` + ordered `Waypoint[]` (lat / lon / alt / objective / source) | JSON DTOs |
| 0d | C12 `FlightsApiClient` (offline) | filesystem | `flight_file` JSON in the same DTO shape | JSON read |
| 0e | C12 `FlightsApiClient` | C12 | `(bbox, takeoff_origin, flight_id)` | in-process |
| 1 | C12 | C11 `TileDownloader` | `DownloadRequest(bbox, zoom_levels, sector_class)` | in-process call |
| 2a | C11 | `satellite-provider` REST | `POST /api/satellite/tiles/inventory` (bulk `(z,x,y)` lookup, ≤ 5000 entries / request; per `tile-inventory.md` v1.0.0) | HTTPS POST JSON body |
| 2b | `satellite-provider` | C11 | Per-entry `present: true \| false` + metadata when present | JSON response (order matches request order) |
| 2c | C11 | `satellite-provider` REST | `GET /tiles/{z}/{x}/{y}` (issued only for `present=true` entries) | HTTPS GET |
| 3 | `satellite-provider` | C11 | Tile JPEG body | binary JPEG |
| 4 | C11 | C6 filesystem (over USB/Eth) | Tile JPEG bodies | `./tiles/{zoomLevel}/{x}/{y}.jpg` |
| 5 | C11 | C6 PostgreSQL | Tile metadata rows (`source='googlemaps'`) | SQL INSERT (mirror of `satellite-provider`'s `tiles` table) |
| 1' (route variant) | tlog file | `replay_input.tlog_route.extract_route_from_tlog` | `RouteSpec(waypoints, suggested_region_size_meters, …)` | in-process call |
| 2' (route variant) | C11 `SatelliteProviderRouteClient` | `satellite-provider` REST | `POST /api/satellite/route` (`requestMaps=true`); then `GET /api/satellite/route/{id}` poll until `mapsReady=true` | HTTPS POST + repeated GET |
| 3' (route variant) | C11 | enumerator | local enumeration of corridor `(z,x,y)` coords from `(waypoints, suggested_region_size_meters)` | in-process |
| 4'+5' (route variant) | C11 | C6 | same as steps 4+5 above (downloads via the same inventory + slippy-map paths) | as above |
| 6 | C12 | C10 `CacheProvisioner` | `BuildRequest(bbox, zoom_levels, sector_class, calibration_path, takeoff_origin, flight_id)` | in-process call (operator-orchestrator side); RPC over USB/Eth to companion runner |
| 7 | C10 → C7 | TRT engine cache | TRT engines | `.engine` files keyed by `(SM, JP, TRT, precision)` (D-C10-7) |
| 8 | C2 backbone (driven by C10) | C6 FAISS index | Descriptor matrix | `.index` (FAISS HNSW), atomicwrites, SHA-256 sidecar |
| 9 | C10 | filesystem | Manifest (carries `takeoff_origin` + hashes) | YAML or JSON |
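Steps 1 to 2a imply turning a bbox into `(z,x,y)` coordinates and splitting them into inventory requests of at most 5000 entries. A minimal sketch follows, assuming the standard Web Mercator XYZ ("slippy-map") tiling scheme; the function names are hypothetical and this is not the C11 `TileDownloader` code.

```python
import math
from itertools import islice


def lonlat_to_tile(lat, lon, z):
    """Standard slippy-map tile index containing the given WGS-84 point."""
    n = 2 ** z
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge still land on a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(min_lat, min_lon, max_lat, max_lon, z):
    """Yield every (z, x, y) tile that intersects the bbox at zoom z."""
    x0, y1 = lonlat_to_tile(min_lat, min_lon, z)  # south-west corner: larger y
    x1, y0 = lonlat_to_tile(max_lat, max_lon, z)  # north-east corner: smaller y
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            yield (z, x, y)


def chunked(iterable, size=5000):
    """Split coordinates into batches that fit one inventory request."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch
```

Each batch would become the body of one `POST /api/satellite/tiles/inventory` call, and only entries reported as `present=true` would then be fetched individually.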
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| `flights` REST unreachable (online path) | Step 0b | HTTP timeout / connection refused | Fail explicitly; instruct operator to retry online or use `--flight-file` offline path; takeoff blocked |
| `flights` REST 401/403 (online path) | Step 0b | HTTP 401/403 | Fail with explicit error; instruct operator to refresh suite credentials; takeoff blocked. Never silently fall back |
| `flights` REST 404 (online path) | Step 0b | HTTP 404 | Fail with explicit message naming the unknown `flight_id`; takeoff blocked |
| Flight file malformed (offline path) | Step 0d | JSON parse failure / schema mismatch | Fail with line / field reference; instruct operator to re-export from Mission Planner UI; takeoff blocked |
| Flight has zero waypoints | Step 0e | Post-fetch validation | Fail explicitly; cannot derive bbox or takeoff origin; takeoff blocked |
| Flight bbox exceeds cache budget | Step 0e | Pre-Phase-1 bbox area vs AC-8.3 budget projection | Fail with budget delta; operator must re-plan a smaller route in Mission Planner UI; takeoff blocked |
| `satellite-provider` unreachable | Step 2a/2c (or 2' route variant) | HTTP timeout / 5xx | C11 `TileDownloader` / `SatelliteProviderRouteClient` fails with explicit error; operator retries when network is available; takeoff blocked |
| `satellite-provider` JWT auth 401/403 | Step 2a/2c (or 2' route variant) | HTTP 401/403 | Fail with explicit error; instruct operator to refresh `SATELLITE_PROVIDER_API_KEY`; takeoff blocked. Never silently fall back to plaintext or unauthenticated |
| Route validation fails (route variant) | Step 1'→2' | Pre-emptive client check against AZ-809 `CreateRouteRequestValidator` bounds | `RouteValidationError` raised BEFORE the HTTP POST; surface field-by-field errors to operator |
| Route materialisation terminal failure (route variant) | Step 2' poll | `GET /api/satellite/route/{id}` returns `status ∈ {failed, error, rejected}` | `RouteTerminalFailureError` with `.detail` carrying the server response JSON; takeoff blocked |
| Route poll budget exhausted (route variant) | Step 2' poll | 60 attempts × 5 s ceiling reached without `mapsReady=true` or terminal failure | `RouteTransientError` referencing the last observed status; operator may re-invoke or extend the poll budget |
| Tile fails freshness | Step 3 (C11) | `tile.capture_timestamp` vs `sector_class` threshold | Reject (active_conflict) or downgrade-no-`satellite_anchored`-label (rear), per AC-NEW-6; counts surface in `DownloadBatchReport` |
| Resolution below 0.5 m/px | Step 3 (C11) | Tile metadata GSD check (RESTRICT-SAT-4) | Reject; report; takeoff blocked |
| Insufficient cache budget | Step 4 (C11) | Filesystem free-space check pre-write | Fail fast with explicit budget delta; no partial write |
| C6 missing tiles for requested bbox/zoom | Step 6 (C10) | C10's pre-build scan finds < expected tile count | Surface as `BuildReport.failure` instructing operator to re-run C11 `TileDownloader`; do **not** trigger network fetch from C10 |
| Engine compile failure | Step 7 | Polygraphy / trtexec exit code; no output `.engine` | Surface error to operator; takeoff blocked; **never silently fall back** |
| Descriptor generation OOM on Jetson | Step 8 | CUDA OOM | Halve batch size and retry once; if still OOM, surface to operator |
| Atomic-write or SHA-256 mismatch | Step 8 | `atomicwrites` rollback or content-hash sidecar mismatch | Mark cache invalid; rebuild from staged tiles; if persistent, surface to operator |
| Tampered cache (post-write, pre-takeoff) | (caught at takeoff in F2, not here) | F2 SHA-256 content-hash gate | F2 refuses takeoff (IT-7) |
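The "halve batch size and retry once" recovery for descriptor-generation OOM can be sketched as below. This is an assumption-laden illustration: `run_batch` is a hypothetical callable that returns one descriptor per input tile, and Python's built-in `MemoryError` stands in for whatever exception the CUDA runtime actually raises.

```python
def describe_with_oom_retry(tiles, run_batch, batch_size, oom_error=MemoryError):
    """Run all tiles through run_batch; on OOM, halve the batch and retry once."""

    def run_all(size):
        out = []
        for i in range(0, len(tiles), size):
            out.extend(run_batch(tiles[i:i + size]))
        return out

    try:
        return run_all(batch_size)
    except oom_error:
        halved = max(1, batch_size // 2)
        # Single retry from the start; partial results of the failed pass are
        # discarded. A second OOM propagates so it can be surfaced to the operator.
        return run_all(halved)
```

Restarting from the first tile keeps the sketch simple; a real implementation might resume from the failed batch instead.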
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| End-to-end provisioning time (~400 km², worst case) | ≤ tens of minutes (offline, not time-critical per AC-8.3) | Dominated by tile download bandwidth + descriptor batching on Jetson |
| Engine cache hit re-build | < 30 s per IT-9 | D-C10-6 calibration-cache reuse + D-C10-7 self-describing filename schema |
| Idempotent re-run with no inputs changed | Skip rebuild via D-C10-1 manifest-hash trigger | IT-8 |
| Descriptor cache footprint | Inside the 10 GB AC-8.3 budget (incl. tiles + indices + overviews) | Carve-out per chosen VPR backbone descriptor dimension (D-C2-6 / D-C2-9 / D-C2-10) |
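The idempotent re-run row relies on a manifest hash that changes only when an input changes. One way to get such a hash is to serialise the inputs canonically before hashing, as in this sketch. The field names, the JSON layout, and both function names are assumptions for illustration; the actual Manifest schema is defined elsewhere.

```python
import hashlib
import json


def manifest_hash(model_sha256, calibration_sha256, corpus_sha256,
                  sector_class, takeoff_origin):
    """Deterministic SHA-256 over the build inputs (hypothetical field names)."""
    payload = {
        "model": model_sha256,
        "calibration": calibration_sha256,
        "corpus": corpus_sha256,
        "sector_class": sector_class,
        "takeoff_origin": list(takeoff_origin),
    }
    # Sorted keys and fixed separators make the byte stream independent of
    # dict ordering and whitespace, so equal inputs always hash equally.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rebuild(previous_hash, current_hash):
    """True when no previous build exists or any hashed input has changed."""
    return previous_hash != current_hash
```

Because `takeoff_origin` is part of the hashed payload, re-planning the first waypoint forces a rebuild even if the tiles and model are unchanged.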
---
## Flow F2: Takeoff load
### Description
From companion process start to **first valid emitted external-position frame**, within the AC-NEW-1 ≤ 30 s p95 cold-start TTFF budget. Takeoff load verifies the cache (SHA-256 content-hash gate, D-C10-3), mmaps the FAISS HNSW index, deserialises pre-built TensorRT engines, completes the MAVLink 2.0 signing handshake on the AP wired channel (D-C8-9 = (d)), and arms the per-frame pipeline.
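The content-hash gate amounts to recomputing each artifact's SHA-256 and comparing it with the stored sidecar. A minimal sketch follows, assuming the sidecar lives next to the artifact as `<name>.sha256` and holds the hex digest as its first whitespace-separated token; both conventions and the function names are assumptions, not the C10 `ManifestVerifier` contract.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in 1 MiB chunks so large indices are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(artifact):
    """Return True only if the sidecar exists and matches the artifact bytes."""
    artifact = Path(artifact)
    sidecar = artifact.with_name(artifact.name + ".sha256")  # assumed naming
    if not sidecar.is_file():
        return False
    fields = sidecar.read_text().split()
    return bool(fields) and sha256_of(artifact) == fields[0].lower()
```

A caller would refuse takeoff as soon as any artifact fails this check rather than attempting a repair on the companion.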
### Preconditions
- F1 (pre-flight cache provisioning) has completed successfully.
- Manifest + tiles + descriptor index + TRT engines exist on the companion's NVM.
- Camera calibration JSON for the deployed unit is present at the path declared in config.
- FC is reachable on the configured UART/USB; FC firmware is `ArduPilot Plane ≥ Aug 2021 PR #18345` (for D-C8-2) or `iNav 8.0+`.
- Operator has staged the per-flight MAVLink 2.0 signing key seed (one half) and the FC has the matching pair (the AP path); iNav path skips this step.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Companion as Companion process (composition root)
participant ContentHash as C10 ManifestVerifier
participant FaissIndex as C6 DescriptorIndex (FAISS HNSW)
participant TrtRuntime as C7 InferenceRuntime
participant FcAdapter as C8 FcAdapter (per-FC)
participant FC as [[Flight Controller]]
participant Pipeline as C1+C2+C2.5+C3+C3.5+C4+C5 pipeline (warm)
participant Fdr as C13 FdrWriter
Companion->>ContentHash: verify(manifest, descriptor.index, tiles dir)
ContentHash-->>Companion: pass (or refuse takeoff)
Companion->>FaissIndex: faiss.read_index(IO_FLAG_MMAP_IFC)
FaissIndex-->>Companion: ready
Companion->>TrtRuntime: deserializeCudaEngine per cached .engine
TrtRuntime-->>Companion: engines ready
alt FC is ArduPilot Plane
Companion->>FcAdapter: open MAVLink 2.0 + signing handshake
FcAdapter->>FC: signing seed + key handshake (D-C8-9 = (d))
FC-->>FcAdapter: signed handshake ack
FcAdapter-->>Companion: AP signed channel ready, key rotation logged to FDR
else FC is iNav
Companion->>FcAdapter: open MSP2 channel (unsigned, accepted residual risk)
FcAdapter-->>Companion: iNav channel ready
end
Companion->>FC: subscribe to FC IMU + attitude + GPS health (telemetry)
FC-->>Companion: first telemetry frame
Note over Companion,Pipeline: Cold-start ladder (ADR-010, AZ-490). Operator-origin from Manifest is primary, FC EKF GPS is secondary
alt Manifest carries takeoff_origin (AZ-490 primary path)
Companion->>Pipeline: C5.set_takeoff_origin(manifest.takeoff_origin, sigma_horiz_m, sigma_vert_m) BEFORE any add_vio / add_fc_imu
else Manifest has no takeoff_origin AND FC EKF GPS is valid (AZ-419 secondary path)
Companion->>FC: query FC EKF last valid GPS + IMU-extrapolated pose (AC-5.1)
FC-->>Companion: warm-start pose
Companion->>Pipeline: C5.set_takeoff_origin(fc_gps_origin, fc_gps_sigma_horiz, fc_gps_sigma_vert)
else No origin available
Companion-->>Companion: stay INITIALIZING and apply FT-P-11 takeoff-abort policy (AZ-419 amended)
end
Companion->>Pipeline: warm pipeline with calibration
Pipeline-->>Companion: ready (no estimate emitted yet)
Companion->>Fdr: open per-flight FDR record and log signing key rotation event + chosen cold-start origin source
Note over Companion: Wait for first nav frame (F3 entry)
```
### Flowchart
```mermaid
flowchart TD
Start([Companion process start]) --> ReadManifest[C10 ManifestVerifier reads Manifest + sidecars]
ReadManifest --> ContentHashCheck{D-C10-3 SHA-256 content-hash gate passes?}
ContentHashCheck -->|no| RefuseTakeoff[Companion refuses takeoff: STATUSTEXT, no GPS_INPUT emit, FDR log]
ContentHashCheck -->|yes| MmapIndex[FAISS read_index IO_FLAG_MMAP_IFC]
MmapIndex --> LoadEngines[Per .engine: deserializeCudaEngine on C7]
LoadEngines --> EnginesOk{All engines deserialized OK?}
EnginesOk -->|no| RefuseTakeoff
EnginesOk -->|yes| FcDetect{Configured FC?}
FcDetect -->|ArduPilot Plane| ApSign[MAVLink 2.0 signing handshake D-C8-9]
FcDetect -->|iNav| InavOpen[Open MSP2 channel unsigned residual risk]
ApSign --> SignOk{Signing handshake OK?}
SignOk -->|no| RefuseTakeoff
SignOk -->|yes| OriginGate
InavOpen --> OriginGate{Manifest carries takeoff_origin?}
OriginGate -->|yes ADR-010 AZ-490 primary| OperatorOrigin[C5.set_takeoff_origin manifest.takeoff_origin sigma_horiz_m sigma_vert_m]
OriginGate -->|no| FcEkfGate{FC EKF reports valid non-spoofed GPS?}
FcEkfGate -->|yes AZ-419 secondary| FcOrigin[C5.set_takeoff_origin fc_gps_origin fc_gps_sigma_horiz fc_gps_sigma_vert]
FcEkfGate -->|no| NoOrigin[Stay INITIALIZING and apply FT-P-11 takeoff-abort policy]
OperatorOrigin --> WarmPipeline
FcOrigin --> WarmPipeline
NoOrigin --> Refuse2[Refuse takeoff with FDR record of missing origin]
WarmPipeline[Warm C1 + C2 + C2.5 + C3 + C3.5 + C4 + C5 with calibration]
WarmPipeline --> OpenFdr[C13 opens per-flight FDR; logs signing key rotation event + chosen origin source]
OpenFdr --> Ready([Ready; awaiting first nav frame])
Refuse2 --> RefuseTakeoff
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | Companion | C10 | (`manifest_path`) | filesystem read |
| 2 | C10 | filesystem | content-hash sidecars + `takeoff_origin` | SHA-256 hex digests + `LatLonAlt` in Manifest |
| 3 | Companion | FAISS | `.index` mmap pointer | C++ FAISS API |
| 4 | Companion | C7 / TensorRT | `.engine` deserialize | TensorRT IRuntime |
| 5 | Companion | FC (AP) | signing seed + handshake | MAVLink 2.0 signing |
| 6 | FC | Companion | warm-start pose + IMU/attitude/GPS health | MAVLink (AP) / MSP2 + MAVLink outbound (iNav) |
| 7 | Companion | C5 `StateEstimator` (AZ-490) | `set_takeoff_origin(origin, sigma_horiz_m, sigma_vert_m)` with origin = `manifest.takeoff_origin` (primary) OR FC-EKF GPS (secondary) | in-process Protocol method |
| 8 | Companion | C13 FDR | startup record (config snapshot, signing key rotation event, content-hash digests, chosen cold-start origin source) | FDR record |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Content-hash mismatch | Step 2 | D-C10-3 sidecar verify | Refuse takeoff; STATUSTEXT to GCS; FDR records the event; operator must re-run F1 |
| FAISS mmap failure | Step 3 | C++ FAISS exception | Refuse takeoff; same as above |
| TRT deserialize failure | Step 4 | TensorRT API error | Refuse takeoff; report mismatched `(SM, JP, TRT, precision)` tuple to operator |
| Signing handshake fail (AP) | Step 5 | Handshake timeout / signed-message rejection | Refuse takeoff; clear-text reason via STATUSTEXT (handshake never succeeded → unsigned STATUSTEXT is acceptable for this case only) |
| FC unreachable | Step 6 | UART/USB read timeout | Retry with backoff; after `N` retries refuse takeoff |
| Manifest has no `takeoff_origin` AND EKF returns no warm-start pose | Step 7 (ADR-010, AZ-490) | Both primary (manifest) and secondary (FC EKF) origin paths unavailable | Refuse takeoff; FDR records "no cold-start origin available"; FT-P-11 takeoff-abort policy applies. Bound the wait by AC-NEW-1 budget before final refusal |
| Manifest has `takeoff_origin` but FC EKF GPS disagrees by > 200 m at takeoff | Step 7 (ADR-010 Principle #11 bounded-delta) | Operator origin vs FC GPS comparison after first FC telemetry frame | Operator origin wins; FC GPS is logged as suspect (likely spoofed-at-takeoff); proceed to warm pipeline with the operator origin |
| FDR open failure | Step 8 | Filesystem write error | Refuse takeoff (per AC-NEW-3 every payload class must be present from t=0) |
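The cold-start origin ladder and the 200 m bounded-delta rule from the table above can be sketched as a pure function. This is an illustration under assumptions: the function names and the returned source labels are hypothetical, origins are `(lat, lon, alt)` tuples, the delta is measured horizontally only with the haversine formula on a spherical Earth, and the caller is assumed to pass `fc_gps_origin=None` whenever the FC EKF GPS is not valid.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS-84 points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def choose_origin(manifest_origin, fc_gps_origin, max_delta_m=200.0):
    """Return (origin, source, fc_gps_suspect) following the ladder."""
    if manifest_origin is not None:
        # Primary path: the operator origin always wins; the FC GPS is only
        # flagged as suspect when it disagrees by more than max_delta_m.
        suspect = (
            fc_gps_origin is not None
            and haversine_m(*manifest_origin[:2], *fc_gps_origin[:2]) > max_delta_m
        )
        return manifest_origin, "manifest", suspect
    if fc_gps_origin is not None:
        return fc_gps_origin, "fc_ekf_gps", False  # secondary path
    return None, "none", False  # no origin: caller refuses takeoff
```

Keeping the decision free of I/O makes it straightforward to record the chosen source and the suspect flag in the FDR startup record.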
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Total takeoff load (boot → first valid frame) | p95 < 30 s (AC-NEW-1) | Validated by IT-2 (50× cold boot SITL) and NFT-PERF-03 |
| FAISS mmap cost | sub-second (mmap is lazy) | First query in F3 pays the page-in cost |
| TRT deserialize per engine | 15 s typical on Jetson Orin Nano Super | Engines per `(SM 87, JetPack 6.2, TRT 10.3, precision)` cached on disk |
| Signing handshake (AP) | sub-second | Wired UART/USB; per-flight key |
---
## Flow F3: Steady-state per-frame estimation
### Description
The system's **hot path**. For each nav-camera frame at 3 Hz nominal, run the canonical hierarchical pipeline `VIO → retrieval → re-rank → matching → AdHoP-conditional refinement → pose → fusion`, emit an `EmittedExternalPosition` to the FC at 5 Hz periodic, and write to FDR. The end-to-end latency budget is AC-4.1 p95 ≤ 400 ms; the partition is the D-CROSS-LATENCY-1 hybrid.
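A p95 budget such as AC-4.1 can be checked against recorded per-frame latencies with a nearest-rank percentile, as in this sketch. The function names are hypothetical and the nearest-rank definition is an assumption; the acceptance test may define the percentile differently.

```python
def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("no latency samples recorded")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def within_latency_budget(latencies_ms, budget_ms=400.0):
    """True when the p95 end-to-end latency is within the budget."""
    return percentile_nearest_rank(latencies_ms, 95) <= budget_ms
```

With 20 samples the p95 is the 19th smallest value, so a single slow frame does not fail the budget but two do.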
### Preconditions
- F2 (Takeoff load) completed; pipeline is warm.
- Camera ingest thread is running; FC IMU/attitude telemetry is flowing.
- `flight_state == IN_AIR` (a runtime indication that the upload code path is not loaded; F4 also gates on this).
- The last `EmittedExternalPosition` is still fresh, or the AC-5.2 fallback has not yet been triggered.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Camera as Nav camera
participant C1 as C1 VioStrategy
participant C2 as C2 VprStrategy
participant C2_5 as C2.5 ReRanker
participant C3 as C3 CrossDomainMatcher
participant C3_5 as C3.5 ConditionalRefiner
participant C4 as C4 PoseEstimator (OpenCV solvePnPRansac + GTSAM Marginals)
participant C5 as C5 StateEstimator (GTSAM iSAM2)
participant C8 as C8 FcAdapter
participant FC as [[Flight Controller]]
participant Fdr as C13 FdrWriter
Camera->>C1: NavCameraFrame_t
Camera->>C2: NavCameraFrame_t (parallel fan-out)
C1->>C5: VioOutput_t (relative pose + 6x6 cov + IMU bias)
C2->>C2_5: top-K=10 VprResult
C2_5->>C3: top-N=3 RerankResult
C3->>C3_5: MatchResult (with reprojection residual)
alt residual exceeds threshold
C3_5->>C3_5: invoke AdHoP refinement (~+30..90 ms p99)
C3_5->>C4: refined MatchResult
else residual below threshold
C3_5->>C4: passthrough MatchResult
end
C4->>C5: PoseEstimate (with 6x6 covariance from GTSAM Marginals or Jacobian degraded)
C5->>C5: iSAM2 update + IncrementalFixedLagSmoother (K=10..20 keyframes)
C5->>C8: PoseEstimate with provenance label
C8->>FC: GPS_INPUT (AP) or MSP2_SENSOR_GPS (iNav) at 5 Hz periodic
C8->>FC: STATUSTEXT or NAMED_VALUE_FLOAT for source label (out-of-band)
Note over C5: AC-4.5 internal smoothing — emits corrected current frame; logs smoothed past-frames to FDR
C8->>Fdr: emitted external-position record
C5->>Fdr: per-frame estimate + smoothed-past entries (NFT-6)
```
### Flowchart
```mermaid
flowchart TD
Start([NavCameraFrame received at 3 Hz]) --> Fanout[Fan out to C1 VIO and C2 VPR in parallel]
Fanout --> C1Out[C1 produces VioOutput: relative pose + 6x6 cov + IMU bias + feature quality]
Fanout --> C2Out[C2 produces top-K=10 VprResult against FAISS HNSW]
C2Out --> C2_5[C2.5 single-pair LightGlue per candidate; rank by inlier count -> top-N=3]
C2_5 --> C3[C3 DISK + LightGlue × N pairs FP16]
C3 --> ResidualCheck{Reprojection residual > AdHoP threshold?}
ResidualCheck -->|yes| C3_5[C3.5 AdHoP refinement; ~+30..90 ms p99]
ResidualCheck -->|no| Passthrough[C3.5 passthrough]
C3_5 --> C4
Passthrough --> C4[C4 OpenCV solvePnPRansac IPPE]
C4 --> ThermalCheck{D-CROSS-LATENCY-1 thermal-throttle telemetry crosses threshold?}
ThermalCheck -->|no, steady-state| GtsamMarginals[GTSAM Marginals 6x6 cov D-C4-2 = b]
ThermalCheck -->|yes, hybrid degraded| Jacobian[Jacobian 6x6 cov D-C4-2 = a; degrade C2.5 N to 2]
GtsamMarginals --> C5
Jacobian --> C5
C1Out --> C5[C5 iSAM2 + CombinedImuFactor + IncrementalFixedLagSmoother K=10..20 keyframes]
C5 --> Provenance{Provenance label?}
Provenance -->|fresh anchor| LabelSat[satellite_anchored]
Provenance -->|propagated under no fresh anchor| LabelVisual[visual_propagated]
Provenance -->|IMU-only| LabelDeadReckon[dead_reckoned]
LabelSat --> Emit[C8 emits per-FC GPS_INPUT or MSP2_SENSOR_GPS at 5 Hz; STATUSTEXT for source label]
LabelVisual --> Emit
LabelDeadReckon --> Emit
Emit --> FdrWrite[C13 FDR: emitted record + per-frame estimate + smoothed-past entries]
FdrWrite --> Done([Frame complete])
```
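
The two decision nodes in the flowchart above (thermal-throttle degrade and provenance labelling) can be sketched as plain functions. This is a simplified illustration: `FrameConfig`, `select_frame_config`, `provenance_label`, and the boolean inputs are hypothetical, and the real thermal threshold and anchor-freshness rules are not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameConfig:
    covariance_mode: str  # "gtsam_marginals" (D-C4-2 = b) or "jacobian" (D-C4-2 = a)
    rerank_top_n: int     # C2.5 top-N passed on to C3


def select_frame_config(thermal_throttled: bool) -> FrameConfig:
    """Steady state uses GTSAM Marginals with N=3; the degraded hybrid drops to Jacobian + N=2."""
    if thermal_throttled:
        return FrameConfig(covariance_mode="jacobian", rerank_top_n=2)
    return FrameConfig(covariance_mode="gtsam_marginals", rerank_top_n=3)


def provenance_label(has_fresh_anchor: bool, has_visual_propagation: bool) -> str:
    """Map estimator state to one of the three provenance labels used in F3."""
    if has_fresh_anchor:
        return "satellite_anchored"
    if has_visual_propagation:
        return "visual_propagated"
    return "dead_reckoned"
```
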
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | Camera | C1 | `NavCameraFrame` | RGB pixel buffer + timestamp |
| 1 | Camera | C2 | `NavCameraFrame` (same frame) | same |
| 2 | C8 inbound | C1, C5 | `ImuWindow` (timestamp-aligned to frame) | DTO; same window for both consumers |
| 3 | C1 | C5 | `VioOutput` | relative SE(3) + 6×6 cov + bias + feature quality |
| 4 | C2 | C2.5 | `VprResult` (top-K=10 tile IDs ranked by descriptor distance) | DTO |
| 5 | C2.5 | C3 / C3.5 | `RerankResult` (top-N=3 tile IDs ranked by inlier count) | DTO |
| 6 | C3 → C3.5 → C4 | match pipeline | `MatchResult` (2D-3D corresp. + RANSAC inliers + reprojection residual) | DTO |
| 7 | C4 | C5 | `PoseEstimate` (WGS84 + 6×6 cov + provenance + `last_satellite_anchor_age_ms`) | DTO |
| 8 | C5 | C8 | smoothed/refined `PoseEstimate` | DTO |
| 9 | C8 | FC | `EmittedExternalPosition` | MAVLink `GPS_INPUT` (AP) or MSP2 `MSP2_SENSOR_GPS` (iNav) |
| 10 | C8 | FC | provenance label | MAVLink `STATUSTEXT` / `NAMED_VALUE_FLOAT` (AP) or MSP equivalent (iNav) |
| 11 | C5 + C8 | C13 FDR | per-frame estimate + emitted MAVLink frame + smoothed past-frame entries | FDR record |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Frame-to-frame registration failure | C1 | VioOutput marks low feature quality OR matcher fails | F6 sharp-turn / disconnected-segment re-localization |
| Cross-domain matching insufficient inliers | C3 | RANSAC inlier count below threshold | Mark frame as no satellite anchor; provenance becomes `visual_propagated`; F6 if persists |
| Reprojection residual exceeds AdHoP threshold | C3 | residual > threshold | C3.5 AdHoP refinement invoked (worst-case 2× C3 latency on triggered frames; budgeted in D-CROSS-LATENCY-1) |
| GTSAM Marginals exceeds latency budget | C4 | per-frame timer | D-CROSS-LATENCY-1 hybrid auto-degrade: drop to Jacobian covariance + N=2 |
| Sustained latency overrun (multi-frame) | end-to-end | rolling p95 monitor | Drop oldest frame from camera ingest queue (~10% drop budget per AC-4.1); FDR logs the drop |
| FC GPS reports denial/spoof | C8 inbound | MAVLink GPS health bit / spoof flag | F7 spoofing-promotion + F5 if visual is also lost |
| FC stops accepting `GPS_INPUT` (AP) | C8 outbound | no source-set acknowledgement after `MAV_CMD_SET_EKF_SOURCE_SET` | D-C8-2-FALLBACK path; AC-5.2 IMU-only fallback if persistent |
| Camera frame drop | Camera | ingest queue overflow | Drop oldest frame; log in FDR |
| Dead-reckoning >3 s | C5 | watchdog | AC-5.2 — system logs failure; FC enters IMU-only |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| End-to-end latency (camera capture → FC GPS frame) | AC-4.1 p95 ≤ 400 ms | D-CROSS-LATENCY-1 partition; NFT-PERF-01 + NFT-9 |
| Tail | p99 ≤ 600 ms (allows AdHoP-triggered frames) | NFT-9 |
| Memory | < 8 GB shared on Jetson | AC-4.2; NFT-LIM-01 |
| Frame rate | 3 Hz nominal; ~10 % drop allowed under sustained load | AC-4.1 |
| C8 emit cadence | 5 Hz periodic per D-C8-5 | independent of nav-frame rate; last-known-pose if no new estimate |
| Mode-transition into degraded label | ≤ 1 frame OR ≤ 400 ms (AC-3.5) | applies on transition to `visual_propagated` / `dead_reckoned` |
### Notes — AC-4.5 internal smoothing (sub-flow of F3)
GTSAM iSAM2 with `IncrementalFixedLagSmoother` retroactively refines past keyframes (window K = 10–20 per D-C5-3). The current frame emitted to the FC carries the smoothing-corrected state, but the FC log itself remains forward-time only (Mode B Fact #107). FDR (C13) MUST log the smoothed past-frame estimates so post-mission analysis can validate AC-4.5. IT-11 measures the smoothing-loop look-back accuracy independently of FC consumption.
---
## Flow F4: Mid-flight tile generation + local cache write
### Description
For every successful satellite-anchored frame whose `TileQualityMetadata` clears the publish threshold, orthorectify the nav frame onto basemap projection, deduplicate against the existing local tile cache, and write the result locally in `satellite-provider`-compatible on-disk format. **No outbound network write while airborne** — process-level isolation enforces this: neither the C11 `TileUploader` nor the C11 `TileDownloader` is loaded in the airborne companion image (ADR-004). The post-landing tool (F10) is a separate process / image.
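
The dedup rule (write when the cell is new or the candidate is better, otherwise skip) can be sketched as follows. This is a minimal illustration under stated assumptions: quality is reduced to a single hypothetical scalar score, cells are keyed directly by `(zoom, x, y)`, and the lat/lon to x/y mapping and the Postgres side are left out. `TileCandidate`, `dedup_decision`, and `tile_path` are illustrative names.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

CellKey = Tuple[int, int, int]  # (zoom_level, x, y)


@dataclass(frozen=True)
class TileCandidate:
    key: CellKey
    quality: float  # hypothetical scalar score; higher is better


def dedup_decision(existing: Dict[CellKey, float], cand: TileCandidate) -> str:
    """Return 'write_new', 'write_replace', or 'skip' for a candidate tile."""
    current = existing.get(cand.key)
    if current is None:
        return "write_new"
    if cand.quality > current:
        return "write_replace"
    return "skip"  # same or lower quality than the stored tile


def tile_path(key: CellKey, root: str = "./tiles") -> str:
    """Build the on-disk path in the ./tiles/{zoomLevel}/{x}/{y}.jpg layout."""
    zoom, x, y = key
    return f"{root}/{zoom}/{x}/{y}.jpg"
```

A tie is treated as a skip here, matching the "same or lower quality" branch of the flowchart.
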
### Preconditions
- F3 produced a `PoseEstimate` with provenance `satellite_anchored` and covariance below the publish threshold.
- `flight_state == IN_AIR` is signalled by FC `MAV_STATE`; the in-air image does not contain C11 (process-level isolation; both `TileDownloader` and `TileUploader` paths absent).
- Local C6 tile store has free quota (per AC-NEW-3 FDR sub-budget allocation).
- Mid-flight tile metadata schema (quality_metadata) is configured per AC-NEW-7 + D-PROJ-2 contract sketch.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Frame as NavCameraFrame_t (post-F3)
participant Pose as PoseEstimate_t (from F3)
participant Ortho as Orthorectifier (C6 sub-component)
participant Dedup as Deduper (latest/highest-quality wins)
participant TileStore as C6 TileStore (filesystem + Postgres)
participant Fdr as C13 FdrWriter
Pose->>Ortho: provenance == satellite_anchored AND covariance below threshold?
alt yes
Frame->>Ortho: NavCameraFrame_t
Ortho->>Ortho: orthorectify with calibration + pose
Ortho->>Dedup: candidate Tile (zoomLevel, lat, lon, capture_timestamp, quality_metadata)
Dedup->>TileStore: dedup query by (zoomLevel, lat, lon)
alt new or higher-quality
Dedup->>TileStore: write JPEG to ./tiles/{zoomLevel}/{x}/{y}.jpg
Dedup->>TileStore: insert/update Postgres row with source=onboard_ingest, voting_status=pending
Dedup->>Fdr: tile-write event + quality metadata
else duplicate or lower-quality
Dedup->>Fdr: skip event
end
else no (provenance not satellite_anchored OR cov above threshold)
Pose->>Fdr: skip event (rationale logged)
end
```
### Flowchart
```mermaid
flowchart TD
Start([F3 emitted PoseEstimate_t]) --> Provenance{provenance == satellite_anchored AND cov below threshold?}
Provenance -->|no| LogSkip[FDR logs skip with rationale]
Provenance -->|yes| Ortho[Orthorectify NavCameraFrame_t with calibration + pose]
Ortho --> BuildMeta[Build TileQualityMetadata: estimator_label + 2x2 cov + last_anchor_age + MRE + IMU bias norm]
BuildMeta --> DedupQuery{Dedup vs existing tiles by zoomLevel + lat + lon}
DedupQuery -->|new cell| WriteFs[Write JPEG to filesystem . tiles . zoom . x . y . jpg]
DedupQuery -->|existing cell, candidate higher quality| WriteFs
DedupQuery -->|existing cell, candidate same or lower quality| LogSkip
WriteFs --> InsertDb[Postgres INSERT or UPDATE with source=onboard_ingest, voting_status=pending]
InsertDb --> FdrLog[FDR logs tile-write event + metadata]
FdrLog --> Done([Tile available locally; awaits F10 post-landing upload])
LogSkip --> Done
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | F3 | C6 ortho | (`NavCameraFrame`, `PoseEstimate`, `CameraCalibration`) | DTO |
| 2 | C6 ortho | C6 dedup | candidate `Tile` + `TileQualityMetadata` | JPEG body + metadata DTO |
| 3 | C6 dedup | C6 store filesystem | tile JPEG | `./tiles/{zoomLevel}/{x}/{y}.jpg` (mirror of `satellite-provider`) |
| 4 | C6 dedup | C6 Postgres | tile row + metadata | SQL INSERT/UPDATE; `source=onboard_ingest`, `voting_status=pending` |
| 5 | C6 | FDR | tile-write event | FDR record (counts against AC-NEW-3 budget) |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Filesystem write fails | Step 3 | filesystem error | Skip tile; FDR logs error; pipeline continues (tile generation is best-effort, not safety-critical) |
| Postgres insert fails | Step 4 | DB error | Skip tile; FDR logs error |
| Local cache quota exhausted | Step 3 | pre-write free-space check | LRU-evict oldest **mid-flight** tile (never evict pre-flight `satellite-provider` tiles); FDR logs eviction |
| `flight_state` glitch reports `ON_GROUND` mid-flight | architectural | software guard — but C11 is not loaded anyway | Defense-in-depth holds: even if guard misfires, C11 (both `TileDownloader` and `TileUploader`) is not present in the airborne image |
| Dedup race (two threads writing same cell) | Step 4 | DB unique constraint or filesystem `O_EXCL` | Retry once with the freshest candidate; FDR logs race |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Per-tile orthorectification cost | not on the AC-4.1 critical path | Runs off the F3 hot loop; dropped first under thermal throttle |
| Per-tile write latency | < 1 frame interval typical (333 ms @ 3 Hz) | If exceeded, drop the tile rather than back-pressure F3 |
| Cache footprint growth | bounded by AC-NEW-3 mid-flight tile sub-budget | LRU-evict mid-flight tiles only |
---
## Flow F5: Visual blackout + spoofed-GPS failsafe
### Description
When the navigation camera becomes fully unusable (clouds, occlusion, whiteout, hardware fault) **and/or** the FC reports GPS denial/spoof, the system must NOT pretend to have visual or GPS data. It transitions to `dead_reckoned` propagation from the last trusted state + FC IMU/attitude/airspeed/altitude, grows covariance monotonically, escalates the MAVLink fix-quality field as the covariance crosses thresholds, and never re-promotes spoofed GPS without a 10-s GPS-health + visual-consistency gate. Reference: AC-3.5, AC-NEW-8.
### Preconditions
- F3 was running normally before the trigger.
- FC is still reachable (the link itself works; it's the GPS source / camera that failed).
### Sequence Diagram
```mermaid
sequenceDiagram
participant Camera as Nav camera
participant C1 as C1 VioStrategy
participant C5 as C5 StateEstimator
participant C8 as C8 FcAdapter
participant FC as [[Flight Controller]]
participant Gcs as [[QGroundControl]]
participant Fdr as C13 FdrWriter
alt camera unusable (whiteout / hw fault)
Camera->>C1: degraded or no frame
C1->>C5: VioOutput with low feature_quality OR no output
end
alt FC reports GPS denial/spoof
FC->>C8: GPS health bit / spoof flag set
C8->>C5: gps_health_event(denied | spoofed)
end
C5->>C5: switch label to dead_reckoned within ≤ 1 frame OR ≤ 400 ms
C5->>C5: propagate from last trusted state + FC IMU/attitude/airspeed/altitude; cov grows monotonically
C5->>C8: PoseEstimate (label = dead_reckoned)
C8->>C8: degrade horiz_accuracy field per AC-NEW-8 thresholds
alt 95% cov semi-major axis ≤ 100 m
C8->>FC: GPS_INPUT/MSP2 with honest horiz_accuracy
else 95% cov in (100, 500] m
C8->>FC: GPS_INPUT/MSP2 with fix_quality "2D fix or worse"
else 95% cov > 500 m OR blackout > 30 s
C8->>FC: GPS_INPUT/MSP2 with horiz_accuracy=999.0 (no fix)
C8->>Gcs: VISUAL_BLACKOUT_FAILSAFE STATUSTEXT
end
C8->>Gcs: VISUAL_BLACKOUT_IMU_ONLY STATUSTEXT at 1–2 Hz
C5->>Fdr: degraded-mode entry + per-frame estimate + cov
Note over C5,FC: spoofed GPS NEVER re-enters the estimator unless FC GPS health stable + non-spoofed for ≥10 s AND visual/satellite consistency check succeeds
```
### Flowchart
```mermaid
flowchart TD
Start([Visual blackout AND/OR GPS denial/spoof detected]) --> SwitchLabel[C5 switches label to dead_reckoned within ≤1 frame OR ≤400 ms]
SwitchLabel --> RejectSpoof[Reject spoofed GPS as estimator input]
RejectSpoof --> Propagate[Propagate from last trusted state + FC IMU/attitude/airspeed/altitude]
Propagate --> CovGrow[Covariance grows monotonically]
CovGrow --> Threshold{95 percent cov semi-major axis?}
Threshold -->|≤ 100 m| EmitNormal[C8 emits GPS_INPUT or MSP2 with honest horiz_accuracy]
Threshold -->|100 m to 500 m| EmitDegraded[C8 emits with fix_quality 2D fix or worse]
Threshold -->|gt 500 m OR blackout gt 30 s| EmitNoFix[C8 emits horiz_accuracy=999.0 + VISUAL_BLACKOUT_FAILSAFE STATUSTEXT]
EmitNormal --> Recovery{Anchor recovers OR GPS-health stable + non-spoofed for >=10 s + visual consistency check?}
EmitDegraded --> Recovery
EmitNoFix --> Recovery
Recovery -->|yes anchor| ResumeF3[Resume F3 with provenance restoration]
Recovery -->|yes GPS gate| ConsentReturn[Allow real-GPS back into estimator]
Recovery -->|no| Continue[Continue degraded mode; FDR per-frame]
Continue --> Threshold
ResumeF3 --> Done([Recovered])
ConsentReturn --> Done
```
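
The threshold branch in the flowchart above can be sketched as a small classifier. It assumes that the "95% cov semi-major axis" is the semi-major axis of the 95 % confidence ellipse of the 2×2 horizontal covariance, obtained by scaling the largest eigenvalue by the chi-square quantile for 2 degrees of freedom (5.991). The tier names and function names are hypothetical; the exact field values sent to the FC are defined by AC-NEW-8, not by this sketch.

```python
import math

CHI2_95_2DOF = 5.991  # 95 % quantile of the chi-square distribution, 2 degrees of freedom


def semi_major_95_m(cxx: float, cxy: float, cyy: float) -> float:
    """Semi-major axis (m) of the 95 % ellipse for a 2x2 covariance given in m^2."""
    mean = 0.5 * (cxx + cyy)
    half_diff = 0.5 * (cxx - cyy)
    lam_max = mean + math.sqrt(half_diff * half_diff + cxy * cxy)  # largest eigenvalue
    return math.sqrt(CHI2_95_2DOF * lam_max)


def fix_tier(semi_major_m: float, blackout_s: float) -> str:
    """Classify the emit tier from the covariance size and the blackout duration."""
    if semi_major_m > 500.0 or blackout_s > 30.0:
        return "no_fix"        # horiz_accuracy=999.0 + VISUAL_BLACKOUT_FAILSAFE
    if semi_major_m > 100.0:
        return "degraded_2d"   # fix_quality "2D fix or worse"
    return "normal"            # honest horiz_accuracy
```

The no-fix check comes first so that a blackout longer than 30 s escalates even while the covariance is still small.
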
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | Camera / C1 | C5 | degraded `VioOutput` or no output | DTO |
| 2 | FC / C8 inbound | C5 | GPS-health / spoof event | event DTO |
| 3 | C5 | C8 | `PoseEstimate` with `provenance=dead_reckoned` and growing covariance | DTO |
| 4 | C8 | FC | per-FC degraded `GPS_INPUT` / `MSP2_SENSOR_GPS` | MAVLink / MSP2 |
| 5 | C8 | GCS | `VISUAL_BLACKOUT_IMU_ONLY` STATUSTEXT (1–2 Hz); escalates to `VISUAL_BLACKOUT_FAILSAFE` at thresholds | MAVLink STATUSTEXT |
| 6 | C5 / C8 | FDR | degraded-mode entry + per-frame estimate + thresholds crossed | FDR record |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| 30-s budget exhausted with no anchor | C5 | timer | Escalate to no-fix; FC then handles AC-5.2 IMU-only fallback |
| Spoofed GPS attempts to re-enter | C5 | re-entry gate (10-s health + visual-consistency) | Reject; FDR logs the rejection; STATUSTEXT to GCS |
| Camera comes back but FC still spoofed | F3 / F7 | per-frame check | Resume `satellite_anchored` provenance via F6 re-localization; trigger F7 spoofing-promotion |
| FDR write back-pressure during degraded mode | C13 | queue overflow | Logged rollover (NFT-6); never silent |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Mode-transition latency | ≤ 1 frame OR ≤ 400 ms (AC-3.5) | NFT-RES-04 / FT-N-04 |
| Threshold escalation cadence | per-frame | AC-NEW-8 |
| GCS STATUSTEXT cadence | 1–2 Hz | AC-6.1 + AC-NEW-8 |
| Recovery (visual anchor) | ≤ 1–2 frames after first valid match | F6 sharp-turn / disconnected-segment re-localization |
| Recovery (GPS re-promotion) | never sooner than 10 s, and only after the visual-consistency check | AC-NEW-8 |
---
## Flow F6: Sharp-turn / disconnected-segment re-localization
### Description
Frame-to-frame registration may fail under sharp turns (<5 % overlap, AC-3.2), disconnected segments (AC-3.3), or after a brief visual blackout. F6 restores `satellite_anchored` provenance via the C2 → C2.5 → C3 → C3.5 → C4 → C5 path, re-anchoring the estimate. If failure persists for ≥ 3 consecutive frames AND ≥ 2 s, the system requests an operator re-loc hint via GCS (AC-3.4) while continuing dead-reckoned propagation.
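
The escalation condition (at least 3 consecutive failed frames AND at least 2 s of outage) can be sketched as a small tracker. This is an illustration only: `OutageTracker` and its fields are hypothetical, and timestamps are assumed to be monotonic seconds supplied by the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutageTracker:
    min_frames: int = 3           # consecutive failed frames required
    min_duration_s: float = 2.0   # outage duration required
    failed_frames: int = 0
    first_fail_t: Optional[float] = None

    def update(self, t_s: float, relocalized: bool) -> bool:
        """Record one frame; return True when an operator hint should be requested."""
        if relocalized:
            # A successful re-localization ends the outage and resets both counters.
            self.failed_frames = 0
            self.first_fail_t = None
            return False
        if self.first_fail_t is None:
            self.first_fail_t = t_s
        self.failed_frames += 1
        return (self.failed_frames >= self.min_frames
                and t_s - self.first_fail_t >= self.min_duration_s)
```

Both conditions must hold, so three quick failures inside one second do not trigger the operator prompt on their own.
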
### Preconditions
- F3 was running normally; frame-to-frame registration just failed.
- Visual is **not** in full blackout (else go to F5).
- FC GPS may or may not be present (the re-loc path doesn't depend on FC GPS).
### Sequence Diagram
```mermaid
sequenceDiagram
participant C1 as C1 VioStrategy
participant C2 as C2 VprStrategy
participant C2_5 as C2.5 ReRanker
participant C3 as C3 CrossDomainMatcher
participant C5 as C5 StateEstimator
participant C8 as C8 FcAdapter
participant Gcs as [[QGroundControl]]
participant Operator as Operator
participant Fdr as C13 FdrWriter
C1->>C5: VioOutput marks frame-to-frame fail (or low feature quality)
C5->>C2: trigger satellite re-localization for current frame
C2->>C2_5: top-K=10 (full pipeline retried)
C2_5->>C3: top-N=3
alt re-localization succeeds within 1–2 frames
C3->>C5: MatchResult with sufficient inliers
C5->>C5: restore satellite_anchored provenance
C5->>C8: PoseEstimate (label = satellite_anchored)
C8->>Fdr: recovery event
else re-localization fails ≥ 3 consecutive frames AND ≥ 2 s
C5->>C8: PoseEstimate (label = visual_propagated → dead_reckoned)
C8->>Gcs: STATUSTEXT requesting operator re-loc hint (AC-3.4)
Gcs->>Operator: prompt
Operator-->>Gcs: re-loc hint (region / pose seed)
Gcs-->>C8: NAMED_VALUE_FLOAT or custom-dialect re-loc hint
C8->>C5: re-loc hint
C5->>C2: prior-anchored retry (limit search by hint region)
end
C5->>Fdr: per-frame estimate + outage event chain
```
### Flowchart
```mermaid
flowchart TD
Start([Frame-to-frame registration fails]) --> Trigger[C5 triggers C2 satellite re-localization]
Trigger --> RetryCount{≥ 1–2 frames since trigger?}
RetryCount -->|yes| RetryFull[Full C2 → C2.5 → C3 → C3.5 → C4 retry]
RetryFull --> Inliers{Sufficient inliers from C3?}
Inliers -->|yes| Restore[Restore satellite_anchored; resume F3]
Inliers -->|no| Counter[Increment outage counter]
Counter --> ThreeFrameTwoSecond{≥ 3 frames AND ≥ 2 s?}
ThreeFrameTwoSecond -->|no| RetryFull
ThreeFrameTwoSecond -->|yes| Operator[STATUSTEXT to GCS requesting operator re-loc hint AC-3.4]
Operator --> WaitHint{Hint received within bound?}
WaitHint -->|yes| BoundedRetry[C2 retry with hint region prior]
WaitHint -->|no| Degraded[Continue dead-reckoned propagation; F5 thresholds apply]
BoundedRetry --> Inliers
Restore --> Done([Re-anchored; F3 resumes])
Degraded --> Done
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | C1 / C5 | C2 | re-localization trigger + last trusted pose prior | DTO |
| 2 | C2 → C2.5 → C3 → C3.5 → C4 | C5 | `MatchResult` + `PoseEstimate` (or fail) | DTO |
| 3 | C8 | GCS | re-loc-hint-request STATUSTEXT (after AC-3.4 thresholds) | MAVLink STATUSTEXT |
| 4 | GCS / Operator | C8 | re-loc hint (`NAMED_VALUE_FLOAT` or custom-dialect) | MAVLink |
| 5 | C8 | C5 → C2 | hint region prior | DTO |
| 6 | C5 / C8 | FDR | per-frame estimate + outage event chain | FDR record |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Re-localization fails after operator hint | C2/C3 | per-frame inlier count | Continue dead-reckoned; F5 thresholds escalate `horiz_accuracy` |
| Operator hint never arrives | GCS link | bounded wait | Continue dead-reckoned; FDR logs no-hint case |
| GCS link fully down | C8 | link-health monitor | Continue dead-reckoned; FDR logs unreachable-GCS case |
| Hint region invalidates the cache | C6 | cache miss | Fall back to global re-localization (full C2 candidate set) |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Re-anchor on sharp turn | within 1–2 frames after first valid match (AC-3.2) | FT-P-07 + IT-4 |
| Disconnected-segment recovery | ≥ 3 disconnected segments per flight (AC-3.3) | core capability, not degraded mode |
| Operator-hint round-trip | best-effort (GCS bandwidth-limited) | AC-3.4 + AC-6.2 |
---
## Flow F7: Spoofing-promotion via EKF source-set switch
### Description
When the FC reports GPS denial/spoof while the companion estimate is healthy, the companion publishes its estimate to the FC's EKF source-set 2 and issues `MAV_CMD_SET_EKF_SOURCE_SET` to make set 2 primary (D-C8-2 = (b)). When the companion is unavailable, the FC switches back to set 1 (real GPS). On iNav, the companion is the sole GPS source and there is no source-set switching — the equivalent is just keeping `MSP2_SENSOR_GPS` flowing.
This flow is a **hot path**: AC-NEW-2 ≤ 3 s p95 from spoof onset to companion estimate becoming primary. Status: D-C8-2 = (b) is `Selected with runtime gate` — IT-3 SITL validation is the lock gate (Mode B Fact #111).
**Reverse path — mid-flight FC GPS re-promotion (ADR-010, Principle #11 amended)**: when the FC's GPS subsequently recovers in flight, the companion does **not** auto-yield. The FC GPS is fused back into C5 via `add_pose_anchor` **only after** the three-part gate fires: (a) FC GPS health stable + non-spoofed for ≥ 10 s, (b) a visual/satellite consistency check has succeeded on the next anchor frame, **AND** (c) the FC's reported position is within ≤ 200 m of the companion's last emitted `PoseEstimate`. The third clause is the bounded-delta gate — it catches "FC reports stable GPS but the value is wrong". When the gate passes, the FC GPS becomes one more anchor source, not an override. The source-set switch back to set 1 happens through the existing AC-NEW-8 path.
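
The three-part re-promotion gate can be sketched as a pure predicate. The sketch assumes that a great-circle (haversine) distance on a spherical Earth is an adequate measure for the 200 m bounded-delta clause; `haversine_m` and `gps_repromotion_allowed` are hypothetical names, and the call into `add_pose_anchor` is deliberately not shown.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def gps_repromotion_allowed(healthy_for_s: float, visual_consistent: bool,
                            fc_latlon: Tuple[float, float],
                            companion_latlon: Tuple[float, float],
                            min_healthy_s: float = 10.0,
                            max_delta_m: float = 200.0) -> bool:
    """All three clauses must hold: (a) health duration, (b) consistency, (c) bounded delta."""
    if healthy_for_s < min_healthy_s:
        return False
    if not visual_consistent:
        return False
    return haversine_m(*fc_latlon, *companion_latlon) <= max_delta_m
```

A stable but wrong FC position fails clause (c) even after the 10 s health window has elapsed, which is the case the bounded-delta gate exists to catch.
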
### Preconditions
- ArduPilot Plane FC (D-C8-2 only applies to AP path).
- FC reports `GPS_RAW_INT` health degradation OR a spoof flag.
- Companion estimate is healthy (provenance = `satellite_anchored` OR `visual_propagated` within fresh anchor age).
### Sequence Diagram (ArduPilot Plane path)
```mermaid
sequenceDiagram
participant FC as [[ArduPilot Plane FC]]
participant C8 as C8 FcAdapter (AP)
participant C5 as C5 StateEstimator
participant Gcs as [[QGroundControl]]
participant Fdr as C13 FdrWriter
FC->>C8: GPS_RAW_INT health degraded OR spoof flag set
C8->>C5: gps_health_event(denied | spoofed)
alt companion estimate is healthy
C8->>FC: GPS_INPUT (5 Hz periodic) on source-set 2 with signed MAVLink 2.0
C8->>FC: MAV_CMD_SET_EKF_SOURCE_SET to make set 2 primary
FC-->>C8: command ack
C8->>Fdr: source-set switch event + signing key reference
C8->>Gcs: STATUSTEXT "EKF_SOURCE_SET=2"
else companion estimate not healthy
Note over C8,FC: stay on source-set 1; F5 failsafe thresholds apply
end
Note over FC: when companion becomes unavailable, FC auto-switches back to source-set 1
```
### Flowchart
```mermaid
flowchart TD
Start([FC reports GPS denial OR spoof]) --> CompanionHealthy{Companion estimate healthy?}
CompanionHealthy -->|no| F5Path[Stay on source-set 1; F5 failsafe thresholds apply]
CompanionHealthy -->|yes| FcType{FC type?}
FcType -->|ArduPilot Plane| PublishSet2[C8 publishes GPS_INPUT to source-set 2 signed MAVLink 2.0]
FcType -->|iNav| ContinueMSP[Continue MSP2_SENSOR_GPS; iNav has no source-set]
PublishSet2 --> SetCmd[Send MAV_CMD_SET_EKF_SOURCE_SET set=2]
SetCmd --> Ack{Ack within latency budget?}
Ack -->|yes| LogSwitch[FDR + STATUSTEXT EKF_SOURCE_SET=2]
Ack -->|no, IT-3 fail| FallbackPath[D-C8-2-FALLBACK options a or b or c]
ContinueMSP --> Done([Companion is sole GPS source on iNav by construction])
LogSwitch --> Done
FallbackPath --> Done
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | FC | C8 | GPS health / spoof event | MAVLink `GPS_RAW_INT` + flags |
| 2 | C8 | C5 | gps_health_event | DTO |
| 3 | C8 | FC | `GPS_INPUT` on source-set 2 + `MAV_CMD_SET_EKF_SOURCE_SET` | MAVLink 2.0 signed |
| 4 | FC | C8 | command ack | MAVLink |
| 5 | C8 | FDR + GCS | switch event + STATUSTEXT | FDR record + MAVLink STATUSTEXT |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| FC does not ack `MAV_CMD_SET_EKF_SOURCE_SET` | Step 4 | timeout | Retry once; if persistent, D-C8-2-FALLBACK; FDR logs |
| Real GPS recovers during a suspected spoof (false alarm) | Step 1 | FC GPS health restored AND spoof flag cleared | Source-set switch back to 1 (FC-driven); companion stays on standby |
| Spoofed real-GPS attempts re-promotion | C5 / C8 | 10-s + visual-consistency gate | Reject; AC-NEW-8 |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Spoof onset → primary switch (AP) | p95 < 3 s (AC-NEW-2) | NFT-PERF-04; IT-3 SITL is the runtime gate for D-C8-2 = (b) |
| iNav companion-as-sole-GPS lateral check | continuous; no switch needed | iNav has no source-set arbitration |
---
## Flow F8: Companion reboot recovery
### Description
The companion process restarts mid-flight (crash, watchdog reset, voltage glitch). The FC remains armed and continues IMU-only dead reckoning during the gap (~500 m drift max at 60 km/h cruise per AC-NEW-1; 60 km/h ≈ 16.7 m/s, so the 30 s budget corresponds to roughly 500 m travelled). On restart, the companion re-runs F2 (Takeoff load), seeded with the FC's current IMU-extrapolated pose. Cold-start TTFF ≤ 30 s p95 (AC-NEW-1) is the same budget as a clean takeoff.
### Preconditions
- FC remained armed during the gap; FC IMU is still reporting.
- The companion's NVM cache survived the reboot (warm cache; D-C10-3 SHA-256 content-hash gate verifies integrity).
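
The content-hash gate can be sketched with the standard library. This is a minimal illustration, assuming the expected digests are available as a mapping from relative path to hex digest; the manifest and sidecar formats used by C10 are not shown, and `sha256_of` / `verify_cache` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large cache artifacts are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_cache(expected: Dict[str, str], root: Path) -> List[str]:
    """Return the relative paths that are missing or mismatched; an empty list means pass."""
    failures: List[str] = []
    for rel_path, expected_hex in expected.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected_hex:
            failures.append(rel_path)
    return failures
```

Any non-empty result would map to the refusal branch of this flow; a missing file is treated the same as a mismatch.
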
### Sequence Diagram
```mermaid
sequenceDiagram
participant Companion as Companion (post-reboot)
participant C10 as C10 ManifestVerifier
participant FaissIndex as C6 DescriptorIndex
participant TrtRuntime as C7 InferenceRuntime
participant C8 as C8 FcAdapter
participant FC as [[Flight Controller]]
participant Pipeline as C1+...+C5 (warm)
participant Fdr as C13 FdrWriter
Note over Companion: Process restart; FC continues IMU-only dead reckoning
Companion->>C10: warm-cache content-hash verify
C10-->>Companion: pass (or refuse to re-arm if tampering detected)
Companion->>FaissIndex: faiss.read_index mmap
Companion->>TrtRuntime: deserializeCudaEngine
Companion->>C8: re-establish MAVLink 2.0 signing handshake (per-flight key still valid)
C8->>FC: signing handshake
FC-->>C8: ack
C8->>FC: query GLOBAL_POSITION_INT + GPS health
FC-->>C8: current IMU-extrapolated pose
Companion->>Pipeline: warm with IMU-extrapolated pose (AC-5.3)
Companion->>Fdr: open continuation record (NOT a new flight; same flight_id) + reboot event
Note over Companion: Pipeline re-enters F3; first valid frame budget = AC-NEW-1 30 s
```
### Flowchart
```mermaid
flowchart TD
Start([Companion process restart while FC armed]) --> Verify[D-C10-3 SHA-256 content-hash gate]
Verify --> Pass{Pass?}
Pass -->|no| Refuse[Refuse to re-arm; STATUSTEXT to GCS; FDR log]
Pass -->|yes| LoadCache[FAISS mmap + TRT deserialize]
LoadCache --> Reconnect[Re-establish MAVLink 2.0 signing handshake]
Reconnect --> WarmStart[Query FC IMU-extrapolated pose AC-5.3]
WarmStart --> WarmPipe[Warm C1+...+C5]
WarmPipe --> ReopenFdr[FDR opens continuation record same flight_id]
ReopenFdr --> ReenterF3([Re-enter F3 within AC-NEW-1 budget])
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | Companion | C10 | manifest path + content-hash sidecars | filesystem read |
| 2 | Companion | FAISS / C7 | mmap pointer + TRT engines | runtime API |
| 3 | C8 | FC | MAVLink 2.0 signing handshake (re-handshake; per-flight key valid) | MAVLink 2.0 |
| 4 | FC | C8 | IMU-extrapolated pose | `GLOBAL_POSITION_INT` |
| 5 | C13 | FDR | reboot continuation record | FDR record |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Cache integrity check fails | Step 1 | SHA-256 mismatch | Refuse to re-arm; STATUSTEXT; companion stays out of source-set 2 |
| MAVLink signing re-handshake fails | Step 3 | handshake timeout | Refuse to re-arm |
| AC-NEW-1 budget exceeded | end-to-end | timer | F5 dead-reckoned mode kicks in once first frame is emitted; FDR logs the over-budget event |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Reboot → first valid frame | p95 < 30 s (AC-NEW-1) | Same budget as cold takeoff |
| FC dead-reckoning drift during gap | ≤ ~500 m at 60 km/h cruise | inherited from AC-NEW-1 rationale |
---
## Flow F9: GCS telemetry stream
### Description
Send a 1–2 Hz downsampled summary of the per-frame estimate to QGroundControl over MAVLink (AC-6.1). High-rate per-frame data stays on the local FDR; the GCS link is bandwidth-limited and best-effort.
### Preconditions
- F2 completed; pipeline is warm.
- GCS link is healthy (link drop is non-fatal; companion continues; FDR retains everything).
### Sequence Diagram
```mermaid
sequenceDiagram
participant C5 as C5 StateEstimator
participant C8 as C8 FcAdapter / Telemetry
participant Gcs as [[QGroundControl]]
participant Fdr as C13 FdrWriter
loop every 500–1000 ms
C5->>C8: latest PoseEstimate + provenance + cov + system health (CPU/GPU/temp/throttle)
C8->>C8: downsample + serialize per AC-6.1
C8->>Gcs: STATUSTEXT (provenance + degraded-mode flags) + NAMED_VALUE_FLOAT (cov ellipse axis) + GPS_RAW_INT (downsampled pos)
C8->>Fdr: telemetry-emit record
end
alt operator command inbound (AC-6.2)
Gcs->>C8: STATUSTEXT or NAMED_VALUE_FLOAT or custom-dialect command
C8->>C5: forward command (e.g., re-loc hint, sector reclassification)
end
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | C5 | C8 | latest `PoseEstimate` + system health | DTO |
| 2 | C8 | GCS | downsampled summary | MAVLink STATUSTEXT + NAMED_VALUE_FLOAT + GPS_RAW_INT |
| 3 | GCS | C8 | operator command | MAVLink |
| 4 | C8 | C5 / C12 | parsed command | DTO |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| GCS link drop | Step 2 | link health monitor | Continue; FDR retains everything; reconnect when link returns |
| Operator command malformed | Step 3 | parser error | Reject; STATUSTEXT explanation; FDR logs |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Telemetry rate | 1–2 Hz (AC-6.1) | NFT-PERF + FT-P-12 |
| Per-frame data on FDR | full rate | AC-NEW-3 |
---
## Flow F10: Post-landing tile upload
### Description
After the UAV has landed, the operator triggers C12's `PostLandingUploadOrchestrator` (`operator-orchestrator upload-pending --flight-id ...`). The orchestrator reads the `flight_footer` FDR record for the given `flight_id` via `FdrFooterReader`; if the footer is present AND `clean_shutdown == True`, it invokes C11's `TileUploader` (via the `TileUploaderCut` Protocol) which reads locally-saved mid-flight tiles from C6 and uploads them to `satellite-provider`'s ingest endpoint per the D-PROJ-2 contract sketch. The C11 Tile Manager is a separate operator-side process / image — **not present in the airborne companion image**, ADR-004. Each tile carries quality metadata sufficient for the parent-suite voting layer to decide promotion `pending → trusted` (D-PROJ-2 design task #2; not yet implemented service-side). Until the real endpoint ships, integration tests target the e2e-test `mock-suite-sat-service` fixture under `tests/fixtures/`; production never reaches the fixture.
### Preconditions
- C13 has written the `flight_footer` FDR record for the requested `flight_id` with `clean_shutdown == True`. This is the single safety invariant; the operator does not query FC `MAV_STATE` directly (Batch 44 SRP refactor — the footer is the authoritative "vehicle stopped cleanly" signal).
- Operator workstation has network reach to `satellite-provider` (in tests, the e2e `mock-suite-sat-service` fixture stands in for the not-yet-shipped POST endpoint).
- Local C6 tile store has mid-flight tiles with `voting_status=pending` and quality metadata.
- Per-flight onboard signing key (generated at takeoff load, baked into tile metadata) is available to C11 `TileUploader` for payload signing.
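The footer gate in the first precondition can be sketched as a single function that either returns the footer or raises with one of the four sub-reasons. This is an illustration under assumptions: the `FlightFooter` shape, the `read_footer` callable, and the mapping of `KeyError` / `OSError` to sub-reasons are hypothetical and are not taken from `FdrFooterReader`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class FlightStateNotConfirmedError(Exception):
    """Raised when the flight_footer does not confirm a clean shutdown."""

    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


@dataclass(frozen=True)
class FlightFooter:
    # Hypothetical minimal footer shape for this sketch.
    flight_id: str
    clean_shutdown: bool


def confirm_clean_shutdown(
    read_footer: Callable[[str], Optional[FlightFooter]],
    flight_id: str,
) -> FlightFooter:
    """Return the footer only if it exists and reports a clean shutdown."""
    try:
        footer = read_footer(flight_id)
    except KeyError:
        # Assumption: the reader signals an unknown flight with KeyError.
        raise FlightStateNotConfirmedError("flight_id_not_found") from None
    except OSError as exc:
        # Assumption: I/O failures surface as OSError.
        raise FlightStateNotConfirmedError(f"fdr_unreadable: {exc!r}") from exc
    if footer is None:
        raise FlightStateNotConfirmedError("footer_missing")
    if not footer.clean_shutdown:
        raise FlightStateNotConfirmedError("unclean_shutdown")
    return footer


# In-memory stand-in for the FDR: None means "flight known, footer not written".
fake_fdr = {
    "flight-ok": FlightFooter("flight-ok", True),
    "flight-crash": FlightFooter("flight-crash", False),
    "flight-nofooter": None,
}


def reason_for(flight_id: str) -> str:
    """Helper for the example: return 'ok' or the refusal sub-reason."""
    try:
        confirm_clean_shutdown(fake_fdr.__getitem__, flight_id)
        return "ok"
    except FlightStateNotConfirmedError as err:
        return err.reason
```

Keeping the check in one place means the upload path has no other way to start, which matches the "single safety invariant" wording above.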
### Sequence Diagram
```mermaid
sequenceDiagram
participant Operator
participant C11 as C11 TileUploader (workstation)
participant C6 as C6 TileStore (companion or workstation mirror)
participant SatelliteProvider as [[satellite-provider]] (D-PROJ-2 endpoint, planned)
participant Fdr as C13 FdrWriter
Operator->>C11: upload_pending_tiles(flight_id)
C11->>C6: read mid-flight tiles where voting_status=pending AND flight_id=...
C6-->>C11: batched Tile + TileQualityMetadata
loop per batch
C11->>SatelliteProvider: POST /api/satellite/tiles/ingest (multipart) signed with per-flight key
SatelliteProvider-->>C11: 202 Accepted with batch UUID + per-tile status (queued | rejected | duplicate | superseded)
C11->>C6: update voting_status=uploaded for accepted tiles
C11->>Fdr: upload-batch event + service response
end
C11-->>Operator: UploadBatchReport (counts, rejections, duplicates)
Note over SatelliteProvider: voting layer (D-PROJ-2 design task #2) eventually promotes pending → trusted; out of scope for this flow
```
### Flowchart
```mermaid
flowchart TD
Start([Operator triggers C12 PostLandingUploadOrchestrator with flight_id]) --> FooterRead[C12 FdrFooterReader reads flight_footer FDR record]
FooterRead --> StateCheck{footer present AND clean_shutdown == True?}
StateCheck -->|no — footer_missing / unclean_shutdown / flight_id_not_found / fdr_unreadable| Refuse[Raise FlightStateNotConfirmedError with sub-reason + operator-actionable remediation]
StateCheck -->|yes| ReadTiles[C11 TileUploader reads mid-flight tiles voting_status=pending]
ReadTiles --> Empty{Any tiles to upload?}
Empty -->|no| Done([No-op; report])
Empty -->|yes| Batch[Batch by configurable size]
Batch --> Sign[Sign payload with per-flight onboard signing key]
Sign --> Post[POST /api/satellite/tiles/ingest with multipart batch]
Post --> Endpoint{Endpoint responds?}
Endpoint -->|2xx| Update[Update voting_status=uploaded for accepted tiles]
Endpoint -->|429 rate limit| Backoff[Back off and retry]
Endpoint -->|5xx OR network error| Retry[Retry with bounded retries]
Endpoint -->|endpoint not yet implemented| Queue[Keep batches queued locally; never block]
Update --> More{More batches?}
More -->|yes| Batch
More -->|no| Report[Report to operator with counts and rejections]
Backoff --> Post
Retry --> Post
Queue --> Report
Report --> Done
```
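The `Endpoint responds?` branch can be sketched as a pure classification function. One detail worth making explicit: 501 is a 5xx code and "connection refused" is a network error, yet both are treated in this flow as "endpoint not yet implemented", so they must be checked before the generic retry branch. The `Action` names, the `FAIL` fallback for other status codes, and the exponential backoff schedule are assumptions made for this sketch; the flowchart only says "back off" and "bounded retries".

```python
from enum import Enum
from typing import List, Optional


class Action(Enum):
    UPDATE = "update_voting_status"
    BACKOFF = "back_off_and_retry"
    RETRY = "bounded_retry"
    QUEUE = "keep_queued_locally"
    FAIL = "surface_to_operator"   # hypothetical: codes the flowchart does not cover


def classify(status: Optional[int], connection_refused: bool = False) -> Action:
    """Map one POST outcome to the next step; status=None means no HTTP response."""
    # "Not yet implemented" signals take priority over the generic branches.
    if connection_refused or status in (404, 501):
        return Action.QUEUE
    if status is None:
        return Action.RETRY            # other network error
    if 200 <= status < 300:
        return Action.UPDATE
    if status == 429:
        return Action.BACKOFF
    if 500 <= status < 600:
        return Action.RETRY
    return Action.FAIL


def backoff_delays(base_s: float = 1.0, cap_s: float = 30.0,
                   max_retries: int = 5) -> List[float]:
    """Capped exponential delays; the schedule itself is an assumption."""
    return [min(cap_s, base_s * 2 ** attempt) for attempt in range(max_retries)]
```

A caller would loop over `backoff_delays()` for `BACKOFF` and `RETRY`, and fall through to `QUEUE` once the retries are exhausted so the upload never blocks.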
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | Operator | C11 | (`flight_id`) | CLI / GUI |
| 2 | C11 | C6 | SELECT tiles WHERE `voting_status=pending` AND `flight_id=...` | SQL + filesystem reads |
| 3 | C11 | `satellite-provider` | multipart batch (tile JPEG + metadata + signature) | per D-PROJ-2 contract sketch |
| 4 | `satellite-provider` | C11 | 202 Accepted with batch UUID + per-tile statuses | JSON |
| 5 | C11 | C6 | UPDATE voting_status | SQL UPDATE |
| 6 | C11 | FDR | upload-batch event + service response | FDR record |
### Error scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| `flight_footer` missing OR `clean_shutdown == False` OR `flight_id` not found OR FDR unreadable | Step 1 (C12 `FdrFooterReader`) | Raises `FlightStateNotConfirmedError(reason=...)` with one of 4 sub-reasons (`footer_missing`, `unclean_shutdown`, `flight_id_not_found`, `fdr_unreadable: <repr>`) | Refuse upload; CLI exits with a distinct exit code per sub-reason; operator must fix the FDR or pick the right `flight_id` before retrying (architectural invariant — no auto-retry) |
| `satellite-provider` ingest endpoint not yet implemented (D-PROJ-2 open) | Step 3 | 404 / 501 / connection refused | Keep batches queued locally; report to operator; retry on next operator trigger |
| Network rate-limit (429) | Step 3 | HTTP 429 | Back off + retry |
| Per-tile rejected by service | Step 4 | per-tile status `rejected` | Mark `voting_status=rejected_by_service`; FDR logs reason; do not retry that tile |
| Per-tile duplicate / superseded | Step 4 | per-tile status `duplicate` / `superseded` | Mark accordingly; not an error |
| Signature verification fails service-side | Step 3 | service rejects all tiles in batch | Investigate per-flight signing key; FDR logs; do NOT downgrade or remove signing |
| Operator workstation runs out of disk space mid-upload | Step 5 | filesystem check | Pause; surface to operator; never silently drop tiles |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| End-to-end upload time | not time-critical | post-landing; bursty |
| Batch size | configurable; default sized to workstation bandwidth | tunable per deployment |
| Idempotence | service-side dedup is the dedup mechanism (per `(zoomLevel, lat, lon, capture_timestamp, companion_id, flight_id)`) | onboard-side does not need to track delivery transactions |
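The idempotence row can be illustrated with a toy model of service-side dedup: re-sending a batch yields `duplicate` statuses instead of double ingestion, which is why the uploader does not need delivery bookkeeping. `TileKey` mirrors the key tuple from the table; `FakeIngestService` is a hypothetical in-memory stand-in, not the `satellite-provider` implementation.

```python
from typing import List, NamedTuple, Set


class TileKey(NamedTuple):
    """Dedup key fields as listed in the table above."""
    zoom_level: int
    lat: float
    lon: float
    capture_timestamp: str
    companion_id: str
    flight_id: str


class FakeIngestService:
    """Hypothetical in-memory model of service-side deduplication."""

    def __init__(self) -> None:
        self._seen: Set[TileKey] = set()

    def ingest(self, batch: List[TileKey]) -> List[str]:
        statuses = []
        for key in batch:
            if key in self._seen:
                statuses.append("duplicate")   # already ingested; not an error
            else:
                self._seen.add(key)
                statuses.append("queued")
        return statuses


service = FakeIngestService()
batch = [
    TileKey(18, 50.0001, 36.0002, "2026-05-09T10:00:00Z", "companion-1", "flight-a"),
    TileKey(18, 50.0003, 36.0004, "2026-05-09T10:00:05Z", "companion-1", "flight-a"),
]
first_attempt = service.ingest(batch)
second_attempt = service.ingest(batch)   # e.g. retry after a lost response
```

Because the key includes `flight_id` and `capture_timestamp`, the same ground tile captured on a different flight or at a different time is a distinct entry rather than a duplicate.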
---
## Flow F11: Demo replay validation (operator)
### Description
Post-flight **product demo** and **validation** flow. The operator uploads a nav-camera video and ArduPilot `.tlog` through the suite UI (AZ-897), visually aligns the two recordings on dual timeline bars, and runs the same airborne GPS-denied pipeline used in live flight — against a corridor cache seeded from the tlog GPS trace. Output: per-tick estimated positions (JSONL), accuracy map, and PASS/FAIL verdict against tlog ground truth (AZ-696 AC-3).
This is **not** a test-harness shortcut. E2E tests (AZ-840) call the same `replay_api` orchestration (AZ-973) and `operator_replay.cache_seed` (AZ-974) as the UI.
**Phases** (sequenced by `replay_api` demo job or manual CLI equivalents):
1. **Preview** (AZ-970) — parse tlog IMU2 activity + video metadata for UI timelines.
2. **Align** (AZ-897 + AZ-971) — operator coarse offset; backend refine via optical-flow + IMU cross-correlation.
3. **Export** (AZ-972) — write AZ-896 canonical CSV with `Time=0` at aligned video frame 0 (single canonical clock for replay).
4. **Seed cache** (AZ-974) — `extract_route_from_tlog` → `SatelliteProviderRouteClient.seed_route` → tile download → FAISS build (F1 route-driven variant).
5. **Replay** — `gps-denied-replay --video … --imu aligned.csv` with `config.mode=replay`; C1–C5 identical to live.
6. **Verdict** — horizontal-error distribution + map artifact returned to UI.
Advanced bypass: operator may upload a pre-aligned `(video, CSV)` per AZ-959 without steps 1–3.
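The Verdict phase compares estimated positions with tlog ground truth. A minimal sketch of the horizontal-error computation follows, assuming estimates and ground truth are already paired per tick as `(lat, lon)` in degrees. The haversine formula is standard; the function names, the nearest-rank percentile, and the `threshold_m` pass criterion are hypothetical and do not restate AZ-696 AC-3.

```python
import math
import statistics
from typing import Dict, List, Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

LatLon = Tuple[float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS-84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def horizontal_errors(estimates: Sequence[LatLon],
                      truth: Sequence[LatLon]) -> List[float]:
    """Per-tick horizontal error; inputs are assumed to be time-paired."""
    return [haversine_m(e[0], e[1], t[0], t[1]) for e, t in zip(estimates, truth)]


def nearest_rank(sorted_vals: Sequence[float], q: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    k = max(1, math.ceil(q * len(sorted_vals)))
    return sorted_vals[k - 1]


def summarize(errors_m: Sequence[float], threshold_m: float) -> Dict[str, object]:
    """Error distribution plus a verdict against a hypothetical p95 threshold."""
    ordered = sorted(errors_m)
    p95 = nearest_rank(ordered, 0.95)
    return {
        "median_m": statistics.median(ordered),
        "p95_m": p95,
        "max_m": ordered[-1],
        "verdict": "PASS" if p95 <= threshold_m else "FAIL",
    }


# Two ticks: one estimate 0.001 degrees north of truth, one exact.
errs = horizontal_errors([(50.001, 36.0), (50.0, 36.0)],
                         [(50.0, 36.0), (50.0, 36.0)])
report = summarize(errs, threshold_m=200.0)
```

A latitude offset of 0.001° is roughly 111 m, which gives a quick sanity check on the distance function before it is applied to a full replay.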
### Preconditions
- Operator workstation runs `replay_api` (docker-compose or native) with network to `satellite-provider`.
- Camera calibration JSON for the flight's nav camera.
- Tlog contains `SCALED_IMU2` (or `RAW_IMU`) and `GLOBAL_POSITION_INT` / `GPS_RAW_INT`.
- Video covers the active flight segment after alignment.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Operator
participant UI as [[suite/ui]] AZ-897
participant API as replay_api AZ-973
participant Align as replay_input alignment AZ-971
participant Export as tlog_to_csv AZ-972
participant Seed as operator_replay cache_seed AZ-974
participant Sat as [[satellite-provider]]
participant Replay as gps-denied-replay
participant Pipeline as C1..C5 replay mode
Operator->>UI: upload video + tlog + calibration
UI->>API: POST /replay/preview
API-->>UI: video metadata + IMU2 activity timeline
Operator->>UI: drag video bar / refine
UI->>API: POST /replay/align/refine
API->>Align: refine_video_offset
Align-->>UI: refined_offset_ms + confidence
Operator->>UI: Run demo
UI->>API: POST /replay/demo
API->>Export: export_aligned_csv
API->>Seed: extract_route + seed_route + FAISS
Seed->>Sat: POST /api/satellite/route
Sat-->>Seed: mapsReady
API->>Replay: subprocess --video --imu
Replay->>Pipeline: per-frame loop
Pipeline-->>API: results.jsonl
API-->>UI: map URL + verdict report
```
### Data flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | UI | replay_api | video + tlog multipart | HTTP |
| 2 | replay_api | UI | timeline preview JSON | JSON |
| 3 | UI | replay_api | `video_offset_ms` | JSON |
| 4 | replay_api | disk | aligned `data_imu.csv` | AZ-896 CSV |
| 5 | replay_api | satellite-provider | `RouteSpec` waypoints | JSON POST |
| 6 | replay_api | airborne binary | video + CSV + cache config | subprocess |
| 7 | replay_api | UI | JSONL path, map URL, verdict md | JSON job result |
### Error scenarios
| Error | Detection | Recovery |
|-------|-----------|----------|
| Missing IMU in tlog | preview 422 | Operator message; cannot align |
| Refine hard-fail (< 95 % frame match) | align/refine response | Operator adjusts bar or aborts |
| Route seed terminal failure | `RouteTerminalFailureError` | Job failed; operator retries |
| ESKF divergence (no cache) | replay exit ≠ 0 | Ensure step 4 completed; check AZ-963 |
### Performance expectations
| Metric | Target | Notes |
|--------|--------|-------|
| Preview latency | p95 < 5 s | tlog parse + video probe |
| Full demo (Derkachi) | ≤ 15 min cold | matches AZ-835 AC-7 |
| Warm cache reuse | ≤ 30 s seed skip | named volume / cache_root reuse |
---
## Cross-cutting: FDR write side-effect
Every flow above produces FDR records (per AC-NEW-3). The cross-cutting rules are:
- **Every payload class must be present** for the duration of the flight (per-frame estimates with covariance + source-label, FC IMU traces full-rate, all emitted external-position MAVLink frames, raw MAVLink stream `tlog`, system health, mid-flight tiles, ≤ 0.1 Hz failed-tile thumbnails).
- **No raw nav/AI-cam frames** (AC-8.5).
- **64 GB cap per flight**; oldest segment dropped first on rollover; **rollover is logged**, never silent (NFT-6).
- **Smoothed past-frame entries** are mandatory per Mode B Fact #107 so post-mission analysis can verify AC-4.5 internal-smoothing scope.
- **Reboot continuation** (F8) opens a continuation record under the same `flight_id`, never a new flight.
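The cap-and-rollover rule can be sketched as a bounded queue of segments where every eviction leaves a log entry. This is an illustration only: `SegmentedLog`, the in-memory event list, and the choice never to evict the segment just written are assumptions, not the C13 design.

```python
from collections import deque
from typing import Deque, List, Tuple


class SegmentedLog:
    """Size-capped segment list: oldest dropped first, every drop recorded."""

    def __init__(self, cap_bytes: int) -> None:
        self.cap_bytes = cap_bytes
        self.segments: Deque[Tuple[int, int]] = deque()  # (segment_id, size_bytes)
        self.total_bytes = 0
        self.rollover_events: List[str] = []
        self._next_id = 0

    def append_segment(self, size_bytes: int) -> None:
        self.segments.append((self._next_id, size_bytes))
        self._next_id += 1
        self.total_bytes += size_bytes
        # Evict from the oldest end until under the cap; keep the newest segment.
        while self.total_bytes > self.cap_bytes and len(self.segments) > 1:
            old_id, old_size = self.segments.popleft()
            self.total_bytes -= old_size
            # Rollover is never silent: each eviction produces a record.
            self.rollover_events.append(
                f"rollover: dropped segment {old_id} ({old_size} bytes)"
            )


# Small numbers for illustration; the real cap is 64 GB per flight.
fdr = SegmentedLog(cap_bytes=10)
for size in (4, 4, 4):
    fdr.append_segment(size)
```

After the third segment the total would be 12 bytes, so segment 0 is evicted and one rollover event is recorded.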
The FDR is the post-mission single source of truth; everything emitted to FC + GCS is also FDR-logged so AC-NEW-4 / AC-NEW-7 / IT-10 / IT-11 / NFT-* analyses can be replayed offline.