mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 10:31:13 +00:00
Update autodev state, architecture documentation, and glossary terms
Transitioned the autodev state to phase 21, reflecting the completion of Step 5 and the drafting of Step 6 epics. Revised the architecture documentation to clarify the roles of the Tile Manager and its components, ensuring accurate representation of the system's operational flow. Updated glossary entries for Flight State and Operator to incorporate recent changes and enhance clarity on component interactions and responsibilities.
@@ -9,16 +9,16 @@

 | # | Flow Name | Trigger | Primary Components | Criticality |
 |---|-----------|---------|--------------------|-------------|
-| F1 | Pre-flight cache provisioning | Operator runs C12 cache-build CLI on workstation | C12 (operator), [[`satellite-provider`]], C10, C6, C7 | High |
+| F1 | Pre-flight cache provisioning | Operator runs C12 cache-build CLI on workstation | C12 (operator), C11 `TileDownloader`, [[`satellite-provider`]], C10, C6, C7 | High |
 | F2 | Takeoff load | Companion boot detected by FC `MAV_STATE` ARMED OR companion process start with armed FC | C10, C7, C8 (signing handshake), C13 | High |
 | F3 | Steady-state per-frame estimation | Nav camera frame received (3 Hz nominal) | C1, C2, C2.5, C3, C3.5, C4, C5, C8 (out), C13 | High |
-| F4 | Mid-flight tile generation + local cache write | Successful satellite-anchored frame with quality metadata above threshold | C5, C6, C13 (no C8/C11 path) | High |
+| F4 | Mid-flight tile generation + local cache write | Successful satellite-anchored frame with quality metadata above threshold | C5, C6, C13 (no C8/C11 path — C11 `TileUploader` is not loaded in the airborne image) | High |
 | F5 | Visual blackout + spoofed-GPS failsafe | Camera unusable AND/OR FC GPS reports denial/spoof | C1, C5, C8, C13 (degraded-mode escalation per AC-NEW-8) | High |
 | F6 | Sharp-turn / disconnected-segment re-localization | Frame-to-frame registration fails for ≥ 1 frame (AC-3.2 / AC-3.3) | C1, C2, C2.5, C3, C3.5, C4, C5, C8, C13; optionally operator (AC-3.4) | High |
 | F7 | Spoofing-promotion via EKF source-set switch | FC reports GPS denial/spoof while companion estimate is healthy | C5, C8, [[ArduPilot Plane FC]] | High |
 | F8 | Companion reboot recovery | Companion process restart while FC remains armed | C8 (FC IMU pose ingest), C5, C10 (warm-cache verify), C13 | Medium |
 | F9 | GCS telemetry stream | Per-frame estimate available + GCS link healthy | C5, C8, [[QGroundControl]] | Medium |
-| F10 | Post-landing tile upload | Operator triggers C11 with `flight_state == ON_GROUND` confirmed | C11 (operator-side), C6 (read), [[`satellite-provider`]] (D-PROJ-2 endpoint, planned) | High |
+| F10 | Post-landing tile upload | Operator triggers C11 `TileUploader` with `flight_state == ON_GROUND` confirmed | C11 `TileUploader` (operator-side), C6 (read), [[`satellite-provider`]] (D-PROJ-2 endpoint, planned) | High |

 ## Flow Dependencies

@@ -43,7 +43,12 @@

 ### Description

-The operator builds (or refreshes) the per-mission cache on the companion before takeoff: downloads tiles from `satellite-provider` for the operational area, generates VPR descriptors, compiles TensorRT engines, applies sector-classified freshness rules, and writes a manifest with a SHA-256 content-hash gate. This flow is offline and not time-critical; it is the only path that reaches `satellite-provider` from the companion side.
+The operator builds (or refreshes) the per-mission cache before takeoff. F1 has **two phases** sequenced by C12 OperatorTool:
+
+- **Phase 1 — Tile download (C11 `TileDownloader`)**: fetch tiles from `satellite-provider` for the operational area; apply sector-classified freshness rules (AC-NEW-6) and resolution gate (RESTRICT-SAT-4); write tile rows + JPEGs into C6.
+- **Phase 2 — Cache artifact build (C10 CacheProvisioner)**: read the populated C6 store; compile/deserialize TRT engines via C7; batch-generate descriptors via the C2 backbone; atomically write the FAISS HNSW index with SHA-256 sidecars; write the Manifest hashing model + calibration + corpus + sector classification.
+
+This flow is offline and not time-critical. **Only Phase 1 reaches `satellite-provider`** — and it runs on the operator workstation, which is the only host that holds the TLS + service-internal API key. The companion never reaches `satellite-provider` directly.

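Reviewer sketch (not part of the commit): the two-phase sequencing described in the added lines could look roughly like the following. The method names `download_tiles_for_area` and `build_cache_artifacts` and the report type names come from the sequence diagram in this diff; the report fields, the `BBox` layout, and the "stop if nothing was written" rule are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, Tuple

# (min_lon, min_lat, max_lon, max_lat); this layout is an assumption.
BBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class DownloadBatchReport:
    written: int   # tiles written into C6 (hypothetical field)
    rejected: int  # tiles rejected by the freshness/resolution gates (hypothetical field)


@dataclass(frozen=True)
class BuildReport:
    ok: bool            # hypothetical field
    failure: str = ""   # hypothetical field


class TileDownloader(Protocol):
    def download_tiles_for_area(
        self, bbox: BBox, zooms: Sequence[int], sector_class: str
    ) -> DownloadBatchReport: ...


class CacheProvisioner(Protocol):
    def build_cache_artifacts(
        self, bbox: BBox, zooms: Sequence[int], sector_class: str, calibration: str
    ) -> BuildReport: ...


def build_cache(
    downloader: TileDownloader,
    provisioner: CacheProvisioner,
    bbox: BBox,
    zooms: Sequence[int],
    sector_class: str,
    calibration: str,
) -> str:
    """Run Phase 1, then Phase 2; return 'PASS' or 'FAIL: <reason>'."""
    # Phase 1: the only step that talks to satellite-provider.
    download = downloader.download_tiles_for_area(bbox, zooms, sector_class)
    if download.written == 0:
        # Assumed policy: do not start Phase 2 against an empty C6 store.
        return "FAIL: no tiles written to C6"
    # Phase 2: works purely from the populated C6 store, no network access.
    build = provisioner.build_cache_artifacts(bbox, zooms, sector_class, calibration)
    return "PASS" if build.ok else f"FAIL: {build.failure}"
```

Keeping the downloader and provisioner behind separate interfaces mirrors the split in the diff: the provisioner has no handle through which it could reach the network.
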
 ### Preconditions

@@ -59,25 +64,28 @@ The operator builds (or refreshes) the per-mission cache on the companion before
 sequenceDiagram
 participant Operator
 participant C12OperatorTool as C12 Operator Tool (workstation)
+participant C11TileDownloader as C11 TileDownloader (workstation)
 participant SatelliteProvider as [[satellite-provider]] (.NET 8)
+participant C6TileStore as C6 TileStore + DescriptorIndex (Postgres + filesystem + FAISS)
 participant C10Provisioner as C10 CacheProvisioner (companion)
 participant C7Inference as C7 InferenceRuntime
 participant C2Backbone as C2 VPR backbone (TensorRT)
-participant C6TileStore as C6 TileStore + DescriptorIndex (Postgres + filesystem + FAISS)

 Operator->>C12OperatorTool: build_cache(area, sector_class, calibration_file)
-C12OperatorTool->>SatelliteProvider: GET /api/satellite/tiles?bbox=&zoom=
-SatelliteProvider-->>C12OperatorTool: Tile blobs + metadata (paged)
-C12OperatorTool->>C12OperatorTool: filter by AC-NEW-6 freshness (sector-class-driven)
-C12OperatorTool->>C10Provisioner: stage(tiles, manifest_inputs, calibration)
-C10Provisioner->>C6TileStore: write tiles to ./tiles/{zoomLevel}/{x}/{y}.jpg + Postgres rows
+C12OperatorTool->>C11TileDownloader: download_tiles_for_area(bbox, zooms, sector_class)
+C11TileDownloader->>SatelliteProvider: GET /api/satellite/tiles?bbox=&zoom=
+SatelliteProvider-->>C11TileDownloader: Tile blobs + metadata (paged)
+C11TileDownloader->>C11TileDownloader: filter by AC-NEW-6 freshness + RESTRICT-SAT-4 resolution
+C11TileDownloader->>C6TileStore: write tiles to ./tiles/{zoomLevel}/{x}/{y}.jpg + Postgres rows (source='googlemaps')
+C11TileDownloader-->>C12OperatorTool: DownloadBatchReport (counts, freshness summary)
+C12OperatorTool->>C10Provisioner: build_cache_artifacts(bbox, zooms, sector_class, calibration)
 C10Provisioner->>C7Inference: load VPR backbone ONNX
 C7Inference-->>C10Provisioner: TRT engine compiled (cached per SM/JP/TRT/precision tuple)
-C10Provisioner->>C2Backbone: per-tile descriptor generation (batched on Jetson)
+C10Provisioner->>C2Backbone: per-tile descriptor generation (batched on Jetson, reads tiles from C6)
 C2Backbone-->>C10Provisioner: descriptor matrix (FP16/INT8 per D-C7-1)
 C10Provisioner->>C6TileStore: faiss.write_index (HNSW) + atomicwrites + SHA-256 content-hash
 C10Provisioner->>C10Provisioner: write Manifest (hash of model + calibration + corpus + sector_class)
-C10Provisioner-->>C12OperatorTool: provisioning report (counts, hashes, freshness summary)
+C10Provisioner-->>C12OperatorTool: BuildReport (counts, hashes)
 C12OperatorTool-->>Operator: PASS / FAIL summary
 ```

@@ -86,15 +94,18 @@ sequenceDiagram
 ```mermaid
 flowchart TD
 Start([Operator invokes C12 build]) --> Classify[Operator classifies sector active_conflict OR stable_rear]
-Classify --> Download[C12 downloads tiles from satellite-provider for bbox + zoom]
+Classify --> InvokeC11[C12 invokes C11 TileDownloader]
+InvokeC11 --> Download[C11 GET /api/satellite/tiles for bbox + zoom]
 Download --> FreshnessFilter{Freshness ok per AC-8.2 + AC-NEW-6?}
 FreshnessFilter -->|stale and stable_rear| RejectOrDowngrade[Reject or downgrade tile]
 FreshnessFilter -->|stale and active_conflict| RejectOrDowngrade
-FreshnessFilter -->|fresh| Stage[C12 stages tiles + calibration on companion]
-RejectOrDowngrade --> Stage
-Stage --> InvokeC10[C12 invokes C10 CacheProvisioner on companion]
-InvokeC10 --> WriteTiles[C10 writes tiles to filesystem + Postgres]
-WriteTiles --> CompileEngines[C10 compiles TRT engines via C7 InferenceRuntime]
+FreshnessFilter -->|fresh| ResolutionGate{Resolution >= 0.5 m/px per RESTRICT-SAT-4?}
+RejectOrDowngrade --> ResolutionGate
+ResolutionGate -->|fail| RejectRes[Reject and report]
+ResolutionGate -->|pass| WriteTiles[C11 writes tiles to filesystem + Postgres]
+WriteTiles --> InvokeC10[C12 invokes C10 build_cache_artifacts]
+RejectRes --> Done
+InvokeC10 --> CompileEngines[C10 compiles or reuses TRT engines via C7 InferenceRuntime]
 CompileEngines --> EngineCacheHit{EngineCacheEntry already valid for SM JP TRT precision tuple?}
 EngineCacheHit -->|yes D-C10-6| ReuseEngine[Reuse cached engine and INT8 calibration cache]
 EngineCacheHit -->|no| BuildEngine[Polygraphy or trtexec or IBuilderConfig hybrid build]
@@ -113,26 +124,28 @@ flowchart TD
 | Step | From | To | Data | Format |
 |------|------|----|------|--------|
 | 1 | Operator | C12 | (`bounding_box`, `zoom_levels`, `sector_class`, `calibration_path`) | CLI args / GUI form |
-| 2 | C12 | `satellite-provider` REST | `GET /api/satellite/tiles?bbox=…&zoom=…` | HTTPS query |
-| 3 | `satellite-provider` | C12 | Paged tile blobs + metadata rows | JPEG + JSON metadata |
-| 4 | C12 | C10 (over USB/Eth) | Staged tile bundle + calibration JSON | Tarball + manifest stub |
-| 5 | C10 | C6 filesystem | Tile JPEG bodies | `./tiles/{zoomLevel}/{x}/{y}.jpg` |
-| 6 | C10 | C6 PostgreSQL | Tile metadata rows | SQL INSERT (mirror of `satellite-provider`'s `tiles` table) |
-| 7 | C10 → C7 | TRT engine cache | TRT engines | `.engine` files keyed by `(SM, JP, TRT, precision)` (D-C10-7) |
-| 8 | C2 backbone | C6 FAISS index | Descriptor matrix | `.index` (FAISS HNSW), atomicwrites, SHA-256 sidecar |
-| 9 | C10 | filesystem | Manifest | YAML or JSON; carries hashes |
+| 2 | C12 | C11 `TileDownloader` | `DownloadRequest` | in-process call |
+| 3 | C11 | `satellite-provider` REST | `GET /api/satellite/tiles?bbox=…&zoom=…` | HTTPS query |
+| 4 | `satellite-provider` | C11 | Paged tile blobs + metadata rows | JPEG + JSON metadata |
+| 5 | C11 | C6 filesystem (over USB/Eth) | Tile JPEG bodies | `./tiles/{zoomLevel}/{x}/{y}.jpg` |
+| 6 | C11 | C6 PostgreSQL | Tile metadata rows (`source='googlemaps'`) | SQL INSERT (mirror of `satellite-provider`'s `tiles` table) |
+| 7 | C12 | C10 `CacheProvisioner` | `BuildRequest` | in-process call (operator-tool side); RPC over USB/Eth to companion runner |
+| 8 | C10 → C7 | TRT engine cache | TRT engines | `.engine` files keyed by `(SM, JP, TRT, precision)` (D-C10-7) |
+| 9 | C2 backbone (driven by C10) | C6 FAISS index | Descriptor matrix | `.index` (FAISS HNSW), atomicwrites, SHA-256 sidecar |
+| 10 | C10 | filesystem | Manifest | YAML or JSON; carries hashes |

 ### Error scenarios

 | Error | Where | Detection | Recovery |
 |-------|-------|-----------|----------|
-| `satellite-provider` unreachable | Step 2 | HTTP timeout / 5xx | C12 fails with explicit error; operator retries when network is available; takeoff blocked |
-| Tile fails freshness | Step 4 | `tile.capture_timestamp` vs `sector_class` threshold | Reject (active_conflict) or downgrade-no-`satellite_anchored`-label (rear), per AC-NEW-6; report to operator |
-| Resolution below 0.5 m/px | Step 4 | Tile metadata GSD check (RESTRICT-SAT-4) | Reject; report; takeoff blocked |
-| Insufficient cache budget | Step 5 | Filesystem free-space check pre-write | Fail fast with explicit budget delta; no partial write |
-| Engine compile failure | Step 7 | Polygraphy / trtexec exit code; no output `.engine` | Surface error to operator; takeoff blocked; **never silently fall back** |
-| Descriptor generation OOM on Jetson | Step 8 | CUDA OOM | Halve batch size and retry once; if still OOM, surface to operator |
-| Atomic-write or SHA-256 mismatch | Step 8 | `atomicwrites` rollback or content-hash sidecar mismatch | Mark cache invalid; rebuild from staged tiles; if persistent, surface to operator |
+| `satellite-provider` unreachable | Step 3 | HTTP timeout / 5xx | C11 `TileDownloader` fails with explicit error; operator retries when network is available; takeoff blocked |
+| Tile fails freshness | Step 4 (C11) | `tile.capture_timestamp` vs `sector_class` threshold | Reject (active_conflict) or downgrade-no-`satellite_anchored`-label (rear), per AC-NEW-6; counts surface in `DownloadBatchReport` |
+| Resolution below 0.5 m/px | Step 4 (C11) | Tile metadata GSD check (RESTRICT-SAT-4) | Reject; report; takeoff blocked |
+| Insufficient cache budget | Step 5 (C11) | Filesystem free-space check pre-write | Fail fast with explicit budget delta; no partial write |
+| C6 missing tiles for requested bbox/zoom | Step 7 (C10) | C10's pre-build scan finds < expected tile count | Surface as `BuildReport.failure` instructing operator to re-run C11 `TileDownloader`; do **not** trigger network fetch from C10 |
+| Engine compile failure | Step 8 | Polygraphy / trtexec exit code; no output `.engine` | Surface error to operator; takeoff blocked; **never silently fall back** |
+| Descriptor generation OOM on Jetson | Step 9 | CUDA OOM | Halve batch size and retry once; if still OOM, surface to operator |
+| Atomic-write or SHA-256 mismatch | Step 9 | `atomicwrites` rollback or content-hash sidecar mismatch | Mark cache invalid; rebuild from staged tiles; if persistent, surface to operator |
 | Tampered cache (post-write, pre-takeoff) | (caught at takeoff in F2, not here) | F2 SHA-256 content-hash gate | F2 refuses takeoff (IT-7) |

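Reviewer sketch (not part of the commit): the "Atomic-write or SHA-256 mismatch" row depends on writing an artifact atomically next to a content-hash sidecar and re-checking it later. Below is a minimal illustration that uses only the Python standard library (temp file plus `os.replace`) instead of the `atomicwrites` package the table names. The `.sha256` sidecar suffix and both function names are assumptions.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def _atomic_write(path: Path, data: bytes) -> None:
    """Write to a temp file in the target directory, fsync, then rename into place."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp, path)  # atomic rename on the same filesystem
    except BaseException:
        # Leave no partial temp file behind if anything went wrong.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def write_with_sidecar(path: Path, data: bytes) -> str:
    """Write `data` and a hex SHA-256 sidecar (assumed name: <file>.sha256)."""
    digest = hashlib.sha256(data).hexdigest()
    _atomic_write(path, data)
    _atomic_write(path.with_name(path.name + ".sha256"), digest.encode("ascii"))
    return digest


def verify_sidecar(path: Path) -> bool:
    """Return True only if the file exists and matches its sidecar digest."""
    sidecar = path.with_name(path.name + ".sha256")
    if not (path.is_file() and sidecar.is_file()):
        return False
    expected = sidecar.read_text(encoding="ascii").strip()
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == expected
```

A caller would treat `verify_sidecar(...) == False` as the "mark cache invalid" condition from the table; the rebuild and escalation steps are outside this sketch.
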
 ### Performance expectations
@@ -390,12 +403,12 @@ GTSAM iSAM2 with `IncrementalFixedLagSmoother` retroactively refines past keyfra

 ### Description

-For every successful satellite-anchored frame whose `TileQualityMetadata` clears the publish threshold, orthorectify the nav frame onto basemap projection, deduplicate against the existing local tile cache, and write the result locally in `satellite-provider`-compatible on-disk format. **No outbound write while airborne** — process-level isolation enforces this: the C11 upload path is not loaded in the airborne companion image (ADR-004). The post-landing tool (F10) is a separate process / image.
+For every successful satellite-anchored frame whose `TileQualityMetadata` clears the publish threshold, orthorectify the nav frame onto basemap projection, deduplicate against the existing local tile cache, and write the result locally in `satellite-provider`-compatible on-disk format. **No outbound network write while airborne** — process-level isolation enforces this: neither the C11 `TileUploader` nor the C11 `TileDownloader` is loaded in the airborne companion image (ADR-004). The post-landing tool (F10) is a separate process / image.

 ### Preconditions

 - F3 produced a `PoseEstimate` with provenance `satellite_anchored` and covariance below the publish threshold.
-- `flight_state == IN_AIR` is signalled by FC `MAV_STATE`; the in-air image does not contain C11 (process-level isolation).
+- `flight_state == IN_AIR` is signalled by FC `MAV_STATE`; the in-air image does not contain C11 (process-level isolation; both `TileDownloader` and `TileUploader` paths absent).
 - Local C6 tile store has free quota (per AC-NEW-3 FDR sub-budget allocation).
 - Mid-flight tile metadata schema (quality_metadata) is configured per AC-NEW-7 + D-PROJ-2 contract sketch.

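Reviewer sketch (not part of the commit): the F4 preconditions listed above amount to a single gate evaluated per frame. The snippet below encodes that gate under stated assumptions: covariance is reduced to one scalar (`covariance_trace`), the threshold and byte counts are passed in by the caller, and all names other than `PoseEstimate`, `satellite_anchored`, and `IN_AIR` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoseEstimate:
    provenance: str          # e.g. "satellite_anchored"
    covariance_trace: float  # scalar summary; the real covariance form is not given in this diff


def should_generate_tile(
    estimate: PoseEstimate,
    flight_state: str,
    free_bytes: int,
    tile_bytes: int,
    max_covariance_trace: float,
) -> bool:
    """Return True only when every listed F4 precondition holds."""
    return (
        estimate.provenance == "satellite_anchored"
        and estimate.covariance_trace < max_covariance_trace
        and flight_state == "IN_AIR"
        and free_bytes >= tile_bytes  # local C6 quota check, simplified
    )
```

The metadata-schema precondition is a configuration matter and is deliberately left out of the predicate.
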
@@ -463,7 +476,7 @@ flowchart TD
 | Filesystem write fails | Step 3 | filesystem error | Skip tile; FDR logs error; pipeline continues (tile generation is best-effort, not safety-critical) |
 | Postgres insert fails | Step 4 | DB error | Skip tile; FDR logs error |
 | Local cache quota exhausted | Step 3 | pre-write free-space check | LRU-evict oldest **mid-flight** tile (never evict pre-flight `satellite-provider` tiles); FDR logs eviction |
-| `flight_state` glitch reports `ON_GROUND` mid-flight | architectural | software guard — but C11 is not loaded anyway | Defense-in-depth holds: even if guard misfires, C11 binary is not present in the airborne image |
+| `flight_state` glitch reports `ON_GROUND` mid-flight | architectural | software guard — but C11 is not loaded anyway | Defense-in-depth holds: even if guard misfires, C11 (both `TileDownloader` and `TileUploader`) is not present in the airborne image |
 | Dedup race (two threads writing same cell) | Step 4 | DB unique constraint or filesystem `O_EXCL` | Retry once with the freshest candidate; FDR logs race |

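Reviewer sketch (not part of the commit): the "Dedup race" row names filesystem `O_EXCL` as one detection mechanism. The snippet below shows how an exclusive create on the `./tiles/{zoomLevel}/{x}/{y}.jpg` layout reports that another writer already claimed the cell. The function names are hypothetical, and the "retry once with the freshest candidate" policy is left to the caller because this diff does not define it.

```python
import os
from pathlib import Path


def tile_path(root: Path, zoom: int, x: int, y: int) -> Path:
    """Build the on-disk path ./tiles/{zoomLevel}/{x}/{y}.jpg under `root`."""
    return root / "tiles" / str(zoom) / str(x) / f"{y}.jpg"


def write_tile_exclusive(root: Path, zoom: int, x: int, y: int, jpeg: bytes) -> bool:
    """Create the tile file only if the cell is still free.

    Returns True when this caller won the cell, False when the file already
    existed (the dedup-race signal). The existing file is never overwritten.
    """
    path = tile_path(root, zoom, x, y)
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # O_EXCL together with O_CREAT makes creation fail if the file exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as fh:
        fh.write(jpeg)
    return True
```

Note that this only arbitrates who creates the file; a concurrent reader could still observe a partially written tile, so a real writer would likely combine the claim with a temp-file-and-rename step.
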
 ### Performance expectations
@@ -902,26 +915,26 @@ sequenceDiagram

 ### Description

-After the UAV has landed and `flight_state == ON_GROUND` is confirmed, the operator triggers C11 (a separate operator-side process / image — **not present in the airborne companion image**, ADR-004) which reads locally-saved mid-flight tiles from C6 and uploads them to `satellite-provider`'s ingest endpoint per the D-PROJ-2 contract sketch. Each tile carries quality metadata sufficient for the parent-suite voting layer to decide promotion `pending → trusted` (D-PROJ-2 design task #2; not yet implemented service-side — `mock-suite-sat-service` stands in for testing).
+After the UAV has landed and `flight_state == ON_GROUND` is confirmed, the operator triggers the C11 Tile Manager's `TileUploader` (a separate operator-side process / image — **not present in the airborne companion image**, ADR-004) which reads locally-saved mid-flight tiles from C6 and uploads them to `satellite-provider`'s ingest endpoint per the D-PROJ-2 contract sketch. Each tile carries quality metadata sufficient for the parent-suite voting layer to decide promotion `pending → trusted` (D-PROJ-2 design task #2; not yet implemented service-side). Until the real endpoint ships, integration tests target the e2e-test `mock-suite-sat-service` fixture under `tests/fixtures/`; production never reaches the fixture.

 ### Preconditions

 - `flight_state == ON_GROUND` confirmed by the FC's `MAV_STATE` (operator's workstation reads this off the FC or from the FDR).
-- Operator workstation has network reach to `satellite-provider` (or `mock-suite-sat-service` in test).
+- Operator workstation has network reach to `satellite-provider` (in tests, the e2e `mock-suite-sat-service` fixture stands in for the not-yet-shipped POST endpoint).
 - Local C6 tile store has mid-flight tiles with `voting_status=pending` and quality metadata.
-- Per-flight onboard signing key (generated at takeoff load, baked into tile metadata) is available to C11 for payload signing.
+- Per-flight onboard signing key (generated at takeoff load, baked into tile metadata) is available to C11 `TileUploader` for payload signing.

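Reviewer sketch (not part of the commit): the `ON_GROUND` precondition and the `voting_status=pending` selection above can be illustrated with an in-memory stand-in for C6. `upload_pending_tiles` and `UploadBatchReport` are names from this diff; the `Tile` fields, the report fields, and the `send` callback (standing in for the not-yet-shipped ingest endpoint) are assumptions. Payload signing is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Tile:
    tile_id: str
    flight_id: str
    voting_status: str = "pending"


@dataclass
class UploadBatchReport:
    uploaded: int = 0         # hypothetical field
    rejected: int = 0         # hypothetical field
    refused_reason: str = ""  # hypothetical field; non-empty when the guard refused


def upload_pending_tiles(
    tiles: Iterable[Tile],
    flight_id: str,
    flight_state: str,
    send: Callable[[Tile], bool],
) -> UploadBatchReport:
    """Upload this flight's pending tiles, but only when ON_GROUND is confirmed."""
    report = UploadBatchReport()
    if flight_state != "ON_GROUND":
        # Software guard: refuse before touching the network.
        report.refused_reason = f"flight_state is {flight_state}, not ON_GROUND"
        return report
    for tile in tiles:
        if tile.flight_id != flight_id or tile.voting_status != "pending":
            continue
        if send(tile):
            tile.voting_status = "uploaded"  # mirrors the C6 status update
            report.uploaded += 1
        else:
            report.rejected += 1
    return report
```

Rejected tiles stay `pending` here so that a later run can retry them; whether the real tool does the same is not stated in this diff.
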
 ### Sequence Diagram

 ```mermaid
 sequenceDiagram
 participant Operator
-participant C11 as C11 PostLandingUploadTool (workstation)
+participant C11 as C11 TileUploader (workstation)
 participant C6 as C6 TileStore (companion or workstation mirror)
 participant SatelliteProvider as [[satellite-provider]] (D-PROJ-2 endpoint, planned)
 participant Fdr as C13 FdrWriter

-Operator->>C11: trigger upload(flight_id)
+Operator->>C11: upload_pending_tiles(flight_id)
 C11->>C6: read mid-flight tiles where voting_status=pending AND flight_id=...
 C6-->>C11: batched Tile + TileQualityMetadata
 loop per batch
@@ -930,7 +943,7 @@ sequenceDiagram
 C11->>C6: update voting_status=uploaded for accepted tiles
 C11->>Fdr: upload-batch event + service response
 end
-C11-->>Operator: upload report (counts, rejections, duplicates)
+C11-->>Operator: UploadBatchReport (counts, rejections, duplicates)
 Note over SatelliteProvider: voting layer (D-PROJ-2 design task #2) eventually promotes pending → trusted; out of scope for this flow
 ```

@@ -938,7 +951,7 @@ sequenceDiagram

 ```mermaid
 flowchart TD
-Start([Operator triggers C11 with flight_id]) --> StateCheck{flight_state == ON_GROUND confirmed?}
+Start([Operator triggers C11 TileUploader with flight_id]) --> StateCheck{flight_state == ON_GROUND confirmed?}
 StateCheck -->|no| Refuse[Refuse to upload; report to operator]
 StateCheck -->|yes| ReadTiles[Read mid-flight tiles voting_status=pending]
 ReadTiles --> Empty{Any tiles to upload?}