mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 10:41:14 +00:00
[AZ-489] [AZ-490] ADR-010 design pass: operator-mission as cold-start anchor
Architecture, contracts, and task amendments for the flight-route-driven preflight + cold-start origin feature (ADR-010). No source code touched in this commit; the implementation commits for AZ-489 / AZ-490 / AZ-419 land separately.

* architecture.md: ADR-010, new Principle #14, amended Principle #11, external systems gain flights service + Mission Planner UI, data model gains Flight / Waypoint / TakeoffOrigin.
* system-flows.md: F1 gains phase 0 (Flight resolve), F2 gains cold-start ladder, F7 gains mid-flight bounded-delta GPS gate.
* glossary.md: Flight, Flights API, Mid-flight bounded-delta GPS gate, Mission Planner UI, Takeoff origin, Waypoint.
* C10: description + cache_provisioner + manifest_verifier bumped to v1.1 carrying takeoff_origin + flight_id in the manifest hash.
* C12: description updated + new flights_api_client.md contract v1.0.
* C5: description + state_estimator_protocol bumped to v1.1 with set_takeoff_origin + 3-clause spoof-promotion gate.
* AZ-323/324/325/326/328/419 amended in place. AZ-490 spec created (C5 set_takeoff_origin entrypoint).
* Dependencies table: 142 tasks / 478 pts / 15 forward edges (2 new tasks, 2 backward deps, 2 forward deps from AZ-419).
* Leftovers cleared: 2026-05-11 Jira transition entries for AZ-355 and AZ-386 are deleted (Jira reconnected; both already transitioned in their respective implementation commits).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,15 +1,15 @@
# Dependencies Table

**Date**: 2026-05-10 (refreshed after E-BBT decomposition)
**Total Tasks**: 140 (99 product + 41 blackbox-test)
**Total Complexity Points**: 472 (339 product + 133 blackbox-test)
**Date**: 2026-05-11 (refreshed after AZ-489 + AZ-490 onboarding for ADR-010 operator-origin path)
**Total Tasks**: 142 (101 product + 41 blackbox-test)
**Total Complexity Points**: 478 (345 product + 133 blackbox-test)

Dependencies columns list only the tracker-ID portion (descriptive tail
text in each task spec is omitted here for table-readability). The
authoritative dependency narrative — including "co-developed", "forward
dependency", and helper-vs-Protocol distinctions — lives in each task's
own `Dependencies:` field. The graph is a strict DAG: a topological
traversal visits all 140 tasks. The 13 forward edges (dep ID > task ID)
traversal visits all 142 tasks. The 15 forward edges (dep ID > task ID)
are all declared and documented below under **Cycle Check**.

| Task | Name | Complexity | Dependencies | Epic |
@@ -61,9 +61,9 @@ are all declared and documented below under **Cycle Check**.
| AZ-323 | C10 Manifest Builder | 3 | AZ-263, AZ-269, AZ-266, AZ-280, AZ-281, AZ-303 | AZ-252 |
| AZ-324 | C10 ManifestVerifier | 3 | AZ-263, AZ-269, AZ-266, AZ-280, AZ-281 | AZ-252 |
| AZ-325 | C10 CacheProvisioner | 3 | AZ-263, AZ-269, AZ-266, AZ-303, AZ-321, AZ-322, AZ-323 | AZ-252 |
| AZ-326 | C12 CLI App | 3 | AZ-263, AZ-269, AZ-266 | AZ-253 |
| AZ-326 | C12 CLI App | 3 | AZ-263, AZ-269, AZ-266, AZ-489 | AZ-253 |
| AZ-327 | C12 Companion Bringup | 3 | AZ-263, AZ-269, AZ-266 | AZ-253 |
| AZ-328 | C12 Build-Cache Orchestrator | 5 | AZ-326, AZ-327, AZ-316, AZ-325, AZ-263, AZ-269, AZ-266 | AZ-253 |
| AZ-328 | C12 Build-Cache Orchestrator | 5 | AZ-326, AZ-327, AZ-316, AZ-325, AZ-489, AZ-263, AZ-269, AZ-266 | AZ-253 |
| AZ-329 | C12 Post-Landing Upload | 3 | AZ-326, AZ-319, AZ-272, AZ-263, AZ-269, AZ-266 | AZ-253 |
| AZ-330 | C12 OperatorReLocService | 3 | AZ-326, AZ-273, AZ-263, AZ-269, AZ-266 | AZ-253 |
| AZ-331 | C1 VioStrategy Protocol | 3 | AZ-263, AZ-269, AZ-266, AZ-270, AZ-272, AZ-276, AZ-277 | AZ-254 |
@@ -126,7 +126,7 @@ are all declared and documented below under **Cycle Check**.
| AZ-416 | FT-P-09-AP — ArduPilot Plane GPS_INPUT contract + MAVLink 2.0 signing handshake | 5 | AZ-406, AZ-407 | AZ-262 |
| AZ-417 | FT-P-09-iNav — iNav MSP2_SENSOR_GPS contract conformance | 3 | AZ-406, AZ-407 | AZ-262 |
| AZ-418 | FT-P-10 — GTSAM smoothing-loop look-back accuracy | 3 | AZ-406, AZ-407 | AZ-262 |
| AZ-419 | FT-P-11 — Cold-start initialization from FC EKF | 3 | AZ-406, AZ-407 | AZ-262 |
| AZ-419 | FT-P-11 — Cold-start init (operator-manifest primary + FC EKF secondary + bounded-delta gate) | 3 | AZ-406, AZ-407, AZ-489 (forward), AZ-490 (forward) | AZ-262 |
| AZ-420 | FT-P-12 + FT-P-13 — GCS downsample + GCS-originated re-loc command | 3 | AZ-406, AZ-407 | AZ-262 |
| AZ-421 | FT-P-15 + FT-P-16 + FT-P-18 — Tile cache + offline + no-raw-retention | 3 | AZ-406, AZ-407 | AZ-262 |
| AZ-422 | FT-P-17 + FT-N-06 — Mid-flight tile generation + freshness | 3 | AZ-406, AZ-407 | AZ-262 |
@@ -154,6 +154,8 @@ are all declared and documented below under **Cycle Check**.
| AZ-444 | Tier-2 Jetson harness wrapper — run-tier2.sh, ssh provisioning, systemd, ASan-fuzz | 5 | AZ-406 | AZ-262 |
| AZ-445 | CSV reporter + evidence bundler — per-NFR machine-readable outputs + traceability-status.json | 2 | AZ-406 | AZ-262 |
| AZ-446 | CSV reporter refinements — trend-line + acceptance-band annotations + Monte Carlo CI | 2 | AZ-406, AZ-445 | AZ-262 |
| AZ-489 | C12 FlightsApiClient — fetch Flight from suite flights service + offline JSON fallback | 3 | AZ-263, AZ-269, AZ-266, AZ-279, AZ-280 | AZ-253 |
| AZ-490 | C5 set_takeoff_origin entrypoint — accept operator origin from C10 Manifest | 3 | AZ-263, AZ-269, AZ-266, AZ-272, AZ-273, AZ-279, AZ-381, AZ-383, AZ-384, AZ-385, AZ-386 | AZ-260 |
## Notes
@@ -189,6 +191,23 @@ are all declared and documented below under **Cycle Check**.
  `blackout_spoof.py`; NFT-RES-04 is the focused 35 s escalation
  scenario while FT-N-04 covers the 5 s / 15 s / 35 s ladder.
- AZ-446 depends on AZ-445 — refinements layer over the bundler.
- **ADR-010 operator-origin path** (added 2026-05-11):
  - **AZ-489 (C12 FlightsApiClient)** is the new read-only Flight
    resolver for C12; it has no consumers inside its own epic but
    feeds AZ-326 (CLI flags) and AZ-328 (orchestrator phase 0) — both
    declare a hard backward dep on AZ-489. The CLI's `--flight-id` /
    `--flight-file` flags + AZ-328's flight-resolve phase 0 cannot
    land without it.
  - **AZ-490 (C5 set_takeoff_origin)** extends the AZ-381 Protocol
    with the pre-takeoff entrypoint, amends the AZ-385 source-label
    state machine with the third bounded-delta clause, and depends
    on AZ-381..AZ-386 (Protocol + factor adds + marginals + source
    label gate + ESKF baseline) plus AZ-272/273/279 for FDR + Vincenty.
    All deps are backward; AZ-490 ships after the C5 epic core lands.
  - **AZ-419 (FT-P-11 cold-start)** carries forward deps on both
    AZ-489 + AZ-490 — the blackbox cold-start scenario now exercises
    the operator-manifest primary path (needs both) AND the FC EKF
    secondary fallback (back-compat).
- **All E-BBT tasks depend on AZ-406 (test infrastructure)**; this is
  by design — AZ-406 is the foundation every blackbox test depends on
  (analogous to AZ-263 for the product side).
@@ -202,13 +221,13 @@ are all declared and documented below under **Cycle Check**.
- C3 `CrossDomainMatcher` → AZ-344 (Protocol) + AZ-345/346/347 (concrete)
- C3.5 `ConditionalRefiner` → AZ-348 (Protocol + Passthrough) + AZ-349 (AdHoP)
- C4 `PoseEstimator` → AZ-355 (Protocol) + AZ-358/361 (concrete)
- C5 `StateEstimator` → AZ-381 (Protocol) + AZ-382..AZ-389 (concrete)
- C5 `StateEstimator` → AZ-381 (Protocol) + AZ-382..AZ-389 (concrete) + AZ-490 (`set_takeoff_origin` entrypoint + bounded-delta gate)
- C6 `TileStore` / `DescriptorIndex` → AZ-303 (Interfaces) + AZ-304/305/306/307/308
- C7 `InferenceRuntime` → AZ-297 (Protocol) + AZ-298/299/300/301/302
- C8 `FcAdapter` / `GcsAdapter` → AZ-390 (Protocols) + AZ-391..AZ-397
- C10 Provisioning → AZ-321/322/323/324/325
- C11 Tile Manager → AZ-316/317/318/319/320
- C12 Operator Tooling → AZ-326/327/328/329/330
- C12 Operator Tooling → AZ-326/327/328/329/330 + AZ-489 (FlightsApiClient)
- C13 FDR Writer → AZ-291..AZ-296

- **Cross-cutting product modules**:
@@ -244,7 +263,7 @@ are all declared and documented below under **Cycle Check**.
## Cycle Check

A static dependency-graph traversal (Kahn topological sort) visits all
140 nodes — no cycles. The 13 forward edges (dep ID > task ID) are all
142 nodes — no cycles. The 15 forward edges (dep ID > task ID) are all
declared, bounded, and documented:

- **AZ-267 → AZ-272** (FDR Log Bridge → FdrRecord Schema; shipped in
@@ -261,6 +280,13 @@ declared, bounded, and documented:
  optionally for the ASan-fuzz mode). AZ-444 is therefore scheduled
  as the first Tier-2 E-BBT deliverable; the dependent scenarios land
  on top of it.
- **AZ-326 → AZ-489, AZ-328 → AZ-489** (C12 CLI + orchestrator
  depend on the new C12 FlightsApiClient task added 2026-05-11; the
  client lands first inside the C12 epic and the CLI/orchestrator
  then plug it in).
- **AZ-419 → AZ-489, AZ-419 → AZ-490** (blackbox cold-start scenario
  forward-depends on both the C12 client + the new C5 entrypoint;
  the scenario lands after both product tasks).
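
The cycle check described above can be sketched as follows. This is a minimal illustration, not the project's actual tooling: `DEPS` reproduces only a trimmed subset of the table rows, the three root tasks are treated as dependency-free because their own rows are not shown in this excerpt, and the helper names `kahn_order` and `forward_edges` are hypothetical.

```python
from collections import deque

# Trimmed subset of the dependency table: task -> dependency IDs.
# AZ-263/266/269 are treated as roots here (an assumption for the sketch).
DEPS = {
    "AZ-263": [],
    "AZ-266": [],
    "AZ-269": [],
    "AZ-489": ["AZ-263", "AZ-269", "AZ-266"],
    "AZ-326": ["AZ-263", "AZ-269", "AZ-266", "AZ-489"],
    "AZ-327": ["AZ-263", "AZ-269", "AZ-266"],
}


def task_num(task_id: str) -> int:
    """Numeric part of a tracker ID, e.g. 'AZ-489' -> 489."""
    return int(task_id.split("-")[1])


def kahn_order(deps: dict) -> list:
    """Kahn topological sort; returns fewer nodes than len(deps) if a cycle exists."""
    indegree = {task: len(ds) for task, ds in deps.items()}
    dependents = {task: [] for task in deps}
    for task, ds in deps.items():
        for dep in ds:
            dependents[dep].append(task)
    ready = deque(sorted((t for t, n in indegree.items() if n == 0), key=task_num))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in dependents[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order


def forward_edges(deps: dict) -> list:
    """Edges whose dependency ID is numerically greater than the task ID."""
    return sorted(
        (task, dep)
        for task, ds in deps.items()
        for dep in ds
        if task_num(dep) > task_num(task)
    )


order = kahn_order(DEPS)
print(len(order) == len(DEPS))   # every node visited means no cycle
print(forward_edges(DEPS))
```

A traversal that visits every node confirms the graph is acyclic; the forward-edge list is what the bullets above enumerate by hand.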

The graph is therefore a strict DAG once these documented forward
edges are accounted for, and remains sortable by tracker ID modulo
@@ -2,7 +2,7 @@
**Task**: AZ-323_c10_manifest_builder
**Name**: C10 Manifest Builder
**Description**: Implement `ManifestBuilder`, the C10-internal phase that produces the signed cache Manifest covering EVERY shipped artifact (engines, FAISS index, calibration JSON, all tile hashes from C6) plus the build-identity tuple `(model_ids, calibration_sha256, sorted_tile_hashes, sector_class, bbox, zoom_levels)` whose canonical hash is `manifest_hash` — the D-C10-1 idempotence key. Serializes the Manifest as canonical JSON (sorted keys, no whitespace) at `cache_root/Manifest.json`, computes its own SHA-256 sidecar via AZ-280, and writes a detached Ed25519 signature at `cache_root/Manifest.json.sig` using the operator's signing key from `key_path`. Refuses to sign with a non-operator key when `config.c10.signing_mode = "operator"` (C10-ST-01). Emits the `signing_public_key_fingerprint` into the Manifest itself so verifiers can pin the trust root.
**Description**: Implement `ManifestBuilder`, the C10-internal phase that produces the signed cache Manifest covering EVERY shipped artifact (engines, FAISS index, calibration JSON, all tile hashes from C6) plus the build-identity tuple `(model_ids, calibration_sha256, sorted_tile_hashes, sector_class, bbox, zoom_levels, takeoff_origin, flight_id)` whose canonical hash is `manifest_hash` — the D-C10-1 idempotence key. The `takeoff_origin` (`LatLonAlt`) and `flight_id` (`UUID`) are supplied by C12 from `Flight.waypoints[0]` via the `FlightsApiClient` (ADR-010, AZ-489); both are baked into the Manifest body **and** included in the manifest-hash so re-planning the flight produces a new cache identity. Serializes the Manifest as canonical JSON (sorted keys, no whitespace) at `cache_root/Manifest.json`, computes its own SHA-256 sidecar via AZ-280, and writes a detached Ed25519 signature at `cache_root/Manifest.json.sig` using the operator's signing key from `key_path`. Refuses to sign with a non-operator key when `config.c10.signing_mode = "operator"` (C10-ST-01). Emits the `signing_public_key_fingerprint` into the Manifest itself so verifiers can pin the trust root.
**Complexity**: 3 points
**Dependencies**: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-280_sha256_sidecar, AZ-281_engine_filename_schema, AZ-303_c6_storage_interfaces
**Component**: c10_provisioning (epic AZ-252 / E-C10)
@@ -34,7 +34,7 @@ This task delivers the Manifest serialization + signing. It does NOT compile eng
- Constructor: `__init__(self, *, sidecar: Sha256Sidecar, signer: ManifestSigner, tile_metadata_store: TileMetadataStore, logger: Logger, clock: Clock, config: C10ManifestConfig)`.
- `C10ManifestConfig` (`@dataclass(frozen=True)`): `signing_mode: enum {operator, dev}`, `allowed_operator_fingerprints: tuple[str, ...]`, `schema_version: str = "1.0"`.
- Public method: `build_manifest(input: ManifestBuildInput) -> ManifestArtifact`.
- `ManifestBuildInput` (`@dataclass(frozen=True)`): `cache_root: Path`, `bbox: Bbox`, `zoom_levels: tuple[int, ...]`, `sector_class: SectorClassification`, `engine_entries: tuple[EngineCacheEntry, ...]`, `descriptor_index_path: Path`, `calibration_path: Path`, `key_path: Path`.
- `ManifestBuildInput` (`@dataclass(frozen=True)`): `cache_root: Path`, `bbox: Bbox`, `zoom_levels: tuple[int, ...]`, `sector_class: SectorClassification`, `engine_entries: tuple[EngineCacheEntry, ...]`, `descriptor_index_path: Path`, `calibration_path: Path`, `key_path: Path`, `takeoff_origin: LatLonAlt | None = None` (ADR-010 / AZ-489 — when set, baked into Manifest + hash), `flight_id: UUID | None = None` (ADR-010 — pass-through provenance).
- `ManifestArtifact` (`@dataclass(frozen=True)`): `manifest_path: Path`, `signature_path: Path`, `manifest_hash: str`, `signing_public_key_fingerprint: str`, `total_artifacts_listed: int`.
- A `ManifestSigner` Protocol at `src/gps_denied_onboard/components/c10_provisioning/interface.py`:
```python
@@ -54,10 +54,10 @@ This task delivers the Manifest serialization + signing. It does NOT compile eng
- For descriptor index: call `sidecar.read_sidecar(input.descriptor_index_path)` → expect a 64-char hex digest.
- For calibration JSON: `sha256_hex(open(calibration_path, 'rb').read())` — calibration is small (KB).
- For tiles: call `tile_metadata_store.query_by_bbox(bbox, zoom_levels, sector_class)` → list of `TileMetadata` with `sha256_hex` field (set by AZ-316). Sort by `(zoom, lat, lon, source)` for determinism. Compute `tiles_coverage_sha256 = sha256(b"\n".join(f"{t.tile_id}:{t.sha256_hex}".encode() for t in sorted_tiles))`.
5. Build the canonical Manifest dict:
5. Build the canonical Manifest dict (ADR-010 adds `flight.takeoff_origin` + `flight.flight_id` blocks when supplied):
```
{
  "schema_version": "1.0",
  "schema_version": "1.1",
  "build": {
    "bbox": {...},
    "zoom_levels": [16, 17, 18],
@@ -65,6 +65,14 @@ This task delivers the Manifest serialization + signing. It does NOT compile eng
    "built_at": "2026-05-10T12:00:00Z",
    "manifest_hash": "<sha256-hex>"
  },
  "flight": {
    "flight_id": "<uuid>", // null when ManifestBuildInput.flight_id is None
    "takeoff_origin": { // omitted when ManifestBuildInput.takeoff_origin is None
      "lat_deg": <float>,
      "lon_deg": <float>,
      "alt_m": <float>
    }
  },
  "artifacts": {
    "engines": [{"path": "engines/dinov2_vpr_sm87_jp62_trt103_fp16.engine", "sha256": "<hex>"}, ...],
    "descriptor_index": {"path": "descriptors/corpus.index", "sha256": "<hex>"},
@@ -74,7 +82,7 @@ This task delivers the Manifest serialization + signing. It does NOT compile eng
  "signing_public_key_fingerprint": "<hex>"
}
```
6. Compute `manifest_hash` as `sha256(canonical_json(build_identity_tuple))` where `build_identity_tuple = sorted({model_ids, calibration_sha256, tiles_coverage_sha256, sector_class, bbox, zoom_levels})`. This is the D-C10-1 idempotence key. Insert into the Manifest dict at `build.manifest_hash` AFTER computation.
6. Compute `manifest_hash` as `sha256(canonical_json(build_identity_tuple))` where `build_identity_tuple = sorted({model_ids, calibration_sha256, tiles_coverage_sha256, sector_class, bbox, zoom_levels, takeoff_origin_tuple_or_none, flight_id_or_none})`. The takeoff origin is serialised as `(lat_deg, lon_deg, alt_m)` rounded to 9 decimal places (sub-millimetre, deterministic). This is the D-C10-1 idempotence key. Insert into the Manifest dict at `build.manifest_hash` AFTER computation. **Two builds with identical inputs but different `takeoff_origin` produce different `manifest_hash` values; this is the contract that lets `ManifestVerifier` reject a re-planned route at boot (AZ-324, MV-INV-8).**
7. Serialize the Manifest dict as canonical JSON: `orjson.dumps(manifest, option=orjson.OPT_SORT_KEYS | orjson.OPT_INDENT_2).decode()`. Append a trailing newline.
8. Atomic-write the JSON via `sidecar.write_with_sidecar(cache_root / "Manifest.json", canonical_json_bytes)` — produces `Manifest.json` + `Manifest.json.sha256` (the latter is the Manifest's OWN sha256, used by T4).
9. Sign the canonical JSON bytes: `signature_bytes = signer.sign(key, canonical_json_bytes)` (raw Ed25519 signature, 64 bytes).
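
The `manifest_hash` computation in step 6 can be sketched as below. This is an illustration under stated assumptions, not the task's implementation: the stdlib `json` module stands in for `orjson`, the build identity is modelled as a keyed dict rather than the `sorted({...})` shorthand, and the names `canonical_json`, `origin_tuple_or_none`, `manifest_hash`, and the `BASE` sample values are hypothetical.

```python
import hashlib
import json
from typing import Optional, Tuple
from uuid import UUID


def canonical_json(obj) -> bytes:
    """Sorted keys, no insignificant whitespace (stdlib stand-in for orjson)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def origin_tuple_or_none(origin: Optional[Tuple[float, float, float]]):
    """(lat_deg, lon_deg, alt_m) rounded to 9 decimal places, or None."""
    if origin is None:
        return None
    return [round(float(v), 9) for v in origin]


def manifest_hash(*, model_ids, calibration_sha256, tiles_coverage_sha256,
                  sector_class, bbox, zoom_levels,
                  takeoff_origin=None, flight_id: Optional[UUID] = None) -> str:
    """SHA-256 over the canonical JSON of the build-identity fields."""
    identity = {
        "model_ids": sorted(model_ids),
        "calibration_sha256": calibration_sha256,
        "tiles_coverage_sha256": tiles_coverage_sha256,
        "sector_class": sector_class,
        "bbox": bbox,
        "zoom_levels": list(zoom_levels),
        "takeoff_origin": origin_tuple_or_none(takeoff_origin),
        "flight_id": str(flight_id) if flight_id is not None else None,
    }
    return hashlib.sha256(canonical_json(identity)).hexdigest()


# Hypothetical sample inputs, for illustration only.
BASE = dict(
    model_ids=["dinov2_vpr"],
    calibration_sha256="ab" * 32,
    tiles_coverage_sha256="cd" * 32,
    sector_class="rural",
    bbox={"lat_min": 49.5, "lat_max": 50.5, "lon_min": 35.5, "lon_max": 36.5},
    zoom_levels=[16, 17, 18],
)

h_a = manifest_hash(**BASE, takeoff_origin=(50.0, 36.2, 200.0))
h_b = manifest_hash(**BASE, takeoff_origin=(50.00000001, 36.2, 200.0))
print(h_a != h_b)  # a shift of 1e-8 degrees survives the 9-decimal rounding
```

Rounding before hashing keeps the key deterministic across float noise below the ninth decimal, while any re-planned origin or changed `flight_id` yields a new cache identity.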
@@ -168,6 +176,26 @@ Given an input with N engines + 1 index + 1 calibration + tiles_coverage
When `ManifestArtifact.total_artifacts_listed` is inspected
Then it equals `N + 3` (engines + index + calibration + tiles_coverage); does NOT count the Manifest itself or the signature

**AC-13: `takeoff_origin` baked into Manifest body when supplied (ADR-010 / AZ-489)**
Given a `ManifestBuildInput` with `takeoff_origin = LatLonAlt(50.0, 36.2, 200.0)` and `flight_id = some_uuid`
When `build_manifest` is called
Then the Manifest body contains a `flight` block with `flight_id` and `takeoff_origin` (`lat_deg=50.0`, `lon_deg=36.2`, `alt_m=200.0`); ZERO `built_at`-style timestamp inside `takeoff_origin`

**AC-14: `takeoff_origin` absent from Manifest body when not supplied**
Given a `ManifestBuildInput` with `takeoff_origin = None` and `flight_id = None`
When `build_manifest` is called
Then the Manifest body has the `flight` block with `flight_id: null` and NO `takeoff_origin` key (use absence, not `null`, so AZ-324 can detect "field never set" vs "field invalid")

**AC-15: `manifest_hash` changes when only `takeoff_origin` differs**
Given two `ManifestBuildInput`s identical except `takeoff_origin = A` vs `takeoff_origin = B` (B != A by ≥ 1 mm)
When `build_manifest` is called twice
Then the two `manifest_hash` values differ — D-C10-1 idempotence treats re-planned route as a new build

**AC-16: `manifest_hash` changes when only `flight_id` differs (same `takeoff_origin`)**
Given two `ManifestBuildInput`s identical except `flight_id`
When `build_manifest` is called twice
Then the two `manifest_hash` values **differ** — `flight_id` is provenance and is part of the build identity (operator may re-plan with the same takeoff position but a different mission; the cache identity must track that)
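
The presence/absence rule from AC-13 and AC-14 can be illustrated with a small sketch. The helper name `build_flight_block` and the plain-tuple origin are assumptions made for this example; the spec itself uses `LatLonAlt` and does not name such a helper.

```python
from typing import Optional, Tuple
from uuid import UUID


def build_flight_block(takeoff_origin: Optional[Tuple[float, float, float]],
                       flight_id: Optional[UUID]) -> dict:
    """Build the Manifest `flight` block.

    `flight_id` is always present (null when not supplied); `takeoff_origin`
    is omitted entirely when not supplied, never written as null.
    """
    block = {"flight_id": str(flight_id) if flight_id is not None else None}
    if takeoff_origin is not None:
        lat_deg, lon_deg, alt_m = takeoff_origin
        block["takeoff_origin"] = {
            "lat_deg": lat_deg,
            "lon_deg": lon_deg,
            "alt_m": alt_m,
        }
    return block


with_origin = build_flight_block((50.0, 36.2, 200.0), UUID(int=7))
without_origin = build_flight_block(None, None)
print("takeoff_origin" in with_origin, "takeoff_origin" in without_origin)
```

Using key absence rather than `null` keeps "never set" distinguishable from "set but invalid" on the verifier side, which is the reason AC-14 gives.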

## Non-Functional Requirements

**Performance**
@@ -199,6 +227,10 @@ Then it equals `N + 3` (engines + index + calibration + tiles_coverage); does NO
| AC-10 | Kill mid-write | No half-Manifest |
| AC-11 | Verify Manifest's own sidecar | Hashes match |
| AC-12 | Inspect total_artifacts_listed | Counts engines+index+calibration+tiles_coverage |
| AC-13 | Build with takeoff_origin set | `flight.takeoff_origin` present in JSON; lat/lon/alt match |
| AC-14 | Build with takeoff_origin=None | `flight.takeoff_origin` key absent from JSON |
| AC-15 | Two builds, takeoff_origin differs | manifest_hash differs |
| AC-16 | Two builds, only flight_id differs | manifest_hash differs |
| NFR-perf | 100k-tile bench | ≤ 5 s wall clock |
| NFR-reliability-fail-closed | Operator mode + unknown fp | Fail-closed; nothing written |
@@ -2,7 +2,7 @@
**Task**: AZ-324_c10_manifest_verifier
**Name**: C10 ManifestVerifier
**Description**: Implement `ManifestVerifier` (per the contract `_docs/02_document/contracts/c10_provisioning/manifest_verifier.md`), the read-only validator that AC-NEW-1 places between F2 takeoff and any engine deserialization. Loads `Manifest.json`, verifies its sidecar SHA-256 matches the Manifest bytes, parses the Ed25519 detached signature at `Manifest.json.sig`, verifies it against the caller-supplied `trusted_public_keys` tuple, parses the Manifest schema (rejecting absolute paths and schema violations), and walks every per-artifact entry re-hashing it via AZ-280's sidecar pattern. Returns a `VerificationResult` with `outcome ∈ {PASS, FAIL}`, the union of all `VerifyFailReason` values that fired, the populated `per_artifact_checks` list, and `elapsed_ms`. Fail-closed: any deviation in signature, schema, key trust, or hashes yields `FAIL` with detailed reasons. Never raises on a verify failure — only on environment errors (Manifest.json missing → `MANIFEST_NOT_FOUND` is still `FAIL`, not raise).
**Description**: Implement `ManifestVerifier` (per the contract `_docs/02_document/contracts/c10_provisioning/manifest_verifier.md` v1.1.0), the read-only validator that AC-NEW-1 places between F2 takeoff and any engine deserialization. Loads `Manifest.json`, verifies its sidecar SHA-256 matches the Manifest bytes, parses the Ed25519 detached signature at `Manifest.json.sig`, verifies it against the caller-supplied `trusted_public_keys` tuple, parses the Manifest schema (rejecting absolute paths and schema violations), validates the optional `flight.takeoff_origin` block (well-formed `LatLonAlt` + inside `build.bbox` per ADR-010 + AZ-490), and walks every per-artifact entry re-hashing it via AZ-280's sidecar pattern. Returns a `VerificationResult` with `outcome ∈ {PASS, FAIL}`, the union of all `VerifyFailReason` values that fired, the populated `per_artifact_checks` list, the pass-through `takeoff_origin` + `flight_id` (or `None` when absent from the Manifest body), and `elapsed_ms`. Fail-closed: any deviation in signature, schema, key trust, hashes, or origin validity yields `FAIL` with detailed reasons. Never raises on a verify failure — only on environment errors (Manifest.json missing → `MANIFEST_NOT_FOUND` is still `FAIL`, not raise).
**Complexity**: 3 points
**Dependencies**: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-280_sha256_sidecar, AZ-281_engine_filename_schema
**Component**: c10_provisioning (epic AZ-252 / E-C10)
@@ -56,9 +56,15 @@ This task delivers the verifier + its frozen contract. It does NOT compile engin
- If `trusted_public_keys` is empty: append `UNTRUSTED_PUBLIC_KEY`; return `FAIL`.
5. **Step C — Schema parse**:
- `orjson.loads(manifest_bytes)` → dict.
- Validate required keys: `schema_version`, `build` (with sub-keys `bbox`, `zoom_levels`, `sector_class`, `built_at`, `manifest_hash`), `artifacts` (with `engines`, `descriptor_index`, `calibration`, `tiles_coverage`), `signing_public_key_fingerprint`.
- Validate required keys: `schema_version`, `build` (with sub-keys `bbox`, `zoom_levels`, `sector_class`, `built_at`, `manifest_hash`), `artifacts` (with `engines`, `descriptor_index`, `calibration`, `tiles_coverage`), `signing_public_key_fingerprint`. `flight` block is OPTIONAL (added in schema v1.1, ADR-010).
- Validate types: `engines` is list of `{path: str, sha256: str}`; `descriptor_index`, `calibration` are `{path: str, sha256: str}`; `tiles_coverage` is `{sha256: str, tile_count: int}`.
- Validate path-relative-only: every `path` value must be relative (no leading `/`, no `..` segments). Append `SCHEMA_VIOLATION` per offending field; if any, return `FAIL`.
- **Flight block (ADR-010 / AZ-490)**:
- If `flight` key absent → `takeoff_origin = None`, `flight_id = None`; continue.
- If `flight` present → parse `flight_id` (`UUID` or `None`) and `takeoff_origin` (optional block).
- If `flight.takeoff_origin` present → validate `lat_deg ∈ [-90, 90]`, `lon_deg ∈ [-180, 180]`, `alt_m` finite (no NaN/Inf). Append `TAKEOFF_ORIGIN_INVALID` to `fail_reasons` and the offending field name to `fail_details` if any check fails.
- If `flight.takeoff_origin` is well-formed → check it falls inside `build.bbox` (`bbox.lat_min ≤ lat ≤ bbox.lat_max`, `bbox.lon_min ≤ lon ≤ bbox.lon_max`). Append `TAKEOFF_ORIGIN_OUT_OF_BBOX` if not.
- The `takeoff_origin` is populated on `VerificationResult` whenever the block parsed (even on FAIL), per MV-INV-9, so operators see what was attempted.
6. **Step D — Per-artifact hash walk** (only reached if Steps A–C all passed):
- For each engine, descriptor_index, calibration entry:
- Compute `actual_path = manifest_path.parent / entry.path`.
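
The flight-block checks from Step C can be sketched as a pure function. This is a minimal sketch assuming dict inputs with the key names shown in the Manifest schema; the function name `check_takeoff_origin` and the string reason codes are stand-ins for the contract's `VerifyFailReason` values, not the contract itself.

```python
import math


def _is_finite_number(value) -> bool:
    """True for int/float values that are neither NaN nor infinite."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and math.isfinite(value)
    )


def check_takeoff_origin(origin: dict, bbox: dict):
    """Return (fail_reasons, fail_details) for a parsed `flight.takeoff_origin`."""
    fail_reasons, fail_details = [], []
    lat = origin.get("lat_deg")
    lon = origin.get("lon_deg")
    alt = origin.get("alt_m")

    # Well-formedness: range checks plus finiteness.
    if not (_is_finite_number(lat) and -90 <= lat <= 90):
        fail_details.append("lat_deg")
    if not (_is_finite_number(lon) and -180 <= lon <= 180):
        fail_details.append("lon_deg")
    if not _is_finite_number(alt):
        fail_details.append("alt_m")
    if fail_details:
        fail_reasons.append("TAKEOFF_ORIGIN_INVALID")
        return fail_reasons, fail_details

    # Containment is only checked for a well-formed origin.
    inside = (
        bbox["lat_min"] <= lat <= bbox["lat_max"]
        and bbox["lon_min"] <= lon <= bbox["lon_max"]
    )
    if not inside:
        fail_reasons.append("TAKEOFF_ORIGIN_OUT_OF_BBOX")
    return fail_reasons, fail_details


BBOX = {"lat_min": 49.5, "lat_max": 50.5, "lon_min": 35.5, "lon_max": 36.5}
print(check_takeoff_origin({"lat_deg": 200, "lon_deg": 36.2, "alt_m": 200.0}, BBOX))
```

The sample `BBOX` and origins mirror the values used in AC-15 through AC-17.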
@@ -166,6 +172,26 @@ Given `trusted_public_keys = ()`
When verify runs
Then `fail_reasons=(UNTRUSTED_PUBLIC_KEY,)` regardless of Manifest validity; per-artifact walk does NOT happen

**AC-14: Manifest with no `flight` block parses cleanly (back-compat)**
Given a v1.0 Manifest (no `flight` block) that is otherwise valid + signed
When verify runs
Then `outcome=PASS`; `VerificationResult.takeoff_origin is None`; `VerificationResult.flight_id is None`

**AC-15: Well-formed in-bbox `takeoff_origin` passes through**
Given a v1.1 Manifest with `flight.takeoff_origin = (50.0, 36.2, 200.0)` inside the recorded bbox
When verify runs
Then `outcome=PASS`; `VerificationResult.takeoff_origin == LatLonAlt(50.0, 36.2, 200.0)`

**AC-16: Malformed `takeoff_origin` (lat=200) fails closed**
Given a Manifest with `flight.takeoff_origin.lat_deg = 200`
When verify runs
Then `outcome=FAIL`; `fail_reasons` contains `TAKEOFF_ORIGIN_INVALID`; `fail_details` names `lat_deg`; the `takeoff_origin` field on `VerificationResult` is still populated for diagnostics

**AC-17: Out-of-bbox `takeoff_origin` fails closed**
Given a Manifest whose `flight.takeoff_origin = (10.0, 10.0, 0)` while `build.bbox` covers `(49.5..50.5, 35.5..36.5)`
When verify runs
Then `outcome=FAIL`; `fail_reasons` contains `TAKEOFF_ORIGIN_OUT_OF_BBOX`

## Non-Functional Requirements

**Performance**
@@ -197,6 +223,10 @@ Then `fail_reasons=(UNTRUSTED_PUBLIC_KEY,)` regardless of Manifest validity; per
| AC-9 | Operator mode + drifted tile | TILES_COVERAGE_MISMATCH |
| AC-10 | Airborne mode | tiles_coverage matched=True |
| AC-11 | Conformance check | True |
| AC-14 | v1.0 Manifest (no flight block) | PASS; takeoff_origin=None; flight_id=None |
| AC-15 | v1.1 Manifest, valid in-bbox origin | PASS; takeoff_origin populated |
| AC-16 | Malformed origin (lat=200) | FAIL; TAKEOFF_ORIGIN_INVALID; field name in details |
| AC-17 | Out-of-bbox origin | FAIL; TAKEOFF_ORIGIN_OUT_OF_BBOX |
| AC-12 | Inspect elapsed_ms | All non-negative; ordered as expected |
| AC-13 | Empty trusted keys | FAIL; UNTRUSTED |
| NFR-perf-airborne | 5 artifact bench, no tile re-walk | p99 ≤ 100 ms |
@@ -2,7 +2,7 @@
**Task**: AZ-325_c10_cache_provisioner
**Name**: C10 CacheProvisioner
**Description**: Implement `CacheProvisioner` (per the contract `_docs/02_document/contracts/c10_provisioning/cache_provisioner.md`), the public top-level orchestrator that composes AZ-321 (EngineCompiler), AZ-322 (DescriptorBatcher), and AZ-323 (ManifestBuilder) into a single idempotent F1 build pipeline. Acquires a `cache_root/.c10.lock` filesystem lockfile to enforce CP-INV-4. Computes the build-identity hash from the same canonical inputs AZ-323 hashes (model_ids + calibration_sha256 + tiles_coverage_sha256 + sector_class + bbox + zoom_levels) and compares to the existing `Manifest.json`'s `manifest_hash`; on match → `outcome=IDEMPOTENT_NO_OP`. On mismatch (or no prior Manifest) → run engine compile → descriptor population → Manifest build, then walk `cache_root` to confirm every file is listed in the new Manifest's `artifacts` section, raising `ManifestCoverageError` on orphans (with rollback to prior-good Manifest). Empty corpus → `BuildReport(outcome=FAILURE, failure_reason="run C11 TileDownloader first")` per description.md § 5.
**Description**: Implement `CacheProvisioner` (per the contract `_docs/02_document/contracts/c10_provisioning/cache_provisioner.md` v1.1.0), the public top-level orchestrator that composes AZ-321 (EngineCompiler), AZ-322 (DescriptorBatcher), and AZ-323 (ManifestBuilder) into a single idempotent F1 build pipeline. Acquires a `cache_root/.c10.lock` filesystem lockfile to enforce CP-INV-4. Computes the build-identity hash from the same canonical inputs AZ-323 hashes (model_ids + calibration_sha256 + tiles_coverage_sha256 + sector_class + bbox + zoom_levels **+ takeoff_origin + flight_id**) and compares to the existing `Manifest.json`'s `manifest_hash`; on match → `outcome=IDEMPOTENT_NO_OP`. On mismatch (or no prior Manifest) → run engine compile → descriptor population → Manifest build (passing `request.takeoff_origin` and `request.flight_id` to AZ-323), then walk `cache_root` to confirm every file is listed in the new Manifest's `artifacts` section, raising `ManifestCoverageError` on orphans (with rollback to prior-good Manifest). Empty corpus → `BuildReport(outcome=FAILURE, failure_reason="run C11 TileDownloader first")` per description.md § 5. **A request whose `takeoff_origin` differs from the prior Manifest's by ≥ 1 mm is treated as a new build identity (CP-INV-8) — this is the contract that lets `ManifestVerifier` reject a re-planned route at boot.**
**Complexity**: 3 points
**Dependencies**: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-303_c6_storage_interfaces, AZ-321_c10_engine_compiler, AZ-322_c10_descriptor_batcher, AZ-323_c10_manifest_builder
**Component**: c10_provisioning (epic AZ-252 / E-C10)
@@ -40,13 +40,13 @@ This task delivers the orchestrator + its frozen contract. It does NOT compile e
2. **Tile gathering**: call `tile_metadata_store.query_by_bbox(bbox, zoom_levels, sector_class)`.
- If empty → return `BuildReport(outcome=FAILURE, failure_reason="no tiles in C6 for the requested scope; run C11 TileDownloader first", engines_built=0, ...)`. ERROR log; release lock.
3. **Build-identity hash for idempotence check**:
- Compute `request_hash = sha256(canonical_json(model_ids + calibration_sha256 + tiles_coverage_sha256 + sector_class + bbox + zoom_levels))`. The `model_ids` come from the configured backbone list; `calibration_sha256` from streaming the calibration_path; `tiles_coverage_sha256` from sorting the tile rows by `(zoom, lat, lon, source)` and hashing per AZ-323's algorithm.
- Compute `request_hash = sha256(canonical_json(model_ids + calibration_sha256 + tiles_coverage_sha256 + sector_class + bbox + zoom_levels + takeoff_origin_tuple_or_none + flight_id_or_none))`. The `model_ids` come from the configured backbone list; `calibration_sha256` from streaming the calibration_path; `tiles_coverage_sha256` from sorting the tile rows by `(zoom, lat, lon, source)` and hashing per AZ-323's algorithm. `takeoff_origin_tuple_or_none` is `(lat_deg, lon_deg, alt_m)` rounded to 9 decimal places when `request.takeoff_origin is not None`, otherwise the JSON `null` sentinel (CP-INV-8). The hashing formula MUST match AZ-323 exactly so AZ-325's idempotence decision agrees with AZ-323's emitted `build.manifest_hash`.
|
||||
- Read existing `Manifest.json` if present; parse only the `build.manifest_hash` field (don't run full verification — that's AZ-324's job). If `existing.manifest_hash == request_hash` → return `BuildReport(outcome=IDEMPOTENT_NO_OP, manifest_hash=existing.manifest_hash, manifest_path=existing_path, engines_built=0, engines_reused=0, descriptors_generated=0, elapsed_s, failure_reason=None)`. INFO log; release lock.
|
||||
4. **Active build path**:
|
||||
- Snapshot prior-good Manifest (rename to `Manifest.json.prev` if present) for rollback.
|
||||
- Compose engine compile request from configured backbones; call `engine_compiler.compile_engines_for_corpus(...)` → `engine_entries`.
|
||||
- Compose descriptor populate request (filter, callback hooked to logger); call `descriptor_batcher.populate_descriptors(...)` → `DescriptorBatchReport`. If `outcome=failure` → restore prior Manifest, release lock, return `BuildReport(outcome=FAILURE, failure_reason=batch.failure_reason, ...)`.
|
||||
- Compose Manifest build input from engine entries + descriptor index path + calibration + key_path; call `manifest_builder.build_manifest(...)` → `ManifestArtifact`.
|
||||
- Compose Manifest build input from engine entries + descriptor index path + calibration + key_path **+ `request.takeoff_origin` + `request.flight_id`** (ADR-010); call `manifest_builder.build_manifest(...)` → `ManifestArtifact`. Both fields default to `None` when the caller did not supply them (e.g., legacy C12 invocation without `--flight-id`).
|
||||
5. **Coverage check** (CP-INV-3 / D-C10-3):
|
||||
- Walk `cache_root` recursively (`pathlib.Path.rglob`); collect every regular file path EXCLUDING `Manifest.json`, `Manifest.json.sha256`, `Manifest.json.sig`, `Manifest.json.prev`, `.c10.lock`, and any `.sha256` sidecar (sidecars are implicit per the AZ-280 pattern, paired with their primary).
|
||||
- Build expected set: every `path` in `manifest.artifacts.engines + descriptor_index + calibration` (resolved relative to `cache_root`).
|
||||
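The build-identity hash described in step 3 can be sketched as below. This is a minimal illustration, not AZ-323's algorithm: it assumes `canonical_json` means key-sorted, whitespace-free JSON, and the function name `build_identity_hash` and the payload key names are hypothetical. Only the list of inputs and the 9-decimal rounding of `takeoff_origin` come from the spec text.

```python
import hashlib
import json


def build_identity_hash(
    *,
    model_ids,
    calibration_sha256,
    tiles_coverage_sha256,
    sector_class,
    bbox,
    zoom_levels,
    takeoff_origin=None,
    flight_id=None,
):
    """Return a hex SHA-256 over a canonical JSON rendering of the build inputs."""
    # (lat_deg, lon_deg, alt_m) rounded to 9 decimal places, or JSON null when absent.
    origin = None
    if takeoff_origin is not None:
        origin = [round(float(v), 9) for v in takeoff_origin]

    # Hypothetical key names; the real layout is whatever AZ-323 defines.
    payload = {
        "model_ids": list(model_ids),
        "calibration_sha256": calibration_sha256,
        "tiles_coverage_sha256": tiles_coverage_sha256,
        "sector_class": sector_class,
        "bbox": list(bbox),
        "zoom_levels": list(zoom_levels),
        "takeoff_origin": origin,
        "flight_id": None if flight_id is None else str(flight_id),
    }
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because both optional fields are always present in the payload (as `null` when unset), a request without an origin and a request with one can never collide, which is the property the idempotence check relies on.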
@@ -152,6 +152,21 @@ Given a populated cache and identical request
|
||||
When `build_cache_artifacts` runs
|
||||
Then wall-clock ≤ 1 min (CP-TC-13 / NFR C10-PT-01); the bound work is the build-identity hash computation, which is dominated by `tiles_coverage_sha256` over 1000 tiles (~5 ms hashing)
|
||||
|
||||
**AC-14: `takeoff_origin` mismatch triggers full rebuild (ADR-010 / CP-INV-8)**
|
||||
Given a prior Manifest built with `takeoff_origin = A`
|
||||
When `build_cache_artifacts` is called with the SAME bbox / zooms / sector / calibration / tiles, but `takeoff_origin = B` (B differs from A by ≥ 1 mm)
|
||||
Then `outcome=SUCCESS` (NOT `IDEMPOTENT_NO_OP`); the new Manifest replaces the old; the new `manifest_hash` differs from the prior; the new Manifest's `flight.takeoff_origin` matches B
|
||||
|
||||
**AC-15: `takeoff_origin = None` propagates through with no flight block in Manifest (back-compat)**
|
||||
Given a `BuildRequest` with `takeoff_origin = None` and `flight_id = None`
|
||||
When `build_cache_artifacts` runs
|
||||
Then `outcome=SUCCESS`; the produced Manifest has no `flight.takeoff_origin` key (AZ-323's AC-14); idempotence still works for subsequent identical-without-origin invocations
|
||||
|
||||
**AC-16: `flight_id` participation in idempotence**
|
||||
Given a prior Manifest built with `flight_id = X, takeoff_origin = A`
|
||||
When `build_cache_artifacts` runs with `flight_id = Y, takeoff_origin = A` (only `flight_id` differs)
|
||||
Then `outcome=SUCCESS` (NOT `IDEMPOTENT_NO_OP`); `flight_id` is part of the build identity per CP-INV-8
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
@@ -177,6 +192,9 @@ Then wall-clock ≤ 1 min (CP-TC-13 / NFR C10-PT-01); the bound work is the buil
|
||||
| AC-2 | Warm re-run with identical request | IDEMPOTENT_NO_OP; zero phase calls |
|
||||
| AC-3 | Different bbox after prior build | SUCCESS; atomic replace; old Manifest gone |
|
||||
| AC-4 | Empty C6 query | FAILURE; hint string; lock released |
|
||||
| AC-14 | Warm re-run with different takeoff_origin | SUCCESS; new manifest_hash; phases called |
|
||||
| AC-15 | Build with takeoff_origin=None | SUCCESS; Manifest has no flight.takeoff_origin |
|
||||
| AC-16 | Warm re-run with different flight_id only | SUCCESS; new manifest_hash |
|
||||
| AC-5 | Pre-acquire lock externally; run | BuildLockHeldError |
|
||||
| AC-6 | Inject orphan file before coverage walk | ManifestCoverageError; prior Manifest restored |
|
||||
| AC-7 | Same as AC-6 with `coverage_strict=False` | SUCCESS; WARN log |
|
||||
|
||||
@@ -2,9 +2,9 @@
|
||||
|
||||
**Task**: AZ-326_c12_cli_app
|
||||
**Name**: C12 CLI App
|
||||
**Description**: Implement the operator-tooling CLI shell that operators run on the workstation. Wires Typer (per the Click/Typer project pin) into `operator_tool/__main__.py`, registers six subcommands (`download`, `build-cache`, `upload-pending`, `reloc-confirm`, `verify-ready`, `set-sector`), wires the E-CC-LOG (AZ-266) logger to a workstation-side structured-JSON log file (`~/.azaion/onboard/c12-tooling.log`), and ships the two trivial operator-side helpers from description.md § 2 — `set_sector_classification(area, sector_class)` (persists per-area classification to a local JSON file under the operator workstation's home directory) and `apply_freshness_threshold(sector_class) -> int (months)` (a pure-data lookup that maps the sector classification enum to the AC-NEW-6 months freshness budget). Each subcommand is a thin shell that resolves its service collaborator (`build_cache`, `companion_bringup`, `post_landing_upload`, `operator_reloc_service` — all owned by sibling tasks AZ-NNN T2..T5) from the composition root and delegates to it; on success returns 0; on a known error type maps to a documented non-zero exit code with a one-line operator-friendly message + remediation hint pulled from the underlying error's `remediation` attribute. The CLI app does NOT own any workflow logic itself — only command registration, argument parsing, logger wiring, exit-code mapping, and the two simple operator helpers.
|
||||
**Description**: Implement the operator-tooling CLI shell that operators run on the workstation. Wires Typer (per the Click/Typer project pin) into `operator_tool/__main__.py`, registers six subcommands (`download`, `build-cache`, `upload-pending`, `reloc-confirm`, `verify-ready`, `set-sector`), wires the E-CC-LOG (AZ-266) logger to a workstation-side structured-JSON log file (`~/.azaion/onboard/c12-tooling.log`), and ships the two trivial operator-side helpers from description.md § 2 — `set_sector_classification(area, sector_class)` (persists per-area classification to a local JSON file under the operator workstation's home directory) and `apply_freshness_threshold(sector_class) -> int (months)` (a pure-data lookup that maps the sector classification enum to the AC-NEW-6 months freshness budget). Each subcommand is a thin shell that resolves its service collaborator (`flights_api_client`, `build_cache`, `companion_bringup`, `post_landing_upload`, `operator_reloc_service` — all owned by sibling tasks AZ-489 / AZ-NNN T2..T5) from the composition root and delegates to it; on success returns 0; on a known error type maps to a documented non-zero exit code with a one-line operator-friendly message + remediation hint pulled from the underlying error's `remediation` attribute. The CLI app does NOT own any workflow logic itself — only command registration, argument parsing, logger wiring, exit-code mapping, and the two simple operator helpers. **ADR-010 amendment**: the `build-cache` subcommand accepts a mutually-exclusive pair `--flight-id <Guid> | --flight-file <Path>` and forwards the resolved `FlightDto` (via AZ-489 `FlightsApiClient`) to the orchestrator (AZ-328), which derives the bbox + takeoff origin from it. The legacy `--bbox` flag is dropped because the bbox is now derived; passing it is an error.
|
||||
**Complexity**: 3 points
|
||||
**Dependencies**: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
|
||||
**Dependencies**: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-489_c12_flights_api_client (for the `FlightsApiClient` service collaborator + DTO definitions surfaced via `--flight-id` / `--flight-file`)
|
||||
**Component**: c12_operator_tooling (epic AZ-253 / E-C12)
|
||||
**Tracker**: AZ-326
|
||||
**Epic**: AZ-253 (E-C12)
|
||||
@@ -42,12 +42,12 @@ This task delivers the CLI shell + the two trivial operator helpers. It does NOT
|
||||
- `src/operator_tool/freshness_table.py` — `freshness_threshold_months(sector_class: SectorClassification) -> int`:
|
||||
- Pure data: `active_conflict → 1 month`; `stable_rear → 12 months`. Documented inline as the AC-NEW-6 freshness budget per description.md § 1 + Plan-phase intent.
|
||||
- Module-level constant: `FRESHNESS_TABLE: dict[SectorClassification, int]`.
|
||||
- `src/operator_tool/exit_codes.py` — module-level constants: `EXIT_OK = 0`, `EXIT_GENERIC_ERROR = 1`, `EXIT_USAGE = 2`, `EXIT_COMPANION_UNREACHABLE = 10`, `EXIT_CONTENT_HASH_MISMATCH = 11`, `EXIT_DOWNLOAD_FAILURE = 20`, `EXIT_BUILD_FAILURE = 21`, `EXIT_FLIGHT_STATE_NOT_CONFIRMED = 30`, `EXIT_UPLOAD_FAILURE = 31`, `EXIT_GCS_LINK_ERROR = 40`, `EXIT_LOCK_HELD = 50`. Sibling tasks may extend with documented additions.
|
||||
- `src/operator_tool/exit_codes.py` — module-level constants: `EXIT_OK = 0`, `EXIT_GENERIC_ERROR = 1`, `EXIT_USAGE = 2`, `EXIT_COMPANION_UNREACHABLE = 10`, `EXIT_CONTENT_HASH_MISMATCH = 11`, `EXIT_DOWNLOAD_FAILURE = 20`, `EXIT_BUILD_FAILURE = 21`, `EXIT_FLIGHT_STATE_NOT_CONFIRMED = 30`, `EXIT_UPLOAD_FAILURE = 31`, `EXIT_GCS_LINK_ERROR = 40`, `EXIT_LOCK_HELD = 50`, `EXIT_FLIGHTS_API_UNREACHABLE = 60`, `EXIT_FLIGHTS_API_AUTH = 61`, `EXIT_FLIGHT_NOT_FOUND = 62`, `EXIT_FLIGHT_SCHEMA = 63`, `EXIT_EMPTY_WAYPOINTS = 64`. Sibling tasks may extend with documented additions.
|
||||
- A composition root entry at `src/gps_denied_onboard/runtime_root/c12_factory.py`:
|
||||
- `build_operator_tool(config: Config) -> OperatorToolServices` — pure factory that constructs the `SectorClassificationStore` + a logger configured to write to `~/.azaion/onboard/c12-tooling.log`. Returns a frozen dataclass aggregating the operator-tool service handles. Sibling tasks T2..T5 each add their service to this dataclass without renaming or moving it.
|
||||
- Subcommand surface (each subcommand body lives in `cli.py`; service implementations live in sibling task files):
|
||||
- `download` — delegates to `tile_downloader.fetch(...)` (AZ-316). Maps `SatelliteProviderError → EXIT_DOWNLOAD_FAILURE`.
|
||||
- `build-cache` — delegates to `build_cache_orchestrator.build_cache(...)` (sibling T3). Maps `CacheBuildError → EXIT_DOWNLOAD_FAILURE | EXIT_BUILD_FAILURE` (per `failure_phase`); `BuildLockHeldError → EXIT_LOCK_HELD`.
|
||||
- `build-cache` — accepts a mutually-exclusive pair `--flight-id <Guid> | --flight-file <Path>` (Typer-enforced via a callback that rejects both-set / neither-set with `EXIT_USAGE`), plus `--sector-class`, `--calibration-path`. Delegates to `build_cache_orchestrator.build_cache(...)` (sibling AZ-328) passing the resolved `FlightDto` (the orchestrator computes bbox + takeoff origin from it via AZ-489 helpers). Maps `CacheBuildError → EXIT_DOWNLOAD_FAILURE | EXIT_BUILD_FAILURE` (per `failure_phase`); `BuildLockHeldError → EXIT_LOCK_HELD`; `FlightsApiUnreachableError → EXIT_FLIGHTS_API_UNREACHABLE`; `FlightsApiAuthError → EXIT_FLIGHTS_API_AUTH`; `FlightNotFoundError → EXIT_FLIGHT_NOT_FOUND`; `FlightsApiSchemaError | FlightFileNotFoundError | WaypointSchemaError → EXIT_FLIGHT_SCHEMA`; `EmptyWaypointsError → EXIT_EMPTY_WAYPOINTS`.
|
||||
- `upload-pending` — delegates to `post_landing_upload.trigger_post_landing_upload(...)` (sibling T4). Maps `FlightStateNotConfirmedError → EXIT_FLIGHT_STATE_NOT_CONFIRMED`; `UploadGateBlockedError → EXIT_UPLOAD_FAILURE`.
|
||||
- `reloc-confirm` — delegates to `operator_reloc_service.request_reloc(...)` (sibling T5). Maps `GcsLinkError → EXIT_GCS_LINK_ERROR`.
|
||||
- `verify-ready` — delegates to `companion_bringup.verify_companion_ready(...)` (sibling T2). Maps `CompanionUnreachableError → EXIT_COMPANION_UNREACHABLE`; `ContentHashMismatchError → EXIT_CONTENT_HASH_MISMATCH`.
|
||||
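The mutually-exclusive `--flight-id` / `--flight-file` rule for `build-cache` can be sketched as a pure function that the subcommand body calls before touching any service. This is a sketch under assumptions: `resolve_flight_source` and `UsageError` are hypothetical names, the dataclasses are local stand-ins for the DTOs owned by AZ-328, and treating a malformed GUID as a usage error is an assumption rather than something the spec states.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Union
from uuid import UUID

EXIT_USAGE = 2  # value taken from exit_codes.py in the spec


@dataclass(frozen=True)
class FlightById:
    flight_id: UUID


@dataclass(frozen=True)
class FlightFromFile:
    path: Path


FlightSource = Union[FlightById, FlightFromFile]


class UsageError(Exception):
    """Hypothetical error type; the CLI layer would translate it to EXIT_USAGE."""

    exit_code = EXIT_USAGE


def resolve_flight_source(
    flight_id: Optional[str], flight_file: Optional[str]
) -> FlightSource:
    """Turn the two optional CLI values into exactly one FlightSource."""
    if flight_id is not None and flight_file is not None:
        raise UsageError("--flight-id and --flight-file are mutually exclusive")
    if flight_id is None and flight_file is None:
        raise UsageError("supply one of --flight-id or --flight-file")
    if flight_id is not None:
        try:
            return FlightById(UUID(flight_id))
        except ValueError as exc:
            # Assumption: a malformed GUID is reported as a usage problem.
            raise UsageError(f"--flight-id is not a valid GUID: {flight_id!r}") from exc
    return FlightFromFile(Path(flight_file))
```

Keeping the check free of Typer types makes it unit-testable without invoking the CLI; the command body would catch `UsageError` and exit with its `exit_code`, for example by raising `typer.Exit(code=EXIT_USAGE)`.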
@@ -131,6 +131,39 @@ Given `set-sector --area Derkachi --class active_conflict` was just run
|
||||
When the same command is run again
|
||||
Then the on-disk JSON file is byte-identical (or has only timestamp diffs in the log, not in the data file); the operator sees the same exit code 0 and the same INFO log line
|
||||
|
||||
**AC-11: `build-cache --flight-id` happy path delegates to orchestrator with `FlightDto` (ADR-010)**
|
||||
Given a fake `FlightsApiClient.fetch_flight` returns a 3-waypoint `FlightDto`
|
||||
When `operator-tool build-cache --flight-id 00000000-0000-0000-0000-000000000001 --sector-class stable_rear --calibration-path /tmp/cal.json` runs
|
||||
Then `build_cache_orchestrator.build_cache(...)` is called once with the resolved `FlightDto` (or its `(flight_id, bbox, takeoff_origin)` projection per AZ-328 signature); ZERO calls to `--bbox` legacy parsing
|
||||
|
||||
**AC-12: `build-cache --flight-file` happy path uses offline loader**
|
||||
Given a local JSON file in the documented schema is on disk
|
||||
When `operator-tool build-cache --flight-file /tmp/flight.json --sector-class stable_rear --calibration-path /tmp/cal.json` runs
|
||||
Then `FlightsApiClient.load_flight_file("/tmp/flight.json")` is called once; `fetch_flight` is NOT called; the orchestrator receives the same DTO shape
|
||||
|
||||
**AC-13: `build-cache` with both `--flight-id` and `--flight-file` errors out**
|
||||
When `operator-tool build-cache --flight-id 00000000-0000-0000-0000-000000000001 --flight-file /tmp/flight.json ...` runs
|
||||
Then exit code is `EXIT_USAGE = 2`; stderr names the conflict; ZERO calls to either client method
|
||||
|
||||
**AC-14: `build-cache` with neither `--flight-id` nor `--flight-file` errors out**
|
||||
When `operator-tool build-cache --sector-class stable_rear --calibration-path /tmp/cal.json` runs (no flight source)
|
||||
Then exit code is `EXIT_USAGE = 2`; stderr lists which flag must be supplied
|
||||
|
||||
**AC-15: `FlightNotFoundError` maps to `EXIT_FLIGHT_NOT_FOUND`**
|
||||
Given `fetch_flight` raises `FlightNotFoundError`
|
||||
When `build-cache --flight-id <unknown>` runs
|
||||
Then exit code is `62`; ERROR log carries the offending flight_id; ZERO calls to C11/C10
|
||||
|
||||
**AC-16: `FlightsApiAuthError` maps to `EXIT_FLIGHTS_API_AUTH`** (and never logs the auth token)
|
||||
Given `fetch_flight` raises `FlightsApiAuthError`
|
||||
When `build-cache --flight-id <id>` runs
|
||||
Then exit code is `61`; the structured log entry does NOT contain the `auth_token` value
|
||||
|
||||
**AC-17: `EmptyWaypointsError` maps to `EXIT_EMPTY_WAYPOINTS`**
|
||||
Given the fetched `FlightDto` has zero waypoints
|
||||
When `build-cache --flight-id <id>` runs (and the orchestrator calls `bbox_from_waypoints` → raises)
|
||||
Then exit code is `64`; the stderr message instructs the operator to re-plan in the Mission Planner UI
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
@@ -158,6 +191,13 @@ Then the on-disk JSON file is byte-identical (or has only timestamp diffs in the
|
||||
| AC-8 | `subprocess.run(["operator-tool", "--help"])` after `pip install -e .` | Exit 0, help text printed |
|
||||
| AC-9 | Per-subcommand `--help` text | Includes documented AC IDs |
|
||||
| AC-10 | Repeated `set-sector` for same area/class | On-disk JSON byte-identical |
|
||||
| AC-11 | `build-cache --flight-id` happy path | Orchestrator called once with resolved DTO |
|
||||
| AC-12 | `build-cache --flight-file` happy path | `load_flight_file` called; `fetch_flight` NOT called |
|
||||
| AC-13 | Both `--flight-id` and `--flight-file` | Exit 2; conflict message |
|
||||
| AC-14 | Neither flight source supplied | Exit 2; usage hint |
|
||||
| AC-15 | `FlightNotFoundError` | Exit 62; flight_id in log |
|
||||
| AC-16 | `FlightsApiAuthError` | Exit 61; auth_token NOT in log |
|
||||
| AC-17 | `EmptyWaypointsError` | Exit 64; Mission Planner UI hint |
|
||||
| NFR-perf-cold-start | Microbench `operator-tool --help` × 10 | p99 ≤ 500 ms |
|
||||
|
||||
## Constraints
|
||||
|
||||
@@ -2,9 +2,9 @@
|
||||
|
||||
**Task**: AZ-328_c12_build_cache_orchestrator
|
||||
**Name**: C12 Build-Cache Orchestrator
|
||||
**Description**: Implement `BuildCacheOrchestrator`, the public top-level F1 (pre-flight cache build) workflow. `build_cache(request: BuildCacheRequest) -> CacheBuildReport` does the following sequenced work, with strict ordering: (1) acquire a filesystem lockfile at `<cache_staging_root>/.c12.lock` per description.md § 7 (prevents concurrent F1 runs from stomping each other); (2) call `tile_downloader.fetch(...)` (AZ-316) on the operator workstation with `area`, `sector_class`, `freshness_threshold_months`, `satellite_provider_url`, `api_key`; (3) on download `failure` outcome → wrap as `CacheBuildError(failure_phase=download, ...)` and return `CacheBuildReport(outcome=failure, failure_phase=download, download_report=..., build_report=None)` WITHOUT invoking C10; (4) on download `success` → call `companion_bringup.verify_companion_ready(...)` (AZ-327) — if `not_ready` → wrap and return `CacheBuildReport(outcome=failure, failure_phase=download, ...)` because the artifacts the C11 step pushed to the companion did not survive the verification (the boundary case here is that the operator workstation may have ingested tiles into local C6 but the companion's pre-existing artifacts are stale); (5) SSH-invoke `C10.CacheProvisioner.build_cache_artifacts` (AZ-325) on the companion via the `RemoteCacheProvisionerInvoker` helper, streaming the C10 stdout/stderr lines back as DEBUG logs and parsing the final `BuildReport` JSON document the C10 process emits on stdout; (6) aggregate into `CacheBuildReport`; (7) release the lockfile in `finally`. Wraps any underlying error from C11/C10/C7/C6 as `CacheBuildError` with a `remediation` attribute populated per `failure_phase` (download phase → retry hint, key rotation hint; build phase → cache cleanup hint, GPU OOM mitigation hint). Surfaces a clear non-zero exit code via T1's `cli.py` mapping. 
Owns the operator-facing C12-IT-02 acceptance test contract (build_cache orchestrates C11 then C10; download failure aborts before C10; mixed reports surface in `CacheBuildReport`).
|
||||
**Description**: Implement `BuildCacheOrchestrator`, the public top-level F1 (pre-flight cache build) workflow. `build_cache(request: BuildCacheRequest) -> CacheBuildReport` does the following sequenced work, with strict ordering: **(0) Flight-resolve phase (ADR-010, AZ-489)** — the orchestrator either calls `flights_api_client.fetch_flight(flight_id, base_url, auth_token)` (online) or `flights_api_client.load_flight_file(path)` (offline) per the resolved CLI flag, then `bbox = flights_api_client.bbox_from_waypoints(flight.waypoints, buffer_m=config.flight_bbox_buffer_m)` and `takeoff_origin = flights_api_client.takeoff_origin_from_flight(flight)`. The resolved `(bbox, takeoff_origin, flight_id, raw_flight_dto)` is captured into `FlightResolveReport` for FDR/debug and forwarded into the downstream phases; any `FlightsApiUnreachableError` / `FlightsApiAuthError` / `FlightNotFoundError` / `FlightsApiSchemaError` / `FlightFileNotFoundError` / `EmptyWaypointsError` / `WaypointSchemaError` is wrapped as `CacheBuildError(failure_phase=flight_resolve, ...)` and aborts BEFORE the lockfile is even acquired (no point holding the lock while diagnosing operator inputs). 
(1) acquire a filesystem lockfile at `<cache_staging_root>/.c12.lock` per description.md § 7 (prevents concurrent F1 runs from stomping each other); (2) call `tile_downloader.fetch(...)` (AZ-316) on the operator workstation with `bbox` (computed in phase 0), `sector_class`, `freshness_threshold_months`, `satellite_provider_url`, `api_key`; (3) on download `failure` outcome → wrap as `CacheBuildError(failure_phase=download, ...)` and return `CacheBuildReport(outcome=failure, failure_phase=download, flight_resolve_report=..., download_report=..., build_report=None)` WITHOUT invoking C10; (4) on download `success` → call `companion_bringup.verify_companion_ready(...)` (AZ-327) — if `not_ready` → wrap and return `CacheBuildReport(outcome=failure, failure_phase=download, ...)`; (5) SSH-invoke `C10.CacheProvisioner.build_cache_artifacts` (AZ-325) on the companion via the `RemoteCacheProvisionerInvoker` helper, **passing `takeoff_origin` + `flight_id` along with bbox/sector_class** so AZ-325 / AZ-323 bake them into the Manifest. Stream the C10 stdout/stderr lines back as DEBUG logs and parse the final `BuildReport` JSON document the C10 process emits on stdout; (6) aggregate into `CacheBuildReport`; (7) release the lockfile in `finally`. Wraps any underlying error from C11/C10/C7/C6 as `CacheBuildError` with a `remediation` attribute populated per `failure_phase`. Owns the operator-facing C12-IT-02 acceptance test contract.
|
||||
**Complexity**: 5 points
|
||||
**Dependencies**: AZ-326_c12_cli_app, AZ-327_c12_companion_bringup, AZ-316_c11_tile_downloader, AZ-325_c10_cache_provisioner, AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
|
||||
**Dependencies**: AZ-326_c12_cli_app, AZ-327_c12_companion_bringup, AZ-316_c11_tile_downloader, AZ-325_c10_cache_provisioner, AZ-489_c12_flights_api_client (Flight resolve + bbox-from-waypoints + takeoff origin), AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
|
||||
**Component**: c12_operator_tooling (epic AZ-253 / E-C12)
|
||||
**Tracker**: AZ-328
|
||||
**Epic**: AZ-253 (E-C12)
|
||||
@@ -34,12 +34,13 @@ This task delivers the F1 orchestrator + the remote C10 invoker + the lockfile +
|
||||
## Outcome
|
||||
|
||||
- A `BuildCacheOrchestrator` class at `src/operator_tool/build_cache.py`:
|
||||
- Constructor: `__init__(self, *, tile_downloader: TileDownloader, companion_bringup: CompanionBringup, remote_c10_invoker: RemoteCacheProvisionerInvoker, freshness_table: FreshnessTable, lock_factory: FileLockFactory, logger: Logger, clock: Clock, config: C12BuildCacheConfig)`.
|
||||
- `C12BuildCacheConfig` (`@dataclass(frozen=True)`): `cache_staging_root: Path`, `lock_filename: str = ".c12.lock"`, `lock_timeout_s: float = 5.0`, `companion_cache_root: PurePosixPath`.
|
||||
- Constructor: `__init__(self, *, flights_api_client: FlightsApiClient, tile_downloader: TileDownloader, companion_bringup: CompanionBringup, remote_c10_invoker: RemoteCacheProvisionerInvoker, freshness_table: FreshnessTable, lock_factory: FileLockFactory, logger: Logger, clock: Clock, config: C12BuildCacheConfig)`.
|
||||
- `C12BuildCacheConfig` (`@dataclass(frozen=True)`): `cache_staging_root: Path`, `lock_filename: str = ".c12.lock"`, `lock_timeout_s: float = 5.0`, `companion_cache_root: PurePosixPath`, `flight_bbox_buffer_m: float = 1000.0`, `flights_api_base_url: str`, `flights_api_auth_token: SecretStr`.
|
||||
- Public method: `build_cache(request: BuildCacheRequest) -> CacheBuildReport`.
|
||||
- DTOs at `src/operator_tool/_types.py`:
|
||||
- `BuildCacheRequest` (`@dataclass(frozen=True)`): `area: AreaIdentifier`, `bbox: Bbox`, `sector_class: SectorClassification`, `calibration_path: Path`, `satellite_provider_url: str`, `api_key: SecretStr`, `companion_address: CompanionAddress`, `expected_engines: tuple[str, ...]`.
|
||||
- `CacheBuildReport` (`@dataclass(frozen=True)`): `download_report: DownloadBatchReport | None`, `build_report: BuildReport | None`, `outcome: enum {success, failure, idempotent_no_op}`, `failure_phase: enum {none, download, build}`, `failure_reason: str | None`, `wall_clock_s: float`.
|
||||
- `BuildCacheRequest` (`@dataclass(frozen=True)`): `flight_source: FlightSource` (one of `FlightById(flight_id: UUID)` or `FlightFromFile(path: Path)`), `sector_class: SectorClassification`, `calibration_path: Path`, `satellite_provider_url: str`, `api_key: SecretStr`, `companion_address: CompanionAddress`, `expected_engines: tuple[str, ...]`. **The legacy `bbox` field is removed; the orchestrator derives bbox from the resolved `FlightDto`.**
|
||||
- `FlightResolveReport` (`@dataclass(frozen=True)`): `source: enum {flights_api, flight_file}`, `flight_id: UUID`, `waypoint_count: int`, `bbox: Bbox`, `takeoff_origin: LatLonAlt`, `raw_flight_dto: FlightDto`.
|
||||
- `CacheBuildReport` (`@dataclass(frozen=True)`): `flight_resolve_report: FlightResolveReport | None`, `download_report: DownloadBatchReport | None`, `build_report: BuildReport | None`, `outcome: enum {success, failure, idempotent_no_op}`, `failure_phase: enum {none, flight_resolve, download, build}`, `failure_reason: str | None`, `wall_clock_s: float`.
|
||||
- Errors at `src/operator_tool/errors.py`:
|
||||
- `CacheBuildError(Exception)`: attributes `failure_phase: enum {download, build}`, `wrapped_exception_repr: str`, `remediation: str`. The `remediation` attribute is populated at construction time per `failure_phase` (download → "Re-run with same args; check `satellite_provider_url` and `api_key`."; build → "Inspect companion `~/.azaion/onboard/c10-build.log`; consider `rm -rf <companion_cache_root>/engines/` to force a clean rebuild.").
|
||||
- `BuildLockHeldError(CacheBuildError)`: subclass for the lock-held case with `remediation` = "Another `build-cache` is in progress; wait or kill the holding process and remove `<lock_path>`."
|
||||
@@ -59,14 +60,22 @@ This task delivers the F1 orchestrator + the remote C10 invoker + the lockfile +
|
||||
```
|
||||
Concrete: `FilelockFileLockFactory` wrapping the `filelock` library per the project pin (already used by E-C13 per epics.md C13 section). NOT a custom implementation.
|
||||
- Method flow for `build_cache`:
|
||||
0. **Flight resolve phase** (ADR-010 / AZ-489) — runs BEFORE the lockfile is acquired:
|
||||
- Branch on `request.flight_source`:
|
||||
- `FlightById(flight_id)` → `flight = flights_api_client.fetch_flight(flight_id=..., base_url=config.flights_api_base_url, auth_token=config.flights_api_auth_token)`.
|
||||
- `FlightFromFile(path)` → `flight = flights_api_client.load_flight_file(path=path)`.
|
||||
- Compute `bbox = flights_api_client.bbox_from_waypoints(flight.waypoints, buffer_m=config.flight_bbox_buffer_m)`.
|
||||
- Compute `takeoff_origin = flights_api_client.takeoff_origin_from_flight(flight)`.
|
||||
- Build `FlightResolveReport(source=..., flight_id=flight.flight_id, waypoint_count=len(flight.waypoints), bbox=bbox, takeoff_origin=takeoff_origin, raw_flight_dto=flight)`.
|
||||
- Catch `FlightsApiUnreachableError`, `FlightsApiAuthError`, `FlightNotFoundError`, `FlightsApiSchemaError`, `FlightFileNotFoundError`, `EmptyWaypointsError`, `WaypointSchemaError` → wrap as `CacheBuildError(failure_phase=flight_resolve, ...)` and return `CacheBuildReport(outcome=failure, failure_phase=flight_resolve, flight_resolve_report=None, download_report=None, build_report=None, ...)`. INFO log `kind="c12.build_cache.flight_resolve.start"` before; ERROR log `kind="c12.build_cache.flight_resolve.failed"` on failure with the resolved error class name (auth_token NEVER logged).
|
||||
1. Compute `lock_path = config.cache_staging_root / config.lock_filename`. Ensure `config.cache_staging_root` exists (mkdir parents=True).
|
||||
2. Compute `freshness_threshold_months = freshness_table.threshold(request.sector_class)` (uses T1's helper).
|
||||
3. Acquire lock: `with lock_factory.try_lock(lock_path, timeout_s=config.lock_timeout_s) as lock:` — on timeout, raise `BuildLockHeldError(failure_phase=download, ...)`.
|
||||
4. Record `start_t = clock.monotonic()`.
|
||||
5. INFO log `kind="c12.build_cache.start"` with the request (api_key REDACTED).
|
||||
6. **Download phase**: `download_report = tile_downloader.fetch(DownloadRequest(area=request.area, bbox=request.bbox, freshness_threshold_months=freshness_threshold_months, url=request.satellite_provider_url, api_key=request.api_key))`. Catch `SatelliteProviderError`, `RateLimitedError`, `ResolutionRejectionError`, `CacheBudgetExceededError`, `TileManagerError` → wrap as `CacheBuildError(failure_phase=download, ...)`. If `download_report.outcome == failure` → return `CacheBuildReport(outcome=failure, failure_phase=download, download_report=..., build_report=None, failure_reason=download_report.failure_reason, wall_clock_s=...)`.
|
||||
7. **Verify-ready phase**: `readiness = companion_bringup.verify_companion_ready(request.companion_address)`. Catch `CompanionUnreachableError`, `ContentHashMismatchError` → wrap as `CacheBuildError(failure_phase=download, ...)` (the C11 download succeeded but the companion is not in a state to consume the new tiles; failure_phase is `download` because the operator's next action is to re-run the same `build-cache` command, not to clean the build). If `readiness.outcome == not_ready` → return `CacheBuildReport(outcome=failure, failure_phase=download, ..., failure_reason="companion not ready: " + ", ".join(readiness.not_ready_reasons))`.
|
||||
8. **Build phase**: open SSH session via `ssh_factory.open(request.companion_address, ...)`; call `remote_c10_invoker.invoke(session, RemoteBuildRequest(bbox=request.bbox, sector_class=request.sector_class, calibration_path=request.calibration_path, expected_engines=request.expected_engines, companion_cache_root=config.companion_cache_root))`; catch `EngineBuildError`, `CalibrationCacheError`, `ManifestSignatureError`, `ManifestCoverageError`, `BuildLockHeldError` (C10's lock, distinct from C12's) → wrap as `CacheBuildError(failure_phase=build, ...)`.
|
||||
5. INFO log `kind="c12.build_cache.start"` with the request (api_key + auth_token REDACTED) and the `flight_resolve_report` summary.
|
||||
6. **Download phase**: `download_report = tile_downloader.fetch(DownloadRequest(bbox=flight_resolve_report.bbox, freshness_threshold_months=freshness_threshold_months, url=request.satellite_provider_url, api_key=request.api_key))` — the bbox is the one derived in phase 0; the orchestrator no longer accepts a caller-supplied bbox. Catch `SatelliteProviderError`, `RateLimitedError`, `ResolutionRejectionError`, `CacheBudgetExceededError`, `TileManagerError` → wrap as `CacheBuildError(failure_phase=download, ...)`. If `download_report.outcome == failure` → return `CacheBuildReport(outcome=failure, failure_phase=download, flight_resolve_report=..., download_report=..., build_report=None, failure_reason=download_report.failure_reason, wall_clock_s=...)`.
|
||||
7. **Verify-ready phase**: `readiness = companion_bringup.verify_companion_ready(request.companion_address)`. Catch `CompanionUnreachableError`, `ContentHashMismatchError` → wrap as `CacheBuildError(failure_phase=download, ...)`. If `readiness.outcome == not_ready` → return `CacheBuildReport(outcome=failure, failure_phase=download, ..., failure_reason="companion not ready: " + ", ".join(readiness.not_ready_reasons))`.
|
||||
8. **Build phase**: open SSH session via `ssh_factory.open(request.companion_address, ...)`; call `remote_c10_invoker.invoke(session, RemoteBuildRequest(bbox=flight_resolve_report.bbox, zoom_levels=..., sector_class=request.sector_class, calibration_path=request.calibration_path, expected_engines=request.expected_engines, companion_cache_root=config.companion_cache_root, takeoff_origin=flight_resolve_report.takeoff_origin, flight_id=flight_resolve_report.flight_id))` — the orchestrator forwards `takeoff_origin` + `flight_id` to the remote C10 build entry point so AZ-325 / AZ-323 bake them into the Manifest (ADR-010, AZ-490 consumes them on the companion at boot). Catch `EngineBuildError`, `CalibrationCacheError`, `ManifestSignatureError`, `ManifestCoverageError`, `BuildLockHeldError` (C10's lock, distinct from C12's) → wrap as `CacheBuildError(failure_phase=build, ...)`.
|
||||
9. Aggregate: `build_report` from step 8. If `build_report.outcome == IDEMPOTENT_NO_OP` → return `CacheBuildReport(outcome=idempotent_no_op, failure_phase=none, download_report=..., build_report=..., failure_reason=None, wall_clock_s=...)`. Else if `build_report.outcome == FAILURE` → return `CacheBuildReport(outcome=failure, failure_phase=build, ..., failure_reason=build_report.failure_reason, ...)`.
|
||||
10. INFO log `kind="c12.build_cache.success"` with the aggregated counts (tiles_downloaded, engines_built, engines_reused, descriptors_generated).
|
||||
11. Return `CacheBuildReport(outcome=success, failure_phase=none, download_report=..., build_report=..., failure_reason=None, wall_clock_s=...)`.
|
||||
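The phase ordering in the steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the orchestrator itself: `build_cache_sketch`, `Report`, and the callable parameters are hypothetical stand-ins for the real collaborators, exceptions are caught broadly instead of per error type, and failures are returned as reports rather than wrapped in `CacheBuildError`. It shows only that flight resolution runs before the lock and that each failure maps to a value from the closed `failure_phase` set.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional


class FailurePhase(str, Enum):
    # Closed set taken from the Constraints section of this task.
    NONE = "none"
    FLIGHT_RESOLVE = "flight_resolve"
    DOWNLOAD = "download"
    BUILD = "build"


@dataclass(frozen=True)
class Report:
    # Hypothetical, reduced stand-in for CacheBuildReport.
    outcome: str
    failure_phase: FailurePhase
    failure_reason: Optional[str] = None


def build_cache_sketch(
    resolve: Callable[[], Any],
    acquire_lock: Callable[[], None],
    release_lock: Callable[[], None],
    download: Callable[[Any], None],
    verify_ready: Callable[[], None],
    build: Callable[[Any], str],
) -> Report:
    # Phase 0 runs BEFORE the lock: an unresolvable Flight is an
    # operator-input error and must not block parallel builds.
    try:
        flight = resolve()
    except Exception as exc:  # simplification: the spec catches named errors
        return Report("failure", FailurePhase.FLIGHT_RESOLVE, str(exc))

    acquire_lock()
    try:
        try:
            download(flight)
            verify_ready()  # verify-ready failures are reported as "download"
        except Exception as exc:
            return Report("failure", FailurePhase.DOWNLOAD, str(exc))
        try:
            outcome = build(flight)  # e.g. "success" or "idempotent_no_op"
        except Exception as exc:
            return Report("failure", FailurePhase.BUILD, str(exc))
        return Report(outcome, FailurePhase.NONE)
    finally:
        release_lock()  # released on every path once acquired
```

Keeping the lock inside a `try`/`finally` means a failure in any later phase still releases it, while a flight-resolve failure never touches it at all.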
@@ -99,10 +108,10 @@ This task delivers the F1 orchestrator + the remote C10 invoker + the lockfile +
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Happy path — download → verify-ready → build → `success`**
|
||||
Given a fresh empty C6 + a clean companion + valid `BuildCacheRequest` + fakes that all return `success`
|
||||
**AC-1: Happy path — flight-resolve → download → verify-ready → build → `success`**
|
||||
Given a fresh empty C6 + a clean companion + valid `BuildCacheRequest(flight_source=FlightById(...))` + fakes that all return `success` (including a 3-waypoint `FlightDto`)
|
||||
When `build_cache(request)` is called
|
||||
Then the call sequence is `lock acquire → tile_downloader.fetch → companion_bringup.verify_companion_ready → remote_c10_invoker.invoke → lock release` (verifiable via spy on each fake); `CacheBuildReport(outcome=success, failure_phase=none, download_report=..., build_report=..., failure_reason=None)` is returned; ONE INFO log `kind="c12.build_cache.start"`; ONE INFO log `kind="c12.build_cache.success"`
|
||||
Then the call sequence is `flights_api_client.fetch_flight → bbox_from_waypoints → takeoff_origin_from_flight → lock acquire → tile_downloader.fetch (with derived bbox) → companion_bringup.verify_companion_ready → remote_c10_invoker.invoke (with takeoff_origin + flight_id) → lock release` (verifiable via spy on each fake); `CacheBuildReport(outcome=success, failure_phase=none, flight_resolve_report=..., download_report=..., build_report=..., failure_reason=None)` is returned; ONE INFO log `kind="c12.build_cache.flight_resolve.start"`; ONE INFO log `kind="c12.build_cache.start"`; ONE INFO log `kind="c12.build_cache.success"`
|
||||
|
||||
**AC-2: Download failure aborts before C10**
|
||||
Given a fake `tile_downloader.fetch` that raises `SatelliteProviderError("503 Service Unavailable")`
|
||||
@@ -144,10 +153,35 @@ Given a `BuildCacheRequest` with `api_key=SecretStr("super-secret-token")`
|
||||
When any log line is emitted by the orchestrator
|
||||
Then no log line contains the literal token; `api_key` field appears as `"REDACTED"` or is omitted entirely
|
||||
|
||||
**AC-10: Aggregated `CacheBuildReport` carries both sub-reports on success**
|
||||
**AC-10: Aggregated `CacheBuildReport` carries all sub-reports on success**
|
||||
Given a happy-path run
|
||||
When the caller inspects the returned `CacheBuildReport`
|
||||
Then `download_report` is a populated `DownloadBatchReport` from C11; `build_report` is a populated `BuildReport` from C10; `wall_clock_s` is a positive float; both sub-reports' fields are accessible (no truncation)
|
||||
Then `flight_resolve_report` is a populated `FlightResolveReport`; `download_report` is a populated `DownloadBatchReport` from C11; `build_report` is a populated `BuildReport` from C10; `wall_clock_s` is a positive float; all sub-reports' fields are accessible (no truncation)
|
||||
|
||||
**AC-11: Flight-resolve failure aborts BEFORE the lockfile (ADR-010)**
|
||||
Given `flights_api_client.fetch_flight` raises `FlightNotFoundError`
|
||||
When `build_cache(request)` is called
|
||||
Then `CacheBuildReport(outcome=failure, failure_phase=flight_resolve, flight_resolve_report=None, download_report=None, build_report=None, failure_reason="flight not found: <uuid>")` is returned; `lock_factory.try_lock` is NEVER called; `tile_downloader.fetch` is NEVER called; `companion_bringup.verify_companion_ready` is NEVER called; `remote_c10_invoker.invoke` is NEVER called; ONE ERROR log `kind="c12.build_cache.flight_resolve.failed"`
|
||||
|
||||
**AC-12: Offline flight-file path used when `FlightFromFile` source is passed**
|
||||
Given `BuildCacheRequest(flight_source=FlightFromFile(path=/tmp/flight.json))`
|
||||
When `build_cache(request)` is called
|
||||
Then `flights_api_client.load_flight_file(path=/tmp/flight.json)` is called once; `flights_api_client.fetch_flight` is NEVER called; the rest of the pipeline runs identically
|
||||
|
||||
**AC-13: `takeoff_origin` is forwarded to the remote C10 invoker**
|
||||
Given a fake `FlightDto` with `waypoints[0] = (50.0, 36.2, 200.0)`
|
||||
When `build_cache(request)` is called through to the build phase
|
||||
Then `remote_c10_invoker.invoke` is called with `RemoteBuildRequest.takeoff_origin == LatLonAlt(50.0, 36.2, 200.0)` and `RemoteBuildRequest.flight_id == flight.flight_id`
|
||||
|
||||
**AC-14: `EmptyWaypointsError` surfaces with `failure_phase=flight_resolve`**
|
||||
Given the resolved `FlightDto` has zero waypoints (so `bbox_from_waypoints` raises `EmptyWaypointsError`)
|
||||
When `build_cache(request)` is called
|
||||
Then `CacheBuildReport(outcome=failure, failure_phase=flight_resolve, ..., failure_reason="empty waypoints; re-plan in Mission Planner UI")` is returned; lockfile NOT acquired
|
||||
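AC-13 and AC-14 both hinge on two small derivations from the resolved waypoints. The sketch below is an assumption-laden illustration: the `LatLonAlt` field names follow the FDR record fields used elsewhere in this change, but the bbox tuple order, the `margin_deg` parameter, and the plain-sequence input are hypothetical, and antimeridian-crossing routes are deliberately not handled.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LatLonAlt:
    lat_deg: float
    lon_deg: float
    alt_m: float


class EmptyWaypointsError(ValueError):
    """Raised when a Flight carries no waypoints (AC-14)."""


def bbox_from_waypoints(
    waypoints: Sequence[LatLonAlt], margin_deg: float = 0.0
) -> Tuple[float, float, float, float]:
    # Returns (min_lat, min_lon, max_lat, max_lon); the order is an assumption.
    if not waypoints:
        raise EmptyWaypointsError("empty waypoints; re-plan in Mission Planner UI")
    lats = [w.lat_deg for w in waypoints]
    lons = [w.lon_deg for w in waypoints]
    return (
        min(lats) - margin_deg,
        min(lons) - margin_deg,
        max(lats) + margin_deg,
        max(lons) + margin_deg,
    )


def takeoff_origin_from_flight(waypoints: Sequence[LatLonAlt]) -> LatLonAlt:
    # ADR-010: the takeoff origin is derived from waypoints[0].
    if not waypoints:
        raise EmptyWaypointsError("empty waypoints; re-plan in Mission Planner UI")
    return waypoints[0]
```

Because both helpers raise before any lock is taken, an empty route surfaces as `failure_phase=flight_resolve` without touching the lockfile.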
|
||||
**AC-15: `auth_token` is REDACTED in all log output (Phase 0)**
|
||||
Given `config.flights_api_auth_token = SecretStr("bearer-xyz")`
|
||||
When any log line is emitted by the flight-resolve phase
|
||||
Then no log line contains the literal `bearer-xyz`; the field appears as `"REDACTED"` or is omitted entirely (same convention as AC-9 for `api_key`)
|
||||
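A redaction check in the spirit of AC-9 and AC-15 could look like the sketch below. It is illustrative only: `Secret` is a hypothetical stdlib stand-in for `pydantic.SecretStr` (or equivalent), and `capture_log` plus the logger name are assumptions made so the example runs without project code.

```python
import io
import logging


class Secret:
    """Hypothetical stand-in for pydantic.SecretStr: never renders its value."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # The only way to read the raw token; never pass this to a logger.
        return self._value

    def __repr__(self) -> str:
        return "REDACTED"

    __str__ = __repr__


def capture_log(secret: Secret) -> str:
    # Route one log line through a real handler and return the rendered text.
    stream = io.StringIO()
    logger = logging.getLogger("c12.redaction_sketch")
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    try:
        logger.info(
            "kind=c12.build_cache.flight_resolve.start auth_token=%s", secret
        )
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

A test then asserts that the literal token is absent from the captured output and that the placeholder is present, which is the same convention AC-9 applies to `api_key`.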
|
||||
## Non-Functional Requirements
|
||||
|
||||
@@ -178,13 +212,18 @@ Then `download_report` is a populated `DownloadBatchReport` from C11; `build_rep
|
||||
| AC-7 | Fake C10 returns `IDEMPOTENT_NO_OP` | `outcome=idempotent_no_op`, INFO log |
|
||||
| AC-8 | Construct each error type, inspect `remediation` | Matches documented text per phase |
|
||||
| AC-9 | Capture log output with `api_key="super-secret-token"` | Token not present in any log line |
|
||||
| AC-10 | Happy-path inspect returned report | Both sub-reports present, fields accessible |
|
||||
| AC-10 | Happy-path inspect returned report | All three sub-reports (flight_resolve + download + build) present, fields accessible |
|
||||
| AC-11 | Fake `fetch_flight` raises `FlightNotFoundError` | `failure_phase=flight_resolve`; lockfile NOT acquired; ZERO downstream calls |
|
||||
| AC-12 | `FlightFromFile` source | `load_flight_file` called; `fetch_flight` NOT called |
|
||||
| AC-13 | Inspect `RemoteBuildRequest` sent to invoker | `takeoff_origin` + `flight_id` forwarded |
|
||||
| AC-14 | `EmptyWaypointsError` from `bbox_from_waypoints` | `failure_phase=flight_resolve`; lockfile NOT acquired |
|
||||
| AC-15 | Capture log output with auth_token | Token not present |
|
||||
| NFR-perf-overhead | Microbench orchestrator-only path with all-fake collaborators × 100 | p99 ≤ 50 ms (excludes real network/SSH) |
|
||||
|
||||
## Constraints
|
||||
|
||||
- Strict phase ordering is non-negotiable: download → verify-ready → build. Any reordering breaks AC-2/AC-3 and causes operators to chase phantom errors.
|
||||
- `failure_phase` is a closed set `{none, download, build}` — adding a new value requires Plan-cycle approval (operators script against these values).
|
||||
- Strict phase ordering is non-negotiable: flight_resolve → lock → download → verify-ready → build. Any reordering breaks AC-2/AC-3/AC-11 and causes operators to chase phantom errors. **The flight_resolve phase happens BEFORE the lockfile is acquired — a Flight that cannot be resolved is an operator-input error, not a contended-resource error, and should not block parallel builds.**
|
||||
- `failure_phase` is a closed set `{none, flight_resolve, download, build}` — adding a new value requires Plan-cycle approval (operators script against these values).
|
||||
- The lockfile lives in the operator workstation's cache staging area, NOT on the companion. Companion-side concurrent protection is C10's responsibility (CP-INV-4 in AZ-325).
|
||||
- `api_key` field uses `pydantic.SecretStr` (or equivalent) and MUST NOT be `repr()`-logged anywhere in the orchestrator.
|
||||
- The remote C10 invocation goes through the same `SshSessionFactory` as T2 — do NOT instantiate a second SSH client. Single composition root.
|
||||
|
||||
@@ -1,57 +1,67 @@
|
||||
# FT-P-11 — Cold-start initialization from FC EKF
|
||||
# FT-P-11 — Cold-start initialization from operator origin (primary) OR FC EKF (secondary)
|
||||
|
||||
**Task**: AZ-419_ft_p_11_cold_start_init
|
||||
**Name**: Cold-start initialization from FC EKF's last valid GPS + IMU-extrapolated position (AC-5.1)
|
||||
**Description**: Implement FT-P-11 — start SITL with `cold-boot-fixture` snapshot loaded; start SUT cold; push first nav-camera frame; assert first outbound estimate's lat/lon within ±50 m of the FC EKF snapshot pose.
|
||||
**Name**: Cold-start initialization — operator-origin-from-Manifest primary; FC EKF GPS secondary (ADR-010 + AC-5.1)
|
||||
**Description**: Implement FT-P-11: exercise both cold-start paths defined by ADR-010, plus the takeoff-time bounded-delta conflict. **Primary path (AZ-490)**: pre-bake a `takeoff_origin` into the C10 Manifest, start SUT cold, push first nav-camera frame, assert the first outbound estimate's lat/lon falls within ±50 m of the operator origin even when the SITL FC EKF reports NO valid GPS. **Secondary path (legacy AC-5.1)**: clear the Manifest's `takeoff_origin`, load a `cold-boot-fixture` snapshot into SITL, start SUT cold, push first nav-camera frame, assert the first outbound estimate's lat/lon falls within ±50 m of the FC EKF snapshot. **Bounded-delta conflict (Principle #11 amended)**: set Manifest origin to A and SITL FC EKF to a position B with `|A − B| > 200 m`; assert the operator origin wins and the FC GPS is logged as suspect. All three scenarios share a single test module parameterised on `origin_source ∈ {operator_manifest, fc_ekf, bounded_delta_conflict}`.
|
||||
**Complexity**: 3 points
|
||||
**Dependencies**: AZ-406, AZ-407 (cold-boot-fixture)
|
||||
**Dependencies**: AZ-406, AZ-407 (cold-boot-fixture), AZ-323 / AZ-325 (Manifest with takeoff_origin), AZ-490 (set_takeoff_origin), AZ-489 (FlightsApiClient — used by the test fixture builder to fabricate Manifests with a known origin)
|
||||
**Component**: Blackbox Tests / Positive / Startup (epic AZ-262)
|
||||
**Tracker**: AZ-419
|
||||
**Epic**: AZ-262 (E-BBT)
|
||||
|
||||
## Problem
|
||||
|
||||
Cold-start initialization is a critical path — if the SUT cannot bootstrap its prior from the FC's last valid GPS + IMU-extrapolated pose (AC-5.1), it cannot resume after companion reboot or after a cold boot in a flight-resume scenario. This must be measured.
|
||||
Cold-start initialization is a critical path. The original assumption that the FC EKF's last valid GPS fix is always available at takeoff (AC-5.1) does not hold under realistic EW conditions — a UAV can be jammed at the launch site before takeoff, leaving the FC EKF with no valid GPS. ADR-010 introduces the operator-planned mission as the **primary** cold-start trust anchor: the operator authors the route in the Mission Planner UI, C12 fetches the `Flight`, derives `takeoff_origin` from `waypoints[0]`, and bakes it into the C10 Manifest. The airborne C5's `set_takeoff_origin` (AZ-490) consumes it before any sensor sample. The FC EKF GPS becomes the **secondary** path used only when the Manifest carries no origin (back-compat). Both paths must be measured end-to-end, and the bounded-delta gate (the third clause of the spoof-promotion gate, Principle #11) must be exercised so an inconsistent FC GPS at takeoff does not silently override the operator origin.
|
||||
|
||||
## Outcome
|
||||
|
||||
- pytest scenario at `e2e/tests/positive/test_ft_p_11_cold_start_init.py`.
|
||||
- Loads `cold-boot-fixture` JSON pose into SITL (parameter-load path); starts SITL.
|
||||
- Starts SUT (cold — no prior state).
|
||||
- Pushes a single first nav-camera frame.
|
||||
- Reads the first outbound estimate; computes Vincenty distance to the FC-EKF snapshot pose.
|
||||
- Asserts distance ≤ 50 m.
|
||||
- Parameterised on `origin_source ∈ {operator_manifest, fc_ekf, bounded_delta_conflict}`:
|
||||
- **`operator_manifest`** (primary path, AZ-490): the test fixture builder writes a Manifest with `flight.takeoff_origin = A` (a known `LatLonAlt`); SITL starts with NO valid GPS (`GPS_TYPE = 0` or simulated denial); SUT cold-starts; the test asserts the first outbound estimate's lat/lon is within ±50 m of `A`.
|
||||
- **`fc_ekf`** (secondary path, legacy AC-5.1): Manifest has no `flight.takeoff_origin`; `cold-boot-fixture` JSON pose loaded into SITL (parameter-load path); SUT cold-starts; the test asserts the first outbound estimate's lat/lon is within ±50 m of the FC-EKF snapshot pose.
|
||||
- **`bounded_delta_conflict`** (ADR-010 Principle #11 amended): Manifest carries `takeoff_origin = A`; SITL FC EKF reports `B` with `vincenty(A, B) > 200 m`; the test asserts the first outbound estimate falls within ±50 m of `A` (operator origin wins), the source label on the first estimate is NOT `SATELLITE_ANCHORED` (no immediate spoof-promotion), and the FDR carries a `c5.gps_bounded_delta.reject` record naming both A and B.
|
||||
- Starts SUT (cold — no prior FDR, no in-memory state). Pushes a single first nav-camera frame. Reads the first outbound estimate; computes Vincenty distance to the expected origin.
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
- Cold-boot SITL parameter load.
|
||||
- Cold-boot SITL parameter load (secondary path).
|
||||
- Test fixture builder that produces a C10 Manifest with a known `flight.takeoff_origin` (primary path); reuses AZ-323's canonical JSON serialization.
|
||||
- SUT cold start (`docker compose up gps-denied-onboard` from clean state; OR `systemctl start` on Tier-2).
|
||||
- First-frame push and first-emission read.
|
||||
- Distance comparison.
|
||||
- FDR record assertions for the bounded-delta conflict scenario.
|
||||
|
||||
### Excluded
|
||||
- Cold-start TTFF latency — owned by NFT-PERF-03 (AZ-430).
|
||||
- Companion mid-flight reboot — owned by NFT-RES-02 (AZ-433).
|
||||
- Mid-flight bounded-delta gate (only the takeoff slice is covered here; mid-flight is part of AZ-385 follow-up).
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: SITL reflects snapshot pose**
|
||||
Given `cold-boot-fixture` loaded
|
||||
Then the SITL EKF reports the snapshot pose (within ±1 m per fixture's load tolerance).
|
||||
**AC-1: Primary path (operator origin) — SUT cold-starts even when FC EKF has no GPS (ADR-010)**
|
||||
Given a C10 Manifest with `flight.takeoff_origin = LatLonAlt(50.0, 36.2, 200.0)` AND SITL configured with no valid GPS
|
||||
When SUT cold-starts and the first nav-camera frame is pushed
|
||||
Then the SUT emits its first outbound message within 30 s; `vincenty(estimate.position, manifest.takeoff_origin) ≤ 50 m`; the FDR carries a `c5.cold_start_origin.set` record with `source = "manifest"`
|
||||
|
||||
**AC-2: SUT initializes from FC EKF**
|
||||
Given SUT cold-started against the loaded SITL
|
||||
When the first nav-camera frame is pushed
|
||||
Then the SUT emits its first outbound message within ≤30 s (AC-NEW-1 budget — but FT-P-11 itself has a relaxed 60 s timeout).
|
||||
**AC-2: Secondary path (FC EKF) — Manifest has no origin (back-compat)**
|
||||
Given a C10 Manifest with no `flight.takeoff_origin` AND `cold-boot-fixture` JSON loaded into SITL
|
||||
When SUT cold-starts
|
||||
Then the first outbound estimate is within ±50 m of the FC EKF snapshot; the FDR carries a `c5.cold_start_origin.set` record with `source = "fc_ekf"`
|
||||
|
||||
**AC-3: first-emission position within budget**
|
||||
Given the first outbound estimate
|
||||
Then `vincenty(estimate_position, snapshot_position) ≤ 50 m`.
|
||||
**AC-3: No origin available — SUT refuses takeoff**
|
||||
Given a C10 Manifest with no `flight.takeoff_origin` AND SITL with no valid GPS
|
||||
When SUT cold-starts
|
||||
Then NO outbound `EmittedExternalPosition` is produced within the AC-NEW-1 30 s budget; the SUT logs `c5.cold_start_origin.unavailable` to FDR + GCS STATUSTEXT; the test asserts the FT-P-11 takeoff-abort policy fires
|
||||
|
||||
**AC-4: parameterization**
|
||||
Given conftest parameterization
|
||||
Then the scenario runs per `(fc_adapter, vio_strategy)`.
|
||||
**AC-4: Bounded-delta conflict — operator origin wins (ADR-010 Principle #11 amended)**
|
||||
Given Manifest `takeoff_origin = A` AND SITL FC EKF reports `B` with `vincenty(A, B) > 200 m`
|
||||
When SUT cold-starts and the first nav-camera frame is pushed
|
||||
Then the first outbound estimate is within ±50 m of `A`; the source label is NOT `SATELLITE_ANCHORED` (no immediate spoof-promotion); the FDR carries a `c5.gps_bounded_delta.reject` record naming both A and B and the computed distance
|
||||
|
||||
**AC-5: parameterization across FC adapters + VIO strategies**
|
||||
Given conftest parameterization on `(fc_adapter, vio_strategy, origin_source)`
|
||||
Then each combination listed in the test matrix runs the appropriate ACs (AC-1 / AC-2 / AC-3 / AC-4 per `origin_source`)
|
||||
|
||||
## System Under Test Boundary
|
||||
|
||||
|
||||
@@ -0,0 +1,220 @@
|
||||
# C5 set_takeoff_origin entrypoint — accept operator origin from C10 Manifest
|
||||
|
||||
**Task**: AZ-490_c5_set_takeoff_origin
|
||||
**Name**: C5 set_takeoff_origin entrypoint — accept operator origin from C10 Manifest
|
||||
**Description**: Extend `StateEstimator` (Protocol + both concrete impls, `GtsamIsam2StateEstimator` and `EskfStateEstimator`) with a new pre-takeoff entrypoint `set_takeoff_origin(origin: LatLonAlt, sigma_horiz_m: float, sigma_vert_m: float) -> None` that seeds the cold-start prior before the first frame arrives. The composition root calls this method during F2 (Takeoff load) when the C10 ManifestVerifier reports a valid `flight.takeoff_origin`. With operator origin set, the FC-EKF cold-start path becomes the secondary fallback (ADR-010). The task also adds a third, bounded-delta clause to the spoof-promotion gate (AZ-385): mid-flight FC GPS samples within 200 m of the current smoother estimate may be admitted as a soft constraint; samples > 200 m off are rejected and emit an FDR `c5.gps_bounded_delta.reject` record. The 200 m threshold is config-driven (`spoof_promotion_bounded_delta_m`, default 200.0). This task delivers the C5-side contract + the FDR record kind + the unit tests covering primary/secondary/conflict paths; downstream wiring into the composition root and the F2 sequence is the consumer's responsibility (AZ-381 owner already plumbs it).
|
||||
**Complexity**: 3 points
|
||||
**Dependencies**: AZ-381 (Protocol + DTOs + factory), AZ-383 (factor adds), AZ-384 (marginals + outputs), AZ-385 (source-label gate; bounded-delta is a new clause inside its state machine), AZ-386 (ESKF baseline must honour the same entrypoint), AZ-272 (FdrRecord Schema — new `c5.cold_start_origin.set` and `c5.gps_bounded_delta.{accept,reject}` kinds), AZ-273 (FdrClient), AZ-279 (WgsConverter for the bounded-delta Vincenty distance), AZ-269 (config), AZ-266 (logging), AZ-263 (initial structure)
|
||||
**Component**: c5_state (epic AZ-260 / E-C5)
|
||||
**Tracker**: AZ-490
|
||||
**Epic**: AZ-260 (E-C5)
|
||||
|
||||
### Document Dependencies
|
||||
|
||||
- `_docs/02_document/contracts/c5_state/state_estimator_protocol.md` § Invariants 11–12, § Config schema, § Test expectations.
|
||||
- `_docs/02_document/components/07_c5_state/description.md` § State management (cold-start ladder), § Spoof-promotion gate (3rd clause).
|
||||
- `_docs/02_document/architecture.md` ADR-010 (operator origin primary), Principle #11 (amended).
|
||||
- `_docs/02_document/system-flows.md` F2 (Takeoff load), F7 (spoof gate).
|
||||
|
||||
## Problem
|
||||
|
||||
Today, C5's `StateEstimator` Protocol has only one cold-start trust anchor: the FC EKF GPS snapshot consumed at first frame. ADR-010 makes the operator-planned mission the primary anchor and the FC EKF the secondary fallback — but the C5 Protocol carries no method to accept an external operator origin, so the composition root has nowhere to deliver the Manifest-resolved value.
|
||||
|
||||
Concretely:
|
||||
|
||||
- The C5 Protocol exposes `add_vio`, `add_pose_anchor`, `add_fc_imu`, `query` — none of which fit "pre-takeoff, set the absolute reference frame". Trying to overload `add_pose_anchor` would conflate "constant absolute origin from a trusted offline source" with "noisy per-frame satellite anchor"; the noise model, gating, and FDR record kind are all different.
|
||||
- AZ-385's spoof-promotion gate has two clauses (consistency-with-VPR-anchors + dwell-time). ADR-010 amends Principle #11 with a third clause: an FC GPS sample within `spoof_promotion_bounded_delta_m` of the current smoother estimate is admitted as a soft constraint, but samples outside that ring are rejected and counted against the spoof-promotion gate. Without this third clause, mid-flight FC GPS is either fully trusted or fully ignored — losing a legitimate fallback signal when GPS recovers.
|
||||
- The cold-start ladder is undocumented in the running code: operator origin should win when both are available, FC EKF should win when operator origin is absent, and the system should refuse takeoff when neither is available — but no method captures these states.
|
||||
- Two concrete estimators (iSAM2 + ESKF) need to honour the same entrypoint with the same semantics so the composition root can switch strategies without re-wiring F2.
|
||||
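The cold-start ladder described in the third bullet above reduces to a three-way choice. The helper below is a hypothetical illustration (no such function is named in this task); the returned labels reuse the `source` values and the `unavailable` state from the FDR record kinds this change introduces.

```python
from typing import Optional, Tuple, TypeVar

Origin = TypeVar("Origin")


def resolve_cold_start_origin(
    manifest_origin: Optional[Origin],
    fc_ekf_origin: Optional[Origin],
) -> Tuple[str, Optional[Origin]]:
    # Rung 1: the operator origin from the Manifest wins whenever present.
    if manifest_origin is not None:
        return ("manifest", manifest_origin)
    # Rung 2: fall back to the FC EKF snapshot (legacy AC-5.1 path).
    if fc_ekf_origin is not None:
        return ("fc_ekf", fc_ekf_origin)
    # Rung 3: neither anchor is available, so takeoff must be refused.
    return ("unavailable", None)
```

Note that the first rung ignores the FC EKF value entirely; the disagreement between the two anchors is handled separately by the bounded-delta clause, not by the ladder.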
|
||||
This task lands the C5-side contract, the FDR records, and the spoof-gate amendment. It does NOT touch the C10 Manifest parsing (AZ-324), the C12 Flight resolution (AZ-489), or the composition-root F2 wiring (AZ-381 owner sequences that as a follow-up commit).
|
||||
|
||||
## Outcome
|
||||
|
||||
- **`StateEstimator` Protocol** (`src/gps_denied_onboard/components/c5_state/interface.py`):
|
||||
- New method `set_takeoff_origin(origin: LatLonAlt, sigma_horiz_m: float, sigma_vert_m: float) -> None`.
|
||||
- Contract: idempotent if called twice with identical args; raises `EstimatorConfigError` if called again with different args (AC-5); raises `EstimatorAlreadyStartedError` if called after the first `add_vio` / `add_pose_anchor`; raises `EstimatorConfigError` on negative sigmas or on `LatLonAlt` outside WGS-84 bounds.
|
||||
- **`GtsamIsam2StateEstimator`** (`src/gps_denied_onboard/components/c5_state/gtsam_isam2_estimator.py`):
|
||||
- `set_takeoff_origin(...)` sets the local-ENU origin via `WgsConverter`, seeds the iSAM2 prior factor at `Pose3(Rot3.Identity(), Point3(0,0,0))` with a horizontal sigma of `sigma_horiz_m` and a vertical sigma of `sigma_vert_m`, and emits an FDR `c5.cold_start_origin.set` record (`source="manifest"`) with the origin lat/lon/alt.
|
||||
- Internal `_origin_source: Literal["manifest", "fc_ekf", None]` field tracks the cold-start path; first `add_vio` call without an origin set is allowed (FC EKF path stays); first `add_vio` call with an origin set fixes the cold-start ladder to "manifest".
|
||||
- **`EskfStateEstimator`** (`src/gps_denied_onboard/components/c5_state/eskf_baseline.py`):
|
||||
- Same method, same semantics; for ESKF the origin is set on the local-ENU converter; the state's nominal position prior is set to `(0,0,0)` with covariance `diag(sigma_horiz_m², sigma_horiz_m², sigma_vert_m²)`.
|
||||
- **Spoof-promotion gate amendment** (`src/gps_denied_onboard/components/c5_state/source_label_state_machine.py` — owned by AZ-385, this task patches it with one new clause):
|
||||
- When the gate processes an incoming FC GPS sample, it computes `vincenty(current_smoother_latlon, sample_latlon)` via `WgsConverter`.
|
||||
- If `distance ≤ spoof_promotion_bounded_delta_m`: sample is admitted as a `BOUNDED_DELTA_SOFT` source label; FDR `c5.gps_bounded_delta.accept` is emitted.
|
||||
- If `distance > spoof_promotion_bounded_delta_m`: sample is rejected; FDR `c5.gps_bounded_delta.reject` is emitted naming the sample, the current estimate, and the computed distance; the rejection counts against the existing dwell-time clause.
|
||||
- **Config schema additions** (`src/gps_denied_onboard/config/c5_state.py` via AZ-269's loader):
|
||||
- `spoof_promotion_bounded_delta_m: float` (default `200.0`).
|
||||
- `default_takeoff_origin_sigma_horiz_m: float` (default `5.0`) — used when the Manifest does not carry an explicit sigma.
|
||||
- `default_takeoff_origin_sigma_vert_m: float` (default `10.0`).
|
||||
- **FDR record kinds** (`src/gps_denied_onboard/fdr/record_schema.py` via AZ-272 schema extension):
|
||||
- `c5.cold_start_origin.set` — `{source: "manifest" | "fc_ekf", lat_deg, lon_deg, alt_m, sigma_horiz_m, sigma_vert_m}`.
|
||||
- `c5.cold_start_origin.unavailable` — emitted when neither anchor is available; carries a takeoff-abort reason code.
|
||||
- `c5.gps_bounded_delta.accept` — `{sample_lat, sample_lon, smoother_lat, smoother_lon, distance_m, threshold_m}`.
|
||||
- `c5.gps_bounded_delta.reject` — same shape as `accept`.
|
||||
- **Logging**:
|
||||
- INFO on every `set_takeoff_origin` call (`kind="c5.cold_start_origin.set"`).
|
||||
- WARN on every bounded-delta reject (`kind="c5.gps_bounded_delta.reject"`).
|
||||
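The `set_takeoff_origin` contract listed above (validation, idempotency, conflict, late call) can be sketched without GTSAM or the real FDR client. Everything below is an assumption for illustration: `OriginContractSketch`, its `fdr_records` list, and `position_covariance_diag` are hypothetical names, and the order of the checks is one plausible choice rather than the specified one.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LatLonAlt:
    lat_deg: float
    lon_deg: float
    alt_m: float


class EstimatorConfigError(ValueError):
    pass


class EstimatorAlreadyStartedError(RuntimeError):
    pass


class OriginContractSketch:
    """Models only the origin contract, not an actual smoother or filter."""

    def __init__(self) -> None:
        self._args: Optional[Tuple[LatLonAlt, float, float]] = None
        self._started = False
        self.fdr_records: List[dict] = []  # stand-in for the FDR client

    def add_vio(self) -> None:
        self._started = True  # the first sample closes the pre-takeoff window

    def set_takeoff_origin(
        self, origin: LatLonAlt, sigma_horiz_m: float, sigma_vert_m: float
    ) -> None:
        if self._started:
            raise EstimatorAlreadyStartedError("origin must be set before add_vio")
        if sigma_horiz_m < 0 or sigma_vert_m < 0:
            raise EstimatorConfigError("sigmas must be non-negative")
        if not -90.0 <= origin.lat_deg <= 90.0:
            raise EstimatorConfigError("lat_deg outside [-90, 90]")
        if not -180.0 <= origin.lon_deg <= 180.0:
            raise EstimatorConfigError("lon_deg outside [-180, 180]")
        args = (origin, sigma_horiz_m, sigma_vert_m)
        if self._args is not None:
            if args == self._args:
                return  # identical repeat: no new prior, no new FDR record
            raise EstimatorConfigError(f"origin already {self._args}, got {args}")
        self._args = args
        self.fdr_records.append(
            {"kind": "c5.cold_start_origin.set", "source": "manifest",
             "lat_deg": origin.lat_deg, "lon_deg": origin.lon_deg,
             "alt_m": origin.alt_m, "sigma_horiz_m": sigma_horiz_m,
             "sigma_vert_m": sigma_vert_m}
        )

    def position_covariance_diag(self) -> Tuple[float, float, float]:
        # ESKF-style position prior: diag(sigma_h^2, sigma_h^2, sigma_v^2).
        if self._args is None:
            raise EstimatorConfigError("no takeoff origin set")
        _, s_h, s_v = self._args
        return (s_h ** 2, s_h ** 2, s_v ** 2)
```

With `sigma_horiz_m=5.0` and `sigma_vert_m=10.0` the position block comes out as `(25.0, 25.0, 100.0)`, which is the covariance the ESKF bullet describes.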
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- Protocol method addition + both concrete impls (iSAM2 + ESKF).
|
||||
- WgsConverter integration for ENU origin + Vincenty distance in bounded-delta gate.
|
||||
- Spoof-gate's third clause + FDR record emission.
|
||||
- Config schema entries for the three new keys.
|
||||
- Error classes: `EstimatorAlreadyStartedError`, `EstimatorConfigError`.
|
||||
- Unit tests covering every AC: idempotency, post-start rejection, sigma validation, lat/lon bounds, bounded-delta accept/reject, source-label transition, FDR emission shape, both estimator impls.
|
||||
|
||||
### Excluded
|
||||
|
||||
- C10 Manifest schema changes (owned by AZ-323 / AZ-324).
|
||||
- C12 Flight resolution (owned by AZ-489).
|
||||
- Composition-root F2 wiring of `manifest.takeoff_origin → estimator.set_takeoff_origin(...)` (owned by AZ-381 owner as a follow-up commit, NOT this task).
|
||||
- Operator origin "refresh" mid-flight (out of scope — the cold-start anchor is set once at takeoff and not revised).
|
||||
- Changes to the AZ-385 source-label state-machine's first two clauses (consistency + dwell-time); only the third clause is added here.
|
||||
- ESKF deep-rewrite (the ESKF stays as the AZ-386 baseline; only `set_takeoff_origin` + bounded-delta admission are added).
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Protocol-conformance — both impls expose `set_takeoff_origin`**
|
||||
Given `from c5_state import GtsamIsam2StateEstimator, EskfStateEstimator, StateEstimator`
|
||||
When `isinstance(impl, StateEstimator)` is checked on each
|
||||
Then both return `True`; both have a callable `set_takeoff_origin` with the documented signature.
|
||||
|
||||
**AC-2: Happy path — origin set before first VIO seeds the smoother prior (iSAM2)**
|
||||
Given a fresh `GtsamIsam2StateEstimator`
|
||||
When `set_takeoff_origin(LatLonAlt(50.0, 36.2, 200.0), sigma_horiz_m=5.0, sigma_vert_m=10.0)` is called
|
||||
Then the iSAM2 graph has exactly one prior factor at `Pose3.Identity()` with sigma diag matching `[deg2rad(5°)*3, deg2rad(5°)*3, deg2rad(5°)*3, 5.0, 5.0, 10.0]` (rotation sigma is the iSAM2 default Identity prior; translation sigmas come from the call); ONE FDR record `c5.cold_start_origin.set` is emitted with `source="manifest"`; ONE INFO log.
|
||||
|
||||
**AC-3: Happy path — origin set before first VIO seeds the ESKF state prior**
|
||||
Given a fresh `EskfStateEstimator`
|
||||
When `set_takeoff_origin(LatLonAlt(50.0, 36.2, 200.0), sigma_horiz_m=5.0, sigma_vert_m=10.0)` is called
|
||||
Then the ESKF's `P` matrix's position block is `diag(25.0, 25.0, 100.0)`; the ENU origin is set to the call's `LatLonAlt`; ONE FDR record + ONE INFO log as in AC-2.
|
||||
|
||||
**AC-4: Idempotent — calling twice with identical args is a no-op**
|
||||
Given the estimator after a first `set_takeoff_origin(A, s, s_v)` call
|
||||
When `set_takeoff_origin(A, s, s_v)` is called again with identical args
|
||||
Then no second FDR record is emitted; no second prior factor is added; no exception raised.
|
||||
|
||||
**AC-5: Conflict — calling twice with different args raises `EstimatorConfigError`**
|
||||
Given the estimator after `set_takeoff_origin(A, ...)`
|
||||
When `set_takeoff_origin(B, ...)` is called with `A != B`
|
||||
Then `EstimatorConfigError` is raised; message names both values; the estimator state is unchanged.
|
||||
|
||||
**AC-6: Late call — `set_takeoff_origin` after first `add_vio` raises `EstimatorAlreadyStartedError`**
|
||||
Given the estimator after one `add_vio(...)` call
|
||||
When `set_takeoff_origin(...)` is called
|
||||
Then `EstimatorAlreadyStartedError` is raised; the estimator state is unchanged.
|
||||
|
||||
**AC-7: Bounds — invalid `LatLonAlt` raises `EstimatorConfigError`**
|
||||
Given `LatLonAlt(lat_deg=95.0, ...)` (out of WGS-84 bounds)
|
||||
When `set_takeoff_origin(invalid_origin, ...)` is called
|
||||
Then `EstimatorConfigError` is raised; message names the violated bound.
|
||||
|
||||
**AC-8: Negative sigma raises `EstimatorConfigError`**
|
||||
Given `sigma_horiz_m=-5.0`
|
||||
When `set_takeoff_origin(...)` is called
|
||||
Then `EstimatorConfigError`; message names the violated invariant.
|
||||
|
||||
**AC-9: Bounded-delta accept — incoming GPS within 200 m of smoother estimate is admitted**
|
||||
Given a running smoother with current estimate at `LatLonAlt(50.000, 36.200, 200.0)`
|
||||
When a `GpsSample(LatLonAlt(50.0008, 36.2008, 200.0))` arrives via the existing AZ-391 inbound path (distance ≈ 106 m at 50° N)
|
||||
Then the gate admits the sample with source label `BOUNDED_DELTA_SOFT`; ONE FDR record `c5.gps_bounded_delta.accept` is emitted; the sample contributes to the iSAM2 graph as a soft factor with sigma per the config; the source-label state machine's first-two-clause behaviour is unchanged.
|
||||
|
||||
**AC-10: Bounded-delta reject — incoming GPS > 200 m off is rejected**
|
||||
Given the same smoother as AC-9
|
||||
When a `GpsSample(LatLonAlt(50.005, 36.205, 200.0))` arrives (distance ≈ 660 m)
|
||||
Then the gate rejects the sample; ONE FDR `c5.gps_bounded_delta.reject` record is emitted naming sample, smoother estimate, and distance; the sample is NOT added to the graph; the rejection increments the source-label state machine's dwell-time counter.
|
||||
|
||||
**AC-11: Threshold is config-driven (`spoof_promotion_bounded_delta_m=1000.0` admits AC-10's sample)**
|
||||
Given the config override `spoof_promotion_bounded_delta_m=1000.0` is in effect
|
||||
When the AC-10 sample arrives
|
||||
Then it is admitted as `BOUNDED_DELTA_SOFT` (it now sits within the relaxed ring; a 500 m threshold would not suffice because the sample is more than 500 m from the smoother estimate).
|
||||
|
||||
**AC-12: FDR record kinds are registered in the AZ-272 schema**
|
||||
Given the AZ-272 schema after this task
|
||||
When `kind="c5.cold_start_origin.set"`, `kind="c5.cold_start_origin.unavailable"`, `kind="c5.gps_bounded_delta.accept"`, `kind="c5.gps_bounded_delta.reject"` are encoded
|
||||
Then each round-trips through serialization without raising; the schema contract test (AZ-268) covers all four.
|
||||
|
||||
**AC-13: No-op when no origin is set — FC EKF cold-start path unchanged**
|
||||
Given a fresh estimator
|
||||
When `add_vio(...)` is called WITHOUT a prior `set_takeoff_origin` call
|
||||
Then the estimator falls back to the legacy FC EKF cold-start path; the FDR record `c5.cold_start_origin.set` is emitted with `source="fc_ekf"` exactly once at first frame (this is the legacy path, just newly logged); no new behaviour beyond the FDR record kind.
|
||||
|
||||
**AC-14: Source-label state machine remains stable for AZ-385's first two clauses**
|
||||
Given the bounded-delta clause's introduction
|
||||
When the AZ-385 acceptance tests run unmodified
|
||||
Then they pass — the bounded-delta clause is additive, not replacing the existing two.
|
||||
|
||||
**AC-15: Vincenty distance is computed via `WgsConverter` (not haversine on a sphere, and not a flat equirectangular approximation)**
|
||||
Given a smoother estimate at high latitude (`60° N`) and a sample 200 m to the east
|
||||
When the gate computes the distance
|
||||
Then it uses `WgsConverter.vincenty_distance` and matches the documented ground-truth distance within ±0.5 m.
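
To show why AC-15 insists on an ellipsoidal distance, here is a standalone sketch of Vincenty's inverse formula on WGS-84 next to a spherical haversine. It is not the project's `WgsConverter`; the function names and signatures below are hypothetical, and only the standard WGS-84 constants and the published Vincenty series are used.

```python
import math

WGS84_A = 6378137.0                # semi-major axis, metres
WGS84_F = 1.0 / 298.257223563      # flattening
WGS84_B = (1.0 - WGS84_F) * WGS84_A


def vincenty_distance_m(lat1, lon1, lat2, lon2, tol=1e-12, max_iter=200):
    """Ellipsoidal distance in metres between two points given in degrees."""
    if lat1 == lat2 and lon1 == lon2:
        return 0.0
    f, a, b = WGS84_F, WGS84_A, WGS84_B
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)
    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(
            cos_u2 * sin_lam, cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam
        )
        if sin_sigma == 0.0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1.0 - sin_alpha ** 2
        # On the equator cos_sq_alpha is 0 and the midpoint term vanishes.
        cos_2sm = (
            cos_sigma - 2.0 * sin_u1 * sin_u2 / cos_sq_alpha
            if cos_sq_alpha else 0.0
        )
        c = f / 16.0 * cos_sq_alpha * (4.0 + f * (4.0 - 3.0 * cos_sq_alpha))
        lam_prev = lam
        lam = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (
                cos_2sm + c * cos_sigma * (-1.0 + 2.0 * cos_2sm ** 2)
            )
        )
        if abs(lam - lam_prev) < tol:
            break
    else:
        raise ValueError("Vincenty iteration did not converge")
    u_sq = cos_sq_alpha * (a * a - b * b) / (b * b)
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    d_sigma = big_b * sin_sigma * (
        cos_2sm + big_b / 4 * (
            cos_sigma * (-1 + 2 * cos_2sm ** 2)
            - big_b / 6 * cos_2sm
            * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)
        )
    )
    return b * big_a * (sigma - d_sigma)


def haversine_distance_m(lat1, lon1, lat2, lon2, radius_m=6371008.8):
    """Spherical distance in metres, shown only for comparison."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    h = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(h))
```

At 60° N one degree of longitude spans roughly 55.8 km on the ellipsoid, so a point 0.003584° to the east is about 200 m away. On a mean-radius sphere the same offset comes out more than 0.5 m shorter, which is enough to break the ±0.5 m tolerance in this AC.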
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
- `set_takeoff_origin` wall-clock ≤ 5 ms (single prior factor insertion + one FDR emission).
|
||||
- Bounded-delta check on every inbound GPS sample ≤ 1 ms (single Vincenty call + threshold compare).
|
||||
|
||||
**Reliability**
|
||||
- Idempotency at AC-4 prevents accidental double-seeding on composition-root retries.
|
||||
- Late-call rejection at AC-6 prevents bricking an in-flight smoother by mistake.
|
||||
- All four FDR record kinds are part of the AZ-272 schema; AZ-268's contract test gates schema drift.
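
The call-ordering rules behind these reliability points (idempotent repeat, conflict on a different value, rejection after the first VIO frame, plus the AC-7 / AC-8 input checks) can be sketched as a small guard. This is an assumption-heavy illustration: `OriginGuard`, the `sigma_m` parameter, the `source="operator"` label, and the order in which the checks run are hypothetical. Only the exception names and the `c5.cold_start_origin.set` kind appear in this spec.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


class EstimatorConfigError(ValueError):
    """Invalid or conflicting origin configuration."""


class EstimatorAlreadyStartedError(RuntimeError):
    """Origin supplied after the estimator began consuming frames."""


@dataclass(frozen=True)
class LatLonAlt:
    lat_deg: float
    lon_deg: float
    alt_m: float


class OriginGuard:
    def __init__(self) -> None:
        self._origin: Optional[Tuple[LatLonAlt, float]] = None
        self._started = False
        self.fdr: List[dict] = []  # stand-in for the FDR sink

    def set_takeoff_origin(self, origin: LatLonAlt, sigma_m: float) -> None:
        if self._started:
            raise EstimatorAlreadyStartedError("origin must be set before add_vio")
        if not (-90.0 <= origin.lat_deg <= 90.0 and -180.0 <= origin.lon_deg <= 180.0):
            raise EstimatorConfigError(f"origin out of bounds: {origin}")
        if sigma_m < 0.0:
            raise EstimatorConfigError(f"negative sigma: {sigma_m}")
        request = (origin, sigma_m)
        if self._origin is None:
            self._origin = request
            self.fdr.append({"kind": "c5.cold_start_origin.set", "source": "operator"})
        elif self._origin != request:
            # Name both values so the drift is visible in logs.
            raise EstimatorConfigError(f"conflict: {self._origin} vs {request}")
        # An identical repeat falls through: no second record, no exception.

    def add_vio(self) -> None:
        self._started = True  # real estimators would also ingest the frame
```

A composition-root retry that passes the same origin twice leaves exactly one FDR record, while a retry with a different value or a call after `add_vio` fails loudly instead of silently re-seeding.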
|
||||
|
||||
**Compatibility**
|
||||
- Both estimator impls (iSAM2 + ESKF) honour the same signature so the composition root can switch strategies without re-wiring F2.
|
||||
- The legacy FC EKF cold-start path stays as the secondary fallback (AC-13); AZ-419's original AC-3 still passes against the FC EKF path when no Manifest origin is present.
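
The "same signature on both impls" requirement, together with the `isinstance` check in AC-1, can be illustrated with a runtime-checkable Protocol. The names `TakeoffOriginCapable`, `FakeIsam2`, and `FakeEskf` are hypothetical placeholders, not the project's classes, and the `sigma_m` parameter is an assumed signature.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class TakeoffOriginCapable(Protocol):
    def set_takeoff_origin(self, origin: Any, sigma_m: float) -> None: ...


class FakeIsam2:
    def set_takeoff_origin(self, origin: Any, sigma_m: float) -> None:
        self.origin = origin  # a real impl would insert a prior factor


class FakeEskf:
    def set_takeoff_origin(self, origin: Any, sigma_m: float) -> None:
        self.origin = origin  # a real impl would seed the covariance block


def wire(estimator: TakeoffOriginCapable, origin: Any, sigma_m: float) -> None:
    """Composition-root style call that works for either strategy."""
    estimator.set_takeoff_origin(origin, sigma_m)
```

Note that `isinstance` against a runtime-checkable Protocol only verifies that the method exists, not that its parameters match, so a static type checker is still needed to catch signature drift between the two implementations.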
|
||||
|
||||
## Unit Tests
|
||||
|
||||
| AC Ref | What to Test | Required Outcome |
|--------|-------------|------------------|
| AC-1 | Protocol conformance both impls | `isinstance` True |
| AC-2 | iSAM2 set-origin seeds prior | Prior factor + sigmas; FDR record |
| AC-3 | ESKF set-origin seeds P | P block matches; ENU origin set; FDR record |
| AC-4 | Idempotent double-call | No second FDR; no exception |
| AC-5 | Conflict double-call | `EstimatorConfigError`; names both |
| AC-6 | Late call after add_vio | `EstimatorAlreadyStartedError` |
| AC-7 | Out-of-bounds lat | `EstimatorConfigError` |
| AC-8 | Negative sigma | `EstimatorConfigError` |
| AC-9 | Bounded-delta accept | `BOUNDED_DELTA_SOFT` label + FDR |
| AC-10 | Bounded-delta reject | Sample dropped + FDR reject |
| AC-11 | Threshold config override | AC-10 sample admitted at relaxed threshold |
| AC-12 | FDR schema round-trip | All 4 kinds serialise; AZ-268 covers |
| AC-13 | No origin → FC EKF path | Legacy path + new FDR record `source="fc_ekf"` |
| AC-14 | AZ-385's first 2 clauses unchanged | All AZ-385 tests pass unmodified |
| AC-15 | Vincenty at 60° N | Within ±0.5 m of ground truth |
|
||||
|
||||
## Constraints
|
||||
|
||||
- `set_takeoff_origin` is the ONLY supported pre-takeoff origin entrypoint; do NOT add `set_takeoff_origin_from_gps(...)` or similar convenience overloads — the FC EKF path stays purely inside the legacy first-frame logic.
|
||||
- The bounded-delta clause is the THIRD clause of the AZ-385 source-label state machine; do NOT replace AZ-385's existing consistency + dwell-time logic — just add the new clause and its FDR records.
|
||||
- No direct clock reads such as `datetime.now()` in this code path; pass a `Clock` Protocol per the project's established pattern (used by AZ-273, AZ-382).
|
||||
- The 200 m threshold is configurable but NOT operator-tunable from the GCS — it's a deploy-time config.
|
||||
- No "auto-correct origin" or "average origins" logic — operator origin set once at takeoff or not at all.
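
The injected-clock constraint above can be sketched as follows. The project's actual `Clock` Protocol is not shown in this spec, so the `now_ns` method name, `FakeClock`, and `stamp_record` are assumptions chosen for illustration.

```python
from typing import Protocol


class Clock(Protocol):
    def now_ns(self) -> int:
        """Current time in nanoseconds (method name is hypothetical)."""
        ...


class FakeClock:
    """Deterministic clock for tests; time moves only when advanced."""

    def __init__(self, start_ns: int = 0) -> None:
        self._t = start_ns

    def now_ns(self) -> int:
        return self._t

    def advance(self, dt_ns: int) -> None:
        self._t += dt_ns


def stamp_record(kind: str, clock: Clock) -> dict:
    # The timestamp comes from the injected clock, never from datetime.now().
    return {"kind": kind, "t_ns": clock.now_ns()}
```

Because the timestamp source is a parameter, a test can assert exact record times, and a replay harness can drive the estimator from recorded time without touching the code path.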
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: Operator origin from a stale/wrong flight plan poisons the cold start**
|
||||
- *Risk*: Mission Planner export drift, manual file edits, or a wrong `--flight-id` flag selects the wrong route.
|
||||
- *Mitigation*: AZ-323's Manifest carries `flight_id` + `takeoff_origin`, both hashed into `manifest_hash`; AZ-324 validates origin is inside the bbox; this task validates `LatLonAlt` bounds (AC-7) but does NOT re-validate against the Manifest's bbox — the Manifest-level check is sufficient. AC-5's conflict-on-double-call surfaces drift if the composition root somehow re-calls with a different value.
|
||||
|
||||
**Risk 2: Bounded-delta gate admits spoofed GPS that happens to be < 200 m off**
|
||||
- *Risk*: A sophisticated spoofer reads C5's smoother output and emits GPS within the ring.
|
||||
- *Mitigation*: Bounded-delta admission is `BOUNDED_DELTA_SOFT`, NOT `SATELLITE_ANCHORED`; AZ-385's other two clauses (consistency-with-VPR-anchors + dwell-time) remain authoritative for the final spoof-promotion decision. The bounded-delta channel is a soft constraint, not a hard reference.
|
||||
|
||||
**Risk 3: Composition root forgets to call `set_takeoff_origin` even though the Manifest carries one**
|
||||
- *Risk*: F2 wiring drift between AZ-381 and AZ-324 — the Manifest has the origin, but the estimator doesn't see it.
|
||||
- *Mitigation*: AZ-419's updated AC-1 / AC-4 (operator origin test cases) exercise the full F2 chain end-to-end; failure would be visible there. AC-13 here proves the no-origin path is still functional so this task does not break the legacy fallback during the transition.
|
||||
|
||||
## Runtime Completeness
|
||||
|
||||
- **Named capability**: pre-takeoff operator origin acceptance + mid-flight bounded-delta GPS gate (ADR-010 Principle #11 amended).
|
||||
- **Production code that must exist**: real `set_takeoff_origin` on both `GtsamIsam2StateEstimator` and `EskfStateEstimator`; real `WgsConverter` for ENU origin + Vincenty distance; real FDR record emission with the four new kinds; real config wiring via AZ-269.
|
||||
- **Allowed external stubs**: tests use `FakeFdrSink` (AZ-275), a fake `Clock`, and a fixed `WgsConverter` instance with a known ENU origin.
|
||||
- **Unacceptable substitutes**: a "set-origin" convenience that just sets a class attribute without seeding the prior factor (would silently no-op); haversine instead of Vincenty (loses precision at high lat → AC-15 fails); skipping the FDR records "because they're operational telemetry" (the schema contract test AZ-268 would still flag them as missing).
|
||||