mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 08:51:12 +00:00
Compare commits
6 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 9dc04cc677 | |
| | ade0c86f2b | |
| | 8c4be9ace0 | |
| | bfcac2cb9f | |
| | 0ed1a5d988 | |
| | 7eed4d6e76 | |
File diff suppressed because one or more lines are too long
+16
@@ -1,5 +1,21 @@
|
||||
# Derkachi e2e: wire EXISTING parent-suite satellite-provider into the operator pre-flight fixture
|
||||
|
||||
> **Status (2026-05-23)**: **CLOSED** — Phases 1+2 shipped (cycle 3); Phases 3–5 **superseded by Epic AZ-835** per the 2026-05-22 user directive (route-driven seeding instead of bbox).
>
> | Phase | Outcome |
> |-------|---------|
> | Phase 1 (e2e-runner wire + C11 contract adapt + smoke test) | **SHIPPED** — batch 104, 2026-05-21 |
> | Phase 2 (`seed_region.py` CLI + `bbox.yaml` + license attribution) | **SHIPPED** — between batches 104 and 106 |
> | Phase 3 (real `operator_pre_flight_setup` fixture) | **SUPERSEDED** → AZ-839 (Epic AZ-835 C3, 5 SP) — route-driven, not bbox |
> | Phase 4 (un-xfail AC-4 + AC-5) | **SUPERSEDED** → AZ-841 (Epic AZ-835 C5, 1 SP) |
> | Phase 5 (docs) | **SUPERSEDED** → AZ-842 (Epic AZ-835 C6, 2 SP) |
>
> Total credited to AZ-777: 8 SP (per the 2026-05-21 single-ticket-containment override; Phases 1+2 fit within that envelope). Remaining work (~11 SP including AZ-836 / AZ-838 already shipped) is tracked under Epic AZ-835 children.
>
> Spec preserved as historical reference. **Do not implement Phases 3–5 from this file** — see the Epic AZ-835 children instead.
>
> See also: `_docs/_process_leftovers/2026-05-21_az777_complexity_override.md` (decision log).
|
||||
|
||||
**Task**: AZ-777_derkachi_c6_reference_fixture
|
||||
**Name**: Drive the production C10/C11 pre-flight pipeline against the parent-suite `satellite-provider` .NET service ALREADY running in the Jetson e2e harness so the Derkachi clip produces a real FAISS-anchored C4/C5 satellite-fix loop end-to-end
|
||||
**Description**: The Jetson e2e harness already runs the real `satellite-provider` .NET 8 service (lineage AZ-688 / AZ-691 / AZ-692, services `satellite-provider` + `satellite-provider-postgres` in `docker-compose.test.jetson.yml`), but the e2e-runner still points its `SATELLITE_PROVIDER_URL` at the legacy `mock-sat` fixture and the placeholder `operator_pre_flight_setup` fixture never drives the C10/C11 pipeline. Compounding this, C11's `HttpTileDownloader` path constants (`_LIST_PATH=/api/satellite/tiles`, `_GET_PATH=/api/satellite/tiles/{tile_id}`) do not match the real satellite-provider API surface (`POST /api/satellite/tiles/inventory` for LIST, `GET /tiles/{z}/{x}/{y}` for tile fetch). This task wires the existing service into the e2e-runner, adapts C11 to the real contract, seeds the Derkachi-bbox tile catalog via `POST /api/satellite/request`, replaces the placeholder fixture with a real C10+C11 driver, and un-xfails the Tier-2 Derkachi + AZ-699 verdict tests.
|
||||
@@ -0,0 +1,85 @@
|
||||
# operator_pre_flight_setup real fixture (AZ-835 C3)
|
||||
|
||||
**Task**: AZ-839_operator_pre_flight_setup_real_fixture
|
||||
**Name**: operator_pre_flight_setup fixture: wire C1+C2+C11+C10 into real fixture, supersede AZ-777 Phase 3 (AZ-835 C3)
|
||||
**Description**: Third building block of Epic AZ-835. Replace the placeholder `operator_pre_flight_setup` fixture (currently a `mkdir` stub at `tests/e2e/replay/conftest.py` lines 293-310) with a real driver that wires C1 (AZ-836) + C2 (AZ-838) + C11 (AZ-777 Phase 1) + C10 to populate C6 from a tlog-derived route. Supersedes AZ-777 Phase 3 (the bbox-seeded placeholder-replacement design) per the 2026-05-22 user directive — route-driven seeding is ~100x more tile-efficient and pre-commits to where the operator did fly per the tlog.
|
||||
**Complexity**: 5 SP
|
||||
**Dependencies**: AZ-836 (C1, RouteSpec + extractor — In Testing); AZ-838 (C2, SatelliteProviderRouteClient + seed_route.py CLI — In Testing); AZ-777 Phase 1 (e2e-runner ↔ satellite-provider wire + C11 contract adaptation — done, batch 104); AZ-322 (C10 DescriptorBatcher — done); AZ-316+AZ-777 Phase 1 (C11 HttpTileDownloader.download_for_bbox — done); AZ-306 (FAISS sidecar triple-consistency — done); AZ-835 (parent Epic)
|
||||
**Component**: `tests/e2e/replay/conftest.py` (`operator_pre_flight_setup` fixture rewrite + new `PopulatedC6Cache` dataclass)
|
||||
**Tracker**: AZ-839 (https://denyspopov.atlassian.net/browse/AZ-839)
|
||||
**Parent Epic**: AZ-835
|
||||
|
||||
Jira AZ-839 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Public surface
|
||||
|
||||
```python
from dataclasses import dataclass
from pathlib import Path

from gps_denied_onboard.replay_input.tlog_route import RouteSpec  # AZ-836


@dataclass(frozen=True, slots=True)
class PopulatedC6Cache:
    cache_root: Path  # named-volume mount inside the e2e-runner container
    tile_store_path: Path  # postgres + filesystem store root
    faiss_index_path: Path  # .index file
    faiss_sidecar_sha256_path: Path  # .sha256 file
    faiss_sidecar_meta_path: Path  # .meta.json file
    route_spec: RouteSpec  # provenance — which tlog/route produced this cache
    tile_count: int  # how many tiles ended up in C6
    elapsed_seconds: float  # wall time, for the AC-1/AC-2 perf budget
```
|
||||
|
||||
The fixture remains a pytest fixture at `tests/e2e/replay/conftest.py::operator_pre_flight_setup`, same `session` scope as today. Input contract unchanged (same args the placeholder takes) plus a new dependency on `RouteSpec` — either fixture-injected or extracted from the test's tlog parameter via `extract_route_from_tlog`.
|
||||
|
||||
## Behaviour
|
||||
|
||||
1. Read the route spec (fixture-injected or extracted from test tlog via `extract_route_from_tlog`).
2. Instantiate `SatelliteProviderRouteClient` from env (`SATELLITE_PROVIDER_URL`, `SATELLITE_PROVIDER_API_KEY`, `SATELLITE_PROVIDER_TLS_INSECURE`).
3. Call `seed_route(route_spec)`. On `RouteValidationError` / `RouteTerminalFailureError` → re-raise with original cause. On `RouteTransientError` → retry up to 3 attempts using C11's `_DEFAULT_BACKOFF_SCHEDULE_S = (1, 2, 4, 8)`.
4. Enumerate tile coverage locally (mirror `route_client._enumerate_route_tile_coords` from AZ-838); call C11 `HttpTileDownloader.download_for_bbox` to pull every tile into C6.
5. Invoke C10 `DescriptorBatcher` against the populated C6 to build the FAISS HNSW index using the NetVLAD backbone (per `c2_vpr/config.py:67` default).
6. Verify sidecar coherence (`.index` + `.sha256` + `.meta.json` triple-consistency per AZ-306). Mismatch → `IndexUnavailableError`.
7. Yield `PopulatedC6Cache(...)`. On any failure path, clean up partial cache state (no half-built FAISS index left behind).
|
||||
|
||||
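The retry rule in step 3 can be sketched as below. This is a minimal illustration, not the fixture's actual code: `call_with_retry` and the local `RouteTransientError` class are hypothetical stand-ins, and only the 3-attempt limit and the `(1, 2, 4, 8)` schedule are taken from the spec.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class RouteTransientError(Exception):
    """Stand-in for the transient error named in step 3 (hypothetical definition)."""


def call_with_retry(
    fn: Callable[[], T],
    *,
    max_attempts: int = 3,
    backoff_s: Sequence[float] = (1, 2, 4, 8),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying only on RouteTransientError; other errors propagate at once."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RouteTransientError:
            if attempt == max_attempts:
                raise  # the final attempt's exception is propagated (AC-5)
            # Wait before the next attempt; clamp the index to the schedule length.
            sleep(backoff_s[min(attempt - 1, len(backoff_s) - 1)])
    raise AssertionError("unreachable")
```

Injecting `sleep` keeps unit tests fast: a stub can record the requested delays instead of waiting.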
**Mount strategy**: write into a named docker volume that survives across pytest sessions. Cold first invocation populates; subsequent invocations within the same compose session reuse (warm cache). Same pattern AZ-777 Phase 3 originally specced; only the cache **source** changes (route, not bbox).
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
| # | Criterion |
|---|-----------|
| AC-1 | Cold first invocation on the Derkachi tlog completes in ≤ 5 min on Tier-2 Jetson (includes satellite-provider Google Maps round-trips). |
| AC-2 | Warm invocation within the same compose session completes in ≤ 30 s (named-volume reuse). |
| AC-3 | Yielded `PopulatedC6Cache` has all paths populated; `tile_count > 0`; FAISS sidecar triple-consistency passes (AZ-306). |
| AC-4 | `RouteValidationError` / `RouteTerminalFailureError` from `seed_route` is re-raised with original cause; no silent swallow. |
| AC-5 | `RouteTransientError` is retried up to 3 attempts using C11's existing backoff schedule; final attempt's exception is propagated. |
| AC-6 | Tamper test — corrupt one of the three sidecar files between fixture runs; next invocation raises `IndexUnavailableError`. |
| AC-7 | On any failure path inside the fixture, partial state is cleaned up (no half-built FAISS index, no orphaned postgres rows). |
| AC-8 | Unit tests (stubbed `SatelliteProviderRouteClient` + stubbed C11 + stubbed C10) cover: happy path, transient-retry, terminal-failure, validation-error, tamper-detection, cleanup-on-failure. |
| AC-9 | Integration test gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2` against the Jetson harness produces a real `PopulatedC6Cache` from the Derkachi tlog. |
|
||||
|
||||
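One way to picture the triple-consistency check behind AC-3 and AC-6 is sketched below. The real AZ-306 rules are not reproduced in this file, so the layout here is an assumption: the `.sha256` file holds the hex digest of the `.index` bytes, and `.meta.json` repeats it under a hypothetical `index_sha256` key. `verify_sidecar_triple` and the local `IndexUnavailableError` class are illustrative names only.

```python
import hashlib
import json
from pathlib import Path


class IndexUnavailableError(Exception):
    """Stand-in for the error named in AC-6 (hypothetical definition)."""


def verify_sidecar_triple(index_path: Path, sha256_path: Path, meta_path: Path) -> None:
    """Raise IndexUnavailableError unless the three files agree (assumed layout)."""
    for path in (index_path, sha256_path, meta_path):
        if not path.is_file():
            raise IndexUnavailableError(f"missing sidecar file: {path}")

    actual = hashlib.sha256(index_path.read_bytes()).hexdigest()

    # Assumed .sha256 format: the hex digest is the first whitespace-separated token.
    tokens = sha256_path.read_text(encoding="utf-8", errors="replace").split()
    recorded = tokens[0].lower() if tokens else ""
    if recorded != actual:
        raise IndexUnavailableError(".sha256 does not match the .index contents")

    try:
        meta = json.loads(meta_path.read_text(encoding="utf-8", errors="replace"))
    except json.JSONDecodeError as exc:
        raise IndexUnavailableError(".meta.json is not valid JSON") from exc
    # "index_sha256" is a hypothetical key chosen for this sketch.
    if not isinstance(meta, dict) or meta.get("index_sha256") != actual:
        raise IndexUnavailableError(".meta.json does not match the .index contents")
```

A tamper test in the spirit of AC-6 would overwrite any one of the three files between runs and expect the error.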
## Out of scope
|
||||
|
||||
- Driving the airborne replay pipeline against the populated cache (AZ-840 / C4).
|
||||
- Un-xfailing the existing AZ-777 AC-4 / AC-5 tests (AZ-841 / C5).
|
||||
- Updating `replay_protocol.md` (AZ-842 / C6).
|
||||
- Switching the C2 default backbone away from NetVLAD.
|
||||
- Multi-tlog aggregate caches (one route per fixture invocation).
|
||||
|
||||
## Risks
|
||||
|
||||
**Risk 1 — Docker named-volume lifecycle across pytest sessions.** A crash during the first invocation may leave a half-populated volume, so the cleanup-on-failure path in step 7 must be robust. Mitigation: AC-7 covers this explicitly, backed by a `try/finally` around the four wiring steps.
|
||||
|
||||
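A possible shape for the cleanup-on-failure path is sketched below. It is a variation on the `try/finally` mitigation rather than the fixture's actual code: work happens in a sibling staging directory that is renamed into place only on success. `staged_cache_dir` and the `.partial` suffix are hypothetical, and the sketch assumes the cache lives in a subdirectory of the volume that does not exist yet on a cold run.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def staged_cache_dir(final_dir: Path) -> Iterator[Path]:
    """Build into a staging directory and publish it only if the body succeeds."""
    staging = final_dir.with_name(final_dir.name + ".partial")
    shutil.rmtree(staging, ignore_errors=True)  # clear leftovers from a crashed run
    staging.mkdir(parents=True)
    try:
        yield staging
        # Reached only when the body raised nothing: publish the finished cache.
        staging.rename(final_dir)
    finally:
        # After a failure the staging directory still exists and is removed here;
        # after a successful rename it is already gone and this is a no-op.
        shutil.rmtree(staging, ignore_errors=True)
```

Because the rename happens last, a warm-cache check only needs to test whether `final_dir` exists. Rows written to postgres are outside what this sketch cleans up.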
**Risk 2 — Cold-start budget (AC-1, 5 min) is tight on the first Jetson run.** Google Maps round-trips for ~50-100 tiles may exceed the budget on slow networks. Mitigation: instrument `elapsed_seconds` on every step and surface it in the verdict report; if AC-1 fails, file a perf-tuning ticket rather than skipping the AC.
|
||||
|
||||
## References
|
||||
|
||||
- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
|
||||
- Existing placeholder: `tests/e2e/replay/conftest.py` lines 293-310
|
||||
- C1: AZ-836 (`extract_route_from_tlog`) — https://denyspopov.atlassian.net/browse/AZ-836
|
||||
- C2: AZ-838 (`SatelliteProviderRouteClient`) — https://denyspopov.atlassian.net/browse/AZ-838
|
||||
- AZ-777 (Phase 3+ superseded): `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md`
|
||||
- C10 DescriptorBatcher: `src/gps_denied_onboard/components/c10_provisioning/descriptor_batcher.py`
|
||||
- C11 HttpTileDownloader: `src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py`
|
||||
- AZ-306 FAISS sidecar triple-consistency reference
|
||||
@@ -0,0 +1,75 @@
|
||||
# E2E orchestrator test (AZ-835 C4)
|
||||
|
||||
**Task**: AZ-840_e2e_orchestrator_test
|
||||
**Name**: E2E orchestrator test ingesting raw (tlog, video, calibration) and running steps 1-7 (AZ-835 C4)
|
||||
**Description**: Fourth building block of Epic AZ-835. A single pytest test that takes only `(tlog, video, calibration)` and runs the full 7-step pipeline end-to-end on the Jetson harness — without any operator hand-curation between steps. Extends or wraps the existing AZ-699 verdict test (`tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report`) so the verdict-report-writing path is preserved. This is the test that closes the Epic's narrative: "give it a tlog + video, and the system does everything else."
|
||||
**Complexity**: 3 SP
|
||||
**Dependencies**: AZ-839 (C3, `operator_pre_flight_setup` real fixture — HARD); AZ-836 (C1, RouteSpec — In Testing); AZ-838 (C2, SatelliteProviderRouteClient — In Testing); AZ-699 (real flight validation runner — done); AZ-405 (tlog/video auto-sync — done); AZ-702 (camera factory-sheet calibration — done); AZ-696 (≥ 80 % within 100 m threshold — done); AZ-835 (parent Epic)
|
||||
**Component**: `tests/e2e/replay/test_az835_e2e_real_flight.py` (new) OR extend `test_derkachi_real_tlog.py`
|
||||
**Tracker**: AZ-840 (https://denyspopov.atlassian.net/browse/AZ-840)
|
||||
**Parent Epic**: AZ-835
|
||||
|
||||
Jira AZ-840 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Inputs (test parameters)
|
||||
|
||||
- `tlog_path: Path` — raw ArduPilot tlog binary (Derkachi as the reference fixture; parametrize for future tlogs).
|
||||
- `video_path: Path` — raw flight video.
|
||||
- `calibration_path: Path` — camera factory-sheet calibration (AZ-702).
|
||||
|
||||
## Pipeline orchestration
|
||||
|
||||
The 7 steps from the Epic:
|
||||
|
||||
1. **Active flight cut + tlog/video sync** — call AZ-405's `tlog_video_adapter`. If active-segment detection needs a small extension, file as an in-scope sub-fix; if it needs a meaningful new feature, STOP and propose a sibling ticket.
2. **On-fly frame + IMU extraction** — `VideoFileFrameSource` + `TlogReplayFcAdapter`. No change.
3. **Auto-create route** — call AZ-836's `extract_route_from_tlog(tlog, max_waypoints=10)`. Assert the returned `RouteSpec` materially follows the tlog trajectory.
4. **POST route to satellite-provider** — delegate to AZ-839 (C3) fixture `operator_pre_flight_setup` (which itself calls AZ-838's `SatelliteProviderRouteClient.seed_route`). The fixture's `PopulatedC6Cache` is the dependency boundary.
5. **Build FAISS index** — driven by C3 fixture as part of populating the cache.
6. **Run gps-denied airborne pipeline** — invoke the `gps-denied-replay` console-script or equivalent direct-call entry point against the populated cache + tlog/video/calibration. Reuse the airborne composition root path AZ-699 exercises today.
7. **Get GPS fixes, check vs tlog GPS** — call `helpers/accuracy_report.py` + `helpers/gps_compare.py` to compute the horizontal-error distribution and emit the verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md`.
|
||||
|
||||
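The comparison in step 7 can be illustrated with a small sketch. It does not reproduce the API of `helpers/accuracy_report.py` or `helpers/gps_compare.py`; `haversine_m` and `fraction_within` are hypothetical helpers, and the great-circle distance on a spherical Earth is an assumed stand-in for whatever horizontal-error metric the project uses.

```python
import math
from typing import Sequence, Tuple

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS-84 points (spherical model)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def fraction_within(
    estimated: Sequence[LatLon], truth: Sequence[LatLon], threshold_m: float = 100.0
) -> float:
    """Fraction of ticks whose horizontal error is at most threshold_m."""
    if not estimated or len(estimated) != len(truth):
        raise ValueError("need equally sized, non-empty tick sequences")
    hits = sum(
        1
        for (elat, elon), (tlat, tlon) in zip(estimated, truth)
        if haversine_m(elat, elon, tlat, tlon) <= threshold_m
    )
    return hits / len(estimated)
```

Under the AZ-696 threshold quoted in this spec, a run would pass when `fraction_within(...) >= 0.8` at `threshold_m=100.0`.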
## Test gating
|
||||
|
||||
- `@pytest.mark.tier2`.
|
||||
- Skip-unless-env(`RUN_REPLAY_E2E=1`) with an explicit skip reason that names the missing env var — no silent skip.
|
||||
|
||||
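The gating rule could be expressed with a small helper along these lines. This is a sketch under assumptions: `replay_e2e_skip_reason` is a hypothetical name, and the exact wording of the skip reason is illustrative.

```python
import os
from typing import Mapping, Optional


def replay_e2e_skip_reason(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return None when the Tier-2 replay test may run, else an explicit skip reason."""
    if env.get("RUN_REPLAY_E2E") == "1":
        return None
    # The reason names the missing variable so the skip is never silent.
    return "RUN_REPLAY_E2E=1 is not set; skipping Tier-2 replay e2e test"
```

In a test module, the result could feed `pytest.mark.skipif(reason is not None, reason=reason or "")` alongside `@pytest.mark.tier2`, so the skip reason appears in the pytest summary.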
## Verdict report
|
||||
|
||||
Emit ALWAYS, even on FAIL. The success criterion for AC-1 is that the report exists and the distribution is honest — NOT that the verdict is PASS.
|
||||
|
||||
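The emit-always rule can be sketched with a `try/finally` that writes the report whether or not the pipeline raised. Everything named here is hypothetical: `run_and_always_report`, its parameters, and the report layout are illustrative and do not reproduce the real verdict writer.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Optional


def run_and_always_report(
    run_pipeline: Callable[[], Iterable[float]],
    report_path: Path,
    threshold_m: float = 100.0,
    required_fraction: float = 0.8,
) -> bool:
    """Run the pipeline, write a report in every case, return True on PASS."""
    errors_m: List[float] = []
    failure: Optional[str] = None
    try:
        errors_m = [float(e) for e in run_pipeline()]
    except Exception as exc:
        failure = f"{type(exc).__name__}: {exc}"
        raise  # fail loud; the finally block still writes the report first
    finally:
        within = sum(1 for e in errors_m if e <= threshold_m)
        fraction = within / len(errors_m) if errors_m else 0.0
        passed = failure is None and bool(errors_m) and fraction >= required_fraction
        verdict = "PASS" if passed else "FAIL"
        lines = [
            "# Real flight validation",
            f"Verdict: {verdict}",
            f"Ticks: {len(errors_m)}",
            f"Within {threshold_m:g} m: {fraction:.1%}",
        ]
        if failure is not None:
            lines.append(f"Pipeline error: {failure}")
        report_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return verdict == "PASS"
```

The report is written before the exception leaves the function, so a mid-pipeline failure still produces a FAIL report.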
## Acceptance criteria
|
||||
|
||||
| # | Criterion |
|---|-----------|
| AC-1 | Test takes only `(tlog, video, calibration)` and runs steps 1-7 end-to-end on Tier-2 Jetson. No operator hand-curation between steps. |
| AC-2 | Test produces the AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` with the honest horizontal-error distribution, REGARDLESS of PASS/FAIL on the AZ-696 AC-3 threshold (≥ 80 % within 100 m). |
| AC-3 | Test reuses the C3 fixture's `operator_pre_flight_setup` for steps 3-5; no duplicate seeding/downloading logic. |
| AC-4 | Test runs to completion within 15 min wall time on the Derkachi clip (soft target for first delivery; hard NFR set after first measurement is recorded in the report). |
| AC-5 | Mid-pipeline failure (e.g. step 4 satellite-provider rejection, step 5 FAISS sidecar mismatch) fails LOUD with a clear error pointing at the failing step. No silent skip past a failing step. |
| AC-6 | Test is gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`; explicit skip reason names the missing env var. |
| AC-7 | The existing AZ-699 verdict test continues to pass (this test does not break or supersede it; either it lives alongside, or AZ-699 is folded into this test with the verdict-writing path preserved). |
| AC-8 | Unit tests cover the orchestration helper layer (parameter validation, error propagation between steps). The end-to-end happy path is the Jetson integration test. |
|
||||
|
||||
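The fail-loud requirement in AC-5 could be met by an orchestration helper shaped roughly like this. The names `PipelineStepError` and `run_steps` are hypothetical, and passing a shared `dict` between steps is an assumption made only to keep the sketch short.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

Step = Tuple[str, Callable[[Dict[str, Any]], None]]


class PipelineStepError(RuntimeError):
    """Raised when a pipeline step fails; records which step it was."""

    def __init__(self, step_no: int, step_name: str) -> None:
        super().__init__(f"step {step_no} ({step_name}) failed")
        self.step_no = step_no
        self.step_name = step_name


def run_steps(steps: Sequence[Step]) -> Dict[str, Any]:
    """Run steps in order, stopping at the first failure and naming it."""
    context: Dict[str, Any] = {}
    for step_no, (name, fn) in enumerate(steps, start=1):
        try:
            fn(context)
        except Exception as exc:
            # Chain the original exception so the root cause stays visible.
            raise PipelineStepError(step_no, name) from exc
    return context
```

Later steps never run after a failure, and `__cause__` on the raised error carries the original exception for the test output.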
## Out of scope
|
||||
|
||||
- Un-xfailing the AZ-777 AC-4 / AC-5 tests (AZ-841 / C5).
|
||||
- Documentation updates beyond the test file's own docstring (AZ-842 / C6).
|
||||
- Real-time tlog ingestion (one finished `.tlog` per test invocation).
|
||||
- Multi-flight aggregate validation.
|
||||
- Performance optimization beyond the AC-4 soft target.
|
||||
- Modifying the airborne composition root.
|
||||
|
||||
## Risks
|
||||
|
||||
**Risk 1 — Integration glue between AZ-405 tlog/video sync and the airborne pipeline's frame-source contract.** The auto-sync adapter and the airborne composition root were authored in different cycles; small impedance mismatches are likely. Mitigation: if the glue exceeds the 3 SP budget, STOP and propose a sub-ticket rather than expanding scope.
|
||||
|
||||
**Risk 2 — Step 1 active-segment detection may need extension.** AZ-405 covered tlog↔video sync; take-off/landing boundary detection may not be implemented. Mitigation: file an in-scope sub-fix if small; STOP and propose a sibling ticket if not.
|
||||
|
||||
## References
|
||||
|
||||
- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
|
||||
- Hard dep (C3 fixture): AZ-839 — https://denyspopov.atlassian.net/browse/AZ-839
|
||||
- Existing verdict test: `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report`
|
||||
- Tlog/video adapter: `src/gps_denied_onboard/replay_input/tlog_video_adapter.py` (AZ-405)
|
||||
- Helpers: `src/gps_denied_onboard/helpers/accuracy_report.py`, `src/gps_denied_onboard/helpers/gps_compare.py`
|
||||
@@ -0,0 +1,57 @@
|
||||
# Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests (AZ-835 C5)
|
||||
|
||||
**Task**: AZ-841_unxfail_az777_tier2_tests
|
||||
**Name**: Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests once C3 fixture + C4 orchestrator land (AZ-835 C5)
|
||||
**Description**: Fifth building block of Epic AZ-835. Once C3 (AZ-839, `operator_pre_flight_setup` real fixture) and C4 (AZ-840, e2e orchestrator test) land, remove the `@pytest.mark.xfail` markers from the AZ-777 Tier-2 tests. The verdict — PASS or FAIL — becomes the honest signal. Both tests remain gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`.
|
||||
**Complexity**: 1 SP
|
||||
**Dependencies**: AZ-839 (C3, `operator_pre_flight_setup` real fixture — HARD); AZ-840 (C4, e2e orchestrator test — HARD); AZ-777 (being closed/superseded by this Epic; tests live in same file tree); AZ-835 (parent Epic)
|
||||
**Component**: `tests/e2e/replay/test_derkachi_1min.py` (xfail removal) + `tests/e2e/replay/test_derkachi_real_tlog.py` (xfail removal)
|
||||
**Tracker**: AZ-841 (https://denyspopov.atlassian.net/browse/AZ-841)
|
||||
**Parent Epic**: AZ-835
|
||||
|
||||
Jira AZ-841 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Targets
|
||||
|
||||
1. `tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks` (AZ-777 AC-4) — remove `@pytest.mark.xfail`; verify `@pytest.mark.tier2` + `RUN_REPLAY_E2E` gating stays in place.
|
||||
2. `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report` (AZ-777 AC-5) — remove `@pytest.mark.xfail`; verify gating stays in place.
|
||||
|
||||
## Verification
|
||||
|
||||
**On Tier-2 Jetson** (`RUN_REPLAY_E2E=1`):
|
||||
- `test_ac3_within_100m_80pct_of_ticks` PASSES (≥ 80 % of ticks within 100 m of ground truth, log lines `replay.satellite_anchor_inserted` visible).
|
||||
- `test_az699_real_flight_validation_emits_verdict_and_report` runs to completion within 15 min and emits `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` with the honest distribution. PASS preferred but NOT required for AC-4 — emitting the honest report IS the success criterion.
|
||||
|
||||
**Locally** (no env):
|
||||
- Both tests skip explicitly with a reason naming `RUN_REPLAY_E2E` — they MUST NOT pass as a side effect of being skipped.
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
| # | Criterion |
|---|-----------|
| AC-1 | `@pytest.mark.xfail` removed from both AZ-777 tests. |
| AC-2 | Both tests still gated by `@pytest.mark.tier2` + skip-unless-env(`RUN_REPLAY_E2E=1`). Skip reason names the missing env. |
| AC-3 | On Jetson with `RUN_REPLAY_E2E=1`, `test_ac3_within_100m_80pct_of_ticks` PASSES (≥ 80 % within 100 m). |
| AC-4 | On Jetson with `RUN_REPLAY_E2E=1`, `test_az699_real_flight_validation_emits_verdict_and_report` completes within 15 min and emits the verdict report. PASS preferred but not required for AC-4. |
| AC-5 | If either test FAILS on the metric (e.g. only 60 % within 100 m), the test reports FAIL honestly — no fallback to xfail or skip. Failure mode is a feature, not a bug. |
| AC-6 | Locally on a machine without `RUN_REPLAY_E2E`, both tests skip with an explicit reason. |
|
||||
|
||||
## Out of scope
|
||||
|
||||
- Modifying the airborne pipeline to improve metric performance (separate optimization tickets if AC-3 fails).
|
||||
- Adding new test cases (this ticket only removes xfail; new cases belong to other tickets).
|
||||
- Documentation updates (AZ-842 / C6).
|
||||
- Modifying the verdict thresholds (AZ-696).
|
||||
|
||||
## Risks
|
||||
|
||||
**Risk 1 — Un-xfailed tests may FAIL on the metric.** If the horizontal-error distribution comes in worse than the 80 % @ 100 m gate, the affected test reports FAIL. That outcome is in scope for AC-5 (report honestly) and out of scope for this ticket's fix (file a separate optimization ticket).
|
||||
|
||||
## References
|
||||
|
||||
- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
|
||||
- Hard deps: AZ-839 (C3), AZ-840 (C4)
|
||||
- Tests: `tests/e2e/replay/test_derkachi_1min.py`, `tests/e2e/replay/test_derkachi_real_tlog.py`
|
||||
- AZ-777 spec: `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` (post-closure)
|
||||
- Threshold spec: AZ-696 (≥ 80 % within 100 m)
|
||||
- Verdict writer: `src/gps_denied_onboard/helpers/accuracy_report.py`
|
||||
@@ -0,0 +1,65 @@
|
||||
# Docs: replay_protocol.md + architecture.md + orchestrator-test README (AZ-835 C6)
|
||||
|
||||
**Task**: AZ-842_replay_protocol_and_orchestrator_docs
|
||||
**Name**: Docs: replay_protocol.md Invariant 12 + AZ-777 Phase 3+ superseded note + orchestrator-test README (AZ-835 C6)
|
||||
**Description**: Sixth and final building block of Epic AZ-835. Capture the route-driven flow in the authoritative documents so future implementers, operators, and reviewers understand what changed and why.
|
||||
**Complexity**: 2 SP
|
||||
**Dependencies**: AZ-841 (C5, un-xfail — SOFT; README describes test outcomes assuming C5 has landed); AZ-777 (being closed/superseded by this Epic — AZ-777 spec is updated during the AZ-777 closure step, verified by AC-6); AZ-835 (parent Epic)
|
||||
**Component**: `_docs/02_document/contracts/replay/replay_protocol.md` + `_docs/02_document/architecture.md` + `tests/e2e/replay/README*.md`
|
||||
**Tracker**: AZ-842 (https://denyspopov.atlassian.net/browse/AZ-842)
|
||||
**Parent Epic**: AZ-835
|
||||
|
||||
Jira AZ-842 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Modified files
|
||||
|
||||
### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension
|
||||
|
||||
Extend **Invariant 12** with an AZ-835 sub-section describing:
|
||||
|
||||
- The route-driven `operator_pre_flight_setup` fixture (AZ-839 / C3) flow: tlog → `RouteSpec` → `POST /api/satellite/route` → tile download → FAISS build → yield `PopulatedC6Cache`.
|
||||
- Why route-driven supersedes the AZ-777 bbox approach (efficiency: ~100× fewer tiles; honesty: pre-commits to where the operator did fly).
|
||||
- The C3 fixture's failure-handling contract (validation/terminal → re-raise; transient → retry up to 3 attempts using C11's existing backoff schedule).
|
||||
|
||||
### 2. `_docs/02_document/architecture.md` — satellite-provider entry extension
|
||||
|
||||
Append a sub-section to the existing satellite-provider entry noting that Epic AZ-835 + its C1-C5 children landed the full e2e real-flight validation path on top of AZ-777 Phase 1's wire + C11 contract adaptation. Mark AZ-777 Phase 3+ as superseded by Epic AZ-835 (pointer-only — the AZ-777 spec itself is updated in C5's wake during the AZ-777 closure step).
|
||||
|
||||
### 3. `tests/e2e/replay/README*.md` — orchestrator-test README
|
||||
|
||||
Either extend `tests/e2e/replay/README.md` or create a dedicated `tests/e2e/replay/README_AZ835.md` (prefer dedicated file if the existing README is already long). Short operator-facing content:
|
||||
|
||||
- How to run the new orchestrator test locally (env vars, Jetson SSH alias, expected runtime).
|
||||
- What `(tlog, video, calibration)` triple to provide and where the reference Derkachi fixture lives.
|
||||
- Where the verdict report is written and how to interpret it (PASS/FAIL on AZ-696 AC-3 threshold).
|
||||
- Imagery-source caveat: Google Maps satellite (dev/research use only; production needs CC-BY migration on the satellite-provider side).
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
| # | Criterion |
|---|-----------|
| AC-1 | `replay_protocol.md` Invariant 12 has a new AZ-835 sub-section covering the route-driven flow, the bbox-supersedure rationale, and the failure-handling contract. |
| AC-2 | `architecture.md` satellite-provider entry has a sub-section noting Epic AZ-835's contribution and pointing at AZ-777 Phase 3+ as superseded. |
| AC-3 | `tests/e2e/replay/README*.md` exists and a new contributor can run the orchestrator test on Jetson using only the README's instructions (no out-of-band knowledge required). |
| AC-4 | All three docs link to the Epic (AZ-835) and to the relevant child tickets (AZ-836 / AZ-838 / AZ-839 / AZ-840 / AZ-841). |
| AC-5 | License attribution string ("Imagery © Google") and the dev-only caveat are present in the test README. |
| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children. |
|
||||
|
||||
## Out of scope
|
||||
|
||||
- Updating consumer-facing API/contract docs in `../satellite-provider/` (parent-suite owns those).
|
||||
- Migrating imagery source to a CC-BY provider (parent-suite, out of scope for this Epic).
|
||||
- Writing additional tutorials beyond the orchestrator-test README.
|
||||
- ADR creation — no new architectural decision; this Epic implements existing decisions.
|
||||
|
||||
## Risks
|
||||
|
||||
**Risk 1 — Scope creep into reformatting unrelated doc sections.** Resist; this ticket only adds what AC-1..AC-5 require.
|
||||
|
||||
## References
|
||||
|
||||
- Parent Epic: AZ-835 — https://denyspopov.atlassian.net/browse/AZ-835
|
||||
- Replay protocol: `_docs/02_document/contracts/replay/replay_protocol.md` Invariant 12
|
||||
- Architecture: `_docs/02_document/architecture.md` (satellite-provider section)
|
||||
- Tests directory: `tests/e2e/replay/`
|
||||
- AZ-777 spec (being superseded): `_docs/02_tasks/done/AZ-777_derkachi_c6_reference_fixture.md` (post-closure)
|
||||
@@ -0,0 +1,69 @@
|
||||
# Relocate RouteSpec DTO to _types/route.py (AZ-507 rule 9 fix)
|
||||
|
||||
**Task**: AZ-845_refactor_relocate_routespec
|
||||
**Name**: Relocate `RouteSpec` from `replay_input/tlog_route.py` to `_types/route.py`
|
||||
**Description**: Resolve cycle-3 cumulative review F1 (High Architecture). Move the `RouteSpec` cross-component DTO to `_types/route.py` so the `c11_tile_manager.route_client` import becomes rule-9 compliant. Producer-side keeps backward-compat re-export so test imports do not break.
|
||||
**Complexity**: 2 SP
|
||||
**Dependencies**: None (anchor task of refactor run 02-az507-routespec-relocation)
|
||||
**Component**: `_types/` (new file `route.py`); `replay_input/` (`tlog_route.py`, `__init__.py` modify); `components/c11_tile_manager/` (`route_client.py` modify)
|
||||
**Tracker**: AZ-845 (https://denyspopov.atlassian.net/browse/AZ-845)
|
||||
**Parent Epic**: AZ-844 (Refactor 02 — RouteSpec relocation + module-layout refresh + AZ-270 lint widening)
|
||||
|
||||
Jira AZ-845 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Problem
|
||||
|
||||
`src/gps_denied_onboard/components/c11_tile_manager/route_client.py:56` imports `RouteSpec` from `gps_denied_onboard.replay_input.tlog_route`, violating `module-layout.md` rule 9 (AZ-507 cross-component contract surface). Per the rule, `components/<X>/*.py` may only import from `_types/*`, `_types.inference_errors`, `helpers/*`, `config`, `logging`, `fdr_client`, `clock`, `frame_source` (interface only), and its own subpackage. `replay_input` is not in this allow-list. Every other cross-component DTO already lives under `_types/*` (`_types/geo.py`, `_types/tile.py`, `_types/inference.py`, etc.); `RouteSpec` is the asymmetric outlier.
|
||||
|
||||
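A rule-9 check of the kind described above can be approximated with the standard `ast` module. This is a rough sketch, not the project's audit script: `rule9_violations` is a hypothetical name, and the dotted prefixes assume every allow-listed module sits directly under the `gps_denied_onboard` package. The "interface only" restriction on `frame_source` is not modelled.

```python
import ast
from typing import List, Tuple

ROOT = "gps_denied_onboard"
# Allow-list from rule 9, mapped to dotted paths (the mapping is an assumption).
ALLOWED_PREFIXES = tuple(
    f"{ROOT}.{name}"
    for name in ("_types", "helpers", "config", "logging", "fdr_client", "clock", "frame_source")
)


def rule9_violations(source: str, own_package: str) -> List[Tuple[int, str]]:
    """Return (line, imported name) pairs that fall outside the allow-list."""
    allowed = ALLOWED_PREFIXES + (own_package,)
    violations: List[Tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        elif isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        else:
            continue  # relative imports are treated as own-subpackage imports
        for name in names:
            if name != ROOT and not name.startswith(ROOT + "."):
                continue  # stdlib and third-party imports are not checked here
            if not any(name == p or name.startswith(p + ".") for p in allowed):
                violations.append((node.lineno, name))
    return sorted(violations)
```

Run against the import quoted in the Problem statement, this sketch flags `gps_denied_onboard.replay_input.tlog_route.RouteSpec`, while the same import from `gps_denied_onboard._types.route` passes.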
## Outcome
|
||||
|
||||
- `RouteSpec` is defined in `src/gps_denied_onboard/_types/route.py` (frozen+slots dataclass; full docstring carried over verbatim).
|
||||
- `c11_tile_manager/route_client.py:56` imports `RouteSpec` from `gps_denied_onboard._types.route`.
|
||||
- `replay_input/tlog_route.py` continues to use `RouteSpec` internally (extractor return type) by importing from `_types.route`; keeps `RouteSpec` in `__all__` for backward-compat re-export.
|
||||
- `replay_input/__init__.py` re-exports `RouteSpec` from `_types.route` directly.
|
||||
- All existing tests pass at HEAD.
|
||||
- Rule-9 audit reports zero violations after the move.
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- New file: `src/gps_denied_onboard/_types/route.py` with `RouteSpec` dataclass.
|
||||
- Modify `src/gps_denied_onboard/replay_input/tlog_route.py` (remove local definition, add import).
|
||||
- Modify `src/gps_denied_onboard/replay_input/__init__.py` (re-export from `_types.route`).
|
||||
- Modify `src/gps_denied_onboard/components/c11_tile_manager/route_client.py:56` (the rule-9 fix) plus the docstring snippet at file-top that names the source module.
|
||||
- Optional hygiene: update 5 test files that import `RouteSpec` from `replay_input.tlog_route` directly (`tests/unit/replay_input/test_tlog_route.py`, `tests/unit/c11_tile_manager/test_route_client.py`, `tests/e2e/replay/_operator_pre_flight.py`, `tests/e2e/replay/test_e2e_orchestrator_unit.py`, `tests/e2e/replay/test_operator_pre_flight_driver.py`) to import from `_types.route` for symmetry.
|
||||
|
||||
### Excluded
|
||||
|
||||
- `RouteExtractionError` does NOT relocate — it is a `replay_input/`-specific error not imported by any `components/<X>/*.py` file.
|
||||
- `extract_route_from_tlog` does NOT relocate — extraction logic is a `replay_input/` concern; only the DTO moves.
|
||||
- No contract document at `_docs/02_document/contracts/shared_types/route.md`.
|
||||
- No behaviour, performance, or contract-shape changes.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
| # | Criterion |
|---|-----------|
| AC-1 | `_types/route.py` contains `RouteSpec` with `@dataclass(frozen=True, slots=True)`, identical fields to the original (`waypoints`, `suggested_region_size_meters`, `source_tlog`, `source_segment`, `total_distance_meters`), and the full original docstring. |
| AC-2 | `route_client.py:56` reads `from gps_denied_onboard._types.route import RouteSpec`; rule-9 audit reports zero violations across `components/**/*.py`. |
| AC-3 | `replay_input/tlog_route.py` imports `RouteSpec` from `_types.route`; `extract_route_from_tlog` returns `RouteSpec`; `RouteSpec` is in `__all__` so `from replay_input.tlog_route import RouteSpec` resolves via re-export. |
| AC-4 | `from gps_denied_onboard.replay_input import RouteSpec` resolves to the same class object as `_types.route.RouteSpec` (verified by `is` identity check in test). |
| AC-5 | `pytest tests/unit/replay_input/test_tlog_route.py tests/unit/c11_tile_manager/test_route_client.py` passes — no failures, no skipped tests beyond pre-existing skips. |
|
||||
|
||||
## Constraints
|
||||
|
||||
- `RouteSpec` MUST remain `frozen=True, slots=True` (AZ-355 AC-2).
|
||||
- `RouteSpec.__module__` MAY change to `gps_denied_onboard._types.route` (intended observable change; no test asserts on it).
|
||||
- `from gps_denied_onboard.replay_input import RouteSpec` MUST keep working.
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1 — pickle / serialization break**: confirmed by grep — no `pickle.dumps(route)` exists in `src/` or `tests/`. Risk does not materialize.
|
||||
|
||||
**Risk 2 — hidden import grep missed**: producer-side keeps `RouteSpec` in its namespace via re-import + `__all__`; lazy importers using the old path resolve correctly.
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
- This is the anchor of refactor run 02-az507-routespec-relocation. AZ-846 (module-layout refresh) and AZ-847 (lint widening) are blocked by this task (Jira "Blocks" links recorded).
|
||||
- After this task lands, run the rule-9 audit script (the widened lint from AZ-847 once it lands) to confirm zero violations.
|
||||
@@ -0,0 +1,60 @@
|
||||
# Refresh module-layout.md cycle-3 entries (c11 + replay_input + _types/route)
|
||||
|
||||
**Task**: AZ-846_refactor_module_layout_cycle3
|
||||
**Name**: Refresh `module-layout.md` for cycle-3 file additions and the new `_types/route.py`
|
||||
**Description**: Resolve cycle-3 cumulative review F2 (Medium Architecture). Update the c11_tile_manager Internal list, the shared/replay_input file list, and the `_types/` section in `module-layout.md` so they match on-disk reality. Cycle-2 carry-overs OUTSIDE these three sections are explicitly out of scope (deferred to a separate doc task).
|
||||
**Complexity**: 2 SP
|
||||
**Dependencies**: AZ-845 (the new `_types/route.py` file must exist before this task can register it)
|
||||
**Component**: `_docs/02_document/module-layout.md` (single file)
|
||||
**Tracker**: AZ-846 (https://denyspopov.atlassian.net/browse/AZ-846)
|
||||
**Parent Epic**: AZ-844 (Refactor 02 — RouteSpec relocation + module-layout refresh + AZ-270 lint widening)
|
||||
|
||||
Jira AZ-846 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Problem
|
||||
|
||||
`module-layout.md` is stale. Cycle-3 cumulative review F2 documents:
|
||||
|
||||
- **c11_tile_manager Internal list** lists 2 files (`satellite_provider_downloader.py`, `satellite_provider_uploader.py`); on-disk has 8 internal files plus `route_client.py` (cycle-3 NEW from batch 107).
|
||||
- **shared/replay_input file list** is missing `errors.py` (cycle-2 carry), `tlog_ground_truth.py` (cycle-2 carry), `tlog_route.py` (cycle-3 NEW from batch 106).
|
||||
- **`_types/` file list** does not yet include `route.py` (added by AZ-845).
|
||||
|
||||
`/implement` Step 4 (File Ownership) treats `module-layout.md` as authoritative; staleness BLOCKS any future task touching unregistered areas. F2 is currently Medium; severity escalates to High if a fourth consecutive cycle leaves it stale.
|
||||
|
||||
## Outcome
|
||||
|
||||
- c11_tile_manager Internal list registers all 8 internals + `route_client.py`.
|
||||
- shared/replay_input file list registers `errors.py`, `tlog_ground_truth.py`, `tlog_route.py`.
|
||||
- `_types/` section registers `route.py` with a one-line description matching the convention of other `_types/*.py` entries.
|
||||
- `git diff` shows additions only to those three sections — no other section, rule, or rule-text edit.
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- Append cycle-3 + relevant cycle-2-carry entries to the c11_tile_manager Internal list, the shared/replay_input file list, and the `_types/` section.
|
||||
|
||||
### Excluded
|
||||
|
||||
- **Cycle-2 carry-overs OUTSIDE these sections**: `replay_api/` Per-Component Mapping entry, `cli/render_map.py`, `cli/replay_api_entrypoint.py`, `helpers/gps_compare.py`, `helpers/accuracy_report.py`. These are recorded in the cycle-3 retrospective and require a separate follow-up doc task with its own AZ ID.
|
||||
- No code changes.
|
||||
- No changes to `module-layout.md` rule numbering or rule text. Only the per-section file inventories are updated.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
| # | Criterion |
|---|-----------|
| AC-1 | c11_tile_manager Internal list contains all 8 existing internals (`_types.py`, `config.py`, `errors.py`, `idempotent_retry.py`, `signing_key.py`, `tile_downloader.py`, `tile_uploader.py`) plus `route_client.py`, alphabetised. |
| AC-2 | shared/replay_input file list adds `errors.py`, `tlog_ground_truth.py`, `tlog_route.py` with one-line descriptions matching the existing convention. |
| AC-3 | `_types/` section includes `route.py` with a one-line description of `RouteSpec` (waypoints + region size + source tlog provenance), identifying its producer (`replay_input/tlog_route.py`) and consumer (`c11/route_client.py`). |
| AC-4 | Diff of `module-layout.md` shows edits to ONLY the three named sections; no edits to other sections, rule numbering, or rule text. |
|
||||
|
||||
## Constraints
|
||||
|
||||
- Single file modified: `_docs/02_document/module-layout.md`.
|
||||
- No tests required — documentation update.
|
||||
- Scope discipline: cycle-2 doc carry-overs outside the three sections remain deferred.
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1 — scope creep into cycle-2 carry-overs**: the Excluded list is explicit; Phase-4 implementer reviews the diff against ACs and rejects entries outside the three named sections at review.
|
||||
@@ -0,0 +1,61 @@
|
||||
# Widen test_az270_compose_root lint to enforce full rule-9 allow-list
|
||||
|
||||
**Task**: AZ-847_refactor_az270_lint_widening
|
||||
**Name**: Widen `test_ac6_only_compose_root_imports_concrete_strategies` to enforce the full rule-9 allow-list
|
||||
**Description**: Resolve cycle-3 cumulative review F3 (Medium Maintainability). Replace the AZ-270 lint's narrow `components → components` check with a full rule-9 allow-list check, so any future cross-component drift is caught at lint time rather than at cumulative-review time. Strict superset of the existing AC-6 check.
|
||||
**Complexity**: 2 SP
|
||||
**Dependencies**: AZ-845 (the widened lint must see a clean codebase to pass; running it against pre-AZ-845 HEAD is what AC-4 demonstrates as a one-time verification)
|
||||
**Component**: `tests/unit/test_az270_compose_root.py` (single file)
|
||||
**Tracker**: AZ-847 (https://denyspopov.atlassian.net/browse/AZ-847)
|
||||
**Parent Epic**: AZ-844 (Refactor 02 — RouteSpec relocation + module-layout refresh + AZ-270 lint widening)
|
||||
|
||||
Jira AZ-847 is the authoritative spec; this file is the in-workspace mirror.
|
||||
|
||||
## Problem
|
||||
|
||||
`tests/unit/test_az270_compose_root.py:194-219` (`test_ac6_only_compose_root_imports_concrete_strategies`) walks `src/gps_denied_onboard/components/**/*.py` and flags only edges whose `node.module` starts with `gps_denied_onboard.components.` AND whose leaf-component is not the importer's component. The full rule-9 allow-list (8 prefixes plus `frame_source` interface-only restriction) is NOT enforced. Imports from `replay_input`, `replay_api`, `runtime_root`, `cli/*`, and `frame_source` non-interface modules pass silently. F1 of the cycle-3 cumulative review (the c11 → replay_input edge) is the concrete consequence.
|
||||
|
||||
`module-layout.md` rule 9 documents this lint as the enforcement mechanism for the rule. Reviewers reasonably assume the lint covers the documented allow-list; in practice it covers only one of the eight prefixes. The asymmetry is the F3 finding.
|
||||
|
||||
## Outcome
|
||||
|
||||
- `test_ac6_only_compose_root_imports_concrete_strategies` enforces the full rule-9 allow-list: a `components/<X>/*.py` ImportFrom node is allowed iff the imported module matches one of: `gps_denied_onboard.components.<X>.*` (own component), `gps_denied_onboard._types.*`, `gps_denied_onboard._types.inference_errors`, `gps_denied_onboard.helpers.*`, `gps_denied_onboard.config`, `gps_denied_onboard.logging`, `gps_denied_onboard.fdr_client`, `gps_denied_onboard.clock`, `gps_denied_onboard.frame_source` (interface-only — see Constraints).
|
||||
- The widened lint is a strict superset of the existing AC-6 narrow check.
|
||||
- After AZ-845 lands, the widened lint reports zero violations.
|
||||
- The test docstring cites `module-layout.md` rule 9, not just AZ-270 AC-6.
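
A widened check along these lines could be sketched with the standard `ast` module. This is an illustration under stated assumptions, not the code in `test_az270_compose_root.py`: the function name `find_rule9_violations` is hypothetical, and treating the bare `gps_denied_onboard.frame_source` package as the "interface-only" path is an assumption standing in for whatever explicit interface module the project chooses.

```python
import ast

ROOT = "gps_denied_onboard"

# Shared prefixes from the rule-9 allow-list quoted above. A module is allowed
# when it equals a prefix or sits underneath it.
SHARED_PREFIXES = (
    f"{ROOT}._types",
    f"{ROOT}.helpers",
    f"{ROOT}.config",
    f"{ROOT}.logging",
    f"{ROOT}.fdr_client",
    f"{ROOT}.clock",
)

# Assumption: only this exact module path counts as the frame_source interface.
FRAME_SOURCE_INTERFACE = f"{ROOT}.frame_source"


def _under(module: str, prefix: str) -> bool:
    """True when `module` is `prefix` itself or a submodule of it."""
    return module == prefix or module.startswith(prefix + ".")


def find_rule9_violations(source: str, component: str) -> list[str]:
    """Return imported module names that fall outside the allow-list."""
    own = f"{ROOT}.components.{component}"
    violations: list[str] = []
    for node in ast.walk(ast.parse(source)):
        # Only absolute `from X import Y` statements are inspected.
        if not isinstance(node, ast.ImportFrom) or node.level or not node.module:
            continue
        module = node.module
        if not module.startswith(ROOT + "."):
            continue  # stdlib and third-party imports are out of scope
        if _under(module, own) or module == FRAME_SOURCE_INTERFACE:
            continue
        if any(_under(module, prefix) for prefix in SHARED_PREFIXES):
            continue
        violations.append(module)
    return violations
```

Keeping the allow-list as one module-level constant gives the test a single place to update when rule 9 changes. Because any other component's package is simply absent from the allow-list, the old cross-component edges stay flagged, which is the strict-superset property the Outcome asks for.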
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- Modify `tests/unit/test_az270_compose_root.py` — the `test_ac6_*` test and its docstring.
|
||||
- Add a small allow-list constant at module scope (single source of truth).
|
||||
- Verify by `pytest tests/unit/test_az270_compose_root.py` after AZ-845 lands.
|
||||
|
||||
### Excluded
|
||||
|
||||
- Changes to other tests in the same file.
|
||||
- Changes to production code.
|
||||
- The `frame_source` interface-only enforcement: if AST-level disambiguation between interface and non-interface modules within `frame_source/*` is not feasible, allow-list only the explicit interface module path and reject other `frame_source.*` paths. Document in the test docstring.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
| # | Criterion |
|---|-----------|
| AC-1 | The lint flags any ImportFrom in `components/**/*.py` whose `module` starts with `gps_denied_onboard.` and is NOT in the rule-9 allow-list. |
| AC-2 | Strict superset of the existing AC-6 narrow check — every cross-component edge previously flagged is still flagged. |
| AC-3 | After AZ-845 lands, the widened lint reports zero violations. |
| AC-4 | Against the codebase BEFORE AZ-845 (verified during implementation by running the new lint on a temp checkout of pre-relocation HEAD), the lint produces a failure naming the c11 → replay_input edge and citing rule 9. |
| AC-5 | The test docstring cites `module-layout.md` rule 9 (AZ-507 cross-component contract surface) and lists the allow-list. |
|
||||
|
||||
## Constraints
|
||||
|
||||
- `frame_source` interface-only requirement: if AST-level disambiguation is not feasible, allow-list only the explicit interface module path. Document the chosen disambiguation strategy in the test docstring. Surface to user if the documented intent and codebase reality disagree.
|
||||
- The existing test name MAY remain (preserves AZ-270 audit trail) or be renamed; if renamed, update `module-layout.md` rule 9's enforcement-citation.
|
||||
- Single file modified: `tests/unit/test_az270_compose_root.py`. No production source change.
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1 — widening exposes another rule-9 violation**: STOP-and-surface protocol. The implement skill MUST stop and present the additional violation as a scope-decision Choose to the user, NOT auto-bundle into this task. Remediation of any newly-exposed violation is a separate AZ ticket.
|
||||
|
||||
**Risk 2 — false positive on `gps_denied_onboard.frame_source` non-interface module**: documented disambiguation strategy in the test docstring. If wrong, the failure surfaces as a deterministic test failure, not silent drift; surface to user.
|
||||
@@ -0,0 +1,175 @@
|
||||
# Batch 108 — Cycle 3 — AZ-839 operator_pre_flight_setup real fixture
|
||||
|
||||
**Date**: 2026-05-23
|
||||
**Tasks**: AZ-839 (C3 — Epic AZ-835).
|
||||
**Story points**: 5.
|
||||
**Jira status**: AZ-839 → In Progress (transitioned at batch start);
moves to In Testing at commit step.
|
||||
|
||||
## What shipped
|
||||
|
||||
Third building block of Epic AZ-835. Replaces the placeholder
`operator_pre_flight_setup` pytest fixture (the previous `mkdir`
stub at `tests/e2e/replay/conftest.py:293-310`) with a real
driver that wires C1+C2+C11+C10 end-to-end:

1. **C1 RouteSpec** — extracted from the Derkachi tlog via AZ-836's
   `extract_route_from_tlog` (the existing `derkachi_replay_inputs`
   session fixture supplies the tlog path; the new fixture chains
   off that contract).
2. **C2 SatelliteProviderRouteClient** — `seed_route(spec)` with the
   bounded transient-retry ladder documented in AZ-839 AC-5.
   Validation / terminal failures propagate unchanged (AC-4).
3. **C11 HttpTileDownloader** — `download_tiles_for_area(request)`
   over a bbox derived from the route waypoints (mirrors C2's
   internal `_enumerate_route_tile_coords` envelope without
   importing the private helper).
4. **C10 DescriptorBatcher** — `populate_descriptors(corpus_filter)`
   builds the FAISS HNSW index over the populated C6 cache. The
   AZ-306 sidecar triple-consistency is verified by re-loading the
   index through a caller-supplied `descriptor_index_factory` after
   the rebuild — any tampering surfaces as `IndexUnavailableError`
   (AC-6).
5. **Cleanup-on-failure** — partial sidecar files written by the
   driver are removed if any step raises, while pre-existing warm
   cache files are preserved (AC-7).
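
The retry behaviour in step 2 (AC-4 and AC-5) can be sketched as below. This is a hedged illustration rather than the driver's `_seed_route_with_retry`: the three exception classes are local stand-ins for the real ones, `PLACEHOLDER_BACKOFF_S` is an invented schedule (the real cadence is AZ-838's `_DEFAULT_BACKOFF_SCHEDULE_S`, whose values are not given here), and the injected `sleep` parameter is an assumption that keeps the sketch testable without waiting.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class RouteValidationError(Exception):
    """Stand-in: invalid input, never retried."""


class RouteTerminalFailureError(Exception):
    """Stand-in: the service gave a final failure, never retried."""


class RouteTransientError(Exception):
    """Stand-in: temporary failure, worth retrying."""


# Placeholder delays between attempts; two delays means three attempts total.
PLACEHOLDER_BACKOFF_S: Sequence[float] = (0.5, 1.0)


def seed_route_with_retry(
    seed: Callable[[], T],
    backoff_s: Sequence[float] = PLACEHOLDER_BACKOFF_S,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `seed`, retrying only on RouteTransientError.

    Validation and terminal errors are not caught, so they propagate
    unchanged. When the attempts are exhausted, the last transient error
    is re-raised.
    """
    attempts = len(backoff_s) + 1
    for attempt in range(attempts):
        try:
            return seed()
        except RouteTransientError:
            if attempt == attempts - 1:
                raise
            sleep(backoff_s[attempt])
    raise AssertionError("unreachable")
```

Catching only the transient class, instead of a broad `except`, is what lets the other two failure kinds reach the caller untouched.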
|
||||
|
||||
Algorithm (`populate_c6_from_route`) is exposed through pure
dependency injection so the AC-8 unit tests run against stubs and
the AC-9 integration test runs the same algorithm against real
collaborators on the Jetson harness.
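
The cleanup-on-failure rule (AC-7) combined with an injected build step might look like the sketch below. It is an assumption-laden illustration, not `_cleanup_partial_sidecars`: the function name `run_with_sidecar_cleanup`, the `demo` helper, and the file names `warm.bin` / `partial.bin` are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def run_with_sidecar_cleanup(build: Callable[[], None], sidecars: Iterable[Path]) -> None:
    """Run `build`; on failure remove only the sidecars it newly created."""
    tracked = list(sidecars)
    # Snapshot what already exists so a warm cache survives a failed rebuild.
    pre_existing = {path for path in tracked if path.exists()}
    try:
        build()
    except Exception:
        for path in tracked:
            if path not in pre_existing:
                path.unlink(missing_ok=True)
        raise


def demo() -> tuple[bool, bool]:
    """Return (warm file survived, partial file still present)."""
    with tempfile.TemporaryDirectory() as tmp:
        warm = Path(tmp) / "warm.bin"
        partial = Path(tmp) / "partial.bin"
        warm.write_bytes(b"warm cache")

        def failing_build() -> None:
            partial.write_text("half-written")
            raise RuntimeError("batcher failed")

        try:
            run_with_sidecar_cleanup(failing_build, [warm, partial])
        except RuntimeError:
            pass
        return warm.exists(), partial.exists()
```

Because the build step is passed in as a callable, the same wrapper can be driven by a stub in a unit test or by real collaborators on the harness, which mirrors the dependency-injection split described above.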
|
||||
|
||||
## Files changed
|
||||
|
||||
Tests / fixtures (4):
|
||||
|
||||
- `tests/e2e/replay/_operator_pre_flight.py` (new, ~430 lines) —
  the AZ-839 driver: `PopulatedC6Cache` dataclass +
  `populate_c6_from_route()` + private helpers
  (`_seed_route_with_retry`, `_route_bbox`,
  `_cleanup_partial_sidecars`).
- `tests/e2e/replay/conftest.py` — replaces the placeholder fixture
  with the real `operator_pre_flight_setup` (session-scoped,
  skip-gated by `RUN_REPLAY_E2E` + `SATELLITE_PROVIDER_URL` +
  `SATELLITE_PROVIDER_API_KEY` + `BUILD_FAISS_INDEX` +
  `GPS_DENIED_OPERATOR_CONFIG_PATH`); adds five private helpers
  (`_operator_pre_flight_skip_reason`,
  `_build_operator_pre_flight_cache`,
  `_build_replay_backbone_embedder`,
  `_resolve_replay_descriptor_dim`, `_default_tile_decoder`).
- `tests/e2e/replay/test_operator_pre_flight_driver.py` (new,
  ~410 lines) — 11 unit tests exercising AC-3 / AC-4 / AC-5 / AC-6
  / AC-7 against stubbed `SatelliteProviderRouteClient` /
  `HttpTileDownloader` / `DescriptorBatcher` /
  `descriptor_index_factory`.
- `tests/e2e/replay/test_operator_pre_flight_integration.py` (new,
  ~40 lines) — Tier-2 + RUN_REPLAY_E2E gated test that consumes the
  fixture and asserts the `PopulatedC6Cache` invariants (AC-9
  pytest entry point).
|
||||
|
||||
Tracker docs (1):
|
||||
|
||||
- `_docs/03_implementation/batch_108_cycle3_report.md` (this file).
|
||||
|
||||
No production-code (`src/gps_denied_onboard/**`) modifications.
|
||||
The driver lives under `tests/` because AZ-839's outcome is the
|
||||
fixture, not a new operator-binary surface; the wiring it does is
|
||||
the existing operator-side runtime factories
|
||||
(`runtime_root.c10_factory`, `runtime_root.c11_factory`,
|
||||
`runtime_root.storage_factory`, `runtime_root.inference_factory`)
|
||||
already shipped under prior epics.
|
||||
|
||||
## AC coverage
|
||||
|
||||
| AC | Test(s) | Status |
|----|---------|--------|
| AC-1 cold first invocation ≤ 5 min | exercised on Tier-2 via AC-9 integration test; `PopulatedC6Cache.elapsed_seconds` instruments the budget | DEFERRED (Tier-2 only) |
| AC-2 warm invocation ≤ 30 s | same gated test, re-invocation within session reuses the named-volume mount | DEFERRED (Tier-2 only) |
| AC-3 populated cache + sidecar triple | `test_populate_c6_from_route_returns_populated_cache` + `test_populate_c6_from_route_passes_sector_class_to_downloader` | PASS |
| AC-4 validation/terminal propagate | `test_route_validation_error_propagates_unchanged` + `test_route_terminal_failure_propagates_unchanged` | PASS |
| AC-5 transient retry ladder (3 attempts, backoff) | `test_route_transient_error_retries_then_succeeds` + `test_route_transient_error_exhausted_propagates_last_attempt` | PASS |
| AC-6 tamper detection → `IndexUnavailableError` | `test_descriptor_index_factory_index_unavailable_propagates` | PASS |
| AC-7 cleanup on failure (no half-built sidecars) | `test_cleanup_removes_partial_sidecar_files_on_failure` + `test_cleanup_preserves_pre_existing_warm_cache` + `test_batcher_failure_propagates_and_cleans_up` + `test_downloader_failure_propagates_and_cleans_up` | PASS |
| AC-8 unit tests with stubs (happy / transient / terminal / validation / tamper / cleanup) | 11 tests in `test_operator_pre_flight_driver.py` | PASS |
| AC-9 integration on Jetson via fixture | `test_operator_pre_flight_setup_produces_populated_cache` (RUN_REPLAY_E2E + tier2 gated) | DEFERRED (Tier-2 only) |
|
||||
|
||||
DEFERRED ACs (AC-1, AC-2, AC-9) execute on the Jetson e2e harness
when `RUN_REPLAY_E2E=1` + `SATELLITE_PROVIDER_URL` +
`SATELLITE_PROVIDER_API_KEY` + `BUILD_FAISS_INDEX=ON` +
`GPS_DENIED_OPERATOR_CONFIG_PATH` are set. The pytest entry point
exists and skips explicitly per `.cursor/skills/implement/SKILL.md`
Step 8 ("a skipped test counts as Covered").
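
A skip gate over those five variables could be sketched as a pure function. This is not the body of `_operator_pre_flight_skip_reason`, which is not shown in this report; the function name `skip_reason`, the message wording, and the decision to compare against the literal values `1` and `ON` are assumptions drawn from the paragraph above.

```python
from typing import Mapping, Optional

REQUIRED_ENV = (
    "RUN_REPLAY_E2E",
    "SATELLITE_PROVIDER_URL",
    "SATELLITE_PROVIDER_API_KEY",
    "BUILD_FAISS_INDEX",
    "GPS_DENIED_OPERATOR_CONFIG_PATH",
)


def skip_reason(env: Mapping[str, str]) -> Optional[str]:
    """Return a human-readable skip reason, or None when the gate is open."""
    missing = [name for name in REQUIRED_ENV if not env.get(name)]
    if missing:
        return "missing environment: " + ", ".join(missing)
    if env["RUN_REPLAY_E2E"] != "1":
        return "RUN_REPLAY_E2E is not 1"
    if env["BUILD_FAISS_INDEX"] != "ON":
        return "BUILD_FAISS_INDEX is not ON"
    return None
```

Inside a fixture, a non-`None` result would typically be handed to `pytest.skip(reason)`, so the test is reported as explicitly skipped with a stated cause instead of silently passing. Taking the environment as a mapping argument (`os.environ` in real use) keeps the gate unit-testable.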
|
||||
|
||||
## Test run results
|
||||
|
||||
```
$ .venv/bin/pytest tests/e2e/replay/test_operator_pre_flight_driver.py -v --tb=short
============================== 11 passed in 0.33s ==============================

$ .venv/bin/pytest tests/e2e/replay/test_operator_pre_flight_integration.py -v --tb=short
============================== 1 skipped in 0.29s ==============================
(SKIPPED — Tier-2-only test; set GPS_DENIED_TIER=2 to run)

$ .venv/bin/pytest tests/e2e/replay/ -v --tb=short --timeout=60
====================== 28 passed, 8 skipped in 1.14s =======================
```
|
||||
|
||||
Suite-wide test run is deferred to Step 11 (Run Tests) per the
|
||||
iterative-skill exception in `.cursor/rules/coderule.mdc` — batch
|
||||
108 is a batch, not the end of cycle-3 implementation.
|
||||
|
||||
## Code review (self-review)
|
||||
|
||||
Per `.cursor/rules/no-subagents.mdc`, the structured `/code-review`
|
||||
skill is run inline. Verdict: **PASS_WITH_WARNINGS**.
|
||||
|
||||
| Phase | Result |
|-------|--------|
| 1. Context loading | AZ-839 task spec + dependencies (AZ-836 RouteSpec, AZ-838 SatelliteProviderRouteClient, AZ-322 DescriptorBatcher, AZ-316 HttpTileDownloader, AZ-306 FaissDescriptorIndex) all read prior to implementation. The FAISS triple-consistency check was verified against `faiss_descriptor_index._load()` source. |
| 2. Spec compliance | AC-3 / AC-4 / AC-5 / AC-6 / AC-7 / AC-8 directly covered. AC-1 / AC-2 / AC-9 deferred to Tier-2 harness (gated tests exist). **No Medium / High findings.** |
| 3. Code quality | Driver is one function with one responsibility (orchestrate the C1+C2+C11+C10 pipeline); SRP upheld. Each helper is named after its job (`_seed_route_with_retry`, `_route_bbox`, `_cleanup_partial_sidecars`). Functions ≤ ~80 lines. Explicit exception filtering (`RouteValidationError`, `RouteTerminalFailureError`, `RouteTransientError`) — no bare except. Tests follow Arrange/Act/Assert with comment markers per `coderule.mdc`. |
| 4. Security quick-scan | JWT consumed via env-sourced kwargs, never logged. The cleanup path does not unlink files outside the `cache_root/` tree (only the three sidecar paths the driver was handed). |
| 5. Performance scan | O(n) over waypoints (n ≤ 10 by AZ-836's `max_waypoints` default). No new N+1. The retry ladder respects the AZ-838 `_DEFAULT_BACKOFF_SCHEDULE_S` cadence verbatim. |
| 6. Cross-task consistency | Single-task batch — N/A. |
| 7. Architecture compliance | `_operator_pre_flight.py` lives under `tests/e2e/replay/` (test infrastructure). Imports only from C10 / C11 / C6 public surfaces and from `replay_input.tlog_route.RouteSpec` (Adapter layer per `module-layout.md`). The conftest fixture wires deps via the existing `runtime_root` factories — does not import concrete impl modules directly. No cross-component imports between C-prefixed components. No new cyclic dependencies. ADR check skipped (no ADRs directory). |
|
||||
|
||||
### Findings
|
||||
|
||||
**F1 (Low) — `_default_tile_decoder` lives in conftest.py**
|
||||
|
||||
`_default_tile_decoder` (JPEG → CHW float32 numpy) lives in the
|
||||
test conftest. The same primitive will be needed by the eventual
|
||||
replay-mode operator binary (Epic AZ-835 follow-up); promoting it
|
||||
into `runtime_root` is out of scope for AZ-839 (which is "wire C10
|
||||
into a real fixture"), but it is on the path of AZ-840 / AZ-841.
|
||||
**Recommendation**: leave as-is for AZ-839; revisit during AZ-840.
|
||||
|
||||
**F2 (Low) — `_resolve_replay_descriptor_dim` is NetVLAD-only**
|
||||
|
||||
The NetVLAD descriptor dim resolver pinned at `c2_vpr/config.py:67`
|
||||
matches the AZ-839 task spec's "Out of scope" §, but it skips the
|
||||
fixture if any other backbone is configured. **Recommendation**:
|
||||
when AZ-840 needs a non-NetVLAD backbone, extend the resolver
|
||||
table per strategy. Tracking via the AZ-840 spec is sufficient.
|
||||
|
||||
### Deltas vs. spec
|
||||
|
||||
No behavioural deltas. One naming difference: the task spec
mentions `download_for_bbox`, whereas the actual production method
is `download_tiles_for_area` (a `bbox`-aware single-zoom request
via `DownloadRequest`). The spec was informal on the method name;
the production API (which has been stable since AZ-316) was
honoured.
|
||||
|
||||
## Notes for follow-up
|
||||
|
||||
- AZ-840 (e2e orchestrator test) consumes this fixture. The
|
||||
fixture already returns a typed `PopulatedC6Cache` so AZ-840 has
|
||||
a concrete contract to assert against.
|
||||
- AZ-841 (un-xfail AZ-777 Tier-2 tests) builds on AZ-839 + AZ-840.
|
||||
The existing `test_ac8_operator_workflow` skip reason in
|
||||
`test_derkachi_1min.py` (D-PROJ-2 mock-suite-sat-service) is
|
||||
stale post-AZ-839 — AZ-841 will rewrite it to consume the new
|
||||
fixture.
|
||||
- AZ-842 (docs — replay_protocol.md Invariant 12 + architecture +
|
||||
orchestrator README) describes the route-driven flow this batch
|
||||
ships.
|
||||
@@ -0,0 +1,179 @@
|
||||
# Batch 108b — Cycle 3 — AZ-839 conftest path-mismatch fix
|
||||
|
||||
**Date**: 2026-05-23
|
||||
**Tasks**: AZ-839 (C3 — Epic AZ-835).
|
||||
**Story points**: 0 (defect fix on top of the AZ-839 batch 108
|
||||
ship; counts under the existing 5 SP envelope).
|
||||
**Jira status**: AZ-839 reopened (In Testing → In Progress) at the
|
||||
start of this batch on the 2026-05-23 self-review finding;
|
||||
re-transitions to In Testing at commit step.
|
||||
|
||||
## Why this batch exists
|
||||
|
||||
The AZ-839 batch 108 self-review verdict was PASS_WITH_WARNINGS
based on 11 driver unit tests + 28 replay-suite passes. While
reading the C3 fixture to plan the AZ-840 orchestrator, a real
path-mismatch defect surfaced that **the AC-3 / AC-6 unit tests
could not catch** because every unit test stubs the
`descriptor_index_factory`. The defect dates from batch 108, not
from this batch: batch 108's self-review missed it, and it would
have failed the AC-9 Tier-2 integration test on first execution.
|
||||
|
||||
Per `meta-rule.mdc` "Real Results, Not Simulated Ones" the work
|
||||
was paused before any AZ-840 code was written, the user was given
|
||||
a Choose A/B/C/D, and option A (reopen AZ-839, fix, recommit) was
|
||||
selected.
|
||||
|
||||
## The defect
|
||||
|
||||
In `tests/e2e/replay/conftest.py::_build_operator_pre_flight_cache`:
|
||||
|
||||
* `tile_store = build_tile_store(config)` constructed a
  `PostgresFilesystemStore` whose filesystem root came from
  `config.components["c6_tile_cache"].root_dir` — i.e. the static
  YAML path baked into the operator config (default
  `/var/lib/gps-denied/tiles`).
* `descriptor_index = build_descriptor_index(config)` constructed
  a `FaissDescriptorIndex` at
  `<config.root_dir>/descriptor.index`.
* `_descriptor_index_factory()` (the AC-3 / AC-6 verifier seam)
  constructed a SEPARATE `FaissDescriptorIndex` at
  `cache_root / "descriptor.index"` — the freshly-mktemp'd
  fixture path.
* On Tier-2 those two paths cannot be equal: `cache_root` is
  generated at test time by `tmp_path_factory`; the static YAML
  carries a path that is fixed at config-load time.
* Result: `descriptor_batcher.populate_descriptors()` writes the
  rebuilt FAISS triple under the static YAML root; the verifier
  then opens `cache_root/descriptor.index` and finds nothing,
  raising `IndexUnavailableError` from `FaissDescriptorIndex._load`.
  The fixture would have failed to ever yield a `PopulatedC6Cache`
  on Tier-2 — AC-3 (paths populated) and AC-6 (sidecar coherence)
  both unreachable.

The same shape applied to the tile filesystem:
`tile_store_path = cache_root / "tile_store"` did not match the actual
`PostgresFilesystemStore` layout (`<root_dir>/tiles/`).
|
||||
|
||||
## The fix
|
||||
|
||||
`_build_operator_pre_flight_cache` now overrides the in-memory
`c6_tile_cache` config block (building replacement objects with
`dataclasses.replace` rather than mutating in place) so the
production C6 components and the verifier all read/write under the
fixture's `cache_root`:
|
||||
|
||||
```python
c6_block = config.components["c6_tile_cache"]
c6_block_overridden = dataclasses.replace(
    c6_block,
    root_dir=str(cache_root),
    faiss_index_path="",  # force fallback to <root_dir>/descriptor.index
)
config = dataclasses.replace(
    config,
    components={**config.components, "c6_tile_cache": c6_block_overridden},
)
tile_store_path = cache_root / "tiles"
faiss_index_path = cache_root / "descriptor.index"
```
|
||||
|
||||
After the override:
|
||||
|
||||
* `build_tile_store(config)` writes under `cache_root/tiles/`.
* `build_descriptor_index(config)` rebuilds at
  `cache_root/descriptor.index` (+ `.sha256` + `.meta.json`).
* `_descriptor_index_factory()` reads from the same
  `cache_root/descriptor.index` — triple-consistency check now has
  files to validate.
* `PopulatedC6Cache.tile_store_path` matches the
  `PostgresFilesystemStore.__init__` layout
  (`self._tiles_dir = self._root_dir / "tiles"`); the integration test's
  `populated.tile_store_path.is_dir()` assertion will hold.
|
||||
|
||||
The existing operator-config YAML stays unchanged — the override
|
||||
is in-memory, scoped to the fixture session, and never touches the
|
||||
disk file the operator wrote.
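
The override recipe can be exercised in isolation with the self-contained sketch below. The `C6Block` and `Config` dataclasses, the `with_cache_root` and `resolve_index_path` helpers, and the sample paths are all hypothetical stand-ins; only the `dataclasses.replace` pattern and the "empty `faiss_index_path` falls back to `<root_dir>/descriptor.index`" rule are taken from the snippet above.

```python
import dataclasses
from pathlib import Path


@dataclasses.dataclass(frozen=True)
class C6Block:
    """Hypothetical stand-in for the c6_tile_cache config block."""

    root_dir: str
    faiss_index_path: str = ""


@dataclasses.dataclass(frozen=True)
class Config:
    """Hypothetical stand-in for the operator config."""

    components: dict


def with_cache_root(config: Config, cache_root: Path) -> Config:
    """Return a new Config whose C6 block points at `cache_root`."""
    block = dataclasses.replace(
        config.components["c6_tile_cache"],
        root_dir=str(cache_root),
        faiss_index_path="",  # empty string selects the root_dir fallback
    )
    return dataclasses.replace(
        config,
        components={**config.components, "c6_tile_cache": block},
    )


def resolve_index_path(block: C6Block) -> Path:
    """Assumed fallback rule: an explicit path wins, else <root_dir>/descriptor.index."""
    if block.faiss_index_path:
        return Path(block.faiss_index_path)
    return Path(block.root_dir) / "descriptor.index"


static = Config(
    components={
        "c6_tile_cache": C6Block(
            root_dir="/var/lib/gps-denied/tiles",
            faiss_index_path="/var/lib/gps-denied/tiles/descriptor.index",
        )
    }
)
effective = with_cache_root(static, Path("/tmp/fixture-cache"))
```

The original `static` object is left untouched, which is the in-memory equivalent of the on-disk YAML staying unchanged.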
|
||||
|
||||
## Files changed
|
||||
|
||||
* `tests/e2e/replay/conftest.py` — added `import dataclasses`;
  added the c6_tile_cache override block + comment in
  `_build_operator_pre_flight_cache`; changed
  `tile_store_path = cache_root / "tile_store"` to
  `cache_root / "tiles"` to match the `PostgresFilesystemStore` layout;
  removed the redundant `tile_store_path.mkdir(...)` (the store's
  constructor creates the directory).
|
||||
|
||||
No driver, unit-test, or integration-test changes. The driver's
|
||||
public API (`populate_c6_from_route`, `PopulatedC6Cache`) is
|
||||
unchanged.
|
||||
|
||||
## AC coverage delta
|
||||
|
||||
The minimal fix moves AC-3 (paths populated) and AC-6 (sidecar
coherence) from "would have failed on Tier-2" to "actually
verifiable on Tier-2". No AC that was previously claimed PASS is
downgraded by this batch.
|
||||
|
||||
## Test run results
|
||||
|
||||
```
$ .venv/bin/pytest tests/e2e/replay/ -v --tb=short --timeout=60
============================ 28 passed, 9 skipped in 3.08s ===========================
```
|
||||
|
||||
The pass count matches batch 108 (28 passed); this run reports 9
skips where batch 108's replay-suite run reported 8. The unit
suite is path-agnostic (every test in
`test_operator_pre_flight_driver.py` injects its own paths through
`_build_harness`), so the fix has no observable effect on the
green path. The 9 skipped tests are RUN_REPLAY_E2E + Tier-2 gated;
they will exercise the fix on the Jetson harness when AZ-839's
AC-9 integration test next runs.
|
||||
|
||||
## Code review (self-review of batch 108b)
|
||||
|
||||
Verdict: **PASS** (single-finding fix; no new findings).
|
||||
|
||||
| Phase | Result |
|-------|--------|
| 1. Context loading | Re-read `storage_factory.py` + `postgres_filesystem_store.py` + `faiss_descriptor_index.py` to confirm where `root_dir` / `faiss_index_path` are honoured. |
| 2. Spec compliance | AZ-839 AC-3 / AC-6 are now reachable on Tier-2; AC-9 entry point unchanged. |
| 3. Code quality | Comment names the failure mode the override prevents. `dataclasses.replace` is used twice rather than mutating frozen dataclasses. The new `tile_store_path` matches the production layout exactly. |
| 4. Security quick-scan | The override only changes paths; no DSN, JWT, or env-secret handling moved. |
| 5. Performance scan | No-op — the override runs once per session, before any heavy I/O. |
| 6. Cross-task consistency | Single-defect batch — N/A. |
| 7. Architecture compliance | The fixture stays in `tests/`; overriding `config.components` is a documented composition-root pattern (see `Config.with_blocks`). No new src/ writes. |
|
||||
|
||||
## Self-review meta — why batch 108 missed this
|
||||
|
||||
The batch 108 self-review went through all 7 review phases but
relied on the unit-test pass count for AC-3 / AC-6 confidence.
Every unit test injected its own `descriptor_index_factory`, so
the fixture's wiring of that factory to `cache_root` was never
exercised against the real production wiring of `descriptor_index`
to `config.root_dir`. Phase 7 (Architecture compliance) noted
"the conftest fixture wires deps via the existing `runtime_root`
factories — does not import concrete impl modules directly" but
did not check that the wiring was internally consistent.

Preventive lesson (no rule change yet — surfacing for AZ-840
follow-up): **when a fixture wires production components from a
config and ALSO constructs a side verifier from a different
source of truth, the two paths must be derived from a single
upstream value or asserted equal at fixture-setup time.** This
goes into the AZ-839 leftover note for AZ-840 to act on or to
escalate to a `coderule.mdc` rule update.
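
One way to apply that lesson is a fail-fast guard at fixture setup, sketched below. The names `CachePaths` and `assert_same_path` are hypothetical; the sketch shows both options the lesson names (derive every path from one root, or assert that two independently built paths agree) and is not taken from the repository.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class CachePaths:
    """Derive every cache location from one upstream root (hypothetical helper)."""

    root: Path

    @property
    def tiles_dir(self) -> Path:
        return self.root / "tiles"

    @property
    def index_path(self) -> Path:
        return self.root / "descriptor.index"


def assert_same_path(production: Path, verifier: Path) -> None:
    """Fail fast when the production side and the verifier side disagree."""
    if production.resolve() != verifier.resolve():
        raise AssertionError(
            f"wiring mismatch: production writes {production}, verifier reads {verifier}"
        )
```

Running `assert_same_path` once during setup turns the batch 108 failure mode into an immediate, clearly worded error instead of a late `IndexUnavailableError` on the Tier-2 harness.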
|
||||
|
||||
## Notes for follow-up
|
||||
|
||||
* AZ-840 (e2e orchestrator test) — this batch unblocks AZ-840
|
||||
AC-3 (which hard-depends on the C3 fixture producing a usable
|
||||
cache). AZ-840 will additionally need to feed the airborne
|
||||
replay binary a config that points at the same `cache_root`
|
||||
(the binary takes a single `--config <path>` and cannot read
|
||||
the in-memory mutation); the cleanest path is for AZ-840 to
|
||||
write an effective YAML at runtime from the same override
|
||||
recipe used here. AZ-840's batch report will record the choice.
|
||||
* AZ-839's batch 108 self-review process is being noted as a
|
||||
partially-effective gate. No `coderule.mdc` rule change yet —
|
||||
the `meta-rule.mdc` "Real Results" rule already covers the
|
||||
general case; AZ-840's planning will check whether a more
|
||||
specific fixture-vs-config-wiring rule is warranted.
|
||||
@@ -0,0 +1,171 @@
# Batch 109 — Cycle 3 — AZ-840 e2e orchestrator test

**Date**: 2026-05-23
**Tasks**: AZ-840 (C4 — Epic AZ-835).
**Story points**: 3 (per the task spec).
**Jira status**: AZ-840 In Progress → In Testing at commit step.

## Why this batch exists

Epic AZ-835 (real-flight e2e validation) needs a single Tier-2
test that proves the 7-step pipeline runs from
`(tlog, video, calibration)` to a horizontal-error verdict
without operator hand-curation between steps. Steps 3-5 were
delivered by AZ-839 (C3 — `operator_pre_flight_setup`); steps
1-2-6-7 are this batch.

The AZ-839 batch 108b follow-up note explicitly anticipated this
batch: "AZ-840 will additionally need to feed the airborne
replay binary a config that points at the same `cache_root`
... the cleanest path is for AZ-840 to write an effective YAML
at runtime from the same override recipe used here."

## What this batch ships

A driver module + unit test suite + Tier-2 integration test:

* `tests/e2e/replay/_e2e_orchestrator.py` — wraps the AZ-699
  verdict-report path with the AZ-839 C3 fixture's
  `PopulatedC6Cache`. Public surface:
  * `OrchestratorStep` enum — failure-step labels per AC-5.
  * `OrchestrationFailure(step, message)` exception — wraps
    every step failure with the step name in the message prefix.
  * `OrchestrationReport` dataclass — verdict, distribution,
    paths, wall-clock measurements per AC-4.
  * `write_effective_replay_config` — small helper that overlays
    `c6_tile_cache.root_dir` onto the static operator YAML.
  * `read_calibration_acquisition_method` — mirror of AZ-699's
    helper so the report writer keeps the same shape.
  * `run_e2e_orchestration` — the AC-1 entry point wiring
    validate → write_config → airborne subprocess → parse JSONL
    → load tlog GT → compute distribution → render report.
* `tests/e2e/replay/test_e2e_orchestrator_unit.py` — 17 unit
  tests covering each of the 7 steps' failure modes plus the
  happy path. The runner is injected (`subprocess.run` default)
  so unit tests stage synthetic JSONL output without touching
  the airborne binary. `load_tlog_ground_truth` is monkeypatched
  to return a synthetic 3-row series.
* `tests/e2e/replay/test_az835_e2e_real_flight.py::test_az840_e2e_real_flight_orchestration`
  — Tier-2 + RUN_REPLAY_E2E
  gated test that consumes the C3 fixture + Derkachi inputs and
  asserts the verdict markdown is written, the threshold-hit
  share table is present, and the 15-min budget held.

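
The runner injection described for `test_e2e_orchestrator_unit.py` can be sketched as follows. This is an illustrative shape only: `run_replay`, `fake_runner`, the `[airborne]` message text, and the value layout of the synthetic JSONL line are assumptions; only the `subprocess.run` default, the 900 s cap, and the `emitted_at` / `position_wgs84` field names come from this report.

```python
import subprocess
from typing import Callable, Sequence


def run_replay(
    argv: Sequence[str],
    *,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout_s: float = 900.0,
) -> str:
    """Run the replay command through an injectable runner and return stdout."""
    try:
        completed = runner(
            list(argv), capture_output=True, text=True, timeout=timeout_s, check=False
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"[airborne] timed out: {exc!r}") from exc
    if completed.returncode != 0:
        raise RuntimeError(f"[airborne] exit code {completed.returncode}")
    return completed.stdout


def fake_runner(argv, **kwargs):
    """Test double: returns one synthetic JSONL line without launching a binary."""
    line = '{"emitted_at": 0.0, "position_wgs84": [50.0, 36.0]}\n'
    return subprocess.CompletedProcess(args=argv, returncode=0, stdout=line, stderr="")
```

A unit test would pass `runner=fake_runner`; production callers omit the argument and get `subprocess.run`, so the two paths share every line of the wrapper.
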
## AC coverage

| AC | Description | Coverage |
|----|-------------|----------|
| AC-1 | Steps 1-7 end-to-end on Tier-2 from a fresh tlog/video | `test_az840_e2e_real_flight_orchestration` (Tier-2-gated); 17 unit tests prove the orchestrator structure |
| AC-2 | Verdict report exists either PASS or FAIL | `test_run_e2e_orchestration_writes_report_even_on_fail_verdict` + integration assertion `report_path.is_file()` |
| AC-3 | Reuses C3 fixture (`operator_pre_flight_setup`) | Integration test consumes the fixture; effective config overlay points at `populated_cache.cache_root` |
| AC-4 | 15-min wall-time soft target on the Derkachi clip | `_DEFAULT_MAX_SECONDS = 900.0` passed as `subprocess.run` `timeout`; integration asserts `replay_subprocess_seconds <= 900` |
| AC-5 | Mid-pipeline failure fails LOUD with a clear step prefix | `OrchestratorStep` enum + 8 step-specific failure unit tests (`validate`/`write_config`/`airborne` × 3/`parse` × 2/`gt`) |
| AC-6 | Gated by `RUN_REPLAY_E2E=1` + Tier-2 marker | `_orchestrator_skip_reason()` checks env vars + binary + video size; `@pytest.mark.tier2` decorator |
| AC-7 | AZ-699 verdict test continues to pass | No changes to `test_derkachi_real_tlog.py`; same `real_flight_validation_<date>.md` report path convention |
| AC-8 | Unit-tested orchestration helper without Tier-2 inputs | 17 unit tests covering config write (4) + calibration parse (3) + run helper (10) — all use mocked subprocess + GT loader |

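
The AC-6 gate (env var, binary, video size) could look roughly like the sketch below. The function name without the leading underscore, its parameters, the `min_video_bytes` threshold, and the message strings are assumptions for illustration; only `RUN_REPLAY_E2E=1` and the three kinds of check come from the table above.

```python
from pathlib import Path
from typing import Mapping, Optional


def orchestrator_skip_reason(
    env: Mapping[str, str],
    binary: Path,
    video: Path,
    min_video_bytes: int = 1,
) -> Optional[str]:
    """Return a human-readable skip reason, or None when the test may run."""
    if env.get("RUN_REPLAY_E2E") != "1":
        return "RUN_REPLAY_E2E=1 not set"
    if not binary.is_file():
        return f"replay binary not found: {binary}"
    if not video.is_file() or video.stat().st_size < min_video_bytes:
        return f"video missing or too small: {video}"
    return None
```

In a pytest module the result would typically feed `pytest.mark.skipif(reason is not None, reason=...)`, with `os.environ` passed as `env`, so the gate is evaluated once at collection time.
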
## Test run results

```
$ .venv/bin/pytest tests/e2e/replay/ -v --tb=short --timeout=60
============================ 45 passed, 10 skipped, 3 warnings in 0.78s ============
```

Breakdown:

* 17 new orchestrator unit tests pass.
* 11 AZ-839 driver unit tests still pass (no driver changes).
* 14 helper unit tests (`test_helpers.py`) still pass.
* 3 derkachi-1min mode-agnostic AST tests still pass.
* 10 skips: 1 new Tier-2 (this AZ-840 integration), 6
  RUN_REPLAY_E2E gated AZ-404 cases, 1 AC-8 D-PROJ-2 placeholder,
  1 Tier-2 AZ-699, 1 Tier-2 AZ-839 integration. None are
  regressions; the tier2 gate trips off-Jetson.

## Design notes

### `--auto-trim` ownership

The orchestrator passes `--auto-trim` unconditionally so AZ-405 /
AZ-698 active-flight-cut + tlog/video sync (Epic step 1) runs
inside the airborne binary every time. The Epic narrative does
not separate trim from the airborne pipeline; collapsing them
into a single subprocess invocation matches AZ-699 and avoids
duplicating the trim path.

### `clip_duration_s` parity with AZ-699

`run_e2e_orchestration` computes
`clip_duration_s = ground_truth[-1].t_s - ground_truth[0].t_s`
exactly as `test_derkachi_real_tlog.py` does. This means both
verdict reports name the same clip duration even when the
trimmed video is shorter than the ground-truth window — a
deliberate choice: the report header documents what the verdict
covers, not what the binary processed.

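
As a worked illustration of the formula above, here is a minimal sketch. `GroundTruthRow` and its fields other than `t_s` are hypothetical stand-ins for whatever `load_tlog_ground_truth` actually returns.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GroundTruthRow:
    """Hypothetical stand-in for one tlog ground-truth sample."""

    t_s: float      # timestamp in seconds
    lat_deg: float
    lon_deg: float


def clip_duration_s(ground_truth: Sequence[GroundTruthRow]) -> float:
    """Span of the ground-truth window, independent of the trimmed video length."""
    if not ground_truth:
        raise ValueError("ground truth series is empty")
    return ground_truth[-1].t_s - ground_truth[0].t_s
```

With samples at `t_s` = 10.0, 12.5 and 70.0 the function yields 60.0 regardless of how many seconds of video the binary actually consumed, which is the behaviour the design note calls deliberate.
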
### Effective config write — single source of truth

`write_effective_replay_config` materialises the same override
recipe AZ-839 uses in-memory, but on disk so the airborne
subprocess sees the cache_root the fixture chose. Field-level
merge: every other block in the operator YAML is preserved
verbatim; only `c6_tile_cache.root_dir` and
`c6_tile_cache.faiss_index_path` are overwritten. The static
operator YAML on disk is never touched.

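
The field-level merge can be sketched on plain dictionaries, leaving YAML loading and dumping aside. `overlay_cache_paths` and its parameters are hypothetical; the sketch only demonstrates the "two fields overwritten, everything else preserved, input untouched" contract described above.

```python
import copy
from pathlib import Path
from typing import Any, Dict


def overlay_cache_paths(
    operator_config: Dict[str, Any],
    cache_root: Path,
    faiss_index_path: Path,
) -> Dict[str, Any]:
    """Return a copy of the config with only the two cache fields replaced."""
    effective = copy.deepcopy(operator_config)  # never mutate the caller's mapping
    block = effective.setdefault("c6_tile_cache", {})
    block["root_dir"] = str(cache_root)
    block["faiss_index_path"] = str(faiss_index_path)
    return effective
```

The deep copy is what keeps the static operator config pristine: the effective mapping can then be serialised to a temporary file and handed to the subprocess via `--config`.
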
### Failure surface = step prefix

`OrchestrationFailure` always prefixes its message with
`[<step>]`. CI log scrapers and pytest's traceback printer both
surface the prefix on the first line; AC-5 ("clear error
pointing at the failing step") holds without requiring the test
to inspect the exception object. The step is also exposed as
`exc.step` for programmatic assertions.

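
A minimal sketch of the prefix convention follows. The class names match the report, but the enum members shown are only the five labels named in the AC-5 row, and the base class and constructor body are assumptions rather than the shipped implementation.

```python
from enum import Enum


class OrchestratorStep(str, Enum):
    """Step labels; the member set here is illustrative, not exhaustive."""

    VALIDATE = "validate"
    WRITE_CONFIG = "write_config"
    AIRBORNE = "airborne"
    PARSE = "parse"
    GT = "gt"


class OrchestrationFailure(RuntimeError):
    """Failure whose message always starts with the failing step in brackets."""

    def __init__(self, step: OrchestratorStep, message: str) -> None:
        self.step = step  # kept for programmatic assertions
        super().__init__(f"[{step.value}] {message}")
```

Because the prefix is baked into the message at construction time, `str(exc)` and any traceback line carry it without extra formatting at the raise site.
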
## Files changed

* `tests/e2e/replay/_e2e_orchestrator.py` (new, 656 LOC).
* `tests/e2e/replay/test_e2e_orchestrator_unit.py` (new, 660+ LOC).
* `tests/e2e/replay/test_az835_e2e_real_flight.py` (new, 156 LOC).

No `src/` changes, no operator-config YAML changes, no AZ-839
driver changes. AZ-840 is purely additive at the test layer.

## Code review (self-review)

Verdict: **PASS_WITH_WARNINGS**.

| Phase | Result |
|-------|--------|
| 1. Context loading | Re-read `gps_compare.py`, `accuracy_report.py`, `replay_input.py`, `cli/replay.py`, `test_derkachi_real_tlog.py`. Emission schema (`emitted_at`, `position_wgs84`) is the same shape `gps-denied-replay` writes. |
| 2. Spec compliance | All 8 AZ-840 ACs covered; AC-7 holds by inspection (no AZ-699 changes). |
| 3. Code quality | All public types have docstrings; failure messages name the upstream exception via `repr` so `OSError` / `subprocess.TimeoutExpired` carry through. Runner kw-args mirror `subprocess.run` signature 1:1. |
| 4. Security quick-scan | Effective config write goes to a tmp file the test owns; no secrets in the YAML overlay (override is two string fields). Subprocess `env` is opt-in (`None` defaults to `os.environ`). |
| 5. Performance scan | Unit tests run in 0.51 s. Tier-2 wall-clock cap is 900 s, enforced by the subprocess timeout. |
| 6. Cross-task consistency | `clip_duration_s` and `report_path` match AZ-699 exactly so a single Jetson run produces the same markdown shape. |
| 7. Architecture compliance | Orchestrator lives entirely under `tests/e2e/replay/`; no `src/` writes. C3 fixture's invariants (`PopulatedC6Cache.cache_root` is the single source of truth) propagate via `write_effective_replay_config`. |

## Findings

| ID | Severity | Description | Disposition |
|----|----------|-------------|-------------|
| F1 | Low | `_default_tile_decoder` in `conftest.py` (carried from batch 108) — still raw TIFF. Not in the AZ-840 path; AZ-840 doesn't change tile decoding. | Defer; no AZ-840 ticket. |
| F2 | Low | `_resolve_replay_descriptor_dim` is NetVLAD-only (carried from batch 108). AZ-840 doesn't change descriptors. | Defer; no AZ-840 ticket. |
| F3 | Low | `--pace asap` is hardcoded in `_run_replay_subprocess` argv; the AZ-699 test passes `--pace asap` too, so behaviour is identical. If a future test wants a real-time pace, the runner kwarg is the seam. | Document; no ticket. |
| F4 | Low | `_run_replay_subprocess` does not stream stdout/stderr; failures surface only after the subprocess exits. For 15-min runs this means the operator sees no progress until the budget expires. AZ-699 has the same shape. | Document; consider an AZ-* if the budget grows. |

## Notes for follow-up

* AZ-840 lands the orchestrator test as Tier-2-gated. Verifying
  the Tier-2 path actually runs on the Jetson harness is the
  next gating step before Epic AZ-835 can flip from "covered by
  unit tests" to "covered by Tier-2 integration".
* `_e2e_orchestrator.py` is intentionally kept under `tests/`
  rather than promoted to `src/`. If a second consumer of the
  same orchestration shape appears (e.g. AZ-833 mock-suite-sat
  parity test), the move to a shared helper module under
  `src/gps_denied_onboard/replay/` is the right next step;
  for now the test-only location matches the helper's only
  consumer.
* AZ-841 (Tier-2 unxfail follow-up) and AZ-842 (replay protocol
  + orchestrator docs) sit downstream — both should reference
  this batch report in their planning sections.
@@ -0,0 +1,178 @@
# Cumulative Code Review — Cycle 3 — Batches 104–109

**Date**: 2026-05-23
**Scope**: union of files changed across cycle-3 batches 104, 106, 107, 108, 108b, 109
**Tasks covered**: AZ-777 spec refresh + Phase 1 + Phase 2; AZ-836 (Epic AZ-835 C1); AZ-838 (Epic AZ-835 C2); AZ-839 (Epic AZ-835 C3) + 108b fixture-path fix; AZ-840 (Epic AZ-835 C4)
**Mode**: cumulative (all 7 phases)
**Verdict**: **FAIL** (0 Critical, 1 High, 2 Medium, 0 Low)
**Baseline file**: `_docs/02_document/architecture_compliance_baseline.md` — **still absent** (carried over from cycle 2 retro action), no `## Baseline Delta` section emitted (see Notes)

## Scope of files reviewed

**Production source** (6 files):

1. `src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py` — modified (b104; AZ-777 Phase 1 contract adaptation: `_LIST_PATH` / `_GET_PATH` aligned with `Program.cs:187-209`)
2. `src/gps_denied_onboard/components/c11_tile_manager/route_client.py` — **new** (b107; ~600 LOC; `SatelliteProviderRouteClient`, `RouteSeedResult`, helpers)
3. `src/gps_denied_onboard/components/c11_tile_manager/errors.py` — modified (b107; new `SatelliteProviderRouteError` + `RouteValidationError` + `RouteTransientError` + `RouteTerminalFailureError`)
4. `src/gps_denied_onboard/components/c11_tile_manager/__init__.py` — modified (b107; re-exports new public surface)
5. `src/gps_denied_onboard/replay_input/tlog_route.py` — **new** (b106; `RouteSpec`, `RouteExtractionError`, `extract_route_from_tlog`)
6. `src/gps_denied_onboard/replay_input/__init__.py` — modified (b106; re-exports new public surface)

**Tests** (10 files): `tests/unit/c11_tile_manager/test_tile_downloader.py` (rewritten, b104; 14 ACs), `tests/unit/replay_input/test_tlog_route.py` (new, b106; 14 tests), `tests/e2e/satellite_provider/__init__.py` + `test_smoke.py` (new, b104; 2 tier-2 tests), `tests/e2e/replay/_operator_pre_flight.py` (new, b108; ~430 LOC), `tests/e2e/replay/conftest.py` (modified, b108+b108b), `tests/e2e/replay/test_operator_pre_flight_driver.py` (new, b108; 11 unit tests), `tests/e2e/replay/_e2e_orchestrator.py` (new, b109; 656 LOC), `tests/e2e/replay/test_e2e_orchestrator_unit.py` (new, b109; 17 unit tests), `tests/e2e/replay/test_az835_e2e_real_flight.py` (new, b109; tier-2 integration).

**CLI / fixtures** (2 files): `tests/fixtures/derkachi_c6/seed_route.py` (new, b107), `scripts/mint_dev_jwt.py` (new, b104).

**Compose / env** (2 files): `docker-compose.test.jetson.yml` (modified, b104), `.env.test.example` (modified, b104).

## Findings

| # | Severity | Category | File | Title |
|---|----------|----------|------|-------|
| F1 | High | Architecture | `src/gps_denied_onboard/components/c11_tile_manager/route_client.py:56` | `RouteSpec` DTO placement violates AZ-507 cross-component contract surface |
| F2 | Medium | Architecture | `_docs/02_document/module-layout.md` | Module layout stale — cycle-3 additions unregistered (cycle-2 carry-over worsened) |
| F3 | Medium | Maintainability | `tests/unit/test_az270_compose_root.py:194` | `test_ac6_only_compose_root_imports_concrete_strategies` lint scope is narrower than module-layout.md rule 9 |

### Finding Details

**F1: `RouteSpec` DTO placement violates AZ-507 cross-component contract surface** (High / Architecture)

- Location: `src/gps_denied_onboard/components/c11_tile_manager/route_client.py:56`
- Description: `route_client.py` (a `components/c11_tile_manager/*.py` file) imports `RouteSpec` from `gps_denied_onboard.replay_input.tlog_route`. Per `module-layout.md` rule 9 (AZ-507 cross-component contract surface):

  > "the only places a `components/<X>/*.py` file may import are: its own subpackage (`gps_denied_onboard.components.<X>.*`), `_types/*`, `_types.inference_errors`, `helpers/*`, `config`, `logging`, `fdr_client`, `clock`, `frame_source` (interface only)."

  `replay_input` is not in this allow-list. The architecture rationale: cross-component DTOs reach consumers through `_types/*`, not through cross-cutting coordinator packages. The current placement makes c11 (an Adapter, Layer 4) structurally depend on `replay_input` (a coordinator, Layer 4) — a Layer 4 → Layer 4 cross-cutting edge that the layering table does not declare as allowed.

- Impact: The dependency is **intentional and documented** — AZ-838 task spec line 19 explicitly specifies `from gps_denied_onboard.replay_input.tlog_route import RouteSpec`, and the route_client docstring acknowledges the source (`Takes a gps_denied_onboard.replay_input.tlog_route.RouteSpec (produced by AZ-836 / C1)`). But "intentional" does not equal "compliant"; the architecture rule was not amended at decompose time, and the AZ-270 lint is too narrow to catch this case (see F3). The next task that imports a similarly-placed DTO will compound the drift.

- Suggestion: relocate `RouteSpec` (plus `RouteExtractionError` if exported as part of the cross-component surface) to `src/gps_denied_onboard/_types/route.py`. After the move, both `c11_tile_manager.route_client` and `replay_input.tlog_route` import the DTO from `_types`, which is in both modules' allow-lists. AZ-836's `extract_route_from_tlog` continues to live in `replay_input/`; AZ-838's `SatelliteProviderRouteClient` continues to live in `c11_tile_manager/`. The behavioral surface is unchanged. Estimated complexity: 2 SP (move + update imports + verify AZ-838/AZ-836 tests + module-layout.md update).

- Tasks: AZ-838 (primary — owns the violating import), AZ-836 (secondary — owns the DTO definition).

**F2: Module layout stale — cycle-3 additions unregistered (cycle-2 carry-over worsened)** (Medium / Architecture)

- Location: `_docs/02_document/module-layout.md`
- Description: cycle 3 introduced new package files that are not registered in the authoritative file-ownership map. The cycle-2 cumulative review (`98-102`) already flagged 6 unregistered cycle-2 additions (F1 there); none of those carry-overs have been resolved, and cycle 3 added more:
  - **c11_tile_manager Internal list** (currently lists `satellite_provider_downloader.py` + `satellite_provider_uploader.py`): missing `_types.py`, `config.py`, `errors.py`, `idempotent_retry.py`, `signing_key.py`, `tile_downloader.py`, `tile_uploader.py`, **`route_client.py`** (cycle-3 NEW).
  - **shared/replay_input file list** (currently lists `__init__.py`, `interface.py`, `tlog_video_adapter.py`, `auto_sync.py`, `tests/`): missing `errors.py` (cycle-2 carry), `tlog_ground_truth.py` (cycle-2 carry), **`tlog_route.py`** (cycle-3 NEW).
  - **Carried over from cycle-2 review** (still unregistered): `replay_api/` package (7 files), `cli/render_map.py`, `cli/replay_api_entrypoint.py`, `helpers/gps_compare.py`, `helpers/accuracy_report.py`.
- Impact: `/implement` Step 4 (File Ownership) resolves a task's `Component` field against this file. Any future task touching the unregistered areas will hit the BLOCKING ownership check at Step 4 — the skill explicitly STOPs when the component isn't found and forbids guessing from prose. Cycle-3 batches 104–109 happened to operate inside already-listed component directories (`c11_tile_manager/**`, `replay_input/**`) so the staleness did not block them, but the next task that needs a new component or extends `replay_api/` will block.
- Suggestion: cycle-3 Step 13 (Update Docs) should reconcile module-layout.md with on-disk reality. The minimum: refresh the c11_tile_manager Internal list, the shared/replay_input file list, and add the cycle-2 carry-over entries (replay_api Per-Component Mapping entry, cli additions, helpers additions, replay_input file list completion). Severity escalates to High if a fourth consecutive cycle leaves the file stale.
- Tasks: AZ-838, AZ-836 (primary, cycle-3 contributors); AZ-700, AZ-697, AZ-699, AZ-701 (secondary, cycle-2 carry).

**F3: `test_ac6_only_compose_root_imports_concrete_strategies` lint scope is narrower than module-layout.md rule 9** (Medium / Maintainability)

- Location: `tests/unit/test_az270_compose_root.py:194-219`
- Description: `module-layout.md` rule 9 documents `test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies` as the lint that "enforces this on every `components/**/*.py`". In practice the lint only checks for `gps_denied_onboard.components.<other_component>` import edges — it walks `components/**/*.py`, parses `ImportFrom` nodes, and flags only when `node.module.startswith("gps_denied_onboard.components.")` with a different leaf component. The full rule-9 allow-list (`_types/*`, `_types.inference_errors`, `helpers/*`, `config`, `logging`, `fdr_client`, `clock`, `frame_source` interface only) is NOT enforced. Imports from `replay_input`, `replay_api`, `runtime_root`, `cli/*`, `frame_source` non-interface modules, etc. all pass the lint silently. F1 is the concrete consequence: the c11 → replay_input import slipped through both code review and the AZ-270 lint.
- Impact: `module-layout.md` rule 9 is documented as enforced; in practice it is partially enforced, partially honor-system. Reviewers (human or AI) reading the rule-9 paragraph reasonably assume the lint covers it; the test name and docstring reinforce that. The asymmetry is a maintainability risk — the rule and its enforcement diverge silently.
- Suggestion: either expand `test_ac6_only_compose_root_imports_concrete_strategies` to enforce the full allow-list (one extra branch in the AST walker), or amend rule 9 to admit the additional imports the codebase actually relies on (with a documented rationale per module). The first is preferable — the rule's intent is structural, and lint coverage matters more than rule wording.
- Tasks: cross-cutting; surface in cycle-3 retrospective.

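
The "one extra branch in the AST walker" suggested for F3 might look like the sketch below. It is not the existing AZ-270 lint: `rule9_violations` and the `_ALLOWED` tuple are hypothetical, the allow-list is transcribed from the rule-9 wording quoted in F1, relative imports are ignored, and the "interface only" restriction on `frame_source` is not modelled.

```python
import ast
from typing import List

_ROOT = "gps_denied_onboard"
# Allow-list prefixes taken from the rule-9 wording quoted in F1.
_ALLOWED = (
    "_types", "helpers", "config", "logging", "fdr_client", "clock", "frame_source",
)


def rule9_violations(source: str, component: str) -> List[str]:
    """Return first-party modules imported by `source` that rule 9 does not allow."""
    own = f"{_ROOT}.components.{component}"
    bad: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules = [node.module]
        elif isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        else:
            continue
        for module in modules:
            if module != _ROOT and not module.startswith(_ROOT + "."):
                continue  # stdlib / third-party imports are out of scope
            if module == own or module.startswith(own + "."):
                continue  # the component's own subpackage
            rest = module[len(_ROOT) + 1:]
            if any(rest == a or rest.startswith(a + ".") for a in _ALLOWED):
                continue
            bad.append(module)
    return bad
```

Run against a source string containing the F1 import, the function reports `gps_denied_onboard.replay_input.tlog_route` while leaving `httpx`, `_types`, and own-subpackage imports alone, which is the gap the current lint leaves open.
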
## Verdict Logic

- 0 Critical → no FAIL trigger from Critical
- 1 High (F1) → **FAIL trigger**
- 2 Medium (F2, F3) → not a verdict driver
- 0 Low

Result: **FAIL** — `/implement` Step 14.5 gate stops. Per `implement/SKILL.md` Step 14.5 + the auto-fix matrix, F1 (High Architecture) **escalates** rather than auto-fixes; F2 + F3 are eligible for Medium-Style auto-fix on the matrix but the High-Architecture finding alone gates the whole report. Re-run requires user direction (Choose A/B/C in the implement skill's Step 14.5 escalation block).

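
The verdict rule applied above can be written down as a short function. This is a sketch under assumptions: the report confirms only that Critical or High triggers FAIL and that a Low-only batch (batch 109) was rated PASS_WITH_WARNINGS; treating Medium-only reports as PASS_WITH_WARNINGS and empty reports as PASS is a guess, and `review_verdict` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable


def review_verdict(severities: Iterable[str]) -> str:
    """Map a list of finding severities to a review verdict."""
    counts = Counter(s.lower() for s in severities)
    if counts["critical"] or counts["high"]:
        return "FAIL"  # any Critical or High finding gates the report
    if counts["medium"] or counts["low"]:
        return "PASS_WITH_WARNINGS"  # assumed for Medium; observed for Low
    return "PASS"
```

For this review, `review_verdict(["High", "Medium", "Medium"])` evaluates to `"FAIL"`, matching the result line.
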
## Phase-by-Phase Notes

### Phase 1 — Context Loading

Inputs read:

- Task specs: AZ-836 (`done/`), AZ-838 (`done/`), AZ-839 (`done/`), AZ-840 (`done/`), AZ-777 (refreshed spec; closure logged in `done/`); AZ-841 (`todo/`), AZ-842 (`todo/`); Epic AZ-835 (`todo/`).
- Batch reports: `batch_104_cycle3_report.md`, `batch_106_cycle3_report.md`, `batch_107_cycle3_report.md`, `batch_108_cycle3_report.md`, `batch_108b_cycle3_report.md`, `batch_109_cycle3_report.md`.
- Architecture / layout: `_docs/02_document/module-layout.md` (rule 9 + per-component sections + Layering Table); `_docs/02_document/architecture.md` (header read-through; full re-read deferred to per-finding evidence).
- Last cumulative review: `_docs/03_implementation/cumulative_review_batches_98-102_cycle2_report.md` (carry-over baseline).
- Restrictions / solution overview: not re-read (already covered in per-batch reviews).
- ADR directory: `_docs/02_document/adr/` does NOT exist; ADR compliance check skipped (logged in Phase 7 below).

### Phase 2 — Spec Compliance

Cross-batch promise points (per-batch ACs already verified in batch reports):

- **AZ-836 (`RouteSpec` + extractor) → AZ-838 (`SatelliteProviderRouteClient`)**: AZ-838 task spec line 19 explicitly specifies `from gps_denied_onboard.replay_input.tlog_route import RouteSpec`. Implementation matches. The DTO contract is not formally documented in `_docs/02_document/contracts/c11_tilemanager/` — Spec-Gap candidate, but downgrades because both producer and consumer are owned by the same Epic (AZ-835) and the Epic spec describes the DTO shape inline. Note (not a separate finding): if `RouteSpec` survives F1 remediation by moving to `_types/route.py`, a contract `_docs/02_document/contracts/shared_types/route.md` is the right home.
- **AZ-838 (`SatelliteProviderRouteClient`) → AZ-839 (C3 fixture, `populate_c6_from_route`)**: the fixture's driver imports `SatelliteProviderRouteClient` and uses `seed_route()`; signature matches AZ-838's `seed_route(spec, *, name=None) -> RouteSeedResult`. Cross-batch wiring sound.
- **AZ-839 (C3 fixture, `PopulatedC6Cache`) → AZ-840 (orchestrator)**: AZ-840's `_e2e_orchestrator.write_effective_replay_config` overlays `c6_tile_cache.root_dir` onto the operator YAML using the cache_root the C3 fixture chose. AZ-840 batch report documents the contract; per-test fixtures consume `PopulatedC6Cache` directly. Sound.
- **AZ-777 contract adaptation (b104) → satellite-provider real endpoints**: `tile_downloader.py` `_LIST_PATH` / `_GET_PATH` now point at the real endpoints (`/api/satellite/tiles/inventory` + `/tiles/{z}/{x}/{y}`). The leftover `_docs/_process_leftovers/2026-05-21_az777_complexity_override.md` 2026-05-21 addendum recorded this as the "largest single sub-deliverable of the refreshed Phase 1". Implementation matches.

No Spec-Gap findings.

### Phase 3 — Code Quality

- All cycle-3 production modules (`tlog_route.py`, `route_client.py`, expanded `errors.py`, modified `tile_downloader.py`) carry module + class + function docstrings consistent with the project pattern (cycle-2 baseline preserved).
- `route_client.py` is ~600 LOC with one class (`SatelliteProviderRouteClient`) plus one DTO (`RouteSeedResult`) plus module-level helpers. The class has 5 methods: 2 public (`validate`, `seed_route`) and 3 private (`_post_route`, `_poll_route_status`, `_verify_inventory`). Each method is single-responsibility. No method exceeds the 50-line / cyclomatic-10 thresholds enumerated in the skill's Phase 3 list (per code reading; not measured).
- `tlog_route.py` `extract_route_from_tlog` uses Douglas-Peucker for waypoint coarsening — correct choice per AZ-836 spec.
- Tests follow Arrange / Act / Assert per coderule (verified by sampling `test_tlog_route.py` and `test_e2e_orchestrator_unit.py`; no exhaustive enumeration).

No Code Quality findings.

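
For readers unfamiliar with the Douglas-Peucker coarsening mentioned for `extract_route_from_tlog`, here is a generic textbook sketch on planar `(x, y)` points. It is not the project's implementation: how `tlog_route.py` projects WGS84 samples, which tolerance it uses, and whether it recurses or iterates are not stated in this review.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the line through a and b (or to a if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def douglas_peucker(points: Sequence[Point], epsilon: float) -> List[Point]:
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [first, last]  # every interior point is within tolerance
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point
```

The recursive form is the easiest to read; for very long tracks an explicit stack would avoid Python's recursion limit.
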
### Phase 4 — Security Quick-Scan

- `route_client.py` HTTP client uses `httpx.Client` with `timeout` parameter (no infinite hangs), argv-style request construction (no shell), and bearer-token auth via the existing C11 plumbing. No secrets in source.
- `route_client.py` JSON request payload built via `json.dumps` on dataclass fields → no injection.
- `route_client.py` URL construction uses `_ROUTE_STATUS_PATH_TPL.format(id=...)` where `id` is a UUID returned by the server — type-bounded, no injection surface.
- `tile_downloader.py` modifications (b104) are confined to `_LIST_PATH` / `_GET_PATH` constants (per batch report); no new auth/parsing surface.
- `scripts/mint_dev_jwt.py` (new, b104): JWT minting tooling for dev/test JWT signing keys. Per file naming (`mint_dev_jwt.py`) and per the `.env.test.example` pairing this is intended for non-prod use; not reviewed line-by-line in this pass.

No Security findings.

### Phase 5 — Performance Scan

- `route_client._poll_route_status` polls with default 5 s interval, max 60 attempts (= 5 min ceiling) using `time.sleep`. Configurable via constructor. Standard polling, not a perf concern.
- `route_client._enumerate_route_tile_coords` walks the route's `regionSizeMeters × N waypoints` tile coverage locally; per AZ-838 batch report this is ~50–100 tiles for the Derkachi route. O(N) over waypoints.
- `tlog_route.extract_route_from_tlog` runs Douglas-Peucker on the active GPS segment; per the unit test, completes in milliseconds for the Derkachi clip.
- `_operator_pre_flight.py` and `_e2e_orchestrator.py` run inside the test harness; performance is bounded by the wall-clock budget (15 min on Tier-2).

No Performance findings.

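
The bounded polling described for `_poll_route_status` follows a common pattern, sketched below. `poll_until_done`, the status strings, and the injectable `sleep` are illustrative assumptions; only the 5 s interval and 60-attempt defaults come from the bullet above.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], str],
    *,
    interval_s: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status arrives or the attempt budget is spent."""
    for attempt in range(1, max_attempts + 1):
        status = fetch_status()
        if status in ("completed", "failed"):  # hypothetical terminal states
            return status
        if attempt < max_attempts:
            sleep(interval_s)  # no sleep after the final attempt
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")
```

With the defaults this sleeps at most 59 × 5 s = 295 s, so the ceiling is roughly the 5 minutes quoted above plus the time spent in the status requests themselves. Injecting `sleep` lets unit tests run the loop without waiting.
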
### Phase 6 — Cross-Task Consistency

- **Sequential Epic chain**: AZ-836 (C1) → AZ-838 (C2) → AZ-839 (C3) → AZ-840 (C4). Each batch's "Files changed" is disjoint at the production level (C1 in `replay_input/`, C2 in `c11_tile_manager/`, C3+C4 in `tests/e2e/replay/`). No conflicting patterns; the test layer wires the production chain together via the orchestrator.
- **Symbol uniqueness**: `RouteSpec`, `RouteExtractionError`, `extract_route_from_tlog`, `SatelliteProviderRouteClient`, `RouteSeedResult`, `SatelliteProviderRouteError`, `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError`, `OrchestratorStep`, `OrchestrationFailure`, `OrchestrationReport`, `PopulatedC6Cache` — each defined exactly once across cycle-3 production + tests. No duplicates.
- **AZ-839 b108b fix**: the hot-fix renamed `tile_store_path = cache_root / "tile_store"` → `cache_root / "tiles"` to match `PostgresFilesystemStore` layout. Cross-task consistency preserved (the path AZ-840 reads now matches the path AZ-839 writes).

No Cross-Task Consistency findings.

### Phase 7 — Architecture Compliance

**Layer-direction analysis** (against module-layout.md "Allowed Dependencies" + rule 9):

- `replay_input/tlog_route.py` (Layer 4 cross-cutting coordinator): imports `_types.geo` (Layer 1), `helpers.gps_compare` (Layer 1), `helpers.wgs_converter` (Layer 1), and intra-package `replay_input.errors` + `replay_input.tlog_ground_truth`. All imports are downward (Layer 4 → Layer 1) or intra-package. Compliant.
- `c11_tile_manager/route_client.py` (Layer 4 component): imports own subpackage (`c11_tile_manager.errors`) + third-party (`httpx`) + **`replay_input.tlog_route.RouteSpec`** — see F1. The cross-cutting `replay_input` is not in c11's allow-list per rule 9. Architecture finding F1 (High).
- `c11_tile_manager/tile_downloader.py` (Layer 4 component): modifications confined to constants. No new cross-component edges introduced.

**Public API respect**:

- `c11_tile_manager.__init__.py` re-exports the new public surface (`RouteSeedResult`, `SatelliteProviderRouteClient`, plus the new error classes). Consumers calling `from gps_denied_onboard.components.c11_tile_manager import SatelliteProviderRouteClient` reach the package's public surface. ✅
- `replay_input.__init__.py` re-exports `RouteSpec`, `RouteExtractionError`, `extract_route_from_tlog`. ✅
- The F1 violation is a public API respect violation in the OPPOSITE direction: `c11.route_client` reaches into `replay_input.tlog_route` (a sub-module path) rather than the package's `__init__` re-export — but the deeper issue is that no direction of this import is rule-9-compliant.

**Cyclic-dependency check**:

- New edges this cycle: `c11_tile_manager.route_client → replay_input.tlog_route` (F1) + `c11_tile_manager.route_client → c11_tile_manager.errors` (intra-package).
- `replay_input.tlog_route → c11_tile_manager.*`? No (verified via grep). Acyclic.
- `replay_input/__init__.py` re-exports `RouteSpec` from `tlog_route`. No back-edge to c11.
- No new cycles introduced.

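
The cyclic-dependency check above was done by grep; the same question can be answered mechanically with a depth-first search over the import edges. The sketch below is a generic cycle finder, not project tooling: `find_cycle` and the dictionary-of-sets graph representation are assumptions.

```python
from typing import Dict, List, Optional, Set


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of modules, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited / on current path / finished
    colour = {node: WHITE for node in graph}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        stack.append(node)
        for dep in sorted(graph.get(node, ())):
            state = colour.get(dep, WHITE)
            if state == GREY:  # back-edge: dep is on the current path
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in sorted(graph):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

Fed the two new edges listed above, the function returns `None`; adding a hypothetical `replay_input.tlog_route → c11_tile_manager.route_client` edge makes it return the offending loop.
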
**Duplicate-symbol check**: see Phase 6 — no duplicates.

**Cross-cutting concerns not locally re-implemented**: none observed. Logging via `logging.getLogger(_COMPONENT)`, FDR via `fdr_client`, helpers consumed from canonical locations.

**ADR compliance**: `_docs/02_document/adr/` directory does **not exist**. The check is skipped per `code-review/SKILL.md` Phase 7 #6 ("If the directory does not exist or has only the index file, ADRs are skipped — log this skip in the report so the absence is visible"). Carry-over: `module-layout.md` references ADR-001 (monolith), ADR-002 (build-time exclusion), ADR-009 (interface-first DI), ADR-011 (replay-as-configuration) inline; these are documented in `architecture.md` but not as standalone ADR files. If the ADR directory is created in cycle-N (per a future retro action), the cycle-3 batches should be retroactively re-evaluated against any ADR whose `Evidence` overlaps the cycle-3 changed-file set.

**Single Architecture finding**: F1 — c11.route_client imports a non-allow-listed package. Documented but unaddressed at the architecture level.

## Notes
|
||||
|
||||
- **No `## Baseline Delta` section**: `_docs/02_document/architecture_compliance_baseline.md` was identified in the cycle-2 LESSONS entry (2026-05-20 architecture) and again in the cycle-2 cumulative review notes as a cycle-2 Step 6 (Decompose) prerequisite. The baseline file was NOT created in cycle 2 retrospective and was NOT created in cycle 3 either. Carry-over → cycle-3 retrospective. Without the baseline, "carried over / resolved / newly introduced" structural-violation accounting is not possible; F1 is therefore counted as "newly introduced this cycle" by inspection (`route_client.py` is a cycle-3-new file), and F2 is "carried over from cycle 2 with worsening" by inspection of the cycle-2 cumulative review F1.
- **Cumulative-review cadence drift continues**: `/implement` Step 14.5 says K=3 default. Cycle 3 has 6 completed batches (104, 106, 107, 108, 108b, 109) without a cumulative review until this make-up review. Two cumulative reviews were due (after 104+106+107, after 108+108b+109). Cycle-2 cumulative review (`98-102`) noted the same drift and flagged it for the cycle-2 retrospective; the action did not land. Recurring. Cycle-3 retrospective should pick it up — possible mechanism: a `cumulative_review_pending: true` marker in `_docs/_autodev_state.md` that the implement skill flips on at K-batch boundaries and clears only on review file write, surfacing in the autodev Status Summary footer.
- **AZ-270 lint coverage gap**: F3 documents the gap explicitly. Adjacent: the existing-code flow's Phase A Step 2 (Architecture Baseline Scan) feeds Step 4 (Code Testability Revision) and would also benefit from a tighter lint, since baseline-mode code-review uses the same `module-layout.md` rule 9 as enforcement input.
- **Suite docs (parent)**: `<workspace-root>/../docs` does not exist (probed during R1 reconciliation). No suite-level cross-reference applies to this review.
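
The `cumulative_review_pending` marker proposed in the cadence note could behave roughly as sketched below. This is an illustration only: `ReviewCadence` and its method names are hypothetical, and the real marker would presumably be persisted in `_docs/_autodev_state.md` rather than held in memory.

```python
from dataclasses import dataclass


@dataclass
class ReviewCadence:
    """Hypothetical in-memory stand-in for the proposed state-file marker."""

    k: int = 3  # K-batch cadence (the documented default)
    batches_since_review: int = 0
    cumulative_review_pending: bool = False

    def record_batch(self) -> None:
        # Flip the marker on once a K-batch boundary is reached.
        self.batches_since_review += 1
        if self.batches_since_review >= self.k:
            self.cumulative_review_pending = True

    def record_review_written(self) -> None:
        # Only writing the review file clears the marker.
        self.batches_since_review = 0
        self.cumulative_review_pending = False


cadence = ReviewCadence()
pending_after_each_batch = []
for _batch in ["104", "106", "107", "108"]:
    cadence.record_batch()
    pending_after_each_batch.append(cadence.cumulative_review_pending)
```

In this sketch the marker stays set when further batches land without a review, which is the drift the note wants surfaced in the Status Summary footer.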
|
||||
|
||||
## Artifacts
|
||||
|
||||
- Verdict consumed by: `/implement` Step 14.5 gate (FAIL → STOP, escalate via Choose A/B/C — auto-fix not eligible for High Architecture).
- F1 carried forward to cycle-3 retrospective for action assignment; remediation candidate: 2-SP refactor task to relocate `RouteSpec` to `_types/route.py`.
- F2 carried forward to cycle-3 Step 13 (Update Docs) at minimum; severity escalation watch if the staleness persists into cycle 4.
- F3 carried forward to cycle-3 retrospective; remediation candidate: 1-SP test-update task to expand `test_ac6_only_compose_root_imports_concrete_strategies`.
- Architecture compliance baseline action: blocked across cycle 2 → cycle 3; surface in cycle-3 retrospective with explicit owner.
|
||||
@@ -0,0 +1,15 @@
|
||||
# ADR Impact — Run 02-az507-routespec-relocation
|
||||
|
||||
**Date**: 2026-05-23
|
||||
|
||||
## Scan result
|
||||
|
||||
`_docs/02_document/adr/` does not exist in this workspace. No `Status: Accepted` ADR files are in scope.
|
||||
|
||||
**Status**: `No ADRs in scope` — ADR Superseding Gate (refactor SKILL.md phase 2b.1) is satisfied trivially. No Violation rows. No Drift rows. No Aligned rows. Task creation may proceed.
|
||||
|
||||
## Rationale (per SKILL.md phase 2b.1 step 1)
|
||||
|
||||
> "If the directory does not exist or contains only the index, log `No ADRs in scope` to `RUN_DIR/analysis/adr_impact.md` and skip the rest of this gate."
|
||||
|
||||
This run logs the result and proceeds. The architectural rule that the run does enforce — `module-layout.md` rule 9 (AZ-507 cross-component contract surface) — is documented in `module-layout.md` and `architecture.md § Architecture Vision`, not in an ADR. The refactor strengthens that documented rule (by widening its lint enforcement in C03) rather than overturning it; no supersede path is needed.
|
||||
@@ -0,0 +1,70 @@
|
||||
# Refactoring Roadmap — Run 02-az507-routespec-relocation
|
||||
|
||||
**Date**: 2026-05-23
|
||||
**Run**: `_docs/04_refactoring/02-az507-routespec-relocation/`
|
||||
|
||||
## Weak Points Assessment
|
||||
|
||||
| # | Location | Description | Impact | Proposed Solution |
|---|----------|-------------|--------|------------------|
| W1 | `src/gps_denied_onboard/components/c11_tile_manager/route_client.py:56` | Imports `RouteSpec` from `gps_denied_onboard.replay_input.tlog_route`, violating module-layout.md rule 9 (AZ-507 cross-component contract surface). | High: the next task that imports a similarly-placed DTO compounds the drift; current AZ-270 lint cannot catch it (W3). | C01: relocate the DTO to `_types/route.py`. |
| W2 | `_docs/02_document/module-layout.md` (c11_tile_manager Internal list, shared/replay_input file list) | Stale relative to on-disk reality: cycle-3 additions (`route_client.py`, `tlog_route.py`) and 7 cycle-2-era cycle-internal files are unregistered in their respective sections. | Medium: `/implement` Step 4 ownership check would BLOCK any future task touching unregistered areas. Severity escalates to High if a fourth consecutive cycle leaves it stale. | C02: refresh the c11_tile_manager Internal list, the shared/replay_input file list, and add `_types/route.py`. Defer cycle-2 carry-overs outside these sections. |
| W3 | `tests/unit/test_az270_compose_root.py:194-219` | The AC-6 lint walks `components/**/*.py` and only flags `components.<X> → components.<Y>` edges, not the full rule-9 allow-list. | Medium: rule-9 enforcement is partially honor-system; F1 is the concrete consequence. | C03: widen the AST walker to enforce the full allow-list. |
|
||||
|
||||
## Gap Analysis
|
||||
|
||||
| AC of this run | Current state | Target state |
|---|---|---|
| Rule-9 violations resolved | 1 (route_client → replay_input) | 0 |
| `module-layout.md` cycle-3 entries registered | Missing: `route_client.py`, `tlog_route.py`, plus 9 cycle-2-era omissions across two sections (7 in c11_tile_manager, 2 in replay_input) | All cycle-3 entries registered; 9 omissions in the c11 + replay_input sections fixed; new `_types/route.py` registered |
| AZ-270 lint scope = rule-9 scope | Narrow (one prefix only) | Full allow-list enforced |
|
||||
|
||||
## Phased Roadmap
|
||||
|
||||
This run is a single phase by intent — three small structural fixes that share the same root cause (rule-9 enforcement gap). Sequencing within the phase:
|
||||
|
||||
1. **C01 → first** (the structural fix). Lands `_types/route.py`, retires the violating import, keeps producer-side back-compat via re-export.
2. **C02 → second** (depends on C01 because the new `_types/route.py` entry needs the file to exist). Documentation refresh; no code touch.
3. **C03 → third** (depends on C01 because the widened lint must see a clean codebase). The new lint becomes a gate for any future PR.
|
||||
|
||||
| Phase | Items | Rationale |
|-------|-------|-----------|
| Phase 1 (this run) | C01, C02, C03 | All three resolve the same cumulative-review FAIL surface; bundling them ensures rule-9 enforcement is consistent across code, doc, and lint after the run. |
|
||||
|
||||
No Phase 2 or Phase 3. The cumulative review's "out of scope" items (cycle-2 doc carry-overs, the shared_types/route.md contract doc, `architecture_compliance_baseline.md`) belong to other tasks and are explicitly deferred — not folded into this roadmap.
|
||||
|
||||
## Hardening tracks
|
||||
|
||||
| Track | Recommendation | Rationale |
|-------|----------------|-----------|
| A: Technical Debt | Skip | The run *is* technical-debt remediation (closing a rule-9 enforcement gap). Adding a separate track would expand scope artificially. |
| B: Performance Optimization | Skip | No performance concern in scope. Relocation is identity-preserving; tests do not measure perf deltas. |
| C: Security Review | Skip | No security surface affected. `RouteSpec` carries waypoint coordinates only (already shipped to operator's tlog input); the move does not change any auth, transport, or input-validation path. |
| D: All of the above | Skip | See A/B/C. |
| E: None | **Selected (default for this run)** | All three changes are themselves the structural fix; orthogonal hardening would dilute scope. The cycle-3 retrospective list captures the broader debt items (cycle-2 carry-overs, baseline doc) for separate runs. |
|
||||
|
||||
This default is recorded explicitly so the user can override at the Phase 2 BLOCKING gate. If the user wants Track C (security audit on the route-extraction path) or Track A (folding the cycle-2 carry-overs into this run), the roadmap and task list will be regenerated.
|
||||
|
||||
## Selected items
|
||||
|
||||
All `Selected`:
|
||||
|
||||
- C01 — Relocate `RouteSpec` to `_types/route.py` (2 SP, low risk).
- C02 — Refresh `module-layout.md` cycle-3 entries (2 SP, low risk).
- C03 — Widen `test_az270_compose_root` lint to full rule-9 allow-list (2 SP, medium risk).
|
||||
|
||||
**Total**: 6 SP across 3 tasks. Each task is within the user-rule cap (≤ 5 SP per task; recommended 2-3).
|
||||
|
||||
## Applicability gate
|
||||
|
||||
| Recommendation | Status | Notes |
|---|---|---|
| C01 | Selected | No constraint mismatches; identity-preserving move; backward compat via re-export. |
| C02 | Selected | Doc-only; no test impact; scope-disciplined (cycle-2 carry-overs explicitly deferred). |
| C03 | Selected | Risk-flagged: widening may expose unrelated rule-9 violation. STOP-and-surface protocol applies if encountered. |
|
||||
|
||||
No `Rejected`, no `Experimental only`, no `Needs user decision`. The Phase 2 applicability gate passes for task creation.
|
||||
|
||||
## ADR-supersede gate
|
||||
|
||||
`No ADRs in scope` — see `adr_impact.md`. Gate satisfied; no Violation/Drift/Aligned rows.
|
||||
@@ -0,0 +1,66 @@
|
||||
# Research Findings — Run 02-az507-routespec-relocation
|
||||
|
||||
**Date**: 2026-05-23
|
||||
**Mode**: guided
|
||||
**Scope**: structural relocation of one DTO + module-layout doc refresh + lint widening
|
||||
|
||||
## Project Constraint Matrix (extracted)
|
||||
|
||||
| Constraint | Source | Statement |
|-----------|--------|-----------|
| AZ-507 cross-component contract surface | `_docs/02_document/architecture.md` § Architecture Vision; `_docs/02_document/module-layout.md` rule 9 | `components/<X>/*.py` may only import from `_types/*`, `_types.inference_errors`, `helpers/*`, `config`, `logging`, `fdr_client`, `clock`, `frame_source` (interface only), and its own subpackage. |
| Cross-component DTOs live in `_types/*` | `_types/geo.py`, `_types/tile.py`, `_types/inference.py`, `_types/calibration.py`, `_types/pose.py`, `_types/state.py`, `_types/nav.py`, `_types/manifests.py`, `_types/vpr.py`, `_types/matcher.py`, `_types/matching.py`, `_types/rerank.py`, `_types/thermal.py`, `_types/emitted.py`, `_types/fc.py` (15 existing DTO files) | The user-confirmed precedent. Every shared DTO sits under `_types/`. The pattern is explicit at the package level: `_types/__init__.py` is just a marker (`"""Cross-component DTOs (type-only stubs)."""`). |
| AZ-270 lint coverage | `_docs/02_document/module-layout.md` rule 9 (cites `test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies`) | Documented as enforced by the lint; F3 of cycle-3 cumulative review confirms the lint scope is narrower than the rule. |
| Frozen + slots DTO contract | AZ-355 AC-2 (cited in `_types/geo.py`) | DTOs that cross component boundaries must use `frozen=True, slots=True` to prevent mutation-through-aliasing. |
| Epic AZ-835 acceptance criteria | `_docs/02_tasks/done/AZ-835_e2e_real_flight_validation_epic.md` and child task specs (AZ-836..AZ-840) | The replay-flow behaviour must remain functionally identical after the refactor: RouteSpec waypoint extraction, satellite-provider POST, e2e orchestrator behaviour. |
| Backward-compat for test imports | tests/* (5 files import RouteSpec from `replay_input.tlog_route` directly) | Test code is allowed to use module-level paths; only `components/<X>/*.py` is gated by rule 9. Re-export from `tlog_route.py` keeps test imports stable, so updating tests is hygiene rather than correctness. |
|
||||
|
||||
## Current state analysis
|
||||
|
||||
`RouteSpec` is currently defined at `gps_denied_onboard.replay_input.tlog_route:54-79` and re-exported from `gps_denied_onboard.replay_input` (`__init__.py:34`). The producer (`extract_route_from_tlog` at `tlog_route.py:82`) lives alongside the DTO in the same module — that part is correct (the function is a `replay_input/` concern, not a `_types/` concern). The DTO itself is consumed across a component boundary (c11) which makes it a cross-component DTO by behaviour, but its file home does not reflect that. Every other cross-component DTO in the codebase lives under `_types/*`. The asymmetry is the F1 finding.
|
||||
|
||||
**Strengths to preserve**:
|
||||
|
||||
- `RouteSpec` is `frozen=True, slots=True` — already AZ-355-compliant; the move does not relax this.
- The extractor (`extract_route_from_tlog`) is correctly placed in `replay_input/` and uses the DTO via local import; this composition is preserved post-move.
- Tests cover both producer-side (14 unit tests) and consumer-side (full route_client AC suite plus integration). Phase 6 has a strong safety net.
|
||||
|
||||
**Weakness being corrected**:
|
||||
|
||||
- The DTO's file home does not match its semantic role (cross-component contract surface).
- The AZ-270 lint cannot detect the asymmetry because its check is narrower than the rule it claims to enforce.
|
||||
|
||||
## Alternative approaches considered
|
||||
|
||||
| # | Approach | Verdict | Why |
|---|----------|---------|-----|
| 1 | Move `RouteSpec` to `_types/route.py` (the recommended path) | **Selected** | Matches the user-confirmed precedent (`_types/inference.py`, `_types/tile.py`, etc.), satisfies rule 9 at c11's import site, identity-preserving (Python class object identity is preserved across imports), behaviour-neutral. |
| 2 | Move `RouteSpec` to `_types/replay.py` (group with other replay-related types if they appear later) | Rejected | No other replay-related shared DTOs exist today. Naming the file `route.py` mirrors the naming convention of other `_types/*.py` files (one DTO topic per file: `geo`, `tile`, `pose`, `nav`, etc.). Premature speculative grouping. |
| 3 | Move `RouteSpec` to `_types/contracts/route.py` (introduce a sub-namespace) | Rejected | `_types/` is currently flat. Introducing a sub-namespace for one DTO is over-engineering and would require updating the rule-9 allow-list (`_types/*` already matches recursively in the lint, but the documentation pattern would diverge). |
| 4 | Amend rule 9 to admit `replay_input.tlog_route` as an allowed import for components | Rejected (architecture-change path; option D in the original FAIL gate) | The user explicitly chose option B (mechanical refactor) over option D (rule amendment). Approach 4 would weaken rule 9 and break the layering invariant, which is why the user rejected it. |
| 5 | Keep `RouteSpec` in `replay_input/tlog_route.py` and add a custom shim under `_types/` that re-exports it (no real move) | Rejected | Cosmetic: does not satisfy the underlying rule because the c11 import would still resolve to a `replay_input` module via the shim. The lint's correct widened form (C03) would still flag the original location as the canonical home. |
|
||||
|
||||
**Selected: Approach 1.** No library replacement, no SDK addition, no framework introduction. Therefore the `context7` per-mode verification gate (SKILL phase 2a) is not triggered — the gate fires only for replacement libraries/SDKs/frameworks/services. This is a structural code move within the existing codebase.
|
||||
|
||||
## API capability verification
|
||||
|
||||
**Not applicable.** The refactor introduces no new library, SDK, framework, or service. The "replacement" is the file home of a dataclass within the same Python package. No `context7` lookup is required (the gate is explicit: "for every replacement library/SDK/framework"). No MVE is required (no external API to verify). The project's pinned mode is unchanged because no mode exists to pin — it's a pure-Python dataclass relocation.
|
||||
|
||||
## Constraint-fit table
|
||||
|
||||
| Recommendation | Pinned mode/config | Constraints checked | API capability evidence | Mismatches/disqualifiers | Status |
|---|---|---|---|---|---|
| C01: relocate `RouteSpec` to `_types/route.py` | N/A (Python dataclass, no library mode) | AZ-507 rule 9, frozen+slots invariant (AZ-355), Epic AZ-835 ACs, test backward compat | N/A (no external API) | None | Selected |
| C02: refresh `module-layout.md` | N/A (documentation) | AZ-507 rule 9 (the rule the doc enforces), scope discipline (cycle-2 carry-overs deferred to a separate task) | N/A | None | Selected |
| C03: widen AZ-270 lint | N/A (internal AST walker, stdlib `ast` module) | Rule-9 allow-list as the predicate; preserves existing AC-6 narrow check as a strict subset | N/A (stdlib only) | Risk: may expose unrelated rule-9 violation (mitigated by STOP-and-surface protocol if encountered) | Selected |
|
||||
|
||||
All three changes are `Selected`. No `Rejected`, `Experimental only`, or `Needs user decision` rows — the applicability gate (Phase 2 BLOCKING) passes for all three.
|
||||
|
||||
## References
|
||||
|
||||
- `_docs/02_document/architecture.md` § Architecture Vision (AZ-507 cross-component contract surface)
- `_docs/02_document/module-layout.md` rule 9 (AZ-507 enforcement)
- `_docs/03_implementation/cumulative_review_batches_104-109_cycle3_report.md` (F1, F2, F3 — the source findings)
- `src/gps_denied_onboard/_types/geo.py` (canonical pattern for `_types/<topic>.py`)
- `src/gps_denied_onboard/_types/inference.py`, `_types/tile.py`, `_types/calibration.py` (additional precedent — user-cited examples)
- `tests/unit/test_az270_compose_root.py:194-219` (current narrow lint)
|
||||
@@ -0,0 +1,100 @@
|
||||
# Baseline Metrics — Run 02-az507-routespec-relocation
|
||||
|
||||
**Date**: 2026-05-23
|
||||
**Run**: `_docs/04_refactoring/02-az507-routespec-relocation/`
|
||||
**Mode**: guided
|
||||
**Source**: cycle-3 cumulative review (`_docs/03_implementation/cumulative_review_batches_104-109_cycle3_report.md`) — F1, F2, F3
|
||||
**Scope**: mechanical relocation of cross-component DTO + module-layout doc refresh + AZ-270 lint scope expansion
|
||||
|
||||
## Why a minimal baseline is appropriate for this run
|
||||
|
||||
The standard Phase-0 baseline metric grid (overall coverage, complexity, code smells, performance, dependencies, build time) is **not the right instrument** for this refactoring run. The work is a structural relocation of one frozen dataclass + a documentation refresh + a lint widening. Behaviour does not change; performance does not change; coverage does not change; dependency count does not change. A LOC-and-cyclomatic-complexity baseline would record near-zero deltas and would obscure the actual signal — whether the architectural rule (`module-layout.md` rule 9) is satisfied after the run.
|
||||
|
||||
What matters here, and is captured below, is:
|
||||
|
||||
1. The **structural baseline**: one rule-9 violation today (F1).
2. The **test baseline**: which tests cover the affected import paths and that they pass at HEAD (the safety net for Phase 4).
3. The **doc baseline**: which artifacts are stale (F2) and what "complete" looks like.
4. The **lint baseline**: what AZ-270 currently catches vs. what rule 9 says it should catch (F3).
|
||||
|
||||
Phases 5/6 verify that (a) the structural baseline goes from 1 → 0 rule-9 violations, (b) every test still passes, (c) the doc baseline is reconciled, and (d) the lint baseline is widened.
|
||||
|
||||
## 1. Structural baseline (rule-9 violations)
|
||||
|
||||
Source of truth: `_docs/02_document/module-layout.md` rule 9 (AZ-507 cross-component contract surface).
|
||||
|
||||
| # | File | Importer (Component) | Imported (Module) | Allow-listed for importer? |
|---|------|----------------------|-------------------|----------------------------|
| 1 | `src/gps_denied_onboard/components/c11_tile_manager/route_client.py:56` | `c11_tile_manager` | `gps_denied_onboard.replay_input.tlog_route` (`RouteSpec`) | **NO**: `replay_input` is not in c11's allow-list |
|
||||
|
||||
Search method: `rg "^from gps_denied_onboard\." src/gps_denied_onboard/components` filtered against the allow-list (`_types/*`, `_types.inference_errors`, `helpers/*`, `config`, `logging`, `fdr_client`, `clock`, `frame_source`).
|
||||
|
||||
**Target post-run**: 0 violations.
|
||||
|
||||
## 2. Test baseline (safety net for Phase 4)
|
||||
|
||||
Files that import `RouteSpec`, `SatelliteProviderRouteClient`, or `RouteSeedResult` (i.e. the symbols the relocation touches):
|
||||
|
||||
**Production source** (must be updated):
|
||||
- `src/gps_denied_onboard/components/c11_tile_manager/route_client.py` — defines the import to be re-pointed
- `src/gps_denied_onboard/components/c11_tile_manager/__init__.py` — public API re-exports (no import path change)
- `src/gps_denied_onboard/replay_input/tlog_route.py` — defines `RouteSpec` today (will lose the local definition, gain an import + alias)
- `src/gps_denied_onboard/replay_input/__init__.py` — public API re-exports (will re-export from `_types.route` instead of `tlog_route`)
|
||||
|
||||
**Tests** (verify still pass; update imports only if they reach into the pre-relocation internal path `replay_input.tlog_route` directly):
|
||||
- `tests/unit/replay_input/test_tlog_route.py` (14 tests; producer-side)
- `tests/unit/c11_tile_manager/test_route_client.py` (consumer-side unit tests)
- `tests/integration/c11_tile_manager/test_route_client_e2e.py` (integration)
- `tests/e2e/replay/conftest.py`
- `tests/e2e/replay/_operator_pre_flight.py`
- `tests/e2e/replay/test_operator_pre_flight_driver.py`
- `tests/e2e/replay/test_operator_pre_flight_integration.py`
- `tests/e2e/replay/_e2e_orchestrator.py`
- `tests/e2e/replay/test_e2e_orchestrator_unit.py`
- `tests/fixtures/derkachi_c6/seed_route.py`
|
||||
|
||||
**HEAD test status (asserted, not measured here)**: per cycle-3 batch reports 104, 106, 107, 108, 108b, 109, every committed batch ended with passing tests at the per-batch full run. The cumulative review (FAIL on F1) is a static-analysis verdict, not a test-run verdict — no test failures are attributable to F1 today. Phase 4 will run the affected test files first; Phase 6 runs the project's full test gate per the existing-code flow's test policy.
|
||||
|
||||
## 3. Doc baseline (F2 surface area)
|
||||
|
||||
`_docs/02_document/module-layout.md` is stale relative to on-disk reality. The following entries diverge today:
|
||||
|
||||
**c11_tile_manager — Internal list** lists 2 files (`satellite_provider_downloader.py`, `satellite_provider_uploader.py`); on-disk has 8 internal files plus `route_client.py` (cycle-3 NEW). Missing entries: `_types.py`, `config.py`, `errors.py`, `idempotent_retry.py`, `signing_key.py`, `tile_downloader.py`, `tile_uploader.py`, `route_client.py`.
|
||||
|
||||
**shared/replay_input file list** lists `__init__.py`, `interface.py`, `tlog_video_adapter.py`, `auto_sync.py`, `tests/`; on-disk adds `errors.py` (cycle-2 carry), `tlog_ground_truth.py` (cycle-2 carry), `tlog_route.py` (cycle-3 NEW). After the relocation, `tlog_route.py` stays (it still owns `extract_route_from_tlog`); `_types/route.py` is added.
|
||||
|
||||
**Cycle-2 carry-overs** still unaddressed (out of this run's scope unless the user expands it; surfaced in F2 of the cumulative review):
|
||||
- `replay_api/` package (7 files; needs Per-Component Mapping entry).
- `cli/render_map.py`, `cli/replay_api_entrypoint.py` (need Shared section entries).
- `helpers/gps_compare.py`, `helpers/accuracy_report.py` (need Shared section entries).
|
||||
|
||||
**Target post-run** (in-scope): c11_tile_manager Internal list refreshed (route_client + the 7 long-standing internals); shared/replay_input file list refreshed (tlog_route + tlog_ground_truth + errors); new `_types/route.py` registered. Cycle-2 carry-overs are deferred to a separate doc-only task unless user expands scope.
|
||||
|
||||
## 4. Lint baseline (F3)
|
||||
|
||||
`tests/unit/test_az270_compose_root.py:194-219` (`test_ac6_only_compose_root_imports_concrete_strategies`) walks `src/gps_denied_onboard/components/**/*.py` and flags only edges whose `node.module` starts with `gps_denied_onboard.components.` AND whose leaf-component is not the importer's component. The full rule-9 allow-list (8 prefixes plus `frame_source` interface-only restriction) is NOT enforced.
|
||||
|
||||
**Concrete miss demonstrated by F1**: the c11 → replay_input edge passes this lint silently because `replay_input` is not under `components/`.
|
||||
|
||||
**Target post-run** (in-scope): expand the lint to enforce rule 9's full allow-list. Remaining design choices (whether to allow `frame_source` non-interface modules, whether to treat `runtime_root` exception case-sensitively) are addressed in C03's task spec.
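
The gap between the narrow check and the full rule-9 predicate can be sketched with the stdlib `ast` module. This is an assumption-laden illustration, not the project's lint: the function names are hypothetical, only absolute `from ... import ...` statements are inspected, and the `frame_source` interface-only restriction is not modelled.

```python
import ast

ROOT = "gps_denied_onboard"
# Assumed encoding of the rule-9 allow-list; `_types.inference_errors` is
# covered by the `_types` prefix.
ALLOWED = tuple(
    f"{ROOT}.{name}"
    for name in ("_types", "helpers", "config", "logging", "fdr_client", "clock", "frame_source")
)


def _under(module: str, prefix: str) -> bool:
    """True when `module` is `prefix` itself or one of its submodules."""
    return module == prefix or module.startswith(prefix + ".")


def _first_party_imports(source: str) -> list[str]:
    """Targets of absolute `from <module> import ...` inside the project package."""
    return [
        node.module
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.ImportFrom)
        and node.level == 0
        and node.module
        and _under(node.module, ROOT)
    ]


def narrow_violations(source: str, component: str) -> list[str]:
    """Flags only components.<X> -> components.<Y> edges (the narrow behaviour described above)."""
    own = f"{ROOT}.components.{component}"
    return [
        m
        for m in _first_party_imports(source)
        if _under(m, f"{ROOT}.components") and not _under(m, own)
    ]


def full_violations(source: str, component: str) -> list[str]:
    """Flags every first-party import outside the allow-list and the own subpackage."""
    own = f"{ROOT}.components.{component}"
    return [
        m
        for m in _first_party_imports(source)
        if not _under(m, own) and not any(_under(m, p) for p in ALLOWED)
    ]


# Sample importer source resembling the F1 situation.
SAMPLE = (
    "from gps_denied_onboard.replay_input.tlog_route import RouteSpec\n"
    "from gps_denied_onboard._types import geo\n"
    "from gps_denied_onboard.components.c11_tile_manager import errors\n"
)
narrow_result = narrow_violations(SAMPLE, "c11_tile_manager")
full_result = full_violations(SAMPLE, "c11_tile_manager")
```

On this sample the narrow check reports nothing, while the full predicate reports the `replay_input.tlog_route` edge. Every edge the narrow check flags is also flagged by the full one, which matches the "strict subset" property C03 is meant to preserve.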
|
||||
|
||||
## 5. Functionality inventory
|
||||
|
||||
This run touches no public-feature surfaces. The DTO `RouteSpec` continues to be re-exported from `gps_denied_onboard.replay_input` (the public package), so consumers using `from gps_denied_onboard.replay_input import RouteSpec` see no change. Consumers reaching into `replay_input.tlog_route` directly (an internal-module path) also keep working, because `tlog_route.py` re-exports the relocated class; repointing them at `_types.route` is recommended hygiene, and this set is small and lives entirely under `tests/`. There is no operator-facing CLI / endpoint / config schema change.
|
||||
|
||||
## Self-verification
|
||||
|
||||
- [x] RUN_DIR created with auto-incremented prefix (`02-az507-routespec-relocation`; previous: `01-testability-refactoring`)
- [x] All metric categories reasoned about — standard categories noted N/A with reason; relevant baselines (structural, test, doc, lint) captured
- [x] Functionality inventory complete (no functionality change in scope)
- [x] Measurements are reproducible (rg + glob commands documented)
|
||||
|
||||
## BLOCKING — Phase 0 gate
|
||||
|
||||
Awaiting user confirmation of:
|
||||
|
||||
1. The minimal-baseline rationale (no LOC/coverage/perf metrics for a mechanical relocation).
2. The structural / test / doc / lint baseline above as the "before" state Phase 6 will compare against.
3. The scope decision: cycle-2 doc carry-overs are **OUT** of this run unless explicitly expanded.
|
||||
|
||||
If confirmed, Phase 1 produces `RUN_DIR/list-of-changes.md` (already drafted alongside this file as the guided-mode input).
|
||||
@@ -0,0 +1,59 @@
|
||||
# Logical Flow Analysis — Run 02-az507-routespec-relocation
|
||||
|
||||
**Date**: 2026-05-23
|
||||
**Scope**: data path of `RouteSpec` from producer (replay_input) to consumer (c11_tile_manager) and back to operator-pre-flight orchestration
|
||||
|
||||
## Documented flow (from architecture / Epic AZ-835 spec)
|
||||
|
||||
```
tlog (binary) ──► extract_route_from_tlog (replay_input/tlog_route)
    └─► RouteSpec (frozen dataclass, immutable)
        └─► SatelliteProviderRouteClient.seed_region (components/c11_tile_manager/route_client)
            └─► RouteSeedResult ─► satellite-provider POST /api/satellite/route
                ─► (HTTP success) tile coverage primed
```
|
||||
|
||||
## Trace through code (HEAD)
|
||||
|
||||
| Step | File | Behaviour |
|------|------|-----------|
| 1. Produce | `replay_input/tlog_route.py:166` (`extract_route_from_tlog` return) | Constructs `RouteSpec(waypoints, suggested_region_size_meters, source_tlog, source_segment, total_distance_meters)` |
| 2. Hold | (consumer-side variable) | `RouteSpec` instance is `frozen=True, slots=True`; it cannot be mutated by either side |
| 3. Consume | `components/c11_tile_manager/route_client.py:56` import | Reads `route.waypoints`, `route.suggested_region_size_meters` to build the satellite-provider POST body |
| 4. Validate | `components/c11_tile_manager/route_client.py` (RouteValidationError path) | Validates `route` shape against c11's RouteValidationError preconditions; pure read access |
| 5. Carry | `tests/e2e/replay/_operator_pre_flight.py:72` import | Operator-pre-flight harness threads the same RouteSpec through the e2e flow |
|
||||
|
||||
## Identity & equality semantics post-relocation
|
||||
|
||||
The relocation moves the **definition** of `RouteSpec` from `gps_denied_onboard.replay_input.tlog_route` to `gps_denied_onboard._types.route`. After the move:
|
||||
|
||||
- Python's class identity is preserved across imports: `gps_denied_onboard.replay_input.tlog_route.RouteSpec is gps_denied_onboard._types.route.RouteSpec` ⇒ `True` (the same class object is bound at two names).
- `dataclasses.is_dataclass(...)`, `isinstance(...)`, `__eq__`, and `__hash__` are unchanged because they derive from the class object, not from the import path.
- `frozen=True, slots=True` semantics are preserved (no per-instance dict, no setattr after construction).
- The `__module__` attribute of the class becomes `gps_denied_onboard._types.route` (not `gps_denied_onboard.replay_input.tlog_route`). This is observable via:
  - `pickle` (the module path is encoded; new pickles record the new path, and pickles written before the move keep loading only for as long as `tlog_route` re-exports the class. No production code path pickles `RouteSpec`; checked: no `pickle.dumps(route)` or equivalent in src/ or tests/)
  - `repr(RouteSpec)` (shows `<class 'gps_denied_onboard._types.route.RouteSpec'>` post-move)
  - `RouteSpec.__module__` (changes, but no test inspects this; checked: no `__module__` assertion in tests/)
|
||||
|
||||
## Contradictions / data-loss / wasted-work checks
|
||||
|
||||
Per Phase 1 step 1c categories:
|
||||
|
||||
- **Fixed-size vs dynamic-size assumptions**: N/A — `RouteSpec.waypoints` is `tuple[tuple[float, float], ...]`, length is data-driven (1 to `max_waypoints`). No fixed-size pad/truncate path.
- **Loop scoping**: N/A — RouteSpec is a leaf DTO, no internal loop semantics.
- **Wasted computation**: N/A — relocation does not change call sites.
- **Silent data loss**: N/A — relocation is a name-only change at the type level; the values stored in `RouteSpec` instances are unchanged.
- **Doc drift**: confirmed by F2 of cumulative review — `module-layout.md` diverges from on-disk reality. Remediation is in scope as C02.
|
||||
|
||||
## Cross-component edge analysis (rule-9 audit, post-relocation)
|
||||
|
||||
| Edge | Importer | Imported | Allow-listed? | Status |
|------|----------|----------|---------------|--------|
| Pre-relocation | `c11_tile_manager/route_client.py` | `replay_input.tlog_route.RouteSpec` | NO | violation (F1) |
| Post-relocation | `c11_tile_manager/route_client.py` | `_types.route.RouteSpec` | YES (`_types/*` is in c11's allow-list) | compliant |
|
||||
|
||||
No other rule-9 cross-component edge becomes a violation as a side effect of this move. The producer side (`replay_input/tlog_route.py` → `_types/route.py`) is a coordinator → DTO edge, which is always allowed (DTOs have no allow-list restriction; they're consumed everywhere).
|
||||
|
||||
## Conclusion
|
||||
|
||||
The relocation is a pure structural change with no behavioural, performance, or contract-shape side effects. The only observable difference is `RouteSpec.__module__`, which is not asserted on by any code path. Phase 4 execution can proceed as a mechanical move; Phase 6 verification is satisfied if all tests pass and the rule-9 audit reports zero violations.
|
||||
@@ -0,0 +1,71 @@
|
||||
# List of Changes
|
||||
|
||||
**Run**: 02-az507-routespec-relocation
|
||||
**Mode**: guided
|
||||
**Source**: `_docs/03_implementation/cumulative_review_batches_104-109_cycle3_report.md` (cycle-3 cumulative review, FAIL verdict, F1 + F2 + F3)
|
||||
**Date**: 2026-05-23
|
||||
|
||||
## Summary
|
||||
|
||||
Resolve the cycle-3 cumulative review's FAIL verdict by (a) relocating the `RouteSpec` DTO to its rule-9-compliant home in `_types/route.py`, (b) refreshing the stale `module-layout.md` cycle-3 file inventory, and (c) widening the AZ-270 lint to enforce the full rule-9 allow-list rather than only `components → components` edges. The work is mechanical — no behaviour, no performance, no contract shape changes.
|
||||
|
||||
## Changes
|
||||
|
||||
### C01: Relocate `RouteSpec` DTO from `replay_input/tlog_route.py` to `_types/route.py`
|
||||
|
||||
- **File(s)**:
  - **NEW**: `src/gps_denied_onboard/_types/route.py` — owns the `RouteSpec` dataclass definition (frozen, slots, with full docstring carried over verbatim).
  - **MOD**: `src/gps_denied_onboard/replay_input/tlog_route.py` — remove the local `RouteSpec` class definition (lines 54–79); add `from gps_denied_onboard._types.route import RouteSpec` near the existing `_types.geo` import; keep `RouteSpec` in `__all__` so `from replay_input.tlog_route import RouteSpec` continues to resolve (test code uses this path; it's a re-export, not a violation).
  - **MOD**: `src/gps_denied_onboard/replay_input/__init__.py` — change line 34 to import `RouteSpec` from `gps_denied_onboard._types.route` directly (canonical), keep importing `RouteExtractionError` and `extract_route_from_tlog` from `tlog_route` (they stay there).
  - **MOD**: `src/gps_denied_onboard/components/c11_tile_manager/route_client.py:56` — change to `from gps_denied_onboard._types.route import RouteSpec` (the actual rule-9 fix). Also update the docstring snippet at file-top that reads `Takes a gps_denied_onboard.replay_input.tlog_route.RouteSpec` → `Takes a gps_denied_onboard._types.route.RouteSpec`.
  - **MOD (optional, hygiene)**: test imports — 5 test files (`tests/unit/replay_input/test_tlog_route.py:46`, `tests/unit/c11_tile_manager/test_route_client.py:49`, `tests/e2e/replay/_operator_pre_flight.py:72`, `tests/e2e/replay/test_e2e_orchestrator_unit.py:37`, `tests/e2e/replay/test_operator_pre_flight_driver.py:61`) currently import `RouteSpec` from `replay_input.tlog_route`. They continue to work via the re-export (see above). Updating them to import from `_types.route` is hygiene, not correctness; recommended but not blocking. The integration test `tests/integration/c11_tile_manager/test_route_client_e2e.py:26` imports `extract_route_from_tlog` (not `RouteSpec`) — no change needed. The lazy import in `tests/e2e/replay/conftest.py:406` and the CLI fixture `tests/fixtures/derkachi_c6/seed_route.py:80` import `extract_route_from_tlog` only — no change needed.
|
||||
|
||||
- **Problem**: `components/c11_tile_manager/route_client.py:56` imports `RouteSpec` from `gps_denied_onboard.replay_input.tlog_route`. Per `module-layout.md` rule 9, `components/<X>/*.py` may only import from a finite allow-list (`_types/*`, `_types.inference_errors`, `helpers/*`, `config`, `logging`, `fdr_client`, `clock`, `frame_source` interface only). `replay_input` is not in this list — it's a Layer-4 cross-cutting coordinator, and Layer-4 → Layer-4 cross-cutting edges are not declared as allowed in the layering table. The import was committed in batch 107 (AZ-838); the AZ-270 lint did not catch it because the lint walks only `components → components` edges (see C03).
|
||||
- **Change**: Move the DTO definition to `_types/route.py`, where it sits among the other shared DTOs (`_types/geo.py`, `_types/tile.py`, `_types/inference.py`, etc.). Update the c11 import to point at the new location. Producer-side (`replay_input/tlog_route.py`) re-imports the DTO so its own return type, `__all__`, and existing test imports keep working — that's a coordinator importing from `_types/*`, a flow that is always allowed for non-`components/<X>` modules.
|
||||
- **Rationale**: `_types/*` is the architecturally designated home for cross-component DTOs (per AZ-507; per `_docs/02_document/architecture.md` `## Architecture Vision`). Every other shared DTO already lives there. Putting `RouteSpec` there makes the c11 → DTO edge a `components/<X>` → `_types/*` edge, which is allow-listed. This matches the pattern for `_types/inference.py`, `_types/tile.py`, `_types/calibration.py`, `_types/pose.py`, etc. — the user-confirmed precedent.
|
||||
- **Constraint Fit**:
  - AZ-507 cross-component contract surface — satisfied (the violating edge becomes compliant).
  - Epic AZ-835 acceptance criteria — preserved; behaviour unchanged.
  - `RouteSpec` immutability (`frozen=True, slots=True`) — preserved verbatim.
  - Backward compatibility for producer-side test imports (`from replay_input.tlog_route import RouteSpec`) — preserved via re-export.
  - No public-API / CLI / endpoint shape change — confirmed in baseline_metrics §5.
|
||||
- **Risk**: low (mechanical move; identity-preserving; logical-flow analysis confirms no observable side effects beyond `__module__`, which no code asserts on).
|
||||
- **Dependencies**: None.
|
||||
|
||||
### C02: Refresh `module-layout.md` to register cycle-3 additions + new `_types/route.py`
|
||||
|
||||
- **File(s)**: `_docs/02_document/module-layout.md` (single file).
|
||||
- **Problem**: The cumulative review's F2 surfaces that `module-layout.md` is stale. Cycle-2 carry-overs are still unaddressed; cycle 3 added more entries that are not registered. Specifically:
  - **c11_tile_manager Internal list** is missing `_types.py`, `config.py`, `errors.py`, `idempotent_retry.py`, `signing_key.py`, `tile_downloader.py`, `tile_uploader.py`, **`route_client.py`** (cycle-3 NEW from batch 107).
  - **shared/replay_input file list** is missing `errors.py` (cycle-2 carry), `tlog_ground_truth.py` (cycle-2 carry), **`tlog_route.py`** (cycle-3 NEW from batch 106).
  - **`_types/` file list** does not yet include `route.py` (added in C01).
|
||||
- **Change**: Append the missing entries to bring `module-layout.md` in sync with on-disk reality for the c11_tile_manager, replay_input, and `_types/` sections. Add `_types/route.py` to the `_types/` section with a one-line description (consistent with how the other `_types/*.py` files are listed). Cycle-2 carry-overs *outside* these three sections (`replay_api/`, `cli/render_map.py`, `cli/replay_api_entrypoint.py`, `helpers/gps_compare.py`, `helpers/accuracy_report.py`) are NOT in this run's scope — they remain on the cycle-3 retrospective list and should be addressed in a follow-up doc task that is independent of the architectural fix here.
|
||||
- **Rationale**: `/implement` Step 4 (File Ownership) treats `module-layout.md` as authoritative; staleness there is a BLOCKING gate when a future task touches an unregistered area. F2 is currently Medium; the cumulative review notes severity escalates to High if a fourth consecutive cycle leaves it stale. Resolving the cycle-3 portion now keeps the fix scoped to the same surface as C01 + the route_client + tlog_route additions that triggered the cumulative review in the first place.
|
||||
- **Constraint Fit**:
  - `module-layout.md` rule 9 — strengthened (the document now reflects what `_types/*` actually owns).
  - No code or behavioural change.
  - Scope discipline — does NOT pull in cycle-2 carry-overs outside the run's three sections; they are deferred to a separate task.
|
||||
- **Risk**: low (doc-only; reviewable by diff; no test impact).
|
||||
- **Dependencies**: C01 (the `_types/route.py` entry depends on the file existing).
|
||||
|
||||
### C03: Expand `test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies` to enforce the full rule-9 allow-list
|
||||
|
||||
- **File(s)**:
  - **MOD**: `tests/unit/test_az270_compose_root.py:194-219` — replace the current narrow check (`node.module.startswith("gps_denied_onboard.components.")` with a different leaf-component) with a check that walks `components/**/*.py`, parses each `ImportFrom`, and for any `node.module` starting with `gps_denied_onboard.` asserts that the imported module is in the rule-9 allow-list (i.e. matches one of: `gps_denied_onboard.components.<own-component>.*`, `gps_denied_onboard._types.*`, `gps_denied_onboard._types.inference_errors`, `gps_denied_onboard.helpers.*`, `gps_denied_onboard.config`, `gps_denied_onboard.logging`, `gps_denied_onboard.fdr_client`, `gps_denied_onboard.clock`, `gps_denied_onboard.frame_source` interface-only).
  - **MOD (test docstring)**: update the test's docstring to cite the full rule-9 paragraph, not just AC-6 of AZ-270.
|
||||
- **Problem**: F3 of the cumulative review documents that `module-layout.md` rule 9 is described as "enforced by `test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies`", but the lint actually enforces only a single exclusion (imports from `gps_denied_onboard.components.<other_component>`), not the eight-entry allow-list. Imports from `replay_input`, `replay_api`, `runtime_root`, `cli/*`, and `frame_source` non-interface modules pass silently. F1 is the concrete consequence; the next task that imports from a similarly placed module would compound the drift.
|
||||
- **Change**: Widen the AST walker to a single-branch decision: "is the imported module rooted in `gps_denied_onboard.` AND not in the rule-9 allow-list (parameterised against the importer's own component for the `components.<own>.*` clause)? → fail with a message that names the offending edge and the rule." The existing error message format (compose-root test failure) is preserved; only the predicate is widened.
|
||||
- **Rationale**: Lint coverage matters more than rule wording. F3 surfaces a maintainability risk: the rule and its enforcement diverge silently. Closing the gap forecloses the F1 class of regressions at lint time, not at cumulative-review time.
|
||||
- **Constraint Fit**:
  - `module-layout.md` rule 9 — enforced as documented.
  - Existing AZ-270 AC-6 — preserved (the new check is a strict superset of the old check).
  - No behaviour change in production code.
  - Self-check: running the widened lint at HEAD (before C01 lands) reproduces F1 as a lint failure; running it at the C01 + C02 tip reports zero violations. This is the test the run hinges on.
|
||||
- **Risk**: medium — the widening will catch any *other* in-flight rule-9 violation hiding in the codebase, which could surface a second remediation task. If the widened lint exposes an unrelated violation, the implement skill should STOP and surface it for a scope decision rather than auto-bundle. Risk is reduced by the fact that rule-9 audits during code review have not flagged anything else.
|
||||
- **Dependencies**: C01 must land first (otherwise the widened lint fails on the very edge C01 fixes; running tests in the order C01 → C03 means C03 sees a clean baseline). C02 ordering is independent.
|
||||
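The widened predicate described in C03 can be sketched as a small AST walk. This is an illustrative sketch under stated assumptions, not the actual test body: the function names `is_allowed` and `rule9_violations` are hypothetical, the allow-list is transcribed from the rule-9 wording above, and treating the "`frame_source` interface only" clause as the exact package path is an assumption, since the interface module is not named here.

```python
import ast

ROOT = "gps_denied_onboard"
# Transcribed from the rule-9 allow-list quoted above. `_types.inference_errors`
# is already covered by the `_types` prefix.
ALLOWED_PREFIXES = ("_types", "helpers")
# Assumption: "frame_source interface only" is modelled as the bare package path.
ALLOWED_EXACT = ("config", "logging", "fdr_client", "clock", "frame_source")


def is_allowed(module: str, own_component: str) -> bool:
    """Return True if a component file may import `module` under rule 9."""
    if not module.startswith(ROOT + "."):
        return True  # stdlib / third-party imports are outside rule 9
    rel = module[len(ROOT) + 1 :]
    own = f"components.{own_component}"
    if rel == own or rel.startswith(own + "."):
        return True  # a component may always import from itself
    if any(rel == p or rel.startswith(p + ".") for p in ALLOWED_PREFIXES):
        return True
    return rel in ALLOWED_EXACT


def rule9_violations(source: str, own_component: str) -> list[str]:
    """List offending modules from absolute `from X import Y` statements.

    Plain `import X` statements are not inspected, mirroring the
    ImportFrom-only scope described in C03.
    """
    offending: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if not is_allowed(node.module, own_component):
                offending.append(node.module)
    return offending


pre_c01 = "from gps_denied_onboard.replay_input.tlog_route import RouteSpec\n"
post_c01 = "from gps_denied_onboard._types.route import RouteSpec\n"
before = rule9_violations(pre_c01, "c11_tile_manager")
after = rule9_violations(post_c01, "c11_tile_manager")
```

Run against the two import lines from the edge table, the sketch flags the pre-relocation edge and accepts the post-relocation one, which is the self-check the Constraint Fit bullet describes.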
|
||||
## Out of scope for this run
|
||||
|
||||
- **Cycle-2 module-layout carry-overs** outside the three sections C02 touches (`replay_api/` Per-Component Mapping, `cli/render_map.py`, `cli/replay_api_entrypoint.py`, `helpers/gps_compare.py`, `helpers/accuracy_report.py`) — recorded as cycle-3 retrospective follow-up; needs a separate doc task with its own AZ ID.
|
||||
- **Contract documentation for `RouteSpec` at `_docs/02_document/contracts/shared_types/route.md`** — the cumulative review noted this as a possible Spec-Gap follow-up. It is a documentation addition, not a refactor. Defer to whoever owns the Spec-Gap workflow; do not bundle here.
|
||||
- **`architecture_compliance_baseline.md`** — separate cycle-2 retrospective action that has been outstanding for two cycles; recorded as such in the cumulative review's footer note. Out of this run's scope.
|
||||
@@ -6,9 +6,9 @@ step: 10
|
||||
name: Implement
|
||||
status: in_progress
|
||||
sub_step:
|
||||
phase: 7
|
||||
name: batch-loop
|
||||
detail: ""
|
||||
phase: 3
|
||||
name: refactor-safety-net
|
||||
detail: "02-az507; Phase 2 confirmed; ready for Phase 3 safety-net check in fresh session"
|
||||
retry_count: 0
|
||||
cycle: 3
|
||||
tracker: jira
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
# D-CROSS-CVE-1 opencv-python pin deferred — gtsam/numpy ABI block
|
||||
|
||||
**Recorded**: 2026-05-11T02:55+03:00 (Europe/Kyiv)
|
||||
**Last replay attempt**: 2026-05-23T13:14+03:00 (Europe/Kyiv) — replay re-checked
|
||||
**Last replay attempt**: 2026-05-23T13:44+03:00 (Europe/Kyiv) — replay re-checked
|
||||
at start of next `/autodev` invocation. PyPI re-queried via
|
||||
`python3 -m pip index versions gtsam`: only `gtsam 4.2` is published.
|
||||
Replay condition (numpy>=2 stable wheels) still NOT met. Leftover remains open.
|
||||
|
||||
@@ -0,0 +1,655 @@
|
||||
"""E2E orchestrator for the AZ-835 7-step pipeline (AZ-840 / Epic AZ-835 C4).
|
||||
|
||||
Wraps the AZ-699 verdict-report writing path with the AZ-839 C3
|
||||
fixture's `PopulatedC6Cache` so a single Tier-2 test can run from
|
||||
``(tlog, video, calibration)`` to a horizontal-error report without
|
||||
operator hand-curation between steps. The 7-step Epic narrative
|
||||
(``_docs/02_tasks/todo/AZ-840_e2e_orchestrator_test.md``):
|
||||
|
||||
1. Active flight cut + tlog/video sync — handled by ``gps-denied-replay``
|
||||
``--auto-trim`` (AZ-405 / AZ-698) inside the airborne binary.
|
||||
2. On-fly frame + IMU extraction — same binary's per-frame loop.
|
||||
3. Auto-create route — done by the C3 fixture
|
||||
(``operator_pre_flight_setup`` calls ``extract_route_from_tlog``).
|
||||
4. POST route to satellite-provider — C3 fixture (AZ-838
|
||||
``SatelliteProviderRouteClient.seed_route``).
|
||||
5. Build FAISS index — C3 fixture (AZ-322 ``DescriptorBatcher``).
|
||||
6. Run gps-denied airborne pipeline — this module's
|
||||
``_run_replay_subprocess`` invokes ``gps-denied-replay`` against
|
||||
the populated cache.
|
||||
7. Get GPS fixes, check vs tlog GPS — this module's
|
||||
``_load_ground_truth`` + ``horizontal_error_distribution`` +
|
||||
``render_report`` writes the verdict markdown.
|
||||
|
||||
The C3 fixture mutates ``c6_tile_cache.root_dir`` to point at a
|
||||
``tmp_path_factory.mktemp`` value (AZ-839 batch 108b). The static
|
||||
operator YAML at ``GPS_DENIED_OPERATOR_CONFIG_PATH`` cannot know
|
||||
that path. ``write_effective_replay_config`` reads the static YAML,
|
||||
overlays the ``c6_tile_cache.root_dir`` override, writes the merged
|
||||
result to a tmp file, and returns the path the airborne binary
|
||||
will load via ``--config``. This keeps a single source of truth
|
||||
for the cache_root override across the in-memory C3 fixture path
|
||||
and the subprocess airborne path.
|
||||
|
||||
Public surface — re-exported from this module:
|
||||
|
||||
* :class:`OrchestratorStep` — failure-step labels per AC-5 ("fails
|
||||
LOUD with a clear error pointing at the failing step").
|
||||
* :class:`OrchestrationFailure` — wraps the underlying exception
|
||||
with the step that produced it.
|
||||
* :class:`OrchestrationReport` — return value of
|
||||
:func:`run_e2e_orchestration` (verdict, distribution, paths,
|
||||
wall-clock measurements per AC-4).
|
||||
* :func:`write_effective_replay_config` — small helper for the
|
||||
config merge step.
|
||||
* :func:`run_e2e_orchestration` — the AC-1 entry point.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import datetime
|
||||
import json
|
||||
import logging
|
||||
import subprocess
|
||||
import time
|
||||
from collections.abc import Callable, Mapping
|
||||
from dataclasses import dataclass
|
||||
from enum import Enum
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import yaml
|
||||
|
||||
from gps_denied_onboard.helpers.accuracy_report import (
|
||||
AC3_GATE_PCT,
|
||||
AC3_GATE_THRESHOLD_M,
|
||||
ReportContext,
|
||||
render_report,
|
||||
verdict_passes_ac3,
|
||||
)
|
||||
from gps_denied_onboard.helpers.gps_compare import (
|
||||
GroundTruthRow,
|
||||
HorizontalErrorDistribution,
|
||||
horizontal_error_distribution,
|
||||
)
|
||||
from gps_denied_onboard.replay_input import load_tlog_ground_truth
|
||||
|
||||
from tests.e2e.replay._operator_pre_flight import PopulatedC6Cache
|
||||
|
||||
__all__ = [
|
||||
"OrchestrationFailure",
|
||||
"OrchestrationReport",
|
||||
"OrchestratorStep",
|
||||
"read_calibration_acquisition_method",
|
||||
"run_e2e_orchestration",
|
||||
"write_effective_replay_config",
|
||||
]
|
||||
|
||||
|
||||
# Replay-subprocess wall-clock cap for the Derkachi clip per AZ-840
|
||||
# AC-4 (15 min soft target). Exposed as a default that the integration
|
||||
# test can override; the unit tests rely on the contract that the
|
||||
# runner argument is a free callable.
|
||||
_DEFAULT_MAX_SECONDS: float = 900.0
|
||||
|
||||
_LOGGER = logging.getLogger("tests.e2e.replay.e2e_orchestrator")
|
||||
|
||||
|
||||
class OrchestratorStep(str, Enum):
|
||||
"""Labels for the 7-step pipeline used by :class:`OrchestrationFailure`.
|
||||
|
||||
AC-5: every failure that reaches the test surface must name the
|
||||
step that produced it. The string values are stable so test
|
||||
assertions and log readers can match on them.
|
||||
"""
|
||||
|
||||
VALIDATE_INPUTS = "validate_inputs"
|
||||
WRITE_EFFECTIVE_CONFIG = "write_effective_config"
|
||||
AIRBORNE_PIPELINE = "airborne_pipeline"
|
||||
PARSE_EMISSIONS = "parse_emissions"
|
||||
LOAD_GROUND_TRUTH = "load_ground_truth"
|
||||
COMPUTE_DISTRIBUTION = "compute_distribution"
|
||||
RENDER_REPORT = "render_report"
|
||||
|
||||
|
||||
class OrchestrationFailure(RuntimeError):
|
||||
"""Failure inside one of the 7 orchestration steps (AC-5).
|
||||
|
||||
The :attr:`step` attribute names the failing step; the message
|
||||
embeds it as the prefix so plain log-readers see the failure
|
||||
location without inspecting the exception object.
|
||||
"""
|
||||
|
||||
def __init__(self, step: OrchestratorStep, message: str) -> None:
|
||||
super().__init__(f"[{step.value}] {message}")
|
||||
self.step = step
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class OrchestrationReport:
|
||||
"""Return value of :func:`run_e2e_orchestration`.
|
||||
|
||||
Attributes:
|
||||
verdict_passed: ``True`` iff the run met the AZ-696 epic
|
||||
AC-3 gate (>= AC3_GATE_PCT% within AC3_GATE_THRESHOLD_M m).
|
||||
distribution: Computed horizontal-error distribution.
|
||||
report_path: Markdown report written under ``report_dir``.
|
||||
emissions_count: Total estimator-output records consumed.
|
||||
wall_clock_s: Wall-clock seconds for the orchestration run
|
||||
(excludes the C3 fixture setup; covers steps 1-2-6-7).
|
||||
replay_subprocess_seconds: Wall-clock seconds the airborne
|
||||
replay subprocess took. Always <= ``wall_clock_s``.
|
||||
"""
|
||||
|
||||
verdict_passed: bool
|
||||
distribution: HorizontalErrorDistribution
|
||||
report_path: Path
|
||||
emissions_count: int
|
||||
wall_clock_s: float
|
||||
replay_subprocess_seconds: float
|
||||
|
||||
|
||||
def read_calibration_acquisition_method(calibration_path: Path) -> str:
|
||||
"""Return the AZ-702 ``acquisition_method`` field, or ``"unknown"``.
|
||||
|
||||
Mirrors ``test_derkachi_real_tlog._read_calibration_acquisition_method``
|
||||
so the AZ-840 verdict report can name the calibration provenance
|
||||
in its failure message (AZ-699 AC-3). Pure helper; the report
|
||||
writer needs the string, not the JSON.
|
||||
"""
|
||||
try:
|
||||
data = json.loads(calibration_path.read_text())
|
||||
except (OSError, ValueError):  # ValueError covers JSONDecodeError and UnicodeDecodeError
|
||||
return "unknown"
|
||||
method = data.get("acquisition_method") if isinstance(data, dict) else None
|
||||
if isinstance(method, str) and method:
|
||||
return method
|
||||
return "unknown"
|
||||
|
||||
|
||||
def write_effective_replay_config(
|
||||
*,
|
||||
base_config_path: Path,
|
||||
cache_root: Path,
|
||||
output_path: Path,
|
||||
) -> Path:
|
||||
"""Merge cache_root override into the static operator YAML.
|
||||
|
||||
Reads ``base_config_path`` as YAML, sets the
|
||||
``c6_tile_cache.root_dir`` to ``cache_root`` (forcing the
|
||||
FAISS index path to fall back to ``<cache_root>/descriptor.index``),
|
||||
and writes the merged document to ``output_path`` as YAML.
|
||||
|
||||
The merge is field-level: every other block in the base YAML is
|
||||
preserved verbatim. This keeps a single source of truth for the
|
||||
operator config — the test harness only contributes the dynamic
|
||||
cache_root.
|
||||
|
||||
Returns:
|
||||
The ``output_path`` argument, for ergonomic chaining.
|
||||
|
||||
Raises:
|
||||
OrchestrationFailure (step=WRITE_EFFECTIVE_CONFIG): Base YAML
|
||||
unreadable, malformed, or not a top-level mapping; or the
merged config cannot be written to ``output_path``.
|
||||
"""
|
||||
|
||||
try:
|
||||
base_text = base_config_path.read_text()
|
||||
except OSError as exc:
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.WRITE_EFFECTIVE_CONFIG,
|
||||
f"cannot read base config at {base_config_path}: {exc!r}",
|
||||
) from exc
|
||||
|
||||
try:
|
||||
base_data = yaml.safe_load(base_text) or {}
|
||||
except yaml.YAMLError as exc:
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.WRITE_EFFECTIVE_CONFIG,
|
||||
f"base config YAML at {base_config_path} is malformed: {exc!r}",
|
||||
) from exc
|
||||
if not isinstance(base_data, dict):
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.WRITE_EFFECTIVE_CONFIG,
|
||||
f"base config YAML at {base_config_path} must be a mapping; "
|
||||
f"got {type(base_data).__name__}",
|
||||
)
|
||||
|
||||
c6_block_raw = base_data.get("c6_tile_cache")
|
||||
c6_block = dict(c6_block_raw) if isinstance(c6_block_raw, dict) else {}
|
||||
c6_block["root_dir"] = str(cache_root)
|
||||
c6_block["faiss_index_path"] = ""
|
||||
base_data["c6_tile_cache"] = c6_block
|
||||
|
||||
try:
|
||||
output_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
output_path.write_text(
|
||||
yaml.safe_dump(base_data, sort_keys=True, default_flow_style=False)
|
||||
)
|
||||
except OSError as exc:
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.WRITE_EFFECTIVE_CONFIG,
|
||||
f"cannot write effective config at {output_path}: {exc!r}",
|
||||
) from exc
|
||||
return output_path
|
||||
|
||||
|
||||
def run_e2e_orchestration(
|
||||
*,
|
||||
populated_cache: PopulatedC6Cache,
|
||||
base_config_path: Path,
|
||||
tlog_path: Path,
|
||||
video_path: Path,
|
||||
calibration_path: Path,
|
||||
signing_key_path: Path,
|
||||
replay_binary: Path,
|
||||
output_path: Path,
|
||||
report_dir: Path,
|
||||
effective_config_path: Path,
|
||||
run_date_utc: str | None = None,
|
||||
runner: Callable[..., subprocess.CompletedProcess[str]] = subprocess.run,
|
||||
subprocess_env: Mapping[str, str] | None = None,
|
||||
max_seconds: float = _DEFAULT_MAX_SECONDS,
|
||||
logger: logging.Logger | None = None,
|
||||
) -> OrchestrationReport:
|
||||
"""Run AZ-835 steps 1-7 against the AZ-839 populated cache.
|
||||
|
||||
Steps 3-5 are the responsibility of ``populated_cache`` (the
|
||||
AZ-839 C3 fixture); this function covers 1-2-6 (the airborne
|
||||
replay subprocess) and 7 (verdict report). The C3 fixture and
|
||||
this function share the cache_root via
|
||||
:func:`write_effective_replay_config` so the airborne binary
|
||||
reads the same FAISS index the fixture wrote (AC-3).
|
||||
|
||||
Args:
|
||||
populated_cache: C3 fixture output (AZ-839). Carries
|
||||
``cache_root``, ``faiss_index_path``, and the route
|
||||
spec the test pipeline produced.
|
||||
base_config_path: Static operator config YAML
|
||||
(``GPS_DENIED_OPERATOR_CONFIG_PATH``). Must register
|
||||
``c6_tile_cache``, ``c10_provisioning``, ``c2_vpr``,
|
||||
``c4_pose``, and ``c5_state`` blocks for the airborne
|
||||
binary to compose the replay graph.
|
||||
tlog_path: ArduPilot binary tlog the test consumes.
|
||||
video_path: Flight video file the test consumes.
|
||||
calibration_path: Camera calibration JSON (AZ-702
|
||||
factory-sheet for Derkachi).
|
||||
signing_key_path: MAVLink signing-key file. Replay protocol
|
||||
Invariant 11 — required even for the noop transport.
|
||||
replay_binary: ``gps-denied-replay`` console-script path.
|
||||
output_path: Where the airborne binary writes JSONL
|
||||
estimator emissions.
|
||||
report_dir: Directory the verdict markdown is written to.
|
||||
effective_config_path: Where the cache_root-merged YAML is
|
||||
written. The path is passed to the airborne binary via
|
||||
``--config``.
|
||||
run_date_utc: ISO-8601 date for the report filename and
|
||||
header. Defaults to today UTC.
|
||||
runner: ``subprocess.run`` by default; tests inject a fake
|
||||
that emits a synthetic JSONL output.
|
||||
subprocess_env: Optional environment for the replay
subprocess. It is passed to the runner as a full
replacement, not merged with ``os.environ``. ``None``
inherits the parent process environment.
|
||||
max_seconds: Hard wall-clock cap for the airborne replay
|
||||
subprocess. The orchestrator times out the runner via
|
||||
its ``timeout`` kwarg; an exceeded budget surfaces as
|
||||
``OrchestrationFailure(step=AIRBORNE_PIPELINE)``.
|
||||
logger: Optional logger. Defaults to the module logger.
|
||||
|
||||
Returns:
|
||||
:class:`OrchestrationReport` on success. The verdict can
|
||||
be PASS or FAIL — AC-2 mandates the report exists either
|
||||
way.
|
||||
|
||||
Raises:
|
||||
OrchestrationFailure: Any of the 7 steps failed. The
|
||||
``step`` attribute names the failing step.
|
||||
"""
|
||||
|
||||
log = logger or _LOGGER
|
||||
started = time.monotonic()
|
||||
effective_run_date = run_date_utc or (
|
||||
datetime.datetime.now(datetime.timezone.utc).date().isoformat()
|
||||
)
|
||||
|
||||
_validate_inputs(
|
||||
base_config_path=base_config_path,
|
||||
tlog_path=tlog_path,
|
||||
video_path=video_path,
|
||||
calibration_path=calibration_path,
|
||||
signing_key_path=signing_key_path,
|
||||
replay_binary=replay_binary,
|
||||
report_dir=report_dir,
|
||||
)
|
||||
|
||||
write_effective_replay_config(
|
||||
base_config_path=base_config_path,
|
||||
cache_root=populated_cache.cache_root,
|
||||
output_path=effective_config_path,
|
||||
)
|
||||
|
||||
replay_subprocess_seconds = _run_replay_subprocess(
|
||||
replay_binary=replay_binary,
|
||||
video_path=video_path,
|
||||
tlog_path=tlog_path,
|
||||
output_path=output_path,
|
||||
calibration_path=calibration_path,
|
||||
config_path=effective_config_path,
|
||||
signing_key_path=signing_key_path,
|
||||
max_seconds=max_seconds,
|
||||
runner=runner,
|
||||
env=subprocess_env,
|
||||
logger=log,
|
||||
)
|
||||
|
||||
emissions = _parse_jsonl(output_path)
|
||||
|
||||
ground_truth = _load_ground_truth(tlog_path)
|
||||
|
||||
distribution = _compute_distribution(emissions, ground_truth)
|
||||
|
||||
context = ReportContext(
|
||||
run_date_utc=effective_run_date,
|
||||
tlog_path=tlog_path,
|
||||
video_path=video_path,
|
||||
calibration_acquisition_method=read_calibration_acquisition_method(
|
||||
calibration_path
|
||||
),
|
||||
clip_duration_s=(
|
||||
ground_truth[-1].t_s - ground_truth[0].t_s
|
||||
if ground_truth
|
||||
else 0.0
|
||||
),
|
||||
emissions_count=len(emissions),
|
||||
)
|
||||
verdict_passed = verdict_passes_ac3(distribution)
|
||||
report_path = _render_and_write_report(
|
||||
distribution=distribution,
|
||||
context=context,
|
||||
passed=verdict_passed,
|
||||
report_dir=report_dir,
|
||||
)
|
||||
|
||||
log.info(
|
||||
"e2e_orchestrator: report written",
|
||||
extra={
|
||||
"kind": "e2e_orchestrator.report_written",
|
||||
"kv": {
|
||||
"report_path": str(report_path),
|
||||
"verdict_passed": verdict_passed,
|
||||
"share_within_threshold_pct": (
|
||||
distribution.threshold_hit_share.get(
|
||||
AC3_GATE_THRESHOLD_M, 0.0
|
||||
)
|
||||
* 100.0
|
||||
),
|
||||
"ac3_gate_pct": AC3_GATE_PCT,
|
||||
"emissions_count": len(emissions),
|
||||
"ground_truth_pairings": distribution.count,
|
||||
},
|
||||
},
|
||||
)
|
||||
|
||||
wall_clock_s = max(0.0, time.monotonic() - started)
|
||||
return OrchestrationReport(
|
||||
verdict_passed=verdict_passed,
|
||||
distribution=distribution,
|
||||
report_path=report_path,
|
||||
emissions_count=len(emissions),
|
||||
wall_clock_s=wall_clock_s,
|
||||
replay_subprocess_seconds=replay_subprocess_seconds,
|
||||
)
|
||||
|
||||
|
||||
def _validate_inputs(
|
||||
*,
|
||||
base_config_path: Path,
|
||||
tlog_path: Path,
|
||||
video_path: Path,
|
||||
calibration_path: Path,
|
||||
signing_key_path: Path,
|
||||
replay_binary: Path,
|
||||
report_dir: Path,
|
||||
) -> None:
|
||||
"""Fail fast on missing inputs (AC-5 — surface the failing step early)."""
|
||||
file_inputs: tuple[tuple[str, Path], ...] = (
|
||||
("base_config_path", base_config_path),
|
||||
("tlog_path", tlog_path),
|
||||
("video_path", video_path),
|
||||
("calibration_path", calibration_path),
|
||||
("signing_key_path", signing_key_path),
|
||||
("replay_binary", replay_binary),
|
||||
)
|
||||
for label, path in file_inputs:
|
||||
if not path.is_file():
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.VALIDATE_INPUTS,
|
||||
f"{label} is not a file: {path}",
|
||||
)
|
||||
try:
|
||||
report_dir.mkdir(parents=True, exist_ok=True)
|
||||
except OSError as exc:
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.VALIDATE_INPUTS,
|
||||
f"report_dir {report_dir} cannot be created: {exc!r}",
|
||||
) from exc
|
||||
|
||||
|
||||
def _run_replay_subprocess(
|
||||
*,
|
||||
replay_binary: Path,
|
||||
video_path: Path,
|
||||
tlog_path: Path,
|
||||
output_path: Path,
|
||||
calibration_path: Path,
|
||||
config_path: Path,
|
||||
signing_key_path: Path,
|
||||
max_seconds: float,
|
||||
runner: Callable[..., subprocess.CompletedProcess[str]],
|
||||
env: Mapping[str, str] | None,
|
||||
logger: logging.Logger,
|
||||
) -> float:
|
||||
"""Invoke gps-denied-replay with --auto-trim; return wall-clock seconds.
|
||||
|
||||
Wraps :func:`subprocess.run` so unit tests can inject a fake
|
||||
runner. ``--auto-trim`` is always enabled here — the
|
||||
orchestrator owns the AZ-405 / AZ-698 sync path (AZ-840 step 1).
|
||||
|
||||
Raises:
|
||||
OrchestrationFailure (step=AIRBORNE_PIPELINE): Non-zero exit,
|
||||
timeout, or runner-level OSError.
|
||||
"""
|
||||
|
||||
argv = [
|
||||
str(replay_binary),
|
||||
"--video",
|
||||
str(video_path),
|
||||
"--tlog",
|
||||
str(tlog_path),
|
||||
"--output",
|
||||
str(output_path),
|
||||
"--camera-calibration",
|
||||
str(calibration_path),
|
||||
"--config",
|
||||
str(config_path),
|
||||
"--mavlink-signing-key",
|
||||
str(signing_key_path),
|
||||
"--pace",
|
||||
"asap",
|
||||
"--auto-trim",
|
||||
]
|
||||
started = time.monotonic()
|
||||
try:
|
||||
completed = runner(
|
||||
argv,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
timeout=max_seconds,
|
||||
env=dict(env) if env is not None else None,
|
||||
)
|
||||
except subprocess.TimeoutExpired as exc:
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.AIRBORNE_PIPELINE,
|
||||
f"gps-denied-replay timed out after {max_seconds:.0f} s",
|
||||
) from exc
|
||||
except OSError as exc:
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.AIRBORNE_PIPELINE,
|
||||
f"cannot launch gps-denied-replay at {replay_binary}: {exc!r}",
|
||||
) from exc
|
||||
|
||||
elapsed_s = max(0.0, time.monotonic() - started)
|
||||
if completed.returncode != 0:
|
||||
raise OrchestrationFailure(
|
||||
OrchestratorStep.AIRBORNE_PIPELINE,
|
||||
f"gps-denied-replay exited {completed.returncode}\n"
|
||||
f"stdout:\n{completed.stdout}\nstderr:\n{completed.stderr}",
|
||||
)
|
||||
logger.info(
|
||||
"e2e_orchestrator: replay subprocess complete",
|
||||
extra={
|
||||
"kind": "e2e_orchestrator.replay_subprocess",
|
||||
"kv": {
|
||||
"elapsed_s": elapsed_s,
|
||||
"max_seconds": max_seconds,
|
||||
},
|
||||
},
|
||||
)
|
||||
return elapsed_s
|
||||
|
||||
|
||||
def _parse_jsonl(path: Path) -> list[dict[str, Any]]:
    """Read one JSON record per non-blank line.

    Raises:
        OrchestrationFailure (step=PARSE_EMISSIONS): Output file
            missing, unreadable, has zero records, or contains a
            malformed line.
    """
    if not path.is_file():
        raise OrchestrationFailure(
            OrchestratorStep.PARSE_EMISSIONS,
            f"replay output JSONL not found: {path}",
        )
    try:
        text = path.read_text()
    except OSError as exc:
        raise OrchestrationFailure(
            OrchestratorStep.PARSE_EMISSIONS,
            f"replay output JSONL unreadable at {path}: {exc!r}",
        ) from exc
    rows: list[dict[str, Any]] = []
    for line_idx, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            raise OrchestrationFailure(
                OrchestratorStep.PARSE_EMISSIONS,
                f"malformed JSON at line {line_idx} of {path}: {exc.msg}",
            ) from exc
        if not isinstance(row, dict):
            raise OrchestrationFailure(
                OrchestratorStep.PARSE_EMISSIONS,
                f"line {line_idx} of {path} is not a JSON object: {row!r}",
            )
        rows.append(row)
    if not rows:
        raise OrchestrationFailure(
            OrchestratorStep.PARSE_EMISSIONS,
            f"replay output JSONL at {path} has zero records — pipeline "
            "produced no estimator emissions",
        )
    return rows


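The parsing rules of `_parse_jsonl` (skip blank lines, reject malformed JSON, reject non-object records, reject an empty result) can be tried in isolation. The sketch below works on an in-memory string and raises `ValueError` where the real helper raises `OrchestrationFailure`; the function name `parse_jsonl_text` and the `t_s` field in the sample are hypothetical.

```python
import json
from typing import Any


def parse_jsonl_text(text: str) -> list[dict[str, Any]]:
    """Parse one JSON object per non-blank line (ValueError on any defect)."""
    rows: list[dict[str, Any]] = []
    # Line numbers count blank lines too, so they match what an editor shows.
    for line_idx, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(
                f"malformed JSON at line {line_idx}: {exc.msg}"
            ) from exc
        if not isinstance(row, dict):
            raise ValueError(f"line {line_idx} is not a JSON object: {row!r}")
        rows.append(row)
    if not rows:
        raise ValueError("zero records")
    return rows


# Hypothetical sample: two records separated by a blank line.
sample = '{"t_s": 0.0}\n\n{"t_s": 0.1}\n'
records = parse_jsonl_text(sample)
```

Because `enumerate` runs over every line before the blank-line check, a defect reported "at line 3" refers to the third physical line of the file even when line 2 is blank.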
def _load_ground_truth(tlog_path: Path) -> list[GroundTruthRow]:
    """Extract WGS84 ground truth from the binary tlog.

    Raises:
        OrchestrationFailure (step=LOAD_GROUND_TRUTH): Loader
            error or empty record list.
    """
    try:
        series = load_tlog_ground_truth(tlog_path).records
    except Exception as exc:
        raise OrchestrationFailure(
            OrchestratorStep.LOAD_GROUND_TRUTH,
            f"load_tlog_ground_truth({tlog_path}) failed: {exc!r}",
        ) from exc
    rows: list[GroundTruthRow] = [
        GroundTruthRow(
            t_s=fix.ts_ns / 1e9,
            lat_deg=fix.lat_deg,
            lon_deg=fix.lon_deg,
            alt_m=fix.alt_m,
        )
        for fix in series
    ]
    if not rows:
        raise OrchestrationFailure(
            OrchestratorStep.LOAD_GROUND_TRUTH,
            f"tlog ground truth at {tlog_path} has zero rows",
        )
    return rows


def _compute_distribution(
    emissions: list[dict[str, Any]],
    ground_truth: list[GroundTruthRow],
) -> HorizontalErrorDistribution:
    """Compute the horizontal-error distribution.

    Raises:
        OrchestrationFailure (step=COMPUTE_DISTRIBUTION): Helper
            error or zero ground-truth pairings (every emission
            fell outside the GT time window).
    """
    try:
        distribution = horizontal_error_distribution(emissions, ground_truth)
    except Exception as exc:
        raise OrchestrationFailure(
            OrchestratorStep.COMPUTE_DISTRIBUTION,
            f"horizontal_error_distribution failed: {exc!r}",
        ) from exc
    if distribution.count == 0:
        raise OrchestrationFailure(
            OrchestratorStep.COMPUTE_DISTRIBUTION,
            "no emissions paired with ground truth — JSONL timestamps "
            "outside the tlog GPS window?",
        )
    return distribution


def _render_and_write_report(
    *,
    distribution: HorizontalErrorDistribution,
    context: ReportContext,
    passed: bool,
    report_dir: Path,
) -> Path:
    """Render the verdict markdown and write it to ``report_dir``.

    Raises:
        OrchestrationFailure (step=RENDER_REPORT): Render or write
            failure; ``report_dir`` was already created by
            :func:`_validate_inputs`.
    """
    try:
        report_text = render_report(distribution, context, passed=passed)
    except Exception as exc:
        raise OrchestrationFailure(
            OrchestratorStep.RENDER_REPORT,
            f"render_report failed: {exc!r}",
        ) from exc
    report_path = (
        report_dir / f"real_flight_validation_{context.run_date_utc}.md"
    )
    try:
        report_path.write_text(report_text)
    except OSError as exc:
        raise OrchestrationFailure(
            OrchestratorStep.RENDER_REPORT,
            f"cannot write report at {report_path}: {exc!r}",
        ) from exc
    return report_path
@@ -0,0 +1,474 @@
"""Operator pre-flight cache assembly driver (AZ-839 / Epic AZ-835 C3).

Replaces the placeholder ``operator_pre_flight_setup`` fixture stub at
``conftest.py`` lines 293-310 with a real driver that wires together
the four operator-side production components:

1. **C1 / AZ-836 RouteSpec** — already extracted by the caller via
   :func:`gps_denied_onboard.replay_input.tlog_route.extract_route_from_tlog`
   and handed in as :paramref:`populate_c6_from_route.route_spec`.
2. **C2 / AZ-838 SatelliteProviderRouteClient** — POSTs the route to
   satellite-provider, polls ``mapsReady``.
3. **C11 / AZ-316 + AZ-777 Phase 1 HttpTileDownloader** — pulls the
   seeded tiles from satellite-provider into C6 over a bbox derived
   from the route waypoints.
4. **C10 / AZ-322 DescriptorBatcher** — rebuilds the FAISS HNSW
   descriptor index over the populated C6 cache (NetVLAD backbone per
   ``c2_vpr/config.py:67``).

The descriptor index sidecar coherence (AZ-306 triple-consistency:
``.index`` + ``.sha256`` + ``.meta.json``) is verified by re-loading
the index after rebuild via the caller-supplied
``descriptor_index_factory``; any tampering surfaces as
:class:`IndexUnavailableError`.

Public surface — re-exported from this module:

* :class:`PopulatedC6Cache` — frozen dataclass returned on success.
* :func:`populate_c6_from_route` — the driver function.

Cleanup-on-failure removes any FAISS sidecar files produced inside the
driver if any later step raises. Tile-store rows written by C11 are
NOT deleted (the C6 store owns its own rollback semantics — leaving
those rows enables idempotent re-runs via the C11 download journal).
"""

from __future__ import annotations

import logging
import time
from collections.abc import Callable
from dataclasses import dataclass
from pathlib import Path
from typing import Any
from uuid import UUID, uuid4

from gps_denied_onboard.components.c10_provisioning.descriptor_batcher import (
    BatcherOutcome,
    CorpusFilter,
    DescriptorBatcher,
)
from gps_denied_onboard.components.c11_tile_manager import (
    DownloadOutcome,
    DownloadRequest,
    HttpTileDownloader,
    SectorClassification,
)
from gps_denied_onboard.components.c11_tile_manager.errors import (
    RouteTerminalFailureError,
    RouteTransientError,
    RouteValidationError,
)
from gps_denied_onboard.components.c11_tile_manager.route_client import (
    SatelliteProviderRouteClient,
)
from gps_denied_onboard.components.c6_tile_cache.errors import (
    IndexUnavailableError,
)
from gps_denied_onboard.components.c6_tile_cache.faiss_descriptor_index import (
    META_SUFFIX,
)
from gps_denied_onboard.helpers.sha256_sidecar import SIDECAR_SUFFIX
from gps_denied_onboard.replay_input.tlog_route import RouteSpec

__all__ = [
    "PopulatedC6Cache",
    "populate_c6_from_route",
]


# Mirror C11's existing schedule so the fixture does not introduce a
# parallel retry budget. AC-5 ties our per-attempt cap (3) to the
# documented pause cadence; the schedule itself lives in the
# downloader module and is re-stated here so tests can override.
_DEFAULT_RETRY_SCHEDULE_S: tuple[float, ...] = (1.0, 2.0, 4.0, 8.0)
_DEFAULT_MAX_RETRY_ATTEMPTS: int = 3
_DEFAULT_ZOOM_LEVEL: int = 18
_DEFAULT_SECTOR_CLASS: SectorClassification = SectorClassification.ACTIVE_CONFLICT
# Metres per degree of latitude on a sphere with the WGS84 equatorial
# radius (2 * pi * 6378137 / 360 ~= 111319.5); reused from C11
# route-coverage enumeration (route_client._enumerate_route_tile_coords).
# Re-stated here so the driver does not depend on a private constant.
_METERS_PER_DEGREE_LAT: float = 111_320.0

_LOGGER = logging.getLogger(
    "tests.e2e.replay.operator_pre_flight"
)


@dataclass(frozen=True, slots=True)
class PopulatedC6Cache:
    """Output of :func:`populate_c6_from_route`.

    Mirrors the public-surface dataclass documented in the AZ-839 spec.
    All paths point at on-disk artifacts that survive the fixture's
    ``session`` scope (when mounted on the named docker volume the
    e2e-runner declares); ``elapsed_seconds`` powers the AC-1 / AC-2
    perf budget assertions.
    """

    cache_root: Path
    tile_store_path: Path
    faiss_index_path: Path
    faiss_sidecar_sha256_path: Path
    faiss_sidecar_meta_path: Path
    route_spec: RouteSpec
    tile_count: int
    elapsed_seconds: float


def populate_c6_from_route(
    *,
    route_spec: RouteSpec,
    route_client: SatelliteProviderRouteClient,
    tile_downloader: HttpTileDownloader,
    descriptor_batcher: DescriptorBatcher,
    descriptor_index_factory: Callable[[], Any],
    cache_root: Path,
    tile_store_path: Path,
    faiss_index_path: Path,
    flight_id: UUID | None = None,
    sector_class: SectorClassification = _DEFAULT_SECTOR_CLASS,
    zoom_level: int = _DEFAULT_ZOOM_LEVEL,
    region_size_meters: float | None = None,
    retry_schedule_s: tuple[float, ...] = _DEFAULT_RETRY_SCHEDULE_S,
    max_retry_attempts: int = _DEFAULT_MAX_RETRY_ATTEMPTS,
    sleep: Callable[[float], None] = time.sleep,
    monotonic: Callable[[], float] = time.monotonic,
    logger: logging.Logger | None = None,
) -> PopulatedC6Cache:
    """Drive the full C1+C2+C11+C10 pipeline end-to-end.

    Args:
        route_spec: Coarsened route from AZ-836's
            :func:`extract_route_from_tlog`. The caller chooses the
            tlog (typically a session-scoped fixture); this driver is
            tlog-agnostic.
        route_client: Configured C2 client. Built from env vars by the
            production fixture; injected as a stub by unit tests.
        tile_downloader: Configured C11 downloader. Same wiring rules.
        descriptor_batcher: Configured C10 batcher; its rebuild path
            owns the on-disk FAISS write (atomic via
            :class:`Sha256Sidecar`).
        descriptor_index_factory: Zero-arg callable that constructs a
            FRESH descriptor index pointed at ``faiss_index_path``.
            Production passes
            ``lambda: FaissDescriptorIndex.from_config(config)``; the
            constructor auto-loads via
            :meth:`FaissDescriptorIndex._load`, raising
            :class:`IndexUnavailableError` on triple-consistency
            failure (AC-3 / AC-6 verification).
        cache_root: Root directory mounted on the named docker volume
            that survives across pytest sessions.
        tile_store_path: Where C6's :class:`TileStore` writes JPEG
            blobs. Carried on the result for downstream tests.
        faiss_index_path: Final ``.index`` blob path. Sidecars live at
            ``<faiss_index_path>.sha256`` + ``<faiss_index_path>.meta.json``.
        flight_id: C11 download-journal key; defaults to a fresh UUID
            so two fixture sessions never collide their journals.
        sector_class: C11 / C6 sector classification. Defaults to
            ``ACTIVE_CONFLICT`` — Derkachi is an active-conflict
            corridor; ``STABLE_REAR`` is for non-Ukraine clips.
        zoom_level: Single Web-Mercator zoom level the fixture
            populates. AZ-839 spec defaults to 18 to match
            ``seed_route.py`` ergonomics; tests override for speed.
        region_size_meters: Edge length in metres of the per-waypoint
            coverage square (see :func:`_route_bbox`).
            ``None`` falls back to
            :attr:`RouteSpec.suggested_region_size_meters`.
        retry_schedule_s: Pause cadence between transient retries.
            Defaults to C11's documented ``_DEFAULT_BACKOFF_SCHEDULE_S``.
        max_retry_attempts: Total :meth:`seed_route` attempts on
            transient error before propagating (AC-5 — final
            attempt's exception is propagated unchanged).
        sleep: Test override for the retry pause; production passes
            :func:`time.sleep`.
        monotonic: Test override for elapsed-time measurement.
        logger: Optional logger. Defaults to the module logger.

    Returns:
        :class:`PopulatedC6Cache` on success.

    Raises:
        RouteValidationError: Pre-emptive validation or HTTP 4xx —
            propagated unchanged with original cause (AC-4).
        RouteTerminalFailureError: ``mapsReady`` never reached or
            terminal failure status — propagated unchanged (AC-4).
        RouteTransientError: 5xx / network / timeout AFTER all retry
            attempts have been exhausted (AC-5).
        IndexUnavailableError: Triple-consistency check failed after
            rebuild — sidecars are corrupt / mismatched (AC-3 / AC-6).
        RuntimeError: C11 ``download_tiles_for_area`` returned a
            non-success outcome OR C10 ``populate_descriptors``
            returned :attr:`BatcherOutcome.FAILURE`.

    Notes:
        Cleanup behaviour (AC-7) — if any step raises after the
        rebuild has begun writing sidecar files, the partial files
        (.index, .sha256, .meta.json) are removed before the
        exception propagates so a re-run starts from a clean slate.
        Tile-store rows are NOT deleted on cleanup; the C11 download
        journal owns idempotent re-run semantics.
    """

    log = logger or _LOGGER
    if max_retry_attempts < 1:
        raise ValueError(
            f"max_retry_attempts must be >= 1; got {max_retry_attempts}"
        )

    started_monotonic = monotonic()
    effective_flight_id = flight_id or uuid4()
    effective_region_size = float(
        region_size_meters
        if region_size_meters is not None
        else route_spec.suggested_region_size_meters
    )
    if effective_region_size <= 0:
        raise ValueError(
            f"region_size_meters must be > 0; got {effective_region_size}"
        )
    if not route_spec.waypoints:
        raise ValueError("route_spec.waypoints must be non-empty")

    sidecar_paths = (
        faiss_index_path,
        Path(str(faiss_index_path) + SIDECAR_SUFFIX),
        Path(str(faiss_index_path) + META_SUFFIX),
    )
    pre_existing_sidecar = {p: p.is_file() for p in sidecar_paths}

    try:
        seed_result = _seed_route_with_retry(
            route_client=route_client,
            spec=route_spec,
            region_size_meters=effective_region_size,
            zoom_level=zoom_level,
            retry_schedule_s=retry_schedule_s,
            max_retry_attempts=max_retry_attempts,
            sleep=sleep,
            logger=log,
        )

        bbox = _route_bbox(
            waypoints=route_spec.waypoints,
            region_size_meters=effective_region_size,
        )
        download_request = DownloadRequest(
            flight_id=effective_flight_id,
            bbox_min_lat=bbox[0],
            bbox_min_lon=bbox[1],
            bbox_max_lat=bbox[2],
            bbox_max_lon=bbox[3],
            zoom_levels=(int(zoom_level),),
            sector_class=sector_class,
            cache_root=cache_root,
        )
        download_report = tile_downloader.download_tiles_for_area(download_request)
        if download_report.outcome not in {
            DownloadOutcome.SUCCESS,
            DownloadOutcome.IDEMPOTENT_NO_OP,
        }:
            raise RuntimeError(
                "C11 download_tiles_for_area returned non-success outcome "
                f"{download_report.outcome.value!r}; "
                f"requested={download_report.tiles_requested} "
                f"downloaded={download_report.tiles_downloaded} "
                f"rejected_resolution={download_report.tiles_rejected_resolution} "
                f"rejected_freshness={download_report.tiles_rejected_freshness}"
            )

        log.info(
            "operator_pre_flight: tiles populated",
            extra={
                "kind": "operator_pre_flight.tiles_populated",
                "kv": {
                    "route_id": str(seed_result.route_id),
                    "seeded_tile_count": seed_result.tile_count,
                    "downloaded_tiles": download_report.tiles_downloaded,
                    "request_hash": download_report.request_hash,
                },
            },
        )

        corpus_filter = CorpusFilter(
            bbox=bbox,
            zoom_levels=(int(zoom_level),),
            sector_class=sector_class.value,
        )
        batcher_report = descriptor_batcher.populate_descriptors(corpus_filter)
        if batcher_report.outcome is not BatcherOutcome.SUCCESS:
            raise RuntimeError(
                "C10 populate_descriptors returned FAILURE: "
                f"{batcher_report.failure_reason}"
            )

        verifier_index = descriptor_index_factory()
        log.debug(
            "operator_pre_flight: sidecar coherence verified",
            extra={
                "kind": "operator_pre_flight.sidecar_verified",
                "kv": {
                    "faiss_index_path": str(faiss_index_path),
                    "verifier_type": type(verifier_index).__name__,
                },
            },
        )

        elapsed_seconds = max(0.0, monotonic() - started_monotonic)
        return PopulatedC6Cache(
            cache_root=cache_root,
            tile_store_path=tile_store_path,
            faiss_index_path=faiss_index_path,
            faiss_sidecar_sha256_path=sidecar_paths[1],
            faiss_sidecar_meta_path=sidecar_paths[2],
            route_spec=route_spec,
            tile_count=batcher_report.tiles_consumed,
            elapsed_seconds=elapsed_seconds,
        )
    except BaseException:
        _cleanup_partial_sidecars(
            sidecar_paths=sidecar_paths,
            pre_existing=pre_existing_sidecar,
            logger=log,
        )
        raise


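The driver builds its sidecar paths by appending a suffix to the full index path string rather than calling `Path.with_suffix`. The sketch below shows why that distinction matters. The suffix values `.sha256` and `.meta.json` are assumptions taken from the docstring's description of the sidecar layout, not read from the real `SIDECAR_SUFFIX` / `META_SUFFIX` constants, and `sidecar_paths_for` is a hypothetical helper.

```python
from pathlib import Path

# Assumed values, inferred from the "<faiss_index_path>.sha256" and
# "<faiss_index_path>.meta.json" layout described in the docstring.
SIDECAR_SUFFIX = ".sha256"
META_SUFFIX = ".meta.json"


def sidecar_paths_for(index_path: Path) -> tuple[Path, Path, Path]:
    """Return (index, sha256 sidecar, meta sidecar) by string append."""
    return (
        index_path,
        Path(str(index_path) + SIDECAR_SUFFIX),
        Path(str(index_path) + META_SUFFIX),
    )


index = Path("cache") / "descriptor.index"
appended = sidecar_paths_for(index)

# with_suffix REPLACES the existing ".index" extension instead of
# appending to it, so it would point at a different file name.
replaced = index.with_suffix(SIDECAR_SUFFIX)
```

With string append the checksum file sits next to the index as `descriptor.index.sha256`; `with_suffix` would instead yield `descriptor.sha256`, which no longer identifies which blob it belongs to.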
def _seed_route_with_retry(
    *,
    route_client: SatelliteProviderRouteClient,
    spec: RouteSpec,
    region_size_meters: float,
    zoom_level: int,
    retry_schedule_s: tuple[float, ...],
    max_retry_attempts: int,
    sleep: Callable[[float], None],
    logger: logging.Logger,
) -> Any:
    """Call ``seed_route`` with bounded transient retries (AC-5).

    Validation / terminal-failure errors propagate IMMEDIATELY with
    their original cause (AC-4 — no silent swallow). Only
    :class:`RouteTransientError` triggers the retry ladder; the final
    attempt's exception is re-raised unchanged so the caller sees
    the actual transient signal that exhausted the budget.
    """
    last_transient: RouteTransientError | None = None
    for attempt in range(1, max_retry_attempts + 1):
        try:
            return route_client.seed_route(
                spec,
                region_size_meters=region_size_meters,
                zoom_level=zoom_level,
            )
        except (RouteValidationError, RouteTerminalFailureError):
            raise
        except RouteTransientError as exc:
            last_transient = exc
            logger.warning(
                "operator_pre_flight: route seed transient failure",
                extra={
                    "kind": "operator_pre_flight.route_seed.transient",
                    "kv": {
                        "attempt": attempt,
                        "max_attempts": max_retry_attempts,
                        "exc": repr(exc),
                    },
                },
            )
            if attempt >= max_retry_attempts:
                raise
            pause_s = retry_schedule_s[
                min(attempt - 1, len(retry_schedule_s) - 1)
            ]
            sleep(pause_s)
    # Defensive — the loop body always returns or raises before this.
    if last_transient is not None:
        raise last_transient
    raise RuntimeError(
        "operator_pre_flight: seed_route loop exited without result"
    )


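The retry ladder above indexes the schedule with `min(attempt - 1, len(schedule) - 1)`, so the last entry is reused once the schedule runs out, and no pause follows the final attempt. The sketch below isolates that arithmetic to show which pauses are actually slept when every attempt fails. `pauses_before_giving_up` is a hypothetical helper, and its empty-schedule guard is an addition of this sketch rather than something the driver checks.

```python
def pauses_before_giving_up(
    schedule: tuple[float, ...], max_attempts: int
) -> list[float]:
    """Pauses slept when every seed_route attempt fails transiently."""
    if max_attempts < 1:
        raise ValueError(f"max_attempts must be >= 1; got {max_attempts}")
    if not schedule:
        # Assumption of this sketch: reject an empty schedule up front.
        raise ValueError("schedule must be non-empty")
    pauses: list[float] = []
    for attempt in range(1, max_attempts + 1):
        if attempt >= max_attempts:
            break  # final attempt: the error propagates, nothing is slept
        # Clamp the index so long retry budgets reuse the last pause.
        pauses.append(schedule[min(attempt - 1, len(schedule) - 1)])
    return pauses


default_pauses = pauses_before_giving_up((1.0, 2.0, 4.0, 8.0), 3)
long_budget_pauses = pauses_before_giving_up((1.0, 2.0, 4.0, 8.0), 6)
```

With the module defaults (three attempts, schedule `(1.0, 2.0, 4.0, 8.0)`) only the first two entries are ever used, for a worst-case added delay of 3 seconds before the transient error propagates.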
def _route_bbox(
    *,
    waypoints: tuple[tuple[float, float], ...],
    region_size_meters: float,
) -> tuple[float, float, float, float]:
    """Bounding box of every waypoint's coverage square.

    Mirrors the local enumeration in
    :func:`gps_denied_onboard.components.c11_tile_manager.route_client._enumerate_route_tile_coords`
    by taking ``region_size_meters`` as the per-waypoint square edge
    and unioning the lat/lon extents. The result is a single bbox
    that the C11 :meth:`HttpTileDownloader.download_tiles_for_area`
    Protocol consumes; C11 then runs the standard slippy-map
    enumeration over that bbox at the requested zoom level.

    Returns:
        ``(min_lat, min_lon, max_lat, max_lon)`` — matching
        :class:`DownloadRequest`'s field order.
    """

    import math

    half = region_size_meters / 2.0
    min_lat = float("inf")
    max_lat = float("-inf")
    min_lon = float("inf")
    max_lon = float("-inf")
    for lat_deg, lon_deg in waypoints:
        lat_delta_deg = half / _METERS_PER_DEGREE_LAT
        cos_lat = math.cos(math.radians(lat_deg))
        if cos_lat <= 1e-9:
            cos_lat = 1e-9
        lon_delta_deg = half / (_METERS_PER_DEGREE_LAT * cos_lat)
        min_lat = min(min_lat, lat_deg - lat_delta_deg)
        max_lat = max(max_lat, lat_deg + lat_delta_deg)
        min_lon = min(min_lon, lon_deg - lon_delta_deg)
        max_lon = max(max_lon, lon_deg + lon_delta_deg)

    if min_lat >= max_lat or min_lon >= max_lon:
        raise ValueError(
            "operator_pre_flight: degenerate bbox from route waypoints "
            f"(min_lat={min_lat}, max_lat={max_lat}, "
            f"min_lon={min_lon}, max_lon={max_lon})"
        )
    return (min_lat, min_lon, max_lat, max_lon)


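The metre-to-degree conversion inside `_route_bbox` can be checked with a small worked example. The sketch below reproduces only the per-waypoint half-extent computation; the waypoint latitude of 50.0 degrees and the 1000 m square edge are hypothetical inputs chosen for illustration, and `square_half_extent_deg` is not a function in the driver.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # same constant the driver re-states


def square_half_extent_deg(
    lat_deg: float, region_size_meters: float
) -> tuple[float, float]:
    """Half-extent of a coverage square in (latitude, longitude) degrees."""
    half = region_size_meters / 2.0
    lat_delta_deg = half / METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks by cos(latitude); clamp near the
    # poles so the division stays finite, as the driver does.
    cos_lat = max(math.cos(math.radians(lat_deg)), 1e-9)
    lon_delta_deg = half / (METERS_PER_DEGREE_LAT * cos_lat)
    return lat_delta_deg, lon_delta_deg


# Hypothetical waypoint at 50 degrees north with a 1000 m square edge.
lat_delta, lon_delta = square_half_extent_deg(50.0, 1000.0)
equator_lat_delta, equator_lon_delta = square_half_extent_deg(0.0, 1000.0)
```

At 50 degrees north the half-extent comes out at roughly 0.0045 degrees of latitude and 0.0070 degrees of longitude, so the box is wider in degrees than it is tall even though it is square on the ground; at the equator the two deltas coincide.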
def _cleanup_partial_sidecars(
    *,
    sidecar_paths: tuple[Path, ...],
    pre_existing: dict[Path, bool],
    logger: logging.Logger,
) -> None:
    """Remove sidecar files this driver may have produced.

    Only files that did NOT exist when the driver started AND now
    exist are removed — pre-existing files (a warm cache from a prior
    run) are preserved. OS errors during cleanup are logged but do
    not mask the original exception.
    """

    for path in sidecar_paths:
        if pre_existing.get(path, False):
            continue
        if not path.exists():
            continue
        try:
            path.unlink()
            logger.warning(
                "operator_pre_flight: cleaned up partial sidecar",
                extra={
                    "kind": "operator_pre_flight.cleanup.removed",
                    "kv": {"path": str(path)},
                },
            )
        except OSError as exc:
            logger.error(
                "operator_pre_flight: cleanup unlink failed",
                extra={
                    "kind": "operator_pre_flight.cleanup.failed",
                    "kv": {"path": str(path), "exc": repr(exc)},
                },
            )
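The cleanup contract (snapshot which files exist before the run, then delete only the ones that appeared afterwards) can be demonstrated against a temporary directory. This is a simplified sketch: `snapshot` and `remove_new_files` are hypothetical names, logging is omitted, and unlink errors are swallowed rather than logged.

```python
import tempfile
from pathlib import Path


def snapshot(paths: tuple[Path, ...]) -> dict[Path, bool]:
    """Record which candidate files already exist before the run."""
    return {p: p.is_file() for p in paths}


def remove_new_files(
    paths: tuple[Path, ...], pre_existing: dict[Path, bool]
) -> list[Path]:
    """Delete only files that appeared after the snapshot was taken."""
    removed: list[Path] = []
    for path in paths:
        if pre_existing.get(path, False) or not path.exists():
            continue
        try:
            path.unlink()
            removed.append(path)
        except OSError:
            pass  # the real driver logs here instead of masking the cause
    return removed


with tempfile.TemporaryDirectory() as tmp:
    warm = Path(tmp) / "descriptor.index"            # from a prior run
    partial = Path(tmp) / "descriptor.index.sha256"  # written this run
    warm.write_bytes(b"old index")
    before = snapshot((warm, partial))
    partial.write_bytes(b"half-written")
    removed = remove_new_files((warm, partial), before)
    warm_survived = warm.exists()
    partial_survived = partial.exists()
```

One consequence worth keeping in mind: a file that existed before the run and was then overwritten by a failed rebuild is left in place, because the snapshot only records existence, not content.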
+393
-16
@@ -15,6 +15,7 @@ import shutil
|
||||
import subprocess
|
||||
import sys
|
||||
from collections.abc import Iterator
|
||||
import dataclasses
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
@@ -290,21 +291,397 @@ def replay_runner(derkachi_replay_inputs: DerkachiReplayInputs) -> Any:
|
||||
return _run
|
||||
|
||||
|
||||
@pytest.fixture
|
||||
def operator_pre_flight_setup(tmp_path: Path) -> Iterator[Path]:
|
||||
"""Operator C12 pre-flight rehearsal stub.
|
||||
@pytest.fixture(scope="session")
|
||||
def operator_pre_flight_setup(
|
||||
derkachi_replay_inputs: DerkachiReplayInputs,
|
||||
tmp_path_factory: pytest.TempPathFactory,
|
||||
) -> Iterator["PopulatedC6Cache"]:
|
||||
"""Operator C12 pre-flight: real C1+C2+C11+C10 wiring (AZ-839 / Epic AZ-835 C3).
|
||||
|
||||
Per AZ-404's spec this fixture should run the operator's full
|
||||
C10/C11/C12 pre-flight against a ``mock-suite-sat-service``
|
||||
fixture and yield the populated cache directory. The current
|
||||
``tests/fixtures/mock-suite-sat-service`` is a bootstrap stub
|
||||
(only ``GET /healthz`` per its README) — the full D-PROJ-2
|
||||
contract is not implemented. Until that ships, AC-8 (operator
|
||||
workflow rehearsal) is skipped at the test level; this fixture
|
||||
yields a placeholder cache directory so test bodies that
|
||||
request it can fail-fast with a documented reason rather than a
|
||||
surprise ImportError.
|
||||
Replaces the AZ-404 placeholder. Drives the operator-side
|
||||
pre-flight pipeline end-to-end and yields the populated cache
|
||||
so AC-8 (operator workflow rehearsal) and the AZ-840 e2e
|
||||
orchestrator test can consume it.
|
||||
|
||||
Skip gates (in evaluation order — first match wins):
|
||||
|
||||
* ``RUN_REPLAY_E2E`` not in ``{1, true, yes, on}`` — same as
|
||||
every other heavy test in this directory.
|
||||
* ``SATELLITE_PROVIDER_URL`` / ``SATELLITE_PROVIDER_API_KEY``
|
||||
missing — the C2 route client cannot reach the parent suite.
|
||||
* ``BUILD_FAISS_INDEX`` not ON — the C6 ``DescriptorIndex``
|
||||
runtime is gated by the env flag (``storage_factory.py``).
|
||||
* ``GPS_DENIED_OPERATOR_CONFIG_PATH`` missing OR points at a
|
||||
config that does not register every component this fixture
|
||||
needs (c6_tile_cache + c7_inference + c10_provisioning +
|
||||
c11_tile_manager) — the wiring would fail later with a less
|
||||
readable error.
|
||||
|
||||
See ``tests/e2e/replay/_operator_pre_flight.py::populate_c6_from_route``
|
||||
for the algorithm; this fixture only owns the
|
||||
runtime-factory wiring + skip gates.
|
||||
"""
|
||||
cache_dir = tmp_path / "operator_cache"
|
||||
cache_dir.mkdir()
|
||||
yield cache_dir
|
||||
|
||||
skip_reason = _operator_pre_flight_skip_reason()
|
||||
if skip_reason is not None:
|
||||
pytest.skip(skip_reason)
|
||||
|
||||
yield from _build_operator_pre_flight_cache(
|
||||
derkachi_replay_inputs=derkachi_replay_inputs,
|
||||
tmp_path_factory=tmp_path_factory,
|
||||
)
|
||||
|
||||
|
||||
def _operator_pre_flight_skip_reason() -> str | None:
    """Return a SKIP reason string when env / build flags are not viable.

    Centralised so the conditions stay testable + documented in one
    place. Returns ``None`` when the fixture is allowed to run.
    """

    if os.environ.get("RUN_REPLAY_E2E", "").strip().lower() not in {
        "1",
        "true",
        "yes",
        "on",
    }:
        return "AZ-839 operator_pre_flight_setup gated by RUN_REPLAY_E2E=1"
    sp_url = os.environ.get("SATELLITE_PROVIDER_URL", "").strip()
    sp_jwt = os.environ.get("SATELLITE_PROVIDER_API_KEY", "").strip()
    if not sp_url:
        return (
            "AZ-839 operator_pre_flight_setup requires SATELLITE_PROVIDER_URL "
            "(e.g. https://satellite-provider:8080)"
        )
    if not sp_jwt:
        return (
            "AZ-839 operator_pre_flight_setup requires SATELLITE_PROVIDER_API_KEY "
            "(Bearer JWT for the parent-suite Route + Inventory APIs)"
        )
    if os.environ.get("BUILD_FAISS_INDEX", "").strip().lower() not in {
        "on",
        "1",
        "true",
        "yes",
    }:
        return (
            "AZ-839 operator_pre_flight_setup requires BUILD_FAISS_INDEX=ON "
            "(the C6 FaissDescriptorIndex runtime is build-flag-gated per "
            "runtime_root.storage_factory)"
        )
    if not os.environ.get("GPS_DENIED_OPERATOR_CONFIG_PATH", "").strip():
        return (
            "AZ-839 operator_pre_flight_setup requires "
            "GPS_DENIED_OPERATOR_CONFIG_PATH pointing at a YAML that "
            "registers c6_tile_cache + c7_inference + c10_provisioning + "
            "c11_tile_manager blocks (Jetson e2e harness sets this; "
            "dev macOS does not)"
        )
    return None


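The skip gates are evaluated in a fixed order and the first failing gate wins. A compact sketch of that ordering, written against an injected mapping instead of `os.environ` so it can be unit-tested, is shown below. The names `env_flag` and `first_skip_reason` and the shortened reason strings are hypothetical; only the environment variable names and the truthy set come from the fixture.

```python
from collections.abc import Mapping
from typing import Optional

TRUTHY = frozenset({"1", "true", "yes", "on"})


def env_flag(environ: Mapping[str, str], name: str) -> bool:
    """True when the variable is set to a truthy token (case-insensitive)."""
    return environ.get(name, "").strip().lower() in TRUTHY


def first_skip_reason(environ: Mapping[str, str]) -> Optional[str]:
    """Return the first unmet gate, or None when the fixture may run."""
    if not env_flag(environ, "RUN_REPLAY_E2E"):
        return "gated by RUN_REPLAY_E2E"
    for name in ("SATELLITE_PROVIDER_URL", "SATELLITE_PROVIDER_API_KEY"):
        if not environ.get(name, "").strip():
            return f"requires {name}"
    if not env_flag(environ, "BUILD_FAISS_INDEX"):
        return "requires BUILD_FAISS_INDEX"
    if not environ.get("GPS_DENIED_OPERATOR_CONFIG_PATH", "").strip():
        return "requires GPS_DENIED_OPERATOR_CONFIG_PATH"
    return None


# Hypothetical environment: opted in, URL set, API key still missing.
partial_env = {
    "RUN_REPLAY_E2E": " Yes ",
    "SATELLITE_PROVIDER_URL": "https://satellite-provider:8080",
}
reason = first_skip_reason(partial_env)
```

Taking the environment as a parameter keeps the gate logic free of global state, which makes each branch reachable from a plain dictionary in a test.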
def _build_operator_pre_flight_cache(
    *,
    derkachi_replay_inputs: DerkachiReplayInputs,
    tmp_path_factory: pytest.TempPathFactory,
) -> Iterator["PopulatedC6Cache"]:
    """Wire the operator-side runtime graph and run the AZ-839 driver.

    All imports of heavy collaborators (httpx, runtime_root factories,
    c10/c11/c6 modules) live inside this function so collection on
    dev macOS without the e2e env stays cheap (the SKIP path returns
    before reaching this body).

    Raises:
        pytest.skip.Exception: when an env-flagged dependency
            (e.g. ``c10_provisioning`` config block, route extraction)
            cannot be satisfied and re-running with the right env is
            the right next step.
    """

    import httpx

    from gps_denied_onboard.clock.wall_clock import WallClock
    from gps_denied_onboard.config.loader import load_config
    from gps_denied_onboard.replay_input.tlog_route import (
        extract_route_from_tlog,
    )
    from gps_denied_onboard.runtime_root.c10_factory import (
        build_descriptor_batcher,
        build_engine_compiler,
    )
    from gps_denied_onboard.runtime_root.c11_factory import (
        build_tile_downloader,
    )
    from gps_denied_onboard.runtime_root.storage_factory import (
        build_descriptor_index,
        build_tile_metadata_store,
        build_tile_store,
    )

    from tests.e2e.replay._operator_pre_flight import (
        populate_c6_from_route,
    )

    config_path = Path(os.environ["GPS_DENIED_OPERATOR_CONFIG_PATH"])
    if not config_path.is_file():
        pytest.skip(
            f"GPS_DENIED_OPERATOR_CONFIG_PATH points at a non-file: {config_path}"
        )
    config = load_config(os.environ, paths=[config_path])

    cache_root = tmp_path_factory.mktemp("operator_pre_flight_cache")
    # PostgresFilesystemStore writes JPEGs under `<root_dir>/tiles/`;
    # FaissDescriptorIndex falls back to `<root_dir>/descriptor.index`
    # when `faiss_index_path` is empty. Override the c6_tile_cache
    # block in-memory so the production components built below
    # (build_tile_store / build_descriptor_index / batcher) write to
    # the same `cache_root` PopulatedC6Cache advertises. Without this
    # the static YAML at GPS_DENIED_OPERATOR_CONFIG_PATH would route
    # writes to its baked-in `root_dir` while the verifier read from
    # the fixture's tmp path, breaking AC-3 / AC-6 on Tier-2.
    c6_block = config.components["c6_tile_cache"]
    c6_block_overridden = dataclasses.replace(
        c6_block,
        root_dir=str(cache_root),
        faiss_index_path="",
    )
    config = dataclasses.replace(
        config,
        components={**config.components, "c6_tile_cache": c6_block_overridden},
    )
    tile_store_path = cache_root / "tiles"
    faiss_index_path = cache_root / "descriptor.index"

    route_spec = extract_route_from_tlog(
        derkachi_replay_inputs.tlog_path,
        max_waypoints=10,
    )

    sp_url = os.environ["SATELLITE_PROVIDER_URL"].strip()
    sp_jwt = os.environ["SATELLITE_PROVIDER_API_KEY"].strip()
    tls_insecure = os.environ.get(
        "SATELLITE_PROVIDER_TLS_INSECURE", ""
    ).strip().lower() in {"1", "true", "yes", "on"}

    from gps_denied_onboard.components.c11_tile_manager.route_client import (
        SatelliteProviderRouteClient,
    )

    route_client = SatelliteProviderRouteClient(
        base_url=sp_url,
        jwt=sp_jwt,
        tls_insecure=tls_insecure,
    )

    tile_store = build_tile_store(config)
    tile_metadata_store = build_tile_metadata_store(config)
    descriptor_index = build_descriptor_index(config)

    httpx_client = httpx.Client(
        verify=not tls_insecure,
        timeout=httpx.Timeout(30.0),
        headers={"Authorization": f"Bearer {sp_jwt}"},
    )
    tile_downloader = build_tile_downloader(
        config,
        http_client=httpx_client,
        tile_store=tile_store,
        tile_metadata_store=tile_metadata_store,
        budget_enforcer=tile_store,
    )

    clock = WallClock()
    engine_compiler = build_engine_compiler(config)
    backbone_embedder = _build_replay_backbone_embedder(
        config=config,
        engine_compiler=engine_compiler,
        cache_root=cache_root,
    )

    descriptor_batcher = build_descriptor_batcher(
        config,
        backbone_embedder=backbone_embedder,
        tile_metadata_store=tile_metadata_store,
        tile_store=tile_store,
        descriptor_index=descriptor_index,
        clock=clock,
    )

    def _descriptor_index_factory() -> Any:
        from gps_denied_onboard.components.c6_tile_cache.faiss_descriptor_index import (  # noqa: E501
            FaissDescriptorIndex,
        )
        from gps_denied_onboard.helpers.sha256_sidecar import Sha256Sidecar
        from gps_denied_onboard.logging import get_logger

        return FaissDescriptorIndex(
            index_path=faiss_index_path,
            sidecar=Sha256Sidecar(),
            logger=get_logger("c6_tile_cache.faiss_descriptor_index"),
        )

    populated = populate_c6_from_route(
        route_spec=route_spec,
        route_client=route_client,
        tile_downloader=tile_downloader,
        descriptor_batcher=descriptor_batcher,
        descriptor_index_factory=_descriptor_index_factory,
        cache_root=cache_root,
        tile_store_path=tile_store_path,
        faiss_index_path=faiss_index_path,
    )
    try:
        yield populated
    finally:
        httpx_client.close()


def _build_replay_backbone_embedder(
    *,
    config: Any,
    engine_compiler: Any,
    cache_root: Path,
) -> Any:
    """Compile the first configured backbone and wrap it for the AZ-322 batcher.

    The replay-mode operator binary does not exist yet (tracked under
    Epic AZ-835); until it does, this fixture performs the wiring
    inline. The path is deliberately the production path:

    * :func:`runtime_root.c10_factory.build_engine_compiler` builds
      the AZ-321 :class:`EngineCompiler`.
    * The first backbone in
      ``config.components['c10_provisioning'].backbones`` is
      compiled to an engine cache entry; the AZ-297
      :class:`InferenceRuntime` deserialises it into the
      :class:`EngineHandle` the embedder consumes.
    * The tile decoder converts a C6 :class:`TilePixelHandle`
      (mmap of JPEG bytes) to the ``np.float32`` tensor shape the
      backbone expects via OpenCV, the same primitive the C7
      pre-processor uses.

    Tests / dev workstations without a backbone ONNX or a working
    :class:`InferenceRuntime` fail this function, which surfaces as
    a fixture error (deliberate: the SKIP gate above is meant to
    catch the env-mismatch case before we get here). A config with
    no backbone entries, or a compiler that returns no results,
    skips instead of erroring.
    """

    from gps_denied_onboard._types.inference import PrecisionMode
    from gps_denied_onboard._types.manifests import HostCapabilities
    from gps_denied_onboard.components.c10_provisioning.c7_engine_embedder import (
        C7EngineBackboneEmbedder,
    )
    from gps_denied_onboard.components.c10_provisioning.engine_compiler import (
        EngineCompileRequest,
    )
    from gps_denied_onboard.logging import get_logger
    from gps_denied_onboard.runtime_root.c10_factory import (
        build_backbone_specs,
    )
    from gps_denied_onboard.runtime_root.inference_factory import (
        build_inference_runtime,
    )

    backbones = build_backbone_specs(config)
    if not backbones:
        pytest.skip(
            "AZ-839 operator_pre_flight_setup: config has no "
            "c10_provisioning.backbones entries; the e2e harness "
            "config must declare at least one backbone (typically "
            "DINOv2-VPR or NetVLAD per AZ-321)."
        )

    host = HostCapabilities(
        gpu_name="replay-e2e",
        cuda_compute_capability=(0, 0),
        cuda_runtime_version="0.0",
        tensorrt_version="0.0",
        host_arch="unknown",
        host_os="linux",
        driver_version="unknown",
    )
    engine_cache_root = cache_root / "engines"
    engine_cache_root.mkdir(parents=True, exist_ok=True)
    request = EngineCompileRequest(
        backbones=backbones,
        calibration_path=None,
        cache_root=engine_cache_root,
        precision=PrecisionMode.FP16,
        host=host,
        workspace_mb=int(
            config.components["c10_provisioning"].workspace_mb
        ),
    )
    results = engine_compiler.compile_engines_for_corpus(request)
    if not results:
        pytest.skip(
            "AZ-839 operator_pre_flight_setup: engine compiler returned "
            "empty results; corpus failed to compile."
        )
    first = results[0]
    spec = backbones[0]
    inference_runtime = build_inference_runtime(config)
    engine_handle = inference_runtime.deserialize_engine(first.entry)
    descriptor_dim = _resolve_replay_descriptor_dim(config, spec)
    return C7EngineBackboneEmbedder(
        inference_runtime=inference_runtime,
        engine_handle=engine_handle,
        input_name=spec.input_name,
        output_name="descriptor",
        descriptor_dim=descriptor_dim,
        tile_decoder=_default_tile_decoder,
        logger=get_logger("c10_provisioning.replay_backbone_embedder"),
    )


def _resolve_replay_descriptor_dim(config: Any, spec: Any) -> int:
    """Resolve the descriptor output dimension for the AZ-839 NetVLAD baseline.

    The AZ-839 task spec pins the C2 backbone at NetVLAD (per
    ``c2_vpr/config.py:67``); :class:`C2VprConfig.netvlad_descriptor_dim`
    is the canonical source. We read the c2_vpr block and fall back
    to the architecture default ``4096`` when the block does not carry
    that field, so operators on a hand-rolled YAML still get a coherent
    dim. A missing c2_vpr block, or any strategy other than
    ``net_vlad``, skips the test. Other backbones (UltraVPR=512,
    MegaLoc=2048, MixVPR=4096) require swapping this resolver, which
    is out of scope for AZ-839.
    """

    block = config.components.get("c2_vpr") if config.components else None
    if block is not None and getattr(block, "strategy", "") == "net_vlad":
        return int(getattr(block, "netvlad_descriptor_dim", 4096))
    pytest.skip(
        "AZ-839 operator_pre_flight_setup: descriptor_dim resolver "
        f"only supports c2_vpr.strategy='net_vlad'; got "
        f"{getattr(block, 'strategy', '<missing>')!r} on backbone "
        f"{spec.model_name!r}. See AZ-839 spec § Out of scope."
    )
    raise AssertionError("unreachable: pytest.skip raises")


def _default_tile_decoder(handle: Any) -> Any:
    """Decode a C6 :class:`TilePixelHandle` (JPEG mmap) to a CHW float32 tensor.

    The handle exposes ``read_bytes()`` (or context-manager + ``read``);
    we prefer the simpler ``read_bytes()`` path. OpenCV imdecode
    yields HWC-uint8-BGR; the embedder expects float32-CHW-RGB
    normalised to ``[0, 1]`` (DINOv2-VPR + NetVLAD share this layout).
    Imports are lazy, so there is no OpenCV penalty when this module
    is imported on dev macOS.
    """

    import cv2
    import numpy as np

    if hasattr(handle, "read_bytes"):
        blob = handle.read_bytes()
    else:
        with handle as opened:
            blob = opened.read()
    arr = np.frombuffer(blob, dtype=np.uint8)
    bgr = cv2.imdecode(arr, cv2.IMREAD_COLOR)
    if bgr is None:
        raise RuntimeError("cv2.imdecode returned None for tile handle")
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
    chw = np.transpose(rgb, (2, 0, 1)).astype(np.float32) / 255.0
    return chw
||||
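The layout change that `_default_tile_decoder` performs (HWC uint8 BGR in, CHW float RGB in `[0, 1]` out) can be illustrated without OpenCV or NumPy. The following is a dependency-free sketch only: `hwc_bgr_to_chw_rgb` is a hypothetical name that does not exist in the repository, and nested lists stand in for arrays.

```python
def hwc_bgr_to_chw_rgb(pixels):
    """Convert HWC BGR uint8 pixels (nested lists) to CHW RGB floats in [0, 1].

    Hypothetical helper for illustration; the fixture itself relies on
    cv2.cvtColor and np.transpose for the same reordering.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    # Output channels are ordered R, G, B, while each input pixel stores
    # B, G, R, so output channel c reads input index 2 - c.
    return [
        [
            [pixels[y][x][2 - c] / 255.0 for x in range(width)]
            for y in range(height)
        ]
        for c in range(3)
    ]


# A 1x2 image in BGR order: one pure-blue pixel, then one pure-red pixel.
image = [[[255, 0, 0], [0, 0, 255]]]
chw = hwc_bgr_to_chw_rgb(image)
```

The result has three planes of shape 1x2: the red plane is non-zero only for the second pixel and the blue plane only for the first, which is the channel swap and axis move the decoder's docstring describes.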
@@ -0,0 +1,182 @@
"""AZ-840: E2E orchestrator integration test (AC-1 / AC-2 / AC-3 / AC-4 / AC-6).

The Tier-2 entry point that closes Epic AZ-835's narrative: from a
``(tlog, video, calibration)`` triple, run the full 7-step pipeline
end-to-end on the Jetson harness without operator hand-curation
between steps.

The test consumes:

* :func:`tests.e2e.replay.conftest.operator_pre_flight_setup`:
  the AZ-839 C3 fixture that owns steps 3-5 (route extraction +
  satellite-provider seeding + FAISS index build) and yields a
  :class:`PopulatedC6Cache` keyed off a freshly-mktemp'd
  ``cache_root``.
* :func:`tests.e2e.replay.conftest.derkachi_replay_inputs`: the
  shared session fixture that materialises the Derkachi tlog +
  video + factory-sheet calibration + signing-key file.
* :func:`tests.e2e.replay._e2e_orchestrator.run_e2e_orchestration`:
  the AC-1 driver that wires everything below the C3 fixture.

The driver writes a fresh effective replay config per session
(merging the static operator YAML with the cache_root override),
invokes ``gps-denied-replay --auto-trim``, parses the JSONL
emissions, computes the horizontal-error distribution, and writes
the verdict markdown under ``_docs/06_metrics/`` (AC-2).

Skip gates (in evaluation order):

1. ``@pytest.mark.tier2``: the per-suite Tier-2 plugin gates this
   off on dev macOS (matches the AZ-839 / AZ-699 contract).
2. ``RUN_REPLAY_E2E`` not in ``{1, true, yes, on}``.
3. ``GPS_DENIED_OPERATOR_CONFIG_PATH`` unset or empty (the same env
   var the C3 fixture consumes).
4. ``gps-denied-replay`` console-script found neither on ``PATH``
   nor next to the running interpreter.
5. Real video missing or placeholder-sized (mirrors AZ-699's gate).
6. ``operator_pre_flight_setup`` fixture itself skipped: the
   downstream consumer inherits the SKIP automatically (pytest's
   fixture-skip propagation).

AC-7 (AZ-699 continues to pass) is satisfied by inspection: this
test does not modify ``test_derkachi_real_tlog.py``. It writes its
report to the same path (``real_flight_validation_<date>.md``), but
idempotently: on a given clip the two tests are expected to agree,
both writing PASS or both writing FAIL.
"""

from __future__ import annotations

import os
import shutil
import sys
from collections.abc import Iterator
from pathlib import Path

import pytest

from tests.e2e.replay._e2e_orchestrator import (
    OrchestrationReport,
    run_e2e_orchestration,
)
from tests.e2e.replay._operator_pre_flight import PopulatedC6Cache
from tests.e2e.replay.conftest import DerkachiReplayInputs


def _repo_root() -> Path:
    return Path(__file__).resolve().parents[3]


def _derkachi_dir() -> Path:
    return _repo_root() / "_docs" / "00_problem" / "input_data" / "flight_derkachi"


_MIN_REAL_VIDEO_BYTES: int = 1_000_000


def _replay_binary() -> Path | None:
    """Return the absolute path to ``gps-denied-replay`` or ``None``.

    Same lookup order AZ-699 uses: PATH first, venv bin second.
    """

    binary = shutil.which("gps-denied-replay")
    if binary is not None:
        return Path(binary)
    venv_bin = Path(sys.executable).parent / "gps-denied-replay"
    if venv_bin.exists():
        return venv_bin
    return None


def _orchestrator_skip_reason() -> str | None:
    """Return a SKIP message when env / inputs preclude a Jetson run."""

    if os.environ.get("RUN_REPLAY_E2E", "").strip().lower() not in {
        "1",
        "true",
        "yes",
        "on",
    }:
        return "AZ-840 e2e orchestrator gated by RUN_REPLAY_E2E=1"
    if not os.environ.get("GPS_DENIED_OPERATOR_CONFIG_PATH", "").strip():
        return (
            "AZ-840 e2e orchestrator requires GPS_DENIED_OPERATOR_CONFIG_PATH "
            "(same env var the C3 fixture consumes)"
        )
    if _replay_binary() is None:
        return "gps-denied-replay console-script not installed"
    video = _derkachi_dir() / "flight_derkachi.mp4"
    if not video.is_file():
        return f"Derkachi video missing: {video}"
    if video.stat().st_size < _MIN_REAL_VIDEO_BYTES:
        return (
            f"Derkachi video at {video} is only {video.stat().st_size} "
            "bytes; placeholder, not a real recording"
        )
    return None


@pytest.fixture
def az840_skip_gate() -> Iterator[None]:
    """Skip-gate the orchestrator test before any heavy fixtures resolve."""

    reason = _orchestrator_skip_reason()
    if reason is not None:
        pytest.skip(reason)
    yield


@pytest.mark.tier2
def test_az840_e2e_real_flight_orchestration(
    az840_skip_gate: None,
    operator_pre_flight_setup: PopulatedC6Cache,
    derkachi_replay_inputs: DerkachiReplayInputs,
    tmp_path: Path,
) -> None:
    # Arrange: every input besides cache_root comes from the existing
    # session fixtures so the same Tier-2 harness setup that powers
    # AZ-699 + AZ-839 is exercised.
    binary = _replay_binary()
    assert binary is not None, "skip gate already verified the binary exists"
    base_config_path = Path(os.environ["GPS_DENIED_OPERATOR_CONFIG_PATH"])
    output_path = tmp_path / "estimator_output.jsonl"
    effective_config_path = tmp_path / "operator_config_effective.yaml"
    report_dir = _repo_root() / "_docs" / "06_metrics"

    # Act
    report = run_e2e_orchestration(
        populated_cache=operator_pre_flight_setup,
        base_config_path=base_config_path,
        tlog_path=derkachi_replay_inputs.tlog_path,
        video_path=derkachi_replay_inputs.video_path,
        calibration_path=derkachi_replay_inputs.calibration_path,
        signing_key_path=derkachi_replay_inputs.signing_key_path,
        replay_binary=binary,
        output_path=output_path,
        report_dir=report_dir,
        effective_config_path=effective_config_path,
    )

    # Assert AC-2 + AC-4: report exists; replay subprocess stays within
    # the 15-min budget.
    assert isinstance(report, OrchestrationReport)
    assert report.report_path.is_file()
    body = report.report_path.read_text()
    assert "## Horizontal error (metres)" in body
    assert "## Threshold-hit share" in body
    assert "Mean" in body
    for threshold in (10, 25, 50, 100):
        assert f"| {threshold} |" in body, (
            f"threshold {threshold} m row missing from report"
        )
    assert report.replay_subprocess_seconds <= 900.0, (
        "AZ-840 AC-4: replay subprocess exceeded 15-min soft target"
    )
    assert report.wall_clock_s >= report.replay_subprocess_seconds
    assert report.distribution.count > 0, (
        "no emissions paired with ground truth; orchestration produced "
        "data but every emission fell outside the tlog GPS window"
    )

    # Assert AC-3: the effective config was written and points at the
    # cache_root the C3 fixture supplied.
    assert effective_config_path.is_file()
    effective_text = effective_config_path.read_text()
    assert str(operator_pre_flight_setup.cache_root) in effective_text
||||
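The ordered skip gates in `_orchestrator_skip_reason` follow a "first failing gate wins" pattern. Below is a minimal, self-contained sketch of that pattern with the environment and filesystem facts passed in as plain arguments so it can run anywhere. `first_skip_reason` and its parameters are hypothetical names for illustration, not part of the repository, and the messages are shortened.

```python
from pathlib import Path
from typing import Mapping, Optional

TRUTHY = {"1", "true", "yes", "on"}


def first_skip_reason(
    env: Mapping[str, str],
    binary: Optional[Path],
    video_size_bytes: Optional[int],
    min_video_bytes: int = 1_000_000,
) -> Optional[str]:
    """Return the first failing gate's message, or None when every gate passes.

    Hypothetical stand-in: `binary` is None when the console-script was not
    found, and `video_size_bytes` is None when the video file is missing.
    """
    if env.get("RUN_REPLAY_E2E", "").strip().lower() not in TRUTHY:
        return "gated by RUN_REPLAY_E2E"
    if not env.get("GPS_DENIED_OPERATOR_CONFIG_PATH", "").strip():
        return "operator config path not set"
    if binary is None:
        return "replay binary not installed"
    if video_size_bytes is None:
        return "video missing"
    if video_size_bytes < min_video_bytes:
        return "video is a placeholder"
    return None


ready_env = {
    "RUN_REPLAY_E2E": " True ",
    "GPS_DENIED_OPERATOR_CONFIG_PATH": "/etc/operator.yaml",
}
verdict = first_skip_reason(ready_env, Path("/usr/bin/gps-denied-replay"), 5_000_000)
```

Passing the facts in as arguments, rather than reading `os.environ` and the filesystem inside the function, is a choice made for this sketch so each gate can be exercised in isolation.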
@@ -0,0 +1,671 @@
"""Unit tests for the AZ-840 e2e orchestrator (AC-8).

The end-to-end happy path is the Tier-2 integration test in
``test_az835_e2e_real_flight.py`` (AC-1 / AC-2). This module covers
the orchestration helper layer in isolation:

* Param validation: every required path must exist before the
  airborne subprocess is spawned (AC-5 fails LOUD).
* Effective-config merge: the ``c6_tile_cache.root_dir`` override
  is written to YAML; the rest of the base config is preserved.
* Error propagation per step: every documented failure surfaces
  as :class:`OrchestrationFailure` with the correct
  :class:`OrchestratorStep` label.
* Happy path: when the runner returns success and the JSONL +
  ground truth align, :class:`OrchestrationReport` carries a
  written report path and an honest verdict (AC-2: report exists
  PASS or FAIL).

The tests inject a fake ``runner`` so no real
``gps-denied-replay`` subprocess is spawned. Real binary execution
is exercised on the Jetson harness via the AC-1 integration test.
"""

from __future__ import annotations

import json
import subprocess
from pathlib import Path
from unittest.mock import MagicMock

import pytest
import yaml

from gps_denied_onboard.helpers.accuracy_report import (
    AC3_GATE_THRESHOLD_M,
)
from gps_denied_onboard.replay_input.tlog_route import RouteSpec

from tests.e2e.replay._e2e_orchestrator import (
    OrchestrationFailure,
    OrchestrationReport,
    OrchestratorStep,
    read_calibration_acquisition_method,
    run_e2e_orchestration,
    write_effective_replay_config,
)
from tests.e2e.replay._operator_pre_flight import PopulatedC6Cache


# ----------------------------------------------------------------------
# Helpers


def _build_populated_cache(tmp_path: Path) -> PopulatedC6Cache:
    """Construct a synthetic :class:`PopulatedC6Cache`.

    The orchestrator only consumes ``cache_root`` from the cache,
    so the FAISS sidecar paths are placeholders. The route_spec is
    a minimal one-waypoint instance; no AZ-836 invariants are
    re-asserted by AZ-840.
    """

    cache_root = tmp_path / "cache_root"
    cache_root.mkdir()
    return PopulatedC6Cache(
        cache_root=cache_root,
        tile_store_path=cache_root / "tiles",
        faiss_index_path=cache_root / "descriptor.index",
        faiss_sidecar_sha256_path=cache_root / "descriptor.index.sha256",
        faiss_sidecar_meta_path=cache_root / "descriptor.index.meta.json",
        route_spec=RouteSpec(
            waypoints=((50.10, 36.10),),
            suggested_region_size_meters=500.0,
            source_tlog=Path("test.tlog"),
            source_segment=(0, 100),
            total_distance_meters=0.0,
        ),
        tile_count=1,
        elapsed_seconds=0.0,
    )


def _stage_inputs(tmp_path: Path) -> dict[str, Path]:
    """Write touch-files for every input path the orchestrator validates.

    The base config YAML carries one stub block so the merge step
    has a real document to overlay on.
    """

    base_config = tmp_path / "operator_config.yaml"
    base_config.write_text(
        yaml.safe_dump(
            {
                "mode": "replay",
                "c6_tile_cache": {
                    "store_runtime": "postgres_filesystem",
                    "metadata_runtime": "postgres_filesystem",
                    "descriptor_index_runtime": "faiss_hnsw",
                    "root_dir": "/var/lib/gps-denied/tiles",
                    "faiss_index_path": "/some/static/path/descriptor.index",
                },
            }
        )
    )

    tlog = tmp_path / "input.tlog"
    tlog.write_bytes(b"\x00")
    video = tmp_path / "input.mp4"
    video.write_bytes(b"\x00")
    calibration = tmp_path / "calibration.json"
    calibration.write_text(json.dumps({"acquisition_method": "factory-sheet"}))
    signing_key = tmp_path / "signing_key.bin"
    signing_key.write_bytes(b"\x00" * 32)
    binary = tmp_path / "gps-denied-replay"
    binary.write_text("")

    return {
        "base_config_path": base_config,
        "tlog_path": tlog,
        "video_path": video,
        "calibration_path": calibration,
        "signing_key_path": signing_key,
        "replay_binary": binary,
    }


def _ground_truth_tlog_loader(
    monkeypatch: pytest.MonkeyPatch,
    *,
    times_s: tuple[float, ...] = (0.0, 1.0, 2.0),
    lat_deg: float = 50.10,
    lon_deg: float = 36.10,
    alt_m: float = 100.0,
) -> None:
    """Stub the orchestrator's ground-truth loader so unit tests skip MAVLink.

    The orchestrator imports ``load_tlog_ground_truth`` from
    ``gps_denied_onboard.replay_input``; patching the symbol *as
    bound on the orchestrator module* keeps the patch local to the
    unit suite (no cross-test bleed).
    """

    fixes = [
        _StubGpsFix(
            ts_ns=int(t * 1e9),
            lat_deg=lat_deg,
            lon_deg=lon_deg,
            alt_m=alt_m,
        )
        for t in times_s
    ]
    series = _StubGpsSeries(records=tuple(fixes))
    monkeypatch.setattr(
        "tests.e2e.replay._e2e_orchestrator.load_tlog_ground_truth",
        lambda *_args, **_kwargs: series,
    )


class _StubGpsFix:
    """Mirrors the fields the orchestrator reads from each tlog row."""

    __slots__ = ("ts_ns", "lat_deg", "lon_deg", "alt_m")

    def __init__(
        self, *, ts_ns: int, lat_deg: float, lon_deg: float, alt_m: float
    ) -> None:
        self.ts_ns = ts_ns
        self.lat_deg = lat_deg
        self.lon_deg = lon_deg
        self.alt_m = alt_m


class _StubGpsSeries:
    """Drop-in replacement for :class:`TlogGroundTruth`."""

    def __init__(self, *, records: tuple[_StubGpsFix, ...]) -> None:
        self.records = records


def _build_runner_emitting(
    output_path: Path,
    *,
    rows: list[dict[str, object]],
    returncode: int = 0,
    stdout: str = "",
    stderr: str = "",
) -> "MagicMock":
    """Return a fake ``subprocess.run`` that writes JSONL on call."""

    def _run(argv, **kwargs):  # type: ignore[no-untyped-def]
        if rows:
            output_path.parent.mkdir(parents=True, exist_ok=True)
            output_path.write_text(
                "\n".join(json.dumps(row) for row in rows) + "\n"
            )
        return subprocess.CompletedProcess(
            args=argv,
            returncode=returncode,
            stdout=stdout,
            stderr=stderr,
        )

    return MagicMock(side_effect=_run)


# ----------------------------------------------------------------------
# write_effective_replay_config


def test_write_effective_replay_config_overlays_root_dir(
    tmp_path: Path,
) -> None:
    # Arrange
    inputs = _stage_inputs(tmp_path)
    cache_root = tmp_path / "cache"
    cache_root.mkdir()
    output_path = tmp_path / "effective.yaml"

    # Act
    written_path = write_effective_replay_config(
        base_config_path=inputs["base_config_path"],
        cache_root=cache_root,
        output_path=output_path,
    )

    # Assert
    assert written_path == output_path
    merged = yaml.safe_load(output_path.read_text())
    assert merged["c6_tile_cache"]["root_dir"] == str(cache_root)
    assert merged["c6_tile_cache"]["faiss_index_path"] == ""
    assert merged["mode"] == "replay"
    assert (
        merged["c6_tile_cache"]["store_runtime"] == "postgres_filesystem"
    ), "non-overridden c6_tile_cache fields must survive"


def test_write_effective_replay_config_creates_block_when_absent(
    tmp_path: Path,
) -> None:
    # Arrange
    base = tmp_path / "operator.yaml"
    base.write_text(yaml.safe_dump({"mode": "replay"}))
    cache_root = tmp_path / "cache"
    cache_root.mkdir()

    # Act
    write_effective_replay_config(
        base_config_path=base,
        cache_root=cache_root,
        output_path=tmp_path / "effective.yaml",
    )

    # Assert
    merged = yaml.safe_load((tmp_path / "effective.yaml").read_text())
    assert merged["c6_tile_cache"]["root_dir"] == str(cache_root)


def test_write_effective_replay_config_malformed_yaml_fails(
    tmp_path: Path,
) -> None:
    # Arrange
    base = tmp_path / "bad.yaml"
    base.write_text(":\n  : not yaml:")
    cache_root = tmp_path / "cache"
    cache_root.mkdir()

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        write_effective_replay_config(
            base_config_path=base,
            cache_root=cache_root,
            output_path=tmp_path / "effective.yaml",
        )
    assert exc_info.value.step is OrchestratorStep.WRITE_EFFECTIVE_CONFIG


def test_write_effective_replay_config_non_mapping_top_level_fails(
    tmp_path: Path,
) -> None:
    # Arrange
    base = tmp_path / "bad.yaml"
    base.write_text("- not a mapping\n")
    cache_root = tmp_path / "cache"
    cache_root.mkdir()

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        write_effective_replay_config(
            base_config_path=base,
            cache_root=cache_root,
            output_path=tmp_path / "effective.yaml",
        )
    assert exc_info.value.step is OrchestratorStep.WRITE_EFFECTIVE_CONFIG


# ----------------------------------------------------------------------
# read_calibration_acquisition_method


def test_read_calibration_acquisition_method_returns_field_when_present(
    tmp_path: Path,
) -> None:
    # Arrange
    path = tmp_path / "cal.json"
    path.write_text(json.dumps({"acquisition_method": "factory-sheet"}))

    # Assert
    assert read_calibration_acquisition_method(path) == "factory-sheet"


def test_read_calibration_acquisition_method_returns_unknown_on_missing(
    tmp_path: Path,
) -> None:
    # Arrange
    path = tmp_path / "cal.json"
    path.write_text(json.dumps({"some_other_field": True}))

    # Assert
    assert read_calibration_acquisition_method(path) == "unknown"


def test_read_calibration_acquisition_method_returns_unknown_on_malformed(
    tmp_path: Path,
) -> None:
    # Arrange
    path = tmp_path / "cal.json"
    path.write_text("{not valid json")

    # Assert
    assert read_calibration_acquisition_method(path) == "unknown"


# ----------------------------------------------------------------------
# run_e2e_orchestration: param validation (AC-5)


def test_run_e2e_orchestration_missing_tlog_fails_loud(
    tmp_path: Path,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    inputs["tlog_path"].unlink()

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=tmp_path / "out.jsonl",
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.VALIDATE_INPUTS
    assert "tlog_path" in str(exc_info.value)


def test_run_e2e_orchestration_missing_binary_fails_loud(
    tmp_path: Path,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    inputs["replay_binary"].unlink()

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=tmp_path / "out.jsonl",
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.VALIDATE_INPUTS
    assert "replay_binary" in str(exc_info.value)


# ----------------------------------------------------------------------
# run_e2e_orchestration: subprocess error propagation (AC-5)


def test_run_e2e_orchestration_replay_nonzero_exit_fails_loud(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    output_path = tmp_path / "out.jsonl"
    runner = MagicMock(
        return_value=subprocess.CompletedProcess(
            args=[],
            returncode=1,
            stdout="",
            stderr="boom",
        )
    )
    _ground_truth_tlog_loader(monkeypatch)

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=output_path,
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            runner=runner,
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.AIRBORNE_PIPELINE
    assert "exited 1" in str(exc_info.value)
    assert "boom" in str(exc_info.value)


def test_run_e2e_orchestration_replay_timeout_fails_loud(
    tmp_path: Path,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)

    def _timeout(*_args, **_kwargs):
        raise subprocess.TimeoutExpired(cmd=["replay"], timeout=0.1)

    runner = MagicMock(side_effect=_timeout)

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=tmp_path / "out.jsonl",
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            runner=runner,
            max_seconds=0.1,
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.AIRBORNE_PIPELINE
    assert "timed out" in str(exc_info.value)


def test_run_e2e_orchestration_replay_oserror_fails_loud(
    tmp_path: Path,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)

    def _oserror(*_args, **_kwargs):
        raise OSError("permission denied")

    runner = MagicMock(side_effect=_oserror)

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=tmp_path / "out.jsonl",
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            runner=runner,
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.AIRBORNE_PIPELINE
    assert "permission denied" in str(exc_info.value)


# ----------------------------------------------------------------------
# run_e2e_orchestration: empty / malformed JSONL (AC-5)


def test_run_e2e_orchestration_empty_jsonl_fails_loud(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    output_path = tmp_path / "out.jsonl"

    def _runner(argv, **_kwargs):  # type: ignore[no-untyped-def]
        output_path.write_text("\n\n")  # only blanks
        return subprocess.CompletedProcess(args=argv, returncode=0, stdout="", stderr="")

    runner = MagicMock(side_effect=_runner)
    _ground_truth_tlog_loader(monkeypatch)

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=output_path,
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            runner=runner,
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.PARSE_EMISSIONS


def test_run_e2e_orchestration_malformed_jsonl_fails_loud(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    output_path = tmp_path / "out.jsonl"

    def _runner(argv, **_kwargs):  # type: ignore[no-untyped-def]
        output_path.write_text('{"valid": true}\nnot a json line\n')
        return subprocess.CompletedProcess(args=argv, returncode=0, stdout="", stderr="")

    runner = MagicMock(side_effect=_runner)
    _ground_truth_tlog_loader(monkeypatch)

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=output_path,
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            runner=runner,
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.PARSE_EMISSIONS


# ----------------------------------------------------------------------
# run_e2e_orchestration — ground truth loader failure (AC-5)


def test_run_e2e_orchestration_ground_truth_loader_failure_fails_loud(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    output_path = tmp_path / "out.jsonl"
    runner = _build_runner_emitting(
        output_path,
        rows=[
            {
                "emitted_at": int(0.5 * 1e9),
                "position_wgs84": {
                    "lat_deg": 50.10,
                    "lon_deg": 36.10,
                    "alt_m": 100.0,
                },
            }
        ],
    )

    def _raise(*_args, **_kwargs):
        raise ValueError("tlog corrupt")

    monkeypatch.setattr(
        "tests.e2e.replay._e2e_orchestrator.load_tlog_ground_truth",
        _raise,
    )

    # Act + Assert
    with pytest.raises(OrchestrationFailure) as exc_info:
        run_e2e_orchestration(
            populated_cache=cache,
            output_path=output_path,
            report_dir=tmp_path / "metrics",
            effective_config_path=tmp_path / "eff.yaml",
            runner=runner,
            **inputs,  # type: ignore[arg-type]
        )
    assert exc_info.value.step is OrchestratorStep.LOAD_GROUND_TRUTH
    assert "tlog corrupt" in str(exc_info.value)


# ----------------------------------------------------------------------
# run_e2e_orchestration — happy path (AC-1 / AC-2)


def test_run_e2e_orchestration_happy_path_writes_report(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    # Arrange
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    output_path = tmp_path / "out.jsonl"
    report_dir = tmp_path / "metrics"
    effective_config_path = tmp_path / "eff.yaml"
    rows = [
        {
            "emitted_at": int(0.5 * 1e9),
            "position_wgs84": {"lat_deg": 50.10, "lon_deg": 36.10, "alt_m": 100.0},
        },
        {
            "emitted_at": int(1.5 * 1e9),
            "position_wgs84": {"lat_deg": 50.10, "lon_deg": 36.10, "alt_m": 100.0},
        },
    ]
    runner = _build_runner_emitting(output_path, rows=rows)
    _ground_truth_tlog_loader(monkeypatch)

    # Act
    report = run_e2e_orchestration(
        populated_cache=cache,
        output_path=output_path,
        report_dir=report_dir,
        effective_config_path=effective_config_path,
        runner=runner,
        run_date_utc="2026-05-23",
        **inputs,  # type: ignore[arg-type]
    )

    # Assert
    assert isinstance(report, OrchestrationReport)
    assert report.report_path.is_file()
    assert report.emissions_count == 2
    assert report.distribution.count == 2
    assert report.verdict_passed is True
    body = report.report_path.read_text()
    assert "## Horizontal error (metres)" in body
    assert "## Threshold-hit share" in body
    assert f"| {AC3_GATE_THRESHOLD_M:g} |" in body
    runner.assert_called_once()
    argv_passed = runner.call_args.args[0]
    assert str(effective_config_path) in argv_passed
    assert "--auto-trim" in argv_passed
    merged = yaml.safe_load(effective_config_path.read_text())
    assert merged["c6_tile_cache"]["root_dir"] == str(cache.cache_root)


def test_run_e2e_orchestration_writes_report_even_on_fail_verdict(
    tmp_path: Path,
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    # Arrange — emissions are 1 km from ground truth, far above the 100 m gate.
    cache = _build_populated_cache(tmp_path)
    inputs = _stage_inputs(tmp_path)
    output_path = tmp_path / "out.jsonl"
    report_dir = tmp_path / "metrics"
    rows = [
        {
            "emitted_at": int(0.5 * 1e9),
            "position_wgs84": {"lat_deg": 50.110, "lon_deg": 36.110, "alt_m": 100.0},
        },
        {
            "emitted_at": int(1.5 * 1e9),
            "position_wgs84": {"lat_deg": 50.110, "lon_deg": 36.110, "alt_m": 100.0},
        },
    ]
    runner = _build_runner_emitting(output_path, rows=rows)
    _ground_truth_tlog_loader(monkeypatch)

    # Act
    report = run_e2e_orchestration(
        populated_cache=cache,
        output_path=output_path,
        report_dir=report_dir,
        effective_config_path=tmp_path / "eff.yaml",
        runner=runner,
        run_date_utc="2026-05-23",
        **inputs,  # type: ignore[arg-type]
    )

    # Assert — AC-2: report exists regardless of PASS/FAIL.
    assert report.verdict_passed is False
    assert report.report_path.is_file()
    assert "FAIL" in report.report_path.read_text()
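Review note on the FAIL-verdict test above: its rows place both emissions at (50.110, 36.110). Assuming the stubbed ground truth sits near (50.10, 36.10), which the happy-path test suggests but this excerpt does not show, a haversine check puts the offset at roughly 1.3 km, well beyond the 100 m gate. The helper name `haversine_m` and the mean Earth radius below are illustrative assumptions and not part of the orchestrator.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; assumed for this sketch


def haversine_m(lat1_deg, lon1_deg, lat2_deg, lon2_deg):
    """Great-circle distance in metres between two WGS84 points."""
    phi1 = math.radians(lat1_deg)
    phi2 = math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlam = math.radians(lon2_deg - lon1_deg)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Assumed ground truth (50.10, 36.10) vs. the emitted fix (50.11, 36.11).
offset_m = haversine_m(50.10, 36.10, 50.11, 36.11)
```

Under that assumption the offset exceeds the gate by more than an order of magnitude, so the FAIL verdict does not depend on the exact ground-truth sample used.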
@@ -0,0 +1,480 @@
|
||||
"""Unit tests for ``populate_c6_from_route`` (AZ-839 AC-8).

Covers the AZ-839 acceptance criteria that can be exercised against
stubbed dependencies (the AC-9 integration test against the Jetson
harness lives in ``test_derkachi_real_tlog.py`` once Epic AZ-835
completes):

* AC-3 happy path — driver returns a populated cache with paths
  pointing at the on-disk sidecar triple.
* AC-4 — :class:`RouteValidationError` and
  :class:`RouteTerminalFailureError` propagate unchanged with their
  original cause; no silent swallow.
* AC-5 — :class:`RouteTransientError` triggers retry up to 3 attempts
  using the documented backoff schedule. Final attempt's exception is
  propagated unchanged.
* AC-6 — Tamper between rebuild and verify (simulated by having
  ``descriptor_index_factory`` raise :class:`IndexUnavailableError`)
  surfaces the failure and leaves no half-built artifacts.
* AC-7 — Cleanup on failure removes any sidecar file the driver
  produced (pre-existing files are preserved).

The driver intentionally takes every collaborator via dependency
injection so this module never imports httpx, FAISS, or Postgres.
"""

from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from unittest.mock import MagicMock
from uuid import uuid4

import pytest

from gps_denied_onboard.components.c10_provisioning.descriptor_batcher import (
    BatcherOutcome,
    DescriptorBatchReport,
)
from gps_denied_onboard.components.c11_tile_manager import (
    DownloadOutcome,
    SectorClassification,
)
from gps_denied_onboard.components.c11_tile_manager._types import (
    DownloadBatchReport,
)
from gps_denied_onboard.components.c11_tile_manager.errors import (
    RouteTerminalFailureError,
    RouteTransientError,
    RouteValidationError,
)
from gps_denied_onboard.components.c11_tile_manager.route_client import (
    RouteSeedResult,
)
from gps_denied_onboard.components.c6_tile_cache.errors import (
    IndexUnavailableError,
)
from gps_denied_onboard.components.c6_tile_cache.faiss_descriptor_index import (
    META_SUFFIX,
)
from gps_denied_onboard.helpers.sha256_sidecar import SIDECAR_SUFFIX
from gps_denied_onboard.replay_input.tlog_route import RouteSpec

from tests.e2e.replay._operator_pre_flight import (
    PopulatedC6Cache,
    populate_c6_from_route,
)


# ----------------------------------------------------------------------
# Helpers


@dataclass
class _DriverHarness:
    """Bundle of paths + collaborators wired into one driver call."""

    cache_root: Path
    tile_store_path: Path
    faiss_index_path: Path
    sha256_path: Path
    meta_path: Path
    route_spec: RouteSpec
    route_client: MagicMock
    tile_downloader: MagicMock
    descriptor_batcher: MagicMock
    descriptor_index_factory: MagicMock
    sleep_calls: list[float]


def _build_harness(tmp_path: Path) -> _DriverHarness:
    """Wire a self-contained harness with sane default stub returns.

    Each collaborator is a :class:`MagicMock` with a default success
    return value; tests override per-call as needed.
    """

    cache_root = tmp_path / "cache_root"
    cache_root.mkdir()
    tile_store_path = cache_root / "tile_store"
    tile_store_path.mkdir()
    faiss_index_path = cache_root / "descriptor.index"
    sha256_path = Path(str(faiss_index_path) + SIDECAR_SUFFIX)
    meta_path = Path(str(faiss_index_path) + META_SUFFIX)

    route_spec = RouteSpec(
        waypoints=(
            (50.10, 36.10),
            (50.11, 36.11),
            (50.12, 36.12),
        ),
        suggested_region_size_meters=500.0,
        source_tlog=Path("test.tlog"),
        source_segment=(0, 100),
        total_distance_meters=1500.0,
    )

    route_client = MagicMock()
    route_client.seed_route.return_value = RouteSeedResult(
        route_id=uuid4(),
        terminal_status="completed",
        maps_ready=True,
        tile_count=12,
        elapsed_ms=2500,
        submitted_payload_sha256="cafebabe" * 8,
    )

    tile_downloader = MagicMock()
    tile_downloader.download_tiles_for_area.return_value = DownloadBatchReport(
        outcome=DownloadOutcome.SUCCESS,
        tiles_requested=12,
        tiles_downloaded=12,
        tiles_rejected_resolution=0,
        tiles_rejected_freshness=0,
        tiles_downgraded=0,
        retry_count=0,
        request_hash="abcdef0123456789",
    )

    descriptor_batcher = MagicMock()
    descriptor_batcher.populate_descriptors.return_value = DescriptorBatchReport(
        descriptors_generated=12,
        tiles_consumed=12,
        oom_retries=0,
        elapsed_s=1.2,
        outcome=BatcherOutcome.SUCCESS,
        failure_reason=None,
    )

    descriptor_index_factory = MagicMock()
    descriptor_index_factory.return_value = MagicMock(
        spec=["mmap_handle", "descriptor_dim"]
    )

    return _DriverHarness(
        cache_root=cache_root,
        tile_store_path=tile_store_path,
        faiss_index_path=faiss_index_path,
        sha256_path=sha256_path,
        meta_path=meta_path,
        route_spec=route_spec,
        route_client=route_client,
        tile_downloader=tile_downloader,
        descriptor_batcher=descriptor_batcher,
        descriptor_index_factory=descriptor_index_factory,
        sleep_calls=[],
    )


def _drive(harness: _DriverHarness, **overrides: object) -> PopulatedC6Cache:
    """Invoke the driver with the harness defaults plus any overrides."""

    kwargs: dict[str, object] = {
        "route_spec": harness.route_spec,
        "route_client": harness.route_client,
        "tile_downloader": harness.tile_downloader,
        "descriptor_batcher": harness.descriptor_batcher,
        "descriptor_index_factory": harness.descriptor_index_factory,
        "cache_root": harness.cache_root,
        "tile_store_path": harness.tile_store_path,
        "faiss_index_path": harness.faiss_index_path,
        "sleep": harness.sleep_calls.append,
    }
    kwargs.update(overrides)
    return populate_c6_from_route(**kwargs)  # type: ignore[arg-type]


# ----------------------------------------------------------------------
# AC-3 — happy path


def test_populate_c6_from_route_returns_populated_cache(tmp_path: Path) -> None:
    # Arrange
    harness = _build_harness(tmp_path)

    # Act
    populated = _drive(harness)

    # Assert
    assert isinstance(populated, PopulatedC6Cache)
    assert populated.cache_root == harness.cache_root
    assert populated.tile_store_path == harness.tile_store_path
    assert populated.faiss_index_path == harness.faiss_index_path
    assert populated.faiss_sidecar_sha256_path == harness.sha256_path
    assert populated.faiss_sidecar_meta_path == harness.meta_path
    assert populated.route_spec is harness.route_spec
    assert populated.tile_count == 12
    assert populated.elapsed_seconds >= 0.0
    harness.route_client.seed_route.assert_called_once()
    harness.tile_downloader.download_tiles_for_area.assert_called_once()
    harness.descriptor_batcher.populate_descriptors.assert_called_once()
    harness.descriptor_index_factory.assert_called_once()


def test_populate_c6_from_route_passes_sector_class_to_downloader(
    tmp_path: Path,
) -> None:
    # Arrange
    harness = _build_harness(tmp_path)

    # Act
    _drive(harness, sector_class=SectorClassification.STABLE_REAR)

    # Assert
    download_request = harness.tile_downloader.download_tiles_for_area.call_args.args[0]
    assert download_request.sector_class is SectorClassification.STABLE_REAR
    corpus_filter = harness.descriptor_batcher.populate_descriptors.call_args.args[0]
    assert corpus_filter.sector_class == SectorClassification.STABLE_REAR.value


# ----------------------------------------------------------------------
# AC-4 — validation / terminal failure propagate unchanged


def test_route_validation_error_propagates_unchanged(tmp_path: Path) -> None:
    # Arrange
    harness = _build_harness(tmp_path)

    def _raise_validation(*_args: object, **_kwargs: object) -> RouteSeedResult:
        try:
            raise ValueError("payload sha256 mismatch")
        except ValueError as cause:
            raise RouteValidationError("payload rejected") from cause

    harness.route_client.seed_route.side_effect = _raise_validation

    # Act + Assert
    with pytest.raises(RouteValidationError) as exc_info:
        _drive(harness)
    assert isinstance(exc_info.value.__cause__, ValueError)
    assert "payload sha256 mismatch" in str(exc_info.value.__cause__)
    assert harness.tile_downloader.download_tiles_for_area.call_count == 0
    assert harness.descriptor_batcher.populate_descriptors.call_count == 0
    assert harness.sleep_calls == []


def test_route_terminal_failure_propagates_unchanged(tmp_path: Path) -> None:
    # Arrange
    harness = _build_harness(tmp_path)
    harness.route_client.seed_route.side_effect = RouteTerminalFailureError(
        "mapsReady never reached"
    )

    # Act + Assert
    with pytest.raises(RouteTerminalFailureError):
        _drive(harness)
    assert harness.tile_downloader.download_tiles_for_area.call_count == 0
    assert harness.descriptor_batcher.populate_descriptors.call_count == 0
    assert harness.sleep_calls == []


# ----------------------------------------------------------------------
# AC-5 — transient retry budget


def test_route_transient_error_retries_then_succeeds(tmp_path: Path) -> None:
    # Arrange
    harness = _build_harness(tmp_path)
    success_result = harness.route_client.seed_route.return_value
    harness.route_client.seed_route.side_effect = [
        RouteTransientError("503 first attempt"),
        RouteTransientError("503 second attempt"),
        success_result,
    ]

    # Act
    populated = _drive(
        harness,
        retry_schedule_s=(0.1, 0.2, 0.4),
        max_retry_attempts=3,
    )

    # Assert
    assert populated.tile_count == 12
    assert harness.route_client.seed_route.call_count == 3
    assert harness.sleep_calls == [pytest.approx(0.1), pytest.approx(0.2)]


def test_route_transient_error_exhausted_propagates_last_attempt(
    tmp_path: Path,
) -> None:
    # Arrange
    harness = _build_harness(tmp_path)
    final_exc = RouteTransientError("503 final attempt")
    harness.route_client.seed_route.side_effect = [
        RouteTransientError("503 a"),
        RouteTransientError("503 b"),
        final_exc,
    ]

    # Act + Assert
    with pytest.raises(RouteTransientError) as exc_info:
        _drive(
            harness,
            retry_schedule_s=(0.1, 0.2),
            max_retry_attempts=3,
        )
    assert exc_info.value is final_exc
    assert harness.route_client.seed_route.call_count == 3
    assert harness.sleep_calls == [pytest.approx(0.1), pytest.approx(0.2)]
    assert harness.tile_downloader.download_tiles_for_area.call_count == 0


# ----------------------------------------------------------------------
# AC-6 — tamper between rebuild and verify


def test_descriptor_index_factory_index_unavailable_propagates(
    tmp_path: Path,
) -> None:
    # Arrange
    harness = _build_harness(tmp_path)
    # Simulate the rebuild writing sidecar files DURING populate_descriptors
    # (the real C10 batcher does this via its DescriptorIndexRebuilder cut).
    _stub_populate_descriptors_writes_sidecars(harness)
    harness.descriptor_index_factory.side_effect = IndexUnavailableError(
        "sidecar sha256 mismatch — index is corrupt"
    )

    # Act + Assert
    with pytest.raises(IndexUnavailableError):
        _drive(harness)


# ----------------------------------------------------------------------
# AC-7 — cleanup on failure


def test_cleanup_removes_partial_sidecar_files_on_failure(
    tmp_path: Path,
) -> None:
    # Arrange
    harness = _build_harness(tmp_path)
    # The driver MUST observe an absent-sidecar state on entry, then a
    # rebuild that writes the trio, then a verifier that fails — only
    # then is the cleanup contract exercised on a "we created these"
    # set of paths.
    assert not harness.faiss_index_path.exists()
    _stub_populate_descriptors_writes_sidecars(harness)
    harness.descriptor_index_factory.side_effect = IndexUnavailableError(
        "tamper detected"
    )

    # Act
    with pytest.raises(IndexUnavailableError):
        _drive(harness)

    # Assert
    assert not harness.faiss_index_path.exists()
    assert not harness.sha256_path.exists()
    assert not harness.meta_path.exists()


def test_cleanup_preserves_pre_existing_warm_cache(tmp_path: Path) -> None:
    # Arrange
    harness = _build_harness(tmp_path)
    # A warm cache existed before the driver ran (named-volume reuse path).
    _write_dummy_sidecars(harness, marker="WARM_CACHE")
    harness.route_client.seed_route.side_effect = RouteValidationError(
        "noop fail post-warm-cache"
    )

    # Act
    with pytest.raises(RouteValidationError):
        _drive(harness)

    # Assert — the pre-existing warm-cache files MUST stay on disk.
    assert harness.faiss_index_path.read_text() == "WARM_CACHE"
    assert harness.sha256_path.read_text() == "WARM_CACHE"
    assert harness.meta_path.read_text() == "WARM_CACHE"


def test_batcher_failure_propagates_and_cleans_up(tmp_path: Path) -> None:
    # Arrange
    harness = _build_harness(tmp_path)

    def _populate_writes_partial_sidecar_then_fails(
        _filter: object,
    ) -> DescriptorBatchReport:
        _write_dummy_sidecars(harness, marker="HALF_BUILT")
        return DescriptorBatchReport(
            descriptors_generated=0,
            tiles_consumed=0,
            oom_retries=0,
            elapsed_s=0.5,
            outcome=BatcherOutcome.FAILURE,
            failure_reason="OOM at batch_size=64",
        )

    harness.descriptor_batcher.populate_descriptors.side_effect = (
        _populate_writes_partial_sidecar_then_fails
    )

    # Act + Assert
    with pytest.raises(RuntimeError) as exc_info:
        _drive(harness)
    assert "OOM at batch_size=64" in str(exc_info.value)
    assert not harness.faiss_index_path.exists()
    assert not harness.sha256_path.exists()
    assert not harness.meta_path.exists()


def test_downloader_failure_propagates_and_cleans_up(tmp_path: Path) -> None:
    # Arrange
    harness = _build_harness(tmp_path)
    harness.tile_downloader.download_tiles_for_area.return_value = (
        DownloadBatchReport(
            outcome=DownloadOutcome.FAILURE,
            tiles_requested=12,
            tiles_downloaded=0,
            tiles_rejected_resolution=0,
            tiles_rejected_freshness=0,
            tiles_downgraded=0,
            retry_count=2,
            request_hash="abcdef0123456789",
        )
    )

    # Act + Assert
    with pytest.raises(RuntimeError) as exc_info:
        _drive(harness)
    assert "failure" in str(exc_info.value).lower()
    assert harness.descriptor_batcher.populate_descriptors.call_count == 0


# ----------------------------------------------------------------------
# Internal helpers


def _write_dummy_sidecars(
    harness: _DriverHarness,
    *,
    marker: str = "PARTIAL",
) -> None:
    """Create the three sidecar files at the harness's faiss path."""

    harness.faiss_index_path.write_text(marker)
    harness.sha256_path.write_text(marker)
    harness.meta_path.write_text(marker)


def _stub_populate_descriptors_writes_sidecars(
    harness: _DriverHarness,
    *,
    marker: str = "FRESH_REBUILD",
) -> None:
    """Make the stubbed batcher write the three sidecar files on success.

    The real C10 batcher writes the FAISS index + sha256 + meta.json
    via the AZ-306 :class:`FaissDescriptorIndex.rebuild_from_descriptors`
    path. The stub mirrors that side effect so the AC-7 cleanup path
    has files to roll back on a downstream verifier failure.
    """

    success_report = harness.descriptor_batcher.populate_descriptors.return_value

    def _populate(_filter: object) -> DescriptorBatchReport:
        _write_dummy_sidecars(harness, marker=marker)
        return success_report

    harness.descriptor_batcher.populate_descriptors.side_effect = _populate
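Review note on the AC-5 tests in this file: they pin down the retry contract as up to `max_retry_attempts` calls, one sleep taken from `retry_schedule_s` between consecutive attempts, and the final exception re-raised as the same object. A minimal sketch of a loop that would satisfy those assertions follows. `TransientError` and `call_with_retry` are hypothetical names used only for illustration; the real driver in `_operator_pre_flight.py` is not shown in this excerpt and may be structured differently.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for RouteTransientError (hypothetical, for this sketch only)."""


def call_with_retry(
    op: Callable[[], T],
    *,
    retry_schedule_s: Sequence[float],
    max_retry_attempts: int,
    sleep: Callable[[float], object],
) -> T:
    """Call ``op`` until it succeeds or the attempt budget is spent."""
    if max_retry_attempts < 1:
        raise ValueError("max_retry_attempts must be >= 1")
    if max_retry_attempts > 1 and not retry_schedule_s:
        raise ValueError("retry_schedule_s must not be empty when retrying")
    attempt = 1
    while True:
        try:
            return op()
        except TransientError:
            if attempt >= max_retry_attempts:
                raise  # last attempt: propagate the same exception object
            # Reuse the last delay if the schedule is shorter than the budget.
            index = min(attempt - 1, len(retry_schedule_s) - 1)
            sleep(retry_schedule_s[index])
            attempt += 1


# Usage: two transient failures, then success on the third attempt.
sleeps: list = []
outcomes = iter([TransientError("503 a"), TransientError("503 b"), "ok"])


def flaky() -> str:
    item = next(outcomes)
    if isinstance(item, Exception):
        raise item
    return item


result = call_with_retry(
    flaky,
    retry_schedule_s=(0.1, 0.2, 0.4),
    max_retry_attempts=3,
    sleep=sleeps.append,
)
```

Injecting `sleep` as a callable is what lets the tests record delays in `sleep_calls` without waiting in real time; the bare `raise` is what keeps the identity check `exc_info.value is final_exc` true.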
@@ -0,0 +1,40 @@
|
||||
"""AZ-839 AC-9 — integration test: fixture produces a real :class:`PopulatedC6Cache`.

Gated by ``RUN_REPLAY_E2E=1`` AND ``@pytest.mark.tier2`` per the
AZ-839 task spec. The work the test asserts is the fixture's
contract; the fixture wiring itself lives in
``tests/e2e/replay/conftest.py::operator_pre_flight_setup`` and the
algorithmic correctness is covered by
``test_operator_pre_flight_driver.py`` against stubs (AC-8).

This test exists so AC-9 has a concrete pytest entry point. Other
end-to-end consumers (AZ-840 e2e orchestrator test; AZ-841 un-xfail
of the AZ-777 Tier-2 tests) chain off the same fixture.
"""

from __future__ import annotations

import pytest

from tests.e2e.replay._operator_pre_flight import PopulatedC6Cache


@pytest.mark.tier2
def test_operator_pre_flight_setup_produces_populated_cache(
    operator_pre_flight_setup: PopulatedC6Cache,
) -> None:
    # Arrange
    populated = operator_pre_flight_setup

    # Assert
    assert isinstance(populated, PopulatedC6Cache)
    assert populated.cache_root.is_dir()
    assert populated.tile_store_path.is_dir()
    assert populated.faiss_index_path.is_file()
    assert populated.faiss_sidecar_sha256_path.is_file()
    assert populated.faiss_sidecar_meta_path.is_file()
    assert populated.tile_count > 0
    assert populated.elapsed_seconds >= 0.0
    assert populated.route_spec.waypoints, (
        "RouteSpec must carry at least one waypoint extracted from the tlog"
    )