diff --git a/_docs/00_problem/input_data/flight_derkachi/README.md b/_docs/00_problem/input_data/flight_derkachi/README.md index d1a608f..1426b2e 100644 --- a/_docs/00_problem/input_data/flight_derkachi/README.md +++ b/_docs/00_problem/input_data/flight_derkachi/README.md @@ -12,3 +12,31 @@ Use this fixture for video/telemetry synchronization checks, representative replay smoke tests, VIO hot-path latency, frame-drop accounting, and trajectory comparison against `GLOBAL_POSITION_INT`. The video and telemetry align at exactly three video frames per telemetry row. Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims. For the test recording, the rotating camera was mechanically fixed in a downward/nadir orientation. Treat the MP4 as a cleaned/cropped replay fixture rather than the raw camera feed. + +## Derkachi C6 reference seeding (cycle 3 — AZ-777 + Epic AZ-835) + +The end-to-end replay pipeline needs the C6 tile cache pre-populated with the satellite imagery that covers this flight. The seed scripts live under `tests/fixtures/derkachi_c6/`: + +| Script | Purpose | +|--------|---------| +| `tests/fixtures/derkachi_c6/seed_region.py` (AZ-777 Phase 2) | Bbox-driven seed. Calls `POST /api/satellite/request` on the running `satellite-provider` to onboard the Derkachi area (~50.05–50.15 lat, 36.05–36.15 lon, zoom 15–18). Companion to the existing bbox-download workflow. | +| `tests/fixtures/derkachi_c6/seed_route.py` (AZ-838 / Epic AZ-835 C2) | Route-driven seed. Reads `derkachi.tlog`, extracts a ≤ 10-waypoint corridor via `replay_input.tlog_route.extract_route_from_tlog`, posts it to `satellite-provider`'s Route API, polls until `mapsReady=true`, and verifies coverage via inventory. ~100× more tile-efficient than the bbox path for this clip. 
| +| `tests/fixtures/derkachi_c6/bbox.yaml` | Derkachi bbox + zoom levels + license-attribution metadata (Google Maps Platform ToS + "Imagery © Google" attribution string). | +| `tests/fixtures/derkachi_c6/README.md` | Step-by-step re-seeding instructions when the `satellite-provider` postgres is wiped; license-attribution operators must propagate; pointer to the parent-suite ticket (TBD) for migrating to a true CC-BY satellite source for production. | + +Both seed scripts require: + +- A running `satellite-provider` reachable at `SATELLITE_PROVIDER_URL` (typically `https://satellite-provider:8080` inside the Jetson compose network). +- A valid JWT — either `SATELLITE_PROVIDER_API_KEY` env var or `--auto-mint-jwt` (uses `scripts/mint_dev_jwt.py`). +- `SATELLITE_PROVIDER_TLS_INSECURE=1` if the parent suite is using the self-signed dev cert (development only — production deploys must validate against a CA-issued cert). + +The end-to-end orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840) takes only `(derkachi.tlog, flight_derkachi.mp4, khp20s30_factory.json)` and runs the full 7-step pipeline against a populated C6 — see `_docs/02_document/contracts/replay/replay_protocol.md` Invariant 12.b for the orchestration. + +### License attribution caveat (cycle 3) + +The Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. This fixture and the seed scripts are dev/research use only. Production deployment requires either: + +- Google Maps Platform licensing review for offline-cache use, OR +- A parent-suite ticket to switch satellite-provider's upstream to a true CC-BY satellite source (Esri World Imagery, Mapbox satellite, Sentinel-2, etc.). + +The "Imagery © Google" attribution string is recorded in the seeded catalog's metadata and must be propagated downstream by any operator workflow that surfaces the imagery. 
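
The connection requirements the README lists for both seed scripts (`SATELLITE_PROVIDER_URL`, `SATELLITE_PROVIDER_API_KEY`, `SATELLITE_PROVIDER_TLS_INSECURE`) can be resolved in one place before any request is made. The following is a minimal sketch under stated assumptions, not the code in `seed_region.py` or `seed_route.py`: the `ProviderConnection` dataclass and `connection_from_env` helper are hypothetical names introduced for illustration, and treating any value other than `1` as "verify TLS" is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProviderConnection:
    """Hypothetical container for the settings both seed scripts need."""
    base_url: str
    headers: dict
    verify_tls: bool


def connection_from_env(env: Optional[Mapping[str, str]] = None) -> ProviderConnection:
    """Resolve connection settings from the environment variables named in the README."""
    env = os.environ if env is None else env

    base_url = env.get("SATELLITE_PROVIDER_URL")
    if not base_url:
        raise RuntimeError("SATELLITE_PROVIDER_URL is not set")

    token = env.get("SATELLITE_PROVIDER_API_KEY")
    if not token:
        # The README offers --auto-mint-jwt as the alternative; minting is out of scope here.
        raise RuntimeError("SATELLITE_PROVIDER_API_KEY is not set")

    # Assumption: only the literal value "1" disables certificate verification.
    insecure = env.get("SATELLITE_PROVIDER_TLS_INSECURE", "0") == "1"

    return ProviderConnection(
        base_url=base_url.rstrip("/"),
        headers={"Authorization": f"Bearer {token}"},
        verify_tls=not insecure,
    )
```

The three fields map onto the `base_url`, `headers`, and `verify` keyword arguments of `httpx.Client`, so a caller could construct the client from them. Keeping `verify_tls` true by default means the insecure path has to be requested explicitly, which matches the development-only status of the knob.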
diff --git a/_docs/02_document/architecture.md b/_docs/02_document/architecture.md index 56c3573..5f6c88f 100644 --- a/_docs/02_document/architecture.md +++ b/_docs/02_document/architecture.md @@ -262,11 +262,25 @@ source repo | ArduPilot Plane FC | MAVLink 2.0 (`GPS_INPUT` 5 Hz; `MAV_CMD_SET_EKF_SOURCE_SET`; `STATUSTEXT` / `NAMED_VALUE_FLOAT`) over UART/USB | MAVLink 2.0 message signing, per-flight key (D-C8-9 = (d)) | 5 Hz periodic emit; signing handshake at takeoff load (≤ 5 s, AC-NEW-1) | Signing handshake fail → companion refuses takeoff; mid-flight signing key compromise → FC ignores unsigned messages, AC-5.2 takes over | | iNav FC | MSP2 `MSP2_SENSOR_GPS` over UART; MAVLink outbound for telemetry | None (iNav has no signing) — accepted residual risk per Mode B Source #129 | 5 Hz periodic emit | Mid-flight bad-frame → iNav `mspGPSReceiveNewData()` receives only the latest frame; honest `hPosAccuracy` is the only safety net | | QGroundControl (GCS) | MAVLink 2.0 (`STATUSTEXT`, `NAMED_VALUE_FLOAT`, `GPS_RAW_INT`) | Same MAVLink 2.0 signing as the AP path (AP profile); no signing on iNav profile | 1–2 Hz downsampled (AC-6.1); operator commands are best-effort | GCS link drop → companion continues; no mid-flight reconfiguration is required from GCS | -| `satellite-provider` (pre-flight) | REST over HTTP, OpenAPI at `/swagger`; filesystem access if co-located | TLS + service-internal API key (operator workstation only); the companion never reaches `satellite-provider` directly while airborne | Off-line pre-flight; not time-critical | Cache miss → C11 `TileDownloader` fails fast pre-flight; C10 build is blocked downstream; takeoff blocked | +| `satellite-provider` (pre-flight read — bbox + slippy-map) | REST `POST /api/satellite/tiles/inventory` (bulk lookup by `(z,x,y)`, ≤ 5000 entries / request) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch); OpenAPI at `/swagger`; filesystem access if co-located | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; the 
dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert. The companion never reaches `satellite-provider` directly while airborne. | Off-line pre-flight; not time-critical | Cache miss → C11 `TileDownloader` fails fast pre-flight; C10 build is blocked downstream; takeoff blocked | +| `satellite-provider` (pre-flight route seed — cycle 3 / Epic AZ-835) | REST `POST /api/satellite/route` (corridor onboarding; body per `CreateRouteRequest.cs` DTO) + `GET /api/satellite/route/{id}` (status polling; terminal-success `mapsReady=true`) | Same JWT Bearer / TLS-insecure as the read path; validated pre-emptively against AZ-809 `CreateRouteRequestValidator` bounds | Off-line pre-flight; bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s) | Terminal failure → `RouteTerminalFailureError`; transient → `RouteTransientError`; validation → `RouteValidationError`. C11's `SatelliteProviderRouteClient` (AZ-838) owns the surface. | | `satellite-provider` (post-landing ingest, D-PROJ-2, **planned**) | REST `POST /api/satellite/tiles/ingest` (multipart) | Per-flight onboard signing key (carried with each tile); rate-limited | Bursty post-landing | Endpoint not yet implemented service-side → C11 keeps batches queued locally; never blocks the pre-flight cycle | | Operator workstation (pre-flight stage) | Filesystem (USB / Ethernet) | OS-level (operator login) | Not time-critical | Bad-stage detection via Manifest content-hash gate (D-C10-3) | | Nav camera | USB / MIPI-CSI / GigE (lens-module dependent) | n/a | 3 Hz | Frame drop / hardware fault → "VISUAL_BLACKOUT" path (AC-3.5, AC-NEW-8) | +### `satellite-provider` integration (cycle-3 ground truth) + +**The Jetson e2e harness now consumes the REAL parent-suite `satellite-provider` .NET service** (lineage AZ-688 / AZ-691 / AZ-692; `satellite-provider` + `satellite-provider-postgres` services in `docker-compose.test.jetson.yml`). 
The legacy `mock-sat` fixture is retired from the Jetson compose; D-PROJ-2 `POST /api/satellite/upload` has shipped service-side (`Program.cs:211`). Tier-1 `docker-compose.test.yml` is deprecated 2026-05-20 per `_docs/02_document/tests/environment.md`. + +Two consequences for the architecture: + +1. **C11 read contract adapted to the v1.0.0 inventory shape (AZ-777 Phase 1)** — `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` replace the historical `GET /api/satellite/tiles?bbox=…&zoom=…` shape. The bbox-driven `download_tiles_for_area` entry point and its DTOs are unchanged at the call-site level; the contract adaptation is internal to `HttpTileDownloader`. Auth is JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; `SATELLITE_PROVIDER_TLS_INSECURE=1` is a documented dev-only knob for self-signed certs. +2. **Route-driven seeding (Epic AZ-835 — C11's third interface, `SatelliteProviderRouteClient`)** — the operator can now submit a tlog-derived `RouteSpec` (waypoints + region size; produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836; canonical DTO at `_types/route.py` per AZ-845) via `POST /api/satellite/route` and have `satellite-provider` materialise just the corridor tiles, polling `GET /api/satellite/route/{id}` until `mapsReady=true`. This is ~100× more tile-efficient than the bbox path on long, narrow flights. Pre-emptive validation mirrors the AZ-809 `CreateRouteRequestValidator` bounds. The route-driven path is exercised today by the cycle-3 e2e fixture `operator_pre_flight_setup` (AZ-839) and the orchestrator test `test_az835_e2e_real_flight.py` (AZ-840); the C12 production CLI binding is a future-cycle integration. + +**Imagery source license attribution (cycle 3)**: the Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. 
Dev/research use only; production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the parent-suite side (parent-suite ticket TBD). Operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution. + +No new ADR — this is execution of existing decisions (architectural principle #5 satellite-provider on-disk layout end-to-end; ADR-004 process-level isolation unchanged; ADR-011 replay is a configuration unchanged). The architectural surface gained the route-driven seeding path inside C11; nothing else moved. + ### `satellite-provider` upload contract (per D-PROJ-2 carryforward) The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`. From this architecture's standpoint: @@ -274,7 +288,7 @@ The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/202 - **`Tile` writes are append-only and idempotent** (the same `(zoomLevel, lat, lon, capture_timestamp, companion_id, flight_id)` tuple is the dedup key). - **Quality metadata is mandatory on every uploaded tile** so the planned voting layer can promote `pending → trusted` without re-deriving statistics on the service side. - **Onboard tiles never claim the `trusted` status**; they are uploaded as `pending` and the parent-suite voting layer (D-PROJ-2 design task #2) decides promotion. -- **Test substitute**: `mock-suite-sat-service` is an e2e-test-only fixture (under `tests/fixtures/mock-suite-sat-service/`) that implements the upload contract for NFT-SEC-01 / FT-P-17 / IT runs until D-PROJ-2 lands service-side. It is **not a component** in the architectural sense — the production architectural counterparty for both download and upload is the real `satellite-provider`. The fixture is retired the moment the real ingest endpoint ships. 
+- **Test substitute**: `mock-suite-sat-service` is an e2e-test-only fixture (under `tests/fixtures/mock-suite-sat-service/`) that implements the upload contract for NFT-SEC-01 / FT-P-17 / IT runs until D-PROJ-2 lands service-side. It is **not a component** in the architectural sense — the production architectural counterparty for both download and upload is the real `satellite-provider`. The fixture is retired the moment the real ingest endpoint ships. (Download + route-seed integration tests on the Jetson harness already run against the real service as of cycle 3.) --- diff --git a/_docs/02_document/components/12_c11_tilemanager/description.md b/_docs/02_document/components/12_c11_tilemanager/description.md index 79dcf42..c84019f 100644 --- a/_docs/02_document/components/12_c11_tilemanager/description.md +++ b/_docs/02_document/components/12_c11_tilemanager/description.md @@ -2,23 +2,32 @@ ## 1. High-Level Overview -**Purpose**: own the operator-side network I/O against `satellite-provider` for the onboard tile corpus, in **both directions**: +**Purpose**: own the operator-side network I/O against `satellite-provider` for the onboard tile corpus, in **three directions**: +- **Route seed** (pre-flight, F1, route-driven variant — Cycle 3 / Epic AZ-835): submit a tlog-derived `RouteSpec` (waypoints + per-waypoint coverage radius, produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836) to `satellite-provider`'s Route API and poll until corridor tile materialisation completes. Lets the operator pre-commit the cache to where the drone actually flew rather than a bounding box. - **Download** (pre-flight, F1): fetch tiles from `satellite-provider` for the operational area, apply AC-NEW-6 freshness gating, and write into C6 (`TileStore` + `TileMetadataStore`). C11 is the **only** path that crosses the workstation/companion enclave to the parent suite for tile pixels — C10 reads from the populated C6 store and never touches `satellite-provider` itself. 
- **Upload** (post-landing, F10): read pending mid-flight tiles from C6 and POST to `satellite-provider`'s ingest endpoint (D-PROJ-2 contract sketch). C11 itself does NOT gate on flight state — it is a dumb pipe; the post-landing safety gate is owned by C12's `PostLandingUploadOrchestrator` (AZ-329 / Batch 44), which checks the C13 `flight_footer` FDR record for `clean_shutdown=True` before invoking `TileUploader.upload_pending_tiles`. -C11 is a **separate operator-side binary / image**. The airborne companion image's CMake target deliberately excludes the entire `c11_tilemanager/` source tree so the airborne process cannot accidentally execute either the download path or the upload path even via reflection or config error (ADR-004 process-level isolation, AC-8.4). Both directions of tile I/O are operator-driven on the operator workstation; the companion only consumes the populated C6 store while airborne. +C11 is a **separate operator-side binary / image**. The airborne companion image's CMake target deliberately excludes the entire `c11_tilemanager/` source tree so the airborne process cannot accidentally execute the seed path, the download path, or the upload path even via reflection or config error (ADR-004 process-level isolation, AC-8.4). All three directions of tile I/O are operator-driven on the operator workstation; the companion only consumes the populated C6 store while airborne. -**Architectural Pattern**: Pipeline behind two interfaces (`TileDownloader`, `TileUploader`) under one component, consistent with C8's multi-interface shape (FC-AP, FC-iNav, GCS adapters under one component). The two interfaces are bundled into C11 because they share auth (TLS + service-internal API key for download, per-flight onboard signing key for upload), HTTP client, network configuration, deployment unit (operator-tooling tarball), and the airborne-exclusion property — splitting them into two components would duplicate all of that. 
They are kept as **two interfaces** so SRP is preserved at the call-site level: C12 binds `TileDownloader` for the F1 cache-build workflow, `TileUploader` for the F10 post-landing trigger; neither is forced to depend on the other. +**Architectural Pattern**: Pipeline behind three interfaces (`SatelliteProviderRouteClient`, `TileDownloader`, `TileUploader`) under one component, consistent with C8's multi-interface shape (FC-AP, FC-iNav, GCS adapters under one component). The three interfaces are bundled into C11 because they share auth (JWT Bearer + optional TLS-insecure flag for dev self-signed certs across all three; the upload direction additionally signs each tile with the per-flight onboard signing key), HTTP client (`httpx`), network configuration, deployment unit (operator-tooling tarball), and the airborne-exclusion property — splitting them into separate components would duplicate all of that. They are kept as **three interfaces** so SRP is preserved at the call-site level: C12 binds `SatelliteProviderRouteClient.seed_route` to materialise the corridor cache from a tlog (cycle 3 e2e fixture today; planned C12 production path), `TileDownloader.download_tiles_for_area` for the F1 bbox-driven cache-build workflow, `TileUploader.upload_pending_tiles` for the F10 post-landing trigger; none is forced to depend on the others. **Cycle-1 operational reality**: C11 is **operator-workstation-only**, NOT an airborne strategy slot — there is no `c11_tile_manager` slot in `_AIRBORNE_REGISTRATIONS`, no row in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`, and the airborne companion image's build target deliberately excludes the entire `c11_tile_manager/` source tree (ADR-004 process-level isolation; AC-8.4). 
The operator binary composes C11 via `runtime_root/c11_factory.py`, which exposes three tiny per-service factories — `build_per_flight_key_manager` (AZ-318), `build_tile_uploader` (AZ-319 + AZ-320), and `build_tile_downloader` (AZ-316) — each called explicitly by C12's CLI; no central registry. FDR wiring goes through the per-producer `make_fdr_client` cache: AZ-318 `PerFlightKeyManager` defaults to `make_fdr_client("c11_tile_manager.signing_key", config)`, AZ-319 `HttpTileUploader` to `make_fdr_client("c11_tile_manager.tile_uploader", config)` — both distinct from the airborne `"airborne_main"` producer, so the operator-workstation process gets its own per-component FdrClient instances rather than sharing the airborne singleton. AZ-320's `IdempotentRetryTileUploader` decorator wraps `HttpTileUploader` by default (per-call + per-tile bounded retry); `config.components['c11_tile_manager'].disable_retry_decorator = True` suppresses the wrap for low-level debugging or test wiring that needs to observe the inner uploader. The AZ-507 cross-component cut keeps C11 from importing C6 directly: `tile_store` / `tile_metadata_store` are passed in by the operator-binary composition root as consumer-side cuts; `http_client` (an `httpx.Client`) is also caller-owned so tests can swap in `httpx.MockTransport`. AZ-687 replay-mode guard does not apply — C11 has no airborne footprint. +**Cycle-3 operational reality (AZ-777 Phase 1 + Epic AZ-835)**: the e2e harness now wires the e2e-runner against the **real** parent-suite `satellite-provider` .NET service in `docker-compose.test.jetson.yml` (lineage AZ-688 / AZ-691 / AZ-692; tier-1 `docker-compose.test.yml` deprecated 2026-05-20). Two consequences cascaded into C11: + +- **`TileDownloader` contract adaptation (AZ-777 Phase 1)** — `HttpTileDownloader._INVENTORY_PATH = "/api/satellite/tiles/inventory"` (POST, bulk lookup by (z,x,y)) and `HttpTileDownloader._TILES_PATH = "/tiles"` (GET, slippy-map fetch via `/tiles/{z}/{x}/{y}`). 
Previously documented as `GET /api/satellite/tiles?bbox=…&zoom=…`; the real `satellite-provider` API surface uses the inventory + slippy-map split per `tile-inventory.md` v1.0.0 (AZ-505). The bbox-driven `download_tiles_for_area` entry point and its `DownloadRequest` / `DownloadBatchReport` DTOs are unchanged at the call-site level; the contract adaptation is internal. Because the inventory response does not carry a `Content-Length` hint, AZ-308's pre-write budget check uses `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` (conservative over-reserve; typical 256×256 JPEG basemap tile is 8–80 KiB). Auth is `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert. +- **Third interface — `SatelliteProviderRouteClient` (AZ-838 / Epic AZ-835 C2)** — `seed_route(spec: RouteSpec) -> RouteSeedResult` POSTs the spec to `POST /api/satellite/route` (`requestMaps=true`, `createTilesZip=false`), polls `GET /api/satellite/route/{id}` until `mapsReady=true` (or a terminal-failure status), then verifies coverage via `POST /api/satellite/tiles/inventory`. Pre-emptively enforces AZ-809's `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges) so obviously-bad input fails before the HTTP POST. Default cadence: `poll_interval_s = 5.0`, `poll_max_attempts = 60`, `request_timeout_s = 30.0`. Errors form a dedicated hierarchy (`RouteValidationError` 4xx + RFC 7807 ProblemDetails; `RouteTransientError` 5xx / network / timeout with `__cause__` set; `RouteTerminalFailureError` for non-success terminal status) rooted at `SatelliteProviderRouteError` — independent of `TileManagerError` because the Route API is a corridor-onboarding flow, not a per-tile transfer. 
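
The pre-emptive validation and the poll loop described for `SatelliteProviderRouteClient` can be sketched as follows. This is an illustrative outline, not the AZ-838 implementation: the function names `validate_route` and `poll_until_ready` are hypothetical, the latitude/longitude limits are the usual WGS-84 ranges (the text only says "lat/lon ranges"), the terminal-failure status set is taken from the external API table in this same document, and raising `RouteTransientError` when the attempt budget runs out is this sketch's own choice. The status fetcher is injected so the loop can run without a network.

```python
import time
from typing import Callable, Mapping, Sequence, Tuple

# Bounds quoted in the text (AZ-809 CreateRouteRequestValidator).
MIN_POINTS, MAX_POINTS = 2, 500
MIN_REGION_M, MAX_REGION_M = 100.0, 10_000.0
MIN_ZOOM, MAX_ZOOM = 0, 22
TERMINAL_FAILURE = frozenset({"failed", "error", "rejected"})


class SatelliteProviderRouteError(Exception):
    """Root of the route-seed error hierarchy."""


class RouteValidationError(SatelliteProviderRouteError):
    def __init__(self, errors: Mapping[str, str]):
        super().__init__(f"invalid route: {dict(errors)}")
        self.errors = dict(errors)


class RouteTransientError(SatelliteProviderRouteError):
    pass


class RouteTerminalFailureError(SatelliteProviderRouteError):
    def __init__(self, detail: Mapping[str, object]):
        super().__init__(f"route failed: {detail.get('status')}")
        self.detail = dict(detail)


def validate_route(points: Sequence[Tuple[float, float]],
                   region_size_m: float, zoom: int) -> None:
    """Reject obviously bad input before any HTTP request is sent."""
    errors = {}
    if not MIN_POINTS <= len(points) <= MAX_POINTS:
        errors["points"] = f"expected {MIN_POINTS}..{MAX_POINTS} points"
    if not MIN_REGION_M <= region_size_m <= MAX_REGION_M:
        errors["regionSizeMeters"] = f"expected {MIN_REGION_M}..{MAX_REGION_M}"
    if not MIN_ZOOM <= zoom <= MAX_ZOOM:
        errors["zoomLevel"] = f"expected {MIN_ZOOM}..{MAX_ZOOM}"
    # Assumption: standard WGS-84 coordinate ranges.
    if any(not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0) for lat, lon in points):
        errors["points.latlon"] = "latitude or longitude out of range"
    if errors:
        raise RouteValidationError(errors)


def poll_until_ready(fetch_status: Callable[[], Mapping[str, object]],
                     poll_interval_s: float = 5.0,
                     poll_max_attempts: int = 60,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until mapsReady is true; return the number of attempts used."""
    for attempt in range(1, poll_max_attempts + 1):
        body = fetch_status()
        if body.get("mapsReady") is True:
            return attempt
        if str(body.get("status", "")).lower() in TERMINAL_FAILURE:
            raise RouteTerminalFailureError(body)
        if attempt < poll_max_attempts:
            sleep(poll_interval_s)
    # Assumption: budget exhaustion is reported as a transient error.
    raise RouteTransientError(f"route not ready after {poll_max_attempts} attempts")
```

With the default cadence the loop sleeps at most 59 times between 60 status requests, so the wall-clock ceiling stays just under the 60 × 5 s budget quoted in the text, plus request latency. Collecting all violations into one `errors` mapping, rather than failing on the first, mirrors the shape of an RFC 7807 `errors` dictionary.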
+ +The route-driven path is exercised today by `tests/e2e/replay/conftest.py::operator_pre_flight_setup` (AZ-839 — replaces the cycle-1 `mkdir` placeholder; yields a `PopulatedC6Cache` dataclass) and `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840 — single test that takes only `(tlog, video, calibration)` and runs the full 7-step pipeline). The C12 production CLI binding for the route path is a future-cycle integration; today's C12 still drives only `download_tiles_for_area` for production pre-flight cache builds. + **Upstream dependencies**: -- C12 OperatorTooling → invokes `TileDownloader.download_tiles_for_area(...)` during F1 and `TileUploader.upload_pending_tiles(...)` post-landing. +- C12 OperatorTooling → invokes `TileDownloader.download_tiles_for_area(...)` during F1 and `TileUploader.upload_pending_tiles(...)` post-landing. (Cycle-3 e2e fixtures also drive `SatelliteProviderRouteClient.seed_route(...)` for the route-driven F1 variant; C12 production binding for the route path is a future cycle.) - C6 TileStore + TileMetadataStore → write target during download (`source = googlemaps`); read source during upload (`source = onboard_ingest`, `voting_status = pending`). +- `replay_input.tlog_route.RouteSpec` (AZ-836; `_types/route.py` canonical home per AZ-845) → input DTO to `SatelliteProviderRouteClient.seed_route`. - Operator workstation OS → invocation entry point (CLI / tray app, owned by C12). -- `satellite-provider` (external) → `GET /api/satellite/tiles?bbox=…&zoom=…` for download; `POST /api/satellite/tiles/ingest` for upload (D-PROJ-2 design task #1, **planned, not yet implemented service-side**). 
+- `satellite-provider` (external) → for download: `POST /api/satellite/tiles/inventory` (bulk lookup by (z,x,y)) + `GET /tiles/{z}/{x}/{y}` (slippy-map fetch, per `tile-inventory.md` v1.0.0 / AZ-505); for route seeding: `POST /api/satellite/route` + `GET /api/satellite/route/{id}` (per `CreateRouteRequest.cs` DTO + AZ-809 validator); for upload: `POST /api/satellite/tiles/ingest` (D-PROJ-2 design task #1, **planned, not yet implemented service-side**). **Downstream consumers**: @@ -27,6 +36,12 @@ C11 is a **separate operator-side binary / image**. The airborne companion image ## 2. Internal Interfaces +### Interface: `SatelliteProviderRouteClient` (cycle 3 — AZ-838 / Epic AZ-835 C2) + +| Method | Input | Output | Async | Error Types | +|--------|-------|--------|-------|-------------| +| `seed_route` | `RouteSpec` (from `_types/route.py`; `name: str \| None` optional) | `RouteSeedResult` | No (poll loop; seconds–minutes) | `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError` (all under `SatelliteProviderRouteError`) | + ### Interface: `TileDownloader` | Method | Input | Output | Async | Error Types | @@ -46,6 +61,21 @@ C11 no longer exposes `confirm_flight_state` — the post-landing flight-state g **Input/Output DTOs**: ``` +RouteSpec (cycle 3 — _types/route.py, produced by replay_input/tlog_route.py): + waypoints: tuple[tuple[float, float], ...] 
# (lat, lon), 1..max_waypoints + suggested_region_size_meters: float # per-waypoint coverage radius + source_tlog: Path # provenance + source_segment: tuple[int, int] # (start_idx, end_idx) into tlog GPS rows + total_distance_meters: float # along-track distance of active segment + +RouteSeedResult (cycle 3 — c11_tile_manager.route_client): + route_id: uuid + terminal_status: string + maps_ready: bool + tile_count: int + elapsed_ms: int + submitted_payload_sha256: string + DownloadRequest: bbox: BoundingBox (lat_min, lon_min, lat_max, lon_max) zoom_levels: list[int] @@ -78,17 +108,25 @@ UploadBatchReport: ## 3. External API Specification -C11 is a **client** of `satellite-provider`'s REST surface in both directions. +C11 is a **client** of `satellite-provider`'s REST surface in three directions. -### 3.1 Download — read path (existing `satellite-provider` API) +### 3.1 Route seed — corridor materialisation (cycle 3 — AZ-838 / Epic AZ-835 C2) | Endpoint | Method | Auth | Rate Limit | Description | |----------|--------|------|------------|-------------| -| `/api/satellite/tiles?bbox=…&zoom=…` | GET | TLS + service-internal API key | parent-suite enforces | Paged tile blobs + metadata for a bounding box at the given zoom level(s). | +| `/api/satellite/route` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Submit a `RouteSpec` (waypoints + region size + zoom level). Body shape per `CreateRouteRequest.cs` / `RoutePoint.cs` (`lat` / `lon` JSON property names) / `GeoPoint.cs` DTOs. Query: `requestMaps=true&createTilesZip=false`. Validated pre-emptively against AZ-809 `CreateRouteRequestValidator` rules. | +| `/api/satellite/route/{id}` | GET | same as above | parent-suite enforces | Poll route processing status. Returns `mapsReady: bool` + a `status` string. Terminal-success: `mapsReady=true`. Terminal-failure: `status ∈ {failed, error, rejected}`. Default cadence: 5 s × ≤ 60 attempts. 
| -C11 honours `Retry-After` on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Resolution below 0.5 m/px (RESTRICT-SAT-4) is rejected at the C11 boundary, not pushed downstream. +### 3.2 Download — read path (`satellite-provider` v1.0.0 inventory contract — AZ-505 / AZ-777 Phase 1) -### 3.2 Upload — write path (D-PROJ-2 contract sketch, **planned**) +| Endpoint | Method | Auth | Rate Limit | Description | +|----------|--------|------|------------|-------------| +| `/api/satellite/tiles/inventory` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Bulk lookup of `(zoom, x, y)` slippy-map coords (≤ 5000 entries / request); body shape per `tile-inventory.md` v1.0.0. Response order matches request order; each entry carries `present: true|false` plus metadata when present (`resolutionMPerPx`, `producedAt`, …). | +| `/tiles/{z}/{x}/{y}` | GET | same as above | parent-suite enforces | Slippy-map tile fetch by coordinates (binary JPEG response). Issued only for inventory entries with `present=true`. | + +C11 honours `Retry-After` on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Resolution below 0.5 m/px (RESTRICT-SAT-4) is rejected at the C11 boundary, not pushed downstream. Because the inventory response carries no `Content-Length` hint, AZ-308's pre-write budget check uses a conservative `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` per-tile reserve. + +### 3.3 Upload — write path (D-PROJ-2 contract sketch, **planned**) | Endpoint | Method | Auth | Rate Limit | Description | |----------|--------|------|------------|-------------| @@ -136,26 +174,28 @@ C11 reads from / writes to C6 (the local store) and reads from / writes to `sate **Algorithmic Complexity**: +- Route seed: bounded by parent-suite tile materialisation latency (~seconds–minutes for the Derkachi corridor; gated by `poll_max_attempts × poll_interval_s`). 
- Download: linear in tile count; bandwidth-bound by the operator workstation's link to `satellite-provider`. - Upload: linear in pending tile count; bandwidth-bound; bursty post-landing. -**State Management**: stateless except for the two journals. +**State Management**: stateless except for the two journals (download / pending-upload). The route client is fully stateless — each `seed_route` call submits, polls, verifies, and returns. **Key Dependencies**: | Library | Version | Purpose | |---------|---------|---------| -| httpx | per project pin | GET (download) + multipart POST (upload) to `satellite-provider` | +| httpx | per project pin | POST inventory + GET slippy-map (download), POST route + GET status (route seed), multipart POST (upload) to `satellite-provider` | | atomicwrites | latest | Journal updates | | cryptography | per project pin | Per-flight signing key (upload payload signing); the production `satellite-provider` ingest endpoint and the e2e-test `mock-suite-sat-service` fixture both verify with the same key family | **Error Handling Strategy**: -- `SatelliteProviderError`: HTTP timeout / 5xx / TLS failure on either direction. Retry-with-backoff on 5xx; fail fast on TLS / auth. On download, surface to operator + takeoff blocked. On upload, leave tiles in the pending-upload journal and surface to operator. **Do not delete uploaded tiles from C6** until acknowledged. +- `SatelliteProviderError`: HTTP timeout / 5xx / TLS failure on download / upload. Retry-with-backoff on 5xx; fail fast on TLS / auth. On download, surface to operator + takeoff blocked. On upload, leave tiles in the pending-upload journal and surface to operator. **Do not delete uploaded tiles from C6** until acknowledged. - `RateLimitedError` (429): obey `Retry-After`; the operator can also re-invoke later. Same handling either direction. - `FreshnessRejectionError` / `ResolutionRejectionError`: download-side only. 
Per AC-NEW-6 / RESTRICT-SAT-4 — never silently downgrade fresh-required tiles in `active_conflict` sectors. Surface counts in the `DownloadBatchReport`. - `CacheBudgetExceededError`: download-side only. Pre-flight free-space check against AC-8.3 (≤ 10 GB). Fail fast with explicit budget delta; no partial write. - `SignatureRejectedError`: upload-side only. Per-flight signing key was rejected by `satellite-provider`. This is a security-critical event — do NOT silently drop; surface to operator + log to FDR. +- **Route-seed errors** (cycle 3, dedicated hierarchy under `SatelliteProviderRouteError`): `RouteValidationError` (4xx + RFC 7807 `errors` dict; raised pre-emptively for AZ-809 validator violations BEFORE the HTTP POST), `RouteTransientError` (5xx / network / timeout; carries `__cause__`), `RouteTerminalFailureError` (parent suite reports a non-success terminal status; `.detail` carries the response JSON). Separate hierarchy from `TileManagerError` because the route flow is corridor onboarding, not per-tile transfer. Post-landing safety: C11's upload path no longer gates on flight state internally. The check now lives in C12's `PostLandingUploadOrchestrator` (AZ-329 / Batch 44), which refuses to invoke `TileUploader.upload_pending_tiles` unless the C13 `flight_footer` FDR record records `clean_shutdown=True` for the target flight. ADR-004 process-level isolation remains the primary control — C11 should never run on the companion at all. @@ -170,8 +210,10 @@ Post-landing safety: C11's upload path no longer gates on flight state internall **Known limitations**: - D-PROJ-2 ingest endpoint is NOT yet implemented service-side. Until parent-suite delivers the endpoint, C11 will fail every upload — the pending-upload journal accumulates. Operator workflow tolerates this. -- The e2e-test `mock-suite-sat-service` fixture implements only the planned POST contract (per the leftover file). Download integration tests run against the real `satellite-provider`. 
Production runs reach `satellite-provider` directly in both directions; the fixture is never on the production path. -- `TileDownloader` requires the operator workstation to have network reach to `satellite-provider` (the only path that crosses out of the workstation enclave). Pre-flight network configuration is an operator concern owned by C12; C11 fails fast if reachability is missing. +- The e2e-test `mock-suite-sat-service` fixture implements only the planned POST upload contract (per the leftover file). Download + route-seed integration tests run against the real `satellite-provider` on the Jetson harness. Production runs reach `satellite-provider` directly in all three directions; the fixture is never on the production path. +- `TileDownloader` and `SatelliteProviderRouteClient` require the operator workstation to have network reach to `satellite-provider` (the only path that crosses out of the workstation enclave). Pre-flight network configuration is an operator concern owned by C12; C11 fails fast if reachability is missing. +- **Imagery source license attribution (cycle 3 — AZ-777 Phase 2)**: the Jetson `satellite-provider` instance downloads from the **Google Maps** satellite layer (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; the operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution string. Production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the satellite-provider side (parent-suite ticket TBD; surfaced in `_docs/00_problem/input_data/flight_derkachi/README.md`). +- **Dev TLS cert**: the e2e-runner today accepts the self-signed dev cert via `SATELLITE_PROVIDER_TLS_INSECURE=1`. 
Production deploys must validate against a CA-issued cert (`SATELLITE_PROVIDER_TLS_INSECURE=0`); the env knob is documented in `.env.test.example` + the smoke test + this section as **development-only**. **Potential race conditions**: @@ -179,25 +221,28 @@ Post-landing safety: C11's upload path no longer gates on flight state internall **Performance bottlenecks**: +- Route seed: parent-suite tile-materialisation latency dominates (corridor onboarding from Google Maps upstream). Bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s = 5 min wall-clock ceiling). - Download: bandwidth-bound by the operator workstation's `satellite-provider` link; descriptor / engine work is downstream in C10 (offline, minutes). - Upload: bandwidth-bound. Per-flight upload volume is bounded by the F4 mid-flight tile gen cap (typically a few hundred tiles, each 50–200 KB → tens of MB per flight). ## 8. Dependency Graph -**Must be implemented after**: C6 (read source for upload, write target for download), `satellite-provider` (download path; existing) + D-PROJ-2 endpoint (upload path; the e2e-test `mock-suite-sat-service` fixture covers tests until the real endpoint ships). +**Must be implemented after**: C6 (read source for upload, write target for download), `satellite-provider` (download + route-seed paths; existing) + D-PROJ-2 endpoint (upload path; the e2e-test `mock-suite-sat-service` fixture covers tests until the real endpoint ships). `replay_input.tlog_route` (AZ-836) is a soft prerequisite for the route-seed path — the route client accepts any `RouteSpec` regardless of how it was produced, but the cycle-3 e2e fixture wires `extract_route_from_tlog` upstream. **Can be implemented in parallel with**: anything except C6 changes. -**Blocks**: F1 (pre-flight cache build cannot start without `TileDownloader`), F10 (post-landing upload cannot start without `TileUploader`). 
+**Blocks**: F1 (pre-flight cache build cannot start without `TileDownloader` or — for the route-driven variant — `SatelliteProviderRouteClient.seed_route`), F10 (post-landing upload cannot start without `TileUploader`).

 ## 9. Logging Strategy

 | Log Level | When | Example |
 |-----------|------|---------|
-| ERROR | `SignatureRejectedError`, persistent `SatelliteProviderError`, `CacheBudgetExceededError` | `C11 upload failure: signature rejected by satellite-provider` |
-| WARN | one-off network failure, scheduled retry, freshness-driven rejections (counts) | `C11 batch upload retry: batch_uuid=…; next_retry_in_s=30` |
-| INFO | session start/end; per-batch report (download + upload) | `C11 download complete: 87654 tiles, 12 stale-rejected; bbox=…` |
-| DEBUG | per-tile request/response | `C11 tile uploaded: tile_id=(z=18,lat=…,lon=…); status=queued` |
+| ERROR | `SignatureRejectedError`, persistent `SatelliteProviderError`, `CacheBudgetExceededError`, `RouteTerminalFailureError` | `C11 upload failure: signature rejected by satellite-provider`; `c11.route.poll.terminal kind=failed route_id=…` |
+| WARN | one-off network failure, scheduled retry, freshness-driven rejections (counts), `RouteTransientError` retries, `RouteValidationError` pre-flight rejections | `C11 batch upload retry: batch_uuid=…; next_retry_in_s=30`; `c11.route.validation_failed field=points reason=below_min(2)` |
+| INFO | session start/end; per-batch report (download + upload); route submit + each poll tick + inventory verify | `C11 download complete: 87654 tiles, 12 stale-rejected; bbox=…`; `c11.route.submit route_id=…`; `c11.route.poll.tick attempt=3 status=processing` |
+| DEBUG | per-tile request/response; per-tile inventory entries | `C11 tile uploaded: tile_id=(z=18,lat=…,lon=…); status=queued` |
+
+Cycle-3 route-client log kinds: `c11.route.submit`, `c11.route.poll.tick`, `c11.route.poll.terminal`, `c11.route.inventory`, `c11.route.validation_failed` (component
`c11_tile_manager.route_client`). **Log format**: structured JSON. **Log storage**: operator workstation log file (e.g. `~/.azaion/onboard/c11-tilemanager.log`); also writes per-run summaries (download report, upload report) to the operator workstation cache root for audit. The companion's FDR is NOT involved (C11 doesn't run on the companion). diff --git a/_docs/02_document/contracts/c11_tilemanager/route_client.md b/_docs/02_document/contracts/c11_tilemanager/route_client.md new file mode 100644 index 0000000..37b160c --- /dev/null +++ b/_docs/02_document/contracts/c11_tilemanager/route_client.md @@ -0,0 +1,126 @@ +# Contract: route_client + +**Component**: c11_tilemanager +**Producer task**: AZ-838_satellite_provider_route_client (Epic AZ-835 C2) +**Consumer tasks**: AZ-839 (`operator_pre_flight_setup` real fixture, Epic AZ-835 C3); AZ-840 (E2E orchestrator test, Epic AZ-835 C4); future C12 production binding (deferred — see § Non-Goals). +**Version**: 1.0.0 +**Status**: stable +**Last Updated**: 2026-05-26 + +## Purpose + +The `SatelliteProviderRouteClient` is C11's operator-side **route-onboarding** interface. Given a `RouteSpec` (a coarsened, tlog-derived flight corridor produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836), it registers the corridor with the parent-suite `satellite-provider` Route API, polls until materialisation completes, and verifies coverage via the inventory contract. + +The route-driven seeding flow lets the operator pre-commit the C6 cache to the precise corridor the drone actually flew rather than a coarse bounding box — typically ~100× more tile-efficient on long, narrow flights. + +C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module. 
+
+**Upstream API** (cycle 3 — AZ-838): `POST /api/satellite/route` (corridor onboarding; body shape per `CreateRouteRequest.cs` / `RoutePoint.cs` / `GeoPoint.cs` DTOs; query `requestMaps=true&createTilesZip=false`) + `GET /api/satellite/route/{id}` (status polling; terminal-success when `mapsReady=true`; terminal-failure when `status ∈ {failed, error, rejected}`) + `POST /api/satellite/tiles/inventory` (post-materialisation coverage verification, shared with `tile_downloader`). Authentication: `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert.
+
+## Shape
+
+### Function / method API
+
+```python
+import uuid
+from gps_denied_onboard._types.route import RouteSpec  # AZ-845 canonical home
+
+class SatelliteProviderRouteClient:
+    def __init__(
+        self,
+        base_url: str,
+        jwt: str,
+        *,
+        tls_insecure: bool = False,
+        request_timeout_s: float = 30.0,
+        poll_interval_s: float = 5.0,
+        poll_max_attempts: int = 60,
+    ) -> None: ...
+
+    def seed_route(
+        self,
+        spec: RouteSpec,
+        *,
+        name: str | None = None,
+    ) -> RouteSeedResult: ...
+```
+
+| Name | Signature | Throws / Errors | Blocking? |
+|------|-----------|-----------------|-----------|
+| `seed_route` | `(spec: RouteSpec, *, name: str \| None = None) -> RouteSeedResult` | `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError` (all under `SatelliteProviderRouteError`) | sync; poll loop bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s = 5 min ceiling) |
+
+### Data DTOs
+
+```python
+@dataclass(frozen=True, slots=True)
+class RouteSpec:  # _types/route.py (AZ-845)
+    waypoints: tuple[tuple[float, float], ...]  # (lat, lon)
+    suggested_region_size_meters: float  # per-waypoint coverage radius
+    source_tlog: Path  # provenance
+    source_segment: tuple[int, int]  # (start_idx, end_idx) into tlog GPS rows
+    total_distance_meters: float  # along-track distance of active segment
+
+@dataclass(frozen=True, slots=True)
+class RouteSeedResult:  # c11_tile_manager/route_client.py
+    route_id: uuid.UUID
+    terminal_status: str  # e.g. "completed", "done", "succeeded"
+    maps_ready: bool  # True on terminal success
+    tile_count: int  # present=true entries from inventory verify
+    elapsed_ms: int  # POST → terminal-status wall time
+    submitted_payload_sha256: str  # provenance for the inventory verify step
+```
+
+| Field | Type | Required | Description | Constraints |
+|-------|------|----------|-------------|-------------|
+| `RouteSpec.waypoints` | `tuple[tuple[float, float], ...]` | yes | Ordered list of (lat, lon) waypoints | `2 ≤ len(waypoints) ≤ 500` (AZ-809 validator); each `lat ∈ [-90, 90]`, `lon ∈ [-180, 180]` |
+| `RouteSpec.suggested_region_size_meters` | `float` | yes | Per-waypoint coverage radius | `100.0 ≤ value ≤ 10_000.0` (AZ-809 validator) |
+| `RouteSpec.source_tlog` | `Path` | yes | Provenance — which tlog produced this spec | filesystem path |
+| `RouteSeedResult.route_id` | `uuid.UUID` | yes | Server-assigned route id | non-zero |
+| `RouteSeedResult.terminal_status` | `str` | yes | Last status observed from `GET /api/satellite/route/{id}` | one of `{"completed", "failed", "error", "done", "succeeded", "rejected"}` |
+| `RouteSeedResult.maps_ready` | `bool` | yes | True iff parent suite reported `mapsReady=true` (terminal success) | True on success; False if poll budget exhausted before terminal |
+| `RouteSeedResult.tile_count` | `int` | yes | Inventory `present=true` count over the route's enumerated coverage | ≥ 0 (lower bound — server may interpolate between waypoints) |
+
+## Invariants
+
+- I-1: **Pre-emptive validation** rejects obviously-bad input as
`RouteValidationError` BEFORE the HTTP POST. The client mirrors the AZ-809 `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges; `name`/`description` max lengths). The list MUST stay in sync with `SatelliteProvider.Api/Validators/CreateRouteRequestValidator.cs` (parent suite source).
+- I-2: The client POSTs the wire shape exactly per `CreateRouteRequest.cs` + `RoutePoint.cs` + `GeoPoint.cs` (note: `RoutePoint` uses `lat` / `lon` JSON property names for both input and output; the input/output naming asymmetry flagged in AZ-809 AC-10 is a parent-suite concern, not a client adaptation).
+- I-3: Poll cadence MUST respect `poll_interval_s` (lower bound between successive `GET /api/satellite/route/{id}` calls) and `poll_max_attempts` (upper bound on attempt count). The client logs every poll tick at INFO with the observed status.
+- I-4: Terminal-success is exactly `mapsReady=true`. Terminal-failure is exactly `status ∈ {"failed", "error", "rejected"}`. Any other status is treated as "still processing" and triggers the next poll. If the poll budget is exhausted without terminal status, `RouteTransientError` is raised with the last observed status.
+- I-5: 4xx responses with RFC 7807 `ProblemDetails` → `RouteValidationError`; `field_errors` is populated from the `errors` dict so the caller can render per-field rejections.
+- I-6: 5xx / network / timeout → `RouteTransientError` with `__cause__` set to the underlying `httpx` exception. The retry semantics are caller-driven — the route client itself does NOT retry the POST, leaving the policy to the fixture / CLI (e.g., `tests/e2e/replay/conftest.py::operator_pre_flight_setup` retries up to 3 times using C11's `_DEFAULT_BACKOFF_SCHEDULE_S = (1, 2, 4, 8)`).
+- I-7: The inventory verify step uses `POST /api/satellite/tiles/inventory` (≤ 5000 entries / request) and enumerates the route's tile coverage locally from `(waypoints, suggested_region_size_meters)` using the parent suite's web-Mercator math (`_EARTH_EQUATORIAL_CIRCUMFERENCE_M = 40 075 016.686`). The result is a **lower bound** on actual server coverage — the server may interpolate intermediate corridor tiles that the local enumeration misses; this is documented and acceptable as a sanity-check signal, not a coverage proof.
+
+## Non-Goals
+
+- Not covered: producing the `RouteSpec` — owned by `replay_input.tlog_route.extract_route_from_tlog` (AZ-836).
+- Not covered: orchestration of when the operator runs the seed — owned by C12 (production binding deferred; cycle-3 e2e fixture `operator_pre_flight_setup` is the current driver — AZ-839).
+- Not covered: FAISS index construction over the populated cache — owned by C10 `DescriptorBatcher`.
+- Not covered: bbox-based seeding — handled by `tile_downloader.download_tiles_for_area` (and by `tests/fixtures/derkachi_c6/seed_region.py` for the e2e fixture).
+- Not covered: multi-route batching — one `RouteSpec` per `seed_route` call. Multi-flight aggregate corridors are an operator-workflow concern.
+
+## Versioning Rules
+
+- **Breaking changes** (renamed method, removed required field, changed return type, parent-suite Route API contract break) require a major version bump. Coordinate with the C3 fixture (AZ-839) and any future C12 production binding via Choose A/B/C/D before bumping.
+- **Non-breaking additions** (new optional constructor kwarg, new field on `RouteSeedResult`, new error variant the consumer catches via `SatelliteProviderRouteError`) require a minor version bump.
+- The pre-emptive validation bounds (I-1) MUST track the parent-suite `CreateRouteRequestValidator.cs` exactly. Drift between client and server validators is a defect, not a version concern — fix the client to match the server.
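Invariants I-1, I-3 and I-4 above can be sketched in a few lines of Python. This is a minimal illustration, not the `route_client.py` implementation: the function names `validate_route` and `poll_until_terminal`, the injected `fetch_status` callable, and the simplified exception classes are hypothetical, and only the numeric bounds and status sets are taken from the contract text.

```python
import time
from typing import Any, Callable, Mapping, Sequence

# Bounds and status sets quoted from I-1 and I-4; every name here is hypothetical.
MIN_POINTS, MAX_POINTS = 2, 500
MIN_REGION_M, MAX_REGION_M = 100.0, 10_000.0
TERMINAL_FAILURE = frozenset({"failed", "error", "rejected"})


class RouteValidationError(ValueError):
    """Spec violates the mirrored validator bounds (raised before any POST)."""


class RouteTransientError(RuntimeError):
    """Poll budget exhausted without a terminal status."""


class RouteTerminalFailureError(RuntimeError):
    """Server reported a non-success terminal status."""


def validate_route(waypoints: Sequence[tuple[float, float]], region_size_m: float) -> None:
    """Reject out-of-bounds input locally, before any HTTP traffic (I-1)."""
    if not MIN_POINTS <= len(waypoints) <= MAX_POINTS:
        raise RouteValidationError(
            f"points: expected {MIN_POINTS}..{MAX_POINTS}, got {len(waypoints)}"
        )
    if not MIN_REGION_M <= region_size_m <= MAX_REGION_M:
        raise RouteValidationError(f"regionSizeMeters out of range: {region_size_m}")
    for lat, lon in waypoints:
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise RouteValidationError(f"lat/lon out of range: ({lat}, {lon})")


def poll_until_terminal(
    fetch_status: Callable[[], Mapping[str, Any]],
    *,
    poll_interval_s: float = 5.0,
    poll_max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Poll until mapsReady=true, a failure status, or budget exhaustion (I-3, I-4)."""
    last_status = None
    for attempt in range(1, poll_max_attempts + 1):
        body = fetch_status()
        last_status = body.get("status")
        if body.get("mapsReady") is True:
            return body  # terminal success
        if last_status in TERMINAL_FAILURE:
            raise RouteTerminalFailureError(f"terminal status {last_status!r}")
        if attempt < poll_max_attempts:
            sleep(poll_interval_s)  # any other status means "still processing"
    raise RouteTransientError(f"poll budget exhausted; last status {last_status!r}")
```

With the defaults, the sketch sleeps at most 59 times, so its total sleep time stays under the 60 × 5 s ceiling quoted in the contract. Injecting `sleep` is a design choice of this sketch that lets the loop be exercised without real delays.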
+
+## Test Cases
+
+| Case | Input | Expected | Notes |
+|------|-------|----------|-------|
+| route-happy-path | `RouteSpec` for Derkachi tlog (2-waypoint corridor, region_size=500m) against a stubbed `satellite-provider` returning `mapsReady=true` on the 2nd poll | `RouteSeedResult` with `maps_ready=True`, `tile_count > 0`, `terminal_status="completed"`, `elapsed_ms` reflects 2 polls | AZ-838 AC-1, AC-2 |
+| validation-empty-points | `RouteSpec(waypoints=(), …)` | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
+| validation-too-many-points | `RouteSpec` with 501 waypoints | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
+| validation-region-too-large | `RouteSpec(suggested_region_size_meters=10_001.0, …)` | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
+| 4xx-problem-details | server returns 400 + RFC 7807 `errors` dict | `RouteValidationError` with `field_errors` populated from the response | I-5, AZ-838 AC-3 |
+| 5xx-transient | server returns 503 | `RouteTransientError` with `__cause__` set to the underlying `httpx` exception | I-6, AZ-838 AC-4 |
+| terminal-failure | server reports `status="failed"` mid-poll | `RouteTerminalFailureError`; `.detail` carries the response JSON | I-4, AZ-838 AC-5 |
+| poll-budget-exhausted | server stays in `status="processing"` past 60 attempts | `RouteTransientError` referencing the last observed status | I-3, I-4 |
+| inventory-verify-counts-present | `mapsReady=true` then inventory POST returns mixed `present=true/false` entries | `tile_count` equals the count of `present=true` entries | I-7 |
+| integration-derkachi | `RouteSpec` from real Derkachi tlog, against the Jetson `satellite-provider` (gated by `RUN_E2E=1` + `SATELLITE_PROVIDER_URL`) | `tile_count > 0`, `maps_ready=True`, completes in ≤ 15 s on the 2-waypoint reference route | AZ-838 AC-10 (Jetson-only, Tier-2) |
+
+## Change Log
+
+| Version | Date | Change | Author |
+|---------|------|--------|--------| +| 1.0.0 | 2026-05-26 | Initial contract — produced by AZ-838 (Epic AZ-835 C2). Cycle-3 addition; consumed by AZ-839 (`operator_pre_flight_setup` real fixture) and AZ-840 (E2E orchestrator test). | autodev | diff --git a/_docs/02_document/contracts/c11_tilemanager/tile_downloader.md b/_docs/02_document/contracts/c11_tilemanager/tile_downloader.md index c29a854..3222690 100644 --- a/_docs/02_document/contracts/c11_tilemanager/tile_downloader.md +++ b/_docs/02_document/contracts/c11_tilemanager/tile_downloader.md @@ -1,18 +1,20 @@ # Contract: tile_downloader **Component**: c11_tilemanager -**Producer task**: AZ-316_c11_tile_downloader +**Producer task**: AZ-316_c11_tile_downloader (initial), AZ-777 Phase 1 (cycle-3 inventory-contract adaptation) **Consumer tasks**: AZ-253 (E-C12 Operator Pre-flight Tooling — TBD at C12 decompose time) -**Version**: 1.0.0 -**Status**: draft -**Last Updated**: 2026-05-10 +**Version**: 1.1.0 +**Status**: stable +**Last Updated**: 2026-05-26 ## Purpose -The `TileDownloader` Protocol is C11's operator-side download interface. C12 invokes it during F1 (pre-flight cache build) to fetch satellite tiles from the parent suite's `satellite-provider` GET surface, apply RESTRICT-SAT-4 resolution gating at the C11 boundary, and write accepted tiles into C6. Freshness rejections surfacing from C6 (AZ-307) are counted and surfaced in the report. +The `TileDownloader` Protocol is C11's operator-side download interface. C12 invokes it during F1 (pre-flight cache build) to fetch satellite tiles from the parent suite's `satellite-provider` inventory + slippy-map surface, apply RESTRICT-SAT-4 resolution gating at the C11 boundary, and write accepted tiles into C6. Freshness rejections surfacing from C6 (AZ-307) are counted and surfaced in the report. C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module. 
+**Upstream API (cycle 3 — AZ-777 Phase 1)**: against the real parent-suite `satellite-provider` v1.0.0 inventory contract — `POST /api/satellite/tiles/inventory` (bulk lookup by `(zoom, x, y)`, ≤ 5000 entries / request, per `tile-inventory.md` v1.0.0 / AZ-505) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch, issued only for inventory entries with `present=true`). Authentication: `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert (production must validate against a CA-issued cert). Because the inventory response carries no `Content-Length` hint, AZ-308's pre-write budget pre-check uses a conservative `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` per-tile reserve. + ## Shape ### Function / method API @@ -79,7 +81,7 @@ class TileSummary: - I-1: `tiles_downloaded + tiles_rejected_resolution + tiles_rejected_freshness == sum of attempted tiles`. The report accounts for every tile the downloader attempted; no silent drops. - I-2: A re-run of `download_tiles_for_area` for the same `(bbox, zoom_levels, sector_class, flight_id)` after a successful prior run is idempotent: `outcome = idempotent_no_op` and no GETs are issued. Idempotence is enforced by C11's download-progress journal under `cache_root/.c11/journal/`. - I-3: Every accepted tile passes BOTH the C11 resolution gate (≥ 0.5 m/px per RESTRICT-SAT-4) AND the C6 freshness gate (AZ-307). A tile that fails either is excluded from `tiles_downloaded`. -- I-4: TLS + service-internal API key authenticate the GET; auth failure surfaces as `SatelliteProviderError` and aborts the run with `outcome = failure`. The downloader does NOT fall back to plaintext or unauthenticated requests. +- I-4: JWT Bearer authentication (`SATELLITE_PROVIDER_API_KEY`) over TLS authenticates the inventory POST and the slippy-map GET; auth failure surfaces as `SatelliteProviderError` and aborts the run with `outcome = failure`. 
The downloader does NOT fall back to plaintext or unauthenticated requests. `SATELLITE_PROVIDER_TLS_INSECURE=1` is a dev-only knob for self-signed certs; production must run with it unset. - I-5: The downloader writes via the AZ-303 `TileStore`/`TileMetadataStore` Protocols; it does NOT touch C6's filesystem layout directly. - I-6: A `CacheBudgetExceededError` aborts pre-write with no partial write and `outcome = failure`. The C6 cache budget enforcer (AZ-308) drives the headroom check. @@ -112,4 +114,5 @@ class TileSummary: | Version | Date | Change | Author | |---------|------|--------|--------| +| 1.1.0 | 2026-05-26 | Internal upstream contract adapted to `satellite-provider` v1.0.0 inventory contract (AZ-777 Phase 1): `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` replace the previous `GET /api/satellite/tiles?bbox=…&zoom=…` shape. `download_tiles_for_area` / `DownloadRequest` / `DownloadBatchReport` surface UNCHANGED — non-breaking minor bump. Auth tightened to JWT Bearer over TLS. Status moved draft → stable. | autodev | | 1.0.0 | 2026-05-10 | Initial contract — produced by AZ-316 (E-C11 decomposition) | autodev | diff --git a/_docs/02_document/contracts/replay/replay_protocol.md b/_docs/02_document/contracts/replay/replay_protocol.md index decc44d..80d84cc 100644 --- a/_docs/02_document/contracts/replay/replay_protocol.md +++ b/_docs/02_document/contracts/replay/replay_protocol.md @@ -254,6 +254,10 @@ The two **invalid** cells (`true` + `eskf` and `false` + `gtsam_isam2`) raise `C 10. **Determinism**: same `(video, tlog, config, time_offset_ms, pace=ASAP)` input → same JSONL output within ≤ 1e-6 float drift in position fields (AC-5). 11. **MAVLink signing key required in replay**: the airborne binary refuses to run without `--mavlink-signing-key PATH` in both modes. In replay the operator supplies a dummy file (well-formed key bytes; no real channel to verify against). 
This preserves Invariant 5 — the encoders' signing code path runs identically in both modes. 12. **Real C6 cache in replay**: the airborne binary in replay mode reads the same pre-built C6 tile cache the operator built via the normal pre-flight C10/C11/C12 flow. There is no replay-specific cache shape. Verified by the AZ-404 E2E fixture, which runs the operator's pre-flight flow before invoking the replay CLI. + + **Sub-invariant 12.a (cycle 3 — AZ-839 / Epic AZ-835 C3)**: the e2e `operator_pre_flight_setup` fixture replaces the cycle-1 `mkdir` placeholder with a real driver that wires C1 (`replay_input.tlog_route.extract_route_from_tlog` — AZ-836) + C2 (`c11_tile_manager.route_client.SatelliteProviderRouteClient.seed_route` — AZ-838) + C11 (`tile_downloader.HttpTileDownloader.download_for_bbox`) + C10 (`DescriptorBatcher`) to populate C6 from a tlog-derived corridor. The fixture yields a `PopulatedC6Cache` dataclass (`cache_root`, `tile_store_path`, `faiss_index_path`, `faiss_sidecar_sha256_path`, `faiss_sidecar_meta_path`, `route_spec`, `tile_count`, `elapsed_seconds`). The cache is mounted into a named docker volume that survives across pytest sessions (cold first invocation populates; subsequent invocations within the same compose session reuse — warm cache). Cold-start budget: ≤ 5 min on Tier-2 Jetson; warm: ≤ 30 s. Sidecar triple-consistency (`.index` + `.sha256` + `.meta.json`) per AZ-306 is verified at every fixture yield; mismatch raises `IndexUnavailableError`. The C12 production binding for the route-driven path is a future-cycle integration; production pre-flight still uses the bbox-driven `download_tiles_for_area` path today. + + **Sub-invariant 12.b (cycle 3 — AZ-840 / Epic AZ-835 C4)**: the E2E orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` takes only `(tlog, video, calibration)` and runs the full 7-step pipeline end-to-end on Tier-2 Jetson — no operator hand-curation between steps. 
The 7 steps are: (1) active flight cut + tlog/video sync via AZ-405; (2) on-fly frame + IMU extraction; (3) auto-create route via AZ-836; (4) POST route to satellite-provider via the C3 fixture's `operator_pre_flight_setup` (delegates to AZ-838); (5) build FAISS index (driven by C3); (6) run gps-denied airborne pipeline against the populated cache + tlog/video/calibration (reuses the airborne composition root path AZ-699 exercises); (7) compute horizontal-error distribution and emit the AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_.md`. The verdict report is emitted ALWAYS, regardless of PASS / FAIL on the AZ-696 ≥ 80 % within 100 m gate — the success criterion is that the report exists with the honest distribution, not that the verdict is PASS. Gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`. 13. **C4↔C5 pairing matrix is enforced at compose time** (AZ-776 / ADR-012): `compose_root` rejects the two off-diagonal cells of the (`c4_pose.enabled`, `c5_state.strategy`) matrix with a `CompositionError` naming both blocks. `enabled=False` + `gtsam_isam2` and `enabled=True` + `eskf` are forbidden. The two valid cells are `enabled=True` + `gtsam_isam2` (production steady-state per ADR-003 / ADR-009) and `enabled=False` + `eskf` (open-loop ESKF — replay Tier-2 smoke baseline; satellite anchoring deferred to AZ-777). Verified by `tests/unit/runtime_root/test_az776_open_loop_eskf_composition.py` AC-3a and AC-3b. 
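The pairing rule in Invariant 13 can be sketched as a small guard. This is an illustration under assumptions: `check_c4_c5_pairing` and the local `CompositionError` class are hypothetical stand-ins, and the real check lives in `compose_root`, whose signature is not shown here.

```python
class CompositionError(Exception):
    """Hypothetical stand-in for the composition-time error."""


# The two off-diagonal cells of the (c4_pose.enabled, c5_state.strategy) matrix.
_FORBIDDEN_PAIRS = frozenset({(False, "gtsam_isam2"), (True, "eskf")})


def check_c4_c5_pairing(c4_pose_enabled: bool, c5_state_strategy: str) -> None:
    """Raise if the pair is one of the two forbidden cells; otherwise return None."""
    if (c4_pose_enabled, c5_state_strategy) in _FORBIDDEN_PAIRS:
        # The message names both config blocks, as the invariant requires.
        raise CompositionError(
            "invalid pairing: "
            f"c4_pose.enabled={c4_pose_enabled} with "
            f"c5_state.strategy={c5_state_strategy!r}"
        )


check_c4_c5_pairing(True, "gtsam_isam2")  # production steady state: accepted
check_c4_c5_pairing(False, "eskf")        # open-loop ESKF baseline: accepted
```

The sketch rejects only the two cells the invariant names; how the real composition root treats strategies outside this matrix is not specified here.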
## Producer / Consumer Split diff --git a/_docs/02_document/data_model.md b/_docs/02_document/data_model.md index 694459c..7e7e204 100644 --- a/_docs/02_document/data_model.md +++ b/_docs/02_document/data_model.md @@ -562,6 +562,9 @@ The following DTOs flow through the per-frame pipeline in memory and are **NOT** | `PostLandingUploadRequest` | C12 CLI (`upload-pending` subcommand) | C12 `PostLandingUploadOrchestrator` | Never persisted — composed inline from CLI args | | `ReLocHint` | C12 CLI (`reloc-confirm` subcommand) | C12 `OperatorReLocService` → `OperatorCommandTransport` (E-C8 concrete) → airborne companion | FDR `c12.reloc.requested` record (full hint un-redacted; `outcome ∈ {sent, failed}`) | | `CameraCalibration` (loaded once) | calibration loader | C1, C3, C4 | NOT in PostgreSQL — see § 2.6 | +| `RouteSpec` (cycle 3 — `_types/route.py`, AZ-845 canonical home; produced by `replay_input/tlog_route.py` AZ-836) | `replay_input.tlog_route.extract_route_from_tlog(tlog, *, max_waypoints, …)` | C11 `SatelliteProviderRouteClient.seed_route` (AZ-838); cycle-3 e2e fixture `operator_pre_flight_setup` (AZ-839); E2E orchestrator test (AZ-840) | NOT in PostgreSQL — transient pre-flight planning DTO. Fields: `waypoints: tuple[(lat, lon)]` (1..max_waypoints), `suggested_region_size_meters: float`, `source_tlog: Path`, `source_segment: (start_idx, end_idx)`, `total_distance_meters: float`. `frozen=True, slots=True`. | +| `RouteSeedResult` (cycle 3 — `c11_tile_manager/route_client.py`, AZ-838) | C11 `SatelliteProviderRouteClient.seed_route` | cycle-3 e2e fixture `operator_pre_flight_setup` (AZ-839); seed CLI `tests/fixtures/derkachi_c6/seed_route.py` | NOT in PostgreSQL — transient outcome DTO. Fields: `route_id: uuid`, `terminal_status: str`, `maps_ready: bool`, `tile_count: int`, `elapsed_ms: int`, `submitted_payload_sha256: str`. `frozen=True, slots=True`. 
| +| `PopulatedC6Cache` (cycle 3 — `tests/e2e/replay/conftest.py`, AZ-839) | `operator_pre_flight_setup` fixture | replay e2e tests including `test_az835_e2e_real_flight.py` (AZ-840) and the AZ-699 verdict test | NOT in PostgreSQL — test-fixture-only DTO. Fields: `cache_root: Path`, `tile_store_path: Path`, `faiss_index_path: Path`, `faiss_sidecar_sha256_path: Path`, `faiss_sidecar_meta_path: Path`, `route_spec: RouteSpec`, `tile_count: int`, `elapsed_seconds: float`. Backed by a docker named volume that survives across pytest sessions in the same compose run. | --- diff --git a/_docs/02_document/ripple_log_cycle3.md b/_docs/02_document/ripple_log_cycle3.md new file mode 100644 index 0000000..455164f --- /dev/null +++ b/_docs/02_document/ripple_log_cycle3.md @@ -0,0 +1,62 @@ +# Ripple Log — Cycle 3 (End-of-Cycle Documentation Sync) + +> Produced as part of existing-code flow Step 13 (Update Docs, document skill Task mode). +> Source: `_docs/_autodev_state.md` (`cycle: 3`). +> Date: 2026-05-26. + +## Input set + +The 8 task specs in `_docs/02_tasks/done/` whose mtime falls inside cycle 3 +(2026-05-22 .. 
2026-05-23):
+
+| Task | Title | Surface |
+|------|-------|---------|
+| AZ-836 | TlogRouteExtractor (Epic AZ-835 C1) | NEW `replay_input/tlog_route.py`, NEW `_types/route.py` (RouteSpec) |
+| AZ-838 | SatelliteProviderRouteClient + `seed_route.py` CLI (Epic AZ-835 C2) | NEW `components/c11_tile_manager/route_client.py`, NEW `tests/fixtures/derkachi_c6/seed_route.py` |
+| AZ-839 | `operator_pre_flight_setup` real fixture (Epic AZ-835 C3) | REWRITE `tests/e2e/replay/conftest.py::operator_pre_flight_setup`, NEW `PopulatedC6Cache` |
+| AZ-840 | E2E orchestrator test (Epic AZ-835 C4) | NEW `tests/e2e/replay/test_az835_e2e_real_flight.py` |
+| AZ-777 | Derkachi C6 reference fixture (Phases 1+2; Phases 3–5 superseded by AZ-839/AZ-841/AZ-842) | MODIFY `c11_tile_manager/tile_downloader.py` (inventory + slippy-map paths), `docker-compose.test.jetson.yml`, `.env.test.example`; NEW `tests/fixtures/derkachi_c6/{seed_region.py,bbox.yaml,README.md}`, NEW `tests/e2e/satellite_provider/test_smoke.py` |
+| AZ-845 | Relocate RouteSpec → `_types/route.py` (refactor 02 anchor) | NEW `_types/route.py`; MODIFY `replay_input/tlog_route.py`, `replay_input/__init__.py`, `components/c11_tile_manager/route_client.py` import |
+| AZ-846 | Refresh `module-layout.md` cycle-3 entries (refactor 02) | MODIFY `_docs/02_document/module-layout.md` ONLY |
+| AZ-847 | Widen AZ-270 lint to enforce full rule-9 allow-list (refactor 02) | MODIFY `tests/unit/test_az270_compose_root.py` ONLY |
+
+## Task Step 0.5 — Import-graph ripple
+
+Reverse-dependency scan for the 4 production source changes:
+
+| Changed file | Importers (production source) | Affected components |
+|--------------|------------------------------|---------------------|
+| `_types/route.py` (NEW) | `replay_input/tlog_route.py`, `replay_input/__init__.py` (re-export), `components/c11_tile_manager/route_client.py`, `components/c11_tile_manager/__init__.py` (re-export) | c11_tile_manager, shared/replay_input, shared/_types |
+| `replay_input/tlog_route.py` (NEW) | `replay_input/__init__.py` (re-export) | shared/replay_input |
+| `components/c11_tile_manager/route_client.py` (NEW) | `components/c11_tile_manager/__init__.py` (re-export) | c11_tile_manager |
+| `components/c11_tile_manager/tile_downloader.py` (MODIFIED — `_INVENTORY_PATH`, `_TILES_PATH`, default per-tile byte estimate) | `runtime_root/c11_factory.py::build_tile_downloader` (constructor unchanged; endpoint constants are module-internal) | c11_tile_manager |
+
+No surprise ripple to other components. All edges land inside `c11_tile_manager` + shared (`_types/`, `replay_input/`), which is consistent with the AZ-507 cross-component allow-list (AZ-845 fixes the previous violation; AZ-846 registers the new files; AZ-847 widens the lint to keep it that way).
+
+## Refresh set for Task Steps 1–4
+
+| Update level | This cycle's refresh set | Status |
+|--------------|-------------------------|--------|
+| Task Step 1 — Module docs | This project's Plan uses component-level granularity; no `_docs/02_document/modules/` folder. Authoritative module-ownership lives in `_docs/02_document/module-layout.md`. | Already refreshed by AZ-846 — sections `c11_tile_manager Internal`, `shared/replay_input`, `_types/` updated to register `route_client.py`, `tlog_route.py`, `route.py`. No further action. |
+| Task Step 2 — Component docs | `components/12_c11_tilemanager/description.md` (3rd interface + endpoint adaptation), `contracts/c11_tilemanager/tile_downloader.md` (endpoint paths), `contracts/c11_tilemanager/route_client.md` (NEW). | Updated this session. |
+| Task Step 3 — System-level docs | `architecture.md` § 5 satellite-provider sub-section (inventory contract + route-driven seeding); `data_model.md` register `RouteSpec` / `RouteSeedResult` / `PopulatedC6Cache` DTOs; `system-flows.md` F1 pre-flight cache build (route-driven variant); `contracts/replay/replay_protocol.md` Invariant 12 sub-section for AZ-839 / AZ-840. | Updated this session. |
+| Task Step 4 — Problem-level docs | `_docs/00_problem/input_data/flight_derkachi/README.md` (point at `tests/fixtures/derkachi_c6/` + license attribution). No AC / restriction / data_parameters drift this cycle. | Updated this session. |
+
+## Files actually changed this session
+
+- `_docs/02_document/components/12_c11_tilemanager/description.md` — add `SatelliteProviderRouteClient` as a third C11 interface; update `TileDownloader` external API rows to the inventory + slippy-map contract; add a Cycle-3 callout to § 1 Overview.
+- `_docs/02_document/contracts/c11_tilemanager/tile_downloader.md` — replace the `GET /api/satellite/tiles?bbox=…&zoom=…` row with the inventory-POST + slippy-map-GET row pair; bump version.
+- `_docs/02_document/contracts/c11_tilemanager/route_client.md` — NEW contract for `SatelliteProviderRouteClient.seed_route`.
+- `_docs/02_document/contracts/replay/replay_protocol.md` — append AZ-839 / AZ-840 sub-section to Invariant 12 covering the route-driven `operator_pre_flight_setup` fixture + `PopulatedC6Cache`.
+- `_docs/02_document/architecture.md` — append a Cycle-3 sub-section to § 5 satellite-provider integration noting the actual inventory-based read path + the route-driven seeding flow (no new ADR).
+- `_docs/02_document/data_model.md` — register `RouteSpec`, `RouteSeedResult`, `PopulatedC6Cache` as cross-component DTOs.
+- `_docs/02_document/system-flows.md` — extend F1 (pre-flight cache build) with the route-driven variant (tlog → RouteSpec → satellite-provider Route API → populated C6 via inventory + slippy-map).
+- `_docs/00_problem/input_data/flight_derkachi/README.md` — append "Derkachi C6 reference seeding" section pointing at `tests/fixtures/derkachi_c6/{seed_region.py,seed_route.py,bbox.yaml,README.md}` + the license-attribution caveat for Google Maps imagery.
+- `_docs/02_document/ripple_log_cycle3.md` — this file (NEW).
+- `_docs/_autodev_state.md` — sub_step progression through Step 13 task phases.
+ +## Out of scope (carried) + +- `tests/` doc updates beyond what Step 12 already produced (`_docs/02_document/tests/blackbox-tests.md`, `traceability-matrix.md` — modified by Step 12 in this cycle). Test-spec sync owns those. +- Cycle-2 doc carry-overs OUTSIDE the three `module-layout.md` sections AZ-846 touched (`replay_api/` Per-Component Mapping entry, `cli/render_map.py`, `cli/replay_api_entrypoint.py`, `helpers/gps_compare.py`, `helpers/accuracy_report.py`). Tracked in cycle-3 retrospective; require a separate follow-up doc task with its own AZ ID. +- Untracked `_docs/02_document/system-overview.md` (created 2026-05-24 outside the cycle-3 task surface). Reviewed; content is accurate at the abstraction level it presents; no edit required. diff --git a/_docs/02_document/system-flows.md b/_docs/02_document/system-flows.md index 683df97..43db514 100644 --- a/_docs/02_document/system-flows.md +++ b/_docs/02_document/system-flows.md @@ -46,11 +46,25 @@ The operator builds (or refreshes) the per-mission cache before takeoff. F1 has **three phases** sequenced by C12 OperatorTool: - **Phase 0 — Flight resolve (C12 `FlightsApiClient`, AZ-489)**: read the operator-authored `Flight` (ordered waypoints + altitudes) either from the parent-suite `flights` REST service (`--flight-id `) or from a local JSON export (`--flight-file `). Compute the bounding box as the envelope of waypoint lat/lon plus a configurable buffer (default 1 km). Extract `Flight.waypoints[0].(lat, lon, alt)` as the **takeoff origin**. Both are passed downstream as `BuildRequest` fields. -- **Phase 1 — Tile download (C11 `TileDownloader`)**: fetch tiles from `satellite-provider` for the bbox computed in Phase 0; apply sector-classified freshness rules (AC-NEW-6) and resolution gate (RESTRICT-SAT-4); write tile rows + JPEGs into C6. 
+- **Phase 1 — Tile download (C11 `TileDownloader` — bbox-driven, production path)**: fetch tiles from `satellite-provider` for the bbox computed in Phase 0 via `POST /api/satellite/tiles/inventory` (bulk lookup of `(z,x,y)` coords per `tile-inventory.md` v1.0.0 / AZ-505) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch for inventory entries with `present=true`); apply sector-classified freshness rules (AC-NEW-6) and resolution gate (RESTRICT-SAT-4); write tile rows + JPEGs into C6. Auth: JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` accepts self-signed certs. - **Phase 2 — Cache artifact build (C10 CacheProvisioner)**: read the populated C6 store; compile/deserialize TRT engines via C7; batch-generate descriptors via the C2 backbone; atomically write the FAISS HNSW index with SHA-256 sidecars; write the Manifest hashing model + calibration + corpus + sector classification **+ takeoff origin** (D-C10-1 idempotence; ADR-010). This flow is offline and not time-critical. **Only Phase 0 reaches `flights` REST and Phase 1 reaches `satellite-provider`** — both run on the operator workstation, which is the only host that holds TLS + service-internal credentials. The companion never reaches either service directly (Principle #9 — denied-environment operation). +#### Phase 1 variant — route-driven seeding (cycle 3 — Epic AZ-835 / AZ-836 + AZ-838 + AZ-839) + +A tlog-driven alternative to bbox download lets the operator (or the post-flight replay harness) pre-commit the cache to the precise corridor the drone actually flew. The path is exercised today by the e2e fixture `tests/e2e/replay/conftest.py::operator_pre_flight_setup` (AZ-839) and the orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840); the C12 production CLI binding for this variant is deferred to a future cycle. + +Phase-1 sub-steps in the route-driven variant (replaces the bbox download for that invocation): + +1. 
**Extract corridor from tlog** — `replay_input.tlog_route.extract_route_from_tlog(tlog, *, max_waypoints=10)` (AZ-836). Trims pre-takeoff stationary frames, then coarsens the GPS trace to ≤ `max_waypoints` waypoints via Douglas-Peucker in WGS-84 with great-circle distance. Returns a `RouteSpec(waypoints, suggested_region_size_meters, source_tlog, source_segment, total_distance_meters)` — frozen+slots; canonical home `_types/route.py` (AZ-845). +2. **Submit to satellite-provider** — `c11_tile_manager.route_client.SatelliteProviderRouteClient.seed_route(spec)` (AZ-838). Pre-emptively validates against the AZ-809 `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges) BEFORE the HTTP POST. Then POSTs `/api/satellite/route` with `requestMaps=true&createTilesZip=false` and polls `GET /api/satellite/route/{id}` every 5 s × ≤ 60 attempts until `mapsReady=true` (terminal-success) or a terminal-failure status (`{failed, error, rejected}`). Returns a `RouteSeedResult(route_id, terminal_status, maps_ready, tile_count, elapsed_ms, submitted_payload_sha256)`. +3. **Populate C6 via C11** — enumerate the route's tile coverage locally from `(waypoints, suggested_region_size_meters)`; invoke `tile_downloader.HttpTileDownloader.download_for_bbox` (existing C11 download path) to pull every corridor tile into C6. +4. **Build FAISS index via C10** — `DescriptorBatcher` against the populated C6 using the NetVLAD backbone (per `c2_vpr/config.py:67` default); verify sidecar triple-consistency (`.index` + `.sha256` + `.meta.json`) per AZ-306; mismatch raises `IndexUnavailableError`. +5. **Yield `PopulatedC6Cache`** — `(cache_root, tile_store_path, faiss_index_path, faiss_sidecar_sha256_path, faiss_sidecar_meta_path, route_spec, tile_count, elapsed_seconds)`. Backed by a docker named volume that survives across pytest sessions in the same compose run. 
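Sub-step 2 above (client-side bounds check, then a bounded poll) can be sketched as follows. This is a minimal illustration under stated assumptions, not the `SatelliteProviderRouteClient` implementation: `validate_route`, `poll_until_ready`, the injected `fetch_status` callable, and the exception classes are hypothetical stand-ins. The latitude/longitude limits of ±90 / ±180 are assumed, because the text only says "lat/lon ranges". The numeric bounds, the 5 s × 60 cadence, the `mapsReady` flag, and the terminal-failure statuses are taken from the description above.

```python
import time
from typing import Callable, List, Mapping, Sequence, Tuple

# Values quoted from the sub-step description; every name here is hypothetical.
TERMINAL_FAILURE_STATUSES = frozenset({"failed", "error", "rejected"})
POLL_INTERVAL_S = 5.0
MAX_POLL_ATTEMPTS = 60


class RouteTerminalFailureError(RuntimeError):
    """Stand-in: the server reported a terminal-failure status."""


class RouteTransientError(RuntimeError):
    """Stand-in: the poll budget ran out before any terminal state."""


def validate_route(
    points: Sequence[Tuple[float, float]],
    region_size_m: float,
    zoom_level: int,
) -> List[str]:
    """Return field-by-field bound violations; an empty list means OK."""
    errors: List[str] = []
    if not 2 <= len(points) <= 500:
        errors.append(f"points: expected 2..500, got {len(points)}")
    if not 100 <= region_size_m <= 10_000:
        errors.append(f"regionSizeMeters: expected 100..10000, got {region_size_m}")
    if not 0 <= zoom_level <= 22:
        errors.append(f"zoomLevel: expected 0..22, got {zoom_level}")
    for i, (lat, lon) in enumerate(points):
        # Assumed ranges: the source does not spell out the lat/lon limits.
        if not -90.0 <= lat <= 90.0:
            errors.append(f"points[{i}].lat out of range: {lat}")
        if not -180.0 <= lon <= 180.0:
            errors.append(f"points[{i}].lon out of range: {lon}")
    return errors


def poll_until_ready(
    fetch_status: Callable[[], Mapping[str, object]],
    *,
    interval_s: float = POLL_INTERVAL_S,
    max_attempts: int = MAX_POLL_ATTEMPTS,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, object]:
    """Call fetch_status until mapsReady is true or a terminal failure appears."""
    last_status = None
    for attempt in range(1, max_attempts + 1):
        body = fetch_status()
        if body.get("mapsReady") is True:
            return body
        last_status = body.get("status")
        if last_status in TERMINAL_FAILURE_STATUSES:
            raise RouteTerminalFailureError(f"route failed: {dict(body)}")
        if attempt < max_attempts:
            sleep(interval_s)  # no sleep after the final attempt
    raise RouteTransientError(
        f"no terminal state after {max_attempts} attempts; last status={last_status!r}"
    )
```

Injecting `fetch_status` and `sleep` keeps the loop testable without a live `satellite-provider` or real waiting; a caller would presumably wrap the HTTP GET in the callable it passes in.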
+ +Cold-start budget on Tier-2 Jetson: ≤ 5 min (first invocation, full materialisation + descriptor batching); warm: ≤ 30 s (named-volume reuse). + ### Preconditions - Operator workstation has network reach to `satellite-provider` (TLS + service-internal API key). @@ -88,8 +102,10 @@ sequenceDiagram FlightsClient->>FlightsClient: takeoff_origin = waypoints[0].(lat, lon, alt) FlightsClient-->>C12OperatorTool: (bbox, takeoff_origin, flight_id) C12OperatorTool->>C11TileDownloader: download_tiles_for_area(bbox, zooms, sector_class) - C11TileDownloader->>SatelliteProvider: GET /api/satellite/tiles?bbox=&zoom= - SatelliteProvider-->>C11TileDownloader: Tile blobs + metadata (paged) + C11TileDownloader->>SatelliteProvider: POST /api/satellite/tiles/inventory (bulk z,x,y lookup) + SatelliteProvider-->>C11TileDownloader: per-entry present:true|false + metadata + C11TileDownloader->>SatelliteProvider: GET /tiles/{z}/{x}/{y} (one per present:true entry) + SatelliteProvider-->>C11TileDownloader: Tile JPEG body C11TileDownloader->>C11TileDownloader: filter by AC-NEW-6 freshness + RESTRICT-SAT-4 resolution C11TileDownloader->>C6TileStore: write tiles to ./tiles/{zoomLevel}/{x}/{y}.jpg + Postgres rows (source='googlemaps') C11TileDownloader-->>C12OperatorTool: DownloadBatchReport (counts, freshness summary) @@ -114,7 +130,7 @@ flowchart TD FlightOk -->|yes| ComputeBbox[Compute bbox as envelope of waypoint lat/lon + buffer; take waypoints[0] as takeoff origin] ComputeBbox --> Classify[Operator classifies sector active_conflict OR stable_rear] Classify --> InvokeC11[C12 invokes C11 TileDownloader with computed bbox] - InvokeC11 --> Download[C11 GET /api/satellite/tiles for bbox + zoom] + InvokeC11 --> Download[C11 POST /api/satellite/tiles/inventory then GET /tiles/{z}/{x}/{y}] Download --> FreshnessFilter{Freshness ok per AC-8.2 + AC-NEW-6?} FreshnessFilter -->|stale and stable_rear| RejectOrDowngrade[Reject or downgrade tile] FreshnessFilter -->|stale and active_conflict| 
RejectOrDowngrade @@ -149,10 +165,16 @@ flowchart TD | 0d | C12 `FlightsApiClient` (offline) | filesystem | `flight_file` JSON in the same DTO shape | JSON read | | 0e | C12 `FlightsApiClient` | C12 | `(bbox, takeoff_origin, flight_id)` | in-process | | 1 | C12 | C11 `TileDownloader` | `DownloadRequest(bbox, zoom_levels, sector_class)` | in-process call | -| 2 | C11 | `satellite-provider` REST | `GET /api/satellite/tiles?bbox=…&zoom=…` | HTTPS query | -| 3 | `satellite-provider` | C11 | Paged tile blobs + metadata rows | JPEG + JSON metadata | +| 2a | C11 | `satellite-provider` REST | `POST /api/satellite/tiles/inventory` (bulk `(z,x,y)` lookup, ≤ 5000 entries / request; per `tile-inventory.md` v1.0.0) | HTTPS POST JSON body | +| 2b | `satellite-provider` | C11 | Per-entry `present: true \| false` + metadata when present | JSON response (order matches request order) | +| 2c | C11 | `satellite-provider` REST | `GET /tiles/{z}/{x}/{y}` (issued only for `present=true` entries) | HTTPS GET | +| 3 | `satellite-provider` | C11 | Tile JPEG body | binary JPEG | | 4 | C11 | C6 filesystem (over USB/Eth) | Tile JPEG bodies | `./tiles/{zoomLevel}/{x}/{y}.jpg` | | 5 | C11 | C6 PostgreSQL | Tile metadata rows (`source='googlemaps'`) | SQL INSERT (mirror of `satellite-provider`'s `tiles` table) | +| 1' (route variant) | tlog file | `replay_input.tlog_route.extract_route_from_tlog` | `RouteSpec(waypoints, suggested_region_size_meters, …)` | in-process call | +| 2' (route variant) | C11 `SatelliteProviderRouteClient` | `satellite-provider` REST | `POST /api/satellite/route` (`requestMaps=true`); then `GET /api/satellite/route/{id}` poll until `mapsReady=true` | HTTPS POST + repeated GET | +| 3' (route variant) | C11 | enumerator | local enumeration of corridor `(z,x,y)` coords from `(waypoints, suggested_region_size_meters)` | in-process | +| 4'+5' (route variant) | C11 | C6 | same as steps 4+5 above (downloads via the same inventory + slippy-map paths) | as above | | 6 | C12 | C10 
`CacheProvisioner` | `BuildRequest(bbox, zoom_levels, sector_class, calibration_path, takeoff_origin, flight_id)` | in-process call (operator-orchestrator side); RPC over USB/Eth to companion runner | | 7 | C10 → C7 | TRT engine cache | TRT engines | `.engine` files keyed by `(SM, JP, TRT, precision)` (D-C10-7) | | 8 | C2 backbone (driven by C10) | C6 FAISS index | Descriptor matrix | `.index` (FAISS HNSW), atomicwrites, SHA-256 sidecar | @@ -168,7 +190,11 @@ flowchart TD | Flight file malformed (offline path) | Step 0d | JSON parse failure / schema mismatch | Fail with line / field reference; instruct operator to re-export from Mission Planner UI; takeoff blocked | | Flight has zero waypoints | Step 0e | Post-fetch validation | Fail explicitly; cannot derive bbox or takeoff origin; takeoff blocked | | Flight bbox exceeds cache budget | Step 0e | Pre-Phase-1 bbox area vs AC-8.3 budget projection | Fail with budget delta; operator must re-plan a smaller route in Mission Planner UI; takeoff blocked | -| `satellite-provider` unreachable | Step 2 | HTTP timeout / 5xx | C11 `TileDownloader` fails with explicit error; operator retries when network is available; takeoff blocked | +| `satellite-provider` unreachable | Step 2a/2c (or 2' route variant) | HTTP timeout / 5xx | C11 `TileDownloader` / `SatelliteProviderRouteClient` fails with explicit error; operator retries when network is available; takeoff blocked | +| `satellite-provider` JWT auth 401/403 | Step 2a/2c (or 2' route variant) | HTTP 401/403 | Fail with explicit error; instruct operator to refresh `SATELLITE_PROVIDER_API_KEY`; takeoff blocked. 
Never silently fall back to plaintext or unauthenticated | +| Route validation fails (route variant) | Step 1'→2' | Pre-emptive client check against AZ-809 `CreateRouteRequestValidator` bounds | `RouteValidationError` raised BEFORE the HTTP POST; surface field-by-field errors to operator | +| Route materialisation terminal failure (route variant) | Step 2' poll | `GET /api/satellite/route/{id}` returns `status ∈ {failed, error, rejected}` | `RouteTerminalFailureError` with `.detail` carrying the server response JSON; takeoff blocked | +| Route poll budget exhausted (route variant) | Step 2' poll | 60 attempts × 5 s ceiling reached without `mapsReady=true` or terminal failure | `RouteTransientError` referencing the last observed status; operator may re-invoke or extend the poll budget | | Tile fails freshness | Step 3 (C11) | `tile.capture_timestamp` vs `sector_class` threshold | Reject (active_conflict) or downgrade-no-`satellite_anchored`-label (rear), per AC-NEW-6; counts surface in `DownloadBatchReport` | | Resolution below 0.5 m/px | Step 3 (C11) | Tile metadata GSD check (RESTRICT-SAT-4) | Reject; report; takeoff blocked | | Insufficient cache budget | Step 4 (C11) | Filesystem free-space check pre-write | Fail fast with explicit budget delta; no partial write | diff --git a/_docs/02_document/system-overview.md b/_docs/02_document/system-overview.md new file mode 100644 index 0000000..c862622 --- /dev/null +++ b/_docs/02_document/system-overview.md @@ -0,0 +1,54 @@ +# System Overview Diagram + +> Date: 2026-05-24. Plain-English end-to-end view of the GPS-denied onboard pose estimation system, intended for onboarding and presentations. Detailed per-component decomposition lives in `architecture.md`; per-flow sequences in `system-flows.md`. + +**One-line goal**: when a drone's GPS is jammed or spoofed, give the flight controller a position fix derived from what the camera sees vs. a pre-loaded satellite map — with an honest accuracy number attached. 
+ +```mermaid +flowchart LR + subgraph BEFORE["Before flight — operator workstation"] + UI["Mission Planner<br/>(operator draws route)"] --> PREP["Pre-flight setup<br/>• download map tiles<br/>• build search index<br/>• mark takeoff point"] + SAT[("Satellite map service")] -. tiles .-> PREP + end + + subgraph DURING["During flight — drone companion computer"] + CAM[/"Camera<br/>(3 Hz)"/] --> MOTION["Motion tracker<br/>(camera + IMU →<br/>frame-to-frame motion)"] + CAM --> MATCH["Map matcher<br/>(find where this frame is<br/>on the satellite map)"] + FC[/"Flight controller"/] -- "IMU 100–200 Hz" --> MOTION + FC -- "IMU 100–200 Hz" --> FUSE + MOTION --> FUSE + MATCH --> FUSE["State estimator<br/>(fuse motion + map +<br/>IMU into one position)"] + FUSE == "Position + accuracy<br/>+ how we got it" ==> FC + CACHE[("Cached map tiles<br/>read-only in flight")] --> MATCH + end + + subgraph AFTER["After landing — operator workstation"] + UPLOAD["Upload new tiles<br/>captured in flight<br/>(only on clean landing)"] + end + + PREP ==> DURING + PREP --> CACHE + DURING -. flight log .-> UPLOAD + UPLOAD -. tiles .-> SAT + + classDef ext fill:#eef,stroke:#88a; + classDef store fill:#ffe,stroke:#aa6; + class UI,SAT,FC,CAM ext; + class CACHE store; +``` + +## How to read it in 30 seconds + +1. **Before flight** — the operator draws a route in the Mission Planner. The workstation downloads the satellite-map tiles that cover the route, builds a search index over them, and notes the takeoff point. +2. **During flight** — the drone's camera produces a frame three times a second. Two things happen to each frame in parallel: + - The **motion tracker** combines the camera with the flight controller's IMU to estimate how the drone moved since the last frame. + - The **map matcher** compares the frame against the cached satellite tiles to find where on the map the drone currently is. +3. The **state estimator** fuses both signals (plus raw IMU) into a single position estimate, attaches an honest accuracy number, and sends it to the flight controller — which uses it as a drop-in replacement for GPS. +4. **After landing** — any new map tiles the drone captured during the flight get uploaded back to the satellite map service so the next mission has fresher data. + +## Why the picture is shaped this way (invariants worth defending) + +- **The drone never talks to the satellite map service in flight.** All tile downloads happen on the operator workstation before takeoff; all tile uploads happen on the operator workstation after landing. The airborne code physically cannot reach the network for tiles. (ADR-004 process isolation.) +- **Two parallel branches feed the estimator.** Motion tracking (camera + IMU) and map matching (camera + cached tiles) are independent — neither depends on the other to produce a result. The estimator decides how to weigh them on every frame. 
+- **The position emitted to the flight controller always carries an honest accuracy number and a provenance label** (`satellite_anchored` / `visual_propagated` / `dead_reckoned`). Under-reporting accuracy is treated as a defect, not a tuning knob. +- **Post-landing upload only fires on a clean shutdown** (the flight log's footer record confirms it). If the system crashed or the drone went down hard, mid-flight tiles stay local until an operator triages them. diff --git a/_docs/02_document/tests/blackbox-tests.md b/_docs/02_document/tests/blackbox-tests.md index 3239005..53ac864 100644 --- a/_docs/02_document/tests/blackbox-tests.md +++ b/_docs/02_document/tests/blackbox-tests.md @@ -672,3 +672,44 @@ All tests run from the `e2e-runner` container against the SUT through public bou The Vertical-error section is replaced by `_No emissions carried a comparable altitude — vertical stats skipped._` when none of the JSONL rows carry an `alt_m` field comparable to the ground-truth altitude. **Skip semantics**: AZ-699 distinguishes between *missing-prerequisite skip* (cleanly skipped with the missing file's path) and *test-cannot-resolve mask* (`@xfail` — explicitly forbidden by AZ-699 AC-1). The AZ-404 1-min test's `@xfail` on AC-3 is unchanged (AZ-699 AC-4 is "add a new test, don't replace") — FT-P-20 is the honest replacement that runs alongside it. + +--- + +### FT-P-21: End-to-end orchestrator pipeline from `(tlog, video, calibration)` only + +**Summary**: Validates the full 7-step Epic AZ-835 pipeline — given only `(tlog, video, calibration)`, the system auto-extracts a `RouteSpec` (AZ-836), posts it to the real satellite-provider (AZ-838), builds the C6 FAISS index via the route-driven `operator_pre_flight_setup` fixture (AZ-839, supersedes the AZ-777 Phase 3 bbox-seeded placeholder), runs the airborne replay pipeline, and emits a horizontal-error verdict report. No operator hand-curation between steps. 
Closes the Epic AZ-835 narrative: "give it a tlog + video + calibration, and the system does everything else." +**Traces to**: AZ-840 AC-1..AC-8 (epic AZ-835 narrative); supplementary product-AC coverage on AC-1.1, AC-1.2, AC-8.3 (pre-loaded cache realised from route, not bbox). +**Category**: End-to-end Integration + Position Accuracy + +**Preconditions**: +- Tier-2 Jetson with `@pytest.mark.tier2` + `RUN_REPLAY_E2E=1` env (explicit skip-reason naming the missing env var — no silent skip per AZ-840 AC-6). +- Real `satellite-provider` + `satellite-provider-postgres` services running in `docker-compose.test.jetson.yml` (lineage AZ-688 / AZ-691 / AZ-692; cycle-3 AZ-777 Phase 1 adapted C11 to the real `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` endpoints). +- `tests/e2e/replay/conftest.py::operator_pre_flight_setup` from AZ-839 (route-driven C6 population, supersedes the AZ-777 Phase 3 placeholder). +- `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog` + `flight_derkachi.mp4` (real binary + real video >1 MB). +- `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json` (AZ-702 factory-sheet camera calibration). +- `gps-denied-replay` console-script installed in the e2e-runner image (AZ-604). +- AZ-776 (eskf open-loop composition profile) landed; AZ-848 — Jetson `eskf_out_of_order` regression — currently blocks the heavy-AC path on Jetson, so FT-P-21 produces its first honest verdict once AZ-848 lands. + +**Input data**: real `derkachi.tlog`, real `flight_derkachi.mp4`, factory-sheet camera calibration. AZ-836's `extract_route_from_tlog(tlog, max_waypoints=10)` derives the `RouteSpec` from the tlog itself; no operator authoring required. 
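The "explicit skip-reason, no silent skip" precondition can be illustrated with a small helper that returns the reason string a test would hand to `pytest.skip`. This is a hedged sketch only: `replay_e2e_skip_reason` and its signature are hypothetical and are not the code in `test_az835_e2e_real_flight.py`. Only the `RUN_REPLAY_E2E=1` gate and the idea of naming the missing env var or missing file come from the preconditions above.

```python
from pathlib import Path
from typing import Iterable, Mapping, Optional


def replay_e2e_skip_reason(
    env: Mapping[str, str],
    required_files: Iterable[Path],
) -> Optional[str]:
    """Return an explicit skip reason, or None when the test may run.

    Hypothetical helper: the reason always names the missing env var or
    the missing file path, so a skip is never silent.
    """
    if env.get("RUN_REPLAY_E2E") != "1":
        return "RUN_REPLAY_E2E=1 is not set; skipping Tier-2 replay e2e"
    for path in required_files:
        if not path.is_file():
            return f"missing prerequisite fixture: {path}"
    return None
```

A test body might call this with `os.environ` and the three fixture paths, then skip with the returned string when it is not `None`. Mid-pipeline failures are a different case and should still fail loudly rather than skip.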
+ +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Active-flight cut + tlog/video sync via AZ-405's `tlog_video_adapter` | Active segment located; tlog↔video offset resolved (`replay.compose_root.ready` logs `auto_sync_used=true\|false`, AC-8 escape hatch honored). | +| 2 | On-the-fly frame + IMU extraction via `VideoFileFrameSource` + `TlogReplayFcAdapter` | Frame and IMU streams co-aligned per AZ-697 ground-truth invariants. | +| 3 | `extract_route_from_tlog(tlog, max_waypoints=10)` → `RouteSpec` | Route materially follows tlog trajectory; waypoints inside the Derkachi bbox (lat 50.0808..50.0832, lon 36.1070..36.1134) per AZ-836 AC-1. | +| 4 | `operator_pre_flight_setup` posts route via `SatelliteProviderRouteClient.seed_route`; satellite-provider downloads Google Maps tiles into C6 | Route registered; `mapsReady=true` within poll budget; `tile_count > 0`; warm fixture re-invocation within the same compose session ≤ 30 s (AZ-839 AC-2). | +| 5 | C10 `DescriptorBatcher` builds the FAISS HNSW NetVLAD index from the populated C6 | Three sidecar files (`.index` + `.sha256` + `.meta.json`) pass the AZ-306 triple-consistency check; tamper test raises `IndexUnavailableError` (AZ-839 AC-6). | +| 6 | Invoke airborne `gps-denied-replay` against the populated cache + tlog/video/calibration | Subprocess runs the per-frame loop end-to-end; emits JSONL outputs (currently blocked by AZ-848 — `eskf_out_of_order` at frame 3 fails the binary with exit 1 deterministically on the Derkachi 1-min clip). | +| 7 | Compute horizontal-error distribution via `helpers/accuracy_report.py` + `helpers/gps_compare.py`; write verdict report | `_docs/06_metrics/real_flight_validation_.md` exists with the honest distribution (PASS or FAIL on the AZ-696 100 m / 80 % gate — verdict emitted **regardless** of PASS/FAIL per AZ-840 AC-2). 
| + +**Expected outcome**: Verdict report exists with the honest horizontal-error distribution. Test PASSes iff the run meets the AZ-696 100 m / 80 % gate (≥ 80 % of ticks within 100 m of ground truth). Mid-pipeline failures (e.g., satellite-provider rejection at step 4, sidecar mismatch at step 5, ESKF divergence at step 6) fail LOUD with a clear error pointing at the failing step — no silent skip past a failure (AZ-840 AC-5). + +**Max execution time**: 15 min wall-clock on the Derkachi clip (AZ-840 AC-4 soft target for first delivery; hard NFR set after first honest measurement is recorded in the verdict report). + +**Relationship to existing tests**: +- FT-P-20 (AZ-699 real-flight runner) is preserved (AZ-840 AC-7) — FT-P-21 reuses its verdict-report-writing path through `_report_writer.py` rather than superseding it. Either the two live alongside, or AZ-699's runner is wrapped by AZ-840's orchestrator with the verdict-writing path preserved. +- FT-P-15 + FT-P-16 (pre-loaded cache, AC-8.3) remain the canonical bbox-fixture tests; FT-P-21 is the route-driven supplementary test that exercises the same end-state (populated C6) via the production C11→satellite-provider path. + +**Implemented as**: `tests/e2e/replay/test_az835_e2e_real_flight.py` (per AZ-840). Unit-tested orchestration helper: `tests/e2e/replay/test_e2e_orchestrator_unit.py` (17 tests covering parameter validation + error propagation between the 7 orchestration steps). diff --git a/_docs/02_document/tests/traceability-matrix.md b/_docs/02_document/tests/traceability-matrix.md index 9b39257..37b759c 100644 --- a/_docs/02_document/tests/traceability-matrix.md +++ b/_docs/02_document/tests/traceability-matrix.md @@ -8,8 +8,8 @@ This matrix is the canonical view of test coverage for the planning context. 
It | AC ID | Acceptance Criterion (one-line) | Test IDs | Coverage | |-------|---------------------|----------|----------| -| AC-1.1 | Frame-center GPS within 50 m for ≥80% of normal-flight photos | FT-P-01 | Covered | -| AC-1.2 | Frame-center GPS within 20 m for ≥50% of normal-flight photos | FT-P-01 | Covered | +| AC-1.1 | Frame-center GPS within 50 m for ≥80% of normal-flight photos | FT-P-01, FT-P-21 (orchestrator-level supplementary) | Covered | +| AC-1.2 | Frame-center GPS within 20 m for ≥50% of normal-flight photos | FT-P-01, FT-P-21 (orchestrator-level supplementary) | Covered | | AC-1.3 | Cumulative drift between satellite-anchored fixes <100 m visual / <50 m IMU-fused | FT-P-02 | Covered | | AC-1.4 | Estimate reports 95% covariance + source label | FT-P-03 | Covered | | AC-2.1a | Frame-to-frame registration ≥95% on normal segments | FT-P-04 | Covered | @@ -35,7 +35,7 @@ This matrix is the canonical view of test coverage for the planning context. It | AC-7.2 | AI-camera object coordinates from gimbal/zoom/altitude | — | NOT COVERED — same as AC-7.1 | | AC-8.1 | Imagery via Suite Sat Service offline cache, ≥0.5 m/px | FT-P-15, FT-P-16, NFT-SEC-02 | Covered | | AC-8.2 | Tile freshness <6 mo (active-conflict) / <12 mo (rear) | FT-N-05 | Covered | -| AC-8.3 | Imagery pre-loaded onto companion before flight | FT-P-15, FT-P-16 | Covered | +| AC-8.3 | Imagery pre-loaded onto companion before flight | FT-P-15, FT-P-16, FT-P-21 (route-driven via real satellite-provider) | Covered | | AC-8.4 | Mid-flight tile generation with quality metadata | FT-P-17 | Covered | | AC-8.5 | No raw nav/AI-cam frame retention except thumbnail log | FT-P-18 | Covered | | AC-8.6 | Satellite relocalization scale-ratio + scene-change | FT-P-19 (scale FULL; scene-change PARTIAL) | PARTIAL — scene-change subset reduced confidence (only 2/60 stills have paired sat refs; no labeled change-pair dataset). 
Independent of the AC-NEW-4 / AC-NEW-7 multi-flight gap (those rows were resolved by AC-text relaxation 2026-05-09; AC-8.6 scene-change still requires a labeled change-pair dataset that synthetic perturbations cannot substitute for). Mitigation: deferred to a follow-up cycle when labeled change-pair data becomes available; surfaced in the Step 4 risk register | @@ -78,6 +78,8 @@ This matrix is the canonical view of test coverage for the planning context. It > Revised 2026-05-09 (Plan Phase 2a.0 outcomes): three rows moved PARTIAL → Covered (AC-NEW-4, AC-NEW-7, RESTRICT-FAIL-2) following AC-text relaxation per Q3=B. Restriction row count corrected from 19 to 20 (pre-existing arithmetic error). > > Revised 2026-05-19 (Greenfield Step 12 cycle-update — autodev): NFT-RES-05 appended to `resilience-tests.md` capturing the composition-root bootstrap contract introduced by AZ-591 / AZ-618 / AZ-687 (replay-mode minimal config, `AirborneBootstrapError` operator-error contract, Tier-2 `replay.compose_root.ready` + `replay.input.frame_emitted` log-boundary gate). NFT-RES-05 is added to AC-NEW-1 and AC-4.1 as bootstrap-precondition coverage; no coverage counts move because the scenario is supplementary, not promoting any PARTIAL row. +> +> Revised 2026-05-24 (Existing-code cycle-3 Step 12 cycle-update — autodev): FT-P-21 appended to `blackbox-tests.md` capturing the Epic AZ-835 orchestrator-level end-to-end pipeline (AZ-836 `RouteSpec` extractor + AZ-838 `SatelliteProviderRouteClient` + AZ-839 route-driven `operator_pre_flight_setup` + AZ-840 orchestrator test). FT-P-21 is supplementary route-driven coverage on AC-1.1, AC-1.2 (orchestrator-level pipeline accuracy) and AC-8.3 (pre-loaded cache realised via the production C11→satellite-provider path rather than the bbox-seeded FT-P-15/FT-P-16 fixture). No coverage counts move — FT-P-21 supplements already-Covered rows. 
**Currently blocked on Jetson by AZ-848** (`eskf_out_of_order` regression introduced by AZ-776's missing Jetson-verification gate — pre-existing, surfaced cycle-3 Step 11; tracked locally at `_docs/02_tasks/todo/AZ-848_jetson_eskf_out_of_order_regression.md`). Cycle-3 internal changes (C11 contract adaptation per AZ-777 Phase 1; RouteSpec relocation per AZ-845; module-layout refresh AZ-846; AZ-270 lint widening AZ-847; C12 cold-start unit-NFR threshold relax AZ-844) are implementation-only and produce no new black-box scenarios. | Category | Total Items | Covered | PARTIAL | Not Covered | Coverage % (Covered + PARTIAL counted half) | |----------|-----------|---------|---------|-------------|--------------------------------------------| diff --git a/_docs/02_tasks/_dependencies_table.md b/_docs/02_tasks/_dependencies_table.md index 0c96fef..4ad5408 100644 --- a/_docs/02_tasks/_dependencies_table.md +++ b/_docs/02_tasks/_dependencies_table.md @@ -1,8 +1,8 @@ # Dependencies Table -**Date**: 2026-05-23 (cycle-3 Step 10 Implement, refactor run 02-az507-routespec-relocation — added AZ-844 (Epic, run dir `_docs/04_refactoring/02-az507-routespec-relocation/`) + AZ-845 (C01, 2pt relocate `RouteSpec` from `replay_input/tlog_route.py` to `_types/route.py`, deps None, epic AZ-844) + AZ-846 (C02, 2pt refresh `module-layout.md` cycle-3 entries — c11 + replay_input + `_types/route`, deps AZ-845, epic AZ-844) + AZ-847 (C03, 2pt widen `test_az270_compose_root` lint to enforce full rule-9 allow-list, deps AZ-845, epic AZ-844). Resolves cycle-3 cumulative review FAIL verdict (F1 High Architecture, F2 Medium Architecture, F3 Medium Maintainability) per `_docs/03_implementation/cumulative_review_batches_104-109_cycle3_report.md`. Jira "Blocks" links recorded: AZ-845 → AZ-846, AZ-845 → AZ-847. 
Earlier same-day at start of Step 10 Implement — Epic AZ-835 decomposed into 4 leaf tasks + AZ-777 closed: AZ-839 (C3, 5pt operator_pre_flight_setup real fixture, deps AZ-836+AZ-838+AZ-777Phase1+AZ-322+AZ-316+AZ-306, epic AZ-835), AZ-840 (C4, 3pt e2e orchestrator test (tlog,video,calibration), deps AZ-839+AZ-836+AZ-838+AZ-699+AZ-405+AZ-702+AZ-696, epic AZ-835), AZ-841 (C5, 1pt un-xfail AZ-777 AC-4+AC-5, deps AZ-839+AZ-840, epic AZ-835), AZ-842 (C6, 2pt docs — replay_protocol.md Invariant 12 + architecture.md + orchestrator README, soft dep AZ-841, epic AZ-835). AZ-777 transitioned to Done in Jira: Phases 1+2 shipped (batch 104 + between batches 104 and 106); Phases 3-5 superseded by Epic AZ-835 children per 2026-05-22 user directive. AZ-777 spec moved to done/. Earlier 2026-05-21 (cycle-3 Step 9 New Task — added AZ-776 (3pt open-loop ESKF composition profile via `c4_pose.enabled` flag, no deps, epic AZ-602) + AZ-777 (5pt Derkachi C6 reference tile cache + FAISS descriptor index from OSM/CARTO basemap, depends on AZ-776, epic AZ-602). Both unblock the 7 currently-`@xfail`-masked Derkachi e2e tests on Jetson; AZ-776 unblocks 5 (AC-1, AC-2, AC-5, AC-6 realtime, AC-6 asap), AZ-777 unblocks the remaining 2 (AC-3 + AZ-699 real-flight verdict). Earlier 2026-05-19 (refreshed late-morning after 11:27 Jetson Tier-2 e2e run for AZ-618 — surfaced a NEW gap: replay-mode `Config` lacks `c6_tile_cache` block, so `build_pre_constructed → _build_c6_descriptor_index → _c6_config` raises `KeyError` for AC-1/2/5/6. Follow-up filed as AZ-687 (2pt) under E-AZ-602 with guard at the bootstrap layer (NOT silent fallback in `_c6_config`). Earlier same-day mid-day after AZ-618 split: per the spec author's own Sizing-note recommendation + user-rule cap on PBI complexity, AZ-618 was split into 6 subtasks AZ-619..AZ-624 in Jira (subtasks of AZ-618; epic AZ-602 stays grandparent). AZ-618 retained at 0pt as the umbrella tracker; aggregate actionable work is 16pt across the subtasks (vs. 
AZ-618's original 5pt filing — author's "likely a true 8" caveat was understated due to c5_isam2_graph_handle ordering + GPU builder unknowns). Earlier same-day refresh at start of Step-7 rewind for AZ-618 — Step-11 Jetson tier-2 e2e gate identified missing internal product implementation: `runtime_root.main()` does not build the airborne `pre_constructed` infrastructure dict before `compose_root()`; AZ-618 = 5pt cross-cutting follow-up to AZ-591, lives under E-AZ-602; all 12 dep tasks are in `done/`. Earlier 2026-05-16 (cycle-1 completeness-gate post-mortem): AZ-589 + AZ-590 closed Won't Fix — were wrong abstraction (OKVIS v1 `ThreadedKFVio` API doesn't exist in OKVIS2 upstream; VINS-Mono `cpp/vins_mono/upstream/` submodule never existed; the actual production gap is the empty central `_STRATEGY_REGISTRY` affecting EVERY component with a strategy-selecting config field, not just c1_vio); replaced by AZ-591 (cross-cutting compose_root per-binary bootstrap, todo/, 5pt) + AZ-592 (AZ-332 Tier-2 validation bundle, backlog/, 5pt placeholder) + AZ-593 (AZ-333 Tier-2 validation bundle, backlog/, 5pt placeholder); AZ-332 + AZ-333 re-classified in gate report from FAIL to BLOCKED-on-Tier-2 per the original tasks' Implementation Notes deferral handles; earlier same-day after end of cycle-1 gate: AZ-589 + AZ-590 created (now closed); earlier same-day after end of Batch 64: AZ-558 implementation closed — `MavlinkTransport` seam now routes every C8 outbound MAVLink byte; AZ-401 AC-9 + AZ-404 AC-4b unskipped together; encoder helpers extracted to `_outbound_mavlink_payloads.py`; live-mode `compose_root` injection deferred to whichever future batch registers AP/iNav strategies in an airborne binary; earlier 2026-05-14: refreshed at start of Batch 63: AZ-559 closed Won't Fix — gap was illusory; `TileSource.ONBOARD_INGEST` + `TileMetadata.quality_metadata` + `write_tile`'s `FreshnessRejectionError` already cover the AZ-389 mid-flight ingest semantic without any new API; AZ-389 dep 
restored to AZ-303; earlier same-day after Batch 61: AZ-558 follow-up added — routes C8 outbound encoder bytes through `MavlinkTransport` seam; closes AZ-401 AC-9 deferred during batch 61 due to encoder-side routing not being in the AZ-401 task envelope; earlier same-day after cumulative review batches 52-54: AZ-528 hygiene PBI added for c1_vio strategy facade orchestration-spine 3-way duplication (Medium); earlier same-day after Batch 53: AZ-333 VINS-Mono landed — first c1_vio strategy after the AZ-332 OKVIS2 production-default; consolidation hygiene for the strategy-facade duplication deferred to a post-AZ-334 PBI; earlier same-day after Batch 51: AZ-527 hygiene PBI added from cumulative review batches 49-51 F1; 2026-05-13: AZ-526 hygiene PBI added from cumulative review batches 46-48 F1+F3; same-day refresh after Batch 44 SRP refactor: AZ-317 superseded; AZ-329 + AZ-330 specs rewritten; AZ-523 + AZ-524 audit-trail tickets added; E-C12 epic renamed `Operator Pre-flight Tooling` → `Operator Pre-flight Orchestrator`; earlier same-day refresh: AZ-507 + AZ-508 hygiene PBIs from cumulative review batches 31-33; 2026-05-11: AZ-489 + AZ-490 ADR-010 operator-origin path) -**Total Tasks**: 176 (135 product + 41 blackbox-test) — 2026-05-23 refactor-run bump: +AZ-844 (Epic, 0pt umbrella for refactor run 02) + AZ-845 + AZ-846 + AZ-847 (3 product tasks). Prior 2026-05-23 bump (Epic AZ-835 decomposition): 173 (132 product + 41 blackbox-test) = +AZ-835 (Epic) + AZ-836 (C1) + AZ-837 (test-stack hardening, not this Epic) + AZ-838 (C2) added 2026-05-22→2026-05-23 prior to that update; +AZ-839 (C3) + AZ-840 (C4) + AZ-841 (C5) + AZ-842 (C6) added in that update. AZ-777 stays in the table (now closed in Jira; spec at `done/AZ-777_derkachi_c6_reference_fixture.md` retains 8pt credit for Phases 1+2 shipped). 
Earlier counts: 165 (124 product + 41 blackbox-test) — AZ-317 retained in the table marked SUPERSEDED for audit; AZ-523 (C11 gate removal) + AZ-524 (C12 rename) added as 2 closed audit-trail tasks; AZ-526 = 2pt clock-helper hygiene; AZ-527 = 2pt c2 engine-dim helper hygiene; AZ-528 = 3pt c1_vio facade-spine hygiene; AZ-558 = 3pt MavlinkTransport routing follow-up; AZ-559 closed Won't Fix; AZ-589 + AZ-590 closed Won't Fix (kept in table as 0pt audit-trail rows); AZ-591 = 5pt cross-cutting compose_root bootstrap (todo/); AZ-592 = 5pt OKVIS2 Tier-2 placeholder (backlog/); AZ-593 = 5pt VINS-Mono Tier-2 placeholder (backlog/); AZ-618 = 0pt umbrella (split into AZ-619..AZ-624 on 2026-05-19); AZ-619..AZ-624 = 6 subtasks of AZ-618 covering Phase A..F of the airborne `pre_constructed` assembly, summing to 16pt actionable work; AZ-687 = 2pt replay-mode guard follow-up surfaced by AZ-618 Tier-2 run on 2026-05-19 -**Total Complexity Points**: 563 (430 product + 133 blackbox-test) — 2026-05-23 refactor-run bump: +2pt AZ-845 + 2pt AZ-846 + 2pt AZ-847 = +6 product pts on top of prior reconciled total (AZ-844 epic itself is 0pt umbrella). Prior 2026-05-23 reconciled total: 557 (424 product + 133 blackbox-test) — +5pt AZ-839 + 3pt AZ-840 + 1pt AZ-841 + 2pt AZ-842 = +11 product pts on top of prior reconciled total. AZ-836 (3pt) + AZ-838 (3pt) were added 2026-05-22→2026-05-23 prior to that update; AZ-837 (test-stack hardening, not this Epic) is unaccounted in that delta and should be folded in at the next preamble reconciliation. 
Earlier baseline: 546 (413 product + 133 blackbox-test) — +3pt AZ-776 + 8pt AZ-777 (5→8 override 2026-05-21 cycle-3 batch 104; see `_docs/_process_leftovers/2026-05-21_az777_complexity_override.md` for rationale + the spec refresh that pulled e2e-runner wiring + C11 contract adapt + Derkachi catalog seed + fixture replacement + un-xfail into one ticket) — AZ-523 = 3pt, AZ-524 = 2pt, AZ-526 = 2pt, AZ-527 = 2pt, AZ-528 = 3pt, AZ-558 = 3pt, AZ-589 + AZ-590 retained at 5pt each but closed Won't Fix (treated as 0 effective pts going forward), AZ-591 = 5pt, AZ-592 = 5pt placeholder, AZ-593 = 5pt placeholder, AZ-618 = 0pt umbrella post-split, AZ-619 = 2pt, AZ-620 = 3pt, AZ-621 = 3pt, AZ-622 = 3pt, AZ-623 = 3pt, AZ-624 = 2pt, AZ-687 = 2pt +**Date**: 2026-05-26 (cycle-4 Step 9 New Task — scope adjustments: (a) AZ-841 (1pt, un-xfail AZ-777 Tier-2 tests) moved from todo/ to backlog/ due to hard conflict with AZ-895 AC-4 (test_derkachi_real_tlog.py stays @xfail in cycle 4 because AZ-848 is backlogged) + partial overlap with AZ-894 AC-3 (CSV-path adapter covers the test_derkachi_1min.py un-xfail target); Jira comment added to AZ-841 documenting the deferral. (b) AZ-842 (2pt → **3pt**, +1 SP rescope) — dropped AZ-841 soft dependency, expanded replay_protocol.md scope to add new Invariant 13 covering single-canonical-clock model + cycle-4 CSV-driven replay narrative (AZ-894 + AZ-895 + AZ-896), plus architecture.md replay-input section updates. New deps: AZ-894 HARD + AZ-895 HARD + AZ-896 SOFT. (c) +**AZ-899** (1pt, product, todo/, land architecture_compliance_baseline.md — cycle-3 retro Top-3 #3 third try; deps None; no epic). (d) +**AZ-900** (1pt, product, todo/, autodev cycle-N+1 Step-9 retro-existence gate — cycle-3 retro Top-3 #2 + 2026-05-26 LESSONS process entry; deps None; no epic). (e) +**AZ-901** (1pt, product, todo/, fix EVIDENCE_OUT default path in e2e/runner/conftest.py:56 — closes 2026-05-26 leftover; deps None; no epic). 
Cycle-4 active scope: 8 product tickets in todo/ totaling **17 SP** = AZ-842 (3, docs) + AZ-894 (3, CSV adapter) + AZ-895 (2, auto-sync deprecation) + AZ-896 (1, format docs) + AZ-897 (5, replay UI) + AZ-899 (1) + AZ-900 (1) + AZ-901 (1). Dependency order: AZ-894 blocks AZ-895 + AZ-842 + AZ-897; AZ-896 blocks AZ-897 + AZ-842. AZ-899/AZ-900/AZ-901 standalone (no internal blockers). AZ-848 (5) + AZ-883 (2) remain in backlog/ (cycle-3 retro Top-3 #1 deferred by user decision; CSV-bypass strategy supersedes their fixes for the demo path). Earlier 2026-05-23 (cycle-3 Step 10 Implement, refactor run 02-az507-routespec-relocation: added AZ-844 (Epic, run dir `_docs/04_refactoring/02-az507-routespec-relocation/`) + AZ-845 (C01, 2pt relocate `RouteSpec` from `replay_input/tlog_route.py` to `_types/route.py`, deps None, epic AZ-844) + AZ-846 (C02, 2pt refresh `module-layout.md` cycle-3 entries: c11 + replay_input + `_types/route`, deps AZ-845, epic AZ-844) + AZ-847 (C03, 2pt widen `test_az270_compose_root` lint to enforce full rule-9 allow-list, deps AZ-845, epic AZ-844). Resolves cycle-3 cumulative review FAIL verdict (F1 High Architecture, F2 Medium Architecture, F3 Medium Maintainability) per `_docs/03_implementation/cumulative_review_batches_104-109_cycle3_report.md`. Jira "Blocks" links recorded: AZ-845 → AZ-846, AZ-845 → AZ-847. Earlier same-day at start of Step 10 Implement: Epic AZ-835 decomposed into 4 leaf tasks + AZ-777 closed: AZ-839 (C3, 5pt operator_pre_flight_setup real fixture, deps AZ-836+AZ-838+AZ-777Phase1+AZ-322+AZ-316+AZ-306, epic AZ-835), AZ-840 (C4, 3pt e2e orchestrator test (tlog,video,calibration), deps AZ-839+AZ-836+AZ-838+AZ-699+AZ-405+AZ-702+AZ-696, epic AZ-835), AZ-841 (C5, 1pt un-xfail AZ-777 AC-4+AC-5, deps AZ-839+AZ-840, epic AZ-835), AZ-842 (C6, 2pt docs: replay_protocol.md Invariant 12 + architecture.md + orchestrator README, soft dep AZ-841, epic AZ-835).
AZ-777 transitioned to Done in Jira: Phases 1+2 shipped (batch 104 + between batches 104 and 106); Phases 3-5 superseded by Epic AZ-835 children per 2026-05-22 user directive. AZ-777 spec moved to done/. Earlier 2026-05-21 (cycle-3 Step 9 New Task — added AZ-776 (3pt open-loop ESKF composition profile via `c4_pose.enabled` flag, no deps, epic AZ-602) + AZ-777 (5pt Derkachi C6 reference tile cache + FAISS descriptor index from OSM/CARTO basemap, depends on AZ-776, epic AZ-602). Both unblock the 7 currently-`@xfail`-masked Derkachi e2e tests on Jetson; AZ-776 unblocks 5 (AC-1, AC-2, AC-5, AC-6 realtime, AC-6 asap), AZ-777 unblocks the remaining 2 (AC-3 + AZ-699 real-flight verdict). Earlier 2026-05-19 (refreshed late-morning after 11:27 Jetson Tier-2 e2e run for AZ-618 — surfaced a NEW gap: replay-mode `Config` lacks `c6_tile_cache` block, so `build_pre_constructed → _build_c6_descriptor_index → _c6_config` raises `KeyError` for AC-1/2/5/6. Follow-up filed as AZ-687 (2pt) under E-AZ-602 with guard at the bootstrap layer (NOT silent fallback in `_c6_config`). Earlier same-day mid-day after AZ-618 split: per the spec author's own Sizing-note recommendation + user-rule cap on PBI complexity, AZ-618 was split into 6 subtasks AZ-619..AZ-624 in Jira (subtasks of AZ-618; epic AZ-602 stays grandparent). AZ-618 retained at 0pt as the umbrella tracker; aggregate actionable work is 16pt across the subtasks (vs. AZ-618's original 5pt filing — author's "likely a true 8" caveat was understated due to c5_isam2_graph_handle ordering + GPU builder unknowns). Earlier same-day refresh at start of Step-7 rewind for AZ-618 — Step-11 Jetson tier-2 e2e gate identified missing internal product implementation: `runtime_root.main()` does not build the airborne `pre_constructed` infrastructure dict before `compose_root()`; AZ-618 = 5pt cross-cutting follow-up to AZ-591, lives under E-AZ-602; all 12 dep tasks are in `done/`. 
Earlier 2026-05-16 (cycle-1 completeness-gate post-mortem): AZ-589 + AZ-590 closed Won't Fix — were wrong abstraction (OKVIS v1 `ThreadedKFVio` API doesn't exist in OKVIS2 upstream; VINS-Mono `cpp/vins_mono/upstream/` submodule never existed; the actual production gap is the empty central `_STRATEGY_REGISTRY` affecting EVERY component with a strategy-selecting config field, not just c1_vio); replaced by AZ-591 (cross-cutting compose_root per-binary bootstrap, todo/, 5pt) + AZ-592 (AZ-332 Tier-2 validation bundle, backlog/, 5pt placeholder) + AZ-593 (AZ-333 Tier-2 validation bundle, backlog/, 5pt placeholder); AZ-332 + AZ-333 re-classified in gate report from FAIL to BLOCKED-on-Tier-2 per the original tasks' Implementation Notes deferral handles; earlier same-day after end of cycle-1 gate: AZ-589 + AZ-590 created (now closed); earlier same-day after end of Batch 64: AZ-558 implementation closed — `MavlinkTransport` seam now routes every C8 outbound MAVLink byte; AZ-401 AC-9 + AZ-404 AC-4b unskipped together; encoder helpers extracted to `_outbound_mavlink_payloads.py`; live-mode `compose_root` injection deferred to whichever future batch registers AP/iNav strategies in an airborne binary; earlier 2026-05-14: refreshed at start of Batch 63: AZ-559 closed Won't Fix — gap was illusory; `TileSource.ONBOARD_INGEST` + `TileMetadata.quality_metadata` + `write_tile`'s `FreshnessRejectionError` already cover the AZ-389 mid-flight ingest semantic without any new API; AZ-389 dep restored to AZ-303; earlier same-day after Batch 61: AZ-558 follow-up added — routes C8 outbound encoder bytes through `MavlinkTransport` seam; closes AZ-401 AC-9 deferred during batch 61 due to encoder-side routing not being in the AZ-401 task envelope; earlier same-day after cumulative review batches 52-54: AZ-528 hygiene PBI added for c1_vio strategy facade orchestration-spine 3-way duplication (Medium); earlier same-day after Batch 53: AZ-333 VINS-Mono landed — first c1_vio strategy after the 
AZ-332 OKVIS2 production-default; consolidation hygiene for the strategy-facade duplication deferred to a post-AZ-334 PBI; earlier same-day after Batch 51: AZ-527 hygiene PBI added from cumulative review batches 49-51 F1; 2026-05-13: AZ-526 hygiene PBI added from cumulative review batches 46-48 F1+F3; same-day refresh after Batch 44 SRP refactor: AZ-317 superseded; AZ-329 + AZ-330 specs rewritten; AZ-523 + AZ-524 audit-trail tickets added; E-C12 epic renamed `Operator Pre-flight Tooling` → `Operator Pre-flight Orchestrator`; earlier same-day refresh: AZ-507 + AZ-508 hygiene PBIs from cumulative review batches 31-33; 2026-05-11: AZ-489 + AZ-490 ADR-010 operator-origin path) +**Total Tasks**: 179 (138 product + 41 blackbox-test) — 2026-05-26 cycle-4 Step 9 bump: +AZ-899 + AZ-900 + AZ-901 (3 product tasks). AZ-841 moved todo/ → backlog/ (no count change; backlog tickets are still in the table). Prior 2026-05-23 refactor-run bump: 176 (135 product + 41 blackbox-test) — +AZ-844 (Epic, 0pt umbrella for refactor run 02) + AZ-845 + AZ-846 + AZ-847 (3 product tasks). Prior 2026-05-23 bump (Epic AZ-835 decomposition): 173 (132 product + 41 blackbox-test) = +AZ-835 (Epic) + AZ-836 (C1) + AZ-837 (test-stack hardening, not this Epic) + AZ-838 (C2) added 2026-05-22→2026-05-23 prior to that update; +AZ-839 (C3) + AZ-840 (C4) + AZ-841 (C5) + AZ-842 (C6) added in that update. AZ-777 stays in the table (now closed in Jira; spec at `done/AZ-777_derkachi_c6_reference_fixture.md` retains 8pt credit for Phases 1+2 shipped). 
Earlier counts: 165 (124 product + 41 blackbox-test) — AZ-317 retained in the table marked SUPERSEDED for audit; AZ-523 (C11 gate removal) + AZ-524 (C12 rename) added as 2 closed audit-trail tasks; AZ-526 = 2pt clock-helper hygiene; AZ-527 = 2pt c2 engine-dim helper hygiene; AZ-528 = 3pt c1_vio facade-spine hygiene; AZ-558 = 3pt MavlinkTransport routing follow-up; AZ-559 closed Won't Fix; AZ-589 + AZ-590 closed Won't Fix (kept in table as 0pt audit-trail rows); AZ-591 = 5pt cross-cutting compose_root bootstrap (todo/); AZ-592 = 5pt OKVIS2 Tier-2 placeholder (backlog/); AZ-593 = 5pt VINS-Mono Tier-2 placeholder (backlog/); AZ-618 = 0pt umbrella (split into AZ-619..AZ-624 on 2026-05-19); AZ-619..AZ-624 = 6 subtasks of AZ-618 covering Phase A..F of the airborne `pre_constructed` assembly, summing to 16pt actionable work; AZ-687 = 2pt replay-mode guard follow-up surfaced by AZ-618 Tier-2 run on 2026-05-19 +**Total Complexity Points**: 567 (434 product + 133 blackbox-test) — 2026-05-26 cycle-4 Step 9 bump: +1pt AZ-899 + 1pt AZ-900 + 1pt AZ-901 + 1pt AZ-842 rescope (2→3) = +4 product pts. Prior 2026-05-23 refactor-run bump: 563 (430 product + 133 blackbox-test) — +2pt AZ-845 + 2pt AZ-846 + 2pt AZ-847 = +6 product pts on top of prior reconciled total (AZ-844 epic itself is 0pt umbrella). Prior 2026-05-23 reconciled total: 557 (424 product + 133 blackbox-test) — +5pt AZ-839 + 3pt AZ-840 + 1pt AZ-841 + 2pt AZ-842 = +11 product pts on top of prior reconciled total. AZ-836 (3pt) + AZ-838 (3pt) were added 2026-05-22→2026-05-23 prior to that update; AZ-837 (test-stack hardening, not this Epic) is unaccounted in that delta and should be folded in at the next preamble reconciliation. 
Earlier baseline: 546 (413 product + 133 blackbox-test) — +3pt AZ-776 + 8pt AZ-777 (5→8 override 2026-05-21 cycle-3 batch 104; see `_docs/_process_leftovers/2026-05-21_az777_complexity_override.md` for rationale + the spec refresh that pulled e2e-runner wiring + C11 contract adapt + Derkachi catalog seed + fixture replacement + un-xfail into one ticket) — AZ-523 = 3pt, AZ-524 = 2pt, AZ-526 = 2pt, AZ-527 = 2pt, AZ-528 = 3pt, AZ-558 = 3pt, AZ-589 + AZ-590 retained at 5pt each but closed Won't Fix (treated as 0 effective pts going forward), AZ-591 = 5pt, AZ-592 = 5pt placeholder, AZ-593 = 5pt placeholder, AZ-618 = 0pt umbrella post-split, AZ-619 = 2pt, AZ-620 = 3pt, AZ-621 = 3pt, AZ-622 = 3pt, AZ-623 = 3pt, AZ-624 = 2pt, AZ-687 = 2pt Dependencies columns list only the tracker-ID portion (descriptive tail text in each task spec is omitted here for table-readability). The @@ -190,11 +190,14 @@ are all declared and documented below under **Cycle Check**. | AZ-838 | C2: SatelliteProviderRouteClient + seed_route.py CLI — POST RouteSpec to SP, poll mapsReady | 3 | AZ-836; AZ-777 Phase 1; AZ-809 (soft) | AZ-835 | | AZ-839 | C3: operator_pre_flight_setup real fixture — wire C1+C2+C11+C10, supersedes AZ-777 Phase 3 | 5 | AZ-836; AZ-838; AZ-777 Phase 1 (done); AZ-322; AZ-316; AZ-306 | AZ-835 | | AZ-840 | C4: E2E orchestrator test — raw (tlog, video, calibration) drives steps 1-7 end-to-end | 3 | AZ-839; AZ-836; AZ-838; AZ-699; AZ-405; AZ-702; AZ-696 | AZ-835 | -| AZ-841 | C5: Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests once C3 + C4 land | 1 | AZ-839; AZ-840 | AZ-835 | -| AZ-842 | C6: Docs — replay_protocol.md Invariant 12 + architecture.md + orchestrator-test README | 2 | AZ-841 (soft) | AZ-835 | +| AZ-841 | C5: Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests (BACKLOGGED 2026-05-26 — AZ-895 AC-4 conflict) | 1 | AZ-839; AZ-840 | AZ-835 | +| AZ-842 | C6: Docs — replay_protocol.md Invariants 12+13 + architecture.md + orchestrator-test README | 3 | AZ-894; AZ-895; AZ-896 (soft) | AZ-835 | | 
AZ-845 | Refactor C01: Relocate RouteSpec DTO to _types/route.py (AZ-507 rule 9 fix) | 2 | None | AZ-844 | | AZ-846 | Refactor C02: Refresh module-layout.md cycle-3 entries (c11 + replay_input + _types/route) | 2 | AZ-845 | AZ-844 | | AZ-847 | Refactor C03: Widen test_az270_compose_root lint to enforce full rule-9 allow-list | 2 | AZ-845 | AZ-844 | +| AZ-899 | Land architecture_compliance_baseline.md (cycle-3 retro #3, 3rd try) | 1 | None | (none) | +| AZ-900 | Autodev: gate cycle-N+1 Step-9 entry on previous-cycle retro existence | 1 | None | (none) | +| AZ-901 | Fix EVIDENCE_OUT default path in e2e/runner/conftest.py:56 | 1 | None | (none) | ## Notes diff --git a/_docs/02_tasks/todo/AZ-841_unxfail_az777_tier2_tests.md b/_docs/02_tasks/backlog/AZ-841_unxfail_az777_tier2_tests.md similarity index 82% rename from _docs/02_tasks/todo/AZ-841_unxfail_az777_tier2_tests.md rename to _docs/02_tasks/backlog/AZ-841_unxfail_az777_tier2_tests.md index 4900c3a..6fad2a0 100644 --- a/_docs/02_tasks/todo/AZ-841_unxfail_az777_tier2_tests.md +++ b/_docs/02_tasks/backlog/AZ-841_unxfail_az777_tier2_tests.md @@ -1,5 +1,19 @@ # Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests (AZ-835 C5) +> **Cycle-4 deferral (2026-05-26)**: moved to `backlog/` during cycle-4 Step 9 +> scope review. Blocking issues: +> - **Conflict with AZ-895 AC-4**: AZ-895 (cycle-4 cleanup) explicitly states +> `test_derkachi_real_tlog.py` stays `@xfail` with the AZ-848-scoped reason +> in cycle 4. Un-xfailing this test here contradicts AZ-895 and will fail +> the Jetson run because AZ-848 (the underlying clock bug) is in backlog/. +> - **Partial overlap with AZ-894 AC-3**: the other un-xfail target +> (`test_derkachi_1min.py::AC3`) is the same test AZ-894 (cycle-4 CSV +> adapter) covers under its own AC-3 — re-doing the un-xfail in a +> separate ticket duplicates effort. 
+> - **Replay condition**: revisit when EITHER (a) AZ-848 is fixed and the +> tlog adapter path is restored, OR (b) cycle 4 lands and we rescope this +> ticket to only the CSV-path tests AZ-894 doesn't already cover. + **Task**: AZ-841_unxfail_az777_tier2_tests **Name**: Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests once C3 fixture + C4 orchestrator land (AZ-835 C5) **Description**: Fifth building block of Epic AZ-835. Once C3 (AZ-839, `operator_pre_flight_setup` real fixture) and C4 (AZ-840, e2e orchestrator test) land, remove the `@pytest.mark.xfail` markers from the AZ-777 Tier-2 tests. The verdict — PASS or FAIL — becomes the honest signal. Both tests remain gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`. diff --git a/_docs/02_tasks/backlog/AZ-848_jetson_eskf_out_of_order_regression.md b/_docs/02_tasks/backlog/AZ-848_jetson_eskf_out_of_order_regression.md new file mode 100644 index 0000000..baafd30 --- /dev/null +++ b/_docs/02_tasks/backlog/AZ-848_jetson_eskf_out_of_order_regression.md @@ -0,0 +1,135 @@ +# [AZ-776 follow-up] derkachi_1min AC-1/2/5/6 fail on Jetson — VioOutput.emitted_at_ns clock-mismatch with FC IMU timebase + +> **SCOPE UPDATE (2026-05-26, cycle-4 planning)** +> +> After user decision to switch the primary replay path to user-supplied (video, CSV) pairs (see AZ-894 / AZ-895 / AZ-896 / AZ-897), the tlog-adapter path becomes **audit-only** and this ticket is **no longer bench-blocking**. It remains a real bug and stays open for any future tlog-only flight (flights that ship with a `.tlog` but no companion `data_imu.csv`). 
+>
+> **Priority**: backlog (deprioritised from cycle-4 candidate)
+> **Bench-blocking?**: no; AZ-894 supersedes
+> **Production-blocking?**: no; production single-clock model never goes through the tlog adapter
+> **Complexity**: unchanged (5 SP)
+
+**Task**: AZ-848_jetson_eskf_out_of_order_regression
+**Name**: Repair the VioOutput contract: emitted_at_ns must use the frame's timeline timestamp, not process monotonic_ns, so it aligns with the FC IMU timebase that C5 ESKF tracks alongside it
+**Description**: On the Jetson e2e harness (`scripts/run-tests-jetson.sh`), four tests in `tests/e2e/replay/test_derkachi_1min.py` (AC-1, AC-5, AC-6 realtime, AC-6 asap) fail with identical deterministic root cause `EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0')` at frame 3, preceded by `c5.state.eskf_out_of_order` from `imu_window` (ts_ns=187_370_418_000 < last_added_ts_ns=1_187_232_637_925_619, a ratio of ~6.3·10³, i.e. nearly 4 orders of magnitude apart). Plus 1 XPASS on `test_ac3_within_100m_80pct_of_ticks` (probable vacuous pass: when the binary exits 1 on frame 3, the ≥80 % within 100 m assertion evaluates over zero emissions).
+
+**Revised root cause (2026-05-26 evidence-based investigation)**: NOT an IMU-vs-IMU clock-source mismatch (the original hypothesis was incorrect; RAW_IMU.time_usec and SCALED_IMU2.time_boot_ms share the same FC-boot-relative timebase in the Derkachi tlog: 187–634 s). The actual mismatch is **VioOutput.emitted_at_ns** vs **ImuWindow.ts_end_ns**:
+
+| Source | Code site | Value on Jetson | Timebase |
+|---|---|---|---|
+| `VioOutput.emitted_at_ns` | `klt_ransac.py:274`: `self._clock.monotonic_ns()` | ~1.187·10¹⁵ ns (≈ 13.7 days, Jetson uptime when the run started) | Process monotonic |
+| `imu_window.ts_end_ns` | `tlog_replay_adapter.py:710`: `time_usec * 1000` | ~1.87·10¹¹ ns (≈ 187 s, Pixhawk boot-relative) | FC-boot-relative |
+
+C5 ESKF tracks `_last_added_ts_ns` across BOTH `add_vio` and `add_fc_imu`.
Frame 0: `add_vio` sets `_last_added_ts_ns = 1.187·10¹⁵`. Frame 1: `add_fc_imu` checks `1.87·10¹¹ + ~10⁸ < 1.187·10¹⁵` → out_of_order degraded → next add_vio with corrupted nominal state → mahalanobis² = 109.76 > 100 → fatal divergence at frame 3. + +**Why this hides on Tier-1**: the test is `@pytest.mark.tier2_only` (skipped on workstation runs). Unit tests use mocked VIO with synthetic clocks, so the contract clash never surfaces. + +**Why this hides on a short-uptime Jetson**: a Jetson booted < ~10 s ago would have monotonic_ns smaller than the FC's boot-relative timestamps; the inequality flips and the bug masquerades as "intermittent passes". The 13.7-day-uptime test box made it deterministic. + +**Complexity**: 5 SP (revised up from 3 — the fix touches the C1 contract: `VioOutput.emitted_at_ns` semantics + every C1 strategy that populates it + `_docs/02_document/contracts/c1_vio/` doc + every consumer of `vio.emitted_at_ns` in C5 / C13 / FDR. Plus a determinism test that records monotonic_ns vs frame_ts_ns at frame 0 to lock the invariant in.) +**Dependencies**: AZ-776 (closed; produced the verification gap that hid this regression) +**Related**: AZ-883 (SCALED_IMU2 latent ts_ns=0 bug; uncovered during this investigation; separate ticket) +**Component**: c1_vio (`klt_ransac.py`, `bench/okvis2.py`, `bench/vins_mono.py`, `_facade_spine.py`) + `_types/nav.py` (VioOutput dataclass) + c5_state (`eskf_baseline.py:add_vio` consumes the field) + c13_fdr (consumes `emitted_at_ns` per the docstring's "adaptive-gating decisions") +**Tracker**: AZ-848 (https://denyspopov.atlassian.net/browse/AZ-848) +**Parent Epic**: (none — bug surfaced in cycle 3 Step 11) + +Jira AZ-848 is the authoritative spec; this file is the in-workspace mirror. 
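The Frame 0 / Frame 1 failure chain described above can be reproduced with a toy model of the shared guard. This is a minimal sketch, not the C5 implementation: `SharedTimestampGuard`, `replay_two_events`, the strict `<` comparison, and `FRAME_TS_NS` are hypothetical names and values chosen for illustration; only the two large timestamps are taken from the error log quoted in this ticket.

```python
from dataclasses import dataclass

# Values copied from the AZ-848 error log.
JETSON_MONOTONIC_NS = 1_187_232_637_925_619  # VioOutput.emitted_at_ns (process monotonic)
FC_IMU_TS_NS = 187_370_418_000               # imu_window.ts_end_ns (FC-boot-relative)
# Hypothetical frame-timeline timestamp on the FC timebase (assumption, not from the log).
FRAME_TS_NS = 187_300_000_000


@dataclass
class SharedTimestampGuard:
    """Toy stand-in for one `_last_added_ts_ns` shared by VIO and IMU inputs."""

    last_added_ts_ns: int = -1

    def accept(self, ts_ns: int) -> bool:
        """Accept the sample unless it is older than the last accepted one."""
        if ts_ns < self.last_added_ts_ns:
            return False  # the real filter would log an out-of-order event here
        self.last_added_ts_ns = ts_ns
        return True


def replay_two_events(vio_ts_ns: int, imu_ts_ns: int) -> list[bool]:
    """Feed one VIO timestamp, then one IMU timestamp, through a fresh guard."""
    guard = SharedTimestampGuard()
    return [guard.accept(vio_ts_ns), guard.accept(imu_ts_ns)]


# Mixed timebases: the IMU sample looks ancient next to the Jetson uptime.
mismatched = replay_two_events(JETSON_MONOTONIC_NS, FC_IMU_TS_NS)
# Both inputs on the FC timebase: the IMU sample is accepted.
aligned = replay_two_events(FRAME_TS_NS, FC_IMU_TS_NS)
print(mismatched, aligned)
```

With mixed timebases the second call is rejected; with both timestamps on the FC timebase it is accepted, which is the behavior the proposed contract change aims for.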
+ +## Symptom + +On Jetson (`scripts/run-tests-jetson.sh`), four tests in `tests/e2e/replay/test_derkachi_1min.py` fail with identical root cause: + +- `test_ac1_exits_0_jsonl_count_match` +- `test_ac5_determinism_two_runs_diff` +- `test_ac6_pace_realtime_60s_within_5pct` +- `test_ac6_pace_asap_under_30s` + +All four assert `gps-denied-replay` exits 0; the binary actually exits 1 on frame 3 with: + +``` +ERROR c5_state.eskf_baseline c5.state.eskf_out_of_order + source=imu_window ts_ns=187,370,418,000 last_added_ts_ns=1,187,232,637,925,619 +ERROR c5_state.eskf_baseline c5.state.eskf_filter_divergence + source=vio mahalanobis_sq=109.76467866548009 threshold_sq=100.0 +ERROR runtime_root.replay_loop replay_loop.state_add_vio_fatal + frame=3 EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0') +``` + +Mahalanobis distance is identical (109.765) across all four runs — fully deterministic on the Derkachi 1-min clip. + +Additionally, `test_ac3_within_100m_80pct_of_ticks` reports XPASS (was `@xfail` referencing AZ-777). Appears to be a symptom of the same bug — with the binary exiting code 1 before any GPS-denied emissions land, the `≥ 80 % within 100 m` assertion evaluates against an empty population and passes vacuously. The XPASS is NOT honest evidence that AZ-777 has been completed. + +## Origin — AZ-776 verification gap + +Commit `8de2716 [AZ-776] Open-loop ESKF composition profile via c4_pose.enabled` removed `@pytest.mark.xfail` decorators from AC-1 (line 61), AC-2 (line 138), AC-5 (line 413), AC-6 realtime (line 453), AC-6 asap (line 479) of `test_derkachi_1min.py`. The AZ-776 spec (`_docs/02_tasks/done/AZ-776_eskf_open_loop_composition_profile.md`) claims under AC-7: + +> `_run_replay_loop` in `runtime_root/__init__.py` is exercised end-to-end on Jetson by a non-`xfail` integration test (AC-1, AC-2, AC-5, AC-6 realtime, AC-6 asap in `tests/e2e/replay/test_derkachi_1min.py` un-xfail **and pass**). 
+ +This was not honored — AZ-776 closed without an honest Jetson run. Predates the `meta-rule.mdc` "Real Results, Not Simulated Ones" rule (added 2026-05) that would have caught it. + +## Cycle-3 scope (not the cause) + +Cycle-3 Step 11 (2026-05-24) surfaced this on the first full Jetson run since cycle 1. Cycle-3's only src change was commit `fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint` — four files, all in `_types/route.py` (new), `c11_tile_manager/route_client.py`, `replay_input/__init__.py`, `replay_input/tlog_route.py`. None of `c5_state`, `c8_fc_adapter`, `runtime_root` were touched. Most recent change to `c5_state/eskf_baseline.py` is AZ-389; to `c8_fc_adapter/tlog_replay_adapter.py` is AZ-398. Both pre-date cycle 1. The latent contract clash was always there — Jetson uptime + an un-`xfail`ed test combined to make it deterministic. + +## Diagnosis evidence (2026-05-26) + +`/tmp/inspect_tlog.py` (ad-hoc pymavlink probe against `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`) — outputs preserved in this session's chat history: + +- 4326 RAW_IMU msgs, time_usec ∈ [187,274,914 ; 633,952,656] µs (boot-relative ~187s–~634s) +- 4330 SCALED_IMU2 msgs, time_boot_ms ∈ [187,274 ; 633,954] ms (same timebase, same range) +- Both IMU types share the FC's boot timebase → original "two-IMU-clock-source mismatch" hypothesis is REFUTED +- `klt_ransac.py:274` populates `VioOutput.emitted_at_ns = self._clock.monotonic_ns()` → 1.187·10¹⁵ ns on the test Jetson (uptime 13.7 days) +- `_types/nav.py:158` documents this contract explicitly: "`emitted_at_ns` is `time.monotonic_ns` at output time." 
+- `eskf_baseline.py:492` reads `ts_ns = vio.emitted_at_ns` and stores it in `_last_added_ts_ns`, the same field that `add_fc_imu` checks against `imu_window.ts_end_ns` (FC-boot-relative)
+- Confirmed: the inequality direction MATCHES the AZ-848 error log (`ts_ns=187,370,418,000 < last_added_ts_ns=1,187,232,637,925,619`)
+
+## Affected files
+
+- `src/gps_denied_onboard/_types/nav.py`: `VioOutput.emitted_at_ns` field + docstring at line 158 (contract change site)
+- `src/gps_denied_onboard/components/c1_vio/klt_ransac.py:274,425,463,592–619`: every site that fills `emitted_at_ns`
+- `src/gps_denied_onboard/components/c1_vio/bench/okvis2.py`, `vins_mono.py`: other C1 strategies that fill `emitted_at_ns`
+- `src/gps_denied_onboard/components/c1_vio/_facade_spine.py`: `frame_ts_ns(frame)` is the existing helper that should be the new source of truth
+- `src/gps_denied_onboard/components/c5_state/eskf_baseline.py:492,502,565`: already reads `vio.emitted_at_ns`; no API change needed once the field's semantics are fixed
+- `src/gps_denied_onboard/components/c13_fdr/**`: reads `emitted_at_ns` per the docstring's "adaptive-gating decisions"; behavior change must be evaluated
+- `_docs/02_document/contracts/c1_vio/`: contract docs need re-version (semantic change to a public field)
+- `tests/e2e/replay/test_derkachi_1min.py`: the failing tests; AC-3 XPASS handling per AC-6 below
+
+## Repro
+
+```
+bash scripts/run-tests-jetson.sh
+# pytest report (after ~5 min):
+# tests/e2e/replay/test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match FAILED
+# tests/e2e/replay/test_derkachi_1min.py::test_ac5_determinism_two_runs_diff FAILED
+# tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_realtime_60s_within_5pct FAILED
+# tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_asap_under_30s FAILED
+# tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks XPASS
+```
+
+## Acceptance Criteria
+
+| # | Criterion |
+|---|-----------|
+| AC-1 | The `VioOutput.emitted_at_ns` contract docstring (`_types/nav.py:158`) no longer says "monotonic_ns at output time"; the field's semantics are documented as "the frame's timeline timestamp aligned with C8 FC IMU timebase, so C5 ESKF can compare against `imu_window.ts_end_ns` without a clock-source mismatch". A version bump is recorded in `_docs/02_document/contracts/c1_vio/`. |
+| AC-2 | Every C1 strategy (`klt_ransac.py`, `bench/okvis2.py`, `bench/vins_mono.py`) populates `emitted_at_ns` from the frame's timestamp (via `frame_ts_ns(frame)` or the strategy's own equivalent), NOT from `monotonic_ns()`. A unit test per strategy asserts the field value equals `frame_ts_ns(frame)`. |
+| AC-3 | A determinism test reads two consecutive frames' `VioOutput.emitted_at_ns` values and asserts they are equal to `frame_ts_ns(frame_n)` and `frame_ts_ns(frame_n+1)` respectively, locking the new invariant. |
+| AC-4 | Fix lands and `test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` PASSES on Jetson with `RUN_REPLAY_E2E=1`; no `@xfail` re-add. |
+| AC-5 | `test_ac5_determinism_two_runs_diff`, `test_ac6_pace_realtime_60s_within_5pct`, `test_ac6_pace_asap_under_30s` also PASS on Jetson. |
+| AC-6 | XPASS on `test_ac3_within_100m_80pct_of_ticks` is investigated. If symptom of the same bug, returns to honest XFAIL referencing AZ-777 once binary exits 0 cleanly. If genuine pass, AZ-777 is closed instead. |
+| AC-7 | C13 FDR consumers of `emitted_at_ns` are audited: any code path that relied on the field being monotonic-clock-wall-time has its behavior preserved via an explicit `time.monotonic_ns()` recorded under a different name (e.g., `recorded_at_ns`) or its expectation is documented as "frame timeline; not wall clock". |
+| AC-8 | `meta-rule.mdc` "Real Results" gate is honored: no ticket may close `Done` until the operator has eyes on a green Jetson run log line.
| + +## Notes + +- Tracker context: surfaced `cycle: 3, step: 11` on 2026-05-24; root cause re-diagnosed 2026-05-26 (operator-supervised investigation against the actual Derkachi tlog). +- Local unit suite (`pytest tests/unit/`) passes 2303 / 0 fail / 86 legitimate skips after C12 cold-start threshold relax (`05f1143 [AZ-844]`). +- Cycle 3 Step 11 verdict was PASS for cycle-3-scope; this ticket captures the wider Jetson regression for next cycle. +- Local mirror created retroactively 2026-05-24 (cycle 3 Step 12 entry) — Jira AZ-848 filed 2026-05-24 was the original signal; mirror was missing. +- 2026-05-26: spec materially revised after evidence-based investigation refuted the original "two-IMU-clock-source mismatch" hypothesis. The corrected diagnosis points at the C1 contract (`VioOutput.emitted_at_ns` semantics), not at the C8 adapter. The SCALED_IMU2 latent bug surfaced during this investigation is split out as AZ-883 to keep this ticket's scope tight. + +## References + +- Jira: https://denyspopov.atlassian.net/browse/AZ-848 +- Run-tests report: `_docs/03_implementation/run_tests_step11_report.md` (Cycle 3 closeout, lines 617–635) +- Origin spec: `_docs/02_tasks/done/AZ-776_eskf_open_loop_composition_profile.md` +- Related: AZ-777 (the XFAIL the AC-6 XPASS originally referenced); AZ-883 (SCALED_IMU2 latent bug) diff --git a/_docs/02_tasks/backlog/AZ-883_scaled_imu2_ts_ns_zero_default.md b/_docs/02_tasks/backlog/AZ-883_scaled_imu2_ts_ns_zero_default.md new file mode 100644 index 0000000..a27e290 --- /dev/null +++ b/_docs/02_tasks/backlog/AZ-883_scaled_imu2_ts_ns_zero_default.md @@ -0,0 +1,74 @@ +# `_handle_imu` mis-reads SCALED_IMU2 timestamps — produces ts_ns=0 for every other IMU sample + +> **SCOPE UPDATE (2026-05-26, cycle-4 planning)** +> +> Deprioritised behind AZ-894 (CSV-driven replay adapter). This bug only matters once the tlog-adapter path is reactivated for tlog-only flights (flights that ship with a `.tlog` but no companion `data_imu.csv`). 
Stays open in backlog. +> +> **Priority**: backlog (deprioritised from cycle-4 candidate) +> **Bench-blocking?**: no — AZ-894 supersedes the tlog path for Derkachi +> **Complexity**: unchanged (2 SP) + +**Task**: AZ-883_scaled_imu2_ts_ns_zero_default +**Name**: Branch `_handle_imu` on message type so SCALED_IMU2 uses `time_boot_ms × 1_000_000` instead of the missing `time_usec` field +**Description**: `src/gps_denied_onboard/components/c8_fc_adapter/tlog_replay_adapter.py:683` routes BOTH `RAW_IMU` and `SCALED_IMU2` messages through `_handle_imu`, which at line 710 reads `getattr(msg, "time_usec", 0) * 1000` to compute `sensor_ts_ns`. SCALED_IMU2 has no `time_usec` field (its time field is `time_boot_ms`, uint32 milliseconds since FC boot), so the `getattr` default-of-zero path fires for every SCALED_IMU2 message. The resulting IMU sample stream alternates RAW_IMU timestamps with `ts_ns=0` values. + +**Evidence (2026-05-26 investigation against `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`)**: + +- 4326 RAW_IMU messages with `time_usec` ∈ [187,274,914 ; 633,952,656] µs (boot-relative microseconds, ~187s–~634s) +- 4330 SCALED_IMU2 messages with `time_boot_ms` ∈ [187,274 ; 633,954] ms (same FC-boot timebase, same range) +- Both interleaved in arrival order — every other IMU sample is the affected type +- `_handle_imu`'s simulated output: 4266 non-monotonic transitions out of 8656 (~49 %) — almost every other transition is non-monotonic because SCALED_IMU2 collapses to ts_ns=0 + +**Why this is currently latent**: C5 ESKF's `add_fc_imu` reads `imu_window.ts_end_ns` (the LAST sample's ts_ns) for monotonicity guarding. If the last sample in the window happens to be RAW_IMU, the guard passes. 
The per-sample preintegration loop at `eskf_baseline.py:627–647` reads each `sample.ts_ns` individually for delta-t computation, but with ts_ns=0 samples interleaved, the delta-t arithmetic produces negative or near-zero intervals that get silently absorbed by the bias-correction math without raising. It WILL bite once any downstream consumer (FDR replay, latency analyser, deterministic-time gate) does a per-sample monotonicity assertion. + +**Why this surfaced now**: the operator-supervised AZ-848 investigation read the Derkachi tlog through pymavlink and observed the interleaving directly. The bug has been present since `_handle_imu` was written (predates cycle 1) and was never caught because no test asserts per-sample IMU monotonicity. + +**Complexity**: 2 SP +**Dependencies**: AZ-848 (split off from its investigation; can land before, after, or in parallel — no shared code path beyond `_handle_imu`) +**Component**: c8_fc_adapter (`tlog_replay_adapter.py`) +**Tracker**: AZ-883 (https://denyspopov.atlassian.net/browse/AZ-883) — Jira ticket created 2026-05-26 during cycle 3 release flow; allocated key AZ-883 (next-available, NOT the originally-planned AZ-849) +**Parent Epic**: (none — bug surfaced during AZ-848 investigation) + +## Symptom + +If you add a per-sample monotonicity assertion to the C5 ESKF or to the C8 tlog adapter pre-emit gate, every Jetson run against the Derkachi tlog reports 4266 zero-valued IMU sample timestamps interleaved with proper RAW_IMU values. The assertion fires immediately at message index 1 (the first SCALED_IMU2 after the first RAW_IMU). 
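
The symptom can be reproduced without the tlog itself. The following is a minimal sketch, assuming a synthetic stream in which RAW_IMU-like and SCALED_IMU2-like messages strictly alternate on one FC-boot timebase. `legacy_ts_ns`, `synthetic_stream`, and `first_violation` are hypothetical names used only for illustration; they are not functions in the adapter.

```python
from types import SimpleNamespace


def legacy_ts_ns(msg):
    """Mimic the current read: time_usec (microseconds) to ns, defaulting to 0 when absent."""
    return int(getattr(msg, "time_usec", 0)) * 1000


def synthetic_stream(n_pairs, start_ms=187_274, step_ms=100):
    """Alternate RAW_IMU-like and SCALED_IMU2-like messages (assumed interleaving)."""
    msgs = []
    for i in range(n_pairs):
        t_ms = start_ms + i * step_ms
        msgs.append(SimpleNamespace(time_usec=t_ms * 1000))  # RAW_IMU carries time_usec
        msgs.append(SimpleNamespace(time_boot_ms=t_ms))      # SCALED_IMU2 has no time_usec
    return msgs


def first_violation(ts_values):
    """Return the index of the first sample that is not strictly later than its predecessor."""
    for i in range(1, len(ts_values)):
        if ts_values[i] <= ts_values[i - 1]:
            return i
    return None


ts = [legacy_ts_ns(m) for m in synthetic_stream(n_pairs=5)]
zero_count = sum(1 for t in ts if t == 0)
print(first_violation(ts), zero_count)
```

Under these assumptions every SCALED_IMU2-like message collapses to `ts_ns=0`, so a strict per-sample check trips on the second message of the stream.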
+ +## Proposed fix + +Modify `_handle_imu` (`src/gps_denied_onboard/components/c8_fc_adapter/tlog_replay_adapter.py:709`) to branch on the message type via the caller's already-computed `msg_type`: + +```python +def _handle_imu(self, msg: Any, *, msg_type: str) -> bool: + if msg_type == "RAW_IMU": + sensor_ts_ns = int(getattr(msg, "time_usec", 0)) * 1000 + elif msg_type == "SCALED_IMU2": + sensor_ts_ns = int(getattr(msg, "time_boot_ms", 0)) * 1_000_000 + else: + raise FcOpenError( + f"_handle_imu called with unsupported msg_type={msg_type!r}; " + f"expected RAW_IMU or SCALED_IMU2" + ) + ... +``` + +Update the caller at line 684 to pass `msg_type=msg_type`. Add a unit test that synthesises a SimpleNamespace with `time_boot_ms=187274` (no `time_usec` field) and verifies the emitted `ImuTelemetrySample.ts_ns == 187_274_000_000`. + +Alternative (heavier): pick a single canonical message type at construction time (parameterise the adapter with `imu_source: Literal["RAW_IMU","SCALED_IMU2"]`, auto-detected from the tlog pre-scan) and drop the non-chosen type at the dispatch site. This buys cleaner streams but doubles the test matrix. + +The branching fix is simpler and preserves the existing OR-group semantic (`("RAW_IMU", "SCALED_IMU2")` in `_REQUIRED_MESSAGE_GROUPS`). + +## Acceptance Criteria + +| # | Criterion | +|---|-----------| +| AC-1 | `_handle_imu` reads `time_boot_ms × 1_000_000` for SCALED_IMU2 messages and `time_usec × 1000` for RAW_IMU. A unit test exercises both branches with a synthetic SimpleNamespace lacking the OTHER field. | +| AC-2 | An integration test against the Derkachi tlog (Tier-1; no Jetson hardware needed — only pymavlink + the tlog file) asserts that the IMU stream as seen by the runtime loop is strictly monotonic ts_ns. The test reads at least the first 100 IMU samples and verifies `sample[i+1].ts_ns > sample[i].ts_ns` for all i. | +| AC-3 | No regression in existing RAW_IMU-only adapter tests. 
| +| AC-4 | The fix is independent of AZ-848 — does not require the VioOutput contract change to land first. | + +## References + +- Jira: https://denyspopov.atlassian.net/browse/AZ-883 +- Origin: AZ-848 investigation, 2026-05-26 cycle 3 Step 16.5 release flow +- Related: AZ-848 (the VIO contract repair; both surfaced from the same investigation but their fixes are independent) +- Tlog evidence: `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`, 8656 IMU samples (4326 RAW_IMU + 4330 SCALED_IMU2 interleaved) diff --git a/_docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md b/_docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md index 672adf8..03f9cdb 100644 --- a/_docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md +++ b/_docs/02_tasks/todo/AZ-842_replay_protocol_and_orchestrator_docs.md @@ -3,17 +3,30 @@ **Task**: AZ-842_replay_protocol_and_orchestrator_docs **Name**: Docs: replay_protocol.md Invariant 12 + AZ-777 Phase 3+ superseded note + orchestrator-test README (AZ-835 C6) **Description**: Sixth and final building block of Epic AZ-835. Capture the route-driven flow in the authoritative documents so future implementers, operators, and reviewers understand what changed and why. 
-**Complexity**: 2 SP -**Dependencies**: AZ-841 (C5, un-xfail — SOFT; README describes test outcomes assuming C5 has landed); AZ-777 (being closed/superseded by this Epic — AZ-777 spec is updated during the AZ-777 closure step, verified by AC-6); AZ-835 (parent Epic) +**Complexity**: 3 SP (cycle-4 rescope: was 2 SP) +**Dependencies**: AZ-894 (CSV adapter — HARD; replay_protocol.md sub-section describes the new single-canonical-clock flow); AZ-895 (auto-sync deprecation — HARD; replay_protocol.md sub-section describes the tlog adapter's new audit-only role); AZ-896 (CSV format docs — SOFT; replay_protocol.md cross-links to the format spec); AZ-777 (closed/superseded by this Epic); AZ-835 (parent Epic) **Component**: `_docs/02_document/contracts/replay/replay_protocol.md` + `_docs/02_document/architecture.md` + `tests/e2e/replay/README*.md` **Tracker**: AZ-842 (https://denyspopov.atlassian.net/browse/AZ-842) **Parent Epic**: AZ-835 Jira AZ-842 is the authoritative spec; this file is the in-workspace mirror. +> **Cycle-4 rescope (2026-05-26)**: dropped the AZ-841 (un-xfail) soft +> dependency — AZ-841 was deferred to backlog in cycle-4 Step 9 scope +> review (see `_docs/02_tasks/backlog/AZ-841_unxfail_az777_tier2_tests.md`). +> Expanded scope from "AZ-835 epic docs only" to also cover the cycle-4 +> replay-input redesign narrative: AZ-894 (CSV-driven single-canonical-clock +> adapter), AZ-895 (tlog adapter → audit-only after auto-sync deprecation), +> AZ-896 (CSV format spec). The replay_protocol.md edits now describe BOTH +> the route-driven AZ-835 flow AND the cycle-4 CSV-driven replay path, +> which together supersede the legacy tlog+auto-sync surface. +> Complexity bumped 2 → 3 SP to cover the added cycle-4 narrative. + ## Modified files -### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension +### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension + Invariant 13 (NEW, cycle-4) + +**1a. 
Invariant 12 — route-driven flow (AZ-835)** Extend **Invariant 12** with an AZ-835 sub-section describing: @@ -21,6 +34,16 @@ Extend **Invariant 12** with an AZ-835 sub-section describing: - Why route-driven supersedes the AZ-777 bbox approach (efficiency: ~100× fewer tiles; honesty: pre-commits to where the operator did fly). - The C3 fixture's failure-handling contract (validation/terminal → re-raise; transient → retry up to 3 attempts using C11's existing backoff schedule). +**1b. Invariant 13 — single canonical clock (cycle-4, AZ-894 / AZ-895 / AZ-896)** + +Add a new **Invariant 13** sub-section describing: + +- The single-clock model production uses (single edge device, single clock at receipt) and why two-clock surfaces (e.g. `VioOutput.emitted_at_ns` from Jetson monotonic vs. `ImuWindow.ts_end_ns` from FC-boot) produce ESKF out-of-order regressions like AZ-848. +- The CSV-driven replay path (AZ-894) — `(video, CSV)` operator input, IMU + GPS-ground-truth on a single canonical clock derived from the CSV's `Time` column, no auto-sync. +- The CSV schema (delegate to `_docs/02_document/contracts/replay/csv_replay_format.md` produced by AZ-896 for the field-level spec). +- The tlog-replay adapter's new audit-only role (AZ-895): retained for FDR analysis and one-shot tlog→CSV export, removed from the test/demo critical path. +- Auto-sync deprecation (AZ-895): `--time-offset-ms` / `--skip-auto-sync-validation` CLI flags removed or marked deprecated with one-cycle warning. + ### 2. `_docs/02_document/architecture.md` — satellite-provider entry extension Append a sub-section to the existing satellite-provider entry noting that Epic AZ-835 + its C1-C5 children landed the full e2e real-flight validation path on top of AZ-777 Phase 1's wire + C11 contract adaptation. Mark AZ-777 Phase 3+ as superseded by Epic AZ-835 (pointer-only — the AZ-777 spec itself is updated in C5's wake during the AZ-777 closure step). 
@@ -39,11 +62,13 @@ Either extend `tests/e2e/replay/README.md` or create a dedicated `tests/e2e/repl | # | Criterion | |---|-----------| | AC-1 | `replay_protocol.md` Invariant 12 has a new AZ-835 sub-section covering the route-driven flow, the bbox-supersedure rationale, and the failure-handling contract. | +| AC-1b | `replay_protocol.md` has a new Invariant 13 (cycle-4) sub-section covering the single-canonical-clock model, the CSV-driven replay path (AZ-894), the tlog adapter's audit-only role (AZ-895), and auto-sync deprecation. Links to `csv_replay_format.md` (AZ-896). | | AC-2 | `architecture.md` satellite-provider entry has a sub-section noting Epic AZ-835's contribution and pointing at AZ-777 Phase 3+ as superseded. | +| AC-2b | `architecture.md` replay-input section explains the cycle-4 redesign: CSV adapter primary path, tlog adapter audit-only role, removal of auto-sync. References AZ-894 / AZ-895 / AZ-896 / AZ-897. | | AC-3 | `tests/e2e/replay/README*.md` exists and a new contributor can run the orchestrator test on Jetson using only the README's instructions (no out-of-band knowledge required). | -| AC-4 | All three docs link to the Epic (AZ-835) and to the relevant child tickets (AZ-836 / AZ-838 / AZ-839 / AZ-840 / AZ-841). | +| AC-4 | All three docs link to the Epic (AZ-835), its children (AZ-836 / AZ-838 / AZ-839 / AZ-840), and the cycle-4 redesign tickets (AZ-894 / AZ-895 / AZ-896 / AZ-897). AZ-841 reference omitted (deferred to backlog). | | AC-5 | License attribution string ("Imagery © Google") and the dev-only caveat are present in the test README. | -| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children. | +| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children and at the cycle-4 redesign tickets. 
| ## Out of scope diff --git a/_docs/02_tasks/todo/AZ-894_csv_driven_replay_adapter.md b/_docs/02_tasks/todo/AZ-894_csv_driven_replay_adapter.md new file mode 100644 index 0000000..e0b9683 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-894_csv_driven_replay_adapter.md @@ -0,0 +1,53 @@ +# Replay: CSV-driven IMU+GPS adapter using single canonical clock + +**Task**: AZ-894_csv_driven_replay_adapter +**Name**: Add a CSV-replay adapter that consumes the Derkachi-schema `data_imu.csv` (or any flight that ships with a paired CSV) and exposes IMU + GPS-ground-truth on a single canonical clock derived from the CSV's `Time` column +**Description**: Cycle 3 surfaced AZ-848 (eskf_out_of_order on frame 3) because the current replay pipeline imports two incompatible clocks: `VioOutput.emitted_at_ns` uses Jetson process-monotonic time, while `ImuWindow.ts_end_ns` uses FC-boot-relative time (parsed from MAVLink tlog messages). The single-clock model that production uses (single edge device, single clock at receipt) is not what replay does today. The Derkachi fixture's `data_imu.csv` already contains both IMU (`SCALED_IMU2.*`) and GPS ground truth (`GLOBAL_POSITION_INT.*`) on a single canonical clock (the `Time` column, 0..489.9 s at 10 Hz, aligned 3:1 with the 30 fps video). Using the CSV directly eliminates the clock-mismatch surface entirely for the test/demo path and matches the production single-clock model. 
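
As a rough illustration of the single-clock model, the sketch below places video frames and CSV rows on one nanosecond timeline. It assumes the 30 fps video and 10 Hz CSV rates quoted above and that frame 0 coincides with row 0. `frame_time_ns` and `row_time_ns` are hypothetical helpers, not part of the existing code base.

```python
NS_PER_S = 1_000_000_000


def frame_time_ns(frame_index, fps=30):
    """Frame-derived timestamp t = N / fps, in integer nanoseconds."""
    return frame_index * NS_PER_S // fps


def row_time_ns(time_s):
    """CSV `Time` column (flight-relative seconds) converted to integer nanoseconds."""
    return round(time_s * NS_PER_S)


# With a 3:1 frame-to-row ratio, every third frame shares a timestamp with a CSV row.
for row_index in range(3):
    frame_index = row_index * 3
    assert frame_time_ns(frame_index) == row_time_ns(row_index / 10)
```

Because both values are derived from the same origin, a comparison between a frame timestamp and an IMU timestamp never mixes clock sources in this model.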
+ +**Complexity**: 3 SP +**Dependencies**: AZ-896 (format docs land in the same cycle but can land in either order) +**Blocks**: AZ-895 (auto-sync deprecation), AZ-897 (replay UI) +**Component**: replay_input (new adapter), c8_fc_adapter (alternate ground-truth source), cli/replay +**Tracker**: AZ-894 (https://denyspopov.atlassian.net/browse/AZ-894) +**Parent Epic**: (none — cycle-4 replay-input redesign) + +## Schema + +The Derkachi CSV header (19 columns): + +``` +timestamp(ms), Time, +SCALED_IMU2.xacc, SCALED_IMU2.yacc, SCALED_IMU2.zacc, +SCALED_IMU2.xgyro, SCALED_IMU2.ygyro, SCALED_IMU2.zgyro, +SCALED_IMU2.xmag, SCALED_IMU2.ymag, SCALED_IMU2.zmag, +GLOBAL_POSITION_INT.lat, GLOBAL_POSITION_INT.lon, GLOBAL_POSITION_INT.alt, +GLOBAL_POSITION_INT.relative_alt, +GLOBAL_POSITION_INT.vx, GLOBAL_POSITION_INT.vy, GLOBAL_POSITION_INT.vz, +GLOBAL_POSITION_INT.hdg +``` + +- `timestamp(ms)`: FC-boot-relative milliseconds (kept for traceability; not used by C5) +- `Time`: flight-relative seconds (canonical clock — what C5 actually uses) +- `SCALED_IMU2.*`: 10 Hz IMU stream (accel mg, gyro mrad/s, mag mGauss per ArduPilot convention) +- `GLOBAL_POSITION_INT.*`: 10 Hz GPS ground truth (lat/lon in 1e-7 deg, alt in mm, vx/vy/vz in cm/s, hdg in cdeg) + +## Acceptance Criteria + +- **AC-1**: Adapter parses the Derkachi `data_imu.csv` end-to-end and emits 4,899 IMU samples + 4,899 GPS-ground-truth samples on a single monotonic clock anchored at row 0. +- **AC-2**: Wired into `cli/replay.py`; `gps-denied-replay --video flight_derkachi.mp4 --imu data_imu.csv` runs without invoking `tlog_replay_adapter.py`. +- **AC-3**: `test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` passes on the Jetson e2e harness using the new path. AZ-848 cascade no longer triggers (no two-clock surface in the new path). +- **AC-4**: `VioOutput.emitted_at_ns` is populated from the CSV's `Time` column (or the frame-derived `t = N/fps`), not `time.monotonic_ns()`, when the new adapter is in use. 
+- **AC-5**: Schema mismatch (missing required column, NaN in `Time`, non-monotonic `Time`) raises a clear `ReplayInputAdapterError` at startup, not deep in the loop. + +## Out of scope + +- The structural AZ-848 / AZ-883 fix in the tlog adapter — those stay open as backlog. +- UI for picking the CSV — AZ-897. +- Other CSV schemas (PX4, generic MAVLink dumps) — future enhancement if needed. + +## References + +- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md` +- Bench-run evidence: `_docs/04_release/release_cycle3_jetson-bench_2026-05-26-1442.md` +- Companion tickets: AZ-895 (deprecate auto-sync), AZ-896 (format docs + example CSV), AZ-897 (replay UI) +- Supersedes (re bench-blocking): AZ-848 (VioOutput contract), AZ-883 (SCALED_IMU2 ts_ns=0) diff --git a/_docs/02_tasks/todo/AZ-895_deprecate_auto_sync_surface.md b/_docs/02_tasks/todo/AZ-895_deprecate_auto_sync_surface.md new file mode 100644 index 0000000..d6066c0 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-895_deprecate_auto_sync_surface.md @@ -0,0 +1,39 @@ +# Replay: deprecate auto_sync surface; tlog adapter → audit-only + +**Task**: AZ-895_deprecate_auto_sync_surface +**Name**: Remove the tlog+video auto-sync infrastructure and reframe `tlog_replay_adapter.py` as audit-only, now that AZ-894 ships the CSV-driven primary path +**Description**: User decision (2026-05-26): the test/demo replay path will accept a paired (video, CSV) input from the operator instead of auto-syncing a tlog and video. Auto-sync is unnecessary in production (single edge device, single clock by design) and over-engineered for test (the CSV already encodes the alignment). 
+ +**Complexity**: 2 SP +**Dependencies**: AZ-894 (must ship first — the CSV adapter is the replacement) +**Component**: replay_input (auto_sync.py, tlog_video_adapter.py), cli/replay, runtime_root/_replay_branch +**Tracker**: AZ-895 (https://denyspopov.atlassian.net/browse/AZ-895) +**Parent Epic**: (none — cycle-4 replay-input redesign) + +## Touch list + +- `src/gps_denied_onboard/replay_input/auto_sync.py` — delete or convert to a clear no-op that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")` +- `src/gps_denied_onboard/replay_input/tlog_video_adapter.py` — strip auto-sync invocations +- `src/gps_denied_onboard/cli/replay.py` — remove `--time-offset-ms` / `--skip-auto-sync-validation` flags (or mark deprecated with one-cycle warning) +- `src/gps_denied_onboard/runtime_root/_replay_branch.py` — strip auto-sync wiring +- `tests/unit/replay_input/test_az405_auto_sync.py` — pass against the new behaviour or delete with rationale recorded in the batch report +- `tests/e2e/replay/test_derkachi_real_tlog.py` — continues to `@xfail` with the AZ-848-scoped reason; nothing in this ticket fixes the underlying tlog-clock bug +- `tlog_replay_adapter.py` / `tlog_ground_truth.py` — module docstrings updated to call out the new audit-only / one-shot-export roles + +## Acceptance Criteria + +- **AC-1**: `auto_sync.py` is either deleted or made into a clear no-op that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")`. +- **AC-2**: All references to `--time-offset-ms` / `--skip-auto-sync-validation` flags in the CLI are removed or marked deprecated with a one-cycle deprecation warning. +- **AC-3**: `test_az405_auto_sync` tests either pass against the new behaviour or are deleted with rationale recorded in the batch report. +- **AC-4**: `test_derkachi_real_tlog.py` continues to `@xfail` with the AZ-848-scoped reason; nothing in this ticket fixes the underlying tlog-clock bug. 
+- **AC-5**: Module docstrings of `tlog_replay_adapter.py` and `tlog_ground_truth.py` are updated to call out their new audit-only / one-shot-export roles. + +## Out of scope + +- AZ-848 / AZ-883 structural fix — they stay open as backlog (tlog path is still broken, just no longer the primary path). +- New CSV export tooling for arbitrary tlogs — future ticket. + +## References + +- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md` +- Companion: AZ-894 (CSV adapter — must land first), AZ-896 (docs), AZ-897 (UI) diff --git a/_docs/02_tasks/todo/AZ-896_replay_format_docs_and_example_csv.md b/_docs/02_tasks/todo/AZ-896_replay_format_docs_and_example_csv.md new file mode 100644 index 0000000..839c826 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-896_replay_format_docs_and_example_csv.md @@ -0,0 +1,38 @@ +# Docs: replay-input format spec + downloadable example CSV + +**Task**: AZ-896_replay_format_docs_and_example_csv +**Name**: Author the operator-facing format spec for the (video, CSV) replay input pair, plus a minimal downloadable example CSV +**Description**: Operators using the replay/demo path need to know the exact CSV schema the system accepts, the hard contract (video t=0 ≡ CSV row 0; video must be nadir; UAV must already be airborne at t=0), and have a downloadable example to copy from. Operators today have no entry point that documents this. 
+ +**Complexity**: 1 SP +**Dependencies**: AZ-894 (the adapter that consumes the format — the doc describes what AZ-894 accepts) +**Blocks**: AZ-897 (UI links to the docs page and serves the example CSV) +**Component**: docs (_docs/04_release/) +**Tracker**: AZ-896 (https://denyspopov.atlassian.net/browse/AZ-896) +**Parent Epic**: (none — cycle-4 replay-input redesign) + +## What + +- Author a docs page at `_docs/04_release/replay_input_format.md` (or wherever the operator-facing docs land in cycle 4) +- Schema table: column names, units, types, expected rates, required vs optional +- Constraint statements up top, before the column table: + - Video: nadir camera; UAV already airborne at frame 0 + - CSV: row 0 timestamp == video frame 0 timestamp; `Time` column starts at 0.0; rows monotonic and uniformly-spaced +- Ship `_docs/04_release/example_data_imu.csv` — a minimal valid example (e.g., 20 rows = 2 seconds at 10 Hz) +- Cross-link from the AZ-897 replay UI "Download example" button + +## Acceptance Criteria + +- **AC-1**: Schema page documents all 19 columns of the Derkachi CSV with units and types. +- **AC-2**: The three hard constraints (nadir / airborne / aligned-start) are stated up top, before the column table. +- **AC-3**: The example CSV (≥10 rows) passes through the AZ-894 CSV adapter without errors. +- **AC-4**: The page is reachable from the AZ-897 UI's "Download example" link. + +## Out of scope + +- Multi-schema support (PX4, generic MAVLink dumps). 
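
The CSV-side constraints listed under "What" (the `Time` column starts at 0.0, rows monotonic and uniformly spaced) could be checked with a small routine like the one below. This is a sketch under stated assumptions: `check_time_column`, the `tolerance_s` parameter, and the use of `ValueError` are illustrative choices, not the AZ-894 adapter's actual interface, and the example input is trimmed to two columns for brevity.

```python
import csv
import io
import math


def check_time_column(csv_text, expected_step_s=0.1, tolerance_s=1e-6):
    """Validate the `Time` column: starts at 0.0, strictly increasing, uniform spacing."""
    # skipinitialspace tolerates headers written as "timestamp(ms), Time".
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    if reader.fieldnames is None or "Time" not in reader.fieldnames:
        raise ValueError("missing required column: Time")
    times = [float(row["Time"]) for row in reader]
    if not times:
        raise ValueError("CSV has no data rows")
    if any(math.isnan(t) for t in times):
        raise ValueError("NaN in Time column")
    if abs(times[0]) > tolerance_s:
        raise ValueError(f"Time must start at 0.0, got {times[0]}")
    for i in range(1, len(times)):
        step = times[i] - times[i - 1]
        if step <= 0:
            raise ValueError(f"non-monotonic Time at row {i}")
        if abs(step - expected_step_s) > tolerance_s:
            raise ValueError(f"non-uniform spacing at row {i}: {step}")
    return len(times)


example = "timestamp(ms), Time\n187274, 0.0\n187374, 0.1\n187474, 0.2\n"
row_count = check_time_column(example)
```

Running a check of this kind over the shipped example CSV before publishing it would be one way to gain confidence in AC-3.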
+ +## References + +- Companion: AZ-894 (CSV adapter), AZ-897 (UI), AZ-895 (auto-sync deprecation) +- Source fixture: `_docs/00_problem/input_data/flight_derkachi/data_imu.csv`, README at `_docs/00_problem/input_data/flight_derkachi/README.md` diff --git a/_docs/02_tasks/todo/AZ-897_replay_ui_web_form.md b/_docs/02_tasks/todo/AZ-897_replay_ui_web_form.md new file mode 100644 index 0000000..cf8e0bb --- /dev/null +++ b/_docs/02_tasks/todo/AZ-897_replay_ui_web_form.md @@ -0,0 +1,45 @@ +# Replay UI: web form for paired video + CSV input + example download + +**Task**: AZ-897_replay_ui_web_form +**Name**: Build the first operator-facing UI for the GPS-denied onboard system — a single-page form that uploads a paired (video, CSV) for replay +**Description**: User decision (2026-05-26): the system offers an operator-facing UI for the test/demo replay path. The UI surfaces the hard constraints visually (nadir, airborne, aligned-start) so operators don't fail silently from a misaligned video. This is also the foundation for the deferred operator-tooling work (see `_docs/00_research/00_question_decomposition.md` lines 119, 224). + +Tech stack per `.cursor/rules/techstackrule.mdc`: React + Tailwind CSS. + +**Complexity**: 5 SP +**Dependencies**: AZ-894 (backend CSV adapter), AZ-896 (format docs + example CSV that the UI serves) +**Component**: frontend (new — first piece of operator-facing UI), backend (new HTTP endpoint that fronts `gps-denied-replay`) +**Tracker**: AZ-897 (https://denyspopov.atlassian.net/browse/AZ-897) +**Parent Epic**: (none — cycle-4 replay-input redesign; will likely become the first piece of a future operator-tooling epic) + +## Shape + +A single-page web form, served from a target to be decided during implementation (Jetson? operator workstation? containerised dev mode?). Hosts: + +- **Video file picker**. Accept `.mp4`, `.mov`. Display constraint hint: "Nadir camera; UAV already airborne at frame 0." +- **CSV file picker**. Accept `.csv`. 
Display constraint hint: "Row 0 timestamp must equal video frame 0; see format docs." +- **"Download example CSV"** link → AZ-896's `example_data_imu.csv`. +- **"View format docs"** link → AZ-896's `replay_input_format.md`. +- **"Start replay"** button → POSTs (video_path, csv_path) to a backend endpoint that invokes `gps-denied-replay --video X --imu Y`. +- **Result panel**: tail the replay subprocess output, display final verdict (PASS/FAIL + accuracy metrics). + +## Acceptance Criteria + +- **AC-1**: Form renders with both pickers, both constraint hints, download/docs links, and the start button. +- **AC-2**: The start button correctly invokes the replay pipeline against the selected files; success path returns a verdict; failure path returns the error reason from the backend. +- **AC-3**: Form rejects mismatched filename pairs only with explicit operator-actionable error messages — no silent failures. +- **AC-4**: Example-CSV download serves the file from AZ-896 with the correct content-type. +- **AC-5**: Tests cover empty submissions, mismatched file types, backend failures, and the happy path. React Testing Library + jest for component tests; an e2e smoke test covers the full flow. + +## Out of scope + +- Multi-flight management / history / library. +- Authentication / user accounts. +- Sector classification, pre-flight cache provisioning, mission planning (those are separate deferred items from C10 / `00_question_decomposition.md`). +- The deploy-target decision (Jetson vs operator workstation) — to be resolved during implementation; default proposal: containerised dev mode for now. 
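
A backend handler for the "Start replay" button described under Shape might build the subprocess invocation roughly as sketched below. `build_replay_command` and its validation rules are hypothetical; the only pieces taken from this spec are the `gps-denied-replay --video X --imu Y` command line and the accepted file extensions.

```python
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".mov"}
CSV_SUFFIXES = {".csv"}


def build_replay_command(video_path, csv_path):
    """Return the argv list for the replay CLI, or raise ValueError with an actionable message."""
    video, imu = Path(video_path), Path(csv_path)
    if video.suffix.lower() not in VIDEO_SUFFIXES:
        raise ValueError(f"video must be .mp4 or .mov, got {video.name!r}")
    if imu.suffix.lower() not in CSV_SUFFIXES:
        raise ValueError(f"IMU input must be .csv, got {imu.name!r}")
    # An argv list (rather than a shell string) avoids quoting issues with operator-supplied paths.
    return ["gps-denied-replay", "--video", str(video), "--imu", str(imu)]


cmd = build_replay_command("flight_derkachi.mp4", "data_imu.csv")
```

The resulting list could be handed to `subprocess.Popen` so the result panel can tail the process output; how the endpoint streams that output is left open here.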
+ +## References + +- Companion: AZ-894 (CSV adapter), AZ-896 (docs + example CSV) +- Deferred precedent: `_docs/00_research/00_question_decomposition.md` lines 119 ("Mission-planning UX is out of scope"), 224 ("Operator-side CLI/desktop tool design deferred to Plan-phase") +- Tech stack: React + Tailwind CSS per `.cursor/rules/techstackrule.mdc` diff --git a/_docs/02_tasks/todo/AZ-899_architecture_compliance_baseline.md b/_docs/02_tasks/todo/AZ-899_architecture_compliance_baseline.md new file mode 100644 index 0000000..e998686 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-899_architecture_compliance_baseline.md @@ -0,0 +1,78 @@ +# Land `architecture_compliance_baseline.md` (cycle-3 retro #3, third try) + +**Task**: AZ-899_architecture_compliance_baseline +**Name**: Create `_docs/02_document/architecture_compliance_baseline.md` so cumulative reviews can emit `## Baseline Delta` rows +**Description**: Cycle-1 retro Top-3 Improvement Action #3, repeated in cycle-3 retro Top-3 #3. The file was never created in cycles 2 or 3, leaving cumulative reviews unable to quantify carried-over / resolved / newly-introduced architecture violations per cycle. Seed the baseline from `_docs/06_metrics/structure_2026-05-20.md` with `0` violations, freeze the snapshot semantics, and wire the existing-code flow's Step 2 to reference it. +**Complexity**: 1 SP +**Dependencies**: None (operates on existing artifact `_docs/06_metrics/structure_2026-05-20.md`) +**Component**: documentation only — no source code change +**Tracker**: AZ-899 (https://denyspopov.atlassian.net/browse/AZ-899) +**Epic**: (none — cycle-4 process housekeeping) + +## Problem + +Cycle-3 retro § Structural Metrics: + +> `_docs/02_document/architecture_compliance_baseline.md` **still does not exist** — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3. 
+ +Without a baseline, cumulative reviews log "`_docs/02_document/architecture_compliance_baseline.md` does NOT exist → no Baseline Delta section emitted". Structural regressions (new cycles in the import graph, newly-introduced violations) therefore cannot be quantified across cycles — only verified pairwise per batch. + +## Outcome + +- Cumulative-review reports starting from cycle-4 batch 1 emit a `## Baseline Delta` section that quantifies new vs. resolved vs. carried-over architecture violations. +- Cycle-end retros can compare structural deltas across cycles using a single canonical baseline document instead of re-deriving from the previous cycle's snapshot. + +## Scope + +### Included + +- Create `_docs/02_document/architecture_compliance_baseline.md` seeded with **0** violations. +- Reference `_docs/06_metrics/structure_2026-05-20.md` as the source-of-truth snapshot from which the baseline was derived. +- Document the file's update protocol: a new violation found in a cumulative review is appended (with batch ID, severity, finding ID); a resolution is recorded by marking the row `RESOLVED in batch `. +- Document the snapshot-refresh trigger: any cycle that materially changes structure (component count, cross-component edges, new contracts) re-snapshots via `python -m gps_denied_onboard.tools.structure_snapshot` (or equivalent existing script — verify before reference). + +### Excluded + +- Refactoring source code to fix violations — none currently exist. +- Adding new component scaffolding — out of scope. +- Modifying `code-review` or `retrospective` skills — they already reference the file; the only change needed is making the referenced file exist. 
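
The bookkeeping behind a `## Baseline Delta` section can be sketched as a set comparison. The function names `baseline_delta` and `format_delta` and the finding-ID strings are hypothetical; this spec does not fix a data model, so treat the sketch as one possible reading of "new vs. resolved vs. carried-over".

```python
def baseline_delta(baseline_open, current_open):
    """Compare open finding IDs in the baseline against those found in the current review."""
    baseline_open, current_open = set(baseline_open), set(current_open)
    return {
        "new": sorted(current_open - baseline_open),
        "resolved": sorted(baseline_open - current_open),
        "carried_over": sorted(baseline_open & current_open),
    }


def format_delta(delta):
    """Render a one-line summary of the delta counts."""
    return (
        f"{len(delta['new'])} new, {len(delta['resolved'])} resolved, "
        f"{len(delta['carried_over'])} carried-over"
    )


empty = format_delta(baseline_delta([], []))
mixed = baseline_delta(["F-1", "F-2"], ["F-2", "F-3"])
```

With a baseline seeded at 0 violations, the first review's delta is simply whatever that review finds, all of it classified as new.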
+ +## Acceptance Criteria + +**AC-1: Baseline file exists with 0 violations** +Given a fresh repo checkout +When `ls _docs/02_document/architecture_compliance_baseline.md` runs +Then the file exists and its `## Violations` section is explicitly empty (or marked "None at baseline") + +**AC-2: Baseline references the structural snapshot** +Given the baseline file +When read +Then it includes a `## Source` section pointing at `_docs/06_metrics/structure_2026-05-20.md` and lists the structural facts (15 components, 0 import cycles, 5 contract files) that establish the "0 violations" claim + +**AC-3: Update protocol documented** +Given the baseline file +When read +Then it includes an `## Update Protocol` section describing append-on-violation, mark-resolved-on-fix, and the snapshot-refresh trigger + +**AC-4: Cumulative-review hook verified** +Given the baseline file in place +When the cycle-4 first cumulative-review report is generated +Then the report emits a `## Baseline Delta` section (even if empty: "0 new, 0 resolved, 0 carried-over") + +## Constraints + +- File format: markdown, matches the structure of `_docs/06_metrics/structure_2026-05-20.md` style. +- No source code change permitted under this ticket — strictly documentation. + +## Risks & Mitigation + +**Risk 1: Future violations slip past the baseline** +- *Risk*: A cumulative review finds a violation but the reviewer forgets to append it to the baseline. +- *Mitigation*: The `code-review` skill (referenced in cycle-3 retro Suggested Updates) should be updated separately to auto-append; this ticket only delivers the baseline file. The follow-up belongs in cycle 5 if needed. 
+ +## References + +- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md` § Top 3 Improvement Actions #3 +- Cycle-1 retro: `_docs/06_metrics/retro_2026-05-20.md` § Top 3 Improvement Actions #3 (original) +- Source snapshot: `_docs/06_metrics/structure_2026-05-20.md` +- Existing-code flow Step 2: `.cursor/skills/autodev/flows/existing-code.md` § "Step 2 — Architecture Baseline Scan" diff --git a/_docs/02_tasks/todo/AZ-900_autodev_retro_existence_gate.md b/_docs/02_tasks/todo/AZ-900_autodev_retro_existence_gate.md new file mode 100644 index 0000000..7e5ba96 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-900_autodev_retro_existence_gate.md @@ -0,0 +1,82 @@ +# Autodev: gate Step-9 entry on previous-cycle retro existence + +**Task**: AZ-900_autodev_retro_existence_gate +**Name**: Codify the LESSONS rule — autodev must block cycle-N+1 Step 9 entry if `retro_.md` for cycle N is absent +**Description**: Cycle-3 retro Top-3 Improvement Action #2 and 2026-05-26 LESSONS entry both call for codifying a Re-Entry After Completion gate that verifies the previous cycle's retro file exists before incrementing the cycle counter. Cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3 and all cycle-1 retro Top-3 actions sat invisible. This ticket codifies the gate in `.cursor/skills/autodev/flows/existing-code.md` § Re-Entry After Completion. +**Complexity**: 1 SP +**Dependencies**: None +**Component**: `.cursor/skills/autodev/flows/existing-code.md` (workflow doc only) +**Tracker**: AZ-900 (https://denyspopov.atlassian.net/browse/AZ-900) +**Epic**: (none — cycle-4 process housekeeping) + +## Problem + +LESSONS 2026-05-26 [process] entry: + +> Cycle-2 retro was never filed. The autodev orchestrator silently auto-chained from cycle-2 Step 17 (if it ran at all) straight into cycle-3 Step 9 without producing `retro_.md`. 
As a result, cycle-1 retro's Top-3 Improvement Actions sat invisible across cycle 2 and were re-discovered, all three still undelivered, only at cycle-3 close. + +Cycle-3 retro Top-3 #2 echoes the same recommendation. + +The fix is a one-line check in the flow file that BLOCKS Step 9 entry for cycle N+1 unless `_docs/06_metrics/retro_.md` for cycle N exists. + +## Outcome + +- Future cycle-N → cycle-(N+1) transitions are gated: the autodev orchestrator refuses to enter Step 9 of cycle N+1 if no retro file exists for cycle N. +- Missing retros are surfaced at the session boundary, not 6 weeks later at the next cycle's close. + +## Scope + +### Included + +- Edit `.cursor/skills/autodev/flows/existing-code.md` § "Re-Entry After Completion" to add a gate: before incrementing `cycle`, glob `_docs/06_metrics/retro_*.md` and verify a file dated after the cycle-N start exists. +- Define the BLOCK behavior: if absent, present a Choose A/B/C block: + - **A)** Author the missing retro now (invoke `.cursor/skills/retrospective/SKILL.md` in cycle-end mode) + - **B)** Stub a backfilled retro and proceed (with a leftover entry filed for proper backfill) + - **C)** Abort and ask the user +- Add a corresponding bullet to `.cursor/skills/autodev/state.md` § "Session Boundaries" pointing at the new gate. + +### Excluded + +- Retroactively writing cycle-2 retro (separate ticket if user wants it; cycle-3 retro already covers cycle-2 trend deltas where data is on disk). +- Adding similar gates to greenfield or meta-repo flows (only `existing-code` has the cycle counter). +- Per-step retro check inside cycles (this gate fires only at the cycle boundary). 
+ +## Acceptance Criteria + +**AC-1: Flow file gate exists** +Given `.cursor/skills/autodev/flows/existing-code.md` +When the "Re-Entry After Completion" section is read +Then it contains a step `Verify previous cycle's retro exists` BEFORE the cycle increment + +**AC-2: Choose A/B/C block specified** +Given the gate triggers (no retro file found) +When the documented behavior is consulted +Then it specifies the three options (A: author now, B: stub + leftover, C: abort) with the standard Choose format + +**AC-3: state.md cross-reference** +Given `.cursor/skills/autodev/state.md` +When the "Session Boundaries" section is read +Then it mentions the new retro-existence gate or links to the flow file's gate + +**AC-4: Discovery rule** +Given the gate +When the file pattern is documented +Then the glob is unambiguous: `_docs/06_metrics/retro_*.md` with a date matching cycle-N's date range; the date-range derivation is explicit (cycle N start = last `implementation_report_*_cycle{N-1}.md` date; cycle N end = today) + +## Constraints + +- Pure workflow doc change — no source code, no tests. +- Must not break the existing greenfield-Done → existing-code Phase-B transition (greenfield → existing-code is a one-shot flow change with no retro requirement on first entry, since there is no previous cycle). + +## Risks & Mitigation + +**Risk 1: False positive on greenfield→existing-code transition** +- *Risk*: First cycle of an existing-code flow shouldn't require a previous-cycle retro. +- *Mitigation*: Gate condition includes `state.cycle > 1` — cycle 1 has no previous cycle. 
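The gate described in Scope and AC-4 could be expressed roughly as follows. This is a sketch under stated assumptions, not the orchestrator's actual logic: it assumes retro files are named `retro_YYYY-MM-DD.md`, treats the date range as inclusive, and interprets the `state.cycle > 1` condition as "the cycle being entered is greater than 1". The names `has_retro_in_range` and `gate_passes` are hypothetical.

```python
import re
from datetime import date
from pathlib import Path
from typing import Iterable

# Assumed naming convention: retro_YYYY-MM-DD.md
RETRO_NAME = re.compile(r"^retro_(\d{4}-\d{2}-\d{2})\.md$")


def has_retro_in_range(file_names: Iterable[str], start: date, end: date) -> bool:
    """True if any retro file name carries a date within [start, end]."""
    for name in file_names:
        match = RETRO_NAME.match(name)
        if match is None:
            continue  # name does not follow the assumed convention
        try:
            retro_date = date.fromisoformat(match.group(1))
        except ValueError:
            continue  # digits present but not a real calendar date
        if start <= retro_date <= end:
            return True
    return False


def gate_passes(entering_cycle: int, metrics_dir: Path, prev_start: date, today: date) -> bool:
    """False means the flow should present the Choose A/B/C block."""
    if entering_cycle <= 1:
        return True  # first existing-code cycle: no previous cycle, no retro required
    names = (p.name for p in metrics_dir.glob("retro_*.md"))
    return has_retro_in_range(names, prev_start, today)
```

Keeping the date check separate from the filesystem glob lets the rule be exercised with plain file-name lists, which matters here because the ticket itself permits no test code.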
+ +## References + +- LESSONS 2026-05-26 [process] entry: `_docs/LESSONS.md` § 2026-05-26 [process] +- Cycle-3 retro Top-3 #2: `_docs/06_metrics/retro_2026-05-26.md` +- Flow file: `.cursor/skills/autodev/flows/existing-code.md` § "Re-Entry After Completion" +- State management: `.cursor/skills/autodev/state.md` § "Session Boundaries" diff --git a/_docs/02_tasks/todo/AZ-901_evidence_out_default_path_fix.md b/_docs/02_tasks/todo/AZ-901_evidence_out_default_path_fix.md new file mode 100644 index 0000000..bd63fc1 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-901_evidence_out_default_path_fix.md @@ -0,0 +1,85 @@ +# Fix `EVIDENCE_OUT` default path — workspace-relative, not container-only + +**Task**: AZ-901_evidence_out_default_path_fix +**Name**: Change `e2e/runner/conftest.py:56` `EVIDENCE_OUT` default from `/e2e-results/evidence` to a workspace-relative path so Tier-1 host runs don't crash +**Description**: Closes leftover `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`. Cycle-3 Step 15 (Performance Test) surfaced this: the default path `/e2e-results/evidence` is the container mount inside the Tier-1 Docker harness; a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly hits `OSError: [Errno 30] Read-only file system: '/e2e-results'` (macOS) or `PermissionError` (Linux). Workaround today: `EVIDENCE_OUT="$(pwd)/e2e-results/..." pytest ...`. Fix: resolve a workspace-relative default when neither `--evidence-out` nor `EVIDENCE_OUT` is set. +**Complexity**: 1 SP +**Dependencies**: None +**Component**: `e2e/runner/conftest.py` +**Tracker**: AZ-901 (https://denyspopov.atlassian.net/browse/AZ-901) +**Epic**: (none — cycle-4 process housekeeping) + +## Problem + +`e2e/runner/conftest.py:56`: + +```python +default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence") +``` + +The default `/e2e-results/evidence` is a container-mount path. 
Tier-1 Docker harness and the Tier-2 Jetson runner pass `--evidence-out` explicitly, so they're fine. Host-direct `python -m pytest e2e/tests/performance/` invocations (developer machine, no Docker) hit `nfr_recorder.pytest_sessionfinish` which tries `mkdir(evidence_dir)` and crashes. + +## Outcome + +- Developer can run `python -m pytest e2e/tests/performance/` on a Mac/Linux workstation without setting `EVIDENCE_OUT` and without crashing. +- Docker / Jetson runners continue to work unchanged (they pass `--evidence-out` explicitly). + +## Scope + +### Included + +- Modify `e2e/runner/conftest.py:56` to resolve a workspace-relative default when `EVIDENCE_OUT` is unset. + - Proposed: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))` +- Verify Docker compose files and Jetson scripts that pass `--evidence-out` still work (they should — they override the default). +- Verify `.gitignore` ignores `e2e-results/` at repo root (probably already does — confirm before commit). +- Delete the leftover file `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` once the fix lands and the verification AC passes. + +### Excluded + +- The "lazy fallback inside the recorder" alternative shape — staying with the workspace-relative-default shape for simplicity (Option 1 from the leftover file). +- Refactoring `nfr_recorder.pytest_sessionfinish` — the writer code is fine; only the default path is wrong. +- Adding new evidence-out related env vars or CLI flags. 
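The proposed default can be isolated in a small helper to make the path arithmetic explicit. A minimal sketch, assuming `conftest.py` stays at `<repo-root>/e2e/runner/conftest.py`; the function name `default_evidence_out` and the injectable `env` parameter are illustrative and do not exist in the current conftest:

```python
import os
from pathlib import Path
from typing import Mapping


def default_evidence_out(conftest_file: Path, env: Mapping[str, str] = os.environ) -> str:
    """Return EVIDENCE_OUT if set, else <repo-root>/e2e-results/evidence."""
    # parents[0] is e2e/runner, parents[1] is e2e, parents[2] is the repo root.
    fallback = conftest_file.parents[2] / "e2e-results" / "evidence"
    # Same precedence as the proposed one-liner: the env var wins over the fallback.
    return env.get("EVIDENCE_OUT", str(fallback))
```

In the conftest the caller would pass `Path(__file__).resolve()`, so the fallback stays stable regardless of the directory pytest is invoked from. An explicit `--evidence-out` flag still overrides this value because it is only the option's default.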
+ +## Acceptance Criteria + +**AC-1: Host-direct pytest works without EVIDENCE_OUT** +Given a clean workspace on macOS or Linux +When `python -m pytest e2e/tests/performance/ -v --tb=short` runs (no `EVIDENCE_OUT` env var, no `--evidence-out` flag) +Then pytest exits 0, evidence is written under `/e2e-results/evidence/`, and no `OSError` / `PermissionError` is raised + +**AC-2: Docker harness unchanged** +Given the Tier-1 Docker compose (`docker-compose.test.jetson.yml`) +When the e2e suite runs inside the container +Then `--evidence-out` is still passed and evidence lands at the container mount path `/e2e-results/evidence/` (no behavioral change) + +**AC-3: Jetson harness unchanged** +Given `scripts/run-tests-jetson.sh` +When invoked +Then it still passes `--evidence-out` to pytest and evidence is collected per the existing protocol + +**AC-4: gitignore covers workspace-relative path** +Given the fix in place +When a host-direct run produces `/e2e-results/` +Then `git status` does NOT show `e2e-results/` as untracked (already covered by `.gitignore`, or `.gitignore` is updated as part of this ticket) + +**AC-5: Leftover deleted** +Given the fix lands and ACs 1–4 pass +When `ls _docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` +Then the file does not exist + +## Unit Tests + +| AC Ref | What to Test | Required Outcome | +|--------|-------------|-----------------| +| AC-1 | Run `pytest e2e/tests/performance/` without env vars on host | Exit 0, evidence at `/e2e-results/evidence/` | + +## Constraints + +- Backward-compatible — existing callers passing `--evidence-out` or setting `EVIDENCE_OUT` see no change. +- No new dependencies; uses `pathlib.Path` which `conftest.py` already imports (verify before commit). 
+ +## References + +- Leftover file: `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` +- Cycle-3 Step 15 perf report: `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` § "Findings worth tracking" item 3 +- Conftest: `e2e/runner/conftest.py:56` diff --git a/_docs/04_release/release_cycle3_jetson-bench_2026-05-26-1442.md b/_docs/04_release/release_cycle3_jetson-bench_2026-05-26-1442.md new file mode 100644 index 0000000..0cfdb02 --- /dev/null +++ b/_docs/04_release/release_cycle3_jetson-bench_2026-05-26-1442.md @@ -0,0 +1,181 @@ +# Release Report — Cycle 3 → Jetson (bench test) + +- **Date**: 2026-05-26 14:42 EEST (UTC+3) +- **Operator**: obezdienie001 (single-operator project; agent-assisted via `/autodev`) +- **Strategy**: manual / bench-test +- **Target version**: `be743a7` (dev HEAD; commit `[AZ-844] Close Step 11 cycle-3: unit pass, jetson regression AZ-848`) +- **Target environment**: lab Jetson Orin Nano Super at SSH alias `jetson-e2e` (uptime 15d, 42 GB free on `/var/lib/docker`) +- **Compose file**: `docker-compose.test.jetson.yml` (TEST compose — NOT the parent-suite airborne deploy compose) +- **Verdict**: **Released** +- **Verdict reason**: Bench run produced identical failure profile to Step 11 (`4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed in 335.41s`); same four AZ-848 test IDs failed; no NEW cycle-3-scope regressions introduced by `fd52cc9`. AZ-848 / AZ-883 carry forward to Cycle 4 as planned. + +## Pre-Release Gate (Phase 1) + +### Scope of this release + +This is **not** an airborne production deploy. It is a **bench-test verification** that the cycle-3 source tree builds and runs on real Tier-2 hardware (the lab Jetson Orin Nano Super), using the same `docker-compose.test.jetson.yml` harness that drove the cycle-3 closeout in Step 11. The user explicitly chose this path over a true airborne deploy because two open Jetson blockers (AZ-848, AZ-883) were just diagnosed and deferred to Cycle 4. 
+ +A true airborne release will be Cycle 4's job, once AZ-848 (`VioOutput.emitted_at_ns` contract repair) and AZ-883 (`SCALED_IMU2` ts_ns=0 latent bug) are fixed. + +### Acceptance Criteria + +The system-level ACs in `_docs/00_problem/acceptance_criteria.md` (AC-1.x position accuracy, AC-4.x latency/memory, AC-NEW-1 TTFF, AC-NEW-2 spoof promotion, AC-NEW-4 false-position safety, AC-NEW-5 thermal envelope) all require **live-flight data + Tier-2 hardware** and are not in scope for this bench test. They remain "Unverified" — same status as recorded in `_docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md` and `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md`. + +What IS in scope and verifiable here: + +| Scope item | Verification | Status | +|------------|--------------|--------| +| Cycle-3 source builds on arm64 (Jetson Orin Nano Super) | `docker compose build` against `tests/e2e/Dockerfile.jetson` succeeds | Phase 3 | +| Cycle-3 source runs on real Jetson hardware end-to-end | `pytest tests/unit/ + tests/e2e/replay/` exits with same failure profile as Step 11 closeout (`4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed`) | Phase 4 | +| No new Cycle-3-scope regressions vs. 
Step 11 (2026-05-24) | Failure profile matches Step 11 — only the known AZ-848 4-tuple fails; no new failures introduced by `fd52cc9` | Phase 4 | +| Working tree on Jetson reflects the cycle-3 closeout commit | `rsync` mirrors local `be743a7` to remote `~/gps-denied-onboard/` | Phase 3 | + +### Test Status + +| Suite | Pass | Fail | Skip | Source | +|-------|-----:|-----:|-----:|--------| +| Tier-1 unit (local Mac) | 2303 | 0 | 86 | `_docs/03_implementation/run_tests_step11_report.md` § Cycle-3 closeout → Local unit suite | +| Tier-1 perf (this cycle, Mac) | n/a | n/a | n/a | `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` — 4/4 NFRs **Unverified** on Tier-1 (NFT-PERF-* require Tier-2 + AZ-595 fixture, both still pending) | +| Tier-2 Jetson e2e (Step 11, 2026-05-24) | 48 | 4 (AZ-848) | 3 | `_docs/03_implementation/run_tests_step11_report.md` § Cycle 3 closeout → Jetson e2e | +| Tier-2 Jetson e2e (this release; bench rerun) | 48 | 4 (AZ-848) | 3 | This release report, Phase 4 below | + +### Change Summary + +Cycle-3 src delta (single commit `fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint`): + +``` +src/gps_denied_onboard/_types/route.py | +43 +src/gps_denied_onboard/components/c11_tile_manager/route_client.py | -4 +src/gps_denied_onboard/replay_input/__init__.py | -2 +src/gps_denied_onboard/replay_input/tlog_route.py | -30 +``` + +Net effect: relocate the `RouteSpec` dataclass from a private helper into the shared `_types/` package; widen ruff lint rules to cover the new module. No behavioural change. No `c1_vio` / `c5_state` / `c8_fc_adapter` / `runtime_root` touches.
+ +Cycle-3 ticket scope (closed in this cycle, present at HEAD): + +| Ticket | Type | Component | Notes | +|--------|------|-----------|-------| +| AZ-835 (epic) | feature | C1–C6 | "GPS-denied tile provisioning + route spec" epic; decomposed into C1–C6 sub-tasks | +| AZ-836 | tooling | autodev | State-file trim; defer In Testing transition (MCP unavailable workaround) | +| AZ-838 | feature | C2 (route client) | `SatelliteProviderRouteClient` + `seed_route.py` CLI | +| AZ-839 | feature / fixture | C3 (matcher) + E-AZ-835 C3 | `operator_pre_flight_setup` real-fixture wiring | +| AZ-840 | feature / test | E-AZ-835 C4 | e2e orchestrator test | +| AZ-844 | infra / fix | C12 cold-start NFR + Jetson harness | Threshold relax 500 → 1000 ms; rsync exclude `tiles/` `ready/`; Step 11 closeout | +| AZ-845, AZ-846, AZ-847 | refactor / lint | `_types/`, `c11_tile_manager`, `replay_input`, lint | Refactor 02 (this is the only `src/` delta) | +| AZ-848 | bug (deferred) | C1 contract (`VioOutput.emitted_at_ns`) | **Deferred to Cycle 4.** Surfaced during this cycle's release flow when initially routed to operator-workstation target; root-cause re-diagnosed via tlog probe; 5 SP. | +| AZ-883 | bug (deferred) | C8 adapter (`_handle_imu` SCALED_IMU2) | **Deferred to Cycle 4.** Latent ts_ns=0 bug surfaced during AZ-848 investigation; 2 SP. | + +### Rollback Plan + +- **Previous version**: NONE — this is the first-ever release for this project. + - `_docs/04_release/` was empty before this report. + - No `release/*` git tag in the repo. + - No `.previous-tags.env` produced by a prior `stop-services.sh` run. +- **Rollback script**: `scripts/deploy.sh --rollback` is **unavailable** for this bench test (exit 70 — `.previous-tags.env` not found). Acceptable: the test compose's "rollback" is `docker compose down` against `docker-compose.test.jetson.yml`, which leaves the Jetson in pre-test state. +- **Rollback target verified pullable**: n/a (no previous version exists). 
+- **Rollback target verified bootable in target env**: n/a. + +For Cycle 4's true airborne release, a real rollback target will exist (the image produced by this bench-test cycle, once an arm64 image is built + tagged in CI). + +### Restrictions / Approvals + +- Change-window restrictions: none for bench testing on lab Jetson (NFT-SEC-05 in-flight egress lockdown and ground-only gate apply only to airborne). +- Manual approvals required: none — single-operator project. +- Restriction `_docs/00_problem/restrictions.md` § "Failsafe & Safety" applies only to live flight; not exercised by bench test. + +### Tracker State at Gate + +- **Tickets in scope** (CLOSED at HEAD): 9 tickets (AZ-835, AZ-836, AZ-838, AZ-839, AZ-840, AZ-844, AZ-845, AZ-846, AZ-847 — see Change Summary above). +- **Tickets deferred to Cycle 4** (NOT blocking this bench release; explicitly off the operator-orchestrator + bench-test paths): AZ-848, AZ-883. +- **Tickets blocking release**: 0. AZ-848 / AZ-883 affect only the live-flight tlog-replay path on the airborne Jetson; they are deliberately NOT a bench-test blocker because the bench test re-confirms the SAME failure profile as Step 11 (no NEW regressions in cycle-3-scope). + +### Gate Decision + +User picked **A) Bench testing on jetson-e2e** at the Pre-Release Gate. The contradiction with the user's prior turn (operator-workstation target) was flagged and resolved in favour of bench-test on Jetson. Three issues from the gate that influence verdict interpretation are recorded under "Rollback Plan" (no rollback target) and "Acceptance Criteria" (system-level ACs unverifiable from Tier-1 / bench). + +## Strategy Select (Phase 2) + +- **Recommended by skill table** for this target capability: `manual` (per `release/SKILL.md` Phase 2 table — "Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) — the whole release becomes a runbook").
Although Docker IS in play here, this is a bench rig with no load balancer, no traffic-tier routing, and no automated rollout, so `manual` is the closest semantic match in the skill's table. +- **Chosen**: `manual` / bench-test. +- **Reasoning**: blue-green / canary / all-at-once all imply a service taking real traffic. The bench-test Jetson takes no traffic; it runs an internally-scripted test compose. The release is recorded, but nothing is "deployed" in the production sense — the parent-suite Watchtower flow is bypassed; only the cycle-3 image's ability to build and run on hardware is being verified. + +## Execute (Phase 3) + +- **Start**: 2026-05-26 14:42:41 UTC (shell job PID 84808) +- **Command**: `bash scripts/run-tests-jetson.sh` (no flags; defaults to `JETSON_SSH_ALIAS=jetson-e2e`, `JETSON_REMOTE_DIR=~/gps-denied-onboard`, `COMPOSE_FILE=docker-compose.test.jetson.yml`) +- **Stream sink**: `_docs/04_release/.jetson_bench_run_2026-05-26.log` (preserved for audit; NOT committed — `.jetson_bench_run_*.log` should land in `.gitignore` post-release). +- **End**: 2026-05-26 14:50:17 UTC (wall clock 7m 35s; includes rsync + docker compose pull + e2e-runner image build + pytest) +- **Exit code**: 1 — propagated from `pytest` (4 failures inside `e2e-runner`). **Expected**: AZ-848 deterministically fails the same 4 cases. The bench-test verdict is NOT "exit 0" — it is "failure profile matches Step 11". + +Pytest summary line (from `_docs/04_release/.jetson_bench_run_2026-05-26.log`, e2e-runner-1 container): + +``` +============================= test session starts ============================== +platform linux -- Python 3.10.12, pytest-9.0.3, pluggy-1.6.0 +collected 57 items +...
(57 tests; see Phase 4 table below for the test-ID summary) += 4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed, 1 warning in 335.41s (0:05:35) = +``` + +AZ-848 root-cause log line from THIS run (matches Step 11 root cause, confirms determinism): + +``` +c5.state.eskf_out_of_order ts_ns=187,370,418,000 last_added_ts_ns=1,362,268,944,997,999 +c5.state.eskf_filter_divergence source=vio mahalanobis_sq=109.76467866548009 threshold_sq=100.0 +replay_loop.state_add_vio_fatal frame=3 EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0') +``` + +(`last_added_ts_ns` differs from Step 11's value because Jetson uptime grew 2 days — the gap between `monotonic_ns` and FC-boot-relative timestamps scales with uptime per AZ-848 root cause; the IMU ts_ns is byte-identical (FC-boot-relative). Both confirm AZ-848's mechanism.) + +## Smoke Test (Phase 4) + +The bench-test compose IS the smoke set (per Phase 2 — bench-test strategy collapses Execute and Smoke into one harness invocation). The pass criterion below is **not** "0 failures" — it is "failure profile matches Step 11's evidence, i.e. only the known AZ-848 4-tuple fails, no new failures introduced by cycle-3 src delta". 
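The pass criterion stated above can be checked mechanically rather than by eye. A minimal sketch, assuming the reference and current pytest summary lines and the failed test IDs have already been extracted from the two logs; `parse_summary` and `profile_matches` are hypothetical helper names and not part of the Jetson harness:

```python
import re

OUTCOMES = ("failed", "passed", "skipped", "xfailed", "xpassed")
# Matches fragments such as "48 passed" in a pytest summary line.
COUNT = re.compile(r"\b(\d+) (failed|passed|skipped|xfailed|xpassed)\b")


def parse_summary(line: str) -> dict[str, int]:
    """Extract outcome counts from a pytest summary line; absent outcomes count as 0."""
    counts = dict.fromkeys(OUTCOMES, 0)
    for number, outcome in COUNT.findall(line):
        counts[outcome] = int(number)
    return counts


def profile_matches(ref_line: str, ref_failed: set[str], run_line: str, run_failed: set[str]) -> bool:
    """True only if the outcome totals and the set of failed test IDs are both identical."""
    return parse_summary(ref_line) == parse_summary(run_line) and ref_failed == run_failed
```

Comparing the failed IDs as well as the totals matters: a run with four different failures would produce the same summary line yet would be a new regression.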
+ +- **Mode**: same harness as Step 11 closeout (rsync + `docker compose --abort-on-container-exit --exit-code-from e2e-runner up`) +- **Start**: 2026-05-26 14:44:31 UTC (e2e-runner container started; `test session starts` line) +- **End**: 2026-05-26 14:50:06 UTC (5m 35s pytest wall clock) + +| Test | Step 11 (2026-05-24) | This run (2026-05-26) | Verdict | +|------|----------------------|----------------------|---------| +| `tests/e2e/replay/test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` | FAIL (AZ-848 frame-3 ESKF divergence) | FAIL (same root cause; same frame; same mahalanobis²=109.765) | **Match — AZ-848 carries forward** | +| `tests/e2e/replay/test_derkachi_1min.py::test_ac5_determinism_two_runs_diff` | FAIL (same root cause) | FAIL (same root cause) | **Match** | +| `tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_realtime_60s_within_5pct` | FAIL (same root cause) | FAIL (same root cause) | **Match** | +| `tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_asap_under_30s` | FAIL (same root cause) | FAIL (same root cause) | **Match** | +| `tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks` | XPASS (vacuous — binary exits 1 before emissions) | XPASS (same vacuous; same explanation in short-summary) | **Match** | +| Remaining 48 cases | PASS | PASS (all 48) | **Match — no new regressions** | +| Skipped (3) | env-gated (legitimate) | SKIPPED — same three (AZ-839 operator_pre_flight_setup × 2; AC-8 mock-suite-sat-service incomplete) | **Match** | +| xfailed (1) | known xfail (AZ-699 / AZ-776+AZ-777) | XFAIL — same test, same upstream-gap explanation | **Match** | + +**Smoke verdict pass condition**: ✅ met. Totals = `4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed` and the 4 failure IDs are byte-identical to Step 11's IDs. + +## Watch Window (Phase 5) + +- **Duration**: not applicable — bench test, no live traffic, no observability backend in scope. 
+- **Substitute**: the test compose's `--abort-on-container-exit --exit-code-from e2e-runner` IS the watch — if any service crashes mid-test, pytest aborts and the exit code propagates back. The duration of the bench run (~5–6 min) acts as the de-facto watch. +- This is explicitly recorded per `release/SKILL.md` Phase 5: "If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as `Released-with-override`." Adapted for bench testing: no live traffic ⇒ no observability ⇒ Phase 5 is honestly N/A, not "skipped". Verdict will be `Released` (or `Aborted`), not `Released-with-override`. + +## Commit or Rollback (Phase 6) + +### Released + +- Tracker tickets in scope **stay as they are** — they were moved to Done during prior cycle-3 steps (Step 12-15). No new tracker movement triggered by this bench-test release. +- Git tag: deliberately NOT pushed. `release/cycle3-bench` would mislabel a bench-test milestone as a production release; the next true airborne release in Cycle 4 will carry the first `release/*` tag. +- AZ-848 and AZ-883 are **explicit known-regression carry-forwards** into Cycle 4 — both have updated specs and Jira state set during this autodev session. +- Cycle-3 source is hardware-bench-verified on the lab Jetson at SHA `be743a7`. The same source can be re-run reproducibly via `bash scripts/run-tests-jetson.sh` against `jetson-e2e`. +- Retrospective scheduled: `/retrospective --cycle-end` auto-chains after this report. Output expected at `_docs/06_metrics/retro_cycle3_.md`. 
+ +## Open Risks Carried Into Cycle 4 + +| Risk | Owner ticket | Severity | +|------|--------------|----------| +| AZ-848 — VioOutput.emitted_at_ns contract clashes with FC-IMU timebase; blocks live-flight ESKF on long-uptime Jetson | AZ-848 (5 SP) | High — real airborne release blocked until fixed | +| AZ-883 — `_handle_imu` produces ts_ns=0 for every SCALED_IMU2 message; latent IMU monotonicity violation | AZ-883 (2 SP) | Medium — latent; fix lands before C13 FDR replay tools assume per-sample monotonicity | +| `EVIDENCE_OUT` default points at container-only path (`/e2e-results/evidence`) — breaks Tier-1 perf tests on the host | `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` | Low — workaround exists (`EVIDENCE_OUT="$(pwd)/e2e-results/..."`) | + +## Lessons (one-liners) + +- **First-release rollback gap is structural, not procedural** — the `scripts/deploy.sh --rollback` path requires `.previous-tags.env`, which only exists after a successful `stop-services.sh` run. First-ever deploys have no rollback target by construction; the release skill's Phase 1 rollback check should treat first-release as a recognized first-time path, not a blocking gate. +- **Bench-test "release" is a legitimate milestone but not a production release** — the release skill's six-phase pipeline (deploy → smoke → watch → commit) compresses to three phases for bench testing (rsync+build → harness-as-smoke → commit). The skill could grow an explicit `strategy: bench-test` row in its Phase 2 table so future releases don't have to improvise. +- **Long-uptime Jetson + freshly-booted FC is the AZ-848 sensitiser** — the gap between `monotonic_ns` and FC-boot-relative timestamps grew by ~175 trillion ns over 2 days (1.187·10¹⁵ → 1.362·10¹⁵). This confirms the bug's mechanism is purely additive in uptime and gives Cycle 4 a clean reproduction protocol: `uptime -p` ≥ 1d on the Jetson + a tlog from a session ≤ 15 min after FC boot. +- **Cycle-3 src delta size vs. 
release scope tension** — `fd52cc9` is a 79-line refactor (43 lines added, 36 removed per the Change Summary diffstat); the release machinery exercises full deploy + smoke against it. The bench-test path balances "release discipline" against "tiny delta does not warrant prod-deploy theatre", and it should stay as the default for refactor-only cycles in this project. diff --git a/_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md b/_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md new file mode 100644 index 0000000..578a11e --- /dev/null +++ b/_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md @@ -0,0 +1,136 @@ +# Performance Test Run — 2026-05-26 — Cycle 3 Tier-1 probe + +**Invoked by**: autodev existing-code Step 15 (cycle 3) — `.cursor/skills/test-run/SKILL.md` perf mode. +**Host**: developer Mac workstation (Darwin arm64, no Jetson hardware, no `E2E_SITL_REPLAY_DIR` fixture mounted). +**Runner**: `scripts/run-performance-tests.sh` + direct `pytest e2e/tests/performance/` probe + pure-logic evaluator unit tests. +**Run ID**: `cycle3-tier1-probe`. +**Status**: **Unverified across all 4 production perf NFRs; pure-logic evaluator unit tests Pass (70/70).** No regression detected because no measurement was possible. No Warn / Fail to gate on. **Not blocking deploy** per the skill's "Any Unverified scenarios with no Warn/Fail" rule.
+ +## Why this cycle re-ran the same probe + +Cycle 3 work touched only pre-flight / offline code paths: + +| Task | Layer | Runtime hot-path impact | +|---|---|---| +| AZ-836 `tlog_route_extractor` | Pre-flight (operator workstation) | None — extraction runs once per flight, before takeoff | +| AZ-838 `SatelliteProviderRouteClient` | Pre-flight (operator workstation) | None — HTTP client against satellite-provider's Route API | +| AZ-839 `operator_pre_flight_setup` real fixture | Test infrastructure | None — fixture composes existing pre-flight components | +| AZ-840 E2E orchestrator test | Test only | None | +| AZ-777 Derkachi C6 reference fixture + C11 inventory adapter | Pre-flight + C11 download path | C11 `TileDownloader` is invoked at pre-flight (operator workstation), not in-flight — airborne process has no egress (RESTRICT-OPS-1, NFT-SEC-02) | +| AZ-845 `RouteSpec` relocation | Refactor (type re-home) | None — public API unchanged | +| AZ-846 `module-layout.md` refresh | Docs | None | +| AZ-847 Lint widening | Test only | None | + +None of these touches the airborne pipeline that NFT-PERF-01..04 measure (E2E latency, frame-by-frame streaming, cold-start TTFF, spoof-promotion). The 2026-05-19 baseline (`perf_2026-05-19_workstation-tier1-probe.md`) remains the most recent measurement of record; this run confirms no Tier-1-observable regression by reproducing the same 4× Unverified outcome. + +## What ran + +### A) `scripts/run-performance-tests.sh` + +```text +Tier-2 perf tests skipped (GPS_DENIED_TIER!=2). +exit=0 +``` + +Tier-2 gate (`pytest -m tier2 -q tests/perf` only when `GPS_DENIED_TIER=2`). Exit 0 silently on Tier-1 by design — canonical perf measurements require Jetson Orin Nano Super hardware (D-C7-9, JetPack 6.2, TensorRT 10.3); a workstation run would produce numbers that DO NOT meet the pinned-hardware budgets and would actively mislead trend tracking. 
+ +### B) Direct `pytest e2e/tests/performance/` probe (24 parameterizations) + +| NFR | Configs | Outcome | Skip reason | +|---|---|---|---| +| **NFT-PERF-01** (E2E latency p95 ≤ 400 ms — AC-4.1) | 6 ({ardupilot, inav} × {okvis2, klt_ransac, vins_mono}) | 6 skipped | "Tier-2 only — Jetson hardware required" | +| **NFT-PERF-02** (frame-by-frame streaming, inter-emit p95 ≤ 350 ms — AC-4.4) | 6 ({ardupilot, inav} × {okvis2, klt_ransac, vins_mono}) | 4 skipped (no fixture) + 2 skipped (vins_mono research-only per D-C1-1-SUB-A) | "requires `E2E_SITL_REPLAY_DIR` (AZ-595) carrying the 5 min Derkachi @ 3 Hz replay" | +| **NFT-PERF-03** (cold-start TTFF p95 ≤ 30 s — AC-NEW-1) | 6 | 6 skipped | "Tier-2 only — Jetson hardware required" | +| **NFT-PERF-04** (spoof-promotion p95 ≤ 600 ms — AC-NEW-2) | 6 | 4 skipped (no fixture) + 2 skipped (vins_mono research-only per D-C1-1-SUB-A) | "requires `E2E_SITL_REPLAY_DIR` (AZ-595) containing N≥20 randomized-start blackout+spoof events" | + +Total: 24 skipped, 0 passed, 0 failed, 0 errored. Exit code 0. + +### C) Pure-logic evaluator unit tests — `e2e/_unit_tests/helpers/test_*_evaluator.py` + +```text +$ .venv/bin/python -m pytest e2e/_unit_tests/helpers/test_e2e_latency_evaluator.py \ + e2e/_unit_tests/helpers/test_streaming_evaluator.py \ + e2e/_unit_tests/helpers/test_ttff_evaluator.py \ + e2e/_unit_tests/helpers/test_spoof_promotion_evaluator.py \ + -v --tb=short +======================= 70 passed in 0.25s ======================== +``` + +**70/70 pass.** Identical to 2026-05-19 — confirms percentile estimators, inter-emit interval math, TTFF distribution math, and spoof-onset → label-switch delta math are still correct. A future hardware run feeds JSON fixtures into the same evaluators — only the input data changes, not the math. + +## Threshold comparison (Step 3 of skill) + +Per the skill's Step 3, thresholds load from `_docs/02_document/tests/performance-tests.md`. 
The thresholds exist and are documented but no scenario produced a measurement to compare them against. + +| NFR | Threshold | Observed | Verdict | +|---|---|---|---| +| NFT-PERF-01 | p95 ≤ 400 ms (K=3 baseline AND K=2 hybrid auto-degrade) + ≤10 % frame drops | — | **Unverified** (Tier-2 hardware required) | +| NFT-PERF-02 | p95 inter-emit interval ≤ 350 ms; no window of ≥3 missed-emit gaps | — | **Unverified** (`E2E_SITL_REPLAY_DIR` fixture not yet recorded; AZ-595) | +| NFT-PERF-03 | p95 TTFF < 30 s (50 cold boots) | — | **Unverified** (Tier-2 hardware required) | +| NFT-PERF-04 | p95 < 3 s on both FCs (50 trials per FC) | — | **Unverified** (`E2E_SITL_REPLAY_DIR` fixture not yet recorded; AZ-595) | + +## Classification + +Per the skill's perf-mode reporting: + +```text +══════════════════════════════════════ + PERF RESULTS +══════════════════════════════════════ + Scenarios: [pass 0 · warn 0 · fail 0 · unverified 4] +────────────────────────────────────── + 1. NFT-PERF-01 — Unverified — Tier-2 Jetson hardware required + 2. NFT-PERF-02 — Unverified — SITL replay fixture pending (AZ-595) + 3. NFT-PERF-03 — Unverified — Tier-2 Jetson hardware required + 4. NFT-PERF-04 — Unverified — SITL replay fixture pending (AZ-595) +────────────────────────────────────── + Pure-logic evaluator coverage: 70/70 unit tests pass + (e2e/_unit_tests/helpers/test_{e2e_latency,streaming,ttff,spoof_promotion}_evaluator.py) +══════════════════════════════════════ +``` + +## Coverage gap assessment (skill Step 5: "Unverified") + +Per the skill: + +> **Any Unverified scenarios with no Warn/Fail** → not blocking, but surface them in the report so the user knows coverage gaps exist. Suggest running `/test-spec` to add expected results next cycle. + +This run has **0 Warn + 0 Fail + 4 Unverified**, so: + +- **Not deploy-blocking.** The perf gate is allowed to be Unverified when the SUT is not yet running on its canonical hardware. 
+- **Coverage gap is unchanged from 2026-05-19** — same two recording-phase prerequisites: + - **NFT-PERF-01 / NFT-PERF-03**: AZ-444 (Tier-2 Jetson harness). When AZ-444 lands, these scenarios run on the Jetson and produce numbers — at which point this report's "Unverified" entries become "Pass / Warn / Fail" against the AC-4.1 / AC-NEW-1 thresholds. + - **NFT-PERF-02 / NFT-PERF-04**: AZ-595 (SITL replay fixture builder). When AZ-595 lands, the fixtures are committed under `e2e/fixtures/sitl_replay/`, `E2E_SITL_REPLAY_DIR` is set, and the scenarios run on Tier-1. + +## Findings worth tracking (Low) + +### Carryforward from 2026-05-19 + +1. **Unregistered pytest mark `tier2_only`** — pytest warnings at `e2e/tests/performance/test_nft_perf_01_e2e_latency.py:61` and `e2e/tests/performance/test_nft_perf_03_ttff.py:48`. Add `tier2_only: marks scenarios that require Jetson hardware` to `e2e/runner/pytest.ini` `markers` list. **Status: still present in cycle 3.** +2. **`scripts/run-performance-tests.sh` is intentionally a Tier-2 stub.** Unchanged from 2026-05-19. **Status: still as designed.** + +### New (discovered while running this probe — pre-existing, not cycle-3 caused) + +3. **EVIDENCE_OUT default is a hardcoded container path** — `e2e/runner/conftest.py:56` sets `default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")`. On a Tier-1 host run (no Docker, no Jetson), the `nfr_recorder.pytest_sessionfinish` hook tries to create `/e2e-results/evidence` and fails with `OSError: [Errno 30] Read-only file system: '/e2e-results'`. Workaround: `EVIDENCE_OUT=$(pwd)/e2e-results//evidence python -m pytest …`. Suggested fix: default to a workspace-relative path when `--evidence-out` is not explicitly passed and no `EVIDENCE_OUT` env var is set. Logged to `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` for later remediation. 
**Status: pre-existing host-pytest defect, not introduced by cycle 3 — but cycle 3 work is what surfaced it (re-running the same probe a second time).** + +## Anti-patterns explicitly NOT used + +Per the skill's anti-pattern guidance: + +- **No improvised perf tests.** Did not synthesize a workstation-only "approximation" of any NFR; the AC-4.1 / AC-NEW-1 / AC-NEW-2 / AC-4.4 budgets are pinned to canonical hardware and synthetic Tier-1 numbers would mislead the trend-tracker. +- **No skip-acceptance without justification.** Each Unverified entry is cataloged against a concrete recording task (AZ-444 / AZ-595). +- **No threshold downgrade.** Did not soften any threshold to make a Tier-1 measurement "pass". +- **No silent passthrough.** The four perf NFRs all measure real algorithm execution; no per-test bypass was inserted to make a Tier-1 result look like a Tier-2 result. + +## Cross-Reference Index + +| Source | Purpose | +|---|---| +| `_docs/02_document/tests/performance-tests.md` | Threshold + scenario spec | +| `scripts/run-performance-tests.sh` | Runner script (current Tier-2 stub) | +| `_docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md` | Prior Tier-1 probe (greenfield Step 15) | +| `_docs/02_tasks/todo/AZ-444*` | Tier-2 Jetson harness (recording-phase task) | +| `_docs/02_tasks/todo/AZ-595*` | SITL replay fixture builder (recording task) | +| `_docs/02_tasks/todo/AZ-{428..431}*` | NFT-PERF-{01..04} scenario tasks (runner side complete; harness pending) | +| `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` | EVIDENCE_OUT defect leftover | +| `_docs/06_metrics/` (this directory) | Per-run perf trend artefacts | diff --git a/_docs/06_metrics/retro_2026-05-26.md b/_docs/06_metrics/retro_2026-05-26.md new file mode 100644 index 0000000..1bf874f --- /dev/null +++ b/_docs/06_metrics/retro_2026-05-26.md @@ -0,0 +1,184 @@ +# Retrospective — 2026-05-26 (Cycle 3) + +> Cycle-3 retrospective for GPS-Denied Onboard. 
Cycle 3 spans +> 2026-05-21 → 2026-05-26 (post-cycle-2 → Step 17 Retrospective). +> Generated by `/autodev` existing-code Step 17 (Retrospective, +> cycle-end mode). Prior retro: `retro_2026-05-20.md` (cycle 1). +> **Process gap**: no cycle-2 retro was filed — cycle 2 transitioned +> straight from Step 11 into cycle-3 work; the autodev session boundary +> between cycles 2 and 3 ran without invoking Step 17. This retro +> partially covers cycle-2 trend deltas where the data is still +> available on disk, and explicitly flags the missing retro as an +> Improvement Action below. + +## Implementation Summary + +### Cycle 3 scope (2026-05-21 → 2026-05-26) + +| Metric | Value | +|--------|-------| +| Tickets closed in cycle 3 (`_docs/02_tasks/done/AZ-83{6..9}*`, `AZ-84{0,5,6,7}*`) | 7 (AZ-836, AZ-838, AZ-839, AZ-840, AZ-845, AZ-846, AZ-847) | +| Tickets touched but split off (deferred to cycle 4) | 2 (AZ-848 — 5 SP, AZ-883 — 2 SP; both surfaced during this cycle's release flow) | +| Tickets in `todo/` at cycle-3 close (open work) | 1 (AZ-848 — the deferred one; AZ-883 mirror also written) | +| Cycle 3 batches (`batch_*_cycle3_report.md`) | 6 (104, 106, 107, 108, 108b, 109) — batch 105 is reserved/missing; 108b is a same-day follow-up to 108 | +| Cycle 3 src delta | 1 commit (`fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint`); +43 −36 LoC across 4 files in `_types/`, `c11_tile_manager/`, `replay_input/` | +| Cycle duration | ~6 days (2026-05-21 first cycle-3 batch → 2026-05-26 retro) | +| Avg tasks per batch | 7 tickets ÷ 6 batches ≈ 1.2 tasks/batch | +| Estimated total complexity points | ~22 SP delivered (3 + 3 + 5 + 3 + 2 + 2 + 4 estimated across AZ-836/838/839/840/845/846/847); plus AZ-844 closeout work (3 SP); deferred 7 SP (AZ-848 5 + AZ-883 2) | +| Carry-over from cycle 1's Top 3 Improvement Actions | 1/3 fulfilled (see "Trend Comparison" below) | + +### Cumulative (cycle 1 + 2 + 3) + +| Metric | Value (this retro) | Cycle-1 retro | 
+|--------|---------------------|----------------| +| Total tickets closed (lifetime) | ~175 (cycle 1: 165 + cycle 2: ~3-5 + cycle 3: 7) | 165 | +| Total batches (lifetime) | 109 (cycle 1: 97; cycle 2: 5; cycle 3: 6 + 1 inter-cycle batch 109 numbering) | 97 | +| Source LoC, `src/` Python | 61,071 (unchanged vs cycle-1; cycle-3 delta is a refactor, not a feature; cycle-2 src delta also small per Step 11 report) | 61,071 | +| Components | 15 (unchanged) | 15 | +| Binary tracks | 3 (airborne, research, operator-orchestrator) | 3 | + +## Quality Metrics + +### Code Review Verdicts (cycle-3 batches) + +| Batch | Ticket | Verdict | Notes | +|-------|--------|---------|-------| +| 104 | AZ-777 Phase 1 | PASS_WITH_WARNINGS | 3 findings (1 Medium); AZ-777 Phase 1 closed | +| 106 | AZ-836 (TlogRouteExtractor) | **PASS** | Single-task batch; 10 ACs all PASS | +| 107 | AZ-838 (SatelliteProviderRouteClient + seed_route CLI) | PASS_WITH_WARNINGS | C2 — Epic AZ-835 | +| 108 | AZ-839 (operator_pre_flight_setup real fixture) | PASS_WITH_WARNINGS | C3 — Epic AZ-835 | +| 108b | AZ-839 follow-up (fix C3 fixture path mismatch) | **PASS** | Single-finding fix; no new findings | +| 109 | AZ-840 (e2e orchestrator test) | PASS_WITH_WARNINGS | C4 — Epic AZ-835; 17 unit tests; 3 SP per spec | + +Verdict distribution (cycle-3 only): + +| Verdict | Count | % of cycle-3 batches | +|---------|------:|----------------------:| +| PASS | 2 | 33.3 % | +| PASS_WITH_WARNINGS | 4 | 66.7 % | +| FAIL | 0 | 0 % | +| BLOCKED | 0 | 0 % | + +Auto-fix loop did not escalate to user intervention across cycle 3. + +### Cycle 3 — Findings (qualitative; no aggregated severity table in batch reports) + +The 6 cycle-3 batches did NOT use a `| Critical | High | Medium | Low |` table convention (grep found zero matches). Findings appear in inline `## Code review` sections only. 
Per-batch breakdown: + +| Severity | Cycle 3 count | Trend vs cycle 1 | +|----------|---------------:|-------------------| +| Critical | 0 | maintained — 0 in cycle 1 too | +| High | 0 | maintained — 0 in cycle 1 too | +| Medium | 1 (batch 104, AZ-777 Phase 1) | dropped — cycle 1 carried 2 (CR-F1, CR-F2) — see Trend Comparison | +| Low | ~3 (informal counts across PASS_WITH_WARNINGS batches; not enumerated in tables) | ~5 → ~3 (trend down) | + +### Quality Gates Late in the Cycle (Steps 11–16.5) + +The interesting findings of cycle 3 did NOT come from in-batch code review — they came from the autodev quality-gate steps: + +| Step | Surface | Outcome | +|------|---------|---------| +| 11 Run Tests (Jetson e2e) | AZ-848 — `eskf_filter_divergence` at frame 3 in `test_derkachi_1min.py` | 4 deterministic failures; root cause re-diagnosed 2026-05-26 as `VioOutput.emitted_at_ns` clock-source mismatch (NOT IMU-vs-IMU as initially hypothesised). Split AZ-883 for a secondary latent bug (`_handle_imu` SCALED_IMU2 ts_ns=0). | +| 14 Security Audit | Resumed prior 2026-05-19 audit; verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 5 Medium, 17 Low — same as cycle 1) | No new vulnerabilities introduced by cycle-3 refactor; existing OpenCV CVE pin replay condition unchanged. | +| 15 Performance Test | NFRs 4/4 **Unverified** on Tier-1 (same as cycle 1 + 2); pure-logic evaluator unit tests 70/70 PASS | Surfaced `EVIDENCE_OUT` default-path bug (`/e2e-results` is container-only; breaks Tier-1 host runs) → leftover `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` filed; perf report `perf_2026-05-26_cycle3-tier1-probe.md` written. | +| 16 Deploy | Resumed from cycle-1 greenfield artifacts; no cycle-3 deltas required | Deploy artifacts all present (compose files, scripts/, env templates); operator workstation deploy is the production target for `operator-orchestrator`. 
| +| 16.5 Release | First-ever release; ran bench-test on `jetson-e2e` lab Jetson | Verdict: **Released**. Failure profile byte-identical to Step 11 (`4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed`); no NEW cycle-3-scope regressions. AZ-848 / AZ-883 explicitly carried forward to cycle 4. | + +## Structural Metrics + +`_docs/02_document/architecture_compliance_baseline.md` **still does not exist** — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3. + +Delta vs `structure_2026-05-20.md`: + +| Metric | Cycle 1 close | Cycle 3 close | Delta | +|--------|----------------|----------------|-------| +| Component count | 15 | 15 | 0 | +| Source LoC, `src/` Python | 61,071 | 61,071 (+7 net from `fd52cc9` — RouteSpec relocation is net-neutral) | ~0 | +| Cycles in component import graph | 0 | 0 (verified — cycle-3 commit only relocates a type, no new edges) | 0 (healthy) | +| Cross-component edges, count | Concentrated in `runtime_root/` factories | Same | 0 | +| Contract files | 5 | 5 (no new contracts in cycle 3 — refactor cycle) | 0 | +| `architecture_compliance_baseline.md` present | No | **No (carried over gap)** | +0 — *still missing* | +| New Architecture violations this cycle | n/a (no baseline) | 0 (none flagged in cumulative reviews) | n/a | +| Public-API symbol contract coverage % | not computed | not computed | n/a | + +A fresh structural snapshot for this retro is **not produced** — the structure is unchanged from cycle 1 (verified via the 7 LoC delta and 0 new components). `structure_2026-05-20.md` remains the current authoritative snapshot. The next cycle that materially changes structure (e.g., AZ-848 contract repair adds a new field to `VioOutput`; cycle-4 C1 work) should re-snapshot. 
+ +## Efficiency + +| Metric | Cycle 3 value | Cycle 1 value | +|--------|---------------:|---------------:| +| Blocked tasks at cycle close (Tier-2 hardware or otherwise) | 1 in todo/ (AZ-848 deferred) + 1 mirror (AZ-883) — both filed in this retro session, NOT blockers for cycle close | 4 (all Tier-2 hardware rooted) | +| Tasks requiring fixes after review | 1 (batch 108b is a same-day fix follow-up to 108 for a fixture path mismatch — minor) | ~5 | +| Auto-fix loop escalations to user | 0 | 0 | +| Mid-cycle remediation post-mortems | 0 | 1 (AZ-589/AZ-590 → AZ-591) | +| Mid-cycle scope rewinds | 0 | 1 (Step 11 → Step 7 for AZ-618) | +| Mid-cycle ticket splits (NEW: surfaced + split during quality-gate step) | 1 (AZ-848 → split AZ-883 during release-flow investigation) | 0 | +| Process leftovers opened this cycle | 1 (`2026-05-26_evidence_out_default_path.md`) | 1 (D-CROSS-CVE-1 — still open) | +| Process leftovers closed this cycle | 0 | 0 | + +### Blocker Analysis + +| Blocker Type | Count (cycle 3) | Prevention (carries to cycle 4) | +|--------------|------------------|------------------------------------| +| Jetson tlog-replay path broken at frame 3 (AZ-848) | 1 | Cycle 4 first product task; primary AC: `VioOutput.emitted_at_ns` contract repaired so `add_vio` and `add_fc_imu` share the FC-boot timebase. | +| `_handle_imu` SCALED_IMU2 latent bug (AZ-883) | 1 | Cycle 4; independent of AZ-848; 2 SP. | +| `EVIDENCE_OUT` default path container-only | 1 | Leftover at `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`; cycle-4 quick win (15 min). | +| OpenCV CVE pin replay condition (D-CROSS-CVE-1) | 1 (carried from cycle 1) | Out-of-band; re-check at every `/autodev` invocation; unchanged across cycles 1-3. | +| Tier-2 hardware/evidence (AZ-595 fixtures, AZ-592/AZ-593 VIO native bindings) | 0 (cycle 3 did not need them; cycle 1 had 4 of these) | Re-emerge in cycle 4 if AZ-595 SITL fixture is sequenced. 
| + +## Trend Comparison + +Previous retro: `retro_2026-05-20.md` (cycle 1 close). + +### Cycle-1 Top 3 Improvement Actions — fulfillment status + +| # | Action | Status at cycle-3 close | Evidence | +|---|--------|-------------------------|----------| +| 1 | Land CR-F1 + CR-F2 hygiene PBIs before any new NFT helper expansion in cycle 2 | **Partial / unclear** — no batch report for CR-F1 / CR-F2 specifically in cycle 2 batches (98-102); but cycle-3 batches do not surface duplicated `csv_evidence_writer` / `fixture_path` helpers, suggesting silent absorption or the work is yet to land | Cycle-2 batches 98-102, cycle-3 batches 104-109 — no new Medium-severity helper-duplication findings | +| 2 | Sequence AZ-595 as first product task of cycle 2 | **Not done** — AZ-595 still listed as backlog item in cycle-1 retro language; no cycle-2 batch references AZ-595; the 17 NFT scenarios likely still skip on `sitl_replay_ready` | Glob `_docs/02_tasks/done/AZ-595*` — file absent from `done/` | +| 3 | Create `architecture_compliance_baseline.md` as Step 6 prerequisite | **Not done** — file still missing at cycle-3 close (verified via glob) | `_docs/02_document/architecture_compliance_baseline.md` does not exist | + +**Net assessment**: cycle-1 retro's Top 3 actions were largely not delivered. The cycle-2-retro skip is the proximate cause — without a cycle-2 retro to surface non-delivery, the actions sat invisible. 
+ +### Metric Comparison + +| Metric | Cycle 1 baseline | Cycle 3 close | Target (cycle 4) | +|--------|-------------------|----------------|-------------------| +| Code-review verdict mix | ~44 % PASS / ~55 % PASS_WITH_WARNINGS / 0 % FAIL | 33 % PASS / 67 % PASS_WITH_WARNINGS / 0 % FAIL | Maintain 0 % FAIL; lift PASS to ≥50 % via AZ-848 fix landing cleanly (a single-finding-batch tends to be PASS) | +| Avg findings per batch (Medium + Low) | ~0.2 | ~0.7 (one Medium in batch 104 + ~3 Lows across 4 PASS_WITH_WARNINGS = ~4 ÷ 6) | ≤ 0.5 | +| Mid-cycle remediation post-mortems | 1 | 0 | 0 | +| Mid-cycle ticket splits | 0 | 1 (AZ-848 → AZ-883) — *good* (correct discipline; not bad churn) | maintain (split discipline) | +| Structural baseline file present | No | **No (gap carried 2 cycles)** | Yes — drop it into cycle 4 Step 6 | +| Cycle-N retro filed at cycle-N close | Yes | **No for cycle 2; yes for cycle 3** | Yes — fix the autodev orchestrator gap | + +## Top 3 Improvement Actions (cycle 4) + +1. **Land the AZ-848 fix as cycle-4 first product task; bench-verify on Jetson before merging.** + - Impact: unblocks the Jetson e2e tlog-replay path that's been broken since cycle 2 (the AZ-776 xfail removal). Required for any real airborne release. Carries an explicit verification protocol: long-uptime Jetson + freshly-booted FC reproduces deterministically. + - Effort: 5 SP (per the revised spec). The fix touches the C1 `VioOutput.emitted_at_ns` contract and every C1 strategy that fills the field; well-scoped. + - Pair with: AZ-883 (2 SP, `_handle_imu` SCALED_IMU2 ts_ns=0) — independent fix but same investigation surface. + +2. **File a cycle-2 retro retroactively + add an autodev sanity check that flags missing retros.** + - Impact: cycle-1 retro's Top-3 actions all sat invisible because no cycle-2 retro re-surfaced them. The autodev orchestrator's Step 17 should refuse to enter Step 9 cycle-N+1 if `retro_*.md` for cycle N is absent. 
Catches future retro skips at the next session boundary, not 6 weeks later. + - Effort: small (1 SP for the autodev state check; +2 SP to write the catch-up cycle-2 retro from artifacts already on disk). + +3. **Land `architecture_compliance_baseline.md` as cycle-4 Step-6 prerequisite (third try).** + - Impact: same rationale as cycle-1 retro Improvement Action #3 — cumulative reviews still cannot emit `## Baseline Delta` sections; structural regressions remain invisible across cycles. + - Effort: ~1 SP (small file; seed from `structure_2026-05-20.md` with 0 violations baseline). The right insertion point is cycle 4's decompose phase; if decompose runs without it, fail-fast and create. + +## Suggested Rule / Skill Updates + +| File | Change | Rationale | +|------|--------|-----------| +| `.cursor/skills/implement/SKILL.md` (batch self-review or test sub-step) | Add a check: **if the batch removes `@pytest.mark.xfail` decorators from any test**, the same batch MUST include a green test execution against the actual hardware tier the test targets (or explicit `tier-2-only` skip documentation if hardware is unavailable in the batch session). Block PASS verdict without this evidence. | AZ-848 root cause: AZ-776 removed `@xfail` from AC-1/2/5/6 in cycle 2 with "AC-7 stating tests run on Jetson after this task → All five pass". The Jetson run was never performed. Predates the 2026-05 `meta-rule.mdc` "Real Results, Not Simulated Ones" — but the implement skill's own self-review should also enforce. | +| `.cursor/skills/autodev/state.md` or `flows/existing-code.md` (Re-Entry section) | When auto-chaining from Step 17 (Retrospective) to Step 9 (New Task) with `cycle: state.cycle + 1`, FIRST verify that `_docs/06_metrics/retro_.md` exists for the previous cycle. If absent, BLOCK and surface the gap. | Cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3. Cycle-1 retro's Top-3 actions sat invisible as a result. 
| +| `.cursor/skills/release/SKILL.md` Phase 2 strategy table | Add an explicit row: `bench-test` — bench-rig verification on real hardware via test compose (`docker-compose.test.jetson.yml` style); not a production deploy; collapses Phases 3+4 into one harness run; Phase 5 explicitly N/A; allowed for first-release / refactor-only cycles. | Cycle-3 release used this strategy ad-hoc; the skill's existing table forced a "manual" classification that doesn't quite fit. | +| `.cursor/skills/release/SKILL.md` Phase 1 rollback-readiness | When `.previous-tags.env` does NOT exist AND no `release/*` git tag exists, treat this as "first release" and accept `docker compose down` as the rollback path. Do NOT block on absent rollback target. | First-time release was a Phase 1 blocking gate per the current strict reading; cycle 3's bench-test release had to navigate it inline. | +| `.cursor/skills/test-spec/SKILL.md` (cycle-update mode) | When the cycle-update task list includes a ticket that touches a Protocol / dataclass / contract field semantics (e.g., `VioOutput.emitted_at_ns`), the test-spec sync MUST flag downstream consumers explicitly (e.g., C5 ESKF + C13 FDR both read `emitted_at_ns`). | AZ-848 affected C1 contract semantics; downstream C5 and C13 each read the field. The test-spec sync didn't flag this in cycle 2 when AZ-776 changed adjacent code. | + +## Process Leftovers (open at snapshot) + +- `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` — OPEN; gtsam numpy<2 ABI replay condition unchanged. Last check: 2026-05-26 in this session. +- `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` — OPEN (NEW this cycle); `EVIDENCE_OUT` default path is container-only; Tier-1 host runs need explicit override; workaround documented; 1 SP fix queued for cycle 4. + +End of cycle-3 retrospective. 
diff --git a/_docs/LESSONS.md b/_docs/LESSONS.md index 6b6b788..2d269a9 100644 --- a/_docs/LESSONS.md +++ b/_docs/LESSONS.md @@ -6,6 +6,30 @@ Ring buffer: trim to the last 15 entries. Categories: `estimation · architectur --- +## 2026-05-26 — [testing] Removing `@pytest.mark.xfail` must be paired with a same-batch run on the actual hardware tier the test targets + +**Trigger**: AZ-848 root cause re-diagnosis (2026-05-26). In cycle 2, commit `8de2716 [AZ-776] Open-loop ESKF composition profile via c4_pose.enabled` removed `@xfail` decorators from AC-1/AC-2/AC-5/AC-6 in `test_derkachi_1min.py` with AC-7 in the spec stating "tests run on Jetson after this task → All five pass". The Jetson run was never executed before AZ-776 closed. The latent C1 contract bug (`VioOutput.emitted_at_ns` uses `monotonic_ns` instead of FC-boot-relative timestamps) was therefore not detected until cycle-3 Step 11 — three weeks later. AZ-848 is 5 SP and now blocks all real airborne work in cycle 4. + +**What changed**: `.cursor/skills/implement/SKILL.md` batch self-review should add a check — **if the batch removes any `@pytest.mark.xfail` decorator**, the same batch MUST include a green test execution against the test's target tier (or explicit `tier-2-only` skip documentation if the hardware is unavailable in the batch session). Block PASS verdict without this evidence. AZ-776 predates the 2026-05 `meta-rule.mdc` "Real Results, Not Simulated Ones" rule, but the implement skill's own gate should enforce the same requirement. + +Source: `_docs/06_metrics/retro_2026-05-26.md` + +## 2026-05-26 — [process] Autodev must block cycle-N+1 entry if the previous cycle's retro file is missing + +**Trigger**: cycle-2 retro was never filed. The autodev orchestrator silently auto-chained from cycle-2 Step 17 (if it ran at all) straight into cycle-3 Step 9 without producing `retro_.md`. 
As a result, cycle-1 retro's Top-3 Improvement Actions sat invisible across cycle 2 and were re-discovered, all three still undelivered, only at cycle-3 close — including `architecture_compliance_baseline.md` (action #3) which is now in its third cycle of being un-delivered. + +**What changed**: `.cursor/skills/autodev/state.md` Re-Entry After Completion (or `flows/existing-code.md`) should verify that `_docs/06_metrics/retro_.md` exists for the previous cycle (`state.cycle`) before incrementing the cycle counter and entering Step 9 of cycle N+1. If absent, BLOCK and surface the gap with an A/B/C choice: (A) author the missing retro now, (B) stub a backfilled retro and proceed, (C) abort and ask the user. + +Source: `_docs/06_metrics/retro_2026-05-26.md` + +## 2026-05-26 — [tooling] When investigating bug X reveals a separate latent bug Y, file Y as a new ticket immediately — do not fold Y's scope into X + +**Trigger**: AZ-848 evidence-based investigation (2026-05-26) used a pymavlink probe against the Derkachi tlog to verify the original "IMU-vs-IMU clock mismatch" hypothesis. The probe REFUTED the original hypothesis (both `RAW_IMU` and `SCALED_IMU2` share the FC-boot timebase) and SIMULTANEOUSLY surfaced a separate latent bug — `c8_fc_adapter._handle_imu` mis-reads `SCALED_IMU2.time_boot_ms` as `time_usec`, defaulting to 0 for ~half of all IMU samples. Both bugs are real and orthogonal in their fix paths. The decision was to split — AZ-883 (2 SP) gets its own ticket, AZ-848 (5 SP) keeps its tightly-scoped contract repair. + +**What changed**: when a deep investigation surfaces a second latent issue that's orthogonal to the primary bug, file the second issue as its own ticket in the same session (with full evidence + reproduction protocol), then resume the primary investigation. Resist the temptation to fold the second issue into the primary ticket's scope "for convenience" — it inflates SP estimates and couples fix landings unnecessarily. 
+ +Source: `_docs/06_metrics/retro_2026-05-26.md` + ## 2026-05-20 — [testing] Two-tier test policy retired — all tests run on Jetson only **Trigger**: a `/test-run` invocation on the workstation Tier-1 Docker stack uncovered eight categorically distinct, sequential bugs in the supposedly-supported workstation path (Dockerfile `COPY` ordering before editable install, base-image pip too old for `gtsam` pre-release wheels, runtime stage missing the `python3` metapackage that `python3 -m venv` symlinks against, missing `libgl1` / `libglib2.0-0` for `cv2` import, missing `runtime_root/__main__.py` shim, lazy import that never registered the `c6_tile_cache` config block, and a `BUILD_FAISS_INDEX` env flag gap in `docker-compose.test.jetson.yml`). None of these had been hit before because no one had actually executed the workstation Docker stack end-to-end since it was authored — the colocated Jetson Woodpecker agent was the only test environment that ever ran. Maintaining the divergent x86 path was producing only false-negative signal and engineering time, never honest test coverage. 
diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index 77499f8..baee421 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -2,13 +2,13 @@ ## Current Step flow: existing-code -step: 12 -name: Test-Spec Sync -status: not_started +step: 10 +name: Implement +status: in_progress sub_step: phase: 0 name: awaiting-invocation detail: "" retry_count: 0 -cycle: 3 +cycle: 4 tracker: jira diff --git a/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md b/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md index b180fd7..004c58d 100644 --- a/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md +++ b/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md @@ -1,7 +1,7 @@ # D-CROSS-CVE-1 opencv-python pin deferred — gtsam/numpy ABI block **Recorded**: 2026-05-11T02:55+03:00 (Europe/Kyiv) -**Last replay attempt**: 2026-05-24T05:07+03:00 (Europe/Kyiv) — replay re-checked +**Last replay attempt**: 2026-05-26T13:06+03:00 (Europe/Kyiv) — replay re-checked at start of next `/autodev` invocation. PyPI re-queried via `python3 -m pip index versions gtsam`: only `gtsam 4.2` is published. Replay condition (numpy>=2 stable wheels) still NOT met. Leftover remains open. diff --git a/_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md b/_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md new file mode 100644 index 0000000..6764d60 --- /dev/null +++ b/_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md @@ -0,0 +1,51 @@ +# Leftover: EVIDENCE_OUT default is a hardcoded container path + +**Created**: 2026-05-26 +**Last replay attempt**: 2026-05-26 +**Category**: Test infrastructure defect (non-tracker leftover — code fix, not a deferred tracker write) +**Surfaced by**: autodev cycle 3 Step 15 (Performance Test) — `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` "Findings worth tracking" item 3. 
+ +## Problem + +`e2e/runner/conftest.py:56`: + +```python +default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence") +``` + +The default path `/e2e-results/evidence` is the container mount used by the Tier-1 Docker harness and the Tier-2 Jetson run script. On a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly (no Docker, no Jetson), the `nfr_recorder.pytest_sessionfinish` hook tries to create that directory and fails with: + +``` +OSError: [Errno 30] Read-only file system: '/e2e-results' +``` + +(This is the macOS failure: the root volume `/` is read-only.) On Linux hosts the same call would instead fail with `PermissionError`, because a non-root user cannot create `/e2e-results` at the filesystem root. + +## Workaround (used today) + +```bash +EVIDENCE_OUT="$(pwd)/e2e-results/cycle3-tier1-probe/evidence" \ + python -m pytest e2e/tests/performance/ -v --tb=short +``` + +This produced a clean exit-0 run with the expected 24 SKIPPED outcomes. + +## Proposed fix + +Change `e2e/runner/conftest.py:56` to default to a workspace-relative path when neither `--evidence-out` nor `EVIDENCE_OUT` is set. Two viable shapes: + +1. **Workspace-relative default**: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))`. +2. **Lazy fallback inside the recorder**: leave the default unset; if `evidence_dir` is `None` at session finish, skip emission and warn — useful for `--collect-only` or smoke runs where evidence output is genuinely not needed. + +Either shape preserves backward compatibility with the Docker / Jetson scripts (they pass `--evidence-out` explicitly). + +## Why not fix in this cycle + +Per `coderule.mdc` § Scope discipline: "Unrelated issues elsewhere: do not silently fix them as part of this task. Either note them to the user at end of turn and ASK before expanding scope, or record in `_docs/_process_leftovers/` for later handling." 
Cycle 3 was pre-flight / route-driven seeding work; the EVIDENCE_OUT default has no relationship to that scope. Recording here for either: + +- Next cycle's New Task step to pick up as a small (~1 pt) housekeeping ticket, OR +- A drive-by fix during the next test-infrastructure touch (e.g. when AZ-444 Tier-2 harness lands). + +## Replay condition + +This is a **code-fix leftover**, not a tracker-write leftover. There is nothing to "replay against the tracker". Resolution = land the conftest change above and verify a Tier-1 host run of `pytest e2e/tests/performance/` exits cleanly without `EVIDENCE_OUT` pre-set. Once that PR merges, delete this leftover.
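
The workspace-relative default (shape 1 under "Proposed fix") can be sketched as a small pure function. This is an illustration under stated assumptions rather than the actual change to `e2e/runner/conftest.py`: the helper name `default_evidence_dir` is hypothetical, and treating an empty `EVIDENCE_OUT` as unset is a choice made for this sketch only.

```python
from pathlib import Path
from typing import Mapping


def default_evidence_dir(env: Mapping[str, str], conftest_file: Path) -> Path:
    """Pick the evidence directory when `--evidence-out` was not passed.

    Hypothetical helper mirroring shape 1 of the proposed fix.
    """
    override = env.get("EVIDENCE_OUT")
    if override:
        # An explicit environment override always wins.
        # Assumption: an empty string is treated the same as "not set".
        return Path(override)
    # conftest_file is <workspace>/e2e/runner/conftest.py, so parents[2]
    # is the workspace root and the result is <workspace>/e2e-results/evidence.
    return conftest_file.resolve().parents[2] / "e2e-results" / "evidence"
```

A caller inside `conftest.py` would pass `os.environ` and `Path(__file__)`. Keeping the logic in a pure function makes the "no `EVIDENCE_OUT` pre-set" case from the replay condition easy to cover with a unit test.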