chore: WIP pre-implement

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-26 17:09:13 +03:00
parent be743a72d6
commit 940066bee2
31 changed files with 1709 additions and 54 deletions
@@ -12,3 +12,31 @@
Use this fixture for video/telemetry synchronization checks, representative replay smoke tests, VIO hot-path latency, frame-drop accounting, and trajectory comparison against `GLOBAL_POSITION_INT`. The video and telemetry align at exactly three video frames per telemetry row. Camera intrinsics, lens distortion, raw camera resolution, and exact camera-to-body calibration are still unknown, so this fixture is not sufficient by itself for final production camera calibration or satellite-anchor accuracy claims.
For the test recording, the rotating camera was mechanically fixed in a downward/nadir orientation. Treat the MP4 as a cleaned/cropped replay fixture rather than the raw camera feed.
## Derkachi C6 reference seeding (cycle 3 — AZ-777 + Epic AZ-835)
The end-to-end replay pipeline needs the C6 tile cache pre-populated with the satellite imagery that covers this flight. The seed scripts live under `tests/fixtures/derkachi_c6/`:
| Script | Purpose |
|--------|---------|
| `tests/fixtures/derkachi_c6/seed_region.py` (AZ-777 Phase 2) | Bbox-driven seed. Calls `POST /api/satellite/request` on the running `satellite-provider` to onboard the Derkachi area (~50.05–50.15 lat, 36.05–36.15 lon, zoom 15–18). Companion to the existing bbox-download workflow. |
| `tests/fixtures/derkachi_c6/seed_route.py` (AZ-838 / Epic AZ-835 C2) | Route-driven seed. Reads `derkachi.tlog`, extracts a ≤ 10-waypoint corridor via `replay_input.tlog_route.extract_route_from_tlog`, posts it to `satellite-provider`'s Route API, polls until `mapsReady=true`, and verifies coverage via inventory. ~100× more tile-efficient than the bbox path for this clip. |
| `tests/fixtures/derkachi_c6/bbox.yaml` | Derkachi bbox + zoom levels + license-attribution metadata (Google Maps Platform ToS + "Imagery © Google" attribution string). |
| `tests/fixtures/derkachi_c6/README.md` | Step-by-step re-seeding instructions for when the `satellite-provider` postgres is wiped; the license attribution that operators must propagate; pointer to the parent-suite ticket (TBD) for migrating to a true CC-BY satellite source for production. |
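
To get a feel for how many tiles a bbox-driven seed touches, the standard slippy-map (Web Mercator) tile formula can be applied to the bbox corners. This is a minimal sketch and not the logic in `seed_region.py`; the helper names are hypothetical, and the bbox and zoom values (read here as 50.05 to 50.15 lat, 36.05 to 36.15 lon, zoom 15 to 18) are assumptions taken from the table above.

```python
import math


def deg2num(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Standard slippy-map tile index (x, y) for a WGS84 point."""
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def bbox_tile_count(lat_min: float, lon_min: float,
                    lat_max: float, lon_max: float, zoom: int) -> int:
    """Number of tiles that intersect the bbox at one zoom level."""
    x_left, y_bottom = deg2num(lat_min, lon_min, zoom)   # y grows southwards
    x_right, y_top = deg2num(lat_max, lon_max, zoom)
    return (x_right - x_left + 1) * (y_bottom - y_top + 1)


# Assumed Derkachi bbox: (lat_min, lon_min, lat_max, lon_max).
BBOX = (50.05, 36.05, 50.15, 36.15)
counts = {zoom: bbox_tile_count(*BBOX, zoom) for zoom in range(15, 19)}
```

The count roughly quadruples with each zoom level, which is why a corridor-shaped seed can be much cheaper than a bbox for a long, narrow flight.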
Both seed scripts require:
- A running `satellite-provider` reachable at `SATELLITE_PROVIDER_URL` (typically `https://satellite-provider:8080` inside the Jetson compose network).
- A valid JWT — either `SATELLITE_PROVIDER_API_KEY` env var or `--auto-mint-jwt` (uses `scripts/mint_dev_jwt.py`).
- `SATELLITE_PROVIDER_TLS_INSECURE=1` if the parent suite is using the self-signed dev cert (development only — production deploys must validate against a CA-issued cert).
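
The three requirements above can be folded into one settings lookup. The following is a minimal sketch under stated assumptions: `provider_settings` is a hypothetical helper (not part of the seed scripts), and it only covers the `SATELLITE_PROVIDER_API_KEY` path, not `--auto-mint-jwt`.

```python
import os
from typing import Mapping


def provider_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connection settings for satellite-provider from the environment."""
    base_url = env.get("SATELLITE_PROVIDER_URL", "").rstrip("/")
    token = env.get("SATELLITE_PROVIDER_API_KEY", "")
    if not base_url:
        raise ValueError("SATELLITE_PROVIDER_URL is not set")
    if not token:
        raise ValueError("SATELLITE_PROVIDER_API_KEY is not set")
    # Development only: "1" disables certificate verification so the
    # self-signed dev cert is accepted. Anything else keeps verification on.
    insecure = env.get("SATELLITE_PROVIDER_TLS_INSECURE", "0") == "1"
    return {
        "base_url": base_url,
        "headers": {"Authorization": f"Bearer {token}"},
        "verify": not insecure,
    }
```

The returned keys match keyword arguments that `httpx.Client` accepts (`base_url`, `headers`, `verify`), so a caller could pass them through as `httpx.Client(**provider_settings())`.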
The end-to-end orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840) takes only `(derkachi.tlog, flight_derkachi.mp4, khp20s30_factory.json)` and runs the full 7-step pipeline against a populated C6 — see `_docs/02_document/contracts/replay/replay_protocol.md` Invariant 12.b for the orchestration.
### License attribution caveat (cycle 3)
The Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. This fixture and the seed scripts are dev/research use only. Production deployment requires either:
- Google Maps Platform licensing review for offline-cache use, OR
- A parent-suite ticket to switch satellite-provider's upstream to a true CC-BY satellite source (Esri World Imagery, Mapbox satellite, Sentinel-2, etc.).
The "Imagery © Google" attribution string is recorded in the seeded catalog's metadata and must be propagated downstream by any operator workflow that surfaces the imagery.
@@ -262,11 +262,25 @@ source repo
| ArduPilot Plane FC | MAVLink 2.0 (`GPS_INPUT` 5 Hz; `MAV_CMD_SET_EKF_SOURCE_SET`; `STATUSTEXT` / `NAMED_VALUE_FLOAT`) over UART/USB | MAVLink 2.0 message signing, per-flight key (D-C8-9 = (d)) | 5 Hz periodic emit; signing handshake at takeoff load (≤ 5 s, AC-NEW-1) | Signing handshake fail → companion refuses takeoff; mid-flight signing key compromise → FC ignores unsigned messages, AC-5.2 takes over |
| iNav FC | MSP2 `MSP2_SENSOR_GPS` over UART; MAVLink outbound for telemetry | None (iNav has no signing) — accepted residual risk per Mode B Source #129 | 5 Hz periodic emit | Mid-flight bad-frame → iNav `mspGPSReceiveNewData()` receives only the latest frame; honest `hPosAccuracy` is the only safety net |
| QGroundControl (GCS) | MAVLink 2.0 (`STATUSTEXT`, `NAMED_VALUE_FLOAT`, `GPS_RAW_INT`) | Same MAVLink 2.0 signing as the AP path (AP profile); no signing on iNav profile | 1–2 Hz downsampled (AC-6.1); operator commands are best-effort | GCS link drop → companion continues; no mid-flight reconfiguration is required from GCS |
| `satellite-provider` (pre-flight) | REST over HTTP, OpenAPI at `/swagger`; filesystem access if co-located | TLS + service-internal API key (operator workstation only); the companion never reaches `satellite-provider` directly while airborne | Off-line pre-flight; not time-critical | Cache miss → C11 `TileDownloader` fails fast pre-flight; C10 build is blocked downstream; takeoff blocked |
| `satellite-provider` (pre-flight read — bbox + slippy-map) | REST `POST /api/satellite/tiles/inventory` (bulk lookup by `(z,x,y)`, ≤ 5000 entries / request) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch); OpenAPI at `/swagger`; filesystem access if co-located | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert. The companion never reaches `satellite-provider` directly while airborne. | Off-line pre-flight; not time-critical | Cache miss → C11 `TileDownloader` fails fast pre-flight; C10 build is blocked downstream; takeoff blocked |
| `satellite-provider` (pre-flight route seed — cycle 3 / Epic AZ-835) | REST `POST /api/satellite/route` (corridor onboarding; body per `CreateRouteRequest.cs` DTO) + `GET /api/satellite/route/{id}` (status polling; terminal-success `mapsReady=true`) | Same JWT Bearer / TLS-insecure as the read path; validated pre-emptively against AZ-809 `CreateRouteRequestValidator` bounds | Off-line pre-flight; bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s) | Terminal failure → `RouteTerminalFailureError`; transient → `RouteTransientError`; validation → `RouteValidationError`. C11's `SatelliteProviderRouteClient` (AZ-838) owns the surface. |
| `satellite-provider` (post-landing ingest, D-PROJ-2, **planned**) | REST `POST /api/satellite/tiles/ingest` (multipart) | Per-flight onboard signing key (carried with each tile); rate-limited | Bursty post-landing | Endpoint not yet implemented service-side → C11 keeps batches queued locally; never blocks the pre-flight cycle |
| Operator workstation (pre-flight stage) | Filesystem (USB / Ethernet) | OS-level (operator login) | Not time-critical | Bad-stage detection via Manifest content-hash gate (D-C10-3) |
| Nav camera | USB / MIPI-CSI / GigE (lens-module dependent) | n/a | 3 Hz | Frame drop / hardware fault → "VISUAL_BLACKOUT" path (AC-3.5, AC-NEW-8) |
### `satellite-provider` integration (cycle-3 ground truth)
**The Jetson e2e harness now consumes the REAL parent-suite `satellite-provider` .NET service** (lineage AZ-688 / AZ-691 / AZ-692; `satellite-provider` + `satellite-provider-postgres` services in `docker-compose.test.jetson.yml`). The legacy `mock-sat` fixture is retired from the Jetson compose; D-PROJ-2 `POST /api/satellite/upload` has shipped service-side (`Program.cs:211`). Tier-1 `docker-compose.test.yml` is deprecated 2026-05-20 per `_docs/02_document/tests/environment.md`.
Two consequences for the architecture:
1. **C11 read contract adapted to the v1.0.0 inventory shape (AZ-777 Phase 1)**: `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` replace the historical `GET /api/satellite/tiles?bbox=…&zoom=…` shape. The bbox-driven `download_tiles_for_area` entry point and its DTOs are unchanged at the call-site level; the contract adaptation is internal to `HttpTileDownloader`. Auth is JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; `SATELLITE_PROVIDER_TLS_INSECURE=1` is a documented dev-only knob for self-signed certs.
2. **Route-driven seeding (Epic AZ-835 — C11's third interface, `SatelliteProviderRouteClient`)** — the operator can now submit a tlog-derived `RouteSpec` (waypoints + region size; produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836; canonical DTO at `_types/route.py` per AZ-845) via `POST /api/satellite/route` and have `satellite-provider` materialise just the corridor tiles, polling `GET /api/satellite/route/{id}` until `mapsReady=true`. This is ~100× more tile-efficient than the bbox path on long, narrow flights. Pre-emptive validation mirrors the AZ-809 `CreateRouteRequestValidator` bounds. The route-driven path is exercised today by the cycle-3 e2e fixture `operator_pre_flight_setup` (AZ-839) and the orchestrator test `test_az835_e2e_real_flight.py` (AZ-840); the C12 production CLI binding is a future-cycle integration.
**Imagery source license attribution (cycle 3)**: the Jetson `satellite-provider` instance downloads from the **Google Maps satellite layer** (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the parent-suite side (parent-suite ticket TBD). Operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution.
No new ADR — this is execution of existing decisions (architectural principle #5 satellite-provider on-disk layout end-to-end; ADR-004 process-level isolation unchanged; ADR-011 replay is a configuration unchanged). The architectural surface gained the route-driven seeding path inside C11; nothing else moved.
### `satellite-provider` upload contract (per D-PROJ-2 carryforward)
The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/2026-05-09_satellite-provider-design-tasks.md`. From this architecture's standpoint:
@@ -274,7 +288,7 @@ The onboard side of D-PROJ-2 is fully specified in `_docs/_process_leftovers/202
- **`Tile` writes are append-only and idempotent** (the same `(zoomLevel, lat, lon, capture_timestamp, companion_id, flight_id)` tuple is the dedup key).
- **Quality metadata is mandatory on every uploaded tile** so the planned voting layer can promote `pending → trusted` without re-deriving statistics on the service side.
- **Onboard tiles never claim the `trusted` status**; they are uploaded as `pending` and the parent-suite voting layer (D-PROJ-2 design task #2) decides promotion.
- **Test substitute**: `mock-suite-sat-service` is an e2e-test-only fixture (under `tests/fixtures/mock-suite-sat-service/`) that implements the upload contract for NFT-SEC-01 / FT-P-17 / IT runs until D-PROJ-2 lands service-side. It is **not a component** in the architectural sense — the production architectural counterparty for both download and upload is the real `satellite-provider`. The fixture is retired the moment the real ingest endpoint ships.
- **Test substitute**: `mock-suite-sat-service` is an e2e-test-only fixture (under `tests/fixtures/mock-suite-sat-service/`) that implements the upload contract for NFT-SEC-01 / FT-P-17 / IT runs until D-PROJ-2 lands service-side. It is **not a component** in the architectural sense — the production architectural counterparty for both download and upload is the real `satellite-provider`. The fixture is retired the moment the real ingest endpoint ships. (Download + route-seed integration tests on the Jetson harness already run against the real service as of cycle 3.)
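
The append-only, idempotent write rule and the mandatory quality metadata can be illustrated with a small in-memory model. This is a sketch under stated assumptions: `TileKey` and `PendingTileLog` are hypothetical names, the field types are guesses, and the real store is C6, not a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TileKey:
    """Dedup key from the contract: one tuple identifies one tile write."""
    zoom_level: int
    lat: float
    lon: float
    capture_timestamp: str
    companion_id: str
    flight_id: str


class PendingTileLog:
    """Append-only log: a repeated key is a no-op, never an overwrite."""

    def __init__(self) -> None:
        self._rows: dict[TileKey, dict] = {}

    def append(self, key: TileKey, quality: dict) -> bool:
        if not quality:
            raise ValueError("quality metadata is mandatory on every tile")
        if key in self._rows:
            return False  # idempotent: same tuple, nothing changes
        # Onboard tiles are always recorded as pending, never trusted.
        self._rows[key] = {"quality": dict(quality), "voting_status": "pending"}
        return True

    def __len__(self) -> int:
        return len(self._rows)

    def status(self, key: TileKey) -> str:
        return self._rows[key]["voting_status"]
```

Returning `False` on a duplicate rather than raising keeps retries safe: re-sending a batch after a network failure cannot create a second copy of a tile.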
---
@@ -2,23 +2,32 @@
## 1. High-Level Overview
**Purpose**: own the operator-side network I/O against `satellite-provider` for the onboard tile corpus, in **both directions**:
**Purpose**: own the operator-side network I/O against `satellite-provider` for the onboard tile corpus, in **three directions**:
- **Route seed** (pre-flight, F1, route-driven variant — Cycle 3 / Epic AZ-835): submit a tlog-derived `RouteSpec` (waypoints + per-waypoint coverage radius, produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836) to `satellite-provider`'s Route API and poll until corridor tile materialisation completes. Lets the operator pre-commit the cache to where the drone actually flew rather than a bounding box.
- **Download** (pre-flight, F1): fetch tiles from `satellite-provider` for the operational area, apply AC-NEW-6 freshness gating, and write into C6 (`TileStore` + `TileMetadataStore`). C11 is the **only** path that crosses the workstation/companion enclave to the parent suite for tile pixels — C10 reads from the populated C6 store and never touches `satellite-provider` itself.
- **Upload** (post-landing, F10): read pending mid-flight tiles from C6 and POST to `satellite-provider`'s ingest endpoint (D-PROJ-2 contract sketch). C11 itself does NOT gate on flight state — it is a dumb pipe; the post-landing safety gate is owned by C12's `PostLandingUploadOrchestrator` (AZ-329 / Batch 44), which checks the C13 `flight_footer` FDR record for `clean_shutdown=True` before invoking `TileUploader.upload_pending_tiles`.
C11 is a **separate operator-side binary / image**. The airborne companion image's CMake target deliberately excludes the entire `c11_tilemanager/` source tree so the airborne process cannot accidentally execute either the download path or the upload path even via reflection or config error (ADR-004 process-level isolation, AC-8.4). Both directions of tile I/O are operator-driven on the operator workstation; the companion only consumes the populated C6 store while airborne.
C11 is a **separate operator-side binary / image**. The airborne companion image's CMake target deliberately excludes the entire `c11_tilemanager/` source tree so the airborne process cannot accidentally execute the seed path, the download path, or the upload path even via reflection or config error (ADR-004 process-level isolation, AC-8.4). All three directions of tile I/O are operator-driven on the operator workstation; the companion only consumes the populated C6 store while airborne.
**Architectural Pattern**: Pipeline behind two interfaces (`TileDownloader`, `TileUploader`) under one component, consistent with C8's multi-interface shape (FC-AP, FC-iNav, GCS adapters under one component). The two interfaces are bundled into C11 because they share auth (TLS + service-internal API key for download, per-flight onboard signing key for upload), HTTP client, network configuration, deployment unit (operator-tooling tarball), and the airborne-exclusion property — splitting them into two components would duplicate all of that. They are kept as **two interfaces** so SRP is preserved at the call-site level: C12 binds `TileDownloader` for the F1 cache-build workflow, `TileUploader` for the F10 post-landing trigger; neither is forced to depend on the other.
**Architectural Pattern**: Pipeline behind three interfaces (`SatelliteProviderRouteClient`, `TileDownloader`, `TileUploader`) under one component, consistent with C8's multi-interface shape (FC-AP, FC-iNav, GCS adapters under one component). The three interfaces are bundled into C11 because they share auth (JWT Bearer + optional TLS-insecure flag for dev self-signed certs across all three; the upload direction additionally signs each tile with the per-flight onboard signing key), HTTP client (`httpx`), network configuration, deployment unit (operator-tooling tarball), and the airborne-exclusion property — splitting them into separate components would duplicate all of that. They are kept as **three interfaces** so SRP is preserved at the call-site level: C12 binds `SatelliteProviderRouteClient.seed_route` to materialise the corridor cache from a tlog (cycle 3 e2e fixture today; planned C12 production path), `TileDownloader.download_tiles_for_area` for the F1 bbox-driven cache-build workflow, `TileUploader.upload_pending_tiles` for the F10 post-landing trigger; none is forced to depend on the others.
**Cycle-1 operational reality**: C11 is **operator-workstation-only**, NOT an airborne strategy slot — there is no `c11_tile_manager` slot in `_AIRBORNE_REGISTRATIONS`, no row in `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS`, and the airborne companion image's build target deliberately excludes the entire `c11_tile_manager/` source tree (ADR-004 process-level isolation; AC-8.4). The operator binary composes C11 via `runtime_root/c11_factory.py`, which exposes three tiny per-service factories — `build_per_flight_key_manager` (AZ-318), `build_tile_uploader` (AZ-319 + AZ-320), and `build_tile_downloader` (AZ-316) — each called explicitly by C12's CLI; no central registry. FDR wiring goes through the per-producer `make_fdr_client` cache: AZ-318 `PerFlightKeyManager` defaults to `make_fdr_client("c11_tile_manager.signing_key", config)`, AZ-319 `HttpTileUploader` to `make_fdr_client("c11_tile_manager.tile_uploader", config)` — both distinct from the airborne `"airborne_main"` producer, so the operator-workstation process gets its own per-component FdrClient instances rather than sharing the airborne singleton. AZ-320's `IdempotentRetryTileUploader` decorator wraps `HttpTileUploader` by default (per-call + per-tile bounded retry); `config.components['c11_tile_manager'].disable_retry_decorator = True` suppresses the wrap for low-level debugging or test wiring that needs to observe the inner uploader. The AZ-507 cross-component cut keeps C11 from importing C6 directly: `tile_store` / `tile_metadata_store` are passed in by the operator-binary composition root as consumer-side cuts; `http_client` (an `httpx.Client`) is also caller-owned so tests can swap in `httpx.MockTransport`. AZ-687 replay-mode guard does not apply — C11 has no airborne footprint.
**Cycle-3 operational reality (AZ-777 Phase 1 + Epic AZ-835)**: the e2e harness now wires the e2e-runner against the **real** parent-suite `satellite-provider` .NET service in `docker-compose.test.jetson.yml` (lineage AZ-688 / AZ-691 / AZ-692; tier-1 `docker-compose.test.yml` deprecated 2026-05-20). Two consequences cascaded into C11:
- **`TileDownloader` contract adaptation (AZ-777 Phase 1)**: `HttpTileDownloader._INVENTORY_PATH = "/api/satellite/tiles/inventory"` (POST, bulk lookup by (z,x,y)) and `HttpTileDownloader._TILES_PATH = "/tiles"` (GET, slippy-map fetch via `/tiles/{z}/{x}/{y}`). Previously documented as `GET /api/satellite/tiles?bbox=…&zoom=…`; the real `satellite-provider` API surface uses the inventory + slippy-map split per `tile-inventory.md` v1.0.0 (AZ-505). The bbox-driven `download_tiles_for_area` entry point and its `DownloadRequest` / `DownloadBatchReport` DTOs are unchanged at the call-site level; the contract adaptation is internal. Because the inventory response does not carry a `Content-Length` hint, AZ-308's pre-write budget check uses `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` (conservative over-reserve; typical 256×256 JPEG basemap tile is 8–80 KiB). Auth is `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert.
- **Third interface, `SatelliteProviderRouteClient` (AZ-838 / Epic AZ-835 C2)**: `seed_route(spec: RouteSpec) -> RouteSeedResult` POSTs the spec to `POST /api/satellite/route` (`requestMaps=true`, `createTilesZip=false`), polls `GET /api/satellite/route/{id}` until `mapsReady=true` (or a terminal-failure status), then verifies coverage via `POST /api/satellite/tiles/inventory`. Pre-emptively enforces AZ-809's `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges) so obviously-bad input fails before the HTTP POST. Default cadence: `poll_interval_s = 5.0`, `poll_max_attempts = 60`, `request_timeout_s = 30.0`. Errors form a dedicated hierarchy (`RouteValidationError` 4xx + RFC 7807 ProblemDetails; `RouteTransientError` 5xx / network / timeout with `__cause__` set; `RouteTerminalFailureError` for non-success terminal status) rooted at `SatelliteProviderRouteError`, independent of `TileManagerError` because the Route API is a corridor-onboarding flow, not a per-tile transfer.
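
The pre-emptive bounds check described above might look like the sketch below. It is illustrative only: `validate_route_request` is a hypothetical function, the exception class is a local stand-in for the C11 error of the same name, and the lat/lon limits (±90, ±180) are assumed because the text only says "lat/lon ranges".

```python
from typing import Sequence


class RouteValidationError(ValueError):
    """Local stand-in for the C11 error of the same name."""


def validate_route_request(points: Sequence[tuple[float, float]],
                           region_size_meters: float,
                           zoom_level: int) -> None:
    """Reject obviously bad input before any HTTP request is made."""
    if not 2 <= len(points) <= 500:
        raise RouteValidationError(f"points: expected 2..500, got {len(points)}")
    if not 100 <= region_size_meters <= 10_000:
        raise RouteValidationError(
            f"regionSizeMeters: expected 100..10000, got {region_size_meters}")
    if not 0 <= zoom_level <= 22:
        raise RouteValidationError(f"zoomLevel: expected 0..22, got {zoom_level}")
    for index, (lat, lon) in enumerate(points):
        # Assumed WGS84 limits; the validator's exact ranges are not quoted here.
        if not -90.0 <= lat <= 90.0 or not -180.0 <= lon <= 180.0:
            raise RouteValidationError(
                f"points[{index}]: out of range ({lat}, {lon})")
```

Checking locally means a malformed spec costs no round trip and produces an error message that names the offending field.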
The route-driven path is exercised today by `tests/e2e/replay/conftest.py::operator_pre_flight_setup` (AZ-839 — replaces the cycle-1 `mkdir` placeholder; yields a `PopulatedC6Cache` dataclass) and `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840 — single test that takes only `(tlog, video, calibration)` and runs the full 7-step pipeline). The C12 production CLI binding for the route path is a future-cycle integration; today's C12 still drives only `download_tiles_for_area` for production pre-flight cache builds.
**Upstream dependencies**:
- C12 OperatorTooling → invokes `TileDownloader.download_tiles_for_area(...)` during F1 and `TileUploader.upload_pending_tiles(...)` post-landing.
- C12 OperatorTooling → invokes `TileDownloader.download_tiles_for_area(...)` during F1 and `TileUploader.upload_pending_tiles(...)` post-landing. (Cycle-3 e2e fixtures also drive `SatelliteProviderRouteClient.seed_route(...)` for the route-driven F1 variant; C12 production binding for the route path is a future cycle.)
- C6 TileStore + TileMetadataStore → write target during download (`source = googlemaps`); read source during upload (`source = onboard_ingest`, `voting_status = pending`).
- `replay_input.tlog_route.RouteSpec` (AZ-836; `_types/route.py` canonical home per AZ-845) → input DTO to `SatelliteProviderRouteClient.seed_route`.
- Operator workstation OS → invocation entry point (CLI / tray app, owned by C12).
- `satellite-provider` (external) → `GET /api/satellite/tiles?bbox=…&zoom=…` for download; `POST /api/satellite/tiles/ingest` for upload (D-PROJ-2 design task #1, **planned, not yet implemented service-side**).
- `satellite-provider` (external) → for download: `POST /api/satellite/tiles/inventory` (bulk lookup by (z,x,y)) + `GET /tiles/{z}/{x}/{y}` (slippy-map fetch, per `tile-inventory.md` v1.0.0 / AZ-505); for route seeding: `POST /api/satellite/route` + `GET /api/satellite/route/{id}` (per `CreateRouteRequest.cs` DTO + AZ-809 validator); for upload: `POST /api/satellite/tiles/ingest` (D-PROJ-2 design task #1, **planned, not yet implemented service-side**).
**Downstream consumers**:
@@ -27,6 +36,12 @@ C11 is a **separate operator-side binary / image**. The airborne companion image
## 2. Internal Interfaces
### Interface: `SatelliteProviderRouteClient` (cycle 3 — AZ-838 / Epic AZ-835 C2)
| Method | Input | Output | Async | Error Types |
|--------|-------|--------|-------|-------------|
| `seed_route` | `RouteSpec` (from `_types/route.py`; `name: str \| None` optional) | `RouteSeedResult` | No (poll loop; seconds to minutes) | `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError` (all under `SatelliteProviderRouteError`) |
### Interface: `TileDownloader`
| Method | Input | Output | Async | Error Types |
@@ -46,6 +61,21 @@ C11 no longer exposes `confirm_flight_state` — the post-landing flight-state g
**Input/Output DTOs**:
```
RouteSpec (cycle 3 — _types/route.py, produced by replay_input/tlog_route.py):
waypoints: tuple[tuple[float, float], ...] # (lat, lon), 1..max_waypoints
suggested_region_size_meters: float # per-waypoint coverage radius
source_tlog: Path # provenance
source_segment: tuple[int, int] # (start_idx, end_idx) into tlog GPS rows
total_distance_meters: float # along-track distance of active segment
RouteSeedResult (cycle 3 — c11_tile_manager.route_client):
route_id: uuid
terminal_status: string
maps_ready: bool
tile_count: int
elapsed_ms: int
submitted_payload_sha256: string
DownloadRequest:
bbox: BoundingBox (lat_min, lon_min, lat_max, lon_max)
zoom_levels: list[int]
@@ -78,17 +108,25 @@ UploadBatchReport:
## 3. External API Specification
C11 is a **client** of `satellite-provider`'s REST surface in both directions.
C11 is a **client** of `satellite-provider`'s REST surface in three directions.
### 3.1 Download — read path (existing `satellite-provider` API)
### 3.1 Route seed — corridor materialisation (cycle 3 — AZ-838 / Epic AZ-835 C2)
| Endpoint | Method | Auth | Rate Limit | Description |
|----------|--------|------|------------|-------------|
| `/api/satellite/tiles?bbox=…&zoom=…` | GET | TLS + service-internal API key | parent-suite enforces | Paged tile blobs + metadata for a bounding box at the given zoom level(s). |
| `/api/satellite/route` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Submit a `RouteSpec` (waypoints + region size + zoom level). Body shape per `CreateRouteRequest.cs` / `RoutePoint.cs` (`lat` / `lon` JSON property names) / `GeoPoint.cs` DTOs. Query: `requestMaps=true&createTilesZip=false`. Validated pre-emptively against AZ-809 `CreateRouteRequestValidator` rules. |
| `/api/satellite/route/{id}` | GET | same as above | parent-suite enforces | Poll route processing status. Returns `mapsReady: bool` + a `status` string. Terminal-success: `mapsReady=true`. Terminal-failure: `status ∈ {failed, error, rejected}`. Default cadence: 5 s × ≤ 60 attempts. |
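
The status-polling row above describes a bounded loop with one success condition and a small set of failure statuses. A minimal sketch follows, assuming a caller-supplied `fetch_status` callable that returns the decoded JSON body; the function name, the injected `sleep`, and the choice to raise `RouteTransientError` when attempts run out are assumptions, and the exception classes are local stand-ins.

```python
import time
from typing import Callable

TERMINAL_FAILURE_STATUSES = {"failed", "error", "rejected"}


class RouteTransientError(RuntimeError):
    """Stand-in: route not ready within the polling budget (assumed mapping)."""


class RouteTerminalFailureError(RuntimeError):
    """Stand-in: the service reported a non-success terminal status."""

    def __init__(self, detail: dict) -> None:
        super().__init__(f"route failed with status {detail.get('status')!r}")
        self.detail = detail


def poll_until_ready(fetch_status: Callable[[], dict],
                     poll_interval_s: float = 5.0,
                     poll_max_attempts: int = 60,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until mapsReady is true, a failure status appears, or attempts run out."""
    for attempt in range(1, poll_max_attempts + 1):
        body = fetch_status()
        if body.get("mapsReady") is True:
            return body
        if str(body.get("status", "")).lower() in TERMINAL_FAILURE_STATUSES:
            raise RouteTerminalFailureError(body)
        if attempt < poll_max_attempts:
            sleep(poll_interval_s)  # no sleep after the final attempt
    raise RouteTransientError(
        f"route not ready after {poll_max_attempts} attempts")
```

Injecting `sleep` keeps the loop testable without waiting for real wall-clock time.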
C11 honours `Retry-After` on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Resolution below 0.5 m/px (RESTRICT-SAT-4) is rejected at the C11 boundary, not pushed downstream.
### 3.2 Download — read path (`satellite-provider` v1.0.0 inventory contract — AZ-505 / AZ-777 Phase 1)
### 3.2 Upload — write path (D-PROJ-2 contract sketch, **planned**)
| Endpoint | Method | Auth | Rate Limit | Description |
|----------|--------|------|------------|-------------|
| `/api/satellite/tiles/inventory` | POST | JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) + optional dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` | parent-suite enforces | Bulk lookup of `(zoom, x, y)` slippy-map coords (≤ 5000 entries / request); body shape per `tile-inventory.md` v1.0.0. Response order matches request order; each entry carries `present: true|false` plus metadata when present (`resolutionMPerPx`, `producedAt`, …). |
| `/tiles/{z}/{x}/{y}` | GET | same as above | parent-suite enforces | Slippy-map tile fetch by coordinates (binary JPEG response). Issued only for inventory entries with `present=true`. |
C11 honours `Retry-After` on 429s, fails fast on TLS / auth errors, retries with backoff on 5xx. Resolution below 0.5 m/px (RESTRICT-SAT-4) is rejected at the C11 boundary, not pushed downstream. Because the inventory response carries no `Content-Length` hint, AZ-308's pre-write budget check uses a conservative `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` per-tile reserve.
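
Two numeric limits from this section interact: the inventory request cap of 5000 entries and the per-tile byte reserve used for the budget check. The sketch below shows one way to apply both. The function names are hypothetical, and treating the AC-8.3 budget as decimal gigabytes (10 × 10⁹ bytes) is an assumption.

```python
from typing import Iterator, Sequence

MAX_INVENTORY_ENTRIES = 5000            # per-request cap from the table above
DEFAULT_ESTIMATED_TILE_BYTES = 50_000   # per-tile reserve, no Content-Length hint
CACHE_BUDGET_BYTES = 10 * 1000 ** 3     # AC-8.3 budget, assumed decimal GB

TileCoord = tuple[int, int, int]        # (z, x, y)


def inventory_batches(coords: Sequence[TileCoord],
                      size: int = MAX_INVENTORY_ENTRIES) -> Iterator[list[TileCoord]]:
    """Split coordinates into request-sized chunks, preserving order."""
    for start in range(0, len(coords), size):
        yield list(coords[start:start + size])


def fits_budget(present_tile_count: int, used_bytes: int = 0) -> bool:
    """Pre-write check: reserve the estimate for every tile still to fetch."""
    reserve = present_tile_count * DEFAULT_ESTIMATED_TILE_BYTES
    return used_bytes + reserve <= CACHE_BUDGET_BYTES
```

Preserving order in each chunk matters because the inventory response order matches the request order, so results can be zipped back onto the coordinates that were sent.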
### 3.3 Upload — write path (D-PROJ-2 contract sketch, **planned**)
| Endpoint | Method | Auth | Rate Limit | Description |
|----------|--------|------|------------|-------------|
@@ -136,26 +174,28 @@ C11 reads from / writes to C6 (the local store) and reads from / writes to `sate
**Algorithmic Complexity**:
- Route seed: bounded by parent-suite tile materialisation latency (~seconds to minutes for the Derkachi corridor; gated by `poll_max_attempts × poll_interval_s`).
- Download: linear in tile count; bandwidth-bound by the operator workstation's link to `satellite-provider`.
- Upload: linear in pending tile count; bandwidth-bound; bursty post-landing.
**State Management**: stateless except for the two journals.
**State Management**: stateless except for the two journals (download / pending-upload). The route client is fully stateless — each `seed_route` call submits, polls, verifies, and returns.
**Key Dependencies**:
| Library | Version | Purpose |
|---------|---------|---------|
| httpx | per project pin | GET (download) + multipart POST (upload) to `satellite-provider` |
| httpx | per project pin | POST inventory + GET slippy-map (download), POST route + GET status (route seed), multipart POST (upload) to `satellite-provider` |
| atomicwrites | latest | Journal updates |
| cryptography | per project pin | Per-flight signing key (upload payload signing); the production `satellite-provider` ingest endpoint and the e2e-test `mock-suite-sat-service` fixture both verify with the same key family |
**Error Handling Strategy**:
- `SatelliteProviderError`: HTTP timeout / 5xx / TLS failure on either direction. Retry-with-backoff on 5xx; fail fast on TLS / auth. On download, surface to operator + takeoff blocked. On upload, leave tiles in the pending-upload journal and surface to operator. **Do not delete uploaded tiles from C6** until acknowledged.
- `SatelliteProviderError`: HTTP timeout / 5xx / TLS failure on download / upload. Retry-with-backoff on 5xx; fail fast on TLS / auth. On download, surface to operator + takeoff blocked. On upload, leave tiles in the pending-upload journal and surface to operator. **Do not delete uploaded tiles from C6** until acknowledged.
- `RateLimitedError` (429): obey `Retry-After`; the operator can also re-invoke later. Same handling either direction.
- `FreshnessRejectionError` / `ResolutionRejectionError`: download-side only. Per AC-NEW-6 / RESTRICT-SAT-4 — never silently downgrade fresh-required tiles in `active_conflict` sectors. Surface counts in the `DownloadBatchReport`.
- `CacheBudgetExceededError`: download-side only. Pre-flight free-space check against AC-8.3 (≤ 10 GB). Fail fast with explicit budget delta; no partial write.
- `SignatureRejectedError`: upload-side only. Per-flight signing key was rejected by `satellite-provider`. This is a security-critical event — do NOT silently drop; surface to operator + log to FDR.
- **Route-seed errors** (cycle 3, dedicated hierarchy under `SatelliteProviderRouteError`): `RouteValidationError` (4xx + RFC 7807 `errors` dict; raised pre-emptively for AZ-809 validator violations BEFORE the HTTP POST), `RouteTransientError` (5xx / network / timeout; carries `__cause__`), `RouteTerminalFailureError` (parent suite reports a non-success terminal status; `.detail` carries the response JSON). Separate hierarchy from `TileManagerError` because the route flow is corridor onboarding, not per-tile transfer.
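
The requirement that `RouteTransientError` carries `__cause__` maps directly onto Python's `raise ... from ...`. A minimal sketch, assuming the low-level failure surfaces as a built-in `TimeoutError` or `ConnectionError` (the real client would see `httpx` exceptions instead); the classes are local stand-ins and `call_route_api` is a hypothetical wrapper.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class SatelliteProviderRouteError(Exception):
    """Stand-in root of the route-seed error hierarchy."""


class RouteTransientError(SatelliteProviderRouteError):
    """Stand-in: 5xx, network failure, or timeout; safe to retry later."""


def call_route_api(request: Callable[[], T]) -> T:
    """Run one request and translate low-level failures into the hierarchy."""
    try:
        return request()
    except (TimeoutError, ConnectionError) as exc:
        # "from exc" sets __cause__, so the original failure stays inspectable.
        raise RouteTransientError(f"transient failure: {exc}") from exc
```

Callers can then catch `SatelliteProviderRouteError` once and still reach the underlying network error through `__cause__` for logging.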
Post-landing safety: C11's upload path no longer gates on flight state internally. The check now lives in C12's `PostLandingUploadOrchestrator` (AZ-329 / Batch 44), which refuses to invoke `TileUploader.upload_pending_tiles` unless the C13 `flight_footer` FDR record shows `clean_shutdown=True` for the target flight. ADR-004 process-level isolation remains the primary control: C11 should never run on the companion at all.
@@ -170,8 +210,10 @@ Post-landing safety: C11's upload path no longer gates on flight state internall
**Known limitations**:
- D-PROJ-2 ingest endpoint is NOT yet implemented service-side. Until parent-suite delivers the endpoint, C11 will fail every upload — the pending-upload journal accumulates. Operator workflow tolerates this.
- The e2e-test `mock-suite-sat-service` fixture implements only the planned POST contract (per the leftover file). Download integration tests run against the real `satellite-provider`. Production runs reach `satellite-provider` directly in both directions; the fixture is never on the production path.
- `TileDownloader` requires the operator workstation to have network reach to `satellite-provider` (the only path that crosses out of the workstation enclave). Pre-flight network configuration is an operator concern owned by C12; C11 fails fast if reachability is missing.
- The e2e-test `mock-suite-sat-service` fixture implements only the planned POST upload contract (per the leftover file). Download + route-seed integration tests run against the real `satellite-provider` on the Jetson harness. Production runs reach `satellite-provider` directly in all three directions; the fixture is never on the production path.
- `TileDownloader` and `SatelliteProviderRouteClient` require the operator workstation to have network reach to `satellite-provider` (the only path that crosses out of the workstation enclave). Pre-flight network configuration is an operator concern owned by C12; C11 fails fast if reachability is missing.
- **Imagery source license attribution (cycle 3 — AZ-777 Phase 2)**: the Jetson `satellite-provider` instance downloads from the **Google Maps** satellite layer (`lyrs=s`), governed by Google Maps Platform Terms of Service. Dev/research use only; the operator-side seed scripts (`tests/fixtures/derkachi_c6/seed_region.py`, `seed_route.py`) propagate the "Imagery © Google" attribution string. Production deployment requires either a Google Maps Platform licensing review or migration to a true CC-BY satellite source on the satellite-provider side (parent-suite ticket TBD; surfaced in `_docs/00_problem/input_data/flight_derkachi/README.md`).
- **Dev TLS cert**: the e2e-runner today accepts the self-signed dev cert via `SATELLITE_PROVIDER_TLS_INSECURE=1`. Production deploys must validate against a CA-issued cert (`SATELLITE_PROVIDER_TLS_INSECURE=0`); the env knob is documented in `.env.test.example` + the smoke test + this section as **development-only**.
**Potential race conditions**:
@@ -179,25 +221,28 @@ Post-landing safety: C11's upload path no longer gates on flight state internall
**Performance bottlenecks**:
- Route seed: parent-suite tile-materialisation latency dominates (corridor onboarding from Google Maps upstream). Bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s = 5 min wall-clock ceiling).
- Download: bandwidth-bound by the operator workstation's `satellite-provider` link; descriptor / engine work is downstream in C10 (offline, minutes).
- Upload: bandwidth-bound. Per-flight upload volume is bounded by the F4 mid-flight tile gen cap (typically a few hundred tiles, each 50–200 KB → tens of MB per flight).
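The two bounds quoted above are plain arithmetic. The check below is a minimal sketch: the poll defaults come from the route-seed bullet, the per-tile size is read as 50 to 200 KB, and the tile count of 300 is an assumed, illustrative stand-in for "a few hundred".

```python
# Route-seed poll ceiling, using the defaults quoted in the route-seed bullet.
poll_max_attempts = 60
poll_interval_s = 5.0
poll_ceiling_s = poll_max_attempts * poll_interval_s  # 300 s, i.e. 5 min

# Per-flight upload volume. The tile count is an assumption for illustration;
# the per-tile sizes are the 50 KB and 200 KB ends of the quoted range.
assumed_tiles_per_flight = 300
low_mb = assumed_tiles_per_flight * 50 / 1000    # 15 MB
high_mb = assumed_tiles_per_flight * 200 / 1000  # 60 MB
```

Both ends of the upload estimate land in the "tens of MB" range stated in the bullet.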
## 8. Dependency Graph
**Must be implemented after**: C6 (read source for upload, write target for download), `satellite-provider` (download path; existing) + D-PROJ-2 endpoint (upload path; the e2e-test `mock-suite-sat-service` fixture covers tests until the real endpoint ships).
**Must be implemented after**: C6 (read source for upload, write target for download), `satellite-provider` (download + route-seed paths; existing) + D-PROJ-2 endpoint (upload path; the e2e-test `mock-suite-sat-service` fixture covers tests until the real endpoint ships). `replay_input.tlog_route` (AZ-836) is a soft prerequisite for the route-seed path — the route client accepts any `RouteSpec` regardless of how it was produced, but the cycle-3 e2e fixture wires `extract_route_from_tlog` upstream.
**Can be implemented in parallel with**: anything except C6 changes.
**Blocks**: F1 (pre-flight cache build cannot start without `TileDownloader`), F10 (post-landing upload cannot start without `TileUploader`).
**Blocks**: F1 (pre-flight cache build cannot start without `TileDownloader` or — for the route-driven variant — `SatelliteProviderRouteClient.seed_route`), F10 (post-landing upload cannot start without `TileUploader`).
## 9. Logging Strategy
| Log Level | When | Example |
|-----------|------|---------|
| ERROR | `SignatureRejectedError`, persistent `SatelliteProviderError`, `CacheBudgetExceededError` | `C11 upload failure: signature rejected by satellite-provider` |
| WARN | one-off network failure, scheduled retry, freshness-driven rejections (counts) | `C11 batch upload retry: batch_uuid=…; next_retry_in_s=30` |
| INFO | session start/end; per-batch report (download + upload) | `C11 download complete: 87654 tiles, 12 stale-rejected; bbox=…` |
| DEBUG | per-tile request/response | `C11 tile uploaded: tile_id=(z=18,lat=…,lon=…); status=queued` |
| ERROR | `SignatureRejectedError`, persistent `SatelliteProviderError`, `CacheBudgetExceededError`, `RouteTerminalFailureError` | `C11 upload failure: signature rejected by satellite-provider`; `c11.route.poll.terminal kind=failed route_id=…` |
| WARN | one-off network failure, scheduled retry, freshness-driven rejections (counts), `RouteTransientError` retries, `RouteValidationError` pre-flight rejections | `C11 batch upload retry: batch_uuid=…; next_retry_in_s=30`; `c11.route.validation_failed field=points reason=below_min(2)` |
| INFO | session start/end; per-batch report (download + upload); route submit + each poll tick + inventory verify | `C11 download complete: 87654 tiles, 12 stale-rejected; bbox=…`; `c11.route.submit route_id=…`; `c11.route.poll.tick attempt=3 status=processing` |
| DEBUG | per-tile request/response; per-tile inventory entries | `C11 tile uploaded: tile_id=(z=18,lat=…,lon=…); status=queued` |
Cycle-3 route-client log kinds: `c11.route.submit`, `c11.route.poll.tick`, `c11.route.poll.terminal`, `c11.route.inventory`, `c11.route.validation_failed` (component `c11_tile_manager.route_client`).
**Log format**: structured JSON.
**Log storage**: operator workstation log file (e.g. `~/.azaion/onboard/c11-tilemanager.log`); also writes per-run summaries (download report, upload report) to the operator workstation cache root for audit. The companion's FDR is NOT involved (C11 doesn't run on the companion).
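As an illustration of the structured-JSON format, the sketch below renders one `c11.route.poll.tick` event with the standard-library `logging` module. The JSON field names (`level`, `component`, `kind`, and the extra `attempt` / `status` keys) and the helper names are assumptions made for this example; the actual schema is not specified in this section.

```python
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each record as one JSON object per line (field names are assumed)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "component": record.name,
            "kind": record.getMessage(),
        }
        # Extra key/value pairs, e.g. passed via `extra={"fields": {...}}`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def render_poll_tick(attempt: int, status: str) -> str:
    """Format one poll-tick event without touching global logging state."""
    record = logging.LogRecord(
        name="c11_tile_manager.route_client",
        level=logging.INFO,
        pathname="",
        lineno=0,
        msg="c11.route.poll.tick",
        args=None,
        exc_info=None,
    )
    record.fields = {"attempt": attempt, "status": status}
    return JsonLineFormatter().format(record)
```

Attaching the same formatter to a `logging.FileHandler` would produce one JSON line per event in the operator workstation log file.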
@@ -0,0 +1,126 @@
# Contract: route_client
**Component**: c11_tilemanager
**Producer task**: AZ-838_satellite_provider_route_client (Epic AZ-835 C2)
**Consumer tasks**: AZ-839 (`operator_pre_flight_setup` real fixture, Epic AZ-835 C3); AZ-840 (E2E orchestrator test, Epic AZ-835 C4); future C12 production binding (deferred — see § Non-Goals).
**Version**: 1.0.0
**Status**: stable
**Last Updated**: 2026-05-26
## Purpose
The `SatelliteProviderRouteClient` is C11's operator-side **route-onboarding** interface. Given a `RouteSpec` (a coarsened, tlog-derived flight corridor produced by `replay_input.tlog_route.extract_route_from_tlog` — AZ-836), it registers the corridor with the parent-suite `satellite-provider` Route API, polls until materialisation completes, and verifies coverage via the inventory contract.
The route-driven seeding flow lets the operator pre-commit the C6 cache to the precise corridor the drone actually flew rather than a coarse bounding box — typically ~100× more tile-efficient on long, narrow flights.
C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module.
**Upstream API** (cycle 3 — AZ-838): `POST /api/satellite/route` (corridor onboarding; body shape per `CreateRouteRequest.cs` / `RoutePoint.cs` / `GeoPoint.cs` DTOs; query `requestMaps=true&createTilesZip=false`) + `GET /api/satellite/route/{id}` (status polling; terminal-success when `mapsReady=true`; terminal-failure when `status ∈ {failed, error, rejected}`) + `POST /api/satellite/tiles/inventory` (post-materialisation coverage verification, shared with `tile_downloader`). Authentication: `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert.
## Shape
### Function / method API
```python
import uuid
from gps_denied_onboard._types.route import RouteSpec # AZ-845 canonical home
class SatelliteProviderRouteClient:
def __init__(
self,
base_url: str,
jwt: str,
*,
tls_insecure: bool = False,
request_timeout_s: float = 30.0,
poll_interval_s: float = 5.0,
poll_max_attempts: int = 60,
) -> None: ...
def seed_route(
self,
spec: RouteSpec,
*,
name: str | None = None,
) -> RouteSeedResult: ...
```
| Name | Signature | Throws / Errors | Blocking? |
|------|-----------|-----------------|-----------|
| `seed_route` | `(spec: RouteSpec, *, name: str \| None = None) -> RouteSeedResult` | `RouteValidationError`, `RouteTransientError`, `RouteTerminalFailureError` (all under `SatelliteProviderRouteError`) | sync; poll loop bounded by `poll_max_attempts × poll_interval_s` (default 60 × 5 s = 5 min ceiling) |
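The bounded poll loop can be sketched as follows. This is an illustration under stated assumptions, not the shipped client: `poll_until_terminal` and `fetch_status` are hypothetical names, the response is modelled as a plain dict with `status` and `mapsReady` keys, and the built-in `RuntimeError` / `TimeoutError` stand in for `RouteTerminalFailureError` / `RouteTransientError`.

```python
import time
from typing import Any, Callable

TERMINAL_FAILURE_STATUSES = frozenset({"failed", "error", "rejected"})


def poll_until_terminal(
    fetch_status: Callable[[], dict[str, Any]],
    *,
    poll_interval_s: float = 5.0,
    poll_max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until mapsReady is true, a failure status appears, or the budget runs out."""
    last_status = None
    for attempt in range(1, poll_max_attempts + 1):
        body = fetch_status()
        last_status = body.get("status")
        if body.get("mapsReady") is True:
            return body  # terminal success
        if last_status in TERMINAL_FAILURE_STATUSES:
            # Stand-in for RouteTerminalFailureError.
            raise RuntimeError(f"terminal failure: status={last_status!r}")
        if attempt < poll_max_attempts:
            sleep(poll_interval_s)  # wait before the next GET
    # Stand-in for RouteTransientError: budget exhausted without a terminal status.
    raise TimeoutError(f"poll budget exhausted; last status={last_status!r}")
```

Injecting `sleep` keeps the loop testable without real waiting; with the defaults the loop makes at most 60 calls spaced 5 s apart.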
### Data DTOs
```python
from dataclasses import dataclass
from pathlib import Path
import uuid
@dataclass(frozen=True, slots=True)
class RouteSpec: # _types/route.py (AZ-845)
waypoints: tuple[tuple[float, float], ...] # (lat, lon)
suggested_region_size_meters: float # per-waypoint coverage radius
source_tlog: Path # provenance
source_segment: tuple[int, int] # (start_idx, end_idx) into tlog GPS rows
total_distance_meters: float # along-track distance of active segment
@dataclass(frozen=True, slots=True)
class RouteSeedResult: # c11_tile_manager/route_client.py
route_id: uuid.UUID
terminal_status: str # e.g. "completed", "done", "succeeded"
maps_ready: bool # True on terminal success
tile_count: int # present=true entries from inventory verify
elapsed_ms: int # POST → terminal-status wall time
submitted_payload_sha256: str # provenance for the inventory verify step
```
| Field | Type | Required | Description | Constraints |
|-------|------|----------|-------------|-------------|
| `RouteSpec.waypoints` | `tuple[tuple[float, float], ...]` | yes | Ordered list of (lat, lon) waypoints | `2 ≤ len(waypoints) ≤ 500` (AZ-809 validator); each `lat ∈ [-90, 90]`, `lon ∈ [-180, 180]` |
| `RouteSpec.suggested_region_size_meters` | `float` | yes | Per-waypoint coverage radius | `100.0 ≤ value ≤ 10_000.0` (AZ-809 validator) |
| `RouteSpec.source_tlog` | `Path` | yes | Provenance — which tlog produced this spec | filesystem path |
| `RouteSeedResult.route_id` | `uuid.UUID` | yes | Server-assigned route id | non-zero |
| `RouteSeedResult.terminal_status` | `str` | yes | Last status observed from `GET /api/satellite/route/{id}` | one of `{"completed", "failed", "error", "done", "succeeded", "rejected"}` |
| `RouteSeedResult.maps_ready` | `bool` | yes | True iff parent suite reported `mapsReady=true` (terminal success) | Always True on a returned result; poll-budget exhaustion raises `RouteTransientError` (I-4) rather than returning |
| `RouteSeedResult.tile_count` | `int` | yes | Inventory `present=true` count over the route's enumerated coverage | ≥ 0 (lower bound — server may interpolate between waypoints) |
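The `RouteSpec` constraints in the table above can be checked locally before any request is sent. The sketch below is one possible shape for such a check; the function name `validate_route_inputs` and the error-dict keys are hypothetical, and only the waypoint-count, lat/lon, and region-size bounds from the table are covered.

```python
def validate_route_inputs(
    waypoints: tuple[tuple[float, float], ...],
    region_size_m: float,
) -> dict[str, str]:
    """Return a field -> message dict; an empty dict means the input is in bounds."""
    errors: dict[str, str] = {}
    if not 2 <= len(waypoints) <= 500:
        errors["points"] = f"expected 2..500 waypoints, got {len(waypoints)}"
    for i, (lat, lon) in enumerate(waypoints):
        if not -90.0 <= lat <= 90.0:
            errors[f"points[{i}].lat"] = f"latitude {lat} outside [-90, 90]"
        if not -180.0 <= lon <= 180.0:
            errors[f"points[{i}].lon"] = f"longitude {lon} outside [-180, 180]"
    if not 100.0 <= region_size_m <= 10_000.0:
        errors["regionSizeMeters"] = f"expected 100..10000 m, got {region_size_m}"
    return errors
```

A caller would raise its validation error when the returned dict is non-empty and attach the dict for per-field reporting.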
## Invariants
- I-1: **Pre-emptive validation** rejects obviously-bad input as `RouteValidationError` BEFORE the HTTP POST. The client mirrors the AZ-809 `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges; `name`/`description` max lengths). The list MUST stay in sync with `SatelliteProvider.Api/Validators/CreateRouteRequestValidator.cs` (parent suite source).
- I-2: The client POSTs the wire shape exactly per `CreateRouteRequest.cs` + `RoutePoint.cs` + `GeoPoint.cs` (note: `RoutePoint` uses `lat` / `lon` JSON property names for both input and output; the input/output naming asymmetry flagged in AZ-809 AC-10 is a parent-suite concern, not a client adaptation).
- I-3: Poll cadence MUST respect `poll_interval_s` (lower bound between successive `GET /api/satellite/route/{id}` calls) and `poll_max_attempts` (upper bound on attempt count). The client logs every poll tick at INFO with the observed status.
- I-4: Terminal-success is exactly `mapsReady=true`. Terminal-failure is exactly `status ∈ {"failed", "error", "rejected"}`. Any other status is treated as "still processing" and triggers the next poll. If the poll budget is exhausted without terminal status, `RouteTransientError` is raised with the last observed status.
- I-5: 4xx responses with RFC 7807 `ProblemDetails` → `RouteValidationError`; `field_errors` is populated from the `errors` dict so the caller can render per-field rejections.
- I-6: 5xx / network / timeout → `RouteTransientError` with `__cause__` set to the underlying `httpx` exception. The retry semantics are caller-driven — the route client itself does NOT retry the POST, leaving the policy to the fixture / CLI (e.g., `tests/e2e/replay/conftest.py::operator_pre_flight_setup` retries up to 3 times using C11's `_DEFAULT_BACKOFF_SCHEDULE_S = (1, 2, 4, 8)`).
- I-7: The inventory verify step uses `POST /api/satellite/tiles/inventory` (≤ 5000 entries / request) and enumerates the route's tile coverage locally from `(waypoints, suggested_region_size_meters)` using the parent suite's web-Mercator math (`_EARTH_EQUATORIAL_CIRCUMFERENCE_M = 40 075 016.686`). The result is a **lower bound** on actual server coverage — the server may interpolate intermediate corridor tiles that the local enumeration misses; this is documented and acceptable as a sanity-check signal, not a coverage proof.
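The local enumeration described in I-7 can be illustrated with the standard slippy-map (web-Mercator) tile formula. This is a sketch under stated assumptions: the function names are hypothetical, the per-waypoint coverage is approximated as a square of half-side `region_size_m`, and the approach assumes mid-latitude waypoints. The parent suite's actual enumeration may differ.

```python
import math

EARTH_EQUATORIAL_CIRCUMFERENCE_M = 40_075_016.686


def lat_lon_to_tile(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Standard slippy-map tile index (x, y) for a WGS84 point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp to the valid index range for this zoom level.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_route(waypoints, region_size_m: float, zoom: int) -> set[tuple[int, int, int]]:
    """Union of (zoom, x, y) tiles covering a square around each waypoint."""
    meters_per_degree = EARTH_EQUATORIAL_CIRCUMFERENCE_M / 360.0
    tiles: set[tuple[int, int, int]] = set()
    for lat, lon in waypoints:
        d_lat = region_size_m / meters_per_degree
        d_lon = region_size_m / (meters_per_degree * math.cos(math.radians(lat)))
        x_min, y_min = lat_lon_to_tile(lat + d_lat, lon - d_lon, zoom)  # north-west corner
        x_max, y_max = lat_lon_to_tile(lat - d_lat, lon + d_lon, zoom)  # south-east corner
        for x in range(x_min, x_max + 1):
            for y in range(y_min, y_max + 1):
                tiles.add((zoom, x, y))
    return tiles
```

Because only the area around each waypoint is enumerated, tiles between distant waypoints are not counted, which is why the resulting count is a lower bound on server-side coverage.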
## Non-Goals
- Not covered: producing the `RouteSpec` — owned by `replay_input.tlog_route.extract_route_from_tlog` (AZ-836).
- Not covered: orchestration of when the operator runs the seed — owned by C12 (production binding deferred; cycle-3 e2e fixture `operator_pre_flight_setup` is the current driver — AZ-839).
- Not covered: FAISS index construction over the populated cache — owned by C10 `DescriptorBatcher`.
- Not covered: bbox-based seeding — handled by `tile_downloader.download_tiles_for_area` (and by `tests/fixtures/derkachi_c6/seed_region.py` for the e2e fixture).
- Not covered: multi-route batching — one `RouteSpec` per `seed_route` call. Multi-flight aggregate corridors are an operator-workflow concern.
## Versioning Rules
- **Breaking changes** (renamed method, removed required field, changed return type, parent-suite Route API contract break) require a major version bump. Coordinate with the C3 fixture (AZ-839) and any future C12 production binding via Choose A/B/C/D before bumping.
- **Non-breaking additions** (new optional constructor kwarg, new field on `RouteSeedResult`, new error variant the consumer catches via `SatelliteProviderRouteError`) require a minor version bump.
- The pre-emptive validation bounds (I-1) MUST track the parent-suite `CreateRouteRequestValidator.cs` exactly. Drift between client and server validators is a defect, not a version concern — fix the client to match the server.
## Test Cases
| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| route-happy-path | `RouteSpec` for Derkachi tlog (2-waypoint corridor, region_size=500m) against a stubbed `satellite-provider` returning `mapsReady=true` on the 2nd poll | `RouteSeedResult` with `maps_ready=True`, `tile_count > 0`, `terminal_status="completed"`, `elapsed_ms` reflects 2 polls | AZ-838 AC-1, AC-2 |
| validation-empty-points | `RouteSpec(waypoints=(), …)` | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
| validation-too-many-points | `RouteSpec` with 501 waypoints | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
| validation-region-too-large | `RouteSpec(suggested_region_size_meters=10_001.0, …)` | `RouteValidationError` raised BEFORE HTTP POST | I-1, AZ-838 AC-6 |
| 4xx-problem-details | server returns 400 + RFC 7807 `errors` dict | `RouteValidationError` with `field_errors` populated from the response | I-5, AZ-838 AC-3 |
| 5xx-transient | server returns 503 | `RouteTransientError` with `__cause__` set to the underlying `httpx` exception | I-6, AZ-838 AC-4 |
| terminal-failure | server reports `status="failed"` mid-poll | `RouteTerminalFailureError`; `.detail` carries the response JSON | I-4, AZ-838 AC-5 |
| poll-budget-exhausted | server stays in `status="processing"` past 60 attempts | `RouteTransientError` referencing the last observed status | I-3, I-4 |
| inventory-verify-counts-present | `mapsReady=true` then inventory POST returns mixed `present=true/false` entries | `tile_count` equals the count of `present=true` entries | I-7 |
| integration-derkachi | `RouteSpec` from real Derkachi tlog, against the Jetson `satellite-provider` (gated by `RUN_E2E=1` + `SATELLITE_PROVIDER_URL`) | `tile_count > 0`, `maps_ready=True`, completes in ≤ 15 s on the 2-waypoint reference route | AZ-838 AC-10 (Jetson-only, Tier-2) |
## Change Log
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-26 | Initial contract — produced by AZ-838 (Epic AZ-835 C2). Cycle-3 addition; consumed by AZ-839 (`operator_pre_flight_setup` real fixture) and AZ-840 (E2E orchestrator test). | autodev |
@@ -1,18 +1,20 @@
# Contract: tile_downloader
**Component**: c11_tilemanager
**Producer task**: AZ-316_c11_tile_downloader
**Producer task**: AZ-316_c11_tile_downloader (initial), AZ-777 Phase 1 (cycle-3 inventory-contract adaptation)
**Consumer tasks**: AZ-253 (E-C12 Operator Pre-flight Tooling — TBD at C12 decompose time)
**Version**: 1.0.0
**Status**: draft
**Last Updated**: 2026-05-10
**Version**: 1.1.0
**Status**: stable
**Last Updated**: 2026-05-26
## Purpose
The `TileDownloader` Protocol is C11's operator-side download interface. C12 invokes it during F1 (pre-flight cache build) to fetch satellite tiles from the parent suite's `satellite-provider` GET surface, apply RESTRICT-SAT-4 resolution gating at the C11 boundary, and write accepted tiles into C6. Freshness rejections surfacing from C6 (AZ-307) are counted and surfaced in the report.
The `TileDownloader` Protocol is C11's operator-side download interface. C12 invokes it during F1 (pre-flight cache build) to fetch satellite tiles from the parent suite's `satellite-provider` inventory + slippy-map surface, apply RESTRICT-SAT-4 resolution gating at the C11 boundary, and write accepted tiles into C6. Freshness rejections surfacing from C6 (AZ-307) are counted and surfaced in the report.
C11 is operator-side ONLY; ADR-004 forbids the airborne companion image from importing this module.
**Upstream API (cycle 3 — AZ-777 Phase 1)**: against the real parent-suite `satellite-provider` v1.0.0 inventory contract — `POST /api/satellite/tiles/inventory` (bulk lookup by `(zoom, x, y)`, ≤ 5000 entries / request, per `tile-inventory.md` v1.0.0 / AZ-505) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch, issued only for inventory entries with `present=true`). Authentication: `Authorization: Bearer ${SATELLITE_PROVIDER_API_KEY}`; the dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` env knob accepts the self-signed dev cert (production must validate against a CA-issued cert). Because the inventory response carries no `Content-Length` hint, AZ-308's pre-write budget pre-check uses a conservative `_DEFAULT_ESTIMATED_TILE_BYTES = 50 000` per-tile reserve.
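The request cap and the per-tile reserve described above can be sketched as two small helpers. The helper names are hypothetical; only the 5000-entry cap and the 50 000-byte estimate come from the text.

```python
from typing import Iterator, Sequence

MAX_INVENTORY_ENTRIES_PER_REQUEST = 5000
ESTIMATED_TILE_BYTES = 50_000  # conservative per-tile reserve quoted above


def batch_inventory_entries(
    entries: Sequence[tuple[int, int, int]],
    max_per_request: int = MAX_INVENTORY_ENTRIES_PER_REQUEST,
) -> Iterator[Sequence[tuple[int, int, int]]]:
    """Split (zoom, x, y) entries into request-sized batches."""
    for start in range(0, len(entries), max_per_request):
        yield entries[start:start + max_per_request]


def estimated_reserve_bytes(present_count: int) -> int:
    """Bytes to reserve before fetching `present_count` tiles."""
    return present_count * ESTIMATED_TILE_BYTES
```

For example, 12 001 entries split into batches of 5000, 5000, and 2001, and 100 present tiles reserve 5 MB of cache headroom.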
## Shape
### Function / method API
@@ -79,7 +81,7 @@ class TileSummary:
- I-1: `tiles_downloaded + tiles_rejected_resolution + tiles_rejected_freshness == sum of attempted tiles`. The report accounts for every tile the downloader attempted; no silent drops.
- I-2: A re-run of `download_tiles_for_area` for the same `(bbox, zoom_levels, sector_class, flight_id)` after a successful prior run is idempotent: `outcome = idempotent_no_op` and no GETs are issued. Idempotence is enforced by C11's download-progress journal under `cache_root/.c11/journal/`.
- I-3: Every accepted tile passes BOTH the C11 resolution gate (≥ 0.5 m/px per RESTRICT-SAT-4) AND the C6 freshness gate (AZ-307). A tile that fails either is excluded from `tiles_downloaded`.
- I-4: TLS + service-internal API key authenticate the GET; auth failure surfaces as `SatelliteProviderError` and aborts the run with `outcome = failure`. The downloader does NOT fall back to plaintext or unauthenticated requests.
- I-4: JWT Bearer authentication (`SATELLITE_PROVIDER_API_KEY`) over TLS authenticates the inventory POST and the slippy-map GET; auth failure surfaces as `SatelliteProviderError` and aborts the run with `outcome = failure`. The downloader does NOT fall back to plaintext or unauthenticated requests. `SATELLITE_PROVIDER_TLS_INSECURE=1` is a dev-only knob for self-signed certs; production must run with it unset.
- I-5: The downloader writes via the AZ-303 `TileStore`/`TileMetadataStore` Protocols; it does NOT touch C6's filesystem layout directly.
- I-6: A `CacheBudgetExceededError` aborts pre-write with no partial write and `outcome = failure`. The C6 cache budget enforcer (AZ-308) drives the headroom check.
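The accounting rule in I-1 lends itself to a one-line check in tests. The sketch below uses a hypothetical `ReportCounts` stand-in that carries only the three counters named in I-1; the real report type has more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportCounts:
    """Hypothetical stand-in holding the three counters named in I-1."""
    tiles_downloaded: int
    tiles_rejected_resolution: int
    tiles_rejected_freshness: int


def accounts_for_every_tile(counts: ReportCounts, tiles_attempted: int) -> bool:
    """True when accepted plus rejected tiles equals the attempted total."""
    accounted = (
        counts.tiles_downloaded
        + counts.tiles_rejected_resolution
        + counts.tiles_rejected_freshness
    )
    return accounted == tiles_attempted
```

A report where the sum falls short of the attempted total indicates a silent drop, which I-1 forbids.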
@@ -112,4 +114,5 @@ class TileSummary:
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.1.0 | 2026-05-26 | Internal upstream contract adapted to `satellite-provider` v1.0.0 inventory contract (AZ-777 Phase 1): `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` replace the previous `GET /api/satellite/tiles?bbox=…&zoom=…` shape. `download_tiles_for_area` / `DownloadRequest` / `DownloadBatchReport` surface UNCHANGED — non-breaking minor bump. Auth tightened to JWT Bearer over TLS. Status moved draft → stable. | autodev |
| 1.0.0 | 2026-05-10 | Initial contract — produced by AZ-316 (E-C11 decomposition) | autodev |
@@ -254,6 +254,10 @@ The two **invalid** cells (`true` + `eskf` and `false` + `gtsam_isam2`) raise `C
10. **Determinism**: same `(video, tlog, config, time_offset_ms, pace=ASAP)` input → same JSONL output within ≤ 1e-6 float drift in position fields (AC-5).
11. **MAVLink signing key required in replay**: the airborne binary refuses to run without `--mavlink-signing-key PATH` in both modes. In replay the operator supplies a dummy file (well-formed key bytes; no real channel to verify against). This preserves Invariant 5 — the encoders' signing code path runs identically in both modes.
12. **Real C6 cache in replay**: the airborne binary in replay mode reads the same pre-built C6 tile cache the operator built via the normal pre-flight C10/C11/C12 flow. There is no replay-specific cache shape. Verified by the AZ-404 E2E fixture, which runs the operator's pre-flight flow before invoking the replay CLI.
**Sub-invariant 12.a (cycle 3 — AZ-839 / Epic AZ-835 C3)**: the e2e `operator_pre_flight_setup` fixture replaces the cycle-1 `mkdir` placeholder with a real driver that wires C1 (`replay_input.tlog_route.extract_route_from_tlog` — AZ-836) + C2 (`c11_tile_manager.route_client.SatelliteProviderRouteClient.seed_route` — AZ-838) + C11 (`tile_downloader.HttpTileDownloader.download_for_bbox`) + C10 (`DescriptorBatcher`) to populate C6 from a tlog-derived corridor. The fixture yields a `PopulatedC6Cache` dataclass (`cache_root`, `tile_store_path`, `faiss_index_path`, `faiss_sidecar_sha256_path`, `faiss_sidecar_meta_path`, `route_spec`, `tile_count`, `elapsed_seconds`). The cache is mounted into a named docker volume that survives across pytest sessions (cold first invocation populates; subsequent invocations within the same compose session reuse — warm cache). Cold-start budget: ≤ 5 min on Tier-2 Jetson; warm: ≤ 30 s. Sidecar triple-consistency (`.index` + `.sha256` + `.meta.json`) per AZ-306 is verified at every fixture yield; mismatch raises `IndexUnavailableError`. The C12 production binding for the route-driven path is a future-cycle integration; production pre-flight still uses the bbox-driven `download_tiles_for_area` path today.
**Sub-invariant 12.b (cycle 3 — AZ-840 / Epic AZ-835 C4)**: the E2E orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` takes only `(tlog, video, calibration)` and runs the full 7-step pipeline end-to-end on Tier-2 Jetson — no operator hand-curation between steps. The 7 steps are: (1) active flight cut + tlog/video sync via AZ-405; (2) on-fly frame + IMU extraction; (3) auto-create route via AZ-836; (4) POST route to satellite-provider via the C3 fixture's `operator_pre_flight_setup` (delegates to AZ-838); (5) build FAISS index (driven by C3); (6) run gps-denied airborne pipeline against the populated cache + tlog/video/calibration (reuses the airborne composition root path AZ-699 exercises); (7) compute horizontal-error distribution and emit the AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md`. The verdict report is emitted ALWAYS, regardless of PASS / FAIL on the AZ-696 ≥ 80 % within 100 m gate — the success criterion is that the report exists with the honest distribution, not that the verdict is PASS. Gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`.
13. **C4↔C5 pairing matrix is enforced at compose time** (AZ-776 / ADR-012): `compose_root` rejects the two off-diagonal cells of the (`c4_pose.enabled`, `c5_state.strategy`) matrix with a `CompositionError` naming both blocks. `enabled=False` + `gtsam_isam2` and `enabled=True` + `eskf` are forbidden. The two valid cells are `enabled=True` + `gtsam_isam2` (production steady-state per ADR-003 / ADR-009) and `enabled=False` + `eskf` (open-loop ESKF — replay Tier-2 smoke baseline; satellite anchoring deferred to AZ-777). Verified by `tests/unit/runtime_root/test_az776_open_loop_eskf_composition.py` AC-3a and AC-3b.
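The pairing rule in invariant 13 reduces to a membership test over the two valid cells. The sketch below is illustrative only: `check_c4_c5_pairing` is a hypothetical name, and the local `CompositionError` class is a stand-in for the project's own exception.

```python
VALID_PAIRINGS = frozenset({(True, "gtsam_isam2"), (False, "eskf")})


class CompositionError(ValueError):
    """Stand-in for the project's CompositionError."""


def check_c4_c5_pairing(c4_pose_enabled: bool, c5_state_strategy: str) -> None:
    """Raise if the (c4_pose.enabled, c5_state.strategy) cell is off-diagonal."""
    if (c4_pose_enabled, c5_state_strategy) not in VALID_PAIRINGS:
        # Name both config blocks so the operator can see which pair conflicts.
        raise CompositionError(
            f"c4_pose.enabled={c4_pose_enabled} is incompatible with "
            f"c5_state.strategy={c5_state_strategy!r}"
        )
```

Both valid cells pass silently; either off-diagonal cell raises with both block names in the message.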
## Producer / Consumer Split
@@ -562,6 +562,9 @@ The following DTOs flow through the per-frame pipeline in memory and are **NOT**
| `PostLandingUploadRequest` | C12 CLI (`upload-pending` subcommand) | C12 `PostLandingUploadOrchestrator` | Never persisted — composed inline from CLI args |
| `ReLocHint` | C12 CLI (`reloc-confirm` subcommand) | C12 `OperatorReLocService` → `OperatorCommandTransport` (E-C8 concrete) → airborne companion | FDR `c12.reloc.requested` record (full hint un-redacted; `outcome ∈ {sent, failed}`) |
| `CameraCalibration` (loaded once) | calibration loader | C1, C3, C4 | NOT in PostgreSQL — see § 2.6 |
| `RouteSpec` (cycle 3 — `_types/route.py`, AZ-845 canonical home; produced by `replay_input/tlog_route.py` AZ-836) | `replay_input.tlog_route.extract_route_from_tlog(tlog, *, max_waypoints, …)` | C11 `SatelliteProviderRouteClient.seed_route` (AZ-838); cycle-3 e2e fixture `operator_pre_flight_setup` (AZ-839); E2E orchestrator test (AZ-840) | NOT in PostgreSQL — transient pre-flight planning DTO. Fields: `waypoints: tuple[(lat, lon)]` (1..max_waypoints), `suggested_region_size_meters: float`, `source_tlog: Path`, `source_segment: (start_idx, end_idx)`, `total_distance_meters: float`. `frozen=True, slots=True`. |
| `RouteSeedResult` (cycle 3 — `c11_tile_manager/route_client.py`, AZ-838) | C11 `SatelliteProviderRouteClient.seed_route` | cycle-3 e2e fixture `operator_pre_flight_setup` (AZ-839); seed CLI `tests/fixtures/derkachi_c6/seed_route.py` | NOT in PostgreSQL — transient outcome DTO. Fields: `route_id: uuid`, `terminal_status: str`, `maps_ready: bool`, `tile_count: int`, `elapsed_ms: int`, `submitted_payload_sha256: str`. `frozen=True, slots=True`. |
| `PopulatedC6Cache` (cycle 3 — `tests/e2e/replay/conftest.py`, AZ-839) | `operator_pre_flight_setup` fixture | replay e2e tests including `test_az835_e2e_real_flight.py` (AZ-840) and the AZ-699 verdict test | NOT in PostgreSQL — test-fixture-only DTO. Fields: `cache_root: Path`, `tile_store_path: Path`, `faiss_index_path: Path`, `faiss_sidecar_sha256_path: Path`, `faiss_sidecar_meta_path: Path`, `route_spec: RouteSpec`, `tile_count: int`, `elapsed_seconds: float`. Backed by a docker named volume that survives across pytest sessions in the same compose run. |
---
@@ -0,0 +1,62 @@
# Ripple Log — Cycle 3 (End-of-Cycle Documentation Sync)
> Produced as part of existing-code flow Step 13 (Update Docs, document skill Task mode).
> Source: `_docs/_autodev_state.md` (`cycle: 3`).
> Date: 2026-05-26.
## Input set
The 8 task specs in `_docs/02_tasks/done/` whose mtime falls inside cycle 3
(2026-05-22 .. 2026-05-23):
| Task | Title | Surface |
|------|-------|---------|
| AZ-836 | TlogRouteExtractor (Epic AZ-835 C1) | NEW `replay_input/tlog_route.py`, NEW `_types/route.py` (RouteSpec) |
| AZ-838 | SatelliteProviderRouteClient + `seed_route.py` CLI (Epic AZ-835 C2) | NEW `components/c11_tile_manager/route_client.py`, NEW `tests/fixtures/derkachi_c6/seed_route.py` |
| AZ-839 | `operator_pre_flight_setup` real fixture (Epic AZ-835 C3) | REWRITE `tests/e2e/replay/conftest.py::operator_pre_flight_setup`, NEW `PopulatedC6Cache` |
| AZ-840 | E2E orchestrator test (Epic AZ-835 C4) | NEW `tests/e2e/replay/test_az835_e2e_real_flight.py` |
| AZ-777 | Derkachi C6 reference fixture (Phases 1+2; Phases 3–5 superseded by AZ-839/AZ-841/AZ-842) | MODIFY `c11_tile_manager/tile_downloader.py` (inventory + slippy-map paths), `docker-compose.test.jetson.yml`, `.env.test.example`; NEW `tests/fixtures/derkachi_c6/{seed_region.py,bbox.yaml,README.md}`, NEW `tests/e2e/satellite_provider/test_smoke.py` |
| AZ-845 | Relocate RouteSpec → `_types/route.py` (refactor 02 anchor) | NEW `_types/route.py`; MODIFY `replay_input/tlog_route.py`, `replay_input/__init__.py`, `components/c11_tile_manager/route_client.py` import |
| AZ-846 | Refresh `module-layout.md` cycle-3 entries (refactor 02) | MODIFY `_docs/02_document/module-layout.md` ONLY |
| AZ-847 | Widen AZ-270 lint to enforce full rule-9 allow-list (refactor 02) | MODIFY `tests/unit/test_az270_compose_root.py` ONLY |
## Task Step 0.5 — Import-graph ripple
Reverse-dependency scan for the 4 production source changes:
| Changed file | Importers (production source) | Affected components |
|--------------|------------------------------|---------------------|
| `_types/route.py` (NEW) | `replay_input/tlog_route.py`, `replay_input/__init__.py` (re-export), `components/c11_tile_manager/route_client.py`, `components/c11_tile_manager/__init__.py` (re-export) | c11_tile_manager, shared/replay_input, shared/_types |
| `replay_input/tlog_route.py` (NEW) | `replay_input/__init__.py` (re-export) | shared/replay_input |
| `components/c11_tile_manager/route_client.py` (NEW) | `components/c11_tile_manager/__init__.py` (re-export) | c11_tile_manager |
| `components/c11_tile_manager/tile_downloader.py` (MODIFIED — `_INVENTORY_PATH`, `_TILES_PATH`, default per-tile byte estimate) | `runtime_root/c11_factory.py::build_tile_downloader` (constructor unchanged; endpoint constants are module-internal) | c11_tile_manager |
No surprise ripple to other components. All edges land inside `c11_tile_manager` + shared (`_types/`, `replay_input/`), which is consistent with the AZ-507 cross-component allow-list (AZ-845 fixes the previous violation; AZ-846 registers the new files; AZ-847 widens the lint to keep it that way).
## Refresh set for Task Steps 1–4
| Update level | This cycle's refresh set | Status |
|--------------|-------------------------|--------|
| Task Step 1 — Module docs | This project's Plan uses component-level granularity; no `_docs/02_document/modules/` folder. Authoritative module-ownership lives in `_docs/02_document/module-layout.md`. | Already refreshed by AZ-846 — sections `c11_tile_manager Internal`, `shared/replay_input`, `_types/` updated to register `route_client.py`, `tlog_route.py`, `route.py`. No further action. |
| Task Step 2 — Component docs | `components/12_c11_tilemanager/description.md` (3rd interface + endpoint adaptation), `contracts/c11_tilemanager/tile_downloader.md` (endpoint paths), `contracts/c11_tilemanager/route_client.md` (NEW). | Updated this session. |
| Task Step 3 — System-level docs | `architecture.md` § 5 satellite-provider sub-section (inventory contract + route-driven seeding); `data_model.md` register `RouteSpec` / `RouteSeedResult` / `PopulatedC6Cache` DTOs; `system-flows.md` F1 pre-flight cache build (route-driven variant); `contracts/replay/replay_protocol.md` Invariant 12 sub-section for AZ-839 / AZ-840. | Updated this session. |
| Task Step 4 — Problem-level docs | `_docs/00_problem/input_data/flight_derkachi/README.md` (point at `tests/fixtures/derkachi_c6/` + license attribution). No AC / restriction / data_parameters drift this cycle. | Updated this session. |
## Files actually changed this session
- `_docs/02_document/components/12_c11_tilemanager/description.md` — add `SatelliteProviderRouteClient` as a third C11 interface; update `TileDownloader` external API rows to the inventory + slippy-map contract; add a Cycle-3 callout to § 1 Overview.
- `_docs/02_document/contracts/c11_tilemanager/tile_downloader.md` — replace the `GET /api/satellite/tiles?bbox=…&zoom=…` row with the inventory-POST + slippy-map-GET row pair; bump version.
- `_docs/02_document/contracts/c11_tilemanager/route_client.md` — NEW contract for `SatelliteProviderRouteClient.seed_route`.
- `_docs/02_document/contracts/replay/replay_protocol.md` — append AZ-839 / AZ-840 sub-section to Invariant 12 covering the route-driven `operator_pre_flight_setup` fixture + `PopulatedC6Cache`.
- `_docs/02_document/architecture.md` — append a Cycle-3 sub-section to § 5 satellite-provider integration noting the actual inventory-based read path + the route-driven seeding flow (no new ADR).
- `_docs/02_document/data_model.md` — register `RouteSpec`, `RouteSeedResult`, `PopulatedC6Cache` as cross-component DTOs.
- `_docs/02_document/system-flows.md` — extend F1 (pre-flight cache build) with the route-driven variant (tlog → RouteSpec → satellite-provider Route API → populated C6 via inventory + slippy-map).
- `_docs/00_problem/input_data/flight_derkachi/README.md` — append "Derkachi C6 reference seeding" section pointing at `tests/fixtures/derkachi_c6/{seed_region.py,seed_route.py,bbox.yaml,README.md}` + the license-attribution caveat for Google Maps imagery.
- `_docs/02_document/ripple_log_cycle3.md` — this file (NEW).
- `_docs/_autodev_state.md` — sub_step progression through Step 13 task phases.
## Out of scope (carried)
- `tests/` doc updates beyond what Step 12 already produced (`_docs/02_document/tests/blackbox-tests.md`, `traceability-matrix.md` — modified by Step 12 in this cycle). Test-spec sync owns those.
- Cycle-2 doc carry-overs OUTSIDE the three `module-layout.md` sections AZ-846 touched (`replay_api/` Per-Component Mapping entry, `cli/render_map.py`, `cli/replay_api_entrypoint.py`, `helpers/gps_compare.py`, `helpers/accuracy_report.py`). Tracked in cycle-3 retrospective; require a separate follow-up doc task with its own AZ ID.
- Untracked `_docs/02_document/system-overview.md` (created 2026-05-24 outside the cycle-3 task surface). Reviewed; content is accurate at the abstraction level it presents; no edit required.
+33 -7
@@ -46,11 +46,25 @@
The operator builds (or refreshes) the per-mission cache before takeoff. F1 has **three phases** sequenced by C12 OperatorTool:
- **Phase 0 — Flight resolve (C12 `FlightsApiClient`, AZ-489)**: read the operator-authored `Flight` (ordered waypoints + altitudes) either from the parent-suite `flights` REST service (`--flight-id <Guid>`) or from a local JSON export (`--flight-file <path>`). Compute the bounding box as the envelope of waypoint lat/lon plus a configurable buffer (default 1 km). Extract `Flight.waypoints[0].(lat, lon, alt)` as the **takeoff origin**. Both are passed downstream as `BuildRequest` fields.
- **Phase 1 — Tile download (C11 `TileDownloader`)**: fetch tiles from `satellite-provider` for the bbox computed in Phase 0; apply sector-classified freshness rules (AC-NEW-6) and resolution gate (RESTRICT-SAT-4); write tile rows + JPEGs into C6.
- **Phase 1 — Tile download (C11 `TileDownloader` — bbox-driven, production path)**: fetch tiles from `satellite-provider` for the bbox computed in Phase 0 via `POST /api/satellite/tiles/inventory` (bulk lookup of `(z,x,y)` coords per `tile-inventory.md` v1.0.0 / AZ-505) + `GET /tiles/{z}/{x}/{y}` (slippy-map JPEG fetch for inventory entries with `present=true`); apply sector-classified freshness rules (AC-NEW-6) and resolution gate (RESTRICT-SAT-4); write tile rows + JPEGs into C6. Auth: JWT Bearer (`SATELLITE_PROVIDER_API_KEY`) over TLS; dev-only `SATELLITE_PROVIDER_TLS_INSECURE=1` accepts self-signed certs.
- **Phase 2 — Cache artifact build (C10 CacheProvisioner)**: read the populated C6 store; compile/deserialize TRT engines via C7; batch-generate descriptors via the C2 backbone; atomically write the FAISS HNSW index with SHA-256 sidecars; write the Manifest hashing model + calibration + corpus + sector classification **+ takeoff origin** (D-C10-1 idempotence; ADR-010).
This flow is offline and not time-critical. **Only Phase 0 reaches `flights` REST, and only Phase 1 reaches `satellite-provider`.** Both phases run on the operator workstation, which is the only host that holds TLS + service-internal credentials. The companion never reaches either service directly (Principle #9: denied-environment operation).
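The Phase 0 bbox computation (envelope of waypoint lat/lon plus a buffer, default 1 km) can be sketched as follows. This is a minimal illustration, not the C12 implementation: the `Bbox` type, the function name `bbox_with_buffer`, and the spherical-Earth conversion are assumptions made for the example, and it ignores the antimeridian and the poles.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; adequate for a ~1 km buffer


@dataclass(frozen=True)
class Bbox:  # hypothetical type, for illustration only
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float


def bbox_with_buffer(waypoints, buffer_m=1000.0):
    """Envelope of (lat, lon) waypoints, grown by buffer_m on every side."""
    if not waypoints:
        raise ValueError("flight has zero waypoints; cannot derive a bbox")
    lats = [lat for lat, _ in waypoints]
    lons = [lon for _, lon in waypoints]
    # Metres per degree of latitude are (nearly) constant on a sphere.
    dlat = math.degrees(buffer_m / EARTH_RADIUS_M)
    # Degrees of longitude shrink with cos(latitude). Use the latitude
    # furthest from the equator so the buffer is never under-sized.
    widest_lat = max(abs(min(lats)), abs(max(lats)))
    dlon = math.degrees(
        buffer_m / (EARTH_RADIUS_M * math.cos(math.radians(widest_lat)))
    )
    return Bbox(min(lats) - dlat, min(lons) - dlon,
                max(lats) + dlat, max(lons) + dlon)
```

At Derkachi latitudes (about 50° N) a 1 km buffer is roughly 0.009° of latitude but about 0.014° of longitude, which is why the two offsets are computed separately.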
#### Phase 1 variant — route-driven seeding (cycle 3 — Epic AZ-835 / AZ-836 + AZ-838 + AZ-839)
A tlog-driven alternative to bbox download lets the operator (or the post-flight replay harness) pre-commit the cache to the precise corridor the drone actually flew. The path is exercised today by the e2e fixture `tests/e2e/replay/conftest.py::operator_pre_flight_setup` (AZ-839) and the orchestrator test `tests/e2e/replay/test_az835_e2e_real_flight.py` (AZ-840); the C12 production CLI binding for this variant is deferred to a future cycle.
Phase-1 sub-steps in the route-driven variant (replaces the bbox download for that invocation):
1. **Extract corridor from tlog**: `replay_input.tlog_route.extract_route_from_tlog(tlog, *, max_waypoints=10)` (AZ-836). Trims pre-takeoff stationary frames, then coarsens the GPS trace to ≤ `max_waypoints` waypoints via Douglas-Peucker in WGS-84 with great-circle distance. Returns a `RouteSpec(waypoints, suggested_region_size_meters, source_tlog, source_segment, total_distance_meters)` (frozen+slots; canonical home `_types/route.py`, AZ-845).
2. **Submit to satellite-provider**: `c11_tile_manager.route_client.SatelliteProviderRouteClient.seed_route(spec)` (AZ-838). Validates the request against the AZ-809 `CreateRouteRequestValidator` bounds (`points` 2..500; `regionSizeMeters` 100..10 000; `zoomLevel` 0..22; lat/lon ranges) BEFORE the HTTP POST. Then POSTs `/api/satellite/route` with `requestMaps=true&createTilesZip=false` and polls `GET /api/satellite/route/{id}` every 5 s, for at most 60 attempts, until `mapsReady=true` (terminal-success) or a terminal-failure status (`{failed, error, rejected}`). Returns a `RouteSeedResult(route_id, terminal_status, maps_ready, tile_count, elapsed_ms, submitted_payload_sha256)`.
3. **Populate C6 via C11**: enumerate the route's tile coverage locally from `(waypoints, suggested_region_size_meters)`; invoke `tile_downloader.HttpTileDownloader.download_for_bbox` (existing C11 download path) to pull every corridor tile into C6.
4. **Build FAISS index via C10**: `DescriptorBatcher` against the populated C6 using the NetVLAD backbone (per `c2_vpr/config.py:67` default); verify sidecar triple-consistency (`.index` + `.sha256` + `.meta.json`) per AZ-306; mismatch raises `IndexUnavailableError`.
5. **Yield `PopulatedC6Cache`**: `(cache_root, tile_store_path, faiss_index_path, faiss_sidecar_sha256_path, faiss_sidecar_meta_path, route_spec, tile_count, elapsed_seconds)`. Backed by a docker named volume that survives across pytest sessions in the same compose run.
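The client-side bounds check in sub-step 2 can be illustrated with a small sketch. The numeric bounds for `points`, `regionSizeMeters`, and `zoomLevel` are the ones quoted above; the lat/lon ranges (±90 / ±180), the function name `validate_route_request`, and the local `RouteValidationError` stand-in class are assumptions for this example, not the actual `route_client.py` code.

```python
class RouteValidationError(ValueError):
    """Local stand-in for the error type of the same name in the docs."""

    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = list(errors)  # one message per offending field


def validate_route_request(points, region_size_m, zoom_level):
    """Collect every bound violation, then raise once (before any HTTP call)."""
    errors = []
    if not 2 <= len(points) <= 500:
        errors.append(f"points: expected 2..500 entries, got {len(points)}")
    if not 100 <= region_size_m <= 10_000:
        errors.append(
            f"regionSizeMeters: expected 100..10000, got {region_size_m}")
    if not 0 <= zoom_level <= 22:
        errors.append(f"zoomLevel: expected 0..22, got {zoom_level}")
    for i, (lat, lon) in enumerate(points):
        # Assumed ranges: the source only says "lat/lon ranges".
        if not -90.0 <= lat <= 90.0:
            errors.append(f"points[{i}].lat: {lat} outside -90..90")
        if not -180.0 <= lon <= 180.0:
            errors.append(f"points[{i}].lon: {lon} outside -180..180")
    if errors:
        raise RouteValidationError(errors)
```

Collecting all violations before raising matches the "field-by-field errors" behaviour described for this failure mode, so the operator can fix the whole request in one pass.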
Cold-start budget on Tier-2 Jetson: ≤ 5 min (first invocation, full materialisation + descriptor batching); warm: ≤ 30 s (named-volume reuse).
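The polling rule from sub-step 2 (every 5 s, at most 60 attempts, stop on `mapsReady=true` or a terminal-failure status) can be sketched as below. This is an illustration under assumptions: `fetch_status` is a hypothetical callable that returns the parsed JSON of `GET /api/satellite/route/{id}` as a dict with `mapsReady` and `status` keys, and the two error classes are local stand-ins for the types named in the failure table.

```python
import time

TERMINAL_FAILURES = frozenset({"failed", "error", "rejected"})


class RouteTerminalFailureError(RuntimeError):
    """Stand-in: the server reported a terminal-failure status."""

    def __init__(self, detail):
        super().__init__(f"route failed: {detail!r}")
        self.detail = detail  # server response, kept for diagnostics


class RouteTransientError(RuntimeError):
    """Stand-in: the poll budget ran out without a terminal state."""


def poll_until_maps_ready(fetch_status, *, interval_s=5.0, max_attempts=60,
                          sleep=time.sleep):
    """Poll until mapsReady is true; raise on terminal failure or timeout."""
    last = None
    for attempt in range(1, max_attempts + 1):
        last = fetch_status()
        if last.get("mapsReady") is True:
            return last
        if last.get("status") in TERMINAL_FAILURES:
            raise RouteTerminalFailureError(last)
        if attempt < max_attempts:
            sleep(interval_s)  # no sleep after the final attempt
    raise RouteTransientError(f"poll budget exhausted; last status: {last!r}")
```

Injecting `sleep` keeps the loop testable without waiting; with the defaults the worst case is 59 sleeps of 5 s, just under the 5 min ceiling implied by 60 attempts × 5 s.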
### Preconditions
- Operator workstation has network reach to `satellite-provider` (TLS + service-internal API key).
@@ -88,8 +102,10 @@ sequenceDiagram
FlightsClient->>FlightsClient: takeoff_origin = waypoints[0].(lat, lon, alt)
FlightsClient-->>C12OperatorTool: (bbox, takeoff_origin, flight_id)
C12OperatorTool->>C11TileDownloader: download_tiles_for_area(bbox, zooms, sector_class)
C11TileDownloader->>SatelliteProvider: GET /api/satellite/tiles?bbox=&zoom=
SatelliteProvider-->>C11TileDownloader: Tile blobs + metadata (paged)
C11TileDownloader->>SatelliteProvider: POST /api/satellite/tiles/inventory (bulk z,x,y lookup)
SatelliteProvider-->>C11TileDownloader: per-entry present:true|false + metadata
C11TileDownloader->>SatelliteProvider: GET /tiles/{z}/{x}/{y} (one per present:true entry)
SatelliteProvider-->>C11TileDownloader: Tile JPEG body
C11TileDownloader->>C11TileDownloader: filter by AC-NEW-6 freshness + RESTRICT-SAT-4 resolution
C11TileDownloader->>C6TileStore: write tiles to ./tiles/{zoomLevel}/{x}/{y}.jpg + Postgres rows (source='googlemaps')
C11TileDownloader-->>C12OperatorTool: DownloadBatchReport (counts, freshness summary)
@@ -114,7 +130,7 @@ flowchart TD
FlightOk -->|yes| ComputeBbox[Compute bbox as envelope of waypoint lat/lon + buffer; take waypoints[0] as takeoff origin]
ComputeBbox --> Classify[Operator classifies sector active_conflict OR stable_rear]
Classify --> InvokeC11[C12 invokes C11 TileDownloader with computed bbox]
InvokeC11 --> Download[C11 GET /api/satellite/tiles for bbox + zoom]
InvokeC11 --> Download[C11 POST /api/satellite/tiles/inventory then GET /tiles/{z}/{x}/{y}]
Download --> FreshnessFilter{Freshness ok per AC-8.2 + AC-NEW-6?}
FreshnessFilter -->|stale and stable_rear| RejectOrDowngrade[Reject or downgrade tile]
FreshnessFilter -->|stale and active_conflict| RejectOrDowngrade
@@ -149,10 +165,16 @@ flowchart TD
| 0d | C12 `FlightsApiClient` (offline) | filesystem | `flight_file` JSON in the same DTO shape | JSON read |
| 0e | C12 `FlightsApiClient` | C12 | `(bbox, takeoff_origin, flight_id)` | in-process |
| 1 | C12 | C11 `TileDownloader` | `DownloadRequest(bbox, zoom_levels, sector_class)` | in-process call |
| 2 | C11 | `satellite-provider` REST | `GET /api/satellite/tiles?bbox=…&zoom=…` | HTTPS query |
| 3 | `satellite-provider` | C11 | Paged tile blobs + metadata rows | JPEG + JSON metadata |
| 2a | C11 | `satellite-provider` REST | `POST /api/satellite/tiles/inventory` (bulk `(z,x,y)` lookup, ≤ 5000 entries / request; per `tile-inventory.md` v1.0.0) | HTTPS POST JSON body |
| 2b | `satellite-provider` | C11 | Per-entry `present: true \| false` + metadata when present | JSON response (order matches request order) |
| 2c | C11 | `satellite-provider` REST | `GET /tiles/{z}/{x}/{y}` (issued only for `present=true` entries) | HTTPS GET |
| 3 | `satellite-provider` | C11 | Tile JPEG body | binary JPEG |
| 4 | C11 | C6 filesystem (over USB/Eth) | Tile JPEG bodies | `./tiles/{zoomLevel}/{x}/{y}.jpg` |
| 5 | C11 | C6 PostgreSQL | Tile metadata rows (`source='googlemaps'`) | SQL INSERT (mirror of `satellite-provider`'s `tiles` table) |
| 1' (route variant) | tlog file | `replay_input.tlog_route.extract_route_from_tlog` | `RouteSpec(waypoints, suggested_region_size_meters, …)` | in-process call |
| 2' (route variant) | C11 `SatelliteProviderRouteClient` | `satellite-provider` REST | `POST /api/satellite/route` (`requestMaps=true`); then `GET /api/satellite/route/{id}` poll until `mapsReady=true` | HTTPS POST + repeated GET |
| 3' (route variant) | C11 | enumerator | local enumeration of corridor `(z,x,y)` coords from `(waypoints, suggested_region_size_meters)` | in-process |
| 4'+5' (route variant) | C11 | C6 | same as steps 4+5 above (downloads via the same inventory + slippy-map paths) | as above |
| 6 | C12 | C10 `CacheProvisioner` | `BuildRequest(bbox, zoom_levels, sector_class, calibration_path, takeoff_origin, flight_id)` | in-process call (operator-orchestrator side); RPC over USB/Eth to companion runner |
| 7 | C10 → C7 | TRT engine cache | TRT engines | `.engine` files keyed by `(SM, JP, TRT, precision)` (D-C10-7) |
| 8 | C2 backbone (driven by C10) | C6 FAISS index | Descriptor matrix | `.index` (FAISS HNSW), atomicwrites, SHA-256 sidecar |
@@ -168,7 +190,11 @@ flowchart TD
| Flight file malformed (offline path) | Step 0d | JSON parse failure / schema mismatch | Fail with line / field reference; instruct operator to re-export from Mission Planner UI; takeoff blocked |
| Flight has zero waypoints | Step 0e | Post-fetch validation | Fail explicitly; cannot derive bbox or takeoff origin; takeoff blocked |
| Flight bbox exceeds cache budget | Step 0e | Pre-Phase-1 bbox area vs AC-8.3 budget projection | Fail with budget delta; operator must re-plan a smaller route in Mission Planner UI; takeoff blocked |
| `satellite-provider` unreachable | Step 2 | HTTP timeout / 5xx | C11 `TileDownloader` fails with explicit error; operator retries when network is available; takeoff blocked |
| `satellite-provider` unreachable | Step 2a/2c (or 2' route variant) | HTTP timeout / 5xx | C11 `TileDownloader` / `SatelliteProviderRouteClient` fails with explicit error; operator retries when network is available; takeoff blocked |
| `satellite-provider` JWT auth 401/403 | Step 2a/2c (or 2' route variant) | HTTP 401/403 | Fail with explicit error; instruct operator to refresh `SATELLITE_PROVIDER_API_KEY`; takeoff blocked. Never silently fall back to plaintext or unauthenticated |
| Route validation fails (route variant) | Step 1'→2' | Pre-emptive client check against AZ-809 `CreateRouteRequestValidator` bounds | `RouteValidationError` raised BEFORE the HTTP POST; surface field-by-field errors to operator |
| Route materialisation terminal failure (route variant) | Step 2' poll | `GET /api/satellite/route/{id}` returns `status ∈ {failed, error, rejected}` | `RouteTerminalFailureError` with `.detail` carrying the server response JSON; takeoff blocked |
| Route poll budget exhausted (route variant) | Step 2' poll | 60 attempts × 5 s ceiling reached without `mapsReady=true` or terminal failure | `RouteTransientError` referencing the last observed status; operator may re-invoke or extend the poll budget |
| Tile fails freshness | Step 3 (C11) | `tile.capture_timestamp` vs `sector_class` threshold | Reject (active_conflict) or downgrade-no-`satellite_anchored`-label (rear), per AC-NEW-6; counts surface in `DownloadBatchReport` |
| Resolution below 0.5 m/px | Step 3 (C11) | Tile metadata GSD check (RESTRICT-SAT-4) | Reject; report; takeoff blocked |
| Insufficient cache budget | Step 4 (C11) | Filesystem free-space check pre-write | Fail fast with explicit budget delta; no partial write |
+54
@@ -0,0 +1,54 @@
# System Overview Diagram
> Date: 2026-05-24. Plain-English end-to-end view of the GPS-denied onboard pose estimation system, intended for onboarding and presentations. Detailed per-component decomposition lives in `architecture.md`; per-flow sequences in `system-flows.md`.
**One-line goal**: when a drone's GPS is jammed or spoofed, give the flight controller a position fix derived from what the camera sees vs. a pre-loaded satellite map — with an honest accuracy number attached.
```mermaid
flowchart LR
subgraph BEFORE["Before flight — operator workstation"]
UI["Mission Planner<br/>(operator draws route)"] --> PREP["Pre-flight setup<br/>• download map tiles<br/>• build search index<br/>• mark takeoff point"]
SAT[("Satellite map service")] -. tiles .-> PREP
end
subgraph DURING["During flight — drone companion computer"]
CAM[/"Camera<br/>(3 Hz)"/] --> MOTION["Motion tracker<br/>(camera + IMU →<br/>frame-to-frame motion)"]
CAM --> MATCH["Map matcher<br/>(find where this frame is<br/>on the satellite map)"]
FC[/"Flight controller"/] -- "IMU 100-200 Hz" --> MOTION
FC -- "IMU 100-200 Hz" --> FUSE
MOTION --> FUSE
MATCH --> FUSE["State estimator<br/>(fuse motion + map +<br/>IMU into one position)"]
FUSE == "Position + accuracy<br/>+ how we got it" ==> FC
CACHE[("Cached map tiles<br/>read-only in flight")] --> MATCH
end
subgraph AFTER["After landing — operator workstation"]
UPLOAD["Upload new tiles<br/>captured in flight<br/>(only on clean landing)"]
end
PREP ==> DURING
PREP --> CACHE
DURING -. flight log .-> UPLOAD
UPLOAD -. tiles .-> SAT
classDef ext fill:#eef,stroke:#88a;
classDef store fill:#ffe,stroke:#aa6;
class UI,SAT,FC,CAM ext;
class CACHE store;
```
## How to read it in 30 seconds
1. **Before flight** — the operator draws a route in the Mission Planner. The workstation downloads the satellite-map tiles that cover the route, builds a search index over them, and notes the takeoff point.
2. **During flight** — the drone's camera produces a frame three times a second. Two things happen to each frame in parallel:
- The **motion tracker** combines the camera with the flight controller's IMU to estimate how the drone moved since the last frame.
- The **map matcher** compares the frame against the cached satellite tiles to find where on the map the drone currently is.
3. The **state estimator** fuses both signals (plus raw IMU) into a single position estimate, attaches an honest accuracy number, and sends it to the flight controller — which uses it as a drop-in replacement for GPS.
4. **After landing** — any new map tiles the drone captured during the flight get uploaded back to the satellite map service so the next mission has fresher data.
## Why the picture is shaped this way (invariants worth defending)
- **The drone never talks to the satellite map service in flight.** All tile downloads happen on the operator workstation before takeoff; all tile uploads happen on the operator workstation after landing. The airborne code physically cannot reach the network for tiles. (ADR-004 process isolation.)
- **Two parallel branches feed the estimator.** Motion tracking (camera + IMU) and map matching (camera + cached tiles) are independent — neither depends on the other to produce a result. The estimator decides how to weigh them on every frame.
- **The position emitted to the flight controller always carries an honest accuracy number and a provenance label** (`satellite_anchored` / `visual_propagated` / `dead_reckoned`). Under-reporting accuracy is treated as a defect, not a tuning knob.
- **Post-landing upload only fires on a clean shutdown** (the flight log's footer record confirms it). If the system crashed or the drone went down hard, mid-flight tiles stay local until an operator triages them.
+41
@@ -672,3 +672,44 @@ All tests run from the `e2e-runner` container against the SUT through public bou
The Vertical-error section is replaced by `_No emissions carried a comparable altitude — vertical stats skipped._` when none of the JSONL rows carry an `alt_m` field comparable to the ground-truth altitude.
**Skip semantics**: AZ-699 distinguishes between *missing-prerequisite skip* (cleanly skipped with the missing file's path) and *test-cannot-resolve mask* (`@xfail` — explicitly forbidden by AZ-699 AC-1). The AZ-404 1-min test's `@xfail` on AC-3 is unchanged (AZ-699 AC-4 is "add a new test, don't replace") — FT-P-20 is the honest replacement that runs alongside it.
---
### FT-P-21: End-to-end orchestrator pipeline from `(tlog, video, calibration)` only
**Summary**: Validates the full 7-step Epic AZ-835 pipeline — given only `(tlog, video, calibration)`, the system auto-extracts a `RouteSpec` (AZ-836), posts it to the real satellite-provider (AZ-838), builds the C6 FAISS index via the route-driven `operator_pre_flight_setup` fixture (AZ-839, supersedes the AZ-777 Phase 3 bbox-seeded placeholder), runs the airborne replay pipeline, and emits a horizontal-error verdict report. No operator hand-curation between steps. Closes the Epic AZ-835 narrative: "give it a tlog + video + calibration, and the system does everything else."
**Traces to**: AZ-840 AC-1..AC-8 (epic AZ-835 narrative); supplementary product-AC coverage on AC-1.1, AC-1.2, AC-8.3 (pre-loaded cache realised from route, not bbox).
**Category**: End-to-end Integration + Position Accuracy
**Preconditions**:
- Tier-2 Jetson with `@pytest.mark.tier2` + `RUN_REPLAY_E2E=1` env (explicit skip-reason naming the missing env var — no silent skip per AZ-840 AC-6).
- Real `satellite-provider` + `satellite-provider-postgres` services running in `docker-compose.test.jetson.yml` (lineage AZ-688 / AZ-691 / AZ-692; cycle-3 AZ-777 Phase 1 adapted C11 to the real `POST /api/satellite/tiles/inventory` + `GET /tiles/{z}/{x}/{y}` endpoints).
- `tests/e2e/replay/conftest.py::operator_pre_flight_setup` from AZ-839 (route-driven C6 population, supersedes the AZ-777 Phase 3 placeholder).
- `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog` + `flight_derkachi.mp4` (real binary + real video >1 MB).
- `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json` (AZ-702 factory-sheet camera calibration).
- `gps-denied-replay` console-script installed in the e2e-runner image (AZ-604).
- AZ-776 (eskf open-loop composition profile) landed; AZ-848 — Jetson `eskf_out_of_order` regression — currently blocks the heavy-AC path on Jetson, so FT-P-21 produces its first honest verdict once AZ-848 lands.
**Input data**: real `derkachi.tlog`, real `flight_derkachi.mp4`, factory-sheet camera calibration. AZ-836's `extract_route_from_tlog(tlog, max_waypoints=10)` derives the `RouteSpec` from the tlog itself; no operator authoring required.
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Active-flight cut + tlog/video sync via AZ-405's `tlog_video_adapter` | Active segment located; tlog↔video offset resolved (`replay.compose_root.ready` logs `auto_sync_used=true\|false`, AC-8 escape hatch honored). |
| 2 | On-fly frame + IMU extraction via `VideoFileFrameSource` + `TlogReplayFcAdapter` | Frame and IMU streams co-aligned per AZ-697 ground-truth invariants. |
| 3 | `extract_route_from_tlog(tlog, max_waypoints=10)` returns a `RouteSpec` | Route materially follows tlog trajectory; waypoints inside the Derkachi bbox (lat 50.0808..50.0832, lon 36.1070..36.1134) per AZ-836 AC-1. |
| 4 | `operator_pre_flight_setup` posts route via `SatelliteProviderRouteClient.seed_route`; satellite-provider downloads Google Maps tiles into C6 | Route registered; `mapsReady=true` within poll budget; `tile_count > 0`; warm fixture re-invocation within the same compose session ≤ 30 s (AZ-839 AC-2). |
| 5 | C10 `DescriptorBatcher` builds the FAISS HNSW NetVLAD index from the populated C6 | Three sidecar files (`.index` + `.sha256` + `.meta.json`) pass the AZ-306 triple-consistency check; tamper test raises `IndexUnavailableError` (AZ-839 AC-6). |
| 6 | Invoke airborne `gps-denied-replay` against the populated cache + tlog/video/calibration | Subprocess runs the per-frame loop end-to-end; emits JSONL outputs (currently blocked by AZ-848 — `eskf_out_of_order` at frame 3 fails the binary with exit 1 deterministically on the Derkachi 1-min clip). |
| 7 | Compute horizontal-error distribution via `helpers/accuracy_report.py` + `helpers/gps_compare.py`; write verdict report | `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` exists with the honest distribution (PASS or FAIL on the AZ-696 100 m / 80 % gate — verdict emitted **regardless** of PASS/FAIL per AZ-840 AC-2). |
**Expected outcome**: Verdict report exists with the honest horizontal-error distribution. Test PASSes iff the run meets the AZ-696 100 m / 80 % gate (≥ 80 % of ticks within 100 m of ground truth). Mid-pipeline failures (e.g., satellite-provider rejection at step 4, sidecar mismatch at step 5, ESKF divergence at step 6) fail LOUD with a clear error pointing at the failing step — no silent skip past a failure (AZ-840 AC-5).
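The AZ-696 gate (≥ 80 % of ticks within 100 m of ground truth) reduces to a short computation. The sketch below is illustrative only: `haversine_m` and `passes_gate` are hypothetical names, not the API of `helpers/gps_compare.py` or `helpers/accuracy_report.py`, and treating an empty run as a failure is an assumption chosen here to avoid a vacuous pass.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in metres between two WGS-84 lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def passes_gate(pairs, max_error_m=100.0, min_fraction=0.80):
    """pairs: iterable of ((est_lat, est_lon), (truth_lat, truth_lon))."""
    errors = [haversine_m(*est, *truth) for est, truth in pairs]
    if not errors:
        # Zero emissions must not count as a pass (vacuous truth).
        return False
    within = sum(e <= max_error_m for e in errors)
    return within / len(errors) >= min_fraction
```

The explicit empty-input branch matters for this pipeline: if the replay binary exits before emitting any tick, a naive "all errors within bound" check would succeed over an empty list.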
**Max execution time**: 15 min wall-clock on the Derkachi clip (AZ-840 AC-4 soft target for first delivery; hard NFR set after first honest measurement is recorded in the verdict report).
**Relationship to existing tests**:
- FT-P-20 (AZ-699 real-flight runner) is preserved (AZ-840 AC-7): FT-P-21 reuses its verdict-report-writing path through `_report_writer.py` rather than superseding it. Either the two tests live side by side, or AZ-699's runner is wrapped by AZ-840's orchestrator with the verdict-writing path preserved.
- FT-P-15 + FT-P-16 (pre-loaded cache, AC-8.3) remain the canonical bbox-fixture tests; FT-P-21 is the route-driven supplementary test that exercises the same end-state (populated C6) via the production C11→satellite-provider path.
**Implemented as**: `tests/e2e/replay/test_az835_e2e_real_flight.py` (per AZ-840). Unit-tested orchestration helper: `tests/e2e/replay/test_e2e_orchestrator_unit.py` (17 tests covering parameter validation + error propagation between the 7 orchestration steps).
@@ -8,8 +8,8 @@ This matrix is the canonical view of test coverage for the planning context. It
| AC ID | Acceptance Criterion (one-line) | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-1.1 | Frame-center GPS within 50 m for ≥80% of normal-flight photos | FT-P-01 | Covered |
| AC-1.2 | Frame-center GPS within 20 m for ≥50% of normal-flight photos | FT-P-01 | Covered |
| AC-1.1 | Frame-center GPS within 50 m for ≥80% of normal-flight photos | FT-P-01, FT-P-21 (orchestrator-level supplementary) | Covered |
| AC-1.2 | Frame-center GPS within 20 m for ≥50% of normal-flight photos | FT-P-01, FT-P-21 (orchestrator-level supplementary) | Covered |
| AC-1.3 | Cumulative drift between satellite-anchored fixes <100 m visual / <50 m IMU-fused | FT-P-02 | Covered |
| AC-1.4 | Estimate reports 95% covariance + source label | FT-P-03 | Covered |
| AC-2.1a | Frame-to-frame registration ≥95% on normal segments | FT-P-04 | Covered |
@@ -35,7 +35,7 @@ This matrix is the canonical view of test coverage for the planning context. It
| AC-7.2 | AI-camera object coordinates from gimbal/zoom/altitude | — | NOT COVERED — same as AC-7.1 |
| AC-8.1 | Imagery via Suite Sat Service offline cache, ≥0.5 m/px | FT-P-15, FT-P-16, NFT-SEC-02 | Covered |
| AC-8.2 | Tile freshness <6 mo (active-conflict) / <12 mo (rear) | FT-N-05 | Covered |
| AC-8.3 | Imagery pre-loaded onto companion before flight | FT-P-15, FT-P-16 | Covered |
| AC-8.3 | Imagery pre-loaded onto companion before flight | FT-P-15, FT-P-16, FT-P-21 (route-driven via real satellite-provider) | Covered |
| AC-8.4 | Mid-flight tile generation with quality metadata | FT-P-17 | Covered |
| AC-8.5 | No raw nav/AI-cam frame retention except thumbnail log | FT-P-18 | Covered |
| AC-8.6 | Satellite relocalization scale-ratio + scene-change | FT-P-19 (scale FULL; scene-change PARTIAL) | PARTIAL — scene-change subset reduced confidence (only 2/60 stills have paired sat refs; no labeled change-pair dataset). Independent of the AC-NEW-4 / AC-NEW-7 multi-flight gap (those rows were resolved by AC-text relaxation 2026-05-09; AC-8.6 scene-change still requires a labeled change-pair dataset that synthetic perturbations cannot substitute for). Mitigation: deferred to a follow-up cycle when labeled change-pair data becomes available; surfaced in the Step 4 risk register |
@@ -78,6 +78,8 @@ This matrix is the canonical view of test coverage for the planning context. It
> Revised 2026-05-09 (Plan Phase 2a.0 outcomes): three rows moved PARTIAL → Covered (AC-NEW-4, AC-NEW-7, RESTRICT-FAIL-2) following AC-text relaxation per Q3=B. Restriction row count corrected from 19 to 20 (pre-existing arithmetic error).
>
> Revised 2026-05-19 (Greenfield Step 12 cycle-update — autodev): NFT-RES-05 appended to `resilience-tests.md` capturing the composition-root bootstrap contract introduced by AZ-591 / AZ-618 / AZ-687 (replay-mode minimal config, `AirborneBootstrapError` operator-error contract, Tier-2 `replay.compose_root.ready` + `replay.input.frame_emitted` log-boundary gate). NFT-RES-05 is added to AC-NEW-1 and AC-4.1 as bootstrap-precondition coverage; no coverage counts move because the scenario is supplementary, not promoting any PARTIAL row.
>
> Revised 2026-05-24 (Existing-code cycle-3 Step 12 cycle-update — autodev): FT-P-21 appended to `blackbox-tests.md` capturing the Epic AZ-835 orchestrator-level end-to-end pipeline (AZ-836 `RouteSpec` extractor + AZ-838 `SatelliteProviderRouteClient` + AZ-839 route-driven `operator_pre_flight_setup` + AZ-840 orchestrator test). FT-P-21 is supplementary route-driven coverage on AC-1.1, AC-1.2 (orchestrator-level pipeline accuracy) and AC-8.3 (pre-loaded cache realised via the production C11→satellite-provider path rather than the bbox-seeded FT-P-15/FT-P-16 fixture). No coverage counts move — FT-P-21 supplements already-Covered rows. **Currently blocked on Jetson by AZ-848** (`eskf_out_of_order` regression introduced by AZ-776's missing Jetson-verification gate — pre-existing, surfaced cycle-3 Step 11; tracked locally at `_docs/02_tasks/todo/AZ-848_jetson_eskf_out_of_order_regression.md`). Cycle-3 internal changes (C11 contract adaptation per AZ-777 Phase 1; RouteSpec relocation per AZ-845; module-layout refresh AZ-846; AZ-270 lint widening AZ-847; C12 cold-start unit-NFR threshold relax AZ-844) are implementation-only and produce no new black-box scenarios.
| Category | Total Items | Covered | PARTIAL | Not Covered | Coverage % (Covered + PARTIAL counted half) |
|----------|-----------|---------|---------|-------------|--------------------------------------------|
File diff suppressed because one or more lines are too long
@@ -1,5 +1,19 @@
# Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests (AZ-835 C5)
> **Cycle-4 deferral (2026-05-26)**: moved to `backlog/` during cycle-4 Step 9
> scope review. Blocking issues:
> - **Conflict with AZ-895 AC-4**: AZ-895 (cycle-4 cleanup) explicitly states
> `test_derkachi_real_tlog.py` stays `@xfail` with the AZ-848-scoped reason
> in cycle 4. Un-xfailing this test here contradicts AZ-895 and will fail
> the Jetson run because AZ-848 (the underlying clock bug) is in backlog/.
> - **Partial overlap with AZ-894 AC-3**: the other un-xfail target
> (`test_derkachi_1min.py::AC3`) is the same test AZ-894 (cycle-4 CSV
> adapter) covers under its own AC-3 — re-doing the un-xfail in a
> separate ticket duplicates effort.
> - **Replay condition**: revisit when EITHER (a) AZ-848 is fixed and the
> tlog adapter path is restored, OR (b) cycle 4 lands and we rescope this
> ticket to only the CSV-path tests AZ-894 doesn't already cover.
**Task**: AZ-841_unxfail_az777_tier2_tests
**Name**: Un-xfail AZ-777 AC-4 + AC-5 Tier-2 tests once C3 fixture + C4 orchestrator land (AZ-835 C5)
**Description**: Fifth building block of Epic AZ-835. Once C3 (AZ-839, `operator_pre_flight_setup` real fixture) and C4 (AZ-840, e2e orchestrator test) land, remove the `@pytest.mark.xfail` markers from the AZ-777 Tier-2 tests. The verdict — PASS or FAIL — becomes the honest signal. Both tests remain gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2`.
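The `RUN_REPLAY_E2E=1` gate with an explicit skip reason can be sketched as a small helper. This is an assumption-laden illustration: the function name `replay_e2e_skip_reason` and the reason wording are made up for the example and are not taken from the repository's `conftest.py`.

```python
import os


def replay_e2e_skip_reason(environ=None):
    """Return a human-readable skip reason, or None when the gate is open."""
    env = os.environ if environ is None else environ
    if env.get("RUN_REPLAY_E2E") != "1":
        # Name the missing variable so the skip is never silent.
        return ("RUN_REPLAY_E2E=1 is not set; "
                "Tier-2 replay e2e tests are skipped")
    return None
```

In a test module the result could feed `pytest.mark.skipif(reason_text is not None, reason=reason_text or "")`, which keeps the skip (missing prerequisite) distinct from an `xfail` (test cannot pass yet).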
@@ -0,0 +1,135 @@
# [AZ-776 follow-up] derkachi_1min AC-1/2/5/6 fail on Jetson — VioOutput.emitted_at_ns clock-mismatch with FC IMU timebase
> **SCOPE UPDATE (2026-05-26, cycle-4 planning)**
>
> After user decision to switch the primary replay path to user-supplied (video, CSV) pairs (see AZ-894 / AZ-895 / AZ-896 / AZ-897), the tlog-adapter path becomes **audit-only** and this ticket is **no longer bench-blocking**. It remains a real bug and stays open for any future tlog-only flight (flights that ship with a `.tlog` but no companion `data_imu.csv`).
>
> **Priority**: backlog (deprioritised from cycle-4 candidate)
> **Bench-blocking?**: no — AZ-894 supersedes
> **Production-blocking?**: no — production single-clock model never goes through the tlog adapter
> **Complexity**: unchanged (5 SP)
**Task**: AZ-848_jetson_eskf_out_of_order_regression
**Name**: Repair the VioOutput contract — emitted_at_ns must use the frame's timeline timestamp, not process monotonic_ns, so it aligns with the FC IMU timebase that C5 ESKF tracks alongside it
**Description**: On the Jetson e2e harness (`scripts/run-tests-jetson.sh`), four tests in `tests/e2e/replay/test_derkachi_1min.py` (AC-1, AC-5, AC-6 realtime, AC-6 asap) fail with the same deterministic root cause, `EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0')` at frame 3, preceded by `c5.state.eskf_out_of_order` from `imu_window` (ts_ns=187_370_418_000 < last_added_ts_ns=1_187_232_637_925_619, a ratio of about 6 300×, i.e. nearly four orders of magnitude apart). In addition, `test_ac3_within_100m_80pct_of_ticks` reports 1 XPASS, probably a vacuous pass: when the binary exits 1 on frame 3, the ≥80 % within 100 m assertion evaluates over zero emissions.
**Revised root cause (2026-05-26 evidence-based investigation)**: NOT an IMU-vs-IMU clock-source mismatch. The original hypothesis was incorrect: RAW_IMU.time_usec and SCALED_IMU2.time_boot_ms share the same FC-boot-relative timebase in the Derkachi tlog (187-634 s). The actual mismatch is **VioOutput.emitted_at_ns** vs **ImuWindow.ts_end_ns**:
| Source | Code site | Value on Jetson | Timebase |
|---|---|---|---|
| `VioOutput.emitted_at_ns` | `klt_ransac.py:274`: `self._clock.monotonic_ns()` | ~1.187·10¹⁵ ns (≈ 13.7 days, the Jetson uptime when the run started) | Process monotonic |
| `imu_window.ts_end_ns` | `tlog_replay_adapter.py:710`: `time_usec * 1000` | ~1.87·10¹¹ ns (≈ 187 s, Pixhawk boot-relative) | FC-boot-relative |
C5 ESKF tracks `_last_added_ts_ns` across BOTH `add_vio` and `add_fc_imu`. Frame 0: `add_vio` sets `_last_added_ts_ns = 1.187·10¹⁵`. Frame 1: `add_fc_imu` checks `1.87·10¹¹ + ~10⁸ < 1.187·10¹⁵` → out_of_order degraded → next add_vio with corrupted nominal state → mahalanobis² = 109.76 > 100 → fatal divergence at frame 3.
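The ordering failure can be reproduced with a toy model. The sketch below is not the `eskf_baseline.py` code: `OutOfOrderTracker` is a hypothetical class that keeps only the single shared "last added" timestamp, and the 100 ms frame-to-IMU gap in the aligned case is an invented value. The two large constants are the timestamps from the error log.

```python
class OutOfOrderTracker:
    """Toy model: one 'last added' timestamp shared by VIO and IMU inputs."""

    def __init__(self):
        self.last_added_ts_ns = None

    def add(self, source, ts_ns):
        """Return True if accepted, False if flagged as out of order."""
        if self.last_added_ts_ns is not None and ts_ns < self.last_added_ts_ns:
            return False
        self.last_added_ts_ns = ts_ns
        return True


JETSON_MONOTONIC_NS = 1_187_232_637_925_619  # process monotonic (from the log)
FC_IMU_TS_NS = 187_370_418_000               # FC-boot-relative (from the log)

# Mixed timebases: VIO stamped with process monotonic time, IMU with FC time.
mixed = OutOfOrderTracker()
mixed.add("vio", JETSON_MONOTONIC_NS)
imu_ok_mixed = mixed.add("imu_window", FC_IMU_TS_NS)      # rejected

# Single timebase: VIO stamped with the frame's FC-timeline timestamp.
aligned = OutOfOrderTracker()
frame_ts_ns = FC_IMU_TS_NS - 100_000_000  # hypothetical frame 100 ms earlier
aligned.add("vio", frame_ts_ns)
imu_ok_aligned = aligned.add("imu_window", FC_IMU_TS_NS)  # accepted
```

Once the first VIO sample raises the shared watermark to the Jetson's uptime, every IMU window on the FC timeline compares as older, which is the out-of-order condition described above.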
**Why this hides on Tier-1**: the test is `@pytest.mark.tier2_only` (skipped on workstation runs). Unit tests use mocked VIO with synthetic clocks, so the contract clash never surfaces.
**Why this hides on a short-uptime Jetson**: a Jetson booted < ~10 s ago would have monotonic_ns smaller than the FC's boot-relative timestamps; the inequality flips and the bug masquerades as "intermittent passes". The 13.7-day-uptime test box made it deterministic.
**Complexity**: 5 SP (revised up from 3 — the fix touches the C1 contract: `VioOutput.emitted_at_ns` semantics + every C1 strategy that populates it + `_docs/02_document/contracts/c1_vio/` doc + every consumer of `vio.emitted_at_ns` in C5 / C13 / FDR. Plus a determinism test that records monotonic_ns vs frame_ts_ns at frame 0 to lock the invariant in.)
**Dependencies**: AZ-776 (closed; produced the verification gap that hid this regression)
**Related**: AZ-883 (SCALED_IMU2 latent ts_ns=0 bug; uncovered during this investigation; separate ticket)
**Component**: c1_vio (`klt_ransac.py`, `bench/okvis2.py`, `bench/vins_mono.py`, `_facade_spine.py`) + `_types/nav.py` (VioOutput dataclass) + c5_state (`eskf_baseline.py:add_vio` consumes the field) + c13_fdr (consumes `emitted_at_ns` per the docstring's "adaptive-gating decisions")
**Tracker**: AZ-848 (https://denyspopov.atlassian.net/browse/AZ-848)
**Parent Epic**: (none — bug surfaced in cycle 3 Step 11)
Jira AZ-848 is the authoritative spec; this file is the in-workspace mirror.
## Symptom
On Jetson (`scripts/run-tests-jetson.sh`), four tests in `tests/e2e/replay/test_derkachi_1min.py` fail with identical root cause:
- `test_ac1_exits_0_jsonl_count_match`
- `test_ac5_determinism_two_runs_diff`
- `test_ac6_pace_realtime_60s_within_5pct`
- `test_ac6_pace_asap_under_30s`
All four assert `gps-denied-replay` exits 0; the binary actually exits 1 on frame 3 with:
```
ERROR c5_state.eskf_baseline c5.state.eskf_out_of_order
source=imu_window ts_ns=187,370,418,000 last_added_ts_ns=1,187,232,637,925,619
ERROR c5_state.eskf_baseline c5.state.eskf_filter_divergence
source=vio mahalanobis_sq=109.76467866548009 threshold_sq=100.0
ERROR runtime_root.replay_loop replay_loop.state_add_vio_fatal
frame=3 EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0')
```
Mahalanobis distance is identical (109.765) across all four runs — fully deterministic on the Derkachi 1-min clip.
Additionally, `test_ac3_within_100m_80pct_of_ticks` reports XPASS (it was `@xfail` referencing AZ-777). This appears to be a symptom of the same bug: with the binary exiting with code 1 before any GPS-denied emissions land, the `≥ 80 % within 100 m` assertion evaluates against an empty population and passes vacuously. The XPASS is NOT honest evidence that AZ-777 has been completed.
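The vacuous-pass mechanism is easy to reproduce in isolation. The sketch below is a minimal illustration, assuming the ratio is computed over a list of per-tick position errors; `fraction_within` is a hypothetical helper and not a name from `test_derkachi_1min.py`. It refuses an empty population instead of letting a crashed run satisfy the check.

```python
from typing import Sequence


def fraction_within(errors_m: Sequence[float], limit_m: float = 100.0) -> float:
    """Return the share of position errors at or below limit_m.

    Hypothetical helper. It raises on an empty population so that a run
    which emitted nothing cannot satisfy a ">= 80 %" assertion.
    """
    if not errors_m:
        raise ValueError("no emissions to evaluate; the run produced zero ticks")
    hits = sum(1 for e in errors_m if e <= limit_m)
    return hits / len(errors_m)


# One way an empty run can pass: all() over an empty iterable is True.
empty_run_passes = all(e <= 100.0 for e in [])
```

Whether the real test aggregates with `all()`, a ratio, or something else is not stated in this spec; the point is only that an explicit non-empty check turns a vacuous pass into a visible failure.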
## Origin — AZ-776 verification gap
Commit `8de2716 [AZ-776] Open-loop ESKF composition profile via c4_pose.enabled` removed `@pytest.mark.xfail` decorators from AC-1 (line 61), AC-2 (line 138), AC-5 (line 413), AC-6 realtime (line 453), AC-6 asap (line 479) of `test_derkachi_1min.py`. The AZ-776 spec (`_docs/02_tasks/done/AZ-776_eskf_open_loop_composition_profile.md`) claims under AC-7:
> `_run_replay_loop` in `runtime_root/__init__.py` is exercised end-to-end on Jetson by a non-`xfail` integration test (AC-1, AC-2, AC-5, AC-6 realtime, AC-6 asap in `tests/e2e/replay/test_derkachi_1min.py` un-xfail **and pass**).
This was not honored: AZ-776 closed without an honest Jetson run. The closure predates the `meta-rule.mdc` "Real Results, Not Simulated Ones" rule (added 2026-05), which would have caught it.
## Cycle-3 scope (not the cause)
Cycle-3 Step 11 (2026-05-24) surfaced this on the first full Jetson run since cycle 1. Cycle-3's only src change was commit `fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint`, which touched four files: `_types/route.py` (new), `c11_tile_manager/route_client.py`, `replay_input/__init__.py`, and `replay_input/tlog_route.py`. None of `c5_state`, `c8_fc_adapter`, or `runtime_root` was touched. The most recent change to `c5_state/eskf_baseline.py` is AZ-389; to `c8_fc_adapter/tlog_replay_adapter.py` it is AZ-398. Both pre-date cycle 1. The latent contract clash was always there; Jetson uptime plus an un-`xfail`ed test combined to make it deterministic.
## Diagnosis evidence (2026-05-26)
`/tmp/inspect_tlog.py` (ad-hoc pymavlink probe against `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`) — outputs preserved in this session's chat history:
- 4326 RAW_IMU msgs, time_usec ∈ [187,274,914 ; 633,952,656] µs (boot-relative, ~187 s to ~634 s)
- 4330 SCALED_IMU2 msgs, time_boot_ms ∈ [187,274 ; 633,954] ms (same timebase, same range)
- Both IMU types share the FC's boot timebase → original "two-IMU-clock-source mismatch" hypothesis is REFUTED
- `klt_ransac.py:274` populates `VioOutput.emitted_at_ns = self._clock.monotonic_ns()` → 1.187·10¹⁵ ns on the test Jetson (uptime 13.7 days)
- `_types/nav.py:158` documents this contract explicitly: "`emitted_at_ns` is `time.monotonic_ns` at output time."
- `eskf_baseline.py:492` reads `ts_ns = vio.emitted_at_ns` and stores it in `_last_added_ts_ns` — the same field that `add_fc_imu` checks against `imu_window.ts_end_ns` (FC-boot-relative)
- Confirmed: the inequality direction MATCHES the AZ-848 error log (`ts_ns=187,370,418,000 < last_added_ts_ns=1,187,232,637,925,619`)
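The ordering failure implied by the evidence above can be reproduced with a toy guard. This is a minimal sketch, assuming only that one shared "last added" timestamp is compared for both sources; `SharedOrderGuard` is a hypothetical class and models none of the real filter state in `eskf_baseline.py`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SharedOrderGuard:
    """Toy stand-in for the single `_last_added_ts_ns` shared by both sources."""

    last_added_ts_ns: int = 0
    rejected: List[str] = field(default_factory=list)

    def add(self, source: str, ts_ns: int) -> bool:
        """Return True if accepted, False if the sample is out of order."""
        if ts_ns < self.last_added_ts_ns:
            self.rejected.append(source)
            return False
        self.last_added_ts_ns = ts_ns
        return True


guard = SharedOrderGuard()
vio_ok = guard.add("vio", 1_187_232_637_925_619)   # process-monotonic clock
imu_ok = guard.add("imu_window", 187_370_418_000)  # FC-boot-relative clock
```

With both values taken from the error log, the VIO sample is accepted and the IMU window is rejected. Putting both sources on the same timebase makes the same guard accept them in order.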
## Affected files
- `src/gps_denied_onboard/_types/nav.py`: `VioOutput.emitted_at_ns` field + docstring at line 158 (contract change site)
- `src/gps_denied_onboard/components/c1_vio/klt_ransac.py:274,425,463,592–619`: every site that fills `emitted_at_ns`
- `src/gps_denied_onboard/components/c1_vio/bench/okvis2.py`, `vins_mono.py`: other C1 strategies that fill `emitted_at_ns`
- `src/gps_denied_onboard/components/c1_vio/_facade_spine.py`: `frame_ts_ns(frame)` is the existing helper that should be the new source of truth
- `src/gps_denied_onboard/components/c5_state/eskf_baseline.py:492,502,565`: already reads `vio.emitted_at_ns`; no API change needed once the field's semantics are fixed
- `src/gps_denied_onboard/components/c13_fdr/**`: reads `emitted_at_ns` per the docstring's "adaptive-gating decisions"; behavior change must be evaluated
- `_docs/02_document/contracts/c1_vio/`: contract docs need re-version (semantic change to a public field)
- `tests/e2e/replay/test_derkachi_1min.py`: the failing tests; `test_ac3_within_100m_80pct_of_ticks` XPASS handling per AC-6 below
## Repro
```
bash scripts/run-tests-jetson.sh
# pytest report (after ~5 min):
# tests/e2e/replay/test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac5_determinism_two_runs_diff FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_realtime_60s_within_5pct FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_asap_under_30s FAILED
# tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks XPASS
```
## Acceptance Criteria
| # | Criterion |
|---|-----------|
| AC-1 | The `VioOutput.emitted_at_ns` contract docstring (`_types/nav.py:158`) no longer says "monotonic_ns at output time"; the field's semantics are documented as "the frame's timeline timestamp aligned with C8 FC IMU timebase, so C5 ESKF can compare against `imu_window.ts_end_ns` without a clock-source mismatch". A version bump is recorded in `_docs/02_document/contracts/c1_vio/`. |
| AC-2 | Every C1 strategy (`klt_ransac.py`, `bench/okvis2.py`, `bench/vins_mono.py`) populates `emitted_at_ns` from the frame's timestamp (via `frame_ts_ns(frame)` or the strategy's own equivalent), NOT from `monotonic_ns()`. A unit test per strategy asserts the field value equals `frame_ts_ns(frame)`. |
| AC-3 | A determinism test reads two consecutive frames' `VioOutput.emitted_at_ns` values and asserts they are equal to `frame_ts_ns(frame_n)` and `frame_ts_ns(frame_n+1)` respectively — locking the new invariant. |
| AC-4 | Fix lands and `test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` PASSES on Jetson with `RUN_REPLAY_E2E=1` — no `@xfail` re-add. |
| AC-5 | `test_ac5_determinism_two_runs_diff`, `test_ac6_pace_realtime_60s_within_5pct`, `test_ac6_pace_asap_under_30s` also PASS on Jetson. |
| AC-6 | XPASS on `test_ac3_within_100m_80pct_of_ticks` is investigated. If symptom of the same bug, returns to honest XFAIL referencing AZ-777 once binary exits 0 cleanly. If genuine pass, AZ-777 is closed instead. |
| AC-7 | C13 FDR consumers of `emitted_at_ns` are audited — any code path that relied on the field being monotonic-clock-wall-time has its behavior preserved via an explicit `time.monotonic_ns()` recorded under a different name (e.g., `recorded_at_ns`) or its expectation is documented as "frame timeline; not wall clock". |
| AC-8 | `meta-rule.mdc` "Real Results" gate is honored — no ticket may close `Done` until the operator has eyes on a green Jetson run log line. |
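
A per-strategy unit test for AC-2 and AC-3 could take roughly the following shape. This is a sketch under stated assumptions: `Frame`, `StubStrategy`, and the reduced `VioOutput` are hypothetical stand-ins, and the real `frame_ts_ns` helper in `_facade_spine.py` may have a different signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame carrying its timeline timestamp."""

    index: int
    ts_ns: int


@dataclass(frozen=True)
class VioOutput:
    """Reduced stand-in for the real dataclass in `_types/nav.py`."""

    emitted_at_ns: int


def frame_ts_ns(frame: Frame) -> int:
    """Stand-in for the helper of the same name in `_facade_spine.py`."""
    return frame.ts_ns


class StubStrategy:
    """Hypothetical C1 strategy that follows the revised contract."""

    def process(self, frame: Frame) -> VioOutput:
        # The timestamp comes from the frame, never from time.monotonic_ns().
        return VioOutput(emitted_at_ns=frame_ts_ns(frame))


def test_emitted_at_ns_tracks_frame_timeline() -> None:
    strategy = StubStrategy()
    # Two consecutive frames with synthetic timeline timestamps.
    frames = [Frame(0, 187_274_914_000), Frame(1, 187_308_247_333)]
    outputs = [strategy.process(f) for f in frames]
    for frame, out in zip(frames, outputs):
        assert out.emitted_at_ns == frame_ts_ns(frame)
```

Because the assertion compares against `frame_ts_ns(frame)` rather than a wall-clock reading, the test gives the same result regardless of host uptime.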
## Notes
- Tracker context: surfaced `cycle: 3, step: 11` on 2026-05-24; root cause re-diagnosed 2026-05-26 (operator-supervised investigation against the actual Derkachi tlog).
- Local unit suite (`pytest tests/unit/`) passes 2303 / 0 fail / 86 legitimate skips after C12 cold-start threshold relax (`05f1143 [AZ-844]`).
- Cycle 3 Step 11 verdict was PASS for cycle-3-scope; this ticket captures the wider Jetson regression for next cycle.
- Local mirror created retroactively 2026-05-24 (cycle 3 Step 12 entry) — Jira AZ-848 filed 2026-05-24 was the original signal; mirror was missing.
- 2026-05-26: spec materially revised after evidence-based investigation refuted the original "two-IMU-clock-source mismatch" hypothesis. The corrected diagnosis points at the C1 contract (`VioOutput.emitted_at_ns` semantics), not at the C8 adapter. The SCALED_IMU2 latent bug surfaced during this investigation is split out as AZ-883 to keep this ticket's scope tight.
## References
- Jira: https://denyspopov.atlassian.net/browse/AZ-848
- Run-tests report: `_docs/03_implementation/run_tests_step11_report.md` (Cycle 3 closeout, lines 617–635)
- Origin spec: `_docs/02_tasks/done/AZ-776_eskf_open_loop_composition_profile.md`
- Related: AZ-777 (the XFAIL that the `test_ac3_within_100m_80pct_of_ticks` XPASS, covered by AC-6, originally referenced); AZ-883 (SCALED_IMU2 latent bug)
@@ -0,0 +1,74 @@
# `_handle_imu` mis-reads SCALED_IMU2 timestamps — produces ts_ns=0 for every other IMU sample
> **SCOPE UPDATE (2026-05-26, cycle-4 planning)**
>
> Deprioritised behind AZ-894 (CSV-driven replay adapter). This bug only matters once the tlog-adapter path is reactivated for tlog-only flights (flights that ship with a `.tlog` but no companion `data_imu.csv`). Stays open in backlog.
>
> **Priority**: backlog (deprioritised from cycle-4 candidate)
> **Bench-blocking?**: no — AZ-894 supersedes the tlog path for Derkachi
> **Complexity**: unchanged (2 SP)
**Task**: AZ-883_scaled_imu2_ts_ns_zero_default
**Name**: Branch `_handle_imu` on message type so SCALED_IMU2 uses `time_boot_ms × 1_000_000` instead of the missing `time_usec` field
**Description**: `src/gps_denied_onboard/components/c8_fc_adapter/tlog_replay_adapter.py:683` routes BOTH `RAW_IMU` and `SCALED_IMU2` messages through `_handle_imu`, which at line 710 reads `getattr(msg, "time_usec", 0) * 1000` to compute `sensor_ts_ns`. SCALED_IMU2 has no `time_usec` field (its time field is `time_boot_ms`, uint32 milliseconds since FC boot), so the `getattr` default-of-zero path fires for every SCALED_IMU2 message. The resulting IMU sample stream alternates RAW_IMU timestamps with `ts_ns=0` values.
**Evidence (2026-05-26 investigation against `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`)**:
- 4326 RAW_IMU messages with `time_usec` ∈ [187,274,914 ; 633,952,656] µs (boot-relative microseconds, ~187 s to ~634 s)
- 4330 SCALED_IMU2 messages with `time_boot_ms` ∈ [187,274 ; 633,954] ms (same FC-boot timebase, same range)
- Both interleaved in arrival order — every other IMU sample is the affected type
- `_handle_imu`'s simulated output: 4266 non-monotonic transitions out of 8656 (~49 %) — almost every other transition is non-monotonic because SCALED_IMU2 collapses to ts_ns=0
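The transition count in the last bullet can be computed with a few lines. The following is a minimal sketch on a synthetic stream (the timestamps are illustrative, not read from the tlog), and `count_non_monotonic` is a hypothetical helper name.

```python
from typing import Iterable, Optional, Tuple


def count_non_monotonic(ts_ns: Iterable[int]) -> Tuple[int, int]:
    """Return (non_monotonic_transitions, total_transitions).

    A transition counts as non-monotonic when the next timestamp is not
    strictly greater than the previous one.
    """
    bad = 0
    total = 0
    prev: Optional[int] = None
    for ts in ts_ns:
        if prev is not None:
            total += 1
            if ts <= prev:
                bad += 1
        prev = ts
    return bad, total


# Synthetic stream: RAW_IMU timestamps interleaved with SCALED_IMU2 samples
# whose timestamp collapsed to zero.
stream = [187_274_914_000, 0, 187_284_914_000, 0, 187_294_914_000]
bad, total = count_non_monotonic(stream)
```

On this five-sample stream, two of the four transitions go backwards, which matches the "almost every other transition" pattern reported for the real tlog.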
**Why this is currently latent**: C5 ESKF's `add_fc_imu` reads `imu_window.ts_end_ns` (the LAST sample's ts_ns) for monotonicity guarding. If the last sample in the window happens to be RAW_IMU, the guard passes. The per-sample preintegration loop at `eskf_baseline.py:627–647` reads each `sample.ts_ns` individually for delta-t computation, but with ts_ns=0 samples interleaved, the delta-t arithmetic produces negative or near-zero intervals that get silently absorbed by the bias-correction math without raising. It WILL bite once any downstream consumer (FDR replay, latency analyser, deterministic-time gate) does a per-sample monotonicity assertion.
**Why this surfaced now**: the operator-supervised AZ-848 investigation read the Derkachi tlog through pymavlink and observed the interleaving directly. The bug has been present since `_handle_imu` was written (predates cycle 1) and was never caught because no test asserts per-sample IMU monotonicity.
**Complexity**: 2 SP
**Dependencies**: AZ-848 (split off from its investigation; can land before, after, or in parallel — no shared code path beyond `_handle_imu`)
**Component**: c8_fc_adapter (`tlog_replay_adapter.py`)
**Tracker**: AZ-883 (https://denyspopov.atlassian.net/browse/AZ-883) — Jira ticket created 2026-05-26 during cycle 3 release flow; allocated key AZ-883 (next-available, NOT the originally-planned AZ-849)
**Parent Epic**: (none — bug surfaced during AZ-848 investigation)
## Symptom
If you add a per-sample monotonicity assertion to the C5 ESKF or to the C8 tlog adapter pre-emit gate, every Jetson run against the Derkachi tlog reports 4266 non-monotonic transitions: all 4330 SCALED_IMU2 samples carry `ts_ns=0` and are interleaved with proper RAW_IMU values. The assertion fires immediately at message index 1 (the first SCALED_IMU2 after the first RAW_IMU).
## Proposed fix
Modify `_handle_imu` (`src/gps_denied_onboard/components/c8_fc_adapter/tlog_replay_adapter.py:709`) to branch on the message type via the caller's already-computed `msg_type`:
```python
def _handle_imu(self, msg: Any, *, msg_type: str) -> bool:
    # RAW_IMU carries time_usec (microseconds since FC boot); convert to ns.
    if msg_type == "RAW_IMU":
        sensor_ts_ns = int(getattr(msg, "time_usec", 0)) * 1000
    # SCALED_IMU2 has no time_usec; its clock is time_boot_ms (ms since FC boot).
    elif msg_type == "SCALED_IMU2":
        sensor_ts_ns = int(getattr(msg, "time_boot_ms", 0)) * 1_000_000
    else:
        raise FcOpenError(
            f"_handle_imu called with unsupported msg_type={msg_type!r}; "
            f"expected RAW_IMU or SCALED_IMU2"
        )
    ...
```
Update the caller at line 684 to pass `msg_type=msg_type`. Add a unit test that synthesises a SimpleNamespace with `time_boot_ms=187274` (no `time_usec` field) and verifies the emitted `ImuTelemetrySample.ts_ns == 187_274_000_000`.
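The unit test described above might look like the sketch below. To keep it self-contained, the branching is mirrored in a standalone function; `imu_sensor_ts_ns` is a hypothetical name, and `ValueError` stands in for the project's `FcOpenError`.

```python
from types import SimpleNamespace


def imu_sensor_ts_ns(msg: object, msg_type: str) -> int:
    """Standalone mirror of the proposed branching (hypothetical helper)."""
    if msg_type == "RAW_IMU":
        return int(getattr(msg, "time_usec", 0)) * 1_000         # us -> ns
    if msg_type == "SCALED_IMU2":
        return int(getattr(msg, "time_boot_ms", 0)) * 1_000_000  # ms -> ns
    # ValueError is a stand-in for FcOpenError in this sketch.
    raise ValueError(f"unsupported msg_type={msg_type!r}")


def test_scaled_imu2_uses_time_boot_ms() -> None:
    msg = SimpleNamespace(time_boot_ms=187274)  # deliberately no time_usec
    assert not hasattr(msg, "time_usec")
    assert imu_sensor_ts_ns(msg, "SCALED_IMU2") == 187_274_000_000


def test_raw_imu_uses_time_usec() -> None:
    msg = SimpleNamespace(time_usec=187_274_914)  # deliberately no time_boot_ms
    assert not hasattr(msg, "time_boot_ms")
    assert imu_sensor_ts_ns(msg, "RAW_IMU") == 187_274_914_000
```

Each test builds a message that lacks the other type's field, so a regression to the shared `time_usec` read would fail the SCALED_IMU2 case with a zero timestamp.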
Alternative (heavier): pick a single canonical message type at construction time (parameterise the adapter with `imu_source: Literal["RAW_IMU","SCALED_IMU2"]`, auto-detected from the tlog pre-scan) and drop the non-chosen type at the dispatch site. This buys cleaner streams but doubles the test matrix.
The branching fix is simpler and preserves the existing OR-group semantic (`("RAW_IMU", "SCALED_IMU2")` in `_REQUIRED_MESSAGE_GROUPS`).
## Acceptance Criteria
| # | Criterion |
|---|-----------|
| AC-1 | `_handle_imu` reads `time_boot_ms × 1_000_000` for SCALED_IMU2 messages and `time_usec × 1000` for RAW_IMU. A unit test exercises both branches with a synthetic SimpleNamespace lacking the OTHER field. |
| AC-2 | An integration test against the Derkachi tlog (Tier-1; no Jetson hardware needed — only pymavlink + the tlog file) asserts that the IMU stream as seen by the runtime loop is strictly monotonic ts_ns. The test reads at least the first 100 IMU samples and verifies `sample[i+1].ts_ns > sample[i].ts_ns` for all i. |
| AC-3 | No regression in existing RAW_IMU-only adapter tests. |
| AC-4 | The fix is independent of AZ-848 — does not require the VioOutput contract change to land first. |
## References
- Jira: https://denyspopov.atlassian.net/browse/AZ-883
- Origin: AZ-848 investigation, 2026-05-26 cycle 3 Step 16.5 release flow
- Related: AZ-848 (the VIO contract repair; both surfaced from the same investigation but their fixes are independent)
- Tlog evidence: `_docs/00_problem/input_data/flight_derkachi/derkachi.tlog`, 8656 IMU samples (4326 RAW_IMU + 4330 SCALED_IMU2 interleaved)
@@ -3,17 +3,30 @@
**Task**: AZ-842_replay_protocol_and_orchestrator_docs
**Name**: Docs: replay_protocol.md Invariant 12 + AZ-777 Phase 3+ superseded note + orchestrator-test README (AZ-835 C6)
**Description**: Sixth and final building block of Epic AZ-835. Capture the route-driven flow in the authoritative documents so future implementers, operators, and reviewers understand what changed and why.
**Complexity**: 2 SP
**Dependencies**: AZ-841 (C5, un-xfail — SOFT; README describes test outcomes assuming C5 has landed); AZ-777 (being closed/superseded by this Epic — AZ-777 spec is updated during the AZ-777 closure step, verified by AC-6); AZ-835 (parent Epic)
**Complexity**: 3 SP (cycle-4 rescope: was 2 SP)
**Dependencies**: AZ-894 (CSV adapter — HARD; replay_protocol.md sub-section describes the new single-canonical-clock flow); AZ-895 (auto-sync deprecation — HARD; replay_protocol.md sub-section describes the tlog adapter's new audit-only role); AZ-896 (CSV format docs — SOFT; replay_protocol.md cross-links to the format spec); AZ-777 (closed/superseded by this Epic); AZ-835 (parent Epic)
**Component**: `_docs/02_document/contracts/replay/replay_protocol.md` + `_docs/02_document/architecture.md` + `tests/e2e/replay/README*.md`
**Tracker**: AZ-842 (https://denyspopov.atlassian.net/browse/AZ-842)
**Parent Epic**: AZ-835
Jira AZ-842 is the authoritative spec; this file is the in-workspace mirror.
> **Cycle-4 rescope (2026-05-26)**: dropped the AZ-841 (un-xfail) soft
> dependency — AZ-841 was deferred to backlog in cycle-4 Step 9 scope
> review (see `_docs/02_tasks/backlog/AZ-841_unxfail_az777_tier2_tests.md`).
> Expanded scope from "AZ-835 epic docs only" to also cover the cycle-4
> replay-input redesign narrative: AZ-894 (CSV-driven single-canonical-clock
> adapter), AZ-895 (tlog adapter → audit-only after auto-sync deprecation),
> AZ-896 (CSV format spec). The replay_protocol.md edits now describe BOTH
> the route-driven AZ-835 flow AND the cycle-4 CSV-driven replay path,
> which together supersede the legacy tlog+auto-sync surface.
> Complexity bumped 2 → 3 SP to cover the added cycle-4 narrative.
## Modified files
### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension
### 1. `_docs/02_document/contracts/replay/replay_protocol.md` — Invariant 12 extension + Invariant 13 (NEW, cycle-4)
**1a. Invariant 12 — route-driven flow (AZ-835)**
Extend **Invariant 12** with an AZ-835 sub-section describing:
@@ -21,6 +34,16 @@ Extend **Invariant 12** with an AZ-835 sub-section describing:
- Why route-driven supersedes the AZ-777 bbox approach (efficiency: ~100× fewer tiles; honesty: pre-commits to where the operator did fly).
- The C3 fixture's failure-handling contract (validation/terminal → re-raise; transient → retry up to 3 attempts using C11's existing backoff schedule).
**1b. Invariant 13 — single canonical clock (cycle-4, AZ-894 / AZ-895 / AZ-896)**
Add a new **Invariant 13** sub-section describing:
- The single-clock model production uses (single edge device, single clock at receipt) and why two-clock surfaces (e.g. `VioOutput.emitted_at_ns` from Jetson monotonic vs. `ImuWindow.ts_end_ns` from FC-boot) produce ESKF out-of-order regressions like AZ-848.
- The CSV-driven replay path (AZ-894) — `(video, CSV)` operator input, IMU + GPS-ground-truth on a single canonical clock derived from the CSV's `Time` column, no auto-sync.
- The CSV schema (delegate to `_docs/02_document/contracts/replay/csv_replay_format.md` produced by AZ-896 for the field-level spec).
- The tlog-replay adapter's new audit-only role (AZ-895): retained for FDR analysis and one-shot tlog→CSV export, removed from the test/demo critical path.
- Auto-sync deprecation (AZ-895): `--time-offset-ms` / `--skip-auto-sync-validation` CLI flags removed or marked deprecated with one-cycle warning.
### 2. `_docs/02_document/architecture.md` — satellite-provider entry extension
Append a sub-section to the existing satellite-provider entry noting that Epic AZ-835 + its C1-C5 children landed the full e2e real-flight validation path on top of AZ-777 Phase 1's wire + C11 contract adaptation. Mark AZ-777 Phase 3+ as superseded by Epic AZ-835 (pointer-only — the AZ-777 spec itself is updated in C5's wake during the AZ-777 closure step).
@@ -39,11 +62,13 @@ Either extend `tests/e2e/replay/README.md` or create a dedicated `tests/e2e/repl
| # | Criterion |
|---|-----------|
| AC-1 | `replay_protocol.md` Invariant 12 has a new AZ-835 sub-section covering the route-driven flow, the bbox-supersedure rationale, and the failure-handling contract. |
| AC-1b | `replay_protocol.md` has a new Invariant 13 (cycle-4) sub-section covering the single-canonical-clock model, the CSV-driven replay path (AZ-894), the tlog adapter's audit-only role (AZ-895), and auto-sync deprecation. Links to `csv_replay_format.md` (AZ-896). |
| AC-2 | `architecture.md` satellite-provider entry has a sub-section noting Epic AZ-835's contribution and pointing at AZ-777 Phase 3+ as superseded. |
| AC-2b | `architecture.md` replay-input section explains the cycle-4 redesign: CSV adapter primary path, tlog adapter audit-only role, removal of auto-sync. References AZ-894 / AZ-895 / AZ-896 / AZ-897. |
| AC-3 | `tests/e2e/replay/README*.md` exists and a new contributor can run the orchestrator test on Jetson using only the README's instructions (no out-of-band knowledge required). |
| AC-4 | All three docs link to the Epic (AZ-835) and to the relevant child tickets (AZ-836 / AZ-838 / AZ-839 / AZ-840 / AZ-841). |
| AC-4 | All three docs link to the Epic (AZ-835), its children (AZ-836 / AZ-838 / AZ-839 / AZ-840), and the cycle-4 redesign tickets (AZ-894 / AZ-895 / AZ-896 / AZ-897). AZ-841 reference omitted (deferred to backlog). |
| AC-5 | License attribution string ("Imagery © Google") and the dev-only caveat are present in the test README. |
| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children. |
| AC-6 | Cross-references in `_docs/02_tasks/_dependencies_table.md` and `_docs/02_tasks/done/AZ-777*.md` (once moved) point at this Epic / its children and at the cycle-4 redesign tickets. |
## Out of scope
@@ -0,0 +1,53 @@
# Replay: CSV-driven IMU+GPS adapter using single canonical clock
**Task**: AZ-894_csv_driven_replay_adapter
**Name**: Add a CSV-replay adapter that consumes the Derkachi-schema `data_imu.csv` (or any flight that ships with a paired CSV) and exposes IMU + GPS-ground-truth on a single canonical clock derived from the CSV's `Time` column
**Description**: Cycle 3 surfaced AZ-848 (eskf_out_of_order on frame 3) because the current replay pipeline imports two incompatible clocks: `VioOutput.emitted_at_ns` uses Jetson process-monotonic time, while `ImuWindow.ts_end_ns` uses FC-boot-relative time (parsed from MAVLink tlog messages). The single-clock model that production uses (single edge device, single clock at receipt) is not what replay does today. The Derkachi fixture's `data_imu.csv` already contains both IMU (`SCALED_IMU2.*`) and GPS ground truth (`GLOBAL_POSITION_INT.*`) on a single canonical clock (the `Time` column, 0..489.9 s at 10 Hz, aligned 3:1 with the 30 fps video). Using the CSV directly eliminates the clock-mismatch surface entirely for the test/demo path and matches the production single-clock model.
**Complexity**: 3 SP
**Dependencies**: AZ-896 (format docs land in the same cycle but can land in either order)
**Blocks**: AZ-895 (auto-sync deprecation), AZ-897 (replay UI)
**Component**: replay_input (new adapter), c8_fc_adapter (alternate ground-truth source), cli/replay
**Tracker**: AZ-894 (https://denyspopov.atlassian.net/browse/AZ-894)
**Parent Epic**: (none — cycle-4 replay-input redesign)
## Schema
The Derkachi CSV header (19 columns):
```
timestamp(ms), Time,
SCALED_IMU2.xacc, SCALED_IMU2.yacc, SCALED_IMU2.zacc,
SCALED_IMU2.xgyro, SCALED_IMU2.ygyro, SCALED_IMU2.zgyro,
SCALED_IMU2.xmag, SCALED_IMU2.ymag, SCALED_IMU2.zmag,
GLOBAL_POSITION_INT.lat, GLOBAL_POSITION_INT.lon, GLOBAL_POSITION_INT.alt,
GLOBAL_POSITION_INT.relative_alt,
GLOBAL_POSITION_INT.vx, GLOBAL_POSITION_INT.vy, GLOBAL_POSITION_INT.vz,
GLOBAL_POSITION_INT.hdg
```
- `timestamp(ms)`: FC-boot-relative milliseconds (kept for traceability; not used by C5)
- `Time`: flight-relative seconds (canonical clock — what C5 actually uses)
- `SCALED_IMU2.*`: 10 Hz IMU stream (accel mg, gyro mrad/s, mag mGauss per ArduPilot convention)
- `GLOBAL_POSITION_INT.*`: 10 Hz GPS ground truth (lat/lon in 1e-7 deg, alt in mm, vx/vy/vz in cm/s, hdg in cdeg)
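The unit conventions above translate into a small per-row conversion. The following is a minimal sketch, assuming the adapter wants SI units and a nanosecond clock derived from `Time`; `ReplayRow` and `parse_row` are hypothetical names, and the sample row holds placeholder values, not Derkachi flight data.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, Tuple

STANDARD_GRAVITY_MPS2 = 9.80665  # conventional value of one g


@dataclass(frozen=True)
class ReplayRow:
    """Hypothetical SI-unit view of one CSV row (not a project type)."""

    t_ns: int                               # canonical clock from `Time`
    accel_mps2: Tuple[float, float, float]  # from mg
    gyro_radps: Tuple[float, float, float]  # from mrad/s
    lat_deg: float                          # from 1e-7 deg
    lon_deg: float                          # from 1e-7 deg
    alt_m: float                            # from mm
    vel_mps: Tuple[float, float, float]     # from cm/s
    hdg_deg: float                          # from cdeg


def parse_row(row: Dict[str, str]) -> ReplayRow:
    """Convert one csv.DictReader row from wire units to SI units."""

    def col(name: str) -> float:
        return float(row[name])

    return ReplayRow(
        t_ns=round(col("Time") * 1e9),
        accel_mps2=tuple(
            col(f"SCALED_IMU2.{a}acc") / 1000.0 * STANDARD_GRAVITY_MPS2 for a in "xyz"
        ),
        gyro_radps=tuple(col(f"SCALED_IMU2.{a}gyro") / 1000.0 for a in "xyz"),
        lat_deg=col("GLOBAL_POSITION_INT.lat") / 1e7,
        lon_deg=col("GLOBAL_POSITION_INT.lon") / 1e7,
        alt_m=col("GLOBAL_POSITION_INT.alt") / 1000.0,
        vel_mps=tuple(col(f"GLOBAL_POSITION_INT.v{a}") / 100.0 for a in "xyz"),
        hdg_deg=col("GLOBAL_POSITION_INT.hdg") / 100.0,
    )


# Synthetic one-row CSV with a subset of the columns; values are placeholders.
SAMPLE = (
    "Time,SCALED_IMU2.xacc,SCALED_IMU2.yacc,SCALED_IMU2.zacc,"
    "SCALED_IMU2.xgyro,SCALED_IMU2.ygyro,SCALED_IMU2.zgyro,"
    "GLOBAL_POSITION_INT.lat,GLOBAL_POSITION_INT.lon,GLOBAL_POSITION_INT.alt,"
    "GLOBAL_POSITION_INT.vx,GLOBAL_POSITION_INT.vy,GLOBAL_POSITION_INT.vz,"
    "GLOBAL_POSITION_INT.hdg\n"
    "0.1,0,0,-1000,0,0,0,500500000,360500000,150000,250,0,0,9000\n"
)
reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True)
first = parse_row(next(reader))
```

`skipinitialspace=True` is used because the header shown above has a space after each comma; whether the real file does is an assumption to verify against `data_imu.csv`.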
## Acceptance Criteria
- **AC-1**: Adapter parses the Derkachi `data_imu.csv` end-to-end and emits 4,899 IMU samples + 4,899 GPS-ground-truth samples on a single monotonic clock anchored at row 0.
- **AC-2**: Wired into `cli/replay.py`; `gps-denied-replay --video flight_derkachi.mp4 --imu data_imu.csv` runs without invoking `tlog_replay_adapter.py`.
- **AC-3**: `test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` passes on the Jetson e2e harness using the new path. AZ-848 cascade no longer triggers (no two-clock surface in the new path).
- **AC-4**: `VioOutput.emitted_at_ns` is populated from the CSV's `Time` column (or the frame-derived `t = N/fps`), not `time.monotonic_ns()`, when the new adapter is in use.
- **AC-5**: Schema mismatch (missing required column, NaN in `Time`, non-monotonic `Time`) raises a clear `ReplayInputAdapterError` at startup, not deep in the loop.
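The fail-fast behaviour in AC-5 can be sketched as a single validation pass run before the replay loop starts. The exception class below is a local stand-in for the project's `ReplayInputAdapterError`, `validate_replay_rows` is a hypothetical name, and the required-column subset is an assumption (the authoritative list is left to AZ-896).

```python
import math
from typing import Iterable, Mapping, Optional, Sequence

# Assumed minimal subset; the real required/optional split is defined by AZ-896.
REQUIRED_COLUMNS = (
    "Time",
    "SCALED_IMU2.xacc", "SCALED_IMU2.yacc", "SCALED_IMU2.zacc",
    "SCALED_IMU2.xgyro", "SCALED_IMU2.ygyro", "SCALED_IMU2.zgyro",
    "GLOBAL_POSITION_INT.lat", "GLOBAL_POSITION_INT.lon",
    "GLOBAL_POSITION_INT.alt",
)


class ReplayInputAdapterError(Exception):
    """Local stand-in for the project's error type of the same name."""


def validate_replay_rows(
    fieldnames: Sequence[str], rows: Iterable[Mapping[str, str]]
) -> int:
    """Check columns and the `Time` clock; return the number of rows seen."""
    missing = [c for c in REQUIRED_COLUMNS if c not in fieldnames]
    if missing:
        raise ReplayInputAdapterError(f"missing required column(s): {missing}")

    prev: Optional[float] = None
    count = 0
    for i, row in enumerate(rows):
        try:
            t = float(row["Time"])
        except (TypeError, ValueError) as exc:
            raise ReplayInputAdapterError(f"row {i}: unparsable Time") from exc
        if math.isnan(t):
            raise ReplayInputAdapterError(f"row {i}: Time is NaN")
        if prev is not None and t <= prev:
            raise ReplayInputAdapterError(
                f"row {i}: Time {t} is not greater than previous {prev}"
            )
        prev = t
        count += 1
    return count
```

Each error message carries the row index, so the operator sees which line of the CSV to fix rather than a failure deep in the loop.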
## Out of scope
- The structural AZ-848 / AZ-883 fix in the tlog adapter — those stay open as backlog.
- UI for picking the CSV — AZ-897.
- Other CSV schemas (PX4, generic MAVLink dumps) — future enhancement if needed.
## References
- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md`
- Bench-run evidence: `_docs/04_release/release_cycle3_jetson-bench_2026-05-26-1442.md`
- Companion tickets: AZ-895 (deprecate auto-sync), AZ-896 (format docs + example CSV), AZ-897 (replay UI)
- Supersedes (re bench-blocking): AZ-848 (VioOutput contract), AZ-883 (SCALED_IMU2 ts_ns=0)
@@ -0,0 +1,39 @@
# Replay: deprecate auto_sync surface; tlog adapter → audit-only
**Task**: AZ-895_deprecate_auto_sync_surface
**Name**: Remove the tlog+video auto-sync infrastructure and reframe `tlog_replay_adapter.py` as audit-only, now that AZ-894 ships the CSV-driven primary path
**Description**: User decision (2026-05-26): the test/demo replay path will accept a paired (video, CSV) input from the operator instead of auto-syncing a tlog and video. Auto-sync is unnecessary in production (single edge device, single clock by design) and over-engineered for test (the CSV already encodes the alignment).
**Complexity**: 2 SP
**Dependencies**: AZ-894 (must ship first — the CSV adapter is the replacement)
**Component**: replay_input (auto_sync.py, tlog_video_adapter.py), cli/replay, runtime_root/_replay_branch
**Tracker**: AZ-895 (https://denyspopov.atlassian.net/browse/AZ-895)
**Parent Epic**: (none — cycle-4 replay-input redesign)
## Touch list
- `src/gps_denied_onboard/replay_input/auto_sync.py` — delete or convert to a clear no-op that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")`
- `src/gps_denied_onboard/replay_input/tlog_video_adapter.py` — strip auto-sync invocations
- `src/gps_denied_onboard/cli/replay.py` — remove `--time-offset-ms` / `--skip-auto-sync-validation` flags (or mark deprecated with one-cycle warning)
- `src/gps_denied_onboard/runtime_root/_replay_branch.py` — strip auto-sync wiring
- `tests/unit/replay_input/test_az405_auto_sync.py` — pass against the new behaviour or delete with rationale recorded in the batch report
- `tests/e2e/replay/test_derkachi_real_tlog.py` — continues to `@xfail` with the AZ-848-scoped reason; nothing in this ticket fixes the underlying tlog-clock bug
- `tlog_replay_adapter.py` / `tlog_ground_truth.py` — module docstrings updated to call out the new audit-only / one-shot-export roles
## Acceptance Criteria
- **AC-1**: `auto_sync.py` is either deleted or made into a clear no-op that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")`.
- **AC-2**: All references to `--time-offset-ms` / `--skip-auto-sync-validation` flags in the CLI are removed or marked deprecated with a one-cycle deprecation warning.
- **AC-3**: `test_az405_auto_sync` tests either pass against the new behaviour or are deleted with rationale recorded in the batch report.
- **AC-4**: `test_derkachi_real_tlog.py` continues to `@xfail` with the AZ-848-scoped reason; nothing in this ticket fixes the underlying tlog-clock bug.
- **AC-5**: Module docstrings of `tlog_replay_adapter.py` and `tlog_ground_truth.py` are updated to call out their new audit-only / one-shot-export roles.
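AC-1 and AC-2 could be satisfied along the lines below. This is a sketch under stated assumptions: it assumes the CLI is built on `argparse` and that `--time-offset-ms` takes a number, neither of which this spec confirms. `run_auto_sync` and `parse_replay_args` are hypothetical names, and the exception class is a local stand-in.

```python
import argparse
import warnings
from typing import List, Optional


class ReplayInputAdapterError(Exception):
    """Local stand-in for the project's error type of the same name."""


def run_auto_sync(*_args: object, **_kwargs: object) -> None:
    """No-op replacement: any remaining caller fails loudly (AC-1)."""
    raise ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="gps-denied-replay")
    parser.add_argument("--video", required=True)
    parser.add_argument("--imu", required=True)
    # Deprecated flags stay parseable for one cycle but are hidden from --help.
    parser.add_argument("--time-offset-ms", type=float, default=None,
                        help=argparse.SUPPRESS)
    parser.add_argument("--skip-auto-sync-validation", action="store_true",
                        help=argparse.SUPPRESS)
    return parser


def parse_replay_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments and warn once per deprecated flag that was supplied."""
    args = build_parser().parse_args(argv)
    if args.time_offset_ms is not None:
        warnings.warn("--time-offset-ms is deprecated and ignored",
                      DeprecationWarning, stacklevel=2)
    if args.skip_auto_sync_validation:
        warnings.warn("--skip-auto-sync-validation is deprecated and ignored",
                      DeprecationWarning, stacklevel=2)
    return args
```

Warning after parsing, rather than inside a custom `argparse.Action`, keeps the deprecation logic in one place and easy to delete when the one-cycle window ends.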
## Out of scope
- AZ-848 / AZ-883 structural fix — they stay open as backlog (tlog path is still broken, just no longer the primary path).
- New CSV export tooling for arbitrary tlogs — future ticket.
## References
- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md`
- Companion: AZ-894 (CSV adapter — must land first), AZ-896 (docs), AZ-897 (UI)
@@ -0,0 +1,38 @@
# Docs: replay-input format spec + downloadable example CSV
**Task**: AZ-896_replay_format_docs_and_example_csv
**Name**: Author the operator-facing format spec for the (video, CSV) replay input pair, plus a minimal downloadable example CSV
**Description**: Operators using the replay/demo path need to know the exact CSV schema the system accepts, the hard contract (video t=0 ≡ CSV row 0; video must be nadir; UAV must already be airborne at t=0), and have a downloadable example to copy from. Operators today have no entry point that documents this.
**Complexity**: 1 SP
**Dependencies**: AZ-894 (the adapter that consumes the format — the doc describes what AZ-894 accepts)
**Blocks**: AZ-897 (UI links to the docs page and serves the example CSV)
**Component**: docs (_docs/04_release/)
**Tracker**: AZ-896 (https://denyspopov.atlassian.net/browse/AZ-896)
**Parent Epic**: (none — cycle-4 replay-input redesign)
## What
- Author a docs page at `_docs/04_release/replay_input_format.md` (or wherever the operator-facing docs land in cycle 4)
- Schema table: column names, units, types, expected rates, required vs optional
- Constraint statements up top, before the column table:
- Video: nadir camera; UAV already airborne at frame 0
- CSV: row 0 timestamp == video frame 0 timestamp; `Time` column starts at 0.0; rows monotonic and uniformly-spaced
- Ship `_docs/04_release/example_data_imu.csv` — a minimal valid example (e.g., 20 rows = 2 seconds at 10 Hz)
- Cross-link from the AZ-897 replay UI "Download example" button
## Acceptance Criteria
- **AC-1**: Schema page documents all 19 columns of the Derkachi CSV with units and types.
- **AC-2**: The three hard constraints (nadir / airborne / aligned-start) are stated up top, before the column table.
- **AC-3**: The example CSV (≥10 rows) passes through the AZ-894 CSV adapter without errors.
- **AC-4**: The page is reachable from the AZ-897 UI's "Download example" link.
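A minimal example file could be generated rather than hand-written. The sketch below emits the 19-column header from AZ-894 plus 20 rows at 10 Hz; every value is a synthetic placeholder (a level, stationary vehicle), not data from the Derkachi flight, and `build_example_csv` is a hypothetical name.

```python
import csv
import io

HEADER = [
    "timestamp(ms)", "Time",
    "SCALED_IMU2.xacc", "SCALED_IMU2.yacc", "SCALED_IMU2.zacc",
    "SCALED_IMU2.xgyro", "SCALED_IMU2.ygyro", "SCALED_IMU2.zgyro",
    "SCALED_IMU2.xmag", "SCALED_IMU2.ymag", "SCALED_IMU2.zmag",
    "GLOBAL_POSITION_INT.lat", "GLOBAL_POSITION_INT.lon",
    "GLOBAL_POSITION_INT.alt", "GLOBAL_POSITION_INT.relative_alt",
    "GLOBAL_POSITION_INT.vx", "GLOBAL_POSITION_INT.vy",
    "GLOBAL_POSITION_INT.vz", "GLOBAL_POSITION_INT.hdg",
]


def build_example_csv(rows: int = 20, rate_hz: int = 10) -> str:
    """Return CSV text with `rows` uniformly spaced rows starting at Time=0.0."""
    step_ms = 1000 // rate_hz
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    for i in range(rows):
        writer.writerow([
            i * step_ms,               # timestamp(ms), placeholder origin
            f"{i / rate_hz:.1f}",      # Time, canonical clock in seconds
            0, 0, -1000,               # accel in mg (1 g on the z axis)
            0, 0, 0,                   # gyro in mrad/s
            0, 0, 0,                   # mag in mGauss (placeholder)
            500500000, 360500000,      # lat/lon in 1e-7 deg (placeholder)
            300000, 150000,            # alt and relative_alt in mm
            0, 0, 0,                   # vx/vy/vz in cm/s
            0,                         # hdg in cdeg
        ])
    return buf.getvalue()


example_text = build_example_csv()
```

Generating the file keeps the example in step with the header list, so a later schema change only needs to be made in one place.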
## Out of scope
- Multi-schema support (PX4, generic MAVLink dumps).
## References
- Companion: AZ-894 (CSV adapter), AZ-897 (UI), AZ-895 (auto-sync deprecation)
- Source fixture: `_docs/00_problem/input_data/flight_derkachi/data_imu.csv`, README at `_docs/00_problem/input_data/flight_derkachi/README.md`
@@ -0,0 +1,45 @@
# Replay UI: web form for paired video + CSV input + example download
**Task**: AZ-897_replay_ui_web_form
**Name**: Build the first operator-facing UI for the GPS-denied onboard system — a single-page form that uploads a paired (video, CSV) for replay
**Description**: User decision (2026-05-26): the system offers an operator-facing UI for the test/demo replay path. The UI surfaces the hard constraints visually (nadir, airborne, aligned-start) so operators don't fail silently from a misaligned video. This is also the foundation for the deferred operator-tooling work (see `_docs/00_research/00_question_decomposition.md` lines 119, 224).
Tech stack per `.cursor/rules/techstackrule.mdc`: React + Tailwind CSS.
**Complexity**: 5 SP
**Dependencies**: AZ-894 (backend CSV adapter), AZ-896 (format docs + example CSV that the UI serves)
**Component**: frontend (new — first piece of operator-facing UI), backend (new HTTP endpoint that fronts `gps-denied-replay`)
**Tracker**: AZ-897 (https://denyspopov.atlassian.net/browse/AZ-897)
**Parent Epic**: (none — cycle-4 replay-input redesign; will likely become the first piece of a future operator-tooling epic)
## Shape
A single-page web form, served from a target to be decided during implementation (Jetson? operator workstation? containerised dev mode?). Hosts:
- **Video file picker**. Accept `.mp4`, `.mov`. Display constraint hint: "Nadir camera; UAV already airborne at frame 0."
- **CSV file picker**. Accept `.csv`. Display constraint hint: "Row 0 timestamp must equal video frame 0; see format docs."
- **"Download example CSV"** link → AZ-896's `example_data_imu.csv`.
- **"View format docs"** link → AZ-896's `replay_input_format.md`.
- **"Start replay"** button → POSTs (video_path, csv_path) to a backend endpoint that invokes `gps-denied-replay --video X --imu Y`.
- **Result panel**: tail the replay subprocess output, display final verdict (PASS/FAIL + accuracy metrics).
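
As a rough illustration of the "Start replay" step, the backend could validate the pair and build the command line before spawning the process. This is a sketch under assumptions: the function name `build_replay_command` and the error wording are hypothetical, and only the `--video` / `--imu` flags and the accepted extensions come from the list above.

```python
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".mov"}  # accepted by the video picker
CSV_SUFFIXES = {".csv"}            # accepted by the CSV picker


def build_replay_command(video_path, csv_path):
    """Validate the (video, CSV) pair and return the replay argv list.

    Hypothetical helper; raises ValueError with an operator-actionable
    message when either file has an unexpected extension.
    """
    video, imu = Path(video_path), Path(csv_path)
    if video.suffix.lower() not in VIDEO_SUFFIXES:
        raise ValueError(f"video must be .mp4 or .mov, got '{video.name}'")
    if imu.suffix.lower() not in CSV_SUFFIXES:
        raise ValueError(f"telemetry must be .csv, got '{imu.name}'")
    # An argv list (not a shell string) keeps operator-supplied file names
    # from being interpreted by a shell.
    return ["gps-denied-replay", "--video", str(video), "--imu", str(imu)]
```

The returned list could be handed to `subprocess.Popen` with `stdout=subprocess.PIPE` so the result panel can tail the output. How the endpoint actually streams output is left to implementation.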
## Acceptance Criteria
- **AC-1**: Form renders with both pickers, both constraint hints, download/docs links, and the start button.
- **AC-2**: The start button correctly invokes the replay pipeline against the selected files; success path returns a verdict; failure path returns the error reason from the backend.
- **AC-3**: Form rejects mismatched filename pairs, always with explicit operator-actionable error messages; no silent failures.
- **AC-4**: Example-CSV download serves the file from AZ-896 with the correct content-type.
- **AC-5**: Tests cover empty submissions, mismatched file types, backend failures, and the happy path. React Testing Library + jest for component tests; an e2e smoke test covers the full flow.
## Out of scope
- Multi-flight management / history / library.
- Authentication / user accounts.
- Sector classification, pre-flight cache provisioning, mission planning (those are separate deferred items from C10 / `00_question_decomposition.md`).
- The deploy-target decision (Jetson vs operator workstation) — to be resolved during implementation; default proposal: containerised dev mode for now.
## References
- Companion: AZ-894 (CSV adapter), AZ-896 (docs + example CSV)
- Deferred precedent: `_docs/00_research/00_question_decomposition.md` lines 119 ("Mission-planning UX is out of scope"), 224 ("Operator-side CLI/desktop tool design deferred to Plan-phase")
- Tech stack: React + Tailwind CSS per `.cursor/rules/techstackrule.mdc`
@@ -0,0 +1,78 @@
# Land `architecture_compliance_baseline.md` (cycle-3 retro #3, third try)
**Task**: AZ-899_architecture_compliance_baseline
**Name**: Create `_docs/02_document/architecture_compliance_baseline.md` so cumulative reviews can emit `## Baseline Delta` rows
**Description**: Cycle-1 retro Top-3 Improvement Action #3, repeated in cycle-3 retro Top-3 #3. The file was never created in cycles 2 or 3, leaving cumulative reviews unable to quantify carried-over / resolved / newly-introduced architecture violations per cycle. Seed the baseline from `_docs/06_metrics/structure_2026-05-20.md` with `0` violations, freeze the snapshot semantics, and wire the existing-code flow's Step 2 to reference it.
**Complexity**: 1 SP
**Dependencies**: None (operates on existing artifact `_docs/06_metrics/structure_2026-05-20.md`)
**Component**: documentation only — no source code change
**Tracker**: AZ-899 (https://denyspopov.atlassian.net/browse/AZ-899)
**Epic**: (none — cycle-4 process housekeeping)
## Problem
Cycle-3 retro § Structural Metrics:
> `_docs/02_document/architecture_compliance_baseline.md` **still does not exist** — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3.
Without a baseline, cumulative reviews log "`_docs/02_document/architecture_compliance_baseline.md` does NOT exist → no Baseline Delta section emitted". Structural regressions (new cycles in the import graph, newly-introduced violations) therefore cannot be quantified across cycles — only verified pairwise per batch.
## Outcome
- Cumulative-review reports starting from cycle-4 batch 1 emit a `## Baseline Delta` section that quantifies new vs. resolved vs. carried-over architecture violations.
- Cycle-end retros can compare structural deltas across cycles using a single canonical baseline document instead of re-deriving from the previous cycle's snapshot.
## Scope
### Included
- Create `_docs/02_document/architecture_compliance_baseline.md` seeded with **0** violations.
- Reference `_docs/06_metrics/structure_2026-05-20.md` as the source-of-truth snapshot from which the baseline was derived.
- Document the file's update protocol: a new violation found in a cumulative review is appended (with batch ID, severity, finding ID); a resolution is recorded by marking the row `RESOLVED in batch <ID>`.
- Document the snapshot-refresh trigger: any cycle that materially changes structure (component count, cross-component edges, new contracts) re-snapshots via `python -m gps_denied_onboard.tools.structure_snapshot` (or equivalent existing script — verify before reference).
### Excluded
- Refactoring source code to fix violations — none currently exist.
- Adding new component scaffolding — out of scope.
- Modifying `code-review` or `retrospective` skills — they already reference the file; the only change needed is making the referenced file exist.
## Acceptance Criteria
**AC-1: Baseline file exists with 0 violations**
Given a fresh repo checkout
When `ls _docs/02_document/architecture_compliance_baseline.md` runs
Then the file exists and its `## Violations` section is explicitly empty (or marked "None at baseline")
**AC-2: Baseline references the structural snapshot**
Given the baseline file
When read
Then it includes a `## Source` section pointing at `_docs/06_metrics/structure_2026-05-20.md` and lists the structural facts (15 components, 0 import cycles, 5 contract files) that establish the "0 violations" claim
**AC-3: Update protocol documented**
Given the baseline file
When read
Then it includes an `## Update Protocol` section describing append-on-violation, mark-resolved-on-fix, and the snapshot-refresh trigger
**AC-4: Cumulative-review hook verified**
Given the baseline file in place
When the cycle-4 first cumulative-review report is generated
Then the report emits a `## Baseline Delta` section (even if empty: "0 new, 0 resolved, 0 carried-over")
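
One way to picture the `## Baseline Delta` computation is as set arithmetic over finding IDs. This is a sketch under assumptions: the function names are hypothetical, and treating "resolved" as "open in the baseline but absent from the current review" is an interpretation of the update protocol, not something the review skill is confirmed to do.

```python
def baseline_delta(baseline_open, review_findings):
    """Classify finding IDs as new, resolved, or carried-over.

    baseline_open: IDs of violations still open in the baseline file.
    review_findings: IDs of violations found by the current cumulative review.
    """
    baseline_open = set(baseline_open)
    review_findings = set(review_findings)
    return {
        "new": sorted(review_findings - baseline_open),
        "resolved": sorted(baseline_open - review_findings),
        "carried_over": sorted(baseline_open & review_findings),
    }


def format_delta(delta):
    """Render the one-line summary used in the report section."""
    return (f"{len(delta['new'])} new, {len(delta['resolved'])} resolved, "
            f"{len(delta['carried_over'])} carried-over")


# With a baseline of 0 violations and a clean review, the section is empty.
print(format_delta(baseline_delta([], [])))
```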
## Constraints
- File format: markdown, matching the style of `_docs/06_metrics/structure_2026-05-20.md`.
- No source code change permitted under this ticket — strictly documentation.
## Risks & Mitigation
**Risk 1: Future violations slip past the baseline**
- *Risk*: A cumulative review finds a violation but the reviewer forgets to append it to the baseline.
- *Mitigation*: The `code-review` skill (referenced in cycle-3 retro Suggested Updates) should be updated separately to auto-append; this ticket only delivers the baseline file. The follow-up belongs in cycle 5 if needed.
## References
- Cycle-3 retro: `_docs/06_metrics/retro_2026-05-26.md` § Top 3 Improvement Actions #3
- Cycle-1 retro: `_docs/06_metrics/retro_2026-05-20.md` § Top 3 Improvement Actions #3 (original)
- Source snapshot: `_docs/06_metrics/structure_2026-05-20.md`
- Existing-code flow Step 2: `.cursor/skills/autodev/flows/existing-code.md` § "Step 2 — Architecture Baseline Scan"
@@ -0,0 +1,82 @@
# Autodev: gate Step-9 entry on previous-cycle retro existence
**Task**: AZ-900_autodev_retro_existence_gate
**Name**: Codify the LESSONS rule — autodev must block cycle-N+1 Step 9 entry if `retro_<YYYY-MM-DD>.md` for cycle N is absent
**Description**: Cycle-3 retro Top-3 Improvement Action #2 and 2026-05-26 LESSONS entry both call for codifying a Re-Entry After Completion gate that verifies the previous cycle's retro file exists before incrementing the cycle counter. Cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3 and all cycle-1 retro Top-3 actions sat invisible. This ticket codifies the gate in `.cursor/skills/autodev/flows/existing-code.md` § Re-Entry After Completion.
**Complexity**: 1 SP
**Dependencies**: None
**Component**: `.cursor/skills/autodev/flows/existing-code.md` (workflow doc only)
**Tracker**: AZ-900 (https://denyspopov.atlassian.net/browse/AZ-900)
**Epic**: (none — cycle-4 process housekeeping)
## Problem
LESSONS 2026-05-26 [process] entry:
> Cycle-2 retro was never filed. The autodev orchestrator silently auto-chained from cycle-2 Step 17 (if it ran at all) straight into cycle-3 Step 9 without producing `retro_<cycle2-date>.md`. As a result, cycle-1 retro's Top-3 Improvement Actions sat invisible across cycle 2 and were re-discovered, all three still undelivered, only at cycle-3 close.
Cycle-3 retro Top-3 #2 echoes the same recommendation.
The fix is a one-line check in the flow file that BLOCKS Step 9 entry for cycle N+1 unless `_docs/06_metrics/retro_<YYYY-MM-DD>.md` for cycle N exists.
## Outcome
- Future cycle-N → cycle-(N+1) transitions are gated: the autodev orchestrator refuses to enter Step 9 of cycle N+1 if no retro file exists for cycle N.
- Missing retros are surfaced at the session boundary, not 6 weeks later at the next cycle's close.
## Scope
### Included
- Edit `.cursor/skills/autodev/flows/existing-code.md` § "Re-Entry After Completion" to add a gate: before incrementing `cycle`, glob `_docs/06_metrics/retro_*.md` and verify a file dated after the cycle-N start exists.
- Define the BLOCK behavior: if absent, present a Choose A/B/C block:
- **A)** Author the missing retro now (invoke `.cursor/skills/retrospective/SKILL.md` in cycle-end mode)
- **B)** Stub a backfilled retro and proceed (with a leftover entry filed for proper backfill)
- **C)** Abort and ask the user
- Add a corresponding bullet to `.cursor/skills/autodev/state.md` § "Session Boundaries" pointing at the new gate.
### Excluded
- Retroactively writing cycle-2 retro (separate ticket if user wants it; cycle-3 retro already covers cycle-2 trend deltas where data is on disk).
- Adding similar gates to greenfield or meta-repo flows (only `existing-code` has the cycle counter).
- Per-step retro check inside cycles (this gate fires only at the cycle boundary).
## Acceptance Criteria
**AC-1: Flow file gate exists**
Given `.cursor/skills/autodev/flows/existing-code.md`
When the "Re-Entry After Completion" section is read
Then it contains a step `Verify previous cycle's retro exists` BEFORE the cycle increment
**AC-2: Choose A/B/C block specified**
Given the gate triggers (no retro file found)
When the documented behavior is consulted
Then it specifies the three options (A: author now, B: stub + leftover, C: abort) with the standard Choose format
**AC-3: state.md cross-reference**
Given `.cursor/skills/autodev/state.md`
When the "Session Boundaries" section is read
Then it mentions the new retro-existence gate or links to the flow file's gate
**AC-4: Discovery rule**
Given the gate
When the file pattern is documented
Then the glob is unambiguous: `_docs/06_metrics/retro_*.md` with a date matching cycle-N's date range; the date-range derivation is explicit (cycle N start = last `implementation_report_*_cycle{N-1}.md` date; cycle N end = today)
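
The gate itself lives in a workflow document, but its logic can be sketched in code to make the discovery rule concrete. Everything below is illustrative: the function names are hypothetical, the date range is treated as inclusive at both ends (an assumption), and `next_cycle` is taken to mean the cycle about to be entered.

```python
import re
from datetime import date
from pathlib import Path

RETRO_NAME = re.compile(r"retro_(\d{4})-(\d{2})-(\d{2})\.md$")


def retro_exists_for_cycle(metrics_dir, cycle_start, cycle_end):
    """True if some retro_<YYYY-MM-DD>.md is dated within [cycle_start, cycle_end]."""
    for path in Path(metrics_dir).glob("retro_*.md"):
        match = RETRO_NAME.match(path.name)
        if not match:
            continue  # e.g. retro_notes.md: matches the glob but carries no date
        retro_date = date(*map(int, match.groups()))
        if cycle_start <= retro_date <= cycle_end:
            return True
    return False


def may_enter_step_9(next_cycle, metrics_dir, prev_cycle_start, today):
    """Gate check before entering Step 9 of `next_cycle`.

    The first cycle has no previous cycle, so it is never blocked.
    """
    if next_cycle <= 1:
        return True
    return retro_exists_for_cycle(metrics_dir, prev_cycle_start, today)
```

A `False` result corresponds to the BLOCK case, where the Choose A/B/C options would be presented.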
## Constraints
- Pure workflow doc change — no source code, no tests.
- Must not break the existing greenfield-Done → existing-code Phase-B transition (greenfield → existing-code is a one-shot flow change with no retro requirement on first entry, since there is no previous cycle).
## Risks & Mitigation
**Risk 1: False positive on greenfield→existing-code transition**
- *Risk*: First cycle of an existing-code flow shouldn't require a previous-cycle retro.
- *Mitigation*: Gate condition includes `state.cycle > 1` — cycle 1 has no previous cycle.
## References
- LESSONS 2026-05-26 [process] entry: `_docs/LESSONS.md` § 2026-05-26 [process]
- Cycle-3 retro Top-3 #2: `_docs/06_metrics/retro_2026-05-26.md`
- Flow file: `.cursor/skills/autodev/flows/existing-code.md` § "Re-Entry After Completion"
- State management: `.cursor/skills/autodev/state.md` § "Session Boundaries"
@@ -0,0 +1,85 @@
# Fix `EVIDENCE_OUT` default path — workspace-relative, not container-only
**Task**: AZ-901_evidence_out_default_path_fix
**Name**: Change `e2e/runner/conftest.py:56` `EVIDENCE_OUT` default from `/e2e-results/evidence` to a workspace-relative path so Tier-1 host runs don't crash
**Description**: Closes leftover `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`. Cycle-3 Step 15 (Performance Test) surfaced this: the default path `/e2e-results/evidence` is the container mount inside the Tier-1 Docker harness; a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly hits `OSError: [Errno 30] Read-only file system: '/e2e-results'` (macOS) or `PermissionError` (Linux). Workaround today: `EVIDENCE_OUT="$(pwd)/e2e-results/..." pytest ...`. Fix: resolve a workspace-relative default when neither `--evidence-out` nor `EVIDENCE_OUT` is set.
**Complexity**: 1 SP
**Dependencies**: None
**Component**: `e2e/runner/conftest.py`
**Tracker**: AZ-901 (https://denyspopov.atlassian.net/browse/AZ-901)
**Epic**: (none — cycle-4 process housekeeping)
## Problem
`e2e/runner/conftest.py:56`:
```python
default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")
```
The default `/e2e-results/evidence` is a container-mount path. Tier-1 Docker harness and the Tier-2 Jetson runner pass `--evidence-out` explicitly, so they're fine. Host-direct `python -m pytest e2e/tests/performance/` invocations (developer machine, no Docker) hit `nfr_recorder.pytest_sessionfinish` which tries `mkdir(evidence_dir)` and crashes.
## Outcome
- Developer can run `python -m pytest e2e/tests/performance/` on a Mac/Linux workstation without setting `EVIDENCE_OUT` and without crashing.
- Docker / Jetson runners continue to work unchanged (they pass `--evidence-out` explicitly).
## Scope
### Included
- Modify `e2e/runner/conftest.py:56` to resolve a workspace-relative default when `EVIDENCE_OUT` is unset.
- Proposed: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))`
- Verify Docker compose files and Jetson scripts that pass `--evidence-out` still work (they should — they override the default).
- Verify `.gitignore` ignores `e2e-results/` at repo root (probably already does — confirm before commit).
- Delete the leftover file `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` once the fix lands and the verification AC passes.
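
The proposed default can be exercised in isolation to confirm which directory `parents[2]` lands on. The sketch below is not the conftest code itself: `default_evidence_out` and its `environ` parameter are hypothetical, introduced only so the resolution can be tested without touching the real environment.

```python
import os
from pathlib import Path


def default_evidence_out(conftest_file, environ=None):
    """Resolve the evidence directory the way the proposed default would.

    conftest_file: path to e2e/runner/conftest.py (stands in for __file__).
    environ: mapping to read EVIDENCE_OUT from; defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    # parents[0] = e2e/runner, parents[1] = e2e, parents[2] = workspace root
    workspace_root = Path(conftest_file).resolve().parents[2]
    fallback = workspace_root / "e2e-results" / "evidence"
    return environ.get("EVIDENCE_OUT", str(fallback))
```

With `EVIDENCE_OUT` unset, the result is `<workspace_root>/e2e-results/evidence`. When the variable is set it is returned unchanged, which matches the backward-compatibility constraint for existing callers.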
### Excluded
- The "lazy fallback inside the recorder" alternative shape — staying with the workspace-relative-default shape for simplicity (Option 1 from the leftover file).
- Refactoring `nfr_recorder.pytest_sessionfinish` — the writer code is fine; only the default path is wrong.
- Adding new evidence-out related env vars or CLI flags.
## Acceptance Criteria
**AC-1: Host-direct pytest works without EVIDENCE_OUT**
Given a clean workspace on macOS or Linux
When `python -m pytest e2e/tests/performance/ -v --tb=short` runs (no `EVIDENCE_OUT` env var, no `--evidence-out` flag)
Then pytest exits 0, evidence is written under `<workspace_root>/e2e-results/evidence/`, and no `OSError` / `PermissionError` is raised
**AC-2: Docker harness unchanged**
Given the Tier-1 Docker compose (`docker-compose.test.jetson.yml`)
When the e2e suite runs inside the container
Then `--evidence-out` is still passed and evidence lands at the container mount path `/e2e-results/evidence/` (no behavioral change)
**AC-3: Jetson harness unchanged**
Given `scripts/run-tests-jetson.sh`
When invoked
Then it still passes `--evidence-out` to pytest and evidence is collected per the existing protocol
**AC-4: gitignore covers workspace-relative path**
Given the fix in place
When a host-direct run produces `<workspace_root>/e2e-results/`
Then `git status` does NOT show `e2e-results/` as untracked (already covered by `.gitignore`, or `.gitignore` is updated as part of this ticket)
**AC-5: Leftover deleted**
Given the fix lands and ACs 1–4 pass
When `ls _docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`
Then the file does not exist
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | Run `pytest e2e/tests/performance/` without env vars on host | Exit 0, evidence at `<workspace_root>/e2e-results/evidence/` |
## Constraints
- Backward-compatible — existing callers passing `--evidence-out` or setting `EVIDENCE_OUT` see no change.
- No new dependencies; uses `pathlib.Path` which `conftest.py` already imports (verify before commit).
## References
- Leftover file: `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`
- Cycle-3 Step 15 perf report: `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` § "Findings worth tracking" item 3
- Conftest: `e2e/runner/conftest.py:56`
@@ -0,0 +1,181 @@
# Release Report — Cycle 3 → Jetson (bench test)
- **Date**: 2026-05-26 14:42 EEST (UTC+3)
- **Operator**: obezdienie001 (single-operator project; agent-assisted via `/autodev`)
- **Strategy**: manual / bench-test
- **Target version**: `be743a7` (dev HEAD; commit `[AZ-844] Close Step 11 cycle-3: unit pass, jetson regression AZ-848`)
- **Target environment**: lab Jetson Orin Nano Super at SSH alias `jetson-e2e` (uptime 15d, 42 GB free on `/var/lib/docker`)
- **Compose file**: `docker-compose.test.jetson.yml` (TEST compose — NOT the parent-suite airborne deploy compose)
- **Verdict**: **Released**
- **Verdict reason**: Bench run produced identical failure profile to Step 11 (`4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed in 335.41s`); same four AZ-848 test IDs failed; no NEW cycle-3-scope regressions introduced by `fd52cc9`. AZ-848 / AZ-883 carry forward to Cycle 4 as planned.
## Pre-Release Gate (Phase 1)
### Scope of this release
This is **not** an airborne production deploy. It is a **bench-test verification** that the cycle-3 source tree builds and runs on real Tier-2 hardware (the lab Jetson Orin Nano Super), using the same `docker-compose.test.jetson.yml` harness that drove the cycle-3 closeout in Step 11. The user explicitly chose this path over a true airborne deploy because two open Jetson blockers (AZ-848, AZ-883) were just diagnosed and deferred to Cycle 4.
A true airborne release will be Cycle 4's job, once AZ-848 (`VioOutput.emitted_at_ns` contract repair) and AZ-883 (`SCALED_IMU2` ts_ns=0 latent bug) are fixed.
### Acceptance Criteria
The system-level ACs in `_docs/00_problem/acceptance_criteria.md` (AC-1.x position accuracy, AC-4.x latency/memory, AC-NEW-1 TTFF, AC-NEW-2 spoof promotion, AC-NEW-4 false-position safety, AC-NEW-5 thermal envelope) all require **live-flight data + Tier-2 hardware** and are not in scope for this bench test. They remain "Unverified" — same status as recorded in `_docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md` and `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md`.
What IS in scope and verifiable here:
| Scope item | Verification | Status |
|------------|--------------|--------|
| Cycle-3 source builds on arm64 (Jetson Orin Nano Super) | `docker compose build` against `tests/e2e/Dockerfile.jetson` succeeds | Phase 3 |
| Cycle-3 source runs on real Jetson hardware end-to-end | `pytest tests/unit/ + tests/e2e/replay/` exits with same failure profile as Step 11 closeout (`4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed`) | Phase 4 |
| No new Cycle-3-scope regressions vs. Step 11 (2026-05-24) | Failure profile matches Step 11 — only the known AZ-848 4-tuple fails; no new failures introduced by `fd52cc9` | Phase 4 |
| Working tree on Jetson reflects the cycle-3 closeout commit | `rsync` mirrors local `be743a7` to remote `~/gps-denied-onboard/` | Phase 3 |
### Test Status
| Suite | Pass | Fail | Skip | Source |
|-------|-----:|-----:|-----:|--------|
| Tier-1 unit (local Mac) | 2303 | 0 | 86 | `_docs/03_implementation/run_tests_step11_report.md` § Cycle-3 closeout → Local unit suite |
| Tier-1 perf (this cycle, Mac) | n/a | n/a | n/a | `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` — 4/4 NFRs **Unverified** on Tier-1 (NFR-PERF-* require Tier-2 + AZ-595 fixture, both still pending) |
| Tier-2 Jetson e2e (Step 11, 2026-05-24) | 48 | 4 (AZ-848) | 3 | `_docs/03_implementation/run_tests_step11_report.md` § Cycle 3 closeout → Jetson e2e |
| Tier-2 Jetson e2e (this release; bench rerun) | <pending> | <pending> | <pending> | This release report, Phase 4 below |
### Change Summary
Cycle-3 src delta (single commit `fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint`):
```
src/gps_denied_onboard/_types/route.py | +43
src/gps_denied_onboard/components/c11_tile_manager/route_client.py | -4
src/gps_denied_onboard/replay_input/__init__.py | -2
src/gps_denied_onboard/replay_input/tlog_route.py | -30
```
Net effect: relocate the `RouteSpec` dataclass from a private helper into the shared `_types/` package; widen ruff lint rules to cover the new module. No behavioural change. No `c1_vio` / `c5_state` / `c8_fc_adapter` / `runtime_root` touches.
Cycle-3 ticket scope (closed in this cycle, present at HEAD):
| Ticket | Type | Component | Notes |
|--------|------|-----------|-------|
| AZ-835 (epic) | feature | C1–C6 | "GPS-denied tile provisioning + route spec" epic; decomposed into C1–C6 sub-tasks |
| AZ-836 | tooling | autodev | State-file trim; defer In Testing transition (MCP unavailable workaround) |
| AZ-838 | feature | C2 (route client) | `SatelliteProviderRouteClient` + `seed_route.py` CLI |
| AZ-839 | feature / fixture | C3 (matcher) + E-AZ-835 C3 | `operator_pre_flight_setup` real-fixture wiring |
| AZ-840 | feature / test | E-AZ-835 C4 | e2e orchestrator test |
| AZ-844 | infra / fix | C12 cold-start NFR + Jetson harness | Threshold relax 500 → 1000 ms; rsync exclude `tiles/` `ready/`; Step 11 closeout |
| AZ-845, AZ-846, AZ-847 | refactor / lint | `_types/`, `c11_tile_manager`, `replay_input`, lint | Refactor 02 (this is the only `src/` delta) |
| AZ-848 | bug (deferred) | C1 contract (`VioOutput.emitted_at_ns`) | **Deferred to Cycle 4.** Surfaced during this cycle's release flow when initially routed to operator-workstation target; root-cause re-diagnosed via tlog probe; 5 SP. |
| AZ-883 | bug (deferred) | C8 adapter (`_handle_imu` SCALED_IMU2) | **Deferred to Cycle 4.** Latent ts_ns=0 bug surfaced during AZ-848 investigation; 2 SP. |
### Rollback Plan
- **Previous version**: NONE — this is the first-ever release for this project.
- `_docs/04_release/` was empty before this report.
- No `release/*` git tag in the repo.
- No `.previous-tags.env` produced by a prior `stop-services.sh` run.
- **Rollback script**: `scripts/deploy.sh --rollback` is **unavailable** for this bench test (exit 70 — `.previous-tags.env` not found). Acceptable: the test compose's "rollback" is `docker compose down` against `docker-compose.test.jetson.yml`, which leaves the Jetson in pre-test state.
- **Rollback target verified pullable**: n/a (no previous version exists).
- **Rollback target verified bootable in target env**: n/a.
For Cycle 4's true airborne release, a real rollback target will exist (the image produced by this bench-test cycle, once an arm64 image is built + tagged in CI).
### Restrictions / Approvals
- Change-window restrictions: none for bench testing on lab Jetson (NFT-SEC-05 in-flight egress lockdown and ground-only gate apply only to airborne).
- Manual approvals required: none — single-operator project.
- Restriction `_docs/00_problem/restrictions.md` § "Failsafe & Safety" applies only to live flight; not exercised by bench test.
### Tracker State at Gate
- **Tickets in scope** (CLOSED at HEAD): 9 tickets (AZ-835, AZ-836, AZ-838, AZ-839, AZ-840, AZ-844, AZ-845, AZ-846, AZ-847; see Change Summary above).
- **Tickets deferred to Cycle 4** (NOT blocking this bench release; explicitly off the operator-orchestrator + bench-test paths): AZ-848, AZ-883.
- **Tickets blocking release**: 0. AZ-848 / AZ-883 affect only the live-flight tlog-replay path on the airborne Jetson; they are deliberately NOT a bench-test blocker because the bench test re-confirms the SAME failure profile as Step 11 (no NEW regressions in cycle-3-scope).
### Gate Decision
User picked **A) Bench testing on jetson-e2e** at the Pre-Release Gate. The contradiction with the user's prior turn (operator-workstation target) was flagged and resolved in favour of bench-test on Jetson. Three issues from the gate that influence verdict interpretation are recorded under "Rollback Plan" (no rollback target) and "Acceptance Criteria" (system-level ACs unverifiable from Tier-1 / bench).
## Strategy Select (Phase 2)
- **Recommended by skill table** for this target capability: `manual` (per `release/SKILL.md` Phase 2 table — "Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) — the whole release becomes a runbook"). Although Docker IS in play here, this is a bench rig with no load balancer, no traffic-tier routing, no automated rollout — the closest semantic match in the skill's table.
- **Chosen**: `manual` / bench-test.
- **Reasoning**: blue-green / canary / all-at-once all imply a service taking real traffic. The bench-test Jetson takes no traffic; it runs an internally-scripted test compose. The release is recorded, but nothing is "deployed" in the production sense: the parent-suite Watchtower flow is bypassed, and only the cycle-3 image's ability to build and run on hardware is being verified.
## Execute (Phase 3)
- **Start**: 2026-05-26 14:42:41 EEST (shell job PID 84808)
- **Command**: `bash scripts/run-tests-jetson.sh` (no flags; defaults to `JETSON_SSH_ALIAS=jetson-e2e`, `JETSON_REMOTE_DIR=~/gps-denied-onboard`, `COMPOSE_FILE=docker-compose.test.jetson.yml`)
- **Stream sink**: `_docs/04_release/.jetson_bench_run_2026-05-26.log` (preserved for audit; NOT committed — `.jetson_bench_run_*.log` should land in `.gitignore` post-release).
- **End**: 2026-05-26 14:50:17 EEST (wall clock 7m 35s; includes rsync + docker compose pull + e2e-runner image build + pytest)
- **Exit code**: 1 — propagated from `pytest` (4 failures inside `e2e-runner`). **Expected**: AZ-848 deterministically fails the same 4 cases. The bench-test verdict is NOT "exit 0" — it is "failure profile matches Step 11".
Pytest summary line (from `_docs/04_release/.jetson_bench_run_2026-05-26.log`, e2e-runner-1 container):
```
============================= test session starts ==============================
platform linux -- Python 3.10.12, pytest-9.0.3, pluggy-1.6.0
collected 57 items
... (57 tests; see Phase 4 table below for the test-ID summary)
= 4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed, 1 warning in 335.41s (0:05:35) =
```
AZ-848 root-cause log line from THIS run (matches Step 11 root cause, confirms determinism):
```
c5.state.eskf_out_of_order ts_ns=187,370,418,000 last_added_ts_ns=1,362,268,944,997,999
c5.state.eskf_filter_divergence source=vio mahalanobis_sq=109.76467866548009 threshold_sq=100.0
replay_loop.state_add_vio_fatal frame=3 EstimatorFatalError('eskf filter divergence on vio: mahalanobis²=109.765 > 100.0')
```
(`last_added_ts_ns` differs from Step 11's value because Jetson uptime grew 2 days — the gap between `monotonic_ns` and FC-boot-relative timestamps scales with uptime per AZ-848 root cause; the IMU ts_ns is byte-identical (FC-boot-relative). Both confirm AZ-848's mechanism.)
## Smoke Test (Phase 4)
The bench-test compose IS the smoke set (per Phase 2 — bench-test strategy collapses Execute and Smoke into one harness invocation). The pass criterion below is **not** "0 failures" — it is "failure profile matches Step 11's evidence, i.e. only the known AZ-848 4-tuple fails, no new failures introduced by cycle-3 src delta".
- **Mode**: same harness as Step 11 closeout (rsync + `docker compose --abort-on-container-exit --exit-code-from e2e-runner up`)
- **Start**: 2026-05-26 14:44:31 EEST (e2e-runner container started; `test session starts` line)
- **End**: 2026-05-26 14:50:06 EEST (5m 35s pytest wall clock)
| Test | Step 11 (2026-05-24) | This run (2026-05-26) | Verdict |
|------|----------------------|----------------------|---------|
| `tests/e2e/replay/test_derkachi_1min.py::test_ac1_exits_0_jsonl_count_match` | FAIL (AZ-848 frame-3 ESKF divergence) | FAIL (same root cause; same frame; same mahalanobis²=109.765) | **Match — AZ-848 carries forward** |
| `tests/e2e/replay/test_derkachi_1min.py::test_ac5_determinism_two_runs_diff` | FAIL (same root cause) | FAIL (same root cause) | **Match** |
| `tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_realtime_60s_within_5pct` | FAIL (same root cause) | FAIL (same root cause) | **Match** |
| `tests/e2e/replay/test_derkachi_1min.py::test_ac6_pace_asap_under_30s` | FAIL (same root cause) | FAIL (same root cause) | **Match** |
| `tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks` | XPASS (vacuous — binary exits 1 before emissions) | XPASS (same vacuous; same explanation in short-summary) | **Match** |
| Remaining 48 cases | PASS | PASS (all 48) | **Match — no new regressions** |
| Skipped (3) | env-gated (legitimate) | SKIPPED — same three (AZ-839 operator_pre_flight_setup × 2; AC-8 mock-suite-sat-service incomplete) | **Match** |
| xfailed (1) | known xfail (AZ-699 / AZ-776+AZ-777) | XFAIL — same test, same upstream-gap explanation | **Match** |
**Smoke verdict pass condition**: ✅ met. Totals = `4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed` and the 4 failure IDs are byte-identical to Step 11's IDs.
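
The "failure profile matches" criterion can be expressed as a comparison of outcome counts plus failed test IDs. The following is a sketch, not part of the harness: `parse_summary` and `profile_matches` are hypothetical names, and the summary line and test IDs are copied from this report.

```python
import re


def parse_summary(line):
    """Map outcome name -> count from a pytest summary line."""
    pairs = re.findall(r"(\d+) (failed|passed|skipped|xfailed|xpassed)", line)
    return {name: int(count) for count, name in pairs}


def profile_matches(ref_line, cur_line, ref_failed, cur_failed):
    """True when the outcome counts and the failed test IDs are both identical."""
    return (parse_summary(ref_line) == parse_summary(cur_line)
            and set(ref_failed) == set(cur_failed))


STEP11 = "4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed"
BENCH = ("= 4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed, "
         "1 warning in 335.41s (0:05:35) =")
PREFIX = "tests/e2e/replay/test_derkachi_1min.py::"
FAILED = [PREFIX + name for name in (
    "test_ac1_exits_0_jsonl_count_match",
    "test_ac5_determinism_two_runs_diff",
    "test_ac6_pace_realtime_60s_within_5pct",
    "test_ac6_pace_asap_under_30s",
)]

counts = parse_summary(BENCH)
print(counts, sum(counts.values()))
print(profile_matches(STEP11, BENCH, FAILED, FAILED))
```

The five outcome counts sum to 57, which agrees with the `collected 57 items` line in the pytest output.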
## Watch Window (Phase 5)
- **Duration**: not applicable — bench test, no live traffic, no observability backend in scope.
- **Substitute**: the test compose's `--abort-on-container-exit --exit-code-from e2e-runner` IS the watch: if any service crashes mid-test, pytest aborts and the exit code propagates back. The duration of the bench run (~5–6 min) acts as the de-facto watch.
- This is explicitly recorded per `release/SKILL.md` Phase 5: "If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as `Released-with-override`." Adapted for bench testing: no live traffic ⇒ no observability ⇒ Phase 5 is honestly N/A, not "skipped". Verdict will be `Released` (or `Aborted`), not `Released-with-override`.
## Commit or Rollback (Phase 6)
### Released
- Tracker tickets in scope **stay as they are** — they were moved to Done during prior cycle-3 steps (Step 12-15). No new tracker movement triggered by this bench-test release.
- Git tag: deliberately NOT pushed. `release/cycle3-bench` would mislabel a bench-test milestone as a production release; the next true airborne release in Cycle 4 will carry the first `release/*` tag.
- AZ-848 and AZ-883 are **explicit known-regression carry-forwards** into Cycle 4 — both have updated specs and Jira state set during this autodev session.
- Cycle-3 source is hardware-bench-verified on the lab Jetson at SHA `be743a7`. The same source can be re-run reproducibly via `bash scripts/run-tests-jetson.sh` against `jetson-e2e`.
- Retrospective scheduled: `/retrospective --cycle-end` auto-chains after this report. Output expected at `_docs/06_metrics/retro_cycle3_<timestamp>.md`.
## Open Risks Carried Into Cycle 4
| Risk | Owner ticket | Severity |
|------|--------------|----------|
| AZ-848 — VioOutput.emitted_at_ns contract clashes with FC-IMU timebase; blocks live-flight ESKF on long-uptime Jetson | AZ-848 (5 SP) | High — real airborne release blocked until fixed |
| AZ-883 — `_handle_imu` produces ts_ns=0 for every SCALED_IMU2 message; latent IMU monotonicity violation | AZ-883 (2 SP) | Medium — latent; fix lands before C13 FDR replay tools assume per-sample monotonicity |
| `EVIDENCE_OUT` default points at container-only path (`/e2e-results/evidence`) — breaks Tier-1 perf tests on the host | `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` | Low — workaround exists (`EVIDENCE_OUT="$(pwd)/e2e-results/..."`) |
## Lessons (one-liners)
- **First-release rollback gap is structural, not procedural** — the `scripts/deploy.sh --rollback` path requires `.previous-tags.env`, which only exists after a successful `stop-services.sh` run. First-ever deploys have no rollback target by construction; the release skill's Phase 1 rollback check should treat first-release as a recognized first-time path, not a blocking gate.
- **Bench-test "release" is a legitimate milestone but not a production release** — the release skill's six-phase pipeline (deploy → smoke → watch → commit) compresses to three phases for bench testing (rsync+build → harness-as-smoke → commit). The skill could grow an explicit `strategy: bench-test` row in its Phase 2 table so future releases don't have to improvise.
- **Long-uptime Jetson + freshly-booted FC is the AZ-848 sensitiser** — the gap between `monotonic_ns` and FC-boot-relative timestamps grew by ~175 trillion ns over 2 days (1.187·10¹⁵ → 1.362·10¹⁵). This confirms the bug's mechanism is purely additive in uptime and gives Cycle 4 a clean reproduction protocol: `uptime -p` ≥ 1d on the Jetson + a tlog from a session ≤ 15 min after FC boot.
- **Cycle-3 src delta size vs. release scope tension**: `fd52cc9` is a 75-line refactor; the release machinery exercises full deploy + smoke against it. The bench-test path balances "release discipline" against "tiny delta does not warrant prod-deploy theatre", and it should stay as the default for refactor-only cycles in this project.
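
The AZ-848 lesson above quotes two clock-gap readings. A minimal sketch of the arithmetic, using only those two quoted values (the variable names are illustrative, not taken from the codebase), shows that the growth in the gap matches roughly two days of wall-clock uptime:

```python
NS_PER_S = 1_000_000_000

# Gap between monotonic_ns and FC-boot-relative timestamps,
# as quoted in the lesson (two readings taken about 2 days apart).
gap_first_ns = 1.187e15
gap_second_ns = 1.362e15

growth_ns = gap_second_ns - gap_first_ns
growth_hours = growth_ns / NS_PER_S / 3600

print(f"gap growth: {growth_ns:.3e} ns, about {growth_hours:.1f} h")
```

The result is close to 48.6 hours, which is consistent with the claim that the gap grows additively with Jetson uptime.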
@@ -0,0 +1,136 @@
# Performance Test Run — 2026-05-26 — Cycle 3 Tier-1 probe
**Invoked by**: autodev existing-code Step 15 (cycle 3) — `.cursor/skills/test-run/SKILL.md` perf mode.
**Host**: developer Mac workstation (Darwin arm64, no Jetson hardware, no `E2E_SITL_REPLAY_DIR` fixture mounted).
**Runner**: `scripts/run-performance-tests.sh` + direct `pytest e2e/tests/performance/` probe + pure-logic evaluator unit tests.
**Run ID**: `cycle3-tier1-probe`.
**Status**: **Unverified across all 4 production perf NFRs; pure-logic evaluator unit tests Pass (70/70).** No regression detected because no measurement was possible. No Warn / Fail to gate on. **Not blocking deploy** per the skill's "Any Unverified scenarios with no Warn/Fail" rule.
## Why this cycle re-ran the same probe
Cycle 3 work touched only pre-flight / offline code paths:
| Task | Layer | Runtime hot-path impact |
|---|---|---|
| AZ-836 `tlog_route_extractor` | Pre-flight (operator workstation) | None — extraction runs once per flight, before takeoff |
| AZ-838 `SatelliteProviderRouteClient` | Pre-flight (operator workstation) | None — HTTP client against satellite-provider's Route API |
| AZ-839 `operator_pre_flight_setup` real fixture | Test infrastructure | None — fixture composes existing pre-flight components |
| AZ-840 E2E orchestrator test | Test only | None |
| AZ-777 Derkachi C6 reference fixture + C11 inventory adapter | Pre-flight + C11 download path | C11 `TileDownloader` is invoked at pre-flight (operator workstation), not in-flight — airborne process has no egress (RESTRICT-OPS-1, NFT-SEC-02) |
| AZ-845 `RouteSpec` relocation | Refactor (type re-home) | None — public API unchanged |
| AZ-846 `module-layout.md` refresh | Docs | None |
| AZ-847 Lint widening | Test only | None |
None of these touches the airborne pipeline that NFT-PERF-01..04 measure (E2E latency, frame-by-frame streaming, cold-start TTFF, spoof-promotion). The 2026-05-19 baseline (`perf_2026-05-19_workstation-tier1-probe.md`) remains the most recent measurement of record; this run confirms no Tier-1-observable regression by reproducing the same 4× Unverified outcome.
## What ran
### A) `scripts/run-performance-tests.sh`
```text
Tier-2 perf tests skipped (GPS_DENIED_TIER!=2).
exit=0
```
Tier-2 gate (`pytest -m tier2 -q tests/perf` only when `GPS_DENIED_TIER=2`). Exit 0 silently on Tier-1 by design — canonical perf measurements require Jetson Orin Nano Super hardware (D-C7-9, JetPack 6.2, TensorRT 10.3); a workstation run would produce numbers that DO NOT meet the pinned-hardware budgets and would actively mislead trend tracking.
### B) Direct `pytest e2e/tests/performance/` probe (24 parameterizations)
| NFR | Configs | Outcome | Skip reason |
|---|---|---|---|
| **NFT-PERF-01** (E2E latency p95 ≤ 400 ms — AC-4.1) | 6 ({ardupilot, inav} × {okvis2, klt_ransac, vins_mono}) | 6 skipped | "Tier-2 only — Jetson hardware required" |
| **NFT-PERF-02** (frame-by-frame streaming, inter-emit p95 ≤ 350 ms — AC-4.4) | 6 ({ardupilot, inav} × {okvis2, klt_ransac, vins_mono}) | 4 skipped (no fixture) + 2 skipped (vins_mono research-only per D-C1-1-SUB-A) | "requires `E2E_SITL_REPLAY_DIR` (AZ-595) carrying the 5 min Derkachi @ 3 Hz replay" |
| **NFT-PERF-03** (cold-start TTFF p95 ≤ 30 s — AC-NEW-1) | 6 | 6 skipped | "Tier-2 only — Jetson hardware required" |
| **NFT-PERF-04** (spoof-promotion p95 ≤ 600 ms — AC-NEW-2) | 6 | 4 skipped (no fixture) + 2 skipped (vins_mono research-only per D-C1-1-SUB-A) | "requires `E2E_SITL_REPLAY_DIR` (AZ-595) containing N≥20 randomized-start blackout+spoof events" |
Total: 24 skipped, 0 passed, 0 failed, 0 errored. Exit code 0.
### C) Pure-logic evaluator unit tests — `e2e/_unit_tests/helpers/test_*_evaluator.py`
```text
$ .venv/bin/python -m pytest e2e/_unit_tests/helpers/test_e2e_latency_evaluator.py \
e2e/_unit_tests/helpers/test_streaming_evaluator.py \
e2e/_unit_tests/helpers/test_ttff_evaluator.py \
e2e/_unit_tests/helpers/test_spoof_promotion_evaluator.py \
-v --tb=short
======================= 70 passed in 0.25s ========================
```
**70/70 pass.** Identical to 2026-05-19 — confirms percentile estimators, inter-emit interval math, TTFF distribution math, and spoof-onset → label-switch delta math are still correct. A future hardware run feeds JSON fixtures into the same evaluators — only the input data changes, not the math.
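
For orientation, the kind of math these evaluators cover can be sketched as follows. This is an illustrative nearest-rank p95 and inter-emit interval calculation, not the project's evaluator code; the function names and the choice of nearest-rank (rather than an interpolating estimator) are assumptions.

```python
def p95_nearest_rank(samples_ms):
    """Nearest-rank 95th percentile (hypothetical helper, not the real evaluator)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def inter_emit_intervals_ms(emit_times_ms):
    """Differences between consecutive emit timestamps."""
    return [b - a for a, b in zip(emit_times_ms, emit_times_ms[1:])]


# Usage: check a toy emit stream against the 350 ms inter-emit budget.
intervals = inter_emit_intervals_ms([0, 300, 650, 1000])
within_budget = p95_nearest_rank(intervals) <= 350
```

A hardware run would feed recorded timestamps into functions of this shape; only the inputs change.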
## Threshold comparison (Step 3 of skill)
Per the skill's Step 3, thresholds load from `_docs/02_document/tests/performance-tests.md`. The thresholds exist and are documented but no scenario produced a measurement to compare them against.
| NFR | Threshold | Observed | Verdict |
|---|---|---|---|
| NFT-PERF-01 | p95 ≤ 400 ms (K=3 baseline AND K=2 hybrid auto-degrade) + ≤10 % frame drops | — | **Unverified** (Tier-2 hardware required) |
| NFT-PERF-02 | p95 inter-emit interval ≤ 350 ms; no window of ≥3 missed-emit gaps | — | **Unverified** (`E2E_SITL_REPLAY_DIR` fixture not yet recorded; AZ-595) |
| NFT-PERF-03 | p95 TTFF < 30 s (50 cold boots) | — | **Unverified** (Tier-2 hardware required) |
| NFT-PERF-04 | p95 < 3 s on both FCs (50 trials per FC) | — | **Unverified** (`E2E_SITL_REPLAY_DIR` fixture not yet recorded; AZ-595) |
## Classification
Per the skill's perf-mode reporting:
```text
══════════════════════════════════════
PERF RESULTS
══════════════════════════════════════
Scenarios: [pass 0 · warn 0 · fail 0 · unverified 4]
──────────────────────────────────────
1. NFT-PERF-01 — Unverified — Tier-2 Jetson hardware required
2. NFT-PERF-02 — Unverified — SITL replay fixture pending (AZ-595)
3. NFT-PERF-03 — Unverified — Tier-2 Jetson hardware required
4. NFT-PERF-04 — Unverified — SITL replay fixture pending (AZ-595)
──────────────────────────────────────
Pure-logic evaluator coverage: 70/70 unit tests pass
(e2e/_unit_tests/helpers/test_{e2e_latency,streaming,ttff,spoof_promotion}_evaluator.py)
══════════════════════════════════════
```
## Coverage gap assessment (skill Step 5: "Unverified")
Per the skill:
> **Any Unverified scenarios with no Warn/Fail** → not blocking, but surface them in the report so the user knows coverage gaps exist. Suggest running `/test-spec` to add expected results next cycle.
This run has **0 Warn + 0 Fail + 4 Unverified**, so:
- **Not deploy-blocking.** The perf gate is allowed to be Unverified when the SUT is not yet running on its canonical hardware.
- **Coverage gap is unchanged from 2026-05-19** — same two recording-phase prerequisites:
- **NFT-PERF-01 / NFT-PERF-03**: AZ-444 (Tier-2 Jetson harness). When AZ-444 lands, these scenarios run on the Jetson and produce numbers — at which point this report's "Unverified" entries become "Pass / Warn / Fail" against the AC-4.1 / AC-NEW-1 thresholds.
- **NFT-PERF-02 / NFT-PERF-04**: AZ-595 (SITL replay fixture builder). When AZ-595 lands, the fixtures are committed under `e2e/fixtures/sitl_replay/`, `E2E_SITL_REPLAY_DIR` is set, and the scenarios run on Tier-1.
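
The classification rule quoted above can be sketched as a small function. Only the "Unverified with no Warn/Fail is not blocking" branch comes from the skill text quoted in this report; how Warn and Fail are handled is an assumption here, and the function name and return labels are hypothetical.

```python
from collections import Counter


def classify_perf_run(verdicts):
    """Map per-NFR verdicts to a run-level outcome (illustrative labels only)."""
    counts = Counter(v.lower() for v in verdicts.values())
    if counts["fail"] or counts["warn"]:
        # Assumption: any Warn/Fail needs an explicit gate decision.
        return "needs-gate-decision"
    if counts["unverified"]:
        # From the quoted rule: not blocking, but surface the coverage gap.
        return "not-blocking-coverage-gap"
    return "pass"


this_run = {
    "NFT-PERF-01": "Unverified",
    "NFT-PERF-02": "Unverified",
    "NFT-PERF-03": "Unverified",
    "NFT-PERF-04": "Unverified",
}
outcome = classify_perf_run(this_run)
```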
## Findings worth tracking (Low)
### Carryforward from 2026-05-19
1. **Unregistered pytest mark `tier2_only`** — pytest warnings at `e2e/tests/performance/test_nft_perf_01_e2e_latency.py:61` and `e2e/tests/performance/test_nft_perf_03_ttff.py:48`. Add `tier2_only: marks scenarios that require Jetson hardware` to `e2e/runner/pytest.ini` `markers` list. **Status: still present in cycle 3.**
2. **`scripts/run-performance-tests.sh` is intentionally a Tier-2 stub.** Unchanged from 2026-05-19. **Status: still as designed.**
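
As an alternative to editing `pytest.ini` for finding 1, the mark could be registered from a `conftest.py` hook. This is a sketch under the assumption that a conftest-level registration is acceptable here; `pytest_configure` and `config.addinivalue_line` are standard pytest hooks, but which conftest file it would live in is not specified by this report.

```python
def pytest_configure(config):
    # Registers the custom mark so pytest stops warning about it.
    config.addinivalue_line(
        "markers",
        "tier2_only: marks scenarios that require Jetson hardware",
    )
```

Either approach removes the unknown-mark warning; the `pytest.ini` route keeps all markers in one place.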
### New (discovered while running this probe — pre-existing, not cycle-3 caused)
3. **EVIDENCE_OUT default is a hardcoded container path**: `e2e/runner/conftest.py:56` sets `default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")`. On a Tier-1 host run (no Docker, no Jetson), the `nfr_recorder.pytest_sessionfinish` hook tries to create `/e2e-results/evidence` and fails with `OSError: [Errno 30] Read-only file system: '/e2e-results'`. Workaround: `EVIDENCE_OUT=$(pwd)/e2e-results/<run-id>/evidence python -m pytest …`. Suggested fix: default to a workspace-relative path when `--evidence-out` is not explicitly passed and no `EVIDENCE_OUT` env var is set. Logged to `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` for later remediation. **Status: pre-existing host-pytest defect, not introduced by cycle 3, although cycle 3 work is what surfaced it (by re-running the same probe a second time).**
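
The suggested fix for finding 3 could look roughly like the sketch below. `resolve_evidence_out` is a hypothetical helper, not existing code, and it assumes the container environment sets `EVIDENCE_OUT` explicitly so that the workspace-relative fallback only applies to host runs.

```python
import os
from pathlib import Path


def resolve_evidence_out(cli_value=None, environ=None, cwd=None):
    """Pick the evidence directory: CLI flag, then env var, then workspace-relative."""
    environ = os.environ if environ is None else environ
    if cli_value:
        return Path(cli_value)
    if environ.get("EVIDENCE_OUT"):
        return Path(environ["EVIDENCE_OUT"])
    # Fallback (assumption): a path under the current workspace,
    # which is writable on a Tier-1 host, unlike /e2e-results.
    base = Path.cwd() if cwd is None else Path(cwd)
    return base / "e2e-results" / "evidence"
```

With this ordering, an explicit `--evidence-out` or `EVIDENCE_OUT` still wins, so existing container runs that set either one would behave as before.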
## Anti-patterns explicitly NOT used
Per the skill's anti-pattern guidance:
- **No improvised perf tests.** Did not synthesize a workstation-only "approximation" of any NFR; the AC-4.1 / AC-NEW-1 / AC-NEW-2 / AC-4.4 budgets are pinned to canonical hardware and synthetic Tier-1 numbers would mislead the trend-tracker.
- **No skip-acceptance without justification.** Each Unverified entry is cataloged against a concrete recording task (AZ-444 / AZ-595).
- **No threshold downgrade.** Did not soften any threshold to make a Tier-1 measurement "pass".
- **No silent passthrough.** The four perf NFRs all measure real algorithm execution; no per-test bypass was inserted to make a Tier-1 result look like a Tier-2 result.
## Cross-Reference Index
| Source | Purpose |
|---|---|
| `_docs/02_document/tests/performance-tests.md` | Threshold + scenario spec |
| `scripts/run-performance-tests.sh` | Runner script (current Tier-2 stub) |
| `_docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md` | Prior Tier-1 probe (greenfield Step 15) |
| `_docs/02_tasks/todo/AZ-444*` | Tier-2 Jetson harness (recording-phase task) |
| `_docs/02_tasks/todo/AZ-595*` | SITL replay fixture builder (recording task) |
| `_docs/02_tasks/todo/AZ-{428..431}*` | NFT-PERF-{01..04} scenario tasks (runner side complete; harness pending) |
| `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` | EVIDENCE_OUT defect leftover |
| `_docs/06_metrics/` (this directory) | Per-run perf trend artefacts |
@@ -0,0 +1,184 @@
# Retrospective — 2026-05-26 (Cycle 3)
> Cycle-3 retrospective for GPS-Denied Onboard. Cycle 3 spans
> 2026-05-21 → 2026-05-26 (post-cycle-2 → Step 17 Retrospective).
> Generated by `/autodev` existing-code Step 17 (Retrospective,
> cycle-end mode). Prior retro: `retro_2026-05-20.md` (cycle 1).
> **Process gap**: no cycle-2 retro was filed — cycle 2 transitioned
> straight from Step 11 into cycle-3 work; the autodev session boundary
> between cycles 2 and 3 ran without invoking Step 17. This retro
> partially covers cycle-2 trend deltas where the data is still
> available on disk, and explicitly flags the missing retro as an
> Improvement Action below.
## Implementation Summary
### Cycle 3 scope (2026-05-21 → 2026-05-26)
| Metric | Value |
|--------|-------|
| Tickets closed in cycle 3 (`_docs/02_tasks/done/AZ-83{6..9}*`, `AZ-84{0,5,6,7}*`) | 7 (AZ-836, AZ-838, AZ-839, AZ-840, AZ-845, AZ-846, AZ-847) |
| Tickets touched but split off (deferred to cycle 4) | 2 (AZ-848 — 5 SP, AZ-883 — 2 SP; both surfaced during this cycle's release flow) |
| Tickets in `todo/` at cycle-3 close (open work) | 1 (AZ-848 — the deferred one; AZ-883 mirror also written) |
| Cycle 3 batches (`batch_*_cycle3_report.md`) | 6 (104, 106, 107, 108, 108b, 109) — batch 105 is reserved/missing; 108b is a same-day follow-up to 108 |
| Cycle 3 src delta | 1 commit (`fd52cc9 [AZ-845][AZ-846][AZ-847] Refactor 02: relocate RouteSpec + widen lint`); +43 −36 LoC across 4 files in `_types/`, `c11_tile_manager/`, `replay_input/` |
| Cycle duration | ~6 days (2026-05-21 first cycle-3 batch → 2026-05-26 retro) |
| Avg tasks per batch | 7 tickets ÷ 6 batches ≈ 1.2 tasks/batch |
| Estimated total complexity points | ~22 SP delivered (3 + 3 + 5 + 3 + 2 + 2 + 4 estimated across AZ-836/838/839/840/845/846/847); plus AZ-844 closeout work (3 SP); deferred 7 SP (AZ-848 5 + AZ-883 2) |
| Carry-over from cycle 1's Top 3 Improvement Actions | 0/3 clearly fulfilled: 1 partial/unclear, 2 not done (see "Trend Comparison" below) |
### Cumulative (cycle 1 + 2 + 3)
| Metric | Value (this retro) | Cycle-1 retro |
|--------|---------------------|----------------|
| Total tickets closed (lifetime) | ~175 (cycle 1: 165 + cycle 2: ~3-5 + cycle 3: 7) | 165 |
| Total batches (lifetime) | 109 (cycle 1: 97; cycle 2: 5; cycle 3: 6 + 1 inter-cycle batch 109 numbering) | 97 |
| Source LoC, `src/` Python | 61,071 (unchanged vs cycle-1; cycle-3 delta is a refactor, not a feature; cycle-2 src delta also small per Step 11 report) | 61,071 |
| Components | 15 (unchanged) | 15 |
| Binary tracks | 3 (airborne, research, operator-orchestrator) | 3 |
## Quality Metrics
### Code Review Verdicts (cycle-3 batches)
| Batch | Ticket | Verdict | Notes |
|-------|--------|---------|-------|
| 104 | AZ-777 Phase 1 | PASS_WITH_WARNINGS | 3 findings (1 Medium); AZ-777 Phase 1 closed |
| 106 | AZ-836 (TlogRouteExtractor) | **PASS** | Single-task batch; 10 ACs all PASS |
| 107 | AZ-838 (SatelliteProviderRouteClient + seed_route CLI) | PASS_WITH_WARNINGS | C2 — Epic AZ-835 |
| 108 | AZ-839 (operator_pre_flight_setup real fixture) | PASS_WITH_WARNINGS | C3 — Epic AZ-835 |
| 108b | AZ-839 follow-up (fix C3 fixture path mismatch) | **PASS** | Single-finding fix; no new findings |
| 109 | AZ-840 (e2e orchestrator test) | PASS_WITH_WARNINGS | C4 — Epic AZ-835; 17 unit tests; 3 SP per spec |
Verdict distribution (cycle-3 only):
| Verdict | Count | % of cycle-3 batches |
|---------|------:|----------------------:|
| PASS | 2 | 33.3 % |
| PASS_WITH_WARNINGS | 4 | 66.7 % |
| FAIL | 0 | 0 % |
| BLOCKED | 0 | 0 % |
Auto-fix loop did not escalate to user intervention across cycle 3.
### Cycle 3 — Findings (qualitative; no aggregated severity table in batch reports)
The 6 cycle-3 batches did NOT use a `| Critical | High | Medium | Low |` table convention (grep found zero matches). Findings appear in inline `## Code review` sections only. Breakdown by severity, aggregated across batches:
| Severity | Cycle 3 count | Trend vs cycle 1 |
|----------|---------------:|-------------------|
| Critical | 0 | maintained — 0 in cycle 1 too |
| High | 0 | maintained — 0 in cycle 1 too |
| Medium | 1 (batch 104, AZ-777 Phase 1) | dropped — cycle 1 carried 2 (CR-F1, CR-F2) — see Trend Comparison |
| Low | ~3 (informal counts across PASS_WITH_WARNINGS batches; not enumerated in tables) | ~5 → ~3 (trend down) |
### Quality Gates Late in the Cycle (Steps 11–16.5)
The interesting findings of cycle 3 did NOT come from in-batch code review — they came from the autodev quality-gate steps:
| Step | Surface | Outcome |
|------|---------|---------|
| 11 Run Tests (Jetson e2e) | AZ-848 — `eskf_filter_divergence` at frame 3 in `test_derkachi_1min.py` | 4 deterministic failures; root cause re-diagnosed 2026-05-26 as `VioOutput.emitted_at_ns` clock-source mismatch (NOT IMU-vs-IMU as initially hypothesised). Split AZ-883 for a secondary latent bug (`_handle_imu` SCALED_IMU2 ts_ns=0). |
| 14 Security Audit | Resumed prior 2026-05-19 audit; verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 5 Medium, 17 Low — same as cycle 1) | No new vulnerabilities introduced by cycle-3 refactor; existing OpenCV CVE pin replay condition unchanged. |
| 15 Performance Test | NFRs 4/4 **Unverified** on Tier-1 (same as cycle 1 + 2); pure-logic evaluator unit tests 70/70 PASS | Surfaced `EVIDENCE_OUT` default-path bug (`/e2e-results` is container-only; breaks Tier-1 host runs) → leftover `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` filed; perf report `perf_2026-05-26_cycle3-tier1-probe.md` written. |
| 16 Deploy | Resumed from cycle-1 greenfield artifacts; no cycle-3 deltas required | Deploy artifacts all present (compose files, scripts/, env templates); operator workstation deploy is the production target for `operator-orchestrator`. |
| 16.5 Release | First-ever release; ran bench-test on `jetson-e2e` lab Jetson | Verdict: **Released**. Failure profile byte-identical to Step 11 (`4 failed, 48 passed, 3 skipped, 1 xfailed, 1 xpassed`); no NEW cycle-3-scope regressions. AZ-848 / AZ-883 explicitly carried forward to cycle 4. |
## Structural Metrics
`_docs/02_document/architecture_compliance_baseline.md` **still does not exist** — cycle-1 retro Top-3 Improvement Action #3 was NOT delivered in cycles 2 or 3.
Delta vs `structure_2026-05-20.md`:
| Metric | Cycle 1 close | Cycle 3 close | Delta |
|--------|----------------|----------------|-------|
| Component count | 15 | 15 | 0 |
| Source LoC, `src/` Python | 61,071 | 61,071 (+7 net from `fd52cc9` — RouteSpec relocation is net-neutral) | ~0 |
| Cycles in component import graph | 0 | 0 (verified — cycle-3 commit only relocates a type, no new edges) | 0 (healthy) |
| Cross-component edges, count | Concentrated in `runtime_root/` factories | Same | 0 |
| Contract files | 5 | 5 (no new contracts in cycle 3 — refactor cycle) | 0 |
| `architecture_compliance_baseline.md` present | No | **No (carried over gap)** | +0 — *still missing* |
| New Architecture violations this cycle | n/a (no baseline) | 0 (none flagged in cumulative reviews) | n/a |
| Public-API symbol contract coverage % | not computed | not computed | n/a |
A fresh structural snapshot for this retro is **not produced** — the structure is unchanged from cycle 1 (verified via the 7 LoC delta and 0 new components). `structure_2026-05-20.md` remains the current authoritative snapshot. The next cycle that materially changes structure (e.g., AZ-848 contract repair adds a new field to `VioOutput`; cycle-4 C1 work) should re-snapshot.
## Efficiency
| Metric | Cycle 3 value | Cycle 1 value |
|--------|---------------:|---------------:|
| Blocked tasks at cycle close (Tier-2 hardware or otherwise) | 1 in todo/ (AZ-848 deferred) + 1 mirror (AZ-883) — both filed in this retro session, NOT blockers for cycle close | 4 (all Tier-2 hardware rooted) |
| Tasks requiring fixes after review | 1 (batch 108b is a same-day fix follow-up to 108 for a fixture path mismatch — minor) | ~5 |
| Auto-fix loop escalations to user | 0 | 0 |
| Mid-cycle remediation post-mortems | 0 | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 0 | 1 (Step 11 → Step 7 for AZ-618) |
| Mid-cycle ticket splits (NEW: surfaced + split during quality-gate step) | 1 (AZ-848 → split AZ-883 during release-flow investigation) | 0 |
| Process leftovers opened this cycle | 1 (`2026-05-26_evidence_out_default_path.md`) | 1 (D-CROSS-CVE-1 — still open) |
| Process leftovers closed this cycle | 0 | 0 |
### Blocker Analysis
| Blocker Type | Count (cycle 3) | Prevention (carries to cycle 4) |
|--------------|------------------|------------------------------------|
| Jetson tlog-replay path broken at frame 3 (AZ-848) | 1 | Cycle 4 first product task; primary AC: `VioOutput.emitted_at_ns` contract repaired so `add_vio` and `add_fc_imu` share the FC-boot timebase. |
| `_handle_imu` SCALED_IMU2 latent bug (AZ-883) | 1 | Cycle 4; independent of AZ-848; 2 SP. |
| `EVIDENCE_OUT` default path container-only | 1 | Leftover at `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md`; cycle-4 quick win (15 min). |
| OpenCV CVE pin replay condition (D-CROSS-CVE-1) | 1 (carried from cycle 1) | Out-of-band; re-check at every `/autodev` invocation; unchanged across cycles 1-3. |
| Tier-2 hardware/evidence (AZ-595 fixtures, AZ-592/AZ-593 VIO native bindings) | 0 (cycle 3 did not need them; cycle 1 had 4 of these) | Re-emerge in cycle 4 if AZ-595 SITL fixture is sequenced. |
## Trend Comparison
Previous retro: `retro_2026-05-20.md` (cycle 1 close).
### Cycle-1 Top 3 Improvement Actions — fulfillment status
| # | Action | Status at cycle-3 close | Evidence |
|---|--------|-------------------------|----------|
| 1 | Land CR-F1 + CR-F2 hygiene PBIs before any new NFT helper expansion in cycle 2 | **Partial / unclear** — no batch report for CR-F1 / CR-F2 specifically in cycle 2 batches (98-102); but cycle-3 batches do not surface duplicated `csv_evidence_writer` / `fixture_path` helpers, suggesting silent absorption or the work is yet to land | Cycle-2 batches 98-102, cycle-3 batches 104-109 — no new Medium-severity helper-duplication findings |
| 2 | Sequence AZ-595 as first product task of cycle 2 | **Not done** — AZ-595 still listed as backlog item in cycle-1 retro language; no cycle-2 batch references AZ-595; the 17 NFT scenarios likely still skip on `sitl_replay_ready` | Glob `_docs/02_tasks/done/AZ-595*` — file absent from `done/` |
| 3 | Create `architecture_compliance_baseline.md` as Step 6 prerequisite | **Not done** — file still missing at cycle-3 close (verified via glob) | `_docs/02_document/architecture_compliance_baseline.md` does not exist |
**Net assessment**: cycle-1 retro's Top 3 actions were largely not delivered. The cycle-2-retro skip is the proximate cause — without a cycle-2 retro to surface non-delivery, the actions sat invisible.
### Metric Comparison
| Metric | Cycle 1 baseline | Cycle 3 close | Target (cycle 4) |
|--------|-------------------|----------------|-------------------|
| Code-review verdict mix | ~44 % PASS / ~55 % PASS_WITH_WARNINGS / 0 % FAIL | 33 % PASS / 67 % PASS_WITH_WARNINGS / 0 % FAIL | Maintain 0 % FAIL; lift PASS to ≥50 % via AZ-848 fix landing cleanly (a single-finding-batch tends to be PASS) |
| Avg findings per batch (Medium + Low) | ~0.2 | ~0.7 (one Medium in batch 104 + ~3 Lows across 4 PASS_WITH_WARNINGS = ~4 ÷ 6) | ≤ 0.5 |
| Mid-cycle remediation post-mortems | 1 | 0 | 0 |
| Mid-cycle ticket splits | 0 | 1 (AZ-848 → AZ-883) — *good* (correct discipline; not bad churn) | maintain (split discipline) |
| Structural baseline file present | No | **No (gap carried 2 cycles)** | Yes — drop it into cycle 4 Step 6 |
| Cycle-N retro filed at cycle-N close | Yes | **No for cycle 2; yes for cycle 3** | Yes — fix the autodev orchestrator gap |
## Top 3 Improvement Actions (cycle 4)
1. **Land the AZ-848 fix as cycle-4 first product task; bench-verify on Jetson before merging.**
- Impact: unblocks the Jetson e2e tlog-replay path that's been broken since cycle 2 (the AZ-776 xfail removal). Required for any real airborne release. Carries an explicit verification protocol: long-uptime Jetson + freshly-booted FC reproduces deterministically.
- Effort: 5 SP (per the revised spec). The fix touches the C1 `VioOutput.emitted_at_ns` contract and every C1 strategy that fills the field; well-scoped.
- Pair with: AZ-883 (2 SP, `_handle_imu` SCALED_IMU2 ts_ns=0) — independent fix but same investigation surface.
2. **File a cycle-2 retro retroactively + add an autodev sanity check that flags missing retros.**
- Impact: cycle-1 retro's Top-3 actions all sat invisible because no cycle-2 retro re-surfaced them. The autodev orchestrator's Step 17 should refuse to enter Step 9 cycle-N+1 if `retro_*.md` for cycle N is absent. Catches future retro skips at the next session boundary, not 6 weeks later.
- Effort: small (1 SP for the autodev state check; +2 SP to write the catch-up cycle-2 retro from artifacts already on disk).
3. **Land `architecture_compliance_baseline.md` as cycle-4 Step-6 prerequisite (third try).**
- Impact: same rationale as cycle-1 retro Improvement Action #3 — cumulative reviews still cannot emit `## Baseline Delta` sections; structural regressions remain invisible across cycles.
- Effort: ~1 SP (small file; seed from `structure_2026-05-20.md` with 0 violations baseline). The right insertion point is cycle 4's decompose phase; if decompose runs without it, fail-fast and create.
## Suggested Rule / Skill Updates
| File | Change | Rationale |
|------|--------|-----------|
| `.cursor/skills/implement/SKILL.md` (batch self-review or test sub-step) | Add a check: **if the batch removes `@pytest.mark.xfail` decorators from any test**, the same batch MUST include a green test execution against the actual hardware tier the test targets (or explicit `tier-2-only` skip documentation if hardware is unavailable in the batch session). Block PASS verdict without this evidence. | AZ-848 root cause: AZ-776 removed `@xfail` from AC-1/2/5/6 in cycle 2, with AC-7 stating "tests run on Jetson after this task → All five pass". The Jetson run was never performed. The removal predates the 2026-05 `meta-rule.mdc` "Real Results, Not Simulated Ones" rule, but the implement skill's own self-review should enforce it as well. |
| `.cursor/skills/autodev/state.md` or `flows/existing-code.md` (Re-Entry section) | When auto-chaining from Step 17 (Retrospective) to Step 9 (New Task) with `cycle: state.cycle + 1`, FIRST verify that `_docs/06_metrics/retro_<YYYY-MM-DD>.md` exists for the previous cycle. If absent, BLOCK and surface the gap. | Cycle-2 retro was never filed; the orchestrator silently advanced to cycle 3. Cycle-1 retro's Top-3 actions sat invisible as a result. |
| `.cursor/skills/release/SKILL.md` Phase 2 strategy table | Add an explicit row: `bench-test` — bench-rig verification on real hardware via test compose (`docker-compose.test.jetson.yml` style); not a production deploy; collapses Phases 3+4 into one harness run; Phase 5 explicitly N/A; allowed for first-release / refactor-only cycles. | Cycle-3 release used this strategy ad-hoc; the skill's existing table forced a "manual" classification that doesn't quite fit. |
| `.cursor/skills/release/SKILL.md` Phase 1 rollback-readiness | When `.previous-tags.env` does NOT exist AND no `release/*` git tag exists, treat this as "first release" and accept `docker compose down` as the rollback path. Do NOT block on absent rollback target. | First-time release was a Phase 1 blocking gate per the current strict reading; cycle 3's bench-test release had to navigate it inline. |
| `.cursor/skills/test-spec/SKILL.md` (cycle-update mode) | When the cycle-update task list includes a ticket that touches a Protocol / dataclass / contract field semantics (e.g., `VioOutput.emitted_at_ns`), the test-spec sync MUST flag downstream consumers explicitly (e.g., C5 ESKF + C13 FDR both read `emitted_at_ns`). | AZ-848 affected C1 contract semantics; downstream C5 and C13 each read the field. The test-spec sync didn't flag this in cycle 2 when AZ-776 changed adjacent code. |
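
The proposed Phase 1 rollback-readiness rule in the table above can be sketched as a decision function. The function name and return labels are hypothetical; the behaviour when a `release/*` tag exists but `.previous-tags.env` is missing is not stated in this retro and is an assumption here.

```python
from pathlib import Path


def rollback_plan(deploy_dir, git_tags):
    """Decide the rollback path for a release (illustrative labels only)."""
    has_previous = (Path(deploy_dir) / ".previous-tags.env").is_file()
    has_release_tag = any(tag.startswith("release/") for tag in git_tags)
    if has_previous:
        return "rollback-to-previous-tags"
    if not has_release_tag:
        # First release: accept `docker compose down` as the rollback path.
        return "first-release-compose-down"
    # Assumption: a prior release without a rollback target still blocks.
    return "blocked-missing-rollback-target"
```

The tag list is passed in rather than read from git so the check stays easy to test in isolation.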
## Process Leftovers (open at snapshot)
- `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` — OPEN; gtsam numpy<2 ABI replay condition unchanged. Last check: 2026-05-26 in this session.
- `_docs/_process_leftovers/2026-05-26_evidence_out_default_path.md` — OPEN (NEW this cycle); `EVIDENCE_OUT` default path is container-only; Tier-1 host runs need explicit override; workaround documented; 1 SP fix queued for cycle 4.
End of cycle-3 retrospective.
@@ -6,6 +6,30 @@ Ring buffer: trim to the last 15 entries. Categories: `estimation · architectur
---
## 2026-05-26 — [testing] Removing `@pytest.mark.xfail` must be paired with a same-batch run on the actual hardware tier the test targets
**Trigger**: AZ-848 root cause re-diagnosis (2026-05-26). In cycle 2, commit `8de2716 [AZ-776] Open-loop ESKF composition profile via c4_pose.enabled` removed `@xfail` decorators from AC-1/AC-2/AC-5/AC-6 in `test_derkachi_1min.py` with AC-7 in the spec stating "tests run on Jetson after this task → All five pass". The Jetson run was never executed before AZ-776 closed. The latent C1 contract bug (`VioOutput.emitted_at_ns` uses `monotonic_ns` instead of FC-boot-relative timestamps) was therefore not detected until cycle-3 Step 11 — three weeks later. AZ-848 is 5 SP and now blocks all real airborne work in cycle 4.
**What changed**: `.cursor/skills/implement/SKILL.md` batch self-review should add a check: **if the batch removes any `@pytest.mark.xfail` decorator**, the same batch MUST include a green test execution against the test's target tier (or explicit `tier-2-only` skip documentation if the hardware is unavailable in the batch session). Block PASS verdict without this evidence. The AZ-776 removal predates the 2026-05 `meta-rule.mdc` "Real Results, Not Simulated Ones" rule, but the implement skill's own gate should enforce it as well.
Source: `_docs/06_metrics/retro_2026-05-26.md`
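
One way the self-review check described in this entry might detect the trigger condition is to scan the batch's unified diff for removed `xfail` decorators. This is a sketch only: `removed_xfail_lines` is a hypothetical helper, and how the implement skill would obtain the diff text is left open.

```python
def removed_xfail_lines(unified_diff_text):
    """Return removed lines that contain a pytest xfail decorator."""
    hits = []
    for line in unified_diff_text.splitlines():
        if line.startswith("---"):
            # File header ("--- a/path"), not a removed line.
            continue
        if line.startswith("-") and "pytest.mark.xfail" in line:
            hits.append(line[1:].strip())
    return hits


sample_diff = """\
--- a/tests/test_example.py
+++ b/tests/test_example.py
@@ -1,4 +1,3 @@
-@pytest.mark.xfail(reason="pending hardware run")
 def test_example():
     assert True
"""
needs_hardware_evidence = bool(removed_xfail_lines(sample_diff))
```

If the list is non-empty, the batch would then need green evidence from the target tier before a PASS verdict.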
## 2026-05-26 — [process] Autodev must block cycle-N+1 entry if the previous cycle's retro file is missing
**Trigger**: cycle-2 retro was never filed. The autodev orchestrator silently auto-chained from cycle-2 Step 17 (if it ran at all) straight into cycle-3 Step 9 without producing `retro_<cycle2-date>.md`. As a result, cycle-1 retro's Top-3 Improvement Actions sat invisible across cycle 2 and were re-discovered, all three still undelivered, only at cycle-3 close — including `architecture_compliance_baseline.md` (action #3) which is now in its third cycle of being un-delivered.
**What changed**: `.cursor/skills/autodev/state.md` Re-Entry After Completion (or `flows/existing-code.md`) should verify that `_docs/06_metrics/retro_<YYYY-MM-DD>.md` exists for the previous cycle (`state.cycle`) before incrementing the cycle counter and entering Step 9 of cycle N+1. If absent, BLOCK and surface the gap with an A/B/C choice: (A) author the missing retro now, (B) stub a backfilled retro and proceed, (C) abort and ask the user.
Source: `_docs/06_metrics/retro_2026-05-26.md`
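
The existence check described in this entry could be sketched as below. It assumes retros are named `retro_<YYYY-MM-DD>.md` and that the orchestrator knows the previous cycle's start and end dates; both the helper name and the date-range matching are assumptions, not existing autodev behaviour.

```python
from datetime import date
from pathlib import Path


def retro_exists_for_cycle(metrics_dir, cycle_start, cycle_end):
    """True if some retro_<YYYY-MM-DD>.md is dated inside the cycle's date range."""
    for path in Path(metrics_dir).glob("retro_*.md"):
        stamp = path.stem[len("retro_"):][:10]
        try:
            filed = date.fromisoformat(stamp)
        except ValueError:
            # Files that do not follow the date naming are ignored.
            continue
        if cycle_start <= filed <= cycle_end:
            return True
    return False
```

A `False` result is where the A/B/C choice above would be presented before the cycle counter is incremented.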
## 2026-05-26 — [tooling] When investigating bug X reveals a separate latent bug Y, file Y as a new ticket immediately — do not fold Y's scope into X
**Trigger**: AZ-848 evidence-based investigation (2026-05-26) used a pymavlink probe against the Derkachi tlog to verify the original "IMU-vs-IMU clock mismatch" hypothesis. The probe REFUTED the original hypothesis (both `RAW_IMU` and `SCALED_IMU2` share the FC-boot timebase) and SIMULTANEOUSLY surfaced a separate latent bug — `c8_fc_adapter._handle_imu` mis-reads `SCALED_IMU2.time_boot_ms` as `time_usec`, defaulting to 0 for ~half of all IMU samples. Both bugs are real and orthogonal in their fix paths. The decision was to split — AZ-883 (2 SP) gets its own ticket, AZ-848 (5 SP) keeps its tightly-scoped contract repair.
**What changed**: when a deep investigation surfaces a second latent issue that's orthogonal to the primary bug, file the second issue as its own ticket in the same session (with full evidence + reproduction protocol), then resume the primary investigation. Resist the temptation to fold the second issue into the primary ticket's scope "for convenience" — it inflates SP estimates and couples fix landings unnecessarily.
Source: `_docs/06_metrics/retro_2026-05-26.md`
## 2026-05-20 — [testing] Two-tier test policy retired — all tests run on Jetson only
**Trigger**: a `/test-run` invocation on the workstation Tier-1 Docker stack uncovered eight categorically distinct, sequential bugs in the supposedly supported workstation path (Dockerfile `COPY` ordering before editable install, base-image pip too old for `gtsam` pre-release wheels, runtime stage missing the `python3` metapackage that `python3 -m venv` symlinks against, missing `libgl1` / `libglib2.0-0` for `cv2` import, missing `runtime_root/__main__.py` shim, lazy import that never registered the `c6_tile_cache` config block, and a `BUILD_FAISS_INDEX` env flag gap in `docker-compose.test.jetson.yml`). None of these had been hit before because no one had actually executed the workstation Docker stack end-to-end since it was authored; the colocated Jetson Woodpecker agent was the only test environment that ever ran. Maintaining the divergent x86 path was consuming engineering time and producing only false-negative signal, never honest test coverage.
+4 -4
@@ -2,13 +2,13 @@
## Current Step
flow: existing-code
step: 12
name: Test-Spec Sync
status: not_started
step: 10
name: Implement
status: in_progress
sub_step:
phase: 0
name: awaiting-invocation
detail: ""
retry_count: 0
cycle: 3
cycle: 4
tracker: jira
@@ -1,7 +1,7 @@
# D-CROSS-CVE-1 opencv-python pin deferred — gtsam/numpy ABI block
**Recorded**: 2026-05-11T02:55+03:00 (Europe/Kyiv)
**Last replay attempt**: 2026-05-24T05:07+03:00 (Europe/Kyiv) — replay re-checked
**Last replay attempt**: 2026-05-26T13:06+03:00 (Europe/Kyiv) — replay re-checked
at start of next `/autodev` invocation. PyPI re-queried via
`python3 -m pip index versions gtsam`: only `gtsam 4.2` is published.
Replay condition (numpy>=2 stable wheels) still NOT met. Leftover remains open.
@@ -0,0 +1,51 @@
# Leftover: EVIDENCE_OUT default is a hardcoded container path
**Created**: 2026-05-26
**Last replay attempt**: 2026-05-26
**Category**: Test infrastructure defect (non-tracker leftover — code fix, not a deferred tracker write)
**Surfaced by**: autodev cycle 3 Step 15 (Performance Test) — `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` "Findings worth tracking" item 3.
## Problem
`e2e/runner/conftest.py:56`:
```python
default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")
```
The default path `/e2e-results/evidence` is the container mount inside the Tier-1 Docker harness and the Tier-2 Jetson run script. On a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly (no Docker, no Jetson), the `nfr_recorder.pytest_sessionfinish` hook tries to create that directory and fails with:
```
OSError: [Errno 30] Read-only file system: '/e2e-results'
```
(Observed on macOS, where the filesystem root `/` is a read-only volume.) On Linux hosts the same call would raise `PermissionError` instead, because a non-root user cannot create `/e2e-results`.
## Workaround (used today)
```bash
EVIDENCE_OUT="$(pwd)/e2e-results/cycle3-tier1-probe/evidence" \
python -m pytest e2e/tests/performance/ -v --tb=short
```
This produced a clean exit-0 run with the expected 24 SKIPPED outcomes.
## Proposed fix
Change `e2e/runner/conftest.py:56` to default to a workspace-relative path when neither `--evidence-out` nor `EVIDENCE_OUT` is set. Two viable shapes:
1. **Workspace-relative default**: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))`.
2. **Lazy fallback inside the recorder**: leave the default unset; if `evidence_dir` is `None` at session finish, skip emission and warn — useful for `--collect-only` or smoke runs where evidence output is genuinely not needed.
Either shape preserves backward compatibility with the Docker / Jetson scripts (they pass `--evidence-out` explicitly).
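
The two shapes above could look roughly like the sketch below. It is a hedged illustration, not the planned patch: `default_evidence_dir` and `emit_evidence` are hypothetical names, and the real `conftest.py` / `nfr_recorder` wiring may differ. It assumes the conftest lives at `e2e/runner/conftest.py`, so that `parents[2]` of its resolved path is the workspace root.

```python
import os
import warnings
from pathlib import Path
from typing import Mapping, Optional


def default_evidence_dir(
    conftest_file: Path, environ: Mapping[str, str] = os.environ
) -> Path:
    """Shape 1: prefer EVIDENCE_OUT, else a workspace-relative directory."""
    env_value = environ.get("EVIDENCE_OUT")
    if env_value:
        return Path(env_value)
    # e2e/runner/conftest.py -> parents[2] is the workspace root (assumed layout).
    return conftest_file.resolve().parents[2] / "e2e-results" / "evidence"


def emit_evidence(
    evidence_dir: Optional[Path], name: str, text: str
) -> Optional[Path]:
    """Shape 2: skip emission with a warning when no directory is configured."""
    if evidence_dir is None:
        warnings.warn(
            "evidence output not configured; skipping emission", stacklevel=2
        )
        return None
    evidence_dir.mkdir(parents=True, exist_ok=True)
    target = evidence_dir / name
    target.write_text(text, encoding="utf-8")
    return target
```

In the conftest, shape 1 would amount to passing `default_evidence_dir(Path(__file__))` as the option default. Shape 2 keeps the option default as `None` and leaves the decision to the recorder at session finish.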
## Why not fix in this cycle
Per `coderule.mdc` § Scope discipline: "Unrelated issues elsewhere: do not silently fix them as part of this task. Either note them to the user at end of turn and ASK before expanding scope, or record in `_docs/_process_leftovers/` for later handling." Cycle 3 was pre-flight / route-driven seeding work; the EVIDENCE_OUT default has no relationship to that scope. Recording here for either:
- Next cycle's New Task step to pick up as a small (~1 pt) housekeeping ticket, OR
- A drive-by fix during the next test-infrastructure touch (e.g. when AZ-444 Tier-2 harness lands).
## Replay condition
This is a **code-fix leftover**, not a tracker-write leftover. There is nothing to "replay against the tracker". Resolution = land the conftest change above and verify a Tier-1 host run of `pytest e2e/tests/performance/` exits cleanly without `EVIDENCE_OUT` pre-set. Once that PR merges, delete this leftover.