AZ-964 SHIPPED — AZ-840 orchestrator test moves past FAISS gate. Changes: * tests/e2e/replay/_faiss_seed.py — extracts the empty HNSW32 seeding logic from scripts/mk_test_faiss_fixture.py into a reusable test-infra module: seed_empty_faiss_index(root_dir, *, descriptor_dim=512, backbone_label="ultra_vpr") -> Path. * scripts/mk_test_faiss_fixture.py rewritten as a thin CLI shim importing the same helper. compose `tile-init` contract is preserved. * tests/e2e/replay/conftest.py::_build_operator_pre_flight_cache now calls seed_empty_faiss_index(cache_root) immediately before build_descriptor_index(config), so the factory's _load() finds a valid .index + .sha256 + .meta.json triplet at the fixture's override root_dir. populate_c6_from_route later in the fixture rebuilds the real index once route tiles are downloaded. * docker-compose.test.jetson.yml: BUILD_PYTORCH_FP16_RUNTIME: "ON" added to e2e-runner.environment. Scope creep documented honestly in the spec — Tier-2 surfaced this third config gap on the same fixture chain while validating AZ-964 (RuntimeNotAvailableError: ... the flag is OFF). One-line wiring; the dustynv/l4t-pytorch base image bakes the Tegra-tuned PyTorch wheel and pytorch_fp16_runtime.py exists, so flag flip is sufficient. Tier-2 verdict (4F / 48P / 3S / 1XF / 1XP in 86.07s, 0 errors — was 2 errors before this commit): AZ-840 orchestrator test moves from ERROR at FAISS gate to SKIP at empty-backbones gate — exactly the AZ-965 gate AZ-964 AC-3 promised. test_operator_pre_flight_ integration SKIPs cleanly too. The 4 derkachi_1min ESKF-divergence FAILs are constant across all three runs today (AZ-963 path, independent of orchestrator chain). Three Tier-2 runs today on the orchestrator chain: i. pre-AZ-962: SKIP at env-var gate ii. post-AZ-962: ERROR at FAISS gate iii. post-AZ-964: SKIP at backbones gate (AZ-965) Cycle-4 e2e gate still NOT GREEN. Orchestrator chain remaining = AZ-965 (NetVLAD backbone provisioning); 60s smoke chain remaining = AZ-963 (ESKF divergence). OKVIS2 deferral directive unchanged. Pre-existing yamllint false positive on docker-compose.test.jetson .yml:185 (sibling `volumes:` keys flagged as duplicates without respecting parent-key scope) — PyYAML parses cleanly with no duplicates and docker-compose accepts the file at runtime. Co-authored-by: Cursor <cursoragent@cursor.com>
5.9 KiB
Autodev State
Current Step
flow: existing-code
step: 10
name: Implement
status: in_progress
sub_step:
phase: 6
name: implement-tasks
detail: "batch 11 = AZ-964 SHIPPED end-to-end. Extracted FAISS empty-index seeding logic into tests/e2e/replay/_faiss_seed.py (exposes seed_empty_faiss_index(root_dir, *, descriptor_dim=512, backbone_label='ultra_vpr') -> Path); rewrote scripts/mk_test_faiss_fixture.py as a thin CLI shim importing the new module (compose tile-init contract preserved); wired seed_empty_faiss_index(cache_root) into tests/e2e/replay/conftest.py::_build_operator_pre_flight_cache immediately before build_descriptor_index(config). Scope creep documented honestly: also added BUILD_PYTORCH_FP16_RUNTIME: ON to docker-compose.test.jetson.yml because Tier-2 surfaced a third config gap (RuntimeNotAvailableError: ... the flag is OFF) on the same fixture chain. Tier-2 re-run on Jetson AGX Orin: 4 failed / 48 passed / 3 skipped / 1 xfailed / 1 xpassed in 86.07s, 0 errors (was 2 errors). AZ-840 orchestrator moves from ERROR (FAISS gate) to SKIP at empty-backbones gate — exactly the AZ-965 gate AZ-964's AC-3 promised. AZ-964 → Done in Jira (read-back verified). AZ-964 spec moved todo/ → done/. Three Tier-2 runs today: (i) pre-AZ-962 = SKIP env-var; (ii) post-AZ-962 = ERROR FAISS; (iii) post-AZ-964 = SKIP backbones. Cycle-4 e2e still NOT GREEN; orchestrator chain remaining = AZ-965 (backbones); 60s smoke chain remaining = AZ-963. OKVIS2 deferral unchanged. Earlier same-day batch 10 = AZ-962 SHIPPED end-to-end + 2 new tickets filed. Implemented configs/operator_replay.yaml (registers c6/c7/c10/c11 with defaults; backbones: [] intentionally — see AZ-965), docker-compose.test.jetson.yml exports GPS_DENIED_OPERATOR_CONFIG_PATH=/opt/configs/operator_replay.yaml + bind-mounts ./configs:/opt/configs:ro, ENV_KEY_MAP (src/gps_denied_onboard/config/loader.py) gained two entries (SATELLITE_PROVIDER_URL → c11.satellite_provider_url, SATELLITE_PROVIDER_API_KEY → c11.service_api_key) so secrets flow from .env.test and never land in YAML, and README dropped the manual export step. 97/97 c11+config unit tests stay green. Tier-2 re-run on Jetson AGX Orin (JETSON_SSH_ALIAS=jetson bash scripts/run-tests-jetson.sh): 4 failed / 48 passed / 1 skipped / 1 xfailed / 1 xpassed / 2 errors in 84.99s — i.e. -2 skipped, +2 errors vs the prior baseline. AZ-962 AC-3 + AC-4 satisfied: AZ-840 orchestrator no longer SKIPs at env-var; it now ERRORs at a deeper, real gate during fixture setup with IndexUnavailableError: FaissDescriptorIndex: .index file missing at /tmp/pytest-of-root/pytest-0/operator_pre_flight_cache0/descriptor.index. Same error in test_operator_pre_flight_integration.py::test_operator_pre_flight_setup_produces_populated_cache confirms fixture-wide bug, not a single-test issue. Root cause: conftest.py:487 calls build_descriptor_index(config) against a fresh empty c6_tile_cache.root_dir (tmp dir per AZ-839 invariant) — FAISS factory needs an existing .index. tile-init compose service exists but writes its seed to /var/lib/gps-denied/tiles, not the tmp dir the fixture overrides into. Filed AZ-964 (3 SP, To Do, FAISS index bootstrap; preferred fix = invoke mk_test_faiss_fixture.py inline against override root_dir) and AZ-965 (3 SP, To Do, blocked by AZ-964, NetVLAD ONNX backbone provisioning — the next gate after FAISS clears). AZ-962 transitioned To Do → In Progress → Done in Jira (read-back verified). AZ-962 spec moved todo/ → done/. Cycle-4 e2e gate still NOT GREEN: AZ-840 chain is now AZ-964 → AZ-965 → orchestrator PASS; 60s smoke is AZ-963 → 4 derkachi_1min tests PASS. OKVIS2 deferral directive still in force (not yet met). Earlier same-day batch 9 = Tier-2 Jetson e2e validation run NOT GREEN. Ran JETSON_SSH_ALIAS=jetson bash scripts/run-tests-jetson.sh; result = 4 failed / 48 passed / 3 skipped / 1 xfailed / 1 xpassed in 90.59s. Two distinct blockers: (1) AZ-840 orchestrator test SKIPPED because GPS_DENIED_OPERATOR_CONFIG_PATH not exported by docker-compose.test.jetson.yml AND operator_replay.yaml missing from repo — Epic AZ-835's 'Done' status was validated by doc-content only, never by actual orchestrator test execution; (2) AZ-895 fallout — 4 tests in test_derkachi_1min.py regress with EstimatorFatalError('eskf filter divergence: mahalanobis²=212.311 > 100.0') at frame 233 because the CSV-driven path (now primary) runs open-loop on the Derkachi fixture (no reference C6 tile cache → no satellite anchoring). Filed AZ-962 (3 SP, operator config + compose wiring) and AZ-963 (3 SP, ESKF regression triage). OKVIS2 chain stays deferred per user 2026-05-29 directive ('after Derkachi e2e green' — directive unchanged; e2e not green). AZ-842 caveat: the AZ-840/AZ-842 'Done' tracker state set earlier today is contingent on whether convention (A) 'In Testing = shipped' or (B) 'Done = shipped+tested' applies; user-skipped convention question, leftover holds the walk-back payload if needed. Cycle-4 not green. Earlier same-day batch 8 = tracker-only fix for AZ-842 (To Do → Done, read-back verified) + wider Jira drift audit recorded as _docs/_process_leftovers/2026-05-29_jira_status_drift_audit.md. 10 cycle-3/4 tickets (AZ-836/838/839/840/894/895/896/899/900/901) shipped to done/ locally but stuck in 'In Testing' in Jira; Epic AZ-835 in todo/ with all 5 children done. User skipped A/B/C/D convention question — leftover holds the bulk-transition payload for whichever convention they pick. Corrected cycle-4 todo/ remainder: nothing actionable. Earlier narratives that listed AZ-899/900/901 as 'cycle-4 todo/ remainder for next batches' were fiction — those specs have been in done/ the whole time. OKVIS2 chain (AZ-943/951/952) sits in todo/ but is deferred per user 2026-05-29 directive until after Derkachi e2e flight test passes. Cycle-4 product work is effectively complete pending Derkachi e2e green + AZ-897 UI in ../ui."
retry_count: 0
cycle: 4
tracker: jira