gps-denied-onboard/_docs/_autodev_state.md
Oleksandr Bezdieniezhnykh 288aae881d [AZ-964] FAISS index bootstrap for AZ-839 fixture + build flag
AZ-964 SHIPPED — AZ-840 orchestrator test moves past FAISS gate.

Changes:
* tests/e2e/replay/_faiss_seed.py — extracts the empty HNSW32
  seeding logic from scripts/mk_test_faiss_fixture.py into a
  reusable test-infra module: seed_empty_faiss_index(root_dir,
  *, descriptor_dim=512, backbone_label="ultra_vpr") -> Path.
* scripts/mk_test_faiss_fixture.py rewritten as a thin CLI shim
  importing the same helper. compose `tile-init` contract is
  preserved.
* tests/e2e/replay/conftest.py::_build_operator_pre_flight_cache
  now calls seed_empty_faiss_index(cache_root) immediately before
  build_descriptor_index(config), so the factory's _load() finds
  a valid .index + .sha256 + .meta.json triplet at the fixture's
  override root_dir. populate_c6_from_route later in the fixture
  rebuilds the real index once route tiles are downloaded.
* docker-compose.test.jetson.yml: BUILD_PYTORCH_FP16_RUNTIME: "ON"
  added to e2e-runner.environment. Scope creep documented honestly
  in the spec — Tier-2 surfaced this third config gap on the same
  fixture chain while validating AZ-964 (RuntimeNotAvailableError:
  ... the flag is OFF). One-line wiring; the dustynv/l4t-pytorch
  base image bakes the Tegra-tuned PyTorch wheel and
  pytorch_fp16_runtime.py exists, so flag flip is sufficient.
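
A minimal sketch of the triplet layout that the factory's _load() is described as expecting (.index + .sha256 + .meta.json). It is not the code in tests/e2e/replay/_faiss_seed.py. The function names, the `descriptor` file stem, the sidecar file names, and the metadata keys are assumptions made for illustration. The index payload is passed in as raw bytes so the sketch runs without FAISS; in the real helper the bytes would presumably come from an empty HNSW32 index.

```python
import hashlib
import json
from pathlib import Path


def seed_index_triplet_sketch(root_dir, index_bytes, *, descriptor_dim=512,
                              backbone_label="ultra_vpr", stem="descriptor"):
    """Write <stem>.index plus .sha256 and .meta.json sidecars (hypothetical layout)."""
    root = Path(root_dir)
    root.mkdir(parents=True, exist_ok=True)

    # The index payload itself; the caller supplies the serialized bytes.
    index_path = root / f"{stem}.index"
    index_path.write_bytes(index_bytes)

    # Checksum sidecar so a loader can detect a truncated or stale index.
    digest = hashlib.sha256(index_bytes).hexdigest()
    (root / f"{stem}.sha256").write_text(digest + "\n", encoding="utf-8")

    # Metadata sidecar; these keys are assumed, not taken from the repo.
    meta = {
        "descriptor_dim": descriptor_dim,
        "backbone_label": backbone_label,
        "vector_count": 0,
    }
    (root / f"{stem}.meta.json").write_text(json.dumps(meta, indent=2),
                                            encoding="utf-8")
    return index_path


def triplet_is_valid(index_path):
    """Return True when all three files exist and the checksum matches."""
    index_path = Path(index_path)
    root, stem = index_path.parent, index_path.stem
    sha_path = root / f"{stem}.sha256"
    meta_path = root / f"{stem}.meta.json"
    if not all(p.is_file() for p in (index_path, sha_path, meta_path)):
        return False
    recorded = sha_path.read_text(encoding="utf-8").strip()
    return hashlib.sha256(index_path.read_bytes()).hexdigest() == recorded
```

Seeding the fixture's override root_dir before the factory runs is what moves the failure past the FAISS gate: the loader finds a consistent triplet in the tmp dir instead of a missing .index file.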

Tier-2 verdict (4F / 48P / 3S / 1XF / 1XP in 86.07s, 0 errors;
was 2 errors before this commit): AZ-840 orchestrator test moves
from ERROR at FAISS gate to SKIP at empty-backbones gate, exactly
the AZ-965 gate AZ-964 AC-3 promised.
test_operator_pre_flight_integration SKIPs cleanly too. The 4
derkachi_1min ESKF-divergence FAILs are constant across all three
runs today (AZ-963 path, independent of orchestrator chain).

Three Tier-2 runs today on the orchestrator chain:
  i.   pre-AZ-962: SKIP at env-var gate
  ii.  post-AZ-962: ERROR at FAISS gate
  iii. post-AZ-964: SKIP at backbones gate (AZ-965)

Cycle-4 e2e gate still NOT GREEN. Orchestrator chain remaining =
AZ-965 (NetVLAD backbone provisioning); 60s smoke chain remaining
= AZ-963 (ESKF divergence). OKVIS2 deferral directive unchanged.
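
For context on the AZ-963 failures: the logged error reports mahalanobis²=212.311 > 100.0, which reads like an innovation gate on the squared Mahalanobis distance. The sketch below shows that kind of check for a 2-D innovation. It is an illustration under assumptions, not the estimator's code: the function names, the 2-D restriction, and the FilterDivergenceSketch exception (standing in for EstimatorFatalError) are hypothetical. Only the 100.0 threshold comes from the log.

```python
GATE_THRESHOLD = 100.0  # threshold value taken from the logged error message


class FilterDivergenceSketch(RuntimeError):
    """Hypothetical stand-in for the estimator's fatal-error type."""


def mahalanobis_sq_2d(innovation, cov):
    """Squared Mahalanobis distance nu^T S^-1 nu for a 2-D innovation.

    `innovation` is (x, y); `cov` is the 2x2 innovation covariance S.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0.0 or det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    x, y = innovation
    # S^-1 = (1 / det) * [[d, -b], [-c, a]], expanded into the quadratic form.
    return (d * x * x - (b + c) * x * y + a * y * y) / det


def check_innovation_gate(d2, threshold=GATE_THRESHOLD):
    """Raise when the squared distance exceeds the gate; otherwise return it."""
    if d2 > threshold:
        raise FilterDivergenceSketch(
            f"filter divergence: mahalanobis²={d2:.3f} > {threshold}"
        )
    return d2
```

The same innovation can pass or fail depending on the covariance: (12, 9) against an identity covariance gives 225 and trips the gate, while against 4·I it gives 56.25 and passes.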

Pre-existing yamllint false positive on
docker-compose.test.jetson.yml:185 (sibling `volumes:` keys flagged
as duplicates without respecting parent-key scope); PyYAML parses
the file cleanly with no duplicates and docker-compose accepts it
at runtime.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-29 17:02:49 +03:00


Autodev State

Current Step

flow: existing-code
step: 10
name: Implement
status: in_progress
sub_step:
  phase: 6
  name: implement-tasks
  detail: "batch 11 = AZ-964 SHIPPED end-to-end. Extracted FAISS empty-index seeding logic into tests/e2e/replay/_faiss_seed.py (exposes seed_empty_faiss_index(root_dir, *, descriptor_dim=512, backbone_label='ultra_vpr') -> Path); rewrote scripts/mk_test_faiss_fixture.py as a thin CLI shim importing the new module (compose tile-init contract preserved); wired seed_empty_faiss_index(cache_root) into tests/e2e/replay/conftest.py::_build_operator_pre_flight_cache immediately before build_descriptor_index(config). Scope creep documented honestly: also added BUILD_PYTORCH_FP16_RUNTIME: ON to docker-compose.test.jetson.yml because Tier-2 surfaced a third config gap (RuntimeNotAvailableError: ... the flag is OFF) on the same fixture chain. Tier-2 re-run on Jetson AGX Orin: 4 failed / 48 passed / 3 skipped / 1 xfailed / 1 xpassed in 86.07s, 0 errors (was 2 errors). AZ-840 orchestrator moves from ERROR (FAISS gate) to SKIP at empty-backbones gate, exactly the AZ-965 gate AZ-964's AC-3 promised. AZ-964 → Done in Jira (read-back verified). AZ-964 spec moved todo/ → done/. Three Tier-2 runs today: (i) pre-AZ-962 = SKIP env-var; (ii) post-AZ-962 = ERROR FAISS; (iii) post-AZ-964 = SKIP backbones. Cycle-4 e2e still NOT GREEN; orchestrator chain remaining = AZ-965 (backbones); 60s smoke chain remaining = AZ-963. OKVIS2 deferral unchanged. Earlier same-day batch 10 = AZ-962 SHIPPED end-to-end + 2 new tickets filed.
Implemented configs/operator_replay.yaml (registers c6/c7/c10/c11 with defaults; backbones: [] intentionally, see AZ-965); docker-compose.test.jetson.yml exports GPS_DENIED_OPERATOR_CONFIG_PATH=/opt/configs/operator_replay.yaml and bind-mounts ./configs:/opt/configs:ro; ENV_KEY_MAP (src/gps_denied_onboard/config/loader.py) gained two entries (SATELLITE_PROVIDER_URL → c11.satellite_provider_url, SATELLITE_PROVIDER_API_KEY → c11.service_api_key) so secrets flow from .env.test and never land in YAML; README dropped the manual export step. 97/97 c11+config unit tests stay green. Tier-2 re-run on Jetson AGX Orin (JETSON_SSH_ALIAS=jetson bash scripts/run-tests-jetson.sh): 4 failed / 48 passed / 1 skipped / 1 xfailed / 1 xpassed / 2 errors in 84.99s, i.e. -2 skipped, +2 errors vs the prior baseline. AZ-962 AC-3 + AC-4 satisfied: AZ-840 orchestrator no longer SKIPs at env-var; it now ERRORs at a deeper, real gate during fixture setup with IndexUnavailableError: FaissDescriptorIndex: .index file missing at /tmp/pytest-of-root/pytest-0/operator_pre_flight_cache0/descriptor.index. The same error in test_operator_pre_flight_integration.py::test_operator_pre_flight_setup_produces_populated_cache confirms a fixture-wide bug, not a single-test issue. Root cause: conftest.py:487 calls build_descriptor_index(config) against a fresh empty c6_tile_cache.root_dir (tmp dir per AZ-839 invariant), but the FAISS factory needs an existing .index. The tile-init compose service exists but writes its seed to /var/lib/gps-denied/tiles, not the tmp dir the fixture overrides into. Filed AZ-964 (3 SP, To Do, FAISS index bootstrap; preferred fix = invoke mk_test_faiss_fixture.py inline against override root_dir) and AZ-965 (3 SP, To Do, blocked by AZ-964, NetVLAD ONNX backbone provisioning: the next gate after FAISS clears). AZ-962 transitioned To Do → In Progress → Done in Jira (read-back verified). AZ-962 spec moved todo/ → done/.
Cycle-4 e2e gate still NOT GREEN: AZ-840 chain is now AZ-964 → AZ-965 → orchestrator PASS; 60s smoke is AZ-963 → 4 derkachi_1min tests PASS. OKVIS2 deferral directive still in force (not yet met). Earlier same-day batch 9 = Tier-2 Jetson e2e validation run NOT GREEN. Ran JETSON_SSH_ALIAS=jetson bash scripts/run-tests-jetson.sh; result = 4 failed / 48 passed / 3 skipped / 1 xfailed / 1 xpassed in 90.59s. Two distinct blockers: (1) AZ-840 orchestrator test SKIPPED because GPS_DENIED_OPERATOR_CONFIG_PATH not exported by docker-compose.test.jetson.yml AND operator_replay.yaml missing from repo — Epic AZ-835's 'Done' status was validated by doc-content only, never by actual orchestrator test execution; (2) AZ-895 fallout — 4 tests in test_derkachi_1min.py regress with EstimatorFatalError('eskf filter divergence: mahalanobis²=212.311 > 100.0') at frame 233 because the CSV-driven path (now primary) runs open-loop on the Derkachi fixture (no reference C6 tile cache → no satellite anchoring). Filed AZ-962 (3 SP, operator config + compose wiring) and AZ-963 (3 SP, ESKF regression triage). OKVIS2 chain stays deferred per user 2026-05-29 directive ('after Derkachi e2e green' — directive unchanged; e2e not green). AZ-842 caveat: the AZ-840/AZ-842 'Done' tracker state set earlier today is contingent on whether convention (A) 'In Testing = shipped' or (B) 'Done = shipped+tested' applies; user-skipped convention question, leftover holds the walk-back payload if needed. Cycle-4 not green. Earlier same-day batch 8 = tracker-only fix for AZ-842 (To Do → Done, read-back verified) + wider Jira drift audit recorded as _docs/_process_leftovers/2026-05-29_jira_status_drift_audit.md. 10 cycle-3/4 tickets (AZ-836/838/839/840/894/895/896/899/900/901) shipped to done/ locally but stuck in 'In Testing' in Jira; Epic AZ-835 in todo/ with all 5 children done. User skipped A/B/C/D convention question — leftover holds the bulk-transition payload for whichever convention they pick. 
Corrected cycle-4 todo/ remainder: nothing actionable. Earlier narratives that listed AZ-899/900/901 as 'cycle-4 todo/ remainder for next batches' were fiction; those specs have been in done/ the whole time. OKVIS2 chain (AZ-943/951/952) sits in todo/ but is deferred per user 2026-05-29 directive until after Derkachi e2e flight test passes. Cycle-4 product work is effectively complete pending Derkachi e2e green + AZ-897 UI in ../ui."
retry_count: 0
cycle: 4
tracker: jira
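
The batch 10 notes describe ENV_KEY_MAP entries that route secrets from the environment into c11 config keys so they never land in YAML. A minimal sketch of how such an overlay could work is below. It is not the code in src/gps_denied_onboard/config/loader.py: the function name apply_env_overrides, the dotted-path handling, and the `timeout_s` key in the usage example are assumptions. The two map entries follow the env-var-to-key reading of the note above.

```python
import copy
import os

# Environment variable -> dotted config path (entries as described in batch 10).
ENV_KEY_MAP = {
    "SATELLITE_PROVIDER_URL": "c11.satellite_provider_url",
    "SATELLITE_PROVIDER_API_KEY": "c11.service_api_key",
}


def apply_env_overrides(config, env=None, key_map=None):
    """Return a copy of `config` with mapped environment values overlaid.

    Unset variables are skipped, so YAML defaults survive when no secret
    is provided. The input mapping is never mutated.
    """
    env = os.environ if env is None else env
    key_map = ENV_KEY_MAP if key_map is None else key_map
    merged = copy.deepcopy(config)
    for env_name, dotted_path in key_map.items():
        value = env.get(env_name)
        if value is None:
            continue
        *parents, leaf = dotted_path.split(".")
        node = merged
        for part in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return merged
```

For example, `apply_env_overrides({"c11": {"timeout_s": 5}}, {"SATELLITE_PROVIDER_URL": "https://example.invalid"})` keeps `timeout_s` and adds `satellite_provider_url` under `c11`, leaving `service_api_key` absent because its variable is unset.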