gps-denied-onboard

mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 10:21:13 +00:00

Author	SHA1	Message	Date
Oleksandr Bezdieniezhnykh	6599d828d2	[AZ-407] [AZ-444] [AZ-445] Batch 68: fixtures, Tier-2 harness, NFR reporter Three blackbox-harness tasks landed together — all depend only on AZ-406 and unblock the FT-* / NFT-* scenario tasks scheduled for batches 69+. AZ-407 — Static fixture builders (3pt): * tile-cache-builder/{builder.py, Dockerfile, build.sh} produces a deterministic tile-cache-fixture Docker volume from _docs/00_problem/input_data/. Reproducibility primitives: sorted iteration, frozen PIL JPEG settings, FAISS HNSW32 built single- threaded with seeded stub descriptors. * age-injector/{age_injector.py, inject.sh} clones the volume and shifts capture_date by N×30.44 days; tile JPEG bytes preserved bit-identical. Emits synth-age-7mo + synth-age-13mo volumes. * cold-boot/cold_boot_fixture.json: frozen FC pose snapshot at Derkachi sector centre, schema v1. * secrets/mavlink-test-passkey.txt: 64-hex with required `# TEST ONLY` header line per AC-5. Passkey-equality test now compares the secret line after stripping the header. * security/cve-2025-53644.jpg: synthetic 158-byte malformed JPEG (truncated SOS marker). OpenCV 4.11.x rejects gracefully with imdecode → None. AZ-439 will sharpen for ASan instrumentation. * Top-level Makefile with `make fixtures` / `make fixtures-` / `make e2e-tier1` / `make unit-tests` targets. AZ-444 — Tier-2 Jetson harness wrapper (5pt): * run-tier2.sh rewritten as orchestrator. Detects local (aarch64 + TIER2_HOST=localhost) vs remote (ssh into TIER2_HOST). New flags: -k/--selector, --build-kind production\|asan, --reflash (gated behind TIER2_REFLASH_ACK=1 two-key gate), --dry-run. * tier2-on-jetson.sh (new) — on-device delegate. Verifies gps-denied-onboard{,-asan}.service health; restarts with 5s tolerance; spawns tegrastats + jtop parallel samplers; tails ASan unit's journal in asan mode; drives docker compose with TIER=tier2-jetson; forwards SELECTOR to pytest -k. * docker/run-tier1.sh (new) — selector-parity sibling. * AC-1 (selector parity) and AC-6 (reflash gating) unit-tested via --dry-run output assertions. AC-2/AC-3/AC-4/AC-5 are hardware- loop ACs verified by the Tier-2 runtime smoke (no Jetson in the unit-test layer). AZ-445 — CSV reporter + evidence bundler refinements (2pt): * reporting/nfr_recorder.py (new) — pytest plugin. Provides the `nfr_recorder` fixture with record_metric(name, value, ac_id) and partial(ac_id, reason). At session end emits: - per-nfr/<scenario_id>.json (AC-1) - traceability-status.json with every AC ID parsed from traceability-matrix.md, classified Covered/PARTIAL/NOT COVERED with source scenario IDs (AC-2) - regression-baseline.json with all numeric metrics (AC-3) * csv_reporter.py extended — `_outcome_to_result` consults the aggregator; rows flip PASS → PARTIAL when an AC was marked PARTIAL by nfr_recorder (AC-4). Graceful fallback when aggregator isn't registered (unit-test contexts). * conftest.py registers nfr_recorder in pytest_plugins. * New --traceability-matrix CLI flag seeds the NOT COVERED rows. Build / config: * pyproject.toml dev extras: added Pillow>=10.4,<13.0 for the tile-cache-builder unit test (broad enough to keep torchvision's Pillow 12 pin happy; the production builder runs inside its own Docker image with its own pin). * Updated test_directory_layout.py to cover 10 new files + replaced the byte-equal passkey assertion with the header-stripping variant. Test results: * 157 focused tests pass (was 97 in batch 67; +60 new across this batch). No regressions. Module-layout / spec drift: * AZ-407 spec text says `tests/fixtures/...`; module-layout blackbox_tests entry (commit `d7a17a8`) authoritatively places the harness under `e2e/`. Implementation followed the layout entry. * AZ-444 spec mentions `e2e/tier2/run-tier2.sh`; AZ-406 placed it at `e2e/jetson/run-tier2.sh`. Kept at `e2e/jetson/` for consistency. * Cold-boot README ownership: corrected from AZ-419 to AZ-407 per AZ-419's own Dependencies field. Specs archived to _docs/02_tasks/done/. Jira tickets transitioned to In Testing on commit. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 17:18:01 +03:00
Oleksandr Bezdieniezhnykh	e9e6e32097	[AZ-406] Update autodev state: batch 67 closed, batch 68 pending Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 16:23:40 +03:00
Oleksandr Bezdieniezhnykh	59d9116d36	[AZ-406] Blackbox test harness bootstrap (Tier-1 + Tier-2 scaffold) Bootstraps the public-boundary blackbox test harness owned by epic AZ-262 (E-BBT). Establishes the e2e/ directory tree at the repo root, fully separated from src/gps_denied_onboard/ and from the in-process tests/ tree, and commits to the contracts every subsequent test ticket (AZ-407..AZ-446) builds against. Tier-1 (workstation Docker): - docker/docker-compose.test.yml wires SUT + ArduPilot SITL + iNav SITL + mock Suite Sat Service + mavproxy listener + e2e-runner onto one e2e-net bridge with internal: true (enforces RESTRICT-SAT-1 / NFT-SEC-02 egress isolation at the network layer). - docker/docker-compose.tier2-bridge.yml override disables the in- compose SUT so Tier-2 pairs SITLs + mock + runner on an x86 host while the SUT runs natively on the Jetson under systemd. Tier-2 (Jetson): - jetson/run-tier2.sh + tier2.service systemd unit + tegrastats / jtop parsers feed per-sample telemetry into the evidence bundle. Runner image (e2e/runner/): - Dockerfile + requirements.txt install ONLY ground-side libs (pymavlink, opencv-python>=4.12, numpy/scipy/geopy/pyproj, httpx, orjson, pydantic, structlog, pytest 8.x). The runner deliberately does NOT install the SUT package. - conftest.py implements the AC-9 skip-rule mapping (tier2_only, chamber_only, vins_mono, deferred_ac) tied to environment.md parametrize axes. - reporting/csv_reporter.py is a pytest plugin emitting one row per test with the exact 11-column schema from environment.md § Reporting (test_id, test_name, traces_to, fc_adapter, vio_strategy, tier, started_at_utc, execution_time_ms, result, error_message, evidence_paths). XFAIL surfaced only when a test carries @pytest.mark.deferred_ac(verdict="xfail", reason=...). - reporting/evidence_bundler.py exposes the attach_evidence fixture that copies per-test artifacts (.tlog, FDR archives, screenshots, tegrastats / jtop CSVs) into the run bundle and records relative paths into the reporter's evidence_paths column. - helpers/{frame_source_replay,imu_replay,sitl_observer, mavproxy_tlog_reader,fdr_reader}.py declare the public surfaces (concrete implementations owned by AZ-407 / AZ-408 / AZ-416 / AZ-417 / AZ-441 per the dependency table); helpers/geo.py ships today (no downstream task dep) — WGS84 distance / forward-bearing / offset via pyproj with NaN rejection. Mock Suite Sat Service (e2e/fixtures/mock-suite-sat/): - FastAPI app: POST /tiles (ingest contract from D-PROJ-2 follow-up), GET /tiles/audit + /mock/audit (per-run read-back), POST /mock/config (force-status, response delay), POST /mock/reset (clears audit between tests), GET /mock/health. Fixture scaffolds (e2e/fixtures/{tile-cache-builder, age-injector, injectors, cold-boot, secrets, security}/): - Public surfaces only. Concrete builders land in AZ-407 (static fixtures), AZ-408 (runtime synthetic injection), AZ-419 (cold-boot fixture), AZ-439 (CVE-2025-53644 JPEG generator). Test tree (e2e/tests/{positive,negative,performance,resilience, security,resource_limit}/): - Mirror of the test-spec category grouping in _docs/02_document/tests/-tests.md. - tests/positive/test_smoke.py is the AC-1 harness-boot smoke run inside the e2e-runner image once Docker brings everything up. Out-of-container unit tests (e2e/_unit_tests/): - Exercises the harness internals (CSV reporter plugin lifecycle, conftest skip rules, helper modules, parsers, mock app, compose YAML structural contract, public-boundary enforcement) without Docker / SITL. 97 unit tests, all passing. Build / config: - pyproject.toml: testpaths extended with e2e/_unit_tests; pythonpath extended with e2e; fastapi>=0.111,<0.120 added to dev extras for the mock-app TestClient unit test. AC coverage: - AC-1 (Tier-1 boot) → compose YAML test + directory layout + smoke test (Docker-bound) - AC-2 (mock services) → 6 FastAPI TestClient unit tests - AC-3 (SITLs accept output) → contract present; concrete check deferred to AZ-416 / AZ-417 - AC-4 (CSV columns) → in-process plugin lifecycle test emits the exact 11-column schema - AC-5 (egress isolation) → static config test + runtime probe in Docker-bound smoke - AC-6 (Tier-2 contract) → tegrastats + jtop parser unit tests + jetson/ layout test; full Tier-2 contract is AZ-444 - AC-7 (fixture reproducibility) → deferred to AZ-407 per task spec - AC-8 (parametrize matrix) → vins_mono skip-rule cases + tests/positive/test_smoke - AC-9 (skip semantics) → 9 conftest skip-rule unit tests Module layout entry for blackbox_tests was added in 2026-05-16 preparatory commit `d7a17a8` so this diff stays focused on the harness scaffold. AZ-406 advances to In Testing on commit. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 16:22:44 +03:00
Oleksandr Bezdieniezhnykh	d7a17a8248	[AZ-406] Add blackbox_tests cross-cutting entry to module-layout.md The 41 blackbox/e2e test tasks (AZ-406..AZ-446 under epic AZ-262) all declare Component=Blackbox Tests, but module-layout.md had no matching Per-Component Mapping entry. The implement skill's Step 4 (File Ownership) requires every batch's component to be resolvable in module-layout.md. Add a `blackbox_tests` entry in the Shared / Cross-Cutting section that owns the top-level `e2e/` directory (separate from `tests/`), documents the public-boundary discipline (no SUT imports), and clarifies that boundary-driven performance/resilience/security scenarios live under `e2e/tests/<category>/` rather than under `tests/perf\|security\|resilience/`. Also update Layout Rule #7 to reflect the harness split and the state file's sub_step to parse-and-detect-progress (Step 10 entry). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 16:01:43 +03:00
Oleksandr Bezdieniezhnykh	fa38bfe608	Step 9: Decompose Tests — already complete in prior cycle 41 blackbox test task specs (AZ-406..AZ-446) under epic AZ-262 already exist in _docs/02_tasks/todo/. Dependencies table reflects them (155 = 114 product + 41 test, 133 blackbox-test pts). tests/e2e/conftest.py + tests/e2e/Dockerfile placeholders confirm the bootstrap was decomposed in a prior pass. Folder fallback for Step 9 is satisfied. No new work executed. State advanced to Step 10 (Implement Tests) — session boundary per greenfield flow; suggest fresh conversation before continuing. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 14:14:58 +03:00
Oleksandr Bezdieniezhnykh	7a71579428	Step 8: Code Testability Revision — no changes needed Autodev greenfield Step 8 closes with outcome "Code is testable — no changes needed" after reviewing the 41 test scenarios in _docs/02_document/tests/ against the codebase against the Step-8 allowed-changes checklist. Key findings: - Hardcoded paths are config defaults, overridable via Config dataclass - All mutable registries expose clear__registry()/_reset_for_tests() - Hot-path timing uses injected Clock; cosmetic timestamps are monkeypatch-safe (2105-test unit suite proves it) - Heavy strategies (OKVIS2, VINS-Mono, FAISS, TRT) are BUILD_ gated - compose_root(pre_constructed=...) (AZ-591) is the Tier-1 injection seam; tests/e2e/replay already drives it end-to-end Artifacts: - _docs/04_refactoring/01-testability-refactoring/ testability_assessment.md - State advanced to Step 9 (Decompose Tests) - last_step_outcomes.step_8 recorded Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 13:05:43 +03:00
Oleksandr Bezdieniezhnykh	55ddcb70d3	[AZ-591] State: advance Step 7 to Step 8 (Code Testability Rev.) Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 12:59:50 +03:00
Oleksandr Bezdieniezhnykh	f7a99282fb	[AZ-591] Add airborne_bootstrap to populate _STRATEGY_REGISTRY Batch 66 — fixes the production gap surfaced during the cycle-1 completeness-gate post-mortem: the central _STRATEGY_REGISTRY was empty in production source, so compose_root() raised StrategyNotLinkedError on the first component lookup and the airborne binary couldn't reach takeoff. Changes: - New module `src/.../runtime_root/airborne_bootstrap.py` exposes `register_airborne_strategies()` and a documented `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS` table. The function registers 14 entries into the central registry across 7 strategy-selecting slots (c1_vio + c2_vpr + c2_5_rerank + c3_matcher + c3_5_adhop + c4_pose + c5_state). Per-slot wrappers adapt the registry-factory signature (config, constructed) to each per-component factory's kwarg surface and surface a AirborneBootstrapError when a required infrastructure dep is missing from constructed. - `compose_root` gains a `pre_constructed` kwarg in live mode, symmetric with the replay-mode seam. Replay entries still take precedence on key collision (ADR-011). Existing callers unaffected (kwarg defaults to None). - `runtime_root/__init__.py::main()` now calls `register_airborne_strategies()` before `compose_root(config)` so production binaries no longer crash at the registry-lookup step. - Lazy-loading preserved: state_factory's private _STATE_REGISTRY is populated lazily inside the c5_state wrapper, gated by BUILD_STATE_GTSAM_ISAM2 / BUILD_STATE_ESKF env flags. pose_factory's own lazy-import fallback handles c4_pose without an explicit register() call. - 7 new unit tests in `tests/unit/runtime_root/test_az591_airborne_\ bootstrap.py` cover AC-1..AC-5 plus the negative-path AirborneBootstrapError contract. Full unit suite 2105 passed / 88 environment-gated skips / 0 failures. End-to-end takeoff still needs a follow-up task to wire infrastructure pre-construction (c13_fdr / c6_* / c7_inference / etc.) into the pre_constructed dict passed to compose_root. That follow-up is gated by AZ-591 landing first; recommended split into per-component infrastructure-prep tasks (3pt each). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 12:58:38 +03:00
Oleksandr Bezdieniezhnykh	6d51e06886	[AZ-589] [AZ-590] [AZ-591] [AZ-592] [AZ-593] Re-classify cycle1 gate findings Cycle 1 Product Implementation Completeness Gate post-mortem. AZ-589 + AZ-590 were the wrong abstraction: - AZ-589 targeted `okvis::ThreadedKFVio` (OKVIS v1 API) which does not exist in the vendored OKVIS2 upstream; smartroboticslab/okvis2 exposes `okvis::ThreadedSlam` instead. - AZ-590 assumed a "de-ROSified VINS-Mono pin" submodule exists; `cpp/vins_mono/upstream/` has no `.gitmodules` entry. - The actual production gap is the empty central `_STRATEGY_REGISTRY`: `register_strategy(...)` is never called outside test fixtures, so `compose_root()` raises `StrategyNotLinkedError` for every component slug with a strategy-selecting config field. Affects c1_vio + c2_vpr + c2_5_rerank + c3_matcher + c3_5_adhop + c4_pose + c5_state. Re-classification: - AZ-589 + AZ-590 closed Won't Fix (Jira); spec files removed from todo/ but rows retained in the dependencies table as audit-trail. - AZ-591 created (todo/, 5pt) — cross-cutting compose_root per-binary bootstrap that populates `_STRATEGY_REGISTRY` for the airborne binary. Scheduled as Batch 66 sole task. - AZ-592 created (backlog/, 5pt placeholder) — AZ-332 Tier-2 validation bundle (real `okvis::ThreadedSlam` wiring + Linux CI apt-install + DBoW2 vocab + Jetson). BLOCKED on Tier-2 prerequisites; honors AZ-332's `AZ-332_tier2_validation` self-deferral handle. - AZ-593 created (backlog/, 5pt placeholder) — AZ-333 Tier-2 validation bundle (de-ROSified VINS-Mono upstream + binding + CI + Jetson). BLOCKED on upstream vendoring decision plus Tier-2 prerequisites; honors AZ-333's parallel deferral pattern. - AZ-332 + AZ-333 re-classified in cycle1 gate report from FAIL to BLOCKED-on-Tier-2. Step 7 stays in_progress until AZ-591 lands; after that it can advance to Step 8 with AZ-592 + AZ-593 parked in backlog/. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 12:45:58 +03:00
Oleksandr Bezdieniezhnykh	be5c6d20aa	[AZ-589] [AZ-590] Close completeness gate cycle 1: VIO remediation tasks The Product Implementation Completeness Gate (cycle 1, 2026-05-16) audited 107 done product tasks. 105 PASS / 0 BLOCKED / 2 FAIL. FAIL findings — both AZ-332 (OKVIS2) and AZ-333 (VINS-Mono) ship a real Python facade + AC-tested fake backend, but their native pybind11 bindings (_native/okvis2_binding.cpp, _native/vins_mono_binding.cpp) are skeletons: _build_estimator() sets estimator_built_ = false; the first add_frame() raises *FatalException("estimator not yet wired"). Production-default VIO and the comparative-study path both crash on the first nav-camera frame. Remediation tasks created in _docs/02_tasks/todo/: - AZ-589 remediate_okvis2_threadedkfvio_wiring (5pt) - AZ-590 remediate_vins_mono_estimator_wiring (5pt) Both tasks also seed the per-binary bootstrap register_strategy() call sites — the existing strategy registry in runtime_root/__init__.py is never invoked in src/ today. Artifacts: - _docs/03_implementation/implementation_completeness_cycle1_report.md - _docs/02_tasks/todo/AZ-589_remediate_okvis2_threadedkfvio_wiring.md - _docs/02_tasks/todo/AZ-590_remediate_vins_mono_estimator_wiring.md - _docs/02_tasks/_dependencies_table.md (+2 rows; totals refreshed) - _docs/_autodev_state.md (Step 7 phase 1 parse; current_batch: 66) Returning to implement-skill Step 1 to parse Batch 66 against these remediation tasks (per Step 15 option A). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 10:24:38 +03:00
Oleksandr Bezdieniezhnykh	c5ffc14fe9	[AZ-389] C5 orthorectifier emits mid-flight tiles to C6 Adds an opt-in C5-internal orthorectifier (`_orthorectifier.py`) that emits at most one tile-aligned JPEG candidate per nav frame to the C6 `TileStore.write_tile` API. Quality gates fire before any OpenCV work: covariance Frobenius, inlier floor, source-label (`SATELLITE_ANCHORED` only), and once-per-frame rate limit. Cross-component import rule (AZ-507) is preserved: c5_state never imports c6_tile_cache. `runtime_root.state_factory` carries a new `_C6MidFlightIngestAdapter` that builds the canonical `TileMetadata` (`ONBOARD_INGEST` / `FRESH` / `PENDING`), hashes the JPEG, and translates `FreshnessRejectionError` to a `None` return so the orthorectifier silently swallows freshness rejection per AC-NEW-3. Wiring is opt-in via `C5StateConfig.orthorectifier.enabled`; existing tests/binaries default to disabled and are unaffected. Both `GtsamIsam2StateEstimator` and `EskfStateEstimator` participate through new `attach_orthorectifier` / `set_latest_nav_frame` extension methods (Protocol surface unchanged). Tests: 22 new unit tests cover AC-1..AC-9 plus inlier-floor gate plus the composition-root adapter. 216/216 c5_state and 38/38 runtime-root + compose tests pass. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 09:02:33 +03:00
Oleksandr Bezdieniezhnykh	811ddc8aa7	chore: bump opencv-pin leftover replay timestamp Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 05:47:21 +03:00
Oleksandr Bezdieniezhnykh	2b19b8b90b	[AZ-558] Route C8 outbound encoder bytes through MavlinkTransport seam All FC adapter outbound MAVLink bytes now go through the AZ-401 MavlinkTransport seam (NoopMavlinkTransport in replay, SerialMavlinkTransport in live). New helpers in _outbound_mavlink_payloads.py extract encode/pack/seq-bump so the four AP _send sites and the iNav statustext _send site become encode -> pack -> transport.write. TlogReplayFcAdapter emits real AP-shape MAVLink bytes through the injected NoopMavlinkTransport, satisfying replay protocol Invariant 5 and unblocking AZ-401 AC-9. Closes AZ-558. Also unskips AZ-401 AC-9 and AZ-404 AC-4b. Live wire output remains byte-identical (proven via two-instance MAVLink byte-equivalence tests). AST scan asserts no .mav.<name>_send( calls remain in the retrofit set (AP / iNav / tlog adapters). Out of scope (logged in review): GCS adapter retrofit; airborne live strategy registration that would activate the SerialMavlinkTransport factory injection path. Tests: 2110 passed, 92 environmental skips, 1 unrelated pre-existing macOS cold-start flake deselected. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-16 05:33:56 +03:00
Oleksandr Bezdieniezhnykh	d7e6b0959e	[AZ-404] [AZ-389] [AZ-559] E2E replay test (Derkachi 60s) + AZ-389 cleanup Batch 63 of /autodev replay slice. Adds the AZ-404 E2E test harness against the Derkachi fixture and resolves the AZ-389 dependency phantom (closing AZ-559 Won't Fix). E2E test (AZ-404) - tests/e2e/replay/_tlog_synth.py: deterministic CSV->tlog generator (the original Derkachi tlog is not in repo; data_imu.csv is its export, so we round-trip the CSV through pymavlink). Verified: SCALED_IMU2 + ATTITUDE + GPS_RAW_INT + HEARTBEAT round-trip cleanly through mavutil.mavlink_connection. - tests/e2e/replay/_helpers.py: parse_jsonl, l2_horizontal_m (haversine), match_percentage, CapturingMavlinkTransport (ready for AZ-558 unblock), GroundTruthRow + load_ground_truth_csv. - tests/e2e/replay/conftest.py: derkachi_replay_inputs (session scope), replay_runner (subprocess fixture per AZ-402 CLI), operator_pre_flight_setup placeholder. - tests/e2e/replay/test_derkachi_1min.py: 9 tests covering AC-1..AC-8 with AC-7 skip-gate self-check + AC-4a mode-agnosticism AST scan (passes unconditionally, confirms ADR-011 holding). - tests/e2e/replay/test_helpers.py: 14 unit tests covering AC-9 helper L2 correctness + match_percentage + parse_jsonl + CapturingMavlinkTransport (all unconditional). - tests/e2e/replay/README.md: AC matrix, fixture state, runtime budget, failure cookbook (AC-10). AC matrix - AC-1, AC-2, AC-5, AC-6 implemented and Tier-1 gated on RUN_REPLAY_E2E=1. - AC-3 (<=100m for 80%) xfail until real Topotek KHP20S30 calibration ships (camera_info.md states intrinsics are unknown). - AC-4a (mode-agnosticism AST scan) PASSES unconditionally. - AC-4b (encoder byte-equality) skip until AZ-558 routes C8 bytes through MavlinkTransport. - AC-7 (skip-gate self-check) PASSES unconditionally. - AC-8 (operator workflow rehearsal) skip until D-PROJ-2 mock-suite-sat-service implements tile-fetch + index-build endpoints. - AC-9 (helper L2 correctness) 14 PASSES unconditionally. AZ-389 housekeeping - AZ-559 closed Won't Fix: investigation against c6_tile_cache/_types.py confirmed TileSource.ONBOARD_INGEST + TileMetadata.quality_metadata + write_tile's FreshnessRejectionError already cover the mid-flight ingest semantic. The "missing API" was a spec-vs-impl naming mismatch. - AZ-389 spec rewritten to consume the existing write_tile API + catch FreshnessRejectionError per AC-NEW-3 opportunistic emission. - _dependencies_table.md reverted: AZ-389 deps -> AZ-303 (was AZ-559 in the previous commit on this branch); total 150 / 497 pts. Tests - Full regression: 2099 passed (+14 new e2e/replay), 94 skipped (incl. 8 e2e/replay heavy-tier + documented blocker skips), 3 perf-microbench flakes deselected (test_cli_cold_start_under_2s, test_cold_start_under_500ms_p99, test_nfr_perf_sign_microbench; all pass in isolation - pre-existing under-load flakes on dev macOS). Reviews - _docs/03_implementation/reviews/batch_63_review.md: code review PASS_WITH_WARNINGS (3 documented spec-gap deferrals: AC-3, AC-4b, AC-8). - _docs/03_implementation/cumulative_review_batches_61-63_cycle1_report.md: cumulative review PASS_WITH_WARNINGS. Action items: prioritise AZ-558 (closes AZ-401 AC-9 + AZ-404 AC-4b); consider 2pt hygiene PBI for Protocol-completeness AST scan to catch the AZ-389 / AZ-559 phantom-API pattern at task-prep time. Architecture invariants observably holding - ADR-011 (replay-as-configuration): AC-4a's AST scan over src/gps_denied_onboard/components/*/.py finds zero violations - components branch on neither config.mode nor any synonym. - Single composition root (replay protocol Invariant 11): AZ-402 CLI dispatches to runtime_root.main(config); does not call compose_root directly. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 21:41:39 +03:00
Oleksandr Bezdieniezhnykh	4f10fd230f	[AZ-559] [AZ-389] docs: defer AZ-389 to AZ-559 (C6 mid-flight tile gap) AZ-389's task spec assumed the existence of `tile_store.put_mid_flight_ candidate(MidFlightTileCandidate)` (in Excluded: "owned by AZ-303 / E-C6"), but the current TileStore Protocol has only the four-method baseline shipped under AZ-303 — there is no put_mid_flight_candidate, no MidFlightTileCandidate DTO, and no MID_FLIGHT_INGEST TileSource enum value. Filed AZ-559 as a 5pt task to close the C6 storage gap (Protocol method + DTO + enum + persistence + freshness/LRU integration + contract update). Updated AZ-389 spec to depend on AZ-559 (replacing the stale AZ-303 dep) with a Status: BLOCKED note. Updated the dependencies table totals: 151 tasks / 502 complexity points. This is the same dep-gap pattern surfaced for AZ-401 in batch 61 (missing AZ-400 transport-seam retrofit) — the autodev replay-track sequence is exposing under-spec deliveries upstream. Tracker remains the source of truth via the new AZ-559 issue + Blocks link. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 20:14:47 +03:00
Oleksandr Bezdieniezhnykh	2c31cc094f	[AZ-402] Replay — gps-denied-replay console-script + shared main(config) Implements the replay-mode CLI dispatcher per ADR-011 (replay-as- configuration): - src/gps_denied_onboard/cli/replay.py: argparse with all 6 required args (--video, --tlog, --output, --camera-calibration, --config, --mavlink-signing-key) plus --pace and --time-offset-ms; path validation, calibration JSON schema-validation, config mutation (mode='replay' + replay sub-block + signing-key hex on dev_static field), dispatch into runtime_root.main(config). - runtime_root.main() now accepts an optional Config (additive, backward-compat). Adds dedicated catch for ReplayInputAdapterError mapping to EXIT_FDR_OPEN_FAILURE (2) so the CLI's exit-code matrix holds end-to-end (AC-9 + epic AZ-265 AC-8). - Signing-key contents stored as hex; redacted in startup banner. - Top-level except logs full traceback via logger.exception + stderr print and exits 1. The CLI does NOT call compose_root directly — it builds a Config and hands it to the shared airborne main, which calls compose_root, which branches on config.mode (AZ-401 / replay protocol Invariant 11). Tests: 22 unit tests covering AC-1..AC-10 + extras (signing-key redaction, file-not-dir validation, dev_static propagation, unhandled exception traceback). Full regression: 2085 passed (+22) green; no new flaky tests. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 20:04:37 +03:00
Oleksandr Bezdieniezhnykh	17a0d074af	[AZ-401] [AZ-400] Replay — compose_root replay-mode branch + transport seam Wires the airborne composition root for replay-as-configuration (ADR-011): - compose_root(config) branches on config.mode in {"live", "replay"}. Live behaviour is unchanged; replay builds ReplayInputAdapter, attaches JsonlReplaySink, and injects NoopMavlinkTransport. - New private module runtime_root/_replay_branch.py holds the replay-only strategy graph + build-flag gate + calibration loader. - Config gains Config.mode (Literal["live","replay"]) plus Config.replay sub-block with nested ReplayAutoSyncConfig that mirrors the AZ-405 AutoSyncConfig DTO; YAML loader + ENV map updated. Absorbs the AZ-400 transport-seam retrofit that AZ-401 strictly required but AZ-400 had not delivered: - New MavlinkTransport Protocol (write/bytes_written/close). - NoopMavlinkTransport (replay; build-flag gated, idempotent close, thread-safe byte counter). - SerialMavlinkTransport (live, no-op restructure of existing pymavlink byte path; encoder retrofit to actually USE it is the AZ-558 follow-up). AZ-401 AC-9 (NoopMavlinkTransport.bytes_written > 0 after C8 encoders run) is BLOCKED on AZ-558 — the encoder routing retrofit is out of the AZ-401 task envelope (FORBIDDEN files: pymavlink_ardupilot_adapter, msp2_inav_adapter). AZ-558 spec, batch_61_review.md, and the test's @pytest.mark.skip rationale all carry the deferral reason. Tests: 22 compose_root replay-branch tests + 17 transport tests. Full regression: 2063 passed, 86 environment-skips, 1 documented skip (AC-9 / AZ-558), 1 pre-existing flaky perf test deselected. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 11:55:33 +03:00
Oleksandr Bezdieniezhnykh	8149083cac	[AZ-405] Replay — replay_input/ coordinator + IMU take-off auto-sync Adds the Layer-4 cross-cutting `replay_input/` module per ADR-011: ReplayInputAdapter converges (video, tlog) into the standard FrameSource + FcAdapter + Clock surfaces the airborne composition root consumes. Owns time-alignment between video frames and tlog IMU/attitude ticks (manual via --time-offset-ms or auto via the AZ-405 IMU-take-off detector + Farneback motion-onset detector). Auto-sync algorithm (auto_sync.py): - Tlog take-off detector: sustained vertical-accel excess > 0.5 g for >= 0.5 s + sustained attitude-rate magnitude > 1 rad/s. - Video motion-onset detector: dense Farneback flow magnitude > 1.5 px sustained >= 0.5 s (deterministic per AC-10). - compute_offset combines the two; confidence = min(tlog, video). - validate_offset_or_fail implements the AC-9 95 % frame-window match validator with configurable threshold + window. ReplayInputAdapter.open() ordering (AC-13): 1. Load tlog samples + fail-fast on missing RAW_IMU/SCALED_IMU2 or ATTITUDE BEFORE any video read. 2. Resolve offset (auto-sync OR manual override; manual bypasses the detectors entirely per AC-8). 3. Run AC-9 validator on resolved offset; raise auto-sync hard-fail for AC-7 (CLI exit 2 mapping). 4. Build single Clock instance per pace (TlogDerived/ASAP, Wall/REAL). 5. Construct VideoFileFrameSource and TlogReplayFcAdapter with the resolved offset baked in (replay protocol Invariant 8). Structured log + FDR records on auto-sync detected / low-confidence / AC-8 hard-fail kinds. Idempotent close (AC-12). Tests: 25 unit tests across tests/unit/replay_input/ covering all 13 ACs (kernel-level synthetic fixtures for AC-1..AC-10; coordinator- level OpenCV synthetic videos + faked pymavlink for AC-6..AC-13). Contract update: replay_protocol.md v2.0.0 added fdr_client to the ReplayInputAdapter __init__ signature (was missing in the prose; the task spec already listed it in the allowed-imports section). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 09:50:51 +03:00
Oleksandr Bezdieniezhnykh	f9b4241d3a	[AZ-403] Remove process leftover after Jira cancellation replay Replayed deferred tracker write: AZ-403 transitioned to Done with cancellation comment per ADR-011 (replay-as-configuration). Resolution auto-set to Done by AZ workflow (no Cancelled status exposed in this Jira instance; resolution edit rejected by API). Cancellation reason recorded in the Jira comment. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 09:12:59 +03:00
Oleksandr Bezdieniezhnykh	5adf3dd04f	[AZ-265] Replay as configuration of airborne binary (ADR-011) Re-design replay mode per user direction: replay is no longer a fourth Docker image with a reduced component set, but a `config.mode = "replay"` branch of the single airborne binary. The pre-flight workflow (route in suite UI -> C12 tile download via real satellite-provider -> C10 manifest+engines build) is identical between live and replay; only three strategies swap at compose time: FrameSource: Live <-> Video FcAdapter: Pymavlink/MSP2 <-> TlogReplay MavlinkTransport: Serial <-> Noop The C8 outbound MAVLink encoders run unchanged in both modes; their bytes hit `NoopMavlinkTransport` in replay and disappear. A new `JsonlReplaySink` taps C5's `EstimatorOutput` stream so the parent-suite UI sees per-tick coordinates by tailing `results.jsonl`. MAVLink 2.0 signing key remains mandatory (operator supplies a dummy file). A new `replay_input/` Layer-4 cross-cutting coordinator owns `(video, tlog) -> (FrameSource, FcAdapter, Clock)` convergence; the composition root sees only standard interfaces past `.open()`. Docs: - architecture.md: new ADR-011 with full rationale; ADR-002 binary narrative updated. - contracts/replay/replay_protocol.md: bumped to v2.0.0; 12 invariants (notably mode-agnosticism + encoder byte-equality + signing key mandatory + real C6 cache in replay). - module-layout.md: Build-Time Exclusion Map dropped from 4 to 3 binary columns; replay-mode `BUILD_*` flags default ON in airborne; `shared/replay_input` cross-cutting entry added. - epics.md: E-DEMO-REPLAY scope reframed; story points 27-32 -> 19-24. Task respecs: - AZ-401: shrunk 3 -> 2 pts; `compose_root` mode branch + JSONL sink + NoopMavlinkTransport wiring; legacy `compose_replay` export deleted. - AZ-402: console-script wrapper that mutates `config.mode = "replay"` and dispatches into the shared airborne main; `--mavlink-signing-key` mandatory. - AZ-403: CANCELLED. Moved to done/ with banner; Jira transition deferred via `_docs/_process_leftovers/2026-05-14_az_403_cancellation_pending_tracker.md`. - AZ-404: AC-4 reworded as mode-agnosticism AST scan + encoder byte-equality test; new AC-8 operator-workflow rehearsal. - AZ-405: also owns the `replay_input/` module + `ReplayInputAdapter`. _dependencies_table.md updated: AZ-401 gains AZ-405 dep; AZ-404 drops AZ-403 dep; AZ-403 row marked CANCELLED. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 09:01:04 +03:00
Oleksandr Bezdieniezhnykh	fa3742d582	[AZ-399] [AZ-400] C8 TlogReplayFcAdapter + ReplaySink + JsonlReplaySink Opens E-DEMO-REPLAY (AZ-265): the two C8 strategies that let the upcoming compose_replay (AZ-401) and gps-denied-replay CLI (AZ-402) run the production C1-C5 pipeline against a recorded (.tlog, video) pair without touching live FC I/O. AZ-400 lands the contract ReplaySink Protocol (emit + close per replay_protocol.md v1.0.0) and JsonlReplaySink: orjson-serialised JSONL, fsync-on-close, build-flag gated (BUILD_REPLAY_SINK_JSONL), double-close idempotent, FDR mirror on open/close. The drifted AZ-390 stub in interface.py is removed; the canonical Protocol now lives in replay_sink.py per module-layout.md and is re-exported via __init__.py. AZ-390 conformance test widened. AZ-399 lands TlogReplayFcAdapter: full FcAdapter Protocol surface, build-flag gated (BUILD_TLOG_REPLAY_ADAPTER), pymavlink stream-parse with bounded pre-scan + fail-fast on missing required messages (R-DEMO-3), dedicated decode thread feeding the existing AZ-391 SubscriptionBus. Outbound surface raises FcEmitError per Invariant 5; request_source_set_switch raises SourceSetSwitchNotSupportedError. Pacing honours Invariant 6 via Clock.sleep_until_ns. time_offset_ms shifts every emitted received_at per Invariant 8. Non-monotonic timestamps raise FcOpenError. Test coverage: 188 c8_fc_adapter tests pass; 1 skipped (AZ-399 AC-1 500 MB tlog RSS bound, deferred to AZ-404 e2e behind RUN_REPLAY_E2E). Code review: PASS_WITH_WARNINGS — 1 Medium (mapping logic duplicates AZ-391 live decoder; intentional today, four behavioural deltas documented), 2 Low. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 05:33:20 +03:00
Oleksandr Bezdieniezhnykh	4eac24f37a	[AZ-358] [AZ-361] C4 OpenCVGtsamPoseEstimator + Jacobian thermal hybrid Implement the single production-default C4 PoseEstimator strategy. AZ-358 — Marginals path: OpenCV solvePnPRansac (SOLVEPNP_IPPE) on best-candidate inliers, PriorFactorPose3 with Jacobian-derived initial covariance, flushed into C5's iSAM2 graph via the widened ISam2GraphHandle.update(graph, values, None) (Option B). Posterior covariance from compute_marginals().marginalCovariance(pose_key) with SPD-defensive Cholesky check. Tile pixel -> ENU world conversion via the shared WgsConverter + a configurable tile_size_px. Two spec deviations now documented in the AZ-358 task file: PriorFactorPose3 over GenericProjectionFactorCal3DS2 (avoids unbounded landmark variables; same Fisher information on the pose marginal) and explicit (graph, values, timestamps) update args (aligns with C5's impl). AZ-361 — Jacobian + thermal hybrid: per-frame dispatch on thermal_state.thermal_throttle_active selects the cv2.projectPoints- derived 6x6 information matrix (with ridge regularisation) as the emitted covariance. Skips the iSAM2 factor add under throttle (Invariant 12). Emits CovarianceDegradedWarning via warnings.warn (never raised); paired WARN log + FDR record rate-limited per covariance_degraded_warn_window_ns (default 60 s) via an injected monotonic Clock. Supersedes the AZ-358 NotImplementedError stub. Widens ISam2GraphHandle from get_pose_key only to all five C4-facing methods (add_factor, update, compute_marginals, last_anchor_age_ms); C5's existing ISam2GraphHandleImpl already satisfies the superset, so no C5 source change this batch. Threads fdr_client + clock through pose_factory composition. Registers two new FDR payload kinds: pose.frame_done (per-call telemetry; both success and PnpFailureError paths) and pose.covariance_degraded (per-window throttle exposure). Tests: 21 new (AZ-358 AC-1..11 + AZ-361 AC-1..10/12/13; AZ-361 AC-11 RMSE-ratio informational per spec, not asserted). Updates 2 existing test files for Protocol widening and the FDR-schema round trip. Code review verdict: PASS_WITH_WARNINGS (5 findings: Medium x2, Low x3; none blocking). Full suite: 1958 passed, 1 unrelated host-dependent perf failure (c12 CLI cold-start, pre-existing). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 05:01:14 +03:00
Oleksandr Bezdieniezhnykh	360aece7a6	[AZ-528] [AZ-335] [AZ-345..AZ-347] [AZ-349] Cumulative review 55-57 Cumulative code review for the C3 / C3.5 cross-domain matching pipeline going live (B55 facade-spine consolidation, B56 warm-start + F8 reboot recovery, B57 three concrete matchers + AdHoP refiner). Verdict PASS_WITH_WARNINGS — three Low findings, no Critical / High / Architecture issues. Cumulative-52-54 Medium F1 (c1_vio facade-spine duplication) closed by AZ-528 with regression guards. State: last_completed_batch=57, last_cumulative_review=batches_55-57, current_batch=58. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 04:12:47 +03:00
Oleksandr Bezdieniezhnykh	abe8c5cd2c	[AZ-345] [AZ-346] [AZ-347] [AZ-349] Archive batch 57 task specs Move completed task specs from _docs/02_tasks/todo/ to _docs/02_tasks/done/ now that the four tickets are In Testing. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 04:10:34 +03:00
Oleksandr Bezdieniezhnykh	a1185d0a28	[AZ-345] [AZ-346] [AZ-347] [AZ-349] C3 matchers + C3.5 AdHoP refiner Implement the three concrete C3 CrossDomainMatcher strategies plus the C3.5 production-default AdHoPRefiner. C3 (AZ-345/346/347): - DiskLightGlueMatcher + AlikedLightGlueMatcher share a single shared _pipeline.run_lightglue_pipeline orchestrator (decode -> query extract -> per-candidate loop -> RANSAC sort -> health update -> FDR emit) so the only per-backbone delta is the keypoint+descriptor extractor closure. ALIKED adds a create-time engine output-schema probe (AC-special-1). - XFeatMatcher owns its own per-candidate loop (single forward fuses extraction + matching); it re-uses the shared FDR emission helpers to keep telemetry byte-identical across strategies. lightglue_runtime parameter accepted by factory but discarded (AC-special-1). - All three consume the shared LightGlueRuntime / RansacFilter / RollingHealthWindow helpers; no helper forks. InferenceRuntimeCut consumer-side Protocol added per AZ-507. C3.5 (AZ-349): - AdHoPRefiner implements the <= conditional gate, runs the OrthoLoC AdHoP TRT engine over best-candidate correspondences, re-runs RANSAC on the perspective-preconditioned set, and emits an enriched MatchResult with refinement_label="adhop". - Invariant 4 passthrough fall-through: any RefinerBackboneError (TRT failure, OOM, NaN, bad shape) is caught, logged ERROR, FDR-emitted with error: true, and converted to passthrough that still counts against the rolling invocation-rate window. MemoryError and other non-listed exceptions propagate by design (AC-5 closed-set semantics). - Rolling 60-s invocation-rate window + rate-limited WARN log (configurable via ratelimited_warn_window_ns; default 60 s). Shared changes: - C3MatcherConfig + C3_5RefinerConfig extended with the new weights/threshold/window fields. - matcher_factory + refiner_factory optionally forward clock + fdr_client to the strategy's create(); backward-compatible. - fdr_client.records registers five new kinds: matcher.frame_done, matcher.backbone_error, matcher.insufficient_inliers, matcher.all_failed, refiner.frame_done. Tests: 66 new (43 C3 parametrised + 23 AdHoP) covering 47/47 ACs; focused suite green; full project test suite green except for one pre-existing flaky CLI cold-start timing test unrelated to this batch. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 04:09:22 +03:00
Oleksandr Bezdieniezhnykh	06f655d8fb	[AZ-335] C1 warm-start hint persistence + F8 reboot recovery wiring Adds JsonSidecarWarmStartHintStore (atomic JSON + SHA-256 sidecar via AZ-280) inside c1_vio, plus the cross-strategy WarmStartWiredStrategy wrapper + prime_warm_start_from_disk / prime_warm_start_from_fc hooks at runtime_root. AC-7 post-reset covariance inflation and AC-8 "no fake confidence" baseline floor are enforced at the wiring layer so no strategy module needed edits. Adds three c1_vio config fields (warm_start_store_dir, warm_start_save_period_frames, post_reset_covariance_inflation_factor) and registers the new FDR kind vio.warm_start. 34 unit tests cover all 10 ACs + 3 NFRs. Verdict PASS_WITH_WARNINGS — see _docs/03_implementation/reviews/batch_56_review.md for the four non-blocking documentation findings (F1 cold-start log kind shorthand, F2 strategy-frame pose semantics, F3 dev-hardware perf smoke, F4 runtime_root importing c1-internal _facade_spine for shared FDR conventions). Closes AZ-335; depends on AZ-528 (batch 55). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 03:30:46 +03:00
Oleksandr Bezdieniezhnykh	f12789ebf0	[AZ-528] Consolidate c1_vio strategy facade orchestration spine Replace 3-way byte-equivalent orchestration-spine duplication across okvis2.py / vins_mono.py / klt_ransac.py with a single c1-internal helper at components/c1_vio/_facade_spine.py. Closes cumulative review batches 52-54 Finding F1. No behaviour change — all existing AZ-332 / AZ-333 / AZ-334 AC tests pass unmodified (114 c1_vio tests green, 237 with adjacent regression suite). The helper exposes 5 stateless free functions (now_iso, bias_norm, se3_from_4x4, frame_ts_ns, frame_image) and a FacadeSpine mixin class providing _classify_state / _tick_lost / _emit_transition. Concrete strategies inherit the mixin and set spine-required instance attributes in __init__. Mirrors the AZ-527 precedent for c2_vpr-side _assert_engine_output_dim consolidation. New test file test_az528_facade_spine.py covers AC-1..AC-8 with 19 tests, including an AST regression guard that prevents future re-introduction of the consolidated free functions in any strategy module, plus a Risk-1 static check that every strategy's __init__ assigns every spine-required attribute. Archive AZ-528 task spec to done/, bump autodev state to batch 56. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 03:03:16 +03:00
Oleksandr Bezdieniezhnykh	ac3e288dbd	[AZ-528] Add AZ-528 task spec + register in dependencies table Follow-up to cumulative review batches 52-54 Finding F1. Creates the local task-spec file under _docs/02_tasks/todo/ and adds the row to _dependencies_table.md so Batch 55's implement-loop can pick AZ-528 up. Mirrors the AZ-527 precedent from the c2_vpr-side cumulative review (49-51): cumulative review opens the Jira ticket + raises the finding, the prep commit adds the spec, the next batch implements. Sized at 3 points (1 helper module + 3 strategy edits + 1 test file with AST-walk + import-grep regression guards). Marginally larger than AZ-527's 2-point c2 consolidation because the c1 spine has both module-level free functions AND mixin-shaped instance methods. Jira: https://denyspopov.atlassian.net/browse/AZ-528 Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 02:49:31 +03:00
Oleksandr Bezdieniezhnykh	21cef8bdce	[AZ-528] [AZ-527] [AZ-333] [AZ-334] Cumulative review batches 52-54 Verdict: PASS_WITH_WARNINGS — auto-chain allowed per implement skill Step 14.5. AZ-528 created as the formal hygiene PBI for the c1_vio strategy facade orchestration-spine 3-way duplication (Medium / Maintainability) — the deferred F1 finding from B53 + B54 per-batch reviews. AZ-527 closes the parallel c2_vpr-side helper duplication finding (carried over from cumulative-49-51 F1). Carry-overs: F2 (B52-54 test-fake / _patch_pose_recovery sharing) + cumulative-49-51 F2 (AC-10 spec wording drift across c2_vpr specs) remain informational; no code defect, no active drift. Next cumulative review trigger fires after Batch 57 (every K=3). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 02:45:28 +03:00
Oleksandr Bezdieniezhnykh	ceb24b5a62	[AZ-334] C1 KLT/RANSAC strategy — engine-rule simple-baseline VIO Implement KltRansacStrategy, the ADR-002 engine-rule mandatory simple-baseline VioStrategy for E-C1. Pure-Python facade over OpenCV's cv2.goodFeaturesToTrack / calcOpticalFlowPyrLK / findEssentialMat / recoverPose pipeline — no C++/pybind11 binding by design so a Tier-0 workstation runs the strategy with `pip install opencv-python` and the BUILD_KLT_RANSAC=ON gate alone. Constructor + state machine + FDR transition spine mirror Okvis2Strategy + VinsMonoStrategy so the AZ-331 factory + IT-12 comparative harness treat all three as drop-in substitutable; the duplication is the consolidation target now formally in scope for the next cumulative review (batches 52-54). AC coverage: AC-1..AC-11 + NFR-perf mapped to passing tests (25 tests, 23 pass + 2 tier-2 skipped on dev/CI runners; all 25 pass under GPS_DENIED_TIER=2). Honest-covariance invariant (AC-9) implemented as residual-scatter / (N_inliers - 5) with an inlier- count penalty — no client-side floor or smoother; cov Frobenius grows monotonically across DEGRADED. Camera-agnostic source (AC-11) enforced by CI-grep gate that excludes docstring text. Test-Run Cadence: focused suite tests/unit/c1_vio/ green (95 passed, 6 skipped); config-loader + compose-root suites green; full-suite gate deferred to Step 16 per implement skill. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 02:40:01 +03:00
Oleksandr Bezdieniezhnykh	4815dd6aa1	chore: bump D-CROSS-CVE-1 leftover replay timestamp Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 02:15:37 +03:00
Oleksandr Bezdieniezhnykh	6a5954bdae	[AZ-333] C1 VINS-Mono strategy — research-only comparative VIO VinsMonoStrategy: Python facade conforming to AZ-331 Protocol; mirrors the AZ-332 OKVIS2 facade so the AZ-331 factory + IT-12 comparative harness can treat both as drop-in substitutable. Native binding is a pybind11 skeleton compiled behind BUILD_VINS_MONO=ON (default OFF for airborne / operator-tooling / replay-cli per module-layout.md Build-Time Exclusion Map). Real vins_estimator wiring is the Tier-2 follow-up. VinsMonoConfig added to c1_vio/config.py with sliding-window / feature-tracker / marginalisation / opt-iteration knobs plus __post_init__ validation; exported through the package __init__. cpp/vins_mono/CMakeLists.txt replaces the AZ-263 placeholder with full pybind11 wiring: Risk-1 mitigation forces VINS_MONO_USE_ROS=OFF; Risk-2 mitigation links Eigen from the same cpp/_third_party/eigen pin as OKVIS2; Risk-3 mitigation enforces BUILD_VINS_MONO=OFF in deployment binaries via the gate at the top of the file. Tests: 17 new in test_vins_mono_strategy.py (15 pass + 2 tier2 skip); fake_vins_mono_binding fixture added to conftest.py mirroring the fake_okvis2_binding pattern; test_protocol_conformance updated to drop vins_mono from _STRATEGIES_WITHOUT_PY_MODULE so the existing parametrised factory tests route through the new strategy. Focused c1_vio suite: 72 passed, 4 skipped. Full suite: 1788 passed, 1 unrelated pre-existing flake (c12 cold-start perf, env-bound). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 01:11:09 +03:00
Oleksandr Bezdieniezhnykh	2ce300ddb1	[AZ-527] Archive AZ-527 + batch 52 report + state bump Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 00:51:19 +03:00
Oleksandr Bezdieniezhnykh	235eb4549e	[AZ-527] Consolidate _assert_engine_output_dim into c2-internal helper Closes cumulative review batches 49-51 Finding F1 (Medium / Maintainability) -- the 7-way duplication of _assert_engine_output_dim across c2_vpr secondary VPR strategy modules. Add c2-internal helper assert_engine_output_dim(inference_runtime, handle, preprocessor, descriptor_dim, , output_key='embedding', input_key='input') in src/gps_denied_onboard/components/c2_vpr/ _engine_dim_assertion.py. The helper runs a zero-init dry-run inference at preprocessor.input_shape() and asserts the engine output dict carries (1, descriptor_dim) under output_key. Raises gps_denied_onboard.config.schema.ConfigError on mismatch (preserving the prior error envelope and message wording byte-identically). Migrate 7 strategy modules (ultra_vpr, net_vlad, mega_loc, mix_vpr, sela_vpr, eigen_places, salad) to import the helper and delete the local _assert_engine_output_dim definitions + their inline 'AZ-527 (planned)' comments. NetVLAD is the only call site that overrides output_key='vlad_descriptor'; the other 6 explicitly pass output_key=_OUTPUT_KEY + input_key=_ENGINE_INPUT_KEY (matching helper defaults but documenting strategy contract at the call site). Add tests/unit/c2_vpr/test_az527_engine_dim_assertion.py (14 tests, AAA pattern, Protocol-conforming fakes) covering AC-1..AC-4: helper signature; wrong shape raises ConfigError naming both dims; missing output key raises ConfigError naming the missing key; AST-walk regression guard for stray definitions outside the helper module (modeled on AZ-526's test_ac4_az526_no_module_level_iso_ts_from_clock_outside_helper); import-grep regression guard verifying all 7 strategy modules import the helper. AC-5 (existing AZ-337/338/339/340 AC-6 sub-tests pass unmodified) is exercised transitively: c2_vpr/ full directory 230/230 PASS, no test file modified outside the new test_az527_. AC-6 (AZ-270 + AZ-507 layer lints) verified by tests/unit/test_az270_compose_root.py 8/8 PASS. Code-review verdict: PASS (zero findings). Ruff clean. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 00:50:17 +03:00
Oleksandr Bezdieniezhnykh	f6a180e5df	[AZ-340] [AZ-527] Archive AZ-340 + batch 51 report + cumulative review 49-51 Bookkeeping for batch 51 close: - Archive AZ-340 spec todo/ -> done/ - Add _docs/03_implementation/batch_51_cycle1_report.md - Add _docs/03_implementation/cumulative_review_batches_49-51_cycle1_report.md Verdict: PASS_WITH_WARNINGS. F1 (Medium) escalates the 2-way _assert_engine_output_dim near-duplicate from cumulative-46-48 to a 7-way duplication after AZ-339 + AZ-340; new hygiene PBI AZ-527 formally created. F2 (Low) carries the AC-10 ConfigError vs literal ConfigurationError spec drift (documentation only). - File AZ-527 hygiene PBI (Hygiene -- consolidate _assert_engine_output_dim into a c2-internal helper, 2pt, AZ-255 E-C2). Add the spec stub at _docs/02_tasks/todo/AZ-527_*.md. - Refresh _docs/02_tasks/_dependencies_table.md: +AZ-527 row, totals bumped to 148 tasks / 491 points. - Bump _docs/_autodev_state.md: last_completed_batch=51, last_cumulative_review=batches_49-51. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 00:39:29 +03:00
Oleksandr Bezdieniezhnykh	87909cce9f	[AZ-340] C2 SelaVPR + EigenPlaces + SALAD secondary VPR backbones Three new VprStrategy implementations for IT-12 comparative-study (research binary only, gated OFF for airborne / operator-tooling per ADR-002). All run via the C7 TensorRT runtime (or ONNX-RT fallback) with their own concrete BackbonePreprocessor, single-stage L2 normalisation, and FaissBridge-delegated retrieval — same pattern as AZ-339 (MegaLoc + MixVPR), parametrised in tests for compactness. * SelaVprStrategy — D=512, input 224x224 * EigenPlacesStrategy — D=2048, input 480x480 * SaladStrategy — D=8448, input 322x322 (DINOv2-Large backbone; heaviest in the C2 family — NFR-perf budget relaxed to 120 ms p95 / 1200 MB GPU per task spec) The composition-root factory tables and KNOWN_STRATEGIES set were already pre-wired at AZ-336 land time; module-layout.md already names all three Internal entries and BUILD_VPR_* rows. No CMake change required (env-flag gating). 54 unit tests (3 strategies * 18 cases) cover AC-1..AC-11 plus extras (single-stage L2, NCHW FP16, constructor validation, FDR emission). All pass; sibling c2_vpr suite + composition-root regression + AZ-526 iso-ts regression all green. Code review verdict: PASS_WITH_WARNINGS. Two Low findings logged in batch_51_review.md: F1 escalates `_assert_engine_output_dim` duplication from 4-way to 7-way (already tracked by AZ-527 hygiene PBI; will surface in cumulative review batches 49-51); F2 mirrors the AZ-337 / 338 / 339 AC-10 spec-drift precedent (literal ConfigurationError vs implemented ConfigError / StrategyNotAvailable). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 00:32:38 +03:00
Oleksandr Bezdieniezhnykh	e81616a09d	[meta] Refresh D-CROSS-CVE-1 leftover replay timestamp Bootstrap-time replay check confirmed gtsam==4.2.1 still pins numpy<2.0.0; opencv-python>=4.12 pin remains deferred. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-14 00:19:06 +03:00
Oleksandr Bezdieniezhnykh	0d65ff4705	[AZ-339] C2 MegaLoc + MixVPR secondary VPR backbones Adds two research-only VprStrategy implementations for the IT-12 comparative-study matrix. MegaLocStrategy (D=2048, 322x322) and MixVprStrategy (D=4096, 320x320), both via C7 TensorRT FP16 with their own concrete BackbonePreprocessor. Single-stage global L2 normalisation; retrieval delegated to FaissBridge; FDR records + structured logs identical to UltraVPR. BUILD_VPR_MEGALOC and BUILD_VPR_MIXVPR ON for research/replay-cli only, OFF for airborne and operator-tooling (fail-fast at composition root via existing AZ-336 factory). Uses helpers.iso_ts_from_clock from day 1 — no new timestamp helper duplicates introduced. 36 parametrised AC tests + 25 protocol-conformance + 18 helper regression tests pass; 1690 / 1690 unit tests pass (excluding 1 pre-existing flaky cold-start subprocess test in c12). Verdict: PASS_WITH_WARNINGS — one Medium follow-on (AZ-527 to consolidate 4-way _assert_engine_output_dim) + one Low AC wording drift. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 23:52:54 +03:00
Oleksandr Bezdieniezhnykh	5dfd9a577e	[AZ-526] Consolidate _iso_ts_from_clock into helpers/iso_timestamps Closes cumulative review 46-48 F1 (Medium) + F3 (Low). Adds iso_ts_from_clock(clock) alongside iso_ts_now() in the Layer-1 helper; migrates four duplicate definitions in c2_vpr (net_vlad, ultra_vpr, _faiss_bridge) and c12_operator_orchestrator (operator_reloc_service). Output format flipped +00:00 -> Z to align with iso_ts_now() and the canonical FDR _TS fixture (FDR schema test passes unmodified). 18 helper AC tests + 186 sibling tests pass; ruff clean. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 23:37:04 +03:00
Oleksandr Bezdieniezhnykh	fbeeab60b3	[AZ-337] [AZ-338] [AZ-508] Cumulative review batches 46-48 Verdict: PASS_WITH_WARNINGS. Per-batch reviews already validated each task's ACs; this cumulative review focuses on cross-batch drift and surfaces 1 Medium + 2 Low maintainability findings: - F1 (Medium): `_iso_ts_from_clock` Clock-injected helper duplicated across 4 files (c2_vpr/net_vlad + ultra_vpr + _faiss_bridge, c12_operator_orchestrator/operator_reloc_service). B46 + B47 carry inline comments anticipating AZ-508 would consolidate this, but AZ-508 (Batch 48) scoped itself narrower (stdlib-only, Excluded the Clock-injected variant). Recommend a 2-point follow-up PBI adding `iso_ts_from_clock(clock)` to helpers/iso_timestamps.py before AZ-339 / AZ-340 / AZ-358 / AZ-389 add more copies. - F2 (Low): `_assert_engine_output_dim` near-duplicated between NetVLAD and UltraVPR. Defer consolidation until 5 c2_vpr strategies are in flight (after AZ-339 / AZ-340). - F3 (Low): Clock-driven helper outputs `+00:00`; canonical FDR `ts` is `Z`. Fold into F1 follow-up PBI. No Critical or High findings; auto-chain to next batch allowed. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 23:26:58 +03:00
Oleksandr Bezdieniezhnykh	5441ea2017	[AZ-508] Consolidate _iso_ts_now into helpers/iso_timestamps Batch 48 / Cycle 1 (greenfield Step 7). Closes cumulative review batches 31-33 F2 and 28-30 F3 by replacing the duplicated private _iso_ts_now() one-liners with a single Layer-1 helper: src/gps_denied_onboard/helpers/iso_timestamps.py iso_ts_now() -> str Output format matches the canonical FDR _TS fixture (YYYY-MM-DDTHH:MM:SS.ffffffZ); no FDR schema change. Migrated call-sites (3): c7_inference/onnx_trt_ep_runtime, c7_inference/thermal_publisher, plus the 3 c6_tile_cache callers that previously imported from the local c6_tile_cache/_timestamp shim (now deleted, superseded by the Layer-1 helper). Spec drift resolved (Choose A, user-approved): spec listed 5 call sites + +00:00 regex; on-disk reality at batch start is 3 sites + Z-suffix matching every existing helper and the FDR _TS fixture. Spec preamble + AC-2 regex updated in the task file; documented in batch_48_cycle1_report.md. Tests: 9 new AC tests (AC-1..AC-7 + Layer-1 invariant + public-surface defensive); 216 focused tests pass including the unmodified AZ-272 FDR schema suite and AZ-270 / AZ-507 layering lints. Verdict: PASS (no findings). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 23:23:22 +03:00
Oleksandr Bezdieniezhnykh	f29897cb3a	[meta] Tighten Jira tracker error handling: STOP and ASK on any error User feedback after a transitionJiraIssue call returned a bare {"success": true} that I trusted blindly: the rule should require explicit verification and stop-and-ask on any ambiguous response. Two targeted clarifications: - .cursor/rules/tracker.mdc - Tracker Availability Gate now lists the full set of failure modes (non-2xx, timeout, empty body, opaque success) and bans automatic retries. Adds an explicit read-back requirement when the response is minimal, and adds "abort" to the user-choice menu. - .cursor/skills/implement/SKILL.md - Step 5 (In Progress) and Step 12 (In Testing) now spell out the STOP-and-ASK rule inline instead of just pointing at tracker.mdc. Adds the read-back verification step for opaque responses. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 23:06:48 +03:00
Oleksandr Bezdieniezhnykh	cfe3d357f4	[meta] Forbid per-batch full-suite test runs under implement skill Root cause: I ran the full unit suite at the end of every autodev batch despite implement/SKILL.md already saying that is forbidden (lines 33, 136, 145, 372). The skill's existing rules were buried mid-document; coderule.mdc's general "run full suite when done" overrode them in practice because each batch felt like a "done" point. Two targeted clarifications: - .cursor/rules/coderule.mdc: add an Iterative-Skill Exception bullet stating that when an iterative loop skill (autodev / implement batch loop, refactor batch loop) is active, the skill governs full-suite cadence and "done with changes" means done with the implementation phase, not done with one batch. - .cursor/skills/implement/SKILL.md: hoist the per-batch / per- task / Step-16 cadence rule into a top-of-file "READ FIRST, EVERY BATCH" banner with an explicit anti-pattern check ("if you catch yourself about to run pytest tests/ at end of batch, STOP"). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 22:51:48 +03:00
Oleksandr Bezdieniezhnykh	b64f3a1b93	[AZ-337] Archive task spec + batch 47 report + state bump - _docs/02_tasks/todo/AZ-337_c2_ultra_vpr.md -> _docs/02_tasks/done/AZ-337_c2_ultra_vpr.md - _docs/03_implementation/batch_47_cycle1_report.md (new) - _docs/_autodev_state.md: last_completed_batch 46 -> 47; sub_step.detail "batch 47 complete - selecting batch 48" AZ-337 transitioned in Jira: In Progress -> In Testing. Batches 45/46/47 close the C2 production path (Protocol + FaissBridge + NetVLAD baseline + UltraVPR primary). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 22:44:22 +03:00
Oleksandr Bezdieniezhnykh	3c4fd272f1	[AZ-337] C2 UltraVPR primary backbone VprStrategy UltraVPR is the Documentary Lead's PRIMARY backbone per description.md § 1 and is wired by default (config.c2_vpr.strategy = "ultra_vpr"). Runs on the C7 TensorRT runtime (AZ-298) or ONNX-Runtime fallback (AZ-299); explicitly NOT on the PyTorch FP16 runtime so a TRT engine compile bug can fall back to NetVLAD without simultaneously breaking both strategies. Production changes: - c2_vpr/ultra_vpr.py - UltraVprStrategy + module-level create() factory. embed_query pipeline: preprocess -> runtime.infer -> single-stage L2 -> VprQuery. retrieve_topk delegates one-line to FaissBridge. Engine load + output-shape assertion happen at create() time (AC-6) so misconfiguration surfaces at startup, not 17 minutes into a flight. UltraVPR has D=512 fixed (NOT a config knob; AC-5 / AC-6 / AC-7 all assume 512). Single-stage L2 (no intra-cluster step like NetVLAD; spy-test enforces this so a future refactor cannot silently regress recall). - c2_vpr/_preprocessor_ultra_vpr.py - centre-crop using the camera calibration's principal point (cx, cy from intrinsics_3x3), falling back to geometric centre + WARN log when calibration is absent (AC-9). Resize -> (384, 384) -> ImageNet mean/std -> FP16 NCHW. - No composition-root changes: UltraVPR consumes a pre-compiled .trt engine (no PyTorch nn.Module), so the strategy module does NOT expose MODEL_NAME / architecture_factory. The composition- root _register_strategy_architecture helper no-ops cleanly for this case (verified by test_create_does_not_register_pytorch_architecture). Tests: - tests/unit/c2_vpr/test_ultra_vpr.py - 29 tests covering all 12 ACs + preprocessor contract + constructor validation + FDR record emission + single-stage L2 enforcement. Full unit suite: 1637 passed / 80 env-skipped (+29 new tests). Per-batch code review (batch_47_review.md): PASS_WITH_WARNINGS (3 Low-severity findings; no Critical / High / Medium): - F1: _iso_ts_from_clock is now the 7th copy (AZ-508 will close). - F2: AZ-337 spec uses outdated C7 API names; affects upcoming AZ-339 / AZ-340. Spec-hygiene PBI recommended. - F3: principal-point fallback uses (0, 0) zero-detection for missing calibration; safe but tightens when intrinsics become Optional. Architectural notes: - AZ-507 layering clean. Imports only InferenceRuntimeCut, DescriptorIndexCut, c2_vpr internals, _types, helpers, clock, fdr_client. Architecture lint test passes. - Pattern parity with NetVLAD (B46) where semantics permit; UltraVPR-specific paths (single-stage L2, 'embedding' output key, TRT runtime, no architecture registry, principal-point crop) are clearly localised. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 22:43:17 +03:00
Oleksandr Bezdieniezhnykh	773d589d34	[AZ-338] Archive task spec + batch 46 report + state bump - _docs/02_tasks/todo/AZ-338_c2_net_vlad.md -> _docs/02_tasks/done/AZ-338_c2_net_vlad.md - _docs/03_implementation/batch_46_cycle1_report.md (new) - _docs/_autodev_state.md: last_completed_batch 45 -> 46; sub_step.detail "batch 46 complete - selecting batch 47" AZ-338 transitioned in Jira: In Progress -> In Testing. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 22:31:56 +03:00
Oleksandr Bezdieniezhnykh	af0dbe863a	[AZ-338] [AZ-283] C2 NetVLAD mandatory simple-baseline VprStrategy NetVLAD is the C2 comparative baseline per the engine rule (every production-default backbone ships with a simple-baseline alongside). Runs on the C7 PyTorch FP16 runtime (NOT TRT) so a TRT engine compile bug cannot simultaneously break NetVLAD AND UltraVPR. Production changes: - c2_vpr/net_vlad.py — NetVladStrategy + module-level create() factory. Constructor wires InferenceRuntimeCut + DescriptorIndexCut + NetVladBackbonePreprocessor + DescriptorNormaliser + FaissBridge. embed_query pipeline: preprocess -> runtime.infer -> dual-stage normalisation (intra-cluster THEN global L2) -> VprQuery. retrieve_topk delegates one-line to FaissBridge. - c2_vpr/_net_vlad_architecture.py — Arandjelovic et al. 2016 NetVLAD layer over torchvision VGG16 features + optional Linear PCA projection to descriptor_dim (default 4096; published Pittsburgh reference uses KD=64512=32768 raw + Linear(32768, 4096) PCA). - c2_vpr/_preprocessor_net_vlad.py — OpenCV-based image preprocessor: decode -> centre-crop square -> resize (480, 480) -> ImageNet normalisation -> FP16 NCHW. Calibration is not consumed (NetVLAD is calibration-agnostic per published preprocessing chain). - c2_vpr/inference_runtime_cut.py — NEW AZ-507 consumer-side cut mirroring C7 InferenceRuntime; lets c2_vpr stay AZ-507-clean. - c2_vpr/config.py — added netvlad_descriptor_dim: int = 4096 knob. - helpers/descriptor_normaliser.py — added intra_cluster_normalise (DescriptorNormaliser v1.0.0 -> v1.1.0; backward-compatible add). - runtime_root/vpr_factory.py — added _register_strategy_architecture helper that binds (MODEL_NAME, architecture_factory(descriptor_dim)) to C7's architecture registry before delegating to the strategy's create() factory. Keeps the c7 import at L4, preserves AZ-507. - fdr_client/records.py — registered vpr.embed_query, vpr.backbone_error, vpr.preprocess_error record kinds. Tests: - tests/unit/c2_vpr/test_net_vlad.py — 31 tests covering all 11 ACs + preprocessor contract + architecture factory + constructor validation + FDR record emission. - tests/unit/test_az283_descriptor_normaliser.py — +8 tests for the new intra_cluster_normalise. - tests/unit/test_az272_fdr_record_schema.py — +3 fixture payloads. Full unit suite: 1608 passed / 80 env-skipped (+43 new tests). Per-batch code review (batch_46_review.md): PASS_WITH_WARNINGS (4 Low-severity hygiene findings; no Critical/High/Medium). Architectural notes: - The spec implied c2_vpr.net_vlad.create() registers the architecture with C7. That violates AZ-507 (no cross-component imports). Resolved by exposing MODEL_NAME + architecture_factory(descriptor_dim) on the strategy module and having the composition root perform the C7 bind. - C7 PyTorch runtime API names in the spec (forward, load_engine) were outdated; aligned implementation with the live v1.0.0 Protocol (infer, compile_engine + deserialize_engine). Spec hygiene flagged in review F2. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 22:30:29 +03:00
Oleksandr Bezdieniezhnykh	dd2f1cbae6	[AZ-341] [AZ-329] [AZ-330] [AZ-328] Cumulative review batches 43-45 PASS_WITH_WARNINGS verdict covering AZ-328 (BuildCacheOrchestrator), AZ-329 (PostLandingUploadOrchestrator + FdrFooterReader), AZ-330 (OperatorReLocService), AZ-523/AZ-524 (C11 internal gate removal + c12_operator_orchestrator rename), and AZ-341 (FaissBridge + DescriptorIndexCut). Four Low-severity findings, all hygiene or carry-over: F1 ISO timestamp helper duplicated across 6 modules (AZ-508 hygiene PBI exists), F2 IndexUnavailableError namespace duplication c2/c6 flagged for spec/docstring alignment, F3 AZ-341 spec lists unused normaliser parameter, F4 carry-over cold-start microbench host-load flake. Full unit suite 1565 passed / 80 env-skipped at close of window. No new layer-direction or AZ-507 violations introduced; three new structural Protocol cuts (TileDownloaderCut, FdrFooterReader, DescriptorIndexCut) all follow the same shape. State file updated: last_cumulative_review batches_40-42 -> batches_43-45. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 21:50:32 +03:00
Oleksandr Bezdieniezhnykh	1682dc354b	[AZ-341] Archive AZ-341 + batch 45 report Batch 45 (AZ-341 C2 FAISS retrieve wiring) post-commit bookkeeping: - Move AZ-341 task spec to done/ (implement skill step 13). - Write batch_45_cycle1_report.md (test results, AC coverage, architectural decisions, findings carried into cumulative review). - Bump state.last_completed_batch 44 → 45. Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 21:47:07 +03:00
Oleksandr Bezdieniezhnykh	88f6ae6dce	[AZ-341] C2 FAISS HNSW retrieve wiring (FaissBridge + AZ-507 cut) Shared retrieve_topk plumbing for every concrete C2 VprStrategy: - FaissBridge centralises the c6 search_topk → VprResult pipeline, the defended-in-depth INV-4 check (exactly k, distance-ascending), the WARN-threshold check on distances[0], optional per-frame DEBUG log, and one `vpr.retrieve_topk` FDR record per call with latency measurement. - DescriptorIndexCut Protocol — consumer-side structural cut of c6 DescriptorIndex.search_topk (AZ-507); keeps c2_vpr c6-import-free. - C2VprConfig gains warn_top1_threshold + debug_per_frame_distances knobs with validators. - KNOWN_PAYLOAD_KEYS registers vpr.retrieve_topk for the FDR record schema with payload {frame_id, backbone_label, top10_distances, latency_us}; companion fixture added to the AZ-272 roundtrip suite. - 22 unit tests cover AC-1..AC-11 + NFR-perf microbench (p95 ≤ 0.5 ms) + constructor and retrieve-argument validation. Verdict: PASS_WITH_WARNINGS (2 Low findings — duplicated ISO-ts helper across c2/c5/c11/c12, captured in AZ-508 hygiene PBI; spec-listed but unused `normaliser` parameter dropped — INV-3 makes the embedding L2-normalised at the strategy's `embed_query`). Tests: 1565 passed / 80 skipped (was 1543; +22 new tests). Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-13 21:45:40 +03:00

1 2 3 4 5 ...

322 Commits