mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 11:31:13 +00:00

Files

T

Oleksandr Bezdieniezhnykh 1682dc354b [AZ-341] Archive AZ-341 + batch 45 report

Batch 45 (AZ-341 C2 FAISS retrieve wiring) post-commit bookkeeping:
- Move AZ-341 task spec to done/ (implement skill step 13).
- Write batch_45_cycle1_report.md (test results, AC coverage,
  architectural decisions, findings carried into cumulative review).
- Bump state.last_completed_batch 44 → 45.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-13 21:47:07 +03:00

7.7 KiB

Raw Blame History

Batch 45 — Cycle 1 Report

Date: 2026-05-13 Batch: 45 Tasks: AZ-341 (C2 FAISS HNSW Retrieve Wiring, 3pt) Status: complete; ticket transitioned to "In Testing".

Scope

AZ-341 delivers FaissBridge — the cross-strategy shared plumbing that connects every concrete C2 VprStrategy.retrieve_topk to C6's FAISS HNSW DescriptorIndex.search_topk surface. The bridge owns the defended-in-depth INV-4 check (exactly K results, distance-ascending), the WARN-threshold check on distances[0], the optional per-frame DEBUG top10_distances log, and the vpr.retrieve_topk FDR record emission with monotonic-ns latency measurement.

Companion deliverables in the same batch:

DescriptorIndexCut Protocol — the AZ-507 consumer-side structural cut that keeps c2_vpr/*.py import-free of c6_tile_cache.*.
C2VprConfig gains two operator-tunable knobs: warn_top1_threshold (default 0.30 — task-spec Risk 3 conservative placeholder pending FT-P-19 telemetry) and debug_per_frame_distances (default OFF — Risk 2 mitigation against journald flood).
fdr_client.records registers vpr.retrieve_topk in KNOWN_PAYLOAD_KEYS and the AZ-272 roundtrip schema test gains a fixture for the new kind (mirroring the Batch-44 c12.reloc.requested pattern).

Unblocks each future C2 strategy (AZ-337 UltraVPR, AZ-338 NetVLAD, AZ-339 MegaLoc+MixVPR, AZ-340 SelaVPR+EigenPlaces+SALAD): every strategy's retrieve_topk shrinks to a one-line delegation return self._faiss_bridge.retrieve(query, k, backbone_label=...), eliminating the seven-way duplication risk the task spec called out.

Architectural Decisions

1. The bridge is stricter than the c6 contract — by design

C6's descriptor_index.md Invariant I-2 says search_topk "MAY return fewer than K results when the corpus has fewer than K vectors". The AZ-341 task spec asks the bridge to enforce == K and raise IndexUnavailableError on undersized returns. The two contracts are complementary, not contradictory:

C6's I-2 is a structural invariant (the return-shape upper bound, not a guarantee of full results).
AZ-341's INV-4 is an operational invariant (a corpus with fewer than K vectors is operationally degraded; the system can not function — downstream RANSAC matching needs the full top-K spread).

The bridge's module docstring documents this on-purpose strictness. The defended-in-depth posture is the task-spec's explicit Risk-1 mitigation against silent C6 regressions.

2. The bridge propagates the c6 `IndexUnavailableError` unchanged

AC-4 is explicit: "the bridge does not catch and re-raise". The bridge has no try/except IndexUnavailableError. When the c6 stale-handle / sidecar / dim-mismatch defence fires, the original c6 exception escapes directly to the strategy and onwards.

This conflicts with the c6_tile_cache.errors.IndexUnavailableError module docstring's expectation that "the C2 family is the closed envelope a C5/C2.5 consumer sees; the C6 family is the storage-layer error a concrete strategy is responsible for rewrapping". The implementation follows the task-spec letter; surfacing the gap as Finding F2 of the per-batch review for spec/code-comment alignment.

3. `_iso_ts_from_clock` is duplicated — accepted for this batch

The helper is six lines that appear inline in c2 / c5 / c11 / c12. Each component avoids importing the others' helpers (per AZ-507 shape). A future shared helper would live in shared/helpers/ (L1) but factoring is deferred to AZ-508 (the hygiene-only PBI already in todo/ covering ISO-timestamp consolidation across the suite). Captured as Finding F1.

4. The dropped `normaliser` constructor parameter

The task spec's Outcome § Constructor lists normaliser: DescriptorNormaliser but the documented retrieve body never calls it — the VprStrategy Protocol INV-3 already requires VprQuery.embedding to arrive L2-normalised. The bridge omits the parameter (no behaviour change vs the documented body). Captured as Finding F2 for spec hygiene.

Files Changed

Production source (new)

src/gps_denied_onboard/components/c2_vpr/_faiss_bridge.py — FaissBridge class (~250 LOC), _verify_invariants / _log_invariant_violation / _emit_fdr_record / _iso_ts_from_clock helpers, module-level FDR / log kind constants.
src/gps_denied_onboard/components/c2_vpr/descriptor_index_cut.py — DescriptorIndexCut Protocol + TileIdTuple alias (~50 LOC).

Production source (modified)

src/gps_denied_onboard/components/c2_vpr/config.py — C2VprConfig gains warn_top1_threshold: float = 0.30 and debug_per_frame_distances: bool = False plus their validators.
src/gps_denied_onboard/components/c2_vpr/__init__.py — re-exports DescriptorIndexCut and TileIdTuple for the composition root's c6 adapter.
src/gps_denied_onboard/fdr_client/records.py — registers vpr.retrieve_topk in KNOWN_PAYLOAD_KEYS with payload field set {frame_id, backbone_label, top10_distances, latency_us}.

Tests (new)

tests/unit/c2_vpr/test_faiss_bridge.py — 22 tests covering AC-1..AC-11, NFR-perf microbench (p95 ≤ 0.5 ms), constructor validation, retrieve-argument validation. Uses _FakeDescriptorIndex, _StubClock, and a real FdrClient with power-of-two capacity 8 to verify both the success enqueue pattern and the OVERRUN warn path.

Tests (modified)

tests/unit/test_az272_fdr_record_schema.py — adds a vpr.retrieve_topk payload to the _make_fdr_payload fixture so the schema roundtrip + envelope tests cover the new kind.

Docs / process

_docs/03_implementation/reviews/batch_45_review.md — per-batch code review (verdict PASS_WITH_WARNINGS, 2 Low findings).
_docs/_autodev_state.md — sub_step bumped through the batch loop.

Test Results

1565 passed, 80 skipped in 69.9s

Versus pre-batch (Batch 44 final): 1543 passed, 80 skipped.

Delta: +22 new tests (all in tests/unit/c2_vpr/test_faiss_bridge.py), zero regressions, zero new skips.

Code Review

Verdict: PASS_WITH_WARNINGS — see _docs/03_implementation/reviews/batch_45_review.md.

Two Low-severity findings:

#	Severity	Category	Title
F1	Low	Maintainability	Duplicated `_iso_ts_from_clock` helper across c2/c5/c11/c12 — defer to AZ-508
F2	Low	Spec-Gap	Task spec lists unused `normaliser` parameter — surface to user / spec hygiene

No Critical, no High, no Medium. The Auto-Fix Gate accepts the verdict without intervention.

AC Coverage Summary

All 11 acceptance criteria + NFR-perf have at least one covering test. Detailed AC ↔ test mapping in _docs/03_implementation/reviews/batch_45_review.md § Phase 2.

Tracker

AZ-341 transitioned To Do → In Progress at batch start.
AZ-341 transitioned In Progress → In Testing after the commit landed.

Cumulative Review (batches 43-45)

Last cumulative review was for batches 40-42. With Batch 45 closed (43, 44, 45 since the last cumulative), the implement skill's Step 14.5 cumulative review fires next at K=3.

Next Batch

Several C-component slices are now unblocked:

C1: AZ-333 (VINS-Mono strategy, research), AZ-334 (KLT/RANSAC mandatory baseline). Then AZ-335 (warm-start) once 333+334 land.
C2: AZ-337 (UltraVPR primary, 5pt), AZ-338 (NetVLAD baseline, 3pt), AZ-339 (MegaLoc+MixVPR research), AZ-340 (SelaVPR+ EigenPlaces+SALAD research).
C3: AZ-345 / AZ-346 / AZ-347 — pending the C2.5 ReRanker tasks.

A natural Batch-46 candidate would be a C1 mandatory simple-baseline (AZ-334) — single 5pt task, OpenCV-based, no model weights — but the next-batch selection will go through the implement skill's Step 3 (Compute Next Batch) afresh.

7.7 KiB Raw Blame History