[AZ-341] Archive AZ-341 + batch 45 report

Batch 45 (AZ-341 C2 FAISS retrieve wiring) post-commit bookkeeping:
- Move AZ-341 task spec to done/ (implement skill step 13).
- Write batch_45_cycle1_report.md (test results, AC coverage,
  architectural decisions, findings carried into cumulative review).
- Bump state.last_completed_batch 44 → 45.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-13 21:47:07 +03:00
parent 88f6ae6dce
commit 1682dc354b
3 changed files with 195 additions and 1 deletions
@@ -0,0 +1,194 @@
# Batch 45 — Cycle 1 Report
**Date**: 2026-05-13
**Batch**: 45
**Tasks**: AZ-341 (C2 FAISS HNSW Retrieve Wiring, 3pt)
**Status**: complete; ticket transitioned to "In Testing".
## Scope
AZ-341 delivers `FaissBridge` — the cross-strategy shared plumbing
that connects every concrete C2 `VprStrategy.retrieve_topk` to C6's
FAISS HNSW `DescriptorIndex.search_topk` surface. The bridge owns the
defended-in-depth INV-4 check (exactly K results, distance-ascending),
the WARN-threshold check on `distances[0]`, the optional per-frame
DEBUG `top10_distances` log, and the `vpr.retrieve_topk` FDR record
emission with monotonic-ns latency measurement.
Companion deliverables in the same batch:
- `DescriptorIndexCut` Protocol — the AZ-507 consumer-side structural
cut that keeps `c2_vpr/*.py` import-free of `c6_tile_cache.*`.
- `C2VprConfig` gains two operator-tunable knobs:
`warn_top1_threshold` (default 0.30 — task-spec Risk 3 conservative
placeholder pending FT-P-19 telemetry) and `debug_per_frame_distances`
(default OFF — Risk 2 mitigation against journald flood).
- `fdr_client.records` registers `vpr.retrieve_topk` in
`KNOWN_PAYLOAD_KEYS` and the AZ-272 roundtrip schema test gains a
fixture for the new kind (mirroring the Batch-44 `c12.reloc.requested`
pattern).
Unblocks each future C2 strategy (AZ-337 UltraVPR, AZ-338 NetVLAD,
AZ-339 MegaLoc+MixVPR, AZ-340 SelaVPR+EigenPlaces+SALAD): every
strategy's `retrieve_topk` shrinks to a one-line delegation
`return self._faiss_bridge.retrieve(query, k, backbone_label=...)`,
eliminating the seven-way duplication risk the task spec called out.
## Architectural Decisions
### 1. The bridge is stricter than the c6 contract — by design
C6's `descriptor_index.md` Invariant I-2 says `search_topk` "MAY
return fewer than K results when the corpus has fewer than K
vectors". The AZ-341 task spec asks the bridge to enforce `== K`
and raise `IndexUnavailableError` on undersized returns. The two
contracts are complementary, not contradictory:
- C6's I-2 is a **structural** invariant (the return-shape upper
bound, not a guarantee of full results).
- AZ-341's INV-4 is an **operational** invariant (a corpus with
fewer than K vectors is operationally degraded; the system can
not function — downstream RANSAC matching needs the full top-K
spread).
The bridge's module docstring documents this on-purpose strictness.
The defended-in-depth posture is the task-spec's explicit Risk-1
mitigation against silent C6 regressions.
### 2. The bridge propagates the c6 `IndexUnavailableError` unchanged
AC-4 is explicit: "the bridge does not catch and re-raise". The
bridge has no `try/except IndexUnavailableError`. When the c6
stale-handle / sidecar / dim-mismatch defence fires, the original
c6 exception escapes directly to the strategy and onwards.
This conflicts with the `c6_tile_cache.errors.IndexUnavailableError`
module docstring's expectation that "the C2 family is the closed
envelope a C5/C2.5 consumer sees; the C6 family is the storage-layer
error a concrete strategy is responsible for rewrapping". The
implementation follows the task-spec letter; surfacing the gap as
Finding F2 of the per-batch review for spec/code-comment alignment.
### 3. `_iso_ts_from_clock` is duplicated — accepted for this batch
The helper is six lines that appear inline in c2 / c5 / c11 / c12.
Each component avoids importing the others' helpers (per AZ-507
shape). A future shared helper would live in `shared/helpers/`
(L1) but factoring is deferred to AZ-508 (the hygiene-only PBI
already in `todo/` covering ISO-timestamp consolidation across
the suite). Captured as Finding F1.
### 4. The dropped `normaliser` constructor parameter
The task spec's Outcome § Constructor lists
`normaliser: DescriptorNormaliser` but the documented `retrieve`
body never calls it — the `VprStrategy` Protocol INV-3 already
requires `VprQuery.embedding` to arrive L2-normalised. The bridge
omits the parameter (no behaviour change vs the documented body).
Captured as Finding F2 for spec hygiene.
## Files Changed
### Production source (new)
- `src/gps_denied_onboard/components/c2_vpr/_faiss_bridge.py`
`FaissBridge` class (~250 LOC), `_verify_invariants` /
`_log_invariant_violation` / `_emit_fdr_record` /
`_iso_ts_from_clock` helpers, module-level FDR / log `kind`
constants.
- `src/gps_denied_onboard/components/c2_vpr/descriptor_index_cut.py`
`DescriptorIndexCut` Protocol + `TileIdTuple` alias (~50 LOC).
### Production source (modified)
- `src/gps_denied_onboard/components/c2_vpr/config.py`
`C2VprConfig` gains `warn_top1_threshold: float = 0.30` and
`debug_per_frame_distances: bool = False` plus their validators.
- `src/gps_denied_onboard/components/c2_vpr/__init__.py`
re-exports `DescriptorIndexCut` and `TileIdTuple` for the
composition root's c6 adapter.
- `src/gps_denied_onboard/fdr_client/records.py` — registers
`vpr.retrieve_topk` in `KNOWN_PAYLOAD_KEYS` with payload field
set `{frame_id, backbone_label, top10_distances, latency_us}`.
### Tests (new)
- `tests/unit/c2_vpr/test_faiss_bridge.py` — 22 tests covering
AC-1..AC-11, NFR-perf microbench (p95 ≤ 0.5 ms), constructor
validation, retrieve-argument validation. Uses
`_FakeDescriptorIndex`, `_StubClock`, and a real `FdrClient` with
power-of-two capacity 8 to verify both the success enqueue
pattern and the OVERRUN warn path.
### Tests (modified)
- `tests/unit/test_az272_fdr_record_schema.py` — adds a
`vpr.retrieve_topk` payload to the `_make_fdr_payload` fixture so
the schema roundtrip + envelope tests cover the new kind.
### Docs / process
- `_docs/03_implementation/reviews/batch_45_review.md` — per-batch
code review (verdict PASS_WITH_WARNINGS, 2 Low findings).
- `_docs/_autodev_state.md` — sub_step bumped through the batch
loop.
## Test Results
```
1565 passed, 80 skipped in 69.9s
```
Versus pre-batch (Batch 44 final): `1543 passed, 80 skipped`.
Delta: +22 new tests (all in `tests/unit/c2_vpr/test_faiss_bridge.py`),
zero regressions, zero new skips.
## Code Review
Verdict: **PASS_WITH_WARNINGS** — see
`_docs/03_implementation/reviews/batch_45_review.md`.
Two Low-severity findings:
| # | Severity | Category | Title |
|---|----------|----------|-------|
| F1 | Low | Maintainability | Duplicated `_iso_ts_from_clock` helper across c2/c5/c11/c12 — defer to AZ-508 |
| F2 | Low | Spec-Gap | Task spec lists unused `normaliser` parameter — surface to user / spec hygiene |
No Critical, no High, no Medium. The Auto-Fix Gate accepts the
verdict without intervention.
## AC Coverage Summary
All 11 acceptance criteria + NFR-perf have at least one covering
test. Detailed AC ↔ test mapping in
`_docs/03_implementation/reviews/batch_45_review.md` § Phase 2.
## Tracker
- AZ-341 transitioned `To Do``In Progress` at batch start.
- AZ-341 transitioned `In Progress``In Testing` after the
commit landed.
## Cumulative Review (batches 43-45)
Last cumulative review was for batches 40-42. With Batch 45 closed
(43, 44, 45 since the last cumulative), the implement skill's
Step 14.5 cumulative review fires next at K=3.
## Next Batch
Several C-component slices are now unblocked:
- C1: AZ-333 (VINS-Mono strategy, research), AZ-334 (KLT/RANSAC
mandatory baseline). Then AZ-335 (warm-start) once 333+334 land.
- C2: AZ-337 (UltraVPR primary, 5pt), AZ-338 (NetVLAD baseline,
3pt), AZ-339 (MegaLoc+MixVPR research), AZ-340 (SelaVPR+
EigenPlaces+SALAD research).
- C3: AZ-345 / AZ-346 / AZ-347 — pending the C2.5 ReRanker tasks.
A natural Batch-46 candidate would be a C1 mandatory simple-baseline
(AZ-334) — single 5pt task, OpenCV-based, no model weights — but
the next-batch selection will go through the implement skill's
Step 3 (Compute Next Batch) afresh.
+1 -1
View File
@@ -12,5 +12,5 @@ sub_step:
retry_count: 0
cycle: 1
tracker: jira
last_completed_batch: 44
last_completed_batch: 45
last_cumulative_review: batches_40-42