[AZ-338] Archive task spec + batch 46 report + state bump

- _docs/02_tasks/todo/AZ-338_c2_net_vlad.md -> _docs/02_tasks/done/AZ-338_c2_net_vlad.md
- _docs/03_implementation/batch_46_cycle1_report.md (new)
- _docs/_autodev_state.md: last_completed_batch 45 -> 46; sub_step.detail "batch 46 complete - selecting batch 47"

AZ-338 transitioned in Jira: In Progress -> In Testing.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 08:41:12 +00:00 · 2026-05-13 22:31:56 +03:00
parent af0dbe863a
commit 773d589d34
3 changed files with 205 additions and 2 deletions
@@ -0,0 +1,203 @@
+# Batch 46 / Cycle 1 — Implementation Report
+
+**Date**: 2026-05-13
+**Tasks**: AZ-338 — C2 NetVLAD Mandatory Simple-Baseline (3pt)
+**Total complexity**: 3 points
+**Result**: PASS_WITH_WARNINGS (per-batch code review)
+**Jira tracker state**: AZ-338 transitioned To Do → In Progress → In Testing
+
+## Scope
+
+NetVLAD is the C2 comparative baseline mandated by the engine rule
+(every production-default backbone ships with a simple-baseline
+alongside; description.md § 1). Per § 5 the baseline runs on the C7
+PyTorch FP16 runtime (NOT TensorRT). Keeping the two runtimes isolated
+means a TRT engine compile bug cannot break the baseline and the
+primary at the same time. This batch lands the first concrete
+`VprStrategy` implementation, validating
+the AZ-341 `FaissBridge` plumbing and establishing the pattern that
+AZ-337 / AZ-339 / AZ-340 follow.
+
+## Files Changed
+
+### Production (new)
+
+- `src/gps_denied_onboard/components/c2_vpr/net_vlad.py` —
+  `NetVladStrategy` class implementing the `VprStrategy` Protocol.
+  Constructor wires `InferenceRuntimeCut` + `DescriptorIndexCut` +
+  `NetVladBackbonePreprocessor` + `DescriptorNormaliser` +
+  `FaissBridge`. Module-level `MODEL_NAME` + `architecture_factory()`
+  exposed for the composition root to bind to C7's architecture
+  registry. Module-level `create(config, descriptor_index,
+  inference_runtime)` factory consumed by `build_vpr_strategy`.
+- `src/gps_denied_onboard/components/c2_vpr/_net_vlad_architecture.py`
+  — canonical NetVLAD-VGG16 architecture (Arandjelović et al. 2016):
+  VGG16 feature extractor up to conv5_3 + NetVLAD pooling layer
+  (soft-assign 1x1 conv + cluster centroids + batched residual
+  aggregation) + optional `nn.Linear(K*D, descriptor_dim)` PCA
+  projection. K=64, D=512 (VGG16 conv5_3 channels), default
+  descriptor_dim=4096. Torch / torchvision imported lazily inside the
+  factory.
+- `src/gps_denied_onboard/components/c2_vpr/_preprocessor_net_vlad.py`
+  — `NetVladBackbonePreprocessor` implementing the C2-internal
+  `BackbonePreprocessor` Protocol. Decode → centre-crop square →
+  resize (480, 480) → ImageNet mean/std → FP16 NCHW.
+- `src/gps_denied_onboard/components/c2_vpr/inference_runtime_cut.py`
+  — NEW AZ-507 consumer-side cut mirroring the subset of C7
+  `InferenceRuntime` that C2 strategies consume (`compile_engine`,
+  `deserialize_engine`, `infer`, `release_engine`,
+  `current_runtime_label`). Lets `c2_vpr` stay AZ-507-clean.
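A consumer-side cut of this kind can be sketched with `typing.Protocol`. The five method names are the ones listed above, and the `model_path`, `build_config`, `entry`, `handle` and `inputs` parameter names follow the C7 Protocol as quoted elsewhere in this report. Everything else is an assumption made for illustration: the type hints, the `FakeRuntime` and `NotARuntime` classes, and the `"pytorch_fp16"` label are hypothetical and are not taken from the project code.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class InferenceRuntimeCut(Protocol):
    """Subset of the runtime surface a C2 strategy needs (type hints are assumed)."""

    def compile_engine(self, model_path: str, build_config: Any) -> Any: ...
    def deserialize_engine(self, entry: Any) -> Any: ...
    def infer(self, handle: Any, inputs: Any) -> Any: ...
    def release_engine(self, handle: Any) -> None: ...
    def current_runtime_label(self) -> str: ...


class FakeRuntime:
    """Hypothetical stand-in; it satisfies the cut without importing or subclassing it."""

    def compile_engine(self, model_path, build_config):
        return {"model_path": model_path, "build_config": build_config}

    def deserialize_engine(self, entry):
        return ("handle", entry["model_path"])

    def infer(self, handle, inputs):
        return inputs

    def release_engine(self, handle):
        return None

    def current_runtime_label(self):
        return "pytorch_fp16"  # hypothetical label


class NotARuntime:
    """Implements only one of the five methods, so it does not satisfy the cut."""

    def infer(self, handle, inputs):
        return inputs


# The runtime check is structural: it tests that the methods exist, not their signatures.
is_runtime = isinstance(FakeRuntime(), InferenceRuntimeCut)
is_partial_runtime = isinstance(NotARuntime(), InferenceRuntimeCut)
```

Because the check is structural, the consuming package can declare the cut locally and never import the providing component, which is the property the layering rule depends on.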
+
+### Production (modified)
+
+- `src/gps_denied_onboard/components/c2_vpr/config.py` — added
+  `netvlad_descriptor_dim: int = 4096` knob + `__post_init__`
+  validation.
+- `src/gps_denied_onboard/components/c2_vpr/__init__.py` — re-exported
+  `InferenceRuntimeCut`.
+- `src/gps_denied_onboard/helpers/descriptor_normaliser.py` — added
+  `intra_cluster_normalise(descriptor, num_clusters)` for NetVLAD's
+  dual-stage normalisation chain. Backward-compatible function
+  addition (v1.0.0 → v1.1.0).
+- `src/gps_denied_onboard/fdr_client/records.py` — registered three
+  new record kinds: `vpr.embed_query`, `vpr.backbone_error`,
+  `vpr.preprocess_error`.
+- `src/gps_denied_onboard/runtime_root/vpr_factory.py` — added
+  `_register_strategy_architecture` helper that binds
+  `(MODEL_NAME, architecture_factory(descriptor_dim))` to C7's
+  architecture registry before delegating to the strategy's `create()`
+  factory. Keeps the c7 import at L4, preserves AZ-507.
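The dual-stage normalisation that `intra_cluster_normalise` was added for can be sketched in NumPy as below. This is a simplified illustration, not the helper's actual code: the function bodies, the zero-cluster policy and the example sizes (`K = 4`, `D = 3`) are assumptions, and the real helper applies stricter input validation than shown here.

```python
import numpy as np


def intra_cluster_normalise(descriptor: np.ndarray, num_clusters: int) -> np.ndarray:
    """L2-normalise each of the num_clusters equal-length slices of a 1-D descriptor."""
    if descriptor.ndim != 1 or num_clusters <= 0 or descriptor.size % num_clusters != 0:
        raise ValueError("descriptor must be 1-D with length divisible by num_clusters")
    # astype() copies, so the caller's array is never mutated.
    blocks = descriptor.astype(np.float32).reshape(num_clusters, -1)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    # All-zero clusters stay zero instead of dividing by zero (assumed policy).
    safe = np.where(norms > 0, norms, 1.0)
    return (blocks / safe).reshape(-1).astype(descriptor.dtype)


def l2_normalise(descriptor: np.ndarray) -> np.ndarray:
    """Scale the whole descriptor to unit L2 norm, preserving its dtype."""
    as_f32 = descriptor.astype(np.float32)
    norm = np.linalg.norm(as_f32)
    if norm == 0:
        return descriptor.copy()
    return (as_f32 / norm).astype(descriptor.dtype)


K, D = 4, 3  # toy sizes for the example only
rng = np.random.default_rng(0)
raw = rng.standard_normal(K * D).astype(np.float32)

stage1 = intra_cluster_normalise(raw, K)  # every cluster now has unit norm
final = l2_normalise(stage1)              # the whole vector now has unit norm
```

After the first stage each of the `K` clusters contributes equally, so the vector's overall norm is `sqrt(K)`; the second stage rescales that to 1.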
+
+### Tests
+
+- `tests/unit/c2_vpr/test_net_vlad.py` — NEW, 31 tests covering all
+  11 ACs (with per-AC variants for AC-2, AC-6, AC-8, AC-9, AC-11) +
+  preprocessor contract (4) + constructor validation (3) + FDR record
+  emission (2) + architecture-factory closure (2).
+- `tests/unit/test_az283_descriptor_normaliser.py` — +8 tests for
+  `intra_cluster_normalise`: per-cluster unit norm, dtype preservation,
+  zero-cluster handling, non-divisible length rejection, 2-D
+  rejection, zero/bool `num_clusters` rejection, float64 rejection,
+  no-mutation invariant.
+- `tests/unit/test_az272_fdr_record_schema.py` — +3 fixture payloads
+  for the three new VPR record kinds so the schema roundtrip test
+  exercises all of them.
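The preprocessor contract those tests cover (centre-crop to a square, resize to 480 x 480, ImageNet mean/std, FP16 NCHW) can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions: the decode step is omitted and the input is taken to be an already-decoded HxWx3 `uint8` RGB array, the nearest-neighbour resize is a placeholder because the report does not state the interpolation method, and the `preprocess` function name is hypothetical.

```python
import numpy as np

# Standard ImageNet per-channel statistics for RGB inputs scaled to [0, 1].
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(rgb: np.ndarray, size: int = 480) -> np.ndarray:
    """Decoded HxWx3 uint8 RGB image -> (1, 3, size, size) float16 tensor."""
    if rgb.ndim != 3 or rgb.shape[2] != 3 or rgb.dtype != np.uint8:
        raise ValueError("expected an HxWx3 uint8 RGB array")
    h, w, _ = rgb.shape
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    square = rgb[top:top + side, left:left + side]
    # Nearest-neighbour resize (assumed; the real interpolation is not stated).
    idx = np.arange(size) * side // size
    resized = square[idx][:, idx]
    scaled = resized.astype(np.float32) / 255.0
    normalised = (scaled - IMAGENET_MEAN) / IMAGENET_STD
    # HWC -> CHW, add the batch axis, cast to FP16.
    return normalised.transpose(2, 0, 1)[np.newaxis].astype(np.float16)


frame = np.full((600, 800, 3), 128, dtype=np.uint8)  # synthetic grey frame
tensor = preprocess(frame)
```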
+
+### Docs
+
+- `_docs/02_document/contracts/shared_helpers/descriptor_normaliser.md`
+  — v1.0.0 → v1.1.0; added `intra_cluster_normalise` row + changelog
+  entry.
+- `_docs/03_implementation/reviews/batch_46_review.md` — per-batch code
+  review (PASS_WITH_WARNINGS).
+
+## Acceptance Criteria Coverage
+
+All 11 ACs of AZ-338 have at least one covering unit test:
+
+| AC | Description | Status |
+|----|-------------|--------|
+| AC-1 | Protocol conformance | covered |
+| AC-2 | L2 norm == 1.0 ± 1e-3; FP16 output of shape (D,) | covered (4096 + 512 variants) |
+| AC-3 | `intra_cluster_normalise` BEFORE `l2_normalise` | covered (spy + once-each) |
+| AC-4 | Deterministic across 3 calls | covered |
+| AC-5 | `retrieve_topk` returns k results, label="net_vlad", sorted | covered |
+| AC-6 | `descriptor_dim()` stable | covered (4096 + 512 variants) |
+| AC-7 | Engine output shape mismatch → ConfigError | covered |
+| AC-8 | `VprBackboneError` on forward failure | covered (3 variants) |
+| AC-9 | `VprPreprocessError` on corrupt image | covered (3 variants) |
+| AC-10 | Composition-root wiring + `c2.vpr.ready` log | covered (log + model_name forcing) |
+| AC-11 | `BUILD_PYTORCH_RUNTIME=OFF` → ConfigError fail-fast | covered (tensorrt + onnx_trt_ep variants) |
+
+## Test Results
+
+- **Full unit suite**: `1608 passed / 80 environment-skipped / 0 failed`
+  in ~81s. Up from 1565 at the close of Batch 45 (+43 new tests).
+- **Focused per-component**: `c2_vpr/test_net_vlad.py` 31/31 PASS;
+  `c2_vpr/test_faiss_bridge.py` 22/22 PASS (no regression);
+  `test_az283_descriptor_normaliser.py` 23/23 PASS (15 original + 8
+  new); `test_az272_fdr_record_schema.py` 3/3 PASS.
+- **Lint**: `ruff check` clean on every new + modified file.
+- **AZ-507 layering lint**: `test_ac6_only_compose_root_imports_concrete_strategies` PASS.
+
+## Architectural Decisions
+
+1. **Architecture-registration moved to composition root**. The
+   AZ-338 spec implied `c2_vpr.net_vlad.create()` registers the
+   NetVLAD `nn.Module` factory with C7. That would violate AZ-507 (no
+   cross-component imports). Resolved by exposing `MODEL_NAME` +
+   `architecture_factory(descriptor_dim)` on the strategy module and
+   having `runtime_root/vpr_factory.py::_register_strategy_architecture`
+   perform the c7 binding before calling the strategy's `create()`
+   factory. Pattern is generalisable to AZ-337 / AZ-339 / AZ-340
+   strategies that also use the PyTorch runtime.
+
+2. **C7 API names aligned with v1.0.0 Protocol**. The spec uses
+   `runtime.forward(engine_id, ...)` and
+   `inference_runtime.load_engine(weights_path)`. Live C7 Protocol
+   (AZ-297) is `infer(handle, inputs)` + `compile_engine(model_path,
+   build_config) → entry` + `deserialize_engine(entry) → handle`.
+   Implementation aligns with the v1.0.0 Protocol; spec § Outcome is
+   stale on these names (flagged as F2 in code review).
+
+3. **InferenceRuntimeCut**. New AZ-507 consumer-side cut joins
+   `DescriptorIndexCut` (AZ-341), `TileDownloaderCut` (AZ-328),
+   `TileUploaderCut` (AZ-329), `FdrFooterReader` (AZ-329), making five
+   structural Protocol cuts across the codebase, all named `*Cut`
+   or `*Reader`, all decorated with `@runtime_checkable`. Pattern is
+   stable.
+
+4. **Dual-stage normalisation order**. `intra_cluster_normalise`
+   BEFORE `l2_normalise` is mandatory per AZ-338 spec § Constraints
+   and per the published Pittsburgh NetVLAD preprocessing chain.
+   Verified by AC-3 spy. Reversing the order would silently break
+   AC-2.1b (recall regression).
+
+5. **PCA-projection inside the architecture**. The published
+   Pittsburgh NetVLAD reference ships with a learned `Linear(K*D=32768,
+   4096)` PCA-whitening layer. The architecture embeds the layer
+   as `nn.Linear(K*D, descriptor_dim)`; when `descriptor_dim == K*D`
+   the layer is omitted (raw VLAD). Default `descriptor_dim=4096` per
+   the spec. The `.pth` state dict is expected to carry the PCA
+   weights alongside the rest of the model parameters.
+
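+6. **NetVLAD pooling implemented via `torch.bmm`**, not a Python
+   `for k in range(K)` loop. One batched matmul replaces K=64
+   sequential iterations: the asymptotic cost is the same, but the
+   batched form avoids K separate kernel launches and the Python
+   interpreter overhead between them, which is where the GPU speed-up
+   over the reference Python-loop form comes from.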
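The equivalence between the loop form and the batched form of the pooling step in decision 6 can be checked on a toy example. The sketch below uses NumPy's batched `matmul` as a stand-in for `torch.bmm`. It is not the code in `_net_vlad_architecture.py`: the function names, the tensor shapes and the random inputs are all assumptions chosen for illustration. Both forms compute, for each cluster k, the soft-assignment-weighted sum of residuals between the local features and centroid k.

```python
import numpy as np


def softmax(logits: np.ndarray, axis: int) -> np.ndarray:
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def vlad_loop(x, assign, centroids):
    """Reference form: one pass over the features per cluster."""
    batch, _, dim = x.shape
    num_clusters = centroids.shape[0]
    out = np.zeros((batch, num_clusters, dim), dtype=x.dtype)
    for k in range(num_clusters):
        residual = x - centroids[k]                              # (B, N, D)
        out[:, k] = (assign[:, :, k:k + 1] * residual).sum(axis=1)
    return out


def vlad_batched(x, assign, centroids):
    """Batched form: sum_n a[n,k] * (x[n] - c[k]) = (a^T x)[k] - (sum_n a[n,k]) * c[k]."""
    weighted_sum = np.matmul(assign.transpose(0, 2, 1), x)       # (B, K, D)
    mass = assign.sum(axis=1)[:, :, np.newaxis]                  # (B, K, 1)
    return weighted_sum - mass * centroids[np.newaxis]


rng = np.random.default_rng(0)
B, N, K, D = 2, 5, 4, 3  # toy sizes; the report's architecture uses K=64, D=512
x = rng.standard_normal((B, N, D))                  # local features
centroids = rng.standard_normal((K, D))             # cluster centres
assign = softmax(rng.standard_normal((B, N, K)), axis=2)  # soft assignment

v_loop = vlad_loop(x, assign, centroids)
v_batched = vlad_batched(x, assign, centroids)
```

The batched form does the same arithmetic as the loop, regrouped so that the per-cluster work becomes one matrix product per batch element.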
+
+## Carried-over Findings
+
+- **F1 from cumulative review 43-45**: `_iso_ts_from_clock` duplicated
+  across 6 modules — `c2_vpr/net_vlad.py` is the 6th copy. AZ-508
+  hygiene PBI exists for consolidation.
+
+## New Findings (per Batch 46 code review)
+
+- **F2 (Low, Spec-Hygiene)**: AZ-338 spec § Outcome uses outdated C7
+  API names + implies an AZ-507-violating architecture-registration
+  location. Recommend a spec-hygiene follow-up that refreshes
+  AZ-337..AZ-340 against the stabilised C7 v1.0.0 + AZ-507 patterns.
+- **F3 (Low, Test-Coverage)**: NFR-perf microbench (`embed_query` p95
+  ≤ 80ms on Tier-1 Jetson) deferred — no real PyTorch CUDA host on
+  the dev tier. Schedule under FT-P-19 / C2-IT-01 on Tier-1 hardware.
+- **F4 (Low, Architecture)**: PCA-projection sidecar verification
+  deferred — the architecture loads PCA weights via
+  `load_state_dict(strict=True)` only; no cross-check against an
+  AZ-280 sidecar manifest yet.
+
+## Jira Tracker
+
+- AZ-338 transitioned: To Do → In Progress → In Testing.
+- Task spec archived: `_docs/02_tasks/todo/AZ-338_c2_net_vlad.md` →
+  `_docs/02_tasks/done/AZ-338_c2_net_vlad.md`.
+
+## Next
+
+Per the autodev orchestrator loop: select Batch 47. C2 production
+path is now half-built (AZ-336 Protocol + AZ-341 FaissBridge + AZ-338
+NetVLAD baseline). The remaining C2 strategies are AZ-337 (UltraVPR
+primary, 5pt), AZ-339 (MegaLoc + MixVPR, 5pt), AZ-340 (SelaVPR +
+EigenPlaces + SALAD, 5pt). Other candidates: AZ-358 (C4 OpenCV/GTSAM
+pose estimator, 5pt), AZ-349 (C3.5 AdHoP refiner, 5pt), AZ-389 (C5
+internal orthorectifier, 3pt), AZ-508 (ISO timestamp hygiene, 2pt),
+or a spec-hygiene PBI to address F2 / F3 from this batch + cumulative
+F2 / F3.
@@ -8,9 +8,9 @@ status: in_progress
 sub_step:
  phase: 7
  name: batch-loop
-  detail: "batch 46 — AZ-338 (C2 NetVLAD mandatory simple-baseline)"
+  detail: "batch 46 complete — selecting batch 47"
 retry_count: 0
 cycle: 1
 tracker: jira
-last_completed_batch: 45
+last_completed_batch: 46
 last_cumulative_review: batches_43-45