[AZ-338] Archive task spec + batch 46 report + state bump

- _docs/02_tasks/todo/AZ-338_c2_net_vlad.md
  -> _docs/02_tasks/done/AZ-338_c2_net_vlad.md
- _docs/03_implementation/batch_46_cycle1_report.md (new)
- _docs/_autodev_state.md: last_completed_batch 45 -> 46;
  sub_step.detail "batch 46 complete - selecting batch 47"

AZ-338 transitioned in Jira: In Progress -> In Testing.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-13 22:31:56 +03:00
parent af0dbe863a
commit 773d589d34
3 changed files with 205 additions and 2 deletions
@@ -0,0 +1,203 @@
# Batch 46 / Cycle 1 — Implementation Report
**Date**: 2026-05-13
**Tasks**: AZ-338 — C2 NetVLAD Mandatory Simple-Baseline (3pt)
**Total complexity**: 3 points
**Result**: PASS_WITH_WARNINGS (per-batch code review)
**Jira tracker state**: AZ-338 transitioned To Do → In Progress → In Testing
## Scope
NetVLAD is the C2 comparative baseline mandated by the engine rule
(every production-default backbone ships with a simple-baseline
alongside; description.md § 1). Per § 5 the baseline runs on the C7
PyTorch FP16 runtime (NOT TensorRT). This runtime isolation means a TRT
engine compile bug cannot break the baseline and the primary at the
same time. This
batch lands the first concrete `VprStrategy` implementation, validating
the AZ-341 `FaissBridge` plumbing and establishing the pattern that
AZ-337 / AZ-339 / AZ-340 follow.
## Files Changed
### Production (new)
- `src/gps_denied_onboard/components/c2_vpr/net_vlad.py`
`NetVladStrategy` class implementing the `VprStrategy` Protocol.
Constructor wires `InferenceRuntimeCut` + `DescriptorIndexCut` +
`NetVladBackbonePreprocessor` + `DescriptorNormaliser` +
`FaissBridge`. Module-level `MODEL_NAME` + `architecture_factory()`
exposed for the composition root to bind to C7's architecture
registry. Module-level `create(config, descriptor_index,
inference_runtime)` factory consumed by `build_vpr_strategy`.
- `src/gps_denied_onboard/components/c2_vpr/_net_vlad_architecture.py`
— canonical NetVLAD-VGG16 architecture (Arandjelović et al. 2016):
VGG16 feature extractor up to conv5_3 + NetVLAD pooling layer
(soft-assign 1x1 conv + cluster centroids + batched residual
aggregation) + optional `nn.Linear(K*D, descriptor_dim)` PCA
projection. K=64, D=512 (VGG16 conv5_3 channels), default
descriptor_dim=4096. Torch / torchvision imported lazily inside the
factory.
- `src/gps_denied_onboard/components/c2_vpr/_preprocessor_net_vlad.py`
`NetVladBackbonePreprocessor` implementing the C2-internal
`BackbonePreprocessor` Protocol. Decode → centre-crop square →
resize (480, 480) → ImageNet mean/std → FP16 NCHW.
- `src/gps_denied_onboard/components/c2_vpr/inference_runtime_cut.py`
— NEW AZ-507 consumer-side cut mirroring the subset of C7
`InferenceRuntime` that C2 strategies consume (`compile_engine`,
`deserialize_engine`, `infer`, `release_engine`,
`current_runtime_label`). Lets `c2_vpr` stay AZ-507-clean.
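A consumer-side cut of this kind can be sketched as a structural `typing.Protocol`. The five method names come from the list above; the signatures, the `Any` placeholder types, the `FakeRuntime` double, and the `"pytorch_fp16"` label are assumptions made for illustration, not the project's actual definitions.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class InferenceRuntimeCut(Protocol):
    """Subset of the C7 runtime that a C2 strategy calls (signatures assumed)."""

    def compile_engine(self, model_path: str, build_config: Any) -> Any: ...
    def deserialize_engine(self, entry: Any) -> Any: ...
    def infer(self, handle: Any, inputs: Any) -> Any: ...
    def release_engine(self, handle: Any) -> None: ...
    def current_runtime_label(self) -> str: ...


class FakeRuntime:
    """Hypothetical test double; it imports and subclasses nothing from C7."""

    def compile_engine(self, model_path, build_config):
        return {"path": model_path, "config": build_config}

    def deserialize_engine(self, entry):
        return ("handle", entry["path"])

    def infer(self, handle, inputs):
        return inputs

    def release_engine(self, handle):
        return None

    def current_runtime_label(self):
        return "pytorch_fp16"  # assumed label value


runtime = FakeRuntime()
# Structural check: passes because the method names match, not because of inheritance.
is_cut = isinstance(runtime, InferenceRuntimeCut)
```

Because the check is structural, a C2 strategy can type its constructor argument against the cut and accept the real C7 runtime at the composition root without `c2_vpr` importing it. Note that `isinstance` against a runtime-checkable Protocol only verifies that the methods exist, not their signatures.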
### Production (modified)
- `src/gps_denied_onboard/components/c2_vpr/config.py` — added
`netvlad_descriptor_dim: int = 4096` knob + `__post_init__`
validation.
- `src/gps_denied_onboard/components/c2_vpr/__init__.py` — re-exported
`InferenceRuntimeCut`.
- `src/gps_denied_onboard/helpers/descriptor_normaliser.py` — added
`intra_cluster_normalise(descriptor, num_clusters)` for NetVLAD's
dual-stage normalisation chain. Backward-compatible function
addition (v1.0.0 → v1.1.0).
- `src/gps_denied_onboard/fdr_client/records.py` — registered three
new record kinds: `vpr.embed_query`, `vpr.backbone_error`,
`vpr.preprocess_error`.
- `src/gps_denied_onboard/runtime_root/vpr_factory.py` — added
`_register_strategy_architecture` helper that binds
`(MODEL_NAME, architecture_factory(descriptor_dim))` to C7's
architecture registry before delegating to the strategy's `create()`
factory. Keeps the c7 import at L4, preserves AZ-507.
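The registration flow in the last bullet can be sketched as follows. Only `MODEL_NAME`, `architecture_factory`, `create`, `build_vpr_strategy`, and `_register_strategy_architecture` are names from this report; the dict-based registry, the stand-in strategy module, the `"net_vlad"` value of `MODEL_NAME`, and all argument shapes are assumptions.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict

# Hypothetical stand-in for C7's architecture registry (the real API is not shown here).
ARCHITECTURE_REGISTRY: Dict[str, Callable[[], Any]] = {}


def architecture_factory(descriptor_dim: int) -> Callable[[], Any]:
    """Return a zero-argument builder closed over descriptor_dim (model stubbed out)."""
    def build() -> Dict[str, Any]:
        return {"arch": "net_vlad_vgg16", "descriptor_dim": descriptor_dim}
    return build


def create(config: Any, descriptor_index: Any, inference_runtime: Any) -> Dict[str, Any]:
    """Strategy-side factory: it never touches the registry itself."""
    return {"label": "net_vlad", "dim": config.netvlad_descriptor_dim}


# Stand-in for the strategy module object that the composition root imports.
net_vlad = SimpleNamespace(
    MODEL_NAME="net_vlad",  # assumed value
    architecture_factory=architecture_factory,
    create=create,
)


def _register_strategy_architecture(strategy_module: Any, descriptor_dim: int) -> None:
    """Composition-root helper: the only place that writes to the registry."""
    ARCHITECTURE_REGISTRY[strategy_module.MODEL_NAME] = (
        strategy_module.architecture_factory(descriptor_dim)
    )


def build_vpr_strategy(config: Any, descriptor_index: Any, inference_runtime: Any) -> Any:
    # Bind the architecture first, then delegate to the strategy's own factory.
    _register_strategy_architecture(net_vlad, config.netvlad_descriptor_dim)
    return net_vlad.create(config, descriptor_index, inference_runtime)


config = SimpleNamespace(netvlad_descriptor_dim=4096)
strategy = build_vpr_strategy(config, descriptor_index=None, inference_runtime=None)
```

The point of the split is that the strategy module only exposes data and callables, while the single module allowed to import both sides performs the binding.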
### Tests
- `tests/unit/c2_vpr/test_net_vlad.py` — NEW, 31 tests covering all
11 ACs (with per-AC variants for AC-2, AC-6, AC-8, AC-9, AC-11) +
preprocessor contract (4) + constructor validation (3) + FDR record
emission (2) + architecture-factory closure (2).
- `tests/unit/test_az283_descriptor_normaliser.py` — +8 tests for
`intra_cluster_normalise`: per-cluster unit norm, dtype preservation,
zero-cluster handling, non-divisible length rejection, 2-D
rejection, zero/bool `num_clusters` rejection, float64 rejection,
no-mutation invariant.
- `tests/unit/test_az272_fdr_record_schema.py` — +3 fixture payloads
for the three new VPR record kinds so the schema roundtrip test
exercises all of them.
### Docs
- `_docs/02_document/contracts/shared_helpers/descriptor_normaliser.md`
— v1.0.0 → v1.1.0; added `intra_cluster_normalise` row + changelog
entry.
- `_docs/03_implementation/reviews/batch_46_review.md` — per-batch code
review (PASS_WITH_WARNINGS).
## Acceptance Criteria Coverage
All 11 ACs of AZ-338 have at least one covering unit test:
| AC | Description | Status |
|----|-------------|--------|
| AC-1 | Protocol conformance | covered |
| AC-2 | L2-norm == 1.0 ± 1e-3 FP16 (D,) | covered (4096 + 512 variants) |
| AC-3 | `intra_cluster_normalise` BEFORE `l2_normalise` | covered (spy + once-each) |
| AC-4 | Deterministic across 3 calls | covered |
| AC-5 | `retrieve_topk` == k, label="net_vlad", sorted | covered |
| AC-6 | `descriptor_dim()` stable | covered (4096 + 512 variants) |
| AC-7 | Engine output shape mismatch → ConfigError | covered |
| AC-8 | `VprBackboneError` on forward failure | covered (3 variants) |
| AC-9 | `VprPreprocessError` on corrupt image | covered (3 variants) |
| AC-10 | Composition-root wiring + `c2.vpr.ready` log | covered (log + model_name forcing) |
| AC-11 | `BUILD_PYTORCH_RUNTIME=OFF` → ConfigError fail-fast | covered (tensorrt + onnx_trt_ep variants) |
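The properties behind AC-2 and AC-3 can be illustrated with a small NumPy sketch. The two function names match the helpers named in this report, but the bodies below are assumptions written for illustration (FP32, toy sizes), not the code in `descriptor_normaliser.py`.

```python
import numpy as np


def intra_cluster_normalise(descriptor: np.ndarray, num_clusters: int) -> np.ndarray:
    """L2-normalise each of the num_clusters equal-length slices of a 1-D descriptor."""
    if descriptor.ndim != 1 or descriptor.size % num_clusters != 0:
        raise ValueError("descriptor must be 1-D with length divisible by num_clusters")
    blocks = descriptor.reshape(num_clusters, -1)
    norms = np.linalg.norm(blocks, axis=1, keepdims=True)
    # All-zero clusters stay zero instead of dividing by zero.
    safe = np.where(norms > 0, norms, 1.0)
    return (blocks / safe).reshape(-1).astype(descriptor.dtype)


def l2_normalise(descriptor: np.ndarray) -> np.ndarray:
    """Scale the whole descriptor to unit L2 norm (zero vectors are returned unchanged)."""
    norm = np.linalg.norm(descriptor)
    return descriptor if norm == 0 else (descriptor / norm).astype(descriptor.dtype)


rng = np.random.default_rng(0)
K, D = 4, 8  # toy sizes; the report uses K=64, D=512
raw = rng.standard_normal(K * D).astype(np.float32)

stage1 = intra_cluster_normalise(raw, K)  # every cluster slice now has unit norm
final = l2_normalise(stage1)              # the whole vector now has unit norm

cluster_norms = np.linalg.norm(stage1.reshape(K, D), axis=1)
stage1_norm = float(np.linalg.norm(stage1))
final_norm = float(np.linalg.norm(final))
```

After the first stage the full vector has norm `sqrt(K)` rather than 1, which is why the second, global stage is still needed to satisfy a unit-norm check such as AC-2.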
## Test Results
- **Full unit suite**: `1608 passed / 80 environment-skipped / 0 failed`
in ~81s. Up from 1565 at the close of Batch 45 (+43 new tests).
- **Focused per-component**: `c2_vpr/test_net_vlad.py` 31/31 PASS;
`c2_vpr/test_faiss_bridge.py` 22/22 PASS (no regression);
`test_az283_descriptor_normaliser.py` 23/23 PASS (15 original + 8
new); `test_az272_fdr_record_schema.py` 3/3 PASS.
- **Lint**: `ruff check` clean on every new + modified file.
- **AZ-507 layering lint**: `test_ac6_only_compose_root_imports_concrete_strategies` PASS.
## Architectural Decisions
1. **Architecture-registration moved to composition root**. The
AZ-338 spec implied `c2_vpr.net_vlad.create()` registers the
NetVLAD nn.Module factory with C7. That violates AZ-507 (no
cross-component imports). Resolved by exposing `MODEL_NAME` +
`architecture_factory(descriptor_dim)` on the strategy module and
having `runtime_root/vpr_factory.py::_register_strategy_architecture`
perform the c7 binding before calling the strategy's `create()`
factory. Pattern is generalisable to AZ-337 / AZ-339 / AZ-340
strategies that also use the PyTorch runtime.
2. **C7 API names aligned with v1.0.0 Protocol**. The spec uses
`runtime.forward(engine_id, ...)` and
`inference_runtime.load_engine(weights_path)`. Live C7 Protocol
(AZ-297) is `infer(handle, inputs)` + `compile_engine(model_path,
build_config) → entry` + `deserialize_engine(entry) → handle`.
Implementation aligns with the v1.0.0 Protocol; spec § Outcome is
stale on these names (flagged as F2 in code review).
3. **InferenceRuntimeCut**. New AZ-507 consumer-side cut joins
`DescriptorIndexCut` (AZ-341), `TileDownloaderCut` (AZ-328),
`TileUploaderCut` (AZ-329), `FdrFooterReader` (AZ-329), making five
structural Protocol cuts across the codebase, all named `*Cut`
or `*Reader`, all decorated with `@runtime_checkable`. Pattern is stable.
4. **Dual-stage normalisation order**. `intra_cluster_normalise`
BEFORE `l2_normalise` is mandatory per AZ-338 spec § Constraints
and per the published Pittsburgh NetVLAD preprocessing chain.
Verified by AC-3 spy. Reversing the order would silently break
AC-2.1b (recall regression).
5. **PCA-projection inside the architecture**. The published
Pittsburgh NetVLAD reference ships with a learned `Linear(K*D=32768,
4096)` PCA-whitening layer. The architecture embeds the layer
as `nn.Linear(K*D, descriptor_dim)`; when `descriptor_dim == K*D`
the layer is omitted (raw VLAD). Default `descriptor_dim=4096` per
the spec. The `.pth` state dict is expected to carry the PCA
weights alongside the rest of the model parameters.
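6. **NetVLAD pooling implemented via `torch.bmm`**, not a Python
`for k in range(K)` loop. One batched matrix multiply aggregates the
residuals for all K=64 clusters at once. It computes the same result
with the same operation count as the reference Python-loop form, but
avoids K sequential per-cluster passes on GPU. No timing comparison
is recorded in this report.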
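The equivalence claimed in item 6 can be checked on toy tensors. The sketch below uses NumPy's batched `matmul` as a stand-in for `torch.bmm` (an assumption, so that the example runs without PyTorch). The residual aggregation follows the standard NetVLAD definition, and all function names and shapes are illustrative rather than taken from `_net_vlad_architecture.py`.

```python
import numpy as np


def vlad_loop(soft_assign, features, centroids):
    """Reference form: one Python iteration per cluster k."""
    batch, num_clusters, _ = soft_assign.shape
    dim = features.shape[2]
    out = np.zeros((batch, num_clusters, dim), dtype=features.dtype)
    for k in range(num_clusters):
        residual = features - centroids[k]               # (B, N, D)
        weights = soft_assign[:, k, :, None]             # (B, N, 1)
        out[:, k, :] = (weights * residual).sum(axis=1)  # (B, D)
    return out


def vlad_batched(soft_assign, features, centroids):
    """Batched form: sum_n a[k,n] * (x[n] - c[k]) = (A @ X)[k] - (sum_n a[k,n]) * c[k]."""
    weighted_sum = np.matmul(soft_assign, features)  # (B, K, N) @ (B, N, D) -> (B, K, D)
    mass = soft_assign.sum(axis=2, keepdims=True)    # (B, K, 1)
    return weighted_sum - mass * centroids           # centroids broadcast as (K, D)


rng = np.random.default_rng(0)
B, K, N, D = 2, 4, 6, 3  # toy sizes; the report uses K=64, D=512
logits = rng.standard_normal((B, K, N))
soft_assign = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax over K
features = rng.standard_normal((B, N, D))
centroids = rng.standard_normal((K, D))

loop_out = vlad_loop(soft_assign, features, centroids)
batched_out = vlad_batched(soft_assign, features, centroids)
max_abs_diff = float(np.abs(loop_out - batched_out).max())
```

Both forms perform the same multiply-adds. The batched one simply expresses them as a single matrix product plus one broadcasted subtraction, which is the shape of computation a GPU handles well.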
## Carried-over Findings
- **F1 from cumulative review 43-45**: `_iso_ts_from_clock` duplicated
across 6 modules — `c2_vpr/net_vlad.py` is the 6th copy. AZ-508
hygiene PBI exists for consolidation.
## New Findings (per Batch 46 code review)
- **F2 (Low, Spec-Hygiene)**: AZ-338 spec § Outcome uses outdated C7
API names + implies an AZ-507-violating architecture-registration
location. Recommend a spec-hygiene follow-up that refreshes
AZ-337..AZ-340 against the stabilised C7 v1.0.0 + AZ-507 patterns.
- **F3 (Low, Test-Coverage)**: NFR-perf microbench (`embed_query` p95
≤ 80ms on Tier-1 Jetson) deferred — no real PyTorch CUDA host on
the dev tier. Schedule under FT-P-19 / C2-IT-01 on Tier-1 hardware.
- **F4 (Low, Architecture)**: PCA-projection sidecar verification
deferred — the architecture loads PCA weights via
`load_state_dict(strict=True)` only; no cross-check against an
AZ-280 sidecar manifest yet.
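For the microbench deferred under F3, a p95 measurement could look roughly like the sketch below. The helper name, warm-up and run counts, and the dummy workload are all hypothetical; only the 80 ms budget comes from this report.

```python
import statistics
import time
from typing import Callable, List


def p95_latency_ms(fn: Callable[[], object], warmup: int = 5, runs: int = 50) -> float:
    """Time fn() repeatedly and return the 95th-percentile latency in milliseconds."""
    for _ in range(warmup):
        fn()  # discard warm-up iterations (caches, lazy initialisation)
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(samples, n=20)[-1]


def fake_embed_query() -> int:
    """Dummy CPU workload standing in for the real embed_query call."""
    return sum(range(1000))


BUDGET_MS = 80.0  # p95 budget quoted in F3
p95 = p95_latency_ms(fake_embed_query)
within_budget = p95 <= BUDGET_MS
```

On a CUDA host the timed callable would also need to wait for the GPU to finish before the clock stops (for example via `torch.cuda.synchronize()`), otherwise only the kernel launch is measured.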
## Jira Tracker
- AZ-338 transitioned: To Do → In Progress → In Testing.
- Task spec archived: `_docs/02_tasks/todo/AZ-338_c2_net_vlad.md`
→ `_docs/02_tasks/done/AZ-338_c2_net_vlad.md`.
## Next
Per the autodev orchestrator loop: select Batch 47. C2 production
path is now half-built (AZ-336 Protocol + AZ-341 FaissBridge + AZ-338
NetVLAD baseline). The remaining C2 strategies are AZ-337 (UltraVPR
primary, 5pt), AZ-339 (MegaLoc + MixVPR, 5pt), AZ-340 (SelaVPR +
EigenPlaces + SALAD, 5pt). Other candidates: AZ-358 (C4 OpenCV/GTSAM
pose estimator, 5pt), AZ-349 (C3.5 AdHoP refiner, 5pt), AZ-389 (C5
internal orthorectifier, 3pt), AZ-508 (ISO timestamp hygiene, 2pt),
or a spec-hygiene PBI to address F2 / F3 from this batch + cumulative
F2 / F3.
+2 -2
@@ -8,9 +8,9 @@ status: in_progress
sub_step:
phase: 7
name: batch-loop
-detail: "batch 46 — AZ-338 (C2 NetVLAD mandatory simple-baseline)"
+detail: "batch 46 complete — selecting batch 47"
retry_count: 0
cycle: 1
tracker: jira
-last_completed_batch: 45
+last_completed_batch: 46
last_cumulative_review: batches_43-45