gps-denied-onboard

azaion/gps-denied-onboard

mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 08:51:12 +00:00

Commit Graph

Author	SHA1	Message	Date

Oleksandr Bezdieniezhnykh

97f5f9793c

[AZ-965] NetVLAD-VGG16 backbone checkpoint + YAML/compose wiring

AZ-965 ships the NetVLAD .pt checkpoint that clears the AZ-839
empty-c10_provisioning.backbones SKIP gate. It is a
pipeline-integration scaffold: the encoder carries real pretrained
weights, while the NetVLAD tail is random-initialised and labelled
as untrained.

Composition:

* Encoder (26 keys, encoder.0..encoder.28): torchvision
  vgg16(weights=IMAGENET1K_V1).features[:-2], BSD-3-Clause.
  Real ImageNet-pretrained VGG16 conv stack.
* NetVLAD pool + PCA tail (5 keys: pool.conv.{weight,bias},
  pool.centroids, pca.{weight,bias}): random-init via
  torch.manual_seed(0). NOT trained for visual place recognition.
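
The "26 keys, encoder.0..encoder.28" figure can be cross-checked without torch. The following is a minimal sketch, assuming the standard VGG16 "D" layer layout used by torchvision and assuming each conv contributes a `.weight` and a `.bias` key under the `encoder.` prefix; the helper names are hypothetical and not taken from the repo.

```python
# VGG16 "D" configuration: ints are conv output channels, "M" is a max-pool.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_feature_layers(cfg=VGG16_CFG):
    """Return (index, kind) pairs laid out like vgg16().features."""
    layers = []
    for item in cfg:
        if item == "M":
            layers.append("pool")
        else:
            # Every conv is followed by a ReLU in the feature stack.
            layers.extend(["conv", "relu"])
    return list(enumerate(layers))


def encoder_state_dict_keys(drop_last=2):
    """Keys left after slicing features[:-drop_last] (hypothetical helper)."""
    kept = vgg16_feature_layers()[:-drop_last]
    keys = []
    for idx, kind in kept:
        if kind == "conv":  # only conv layers own parameters
            keys += [f"encoder.{idx}.weight", f"encoder.{idx}.bias"]
    return keys


keys = encoder_state_dict_keys()
print(len(keys), keys[0], keys[-1])  # 26 encoder.0.weight encoder.28.bias
```

Dropping the last two modules removes the final ReLU and max-pool, so the stack ends on the conv at index 28.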

Total: 149,002,112 params (568.4 MiB fp32, sha256=745c6f29...).
Round-trip verified locally: torch.load(weights_only=True) +
load_state_dict(strict=True) succeed; forward(1,3,480,480) emits
{'vlad_descriptor': (1, 4096) fp32} — matches NetVladStrategy
contract per net_vlad.py:247-251.
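
The parameter total and file size above can be reproduced with plain arithmetic. This is a sketch under stated assumptions: 64 NetVLAD clusters over 512-channel features, a 1x1 pool conv with bias, and a dense PCA projection from 64*512 to 4096 dimensions. The cluster count and layer shapes are not stated in this commit; they are assumed here because they reproduce the quoted total exactly.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


# Output channels of the 13 VGG16 conv layers, in order.
VGG16_CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256,
                       512, 512, 512, 512, 512, 512]


def encoder_params():
    total, c_in = 0, 3  # RGB input
    for c_out in VGG16_CONV_CHANNELS:
        total += conv_params(c_in, c_out, 3)
        c_in = c_out
    return total


def tail_params(clusters=64, dim=512, out_dim=4096):
    # ASSUMPTION: clusters=64 and dim=512 are not given in the commit.
    pool_conv = conv_params(dim, clusters, 1)       # pool.conv.{weight,bias}
    centroids = clusters * dim                      # pool.centroids
    pca = clusters * dim * out_dim + out_dim        # pca.{weight,bias}
    return pool_conv + centroids + pca


total = encoder_params() + tail_params()
size_mib = total * 4 / 2**20  # fp32 = 4 bytes per parameter
print(f"{total:,} params, {size_mib:.1f} MiB fp32")
# 149,002,112 params, 568.4 MiB fp32
```

Under these assumptions the PCA projection accounts for roughly 134 million of the 149 million parameters, which is why the checkpoint is far larger than the VGG16 conv stack alone.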

Two material discoveries documented in the AZ-965 spec:

1. The NetVLAD-VGG16 architecture already lives in the repo at
   src/gps_denied_onboard/components/c2_vpr/_net_vlad_architecture.py.
   We instantiate it and save a state_dict rather than sourcing a
   checkpoint externally.
2. The PyTorch FP16 runtime expects a .pt state_dict (NOT .onnx).
   BackboneConfig.onnx_path is a misnomer for NetVLAD: per AZ-321
   design + c2_vpr description.md §1, NetVLAD runs on PyTorch FP16
   (NOT TRT). compile_engine is a no-op sha256+path wrap;
   deserialize_engine does torch.load(weights_only=True) +
   load_state_dict(strict=True).
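
To illustrate what a "no-op sha256+path wrap" could look like, here is a minimal stdlib-only sketch. `EngineHandle` and `compile_engine_passthrough` are hypothetical names for illustration; the real compile_engine signature and return type in the repo are not shown in this commit and may differ.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class EngineHandle:
    """Hypothetical record: where the checkpoint is and what it hashes to."""
    path: Path
    sha256: str


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so a ~568 MiB checkpoint is never fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compile_engine_passthrough(checkpoint_path):
    """No compilation happens: record the path and its digest, nothing else."""
    path = Path(checkpoint_path)
    return EngineHandle(path=path, sha256=sha256_of_file(path))
```

The deserialize side would then load the recorded path with torch.load(weights_only=True) and load_state_dict(strict=True), as described in item 2; that step is omitted here because it needs torch and the in-repo architecture class.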

The user did not answer the Option A/B/C/D/E question, so Option B
(IMAGENET1K_V1 encoder + random tail) was chosen as a judgment call,
per "use judgment, don't block":
* Option A (Nanne translation) is 5-8 SP, above the 5 SP budget.
* Option B is 3 SP, fits the budget, and is honestly labelled.
* Option C (pure random) is borderline-dishonest per Real Results.

Files:

* scripts/mk_netvlad_checkpoint.py — deterministic generator.
* models/netvlad/netvlad.pt — 568 MiB, via git-lfs (.gitattributes
  extended for models/**/*.pt, *.onnx, *.engine).
* configs/operator_replay.yaml — c2_vpr + c10_provisioning blocks
  populated; the field literally named onnx_path actually points
  at the .pt for NetVLAD per the runtime semantics noted above.
* docker-compose.test.jetson.yml — ./models:/opt/models:ro bind
  mount added to e2e-runner.
* _docs/03_ip_attribution/netvlad.md — provenance, licence, how-to-
  reproduce, honest scope statement ("NOT a real-retrieval
  checkpoint; ESKF divergence under garbage retrievals is the
  expected next gate").
* _docs/02_tasks/todo/AZ-965_netvlad_onnx_backbone_provisioning.md
  — rewritten to reflect the .pt-not-.onnx + Option B discoveries.
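
Because the checkpoint is stored via git-lfs and reaches the container through a read-only bind mount, a checkout without the LFS objects would expose a small text pointer instead of the weights. The sketch below is one possible pre-flight check, not something this commit adds: the function names and the size threshold are hypothetical, and only the pointer header string is taken from the Git LFS pointer format.

```python
from pathlib import Path

# First line of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def check_checkpoint(path, min_bytes=1 << 20):
    """Classify a checkpoint path (hypothetical helper, assumed threshold)."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    if is_lfs_pointer(p):
        return "lfs-pointer"   # LFS objects were not fetched before mounting
    if p.stat().st_size < min_bytes:
        return "too-small"     # far below the expected ~568 MiB
    return "ok"
```

Inside the e2e-runner this might be called on the mounted copy of models/netvlad/netvlad.pt under /opt/models; that container path is inferred from the bind mount listed above rather than stated in the commit.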

Tier-2 verification follows in a separate commit after the harness
run confirms the empty-backbones SKIP gate clears.

Out of scope (filed as follow-ups):

* Real-retrieval NetVLAD weights (Nanne Pittsburgh-30k translation
  or internal team checkpoint) — separate ticket.
* AZ-840 orchestrator PASSing end-to-end (depends on retrieval
  quality + ESKF stability).
* AZ-963 60s smoke ESKF divergence (independent chain).

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-29 18:03:32 +03:00

1 Commits