# NetVLAD-VGG16 Checkpoint: Provenance & License

**Artifact**: `models/netvlad/netvlad.pt`
**Generated**: 2026-05-29 (AZ-965)
**Architecture**: project-owned `_NetVladVgg16` in `src/gps_denied_onboard/components/c2_vpr/_net_vlad_architecture.py`
**Parameters**: 149,002,112 (~568.4 MiB fp32)
**SHA-256**: `745c6f29faa4e6754a74189c503189dbab1978d8ff2c65b48c95749b4e48c444`

This checkpoint is a **pipeline-integration scaffold**, not a retrieval-quality artifact. The encoder weights come from a real public source (torchvision IMAGENET1K_V1), but the NetVLAD pool and PCA tail are deterministic-random: they have NOT been trained for visual place recognition. The orchestrator will run end-to-end with these weights, but retrieval results will be effectively random.

## Composition

| Layer | Source | License | Trained-for-VPR? |
|---|---|---|---|
| `encoder.0` … `encoder.28` (26 keys, VGG16 features `[:-2]`) | `torchvision.models.vgg16(weights="IMAGENET1K_V1")` | BSD-3-Clause | No (ImageNet classification) |
| `pool.conv.weight` (64, 512, 1, 1) | `torch.manual_seed(0)` → arch-default init | Project-owned | No |
| `pool.conv.bias` (64,) | Same | Project-owned | No |
| `pool.centroids` (64, 512) | Same | Project-owned | No |
| `pca.weight` (4096, 32768) | Same | Project-owned | No |
| `pca.bias` (4096,) | Same | Project-owned | No |

Total: 31 state_dict keys; loads strictly into `make_net_vlad_vgg16(num_clusters=64, encoder_dim=512, descriptor_dim=4096)`.

## Encoder licence (BSD-3-Clause)

`torchvision.models.vgg16` weights are distributed by PyTorch under the BSD-3-Clause licence:

> Copyright (c) 2016-, PyTorch Contributors.
>
> Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: …

Full text: https://github.com/pytorch/vision/blob/main/LICENSE (torchvision project).

The model weights themselves are derived from the ImageNet dataset; commercial use of ImageNet-derived models is subject to the ImageNet terms of access (https://www.image-net.org/download.php).

## How to reproduce

```bash
# From repo root, in the project virtualenv:
source .venv/bin/activate

# torchvision IMAGENET1K_V1 weights download requires HTTPS cert
# validation. On macOS with Python.org installer the system trust
# store is not used by default; export certifi's bundle:
export SSL_CERT_FILE=$(python -c "import certifi; print(certifi.where())")

# Generate the checkpoint:
python scripts/mk_netvlad_checkpoint.py
# → writes models/netvlad/netvlad.pt
```

The script is **deterministic** (`torch.manual_seed(0)` is set before the random-init layers, and the IMAGENET1K_V1 weights are content-addressed). Re-running on a different machine yields the same SHA-256.

## Why this isn't a real-retrieval checkpoint

AZ-965 was scoped at 3 SP to unblock the AZ-840 orchestrator's empty-`c10_provisioning.backbones` skip-gate. A real-retrieval checkpoint requires one of:

1. **Translate Nanne's Pittsburgh-30k weights** (https://github.com/Nanne/pytorch-NetVlad). Nanne's `vladv2=False` default sets `pool.conv.bias=False` (no bias key in their state_dict); the project's architecture has `bias=True`. WPCA is also stored separately as `nn.Conv2d(4096, 32768, 1, 1)` and would need a reshape→`nn.Linear` conversion. Estimated 5-8 SP for the translation script plus follow-up Tier-2 verification.
2. **Train from scratch on aerial-imagery datasets** (e.g. xView, BigEarthNet, NWPU-RESISC45). Multi-week effort with GPU compute budget.
3. **Use an internal team checkpoint** if one exists.

This is filed as the AZ-965 follow-up (see the AZ-965 spec for ticket reference).

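
For option 1, the state_dict mismatches can be listed at the shape level before any tensors are touched. The sketch below is illustrative only: `TARGET_SHAPES` is copied from the Composition table, while `SOURCE_SHAPES` is a hypothetical stand-in that assumes the source keys have already been renamed to the project's names and that the WPCA weight has shape `(4096, 32768, 1, 1)`. Neither assumption has been checked against Nanne's actual checkpoint. `plan_translation`, `strip_trailing_ones`, and the action labels are made-up names.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]

# Target shapes, copied from the Composition table in this document.
TARGET_SHAPES: Dict[str, Shape] = {
    "pool.conv.weight": (64, 512, 1, 1),
    "pool.conv.bias": (64,),
    "pool.centroids": (64, 512),
    "pca.weight": (4096, 32768),
    "pca.bias": (4096,),
}

# ASSUMPTION (hypothetical): source keys already renamed to project names,
# no conv bias present, and WPCA stored as a 1x1 conv weight.
SOURCE_SHAPES: Dict[str, Shape] = {
    "pool.conv.weight": (64, 512, 1, 1),
    "pool.centroids": (64, 512),
    "pca.weight": (4096, 32768, 1, 1),
    "pca.bias": (4096,),
}


def strip_trailing_ones(shape: Shape) -> Shape:
    """Drop trailing singleton dims, e.g. the spatial axes of a 1x1 kernel."""
    dims = list(shape)
    while len(dims) > 1 and dims[-1] == 1:
        dims.pop()
    return tuple(dims)


def plan_translation(source: Dict[str, Shape], target: Dict[str, Shape]) -> Dict[str, str]:
    """Label each target key as copy / reshape / missing / incompatible."""
    plan: Dict[str, str] = {}
    for key, want in target.items():
        have = source.get(key)
        if have is None:
            plan[key] = "missing"
        elif have == want:
            plan[key] = "copy"
        elif strip_trailing_ones(have) == want:
            plan[key] = "reshape"
        else:
            plan[key] = "incompatible"
    return plan


if __name__ == "__main__":
    for key, action in plan_translation(SOURCE_SHAPES, TARGET_SHAPES).items():
        print(f"{key:20s} {action}")
```

A key reported as `missing`, such as `pool.conv.bias`, could in principle be filled with zeros, since a zero bias leaves the output of a bias-free convolution unchanged. Whether that is acceptable here is a question for the follow-up ticket.
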
## Observable behaviour with this checkpoint

With this scaffold checkpoint and the Derkachi clip:

* `c10_provisioning.compile_engines_for_corpus` succeeds (PyTorch FP16 runtime is a no-op `compile_engine` that just sha-256's the `.pt` and records the path).
* `c2_vpr.NetVladStrategy.create()` succeeds (encoder/pool/pca all load, output shape `(1, 4096)` matches descriptor_dim).
* `embed_query` produces valid `(1, 4096)` fp16 vectors per frame.
* `retrieve_topk` produces top-K matches, but they are effectively random, because the NetVLAD pool + PCA never learned a semantic embedding space.
* Downstream ESKF measurement updates fed from random tile matches will likely diverge, surfacing as a SEPARATE failure mode that's NOT the empty-backbones gate AZ-965 closed.

That ESKF divergence under garbage retrievals is the EXPECTED next gate for the orchestrator chain, and is a separate ticket from AZ-965.
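
As noted above, the PyTorch FP16 `compile_engine` only hashes the `.pt` file and records its path. The same kind of check can confirm that a regenerated checkpoint matches the SHA-256 recorded in this document. The following is a minimal sketch using only the standard library; `sha256_of_file` and `matches_recorded_digest` are hypothetical helper names, not the project's actual implementation.

```python
import hashlib
from pathlib import Path

# Digest recorded in this document for models/netvlad/netvlad.pt.
EXPECTED_SHA256 = "745c6f29faa4e6754a74189c503189dbab1978d8ff2c65b48c95749b4e48c444"


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so a ~570 MiB checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_digest(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA-256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    checkpoint = Path("models/netvlad/netvlad.pt")
    if checkpoint.exists():
        print("digest OK" if matches_recorded_digest(checkpoint) else "digest MISMATCH")
    else:
        print(f"{checkpoint} not found; run scripts/mk_netvlad_checkpoint.py first")
```

A mismatch after re-running the generation script would suggest that the determinism assumption does not hold in that environment, for example because of a different library version, and is worth investigating before the checkpoint is used.
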