mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 10:31:12 +00:00
[AZ-965] [AZ-835] Archive completed task specs to done/
@@ -0,0 +1,92 @@
# End-to-end real-flight validation pipeline (Epic)

**Task**: AZ-835_e2e_real_flight_validation_epic
**Name**: End-to-end real-flight validation: raw (tlog, video) → route-driven satellite seeding → gps-denied verdict
**Description**: Drive the full gps-denied-onboard validation pipeline from raw operator inputs to a verdict. Given a `.tlog` binary + a flight video, the system automatically extracts the flight cut, syncs frames to IMU, builds the satellite imagery the descriptor stack needs (route-driven, not bbox-driven), runs the airborne pipeline, and reports the horizontal-error distribution against the tlog's own GPS ground truth. Supersedes AZ-777 Phase 3+ design.
**Complexity**: Epic — ~17 SP decomposed into 6 child tasks of ≤ 5 SP each (see decomposition table below)
**Dependencies**: AZ-777 Phase 1 (landed cycle 3 batch 105 — C11 contract adaptation + e2e-runner wiring); AZ-405 (tlog↔video auto-sync adapter); AZ-699 (verdict report writer); AZ-809 SOFT (Route API validation — landing AZ-809 before C2 lets the client consume RFC 7807 validator responses cleanly)
**Component**: cross-cutting — replay_input + new TlogRouteExtractor + new SatelliteProviderRouteClient + e2e fixtures + tests/e2e/replay
**Tracker**: AZ-835 (https://denyspopov.atlassian.net/browse/AZ-835)
**Originating directive**: user (2026-05-22) after AZ-777 Phase 2 deliverables landed — "In the end it should be full e2e flow. You give it a tlog + video, and the system does everything else."

Jira AZ-835 is the authoritative spec; this file mirrors the in-workspace-only sections that gps-denied-onboard implementers will need.

## Goal

A single pytest test takes only `(tlog, video, calibration)` as input and runs the full 7-step pipeline end-to-end on the Jetson harness, producing an honest PASS/FAIL verdict against the AZ-696 AC-3 threshold (≥ 80 % of emissions within 100 m).
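
The threshold itself is simple arithmetic and can be sketched as follows. This is an illustration only, not the code in `helpers/accuracy_report.py` or `helpers/gps_compare.py`; the function names, the pairing of estimates with ground truth by index, and the haversine distance model are all assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; adequate for errors around 100 m


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def verdict(estimates, truths, radius_m=100.0, min_fraction=0.80):
    """Return (passed, fraction) for index-paired (lat, lon) estimates and truths.

    Hypothetical helper: assumes each emission has already been matched to
    the tlog GPS sample it should be compared against.
    """
    if not estimates or len(estimates) != len(truths):
        raise ValueError("need equal-length, non-empty sequences")
    errors = [haversine_m(e[0], e[1], t[0], t[1]) for e, t in zip(estimates, truths)]
    within = sum(1 for err in errors if err <= radius_m)
    fraction = within / len(errors)
    return fraction >= min_fraction, fraction
```

With five emissions of which four fall inside 100 m, `verdict` reports a fraction of 0.8, which just meets the gate.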

## The 7-step pipeline

| # | Step | Existing? | Component / new code |
|---|------|-----------|----------------------|
| 1 | Extract active flight cut + sync with video | **Mostly existing** (AZ-405 `tlog_video_adapter.py`) | small extension for take-off/landing boundary detection if needed |
| 2 | On-the-fly frame + IMU extraction | **Existing** | `VideoFileFrameSource` + `TlogReplayFcAdapter` (no change) |
| 3 | Auto-create route from tlog GPS, coarsen to ≤ 10 pts | **New** | `TlogRouteExtractor` (Douglas-Peucker on `GLOBAL_POSITION_INT` rows) → `RouteSpec` |
| 4 | POST route to satellite-provider, get tiles | **New consumer** | `SatelliteProviderRouteClient` (POST `/api/satellite/route`, poll `mapsReady`) |
| 5 | Build FAISS index from tiles | **Mostly existing** | C10 `DescriptorBatcher` runs; new fixture wires C11 → C10 trigger |
| 6 | Run gps-denied on the assembled inputs | **Existing** | `gps-denied-replay` console-script + airborne composition root |
| 7 | Get GPS fixes, check against tlog GPS | **Existing** | `helpers/accuracy_report.py` + `helpers/gps_compare.py` |

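
Step 3 is the only step that names an algorithm, so a minimal sketch of it may help. The code below is not `TlogRouteExtractor`; it assumes the `GLOBAL_POSITION_INT` rows have already been decoded into `(lat, lon)` pairs in degrees, uses a flat local projection, and meets the waypoint cap by growing the Douglas-Peucker tolerance until few enough points survive. The function names, the starting tolerance and the growth factor are all hypothetical.

```python
import math


def _to_local_m(points):
    """Project (lat, lon) degrees to local east/north metres (equirectangular)."""
    lat0, lon0 = points[0]
    k = 111_320.0  # approximate metres per degree of latitude
    c = math.cos(math.radians(lat0))
    return [((lon - lon0) * k * c, (lat - lat0) * k) for lat, lon in points]


def _seg_dist(p, a, b):
    """Distance from point p to the segment a-b in the plane."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def _rdp_indices(xy, eps):
    """Iterative Douglas-Peucker; returns the sorted indices of kept points."""
    keep = {0, len(xy) - 1}
    stack = [(0, len(xy) - 1)]
    while stack:
        i, j = stack.pop()
        best, best_d = None, eps
        for k in range(i + 1, j):
            d = _seg_dist(xy[k], xy[i], xy[j])
            if d > best_d:
                best, best_d = k, d
        if best is not None:  # farthest point exceeds the tolerance: keep it and split
            keep.add(best)
            stack += [(i, best), (best, j)]
    return sorted(keep)


def coarsen_route(points, max_points=10, eps_m=5.0):
    """Return (waypoints, tolerance_used_m) with at most max_points waypoints."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    if len(points) <= max_points:
        return list(points), eps_m
    xy = _to_local_m(points)
    while True:
        idx = _rdp_indices(xy, eps_m)
        if len(idx) <= max_points:
            return [points[i] for i in idx], eps_m
        eps_m *= 1.5  # assumed growth factor; loosen until the cap is met
```

Returning the tolerance that was finally used makes it easy to log or persist alongside the route, which fits the requirement that the DP tolerance be documented.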
## Decomposition (6 child tasks)

| # | Title | Est | Depends |
|---|-------|-----|---------|
| C1 | `TlogRouteExtractor` — extract active segment + coarsen to N waypoints | 3 | — |
| C2 | `SatelliteProviderRouteClient` + `route_seed.py` CLI | 3 | AZ-809 (soft) |
| C3 | New `operator_pre_flight_setup` fixture (C1 + C2 + C11 + C10) — replaces placeholder, supersedes AZ-777 Phase 3 | 5 | C1, C2, AZ-777 Phase 1 |
| C4 | E2E test ingesting raw `(tlog, video)` and running steps 1-7 — extends/replaces AZ-699 verdict test | 3 | C3 |
| C5 | Un-xfail AZ-777 AC-4 + AC-5 tests | 1 | C4 |
| C6 | Docs: `replay_protocol.md` Invariant 12 + AZ-777 amendment + new-test README | 2 | C5 |

**Total ~17 SP**.

## Why route-driven seeding (not bbox)

- **Efficiency**: The AZ-777 spec bbox needs ~11400 tiles at z15-z18 (~140 MB, 48% over budget). A 10-point coarsened route with `regionSizeMeters=500` per point needs ~50-100 unique tiles (~1.5 MB) for the same VPR descriptor lock area, a **~100× reduction**.
- **Honesty**: bbox pre-commits to where the operator *might* fly. Route pre-commits to where they *did* fly. For real-flight validation, the latter is the right primitive.
- **Probe-confirmed**: Route API works end-to-end in ~15s for a 2-point route per 2026-05-22 black-box probe. Uses `lat`/`lon` already (no AZ-812 rename needed).

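
The request-then-poll shape of the Route API consumer can be sketched without touching the network. Only `lat`, `lon`, `regionSizeMeters` and `mapsReady` come from this spec; the payload layout (a top-level `points` list with one shared `regionSizeMeters`), the function names and the default timeout are assumptions, and the HTTP calls are injected as plain callables so that no particular client library is implied.

```python
import time


def build_route_payload(waypoints, region_size_m=500):
    """Build a request body for POST /api/satellite/route.

    Hypothetical layout: the real schema is defined by satellite-provider.
    """
    return {
        "points": [{"lat": lat, "lon": lon} for lat, lon in waypoints],
        "regionSizeMeters": region_size_m,
    }


def wait_for_maps_ready(fetch_status, timeout_s=120.0, interval_s=2.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll fetch_status() until it reports mapsReady, or raise TimeoutError.

    fetch_status is any zero-argument callable returning the decoded status
    document as a dict; how it talks to the service is up to the caller.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status.get("mapsReady") is True:
            return status
        if clock() >= deadline:
            raise TimeoutError("route tiles not ready within the runtime budget")
        sleep(interval_s)
```

Injecting `clock` and `sleep` keeps the polling loop testable off-device; a real client would pass a `fetch_status` that issues the GET against the satellite-provider status endpoint.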
## Coordination with prior work

- **AZ-777** — Phase 1 + Phase 2 reused; Phase 3+ design **superseded** by this Epic when C3 lands.
- **AZ-699** — verdict-report-writing path preserved; C4 extends or wraps it.
- **AZ-405** — tlog↔video auto-sync adapter reused as-is for step 1.
- **AZ-702** — camera factory-sheet calibration unchanged.
- **AZ-696** — ≥ 80 % within 100 m threshold gate unchanged.
- **AZ-808** — Region-endpoint validation; not on this Epic's critical path (Route used, not Region).
- **AZ-809** — Route-endpoint validation; soft prereq for C2.
- **AZ-812** — Region rename to lat/lon; not on this Epic's critical path.

## Acceptance criteria (Epic-level)

**AC-1**: New pytest test gated by `RUN_REPLAY_E2E=1` + `@pytest.mark.tier2` takes only `(tlog, video, calibration)` and runs the full 7-step pipeline on Jetson.

**AC-2**: Step 1 auto-detects active flight cut from raw tlog (take-off → landing) without operator intervention.

**AC-3**: Step 3 produces ≤ 10 waypoints that materially follow the tlog GPS trajectory (DP tolerance documented in config).

**AC-4**: Step 4 succeeds against real satellite-provider on Jetson docker network, downloads route tiles from Google Maps, `mapsReady=true` within runtime budget.

**AC-5**: Step 5 builds FAISS HNSW index over route-seeded C6 cache; sidecar triple-consistency holds (AZ-306).

**AC-6**: Step 7 emits AZ-699 verdict report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` with honest horizontal-error distribution — PASS or FAIL on AZ-696 AC-3 threshold, no xfail mask.

**AC-7**: End-to-end run ≤ 15 min on Tier-2 Jetson for the Derkachi clip (soft target for first delivery; hard NFR after first measurement).

**AC-8**: Docs: `replay_protocol.md` Invariant 12 sub-section + AZ-777 marked Phase 3+ superseded + new-test README.

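
AC-2 does not say how the take-off and landing boundaries are found. One possible heuristic, shown purely as a sketch, is to take the longest run of samples whose relative altitude stays above a threshold. `GLOBAL_POSITION_INT.relative_alt` is reported in millimetres; the threshold values, the `(time_s, relative_alt_mm)` sample shape and the function names are assumptions, and the AZ-405 adapter may well use a different signal.

```python
def _longer(a, b):
    """Return whichever (start, end) interval is longer; a may be None."""
    if a is None or (b[1] - b[0]) > (a[1] - a[0]):
        return b
    return a


def detect_flight_cut(samples, airborne_alt_m=5.0, min_duration_s=10.0):
    """Return (start_s, end_s) of the longest airborne run, or None.

    samples: iterable of (time_s, relative_alt_mm) pairs in time order.
    A run starts at the first sample at or above airborne_alt_m and ends
    at the last sample before the altitude drops below it again.
    """
    best = None
    run_start = None
    last_t = None
    for t, rel_alt_mm in samples:
        airborne = rel_alt_mm / 1000.0 >= airborne_alt_m
        if airborne and run_start is None:
            run_start = t
        elif not airborne and run_start is not None:
            best = _longer(best, (run_start, last_t))
            run_start = None
        last_t = t
    if run_start is not None:  # log ended while still airborne
        best = _longer(best, (run_start, last_t))
    if best is None or best[1] - best[0] < min_duration_s:
        return None
    return best
```

Picking the longest run rather than the first one is a design choice that tolerates short hops or sensor spikes before the real take-off; a production detector would likely also look at arming state or ground speed.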
## Out of scope

- Satellite-provider imagery-source migration to CC-BY (parent-suite ticket, TBD).
- FAISS / NetVLAD backbone replacement.
- Real-time tlog ingestion (this Epic operates on finished `.tlog` files).
- Multi-flight aggregate validation.
- Any modification to `../satellite-provider/` (Route API consumed as-is).
- CI gating (test stays behind `RUN_REPLAY_E2E=1`).

## References

- Jira AZ-835: https://denyspopov.atlassian.net/browse/AZ-835
- Supersedes AZ-777 Phase 3+ design (AZ-777 Phase 1 + Phase 2 reused)
- Probe foundation: 2026-05-22 black-box probe of Route API confirmed end-to-end viability
- Related: AZ-405, AZ-696, AZ-699, AZ-702, AZ-777, AZ-808, AZ-809, AZ-812

@@ -0,0 +1,114 @@
# AZ-965 — Provision NetVLAD backbone for AZ-839 `c10_provisioning` corpus

**Status**: In Progress (Jira) / `todo/` (local)
**Issue type**: Task
**Complexity**: 3 SP (was estimated 3-5)
**Cycle**: cycle-4 e2e closure follow-up
**Jira**: https://denyspopov.atlassian.net/browse/AZ-965
**Filed**: 2026-05-29 (forward-looked during AZ-962)
**Started**: 2026-05-29

## Why

Forward-looked during AZ-962 + confirmed by AZ-964's Tier-2 result: with the FAISS index gate cleared (AZ-964), the AZ-840 orchestrator test SKIPs at the **empty-backbones gate** in `tests/e2e/replay/conftest.py:594-601`:

```
AZ-839 operator_pre_flight_setup: config has no c10_provisioning.backbones
entries — the e2e harness config must declare at least one backbone
(typically DINOv2-VPR or NetVLAD per AZ-321).
```

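
The condition behind that message amounts to an emptiness check on one config key. The helper below is a hypothetical reconstruction for illustration, not the code at `conftest.py:594-601`; it assumes the harness config has already been parsed into nested dicts.

```python
def backbones_or_skip_reason(config):
    """Return (backbones, None), or ([], reason) when the gate should skip.

    Hypothetical helper: treats a missing section, a missing key, a null
    value and an empty list the same way.
    """
    section = config.get("c10_provisioning") or {}
    backbones = section.get("backbones") or []
    if not backbones:
        return [], "config has no c10_provisioning.backbones entries"
    return list(backbones), None
```

A fixture could pass the returned reason straight to `pytest.skip`, which keeps the gate's decision separate from the test framework and easy to unit-test.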
## Important corrections to the original spec

Two material discoveries during AZ-965 implementation change the shape of the work:

1. **The architecture already exists in the repo**: `src/gps_denied_onboard/components/c2_vpr/_net_vlad_architecture.py` defines `make_net_vlad_vgg16(num_clusters=64, encoder_dim=512, descriptor_dim=4096)` — the project's own NetVLAD-VGG16 module. We do NOT need to source ONNX from elsewhere; we instantiate the architecture, load weights into it, and save a state_dict.
2. **Runtime expects a PyTorch `.pt` state_dict, NOT `.onnx`**. Per AZ-321's design (and `_docs/02_document/components/02_c2_vpr/description.md` §1): NetVLAD runs on the C7 **PyTorch FP16 runtime** (NOT TensorRT). The PyTorch FP16 `compile_engine` is a **no-op** that only SHA-256-hashes the `.pt` path; `deserialize_engine` calls `torch.load(weights_only=True)` + `model.load_state_dict(state_dict, strict=True)`. The `BackboneConfig.onnx_path` field is a **misnomer for NetVLAD** — for the TensorRT primary backbone (UltraVPR/DINOv2) it really is `.onnx`, but for the PyTorch-FP16 baseline (NetVLAD) it's a `.pt` path.

## Chosen approach — Option B (judgment call)

The original spec's source options were:

* A — Translate Nanne/pytorch-NetVlad's Pittsburgh-30k weights (5-8 SP — exceeds the 5 SP budget per `tracker.mdc` user-rule; needs split).
* B — `torchvision.models.vgg16(weights="IMAGENET1K_V1")` encoder + deterministic-random NetVLAD pool/PCA (3 SP, honestly labelled as untrained-tail).
* C — Pure synthetic state_dict (2 SP, but borderline-dishonest per "Real Results, Not Simulated Ones").
* D — Internal team checkpoint (user-provided).
* E — Defer AZ-965 entirely.

The user was presented with options A-E on 2026-05-29 and skipped the choice. Per the "use judgment, don't block" pattern observed today, the judgment call was **Option B**: torchvision IMAGENET1K_V1 encoder + deterministic-random tail. Reasoning:

* Encoder IS a real public source (torchvision BSD-3-Clause).
* 3 SP fits the budget.
* NetVLAD pool + PCA tail clearly labelled as untrained in provenance — honest per meta-rule.
* Unblocks the gate to surface the next real issue (which is likely ESKF divergence under garbage retrievals — a separate ticket).

## Goal

Provision a NetVLAD-VGG16 `.pt` checkpoint at `models/net_vlad/net_vlad.pt` + matching `BackboneConfig` entry in `configs/operator_replay.yaml` so the AZ-839 fixture skip-gate clears and the AZ-840 orchestrator can compose c10 (+ c2_vpr) into a real pipeline run. File stem MUST equal `c2_vpr.net_vlad.MODEL_NAME == "net_vlad"` — the PyTorch FP16 runtime uses `path.stem` as the architecture-registry lookup key.

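
The file-stem constraint is easy to get wrong when renaming a checkpoint, so a small illustration follows. The registry below is a stand-in dictionary, not the runtime's real architecture registry, and `resolve_architecture` is a hypothetical name; only the `MODEL_NAME` value and the use of `path.stem` as the lookup key come from this spec.

```python
from pathlib import PurePosixPath

MODEL_NAME = "net_vlad"  # value the spec gives for c2_vpr.net_vlad.MODEL_NAME

# Stand-in registry for illustration: maps a stem to a factory name.
ARCHITECTURES = {MODEL_NAME: "make_net_vlad_vgg16"}


def resolve_architecture(weights_path):
    """Look up an architecture by the stem of the weights file path."""
    stem = PurePosixPath(weights_path).stem
    try:
        return ARCHITECTURES[stem]
    except KeyError:
        raise LookupError(f"no architecture registered for stem {stem!r}") from None
```

Under this model `/opt/models/net_vlad/net_vlad.pt` resolves, while a file named `netvlad_v2.pt` in the same directory would not, regardless of what its parent directory is called.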
## Scope

1. **Write `scripts/mk_netvlad_checkpoint.py`** — generates a deterministic `.pt`:
   * Loads `torchvision.models.vgg16(weights="IMAGENET1K_V1")` features, slices `[:-2]` to match `_NetVladVgg16.encoder`.
   * Seeds `torch.manual_seed(0)`, instantiates `make_net_vlad_vgg16(num_clusters=64, encoder_dim=512, descriptor_dim=4096)`, overlays ImageNet features into `encoder.*` keys.
   * Saves to `models/net_vlad/net_vlad.pt`.
   * Prints SHA-256 + key composition.
2. **Add `models/**/*.pt`, `*.onnx`, `*.engine` to `.gitattributes` for git-lfs**.
3. **Commit `models/net_vlad/net_vlad.pt` via git-lfs**.
4. **Update `configs/operator_replay.yaml`**:
   ```yaml
   c2_vpr:
     strategy: net_vlad
     backbone_weights_path: /opt/models/net_vlad/net_vlad.pt
     netvlad_descriptor_dim: 4096
     warn_top1_threshold: 0.30

   c10_provisioning:
     workspace_mb: 4096
     backbones:
       - model_name: net_vlad
         onnx_path: /opt/models/net_vlad/net_vlad.pt
         expected_input_shape: [3, 480, 480]
         input_name: input
   ```
5. **Add `./models:/opt/models:ro` bind-mount** to `docker-compose.test.jetson.yml` e2e-runner.
6. **Write `_docs/03_ip_attribution/netvlad.md`** — provenance, licence, how to reproduce, honest scope statement.
7. **Tier-2 verify**: `JETSON_SSH_ALIAS=jetson bash scripts/run-tests-jetson.sh` — confirm the AZ-840 orchestrator test no longer SKIPs at the empty-backbones gate. Document the next gate that surfaces.
8. **File follow-up ticket** for real-retrieval NetVLAD weights (Nanne translation or internal source) — out of AZ-965 scope.

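
The overlay-and-report part of scope item 1 can be illustrated without PyTorch. In the sketch below plain dictionaries stand in for state_dicts, so it shows only the key bookkeeping that a `strict=True` load depends on, plus the SHA-256 and key-composition printout; it is not `scripts/mk_netvlad_checkpoint.py`. The helper names and the assumption that the sliced torchvision feature keys map one-to-one onto `encoder.*` keys are hypothetical.

```python
import hashlib


def overlay_encoder(target_state, source_features, prefix="encoder."):
    """Copy source entries into target keys under prefix without changing the key set.

    Raises KeyError if a source key has no counterpart in the target, or if
    any encoder key is left without a source value; either case would leave
    the checkpoint inconsistent with the architecture.
    """
    merged = dict(target_state)
    for name, value in source_features.items():
        key = prefix + name
        if key not in merged:
            raise KeyError(f"unexpected key {key!r}; a strict load would fail")
        merged[key] = value
    untouched = [k for k in merged
                 if k.startswith(prefix) and k[len(prefix):] not in source_features]
    if untouched:
        raise KeyError(f"encoder keys left un-overlaid: {untouched}")
    return merged


def key_composition(state, prefix="encoder."):
    """Count how many keys belong to the encoder and how many to the tail."""
    encoder = sum(1 for k in state if k.startswith(prefix))
    return {"encoder": encoder, "tail": len(state) - encoder}


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

In the real script the dictionaries would be the `state_dict()` of the seeded `make_net_vlad_vgg16(...)` module and of the sliced torchvision features, and the merged result would be written with `torch.save` before hashing.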
## Acceptance Criteria

* **AC-1**: `models/net_vlad/net_vlad.pt` exists in the repo (via git-lfs) with documented provenance + licence.
* **AC-2**: `torch.load(path, weights_only=True)` + `load_state_dict(strict=True)` on `make_net_vlad_vgg16()` succeeds locally (round-trip verified before commit).
* **AC-3**: `configs/operator_replay.yaml` declares the `net_vlad` backbone in `c10_provisioning.backbones` and the `c2_vpr` block with matching `backbone_weights_path`.
* **AC-4**: `JETSON_SSH_ALIAS=<alias> bash scripts/run-tests-jetson.sh` no longer SKIPs `test_az840_e2e_real_flight_orchestration` with the empty-backbones message.
* **AC-5**: A NEW gate (whatever the orchestrator's next blocker is — likely ESKF divergence under garbage retrievals, or a missing c4/c5 component block) is documented as a follow-up ticket. AZ-840 PASSing is OUT OF SCOPE for AZ-965.
* **AC-6**: Provenance + licence recorded in `_docs/03_ip_attribution/netvlad.md`.
* **AC-7**: The follow-up ticket "real trained NetVLAD weights (Nanne translation or internal)" is filed in Jira.

## Out of scope

* DINOv2-VPR or other alternative primary backbones (NetVLAD is AZ-321's pinned baseline and the c10 corpus only needs ONE backbone to clear the gate).
* Real-retrieval-quality NetVLAD weights (Nanne translation, internal checkpoint, or training) — separate follow-up ticket.
* MegaLoc / MixVPR / UltraVPR / SelaVPR / EigenPlaces / SALAD provisioning.
* The 4 ESKF-divergence regression failures from the 60s smoke (AZ-963).
* Reference C6 tile cache for the Derkachi fixture.
* Making AZ-840 actually PASS end-to-end.

## Dependencies

* **Blocked by**: AZ-964 (FAISS index bootstrap — cleared 2026-05-29).
* **Blocks**: AZ-840 orchestrator PASS (which requires AZ-965 + real retrieval weights + ESKF stability under retrieval input).
* **Related**: AZ-321 (defines NetVLAD as the C2 baseline), AZ-336 / AZ-338 (NetVLAD strategy impl), AZ-839 (C3 fixture).

## References

* Fixture skip-gate: `tests/e2e/replay/conftest.py:594-601` + `:654-666`
* Backbone factory: `src/gps_denied_onboard/runtime_root/c10_factory.py::build_backbone_specs`
* `BackboneConfig` dataclass: `src/gps_denied_onboard/components/c10_provisioning/config.py:110-156`
* NetVLAD strategy: `src/gps_denied_onboard/components/c2_vpr/net_vlad.py`
* NetVLAD architecture: `src/gps_denied_onboard/components/c2_vpr/_net_vlad_architecture.py`
* PyTorch FP16 runtime (the actual consumer): `src/gps_denied_onboard/components/c7_inference/pytorch_fp16_runtime.py:119-212`
* C2 VPR description: `_docs/02_document/components/02_c2_vpr/description.md` §1 §5
* AZ-321 spec: `_docs/02_tasks/done/AZ-321_c10_engine_compiler.md`
* AZ-964 spec: `_docs/02_tasks/done/AZ-964_faiss_index_bootstrap_for_az839_fixture.md`