mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 13:01:14 +00:00
[AZ-423] [AZ-427] Add FT-P-19 + FT-N-05 blackbox tests
Implement the AC-8.6 (top-K=10 retrieval scale-ratio + scene-change
PARTIAL) and AC-8.2 / AC-NEW-6 (stale aged-tile rejection) blackbox
scenarios.
AZ-423 (FT-P-19, 3pt) helpers + scenario:
- retrieval_evaluator.py — top-K within-distance evaluator (60 stills
vs 100 m budget), scene-change PARTIAL recorder (always emits
PARTIAL on the 2 _gmaps.png pairs), FDR record projectors, CSV
writers.
- tests/positive/test_ft_p_19_sat_reloc_scale.py (6 parametrised
variants).
AZ-427 (FT-N-05, 2pt) helpers + scenario:
- aged_tile_rejection_evaluator.py — Signal A (stale rejection at
load) + Signal B (per-frame downgrade) decision matrix, reuses
ALLOWED_SOURCE_LABELS from estimate_schema.
- tests/negative/test_ft_n_05_stale_tile_rejection.py (12 parametrised
variants: FC × VIO × {7mo/active-conflict, 13mo/rear}).
48 new unit tests cover every helper branch. Both scenarios skip
when sitl_replay_ready is false and fail loudly when fixture records
are missing.
Per-batch review: PASS_WITH_WARNINGS (2 Low — production-dependency
surface, FDR-kind constant duplication).
Cumulative review 82-84: PASS (2 Low carry-over / hygiene candidate).
Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,63 @@
# FT-P-19 — Satellite relocalization scale-ratio + scene-change PARTIAL

**Task**: AZ-423_ft_p_19_sat_reloc_scale
**Name**: UAV-frame footprint scale-ratio retrievable from cache (FULL); scene-change subset (PARTIAL) (AC-8.6)
**Description**: Implement FT-P-19. For each `still-image-set-60` image, query the cache with top-K=10 retrieval and assert that the top-K result includes a tile whose centre is within 100 m of the image's true centre. For the 2 paired `_gmaps.png` images, run the cross-domain matcher and record scene-change behavior as PARTIAL (full coverage requires a labeled change-pair dataset, deferred under D-PROJ-3).
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Positive (epic AZ-262)
**Tracker**: AZ-423
**Epic**: AZ-262 (E-BBT)

## Problem

AC-8.6 has two halves: scale-ratio retrievability (the UAV-frame footprint at deployment altitude can be retrieved from the cache regardless of internal tiling) and scene-change handling (the matcher succeeds when the satellite reference shows seasonal or temporal differences). The first is fully measurable; the second is constrained by the lack of labeled change-pair data and is documented as PARTIAL in the traceability matrix.

## Outcome

- pytest scenario at `e2e/tests/positive/test_ft_p_19_sat_reloc_scale.py`.
- For each of the 60 still images: observe the SUT's cache top-K=10 retrieval (via the per-frame retrieval log in FDR, or via a public cache-query API if one is exposed); assert that the top-K includes a tile whose centre is within 100 m of the image's true centre.
- For the 2 paired `_gmaps.png` images: run the cross-domain matcher; record per-image match success (boolean) into CSV; mark the scene-change subset as `PARTIAL` in `report.csv`.

## Scope
### Included
- Per-image top-K=10 retrieval observation.
- Per-image distance check on top-K members.
- Scene-change subset (2 paired images) with PARTIAL annotation.

### Excluded
- Stale-tile rejection: owned by FT-N-05.
- Multi-flight scene-change statistics: deferred under D-PROJ-3 (out of scope for this task).

## Acceptance Criteria
**AC-1: scale-ratio retrievability (60 images)**
Given each still image
When the SUT performs top-K=10 retrieval
Then the top-K result includes a tile whose centre is within 100 m of the image's true centre, for all 60 images.
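
The distance check in AC-1 can be sketched as follows. This is a minimal illustration, not the contents of `retrieval_evaluator.py`: the names `haversine_m` and `top_k_hit`, the `(lat, lon)` tuple representation, and the use of a great-circle distance are all assumptions made for the example.

```python
import math
from typing import Iterable, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; adequate at a 100 m scale

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees; assumed layout


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def top_k_hit(true_centre: LatLon, tile_centres: Iterable[LatLon],
              k: int = 10, budget_m: float = 100.0) -> bool:
    """True if any of the first k retrieved tile centres is within budget_m."""
    candidates = list(tile_centres)[:k]  # retrieval order is preserved
    return any(haversine_m(true_centre, c) <= budget_m for c in candidates)
```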

**AC-2: scene-change subset PARTIAL**
Given the 2 paired `_gmaps.png` images
When the cross-domain matcher runs against each
Then the per-image result is recorded; the scenario's overall result is `PARTIAL` in `report.csv` for this subset (regardless of pass/fail count, because N=2 is too small for statistical confidence).
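
A sketch of the structural PARTIAL annotation described in AC-2, under stated assumptions: the column names (`image`, `matched`, `subset_status`) and the helper names are hypothetical and are not taken from the actual `report.csv` schema. The point it illustrates is that the subset status does not depend on the match outcomes.

```python
import csv
import io
from typing import Dict, List, TextIO

FIELDNAMES = ["image", "matched", "subset_status"]  # hypothetical columns


def scene_change_rows(match_results: Dict[str, bool]) -> List[dict]:
    """One row per paired image; the subset status is PARTIAL by construction."""
    return [
        {"image": name, "matched": str(ok).lower(), "subset_status": "PARTIAL"}
        for name, ok in sorted(match_results.items())
    ]


def write_rows(rows: List[dict], stream: TextIO) -> None:
    """Write the rows as CSV with a header line."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
```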

**AC-3: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.
## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: reading the FDR per-frame retrieval log, or calling a public cache-query API (whichever is documented).
- **Forbidden**: importing the C2 / C6 internal index; monkeypatching the FAISS query.

## Constraints

- The PARTIAL annotation is structural: the scenario emits it for the scene-change subset regardless of pass/fail count, because the AC text itself acknowledges insufficient data (per the traceability matrix).

## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-P-19
- `_docs/02_document/tests/traceability-matrix.md` § AC-8.6 (PARTIAL annotation)
@@ -0,0 +1,63 @@
# FT-N-05 — Stale-tile rejection on freshness violation

**Task**: AZ-427_ft_n_05_stale_tile_rejection
**Name**: Tiles violating AC-8.2 freshness window are rejected or downgraded (AC-8.2, AC-NEW-6)
**Description**: Implement FT-N-05. Replay 60 still images against `synth-age-7mo` (SUT configured for the active-conflict sector) and against `synth-age-13mo` (SUT configured for the rear sector). In each case the SUT either rejects the load, or loads the tiles but never emits `satellite_anchored` from them.
**Complexity**: 2 points
**Dependencies**: AZ-406, AZ-407 (synth-age-tile-set)
**Component**: Blackbox Tests / Negative / Cache freshness (epic AZ-262)
**Tracker**: AZ-427
**Epic**: AZ-262 (E-BBT)

## Problem

Stale-tile rejection is a security-relevant freshness gate (AC-8.2, AC-NEW-6). Without it the SUT could promote outdated geographic information to `satellite_anchored`, enabling silent location drift. The blackbox suite must exercise this gate under both sector configurations.

## Outcome

- pytest scenario at `e2e/tests/negative/test_ft_n_05_stale_tile_rejection.py`.
- Two sub-cases:
  - `synth-age-7mo` mounted; SUT configured for active-conflict sector; replay 60 stills.
  - `synth-age-13mo` mounted; SUT configured for rear sector; replay 60 stills.
- For each sub-case: assert that 0 frames emit `source_label = satellite_anchored` from these tiles. Either rejection at load (FDR `tile-load-rejected: stale`) or per-frame downgrade is acceptable.

## Scope
### Included
- Both sub-cases (7mo / 13mo).
- Sector-configuration switch (per AC-8.2 active-conflict vs rear).

### Excluded
- Mid-flight tile freshness positive case: owned by FT-N-06 (inside AZ-422).

## Acceptance Criteria
**AC-1: 7mo aged tiles in active-conflict sector**
Given the SUT mounts `synth-age-7mo` and is configured for active-conflict sector
When 60 still images are replayed
Then 0 outbound emissions carry `source_label = satellite_anchored`. Acceptable signals: (a) FDR `tile-load-rejected: stale` events at startup; (b) per-frame `source_label ∈ {visual_propagated, dead_reckoned}` throughout.
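
One way to read AC-1 as a decision rule is sketched below. It is an interpretation, not the logic of `aged_tile_rejection_evaluator.py`: the `Verdict` type, the `evaluate` function, and the representation of FDR events as plain strings are assumptions. The rule encoded here is that the run passes when no frame is `satellite_anchored` and at least one of the two acceptable signals is observed.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence

STALE_REJECTION_EVENT = "tile-load-rejected: stale"  # assumed string form
DOWNGRADED_LABELS = frozenset({"visual_propagated", "dead_reckoned"})


@dataclass(frozen=True)
class Verdict:
    passed: bool
    signal_a: bool        # stale rejection observed at load
    signal_b: bool        # every replayed frame carries a downgraded label
    anchored_frames: int  # frames that emitted satellite_anchored


def evaluate(fdr_events: Iterable[str], source_labels: Sequence[str]) -> Verdict:
    """Combine the load-time and per-frame signals into one verdict."""
    signal_a = any(event == STALE_REJECTION_EVENT for event in fdr_events)
    anchored = sum(1 for label in source_labels if label == "satellite_anchored")
    signal_b = bool(source_labels) and all(
        label in DOWNGRADED_LABELS for label in source_labels
    )
    passed = anchored == 0 and (signal_a or signal_b)
    return Verdict(passed, signal_a, signal_b, anchored)
```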

**AC-2: 13mo aged tiles in rear sector**
Given the SUT mounts `synth-age-13mo` and is configured for rear sector
When 60 still images are replayed
Then the outcome is the same as in AC-1: 0 outbound emissions carry `source_label = satellite_anchored`, with the same two acceptable signals.

**AC-3: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.
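
The commit message describes 12 parametrised variants (FC × VIO × the two sub-cases). The sketch below shows how such a matrix could be built. The adapter and strategy names are placeholders, and the 2 × 3 split is assumed only so that the product comes to 12; the real axes are defined in the project's conftest.

```python
from itertools import product
from typing import List, Tuple

FC_ADAPTERS = ("fc_a", "fc_b")                # placeholder names
VIO_STRATEGIES = ("vio_a", "vio_b", "vio_c")  # placeholder names
SUB_CASES = (
    ("synth-age-7mo", "active-conflict"),
    ("synth-age-13mo", "rear"),
)

Variant = Tuple[str, str, str, str]  # (fc_adapter, vio_strategy, tile_set, sector)


def variant_matrix() -> List[Variant]:
    """Cartesian product of adapters, strategies and aged-tile sub-cases."""
    return [
        (fc, vio, tile_set, sector)
        for fc, vio, (tile_set, sector) in product(
            FC_ADAPTERS, VIO_STRATEGIES, SUB_CASES
        )
    ]


def variant_id(variant: Variant) -> str:
    """Readable test id, e.g. for the ids= argument of a parametrize call."""
    return "-".join(variant)
```

A list like this could be handed to `pytest.mark.parametrize` together with `ids=[variant_id(v) for v in variant_matrix()]`; each tile set stays paired with its sector, so the 7mo set is never run under the rear configuration.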
## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: reading FDR for `tile-load-rejected` events; reading the outbound `source_label` stream.
- **Forbidden**: importing C6 freshness gate state; monkeypatching the date-comparison logic.

## Constraints

- The "active-conflict" / "rear" sector distinction is defined by AC-8.2; the configuration mechanism is documented in the SUT's config schema (per E-CC-CONF / `module-layout.md`).

## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-N-05
- `_docs/02_document/tests/test-data.md` § Seed Data Sets (synth-age-tile-set)