mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 10:21:13 +00:00
[AZ-414] [AZ-415] [AZ-418] Test batch 71: sharp turn + multi-segment + smoothing

- AZ-414 (FT-P-07 + FT-N-02): sharp_turn_detector helper covering AC-1 (gyro_z run detection + synthetic-overlay fallback), AC-2/AC-3 (FT-N-02 during-turn label + monotonic covariance), AC-4/AC-5/AC-6 (FT-P-07 recovery lag/drift/heading); twin scenario files under positive/ and negative/.
- AZ-415 (FT-P-08): multi_segment_evaluator helper + scenario.
- AZ-418 (FT-P-10): smoothing_evaluator helper covering AC-1 (raw + smoothed pose pairing), AC-2 (improvement rate >= 0.80), AC-3 (mean improvement >= 5 m); scenario file.
- All scenarios skip-gated on upstream frame_source_replay / imu_replay / fdr_reader stubs (auto-activate when AZ-441 + AZ-407 leftovers land).
- +68 unit tests; full e2e unit suite: 393 passed.

See _docs/03_implementation/batch_71_report.md and _docs/03_implementation/reviews/batch_71_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,81 +0,0 @@
# FT-P-07 + FT-N-02: Sharp-turn recovery (positive + negative twin)

**Task**: AZ-414_ft_p_07_ftn_02_sharp_turn
**Name**: Sharp-turn recovery via satellite reference + legitimate frame-to-frame failure expected (AC-3.2)
**Description**: Implement FT-P-07 (recovery within 3 frames of turn end; drift ≤ 200 m; heading change handled) and FT-N-02 (during the turn, `source_label` is `visual_propagated` or `dead_reckoned`; covariance grows; recovery is exercised in FT-P-07). Both scenarios share the sharp-turn segment fixture.
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Positive + Negative (epic AZ-262)
**Tracker**: AZ-414
**Epic**: AZ-262 (E-BBT)

## Problem

Sharp turns are a documented degradation case (AC-3.2). The system must label the turn frames correctly (FT-N-02) and recover when the turn ends (FT-P-07). Both halves must be measured to validate the failure path.

## Outcome

- pytest scenarios at `e2e/tests/positive/test_ft_p_07_sharp_turn_recovery.py` (FT-P-07) and `e2e/tests/negative/test_ft_n_02_sharp_turn_failure.py` (FT-N-02).
- Both replay the sharp-turn segment of Derkachi (identified by gyro_z spikes in `SCALED_IMU2`).
- FT-P-07 asserts: `source_label` returns to `satellite_anchored` within 3 frames of turn end; drift since the pre-turn anchor ≤ 200 m; heading changes up to 70° are handled.
- FT-N-02 asserts: during the turn, `source_label` ∈ `{visual_propagated, dead_reckoned}`; covariance grows monotonically; the label transitions to `satellite_anchored` after the turn (the recovery assertion itself is handed off to FT-P-07).
- Synthetic-gyro-overlay fallback: if the natural Derkachi flight has no sharp turn meeting the AC-3.2 thresholds, both scenarios fall back to a synthetic gyro overlay; this fact is flagged in the FDR record and in the CSV `evidence_paths`.

## Scope

### Included
- Sharp-turn segment identification via gyro_z spikes in `data_imu.csv`.
- Synthetic-gyro overlay fallback path (use the same approach as the outlier injector, for determinism).
- FT-P-07 recovery assertions (label transition, drift, heading-change tolerance).
- FT-N-02 during-turn assertions (label, monotonic covariance).

### Excluded
- Multi-segment satellite re-localization: owned by FT-P-08 (AZ-415).
- Outlier-injection tolerance: owned by FT-N-01 (AZ-424).

## Acceptance Criteria

**AC-1: turn-segment identification**
Given Derkachi `data_imu.csv`
Then the test computes the gyro_z magnitude per IMU row and identifies segments where ≥3 consecutive rows have `|gyro_z| > AC-3.2 threshold`. If no segment meets the threshold, the synthetic-overlay fallback fires and the FDR and CSV mark the run as `synthetic-overlay`.

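The run detection in AC-1 can be sketched as a single pass over the gyro_z column. This is a minimal illustration, not the `sharp_turn_detector` helper itself: the function names, the inclusive index convention, and the `"natural"` label are assumptions, and the threshold is passed in rather than taken from AC-3.2.

```python
from typing import List, Sequence, Tuple


def find_turn_segments(
    gyro_z: Sequence[float], threshold: float, min_rows: int = 3
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) row indices of runs where
    |gyro_z| > threshold holds for at least min_rows consecutive rows."""
    segments: List[Tuple[int, int]] = []
    run_start = None
    for i, value in enumerate(gyro_z):
        if abs(value) > threshold:
            if run_start is None:
                run_start = i  # a new run begins here
        else:
            if run_start is not None and i - run_start >= min_rows:
                segments.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last row.
    if run_start is not None and len(gyro_z) - run_start >= min_rows:
        segments.append((run_start, len(gyro_z) - 1))
    return segments


def select_turn_source(gyro_z: Sequence[float], threshold: float):
    """Pick natural segments when present, else flag the synthetic fallback."""
    segments = find_turn_segments(gyro_z, threshold)
    if segments:
        return "natural", segments  # hypothetical label for the non-fallback case
    return "synthetic-overlay", []
```

Returning the fallback decision together with the segments keeps the flag that must reach the FDR and CSV next to the data that caused it.
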

**AC-2: FT-N-02 during-turn label**
Given a turn segment
Then for every frame inside the segment, `source_label` ∈ `{visual_propagated, dead_reckoned}` (no `satellite_anchored` during the turn).

**AC-3: FT-N-02 monotonic covariance**
Given the during-turn frames
Then `cov_semi_major_m` is non-decreasing across consecutive frames within the turn segment.

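The non-decreasing check in AC-3 compares each frame with its predecessor. The sketch below assumes the during-turn `cov_semi_major_m` values have already been collected into a list; the function name and the optional tolerance are illustrative additions, not part of the spec.

```python
from typing import Optional, Sequence


def first_covariance_decrease(
    cov_semi_major_m: Sequence[float], tol: float = 0.0
) -> Optional[int]:
    """Return the index of the first frame whose covariance is lower than the
    previous frame's by more than tol, or None if the series is non-decreasing."""
    for i in range(1, len(cov_semi_major_m)):
        if cov_semi_major_m[i] < cov_semi_major_m[i - 1] - tol:
            return i
    return None
```

Reporting the offending index rather than a bare boolean makes a failing assertion point at the exact frame.
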

**AC-4: FT-P-07 recovery within 3 frames**
Given a turn-end timestamp
Then the next `satellite_anchored` emission occurs within ≤3 frames after that timestamp.

**AC-5: FT-P-07 drift bound**
Given the recovery anchor
Then `‖propagated_centre_at_turn_end − recovery_anchor_centre‖ ≤ 200 m`.

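AC-4 and AC-5 reduce to a frame count and a distance. The sketch below makes two assumptions that the spec does not state: the first frame after the turn-end timestamp counts as frame 1, and both centres are already expressed in a local metric frame as `(east_m, north_m)` pairs. The function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple


def frames_to_recovery(
    labels_after_turn: Sequence[str], anchored_label: str = "satellite_anchored"
) -> Optional[int]:
    """1-based number of frames after turn end until the first anchored
    emission, or None if no anchored emission is seen."""
    for n, label in enumerate(labels_after_turn, start=1):
        if label == anchored_label:
            return n
    return None


def drift_m(
    propagated_centre: Tuple[float, float], recovery_anchor_centre: Tuple[float, float]
) -> float:
    """Euclidean distance in metres between two (east_m, north_m) points."""
    return math.hypot(
        propagated_centre[0] - recovery_anchor_centre[0],
        propagated_centre[1] - recovery_anchor_centre[1],
    )
```

A scenario would then assert `frames_to_recovery(...) is not None and frames_to_recovery(...) <= 3` and `drift_m(...) <= 200.0`.
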

**AC-6: FT-P-07 heading-change envelope**
Given the heading delta from pre-turn to post-turn anchor
Then heading changes up to 70° are handled (the recovery still occurs within the 3-frame budget at heading deltas in [0°, 70°]).

**AC-7: parameterization**
Given conftest parameterization
Then both scenarios run per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: outbound message stream (`source_label`, `cov_semi_major_m`); FDR for the synthetic-overlay flag.
- **Forbidden**: stubbing the C1 VIO failure mode; monkeypatching the source-label state machine.

## Constraints

- Whether the synthetic-gyro overlay fallback is used is decided per run by checking the natural fixture; the choice is logged into the FDR and the CSV `evidence_paths`.
- The sharp-turn threshold per AC-3.2 is the project's authoritative value (gyro_z magnitude + duration); this test reads that threshold from the test-spec environment, not from a hardcoded constant.

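One way to keep the AC-3.2 threshold out of the test code is to resolve it from the environment at collection time and fail loudly when it is absent. The variable names `E2E_SHARP_TURN_GYRO_Z` and `E2E_SHARP_TURN_MIN_ROWS` below are invented for illustration; the spec does not name the actual keys or the unit of gyro_z.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SharpTurnThreshold:
    gyro_z_abs: float  # magnitude threshold, in whatever unit the IMU CSV uses
    min_rows: int      # minimum run length in IMU rows


def load_sharp_turn_threshold(env: Mapping[str, str] = os.environ) -> SharpTurnThreshold:
    """Read the threshold from the environment (hypothetical variable names).
    No default values: a missing key is an error, never a silent fallback."""
    try:
        return SharpTurnThreshold(
            gyro_z_abs=float(env["E2E_SHARP_TURN_GYRO_Z"]),
            min_rows=int(env["E2E_SHARP_TURN_MIN_ROWS"]),
        )
    except KeyError as exc:
        raise RuntimeError(f"missing test-spec variable {exc.args[0]}") from exc
```

Passing the mapping as a parameter lets unit tests supply a plain dict instead of mutating the process environment.
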
## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-P-07, § FT-N-02
- `_docs/02_document/tests/test-data.md` § Image processing quality / Resilience
@@ -1,70 +0,0 @@
# FT-P-08: Multi-segment satellite-reference re-localization

**Task**: AZ-415_ft_p_08_multi_segment_reloc
**Name**: ≥3 disconnected segments handled via satellite-reference re-localization (AC-3.3)
**Description**: Implement FT-P-08: replay the `multi-segment-derkachi` synthetic fixture with 3+ blackout windows; assert that the SUT emits `dead_reckoned` during each blackout, returns to `satellite_anchored` within 3 frames of each blackout end, and preserves trajectory continuity (no >100 m jump).
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407, AZ-408 (multi_segment injector)
**Component**: Blackbox Tests / Positive (epic AZ-262)
**Tracker**: AZ-415
**Epic**: AZ-262 (E-BBT)

## Problem

The system claims to handle ≥3 disconnected segments per flight via satellite-reference re-localization (AC-3.3). Without this scenario the multi-blackout recovery path is unmeasured.

## Outcome

- pytest scenario at `e2e/tests/positive/test_ft_p_08_multi_segment_reloc.py`.
- Replays the `multi-segment-derkachi` fixture (3+ blackout windows distributed across the flight, no spoof injection).
- For each blackout: asserts `source_label = dead_reckoned` during the blackout; asserts `source_label` returns to `satellite_anchored` within 3 frames of blackout end; asserts no trajectory jump >100 m at the recovery transition.

## Scope

### Included
- Replay-driven test method against `multi-segment-derkachi`.
- Per-blackout assertion (label during, label transition, jump size).
- Aggregate pass: all 3+ blackouts must satisfy all three sub-assertions.

### Excluded
- Spoof-paired blackouts: owned by FT-N-04 (AZ-426) and NFT-RES-04 (AZ-435).
- Single-blackout outage with operator-reloc request: owned by FT-N-03 (AZ-425).

## Acceptance Criteria

**AC-1: blackout-window detection**
Given the fixture
Then the test identifies all ≥3 blackout windows from the fixture's manifest (the injector emits a window-list JSON alongside the modified video).

**AC-2: dead_reckoned during blackout**
Given a blackout window `[t_start, t_end]`
Then for every outbound emission with timestamp in `[t_start, t_end]`, `source_label = dead_reckoned`.

**AC-3: recovery within 3 frames**
Given each blackout's `t_end`
Then the next `satellite_anchored` emission occurs within ≤3 frames after `t_end`.

**AC-4: trajectory-continuity bound**
Given the recovery anchor following each blackout
Then `‖estimate_at_t_end − recovery_anchor‖ ≤ 100 m`.

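AC-2 through AC-4 can be evaluated per blackout window from the outbound stream alone. The sketch below is not the `multi_segment_evaluator` helper: the `Emission` record, its metric `(east_m, north_m)` fields, and the choice of the last in-window emission as `estimate_at_t_end` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Emission:  # hypothetical shape of one outbound estimate
    t: float
    source_label: str
    east_m: float
    north_m: float


def evaluate_blackout(
    emissions: Sequence[Emission],
    t_start: float,
    t_end: float,
    max_frames: int = 3,
    max_jump_m: float = 100.0,
) -> List[str]:
    """Return failure messages for one blackout window (empty list = pass)."""
    failures: List[str] = []
    inside = [e for e in emissions if t_start <= e.t <= t_end]
    after = [e for e in emissions if e.t > t_end]

    # AC-2: every emission inside the window must be dead_reckoned.
    if any(e.source_label != "dead_reckoned" for e in inside):
        failures.append("AC-2: non-dead_reckoned label inside blackout")

    # AC-3: an anchored emission must appear within max_frames after t_end.
    recovery = next(
        (e for e in after[:max_frames] if e.source_label == "satellite_anchored"), None
    )
    if recovery is None:
        failures.append("AC-3: no satellite_anchored emission within frame budget")
    elif inside:
        # AC-4: jump between the last in-window estimate and the recovery anchor.
        last = inside[-1]
        jump = math.hypot(recovery.east_m - last.east_m, recovery.north_m - last.north_m)
        if jump > max_jump_m:
            failures.append(f"AC-4: trajectory jump {jump:.1f} m exceeds bound")
    return failures
```

The aggregate pass condition from the scope section is then simply that this list is empty for every window in the manifest.
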

**AC-5: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: outbound message stream; FDR; the injector's window-list JSON (a public test artifact).
- **Forbidden**: querying the SUT's internal anchor cache; stubbing C2 retrieval.

## Constraints

- The injector ensures ≥3 windows AND ≥30 s of normal flight between them (AC-5 of AZ-408).
- "Trajectory jump" is the L2 distance at the moment of recovery; pre/post measurements are the SUT's outbound estimates, not GT.

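Because the scenario depends on the injector's guarantee, a cheap precondition check on the window list can catch a malformed fixture before any SUT assertion runs. The sketch assumes the window-list JSON has already been parsed into `(t_start, t_end)` pairs in seconds; the JSON schema itself is not given here, so no parsing is shown.

```python
from typing import Sequence, Tuple


def validate_windows(
    windows: Sequence[Tuple[float, float]],
    min_windows: int = 3,
    min_gap_s: float = 30.0,
) -> None:
    """Raise ValueError unless there are at least min_windows blackout windows
    separated by at least min_gap_s of normal flight."""
    ordered = sorted(windows)
    if len(ordered) < min_windows:
        raise ValueError(f"expected >= {min_windows} windows, got {len(ordered)}")
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap < min_gap_s:
            raise ValueError(f"only {gap:.1f} s of normal flight between windows")
```

Failing here points at the fixture rather than at the system under test, which keeps the evidence trail unambiguous.
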
## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-P-08
- `_docs/02_document/tests/test-data.md` § Resilience (FT-P-08 row)
@@ -1,69 +0,0 @@
# FT-P-10: GTSAM smoothing-loop look-back accuracy

**Task**: AZ-418_ft_p_10_smoothing_lookback
**Name**: Internal smoothing improves past-keyframe estimates (AC-4.5 revised, Mode B Fact #107)
**Description**: Implement FT-P-10: full Derkachi replay; the FDR contains, per keyframe, (a) the raw single-shot pose at first emission and (b) the smoothed pose at iSAM2 convergence; assert `smoothed_error < raw_error` for ≥80 % of keyframes and `mean_improvement ≥ 5 m`. NOT validated as FC-side retroactive correction (out of scope per Mode B revision).
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Positive / Internal smoothing (epic AZ-262)
**Tracker**: AZ-418
**Epic**: AZ-262 (E-BBT)

## Problem

The iSAM2 fixed-lag smoother is the project's IMU-fusion mechanism; if it does not actually improve past-keyframe estimates over the raw single-shot pose, the entire C5 design loses its rationale. AC-4.5 (revised per Mode B) measures this as an internal-improvement metric, NOT as FC-side retroactive correction.

## Outcome

- pytest scenario at `e2e/tests/positive/test_ft_p_10_smoothing_lookback.py`.
- Replays Derkachi end-to-end; reads the FDR archive after replay.
- For each past keyframe: extracts (a) `raw_pose` (first single-shot emission for that keyframe) and (b) `smoothed_pose` (iSAM2-converged pose at smoother window end).
- Computes `distance(raw, GT)` and `distance(smoothed, GT)` against Derkachi `GLOBAL_POSITION_INT`.
- Aggregates: `improvement_rate = count(smoothed_error < raw_error) / total`; `mean_improvement = mean(raw_error - smoothed_error)`.

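The two aggregates follow directly from the per-keyframe error pairs. The sketch below assumes the errors are already available as two equal-length lists in metres; the function name is illustrative and is not the `smoothing_evaluator` helper.

```python
from statistics import mean
from typing import Sequence, Tuple


def smoothing_aggregates(
    raw_errors_m: Sequence[float], smoothed_errors_m: Sequence[float]
) -> Tuple[float, float]:
    """Return (improvement_rate, mean_improvement_m) over paired keyframe errors."""
    if not raw_errors_m or len(raw_errors_m) != len(smoothed_errors_m):
        raise ValueError("need non-empty, equal-length error lists")
    pairs = list(zip(raw_errors_m, smoothed_errors_m))
    # Strict inequality: a tie does not count as an improvement.
    improved = sum(1 for raw, smoothed in pairs if smoothed < raw)
    improvement_rate = improved / len(pairs)
    mean_improvement = mean(raw - smoothed for raw, smoothed in pairs)
    return improvement_rate, mean_improvement
```

As a worked case, raw errors of 20, 15, 10, 12, 8 m and smoothed errors of 10, 9, 4, 6, 9 m give 4 of 5 keyframes improved (rate 0.80) and a mean improvement of (10 + 6 + 6 + 6 − 1) / 5 = 5.4 m, so both thresholds are met even though one keyframe got worse.
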
## Scope

### Included
- FDR archive reader for past-keyframe records (per AC-NEW-3 schema; raw + smoothed entries).
- Per-keyframe error computation against `GLOBAL_POSITION_INT` GT.
- Aggregate assertions.

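For the per-keyframe error, one workable distance is the great-circle (haversine) distance between the estimated position and the ground-truth fix. MAVLink `GLOBAL_POSITION_INT` carries `lat` and `lon` as integers in degrees × 1e7, so they are scaled before use. The sketch assumes the SUT pose is available in decimal degrees, which the spec does not state, and uses a spherical Earth; the helper names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


def haversine_m(lat1_deg: float, lon1_deg: float, lat2_deg: float, lon2_deg: float) -> float:
    """Great-circle distance in metres between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1_deg), math.radians(lat2_deg)
    dphi = phi2 - phi1
    dlam = math.radians(lon2_deg - lon1_deg)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def error_vs_gt_m(est_lat_deg: float, est_lon_deg: float, gt_lat_e7: int, gt_lon_e7: int) -> float:
    """Horizontal error of an estimate against a GLOBAL_POSITION_INT fix (degE7)."""
    return haversine_m(est_lat_deg, est_lon_deg, gt_lat_e7 / 1e7, gt_lon_e7 / 1e7)
```

At the 5 m scale of the mean-improvement threshold, the spherical approximation is adequate for comparing raw and smoothed errors at the same keyframe, since both are measured against the same ground-truth point.
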
### Excluded
- FC-side retroactive correction: explicitly OUT OF SCOPE per Mode B revision.
- Inter-keyframe interpolation accuracy: out of scope.
- iSAM2 timing: owned by NFT-PERF-01 (AZ-428).

## Acceptance Criteria

**AC-1: FDR contains raw + smoothed pose pairs**
Given a full Derkachi replay
Then the FDR archive contains, for each past keyframe k: (a) `raw_pose_k` recorded at the keyframe's first emission timestamp; (b) `smoothed_pose_k` recorded when k exits the iSAM2 window.

**AC-2: improvement rate**
Given the per-keyframe pairs
Then `count(smoothed_error_k < raw_error_k) / total_keyframes ≥ 0.80`.

**AC-3: mean improvement**
Given the per-keyframe pairs
Then `mean(raw_error_k − smoothed_error_k) ≥ 5 m` over all keyframes.

**AC-4: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`. Note: this AC is sensitive to VIO strategy quality. The expected ordering of improvement rates is `vins_mono` (research) ≥ `okvis2` ≥ `klt_ransac`; the test reports per-strategy rates as evidence even if all pass the threshold.

## System Under Test Boundary

End-to-end through public boundaries; FDR archive read post-flight.

- **Allowed**: FDR-archive read (a public on-disk artifact per AC-NEW-3).
- **Forbidden**: querying live iSAM2 graph state; importing the SUT C5 module.

## Constraints

- The FDR record schema (AC-NEW-3) MUST distinguish raw vs smoothed past-keyframe entries; if only one is present, the test fails.
- "Past keyframe" in this scenario excludes the most recent K=10..20 keyframes, which are still inside the smoother window and have not yet converged.

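Excluding the keyframes still inside the smoother window is a tail slice, with one Python pitfall worth guarding against. The sketch assumes the keyframes are ordered oldest to newest and that the window size K is supplied by the caller; the function name is illustrative.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def converged_keyframes(keyframes: Sequence[T], window_size: int) -> List[T]:
    """Drop the most recent window_size keyframes (not yet converged)."""
    if window_size < 0:
        raise ValueError("window_size must be non-negative")
    if window_size == 0:
        # keyframes[:-0] would be an empty slice, so handle zero explicitly.
        return list(keyframes)
    return list(keyframes[:-window_size])
```

If the replay yields fewer keyframes than K, the result is empty, and the aggregate step should treat that as a failure rather than as a vacuous pass.
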
## Document Dependencies

- `_docs/02_document/tests/blackbox-tests.md` § FT-P-10
- `_docs/02_document/tests/test-data.md` § FC contract & startup (FT-P-10 row)