gps-denied-onboard/_docs/02_document/tests/_phase1_findings.md
Oleksandr Bezdieniezhnykh 9eba1689b3 - Introduced a new document detailing the current state of the autodev process, including steps, status, and findings.
- Revised acceptance criteria in the acceptance_criteria.md file to clarify metrics and expectations, including updates to GPS accuracy and image processing quality.
- Enhanced restrictions documentation to reflect operational parameters and constraints for UAV flights, including camera specifications and satellite imagery usage.
- Added new research documents for acceptance criteria assessment and question decomposition to support ongoing project evaluation and decision-making.
2026-04-26 14:28:10 +03:00

Test-Spec Phase 1 Findings (intermediate, not a final artifact)

Generated 2026-04-26 during Plan Step 1 (test-spec/SKILL.md, Phase 1). This is a working note — Phase 2 reads it to carry forward findings + the user's locked-in decisions. Phase 2 produces the 8 final artifacts under _docs/02_document/tests/.

Inputs surveyed

  • _docs/00_problem/problem.md — short problem statement; flags missing IMU.
  • _docs/00_problem/acceptance_criteria.md — 46 ACs (37 numbered + 9 NEW).
  • _docs/00_problem/restrictions.md — UAV/flight/camera/satellite/HW/sensors/failsafe.
  • _docs/01_solution/solution.md (renamed from solution_draft03 by Plan Prereq 2) — 11 components + testing strategy F-T1…F-T19, NF-T1…NF-T6, S-T1…S-T5, FT-1…FT-3.
  • _docs/00_problem/input_data/:
    • 60 nav-cam JPGs, AD000001.jpg–AD000060.jpg
    • coordinates.csv (frame → GPS truth)
    • 2 _gmaps.png thumbnails (frames 1–2 only)
    • data_parameters.md (corpus-shoot params)
    • expected_results/results_report.md (46 mapped scenarios) + position_accuracy.csv

Quantifiability check

All 46 mapped scenarios in results_report.md have quantifiable expected results (no vague "works correctly" entries). Comparison methods used: percentage, numeric_tolerance, threshold_max, threshold_min, exact, regex, range, file_reference. Acceptable.
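
As a rough illustration of how the comparison methods listed above could be evaluated mechanically, the sketch below dispatches on the method name. The function name `check`, its argument shapes, and the exact semantics of each method (for example, that `percentage` means a relative tolerance in percent) are assumptions made for illustration; results_report.md remains the authority on what each row means. `file_reference` is left out because its semantics are not described in this note.

```python
import re


def check(method, actual, expected, tolerance=None):
    """Return True when `actual` satisfies `expected` under `method`.

    The semantics below are assumed, not taken from results_report.md.
    """
    if method == "exact":
        return actual == expected
    if method == "numeric_tolerance":
        # Absolute tolerance, in the same unit as the value.
        return abs(actual - expected) <= tolerance
    if method == "percentage":
        # Relative tolerance, expressed in percent of the expected value.
        return abs(actual - expected) <= abs(expected) * tolerance / 100.0
    if method == "threshold_max":
        return actual <= expected
    if method == "threshold_min":
        return actual >= expected
    if method == "range":
        low, high = expected
        return low <= actual <= high
    if method == "regex":
        return re.fullmatch(expected, actual) is not None
    raise ValueError(f"unsupported comparison method: {method}")
```

A latency row with a 200 ms ceiling would then be evaluated as `check("threshold_max", measured_ms, 200)`, where `measured_ms` is whatever the test harness recorded.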

Coverage of the 46 ACs by results_report.md

  • Fully covered (~18 / 46 ≈ 39%): AC-1.1, AC-1.3 (mono only), AC-2.1, AC-2.2 (VO half), AC-3.1, AC-3.2, AC-3.4, AC-4.1, AC-4.2, AC-5.1, AC-5.3, AC-6.2, AC-7.1, AC-7.2, AC-NEW-5 (junction temp slice), AC-NEW-8 (mono only), API ACs (rows 30–33), TRT validation rows 42–44.
  • Partially covered (~10): AC-1.2, AC-2.2 cross-view <2.5 px, AC-5.2 timing, AC-6.1, AC-NEW-1, AC-NEW-5 hot/cold soak.
  • Not covered (~18): AC-1.4, AC-3.3, AC-4.3 ODOMETRY (intentionally per v1; see clause), AC-4.4, AC-4.5, AC-6.3, AC-8.1–AC-8.6, AC-NEW-2, AC-NEW-3, AC-NEW-4, AC-NEW-6, AC-NEW-7, AC-NEW-9.

Headline: ≈39% direct + 22% partial = ~61% against the 46-AC denominator. This falls below the 75% threshold only when the 60-image slice is treated as the sole corpus. The solution's testing strategy explicitly delegates the missing slice to the bench-off corpora named in solution.md (AerialVL, UAV-VisLoc, AerialExtreMatch, 2chADCNN, TartanAir V2, internal Mavic, first internal fixed-wing). Per user decision #4 below, Phase 2 will spec tests for all 46 ACs and mark unfulfilled-data ACs with data_status: deferred-corpus in traceability-matrix.md.
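
The headline arithmetic can be reproduced with a few lines. This is only a sanity check on the approximate counts quoted above (~18 fully covered, ~10 partially covered, 46 ACs in total); the variable names are illustrative.

```python
TOTAL_ACS = 46
FULLY_COVERED = 18      # approximate count from the coverage list
PARTIALLY_COVERED = 10  # approximate count from the coverage list
THRESHOLD = 0.75        # coverage threshold quoted in this note

direct = FULLY_COVERED / TOTAL_ACS
partial = PARTIALLY_COVERED / TOTAL_ACS
headline = direct + partial
not_covered = TOTAL_ACS - FULLY_COVERED - PARTIALLY_COVERED

print(f"direct:      {direct:.0%}")    # direct:      39%
print(f"partial:     {partial:.0%}")   # partial:     22%
print(f"headline:    {headline:.0%}")  # headline:    61%
print(f"not covered: {not_covered}")   # not covered: 18
print(f"below threshold: {headline < THRESHOLD}")  # below threshold: True
```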

Stale-doc fixes already applied (per user decision #1, option A)

Edits made during Phase 1 — Phase 2 reads these as baseline truth:

| File | Row / AC | Change |
| --- | --- | --- |
| results_report.md | row 2 | "≥60% within 20m" → "≥50% within 20m" (aligns with AC-1.2). |
| results_report.md | row 19 | "ESKF position corrected" → "Component 5 calibrator emits a satellite-anchored fix, FC EKF3 reconverges". |
| results_report.md | row 22 | "uses hint as ESKF measurement" → "uses hint as a high-covariance (~500m) seed for VPR/cross-view re-localization (consumed by Component 5 calibrator)". |
| results_report.md | row 23 | "GPS_INPUT output begins within 60s of boot" → "within 30s of boot (95th percentile)" (aligns with AC-NEW-1). |
| results_report.md | row 25 | "inits ESKF with high uncertainty" → "re-initialises Component 5 calibrator state with high uncertainty"; recovery time "≤70s" → "≤30s". |
| results_report.md | row 38 | LiteSAM/XFeat ≤330ms inline → "SP+LG (TRT FP16/INT8) inline ≤200ms; LiteSAM re-loc fallback ≤2000ms". |
| acceptance_criteria.md | AC-4.3 | added v1-scope clause: ODOMETRY emission disabled in v1 (per solution_draft03 finding M-30, EKF3 issues #30076/#32506); EK3_SRC1_*=GPS+Compass; tests assert ODOMETRY is intentionally absent on the wire in v1; ODOMETRY re-enabled in v1.1 once F-T9 SITL passes. |

Locked-in user decisions (carry into Phase 2)

| ID | Decision | Phase-2 implication |
| --- | --- | --- |
| D1 | Apply 4 stale-doc fixes inline (done above). | Phase 2 reads results_report.md v2 (post-edit) and the new AC-4.3 clause as authoritative. |
| D2 | Camera/altitude mismatch: 60-image slice is pipeline-correctness corpus only; it does NOT validate GSD-band assumptions, latency budgets, or matcher resolution sweeps for the deployed 1km AGL / 20MP path. | tests/test-data.md MUST state: corpus shot at 400m AGL with ADTi 26S v2 (26MP, 6252×4168, 25mm, 23.5mm sensor). Tests scoped to "pipeline correctness" only. AC-1.1/AC-1.2/AC-2.1/AC-2.2/AC-NEW-8 acceptance numbers from this slice are pipeline-functional, not deployment-binding. Deployment-binding tests reference AerialVL S03 (1km AGL fixed-wing). |
| D3 | Missing satellite tiles + IMU: spec tests with placeholder fixtures referenced by name even though files don't yet exist. | tests/test-data.md declares: (a) fixtures/satellite_tiles_AD0000xx_z20/: z=20 ortho tiles for the bbox of coordinates.csv, fetched by an implementer-written script (Esri / public ortho); (b) fixtures/imu_AD0000xx.csv: IMU traces from SITL ArduPilot replay of coordinates.csv as ground-truth trajectory at 200 Hz; for AC-1.3 / AC-NEW-8 fixed-wing tests use AerialVL S03 IMU as the fixed-wing reference. Phase 3 hard gate will surface these as "pending data", not "remove". |
| D4 | AC-coverage gap: Phase 2 specs tests for all 46 ACs; deferred-data ACs get data_status: deferred-corpus in traceability-matrix.md listing the named external corpus. | traceability-matrix.md columns: AC-id, Test-id, Test-file, data_status (∈ {present, deferred-corpus, deferred-sitl, deferred-hil}). Rows pointing at AerialVL S03, UAV-VisLoc, AerialExtreMatch, TartanAir V2, internal Mavic, first internal fixed-wing flight, hot/cold soak chamber, multi-flight Monte Carlo, and SITL ArduPilot are emitted with the appropriate deferred-* token. |
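
To make the D2 camera/altitude mismatch concrete, the corpus ground sample distance (GSD) can be estimated from the numbers D2 quotes. This is a minimal sketch under stated assumptions: a nadir-pointing pinhole camera over flat terrain, with the 23.5mm figure taken to be the sensor width along the 6252-pixel image axis. The function name is illustrative. The deployed camera's sensor and lens parameters are not given in this note, so no deployed GSD is computed here.

```python
def ground_sample_distance(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    """Nadir GSD in metres per pixel from the pinhole-camera relation."""
    return altitude_m * sensor_width_mm / (focal_length_mm * image_width_px)


# Corpus parameters as quoted in D2 (ADTi 26S v2 at 400 m AGL).
corpus_gsd = ground_sample_distance(
    altitude_m=400.0,
    sensor_width_mm=23.5,   # assumed to be the width along the 6252 px axis
    focal_length_mm=25.0,
    image_width_px=6252,
)
footprint_width_m = corpus_gsd * 6252

print(f"corpus GSD: {corpus_gsd * 100:.2f} cm/px")     # corpus GSD: 6.01 cm/px
print(f"footprint width: {footprint_width_m:.0f} m")  # footprint width: 376 m
```

Plugging the deployed altitude and camera parameters into the same function, once they are pinned down, would show how far the deployed GSD band sits from the roughly 6 cm/px of this slice.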

Open contradictions still standing (NOT auto-fixed)

None for Phase 2 entry. AC-4.3 dual-channel framing was the only remaining one and it was resolved by the v1-scope clause (D1) — AC text intact, v1 implementation scoped to GPS_INPUT only.

Known data dependencies for Phase 2 to spec around

| Dependency | Status | Phase-2 treatment |
| --- | --- | --- |
| z=20 satellite tiles for the coordinates.csv bbox | Missing | Fixture name declared in test-data.md, script TODO (implementer task in Plan Step 3 / Decompose). |
| IMU traces synced to coordinates.csv frames | Missing | SITL replay declared as fixture; AerialVL S03 used for fixed-wing AC-1.3 / AC-NEW-8. |
| AerialVL S03 / UAV-VisLoc / AerialExtreMatch / 2chADCNN / TartanAir V2 | External, not yet downloaded | data_status: deferred-corpus in matrix; Decompose creates a "dataset acquisition" task. |
| First internal fixed-wing flight footage | Pending field-test plan | data_status: deferred-corpus. |
| SITL ArduPilot environment (PR #30080 pinned version) | Not yet provisioned | data_status: deferred-sitl. |
| Hot/cold soak chamber (AC-NEW-5) | Bench equipment | data_status: deferred-hil. |
| 8-h synthetic load fixture (AC-NEW-3 FDR) | Synthesizable | Declared as fixture, generated at impl time. |
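
Since these dependencies surface in traceability-matrix.md through the data_status column, a small validator could catch rows that use a token outside the set fixed in D4. The sketch below assumes the matrix rows have already been parsed into dictionaries keyed by the D4 column names; the parsing step, the function name, and the example rows (test ids and file names are placeholders, not real artifacts) are all hypothetical.

```python
ALLOWED_STATUS = {"present", "deferred-corpus", "deferred-sitl", "deferred-hil"}
REQUIRED_COLUMNS = ("AC-id", "Test-id", "Test-file", "data_status")


def validate_rows(rows):
    """Return a list of human-readable problems; an empty list means the rows pass."""
    problems = []
    for index, row in enumerate(rows):
        # A column counts as missing when it is absent or empty.
        missing = [col for col in REQUIRED_COLUMNS if not row.get(col)]
        if missing:
            problems.append(f"row {index}: missing {', '.join(missing)}")
            continue
        if row["data_status"] not in ALLOWED_STATUS:
            problems.append(f"row {index}: unknown data_status {row['data_status']!r}")
    return problems


# Hypothetical rows for illustration only.
example_rows = [
    {"AC-id": "AC-1.1", "Test-id": "T-EXAMPLE-1", "Test-file": "example-tests.md", "data_status": "present"},
    {"AC-id": "AC-NEW-5", "Test-id": "T-EXAMPLE-2", "Test-file": "example-tests.md", "data_status": "deferred-hil"},
    {"AC-id": "AC-4.3", "Test-id": "T-EXAMPLE-3", "Test-file": "example-tests.md", "data_status": "pending"},
]
problems = validate_rows(example_rows)
print(problems)  # ["row 2: unknown data_status 'pending'"]
```

Returning a list of problems rather than raising on the first one lets a Phase 3 gate report every offending row in a single pass.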

Phase 2 entry checklist (READY)

  • Phase 1 BLOCKING gate cleared (user confirmed coverage decisions).
  • Stale-doc fixes applied (D1).
  • Findings preserved here for resume in a fresh conversation.
  • Phase 2 will read this file first, then read solution.md / AC / restrictions / results_report.md as needed for each artifact.