diff --git a/.gitignore b/.gitignore
index 8a1a6e1..fa45309 100644
--- a/.gitignore
+++ b/.gitignore
@@ -43,6 +43,7 @@ tests/fixtures/flight_derkachi/*.h264
 tests/fixtures/flight_derkachi/*.tlog
 tests/fixtures/tiles_corpus/*.jpg
 tests/fixtures/tiles_corpus/*.png
+e2e/fixtures/sitl_replay/
 
 # Editor / OS noise
 .idea/
diff --git a/_docs/03_implementation/run_tests_step11_report.md b/_docs/03_implementation/run_tests_step11_report.md
index d3f88ec..5040acb 100644
--- a/_docs/03_implementation/run_tests_step11_report.md
+++ b/_docs/03_implementation/run_tests_step11_report.md
@@ -172,9 +172,46 @@ Out of scope for this report; documented in `environment.md` § Execution instru
 3. **Add the preventive meta-rule** about transcript-verified test claims, if approved.
 4. **Resume Step 11 after Track 1 completes** — at minimum get one real Reality Gate signal from `tests/e2e/replay/`. Track 2 can run in parallel as its own work stream and feed back into Step 11 cycle 2.
 
+## Path 3 attempt — Full SITL with community images (2026-05-17, post-blocker)
+
+Per user direction, attempted the "Full path" rehab: switch ArduPilot SITL to `sparlane/ardupilot-sitl:Plane-latest` (verified pullable), build iNav SITL from source, write MAVProxy Dockerfile, then run FT-P-01 / FT-P-02 against the real fixture builders.
+
+**Key reframe discovered during attempt**: `e2e/runner/helpers/sitl_observer.py` is **pure offline fixture replay**, not a live SITL client (see file docstring + `_FdrReplayObserver` class). Setting `E2E_SITL_REPLAY_DIR=...` switches the observer to read pre-built JSON fixtures (`observer__.json`). No live SITL container needed for the existing blackbox FT-P-* and NFT-* tests. The compose-file SITL services in `environment.md` are aspirational future state.
+
+So the realistic Full Path is:
+
+1. Install SUT locally (`pip install -e .`) — DONE.
+2. Run `e2e.fixtures.sitl_replay_builder.build_p01_fixtures` to produce `e2e/fixtures/sitl_replay/p01/` — BLOCKED (see below).
+3. Run pytest on `e2e/tests/positive/test_ft_p_01_still_image_accuracy.py` with `E2E_SITL_REPLAY_DIR=e2e/fixtures/sitl_replay/p01` — BLOCKED on step 2.
+
+Trying step 2 surfaced **5 new integration drifts**, on top of H-1..H-9 from the prior section:
+
+| ID | Severity | Description | Status |
+|----|----------|-------------|--------|
+| H-10 | blocker | Fixture builder calls `gps-denied-replay --fdr-out PATH`. The CLI's actual arg name is `--output`. | not fixed |
+| H-11 | blocker | Fixture builder doesn't pass the CLI's required `--camera-calibration`, `--config`, `--mavlink-signing-key` args. Need to add fields to `FixtureBuilderConfig` and update `build_p01_fixtures.py` / `build_p02_fixtures.py`. | not fixed |
+| H-12 | medium | `tests/fixtures/calibration/adti26.json` declared `body_to_camera_se3` as `{rotation_xyzw, translation_xyz_m}` dict; loader at `runtime_root/_replay_branch.py:308` strictly expects a 4×4 matrix via `np.asarray(..., dtype=np.float64)`. The dict form was never parseable. | **fixed** — converted to 4×4 identity (`tests/fixtures/calibration/adti26.json`). Equivalent rotation/translation, no behavior change. |
+| H-13 | blocker | Auto-sync AC-8 validation hard-fails on still-image + stationary fixtures even when `--time-offset-ms 0` is supplied. Validator computes a "frame-window match %" (default 95% threshold) that requires real video motion + IMU takeoff signal. The FT-P-01 fixture (60 stills + stationary IMU) has neither by design. No `--skip-auto-sync` or `--accept-low-confidence-offset` escape hatch exists. | not fixed |
+| H-14 | env-conditional | CLI requires env vars including `BUILD_REPLAY_SINK_JSONL=ON` to use `NoopMavlinkTransport`. This is documented in code comments but not in `.env.example`. | needs doc update |
+
+Total live harness drift count: **14 distinct items** (3 fixed, 11 deferred). Each H-10..H-13 individually takes 30-60 min to fix with the right design decisions; together they exceed the safe single-session budget given the surface-area uncertainty.
+
+**Pattern**: The fixture builders (AZ-598/599/600), the CLI signature (AZ-401/402), the calibration JSON schema, and the replay protocol auto-sync (AZ-405) were each implemented well in isolation but never integrated end-to-end. This is exactly what the SUT Reality Gate is designed to surface.
+
+### Path 3 verdict
+
+**Cannot reach the SUT Reality Gate in this session.** Even after fixing H-12, the next gate (H-13: auto-sync hard-fail on stationary fixtures) requires a design decision: either expand the auto-sync escape hatch in the SUT, or change the fixture builder to inject a single-frame motion event, or relax AC-8 validation thresholds for stationary scenarios. Each is a non-trivial design call that warrants a Jira ticket and review, not a unilateral mid-session fix.
+
+### Updated recommendation
+
+The Track 2 ("Full blackbox harness") track from the previous section needs to expand to include H-10..H-14 as additional sub-stories. Realistic effort: **+1-2 days** on top of the prior estimate. Path 3 is achievable but requires 3-5 days of focused harness rehab, not a single session.
+
 ## Artifacts
 
 - Commit `eb6dc17` — csv_reporter / pytest-csv fix
 - Commit `6ce3158` — e2e/docker harness drift fixes (H-1, H-2, H-3)
+- Local fix (uncommitted, ready to commit): `tests/fixtures/calibration/adti26.json` — H-12 4×4 SE3 fix
+- Local fix (uncommitted, ready to commit): `tests/fixtures/replay_config_minimal.yaml` — minimal config for path-3 reproduction
 - This report: `_docs/03_implementation/run_tests_step11_report.md`
 - Leftover for pytest-csv ticket: `_docs/_process_leftovers/2026-05-17_csv_reporter_pytest_csv_conflict.md`
+- Leftover for harness epic: `_docs/_process_leftovers/2026-05-17_e2e_harness_rehabilitation.md`
diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md
index 8564886..9c7eb0f 100644
--- a/_docs/_autodev_state.md
+++ b/_docs/_autodev_state.md
@@ -20,4 +20,4 @@ last_step_outcomes:
   step_8: "Code is testable — no changes needed (testability_assessment.md committed; no list-of-changes, no source edits)"
   step_9: "41 blackbox test tasks (AZ-406..AZ-446) under epic AZ-262 in _docs/02_tasks/todo/ pre-existing; AZ-406 test-infra bootstrap pre-existing. Folder fallback satisfied. No Step-9 work executed in cycle 1."
   step_10: "41 of 41 blackbox-test tasks done (AZ-406..AZ-446). Final report at _docs/03_implementation/implementation_report_tests.md. Full-suite gate handed off to test-run skill per implement Step 16."
-  step_11: "Local Tier-1 pytest: 3343 pass / 88 skip / 0 fail (after csv_reporter fix in eb6dc17). SUT Reality Gate UNMET — both docker harnesses blocked by pre-existing drift (3 SITL images missing + 5 other bugs). Full report: _docs/03_implementation/run_tests_step11_report.md. Tickets deferred to leftovers."
+  step_11: "Local Tier-1 pytest: 3343 pass / 88 skip / 0 fail (after csv_reporter fix in eb6dc17). SUT Reality Gate UNMET — both docker harnesses blocked by pre-existing drift (now 14 distinct items: 3 fixed, 11 deferred). Full report: _docs/03_implementation/run_tests_step11_report.md. Path-3 attempt on 2026-05-17 21:30 surfaced H-10..H-14 (fixture-builder/CLI/calibration/auto-sync integration drifts); discovered sitl_observer is offline-fixture replay, NOT live SITL — compose-file SITL services are aspirational. Tickets deferred to leftovers."
diff --git a/_docs/_process_leftovers/2026-05-17_e2e_harness_rehabilitation.md b/_docs/_process_leftovers/2026-05-17_e2e_harness_rehabilitation.md
index fe90076..67195d4 100644
--- a/_docs/_process_leftovers/2026-05-17_e2e_harness_rehabilitation.md
+++ b/_docs/_process_leftovers/2026-05-17_e2e_harness_rehabilitation.md
@@ -86,6 +86,83 @@ description: |
 story_points: 2
 ```
 
+### Story: H-10 — Fixture builder uses wrong CLI flag
+```yaml
+type: Story
+summary: "[Bug] sitl_replay_builder uses --fdr-out; CLI requires --output"
+description: |
+  e2e/fixtures/sitl_replay_builder/builder.py:79 passes `--fdr-out` to
+  `gps-denied-replay`. The CLI's actual flag (src/gps_denied_onboard/cli/replay.py:90)
+  is `--output`. Also need to add the CLI's other required args
+  (--camera-calibration, --config, --mavlink-signing-key) — see H-11.
+  Bundle H-10 + H-11 in one PR. Unit tests in
+  e2e/_unit_tests/fixtures/test_sitl_replay_builder_builder.py assert on
+  `--fdr-out` and need to be updated.
+story_points: 2
+```
+
+### Story: H-11 — Fixture builder missing required CLI args
+```yaml
+type: Story
+summary: "[Bug] sitl_replay_builder doesn't pass camera-calibration/config/signing-key"
+description: |
+  gps-denied-replay requires --camera-calibration PATH, --config PATH,
+  --mavlink-signing-key PATH. Fixture builder omits all three. Add
+  fields to FixtureBuilderConfig with defaults pointing at
+  tests/fixtures/calibration/adti26.json, a new
+  tests/fixtures/replay_config_minimal.yaml, and
+  tests/fixtures/mavlink_signing/dev_key. Also set
+  BUILD_REPLAY_SINK_JSONL=ON in the subprocess env.
+story_points: 2
+```
+
+### Bug: H-12 — Calibration JSON shape drift (FIXED)
+```yaml
+type: Bug
+summary: "[Bug] adti26.json body_to_camera_se3 used dict form; loader expects 4x4"
+description: |
+  tests/fixtures/calibration/adti26.json declared body_to_camera_se3 as
+  {rotation_xyzw, translation_xyz_m}. _replay_branch.py:308 does
+  np.asarray(..., dtype=np.float64) which can't decode the dict. Fixed
+  by converting to the equivalent 4x4 identity matrix. Both forms encode
+  the same SE3 (identity) so no behavior change.
+story_points: 1
+status_after_create: "Done"
+```
+
+### Story: H-13 — Auto-sync hard-fails on stationary fixtures
+```yaml
+type: Story
+summary: "[Bug] AC-8 auto-sync validation rejects stationary FT-P-01 fixture"
+description: |
+  Auto-sync (src/gps_denied_onboard/replay_input/...) hard-fails when
+  --time-offset-ms 0 is supplied for a fixture with stationary IMU + no
+  video motion (FT-P-01 still-image scenario). Threshold:
+  frame_window_match_pct_threshold=95% in ReplayAutoSyncConfig defaults.
+  Three possible fixes (design decision needed):
+    a) Add --skip-auto-sync CLI flag that bypasses AC-8 validation entirely
+       when time_offset_ms is explicitly supplied
+    b) Lower or expose match_threshold_pct via config (already configurable
+       but not surfaced in fixture builder)
+    c) Change fixture builder to inject a single motion event so auto-sync
+       can find SOMETHING to align on
+  Recommend (a): aligns with replay protocol intent ("manual offset
+  bypasses auto-sync entirely" per ReplayConfig docstring).
+story_points: 3
+```
+
+### Story: H-14 — Document BUILD_REPLAY_SINK_JSONL in .env.example
+```yaml
+type: Story
+summary: "[Doc] add BUILD_REPLAY_SINK_JSONL=ON to .env.example for replay mode"
+description: |
+  src/gps_denied_onboard/components/c8_fc_adapter/noop_mavlink_transport.py
+  requires BUILD_REPLAY_SINK_JSONL=ON env var to construct. Not in
+  .env.example. Add with comment explaining it's a replay-mode requirement
+  per replay protocol Invariant 9.
+story_points: 1
+```
+
 ### Story: H-1..H-3 — fixes already committed
 ```yaml
 type: Story
diff --git a/tests/fixtures/calibration/adti26.json b/tests/fixtures/calibration/adti26.json
index 2eb354b..4d4820b 100644
--- a/tests/fixtures/calibration/adti26.json
+++ b/tests/fixtures/calibration/adti26.json
@@ -6,10 +6,12 @@
     [0.0, 0.0, 1.0]
   ],
   "distortion": [0.0, 0.0, 0.0, 0.0, 0.0],
-  "body_to_camera_se3": {
-    "rotation_xyzw": [0.0, 0.0, 0.0, 1.0],
-    "translation_xyz_m": [0.0, 0.0, 0.0]
-  },
+  "body_to_camera_se3": [
+    [1.0, 0.0, 0.0, 0.0],
+    [0.0, 1.0, 0.0, 0.0],
+    [0.0, 0.0, 1.0, 0.0],
+    [0.0, 0.0, 0.0, 1.0]
+  ],
   "acquisition_method": "calibration_target_aligned",
   "metadata": {
     "note": "Test fixture; replaced in production by adti20.json."
diff --git a/tests/fixtures/replay_config_minimal.yaml b/tests/fixtures/replay_config_minimal.yaml
new file mode 100644
index 0000000..d6303b5
--- /dev/null
+++ b/tests/fixtures/replay_config_minimal.yaml
@@ -0,0 +1,13 @@
+# Minimal replay-mode config for blackbox/e2e fixture builds (path-3 harness rehab).
+# Defaults from src/gps_denied_onboard/config/schema.py apply for everything else.
+__top__:
+  mode: replay
+
+runtime:
+  fc_profile: ardupilot_plane
+  tier: 1
+  inference_backend: pytorch_fp16
+
+replay:
+  pace: asap
+  target_fc_dialect: ardupilot_plane
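
Review note on the `adti26.json` change (H-12): the report states that the old `{rotation_xyzw, translation_xyz_m}` dict and the new 4×4 matrix encode the same identity transform. The sketch below is one way to check that claim. It is not code from the repository; the helper name `se3_from_quat_translation` is hypothetical, and it assumes the quaternion is in `[x, y, z, w]` order, as the key name `rotation_xyzw` suggests.

```python
import math


def se3_from_quat_translation(rotation_xyzw, translation_xyz_m):
    """Build a 4x4 homogeneous transform (nested lists) from a quaternion
    in [x, y, z, w] order and a translation in metres.

    Hypothetical helper for illustration; not part of the repository.
    """
    x, y, z, w = rotation_xyzw
    norm = math.sqrt(x * x + y * y + z * z + w * w)
    if norm == 0.0:
        raise ValueError("quaternion must be non-zero")
    # Normalise so slightly non-unit inputs still yield a valid rotation.
    x, y, z, w = x / norm, y / norm, z / norm, w / norm
    tx, ty, tz = translation_xyz_m
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w), tx],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w), ty],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y), tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Values copied from the removed lines of adti26.json in this diff.
old_form = {
    "rotation_xyzw": [0.0, 0.0, 0.0, 1.0],
    "translation_xyz_m": [0.0, 0.0, 0.0],
}
new_form = se3_from_quat_translation(
    old_form["rotation_xyzw"], old_form["translation_xyz_m"]
)
```

For the fixture values, `new_form` equals the 4×4 identity written in the added lines, which supports the "no behavior change" note. A non-identity calibration would need the same conversion before the loader's `np.asarray(..., dtype=np.float64)` call could accept it.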