mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 04:51:12 +00:00
[autodev] step-11 path-3: calibration fix + harness drift report
Attempted Path-3 (Full SITL with community images) for the SUT Reality
Gate. Discovered sitl_observer is offline-fixture replay, not a live
SITL client -- compose-file SITL services in environment.md are
aspirational. The real Path-3 needs the fixture builders + SUT CLI
end-to-end, which surfaced 5 additional integration drifts (H-10..H-14)
on top of the prior 9.
Fixes:
- tests/fixtures/calibration/adti26.json: body_to_camera_se3 was a
{rotation_xyzw, translation_xyz_m} dict; runtime_root/_replay_branch.py
loader strictly expects a 4x4 SE3. Identity quaternion + zero
translation = identity 4x4, semantically equivalent.
New files:
- tests/fixtures/replay_config_minimal.yaml: minimal replay-mode config
for harness reproduction (mode=replay, ardupilot_plane defaults).
- .gitignore: e2e/fixtures/sitl_replay/ (generated by build_p0X_fixtures).
Documentation:
- Step 11 report: appended Path-3 attempt section.
- Leftover doc: H-10..H-14 ticket payloads added.
- Autodev state: reflects Path-3 outcome.
Step 11 stays blocked; H-13 (auto-sync AC-8 hard-fails on stationary
fixtures) requires a SUT design decision and cannot be unilaterally
fixed mid-session.
Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -172,9 +172,46 @@ Out of scope for this report; documented in `environment.md` § Execution instru
3. **Add the preventive meta-rule** about transcript-verified test claims, if approved.
4. **Resume Step 11 after Track 1 completes** — at minimum get one real Reality Gate signal from `tests/e2e/replay/`. Track 2 can run in parallel as its own work stream and feed back into Step 11 cycle 2.
## Path 3 attempt — Full SITL with community images (2026-05-17, post-blocker)
Per user direction, attempted the "Full path" rehab: switch ArduPilot SITL to `sparlane/ardupilot-sitl:Plane-latest` (verified pullable), build iNav SITL from source, write MAVProxy Dockerfile, then run FT-P-01 / FT-P-02 against the real fixture builders.
**Key reframe discovered during attempt**: `e2e/runner/helpers/sitl_observer.py` is **pure offline fixture replay**, not a live SITL client (see file docstring + `_FdrReplayObserver` class). Setting `E2E_SITL_REPLAY_DIR=...` switches the observer to read pre-built JSON fixtures (`observer_<fc_kind>_<host>.json`). No live SITL container needed for the existing blackbox FT-P-* and NFT-* tests. The compose-file SITL services in `environment.md` are aspirational future state.
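
The replay switch described above can be sketched as follows. This is a minimal illustration, not the code in `sitl_observer.py`: the function name `load_observer_fixture` and the JSON payload shape are hypothetical, and only the `E2E_SITL_REPLAY_DIR` variable and the `observer_<fc_kind>_<host>.json` naming pattern come from this report.

```python
import json
import os
from pathlib import Path


def load_observer_fixture(fc_kind, host, env=None):
    """Read a pre-built observer fixture instead of contacting a live SITL.

    Hypothetical helper: it only mirrors the naming convention quoted in the
    report (observer_<fc_kind>_<host>.json under E2E_SITL_REPLAY_DIR).
    """
    env = os.environ if env is None else env
    replay_dir = env.get("E2E_SITL_REPLAY_DIR")
    if not replay_dir:
        raise RuntimeError("E2E_SITL_REPLAY_DIR is not set; no replay fixtures to read")

    path = Path(replay_dir) / f"observer_{fc_kind}_{host}.json"
    if not path.is_file():
        # A missing file usually means the fixture builder has not run yet.
        raise FileNotFoundError(f"missing replay fixture: {path}")

    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```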
So the realistic Full Path is:
1. Install SUT locally (`pip install -e .`) — DONE.
2. Run `e2e.fixtures.sitl_replay_builder.build_p01_fixtures` to produce `e2e/fixtures/sitl_replay/p01/` — BLOCKED (see below).
3. Run pytest on `e2e/tests/positive/test_ft_p_01_still_image_accuracy.py` with `E2E_SITL_REPLAY_DIR=e2e/fixtures/sitl_replay/p01` — BLOCKED on step 2.
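
Step 3 could be driven from Python roughly as below once step 2 produces fixtures. The helper names are hypothetical; the test path, the replay directory, and the `E2E_SITL_REPLAY_DIR` variable are taken from the steps above. The readiness check assumes the builder writes `observer_*.json` files directly into the replay directory, which this report does not confirm.

```python
import os
import sys
from pathlib import Path


def replay_dir_ready(replay_dir):
    """Return True if the directory holds at least one observer fixture.

    Assumption: fixtures are named observer_*.json and sit at the top level.
    """
    return any(Path(replay_dir).glob("observer_*.json"))


def replay_pytest_invocation(test_path, replay_dir, base_env=None):
    """Build the command and environment for a replay-mode pytest run."""
    env = dict(os.environ if base_env is None else base_env)
    env["E2E_SITL_REPLAY_DIR"] = replay_dir
    cmd = [sys.executable, "-m", "pytest", test_path, "-q"]
    return cmd, env


cmd, env = replay_pytest_invocation(
    "e2e/tests/positive/test_ft_p_01_still_image_accuracy.py",
    "e2e/fixtures/sitl_replay/p01",
)
# Only worth running once replay_dir_ready(...) is True, e.g.:
# subprocess.run(cmd, env=env, check=False)
```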
Trying step 2 surfaced **5 new integration drifts** (H-10..H-14), on top of H-1..H-9 from the prior section:
| ID | Severity | Description | Status |
|----|----------|-------------|--------|
| H-10 | blocker | Fixture builder calls `gps-denied-replay --fdr-out PATH`. The CLI's actual arg name is `--output`. | not fixed |
| H-11 | blocker | Fixture builder doesn't pass the CLI's required `--camera-calibration`, `--config`, `--mavlink-signing-key` args. Need to add fields to `FixtureBuilderConfig` and update `build_p01_fixtures.py` / `build_p02_fixtures.py`. | not fixed |
| H-12 | medium | `tests/fixtures/calibration/adti26.json` declared `body_to_camera_se3` as `{rotation_xyzw, translation_xyz_m}` dict; loader at `runtime_root/_replay_branch.py:308` strictly expects a 4×4 matrix via `np.asarray(..., dtype=np.float64)`. The dict form was never parseable. | **fixed** — converted to 4×4 identity (`tests/fixtures/calibration/adti26.json`). Equivalent rotation/translation, no behavior change. |
| H-13 | blocker | Auto-sync AC-8 validation hard-fails on still-image + stationary fixtures even when `--time-offset-ms 0` is supplied. Validator computes a "frame-window match %" (default 95% threshold) that requires real video motion + IMU takeoff signal. The FT-P-01 fixture (60 stills + stationary IMU) has neither by design. No `--skip-auto-sync` or `--accept-low-confidence-offset` escape hatch exists. | not fixed |
| H-14 | env-conditional | CLI requires env vars including `BUILD_REPLAY_SINK_JSONL=ON` to use `NoopMavlinkTransport`. This is documented in code comments but not in `.env.example`. | needs doc update |
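
The H-12 equivalence claim (identity quaternion plus zero translation equals the 4×4 identity) can be checked with a small conversion sketch. It assumes a Hamilton-convention unit quaternion in `xyzw` order and a row-major homogeneous matrix; the function name is hypothetical, and the loader's actual convention should be confirmed before converting any non-identity calibration.

```python
import math


def se3_from_quat_translation(rotation_xyzw, translation_xyz_m):
    """Build a row-major 4x4 homogeneous transform as nested lists.

    Assumes a Hamilton quaternion (x, y, z, w); it is normalized first so
    small rounding errors in the input do not skew the rotation block.
    """
    x, y, z, w = rotation_xyzw
    norm = math.sqrt(x * x + y * y + z * z + w * w)
    if norm == 0.0:
        raise ValueError("zero-length quaternion")
    x, y, z, w = x / norm, y / norm, z / norm, w / norm
    tx, ty, tz = translation_xyz_m
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w), tx],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w), ty],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y), tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# The values the old dict form carried: identity rotation, zero translation.
identity = se3_from_quat_translation([0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0])
```

The nested-list result serializes directly with `json.dumps`, which is the shape a loader calling `np.asarray(..., dtype=np.float64)` can consume.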
Total live harness drift count: **14 distinct items** (4 fixed: H-1, H-2, H-3, H-12; 10 deferred). Each remaining blocker (H-10, H-11, H-13) individually takes 30-60 min to fix once the right design decisions are made; together they exceed the safe single-session budget, given how much of the harness is still unexercised end-to-end.
**Pattern**: The fixture builders (AZ-598/599/600), the CLI signature (AZ-401/402), the calibration JSON schema, and the replay protocol auto-sync (AZ-405) were each implemented well in isolation but never integrated end-to-end. This is exactly what the SUT Reality Gate is designed to surface.
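
Drifts like H-10 and H-11 are cheap to detect with a contract check that compares the flags a caller emits against the flags the CLI accepts. The sketch below is illustrative only: the flag sets are hand-copied from this report (and treating `--output` as optional is an assumption), whereas a real check would derive them from the CLI's own parser.

```python
# Flags named in this report; the real parser remains the source of truth.
KNOWN_FLAGS = {
    "--output",
    "--camera-calibration",
    "--config",
    "--mavlink-signing-key",
    "--time-offset-ms",
}
REQUIRED_FLAGS = {"--camera-calibration", "--config", "--mavlink-signing-key"}


def flag_drift(argv):
    """Return flags the CLI would reject and required flags the caller omitted."""
    passed = {tok.split("=", 1)[0] for tok in argv if tok.startswith("--")}
    return {
        "unknown": sorted(passed - KNOWN_FLAGS),
        "missing": sorted(REQUIRED_FLAGS - passed),
    }


# What the fixture builder sends today, per H-10 and H-11.
drift = flag_drift(["gps-denied-replay", "--fdr-out", "out/fdr.jsonl"])
# drift["unknown"] == ["--fdr-out"]
# drift["missing"] == ["--camera-calibration", "--config", "--mavlink-signing-key"]
```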
### Path 3 verdict
**Cannot reach the SUT Reality Gate in this session.** Even after fixing H-12, the next gate (H-13: auto-sync hard-fail on stationary fixtures) requires a design decision among three options: add an auto-sync escape hatch to the SUT, change the fixture builder to inject a single-frame motion event, or relax AC-8 validation thresholds for stationary scenarios. Each is a non-trivial design call that warrants a Jira ticket and review, not a unilateral mid-session fix.
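
To make the escape-hatch option concrete, here is one possible shape for the gate. Every name in it (`SyncDecision`, `gate_time_offset`, `allow_operator_override`) is hypothetical and nothing here reflects the SUT's actual validator; only the 95% default threshold comes from H-13. With the override disabled, the sketch reproduces the reported hard-fail even when an explicit offset is supplied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncDecision:
    accepted: bool
    reason: str


def gate_time_offset(match_pct, threshold_pct=95.0,
                     operator_offset_ms=None, allow_operator_override=False):
    """Decide whether a replay may proceed with the current time offset."""
    if match_pct >= threshold_pct:
        return SyncDecision(True, "auto-sync match above threshold")
    # Hypothetical opt-in: trust an explicitly supplied offset for fixtures
    # that carry no motion signal (stills plus stationary IMU).
    if allow_operator_override and operator_offset_ms is not None:
        return SyncDecision(
            True, f"operator offset {operator_offset_ms} ms accepted without auto-sync"
        )
    return SyncDecision(
        False, f"match {match_pct:.1f}% below {threshold_pct:.1f}% threshold"
    )
```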
### Updated recommendation
Track 2 ("Full blackbox harness") from the previous section needs to expand to include H-10..H-14 as additional sub-stories. Realistic effort: **+1-2 days** on top of the prior estimate. Path 3 is achievable but requires 3-5 days of focused harness rehab, not a single session.
## Artifacts
- Commit `eb6dc17` — csv_reporter / pytest-csv fix
- Commit `6ce3158` — e2e/docker harness drift fixes (H-1, H-2, H-3)
- Local fix (uncommitted, ready to commit): `tests/fixtures/calibration/adti26.json` — H-12 4×4 SE3 fix
- Local fix (uncommitted, ready to commit): `tests/fixtures/replay_config_minimal.yaml` — minimal config for path-3 reproduction
- This report: `_docs/03_implementation/run_tests_step11_report.md`
- Leftover for pytest-csv ticket: `_docs/_process_leftovers/2026-05-17_csv_reporter_pytest_csv_conflict.md`
- Leftover for harness epic: `_docs/_process_leftovers/2026-05-17_e2e_harness_rehabilitation.md`