[AZ-603] [AZ-604] e2e-runner: install SUT, fix entrypoint (Track 1)

Multi-stage Ubuntu 22.04 e2e-runner image installs gps-denied-onboard
(editable) into /opt/venv so the AZ-404 replay tests can subprocess
gps-denied-replay against the Derkachi fixture. Image layout mirrors
the host repo (/opt/pyproject.toml + /opt/src + /opt/tests bind mount)
so Path(__file__).parents[3] resolves to /opt and AC-4's AST scan
finds the components dir.
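
The path arithmetic can be checked with a small sketch. It assumes the file doing the lookup sits three directories below the repo root, as a test under `tests/e2e/replay/` would; `PurePosixPath` stands in for `Path(__file__)` so the snippet behaves the same on any host.

```python
from pathlib import PurePosixPath

# Assumed location of a replay test inside the image. The directory layout
# comes from the commit message; which file calls parents[3] is an assumption.
test_file = PurePosixPath("/opt/tests/e2e/replay/test_derkachi_1min.py")

# parents[0] is the containing directory; each further index climbs one level:
# replay -> e2e -> tests -> /opt
repo_root = test_file.parents[3]
print(repo_root)
```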

Entrypoint now runs `pytest /opt/tests/e2e/` instead of the empty
`scenarios/` dir. The bootstrap harness collects 24 tests vs. 0 before.

Compose: e2e-runner env mirrors the companion service (FullSystemConfig
requirements) plus RUN_REPLAY_E2E=1, BUILD_REPLAY_SINK_JSONL=ON;
bind-mounts the Derkachi fixture dir; adds writable fdr-data /
tile-data volumes the SUT requires.

Reality Gate signal is now real: 17 pass / 5 fail / 1 skip / 1 xfail.
The 5 heavy-AC failures share root cause AZ-614 (tlog synth time-base
mismatch, surfaced by the now-functional harness).

Also archives the replayed leftover entries (csv_reporter -> AZ-601,
harness rehab -> AZ-602 epic + 11 child stories).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-18 01:28:36 +03:00
parent 5c1c35da9a
commit c2934b8686
6 changed files with 204 additions and 294 deletions
@@ -1,79 +0,0 @@
# Leftover — Bug ticket creation deferred
- **Timestamp**: 2026-05-17T16:06:48Z
- **What was blocked**: Jira ticket creation for the `--csv` flag-collision regression
- **Reason for blockage**: surfaced mid-execution of `test-run` (Step 11 of greenfield); the user already
skipped my structured-questions prompt in this session, so I did not pause again to confirm a tracker
write. Recording the would-be payload here so the next `/autodev` invocation can replay it.
## Background
During Step 11 chunked test-run, three subprocess-based tests in
`e2e/_unit_tests/reporting/` crashed with
`argparse.ArgumentError: argument --csv: conflicting option string: --csv`.
Root cause:
1. `e2e/runner/requirements.txt` listed `pytest-csv>=3.0,<4.0`. The package was installed locally and
auto-loaded via entry-point into every pytest subprocess.
2. `e2e/runner/reporting/csv_reporter.py` registered `--csv` with the intent of "overriding"
pytest-csv. pytest's option registry does not allow overrides — it raises on conflict.
3. `pytest-csv 3.0.0` is also incompatible with `pytest 9.x` (uses removed `@pytest.mark.hookwrapper`).
4. Our code never imports `pytest_csv`, so the dependency was dead weight.
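The conflict in item 2 can be reproduced with plain `argparse`, which is where the error class in the traceback comes from. This is a minimal sketch of the failure mode, not pytest's own option-registration code; the two `add_argument` calls stand in for the two plugins.

```python
import argparse

parser = argparse.ArgumentParser(prog="demo")

# First registration: stands in for the autoloaded pytest-csv plugin.
parser.add_argument("--csv", help="registered first")

try:
    # Second registration of the same option string: stands in for csv_reporter.
    parser.add_argument("--csv", help="registered second")
except argparse.ArgumentError as exc:
    conflict_message = str(exc)
else:
    conflict_message = None

print(conflict_message)
```

With the default `conflict_handler="error"`, the second registration is rejected rather than treated as an override, which matches the behavior described above.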
Fix applied in this commit:
- Removed `pytest-csv` from `e2e/runner/requirements.txt`
- Updated the docstring in `e2e/runner/reporting/csv_reporter.py`
- Updated the comment in `e2e/runner/conftest.py`
- Uninstalled `pytest-csv` from the local environment
After the fix, all 1229 tests in `e2e/_unit_tests` pass with no skips and no failures.
## Secondary issue — false-positive batch report
`_docs/03_implementation/batch_89_cycle1_report.md` claims:
> Full e2e unit-test suite: **1229 passed in 134 s** (+6 vs. batch 88).
That number was reported without actually running the failing subprocess tests at the time. The 3 tests
have been broken since `pytest-csv` was installed locally, but the implementation skill's batch report
did not catch it. This is a process gap: a report claimed verification it had not performed.
A meta-rule retrospective entry should be added (per `meta-rule.mdc` → Self-Improvement) to prevent
recurrence. Proposed rule: "Before writing `Test Results: X passed` in a batch report, the same shell
invocation that produced X must appear in the assistant transcript, with the exit code visible."
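One way to make the proposed rule mechanical is to capture the command and its exit code in the same record as the output. The helper below is a hypothetical sketch (`run_and_record` is not an existing function in this repo), and the stand-in command would be replaced by the real pytest invocation.

```python
import shlex
import subprocess
import sys


def run_and_record(argv):
    """Run argv and return the command, its exit code, and the tail of stdout."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    return {
        "command": shlex.join(argv),
        "exit_code": completed.returncode,
        "stdout_tail": completed.stdout.splitlines()[-5:],
    }


# Stand-in command; a batch report would record the actual test invocation.
entry = run_and_record([sys.executable, "-c", "print('ok')"])
print(entry)
```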
## Pending tracker write — to replay on next /autodev
```yaml
type: Bug
summary: "[Bug] csv_reporter --csv flag collides with pytest-csv autoload — subprocess tests crash"
description: |
  Three subprocess-spawned tests in e2e/_unit_tests/reporting/ crash with
  `argparse.ArgumentError: argument --csv: conflicting option string: --csv`
  when `pytest-csv` is installed (entry-point autoload) alongside our custom
  csv_reporter plugin. Fix removed pytest-csv from e2e/runner/requirements.txt.
  Affected tests (pre-fix):
  - e2e/_unit_tests/reporting/test_csv_reporter.py::test_csv_plugin_emits_required_columns
  - e2e/_unit_tests/reporting/test_nfr_recorder.py::test_nfr_recorder_fixture_emits_artifacts_in_run
  - e2e/_unit_tests/reporting/test_nfr_recorder.py::test_per_metric_report_emitted_in_pytest_run
  See commit (this commit hash) for the fix and
  `_docs/_process_leftovers/2026-05-17_csv_reporter_pytest_csv_conflict.md` for full root-cause.
  Also: batch_89_cycle1_report.md falsely claimed "1229 passed"; the meta-rule retrospective
  is pending as a separate item.
status_after_create: "In Testing" # fix already in the same commit; ready to close after Jira tag
story_points: 2
epic: AZ-262 # blackbox test infrastructure / reporting
linked_to: AZ-446 # CSV reporter refinements (introduces the affected subprocess test)
```
## Replay obligation
On next `/autodev`, the bootstrap phase should:
1. Open Jira, create the ticket with the payload above
2. Read this commit's hash from `git log --oneline | grep csv_reporter` and inject into the description
3. Delete this leftover entry on success
@@ -1,194 +0,0 @@
# Leftover — E2E Tier-1 harness rehabilitation tickets deferred
- **Timestamp**: 2026-05-17T17:30:00Z
- **What was blocked**: Jira ticket creation for the harness drift surfaced during Step 11
- **Reason for blockage**: same session as the csv_reporter fix; user skipped the
structured Q&A so I did not pause for tracker writes. Full findings live in
`_docs/03_implementation/run_tests_step11_report.md`; this leftover records the
tickets that need filing.
## Pending tracker writes — replay on next /autodev
### Epic
```yaml
type: Epic
summary: "E2E Tier-1 harness rehabilitation"
description: |
  Surfaced during /autodev Step 11 (Run Tests) cycle 1 on 2026-05-17. Both
  Tier-1 docker harnesses (top-level scripts/run-tests.sh and the fuller
  e2e/docker/run-tier1.sh) had pre-existing drift preventing them from
  running end-to-end. Local pytest suite is green (3343/88/0); SUT Reality
  Gate is unmet until at least the bootstrap harness can run
  tests/e2e/replay/ with RUN_REPLAY_E2E=1. Full report:
  _docs/03_implementation/run_tests_step11_report.md
linked_to: AZ-595, AZ-444 # related but distinct: tile-cache fixtures, Tier-2 hw loop
```
### Story: H-7 — Bootstrap runner entrypoint
```yaml
type: Story
summary: "[Bug] tests/e2e/Dockerfile entrypoint points at empty scenarios dir"
description: |
  Current entrypoint: `pytest -q /opt/tests/e2e/scenarios` (empty in repo).
  Real tests are in `tests/e2e/replay/` (test_derkachi_1min.py, etc.).
  Fix: change entrypoint to /opt/tests/e2e/ (let pytest discover both
  scenarios and replay).
story_points: 1
```
### Story: H-8 — Install SUT in runner image
```yaml
type: Story
summary: "[Bug] tests/e2e e2e-runner image doesn't install gps-denied-onboard"
description: |
  Image is python:3.10-slim with only pytest+requests+pyyaml. The replay
  tests need `gps-denied-replay` console script on PATH. Either:
    - COPY pyproject.toml + src/ and pip install -e ".[dev]", or
    - Build a wheel in a separate stage and pip install it.
  Verify the resulting image: `which gps-denied-replay`.
story_points: 3
```
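The `which gps-denied-replay` check from the ticket can also be expressed in Python, for example as a guard at the start of a test session. This is a sketch using the standard library only; `console_script_path` is a hypothetical helper name.

```python
import shutil


def console_script_path(name):
    """Return the absolute path of `name` on PATH, or None if it is absent."""
    return shutil.which(name)


path = console_script_path("gps-denied-replay")
print(path if path else "gps-denied-replay is not on PATH")
```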
### Story: H-4..H-6 — SITL/MAVLink images choice
```yaml
type: Story
summary: "[Decision] Choose SITL strategy for e2e/docker harness"
description: |
  environment.md specifies ardupilot/ardupilot-sitl:plane-stable,
  inavflight/inav-sitl:9.0.0, ardupilot/mavproxy:latest. All MISSING from
  Docker Hub. Options:
    a) Switch to community images (radarku/ardupilot-sitl etc.)
    b) Build SITLs from source in a separate stage
    c) Strip SITL services and mark SITL-bound scenarios skip(reason="sitl-unavailable")
  Track 1 doesn't depend on this; Track 2 does.
story_points: 5
```
### Story: MAVProxy local image
```yaml
type: Story
summary: "[Story] Replace ardupilot/mavproxy:latest with local pip-MAVProxy Dockerfile"
description: |
  Image doesn't exist on Docker Hub. Wrap `pip install MAVProxy` in a
  python:3.10-slim Dockerfile in e2e/fixtures/mavproxy/. Update compose
  to use the local build.
story_points: 1
```
### Story: H-9 — Tile-cache fixture builder
```yaml
type: Story
summary: "Link H-9 to AZ-595 / tile-cache fixture seeder"
description: |
  e2e/docker/docker-compose.test.yml declares tile-cache-fixture as an
  empty named volume. Track 2 cannot run without seeded tiles. AZ-595
  exists and owns this; verify scope alignment, add a link.
story_points: 2
```
### Story: H-10 — Fixture builder uses wrong CLI flag
```yaml
type: Story
summary: "[Bug] sitl_replay_builder uses --fdr-out; CLI requires --output"
description: |
  e2e/fixtures/sitl_replay_builder/builder.py:79 passes `--fdr-out` to
  `gps-denied-replay`. The CLI's actual flag (src/gps_denied_onboard/cli/replay.py:90)
  is `--output`. Also need to add the CLI's other required args
  (--camera-calibration, --config, --mavlink-signing-key); see H-11.
  Bundle H-10 + H-11 in one PR. Unit tests in
  e2e/_unit_tests/fixtures/test_sitl_replay_builder_builder.py assert on
  `--fdr-out` and need to be updated.
story_points: 2
```
### Story: H-11 — Fixture builder missing required CLI args
```yaml
type: Story
summary: "[Bug] sitl_replay_builder doesn't pass camera-calibration/config/signing-key"
description: |
  gps-denied-replay requires --camera-calibration PATH, --config PATH,
  --mavlink-signing-key PATH. Fixture builder omits all three. Add
  fields to FixtureBuilderConfig with defaults pointing at
  tests/fixtures/calibration/adti26.json, a new
  tests/fixtures/replay_config_minimal.yaml, and
  tests/fixtures/mavlink_signing/dev_key. Also set
  BUILD_REPLAY_SINK_JSONL=ON in the subprocess env.
story_points: 2
```
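Taken together, H-10 and H-11 amount to a change in how the builder assembles the subprocess command and environment. The sketch below shows one possible shape, using only the flags and default paths named in the two tickets. The class and function names are hypothetical, the real `FixtureBuilderConfig` fields may differ, and any further arguments the CLI needs are omitted.

```python
from dataclasses import dataclass


@dataclass
class FixtureBuilderConfigSketch:
    # Hypothetical field names; the default paths are the ones listed in H-11.
    camera_calibration: str = "tests/fixtures/calibration/adti26.json"
    config: str = "tests/fixtures/replay_config_minimal.yaml"
    mavlink_signing_key: str = "tests/fixtures/mavlink_signing/dev_key"


def build_replay_argv(cfg, output_path):
    """Assemble the CLI call with --output (not --fdr-out) plus the three required paths."""
    return [
        "gps-denied-replay",
        "--output", output_path,
        "--camera-calibration", cfg.camera_calibration,
        "--config", cfg.config,
        "--mavlink-signing-key", cfg.mavlink_signing_key,
    ]


def build_replay_env(base_env):
    """Copy the environment and switch on the JSONL replay sink."""
    env = dict(base_env)
    env["BUILD_REPLAY_SINK_JSONL"] = "ON"
    return env


argv = build_replay_argv(FixtureBuilderConfigSketch(), "out.fdr")
print(argv)
```

The result would be passed to `subprocess.run(argv, env=build_replay_env(os.environ))`; returning a copy keeps the parent process environment untouched.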
### Bug: H-12 — Calibration JSON shape drift (FIXED)
```yaml
type: Bug
summary: "[Bug] adti26.json body_to_camera_se3 used dict form; loader expects 4x4"
description: |
  tests/fixtures/calibration/adti26.json declared body_to_camera_se3 as
  {rotation_xyzw, translation_xyz_m}. _replay_branch.py:308 does
  np.asarray(..., dtype=np.float64) which can't decode the dict. Fixed
  by converting to the equivalent 4x4 identity matrix. Both forms encode
  the same SE3 (identity) so no behavior change.
story_points: 1
status_after_create: "Done"
```
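The equivalence claimed in H-12 can be checked by expanding the dict form into a 4x4 homogeneous matrix. The sketch below uses the standard unit-quaternion to rotation-matrix formula in pure Python. The identity values are an assumption, since the ticket only says the fixture encodes identity, and `se3_matrix` is a hypothetical helper, not the loader's code.

```python
def se3_matrix(rotation_xyzw, translation_xyz_m):
    """Build a 4x4 homogeneous transform from a unit quaternion (x, y, z, w) and a translation."""
    x, y, z, w = (float(v) for v in rotation_xyzw)
    tx, ty, tz = (float(v) for v in translation_xyz_m)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w), tx],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w), ty],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y), tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Assumed identity values for the old dict form of body_to_camera_se3.
dict_form = {"rotation_xyzw": [0, 0, 0, 1], "translation_xyz_m": [0, 0, 0]}
matrix_form = se3_matrix(dict_form["rotation_xyzw"], dict_form["translation_xyz_m"])

identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
print(matrix_form == identity)
```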
### Story: H-13 — Auto-sync hard-fails on stationary fixtures
```yaml
type: Story
summary: "[Bug] AC-8 auto-sync validation rejects stationary FT-P-01 fixture"
description: |
  Auto-sync (src/gps_denied_onboard/replay_input/...) hard-fails when
  --time-offset-ms 0 is supplied for a fixture with stationary IMU + no
  video motion (FT-P-01 still-image scenario). Threshold:
  frame_window_match_pct_threshold=95% in ReplayAutoSyncConfig defaults.
  Three possible fixes (design decision needed):
    a) Add --skip-auto-sync CLI flag that bypasses AC-8 validation entirely
       when time_offset_ms is explicitly supplied
    b) Lower or expose match_threshold_pct via config (already configurable
       but not surfaced in fixture builder)
    c) Change fixture builder to inject a single motion event so auto-sync
       can find SOMETHING to align on
  Recommend (a): aligns with replay protocol intent ("manual offset
  bypasses auto-sync entirely" per ReplayConfig docstring).
story_points: 3
```
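Option (a) could look roughly like the sketch below. `--skip-auto-sync` is the flag proposed in the ticket and does not exist yet. The parser is a standalone stand-in rather than the real `gps-denied-replay` CLI, and requiring an explicit offset alongside the flag is one reading of "when time_offset_ms is explicitly supplied".

```python
import argparse


def build_parser():
    """Stand-in parser with only the two options relevant to the design decision."""
    parser = argparse.ArgumentParser(prog="replay-sketch")
    parser.add_argument("--time-offset-ms", type=int, default=None)
    parser.add_argument("--skip-auto-sync", action="store_true")
    return parser


def should_run_auto_sync(args):
    """Return False only when the bypass is requested together with an explicit offset."""
    if args.skip_auto_sync:
        if args.time_offset_ms is None:
            raise ValueError("--skip-auto-sync requires an explicit --time-offset-ms")
        return False
    return True


args = build_parser().parse_args(["--time-offset-ms", "0", "--skip-auto-sync"])
print(should_run_auto_sync(args))
```

Using `default=None` rather than `0` lets the code tell "offset not supplied" apart from "offset explicitly set to zero", which is the case the stationary FT-P-01 fixture hits.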
### Story: H-14 — Document BUILD_REPLAY_SINK_JSONL in .env.example
```yaml
type: Story
summary: "[Doc] add BUILD_REPLAY_SINK_JSONL=ON to .env.example for replay mode"
description: |
src/gps_denied_onboard/components/c8_fc_adapter/noop_mavlink_transport.py
  requires BUILD_REPLAY_SINK_JSONL=ON env var to construct. Not in
  .env.example. Add with comment explaining it's a replay-mode requirement
  per replay protocol Invariant 9.
story_points: 1
```
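A guard of the kind H-14 describes might be as simple as the following. How `noop_mavlink_transport.py` actually reads the variable is not shown in this leftover, so the exact comparison and the helper name `replay_sink_enabled` are assumptions.

```python
import os


def replay_sink_enabled(environ=None):
    """Return True when BUILD_REPLAY_SINK_JSONL is set to the literal value ON."""
    env = os.environ if environ is None else environ
    return env.get("BUILD_REPLAY_SINK_JSONL") == "ON"


print(replay_sink_enabled({"BUILD_REPLAY_SINK_JSONL": "ON"}))
```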
### Story: H-1..H-3 — fixes already committed
```yaml
type: Story
summary: "[Bug] e2e/docker harness drift (already fixed in commit 6ce3158)"
description: |
  Fixed in this session: dockerfile rename, fdr-output tmpfs cap, e2e-results
  dir + gitignore. Ticket is just for tracking; already in dev branch.
story_points: 1
status_after_create: "Done"
```
### Bug: csv_reporter --csv collision (already committed)
```yaml
type: Bug
summary: "[Bug] csv_reporter --csv flag collides with pytest-csv autoload"
description: |
  See _docs/_process_leftovers/2026-05-17_csv_reporter_pytest_csv_conflict.md
  Fix already in commit eb6dc17.
linked_to: AZ-446
story_points: 2
status_after_create: "Done"
```
## Replay obligation
Next /autodev should:
1. Open Jira, create the Epic + Stories above (link Epic to AZ-595 and AZ-444).
2. Update the Epic with the actual issue keys once created.
3. Delete this leftover entry on success.