diff --git a/_docs/03_implementation/run_tests_step11_report.md b/_docs/03_implementation/run_tests_step11_report.md
index 7932dc8..2ad76b6 100644
--- a/_docs/03_implementation/run_tests_step11_report.md
+++ b/_docs/03_implementation/run_tests_step11_report.md
@@ -314,3 +314,75 @@ This is the same family as H-13 / `AZ-611` (stationary FT-P-01) but on the movin
 ### Reality Gate verdict
 
 **Cycle-2 verdict for Step 11**: Reality Gate signal is now REAL — the SUT runs end-to-end for ~21 s on the Derkachi fixture and surfaces a real auto-sync bug. Pre-Track 1, the gate was a vacuous "exit 0 with 0 tests collected" that hid every SUT issue. Track 1 was the minimum investment to make the gate honest; future cycles (Track 2 + AZ-614) will turn the failing ACs green.
+
+## Cycle-2 addendum: Jetson harness brought online (AZ-615)
+
+The Colima harness above is "Tier-1" — ARM Linux without GPU. The SUT's
+`pytorch_fp16_runtime` (and `tensorrt_runtime`) hard-code `.cuda()` calls,
+so anything past auto-sync can ONLY be exercised against a real GPU. The
+operator's Jetson Orin Nano (JetPack 6.2.2+b24, L4T R36.5.0,
+nvidia-container-toolkit ≥ 1.16) was wired in as the Tier-2 harness.
+
+Net-new artifacts (committed under AZ-615):
+
+* `tests/e2e/Dockerfile.jetson` — `FROM dustynv/l4t-pytorch:r36.4.0` with
+  Tegra-tuned torch / torchvision pre-baked. Wipes the image's stale
+  `/etc/pip.conf` (jetson.webredirect.org is maintainer-LAN only),
+  upgrades pip 24→26 so the `gtsam<5.0,>=4.2` constraint resolves to
+  the only PyPI wheel for aarch64 (`4.3a0`, same as Colima), installs
+  the SUT editable via system-pip + `--break-system-packages`.
+* `docker-compose.test.jetson.yml` — mirror of `docker-compose.test.yml`
+  with `runtime: nvidia`, `deploy.resources.reservations.devices`, and
+  `GPS_DENIED_TIER: "2"` so the auto-skip hook in `tests/conftest.py`
+  runs the heavy ACs instead of skipping them.
+* `scripts/run-tests-jetson.sh` — rsync → ssh build → ssh up wrapper.
+  Operator-side SSH alias `jetson-e2e` documented in
+  `_docs/03_implementation/jetson_harness_setup.md`.
+* `@pytest.mark.tier2` applied to AC-1, AC-2, AC-3, AC-5, AC-6 in
+  `tests/e2e/replay/test_derkachi_1min.py` so the same test file is the
+  source of truth for both harnesses (Colima auto-skips tier2 via the
+  existing `pytest_collection_modifyitems` hook).
+
+### Jetson smoke run (first end-to-end, 2026-05-18)
+
+| Outcome | Count | Tests |
+|---------|-------|-------|
+| PASSED | 17 | AC-4 AST scan, AC-7 skip-gate, 14× AC-9 helpers |
+| FAILED | 5 | AC-1, AC-2, AC-5, AC-6 pace-realtime, AC-6 pace-asap |
+| SKIPPED | 1 | AC-8 (unchanged: D-PROJ-2 mock-sat stub) |
+| XFAIL | 1 | AC-3 (unchanged: calibration intrinsics unknown) |
+| **Wall clock** | **10m09s** | (vs ~5m on Colima) |
+
+**Same 5 failures as Colima, same root cause** (`replay.auto_sync.ac8_validation_failed`,
+offset_ms=1699999995666). AZ-614 reproduces on Jetson because the synth
+tlog time-base bug is architecture-independent — heavy ACs die at
+auto-sync, BEFORE any frame reaches the GPU. So this run validated the
+infrastructure (image builds, GPU exposed, SUT runs, pytest collects 24)
+but did NOT yet exercise ALIKED / DISK LightGlue on the actual GPU. The
+2× wall delta vs Colima is the cost of CUDA + torch + TensorRT
+initialization in the per-test SUT subprocess.
+
+**Implication for Track 2**: fixing AZ-614 is the gating prerequisite for
+ANY Reality-Gate-grade signal from the heavy ACs. Until then, Jetson and
+Colima are indistinguishable — same green light ACs, same failed heavy
+ACs. Once AZ-614 lands, the two harnesses divide cleanly: Colima keeps
+exercising the light path (AC-4 / AC-7 / AC-9 plus auto-sync), Jetson
+covers the heavy path (AC-1 / AC-2 / AC-5 / AC-6 plus the GPU inference
+stages they entail).
+
+### Lessons learned (committed to setup doc)
+
+* `nvcr.io/nvidia/l4t-base` is deprecated in JetPack 6; `l4t-pytorch`
+  has no R36 tags; `l4t-jetpack:r36.4.0` exists but ships no PyTorch.
+  `dustynv/l4t-pytorch:r36.4.0` (Docker Hub) is the only off-the-shelf
+  Jetson base image with Tegra-tuned PyTorch wheels for R36.
+* `nvidia-container-runtime` mounts `nvidia-smi` + CUDA libs from the
+  host into any container at runtime, so the GPU-exposure smoke test
+  doesn't need a 5 GB `l4t-jetpack` pull — `ubuntu:22.04 nvidia-smi`
+  (80 MB) suffices.
+* The dustynv image bakes a private pip mirror into `/etc/pip.conf`;
+  builds in any other network must wipe it AND pin `--index-url` to
+  upstream PyPI.
+* git LFS-tracked fixtures (the 269 MB Derkachi mp4) must be
+  pre-smudged on the Mac BEFORE the rsync step; otherwise the Jetson
+  receives the 134 B pointer and tests fail at fixture-load.
diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md
index ec52e9b..9d2578f 100644
--- a/_docs/_autodev_state.md
+++ b/_docs/_autodev_state.md
@@ -6,9 +6,9 @@ step: 11
 name: Run Tests
 status: passed_with_followups
 sub_step:
-  phase: 6
-  name: track-1-complete
-  detail: "Track 1 done (AZ-603 + AZ-604 Done). Reality Gate signal now REAL: 17 pass / 5 fail / 1 skip / 1 xfail across 24 tests. AC-1..AC-6 share root cause AZ-614 (tlog synth time-base mismatch). Tracks 2/3 queued for cycle 2."
+  phase: 7
+  name: jetson-harness-online
+  detail: "Track 1 done + AZ-615 Jetson Tier-2 harness wired. First Jetson run: identical to Colima (17 pass / 5 fail / 1 skip / 1 xfail, 10m09s). Same 5 failures hit AZ-614 (tlog synth time-base, arch-independent) BEFORE reaching the GPU. Image builds, GPU exposed, SUT runs — infrastructure proven. Next: fix AZ-614 to actually exercise the GPU. AZ-616 (real ../satellite-provider) + AZ-617 (tier2 marks done) queued."
 retry_count: 0
 cycle: 1
 tracker: jira
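
Reviewer sketch (not part of the patch): the report describes an auto-skip hook in `tests/conftest.py` that skips `tier2`-marked tests unless `GPS_DENIED_TIER` is `"2"`. The actual hook is not shown in this diff, so the following is only a guess at its shape. The helper names `current_tier` and `should_skip_tier2`, and the fallback to Tier 1 when the variable is unset or malformed, are assumptions made for illustration.

```python
import os

# Variable name taken from docker-compose.test.jetson.yml as quoted in the diff.
TIER_ENV_VAR = "GPS_DENIED_TIER"


def current_tier(env=None):
    """Return the harness tier as an int.

    Assumption: an unset or non-numeric value means Tier 1 (no GPU).
    """
    env = os.environ if env is None else env
    try:
        return int(env.get(TIER_ENV_VAR, "1"))
    except ValueError:
        return 1


def should_skip_tier2(has_tier2_marker, env=None):
    """A tier2-marked test is skipped unless the harness declares tier >= 2."""
    return has_tier2_marker and current_tier(env) < 2


def pytest_collection_modifyitems(config, items):
    """Collection hook: attach a skip marker to tier2 tests on Tier-1 harnesses."""
    import pytest  # imported lazily so the helpers above work without pytest

    skip = pytest.mark.skip(reason="tier2 test: needs GPS_DENIED_TIER >= 2")
    for item in items:
        if should_skip_tier2(item.get_closest_marker("tier2") is not None):
            item.add_marker(skip)
```

Keeping the decision in a pure function would let the gating logic be unit-tested on either harness without a GPU; the hook itself only translates that decision into a pytest skip marker.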