build_pre_constructed now populates c3_lightglue_runtime
(LightGlueRuntime) + c3_feature_extractor (FeatureExtractor) on top
of AZ-619/620/621. Strategy-specific BUILD_MATCHER_* flag mismatch
raises AirborneBootstrapError naming the missing flag and the c3_matcher
consumer; the c7 InferenceRuntime built earlier in the bootstrap is
reused as the engine source so no double-build at this layer.
C3MatcherConfig gains optional lightglue_weights_path: Path | None
for the operator's deployment config; production main() (AZ-624)
populates it. Real LightGlue inference correctness is verified by
AZ-624's Jetson AC-5 run per the AZ-622 Tier-2 Note.
Phase tests for AZ-619/620/621 gain an autouse _stub_c3_matcher_builders
fixture so additivity assertions remain valid as the bootstrap grows.
Code review: PASS_WITH_WARNINGS (3 Low: signature drift from spec,
_is_build_flag_on duplication across 3 runtime_root modules, and
BuildConfig literal mirrored with per-strategy build configs). All
deferred to future hygiene PBIs.
Co-authored-by: Cursor <cursoragent@cursor.com>
Catches up implement skill Step 14.5 cadence (K=3 missed since
batches 82-84): one review covering the 88-92 window after the
previous session backfilled the missing 85-87 review at the wrong
path. Renames reviews/cumulative_review_batches_85_87.md to the
canonical cumulative_review_batches_85-87_cycle1_report.md so the
implement skill's resumability detects it.
Cumulative review 88-92 verdict: PASS_WITH_WARNINGS.
- CR-F1/F2 carry-overs from 85-87 escalated (write_csv_evidence +
_resolve_fixture_path duplication now in 17 files each).
- CR-F3 process: batch_90/91_review.md missing on disk; batches'
inline self-reviews substitute.
- Phase 7 architecture clean: airborne_bootstrap.py imports all
Layer-5 sibling or lower, no new cycles, public APIs respected.
State: still Step 7 (Implement) sub_step 16 batch-loop. Next: batch
93 = AZ-622 (Phase D, 3cp) — fresh session recommended.
Co-authored-by: Cursor <cursoragent@cursor.com>
Third subtask of AZ-618. Extends airborne_bootstrap.build_pre_constructed
additively with c7_inference (GPU InferenceRuntime). Wraps the existing
inference_factory.build_inference_runtime so a BUILD_TENSORRT_RUNTIME /
BUILD_PYTORCH_FP16_RUNTIME mismatch surfaces a clear operator-facing
AirborneBootstrapError naming BOTH airborne C7 flags plus the consuming
component slug, rather than bubbling up RuntimeNotAvailableError with no
context.
New public const C7_AIRBORNE_BUILD_FLAGS pairs each airborne runtime
with its gating env flag (onnx_trt_ep deliberately omitted — research
only). Tests stub at the factory boundary; real GPU/TensorRT load
remains Tier-2 only (consolidated at AZ-624). AZ-619 and AZ-620 test
files extended with a _stub_c7_inference_builder autouse fixture
mirroring the AZ-620 pattern for _build_c6_*.
18/18 runtime_root unit tests pass.
Co-authored-by: Cursor <cursoragent@cursor.com>
Second of six subtasks of AZ-618. Extends
airborne_bootstrap.build_pre_constructed(config) additively with the
two C6 storage entries on top of AZ-619's c13_fdr + clock contract:
- c6_descriptor_index: via storage_factory.build_descriptor_index
- c6_tile_store: via storage_factory.build_tile_store
When BUILD_FAISS_INDEX=OFF, the lower-level RuntimeNotAvailableError
from the descriptor index factory is translated into an
AirborneBootstrapError that names the missing key
(c6_descriptor_index), the gating flag (BUILD_FAISS_INDEX), and the
consuming component slug(s) drawn from
AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS. The original error is
preserved as __cause__ so operators still see the upstream reason.
Tests: 3 new unit tests cover AC-620.1 + AC-620.2 (twice, with and
without a configured consumer, so the bootstrap fails loudly in
either branch). AZ-619 tests updated to add an autouse stub for the
Phase B builders (keeps them focused on Phase A keys) and to relax
the "exactly two keys" assertion to "AZ-619 keys remain present
under AZ-620 additivity" per the original test's own forward-pointer.
Bonus: ruff --fix removed 12 pre-existing UP037 quoted-annotation
warnings in airborne_bootstrap.py (covered by `from __future__ import
annotations`). All in modified-area scope per quality-gates.mdc.
Run: pytest tests/unit/runtime_root/ -q -> 15/15 passed in 1.06s.
Spec moved to _docs/02_tasks/done/ in the previous commit (audit-trail
backfill of batch_90 also landed there).
Co-authored-by: Cursor <cursoragent@cursor.com>
Prior session committed AZ-619 (Phase A of AZ-618) as 8abfb02,
transitioned the tracker, and archived the spec, but did not write
the batch report. Content reconstructed from git show + the AZ-619
task spec + the prior _docs/_autodev_state.md sub_step.detail.
No code change. Pure audit-trail housekeeping.
Co-authored-by: Cursor <cursoragent@cursor.com>
Adds airborne_bootstrap.build_pre_constructed(config) returning a
dict with the two foundational keys: a per-binary shared FdrClient
under "c13_fdr" (via make_fdr_client with the new
AIRBORNE_MAIN_PRODUCER_ID constant) and a fresh WallClock under
"clock". Phases B..F (AZ-620..AZ-624) extend this function
additively without breaking the AZ-619 contract.
The c13_fdr instance is identity-stable across calls (per the
make_fdr_client per-producer cache) so callers can call
build_pre_constructed twice and get the same FdrClient back -
AC-619.2.
Replay-mode override is unchanged: compose_root merges
replay_components over pre_constructed so the WallClock here is
replaced by TlogDerivedClock in replay binaries (existing
contract documented in compose_root's docstring).
Tests: 5 new unit tests under tests/unit/runtime_root/
test_az619_pre_constructed_phase_a.py, all passing. AZ-591 not
regressed (12/12 in the combined run).
Spec moved to _docs/02_tasks/done/.
Co-authored-by: Cursor <cursoragent@cursor.com>
The AZ-618 spec author flagged "likely a true 8" with a recommended
6-subtask split; combined with the user-rule cap on PBI complexity
(create at 2-3pt, max 5pt) the right move was to split before any
implementation began. Subtasks created in Jira as children of AZ-618:
AZ-619 (Phase A) c13_fdr + clock 2pt
AZ-620 (Phase B) c6_descriptor_index + c6_tile_store 3pt
AZ-621 (Phase C) c7_inference engine 3pt
AZ-622 (Phase D) c3_lightglue_runtime + c3_feature_extractor 3pt
AZ-623 (Phase E) c282_ransac_filter + c5 helpers 3pt
AZ-624 (Phase F) wire main() + AC-1..AC-5 + Jetson 2pt
Aggregate: 16pt actionable work (vs. AZ-618's original 5pt filing,
which the author had already qualified as understated). AZ-618 stays
In Progress in Jira as the umbrella tracker; its task spec file is
now an umbrella reference pointing to the 6 phase-specific spec files.
Deps table updated: AZ-618 row reduced to 0pt with subtask deps; six
new rows added; header counts refreshed (156 -> 162 tasks, 522 -> 533
points). Autodev state set to phase=1 (parse) for the next batch =
AZ-619 (Phase A) only.
Co-authored-by: Cursor <cursoragent@cursor.com>
Codifies that Tier-1 (local pytest + Docker) is necessary but NOT
sufficient: Tier-2 (Jetson Orin Nano via run-tests-jetson.sh) is the
product-completeness gate for runtime_root, c7_inference, c3_matcher,
c2_5_rerank, replay_input, and the replay CLI. Documents the
mandatory-Tier-2 scope, what Tier-1-only stubs cannot prove, the
operating procedure, and what batch reports must capture for in-scope
changes. Surfaced by the Step-11 cycle-1 finding that AZ-618 was only
caught because Tier-2 was actually run.
Co-authored-by: Cursor <cursoragent@cursor.com>
Append AZ-618 row to _dependencies_table.md (5pt, 12 dep tasks all in
done/, epic AZ-602) and refresh totals (155→156 tasks, 517→522 pts).
Mark autodev state in_progress at sub_step phase 1 (parse) so the
implement skill can pick up batch 90 with a clean tree per the
2026-05-18 lesson on rewinds-as-session-boundaries.
Co-authored-by: Cursor <cursoragent@cursor.com>
Captures the pattern observed this cycle: when /autodev rewinds from
Step 11 (Run Tests) back to Step 7 (Implement) due to a gate fail,
the rewind itself eats real context (task spec drafting + state
update + dependencies survey). Continuing into the destination
step's batch loop in the same conversation risks context truncation
mid-batch. Treat the rewind as a session boundary; let a fresh
/autodev invocation start the implement loop cleanly.
Co-authored-by: Cursor <cursoragent@cursor.com>
Cycle-3 addendum captures the layered Jetson rerun progression:
synth time-base fix (AZ-614) drops offset_ms from 1.7e12 to -4334;
AZ-611 skip-auto-sync then crosses the AC-9 validator; AZ-602
build-flag completeness opens VideoFileFrameSource and
TlogReplayFcAdapter; composition root logs
'replay.compose_root.ready: auto_sync_used=false', then crashes
inside runtime_root.airborne_bootstrap because production main()
never builds c13_fdr / c6_* / c7_inference / c3_lightglue_runtime /
c3_feature_extractor / c2_82_ransac_filter into pre_constructed.
The bootstrap gap is filed as AZ-618 (Story under AZ-602). It
affects both live and replay binaries -- every prior Reality-Gate
run died at auto-sync before the composition graph was walked, so
the gap was hidden. The 38 compose_root unit tests pass only via
the replay_components_factory stub kwarg, which bypasses the
bootstrap entirely.
Autodev sub_step advances to phase 8
'az614-az611-landed-bootstrap-gap-discovered' pending the user's
decision on whether to start AZ-618 immediately or close out
Step 11 with the current Reality-Gate signal.
Co-authored-by: Cursor <cursoragent@cursor.com>
Records the first Jetson Tier-2 run results in the step-11 report:
17 pass / 5 fail / 1 skip / 1 xfail (24 total, 10m09s) — identical to
Colima because all 5 failures hit AZ-614 (tlog time-base mismatch)
BEFORE reaching the GPU. So the infrastructure is proven (image
builds, GPU exposed inside container, SUT subprocess runs to the
auto-sync stage) but the heavy ACs haven't yet exercised ALIKED /
DISK LightGlue. Fixing AZ-614 is the gating prerequisite to actually
drive the GPU stages.
Also captures lessons learned that are now in the setup doc:
* Only dustynv/l4t-pytorch:r36.4.0 is a usable Jetson PyTorch base
on Docker Hub for R36 / JetPack 6 (l4t-base deprecated, official
l4t-pytorch has no R36 tags).
* The dustynv image bakes a maintainer-LAN-only pip mirror into
/etc/pip.conf — must be wiped + --index-url pinned to pypi.org.
* pip 24.2 (image default) rejects gtsam-4.3a0 pre-release; pip 26.x
accepts the same wheel for `gtsam<5.0,>=4.2` because there are no
stable aarch64 builds. Upgrade pip in the build, don't relax pin.
* nvidia-container-runtime mounts nvidia-smi from host, so the GPU
smoke test needs only ubuntu:22.04 (80 MB), not l4t-jetpack (5 GB).
Autodev state advances to phase 7 / jetson-harness-online.
Co-authored-by: Cursor <cursoragent@cursor.com>
Two doc lessons learned from on-Jetson verification:
1. The `cat >> ~/.ssh/config <<'EOF'` heredoc needs a leading blank
line. Without it, the appended block fused onto the previous
file line and produced "unsupported option yesHost" at parse
time. Added an explicit blank line + comment.
2. The smoke test for nvidia-container-runtime doesn't need a 5 GB
l4t-jetpack pull — nvidia-container-runtime mounts nvidia-smi
from the host into any container, so `ubuntu:22.04 nvidia-smi`
(80 MB) is sufficient. Switched the doc.
Operator verified end-to-end:
* `ssh jetson-e2e true` works from both terminal and Cursor Shell
* `jetson` user already in `docker` group (no sudo needed)
* `docker run --runtime=nvidia ubuntu:22.04 nvidia-smi` returns
Orin GPU info inside the container
Co-authored-by: Cursor <cursoragent@cursor.com>
Operator-reported: `nvcr.io/nvidia/l4t-base:r36.4.0` fails to pull.
Investigation against the live registries confirmed:
* `nvcr.io/nvidia/l4t-base` — deprecated in JetPack 6, no r36 tags
(forum thread "L4T Base docker image for Jetpack 6.2 (r36.4.3)",
GitHub dusty-nv/jetson-containers#883).
* `nvcr.io/nvidia/l4t-pytorch` — no r36 tags at all. Newest is
r35.2.1-pth2.0-py3 (too old for our torch>=2.2 floor).
* `nvcr.io/nvidia/l4t-jetpack:r36.4.0` — exists but ships no PyTorch.
* `dustynv/l4t-pytorch:r36.4.0` (Docker Hub) — exists, ~6.3 GB ARM64,
PyTorch + torchvision + opencv pre-baked, maintained by dusty-nv
(NVIDIA's Jetson containers maintainer).
Switched Dockerfile.jetson base to `dustynv/l4t-pytorch:r36.4.0`.
Forward-compatible with the host's R36.5 BSP (NVIDIA containers
tolerate one minor BSP ahead on the host side).
Setup doc fixes:
* smoke-test command now uses `l4t-jetpack:r36.4.0` (the official
replacement for the deprecated `l4t-base`)
* keygen step explicitly states it produces BOTH halves (private +
.pub) in one go
* ssh-copy-id + ssh config show how to specify a custom port
* troubleshooting table gets a new row for the `l4t-base not found`
case so the next dev hits the answer in 30 seconds
Co-authored-by: Cursor <cursoragent@cursor.com>
C7 inference (PytorchFp16Runtime / TensorRTRuntime / OnnxTrtEpRuntime)
is CUDA-only by design — `model.half().cuda()` is hard-wired with no
CPU fallback. The Colima/Tier-1 smoke harness can never exercise C3
matcher or C7 inference. Once AZ-614 fixes the tlog time-base mismatch
and the pipeline reaches those stages, Colima runs would hard-fail at
`.cuda()` instead of cleanly skipping.
This commit lays down the Jetson companion harness and wires the
existing `tier2` auto-skip:
* tests/e2e/Dockerfile.jetson — l4t-pytorch:r36.4.0-pth2.3-py3 base,
same /opt layout as the Colima image so AC-4 AST scan + bind mounts
work identically. Built ON the Jetson via run-tests-jetson.sh.
* docker-compose.test.jetson.yml — mirrors docker-compose.test.yml
but with `runtime: nvidia`, GPU device exposure, and
GPS_DENIED_TIER=2 (turns OFF the tier2 auto-skip).
* scripts/run-tests-jetson.sh — rsync → ssh build → ssh up,
exit-code-from e2e-runner so the local exit code reflects the
remote test verdict. No credentials in the repo; uses
`ssh jetson-e2e` alias resolved via ~/.ssh/config.
* _docs/03_implementation/jetson_harness_setup.md — one-time SSH
key + alias + sshd hardening + GPU verification steps. Documents
the smoke vs. Reality Gate split + the GPS_DENIED_TIER switch.
AZ-617 (mark heavy ACs with tier2): adds @pytest.mark.tier2 to AC-1,
AC-2, AC-3, AC-5, AC-6 in tests/e2e/replay/test_derkachi_1min.py.
Reuses the existing tier2 marker + auto-skip in tests/conftest.py
(scope revision documented as a comment on AZ-617). AC-4a/4b/AC-7/AC-9
stay unmarked — they don't touch CUDA.
Defers to follow-up Jira:
* AZ-614 — Derkachi tlog synth time-base mismatch (unblocks tier2 ACs
actually reaching the GPU stage on the Jetson)
* AZ-616 — replace mock-sat with real ../satellite-provider service
Not run yet: the harness needs operator-side SSH setup to come online
before scripts/run-tests-jetson.sh can be executed end-to-end. Setup
steps documented in jetson_harness_setup.md.
Co-authored-by: Cursor <cursoragent@cursor.com>
Multi-stage Ubuntu 22.04 e2e-runner image installs gps-denied-onboard
(editable) into /opt/venv so the AZ-404 replay tests can subprocess
gps-denied-replay against the Derkachi fixture. Image layout mirrors
the host repo (/opt/pyproject.toml + /opt/src + /opt/tests bind mount)
so Path(__file__).parents[3] resolves to /opt and AC-4's AST scan
finds the components dir.
Entrypoint now runs `pytest /opt/tests/e2e/` instead of the empty
`scenarios/` dir. The bootstrap harness collects 24 tests vs. 0 before.
Compose: e2e-runner env mirrors the companion service (FullSystemConfig
requirements) plus RUN_REPLAY_E2E=1, BUILD_REPLAY_SINK_JSONL=ON;
bind-mounts the Derkachi fixture dir; adds writable fdr-data /
tile-data volumes the SUT requires.
Reality Gate signal is now real: 17 pass / 5 fail / 1 skip / 1 xfail.
The 5 heavy-AC failures share root cause AZ-614 (tlog synth time-base
mismatch, surfaced by the now-functional harness).
Also archives the replayed leftover entries (csv_reporter -> AZ-601,
harness rehab -> AZ-602 epic + 11 child stories).
Co-authored-by: Cursor <cursoragent@cursor.com>
Attempted Path-3 (Full SITL with community images) for the SUT Reality
Gate. Discovered sitl_observer is offline-fixture replay, not a live
SITL client -- compose-file SITL services in environment.md are
aspirational. The real Path-3 needs the fixture builders + SUT CLI
end-to-end, which surfaced 5 additional integration drifts (H-10..H-14)
on top of the prior 9.
Fixes:
- tests/fixtures/calibration/adti26.json: body_to_camera_se3 was a
{rotation_xyzw, translation_xyz_m} dict; runtime_root/_replay_branch.py
loader strictly expects a 4x4 SE3. Identity quaternion + zero
translation = identity 4x4, semantically equivalent.
New files:
- tests/fixtures/replay_config_minimal.yaml: minimal replay-mode config
for harness reproduction (mode=replay, ardupilot_plane defaults).
- .gitignore: e2e/fixtures/sitl_replay/ (generated by build_p0X_fixtures).
Documentation:
- Step 11 report: appended Path-3 attempt section.
- Leftover doc: H-10..H-14 ticket payloads added.
- Autodev state: reflects Path-3 outcome.
Step 11 stays blocked; H-13 (auto-sync AC-8 hard-fails on stationary
fixtures) requires a SUT design decision and cannot be unilaterally
fixed mid-session.
Co-authored-by: Cursor <cursoragent@cursor.com>
Local Tier-1 pytest suite: 3343 pass / 88 skip / 0 fail across 12 chunks.
Docker harness SUT Reality Gate UNMET — both Tier-1 docker harnesses
(scripts/run-tests.sh and e2e/docker/run-tier1.sh) have pre-existing
drift that prevents them from running end-to-end. Findings:
H-1..H-3 (fixed in 6ce3158): dockerfile rename, fdr-output tmpfs cap,
e2e-results bind dir + gitignore.
H-4..H-6 (deferred): three SITL/MAVLink Docker Hub images don't exist
(ardupilot/mavproxy, ardupilot/ardupilot-sitl,
inavflight/inav-sitl). environment.md spec was
written against aspirational image names.
H-7..H-8 (deferred): tests/e2e/Dockerfile entrypoint points at empty
scenarios dir + doesn't install the SUT package.
H-9 (deferred): tile-cache-fixture seeder missing (relates to AZ-595).
Plus a regression caught and fixed mid-run: pytest-csv autoload
conflicts with our custom --csv flag (commit eb6dc17). Also surfaced a
false-positive batch-89 test-result report; proposed preventive
meta-rule pending user approval.
Step 11 marked status=blocked pending harness rehabilitation tickets
(payloads recorded in _docs/_process_leftovers/). Full outcome report:
_docs/03_implementation/run_tests_step11_report.md.
Co-authored-by: Cursor <cursoragent@cursor.com>
Subprocess-spawned tests in e2e/_unit_tests/reporting/ crashed with
"argparse.ArgumentError: argument --csv: conflicting option string: --csv"
because pytest-csv (autoloaded via entry-point) and our custom plugin both
register --csv. pytest's option registry does not allow overrides.
Fix: drop pytest-csv from e2e/runner/requirements.txt. It was unused, dead
weight, and incompatible with pytest 9.x (uses removed hookwrapper marker).
Update conftest + csv_reporter comments to match.
After fix: 1229/1229 in e2e/_unit_tests pass.
Bug ticket creation deferred (user skipped interactive Q this session) —
payload recorded in _docs/_process_leftovers/2026-05-17_csv_reporter_*.md
for replay on next /autodev.
Co-authored-by: Cursor <cursoragent@cursor.com>
Final test-implementation report written at
_docs/03_implementation/implementation_report_tests.md. All 41
blackbox-test tasks (AZ-406..AZ-446) under epic AZ-262 are done.
Full-suite gate handed off to .cursor/skills/test-run/SKILL.md per
implement skill Step 16.
Co-authored-by: Cursor <cursoragent@cursor.com>
Batch 89 — adds optional `band`, `ci95_low`, `ci95_high` kw-only
parameters to `_NfrRecorder.record_metric` and emits a new per-metric
report.csv artifact (one row per scenario × metric, columns:
scenario_id, metric_name, value, value_band, ci95_low, ci95_high,
ac_id, outcome). Backwards compatible — existing 4-arg callers
unchanged; unbalanced ci95 pair raises ValueError. report.csv is
written once per pytest session from `pytest_sessionfinish` so the
annotation pass runs once per CI invocation regardless of
(fc_adapter, vio_strategy) (AC-3). `regression-baseline.json`
intentionally kept flat to preserve the diff contract used by
regression-detection tooling.
NFT-RES-03 + NFT-PERF-01 scenarios updated to pass real bands and
compute empirical 2.5/97.5-percentile ci95 from their own sample
streams (per-iteration envelope ratios for Monte Carlo,
per-frame latency samples for N-sample latency).
Tests: 1229 e2e/_unit_tests pass (+6 vs. batch 88 for AZ-446
band/CI behavior, value-error on unbalanced ci95, report.csv columns,
explicit-path override, and end-to-end emission via the pytest
plugin). Code review: PASS_WITH_WARNINGS — 1 Low (empirical-CI
semantics, documented inline), 1 Medium carried over from batch 88's
cumulative-review backlog (write_csv_evidence + _resolve_fixture_path
duplication is outside AZ-446 reporting scope).
This commit closes Step 10 Implement Tests for cycle 1 (41 of 41
blackbox-test tasks done, AZ-406..AZ-446). Greenfield auto-chains to
Step 11 Run Tests next.
Co-authored-by: Cursor <cursoragent@cursor.com>
Batch 88 — adds four resource-limit blackbox scenarios + pure-logic
helpers + unit tests:
- NFT-LIM-01 Jetson memory (AC-NEW-13): tier2_only; Plan A/B budgets;
AC-4 OOM-event scan; 30 s warm-up window; VmRSS + tegrastats streams.
- NFT-LIM-02 FDR size (AC-7.3): 30 min → 8 h linear extrapolation
against 50 GiB; ±60 s replay-window slack for AC-1.
- NFT-LIM-03+05 storage (AC-7.4 + AC-NEW-12 + RESTRICT-STORAGE):
aggregate ≤ 100 GiB across tile-cache + tile-cache-write +
fdr-output; thumbnail-log < 1 GiB strict 8 h-extrapolated.
- NFT-LIM-04 thermal (AC-NEW-5 PARTIAL): tier2_only; CPU/SoC p99
≤ T_throttle − 5 °C; throttle-event scan; PARTIAL annotation written
to traceability-status.json. Thresholds fixture lives at
e2e/fixtures/jetson/thermal-thresholds.json (moved from the
task spec's suggested tests/fixtures/ path so the file stays
inside the blackbox_tests Owns: e2e/** envelope).
All four helpers are public-boundary-only (no src/gps_denied_onboard
imports). Scenarios skip cleanly in the Tier-1 docker harness pending
AZ-595 (SITL replay builder) for the four shared fixture inputs and
AZ-444 (Tier-2 Jetson runner) for the tier2_only scenarios.
Code review: PASS_WITH_WARNINGS (0/0/2/1). Both Mediums are
carried-over write_csv_evidence + _resolve_fixture_path duplication,
deferred to AZ-446 (batch 89). Low is the self-resolved AZ-443 fixture
ownership drift documented in the review.
Tests: 1223 e2e/_unit_tests passing (+1 vs. batch 87 from the new
directory-layout entry); 24 resource_limit scenarios collect and skip
cleanly under runner/pytest.ini.
Co-authored-by: Cursor <cursoragent@cursor.com>
FT-P-15: parse FDR `cache-self-check` records; assert every tile-manifest
entry has CRS, tile_matrix, dimension, m_per_px, capture_date, source,
compression; m_per_px >= 0.5 (or rejected by FDR `tile-load-rejected`).
FT-P-16: read `docker network inspect e2e-net` + `docker inspect <sut>`
snapshots; assert `Internal == true` AND SUT attached only to e2e-net.
The 0-egress semantic of AC-8.3 is enforced structurally.
FT-P-18: walk FDR + tile-cache, probe JPEG dimensions via stdlib SOF
parser, reject any file matching nav-camera raw pattern (5472x3648 or
880x720). Extrapolate thumbnail-log size to 8h; assert < 1 GB.
Adds runner.helpers.tile_cache_inspector with five evaluators
(manifest schema, offline mode, raw-frame detection, thumbnail budget,
JPEG dimension probe) + walk_files helper. Pure-logic coverage: 43
new unit tests; full e2e/_unit_tests/ suite 793 passing (was 746).
Scenarios skip locally when SITL replay fixture or docker-inspect
env vars are missing; production hooks (cache-self-check FDR record,
tile-load-rejected events, docker-inspect snapshots) are tracked
outside this task.
See _docs/03_implementation/batch_82_report.md +
reviews/batch_82_review.md.
Co-authored-by: Cursor <cursoragent@cursor.com>
Archive AZ-420 to done/; add cumulative review for batches 79-81 (PASS,
no new findings); advance autodev state to await batch 82.
Co-authored-by: Cursor <cursoragent@cursor.com>
FT-P-12: parse mavproxy-listener tlog over a 60 s Derkachi replay and
assert SUT->GCS GLOBAL_POSITION_INT cadence lands in [1, 2] Hz (AC-6.1).
FT-P-13: inject `RELOC:<lat>,<lon>,<radius_m>` STATUSTEXT while the SUT
is in dead_reckoned; verify FDR `c8.gcs.operator_command` ack <=2s,
`anchor_search_region` centre shifts toward the hint, and no
BAD_SIGNATURE / UNAUTHORIZED / REJECTED STATUSTEXT lands in the
post-inject window (AC-6.2).
Adds runner.helpers.gcs_telemetry_evaluator (rate, hint-ack correlation,
haversine search-region shift, rejection scan) and
sitl_observer.capture_gcs_tlog (parity surface to capture_ap_tlog).
Pure-logic coverage: 39 new unit tests; full e2e/_unit_tests/ suite
746 passing (was 700). Scenarios skip locally on missing SITL replay
fixture; production hooks (inbound STATUSTEXT parser, anchor_search_region
FDR emitter) tracked outside this task.
See _docs/03_implementation/batch_81_report.md +
reviews/batch_81_review.md.
Co-authored-by: Cursor <cursoragent@cursor.com>
Phase 1: extend sitl_observer with cursor-based `wait_for_outbound`
returning `OutboundMessage` from `outbound_messages_<fc_kind>_<host>.json`
fixtures. Three outcomes: message, TimeoutError (null entries), or
RuntimeError (missing/malformed). Fix FT-P-01 + FT-P-05 scenarios to
use `fc_kind=` kwarg.
Phase 2: FT-P-01 vertical-slice fixture builder under
`e2e/fixtures/sitl_replay_builder/`. Reuses the production
`gps-denied-replay` CLI + `ReplayInputAdapter`: encode 60 stills as
1 fps MP4 + synthetic stationary tlog (pymavlink); run replay;
project FDR outbound estimates into the schema. Avoids the
13+ cp of SUT-side frame-ingestion that a live-SITL-capture path
would have required. Live execution remains a manual operator step.
+35 unit tests (664 total, up from 637). K=3 cumulative review for
b76-b78 documents the offline-replay arc convergence.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add `runner/helpers/replay_mode.py` (NullFrameSink, NullFcInboundEmitter,
default_frame_period_ms, load_replay_json, resolve_replay_subdir,
imu_replay_noop) and rewire all 13 scenarios off their local
`_resolve_*` / `_drive_*` / `_push_*` NotImplementedError stubs.
Closes the offline FDR-replay execution path. `grep raise
NotImplementedError` under `e2e/tests/` now returns zero matches. +17
unit tests (626 total, up from 608). Unit-test behaviour unchanged
(scenarios still skip via b75 sitl_replay_ready gate when
E2E_SITL_REPLAY_DIR is unset).
Co-authored-by: Cursor <cursoragent@cursor.com>
Add `runner/helpers/fc_proxy_runtime.py` wrapping the existing
`BlackoutSpoofProxy` (AZ-406) with a scenario-facing `drive_fc_proxy`
entry point. FDR-replay mode only: loads `schedule.json`, optionally
activates the proxy against a caller clock for alignment verification,
and writes a `proxy_drive_report.json` audit record into
`${E2E_SITL_REPLAY_DIR}` for downstream evaluators.
Replaces the local `_drive_fc_proxy` stub in FT-N-04. Adds 3
@property accessors on `BlackoutSpoofProxy` so the wrapper does not
reach into private attributes. +11 unit tests (608 total, up from
596). Live-mode router wiring remains out of scope (future ticket).
Co-authored-by: Cursor <cursoragent@cursor.com>
Implement all 11 `sitl_observer` public surfaces as an offline
FDR-replay strategy (reads JSON fixtures under `${E2E_SITL_REPLAY_DIR}`
instead of live pymavlink/yamspy). Replace 12 per-scenario
`_harness_helpers_implemented` probes with one shared session-scoped
`sitl_replay_ready` fixture in `e2e/tests/conftest.py`.
Net: -636 LoC of duplicated scenario gating, +17 LoC shared fixture,
+38 new unit tests (596 total, up from 558). Includes K=3 cumulative
review for batches 73-75 (PASS).
Co-authored-by: Cursor <cursoragent@cursor.com>
Replaces the NotImplementedError stubs AZ-406 reserved on three runner-
side helpers; these were stranded from any tracker ticket since
AZ-407/408 never came back to fill them. Concrete bodies:
* fdr_reader.iter_records: JSONL parser + wire-envelope validator;
recursive *.jsonl walk; projects {schema_version, ts, producer_id,
kind, payload} to runner-side FdrRecord with record_type/monotonic_ms
renames; yields oldest-first.
* frame_source_replay.replay_video: OpenCV VideoCapture decode + JPEG
re-encode; auto-detects file vs directory; injectable sleep_fn for
unit-test pacing.
* imu_replay.ImuReplayer.replay: csv.DictReader parse; degrees->radians
attitude conversion; tolerates scientific notation; same sleep_fn
injection pattern.
Adds 34 unit tests (14 + 10 + 10). Full e2e unit suite: 558 passed (+31).
Existing scenario _harness_helpers_implemented probes still return False
because they also depend on sitl_observer / fc_proxy_runtime stubs that
remain pending; scenario probe cleanup is out of AZ-594 scope.
Co-authored-by: Cursor <cursoragent@cursor.com>