mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 21:21:13 +00:00
f5366bbca163b82b63db0b8fd6d489ae48a4effa
134 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
87fe98858f |
[AZ-698] Tlog trim + mid-flight alignment for replay
Adds find_aligned_window cross-correlation (NCC, per-window unit norm)
between IMU energy and video optical-flow magnitude. Returns
AlignedWindow{tlog_start_ns, tlog_end_ns, offset_ms, confidence,
used_fallback}, with fallback to head-takeoff on low confidence to
preserve AZ-405 behavior. TlogReplayFcAdapter honors tlog_start_ns and
skips pre-window messages. New --auto-trim CLI flag, mutex with
--time-offset-ms. AC-1..AC-4 covered by unit tests; AC-5 skipped (no
real flight_derkachi.mp4 in repo). 106 tests pass in regression slice.
Zero new mypy --strict errors.
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
64d961f60c |
[AZ-697] [AZ-702] tlog GPS truth + KHP20S30 factory calibration
Batch 98 (cycle 2) — first two PBIs of epic AZ-696 (real-flight validation harness): AZ-697: direct binary-tlog GPS-truth extractor - New src/gps_denied_onboard/replay_input/tlog_ground_truth.py reads GLOBAL_POSITION_INT (with GPS_RAW_INT fallback) from a binary ArduPilot tlog via pymavlink.mavutil and returns a frozen+slotted TlogGroundTruth DTO with per-record ts_ns / lat_deg / lon_deg / alt_m / hdg_deg / vx_m_s / vy_m_s / vz_m_s. - Promoted l2_horizontal_m + match_percentage + GroundTruthRow from tests/e2e/replay/_helpers.py into the new production module src/gps_denied_onboard/helpers/gps_compare.py. The e2e helper now re-exports the same objects (identity, not copies) so existing test imports continue working untouched. - tests/e2e/replay/conftest.py prefers the real derkachi.tlog when present, falls back to the CSV synth path otherwise. - 22 new unit tests cover AC-1..AC-5 (mypy --strict subprocess test included). All passing. AZ-702: Topotek KHP20S30 factory-sheet camera calibration - New _docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json: fx = fy = 4644.444, cx = 960, cy = 540, HFOV ~ 23.3 deg, VFOV ~ 13.2 deg, computed from the published 8.5 mm focal length + 1/2.8" sensor + 1920x1080 capture at lowest zoom step. Distortion zeroed, body_to_camera_se3 = identity with nadir convention. Acquisition method explicitly recorded as factory_sheet so downstream code can expect higher residual error than a lab calibration. - _docs/00_problem/input_data/flight_derkachi/camera_info.md updated to document the assumptions, expected residual error window, and conftest pick-up rule. - tests/e2e/replay/conftest.py::_calibration_path() prefers khp20s30_factory.json when present, falls back to adti26.json. - 9 new unit tests cover AC-1..AC-4 (schema, intrinsics traceback, doc reference, conftest pick-up). All passing. Test run: 45 new tests, all passing. Full-suite gate deferred to Step 16 (after the last batch in cycle 2 per the implement skill). Adjacent note (not fixed in this batch, recorded in the batch report): auto_sync.py has the same redundant pymavlink type:ignore + a few numpy/cv2 mypy --strict issues. None on this batch's path. Refs: _docs/03_implementation/batch_98_cycle2_report.md Refs: _docs/02_tasks/done/AZ-697_tlog_ground_truth_extractor.md Refs: _docs/02_tasks/done/AZ-702_khp20s30_calibration.md Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
a12638dd92 |
[AZ-696] chore: cycle-2 bootstrap — gitignore tlog inputs, Step 9 PBIs
Pre-implement chore commit to land orchestration artifacts produced by
autodev cycle-2 Step 9 (New Task), so that Step 10 (Implement) starts
against a clean working tree.
What's included:
- .gitignore: exclude _docs/00_problem/input_data/**/*.{tlog,mp4,h264}
(derkachi.tlog is a 5.8 MB binary input and stays out-of-band).
- _docs/02_tasks/todo/AZ-697..AZ-702: 6 new PBI specs under epic AZ-696
(tlog ground-truth extractor, mid-flight trim+align, real-flight
validation runner, replay map viz, HTTP replay API, KHP20S30 calib).
- _docs/02_tasks/_dependencies_table.md: dep edges for the 6 PBIs.
- _docs/_autodev_state.md: status -> in_progress, step 10 cycle 2.
- _docs/_process_leftovers/...opencv_pin_deferred.md: replay-attempt
timestamp refreshed (gtsam-numpy-2 wheels still not published;
leftover remains open).
No source code is modified by this commit.
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
9bdc868dfd |
[AZ-687] Guard build_pre_constructed seeds in replay mode
Replay CLI synthesizes a minimal Config whose `components` mapping omits the strategy-component blocks (`c6_tile_cache`, `c7_inference`, `c5_state`) the airborne bootstrap historically read unconditionally. Add `_replay_omits_component_block` and gate the c6 seeds, the c7 + c3_lightglue_runtime pair, and the c5 (estimator, handle) eager build on `config.mode == "replay" AND block absent`. Live mode and any replay config that DOES populate the blocks remain unchanged — the guard is conditional, not blanket. The skip is safe because compose_root's per-component wrappers only run for slugs in `config.components`; absent blocks mean absent wrappers, so the seeded slots would never be read. Fix lives at the BUILD-PRE-CONSTRUCTED layer per the spec's explicit "no silent fallback in `_c6_config`" constraint. Covers AC-687-1 / AC-687-2 / AC-687-4. AC-687-3 (Jetson Tier-2 e2e replay) requires an out-of-band hardware re-run; evidence destination documented in autodev state. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
2be1b5101e |
[AZ-687] [autodev] File replay-mode guard task + Tier-2 evidence
Jetson Tier-2 e2e on 2026-05-19 11:27 surfaced a NEW gap one phase deeper than where Rerun 3 died: build_pre_constructed seeds c6_descriptor_index unconditionally, which reads config.components["c6_tile_cache"] via storage_factory._c6_config. The replay CLI synthesizes a Config that has no c6_tile_cache block, so AC-1/2/5/6 fail with KeyError 'c6_tile_cache'. Bootstrap (no source code changes): - AZ-687 (Story, To Do, 2pt, Epic AZ-602; blocks AZ-618) - Task spec in _docs/02_tasks/todo/ - _dependencies_table.md row + header narrative - _docs/_autodev_state.md detail repointed at AZ-687 - _docs/03_implementation/jetson_runs/ Tier-2 evidence The fix itself lives in batch 97 (next session): guard the c6/c7 seeds at the BUILD-PRE-CONSTRUCTED layer when config.mode == "replay". Per existing storage_factory._c6_config docstring the silent-fallback path is explicitly rejected — the bootstrap layer is the right seam. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
c3639a5d1c |
[AZ-624] [AZ-618] Phase F: wire build_pre_constructed into main()
Wire register_airborne_strategies + build_pre_constructed + compose_root(config, pre_constructed=...) into runtime_root.main(). The existing exception block now catches AirborneBootstrapError distinctly before the broader (ConfigurationError, StrategyNotLinkedError, RuntimeError) clause so the operator-facing "airborne_bootstrap:" prefix carried by every bootstrap error reaches stderr cleanly with EXIT_GENERIC_FAILURE rather than getting absorbed into a generic backtrace. This closes the AZ-618 umbrella: AZ-619..AZ-623 + AZ-625 had built each pre_constructed key; this batch lands the integration that the production main() actually invokes them. Both the live gps-denied-onboard and replay gps-denied-replay binaries dispatch through this main() per ADR-011, so both reach takeoff with pre_constructed populated end-to-end. Tests: tests/unit/runtime_root/test_az618_pre_constructed.py adds 6 tests covering AC-618-1..AC-618-4 + AZ-624 local handler-ordering regression guard. The strategy factories are stubbed at the airborne_bootstrap module boundary so the test exercises the integration seam without standing up gtsam / FAISS / TensorRT / PyTorch / OpenCV at unit-test scope. AC-618-5 (Jetson tier-2 e2e) is BLOCKED on operator-supplied hardware evidence: scripts/run-tests-jetson.sh tests/e2e/replay/test_derkachi_1min.py must run on Jetson Orin Nano (JetPack 6.2.2+b24) and the terminal log path + JetPack version + run timestamp captured per _docs/02_document/tests/tier2-jetson-testing.md. Quality gates: ruff format clean, ruff lint clean, 6/6 new umbrella tests pass, 261/261 runtime_root + c5_state regression suite passes, 25/25 test_az401_compose_root_replay regression passes, full Tier-1 unit suite 2150/2151 passes (1 unrelated pre-existing failure: c12_operator_orchestrator subprocess cold-start NFR fails on Mac dev host's Python startup ~700 ms; not regressed by AZ-624). Code review verdict PASS (1 Low finding; full report in _docs/03_implementation/reviews/batch_96_review.md). Archives AZ-624 task spec + AZ-618 umbrella reference to done/. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
2b8ef52f66 |
[AZ-625] Phase E.5: airborne_bootstrap c5_isam2_graph_handle ordering
Wire the airborne bootstrap to seed pre_constructed['c5_isam2_graph_handle'] so c4_pose's compose-time lookup is satisfied (c4_pose runs before c5_state in topological order; the iSAM2 graph handle is built INSIDE the C5 estimator's constructor and so must be produced eagerly at bootstrap time). build_pre_constructed now invokes a new internal _build_c5_state_estimator_pair helper that calls state_factory.build_state_estimator once, captures the (estimator, handle) tuple, and seeds two slots: 'c5_isam2_graph_handle' for C4's lookup, and an internal '_c5_prebuilt_estimator' look-aside key for the C5 wrapper's short-circuit. _c5_state_wrapper checks the look-aside key first and returns the prebuilt instance as-is — the SAME object the handle was extracted from, so c4_pose._isam2_handle and c5_state._isam2_handle reference ONE object across the C4 / C5 seam (AC-625.3 cross-seam identity invariant). C5_STATE_BUILD_FLAGS mirrors state_factory._STATE_BUILD_FLAGS so the bootstrap can name the gating BUILD_STATE_* flag in operator errors before the lower level StateEstimatorConfigError fires (AC-625.2). When the factory itself rejects the configuration with the flag ON, the error wraps into AirborneBootstrapError with __cause__ preserved (matches AZ-621 / AZ-622 patterns). Constraints respected per AZ-618 umbrella: no per-component factory signature changed; additive on top of AZ-619..AZ-623; no edits under state_factory, pose_factory, or c5_state internals. Tests: tests/unit/runtime_root/test_az625_c5_isam2_graph_handle_ordering.py adds 8 tests covering AC-625.1..3 (presence + Protocol conformance, internal key invariant, BUILD-flag-OFF error, unknown-strategy error, factory error wrapping, cross-seam identity, wrapper short-circuit, wrapper fallback). Autouse stubs added to test_az619/620/621/622/623 so prior phase tests stay isolated from the new builder. Quality gates: ruff format clean, ruff lint clean, 32/32 phase tests pass, 255/255 runtime_root + c5_state regression suite passes. Code review verdict PASS (2 Low findings; full report in _docs/03_implementation/reviews/batch_95_review.md). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
02208c577e |
[AZ-623] [AZ-625] Phase E: c282_ransac + c5 helpers; split handle work
Wire 4 stateless / cached helpers into airborne_bootstrap.build_pre_constructed: c282_ransac_filter, c5_imu_preintegrator (cached on calibration path), c5_se3_utils (helpers.se3_utils module as namespace handle), c5_wgs_converter. The original AZ-623 5th deliverable (c5_isam2_graph_handle) hit an unresolvable construction-order conflict between c4_pose (consumes the handle) and c5_state (creates it inside build_state_estimator's tuple return) under the umbrella's "MUST NOT touch any per-component factory signature" constraint. Per AZ-623 spec's escalation gate, scope was split: AZ-625 captures the handle ordering work; AZ-624 dependency edge updated to require both. Tests: tests/unit/runtime_root/test_az623_pre_constructed_phase_e.py adds 7 tests covering AC-623.1..3 (4 new keys + correct types, IMU preintegrator caching, operator-actionable error messages for empty / unreadable / malformed calibration paths). Autouse stubs added to test_az619/620/621/622 so prior phase tests remain isolated from new builders. Quality gates: ruff format clean, ruff lint clean, 24/24 phase tests pass, 247/247 runtime_root + c5_state regression suite passes. Code review verdict PASS_WITH_WARNINGS (3 Low findings; full report in _docs/03_implementation/reviews/batch_94_review.md). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
5c4d129f80 |
[AZ-622] Phase D: build_pre_constructed seeds c3 GPU runtimes
build_pre_constructed now populates c3_lightglue_runtime (LightGlueRuntime) + c3_feature_extractor (FeatureExtractor) on top of AZ-619/620/621. Strategy-specific BUILD_MATCHER_* flag mismatch raises AirborneBootstrapError naming the missing flag and the c3_matcher consumer; the c7 InferenceRuntime built earlier in the bootstrap is reused as the engine source so no double-build at this layer. C3MatcherConfig gains optional lightglue_weights_path: Path | None for the operator's deployment config; production main() (AZ-624) populates it. Real LightGlue inference correctness is verified by AZ-624's Jetson AC-5 run per the AZ-622 Tier-2 Note. Phase tests for AZ-619/620/621 gain an autouse _stub_c3_matcher_builders fixture so additivity assertions remain valid as the bootstrap grows. Code review: PASS_WITH_WARNINGS (3 Low: signature drift from spec, _is_build_flag_on duplication across 3 runtime_root modules, and BuildConfig literal mirrored with per-strategy build configs). All deferred to future hygiene PBIs. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
680ba29ae6 |
[AZ-621] Phase C: build_pre_constructed seeds c7_inference
Third subtask of AZ-618. Extends airborne_bootstrap.build_pre_constructed additively with c7_inference (GPU InferenceRuntime). Wraps the existing inference_factory.build_inference_runtime so a BUILD_TENSORRT_RUNTIME / BUILD_PYTORCH_FP16_RUNTIME mismatch surfaces a clear operator-facing AirborneBootstrapError naming BOTH airborne C7 flags plus the consuming component slug, rather than bubbling up RuntimeNotAvailableError with no context. New public const C7_AIRBORNE_BUILD_FLAGS pairs each airborne runtime with its gating env flag (onnx_trt_ep deliberately omitted — research only). Tests stub at the factory boundary; real GPU/TensorRT load remains Tier-2 only (consolidated at AZ-624). AZ-619 and AZ-620 test files extended with a _stub_c7_inference_builder autouse fixture mirroring the AZ-620 pattern for _build_c6_*. 18/18 runtime_root unit tests pass. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
dbae0cad5b |
[autodev] Backfill batch_90_cycle1_report.md for AZ-619
Prior session committed AZ-619 (Phase A of AZ-618) as
|
||
|
|
8abfb020fe |
[AZ-619] Phase A: build_pre_constructed seeds c13_fdr + clock
Adds airborne_bootstrap.build_pre_constructed(config) returning a dict with the two foundational keys: a per-binary shared FdrClient under "c13_fdr" (via make_fdr_client with the new AIRBORNE_MAIN_PRODUCER_ID constant) and a fresh WallClock under "clock". Phases B..F (AZ-620..AZ-624) extend this function additively without breaking the AZ-619 contract. The c13_fdr instance is identity-stable across calls (per the make_fdr_client per-producer cache) so callers can call build_pre_constructed twice and get the same FdrClient back - AC-619.2. Replay-mode override is unchanged: compose_root merges replay_components over pre_constructed so the WallClock here is replaced by TlogDerivedClock in replay binaries (existing contract documented in compose_root's docstring). Tests: 5 new unit tests under tests/unit/runtime_root/ test_az619_pre_constructed_phase_a.py, all passing. AZ-591 not regressed (12/12 in the combined run). Spec moved to _docs/02_tasks/done/. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
8cee532516 |
[AZ-618] [AZ-619] [AZ-620] [AZ-621] [AZ-622] [AZ-623] [AZ-624] Split AZ-618 into 6 subtasks per spec sizing-note
The AZ-618 spec author flagged "likely a true 8" with a recommended 6-subtask split; combined with the user-rule cap on PBI complexity (create at 2-3pt, max 5pt) the right move was to split before any implementation began. Subtasks created in Jira as children of AZ-618: AZ-619 (Phase A) c13_fdr + clock 2pt AZ-620 (Phase B) c6_descriptor_index + c6_tile_store 3pt AZ-621 (Phase C) c7_inference engine 3pt AZ-622 (Phase D) c3_lightglue_runtime + c3_feature_extractor 3pt AZ-623 (Phase E) c282_ransac_filter + c5 helpers 3pt AZ-624 (Phase F) wire main() + AC-1..AC-5 + Jetson 2pt Aggregate: 16pt actionable work (vs. AZ-618's original 5pt filing, which the author had already qualified as understated). AZ-618 stays In Progress in Jira as the umbrella tracker; its task spec file is now an umbrella reference pointing to the 6 phase-specific spec files. Deps table updated: AZ-618 row reduced to 0pt with subtask deps; six new rows added; header counts refreshed (156 -> 162 tasks, 522 -> 533 points). Autodev state set to phase=1 (parse) for the next batch = AZ-619 (Phase A) only. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
bcdc17bd74 |
[AZ-618] Task spec + autodev rewind to Step 7
Step 11 gate failed per greenfield rule: 5 e2e ACs reach `replay.compose_root.ready` and then crash inside runtime_root.airborne_bootstrap on the first pre_constructed lookup. That is "missing internal product implementation", which the gate description routes back to Implement. * Task spec AZ-618 (255 lines, 5 pts, 6-phase internal split, AC-1..AC-5) parked in _docs/02_tasks/todo/. Phases land in dependency order: c13_fdr+clock -> c6_* -> c7_inference -> c3_lightglue+features -> c282_ransac_filter -> c5 helpers. * Autodev state: step 7 (Implement), status not_started, sub_step awaiting-invocation, cycle 1. retry_count = 0. * Leftover D-CROSS-CVE-1: replay attempted, still deferred (gtsam 4.2.1 on PyPI still pins numpy<2.0.0); timestamp bumped to 2026-05-18T20:35+03:00. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
33e683dc0f |
[AZ-446] CSV reporter: band + ci95 annotations + report.csv emitter
Batch 89 — adds optional `band`, `ci95_low`, `ci95_high` kw-only parameters to `_NfrRecorder.record_metric` and emits a new per-metric report.csv artifact (one row per scenario × metric, columns: scenario_id, metric_name, value, value_band, ci95_low, ci95_high, ac_id, outcome). Backwards compatible — existing 4-arg callers unchanged; unbalanced ci95 pair raises ValueError. report.csv is written once per pytest session from `pytest_sessionfinish` so the annotation pass runs once per CI invocation regardless of (fc_adapter, vio_strategy) (AC-3). `regression-baseline.json` intentionally kept flat to preserve the diff contract used by regression-detection tooling. NFT-RES-03 + NFT-PERF-01 scenarios updated to pass real bands and compute empirical 2.5/97.5-percentile ci95 from their own sample streams (per-iteration envelope ratios for Monte Carlo, per-frame latency samples for N-sample latency). Tests: 1229 e2e/_unit_tests pass (+6 vs. batch 88 for AZ-446 band/CI behavior, value-error on unbalanced ci95, report.csv columns, explicit-path override, and end-to-end emission via the pytest plugin). Code review: PASS_WITH_WARNINGS — 1 Low (empirical-CI semantics, documented inline), 1 Medium carried over from batch 88's cumulative-review backlog (write_csv_evidence + _resolve_fixture_path duplication is outside AZ-446 reporting scope). This commit closes Step 10 Implement Tests for cycle 1 (41 of 41 blackbox-test tasks done, AZ-406..AZ-446). Greenfield auto-chains to Step 11 Run Tests next. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
6e4a575221 |
[AZ-440] [AZ-441] [AZ-442] [AZ-443] NFT-LIM-01/02/03+05/04 blackbox scenarios
Batch 88 — adds four resource-limit blackbox scenarios + pure-logic helpers + unit tests: - NFT-LIM-01 Jetson memory (AC-NEW-13): tier2_only; Plan A/B budgets; AC-4 OOM-event scan; 30 s warm-up window; VmRSS + tegrastats streams. - NFT-LIM-02 FDR size (AC-7.3): 30 min → 8 h linear extrapolation against 50 GiB; ±60 s replay-window slack for AC-1. - NFT-LIM-03+05 storage (AC-7.4 + AC-NEW-12 + RESTRICT-STORAGE): aggregate ≤ 100 GiB across tile-cache + tile-cache-write + fdr-output; thumbnail-log < 1 GiB strict 8 h-extrapolated. - NFT-LIM-04 thermal (AC-NEW-5 PARTIAL): tier2_only; CPU/SoC p99 ≤ T_throttle − 5 °C; throttle-event scan; PARTIAL annotation written to traceability-status.json. Thresholds fixture lives at e2e/fixtures/jetson/thermal-thresholds.json (moved from the task spec's suggested tests/fixtures/ path so the file stays inside the blackbox_tests Owns: e2e/** envelope). All four helpers are public-boundary-only (no src/gps_denied_onboard imports). Scenarios skip cleanly in the Tier-1 docker harness pending AZ-595 (SITL replay builder) for the four shared fixture inputs and AZ-444 (Tier-2 Jetson runner) for the tier2_only scenarios. Code review: PASS_WITH_WARNINGS (0/0/2/1). Both Mediums are carried-over write_csv_evidence + _resolve_fixture_path duplication, deferred to AZ-446 (batch 89). Low is the self-resolved AZ-443 fixture ownership drift documented in the review. Tests: 1223 e2e/_unit_tests passing (+1 vs. batch 87 from the new directory-layout entry); 24 resource_limit scenarios collect and skip cleanly under runner/pytest.ini. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
d1e30f818f |
[autodev] archive batch 87 tasks, advance to batch 88
Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
de19e716d8 |
[autodev] archive batch 86 tasks, advance to batch 87
Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
73cd632e95 |
[AZ-428] [AZ-429] [AZ-430] [AZ-431] Add NFT-PERF-01..04 perf scenarios
Batch 85 — 4 Performance NFT scenarios + pure-logic evaluators. - NFT-PERF-01 (AZ-428, Tier-2): two-config e2e latency p95 ≤ 400 ms (K=3@25°C, K=2 hybrid@50°C) + frame-drop ≤10% + informational per-stage partition recording (D-CROSS-LATENCY-1). - NFT-PERF-02 (AZ-429): inter-emit p95 ≤ 350 ms + no ≥3 missed-emit windows. fc-adapter-aware SITL timestamp extraction (tlog vs MSP). - NFT-PERF-03 (AZ-430, Tier-2): cold-start TTFF p95 ≤ 30 s AND max ≤ 45 s over N≥10 iterations. - NFT-PERF-04 (AZ-431): spoof-promotion latency p95 ≤ 600 ms over N≥20 randomized-start blackout+spoof events. All scenarios consume external fixtures (AZ-595 dependency surfaced) and fail loudly when fixtures are missing or empty. Public-boundary discipline preserved — evaluators do NOT import src/gps_denied_onboard. Tests: 60 new unit tests pass; 24 scenarios collect (4 tests × 2 fc × 3 vio). Code review: PASS_WITH_WARNINGS — 1 Medium (fixed in batch), 3 Low (production-dependency surfacings + future hygiene). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
f25cae4a82 |
[AZ-423] [AZ-427] Add FT-P-19 + FT-N-05 blackbox tests
Implement the AC-8.6 (top-K=10 retrieval scale-ratio + scene-change
PARTIAL) and AC-8.2 / AC-NEW-6 (stale aged-tile rejection) blackbox
scenarios.
AZ-423 (FT-P-19, 3pt) helpers + scenario:
- retrieval_evaluator.py — top-K within-distance evaluator (60 stills
vs 100 m budget), scene-change PARTIAL recorder (always emits
PARTIAL on the 2 _gmaps.png pairs), FDR record projectors, CSV
writers.
- tests/positive/test_ft_p_19_sat_reloc_scale.py (6 parametrised
variants).
AZ-427 (FT-N-05, 2pt) helpers + scenario:
- aged_tile_rejection_evaluator.py — Signal A (stale rejection at
load) + Signal B (per-frame downgrade) decision matrix, reuses
ALLOWED_SOURCE_LABELS from estimate_schema.
- tests/negative/test_ft_n_05_stale_tile_rejection.py (12 parametrised
variants: FC × VIO × {7mo/active-conflict, 13mo/rear}).
48 new unit tests cover every helper branch. Both scenarios skip
when sitl_replay_ready is false and fail loudly when fixture records
are missing.
Per-batch review: PASS_WITH_WARNINGS (2 Low — production-dependency
surface, FDR-kind constant duplication).
Cumulative review 82-84: PASS (2 Low carry-over / hygiene candidate).
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
5def1a3eb3 |
[AZ-422] Add FT-P-17 + FT-N-06 mid-flight tile blackbox tests
Implement the AC-8.4 and AC-NEW-6 blackbox scenarios for mid-flight tile generation, dedup, landing-time upload, and freshness gating. Helpers: - runner/helpers/mid_flight_tile_evaluator.py — pure-logic evaluators for tile generation rate, Mode B Fact #105 schema check, footprint+ GSD dedup (via geo.distance_m), upload-audit reconciliation, and the AC-5/AC-6 capture_utc + freshness-gate checks. - runner/helpers/mock_suite_sat_audit.py — httpx wrapper for the mock-suite-sat-service /tiles/audit endpoint with strict response- shape validation. Scenarios: - tests/positive/test_ft_p_17_mid_flight_tiles.py - tests/negative/test_ft_n_06_mid_flight_freshness.py Both skip when sitl_replay_ready is false and fail loudly when fixture records are missing (tests-as-gates discipline). 52 new unit tests (41 evaluator + 11 audit client) cover every helper branch. Review: PASS_WITH_WARNINGS (2 Low — duplicate haversine carry-over, upstream production dependency surface). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
1ee54b414b |
[AZ-421] Batch 82 housekeeping
Archive AZ-421 to done/ and advance autodev state to await batch 83. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
b0296da911 |
[AZ-420] Batch 81 housekeeping + cumulative 79-81 review
Archive AZ-420 to done/; add cumulative review for batches 79-81 (PASS, no new findings); advance autodev state to await batch 82. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
2d6d44af5d |
[AZ-424] [AZ-425] [AZ-426] Implement negatives set (FT-N-01/03/04)
Adds three pure-logic evaluators + scenarios + unit tests covering the project's failure-mode robustness ladder (AC-3.1, AC-3.4, AC-3.5, AC-NEW-8): * outlier_tolerance_evaluator (AZ-424 / FT-N-01): per-event 50 m drift bound + 3-frame covariance-monotonic window over the AZ-408 outlier injector's medium-density manifest. * outage_request_evaluator (AZ-425 / FT-N-03): detects 3+ consecutive missing-frame windows; validates OPERATOR_RELOC_REQUEST STATUSTEXT arrives at 2 s ±500 ms, dead_reckoned label during outage, and no FC EKF divergence. * blackout_spoof_evaluator (AZ-426 / FT-N-04): eight-AC ladder across the 5 s / 15 s / 35 s sub-windows — switch latency, spoof rejection, monotonic covariance, honest horiz_accuracy, STATUSTEXT 1-2 Hz, 35 s escalation thresholds, and recovery gate. Each scenario is skip-gated on the AZ-441 / AZ-407 / AZ-416 replay / SITL / mavproxy helpers; unit tests (14 + 18 + 29 = 61) cover the AC logic today. Full e2e unit-test suite: 527 passed (+67). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
a644debdb7 |
[AZ-416] [AZ-417] [AZ-419] Test batch 72: FT-P-09 AP/iNav + FT-P-11 cold start
- AZ-416 (FT-P-09-AP): fills mavproxy_tlog_reader.iter_messages with pymavlink body (AZ-406 surface kept); adds ap_contract_evaluator covering AC-1 (signing handshake <=5s), AC-2 (GPS_INPUT >=4.5 Hz), AC-3 (EK3_SRC1_POSXY=3), AC-4 (GPS_RAW_INT health >=80%); scenario forces fc_adapter=ardupilot. - AZ-417 (FT-P-09-iNav): msp_frame_observer covering AC-2 (MSP rate) and AC-3 (fix_type/provider/numSat); scenario forces fc_adapter=inav. - AZ-419 (FT-P-11): cold_start_evaluator covering AC-1 (operator manifest origin), AC-2 (FC EKF fallback), AC-3 (no-origin abort), AC-4 (bounded-delta conflict, ADR-010 Principle #11 amended); scenario parametrized on origin_source plus dedicated no-origin abort scenario. - All scenarios skip-gated on upstream frame_source_replay / imu_replay / fdr_reader / sitl_observer extensions. - +67 unit tests; full e2e unit suite: 460 passed. - K=3 cumulative review fired: PASS for batches 70-72. See _docs/03_implementation/batch_72_report.md, _docs/03_implementation/reviews/batch_72_review.md, _docs/03_implementation/cumulative_review_batches_70-72_cycle1_report.md. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
c6e6cba237 |
[AZ-414] [AZ-415] [AZ-418] Test batch 71: sharp turn + multi-segment + smoothing
- AZ-414 (FT-P-07 + FT-N-02): sharp_turn_detector helper covering AC-1 (gyro_z run detection + synthetic-overlay fallback), AC-2/AC-3 (FT-N-02 during-turn label + monotonic covariance), AC-4/AC-5/AC-6 (FT-P-07 recovery lag/drift/heading); twin scenario files under positive/ and negative/. - AZ-415 (FT-P-08): multi_segment_evaluator helper + scenario. - AZ-418 (FT-P-10): smoothing_evaluator helper covering AC-1 (raw + smoothed pose pairing), AC-2 (improvement rate >= 0.80), AC-3 (mean improvement >= 5 m); scenario file. - All scenarios skip-gated on upstream frame_source_replay / imu_replay / fdr_reader stubs (auto-activate when AZ-441 + AZ-407 leftovers land). - +68 unit tests; full e2e unit suite: 393 passed. See _docs/03_implementation/batch_71_report.md and _docs/03_implementation/reviews/batch_71_review.md. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
29ac16cfcb |
[AZ-409] [AZ-412] [AZ-413] Batch 70: FT-P-01/04/05/06 scenarios
AZ-409 (3pt) — FT-P-01 still-image frame-center accuracy: - accuracy_evaluator.py: GT loader + Vincenty error + AC-2/AC-3 pass-counts - test_ft_p_01_still_image_accuracy.py: scenario gated on frame_source_replay + sitl_observer NotImplementedError; AC-4 timeout discipline AZ-412 (3pt) — FT-P-04 Derkachi f2f registration >=95% on normal segments: - registration_classifier.py: accel-derived attitude + overlap heuristic + success ratio with AC-3 sharp-turn exclusion - test_ft_p_04_derkachi_f2f_registration.py: scenario gated on frame_source_replay + imu_replay + fdr_reader AZ-413 (3pt) — FT-P-05 + FT-P-06 cross-domain MRE budgets: - mre_evaluator.py: per-image budget (strict <2.5px) + 95th-percentile via numpy linear interp + combined report - test_ft_p_05_sat_anchor.py: cross-domain scenario, reuses accuracy_evaluator for geodesic join - test_ft_p_06_mre_budgets.py: pure piggyback on FT-P-04 + FT-P-05 CSV evidence; skips when either upstream CSV missing Tests: 325 unit tests pass (+77 vs batch 69). Reports: batch_70_report.md, batch_70_review.md (PASS). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
702a0c0ff3 |
[AZ-408] [AZ-410] [AZ-411] Batch 69: synth injectors + FT-P-02/03/14
AZ-408 (3pt) — Replace AZ-406 injector scaffolds with concrete generators: - outlier.py: deterministic stride + far-away tile replacement; AC-2 ≥350m offset - blackout_spoof.py: paired video blackout + FC GPS spoof with ≤40ms alignment; AC-4 realistic fix_type/hdop; AC-NEW-8 200-500m inter-spoof deltas - multi_segment.py: ≥3 disjoint windows, ≥30s gaps, ≤25% coverage - fc_proxy.py: timed-splice runtime proxy with pre-activate RuntimeError guard - _common.py: derive_rng + tile-manifest reader + tmpfs helpers - injector_fixtures.py: pytest fixtures wired via runner conftest AZ-410 (3pt) — FT-P-02 cumulative drift between satellite anchors: - anchor_pair_detector.py: AC-1 detection, AC-2/3 pass-fraction, AC-4 monotonicity check, CSV evidence - test_ft_p_02_derkachi_drift.py: scenario gated on upstream helper NotImplementedError (frame_source_replay / fdr_reader / imu_replay) AZ-411 (2pt) — FT-P-03 + FT-P-14 schema + WGS84: - estimate_schema.py: AC-1 schema completeness, AC-2 source-label set containment, AC-3 WGS84 range + int32 1e-7 decode - test_ft_p_03_14_schema_wgs84.py: shared single-image-push scenario Tests: 248 unit tests pass (+91 vs batch 68). Reports: batch_69_report.md, batch_69_review.md (PASS), cumulative_review_batches_67-69_cycle1_report.md (PASS). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
6599d828d2 |
[AZ-407] [AZ-444] [AZ-445] Batch 68: fixtures, Tier-2 harness, NFR reporter
Three blackbox-harness tasks landed together — all depend only on
AZ-406 and unblock the FT-* / NFT-* scenario tasks scheduled for
batches 69+.
AZ-407 — Static fixture builders (3pt):
* tile-cache-builder/{builder.py, Dockerfile, build.sh} produces a
deterministic tile-cache-fixture Docker volume from
_docs/00_problem/input_data/. Reproducibility primitives: sorted
iteration, frozen PIL JPEG settings, FAISS HNSW32 built single-
threaded with seeded stub descriptors.
* age-injector/{age_injector.py, inject.sh} clones the volume and
shifts capture_date by N×30.44 days; tile JPEG bytes preserved
bit-identical. Emits synth-age-7mo + synth-age-13mo volumes.
* cold-boot/cold_boot_fixture.json: frozen FC pose snapshot at
Derkachi sector centre, schema v1.
* secrets/mavlink-test-passkey.txt: 64-hex with required
`# TEST ONLY` header line per AC-5. Passkey-equality test now
compares the secret line after stripping the header.
* security/cve-2025-53644.jpg: synthetic 158-byte malformed JPEG
(truncated SOS marker). OpenCV 4.11.x rejects gracefully with
imdecode → None. AZ-439 will sharpen for ASan instrumentation.
* Top-level Makefile with `make fixtures` / `make fixtures-*` /
`make e2e-tier1*` / `make unit-tests` targets.
AZ-444 — Tier-2 Jetson harness wrapper (5pt):
* run-tier2.sh rewritten as orchestrator. Detects local
(aarch64 + TIER2_HOST=localhost) vs remote (ssh into TIER2_HOST).
New flags: -k/--selector, --build-kind production|asan,
--reflash (gated behind TIER2_REFLASH_ACK=1 two-key gate),
--dry-run.
* tier2-on-jetson.sh (new) — on-device delegate. Verifies
gps-denied-onboard{,-asan}.service health; restarts with 5s
tolerance; spawns tegrastats + jtop parallel samplers; tails
ASan unit's journal in asan mode; drives docker compose with
TIER=tier2-jetson; forwards SELECTOR to pytest -k.
* docker/run-tier1.sh (new) — selector-parity sibling.
* AC-1 (selector parity) and AC-6 (reflash gating) unit-tested via
--dry-run output assertions. AC-2/AC-3/AC-4/AC-5 are hardware-
loop ACs verified by the Tier-2 runtime smoke (no Jetson in the
unit-test layer).
AZ-445 — CSV reporter + evidence bundler refinements (2pt):
* reporting/nfr_recorder.py (new) — pytest plugin. Provides the
`nfr_recorder` fixture with record_metric(name, value, ac_id)
and partial(ac_id, reason). At session end emits:
- per-nfr/<scenario_id>.json (AC-1)
- traceability-status.json with every AC ID parsed from
traceability-matrix.md, classified Covered/PARTIAL/NOT
COVERED with source scenario IDs (AC-2)
- regression-baseline.json with all numeric metrics (AC-3)
* csv_reporter.py extended — `_outcome_to_result` consults the
aggregator; rows flip PASS → PARTIAL when an AC was marked
PARTIAL by nfr_recorder (AC-4). Graceful fallback when
aggregator isn't registered (unit-test contexts).
* conftest.py registers nfr_recorder in pytest_plugins.
* New --traceability-matrix CLI flag seeds the NOT COVERED rows.
Build / config:
* pyproject.toml dev extras: added Pillow>=10.4,<13.0 for the
tile-cache-builder unit test (broad enough to keep torchvision's
Pillow 12 pin happy; the production builder runs inside its own
Docker image with its own pin).
* Updated test_directory_layout.py to cover 10 new files + replaced
the byte-equal passkey assertion with the header-stripping
variant.
Test results:
* 157 focused tests pass (was 97 in batch 67; +60 new across this
batch). No regressions.
Module-layout / spec drift:
* AZ-407 spec text says `tests/fixtures/...`; module-layout
blackbox_tests entry (commit
|
||
|
|
59d9116d36 |
[AZ-406] Blackbox test harness bootstrap (Tier-1 + Tier-2 scaffold)
Bootstraps the public-boundary blackbox test harness owned by epic
AZ-262 (E-BBT). Establishes the e2e/ directory tree at the repo root,
fully separated from src/gps_denied_onboard/** and from the in-process
tests/** tree, and commits to the contracts every subsequent test
ticket (AZ-407..AZ-446) builds against.
Tier-1 (workstation Docker):
- docker/docker-compose.test.yml wires SUT + ArduPilot SITL + iNav SITL
+ mock Suite Sat Service + mavproxy listener + e2e-runner onto one
e2e-net bridge with internal: true (enforces RESTRICT-SAT-1 /
NFT-SEC-02 egress isolation at the network layer).
- docker/docker-compose.tier2-bridge.yml override disables the in-
compose SUT so Tier-2 pairs SITLs + mock + runner on an x86 host
while the SUT runs natively on the Jetson under systemd.
Tier-2 (Jetson):
- jetson/run-tier2.sh + tier2.service systemd unit + tegrastats /
jtop parsers feed per-sample telemetry into the evidence bundle.
Runner image (e2e/runner/):
- Dockerfile + requirements.txt install ONLY ground-side libs
(pymavlink, opencv-python>=4.12, numpy/scipy/geopy/pyproj, httpx,
orjson, pydantic, structlog, pytest 8.x). The runner deliberately
does NOT install the SUT package.
- conftest.py implements the AC-9 skip-rule mapping (tier2_only,
chamber_only, vins_mono, deferred_ac) tied to environment.md
parametrize axes.
- reporting/csv_reporter.py is a pytest plugin emitting one row per
test with the exact 11-column schema from environment.md §
Reporting (test_id, test_name, traces_to, fc_adapter, vio_strategy,
tier, started_at_utc, execution_time_ms, result, error_message,
evidence_paths). XFAIL surfaced only when a test carries
@pytest.mark.deferred_ac(verdict="xfail", reason=...).
- reporting/evidence_bundler.py exposes the attach_evidence fixture
that copies per-test artifacts (.tlog, FDR archives, screenshots,
tegrastats / jtop CSVs) into the run bundle and records relative
paths into the reporter's evidence_paths column.
- helpers/{frame_source_replay,imu_replay,sitl_observer,
mavproxy_tlog_reader,fdr_reader}.py declare the public surfaces
(concrete implementations owned by AZ-407 / AZ-408 / AZ-416 /
AZ-417 / AZ-441 per the dependency table); helpers/geo.py ships
today (no downstream task dep) — WGS84 distance / forward-bearing
/ offset via pyproj with NaN rejection.
Mock Suite Sat Service (e2e/fixtures/mock-suite-sat/):
- FastAPI app: POST /tiles (ingest contract from D-PROJ-2 follow-up),
GET /tiles/audit + /mock/audit (per-run read-back), POST
/mock/config (force-status, response delay), POST /mock/reset
(clears audit between tests), GET /mock/health.
Fixture scaffolds (e2e/fixtures/{tile-cache-builder, age-injector,
injectors, cold-boot, secrets, security}/):
- Public surfaces only. Concrete builders land in AZ-407 (static
fixtures), AZ-408 (runtime synthetic injection), AZ-419 (cold-boot
fixture), AZ-439 (CVE-2025-53644 JPEG generator).
Test tree (e2e/tests/{positive,negative,performance,resilience,
security,resource_limit}/):
- Mirror of the test-spec category grouping in
_docs/02_document/tests/*-tests.md.
- tests/positive/test_smoke.py is the AC-1 harness-boot smoke run
inside the e2e-runner image once Docker brings everything up.
Out-of-container unit tests (e2e/_unit_tests/):
- Exercises the harness internals (CSV reporter plugin lifecycle,
conftest skip rules, helper modules, parsers, mock app, compose
YAML structural contract, public-boundary enforcement) without
Docker / SITL. 97 unit tests, all passing.
Build / config:
- pyproject.toml: testpaths extended with e2e/_unit_tests; pythonpath
extended with e2e; fastapi>=0.111,<0.120 added to dev extras for the
mock-app TestClient unit test.
AC coverage:
- AC-1 (Tier-1 boot) → compose YAML test + directory layout
+ smoke test (Docker-bound)
- AC-2 (mock services) → 6 FastAPI TestClient unit tests
- AC-3 (SITLs accept output) → contract present; concrete check
deferred to AZ-416 / AZ-417
- AC-4 (CSV columns) → in-process plugin lifecycle test
emits the exact 11-column schema
- AC-5 (egress isolation) → static config test + runtime probe
in Docker-bound smoke
- AC-6 (Tier-2 contract) → tegrastats + jtop parser unit tests
+ jetson/* layout test; full Tier-2
contract is AZ-444
- AC-7 (fixture reproducibility) → deferred to AZ-407 per task spec
- AC-8 (parametrize matrix) → vins_mono skip-rule cases +
tests/positive/test_smoke
- AC-9 (skip semantics) → 9 conftest skip-rule unit tests
Module layout entry for blackbox_tests was added in 2026-05-16
preparatory commit
|
||
|
|
f7a99282fb |
[AZ-591] Add airborne_bootstrap to populate _STRATEGY_REGISTRY
Batch 66 — fixes the production gap surfaced during the cycle-1 completeness-gate post-mortem: the central _STRATEGY_REGISTRY was empty in production source, so compose_root() raised StrategyNotLinkedError on the first component lookup and the airborne binary couldn't reach takeoff. Changes: - New module `src/.../runtime_root/airborne_bootstrap.py` exposes `register_airborne_strategies()` and a documented `AIRBORNE_REQUIRED_PRE_CONSTRUCTED_KEYS` table. The function registers 14 entries into the central registry across 7 strategy-selecting slots (c1_vio + c2_vpr + c2_5_rerank + c3_matcher + c3_5_adhop + c4_pose + c5_state). Per-slot wrappers adapt the registry-factory signature (config, constructed) to each per-component factory's kwarg surface and surface a AirborneBootstrapError when a required infrastructure dep is missing from constructed. - `compose_root` gains a `pre_constructed` kwarg in live mode, symmetric with the replay-mode seam. Replay entries still take precedence on key collision (ADR-011). Existing callers unaffected (kwarg defaults to None). - `runtime_root/__init__.py::main()` now calls `register_airborne_strategies()` before `compose_root(config)` so production binaries no longer crash at the registry-lookup step. - Lazy-loading preserved: state_factory's private _STATE_REGISTRY is populated lazily inside the c5_state wrapper, gated by BUILD_STATE_GTSAM_ISAM2 / BUILD_STATE_ESKF env flags. pose_factory's own lazy-import fallback handles c4_pose without an explicit register() call. - 7 new unit tests in `tests/unit/runtime_root/test_az591_airborne_\ bootstrap.py` cover AC-1..AC-5 plus the negative-path AirborneBootstrapError contract. Full unit suite 2105 passed / 88 environment-gated skips / 0 failures. End-to-end takeoff still needs a follow-up task to wire infrastructure pre-construction (c13_fdr / c6_* / c7_inference / etc.) into the pre_constructed dict passed to compose_root. That follow-up is gated by AZ-591 landing first; recommended split into per-component infrastructure-prep tasks (3pt each). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
6d51e06886 |
[AZ-589] [AZ-590] [AZ-591] [AZ-592] [AZ-593] Re-classify cycle1 gate findings
Cycle 1 Product Implementation Completeness Gate post-mortem. AZ-589 + AZ-590 were the wrong abstraction: - AZ-589 targeted `okvis::ThreadedKFVio` (OKVIS v1 API) which does not exist in the vendored OKVIS2 upstream; smartroboticslab/okvis2 exposes `okvis::ThreadedSlam` instead. - AZ-590 assumed a "de-ROSified VINS-Mono pin" submodule exists; `cpp/vins_mono/upstream/` has no `.gitmodules` entry. - The actual production gap is the empty central `_STRATEGY_REGISTRY`: `register_strategy(...)` is never called outside test fixtures, so `compose_root()` raises `StrategyNotLinkedError` for every component slug with a strategy-selecting config field. Affects c1_vio + c2_vpr + c2_5_rerank + c3_matcher + c3_5_adhop + c4_pose + c5_state. Re-classification: - AZ-589 + AZ-590 closed Won't Fix (Jira); spec files removed from todo/ but rows retained in the dependencies table as audit-trail. - AZ-591 created (todo/, 5pt) — cross-cutting compose_root per-binary bootstrap that populates `_STRATEGY_REGISTRY` for the airborne binary. Scheduled as Batch 66 sole task. - AZ-592 created (backlog/, 5pt placeholder) — AZ-332 Tier-2 validation bundle (real `okvis::ThreadedSlam` wiring + Linux CI apt-install + DBoW2 vocab + Jetson). BLOCKED on Tier-2 prerequisites; honors AZ-332's `AZ-332_tier2_validation` self-deferral handle. - AZ-593 created (backlog/, 5pt placeholder) — AZ-333 Tier-2 validation bundle (de-ROSified VINS-Mono upstream + binding + CI + Jetson). BLOCKED on upstream vendoring decision plus Tier-2 prerequisites; honors AZ-333's parallel deferral pattern. - AZ-332 + AZ-333 re-classified in cycle1 gate report from FAIL to BLOCKED-on-Tier-2. Step 7 stays in_progress until AZ-591 lands; after that it can advance to Step 8 with AZ-592 + AZ-593 parked in backlog/. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
be5c6d20aa |
[AZ-589] [AZ-590] Close completeness gate cycle 1: VIO remediation tasks
The Product Implementation Completeness Gate (cycle 1, 2026-05-16)
audited 107 done product tasks. 105 PASS / 0 BLOCKED / 2 FAIL.
FAIL findings — both AZ-332 (OKVIS2) and AZ-333 (VINS-Mono) ship a
real Python facade + AC-tested fake backend, but their native pybind11
bindings (_native/okvis2_binding.cpp, _native/vins_mono_binding.cpp)
are skeletons: _build_estimator() sets estimator_built_ = false; the
first add_frame() raises *FatalException("estimator not yet wired").
Production-default VIO and the comparative-study path both crash on
the first nav-camera frame.
Remediation tasks created in _docs/02_tasks/todo/:
- AZ-589 remediate_okvis2_threadedkfvio_wiring (5pt)
- AZ-590 remediate_vins_mono_estimator_wiring (5pt)
Both tasks also seed the per-binary bootstrap register_strategy() call
sites — the existing strategy registry in runtime_root/__init__.py is
never invoked in src/ today.
Artifacts:
- _docs/03_implementation/implementation_completeness_cycle1_report.md
- _docs/02_tasks/todo/AZ-589_remediate_okvis2_threadedkfvio_wiring.md
- _docs/02_tasks/todo/AZ-590_remediate_vins_mono_estimator_wiring.md
- _docs/02_tasks/_dependencies_table.md (+2 rows; totals refreshed)
- _docs/_autodev_state.md (Step 7 phase 1 parse;
current_batch: 66)
Returning to implement-skill Step 1 to parse Batch 66 against these
remediation tasks (per Step 15 option A).
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
c5ffc14fe9 |
[AZ-389] C5 orthorectifier emits mid-flight tiles to C6
Adds an opt-in C5-internal orthorectifier (`_orthorectifier.py`) that emits at most one tile-aligned JPEG candidate per nav frame to the C6 `TileStore.write_tile` API. Quality gates fire before any OpenCV work: covariance Frobenius, inlier floor, source-label (`SATELLITE_ANCHORED` only), and once-per-frame rate limit. Cross-component import rule (AZ-507) is preserved: c5_state never imports c6_tile_cache. `runtime_root.state_factory` carries a new `_C6MidFlightIngestAdapter` that builds the canonical `TileMetadata` (`ONBOARD_INGEST` / `FRESH` / `PENDING`), hashes the JPEG, and translates `FreshnessRejectionError` to a `None` return so the orthorectifier silently swallows freshness rejection per AC-NEW-3. Wiring is opt-in via `C5StateConfig.orthorectifier.enabled`; existing tests/binaries default to disabled and are unaffected. Both `GtsamIsam2StateEstimator` and `EskfStateEstimator` participate through new `attach_orthorectifier` / `set_latest_nav_frame` extension methods (Protocol surface unchanged). Tests: 22 new unit tests cover AC-1..AC-9 plus inlier-floor gate plus the composition-root adapter. 216/216 c5_state and 38/38 runtime-root + compose tests pass. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
2b19b8b90b |
[AZ-558] Route C8 outbound encoder bytes through MavlinkTransport seam
All FC adapter outbound MAVLink bytes now go through the AZ-401 MavlinkTransport seam (NoopMavlinkTransport in replay, SerialMavlinkTransport in live). New helpers in _outbound_mavlink_payloads.py extract encode/pack/seq-bump so the four AP _send sites and the iNav statustext _send site become encode -> pack -> transport.write. TlogReplayFcAdapter emits real AP-shape MAVLink bytes through the injected NoopMavlinkTransport, satisfying replay protocol Invariant 5 and unblocking AZ-401 AC-9. Closes AZ-558. Also unskips AZ-401 AC-9 and AZ-404 AC-4b. Live wire output remains byte-identical (proven via two-instance MAVLink byte-equivalence tests). AST scan asserts no .mav.<name>_send( calls remain in the retrofit set (AP / iNav / tlog adapters). Out of scope (logged in review): GCS adapter retrofit; airborne live strategy registration that would activate the SerialMavlinkTransport factory injection path. Tests: 2110 passed, 92 environmental skips, 1 unrelated pre-existing macOS cold-start flake deselected. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
d7e6b0959e |
[AZ-404] [AZ-389] [AZ-559] E2E replay test (Derkachi 60s) + AZ-389 cleanup
Batch 63 of /autodev replay slice. Adds the AZ-404 E2E test harness against the Derkachi fixture and resolves the AZ-389 dependency phantom (closing AZ-559 Won't Fix). E2E test (AZ-404) - tests/e2e/replay/_tlog_synth.py: deterministic CSV->tlog generator (the original Derkachi tlog is not in repo; data_imu.csv is its export, so we round-trip the CSV through pymavlink). Verified: SCALED_IMU2 + ATTITUDE + GPS_RAW_INT + HEARTBEAT round-trip cleanly through mavutil.mavlink_connection. - tests/e2e/replay/_helpers.py: parse_jsonl, l2_horizontal_m (haversine), match_percentage, CapturingMavlinkTransport (ready for AZ-558 unblock), GroundTruthRow + load_ground_truth_csv. - tests/e2e/replay/conftest.py: derkachi_replay_inputs (session scope), replay_runner (subprocess fixture per AZ-402 CLI), operator_pre_flight_setup placeholder. - tests/e2e/replay/test_derkachi_1min.py: 9 tests covering AC-1..AC-8 with AC-7 skip-gate self-check + AC-4a mode-agnosticism AST scan (passes unconditionally, confirms ADR-011 holding). - tests/e2e/replay/test_helpers.py: 14 unit tests covering AC-9 helper L2 correctness + match_percentage + parse_jsonl + CapturingMavlinkTransport (all unconditional). - tests/e2e/replay/README.md: AC matrix, fixture state, runtime budget, failure cookbook (AC-10). AC matrix - AC-1, AC-2, AC-5, AC-6 implemented and Tier-1 gated on RUN_REPLAY_E2E=1. - AC-3 (<=100m for 80%) xfail until real Topotek KHP20S30 calibration ships (camera_info.md states intrinsics are unknown). - AC-4a (mode-agnosticism AST scan) PASSES unconditionally. - AC-4b (encoder byte-equality) skip until AZ-558 routes C8 bytes through MavlinkTransport. - AC-7 (skip-gate self-check) PASSES unconditionally. - AC-8 (operator workflow rehearsal) skip until D-PROJ-2 mock-suite-sat-service implements tile-fetch + index-build endpoints. - AC-9 (helper L2 correctness) 14 PASSES unconditionally. AZ-389 housekeeping - AZ-559 closed Won't Fix: investigation against c6_tile_cache/_types.py confirmed TileSource.ONBOARD_INGEST + TileMetadata.quality_metadata + write_tile's FreshnessRejectionError already cover the mid-flight ingest semantic. The "missing API" was a spec-vs-impl naming mismatch. - AZ-389 spec rewritten to consume the existing write_tile API + catch FreshnessRejectionError per AC-NEW-3 opportunistic emission. - _dependencies_table.md reverted: AZ-389 deps -> AZ-303 (was AZ-559 in the previous commit on this branch); total 150 / 497 pts. Tests - Full regression: 2099 passed (+14 new e2e/replay), 94 skipped (incl. 8 e2e/replay heavy-tier + documented blocker skips), 3 perf-microbench flakes deselected (test_cli_cold_start_under_2s, test_cold_start_under_500ms_p99, test_nfr_perf_sign_microbench; all pass in isolation - pre-existing under-load flakes on dev macOS). Reviews - _docs/03_implementation/reviews/batch_63_review.md: code review PASS_WITH_WARNINGS (3 documented spec-gap deferrals: AC-3, AC-4b, AC-8). - _docs/03_implementation/cumulative_review_batches_61-63_cycle1_report.md: cumulative review PASS_WITH_WARNINGS. Action items: prioritise AZ-558 (closes AZ-401 AC-9 + AZ-404 AC-4b); consider 2pt hygiene PBI for Protocol-completeness AST scan to catch the AZ-389 / AZ-559 phantom-API pattern at task-prep time. Architecture invariants observably holding - ADR-011 (replay-as-configuration): AC-4a's AST scan over src/gps_denied_onboard/components/**/*.py finds zero violations - components branch on neither config.mode nor any synonym. - Single composition root (replay protocol Invariant 11): AZ-402 CLI dispatches to runtime_root.main(config); does not call compose_root directly. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
4f10fd230f |
[AZ-559] [AZ-389] docs: defer AZ-389 to AZ-559 (C6 mid-flight tile gap)
AZ-389's task spec assumed the existence of `tile_store.put_mid_flight_ candidate(MidFlightTileCandidate)` (in Excluded: "owned by AZ-303 / E-C6"), but the current TileStore Protocol has only the four-method baseline shipped under AZ-303 — there is no put_mid_flight_candidate, no MidFlightTileCandidate DTO, and no MID_FLIGHT_INGEST TileSource enum value. Filed AZ-559 as a 5pt task to close the C6 storage gap (Protocol method + DTO + enum + persistence + freshness/LRU integration + contract update). Updated AZ-389 spec to depend on AZ-559 (replacing the stale AZ-303 dep) with a Status: BLOCKED note. Updated the dependencies table totals: 151 tasks / 502 complexity points. This is the same dep-gap pattern surfaced for AZ-401 in batch 61 (missing AZ-400 transport-seam retrofit) — the autodev replay-track sequence is exposing under-spec deliveries upstream. Tracker remains the source of truth via the new AZ-559 issue + Blocks link. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
2c31cc094f |
[AZ-402] Replay — gps-denied-replay console-script + shared main(config)
Implements the replay-mode CLI dispatcher per ADR-011 (replay-as- configuration): - src/gps_denied_onboard/cli/replay.py: argparse with all 6 required args (--video, --tlog, --output, --camera-calibration, --config, --mavlink-signing-key) plus --pace and --time-offset-ms; path validation, calibration JSON schema-validation, config mutation (mode='replay' + replay sub-block + signing-key hex on dev_static field), dispatch into runtime_root.main(config). - runtime_root.main() now accepts an optional Config (additive, backward-compat). Adds dedicated catch for ReplayInputAdapterError mapping to EXIT_FDR_OPEN_FAILURE (2) so the CLI's exit-code matrix holds end-to-end (AC-9 + epic AZ-265 AC-8). - Signing-key contents stored as hex; redacted in startup banner. - Top-level except logs full traceback via logger.exception + stderr print and exits 1. The CLI does NOT call compose_root directly — it builds a Config and hands it to the shared airborne main, which calls compose_root, which branches on config.mode (AZ-401 / replay protocol Invariant 11). Tests: 22 unit tests covering AC-1..AC-10 + extras (signing-key redaction, file-not-dir validation, dev_static propagation, unhandled exception traceback). Full regression: 2085 passed (+22) green; no new flaky tests. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
17a0d074af |
[AZ-401] [AZ-400] Replay — compose_root replay-mode branch + transport seam
Wires the airborne composition root for replay-as-configuration (ADR-011):
- compose_root(config) branches on config.mode in {"live", "replay"}.
Live behaviour is unchanged; replay builds ReplayInputAdapter,
attaches JsonlReplaySink, and injects NoopMavlinkTransport.
- New private module runtime_root/_replay_branch.py holds the
replay-only strategy graph + build-flag gate + calibration loader.
- Config gains Config.mode (Literal["live","replay"]) plus
Config.replay sub-block with nested ReplayAutoSyncConfig that mirrors
the AZ-405 AutoSyncConfig DTO; YAML loader + ENV map updated.
Absorbs the AZ-400 transport-seam retrofit that AZ-401 strictly
required but AZ-400 had not delivered:
- New MavlinkTransport Protocol (write/bytes_written/close).
- NoopMavlinkTransport (replay; build-flag gated, idempotent close,
thread-safe byte counter).
- SerialMavlinkTransport (live, no-op restructure of existing pymavlink
byte path; encoder retrofit to actually USE it is the AZ-558
follow-up).
AZ-401 AC-9 (NoopMavlinkTransport.bytes_written > 0 after C8 encoders
run) is BLOCKED on AZ-558 — the encoder routing retrofit is out of
the AZ-401 task envelope (FORBIDDEN files: pymavlink_ardupilot_adapter,
msp2_inav_adapter). AZ-558 spec, batch_61_review.md, and the test's
@pytest.mark.skip rationale all carry the deferral reason.
Tests: 22 compose_root replay-branch tests + 17 transport tests.
Full regression: 2063 passed, 86 environment-skips, 1 documented
skip (AC-9 / AZ-558), 1 pre-existing flaky perf test deselected.
Co-authored-by: Cursor <cursoragent@cursor.com>
|
||
|
|
8149083cac |
[AZ-405] Replay — replay_input/ coordinator + IMU take-off auto-sync
Adds the Layer-4 cross-cutting `replay_input/` module per ADR-011: ReplayInputAdapter converges (video, tlog) into the standard FrameSource + FcAdapter + Clock surfaces the airborne composition root consumes. Owns time-alignment between video frames and tlog IMU/attitude ticks (manual via --time-offset-ms or auto via the AZ-405 IMU-take-off detector + Farneback motion-onset detector). Auto-sync algorithm (auto_sync.py): - Tlog take-off detector: sustained vertical-accel excess > 0.5 g for >= 0.5 s + sustained attitude-rate magnitude > 1 rad/s. - Video motion-onset detector: dense Farneback flow magnitude > 1.5 px sustained >= 0.5 s (deterministic per AC-10). - compute_offset combines the two; confidence = min(tlog, video). - validate_offset_or_fail implements the AC-9 95 % frame-window match validator with configurable threshold + window. ReplayInputAdapter.open() ordering (AC-13): 1. Load tlog samples + fail-fast on missing RAW_IMU/SCALED_IMU2 or ATTITUDE BEFORE any video read. 2. Resolve offset (auto-sync OR manual override; manual bypasses the detectors entirely per AC-8). 3. Run AC-9 validator on resolved offset; raise auto-sync hard-fail for AC-7 (CLI exit 2 mapping). 4. Build single Clock instance per pace (TlogDerived/ASAP, Wall/REAL). 5. Construct VideoFileFrameSource and TlogReplayFcAdapter with the resolved offset baked in (replay protocol Invariant 8). Structured log + FDR records on auto-sync detected / low-confidence / AC-8 hard-fail kinds. Idempotent close (AC-12). Tests: 25 unit tests across tests/unit/replay_input/ covering all 13 ACs (kernel-level synthetic fixtures for AC-1..AC-10; coordinator- level OpenCV synthetic videos + faked pymavlink for AC-6..AC-13). Contract update: replay_protocol.md v2.0.0 added fdr_client to the ReplayInputAdapter __init__ signature (was missing in the prose; the task spec already listed it in the allowed-imports section). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
5adf3dd04f |
[AZ-265] Replay as configuration of airborne binary (ADR-011)
Re-design replay mode per user direction: replay is no longer a fourth Docker image with a reduced component set, but a `config.mode = "replay"` branch of the single airborne binary. The pre-flight workflow (route in suite UI -> C12 tile download via real satellite-provider -> C10 manifest+engines build) is identical between live and replay; only three strategies swap at compose time: FrameSource: Live <-> Video FcAdapter: Pymavlink/MSP2 <-> TlogReplay MavlinkTransport: Serial <-> Noop The C8 outbound MAVLink encoders run unchanged in both modes; their bytes hit `NoopMavlinkTransport` in replay and disappear. A new `JsonlReplaySink` taps C5's `EstimatorOutput` stream so the parent-suite UI sees per-tick coordinates by tailing `results.jsonl`. MAVLink 2.0 signing key remains mandatory (operator supplies a dummy file). A new `replay_input/` Layer-4 cross-cutting coordinator owns `(video, tlog) -> (FrameSource, FcAdapter, Clock)` convergence; the composition root sees only standard interfaces past `.open()`. Docs: - architecture.md: new ADR-011 with full rationale; ADR-002 binary narrative updated. - contracts/replay/replay_protocol.md: bumped to v2.0.0; 12 invariants (notably mode-agnosticism + encoder byte-equality + signing key mandatory + real C6 cache in replay). - module-layout.md: Build-Time Exclusion Map dropped from 4 to 3 binary columns; replay-mode `BUILD_*` flags default ON in airborne; `shared/replay_input` cross-cutting entry added. - epics.md: E-DEMO-REPLAY scope reframed; story points 27-32 -> 19-24. Task respecs: - AZ-401: shrunk 3 -> 2 pts; `compose_root` mode branch + JSONL sink + NoopMavlinkTransport wiring; legacy `compose_replay` export deleted. - AZ-402: console-script wrapper that mutates `config.mode = "replay"` and dispatches into the shared airborne main; `--mavlink-signing-key` mandatory. - AZ-403: CANCELLED. Moved to done/ with banner; Jira transition deferred via `_docs/_process_leftovers/2026-05-14_az_403_cancellation_pending_tracker.md`. - AZ-404: AC-4 reworded as mode-agnosticism AST scan + encoder byte-equality test; new AC-8 operator-workflow rehearsal. - AZ-405: also owns the `replay_input/` module + `ReplayInputAdapter`. _dependencies_table.md updated: AZ-401 gains AZ-405 dep; AZ-404 drops AZ-403 dep; AZ-403 row marked CANCELLED. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
fa3742d582 |
[AZ-399] [AZ-400] C8 TlogReplayFcAdapter + ReplaySink + JsonlReplaySink
Opens E-DEMO-REPLAY (AZ-265): the two C8 strategies that let the upcoming compose_replay (AZ-401) and gps-denied-replay CLI (AZ-402) run the production C1-C5 pipeline against a recorded (.tlog, video) pair without touching live FC I/O. AZ-400 lands the contract ReplaySink Protocol (emit + close per replay_protocol.md v1.0.0) and JsonlReplaySink: orjson-serialised JSONL, fsync-on-close, build-flag gated (BUILD_REPLAY_SINK_JSONL), double-close idempotent, FDR mirror on open/close. The drifted AZ-390 stub in interface.py is removed; the canonical Protocol now lives in replay_sink.py per module-layout.md and is re-exported via __init__.py. AZ-390 conformance test widened. AZ-399 lands TlogReplayFcAdapter: full FcAdapter Protocol surface, build-flag gated (BUILD_TLOG_REPLAY_ADAPTER), pymavlink stream-parse with bounded pre-scan + fail-fast on missing required messages (R-DEMO-3), dedicated decode thread feeding the existing AZ-391 SubscriptionBus. Outbound surface raises FcEmitError per Invariant 5; request_source_set_switch raises SourceSetSwitchNotSupportedError. Pacing honours Invariant 6 via Clock.sleep_until_ns. time_offset_ms shifts every emitted received_at per Invariant 8. Non-monotonic timestamps raise FcOpenError. Test coverage: 188 c8_fc_adapter tests pass; 1 skipped (AZ-399 AC-1 500 MB tlog RSS bound, deferred to AZ-404 e2e behind RUN_REPLAY_E2E). Code review: PASS_WITH_WARNINGS — 1 Medium (mapping logic duplicates AZ-391 live decoder; intentional today, four behavioural deltas documented), 2 Low. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
4eac24f37a |
[AZ-358] [AZ-361] C4 OpenCVGtsamPoseEstimator + Jacobian thermal hybrid
Implement the single production-default C4 PoseEstimator strategy. AZ-358 — Marginals path: OpenCV solvePnPRansac (SOLVEPNP_IPPE) on best-candidate inliers, PriorFactorPose3 with Jacobian-derived initial covariance, flushed into C5's iSAM2 graph via the widened ISam2GraphHandle.update(graph, values, None) (Option B). Posterior covariance from compute_marginals().marginalCovariance(pose_key) with SPD-defensive Cholesky check. Tile pixel -> ENU world conversion via the shared WgsConverter + a configurable tile_size_px. Two spec deviations now documented in the AZ-358 task file: PriorFactorPose3 over GenericProjectionFactorCal3DS2 (avoids unbounded landmark variables; same Fisher information on the pose marginal) and explicit (graph, values, timestamps) update args (aligns with C5's impl). AZ-361 — Jacobian + thermal hybrid: per-frame dispatch on thermal_state.thermal_throttle_active selects the cv2.projectPoints- derived 6x6 information matrix (with ridge regularisation) as the emitted covariance. Skips the iSAM2 factor add under throttle (Invariant 12). Emits CovarianceDegradedWarning via warnings.warn (never raised); paired WARN log + FDR record rate-limited per covariance_degraded_warn_window_ns (default 60 s) via an injected monotonic Clock. Supersedes the AZ-358 NotImplementedError stub. Widens ISam2GraphHandle from get_pose_key only to all five C4-facing methods (add_factor, update, compute_marginals, last_anchor_age_ms); C5's existing ISam2GraphHandleImpl already satisfies the superset, so no C5 source change this batch. Threads fdr_client + clock through pose_factory composition. Registers two new FDR payload kinds: pose.frame_done (per-call telemetry; both success and PnpFailureError paths) and pose.covariance_degraded (per-window throttle exposure). Tests: 21 new (AZ-358 AC-1..11 + AZ-361 AC-1..10/12/13; AZ-361 AC-11 RMSE-ratio informational per spec, not asserted). Updates 2 existing test files for Protocol widening and the FDR-schema round trip. Code review verdict: PASS_WITH_WARNINGS (5 findings: Medium x2, Low x3; none blocking). Full suite: 1958 passed, 1 unrelated host-dependent perf failure (c12 CLI cold-start, pre-existing). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
abe8c5cd2c |
[AZ-345] [AZ-346] [AZ-347] [AZ-349] Archive batch 57 task specs
Move completed task specs from _docs/02_tasks/todo/ to _docs/02_tasks/done/ now that the four tickets are In Testing. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
06f655d8fb |
[AZ-335] C1 warm-start hint persistence + F8 reboot recovery wiring
Adds JsonSidecarWarmStartHintStore (atomic JSON + SHA-256 sidecar via AZ-280) inside c1_vio, plus the cross-strategy WarmStartWiredStrategy wrapper + prime_warm_start_from_disk / prime_warm_start_from_fc hooks at runtime_root. AC-7 post-reset covariance inflation and AC-8 "no fake confidence" baseline floor are enforced at the wiring layer so no strategy module needed edits. Adds three c1_vio config fields (warm_start_store_dir, warm_start_save_period_frames, post_reset_covariance_inflation_factor) and registers the new FDR kind vio.warm_start. 34 unit tests cover all 10 ACs + 3 NFRs. Verdict PASS_WITH_WARNINGS — see _docs/03_implementation/reviews/batch_56_review.md for the four non-blocking documentation findings (F1 cold-start log kind shorthand, F2 strategy-frame pose semantics, F3 dev-hardware perf smoke, F4 runtime_root importing c1-internal _facade_spine for shared FDR conventions). Closes AZ-335; depends on AZ-528 (batch 55). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
f12789ebf0 |
[AZ-528] Consolidate c1_vio strategy facade orchestration spine
Replace 3-way byte-equivalent orchestration-spine duplication across okvis2.py / vins_mono.py / klt_ransac.py with a single c1-internal helper at components/c1_vio/_facade_spine.py. Closes cumulative review batches 52-54 Finding F1. No behaviour change — all existing AZ-332 / AZ-333 / AZ-334 AC tests pass unmodified (114 c1_vio tests green, 237 with adjacent regression suite). The helper exposes 5 stateless free functions (now_iso, bias_norm, se3_from_4x4, frame_ts_ns, frame_image) and a FacadeSpine mixin class providing _classify_state / _tick_lost / _emit_transition. Concrete strategies inherit the mixin and set spine-required instance attributes in __init__. Mirrors the AZ-527 precedent for c2_vpr-side _assert_engine_output_dim consolidation. New test file test_az528_facade_spine.py covers AC-1..AC-8 with 19 tests, including an AST regression guard that prevents future re-introduction of the consolidated free functions in any strategy module, plus a Risk-1 static check that every strategy's __init__ assigns every spine-required attribute. Archive AZ-528 task spec to done/, bump autodev state to batch 56. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
ac3e288dbd |
[AZ-528] Add AZ-528 task spec + register in dependencies table
Follow-up to cumulative review batches 52-54 Finding F1. Creates the local task-spec file under _docs/02_tasks/todo/ and adds the row to _dependencies_table.md so Batch 55's implement-loop can pick AZ-528 up. Mirrors the AZ-527 precedent from the c2_vpr-side cumulative review (49-51): cumulative review opens the Jira ticket + raises the finding, the prep commit adds the spec, the next batch implements. Sized at 3 points (1 helper module + 3 strategy edits + 1 test file with AST-walk + import-grep regression guards). Marginally larger than AZ-527's 2-point c2 consolidation because the c1 spine has both module-level free functions AND mixin-shaped instance methods. Jira: https://denyspopov.atlassian.net/browse/AZ-528 Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
ceb24b5a62 |
[AZ-334] C1 KLT/RANSAC strategy — engine-rule simple-baseline VIO
Implement KltRansacStrategy, the ADR-002 engine-rule mandatory simple-baseline VioStrategy for E-C1. Pure-Python facade over OpenCV's cv2.goodFeaturesToTrack / calcOpticalFlowPyrLK / findEssentialMat / recoverPose pipeline — no C++/pybind11 binding by design so a Tier-0 workstation runs the strategy with `pip install opencv-python` and the BUILD_KLT_RANSAC=ON gate alone. Constructor + state machine + FDR transition spine mirror Okvis2Strategy + VinsMonoStrategy so the AZ-331 factory + IT-12 comparative harness treat all three as drop-in substitutable; the duplication is the consolidation target now formally in scope for the next cumulative review (batches 52-54). AC coverage: AC-1..AC-11 + NFR-perf mapped to passing tests (25 tests, 23 pass + 2 tier-2 skipped on dev/CI runners; all 25 pass under GPS_DENIED_TIER=2). Honest-covariance invariant (AC-9) implemented as residual-scatter / (N_inliers - 5) with an inlier- count penalty — no client-side floor or smoother; cov Frobenius grows monotonically across DEGRADED. Camera-agnostic source (AC-11) enforced by CI-grep gate that excludes docstring text. Test-Run Cadence: focused suite tests/unit/c1_vio/ green (95 passed, 6 skipped); config-loader + compose-root suites green; full-suite gate deferred to Step 16 per implement skill. Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
6a5954bdae |
[AZ-333] C1 VINS-Mono strategy — research-only comparative VIO
VinsMonoStrategy: Python facade conforming to AZ-331 Protocol; mirrors the AZ-332 OKVIS2 facade so the AZ-331 factory + IT-12 comparative harness can treat both as drop-in substitutable. Native binding is a pybind11 skeleton compiled behind BUILD_VINS_MONO=ON (default OFF for airborne / operator-tooling / replay-cli per module-layout.md Build-Time Exclusion Map). Real vins_estimator wiring is the Tier-2 follow-up. VinsMonoConfig added to c1_vio/config.py with sliding-window / feature-tracker / marginalisation / opt-iteration knobs plus __post_init__ validation; exported through the package __init__. cpp/vins_mono/CMakeLists.txt replaces the AZ-263 placeholder with full pybind11 wiring: Risk-1 mitigation forces VINS_MONO_USE_ROS=OFF; Risk-2 mitigation links Eigen from the same cpp/_third_party/eigen pin as OKVIS2; Risk-3 mitigation enforces BUILD_VINS_MONO=OFF in deployment binaries via the gate at the top of the file. Tests: 17 new in test_vins_mono_strategy.py (15 pass + 2 tier2 skip); fake_vins_mono_binding fixture added to conftest.py mirroring the fake_okvis2_binding pattern; test_protocol_conformance updated to drop vins_mono from _STRATEGIES_WITHOUT_PY_MODULE so the existing parametrised factory tests route through the new strategy. Focused c1_vio suite: 72 passed, 4 skipped. Full suite: 1788 passed, 1 unrelated pre-existing flake (c12 cold-start perf, env-bound). Co-authored-by: Cursor <cursoragent@cursor.com> |
||
|
|
2ce300ddb1 |
[AZ-527] Archive AZ-527 + batch 52 report + state bump
Co-authored-by: Cursor <cursoragent@cursor.com> |