From a12638dd92531b92caeab1258f146a9adb51cf0e Mon Sep 17 00:00:00 2001 From: Oleksandr Bezdieniezhnykh Date: Wed, 20 May 2026 15:50:50 +0300 Subject: [PATCH] =?UTF-8?q?[AZ-696]=20chore:=20cycle-2=20bootstrap=20?= =?UTF-8?q?=E2=80=94=20gitignore=20tlog=20inputs,=20Step=209=20PBIs?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pre-implement chore commit to land orchestration artifacts produced by autodev cycle-2 Step 9 (New Task), so that Step 10 (Implement) starts against a clean working tree. What's included: - .gitignore: exclude _docs/00_problem/input_data/**/*.{tlog,mp4,h264} (derkachi.tlog is a 5.8 MB binary input and stays out-of-band). - _docs/02_tasks/todo/AZ-697..AZ-702: 6 new PBI specs under epic AZ-696 (tlog ground-truth extractor, mid-flight trim+align, real-flight validation runner, replay map viz, HTTP replay API, KHP20S30 calib). - _docs/02_tasks/_dependencies_table.md: dep edges for the 6 PBIs. - _docs/_autodev_state.md: status -> in_progress, step 10 cycle 2. - _docs/_process_leftovers/...opencv_pin_deferred.md: replay-attempt timestamp refreshed (gtsam-numpy-2 wheels still not published; leftover remains open). No source code is modified by this commit. 
Co-authored-by: Cursor --- .gitignore | 5 + _docs/02_tasks/_dependencies_table.md | 6 + .../AZ-697_tlog_ground_truth_extractor.md | 120 +++++++++++++ .../AZ-698_tlog_trim_midflight_alignment.md | 118 +++++++++++++ .../AZ-699_real_flight_validation_runner.md | 106 ++++++++++++ .../todo/AZ-700_replay_map_visualization.md | 108 ++++++++++++ .../todo/AZ-701_http_replay_api_service.md | 161 ++++++++++++++++++ .../todo/AZ-702_khp20s30_calibration.md | 106 ++++++++++++ _docs/_autodev_state.md | 8 +- ...05-11_d_cross_cve_1_opencv_pin_deferred.md | 11 +- 10 files changed, 739 insertions(+), 10 deletions(-) create mode 100644 _docs/02_tasks/todo/AZ-697_tlog_ground_truth_extractor.md create mode 100644 _docs/02_tasks/todo/AZ-698_tlog_trim_midflight_alignment.md create mode 100644 _docs/02_tasks/todo/AZ-699_real_flight_validation_runner.md create mode 100644 _docs/02_tasks/todo/AZ-700_replay_map_visualization.md create mode 100644 _docs/02_tasks/todo/AZ-701_http_replay_api_service.md create mode 100644 _docs/02_tasks/todo/AZ-702_khp20s30_calibration.md diff --git a/.gitignore b/.gitignore index b3665c8..db5f21c 100644 --- a/.gitignore +++ b/.gitignore @@ -45,6 +45,11 @@ tests/fixtures/tiles_corpus/*.jpg tests/fixtures/tiles_corpus/*.png e2e/fixtures/sitl_replay/ +# Problem-folder flight-log inputs (binary, out-of-band) +_docs/00_problem/input_data/**/*.tlog +_docs/00_problem/input_data/**/*.mp4 +_docs/00_problem/input_data/**/*.h264 + # Editor / OS noise .idea/ .vscode/ diff --git a/_docs/02_tasks/_dependencies_table.md b/_docs/02_tasks/_dependencies_table.md index 983fa0b..7bafab8 100644 --- a/_docs/02_tasks/_dependencies_table.md +++ b/_docs/02_tasks/_dependencies_table.md @@ -177,6 +177,12 @@ are all declared and documented below under **Cycle Check**. 
| AZ-623 | AZ-618 Phase E: build_pre_constructed seeds c282_ransac_filter + c5 helpers | 3 | AZ-619, AZ-282, AZ-276, AZ-277, AZ-279, AZ-381 | AZ-602 | | AZ-624 | AZ-618 Phase F: wire build_pre_constructed into main() + AC-1..AC-5 (incl. Jetson tier-2) | 2 | AZ-619, AZ-620, AZ-621, AZ-622, AZ-623 | AZ-602 | | AZ-687 | build_pre_constructed must guard c6_descriptor_index when config.mode == 'replay' | 2 | AZ-619, AZ-620, AZ-624 | AZ-602 | +| AZ-697 | T1: Direct binary-tlog GPS-truth extractor | 3 | None | AZ-696 | +| AZ-698 | T2: Tlog trim + mid-flight alignment for replay | 5 | AZ-697 | AZ-696 | +| AZ-699 | T3: Real-flight validation runner + accuracy report | 3 | AZ-697 | AZ-696 | +| AZ-700 | T4: Replay map visualization (estimated vs ground-truth tracks) | 3 | AZ-699 | AZ-696 | +| AZ-701 | T5: HTTP Replay API service (POST tlog+video, return GPS fixes + map) | 5 | AZ-699, AZ-700 | AZ-696 | +| AZ-702 | T6: Topotek KHP20S30 camera calibration (factory-sheet approximation) | 1 | None | AZ-696 | ## Notes diff --git a/_docs/02_tasks/todo/AZ-697_tlog_ground_truth_extractor.md b/_docs/02_tasks/todo/AZ-697_tlog_ground_truth_extractor.md new file mode 100644 index 0000000..f806ff4 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-697_tlog_ground_truth_extractor.md @@ -0,0 +1,120 @@ +# Direct binary-tlog GPS-truth extractor + +**Task**: AZ-697_tlog_ground_truth_extractor +**Name**: Direct binary-tlog GPS-truth extractor (replaces data_imu.csv middle-man) +**Description**: New `tlog_ground_truth.py` module that streams `GLOBAL_POSITION_INT` (or falls back to `GPS_RAW_INT`) from a binary ArduPilot tlog into a typed `TlogGroundTruth` DTO. Production helper (not test-only). 
+**Complexity**: 3 points +**Dependencies**: None +**Component**: replay_input (cross-cutting validation helper) +**Tracker**: AZ-697 +**Epic**: AZ-696 + +## Problem + +Cycle-1 AC-3 (≤ 100 m horizontal error for 80 % of ticks) was permanently +`@xfail` partly because the test fed the SUT a tlog synthesized from +`_docs/00_problem/input_data/flight_derkachi/data_imu.csv`, and read +ground truth from the same CSV — comparing the estimator to itself. + +A real binary `derkachi.tlog` (5.8 MB ArduPilot tlog, MAVLink v2) was +committed on 2026-05-20. The remaining gap is a direct extractor that +reads `GLOBAL_POSITION_INT` (or `GPS_RAW_INT`) from the binary and +returns a typed DTO suitable for the AC-3 comparison helper. + +## Outcome + +- A new production module `src/gps_denied_onboard/replay_input/tlog_ground_truth.py` + exposes `load_tlog_ground_truth(path: Path) -> TlogGroundTruth`. +- The existing AC-3 comparison helpers (`l2_horizontal_m`, + `match_percentage`) move from `tests/e2e/replay/_helpers.py` into + `src/gps_denied_onboard/helpers/` so they are production code, not + test-only. +- The replay-test conftest uses the new extractor when the real tlog is + present; CSV path remains as a synth-tlog fallback. + +## Scope + +### Included +- New `TlogGroundTruth` dataclass (frozen + slotted) with per-record + `ts_ns`, `lat_deg`, `lon_deg`, `alt_m`, `hdg_deg`, `vx_m_s`, `vy_m_s`, + `vz_m_s` fields. +- `load_tlog_ground_truth(path)` — lazy `pymavlink.mavutil` open + mirroring `replay_input/auto_sync.py::_open_tlog`. +- Move `l2_horizontal_m` + `match_percentage` from test helpers to + `src/gps_denied_onboard/helpers/gps_compare.py`. +- Wire `tests/e2e/replay/conftest.py` to consume the new path when + `derkachi.tlog` exists. +- Unit tests under `tests/unit/replay_input/test_tlog_ground_truth.py` + using a synthetic tlog (extend `tests/e2e/replay/_tlog_synth.py`). + +### Excluded +- Tlog trimming for mid-flight slices — AZ-698 (T2). 
+- Accuracy report writing — AZ-699 (T3). +- Map visualization — AZ-700 (T4). + +## Acceptance Criteria + +**AC-1: Happy path on real tlog** +Given the committed `derkachi.tlog` +When `load_tlog_ground_truth(derkachi.tlog)` runs +Then it returns `TlogGroundTruth` with `len(records) > 100` and lat ≈ 50.08, lon ≈ 36.11 + +**AC-2: Empty GPS gracefully** +Given a tlog with no `GLOBAL_POSITION_INT` / `GPS_RAW_INT` messages +When the extractor runs +Then it returns `TlogGroundTruth(records=())` and logs WARN (does NOT raise) + +**AC-3: Fallback precedence** +Given a tlog containing only `GPS_RAW_INT` (no `GLOBAL_POSITION_INT`) +When the extractor runs +Then it returns records sourced from `GPS_RAW_INT` + +**AC-4: Type safety** +When `mypy --strict src/gps_denied_onboard/replay_input/tlog_ground_truth.py` runs +Then it reports zero errors + +**AC-5: Comparison helpers in production** +Given the moved `l2_horizontal_m` + `match_percentage` +When imported from `gps_denied_onboard.helpers.gps_compare` +Then they behave identically to the prior test-helpers location (snapshot test) + +## Non-Functional Requirements + +**Performance** +- `load_tlog_ground_truth(derkachi.tlog)` (5.8 MB, ~60 s of GPS at 5 Hz) returns in < 2 s on Tier-1 hardware. + +**Reliability** +- Lazy pymavlink import; missing dep raises `ReplayInputAdapterError` per project convention. 
+ +## Unit Tests + +| AC Ref | What to Test | Required Outcome | +|--------|-------------|-----------------| +| AC-1 | Real derkachi.tlog parse | Non-empty TlogGroundTruth with Derkachi geofence lat/lon | +| AC-2 | Tlog with no GPS messages | Empty records tuple + WARN log | +| AC-3 | GPS_RAW_INT fallback | Records sourced from GPS_RAW_INT when GLOBAL_POSITION_INT absent | +| AC-3 | Mixed GLOBAL_POSITION_INT + GPS_RAW_INT | GLOBAL_POSITION_INT wins per AC-3 | +| AC-4 | mypy --strict | Zero errors | +| AC-5 | Helper move snapshot | Same numeric output as prior test-helpers location | + +## Blackbox Tests + +| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References | +|--------|------------------------|-------------|-------------------|----------------| +| AC-1 | derkachi.tlog (real) | Load full tlog | ≥ 100 records, Derkachi geofence | Perf < 2s | + +## Constraints + +- pymavlink is already a project dep (used by C8); MUST be lazy-imported (auto_sync.py pattern). +- New module MUST follow the project's frozen + slotted dataclass convention. +- File ownership goes in `_docs/02_document/module-layout.md` per AZ-696 epic layout (no contract — internal helper). + +## Risks & Mitigation + +**Risk 1: MAVLink unit-conversion bugs** +- *Risk*: MAVLink encodes lat/lon as integer degrees × 1e7 (degE7). Forgetting the divide ships records off by 7 orders of magnitude. +- *Mitigation*: AC-1 asserts Derkachi geofence values; unit test snapshots a known fixture. + +**Risk 2: pymavlink import flakiness on Jetson** +- *Risk*: pymavlink occasionally fails to import on aarch64. +- *Mitigation*: Lazy import + raise `ReplayInputAdapterError` (existing pattern). 
diff --git a/_docs/02_tasks/todo/AZ-698_tlog_trim_midflight_alignment.md b/_docs/02_tasks/todo/AZ-698_tlog_trim_midflight_alignment.md new file mode 100644 index 0000000..eed1c0d --- /dev/null +++ b/_docs/02_tasks/todo/AZ-698_tlog_trim_midflight_alignment.md @@ -0,0 +1,118 @@ +# Tlog trim + mid-flight alignment for replay + +**Task**: AZ-698_tlog_trim_midflight_alignment +**Name**: Trim tlog to video window + align mid-flight slices via cross-correlation +**Description**: Extend `replay_input/auto_sync.py` and `TlogReplayFcAdapter` to handle the case where the video is a mid-flight slice of a longer tlog (not the takeoff). Adds `find_aligned_window` (cross-correlation of IMU energy vs video optical-flow magnitude) and a `--auto-trim` CLI flag. +**Complexity**: 5 points +**Dependencies**: AZ-697 +**Component**: replay_input + c8_fc_adapter +**Tracker**: AZ-698 +**Epic**: AZ-696 + +## Problem + +`replay_input/auto_sync.py::detect_tlog_takeoff` walks the tlog HEAD for +the takeoff event (sustained vertical accel + attitude rate). When the +uploaded video covers a **mid-flight slice** (e.g., 20–25 min into a +30 min flight), takeoff detection lands at t=0 and the resulting offset +is garbage. The replay coordinator then streams the entire tlog +start-to-end, wasting I/O on the leading minutes and computing +estimates against stale tlog samples. + +The user's pipeline framing: "tlog is usually bigger than video, and +usually the last chunk in tlog is relevant" — the system must locate +the video's window within the tlog and trim accordingly. + +## Outcome + +- A new `find_aligned_window(tlog_path, video_path, config) -> AlignedWindow` + returns `(tlog_start_ns, tlog_end_ns, offset_ms, confidence)`. +- `TlogReplayFcAdapter.open()` honors `tlog_start_ns` — seeks past + pre-window messages so downstream only sees the relevant slice. +- `gps-denied-replay --auto-trim` is the default for uploads that don't + pass `--time-offset-ms` or `--skip-auto-sync`. 
+- Existing takeoff-aligned Derkachi clip continues to pass AC-9 (no + regression on AZ-405). + +## Scope + +### Included +- New `find_aligned_window` algorithm — cross-correlation of: + - IMU energy stream (10 Hz subsampled `|a| − 1g` from `RAW_IMU`/`SCALED_IMU2`) + - Video optical-flow magnitude (existing `_compute_flow_magnitudes`) +- New `AlignedWindow` DTO under `replay_input/interface.py`. +- `TlogReplayFcAdapter._timestamp_filter(tlog_start_ns)` seek logic. +- `gps-denied-replay --auto-trim` CLI flag wiring. +- Tests: takeoff-aligned regression + synthetic mid-flight scenario. + +### Excluded +- Real-flight validation runner — AZ-699 (T3). +- Map visualization — AZ-700 (T4). +- HTTP API — AZ-701 (T5). +- Camera calibration — AZ-702 (T6). + +## Acceptance Criteria + +**AC-1: Backward-compat on takeoff-aligned clip** +Given the existing Derkachi 60 s clip with synthesized tlog +When `find_aligned_window` runs +Then it returns `offset_ms` within ± 50 ms of the current `auto_sync.compute_offset` result + +**AC-2: Mid-flight alignment** +Given a synthetic scenario: tlog covering 0–300 s, video covering 100–110 s with motion onset at tlog t=105 s +When `find_aligned_window` runs +Then `tlog_start_ns ≈ 100 s`, `tlog_end_ns ≈ 110 s`, `offset_ms` places video t=0 at tlog t=100 s + +**AC-3: Tlog trim honored by replay adapter** +Given `TlogReplayFcAdapter` opened with `tlog_start_ns = 100 s` +When messages flow +Then only messages with `_timestamp ≥ 100 s` reach subscribers + +**AC-4: AC-9 frame-window validator passes for both scenarios** +Given the resolved offset from AC-1 or AC-2 +When the AC-9 validator runs on the aligned window +Then it returns 0 (≥ 95 % match) + +**AC-5: End-to-end CLI smoke** +Given `gps-denied-replay --auto-trim --video derkachi.mp4 --tlog derkachi.tlog` +When the run completes +Then exit code is 0 and the output JSONL is non-empty + +## Non-Functional Requirements + +**Performance** +- Alignment over a 30-min tlog completes in < 30 s on 
Tier-1 hardware (10 Hz subsampled IMU stream). + +**Reliability** +- Low confidence (< `low_confidence_threshold`) falls back to head-takeoff detection (existing behavior). + +## Unit Tests + +| AC Ref | What to Test | Required Outcome | +|--------|-------------|-----------------| +| AC-1 | Takeoff-aligned offset match | Within ± 50 ms of compute_offset | +| AC-2 | Mid-flight window discovery | Correct (start_ns, end_ns) | +| AC-3 | Adapter seek skips pre-window | First emitted ts ≥ tlog_start_ns | +| AC-4 | Validator on aligned scenarios | Returns 0 | + +## Blackbox Tests + +| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References | +|--------|------------------------|-------------|-------------------|----------------| +| AC-5 | Real derkachi inputs + --auto-trim | Full replay CLI run | Clean exit 0 + non-empty JSONL | — | + +## Constraints + +- Reuse the existing `_find_sustained_event` window-scan utility — no new generic algorithms. +- IMU subsampling MUST be deterministic (AC-10 across the rest of the replay path). +- `tlog_start_ns` seek MUST not break the existing AZ-611 `--skip-auto-sync` path. + +## Risks & Mitigation + +**Risk 1: False maxima during steady cruise** +- *Risk*: Cross-correlation of steady-state cruise IMU + uniform video flow can have multiple equal-height peaks. +- *Mitigation*: Report `combined_confidence`; below threshold falls back to head-takeoff or explicit offset. + +**Risk 2: Performance on long tlogs** +- *Risk*: Multi-hour tlogs would slow naive correlation. +- *Mitigation*: Subsample both streams to 10 Hz before FFT-based correlation. 
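Review note on AZ-698: the window search behind `find_aligned_window` could look roughly like the sketch below. It deliberately uses a direct sliding normalized cross-correlation in plain Python rather than the FFT-based form the spec calls for, so it only illustrates the idea and would not meet the 30 s budget on long tlogs. The function name `find_best_lag`, its signature, and the use of the peak Pearson score as a confidence proxy are all assumptions, not part of the patch.

```python
import math
from typing import Sequence


def _pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0.0 else 0.0


def find_best_lag(long_signal: Sequence[float], short_signal: Sequence[float]) -> tuple[int, float]:
    """Slide short_signal over long_signal; return (best_lag_samples, peak_score).

    long_signal stands in for the subsampled IMU energy stream and
    short_signal for the video optical-flow magnitude at the same rate.
    """
    m = len(short_signal)
    if m == 0 or m > len(long_signal):
        raise ValueError("short_signal must be non-empty and no longer than long_signal")
    best_lag, best_score = 0, -1.0
    for lag in range(len(long_signal) - m + 1):
        score = _pearson(long_signal[lag:lag + m], short_signal)
        if score > best_score:  # strict '>' keeps the earliest of equal peaks
            best_lag, best_score = lag, score
    return best_lag, best_score
```

At the 10 Hz rate named in the spec, a lag of 1000 samples corresponds to a window starting at tlog t = 100 s. A flat or ambiguous peak, which is the Risk 1 case, would show up here as a low or tied score.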
diff --git a/_docs/02_tasks/todo/AZ-699_real_flight_validation_runner.md b/_docs/02_tasks/todo/AZ-699_real_flight_validation_runner.md new file mode 100644 index 0000000..2a41a8a --- /dev/null +++ b/_docs/02_tasks/todo/AZ-699_real_flight_validation_runner.md @@ -0,0 +1,106 @@ +# Real-flight validation runner + accuracy report + +**Task**: AZ-699_real_flight_validation_runner +**Name**: Run estimator against real Derkachi tlog + video; compute honest accuracy metrics; write report +**Description**: New e2e test `test_derkachi_real_tlog.py` that feeds the real `derkachi.tlog` (not the synth) into the replay pipeline, compares the JSONL output against the binary-tlog GPS truth (from AZ-697), and writes a structured Markdown accuracy report. Flips AC-3 from `@xfail` to a real PASS/FAIL verdict. +**Complexity**: 3 points +**Dependencies**: AZ-697 +**Component**: Blackbox Tests (epic AZ-696) +**Tracker**: AZ-699 +**Epic**: AZ-696 + +## Problem + +`tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks` +is permanently `@xfail`. Even when the test runs (Jetson Tier-2), the +result is hidden — we have no honest measurement of estimator accuracy +against a real flight. The cycle-1 retrospective (`_docs/06_metrics/retro_2026-05-20.md`) +flagged this as the highest-impact open verification. + +The two contributors: +1. Synth tlog (compares estimator to itself) — fixed by AZ-697. +2. Unknown camera intrinsics — addressed by AZ-702 (T6, factory sheet). + +This task wires the real tlog + the calibration into a new test and +produces the honest verdict + a structured report. + +## Outcome + +- A new test runs the full `gps-denied-replay` against `derkachi.tlog` + + `flight_derkachi.mp4` + `khp20s30_factory.json` (or the current + fallback) and reports honest accuracy metrics. 
+- A structured report at `_docs/06_metrics/real_flight_validation_{YYYY-MM-DD}.md` + contains mean / p50 / p95 / p99 horizontal error, % within {10, 25, 50, 100} m, + vertical error stats, and notes the calibration assumption. +- AC-3 emits a real PASS or honest FAIL verdict (no `@xfail` mask). + +## Scope + +### Included +- New test `tests/e2e/replay/test_derkachi_real_tlog.py` parallel to the existing 1-min test but using the binary tlog. +- Metric helpers (mean/p50/p95/p99 percentile + threshold-hit counters) live in `src/gps_denied_onboard/helpers/gps_compare.py` (extends AZ-697). +- Report writer `tests/e2e/replay/_report_writer.py` (test helper, not production code). +- Updated `_docs/06_metrics/real_flight_validation_{date}.md` artifact format documented in `_docs/02_document/tests/blackbox-tests.md`. + +### Excluded +- Map visualization — AZ-700. +- HTTP API — AZ-701. +- Camera calibration acquisition — AZ-702 (this task ships with whatever calibration is current). +- Editing the existing `test_derkachi_1min.py` (new test runs alongside). 
+ +## Acceptance Criteria + +**AC-1: Real PASS/FAIL verdict (no mask)** +Given the new test on Tier-2 Jetson +When `pytest tests/e2e/replay/test_derkachi_real_tlog.py -m tier2` runs +Then the result is PASS or FAIL — no `@xfail`, no `@skip` + +**AC-2: Structured report written** +Given a successful invocation +When the test finishes +Then `_docs/06_metrics/real_flight_validation_{YYYY-MM-DD}.md` exists with all required metrics in a Markdown table + +**AC-3: FAIL message attributes calibration uncertainty** +Given the test fails the 80 %/100 m gate +When the failure message renders +Then it references the calibration acquisition method (factory-sheet per AZ-702) and the residual budget + +**AC-4: Existing 1-min test untouched** +Given the cycle-1 test `test_ac3_within_100m_80pct_of_ticks` +When all changes land +Then the existing `@xfail` test still exists and runs (we add, don't replace) + +## Non-Functional Requirements + +**Performance** +- The new test must complete within the existing Jetson Tier-2 wall budget (≤ 15 min for a 60 s clip; report longer for longer clips). + +## Unit Tests + +| AC Ref | What to Test | Required Outcome | +|--------|-------------|-----------------| +| AC-2 | Report writer with mock metrics | Markdown contains every required row | +| AC-3 | Failure message templating | Contains "calibration: factory-sheet" + budget | + +## Blackbox Tests + +| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References | +|--------|------------------------|-------------|-------------------|----------------| +| AC-1 | Real derkachi.tlog + video + KHP20S30 calibration | Full replay + accuracy gate | PASS or FAIL (honest) | Perf ≤ 15 min | +| AC-2 | After AC-1 run | Report file existence + contents | Structured report on disk | — | + +## Constraints + +- The new test MUST use the existing `gps-denied-replay` console-script — no inlined estimator invocation. 
+- The report MUST be Markdown (not HTML/JSON) so it lives alongside other `_docs/06_metrics/` artifacts. +- Skipping in CI when `RUN_REPLAY_E2E=0` is allowed (matches existing pattern); the test MUST run when the env var is set. + +## Risks & Mitigation + +**Risk 1: Honest FAIL exposes a true product gap** +- *Risk*: The estimator may legitimately fail the 100 m/80 % gate even with correct calibration. Derkachi is cruise altitude with limited VPR anchor diversity. +- *Mitigation*: That's the goal — honest measurement. Surface the gap; downstream cycles can tighten. + +**Risk 2: tlog format edge cases** +- *Risk*: Real tlogs may carry non-standard system IDs, dialect mismatches, or corrupt segments. +- *Mitigation*: AZ-697's AC-3 / AC-4 cover this at the truth-extractor level; this task only consumes the result. diff --git a/_docs/02_tasks/todo/AZ-700_replay_map_visualization.md b/_docs/02_tasks/todo/AZ-700_replay_map_visualization.md new file mode 100644 index 0000000..2881731 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-700_replay_map_visualization.md @@ -0,0 +1,108 @@ +# Replay map visualization (estimated vs ground-truth tracks) + +**Task**: AZ-700_replay_map_visualization +**Name**: HTML map showing estimated GPS track vs tlog ground-truth track +**Description**: New `gps-denied-render-map` console script. Takes a JSONL of estimator output + a tlog (or CSV fallback) and renders a single-file HTML map (folium / Leaflet) with both tracks in distinct colors, start/end markers, and an embedded accuracy summary from AZ-699. +**Complexity**: 3 points +**Dependencies**: AZ-699 +**Component**: cli (offline analysis surface) +**Tracker**: AZ-700 +**Epic**: AZ-696 + +## Problem + +Today the only feedback from a replay run is a JSONL file. There is no +way to visually verify whether the estimator is drifting, jumping, or +roughly tracking the real flight. A human reading the JSONL cannot +quickly answer "does this make sense geographically?" 
+ +The user's pipeline explicitly calls for: "and then show both points on +the map." + +## Outcome + +- A standalone CLI `gps-denied-render-map` produces a self-contained + HTML map of the estimated track + the tlog ground-truth track for any + prior replay run. +- The map is shareable as a single file (no server required); developers + open it locally; AZ-701's HTTP API serves it back to API consumers. + +## Scope + +### Included +- New module `src/gps_denied_onboard/cli/render_map.py`. +- New console script `gps-denied-render-map` in `pyproject.toml`. +- folium dependency pin in the appropriate `[project.optional-dependencies]` group (NOT in airborne-binary deps — operator-side only). +- Default map style + tile provider (OpenStreetMap fallback documented for offline use). +- Auto-fit bounds; distance circles (100 m, 50 m) around start point for scale. +- Accuracy summary banner (read from `_docs/06_metrics/real_flight_validation_{date}.md` when `--summary` is passed). + +### Excluded +- Interactive time-slider playback (deferred follow-up). +- Embedded altitude profile chart. +- Animated marker traversal. 
+ +## Acceptance Criteria + +**AC-1: CLI produces self-contained HTML** +Given a JSONL + tlog +When `gps-denied-render-map --estimated out.jsonl --truth derkachi.tlog --output map.html` runs +Then `map.html` exists, parses as valid HTML, exits 0 + +**AC-2: Two distinct tracks visible** +Given the rendered map opened in a browser +When inspected +Then it contains exactly two polyline layers (red = truth, blue = estimated) with start/end markers + +**AC-3: Markers + scale circles** +Given the rendered map +When parsed +Then it contains the start (green) + end (black) markers + 100 m + 50 m scale circles + +**AC-4: Accuracy summary inclusion** +Given `--summary _docs/06_metrics/real_flight_validation_2026-XX-XX.md` +When the map renders +Then the HTML header contains the accuracy metrics table + +**AC-5: Offline fallback documented** +Given an environment without internet access +When the map is rendered with `--offline-tiles` +Then tile loading uses a documented fallback (or fails fast with a clear error if no fallback is configured) + +## Non-Functional Requirements + +**Compatibility** +- Output HTML must render in Chrome 110+ and Firefox 110+ without console errors. + +**Performance** +- For a 60 s flight (~600 truth points + ~600 estimated points), render time < 5 s on Tier-1 hardware. 
+ +## Unit Tests + +| AC Ref | What to Test | Required Outcome | +|--------|-------------|-----------------| +| AC-1 | CLI invocation with synthetic data | Output HTML file exists + non-empty | +| AC-2 | Parse output HTML | Exactly 2 polyline layers + 4 expected markers | +| AC-4 | Summary embed | Markdown summary metrics present in HTML | + +## Blackbox Tests + +| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References | +|--------|------------------------|-------------|-------------------|----------------| +| AC-1 | Real Derkachi replay JSONL + tlog | End-to-end render | HTML opens in browser, both tracks visible | Compat | + +## Constraints + +- folium MUST be in the operator-only dep group; airborne binary cold-start regression test must remain green. +- HTML output MUST be self-contained — embedded JS/CSS, no per-page CDN calls in `--offline-tiles` mode. +- Console script naming follows the project pattern (`gps-denied-`). + +## Risks & Mitigation + +**Risk 1: folium dep size** +- *Risk*: folium pulls ~5 MB of JS. Adding to airborne deps would regress cold-start. +- *Mitigation*: optional-dependencies group + ADR-002 build-time exclusion principle. + +**Risk 2: CDN dependency at render time** +- *Risk*: Default folium uses Leaflet via CDN — fails on offline Jetsons. +- *Mitigation*: Document `--offline-tiles` flag; provide bundled assets path or fail-fast. diff --git a/_docs/02_tasks/todo/AZ-701_http_replay_api_service.md b/_docs/02_tasks/todo/AZ-701_http_replay_api_service.md new file mode 100644 index 0000000..e77a454 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-701_http_replay_api_service.md @@ -0,0 +1,161 @@ +# HTTP Replay API service + +**Task**: AZ-701_http_replay_api_service +**Name**: HTTP API for offline replay (POST tlog+video, return GPS fixes + map URL) +**Description**: New `replay_api` component (FastAPI) wrapping the offline replay pipeline. 
One primary endpoint `POST /replay` accepts multipart `(tlog + video [+ calibration])` and returns either a synchronous JSONL+summary or an async job id. Returns links to the map artifact rendered by AZ-700. +**Complexity**: 5 points +**Dependencies**: AZ-699, AZ-700 +**Component**: replay_api (new component) +**Tracker**: AZ-701 +**Epic**: AZ-696 + +## Problem + +The product today has zero HTTP surface. The only ways to invoke the +estimator on a recorded flight are: +1. The airborne binary (real-time MAVLink GPS_INPUT — needs the + aircraft + FC). +2. `gps-denied-replay` CLI (operator workstation, Python install + required). +3. `operator-orchestrator` CLI (Click, pre-flight cache only — does + NOT run the estimator). + +External consumers (operator tools, suite web UIs, demo dashboards, +other suite services) cannot validate flights without installing the +full Python stack. The user's pipeline framing explicitly calls for +"part of the api — tlog and video uploading. and emits gps fixes back +to the user." + +## Outcome + +- A new HTTP service exposes `POST /replay` and the supporting `GET /jobs/{id}*` polling endpoints. +- The service wraps `gps-denied-replay` and AZ-700's map renderer behind a single multipart upload. +- Containerized; runs in `docker-compose.test.yml`; OpenAPI spec is committed. +- Authentication via bearer token, gated explicitly off in dev mode (logs WARN). + +## Scope + +### Included +- New component `src/gps_denied_onboard/replay_api/`: + - `app.py` (FastAPI instance) + - `handlers.py` (multipart upload, validation) + - `jobs.py` (sync ≤ 2 min videos / async > 2 min) + - `storage.py` (temp file lifecycle, cleanup) + - `interface.py` (`ReplayRunner` Protocol so handlers are decoupled) + - `errors.py` (custom HTTP error families) +- Endpoints: `POST /replay`, `GET /jobs/{id}`, `GET /jobs/{id}/result`, `GET /jobs/{id}/map`, `GET /healthz`, `GET /readyz`. 
+- Bearer-token auth: `REPLAY_API_BEARER_TOKEN` env var; explicit dev opt-out via `REPLAY_API_AUTH_REQUIRED=false`. +- Upload size limit + concurrent-job limit, env-configurable. +- New `replay-api` console script (uvicorn entrypoint) in `pyproject.toml`. +- New `docker/replay-api.Dockerfile` + `docker-compose.test.yml` entry. +- OpenAPI spec exported to `_docs/02_document/contracts/replay_api/openapi.yaml`. +- Contract file `_docs/02_document/contracts/replay_api/replay_api_protocol.md` (per shared/api decompose Step 4.5 rule). +- File-upload magic-byte validation for `.tlog` + `.mp4`. + +### Excluded +- Web UI (parent-suite concern). +- Persistent job database (in-memory + temp disk is sufficient for v1). +- Multi-node job distribution. +- WebSocket streaming of progress. + +## Acceptance Criteria + +**AC-1: Sync happy path (short video, dev mode)** +Given `REPLAY_API_AUTH_REQUIRED=false` and a 60 s video +When `POST /replay` runs with multipart `tlog + video` +Then response is 200 with JSONL of GPS fixes + accuracy summary inline + +**AC-2: Async happy path (long video)** +Given a > 2-minute video +When `POST /replay` runs +Then response is 202 with `Location: /jobs/{id}` and `{job_id, status_url}` + +**AC-3: Job state transitions** +Given an async job +When polled via `GET /jobs/{id}` +Then state transitions `queued → running → done` are observable + +**AC-4: Result + map served from job id** +Given a `done` job +When `GET /jobs/{id}/result` is called +Then it streams the JSONL; `GET /jobs/{id}/map` returns the HTML map (from AZ-700) + +**AC-5: Auth enforced when configured** +Given `REPLAY_API_BEARER_TOKEN=secret` +When `POST /replay` runs without `Authorization: Bearer secret` +Then response is 401 + +**AC-6: Health endpoints** +Given the service is up and `gps-denied-replay` console-script is on PATH +When `GET /healthz` and `GET /readyz` are called +Then both return 200 + +**AC-7: OpenAPI + contract documented** +Given the service is running +When the 
OpenAPI spec is exported +Then `_docs/02_document/contracts/replay_api/openapi.yaml` is committed; `replay_api_protocol.md` documents the versioning rules + +**AC-8: Concurrency limit enforced** +Given `REPLAY_API_MAX_CONCURRENT_JOBS=1` +When 3 jobs are submitted in quick succession +Then exactly 1 is `running`; 2 are `queued` + +**AC-9: Magic-byte upload validation** +Given a `POST /replay` with a misnamed `.tlog` (actually a `.zip`) +When the handler validates +Then response is 400 with a clear error + +## Non-Functional Requirements + +**Performance** +- For a 60 s Derkachi video, sync `POST /replay` returns within `gps-denied-replay` ASAP-mode wall + 5 s overhead on Tier-2 Jetson. + +**Security** +- Magic-byte file validation; reject anything not matching `.tlog` (MAVLink magic 0xFD/0xFE) or `.mp4` (ftyp). +- Bearer auth always available; default-OFF only with explicit env var. + +**Compatibility** +- FastAPI / uvicorn / python-multipart pinned; document version compatibility window. + +## Unit Tests + +| AC Ref | What to Test | Required Outcome | +|--------|-------------|-----------------| +| AC-1 | Sync POST → 200 + JSONL | Round-trip succeeds with synth fixtures | +| AC-2 | Async POST → 202 + job id | 202 with Location header | +| AC-3 | Job state machine | Transitions observed | +| AC-5 | Missing/wrong bearer → 401 | Strict failure | +| AC-8 | Concurrency limit | 2 of 3 queued | +| AC-9 | Wrong magic bytes → 400 | Clear error | + +## Blackbox Tests + +| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References | +|--------|------------------------|-------------|-------------------|----------------| +| AC-1, AC-4 | Real derkachi.tlog + video | `curl` round-trip in docker-compose | 200 + JSONL + map HTML | Perf | +| AC-6 | Container up | Health endpoint checks | 200 OK | — | + +## Constraints + +- FastAPI MUST live in an operator-only build target; ADR-002 binary-exclusion applies. 
Airborne binary cold-start regression test must remain green. +- New component MUST follow interface-first + constructor-injection (Principle #13 in architecture.md). +- Contract file MUST exist before the endpoint is callable in CI (per decompose Step 4.5 rule). + +## Risks & Mitigation + +**Risk 1: FastAPI / uvicorn dep weight on airborne binary** +- *Risk*: Adding the API dep to the airborne binary regresses cold-start. +- *Mitigation*: Place `replay_api/` in an operator-only optional-dependencies group; CMake / build-time exclusion enforces. + +**Risk 2: HTTP timeout on long videos** +- *Risk*: Sync mode + a long video → HTTP timeout. +- *Mitigation*: Async mode triggers automatically above the configured video-length threshold. + +**Risk 3: File-upload abuse** +- *Risk*: Malicious uploads (huge files, zip bombs, fake MIME types). +- *Mitigation*: Hard size limit (2 GB default), magic-byte validation, temp-file cleanup, configurable disk quota. + +## Contract + +This task produces the contract at `_docs/02_document/contracts/replay_api/replay_api_protocol.md`. +Consumers MUST read that file — not this task spec — to discover the interface and versioning rules. diff --git a/_docs/02_tasks/todo/AZ-702_khp20s30_calibration.md b/_docs/02_tasks/todo/AZ-702_khp20s30_calibration.md new file mode 100644 index 0000000..1e1cf88 --- /dev/null +++ b/_docs/02_tasks/todo/AZ-702_khp20s30_calibration.md @@ -0,0 +1,106 @@ +# Topotek KHP20S30 camera calibration (factory-sheet approximation) + +**Task**: AZ-702_khp20s30_calibration +**Name**: Provide a calibration JSON for the Topotek KHP20S30 nadir camera (factory-sheet approximation) +**Description**: Compute and commit a `CameraCalibrationArtifact` JSON for the Derkachi camera (Topotek KHP20S30) from manufacturer factory data. Replaces the `adti26.json` placeholder that AC-3 currently uses. Documents the residual error vs a per-unit checkerboard refinement. 
+**Complexity**: 1 point +**Dependencies**: None +**Component**: input_data / shared_helpers +**Tracker**: AZ-702 +**Epic**: AZ-696 + +## Problem + +`_docs/00_problem/input_data/flight_derkachi/camera_info.md` states the +Topotek KHP20S30 intrinsics are unknown. `tests/e2e/replay/conftest.py` +(line 50–56) substitutes `tests/fixtures/calibration/adti26.json` as a +placeholder. AC-3 (≤ 100 m horizontal error for 80 % of ticks) is +`@xfail` until a real calibration ships. + +The cheapest reasonable starting point is a factory-sheet approximation +— compute `K` from the manufacturer's published focal length + sensor +geometry, accept the 1–3 % focal-length residual as a documented +budget, and let AC-3 either PASS or honestly FAIL with the residual +attributed. + +## Outcome + +- A calibration JSON `khp20s30_factory.json` exists in the Derkachi + input directory, parses against the project's + `CameraCalibrationArtifact` schema, and documents the acquisition + method as `factory_sheet`. +- `camera_info.md` is updated to reference the new calibration + the + residual budget + the deferral handle (`AZ-XXX_checkerboard_refinement`). +- AZ-699 (T3) uses this calibration as its `--camera-calibration` input. + +## Scope + +### Included +- Source manufacturer factory data for the Topotek KHP20S30 (sensor: 1/2.8" CMOS, 2.13 MP, 1920×1080; lens focal length, FOV, pixel pitch). +- Compute `K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]` from `fx = fy = focal_length_mm × (image_width_px / sensor_width_mm)`. +- Set distortion to `[0, 0, 0, 0, 0]` (factory-sheet approximation). +- Set `body_to_camera_se3` to identity-down (nadir; camera-z = aircraft-down). +- Set `acquisition_method = "factory_sheet"`. +- Write `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json`. +- Update `_docs/00_problem/input_data/flight_derkachi/camera_info.md`. +- New unit test under `tests/unit/calibration/` asserting the JSON parses and matches the documented inputs. 
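The `K` computation listed under Included can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `intrinsics_from_factory_sheet` is hypothetical, the principal point is assumed to sit at the image centre (consistent with AC-3's `cx ≈ width/2`, `cy ≈ height/2`), and the focal length and sensor width in the usage lines are placeholder values, not Topotek factory data.

```python
from typing import List


def intrinsics_from_factory_sheet(
    focal_length_mm: float,
    sensor_width_mm: float,
    image_width_px: int,
    image_height_px: int,
) -> List[List[float]]:
    """Build a 3x3 pinhole matrix K from datasheet values (hypothetical helper)."""
    if sensor_width_mm <= 0 or image_width_px <= 0 or image_height_px <= 0:
        raise ValueError("sensor width and image dimensions must be positive")
    # Pixels per millimetre along the sensor's horizontal axis.
    px_per_mm = image_width_px / sensor_width_mm
    # Square pixels assumed, so fx == fy.
    fx = fy = focal_length_mm * px_per_mm
    # Assumption: principal point at the geometric image centre.
    cx = image_width_px / 2.0
    cy = image_height_px / 2.0
    return [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]


# Placeholder inputs only; substitute the real factory-sheet numbers.
K = intrinsics_from_factory_sheet(
    focal_length_mm=5.0,
    sensor_width_mm=5.0,
    image_width_px=1920,
    image_height_px=1080,
)
```

Since `fx` is linear in `focal_length_mm`, the 1–3 % focal-length residual named in the Problem section appears as the same relative error in `fx` and `fy`.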
+ +### Excluded +- Physical checkerboard calibration (needs hardware). +- PnP-from-tlog back-computation (deferred follow-up). +- Updating `adti26.json` or other test fixtures. + +## Acceptance Criteria + +**AC-1: Calibration JSON parses** +Given the new `khp20s30_factory.json` +When loaded by the project's calibration parser (same schema as `adti26.json`) +Then it parses without error and all fields are populated + +**AC-2: Doc updated** +Given `camera_info.md` before +When the calibration is committed +Then `camera_info.md` says "factory-sheet approximation; per-unit checkerboard refinement deferred — see " and lists the residual budget + +**AC-3: Unit test snapshot** +Given the new JSON +When the unit test runs +Then it asserts `fx == fy` (square pixels), `cx ≈ width/2`, `cy ≈ height/2`, distortion all zero + +**AC-4: T3 consumes this calibration** +Given AZ-699's `test_derkachi_real_tlog.py` +When it runs +Then it loads `khp20s30_factory.json` as `--camera-calibration` (no longer the `adti26.json` placeholder) + +## Non-Functional Requirements + +**Compatibility** +- JSON schema MUST be identical to existing calibration fixtures (`adti26.json`) — no schema changes in this task. + +## Unit Tests + +| AC Ref | What to Test | Required Outcome | +|--------|-------------|-----------------| +| AC-1 | JSON loads via existing parser | Object populated | +| AC-3 | Field values match factory inputs | fx == fy, cx/cy at centre, zero distortion | + +## Blackbox Tests + +| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References | +|--------|------------------------|-------------|-------------------|----------------| +| AC-4 | T3 test pointed at new JSON | T3 launches without calibration parse error | Test starts cleanly | Compat | + +## Constraints + +- MUST follow the calibration contract in `_docs/02_document/contracts/shared_helpers/descriptor_normaliser.md` (or wherever the camera-calibration schema lives). 
+- MUST be a single committed JSON — no generator script with side effects. + +## Risks & Mitigation + +**Risk 1: Factory data unavailable at required precision** +- *Risk*: Topotek does not publish the exact focal length / sensor width to the precision needed. +- *Mitigation*: Document the gap; ship with the best-available estimate; flag in `camera_info.md` so T3 surfaces the uncertainty in its failure message. + +**Risk 2: Residual error exceeds AC-3 budget** +- *Risk*: 1–3 % focal-length error may push horizontal error past 100 m at 1 km AGL. +- *Mitigation*: That's the honest finding. T3 reports it. A follow-up task can pursue checkerboard refinement if needed. diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index c46d847..f2e0b13 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -2,13 +2,13 @@ ## Current Step flow: existing-code -step: 9 -name: New Task -status: not_started +step: 10 +name: Implement +status: in_progress sub_step: phase: 0 name: awaiting-invocation - detail: "" + detail: "epic AZ-696 — 6 PBIs AZ-697..AZ-702 in todo/; impl order: T1+T6 → T2 → T3 → T4 → T5" retry_count: 0 cycle: 2 tracker: jira diff --git a/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md b/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md index 675851a..90dc272 100644 --- a/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md +++ b/_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md @@ -1,12 +1,11 @@ # D-CROSS-CVE-1 opencv-python pin deferred — gtsam/numpy ABI block **Recorded**: 2026-05-11T02:55+03:00 (Europe/Kyiv) -**Last replay attempt**: 2026-05-19T20:04+03:00 (Europe/Kyiv) — replay re-checked -at start of next `/autodev` invocation (~55 minutes after prior check at 19:09). 
-PyPI not re-queried this round (debounced — `gtsam` upstream state is highly -unlikely to publish numpy-2 wheels within a <2-hour window of the prior check, -and the previous check confirmed no movement). Replay condition (numpy>=2 -stable wheels) still NOT met. Leftover remains open. +**Last replay attempt**: 2026-05-20T13:59+03:00 (Europe/Kyiv) — replay re-checked +at start of next `/autodev` invocation (~17h after prior check at 2026-05-19 +20:04). PyPI re-queried via `pip index versions gtsam`: only `gtsam 4.2` +is published. Replay condition (numpy>=2 stable wheels) still NOT met. +Leftover remains open. **Status**: deferred-non-user (replay when upstream gtsam wheels target numpy>=2) ## What is blocked