mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 09:01:14 +00:00
[AZ-696] Cycle-2 Step 10 wrap-up: cumulative review, completeness gate, final report
Cumulative review (batches 98-102): PASS_WITH_WARNINGS — F1 module-layout stale (Medium/Arch) + F2 inline-import style nit (Low). No blocking findings. Completeness gate: PASS — all 6 cycle-2 tasks (AZ-697, AZ-702, AZ-698, AZ-699, AZ-700, AZ-701) verified PASS. Zero placeholder/stub/scaffold markers in production code; every named runtime dep integrated. Final implementation report hands off full-suite gate to Step 11 (Jetson e2e) — last Jetson run pre-dates all cycle-2 commits. Autodev state advanced to Step 11 (Run Tests), not_started. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,139 @@
|
|||||||
|
# Cumulative Code Review — Cycle 2 — Batches 98–102
|
||||||
|
|
||||||
|
**Date**: 2026-05-20
|
||||||
|
**Scope**: union of files changed across cycle-2 batches 98–102 (AZ-697, AZ-702, AZ-698, AZ-699, AZ-700, AZ-701)
|
||||||
|
**Mode**: cumulative (all 7 phases)
|
||||||
|
**Verdict**: **PASS_WITH_WARNINGS** (0 Critical, 0 High, 1 Medium, 1 Low)
|
||||||
|
**Baseline file**: `_docs/02_document/architecture_compliance_baseline.md` — **absent**, no `## Baseline Delta` section emitted (see Notes)
|
||||||
|
|
||||||
|
## Findings
|
||||||
|
|
||||||
|
| # | Severity | Category | File | Title |
|
||||||
|
|---|----------|----------|------|-------|
|
||||||
|
| F1 | Medium | Architecture | `_docs/02_document/module-layout.md` | Module layout stale — 6 new artefacts unregistered |
|
||||||
|
| F2 | Low | Maintainability | `src/gps_denied_onboard/replay_api/app.py:163-180` | Inline imports for stdlib + first-party modules |
|
||||||
|
|
||||||
|
### Finding Details
|
||||||
|
|
||||||
|
**F1: Module layout stale — 6 new artefacts unregistered** (Medium / Architecture)
|
||||||
|
|
||||||
|
- Location: `_docs/02_document/module-layout.md`
|
||||||
|
- Description: cycle-2 introduced six new package/file artefacts that are not registered in the authoritative file-ownership map:
|
||||||
|
- **New top-level package** `src/gps_denied_onboard/replay_api/` (AZ-701; 7 production files). No Per-Component Mapping entry, no Layering-table row, no Build-Time Exclusion Map row.
|
||||||
|
- **New files** in `src/gps_denied_onboard/replay_input/`: `tlog_ground_truth.py` (AZ-697) and `errors.py`. The Shared `replay_input` entry still lists only `__init__.py`, `interface.py`, `tlog_video_adapter.py`, `auto_sync.py`, `tests/`.
|
||||||
|
- **New CLI entrypoints** `src/gps_denied_onboard/cli/render_map.py` (AZ-700) and `src/gps_denied_onboard/cli/replay_api_entrypoint.py` (AZ-701). The Shared section has an entry for `cli/replay` but not for the two new console scripts.
|
||||||
|
- **New helpers** `src/gps_denied_onboard/helpers/gps_compare.py` (AZ-697/699) and `src/gps_denied_onboard/helpers/accuracy_report.py` (promoted in AZ-701). The Shared/helpers entries do not list them.
|
||||||
|
- Impact: `/implement` Step 4 (File Ownership) resolves a task's Component field against this file. Any future task touching `replay_api/`, the new CLI scripts, or these helpers will hit the BLOCKING ownership check at Step 4 — the skill explicitly STOPs when the component isn't found and forbids guessing from prose.
|
||||||
|
- Suggestion: Step 13 (Update Docs) for cycle 2 should append: (a) a `replay_api` component entry under "Per-Component Mapping" with Layer-5 classification and `[operator-tools]` build gate, (b) an updated `shared/replay_input` files list, (c) `shared/cli/render_map` and `shared/cli/replay_api` entries, (d) `shared/helpers/gps_compare` and `shared/helpers/accuracy_report` entries with Consumed-by lists.
|
||||||
|
- Task: AZ-701 (primary), AZ-697 / AZ-699 / AZ-700 (secondary)
|
||||||
|
|
||||||
|
**F2: Inline imports for stdlib + first-party modules** (Low / Maintainability)
|
||||||
|
|
||||||
|
- Location: `src/gps_denied_onboard/replay_api/app.py:163-180` (`_maybe_render_report` method body)
|
||||||
|
- Description: `_maybe_render_report` lazy-imports `json` (stdlib) plus three first-party modules (`helpers.accuracy_report`, `helpers.gps_compare`, `replay_input`). The same first-party modules are imported eagerly elsewhere in the call graph — `helpers/__init__.py` already pulls `accuracy_report` and `gps_compare` at package import time, and `app.py` itself eagerly imports neighbours like `replay_api.errors` / `interface` / `handlers` at the module top. Lazy import provides no measurable startup-time benefit here and inconsistently styles the file.
|
||||||
|
- Suggestion: hoist these imports to the top of `app.py`. Keep them inside the `try:` block only if there is a runtime reason (e.g. defensive import to tolerate missing optional dependency) — none of these four modules is optional in `[operator-tools]`.
|
||||||
|
- Task: AZ-701
|
||||||
|
|
||||||
|
## Verdict Logic
|
||||||
|
|
||||||
|
- 0 Critical → no FAIL trigger
|
||||||
|
- 0 High → no FAIL trigger
|
||||||
|
- 1 Medium → triggers PASS_WITH_WARNINGS
|
||||||
|
- 1 Low → triggers PASS_WITH_WARNINGS
|
||||||
|
- Result: **PASS_WITH_WARNINGS** — implement skill auto-fix gate does not block; findings reported as info to the user.
|
||||||
|
|
||||||
|
## Phase-by-Phase Notes
|
||||||
|
|
||||||
|
### Phase 1 — Context Loading
|
||||||
|
|
||||||
|
Inputs read:
|
||||||
|
- Task specs: `AZ-697`, `AZ-698`, `AZ-699`, `AZ-700`, `AZ-701`, `AZ-702` (all in `_docs/02_tasks/done/`)
|
||||||
|
- Batch reports: `batch_98_cycle2_report.md` … `batch_102_cycle2_report.md`
|
||||||
|
- Architecture / layout: `_docs/02_document/module-layout.md`, `_docs/02_document/architecture.md`
|
||||||
|
- Restrictions / solution overview: not re-read (already covered in per-batch reviews)
|
||||||
|
|
||||||
|
### Phase 2 — Spec Compliance
|
||||||
|
|
||||||
|
Per-batch code reviews (Step 9 of `/implement`) already verified each task's ACs against its implementation. Cumulative scope checks **only** the points where cross-batch promises interact:
|
||||||
|
|
||||||
|
- AZ-701 contract requires consuming AZ-700's `gps-denied-render-map` and AZ-699's accuracy report. Confirmed:
|
||||||
|
- `replay_api/app.py::_maybe_render_map` invokes `gps-denied-render-map` via `subprocess.run` with argv list. ✅
|
||||||
|
- `replay_api/app.py::_maybe_render_report` calls `helpers.accuracy_report.render_report` directly. ✅
|
||||||
|
- AZ-697's `helpers/gps_compare.py` (`l2_horizontal_m`, `match_percentage`, `GroundTruthRow`) is consumed by AZ-699's `HorizontalErrorDistribution` and by AZ-701's runner. No duplicate implementations.
|
||||||
|
- AZ-698's `--auto-trim` CLI flag in `cli/replay.py` is invoked from AZ-701's subprocess argv when `inputs.auto_trim` is set. Wired correctly.
|
||||||
|
|
||||||
|
No Spec-Gap findings.
|
||||||
|
|
||||||
|
### Phase 3 — Code Quality
|
||||||
|
|
||||||
|
All cycle-2 production modules have module docstrings explaining intent. No 50+-line functions detected; all DTOs are `@dataclass(frozen=True[, slots=True])` matching the project pattern. Tests follow Arrange/Act/Assert per coderule.
|
||||||
|
|
||||||
|
The only quality nit is F2 (inline imports in `_maybe_render_report`).
|
||||||
|
|
||||||
|
### Phase 4 — Security Quick-Scan
|
||||||
|
|
||||||
|
- `subprocess.run` in `replay_api/app.py` uses argv lists (no `shell=True`). ✅
|
||||||
|
- `subprocess.run` has `timeout` parameter. ✅
|
||||||
|
- `json.loads` on JSONL emissions in `cli/render_map.py` — safe (json, not pickle).
|
||||||
|
- Bearer-token validation in `replay_api/handlers.py`; default-on (`REPLAY_API_AUTH_REQUIRED=true` unless explicit dev opt-out).
|
||||||
|
- Magic-byte validation on multipart uploads (AC-9): MAVLink `\xfd`/`\xfe` for `.tlog`, `ftyp` for `.mp4`.
|
||||||
|
- Per-job storage directory is wiped on `release_job` and `cleanup_all` — no residual artefacts.
|
||||||
|
- Subprocess zero-bytes signing-key file (`signing_key_path.write_bytes(b"\x00" * 32)`) — acceptable in replay mode (paired with `noop_mavlink_transport`; outbound bytes are never transmitted) and the file is scoped to the per-job temp dir. Not a finding.
|
||||||
|
|
||||||
|
No Security findings.
|
||||||
|
|
||||||
|
### Phase 5 — Performance Scan
|
||||||
|
|
||||||
|
- `replay_api/jobs.py` uses `ThreadPoolExecutor` with `max_concurrent` and `max_queued` caps; excess returns HTTP 429. ✅
|
||||||
|
- `python-multipart` buffers each part in memory; bounded by `REPLAY_API_MAX_UPLOAD_BYTES`. Documented limitation, not a finding.
|
||||||
|
- `helpers/gps_compare.py` `match_percentage` uses `bisect_left` for O(log N) timestamp lookup. ✅
|
||||||
|
- `cli/render_map.py` reads JSONL line-by-line; no full-file slurp. ✅
|
||||||
|
|
||||||
|
No Performance findings.
|
||||||
|
|
||||||
|
### Phase 6 — Cross-Task Consistency
|
||||||
|
|
||||||
|
Cross-task wiring across the 5 batches:
|
||||||
|
|
||||||
|
- **Module promotion**: `tests/e2e/replay/_report_writer.py` (batch 100) → `src/gps_denied_onboard/helpers/accuracy_report.py` (batch 102). All four consumers re-pointed:
|
||||||
|
- `tests/e2e/replay/test_derkachi_real_tlog.py` (line 49)
|
||||||
|
- `tests/unit/test_az699_report_writer.py` (line 23)
|
||||||
|
- `src/gps_denied_onboard/helpers/__init__.py` (line 19)
|
||||||
|
- `src/gps_denied_onboard/replay_api/app.py` (line 167, inline)
|
||||||
|
Old path deleted from disk. No orphaned imports.
|
||||||
|
- **Symbol uniqueness**: `class HorizontalErrorDistribution`, `def load_tlog_ground_truth`, `def render_report`, `class TlogGroundTruth` — each defined exactly once. No duplicates.
|
||||||
|
- **AZ-698 → AZ-701 plumbing**: `--auto-trim` flag added to `cli/replay.py` (batch 99) is consumed by `replay_api/app.py` argv builder (batch 102). `AlignedWindow` DTO + `ReplayAutoSyncConfig` are threaded through composition root (`runtime_root/_replay_branch.py`) to `tlog_video_adapter` to `c8_fc_adapter.tlog_replay_adapter` — consistent signatures across all four batches.
|
||||||
|
|
||||||
|
No Cross-Task Consistency findings.
|
||||||
|
|
||||||
|
### Phase 7 — Architecture Compliance
|
||||||
|
|
||||||
|
**Layer-direction analysis** (against module-layout.md "Allowed Dependencies"):
|
||||||
|
|
||||||
|
- `replay_api/*` is best classified as Layer 5 (Entry / Composition). All its imports stay within `replay_api/*` plus stdlib + FastAPI / Pydantic / python-multipart (third-party). The deferred `helpers.accuracy_report` / `helpers.gps_compare` / `replay_input` imports inside `_maybe_render_report` reach Layer 1 / Layer 4 — both lower than Layer 5. ✅
|
||||||
|
- `cli/render_map.py` (Layer 5) imports `replay_input.load_tlog_ground_truth` (Layer 4). ✅
|
||||||
|
- `cli/replay_api_entrypoint.py` (Layer 5) imports `replay_api.app` / `replay_api.storage` (Layer 5 self-layer). ✅
|
||||||
|
- `replay_input/tlog_ground_truth.py` (Layer 4) imports only `replay_input.errors` (intra-package). ✅
|
||||||
|
- `replay_input/tlog_video_adapter.py` (Layer 4) imports `c8_fc_adapter.errors` + `c8_fc_adapter.tlog_replay_adapter` (Layer 4). Layer-4 → Layer-4 is explicitly **documented** in `module-layout.md` § shared/replay_input ("instantiates Layer-4 strategies `c8_fc_adapter.tlog_replay_adapter`, `frame_source.video_file_frame_source`"). ✅
|
||||||
|
- `runtime_root/_replay_branch.py` (Layer 5) imports `replay_input` (Layer 4) + `c8_fc_adapter` (Layer 4) + Layer-1 shareds. ✅
|
||||||
|
- `c8_fc_adapter/tlog_replay_adapter.py` (Layer 4) imports stay within `c8_fc_adapter/*` plus Layer-1 shareds. ✅
|
||||||
|
|
||||||
|
**Public-API respect**: every cross-component import goes through the target component's Public API surface — no internal-file imports detected.
|
||||||
|
|
||||||
|
**Cyclic-dependency check**: cycle-2's import edges form a strict DAG: `replay_api → helpers + replay_input + (subprocess) cli/replay + cli/render_map`. `replay_input` does not depend on `replay_api`. `helpers` does not depend on anything cycle-2-added. No new cycles.
|
||||||
|
|
||||||
|
**Duplicate-symbol check**: see Phase 6 — no duplicates.
|
||||||
|
|
||||||
|
**Cross-cutting concerns**: none re-implemented locally. Logging, FDR, helpers, clock, config — all consumed from their canonical Layer-1 homes.
|
||||||
|
|
||||||
|
**Single finding**: F1 — module-layout.md is the authoritative file-ownership map and is stale relative to the cycle-2 work. The code itself respects the layering; the documentation has not caught up. Severity Medium (Architecture) because the staleness blocks `/implement` Step 4 on the next task that touches these areas.
|
||||||
|
|
||||||
|
## Notes
|
||||||
|
|
||||||
|
- **No `## Baseline Delta` section**: `_docs/02_document/architecture_compliance_baseline.md` was identified in the 2026-05-20 LESSONS entry as a cycle-2 Step 6 (Decompose) prerequisite, seeded with 0 violations from the structural snapshot in `_docs/06_metrics/structure_2026-05-20.md`. The baseline file does not exist at the time of this review. Partition against baseline therefore not possible. This is captured as a separate follow-up (retrospective / next-cycle decompose), not a per-batch finding.
|
||||||
|
- **Cumulative-review cadence drift**: per `/implement` Step 14.5, a cumulative review was due after batch 100 (3rd batch into cycle 2). It was not run; this is the make-up review covering all 5 cycle-2 batches in one pass. Cadence drift should be noted in the cycle-2 retrospective.
|
||||||
|
|
||||||
|
## Artifacts
|
||||||
|
|
||||||
|
- Verdict consumed by: `/implement` Step 14.5 gate (PASS_WITH_WARNINGS → proceed to Step 15 Product Implementation Completeness Gate)
|
||||||
|
- Findings (F1, F2) carried forward to the cycle-2 retrospective for action assignment
|
||||||
@@ -0,0 +1,81 @@
|
|||||||
|
# Product Implementation Completeness Gate — Cycle 2
|
||||||
|
|
||||||
|
**Date**: 2026-05-20
|
||||||
|
**Scope**: 6 product tasks completed in cycle 2 — AZ-697, AZ-702, AZ-698, AZ-699, AZ-700, AZ-701 (epic AZ-696 "operator-side replay tooling")
|
||||||
|
**Gate verdict**: **PASS** — all 6 tasks classified PASS, 0 BLOCKED, 0 FAIL
|
||||||
|
**Final test run handoff**: Step 11 (Run Tests) on the Jetson harness — see Notes
|
||||||
|
|
||||||
|
## Per-task Classification
|
||||||
|
|
||||||
|
| Task | Component | Status | Production code | Named runtime deps integrated | AC coverage |
|
||||||
|
|------|-----------|--------|-----------------|-------------------------------|-------------|
|
||||||
|
| AZ-697 | replay_input + helpers | **PASS** | `replay_input/tlog_ground_truth.py`, `helpers/gps_compare.py` (`GroundTruthRow`, `l2_horizontal_m`, `match_percentage`) | `pymavlink.mavutil` (binary tlog reader) | 5/5 ACs |
|
||||||
|
| AZ-702 | _docs/input_data | **PASS** | `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json` + `tests/e2e/replay/conftest.py::_calibration_path()` | n/a (static factory-sheet JSON) | 4/4 ACs |
|
||||||
|
| AZ-698 | replay_input + cli + runtime_root | **PASS** | `replay_input/auto_sync.py` + `replay_input/interface.AlignedWindow` + `--auto-trim` flag in `cli/replay.py` + `c8_fc_adapter/tlog_replay_adapter.py` (`tlog_start_ns` skip) + composition-root plumbing | OpenCV optical-flow, numpy NCC | 7/7 ACs (incl. AC-9 frame-window-match validator) |
|
||||||
|
| AZ-699 | helpers + tests/e2e/replay | **PASS** | `helpers/gps_compare.HorizontalErrorDistribution` + `helpers/accuracy_report.py` (promoted in AZ-701) + `tests/e2e/replay/test_derkachi_real_tlog.py` | real `gps-denied-replay --auto-trim` subprocess (AZ-402 + AZ-698) | All ACs; PASS/FAIL test gated on Jetson e2e |
|
||||||
|
| AZ-700 | cli + helpers | **PASS** | `cli/render_map.py` (`gps-denied-render-map` console-script) | folium / Leaflet (gated under `[operator-tools]` extra; airborne cold-start untouched) | 14/14 unit tests |
|
||||||
|
| AZ-701 | replay_api + cli + helpers | **PASS** | `replay_api/{__init__,errors,interface,storage,jobs,handlers,app}.py` + `cli/replay_api_entrypoint.py` + `helpers/accuracy_report.py` + Docker image | FastAPI/uvicorn/python-multipart (operator-only); real subprocess to `gps-denied-replay` + `gps-denied-render-map` | 7 ACs (AC-1, AC-2, AC-3, AC-5, AC-6, AC-8, AC-9) all covered by 18 unit tests |
|
||||||
|
|
||||||
|
## Evidence Checked
|
||||||
|
|
||||||
|
### Unresolved-placeholder scan (markers: `TODO`, `FIXME`, `XXX`, `HACK`, `NotImplemented`, `placeholder`, `scaffold`, `stub`, `deterministic fallback`, `fake runner`, `native bridge`)
|
||||||
|
|
||||||
|
- `src/gps_denied_onboard/replay_api/*` — 1 hit in `__init__.py:14` ("a fake runner") inside a module docstring describing the unit-test DI pattern. Not a code stub. ✅
|
||||||
|
- `src/gps_denied_onboard/replay_input/*` — 0 hits. ✅
|
||||||
|
- `src/gps_denied_onboard/helpers/gps_compare.py` — 0 hits. ✅
|
||||||
|
- `src/gps_denied_onboard/helpers/accuracy_report.py` — 2 hits at lines 56 / 90, both docstrings describing the `calibration_method` STRING FIELD which can take value `"factory-sheet"` (AZ-702 path) or `"placeholder"` (legacy adti26 fallback label). Not unimplemented code. ✅
|
||||||
|
- `src/gps_denied_onboard/cli/render_map.py` — 0 hits. ✅
|
||||||
|
- `src/gps_denied_onboard/cli/replay_api_entrypoint.py` — 0 hits. ✅
|
||||||
|
|
||||||
|
No unresolved placeholder / stub / scaffold / native-bridge markers in any cycle-2 production module.
|
||||||
|
|
||||||
|
### Named-runtime-dependency integration
|
||||||
|
|
||||||
|
| Task | Promised dependency | Where integrated | Adapter or real call? |
|
||||||
|
|------|---------------------|------------------|-----------------------|
|
||||||
|
| AZ-697 | `pymavlink.mavutil` binary tlog reader | `replay_input/tlog_ground_truth.py::_load_with_mavutil` (lazy import → `mavutil.mavlink_connection`) | Real call; missing `pymavlink` raises `ReplayInputAdapterError` |
|
||||||
|
| AZ-698 | OpenCV optical flow, numpy NCC | `replay_input/auto_sync.py::_compute_video_onset_from_samples` (cv2 pyramidal Lucas-Kanade) + `_frame_window_match` | Real calls; lazy import gated on env |
|
||||||
|
| AZ-699 | end-to-end `gps-denied-replay --auto-trim` invocation | `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_derkachi_flight_passes_ac3` shells out via `subprocess.run([…, "--auto-trim"])` | Real subprocess against the production console script |
|
||||||
|
| AZ-700 | `folium` HTML map rendering | `cli/render_map.py::render_map_html` constructs `folium.Map` + adds `folium.PolyLine`, `folium.Marker`, `folium.Circle` layers | Real folium calls |
|
||||||
|
| AZ-701 | FastAPI/uvicorn HTTP, real pipeline subprocess | `replay_api/app.py::SubprocessReplayRunner.run` calls `subprocess.run([self._replay_binary, …, "--auto-trim"], …)` with argv list (no `shell=True`); then `_maybe_render_map` invokes `gps-denied-render-map` likewise | Real subprocess; tests inject a fake `ReplayRunner` via the Protocol DI seam |
|
||||||
|
| AZ-702 | Topotek KHP20S30 factory-sheet intrinsics | JSON committed to `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json`; `_calibration_path()` in `conftest.py` prefers it when present | Real data; assumptions documented in `camera_info.md` |
|
||||||
|
|
||||||
|
Every named dependency is integrated as production behaviour. No deterministic fallbacks substituted for promised runtime calls, no empty `native/` packages, no Protocol-only surfaces shipped without implementation.
|
||||||
|
|
||||||
|
### End-to-end production pipeline (Step 7 of gate)
|
||||||
|
|
||||||
|
The cycle-2 architecture promise — *"operator uploads (tlog + video [+ calibration]); receives back GPS-fix JSONL + accuracy report + HTML map"* — has an executable production path wired through:
|
||||||
|
|
||||||
|
1. `POST /replay` → `replay_api.app` route handler
|
||||||
|
2. `replay_api.handlers.validate_*` magic-byte checks + bearer-token extraction
|
||||||
|
3. `replay_api.storage.StorageRoot.allocate_job` per-job temp dir
|
||||||
|
4. `replay_api.jobs.JobRegistry.submit` queues to `ThreadPoolExecutor`
|
||||||
|
5. `replay_api.app.SubprocessReplayRunner.run` shells out to **`gps-denied-replay --auto-trim`** (AZ-402 + AZ-698)
|
||||||
|
6. `emissions.jsonl` written to the per-job dir
|
||||||
|
7. `SubprocessReplayRunner._maybe_render_report` loads ground truth via `replay_input.load_tlog_ground_truth` (AZ-697), computes `helpers.gps_compare.horizontal_error_distribution` (AZ-697/699), and renders `helpers.accuracy_report.render_report` (AZ-699/701)
|
||||||
|
8. `SubprocessReplayRunner._maybe_render_map` invokes **`gps-denied-render-map`** (AZ-700)
|
||||||
|
9. Job state transitions to `DONE`; static endpoints serve the three artefacts under stable URLs (`/jobs/{id}/result`, `/jobs/{id}/report`, `/jobs/{id}/map`)
|
||||||
|
|
||||||
|
Each step is real code calling real dependencies, not a stub or harness-only orchestration.
|
||||||
|
|
||||||
|
### Internal vs external prerequisites (Step 5 of gate)
|
||||||
|
|
||||||
|
All cycle-2 work is operator-side (no airborne path touched). The only external prerequisite for any task is the **real Derkachi flight artefacts** (`derkachi.tlog`, `flight_derkachi.mp4`, `khp20s30_factory.json`) — and that prerequisite is satisfied within the repo (`_docs/00_problem/input_data/flight_derkachi/`). AZ-699's e2e test gates with explicit `RUN_REPLAY_E2E=1` env + minimum video size + console-script presence (skip-with-reason, not `@xfail` / silent-skip), which matches the gate rule: "Environment-gated tests may skip only with an explicit prerequisite reason; they do not make missing production code complete."
|
||||||
|
|
||||||
|
No BLOCKED classifications — every task has its production code complete.
|
||||||
|
|
||||||
|
## Cross-Cutting Findings (none block this gate)
|
||||||
|
|
||||||
|
- **F1 from cumulative review** (`module-layout.md` is stale for the 6 new artefacts): an Architecture documentation gap, not a production-code gap. Production code respects the layering; the doc has not caught up. Owned by Step 13 (Update Docs).
|
||||||
|
- **F2 from cumulative review** (inline imports in `replay_api/app.py:_maybe_render_report`): style nit, Low severity.
|
||||||
|
|
||||||
|
Neither finding is a Completeness Gate FAIL trigger.
|
||||||
|
|
||||||
|
## Verdict
|
||||||
|
|
||||||
|
**PASS** — all 6 cycle-2 product tasks are classified PASS. The implement skill may write the final implementation report and hand off to Step 11 (Run Tests) for the e2e Jetson reality-gate execution.
|
||||||
|
|
||||||
|
## Notes
|
||||||
|
|
||||||
|
- **Reality-gate signal still owed**: this gate verifies production code shape, not end-to-end runtime behaviour. The first actual end-to-end exercise of cycle-2 production code (real Derkachi tlog + real video + factory calibration through `replay_api` → `gps-denied-replay` → `gps-denied-render-map` → accuracy report) is owned by Step 11 on the Jetson harness. The completeness gate does not pre-emptively grant Step 11 a PASS.
|
||||||
|
- **No remediation tasks created**: the gate's only outputs are the F1 doc-drift and F2 style nit, both handled within existing cycle-2 steps (Step 13 Update Docs + retrospective). No `remediate_*.md` files added to `_docs/02_tasks/todo/`.
|
||||||
@@ -0,0 +1,79 @@
|
|||||||
|
# Implementation Report — Operator Replay Tooling — Cycle 2
|
||||||
|
|
||||||
|
**Cycle**: 2
|
||||||
|
**Feature slug**: `operator_replay_tooling`
|
||||||
|
**Epic**: AZ-696
|
||||||
|
**Date**: 2026-05-20
|
||||||
|
**Tasks shipped (6)**: AZ-697, AZ-702, AZ-698, AZ-699, AZ-700, AZ-701 (total: 24 story points)
|
||||||
|
**Batches**: 98 → 102 (5 batches)
|
||||||
|
**Status**: implementation phase complete; handing off full-suite gate to Step 11 (Run Tests) per `implement/SKILL.md` Step 16
|
||||||
|
|
||||||
|
## What shipped
|
||||||
|
|
||||||
|
A new operator-side post-flight replay path. Given a recorded ArduPilot binary `.tlog` + a flight video + a camera calibration, the operator now has three ways to consume the gps-denied estimator's output:
|
||||||
|
|
||||||
|
1. **Direct CLI** — `gps-denied-replay --auto-trim` (extended in AZ-698) emits GPS-fix JSONL; **`gps-denied-render-map`** (new in AZ-700) turns the JSONL plus the binary tlog ground truth into a self-contained HTML map; the AZ-699 e2e runner wraps both and produces a Markdown accuracy report with an explicit PASS/FAIL verdict against the AZ-696 epic AC-3 (≥ 80 % of emissions within 100 m).
|
||||||
|
2. **HTTP API** — new `replay_api` component (AZ-701): `POST /replay` accepts multipart `(tlog + video [+ calibration])` and returns either a synchronous 200 with the result URLs or an async 202 with a job id for polling. Static endpoints serve the JSONL, accuracy report, and HTML map per job.
|
||||||
|
3. **Programmatic** — `gps_denied_onboard.replay_input.load_tlog_ground_truth` (AZ-697) + `gps_denied_onboard.helpers.gps_compare.horizontal_error_distribution` (AZ-699) + `gps_denied_onboard.helpers.accuracy_report.render_report` (promoted to production from a test helper in AZ-701) are now stable importable surfaces.
|
||||||
|
|
||||||
|
Camera calibration for the Topotek KHP20S30 gimbal (the Derkachi flight's optical chain) is committed as a factory-sheet JSON (AZ-702), so the AZ-699 e2e run reproduces with no operator-side intrinsics work.
|
||||||
|
|
||||||
|
The whole tooling family is **operator-side only** — packaged under the `[operator-tools]` extra in `pyproject.toml`, served by a separate `docker/replay-api.Dockerfile`, and never linked into the airborne companion-tier1 binary. ADR-002 (binary-exclusion) is preserved.
|
||||||
|
|
||||||
|
## Per-task summary
|
||||||
|
|
||||||
|
| Task | Pts | What it delivers |
|
||||||
|
|------|----|------------------|
|
||||||
|
| **AZ-697** tlog ground-truth extractor | 3 | `replay_input/tlog_ground_truth.py` streams `GLOBAL_POSITION_INT` / `GPS_RAW_INT` from the binary tlog into a typed `TlogGroundTruth` DTO. Comparison kernels (`GroundTruthRow`, `l2_horizontal_m`, `match_percentage`) promoted to `helpers/gps_compare.py`. |
|
||||||
|
| **AZ-702** KHP20S30 factory calibration | 2 | `_docs/00_problem/input_data/flight_derkachi/khp20s30_factory.json` + assumptions documented in `camera_info.md`. `conftest.py::_calibration_path()` prefers it when present. |
|
||||||
|
| **AZ-698** tlog trim + mid-flight alignment | 5 | `replay_input/auto_sync.py` NCC aligner + `AlignedWindow` DTO + `auto_trim` flag through `ReplayConfig`/`ReplayAutoSyncConfig` + `--auto-trim` CLI flag on `gps-denied-replay` + `tlog_start_ns` skip in `TlogReplayFcAdapter`. |
|
||||||
|
| **AZ-699** real-flight validation runner | 3 | `helpers/gps_compare.HorizontalErrorDistribution` + `helpers/accuracy_report.py` (originally in `tests/e2e/replay/_report_writer.py`; promoted in AZ-701) + `tests/e2e/replay/test_derkachi_real_tlog.py` honest PASS/FAIL gate (no `@xfail` mask). |
|
||||||
|
| **AZ-700** replay map visualization | 3 | `cli/render_map.py` console script renders folium / Leaflet HTML of estimated vs truth tracks with start/end markers, 100 m / 50 m scale circles, optional summary banner. `--offline-tiles` mode for Jetsons without internet. |
|
||||||
|
| **AZ-701** HTTP replay API service | 5 | `replay_api/*` (7 files): FastAPI factory, `JobRegistry` with concurrency/queue caps, `ReplayRunner` Protocol DI seam, `SubprocessReplayRunner` production wiring, magic-byte upload validation (AC-9), bearer-token auth (AC-5), in-memory state by design. 18 unit tests pass. Docker image + compose profile + OpenAPI spec exported. |
|
||||||
|
|
||||||
|
## Cumulative review summary
|
||||||
|
|
||||||
|
`_docs/03_implementation/cumulative_review_batches_98-102_cycle2_report.md` — **PASS_WITH_WARNINGS**:
|
||||||
|
|
||||||
|
- **F1 (Medium / Architecture)** — `_docs/02_document/module-layout.md` is stale. Six new artefacts (replay_api component, two new `replay_input/` files, two new `cli/` entrypoints, two new `helpers/` modules) are not registered. Production code respects the layering; the doc has not caught up. Carried into Step 13 (Update Docs).
|
||||||
|
- **F2 (Low / Maintainability)** — `replay_api/app.py::_maybe_render_report` uses inline imports for stdlib `json` plus three first-party modules that are already eagerly imported elsewhere. Style nit; lazy import provides no measurable benefit. Owner: AZ-701 follow-up.
|
||||||
|
|
||||||
|
No Critical or High findings. No new cyclic dependencies. No duplicate symbols across batches. Module promotion (`_report_writer.py` → `helpers/accuracy_report.py`) was clean; all four consumers re-pointed and the old file deleted from disk.
|
||||||
|
|
||||||
|
## Product completeness gate summary
|
||||||
|
|
||||||
|
`_docs/03_implementation/implementation_completeness_cycle2_report.md` — **PASS** (all 6 tasks PASS, 0 BLOCKED, 0 FAIL):
|
||||||
|
|
||||||
|
- No unresolved-placeholder markers in any cycle-2 production module (only docstring mentions of test-pattern terminology in `replay_api/__init__.py:14` and string-field documentation in `helpers/accuracy_report.py:56,90`).
|
||||||
|
- Every named runtime dependency is integrated as production behaviour, not as a Protocol-only surface, deterministic fallback, fake runner, or "native bridge".
|
||||||
|
- The end-to-end production pipeline is wired: HTTP request → multipart validation → job queue → `gps-denied-replay --auto-trim` subprocess → `gps-denied-render-map` subprocess → accuracy report renderer → static result endpoints.
|
||||||
|
|
||||||
|
## Deviations from the implement skill workflow
|
||||||
|
|
||||||
|
Two deviations from `implement/SKILL.md`, both already justified in the corresponding reports:
|
||||||
|
|
||||||
|
1. **Step 14.5 cumulative-review cadence drift**: the cadence trigger (default K=3) should have fired after batch 100 mid-cycle. It did not. The make-up cumulative review at the end of cycle (covering all 5 batches 98–102 in one pass) is `cumulative_review_batches_98-102_cycle2_report.md`. To be flagged in the cycle-2 retrospective.
|
||||||
|
2. **No `## Baseline Delta` section in the cumulative review**: per the 2026-05-20 LESSONS entry, `_docs/02_document/architecture_compliance_baseline.md` should have been seeded as a cycle-2 Step 6 (Decompose) prerequisite with 0 violations from `_docs/06_metrics/structure_2026-05-20.md`. The file does not exist; cumulative review therefore could not emit Baseline Delta. To be addressed in cycle-2 retrospective (or the next decompose).
|
||||||
|
|
||||||
|
## Test status at handoff
|
||||||
|
|
||||||
|
- **AZ-701 unit slice** (`tests/unit/replay_api/`): 18/18 pass in 4 s (batch 102).
|
||||||
|
- **Full unit suite** (last run at batch 102): 2251 passed, 86 skipped, 1 failed. The single failure (`tests/unit/c12_operator_orchestrator/test_cli_console_script.py::TestConsoleScript::test_cold_start_under_500ms_p99`) is a pre-existing C12 CLI cold-start performance flake first reported in batch 100, unrelated to any cycle-2 task. To be handled in Step 11.
|
||||||
|
- **Mypy --strict** on cycle-2 surface: clean across all new modules.
|
||||||
|
- **e2e Jetson harness against cycle-2 production code**: **not yet run**. The last Jetson e2e run on file (`_docs/03_implementation/jetson_runs/2026-05-19_az618_tier2_run.txt`) predates the cycle-2 commits (`64d961f` … `7d53cef` all dated 2026-05-20). This is the explicit reason Step 11 (Run Tests) must follow this report.
|
||||||
|
|
||||||
|
## Full-suite handoff
|
||||||
|
|
||||||
|
Per `implement/SKILL.md` Step 16: "If the next flow step is `Run Tests`, record a handoff in the final implementation report and let `.cursor/skills/test-run/SKILL.md` own the full-suite gate to avoid duplicate full runs." The next flow step IS Step 11 (existing-code flow auto-chain). This report transfers the full-suite gate to `test-run/SKILL.md`, which on the cycle-2 invocation must:
|
||||||
|
|
||||||
|
1. Run the Tier-1 Docker harness (Colima) to catch any pure-Python regressions from the 5 cycle-2 batches.
|
||||||
|
2. Run the Tier-2 Jetson harness (`scripts/run-tests-jetson.sh`) with `RUN_REPLAY_E2E=1`. The Jetson run executes `tests/e2e/replay/test_derkachi_real_tlog.py` against the real `derkachi.tlog` + the real `flight_derkachi.mp4` + the `khp20s30_factory.json` calibration. Verdict is honest PASS/FAIL on the AZ-696 epic AC-3 threshold (≥ 80 % of emissions within 100 m).
|
||||||
|
3. Surface the C12 cold-start flake from batch 100 / 101 / 102 and decide whether it remains a pre-existing flake or now requires a remediation task.
|
||||||
|
|
||||||
|
## Artefacts
|
||||||
|
|
||||||
|
- Cumulative review: `_docs/03_implementation/cumulative_review_batches_98-102_cycle2_report.md`
|
||||||
|
- Completeness gate: `_docs/03_implementation/implementation_completeness_cycle2_report.md`
|
||||||
|
- Batch reports: `_docs/03_implementation/batch_{98,99,100,101,102}_cycle2_report.md`
|
||||||
|
- Cycle-2 commits: `64d961f` (AZ-697 + AZ-702) → `f5366bb` (AZ-698) → `dcde602` (AZ-699) → `b66b68f` (AZ-700) → `7d53cef` (AZ-701)
|
||||||
|
- Carry-forward findings for retro: cumul-review F1 (module-layout drift) + F2 (inline-import style nit) + cadence drift + missing architecture_compliance_baseline.md
|
||||||
@@ -2,13 +2,13 @@
|
|||||||
|
|
||||||
## Current Step
|
## Current Step
|
||||||
flow: existing-code
|
flow: existing-code
|
||||||
step: 10
|
step: 11
|
||||||
name: Implement
|
name: Run Tests
|
||||||
status: in_progress
|
status: not_started
|
||||||
sub_step:
|
sub_step:
|
||||||
phase: 6
|
phase: 0
|
||||||
name: implement-tasks-sequentially
|
name: awaiting-invocation
|
||||||
detail: "batch 102 of ~102: AZ-701"
|
detail: "cycle-2 implementation complete; Step 11 must execute Jetson e2e against new cycle-2 production code"
|
||||||
retry_count: 0
|
retry_count: 0
|
||||||
cycle: 2
|
cycle: 2
|
||||||
tracker: jira
|
tracker: jira
|
||||||
|
|||||||
@@ -1,9 +1,9 @@
|
|||||||
# D-CROSS-CVE-1 opencv-python pin deferred — gtsam/numpy ABI block
|
# D-CROSS-CVE-1 opencv-python pin deferred — gtsam/numpy ABI block
|
||||||
|
|
||||||
**Recorded**: 2026-05-11T02:55+03:00 (Europe/Kyiv)
|
**Recorded**: 2026-05-11T02:55+03:00 (Europe/Kyiv)
|
||||||
**Last replay attempt**: 2026-05-20T13:59+03:00 (Europe/Kyiv) — replay re-checked
|
**Last replay attempt**: 2026-05-20T17:34+03:00 (Europe/Kyiv) — replay re-checked
|
||||||
at start of next `/autodev` invocation (~17h after prior check at 2026-05-19
|
at start of next `/autodev` invocation (~3.5h after prior check at 2026-05-20
|
||||||
20:04). PyPI re-queried via `pip index versions gtsam`: only `gtsam 4.2`
|
13:59). PyPI re-queried via `pip index versions gtsam`: only `gtsam 4.2`
|
||||||
is published. Replay condition (numpy>=2 stable wheels) still NOT met.
|
is published. Replay condition (numpy>=2 stable wheels) still NOT met.
|
||||||
Leftover remains open.
|
Leftover remains open.
|
||||||
**Status**: deferred-non-user (replay when upstream gtsam wheels target numpy>=2)
|
**Status**: deferred-non-user (replay when upstream gtsam wheels target numpy>=2)
|
||||||
|
|||||||
Reference in New Issue
Block a user