[AZ-700] gps-denied-render-map: HTML map of estimated vs truth tracks

New operator-side console-script renders a self-contained HTML map
(folium / Leaflet) comparing the estimator's JSONL track against
the tlog ground-truth track. Pinned visual style: red truth + blue
estimated polylines, start/end markers per track, 100 m + 50 m
scale circles, optional AZ-699 accuracy-summary banner, and an
--offline-tiles mode (with optional local tile-URL template) for
Jetsons without internet.

folium is gated behind a new [operator-tools] optional-dep so the
airborne binary's cold-start NFR is unaffected (C12 binary doesn't
import the new module). 14 new unit tests pin polyline count,
marker count, scale-circle radii, summary embedding, offline-tile
behaviour, and full CLI smoke. Zero mypy --strict errors.

Refines the 2026-05-20 Jetson-only test policy: unit tests may run
locally, e2e/perf/resilience/security stay Jetson-only. Documented
in _docs/02_document/tests/environment.md (Where each tier runs)
and .cursor/rules/testing.mdc (Test environment for this project).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-20 17:04:01 +03:00
parent dcde602f61
commit b66b68ff76
8 changed files with 943 additions and 17 deletions
+48 -15
@@ -1,17 +1,40 @@
# Test Environment
> **Active policy — 2026-05-20**: **all tests run on Jetson only.** The Jetson
> Orin Nano Super (or a Jetson-equivalent arm64 agent) is the single canonical
> test environment for every tier of testing — unit, integration, blackbox /
> e2e, performance, resilience, security, resource-limit. Workstation x86
> Docker (the historical "Tier-1" path) is **deprecated** and is not a
> supported test environment going forward; the Tier-1 sections below are
> retained as historical reference / traceability only. CI test pipelines
> target the colocated arm64 Jetson Woodpecker agent (see
> `_docs/04_deploy/ci_cd_pipeline.md`); local-development test runs SHOULD
> use `scripts/run-tests-jetson.sh` against the configured `jetson-e2e` SSH
> alias rather than `scripts/run-tests.sh`. This decision supersedes the
> 2026-05-09 "both" decision recorded in the § Test Execution section.
> **Active policy — 2026-05-20 (refined)**: the canonical CI / release-gate
> test environment is the Jetson Orin Nano Super (or a Jetson-equivalent
> arm64 agent). **Unit tests** (`pytest tests/unit/`) MAY be run on a local
> developer workstation for fast iteration — they are hardware-agnostic by
> construction, the suite is fully synthetic, and Jetson SSH round-trips add
> latency without adding signal. **Blackbox / e2e / performance / resilience
> / security / resource-limit tests** (`tests/e2e/`, `e2e/tests/`,
> `tests/perf/`, etc.) MUST run on the Jetson — never on a local workstation
> — because their pass criteria are tied to Jetson wall-clock latency,
> thermal envelope, and the real-camera + real-FC SITL loop. Workstation x86
> Docker (the historical "Tier-1" path) is **deprecated** as a supported
> e2e environment; the Tier-1 sections below are retained as historical
> reference / traceability only. CI e2e pipelines target the colocated
> arm64 Jetson Woodpecker agent (see `_docs/04_deploy/ci_cd_pipeline.md`);
> local-development e2e runs SHOULD use `scripts/run-tests-jetson.sh`
> against the configured `jetson-e2e` SSH alias rather than
> `scripts/run-tests.sh`. This refinement supersedes the 2026-05-20 "all
> tiers on Jetson" wording and the 2026-05-09 "both" decision recorded in
> the § Test Execution section.
## Where each tier runs (active policy)
| Tier | Local workstation | Jetson (canonical) | When local is the only option |
|------|--------------------|--------------------|-------------------------------|
| Unit (`tests/unit/`) | ✅ allowed and encouraged for dev iteration | ✅ also run as part of the Jetson CI lane | always |
| Blackbox / e2e (`tests/e2e/`, `e2e/tests/`) | ❌ forbidden — placeholder fixtures + missing hardware = false-negative runs | ✅ required for any merge / release decision | never — if Jetson is unreachable, the e2e verdict is "not run" rather than a local result |
| Performance / resilience / security / resource-limit | ❌ forbidden | ✅ required | never |
| Thermal chamber (AC-NEW-5) | ❌ forbidden | ✅ chamber Jetson only | never |
Practical consequences:
- A PR may merge on green local unit tests + green Jetson e2e tests.
- A PR MAY NOT merge on green local unit tests alone — the Jetson e2e lane is the binding signal.
- When the Jetson agent is offline, the e2e verdict is "pending Jetson" — record the gap (e.g. via `_docs/_process_leftovers/`) rather than substituting a local run.
- Tests in `tests/e2e/` that gate on `RUN_REPLAY_E2E` or `@pytest.mark.tier2` will SKIP locally; this is correct behaviour, not a failure to investigate.
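The local SKIP behaviour in the last bullet can be illustrated with a minimal sketch. It uses the stdlib `unittest.skipUnless` decorator (pytest collects such tests too; `pytest.mark.skipif` is the pytest-native equivalent). The accepted values for `RUN_REPLAY_E2E`, the helper name `replay_e2e_enabled`, and the test class are assumptions for illustration, not the project's actual gating code.

```python
import os
import unittest
from typing import Mapping, Optional


def replay_e2e_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the (assumed) RUN_REPLAY_E2E opt-in is set.

    The truthy spellings accepted here are an assumption of this sketch.
    """
    env = os.environ if environ is None else environ
    return env.get("RUN_REPLAY_E2E", "").strip().lower() in {"1", "true", "yes"}


class ReplayE2ESmoke(unittest.TestCase):
    """Hypothetical e2e test that only runs when explicitly enabled."""

    @unittest.skipUnless(
        replay_e2e_enabled(),
        "RUN_REPLAY_E2E not set: this tier is Jetson-only, so a local SKIP is expected",
    )
    def test_replay_loop(self) -> None:
        # The real test would drive the replay loop on the Jetson.
        self.assertTrue(True)
```

On a workstation without the variable set, the test is reported as skipped rather than failed, which matches the "correct behaviour" described above.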
## Overview
@@ -263,11 +286,21 @@ The captured-fixture builder framework (`e2e/fixtures/sitl_replay_builder/`) reg
## Test Execution
**Decision (2026-05-20)**: **Jetson only.** Supersedes the 2026-05-09 "both" decision below. All tests (unit, integration, blackbox / e2e, performance, resilience, security, resource-limit) run on the Jetson Orin Nano Super (or a Jetson-equivalent arm64 agent). The workstation x86 Docker path is deprecated. Rationale captured in `_docs/LESSONS.md` (2026-05-20 entry): repeated workstation-vs-Jetson environment divergences (Dockerfile build order, missing `libgl1`, gtsam wheel availability, venv symlink resolution, lazy-import side-effect registration) were producing false-negative test runs and consuming engineering time without ever exercising the production-equivalent hardware path.
**Decision (2026-05-20, refined later that day)**: **Jetson is the binding e2e environment; unit tests may run locally.** This refines the earlier "Jetson only for everything" wording. Rationale captured in `_docs/LESSONS.md` (2026-05-20 entries):
- The original "Jetson-only across all tiers" decision came from repeated workstation-vs-Jetson environment divergences in the e2e / build path (Dockerfile build order, missing `libgl1`, gtsam wheel availability, venv symlink resolution, lazy-import side-effect registration). Those divergences are real and continue to justify Jetson as the binding e2e environment.
- Forcing the unit-test suite over an SSH-orchestrated Jetson loop added 30 to 90 s per iteration without producing any signal the local interpreter doesn't already produce. The unit suite is fully synthetic (no camera, no SITL, no Jetson-specific runtime), so a local PASS is equivalent to a Jetson PASS for that tier.
**Operational entry points**:
- Local-development: `scripts/run-tests-jetson.sh` against the configured `jetson-e2e` SSH alias (see `_docs/03_implementation/jetson_harness_setup.md` for one-time setup).
- CI: `.woodpecker/01-test.yml` on the colocated arm64 Jetson agent (see `_docs/04_deploy/ci_cd_pipeline.md`).
| Tier | Entry point | Where it runs |
|------|-------------|---------------|
| Unit (`tests/unit/`) | `pytest tests/unit/ -q` directly, or `scripts/run-tests.sh` | local workstation (Python 3.10+ venv) |
| Blackbox / e2e (`tests/e2e/`, `e2e/tests/`) | `scripts/run-tests-jetson.sh` (local dev) / `.woodpecker/01-test.yml` (CI) | colocated arm64 Jetson Woodpecker agent — see `_docs/04_deploy/ci_cd_pipeline.md` |
| Performance / resilience / security / resource-limit | same as e2e | Jetson only |
| AC-NEW-5 thermal chamber | quarterly + pre-release | `self-hosted-jetson-orin-chamber` |
A green local unit-test run is necessary-but-not-sufficient for merge; the Jetson e2e lane is the binding signal.
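The merge rule stated above can be written down as a small predicate. This is only a sketch of the policy as worded in this document; the names `E2EVerdict` and `may_merge` are hypothetical and do not refer to any existing CI code.

```python
from enum import Enum


class E2EVerdict(Enum):
    """Possible outcomes of the Jetson e2e lane (names are illustrative)."""

    GREEN = "green"
    RED = "red"
    PENDING_JETSON = "pending Jetson"  # agent offline: the verdict is "not run"


def may_merge(local_unit_green: bool, jetson_e2e: E2EVerdict) -> bool:
    """Local unit tests are necessary; the Jetson e2e lane is the binding signal."""
    return local_unit_green and jetson_e2e is E2EVerdict.GREEN
```

A pending Jetson verdict never upgrades to a merge, which mirrors the rule that a local run must not be substituted for the e2e lane.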
The remainder of this section preserves the original 2026-05-09 decision context for traceability.
@@ -106,3 +106,43 @@ Then tile loading uses a documented fallback (or fails fast with a clear error i
**Risk 2: CDN dependency at render time**
- *Risk*: Default folium uses Leaflet via CDN — fails on offline Jetsons.
- *Mitigation*: Document `--offline-tiles` flag; provide bundled assets path or fail-fast.
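A minimal sketch of how the offline flags could be parsed and resolved with fail-fast validation is shown here. The flag names `--offline-tiles` and `--offline-tiles-template` come from this task; the helper `resolve_tiles`, the `{z}/{x}/{y}` placeholder check, and the rule that a template implies offline mode are assumptions of the sketch, not the shipped behaviour.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="gps-denied-render-map")
    parser.add_argument(
        "--offline-tiles",
        action="store_true",
        help="do not reference an online tile provider",
    )
    parser.add_argument(
        "--offline-tiles-template",
        default=None,
        metavar="URL_TEMPLATE",
        help="local tile-URL template containing {z}, {x} and {y}",
    )
    return parser


def resolve_tiles(args: argparse.Namespace) -> Optional[str]:
    """Return the tile source: a provider name, a local template, or None.

    Assumption: a template takes precedence and implies offline mode.
    """
    template = args.offline_tiles_template
    if template is not None:
        missing = [tok for tok in ("{z}", "{x}", "{y}") if tok not in template]
        if missing:
            # Fail fast with a clear error instead of rendering a blank map.
            raise ValueError(f"tile template is missing placeholders: {missing}")
        return template
    if args.offline_tiles:
        return None  # no base layer at all
    return "OpenStreetMap"
```

Returning `None` for the no-base-layer case lines up with folium's `Map(tiles=None)`, which creates a map without a default tile layer.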
---
## Implementation Notes (Batch 101 — Cycle 2)
**Status**: In Testing (Jira AZ-700).
### Files changed
Production:
- `src/gps_denied_onboard/cli/render_map.py` — new module: `RenderInputs` DTO, `render_map_html`, `load_estimated_track`, `load_ground_truth_track`, argparse CLI, `main()`.
- `pyproject.toml` — new `[project.optional-dependencies] operator-tools = ["folium>=0.16,<1.0"]` group; new console script `gps-denied-render-map = "gps_denied_onboard.cli.render_map:main"`.
Tests:
- `tests/unit/test_az700_render_map.py` — 14 unit tests covering JSONL parsing, HTML rendering (2 polylines, 4 markers, 2 scale circles, summary embed, offline-tiles toggle), and CLI smoke including a minimal binary-tlog helper.
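The JSONL parsing that these tests cover can be sketched roughly as follows. The record keys `lat` and `lon` and the function name `parse_track_lines` are assumptions for illustration; the estimator's real JSONL schema and the signature of `load_estimated_track` are not shown in this document.

```python
import json
from typing import Iterable, List, Tuple

LatLon = Tuple[float, float]


def parse_track_lines(
    lines: Iterable[str], lat_key: str = "lat", lon_key: str = "lon"
) -> List[LatLon]:
    """Parse one JSON object per line into (lat, lon) pairs.

    Blank lines are skipped; malformed records raise ValueError with the
    offending line number so a bad track fails loudly.
    """
    track: List[LatLon] = []
    for lineno, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue
        try:
            record = json.loads(text)
            track.append((float(record[lat_key]), float(record[lon_key])))
        except (KeyError, TypeError, ValueError) as exc:
            # json.JSONDecodeError is a ValueError subclass, so it lands here too.
            raise ValueError(f"line {lineno}: cannot read track point") from exc
    return track
```

Taking an iterable of lines rather than a path keeps the sketch testable without touching the filesystem; a caller would pass an open file object.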
### AC coverage
| AC | Test / Artefact | Result |
| ---- | ---------------------------------------------------------------------------------------- | ------ |
| AC-1 | `test_cli_writes_html_with_default_tiles` | PASS (local). The Jetson e2e visual smoke is `AC-4` and is operator-driven on Tier-2. |
| AC-2 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles` | PASS |
| AC-3 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles` | PASS — output HTML contains exactly 2 polyline layers (red + blue) and 4 markers + 2 scale circles. |
| AC-4 | Visual smoke on Tier-2 Jetson (operator opens `map.html` produced by AZ-699's e2e run) | DEFERRED to Jetson — wired and ready. |
| AC-5 | `test_render_map_html_offline_tiles_omits_openstreetmap`, `…_template_uses_local_url` | PASS |
### Test results
`pytest tests/unit/test_az700_render_map.py` → 14 passed in 2.5 s. Wider regression slice (AZ-697/698/699/700 + replay_input + calibration): 107 passed, 1 skipped (pre-existing AC-5 e2e smoke that needs real video).
### Strict typing
`mypy --strict src/gps_denied_onboard/cli/render_map.py` → **Success: no issues found in 1 source file.** Used `# type: ignore[import-untyped, import-not-found, unused-ignore]` on the lazy folium import so the strict pass is clean whether folium is installed or not.
### Design notes
- folium 0.20 (the latest in the pinned range) was used. The default tile provider is OpenStreetMap (`tiles="OpenStreetMap"`); the AC-5 `--offline-tiles` flag drops the base layer entirely, and `--offline-tiles-template` accepts a local tile-URL template for operators with a bundled tile pack.
- folium is lazy-imported inside `_import_folium()` so the airborne binary (which does NOT install `[operator-tools]`) doesn't pay for it on cold start. The C12 cold-start NFR is unaffected.
- The `_write_minimal_tlog` test helper builds a binary tlog with just `GLOBAL_POSITION_INT` records — that's the minimum AZ-697 needs — without coupling the test to the full Derkachi CSV schema used by `tests/e2e/replay/_tlog_synth.py`.
- All AZ-700 unit tests run locally per the refined test-environment policy (`_docs/02_document/tests/environment.md` § Where each tier runs); the Tier-2 visual-smoke AC-4 stays on the Jetson.
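The lazy-import pattern from the design notes can be sketched generically as below. The helper name `import_optional` and the wording of the error message are assumptions; the real `_import_folium()` is not reproduced in this document and may differ.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import a module on first use, with a clear hint when it is absent.

    Because the import happens inside the function, merely importing the
    enclosing module costs nothing when the optional dependency is unused.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name!r} is required for this command; "
            f"install the [{extra}] optional-dependency group"
        ) from exc
```

A caller would invoke something like `import_optional("folium", "operator-tools")` only on the rendering path, so a process that never renders a map never loads folium.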
@@ -0,0 +1,93 @@
# Batch 101 — Cycle 2 — AZ-700
**Date**: 2026-05-20
**Tasks**: AZ-700 (replay map visualization).
**Story points**: 3.
**Jira status**: AZ-700 → `In Testing`.
## What shipped
A new operator-side console-script `gps-denied-render-map` that
renders a self-contained HTML map (folium / Leaflet) of the
estimator's track vs the tlog ground-truth track, with start/end
markers, 100 m + 50 m scale circles, optional summary banner from
AZ-699, and an `--offline-tiles` mode for Jetsons without internet
access.
folium is gated behind a new `[operator-tools]` optional-dependency
group so the airborne binary never pays for it.
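For orientation, the pinned layout (2 polylines, 4 markers, 2 scale circles) can be sketched as a pure planning step plus a thin folium renderer. Everything named here (`Layer`, `plan_layers`, `render_html`, the grey circle colour, centring the circles on the truth start point) is an assumption for illustration and is not the code in `render_map.py`.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

LatLon = Tuple[float, float]


@dataclass(frozen=True)
class Layer:
    kind: str  # "polyline", "marker" or "circle"
    points: Tuple[LatLon, ...]
    color: str
    label: str
    radius_m: float = 0.0


def plan_layers(truth: Sequence[LatLon], estimated: Sequence[LatLon]) -> List[Layer]:
    """Describe the map content without importing folium."""
    layers: List[Layer] = []
    for track, color, name in ((truth, "red", "truth"), (estimated, "blue", "estimated")):
        pts = tuple(track)
        layers.append(Layer("polyline", pts, color, name))
        layers.append(Layer("marker", (pts[0],), color, f"{name} start"))
        layers.append(Layer("marker", (pts[-1],), color, f"{name} end"))
    origin = tuple(truth)[0]  # assumption: circles sit on the truth start
    for radius in (100.0, 50.0):
        layers.append(Layer("circle", (origin,), "gray", f"{radius:.0f} m", radius))
    return layers


def render_html(layers: Sequence[Layer], offline_tiles: bool = False) -> str:
    """Turn the plan into HTML; folium is imported lazily here."""
    import folium  # optional dependency, only needed on this path

    centre = list(layers[0].points[0])
    tiles = None if offline_tiles else "OpenStreetMap"
    fmap = folium.Map(location=centre, zoom_start=16, tiles=tiles)
    for layer in layers:
        first = list(layer.points[0])
        if layer.kind == "polyline":
            path = [list(p) for p in layer.points]
            folium.PolyLine(path, color=layer.color, tooltip=layer.label).add_to(fmap)
        elif layer.kind == "marker":
            folium.Marker(first, tooltip=layer.label).add_to(fmap)
        else:
            folium.Circle(first, radius=layer.radius_m, color=layer.color, fill=False).add_to(fmap)
    return fmap.get_root().render()
```

Splitting the plan from the rendering means the layer counts can be asserted without folium installed, while `render_html` stays the only place that touches the optional dependency.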
## Files changed
Production (2):
- `src/gps_denied_onboard/cli/render_map.py` (new)
- `pyproject.toml` (new optional-deps group + console script)
Tests (1):
- `tests/unit/test_az700_render_map.py` (14 tests, all PASS local)
Docs:
- `_docs/02_document/tests/environment.md` — refined the 2026-05-20
"Jetson-only" policy to: unit tests local-OK, e2e Jetson-only.
- `.cursor/rules/testing.mdc` — added the refined policy as an
always-applied agent rule.
- `_docs/02_tasks/done/AZ-700_replay_map_visualization.md`:
Implementation Notes appended; moved from `todo/`.
## AC coverage
| AC | Test / Artefact | Result |
| ---- | ---------------------------------------------------------------------------------------- | ------ |
| AC-1 | `test_cli_writes_html_with_default_tiles` | PASS (local). |
| AC-2 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles` | PASS |
| AC-3 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles` | PASS — exactly 2 polylines + 4 markers + 2 scale circles. |
| AC-4 | Visual smoke on Tier-2 Jetson with operator-opened `map.html` | DEFERRED to Jetson (correctly per refined test-env policy). |
| AC-5 | `test_render_map_html_offline_tiles_omits_openstreetmap`, `…_template_uses_local_url` | PASS |
## Test run
```
tests/unit/test_az700_render_map.py 14 PASS in 2.5 s
Wider regression slice 107 PASS 1 SKIP
```
The 1 skipped test is the pre-existing AZ-698 AC-5 e2e smoke
(needs the real video in `_docs/00_problem/input_data/flight_derkachi/`).
## Strict typing
```
mypy --strict src/gps_denied_onboard/cli/render_map.py
→ Success: no issues found in 1 source file.
```
The lazy folium import uses
`# type: ignore[import-untyped, import-not-found, unused-ignore]`
so strict passes cleanly whether or not `[operator-tools]` is
installed.
## Refined test-environment policy
Mid-batch the user clarified the existing "Jetson-only across all
tiers" policy: **unit tests may run locally, e2e tests stay
Jetson-only.** Rationale: the unit suite is fully synthetic, so a
local PASS = Jetson PASS for that tier; the e2e suite is bound to
Jetson hardware / latency / SITL and a local run is meaningless.
Captured in:
- `_docs/02_document/tests/environment.md` — banner + new
"Where each tier runs (active policy)" table + Test Execution
section rewritten.
- `.cursor/rules/testing.mdc` — appended "Test environment (this
project)" section so future agent sessions cannot drift back to
running e2e locally.
## Next batch
Batch 102 — **AZ-701** (HTTP replay API service). Depends on
AZ-697 (truth source) and AZ-699 (report writer). Last task in
cycle 2.
+2 -2
@@ -8,8 +8,8 @@ status: in_progress
sub_step:
phase: 6
name: implement-tasks-sequentially
detail: "batch 101 of ~102: AZ-700"
detail: "batch 102 of ~102: AZ-701"
retry_count: 0
cycle: 2
tracker: jira
last_completed_batch: 100
last_completed_batch: 101