[AZ-700] gps-denied-render-map: HTML map of estimated vs truth tracks

New operator-side console-script renders a self-contained HTML map (folium / Leaflet) comparing the estimator's JSONL track against the tlog ground-truth track.

Pinned visual style: red truth + blue estimated polylines, start/end markers per track, 100 m + 50 m scale circles, optional AZ-699 accuracy-summary banner, and an --offline-tiles mode (with optional local tile-URL template) for Jetsons without internet.

folium is gated behind a new [operator-tools] optional-dep so the airborne binary's cold-start NFR is unaffected (C12 binary doesn't import the new module).

14 new unit tests pin polyline count, marker count, scale-circle radii, summary embedding, offline-tile behaviour, and full CLI smoke. Zero mypy --strict errors.

Refines the 2026-05-20 Jetson-only test policy: unit tests may run locally, e2e/perf/resilience/security stay Jetson-only. Documented in _docs/02_document/tests/environment.md (Where each tier runs) and .cursor/rules/testing.mdc (Test environment for this project).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 07:01:14 +00:00 · 2026-05-20 17:04:01 +03:00
parent dcde602f61
commit b66b68ff76
8 changed files with 943 additions and 17 deletions
@@ -13,3 +13,11 @@ globs: ["**/*test*", "**/*spec*", "**/*Test*", "**/tests/**", "**/test/**"]
 - Never use Thread Sleep or fixed delays in tests; use polling or async waits
 - Keep test data factories/builders for reusable test setup
 - Tests must be independent: no shared mutable state between tests
+
+## Test environment (this project)
+
+- **Unit tests** (`tests/unit/`): may run locally on the dev workstation (`pytest tests/unit/` in the project venv). Local PASS is equivalent to Jetson PASS for this tier because the suite is fully synthetic.
+- **Blackbox / e2e / performance / resilience / security / resource-limit** tests (`tests/e2e/`, `e2e/tests/`, `tests/perf/`, …): MUST run on the Jetson Orin Nano Super (or a Jetson-equivalent arm64 agent). Use `scripts/run-tests-jetson.sh` for local dev; CI runs `.woodpecker/01-test.yml` on the colocated arm64 Jetson Woodpecker agent.
+- Do NOT run e2e tests on the local workstation and report the result. If the Jetson is unreachable, the e2e verdict is "not run" — record the gap in `_docs/_process_leftovers/` rather than substituting a local result.
+- Tests gated by `RUN_REPLAY_E2E` or `@pytest.mark.tier2` are expected to SKIP locally; that is correct behaviour, not a failure to investigate.
+- Canonical source for this policy: `_docs/02_document/tests/environment.md` § Where each tier runs (active policy).
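The skip gating described in the rule above can be sketched as a small helper. This is a minimal illustration, not code from the repository: the name `replay_e2e_enabled` and the set of values treated as "enabled" are assumptions, since the policy only names the `RUN_REPLAY_E2E` variable.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Assumption: the policy does not say which values count as "enabled".
_TRUTHY = frozenset({"1", "true", "yes", "on"})


def replay_e2e_enabled(environ: Mapping[str, str] | None = None) -> bool:
    """Return True when RUN_REPLAY_E2E is set to a truthy value."""
    env = os.environ if environ is None else environ
    return env.get("RUN_REPLAY_E2E", "").strip().lower() in _TRUTHY
```

With pytest, a helper like this would typically feed `pytest.mark.skipif(not replay_e2e_enabled(), reason=...)`, so a workstation run reports SKIP rather than a failure.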
@@ -1,17 +1,40 @@
 # Test Environment

-> **Active policy — 2026-05-20**: **all tests run on Jetson only.** The Jetson
-> Orin Nano Super (or a Jetson-equivalent arm64 agent) is the single canonical
-> test environment for every tier of testing — unit, integration, blackbox /
-> e2e, performance, resilience, security, resource-limit. Workstation x86
-> Docker (the historical "Tier-1" path) is **deprecated** and is not a
-> supported test environment going forward; the Tier-1 sections below are
-> retained as historical reference / traceability only. CI test pipelines
-> target the colocated arm64 Jetson Woodpecker agent (see
-> `_docs/04_deploy/ci_cd_pipeline.md`); local-development test runs SHOULD
-> use `scripts/run-tests-jetson.sh` against the configured `jetson-e2e` SSH
-> alias rather than `scripts/run-tests.sh`. This decision supersedes the
-> 2026-05-09 "both" decision recorded in the § Test Execution section.
+> **Active policy — 2026-05-20 (refined)**: the canonical CI / release-gate
+> test environment is the Jetson Orin Nano Super (or a Jetson-equivalent
+> arm64 agent). **Unit tests** (`pytest tests/unit/`) MAY be run on a local
+> developer workstation for fast iteration — they are hardware-agnostic by
+> construction, the suite is fully synthetic, and Jetson SSH round-trips add
+> latency without adding signal. **Blackbox / e2e / performance / resilience
+> / security / resource-limit tests** (`tests/e2e/`, `e2e/tests/`,
+> `tests/perf/`, etc.) MUST run on the Jetson — never on a local workstation
+> — because their pass criteria are tied to Jetson wall-clock latency,
+> thermal envelope, and the real-camera + real-FC SITL loop. Workstation x86
+> Docker (the historical "Tier-1" path) is **deprecated** as a supported
+> e2e environment; the Tier-1 sections below are retained as historical
+> reference / traceability only. CI e2e pipelines target the colocated
+> arm64 Jetson Woodpecker agent (see `_docs/04_deploy/ci_cd_pipeline.md`);
+> local-development e2e runs SHOULD use `scripts/run-tests-jetson.sh`
+> against the configured `jetson-e2e` SSH alias rather than
+> `scripts/run-tests.sh`. This refinement supersedes the 2026-05-20 "all
+> tiers on Jetson" wording and the 2026-05-09 "both" decision recorded in
+> the § Test Execution section.
+
+## Where each tier runs (active policy)
+
+| Tier | Local workstation | Jetson (canonical) | When a local-only result is acceptable |
+|------|--------------------|--------------------|-------------------------------|
+| Unit (`tests/unit/`) | ✅ allowed and encouraged for dev iteration | ✅ also run as part of the Jetson CI lane | always |
+| Blackbox / e2e (`tests/e2e/`, `e2e/tests/`) | ❌ forbidden — placeholder fixtures + missing hardware = false-negative runs | ✅ required for any merge / release decision | never — if Jetson is unreachable, the e2e verdict is "not run" rather than a local result |
+| Performance / resilience / security / resource-limit | ❌ forbidden | ✅ required | never |
+| Thermal chamber (AC-NEW-5) | ❌ forbidden | ✅ chamber Jetson only | never |
+
+Practical consequences:
+
+- A PR may merge on green local unit tests + green Jetson e2e tests.
+- A PR MUST NOT merge on green local unit tests alone; the Jetson e2e lane is the binding signal.
+- When the Jetson agent is offline, the e2e verdict is "not run"; record the gap (e.g. via `_docs/_process_leftovers/`) rather than substituting a local run.
+- Tests in `tests/e2e/` that gate on `RUN_REPLAY_E2E` or `@pytest.mark.tier2` will SKIP locally; this is correct behaviour, not a failure to investigate.

 ## Overview

@@ -263,11 +286,21 @@ The captured-fixture builder framework (`e2e/fixtures/sitl_replay_builder/`) reg

 ## Test Execution

-**Decision (2026-05-20)** — **Jetson only.** Supersedes the 2026-05-09 "both" decision below. All tests (unit, integration, blackbox / e2e, performance, resilience, security, resource-limit) run on the Jetson Orin Nano Super (or a Jetson-equivalent arm64 agent). The workstation x86 Docker path is deprecated. Rationale captured in `_docs/LESSONS.md` (2026-05-20 entry): repeated workstation-vs-Jetson environment divergences (Dockerfile build order, missing `libgl1`, gtsam wheel availability, venv symlink resolution, lazy-import side-effect registration) were producing false-negative test runs and consuming engineering time without ever exercising the production-equivalent hardware path.
+**Decision (2026-05-20, refined later that day)** — **Jetson is the binding e2e environment; unit tests may run locally.** This refines the earlier "Jetson only for everything" wording. Rationale captured in `_docs/LESSONS.md` (2026-05-20 entries):
+
+- The original "Jetson-only across all tiers" decision came from repeated workstation-vs-Jetson environment divergences in the e2e / build path (Dockerfile build order, missing `libgl1`, gtsam wheel availability, venv symlink resolution, lazy-import side-effect registration). Those divergences are real and continue to justify Jetson as the binding e2e environment.
+- Forcing the unit-test suite over an SSH-orchestrated Jetson loop added 30–90 s per iteration without producing any signal the local interpreter doesn't already produce. The unit suite is fully synthetic — no camera, no SITL, no Jetson-specific runtime — so a local PASS is equivalent to a Jetson PASS for that tier.

 **Operational entry points**:
- Local-development: `scripts/run-tests-jetson.sh` against the configured `jetson-e2e` SSH alias (see `_docs/03_implementation/jetson_harness_setup.md` for one-time setup).
- CI: `.woodpecker/01-test.yml` on the colocated arm64 Jetson agent (see `_docs/04_deploy/ci_cd_pipeline.md`).
+
+| Tier | Entry point | Where it runs |
+|------|-------------|---------------|
+| Unit (`tests/unit/`) | `pytest tests/unit/ -q` directly, or `scripts/run-tests.sh` | local workstation (Python 3.10+ venv) |
+| Blackbox / e2e (`tests/e2e/`, `e2e/tests/`) | `scripts/run-tests-jetson.sh` (local dev) / `.woodpecker/01-test.yml` (CI) | colocated arm64 Jetson Woodpecker agent — see `_docs/04_deploy/ci_cd_pipeline.md` |
+| Performance / resilience / security / resource-limit | same as e2e | Jetson only |
+| AC-NEW-5 thermal chamber | quarterly + pre-release | `self-hosted-jetson-orin-chamber` |
+
+A green local unit-test run is necessary-but-not-sufficient for merge; the Jetson e2e lane is the binding signal.

 The remainder of this section preserves the original 2026-05-09 decision context for traceability.

@@ -106,3 +106,43 @@ Then tile loading uses a documented fallback (or fails fast with a clear error i
 **Risk 2: CDN dependency at render time**
 - *Risk*: Default folium uses Leaflet via CDN — fails on offline Jetsons.
 - *Mitigation*: Document `--offline-tiles` flag; provide bundled assets path or fail-fast.
+
+---
+
+## Implementation Notes (Batch 101 — Cycle 2)
+
+**Status**: In Testing (Jira AZ-700).
+
+### Files changed
+
+Production:
+- `src/gps_denied_onboard/cli/render_map.py` — new module: `RenderInputs` DTO, `render_map_html`, `load_estimated_track`, `load_ground_truth_track`, argparse CLI, `main()`.
+- `pyproject.toml` — new `[project.optional-dependencies] operator-tools = ["folium>=0.16,<1.0"]` group; new console script `gps-denied-render-map = "gps_denied_onboard.cli.render_map:main"`.
+
+Tests:
+- `tests/unit/test_az700_render_map.py` — 14 unit tests covering JSONL parsing, HTML rendering (2 polylines, 4 markers, 2 scale circles, summary embed, offline-tiles toggle), and CLI smoke including a minimal binary-tlog helper.
+
+### AC coverage
+
+| AC   | Test / Artefact                                                                          | Result |
+| ---- | ---------------------------------------------------------------------------------------- | ------ |
+| AC-1 | `test_cli_writes_html_with_default_tiles`                                                | PASS (local). The Jetson e2e visual smoke is `AC-4` and is operator-driven on Tier-2. |
+| AC-2 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles`        | PASS   |
+| AC-3 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles`        | PASS — output HTML contains exactly 2 polyline layers (red + blue) and 4 markers + 2 scale circles. |
+| AC-4 | Visual smoke on Tier-2 Jetson (operator opens `map.html` produced by AZ-699's e2e run)   | DEFERRED to Jetson — wired and ready. |
+| AC-5 | `test_render_map_html_offline_tiles_omits_openstreetmap`, `…_template_uses_local_url`    | PASS   |
+
+### Test results
+
+`pytest tests/unit/test_az700_render_map.py` → 14 passed in 2.5 s. Wider regression slice (AZ-697/698/699/700 + replay_input + calibration): 107 passed, 1 skipped (the pre-existing AZ-698 AC-5 e2e smoke, which needs the real video).
+
+### Strict typing
+
+`mypy --strict src/gps_denied_onboard/cli/render_map.py` → **Success: no issues found in 1 source file.** Used `# type: ignore[import-untyped, import-not-found, unused-ignore]` on the lazy folium import so the strict pass is clean whether folium is installed or not.
+
+### Design notes
+
+- folium 0.20 (the latest in the pinned range) was used. The default tile provider is OpenStreetMap (`tiles="OpenStreetMap"`); the AC-5 `--offline-tiles` flag drops the base layer entirely, and `--offline-tiles-template` accepts a local tile-URL template for operators with a bundled tile pack.
+- folium is lazy-imported inside `_import_folium()` so the airborne binary (which does NOT install `[operator-tools]`) doesn't pay for it on cold start. The C12 cold-start NFR is unaffected.
+- The `_write_minimal_tlog` test helper builds a binary tlog with just `GLOBAL_POSITION_INT` records — that's the minimum AZ-697 needs — without coupling the test to the full Derkachi CSV schema used by `tests/e2e/replay/_tlog_synth.py`.
+- All AZ-700 unit tests run locally per the refined test-environment policy (`_docs/02_document/tests/environment.md` § Where each tier runs); the Tier-2 visual-smoke AC-4 stays on the Jetson.
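The tile-selection precedence from the first design note can be sketched as a pure function. `select_tiles` is a hypothetical helper for illustration and does not exist in `render_map.py`; its return value stands for what would be passed as folium's `tiles` argument.

```python
from __future__ import annotations

DEFAULT_TILES_NAME = "OpenStreetMap"


def select_tiles(
    offline_tiles: bool, offline_tiles_template: str | None
) -> str | None:
    """Pick the value for the map's ``tiles`` argument.

    Precedence: explicit local template, then the no-base-layer flag,
    then the online default.
    """
    if offline_tiles_template is not None:
        return offline_tiles_template
    if offline_tiles:
        return None  # no base layer; tracks render on a gray background
    return DEFAULT_TILES_NAME
```

Keeping the decision in one place makes the "template wins over the flag" rule easy to pin in a unit test without rendering a map.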
@@ -0,0 +1,93 @@
+# Batch 101 — Cycle 2 — AZ-700
+
+**Date**: 2026-05-20
+**Tasks**: AZ-700 (replay map visualization).
+**Story points**: 3.
+**Jira status**: AZ-700 → `In Testing`.
+
+## What shipped
+
+A new operator-side console-script `gps-denied-render-map` that
+renders a self-contained HTML map (folium / Leaflet) of the
+estimator's track vs the tlog ground-truth track, with start/end
+markers, 100 m + 50 m scale circles, optional summary banner from
+AZ-699, and an `--offline-tiles` mode for Jetsons without internet
+access.
+
+folium is gated behind a new `[operator-tools]` optional-dependency
+group so the airborne binary never pays for it.
+
+## Files changed
+
+Production (2):
+
+- `src/gps_denied_onboard/cli/render_map.py` (new)
+- `pyproject.toml` (new optional-deps group + console script)
+
+Tests (1):
+
+- `tests/unit/test_az700_render_map.py` (14 tests, all PASS local)
+
+Docs:
+
+- `_docs/02_document/tests/environment.md` — refined the 2026-05-20
+  "Jetson-only" policy to: unit tests local-OK, e2e Jetson-only.
+- `.cursor/rules/testing.mdc` — added the refined policy as an
+  always-applied agent rule.
+- `_docs/02_tasks/done/AZ-700_replay_map_visualization.md` —
+  Implementation Notes appended; moved from `todo/`.
+
+## AC coverage
+
+| AC   | Test / Artefact                                                                          | Result |
+| ---- | ---------------------------------------------------------------------------------------- | ------ |
+| AC-1 | `test_cli_writes_html_with_default_tiles`                                                | PASS (local).  |
+| AC-2 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles`        | PASS   |
+| AC-3 | `test_render_map_html_emits_two_polylines`, `…emits_four_markers_and_two_circles`        | PASS — exactly 2 polylines + 4 markers + 2 scale circles. |
+| AC-4 | Visual smoke on Tier-2 Jetson with operator-opened `map.html`                            | DEFERRED to Jetson (per the refined test-env policy).  |
+| AC-5 | `test_render_map_html_offline_tiles_omits_openstreetmap`, `…_template_uses_local_url`    | PASS   |
+
+## Test run
+
+```
+tests/unit/test_az700_render_map.py        14 PASS in 2.5 s
+Wider regression slice                    107 PASS  1 SKIP
+```
+
+The 1 skipped test is the pre-existing AZ-698 AC-5 e2e smoke
+(needs the real video in `_docs/00_problem/input_data/flight_derkachi/`).
+
+## Strict typing
+
+```
+mypy --strict src/gps_denied_onboard/cli/render_map.py
+→ Success: no issues found in 1 source file.
+```
+
+The lazy folium import uses
+`# type: ignore[import-untyped, import-not-found, unused-ignore]`
+so strict passes cleanly whether or not `[operator-tools]` is
+installed.
+
+## Refined test-environment policy
+
+Mid-batch the user clarified the existing "Jetson-only across all
+tiers" policy: **unit tests may run locally, e2e tests stay
+Jetson-only.** Rationale: the unit suite is fully synthetic, so a
+local PASS = Jetson PASS for that tier; the e2e suite is bound to
+Jetson hardware / latency / SITL and a local run is meaningless.
+
+Captured in:
+
+- `_docs/02_document/tests/environment.md` — banner + new
+  "Where each tier runs (active policy)" table + Test Execution
+  section rewritten.
+- `.cursor/rules/testing.mdc` — appended "Test environment (this
+  project)" section so future agent sessions cannot drift back to
+  running e2e locally.
+
+## Next batch
+
+Batch 102 — **AZ-701** (HTTP replay API service). Depends on
+AZ-697 (truth source) and AZ-699 (report writer). Last task in
+cycle 2.
@@ -8,8 +8,8 @@ status: in_progress
 sub_step:
  phase: 6
  name: implement-tasks-sequentially
-  detail: "batch 101 of ~102: AZ-700"
+  detail: "batch 102 of ~102: AZ-701"
 retry_count: 0
 cycle: 2
 tracker: jira
-last_completed_batch: 100
+last_completed_batch: 101
@@ -130,9 +130,18 @@ telemetry = [
    "jetson-stats>=4.2",
    "pynvml>=11.5",
 ]
+# AZ-700: operator-side post-flight analysis tools. NOT installed on
+# the airborne binary (folium pulls ~5 MB of JS + Leaflet assets that
+# regress the cold-start NFR if pulled into the runtime image).
+# Activate with `pip install gps-denied-onboard[operator-tools]` on
+# a developer / analyst workstation.
+operator-tools = [
+    "folium>=0.16,<1.0",
+]

 [project.scripts]
 gps-denied-replay = "gps_denied_onboard.cli.replay:main"
+gps-denied-render-map = "gps_denied_onboard.cli.render_map:main"
 operator-orchestrator = "gps_denied_onboard.components.c12_operator_orchestrator.cli:main"

 [tool.setuptools]
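A host can check up front whether the optional extra is usable without importing folium. The sketch below is an assumption-laden illustration: the helper names `has_module` and `has_operator_tools` are hypothetical, while `importlib.util.find_spec` is standard library.

```python
from __future__ import annotations

from importlib.util import find_spec


def has_module(name: str) -> bool:
    """Return True when top-level module ``name`` can be found, without importing it."""
    return find_spec(name) is not None


def has_operator_tools() -> bool:
    """Hypothetical preflight check for the [operator-tools] extra."""
    return has_module("folium")
```

This keeps the probe cheap, which matters for the same reason the extra exists: the runtime image should not pay for folium at start-up.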
@@ -0,0 +1,370 @@
+"""AZ-700 ``gps-denied-render-map`` console-script.
+
+Renders a self-contained HTML map (folium / Leaflet) comparing the
+estimated GPS track (from a `gps-denied-replay` JSONL run) against
+the tlog ground-truth track (binary tlog via AZ-697). Output is a
+single shareable HTML file with two distinct polyline layers,
+start/end markers, scale circles for visual reference, and an
+optional accuracy-summary banner from AZ-699.
+
+This module lives under ``cli/`` rather than ``components/`` because
+it is an operator-side post-flight analysis tool — it never runs
+inside the airborne loop. folium is an optional dependency
+(``[operator-tools]``) so the airborne binary's cold-start NFR is
+unaffected.
+
+Style: small functions, pure renderers; the I/O (argv parsing +
+file reads and writes) lives at the edges so unit tests can exercise
+the rendering pipeline without invoking the CLI.
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+from collections.abc import Iterable
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any
+
+from gps_denied_onboard.replay_input import load_tlog_ground_truth
+
+__all__ = [
+    "RenderInputs",
+    "load_estimated_track",
+    "load_ground_truth_track",
+    "main",
+    "render_map_html",
+]
+
+
+# Default tile provider. ``"OpenStreetMap"`` is folium's built-in alias
+# for the online OSM layer; a literal URL template can be supplied
+# instead via ``--offline-tiles-template``. AC-5 of AZ-700 allows
+# fail-fast when neither online nor local tiles are configured.
+_DEFAULT_TILES_NAME: str = "OpenStreetMap"
+
+
+# Visual style. Pinned so the AC-2/AC-3 HTML scans are stable across
+# folium upgrades (folium emits ``L.polyline([...], {color: '...'})``).
+_TRUTH_LINE_COLOR: str = "red"
+_ESTIMATED_LINE_COLOR: str = "blue"
+_TRUTH_START_COLOR: str = "green"
+_TRUTH_END_COLOR: str = "black"
+_ESTIMATED_START_COLOR: str = "lightgreen"
+_ESTIMATED_END_COLOR: str = "gray"
+
+
+@dataclass(frozen=True)
+class RenderInputs:
+    """Pre-parsed inputs for :func:`render_map_html`.
+
+    Attributes:
+        estimated_track: ``(lat_deg, lon_deg)`` per emission, in
+            chronological order.
+        truth_track: Same shape, sourced from the tlog.
+        summary_markdown: Optional content of the AZ-699 accuracy
+            report. ``None`` skips the header banner.
+        title: Page title (folium ``<title>``).
+    """
+
+    estimated_track: list[tuple[float, float]]
+    truth_track: list[tuple[float, float]]
+    summary_markdown: str | None
+    title: str
+
+
+def load_estimated_track(jsonl_path: Path) -> list[tuple[float, float]]:
+    """Load a track from a ``gps-denied-replay`` JSONL output."""
+    out: list[tuple[float, float]] = []
+    for lineno, line in enumerate(jsonl_path.read_text().splitlines(), 1):
+        if not line.strip():
+            continue
+        try:
+            row = json.loads(line)
+        except json.JSONDecodeError as exc:
+            raise ValueError(
+                f"{jsonl_path}:{lineno}: invalid JSON: {exc!r}"
+            ) from exc
+        pos = row.get("position_wgs84")
+        if not isinstance(pos, dict):
+            raise ValueError(
+                f"{jsonl_path}: row missing position_wgs84: {row!r}"
+            )
+        lat = pos.get("lat_deg")
+        lon = pos.get("lon_deg")
+        if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
+            raise ValueError(
+                f"{jsonl_path}: row has non-numeric lat/lon: {pos!r}"
+            )
+        out.append((float(lat), float(lon)))
+    return out
+
+
+def load_ground_truth_track(tlog_path: Path) -> list[tuple[float, float]]:
+    """Load a ``(lat, lon)`` track from a binary tlog (AZ-697)."""
+    series = load_tlog_ground_truth(tlog_path)
+    return [(fix.lat_deg, fix.lon_deg) for fix in series.records]
+
+
+def _bounds(
+    *tracks: Iterable[tuple[float, float]],
+) -> tuple[tuple[float, float], tuple[float, float]] | None:
+    """Return the lat/lon bounding box across all non-empty tracks."""
+    lats: list[float] = []
+    lons: list[float] = []
+    for track in tracks:
+        for lat, lon in track:
+            lats.append(lat)
+            lons.append(lon)
+    if not lats:
+        return None
+    return (min(lats), min(lons)), (max(lats), max(lons))
+
+
+def _import_folium() -> Any:
+    """Defer folium import so the airborne binary never pays for it."""
+    try:
+        import folium  # type: ignore[import-untyped, import-not-found, unused-ignore]
+    except ImportError as exc:
+        raise SystemExit(
+            "folium not installed. Install the operator-side tools "
+            "with `pip install gps-denied-onboard[operator-tools]`."
+        ) from exc
+    return folium
+
+
+def render_map_html(
+    inputs: RenderInputs,
+    *,
+    offline_tiles: bool = False,
+    offline_tiles_template: str | None = None,
+) -> str:
+    """Render the map to an HTML string.
+
+    Pure — does no file I/O. Returns the full HTML document that
+    :func:`main` writes to disk.
+
+    Args:
+        inputs: Parsed tracks + optional summary.
+        offline_tiles: When ``True``, folium is initialised with
+            ``tiles=None`` (no base layer). The operator is expected
+            to overlay tiles separately, or accept a gray map for
+            geometric review only.
+        offline_tiles_template: When provided, used as a local
+            tile-URL template (e.g. ``"file:///opt/tiles/{z}/{x}/{y}.png"``).
+            Takes precedence over ``offline_tiles``.
+    """
+    folium = _import_folium()
+    bbox = _bounds(inputs.estimated_track, inputs.truth_track)
+    if bbox is None:
+        raise ValueError(
+            "both estimated and truth tracks are empty; "
+            "nothing to render"
+        )
+    (lat_min, lon_min), (lat_max, lon_max) = bbox
+    centre = ((lat_min + lat_max) / 2.0, (lon_min + lon_max) / 2.0)
+
+    if offline_tiles_template is not None:
+        m = folium.Map(
+            location=centre,
+            zoom_start=15,
+            tiles=offline_tiles_template,
+            attr="local offline tile bundle",
+        )
+    elif offline_tiles:
+        m = folium.Map(location=centre, zoom_start=15, tiles=None)
+    else:
+        m = folium.Map(
+            location=centre, zoom_start=15, tiles=_DEFAULT_TILES_NAME
+        )
+
+    # AZ-700 AC-2: truth polyline (red) + estimated polyline (blue).
+    if inputs.truth_track:
+        folium.PolyLine(
+            inputs.truth_track,
+            color=_TRUTH_LINE_COLOR,
+            weight=3,
+            opacity=0.9,
+            tooltip="Ground truth (tlog)",
+        ).add_to(m)
+    if inputs.estimated_track:
+        folium.PolyLine(
+            inputs.estimated_track,
+            color=_ESTIMATED_LINE_COLOR,
+            weight=3,
+            opacity=0.9,
+            dash_array="6,4",
+            tooltip="Estimator output",
+        ).add_to(m)
+
+    # AZ-700 AC-3: start/end markers + 100 m + 50 m scale circles.
+    if inputs.truth_track:
+        truth_start = inputs.truth_track[0]
+        truth_end = inputs.truth_track[-1]
+        folium.Marker(
+            truth_start,
+            tooltip="Truth start",
+            icon=folium.Icon(color=_TRUTH_START_COLOR, icon="play"),
+        ).add_to(m)
+        folium.Marker(
+            truth_end,
+            tooltip="Truth end",
+            icon=folium.Icon(color=_TRUTH_END_COLOR, icon="stop"),
+        ).add_to(m)
+        folium.Circle(
+            truth_start, radius=100.0, color="black", fill=False,
+            tooltip="100 m scale",
+        ).add_to(m)
+        folium.Circle(
+            truth_start, radius=50.0, color="black", fill=False,
+            tooltip="50 m scale",
+        ).add_to(m)
+    if inputs.estimated_track:
+        est_start = inputs.estimated_track[0]
+        est_end = inputs.estimated_track[-1]
+        folium.Marker(
+            est_start,
+            tooltip="Estimator start",
+            icon=folium.Icon(color=_ESTIMATED_START_COLOR, icon="play"),
+        ).add_to(m)
+        folium.Marker(
+            est_end,
+            tooltip="Estimator end",
+            icon=folium.Icon(color=_ESTIMATED_END_COLOR, icon="stop"),
+        ).add_to(m)
+
+    m.fit_bounds([(lat_min, lon_min), (lat_max, lon_max)])
+
+    if inputs.summary_markdown is not None:
+        banner_html = (
+            "<div style='background:#fff; padding:8px 12px; "
+            "border-bottom:1px solid #999; font-family:monospace; "
+            "white-space:pre-wrap;'>"
+            + _escape_html(inputs.summary_markdown)
+            + "</div>"
+        )
+        m.get_root().html.add_child(folium.Element(banner_html))
+
+    title_html = (
+        f"<title>{_escape_html(inputs.title)}</title>"
+    )
+    m.get_root().header.add_child(folium.Element(title_html))
+
+    return str(m.get_root().render())
+
+
+def _escape_html(text: str) -> str:
+    return (
+        text.replace("&", "&amp;")
+        .replace("<", "&lt;")
+        .replace(">", "&gt;")
+    )
+
+
+# ---------------------------------------------------------------------
+# CLI surface
+
+
+def _build_argparser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(
+        prog="gps-denied-render-map",
+        description=(
+            "Render a self-contained HTML map comparing the "
+            "estimator's GPS track with the tlog ground-truth track."
+        ),
+    )
+    parser.add_argument(
+        "--estimated",
+        type=Path,
+        required=True,
+        help="Path to the gps-denied-replay JSONL emissions file.",
+    )
+    parser.add_argument(
+        "--truth",
+        type=Path,
+        required=True,
+        help="Path to the binary tlog the estimator was run against.",
+    )
+    parser.add_argument(
+        "--output",
+        type=Path,
+        required=True,
+        help="Path to write the resulting HTML map.",
+    )
+    parser.add_argument(
+        "--summary",
+        type=Path,
+        default=None,
+        help=(
+            "Optional path to an AZ-699 accuracy-summary Markdown "
+            "file. When supplied, its contents are embedded above "
+            "the map as a fixed banner."
+        ),
+    )
+    parser.add_argument(
+        "--offline-tiles",
+        action="store_true",
+        help=(
+            "Initialise the map with no base tile layer (gray "
+            "background). Use when the rendering host has no "
+            "internet access AND no local tile bundle. The map is "
+            "still useful for geometric track review."
+        ),
+    )
+    parser.add_argument(
+        "--offline-tiles-template",
+        type=str,
+        default=None,
+        help=(
+            "Local-tile URL template (e.g. "
+            "'file:///opt/tiles/{z}/{x}/{y}.png'). Takes precedence "
+            "over --offline-tiles when both are supplied."
+        ),
+    )
+    parser.add_argument(
+        "--title",
+        type=str,
+        default="gps-denied-onboard replay map",
+        help="HTML <title> for the produced page.",
+    )
+    return parser
+
+
+def main(argv: list[str] | None = None) -> int:
+    args = _build_argparser().parse_args(argv)
+
+    estimated_track = load_estimated_track(args.estimated)
+    truth_track = load_ground_truth_track(args.truth)
+    if not estimated_track and not truth_track:
+        print(
+            "both estimated and truth tracks are empty; nothing to render",
+            file=sys.stderr,
+        )
+        return 2
+
+    summary_markdown: str | None = None
+    if args.summary is not None:
+        if not args.summary.is_file():
+            print(
+                f"--summary file not found: {args.summary}", file=sys.stderr
+            )
+            return 2
+        summary_markdown = args.summary.read_text(encoding="utf-8")
+
+    inputs = RenderInputs(
+        estimated_track=estimated_track,
+        truth_track=truth_track,
+        summary_markdown=summary_markdown,
+        title=args.title,
+    )
+
+    html = render_map_html(
+        inputs,
+        offline_tiles=bool(args.offline_tiles),
+        offline_tiles_template=args.offline_tiles_template,
+    )
+    args.output.parent.mkdir(parents=True, exist_ok=True)
+    args.output.write_text(html, encoding="utf-8")
+    return 0
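For reference, a minimal sketch of the JSONL shape that `load_estimated_track` accepts, built and parsed with the standard library only. The sample coordinates and the `parse_track` helper are illustrative assumptions; only the `position_wgs84` / `lat_deg` / `lon_deg` keys come from the module above.

```python
from __future__ import annotations

import json

# Two emissions in the shape the loader reads; extra keys such as
# alt_m are simply not looked at.
rows = [
    {"position_wgs84": {"lat_deg": 50.0, "lon_deg": 30.0, "alt_m": 100}},
    {"position_wgs84": {"lat_deg": 50.1, "lon_deg": 30.1, "alt_m": 110}},
]
jsonl_text = "\n".join(json.dumps(r) for r in rows) + "\n"


def parse_track(text: str) -> list[tuple[float, float]]:
    """Hypothetical stand-in mirroring the loader's happy path."""
    track: list[tuple[float, float]] = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are skipped
        pos = json.loads(line)["position_wgs84"]
        track.append((float(pos["lat_deg"]), float(pos["lon_deg"])))
    return track


track = parse_track(jsonl_text)
```

Unlike the real loader, this stand-in does no validation; a row without `position_wgs84` would raise `KeyError` here rather than the module's `ValueError`.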
@@ -0,0 +1,373 @@
+"""AZ-700 — render_map CLI + HTML renderer unit tests.
+
+Covers AC-1 (CLI smoke + valid HTML), AC-2 (two distinct
+polylines), AC-3 (4 markers + 100 m + 50 m circles), and summary
+embedding. AC-5 (offline-tiles flag) is exercised via dedicated
+tests; the AC-4 visual smoke is deferred to the Jetson.
+
+Folium is an optional dependency (``[operator-tools]`` group);
+these tests skip cleanly when it is not importable so the airborne
+test suite stays green even when the operator extra is absent.
+
+Style: every test follows the Arrange / Act / Assert pattern.
+"""
+
+from __future__ import annotations
+
+import json
+import struct
+from pathlib import Path
+
+import pytest
+
+folium = pytest.importorskip(
+    "folium",
+    reason="folium is an operator-only dep; install gps-denied-onboard[operator-tools]",
+)
+
+from gps_denied_onboard.cli.render_map import (
+    RenderInputs,
+    _build_argparser,
+    load_estimated_track,
+    load_ground_truth_track,
+    main,
+    render_map_html,
+)
+
+
+def _write_minimal_tlog(path: Path, fixes: list[tuple[float, float, float]]) -> None:
+    """Write a tiny binary tlog with ``GLOBAL_POSITION_INT`` only.
+
+    Format: ``<u64 big-endian timestamp_us><MAVLink2 msg bytes>``,
+    repeated. ``load_tlog_ground_truth`` ignores everything except
+    ``GLOBAL_POSITION_INT`` / ``GPS_RAW_INT``, so the minimal schema
+    is just one ``GLOBAL_POSITION_INT`` per fix.
+    """
+    from pymavlink.dialects.v20 import ardupilotmega as mavlink
+
+    mav = mavlink.MAVLink(file=None, srcSystem=1, srcComponent=1)
+    with path.open("wb") as fp:
+        for i, (lat, lon, alt) in enumerate(fixes):
+            time_boot_ms = i * 500
+            msg = mav.global_position_int_encode(
+                time_boot_ms=time_boot_ms,
+                lat=int(lat * 1e7),
+                lon=int(lon * 1e7),
+                alt=int(alt * 1000),
+                relative_alt=int(alt * 1000),
+                vx=0,
+                vy=0,
+                vz=0,
+                hdg=0,
+            )
+            payload = msg.pack(mav)
+            ts_us = i * 500_000
+            fp.write(struct.pack(">Q", ts_us))
+            fp.write(payload)
+
+
+# ---------------------------------------------------------------------
+# Helpers
+
+
+def _write_jsonl(path: Path, rows: list[dict[str, object]]) -> None:
+    path.write_text("\n".join(json.dumps(r) for r in rows) + "\n")
+
+
+def _example_inputs() -> RenderInputs:
+    return RenderInputs(
+        estimated_track=[
+            (50.0, 30.0),
+            (50.001, 30.001),
+            (50.002, 30.002),
+        ],
+        truth_track=[
+            (50.0, 30.0),
+            (50.0005, 30.0005),
+            (50.001, 30.001),
+        ],
+        summary_markdown=None,
+        title="unit-test",
+    )
+
+
+# ---------------------------------------------------------------------
+# load_estimated_track / load_ground_truth_track
+
+
+def test_load_estimated_track_skips_blank_lines(tmp_path: Path) -> None:
+    # Arrange
+    path = tmp_path / "out.jsonl"
+    path.write_text(
+        '{"position_wgs84":{"lat_deg":50.0,"lon_deg":30.0,"alt_m":100}}\n'
+        "\n"
+        '{"position_wgs84":{"lat_deg":50.1,"lon_deg":30.1,"alt_m":110}}\n'
+    )
+
+    # Act
+    track = load_estimated_track(path)
+
+    # Assert
+    assert track == [(50.0, 30.0), (50.1, 30.1)]
+
+
+def test_load_estimated_track_raises_on_missing_position(tmp_path: Path) -> None:
+    # Arrange
+    path = tmp_path / "out.jsonl"
+    path.write_text('{"frame_id":1}\n')
+
+    # Act / Assert
+    with pytest.raises(ValueError, match="missing position_wgs84"):
+        load_estimated_track(path)
+
+
+def test_load_estimated_track_raises_on_non_numeric_lat(tmp_path: Path) -> None:
+    # Arrange
+    path = tmp_path / "out.jsonl"
+    path.write_text(
+        '{"position_wgs84":{"lat_deg":"oops","lon_deg":30.0}}\n'
+    )
+
+    # Act / Assert
+    with pytest.raises(ValueError, match="non-numeric lat/lon"):
+        load_estimated_track(path)
+
+
+# ---------------------------------------------------------------------
+# render_map_html
+
+
+def test_render_map_html_emits_two_polylines() -> None:
+    # Arrange
+    inputs = _example_inputs()
+
+    # Act
+    html = render_map_html(inputs)
+
+    # Assert — AC-2: two distinct polyline layers with our pinned colors.
+    assert html.count("L.polyline") == 2, (
+        "expected exactly 2 polylines (truth + estimated); "
+        f"saw {html.count('L.polyline')}"
+    )
+    assert '"color": "red"' in html, "truth polyline (red) missing"
+    assert '"color": "blue"' in html, "estimated polyline (blue) missing"
+
+
+def test_render_map_html_emits_four_markers_and_two_circles() -> None:
+    # Arrange
+    inputs = _example_inputs()
+
+    # Act
+    html = render_map_html(inputs)
+
+    # Assert — AC-3: 2 markers per track (start + end) = 4 total.
+    assert html.count("L.marker") == 4, (
+        f"expected 4 markers; saw {html.count('L.marker')}"
+    )
+    # Scale circles centred on the truth start: radii of 100 m and 50 m.
+    assert html.count("L.circle") == 2, (
+        f"expected 2 scale circles (100 m + 50 m); "
+        f"saw {html.count('L.circle')}"
+    )
+    assert '"radius": 100.0' in html
+    assert '"radius": 50.0' in html
+
+
+def test_render_map_html_embeds_summary_when_provided() -> None:
+    # Arrange
+    inputs = RenderInputs(
+        estimated_track=[(50.0, 30.0), (50.001, 30.001)],
+        truth_track=[(50.0, 30.0), (50.0005, 30.0005)],
+        summary_markdown=(
+            "# Real-flight validation — 2026-05-20\n"
+            "**Verdict**: PASS\n"
+            "| Mean | 12.3 |"
+        ),
+        title="t",
+    )
+
+    # Act
+    html = render_map_html(inputs)
+
+    # Assert — AC-4: the markdown body shows up in the HTML.
+    assert "Real-flight validation" in html
+    # `*` is not HTML-special, so the bold markers survive escaping verbatim.
+    assert "**Verdict**: PASS" in html
+    # The summary is HTML-escaped before embedding: pipes and asterisks pass
+    # through unchanged, while angle brackets would be escaped, which guards
+    # against script injection. Confirm the pre-wrap wrapper div is present.
+    assert "white-space:pre-wrap" in html
+
+
+def test_render_map_html_raises_on_both_tracks_empty() -> None:
+    # Arrange
+    inputs = RenderInputs(
+        estimated_track=[],
+        truth_track=[],
+        summary_markdown=None,
+        title="empty",
+    )
+
+    # Act / Assert
+    with pytest.raises(ValueError, match="empty"):
+        render_map_html(inputs)
+
+
+def test_render_map_html_offline_tiles_omits_openstreetmap() -> None:
+    # Act
+    html_default = render_map_html(_example_inputs())
+    html_offline = render_map_html(_example_inputs(), offline_tiles=True)
+
+    # Assert — `tiles=None` removes the default OpenStreetMap tile URL.
+    assert "openstreetmap" in html_default.lower()
+    assert "openstreetmap" not in html_offline.lower()
+
+
+def test_render_map_html_offline_tiles_template_uses_local_url() -> None:
+    # Act
+    html = render_map_html(
+        _example_inputs(),
+        offline_tiles_template="file:///opt/tiles/{z}/{x}/{y}.png",
+    )
+
+    # Assert
+    assert "file:///opt/tiles/{z}/{x}/{y}.png" in html
+    assert "local offline tile bundle" in html
+
+
+# ---------------------------------------------------------------------
+# CLI smoke (AC-1)
+
+
+def test_cli_writes_html_with_default_tiles(tmp_path: Path) -> None:
+    # Arrange
+    estimated = tmp_path / "estimator.jsonl"
+    _write_jsonl(
+        estimated,
+        [
+            {"position_wgs84": {"lat_deg": 50.0, "lon_deg": 30.0, "alt_m": 100}},
+            {"position_wgs84": {"lat_deg": 50.001, "lon_deg": 30.001, "alt_m": 101}},
+        ],
+    )
+
+    truth = tmp_path / "synth.tlog"
+    _write_minimal_tlog(
+        truth,
+        [(50.0, 30.0, 100.0), (50.0005, 30.0005, 100.0), (50.001, 30.001, 100.0)],
+    )
+
+    output = tmp_path / "map.html"
+
+    # Act
+    rc = main(
+        [
+            "--estimated", str(estimated),
+            "--truth", str(truth),
+            "--output", str(output),
+        ]
+    )
+
+    # Assert — AC-1: clean exit + non-empty HTML.
+    assert rc == 0
+    assert output.is_file()
+    body = output.read_text()
+    assert body.startswith("<!DOCTYPE html>")
+    assert len(body) > 1000
+
+
+def test_cli_embeds_summary_when_flag_supplied(tmp_path: Path) -> None:
+    # Arrange
+    estimated = tmp_path / "estimator.jsonl"
+    _write_jsonl(
+        estimated,
+        [
+            {"position_wgs84": {"lat_deg": 50.0, "lon_deg": 30.0, "alt_m": 100}},
+            {"position_wgs84": {"lat_deg": 50.001, "lon_deg": 30.001, "alt_m": 101}},
+        ],
+    )
+
+    truth = tmp_path / "synth.tlog"
+    _write_minimal_tlog(truth, [(50.0, 30.0, 100.0), (50.001, 30.001, 100.0)])
+
+    summary = tmp_path / "real_flight_validation_2026-05-20.md"
+    summary.write_text(
+        "# Real-flight validation — 2026-05-20\n"
+        "**Verdict**: FAIL\n\n"
+        "## Horizontal error (metres)\n"
+        "| Mean | 142.5 |\n"
+    )
+
+    output = tmp_path / "map.html"
+
+    # Act
+    rc = main(
+        [
+            "--estimated", str(estimated),
+            "--truth", str(truth),
+            "--output", str(output),
+            "--summary", str(summary),
+        ]
+    )
+
+    # Assert — AC-4
+    assert rc == 0
+    body = output.read_text()
+    assert "Real-flight validation" in body
+    assert "**Verdict**: FAIL" in body
+    assert "Mean | 142.5" in body
+
+
+def test_cli_fails_fast_when_summary_path_missing(tmp_path: Path) -> None:
+    # Arrange
+    estimated = tmp_path / "estimator.jsonl"
+    _write_jsonl(
+        estimated,
+        [
+            {"position_wgs84": {"lat_deg": 50.0, "lon_deg": 30.0, "alt_m": 100}},
+        ],
+    )
+    truth = tmp_path / "synth.tlog"
+    _write_minimal_tlog(truth, [(50.0, 30.0, 100.0), (50.001, 30.001, 100.0)])
+
+    output = tmp_path / "map.html"
+    missing_summary = tmp_path / "does_not_exist.md"
+
+    # Act
+    rc = main(
+        [
+            "--estimated", str(estimated),
+            "--truth", str(truth),
+            "--output", str(output),
+            "--summary", str(missing_summary),
+        ]
+    )
+
+    # Assert
+    assert rc == 2
+    assert not output.exists(), "must not write the map when summary path is invalid"
+
+
+def test_argparser_requires_three_paths() -> None:
+    # Arrange
+    parser = _build_argparser()
+
+    # Act / Assert
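+    with pytest.raises(SystemExit):
+        parser.parse_args([])
+    with pytest.raises(SystemExit):
+        parser.parse_args(["--estimated", "/tmp/a.jsonl"])
+    # Two of the three paths is still not enough (assumes --output is required).
+    with pytest.raises(SystemExit):
+        parser.parse_args(["--estimated", "/tmp/a.jsonl", "--truth", "/tmp/a.tlog"])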
+
+
+# ---------------------------------------------------------------------
+# load_ground_truth_track integration with AZ-697
+
+
+def test_load_ground_truth_track_returns_lat_lon_pairs(tmp_path: Path) -> None:
+    # Arrange — synthesize a minimal tlog and round-trip through AZ-697.
+    tlog_path = tmp_path / "synth.tlog"
+    _write_minimal_tlog(
+        tlog_path,
+        [(50.000, 30.000, 100.0), (50.001, 30.001, 101.0), (50.002, 30.002, 102.0)],
+    )
+
+    # Act
+    track = load_ground_truth_track(tlog_path)
+
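+    # Assert: fixes round-trip in order, well within the 1e-7 deg wire resolution.
+    expected = [(50.000, 30.000), (50.001, 30.001), (50.002, 30.002)]
+    assert len(track) == len(expected)
+    for actual, want in zip(track, expected):
+        assert actual == pytest.approx(want, abs=1e-6)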