gps-denied-onboard/_docs/02_tasks/done/AZ-407_fixture_builders_static.md
Oleksandr Bezdieniezhnykh 6599d828d2 [AZ-407] [AZ-444] [AZ-445] Batch 68: fixtures, Tier-2 harness, NFR reporter
Three blackbox-harness tasks landed together — all depend only on
AZ-406 and unblock the FT-* / NFT-* scenario tasks scheduled for
batches 69+.

AZ-407 — Static fixture builders (3pt):
  * tile-cache-builder/{builder.py, Dockerfile, build.sh} produces a
    deterministic tile-cache-fixture Docker volume from
    _docs/00_problem/input_data/. Reproducibility primitives: sorted
    iteration, frozen PIL JPEG settings, FAISS HNSW32 built single-
    threaded with seeded stub descriptors.
  * age-injector/{age_injector.py, inject.sh} clones the volume and
    shifts capture_date by N×30.44 days; tile JPEG bytes preserved
    bit-identical. Emits synth-age-7mo + synth-age-13mo volumes.
  * cold-boot/cold_boot_fixture.json: frozen FC pose snapshot at
    Derkachi sector centre, schema v1.
  * secrets/mavlink-test-passkey.txt: 64-hex with required
    `# TEST ONLY` header line per AC-5. Passkey-equality test now
    compares the secret line after stripping the header.
  * security/cve-2025-53644.jpg: synthetic 158-byte malformed JPEG
    (truncated SOS marker). OpenCV 4.11.x rejects gracefully with
    imdecode → None. AZ-439 will sharpen for ASan instrumentation.
  * Top-level Makefile with `make fixtures` / `make fixtures-*` /
    `make e2e-tier1*` / `make unit-tests` targets.

AZ-444 — Tier-2 Jetson harness wrapper (5pt):
  * run-tier2.sh rewritten as orchestrator. Detects local
    (aarch64 + TIER2_HOST=localhost) vs remote (ssh into TIER2_HOST).
    New flags: -k/--selector, --build-kind production|asan,
    --reflash (gated behind TIER2_REFLASH_ACK=1 two-key gate),
    --dry-run.
  * tier2-on-jetson.sh (new) — on-device delegate. Verifies
    gps-denied-onboard{,-asan}.service health; restarts with 5s
    tolerance; spawns tegrastats + jtop parallel samplers; tails
    ASan unit's journal in asan mode; drives docker compose with
    TIER=tier2-jetson; forwards SELECTOR to pytest -k.
  * docker/run-tier1.sh (new) — selector-parity sibling.
  * AC-1 (selector parity) and AC-6 (reflash gating) unit-tested via
    --dry-run output assertions. AC-2/AC-3/AC-4/AC-5 are hardware-
    loop ACs verified by the Tier-2 runtime smoke (no Jetson in the
    unit-test layer).

AZ-445 — CSV reporter + evidence bundler refinements (2pt):
  * reporting/nfr_recorder.py (new) — pytest plugin. Provides the
    `nfr_recorder` fixture with record_metric(name, value, ac_id)
    and partial(ac_id, reason). At session end emits:
      - per-nfr/<scenario_id>.json (AC-1)
      - traceability-status.json with every AC ID parsed from
        traceability-matrix.md, classified Covered/PARTIAL/NOT
        COVERED with source scenario IDs (AC-2)
      - regression-baseline.json with all numeric metrics (AC-3)
  * csv_reporter.py extended — `_outcome_to_result` consults the
    aggregator; rows flip PASS → PARTIAL when an AC was marked
    PARTIAL by nfr_recorder (AC-4). Graceful fallback when
    aggregator isn't registered (unit-test contexts).
  * conftest.py registers nfr_recorder in pytest_plugins.
  * New --traceability-matrix CLI flag seeds the NOT COVERED rows.

Build / config:
  * pyproject.toml dev extras: added Pillow>=10.4,<13.0 for the
    tile-cache-builder unit test (broad enough to keep torchvision's
    Pillow 12 pin happy; the production builder runs inside its own
    Docker image with its own pin).
  * Updated test_directory_layout.py to cover 10 new files + replaced
    the byte-equal passkey assertion with the header-stripping
    variant.

Test results:
  * 157 focused tests pass (was 97 in batch 67; +60 new across this
    batch). No regressions.

Module-layout / spec drift:
  * AZ-407 spec text says `tests/fixtures/...`; module-layout
    blackbox_tests entry (commit d7a17a8) authoritatively places the
    harness under `e2e/`. Implementation followed the layout entry.
  * AZ-444 spec mentions `e2e/tier2/run-tier2.sh`; AZ-406 placed it
    at `e2e/jetson/run-tier2.sh`. Kept at `e2e/jetson/` for
    consistency.
  * Cold-boot README ownership: corrected from AZ-419 to AZ-407 per
    AZ-419's own Dependencies field.

Specs archived to _docs/02_tasks/done/. Jira tickets transitioned to
In Testing on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 17:18:01 +03:00


Fixture Builders — Static (tile-cache, age-injector, cold-boot, mavlink-passkey, cve-jpeg)

Task: AZ-407_fixture_builders_static
Name: Static fixture builders for tile cache, aged tiles, cold-boot pose, MAVLink passkey, CVE JPEG
Description: Implement reproducible fixture builders for the five static (build-once-per-CI) fixtures named in test-data.md: tile-cache-fixture, synth-age-tile-set, cold-boot-fixture, mavlink-passkey, cve-jpeg-fixture.
Complexity: 3 points
Dependencies: AZ-406
Component: Blackbox Tests / Fixture builders (epic AZ-262 / E-BBT)
Tracker: AZ-407
Epic: AZ-262 (E-BBT)

Problem

Several blackbox scenarios assume the existence of static fixtures that do not vary across runs (FAISS HNSW index, aged tile manifests, frozen FC pose, signing passkey, crafted JPEG). Without a single owner producing them deterministically, every scenario task would re-implement its own variant and assertions would drift.

Outcome

  • tests/fixtures/tile-cache-builder/build.sh produces the same tile-cache-fixture content (FAISS index hashes, tile manifest rows, on-disk file sizes) bit-for-bit on two consecutive runs from the same _docs/00_problem/input_data/ source. Builds at minimum: 60 still-image footprints + Derkachi route bbox at 0.3-0.5 m/px. When D-PROJ-3 is unresolved, footprints without paired _gmaps.png use stub-tile content with explicit "STUB" provenance in the manifest.
  • tests/fixtures/age-injector/ clones tile-cache-fixture and produces synth-age-7mo (>6 mo, exceeds AC-8.2 active-conflict threshold) and synth-age-13mo (>12 mo, exceeds rear threshold). Tile pixels unchanged; only the manifest capture_date field mutated.
  • tests/fixtures/cold-boot/ ships a JSON snapshot of a GLOBAL_POSITION_INT pose at flight-resume time, loadable by ardupilot-plane-sitl / inav-sitl via the standard parameter-load path.
  • tests/fixtures/secrets/mavlink-test-passkey.txt ships a 32-byte hex passkey, prefixed # TEST ONLY — not for production use.
  • tests/fixtures/security/cve-2025-53644.jpg ships a license-checked PoC OR a generation script that produces an equivalent crafted JPEG following the published PoC structure.

Scope

Included

  • build.sh + Dockerfile for tile-cache-builder; FAISS index emission; tile filesystem layout; manifest CSV/SQLite per restrictions.md § Satellite Imagery schema.
  • age-injector script that copies the tile-cache volume and mutates manifest dates only.
  • Static cold-boot JSON, mavlink-passkey, CVE JPEG fixtures + their license/provenance README.
  • A top-level make fixtures (or equivalent CI step) that builds all five fixtures into named Docker volumes / files.

Excluded

  • Synthetic-injection fixtures (outlier, blackout-spoof, multi-segment) — owned by AZ-408.
  • Real Derkachi video / 60 still images — bind-mounted from _docs/00_problem/input_data/, not built.
  • The Suite Sat Service mock — owned by AZ-406.
  • Production-grade tile-cache content (real public-data subset for D-PROJ-3); stub-tile fallback is acceptable until D-PROJ-3 lands.

Acceptance Criteria

AC-1: tile-cache-fixture is deterministic
Given a clean Docker volume state
When tests/fixtures/tile-cache-builder/build.sh runs twice from the same source
Then both runs produce a tile-cache-fixture with identical FAISS index hash, identical manifest rows, and identical tile-filesystem byte sizes.
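One way to check the determinism AC-1 asks for is to reduce each build output to a single digest and compare the two runs. The sketch below is only an illustration under assumptions: `tree_digest` is a hypothetical helper, not part of the builder, and it hashes every file under a directory rather than inspecting the FAISS index or manifest separately. It relies on sorted iteration so that filesystem listing order cannot change the result.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return one SHA-256 covering the relative path and bytes of every file under root.

    Hypothetical helper for illustration; the real builder may compare
    index hashes, manifest rows and file sizes individually instead.
    """
    digest = hashlib.sha256()
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sort by POSIX-style relative path so listing order never affects the digest.
    for path in sorted(files, key=lambda p: p.relative_to(root).as_posix()):
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length prefixes keep (path, content) boundaries unambiguous.
        digest.update(len(rel).to_bytes(8, "big"))
        digest.update(rel)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Running the builder twice into two directories and asserting that `tree_digest` matches for both would cover the byte-size and content parts of the criterion in one comparison.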

AC-2: tile-cache-fixture covers required footprints
Given the build completes
Then the manifest contains entries for all 60 still-image footprints AND the Derkachi route bbox AND the 2 paired _gmaps.png references; every entry has a resolution of 0.5 m/px or finer (m/px ≤ 0.5), matching the 0.3-0.5 m/px range stated under Outcome.

AC-3: synth-age-7mo and synth-age-13mo correctly aged
Given tile-cache-fixture exists
When age-injector runs with target=7mo / target=13mo
Then the resulting volume has all capture_date fields set to (now - 7 mo) / (now - 13 mo) ± 1 day; tile pixel content is bit-identical to the source.
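The ageing step in AC-3 can be sketched as a pure function over manifest rows. This is a minimal illustration, assuming each row is a dict with a `capture_date` field stored as an ISO date; the other field names (`tile_id`, `sha256`) and both function names are hypothetical. The 30.44 days-per-month factor is the one the commit notes quote for the age-injector.

```python
from datetime import datetime, timedelta, timezone

DAYS_PER_MONTH = 30.44  # mean month length, as quoted in the commit notes


def aged_capture_date(now: datetime, months: int) -> datetime:
    """Return the timestamp that lies `months` mean-length months before `now`."""
    return now - timedelta(days=months * DAYS_PER_MONTH)


def age_manifest(rows: list[dict], now: datetime, months: int) -> list[dict]:
    """Return copies of the manifest rows with only capture_date rewritten.

    Every other field is carried over untouched, mirroring the requirement
    that tile content stays bit-identical. Input rows are not mutated.
    """
    target = aged_capture_date(now, months).date().isoformat()
    return [{**row, "capture_date": target} for row in rows]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [{"tile_id": "t0", "capture_date": "2026-01-01", "sha256": "abc"}]
    print(age_manifest(sample, now, 7))
    print(age_manifest(sample, now, 13))
```

Returning new rows instead of editing in place keeps the source manifest available for the bit-identity comparison against the aged volume.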

AC-4: cold-boot-fixture loads into SITL
Given the JSON pose snapshot
When loaded into ardupilot-plane-sitl (and separately inav-sitl) per the SITL parameter-load convention
Then the SITL EKF reflects the snapshot pose within ±1 m of the JSON's lat/lon/alt fields.

AC-5: mavlink-passkey is a valid 32-byte hex secret
Given mavlink-test-passkey.txt
Then the first line is # TEST ONLY — not for production use, and the secret line that follows contains exactly 64 hex characters (32 bytes).
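A check for AC-5 could look like the sketch below. It assumes the layout described in the commit notes (one `# TEST ONLY` header line followed by a single secret line); `parse_test_passkey` is a hypothetical name, and matching only the `# TEST ONLY` prefix of the header is a simplification.

```python
import re

HEADER_PREFIX = "# TEST ONLY"
HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def parse_test_passkey(text: str) -> bytes:
    """Validate the passkey file contents and return the 32 secret bytes.

    Raises ValueError if the header is missing or the secret is not
    exactly one line of 64 hex characters.
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith(HEADER_PREFIX):
        raise ValueError("first line must be the '# TEST ONLY' header")
    # Ignore blank lines (for example a trailing newline) after the header.
    secret_lines = [line.strip() for line in lines[1:] if line.strip()]
    if len(secret_lines) != 1 or not HEX64.fullmatch(secret_lines[0]):
        raise ValueError("expected exactly one line of 64 hex characters")
    return bytes.fromhex(secret_lines[0])
```

A passkey-equality test can then compare the returned bytes instead of the raw file contents, which keeps the header out of the comparison.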

AC-6: cve-jpeg-fixture is decodable / triggers the CVE behavior
Given cve-2025-53644.jpg
When fed to OpenCV ≥4.12.0 imdecode under AddressSanitizer
Then no buffer-overflow / use-after-free is reported AND OpenCV either decodes the image or returns an error gracefully (no crash).
When fed to a vulnerable OpenCV (≤4.11)
Then the PoC behavior is observable.

AC-7: License + provenance documented
Given each fixture
Then a README.md next to it states: source URL (or "synthetic"), license, and re-distribution terms. Fixtures lacking a clear license are generated programmatically rather than checked in.
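The README requirement in AC-7 lends itself to a simple presence check. The sketch below is an assumption-heavy illustration: the spec does not define a README format, so the keyword list and the function name `missing_readme_terms` are hypothetical, and a keyword match only shows that a topic is mentioned, not that the stated license is valid.

```python
from pathlib import Path

# Hypothetical keywords standing in for "source", "license" and
# "re-distribution terms"; the spec does not prescribe exact wording.
REQUIRED_TERMS = ("source", "license", "redistribution")


def missing_readme_terms(fixture_dir: Path) -> list[str]:
    """Return the required topics that the fixture's README.md does not mention."""
    readme = fixture_dir / "README.md"
    if not readme.is_file():
        return list(REQUIRED_TERMS)
    # Lower-case and drop hyphens so "Re-distribution" matches "redistribution".
    text = readme.read_text(encoding="utf-8").lower().replace("-", "")
    return [term for term in REQUIRED_TERMS if term not in text]
```

An empty return value means all three topics appear; a layout test could call this for each fixture directory and fail on any non-empty result.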

System Under Test Boundary

This task ONLY produces fixtures consumed by other test tasks. It does NOT exercise SUT behavior. The fixtures themselves are the deliverable.

  • No internal SUT modules are imported by the builders.
  • The tile-cache-builder uses only the public on-disk schema documented in _docs/00_problem/restrictions.md § Satellite Imagery; it does NOT depend on the runtime tile-cache implementation (C6).
  • If C6's on-disk schema later evolves, this builder's output must be updated to match — the builder is a contract test on the schema.

Constraints

  • Re-runnability: each builder MUST be idempotent; running twice produces the same output.
  • Volume-driven: tile-cache + age-injector emit named Docker volumes (tile-cache-fixture, synth-age-7mo, synth-age-13mo) so compose can mount them RO into the SUT.
  • License hygiene: any third-party data must be license-checked at build time; failures abort the build with a human-readable error.

Document Dependencies

  • _docs/02_document/tests/test-data.md § Seed Data Sets, § Input Data Mapping
  • _docs/00_problem/restrictions.md § Satellite Imagery (manifest schema)
  • _docs/02_document/tests/blackbox-tests.md (which scenarios consume which fixture)