[AZ-407] [AZ-444] [AZ-445] Batch 68: fixtures, Tier-2 harness, NFR reporter

Three blackbox-harness tasks landed together — all depend only on
AZ-406 and unblock the FT-* / NFT-* scenario tasks scheduled for
batches 69+.

AZ-407 — Static fixture builders (3pt):
  * tile-cache-builder/{builder.py, Dockerfile, build.sh} produces a
    deterministic tile-cache-fixture Docker volume from
    _docs/00_problem/input_data/. Reproducibility primitives: sorted
    iteration, frozen PIL JPEG settings, FAISS HNSW32 built single-
    threaded with seeded stub descriptors.
  * age-injector/{age_injector.py, inject.sh} clones the volume and
    shifts capture_date by N×30.44 days; tile JPEG bytes preserved
    bit-identical. Emits synth-age-7mo + synth-age-13mo volumes.
  * cold-boot/cold_boot_fixture.json: frozen FC pose snapshot at
    Derkachi sector centre, schema v1.
  * secrets/mavlink-test-passkey.txt: 64-hex with required
    `# TEST ONLY` header line per AC-5. Passkey-equality test now
    compares the secret line after stripping the header.
  * security/cve-2025-53644.jpg: synthetic 158-byte malformed JPEG
    (truncated SOS marker). OpenCV 4.11.x rejects gracefully with
    imdecode → None. AZ-439 will sharpen for ASan instrumentation.
  * Top-level Makefile with `make fixtures` / `make fixtures-*` /
    `make e2e-tier1*` / `make unit-tests` targets.

AZ-444 — Tier-2 Jetson harness wrapper (5pt):
  * run-tier2.sh rewritten as orchestrator. Detects local
    (aarch64 + TIER2_HOST=localhost) vs remote (ssh into TIER2_HOST).
    New flags: -k/--selector, --build-kind production|asan,
    --reflash (gated behind TIER2_REFLASH_ACK=1 two-key gate),
    --dry-run.
  * tier2-on-jetson.sh (new) — on-device delegate. Verifies
    gps-denied-onboard{,-asan}.service health; restarts with 5s
    tolerance; spawns tegrastats + jtop parallel samplers; tails
    ASan unit's journal in asan mode; drives docker compose with
    TIER=tier2-jetson; forwards SELECTOR to pytest -k.
  * docker/run-tier1.sh (new) — selector-parity sibling.
  * AC-1 (selector parity) and AC-6 (reflash gating) unit-tested via
    --dry-run output assertions. AC-2/AC-3/AC-4/AC-5 are hardware-
    loop ACs verified by the Tier-2 runtime smoke (no Jetson in the
    unit-test layer).

AZ-445 — CSV reporter + evidence bundler refinements (2pt):
  * reporting/nfr_recorder.py (new) — pytest plugin. Provides the
    `nfr_recorder` fixture with record_metric(name, value, ac_id)
    and partial(ac_id, reason). At session end emits:
      - per-nfr/<scenario_id>.json (AC-1)
      - traceability-status.json with every AC ID parsed from
        traceability-matrix.md, classified Covered/PARTIAL/NOT
        COVERED with source scenario IDs (AC-2)
      - regression-baseline.json with all numeric metrics (AC-3)
  * csv_reporter.py extended — `_outcome_to_result` consults the
    aggregator; rows flip PASS → PARTIAL when an AC was marked
    PARTIAL by nfr_recorder (AC-4). Graceful fallback when
    aggregator isn't registered (unit-test contexts).
  * conftest.py registers nfr_recorder in pytest_plugins.
  * New --traceability-matrix CLI flag seeds the NOT COVERED rows.

Build / config:
  * pyproject.toml dev extras: added Pillow>=10.4,<13.0 for the
    tile-cache-builder unit test (broad enough to keep torchvision's
    Pillow 12 pin happy; the production builder runs inside its own
    Docker image with its own pin).
  * Updated test_directory_layout.py to cover 10 new files + replaced
    the byte-equal passkey assertion with the header-stripping
    variant.

Test results:
  * 157 focused tests pass (was 97 in batch 67; +60 new across this
    batch). No regressions.

Module-layout / spec drift:
  * AZ-407 spec text says `tests/fixtures/...`; module-layout
    blackbox_tests entry (commit d7a17a8) authoritatively places the
    harness under `e2e/`. Implementation followed the layout entry.
  * AZ-444 spec mentions `e2e/tier2/run-tier2.sh`; AZ-406 placed it
    at `e2e/jetson/run-tier2.sh`. Kept at `e2e/jetson/` for
    consistency.
  * Cold-boot README ownership: corrected from AZ-419 to AZ-407 per
    AZ-419's own Dependencies field.

Specs archived to _docs/02_tasks/done/. Jira tickets transitioned to
In Testing on commit.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-16 17:18:01 +03:00
parent e9e6e32097
commit 6599d828d2
35 changed files with 3716 additions and 147 deletions
+48 -5
@@ -1,7 +1,50 @@
# age-injector
# age-injector (AZ-407)
Mutates `tile-cache-fixture` manifest dates → `synth-age-tile-set` for
FT-N-05 / FT-N-06 (stale-tile rejection on freshness violation).
Clones a `tile-cache-fixture` tree and mutates ONLY the manifest's
`capture_date` field (and the per-tile sidecar JSON's matching field)
to age every entry by a target number of months.
Delivered by **AZ-407** (Static fixture builders). AZ-406 commits to the
directory location + name only.
## Output volumes
| Volume | Age shift | Triggers |
|--------|-----------|----------|
| `synth-age-7mo` | now - 7 mo | > AC-8.2 active-conflict threshold (6 mo) — FT-N-05 |
| `synth-age-13mo` | now - 13 mo | > AC-8.2 rear threshold (12 mo) — FT-N-06 |
## Reproducibility
* Tile JPEG bodies are copied bit-identical (`shutil.copytree`).
* Manifest CSV row order is preserved from the source manifest (the
builder already sorts rows by `(zoom, x, y)`).
* The shifted date is `now - age_months × 30.44 days`, rounded to whole
days. The rounding error is at most half a day, which is inside the
AC-3 tolerance of `± 1 day`.
* The descriptors.index (if present in the source) is copied
bit-identical.
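
As a worked instance of the date arithmetic above, the sketch below reproduces the shift for both output volumes. The helper mirrors `_shifted_date` in `age_injector.py` rather than importing it, and the reference date is an arbitrary choice made only for illustration.

```python
import datetime as dt

DAYS_PER_MONTH = 30.44  # same average month length the injector uses


def shifted_date(now: dt.date, age_months: int) -> str:
    """Return `now` moved `age_months` average months into the past."""
    delta_days = int(round(age_months * DAYS_PER_MONTH))
    return (now - dt.timedelta(days=delta_days)).isoformat()


# Arbitrary reference date, chosen only for illustration.
now = dt.date(2026, 5, 16)
seven = shifted_date(now, 7)      # 7 * 30.44 = 213.08 -> 213 days
thirteen = shifted_date(now, 13)  # 13 * 30.44 = 395.72 -> 396 days
print(seven, thirteen)
```

Both shifts land past the 6-month and 12-month thresholds respectively, since 213 days exceeds 6 × 30.44 and 396 days exceeds 12 × 30.44.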
## Provenance
The injector itself is fully synthetic. The aged volumes are derivative
works of `tile-cache-fixture` (same license — see
`e2e/fixtures/tile-cache-builder/README.md` § Provenance).
## Usage
```bash
# Production (Docker volumes):
e2e/fixtures/age-injector/inject.sh
# Local mode (used by AZ-407 unit test):
e2e/fixtures/age-injector/inject.sh --local /tmp/src /tmp/out-7mo /tmp/out-13mo
```
The unit test `e2e/_unit_tests/fixtures/test_age_injector.py` verifies
AC-3 by:
1. Building a small tile-cache fixture from a synthetic 4-still input
2. Running the injector with `--age-months=7` and `--age-months=13`
3. Asserting the manifest `capture_date` shifts ±1 day from `now - N*30.44 days`
4. Asserting every tile JPEG body byte-equals the source
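
Step 4 can be sketched as a digest comparison over both trees. This is a minimal illustration, not the code in `test_age_injector.py`: the helper name `jpeg_digests` and the stand-in tile bytes are assumptions, and the real test builds its trees with the builder and the injector.

```python
import hashlib
import tempfile
from pathlib import Path


def jpeg_digests(root: Path) -> dict[str, str]:
    """Map each tile JPEG's path (relative to `root`) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.jpg"))
    }


# Tiny stand-in trees (hypothetical content) in place of real fixture output.
with tempfile.TemporaryDirectory() as tmp:
    src, aged = Path(tmp, "src"), Path(tmp, "aged")
    for tree in (src, aged):
        tile = tree / "tiles" / "18" / "1" / "2.jpg"
        tile.parent.mkdir(parents=True)
        tile.write_bytes(b"\xff\xd8stub-jpeg-body\xff\xd9")
    bodies_match = jpeg_digests(src) == jpeg_digests(aged)

print(bodies_match)
```

Comparing the two dictionaries checks both that the same set of tile paths exists and that every body is byte-equal.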
## Owned by
AZ-407 (this task).
+177
@@ -0,0 +1,177 @@
"""Age-injector for the tile-cache fixture.
Clones a ``tile-cache-fixture`` tree and mutates ONLY the manifest's
``capture_date`` column (and the per-tile sidecar JSON's matching field).
Tile JPEG bodies are copied bit-identical.
AC-3 (AZ-407): given target=7mo, every row's ``capture_date`` becomes
``now - 7 mo`` ± 1 day, exceeding the AC-8.2 active-conflict 6-month
threshold. Given target=13mo, every row's ``capture_date`` becomes
``now - 13 mo`` ± 1 day, exceeding the rear 12-month threshold.
Used by FT-N-05 / FT-N-06 (stale-tile rejection on freshness violation).
Public-boundary discipline: this module does NOT import any
``src/gps_denied_onboard`` symbol. The freshness contract lives in
``_docs/00_problem/restrictions.md`` § Satellite Imagery (AC-8.2).
"""
from __future__ import annotations
import argparse
import csv
import datetime as _dt
import json
import logging
import shutil
import sys
from pathlib import Path
logger = logging.getLogger(__name__)
# 30.44 days is the average month length (365.25 / 12 ≈ 30.44). Rounding
# N × 30.44 to whole days stays inside the AC's "±1 day" tolerance.
_DAYS_PER_MONTH = 30.44

_MANIFEST_HEADERS = (
    "zoom_level",
    "tile_x",
    "tile_y",
    "capture_date",
    "source",
    "m_per_px",
    "jpeg_path",
    "content_hash",
    "provenance",
)


def _shifted_date(now: _dt.date, age_months: int) -> str:
    delta_days = int(round(age_months * _DAYS_PER_MONTH))
    return (now - _dt.timedelta(days=delta_days)).isoformat()


def inject(
    source_dir: Path,
    output_dir: Path,
    age_months: int,
    now: _dt.date | None = None,
) -> dict:
    """Clone ``source_dir`` into ``output_dir`` and mutate dates.

    Returns a summary dict:
        {"row_count": int, "sidecar_count": int,
         "shifted_date": "YYYY-MM-DD", "source_dir": str}
    """
    if age_months <= 0:
        raise ValueError(f"age_months must be positive; got {age_months}")
    if now is None:
        now = _dt.datetime.now(tz=_dt.timezone.utc).date()
    if output_dir.exists():
        shutil.rmtree(output_dir)
    output_dir.mkdir(parents=True)

    # Phase 1: clone the tile tree. Pixels copy bit-identical.
    src_tiles = source_dir / "tiles"
    if not src_tiles.is_dir():
        raise FileNotFoundError(
            f"{source_dir} does not look like a tile-cache fixture "
            "(no `tiles/` subdir)"
        )
    shutil.copytree(src_tiles, output_dir / "tiles")
    shifted = _shifted_date(now, age_months)

    # Phase 2: mutate per-tile sidecar JSON files.
    sidecar_count = 0
    for sidecar in sorted((output_dir / "tiles").rglob("*.json")):
        data = json.loads(sidecar.read_text())
        data["capture_date"] = shifted
        sidecar.write_text(
            json.dumps(data, sort_keys=True, separators=(",", ":")) + "\n"
        )
        sidecar_count += 1

    # Phase 3: re-emit manifest.csv with shifted dates. Row order is
    # preserved (the source manifest is already sorted by builder.py).
    src_manifest = source_dir / "manifest.csv"
    if not src_manifest.is_file():
        raise FileNotFoundError(f"missing manifest.csv at {src_manifest}")
    with src_manifest.open() as fp:
        reader = csv.DictReader(fp)
        if tuple(reader.fieldnames or ()) != _MANIFEST_HEADERS:
            raise ValueError(
                f"unexpected manifest schema: {reader.fieldnames} "
                f"(expected {list(_MANIFEST_HEADERS)})"
            )
        rows = list(reader)
    out_manifest = output_dir / "manifest.csv"
    with out_manifest.open("w", newline="") as fp:
        writer = csv.writer(fp, lineterminator="\n")
        writer.writerow(_MANIFEST_HEADERS)
        for r in rows:
            writer.writerow(
                [
                    r["zoom_level"],
                    r["tile_x"],
                    r["tile_y"],
                    shifted,
                    r["source"],
                    r["m_per_px"],
                    r["jpeg_path"],
                    r["content_hash"],
                    r["provenance"],
                ]
            )

    # Phase 4: passthrough the descriptors.index if present (FAISS file
    # is independent of capture_date; copy bit-identical).
    src_index = source_dir / "descriptors.index"
    if src_index.is_file():
        shutil.copyfile(src_index, output_dir / "descriptors.index")
    return {
        "row_count": len(rows),
        "sidecar_count": sidecar_count,
        "shifted_date": shifted,
        "source_dir": str(source_dir),
    }


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(description="Age-inject the tile-cache fixture")
    parser.add_argument(
        "--source-dir",
        type=Path,
        required=True,
        help="Path to the source tile-cache-fixture tree",
    )
    parser.add_argument(
        "--output-dir",
        type=Path,
        required=True,
        help="Path to the aged output tree",
    )
    parser.add_argument(
        "--age-months",
        type=int,
        required=True,
        help="Shift capture_date by this many months into the past",
    )
    args = parser.parse_args(argv)
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    summary = inject(args.source_dir, args.output_dir, args.age_months)
    json.dump(summary, sys.stdout, sort_keys=True, indent=2)
    sys.stdout.write("\n")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
+60
@@ -0,0 +1,60 @@
#!/usr/bin/env bash
# Clone the tile-cache fixture and emit `synth-age-7mo` + `synth-age-13mo`
# Docker volumes (or local directories in ``--local`` mode).
#
# AC-3: dates shifted by 7 mo / 13 mo ±1 day; tile pixel content
# bit-identical to the source.
#
# Env vars:
# TILE_CACHE_VOLUME_NAME Source volume (default: tile-cache-fixture)
# AGE_7MO_VOLUME_NAME Output volume for 7mo (default: synth-age-7mo)
# AGE_13MO_VOLUME_NAME Output volume for 13mo (default: synth-age-13mo)
#
# Usage:
# inject.sh # Docker mode
# inject.sh --local /src /out-7mo /out-13mo # local mode (unit test path)
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SOURCE_VOL="${TILE_CACHE_VOLUME_NAME:-tile-cache-fixture}"
OUT_7MO_VOL="${AGE_7MO_VOLUME_NAME:-synth-age-7mo}"
OUT_13MO_VOL="${AGE_13MO_VOLUME_NAME:-synth-age-13mo}"
if [[ "${1:-}" == "--local" ]]; then
if [[ -z "${2:-}" || -z "${3:-}" || -z "${4:-}" ]]; then
echo "ERROR: --local requires <src_dir> <out_7mo_dir> <out_13mo_dir>" >&2
exit 2
fi
python3 "${SCRIPT_DIR}/age_injector.py" \
--source-dir "$2" --output-dir "$3" --age-months 7
python3 "${SCRIPT_DIR}/age_injector.py" \
--source-dir "$2" --output-dir "$4" --age-months 13
exit 0
fi
# Docker mode: reuse the tile-cache-builder image (it already has
# Python + Pillow + numpy; the injector script is mounted in).
IMAGE_TAG="azaion-tile-cache-builder:local"
for spec in "${OUT_7MO_VOL}:7" "${OUT_13MO_VOL}:13"; do
target_vol="${spec%%:*}"
months="${spec##*:}"
docker volume rm "${target_vol}" >/dev/null 2>&1 || true
docker volume create "${target_vol}" >/dev/null
docker run --rm \
-v "${SCRIPT_DIR}:/opt/injector:ro" \
-v "${SOURCE_VOL}:/source:ro" \
-v "${target_vol}:/output" \
--entrypoint python3 \
"${IMAGE_TAG}" \
/opt/injector/age_injector.py \
--source-dir /source \
--output-dir /output \
--age-months "${months}"
echo "synth-age volume '${target_vol}' built (age=${months}mo)"
done
+63 -6
@@ -1,8 +1,65 @@
# cold-boot-fixture
# cold-boot-fixture (AZ-407 / AZ-419)
Static JSON fixture loaded by FT-P-11 (cold-start init) and NFT-PERF-03
(cold-start TTFF). Schema mirror lives in
`e2e/fixtures/injectors/cold_boot.py` (`ColdBootFixture`).
`cold_boot_fixture.json` is a frozen FC pose snapshot at flight-resume
time. The file is consumed by:
AZ-419 produces `cold_boot_fixture.json` here. AZ-406 commits to the
directory location only.
* **AZ-419 (FT-P-11 cold-start init)** — secondary path
(`origin_source == fc_ekf` per ADR-010): loaded into the SITL via
the standard parameter-load path. The SUT cold-starts with no
Manifest `takeoff_origin`, and the test asserts the first outbound
estimate lands within ±50 m of the snapshot pose.
* **NFT-PERF-03 (cold-start TTFF)** — same loading path, with
performance instrumentation around the time-to-first-fix metric.
## Schema (v1)
```json
{
"_schema": "cold-boot-fixture/v1",
"global_position_int": { "lat_e7": ..., "lon_e7": ..., "alt_mm": ..., ... },
"attitude": { "roll_rad": ..., "pitch_rad": ..., "yaw_rad": ..., ... },
"ardupilot_param_overrides": { ... },
"inav_serial_rx_overrides": { ... }
}
```
The `global_position_int` block uses the canonical MAVLink
`GLOBAL_POSITION_INT` units (lat/lon scaled by 1e7; alt in mm).
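
As a quick illustration of those units, the sketch below converts the position values committed in `cold_boot_fixture.json` back to degrees and metres. The inline JSON is an excerpt typed out for the example; the real consumer is the AZ-419 loader, which is not shown here.

```python
import json

# Excerpt of the fixture's position block (values from cold_boot_fixture.json).
fixture = json.loads(
    '{"global_position_int":'
    ' {"lat_e7": 500750000, "lon_e7": 361500000, "alt_mm": 100000}}'
)
pos = fixture["global_position_int"]

lat_deg = pos["lat_e7"] / 1e7   # degrees north
lon_deg = pos["lon_e7"] / 1e7   # degrees east
alt_m = pos["alt_mm"] / 1000.0  # metres
print(lat_deg, lon_deg, alt_m)
```

The result matches the Derkachi sector centre (50.075° N, 36.150° E) at 100 m.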
## Provenance
| Field | Source | License |
|-------|--------|---------|
| Lat / Lon | Derkachi sector centre (50.075° N, 36.150° E) | Synthetic — chosen from the Derkachi route bbox |
| Alt | 100 m AGL | Synthetic placeholder; refined when D-PROJ-3 supplies the production scenario |
| Attitude | Level flight, heading 0° (north) | Synthetic — chosen to match the parametrize matrix's default |
Fully synthetic; no third-party data. Re-distributable under this
repository's license.
## Loading path
* **ArduPilot**: `mavproxy.py --master=... --cmd="param load cold_boot_fixture.json"`
followed by a `FAKE_GPS` injection sequence (handled by the AZ-419
fixture loader; this README only documents the file itself).
* **iNav**: MSP2 `SET_HOME` message + `MSP2_SENSOR_GPS` injection. The
per-FC wiring is handled by the AZ-419 fixture loader.
## Verification
The AZ-407 unit test
`e2e/_unit_tests/fixtures/test_cold_boot_fixture.py` asserts:
* The file is valid JSON
* The `_schema` field equals `cold-boot-fixture/v1`
* All required numeric fields are present and within physically
reasonable bounds (±90° lat, ±180° lon, > 0 alt, etc.)
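
The schema and bounds checks listed above might look roughly like the following. This is a sketch under assumptions, not the contents of `test_cold_boot_fixture.py`: the function name `check_cold_boot` is hypothetical, and only the position fields are covered.

```python
def check_cold_boot(fixture: dict) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    if fixture.get("_schema") != "cold-boot-fixture/v1":
        problems.append("unexpected _schema")
    pos = fixture.get("global_position_int", {})
    bounds = {
        "lat_e7": (-90 * 10**7, 90 * 10**7),    # ±90 degrees, scaled by 1e7
        "lon_e7": (-180 * 10**7, 180 * 10**7),  # ±180 degrees, scaled by 1e7
    }
    for key, (low, high) in bounds.items():
        value = pos.get(key)
        if not isinstance(value, int) or not low <= value <= high:
            problems.append(f"{key} missing or out of range")
    alt = pos.get("alt_mm")
    if not isinstance(alt, int) or alt <= 0:
        problems.append("alt_mm missing or not positive")
    return problems


good = {
    "_schema": "cold-boot-fixture/v1",
    "global_position_int": {"lat_e7": 500750000, "lon_e7": 361500000, "alt_mm": 100000},
}
# Deliberately broken copy: latitude beyond 90 degrees and a zero altitude.
bad = {
    "_schema": "cold-boot-fixture/v1",
    "global_position_int": {"lat_e7": 950000000, "lon_e7": 361500000, "alt_mm": 0},
}
print(check_cold_boot(good))
print(check_cold_boot(bad))
```

Returning a list of problems rather than raising on the first one lets a test report every violated bound in a single failure message.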
AC-4 (SITL loads the pose within ±1 m of the lat/lon/alt fields) is
verified by AZ-419's FT-P-11 test inside the Docker-bound runner —
that path requires SITL, which the AZ-407 unit test layer cannot
exercise.
## Owned by
AZ-407 (this file) + AZ-419 (the loader that consumes it).
@@ -0,0 +1,38 @@
{
"_schema": "cold-boot-fixture/v1",
"_description": "Frozen FC pose snapshot at flight-resume time. Loaded into ardupilot-plane-sitl / inav-sitl via the standard parameter-load path. Consumed by FT-P-11 (cold-start init, secondary path: origin_source == fc_ekf) per AZ-419.",
"_provenance": "synthetic — Derkachi sector centre at 100 m AGL, heading north",
"_license": "test-fixture (no third-party data; safe to redistribute under this repo's license)",
"_authored_for": ["AZ-407 (AC-4)", "AZ-419 (FT-P-11 fc_ekf path)"],
"global_position_int": {
"time_boot_ms": 0,
"lat_e7": 500750000,
"lon_e7": 361500000,
"alt_mm": 100000,
"relative_alt_mm": 100000,
"vx_cm_s": 0,
"vy_cm_s": 0,
"vz_cm_s": 0,
"hdg_cdeg": 0
},
"attitude": {
"roll_rad": 0.0,
"pitch_rad": 0.0,
"yaw_rad": 0.0,
"rollspeed_rad_s": 0.0,
"pitchspeed_rad_s": 0.0,
"yawspeed_rad_s": 0.0
},
"ardupilot_param_overrides": {
"SIM_GPS_DISABLE": 0,
"SIM_GPS_TYPE": 1,
"_comment_lat_lon_alt_yaw": "SIM_GPS_* params do not directly set EKF origin on the parameter-load path; FT-P-11 fixture loader will use mavproxy `param load` + a follow-up SET_HOME_POSITION / FAKE_GPS injection to land the EKF at the snapshot pose."
},
"inav_serial_rx_overrides": {
"_comment": "iNav loads pose via MSP2_SENSOR_GPS injection + INAV_SET_HOME message. FT-P-11 loader uses the standard MSP2 path; this fixture only declares the target lat/lon/alt/yaw — the loader handles per-FC wiring."
}
}
+25 -4
@@ -3,9 +3,30 @@
These files are loaded by pymavlink / msp_gps_toy when the runner needs
to participate in a signed-message handshake (FT-P-09-AP, NFT-SEC-03).
The bytes here match the Docker-secret value at
`e2e/docker/secrets/mavlink_passkey`. **Both files MUST be kept in sync.**
## Files
Production deployments never see either file — the production passkey is
provisioned via a real secret store at deploy time per `environment.md`
| File | Format | Consumer |
|------|--------|----------|
| `mavlink-test-passkey.txt` | `# header line` + 64-hex passkey | Runner-side test fixture (AZ-407 AC-5 deliverable) |
The secret encoded here MUST match the bytes in
`e2e/docker/secrets/mavlink_passkey` (which is the raw 64-hex passkey
consumed by mavproxy as a Docker secret — no comment header allowed
in that file's body). The unit test
`e2e/_unit_tests/test_directory_layout.py::test_passkey_files_match`
strips the comment header before comparing.
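
A header-stripping comparison of this kind could be sketched as below. The helper name `secret_line` is an assumption for illustration and is not taken from `test_directory_layout.py`; the two strings stand in for the contents of the two files.

```python
def secret_line(text: str) -> str:
    """Return the first non-blank line that is not a `#` comment."""
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            return stripped
    raise ValueError("no secret line found")


passkey = "0123456789abcdef" * 4             # the 64-hex test value
fixture_text = "# TEST ONLY\n" + passkey + "\n"  # runner-side file with header
docker_secret_text = passkey + "\n"              # raw Docker secret, no header

same = secret_line(fixture_text) == secret_line(docker_secret_text)
print(same)
```

Applying the same helper to both files keeps the comparison symmetric, so a trailing newline in either file does not cause a false mismatch.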
## Provenance
The 64-hex value `0123456789abcdef…0123456789abcdef` is the canonical
"all-test-zeros-and-evens" pattern. It is **NOT** cryptographically
secure and MUST NEVER be used in any production deployment.
Production deployments provision the passkey via a real secret store
at deploy time per `_docs/02_document/tests/environment.md`
§ Communication with system under test.
## License
Synthetic — no third-party material. Covered by this repository's
license.
@@ -1 +1,2 @@
# TEST ONLY — not for production use
0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef
+47 -4
@@ -1,5 +1,48 @@
# Security fixtures
# security fixtures (AZ-407 + AZ-439)
Hosts the crafted artifacts consumed by NFT-SEC-* scenarios. AZ-406
delivers the directory + generator scaffold; concrete fixture content is
delivered by the consuming security tasks (AZ-439 for the CVE JPEG).
## Contents
| File | Source | License | Consumer |
|------|--------|---------|----------|
| `generate_cve_jpeg.py` | Synthetic (this repo) | Same as repository license | AZ-439 (NFT-SEC-04) |
| `cve-2025-53644.jpg` | Generated by `generate_cve_jpeg.py` | Synthetic — no third-party data | NFT-SEC-04 control / regression test |
## Provenance
The JPEG is **fully synthetic** — hand-crafted bytes following the
JPEG structure documented in ITU-T T.81 / RFC 2046. It is NOT a copy
of the upstream CVE-2025-53644 proof-of-concept (whose redistribution
terms are unclear). The structural feature it exercises is a
**truncated SOS marker**: the marker is announced (`FFDA`) with a
valid 12-byte header but the entropy-coded scan data is absent and
the EOI (`FFD9`) is not present.
This matches the class of malformed input that CVE-2025-53644
exploits in vulnerable OpenCV (≤ 4.11). Hardened OpenCV (≥ 4.12)
must return a clean `imdecode` failure (None) without
buffer-overflow / use-after-free / SIGSEGV.
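
The truncation described above can be checked structurally without a decoder. The sketch below is an illustration only: it uses a plain byte search rather than a real JPEG parser, the function name is hypothetical, and the sample stream is just SOI followed by the SOS header bytes used by the generator.

```python
SOI = b"\xff\xd8"  # start of image
SOS = b"\xff\xda"  # start of scan


def ends_at_sos_header(blob: bytes) -> bool:
    """True when the stream opens with SOI and its last SOS header is the final thing in it."""
    idx = blob.rfind(SOS)
    if not blob.startswith(SOI) or idx < 0:
        return False
    # The 2-byte big-endian length counts itself plus the scan parameters.
    header_len = int.from_bytes(blob[idx + 2 : idx + 4], "big")
    return idx + 2 + header_len == len(blob)


# SOI + the 12-byte SOS header, with no scan data and no EOI.
truncated = SOI + bytes.fromhex("ffda000c03010002110311003f00")
# Same stream with one byte of scan data and an EOI marker appended.
complete = truncated + b"\x00" + b"\xff\xd9"
print(ends_at_sos_header(truncated), ends_at_sos_header(complete))
```

A check like this only confirms the byte layout; whether a given OpenCV build rejects the file still has to be established with the `imdecode` call shown under Verification.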
## Verification
```bash
.venv/bin/python -c "
import cv2, numpy as np
buf = np.fromfile('e2e/fixtures/security/cve-2025-53644.jpg', dtype=np.uint8)
img = cv2.imdecode(buf, cv2.IMREAD_COLOR)
assert img is None, 'AZ-407 fixture: OpenCV must reject this JPEG'
"
```
## Reproducibility
The generator is deterministic — `python generate_cve_jpeg.py out.jpg`
produces the same 158-byte file every time. The SHA-256 of the
generated file is checked into `e2e/_unit_tests/fixtures/test_cve_jpeg.py`
so any change to the generator's byte layout fails the unit test
explicitly.
## Re-distribution
The synthetic byte-stream and the generator script are covered by
this repository's license. No third-party CVE proof-of-concept content
is committed.
Binary file not shown (size: 158 B).
+113 -25
@@ -1,43 +1,131 @@
"""Programmatically generate the crafted JPEG fixture for CVE-2025-53644.
Per AZ-406 § Risk 5 — the upstream PoC JPEG has unclear redistribution
terms, so the e2e harness generates a structurally equivalent file from
scratch rather than committing copyrighted bytes.
Per AZ-407 § AC-6 and AZ-406 § Risk 5 — the upstream PoC JPEG has
unclear redistribution terms, so the e2e harness generates a
structurally equivalent malformed file from scratch rather than
committing copyrighted bytes.
The fixture is consumed by NFT-SEC-04 (OpenCV CVE-2025-53644 +
AddressSanitizer fuzz). The intent is NOT to reproduce the exact RCE; it
is to provide a malformed JPEG with the structural features the CVE
exploits (oversized DHT segment, truncated SOS marker) so the SUT's
hardened OpenCV path (>= 4.12.0) rejects it.
AZ-407 ships a *minimal* malformed JPEG with:
* Valid SOI marker (``FFD8``)
* Valid DQT (quantisation table)
* Valid SOF0 (baseline DCT) header
* **Truncated SOS marker** — the marker is announced (``FFDA``) but
only the length field is present; the entropy-coded data is
deliberately absent. This is the structural feature CVE-2025-53644
exploits: vulnerable OpenCV (≤ 4.11) reads past the buffer; hardened
OpenCV (≥ 4.12) rejects gracefully with an `imdecode` failure.
AZ-406 commits to the generator's existence + signature; AZ-439
(NFT-SEC-04) supplies the byte-level details and validates the generated
file actually triggers the CVE code path against opencv 4.11.x (control)
vs 4.12+ (mitigated).
AZ-439 (NFT-SEC-04) tightens this further:
* Adds an oversized DHT segment (the full PoC structure)
* Runs the file under AddressSanitizer to assert no buffer-overflow
/ use-after-free is reported on the hardened build
* Compares behaviour against a control vulnerable OpenCV ≤ 4.11
The AZ-407 fixture is sufficient to verify AC-6: feeding it to
OpenCV 4.12+ does NOT crash; it returns a clean decode failure.
The function is deterministic: same input → identical output bytes.
"""
from __future__ import annotations
import argparse
import hashlib
import logging
from pathlib import Path
logger = logging.getLogger(__name__)
def _build_minimal_malformed_jpeg() -> bytes:
    """Emit a deterministic malformed JPEG with a truncated SOS marker.

    Byte-level structure (annotated):
        FFD8                                                  # SOI
        FFE0 0010 4A464946 00 0102 0000 0001 0001 0000        # APP0 / JFIF stub
        FFDB 0043 00 <64 bytes>                               # DQT (table 0, baseline)
        FFC0 0011 08 0001 0001 03 01 22 00 02 11 01 03 11 01  # SOF0 (1x1 baseline 3-component)
        FFC4 0021 00 <30 bytes>                               # DHT (DC table 0; 16 counts + 14 symbols)
        FFDA 000C 03 01 00 02 11 03 11 00 3F 00               # SOS: header announced, NO entropy data
        <eof, no trailing FFD9>                               # CVE: truncated stream
    """
    soi = b"\xff\xd8"
    app0 = bytes.fromhex(
        "ffe000104a46494600010200000001000100"
        "00"
    )
    dqt_body = bytes(range(64))
    dqt = b"\xff\xdb" + (3 + len(dqt_body)).to_bytes(2, "big") + b"\x00" + dqt_body
    sof0 = bytes.fromhex(
        "ffc0001108"  # SOF0 marker + length + precision
        "0001"        # height = 1
        "0001"        # width = 1
        "03"          # 3 components
        "012200"      # Y  : id=1, sampling=22, quant tbl=0
        "021101"      # Cb : id=2, sampling=11, quant tbl=1
        "031101"      # Cr : id=3, sampling=11, quant tbl=1
    )
    # DHT for DC table 0 (tc=0, th=0). The body is 31 bytes: the tc/th
    # byte, 16 length counts, and 14 symbol values. We hand-craft the
    # structure rather than depending on PIL.
    dht_body = (
        b"\x00"  # tc=0, th=0
        + bytes([0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0])  # length counts
        + bytes([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13])  # symbols
    )
    dht = b"\xff\xc4" + (2 + len(dht_body)).to_bytes(2, "big") + dht_body
    # SOS: announce the marker + parameters, then STOP. No entropy-coded
    # scan data. No EOI. This is the CVE-relevant truncation.
    sos = bytes.fromhex(
        "ffda000c"  # SOS marker + length
        "03"        # 3 components in scan
        "0100"      # Y  : DC=0 / AC=0
        "0211"      # Cb : DC=1 / AC=1
        "0311"      # Cr : DC=1 / AC=1
        "00"        # Ss
        "3f"        # Se
        "00"        # Ah/Al
    )
    return soi + app0 + dqt + sof0 + dht + sos
def generate(out_path: Path) -> Path:
"""Write a malformed JPEG to ``out_path``. Returns the path on success.
"""Write the AZ-407 malformed JPEG to ``out_path``.
Raises NotImplementedError until AZ-439 supplies the byte template.
Tests that need the crafted fixture should mark themselves
@pytest.mark.skip(reason="awaiting AZ-439") until then.
Returns the path on success. Idempotent: writing twice produces the
same bytes.
"""
raise NotImplementedError(
"generate_cve_jpeg.generate is owned by AZ-439 — AZ-406 commits "
"to the public signature only."
blob = _build_minimal_malformed_jpeg()
out_path.parent.mkdir(parents=True, exist_ok=True)
out_path.write_bytes(blob)
logger.info(
"Wrote %d-byte CVE-2025-53644 fixture (sha256=%s) to %s",
len(blob),
hashlib.sha256(blob).hexdigest(),
out_path,
)
return out_path
def main(argv: list[str] | None = None) -> int:
parser = argparse.ArgumentParser(description="Generate CVE-2025-53644 fixture JPEG.")
parser.add_argument(
"out",
type=Path,
nargs="?",
default=Path("cve-2025-53644.jpg"),
help="Output JPEG path (default: ./cve-2025-53644.jpg)",
)
args = parser.parse_args(argv)
logging.basicConfig(level=logging.INFO)
generate(args.out)
return 0
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser(description="Generate CVE-2025-53644 fixture JPEG.")
parser.add_argument("out", type=Path, default=Path("cve-2025-53644.jpg"))
args = parser.parse_args()
generate(args.out)
raise SystemExit(main())
@@ -0,0 +1,49 @@
# syntax=docker/dockerfile:1.7
#
# tile-cache-fixture builder image. Built once per CI; output is a named
# Docker volume (`tile-cache-fixture`) mounted RO into the SUT by
# `docker/docker-compose.test.yml`.
#
# Public-boundary discipline: this image does NOT install the SUT
# package. It depends only on:
# * Pillow — JPEG re-encode of the paired _gmaps.png reference tiles
# and the deterministic stub-tile generator.
# * faiss-cpu — deterministic HNSW descriptor index emission.
# * numpy — backing array dtype for FAISS.
#
# Reproducibility:
# * Pin Python to 3.10-slim (matches the runner image's Python line).
# * Constrain Pillow, faiss-cpu, numpy to the version ranges verified
# deterministic in `e2e/_unit_tests/fixtures/test_tile_cache_builder.py`.
# * `PYTHONHASHSEED=0` neutralises hash-order non-determinism.
FROM python:3.10.14-slim-bookworm@sha256:9c9efb0c19a8bb1f08e8e7a13be5d671e51bcb9c83a3a8b0e2ad7d8aaeb33b30
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PYTHONHASHSEED=0 \
PIP_NO_CACHE_DIR=1
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
libgomp1 \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir \
"Pillow>=10.4,<12.0" \
"numpy>=1.26,<2.0" \
"faiss-cpu>=1.8,<2.0"
WORKDIR /opt/builder
COPY builder.py /opt/builder/builder.py
# Drop root for runtime; the image only reads /input and writes to
# /output, both mounted by the caller (`build.sh` bind-mounts /input and
# attaches the named volume at /output).
RUN useradd -u 10001 -m -d /home/builder builder \
&& mkdir -p /input /output \
&& chown -R builder:builder /opt/builder /input /output
USER 10001:10001
ENTRYPOINT ["python", "/opt/builder/builder.py"]
CMD ["--input-dir", "/input", "--output-dir", "/output"]
+76 -11
@@ -1,15 +1,80 @@
# tile-cache-builder
# tile-cache-builder (AZ-407)
Builds the `tile-cache-fixture` Docker volume from the 60 still-image
satellite references in `_docs/00_problem/input_data/` plus the Derkachi
route bbox.
satellite references in `_docs/00_problem/input_data/` plus the
Derkachi route bbox.
This directory currently contains only the structural placeholder; the
concrete builder (Dockerfile + build script + FAISS HNSW index emitter +
manifest writer + reproducibility assertion) is delivered by **AZ-407**
(Static fixture builders) — see AC-7 ("Fixture builders are reproducible")
in `_docs/02_tasks/todo/AZ-406_test_infrastructure.md`.
## Output schema
AZ-406 commits to the directory's location + name only. Do NOT delete this
README before AZ-407 lands; the `e2e_unit_test_directory_layout` unit test
asserts the placeholder is present.
```
tile-cache-fixture/
tiles/<zoom>/<x>/<y>.jpg # tile JPEG body
tiles/<zoom>/<x>/<y>.json # per-tile sidecar (mirrors `tiles` row)
manifest.csv # sorted manifest (9 columns)
descriptors.index # FAISS HNSW32 index (omitted if faiss not available)
```
Manifest columns (per `_docs/00_problem/restrictions.md` § Satellite
Imagery + `_docs/02_document/data_model.md` § 2.1):
| Column | Type | Notes |
|--------|------|-------|
| `zoom_level` | int | Slippy/XYZ zoom |
| `tile_x`, `tile_y` | int | Tile coords at the zoom |
| `capture_date` | ISO-8601 date | Default `2025-11-01` (frozen so freshness gate treats as fresh) |
| `source` | enum | `googlemaps` for real paired tiles, `stub` for D-PROJ-3 fallback |
| `m_per_px` | float | `0.5` (≥ the AC-8.1 floor) |
| `jpeg_path` | str | Relative path to the JPEG body |
| `content_hash` | hex | SHA-256 of the JPEG bytes |
| `provenance` | str | `paired_gmaps:AD000NNN`, `STUB`, or `STUB_BBOX:derkachi:lat,lon,lat,lon` |
## Reproducibility (AC-1)
Two consecutive invocations from the same input produce a bit-identical
output tree:
* Input files iterated in lexicographic order
* PIL JPEG encoded with `quality=85, optimize=False, progressive=False, subsampling=2`
* Manifest rows sorted by `(zoom_level, tile_x, tile_y)` before CSV
serialisation
* FAISS index built single-threaded with `omp_set_num_threads(1)` and
SHA-derived stub descriptors
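
The manifest-ordering rule can be illustrated in isolation. The sketch below is not an excerpt from `builder.py`: it uses only three of the nine columns, the function name is hypothetical, and it assumes the sort key compares the coordinates numerically.

```python
import csv
import hashlib
import io

FIELDS = ["zoom_level", "tile_x", "tile_y"]  # subset of the 9 manifest columns


def manifest_bytes(rows: list[dict]) -> bytes:
    """Serialise rows sorted by (zoom_level, tile_x, tile_y) with fixed line endings."""
    ordered = sorted(
        rows,
        key=lambda r: (int(r["zoom_level"]), int(r["tile_x"]), int(r["tile_y"])),
    )
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(ordered)
    return buf.getvalue().encode("utf-8")


rows = [
    {"zoom_level": "18", "tile_x": "10", "tile_y": "2"},
    {"zoom_level": "18", "tile_x": "9", "tile_y": "7"},
]
# The same rows in a different discovery order hash identically.
first = hashlib.sha256(manifest_bytes(rows)).hexdigest()
second = hashlib.sha256(manifest_bytes(list(reversed(rows)))).hexdigest()
print(first == second)
```

Sorting before serialisation and pinning the line terminator removes the two most common sources of byte drift in a CSV: filesystem iteration order and platform newlines.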
## Provenance (AC-7)
| Item | Source | License |
|------|--------|---------|
| Real tile bodies | `_docs/00_problem/input_data/AD*_gmaps.png` (2 paired references) | Project test fixture; safe to redistribute under this repo's license |
| Stub tile bodies | Generated from `_stub_jpeg_bytes(seed)` (PIL solid-fill) | Fully synthetic; no third-party data |
| Derkachi bbox tile | Synthetic placeholder until D-PROJ-3 lands | Fully synthetic |
| FAISS index | SHA-derived stub vectors (not real VPR descriptors) | Fully synthetic |
## Usage
```bash
# Production (Docker volume):
e2e/fixtures/tile-cache-builder/build.sh
# Local mode (used by AZ-407 unit test):
e2e/fixtures/tile-cache-builder/build.sh --local /tmp/tile-cache-out
```
The unit test `e2e/_unit_tests/fixtures/test_tile_cache_builder.py`
verifies AC-1 / AC-2 / AC-7 by invoking `builder.py` twice against a
`tmp_path` and asserting the output is byte-identical.
## Notes on D-PROJ-3
When D-PROJ-3 supplies the production tile-corpus for the Derkachi
sector, the stub tiles produced here (any row with `provenance = STUB`)
should be replaced by real Suite Sat Service tiles for those
footprints. The builder will then no longer fall back to
`_stub_jpeg_bytes` — every still that lacks a paired `_gmaps.png`
will draw from the real corpus instead.
## Owned by
AZ-407 (this task). The FAISS-stub descriptor format will not be used
in production; the production VPR pipeline (C2) emits real DINOv2
descriptors. The stub format is sufficient for AZ-407's reproducibility
and schema contracts only.
+64
@@ -0,0 +1,64 @@
#!/usr/bin/env bash
# Build the tile-cache test fixture as a named Docker volume
# (`tile-cache-fixture`), or emit it to a local directory in
# ``--local <path>`` mode (used by the AZ-407 unit tests).
#
# AC-1 (deterministic): two invocations against the same input emit
# identical FAISS index hash, identical manifest rows, and identical
# tile filesystem byte sizes.
#
# Env vars:
# TILE_CACHE_INPUT_DIR Path to the source imagery (default: <repo>/_docs/00_problem/input_data)
# TILE_CACHE_VOLUME_NAME Docker volume name (default: tile-cache-fixture)
#
# Usage:
# build.sh # builds the named Docker volume
# build.sh --local /tmp/out # emits to /tmp/out (no Docker)
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "${SCRIPT_DIR}/../../.." && pwd)"
VOLUME_NAME="${TILE_CACHE_VOLUME_NAME:-tile-cache-fixture}"
INPUT_DIR="${TILE_CACHE_INPUT_DIR:-${REPO_ROOT}/_docs/00_problem/input_data}"
LOCAL_OUT=""
if [[ "${1:-}" == "--local" ]]; then
  if [[ -z "${2:-}" ]]; then
    echo "ERROR: --local requires an output directory" >&2
    exit 2
  fi
  LOCAL_OUT="$2"
fi

if [[ ! -d "${INPUT_DIR}" ]]; then
  echo "ERROR: input dir not found: ${INPUT_DIR}" >&2
  exit 2
fi

if [[ -n "${LOCAL_OUT}" ]]; then
  # Local mode: invoke builder.py directly. The caller's venv must
  # have Pillow, numpy, faiss-cpu installed; the unit test pulls
  # them via the dev extras.
  python3 "${SCRIPT_DIR}/builder.py" \
    --input-dir "${INPUT_DIR}" \
    --output-dir "${LOCAL_OUT}"
  exit 0
fi

# Docker mode: build the builder image and populate the named volume.
IMAGE_TAG="azaion-tile-cache-builder:local"
docker build -t "${IMAGE_TAG}" "${SCRIPT_DIR}"
# Recreate the named volume so output is bit-stable across runs (AC-1).
docker volume rm "${VOLUME_NAME}" >/dev/null 2>&1 || true
docker volume create "${VOLUME_NAME}" >/dev/null
docker run --rm \
  -v "${INPUT_DIR}:/input:ro" \
  -v "${VOLUME_NAME}:/output" \
  "${IMAGE_TAG}"
echo "tile-cache-fixture volume '${VOLUME_NAME}' built from ${INPUT_DIR}"
@@ -0,0 +1,418 @@
"""Deterministic tile-cache fixture builder.

Reads source imagery + ground-truth from ``_docs/00_problem/input_data/``
and emits a reproducible ``tile-cache-fixture`` tree at ``--output-dir``:

    <output>/
        tiles/<zoom>/<x>/<y>.jpg   # tile JPEG bodies
        tiles/<zoom>/<x>/<y>.json  # per-tile sidecar (mirrors `tiles` row)
        manifest.csv               # sorted manifest with content hashes
        descriptors.index          # stub FAISS HNSW index (optional)

The builder is invokable directly (``python -m runner.fixtures.tile_cache_builder.builder``)
or inside the per-builder Docker image (``Dockerfile`` in this directory).

Reproducibility primitives (AC-1):

* Source files are sorted lexicographically before processing.
* PIL JPEG encode uses ``quality=85, optimize=False, progressive=False``
  with explicit ``subsampling=2`` (4:2:0). ``optimize``, ``progressive``
  and 4:2:0 subsampling match the current PIL defaults; ``quality`` does
  not (the PIL default is 75). Pinning all four protects against future
  PIL changes.
* Manifest rows are sorted by ``(zoom_level, tile_x, tile_y)`` before CSV
  serialization.
* FAISS index (when ``faiss-cpu`` is importable) is built single-threaded
  with ``faiss.omp_set_num_threads(1)`` and a fixed seed (``faiss.write_index``
  output is deterministic given the same descriptor sequence).
* Descriptors are SHA-256-derived stub vectors: sufficient for schema
  contracts, NOT a substitute for real VPR descriptors emitted by C2.

Public-boundary discipline: this module does NOT import any
``src/gps_denied_onboard`` symbol. The on-disk schema lives in
``_docs/00_problem/restrictions.md`` § Satellite Imagery and is the only
contract this builder honours.
"""
from __future__ import annotations

import argparse
import csv
import datetime as _dt
import hashlib
import io
import json
import logging
import os
import shutil
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable

logger = logging.getLogger(__name__)

# AC-2: Derkachi route bbox (placeholder centre, refined when D-PROJ-3
# lands the production Derkachi sector polygon). Lat/Lon are the bbox
# corners. The builder currently emits a single representative stub tile
# for the rectangle; real coverage of the bbox is owned by D-PROJ-3.
DERKACHI_BBOX = {
    "min_lat": 50.05,
    "max_lat": 50.10,
    "min_lon": 36.10,
    "max_lon": 36.20,
}

# Static "frozen" capture date for the base fixture. AC-3's age-injector
# operates on a clone; the BASE fixture's date is intentionally fixed in
# the past so the C6 freshness check (6-mo active-conflict /
# 12-mo rear) treats it as fresh for the default scenarios.
BASE_CAPTURE_DATE = "2025-11-01"

# Zoom level used by C6 for the Derkachi corpus (matches restrictions.md
# §Satellite Imagery: ≥0.5 m/px at the cache interface).
DEFAULT_ZOOM = 18

# Tile dimensions (slippy/XYZ convention).
TILE_W = 256
TILE_H = 256

# Stub-descriptor dimensionality (matches the production VPR descriptor
# size declared in `_docs/02_document/components/c2_vpr/description.md`
# for layout compatibility; the values themselves are SHA-derived stubs).
DESCRIPTOR_DIM = 256

@dataclass(frozen=True)
class TileEntry:
    """One row of the manifest. Sorted before CSV serialisation."""

    zoom_level: int
    tile_x: int
    tile_y: int
    capture_date: str
    source: str
    m_per_px: float
    jpeg_path: str
    content_hash: str
    provenance: str


def _iter_stills(input_dir: Path) -> Iterable[Path]:
    """Yield AD000NNN.jpg files in sorted order."""
    for p in sorted(input_dir.glob("AD*.jpg")):
        yield p


def _iter_paired_gmaps(input_dir: Path) -> set[str]:
    """Return the set of AD000NNN basenames that have a paired _gmaps.png."""
    return {p.stem.removesuffix("_gmaps") for p in input_dir.glob("AD*_gmaps.png")}


def _slippy_xy_from_index(idx: int, zoom: int) -> tuple[int, int]:
    """Deterministic (tile_x, tile_y) layout: row-major raster across the
    Derkachi bbox. The mapping is NOT geodetically meaningful; it is a
    stable placeholder until D-PROJ-3 supplies the production tile-matrix
    transform. Each `idx` gets a unique (tx, ty) so the manifest stays
    collision-free.
    """
    cols = 16  # 16x16 grid covers 256 tiles → comfortably more than 60 stills + 1 bbox
    tx = (idx % cols) + (1 << (zoom - 1))
    ty = (idx // cols) + (1 << (zoom - 1))
    return tx, ty


def _stub_jpeg_bytes(seed: int) -> bytes:
    """Render a deterministic 256x256 JPEG keyed on `seed`.

    No PIL randomness, no timestamps in metadata. The body is a solid
    fill whose (R, G, B) colour is computed from `seed`; OpenCV's
    imdecode + C2's descriptor pipeline both treat the bytes as a valid
    JPEG.
    """
    from PIL import Image  # noqa: PLC0415 (heavy import, deferred)

    r = (seed * 37) & 0xFF
    g = (seed * 53) & 0xFF
    b = (seed * 71) & 0xFF
    img = Image.new("RGB", (TILE_W, TILE_H), color=(r, g, b))
    buf = io.BytesIO()
    img.save(
        buf,
        format="JPEG",
        quality=85,
        optimize=False,
        progressive=False,
        subsampling=2,
    )
    return buf.getvalue()


def _real_tile_jpeg_bytes(gmaps_png: Path) -> bytes:
    """Re-encode a paired _gmaps.png as a deterministic JPEG."""
    from PIL import Image  # noqa: PLC0415

    img = Image.open(gmaps_png).convert("RGB").resize((TILE_W, TILE_H), Image.BICUBIC)
    buf = io.BytesIO()
    img.save(
        buf,
        format="JPEG",
        quality=85,
        optimize=False,
        progressive=False,
        subsampling=2,
    )
    return buf.getvalue()


def _content_hash(b: bytes) -> str:
    return hashlib.sha256(b).hexdigest()

def _sidecar_dict(entry: TileEntry) -> dict:
    """Per-tile JSON sidecar (mirrors the `tiles` row content per
    data_model.md § 2.1.2).
    """
    return {
        "zoom_level": entry.zoom_level,
        "tile_x": entry.tile_x,
        "tile_y": entry.tile_y,
        "capture_date": entry.capture_date,
        "source": entry.source,
        "m_per_px": entry.m_per_px,
        "content_hash": entry.content_hash,
        "provenance": entry.provenance,
    }


def _emit_tile(out_dir: Path, entry: TileEntry, jpeg_bytes: bytes) -> None:
    """Write `<out_dir>/tiles/<z>/<x>/<y>.{jpg,json}`.

    These are plain writes, not atomic renames; callers rely on `build`
    wiping the output directory first.
    """
    tile_dir = out_dir / "tiles" / str(entry.zoom_level) / str(entry.tile_x)
    tile_dir.mkdir(parents=True, exist_ok=True)
    jpg_path = tile_dir / f"{entry.tile_y}.jpg"
    json_path = tile_dir / f"{entry.tile_y}.json"
    jpg_path.write_bytes(jpeg_bytes)
    json_path.write_text(
        json.dumps(_sidecar_dict(entry), sort_keys=True, separators=(",", ":")) + "\n"
    )


def _write_manifest(out_dir: Path, rows: list[TileEntry]) -> Path:
    """Write the sorted manifest CSV."""
    manifest_path = out_dir / "manifest.csv"
    with manifest_path.open("w", newline="") as fp:
        writer = csv.writer(fp, lineterminator="\n")
        writer.writerow(
            [
                "zoom_level",
                "tile_x",
                "tile_y",
                "capture_date",
                "source",
                "m_per_px",
                "jpeg_path",
                "content_hash",
                "provenance",
            ]
        )
        for r in sorted(rows, key=lambda x: (x.zoom_level, x.tile_x, x.tile_y)):
            writer.writerow(
                [
                    r.zoom_level,
                    r.tile_x,
                    r.tile_y,
                    r.capture_date,
                    r.source,
                    f"{r.m_per_px:.6f}",
                    r.jpeg_path,
                    r.content_hash,
                    r.provenance,
                ]
            )
    return manifest_path

def _write_descriptors_index(out_dir: Path, rows: list[TileEntry]) -> Path | None:
    """Emit a deterministic FAISS HNSW index of stub descriptors.

    Returns the index path on success, or None when faiss-cpu is not
    importable. The unit test gates on importorskip("faiss"); the
    production build inside ``Dockerfile`` ships faiss-cpu so this path
    is always exercised in CI.
    """
    try:
        import faiss  # noqa: PLC0415
        import numpy as np  # noqa: PLC0415
    except ImportError:
        logger.warning(
            "faiss / numpy not importable in this environment; "
            "skipping descriptors.index emission. The fixture is still "
            "usable for schema-only scenarios; VPR-matching scenarios "
            "need the Docker build."
        )
        return None

    # Single-thread + deterministic seed → bit-stable output.
    faiss.omp_set_num_threads(1)
    descriptors = np.zeros((len(rows), DESCRIPTOR_DIM), dtype=np.float32)
    for i, r in enumerate(sorted(rows, key=lambda x: (x.zoom_level, x.tile_x, x.tile_y))):
        # SHA-derived stub: hash the tile's content_hash + index byte
        # into DESCRIPTOR_DIM float32s. Stable across runs because
        # content_hash is stable.
        seed_bytes = hashlib.sha256(
            f"{r.content_hash}|{i}".encode("ascii")
        ).digest()
        rng = np.random.default_rng(int.from_bytes(seed_bytes[:8], "big"))
        descriptors[i] = rng.standard_normal(DESCRIPTOR_DIM, dtype=np.float32)

    # HNSW32 + IP metric is the C2 production choice (see
    # _docs/02_document/components/c2_vpr/description.md).
    index = faiss.IndexHNSWFlat(DESCRIPTOR_DIM, 32, faiss.METRIC_INNER_PRODUCT)
    index.hnsw.efConstruction = 40
    index.hnsw.efSearch = 16
    index.add(descriptors)
    index_path = out_dir / "descriptors.index"
    faiss.write_index(index, str(index_path))
    return index_path

def build(input_dir: Path, output_dir: Path) -> dict:
    """Build the tile-cache fixture under `output_dir` from `input_dir`.

    Returns a manifest summary dict for caller logging:
        {"tile_count": int, "stub_count": int, "real_count": int,
         "paired_gmaps_count": int, "manifest_hash": str,
         "descriptors_index_hash": str | None}

    The output directory is wiped and re-created so two consecutive
    invocations against the same input produce bit-identical trees
    (AC-1).
    """
    if output_dir.exists():
        shutil.rmtree(output_dir)
    output_dir.mkdir(parents=True)

    paired = _iter_paired_gmaps(input_dir)
    stills = list(_iter_stills(input_dir))
    if not stills:
        raise FileNotFoundError(
            f"No AD*.jpg files under {input_dir}; input_data/ may be missing"
        )

    # The bbox tile reuses layout index 60, so more than 60 stills would
    # land on the same (tx, ty) and silently overwrite a tile.
    bbox_index = 60
    if len(stills) > bbox_index:
        raise ValueError(
            f"{len(stills)} stills exceed the {bbox_index} layout slots "
            "reserved ahead of the Derkachi bbox tile"
        )

    rows: list[TileEntry] = []
    stub_count = 0
    real_count = 0

    # AC-2: one tile entry per still + one entry for the Derkachi bbox
    # (index 60 in our deterministic layout).
    for idx, still in enumerate(stills):
        tx, ty = _slippy_xy_from_index(idx, DEFAULT_ZOOM)
        if still.stem in paired:
            jpeg = _real_tile_jpeg_bytes(input_dir / f"{still.stem}_gmaps.png")
            source = "googlemaps"
            provenance = f"paired_gmaps:{still.stem}"
            real_count += 1
        else:
            # D-PROJ-3 stub-tile fallback per AZ-407 spec lines 1819.
            jpeg = _stub_jpeg_bytes(idx + 1)
            source = "stub"
            provenance = "STUB"
            stub_count += 1
        entry = TileEntry(
            zoom_level=DEFAULT_ZOOM,
            tile_x=tx,
            tile_y=ty,
            capture_date=BASE_CAPTURE_DATE,
            source=source,
            m_per_px=0.5,
            jpeg_path=f"tiles/{DEFAULT_ZOOM}/{tx}/{ty}.jpg",
            content_hash=_content_hash(jpeg),
            provenance=provenance,
        )
        rows.append(entry)
        _emit_tile(output_dir, entry, jpeg)

    # AC-2: Derkachi route bbox entry, a single representative tile at
    # the bbox centre. Real coverage of the bbox is owned by D-PROJ-3.
    tx, ty = _slippy_xy_from_index(bbox_index, DEFAULT_ZOOM)
    bbox_jpeg = _stub_jpeg_bytes(bbox_index + 1)
    bbox_entry = TileEntry(
        zoom_level=DEFAULT_ZOOM,
        tile_x=tx,
        tile_y=ty,
        capture_date=BASE_CAPTURE_DATE,
        source="stub",
        m_per_px=0.5,
        jpeg_path=f"tiles/{DEFAULT_ZOOM}/{tx}/{ty}.jpg",
        content_hash=_content_hash(bbox_jpeg),
        provenance=(
            f"STUB_BBOX:derkachi:{DERKACHI_BBOX['min_lat']},"
            f"{DERKACHI_BBOX['min_lon']},{DERKACHI_BBOX['max_lat']},"
            f"{DERKACHI_BBOX['max_lon']}"
        ),
    )
    rows.append(bbox_entry)
    _emit_tile(output_dir, bbox_entry, bbox_jpeg)
    stub_count += 1

    manifest_path = _write_manifest(output_dir, rows)
    manifest_hash = hashlib.sha256(manifest_path.read_bytes()).hexdigest()
    index_path = _write_descriptors_index(output_dir, rows)
    if index_path is not None:
        descriptors_hash = hashlib.sha256(index_path.read_bytes()).hexdigest()
    else:
        descriptors_hash = None

    return {
        "tile_count": len(rows),
        "stub_count": stub_count,
        "real_count": real_count,
        "paired_gmaps_count": len(paired),
        "manifest_hash": manifest_hash,
        "descriptors_index_hash": descriptors_hash,
    }

def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(description="Build the tile-cache test fixture")
    parser.add_argument(
        "--input-dir",
        type=Path,
        required=True,
        help="Directory containing AD*.jpg and AD*_gmaps.png source files",
    )
    parser.add_argument(
        "--output-dir",
        type=Path,
        required=True,
        help="Output directory for the tile-cache fixture tree",
    )
    parser.add_argument(
        "--quiet",
        action="store_true",
        help="Suppress per-tile log lines (errors still surface)",
    )
    args = parser.parse_args(argv)
    logging.basicConfig(
        level=logging.WARNING if args.quiet else logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    summary = build(args.input_dir, args.output_dir)
    json.dump(summary, sys.stdout, sort_keys=True, indent=2)
    sys.stdout.write("\n")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())