AZ-507: codify cross-component import rule. Added _types/inference_errors.py shim re-exporting EngineBuildError + CalibrationCacheError from c7_inference; narrowed C10 EngineCompiler's except Exception to the two typed errors so unknown exceptions propagate (AC-3). Rewrote module-layout.md "Imports from" sections for 9 components + added Rule 9; appended an architecture.md ADR-009 note explaining why components must go through _types/*. AZ-323: ManifestBuilder + Ed25519ManifestSigner. Canonical JSON via orjson OPT_SORT_KEYS+OPT_INDENT_2, atomic-write Manifest.json + sha sidecar + .sig via AZ-280, operator-key fingerprint allowlist gate (C10-ST-01), ADR-010 takeoff_origin + flight_id baked into Manifest AND manifest_hash so re-planned routes change the cache identity (AC-15/AC-16). 20 unit tests cover all 16 ACs. AZ-324: ManifestVerifierImpl. Fail-closed Steps A-D: Manifest.json sidecar self-hash, Ed25519 trust-key set, schema parse with absolute/.. path rejection + takeoff_origin in-bbox check, stream SHA-256 per artifact with multi-failure accumulation. Operator mode re-derives tiles_coverage_sha256 from C6; airborne mode trusts the signed aggregate. 19 unit tests cover all 17 ACs. Composition root: c10_factory.build_manifest_builder + build_manifest_verifier + c6_tile_metadata_store_to_tiles_query adapter (the one place that legitimately imports both C6 and C10 without violating the AZ-270 lint). Dependency: pinned cryptography>=43.0,<46.0 in pyproject.toml. Tests: 1300 passed, 80 skipped (env-only), ruff clean for all AZ-323/324 files. AZ-306 (FAISS) intentionally deferred to batch 35 — needs C++ pybind11 toolchain not present in this environment. Co-authored-by: Cursor <cursoragent@cursor.com>
C10 ManifestVerifier — Takeoff Content-Hash Gate + Trusted-Key Pinning
Task: AZ-324_c10_manifest_verifier
Name: C10 ManifestVerifier
Description: Implement ManifestVerifier (per the contract _docs/02_document/contracts/c10_provisioning/manifest_verifier.md v1.1.0), the read-only validator that AC-NEW-1 places between F2 takeoff and any engine deserialization. Loads Manifest.json, verifies its sidecar SHA-256 matches the Manifest bytes, parses the Ed25519 detached signature at Manifest.json.sig, verifies it against the caller-supplied trusted_public_keys tuple, parses the Manifest schema (rejecting absolute paths and schema violations), validates the optional flight.takeoff_origin block (well-formed LatLonAlt + inside build.bbox per ADR-010 + AZ-490), and walks every per-artifact entry re-hashing it via AZ-280's sidecar pattern. Returns a VerificationResult with outcome ∈ {PASS, FAIL}, the union of all VerifyFailReason values that fired, the populated per_artifact_checks list, the pass-through takeoff_origin + flight_id (or None when absent from the Manifest body), and elapsed_ms. Fail-closed: any deviation in signature, schema, key trust, hashes, or origin validity yields FAIL with detailed reasons. Never raises on a verify failure — only on environment errors (Manifest.json missing → MANIFEST_NOT_FOUND is still FAIL, not raise).
Complexity: 3 points
Dependencies: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-280_sha256_sidecar, AZ-281_engine_filename_schema
Component: c10_provisioning (epic AZ-252 / E-C10)
Tracker: AZ-324
Epic: AZ-252 (E-C10)
Document Dependencies
- `_docs/02_document/contracts/c10_provisioning/manifest_verifier.md`: produced by this task (frozen Protocol + DTO shape, invariants, test cases).
- `_docs/02_document/contracts/shared_helpers/sha256_sidecar.md`: sidecar verify pattern (AZ-280).
- `_docs/02_document/components/11_c10_provisioning/description.md`: § 5 `ContentHashMismatchError` handling, § 7 D-C10-3 sidecar coverage.
Problem
Without a real verifier:
- AC-NEW-1 ("no engine deserialization at takeoff before manifest verify") collapses — F2 has nothing to gate on.
- D-C10-3 (SHA-256 content-hash gate over every shipped artifact) is unobservable at takeoff.
- C10-IT-02 (rejects tampered or wrong-key Manifests) cannot be implemented.
- A built but unverified Manifest is no better than no Manifest — operators cannot trust it without an actual check.
- Without a contract, C5 takeoff arming and C12 operator tooling cannot couple to C10 — every consumer would re-implement an ad-hoc check.
- The "fail-closed" property is a hard requirement; partial verifies that report PASS on first match would compromise the entire trust chain.
This task delivers the verifier + its frozen contract. It does NOT compile engines (AZ-321), build the Manifest (AZ-323), or own the takeoff-arming policy (E-C5).
Outcome
- A `ManifestVerifier` class implementation at `src/gps_denied_onboard/components/c10_provisioning/manifest_verifier.py` matching the Protocol in the contract.
- Constructor: `__init__(self, *, sidecar: Sha256Sidecar, logger: Logger, clock: Clock, tile_metadata_store: TileMetadataStore | None = None)`.
  - When `tile_metadata_store is None`, the verifier operates in airborne mode: trusts the recorded `tiles_coverage_sha256` after the signature passes (per MV-INV-5).
  - When `tile_metadata_store is not None`, the verifier operates in operator mode: re-derives `tiles_coverage_sha256` from C6 and reports `TILES_COVERAGE_MISMATCH` on drift.
- The frozen contract at `_docs/02_document/contracts/c10_provisioning/manifest_verifier.md` (already written; this task brings the implementation up to it).
- Method `verify_manifest(manifest_path, trusted_public_keys) -> VerificationResult` flow:
  - Start `time.monotonic()` for `elapsed_ms`.
  - Initialize empty `fail_reasons: list[VerifyFailReason]`, `fail_details: list[str]`, `per_artifact_checks: list[ArtifactCheck]`.
  - Step A (Manifest exists & sidecar matches):
    - If `manifest_path` does not exist: append `MANIFEST_NOT_FOUND`; return `FAIL` (no further work; per MV-INV-1).
    - Read `Manifest.json` bytes.
    - If `manifest_path.with_suffix(".json.sha256")` does not exist: append `SCHEMA_VIOLATION` ("missing manifest sidecar"); return `FAIL`.
    - If `sha256(manifest_bytes) != sidecar_value`: append `MANIFEST_SELF_HASH_MISMATCH`; return `FAIL` (do NOT consult signature per MV-INV-3).
  - Step B (signature verifies against a trusted key):
    - If `signature_path = manifest_path.with_suffix(".json.sig")` does not exist: append `SIGNATURE_NOT_FOUND`; `signing_public_key_fingerprint = None`; return `FAIL`.
    - Parse Ed25519 signature bytes (must be exactly 64 bytes; otherwise `SIGNATURE_INVALID`).
    - Try each public key in `trusted_public_keys`:
      - Compute `fingerprint = sha256(pub.public_bytes_raw()).hex()`.
      - Try `pub.verify(signature_bytes, manifest_bytes)`.
      - On success: signature is valid; `signing_public_key_fingerprint = fingerprint`; break.
    - If no trusted key verified:
      - If at least one key raised `InvalidSignature` (signature doesn't match this key's bytes): the signature could still match an untrusted key. Try parsing the Manifest's `signing_public_key_fingerprint` field (if schema parses) and report whichever is more diagnostic: `UNTRUSTED_PUBLIC_KEY` if the Manifest names a known-but-untrusted key, `SIGNATURE_INVALID` otherwise.
      - Append the reason; return `FAIL` (do NOT proceed to per-artifact hashing per MV-INV-2).
    - If `trusted_public_keys` is empty: append `UNTRUSTED_PUBLIC_KEY`; return `FAIL`.
  - Step C (schema parse):
    - `orjson.loads(manifest_bytes)` → dict.
    - Validate required keys: `schema_version`, `build` (with sub-keys `bbox`, `zoom_levels`, `sector_class`, `built_at`, `manifest_hash`), `artifacts` (with `engines`, `descriptor_index`, `calibration`, `tiles_coverage`), `signing_public_key_fingerprint`. The `flight` block is OPTIONAL (added in schema v1.1, ADR-010).
    - Validate types: `engines` is list of `{path: str, sha256: str}`; `descriptor_index`, `calibration` are `{path: str, sha256: str}`; `tiles_coverage` is `{sha256: str, tile_count: int}`.
    - Validate path-relative-only: every `path` value must be relative (no leading `/`, no `..` segments). Append `SCHEMA_VIOLATION` per offending field; if any, return `FAIL`.
    - Flight block (ADR-010 / AZ-490):
      - If `flight` key absent → `takeoff_origin = None`, `flight_id = None`; continue.
      - If `flight` present → parse `flight_id` (`UUID` or `None`) and `takeoff_origin` (optional block).
      - If `flight.takeoff_origin` present → validate `lat_deg ∈ [-90, 90]`, `lon_deg ∈ [-180, 180]`, `alt_m` finite (no NaN/Inf). Append `TAKEOFF_ORIGIN_INVALID` to `fail_reasons` and the offending field name to `fail_details` if any check fails.
      - If `flight.takeoff_origin` is well-formed → check it falls inside `build.bbox` (`bbox.lat_min ≤ lat ≤ bbox.lat_max`, `bbox.lon_min ≤ lon ≤ bbox.lon_max`). Append `TAKEOFF_ORIGIN_OUT_OF_BBOX` if not.
      - The `takeoff_origin` is populated on `VerificationResult` whenever the block parsed (even on FAIL), per MV-INV-9, so operators see what was attempted.
  - Step D (per-artifact hash walk; only reached if Steps A-C all passed):
    - For each engine, descriptor_index, calibration entry:
      - Compute `actual_path = manifest_path.parent / entry.path`.
      - If file missing: append `ArtifactCheck(entry.path, entry.sha256, None, matched=False)`; append `ARTIFACT_MISSING` to `fail_reasons` once if not already there.
      - Else: stream-read the file, compute SHA-256 (use AZ-280's helper that takes a path).
      - If hash matches: `matched=True`.
      - Else: `matched=False`; append `ARTIFACT_HASH_MISMATCH` once.
    - For tiles_coverage:
      - If `tile_metadata_store is None` (airborne mode): trust the recorded `tiles_coverage.sha256` since the Manifest signature already binds it. Append `ArtifactCheck("tiles_coverage", recorded_sha256, recorded_sha256, matched=True)` for completeness.
      - Else (operator mode): re-derive `tiles_coverage_sha256` by `tile_metadata_store.query_by_bbox(...)` over the `build.bbox` + `zoom_levels` + `sector_class`, sort by `(zoom, lat, lon, source)`, hash. If mismatch → `TILES_COVERAGE_MISMATCH`.
    - Walk ALL entries even on first failure (per MV-TC-9).
  - Set `outcome = PASS` iff `fail_reasons` is empty; else `FAIL`.
  - Set `elapsed_ms = int((time.monotonic() - start) * 1000)`.
  - Return `VerificationResult(...)`.
- INFO log on PASS (`c10.manifest.verify.pass` with elapsed_ms + fingerprint); WARN on FAIL with `fail_reasons` + counts of mismatched artifacts.
- Composition root factory `build_manifest_verifier(config, *, with_tile_store: bool) -> ManifestVerifier`: `with_tile_store=True` for operator mode, `False` for airborne C5.
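The closing steps of the flow (freeze the accumulators, derive the outcome, stamp `elapsed_ms`) might look like the sketch below. It is illustrative only: the `Outcome` enum, the reduced `VerificationResult` field set, the `finish` helper, and the use of plain strings for fail reasons are assumptions, not the contract's actual DTO.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASS = "PASS"
    FAIL = "FAIL"


@dataclass(frozen=True)
class VerificationResult:
    # Hypothetical subset of the contract's DTO fields.
    outcome: Outcome
    fail_reasons: tuple[str, ...]
    fail_details: tuple[str, ...]
    elapsed_ms: int


def finish(start: float, fail_reasons: list[str], fail_details: list[str]) -> VerificationResult:
    """Freeze the accumulators; PASS only when no reason was recorded."""
    outcome = Outcome.PASS if not fail_reasons else Outcome.FAIL
    elapsed_ms = int((time.monotonic() - start) * 1000)
    return VerificationResult(outcome, tuple(fail_reasons), tuple(fail_details), elapsed_ms)


start = time.monotonic()
ok = finish(start, [], [])
bad = finish(start, ["ARTIFACT_MISSING"], ["engines[0].path"])
```

Converting the lists to tuples at the very end keeps accumulation cheap while the returned result stays immutable and ordered.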
Scope
Included
- `ManifestVerifier` class implementing the Protocol from the contract.
- The contract document (frozen at v1.1.0).
- Schema validation against the v1.0 shape produced by AZ-323, plus the optional v1.1 `flight` block (ADR-010).
- Signature verification against a tuple of trusted public keys.
- Per-artifact stream-hash walk with multiple-failure accumulation.
- Airborne vs operator mode for tiles_coverage handling.
- Composition-root factory.
- Conformance test for the contract Protocol.
Excluded
- Manifest building / signing (AZ-323 owns).
- Trusted-key distribution / loading from disk: caller passes `Ed25519PublicKey` instances.
- Cache repair on FAIL: caller (E-C5 takeoff arming, E-C12 operator) decides next action.
- Coverage check for orphan files in `cache_root` (AZ-325 owns `ManifestCoverageError`).
- Logging Manifest contents (Manifests are not secret but verbose; only fingerprints + counts are logged).
- C13 FDR emission — caller's responsibility (per MV-INV-6).
- Non-Ed25519 signatures.
Acceptance Criteria
AC-1: PASS on a valid Manifest with all artifacts present and matching
Given a freshly-built Manifest + sig + sidecar from AZ-323 and trusted_public_keys = (signing_pub,)
When verify_manifest(manifest_path, trusted_public_keys) is called
Then outcome=PASS, fail_reasons is empty, per_artifact_checks has every entry matched=True, signing_public_key_fingerprint is the signing key's fingerprint, elapsed_ms > 0
AC-2: FAIL on missing Manifest with no further work
Given manifest_path does not exist
When verify runs
Then outcome=FAIL, fail_reasons=(MANIFEST_NOT_FOUND,), per_artifact_checks is empty (no work performed), signing_public_key_fingerprint=None
AC-3: FAIL on missing signature with diagnostic
Given Manifest.json exists + sidecar matches but Manifest.json.sig is absent
When verify runs
Then fail_reasons=(SIGNATURE_NOT_FOUND,), per_artifact_checks is empty, no per-artifact disk reads happen (defence-in-depth)
AC-4: FAIL on tampered Manifest body
Given Manifest.json is mutated by 1 byte after signing
When verify runs
Then either MANIFEST_SELF_HASH_MISMATCH (sidecar caught it first) OR SIGNATURE_INVALID (if sidecar was also re-computed by attacker); per-artifact walk does NOT happen
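A minimal sketch of the Step A self-hash check that produces the first of these two outcomes. The sidecar file format is owned by AZ-280 and is not shown in this document, so the sketch assumes the sidecar stores the hex digest as its first whitespace-delimited token. `step_a` is a hypothetical helper name, and fail reasons are plain strings here rather than `VerifyFailReason` members.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def step_a(manifest_path: Path) -> tuple[str | None, bytes | None]:
    """Return (fail_reason, manifest_bytes); fail_reason is None when Step A passes."""
    if not manifest_path.exists():
        return "MANIFEST_NOT_FOUND", None
    manifest_bytes = manifest_path.read_bytes()  # the Manifest itself is small
    sidecar_path = manifest_path.with_suffix(".json.sha256")
    if not sidecar_path.exists():
        return "SCHEMA_VIOLATION", None
    # Assumption: the hex digest is the first token in the sidecar file.
    tokens = sidecar_path.read_text(encoding="ascii").split()
    recorded = tokens[0].lower() if tokens else ""
    if hashlib.sha256(manifest_bytes).hexdigest() != recorded:
        return "MANIFEST_SELF_HASH_MISMATCH", None
    return None, manifest_bytes


with tempfile.TemporaryDirectory() as tmp:
    manifest = Path(tmp) / "Manifest.json"
    missing_reason, _ = step_a(manifest)

    manifest.write_bytes(b'{"schema_version": "1.1"}')
    sidecar = Path(tmp) / "Manifest.json.sha256"
    sidecar.write_text(hashlib.sha256(manifest.read_bytes()).hexdigest())
    clean_reason, clean_bytes = step_a(manifest)

    manifest.write_bytes(b'{"schema_version": "1.2"}')  # one-byte tamper after hashing
    tampered_reason, _ = step_a(manifest)
```

If the attacker also rewrites the sidecar, this check passes and the tamper is only caught by the signature check in Step B.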
AC-5: FAIL on untrusted public key
Given the Manifest is signed with a key NOT in trusted_public_keys
When verify runs
Then fail_reasons=(UNTRUSTED_PUBLIC_KEY,), signing_public_key_fingerprint is populated (so operators see WHICH untrusted key signed it), per-artifact walk does NOT happen
AC-6: FAIL on schema violation lists offending field
Given a Manifest missing the signing_public_key_fingerprint key
When verify runs
Then fail_reasons=(SCHEMA_VIOLATION,), fail_details contains a string naming signing_public_key_fingerprint
AC-7: FAIL on absolute path in artifact entry
Given an engine entry has path: "/etc/passwd"
When verify runs
Then fail_reasons=(SCHEMA_VIOLATION,), fail_details names the offending field; per-artifact walk does NOT consult /etc/passwd
AC-8: FAIL with multiple reasons accumulated
Given one engine is missing on disk AND one engine's bytes drifted AND a third engine matches
When verify runs
Then fail_reasons contains BOTH ARTIFACT_MISSING and ARTIFACT_HASH_MISMATCH (in deterministic order: traversal order); per_artifact_checks has all 3 entries with correct matched values; the third entry has matched=True
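The accumulation behaviour in AC-8 can be sketched as follows. This is an illustration under assumptions: the `ArtifactCheck` field names are hypothetical, `sha256_file` stands in for the AZ-280 helper, and reasons are plain strings.

```python
from __future__ import annotations

import hashlib
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ArtifactCheck:
    # Hypothetical field names; the contract's DTO may differ.
    path: str
    expected_sha256: str
    actual_sha256: str | None
    matched: bool


def sha256_file(path: Path) -> str:
    """Stand-in for the AZ-280 helper: stream the file in 64 KiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def walk_artifacts(root: Path, entries: list[dict[str, str]]) -> tuple[list[ArtifactCheck], list[str]]:
    """Check every entry; never stop at the first failure."""
    checks: list[ArtifactCheck] = []
    reasons: list[str] = []
    for entry in entries:
        target = root / entry["path"]
        actual = sha256_file(target) if target.is_file() else None
        matched = actual == entry["sha256"]
        checks.append(ArtifactCheck(entry["path"], entry["sha256"], actual, matched))
        if not matched:
            reason = "ARTIFACT_MISSING" if actual is None else "ARTIFACT_HASH_MISMATCH"
            if reason not in reasons:  # record each reason once, in traversal order
                reasons.append(reason)
    return checks, reasons


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "b.engine").write_bytes(b"drifted bytes")
    (root / "c.engine").write_bytes(b"good bytes")
    entries = [
        {"path": "a.engine", "sha256": "0" * 64},  # never written: missing
        {"path": "b.engine", "sha256": hashlib.sha256(b"original bytes").hexdigest()},
        {"path": "c.engine", "sha256": hashlib.sha256(b"good bytes").hexdigest()},
    ]
    checks, reasons = walk_artifacts(root, entries)
```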
AC-9: Operator mode re-derives tiles_coverage
Given tile_metadata_store is supplied AND C6's tiles for the build's bbox/zoom now have a different aggregate hash (e.g., a tile was re-downloaded)
When verify runs
Then fail_reasons=(TILES_COVERAGE_MISMATCH,); the recorded vs computed hashes are in fail_details
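The operator-mode re-derivation amounts to an order-independent aggregate hash over tile records. The sketch below shows the idea only. The `TileRecord` fields, the `|`-separated line encoding, and the `src-a` source label are all assumptions; the real canonical encoding is whatever AZ-323 used when it recorded `tiles_coverage.sha256`, and the verifier must reproduce it byte for byte.

```python
import hashlib
from typing import Iterable, NamedTuple


class TileRecord(NamedTuple):
    # Hypothetical projection of a C6 tile metadata row.
    zoom: int
    lat: float
    lon: float
    source: str
    sha256: str


def tiles_coverage_sha256(tiles: Iterable[TileRecord]) -> str:
    """Hash tiles in (zoom, lat, lon, source) order so query order does not matter."""
    digest = hashlib.sha256()
    for t in sorted(tiles, key=lambda t: (t.zoom, t.lat, t.lon, t.source)):
        # Assumed encoding, for illustration only.
        line = f"{t.zoom}|{t.lat!r}|{t.lon!r}|{t.source}|{t.sha256}\n"
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()


tiles = [
    TileRecord(15, 50.0, 36.0, "src-a", "aa"),
    TileRecord(14, 49.9, 36.1, "src-a", "bb"),
]
baseline = tiles_coverage_sha256(tiles)
reordered = tiles_coverage_sha256(list(reversed(tiles)))
drifted = tiles_coverage_sha256([tiles[0]._replace(sha256="cc"), tiles[1]])
```

A re-downloaded tile changes its per-tile hash, which changes the aggregate and surfaces as `TILES_COVERAGE_MISMATCH`.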
AC-10: Airborne mode trusts tiles_coverage post-signature
Given tile_metadata_store=None
When verify runs
Then tiles_coverage ArtifactCheck shows matched=True (recorded == "actual" because we don't re-derive); the airborne F2 path is fast (≤ 100 ms per NFR)
AC-11: Conformance — isinstance returns True
Given the implementation
When isinstance(impl, ManifestVerifier) is checked under runtime_checkable
Then True
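A short sketch of what the conformance check exercises. The Protocol body here is a simplified stand-in for the one frozen in the contract, and `NotAVerifier` is a hypothetical negative case.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class ManifestVerifier(Protocol):
    # Simplified stand-in for the contract's Protocol.
    def verify_manifest(self, manifest_path: Path, trusted_public_keys: tuple) -> object: ...


class ManifestVerifierImpl:
    def verify_manifest(self, manifest_path: Path, trusted_public_keys: tuple) -> object:
        raise NotImplementedError  # body irrelevant to the structural check


class NotAVerifier:
    pass


conforms = isinstance(ManifestVerifierImpl(), ManifestVerifier)
rejects = isinstance(NotAVerifier(), ManifestVerifier)
```

Note that `runtime_checkable` only checks that the named methods exist; it does not compare signatures or return types, so this test guards against a renamed or missing method rather than against behavioural drift.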
AC-12: elapsed_ms recorded on every outcome
Given any of the above ACs
When inspecting the result
Then elapsed_ms >= 0 and is reasonable (smaller for early-exit failures, larger for full per-artifact walks)
AC-13: Empty trusted_public_keys always fails closed
Given trusted_public_keys = ()
When verify runs
Then fail_reasons=(UNTRUSTED_PUBLIC_KEY,) regardless of Manifest validity; per-artifact walk does NOT happen
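The trusted-key loop behind AC-5 and AC-13 can be sketched without the `cryptography` dependency by using a test double that mirrors the two `Ed25519PublicKey` methods the flow relies on (`verify` and `public_bytes_raw`). `FakeKey`, the local `InvalidSignature`, `check_signature`, and the `claimed_fingerprint` parameter are hypothetical. Treating "the Manifest names a fingerprint outside the trusted set" as `UNTRUSTED_PUBLIC_KEY` is one reading of the Step B diagnostic, not a confirmed rule.

```python
from __future__ import annotations

import hashlib


class InvalidSignature(Exception):
    """Stand-in for cryptography.exceptions.InvalidSignature."""


class FakeKey:
    """Test double, NOT real cryptography: a signature is sha512(raw_key + data)."""

    def __init__(self, raw: bytes) -> None:
        self._raw = raw

    def public_bytes_raw(self) -> bytes:
        return self._raw

    def sign(self, data: bytes) -> bytes:
        return hashlib.sha512(self._raw + data).digest()  # 64 bytes, like Ed25519

    def verify(self, signature: bytes, data: bytes) -> None:
        if signature != self.sign(data):
            raise InvalidSignature


def fingerprint(key) -> str:
    return hashlib.sha256(key.public_bytes_raw()).hexdigest()


def check_signature(signature, manifest_bytes, trusted_public_keys, claimed_fingerprint=None):
    """Return (signing_public_key_fingerprint, fail_reason); fail_reason is None on success."""
    if not trusted_public_keys:  # fail closed before anything else (AC-13)
        return None, "UNTRUSTED_PUBLIC_KEY"
    if len(signature) != 64:
        return None, "SIGNATURE_INVALID"
    for pub in trusted_public_keys:
        try:
            pub.verify(signature, manifest_bytes)
        except InvalidSignature:
            continue
        return fingerprint(pub), None
    trusted = {fingerprint(pub) for pub in trusted_public_keys}
    if claimed_fingerprint and claimed_fingerprint not in trusted:
        return claimed_fingerprint, "UNTRUSTED_PUBLIC_KEY"
    return None, "SIGNATURE_INVALID"


body = b'{"schema_version": "1.1"}'
trusted_key, rogue_key = FakeKey(b"\x01" * 32), FakeKey(b"\x02" * 32)
ok = check_signature(trusted_key.sign(body), body, (trusted_key,))
empty = check_signature(trusted_key.sign(body), body, ())
untrusted = check_signature(rogue_key.sign(body), body, (trusted_key,), fingerprint(rogue_key))
garbage = check_signature(b"\x00" * 64, body, (trusted_key,))
```

Ed25519 signatures do not reveal the signer's public key, which is why the untrusted-key diagnostic has to lean on the fingerprint recorded in the Manifest body.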
AC-14: Manifest with no flight block parses cleanly (back-compat)
Given a v1.0 Manifest (no flight block) that is otherwise valid + signed
When verify runs
Then outcome=PASS; VerificationResult.takeoff_origin is None; VerificationResult.flight_id is None
AC-15: Well-formed in-bbox takeoff_origin passes through
Given a v1.1 Manifest with flight.takeoff_origin = (50.0, 36.2, 200.0) inside the recorded bbox
When verify runs
Then outcome=PASS; VerificationResult.takeoff_origin == LatLonAlt(50.0, 36.2, 200.0)
AC-16: Malformed takeoff_origin (lat=200) fails closed
Given a Manifest with flight.takeoff_origin.lat_deg = 200
When verify runs
Then outcome=FAIL; fail_reasons contains TAKEOFF_ORIGIN_INVALID; fail_details names lat_deg; the takeoff_origin field on VerificationResult is still populated for diagnostics
AC-17: Out-of-bbox takeoff_origin fails closed
Given a Manifest whose flight.takeoff_origin = (10.0, 10.0, 0) while build.bbox covers (49.5..50.5, 35.5..36.5)
When verify runs
Then outcome=FAIL; fail_reasons contains TAKEOFF_ORIGIN_OUT_OF_BBOX
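AC-15 through AC-17 can be summarised by a small validation routine like the one below. It is a sketch under assumptions: `check_takeoff_origin` is a hypothetical helper, the origin and bbox are plain dicts rather than the project's `LatLonAlt` and bbox types, and reasons are plain strings.

```python
import math


def _finite_number(value) -> bool:
    """True for int/float values that are neither NaN nor infinite (bools excluded)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool) and math.isfinite(value)


def check_takeoff_origin(origin: dict, bbox: dict) -> list:
    """Return (reason, detail) pairs; an empty list means the origin is acceptable."""
    failures = []
    lat, lon, alt = origin.get("lat_deg"), origin.get("lon_deg"), origin.get("alt_m")
    if not (_finite_number(lat) and -90.0 <= lat <= 90.0):
        failures.append(("TAKEOFF_ORIGIN_INVALID", "lat_deg"))
    if not (_finite_number(lon) and -180.0 <= lon <= 180.0):
        failures.append(("TAKEOFF_ORIGIN_INVALID", "lon_deg"))
    if not _finite_number(alt):
        failures.append(("TAKEOFF_ORIGIN_INVALID", "alt_m"))
    if failures:
        return failures  # the bbox check only applies to a well-formed origin
    inside = (bbox["lat_min"] <= lat <= bbox["lat_max"]
              and bbox["lon_min"] <= lon <= bbox["lon_max"])
    if not inside:
        failures.append(("TAKEOFF_ORIGIN_OUT_OF_BBOX", "flight.takeoff_origin"))
    return failures


bbox = {"lat_min": 49.5, "lat_max": 50.5, "lon_min": 35.5, "lon_max": 36.5}
in_bbox = check_takeoff_origin({"lat_deg": 50.0, "lon_deg": 36.2, "alt_m": 200.0}, bbox)
bad_lat = check_takeoff_origin({"lat_deg": 200, "lon_deg": 36.2, "alt_m": 200.0}, bbox)
outside = check_takeoff_origin({"lat_deg": 10.0, "lon_deg": 10.0, "alt_m": 0}, bbox)
nan_alt = check_takeoff_origin({"lat_deg": 50.0, "lon_deg": 36.2, "alt_m": float("nan")}, bbox)
```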
Non-Functional Requirements
Performance
- Airborne F2 verify (no per-tile re-derivation, ~5 artifact entries): wall-clock ≤ 100 ms on Jetson Orin (signature verify + 5 stream-SHA-256s of bounded files).
- Operator-mode verify with 100k tiles re-derivation: ≤ 5 s (matches AZ-323's NFR).
- Stream-hash files via 64 KB chunks; do NOT load engine binaries (~200 MB) entirely into memory.
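For reference, chunked hashing with `hashlib` looks roughly like this. `sha256_file` is a hypothetical stand-in for the AZ-280 helper, which production code should call instead.

```python
import hashlib
import tempfile
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # 64 KiB per read keeps peak memory flat regardless of file size


def sha256_file(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash a file incrementally instead of loading it with Path.read_bytes()."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


payload = b"\xab" * (200 * 1024 + 1)  # deliberately not a multiple of the chunk size
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "sample.bin"
    artifact.write_bytes(payload)
    streamed = sha256_file(artifact)
one_shot = hashlib.sha256(payload).hexdigest()
```

The streamed digest is identical to the one-shot digest; only the memory profile differs.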
Compatibility
- `cryptography` (already pinned via AZ-318), `orjson` (already pinned), `hashlib` (stdlib).
- No new third-party dependencies.
Reliability
- Fail-closed: empty trusted keys → FAIL; missing files → FAIL; any drift → FAIL.
- No partial PASS; the `outcome=PASS` branch is taken only when `fail_reasons` is empty.
- Defensive against directory traversal: relative paths only (AC-7).
Unit Tests
| AC Ref | What to Test | Required Outcome |
|---|---|---|
| AC-1 | Built Manifest from AZ-323 fixture | PASS; all matched |
| AC-2 | Missing Manifest.json | FAIL; MANIFEST_NOT_FOUND only |
| AC-3 | Missing signature | FAIL; SIGNATURE_NOT_FOUND; no disk reads |
| AC-4 | Mutated Manifest body | FAIL; either MANIFEST_SELF_HASH_MISMATCH or SIGNATURE_INVALID |
| AC-5 | Wrong-key signing | FAIL; UNTRUSTED_PUBLIC_KEY; fingerprint populated |
| AC-6 | Missing required field | FAIL; SCHEMA_VIOLATION + field name |
| AC-7 | Absolute path in artifact | FAIL; SCHEMA_VIOLATION; no path traversal |
| AC-8 | 1 missing + 1 drifted + 1 OK | Two failure reasons; per_artifact_checks complete |
| AC-9 | Operator mode + drifted tile | TILES_COVERAGE_MISMATCH |
| AC-10 | Airborne mode | tiles_coverage matched=True |
| AC-11 | Conformance check | True |
| AC-12 | Inspect elapsed_ms | All non-negative; ordered as expected |
| AC-13 | Empty trusted keys | FAIL; UNTRUSTED_PUBLIC_KEY |
| AC-14 | v1.0 Manifest (no flight block) | PASS; takeoff_origin=None; flight_id=None |
| AC-15 | v1.1 Manifest, valid in-bbox origin | PASS; takeoff_origin populated |
| AC-16 | Malformed origin (lat=200) | FAIL; TAKEOFF_ORIGIN_INVALID; field name in details |
| AC-17 | Out-of-bbox origin | FAIL; TAKEOFF_ORIGIN_OUT_OF_BBOX |
| NFR-perf-airborne | 5 artifact bench, no tile re-walk | p99 ≤ 100 ms |
| NFR-perf-operator | 100k-tile re-walk | ≤ 5 s |
| NFR-reliability-stream-hash | 200 MB engine + memory profile | Peak < 10 MB extra |
Constraints
- Stream SHA-256 over files via `hashlib.sha256().update(chunk)` in 64 KB blocks; do NOT `Path.read_bytes()` on engines (memory blowup per NFR).
- Path interpretation is relative-only; absolute paths are SCHEMA_VIOLATION (AC-7).
- The verifier is read-only (per MV-INV-6); no disk writes, no network, no FDR.
- `fail_reasons` is a tuple (immutable, ordered, deterministic).
- Signature checks happen before per-artifact walks (per MV-INV-2).
- Manifest sidecar check happens before signature (per MV-INV-3).
- Multiple failures accumulate; do not short-circuit on first per-artifact failure (per MV-TC-9 / AC-8).
Risks & Mitigation
Risk 1: Trusted-key list accidentally empty in production wiring
- Risk: Composition root mis-configures; airborne C5 ends up with an empty key list and arming silently fails forever.
- Mitigation: AC-13 + ERROR log on `UNTRUSTED_PUBLIC_KEY` with key-list-length=0 makes the misconfiguration loud at first arm attempt.
Risk 2: Per-artifact walk dominates airborne arm latency
- Risk: 5 engines × 200 MB stream-hash on slow microSD → 30 s arm latency.
- Mitigation: NFR-perf-airborne benchmark documents the envelope; if the Jetson microSD I/O is the bottleneck, a follow-up task adds an "incremental verify" path that trusts unchanged artifacts since last reboot. Out of scope this cycle.
Risk 3: Tampered sidecar matches tampered body (attacker drops both sidecar + body)
- Risk: AC-4's first failure case (sidecar mismatch) is bypassed by an attacker who recomputes the sidecar.
- Mitigation: Signature check (Step B) catches this — the signature is over the Manifest body; recomputing the sidecar does NOT also recompute the signature. The Ed25519 secret key is operator-only.
Risk 4: Path traversal via relative .. segments
- Risk: A relative path like `../../etc/passwd` passes the "no leading /" check but escapes cache_root.
- Mitigation: AC-7 + `..` segment rejection covers it; explicit check `if ".." in Path(entry.path).parts: SCHEMA_VIOLATION`.
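A minimal sketch of the combined absolute-path and `..` check, assuming Manifest paths always use POSIX separators. `path_violation` and the example artifact names are hypothetical, and the returned strings are illustrative `fail_details` text.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def path_violation(field: str, value: str) -> str | None:
    """Return a fail_details message if the path is unsafe, else None."""
    if not value:
        return f"{field}: empty path"
    path = PurePosixPath(value)
    if path.is_absolute():
        return f"{field}: absolute path not allowed"
    if ".." in path.parts:  # pathlib keeps '..' segments, so this sees them
        return f"{field}: '..' segment not allowed"
    return None


absolute = path_violation("artifacts.engines[0].path", "/etc/passwd")
traversal = path_violation("artifacts.engines[1].path", "../../etc/passwd")
nested = path_violation("artifacts.engines[2].path", "engines/sub/../../../x")
clean = path_violation("artifacts.engines[3].path", "engines/model.engine")
```

Running the check during Step C means no unsafe path is ever joined to `manifest_path.parent`, so Step D never touches a file outside the cache root.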
Risk 5: Operator-mode tile re-walk on Jetson is too slow
- Risk: An airborne-mode verifier mistakenly gets a `tile_metadata_store` (composition root mistake) and re-walks 100k tiles, blowing the arm latency budget.
- Mitigation: The composition root factory `build_manifest_verifier(config, *, with_tile_store: bool)` is the explicit toggle; airborne wiring passes `with_tile_store=False`. AC-10 tests airborne mode latency.
Runtime Completeness
- Named capability: takeoff content-hash gate per AC-NEW-1 + D-C10-3 + C10-IT-02 (epic § Acceptance C10-IT-01..02; description.md § 5 `ContentHashMismatchError`).
- Production code that must exist: real `ManifestVerifier` orchestrating real `cryptography` Ed25519 verify + real `hashlib` stream-SHA-256 + real `orjson` schema parse; real `tile_metadata_store` re-derivation in operator mode.
- Allowed external stubs: tests MAY use a fake key generated in-test, fake Manifest fixtures from AZ-323's test fixtures; production wiring uses real keys from operator key store.
- Unacceptable substitutes: skipping Step A's sidecar check (loses bit-rot detection); skipping Step B before walking artifacts (defeats MV-INV-2 defence-in-depth); short-circuiting on first per-artifact failure (operators need full diagnostic per MV-TC-9); HMAC instead of Ed25519 (different trust model); accepting absolute paths in entries (path traversal vulnerability per AC-7); raising on missing files instead of `outcome=FAIL` (breaks the contract's read-only / never-raise-on-verify-failure invariant).