mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-23 02:41:13 +00:00
Decompose Step 6 snapshot: 140 task specs + contract docs
Closes out greenfield Step 6 (Decompose) for all 14 components (C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446 plus the _dependencies_table.md and component contract documents. State file updated to greenfield Step 7 (Implement), not_started. Co-authored-by: Cursor <cursoragent@cursor.com>
# C10 ManifestVerifier — Takeoff Content-Hash Gate + Trusted-Key Pinning

**Task**: AZ-324_c10_manifest_verifier
**Name**: C10 ManifestVerifier
**Description**: Implement `ManifestVerifier` (per the contract `_docs/02_document/contracts/c10_provisioning/manifest_verifier.md`), the read-only validator that AC-NEW-1 places between F2 takeoff and any engine deserialization. It loads `Manifest.json`, verifies that its sidecar SHA-256 matches the Manifest bytes, parses the Ed25519 detached signature at `Manifest.json.sig`, verifies it against the caller-supplied `trusted_public_keys` tuple, parses the Manifest schema (rejecting absolute paths and schema violations), and walks every per-artifact entry, re-hashing it via AZ-280's sidecar pattern. It returns a `VerificationResult` with `outcome ∈ {PASS, FAIL}`, the union of all `VerifyFailReason` values that fired, the populated `per_artifact_checks` list, and `elapsed_ms`. Fail-closed: any deviation in signature, schema, key trust, or hashes yields `FAIL` with detailed reasons. The verifier never raises on a verify failure, only on environment errors; a missing `Manifest.json` is reported as `FAIL` with `MANIFEST_NOT_FOUND`, not raised.
**Complexity**: 3 points
**Dependencies**: AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module, AZ-280_sha256_sidecar, AZ-281_engine_filename_schema
**Component**: c10_provisioning (epic AZ-252 / E-C10)
**Tracker**: AZ-324
**Epic**: AZ-252 (E-C10)

### Document Dependencies

- `_docs/02_document/contracts/c10_provisioning/manifest_verifier.md`: produced by this task (frozen Protocol + DTO shape, invariants, test cases).
- `_docs/02_document/contracts/shared_helpers/sha256_sidecar.md`: sidecar verify pattern (AZ-280).
- `_docs/02_document/components/11_c10_provisioning/description.md`: § 5 `ContentHashMismatchError` handling, § 7 D-C10-3 sidecar coverage.

## Problem

Without a real verifier:

- AC-NEW-1 ("no engine deserialization at takeoff before manifest verify") collapses: F2 has nothing to gate on.
- D-C10-3 (SHA-256 content-hash gate over every shipped artifact) is unobservable at takeoff.
- C10-IT-02 (rejects tampered or wrong-key Manifests) cannot be implemented.
- A built but unverified Manifest is no better than no Manifest; operators cannot trust it without an actual check.
- Without a contract, C5 takeoff arming and C12 operator tooling cannot couple to C10; every consumer would re-implement an ad-hoc check.
- The "fail-closed" property is a hard requirement; partial verifies that report PASS on first match would compromise the entire trust chain.

This task delivers the verifier + its frozen contract. It does NOT compile engines (AZ-321), build the Manifest (AZ-323), or own the takeoff-arming policy (E-C5).

## Outcome

- A `ManifestVerifier` class implementation at `src/gps_denied_onboard/components/c10_provisioning/manifest_verifier.py` matching the Protocol in the contract.
- Constructor: `__init__(self, *, sidecar: Sha256Sidecar, logger: Logger, clock: Clock, tile_metadata_store: TileMetadataStore | None = None)`.
  - When `tile_metadata_store is None`, the verifier operates in airborne mode: it trusts the recorded `tiles_coverage_sha256` after the signature passes (per MV-INV-5).
  - When `tile_metadata_store is not None`, the verifier operates in operator mode: it re-derives `tiles_coverage_sha256` from C6 and reports `TILES_COVERAGE_MISMATCH` on drift.
- The frozen contract at `_docs/02_document/contracts/c10_provisioning/manifest_verifier.md` (already written; this task brings the implementation up to it).
- Method `verify_manifest(manifest_path, trusted_public_keys) -> VerificationResult` flow:
  1. Start `time.monotonic()` for `elapsed_ms`.
  2. Initialize empty `fail_reasons: list[VerifyFailReason]`, `fail_details: list[str]`, `per_artifact_checks: list[ArtifactCheck]`.
  3. **Step A: Manifest exists & sidecar matches**:
     - If `manifest_path` does not exist: append `MANIFEST_NOT_FOUND`; return `FAIL` (no further work; per MV-INV-1).
     - Read `Manifest.json` bytes.
     - If `manifest_path.with_suffix(".json.sha256")` does not exist: append `SCHEMA_VIOLATION` ("missing manifest sidecar"); return `FAIL`.
     - If `sha256(manifest_bytes) != sidecar_value`: append `MANIFEST_SELF_HASH_MISMATCH`; return `FAIL` (do NOT consult signature per MV-INV-3).
  4. **Step B: Signature verifies against a trusted key**:
     - If `signature_path = manifest_path.with_suffix(".json.sig")` does not exist: append `SIGNATURE_NOT_FOUND`; `signing_public_key_fingerprint = None`; return `FAIL`.
     - Parse Ed25519 signature bytes (must be exactly 64 bytes; otherwise append `SIGNATURE_INVALID` and return `FAIL`).
     - If `trusted_public_keys` is empty, no key can verify: append `UNTRUSTED_PUBLIC_KEY`; return `FAIL` (AC-13).
     - Try each public key in `trusted_public_keys`:
       - Compute `fingerprint = sha256(pub.public_bytes_raw()).hex()`.
       - Try `pub.verify(signature_bytes, manifest_bytes)`.
       - On success: signature is valid; `signing_public_key_fingerprint = fingerprint`; break.
     - If no trusted key verified (every key raised `InvalidSignature`):
       - The signature could still match a key outside the trusted set. Try parsing the Manifest's `signing_public_key_fingerprint` field (if the schema parses) and report whichever reason is more diagnostic: `UNTRUSTED_PUBLIC_KEY` if the field names a key whose fingerprint is not among the trusted keys (populate `signing_public_key_fingerprint` from the field, per AC-5), `SIGNATURE_INVALID` otherwise.
       - Append the reason; return `FAIL` (do NOT proceed to per-artifact hashing per MV-INV-2).
  5. **Step C: Schema parse**:
     - `orjson.loads(manifest_bytes)` → dict.
     - Validate required keys: `schema_version`, `build` (with sub-keys `bbox`, `zoom_levels`, `sector_class`, `built_at`, `manifest_hash`), `artifacts` (with `engines`, `descriptor_index`, `calibration`, `tiles_coverage`), `signing_public_key_fingerprint`.
     - Validate types: `engines` is list of `{path: str, sha256: str}`; `descriptor_index`, `calibration` are `{path: str, sha256: str}`; `tiles_coverage` is `{sha256: str, tile_count: int}`.
     - Validate path-relative-only: every `path` value must be relative (no leading `/`, no `..` segments).
     - On any violation: append `SCHEMA_VIOLATION` to `fail_reasons` once and add one `fail_details` entry per offending field; return `FAIL`.
  6. **Step D: Per-artifact hash walk** (only reached if Steps A–C all passed):
     - For each engine, descriptor_index, calibration entry:
       - Compute `actual_path = manifest_path.parent / entry.path`.
       - If file missing: append `ArtifactCheck(entry.path, entry.sha256, None, matched=False)`; append `ARTIFACT_MISSING` to `fail_reasons` once if not already there.
       - Else: stream-read the file, compute SHA-256 (use AZ-280's helper that takes a path).
       - If hash matches: `matched=True`.
       - Else: `matched=False`; append `ARTIFACT_HASH_MISMATCH` once.
     - For tiles_coverage:
       - If `tile_metadata_store is None` (airborne mode): trust the recorded `tiles_coverage.sha256` since the Manifest signature already binds it. Append `ArtifactCheck("tiles_coverage", recorded_sha256, recorded_sha256, matched=True)` for completeness.
       - Else (operator mode): re-derive `tiles_coverage_sha256` by `tile_metadata_store.query_by_bbox(...)` over the `build.bbox` + `zoom_levels` + `sector_class`, sort by `(zoom, lat, lon, source)`, hash. If mismatch → `TILES_COVERAGE_MISMATCH`.
     - Walk ALL entries even on first failure (per MV-TC-9).
  7. Set `outcome = PASS` iff `fail_reasons` is empty; else `FAIL`.
  8. Set `elapsed_ms = int((time.monotonic() - start) * 1000)`.
  9. Return `VerificationResult(...)`.
- INFO log on PASS (`c10.manifest.verify.pass` with elapsed_ms + fingerprint); WARN on FAIL with `fail_reasons` + counts of mismatched artifacts.
- Composition root factory `build_manifest_verifier(config, *, with_tile_store: bool) -> ManifestVerifier`: `with_tile_store=True` for operator mode, `False` for airborne C5.
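
Step A of the flow above can be sketched with the standard library alone. This is a minimal illustration, not the contract's implementation: the function name `check_manifest_sidecar`, the reduced `VerifyFailReason` enum, and the sidecar layout (hex digest as the first whitespace-separated token, as `sha256sum` writes it) are assumptions; the real sidecar format is defined by AZ-280.

```python
from __future__ import annotations

import hashlib
from enum import Enum
from pathlib import Path


class VerifyFailReason(Enum):
    # Only the reasons Step A can produce; the contract defines the full set.
    MANIFEST_NOT_FOUND = "MANIFEST_NOT_FOUND"
    SCHEMA_VIOLATION = "SCHEMA_VIOLATION"
    MANIFEST_SELF_HASH_MISMATCH = "MANIFEST_SELF_HASH_MISMATCH"


def check_manifest_sidecar(
    manifest_path: Path,
) -> tuple[bytes | None, list[VerifyFailReason]]:
    """Step A: return (manifest_bytes, reasons); bytes is None on failure."""
    if not manifest_path.exists():
        return None, [VerifyFailReason.MANIFEST_NOT_FOUND]

    manifest_bytes = manifest_path.read_bytes()

    # "Manifest.json" -> "Manifest.json.sha256"
    sidecar_path = manifest_path.with_suffix(".json.sha256")
    if not sidecar_path.exists():
        return None, [VerifyFailReason.SCHEMA_VIOLATION]

    # ASSUMPTION: the digest is the first token of the sidecar file.
    tokens = sidecar_path.read_text(encoding="utf-8", errors="replace").split()
    recorded = tokens[0].lower() if tokens else ""

    if hashlib.sha256(manifest_bytes).hexdigest() != recorded:
        # Early return: the signature is not consulted after a self-hash mismatch.
        return None, [VerifyFailReason.MANIFEST_SELF_HASH_MISMATCH]

    return manifest_bytes, []
```

Returning the bytes on success lets the caller pass the exact bytes that were hashed into the signature check, so the Manifest is read from disk only once.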

## Scope

### Included

- `ManifestVerifier` class implementing the Protocol from the contract.
- The contract document (frozen at v1.0.0).
- Schema validation against the v1.0 shape produced by AZ-323.
- Signature verification against a tuple of trusted public keys.
- Per-artifact stream-hash walk with multiple-failure accumulation.
- Airborne vs operator mode for tiles_coverage handling.
- Composition-root factory.
- Conformance test for the contract Protocol.

### Excluded

- Manifest building / signing (AZ-323 owns).
- Trusted-key distribution / loading from disk: the caller passes `Ed25519PublicKey` instances.
- Cache repair on FAIL: the caller (E-C5 takeoff arming, E-C12 operator) decides the next action.
- Coverage check for orphan files in `cache_root` (AZ-325 owns `ManifestCoverageError`).
- Logging Manifest contents (Manifests are not secret but verbose; only fingerprints + counts are logged).
- C13 FDR emission: the caller's responsibility (per MV-INV-6).
- Non-Ed25519 signatures.

## Acceptance Criteria

**AC-1: PASS on a valid Manifest with all artifacts present and matching**
Given a freshly-built Manifest + sig + sidecar from AZ-323 and `trusted_public_keys = (signing_pub,)`
When `verify_manifest(manifest_path, trusted_public_keys)` is called
Then `outcome=PASS`, `fail_reasons` is empty, `per_artifact_checks` has every entry `matched=True`, `signing_public_key_fingerprint` is the signing key's fingerprint, `elapsed_ms >= 0` (the value is truncated to whole milliseconds, so a very fast run may report 0; see AC-12)

**AC-2: FAIL on missing Manifest with no further work**
Given `manifest_path` does not exist
When verify runs
Then `outcome=FAIL`, `fail_reasons=(MANIFEST_NOT_FOUND,)`, `per_artifact_checks` is empty (no work performed), `signing_public_key_fingerprint=None`

**AC-3: FAIL on missing signature with diagnostic**
Given Manifest.json exists + sidecar matches but Manifest.json.sig is absent
When verify runs
Then `fail_reasons=(SIGNATURE_NOT_FOUND,)`, `per_artifact_checks` is empty, no per-artifact disk reads happen (defence-in-depth)

**AC-4: FAIL on tampered Manifest body**
Given Manifest.json is mutated by 1 byte after signing
When verify runs
Then either `MANIFEST_SELF_HASH_MISMATCH` (sidecar caught it first) OR `SIGNATURE_INVALID` (if sidecar was also re-computed by attacker); per-artifact walk does NOT happen

**AC-5: FAIL on untrusted public key**
Given the Manifest is signed with a key NOT in `trusted_public_keys`
When verify runs
Then `fail_reasons=(UNTRUSTED_PUBLIC_KEY,)`, `signing_public_key_fingerprint` is populated (so operators see WHICH untrusted key signed it), per-artifact walk does NOT happen

**AC-6: FAIL on schema violation lists offending field**
Given a Manifest missing the `signing_public_key_fingerprint` key
When verify runs
Then `fail_reasons=(SCHEMA_VIOLATION,)`, `fail_details` contains a string naming `signing_public_key_fingerprint`
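
The "names the offending field" behaviour in AC-6 could look roughly like the sketch below. It covers only the required-key check from Step C (not type validation), uses the standard-library `json` module in place of `orjson` so it runs without third-party packages, and the helper name `find_missing_keys` plus the dotted-name format of the returned strings are assumptions.

```python
from __future__ import annotations

import json

TOP_LEVEL_KEYS = ("schema_version", "build", "artifacts", "signing_public_key_fingerprint")
NESTED_KEYS = {
    "build": ("bbox", "zoom_levels", "sector_class", "built_at", "manifest_hash"),
    "artifacts": ("engines", "descriptor_index", "calibration", "tiles_coverage"),
}


def find_missing_keys(manifest_bytes: bytes) -> list[str]:
    """Return one detail string per missing required key (empty list = none missing)."""
    try:
        doc = json.loads(manifest_bytes)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return ["<root>: not valid JSON"]
    if not isinstance(doc, dict):
        return ["<root>: expected a JSON object"]

    details = [key for key in TOP_LEVEL_KEYS if key not in doc]
    for parent, children in NESTED_KEYS.items():
        section = doc.get(parent)
        if isinstance(section, dict):
            # Dotted names make the offending field unambiguous in fail_details.
            details.extend(f"{parent}.{child}" for child in children if child not in section)
        elif parent in doc:
            details.append(f"{parent}: expected a JSON object")
    return details
```

A caller would append `SCHEMA_VIOLATION` to `fail_reasons` once when the returned list is non-empty and copy the strings into `fail_details`.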

**AC-7: FAIL on absolute path in artifact entry**
Given an engine entry has `path: "/etc/passwd"`
When verify runs
Then `fail_reasons=(SCHEMA_VIOLATION,)`, `fail_details` names the offending field; per-artifact walk does NOT consult `/etc/passwd`

**AC-8: FAIL with multiple reasons accumulated**
Given one engine is missing on disk AND one engine's bytes drifted AND a third engine matches
When verify runs
Then `fail_reasons` contains BOTH `ARTIFACT_MISSING` and `ARTIFACT_HASH_MISMATCH` (in deterministic order: traversal order); `per_artifact_checks` has all 3 entries with correct `matched` values; the third entry has `matched=True`
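
The accumulation behaviour in AC-8 can be illustrated as follows. This is a sketch under assumptions: the `ArtifactCheck` field names are guessed from the positional arguments shown in the Outcome section, reasons are plain strings rather than `VerifyFailReason` members, and `walk_artifacts` is a hypothetical helper, not a name from the contract.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ArtifactCheck:
    # ASSUMPTION: field names are illustrative; the contract fixes the real DTO.
    path: str
    expected_sha256: str
    actual_sha256: str | None
    matched: bool


def walk_artifacts(
    root: Path, entries: list[dict[str, str]]
) -> tuple[list[ArtifactCheck], tuple[str, ...]]:
    """Check every entry; never stop at the first failure."""
    checks: list[ArtifactCheck] = []
    reasons: list[str] = []  # insertion order == traversal order

    for entry in entries:
        target = root / entry["path"]
        if not target.is_file():
            checks.append(ArtifactCheck(entry["path"], entry["sha256"], None, False))
            reason = "ARTIFACT_MISSING"
        else:
            digest = hashlib.sha256()
            with target.open("rb") as fh:
                # Stream in 64 KB chunks instead of reading the whole file.
                for chunk in iter(lambda: fh.read(64 * 1024), b""):
                    digest.update(chunk)
            actual = digest.hexdigest()
            matched = actual == entry["sha256"]
            checks.append(ArtifactCheck(entry["path"], entry["sha256"], actual, matched))
            reason = None if matched else "ARTIFACT_HASH_MISMATCH"

        # Each reason is recorded once, at its first occurrence.
        if reason is not None and reason not in reasons:
            reasons.append(reason)

    return checks, tuple(reasons)
```

Keeping `reasons` as a list during the walk and converting to a tuple on return gives the deterministic, traversal-ordered result AC-8 asks for.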

**AC-9: Operator mode re-derives tiles_coverage**
Given `tile_metadata_store` is supplied AND C6's tiles for the build's bbox/zoom now have a different aggregate hash (e.g., a tile was re-downloaded)
When verify runs
Then `fail_reasons=(TILES_COVERAGE_MISMATCH,)`; the recorded vs computed hashes are in `fail_details`
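
To make the "sort, then hash" re-derivation in AC-9 concrete, here is a small sketch. Everything about the record shape and the byte serialization is an assumption: the source only states the sort key `(zoom, lat, lon, source)`, and the canonical encoding is owned by AZ-323 and C6, so `TileRecord` and `derive_tiles_coverage_sha256` are hypothetical names.

```python
import hashlib
from typing import Iterable, NamedTuple


class TileRecord(NamedTuple):
    # ASSUMPTION: illustrative fields; the real tile metadata shape comes from C6.
    zoom: int
    lat: float
    lon: float
    source: str
    sha256: str


def derive_tiles_coverage_sha256(tiles: Iterable[TileRecord]) -> str:
    """Hash tile records in a canonical order so query order does not matter."""
    digest = hashlib.sha256()
    for tile in sorted(tiles, key=lambda t: (t.zoom, t.lat, t.lon, t.source)):
        # ASSUMPTION: one pipe-delimited UTF-8 line per tile.
        line = f"{tile.zoom}|{tile.lat!r}|{tile.lon!r}|{tile.source}|{tile.sha256}\n"
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()
```

Sorting before hashing is what makes the aggregate stable across store query orders; a re-downloaded tile with different bytes changes its per-tile hash and therefore the aggregate.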

**AC-10: Airborne mode trusts tiles_coverage post-signature**
Given `tile_metadata_store=None`
When verify runs
Then `tiles_coverage` `ArtifactCheck` shows `matched=True` (recorded == "actual" because we don't re-derive); the airborne F2 path is fast (≤ 100 ms per NFR)

**AC-11: Conformance: `isinstance` returns True**
Given the implementation
When `isinstance(impl, ManifestVerifier)` is checked against the `@runtime_checkable` Protocol
Then `True`
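
The conformance check in AC-11 relies on `typing.runtime_checkable`. The sketch below shows the mechanism with stand-in names (`ManifestVerifierProtocol`, `FakeVerifier`, and `NotAVerifier` are hypothetical; the real Protocol lives in the contract).

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ManifestVerifierProtocol(Protocol):
    # Stand-in for the contract's Protocol; parameter types are left open here.
    def verify_manifest(self, manifest_path: Any, trusted_public_keys: Any) -> Any: ...


class FakeVerifier:
    """Has the required method, so it satisfies the structural check."""

    def verify_manifest(self, manifest_path: Any, trusted_public_keys: Any) -> Any:
        return None


class NotAVerifier:
    """Lacks verify_manifest, so isinstance() is False."""


conforms = isinstance(FakeVerifier(), ManifestVerifierProtocol)
rejected = isinstance(NotAVerifier(), ManifestVerifierProtocol)
```

Note that a runtime `isinstance` check against a Protocol only confirms that the method exists; it does not compare signatures or return types, so the conformance test is best paired with a static type check.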

**AC-12: `elapsed_ms` recorded on every outcome**
Given any of the above ACs
When inspecting the result
Then `elapsed_ms >= 0` and is reasonable (smaller for early-exit failures, larger for full per-artifact walks)

**AC-13: Empty `trusted_public_keys` always fails closed**
Given `trusted_public_keys = ()`
When verify runs
Then `fail_reasons=(UNTRUSTED_PUBLIC_KEY,)` regardless of Manifest validity; per-artifact walk does NOT happen
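
The key-trial logic behind AC-5 and AC-13 can be sketched without the `cryptography` package by using a test double. To be clear about what is assumed: `FakeKey` is NOT real Ed25519 (it only mimics the `verify` / `public_bytes_raw` shape), the local `InvalidSignature` stands in for `cryptography.exceptions.InvalidSignature`, and `check_signature` with its `claimed_fingerprint` parameter is a hypothetical helper.

```python
from __future__ import annotations

import hashlib
from typing import Sequence


class InvalidSignature(Exception):
    """Stand-in for cryptography.exceptions.InvalidSignature."""


class FakeKey:
    """Test double, not cryptography: 'valid' means signature == sha512(raw + data)."""

    def __init__(self, raw: bytes) -> None:
        self._raw = raw

    def public_bytes_raw(self) -> bytes:
        return self._raw

    def sign(self, data: bytes) -> bytes:
        return hashlib.sha512(self._raw + data).digest()  # 64 bytes, like Ed25519

    def verify(self, signature: bytes, data: bytes) -> None:
        if signature != self.sign(data):
            raise InvalidSignature()


def check_signature(
    signature: bytes,
    manifest_bytes: bytes,
    trusted_public_keys: Sequence[FakeKey],
    claimed_fingerprint: str | None = None,
) -> tuple[str | None, str | None]:
    """Return (fingerprint, fail_reason); fail_reason is None on success."""
    if len(signature) != 64:
        return None, "SIGNATURE_INVALID"
    if not trusted_public_keys:
        return None, "UNTRUSTED_PUBLIC_KEY"  # fail closed on an empty key list

    trusted_fingerprints = []
    for pub in trusted_public_keys:
        fingerprint = hashlib.sha256(pub.public_bytes_raw()).hexdigest()
        trusted_fingerprints.append(fingerprint)
        try:
            pub.verify(signature, manifest_bytes)
        except InvalidSignature:
            continue  # try the next trusted key
        return fingerprint, None

    # No trusted key verified. If the Manifest names a key outside the trusted
    # set, report that key so operators can see which one signed it.
    if claimed_fingerprint is not None and claimed_fingerprint not in trusted_fingerprints:
        return claimed_fingerprint, "UNTRUSTED_PUBLIC_KEY"
    return None, "SIGNATURE_INVALID"
```

In production wiring the keys would be `Ed25519PublicKey` instances supplied by the caller, and `claimed_fingerprint` would come from the Manifest's `signing_public_key_fingerprint` field when the schema parses.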

## Non-Functional Requirements

**Performance**
- Airborne F2 verify (no per-tile re-derivation, ~5 artifact entries): wall-clock ≤ 100 ms on Jetson Orin (signature verify + 5 stream-SHA-256s of bounded files).
- Operator-mode verify with 100k tiles re-derivation: ≤ 5 s (matches AZ-323's NFR).
- Stream-hash files via 64 KB chunks; do NOT load engine binaries (~200 MB) entirely into memory.
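
The streaming requirement above can be met with a few lines of `hashlib`. This is a generic sketch, not AZ-280's helper: `sha256_file` and `CHUNK_SIZE` are illustrative names.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # 64 KB, per the performance NFR


def sha256_file(path: Path, chunk_size: int = CHUNK_SIZE) -> str:
    """Hash a file incrementally so peak memory stays near one chunk."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # read() returns b"" at end of file, which ends the loop.
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

Because only one chunk is held at a time, memory use is independent of file size, which is the property the NFR-reliability-stream-hash test is meant to observe.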

**Compatibility**
- `cryptography` (already pinned via AZ-318), `orjson` (already pinned), `hashlib` (stdlib).
- No new third-party dependencies.

**Reliability**
- Fail-closed: empty trusted keys → FAIL; missing files → FAIL; any drift → FAIL.
- No partial PASS; the `outcome=PASS` branch is taken only when `fail_reasons` is empty.
- Defensive against directory traversal: relative paths only (AC-7).

## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | Built Manifest from AZ-323 fixture | PASS; all matched |
| AC-2 | Missing Manifest.json | FAIL; MANIFEST_NOT_FOUND only |
| AC-3 | Missing signature | FAIL; SIGNATURE_NOT_FOUND; no disk reads |
| AC-4 | Mutated Manifest body | FAIL; either MANIFEST_SELF_HASH_MISMATCH or SIGNATURE_INVALID |
| AC-5 | Wrong-key signing | FAIL; UNTRUSTED_PUBLIC_KEY; fingerprint populated |
| AC-6 | Missing required field | FAIL; SCHEMA_VIOLATION + field name |
| AC-7 | Absolute path in artifact | FAIL; SCHEMA_VIOLATION; no path traversal |
| AC-8 | 1 missing + 1 drifted + 1 OK | Two failure reasons; per_artifact_checks complete |
| AC-9 | Operator mode + drifted tile | TILES_COVERAGE_MISMATCH |
| AC-10 | Airborne mode | tiles_coverage matched=True |
| AC-11 | Conformance check | True |
| AC-12 | Inspect elapsed_ms | All non-negative; ordered as expected |
| AC-13 | Empty trusted keys | FAIL; UNTRUSTED_PUBLIC_KEY |
| NFR-perf-airborne | 5 artifact bench, no tile re-walk | p99 ≤ 100 ms |
| NFR-perf-operator | 100k-tile re-walk | ≤ 5 s |
| NFR-reliability-stream-hash | 200 MB engine + memory profile | Peak < 10 MB extra |

## Constraints

- Stream SHA-256 over files via `hashlib.sha256().update(chunk)` in 64 KB blocks; do NOT `Path.read_bytes()` on engines (memory blowup per NFR).
- Path interpretation is relative-only; absolute paths are SCHEMA_VIOLATION (AC-7).
- The verifier is read-only (per MV-INV-6); no disk writes, no network, no FDR.
- `fail_reasons` on the returned `VerificationResult` is a tuple (immutable, ordered, deterministic).
- Signature checks happen before per-artifact walks (per MV-INV-2).
- Manifest sidecar check happens before signature (per MV-INV-3).
- Multiple failures accumulate; do not short-circuit on first per-artifact failure (per MV-TC-9 / AC-8).

## Risks & Mitigation

**Risk 1: Trusted-key list accidentally empty in production wiring**
- *Risk*: Composition root mis-configures; airborne C5 ends up with an empty key list and arming silently fails forever.
- *Mitigation*: AC-13 + ERROR log on `UNTRUSTED_PUBLIC_KEY` with key-list-length=0 makes the misconfiguration loud at first arm attempt.

**Risk 2: Per-artifact walk dominates airborne arm latency**
- *Risk*: 5 engines × 200 MB stream-hash on slow microSD → 30 s arm latency.
- *Mitigation*: NFR-perf-airborne benchmark documents the envelope; if the Jetson microSD I/O is the bottleneck, a follow-up task adds an "incremental verify" path that trusts unchanged artifacts since last reboot. Out of scope this cycle.

**Risk 3: Tampered sidecar matches tampered body (attacker replaces both sidecar + body)**
- *Risk*: AC-4's first failure case (sidecar mismatch) is bypassed by an attacker who recomputes the sidecar.
- *Mitigation*: Signature check (Step B) catches this. The signature is over the Manifest body; recomputing the sidecar does NOT also recompute the signature. The Ed25519 secret key is operator-only.

**Risk 4: Path traversal via relative `..` segments**
- *Risk*: A relative path like `../../etc/passwd` passes the "no leading /" check but escapes cache_root.
- *Mitigation*: AC-7 + `..` segment rejection covers it; explicit check `if ".." in Path(entry.path).parts: SCHEMA_VIOLATION`.
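
The two path rules (no leading `/`, no `..` segments) fit in a small predicate. This sketch assumes Manifest paths use forward slashes and therefore parses them with `PurePosixPath`; the helper name `is_safe_relative_path` and the rejection of empty strings are additions for illustration, not requirements from the contract.

```python
from pathlib import PurePosixPath


def is_safe_relative_path(raw: str) -> bool:
    """True when `raw` is relative and contains no '..' segment."""
    if not raw:
        return False  # ASSUMPTION: an empty path is treated as a violation
    path = PurePosixPath(raw)
    # `parts` splits on '/', so only whole '..' segments are rejected;
    # a file name such as 'a..b' is still allowed.
    return not path.is_absolute() and ".." not in path.parts
```

Checking `parts` rather than searching the raw string avoids false positives on names that merely contain two dots, while still catching `engines/../../etc/passwd`.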

**Risk 5: Operator-mode tile re-walk on Jetson is too slow**
- *Risk*: An airborne-mode verifier mistakenly gets a `tile_metadata_store` (composition root mistake) and re-walks 100k tiles, blowing the arm latency budget.
- *Mitigation*: The composition root factory `build_manifest_verifier(config, *, with_tile_store: bool)` is the explicit toggle; airborne wiring passes `with_tile_store=False`. AC-10 tests airborne mode latency.

## Runtime Completeness

- **Named capability**: takeoff content-hash gate per AC-NEW-1 + D-C10-3 + C10-IT-02 (epic § Acceptance C10-IT-01..02; description.md § 5 `ContentHashMismatchError`).
- **Production code that must exist**: real `ManifestVerifier` orchestrating real `cryptography` Ed25519 verify + real `hashlib` stream-SHA-256 + real `orjson` schema parse; real `tile_metadata_store` re-derivation in operator mode.
- **Allowed external stubs**: tests MAY use a fake key generated in-test, fake Manifest fixtures from AZ-323's test fixtures; production wiring uses real keys from operator key store.
- **Unacceptable substitutes**:
  - skipping Step A's sidecar check (loses bit-rot detection);
  - skipping Step B before walking artifacts (defeats MV-INV-2 defence-in-depth);
  - short-circuiting on first per-artifact failure (operators need full diagnostic per MV-TC-9);
  - HMAC instead of Ed25519 (different trust model);
  - accepting absolute paths in entries (path traversal vulnerability per AC-7);
  - raising on missing files instead of `outcome=FAIL` (breaks the contract's read-only / never-raise-on-verify-failure invariant).