mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 23:31:13 +00:00
[AZ-266] [AZ-269] [AZ-277] [AZ-280] Cross-cutting log/config + SE3/SHA256 helpers
AZ-266: schema-compliant JSON logging entrypoint, level normalisation, handler-topology guard, format-error fallback (log_record_schema v1.0.0).

AZ-269: env > YAML > defaults config loader, frozen Config dataclass, missing-var fail-fast with pointer to .env.example, component-block registry.

AZ-277: GTSAM-backed SE3Utils (matrix<->SE3 + exp/log/adjoint) with strict orthogonality, dtype, and bottom-row contract enforcement.

AZ-280: atomicwrites-backed write_atomic + independent verify + order-deterministic aggregate_hash; sidecar format strictness.

pyproject.toml pins gtsam>=4.2,<5.0 and atomicwrites>=1.4,<2.0 (named-backend deps per the AZ-277 / AZ-280 contracts).

139 unit tests pass (44 new). Review verdict: PASS_WITH_WARNINGS; findings are perf-NFR + journald deferrals, no blocking issues.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,154 @@
# Sha256Sidecar Helper Module

**Task**: AZ-280_sha256_sidecar
**Name**: Sha256Sidecar Helper
**Description**: Implement the shared `Sha256Sidecar` helper that owns the atomic-write + SHA-256 content-hash sidecar pattern (D-C10-3). Every persistent artifact that takeoff-load (F2) must verify gets written atomically AND has a `.sha256` sidecar that the verifier can independently recompute. Used by C6 (FAISS index, descriptor sidecar), C7 (engine cache + INT8 calibration cache), C10 (Manifest), and C11 (tile artifact verification). Stateless static-only design.
**Complexity**: 2 points
**Dependencies**: AZ-263_initial_structure
**Component**: shared.helpers.sha256_sidecar (cross-cutting; epic AZ-264 / E-CC-HELPERS)
**Tracker**: AZ-280
**Epic**: AZ-264 (E-CC-HELPERS)

### Document Dependencies

- `_docs/02_document/contracts/shared_helpers/sha256_sidecar.md`: frozen public interface this task produces.
- `_docs/02_document/common-helpers/05_helper_sha256_sidecar.md`: design rationale and consumer mapping (D-C10-3).

## Problem

The takeoff-load gate (F2) verifies four classes of persistent artifact: FAISS index + descriptor sidecar (C6), TensorRT engine cache + INT8 calibration cache (C7), Manifest (C10), and tile artifacts (C11). Each artifact must be written atomically (no partial files) AND must have a hash sidecar the verifier can independently recompute.

Without a shared helper:
- C6 / C7 / C10 / C11 each grow their own atomic-write + hash implementation; subtle differences in temp-file naming, rename ordering, or sidecar format break the cross-component verifier the moment one drifts.
- The Manifest aggregate hash (which covers many files) depends on path-ordering logic that has to live in exactly one place; if that ordering ever differs between a writer and a verifier, the entire cache root looks corrupt.
- An attacker (or an accidental `rsync`) replaces `engine.engine` after `engine.engine.sha256` was written; without independent verification, takeoff-load accepts the swapped file.

## Outcome

- A single `helpers.sha256_sidecar` module is the only path through which any onboard process writes hash-verified artifacts.
- Atomic write is a hard contract: the temp-file → rename pattern guarantees no partial file ever appears at the target path. A fault between the bytes-flushed point and the rename leaves either the previous version or no file at all, never a half-written one.
- `verify(path)` recomputes the digest from the file's bytes; it does NOT trust the sidecar's value alone. A swapped artifact with a stale sidecar is detected.
- `aggregate_hash` is order-deterministic (sorts paths first), so the Manifest aggregate is reproducible across writer and verifier.
- The sidecar format is intentionally trivial (lowercase hex digest, no JSON wrapper, no trailing newline) so any small script can verify a single artifact without pulling in the helper.
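
The write path described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped helper: it uses a stdlib temp-file + `os.replace` stand-in where this task mandates `atomicwrites` as the backend, it assumes the sidecar lives at `<path>.sha256`, and it assumes the artifact is written before its sidecar. The module-level functions and the `sidecar_path` name are hypothetical; the contract file remains the authority on the real signatures.

```python
import contextlib
import hashlib
import os
import tempfile
from pathlib import Path


class Sha256SidecarError(Exception):
    """Single error type for filesystem and sidecar failures."""


def sidecar_path(path: Path) -> Path:
    # Hypothetical helper: "engine.engine" -> "engine.engine.sha256".
    return path.with_name(path.name + ".sha256")


def write_atomic(path: Path, payload: bytes) -> None:
    """Write to a temp file next to the target, then rename it into place."""
    tmp_name = None
    try:
        fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push bytes to disk before the rename
        os.replace(tmp_name, path)  # atomic on a local POSIX filesystem
    except OSError as exc:
        if tmp_name is not None:
            with contextlib.suppress(OSError):
                os.unlink(tmp_name)  # never leave a stray temp file behind
        raise Sha256SidecarError(f"atomic write failed for {path}: {exc}") from exc


def write_atomic_and_sidecar(path: Path, payload: bytes) -> None:
    """Write the artifact, then its bare-hex sidecar (no newline, no JSON)."""
    digest = hashlib.sha256(payload).hexdigest()  # 64 lowercase hex chars
    write_atomic(path, payload)
    write_atomic(sidecar_path(path), digest.encode("ascii"))
```

Because the sidecar is bare hex, an independent check needs only `hashlib`: recompute `hashlib.sha256(path.read_bytes()).hexdigest()` and compare it with the sidecar's bytes.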

## Scope

### Included

- `Sha256Sidecar` static methods: `write_atomic`, `write_atomic_and_sidecar`, `verify`, `aggregate_hash`.
- `Sha256SidecarError` exception type wrapping underlying `OSError` and capturing missing/malformed sidecar conditions.
- Public interface contract published at `_docs/02_document/contracts/shared_helpers/sha256_sidecar.md`.

### Excluded

- Cryptographic signing: this helper is corruption + accidental-replacement defense only; signing is out of scope (mid-flight tile gen has its own per-flight signing key path elsewhere).
- Streaming hashing for payloads larger than RAM: out of scope; the helper's API is `payload: bytes`.
- Compression / on-disk encoding: payloads are written verbatim.
- Sidecar versioning: there is no version byte.
- Filesystem-type detection (warning when run on NFS / overlayfs): documented in contract Caveats; not enforced at runtime.

## Acceptance Criteria

**AC-1: Round-trip write + verify**
Given a 1 MiB random payload
When `write_atomic_and_sidecar(path, payload)` runs followed by `verify(path)`
Then `verify` returns True AND the sidecar at `path.sha256` contains a 64-char lowercase hex digest matching `hashlib.sha256(payload).hexdigest()`

**AC-2: Atomicity (no partial file on fault)**
Given a fault is injected between the temp-file flush and the rename (e.g., monkey-patch `os.replace` to raise `OSError`)
When `write_atomic(path, payload)` runs and raises
Then `path` does NOT exist (or, if it pre-existed, its bytes are unchanged); no `*.tmp` or partial file remains at the target name
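
A fault-injection check along the lines of AC-2 might look like the sketch below. It is an illustration only: `write_atomic` here is a stdlib stand-in (the real helper is `atomicwrites`-backed, so the function to patch may differ), and `check_atomicity_on_rename_fault` and the `manifest.bin` file name are hypothetical.

```python
import os
import tempfile
from pathlib import Path
from unittest import mock


class Sha256SidecarError(Exception):
    """Stand-in for the helper's single public error type."""


def write_atomic(path: Path, payload: bytes) -> None:
    """Stand-in writer: temp file in the target directory, then os.replace."""
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
        os.replace(tmp_name, path)
    except OSError as exc:
        os.unlink(tmp_name)  # clean up so no *.tmp survives the fault
        raise Sha256SidecarError(str(exc)) from exc


def check_atomicity_on_rename_fault(workdir: Path) -> None:
    target = workdir / "manifest.bin"
    target.write_bytes(b"previous version")

    # Inject the fault exactly at the rename step.
    with mock.patch("os.replace", side_effect=OSError("injected fault")):
        try:
            write_atomic(target, b"new payload")
        except Sha256SidecarError:
            pass
        else:
            raise AssertionError("write_atomic should have raised")

    # The pre-existing file is untouched and no temp file is left behind.
    assert target.read_bytes() == b"previous version"
    assert sorted(p.name for p in workdir.iterdir()) == ["manifest.bin"]
```

Patching `os.replace` on the `os` module works here because the stand-in looks the function up at call time; a backend that binds the rename function differently would need a different patch target.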

**AC-3: Independent verification rejects swapped payloads**
Given an artifact is written via `write_atomic_and_sidecar`, then the file at `path` is overwritten out-of-band with different bytes
When `verify(path)` runs
Then it returns False (NOT True; it must NOT trust the sidecar value alone)

**AC-4: Missing sidecar is an error, not False**
Given an artifact exists at `path` but `path.sha256` was deleted
When `verify(path)` runs
Then `Sha256SidecarError` is raised with a message naming the missing sidecar (the helper does NOT silently return False, because that would conflate "corrupt artifact" with "missing sidecar")

**AC-5: Malformed sidecar is rejected**
Given a sidecar containing `not a hex digest` or a digest of the wrong length
When `verify(path)` runs
Then `Sha256SidecarError` is raised mentioning malformed sidecar content
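
AC-3 through AC-5 together give `verify` three distinct outcomes: True, False, or an exception. A minimal sketch of that split follows, assuming the sidecar sits at `<path>.sha256` and that "malformed" means anything other than exactly 64 lowercase hex characters; `read_sidecar_digest` is a hypothetical name, and the exact error wording is not taken from the contract.

```python
import hashlib
import re
from pathlib import Path

# Exactly 64 lowercase hex characters; fullmatch rejects trailing newlines too.
_HEX_DIGEST = re.compile(rb"[0-9a-f]{64}")


class Sha256SidecarError(Exception):
    """Stand-in for the helper's single public error type."""


def read_sidecar_digest(artifact: Path) -> bytes:
    sidecar = artifact.with_name(artifact.name + ".sha256")
    try:
        raw = sidecar.read_bytes()
    except FileNotFoundError as exc:
        raise Sha256SidecarError(f"missing sidecar: {sidecar}") from exc
    except OSError as exc:
        raise Sha256SidecarError(f"unreadable sidecar {sidecar}: {exc}") from exc
    if _HEX_DIGEST.fullmatch(raw) is None:
        raise Sha256SidecarError(f"malformed sidecar content in {sidecar}")
    return raw


def verify(artifact: Path) -> bool:
    """True on match, False on mismatch, error if the sidecar is unusable."""
    recorded = read_sidecar_digest(artifact)
    try:
        # Recompute from the artifact's bytes; the sidecar is never trusted alone.
        actual = hashlib.sha256(artifact.read_bytes()).hexdigest().encode("ascii")
    except OSError as exc:
        raise Sha256SidecarError(f"cannot read artifact {artifact}: {exc}") from exc
    return actual == recorded
```

Keeping "mismatch" as a return value and "unusable sidecar" as an exception lets a caller distinguish a corrupt artifact from a broken cache layout without parsing messages.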

**AC-6: Aggregate hash is order-deterministic**
Given three files `a`, `b`, `c` and their hashes
When `aggregate_hash([a, b, c])` and `aggregate_hash([c, a, b])` run
Then both calls return the same hex digest (the implementation sorts paths internally)

**AC-7: Aggregate hash rejects missing files**
Given a list including a non-existent path
When `aggregate_hash` runs
Then `Sha256SidecarError` is raised mentioning the missing path
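
One way to get the order-determinism of AC-6 and the failure mode of AC-7 is sketched below. It follows the line format this spec quotes under Risk 2 (`<filename>\0<file-hex-digest>\n`, sorted lexicographically by full path), with two assumptions of its own: the path string exactly as passed in is used for both sorting and the `<filename>` field, and lines are UTF-8 encoded. The contract file, not this sketch, defines the real byte layout.

```python
import hashlib
from pathlib import Path
from typing import Iterable


class Sha256SidecarError(Exception):
    """Stand-in for the helper's single public error type."""


def aggregate_hash(paths: Iterable[Path]) -> str:
    """Hash a sorted list of '<path>\\0<digest>\\n' lines into one hex digest."""
    outer = hashlib.sha256()
    # Case-sensitive, code-point ordering on the path string (assumption).
    for path in sorted(paths, key=str):
        try:
            file_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        except FileNotFoundError as exc:
            raise Sha256SidecarError(f"missing path: {path}") from exc
        except OSError as exc:
            raise Sha256SidecarError(f"cannot read {path}: {exc}") from exc
        outer.update(f"{path}\0{file_digest}\n".encode("utf-8"))
    return outer.hexdigest()
```

Sorting inside the function, rather than trusting the caller's order, is what makes `aggregate_hash([a, b, c])` and `aggregate_hash([c, a, b])` agree.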

**AC-8: Sidecar format strictness**
Given the sidecar written by `write_atomic_and_sidecar`
When the file's bytes are read
Then the bytes are EXACTLY the 64-char lowercase hex digest: no JSON wrapper, no trailing newline, no whitespace

**AC-9: No upward imports (Layer 1 invariant)**
Given the helper module
When a static-import check runs
Then it imports ONLY from `_types`, `atomicwrites`, and the stdlib (including `hashlib` and `pathlib`); no `gps_denied_onboard.components.*` imports anywhere
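
A static-import check of the kind AC-9 describes can be approximated with the stdlib `ast` module. The sketch below is an illustration only (the Unit Tests table names importlinter / grep as the actual gate), and `forbidden_imports` is a hypothetical name. It inspects absolute imports; relative imports such as `from ..components import x` are not resolved and would need extra handling.

```python
import ast

FORBIDDEN = "gps_denied_onboard.components"


def forbidden_imports(source: str) -> list[str]:
    """Return every imported name that falls under the forbidden package."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Cover "from pkg.components import x" and "from pkg import components".
            names = [node.module] + [
                f"{node.module}.{alias.name}" for alias in node.names
            ]
        else:
            continue
        hits.extend(
            name
            for name in names
            if name == FORBIDDEN or name.startswith(FORBIDDEN + ".")
        )
    return hits
```

The source is only parsed, never executed, so the check can run on the helper module without its dependencies installed.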

## Non-Functional Requirements

**Performance**
- No specific latency budget per `_docs/02_document/common-helpers/05_helper_sha256_sidecar.md` (consumers are pre-flight / post-landing). Sanity bound: `write_atomic_and_sidecar` of a 1 MiB payload ≤ 50 ms on Tier-2.

**Reliability**
- `Sha256SidecarError` is the ONLY exception type the public surface raises on filesystem / sidecar errors. `OSError` MUST be wrapped so callers do not have to handle two error hierarchies.
- Purely deterministic: the same payload always produces the same digest.
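
The wrapping rule under Reliability could be centralised in one small context manager so that no public method leaks a raw `OSError`. This is a sketch of one possible approach; `wrap_os_errors` and `read_artifact` are hypothetical names, not part of the frozen interface.

```python
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


class Sha256SidecarError(Exception):
    """Stand-in for the helper's single public error type."""


@contextmanager
def wrap_os_errors(action: str, path: Path) -> Iterator[None]:
    """Re-raise any OSError from the body as Sha256SidecarError."""
    try:
        yield
    except OSError as exc:
        # "from exc" keeps the original error reachable via __cause__.
        raise Sha256SidecarError(f"{action} failed for {path}: {exc}") from exc


def read_artifact(path: Path) -> bytes:
    with wrap_os_errors("read", path):
        return path.read_bytes()
```

Callers then catch a single exception type, while the underlying `OSError` stays available on `__cause__` for logging.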

## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | Round-trip write + verify on 1 MiB random payload | sidecar matches `hashlib.sha256(payload).hexdigest()`; `verify` True |
| AC-2 | Inject `OSError` between flush and rename | no partial file remains at target name |
| AC-3 | Overwrite payload after sidecar is written | `verify` returns False |
| AC-4 | Delete sidecar; call `verify` | `Sha256SidecarError`; mentions missing sidecar |
| AC-5 | Malformed sidecar content | `Sha256SidecarError`; mentions malformed sidecar |
| AC-6 | `aggregate_hash` with two different orderings | byte-equal digests |
| AC-7 | `aggregate_hash` with a missing path | `Sha256SidecarError`; mentions missing path |
| AC-8 | Read sidecar bytes after `write_atomic_and_sidecar` | exactly 64 hex chars; no newline / whitespace / JSON |
| AC-9 | importlinter / grep gate | no `components.*` imports |
| NFR-perf | Microbench `write_atomic_and_sidecar` of 1 MiB payload | ≤ 50 ms on Tier-2 |

## Constraints

- Public surface frozen by `_docs/02_document/contracts/shared_helpers/sha256_sidecar.md` v1.0.0.
- Layer 1 Foundation only.
- `atomicwrites` is the single atomic-rename backend; pinned in `pyproject.toml` at AZ-263 / E-BOOT.
- Static-only design satisfies `coderule.mdc`.
- No new dependency beyond what AZ-263 / E-BOOT pinned.
- Production cache root MUST live on a local POSIX filesystem (NFS / SMB / overlayfs are unsupported per the helper's atomic-rename invariant). Documented in deployment artifacts; not enforced at runtime.

## Risks & Mitigation

**Risk 1: A future helper change relaxes atomicity to "best-effort"**
- *Risk*: Someone replaces the temp-file → rename pattern with a direct write under the rationale "rename is slow on certain filesystems"; takeoff-load occasionally sees partial files.
- *Mitigation*: AC-2 makes atomicity a hard test. Any regression that loses the rename is caught immediately.

**Risk 2: `aggregate_hash` ordering drifts between writer and verifier**
- *Risk*: A future change adds case-insensitive sorting or strips path prefixes; writer and verifier disagree; cache root looks corrupt.
- *Mitigation*: AC-6 pins the deterministic-ordering invariant; the contract spells out the exact format (`<filename>\0<file-hex-digest>\n` lines, lexicographically sorted by full path).

**Risk 3: Sidecar format ambiguity (someone wraps the digest in JSON)**
- *Risk*: A future contributor "improves" the sidecar to be JSON for "extensibility"; verification scripts that expect raw hex break.
- *Mitigation*: AC-8 pins the exact byte-level format. Versioning rules force a major bump for any format change.

## Runtime Completeness

- **Named capability**: atomic-write + SHA-256 content-hash sidecar (D-C10-3 / `05_helper_sha256_sidecar.md`).
- **Production code that must exist**: real `atomicwrites`-backed atomic rename; real `hashlib.sha256` digesting; real independent verify.
- **Allowed external stubs**: none. `hashlib` is stdlib and `atomicwrites` is a pinned production dependency, so neither needs a stub.
- **Unacceptable substitutes**: direct write (loses atomicity); trusting the sidecar value without recomputing the file's hash; JSON-wrapped sidecar; case-insensitive aggregate ordering.

## Contract

This task produces the contract at `_docs/02_document/contracts/shared_helpers/sha256_sidecar.md`.
Consumers MUST read that file, not this task spec, to discover the interface.