Closes out greenfield Step 6 (Decompose) for all 14 components (C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446 plus the _dependencies_table.md and component contract documents. State file updated to greenfield Step 7 (Implement), not_started. Co-authored-by: Cursor <cursoragent@cursor.com>
Contract: sha256_sidecar
Component: shared_helpers / helpers.sha256_sidecar (cross-cutting concern owned by E-CC-HELPERS / AZ-264)
Producer task: AZ-280 — _docs/02_tasks/todo/AZ-280_sha256_sidecar.md
Consumer tasks: every C6 task that writes the FAISS index / descriptor sidecar; every C7 task that writes engine cache files + INT8 calibration cache; every C10 task that writes the Manifest; every C11 task that verifies tile artifacts before serving them
Version: 1.0.0
Status: draft
Last Updated: 2026-05-10
Purpose
Centralise the atomic-write + SHA-256 content-hash sidecar pattern (D-C10-3). Every persistent artifact that takeoff-load (F2) must verify gets written atomically AND has a .sha256 sidecar that the verifier can independently recompute. Without a shared helper, C6 / C7 / C10 / C11 each grow their own slightly-different implementation; the takeoff-load gate breaks the moment one of them drifts. Per _docs/02_document/common-helpers/05_helper_sha256_sidecar.md.
Shape
For function / method APIs
from pathlib import Path


class Sha256Sidecar:
    @staticmethod
    def write_atomic(path: Path, payload: bytes) -> str: ...  # returns hex digest

    @staticmethod
    def write_atomic_and_sidecar(path: Path, payload: bytes) -> str: ...  # returns hex digest

    @staticmethod
    def verify(path: Path) -> bool: ...  # checks payload hash against sidecar

    @staticmethod
    def aggregate_hash(paths: list[Path]) -> str: ...  # for Manifest covering many files
| Name | Signature | Throws / Errors | Blocking? |
|---|---|---|---|
| `write_atomic` | `(path, payload) -> str` | `Sha256SidecarError` if parent dir missing or filesystem rejects rename; underlying `OSError` is wrapped | sync, I/O |
| `write_atomic_and_sidecar` | `(path, payload) -> str` | same as `write_atomic` plus failure to write the sidecar atomically | sync, I/O |
| `verify` | `(path) -> bool` | `Sha256SidecarError` if `path` exists but `<path>.sha256` is missing or malformed (returns `False` if `path` itself is missing) | sync, I/O |
| `aggregate_hash` | `(list[Path]) -> str` | `Sha256SidecarError` if any path is missing | sync, I/O |
Path is pathlib.Path. Hex digests are lowercase 64-char strings.
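The error semantics in the table above can be illustrated with a minimal sketch of `verify`. This is not the helper's actual implementation: the module-level function, the `_HEX_DIGEST` pattern, and the error messages are assumptions made for illustration. Only the `Sha256SidecarError` name, the `<path>.sha256` sidecar location, and the return/raise behaviour come from this contract.

```python
import hashlib
import re
from pathlib import Path


class Sha256SidecarError(Exception):
    """Raised for a missing or malformed sidecar (name taken from the contract)."""


# Hypothetical validation pattern: lowercase hex, exactly 64 characters.
_HEX_DIGEST = re.compile(r"[0-9a-f]{64}")


def verify(path: Path) -> bool:
    """Recompute the payload digest and compare it with the sidecar value."""
    if not path.exists():
        return False  # a missing payload is a plain False, not an error
    sidecar = path.with_name(path.name + ".sha256")
    if not sidecar.exists():
        raise Sha256SidecarError(f"missing sidecar: {sidecar}")
    # Undecodable bytes become U+FFFD and then fail the hex check below.
    recorded = sidecar.read_bytes().decode("ascii", errors="replace").strip()
    if _HEX_DIGEST.fullmatch(recorded) is None:
        raise Sha256SidecarError(f"malformed digest in sidecar: {sidecar}")
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == recorded
```

Returning `False` only for a missing payload or a digest mismatch, and raising for sidecar problems, lets a caller tell "artifact is wrong" apart from "artifact was never staged properly".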
Invariants
- Atomic write: `write_atomic` writes to a temp file in the same directory as `path` and renames it to `path` once the bytes are flushed. The rename is filesystem-level, so partial files NEVER appear at `path`.
- Sidecar format: `write_atomic_and_sidecar` writes `<path>.sha256` containing ONLY the lowercase hex digest, no JSON wrapper, no trailing newline. Keeps verification trivial (`open(...).read().strip() == expected`).
- Verify is independent: `verify(path)` recomputes the digest from the file's bytes and compares to the sidecar; it does NOT trust the sidecar's value alone.
- Aggregate hash is order-deterministic: `aggregate_hash` sorts the input paths first (case-sensitive, full path) so two runs that read the same files always yield the same aggregate. The aggregate is the SHA-256 of the concatenation of `<filename>\0<file-hex-digest>\n` lines (in sorted order).
- No upward imports (Layer 1): the module imports ONLY from `_types`, `atomicwrites`, `hashlib`, `pathlib`, and stdlib. No `gps_denied_onboard.components.*` imports.
- Production filesystem requirement: the atomic rename is filesystem-level; it works on POSIX local filesystems, not on NFS / SMB / overlayfs. The cache root MUST live on a local filesystem in production. Documented in the contract's Caveats section; not enforced at runtime (it would require an OS-specific check that adds no value when the deployment is locked).
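The write-side invariants above can be sketched as follows. This is a hedged illustration, not the helper itself: it swaps in stdlib `tempfile.mkstemp` plus `os.replace` where the contract names the `atomicwrites` package, it omits the `Sha256SidecarError` wrapping of `OSError`, it reads `<filename>` as the path's base name, and it recomputes each per-file digest from the file bytes rather than reading the sidecar. The contract does not settle those last two points, so treat them as assumptions.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def write_atomic(path: Path, payload: bytes) -> str:
    """Write payload to a temp file beside `path`, then rename it into place."""
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # atomic on a local POSIX filesystem
    except BaseException:
        # Leave no stray temp file behind; the target was never touched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return hashlib.sha256(payload).hexdigest()


def write_atomic_and_sidecar(path: Path, payload: bytes) -> str:
    """Write the payload, then `<path>.sha256` holding only the hex digest."""
    digest = write_atomic(path, payload)
    sidecar = path.with_name(path.name + ".sha256")
    write_atomic(sidecar, digest.encode("ascii"))  # no newline, no wrapper
    return digest


def aggregate_hash(paths: list[Path]) -> str:
    """SHA-256 over `<filename>\\0<file-hex-digest>\\n` lines in sorted order."""
    outer = hashlib.sha256()
    for p in sorted(paths, key=str):  # case-sensitive sort on the full path
        file_digest = hashlib.sha256(p.read_bytes()).hexdigest()
        outer.update(f"{p.name}\0{file_digest}\n".encode("utf-8"))
    return outer.hexdigest()
```

Creating the temp file in the same directory as the target matters because a rename is only atomic within one filesystem; a temp file under a system-wide temp directory could live on a different mount.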
Non-Goals
- Cryptographic signing: the sidecar protects against accidental corruption + file-replacement-after-staging, NOT against an attacker with write access. Threat model treats the operator workstation as trusted; the companion's write access is restricted to F4 (mid-flight tile gen) which has its own per-flight signing key path (out of scope for this helper).
- Streaming hashing of files larger than RAM: the helper's API takes `payload: bytes`, so the entire payload is in memory at write time. Files larger than RAM are out of scope (and outside the operational constraints of the cache root anyway).
- Compression / on-disk encoding: payload is written verbatim.
- Sidecar format versioning: there is no version byte; if the format ever changes, the verifier rejects the old format and forces a re-write.
Versioning Rules
- Breaking changes (sidecar format changed, function renamed/removed, return type changed, atomicity invariant relaxed) require a new major version + a deprecation pass through C6, C7, C10, C11.
- Non-breaking additions (new helper function, new optional kwarg with safe default) require a minor version bump.
Test Cases
| Case | Input | Expected | Notes |
|---|---|---|---|
| valid-write-and-verify | random 1 MiB payload, write to tmp path, then verify | `verify` returns `True`; sidecar contains the hex digest of the payload | Round-trip happy path |
| valid-aggregate-deterministic | 3 files written with the helper, then `aggregate_hash` called twice with paths in different order | both calls return the same hex digest | Order-deterministic invariant |
| valid-atomic-no-partial | inject a fault between temp write and rename (e.g., raise `OSError` mid-write); call `verify` afterward | `path` does NOT exist (or pre-existing version unchanged); no partial file at the target name | Atomicity invariant |
| invalid-sidecar-mismatch | manually overwrite `path` with different bytes after the sidecar was written | `verify(path)` returns `False` | Independent verification |
| invalid-missing-sidecar | `verify` on a path whose `.sha256` was deleted | `Sha256SidecarError` raised mentioning the missing sidecar | Strict sidecar requirement |
| invalid-malformed-sidecar | sidecar contains text that is not a hex digest | `Sha256SidecarError` raised mentioning malformed digest | Sidecar format strictness |
| invalid-missing-file-in-aggregate | `aggregate_hash` on a list including a non-existent path | `Sha256SidecarError` raised mentioning the missing path | Aggregate input validation |
| no-upward-imports | static import scan | only `_types`, `atomicwrites`, `hashlib`, `pathlib`, stdlib | Layer 1 invariant |
Change Log
| Version | Date | Change | Author |
|---|---|---|---|
| 1.0.0 | 2026-05-10 | Initial contract derived from _docs/02_document/common-helpers/05_helper_sha256_sidecar.md | autodev decompose Step 2 |