Oleksandr Bezdieniezhnykh 880eabcb3f Decompose Step 6 snapshot: 140 task specs + contract docs
Closes out greenfield Step 6 (Decompose) for all 14 components
(C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446
plus the _dependencies_table.md and component contract documents.

State file updated to greenfield Step 7 (Implement), not_started.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:39:48 +03:00


Contract: sha256_sidecar

Component: shared_helpers / helpers.sha256_sidecar (cross-cutting concern owned by E-CC-HELPERS / AZ-264)
Producer task: AZ-280 (_docs/02_tasks/todo/AZ-280_sha256_sidecar.md)
Consumer tasks: every C6 task that writes the FAISS index / descriptor sidecar; every C7 task that writes engine cache files + INT8 calibration cache; every C10 task that writes the Manifest; every C11 task that verifies tile artifacts before serving them
Version: 1.0.0
Status: draft
Last Updated: 2026-05-10

Purpose

Centralise the atomic-write + SHA-256 content-hash sidecar pattern (D-C10-3). Every persistent artifact that takeoff-load (F2) must verify is written atomically AND has a .sha256 sidecar that the verifier can independently recompute. Without a shared helper, C6 / C7 / C10 / C11 each grow their own slightly different implementation, and the takeoff-load gate breaks the moment one of them drifts. Source document: _docs/02_document/common-helpers/05_helper_sha256_sidecar.md.

Shape

For function / method APIs

from pathlib import Path


class Sha256Sidecar:
    @staticmethod
    def write_atomic(path: Path, payload: bytes) -> str: ...                  # returns hex digest
    @staticmethod
    def write_atomic_and_sidecar(path: Path, payload: bytes) -> str: ...      # returns hex digest
    @staticmethod
    def verify(path: Path) -> bool: ...                                       # checks payload hash against sidecar
    @staticmethod
    def aggregate_hash(paths: list[Path]) -> str: ...                         # for Manifest covering many files

| Name | Signature | Throws / Errors | Blocking? |
|------|-----------|-----------------|-----------|
| write_atomic | (path, payload) -> str | Sha256SidecarError if parent dir missing or filesystem rejects rename; underlying OSError is wrapped | sync, I/O |
| write_atomic_and_sidecar | (path, payload) -> str | same as write_atomic plus failure to write the sidecar atomically | sync, I/O |
| verify | (path) -> bool | Sha256SidecarError if path exists but path.sha256 is missing or malformed (returns False if path itself is missing) | sync, I/O |
| aggregate_hash | (list[Path]) -> str | Sha256SidecarError if any path is missing | sync, I/O |

Path is pathlib.Path. Hex digests are lowercase 64-char strings.
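
The error behaviour listed for verify can be sketched as below. This is a minimal illustration of the table's rules, not the helper's actual implementation: the sidecar_path function and the _HEX_DIGEST pattern are hypothetical names introduced here, and the sidecar is assumed to sit next to the payload as <path>.sha256.

```python
import hashlib
import re
from pathlib import Path

# Lowercase, exactly 64 hex characters (the digest format stated in the contract).
_HEX_DIGEST = re.compile(r"[0-9a-f]{64}")


class Sha256SidecarError(Exception):
    """Error type named in the contract; this definition is a stand-in."""


def sidecar_path(path: Path) -> Path:
    # Hypothetical helper: append ".sha256" to the full file name.
    return path.with_name(path.name + ".sha256")


def verify(path: Path) -> bool:
    if not path.exists():
        return False  # a missing payload is a plain negative result, not an error
    sidecar = sidecar_path(path)
    if not sidecar.exists():
        raise Sha256SidecarError(f"missing sidecar: {sidecar}")
    # Undecodable bytes become replacement characters and then fail the pattern.
    recorded = sidecar.read_text(encoding="ascii", errors="replace").strip()
    if not _HEX_DIGEST.fullmatch(recorded):
        raise Sha256SidecarError(f"malformed digest in sidecar: {sidecar}")
    # Recompute from the payload bytes; the sidecar value is never trusted alone.
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == recorded
```

The asymmetry is deliberate in the table: a missing payload returns False, while a payload without a usable sidecar raises, so callers can tell "nothing staged yet" apart from "staged but unverifiable".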

Invariants

  • Atomic write: write_atomic writes to a temp file in the same directory as path and renames to path once the bytes are flushed. The rename is filesystem-level — partial files NEVER appear at path.
  • Sidecar format: write_atomic_and_sidecar writes <path>.sha256 containing ONLY the lowercase hex digest: no JSON wrapper, no trailing newline. This keeps verification trivial (open(...).read().strip() == expected).
  • Verify is independent: verify(path) recomputes the digest from the file's bytes and compares to the sidecar; it does NOT trust the sidecar's value alone.
  • Aggregate hash is order-deterministic: aggregate_hash sorts the input paths first (case-sensitive, full path) so two runs that read the same files always yield the same aggregate. The aggregate is the SHA-256 of the concatenation of <filename>\0<file-hex-digest>\n lines (in sorted order).
  • No upward imports (Layer 1): the module imports ONLY from _types, atomicwrites, and the standard library (hashlib, pathlib, and so on). No gps_denied_onboard.components.* imports.
  • Production filesystem requirement: the atomic rename is a filesystem-level guarantee. It works on POSIX local filesystems, not on NFS / SMB / overlayfs, so the cache root MUST live on a local filesystem in production. This is documented in the contract's Caveats section and is not enforced at runtime (enforcement would require an OS-specific check that adds no value when the deployment is locked).
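
The write and aggregate invariants above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the helper itself: the contract lists atomicwrites as a dependency, whereas this sketch swaps in tempfile.mkstemp plus os.replace; <filename> in the aggregate line is assumed to be the base name of each path; and per-file digests are assumed to be recomputed from the file bytes rather than read from sidecars.

```python
import hashlib
import os
import tempfile
from pathlib import Path


class Sha256SidecarError(Exception):
    """Error type named in the contract; this definition is a stand-in."""


def write_atomic(path: Path, payload: bytes) -> str:
    tmp_name = None
    try:
        # Temp file in the SAME directory so the rename stays on one filesystem.
        fd, tmp_name = tempfile.mkstemp(
            dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
        )
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # flush bytes to disk before the rename
        os.replace(tmp_name, path)  # readers see the old file or the new one, never a partial
    except OSError as exc:
        # Best-effort cleanup of the temp file, then wrap the OSError.
        if tmp_name is not None and os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise Sha256SidecarError(f"atomic write failed for {path}") from exc
    return hashlib.sha256(payload).hexdigest()


def write_atomic_and_sidecar(path: Path, payload: bytes) -> str:
    digest = write_atomic(path, payload)
    sidecar = path.with_name(path.name + ".sha256")
    # Sidecar holds ONLY the lowercase hex digest: no wrapper, no newline.
    write_atomic(sidecar, digest.encode("ascii"))
    return digest


def aggregate_hash(paths: list[Path]) -> str:
    outer = hashlib.sha256()
    for p in sorted(paths, key=str):  # case-sensitive sort on the full path
        if not p.exists():
            raise Sha256SidecarError(f"missing path: {p}")
        file_digest = hashlib.sha256(p.read_bytes()).hexdigest()
        # One "<filename>\0<file-hex-digest>\n" line per file (base name assumed).
        outer.update(p.name.encode("utf-8") + b"\0" + file_digest.encode("ascii") + b"\n")
    return outer.hexdigest()
```

Sorting inside aggregate_hash rather than at the call site means every consumer gets the same aggregate regardless of how it enumerated the files.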

Non-Goals

  • Cryptographic signing: the sidecar protects against accidental corruption and file replacement after staging, NOT against an attacker with write access. The threat model treats the operator workstation as trusted; the companion's write access is restricted to F4 (mid-flight tile gen), which has its own per-flight signing key path (out of scope for this helper).
  • Streaming hashing of files larger than RAM — the helper's API takes payload: bytes, so the entire payload is in memory at write time. Files larger than RAM are out of scope (and outside the operational constraints of the cache root anyway).
  • Compression / on-disk encoding — payload is written verbatim.
  • Sidecar format versioning — there is no version byte; if the format ever changes, the verifier rejects the old format and forces a re-write.

Versioning Rules

  • Breaking changes (sidecar format changed, function renamed/removed, return type changed, atomicity invariant relaxed) require a new major version + a deprecation pass through C6, C7, C10, C11.
  • Non-breaking additions (new helper function, new optional kwarg with safe default) require a minor version bump.

Test Cases

| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| valid-write-and-verify | random 1 MiB payload, write to tmp path, then verify | verify returns True; sidecar contains the hex digest of the payload | Round-trip happy path |
| valid-aggregate-deterministic | 3 files written with the helper, then aggregate_hash called twice with paths in different order | both calls return the same hex digest | Order-deterministic invariant |
| valid-atomic-no-partial | inject a fault between temp write and rename (e.g., raise OSError mid-write); call verify afterward | path does NOT exist (or pre-existing version unchanged); no partial file at the target name | Atomicity invariant |
| invalid-sidecar-mismatch | manually overwrite path with different bytes after the sidecar was written | verify(path) returns False | Independent verification |
| invalid-missing-sidecar | verify on a path whose .sha256 was deleted | Sha256SidecarError raised mentioning the missing sidecar | Strict sidecar requirement |
| invalid-malformed-sidecar | sidecar contains a string that is not a hex digest | Sha256SidecarError raised mentioning malformed digest | Sidecar format strictness |
| invalid-missing-file-in-aggregate | aggregate_hash on a list including a non-existent path | Sha256SidecarError raised mentioning the missing path | Aggregate input validation |
| no-upward-imports | static import scan | only _types, atomicwrites, and standard-library modules (hashlib, pathlib, ...) are imported | Layer 1 invariant |

Change Log

| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-10 | Initial contract derived from _docs/02_document/common-helpers/05_helper_sha256_sidecar.md | autodev decompose Step 2 |