Decompose Step 6 snapshot: 140 task specs + contract docs

Closes out greenfield Step 6 (Decompose) for all 14 components
(C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446
plus the _dependencies_table.md and component contract documents.

State file updated to greenfield Step 7 (Implement), not_started.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-11 00:39:48 +03:00
parent 8171fcb29e
commit 880eabcb3f
172 changed files with 22897 additions and 35 deletions
@@ -0,0 +1,77 @@
# Contract: sha256_sidecar
**Component**: shared_helpers / `helpers.sha256_sidecar` (cross-cutting concern owned by E-CC-HELPERS / AZ-264)
**Producer task**: AZ-280 — `_docs/02_tasks/todo/AZ-280_sha256_sidecar.md`
**Consumer tasks**: every C6 task that writes the FAISS index / descriptor sidecar; every C7 task that writes engine cache files + INT8 calibration cache; every C10 task that writes the Manifest; every C11 task that verifies tile artifacts before serving them
**Version**: 1.0.0
**Status**: draft
**Last Updated**: 2026-05-10
## Purpose
Centralise the atomic-write + SHA-256 content-hash sidecar pattern (D-C10-3). Every persistent artifact that takeoff-load (F2) must verify is written atomically AND gets a `.sha256` sidecar whose digest the verifier can independently recompute. Without a shared helper, C6 / C7 / C10 / C11 each grow their own slightly different implementation, and the takeoff-load gate breaks the moment one of them drifts. Derived from `_docs/02_document/common-helpers/05_helper_sha256_sidecar.md`.
## Shape
### For function / method APIs
```python
from pathlib import Path


class Sha256Sidecar:
    @staticmethod
    def write_atomic(path: Path, payload: bytes) -> str: ...  # returns hex digest

    @staticmethod
    def write_atomic_and_sidecar(path: Path, payload: bytes) -> str: ...  # returns hex digest

    @staticmethod
    def verify(path: Path) -> bool: ...  # checks payload hash against sidecar

    @staticmethod
    def aggregate_hash(paths: list[Path]) -> str: ...  # for Manifest covering many files
```
| Name | Signature | Throws / Errors | Blocking? |
|------|-----------|-----------------|-----------|
| `write_atomic` | `(path, payload) -> str` | `Sha256SidecarError` if parent dir missing or filesystem rejects rename; underlying `OSError` is wrapped | sync, I/O |
| `write_atomic_and_sidecar` | `(path, payload) -> str` | same as `write_atomic` plus failure to write the sidecar atomically | sync, I/O |
| `verify` | `(path) -> bool` | `Sha256SidecarError` if `path` exists but `<path>.sha256` is missing or malformed; returns `False` without raising if `path` itself is missing | sync, I/O |
| `aggregate_hash` | `(list[Path]) -> str` | `Sha256SidecarError` if any path is missing | sync, I/O |
`Path` is `pathlib.Path`. Hex digests are lowercase 64-char strings.
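
A minimal sketch of how `write_atomic_and_sidecar` and `verify` could behave under the signatures above. It is an illustration, not the AZ-280 implementation: it uses only the stdlib (`tempfile` + `os.replace`) instead of `atomicwrites` so it stays self-contained, it uses module-level functions instead of the `Sha256Sidecar` class, and `_write_atomic`, `_HEX64`, and the `Sha256SidecarError` class body are hypothetical stand-ins.

```python
import hashlib
import os
import re
import tempfile
from pathlib import Path

_HEX64 = re.compile(r"[0-9a-f]{64}")  # lowercase 64-char hex digest


class Sha256SidecarError(Exception):
    """Stand-in for the contract's error type (assumed to be a plain Exception)."""


def _write_atomic(path: Path, payload: bytes) -> str:
    # The temp file lives next to the target so the rename stays on one filesystem.
    try:
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    except OSError as exc:  # e.g. parent directory missing
        raise Sha256SidecarError(f"cannot create temp file for {path}") from exc
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # bytes are on disk before the rename
        os.replace(tmp_name, path)  # atomic on POSIX local filesystems
    except OSError as exc:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise Sha256SidecarError(f"atomic write failed for {path}") from exc
    return hashlib.sha256(payload).hexdigest()


def write_atomic_and_sidecar(path: Path, payload: bytes) -> str:
    digest = _write_atomic(path, payload)
    sidecar = path.with_name(path.name + ".sha256")
    _write_atomic(sidecar, digest.encode("ascii"))  # digest only, no newline
    return digest


def verify(path: Path) -> bool:
    if not path.exists():
        return False  # missing payload is reported, not raised
    sidecar = path.with_name(path.name + ".sha256")
    if not sidecar.exists():
        raise Sha256SidecarError(f"missing sidecar: {sidecar}")
    recorded = sidecar.read_text(encoding="ascii", errors="replace").strip()
    if not _HEX64.fullmatch(recorded):
        raise Sha256SidecarError(f"malformed digest in {sidecar}")
    # Recompute from the payload bytes; the sidecar value is never trusted alone.
    return hashlib.sha256(path.read_bytes()).hexdigest() == recorded
```

With this sketch, `write_atomic_and_sidecar(Path("index.faiss"), data)` followed by `verify(Path("index.faiss"))` returns `True`, and overwriting the payload afterwards makes `verify` return `False`.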
## Invariants
- **Atomic write**: `write_atomic` writes to a temp file in the same directory as `path` and renames to `path` once the bytes are flushed. The rename is filesystem-level — partial files NEVER appear at `path`.
- **Sidecar format**: `write_atomic_and_sidecar` writes `<path>.sha256` containing ONLY the lowercase hex digest, with no JSON wrapper and no trailing newline. This keeps the sidecar comparison trivial (`open(...).read().strip() == expected`).
- **Verify is independent**: `verify(path)` recomputes the digest from the file's bytes and compares to the sidecar; it does NOT trust the sidecar's value alone.
- **Aggregate hash is order-deterministic**: `aggregate_hash` sorts the input paths first (case-sensitive, full path) so two runs that read the same files always yield the same aggregate. The aggregate is the SHA-256 of the concatenation of `<filename>\0<file-hex-digest>\n` lines (in sorted order).
- **No upward imports** (Layer 1): the module imports ONLY from `_types`, `atomicwrites`, and the stdlib (`hashlib`, `pathlib`, and any other stdlib module it needs). No `gps_denied_onboard.components.*` imports.
- **Production filesystem requirement**: the atomic rename is filesystem-level: it works on POSIX local filesystems, not on NFS / SMB / overlayfs. The cache root MUST live on a local filesystem in production. This is documented in the contract's Caveats section and is not enforced at runtime, because enforcement would require an OS-specific check that adds no value when the deployment is locked.
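
The order-deterministic aggregate invariant can be sketched as follows. This is an illustration under stated assumptions: `<filename>` is read here as the basename (`path.name`), which the contract does not spell out, each per-file digest is recomputed from the file bytes, and `Sha256SidecarError` is a hypothetical stand-in for the real error type.

```python
import hashlib
from pathlib import Path


class Sha256SidecarError(Exception):
    """Stand-in for the contract's error type (assumed to be a plain Exception)."""


def aggregate_hash(paths: list[Path]) -> str:
    lines = []
    # Sort by full path string (case-sensitive) so caller order does not matter.
    for path in sorted(paths, key=str):
        if not path.is_file():
            raise Sha256SidecarError(f"missing path in aggregate: {path}")
        file_digest = hashlib.sha256(path.read_bytes()).hexdigest()
        # One "<filename>\0<file-hex-digest>\n" line per file; basename is an assumption.
        lines.append(f"{path.name}\0{file_digest}\n".encode("utf-8"))
    # The aggregate is the SHA-256 of the concatenated lines.
    return hashlib.sha256(b"".join(lines)).hexdigest()
```

Because the sort happens inside the function, `aggregate_hash([a, b, c])` and `aggregate_hash([c, a, b])` hash the same byte sequence and therefore return the same digest.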
## Non-Goals
- Cryptographic signing — the sidecar protects against accidental corruption + file-replacement-after-staging, NOT against an attacker with write access. Threat model treats the operator workstation as trusted; the companion's write access is restricted to F4 (mid-flight tile gen) which has its own per-flight signing key path (out of scope for this helper).
- Streaming hashing of files larger than RAM — the helper's API takes `payload: bytes`, so the entire payload is in memory at write time. Files larger than RAM are out of scope (and outside the operational constraints of the cache root anyway).
- Compression / on-disk encoding — payload is written verbatim.
- Sidecar format versioning — there is no version byte; if the format ever changes, the verifier rejects the old format and forces a re-write.
## Versioning Rules
- **Breaking changes** (sidecar format changed, function renamed/removed, return type changed, atomicity invariant relaxed) require a new major version + a deprecation pass through C6, C7, C10, C11.
- **Non-breaking additions** (new helper function, new optional kwarg with safe default) require a minor version bump.
## Test Cases
| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| valid-write-and-verify | random 1 MiB payload, write to tmp path, then `verify` | `verify` returns True; sidecar contains the hex digest of the payload | Round-trip happy path |
| valid-aggregate-deterministic | 3 files written with the helper, then `aggregate_hash` called twice with paths in different order | both calls return the same hex digest | Order-deterministic invariant |
| valid-atomic-no-partial | inject a fault between temp write and rename (e.g., raise `OSError` mid-write); call `verify` afterward | `path` does NOT exist (or pre-existing version unchanged); no partial file at the target name | Atomicity invariant |
| invalid-sidecar-mismatch | manually overwrite `path` with different bytes after the sidecar was written | `verify(path)` returns False | Independent verification |
| invalid-missing-sidecar | `verify` on a path whose `.sha256` was deleted | `Sha256SidecarError` raised mentioning the missing sidecar | Strict sidecar requirement |
| invalid-malformed-sidecar | sidecar contains `not a hex digest` | `Sha256SidecarError` raised mentioning malformed digest | Sidecar format strictness |
| invalid-missing-file-in-aggregate | `aggregate_hash` on a list including a non-existent path | `Sha256SidecarError` raised mentioning the missing path | Aggregate input validation |
| no-upward-imports | static import scan | only `_types`, `atomicwrites`, `hashlib`, `pathlib`, stdlib | Layer 1 invariant |
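
One way the `valid-atomic-no-partial` case could be exercised is to patch `os.replace` so the rename step fails after the temp file has been written. The sketch below is hypothetical: `write_atomic` is a simplified stand-in for the helper under test (it lets the raw `OSError` propagate instead of wrapping it in `Sha256SidecarError`), and `check_no_partial_file` and `engine.cache` are illustrative names.

```python
import os
import tempfile
from pathlib import Path
from unittest import mock


def write_atomic(path: Path, payload: bytes) -> None:
    """Simplified stand-in for the helper under test (temp file + rename)."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_name, path)
    except OSError:
        os.unlink(tmp_name)  # never leave the temp file behind
        raise


def check_no_partial_file(workdir: Path) -> None:
    target = workdir / "engine.cache"
    write_atomic(target, b"old contents")

    # Inject the fault at the rename step: the temp file is written, the swap fails.
    with mock.patch("os.replace", side_effect=OSError("injected fault")):
        try:
            write_atomic(target, b"new contents that must not land")
        except OSError:
            pass
        else:
            raise AssertionError("expected the injected OSError to propagate")

    # The pre-existing version is unchanged and no stray temp file remains.
    assert target.read_bytes() == b"old contents"
    assert sorted(p.name for p in workdir.iterdir()) == ["engine.cache"]
```

Under pytest this could be called as `check_no_partial_file(tmp_path)`; against the real helper, the expected exception would be `Sha256SidecarError` rather than `OSError`.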
## Change Log
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | 2026-05-10 | Initial contract derived from `_docs/02_document/common-helpers/05_helper_sha256_sidecar.md` | autodev decompose Step 2 |