mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 07:01:14 +00:00
ab92946833
Batch 5b completes the helpers sweep for cycle-1 Step 13.

For each of the four remaining helpers (sha256_sidecar, engine_filename_schema, ransac_filter, descriptor_normaliser):
- Append "Cycle-1 operational reality" section to the existing common-helpers/<NN>_*.md, documenting the shipped interface, exception types, public constants, determinism / validation invariants, and AZ-task lineage.

Specific cycle-1 facts captured per helper:
- sha256_sidecar (AZ-280): single Sha256SidecarError hierarchy, SIDECAR_SUFFIX public constant, sidecar format is pure lowercase 64-char hex (no JSON), verbatim ".sha256" suffix append, streaming digests in 1 MiB chunks, verify-returns-False semantics for missing payload vs. raise for missing sidecar, byte-deterministic aggregate_hash with sorted-by-str basenames.
- engine_filename_schema (AZ-281): EngineFilenameSchemaError, ENGINE_SUFFIX and ALLOWED_PRECISIONS public constants, strict model validation ([a-z0-9_]+ ≤64 chars no __), dotted version regex, non-bool sm validation, matches_host ignores precision by design.
- ransac_filter (AZ-282 / AZ-623): RansacFilterError, frozen RansacResult dataclass, cv2.setRNGSeed(0) determinism, median-not-mean residual, NaN for empty inliers, min_inliers is informational only, filter_correspondences uses perspectiveTransform vs. compute_reprojection_residual uses projectPoints, OK to import se3_utils (both Layer 1).
- descriptor_normaliser (AZ-283 / AZ-338): DescriptorNormaliserError, ALLOWED_DTYPES = (float16, float32), float32 norm computation with dtype-preserving cast-back, new intra_cluster_normalise method for NetVLAD per-cluster L2 (AZ-338), descriptor_metric returns "inner_product" string.

Two contract files (descriptor_normaliser.md and ransac_filter.md mention follow-up) need follow-up minor revisions to match shipped surface; queued for the contracts-folder sweep.

Bumps _docs/_autodev_state.md sub_step to tests-doc-updates phase 9.

Co-authored-by: Cursor <cursoragent@cursor.com>
5.0 KiB
Common Helper — Sha256Sidecar
Purpose
Atomic-write + SHA-256 content-hash sidecar pattern (D-C10-3). Every persistent artifact that takeoff-load (F2) verifies must be written atomically AND have a .sha256 sidecar that the verifier can independently recompute. Centralising the pattern avoids two slightly-different implementations across C6 (FAISS index, tile metadata) and C7 (engine + calibration cache) and C10 (Manifest itself).
Used By
- C6: Tile Cache + Spatial Index (FAISS .index, descriptor sidecar; tile pixels do NOT use sidecars individually because there are too many; the Manifest covers the tile-tree hash collectively).
- C7: Inference Runtime (engine cache files + INT8 calibration cache; D-C10-6 calibration-cache trust depends on this).
- C10: Pre-flight Cache Provisioning (Manifest itself; aggregate hash of the cache root).
Interface (sketch)
    from pathlib import Path

    Sha256Hex = str  # stands in for the sketch's "sha256" return type; assumed to be the hex digest

    class Sha256Sidecar:
        @staticmethod
        def write_atomic(path: Path, payload: bytes) -> Sha256Hex: ...

        @staticmethod
        def write_atomic_and_sidecar(path: Path, payload: bytes) -> Sha256Hex: ...

        @staticmethod
        def verify(path: Path) -> bool: ...  # checks payload hash against sidecar

        @staticmethod
        def aggregate_hash(paths: list[Path]) -> Sha256Hex: ...  # for Manifest covering many files
Implementation Notes
- Backed by the atomicwrites package for atomic rename and Python's hashlib.sha256 for digesting.
- Sidecars are written as <path>.sha256 containing the hex digest (no JSON wrapper; this keeps verification trivial).
- aggregate_hash is order-deterministic (sorts paths first) so two runs that read the same files yield the same aggregate.
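The write path described in these notes can be sketched with the standard library alone. This is a minimal illustration, not the shipped module: it swaps the atomicwrites package for tempfile.mkstemp plus os.replace, and the write order (payload first, then sidecar) and the absence of a trailing newline in the sidecar are assumptions made for the sketch.

```python
import contextlib
import hashlib
import os
import tempfile
from pathlib import Path

SIDECAR_SUFFIX = ".sha256"  # literal suffix appended to the payload path


def write_atomic(path: Path, payload: bytes) -> str:
    """Write payload via temp file + os.replace; return its hex SHA-256."""
    # Create the temp file next to the target so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, path)  # readers see the old file or the new one
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)  # do not leave a stray temp file behind
        raise
    return hashlib.sha256(payload).hexdigest()


def write_atomic_and_sidecar(path: Path, payload: bytes) -> str:
    """Write the payload, then a sidecar holding only the hex digest."""
    digest = write_atomic(path, payload)
    # Append the suffix to the full name; with_suffix would replace an
    # existing extension (engine.engine would become engine.sha256).
    sidecar = Path(str(path) + SIDECAR_SUFFIX)
    write_atomic(sidecar, digest.encode("ascii"))  # assumed: no trailing newline
    return digest
```

Calling write_atomic_and_sidecar(Path("engine.engine"), data) in this sketch leaves engine.engine and engine.engine.sha256 side by side, with the sidecar containing 64 lowercase hex characters.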
Caveats
- The atomic rename is filesystem-level — works on POSIX local filesystems, not on NFS / SMB / overlayfs. For production deployments the cache root MUST live on a local filesystem.
- The sidecar is NOT cryptographically signed; it protects against accidental corruption + file-replacement-after-staging, NOT against an attacker with write access to the cache root. Threat model treats the operator workstation as trusted; the companion's write access is restricted to F4 (mid-flight tile gen) which has its own per-flight signing key path.
Cycle-1 operational reality
The shipped surface in src/gps_denied_onboard/helpers/sha256_sidecar.py (AZ-280) is static-only by design. Atomicity comes from atomicwrites.atomic_write (temp-file → os.replace). All four entry points wrap OSError and ValueError into a single exception hierarchy.
- Sha256SidecarError: single public exception type (subclasses RuntimeError). Raised on: write_atomic OS failure; write_atomic_and_sidecar sidecar OS failure; verify finds the sidecar missing for an existing payload; sidecar text not exactly 64 lowercase hex chars; aggregate_hash finds a missing or unreadable path.
- SIDECAR_SUFFIX = ".sha256": public module-level constant for callers (e.g. takeoff-load verifier listing) that need to spell the sidecar suffix without hard-coding it.
- Sidecar file format: pure hex digest, no JSON wrapper, exactly 64 chars, all lowercase. The validator rejects uppercase or wrong-length sidecars hard (catches "user edited the sidecar by hand and broke it"). Keeps verification trivial.
- Sidecar path appends .sha256 verbatim: Path.with_suffix would re-interpret an existing extension; we explicitly use Path(str(payload_path) + ".sha256"). So manifest → manifest.sha256 AND engine.engine → engine.engine.sha256. This is the AC-NEW-CACHE-3 / D-C10-3 invariant.
- Streaming digests: verify and aggregate_hash stream the file in 1 MiB chunks (_digest_file) so an 8 GB engine file does not require 8 GB of RAM. write_atomic is the only entry point that operates on in-memory bytes.
- verify semantics: returns False (not raise) when the payload path is missing entirely ("not verifiable" rather than "verification error"); raises Sha256SidecarError when the payload exists but the sidecar is missing, unreadable, or malformed. Callers can branch on path.exists() first if they need to distinguish missing-payload from corrupt-sidecar.
- aggregate_hash is byte-deterministic: input list is sorted lexicographically by str(path) before hashing. The digest is computed over the concatenation of <basename>\0<hex-digest>\n lines (basename only, NOT full path, so the same physical file at a different mount point still produces the same aggregate). Missing paths in the input list raise instead of being silently skipped.
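The verify and aggregate_hash rules above can be sketched as follows. This is a minimal illustration built only on hashlib, re and pathlib, not the shipped module. The names Sha256SidecarError, SIDECAR_SUFFIX and _digest_file come from the text; the regex, the chunk-size constant name, the error messages, and the choice not to strip whitespace from the sidecar are assumptions.

```python
import hashlib
import re
from pathlib import Path

SIDECAR_SUFFIX = ".sha256"
_CHUNK_BYTES = 1024 * 1024  # 1 MiB per read (constant name is hypothetical)
_HEX64 = re.compile(r"[0-9a-f]{64}")  # exactly 64 lowercase hex chars


class Sha256SidecarError(RuntimeError):
    """Single public exception type for the helper."""


def _digest_file(path: Path) -> str:
    """Hash a file in fixed-size chunks so memory use stays flat."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(_CHUNK_BYTES), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path: Path) -> bool:
    if not path.exists():
        return False  # missing payload: "not verifiable", not an error
    sidecar = Path(str(path) + SIDECAR_SUFFIX)  # verbatim append
    try:
        expected = sidecar.read_text(encoding="ascii")
        actual = _digest_file(path)
    except (OSError, ValueError) as exc:  # ValueError covers non-ASCII bytes
        raise Sha256SidecarError(f"cannot verify {path}: {exc}") from exc
    if not _HEX64.fullmatch(expected):
        raise Sha256SidecarError(f"malformed sidecar: {sidecar}")
    return actual == expected


def aggregate_hash(paths: list[Path]) -> str:
    h = hashlib.sha256()
    for p in sorted(paths, key=str):  # lexicographic by str(path)
        try:
            digest = _digest_file(p)
        except OSError as exc:  # missing or unreadable path raises
            raise Sha256SidecarError(f"cannot hash {p}: {exc}") from exc
        # One "<basename>\0<hex-digest>\n" line per file, basename only.
        h.update(f"{p.name}\0{digest}\n".encode("utf-8"))
    return h.hexdigest()
```

Because each line carries only the basename, copying the same files to a different directory leaves the aggregate unchanged in this sketch, while reordering the input list has no effect because of the sort.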
Cycle-1 task lineage
- AZ-280 — initial helper, contract producer.
- No cycle-1 follow-up tasks touched this helper. The C10 / C6 / C7 task batch that consumes it (AZ-301 C7 engine gate, AZ-303 C6 storage interfaces, AZ-305 C6 postgres+filesystem store, AZ-321 C10 engine compiler, AZ-322 C10 descriptor batcher, AZ-323 C10 manifest builder, AZ-324 C10 manifest verifier, AZ-325 C10 cache provisioner) cycles through the four Sha256Sidecar static methods without extending them.