# Batch 43 — Cycle 1 Report **Date**: 2026-05-13 **Batch**: 43 **Tasks**: AZ-328 (C12 Build-Cache Orchestrator, 5pt) **Status**: complete; ticket ready to transition to "In Testing". ## Scope AZ-328 delivers `BuildCacheOrchestrator` — the F1 (pre-flight cache build) workflow head for the operator workstation. It composes the already-shipped C11 `TileDownloader` (AZ-316), C12 `CompanionBringup` (AZ-327), C12 `FlightsApiClient` (AZ-489), and the new `RemoteCacheProvisionerInvoker` (this task) into one sequenced flow, guarded by a workstation-side `filelock` lockfile. The `operator-tool build-cache` subcommand (T1, AZ-326) is now wired to a real orchestrator and returns granular exit codes per failure phase. This unblocks the F1 happy-path acceptance test (C12-IT-02) and gives operators the first end-to-end "I have a flight ID, build me a cache" workflow. ## Architectural Decisions ### 1. Phase-0 flight-resolve runs BEFORE the lockfile (ADR-010) Per the AZ-328 spec, the orchestrator resolves the `FlightDto` (via `flights_api_client.fetch_flight` or `load_flight_file`), derives the bbox, and computes the takeoff origin **before** acquiring the `.c12.lock` file. Rationale: a flight that cannot be resolved is an operator-input error (wrong flight ID, missing JSON file, expired auth token); holding a contended-resource lock while the operator fixes their input would artificially block parallel builds against unrelated flights. AC-11 and AC-14 enforce. ### 2. Consumer-side cuts (AZ-507) for C11 + C10 types Per the workspace cross-component import policy, C12 cannot import C11 or C10 directly. Implemented as: - `tile_downloader_cut.py` — local `TileDownloaderCut` Protocol + `DownloadRequestCut` / `DownloadBatchReportCut` / `DownloadOutcomeCut` DTOs mirroring the C11 shapes the orchestrator actually consumes. - `RemoteBuildOutcome` + `RemoteBuildReport` in `_types.py` mirror the C10 `BuildOutcome` + `BuildReport` shapes that come back as JSON over SSH (the orchestrator never imports C10 types — it parses JSON the remote process emits). - External-error matching uses **name-based whitelisting** (`_is_recognised(exc, frozenset({"SatelliteProviderError", ...}))`) rather than `isinstance`. C12 cannot import the actual C11/C10 exception classes; matching by class name keeps the orchestrator's failure-phase classification accurate without violating the policy. Anything not on the whitelist propagates raw (per AC-6 — unexpected exceptions still release the lock and surface to the caller). - C11 → C12 type translation lives at the composition root (`c12_factory.build_build_cache_orchestrator`), where both components may legally be imported together. ### 3. Failure surfacing: report vs raise Spec text alternates between "raise `CacheBuildError(...)`" and "return `CacheBuildReport(outcome=failure, ...)`". Resolved by: - **Returning** `CacheBuildReport(outcome=failure, failure_phase=...)` for all *recognised* operational failures (download error, companion not ready, build error, flight-resolve error). The CLI's `_exit_code_for_report` helper translates these into the existing exit-code constants — `EXIT_DOWNLOAD_FAILURE` (20), `EXIT_BUILD_FAILURE` (21), `EXIT_FLIGHT_RESOLVE_*` (90/91/92/93/94/95 per AZ-489's existing taxonomy, selected via the new `failure_exception_type` field on `CacheBuildReport`). - **Raising** `BuildLockHeldError` for lockfile contention only — this is the one case where there is no `CacheBuildReport` to return (no phase ever ran). The CLI maps it to `EXIT_LOCK_HELD` (50). This keeps the orchestrator's public surface a pure function-style return-the-report API for normal operator flows, while reserving exceptions for the "we never even started" case. ### 4. Workstation-side `filelock`, NOT a custom primitive `FileLock` / `FileLockFactory` Protocols + `FilelockFileLockFactory` concrete wrap the already-pinned `filelock` library (used by E-C13). Cross-platform file-locking correctness is non-trivial; the spec explicitly forbids rolling our own. The `LockTimeout` exception is the single failure mode the orchestrator catches. ### 5. Remote C10 stdout streaming + secret redaction `RemoteCacheProvisionerInvoker.invoke` line-iterates the SSH session's stdout, logging each line as `kind="c10.remote.progress"` at DEBUG. Memory stays bounded even for multi-hour builds. Before logging, every line is filtered through a redactor that replaces the literal `api_key` and `auth_token` values with `` (defence-in-depth — guards against C10 itself accidentally echoing them). The final stdout line is parsed as `RemoteBuildReport` JSON; malformed output raises `BuildReportParseError` (a `CacheBuildError` subclass) with the captured tail. ### 6. CLI request construction now lives in `cli.py`, not orchestrator The previous CLI implementation called `_resolve_flight` itself before handing per-flag values to a no-op orchestrator stub. AZ-328 moves flight resolution into the orchestrator (per spec), so the CLI now just packages flags into a `BuildCacheRequest` (with `FlightSource` sum type — `FlightById | FlightFromFile`) and forwards. The CLI is now a thin adapter; all the workflow logic lives in `build_cache.py`. ## Files Changed ### Production source (new) - `src/gps_denied_onboard/components/c12_operator_tooling/build_cache.py` — `BuildCacheOrchestrator` class + sequenced `build_cache(...)` flow + `_is_recognised` helper for name-based external-error matching. - `src/gps_denied_onboard/components/c12_operator_tooling/file_lock.py` — `FileLock` / `FileLockFactory` Protocols, `LockTimeout` exception, `FilelockFileLockFactory` concrete wrapping the `filelock` library. - `src/gps_denied_onboard/components/c12_operator_tooling/remote_c10_invoker.py` — `RemoteCacheProvisionerInvoker` (SSH-driven), `RemoteBuildRequest` DTO, secret-redacting stdout streamer, JSON parser for the final `RemoteBuildReport`. - `src/gps_denied_onboard/components/c12_operator_tooling/tile_downloader_cut.py` — `TileDownloaderCut` Protocol (consumer-side cut for C11 `TileDownloader`). ### Production source (modified) - `src/gps_denied_onboard/components/c12_operator_tooling/_types.py` — added `BuildCacheRequest`, `CacheBuildReport`, `FlightResolveReport`, `BuildCacheOutcome` enum, `FailurePhase` enum, `FlightResolveSource` enum, `FlightSource` sum type (`FlightById` / `FlightFromFile`), `RemoteBuildOutcome` + `RemoteBuildReport` (C10 mirror types), `DownloadRequestCut` + `DownloadBatchReportCut` + `DownloadOutcomeCut` (C11 mirror types). - `src/gps_denied_onboard/components/c12_operator_tooling/errors.py` — added `CacheBuildError(Exception)` with phase-aware `remediation` attribute, `BuildLockHeldError(CacheBuildError)`, `BuildReportParseError(CacheBuildError)`. - `src/gps_denied_onboard/components/c12_operator_tooling/config.py` — added `C12BuildCacheConfig` dataclass (`cache_staging_root`, `lock_filename`, `lock_timeout_s`, `companion_cache_root`, `flight_bbox_buffer_m`, `ssh_connect_timeout_s`, `flights_api_base_url`, `flights_api_auth_token`); added `build_cache: C12BuildCacheConfig` field to `C12Config`. - `src/gps_denied_onboard/components/c12_operator_tooling/cli.py` — `build-cache` subcommand now accepts `--companion-host`, `--companion-port`, `--satellite-provider-url`, `--api-key`; constructs `BuildCacheRequest`; calls `services.build_cache_orchestrator.build_cache(request)`; new `_exit_code_for_report` helper translates `CacheBuildReport` → exit code (handles success / idempotent_no_op / per-phase failure with granular flight-resolve exit codes). - `src/gps_denied_onboard/components/c12_operator_tooling/__init__.py` — re-exports all new AZ-328 types (eager — none of them pull heavy deps; PEP 562 lazy machinery still in place for `paramiko` / `httpx` adapters). - `src/gps_denied_onboard/runtime_root/c12_factory.py` — extended `OperatorToolServices` with `build_cache_orchestrator: BuildCacheOrchestrator | None`; added `build_build_cache_orchestrator(...)` factory that wires the `FilelockFileLockFactory`, `RemoteCacheProvisionerInvoker`, shared `ParamikoSshSessionFactory`, and translates the C11 `TileDownloader` (passed in by the C11 composition root) to the C12-local `TileDownloaderCut` Protocol. ### Production source (formatting-only — adjacent hygiene) These files were re-formatted by `ruff format` when the formatter ran over the c12 directory. No behaviour change; pure whitespace / line-wrapping normalisation: - `src/gps_denied_onboard/components/c12_operator_tooling/flights_api/_parser.py` - `src/gps_denied_onboard/components/c12_operator_tooling/flights_api/bbox.py` - `src/gps_denied_onboard/components/c12_operator_tooling/flights_api/file_loader.py` - `src/gps_denied_onboard/components/c12_operator_tooling/flights_api/httpx_client.py` - `tests/unit/c12_operator_tooling/test_az489_flights_api_client.py` ### Tests (new) - `tests/unit/c12_operator_tooling/test_build_cache_orchestrator.py` — 29 tests covering all 15 ACs + NFR-perf-overhead microbench. - `tests/unit/c12_operator_tooling/test_remote_c10_invoker.py` — 7 tests: happy-path JSON parsing, DEBUG-streamed progress lines, api_key + auth_token redaction, malformed/truncated stdout error handling. - `tests/unit/c12_operator_tooling/test_file_lock.py` — 3 tests: acquire/release, concurrent-contention `LockTimeout`, parent-directory creation. ### Tests (modified) - `tests/unit/c12_operator_tooling/test_cli_build_cache.py` — rewritten for the new CLI/orchestrator interface: tests now inject a `_FakeOrchestrator` that captures `BuildCacheRequest` and returns `CacheBuildReport`, asserting that the CLI assembles the request correctly and that `_exit_code_for_report` emits the correct exit code per outcome / `failure_phase` / `failure_exception_type`. ## Task Results | Task | Status | Files Modified | Tests | AC Coverage | Issues | |------|--------|---------------|-------|-------------|--------| | AZ-328_c12_build_cache_orchestrator | Done | 4 prod-new + 6 prod-modified + 3 tests-new + 1 test-modified | 116 c12 unit tests pass (29 new) | 15/15 ACs + NFR-perf-overhead | None | ## AC Test Coverage: All covered Every acceptance criterion (AC-1 through AC-15) plus the NFR-perf-overhead microbench has a directly-validating test in `test_build_cache_orchestrator.py`. Verified by running the targeted suite and inspecting test names against the spec's "Unit Tests" table. ## Code Review Verdict: PASS ### Findings None of severity Low or higher. **Notes (informational)**: - `httpx_client.py` has a pre-existing `ruff I001` import-sort warning unrelated to this batch. Not in scope; left for the file's owning task to address. - Adjacent formatting-only changes to four `flights_api/*.py` files and one `test_az489_*.py` file were a side-effect of running `ruff format` over the c12 directory after the new files were added. They normalise existing pre-formatter style and are safe to ship. ## Auto-Fix Attempts: 0 ## Stuck Agents: None ## Test Suite - C12 unit tests: **116 passed, 0 failed** (was 73 before this batch — added 39 new tests for AZ-328 + 4 net from CLI test rewrite). - Full repository unit suite: **1522 passed, 80 skipped** (all skips are pre-existing environment gates: Docker / CUDA / Jetson / TensorRT / actionlint). - One transient flake observed in `test_az273_fdr_client_ringbuf::test_ac1_enqueue_never_blocks_and_returns_overrun_on_overflow` on the first full-suite run (passed both standalone and on a re-run with `-p no:randomly`). Unrelated to this batch (C13 / FDR area). Not blocking. - `python -X importtime` cold-start: still <250 ms typical for `operator-tool --help` — adding the orchestrator's pure-Python deps (`filelock`, the new modules) did not regress NFR-perf-cold-start (heavy adapters still PEP 562 lazy). ## Next Batch The natural follow-on is **AZ-329 (C12 post-landing upload trigger)** or **AZ-330 (operator reloc service)**, both of which depend only on the now-shipped AZ-326 + AZ-327 services. Confirm with `_docs/02_tasks/_dependencies_table.md` at the start of Batch 44.