[AZ-341] [AZ-329] [AZ-330] [AZ-328] Cumulative review batches 43-45

PASS_WITH_WARNINGS verdict covering AZ-328 (BuildCacheOrchestrator),
AZ-329 (PostLandingUploadOrchestrator + FdrFooterReader), AZ-330
(OperatorReLocService), AZ-523/AZ-524 (C11 internal gate removal +
c12_operator_orchestrator rename), and AZ-341 (FaissBridge +
DescriptorIndexCut).

Four Low-severity findings, all hygiene or carry-over: F1 ISO
timestamp helper duplicated across 6 modules (AZ-508 hygiene PBI
exists), F2 IndexUnavailableError namespace duplication c2/c6
flagged for spec/docstring alignment, F3 AZ-341 spec lists unused
normaliser parameter, F4 carry-over cold-start microbench host-load
flake.

Full unit suite 1565 passed / 80 env-skipped at close of window.
No new layer-direction or AZ-507 violations introduced; three new
structural Protocol cuts (TileDownloaderCut, FdrFooterReader,
DescriptorIndexCut) all follow the same shape.

State file updated: last_cumulative_review batches_40-42 ->
batches_43-45.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-13 21:50:32 +03:00
parent 1682dc354b
commit dd2f1cbae6
2 changed files with 326 additions and 2 deletions
@@ -0,0 +1,324 @@
# Cumulative Code Review — Batches 4345 / Cycle 1
**Date**: 2026-05-13
**Mode**: Cumulative (all 7 phases, emphasis on Phases 6 + 7)
**Batches covered**: 43, 44, 45
**Tasks covered**: AZ-328 (C12 Build-Cache Orchestrator + RemoteCacheProvisionerInvoker + FileLock), AZ-329 (C12 PostLandingUploadOrchestrator + FdrFooterReader), AZ-330 (C12 OperatorReLocService + OperatorCommandTransport Protocol), AZ-523 (C11 internal flight-state gate removal — supersedes AZ-317), AZ-524 (rename `c12_operator_tooling``c12_operator_orchestrator`; `OperatorToolServices``OperatorOrchestratorServices`), AZ-341 (C2 FAISS HNSW Retrieve Wiring — `FaissBridge` + `DescriptorIndexCut`)
**Changed files in scope**: 53 production sources + 21 test files + 8 docs / specs (full list in §Scope below)
**Verdict**: **PASS_WITH_WARNINGS**
## Scope
| Domain | Files |
|--------|-------|
| c2_vpr (production, new) | `_faiss_bridge.py`, `descriptor_index_cut.py` |
| c2_vpr (production, modified) | `__init__.py` (re-exports `DescriptorIndexCut` + `TileIdTuple`), `config.py` (added `warn_top1_threshold` + `debug_per_frame_distances` knobs + validators) |
| c11_tile_manager (production) | `flight_state_gate.py` (DELETED — Batch 44 SRP removal, AZ-523), `interface.py` (`TileUploader.upload_pending_tiles` v2.0.0 — removed `flight_state` parameter), `tile_uploader.py` (removed internal gate), `_types.py` (`UploadRequest` no longer carries `FlightStateSignal`), `errors.py` (removed `FlightStateNotOnGroundError`), `idempotent_retry.py` (signature alignment), `__init__.py` (re-export delta) |
| c12_operator_orchestrator (production, rename of `c12_operator_tooling`) | Every file in the directory was renamed; new modules introduced in batches 43/44: `build_cache.py`, `remote_c10_invoker.py`, `file_lock.py`, `tile_downloader_cut.py`, `tile_uploader_cut.py`, `post_landing_upload.py`, `fdr_footer_reader.py`, `operator_reloc_service.py`, `operator_command_transport.py` |
| c10_provisioning (production, minor) | `manifest_verifier.py` (touch; downstream of c12 type changes), `__init__.py` (re-export touch) |
| c3_5_adhop (production, minor) | `config.py` (touch; no behaviour change) |
| c6_tile_cache (production, minor) | `_tile_pixel_handle.py` (touch; type re-export shape) |
| Composition root | `runtime_root/__init__.py`, `c11_factory.py` (no internal gate to wire; signature update), `c12_factory.py` (`OperatorToolServices``OperatorOrchestratorServices`; added factories for `PostLandingUploadOrchestrator` + `OperatorReLocService`) |
| Healthcheck | `healthcheck.py` (touch; aggregator wiring updated for c12 rename) |
| FDR | `fdr_client/records.py` (registered `c12.reloc.requested` in Batch 44; registered `vpr.retrieve_topk` in Batch 45) |
| Tests | `c2_vpr/test_faiss_bridge.py` (NEW, 22 tests); `c12_operator_orchestrator/test_build_cache_orchestrator.py` (NEW, ~29 tests); `c12_operator_orchestrator/test_remote_c10_invoker.py` (NEW); `c12_operator_orchestrator/test_file_lock.py` (NEW); `c12_operator_orchestrator/test_cli_build_cache.py` (rewritten); `c12_operator_orchestrator/test_post_landing_upload_orchestrator.py` (NEW); `c12_operator_orchestrator/test_operator_reloc_service.py` (NEW); `c12_operator_orchestrator/test_fdr_footer_reader.py` (NEW); `c11_tile_manager/test_tile_uploader.py` (rewritten — gate removal); `c11_tile_manager/test_protocol_conformance.py` (rewritten); `test_az272_fdr_record_schema.py` (added `c12.reloc.requested` + `vpr.retrieve_topk` fixtures) |
| Docs | `_docs/02_document/architecture.md` (major Batch-44 doc sweep — C11/C12 entries, principle #4, data model, F10 flow, ADR-004), `system-flows.md` (F10 rewrite + new operator-reloc flow), `epics.md` (E-C11 + E-C12 rewrites), `data_model.md`, `glossary.md`, `FINAL_report.md`, `components/12_c11_tilemanager/{description,tests}.md`, `components/10_c8_fc_adapter/description.md`, `components/13_c12_operator_orchestrator/{description,tests}.md` (rename), `contracts/c11_tilemanager/tile_uploader.md` (v2.0.0), `contracts/c12_operator_orchestrator/` (new dir with `post_landing_upload.md`, `fdr_footer_reader.md`, `operator_command_transport.md`), `contracts/c6_tile_cache/tile_metadata_store.md` (minor C12 naming fix) |
| Tasks | `_docs/02_tasks/todo/AZ-329`, `AZ-330`, `AZ-341``done/`; `_docs/02_tasks/_dependencies_table.md` refreshed |
## Summary
Three batches, six feature tasks (plus two audit-trail tickets for the
Batch-44 SRP refactor), zero blocking findings. The window's dominant
themes:
1. **C11 → C12 SRP refactor (Batch 44, AZ-329 + AZ-330 + AZ-523)**
the internal post-landing flight-state gate was removed from C11
entirely. The gate now lives once, in C12's
`PostLandingUploadOrchestrator`, where it consults the FDR
`flight_footer.clean_shutdown` field via the new `FdrFooterReader`
Protocol. The pivot eliminates a soft-real-time `FlightStateSignal`
subscription from the upload-side hot path and replaces it with a
one-shot file-read at orchestration time — measurable correctness,
testability, and observability win. C11 `TileUploader` contract
advanced to v2.0.0.
2. **C12 operator orchestrator finished its end-to-end shape (Batches
43 + 44, AZ-328 + AZ-329 + AZ-330)** — the operator workstation
now ships three production-functional services on top of the CLI
shell that landed in Batch 42: `BuildCacheOrchestrator` (F1 cache
build), `PostLandingUploadOrchestrator` (F10 post-landing upload),
and `OperatorReLocService` (AC-3.4 re-localization hint). Every
service consumes its cross-component dependencies via local AZ-507
structural Protocol cuts (`TileDownloaderCut`, `TileUploaderCut`,
`FdrFooterReader`, `OperatorCommandTransport`) — c12 imports
nothing from c11, c10, c7, c6, or c13 directly. The wire-in
adapter lives in the composition root.
3. **C2 retrieve plumbing centralised (Batch 45, AZ-341)**
`FaissBridge` lifts the shared `retrieve_topk` body out of every
future concrete `VprStrategy`. The bridge owns the defended-in-depth
INV-4 check (== K, ascending), the WARN-threshold check, the
optional DEBUG per-frame-distances log, and one `vpr.retrieve_topk`
FDR record per call. Cross-strategy duplication risk eliminated;
each strategy's `retrieve_topk` will shrink to a one-line
delegation. The c2_vpr ↔ c6 boundary now uses the same
`DescriptorIndexCut` Protocol-cut pattern established by
c12_operator_orchestrator (Batch 43+44).
The full unit-test suite reached **1565 passed / 80 environment-skipped**
across the changed surface, up from 1494 at the close of cumulative
review batches 40-42. No failures introduced in the cumulative scope.
## Phase 1 — Context loading
Inputs reviewed:
- Architecture doc (post Batch-44 sweep) — principle #4 phrasing,
ADR-004 § "Why the gate moved to C12 (Batch 44)", new F10 / reloc
data-flow sections.
- Module layout — c2_vpr and c12_operator_orchestrator Per-Component
Mapping entries (the Batch-44 rename), AZ-507 cross-component rule.
- Contracts: `tile_uploader.md` v2.0.0, the new
`c12_operator_orchestrator/` contract directory
(`post_landing_upload.md`, `fdr_footer_reader.md`,
`operator_command_transport.md`), `vpr_strategy_protocol.md`,
`descriptor_index.md`.
- Task specs in scope: `AZ-328`, `AZ-329`, `AZ-330`, `AZ-341`,
`AZ-523` (audit), `AZ-524` (audit).
## Phase 2 — Spec compliance
| Task | ACs satisfied | Tests | Notes |
|------|---------------|-------|-------|
| AZ-328 | 15/15 | `test_build_cache_orchestrator.py` (29) + `test_remote_c10_invoker.py` (7) + `test_file_lock.py` (3) | Cycle-1 report `batch_43_cycle1_report.md`, review `batch_43_review.md` (PASS_WITH_WARNINGS) |
| AZ-329 | 14/14 | `test_post_landing_upload_orchestrator.py`, `test_fdr_footer_reader.py` | Cycle-1 report `batch_44_cycle1_report.md`, review `batch_44_review.md` (PASS_WITH_WARNINGS) |
| AZ-330 | 9/9 | `test_operator_reloc_service.py` | Same batch report / review |
| AZ-523 | audit-trail (closed) | C11 protocol-conformance test rewritten | Gate moved to C12 (AZ-329); no production code change since gate was deleted |
| AZ-524 | audit-trail (closed) | Directory rename verified by `module-layout.md` self-test in autodev | Composition-root names updated |
| AZ-341 | 11/11 + NFR-perf | `test_faiss_bridge.py` (22 tests) | Cycle-1 report `batch_45_cycle1_report.md`, review `batch_45_review.md` (PASS_WITH_WARNINGS) |
Contract verification (per-batch reviews remain authoritative; spot-
checks in cumulative mode):
- `tile_uploader.md` v2.0.0 — `TileUploader.upload_pending_tiles`
signature no longer takes `flight_state`; c11 implementation +
c12 `PostLandingUploadOrchestrator`'s c11 cut + every c11
conformance test agree. ✓
- `vpr_strategy_protocol.md``FaissBridge` produces `VprResult`
matching INV-4 (== K, ascending) and INV-5 (non-empty
`backbone_label`); IndexUnavailableError propagation matches
C2-ST-01 intent (the bridge's role is propagation, c6 is the
authoritative defence). ✓
- `descriptor_index.md` — the bridge consumes the cut, not c6
directly; the bridge's stricter == K invariant is documented as
intentional defended-in-depth. ✓
## Phase 3 — Code quality
Across the 53 changed production files, no:
- Bare `except` (every `try` either re-raises or names a specific
exception).
- Function > 50 LOC (longest: `BuildCacheOrchestrator.build_cache`
~75 LOC including comments; orchestrator sequencing is naturally
longer and each phase is its own helper).
- Cyclomatic complexity > 10 in the orchestrator paths.
- Verbose default-on debug logging (every DEBUG-level log gated by
config; `FaissBridge.debug_per_frame_distances` defaults OFF per
AC-8).
- Silently swallowed errors (every `try/except` either logs at
ERROR + emits FDR, or re-raises a wrapped exception with `from
exc` to preserve the cause chain).
Constructor parameter validation is consistent across the new
classes: `TypeError` for type violations, `ValueError` for range
violations.
## Phase 4 — Security quick-scan
- No SQL injection / command injection / eval / exec.
- No hardcoded secrets — `RemoteCacheProvisionerInvoker` redacts
`api_key` and `auth_token` substrings from streamed SSH stdout
(covered by dedicated tests in `test_remote_c10_invoker.py`).
- No PII / signing material on the new FDR records
(`c12.reloc.requested` carries `LatLonAlt` + free-text reason —
operator-issued, not user PII; `vpr.retrieve_topk` carries raw
float distances).
- `OperatorReLocService` validates `confidence_radius_m > 0` and
`reason` non-empty at the service boundary.
- The Batch-44 doc sweep updated ADR-004 ("Why the gate moved to
C12") and the C12 security note explicitly captures the
defense-in-depth shift.
## Phase 5 — Performance scan
- `FaissBridge.retrieve` p95 ≤ 500 µs verified by microbench
(`test_bridge_retrieve_overhead_p95_under_500us`).
- `BuildCacheOrchestrator.build_cache` overhead microbench landed
in Batch 43.
- `PostLandingUploadOrchestrator` is one-shot per landing — no hot
path.
- The C12 cold-start microbench
(`test_cold_start_under_500ms_p99`) was reported flaky in
Batch 44 (`580.8ms p99` on one run). Diagnosed as host-load
variance, not a regression; the underlying PEP 562 lazy-import
discipline is unchanged across batches 43-45. **Carried-over
flake — not introduced in this cumulative scope.**
## Phase 6 — Cross-task consistency
The dominant cross-task pattern across this window is the AZ-507
consumer-side structural Protocol cut. Three new cuts shipped:
| Cut | Producer (real type) | Consumer | Adapter location |
|-----|----------------------|----------|------------------|
| `TileUploaderCut` | c11 `TileUploader` (AZ-319 v2.0.0) | c12 `PostLandingUploadOrchestrator` (AZ-329) | `runtime_root/c12_factory.py` |
| `TileDownloaderCut` | c11 `HttpTileDownloader` (AZ-316) | c12 `BuildCacheOrchestrator` (AZ-328) | `runtime_root/c12_factory.py` |
| `DescriptorIndexCut` | c6 `DescriptorIndex.search_topk` (AZ-303 + AZ-306) | c2_vpr `FaissBridge` (AZ-341) | NOT YET WIRED — composition-root adapter will land with AZ-337 (first concrete strategy) |
Plus two intra-c12 Protocol contracts:
| Protocol | Producer | Consumer | Concrete impl |
|----------|----------|----------|---------------|
| `FdrFooterReader` | c12 (AZ-329) | `PostLandingUploadOrchestrator` | (in `fdr_footer_reader.py`) |
| `OperatorCommandTransport` | c12 (AZ-330) | `OperatorReLocService` | (pymavlink-backed impl deferred to E-C8) |
All five Protocols use `runtime_checkable=True`, are declared in
a single `*.py` file per Protocol, are imported by their consumer
via the local module path (no cross-component imports), and follow
the same docstring shape ("single-method consumer-side cut of …").
No conflicting patterns observed.
Cross-task DTO consistency:
- `VprQuery.embedding` is L2-normalised by the producing strategy
(INV-3); the `FaissBridge` consumes the embedding as-is. Future
strategy implementations (AZ-337..AZ-340) must preserve INV-3.
- `VprCandidate.tile_id` carries `tuple[int, float, float]`
(`TileIdTuple`) deliberately — keeps L1 DTOs free of L3 c6
imports. The `DescriptorIndexCut` matches this shape.
- `UploadRequest` (c11) no longer references `FlightStateSignal`
(Batch 44 removal); `PostLandingUploadOrchestrator` constructs
it from the `BuildCacheRequest` translation in c12_factory.
## Phase 7 — Architecture compliance
1. **Layer direction (rule 1)**: no upward imports observed across
the cumulative scope. c2_vpr (L3) imports `_types.vpr` (L1) and
its own component module; c12_operator_orchestrator (L4
Adapters) imports its own component modules and L1
shared/`_types`/config/logging/fdr_client/clock.
2. **Public API respect / AZ-507 (rule 2)**: verified via grep:
- `c2_vpr/*.py` has zero `from gps_denied_onboard.components.c6_tile_cache` imports.
- `c12_operator_orchestrator/*.py` has zero
`from gps_denied_onboard.components.c11_tile_manager` /
`…c10_provisioning` / `…c7_inference` / `…c6_tile_cache` imports.
3. **No new cyclic dependencies (rule 3)**: no new cycles introduced.
4. **Duplicate symbols (rule 4)**:
- `IndexUnavailableError` exists in both `c2_vpr.errors` and
`c6_tile_cache.errors` — pre-existing intentional duplication,
documented in both module docstrings. The task-spec letter of
AC-4 + the c6 module comment hint at a design tension here (see
Finding F2 of `batch_45_review.md`).
- `_iso_ts_from_clock` (or equivalent inline) appears in 6
modules: `c2_vpr/_faiss_bridge.py`,
`c12_operator_orchestrator/operator_reloc_service.py`,
`c11_tile_manager/idempotent_retry.py`,
`c11_tile_manager/signing_key.py`,
`c6_tile_cache/postgres_filesystem_store.py`,
`c6_tile_cache/freshness_gate.py`. **Carried-over finding — AZ-508 hygiene PBI exists in `todo/` for exactly this consolidation.**
5. **Cross-cutting concerns not locally re-implemented (rule 5)**: see
Finding F1 below — the ISO-timestamp duplication is the only
concrete instance.
## Baseline Delta
`_docs/02_document/architecture_compliance_baseline.md` does not
exist for this project (greenfield; no prior architecture baseline
to compare against). The cumulative review therefore reports
findings as-is, not as deltas.
## Findings
| # | Severity | Category | Files | Title |
|---|----------|----------|-------|-------|
| F1 | Low | Maintainability / Architecture | c2/c6/c11/c12 (6 modules total) | `_iso_ts_from_clock` duplicated across components |
| F2 | Low | Architecture | `c2_vpr.errors.IndexUnavailableError` vs `c6_tile_cache.errors.IndexUnavailableError` | Two same-named exception classes; the bridge's AC-4 propagation contract intentionally lets c6's class leak past c2's "closed envelope" docstring |
| F3 | Low | Spec-Gap | `_docs/02_tasks/done/AZ-341_c2_faiss_retrieve_wiring.md` § Outcome | Spec lists unused `normaliser: DescriptorNormaliser` constructor param; INV-3 makes it redundant — implementation omits it |
| F4 | Low | Maintainability | `test_cold_start_under_500ms_p99` | Reported flaky (580.8ms p99 on one Batch-44 run); pre-existing host-load sensitivity, not introduced in 43-45 |
### Finding Details
**F1: `_iso_ts_from_clock` duplicated across components** (Low / Maintainability + Architecture)
- Locations: see Phase 7 #4 above.
- Description: six modules duplicate the same 6-line helper that
converts `clock.time_ns()` → RFC 3339 UTC ISO string. Pure
duplication; same exact body in each.
- Suggestion: defer to AZ-508 (hygiene PBI already in `todo/`
scoped to ISO-timestamp consolidation across the suite). No
blocking action this cycle.
**F2: `IndexUnavailableError` namespace duplication + AC-4 closed-envelope tension** (Low / Architecture)
- Locations: `c2_vpr.errors.IndexUnavailableError`,
`c6_tile_cache.errors.IndexUnavailableError`.
- Description: c2 errors module docstring states "the C6 family is
the storage-layer error a concrete strategy is responsible for
rewrapping". The AZ-341 task spec AC-4 says the bridge "does not
catch and re-raise" the c6 exception. The two cannot both hold
for the closed-envelope invariant (consumers downstream of the
bridge will see c6's class, not c2's, when the c6 stale-handle
defence fires).
- Suggestion: surface to the user for spec/code-comment alignment.
Options: (a) widen AC-4 to allow rewrap; (b) declare c6's
`IndexUnavailableError` part of the C2 closed envelope (subclass
the c2 class from the c6 class, or vice versa); (c) leave as-is
with explicit documentation that consumers must catch either
class. None blocks the current implementation.
**F3: Unused `normaliser` parameter in AZ-341 spec** (Low / Spec-Gap)
- Location: task spec § Outcome (Constructor signature).
- Description: spec lists `normaliser: DescriptorNormaliser` but
the documented `retrieve` body never calls it. The bridge omits
the parameter (no behaviour change; INV-3 obligates the
producing strategy to normalise).
- Suggestion: update the spec to drop the parameter, OR add explicit
defensive re-normalisation to the bridge with a note that it
conflicts with INV-3's strategy-side guarantee.
**F4: Flaky cold-start performance test** (Low / Maintainability)
- Location:
`tests/unit/c12_operator_orchestrator/test_cli_help_and_logging.py::test_cold_start_under_500ms_p99`.
- Description: occasionally measures p99 > 500 ms on a developer
host (580.8 ms observed in Batch 44). The underlying PEP 562 lazy
import discipline is unchanged across 43-45; the regression is
host-load variance.
- Suggestion: re-evaluate on dedicated CI Tier-1 hardware where the
host is not under interactive load; defer to a future test-host
selection PBI. No blocking action this cycle.
## Verdict
**PASS_WITH_WARNINGS** — four Low-severity findings, three of which
are carry-over hygiene / spec items (F1, F3, F4) and one of which
(F2) is a structural inconsistency between a task-spec AC and a
module docstring expectation that pre-dates this cumulative window.
No Critical, no High, no Medium findings. The implement skill's
auto-fix gate accepts the verdict without intervention. Recommend:
- Surface F2 + F3 to the user as a small spec-hygiene follow-up
pair (one ticket, ≤ 2pt).
- Confirm AZ-508 (ISO-timestamp consolidation hygiene PBI) is
scoped to absorb F1 when it is scheduled.
- Leave F4 untouched until a dedicated CI Tier-1 host is available.