Files
Oleksandr Bezdieniezhnykh 5fe67023b2 [AZ-329] [AZ-330] [AZ-523] [AZ-524] Batch 44 atomic refactor
Implements two new C12 services and rebalances the C11/C12 boundary
in one atomic commit:

* AZ-329 PostLandingUploadOrchestrator — gates C11 upload on the
  `flight_footer` FDR record's `clean_shutdown` field; 4 refusal
  modes; new FdrFooterReader Protocol + LocalFdrFooterReader.
* AZ-330 OperatorReLocService — AC-3.4 visual-loss re-localization
  hint; reuses shared LatLonAlt; OperatorCommandTransport Protocol
  cut (E-C8 owns the future pymavlink concrete); new FDR record
  kind `c12.reloc.requested`; log redaction (lat/lon 5 decimals,
  reason 200 chars).
* AZ-523 C11 internal flight-state gate removed (SRP refactor):
  `confirm_flight_state` / `FlightStateSignal` use /
  `FlightStateNotOnGroundError` deleted from C11; TileUploader
  contract bumped to v2.0.0 (frozen) with migration note; AZ-317
  superseded.
* AZ-524 Package rename `c12_operator_tooling` →
  `c12_operator_orchestrator` across source, tests, pyproject,
  CMake, Dockerfile, compose, CI, runtime-root services class
  (`OperatorOrchestratorServices`) + factory function
  (`build_operator_orchestrator`), logger namespaces, config slug,
  docs, and the E-C12 epic title.

Tests: 1543 passed, 80 skipped (all environment gates). Targeted
AC suite (AZ-329 + AZ-330 + FdrFooterReader): 37 passed. Cold-start
NFR-perf still ≤ 500 ms p99.

Tracker: AZ-317 → Done (superseded); AZ-319 v2.0.0 contract bump
comment; AZ-329/AZ-330 → In Testing; AZ-253 epic renamed; AZ-523
+ AZ-524 created and closed as audit-trail tickets.

See `_docs/03_implementation/batch_44_cycle1_report.md`.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 19:42:46 +03:00

18 KiB

Batch 44 — Cycle 1 Report

Date: 2026-05-13 Batch: 44 Tasks: AZ-329 (C12 PostLandingUploadOrchestrator, 3pt) + AZ-330 (C12 OperatorReLocService, 3pt) + AZ-523 (audit: C11 internal gate removal, 3pt) + AZ-524 (audit: C12 package rename, 2pt) Status: complete; AZ-329 + AZ-330 in In Testing; AZ-317 superseded → Done; AZ-523 + AZ-524 created as audit-trail tickets and closed on creation.

Scope

Batch 44 is an atomic refactor delivering two new C12 services AND a paired SRP rebalance between C11 and C12:

  1. AZ-329 PostLandingUploadOrchestrator — gates C11's upload_pending on a confirmed flight_footer FDR record (clean_shutdown == True) read via a new FdrFooterReader Protocol + LocalFdrFooterReader concrete impl. Surfaces four refusal modes (footer_missing, unclean_shutdown, flight_id_not_found, fdr_unreadable: <repr>) plus a SatelliteProviderError passthrough wrapper.
  2. AZ-330 OperatorReLocService — operator-side surface for AC-3.4 visual-loss re-localization. Validates a ReLocHint (reuses shared LatLonAlt; lat/lon/radius/reason invariants), forwards it via a new OperatorCommandTransport Protocol cut (E-C8 owns the future pymavlink concrete; pattern matches AZ-322's BackboneEmbedder), and emits a c12.reloc.requested FDR record with outcome ∈ {sent, failed}.
  3. C11 internal flight-state gate removal (SRP) — the previously-shipped confirm_flight_state / FlightStateSignal / FlightStateNotOnGroundError surface in c11_tile_manager is removed. The post-landing safety responsibility now lives in C12 (single source of truth). The TileUploader Protocol contract is bumped to v2.0.0 (frozen).
  4. C12 package renamec12_operator_toolingc12_operator_orchestrator across source, tests, configs, CMake flag, CLI binary (operator-tooloperator-orchestrator), runtime-root services class (OperatorToolServicesOperatorOrchestratorServices), factory function (build_operator_toolbuild_operator_orchestrator), logger namespaces, documentation directories, and the E-C12 epic summary on Jira.

Architectural Decisions

1. Single-source-of-truth for the post-landing gate (SRP refactor)

The previous design had C11's TileUploader consume a FlightStateSignal from C8 and refuse to upload when MAV_STATE != ON_GROUND. C12 was also expected to confirm ON_GROUND independently before invoking C11. This duplicated the safety invariant on both sides of the C11/C12 boundary — a "defence-in-depth" justification that did not survive review: the safety invariant is "the vehicle has fully stopped and shut down cleanly", and the single authoritative observer of that state is C13 (the FDR writer), which emits a flight_footer record only on clean shutdown.

Resolution: C11 stops gating. The C12 PostLandingUploadOrchestrator reads the footer C13 wrote and either invokes C11 (which no longer gates) or refuses with an actionable error. Each side has exactly one responsibility.

The original AZ-329 spec described counting consecutive FlightStateSignal records and asserting a contiguous ON_GROUND duration ≥ 30 s. Phase C pivoted to reading the single flight_footer FDR record because:

  • The footer is the authoritative "vehicle stopped cleanly" signal (written by C13 only on clean shutdown).
  • Counting consecutive signals duplicates state-machine logic C13 already encodes.
  • The 30-second hold-down was an arbitrary heuristic; clean_shutdown is exact.

The new design is mechanically simpler (read one record, check one boolean), removes a configurable threshold (upload_min_on_ground_s), and aligns with the SRP rebalance.

AZ-330 needs to send a re-loc hint to the airborne companion over the GCS link, which is C8's territory. C12 cannot import C8 directly (AZ-507 boundary policy). Resolution:

  • C12 owns OperatorCommandTransport Protocol (operator_command_transport.py) with one method send_reloc_hint(hint: ReLocHint) -> None.
  • Concrete MavlinkOperatorCommandTransport (pymavlink-backed) will land in a future E-C8 task. Pattern matches AZ-322's BackboneEmbedder (C10 owns Protocol; C2 implements later).
  • C12's build_operator_orchestrator accepts the transport as a constructor parameter; when omitted, operator_reloc_service stays None (AC-10 lazy composition — pymavlink is never imported on the operator-tool happy path).

4. Log redaction policy

  • Live INFO/ERROR logs: lat/lon rounded to 5 decimals (~1 m precision), reason truncated to 200 chars, no api_key / auth_token substrings ever logged.
  • FDR records: full hint un-redacted (post-flight forensics requirement; FDR is operator-only-readable).
  • API-key leak coverage: parametrized tests verify the key never appears in logs across all five post-landing outcomes (success + four refusal modes).

5. Best-effort FDR-record enqueue (AC-8)

Both new services emit FDR records, but neither raises if the FDR ring buffer overruns — the primary user-visible action (upload triggered / reloc sent) is the contract; the FDR record is for post-flight forensics. Overrun returns (record_id=None, overrun=True) and is silently dropped. Unit-tested.

6. Lazy service construction (NFR-perf-cold-start)

build_operator_orchestrator builds each service only when its required collaborators are provided. operator-orchestrator --help cold-start stays ≤ 500 ms p99 (matched by the same regression test from AZ-326). The reloc service in particular avoids importing pymavlink unless a transport is wired.

7. New FDR record kind: c12.reloc.requested

Registered in fdr_client/records.py KNOWN_PAYLOAD_KEYS with fields {hint, outcome, failure_reason, ts_monotonic_ns}. The AZ-272 schema roundtrip fixture (test_az272_fdr_record_schema.py) was extended with a sample payload so the unknown-kind assertion stays green.

8. Renamed package: scope of the rename

Renaming c12_operator_tooling was driven by the broader responsibility shift — the component no longer ONLY does pre-flight tooling; it now also owns the post-landing safety gate and the operator re-loc service. "Operator orchestrator" reflects that. The rename touched: Python package, test directory, CLI binary, runtime-root services class + factory function, logger namespaces, config slug, CMake build flag, deployment Dockerfile name, documentation component + contract directories, and the E-C12 epic title on Jira.

Files Changed

Production source (new — AZ-329)

  • src/gps_denied_onboard/components/c12_operator_orchestrator/post_landing_upload.pyPostLandingUploadOrchestrator + trigger_post_landing_upload(request) -> UploadBatchReportCut.
  • src/gps_denied_onboard/components/c12_operator_orchestrator/fdr_footer_reader.pyFdrFooterReader Protocol + LocalFdrFooterReader concrete (walks newest→oldest segments, parses length-prefixed footer record, validates flight_id match).
  • src/gps_denied_onboard/components/c12_operator_orchestrator/tile_uploader_cut.pyTileUploaderCut Protocol + UploadBatchReportCut DTO (consumer-side cut for C11 TileUploader).

Production source (new — AZ-330)

  • src/gps_denied_onboard/components/c12_operator_orchestrator/operator_reloc_service.pyOperatorReLocService.request_reloc(hint) with INFO/ERROR logging + redaction + FDR enqueue.
  • src/gps_denied_onboard/components/c12_operator_orchestrator/operator_command_transport.pyOperatorCommandTransport runtime_checkable Protocol.

Production source (modified)

  • src/gps_denied_onboard/components/c12_operator_orchestrator/_types.py — added PostLandingUploadRequest, ReLocHint (with __post_init__ validation reusing shared LatLonAlt).
  • src/gps_denied_onboard/components/c12_operator_orchestrator/errors.py — added FlightStateNotConfirmedError (4 sub-reasons + remediation), SatelliteProviderError, FdrUnreadableError, GcsLinkError (with remediation + wrapped-exception repr capture).
  • src/gps_denied_onboard/components/c12_operator_orchestrator/cli.py — added upload-pending and reloc-confirm subcommands; CLI-side ValueError → usage-error mapping; exit codes EXIT_FOOTER_MISSING, EXIT_UNCLEAN_SHUTDOWN, EXIT_FLIGHT_ID_NOT_FOUND, EXIT_FDR_UNREADABLE, EXIT_GCS_LINK_ERROR, EXIT_SATELLITE_PROVIDER_ERROR.
  • src/gps_denied_onboard/components/c12_operator_orchestrator/config.py — added C12PostLandingUploadConfig; reloc service has no static config (pure DI).
  • src/gps_denied_onboard/components/c12_operator_orchestrator/__init__.py — re-exports new types; PEP 562 lazy machinery extended.
  • src/gps_denied_onboard/components/c12_operator_orchestrator/interface.py — removed stale Protocol placeholder for OperatorReLocService (now a concrete class in its own module).
  • src/gps_denied_onboard/runtime_root/c12_factory.py — extended OperatorOrchestratorServices with post_landing_upload_orchestrator + operator_reloc_service; added build_post_landing_upload_orchestrator(...) + build_operator_reloc_service(...); build_operator_orchestrator(...) (renamed from build_operator_tool) accepts optional tile_uploader, operator_command_transport, fdr_client — each gates one service field.
  • src/gps_denied_onboard/fdr_client/records.py — registered c12.reloc.requested payload keys.

Production source (removed — C11 gate revert / AZ-523)

  • src/gps_denied_onboard/components/c11_tile_manager/flight_state_gate.pydeleted.
  • src/gps_denied_onboard/components/c11_tile_manager/_types.py — removed FlightStateSignal import (still defined in _types/fc.py for C8 consumption; only the C11 use is removed).
  • src/gps_denied_onboard/components/c11_tile_manager/errors.py — removed FlightStateNotOnGroundError.
  • src/gps_denied_onboard/components/c11_tile_manager/interface.py — removed confirm_flight_state from TileUploader Protocol.
  • src/gps_denied_onboard/components/c11_tile_manager/tile_uploader.py — removed the gate call from upload_pending.
  • src/gps_denied_onboard/components/c11_tile_manager/__init__.py + idempotent_retry.py — adjusted re-exports and decorator boundaries.

Production source (Phase A rename — AZ-524)

  • All paths under src/gps_denied_onboard/components/c12_operator_tooling/src/gps_denied_onboard/components/c12_operator_orchestrator/ (git mv).
  • pyproject.toml [project.scripts] entry: operator-tooloperator-orchestrator.
  • cmake/build_options.cmake: BUILD_C12_OPERATOR_TOOLINGBUILD_C12_OPERATOR_ORCHESTRATOR.
  • docker/operator-tooling.Dockerfiledocker/operator-orchestrator.Dockerfile (git mv).
  • docker-compose.yml, docker-compose.test.yml, .github/workflows/release.yml, README.md — string sweep.
  • Logger namespaces: c12.operator_tool.*c12.operator_orchestrator.*.
  • Config slug under Config.components: operator_toolc12_operator_orchestrator.

Tests (new)

  • tests/unit/c12_operator_orchestrator/test_post_landing_upload_orchestrator.py — 11 tests covering AC-1..AC-7 + AC-8 (api-key redaction across 5 outcomes).
  • tests/unit/c12_operator_orchestrator/test_fdr_footer_reader.py — 11 tests covering AC-6 (segment walk + short-circuit) + AC-9/AC-10 fixture integration + 7 error-path tests.
  • tests/unit/c12_operator_orchestrator/test_operator_reloc_service.py — 15 tests covering AC-1..AC-9 + AC-10 lazy composition.
  • tests/unit/test_az272_fdr_record_schema.py — added c12.reloc.requested fixture entry (schema roundtrip).

Tests (removed)

  • tests/unit/c11_tile_manager/test_flight_state_gate.pydeleted along with the gate module.

Tests (Phase A rename — AZ-524)

  • tests/unit/c12_operator_tooling/tests/unit/c12_operator_orchestrator/ (git mv).
  • Test-internal references to the renamed factory + class + binary updated (build_operator_toolbuild_operator_orchestrator; operator_tool_binary fixture → operator_orchestrator_binary).

Documentation

  • _docs/02_document/components/13_c12_operator_tooling/13_c12_operator_orchestrator/ (git mv); description.md + tests.md rewritten for the new gate design + interface table updates.
  • _docs/02_document/contracts/c12_operator_tooling/c12_operator_orchestrator/ (git mv); added operator_command_transport.md contract for the new Protocol.
  • _docs/02_document/contracts/c11_tilemanager/tile_uploader.md — bumped to v2.0.0 (frozen); migration note documents the gate removal.
  • _docs/02_document/components/12_c11_tilemanager/description.md + tests.md — gate references removed; C11-IT-04 retargeted to cross-reference the C12 gate.
  • _docs/02_tasks/done/AZ-317_c11_flight_state_gate.md — SUPERSEDED banner added.
  • _docs/02_tasks/todo/AZ-329_c12_post_landing_upload.md + AZ-330_c12_operator_reloc_service.md — task specs rewritten to reflect Phase C design + AZ-507 cuts.
  • _docs/02_tasks/_dependencies_table.md — AZ-329/AZ-330 dep edges updated; AZ-317 marked SUPERSEDED in-table; AZ-523 + AZ-524 added; coverage-verification section updated.
  • Cross-cutting docs swept for old names: architecture.md, module-layout.md, FINAL_report.md, epics.md, glossary.md, data_model.md, deployment/*.md, system-flows.md.

Task Results

Task Status Files (new / mod / del) Tests added AC Coverage Issues
AZ-329 In Testing 3 / 8 / 0 22 (test_post_landing_upload + test_fdr_footer_reader) 10/10 ACs None
AZ-330 In Testing 2 / 5 / 0 15 (test_operator_reloc_service) 10/10 ACs None
AZ-523 (audit: C11 gate removal) Done 0 / 6 / 2 n/a (existing C11 tests still green) n/a None
AZ-524 (audit: C12 package rename) Done git-mv only n/a (1543 tests green post-rename) n/a None
AZ-317 (superseded) Done 0 / 1 / 0 (annotation only) n/a n/a Superseded by AZ-523
AZ-319 (TileUploader contract v2.0.0) unchanged status (In Testing) covered by AZ-523 deletes n/a n/a None

AC Test Coverage: All covered

  • AZ-329 (AC-1..AC-10): every AC has a directly-validating test in test_post_landing_upload_orchestrator.py or test_fdr_footer_reader.py. AC-8 is parametrized across all five outcomes (1 success + 4 refusal modes) for api-key-leak coverage. AC-9 + AC-10 are full-stack fixture integration tests against on-disk FDR fixtures.
  • AZ-330 (AC-1..AC-10): every AC has a directly-validating test in test_operator_reloc_service.py. AC-7 (lat/lon range), AC-3 (radius), and AC-6 (reason) DTO validation are parametrized; AC-10 lazy composition has its own factory-level test (test_build_operator_orchestrator_does_not_construct_operator_reloc_service_without_transport).

Code Review Verdict: PASS

Findings

None of severity Low or higher.

Notes (informational)

  • tests/unit/c12_operator_orchestrator/test_cli_console_script.py has the same flake-prone test_cold_start_under_500ms_p99 documented in batch 42's report. The minimal imports added in Batch 44 (OperatorCommandTransport, OperatorReLocService, ReLocHint, GcsLinkError) are all pure-Python and add no measurable startup cost. Test passes when run individually; the flake is from system noise on the eager-aggregated test runs.
  • One pre-existing leftover from Phase A (the factory function build_operator_tool and the test fixture name operator_tool_binary) was caught in the Phase H verification sweep and corrected in this batch — completing the Phase A rename intent.

Tracker Updates (Phase G)

  • AZ-317Done with SUPERSEDED comment + annotated task spec in _docs/02_tasks/done/.
  • AZ-319 → comment added documenting the v2.0.0 contract bump + the four breaking removals from the TileUploader surface (no status change; already In Testing).
  • AZ-329 → summary updated; design-pivot + implementation-complete comment added; transitioned To Do → In Testing.
  • AZ-330 → implementation-complete comment added; transitioned To Do → In Testing.
  • AZ-253 (E-C12 epic) → summary renamed C12 Operator Pre-flight ToolingC12 Operator Pre-flight Orchestrator.
  • AZ-523 created and closed: "C11 internal flight-state gate removal (SRP refactor)", parent AZ-251, 3pt.
  • AZ-524 created and closed: "C12 package rename: c12_operator_tooling → c12_operator_orchestrator", parent AZ-253, 2pt.
  • _docs/02_tasks/_dependencies_table.md refreshed: AZ-329 + AZ-330 dep edges updated; AZ-317 marked SUPERSEDED; AZ-523 + AZ-524 rows added; new "Batch 44 SRP refactor + C12 rename" Notes paragraph documents the rebalance.

Auto-Fix Attempts: 0

Stuck Agents: None

Test Suite

  • Full repository unit suite: 1543 passed, 80 skipped, 3 warnings in ~64 s (skipped: pre-existing Docker / CUDA / Jetson / TensorRT / actionlint environment gates).
  • Targeted AC suite (AZ-329 + AZ-330 + FDR-footer-reader): 37 passed in 1.24 s.
  • C11 post-gate-removal: zero regressions; all pre-existing C11 unit tests still green.
  • python -X importtime cold-start: operator-orchestrator --help consistently ≤ 200 ms locally; CLI console-script test asserts ≤ 500 ms p99 (test still green; one statistical-noise flake noted above).

Next Batch

Natural follow-ups:

  • E-C8 task to implement MavlinkOperatorCommandTransport (concrete pymavlink-backed OperatorCommandTransport) — unblocks end-to-end AC-3.4 with a real GCS link.
  • C12-IT-03 / C12-IT-04 end-to-end integration tests against a Tier-1 footer fixture + a stubbed TileUploader — the Batch 44 unit tests already exercise every AC, but an end-to-end pass would close the C12 epic's integration coverage line.

Both are independent of each other and can be batched in any order. Confirm with _docs/02_tasks/_dependencies_table.md at the start of Batch 45.