mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 15:31:13 +00:00
[AZ-329] [AZ-330] [AZ-523] [AZ-524] Batch 44 atomic refactor
Implements two new C12 services and rebalances the C11/C12 boundary in one atomic commit:

* AZ-329 PostLandingUploadOrchestrator: gates C11 upload on the `flight_footer` FDR record's `clean_shutdown` field; 4 refusal modes; new FdrFooterReader Protocol + LocalFdrFooterReader.
* AZ-330 OperatorReLocService: AC-3.4 visual-loss re-localization hint; reuses shared LatLonAlt; OperatorCommandTransport Protocol cut (E-C8 owns the future pymavlink concrete); new FDR record kind `c12.reloc.requested`; log redaction (lat/lon 5 decimals, reason 200 chars).
* AZ-523 C11 internal flight-state gate removed (SRP refactor): `confirm_flight_state` / `FlightStateSignal` use / `FlightStateNotOnGroundError` deleted from C11; TileUploader contract bumped to v2.0.0 (frozen) with migration note; AZ-317 superseded.
* AZ-524 Package rename `c12_operator_tooling` → `c12_operator_orchestrator` across source, tests, pyproject, CMake, Dockerfile, compose, CI, runtime-root services class (`OperatorOrchestratorServices`) + factory function (`build_operator_orchestrator`), logger namespaces, config slug, docs, and the E-C12 epic title.

Tests: 1543 passed, 80 skipped (all environment gates). Targeted AC suite (AZ-329 + AZ-330 + FdrFooterReader): 37 passed. Cold-start NFR-perf still ≤ 500 ms p99.

Tracker: AZ-317 → Done (superseded); AZ-319 v2.0.0 contract bump comment; AZ-329/AZ-330 → In Testing; AZ-253 epic renamed; AZ-523 + AZ-524 created and closed as audit-trail tickets.

See `_docs/03_implementation/batch_44_cycle1_report.md`.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,216 +1,217 @@
|
||||
# C12 Post-Landing Upload — `trigger_post_landing_upload` + FDR ON_GROUND Confirmation
|
||||
# C12 Post-Landing Upload — `trigger_post_landing_upload` + FDR `flight_footer` Confirmation
|
||||
|
||||
**Task**: AZ-329_c12_post_landing_upload
|
||||
**Name**: C12 Post-Landing Upload
|
||||
**Description**: Implement `PostLandingUploadOrchestrator`, the C12 post-flight (F10) workflow that gates `C11.TileUploader.upload_pending_tiles` (AZ-319) on a confirmed-ON_GROUND signal from the post-flight FDR. `trigger_post_landing_upload(request: PostLandingUploadRequest) -> UploadBatchReport` does the following: (1) locate the FDR segments for the given `flight_id` under `config.c12.fdr_root` (segment layout: `<fdr_root>/<flight_id>/segment_<NNN>.fdr` per the C13 conventions); (2) iterate the segments from newest to oldest, parsing records via AZ-272's `FdrRecord.parse(...)`; (3) collect all `state.tick` records carrying a `flight_state` payload field (or a dedicated `flight_state.tick` kind if the schema names it that way — defer to AZ-272's contract); (4) walking the collected records backwards from the most recent (chronologically), count contiguous `ON_GROUND` records and compute the contiguous ON_GROUND duration as `(latest_record.ts − first_consecutive_on_ground_record.ts)` seconds; (5) compare against `config.c12.upload_min_on_ground_s` (default 30 s per description.md C12-IT-03); (6) on confirmed ≥ threshold → construct a `FlightStateSignal(state=ON_GROUND, since_ts=<first consecutive ts>)` and call `tile_uploader.upload_pending_tiles(flight_state=...)`; (7) on any refusal mode → raise `FlightStateNotConfirmedError(not_confirmed_reason=...)` with one of the four documented reason strings (`"never_landed"`, `"insufficient_duration: <X>s < <threshold>s"`, `"flight_id_not_found"`, `"fdr_unreadable: <repr>"`). Owns AC-8.4's defense-in-depth check on the operator-tooling side — the airborne C11 ALSO blocks via `UploadGateBlockedError` per AZ-319; this task is the operator-side gate that prevents the upload command from even being issued. Returns C11's `UploadBatchReport` unchanged on success. Logs every decision (INFO on confirmed; ERROR on each refusal mode) including the inferred contiguous ON_GROUND duration in seconds.
|
||||
**Description**: Implement `PostLandingUploadOrchestrator`, the C12 post-flight (F10) workflow that gates `C11.TileUploader.upload_pending_tiles` (AZ-319) on the presence of a clean-shutdown `flight_footer` FDR record for `flight_id`. `trigger_post_landing_upload(request: PostLandingUploadRequest) -> UploadBatchReportCut` does the following: (1) resolve `<fdr_root>/<flight_id>/` and confirm the directory exists; (2) iterate the segment files from newest to oldest, streaming length-prefixed records via AZ-272's `FdrRecord.parse(...)`; (3) short-circuit on the first record whose `kind == "flight_footer"` (the C13 writer in AZ-292 emits exactly one such record per flight, on `close_flight()`); (4) inspect `payload["clean_shutdown"]`: `True` → the flight terminated gracefully → invoke `tile_uploader.upload_pending_tiles(UploadRequestCut(flight_id=..., batch_size=..., satellite_provider_url=...))` and return its `UploadBatchReportCut` unchanged; `False` → operator inspection required → refuse with `FlightStateNotConfirmedError("unclean_shutdown")`; (5) footer absent across every segment → power-loss truncation or mid-flight crash → refuse with `FlightStateNotConfirmedError("footer_missing")`; (6) FDR parse error mid-stream → refuse with `FlightStateNotConfirmedError("fdr_unreadable")`, carrying the inner exception `repr` in `detail`; (7) `<fdr_root>/<flight_id>/` does not exist → refuse with `FlightStateNotConfirmedError("flight_id_not_found")`. Owns AC-8.4's defense-in-depth check on the operator-orchestrator side: C11 is now a dumb pipe (the airborne internal gate was removed in batch 44; see superseded AZ-317), so this task is the only gate that prevents the upload command from being issued when the flight did not terminate cleanly. Returns C11's `UploadBatchReport` (passthrough via the cut) on success.
Logs every decision (INFO on confirmed; ERROR on each refusal mode); the `api_key` carried inside `PostLandingUploadRequest` is a plain `str` field but the orchestrator + CLI MUST redact it from every log line (matching the existing AZ-328 `BuildCacheRequest.api_key` pattern — `"api_key": "REDACTED"`). Introducing a Pydantic-backed `SecretStr` type would require adding `pydantic` as a runtime dependency, which the project explicitly avoids; the runtime-redaction contract is enforced by AC-8.
|
||||
|
||||
**Complexity**: 3 points
|
||||
**Dependencies**: AZ-326_c12_cli_app, AZ-319_c11_tile_uploader, AZ-272_fdr_record_schema, AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
|
||||
**Component**: c12_operator_tooling (epic AZ-253 / E-C12)
|
||||
**Dependencies**: AZ-326_c12_cli_app, AZ-319_c11_tile_uploader (post batch 44 gate removal), AZ-272_fdr_record_schema, AZ-292_c13_flight_header_footer, AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
|
||||
**Component**: c12_operator_orchestrator (epic AZ-253 / E-C12)
|
||||
**Tracker**: AZ-329
|
||||
**Epic**: AZ-253 (E-C12)
|
||||
|
||||
### Document Dependencies
|
||||
|
||||
- `_docs/02_document/contracts/c11_tilemanager/tile_uploader.md` — consumed: `upload_pending_tiles` API + `UploadBatchReport` shape + `FlightStateSignal` DTO.
|
||||
- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md` — consumed: `parse(buf: bytes) -> FdrRecord` + the `state.tick` / `flight_state.tick` kind shape (defer to the contract for the exact `kind` name and `flight_state` field).
|
||||
- `_docs/02_document/components/13_c12_operator_tooling/description.md` — § 2 (`trigger_post_landing_upload` interface, `FlightStateNotConfirmedError`).
|
||||
- `_docs/02_document/components/13_c12_operator_tooling/tests.md` — C12-IT-03 specifies the 30-s ON_GROUND threshold.
|
||||
- `_docs/02_document/components/14_c13_fdr/description.md` — § 1 segment file layout (informational).
|
||||
- `_docs/02_document/contracts/c11_tilemanager/tile_uploader.md` v2.0.0 — consumed: `upload_pending_tiles(UploadRequest) -> UploadBatchReport` API (post batch 44 — no `FlightStateSignal` parameter, no `confirm_flight_state` method).
|
||||
- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md` — consumed: `parse(buf: bytes) -> FdrRecord` + the `flight_footer` kind shape (`flight_id`, `flight_ended_at_iso`, `clean_shutdown`, and the four AC-NEW-3 counters).
|
||||
- `_docs/02_document/components/13_c12_operator_orchestrator/description.md` — § 2 (`trigger_post_landing_upload` interface, `FlightStateNotConfirmedError`).
|
||||
- `_docs/02_document/components/13_c12_operator_orchestrator/tests.md` — C12-IT-03 specifies the `flight_footer`-based check.
|
||||
- `_docs/02_document/components/14_c13_fdr/description.md` — § 1 segment file layout (informational) + § 2 `FlightFooter` shape (authoritative producer).
|
||||
|
||||
## Problem
|
||||
|
||||
Without a real `PostLandingUploadOrchestrator`:
|
||||
|
||||
- F10 has no head — operators cannot trigger post-landing tile upload; AC-8.4 (mid-flight tile upload trigger, post-landing) collapses; the pending-upload journal in C6 grows unboundedly across flights.
|
||||
- The operator-side ON_GROUND gate (defense-in-depth on top of C11's airborne gate) does not exist — operators can manually invoke `C11.TileUploader.upload_pending_tiles` with a fabricated `FlightStateSignal`, defeating the AC-NEW-7 / AC-8.4 architectural intent that mid-flight tiles only upload when the aircraft has landed.
|
||||
- C12-IT-03 (`trigger_post_landing_upload` requires ≥ 30 s confirmed ON_GROUND in FDR) has no implementation.
|
||||
- F10 has no head — operators cannot trigger post-landing tile upload; AC-8.4 collapses; C6's pending-upload journal grows unboundedly across flights.
|
||||
- The operator-side gate (the *only* remaining gate after batch 44's removal of C11's internal `FlightStateGate`) does not exist — operators can manually invoke `C11.TileUploader.upload_pending_tiles(UploadRequest(...))` directly, defeating the AC-NEW-7 / AC-8.4 architectural intent that mid-flight tiles only upload after a clean landing.
|
||||
- C12-IT-03 (`trigger_post_landing_upload` requires a `flight_footer` with `clean_shutdown=True`) has no implementation.
|
||||
- `FlightStateNotConfirmedError` is concept-only in description.md § 5 with no producer.
|
||||
- The CLI's `upload-pending` subcommand has nothing to delegate to.
|
||||
- An incomplete flight log (FDR ends with `IN_FLIGHT` because the aircraft crashed or never landed) silently passes through to C11 if there's no operator-side gate; the airborne gate is the last line of defense and may itself be unavailable on the operator workstation.
|
||||
- A truncated FDR (no footer; the aircraft crashed or lost power) would silently pass through to C11 if there were no operator-side gate.
|
||||
|
||||
This task delivers the operator-side gate. It does NOT own the actual upload (AZ-319), the FDR record schema (AZ-272), or the FDR write side (AZ-291..296) — it composes them.
|
||||
This task delivers the operator-side gate. It does NOT own the actual upload (AZ-319), the FDR record schema (AZ-272), or the FDR write side / footer producer (AZ-291..296, AZ-292); it composes them.
|
||||
|
||||
## Outcome
|
||||
|
||||
- A `PostLandingUploadOrchestrator` class at `src/operator_tool/post_landing_upload.py`:
|
||||
- Constructor: `__init__(self, *, tile_uploader: TileUploader, fdr_segment_reader: FdrSegmentReader, logger: Logger, clock: Clock, config: C12PostLandingConfig)`.
|
||||
- `C12PostLandingConfig` (`@dataclass(frozen=True)`): `fdr_root: Path`, `upload_min_on_ground_s: float = 30.0`, `flight_state_record_kind: str = "state.tick"`, `flight_state_payload_field: str = "flight_state"`.
|
||||
- A `PostLandingUploadOrchestrator` class at `src/gps_denied_onboard/components/c12_operator_orchestrator/post_landing_upload.py`:
|
||||
- Constructor: `__init__(self, *, tile_uploader: TileUploaderCut, fdr_footer_reader: FdrFooterReader, logger: Logger, config: C12PostLandingConfig)`.
|
||||
- `C12PostLandingConfig` (`@dataclass(frozen=True)`): `fdr_root: Path`.
|
||||
- Public method: `trigger_post_landing_upload(request: PostLandingUploadRequest) -> UploadBatchReport`.
|
||||
- DTOs at `src/operator_tool/_types.py`:
|
||||
- `PostLandingUploadRequest` (`@dataclass(frozen=True)`): `flight_id: str`.
|
||||
- Reuses C11's `UploadBatchReport`.
|
||||
- Errors at `src/operator_tool/errors.py`:
|
||||
- `FlightStateNotConfirmedError(Exception)`: attributes `flight_id: str`, `not_confirmed_reason: str` (one of the four documented strings), `inferred_on_ground_duration_s: float | None` (populated when the reason is `insufficient_duration`), `remediation: str` (per-reason hint, e.g. for `flight_id_not_found`: "Verify the flight ID matches the FDR directory name; check `<fdr_root>/<flight_id>/`.").
|
||||
- An `FdrSegmentReader` Protocol + `LocalFdrSegmentReader` concrete at `src/operator_tool/fdr_segment_reader.py`:
|
||||
- `Protocol`: `iter_records_for_flight(flight_id: str, *, kind_filter: str | None = None) -> Iterator[FdrRecord]` — yields records ordered by `ts` ASCENDING; the orchestrator reverses on its own. `kind_filter` if non-None restricts to that record kind for efficiency.
|
||||
- `LocalFdrSegmentReader.iter_records_for_flight(...)` — opens `<fdr_root>/<flight_id>/segment_*.fdr` files in numerical order, reads each as a stream of length-prefixed `FdrRecord` blobs (per AZ-272's serialisation), parses via `FdrRecord.parse(...)`, optionally filters by `kind`, yields one record at a time. Files are mmap'd or buffered-iterated so the operator workstation does not load multi-GB segments fully into memory.
|
||||
- DTOs at `src/gps_denied_onboard/components/c12_operator_orchestrator/_types.py`:
|
||||
- `PostLandingUploadRequest` (`@dataclass(frozen=True)`): `flight_id: UUID`, `satellite_provider_url: str`, `api_key: str`, `batch_size: int = 50`. The `api_key` field is plain `str` for consistency with `BuildCacheRequest`; redaction is a runtime guarantee enforced by AC-8 and the CLI's `_emit_invoked` redaction (matching the AZ-328 pattern).
|
||||
- `UploadBatchReportCut` — local consumer-side AZ-507 Protocol mirroring C11's `UploadBatchReport` shape (no import from c11). Used only as the return-type annotation for `TileUploaderCut.upload_pending_tiles`.
|
||||
- `TileUploaderCut` Protocol at `src/gps_denied_onboard/components/c12_operator_orchestrator/post_landing_upload.py` (or a sibling `_cuts.py`): `def upload_pending_tiles(self, request: UploadRequestCut) -> UploadBatchReportCut: ...`. `UploadRequestCut` mirrors C11's `UploadRequest(batch_size, satellite_provider_url, flight_id)`. This is the AZ-507 consumer-side cut; the composition root binds a real `HttpTileUploader` here, and the structural typing prevents a direct c11 import from c12.
|
||||
- Errors at `src/gps_denied_onboard/components/c12_operator_orchestrator/errors.py`:
|
||||
- `FlightStateNotConfirmedError(Exception)`: attributes `flight_id: str`, `not_confirmed_reason: Literal["flight_id_not_found", "footer_missing", "unclean_shutdown", "fdr_unreadable"]`, `detail: str` (for `unclean_shutdown` carries the four AC-NEW-3 counters; for `fdr_unreadable` carries the inner exception `repr`; empty string otherwise), `remediation: str` (per-reason hint).
|
||||
- An `FdrFooterReader` Protocol + `LocalFdrFooterReader` concrete at `src/gps_denied_onboard/components/c12_operator_orchestrator/fdr_footer_reader.py`:
|
||||
- `Protocol`: `read_footer(flight_id: UUID) -> FlightFooterRecord | None` — returns the `flight_footer` record's payload (as a typed `FlightFooterRecord` dataclass owned by this module — NOT C13's `FlightFooter` — preserving the c12↔c13 cut), or `None` if no footer record is found across any segment.
|
||||
- `LocalFdrFooterReader.read_footer(flight_id)` — opens `<fdr_root>/<flight_id>/segment-NNNN.fdr` files (the C13 naming convention: hyphen separator, 4-digit zero-padded index — see `c13_fdr/writer.py::_segment_path`) in DESCENDING numerical order (newest first), streams length-prefixed `FdrRecord` blobs via `FdrRecord.parse(...)` (each frame is `uint32 LE length` + JSON body — see `c13_fdr/writer.py::_LENGTH_PREFIX`), returns the first one whose `kind == "flight_footer"`, or `None` if none found. Each segment is read with a buffered file iterator — NEVER fully `read()`-ed into memory.
|
||||
- On any I/O or parse error → raises `FdrUnreadableError(reason: str)` (a sibling helper exception caught by the orchestrator and rewrapped as `FlightStateNotConfirmedError` with `not_confirmed_reason="fdr_unreadable"` and the inner `repr` in `detail`).
|
||||
- `FlightFooterRecord` (`@dataclass(frozen=True)`) at `_types.py`: `flight_id: UUID`, `flight_ended_at_iso: str`, `records_written: int`, `records_dropped_overrun: int`, `bytes_written: int`, `rollover_count: int`, `clean_shutdown: bool`. Built from `FdrRecord.payload` inside `LocalFdrFooterReader`; the orchestrator only reads `clean_shutdown` + the four counters (for `unclean_shutdown` log/error detail).
|
||||
- Method flow for `trigger_post_landing_upload`:
|
||||
1. `flight_dir = config.fdr_root / request.flight_id`. If `not flight_dir.exists()` → raise `FlightStateNotConfirmedError(flight_id, "flight_id_not_found", remediation="Verify <fdr_root>/<flight_id>/ exists; check `config.c12.fdr_root`.")`.
|
||||
2. Collect all `flight_state` records: `records = list(fdr_segment_reader.iter_records_for_flight(request.flight_id, kind_filter=config.flight_state_record_kind))`. Catch `FdrUnreadableError` → raise `FlightStateNotConfirmedError(flight_id, f"fdr_unreadable: {e!r}", ...)`.
|
||||
3. If `not records` → raise `FlightStateNotConfirmedError(flight_id, "never_landed", remediation="No flight state records in FDR for this flight; check the flight produced state.tick records.")` (treat absence of any state record as never-landed since we have no positive ON_GROUND signal).
|
||||
4. Walk `records` backward from the last (most recent `ts`):
|
||||
- `latest = records[-1]`.
|
||||
- If `latest.payload[config.flight_state_payload_field] != "ON_GROUND"` → raise `FlightStateNotConfirmedError(flight_id, "never_landed", remediation="Most recent flight_state in FDR is not ON_GROUND; the flight may have ended in IN_FLIGHT (e.g. crash, log truncation).")`.
|
||||
- Walk backward through `records[:-1]` while `record.payload[...] == "ON_GROUND"`; the first non-`ON_GROUND` (or the start of the list) bounds the contiguous ON_GROUND run.
|
||||
- `since = first_contiguous_on_ground_record.ts`; `duration_s = (parse_iso(latest.ts) - parse_iso(since)).total_seconds()`.
|
||||
5. If `duration_s < config.upload_min_on_ground_s` → raise `FlightStateNotConfirmedError(flight_id, f"insufficient_duration: {duration_s:.1f}s < {config.upload_min_on_ground_s:.1f}s", inferred_on_ground_duration_s=duration_s, remediation="Wait for the aircraft to be confirmed ON_GROUND for the required duration, then re-run.")`.
|
||||
6. INFO log `kind="c12.upload.confirmed_on_ground"` with `flight_id`, `inferred_on_ground_duration_s`.
|
||||
7. Construct `flight_state = FlightStateSignal(state=ON_GROUND, since_ts=since)` (the DTO comes from C11 per AZ-319's contract).
|
||||
8. Call `report = tile_uploader.upload_pending_tiles(flight_state=flight_state)`. Propagate `UploadGateBlockedError` (defense-in-depth on the airborne side; this should never happen if step 6 confirmed; if it does, log ERROR and re-raise as-is).
|
||||
9. INFO log `kind="c12.upload.complete"` with `tiles_acked`, `tiles_rejected` from `report`.
|
||||
10. Return `report` unchanged.
|
||||
- Composition-root factory at `src/gps_denied_onboard/runtime_root/c12_factory.py` extends T1's `OperatorToolServices` dataclass with `post_landing_upload_orchestrator: PostLandingUploadOrchestrator`. The factory `build_post_landing_upload_orchestrator(config, services) -> PostLandingUploadOrchestrator` constructs the `LocalFdrSegmentReader` over `config.c12.fdr_root` and pulls C11's `tile_uploader` from the wider service registry.
|
||||
- T1's `cli.py` `upload-pending` subcommand resolves `services.post_landing_upload_orchestrator` and calls `.trigger_post_landing_upload(...)`. Maps `FlightStateNotConfirmedError → exit 30`; `UploadGateBlockedError → exit 31`.
|
||||
1. `flight_dir = config.fdr_root / str(request.flight_id)`. If `not flight_dir.exists()` → raise `FlightStateNotConfirmedError(flight_id=str(request.flight_id), not_confirmed_reason="flight_id_not_found", detail="", remediation="Verify <fdr_root>/<flight_id>/ exists; check `config.c12_operator_orchestrator.fdr_root`.")`. ERROR log `kind="c12.upload.refused.flight_id_not_found"`.
|
||||
2. `footer = fdr_footer_reader.read_footer(request.flight_id)`. Catch `FdrUnreadableError` → raise `FlightStateNotConfirmedError(flight_id, "fdr_unreadable", detail=f"{e!r}", remediation="Inspect FDR segment files manually; the parser failed mid-stream.")`. ERROR log `kind="c12.upload.refused.fdr_unreadable"`.
|
||||
3. If `footer is None` → raise `FlightStateNotConfirmedError(flight_id, "footer_missing", detail="", remediation="No flight_footer record found in any segment — the flight likely terminated abnormally (power loss, crash, or close_flight() never ran). Inspect FDR manually; upload requires a clean shutdown.")`. ERROR log `kind="c12.upload.refused.footer_missing"`.
|
||||
4. If `footer.clean_shutdown is False` → raise `FlightStateNotConfirmedError(flight_id, "unclean_shutdown", detail=f"records_dropped_overrun={footer.records_dropped_overrun}, bytes_written={footer.bytes_written}", remediation="The flight footer reports an unclean shutdown. Operator must manually verify the flight outcome before authorising tile upload.")`. ERROR log `kind="c12.upload.refused.unclean_shutdown"` with the four counters in `kv`.
|
||||
5. INFO log `kind="c12.upload.confirmed_clean_shutdown"` with `flight_id`, `flight_ended_at_iso`, `records_written`.
|
||||
6. `inner_request = UploadRequestCut(batch_size=request.batch_size, satellite_provider_url=request.satellite_provider_url, flight_id=request.flight_id)`. `api_key` is not passed to C11 — C11 picks up the satellite-provider auth from its own configuration (per the AZ-319 contract); `api_key` here is for forward-compat with the F10 operator workflow that may sign the upload command itself.
|
||||
7. `report = tile_uploader.upload_pending_tiles(inner_request)`. Any exception from C11 propagates unchanged.
|
||||
8. INFO log `kind="c12.upload.complete"` with `outcome=report.outcome`, `tiles_acked=count(SUCCESS)`, `tiles_rejected=count(REJECTED)`, `batch_uuid=str(report.batch_uuid)`, `public_key_fingerprint=report.public_key_fingerprint`.
|
||||
9. Return `report` unchanged.
|
||||
- Composition-root factory at `src/gps_denied_onboard/runtime_root/c12_factory.py`:
|
||||
- `build_post_landing_upload_orchestrator(config: C12Config, *, tile_uploader: TileUploaderCut) -> PostLandingUploadOrchestrator` — constructs `LocalFdrFooterReader(config.post_landing.fdr_root)` + the orchestrator.
|
||||
- Extends `OperatorOrchestratorServices` dataclass with `post_landing_upload_orchestrator: PostLandingUploadOrchestrator | None = None`.
|
||||
- `build_operator_orchestrator(...)` aggregator: when a `tile_uploader` is passed in, build and wire the orchestrator; otherwise leave the field `None`.
|
||||
- `cli.py` `upload-pending` subcommand resolves `services.post_landing_upload_orchestrator` and calls `.trigger_post_landing_upload(...)`. Maps `FlightStateNotConfirmedError → exit 30` (already defined as `EXIT_FLIGHT_STATE_NOT_CONFIRMED`); any other exception → exit 1.
|
||||
- `__init__.py` re-exports `PostLandingUploadOrchestrator`, `PostLandingUploadRequest`, `FlightStateNotConfirmedError`, `FdrFooterReader`, `LocalFdrFooterReader`, `C12PostLandingConfig`.
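The nine-step method flow above can be sketched as follows. This is a minimal illustration, not the committed implementation: it uses the standard-library `logging.Logger` in place of the project's logger, takes `fdr_root` directly instead of a `C12PostLandingConfig`, trims `FlightFooterRecord` to the fields the orchestrator reads, omits the `remediation` strings, and returns the uploader's report as an opaque object. All of those simplifications are assumptions made for brevity.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Optional, Protocol
from uuid import UUID


@dataclass(frozen=True)
class FlightFooterRecord:  # trimmed to the fields used here (assumption)
    records_written: int
    records_dropped_overrun: int
    bytes_written: int
    rollover_count: int
    clean_shutdown: bool


@dataclass(frozen=True)
class PostLandingUploadRequest:
    flight_id: UUID
    satellite_provider_url: str
    api_key: str
    batch_size: int = 50


@dataclass(frozen=True)
class UploadRequestCut:
    batch_size: int
    satellite_provider_url: str
    flight_id: UUID


class FdrUnreadableError(Exception):
    """Raised by the footer reader on any I/O or parse failure."""


class FlightStateNotConfirmedError(Exception):
    def __init__(self, flight_id: str, not_confirmed_reason: str, detail: str = "") -> None:
        super().__init__(f"{not_confirmed_reason} (flight {flight_id})")
        self.flight_id = flight_id
        self.not_confirmed_reason = not_confirmed_reason
        self.detail = detail


class FdrFooterReader(Protocol):
    def read_footer(self, flight_id: UUID) -> Optional[FlightFooterRecord]: ...


class TileUploaderCut(Protocol):
    def upload_pending_tiles(self, request: UploadRequestCut) -> Any: ...


class PostLandingUploadOrchestrator:
    def __init__(self, *, tile_uploader: TileUploaderCut, fdr_footer_reader: FdrFooterReader,
                 logger: logging.Logger, fdr_root: Path) -> None:
        self._uploader = tile_uploader
        self._reader = fdr_footer_reader
        self._log = logger
        self._fdr_root = fdr_root

    def _refuse(self, flight_id: UUID, reason: str, detail: str = "") -> FlightStateNotConfirmedError:
        # One ERROR line per refusal; api_key is never part of the log payload.
        self._log.error("c12.upload.refused.%s flight_id=%s detail=%s", reason, flight_id, detail)
        return FlightStateNotConfirmedError(str(flight_id), reason, detail)

    def trigger_post_landing_upload(self, request: PostLandingUploadRequest) -> Any:
        fid = request.flight_id
        if not (self._fdr_root / str(fid)).exists():  # step 1
            raise self._refuse(fid, "flight_id_not_found")
        try:
            footer = self._reader.read_footer(fid)  # step 2
        except FdrUnreadableError as exc:
            raise self._refuse(fid, "fdr_unreadable", repr(exc)) from exc
        if footer is None:  # step 3
            raise self._refuse(fid, "footer_missing")
        if not footer.clean_shutdown:  # step 4
            detail = (f"records_dropped_overrun={footer.records_dropped_overrun}, "
                      f"bytes_written={footer.bytes_written}")
            raise self._refuse(fid, "unclean_shutdown", detail)
        self._log.info("c12.upload.confirmed_clean_shutdown flight_id=%s", fid)  # step 5
        inner = UploadRequestCut(request.batch_size, request.satellite_provider_url, fid)  # step 6
        report = self._uploader.upload_pending_tiles(inner)  # step 7
        self._log.info("c12.upload.complete flight_id=%s", fid)  # step 8
        return report  # step 9: the same object the uploader returned
```

Because both collaborators are structural Protocols, a test can pass plain fakes for the reader and the uploader without importing anything from C11 or C13.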
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- `PostLandingUploadOrchestrator` class with the single public method.
|
||||
- `PostLandingUploadRequest` DTO.
|
||||
- `FlightStateNotConfirmedError` with the four documented `not_confirmed_reason` strings + per-reason `remediation`.
|
||||
- `FdrSegmentReader` Protocol.
|
||||
- `LocalFdrSegmentReader` concrete reading on-disk FDR segments.
|
||||
- `PostLandingUploadRequest` DTO (with a plain `str` `api_key`, redacted at runtime per AC-8).
|
||||
- `FlightFooterRecord` DTO (local c12-owned mirror of C13's footer payload).
|
||||
- `FlightStateNotConfirmedError` with the four `not_confirmed_reason` values + per-reason `detail` + `remediation`.
|
||||
- `FdrFooterReader` Protocol.
|
||||
- `LocalFdrFooterReader` concrete reading on-disk FDR segments newest-first.
|
||||
- `FdrUnreadableError` helper exception (caught and rewrapped at the orchestrator boundary).
|
||||
- Composition-root factory.
|
||||
- Wiring of T1's `upload-pending` subcommand to this service.
|
||||
- Conformance unit tests using a fake `FdrSegmentReader` returning scripted record sequences for all 7 acceptance criteria.
|
||||
- Two end-to-end integration tests using real FDR segment fixtures (one ending with confirmed ON_GROUND for 60 s, one ending with IN_FLIGHT) — these are the C12-IT-03 fixtures.
|
||||
- `TileUploaderCut` + `UploadRequestCut` + `UploadBatchReportCut` AZ-507 consumer-side cuts (no direct c11 import from c12 source).
|
||||
- Composition-root factory `build_post_landing_upload_orchestrator(...)` + `OperatorOrchestratorServices.post_landing_upload_orchestrator` field.
|
||||
- Wiring of the `upload-pending` CLI subcommand.
|
||||
- Conformance unit tests using a fake `FdrFooterReader` returning scripted footer records for AC-1..AC-8.
|
||||
- Two integration tests using real FDR fixture files generated via the C13 `FileFdrWriter` (AC-9 clean shutdown, AC-10 unclean shutdown).
|
||||
|
||||
### Excluded
|
||||
|
||||
- The actual upload HTTP machinery (AZ-319).
|
||||
- The actual upload HTTP machinery (AZ-319 / C11).
|
||||
- The FDR record schema or serialiser (AZ-272).
|
||||
- The FDR write side / segment rotation (AZ-291..296).
|
||||
- A "force-upload" override flag to bypass the gate — explicitly NOT supported (defeats the operator-side gate's purpose).
|
||||
- Reading mid-flight tile snapshots from FDR — the upload itself reads tiles from C6 per AZ-319.
|
||||
- The FDR write side / segment rotation / `flight_footer` producer (AZ-291..296, AZ-292).
|
||||
- Any 30-second / contiguous-ON_GROUND threshold logic (REMOVED in batch 44 — the footer is the on-ground signal).
|
||||
- Reading `state.tick` / `flight_state.tick` payloads (REMOVED in batch 44 — the footer's existence + `clean_shutdown` flag is the sole signal).
|
||||
- A "force-upload" override — explicitly NOT supported.
|
||||
- Cross-flight aggregation — one `flight_id` per call.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: ≥ 30 s confirmed ON_GROUND → upload invoked**
|
||||
Given a fake `FdrSegmentReader` returning 60 records, all of them with `flight_state=ON_GROUND` and spanning 60 s of timestamps
|
||||
**AC-1: `flight_footer` with `clean_shutdown=True` → upload invoked**
|
||||
Given a fake `FdrFooterReader` returning `FlightFooterRecord(clean_shutdown=True, records_written=12345, ...)`
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then `tile_uploader.upload_pending_tiles` is called exactly once with `flight_state.state=ON_GROUND` and `flight_state.since_ts` equal to the first contiguous ON_GROUND record's ts; the returned `UploadBatchReport` is the one C11 produced; ONE INFO log `kind="c12.upload.confirmed_on_ground"` with `inferred_on_ground_duration_s ≈ 60.0`; ONE INFO log `kind="c12.upload.complete"`
|
||||
Then `tile_uploader.upload_pending_tiles` is called exactly once with `UploadRequestCut(flight_id=request.flight_id, batch_size=request.batch_size, satellite_provider_url=request.satellite_provider_url)`; the returned `UploadBatchReport` is the one C11 produced; ONE INFO log `kind="c12.upload.confirmed_clean_shutdown"`; ONE INFO log `kind="c12.upload.complete"`
|
||||
|
||||
**AC-2: Insufficient duration → `FlightStateNotConfirmedError("insufficient_duration: ...")`**
|
||||
Given the FDR ends with 15 s contiguous ON_GROUND records (less than the 30 s threshold)
|
||||
**AC-2: `flight_footer` absent → `FlightStateNotConfirmedError("footer_missing")`**
|
||||
Given a fake `FdrFooterReader` returning `None` (no footer record found across any segment)
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="insufficient_duration: 15.0s < 30.0s", inferred_on_ground_duration_s≈15.0)` is raised; `tile_uploader.upload_pending_tiles` is NEVER called; ONE ERROR log `kind="c12.upload.refused.insufficient_duration"`
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="footer_missing", detail="", remediation contains "No flight_footer record found")` is raised; `tile_uploader.upload_pending_tiles` is NEVER called; ONE ERROR log `kind="c12.upload.refused.footer_missing"`
|
||||
|
||||
**AC-3: Never-landed (last record is IN_FLIGHT) → `FlightStateNotConfirmedError("never_landed")`**
|
||||
Given the FDR's most recent `state.tick` record has `flight_state=IN_FLIGHT`
|
||||
**AC-3: `flight_footer` with `clean_shutdown=False` → `FlightStateNotConfirmedError("unclean_shutdown")`**
|
||||
Given a fake `FdrFooterReader` returning `FlightFooterRecord(clean_shutdown=False, records_dropped_overrun=42, bytes_written=987654, ...)`
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="never_landed", inferred_on_ground_duration_s=None)` is raised; uploader NOT called; ONE ERROR log `kind="c12.upload.refused.never_landed"`
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="unclean_shutdown", detail contains "records_dropped_overrun=42")` is raised; uploader NOT called; ONE ERROR log `kind="c12.upload.refused.unclean_shutdown"` containing all four AC-NEW-3 counters in `kv`
|
||||
|
||||
**AC-4: `flight_id` not found in FDR → `FlightStateNotConfirmedError("flight_id_not_found")`**
|
||||
Given `<fdr_root>/<flight_id>/` does not exist
|
||||
**AC-4: `<fdr_root>/<flight_id>/` does not exist → `FlightStateNotConfirmedError("flight_id_not_found")`**
|
||||
Given `config.post_landing.fdr_root / str(request.flight_id)` does not exist
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="flight_id_not_found")` is raised; uploader NOT called; ONE ERROR log `kind="c12.upload.refused.flight_id_not_found"`
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="flight_id_not_found")` is raised; the `FdrFooterReader` is NOT called; uploader NOT called; ONE ERROR log `kind="c12.upload.refused.flight_id_not_found"`
|
||||
|
||||
**AC-5: FDR unreadable → `FlightStateNotConfirmedError("fdr_unreadable: <repr>")`**
|
||||
Given the FDR segments exist but parsing raises `OSError("input/output error")` mid-stream
|
||||
**AC-5: FDR unreadable → `FlightStateNotConfirmedError("fdr_unreadable")`**
|
||||
Given the FDR segments exist but `LocalFdrFooterReader.read_footer` raises `FdrUnreadableError("OSError('input/output error')")` mid-stream
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason=re.compile(r"^fdr_unreadable: .*OSError.*"))` is raised; uploader NOT called; ONE ERROR log `kind="c12.upload.refused.fdr_unreadable"` including the inner repr
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="fdr_unreadable", detail matches r".*OSError.*")` is raised; uploader NOT called; ONE ERROR log `kind="c12.upload.refused.fdr_unreadable"` including the inner repr
|
||||
|
||||
**AC-6: Threshold is configurable**
|
||||
Given `config.c12.upload_min_on_ground_s = 5.0` (override) and the FDR ends with 6 s contiguous ON_GROUND records
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then the call succeeds (uploader invoked); the threshold is read from config, NOT a hardcoded literal
|
||||
**AC-6: Newest-segment-first short-circuit**
|
||||
Given the FDR for `<flight_id>` has three segments (`segment-0000.fdr`, `segment-0001.fdr`, `segment-0002.fdr`) and the `flight_footer` record is in `segment-0002.fdr` (the most recent)
|
||||
When `LocalFdrFooterReader.read_footer(flight_id)` is called
|
||||
Then the reader opens `segment-0002.fdr` FIRST, finds the footer, and never opens `segment-0001.fdr` or `segment-0000.fdr` (assert via a spy on `open(...)` or a custom segment-iteration hook); the call returns in well under 100 ms even when the older segments are >100 MB each
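
The short-circuit in AC-6 can be sketched as below. This is a minimal illustration, not the repository's `LocalFdrFooterReader`: the function name `read_footer_short_circuit`, the `scan_segment` callback, and the use of a plain `dict` for the footer are all hypothetical stand-ins. It assumes the caller already supplies the segments in newest-first order.

```python
from pathlib import Path
from typing import Callable, Iterable, Optional


def read_footer_short_circuit(
    segments_newest_first: Iterable[Path],
    scan_segment: Callable[[Path], Optional[dict]],
) -> Optional[dict]:
    """Return the first footer found, scanning the newest segment first.

    `scan_segment` is a hypothetical callback that opens one segment,
    streams its records, and returns the footer payload or None.
    """
    for segment in segments_newest_first:
        footer = scan_segment(segment)
        if footer is not None:
            # Stop here: older segments are never opened.
            return footer
    # No segment carried a footer (the "footer_missing" case).
    return None
```

A spy passed as `scan_segment` can record which paths were opened, which is one way to assert that the older segments stay untouched.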
|
||||
|
||||
**AC-7: Returns C11's `UploadBatchReport` unchanged**
|
||||
Given a successful upload returning `UploadBatchReport(tiles_acked=42, tiles_rejected=3, ...)`
|
||||
Given a successful upload returning an `UploadBatchReport` with specific `batch_uuid`, `per_tile_status`, `outcome`, `public_key_fingerprint` values
|
||||
When the caller inspects the return value of `trigger_post_landing_upload`
|
||||
Then it is byte-for-byte the `UploadBatchReport` C11 returned (same dataclass instance via passthrough); no field is added, removed, or renamed
|
||||
Then it is the same object (passthrough) returned by `tile_uploader.upload_pending_tiles`; no field is mutated, added, removed, or renamed
|
||||
|
||||
**AC-8: Contiguous ON_GROUND counting starts from the most recent record only**
|
||||
Given the FDR contains a sequence `IN_FLIGHT, ON_GROUND, IN_FLIGHT, ON_GROUND × 60s` (an aborted go-around landing)
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then the contiguous ON_GROUND block counted is the LAST one (60 s), not the earlier ON_GROUND record; the upload is invoked since 60 s ≥ 30 s
|
||||
**AC-8: `api_key` is REDACTED in every log line**
|
||||
Given `PostLandingUploadRequest(api_key="super-secret-token-123", ...)` and an end-to-end run through every refusal mode + the success path
|
||||
When the log records are inspected (via `caplog` capture)
|
||||
Then NO log record's `msg`, `kv`, `extra`, or any string field contains the substring `"super-secret-token-123"`; the CLI's `_emit_invoked` writes `"api_key": "REDACTED"` (matching the AZ-328 `BuildCacheRequest` pattern); the orchestrator never includes `api_key` in any log payload
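
The substring assertion in AC-8 might look roughly like the helper below. It is a sketch under assumptions: `records_leaking` is a hypothetical name, and it works on plain `logging.LogRecord` objects (the type that pytest's `caplog.records` yields) by checking the formatted message plus the `repr` of every attribute on the record, which covers fields attached via `extra`.

```python
import logging
from typing import Iterable, List


def records_leaking(records: Iterable[logging.LogRecord], secret: str) -> List[logging.LogRecord]:
    """Return every record in which `secret` appears anywhere."""
    leaking = []
    for record in records:
        # The formatted message, plus the repr of every attribute
        # (msg, args, and anything attached through `extra`).
        haystack = [record.getMessage()]
        haystack.extend(repr(value) for value in vars(record).values())
        if any(secret in text for text in haystack):
            leaking.append(record)
    return leaking
```

In a pytest test this would typically be used as `assert records_leaking(caplog.records, "super-secret-token-123") == []` after driving every refusal mode and the success path.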
|
||||
|
||||
**AC-9: Empty `flight_state` records → `never_landed`**
|
||||
Given `iter_records_for_flight(...)` yields zero records (no `state.tick` records ever emitted)
|
||||
When `trigger_post_landing_upload(request)` is called
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="never_landed")` is raised (treated as "we have no positive ON_GROUND signal")
|
||||
**AC-9: Real FDR fixture C12-IT-03(a) (clean-shutdown footer) → upload invoked**
|
||||
Given an FDR fixture written by the C13 `FileFdrWriter`'s `close_flight()` path (which always sets `clean_shutdown=True` in the current AZ-292 implementation) at `tests/fixtures/c12_operator_orchestrator/fdr/clean_shutdown/<flight_id>/segment-NNNN.fdr`
|
||||
When `trigger_post_landing_upload(PostLandingUploadRequest(flight_id=<fixture_flight_id>, ...))` is called against a `LocalFdrFooterReader` over the fixture and a fake `TileUploaderCut` that records the call
|
||||
Then the upload is invoked exactly once with `flight_id=<fixture_flight_id>`; the fake's recorded `UploadBatchReport` is returned unchanged
|
||||
|
||||
**AC-10: Real FDR fixture C12-IT-03(a) (60 s confirmed) → upload invoked**
|
||||
Given the C12-IT-03 fixture FDR with confirmed ON_GROUND for 60 s
|
||||
When `trigger_post_landing_upload(request)` is called against the LocalFdrSegmentReader on the fixture
|
||||
Then the upload is invoked; the returned `UploadBatchReport` matches the fixture's expected counts
|
||||
|
||||
**AC-11: Real FDR fixture C12-IT-03(b) (IN_FLIGHT, incomplete log) → refused**
|
||||
Given the C12-IT-03 fixture FDR ending with IN_FLIGHT (truncated)
|
||||
When `trigger_post_landing_upload(request)` is called against the LocalFdrSegmentReader on the fixture
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="never_landed")` is raised; the upload is NOT invoked
|
||||
**AC-10: Real FDR fixture C12-IT-03(b) (no-footer truncation) → refused**
|
||||
Given an FDR fixture WITHOUT a `flight_footer` record (simulate truncation by writing segments via the writer thread and forcibly terminating before `close_flight()` runs; in practice, drop the last segment, to which `close_flight()` would have appended the footer record)
|
||||
When `trigger_post_landing_upload(...)` is called against a `LocalFdrFooterReader` over this fixture
|
||||
Then `FlightStateNotConfirmedError(not_confirmed_reason="footer_missing")` is raised; the upload is NOT invoked
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
- For an 8-hour flight (≤ 64 GB FDR per AC-NEW-3) the orchestrator's read of `state.tick` records completes in ≤ 30 s wall-clock on a developer laptop with NVMe (the records are sparse — `state.tick` is one of many record kinds; the `kind_filter` argument lets the reader skip non-state records cheaply).
|
||||
- Memory peak ≤ 200 MB even with multi-GB FDR segments — `LocalFdrSegmentReader` is a streaming generator, NOT a list-in-memory.
|
||||
- `LocalFdrFooterReader.read_footer(flight_id)` completes in ≤ 1 s wall-clock on a developer laptop with NVMe even when the flight's FDR is 64 GB across many segments — the newest-segment-first short-circuit means a clean-shutdown flight reads only the tail of the last segment.
|
||||
- Memory peak ≤ 50 MB even with multi-GB segments — `LocalFdrFooterReader` is a streaming reader: opens one segment at a time, reads length-prefixed blobs in a bounded buffer, releases the file handle before opening the next.
|
||||
|
||||
**Compatibility**
|
||||
- AZ-272's `FdrRecord.parse` API is the only parser path; this task does NOT re-implement record parsing.
|
||||
- C11's `FlightStateSignal` DTO is consumed unchanged; this task does NOT redefine it.
|
||||
- C13's `flight_footer` record kind + payload shape (AZ-292) is consumed via the schema in `KNOWN_PAYLOAD_KEYS`; this task does NOT redefine the payload keys.
|
||||
- `C12.PostLandingUploadOrchestrator` does NOT import from `c11_tile_manager`; the AZ-507 consumer-side cuts (`TileUploaderCut`, `UploadRequestCut`, `UploadBatchReportCut`) are the only contract.
|
||||
|
||||
**Reliability**
|
||||
- Catches and rewraps the four refusal modes deterministically — operators can script against the four documented `not_confirmed_reason` prefix strings.
|
||||
- Catches and rewraps the four refusal modes deterministically — operators can script against the four documented `not_confirmed_reason` values (`flight_id_not_found`, `footer_missing`, `unclean_shutdown`, `fdr_unreadable`) which form a closed `Literal` type.
|
||||
- Streaming I/O on FDR segments — multi-GB segments do not blow memory.
|
||||
- The threshold default (30.0 s) matches description.md C12-IT-03 exactly.
|
||||
- No background threads, no global state, no caching — every call re-reads the FDR.
|
||||
- `api_key` is `SecretStr` — the type system prevents accidental string concatenation into log messages.
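
The four refusal modes listed under Reliability can be sketched as a single decision function. This is an illustration only: `refusal_reason`, the `Footer` dataclass, and the zero-argument `read_footer` callback are hypothetical, and the local `FdrUnreadableError` is a stand-in for the real exception. The order of checks follows AC-4 (the reader is not called when the flight directory is missing).

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


class FdrUnreadableError(Exception):
    """Stand-in for the reader's error type."""


@dataclass(frozen=True)
class Footer:
    """Hypothetical minimal view of the `flight_footer` payload."""
    clean_shutdown: bool


def refusal_reason(
    flight_dir: Path,
    read_footer: Callable[[], Optional[Footer]],
) -> Optional[str]:
    """Return the `not_confirmed_reason`, or None when upload may proceed."""
    if not flight_dir.is_dir():
        # The reader is deliberately not called in this branch.
        return "flight_id_not_found"
    try:
        footer = read_footer()
    except FdrUnreadableError:
        return "fdr_unreadable"
    if footer is None:
        return "footer_missing"
    if not footer.clean_shutdown:
        return "unclean_shutdown"
    return None
```

Returning a plain string keeps the sketch short; the spec itself raises `FlightStateNotConfirmedError` with the reason attached.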
|
||||
|
||||
## Unit Tests
|
||||
|
||||
| AC Ref | What to Test | Required Outcome |
|
||||
|--------|-------------|-----------------|
|
||||
| AC-1 | Fake reader with 60 ON_GROUND records spanning 60 s | Uploader called once, INFO logs, returns `UploadBatchReport` |
|
||||
| AC-2 | Fake reader with 15 s ON_GROUND tail | `FlightStateNotConfirmedError("insufficient_duration: 15.0s < 30.0s")` |
|
||||
| AC-3 | Fake reader whose last record is IN_FLIGHT | `FlightStateNotConfirmedError("never_landed")` |
|
||||
| AC-4 | Path doesn't exist | `FlightStateNotConfirmedError("flight_id_not_found")` |
|
||||
| AC-5 | Fake reader raises `FdrUnreadableError("OSError(...)")` | `FlightStateNotConfirmedError(re.match("^fdr_unreadable: .*"))` |
|
||||
| AC-6 | Override `upload_min_on_ground_s=5.0` + 6 s ON_GROUND | Upload invoked |
|
||||
| AC-7 | Successful upload, inspect return | Same `UploadBatchReport` instance/fields |
|
||||
| AC-8 | Sequence with go-around (IN_FLIGHT in middle) | Contiguous count is the LAST run only |
|
||||
| AC-9 | Empty `iter_records_for_flight` | `FlightStateNotConfirmedError("never_landed")` |
|
||||
| AC-10 | C12-IT-03(a) fixture | Upload invoked |
|
||||
| AC-11 | C12-IT-03(b) fixture | `FlightStateNotConfirmedError("never_landed")` |
|
||||
| NFR-perf-streaming | Microbench `LocalFdrSegmentReader` over 1 GB synthetic segment | Memory peak ≤ 200 MB; parse rate ≥ 100 MB/s |
|
||||
| AC-1 | Fake reader returns `clean_shutdown=True` | Uploader called once, INFO logs, returns `UploadBatchReport` |
|
||||
| AC-2 | Fake reader returns `None` | `FlightStateNotConfirmedError("footer_missing")` |
|
||||
| AC-3 | Fake reader returns `clean_shutdown=False` | `FlightStateNotConfirmedError("unclean_shutdown")` with counters in `detail` + log `kv` |
|
||||
| AC-4 | `<fdr_root>/<flight_id>/` missing | `FlightStateNotConfirmedError("flight_id_not_found")` |
|
||||
| AC-5 | Fake reader raises `FdrUnreadableError("OSError(...)")` | `FlightStateNotConfirmedError("fdr_unreadable")` w/ inner repr |
|
||||
| AC-6 | Three-segment fixture, footer in newest | `LocalFdrFooterReader` opens only the newest segment |
|
||||
| AC-7 | Success path; inspect return | Same `UploadBatchReport` instance |
|
||||
| AC-8 | `caplog` capture across every code path | `api_key.get_secret_value()` never appears in any log |
|
||||
| AC-9 | C12-IT-03(a) fixture (writer-produced clean footer) | Upload invoked |
|
||||
| AC-10 | C12-IT-03(b) fixture (truncated; no footer) | `FlightStateNotConfirmedError("footer_missing")` |
|
||||
| NFR-perf-streaming | Microbench `LocalFdrFooterReader` over a 1 GB synthetic segment with footer at the end | Memory peak ≤ 50 MB; wall-clock ≤ 1 s |
|
||||
|
||||
## Constraints
|
||||
|
||||
- The four `not_confirmed_reason` strings (`"never_landed"`, `"insufficient_duration: ..."`, `"flight_id_not_found"`, `"fdr_unreadable: ..."`) are a closed contract — adding a new value requires Plan-cycle approval (operators script against these prefixes).
|
||||
- The threshold default 30.0 s matches description.md C12-IT-03 EXACTLY; changing it requires a spec amendment, not just a config change.
|
||||
- The "contiguous ON_GROUND from most recent only" semantic (AC-8) is non-negotiable — counting the union of all ON_GROUND windows would defeat the gate by allowing an aborted-go-around aircraft to qualify based on the brief earlier landing.
|
||||
- The four `not_confirmed_reason` values form a closed `Literal["flight_id_not_found", "footer_missing", "unclean_shutdown", "fdr_unreadable"]` type — adding a new value requires Plan-cycle approval (operators script against these values).
|
||||
- A "force-upload" override is explicitly NOT supported — operators who legitimately need to upload after a non-conforming flight must use a separate forensic path (out of scope this cycle).
|
||||
- `LocalFdrSegmentReader` MUST stream; loading a multi-GB segment fully into memory is an NFR violation (NFR-perf-streaming).
|
||||
- C11's `FlightStateSignal` DTO is the source of truth for the gate signal — this task does NOT define a parallel C12-internal `FlightStateSignal`.
|
||||
- The threshold is a `float`; comparison uses `>=` (so exactly 30.0 s qualifies).
|
||||
- `LocalFdrFooterReader` MUST stream and MUST iterate segments newest-first; loading any segment fully into memory is an NFR violation, and iterating oldest-first defeats AC-6's short-circuit.
|
||||
- C13's `flight_footer` kind + payload schema (`KNOWN_PAYLOAD_KEYS["flight_footer"]`) is the source of truth — this task does NOT duplicate the schema; the local `FlightFooterRecord` dataclass extracts only the fields the orchestrator inspects.
|
||||
- `api_key` is plain `str` (matching `BuildCacheRequest.api_key`); redaction is a runtime guarantee enforced by AC-8 (caught by `caplog` substring assertion). The CLI's `_emit_invoked` writes `"REDACTED"` and the orchestrator never includes `api_key` in any log payload.
|
||||
- C12 does NOT import C11 directly — the AZ-507 consumer-side cuts pattern is enforced (the linter / import-cycle check should fail if `c12_operator_orchestrator/*.py` adds `from gps_denied_onboard.components.c11_tile_manager import ...`).
|
||||
- The orchestrator does NOT consult any `state.tick` / `flight_state.tick` payloads — those are out of scope post batch 44.
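
One way to express the closed `not_confirmed_reason` contract from the constraints above is shown here. It is a hedged sketch: the class below is a simplified stand-in for the real `FlightStateNotConfirmedError`, and the runtime membership check in `__init__` is an assumption added for illustration (a static type checker would already flag bad values).

```python
from typing import Literal, Optional, get_args

NotConfirmedReason = Literal[
    "flight_id_not_found",
    "footer_missing",
    "unclean_shutdown",
    "fdr_unreadable",
]


class FlightStateNotConfirmedError(Exception):
    """Simplified stand-in carrying a closed-set reason plus free-form detail."""

    def __init__(self, not_confirmed_reason: NotConfirmedReason, detail: Optional[str] = None):
        # Assumed runtime guard: reject values outside the Literal.
        if not_confirmed_reason not in get_args(NotConfirmedReason):
            raise ValueError(f"unknown not_confirmed_reason: {not_confirmed_reason!r}")
        super().__init__(not_confirmed_reason)
        self.not_confirmed_reason = not_confirmed_reason
        self.detail = detail
```

Keeping variable text such as the inner `repr` in `detail` rather than in the reason string is what lets operators match on exact values.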
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: AZ-272's record schema names the field something other than `flight_state`**
|
||||
- *Risk*: AZ-272's contract may use `state` or `flight.state` instead of `flight_state`; this task hardcodes the field name in `config.c12.flight_state_payload_field`.
|
||||
- *Mitigation*: The field name is a config knob (default `"flight_state"`); during integration with AZ-272, the default is updated to match AZ-272's actual contract. Tests use the default; integration tests against real FDR fixtures catch a mismatch immediately.
|
||||
**Risk 1: C13 writes the footer to a segment that's not the most recent on disk**
|
||||
- *Risk*: If `close_flight()` triggers a rollover concurrently, the footer might land in `segment_NNN+1.fdr` while older `segment_NNN.fdr` files are still on disk. The reader must still iterate newest-first by integer segment index, not by mtime, to correctly find the footer.
|
||||
- *Mitigation*: `LocalFdrFooterReader` sorts segments by the integer `NNN` in `segment_<NNN>.fdr` (descending), not by filesystem mtime. AC-6 covers the multi-segment case directly. Document the segment-naming dependency on `_docs/02_document/components/14_c13_fdr/description.md` § 1.
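
The integer-index ordering described in this mitigation can be sketched as follows. The helper name `segments_newest_first` and the regular expression are hypothetical. Because this spec writes segment names with both `_` and `-` separators, the pattern accepts either, which is an assumption rather than the confirmed C13 naming rule.

```python
import re
from pathlib import Path
from typing import List

# Assumed naming: "segment", a "_" or "-" separator, digits, ".fdr".
_SEGMENT_RE = re.compile(r"^segment[-_](\d+)\.fdr$")


def segments_newest_first(flight_dir: Path) -> List[Path]:
    """List segment files sorted by integer index, highest first.

    Sorting by the parsed integer (not by name or mtime) keeps
    index 10 ahead of index 2, which a lexicographic sort would not.
    """
    indexed = []
    for path in flight_dir.iterdir():
        match = _SEGMENT_RE.match(path.name)
        if match:
            indexed.append((int(match.group(1)), path))
    indexed.sort(key=lambda pair: pair[0], reverse=True)
    return [path for _, path in indexed]
```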
|
||||
|
||||
**Risk 2: The aircraft logs ON_GROUND briefly during taxi before takeoff**
|
||||
- *Risk*: The flight starts ON_GROUND, transitions to IN_FLIGHT, lands ON_GROUND again. The "contiguous from most recent" semantic correctly handles this — but if the FDR is truncated mid-flight, the most recent record might be from the taxi phase, falsely suggesting a landed flight.
|
||||
- *Mitigation*: The truncation case is captured by AC-3 / AC-11 — a truncated log ending in IN_FLIGHT correctly refuses. A truncated log ending in the early ON_GROUND taxi phase is indistinguishable from a real landing, but this is an FDR integrity concern out of scope; in practice the FDR writes are continuous.
|
||||
**Risk 2: A future cycle introduces additional record kinds at the tail (e.g. `flight_audit`)**
|
||||
- *Risk*: A new tail record kind could push the `flight_footer` deeper into the segment, increasing read latency. Currently the footer is the LAST record before file close, but the contract doesn't forbid later additions.
|
||||
- *Mitigation*: The streaming reader scans the entire newest segment if needed; AC-6 only asserts "doesn't open older segments", not "reads only the last N bytes". A future cycle that adds tail records would still satisfy AC-6.
|
||||
|
||||
**Risk 3: FDR segment file naming convention drift**
|
||||
- *Risk*: C13 (AZ-291..296) may name segments differently than `segment_<NNN>.fdr`.
|
||||
- *Mitigation*: The naming pattern is captured in `LocalFdrSegmentReader` with a `glob_pattern` constructor parameter (default `segment_*.fdr`); update the default if AZ-291 picks a different name. Tests cover both patterns.
|
||||
**Risk 3: The footer's `flight_id` UUID doesn't match the directory name**
|
||||
- *Risk*: An operator could rename the flight directory; the reader would still find a footer but its `flight_id` would mismatch.
|
||||
- *Mitigation*: `LocalFdrFooterReader.read_footer(flight_id)` asserts `footer.flight_id == flight_id` and treats a mismatch as `FdrUnreadableError(f"footer flight_id mismatch: footer={footer.flight_id}, requested={flight_id}")`. The orchestrator rewraps as `FlightStateNotConfirmedError("fdr_unreadable")`.
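
A minimal sketch of that mismatch check, assuming flight IDs are `uuid.UUID` values. The helper name `check_footer_flight_id` is hypothetical and the local `FdrUnreadableError` is a stand-in for the real exception; the message format is the one quoted in the mitigation.

```python
import uuid


class FdrUnreadableError(Exception):
    """Stand-in for the reader's error type."""


def check_footer_flight_id(footer_flight_id: uuid.UUID, requested: uuid.UUID) -> None:
    """Raise when the footer belongs to a different flight than requested."""
    if footer_flight_id != requested:
        raise FdrUnreadableError(
            f"footer flight_id mismatch: footer={footer_flight_id}, requested={requested}"
        )
```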
|
||||
|
||||
**Risk 4: `parse_iso` timezone handling**
|
||||
- *Risk*: Two records with the same wall-clock time but different timezones produce a wrong duration calculation.
|
||||
- *Mitigation*: AZ-272's contract specifies all timestamps are ISO 8601 UTC microseconds; this task asserts UTC at parse time and raises `FdrUnreadableError("non-UTC timestamp in record")` otherwise. Defense-in-depth.
|
||||
**Risk 4: A future cycle changes the `clean_shutdown` flag semantics**
|
||||
- *Risk*: AZ-292 currently hardcodes `clean_shutdown=True` in `close_flight()`; a future cycle might emit `False` for graceful shutdowns that nonetheless lost some records.
|
||||
- *Mitigation*: AC-3 already covers `clean_shutdown=False` → refused. The orchestrator does NOT interpret the four counters — operators do. If a future cycle wants to allow upload despite `clean_shutdown=False` under certain counter thresholds, that's a Plan-cycle change to this task.
|
||||
|
||||
**Risk 5: A future cycle adds a third flight state value (e.g. `EMERGENCY`)**
|
||||
- *Risk*: The contiguous-counting code treats anything other than `ON_GROUND` as breaking the run; a new `EMERGENCY` value during landing rollout could shorten the inferred duration spuriously.
|
||||
- *Mitigation*: Acceptable for this cycle — emergency states should not allow upload anyway. A future cycle that introduces such states must update this task's logic explicitly via a Plan-cycle change.
|
||||
**Risk 5: Symlinks under `<fdr_root>/<flight_id>/`**
|
||||
- *Risk*: An operator could symlink to a different flight's segments; the reader would still find a footer but it would belong to a different flight.
|
||||
- *Mitigation*: Same as Risk 3 — the `flight_id` assertion catches it. Document that `<fdr_root>` is operator-trusted territory; symlink escape is out of scope.
|
||||
|
||||
## Runtime Completeness
|
||||
|
||||
- **Named capability**: post-flight ON_GROUND-gated upload trigger per description.md § 2 (`trigger_post_landing_upload`) + AC-8.4 + C12-IT-03.
|
||||
- **Production code that must exist**: real `PostLandingUploadOrchestrator` consuming real `TileUploader` (AZ-319) + real `LocalFdrSegmentReader` reading real on-disk FDR segments + real `FdrRecord.parse` (AZ-272).
|
||||
- **Allowed external stubs**: tests MAY use fakes for `FdrSegmentReader` and `TileUploader`; the C12-IT-03 integration tests use real FDR fixture files + a fake `TileUploader` that records the call (no real network).
|
||||
- **Unacceptable substitutes**: in-memory FDR (defeats the streaming guarantee NFR); a "force-upload" override (defeats the gate); shelling out to `cat <fdr>` instead of using `FdrRecord.parse` (no schema validation, no forward-compat); reading the FDR via the producer-side ring buffer (wrong API; ring buffer is for live producers, not post-flight reads).
|
||||
- **Named capability**: post-flight clean-shutdown-gated upload trigger per description.md § 2 (`trigger_post_landing_upload`) + AC-8.4 + C12-IT-03.
|
||||
- **Production code that must exist**: real `PostLandingUploadOrchestrator` consuming a real `HttpTileUploader` (AZ-319) via the `TileUploaderCut` Protocol + real `LocalFdrFooterReader` reading real on-disk FDR segments + real `FdrRecord.parse` (AZ-272).
|
||||
- **Allowed external stubs**: tests MAY use fakes for `FdrFooterReader` and `TileUploaderCut`; the C12-IT-03 integration tests use real FDR fixture files (produced by C13's `FileFdrWriter`) + a fake `TileUploaderCut` that records the call (no real network).
|
||||
- **Unacceptable substitutes**: in-memory FDR (defeats the streaming guarantee NFR); a "force-upload" override (defeats the gate); shelling out to `cat <fdr>` instead of using `FdrRecord.parse` (no schema validation, no forward-compat); reading the FDR via the producer-side ring buffer (wrong API; ring buffer is for live producers, not post-flight reads); importing `c11_tile_manager` directly from c12 source (violates AZ-507 consumer-side cuts).
|
||||
|
||||
@@ -2,19 +2,19 @@
|
||||
|
||||
**Task**: AZ-330_c12_operator_reloc_service
|
||||
**Name**: C12 OperatorReLocService
|
||||
**Description**: Implement `OperatorReLocService`, the C12 operator-side of AC-3.4 (operator-relocalization on visual loss; the SUT requests a position hint from the operator after losing satellite anchoring; the operator confirms a candidate; the system re-anchors). Owns: (a) the `ReLocHint` DTO (`approximate_position_wgs84: LatLonAlt`, `confidence_radius_m: float`, `reason: str`) per description.md § 2; (b) the `OperatorCommandTransport` Protocol that E-C8 (a future task in AZ-261) will implement against pymavlink for the actual GCS-link MAVLink encoding + transmission; (c) the `request_reloc(reloc_hint: ReLocHint) -> None` public method that validates the hint at the C12 boundary, calls `transport.send_reloc_hint(...)`, catches the transport's `GcsLinkError` and re-raises with C12-specific context (operator action label, monotonic timestamp, hint summary as a redacted log line), emits an FDR record `kind="c12.reloc.requested"` via the AZ-273 FDR client so the post-flight log carries the operator's action chronologically, and writes an INFO log on success / ERROR log on failure. Best-effort semantics per description.md § 7 — if the GCS link is degraded the operator may need to re-issue manually; this task does NOT auto-retry. Publishes the Protocol contract at `_docs/02_document/contracts/c12_operator_tooling/operator_command_transport.md` so a future E-C8 task implements the same shape against pymavlink without re-negotiating fields. The pattern matches AZ-322's `BackboneEmbedder` Protocol (C10 owns the Protocol; C2 implements it later).
|
||||
**Description**: Implement `OperatorReLocService`, the C12 operator-side of AC-3.4 (operator-relocalization on visual loss; the SUT requests a position hint from the operator after losing satellite anchoring; the operator confirms a candidate; the system re-anchors). Owns: (a) the `ReLocHint` DTO (`approximate_position_wgs84: LatLonAlt`, `confidence_radius_m: float`, `reason: str`) per description.md § 2; (b) the `OperatorCommandTransport` Protocol that E-C8 (a future task in AZ-261) will implement against pymavlink for the actual GCS-link MAVLink encoding + transmission; (c) the `request_reloc(reloc_hint: ReLocHint) -> None` public method that validates the hint at the C12 boundary, calls `transport.send_reloc_hint(...)`, catches the transport's `GcsLinkError` and re-raises with C12-specific context (operator action label, monotonic timestamp, hint summary as a redacted log line), emits an FDR record `kind="c12.reloc.requested"` via the AZ-273 FDR client so the post-flight log carries the operator's action chronologically, and writes an INFO log on success / ERROR log on failure. Best-effort semantics per description.md § 7 — if the GCS link is degraded the operator may need to re-issue manually; this task does NOT auto-retry. Publishes the Protocol contract at `_docs/02_document/contracts/c12_operator_orchestrator/operator_command_transport.md` so a future E-C8 task implements the same shape against pymavlink without re-negotiating fields. The pattern matches AZ-322's `BackboneEmbedder` Protocol (C10 owns the Protocol; C2 implements it later).
|
||||
**Complexity**: 3 points
|
||||
**Dependencies**: AZ-326_c12_cli_app, AZ-273_fdr_client_ringbuf, AZ-263_initial_structure, AZ-269_config_loader, AZ-266_log_module
|
||||
**Component**: c12_operator_tooling (epic AZ-253 / E-C12)
|
||||
**Component**: c12_operator_orchestrator (epic AZ-253 / E-C12)
|
||||
**Tracker**: AZ-330
|
||||
**Epic**: AZ-253 (E-C12)
|
||||
|
||||
### Document Dependencies
|
||||
|
||||
- `_docs/02_document/contracts/c12_operator_tooling/operator_command_transport.md` — produced by this task (frozen Protocol + DTO shape, invariants, test cases for E-C8 to implement against).
|
||||
- `_docs/02_document/contracts/c12_operator_orchestrator/operator_command_transport.md` — produced by this task (frozen Protocol + DTO shape, invariants, test cases for E-C8 to implement against).
|
||||
- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md` — consumed: the `c12.reloc.requested` record envelope.
|
||||
- `_docs/02_document/components/13_c12_operator_tooling/description.md` — § 2 (`OperatorReLocService` interface, `ReLocHint` DTO), § 5 (`GcsLinkError` best-effort), § 7 (best-effort semantics; operator may re-issue).
|
||||
- `_docs/02_document/components/13_c12_operator_tooling/tests.md` — C12-IT-01 (operator re-loc workflow returns SUT to satellite-anchored ≤ 30 s).
|
||||
- `_docs/02_document/components/13_c12_operator_orchestrator/description.md` — § 2 (`OperatorReLocService` interface, `ReLocHint` DTO), § 5 (`GcsLinkError` best-effort), § 7 (best-effort semantics; operator may re-issue).
|
||||
- `_docs/02_document/components/13_c12_operator_orchestrator/tests.md` — C12-IT-01 (operator re-loc workflow returns SUT to satellite-anchored ≤ 30 s).
|
||||
|
||||
## Problem
|
||||
|
||||
@@ -58,8 +58,8 @@ This task delivers the C12 service surface + the Protocol contract + the FDR sid
|
||||
- ERROR log `kind="c12.reloc.failed"` with the redacted summary + `e.reason`.
|
||||
- `fdr_client.enqueue(FdrRecord(kind="c12.reloc.requested", payload={"hint": <full hint dict>, "outcome": "failed", "failure_reason": e.reason, "ts_monotonic": clock.monotonic()}))` — the FDR record carries BOTH the attempt and the failure so the post-flight log shows the operator tried.
|
||||
- Re-raise `GcsLinkError(reason=f"C12 reloc-confirm: {e.reason}", wrapped_exception_repr=repr(e), remediation=e.remediation)` — wrap with C12 prefix in `reason`.
|
||||
- The Protocol contract published at `_docs/02_document/contracts/c12_operator_tooling/operator_command_transport.md` per `templates/api-contract.md`. Includes Shape, Invariants, Non-Goals, Versioning Rules, and at least 3 Test Cases that E-C8's implementer can run against `MavlinkOperatorCommandTransport`.
|
||||
- Composition-root factory at `src/gps_denied_onboard/runtime_root/c12_factory.py` extends T1's `OperatorToolServices` dataclass with `operator_reloc_service: OperatorReLocService`. The factory `build_operator_reloc_service(config, services) -> OperatorReLocService` constructs the service; the `OperatorCommandTransport` is resolved from a wider service registry that includes E-C8's `MavlinkOperatorCommandTransport` (or a fake `LoggingOnlyOperatorCommandTransport` until E-C8 is implemented — fake declared in tests, NOT in production wiring).
|
||||
- The Protocol contract published at `_docs/02_document/contracts/c12_operator_orchestrator/operator_command_transport.md` per `templates/api-contract.md`. Includes Shape, Invariants, Non-Goals, Versioning Rules, and at least 3 Test Cases that E-C8's implementer can run against `MavlinkOperatorCommandTransport`.
|
||||
- Composition-root factory at `src/gps_denied_onboard/runtime_root/c12_factory.py` extends T1's `OperatorOrchestratorServices` dataclass with `operator_reloc_service: OperatorReLocService`. The factory `build_operator_reloc_service(config, services) -> OperatorReLocService` constructs the service; the `OperatorCommandTransport` is resolved from a wider service registry that includes E-C8's `MavlinkOperatorCommandTransport` (or a fake `LoggingOnlyOperatorCommandTransport` until E-C8 is implemented — fake declared in tests, NOT in production wiring).
|
||||
- T1's `cli.py` `reloc-confirm` subcommand resolves `services.operator_reloc_service` and calls `.request_reloc(...)`. The CLI subcommand parses CLI flags `--lat`, `--lon`, `--alt`, `--radius`, `--reason` into a `ReLocHint`. Maps `GcsLinkError → exit 40`; `ValueError → exit 2 (usage)`.
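
The exit-code mapping for `reloc-confirm` could be sketched like this. The function `run_reloc_confirm` and its `action` callback are hypothetical, the local `GcsLinkError` is a simplified stand-in, and returning 0 on success is an assumption (the text above only names the two failure codes).

```python
from typing import Callable


class GcsLinkError(Exception):
    """Simplified stand-in for the transport error."""


def run_reloc_confirm(action: Callable[[], None]) -> int:
    """Run the subcommand body and translate errors to exit codes."""
    try:
        action()
    except GcsLinkError:
        return 40  # GCS link failure
    except ValueError:
        return 2  # usage error (invalid hint values)
    return 0  # assumed success code
```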
|
||||
|
||||
## Scope
|
||||
@@ -70,7 +70,7 @@ This task delivers the C12 service surface + the Protocol contract + the FDR sid
|
||||
- `LatLonAlt` and `ReLocHint` DTOs (or import from `shared_helpers` if WgsConverter already defined `LatLonAlt`).
|
||||
- `OperatorCommandTransport` Protocol.
|
||||
- `GcsLinkError` error type with `reason`, `wrapped_exception_repr`, `remediation`.
|
||||
- The Protocol contract document at `_docs/02_document/contracts/c12_operator_tooling/operator_command_transport.md`.
|
||||
- The Protocol contract document at `_docs/02_document/contracts/c12_operator_orchestrator/operator_command_transport.md`.
|
||||
- FDR record emission via `fdr_client.enqueue` (both success and failure cases).
|
||||
- Composition-root factory.
|
||||
- Wiring of T1's `reloc-confirm` subcommand to this service.
|
||||
@@ -108,7 +108,7 @@ When `request_reloc(hint)` is called
|
||||
Then the transport's `send_reloc_hint` receives the hint with `reason` byte-for-byte equal to the input (no truncation, no normalization); the FDR record's `payload.hint.reason` is the same; the INFO log truncates the displayed reason to 200 chars (display-only) but the underlying transport call is unmodified
|
||||
|
||||
**AC-5: Protocol contract document exists with the exact method signature**
|
||||
Given the published contract at `_docs/02_document/contracts/c12_operator_tooling/operator_command_transport.md`
|
||||
Given the published contract at `_docs/02_document/contracts/c12_operator_orchestrator/operator_command_transport.md`
|
||||
When E-C8's implementer reads the contract to build `MavlinkOperatorCommandTransport`
|
||||
Then the contract specifies the exact Protocol shape (`def send_reloc_hint(self, hint: ReLocHint) -> None`), the `ReLocHint` field shape, the documented `GcsLinkError` raise behaviour, the Versioning Rules, and at least 3 Test Cases
|
||||
|
||||
@@ -133,7 +133,7 @@ When `request_reloc(hint)` is called
|
||||
Then the INFO log line shows `position_lat: 49.99877` and `position_lon: 36.12346` (rounded to 5 decimals); the underlying transport receives the full-precision value (no rounding before transport)
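
The display-only redaction (5-decimal coordinates here, 200-character reason in AC-4) can be illustrated with a small helper. `redacted_summary` is a hypothetical name and the `reason` key is an assumption; `position_lat` and `position_lon` are the field names quoted above. The helper builds a new dict, so the hint passed to the transport is left untouched.

```python
from typing import Dict, Union


def redacted_summary(lat: float, lon: float, reason: str) -> Dict[str, Union[float, str]]:
    """Build the log-only view of a re-localization hint."""
    return {
        "position_lat": round(lat, 5),  # 5 decimals, log display only
        "position_lon": round(lon, 5),
        "reason": reason[:200],  # truncated for display, never for transport
    }
```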
|
||||
|
||||
**AC-10: Composition-root factory does not eager-construct the transport**
|
||||
Given the operator-tool starts up (T1's `cli.py` lazily resolves services)
|
||||
Given the operator-orchestrator starts up (T1's `cli.py` lazily resolves services)
|
||||
When the operator does NOT use the `reloc-confirm` subcommand in this session
|
||||
Then `OperatorCommandTransport` is NEVER instantiated (verifiable via spy on the factory); pymavlink is NEVER imported (NFR-perf-cold-start from T1 holds)
|
||||
|
||||
@@ -202,4 +202,4 @@ Then `OperatorCommandTransport` is NEVER instantiated (verifiable via spy on the
|
||||
|
||||
## Contract
|
||||
|
||||
This task produces/implements the contract at `_docs/02_document/contracts/c12_operator_tooling/operator_command_transport.md`. Consumers (specifically the future E-C8 task implementing `MavlinkOperatorCommandTransport`) MUST read that file — not this task spec — to discover the interface.
|
||||
This task produces/implements the contract at `_docs/02_document/contracts/c12_operator_orchestrator/operator_command_transport.md`. Consumers (specifically the future E-C8 task implementing `MavlinkOperatorCommandTransport`) MUST read that file — not this task spec — to discover the interface.
|
||||
|
||||
@@ -66,7 +66,7 @@ Without this task, the replay-only strategies (FrameSource + Clock + TlogReplayF
|
||||
|
||||
**AC-7: Composition uses Public APIs only** — assert that `compose_replay` imports ONLY `__init__.py` re-exports of each component (per `module-layout.md` Layer-3 / Layer-4 rules). CI-style check via AST scan in the unit test.
|
||||
|
||||
**AC-8: No C6/C10/C11/C12 imports** — assert that `compose_replay` does NOT import any symbol from `components.c6_tile_cache`, `components.c10_provisioning`, `components.c11_tilemanager`, `components.c12_operator_tooling` (per epic scope).
|
||||
**AC-8: No C6/C10/C11/C12 imports** — assert that `compose_replay` does NOT import any symbol from `components.c6_tile_cache`, `components.c10_provisioning`, `components.c11_tilemanager`, `components.c12_operator_orchestrator` (per epic scope).
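
An AST-based check of the kind AC-7 and AC-8 call for might look like the sketch below. `forbidden_imports` is a hypothetical helper, the module fragments are copied from AC-8 as written, and substring matching on the dotted module path is a simplifying assumption.

```python
import ast
from typing import List, Sequence

FORBIDDEN = (
    "components.c6_tile_cache",
    "components.c10_provisioning",
    "components.c11_tilemanager",
    "components.c12_operator_orchestrator",
)


def forbidden_imports(source: str, forbidden: Sequence[str] = FORBIDDEN) -> List[str]:
    """Return imported module names that contain a forbidden fragment."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        hits.extend(name for name in names if any(frag in name for frag in forbidden))
    return hits
```

A unit test would read the source of the module that defines `compose_replay` and assert that the returned list is empty.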
|
||||
|
||||
**AC-9: Configuration + calibration loading** — `compose_replay(config_with_invalid_calib_path)` → `ReplayCompositionError("camera-calibration not found at ...")`.
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
**Task**: AZ-403_replay_dockerfile_ci
|
||||
**Name**: `gps-denied-replay-cli` Dockerfile + GitHub Actions matrix entry + SBOM diff (excludes C6/C10/C11/C12)
|
||||
**Description**: Add the fourth Docker image `gps-denied-replay-cli`: multi-stage build (Python + C1–C5 + cpp/* + replay strategies; NO C6/C10/C11/C12; NO HTTP server). Add a GitHub Actions matrix entry building and pushing this image alongside the existing 3 images (live / research / operator). Add an **SBOM diff CI step** that builds the SBOM (via `syft` or the project's existing SBOM tooling), parses it, and asserts the absence of `c6_tile_cache`, `c10_provisioning`, `c11_tilemanager`, `c12_operator_tooling` packages — verifies AC-4 of the epic. The SBOM diff fails the CI job if any excluded component leaks into the replay image. Image base: same Python + CUDA base as the live image (consistency with TensorRT engines from C7) but with `BUILD_C6=OFF`, `BUILD_C10=OFF`, `BUILD_C11=OFF`, `BUILD_C12=OFF`, `BUILD_VIDEO_FILE_FRAME_SOURCE=ON`, `BUILD_TLOG_REPLAY_ADAPTER=ON`, `BUILD_REPLAY_SINK_JSONL=ON` build args.
|
||||
**Description**: Add the fourth Docker image `gps-denied-replay-cli`: multi-stage build (Python + C1–C5 + cpp/* + replay strategies; NO C6/C10/C11/C12; NO HTTP server). Add a GitHub Actions matrix entry building and pushing this image alongside the existing 3 images (live / research / operator). Add an **SBOM diff CI step** that builds the SBOM (via `syft` or the project's existing SBOM tooling), parses it, and asserts the absence of `c6_tile_cache`, `c10_provisioning`, `c11_tilemanager`, `c12_operator_orchestrator` packages — verifies AC-4 of the epic. The SBOM diff fails the CI job if any excluded component leaks into the replay image. Image base: same Python + CUDA base as the live image (consistency with TensorRT engines from C7) but with `BUILD_C6=OFF`, `BUILD_C10=OFF`, `BUILD_C11=OFF`, `BUILD_C12=OFF`, `BUILD_VIDEO_FILE_FRAME_SOURCE=ON`, `BUILD_TLOG_REPLAY_ADAPTER=ON`, `BUILD_REPLAY_SINK_JSONL=ON` build args.
|
||||
**Complexity**: 3 points
|
||||
**Dependencies**: AZ-402 (CLI entrypoint registered in pyproject); AZ-398 / AZ-399 / AZ-400 / AZ-401 (replay strategies); existing Dockerfile + CI plumbing for the live image (pattern to mirror); `module-layout.md` build-flag table; AZ-263, AZ-269, AZ-266
|
||||
**Component**: replay-cicd (epic AZ-265 / E-DEMO-REPLAY) — Dockerfile at `docker/replay-cli/Dockerfile`; CI at `.github/workflows/build-images.yml` (or equivalent); SBOM-diff script at `ci/sbom_diff_replay.py`
|
||||
@@ -27,7 +27,7 @@ Without this task, the replay binary cannot ship — there's no CI matrix entry
|
||||
- Entrypoint: `gps-denied-replay`.
|
||||
- No HTTP server (no exposed ports; CLI only).
|
||||
- `.github/workflows/build-images.yml` matrix entry for `replay-cli` (image tag, build args, push to registry).
|
||||
- `ci/sbom_diff_replay.py` — generates the SBOM via `syft packages dir:./ -o spdx-json` (or equivalent) on the built image, parses it, asserts the absence of `c6_tile_cache`, `c10_provisioning`, `c11_tilemanager`, `c12_operator_tooling` Python packages. Exit 0 on clean SBOM; exit 1 on leak (with the leaking package name printed).
|
||||
- `ci/sbom_diff_replay.py` — generates the SBOM via `syft packages dir:./ -o spdx-json` (or equivalent) on the built image, parses it, asserts the absence of `c6_tile_cache`, `c10_provisioning`, `c11_tilemanager`, `c12_operator_orchestrator` Python packages. Exit 0 on clean SBOM; exit 1 on leak (with the leaking package name printed).
|
||||
- CI step `replay-cli-sbom-diff` invokes the script after the image build; fails the job on script exit 1.
|
||||
- Documentation: `docker/replay-cli/README.md` documents the image scope + build-args.
|
||||
- Unit / smoke tests: `docker buildx build` of the Dockerfile succeeds locally; SBOM-diff script runs against a pre-built test image fixture.
|
||||
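The parse-and-assert half of `ci/sbom_diff_replay.py` described in the list above might be sketched as follows. This is a hedged illustration rather than the real script: it assumes the SBOM is SPDX JSON with a top-level `packages` array whose entries carry a `name` field, it does not invoke `syft` itself, and the helper names `leaked_packages` and `exit_code` are hypothetical. Treating `-` and `_` as equivalent and matching by substring are assumptions made here so that distribution-style names such as `c11-tilemanager` are still caught.

```python
import json

# Components that must not appear in the replay image (post-rename C12 name).
EXCLUDED = (
    "c6_tile_cache",
    "c10_provisioning",
    "c11_tilemanager",
    "c12_operator_orchestrator",
)


def leaked_packages(spdx_json: str) -> list:
    """Return the sorted names of SBOM packages that match an excluded component."""
    document = json.loads(spdx_json)
    leaks = set()
    for package in document.get("packages", []):
        name = package.get("name", "")
        normalized = name.lower().replace("-", "_")
        if any(token in normalized for token in EXCLUDED):
            leaks.add(name)
    return sorted(leaks)


def exit_code(spdx_json: str) -> int:
    """0 on a clean SBOM, 1 on a leak; leaking names are printed for the CI log."""
    leaks = leaked_packages(spdx_json)
    for name in leaks:
        print(f"excluded component leaked into replay image: {name}")
    return 1 if leaks else 0
```

The CI step would pass the SBOM text to `exit_code` and hand the result to `sys.exit`, so the `replay-cli-sbom-diff` job fails whenever a leak is reported.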
|
||||