# Batch 06: Implementation Report (Cycle 1)

**Tasks**: AZ-291, AZ-292, AZ-293
**Component**: C13 FDR Writer (E-C13)
**Cycle**: 1 (Build → Ship)
**Date**: 2026-05-11

## Summary

Built the C13 FDR writer chain end-to-end. AZ-291 lands the single writer thread + segment file lifecycle + cross-process filelock + ENOSPC degraded mode. AZ-292 lands the `FlightHeader` / `FlightFooter` records and the four per-flight counters (records_written, records_dropped_overrun, bytes_written, rollover_count) that make a flight directory self-describing. AZ-293 lands the per-flight 64 GiB cap policy with oldest-segment-dropped + canonical `segment_rollover` record emission.

The three tasks share a single module (`components/c13_fdr/`) with these new files:

- `errors.py`: five typed exceptions covering construction, open, close, and concurrent-writer failure paths.
- `headers.py`: `FlightHeader` and `FlightFooter` frozen dataclasses.
- `writer.py`: `FileFdrWriter` (AZ-291 + AZ-292).
- `cap_policy.py`: `CapacityCapPolicy` (AZ-293).
- `__init__.py`, `interface.py`: re-exports.

## Features Landed

### AZ-291: Writer thread + segment lifecycle

- `FileFdrWriter(flight_root, flight_id, config, fdr_clients, gcs_alert, *, on_rotation, drain_sleep_s)` constructor.
- `start()`, `stop()`, `open_flight(header)`, `close_flight()` lifecycle methods.
- Background writer thread that loops over every registered `FdrClient.drain(batch_size)` and writes serialised records to the current segment with ` | ` framing.
- Per-segment rotation triggered by `segment_size_bytes` (default 64 MiB).
- Cross-process filelock via `fcntl.flock(LOCK_EX | LOCK_NB)` on `flight_root/.fdr.lock`; held for the entire flight; constructor-time `FdrConcurrentWriterError` on contention.
- ENOSPC degraded mode: one ERROR log + one GCS alert; subsequent failures are log-rate-capped at 1/sec; producer buffers keep draining (records discarded) so producer-side memory does not grow unbounded.
- Public introspection: `current_segment_path()`, `current_segment_bytes()`, `segments_written()`, `is_rolling()`, `is_degraded()`, `current_size_bytes()`, `rollover_count`, `records_dropped_overrun`, `flight_id`, `flight_dir`.

### AZ-292: FlightHeader / FlightFooter + counters

- `FlightHeader` dataclass with `flight_id`, `flight_started_at_iso`, `flight_started_at_monotonic_ns`, `config_snapshot`, `signing_key_rotation_event`, `manifest_content_hashes`, `build_info`.
- `FlightFooter` dataclass with `flight_id`, `flight_ended_at_iso`, `flight_ended_at_monotonic_ns`, `records_written`, `records_dropped_overrun`, `bytes_written`, `rollover_count`, `clean_shutdown`.
- `open_flight(header)` writes the header as the first record of segment 0; rejects flight_id mismatch with `FdrOpenError`.
- `close_flight()` drains pending producer records, builds the footer (iteratively converging `bytes_written` to include the footer's own size), writes it, releases the filelock, and returns the `FlightFooter` to the caller. Idempotent (a second call returns the cached footer).
- Counter integration: `_append_record` increments `_records_written` and `_bytes_written`; `_observe_overrun_record` aggregates `payload.dropped_count` into `_records_dropped_overrun`; `_rotate_segment` bumps `_rollover_count`.

### AZ-293: Capacity cap policy

- `CapacityCapPolicy(cap_bytes, fdr_client, gcs_alert)` callable; invoked by `FileFdrWriter` via the `on_rotation` hook after every per-segment rotation.
- Walks the flight directory, sums on-disk segment sizes + writer's running `current_segment_bytes`, and unlinks the oldest CLOSED segment if total > cap. Repeats until under cap.
- Segment 0 (containing the `flight_header`) is never dropped unless it is the only candidate AND the directory is over cap by itself; in that case it logs `fdr.cap_misconfigured` ERROR + emits one GCS alert and lets the flight continue in degraded mode.
- Each drop enqueues a `kind="segment_rollover"` `FdrRecord` (envelope `producer_id="shared.fdr_client"`) carrying `old_segment`, `new_segment`, `total_bytes_after`; bumps `writer.rollover_count`; logs `fdr.cap_drop` INFO.
- Default `cap_bytes = 64 * 1024**3` (64 GiB exactly per AC-NEW-3 + AC-7); valid range `[1024, 2**40]`.
- No config flag disables `segment_rollover` emission (AC-6 verified by a config-schema scan test).

## Schema / Contract Changes

- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md`: `flight_header` and `flight_footer` payload key sets extended to match AZ-292's task-spec dataclasses. Effective minor bump (1.0.0 → 1.1.0); no breaking change since no producer or consumer used the previous narrow shape.
- `src/gps_denied_onboard/fdr_client/records.py`: `KNOWN_PAYLOAD_KEYS` updated for the two kinds.
- `src/gps_denied_onboard/config/schema.py`: added `FdrWriterConfig` nested inside `FdrConfig`. Fields: `segment_size_bytes` (default 64 MiB), `batch_size` (default 64), `flight_cap_bytes` (default 64 GiB), `debug_log_per_record` (default False).

## Dependency Changes

None. Despite the AZ-291 spec calling for `filelock`, the package was not in `pyproject.toml` and `fcntl.flock` from the stdlib provides equivalent POSIX advisory-lock semantics (kernel auto-releases on process death, directly matching the Risk-3 mitigation). Documented inline in the writer's module docstring.

## Test Results

- **New tests**: 29 (9 for AZ-291, 10 for AZ-292, 10 for AZ-293).
- **Full suite**: 323 passed, 2 skipped (pre-existing cmake / actionlint skips). 0 regressions.
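
The stdlib lock substitution described under Dependency Changes can be sketched as follows. This is a minimal illustration, not the writer's actual code: `acquire_flight_lock`, `release_flight_lock`, and `ConcurrentWriterError` are hypothetical names standing in for the constructor-time lock and `FdrConcurrentWriterError`. Only the `fcntl.flock(LOCK_EX | LOCK_NB)` call and the `.fdr.lock` filename come from this report. `fcntl` is Unix-only.

```python
import fcntl
import os
import tempfile


class ConcurrentWriterError(RuntimeError):
    """Hypothetical stand-in for the report's FdrConcurrentWriterError."""


def acquire_flight_lock(flight_root: str) -> int:
    """Take a non-blocking exclusive advisory lock on <flight_root>/.fdr.lock.

    Returns the open file descriptor. The lock stays held until that
    descriptor is closed or the owning process exits.
    """
    lock_path = os.path.join(flight_root, ".fdr.lock")
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes contention fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError as exc:
        os.close(fd)
        raise ConcurrentWriterError(f"{lock_path} is held by another writer") from exc
    return fd


def release_flight_lock(fd: int) -> None:
    """Explicitly drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    first = acquire_flight_lock(root)
    try:
        acquire_flight_lock(root)  # a second writer on the same flight_root
    except ConcurrentWriterError as err:
        print("second writer rejected:", err)
    release_flight_lock(first)
```

Because a `flock` lock belongs to the open file description, closing the descriptor or losing the process drops it, which is the auto-release property the Risk-3 mitigation relies on.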
## Acceptance Criteria Coverage

| Task | AC | Test | Status |
|------|----|------|--------|
| AZ-291 | AC-1 drain all producers | `test_ac1_drain_all_registered_producers` | PASS |
| AZ-291 | AC-2 per-segment rotation | `test_ac2_per_segment_rotation_at_size_cap` | PASS |
| AZ-291 | AC-3 atomic rotation | `test_ac3_atomic_rotation_no_half_segment` | PASS |
| AZ-291 | AC-4 filelock prevents concurrent | `test_ac4_concurrent_writer_blocked_by_filelock` | PASS |
| AZ-291 | AC-5 ENOSPC degrades + alerts | `test_ac5_enospc_degrades_and_alerts` | PASS |
| AZ-291 | AC-6 stop drains + fsyncs + releases lock | `test_ac6_stop_drains_and_releases_lock` | PASS |
| AZ-291 | AC-7 segment file layout | `test_ac7_segment_layout` | PASS |
| AZ-291 | AC-8 steady-state no overrun | `test_ac8_steady_state_no_overrun` | PASS |
| AZ-292 | AC-1 header is first record | `test_ac1_flight_header_is_first_record` | PASS |
| AZ-292 | AC-2 footer is last record | `test_ac2_flight_footer_is_last_record` | PASS |
| AZ-292 | AC-3 counters reflect reality | `test_ac3_counters_reflect_on_disk_reality` | PASS |
| AZ-292 | AC-4 open_flight FdrOpenError on disk failure | `test_ac4_open_flight_fdrerror_on_disk_failure` | PASS |
| AZ-292 | AC-5 reject flight_id mismatch | `test_ac5_open_flight_rejects_flight_id_mismatch` | PASS |
| AZ-292 | AC-6 close without open raises | `test_ac6_close_without_open_raises` | PASS |
| AZ-292 | AC-7 clean_shutdown=False on teardown | `test_ac7_uncleansed_teardown_no_clean_shutdown` | PASS |
| AZ-292 | AC-8 records_dropped_overrun aggregates | `test_ac8_records_dropped_overrun_aggregates_dropped_counts` | PASS |
| AZ-293 | AC-1 drop oldest when over cap | `test_ac1_drop_oldest_when_dir_exceeds_cap` | PASS |
| AZ-293 | AC-2 loop until under cap | `test_ac2_loop_until_under_cap` | PASS |
| AZ-293 | AC-3 misconfigured cap path | `test_ac3_cap_misconfigured_when_segment_zero_alone` | PASS |
| AZ-293 | AC-4 open segment never dropped | `test_ac4_currently_open_segment_never_dropped` | PASS |
| AZ-293 | AC-5 canonical fields on rollover | `test_ac5_segment_rollover_record_has_canonical_fields` | PASS |
| AZ-293 | AC-6 no disable flag | `test_ac6_no_config_flag_disables_segment_rollover` + `test_config_full_schema_has_no_rollover_disable_field` | PASS |
| AZ-293 | AC-7 default cap is exactly 64 GiB | `test_ac7_default_cap_is_exactly_64_gib` | PASS |
| AZ-293 | AC-8 rollover_count matches | `test_ac8_rollover_count_matches_segment_rollover_records` | PASS |

## Follow-ups

- **AZ-294 / AZ-295 / AZ-296**: mid-flight tile snapshot path, thumbnail rate cap, and takeoff-abort wiring; next sub-tasks in E-C13 (out of scope for Batch 6).
- **Composition root wiring**: the `runtime_root.py` will inject the `CapacityCapPolicy` instance as the writer's `on_rotation` callback when E-C13's full wiring lands (likely a later batch or AZ-270 expansion).
- **NFR-perf microbenches**: NFR-perf-throughput (≥ 200 Hz on Tier-2), NFR-perf-rotation (p99 ≤ 50 ms), NFR-perf-hook (p99 ≤ 50 ms), NFR-perf-multi-drop (≤ 100 ms) are documented in the specs but require Tier-2 hardware to run; tracked for a future Jetson-harness cycle.
- **AZ-294 mid-flight tile snapshot**: depends on the writer being able to record a JSON pointer record without copying the JPEG inline (`sidecar_path` invariant); the existing `_append_record` supports this directly. Implementation will live in this same module.
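
For reference, the AZ-292 footer step that iteratively converges `bytes_written` to include the footer's own size can be sketched as a small fixed-point loop. This is an illustration under stated assumptions, not the code in `writer.py`: `build_footer` is a hypothetical helper, the field set is a subset of `FlightFooter`, and JSON with a trailing newline is an assumed encoding rather than the writer's confirmed on-disk framing.

```python
import json


def build_footer(base_fields: dict, bytes_before_footer: int) -> tuple[dict, bytes]:
    """Return (footer, encoded) such that footer["bytes_written"] counts the footer too.

    Hypothetical helper: the JSON-plus-newline encoding is an assumption
    made for this sketch only.
    """
    bytes_written = bytes_before_footer
    while True:
        footer = {**base_fields, "bytes_written": bytes_written}
        encoded = (json.dumps(footer, sort_keys=True) + "\n").encode("utf-8")
        total = bytes_before_footer + len(encoded)
        if total == bytes_written:
            # Fixed point: the stored value equals the real on-disk total.
            return footer, encoded
        # The encoded length only grows when the number gains a digit,
        # so the loop settles after a few passes.
        bytes_written = total


footer, encoded = build_footer(
    {"flight_id": "flight-001", "records_written": 3, "clean_shutdown": True},
    bytes_before_footer=990,
)
# The footer's own size is included in the counter it carries.
assert footer["bytes_written"] == 990 + len(encoded)
```

The loop is needed because writing a larger `bytes_written` value can itself lengthen the encoded footer by a digit, which a single-pass calculation would miss.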