[AZ-294] [AZ-295] [AZ-296] Finish C13: tile snapshot + record-kind policy + takeoff abort

AZ-294: MidFlightTileSnapshotSink writes orthorectified tile JPEGs
atomically to flight_root/<flight_id>/tiles/<tile_id>.jpg, emits a
kind="mid_flight_tile_snapshot" pointer record, and evicts the oldest
tile when the per-flight 64 MiB cap is exceeded. Adds optional
frame_id to the snapshot payload (fdr_record_schema bump).

AZ-295: RecordKindPolicy with two paired gates:
- enforce_or_raise (producer-side) raises RawFrameWriteForbiddenError
  for raw_nav_frame / raw_ai_cam_frame at the call site, defending
  AC-8.5 / RESTRICT-UAV-4.
- gate_for_writer (writer-side) tumbling-window rate-caps
  failed_tile_thumbnail records at <= 0.1 Hz; over-cap drops are
  coalesced into kind="overrun" records with the originating
  producer slug.

AZ-296: take_off() composition-root sequence with strict ordering
(writer.__init__ -> start -> open_flight -> fc_adapter.__init__ ->
fc_adapter.open). On FdrOpenError, logs ERROR record, calls
writer.stop(), prints the documented FATAL line to stderr, and
sys.exit(EXIT_FDR_OPEN_FAILURE=2). composition_root_protocol bumped
to v1.1.0 with the new constants + takeoff-sequence section.

29 new tests; full suite 356 passed / 2 skipped / 0 failures.
No new dependencies (stdlib only).

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-11 03:52:07 +03:00
parent b5dd6031d2
commit e4ecdaf619
21 changed files with 1657 additions and 9 deletions
@@ -0,0 +1,70 @@
# Batch 07 — Implementation Report (cycle 1)
**Batch**: 7 of N
**Tasks**: AZ-294, AZ-295, AZ-296
**Cycle**: 1
**Date**: 2026-05-11
**Status**: complete (all ACs green; full suite 356 passed, 2 skipped, 0 failures)
## Tickets
| Ticket | Title | Complexity | Outcome |
|--------|-------|------------|---------|
| AZ-294 | C13 mid-flight tile snapshot sidecar (F4) | 3 pt | Done |
| AZ-295 | C13 AC-8.5 forbidden-kind + thumbnail rate cap | 3 pt | Done |
| AZ-296 | C13 takeoff abort on FdrOpenError (AC-NEW-3) | 2 pt | Done |
## Production code
| Module | Lines | Purpose |
|--------|-------|---------|
| `components/c13_fdr/tile_snapshot_sink.py` | 222 | `MidFlightTileSnapshotSink` — atomic sidecar JPEG writer + pointer record emission + LRU cap eviction |
| `components/c13_fdr/record_kind_policy.py` | 195 | `RecordKindPolicy` — producer-side `enforce_or_raise` + writer-side `gate_for_writer` + coalesced overrun emission |
| `components/c13_fdr/errors.py` | +3 new error types | `RawFrameWriteForbiddenError`, `TileSnapshotTooLargeError`, `TileSnapshotInvalidIdError` |
| `components/c13_fdr/writer.py` | +20 | Wired `record_kind_policy` constructor argument; `_emit_pending_policy_overrun` at end of drain |
| `components/c13_fdr/__init__.py` | +12 | Exported new public surface |
| `config/schema.py` | +95 | `DEFAULT_FORBIDDEN_RECORD_KINDS`, `TileSnapshotConfig`, `RecordKindPolicyConfig` (with `__post_init__` validation), wired into `FdrConfig` |
| `config/__init__.py` | +5 | Exported the new config classes |
| `fdr_client/records.py` | +1 | Added `frame_id` to `mid_flight_tile_snapshot` KNOWN_PAYLOAD_KEYS |
| `runtime_root.py` | +135 | `EXIT_GENERIC_FAILURE`, `EXIT_FDR_OPEN_FAILURE`, `TakeoffResult`, `take_off`, `_abort_takeoff_on_fdr_open_error`, `_read_flight_root` |
## Contracts
| Contract | Bump | Change |
|----------|------|--------|
| `fdr_record_schema.md` | v1.1.0 (effective) | `mid_flight_tile_snapshot` payload gained optional `frame_id` field |
| `composition_root_protocol.md` | v1.0.0 → v1.1.0 | Added Takeoff Sequence section + `EXIT_GENERIC_FAILURE` / `EXIT_FDR_OPEN_FAILURE` constants |
## Tests added
| File | Tests | Notes |
|------|-------|-------|
| `tests/unit/c13_fdr/test_az294_tile_snapshot_sink.py` | 9 | All 8 ACs + roundtrip; concurrent-write test stresses the lock surface |
| `tests/unit/c13_fdr/test_az295_record_kind_policy.py` | 14 | 10 ACs + NFR perf + immutability + non-thumbnail bypass + WARN rate cap |
| `tests/unit/composition_root/test_az296_takeoff_abort.py` | 10 | 8 ACs + perf + reliability; mix of subprocess (`sys.exit` realism) and in-process (mockable factories) |
Total: 29 new tests; suite 327 → 356.
## Dependency changes
None. Every new module uses stdlib only.
## Schema changes
- `FdrConfig.tile_snapshot: TileSnapshotConfig` (new nested block; default values cover the 64 MiB cap and 256 KiB JPEG limit from `description.md`).
- `FdrConfig.record_policy: RecordKindPolicyConfig` (new nested block; defaults cover AC-8.5 forbidden set + 0.1 Hz thumbnail rate cap).
Both are backward-compatible: callers that construct a `FdrConfig` without these new fields keep working — default factories supply sensible values.
## Risks & follow-ups
- **Composition root `main()` does NOT call `take_off()` yet.** `take_off` is the new airborne entrypoint contract, but `runtime_root.main()` still only calls `compose_root`. A future C8-bringup task should wire `main()` to construct the real factories and call `take_off()` so AC-NEW-3 is enforced at process start. Documented in the batch 07 review (informational finding #3).
- **`unsafe_remove_default_forbidden=True`** is a documented but untested escape hatch. Not used in any standard preset. Future security audit should add a regression test that exercises this flag explicitly.
- **Tile-snapshot tile_id uses a regex bound to 128 chars**. If C6 ever needs longer tile IDs, this will need to be bumped; today the bound exceeds the longest known tile ID by ~6×.
## Lint / format / tests
- `python -m ruff check src/ tests/` → All checks passed.
- `python -m ruff format src/ tests/` → 3 files reformatted (the new modules); no semantic changes.
- `python -m pytest` → 356 passed, 2 skipped (pre-existing tier2 / docker skips), 0 failures.
- No new lints in any file touched by the batch (`ReadLints` clean).