# FDR Log Bridge (ERROR + WARN forwarding)

**Task**: AZ-267_fdr_log_bridge
**Name**: FDR Log Bridge
**Description**: Subscribe a logging Handler to the shared logger that forwards every ERROR and WARN record into the Flight Data Recorder via the FDR producer client, tagged `kind="log"` so post-flight tooling can correlate log events with the rest of the recorded telemetry.
**Complexity**: 2 points
**Dependencies**: AZ-266_log_module, AZ-247 (forward dependency: FDR producer + record schema not yet decomposed; this task's contract surface is satisfied once AZ-247's record schema contract is published)
**Component**: shared.logging (cross-cutting; epic AZ-245 / E-CC-LOG)
**Tracker**: AZ-267
**Epic**: AZ-245 (E-CC-LOG)

### Document Dependencies

- `_docs/02_document/contracts/shared_logging/log_record_schema.md`: log envelope this bridge consumes (produced by AZ-266).
- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md`: FDR record schema this bridge writes into (produced by AZ-247; document does not yet exist, so Step 4 cross-verification will catch the forward reference).

## Problem

The acceptance criterion "ERROR + WARN records appear in FDR with `kind = "log"` and a back-reference to the originating component" requires a bridge between the shared Python `logging` machinery and the FDR producer client. Without this bridge, post-flight tools cannot correlate a `c5_state` ERROR log with the surrounding telemetry frames captured at the same flight time.

## Outcome

- Every emitted log record at level WARN or ERROR is enqueued into the FDR producer queue with `kind="log"` and the originating component slug preserved.
- INFO and DEBUG records are NEVER enqueued into FDR (verified by the contract test in PBI #3 of this epic).
- The bridge never blocks the calling thread: it uses the FDR producer client's drop-oldest semantics so a saturated queue cannot stall a `logger.error(...)` call on the hot path.
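The outcome above can be illustrated with a minimal sketch. It is not the AZ-247 client or the agreed record schema: `StubFdrProducer`, its `enqueue`/`snapshot` methods, the `dropped` counter, and the `msg` field are hypothetical stand-ins, and taking `component` from `LogRecord.name` is an assumption. Only `kind`, `level`, `component`, and `exc` are field names taken from this task.

```python
import logging
from collections import deque


class StubFdrProducer:
    """Hypothetical stand-in for the AZ-247 FDR producer client."""

    def __init__(self, capacity: int = 1024) -> None:
        # A bounded deque discards its oldest item on append when full,
        # which models drop-oldest without ever blocking the caller.
        self._queue = deque(maxlen=capacity)
        self.dropped = 0  # assumed name for the client's drop counter

    def enqueue(self, record: dict) -> None:
        if len(self._queue) == self._queue.maxlen:
            self.dropped += 1
        self._queue.append(record)

    def snapshot(self) -> list:
        return list(self._queue)


class FdrLogBridge(logging.Handler):
    """Forwards WARN and above into the producer, tagged kind="log"."""

    def __init__(self, producer: StubFdrProducer) -> None:
        # Handler-level threshold: INFO and DEBUG never reach emit().
        super().__init__(level=logging.WARNING)
        self._producer = producer
        self._exc_formatter = logging.Formatter()

    def emit(self, record: logging.LogRecord) -> None:
        try:
            exc = ""
            if record.exc_info and record.exc_info[0] is not None:
                exc = self._exc_formatter.formatException(record.exc_info)
            # Assumption: stdlib WARNING maps to "WARN"; other levels keep
            # their stdlib name (the task does not specify CRITICAL).
            level = "WARN" if record.levelno == logging.WARNING else record.levelname
            self._producer.enqueue({
                "kind": "log",
                "level": level,
                "component": record.name,  # assumed source of the component slug
                "msg": record.getMessage(),  # hypothetical field name
                "exc": exc,
            })
        except Exception:
            # Standard Handler idiom: report the failure without raising
            # into the code that called logger.error(...).
            self.handleError(record)
```

Attaching it would then be a matter of `logger.addHandler(FdrLogBridge(producer))` on whichever logger AZ-266 exposes. Filtering at the handler level keeps the INFO/DEBUG exclusion independent of how the logger itself is configured.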
## Scope

### Included

- A logging Handler subclass installed onto the root onboard logger (or each `get_logger(...)` instance, whichever the AZ-266 implementation chose) that subscribes to records at WARN and ERROR.
- Translation logic from `LogRecord` (per `log_record_schema` v1.0.0) into the FDR record envelope expected by the FDR producer client, with `kind="log"` and a `component` back-reference.
- Wire-up in the composition root (consumed from AZ-246 / E-CC-CONF) so the bridge is attached exactly once, after the logger and the FDR client are both initialised.

### Excluded

- The FDR producer client itself (owned by AZ-247 / E-CC-FDR-CLIENT).
- The on-disk FDR segment writer thread (owned by AZ-248 / E-C13).
- The contract test that verifies "DEBUG + INFO never reach FDR" (owned by PBI #3 of this epic, the next task).
- Per-component log call sites (owned by each component epic).

## Acceptance Criteria

**AC-1: WARN records reach FDR**
Given the bridge is installed and the FDR client's queue is below capacity
When any component emits `logger.warning(...)` via the shared logger
Then a single FDR record with `kind="log"`, `level="WARN"`, and `component` set to the originating component slug is enqueued

**AC-2: ERROR records reach FDR with traceback when applicable**
Given the bridge is installed
When a component emits `logger.exception(...)` from inside an `except` clause
Then the enqueued FDR record's `exc` field carries the formatted traceback string from the `LogRecord`

**AC-3: INFO and DEBUG never reach FDR**
Given the bridge is installed
When any component emits `logger.info(...)` or `logger.debug(...)`
Then no FDR record is enqueued for that log call (verified by both unit tests here and the contract test in the next task)

**AC-4: Backpressure is non-blocking**
Given the FDR producer queue is at its drop-oldest threshold
When a component emits `logger.error(...)` on the hot path
Then the call returns within the same latency budget as a stdout-only WARN call (no blocking on the queue), and the FDR client's existing drop counter is incremented

**AC-5: Single attachment**
Given `compose_root(config)` runs at process start
When the bridge wire-up is invoked
Then exactly one bridge Handler is attached to the logger; reinitialising the composition root in tests does not stack duplicates

## Non-Functional Requirements

**Performance**

- The bridge adds ≤ 0.05 ms p99 latency on top of the formatter's 0.2 ms budget (i.e. `logger.error` → bridge enqueue total p99 ≤ 0.25 ms on Tier-2).

**Reliability**

- A failure to enqueue (queue full + drop-oldest already saturated) MUST NOT raise into the caller; it MUST log an internal `WARN` record once every N occurrences, where N is at least 1000. That record goes to stdout only: recursion into the bridge is short-circuited by a thread-local flag.

## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | Emit a WARN through the shared logger with the bridge installed | Stub FDR queue receives one record with `kind="log"`, `level="WARN"`, `component` matching origin |
| AC-2 | Inside an `except` block, call `logger.exception("boom")` | Stub FDR queue's record carries non-empty `exc` traceback string |
| AC-3 | Emit INFO and DEBUG records | Stub FDR queue receives zero records |
| AC-4 | Pre-fill stub FDR queue to drop-oldest threshold, then emit an ERROR | Caller returns under 0.5 ms wall clock; FDR client's drop counter increments |
| AC-5 | Call `compose_root` twice with the same config in a single process | Logger has exactly one bridge Handler attached after the second call |

## Constraints

- The bridge has a forward dependency on AZ-247 (FDR producer client + record schema). It cannot pass its own AC tests until AZ-247 is implemented; Step 4 cross-verification will record this temporal dependency in `_dependencies_table.md`.
- The bridge's record translation MUST consume only the public surface of `log_record_schema` v1.0.0, with no peeking into formatter internals.
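The single-attachment requirement (AC-5) could be met with an idempotent wire-up helper along these lines. This is a sketch under assumptions: `BridgeHandler` is a placeholder for the real bridge class, and `attach_bridge_once` is a hypothetical helper that `compose_root` might call; neither name comes from AZ-246 or AZ-266.

```python
import logging


class BridgeHandler(logging.Handler):
    """Placeholder for the real bridge Handler; emit() is a no-op here."""

    def emit(self, record: logging.LogRecord) -> None:
        pass


def attach_bridge_once(logger: logging.Logger) -> BridgeHandler:
    """Attach a bridge to `logger` unless one is already present.

    Returns the attached instance, so repeated calls are harmless.
    """
    for handler in logger.handlers:
        if isinstance(handler, BridgeHandler):
            return handler  # already wired up; do not stack a duplicate
    bridge = BridgeHandler(level=logging.WARNING)
    logger.addHandler(bridge)
    return bridge
```

Checking by handler type rather than keeping a module-level "already attached" flag means the check still holds when tests build several composition roots against the same logger.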
## Risks & Mitigation

**Risk 1: Recursion via internal `WARN` on enqueue failure**

- *Risk*: The "queue full" internal WARN itself goes through the bridge, recurses, and corrupts the queue further.
- *Mitigation*: Thread-local "in-bridge" flag short-circuits any logging call originating from the bridge itself; verified by a unit test that fills the queue and asserts no infinite loop.

**Risk 2: Forward dependency on AZ-247 contract not yet written**

- *Risk*: The FDR record schema is described in epic AZ-247's text but not yet a contract file; this task's expectations may drift from AZ-247's eventual contract.
- *Mitigation*: AZ-247's first PBI MUST publish `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md` before AZ-247's other PBIs; this task's implementation begins only after that contract exists. Step 4 cross-verification flags the temporal dependency.