mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 21:21:14 +00:00
Decompose Step 6 snapshot: 140 task specs + contract docs
Closes out greenfield Step 6 (Decompose) for all 14 components (C1-C13 + cross-cutting helpers/replay). Covers tasks AZ-266..AZ-446 plus the _dependencies_table.md and component contract documents. State file updated to greenfield Step 7 (Implement), not_started.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,100 @@
# FDR Log Bridge (ERROR + WARN forwarding)

**Task**: AZ-267_fdr_log_bridge
**Name**: FDR Log Bridge
**Description**: Attach a `logging.Handler` to the shared logger that forwards every ERROR and WARN record into the Flight Data Recorder (FDR) via the FDR producer client, tagged `kind="log"` so post-flight tooling can correlate log events with the rest of the recorded telemetry.
**Complexity**: 2 points
**Dependencies**: AZ-266_log_module, AZ-247 (forward dependency: the FDR producer and record schema are not yet decomposed; this task's contract surface is satisfied once AZ-247's record schema contract is published)
**Component**: shared.logging (cross-cutting; epic AZ-245 / E-CC-LOG)
**Tracker**: AZ-267
**Epic**: AZ-245 (E-CC-LOG)

### Document Dependencies

- `_docs/02_document/contracts/shared_logging/log_record_schema.md`: log envelope this bridge consumes (produced by AZ-266).
- `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md`: FDR record schema this bridge writes into (produced by AZ-247). The document does not yet exist; Step 4 cross-verification will catch the forward reference.

## Problem

The acceptance criterion "ERROR + WARN records appear in FDR with `kind = "log"` and a back-reference to the originating component" requires a bridge between the shared Python `logging` machinery and the FDR producer client. Without this bridge, post-flight tools cannot correlate a `c5_state` ERROR log with the surrounding telemetry frames captured at the same flight time.

## Outcome

- Every emitted log record at level WARN or ERROR is enqueued into the FDR producer queue with `kind="log"` and the originating component slug preserved.
- INFO and DEBUG records are NEVER enqueued into FDR (verified by the contract test in PBI #3 of this epic).
- The bridge never blocks the calling thread: it relies on the FDR producer client's drop-oldest semantics, so a saturated queue cannot stall a `logger.error(...)` call on the hot path.

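
The three outcomes above can be sketched with a handler whose level threshold does the filtering and a bounded queue that evicts the oldest entry instead of blocking. This is a minimal sketch under stated assumptions, not the AZ-247 client: `DropOldestQueue`, `put_nowait`, `dropped`, and the `msg` field are hypothetical names, and the sketch assumes the logger name doubles as the component slug.

```python
import logging
from collections import deque


class DropOldestQueue:
    """Hypothetical stand-in for the AZ-247 FDR producer queue."""

    def __init__(self, capacity: int) -> None:
        self._items = deque(maxlen=capacity)
        self.dropped = 0

    def put_nowait(self, item: dict) -> None:
        # deque(maxlen=...) evicts the oldest item on append when full,
        # so this call never blocks; the eviction is counted first.
        if len(self._items) == self._items.maxlen:
            self.dropped += 1
        self._items.append(item)

    def snapshot(self) -> list:
        return list(self._items)


class FdrLogBridge(logging.Handler):
    """Forwards records at WARNING and above; INFO and DEBUG never reach emit()."""

    def __init__(self, queue: DropOldestQueue) -> None:
        # Note: a WARNING threshold also admits CRITICAL. Whether CRITICAL
        # belongs in FDR is not stated by the task and is left open here.
        super().__init__(level=logging.WARNING)
        self._queue = queue

    def emit(self, record: logging.LogRecord) -> None:
        # Python names the level "WARNING"; the task text uses "WARN".
        level = "WARN" if record.levelno == logging.WARNING else record.levelname
        self._queue.put_nowait({
            "kind": "log",
            "level": level,
            "component": record.name,  # assumption: logger name == component slug
            "msg": record.getMessage(),
        })


queue = DropOldestQueue(capacity=2)
logger = logging.getLogger("c5_state")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(FdrLogBridge(queue))

logger.info("ignored")                # below the handler threshold
logger.warning("gps drift")
logger.error("state diverged")
logger.error("state diverged again")  # queue is full: "gps drift" is evicted
```

After the last call the queue holds the two ERROR records and `queue.dropped` is 1. The length check and the append are not a single atomic step, so a real client would need its own synchronisation for an exact drop count.
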
## Scope

### Included

- A logging Handler subclass installed onto the root onboard logger (or each `get_logger(...)` instance, whichever the AZ-266 implementation chose) that subscribes to records at WARN and ERROR.
- Translation logic from `LogRecord` (per `log_record_schema` v1.0.0) into the FDR record envelope expected by the FDR producer client, with `kind="log"` and a `component` back-reference.
- Wire-up in the composition root (consumed from AZ-246 / E-CC-CONF) so the bridge is attached exactly once, after the logger and the FDR client are both initialised.

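
The translation step listed above might look roughly like the following. Neither `log_record_schema` v1.0.0 nor the FDR record schema is available here, so this sketch reads only standard `logging.LogRecord` attributes, and every envelope field other than `kind`, `level`, `component`, and `exc` (for example `ts` and `msg`) is an assumption.

```python
import logging
import sys

# A bare Formatter is used only for its formatException() helper.
_TRACEBACK_FORMATTER = logging.Formatter()


def to_fdr_record(record: logging.LogRecord) -> dict:
    """Map a LogRecord onto a hypothetical FDR envelope."""
    exc_text = None
    if record.exc_info:
        # Renders the (type, value, traceback) triple as one string.
        exc_text = _TRACEBACK_FORMATTER.formatException(record.exc_info)
    return {
        "kind": "log",
        "level": "WARN" if record.levelno == logging.WARNING else record.levelname,
        "component": record.name,  # assumption: logger name == component slug
        "ts": record.created,      # assumed field: epoch seconds from the record
        "msg": record.getMessage(),
        "exc": exc_text,
    }


plain = to_fdr_record(
    logging.LogRecord(
        "c5_state", logging.WARNING, "example.py", 1, "drift %.1f m", (4.2,), None
    )
)

try:
    1 / 0
except ZeroDivisionError:
    failed = to_fdr_record(
        logging.LogRecord(
            "c5_state", logging.ERROR, "example.py", 2, "update failed", None,
            sys.exc_info(),
        )
    )
```

Here `plain["exc"]` is `None`, while `failed["exc"]` holds the formatted traceback text, which is the behaviour AC-2 asks for.
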
### Excluded

- The FDR producer client itself (owned by AZ-247 / E-CC-FDR-CLIENT).
- The on-disk FDR segment writer thread (owned by AZ-248 / E-C13).
- The contract test that verifies "DEBUG + INFO never reach FDR" (owned by PBI #3 of this epic, the next task).
- Per-component log call sites (owned by each component epic).

## Acceptance Criteria

**AC-1: WARN records reach FDR**
Given the bridge is installed and the FDR client's queue is below capacity
When any component emits `logger.warning(...)` via the shared logger
Then a single FDR record with `kind="log"`, `level="WARN"`, and `component=<originating component slug>` is enqueued

**AC-2: ERROR records reach FDR with traceback when applicable**
Given the bridge is installed
When a component emits `logger.exception(...)` from inside an `except` clause
Then the enqueued FDR record's `exc` field carries the formatted traceback string from the `LogRecord`

**AC-3: INFO and DEBUG never reach FDR**
Given the bridge is installed
When any component emits `logger.info(...)` or `logger.debug(...)`
Then no FDR record is enqueued for that log call (verified by both unit tests here and the contract test in the next task)

**AC-4: Backpressure is non-blocking**
Given the FDR producer queue is at its drop-oldest threshold
When a component emits `logger.error(...)` on the hot path
Then the call returns within the same latency budget as a stdout-only WARN call (no blocking on the queue), and the FDR client's existing drop counter is incremented

**AC-5: Single attachment**
Given `compose_root(config)` runs at process start
When the bridge wire-up is invoked
Then exactly one bridge Handler is attached to the logger; reinitialising the composition root in tests does not stack duplicates

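
One way to satisfy AC-5 is to make the wire-up idempotent by checking for an existing bridge before adding a new one. The sketch below is illustrative only: `BridgeHandler`, `attach_bridge_once`, the `logger_name` config key, and this `compose_root` body are hypothetical stand-ins for the real composition root from AZ-246.

```python
import logging


class BridgeHandler(logging.Handler):
    """Placeholder for the real bridge; emit is a no-op in this sketch."""

    def emit(self, record: logging.LogRecord) -> None:
        pass


def attach_bridge_once(logger: logging.Logger, handler: BridgeHandler) -> logging.Handler:
    """Attach handler unless a BridgeHandler is already present; return the active one."""
    for existing in logger.handlers:
        if isinstance(existing, BridgeHandler):
            return existing
    logger.addHandler(handler)
    return handler


def compose_root(config: dict) -> logging.Logger:
    # Hypothetical composition root: only the bridge wire-up is shown.
    logger = logging.getLogger(config["logger_name"])
    attach_bridge_once(logger, BridgeHandler(level=logging.WARNING))
    return logger


config = {"logger_name": "onboard.demo"}
compose_root(config)
logger = compose_root(config)  # second call must not stack a duplicate
bridges = [h for h in logger.handlers if isinstance(h, BridgeHandler)]
```

This variant keeps the first handler and discards the second. If a re-run of the composition root creates a fresh FDR client, replacing the old handler instead may be the better choice; the task text does not say which is intended.
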
## Non-Functional Requirements

**Performance**
- The bridge adds ≤ 0.05 ms p99 latency on top of the formatter's 0.2 ms budget (i.e. `logger.error` → bridge enqueue total p99 ≤ 0.25 ms on Tier-2).

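
A p99 figure like the one above can be estimated with a simple timing loop. The following is only a rough sketch: `NullBridge` and `p99_latency_ms` are hypothetical names, the handler does no real enqueue work, and numbers from a development machine say nothing about the Tier-2 target.

```python
import logging
import statistics
import time


class NullBridge(logging.Handler):
    """Placeholder handler; a real measurement would use the actual bridge."""

    def emit(self, record: logging.LogRecord) -> None:
        pass


def p99_latency_ms(logger: logging.Logger, samples: int = 5000) -> float:
    """Time `samples` logger.error calls and return the 99th percentile in ms."""
    durations = []
    for i in range(samples):
        start = time.perf_counter_ns()
        logger.error("sample %d", i)
        durations.append((time.perf_counter_ns() - start) / 1e6)
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    return statistics.quantiles(durations, n=100)[98]


logger = logging.getLogger("latency.demo")
logger.propagate = False
logger.addHandler(NullBridge(level=logging.WARNING))

p99 = p99_latency_ms(logger)
print(f"logger.error p99 = {p99:.4f} ms (budget: 0.25 ms on Tier-2)")
```

Measuring once with the bridge attached and once without gives an estimate of the bridge's own contribution, which is the quantity the 0.05 ms bound applies to.
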
**Reliability**
- A failure to enqueue (queue full and drop-oldest already saturated) MUST NOT raise into the caller. The bridge MUST instead emit one internal `WARN` record per N failures, where N is at least 1000. That record goes to stdout only, and recursion into the bridge is short-circuited by a thread-local flag.

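
The reliability rule combines three pieces: swallow enqueue failures, rate-limit the internal warning, and guard against re-entry. A minimal sketch follows, assuming a hypothetical `enqueue` callable that raises on failure. As a simplification it writes the warning straight to a stream rather than through a stdout-only logger, and it warns on the first failure and then once per N, which is one possible reading of the requirement.

```python
import io
import logging
import sys
import threading


class GuardedBridge(logging.Handler):
    WARN_EVERY = 1000  # N from the requirement; must be at least 1000

    def __init__(self, enqueue, stream=None) -> None:
        super().__init__(level=logging.WARNING)
        self._enqueue = enqueue  # hypothetical callable; raises when it cannot enqueue
        self._stream = stream if stream is not None else sys.stdout
        self._guard = threading.local()
        self.failures = 0

    def emit(self, record: logging.LogRecord) -> None:
        if getattr(self._guard, "active", False):
            return  # the record originated inside the bridge; do not recurse
        self._guard.active = True
        try:
            self._enqueue(record)
        except Exception:
            self._note_failure()  # never propagate into the caller
        finally:
            self._guard.active = False

    def _note_failure(self) -> None:
        self.failures += 1
        if self.failures % self.WARN_EVERY == 1:
            self._stream.write(
                f"WARN fdr_log_bridge: {self.failures} enqueue failure(s)\n"
            )


logger = logging.getLogger("guard.demo")
logger.propagate = False


def failing_enqueue(record: logging.LogRecord) -> None:
    logger.warning("queue full")  # would re-enter the bridge without the guard
    raise OverflowError("queue saturated")


out = io.StringIO()
bridge = GuardedBridge(failing_enqueue, stream=out)
logger.addHandler(bridge)

for _ in range(1001):
    logger.error("boom")  # returns normally every time
```

After 1001 failing calls the stream holds two warning lines (for failures 1 and 1001) and no exception has reached the caller.
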
## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | Emit a WARN through the shared logger with the bridge installed | Stub FDR queue receives one record with `kind="log"`, `level="WARN"`, `component` matching origin |
| AC-2 | Inside an `except` block, call `logger.exception("boom")` | Stub FDR queue's record carries non-empty `exc` traceback string |
| AC-3 | Emit INFO and DEBUG records | Stub FDR queue receives zero records |
| AC-4 | Pre-fill stub FDR queue to drop-oldest threshold, then emit an ERROR | Caller returns under 0.5 ms wall clock; FDR client's drop counter increments |
| AC-5 | Call `compose_root` twice with the same config in a single process | Logger has exactly one bridge Handler attached after the second call |

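
The first three rows of the table could be exercised along these lines. This is a sketch using `unittest` with hypothetical `StubFdrQueue` and `Bridge` classes standing in for the real FDR client and handler; the envelope field names follow the acceptance criteria, and the logger name is assumed to be the component slug.

```python
import logging
import unittest


class StubFdrQueue:
    """Hypothetical stub; records everything the bridge enqueues."""

    def __init__(self) -> None:
        self.records = []

    def put_nowait(self, record: dict) -> None:
        self.records.append(record)


class Bridge(logging.Handler):
    """Minimal placeholder for the handler under test."""

    def __init__(self, queue: StubFdrQueue) -> None:
        super().__init__(level=logging.WARNING)
        self._queue = queue
        self._fmt = logging.Formatter()

    def emit(self, record: logging.LogRecord) -> None:
        exc = self._fmt.formatException(record.exc_info) if record.exc_info else None
        level = "WARN" if record.levelno == logging.WARNING else record.levelname
        self._queue.put_nowait(
            {"kind": "log", "level": level, "component": record.name, "exc": exc}
        )


class BridgeTests(unittest.TestCase):
    def setUp(self) -> None:
        self.queue = StubFdrQueue()
        self.logger = logging.getLogger("c5_state")
        self.logger.setLevel(logging.DEBUG)
        self.logger.propagate = False
        handler = Bridge(self.queue)
        self.logger.addHandler(handler)
        # Remove the handler afterwards so tests do not stack bridges.
        self.addCleanup(self.logger.removeHandler, handler)

    def test_ac1_warn_is_forwarded(self) -> None:
        self.logger.warning("gps drift")
        self.assertEqual(
            self.queue.records,
            [{"kind": "log", "level": "WARN", "component": "c5_state", "exc": None}],
        )

    def test_ac2_exception_carries_traceback(self) -> None:
        try:
            raise ValueError("boom")
        except ValueError:
            self.logger.exception("boom")
        self.assertIn("ValueError: boom", self.queue.records[0]["exc"])

    def test_ac3_info_and_debug_are_dropped(self) -> None:
        self.logger.info("ignored")
        self.logger.debug("ignored")
        self.assertEqual(self.queue.records, [])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(BridgeTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The AC-4 and AC-5 rows need the real drop counter and composition root, so they are left out of this sketch.
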
## Constraints

- The bridge has a forward dependency on AZ-247 (FDR producer client + record schema). It cannot pass its own AC tests until AZ-247 is implemented; Step 4 cross-verification will record this temporal dependency in `_dependencies_table.md`.
- The bridge's record translation MUST consume only the public surface of `log_record_schema` v1.0.0, without reading formatter internals.

## Risks & Mitigation

**Risk 1: Recursion via internal `WARN` on enqueue failure**
- *Risk*: The "queue full" internal WARN itself goes through the bridge, recurses, and corrupts the queue further.
- *Mitigation*: Thread-local "in-bridge" flag short-circuits any logging call originating from the bridge itself; verified by a unit test that fills the queue and asserts no infinite loop.

**Risk 2: Forward dependency on AZ-247 contract not yet written**
- *Risk*: The FDR record schema is described in epic AZ-247's text but not yet a contract file; this task's expectations may drift from AZ-247's eventual contract.
- *Mitigation*: AZ-247's first PBI MUST publish `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md` before AZ-247's other PBIs; this task's implementation begins only after that contract exists. Step 4 cross-verification flags the temporal dependency.