[AZ-626] Decompose complete: 47 tasks + docs + module layout

Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy
Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller,
  misc/camera, python_scaffold, root Dockerfile, autopilot.pro,
  legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria,
  security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model,
  glossary, decision-rationale, deployment, 13 component descriptions,
  tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one
  bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for
  canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed,
  ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks
AZ-640..AZ-686. Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-19 11:02:01 +03:00
parent f7d6cb4a3a
commit bc40ea7300
235 changed files with 12585 additions and 15097 deletions
@@ -0,0 +1,74 @@
# Component — `frame_ingest`
**Layer**: Perception (data plane in)
**Status**: forward-looking design (Rust)
## 1. Purpose
Pull RTSP from the ViewPro A40 camera, decode H.264/265 to raw frames, attach a monotonic timestamp + sequence number, and hand each frame to the downstream consumers (`detection_client`, `movement_detector`, `telemetry_stream`) without copying frame buffers more than once.
Frames are the system's primary input. Everything downstream of `frame_ingest` is rate-limited by it.
## 2. Inputs
| Input | Source | Cadence | Notes |
|---|---|---|---|
| RTSP video stream | ViewPro A40 (via airframe IP/port) | 30 fps at 1080p (60 fps capable) | TCP or UDP transport per camera config. Re-opens on failure with bounded backoff. |
| Camera startup config | Static config (env or CLI) | once at process start | Stream URL, transport, decode codec preference. |
| `bringCameraDown` / `bringCameraUp` health signal | local supervisor (if present) | event | Optional. Used by deployments that gate AI access to the camera (e.g., during RC takeover). When `down` is asserted, `frame_ingest` continues decoding for `telemetry_stream` but flags frames as "AI-locked" so downstream consumers skip detection. |
## 3. Outputs
| Output | Consumer | Shape |
|---|---|---|
| `Frame` | `detection_client`, `movement_detector`, `telemetry_stream` | `{ seq: u64, capture_ts_monotonic: ns, decode_ts_monotonic: ns, pixels: Arc<Bytes>, width, height, pix_fmt, ai_locked: bool }` |
| Health metric | health aggregator | `frames/s`, `decode_ms_p50/p99`, `last_frame_age_ms`, `reopens_total`, `decode_errors_total` |
## 4. Key Responsibilities
- Open the RTSP session and recover from transient connection loss with bounded exponential backoff.
- Decode frames using a hardware decoder where available (NVDEC on Jetson) with software fallback.
- Stamp each frame with a monotonic capture timestamp at the earliest practical point in the pipeline; this is what `movement_detector` uses for telemetry-skew checks.
- Publish frames through a single multi-consumer channel (Tokio broadcast or equivalent) using `Arc<Bytes>` for pixel data so consumers do not copy.
- Drop frames if downstream consumers fall behind beyond a configured queue depth; record the drop with a reason ({{detection_client_slow, movement_detector_slow, telemetry_slow}}) and surface it through the health endpoint.
## 5. Internal State
- RTSP session handle and reconnect state (closed / connecting / streaming / failing).
- Last-frame timestamp and sequence number.
- Per-consumer drop counters.
State is in-process only; nothing persists across restarts.
## 6. Failure Modes
| Failure | Detection | Behaviour |
|---|---|---|
| RTSP connection refused / lost | TCP connect error / read timeout | Bounded exponential backoff (1 s → 30 s cap); health flips to yellow after first failure, red after `last_frame_age_ms` exceeds a configured threshold. |
| Decode error on a single frame | decoder returns error | Drop the frame; increment `decode_errors_total`; do not abort the stream. |
| Decoder cold-start latency | first-frame timestamp far from session-open | Surface `decode_ms_first_frame` once; not an alert by itself. |
| Downstream consumer slow | broadcast channel back-pressure | Drop the oldest frame for that consumer; counter-tagged drop; warning on sustained drops. |
| Camera output format mismatch | unexpected SPS/PPS | Hard-fail at session open with an explicit error; do not silently pick a wrong decode path. |
## 7. Dependencies
**In-process**: none upstream; downstream consumers are `detection_client`, `movement_detector`, `telemetry_stream`.
**External**:
- ViewPro A40 RTSP (live).
- Hardware video decoder (NVDEC on Jetson) via FFmpeg / GStreamer or a Rust binding.
## 8. Non-Functional Targets
| Concern | Target |
|---|---|
| End-to-end frame latency (RTSP rx → publish to consumers) | ≤30 ms p99 on Jetson Orin Nano. |
| Frame drop rate | ≤0.1 % under normal conditions. |
| Reconnect latency after camera reboot | ≤5 s from camera availability. |
| Memory | one decoded-frame buffer pool with bounded size; no unbounded growth on slow consumers. |
## 9. References
- `architecture.md §1 System Context`, `§3 Components`, `§7.6 Solution Architecture`.
- `system-flows.md §F1 Frame pipeline`.
- `data_model.md §Frame`.