Oleksandr Bezdieniezhnykh bc40ea7300 [AZ-626] Decompose complete: 47 tasks + docs + module layout
Greenfield Steps 1-6 baseline for the autopilot rewrite from legacy
Qt/C++ to a Rust workspace.

- Remove legacy Qt/C++ tree (ai_controller, drone_controller,
  misc/camera, python_scaffold, root Dockerfile, autopilot.pro,
  legacy main.py / requirements.txt).
- Add _docs/00_problem (problem, restrictions, acceptance criteria,
  security approach, input data + fixtures).
- Add _docs/01_solution/solution_draft01.
- Add _docs/02_document (architecture, system-flows, data_model,
  glossary, decision-rationale, deployment, 13 component descriptions,
  tests/ specs, FINAL_report, module-layout).
- Add _docs/02_tasks/todo with 47 task specs (AZ-640..AZ-686, one
  bootstrap + 46 component tasks) and _dependencies_table.md.
- Add .cursor/rules/artifact-srp.mdc (single-responsibility rule for
  canonical _docs artifacts).
- Track autodev state in _docs/_autodev_state.md (Step 6 completed,
  ready for Step 7 Implement).

Jira: bootstrap AZ-626; component epics AZ-627..AZ-639; tasks
AZ-640..AZ-686. Total complexity 173 points across 12 epics.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-19 11:02:01 +03:00


Component — frame_ingest

Layer: Perception (data plane in)
Status: forward-looking design (Rust)

1. Purpose

Pull the RTSP stream from the ViewPro A40 camera, decode H.264/H.265 to raw frames, attach a monotonic timestamp and a sequence number, and hand each frame to the downstream consumers (detection_client, movement_detector, telemetry_stream) without copying frame buffers more than once.

Frames are the system's primary input. Everything downstream of frame_ingest is rate-limited by it.

2. Inputs

| Input | Source | Cadence | Notes |
|---|---|---|---|
| RTSP video stream | ViewPro A40 (via airframe IP/port) | 30 fps at 1080p (60 fps capable) | TCP or UDP transport per camera config. Re-opens on failure with bounded backoff. |
| Camera startup config | Static config (env or CLI) | once at process start | Stream URL, transport, decode codec preference. |
| bringCameraDown / bringCameraUp health signal | local supervisor (if present) | event | Optional. Used by deployments that gate AI access to the camera (e.g., during RC takeover). When down is asserted, frame_ingest continues decoding for telemetry_stream but flags frames as "AI-locked" so downstream consumers skip detection. |
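
The startup config row above could be loaded roughly as follows. This is a minimal sketch, not the project's actual loader: the key names (`FRAME_INGEST_RTSP_URL`, `FRAME_INGEST_TRANSPORT`, `FRAME_INGEST_CODEC`), the type names, and the TCP / H.264 defaults are all assumptions made for illustration. The lookup closure stands in for the env or CLI source so the parsing can be exercised without touching the process environment.

```rust
use std::str::FromStr;

/// RTSP transport; the design only states "TCP or UDP".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transport {
    Tcp,
    Udp,
}

impl FromStr for Transport {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "tcp" => Ok(Transport::Tcp),
            "udp" => Ok(Transport::Udp),
            other => Err(format!("unknown transport: {}", other)),
        }
    }
}

/// Decode codec preference (H.264 or H.265, per the Purpose section).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    H264,
    H265,
}

impl FromStr for Codec {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "h264" => Ok(Codec::H264),
            "h265" => Ok(Codec::H265),
            other => Err(format!("unknown codec: {}", other)),
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CameraConfig {
    pub stream_url: String,
    pub transport: Transport,
    pub codec_preference: Codec,
}

impl CameraConfig {
    /// `lookup` abstracts the config source, e.g. `|k| std::env::var(k).ok()`.
    /// Key names and defaults are hypothetical.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, String> {
        let stream_url = lookup("FRAME_INGEST_RTSP_URL")
            .ok_or_else(|| "FRAME_INGEST_RTSP_URL is required".to_string())?;
        if !stream_url.starts_with("rtsp://") {
            return Err(format!("not an rtsp:// URL: {}", stream_url));
        }
        let transport = match lookup("FRAME_INGEST_TRANSPORT") {
            Some(v) => v.parse::<Transport>()?,
            None => Transport::Tcp, // assumed default
        };
        let codec_preference = match lookup("FRAME_INGEST_CODEC") {
            Some(v) => v.parse::<Codec>()?,
            None => Codec::H264, // assumed default
        };
        Ok(CameraConfig { stream_url, transport, codec_preference })
    }
}
```

Failing fast on a missing or malformed URL at process start keeps config errors out of the reconnect loop, where they would otherwise look like camera outages.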

3. Outputs

| Output | Consumer | Shape |
|---|---|---|
| Frame | detection_client, movement_detector, telemetry_stream | { seq: u64, capture_ts_monotonic: ns, decode_ts_monotonic: ns, pixels: Arc<Bytes>, width, height, pix_fmt, ai_locked: bool } |
| Health metric | health aggregator | frames/s, decode_ms_p50/p99, last_frame_age_ms, reopens_total, decode_errors_total |
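
The Frame shape above might translate into a Rust type along these lines. This is a sketch under stated assumptions: `Arc<[u8]>` stands in for the `Arc<Bytes>` of the design so the example needs only the standard library, the `PixFmt` variants and the `FrameStamper` helper are hypothetical, and nanosecond timestamps are measured from a process-local `Instant` epoch.

```rust
use std::sync::Arc;
use std::time::Instant;

/// Pixel formats are not enumerated in the design; these two are placeholders.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PixFmt {
    Nv12,
    Rgb24,
}

#[derive(Debug, Clone)]
pub struct Frame {
    pub seq: u64,
    pub capture_ts_monotonic_ns: u64,
    pub decode_ts_monotonic_ns: u64,
    /// Shared pixel buffer; cloning a Frame clones the Arc, not the bytes.
    pub pixels: Arc<[u8]>,
    pub width: u32,
    pub height: u32,
    pub pix_fmt: PixFmt,
    pub ai_locked: bool,
}

/// Hypothetical helper that owns the monotonic epoch and the sequence counter.
pub struct FrameStamper {
    epoch: Instant,
    next_seq: u64,
}

impl FrameStamper {
    pub fn new() -> Self {
        FrameStamper { epoch: Instant::now(), next_seq: 0 }
    }

    /// Nanoseconds since the stamper was created, from a monotonic clock.
    pub fn now_ns(&self) -> u64 {
        self.epoch.elapsed().as_nanos() as u64
    }

    /// Builds a Frame after decode; `capture_ts_ns` was taken earlier via `now_ns`.
    pub fn stamp(
        &mut self,
        capture_ts_ns: u64,
        pixels: Vec<u8>,
        width: u32,
        height: u32,
        pix_fmt: PixFmt,
        ai_locked: bool,
    ) -> Frame {
        let seq = self.next_seq;
        self.next_seq += 1;
        Frame {
            seq,
            capture_ts_monotonic_ns: capture_ts_ns,
            decode_ts_monotonic_ns: self.now_ns(),
            pixels: Arc::from(pixels), // moves the decoded bytes into shared storage
            width,
            height,
            pix_fmt,
            ai_locked,
        }
    }
}
```

Because `Frame` derives `Clone` and holds its pixels behind an `Arc`, handing the same frame to three consumers costs three reference-count increments and no additional pixel copies.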

4. Key Responsibilities

  • Open the RTSP session and recover from transient connection loss with bounded exponential backoff.
  • Decode frames using a hardware decoder where available (NVDEC on Jetson) with software fallback.
  • Stamp each frame with a monotonic capture timestamp at the earliest practical point in the pipeline; this is what movement_detector uses for telemetry-skew checks.
  • Publish frames through a single multi-consumer channel (Tokio broadcast or equivalent) using Arc<Bytes> for pixel data so consumers do not copy.
  • Drop frames if downstream consumers fall behind beyond a configured queue depth; record the drop with a reason (one of detection_client_slow, movement_detector_slow, telemetry_slow) and surface it through the health endpoint.
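
The publish-and-drop behaviour in the last two bullets can be illustrated with a small standard-library model. It is not the Tokio broadcast channel the design mentions: `ConsumerQueue`, `DropReason`, and `publish` are hypothetical names, each consumer gets its own bounded queue, and when a queue is full the oldest frame is evicted and the drop is counted under that consumer's reason. `Arc<[u8]>` again stands in for `Arc<Bytes>`.

```rust
use std::collections::VecDeque;
use std::sync::Arc;

/// Drop reasons named in the responsibilities list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DropReason {
    DetectionClientSlow,
    MovementDetectorSlow,
    TelemetrySlow,
}

/// Bounded per-consumer queue of (sequence number, shared pixels).
pub struct ConsumerQueue {
    reason: DropReason,
    depth: usize,
    frames: VecDeque<(u64, Arc<[u8]>)>,
    drops_total: u64,
}

impl ConsumerQueue {
    pub fn new(reason: DropReason, depth: usize) -> Self {
        assert!(depth > 0, "queue depth must be at least 1");
        ConsumerQueue { reason, depth, frames: VecDeque::with_capacity(depth), drops_total: 0 }
    }

    /// Enqueues a frame; if the queue is full, evicts the oldest frame first
    /// and returns the reason so the caller can report it.
    pub fn push(&mut self, seq: u64, pixels: Arc<[u8]>) -> Option<DropReason> {
        let mut dropped = None;
        if self.frames.len() == self.depth {
            self.frames.pop_front();
            self.drops_total += 1;
            dropped = Some(self.reason);
        }
        self.frames.push_back((seq, pixels));
        dropped
    }

    pub fn pop(&mut self) -> Option<(u64, Arc<[u8]>)> {
        self.frames.pop_front()
    }

    pub fn drops_total(&self) -> u64 {
        self.drops_total
    }
}

/// Fans one frame out to every consumer; only the Arc is cloned, never the pixels.
pub fn publish(queues: &mut [ConsumerQueue], seq: u64, pixels: Arc<[u8]>) -> Vec<DropReason> {
    queues
        .iter_mut()
        .filter_map(|q| q.push(seq, Arc::clone(&pixels)))
        .collect()
}
```

A slow consumer in this model only ever loses its own oldest frames; the other queues and the decoder are unaffected, which is the isolation property the bullets describe.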

5. Internal State

  • RTSP session handle and reconnect state (closed / connecting / streaming / failing).
  • Last-frame timestamp and sequence number.
  • Per-consumer drop counters.

State is in-process only; nothing persists across restarts.
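
The four reconnect states listed above can be modelled as a small transition function. The states come from the list; the event names and the specific transitions are assumptions chosen for illustration, since the design does not enumerate them.

```rust
/// Reconnect states from the Internal State list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Closed,
    Connecting,
    Streaming,
    Failing,
}

/// Hypothetical events that drive the state machine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    OpenRequested,
    Connected,
    ConnectionLost,
    RetryTimerFired,
    CloseRequested,
}

/// Pure transition function; unknown (state, event) pairs leave the state unchanged.
pub fn next_state(state: SessionState, event: SessionEvent) -> SessionState {
    use SessionEvent::*;
    use SessionState::*;
    match (state, event) {
        // A close request wins from any state.
        (_, CloseRequested) => Closed,
        (Closed, OpenRequested) => Connecting,
        (Connecting, Connected) => Streaming,
        // Losing the connection while connecting or streaming enters backoff.
        (Connecting, ConnectionLost) | (Streaming, ConnectionLost) => Failing,
        // The backoff timer expiring triggers another connection attempt.
        (Failing, RetryTimerFired) => Connecting,
        (s, _) => s,
    }
}
```

Keeping the transitions in a pure function makes the reconnect logic testable without a camera or a network socket.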

6. Failure Modes

| Failure | Detection | Behaviour |
|---|---|---|
| RTSP connection refused / lost | TCP connect error / read timeout | Bounded exponential backoff (1 s → 30 s cap); health flips to yellow after first failure, red after last_frame_age_ms exceeds a configured threshold. |
| Decode error on a single frame | decoder returns error | Drop the frame; increment decode_errors_total; do not abort the stream. |
| Decoder cold-start latency | first-frame timestamp far from session-open | Surface decode_ms_first_frame once; not an alert by itself. |
| Downstream consumer slow | broadcast channel back-pressure | Drop the oldest frame for that consumer; counter-tagged drop; warning on sustained drops. |
| Camera output format mismatch | unexpected SPS/PPS | Hard-fail at session open with an explicit error; do not silently pick a wrong decode path. |
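
The backoff and health rules in the first row can be sketched as two pure functions. The 1 s base and 30 s cap come from the table; doubling per attempt, the absence of jitter, and the names `backoff_delay`, `Health`, and `health` are assumptions for illustration. The red threshold is passed in because the design leaves it configurable.

```rust
use std::time::Duration;

pub const BACKOFF_BASE: Duration = Duration::from_secs(1);
pub const BACKOFF_CAP: Duration = Duration::from_secs(30);

/// Delay before reconnect attempt number `attempt` (0-based):
/// base * 2^attempt, clamped to the cap. Doubling is an assumed growth factor.
pub fn backoff_delay(attempt: u32) -> Duration {
    // Saturate the multiplier instead of overflowing for large attempt counts.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    BACKOFF_BASE
        .checked_mul(factor)
        .map_or(BACKOFF_CAP, |d| d.min(BACKOFF_CAP))
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Green,
    Yellow,
    Red,
}

/// Yellow after the first failure; red once the last frame is older than the threshold.
pub fn health(consecutive_failures: u32, last_frame_age: Duration, red_threshold: Duration) -> Health {
    if last_frame_age > red_threshold {
        Health::Red
    } else if consecutive_failures > 0 {
        Health::Yellow
    } else {
        Health::Green
    }
}
```

With these assumptions the delays run 1 s, 2 s, 4 s, 8 s, 16 s and then stay at 30 s, so a camera reboot is retried quickly at first without hammering the link during a long outage.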

7. Dependencies

In-process: none upstream; downstream consumers are detection_client, movement_detector, telemetry_stream.

External:

  • ViewPro A40 RTSP (live).
  • Hardware video decoder (NVDEC on Jetson) via FFmpeg / GStreamer or a Rust binding.

8. Non-Functional Targets

| Concern | Target |
|---|---|
| End-to-end frame latency (RTSP rx → publish to consumers) | ≤30 ms p99 on Jetson Orin Nano. |
| Frame drop rate | ≤0.1 % under normal conditions. |
| Reconnect latency after camera reboot | ≤5 s from camera availability. |
| Memory | one decoded-frame buffer pool with bounded size; no unbounded growth on slow consumers. |
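
One way to check the latency and drop-rate targets against recorded samples is sketched below. The nearest-rank percentile method, the microsecond units, and the function names are assumptions; the design states the targets but not how they are measured.

```rust
/// Nearest-rank percentile over latency samples in microseconds.
/// Returns None for an empty slice or a percentile outside 1..=100.
pub fn percentile_nearest_rank(samples_us: &[u64], p: u32) -> Option<u64> {
    if samples_us.is_empty() || p == 0 || p > 100 {
        return None;
    }
    let mut sorted = samples_us.to_vec();
    sorted.sort_unstable();
    // rank = ceil(p * n / 100), computed in integers to avoid float rounding.
    let rank = (p as usize * sorted.len() + 99) / 100;
    Some(sorted[rank - 1])
}

/// True when p99 latency is within the 30 ms target from the table.
pub fn meets_latency_target(samples_us: &[u64]) -> Option<bool> {
    percentile_nearest_rank(samples_us, 99).map(|p99| p99 <= 30_000)
}

/// True when dropped / total is at most 0.1 %, i.e. dropped * 1000 <= total.
pub fn meets_drop_rate_target(dropped: u64, total: u64) -> bool {
    dropped.saturating_mul(1000) <= total
}
```

For example, 100 samples of which 99 are 10 ms and one is 40 ms still meet the p99 target under nearest rank, because the 99th-ranked sample is 10 ms; the single outlier only shows up at p100.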

9. References

  • architecture.md §1 System Context, §3 Components, §7.6 Solution Architecture.
  • system-flows.md §F1 Frame pipeline.
  • data_model.md §Frame.