autopilot/_docs/02_tasks/done/AZ-658_frame_ingest_decoder.md
Oleksandr Bezdieniezhnykh 251ebed1c2 [AZ-658] frame_ingest H.264/265 decoder (NVDEC + sw fallback)
Wires a real ffmpeg-next 8.1 decoder into the frame_ingest lifecycle
loop. NVDEC is probed at runtime via h264_cuvid / hevc_cuvid; CUDA-less
hosts transparently fall back to software h264 / hevc. Each decoded
frame is stamped with capture_ts (taken at packet receipt) and
decode_ts (taken after decode returns) so movement_detector sees
accurate frame-arrival times. Single-frame decode errors are counted
toward decode_errors_total and dropped; the stream is never aborted.

Adds new public API on FrameIngestHandle: decoder_backend(),
decode_errors_total(), frames_decoded_total(), decode_ms_first_frame(),
decode_ms_p50(), decode_ms_p99(). Integration tests under
crates/frame_ingest/tests/decoder_pipeline.rs cover AC-1, AC-3, AC-4
end-to-end through the real FfmpegDecoder using libx264-encoded
synthetic streams; AC-2 positive (NVDEC selection) is opt-in via
--ignored on a CUDA host. AZ-657 lifecycle tests retained via a
StubDecoder.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-20 17:05:27 +03:00


Frame Decoder (NVDEC + Software Fallback)

Task: AZ-658_frame_ingest_decoder
Name: H.264/265 decoder (NVDEC primary, software fallback) + monotonic timestamps
Description: Decode H.264/265 to raw frames using NVDEC on Jetson Orin Nano, with software fallback. Stamp each frame with a monotonic capture timestamp + sequence number at the earliest practical point in the pipeline.
Complexity: 5 points
Dependencies: AZ-640_initial_structure, AZ-657_frame_ingest_rtsp_session
Component: frame_ingest
Tracker: AZ-658
Epic: AZ-627

Problem

Every frame downstream needs a monotonic capture timestamp so movement_detector can detect telemetry skew. Decoding must use the hardware decoder (NVDEC on Jetson) where present and fall back to software otherwise, without changing the emitted Frame shape. A decode error on a single frame must cause that frame to be dropped and counted; it must not abort the stream. Cold-start latency is reported once per session and is not an alert by itself.

Outcome

  • FrameDecoder::decode(packet) -> Result<Frame, DecodeError> emits a Frame { seq, capture_ts_monotonic, decode_ts_monotonic, pixels: Arc<Bytes>, width, height, pix_fmt, ai_locked }.
  • NVDEC code path is used when available; software fallback otherwise (selection is automatic and observable in health).
  • Single-frame errors are dropped and counted as decode_errors_total; the stream is never aborted on a single frame.
  • Cold-start latency (first-frame decode time) is surfaced as decode_ms_first_frame once per session open.
  • Health surface: decode_ms_p50, decode_ms_p99, decoder_backend ∈ {NVDEC, Software}, decode_errors_total.
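The Frame shape and decode signature listed above can be sketched with the standard library alone. This is an illustration under assumptions, not the crate's actual definitions: `Arc<Vec<u8>>` stands in for `Arc<Bytes>`, timestamps are modeled as `Duration` offsets from a per-session `Instant`, the packet is assumed to be a byte slice, and `PixFmt`, `DecodeError::Corrupt`, and `ToyDecoder` are hypothetical names.

```rust
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Illustrative subset of pixel formats (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PixFmt {
    Nv12,
    Yuv420p,
}

/// Decoded frame with the fields named in the Outcome list.
#[derive(Debug, Clone)]
pub struct Frame {
    pub seq: u64,
    /// Offset from session open, taken at packet receipt (assumed representation).
    pub capture_ts_monotonic: Duration,
    /// Offset from session open, taken after decode returns.
    pub decode_ts_monotonic: Duration,
    /// Stand-in for Arc<Bytes>; shared so consumers do not copy pixels.
    pub pixels: Arc<Vec<u8>>,
    pub width: u32,
    pub height: u32,
    pub pix_fmt: PixFmt,
    /// Semantics are not defined in this task; the toy decoder leaves it false.
    pub ai_locked: bool,
}

/// Hypothetical error type with a single variant.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    Corrupt,
}

pub trait FrameDecoder {
    fn decode(&mut self, packet: &[u8]) -> Result<Frame, DecodeError>;
}

/// Toy decoder used only to exercise the types: it copies the packet bytes
/// into `pixels` as a single row and rejects empty packets.
pub struct ToyDecoder {
    opened: Instant,
    next_seq: u64,
}

impl ToyDecoder {
    pub fn new() -> Self {
        Self { opened: Instant::now(), next_seq: 0 }
    }
}

impl FrameDecoder for ToyDecoder {
    fn decode(&mut self, packet: &[u8]) -> Result<Frame, DecodeError> {
        // Capture timestamp is read first, before any decode work.
        let capture_ts = self.opened.elapsed();
        if packet.is_empty() {
            return Err(DecodeError::Corrupt);
        }
        let pixels = Arc::new(packet.to_vec());
        let seq = self.next_seq;
        self.next_seq += 1;
        Ok(Frame {
            seq,
            capture_ts_monotonic: capture_ts,
            decode_ts_monotonic: self.opened.elapsed(),
            width: pixels.len() as u32,
            height: 1,
            pixels,
            pix_fmt: PixFmt::Nv12,
            ai_locked: false,
        })
    }
}
```

Because both backends would implement the same trait and return the same struct, a consumer cannot tell which one produced a frame, which is the "no change to the emitted Frame shape" requirement.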

Scope

Included

  • NVDEC binding (via Jetson Multimedia API or GStreamer nvv4l2decoder).
  • Software decoder fallback (FFmpeg libavcodec).
  • Monotonic timestamping at the earliest point in the decode pipeline.
  • Sequence-number generation (monotonic u64 per session).
  • Single-frame error handling.
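A minimal sketch of the timestamping and sequence-number items, assuming timestamps are offsets from a per-session `Instant`. `SessionStamper` is a hypothetical name. One detail worth noting: `Instant` never goes backwards, but two consecutive reads may be equal, so the sketch bumps a repeated reading by 1 ns to keep the series strictly increasing. That bump is a design choice of this example, not something the task prescribes.

```rust
use std::time::{Duration, Instant};

/// Per-session source of sequence numbers and capture timestamps (hypothetical type).
pub struct SessionStamper {
    opened: Instant,
    next_seq: u64,
    last_ts: Option<Duration>,
}

impl SessionStamper {
    pub fn new() -> Self {
        Self { opened: Instant::now(), next_seq: 0, last_ts: None }
    }

    /// Call at packet receipt, before any decode work.
    /// Returns (seq, capture timestamp as offset from session open).
    pub fn stamp(&mut self) -> (u64, Duration) {
        let mut ts = self.opened.elapsed();
        // Equal readings are possible on coarse clocks; force strict ordering.
        if let Some(prev) = self.last_ts {
            if ts <= prev {
                ts = prev + Duration::from_nanos(1);
            }
        }
        self.last_ts = Some(ts);
        let seq = self.next_seq;
        self.next_seq += 1;
        (seq, ts)
    }
}
```

Creating one stamper per session open restarts the sequence at 0, which matches "monotonic u64 per session".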

Excluded

  • RTSP session lifecycle (task 18).
  • Multi-consumer publisher (task 20).

Acceptance Criteria

AC-1: Software-path decode of a sample stream
Given a sample H.264 RTSP stream at 1080p / 30 fps and a host without NVDEC
When the decoder runs for 10 s
Then ≥285 frames are emitted; decoder_backend = "Software"; sequence numbers are strictly monotonic.
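The frame threshold in AC-1 can be checked with simple arithmetic: 30 fps for 10 s is 300 nominal frames, and 285 is 95% of that, leaving room for 15 missing frames. Reading the threshold as a 95% floor is an inference from the numbers, not something the task states. The helper names below are hypothetical.

```rust
/// Nominal number of frames for a stream at `fps` running for `seconds`.
pub fn nominal_frames(fps: u32, seconds: u32) -> u32 {
    fps * seconds
}

/// Minimum frames required when at least `min_percent` of the nominal
/// count must be emitted (rounded up).
pub fn min_frames(fps: u32, seconds: u32, min_percent: u32) -> u32 {
    (nominal_frames(fps, seconds) * min_percent + 99) / 100
}
```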

AC-2: NVDEC-path selection on Jetson
Given the host has NVDEC available
When the decoder is initialized
Then decoder_backend = "NVDEC"; functional correctness is identical to the software path.
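Backend selection can be sketched as "try the hardware decoder name first, then the software one". The decoder names h264_cuvid / hevc_cuvid and h264 / hevc come from the commit message; everything else here is an assumption. In particular, `is_available` is a hypothetical probe supplied by the caller, and a real probe may also need to confirm that the decoder actually opens on the host.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    H264,
    Hevc,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DecoderBackend {
    Nvdec,
    Software,
}

impl DecoderBackend {
    /// Label as it appears in the health surface.
    pub fn as_str(self) -> &'static str {
        match self {
            DecoderBackend::Nvdec => "NVDEC",
            DecoderBackend::Software => "Software",
        }
    }
}

/// Picks the hardware decoder when the probe reports it, otherwise the
/// software decoder. Returns None if neither is available.
pub fn select_decoder<F>(codec: Codec, is_available: F) -> Option<(DecoderBackend, &'static str)>
where
    F: Fn(&str) -> bool,
{
    let (hw, sw) = match codec {
        Codec::H264 => ("h264_cuvid", "h264"),
        Codec::Hevc => ("hevc_cuvid", "hevc"),
    };
    if is_available(hw) {
        Some((DecoderBackend::Nvdec, hw))
    } else if is_available(sw) {
        Some((DecoderBackend::Software, sw))
    } else {
        None
    }
}
```

Passing the probe as a closure keeps the selection logic testable on hosts without CUDA, since a test can simply claim which names exist.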

AC-3: Single-frame decode error does not abort the stream
Given the input contains one corrupted frame
When the decoder runs
Then that single frame is dropped, decode_errors_total increments by 1, and subsequent frames continue to be emitted.
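The drop-and-count behavior in AC-3 amounts to matching on each decode result and never returning early on an error. The sketch below is generic over the packet and frame types so it runs without a real decoder. `pump`, `DecodeStats`, and the unit-like `DecodeError` are hypothetical names; the two counter names are taken from the task.

```rust
/// Placeholder error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct DecodeError;

/// Counters mirroring the health surface names.
#[derive(Debug, Default)]
pub struct DecodeStats {
    pub frames_decoded_total: u64,
    pub decode_errors_total: u64,
}

/// Decodes every packet; failed packets are counted and skipped,
/// and the loop always continues to the next packet.
pub fn pump<P, F, I, D>(packets: I, mut decode: D, stats: &mut DecodeStats) -> Vec<F>
where
    I: IntoIterator<Item = P>,
    D: FnMut(P) -> Result<F, DecodeError>,
{
    let mut frames = Vec::new();
    for packet in packets {
        match decode(packet) {
            Ok(frame) => {
                stats.frames_decoded_total += 1;
                frames.push(frame);
            }
            Err(DecodeError) => {
                // Drop this frame only; do not abort the stream.
                stats.decode_errors_total += 1;
            }
        }
    }
    frames
}
```

With four packets of which the third is corrupt, the function returns three frames and leaves decode_errors_total at 1.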

AC-4: Monotonic timestamps
Given a sequence of decoded frames
When their capture_ts_monotonic is read
Then values are strictly monotonically increasing.
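A test for AC-4 needs a check that rejects equal neighbors as well as decreasing ones. The helper below is a sketch with a hypothetical name; it works on any ordered type, so the same check could be applied to sequence numbers.

```rust
/// Returns the index of the first element that is not strictly greater
/// than its predecessor, or None if the slice is strictly increasing.
pub fn first_non_increasing<T: PartialOrd>(values: &[T]) -> Option<usize> {
    values
        .windows(2)
        .position(|pair| pair[1] <= pair[0])
        .map(|i| i + 1)
}
```

Returning the offending index rather than a bare boolean makes a failing test message point at the frame that broke the ordering.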

Non-Functional Requirements

Performance

  • End-to-end RTSP-rx → publish ≤30 ms p99 on Jetson Orin Nano (per description.md §8); decoder portion of that budget ≤20 ms p99.
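One way to evaluate the decoder budget is a nearest-rank percentile over recorded per-frame decode times. This is a sketch under assumptions: the task does not say how decode_ms_p50 and decode_ms_p99 are computed, and a production build might use a streaming histogram instead of sorting samples. The function and constant names are hypothetical; the 20 ms figure is the decoder share of the budget stated above.

```rust
/// Decoder share of the end-to-end latency budget, in milliseconds.
pub const DECODER_BUDGET_P99_MS: f64 = 20.0;

/// Nearest-rank percentile of `samples_ms` for an integer percentile 0..=100.
/// Returns None for an empty slice or an out-of-range percentile.
pub fn percentile_ms(samples_ms: &[f64], pct: usize) -> Option<f64> {
    if samples_ms.is_empty() || pct > 100 {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // rank = ceil(pct / 100 * n), clamped to at least 1 (1-based rank).
    let rank = ((pct * sorted.len() + 99) / 100).max(1);
    Some(sorted[rank - 1])
}

/// True when the p99 decode time is within the decoder budget.
pub fn within_decoder_budget(samples_ms: &[f64]) -> bool {
    matches!(percentile_ms(samples_ms, 99), Some(p99) if p99 <= DECODER_BUDGET_P99_MS)
}
```

Under nearest-rank, a single slow outlier in 200 samples does not move p99, while a window where more than 1% of frames exceed 20 ms does.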

Reliability

  • Single-frame errors do not abort the stream.
  • Cold-start latency surfaced once; not an alert.

Runtime Completeness

  • Named capability: H.264/265 decode (NVDEC primary, software fallback) — production decode path required.
  • Production code that must exist: real NVDEC binding; real software fallback; real monotonic timestamping.
  • Unacceptable substitutes: a software-only implementation. Software decode on Jetson is acceptable as a runtime fallback, but the NVDEC code path MUST exist (otherwise the latency target cannot be met).