mirror of
https://github.com/azaion/autopilot.git
synced 2026-06-21 08:31:10 +00:00
[AZ-659] [AZ-660] [AZ-661] Implement frame publisher + gRPC detection client
AZ-659: FramePublisher with per-consumer drop accounting (Arc<Bytes> zero-copy fan-out). Adds ConsumerId enum, PublisherStats, FrameReceiver wrapper, and publisher integration tests (AC-1, AC-2, AC-3). AZ-660: Bi-directional tonic gRPC stream to ../detections. Reconnect with bounded exponential backoff (1 s → 30 s cap). Drop-oldest in-flight budgeting (max_concurrent_in_flight = 2). ai_locked frame skipping. Integration tests against fixture in-process server (AC-1: happy path 30 fps/10 s, AC-2: reconnect, AC-3: budget drops, AC-4: ai_locked skipping). AZ-661: Schema validation (hard SchemaMismatch error on version mismatch), model_version latch with ModelVersionChanged events, sliding-window p99 latency tracker with Tier1Degraded/Tier1Recovered transitions. Integration tests (AC-1, AC-2, AC-3). Also: update module-layout.md for frame_ingest and detection_client to reflect the streaming API shape; code review report batch_18. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -75,8 +75,14 @@
|
||||
- **Epic**: AZ-627
|
||||
- **Directory**: `crates/frame_ingest/`
|
||||
- **Public API**:
|
||||
- `crates/frame_ingest/src/lib.rs` (`FrameIngest`, `FrameIngestHandle::subscribe() -> Receiver<Frame>`, `health()`)
|
||||
- `crates/frame_ingest/src/lib.rs` (`FrameIngest`, `FrameIngestHandle`, `ConsumerId`)
|
||||
- `FrameIngestHandle::subscribe() -> Receiver<Frame>` — raw broadcast receiver (no per-consumer accounting)
|
||||
- `FrameIngestHandle::subscribe_as(ConsumerId) -> FrameReceiver` — receiver with per-consumer lag accounting
|
||||
- `FrameIngestHandle::publisher() -> Arc<FramePublisher>` — direct publisher handle for the composition root
|
||||
- `FrameIngestHandle::dropped_frames(ConsumerId) -> u64`, `publishes_total() -> u64`
|
||||
- `FrameIngestHandle::health() -> ComponentHealth`
|
||||
- **Internal**:
|
||||
- `crates/frame_ingest/src/internal/publisher.rs` (`FramePublisher`, `FrameReceiver`, `PublisherStats`)
|
||||
- `crates/frame_ingest/src/internal/rtsp_client.rs`
|
||||
- `crates/frame_ingest/src/internal/decoder.rs`
|
||||
- `crates/frame_ingest/src/internal/timestamp.rs`
|
||||
@@ -91,14 +97,22 @@
|
||||
- **Epic**: AZ-628
|
||||
- **Directory**: `crates/detection_client/`
|
||||
- **Public API**:
|
||||
- `crates/detection_client/src/lib.rs` (`DetectionClient`, `DetectionClientHandle::request(Frame) -> Result<DetectionBatch>`, `health()`)
|
||||
- `crates/detection_client/src/lib.rs` (`DetectionClient`, `DetectionClientConfig`, `DetectionClientHandle`, `DetectionEvent`, `ConnectionState`, `Tier1DegradationReason`)
|
||||
- `DetectionClient::run(frame_rx: Receiver<Frame>) -> (JoinHandle, DetectionClientHandle)` — spawns the gRPC supervisor task
|
||||
- `DetectionClientHandle::subscribe_events() -> Receiver<DetectionEvent>` — broadcast stream of batches, schema errors, model-version changes, Tier-1 degradation transitions
|
||||
- `DetectionClientHandle::health() -> ComponentHealth`
|
||||
- `DetectionClientHandle::stats() -> Arc<DetectionStats>`, `latency_p50/p99()`, `connection_state()`, `shutdown()`
|
||||
- **Internal**:
|
||||
- `crates/detection_client/build.rs` (`tonic-build` for the gRPC proto)
|
||||
- `crates/detection_client/proto/detections.proto` (vendored copy of `../detections` contract per `architecture.md §10`)
|
||||
- `crates/detection_client/src/internal/grpc/*` (bi-directional streaming client, version handshake)
|
||||
- `crates/detection_client/src/internal/runtime.rs` (supervisor + bi-directional stream session)
|
||||
- `crates/detection_client/src/internal/budget.rs` (drop-oldest in-flight tracker)
|
||||
- `crates/detection_client/src/internal/latency.rs` (sliding-window p99 + degradation latch)
|
||||
- `crates/detection_client/src/internal/stats.rs` (lock-free atomic counters)
|
||||
- `crates/detection_client/src/internal/proto.rs` (generated tonic/prost types)
|
||||
- **Owns**: `crates/detection_client/**`
|
||||
- **Imports from**: `shared`
|
||||
- **Consumed by**: `scan_controller` (handle for direct request), `telemetry_stream` (via constructor-injected `Receiver<DetectionBatch>` for operator overlay)
|
||||
- **Consumed by**: `scan_controller` (subscribes to events), `telemetry_stream` (via composition-root-wired `Receiver<DetectionBatch>` for operator overlay)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,85 @@
|
||||
# Code Review Report
|
||||
|
||||
**Batch**: 18 — AZ-659, AZ-660, AZ-661
|
||||
**Date**: 2026-05-20
|
||||
**Verdict**: PASS_WITH_WARNINGS
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | Medium | Maintainability | `runtime.rs:392-411` | Dead code: unused `Instant::now()` + no-op `let _ = in_flight` |
|
||||
| 2 | Low | Architecture | `lib.rs (detection_client)` | `pub mod internal` exposes generated proto server types to external crates |
|
||||
| 3 | Low | Maintainability | `stats.rs:66` | `note_orphan_response` increments `stream_errors_total` — imprecise bucket |
|
||||
| 4 | Low | Performance | `runtime.rs:build_request` | `frame.pixels.to_vec()` copies the full pixel buffer for each gRPC encode |
|
||||
|
||||
### Finding Details
|
||||
|
||||
**F1: Dead code in `handle_response`** (Medium / Maintainability) — **FIXED**
|
||||
- Location: `crates/detection_client/src/internal/runtime.rs`
|
||||
- Description: `let now = Instant::now()` was captured but never used; `let _ = in_flight` was a no-op for a `Copy` type, suggesting incomplete RTT tracking that was never wired up.
|
||||
- Fix applied: removed both dead statements; replaced multi-paragraph placeholder comment with a concise doc note.
|
||||
|
||||
**F2: `pub mod internal` exposes server proto types** (Low / Architecture)
|
||||
- Location: `crates/detection_client/src/lib.rs:40`
|
||||
- Description: `pub mod internal` is required for integration tests in `tests/stream.rs` that need `detection_service_server` types to spin up the fixture gRPC server. The side-effect is that `detection_client::internal::*` is also visible to external crates, which contradicts module-layout rule #3.
|
||||
- Suggestion: gate the re-export behind `#[cfg(any(test, feature = "test-utils"))]` or move fixture server helpers into a private dev-dependency crate when test infra consolidation is next in scope. Not worth fixing now — the practical risk is negligible (no external crate is expected to consume `detection_client::internal`).
|
||||
|
||||
**F3: `note_orphan_response` uses wrong counter** (Low / Maintainability)
|
||||
- Location: `crates/detection_client/src/internal/stats.rs:66`
|
||||
- Description: An orphan response (response arrived after the in-flight slot was budget-evicted) is a normal consequence of drop-oldest budgeting, not a stream error. Incrementing `stream_errors_total` conflates two distinct observability signals and could mislead operators.
|
||||
- Suggestion: Add a dedicated `orphan_responses_total: AtomicU64` field in a future stats refactor. Not blocking — the counter is additive and currently only consumed internally.
|
||||
|
||||
**F4: Pixel buffer copy per gRPC frame** (Low / Performance)
|
||||
- Location: `crates/detection_client/src/internal/runtime.rs:build_request`
|
||||
- Description: `pixels: frame.pixels.to_vec()` allocates a `Vec<u8>` copy of the full pixel buffer (potentially 3–25 MB at operational resolutions) for each frame before gRPC serialisation. The `Arc<Bytes>` on the frame prevents sharing across the gRPC encode path because prost requires owned `Vec<u8>` for `bytes` fields.
|
||||
- Suggestion: Investigate `bytes::Bytes` integration with prost's `bytes` feature flag in a future optimisation pass. Not a regression — the copy existed implicitly before and is unavoidable with the current proto stack version.
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Spec Compliance Summary
|
||||
|
||||
### AZ-659 — frame_ingest_publisher
|
||||
|
||||
| AC | Status | Test |
|
||||
|----|--------|------|
|
||||
| AC-1: Three consumers at rate, no drops | PASS | `ac1_three_consumers_at_rate_lose_no_frames` |
|
||||
| AC-2: Slow consumer drops, fast unaffected | PASS | `ac2_slow_consumer_drops_while_fast_consumers_unaffected` |
|
||||
| AC-3: Fan-out is zero-copy via Arc<Bytes> | PASS | `ac3_fan_out_is_zero_copy_via_arc_bytes` |
|
||||
|
||||
### AZ-660 — detection_client_grpc_stream
|
||||
|
||||
| AC | Status | Test |
|
||||
|----|--------|------|
|
||||
| AC-1: 30 fps / 10 s / ≥285 batches / p99 ≤100 ms / drops=0 | PASS | `ac660_1_happy_path_30fps_285_batches` |
|
||||
| AC-2: Reconnect within ≤2 s after stream close | PASS | `ac660_2_reconnects_after_stream_close` |
|
||||
| AC-3: Budget drops > 0 on 200 ms server | PASS | `ac660_3_budget_drops_on_slow_server` |
|
||||
| AC-4: ai_locked frames skipped | PASS | `ac660_4_ai_locked_frames_skipped` |
|
||||
|
||||
### AZ-661 — detection_client_schema_and_health
|
||||
|
||||
| AC | Status | Test |
|
||||
|----|--------|------|
|
||||
| AC-1: Schema mismatch → hard error + counter | PASS | `ac661_1_schema_mismatch_hard_error` |
|
||||
| AC-2: model_version change → exactly one event | PASS | `ac661_2_model_version_change_emits_event` |
|
||||
| AC-3: Tier1Degraded emitted exactly once on latency spike | PASS | `ac661_3_tier1_degraded_emitted_once_on_latency_spike` |
|
||||
|
||||
---
|
||||
|
||||
## Phase 7: Architecture Compliance
|
||||
|
||||
| Rule | Check | Result |
|
||||
|------|-------|--------|
|
||||
| Layer direction | `detection_client` imports only `shared` (Layer 1); no sibling crate imports | PASS |
|
||||
| Layer direction | `frame_ingest` imports only `shared` (Layer 1) | PASS |
|
||||
| Public API respect | No cross-component imports of internal modules | PASS |
|
||||
| No new cyclic deps | Import graph: detection_client → shared, frame_ingest → shared; no cycles | PASS |
|
||||
| Module-layout sync | `detection_client` public API section updated to reflect streaming shape | PASS (fixed) |
|
||||
| Module-layout sync | `frame_ingest` public API section updated to include publisher methods | PASS (fixed) |
|
||||
|
||||
---
|
||||
|
||||
**critical_count**: 0
|
||||
**high_count**: 0
|
||||
**Medium findings auto-fixed inline**: 1 (F1)
|
||||
**Verdict**: PASS_WITH_WARNINGS — proceed to commit.
|
||||
Reference in New Issue
Block a user