# Observability

## Logging
Current state: loguru writes daily-rotated files (`Logs/log_inference_YYYYMMDD.txt`, 30-day retention) plus stdout/stderr. Format: `[HH:mm:ss LEVEL] message`.
Recommended improvements:
| Aspect | Current | Recommended |
|---|---|---|
| Format | Human-readable | Structured JSON to stdout (container-friendly) |
| Fields | timestamp, level, message | + service, correlation_id, context |
| PII | Not applicable | No user IDs or tokens in logs |
| Retention | 30 days (file) | Console in dev; 7 days staging; 30 days production (via log aggregator) |
Container logging pattern: Log to stdout/stderr only; let the container runtime (Docker/K8s) handle log collection and routing. Remove file-based logging in containerized deployments.
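The structured-JSON recommendation can be sketched with a stdlib `logging` formatter; the field names (`service`, `correlation_id`, `context`) follow the table above, and the static `"detections"` service name is an assumption. If the service stays on loguru, `logger.add(sys.stdout, serialize=True)` produces equivalent JSON output.

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (container-friendly)."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "service": "detections",  # static service name (assumed)
            "message": record.getMessage(),
        }
        # Optional per-request fields attached via `extra=` on the log call.
        for field in ("correlation_id", "context"):
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        return json.dumps(payload)

handler = logging.StreamHandler(sys.stdout)  # stdout only; no file sinks
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("detections")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("inference started", extra={"correlation_id": "req-123"})
```

Because no user IDs or tokens ever enter the `payload` dict, the PII guideline is enforced at the formatter rather than at each call site.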
## Metrics
Recommended /metrics endpoint (Prometheus-compatible):
| Metric | Type | Labels | Description |
|---|---|---|---|
| `detections_requests_total` | Counter | method, endpoint, status | Total HTTP requests |
| `detections_request_duration_seconds` | Histogram | method, endpoint | Request processing time |
| `detections_inference_duration_seconds` | Histogram | media_type (image/video) | Inference processing time |
| `detections_active_inferences` | Gauge | — | Currently running inference jobs (0-2) |
| `detections_sse_clients` | Gauge | — | Connected SSE clients |
| `detections_engine_status` | Gauge | engine_type | 1=ready, 0=not ready |
Collection: Prometheus scrape at 15s intervals.
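A minimal sketch of registering these metrics with the `prometheus_client` package; the metric names and labels come from the table above, while the sample `.labels(...)` values are illustrative. `generate_latest()` renders the text exposition format a `/metrics` endpoint would return.

```python
# Sketch, assuming the prometheus_client package is installed.
from prometheus_client import Counter, Gauge, Histogram, generate_latest

REQUESTS = Counter(
    "detections_requests_total", "Total HTTP requests",
    ["method", "endpoint", "status"])
REQUEST_DURATION = Histogram(
    "detections_request_duration_seconds", "Request processing time",
    ["method", "endpoint"])
INFERENCE_DURATION = Histogram(
    "detections_inference_duration_seconds", "Inference processing time",
    ["media_type"])
ACTIVE_INFERENCES = Gauge(
    "detections_active_inferences", "Currently running inference jobs (0-2)")
SSE_CLIENTS = Gauge("detections_sse_clients", "Connected SSE clients")
ENGINE_STATUS = Gauge(
    "detections_engine_status", "1=ready, 0=not ready", ["engine_type"])

# Illustrative label values (assumed endpoint/engine names).
REQUESTS.labels(method="POST", endpoint="/detect/image", status="200").inc()
ENGINE_STATUS.labels(engine_type="onnx").set(1)

print(generate_latest().decode())  # body of the /metrics response
```

Wiring this into the HTTP framework is then a matter of returning `generate_latest()` with the `text/plain; version=0.0.4` content type from a `/metrics` route.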
## Distributed Tracing
Limited applicability: The Detections service makes outbound HTTP calls to Loader and Annotations. Trace context propagation is recommended for cross-service correlation.
| Span | Parent | Description |
|---|---|---|
| `detections.detect_image` | Client request | Full image detection flow |
| `detections.detect_video` | Client request | Full video detection flow |
| `detections.model_download` | `detect_*` | Model download from Loader |
| `detections.post_annotation` | `detect_*` | Annotation POST to Annotations service |
Implementation: OpenTelemetry Python SDK with OTLP exporter. Sampling: 100% in dev/staging, 10% in production.
## Alerting
| Severity | Response Time | Condition |
|---|---|---|
| Critical | 5 min | Health endpoint returns non-200; container restart loop |
| High | 30 min | Error rate > 5%; inference duration p95 > 10s |
| Medium | 4 hours | SSE client count = 0 for extended period; disk > 80% |
| Low | Next business day | Elevated log warnings; model download retries |
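The Critical and High conditions could be expressed as Prometheus alerting rules against the metrics defined earlier; this is a sketch, and the `job` label value and the `status=~"5.."` error convention are assumptions about the scrape config and HTTP status labeling.

```yaml
groups:
  - name: detections
    rules:
      - alert: DetectionsDown
        expr: up{job="detections"} == 0
        for: 1m
        labels: {severity: critical}
      - alert: DetectionsHighErrorRate
        expr: |
          sum(rate(detections_requests_total{status=~"5.."}[5m]))
            / sum(rate(detections_requests_total[5m])) > 0.05
        for: 10m
        labels: {severity: high}
      - alert: DetectionsSlowInference
        expr: |
          histogram_quantile(0.95,
            sum(rate(detections_inference_duration_seconds_bucket[5m])) by (le)) > 10
        for: 10m
        labels: {severity: high}
```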
## Dashboards
Operations dashboard:
- Service health status
- Request rate by endpoint
- Inference duration histogram (p50, p95, p99)
- Active inference count (0-2 gauge)
- SSE connected clients
- Error rate by type
Inference dashboard:
- Detections per frame/video
- Model availability status timeline
- Engine type distribution (ONNX vs TensorRT)
- Video batch processing rate