detections/_docs/02_document/components/03_inference_pipeline/description.md

Component: Inference Pipeline

Overview

Purpose: Orchestrates the full inference lifecycle — engine initialization with fallback strategy, media preprocessing (images + video), batched inference execution, postprocessing with detection filtering, and result delivery via callbacks.

Pattern: Façade + Pipeline — Inference class is the single entry point that coordinates engine selection, preprocessing, inference, and postprocessing stages.

Upstream: Domain (data models, config, status), Inference Engines (OnnxEngine/TensorRTEngine), External Client (LoaderHttpClient). Downstream: API (creates Inference, calls run_detect and detect_single_image).

Modules

Module               Role
inference            Core orchestrator: engine lifecycle, preprocessing, postprocessing, image/video processing
loader_http_client   HTTP client for model download from / upload to the Loader service

Internal Interfaces

Inference

cdef class Inference:
    __init__(loader_client)
    cpdef run_detect(dict config_dict, annotation_callback, status_callback=None)
    cpdef list detect_single_image(bytes image_bytes, dict config_dict)
    cpdef stop()

    # Internal pipeline stages:
    cdef init_ai()
    cdef preprocess(frames) -> ndarray
    cdef postprocess(output, ai_config) -> list[list[Detection]]
    cdef remove_overlapping_detections(list[Detection], float threshold) -> list[Detection]
    cdef _process_images(AIRecognitionConfig, list[str] paths)
    cdef _process_video(AIRecognitionConfig, str video_name)

LoaderHttpClient

class LoaderHttpClient:
    load_big_small_resource(str filename, str directory) -> LoadResult
    upload_big_small_resource(bytes content, str filename, str directory) -> LoadResult

External API

None — internal component, consumed by API layer.

Data Access Patterns

  • Model bytes downloaded from Loader service (HTTP)
  • Converted TensorRT engines uploaded back to Loader for caching
  • Video frames read via OpenCV VideoCapture
  • Images read via OpenCV imread
  • All processing is in-memory

Implementation Details

Engine Initialization Strategy

1. Check GPU availability (pynvml, compute capability ≥ 6.1)
2. If GPU:
   a. Try loading pre-built TensorRT engine from Loader
   b. If that fails → download ONNX model → start background conversion thread
   c. Background thread: convert ONNX→TensorRT → upload to Loader → set _converted_model_bytes
   d. Next init_ai() call: load from _converted_model_bytes
3. If no GPU:
   a. Download ONNX model from Loader → create OnnxEngine
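The fallback strategy above can be sketched as a small state machine. Everything here — the class names, the injected Loader stubs, and the interim-ONNX behavior while conversion runs — is a hypothetical stand-in for the real Cython code, not its actual implementation:

```python
import threading

class EngineNotFound(Exception):
    pass

class OnnxEngine:                  # stand-in for the real ONNX engine
    kind = "onnx"
    def __init__(self, model_bytes): self.model = model_bytes

class TensorRTEngine:              # stand-in for the real TensorRT engine
    kind = "tensorrt"
    def __init__(self, model_bytes): self.model = model_bytes

class InferenceSketch:
    def __init__(self, has_gpu, load_trt, download_onnx):
        self._has_gpu = has_gpu             # step 1: pynvml check, CC >= 6.1
        self._load_trt = load_trt           # raises EngineNotFound on cache miss
        self._download_onnx = download_onnx
        self._converted_model_bytes = None  # set by the background thread (step 2c)

    def init_ai(self):
        if not self._has_gpu():
            return OnnxEngine(self._download_onnx())            # step 3a: CPU path
        if self._converted_model_bytes is not None:
            return TensorRTEngine(self._converted_model_bytes)  # step 2d
        try:
            return TensorRTEngine(self._load_trt())             # step 2a
        except EngineNotFound:
            onnx = self._download_onnx()                        # step 2b
            threading.Thread(target=self._convert, args=(onnx,),
                             daemon=True).start()
            return OnnxEngine(onnx)  # assumed interim engine while converting

    def _convert(self, onnx_bytes):
        # the real thread converts ONNX -> TensorRT and uploads the result to Loader
        self._converted_model_bytes = b"trt:" + onnx_bytes
```

A second `init_ai()` call after the background thread finishes picks up the converted bytes, matching step 2d.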

Preprocessing

  • cv2.dnn.blobFromImage: normalize 0..1, resize to model input, BGR→RGB
  • Batch via np.vstack
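A NumPy-only sketch of what this stage produces per frame (scale to 0..1, BGR→RGB, channel reorder, batch dimension) and how frames are batched; the resize step and the exact blobFromImage arguments are omitted, so treat the shapes as illustrative rather than the component's actual code:

```python
import numpy as np

def to_blob(frame_bgr):
    """HxWx3 uint8 BGR frame -> 1x3xHxW float32 blob in 0..1, RGB order."""
    rgb = frame_bgr[..., ::-1]                       # BGR -> RGB (swapRB)
    chw = rgb.transpose(2, 0, 1).astype(np.float32)  # HWC -> CHW
    return (chw / 255.0)[np.newaxis]                 # scale to 0..1, add batch dim

def make_batch(frames):
    """Stack per-frame blobs into an N x 3 x H x W batch via np.vstack."""
    return np.vstack([to_blob(f) for f in frames])
```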

Postprocessing

  • Parse [batch][det][x1,y1,x2,y2,conf,cls] output
  • Normalize coordinates to 0..1
  • Convert to center-format Detection objects
  • Filter by confidence threshold
  • Remove overlapping detections (greedy: keep higher confidence, tie-break by lower class_id)
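The greedy overlap filter can be sketched in plain Python; the simplified Detection class and the IoU-based overlap test are assumptions standing in for the real Cython types:

```python
from dataclasses import dataclass

@dataclass
class Detection:           # simplified stand-in for the real Detection type
    x1: float; y1: float; x2: float; y2: float
    conf: float; class_id: int

def iou(a, b):
    """Intersection-over-union of two corner-format boxes."""
    ix = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    iy = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = ix * iy
    union = ((a.x2 - a.x1) * (a.y2 - a.y1)
             + (b.x2 - b.x1) * (b.y2 - b.y1) - inter)
    return inter / union if union > 0 else 0.0

def remove_overlapping(dets, threshold=0.5):
    """Greedy: visit by descending confidence (ties -> lower class_id first),
    keep a detection only if it does not overlap anything already kept."""
    kept = []
    for d in sorted(dets, key=lambda d: (-d.conf, d.class_id)):
        if all(iou(d, k) < threshold for k in kept):
            kept.append(d)
    return kept
```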

Large Image Tiling

  • Ground Sampling Distance: sensor_width * altitude / (focal_length * image_width)
  • Tile size: METERS_IN_TILE / GSD pixels
  • Overlap: configurable percentage
  • Tile deduplication: absolute-coordinate Detection equality across adjacent tiles
  • Physical size filtering: remove detections exceeding class max_object_size_meters
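The tiling arithmetic from the first two bullets, as a sketch; the value of METERS_IN_TILE and the sample camera numbers are illustrative assumptions, not the component's actual constants:

```python
METERS_IN_TILE = 100.0  # assumed value; the real constant lives in the component

def gsd_m_per_px(sensor_width_m, altitude_m, focal_length_m, image_width_px):
    """Ground Sampling Distance: meters of ground covered by one pixel."""
    return sensor_width_m * altitude_m / (focal_length_m * image_width_px)

def tile_size_px(gsd):
    """Edge length of a tile, in pixels, covering METERS_IN_TILE meters."""
    return int(METERS_IN_TILE / gsd)

# illustrative numbers: 13.2 mm sensor, 100 m altitude, 8.8 mm lens, 5472 px wide
gsd = gsd_m_per_px(0.0132, 100.0, 0.0088, 5472)
```

Higher altitude means a larger GSD (coarser ground resolution) and therefore fewer, smaller tiles in pixel terms.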

Video Processing

  • Frame sampling: every Nth frame
  • Annotation validity heuristics: time gap, detection count increase, spatial movement, confidence improvement
  • JPEG encoding of valid frames for annotation images
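A pure-Python sketch of the sampling step and a reduced version of the validity heuristics (time gap, detection-count increase, confidence improvement; the spatial-movement check is omitted for brevity). The dict fields and the default threshold are illustrative assumptions:

```python
def sampled_indices(total_frames, n):
    """Indices of the frames that reach inference when sampling every Nth frame."""
    return [i for i in range(total_frames) if i % n == 0]

def is_valid_annotation(prev, cur, min_gap_s=1.0):
    """prev/cur: {'t': seconds, 'count': detections, 'conf': best confidence}."""
    if prev is None:
        return True                                  # first annotation is always valid
    return (cur['t'] - prev['t'] >= min_gap_s        # enough time elapsed
            or cur['count'] > prev['count']          # more objects appeared
            or cur['conf'] > prev['conf'])           # confidence improved
```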

Callbacks

  • annotation_callback(annotation, percent) — called per valid annotation
  • status_callback(media_name, count) — called when all detections for a media item are complete
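The calling convention can be illustrated with a toy driver; process_media and its arguments are hypothetical, and only the two callback signatures come from the description above:

```python
def process_media(media_items, annotation_callback, status_callback=None):
    """media_items: list of (media_name, [annotation, ...]) pairs."""
    total = len(media_items)
    for i, (name, annotations) in enumerate(media_items, start=1):
        for ann in annotations:
            annotation_callback(ann, int(100 * i / total))  # per valid annotation
        if status_callback is not None:
            status_callback(name, len(annotations))         # once per finished media item
```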

Caveats

  • ThreadPoolExecutor with max_workers=2 limits concurrent inference (set in main.py)
  • Background TensorRT conversion runs in a daemon thread — may be interrupted on shutdown
  • init_ai() called on every run_detect — idempotent but checks engine state each time
  • Video processing is sequential per video (no parallel video processing)
  • _tile_detections dict is instance-level state that persists across image calls within a single run_detect invocation

Dependency Graph

graph TD
    inference --> constants_inf
    inference --> ai_availability_status
    inference --> annotation
    inference --> ai_config
    inference -.-> onnx_engine
    inference -.-> tensorrt_engine
    inference --> loader_http_client

Logging Strategy

Extensive logging via constants_inf.log: engine init status, media processing start, GSD calculation, tile splitting, detection results, size filtering decisions.