Mirror of https://github.com/azaion/detections.git (synced 2026-04-22 10:36:32 +00:00).

Commit: Add detailed file index and enhance skill documentation for autopilot, decompose, deploy, plan, and research skills. Introduce tests-only mode in decompose skill, clarify required files for deploy and plan skills, and improve prerequisite checks across skills for better user guidance and workflow efficiency.
# Module: ai_availability_status

## Purpose

Thread-safe status tracker for the AI engine lifecycle (downloading, converting, uploading, enabled, warning, error).

## Public Interface

### Enum: AIAvailabilityEnum

| Value | Name | Meaning |
|-------|------|---------|
| 0 | NONE | Initial state, not yet initialized |
| 10 | DOWNLOADING | Model download in progress |
| 20 | CONVERTING | ONNX-to-TensorRT conversion in progress |
| 30 | UPLOADING | Converted model upload in progress |
| 200 | ENABLED | Engine ready for inference |
| 300 | WARNING | Operational with warnings |
| 500 | ERROR | Failed, not operational |

### Class: AIAvailabilityStatus

| Field | Type | Description |
|-------|------|-------------|
| `status` | AIAvailabilityEnum | Current status |
| `error_message` | str or None | Error/warning details |

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `()` | Sets status=NONE, error_message=None |
| `__str__` | `() -> str` | Thread-safe formatted string: `"StatusText ErrorText"` |
| `serialize` | `() -> bytes` | Thread-safe msgpack serialization `{s: status, m: error_message}` **(legacy — not called in current codebase)** |
| `set_status` | `(AIAvailabilityEnum status, str error_message=None) -> void` | Thread-safe status update; logs via constants_inf (error or info) |

## Internal Logic

All public methods acquire a `threading.Lock` before reading/writing status fields. `set_status` logs the transition: errors go to `constants_inf.logerror`, normal transitions go to `constants_inf.log`.
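
The locking pattern can be sketched in plain Python (an illustrative reconstruction, not the module's Cython source; the `constants_inf` logging calls are omitted here):

```python
import threading
from enum import IntEnum

class AIAvailabilityEnum(IntEnum):
    NONE = 0
    DOWNLOADING = 10
    CONVERTING = 20
    UPLOADING = 30
    ENABLED = 200
    WARNING = 300
    ERROR = 500

class AIAvailabilityStatus:
    def __init__(self):
        self._lock = threading.Lock()
        self.status = AIAvailabilityEnum.NONE
        self.error_message = None

    def set_status(self, status, error_message=None):
        # Every read/write of the shared fields happens under the lock, so
        # async handlers and executor threads always see a consistent pair.
        with self._lock:
            self.status = status
            self.error_message = error_message

    def __str__(self):
        with self._lock:
            return f"{self.status.name} {self.error_message or ''}".strip()
```

Holding the lock in `__str__` as well prevents a reader from observing a new `status` paired with a stale `error_message`.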

## Dependencies

- **External**: `msgpack`, `threading`
- **Internal**: `constants_inf` (logging)

## Consumers

- `inference` — creates instance, calls `set_status` during engine lifecycle, exposes `ai_availability_status` for health checks
- `main` — reads `ai_availability_status` via inference for `/health` endpoint

## Data Models

- `AIAvailabilityEnum` — status enum
- `AIAvailabilityStatus` — stateful status holder

## Configuration

None.

## External Integrations

None.

## Security

Thread-safe via Lock — safe for concurrent access from FastAPI async + ThreadPoolExecutor.

## Tests

None found.

# Module: ai_config

## Purpose

Data class holding all AI recognition configuration parameters, with factory methods for deserialization from msgpack and dict formats.

## Public Interface

### Class: AIRecognitionConfig

#### Fields

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `frame_period_recognition` | int | 4 | Process every Nth frame in video |
| `frame_recognition_seconds` | double | 2.0 | Minimum seconds between valid video annotations |
| `probability_threshold` | double | 0.25 | Minimum detection confidence |
| `tracking_distance_confidence` | double | 0.0 | Distance threshold for tracking (model-width units) |
| `tracking_probability_increase` | double | 0.0 | Required confidence increase for tracking update |
| `tracking_intersection_threshold` | double | 0.6 | IoU threshold for overlapping detection removal |
| `file_data` | bytes | `b''` | Raw file data (msgpack use) |
| `paths` | list[str] | `[]` | Media file paths to process |
| `model_batch_size` | int | 1 | Batch size for inference |
| `big_image_tile_overlap_percent` | int | 20 | Tile overlap percentage for large image splitting |
| `altitude` | double | 400 | Camera altitude in meters |
| `focal_length` | double | 24 | Camera focal length in mm |
| `sensor_width` | double | 23.5 | Camera sensor width in mm |

#### Methods

| Method | Signature | Description |
|--------|-----------|-------------|
| `from_msgpack` | `(bytes data) -> AIRecognitionConfig` | Static cdef; deserializes from msgpack binary |
| `from_dict` | `(dict data) -> AIRecognitionConfig` | Static def; deserializes from Python dict |

## Internal Logic

Both factory methods apply defaults for missing keys. `from_msgpack` uses compact single-character keys (`f_pr`, `pt`, `t_dc`, etc.) while `from_dict` uses full descriptive keys.

**Legacy/unused**: `from_msgpack()` is defined but never called in the current codebase — it is a remnant of a previous queue-based architecture. Only `from_dict()` is actively used. The `file_data` field is stored but never read anywhere.
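
The defaults-for-missing-keys behavior of `from_dict` can be sketched as follows (a plain-Python illustration mirroring the field table above; `file_data` is omitted since it is never read, and this is not the Cython source):

```python
class AIRecognitionConfig:
    # Defaults mirror the field table above.
    DEFAULTS = {
        "frame_period_recognition": 4,
        "frame_recognition_seconds": 2.0,
        "probability_threshold": 0.25,
        "tracking_distance_confidence": 0.0,
        "tracking_probability_increase": 0.0,
        "tracking_intersection_threshold": 0.6,
        "paths": [],
        "model_batch_size": 1,
        "big_image_tile_overlap_percent": 20,
        "altitude": 400,
        "focal_length": 24,
        "sensor_width": 23.5,
    }

    @staticmethod
    def from_dict(data):
        cfg = AIRecognitionConfig()
        # Any key missing from the input dict falls back to its default.
        for key, default in AIRecognitionConfig.DEFAULTS.items():
            setattr(cfg, key, data.get(key, default))
        return cfg
```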

## Dependencies

- **External**: `msgpack`
- **Internal**: none (leaf module)

## Consumers

- `inference` — creates config from dict, uses all fields for frame selection, detection filtering, image tiling, and tracking

## Data Models

- `AIRecognitionConfig` — the sole data class

## Configuration

Camera/altitude parameters (`altitude`, `focal_length`, `sensor_width`) are used for ground sampling distance calculation in aerial image processing.

## External Integrations

None.

## Security

None.

## Tests

None found.

# Module: annotation

## Purpose

Data models for object detections and annotations (grouped detections for a frame/tile with metadata).

## Public Interface

### Class: Detection

Represents a single bounding box detection in normalized coordinates.

| Field | Type | Description |
|-------|------|-------------|
| `x` | double | Center X (normalized 0..1) |
| `y` | double | Center Y (normalized 0..1) |
| `w` | double | Width (normalized 0..1) |
| `h` | double | Height (normalized 0..1) |
| `cls` | int | Class ID (maps to constants_inf.annotations_dict) |
| `confidence` | double | Detection confidence (0..1) |
| `annotation_name` | str | Parent annotation name (set after construction) |

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(double x, y, w, h, int cls, double confidence)` | Constructor |
| `__str__` | `() -> str` | Format: `"{cls}: {x} {y} {w} {h}, prob: {confidence}%"` |
| `__eq__` | `(other) -> bool` | Two detections are equal if all bbox coordinates differ by less than `TILE_DUPLICATE_CONFIDENCE_THRESHOLD` |
| `overlaps` | `(Detection det2, float confidence_threshold) -> bool` | Returns True if IoU-like overlap ratio (overlap area / min area) exceeds threshold |

### Class: Annotation

Groups detections for a single frame or image tile.

| Field | Type | Description |
|-------|------|-------------|
| `name` | str | Unique annotation name (encodes tile/time info) |
| `original_media_name` | str | Source media filename (without extension/spaces) |
| `time` | long | Timestamp in milliseconds (video) or 0 (image) |
| `detections` | list[Detection] | Detections found in this frame/tile |
| `image` | bytes | JPEG-encoded frame image (set after validation) |

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(str name, str original_media_name, long ms, list[Detection] detections)` | Sets annotation_name on all detections |
| `__str__` | `() -> str` | Formatted detection summary |
| `serialize` | `() -> bytes` | Msgpack serialization with compact keys **(legacy — not called in current codebase)** |

## Internal Logic

- `Detection.__eq__` uses `constants_inf.TILE_DUPLICATE_CONFIDENCE_THRESHOLD` (0.01) to determine if two detections at absolute coordinates are duplicates across adjacent tiles.
- `Detection.overlaps` computes the overlap as `overlap_area / min(area1, area2)` — this is not standard IoU but a containment-biased metric.
- `Annotation.__init__` sets `annotation_name` on every child detection.
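
The containment-biased metric can be sketched for center-format boxes (an illustrative reconstruction operating on plain tuples rather than `Detection` objects):

```python
def overlaps(a, b, threshold):
    """a, b: (cx, cy, w, h) boxes in normalized coordinates.
    True when overlap_area / min(area_a, area_b) exceeds threshold."""
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    # Intersection rectangle (zero extent if the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    overlap = iw * ih
    smaller = min(a[2] * a[3], b[2] * b[3])
    # Dividing by the smaller area (not the union, as IoU would) biases
    # toward containment: a box fully inside another always scores 1.0.
    return smaller > 0 and overlap / smaller > threshold
```

This is why a small detection nested inside a larger one is always treated as overlapping, which standard IoU with the same threshold would often miss.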

## Dependencies

- **External**: `msgpack`
- **Internal**: `constants_inf` (TILE_DUPLICATE_CONFIDENCE_THRESHOLD constant)

## Consumers

- `inference` — creates Detection and Annotation instances during postprocessing, uses overlaps for NMS, uses equality for tile dedup
- `main` — reads Detection fields for DTO conversion

## Data Models

- `Detection` — bounding box + class + confidence
- `Annotation` — frame/tile container for detections + metadata + image

## Configuration

None.

## External Integrations

None.

## Security

None.

## Tests

None found.

# Module: constants_inf

## Purpose

Application-wide constants, logging infrastructure, and the object detection class registry loaded from `classes.json`.

## Public Interface

### Constants

| Name | Type | Value | Description |
|------|------|-------|-------------|
| `CONFIG_FILE` | str | `"config.yaml"` | Configuration file path |
| `QUEUE_CONFIG_FILENAME` | str | `"secured-config.json"` | Queue config filename |
| `AI_ONNX_MODEL_FILE` | str | `"azaion.onnx"` | ONNX model filename |
| `CDN_CONFIG` | str | `"cdn.yaml"` | CDN configuration file |
| `MODELS_FOLDER` | str | `"models"` | Directory for model files |
| `SMALL_SIZE_KB` | int | `3` | Small file size threshold (KB) |
| `SPLIT_SUFFIX` | str | `"!split!"` | Delimiter in tiled image names |
| `TILE_DUPLICATE_CONFIDENCE_THRESHOLD` | double | `0.01` | Threshold for tile duplicate detection equality |
| `METERS_IN_TILE` | int | `25` | Physical tile size in meters for large image splitting |
| `weather_switcher_increase` | int | `20` | Offset between weather mode class ID ranges |

### Enum: WeatherMode

| Value | Name | Meaning |
|-------|------|---------|
| 0 | Norm | Normal weather |
| 20 | Wint | Winter |
| 40 | Night | Night |

### Class: AnnotationClass

Fields: `id` (int), `name` (str), `color` (str), `max_object_size_meters` (int).

Represents a detection class with its display metadata and physical size constraint.

### Functions

| Function | Signature | Description |
|----------|-----------|-------------|
| `log` | `(str log_message) -> void` | Info-level log via loguru |
| `logerror` | `(str error) -> void` | Error-level log via loguru |
| `format_time` | `(int ms) -> str` | Converts milliseconds to compact time string `HMMSSf` |

### Global: `annotations_dict`

`dict[int, AnnotationClass]` — loaded at module init from `classes.json`. Contains 19 base classes × 3 weather modes (Norm/Wint/Night) = up to 57 entries. Keys are class IDs, values are `AnnotationClass` instances.

## Internal Logic

- On import, reads `classes.json` and builds `annotations_dict` by iterating 3 weather mode offsets (0, 20, 40) and adding class ID offsets. Weather mode names are appended to class names for non-Norm modes.
- Configures loguru with:
  - File sink: `Logs/log_inference_YYYYMMDD.txt` (daily rotation, 30-day retention)
  - Stdout: INFO/DEBUG/SUCCESS levels
  - Stderr: WARNING and above
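
The `annotations_dict` construction can be sketched as follows (class names and the shape of the parsed `classes.json` data are illustrative assumptions; values are names here rather than full `AnnotationClass` instances):

```python
from enum import IntEnum

class WeatherMode(IntEnum):
    Norm = 0
    Wint = 20
    Night = 40

def build_annotations_dict(base_classes):
    """base_classes: dict[int, str] of base class IDs to names, e.g.
    parsed from classes.json. Returns one entry per (class, mode) pair."""
    annotations = {}
    for mode in WeatherMode:
        for class_id, name in base_classes.items():
            # Non-Norm modes get the mode name appended to the class name;
            # the mode value (0, 20, 40) offsets the class ID.
            full = name if mode is WeatherMode.Norm else f"{name} {mode.name}"
            annotations[class_id + mode.value] = full
    return annotations
```

With 19 base classes this yields up to 57 entries, matching the count stated above.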

## Legacy / Orphaned Declarations

The `.pxd` header declares `QUEUE_MAXSIZE`, `COMMANDS_QUEUE`, and `ANNOTATIONS_QUEUE` (with comments referencing RabbitMQ) that are **not defined** in the `.pyx` implementation. These are remnants of a previous queue-based architecture and are unused.

## Dependencies

- **External**: `json`, `sys`, `loguru`
- **Internal**: none (leaf module)

## Consumers

- `ai_availability_status` (logging)
- `annotation` (tile duplicate threshold)
- `onnx_engine` (logging)
- `tensorrt_engine` (logging)
- `inference` (logging, constants, annotations_dict, format_time, SPLIT_SUFFIX, METERS_IN_TILE, MODELS_FOLDER, AI_ONNX_MODEL_FILE)
- `main` (annotations_dict for label lookup)

## Data Models

- `AnnotationClass` — detection class metadata
- `WeatherMode` — enum for weather conditions

## Configuration

- Reads `classes.json` at import time (must exist in working directory)

## External Integrations

None.

## Security

None.

## Tests

None found.

# Module: inference

## Purpose

Core inference orchestrator — manages the AI engine lifecycle, preprocesses media (images and video), runs batched inference, postprocesses detections, and applies validation filters (overlap removal, size filtering, tile deduplication, video tracking).

## Public Interface

### Class: Inference

#### Fields

| Field | Type | Access | Description |
|-------|------|--------|-------------|
| `loader_client` | object | internal | LoaderHttpClient instance |
| `engine` | InferenceEngine | internal | Active engine (OnnxEngine or TensorRTEngine), None if unavailable |
| `ai_availability_status` | AIAvailabilityStatus | public | Current AI readiness status |
| `stop_signal` | bool | internal | Flag to abort video processing |
| `model_width` | int | internal | Model input width in pixels |
| `model_height` | int | internal | Model input height in pixels |
| `detection_counts` | dict[str, int] | internal | Per-media detection count |
| `is_building_engine` | bool | internal | True during async TensorRT conversion |

#### Methods

| Method | Signature | Access | Description |
|--------|-----------|--------|-------------|
| `__init__` | `(loader_client)` | public | Initializes state, calls `init_ai()` |
| `run_detect` | `(dict config_dict, annotation_callback, status_callback=None)` | cpdef | Main entry: parses config, separates images/videos, processes each |
| `detect_single_image` | `(bytes image_bytes, dict config_dict) -> list` | cpdef | Single-image detection from raw bytes, returns list[Detection] |
| `stop` | `()` | cpdef | Sets stop_signal to True |
| `init_ai` | `()` | cdef | Engine initialization: tries TensorRT engine file → falls back to ONNX → background TensorRT conversion |
| `preprocess` | `(frames) -> ndarray` | cdef | OpenCV blobFromImage: resize, normalize to 0..1, swap RGB, stack batch |
| `postprocess` | `(output, ai_config) -> list[list[Detection]]` | cdef | Parses engine output to Detection objects, applies confidence threshold and overlap removal |

## Internal Logic

### Engine Initialization (`init_ai`)

1. If `_converted_model_bytes` exists → load TensorRT from those bytes
2. If GPU available → try downloading pre-built TensorRT engine from loader
3. If download fails → download ONNX model, start background thread for ONNX→TensorRT conversion
4. If no GPU → load OnnxEngine from ONNX model bytes

### Preprocessing

- `cv2.dnn.blobFromImage`: scale 1/255, resize to model dims, BGR→RGB, no crop
- Stack multiple frames via `np.vstack` for batched inference

### Postprocessing

- Engine output format: `[batch][detection_index][x1, y1, x2, y2, confidence, class_id]`
- Coordinates normalized to 0..1 by dividing by model width/height
- Converted to center-format (cx, cy, w, h) Detection objects
- Filtered by `probability_threshold`
- Overlapping detections removed via `remove_overlapping_detections` (greedy, keeps higher confidence)
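
The greedy overlap removal step can be sketched as follows (a minimal stand-in where detections are `(confidence, box)` tuples and `overlaps` is any pairwise predicate; the real code operates on `Detection` objects):

```python
def remove_overlapping_detections(detections, overlaps):
    """Greedy filter: visit detections in descending confidence order,
    keep each one only if it does not overlap a detection already kept."""
    kept = []
    for det in sorted(detections, key=lambda d: d[0], reverse=True):
        if not any(overlaps(det[1], k[1]) for k in kept):
            kept.append(det)
    return kept
```

Because the scan is highest-confidence first, whenever two detections overlap the higher-confidence one is always the survivor.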

### Image Processing

- Small images (≤1.5× model size): processed as single frame
- Large images: split into tiles based on ground sampling distance. Tile size = `METERS_IN_TILE / GSD` pixels. Tiles overlap by configurable percentage.
- Tile deduplication: absolute-coordinate comparison across adjacent tiles using `Detection.__eq__`
- Size filtering: detections whose physical size (meters) exceeds `AnnotationClass.max_object_size_meters` are removed. Physical size computed from GSD × pixel dimensions.

### Video Processing

- Frame sampling: every Nth frame (`frame_period_recognition`)
- Batch accumulation up to engine batch size
- Annotation validity: must differ from the previous annotation by any of:
  - Time gap ≥ `frame_recognition_seconds`
  - More detections than previous
  - Any detection moved beyond `tracking_distance_confidence` threshold
  - Any detection confidence increased beyond `tracking_probability_increase`
- Valid frames get JPEG-encoded image attached

### Ground Sampling Distance (GSD)

`GSD = sensor_width * altitude / (focal_length * image_width)` — meters per pixel, used for physical size filtering of aerial detections.
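
A worked example with the default camera parameters from `ai_config` (altitude 400 m, focal length 24 mm, sensor width 23.5 mm); the image width of 4000 px is an assumed illustrative value:

```python
def gsd_meters_per_pixel(sensor_width_mm, altitude_m, focal_length_mm, image_width_px):
    # GSD = sensor_width * altitude / (focal_length * image_width).
    # sensor_width and focal_length share units (mm), so they cancel,
    # leaving meters of ground per image pixel.
    return sensor_width_mm * altitude_m / (focal_length_mm * image_width_px)

def tile_size_px(meters_in_tile, gsd):
    # Tile side in pixels so each tile covers METERS_IN_TILE meters of ground.
    return int(meters_in_tile / gsd)
```

For a 4000 px wide image this gives roughly 0.098 m/px, so a 25 m tile is about 255 px on a side.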

## Dependencies

- **External**: `cv2`, `numpy`, `pynvml`, `mimetypes`, `pathlib`, `threading`
- **Internal**: `constants_inf`, `ai_availability_status`, `annotation`, `ai_config`, `tensorrt_engine` (conditional), `onnx_engine` (conditional), `inference_engine` (type)

## Consumers

- `main` — lazy-initializes Inference, calls `run_detect`, `detect_single_image`, reads `ai_availability_status`

## Data Models

Uses `Detection`, `Annotation` (from annotation), `AIRecognitionConfig` (from ai_config), `AIAvailabilityStatus` (from ai_availability_status).

## Configuration

All runtime config comes via `AIRecognitionConfig` dict. Engine selection is automatic based on GPU availability (checked at module level via pynvml).

## External Integrations

- **Loader service** (via loader_client): model download/upload

## Security

None.

## Tests

None found.

# Module: inference_engine

## Purpose

Abstract base class defining the interface that all inference engine implementations must follow.

## Public Interface

### Class: InferenceEngine

#### Fields

| Field | Type | Description |
|-------|------|-------------|
| `batch_size` | int | Number of images per inference batch |

#### Methods

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(bytes model_bytes, int batch_size=1, **kwargs)` | Stores batch_size |
| `get_input_shape` | `() -> tuple` | Returns (height, width) of model input. Abstract — raises `NotImplementedError` |
| `get_batch_size` | `() -> int` | Returns `self.batch_size` |
| `run` | `(input_data) -> list` | Runs inference on preprocessed input blob. Abstract — raises `NotImplementedError` |

## Internal Logic

Pure abstract class. All methods except `__init__` and `get_batch_size` raise `NotImplementedError` and must be overridden by subclasses (`OnnxEngine`, `TensorRTEngine`).
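
The contract can be sketched in plain Python (the real module is Cython; method names and the signature match the table above):

```python
class InferenceEngine:
    def __init__(self, model_bytes, batch_size=1, **kwargs):
        self.batch_size = batch_size

    def get_input_shape(self):
        # Subclasses return (height, width) of the model input.
        raise NotImplementedError

    def get_batch_size(self):
        return self.batch_size

    def run(self, input_data):
        # Subclasses run inference on a preprocessed blob.
        raise NotImplementedError
```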

## Dependencies

- **External**: `numpy` (declared in .pxd, not used in base)
- **Internal**: none (leaf module)

## Consumers

- `onnx_engine` — subclass
- `tensorrt_engine` — subclass
- `inference` — type reference in .pxd

## Data Models

None.

## Configuration

None.

## External Integrations

None.

## Security

None.

## Tests

None found.

# Module: loader_http_client

## Purpose

HTTP client for downloading and uploading model files (and other binary resources) via an external Loader microservice.

## Public Interface

### Class: LoadResult

Simple result wrapper.

| Field | Type | Description |
|-------|------|-------------|
| `err` | str or None | Error message if operation failed |
| `data` | bytes or None | Response payload on success |

### Class: LoaderHttpClient

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(str base_url)` | Stores base URL, strips trailing slash |
| `load_big_small_resource` | `(str filename, str directory) -> LoadResult` | POST to `/load/(unknown)` with JSON body `{filename, folder}`, returns raw bytes |
| `upload_big_small_resource` | `(bytes content, str filename, str directory) -> LoadResult` | POST to `/upload/(unknown)` with multipart file + form data `{folder}` |
| `stop` | `() -> None` | No-op placeholder |

## Internal Logic

Both load/upload methods wrap all exceptions into `LoadResult(err=str(e))`. Errors are logged via loguru but never raised.
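
The error-or-data pattern can be sketched without the HTTP layer (the `requests` call is replaced by an injected callable so the sketch stays self-contained; `load_resource` is a hypothetical stand-in for the client methods):

```python
class LoadResult:
    def __init__(self, err=None, data=None):
        self.err = err    # error message string on failure
        self.data = data  # payload bytes on success

def load_resource(fetch):
    """fetch: a zero-argument callable returning bytes.
    All exceptions are captured into LoadResult.err, never raised."""
    try:
        return LoadResult(data=fetch())
    except Exception as e:
        return LoadResult(err=str(e))
```

Callers therefore branch on `result.err` rather than wrapping every call in try/except.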

## Dependencies

- **External**: `requests`, `loguru`
- **Internal**: none (leaf module)

## Consumers

- `inference` — downloads ONNX/TensorRT models, uploads converted TensorRT engines
- `main` — instantiates client with `LOADER_URL`

## Data Models

- `LoadResult` — operation result with error-or-data semantics

## Configuration

- `base_url` — provided at construction time, sourced from `LOADER_URL` environment variable in `main.py`

## External Integrations

| Integration | Protocol | Endpoint Pattern |
|-------------|----------|-----------------|
| Loader service | HTTP POST | `/load/(unknown)` (download), `/upload/(unknown)` (upload) |

## Security

None (no auth headers sent to loader).

## Tests

None found.

# Module: main

## Purpose

FastAPI application entry point — exposes HTTP API for object detection on images and video media, health checks, and Server-Sent Events (SSE) streaming of detection results.

## Public Interface

### API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Returns AI engine availability status |
| POST | `/detect` | Single image detection (multipart file upload) |
| POST | `/detect/{media_id}` | Start async detection on media from loader service |
| GET | `/detect/stream` | SSE stream of detection events |

### DTOs (Pydantic Models)

| Model | Fields | Description |
|-------|--------|-------------|
| `DetectionDto` | centerX, centerY, width, height, classNum, label, confidence | Single detection result |
| `DetectionEvent` | annotations (list[DetectionDto]), mediaId, mediaStatus, mediaPercent | SSE event payload |
| `HealthResponse` | status, aiAvailability, errorMessage | Health check response |
| `AIConfigDto` | frame_period_recognition, frame_recognition_seconds, probability_threshold, tracking_*, model_batch_size, big_image_tile_overlap_percent, altitude, focal_length, sensor_width, paths | Configuration input for media detection |

### Class: TokenManager

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(str access_token, str refresh_token)` | Stores tokens |
| `get_valid_token` | `() -> str` | Returns access_token; auto-refreshes if expiring within 60s |

## Internal Logic

### `/health`

Always returns `HealthResponse` with `status="healthy"`; `aiAvailability` reflects the engine's `AIAvailabilityStatus`. On exception, returns `aiAvailability="None"`.

### `/detect` (single image)

1. Reads uploaded file bytes
2. Parses optional JSON config
3. Runs `inference.detect_single_image` in ThreadPoolExecutor (max 2 workers)
4. Returns list of DetectionDto

Error mapping: RuntimeError("not available") → 503, RuntimeError → 422, ValueError → 400.

### `/detect/{media_id}` (async media)

1. Checks for duplicate active detection (409 if already running)
2. Extracts auth tokens from Authorization header and x-refresh-token header
3. Creates `asyncio.Task` for background detection
4. Detection runs `inference.run_detect` in ThreadPoolExecutor
5. Callbacks push `DetectionEvent` to all SSE queues
6. If auth token present, also POSTs annotations to the Annotations service
7. Returns immediately: `{"status": "started", "mediaId": media_id}`

### `/detect/stream` (SSE)

- Creates asyncio.Queue per client (maxsize=100)
- Yields `data: {json}\n\n` SSE format
- Cleans up queue on disconnect
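
The SSE framing can be sketched as follows (the payload fields follow the `DetectionEvent` DTO; the field values are illustrative):

```python
import json

def sse_frame(event: dict) -> str:
    # Server-Sent Events: each message is "data: <payload>\n\n";
    # the blank line terminates the event.
    return f"data: {json.dumps(event)}\n\n"
```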

### Token Management

- Decodes JWT exp claim from base64 payload (no signature verification)
- Auto-refreshes via POST to `{ANNOTATIONS_URL}/auth/refresh` when within 60s of expiry
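
Reading the `exp` claim without signature verification can be sketched as follows (standard base64url handling; this mirrors the approach described above, not the exact code):

```python
import base64
import json
import time

def jwt_expires_within(token: str, seconds: int = 60) -> bool:
    """Reads the exp claim from a JWT payload without verifying the
    signature, and reports whether it expires within `seconds`."""
    payload_b64 = token.split(".")[1]
    # base64url payloads usually omit "=" padding; restore it first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["exp"] - time.time() < seconds
```

Note that, as the Security section states, this performs no local token validation beyond reading the expiry.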

### Annotations Service Integration

- POST to `{ANNOTATIONS_URL}/annotations` with:
  - `mediaId`, `source: 0`, `videoTime` (formatted from ms), `detections` (list of dto dicts)
  - Optional base64-encoded `image`
  - Bearer token in Authorization header

## Dependencies

- **External**: `asyncio`, `base64`, `json`, `os`, `time`, `concurrent.futures`, `typing`, `requests`, `fastapi`, `pydantic`
- **Internal**: `inference` (lazy import), `constants_inf` (label lookup), `loader_http_client` (client instantiation)

## Consumers

None (entry point).

## Data Models

- `DetectionDto`, `DetectionEvent`, `HealthResponse`, `AIConfigDto` — Pydantic models for API
- `TokenManager` — JWT token lifecycle

## Configuration

| Env Var | Default | Description |
|---------|---------|-------------|
| `LOADER_URL` | `http://loader:8080` | Loader service base URL |
| `ANNOTATIONS_URL` | `http://annotations:8080` | Annotations service base URL |

## External Integrations

| Service | Protocol | Purpose |
|---------|----------|---------|
| Loader | HTTP (via LoaderHttpClient) | Model loading |
| Annotations | HTTP POST | Auth refresh (`/auth/refresh`), annotation posting (`/annotations`) |

## Security

- Bearer token from request headers, refreshed via Annotations service
- JWT exp decoded (base64, no signature verification) — token validation is not performed locally
- No CORS configuration
- No rate limiting
- No input validation on media_id path parameter beyond string type

## Tests

None found.

# Module: onnx_engine

## Purpose

ONNX Runtime-based inference engine — CPU/CUDA fallback when TensorRT is unavailable.

## Public Interface

### Class: OnnxEngine (extends InferenceEngine)

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(bytes model_bytes, int batch_size=1, **kwargs)` | Loads ONNX model from bytes, creates InferenceSession with CUDA > CPU provider priority. Reads input shape and batch size from model metadata. |
| `get_input_shape` | `() -> tuple` | Returns `(height, width)` from input tensor shape |
| `get_batch_size` | `() -> int` | Returns batch size (from model if not dynamic, else from constructor) |
| `run` | `(input_data) -> list` | Runs session inference, returns output tensors |

## Internal Logic

- Provider order: `["CUDAExecutionProvider", "CPUExecutionProvider"]` — ONNX Runtime selects the best available.
- If the model's batch dimension is dynamic (-1), uses the constructor's `batch_size` parameter.
- Logs model input metadata and custom metadata map at init.
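
The dynamic-batch fallback rule can be sketched independently of ONNX Runtime (input shapes follow the usual NCHW layout; the helper name and shape values are illustrative assumptions):

```python
def resolve_batch_size(model_input_shape, constructor_batch_size):
    """model_input_shape: e.g. [batch, channels, height, width] as read
    from the ONNX input tensor. A dynamic batch dimension appears as -1
    (or a symbolic name); in that case fall back to the constructor value."""
    batch = model_input_shape[0]
    if isinstance(batch, int) and batch > 0:
        return batch                   # fixed batch size baked into the model
    return constructor_batch_size      # dynamic: use the caller's value
```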

## Dependencies

- **External**: `onnxruntime`
- **Internal**: `inference_engine` (base class), `constants_inf` (logging)

## Consumers

- `inference` — instantiated when no compatible NVIDIA GPU is found

## Data Models

None (wraps onnxruntime.InferenceSession).

## Configuration

None.

## External Integrations

None directly — model bytes are provided by caller (loaded via `loader_http_client`).

## Security

None.

## Tests

None found.

# Module: tensorrt_engine

## Purpose

TensorRT-based inference engine — high-performance GPU inference with CUDA memory management and ONNX-to-TensorRT model conversion.

## Public Interface

### Class: TensorRTEngine (extends InferenceEngine)

| Method | Signature | Description |
|--------|-----------|-------------|
| `__init__` | `(bytes model_bytes, int batch_size=4, **kwargs)` | Deserializes TensorRT engine from bytes, allocates CUDA input/output memory, creates execution context and stream |
| `get_input_shape` | `() -> tuple` | Returns `(height, width)` from input tensor shape |
| `get_batch_size` | `() -> int` | Returns batch size |
| `run` | `(input_data) -> list` | Async H2D copy → execute → D2H copy, returns output as numpy array |
| `get_gpu_memory_bytes` | `(int device_id) -> int` | Static. Returns total GPU memory in bytes (default 2GB if unavailable) |
| `get_engine_filename` | `(int device_id) -> str` | Static. Returns engine filename with compute capability and SM count: `azaion.cc_{major}.{minor}_sm_{count}.engine` |
| `convert_from_onnx` | `(bytes onnx_model) -> bytes or None` | Static. Converts ONNX model to TensorRT serialized engine. Uses 90% of GPU memory as workspace. Enables FP16 if supported. |

## Internal Logic

- Input shape defaults to 1280×1280 for dynamic dimensions.
- Output shape defaults to 300 max detections × 6 values (x1, y1, x2, y2, conf, cls) for dynamic dimensions.
- `run` uses async CUDA memory transfers with stream synchronization.
- `convert_from_onnx` uses explicit batch mode, configures FP16 precision when GPU supports it.
- Default batch size is 4 (vs OnnxEngine's 1).
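
The GPU-specific naming from `get_engine_filename` follows the pattern `azaion.cc_{major}.{minor}_sm_{count}.engine` and can be sketched without CUDA (the device query is replaced by explicit parameters; the example values are illustrative):

```python
def engine_filename(cc_major: int, cc_minor: int, sm_count: int) -> str:
    # Encodes compute capability and streaming-multiprocessor count so an
    # engine built on one GPU model is never reused on a different one.
    return f"azaion.cc_{cc_major}.{cc_minor}_sm_{sm_count}.engine"
```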

## Dependencies

- **External**: `tensorrt`, `pycuda.driver`, `pycuda.autoinit`, `pynvml`, `numpy`
- **Internal**: `inference_engine` (base class), `constants_inf` (logging)

## Consumers

- `inference` — instantiated when compatible NVIDIA GPU is found; also calls `convert_from_onnx` and `get_engine_filename`

## Data Models

None (wraps TensorRT runtime objects).

## Configuration

- Engine filename is GPU-specific (compute capability + SM count).
- Workspace memory is 90% of available GPU memory.

## External Integrations

None directly — model bytes provided by caller.

## Security

None.

## Tests

None found.