Mirror of https://github.com/azaion/detections.git (synced 2026-04-22 22:06:32 +00:00)
# Input Data Parameters

## Media Input

### Single Image Detection (POST /detect)
| Parameter | Type | Source | Description |
|---|---|---|---|
| file | bytes (multipart) | Client upload | Image file (JPEG, PNG, etc. — any format OpenCV can decode) |
| config | JSON string (optional) | Query/form field | AIConfigDto overrides |
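A minimal client-side sketch of assembling this multipart request. The form-field names `file` and `config` come from the table above; the filename, content type, and helper name are illustrative assumptions:

```python
import json


def build_detect_request(image_bytes: bytes, overrides: dict = None) -> dict:
    """Assemble keyword arguments suitable for e.g. requests.post("/detect", **kwargs).

    Sketch only: the exact multipart layout is assumed from the parameter table.
    """
    kwargs = {
        # the image goes in the `file` multipart field
        "files": {"file": ("frame.jpg", image_bytes, "image/jpeg")},
    }
    if overrides:
        # AIConfigDto overrides travel as a JSON string in the `config` field
        kwargs["data"] = {"config": json.dumps(overrides)}
    return kwargs
```

Only fields present in the override dict are sent; everything else falls back to server-side defaults.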
### Media Detection (POST /detect/{media_id})
| Parameter | Type | Source | Description |
|---|---|---|---|
| media_id | string | URL path | Identifier for media in the Loader service |
| AIConfigDto body | JSON (optional) | Request body | Configuration overrides |
| Authorization header | Bearer token | HTTP header | JWT for Annotations service |
| x-refresh-token header | string | HTTP header | Refresh token for JWT renewal |
Media files (images and videos) are resolved by the Inference pipeline via paths in the config. The Loader service provides model files, not media files directly.
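The auth-related parts of this endpoint can be sketched as follows; the header names come from the table above, while the helper name and URL shape are assumptions:

```python
def build_media_detect_request(media_id: str, jwt: str,
                               refresh_token: str = None,
                               overrides: dict = None):
    """Sketch of assembling a POST /detect/{media_id} call.

    Returns (url, kwargs) for use with an HTTP client such as requests.
    """
    url = f"/detect/{media_id}"
    # JWT for the Annotations service, per the parameter table
    headers = {"Authorization": f"Bearer {jwt}"}
    if refresh_token:
        # allows the service to renew an expired JWT
        headers["x-refresh-token"] = refresh_token
    kwargs = {"headers": headers}
    if overrides:
        kwargs["json"] = overrides  # optional AIConfigDto body
    return url, kwargs
```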
## Configuration Input (AIConfigDto / AIRecognitionConfig)
| Field | Type | Default | Range/Meaning |
|---|---|---|---|
| frame_period_recognition | int | 4 | Process every Nth video frame |
| frame_recognition_seconds | int | 2 | Minimum seconds between video annotations |
| probability_threshold | float | 0.25 | Minimum detection confidence (0..1) |
| tracking_distance_confidence | float | 0.0 | Movement threshold for tracking (model-width fraction) |
| tracking_probability_increase | float | 0.0 | Confidence increase threshold for tracking |
| tracking_intersection_threshold | float | 0.6 | Overlap ratio for NMS deduplication |
| model_batch_size | int | 1 | Inference batch size |
| big_image_tile_overlap_percent | int | 20 | Tile overlap for large images (0-100%) |
| altitude | float | 400 | Camera altitude in meters |
| focal_length | float | 24 | Camera focal length in mm |
| sensor_width | float | 23.5 | Camera sensor width in mm |
| paths | list[str] | [] | Media file paths to process |
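A minimal dataclass sketch mirroring the defaults in the table above, including a `from_dict` that ignores unknown keys. The real `AIRecognitionConfig` lives in the service and may differ in validation and field handling:

```python
from dataclasses import dataclass, field, fields


@dataclass
class AIRecognitionConfig:
    """Defaults taken from the configuration table; sketch only."""
    frame_period_recognition: int = 4
    frame_recognition_seconds: int = 2
    probability_threshold: float = 0.25
    tracking_distance_confidence: float = 0.0
    tracking_probability_increase: float = 0.0
    tracking_intersection_threshold: float = 0.6
    model_batch_size: int = 1
    big_image_tile_overlap_percent: int = 20
    altitude: float = 400.0
    focal_length: float = 24.0
    sensor_width: float = 23.5
    paths: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, d: dict) -> "AIRecognitionConfig":
        # keep only keys that map to known fields, drop the rest
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in d.items() if k in known})
```

Overrides merge field-by-field: any key absent from the incoming dict keeps its default.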
## Model Files
| File | Format | Source | Description |
|---|---|---|---|
| azaion.onnx | ONNX | Loader service | Base detection model |
| azaion.cc_{M}.{m}sm{N}.engine | TensorRT | Loader service (cached) | GPU-specific compiled engine |
## Static Data

### classes.json

An array of 19 objects, each with the following fields:
| Field | Type | Example | Description |
|---|---|---|---|
| Id | int | 0 | Class identifier |
| Name | string | "ArmorVehicle" | English class name |
| ShortName | string | "Броня" ("Armor") | Ukrainian short name |
| Color | string | "#ff0000" | Hex color for visualization |
| MaxSizeM | int | 8 | Maximum physical object size in meters |
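A small validator sketch for entries of this shape, using only the field table above (the function name and error messages are illustrative, not part of the service):

```python
import json

# required fields and their JSON types, per the classes.json table
REQUIRED_FIELDS = {"Id": int, "Name": str, "ShortName": str,
                   "Color": str, "MaxSizeM": int}


def validate_classes(raw: str) -> list:
    """Parse classes.json text and check each entry's shape."""
    entries = json.loads(raw)
    for entry in entries:
        for key, typ in REQUIRED_FIELDS.items():
            if not isinstance(entry.get(key), typ):
                raise ValueError(f"bad or missing field {key!r} in {entry!r}")
        # Color must be a hex string like "#ff0000"
        if not (entry["Color"].startswith("#") and len(entry["Color"]) == 7):
            raise ValueError(f"bad Color in class {entry['Id']}")
    return entries


SAMPLE = ('[{"Id": 0, "Name": "ArmorVehicle", "ShortName": "Броня",'
          ' "Color": "#ff0000", "MaxSizeM": 8}]')
```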
## Data Volumes
- Single image: up to tens of megapixels (aerial imagery). Large images are tiled.
- Video: processed frame-by-frame with configurable sampling rate.
- Model file: ONNX model size depends on architecture (typically 10-100 MB). TensorRT engines are GPU-specific compiled versions.
- Detection output: up to 300 detections per frame (model limit).
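The tiling of large images mentioned above can be sketched along one axis. This is a plausible reading of `big_image_tile_overlap_percent` (overlap as a percentage of the tile size), not the service's actual algorithm:

```python
def tile_origins(length: int, tile: int, overlap_percent: int) -> list:
    """Return tile start offsets along one image axis.

    Consecutive tiles overlap by roughly `overlap_percent` of the tile
    size; the last tile is shifted so it ends exactly at the image edge.
    """
    stride = max(1, tile * (100 - overlap_percent) // 100)
    origins = list(range(0, max(length - tile, 0) + 1, stride))
    if origins[-1] + tile < length:
        # make sure the final tile reaches the edge of the image
        origins.append(length - tile)
    return origins
```

With a 640 px tile and 20% overlap, a 1000 px axis yields tiles starting at 0 and 360, so the second tile covers the remainder with overlap.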
## Data Formats
| Data | Format | Serialization |
|---|---|---|
| API requests | HTTP multipart / JSON | Pydantic validation |
| API responses | JSON | Pydantic model_dump |
| SSE events | text/event-stream | JSON per event |
| Internal config | Python dict | AIRecognitionConfig.from_dict() |
| Legacy (unused) | msgpack | serialize() / from_msgpack() |
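The "JSON per event" SSE row can be illustrated with a one-line framing helper; the payload keys are hypothetical, only the `text/event-stream` framing is standard:

```python
import json


def sse_event(payload: dict) -> str:
    """Frame a JSON payload as one text/event-stream event.

    SSE events are `data: <line>` records terminated by a blank line.
    """
    return f"data: {json.dumps(payload)}\n\n"
```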