Input Data Parameters
Media Input
Upload Detection (POST /detect)
| Parameter | Type | Source | Description |
|---|---|---|---|
| file | bytes (multipart) | Client upload | Image or video file (JPEG, PNG, MP4, MOV, etc.) |
| config | JSON string (optional) | Form field | AIConfigDto overrides |
| Authorization header | Bearer token (optional) | HTTP header | JWT for media lifecycle management |
| x-refresh-token header | string (optional) | HTTP header | Refresh token for JWT renewal |
When auth headers are present, the service computes an XxHash64 content hash of the upload, persists the file to VIDEOS_DIR or IMAGES_DIR, creates a media record via the Annotations API, and tracks processing status.
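As a rough illustration of the inputs listed above, the sketch below assembles the headers and form fields of an upload request without sending it. The helper name `build_detect_upload` and the layout of the returned dictionary are hypothetical; only the `file` and `config` field names and the two header names come from the table.

```python
import json
from typing import Any, Optional


def build_detect_upload(
    file_bytes: bytes,
    filename: str,
    config: Optional[dict[str, Any]] = None,
    access_token: Optional[str] = None,
    refresh_token: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the parts of a POST /detect request (hypothetical helper)."""
    headers: dict[str, str] = {}
    if access_token:
        # Optional JWT; per the table it enables media lifecycle management.
        headers["Authorization"] = f"Bearer {access_token}"
    if refresh_token:
        headers["x-refresh-token"] = refresh_token

    form: dict[str, str] = {}
    if config is not None:
        # The config overrides travel as a JSON string in a form field.
        form["config"] = json.dumps(config)

    return {
        "method": "POST",
        "path": "/detect",
        "headers": headers,
        "files": {"file": (filename, file_bytes)},
        "data": form,
    }


request_parts = build_detect_upload(
    b"\xff\xd8\xff",  # placeholder bytes, not a real image
    "frame.jpg",
    config={"probability_threshold": 0.4},
    access_token="example-jwt",
)
```

The `headers`, `files`, and `data` entries are shaped so that they could be handed to a typical HTTP client; which client the callers of this service actually use is not stated here.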
Media Detection (POST /detect/{media_id})
| Parameter | Type | Source | Description |
|---|---|---|---|
| media_id | string | URL path | Identifier for media in the Annotations service |
| AIConfigDto body | JSON (optional) | Request body | Configuration overrides (merged with DB settings) |
| Authorization header | Bearer token | HTTP header | JWT for Annotations service |
| x-refresh-token header | string | HTTP header | Refresh token for JWT renewal |
The media path is resolved from the Annotations service via GET /api/media/{media_id}. AI settings are fetched from GET /api/users/{user_id}/ai-settings and then merged with any client overrides.
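The merge step can be pictured as a dictionary overlay. The sketch below is a minimal version under two assumptions that the text does not confirm: client overrides take precedence over stored settings, and a `None` value in the overrides means "not overridden". The function name `merge_ai_settings` is hypothetical.

```python
from typing import Any, Mapping


def merge_ai_settings(
    db_settings: Mapping[str, Any],
    client_overrides: Mapping[str, Any],
) -> dict[str, Any]:
    """Overlay client overrides on top of stored settings (assumed precedence)."""
    merged = dict(db_settings)
    for key, value in client_overrides.items():
        if value is not None:  # assumption: None means the client did not set it
            merged[key] = value
    return merged


stored = {"probability_threshold": 0.25, "model_batch_size": 8}
overrides = {"probability_threshold": 0.5, "model_batch_size": None}
effective = merge_ai_settings(stored, overrides)
```

Here `effective` keeps the stored batch size and takes the client's threshold.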
Configuration Input (AIConfigDto / AIRecognitionConfig)
| Field | Type | Default | Range/Meaning |
|---|---|---|---|
| frame_period_recognition | int | 4 | Process every Nth video frame |
| frame_recognition_seconds | int | 2 | Minimum seconds between video annotations |
| probability_threshold | float | 0.25 | Minimum detection confidence (0..1) |
| tracking_distance_confidence | float | 0.0 | Movement threshold for tracking (model-width fraction) |
| tracking_probability_increase | float | 0.0 | Confidence increase threshold for tracking |
| tracking_intersection_threshold | float | 0.6 | Overlap ratio for NMS deduplication |
| model_batch_size | int | 8 | Inference batch size |
| big_image_tile_overlap_percent | int | 20 | Tile overlap for large images (0-100%) |
| altitude | float | 400 | Camera altitude in meters |
| focal_length | float | 24 | Camera focal length in mm |
| sensor_width | float | 23.5 | Camera sensor width in mm |
The paths field was removed in AZ-174; media paths are now resolved via the Annotations service.
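To make the defaults and ranges in the table concrete, here is a minimal sketch of a configuration object with a `from_dict` constructor. The class name `RecognitionConfigSketch`, the `validate` method, and the choice to silently ignore unknown keys (such as a leftover `paths` field) are assumptions for illustration and may differ from the real `AIRecognitionConfig`.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass
class RecognitionConfigSketch:
    # Defaults copied from the table above.
    frame_period_recognition: int = 4
    frame_recognition_seconds: int = 2
    probability_threshold: float = 0.25
    tracking_distance_confidence: float = 0.0
    tracking_probability_increase: float = 0.0
    tracking_intersection_threshold: float = 0.6
    model_batch_size: int = 8
    big_image_tile_overlap_percent: int = 20
    altitude: float = 400.0
    focal_length: float = 24.0
    sensor_width: float = 23.5

    @classmethod
    def from_dict(cls, data: Mapping[str, Any]) -> "RecognitionConfigSketch":
        """Build a config from a dict, falling back to defaults."""
        known = {f.name for f in fields(cls)}
        # Assumption: unknown keys and None values are ignored.
        kwargs = {k: v for k, v in data.items() if k in known and v is not None}
        cfg = cls(**kwargs)
        cfg.validate()
        return cfg

    def validate(self) -> None:
        """Check the ranges that the table states explicitly."""
        if not 0.0 <= self.probability_threshold <= 1.0:
            raise ValueError("probability_threshold must be within 0..1")
        if not 0 <= self.big_image_tile_overlap_percent <= 100:
            raise ValueError("big_image_tile_overlap_percent must be within 0..100")
        if self.frame_period_recognition < 1:
            raise ValueError("frame_period_recognition must be at least 1")


cfg = RecognitionConfigSketch.from_dict(
    {"probability_threshold": 0.5, "paths": ["ignored.mp4"]}
)
```

Only the ranges the table spells out are checked; any further constraints would need to be confirmed against the service code.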
Model Files
| File | Format | Source | Description |
|---|---|---|---|
| azaion.onnx | ONNX | Loader service | Base detection model |
| azaion.cc_{M}.{m}sm{N}.engine | TensorRT | Loader service (cached) | GPU-specific compiled engine |
Static Data
classes.json
Array of 19 objects, each with:
| Field | Type | Example | Description |
|---|---|---|---|
| Id | int | 0 | Class identifier |
| Name | string | "ArmorVehicle" | English class name |
| ShortName | string | "Броня" | Ukrainian short name |
| Color | string | "#ff0000" | Hex color for visualization |
| MaxSizeM | int | 8 | Maximum physical object size in meters |
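The record layout above could be loaded along these lines. This is a sketch only: the `DetectionClass` type and `parse_classes` function are hypothetical names, and the sample entry is assembled from the Example column of the table rather than copied from the real file.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionClass:
    id: int
    name: str
    short_name: str
    color: str
    max_size_m: int


def parse_classes(raw_json: str) -> dict[int, DetectionClass]:
    """Parse a classes.json-style array into a lookup keyed by class Id."""
    classes: dict[int, DetectionClass] = {}
    for entry in json.loads(raw_json):
        item = DetectionClass(
            id=int(entry["Id"]),
            name=entry["Name"],
            short_name=entry["ShortName"],
            color=entry["Color"],
            max_size_m=int(entry["MaxSizeM"]),
        )
        classes[item.id] = item
    return classes


# One entry built from the example values in the table.
sample = (
    '[{"Id": 0, "Name": "ArmorVehicle", "ShortName": "Броня", '
    '"Color": "#ff0000", "MaxSizeM": 8}]'
)
classes_by_id = parse_classes(sample)
```

Keying the lookup by `Id` makes it cheap to map a numeric class index from the model output to its display name and color.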
Data Volumes
- Single image: up to tens of megapixels (aerial imagery). Large images are tiled.
- Video: processed frame-by-frame with configurable sampling rate. Decoded from in-memory bytes via PyAV.
- Model file: ONNX model size depends on architecture (typically 10-100 MB). TensorRT engines are GPU-specific compiled versions.
- Detection output: up to 300 detections per frame (model limit).
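The tiling and frame sampling mentioned in the list above can be sketched as two small helpers. Both are illustrations under assumptions: the 1280-pixel tile size is a made-up value, the last tile is placed flush with the image edge, sampling is assumed to start at frame 0, and a 100% overlap is rejected as degenerate. The service's actual tiling strategy may differ.

```python
def tile_origins(length: int, tile: int, overlap_percent: int) -> list[int]:
    """Start offsets of tiles covering [0, length) along one axis."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    if not 0 <= overlap_percent < 100:
        # Assumption: 100% overlap would give a zero stride, so reject it.
        raise ValueError("overlap_percent must be within 0..99")
    if length <= tile:
        return [0]
    stride = max(1, tile * (100 - overlap_percent) // 100)
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # final tile ends exactly at the edge
    return origins


def sampled_frame_indices(total_frames: int, frame_period: int) -> list[int]:
    """Indices of frames to process when taking every Nth frame from frame 0."""
    if frame_period < 1:
        raise ValueError("frame_period must be at least 1")
    return list(range(0, total_frames, frame_period))


# Hypothetical 3000-pixel-wide image, 1280-pixel tiles, default 20% overlap.
x_origins = tile_origins(3000, 1280, 20)   # [0, 1024, 1720]
frames = sampled_frame_indices(10, 4)      # [0, 4, 8]
```

Applying `tile_origins` once per axis and taking the product of the two lists gives the top-left corner of every tile.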
Data Formats
| Data | Format | Serialization |
|---|---|---|
| API requests | HTTP multipart / JSON | Pydantic validation |
| API responses | JSON | Pydantic model_dump |
| SSE events | text/event-stream | JSON per event |
| Internal config | Python dict | AIRecognitionConfig.from_dict() |
| Content hash | XxHash64 hex string | 16-char hex digest |
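For the SSE row, a client has to split the `text/event-stream` body into events and decode the JSON carried in each one. The sketch below follows the general server-sent events framing (events end at a blank line, `data:` lines are joined with newlines, lines starting with `:` are comments). The function name `iter_sse_json` and the payload fields in the sample stream are hypothetical; the real event schema is not described in this section.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[Any]:
    """Yield the decoded JSON payload of each event in an SSE line stream."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    # An unterminated trailing event is dropped, as in the SSE framing rules.


# Hypothetical stream; the payload keys are invented for this example.
stream = [
    'data: {"status": "started"}\n',
    "\n",
    ": keep-alive\n",
    "event: update\n",
    'data: {"status": "done",\n',
    'data:  "count": 3}\n',
    "\n",
]
events = list(iter_sse_json(stream))
```

Fields other than `data` (such as `event` or `id`) are skipped here to keep the sketch short.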