Oleksandr Bezdieniezhnykh 1fe9425aa8 [AZ-172] Update documentation for distributed architecture, add Update Docs step to workflow
- Update module docs: main, inference, ai_config, loader_http_client
- Add new module doc: media_hash
- Update component docs: inference_pipeline, api
- Update system-flows (F2, F3) and data_parameters
- Add Task Mode to document skill for incremental doc updates
- Insert Step 11 (Update Docs) in existing-code flow, renumber 11-13 to 12-14

Made-with: Cursor
2026-03-31 17:25:58 +03:00

# Azaion.Detections — System Flows
## Flow Inventory
| # | Flow Name | Trigger | Primary Components | Criticality |
|---|-----------|---------|-------------------|-------------|
| F1 | Health Check | Client GET /health | API, Inference Pipeline | High |
| F2 | Upload Detection (Image/Video) | Client POST /detect | API, Inference Pipeline, Engines, Domain, Annotations | High |
| F3 | Media Detection (Async, DB-Driven) | Client POST /detect/{media_id} | API, Inference Pipeline, Engines, Domain, Annotations | High |
| F4 | SSE Event Streaming | Client GET /detect/stream | API | Medium |
| F5 | Engine Initialization | First detection request | Inference Pipeline, Engines, Loader | High |
| F6 | TensorRT Background Conversion | No pre-built TensorRT engine | Inference Pipeline, Engines, Loader | Medium |
## Flow Dependencies
| Flow | Depends On | Shares Data With |
|------|-----------|-----------------|
| F1 | F5 (for meaningful status) | — |
| F2 | F5 (engine must be ready) | Annotations (media lifecycle) |
| F3 | F5 (engine must be ready) | F4 (via SSE event queues), Annotations (settings, media lifecycle) |
| F4 | — | F3 (receives events) |
| F5 | — | F6 (triggers conversion if needed) |
| F6 | F5 (triggered when the pre-built TensorRT engine cannot be downloaded) | F5 (provides converted bytes) |
---
## Flow F1: Health Check
### Description
Client queries the service health status. Returns the current AI engine availability (None, Downloading, Converting, Enabled, Error, etc.) without triggering engine initialization.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant API as main.py
participant INF as Inference
participant STATUS as AIAvailabilityStatus
Client->>API: GET /health
API->>INF: get_inference()
INF-->>API: Inference instance
API->>STATUS: str(ai_availability_status)
STATUS-->>API: "Enabled" / "Downloading" / etc.
API-->>Client: HealthResponse{status, aiAvailability, errorMessage}
```
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Inference not yet created | get_inference() | Exception caught | Returns aiAvailability="None" |
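
The fallback in the table above can be sketched as follows. This is a minimal illustration, not the service's actual handler: the `HealthResponse` field names come from the diagram, while `build_health_response` and the `"healthy"` status value are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HealthResponse:
    status: str
    aiAvailability: str
    errorMessage: Optional[str] = None


def build_health_response(get_inference: Callable[[], object]) -> HealthResponse:
    """Report engine availability without triggering engine initialization."""
    try:
        inference = get_inference()
        availability = str(inference.ai_availability_status)
    except Exception:
        # Inference has not been created yet: report "None" instead of failing.
        availability = "None"
    # "healthy" is a placeholder value assumed for this sketch.
    return HealthResponse(status="healthy", aiAvailability=availability)
```

The handler only reads the status object, so a health probe never starts a model download or conversion.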
---
## Flow F2: Upload Detection (Image or Video)
### Description
Client uploads a media file (image or video) and optionally provides config and auth tokens. The service detects the media kind, manages the media lifecycle (hashing, storage, record creation, status tracking), runs inference via a ThreadPoolExecutor while the request stays open, and returns the detection results in the response.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant API as main.py
participant HASH as media_hash
participant ANN as Annotations Service
participant INF as Inference
participant ENG as Engine (ONNX/TRT)
participant CONST as constants_inf
Client->>API: POST /detect (file + config? + auth?)
API->>API: Read bytes, detect kind (image/video)
API->>API: Validate image data (cv2.imdecode)
opt Authenticated user
API->>HASH: compute_media_content_hash(bytes)
HASH-->>API: content_hash
API->>API: Persist file to VIDEOS_DIR/IMAGES_DIR
API->>ANN: POST /api/media (create record)
API->>ANN: PUT /api/media/{id}/status (AI_PROCESSING)
end
alt Image
API->>INF: run_detect_image(bytes, ai_config, name, callback)
else Video
API->>INF: run_detect_video(bytes, ai_config, name, path, callback)
end
INF->>INF: init_ai() (idempotent)
INF->>ENG: process_frames(batch)
ENG-->>INF: raw output
INF->>INF: postprocess → filter → callbacks
INF-->>API: results via callback
opt Authenticated user
API->>ANN: PUT /api/media/{id}/status (AI_PROCESSED)
end
API->>CONST: annotations_dict[cls].name (label lookup)
API-->>Client: list[DetectionDto]
```
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Empty upload | API | len(bytes)==0 | 400 Bad Request |
| Invalid image data | cv2.imdecode | returns None | 400 Bad Request |
| Unrecognized format | _detect_upload_kind | cv2+PyAV probe fails | 400 Bad Request |
| Engine not available | init_ai | engine is None | 503 Service Unavailable |
| Inference failure | run/postprocess | RuntimeError | 422 Unprocessable Entity |
| Media record failure | _post_media_record | exception caught | Silently continues |
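
One way to picture the table above is a single function that maps failures to HTTP status codes. The exception classes `UploadRejected` and `EngineUnavailable` are hypothetical names introduced for this sketch, and the 500 fallback is an assumption; only the 400/503/422 codes come from the table.

```python
class UploadRejected(ValueError):
    """Hypothetical: empty upload, undecodable image, or unrecognized format."""


class EngineUnavailable(Exception):
    """Hypothetical: init_ai finished but no engine is loaded."""


def http_status_for(exc: Exception) -> int:
    """Translate a failure in the upload flow into an HTTP status code."""
    if isinstance(exc, UploadRejected):
        return 400  # Bad Request
    if isinstance(exc, EngineUnavailable):
        return 503  # Service Unavailable
    if isinstance(exc, RuntimeError):
        return 422  # Unprocessable Entity (inference or postprocess failure)
    return 500  # assumed default for anything the table does not cover
```

Media record failures are absent from the mapping on purpose: per the table they are caught and the request continues.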
---
## Flow F3: Media Detection (Async, DB-Driven)
### Description
Client triggers detection on a media file resolved from the Annotations service. AI settings are fetched from the user's DB profile and merged with client overrides. Processing runs asynchronously. Results are streamed via SSE (F4) and optionally posted to the Annotations service. Media status is tracked throughout.
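
The settings merge described above might look like the sketch below. The function name `merge_ai_settings`, the key names in the usage note, and the rule that `None` overrides are ignored are all assumptions for illustration; the source only states that DB settings are merged with client overrides.

```python
from typing import Any, Mapping, Optional


def merge_ai_settings(
    db_settings: Mapping[str, Any],
    overrides: Optional[Mapping[str, Any]] = None,
) -> dict:
    """Start from the user's stored settings and apply client overrides on top."""
    merged = dict(db_settings)
    for key, value in (overrides or {}).items():
        # Assumption: an override left as None means "keep the stored value".
        if value is not None:
            merged[key] = value
    return merged
```

For example, `merge_ai_settings({"threshold": 0.5, "frame_step": 4}, {"threshold": 0.7})` keeps the stored `frame_step` and replaces only `threshold` (both key names are hypothetical).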
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant API as main.py
participant ANN as Annotations Service
participant INF as Inference
participant ENG as Engine
participant SSE as SSE Queues
Client->>API: POST /detect/{media_id} (config? + auth headers)
API->>API: Check _active_detections (duplicate guard)
API->>ANN: GET /api/users/{user_id}/ai-settings
ANN-->>API: AI settings (merged with overrides)
API->>ANN: GET /api/media/{media_id}
ANN-->>API: media path
API-->>Client: {"status": "started"}
Note over API: asyncio.Task created
API->>API: Read file bytes from resolved path
API->>ANN: PUT /api/media/{id}/status (AI_PROCESSING)
alt Video file
API->>INF: run_detect_video(bytes, config, name, path, callbacks)
else Image file
API->>INF: run_detect_image(bytes, config, name, callbacks)
end
loop For each valid annotation
INF->>API: on_annotation(annotation, percent)
API->>SSE: DetectionEvent → all queues
opt Auth token present
API->>ANN: POST /annotations (detections + image)
end
end
INF->>API: on_status(media_name, count)
API->>SSE: DetectionEvent(status=AIProcessed, percent=100)
API->>ANN: PUT /api/media/{id}/status (AI_PROCESSED)
```
### Data Flow
| Step | From | To | Data | Format |
|------|------|----|------|--------|
| 1 | Client | API | media_id, config, auth tokens | HTTP POST JSON + headers |
| 2 | API | Annotations | user AI settings request | HTTP GET |
| 3 | API | Annotations | media path request | HTTP GET |
| 4 | API | Annotations | media status update (AI_PROCESSING) | HTTP PUT JSON |
| 5 | API | Inference | file bytes, config, callbacks | bytes + AIRecognitionConfig + callables |
| 6 | Inference | Engine | preprocessed batch | numpy ndarray |
| 7 | Engine | Inference | raw detections | numpy ndarray |
| 8 | Inference | API (callback) | Annotation + percent | Python objects |
| 9 | API | SSE clients | DetectionEvent | SSE JSON stream |
| 10 | API | Annotations Service | CreateAnnotationRequest | HTTP POST JSON |
| 11 | API | Annotations | media status update (AI_PROCESSED) | HTTP PUT JSON |
**Step 10: Annotations POST detail**
Fired once per detection batch when an auth token is present. The request to `POST {ANNOTATIONS_URL}/annotations` carries:
```json
{
  "mediaId": "string",
  "source": 0,
  "videoTime": "00:01:23",
  "detections": [
    {
      "centerX": 0.56, "centerY": 0.67,
      "width": 0.25, "height": 0.22,
      "classNum": 3, "label": "ArmorVehicle",
      "confidence": 0.92
    }
  ],
  "image": "<base64 encoded frame bytes, optional>"
}
```
`userId` is not included; the Annotations service resolves the user from the JWT. The Annotations API contract also accepts `description`, `affiliation`, and `combatReadiness` on each detection, but Detections does not populate these.
The `Authorization: Bearer {accessToken}` header is forwarded from the original client request. For long-running video, the token is auto-refreshed via `POST {ANNOTATIONS_URL}/auth/refresh`.
The Annotations service responds with 201 on success, 400 if neither image nor mediaId is provided, and 404 if the mediaId is unknown. On the Annotations side, the saved annotation triggers an SSE notification to the UI and an enqueue to the RabbitMQ sync pipeline (unless SilentDetection mode is active).
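
As an illustration of the request body, the sketch below assembles the payload from plain Python values. The helper name `build_annotation_request` is hypothetical and the `source` value is simply copied from the JSON example; the field names follow the contract shown in this section.

```python
import base64
from typing import Optional, Sequence


def build_annotation_request(
    media_id: str,
    video_time: str,
    detections: Sequence[dict],
    frame_bytes: Optional[bytes] = None,
) -> dict:
    """Build the JSON body for POST {ANNOTATIONS_URL}/annotations."""
    payload = {
        "mediaId": media_id,
        "source": 0,  # value taken from the example; its meaning is not specified here
        "videoTime": video_time,
        "detections": [dict(d) for d in detections],
    }
    if frame_bytes is not None:
        # The image is optional and travels as base64 text.
        payload["image"] = base64.b64encode(frame_bytes).decode("ascii")
    return payload
```

`userId` is deliberately absent, since the Annotations service resolves the user from the JWT.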
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Duplicate media_id | API | _active_detections check | 409 Conflict |
| Engine unavailable | run_detect | engine is None | Error event pushed to SSE |
| Inference failure | processing | Exception | Error event pushed to SSE, media_id cleared |
| Annotations POST failure | _post_annotation | Exception | Silently caught, detection continues |
| Annotations 404 | _post_annotation | MediaId not found in Annotations DB | Silently caught, detection continues |
| Token refresh failure | TokenManager | Exception on /auth/refresh | Silently caught, subsequent POSTs may fail with 401 |
| SSE queue full | event broadcast | QueueFull | Event silently dropped for that client |
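
The "SSE queue full" row can be sketched as a non-blocking broadcast. This is an assumed shape, not the service's code: `broadcast` is a hypothetical helper, and the only behaviour taken from the table is that a full queue drops the event for that client.

```python
import asyncio
from typing import Iterable


def broadcast(event: dict, queues: Iterable[asyncio.Queue]) -> int:
    """Push an event to every client queue; return how many accepted it."""
    delivered = 0
    for queue in queues:
        try:
            queue.put_nowait(event)
            delivered += 1
        except asyncio.QueueFull:
            # A slow client loses this event; other clients are unaffected.
            pass
    return delivered
```

Using `put_nowait` rather than `await queue.put(...)` means one stalled client cannot hold up the detection callback.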
---
## Flow F4: SSE Event Streaming
### Description
Client opens a persistent SSE connection and receives real-time detection events from all active F3 media detection tasks.
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant API as main.py
participant Queue as asyncio.Queue
Client->>API: GET /detect/stream
API->>Queue: Create queue (maxsize=100)
API->>API: Add to _event_queues
loop Until disconnect
Queue-->>API: await event
API-->>Client: data: {DetectionEvent JSON}
end
Note over API: Client disconnects (CancelledError)
API->>API: Remove from _event_queues
```
---
## Flow F5: Engine Initialization
### Description
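On the first detection request, the Inference class initializes the ML engine. On a GPU host it first tries to download a pre-built TensorRT engine; if that fails, it downloads the ONNX model and starts a background TensorRT conversion (F6). Without a GPU it loads the ONNX engine directly. Converted bytes left by F6 are loaded on the next `init_ai` call.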
### Flowchart
```mermaid
flowchart TD
Start([init_ai called]) --> CheckEngine{engine exists?}
CheckEngine -->|Yes| Done([Return])
CheckEngine -->|No| CheckBuilding{is_building_engine?}
CheckBuilding -->|Yes| Done
CheckBuilding -->|No| CheckConverted{_converted_model_bytes?}
CheckConverted -->|Yes| LoadConverted[Load TensorRT from bytes]
LoadConverted --> SetEnabled[status = ENABLED]
SetEnabled --> Done
CheckConverted -->|No| CheckGPU{GPU available?}
CheckGPU -->|Yes| DownloadTRT[Download pre-built TensorRT engine]
DownloadTRT --> TRTSuccess{Success?}
TRTSuccess -->|Yes| LoadTRT[Create TensorRTEngine]
LoadTRT --> SetEnabled
TRTSuccess -->|No| DownloadONNX[Download ONNX model]
DownloadONNX --> StartConversion[Start background thread: convert ONNX→TRT]
StartConversion --> Done
CheckGPU -->|No| DownloadONNX2[Download ONNX model]
DownloadONNX2 --> LoadONNX[Create OnnxEngine]
LoadONNX --> Done
```
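
The branching in the flowchart can be restated as a pure decision function, which is easy to test in isolation. `InitAction` and `choose_init_action` are hypothetical names for this sketch; the real `init_ai` performs the downloads and engine construction inline.

```python
from enum import Enum


class InitAction(Enum):
    RETURN = "return"
    LOAD_CONVERTED_TRT = "load_converted_trt"
    LOAD_PREBUILT_TRT = "load_prebuilt_trt"
    CONVERT_ONNX_IN_BACKGROUND = "convert_onnx_in_background"
    LOAD_ONNX = "load_onnx"


def choose_init_action(
    *,
    engine_exists: bool,
    is_building_engine: bool,
    has_converted_bytes: bool,
    gpu_available: bool,
    prebuilt_trt_downloaded: bool,
) -> InitAction:
    """Mirror the flowchart: each check is evaluated in the same order."""
    if engine_exists or is_building_engine:
        return InitAction.RETURN
    if has_converted_bytes:
        return InitAction.LOAD_CONVERTED_TRT
    if not gpu_available:
        return InitAction.LOAD_ONNX
    if prebuilt_trt_downloaded:
        return InitAction.LOAD_PREBUILT_TRT
    # GPU present but no pre-built engine: download ONNX and convert (F6).
    return InitAction.CONVERT_ONNX_IN_BACKGROUND
```

The early return while a build is in progress is what makes repeated `init_ai` calls safe during conversion.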
---
## Flow F6: TensorRT Background Conversion
### Description
When no pre-built TensorRT engine exists, a background daemon thread converts the ONNX model to TensorRT, uploads the result to Loader for caching, and stores the bytes for the next `init_ai` call.
### Sequence Diagram
```mermaid
sequenceDiagram
participant INF as Inference
participant TRT as TensorRTEngine
participant LDR as Loader Service
participant STATUS as AIAvailabilityStatus
Note over INF: Background thread starts
INF->>STATUS: set_status(CONVERTING)
INF->>TRT: convert_from_onnx(onnx_bytes)
TRT->>TRT: Build TensorRT engine (90% GPU memory workspace)
TRT-->>INF: engine_bytes
INF->>STATUS: set_status(UPLOADING)
INF->>LDR: upload_big_small_resource(engine_bytes, filename)
LDR-->>INF: LoadResult
INF->>INF: _converted_model_bytes = engine_bytes
INF->>STATUS: set_status(ENABLED)
Note over INF: Next init_ai() call will load from _converted_model_bytes
```
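
The sequence above can be sketched with a daemon thread and injected `convert` and `upload` callables. `ConversionState` and `start_background_conversion` are hypothetical names, statuses are plain strings here, and the `"ERROR"` branch is an assumption, since the diagram shows only the success path.

```python
import threading
from typing import Callable, Optional


class ConversionState:
    """Stand-in for the fields the Inference class tracks during conversion."""

    def __init__(self) -> None:
        self.status = "NONE"
        self.is_building_engine = False
        self.converted_model_bytes: Optional[bytes] = None


def start_background_conversion(
    state: ConversionState,
    onnx_bytes: bytes,
    convert: Callable[[bytes], bytes],
    upload: Callable[[bytes], object],
) -> threading.Thread:
    def _worker() -> None:
        try:
            state.status = "CONVERTING"
            engine_bytes = convert(onnx_bytes)
            state.status = "UPLOADING"
            upload(engine_bytes)
            # Kept in memory so the next init_ai() call can load it.
            state.converted_model_bytes = engine_bytes
            state.status = "ENABLED"
        except Exception:
            state.status = "ERROR"  # assumed failure handling
        finally:
            state.is_building_engine = False

    state.is_building_engine = True
    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread
```

Setting `is_building_engine` before the thread starts closes the window in which a second `init_ai` call could launch a duplicate conversion.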