mirror of https://github.com/azaion/ai-training.git synced 2026-04-22 21:46:35 +00:00

Oleksandr Bezdieniezhnykh 142c6c4de8 Refactor constants management to use Pydantic BaseModel for configuration

- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.

2026-03-27 18:18:30 +02:00


Module: inference/inference

Purpose

High-level video inference pipeline. Orchestrates preprocessing → engine inference → postprocessing → visualization for object detection on video streams.

Public Interface

Inference

| Method | Signature | Returns | Description |
|---|---|---|---|
| `__init__` | `(engine: InferenceEngine, confidence_threshold, iou_threshold)` | - | Stores engine, thresholds, loads annotation classes |
| `preprocess` | `(frames: list) -> np.ndarray` | Batched blob tensor | Normalizes, resizes, and stacks frames into NCHW blob |
| `postprocess` | `(batch_frames, batch_timestamps, output) -> list[Annotation]` | Annotations per frame | Extracts detections from raw output, applies confidence filter and NMS |
| `process` | `(video: str)` | - | End-to-end: reads video → batched inference → draws + displays results |
| `draw` | `(annotation: Annotation)` | - | Draws bounding boxes with class labels on frame, shows via cv2.imshow |
| `remove_overlapping_detections` | `(detections: list[Detection]) -> list[Detection]` | Filtered list | Custom NMS: removes overlapping detections keeping higher confidence |
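
The frame handling inside `process` (sample every 4th frame, then group the sampled frames into engine-sized batches) can be sketched without OpenCV. This is a minimal illustration, not the module's code: `sample_and_batch`, its parameters, the choice to keep frames where `frame_count % stride == 0`, and the flushing of a trailing partial batch are all assumptions. In the real module the frames would come from `cv2.VideoCapture.read()`.

```python
from typing import Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")


def sample_and_batch(
    frames: Iterable[Frame], batch_size: int, stride: int = 4
) -> Iterator[List[Tuple[int, Frame]]]:
    """Yield batches of (frame_count, frame) pairs for every `stride`-th frame.

    Hypothetical helper: the name and signature are not taken from the module.
    """
    batch: List[Tuple[int, Frame]] = []
    for frame_count, frame in enumerate(frames):
        # Assumption: a frame is kept when its index is a multiple of `stride`.
        if frame_count % stride != 0:
            continue
        batch.append((frame_count, frame))
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Assumption: a trailing partial batch is still emitted rather than dropped.
    if batch:
        yield batch


if __name__ == "__main__":
    # Integers stand in for decoded frames.
    for group in sample_and_batch(range(20), batch_size=2):
        print([index for index, _ in group])
```

Keeping the frame index next to each frame makes it possible to derive a timestamp per annotation later, which is roughly what the `batch_timestamps` argument of `postprocess` suggests.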

Internal Logic

Video processing: Reads the video via cv2.VideoCapture, samples every 4th frame (selected by frame_count % 4), and groups the sampled frames into batches of the engine's batch size.
Preprocessing: Calls cv2.dnn.blobFromImage with 1/255 scaling, the model input size, and a BGR→RGB channel swap.
Postprocessing: Iterates over the raw output, drops detections below the confidence threshold, normalizes coordinates from model input space to [0,1], creates Detection objects, and applies the custom NMS.
Custom NMS: Compares detections pairwise by IoU. When two detections overlap above the IoU threshold, the one with higher confidence is kept; ties are broken in favor of the lower class ID.
Visualization: Draws colored rectangles and confidence labels in an OpenCV window, using each annotation class's color.
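
The custom NMS rule described above can be sketched as follows. This is a greedy variant written for illustration, so the module's actual loop may differ: the `Detection` dataclass here is a hypothetical stand-in for `inference/dto.Detection`, and its field names, the center-based box format, and the default threshold are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    # Hypothetical stand-in for inference/dto.Detection (field names assumed).
    x: float           # box center x, normalized to [0, 1]
    y: float           # box center y, normalized to [0, 1]
    w: float           # box width, normalized
    h: float           # box height, normalized
    cls: int           # class ID
    confidence: float  # detection score


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = a.x - a.w / 2, a.y - a.h / 2, a.x + a.w / 2, a.y + a.h / 2
    bx1, by1, bx2, by2 = b.x - b.w / 2, b.y - b.h / 2, b.x + b.w / 2, b.y + b.h / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


def remove_overlapping_detections(
    detections: List[Detection], iou_threshold: float = 0.3
) -> List[Detection]:
    """Keep the stronger detection of every pair that overlaps above the threshold."""
    # Highest confidence first; equal confidence is resolved by lower class ID.
    ranked = sorted(detections, key=lambda d: (-d.confidence, d.cls))
    kept: List[Detection] = []
    for candidate in ranked:
        # A candidate survives only if it does not overlap any already-kept box.
        if all(iou(candidate, k) <= iou_threshold for k in kept):
            kept.append(candidate)
    return kept
```

Sorting once and comparing each candidate only against the kept detections preserves the stated preference (higher confidence, then lower class ID) while avoiding a full comparison of every pair.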

Dependencies

inference/dto — Detection, Annotation, AnnotationClass
inference/onnx_engine — InferenceEngine ABC (type hint)
cv2 (external) — video I/O, image processing, display
numpy (external) — tensor operations

Consumers

start_inference

Data Models

Uses Detection, Annotation from inference/dto.

Configuration

confidence_threshold and iou_threshold are set at construction.

External Integrations

OpenCV video capture (file or stream input)
OpenCV GUI window for real-time display

Security

None.

Tests