# Azaion.Detections — Documentation Report

## Executive Summary

Azaion.Detections is a Python/Cython microservice for automated aerial object detection. It exposes a FastAPI HTTP API that accepts images and video, runs YOLO-based inference through TensorRT (GPU) or ONNX Runtime (CPU fallback), and returns structured detection results. The system supports tiling of large aerial images with ground sampling distance (GSD)-based tile sizing, real-time video processing with frame sampling and tracking heuristics, and Server-Sent Events (SSE) streaming for live detection updates.

The codebase consists of 10 modules (2 Python, 8 Cython) organized into 4 components. It integrates with two external services: a Loader service for model storage and an Annotations service for result persistence. The system has no tests, no containerization config in this repo, and several legacy artifacts from a prior RabbitMQ-based architecture.

## Problem Statement

Automated detection of military and infrastructure objects (19 classes, including vehicles, artillery, trenches, personnel, and camouflage) from aerial imagery and video feeds. Replaces manual analyst review with real-time AI-powered detection, enabling rapid situational awareness for reconnaissance operations.

## Architecture Overview

**Tech stack**: Python 3 + Cython 3.1.3 | FastAPI + Uvicorn | ONNX Runtime 1.22.0 | TensorRT 10.11.0 | OpenCV 4.10.0 | NumPy 2.3.0

**Key architectural decisions**:

1. Cython for performance-critical inference loops
2. Dual engine strategy (TensorRT + ONNX fallback) with automatic conversion and caching
3. Lazy engine initialization for fast API startup
4. GSD-based image tiling for large aerial images

## Component Summary

| # | Component | Modules | Purpose | Dependencies |
|---|-----------|---------|---------|--------------|
| 01 | Domain | constants_inf, ai_config, ai_availability_status, annotation | Shared data models, enums, constants, logging, class registry | None (foundation) |
| 02 | Inference Engines | inference_engine, onnx_engine, tensorrt_engine | Pluggable ML inference backends (Strategy pattern) | Domain |
| 03 | Inference Pipeline | inference, loader_http_client | Engine lifecycle, media preprocessing/postprocessing, model loading | Domain, Engines |
| 04 | API | main | HTTP endpoints, SSE streaming, auth token management | Domain, Pipeline |

## System Flows

| # | Flow | Trigger | Description |
|---|------|---------|-------------|
| F1 | Health Check | GET /health | Returns AI engine availability status |
| F2 | Single Image Detection | POST /detect | Synchronous image inference, returns detections |
| F3 | Media Detection (Async) | POST /detect/{media_id} | Background processing with SSE streaming + Annotations posting |
| F4 | SSE Streaming | GET /detect/stream | Real-time event delivery to connected clients |
| F5 | Engine Initialization | First detection request | TensorRT → ONNX fallback → background conversion |
| F6 | TensorRT Conversion | No cached engine | Background ONNX→TensorRT conversion and upload |

## Risk Observations

| Risk | Severity | Source |
|------|----------|--------|
| No tests in the codebase | High | Verification (Step 4) |
| No CORS, rate limiting, or request size limits | Medium | Security review (main.py) |
| JWT handled without signature verification | Medium | Security review (main.py) |
| Legacy unused code (serialize, from_msgpack, queue declarations) | Low | Verification (Step 4) |
| No graceful shutdown for in-progress detections | Medium | Architecture review |
| Single-instance in-memory state (`_active_detections`, `_event_queues`) | Medium | Scalability review |
| No Dockerfile or CI/CD config in this repository | Low | Infrastructure review |
| classes.json must exist at startup — no fallback | Low | Reliability review |
| Hardcoded 1280×1280 default for dynamic TensorRT dimensions | Low | Flexibility review |

## Open Questions

1. Where is the Dockerfile / docker-compose.yml for this service? Likely in a separate infrastructure repository.
2. Is the legacy RabbitMQ code (serialize methods, from_msgpack, queue constants in .pxd) planned for removal?
3. What is the intended scaling model — single instance per GPU, or horizontal scaling with shared state?
4. Should JWT signature verification be added at the detection service level, or is the current pass-through approach intentional?
5. Are there integration or end-to-end tests in a separate repository?

## Artifact Index

| Path | Description |
|------|-------------|
| `_docs/00_problem/problem.md` | Problem statement |
| `_docs/00_problem/restrictions.md` | System restrictions and constraints |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
| `_docs/00_problem/input_data/data_parameters.md` | Input data schemas and parameters |
| `_docs/01_solution/solution.md` | Solution description and assessment |
| `_docs/02_document/00_discovery.md` | Codebase discovery (tech stack, dependency graph) |
| `_docs/02_document/modules/*.md` | Per-module documentation (10 modules) |
| `_docs/02_document/components/01_domain/description.md` | Domain component spec |
| `_docs/02_document/components/02_inference_engines/description.md` | Inference Engines component spec |
| `_docs/02_document/components/03_inference_pipeline/description.md` | Inference Pipeline component spec |
| `_docs/02_document/components/04_api/description.md` | API component spec |
| `_docs/02_document/diagrams/components.md` | Component relationship diagram |
| `_docs/02_document/architecture.md` | System architecture document |
| `_docs/02_document/system-flows.md` | System flow diagrams and descriptions |
| `_docs/02_document/data_model.md` | Data model with ERD |
| `_docs/02_document/04_verification_log.md` | Verification pass results |
| `_docs/02_document/FINAL_report.md` | This report |
| `_docs/02_document/state.json` | Documentation process state |
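## Appendix: Engine Initialization Sketch

The following is a minimal illustrative sketch of the dual-engine strategy and lazy initialization described in the Architecture Overview (flows F5/F6). All class and method names here (`LazyEnginePool`, `infer`, etc.) are hypothetical stand-ins, not the service's actual API; the real backends live in the `inference_engine`, `onnx_engine`, and `tensorrt_engine` modules.

```python
from abc import ABC, abstractmethod


class InferenceEngine(ABC):
    """Hypothetical Strategy interface for pluggable inference backends."""

    @abstractmethod
    def infer(self, image):
        ...


class TensorRTEngine(InferenceEngine):
    def __init__(self):
        # Stand-in for TensorRT engine deserialization, which can fail
        # when no cached engine exists or no GPU is present.
        raise RuntimeError("no cached TensorRT engine")

    def infer(self, image):
        return {"backend": "tensorrt"}


class ONNXEngine(InferenceEngine):
    def infer(self, image):
        return {"backend": "onnx"}


class LazyEnginePool:
    """The engine is built on the first detection request, not at API
    startup (architectural decision 3); TensorRT is tried first, with
    ONNX Runtime as the CPU fallback (flow F5)."""

    def __init__(self):
        self._engine = None

    def get(self) -> InferenceEngine:
        if self._engine is None:
            try:
                self._engine = TensorRTEngine()
            except Exception:
                # The real service would also start a background
                # ONNX -> TensorRT conversion here (flow F6).
                self._engine = ONNXEngine()
        return self._engine


pool = LazyEnginePool()
print(pool.get().infer(None))  # {'backend': 'onnx'} in this sketch
```

The key property sketched here is that a TensorRT failure is contained inside `get()`: callers always receive a working engine, and the fallback decision is made exactly once.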
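## Appendix: GSD Tile Sizing Sketch

The report mentions GSD-based tiling of large aerial images (architectural decision 4) without giving the formula. The sketch below shows one plausible way such sizing works, assuming a tile should cover a roughly fixed ground extent; the constants (`ground_extent_m`, the pixel clamp range, the stride) are illustrative assumptions, not values taken from the service.

```python
def gsd_tile_size(gsd_m_per_px: float,
                  ground_extent_m: float = 80.0,
                  min_px: int = 640,
                  max_px: int = 1280,
                  stride: int = 32) -> int:
    """Pick a tile edge in pixels covering roughly `ground_extent_m`
    metres on the ground, clamped to [min_px, max_px] and rounded to a
    multiple of `stride` (YOLO inputs are typically stride-aligned).
    All defaults are illustrative assumptions."""
    raw = ground_extent_m / gsd_m_per_px          # pixels needed for the extent
    clamped = max(min_px, min(max_px, raw))       # keep tiles in a sane range
    return int(round(clamped / stride)) * stride  # stride-aligned edge


# Finer imagery needs more pixels per 80 m of ground than coarser imagery:
print(gsd_tile_size(0.05))  # 1280 (1600 px needed, clamped to max_px)
print(gsd_tile_size(0.25))  # 640  (only 320 px needed, clamped up to min_px)
```

The clamp explains why a fixed default such as 1280×1280 appears elsewhere in the report: it is the natural upper bound when GSD information is missing or the imagery is very fine.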