Mirror of https://github.com/azaion/detections.git (synced 2026-06-21 21:31:08 +00:00)
Added camera config
@@ -17,7 +17,7 @@
- Images ≤ 1.5× model dimensions (1280×1280): processed as single frame.
- Larger images: tiled based on ground sampling distance. Tile physical size: 25 meters (METERS_IN_TILE). Tile overlap: `big_image_tile_overlap_percent` (default: 20%).
- GSD calculation: `sensor_width * altitude / (focal_length * image_width)` when `altitude` is provided.
- GSD calculation: `sensor_width * current_height / (focal_length * current_zoom * image_width * sin(current_angle))` when `camera_config.current_height` and valid camera parameters are provided. `current_angle` is in degrees and defaults to 90.
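
The camera-aware formula above can be sketched in Python as follows. This is only an illustration: the function name `ground_sampling_distance` is hypothetical, and converting `current_angle` from degrees to radians before taking the sine is an assumption, since the docs only state that the angle is given in degrees.

```python
import math


def ground_sampling_distance(
    sensor_width_mm: float,
    focal_length_mm: float,
    current_height_m: float,
    image_width_px: int,
    current_zoom: float = 1.0,
    current_angle_deg: float = 90.0,
) -> float:
    """Return meters per pixel using the formula from the list above.

    Hypothetical helper, not the service's actual function.
    """
    # Assumption: the angle is converted to radians before sin().
    angle_factor = math.sin(math.radians(current_angle_deg))
    # sensor_width and focal_length are both in mm, so their units cancel
    # and the result is in meters (the unit of the height) per pixel.
    return (sensor_width_mm * current_height_m) / (
        focal_length_mm * current_zoom * image_width_px * angle_factor
    )


gsd = ground_sampling_distance(23.5, 24.0, 400.0, 4000)
print(f"{gsd:.4f} m/px")
```

With `current_zoom = 1` and `current_angle = 90` the expression reduces to the legacy `sensor_width * altitude / (focal_length * image_width)` form.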
## API
@@ -36,9 +36,19 @@ Media path is resolved from the Annotations service via `GET /api/media/{media_i
| tracking_intersection_threshold | float | 0.6 | Overlap ratio for NMS deduplication |
| model_batch_size | int | 8 | Inference batch size |
| big_image_tile_overlap_percent | int | 20 | Tile overlap for large images (0-100%) |
| altitude | float | optional | Camera altitude in meters. When omitted, GSD-based size filtering and image tiling are skipped. |
| camera_config | object | null | Camera parameters for GSD. When it is omitted or `current_height` is missing, GSD-based size filtering and image tiling are skipped. |
### camera_config
| Field | Type | Default | Range/Meaning |
|-------|------|---------|---------------|
| focal_length | float | 24 | Camera focal length in mm |
| sensor_width | float | 23.5 | Camera sensor width in mm |
| current_zoom | float | 1 | Optical zoom multiplier; effective focal length is `focal_length * current_zoom` |
| current_angle | float | 90 | Camera angle in degrees; 90 is nadir/downward |
| current_height | float | optional | Camera height in meters |
Legacy flat `altitude`, `focal_length`, and `sensor_width` keys are still accepted for backward compatibility, but new clients should send `camera_config`.
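
For clients migrating away from the flat keys, the nested shape might be built along these lines. The helper name `to_camera_config` is hypothetical and lives on the client side; it relies only on the documented mapping of legacy `altitude` to `current_height`.

```python
import json

LEGACY_KEYS = ("altitude", "focal_length", "sensor_width")


def to_camera_config(legacy: dict) -> dict:
    """Move legacy flat camera keys into a nested `camera_config` object.

    Hypothetical client-side helper; not part of the service.
    """
    # Keep every non-camera setting untouched.
    config = {k: v for k, v in legacy.items() if k not in LEGACY_KEYS}
    camera = {}
    if "altitude" in legacy:
        # `altitude` is the legacy name for `current_height`.
        camera["current_height"] = legacy["altitude"]
    for key in ("focal_length", "sensor_width"):
        if key in legacy:
            camera[key] = legacy[key]
    if camera:
        config["camera_config"] = camera
    return config


old_payload = {"model_batch_size": 8, "altitude": 400, "focal_length": 24, "sensor_width": 23.5}
new_payload = to_camera_config(old_payload)
print(json.dumps(new_payload))
```

Fields the legacy payload never carried (`current_zoom`, `current_angle`) are left out so the documented defaults apply.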
The `paths` field was removed in AZ-174; media paths are now resolved via the Annotations service.
@@ -32,7 +32,7 @@ graph LR
| Cython inference pipeline | Python 3, Cython 3.1.3, OpenCV 4.10 | Near-C performance for tight detection loops while retaining Python ecosystem | Build complexity, limited IDE/debug support | Compilation step via setup.py | N/A | Low (open-source) | High — critical for postprocessing throughput |
| Dual engine strategy (TensorRT + ONNX) | TensorRT 10.11, ONNX Runtime 1.22 | Maximum GPU speed with CPU fallback; auto-conversion and caching | Two code paths; GPU-specific engine files not portable | NVIDIA GPU (CC ≥ 6.1) for TensorRT | N/A | TensorRT free for NVIDIA GPUs | High — balances performance and portability |
| FastAPI HTTP service | FastAPI, Uvicorn, Pydantic | Async SSE, auto-generated docs, fast development | Sync inference offloaded to ThreadPoolExecutor (2 workers) | Python 3.8+ | Bearer token pass-through | Low (open-source) | High — fits async streaming + sync inference pattern |
| GSD-based image tiling | OpenCV, NumPy | Preserves small object detail in large aerial images | Complex tile dedup logic; overlap increases compute | Camera metadata (altitude, focal length, sensor width) | N/A | Compute cost scales with image size | High — essential for aerial imagery use case |
| GSD-based image tiling | OpenCV, NumPy | Preserves small object detail in large aerial images | Complex tile dedup logic; overlap increases compute | Camera metadata (`camera_config`: height, angle, zoom, focal length, sensor width) | N/A | Compute cost scales with image size | High — essential for aerial imagery use case |
| Lazy engine initialization | pynvml, threading | Fast API startup; background model conversion | First request has high latency; engine may be unavailable | None | N/A | N/A | High — prevents blocking startup on slow model download/conversion |
## 3. Testing Strategy
@@ -109,7 +109,7 @@ None — internal component, consumed by API layer.
### Large Image Tiling
- Ground Sampling Distance: `sensor_width * altitude / (focal_length * image_width)`
- Ground Sampling Distance: `sensor_width * current_height / (focal_length * current_zoom * image_width * sin(current_angle))`
- Tile size: `METERS_IN_TILE / GSD` pixels
- Overlap: configurable percentage
- Tile deduplication: absolute-coordinate Detection equality across adjacent tiles
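
The tile-size and overlap bullets above can be illustrated with a small sketch. The function names are hypothetical. The stride rule (each tile advances by the non-overlapping share of a tile) and the clamping of the last tile to the image edge are assumptions, as the docs only say the overlap is a configurable percentage.

```python
from typing import List

METERS_IN_TILE = 25  # tile physical size in meters, as documented


def tile_size_px(gsd_m_per_px: float) -> int:
    """Tile edge length in pixels: METERS_IN_TILE / GSD."""
    return max(1, round(METERS_IN_TILE / gsd_m_per_px))


def tile_origins(image_len_px: int, tile_px: int, overlap_percent: int) -> List[int]:
    """Start offsets of tiles along one image axis.

    Assumption: neighbouring tiles share `overlap_percent` of a tile, so the
    stride is tile_px * (100 - overlap_percent) / 100. The real pipeline may differ.
    """
    if tile_px >= image_len_px:
        return [0]
    stride = max(1, tile_px * (100 - overlap_percent) // 100)
    origins = list(range(0, image_len_px - tile_px, stride))
    # Clamp a final tile to the image edge so the whole axis is covered.
    origins.append(image_len_px - tile_px)
    return origins


size = tile_size_px(0.1)           # 25 m at 0.1 m/px
xs = tile_origins(1000, size, 20)  # 20% overlap along a 1000 px axis
print(size, xs)
```

Applying `tile_origins` once per axis and taking the product of the two lists gives the top-left corners of all tiles.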
@@ -37,9 +37,13 @@ erDiagram
double tracking_intersection_threshold
int big_image_tile_overlap_percent
int model_batch_size
double altitude
bool has_camera_config
double current_height
double current_zoom
double current_angle
double focal_length
double sensor_width
double altitude
}
AIAvailabilityStatus {
@@ -107,7 +111,7 @@ Groups detections for a single frame or image tile.
### AIRecognitionConfig
Runtime configuration for inference behavior. Created from dict (API) or msgpack (internal).
Runtime configuration for inference behavior. Created from dict (API). Camera values are grouped under `camera_config` at the API boundary and expanded into `current_height`, `current_zoom`, `current_angle`, `focal_length`, and `sensor_width` internally. `altitude` remains as a legacy alias for `current_height`.
### AIAvailabilityStatus
@@ -125,7 +129,7 @@ SSE event payload. Status values: AIProcessing, AIProcessed, Error.
### AIConfigDto
API input configuration. Same fields as AIRecognitionConfig with defaults.
API input configuration. Same inference fields as `AIRecognitionConfig` with defaults, plus nested `camera_config` for GSD and physical-size filtering.
### HealthResponse
@@ -144,7 +148,7 @@ Annotation names encode media source and processing context:
| Entity | Format | Usage |
|--------|--------|-------|
| Detection/Annotation | msgpack (compact keys) | `annotation.serialize()` |
| AIRecognitionConfig | msgpack (compact keys) | `from_msgpack()` |
| AIRecognitionConfig | Python dict | `AIRecognitionConfig.from_dict()` |
| AIAvailabilityStatus | msgpack | `serialize()` |
| DetectionDto/Event | JSON (Pydantic) | HTTP API responses, SSE |
@@ -20,9 +20,13 @@ Data class holding all AI recognition configuration parameters, with factory met
| `tracking_intersection_threshold` | double | 0.6 | IoU threshold for overlapping detection removal |
| `model_batch_size` | int | 1 | Batch size for inference |
| `big_image_tile_overlap_percent` | int | 20 | Tile overlap percentage for large image splitting |
| `altitude` | double? | optional | Camera altitude in meters. When missing, GSD-based filtering is disabled |
| `has_camera_config` | bool | false | Whether camera parameters were supplied |
| `current_height` | double | 0.0 | Camera height in meters, from `camera_config.current_height` |
| `current_zoom` | double | 1.0 | Camera zoom multiplier |
| `current_angle` | double | 90.0 | Camera angle in degrees; 90 is nadir/downward |
| `focal_length` | double | 24 | Camera focal length in mm |
| `sensor_width` | double | 23.5 | Camera sensor width in mm |
| `altitude` / `has_altitude` | double / bool | legacy | Backward-compatible aliases for older flat camera config |
#### Methods
@@ -32,7 +36,7 @@ Data class holding all AI recognition configuration parameters, with factory met
## Internal Logic
`from_dict` applies defaults for missing keys using full descriptive key names.
`from_dict` applies defaults for missing keys using full descriptive key names. Camera parameters are read from nested `camera_config` first; legacy flat `altitude`, `focal_length`, and `sensor_width` keys remain supported for older clients.
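
The lookup order described above (nested `camera_config` first, then legacy flat keys, then defaults) might look roughly like this. `resolve_camera` is a hypothetical stand-in and not the real `from_dict`. The defaults are taken from the field table. Treating `altitude` as the fallback for `current_height` follows the documented alias.

```python
DEFAULTS = {
    "focal_length": 24.0,
    "sensor_width": 23.5,
    "current_zoom": 1.0,
    "current_angle": 90.0,
}
# Only these optics keys are documented as having legacy flat equivalents.
LEGACY_FLAT_KEYS = ("focal_length", "sensor_width")


def resolve_camera(data: dict) -> dict:
    """Resolve camera values: nested config, then legacy flat keys, then defaults.

    Hypothetical helper; the real `from_dict` may be structured differently.
    """
    nested = data.get("camera_config") or {}
    resolved = {}
    for key, default in DEFAULTS.items():
        value = nested.get(key)
        if value is None and key in LEGACY_FLAT_KEYS:
            value = data.get(key)
        resolved[key] = float(default if value is None else value)
    # Legacy `altitude` is an alias for `current_height`.
    height = nested.get("current_height")
    if height is None:
        height = data.get("altitude")
    # 0.0 is the documented default when no height was supplied.
    resolved["current_height"] = 0.0 if height is None else float(height)
    return resolved


print(resolve_camera({"camera_config": {"current_height": 400}}))
print(resolve_camera({"altitude": 120, "focal_length": 35}))
```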
**Removed**: the `paths` and `file_data` fields were removed as part of the distributed architecture shift (AZ-174). Media paths are now resolved via the Annotations service API, not passed in config. `from_msgpack()` was also removed because it was unused.
@@ -51,7 +55,7 @@ Data class holding all AI recognition configuration parameters, with factory met
## Configuration
Camera/altitude parameters (`altitude`, `focal_length`, `sensor_width`) are used for ground sampling distance calculation in aerial image processing. If `altitude` is missing, the service skips GSD-based size filtering and does not tile large images by physical size.
Camera parameters (`camera_config.focal_length`, `camera_config.sensor_width`, `camera_config.current_zoom`, `camera_config.current_angle`, `camera_config.current_height`) are used for ground sampling distance calculation in aerial image processing. If `camera_config` is missing, or if its height or optics values are invalid, the service skips GSD-based size filtering and does not tile large images by physical size.
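
The docs do not spell out what counts as "invalid", so the following check is only one plausible reading. It assumes that height, focal length, sensor width and zoom must be positive numbers and that the sine of the angle must be positive. The function name `gsd_inputs_valid` is hypothetical.

```python
import math
from typing import Optional

# Documented defaults for optics fields that may be omitted.
OPTICS_DEFAULTS = {"focal_length": 24.0, "sensor_width": 23.5, "current_zoom": 1.0}


def gsd_inputs_valid(camera: Optional[dict]) -> bool:
    """Return True when a GSD could be computed from `camera`.

    Assumed rules, not confirmed by the source: every value must be a
    positive number and sin(current_angle) must be greater than zero.
    """
    if not camera:
        return False
    values = [camera.get("current_height")]
    values += [camera.get(key, default) for key, default in OPTICS_DEFAULTS.items()]
    for value in values:
        if not isinstance(value, (int, float)) or value <= 0:
            return False
    angle_deg = camera.get("current_angle", 90.0)
    return math.sin(math.radians(angle_deg)) > 0


print(gsd_inputs_valid({"current_height": 400}))  # height present, defaults for the rest
print(gsd_inputs_valid(None))                     # no camera_config at all
```

A caller would skip size filtering and physical tiling whenever this returns `False`, matching the fallback behaviour described above.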
## External Integrations
@@ -90,7 +90,7 @@ Both `run_detect_image` and `run_detect_video` accept raw bytes instead of file
### Ground Sampling Distance (GSD)
`GSD = sensor_width * altitude / (focal_length * image_width)` — meters per pixel, used for physical size filtering of aerial detections.
`GSD = sensor_width * current_height / (focal_length * current_zoom * image_width * sin(current_angle))` — meters per pixel, used for physical size filtering of aerial detections. `current_angle` is configured in degrees and defaults to 90.
## Dependencies
@@ -23,7 +23,8 @@ FastAPI application entry point — exposes HTTP API for object detection on ima
| `DetectionDto` | centerX, centerY, width, height, classNum, label, confidence | Single detection result |
| `DetectionEvent` | annotations (list[DetectionDto]), mediaId, mediaStatus, mediaPercent | SSE event payload |
| `HealthResponse` | status, aiAvailability, engineType, errorMessage | Health check response |
| `AIConfigDto` | frame_period_recognition, frame_recognition_seconds, probability_threshold, tracking_*, model_batch_size, big_image_tile_overlap_percent, altitude, focal_length, sensor_width | Configuration input (no `paths` field — removed in AZ-174) |
| `CameraConfigDto` | focal_length, sensor_width, current_zoom, current_angle, current_height | Camera input used for GSD and physical-size filtering |
| `AIConfigDto` | frame_period_recognition, frame_recognition_seconds, probability_threshold, tracking_*, model_batch_size, big_image_tile_overlap_percent, camera_config | Configuration input (no `paths` field — removed in AZ-174) |
### Class: TokenManager
@@ -37,7 +38,7 @@ FastAPI application entry point — exposes HTTP API for object detection on ima
| Function | Signature | Description |
|----------|-----------|-------------|
| `_merged_annotation_settings_payload` | `(raw: object) -> dict` | Merges nested AI settings from Annotations service response (handles `aiRecognitionSettings`, `cameraSettings` sub-objects and PascalCase/camelCase/snake_case aliases) |
| `_merged_annotation_settings_payload` | `(raw: object) -> dict` | Merges nested AI settings from Annotations service response (handles `aiRecognitionSettings`, `camera_config`/`cameraSettings` sub-objects and PascalCase/camelCase/snake_case aliases) |
| `_resolve_media_for_detect` | `(media_id, token_mgr, override) -> tuple[dict, str]` | Fetches user AI settings + media path from Annotations service, merges with client overrides |
| `_detect_upload_kind` | `(filename, data) -> tuple[str, str]` | Determines if upload is image or video by extension, falls back to content probing (cv2/PyAV) |
| `_post_media_record` | `(payload, bearer) -> bool` | Creates media record via `POST /api/media` on Annotations service |
@@ -83,7 +83,7 @@
**Preconditions**:
- Engine is initialized
- Config includes altitude, focal_length, sensor_width for GSD calculation
- Config includes `camera_config` with `current_height`, `focal_length`, `sensor_width`, `current_zoom`, and `current_angle` for GSD calculation
**Input data**: large-image (4000×3000)
@@ -91,7 +91,7 @@
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with large-image and config `{"altitude": 400, "focal_length": 24, "sensor_width": 23.5}` | 200 OK |
| 1 | `POST /detect` with large-image and config `{"camera_config":{"current_height":400,"focal_length":24,"sensor_width":23.5,"current_zoom":1,"current_angle":90}}` | 200 OK |
| 2 | Parse response JSON | Array of detections |
| 3 | Verify detection coordinates | Bounding box coordinates are in 0.0–1.0 range relative to the full original image |
@@ -167,7 +167,7 @@
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image and config `{"altitude": 400, "focal_length": 24, "sensor_width": 23.5}` | 200 OK |
| 1 | `POST /detect` with small-image and config `{"camera_config":{"current_height":400,"focal_length":24,"sensor_width":23.5,"current_zoom":1,"current_angle":90}}` | 200 OK |
| 2 | For each detection, compute physical size from bounding box + GSD | No detection's physical size exceeds the MaxSizeM defined for its class in classes.json |
**Expected outcome**: All returned detections have plausible physical dimensions for their class.
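
Step 2 of this scenario could be checked with something like the sketch below. It assumes detection `width` and `height` are normalized to the full image (0.0 to 1.0), that the limit applies to the larger side of the box, and that the limits are available as a `classNum -> MaxSizeM` mapping. That mapping shape and both function names are hypothetical, and `classes.json` may be laid out differently.

```python
from typing import Dict, List, Tuple


def physical_size_m(det: dict, image_w_px: int, image_h_px: int, gsd: float) -> Tuple[float, float]:
    """Width and height of a detection in meters (normalized size * pixels * GSD)."""
    return det["width"] * image_w_px * gsd, det["height"] * image_h_px * gsd


def oversized(
    detections: List[dict],
    max_size_m: Dict[int, float],
    image_w_px: int,
    image_h_px: int,
    gsd: float,
) -> List[dict]:
    """Return detections whose larger side exceeds the limit for their class.

    Assumption: the limit is compared against the larger of width and height.
    """
    violations = []
    for det in detections:
        w_m, h_m = physical_size_m(det, image_w_px, image_h_px, gsd)
        if max(w_m, h_m) > max_size_m[det["classNum"]]:
            violations.append(det)
    return violations


detections = [
    {"classNum": 0, "width": 0.05, "height": 0.04},  # small box
    {"classNum": 0, "width": 0.20, "height": 0.10},  # large box
]
violations = oversized(detections, {0: 7.0}, image_w_px=1000, image_h_px=750, gsd=0.1)
print(len(violations))
```

The test passes when the returned list is empty for the service's actual response.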