Added camera config

2026-06-21 21:31:08 +00:00 · 2026-05-14 22:31:29 +03:00
parent 2eb5b5d8ad
commit c9aeed3dd9
19 changed files with 282 additions and 48 deletions
@@ -17,7 +17,7 @@

 - Images ≤ 1.5× model dimensions (1280×1280): processed as single frame.
 - Larger images: tiled based on ground sampling distance. Tile physical size: 25 meters (METERS_IN_TILE). Tile overlap: `big_image_tile_overlap_percent` (default: 20%).
- GSD calculation: `sensor_width * altitude / (focal_length * image_width)` when `altitude` is provided.
+- GSD calculation: `sensor_width * current_height / (focal_length * current_zoom * image_width * sin(current_angle))` when `camera_config.current_height` and valid camera parameters are provided. `current_angle` is in degrees and defaults to 90.

 ## API

@@ -36,9 +36,19 @@ Media path is resolved from the Annotations service via `GET /api/media/{media_i
 | tracking_intersection_threshold | float | 0.6 | Overlap ratio for NMS deduplication |
 | model_batch_size | int | 8 | Inference batch size |
 | big_image_tile_overlap_percent | int | 20 | Tile overlap for large images (0-100%) |
-| altitude | float | optional | Camera altitude in meters. When omitted, GSD-based size filtering and image tiling are skipped. |
+| camera_config | object | null | Camera parameters for GSD calculation. When the object is omitted or `current_height` is missing, GSD-based size filtering and image tiling are skipped. |
+
+### camera_config
+
+| Field | Type | Default | Range/Meaning |
+|-------|------|---------|---------------|
 | focal_length | float | 24 | Camera focal length in mm |
 | sensor_width | float | 23.5 | Camera sensor width in mm |
+| current_zoom | float | 1 | Optical zoom multiplier; effective focal length is `focal_length * current_zoom` |
+| current_angle | float | 90 | Camera angle in degrees; 90 is nadir/downward |
+| current_height | float | optional | Camera height in meters |
+
+Legacy flat `altitude`, `focal_length`, and `sensor_width` keys are still accepted for backward compatibility, but new clients should send `camera_config`.

 `paths` field was removed in AZ-174 — media paths are now resolved via the Annotations service.

@@ -32,7 +32,7 @@ graph LR
 | Cython inference pipeline | Python 3, Cython 3.1.3, OpenCV 4.10 | Near-C performance for tight detection loops while retaining Python ecosystem | Build complexity, limited IDE/debug support | Compilation step via setup.py | N/A | Low (open-source) | High — critical for postprocessing throughput |
 | Dual engine strategy (TensorRT + ONNX) | TensorRT 10.11, ONNX Runtime 1.22 | Maximum GPU speed with CPU fallback; auto-conversion and caching | Two code paths; GPU-specific engine files not portable | NVIDIA GPU (CC ≥ 6.1) for TensorRT | N/A | TensorRT free for NVIDIA GPUs | High — balances performance and portability |
 | FastAPI HTTP service | FastAPI, Uvicorn, Pydantic | Async SSE, auto-generated docs, fast development | Sync inference offloaded to ThreadPoolExecutor (2 workers) | Python 3.8+ | Bearer token pass-through | Low (open-source) | High — fits async streaming + sync inference pattern |
-| GSD-based image tiling | OpenCV, NumPy | Preserves small object detail in large aerial images | Complex tile dedup logic; overlap increases compute | Camera metadata (altitude, focal length, sensor width) | N/A | Compute cost scales with image size | High — essential for aerial imagery use case |
+| GSD-based image tiling | OpenCV, NumPy | Preserves small object detail in large aerial images | Complex tile dedup logic; overlap increases compute | Camera metadata (`camera_config`: height, angle, zoom, focal length, sensor width) | N/A | Compute cost scales with image size | High — essential for aerial imagery use case |
 | Lazy engine initialization | pynvml, threading | Fast API startup; background model conversion | First request has high latency; engine may be unavailable | None | N/A | N/A | High — prevents blocking startup on slow model download/conversion |

 ## 3. Testing Strategy
@@ -109,7 +109,7 @@ None — internal component, consumed by API layer.

 ### Large Image Tiling

- Ground Sampling Distance: `sensor_width * altitude / (focal_length * image_width)`
+- Ground Sampling Distance: `sensor_width * current_height / (focal_length * current_zoom * image_width * sin(current_angle))` (meters per pixel; `current_angle` is in degrees and defaults to 90)
 - Tile size: `METERS_IN_TILE / GSD` pixels
 - Overlap: configurable percentage
 - Tile deduplication: absolute-coordinate Detection equality across adjacent tiles
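
The GSD and tile-size formulas in the list above can be sketched as follows. This is a minimal illustration, not the service's Cython implementation: the function names are hypothetical, the degree-to-radian conversion is assumed because the docs give `current_angle` in degrees, and the rule for what counts as "invalid" input is a guess.

```python
import math
from typing import Optional

METERS_IN_TILE = 25.0  # tile physical size named in the docs


def ground_sampling_distance(
    sensor_width_mm: float,
    focal_length_mm: float,
    current_height_m: float,
    image_width_px: int,
    current_zoom: float = 1.0,
    current_angle_deg: float = 90.0,
) -> Optional[float]:
    """Return meters per pixel, or None when the inputs cannot yield a GSD."""
    # Assumption: the angle is documented in degrees, so convert before sin().
    sin_angle = math.sin(math.radians(current_angle_deg))
    denominator = focal_length_mm * current_zoom * image_width_px * sin_angle
    # Assumption: a non-positive height or denominator is treated as invalid.
    if current_height_m <= 0 or denominator <= 0:
        return None
    return sensor_width_mm * current_height_m / denominator


def tile_size_px(gsd: float) -> float:
    """Tile edge length in pixels: METERS_IN_TILE / GSD."""
    return METERS_IN_TILE / gsd


# Values from the test scenario: 400 m height, 24 mm lens, 23.5 mm sensor,
# zoom 1, nadir view, 4000 px wide image.
gsd = ground_sampling_distance(23.5, 24.0, 400.0, 4000)
print(gsd, tile_size_px(gsd))
```

With a nadir camera (`current_angle` = 90) and `current_zoom` = 1 the expression reduces to the previous `sensor_width * altitude / (focal_length * image_width)` form, so older configurations should produce the same GSD.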
@@ -37,9 +37,13 @@ erDiagram
        double tracking_intersection_threshold
        int big_image_tile_overlap_percent
        int model_batch_size
-        double altitude
+        bool has_camera_config
+        double current_height
+        double current_zoom
+        double current_angle
        double focal_length
        double sensor_width
+        double altitude
    }

    AIAvailabilityStatus {
@@ -107,7 +111,7 @@ Groups detections for a single frame or image tile.

 ### AIRecognitionConfig

-Runtime configuration for inference behavior. Created from dict (API) or msgpack (internal).
+Runtime configuration for inference behavior. Created from dict (API). Camera values are grouped under `camera_config` at the API boundary and expanded into `current_height`, `current_zoom`, `current_angle`, `focal_length`, and `sensor_width` internally. `altitude` remains as a legacy alias for `current_height`.

 ### AIAvailabilityStatus

@@ -125,7 +129,7 @@ SSE event payload. Status values: AIProcessing, AIProcessed, Error.

 ### AIConfigDto

-API input configuration. Same fields as AIRecognitionConfig with defaults.
+API input configuration. Same inference fields as `AIRecognitionConfig` with defaults, plus nested `camera_config` for GSD and physical-size filtering.

 ### HealthResponse

@@ -144,7 +148,7 @@ Annotation names encode media source and processing context:
 | Entity | Format | Usage |
 |--------|--------|-------|
 | Detection/Annotation | msgpack (compact keys) | `annotation.serialize()` |
-| AIRecognitionConfig | msgpack (compact keys) | `from_msgpack()` |
+| AIRecognitionConfig | Python dict | `AIRecognitionConfig.from_dict()` |
 | AIAvailabilityStatus | msgpack | `serialize()` |
 | DetectionDto/Event | JSON (Pydantic) | HTTP API responses, SSE |

@@ -20,9 +20,13 @@ Data class holding all AI recognition configuration parameters, with factory met
 | `tracking_intersection_threshold` | double | 0.6 | IoU threshold for overlapping detection removal |
 | `model_batch_size` | int | 1 | Batch size for inference |
 | `big_image_tile_overlap_percent` | int | 20 | Tile overlap percentage for large image splitting |
-| `altitude` | double? | optional | Camera altitude in meters. When missing, GSD-based filtering is disabled |
+| `has_camera_config` | bool | false | Whether camera parameters were supplied |
+| `current_height` | double | 0.0 | Camera height in meters, from `camera_config.current_height` |
+| `current_zoom` | double | 1.0 | Camera zoom multiplier |
+| `current_angle` | double | 90.0 | Camera angle in degrees; 90 is nadir/downward |
 | `focal_length` | double | 24 | Camera focal length in mm |
 | `sensor_width` | double | 23.5 | Camera sensor width in mm |
+| `altitude` / `has_altitude` | double / bool | legacy | Backward-compatible aliases for older flat camera config |

 #### Methods

@@ -32,7 +36,7 @@ Data class holding all AI recognition configuration parameters, with factory met

 ## Internal Logic

-`from_dict` applies defaults for missing keys using full descriptive key names.
+`from_dict` applies defaults for missing keys using full descriptive key names. Camera parameters are read from nested `camera_config` first; legacy flat `altitude`, `focal_length`, and `sensor_width` keys remain supported for older clients.

 **Removed**: `paths` field and `file_data` field were removed as part of the distributed architecture shift (AZ-174). Media paths are now resolved via the Annotations service API, not passed in config. `from_msgpack()` was also removed as it was unused.
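
The nested-first lookup described for `from_dict` could look roughly like the sketch below. The helper name `read_camera_params` and the use of `None` for a missing height are assumptions made for illustration; the real class stores `current_height` as a double alongside `has_camera_config`.

```python
from typing import Any, Dict, Optional

# Defaults taken from the field tables in this document.
CAMERA_DEFAULTS: Dict[str, Optional[float]] = {
    "focal_length": 24.0,
    "sensor_width": 23.5,
    "current_zoom": 1.0,
    "current_angle": 90.0,
    "current_height": None,  # assumption: None marks "no height supplied"
}


def read_camera_params(config: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Hypothetical helper: nested `camera_config` first, legacy flat keys second."""
    nested = config.get("camera_config") or {}
    params = dict(CAMERA_DEFAULTS)
    for key in params:
        if nested.get(key) is not None:
            params[key] = float(nested[key])
    # Legacy flat keys fill only what the nested object did not provide.
    for key in ("focal_length", "sensor_width"):
        if nested.get(key) is None and config.get(key) is not None:
            params[key] = float(config[key])
    # `altitude` is the legacy alias for `current_height`.
    if params["current_height"] is None and config.get("altitude") is not None:
        params["current_height"] = float(config["altitude"])
    return params
```

For example, `read_camera_params({"altitude": 120, "focal_length": 35})` resolves the legacy keys, while a payload carrying both `camera_config.current_height` and a flat `altitude` keeps the nested value.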

@@ -51,7 +55,7 @@ Data class holding all AI recognition configuration parameters, with factory met

 ## Configuration

-Camera/altitude parameters (`altitude`, `focal_length`, `sensor_width`) are used for ground sampling distance calculation in aerial image processing. If `altitude` is missing, the service skips GSD-based size filtering and does not tile large images by physical size.
+Camera parameters (`camera_config.focal_length`, `camera_config.sensor_width`, `camera_config.current_zoom`, `camera_config.current_angle`, `camera_config.current_height`) are used for ground sampling distance calculation in aerial image processing. If `camera_config` is missing, or if the height or optical parameters are invalid, the service skips GSD-based size filtering and does not tile large images by physical size.

 ## External Integrations

@@ -90,7 +90,7 @@ Both `run_detect_image` and `run_detect_video` accept raw bytes instead of file

 ### Ground Sampling Distance (GSD)

-`GSD = sensor_width * altitude / (focal_length * image_width)` — meters per pixel, used for physical size filtering of aerial detections.
+`GSD = sensor_width * current_height / (focal_length * current_zoom * image_width * sin(current_angle))` — meters per pixel, used for physical size filtering of aerial detections. `current_angle` is configured in degrees and defaults to 90.

 ## Dependencies

@@ -23,7 +23,8 @@ FastAPI application entry point — exposes HTTP API for object detection on ima
 | `DetectionDto` | centerX, centerY, width, height, classNum, label, confidence | Single detection result |
 | `DetectionEvent` | annotations (list[DetectionDto]), mediaId, mediaStatus, mediaPercent | SSE event payload |
 | `HealthResponse` | status, aiAvailability, engineType, errorMessage | Health check response |
-| `AIConfigDto` | frame_period_recognition, frame_recognition_seconds, probability_threshold, tracking_*, model_batch_size, big_image_tile_overlap_percent, altitude, focal_length, sensor_width | Configuration input (no `paths` field — removed in AZ-174) |
+| `CameraConfigDto` | focal_length, sensor_width, current_zoom, current_angle, current_height | Camera input used for GSD and physical-size filtering |
+| `AIConfigDto` | frame_period_recognition, frame_recognition_seconds, probability_threshold, tracking_*, model_batch_size, big_image_tile_overlap_percent, camera_config | Configuration input (no `paths` field — removed in AZ-174) |

 ### Class: TokenManager

@@ -37,7 +38,7 @@ FastAPI application entry point — exposes HTTP API for object detection on ima

 | Function | Signature | Description |
 |----------|-----------|-------------|
-| `_merged_annotation_settings_payload` | `(raw: object) -> dict` | Merges nested AI settings from Annotations service response (handles `aiRecognitionSettings`, `cameraSettings` sub-objects and PascalCase/camelCase/snake_case aliases) |
+| `_merged_annotation_settings_payload` | `(raw: object) -> dict` | Merges nested AI settings from Annotations service response (handles `aiRecognitionSettings`, `camera_config`/`cameraSettings` sub-objects and PascalCase/camelCase/snake_case aliases) |
 | `_resolve_media_for_detect` | `(media_id, token_mgr, override) -> tuple[dict, str]` | Fetches user AI settings + media path from Annotations service, merges with client overrides |
 | `_detect_upload_kind` | `(filename, data) -> tuple[str, str]` | Determines if upload is image or video by extension, falls back to content probing (cv2/PyAV) |
 | `_post_media_record` | `(payload, bearer) -> bool` | Creates media record via `POST /api/media` on Annotations service |
@@ -83,7 +83,7 @@

 **Preconditions**:
 - Engine is initialized
- Config includes altitude, focal_length, sensor_width for GSD calculation
+- Config includes `camera_config` with `current_height`, `focal_length`, `sensor_width`, `current_zoom`, and `current_angle` for GSD calculation

 **Input data**: large-image (4000×3000)

@@ -91,7 +91,7 @@

 | Step | Consumer Action | Expected System Response |
 |------|----------------|------------------------|
-| 1 | `POST /detect` with large-image and config `{"altitude": 400, "focal_length": 24, "sensor_width": 23.5}` | 200 OK |
+| 1 | `POST /detect` with large-image and config `{"camera_config":{"current_height":400,"focal_length":24,"sensor_width":23.5,"current_zoom":1,"current_angle":90}}` | 200 OK |
 | 2 | Parse response JSON | Array of detections |
 | 3 | Verify detection coordinates | Bounding box coordinates are in 0.0–1.0 range relative to the full original image |

@@ -167,7 +167,7 @@

 | Step | Consumer Action | Expected System Response |
 |------|----------------|------------------------|
-| 1 | `POST /detect` with small-image and config `{"altitude": 400, "focal_length": 24, "sensor_width": 23.5}` | 200 OK |
+| 1 | `POST /detect` with small-image and config `{"camera_config":{"current_height":400,"focal_length":24,"sensor_width":23.5,"current_zoom":1,"current_angle":90}}` | 200 OK |
 | 2 | For each detection, compute physical size from bounding box + GSD | No detection's physical size exceeds the MaxSizeM defined for its class in classes.json |

 **Expected outcome**: All returned detections have plausible physical dimensions for their class.
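
Step 2 of this scenario can be sketched as a small check. It assumes that `width` and `height` in each detection are normalized to the 0.0–1.0 range, that one GSD value applies to both axes, and that the larger side is compared against the class limit. The `MAX_SIZE_M` table is a hypothetical stand-in for the values in classes.json.

```python
from typing import Dict, List, Tuple

# Hypothetical per-class limits in meters; the real values live in classes.json.
MAX_SIZE_M: Dict[int, float] = {0: 8.0, 1: 20.0}


def physical_size_m(
    det: Dict[str, float], image_w: int, image_h: int, gsd: float
) -> Tuple[float, float]:
    """Convert a normalized bounding box to (width, height) in meters."""
    return det["width"] * image_w * gsd, det["height"] * image_h * gsd


def oversized(
    dets: List[Dict[str, float]], image_w: int, image_h: int, gsd: float
) -> List[Dict[str, float]]:
    """Return the detections whose larger side exceeds the limit for their class."""
    bad = []
    for det in dets:
        w_m, h_m = physical_size_m(det, image_w, image_h, gsd)
        if max(w_m, h_m) > MAX_SIZE_M[int(det["classNum"])]:
            bad.append(det)
    return bad
```

The test passes when `oversized(...)` returns an empty list for the detections in the response.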