Refactor testing framework to replace integration tests with blackbox tests across various skills and documentation. Update related workflows, templates, and task specifications to align with the new blackbox testing approach. Remove obsolete integration test files and enhance clarity in task management and reporting structures.

Oleksandr Bezdieniezhnykh
2026-03-24 03:38:36 +02:00
parent ae3ad50b9e
commit e609586c7c
49 changed files with 2222 additions and 872 deletions
@@ -0,0 +1,591 @@
# Blackbox Tests
## Positive Scenarios
### FT-P-01: Health check returns status before engine initialization
**Summary**: Verify the health endpoint responds correctly when the inference engine has not yet been initialized.
**Traces to**: AC-API-1, AC-EL-1
**Category**: API, Engine Lifecycle
**Preconditions**:
- Detections service is running
- No detection requests have been made (engine is not initialized)
**Input data**: None
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `GET /health` | 200 OK with `{"status": "healthy", "aiAvailability": "None"}` |
**Expected outcome**: Health endpoint returns `status: "healthy"` and `aiAvailability: "None"` (engine not yet loaded).
**Max execution time**: 2s
---
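The FT-P-01 contract can be sketched as a small consumer-side check. This is a minimal sketch assuming the payload shape shown above; in the real runner the dict would come from `requests.get("http://detections:8000/health").json()`.

```python
# Sketch of the FT-P-01 assertion, assuming the health payload shape
# {"status": ..., "aiAvailability": ...} described in this spec.

def assert_uninitialized_health(payload: dict) -> None:
    """Check the pre-initialization health contract from FT-P-01."""
    assert payload.get("status") == "healthy"
    # Engine must not have been loaded yet.
    assert payload.get("aiAvailability") == "None"

# Stand-in payload; the real consumer would fetch GET /health instead.
assert_uninitialized_health({"status": "healthy", "aiAvailability": "None"})
```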
### FT-P-02: Health check reflects engine availability after initialization
**Summary**: Verify the health endpoint reports the correct engine state after the engine has been initialized by a detection request.
**Traces to**: AC-API-1, AC-EL-2
**Category**: API, Engine Lifecycle
**Preconditions**:
- Detections service is running
- Mock-loader serves the ONNX model file
- At least one successful detection has been performed (engine initialized)
**Input data**: small-image
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image (trigger engine init) | 200 OK with detection results |
| 2 | `GET /health` | 200 OK with `aiAvailability` set to `"Enabled"` or `"Warning"` |
**Expected outcome**: `aiAvailability` reflects an initialized engine state (not `"None"` or `"Downloading"`).
**Max execution time**: 30s (includes engine init on first call)
---
### FT-P-03: Single image detection returns detections
**Summary**: Verify that a valid small image submitted via POST /detect returns structured detection results.
**Traces to**: AC-DA-1, AC-API-2
**Category**: Detection Accuracy, API
**Preconditions**:
- Engine is initialized (or will be on this call)
- Mock-loader serves the model
**Input data**: small-image (640×480, contains detectable objects)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image as multipart file | 200 OK |
| 2 | Parse response JSON | Array of detection objects, each with `x`, `y`, `width`, `height`, `label`, `confidence` |
| 3 | Verify all confidence values | Every detection has `confidence >= 0.25` (default probability_threshold) |
**Expected outcome**: Non-empty array of DetectionDto objects. All confidences meet the threshold. Each detection has valid bounding box coordinates (0.0–1.0 range).
**Max execution time**: 30s
---
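The per-detection checks in FT-P-03 steps 2–3 could be centralized in one validator. A sketch, assuming the DetectionDto field names from the table above and the default `probability_threshold` of 0.25:

```python
# Minimal validator for the DetectionDto shape exercised by FT-P-03.
# Field names and the 0.0-1.0 normalized range come from this spec.

REQUIRED_FIELDS = ("x", "y", "width", "height", "label", "confidence")

def is_valid_detection(det: dict, threshold: float = 0.25) -> bool:
    if any(f not in det for f in REQUIRED_FIELDS):
        return False
    # Normalized coordinates: every spatial value must lie in [0, 1].
    if not all(0.0 <= det[k] <= 1.0 for k in ("x", "y", "width", "height")):
        return False
    return det["confidence"] >= threshold

sample = {"x": 0.1, "y": 0.2, "width": 0.3, "height": 0.1,
          "label": "car", "confidence": 0.9}
assert is_valid_detection(sample)
```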
### FT-P-04: Large image triggers GSD-based tiling
**Summary**: Verify that an image exceeding 1.5× model dimensions is tiled and processed with tile-level detection results merged.
**Traces to**: AC-IP-1, AC-IP-2
**Category**: Image Processing
**Preconditions**:
- Engine is initialized
- Config includes altitude, focal_length, sensor_width for GSD calculation
**Input data**: large-image (4000×3000)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with large-image and config `{"altitude": 400, "focal_length": 24, "sensor_width": 23.5}` | 200 OK |
| 2 | Parse response JSON | Array of detections |
| 3 | Verify detection coordinates | Bounding box coordinates are in the 0.0–1.0 range relative to the full original image |
**Expected outcome**: Detections returned for the full image. Coordinates are normalized to original image dimensions (not tile dimensions). Processing time is longer than small-image due to tiling.
**Max execution time**: 60s
---
### FT-P-05: Detection confidence filtering respects threshold
**Summary**: Verify that detections below the configured probability_threshold are filtered out.
**Traces to**: AC-DA-1
**Category**: Detection Accuracy
**Preconditions**:
- Engine is initialized
**Input data**: small-image
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image and config `{"probability_threshold": 0.8}` | 200 OK |
| 2 | Parse response JSON | All returned detections have `confidence >= 0.8` |
| 3 | `POST /detect` with same image and config `{"probability_threshold": 0.1}` | 200 OK |
| 4 | Compare result counts | Step 3 returns >= number of detections from Step 1 |
**Expected outcome**: Higher threshold produces fewer or equal detections. No detection below threshold appears in results.
**Max execution time**: 30s
---
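The monotonicity invariant behind FT-P-05 (a higher threshold can only shrink the result set) is easy to state as a pure check; the two detection lists would come from the two `POST /detect` calls in steps 1 and 3:

```python
# Client-side restatement of the FT-P-05 invariant. Sample confidences
# are illustrative only.

def filter_by_confidence(detections, threshold):
    return [d for d in detections if d["confidence"] >= threshold]

dets = [{"confidence": c} for c in (0.15, 0.4, 0.85, 0.92)]
high = filter_by_confidence(dets, 0.8)
low = filter_by_confidence(dets, 0.1)

assert all(d["confidence"] >= 0.8 for d in high)
assert len(low) >= len(high)   # higher threshold => fewer or equal results
```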
### FT-P-06: Overlapping detections are deduplicated
**Summary**: Verify that overlapping detections with containment ratio above threshold are deduplicated, keeping the higher-confidence one.
**Traces to**: AC-DA-2
**Category**: Detection Accuracy
**Preconditions**:
- Engine is initialized
- Image produces overlapping detections (dense scene)
**Input data**: small-image (scene with clustered objects)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image and config `{"tracking_intersection_threshold": 0.6}` | 200 OK |
| 2 | Collect detections | No two detections of the same class overlap by more than 60% containment ratio |
| 3 | `POST /detect` with same image and config `{"tracking_intersection_threshold": 0.01}` | 200 OK |
| 4 | Compare result counts | Step 3 returns fewer or equal detections (more aggressive dedup) |
**Expected outcome**: No pair of returned detections exceeds the configured overlap threshold.
**Max execution time**: 30s
---
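The containment check from FT-P-06 step 2 can be sketched client-side. The exact formula is an assumption here: intersection area divided by the area of the smaller box, one common definition of a containment ratio; dedup keeps the higher-confidence detection of an overlapping same-class pair.

```python
# Assumed containment-ratio definition for verifying FT-P-06 results.

def containment_ratio(a: dict, b: dict) -> float:
    ix = max(0.0, min(a["x"] + a["width"], b["x"] + b["width"]) - max(a["x"], b["x"]))
    iy = max(0.0, min(a["y"] + a["height"], b["y"] + b["height"]) - max(a["y"], b["y"]))
    inter = ix * iy
    smaller = min(a["width"] * a["height"], b["width"] * b["height"])
    return inter / smaller if smaller > 0 else 0.0

def dedupe(detections: list, threshold: float) -> list:
    """Keep the higher-confidence member of any same-class pair whose
    containment ratio exceeds the threshold."""
    kept = []
    for det in sorted(detections, key=lambda d: -d["confidence"]):
        if all(d["label"] != det["label"] or
               containment_ratio(d, det) <= threshold for d in kept):
            kept.append(det)
    return kept

a = {"x": 0.1, "y": 0.1, "width": 0.2, "height": 0.2, "label": "car", "confidence": 0.9}
b = {"x": 0.12, "y": 0.12, "width": 0.2, "height": 0.2, "label": "car", "confidence": 0.5}
assert dedupe([a, b], 0.6) == [a]   # b overlaps a heavily -> dropped
```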
### FT-P-07: Physical size filtering removes oversized detections
**Summary**: Verify that detections exceeding the MaxSizeM for their class (given GSD) are removed.
**Traces to**: AC-DA-4
**Category**: Detection Accuracy
**Preconditions**:
- Engine is initialized
- classes.json loaded with MaxSizeM values
**Input data**: small-image, config with known GSD parameters
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image and config `{"altitude": 400, "focal_length": 24, "sensor_width": 23.5}` | 200 OK |
| 2 | For each detection, compute physical size from bounding box + GSD | No detection's physical size exceeds the MaxSizeM defined for its class in classes.json |
**Expected outcome**: All returned detections have plausible physical dimensions for their class.
**Max execution time**: 30s
---
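Step 2 of FT-P-07 requires computing a physical size from a bounding box and GSD. A worked sketch, assuming the standard ground-sample-distance approximation; the actual MaxSizeM limits would come from classes.json, and 12.0 m below is a hypothetical value:

```python
# Assumed GSD formula: metres covered by one pixel on the ground.

def gsd_m_per_px(altitude_m, focal_length_mm, sensor_width_mm, image_width_px):
    return (altitude_m * sensor_width_mm) / (focal_length_mm * image_width_px)

def physical_width_m(det, image_width_px, gsd):
    # det["width"] is normalized (0.0-1.0), so convert to pixels first.
    return det["width"] * image_width_px * gsd

gsd = gsd_m_per_px(400, 24, 23.5, 4000)   # config values from FT-P-07 step 1
det = {"width": 0.01}                     # 40 px wide on a 4000 px image
size = physical_width_m(det, 4000, gsd)   # roughly 3.9 m
assert size < 12.0                        # hypothetical MaxSizeM for this class
```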
### FT-P-08: Async media detection returns "started" immediately
**Summary**: Verify that POST /detect/{media_id} returns immediately with status "started" while processing continues in background.
**Traces to**: AC-API-3
**Category**: API
**Preconditions**:
- Engine is initialized
- Media file paths are available via config
**Input data**: jwt-token, test-video path in config
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect/test-media-001` with config paths and auth headers | 200 OK, `{"status": "started"}` |
| 2 | Measure response time | Response arrives within 1s (before video processing completes) |
**Expected outcome**: Immediate response with `{"status": "started"}`. Processing continues asynchronously.
**Max execution time**: 2s (response only; processing continues in background)
---
### FT-P-09: SSE streaming delivers detection events during async processing
**Summary**: Verify that SSE clients receive real-time detection events during async media detection.
**Traces to**: AC-API-4, AC-API-3
**Category**: API
**Preconditions**:
- Engine is initialized
- SSE client connected before triggering detection
**Input data**: jwt-token, test-video path in config
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Open SSE connection: `GET /detect/stream` | Connection established |
| 2 | `POST /detect/test-media-002` with config and auth headers | `{"status": "started"}` |
| 3 | Listen on SSE connection | Receive events with `mediaStatus: "AIProcessing"` as frames are processed |
| 4 | Wait for completion | Final event with `mediaStatus: "AIProcessed"` and `percent: 100` |
**Expected outcome**: Multiple SSE events received. Events include detection data. Final event signals completion.
**Max execution time**: 120s
---
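Steps 3–4 of FT-P-09 hinge on decoding the SSE stream. A minimal parser sketch that handles only `data:` fields (ignoring `event:`/`id:` for brevity); the `mediaStatus`/`percent` names follow this spec:

```python
import json

def parse_sse(stream_text: str) -> list:
    """Split an SSE text stream into decoded JSON `data:` payloads."""
    events = []
    for block in stream_text.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(json.loads(line[len("data:"):].strip()))
    return events

raw = ('data: {"mediaStatus": "AIProcessing", "percent": 40}\n\n'
       'data: {"mediaStatus": "AIProcessed", "percent": 100}\n\n')
events = parse_sse(raw)
assert events[-1]["mediaStatus"] == "AIProcessed"
assert events[-1]["percent"] == 100
```

In practice the consumer would use sseclient-py over `GET /detect/stream` rather than parsing raw text.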
### FT-P-10: Video frame sampling processes every Nth frame
**Summary**: Verify that video processing respects the `frame_period_recognition` setting.
**Traces to**: AC-VP-1
**Category**: Video Processing
**Preconditions**:
- Engine is initialized
- SSE client connected
**Input data**: test-video (10s, 30fps = 300 frames), config `{"frame_period_recognition": 4}`
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Open SSE connection | Connection established |
| 2 | `POST /detect/test-media-003` with config `{"frame_period_recognition": 4, "paths": ["/media/test-video.mp4"]}` | `{"status": "started"}` |
| 3 | Count distinct SSE events with detection data | Number of processed frames ≈ 300/4 = 75 (±10% tolerance for start/end frames) |
**Expected outcome**: Approximately 75 frames processed (not all 300). The count scales proportionally with frame_period_recognition.
**Max execution time**: 120s
---
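The frame-count expectation in FT-P-10 step 3 reduces to a tolerance check. A sketch using the numbers from the table (300 frames, period 4, ±10%):

```python
# Reusable tolerance check for the FT-P-10 frame-sampling expectation.

def expected_processed_frames(total_frames: int, period: int) -> int:
    return total_frames // period

def within_tolerance(observed: int, expected: int, tol: float = 0.10) -> bool:
    return abs(observed - expected) <= expected * tol

expected = expected_processed_frames(300, 4)   # 75
assert within_tolerance(71, expected)          # inside the +/-10% band
assert not within_tolerance(60, expected)      # too few frames observed
```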
### FT-P-11: Video annotation interval enforcement
**Summary**: Verify that annotations are not reported more frequently than `frame_recognition_seconds`.
**Traces to**: AC-VP-2
**Category**: Video Processing
**Preconditions**:
- Engine is initialized
- SSE client connected
**Input data**: test-video, config `{"frame_recognition_seconds": 2}`
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Open SSE connection | Connection established |
| 2 | `POST /detect/test-media-004` with config `{"frame_recognition_seconds": 2, "paths": ["/media/test-video.mp4"]}` | `{"status": "started"}` |
| 3 | Record timestamps of consecutive SSE detection events | Minimum gap between consecutive annotation events ≥ 2 seconds |
**Expected outcome**: No two annotation events are closer than 2 seconds apart.
**Max execution time**: 120s
---
### FT-P-12: Video tracking accepts new annotations on movement
**Summary**: Verify that new annotations are accepted when detections move beyond the tracking threshold.
**Traces to**: AC-VP-3
**Category**: Video Processing
**Preconditions**:
- Engine is initialized
- SSE client connected
- Video contains moving objects
**Input data**: test-video, config with `tracking_distance_confidence > 0`
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Open SSE connection | Connection established |
| 2 | `POST /detect/test-media-005` with config `{"tracking_distance_confidence": 0.05, "paths": ["/media/test-video.mp4"]}` | `{"status": "started"}` |
| 3 | Collect SSE events | Annotations are emitted when object positions change between frames |
**Expected outcome**: Annotations contain updated positions reflecting object movement. Static objects do not generate redundant annotations.
**Max execution time**: 120s
---
### FT-P-13: Weather mode class variants
**Summary**: Verify that the system supports detection across different weather mode class variants (Norm, Wint, Night).
**Traces to**: AC-OC-1
**Category**: Object Classes
**Preconditions**:
- Engine is initialized
- classes.json includes weather-mode variants
**Input data**: small-image
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image | 200 OK |
| 2 | Inspect returned detection labels | Labels correspond to valid class names from classes.json (base or weather-variant) |
**Expected outcome**: All returned labels are valid entries from the 19-class × 3-mode registry.
**Max execution time**: 30s
---
### FT-P-14: Engine lazy initialization on first detection request
**Summary**: Verify that the engine is not initialized at startup but is initialized on the first detection request.
**Traces to**: AC-EL-1, AC-EL-2
**Category**: Engine Lifecycle
**Preconditions**:
- Fresh service start, no prior requests
**Input data**: small-image
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `GET /health` immediately after service starts | `aiAvailability: "None"` — engine not loaded |
| 2 | `POST /detect` with small-image | 200 OK (may take longer — engine initializing) |
| 3 | `GET /health` | `aiAvailability` changed to `"Enabled"` or status indicating engine is active |
**Expected outcome**: Engine transitions from "None" to an active state only after a detection request.
**Max execution time**: 60s
---
### FT-P-15: ONNX fallback when GPU unavailable
**Summary**: Verify that the system falls back to ONNX Runtime when no compatible GPU is available.
**Traces to**: AC-EL-2, RESTRICT-HW-1
**Category**: Engine Lifecycle
**Preconditions**:
- Detections service running WITHOUT GPU runtime (CPU-only Docker profile)
- Mock-loader serves ONNX model
**Input data**: small-image
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with small-image | 200 OK with detection results |
| 2 | `GET /health` | `aiAvailability` indicates engine is active (ONNX fallback) |
**Expected outcome**: Detection succeeds via ONNX Runtime. No TensorRT-related errors.
**Max execution time**: 60s
---
### FT-P-16: Tile deduplication removes duplicate detections at tile boundaries
**Summary**: Verify that detections appearing in overlapping tile regions are deduplicated.
**Traces to**: AC-DA-3
**Category**: Detection Accuracy
**Preconditions**:
- Engine is initialized
- Large image that triggers tiling
**Input data**: large-image with config including GSD parameters and `big_image_tile_overlap_percent: 20`
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with large-image and tiling config | 200 OK |
| 2 | Inspect detections near tile boundaries | No two detections of the same class are within 0.01 coordinate difference of each other (TILE_DUPLICATE_CONFIDENCE_THRESHOLD) |
**Expected outcome**: Tile boundary detections are merged. No duplicates with near-identical coordinates remain.
**Max execution time**: 60s
---
## Negative Scenarios
### FT-N-01: Empty image returns 400
**Summary**: Verify that submitting an empty file to POST /detect returns a 400 error.
**Traces to**: AC-API-2 (negative case)
**Category**: API
**Preconditions**:
- Detections service is running
**Input data**: empty-image (zero-byte file)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with empty-image as multipart file | 400 Bad Request |
**Expected outcome**: HTTP 400 with error message indicating empty or invalid image.
**Max execution time**: 5s
---
### FT-N-02: Invalid image data returns 400
**Summary**: Verify that submitting a corrupt/non-image file returns a 400 error.
**Traces to**: AC-API-2 (negative case)
**Category**: API
**Preconditions**:
- Detections service is running
**Input data**: corrupt-image (random binary data)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect` with corrupt-image as multipart file | 400 Bad Request |
**Expected outcome**: HTTP 400. Image decoding fails gracefully with an error response (not a 500).
**Max execution time**: 5s
---
### FT-N-03: Detection when engine unavailable returns 503
**Summary**: Verify that a detection request returns 503 when the engine cannot be initialized.
**Traces to**: AC-API-2 (negative case), AC-EL-2
**Category**: API, Engine Lifecycle
**Preconditions**:
- Mock-loader configured to return errors (model download fails)
- Engine has not been previously initialized
**Input data**: small-image
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Configure mock-loader to return 503 on model requests | — |
| 2 | `POST /detect` with small-image | 503 Service Unavailable or 422 |
**Expected outcome**: HTTP 503 or 422 error indicating engine is not available. No crash or unhandled exception.
**Max execution time**: 30s
---
### FT-N-04: Duplicate media_id returns 409
**Summary**: Verify that submitting a second async detection request with an already-active media_id returns 409.
**Traces to**: AC-API-3 (negative case)
**Category**: API
**Preconditions**:
- Engine is initialized
- An async detection is already in progress for media_id "dup-test"
**Input data**: jwt-token, test-video
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | `POST /detect/dup-test` with config and auth headers | `{"status": "started"}` |
| 2 | Immediately `POST /detect/dup-test` again (same media_id) | 409 Conflict |
**Expected outcome**: Second request is rejected with 409. First detection continues normally.
**Max execution time**: 5s
---
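The 200/409 contract from FT-N-04 can be modeled with a toy registry. The service presumably tracks active ids in memory (`_active_detections`, per the resilience tests below); this standalone sketch mirrors only the observable status codes:

```python
# Toy model of the duplicate-media_id guard exercised by FT-N-04.

class ActiveDetections:
    def __init__(self):
        self._active = set()

    def start(self, media_id: str) -> int:
        """Return the HTTP status the consumer would observe."""
        if media_id in self._active:
            return 409          # this media_id is already processing
        self._active.add(media_id)
        return 200

    def finish(self, media_id: str) -> None:
        self._active.discard(media_id)

reg = ActiveDetections()
assert reg.start("dup-test") == 200   # first request accepted
assert reg.start("dup-test") == 409   # duplicate rejected
reg.finish("dup-test")
assert reg.start("dup-test") == 200   # accepted again after completion
```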
### FT-N-05: Missing classes.json prevents startup
**Summary**: Verify that the service fails or returns no detections when classes.json is not present.
**Traces to**: RESTRICT-SW-4
**Category**: Restrictions
**Preconditions**:
- Detections service started WITHOUT classes.json volume mount
**Input data**: None
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Attempt to start detections service without classes.json | Service fails to start OR starts with empty class registry |
| 2 | If started: `POST /detect` with small-image | Empty detections or error response |
**Expected outcome**: Service either fails to start or returns no detections. No unhandled crash.
**Max execution time**: 30s
---
### FT-N-06: Loader service unreachable during model download
**Summary**: Verify that the system handles Loader service being unreachable during engine initialization.
**Traces to**: RESTRICT-ENV-1, AC-EL-2
**Category**: Resilience, Engine Lifecycle
**Preconditions**:
- Mock-loader is stopped or unreachable
- Engine not yet initialized
**Input data**: small-image
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Stop mock-loader service | — |
| 2 | `POST /detect` with small-image | Error response (503 or 422) |
| 3 | `GET /health` | `aiAvailability` reflects error state |
**Expected outcome**: Detection fails gracefully. Health endpoint reflects the engine error state.
**Max execution time**: 30s
---
### FT-N-07: Annotations service unreachable — detection continues
**Summary**: Verify that async detection continues even when the Annotations service is unreachable.
**Traces to**: RESTRICT-ENV-2
**Category**: Resilience
**Preconditions**:
- Engine is initialized
- Mock-annotations is stopped or returns errors
- SSE client connected
**Input data**: jwt-token, test-video
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Stop mock-annotations service | — |
| 2 | `POST /detect/test-media-006` with config and auth | `{"status": "started"}` |
| 3 | Listen on SSE | Detection events still arrive (annotations POST failure is silently caught) |
| 4 | Wait for completion | Final `AIProcessed` event received |
**Expected outcome**: Detection processing completes. SSE events are delivered. Annotations POST failure does not stop the detection pipeline.
**Max execution time**: 120s
---
### FT-N-08: SSE queue overflow is silently dropped
**Summary**: Verify that when an SSE client's queue reaches 100 events, additional events are dropped without error.
**Traces to**: AC-API-4
**Category**: API
**Preconditions**:
- Engine is initialized
- SSE client connected but NOT consuming events (stalled reader)
**Input data**: test-video (generates many events)
**Steps**:
| Step | Consumer Action | Expected System Response |
|------|----------------|------------------------|
| 1 | Open SSE connection but pause reading | Connection established |
| 2 | `POST /detect/test-media-007` with config that generates > 100 events | `{"status": "started"}` |
| 3 | Wait for processing to complete | No error on the detection side |
| 4 | Resume reading SSE | Receive ≤ 100 events (queue max depth) |
**Expected outcome**: No crash or error. Overflow events are silently dropped. Detection completes normally.
**Max execution time**: 120s
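The drop-on-overflow behavior from FT-N-08 maps directly onto a bounded queue. A sketch with the 100-slot cap from the spec; events past the cap are discarded silently rather than surfaced to the producer:

```python
import queue

def publish(q: "queue.Queue", event: dict) -> bool:
    """Enqueue an SSE event, silently dropping it if the queue is full."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        return False            # overflow: drop without raising

q = queue.Queue(maxsize=100)
dropped = sum(not publish(q, {"seq": i}) for i in range(150))
assert q.qsize() == 100         # a stalled client sees at most 100 events
assert dropped == 50            # the remainder were discarded silently
```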
@@ -0,0 +1,125 @@
# Test Environment
## Overview
**System under test**: Azaion.Detections — FastAPI HTTP service exposing `POST /detect`, `POST /detect/{media_id}`, `GET /detect/stream`, `GET /health`
**Consumer app purpose**: Standalone test runner that exercises the detection service through its public HTTP/SSE interfaces, validating black-box use cases without access to internals.
## Docker Environment
### Services
| Service | Image / Build | Purpose | Ports |
|---------|--------------|---------|-------|
| detections | Build from repo root (setup.py + Cython compile, uvicorn entrypoint) | System under test — the detection microservice | 8000:8000 |
| mock-loader | Custom lightweight HTTP stub (Python/Node) | Mock of the Loader service — serves ONNX model files, accepts TensorRT uploads | 8080:8080 |
| mock-annotations | Custom lightweight HTTP stub (Python/Node) | Mock of the Annotations service — accepts detection results, provides token refresh | 8081:8081 |
| e2e-consumer | Build from `e2e/` directory | Black-box test runner (pytest) | — |
### GPU Configuration
For tests requiring TensorRT (GPU path):
- Deploy `detections` with `runtime: nvidia` and `NVIDIA_VISIBLE_DEVICES=all`
- The test suite has two profiles: `gpu` (TensorRT tests) and `cpu` (ONNX fallback tests)
- CPU-only tests run without GPU runtime, verifying ONNX fallback behavior
### Networks
| Network | Services | Purpose |
|---------|----------|---------|
| e2e-net | all | Isolated test network — all service-to-service communication via hostnames |
### Volumes
| Volume | Mounted to | Purpose |
|--------|-----------|---------|
| test-models | mock-loader:/models | Pre-built ONNX model file for test inference |
| test-media | e2e-consumer:/media | Sample images and video files for detection requests |
| test-classes | detections:/app/classes.json | classes.json with 19 detection classes |
| test-results | e2e-consumer:/results | CSV test report output |
### docker-compose structure
```yaml
services:
mock-loader:
build: ./e2e/mocks/loader
ports: ["8080:8080"]
volumes:
- test-models:/models
networks: [e2e-net]
mock-annotations:
build: ./e2e/mocks/annotations
ports: ["8081:8081"]
networks: [e2e-net]
detections:
build:
context: .
dockerfile: Dockerfile
ports: ["8000:8000"]
environment:
- LOADER_URL=http://mock-loader:8080
- ANNOTATIONS_URL=http://mock-annotations:8081
volumes:
- test-classes:/app/classes.json
depends_on:
- mock-loader
- mock-annotations
networks: [e2e-net]
# GPU profile adds: runtime: nvidia
e2e-consumer:
build: ./e2e
volumes:
- test-media:/media
- test-results:/results
depends_on:
- detections
networks: [e2e-net]
command: pytest --csv=/results/report.csv
volumes:
test-models:
test-media:
test-classes:
test-results:
networks:
e2e-net:
```
## Consumer Application
**Tech stack**: Python 3, pytest, requests, sseclient-py
**Entry point**: `pytest --csv=/results/report.csv`
### Communication with system under test
| Interface | Protocol | Endpoint | Authentication |
|-----------|----------|----------|----------------|
| Health check | HTTP GET | `http://detections:8000/health` | None |
| Single image detect | HTTP POST (multipart) | `http://detections:8000/detect` | None |
| Media detect | HTTP POST (JSON) | `http://detections:8000/detect/{media_id}` | Bearer JWT + x-refresh-token headers |
| SSE stream | HTTP GET (SSE) | `http://detections:8000/detect/stream` | None |
### What the consumer does NOT have access to
- No direct import of Cython modules (inference, annotation, engines)
- No direct access to the detections service filesystem or Logs/ directory
- No shared memory with the detections process
- No direct calls to mock-loader or mock-annotations (except for test setup/teardown verification)
## CI/CD Integration
**When to run**: On PR merge to dev, nightly scheduled run
**Pipeline stage**: After unit tests, before deployment
**Gate behavior**: Block merge if any functional test fails; non-functional failures are warnings
**Timeout**: 15 minutes for CPU profile, 30 minutes for GPU profile
## Reporting
**Format**: CSV
**Columns**: Test ID, Test Name, Execution Time (ms), Result (PASS/FAIL/SKIP), Error Message (if FAIL)
**Output path**: `/results/report.csv` (mounted volume → `./e2e-results/report.csv` on host)
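In the real runner the pytest-csv plugin produces this file; the sketch below only illustrates the column layout described above, using the stdlib `csv` module:

```python
import csv
import io

COLUMNS = ["Test ID", "Test Name", "Execution Time (ms)", "Result", "Error Message"]

def write_report(rows: list) -> str:
    """Render test results as CSV text matching the reporting columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

report = write_report([
    {"Test ID": "FT-P-01", "Test Name": "Health before init",
     "Execution Time (ms)": 120, "Result": "PASS", "Error Message": ""},
])
assert report.splitlines()[0] == ",".join(COLUMNS)
assert "FT-P-01" in report
```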
@@ -0,0 +1,85 @@
# Performance Tests
### NFT-PERF-01: Single image detection latency
**Summary**: Measure end-to-end latency for a single small image detection request after engine is warm.
**Traces to**: AC-API-2
**Metric**: Request-to-response latency (ms)
**Preconditions**:
- Engine is initialized and warm (at least 1 prior detection)
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Send 10 sequential `POST /detect` with small-image | Record each request-response latency |
| 2 | Compute p50, p95, p99 | — |
**Pass criteria**: p95 latency < 5000ms for ONNX CPU, p95 < 1000ms for TensorRT GPU
**Duration**: ~60s (10 requests)
---
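Step 2 of NFT-PERF-01 needs percentile aggregation. A nearest-rank sketch; the sample latencies are made up, standing in for the 10 measured request times:

```python
import math

def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: smallest value covering p% of samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]

latencies_ms = [820, 790, 910, 1300, 760, 830, 880, 905, 4200, 840]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
assert p50 <= p95
assert p95 < 5000   # ONNX CPU pass criterion from above
```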
### NFT-PERF-02: Concurrent inference throughput
**Summary**: Verify the system handles 2 concurrent inference requests (ThreadPoolExecutor limit).
**Traces to**: RESTRICT-HW-3
**Metric**: Throughput (requests/second), latency under concurrency
**Preconditions**:
- Engine is initialized and warm
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | Send 2 concurrent `POST /detect` requests with small-image | Measure both response times |
| 2 | Send 3 concurrent requests | Third request should queue behind the first two |
| 3 | Record total time for 3 concurrent requests vs 2 concurrent | — |
**Pass criteria**: 2 concurrent requests complete without error. 3 concurrent requests: total time > time for 2 (queuing observed).
**Duration**: ~30s
---
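The queuing expectation in NFT-PERF-02 can be demonstrated locally with a 2-worker pool standing in for the service's ThreadPoolExecutor limit; real measurements would use concurrent `POST /detect` calls, and `fake_inference` is a stand-in:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_inference(duration_s: float) -> float:
    time.sleep(duration_s)      # stand-in for a detection request
    return duration_s

with ThreadPoolExecutor(max_workers=2) as pool:
    start = time.monotonic()
    # Three equal jobs on two workers: the third must wait for a free slot.
    futures = [pool.submit(fake_inference, 0.2) for _ in range(3)]
    for f in futures:
        f.result()
    elapsed = time.monotonic() - start

assert elapsed >= 0.4   # two batches of ~0.2s each: queuing observed
```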
### NFT-PERF-03: Large image tiling processing time
**Summary**: Measure processing time for a large image that triggers GSD-based tiling.
**Traces to**: AC-IP-2
**Metric**: Total processing time (ms), tiles processed
**Preconditions**:
- Engine is initialized and warm
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | `POST /detect` with large-image (4000×3000) and GSD config | Record total response time |
| 2 | Compare with small-image baseline from NFT-PERF-01 | Ratio indicates tiling overhead |
**Pass criteria**: Request completes within 120s. Processing time scales proportionally with number of tiles (not exponentially).
**Duration**: ~120s
---
### NFT-PERF-04: Video processing frame rate
**Summary**: Measure effective frame processing rate during video detection.
**Traces to**: AC-VP-1
**Metric**: Frames processed per second, total processing time
**Preconditions**:
- Engine is initialized and warm
- SSE client connected
**Steps**:
| Step | Consumer Action | Measurement |
|------|----------------|-------------|
| 1 | `POST /detect/test-media-perf` with test-video and `frame_period_recognition: 4` | — |
| 2 | Count SSE events and measure total time from "started" to "AIProcessed" | Compute frames/second |
**Pass criteria**: Processing completes within 5× video duration (10s video → < 50s processing). Frame processing rate is consistent (no stalls > 10s between events).
**Duration**: ~120s
@@ -0,0 +1,94 @@
# Resilience Tests
### NFT-RES-01: Loader service outage after engine initialization
**Summary**: Verify that detections continue working when the Loader service goes down after the engine is already loaded.
**Traces to**: RESTRICT-ENV-1
**Preconditions**:
- Engine is initialized (model already downloaded)
**Fault injection**:
- Stop mock-loader service
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Stop mock-loader | — |
| 2 | `POST /detect` with small-image | 200 OK — detection succeeds (engine already in memory) |
| 3 | `GET /health` | `aiAvailability` remains "Enabled" |
**Pass criteria**: Detection continues to work. Health status remains stable. No errors from loader unavailability.
---
### NFT-RES-02: Annotations service outage during async detection
**Summary**: Verify that async detection completes and delivers SSE events even when Annotations service is down.
**Traces to**: RESTRICT-ENV-2
**Preconditions**:
- Engine is initialized
- SSE client connected
**Fault injection**:
- Stop mock-annotations mid-processing
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Start async detection: `POST /detect/test-media-res01` | `{"status": "started"}` |
| 2 | After first few SSE events, stop mock-annotations | — |
| 3 | Continue listening to SSE | Events continue arriving. Annotations POST failures are silently caught |
| 4 | Wait for completion | Final `AIProcessed` event received |
**Pass criteria**: Detection pipeline completes fully. SSE delivery is unaffected. No crash or 500 errors.
---
### NFT-RES-03: Engine initialization retry after transient loader failure
**Summary**: Verify that if model download fails on first attempt, a subsequent detection request retries initialization.
**Traces to**: AC-EL-2
**Preconditions**:
- Fresh service (engine not initialized)
**Fault injection**:
- Mock-loader returns 503 on first model request, then recovers
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Configure mock-loader to fail first request | — |
| 2 | `POST /detect` with small-image | Error (503 or 422) |
| 3 | Configure mock-loader to succeed | — |
| 4 | `POST /detect` with small-image | 200 OK — engine initializes on retry |
**Pass criteria**: Second detection succeeds after loader recovers. System does not permanently lock into error state.
---
### NFT-RES-04: Service restart with in-memory state loss
**Summary**: Verify that after a service restart, all in-memory state (_active_detections, _event_queues) is cleanly reset.
**Traces to**: RESTRICT-OP-5, RESTRICT-OP-6
**Preconditions**:
- Previous detection may have been in progress
**Fault injection**:
- Restart detections container
**Steps**:
| Step | Action | Expected Behavior |
|------|--------|------------------|
| 1 | Restart detections container | — |
| 2 | `GET /health` | Returns `aiAvailability: "None"` (fresh start) |
| 3 | `POST /detect/any-media-id` | Accepted (no stale _active_detections blocking it) |
**Pass criteria**: No stale state from previous session. All endpoints functional after restart.
@@ -0,0 +1,87 @@
# Resource Limit Tests
### NFT-RES-LIM-01: ThreadPoolExecutor worker limit (2 concurrent)
**Summary**: Verify that no more than 2 inference operations run simultaneously.
**Traces to**: RESTRICT-HW-3
**Preconditions**:
- Engine is initialized
**Monitoring**:
- Track concurrent request timings
**Steps**:
| Step | Consumer Action | Expected Behavior |
|------|----------------|------------------|
| 1 | Send 4 concurrent `POST /detect` requests | — |
| 2 | Measure response arrival times | First 2 complete roughly together; next 2 complete after |
**Duration**: ~60s
**Pass criteria**: Clear evidence of 2-at-a-time processing (second batch starts after first completes). All 4 requests eventually succeed.
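"Clear evidence of 2-at-a-time processing" can be checked mechanically from the recorded request timings. A sketch of that pass-criteria check:

```python
def max_concurrency(intervals):
    """Infer peak parallelism from (start, end) request timings — the
    pass-criteria check for the 2-worker ThreadPoolExecutor limit."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # Sort ends before starts at equal timestamps so back-to-back
    # requests do not count as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

# 4 concurrent requests against a 2-worker pool: two finish first,
# then the remaining two are processed.
timings = [(0.0, 9.8), (0.1, 10.2), (9.9, 19.7), (10.3, 20.1)]
print(max_concurrency(timings))  # → 2
```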
---
### NFT-RES-LIM-02: SSE queue depth limit (100 events)
**Summary**: Verify that the SSE queue per client does not exceed 100 events.
**Traces to**: AC-API-4
**Preconditions**:
- Engine is initialized
**Monitoring**:
- SSE event count
**Steps**:
| Step | Consumer Action | Expected Behavior |
|------|----------------|------------------|
| 1 | Open SSE connection but do not read (stall client) | — |
| 2 | Trigger async detection that produces > 100 events | — |
| 3 | After processing completes, drain the SSE queue | ≤ 100 events received |
**Duration**: ~120s
**Pass criteria**: No more than 100 events buffered. No OOM or connection errors from queue growth.
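The cap implies a bounded per-client queue. A sketch of the assumed overflow policy (the spec does not say whether the newest or the oldest event is dropped; drop-newest is modelled here):

```python
import queue

def enqueue_sse(q, event):
    """Assumed drop-on-full policy for the per-client SSE queue: once the
    100-event cap is reached, further events are discarded rather than
    growing the buffer unboundedly."""
    try:
        q.put_nowait(event)
        return True
    except queue.Full:
        return False

q = queue.Queue(maxsize=100)
accepted = sum(enqueue_sse(q, i) for i in range(150))  # 100 accepted, 50 dropped
drained = []
while not q.empty():
    drained.append(q.get_nowait())
```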
---
### NFT-RES-LIM-03: Max 300 detections per frame
**Summary**: Verify that the system returns at most 300 detections per frame (model output limit).
**Traces to**: RESTRICT-SW-6
**Preconditions**:
- Engine is initialized
- Image with dense scene expected to produce many detections
**Monitoring**:
- Detection count per response
**Duration**: ~30s
**Pass criteria**: No response contains more than 300 detections. Dense images hit the cap without errors.
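A small pass-criteria helper for this scenario (field extraction from the actual response schema is left to the harness):

```python
def frames_over_cap(detection_counts, cap=300):
    """Pass-criteria check for NFT-RES-LIM-03: report any frame whose
    detection count exceeds the 300-per-frame model output limit."""
    return [i for i, n in enumerate(detection_counts) if n > cap]

# A dense image may hit the cap exactly — that is still a pass.
print(frames_over_cap([12, 300, 287]))  # → []
```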
---
### NFT-RES-LIM-04: Log file rotation and retention
**Summary**: Verify that log files rotate daily and are retained for 30 days.
**Traces to**: AC-LOG-1, AC-LOG-2
**Preconditions**:
- Detections service running with Logs/ volume mounted for inspection
**Monitoring**:
- Log file creation, naming, and count
**Steps**:
| Step | Consumer Action | Expected Behavior |
|------|----------------|------------------|
| 1 | Make several detection requests | Logs written to `Logs/log_inference_YYYYMMDD.txt` |
| 2 | Verify log file name matches current date | File name contains today's date |
| 3 | Verify log content format | Contains INFO/DEBUG/WARNING entries with timestamps |
**Duration**: ~10s
**Pass criteria**: Log file exists with correct date-based naming. Content includes structured log entries.
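The file-name and entry-format checks from steps 2–3 can be sketched as follows (the exact timestamp layout inside entries is an assumption; only the level keywords are stated by the criteria):

```python
import datetime
import re

def expected_log_name(day=None):
    """File name the test looks for in the mounted Logs/ volume."""
    day = day or datetime.date.today()
    return f"log_inference_{day:%Y%m%d}.txt"

# Assumed entry shape: ISO-ish timestamp followed by a level keyword.
ENTRY = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}.*\b(INFO|DEBUG|WARNING|ERROR)\b")

def is_structured_entry(line):
    return bool(ENTRY.match(line))

print(expected_log_name(datetime.date(2026, 3, 24)))  # → log_inference_20260324.txt
```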
# Security Tests
### NFT-SEC-01: Malformed multipart payload handling
**Summary**: Verify that the service handles malformed multipart requests without crashing.
**Traces to**: AC-API-2 (security)
**Steps**:
| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Send `POST /detect` with truncated multipart body (missing boundary) | 400 or 422 — not 500 |
| 2 | Send `POST /detect` with Content-Type: multipart but no file part | 400 — empty image |
| 3 | `GET /health` after malformed requests | Service is still healthy |
**Pass criteria**: All malformed requests return 4xx. Service remains operational.
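The two malformed payloads from steps 1–2 can be constructed by hand rather than relying on an HTTP client that refuses to send them. A sketch:

```python
BOUNDARY = "testboundary"

def multipart_body(field, filename, payload, boundary=BOUNDARY):
    """Well-formed multipart/form-data body; the malformed NFT-SEC-01
    variants are derived from it below."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    return head + payload + f"\r\n--{boundary}--\r\n".encode()

good = multipart_body("file", "img.jpg", b"\xff\xd8\xff\xe0")  # JPEG magic bytes
truncated = good[: len(good) // 2]             # step 1: closing boundary missing
no_file_part = f"--{BOUNDARY}--\r\n".encode()  # step 2: multipart with no file
```

Each variant is sent as the raw request body with `Content-Type: multipart/form-data; boundary=testboundary`.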
---
### NFT-SEC-02: Oversized request body
**Summary**: Verify system behavior when an extremely large file is uploaded.
**Traces to**: RESTRICT-OP-4
**Steps**:
| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | Send `POST /detect` with a 500 MB random file | Error response (413, 400, or timeout) — not OOM crash |
| 2 | `GET /health` | Service is still running |
**Pass criteria**: Service does not crash or run out of memory. Returns an error or times out gracefully.
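Generating the 500 MB payload need not consume 500 MB of disk: a sparse file has the right reported size for the upload. A sketch (demonstrated at 1 MiB; the real test passes `500 * 1024**2`):

```python
import os
import tempfile

def make_oversized_file(path, size_bytes):
    """Create a sparse file of the target size (500 MB in NFT-SEC-02)
    without actually writing that much data."""
    with open(path, "wb") as f:
        f.seek(size_bytes - 1)
        f.write(b"\0")
    return os.path.getsize(path)

with tempfile.TemporaryDirectory() as d:
    size = make_oversized_file(os.path.join(d, "huge.bin"), 1024**2)
```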
---
### NFT-SEC-03: JWT token is forwarded without modification
**Summary**: Verify that the Authorization header is forwarded to the Annotations service as-is.
**Traces to**: AC-API-3
**Steps**:
| Step | Consumer Action | Expected Response |
|------|----------------|------------------|
| 1 | `POST /detect/test-media-sec` with `Authorization: Bearer test-jwt-123` and `x-refresh-token: refresh-456` | `{"status": "started"}` |
| 2 | After processing, query mock-annotations `GET /mock/annotations` | Recorded request contains `Authorization: Bearer test-jwt-123` header |
**Pass criteria**: Exact token received by mock-annotations matches what the consumer sent.
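Because the detections service forwards the token without verifying its signature (see jwt-token in Test Data Management), the test JWT only needs to be structurally valid with an `exp` claim. A sketch of runtime generation:

```python
import base64
import json
import time

def _b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def make_test_jwt(lifetime_s=3600):
    """Structurally valid JWT with an exp claim. An alg=none token with
    an empty signature segment suffices, since signatures are not
    verified by the service under test."""
    header = _b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    claims = _b64url(json.dumps({"exp": int(time.time()) + lifetime_s}).encode())
    return f"{header}.{claims}."

token = make_test_jwt()
auth_header = f"Bearer {token}"  # must be recorded verbatim by mock-annotations
```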
# Test Data Management
## Seed Data Sets
| Data Set | Source File | Description | Used by Tests | How Loaded | Cleanup |
|----------|------------|-------------|---------------|-----------|---------|
| onnx-model | `input_data/azaion.onnx` | YOLO ONNX model (1280×1280 input, 19 classes, 81MB) | All detection tests | Volume mount to mock-loader `/models/azaion.onnx` | Container restart |
| classes-json | `classes.json` (repo root) | 19 detection classes with Id, Name, Color, MaxSizeM | All tests | Volume mount to detections `/app/classes.json` | Container restart |
| image-small | `input_data/image_small.jpg` | JPEG 1280×720 — below tiling threshold (1920×1920) | FT-P-01..03, 05, 07, 13..15, FT-N-03, 06, NFT-PERF-01..02, NFT-RES-01, 03, NFT-SEC-01, NFT-RES-LIM-01 | Volume mount to consumer `/media/` | N/A (read-only) |
| image-large | `input_data/image_large.JPG` | JPEG 6252×4168 — above tiling threshold, triggers GSD tiling | FT-P-04, 16, NFT-PERF-03 | Volume mount to consumer `/media/` | N/A (read-only) |
| image-dense-01 | `input_data/image_dense01.jpg` | JPEG 1280×720 — dense scene with many clustered objects | FT-P-06, NFT-RES-LIM-03 | Volume mount to consumer `/media/` | N/A (read-only) |
| image-dense-02 | `input_data/image_dense02.jpg` | JPEG 1920×1080 — dense scene variant, borderline tiling | FT-P-06 (variant) | Volume mount to consumer `/media/` | N/A (read-only) |
| image-different-types | `input_data/image_different_types.jpg` | JPEG 900×1600 — varied object classes for class variant tests | FT-P-13 | Volume mount to consumer `/media/` | N/A (read-only) |
| image-empty-scene | `input_data/image_empty_scene.jpg` | JPEG 1920×1080 — clean scene with no detectable objects | Edge case (zero detections) | Volume mount to consumer `/media/` | N/A (read-only) |
| video-short-01 | `input_data/video_short01.mp4` | MP4 video — standard async/SSE/video detection tests | FT-P-08..12, FT-N-04, 07, NFT-PERF-04, NFT-RES-02, NFT-SEC-03 | Volume mount to consumer `/media/` | N/A (read-only) |
| video-short-02 | `input_data/video_short02.mp4` | MP4 video — variant for concurrent and resilience tests | NFT-RES-02 (variant), NFT-RES-04 | Volume mount to consumer `/media/` | N/A (read-only) |
| video-long-03 | `input_data/video_long03.mp4` | MP4 long video (288MB) — generates >100 SSE events for overflow tests | FT-N-08, NFT-RES-LIM-02 | Volume mount to consumer `/media/` | N/A (read-only) |
| empty-image | Generated at build time | Zero-byte file | FT-N-01 | Generated in e2e/fixtures/ | N/A |
| corrupt-image | Generated at build time | Random binary garbage (not valid image format) | FT-N-02 | Generated in e2e/fixtures/ | N/A |
| jwt-token | Generated at runtime | Valid JWT with exp claim (not signature-verified by detections) | FT-P-08, 09, FT-N-04, 07, NFT-SEC-03 | Generated by consumer at runtime | N/A |
## Data Isolation Strategy
Each test run starts with fresh containers (`docker compose down -v && docker compose up`). The detections service is stateless — no persistent data between runs. Mock services reset their state on container restart. Tests that modify mock behavior (e.g., making loader unreachable) must run in isolated test groups.
## Input Data Mapping
| Input Data File | Source Location | Description | Covers Scenarios |
|-----------------|----------------|-------------|-----------------|
| data_parameters.md | `_docs/00_problem/input_data/data_parameters.md` | API parameter schemas, config defaults, classes.json structure | Informs all test input construction |
| azaion.onnx | `_docs/00_problem/input_data/azaion.onnx` | YOLO ONNX detection model | All detection tests |
| image_small.jpg | `_docs/00_problem/input_data/image_small.jpg` | 1280×720 aerial image | Single-frame detection, health, negative, perf tests |
| image_large.JPG | `_docs/00_problem/input_data/image_large.JPG` | 6252×4168 aerial image | Tiling tests |
| image_dense01.jpg | `_docs/00_problem/input_data/image_dense01.jpg` | Dense scene 1280×720 | Dedup, detection cap tests |
| image_dense02.jpg | `_docs/00_problem/input_data/image_dense02.jpg` | Dense scene 1920×1080 | Dedup variant |
| image_different_types.jpg | `_docs/00_problem/input_data/image_different_types.jpg` | Varied classes 900×1600 | Class variant tests |
| image_empty_scene.jpg | `_docs/00_problem/input_data/image_empty_scene.jpg` | Empty scene 1920×1080 | Zero-detection edge case |
| video_short01.mp4 | `_docs/00_problem/input_data/video_short01.mp4` | Standard video | Async, SSE, video, perf tests |
| video_short02.mp4 | `_docs/00_problem/input_data/video_short02.mp4` | Video variant | Resilience, concurrent tests |
| video_long03.mp4 | `_docs/00_problem/input_data/video_long03.mp4` | Long video (288MB) | SSE overflow, queue depth tests |
| classes.json | repo root `classes.json` | 19 detection classes | All tests |
## External Dependency Mocks
| External Service | Mock/Stub | How Provided | Behavior |
|-----------------|-----------|-------------|----------|
| Loader Service | HTTP stub | Docker service `mock-loader` | Serves ONNX model from volume on `GET /models/azaion.onnx`. Accepts TensorRT upload on `POST /upload`. Returns 404 for unknown files. Configurable: can simulate downtime (503) via control endpoint `POST /mock/config`. |
| Annotations Service | HTTP stub | Docker service `mock-annotations` | Accepts annotation POST on `POST /annotations` — stores in memory for verification. Provides token refresh on `POST /auth/refresh`. Configurable: can simulate downtime (503) via control endpoint `POST /mock/config`. Returns recorded annotations on `GET /mock/annotations` for test assertions. |
## Data Validation Rules
| Data Type | Validation | Invalid Examples | Expected System Behavior |
|-----------|-----------|-----------------|------------------------|
| Image file (POST /detect) | Non-empty bytes, decodable by OpenCV | Zero-byte file, random binary, text file | 400 Bad Request |
| media_id (POST /detect/{media_id}) | String, unique among active detections | Already-active media_id | 409 Conflict |
| AIConfigDto fields | probability_threshold: 0.0–1.0; frame_period_recognition: positive int; big_image_tile_overlap_percent: 0–100 | probability_threshold: -1 or 2.0; frame_period_recognition: 0 | System uses defaults or returns validation error |
| Authorization header | Bearer token format | Missing header, malformed JWT | Token forwarded to Annotations as-is; detection still proceeds |
| classes.json | JSON array of objects with Id, Name, Color, MaxSizeM | Missing file, empty array, malformed JSON | Service fails to start / returns empty detections |
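The AIConfigDto row allows either fallback-to-defaults or a validation error. A sketch of the fallback interpretation (the default values here are assumptions; the real ones come from the service config / data_parameters.md):

```python
# Assumed defaults — illustrative only, not taken from the service.
DEFAULTS = {
    "probability_threshold": 0.5,
    "frame_period_recognition": 5,
    "big_image_tile_overlap_percent": 20,
}

def sanitize_config(cfg):
    """Apply the AIConfigDto rules from the table above, falling back to
    defaults for out-of-range or wrongly typed values."""
    out = dict(DEFAULTS)
    p = cfg.get("probability_threshold")
    if isinstance(p, (int, float)) and not isinstance(p, bool) and 0.0 <= p <= 1.0:
        out["probability_threshold"] = float(p)
    n = cfg.get("frame_period_recognition")
    if isinstance(n, int) and not isinstance(n, bool) and n > 0:
        out["frame_period_recognition"] = n
    o = cfg.get("big_image_tile_overlap_percent")
    if isinstance(o, (int, float)) and not isinstance(o, bool) and 0 <= o <= 100:
        out["big_image_tile_overlap_percent"] = o
    return out
```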
# Traceability Matrix
## Acceptance Criteria Coverage
| AC ID | Acceptance Criterion | Test IDs | Coverage |
|-------|---------------------|----------|----------|
| AC-DA-1 | Detections with confidence below probability_threshold are filtered out | FT-P-03, FT-P-05 | Covered |
| AC-DA-2 | Overlapping detections with containment ratio > tracking_intersection_threshold are deduplicated | FT-P-06 | Covered |
| AC-DA-3 | Tile duplicate detections identified when bounding box coordinates differ by < 0.01 | FT-P-16 | Covered |
| AC-DA-4 | Physical size filtering: detections exceeding max_object_size_meters removed | FT-P-07 | Covered |
| AC-VP-1 | Frame sampling: every Nth frame processed (frame_period_recognition) | FT-P-10, NFT-PERF-04 | Covered |
| AC-VP-2 | Minimum annotation interval: frame_recognition_seconds between annotations | FT-P-11 | Covered |
| AC-VP-3 | Tracking: new annotation accepted on movement/confidence change | FT-P-12 | Covered |
| AC-IP-1 | Images ≤ 1.5× model dimensions processed as single frame | FT-P-03 | Covered |
| AC-IP-2 | Larger images: tiled based on GSD, tile overlap configurable | FT-P-04, FT-P-16, NFT-PERF-03 | Covered |
| AC-API-1 | GET /health returns status: "healthy" with aiAvailability | FT-P-01, FT-P-02 | Covered |
| AC-API-2 | POST /detect returns detections synchronously. Errors: 400, 422, 503 | FT-P-03, FT-N-01, FT-N-02, FT-N-03, NFT-SEC-01 | Covered |
| AC-API-3 | POST /detect/{media_id} returns immediately with "started". Rejects duplicate with 409 | FT-P-08, FT-N-04, NFT-SEC-03 | Covered |
| AC-API-4 | GET /detect/stream delivers SSE events. Queue max depth: 100 | FT-P-09, FT-N-08, NFT-RES-LIM-02 | Covered |
| AC-EL-1 | Engine initialization is lazy (first detection, not startup) | FT-P-01, FT-P-14 | Covered |
| AC-EL-2 | Status transitions: NONE → DOWNLOADING → ENABLED / ERROR | FT-P-02, FT-P-14, FT-N-03, NFT-RES-03 | Covered |
| AC-EL-3 | GPU check: NVIDIA GPU with compute capability ≥ 6.1 | FT-P-15 | Covered |
| AC-EL-4 | TensorRT conversion uses FP16 when GPU supports it | — | NOT COVERED — requires specific GPU hardware; verified by visual inspection of TensorRT build logs |
| AC-EL-5 | Background conversion does not block API responsiveness | FT-P-01, FT-P-14 | Covered |
| AC-LOG-1 | Log files: Logs/log_inference_YYYYMMDD.txt | NFT-RES-LIM-04 | Covered |
| AC-LOG-2 | Rotation: daily. Retention: 30 days | NFT-RES-LIM-04 | Covered |
| AC-OC-1 | 19 base classes, 3 weather modes, up to 57 variants | FT-P-13 | Covered |
| AC-OC-2 | Each class has Id, Name, Color, MaxSizeM | FT-P-07, FT-P-13 | Covered |
## Restrictions Coverage
| Restriction ID | Restriction | Test IDs | Coverage |
|---------------|-------------|----------|----------|
| RESTRICT-HW-1 | GPU CC ≥ 6.1 required for TensorRT | FT-P-15 | Covered |
| RESTRICT-HW-2 | TensorRT conversion uses 90% GPU memory workspace | — | NOT COVERED — requires controlled GPU memory environment; verified during manual engine build |
| RESTRICT-HW-3 | ThreadPoolExecutor limited to 2 workers | NFT-PERF-02, NFT-RES-LIM-01 | Covered |
| RESTRICT-SW-1 | Python 3 + Cython 3.1.3 compilation required | — | NOT COVERED — build-time constraint; verified by Docker build succeeding |
| RESTRICT-SW-2 | ONNX model (azaion.onnx) must be available via Loader | FT-N-06, NFT-RES-01, NFT-RES-03 | Covered |
| RESTRICT-SW-3 | TensorRT engines are GPU-architecture-specific (not portable) | — | NOT COVERED — requires multiple GPU architectures; documented constraint |
| RESTRICT-SW-4 | classes.json must exist at startup | FT-N-05 | Covered |
| RESTRICT-SW-5 | Model input: fixed 1280×1280 | FT-P-03, FT-P-04 | Covered |
| RESTRICT-SW-6 | Max 300 detections per frame | NFT-RES-LIM-03 | Covered |
| RESTRICT-ENV-1 | LOADER_URL must be reachable for model download | FT-N-06, NFT-RES-01, NFT-RES-03 | Covered |
| RESTRICT-ENV-2 | ANNOTATIONS_URL must be reachable for result posting | FT-N-07, NFT-RES-02 | Covered |
| RESTRICT-ENV-3 | Logs/ directory must be writable | NFT-RES-LIM-04 | Covered |
| RESTRICT-OP-1 | Stateless — no local persistence of detection results | NFT-RES-04 | Covered |
| RESTRICT-OP-2 | No TLS at application level | — | NOT COVERED — infrastructure-level concern; out of scope for application blackbox tests |
| RESTRICT-OP-3 | No CORS configuration | — | NOT COVERED — requires browser-based testing; out of scope for API-level blackbox tests |
| RESTRICT-OP-4 | No rate limiting | NFT-SEC-02 | Covered |
| RESTRICT-OP-5 | No graceful shutdown — in-progress detections not drained | NFT-RES-04 | Covered |
| RESTRICT-OP-6 | Single-instance in-memory state (not shared across instances) | NFT-RES-04 | Covered |
## Coverage Summary
| Category | Total Items | Covered | Not Covered | Coverage % |
|----------|-----------|---------|-------------|-----------|
| Acceptance Criteria | 22 | 21 | 1 | 95% |
| Restrictions | 18 | 13 | 5 | 72% |
| **Total** | **40** | **34** | **6** | **85%** |
## Uncovered Items Analysis
| Item | Reason Not Covered | Risk | Mitigation |
|------|-------------------|------|-----------|
| AC-EL-4 (FP16 TensorRT) | Requires specific GPU with FP16 support; blackbox test cannot control hardware capabilities | Low — TensorRT builder auto-detects FP16 | Verified during manual TensorRT build; logged by engine |
| RESTRICT-HW-2 (90% GPU memory) | Requires controlled GPU memory environment with specific memory sizes | Low — hardcoded workspace fraction | Verified by observing TensorRT build logs on target hardware |
| RESTRICT-SW-1 (Cython compilation) | Build-time constraint, not runtime behavior | Low — Docker build validates this | Docker build step serves as the validation gate |
| RESTRICT-SW-3 (TensorRT non-portable) | Requires multiple GPU architectures in test environment | Low — engine filename encodes architecture | Architecture-specific filenames prevent incorrect loading |
| RESTRICT-OP-2 (No TLS) | Infrastructure-level concern; application does not implement TLS | None — by design | TLS handled by reverse proxy / service mesh in deployment |
| RESTRICT-OP-3 (No CORS) | Browser-specific concern; API-level blackbox tests don't use browsers | Low — known limitation | Can be tested separately with browser automation if needed |