Refactor inference engine and task management: Remove obsolete inference engine and ONNX engine files, update inference processing to utilize batch handling, and enhance task management structure in documentation. Adjust paths for task specifications to align with new directory organization.

2026-04-23 00:56:32 +00:00 · 2026-03-28 01:04:28 +02:00
parent 1e4ef299f9
commit 5be53739cd
60 changed files with 111875 additions and 208 deletions
@@ -0,0 +1,87 @@
+# Health & Engine Lifecycle Tests
+
+**Task**: AZ-139_test_health_engine
+**Name**: Health & Engine Lifecycle Tests
+**Description**: Implement E2E tests verifying health endpoint responses and engine lazy initialization lifecycle
+**Complexity**: 3 points
+**Dependencies**: AZ-138_test_infrastructure
+**Component**: Integration Tests
+**Jira**: AZ-139
+**Epic**: AZ-137
+
+## Problem
+
+The health endpoint and engine initialization lifecycle are critical for operational monitoring and service readiness. Tests must verify that the health endpoint correctly reflects engine state transitions (None → Downloading → Enabled/Error) and that engine initialization is lazy (triggered by first detection, not at startup).
+
+## Outcome
+
+- Health endpoint behavior verified across all engine states
+- Lazy initialization confirmed (no engine load at startup)
+- ONNX fallback path validated on CPU-only environments
+- Engine state transitions observable through health endpoint
+
+## Scope
+
+### Included
+- FT-P-01: Health check returns status before engine initialization
+- FT-P-02: Health check reflects engine availability after initialization
+- FT-P-14: Engine lazy initialization on first detection request
+- FT-P-15: ONNX fallback when GPU unavailable
+
+### Excluded
+- TensorRT-specific engine tests (require GPU hardware)
+- Performance benchmarking of engine initialization time
+- Engine error recovery scenarios (covered in resilience tests)
+
+## Acceptance Criteria
+
+**AC-1: Pre-init health check**
+Given the detections service just started with no prior requests
+When GET /health is called
+Then response is 200 with status "healthy" and aiAvailability "None"
+
+**AC-2: Post-init health check**
+Given a successful detection has been performed
+When GET /health is called
+Then aiAvailability reflects an active engine state (not "None" or "Downloading")
+
+**AC-3: Lazy initialization**
+Given a fresh service start
+When GET /health is called immediately
+Then aiAvailability is "None" (engine not loaded at startup)
+And after POST /detect with a valid image, GET /health shows engine active
+
+**AC-4: ONNX fallback**
+Given the service runs without GPU runtime (CPU-only profile)
+When POST /detect is called with a valid image
+Then detection succeeds via ONNX Runtime without TensorRT errors
+
+## Non-Functional Requirements
+
+**Performance**
+- Health check response within 2s
+- First detection (including engine init) within 60s
+
+**Reliability**
+- Tests must work on both CPU-only and GPU Docker profiles
+
+## Integration Tests
+
+| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
+|--------|------------------------|-------------|-------------------|----------------|
+| AC-1 | Fresh service, no requests | GET /health before any detection | 200, status: "healthy", aiAvailability: "None" | Max 2s |
+| AC-2 | After POST /detect succeeds | GET /health after detection | aiAvailability neither "None" nor "Downloading" | Max 30s |
+| AC-3 | Fresh service | Health → Detect → Health sequence | State transition None → active | Max 60s |
+| AC-4 | CPU-only Docker profile | POST /detect on CPU profile | Detection succeeds via ONNX | Max 60s |
+
+## Constraints
+
+- Tests must use the CPU Docker profile for ONNX fallback verification
+- Engine initialization time varies by hardware; timeouts must be generous (up to 60s for the first detection, per the NFRs)
+- Health endpoint schema depends on AiAvailabilityStatus enum from codebase
+
+## Risks & Mitigation
+
+**Risk 1: Engine init timeout on slow CI**
+- *Risk*: Engine initialization may exceed timeout on resource-constrained CI runners
+- *Mitigation*: Use generous timeouts (60s) and mark the affected tests as known slow tests
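
The health-state check behind AC-2 and AC-3 could be sketched as a small polling helper. This is a minimal sketch, not the project's actual test code: the base URL, the JSON response body, and the names `fetch_health`, `is_engine_active`, and `wait_for_engine_active` are all assumptions made for illustration. Only the `/health` path, the `aiAvailability` field, and the "None"/"Downloading" states come from the spec.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumption: where the service listens in the test environment.
BASE_URL = "http://localhost:8080"

# States that AC-2 explicitly treats as "engine not yet active".
INACTIVE_STATES = {"None", "Downloading"}


def fetch_health(base_url: str = BASE_URL, timeout_s: float = 2.0) -> dict:
    """GET /health and parse the body, assuming it is JSON (2s NFR budget)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_engine_active(health: dict) -> bool:
    """True when aiAvailability is present and not "None" or "Downloading"."""
    state = health.get("aiAvailability")
    return state is not None and state not in INACTIVE_STATES


def wait_for_engine_active(
    get_health: Callable[[], dict] = fetch_health,
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
) -> dict:
    """Poll the health source until the engine is active or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        health = get_health()
        if is_engine_active(health):
            return health
        if time.monotonic() >= deadline:
            state = health.get("aiAvailability")
            raise TimeoutError(f"engine still {state!r} after {timeout_s}s")
        time.sleep(poll_interval_s)
```

Passing `get_health` as a parameter lets the helper be exercised against a fake state sequence without a running service. The check follows AC-2 literally, so an "Error" state would count as active; a stricter test may want to exclude it as well.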