# Health & Engine Lifecycle Tests

**Task**: AZ-139_test_health_engine
**Name**: Health & Engine Lifecycle Tests
**Description**: Implement E2E tests verifying health endpoint responses and engine lazy initialization lifecycle
**Complexity**: 3 points
**Dependencies**: AZ-138_test_infrastructure
**Component**: Integration Tests
**Jira**: AZ-139
**Epic**: AZ-137

## Problem

The health endpoint and engine initialization lifecycle are critical for operational monitoring and service readiness. Tests must verify that the health endpoint correctly reflects engine state transitions (None → Downloading → Enabled/Error) and that engine initialization is lazy (triggered by first detection, not at startup).

## Outcome

- Health endpoint behavior verified across all engine states
- Lazy initialization confirmed (no engine load at startup)
- ONNX fallback path validated on CPU-only environments
- Engine state transitions observable through health endpoint

## Scope

### Included

- FT-P-01: Health check returns status before engine initialization
- FT-P-02: Health check reflects engine availability after initialization
- FT-P-14: Engine lazy initialization on first detection request
- FT-P-15: ONNX fallback when GPU unavailable

### Excluded

- TensorRT-specific engine tests (require GPU hardware)
- Performance benchmarking of engine initialization time
- Engine error recovery scenarios (covered in resilience tests)

## Acceptance Criteria

**AC-1: Pre-init health check**
Given the detections service just started with no prior requests
When GET /health is called
Then response is 200 with status "healthy" and aiAvailability "None"

**AC-2: Post-init health check**
Given a successful detection has been performed
When GET /health is called
Then aiAvailability reflects an active engine state (not "None" or "Downloading")

**AC-3: Lazy initialization**
Given a fresh service start
When GET /health is called immediately
Then aiAvailability is "None" (engine not loaded at startup)
And after POST /detect with a valid image, GET /health shows engine active

**AC-4: ONNX fallback**
Given the service runs without GPU runtime (CPU-only profile)
When POST /detect is called with a valid image
Then detection succeeds via ONNX Runtime without TensorRT errors

## Non-Functional Requirements

**Performance**
- Health check response within 2s
- First detection (including engine init) within 60s

**Reliability**
- Tests must work on both CPU-only and GPU Docker profiles

## Integration Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | Fresh service, no requests | GET /health before any detection | 200, aiAvailability: "None" | Max 2s |
| AC-2 | After POST /detect succeeds | GET /health after detection | aiAvailability not "None" | Max 30s |
| AC-3 | Fresh service | Health → Detect → Health sequence | State transition None → active | Max 60s |
| AC-4 | CPU-only Docker profile | POST /detect on CPU profile | Detection succeeds via ONNX | Max 60s |

## Constraints

- Tests must use the CPU Docker profile for ONNX fallback verification
- Engine initialization time varies by hardware; timeouts must be generous
- Health endpoint schema depends on AiAvailabilityStatus enum from codebase

## Risks & Mitigation

**Risk 1: Engine init timeout on slow CI**
- *Risk*: Engine initialization may exceed timeout on resource-constrained CI runners
- *Mitigation*: Use generous timeouts (60s) and mark as known slow test
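
The Health → Detect → Health sequence from AC-3 could be sketched as follows. This is a minimal illustration, not the project's test code: `verify_lazy_init`, `is_engine_active`, and `FakeService` are hypothetical names, the two callables stand in for whatever HTTP client the AZ-138 test infrastructure provides, and the state strings come from this task description (the real AiAvailabilityStatus enum values may differ). The 2s and 60s limits mirror the Non-Functional Requirements.

```python
import time
from typing import Callable, Dict, Tuple

# Pre-init states named in AC-2; assumed spellings, check against
# the AiAvailabilityStatus enum in the codebase.
INACTIVE_STATES = {"None", "Downloading"}


def is_engine_active(health: Dict[str, str]) -> bool:
    """True when aiAvailability is present and past the pre-init states."""
    state = health.get("aiAvailability")
    return state is not None and state not in INACTIVE_STATES


def verify_lazy_init(
    get_health: Callable[[], Dict[str, str]],
    post_detect: Callable[[], None],
    health_timeout_s: float = 2.0,
    init_timeout_s: float = 60.0,
) -> Tuple[str, str]:
    """Run health -> detect -> health and return (state_before, state_after)."""
    # Step 1: health check on a fresh service must be fast and report "None".
    started = time.monotonic()
    before = get_health()
    assert time.monotonic() - started <= health_timeout_s, "health check too slow"
    assert before["status"] == "healthy"
    assert before["aiAvailability"] == "None", "engine was loaded at startup"

    # Step 2: the first detection triggers engine initialization.
    started = time.monotonic()
    post_detect()
    assert time.monotonic() - started <= init_timeout_s, "engine init too slow"

    # Step 3: health must now report an active engine state.
    after = get_health()
    assert is_engine_active(after), f"engine not active: {after}"
    return before["aiAvailability"], after["aiAvailability"]


class FakeService:
    """In-memory stand-in for the detections service (illustration only)."""

    def __init__(self) -> None:
        self.state = "None"

    def get_health(self) -> Dict[str, str]:
        return {"status": "healthy", "aiAvailability": self.state}

    def post_detect(self) -> None:
        # A real service would initialize the engine here.
        self.state = "Enabled"


if __name__ == "__main__":
    svc = FakeService()
    print(verify_lazy_init(svc.get_health, svc.post_detect))
```

In a real E2E test the two callables would wrap GET /health and POST /detect against the running container. Passing them in as parameters keeps the sequence logic independent of the HTTP client and lets the same check run against both the CPU-only and GPU Docker profiles.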