add tests

gen_tests updated solution.md updated
2026-04-23 04:26:35 +00:00 · 2025-11-24 22:57:46 +02:00
parent f50006d100
commit 4f8c18a066
49 changed files with 7209 additions and 3 deletions
@@ -0,0 +1,171 @@
+# Acceptance Test: Single Image Processing Performance (<5 seconds)
+
+## Summary
+Validate the AC-7 requirement that each individual image is processed in less than 5 seconds on the target hardware (NVIDIA RTX 2060 or RTX 3070).
+
+## Linked Acceptance Criteria
+**AC-7**: Less than 5 seconds for processing one image.
+
+## Preconditions
+- ASTRAL-Next system deployed on target hardware
+- Hardware: NVIDIA RTX 2060 (minimum) or RTX 3070 (recommended)
+- TensorRT FP16 optimized models loaded
+- System warmed up (first frame excluded from timing)
+- No other GPU-intensive processes running
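The warm-up precondition can be sketched as a small timing harness. This is a minimal illustration, not the project's actual harness: `time_per_image_ms`, its `process` callable, and the `warmup` parameter are hypothetical names. It assumes `process` blocks until the result is ready; if the pipeline launches asynchronous GPU work, the callable has to synchronize before returning, otherwise the timer stops too early.

```python
import time
from typing import Callable, Iterable, List


def time_per_image_ms(
    process: Callable[[object], object],
    images: Iterable[object],
    warmup: int = 1,
) -> List[float]:
    """Return per-image latency in milliseconds, skipping the first `warmup` images.

    `process` is a stand-in for the real pipeline entry point (an assumption).
    """
    latencies: List[float] = []
    for index, image in enumerate(images):
        start = time.perf_counter()
        process(image)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if index >= warmup:
            # Warm-up frames are still processed, but their timing is discarded.
            latencies.append(elapsed_ms)
    return latencies
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.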
+
+## Test Data
+- **Test Images**: AD000001-AD000010 (representative sample), plus AD000032-AD000033 for the hard scenario
+- **Scenarios**:
+  - Easy: Normal overlap, clear features (AD000001-002)
+  - Medium: Agricultural texture (AD000005-006)
+  - Hard: Minimal overlap (AD000032-033)
+
+## Performance Breakdown Target
+```
+L1 SuperPoint+LightGlue: 50-150ms
+L2 AnyLoc (keyframes only): 150-200ms
+L3 LiteSAM (keyframes only): 60-100ms
+Factor Graph Update: 50-100ms
+Overhead (I/O, coordination): 50-100ms
+-----------------------------------
+Total (L1 only): <500ms
+Total (L1+L2+L3): <700ms
+Safety Margin: 4300ms
+AC-7 Limit: 5000ms
+```
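The totals in the breakdown can be checked by summing the upper bound of each stage. The sketch below copies the figures from the table; the dictionary keys and the helper name `worst_case_total` are illustrative only.

```python
# Upper bounds in milliseconds, copied from the breakdown above.
BUDGET_MS = {
    "l1_superpoint_lightglue": 150,
    "l2_anyloc": 200,
    "l3_litesam": 100,
    "factor_graph": 100,
    "overhead": 100,
}
AC7_LIMIT_MS = 5000
ALL_LAYERS_TARGET_MS = 700  # "Total (L1+L2+L3)" target from the table


def worst_case_total(stages):
    """Sum the upper-bound latency of the named stages."""
    return sum(BUDGET_MS[name] for name in stages)


l1_only_ms = worst_case_total(["l1_superpoint_lightglue", "factor_graph", "overhead"])
all_layers_ms = worst_case_total(BUDGET_MS)
safety_margin_ms = AC7_LIMIT_MS - ALL_LAYERS_TARGET_MS

print(l1_only_ms, all_layers_ms, safety_margin_ms)
```

The stage upper bounds add up to 350ms for the L1-only path and 650ms with all layers active, so both stated totals (<500ms and <700ms) leave some slack, and the 4300ms safety margin is the AC-7 limit minus the 700ms target.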
+
+## Test Steps
+
+### Step 1: Measure Easy Scenario Performance
+**Action**: Process AD000001 → AD000002 (normal overlap, clear features)
+**Expected Result**:
+```
+Image Load: <50ms
+L1 Processing: 80-120ms
+Factor Graph: 30-50ms
+Result Output: <20ms
+---
+Total: <240ms
+Status: WELL_UNDER_LIMIT (4.8% of budget)
+```
+
+### Step 2: Measure Medium Scenario Performance
+**Action**: Process AD000005 → AD000006 (agricultural texture)
+**Expected Result**:
+```
+Image Load: <50ms
+L1 Processing: 100-150ms (more features)
+Factor Graph: 40-60ms
+Result Output: <20ms
+---
+Total: <280ms
+Status: UNDER_LIMIT (5.6% of budget)
+```
+
+### Step 3: Measure Hard Scenario Performance
+**Action**: Process AD000032 → AD000033 (220.6m jump, minimal overlap)
+**Expected Result**:
+```
+Image Load: <50ms
+L1 Processing: 150-200ms (adaptive depth)
+L1 Confidence: LOW → Triggers L2
+L2 Processing: 150-200ms
+L3 Refinement: 80-120ms
+Factor Graph: 80-120ms (more complex)
+Result Output: <30ms
+---
+Total: <720ms
+Status: UNDER_LIMIT (14.4% of budget)
+```
+
+### Step 4: Measure Worst-Case Performance
+**Action**: Process with all layers active + large factor graph
+**Expected Result**:
+```
+Image Load: 80ms
+L1 Processing: 200ms
+L2 Processing: 200ms
+L3 Refinement: 120ms
+Factor Graph: 150ms (200+ nodes)
+Result Output: 50ms
+---
+Total: 800ms
+Status: UNDER_LIMIT (16% of budget)
+```
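The totals and budget percentages in Steps 1 to 4 follow from summing the per-stage upper bounds and dividing by the AC-7 limit. A short sketch of that arithmetic, using the figures from the expected results above (the scenario keys and the function name `budget_share` are illustrative):

```python
AC7_LIMIT_MS = 5000

# Per-stage upper bounds in milliseconds, in the order listed in each step.
SCENARIOS = {
    "easy": [50, 120, 50, 20],
    "medium": [50, 150, 60, 20],
    "hard": [50, 200, 200, 120, 120, 30],
    "worst_case": [80, 200, 200, 120, 150, 50],
}


def budget_share(stage_ms):
    """Return (total latency in ms, percentage of the AC-7 budget used)."""
    total = sum(stage_ms)
    return total, 100.0 * total / AC7_LIMIT_MS


for name, stages in SCENARIOS.items():
    total, share = budget_share(stages)
    print(f"{name}: {total} ms ({share:.1f}% of budget)")
```

Note that the Step 4 values are fixed figures rather than upper bounds, so their sum is exactly 800ms.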
+
+### Step 5: Statistical Performance Analysis
+**Action**: Process 10 representative images, calculate statistics
+**Expected Result**:
+```
+Mean processing time: 350ms
+Median processing time: 280ms
+90th percentile: 500ms
+95th percentile: 650ms
+99th percentile: 800ms
+Max: <900ms
+All: <5000ms (AC-7 requirement)
+Status: PASS
+```
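The statistics in Step 5 could be computed along the following lines. This is a sketch under assumptions: the test does not say which percentile definition is intended, so the nearest-rank method is used here, and the function names are hypothetical.

```python
import math
import statistics
from typing import Dict, Sequence


def nearest_rank_percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Collect the summary statistics listed in Step 5."""
    return {
        "mean": statistics.mean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "p90": nearest_rank_percentile(latencies_ms, 90),
        "p95": nearest_rank_percentile(latencies_ms, 95),
        "p99": nearest_rank_percentile(latencies_ms, 99),
        "max": max(latencies_ms),
    }
```

With only 10 samples, the nearest-rank 95th and 99th percentiles both coincide with the maximum, so distinct tail percentiles need a larger run.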
+
+### Step 6: Verify TensorRT Optimization Impact
+**Action**: Compare TensorRT FP16 vs PyTorch FP32 performance
+**Expected Result**:
+```
+PyTorch FP32 (baseline): 800-1200ms per image
+TensorRT FP16 (optimized): 250-400ms per image
+Speedup: 2.5-3.5x
+Without TensorRT: Still under the 5000ms AC-7 limit, with less margin
+With TensorRT: Comfortably passes AC-7
+```
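The speedup band can be related to the two latency ranges with a few divisions. The sketch below takes the figures above at face value; `speedup_bounds` is an illustrative helper, not part of the test tooling.

```python
AC7_LIMIT_MS = 5000
PYTORCH_FP32_MS = (800, 1200)   # (fastest, slowest) baseline latency
TENSORRT_FP16_MS = (250, 400)   # (fastest, slowest) optimized latency


def speedup_bounds(baseline, optimized):
    """Return the smallest and largest speedup the two ranges allow."""
    lowest = baseline[0] / optimized[1]    # fastest baseline vs slowest optimized
    highest = baseline[1] / optimized[0]   # slowest baseline vs fastest optimized
    return lowest, highest


low, high = speedup_bounds(PYTORCH_FP32_MS, TENSORRT_FP16_MS)
matched_fast = PYTORCH_FP32_MS[0] / TENSORRT_FP16_MS[0]  # fast vs fast
matched_slow = PYTORCH_FP32_MS[1] / TENSORRT_FP16_MS[1]  # slow vs slow
baseline_within_limit = PYTORCH_FP32_MS[1] < AC7_LIMIT_MS

print(low, high, matched_fast, matched_slow, baseline_within_limit)
```

Pairing like with like gives 3.0x to 3.2x, inside the expected 2.5-3.5x band; the widest range the endpoints allow is 2.0x to 4.8x. The slowest baseline figure (1200ms) is also below the 5000ms limit.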
+
+## Pass/Fail Criteria
+
+**PASS if**:
+- 100% of images process in <5000ms
+- Mean processing time <1000ms (20% of budget)
+- 99th percentile <2000ms (40% of budget)
+- TensorRT FP16 optimization active and verified
+- Performance consistent across easy/medium/hard scenarios
+
+**FAIL if**:
+- ANY image takes ≥5000ms
+- Mean processing time >2000ms
+- System cannot maintain <5s with TensorRT optimization
+- Performance degrades over time (memory leak)
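The numeric PASS conditions can be expressed as a single check. The sketch below covers only the first four PASS bullets (per-image limit, mean, 99th percentile, TensorRT flag); consistency across scenarios and degradation over time need separate checks. The function name, the nearest-rank percentile, and the boolean `tensorrt_fp16_active` input are assumptions made for illustration.

```python
import math
import statistics
from typing import List, Sequence, Tuple

AC7_LIMIT_MS = 5000.0
MEAN_LIMIT_MS = 1000.0
P99_LIMIT_MS = 2000.0


def check_pass_criteria(
    latencies_ms: Sequence[float], tensorrt_fp16_active: bool
) -> Tuple[bool, List[str]]:
    """Return (passed, list of unmet PASS conditions)."""
    if not latencies_ms:
        raise ValueError("at least one latency sample is required")
    ordered = sorted(latencies_ms)
    p99 = ordered[max(1, math.ceil(99 * len(ordered) / 100)) - 1]
    mean = statistics.mean(ordered)

    unmet: List[str] = []
    if ordered[-1] >= AC7_LIMIT_MS:
        unmet.append("at least one image took >= 5000 ms")
    if mean >= MEAN_LIMIT_MS:
        unmet.append("mean processing time >= 1000 ms")
    if p99 >= P99_LIMIT_MS:
        unmet.append("99th percentile >= 2000 ms")
    if not tensorrt_fp16_active:
        unmet.append("TensorRT FP16 optimization not active")
    return (not unmet), unmet
```

A run that misses a PASS condition without hitting a FAIL condition (for example, a mean of 1500ms) is reported here as not passed; how such a run should be classified is left to the test owner.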
+
+## Hardware Requirements Validation
+
+### RTX 2060 (Minimum)
+- VRAM: 6GB
+- Expected performance: 90th percentile <1000ms
+- Status: Meets AC-7 with optimization
+
+### RTX 3070 (Recommended)
+- VRAM: 8GB
+- Expected performance: 90th percentile <700ms
+- Status: Comfortably meets AC-7
+
+## Performance Optimization Checklist
+- TensorRT FP16 models compiled and loaded
+- CUDA graphs enabled for inference
+- Batch size = 1 (real-time constraint)
+- Asynchronous GPU operations where possible
+- Memory pre-allocated (no runtime allocation)
+- Factor graph incremental updates (iSAM2)
+
+## Monitoring and Profiling
+- **NVIDIA Nsight**: GPU utilization >80% during processing
+- **CPU Usage**: <50% (GPU-bound workload)
+- **Memory**: Stable (no leaks over 100+ images)
+- **Thermal**: GPU <85°C sustained
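One way to make "stable" measurable is to fit a least-squares line through per-image samples (memory use, or latency) and compare its slope with a tolerance. This is a sketch only: the sampling mechanism is out of scope, and the threshold `max_growth_per_sample` is a hypothetical parameter that would have to be chosen for the target system.

```python
from typing import Sequence


def slope_per_sample(values: Sequence[float]) -> float:
    """Least-squares slope of `values` against their sample index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def looks_stable(values: Sequence[float], max_growth_per_sample: float) -> bool:
    """True when the fitted growth per sample does not exceed the tolerance."""
    return slope_per_sample(values) <= max_growth_per_sample
```

For example, memory samples taken after each of 100+ images should yield a slope near zero; a steadily positive slope suggests a leak even when every individual latency is still under the limit.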
+
+## Notes
+- AC-7 specifies "processing one image", interpreted as latency per image
+- The 5-second budget is generous given the expected typical latency of ~500ms
+- Margin allows for:
+  - Older hardware (RTX 2060)
+  - Complex scenarios (multiple layers active)
+  - Factor graph growth over long flights
+  - System overhead
+- Real-time (<100ms) not required; <5s is operational target
+