mirror of
https://github.com/azaion/gps-denied-desktop.git
synced 2026-04-23 04:26:35 +00:00
add tests
gen_tests updated solution.md updated
@@ -0,0 +1,171 @@
# Acceptance Test: Single Image Processing Performance (<5 seconds)

## Summary

Validate the AC-7 requirement that each individual image is processed in under 5 seconds on the target hardware (NVIDIA RTX 2060 or RTX 3070).

## Linked Acceptance Criteria

**AC-7**: Less than 5 seconds for processing one image.

## Preconditions

- ASTRAL-Next system deployed on target hardware
- Hardware: NVIDIA RTX 2060 (minimum) or RTX 3070 (recommended)
- TensorRT FP16 optimized models loaded
- System warmed up (first frame excluded from timing)
- No other GPU-intensive processes running

## Test Data

- **Test Images**: AD000001-AD000010 (representative sample), plus AD000032-AD000033 for the hard scenario
- **Scenarios**:
  - Easy: Normal overlap, clear features (AD000001-002)
  - Medium: Agricultural texture (AD000005-006)
  - Hard: Minimal overlap (AD000032-033)

## Performance Breakdown Target

```
L1 SuperPoint+LightGlue:       50-150ms
L2 AnyLoc (keyframes only):    150-200ms
L3 LiteSAM (keyframes only):   60-100ms
Factor Graph Update:           50-100ms
Overhead (I/O, coordination):  50-100ms
-----------------------------------
Total (L1 only):               <500ms
Total (L1+L2+L3):              <700ms
Safety Margin:                 4300ms
AC-7 Limit:                    5000ms
```

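The budget arithmetic behind these totals can be checked with a short script. This is a minimal sketch: the upper bounds are copied from the breakdown above, while `STAGE_UPPER_MS`, `worst_case_ms`, and `margin_ms` are illustrative names and not ASTRAL-Next identifiers.

```python
# Upper bounds (ms) copied from the breakdown above. The dictionary keys
# and helper names are illustrative, not ASTRAL-Next identifiers.
AC7_LIMIT_MS = 5000

STAGE_UPPER_MS = {
    "l1_superpoint_lightglue": 150,
    "l2_anyloc": 200,       # keyframes only
    "l3_litesam": 100,      # keyframes only
    "factor_graph": 100,
    "overhead": 100,
}

L1_ONLY = ("l1_superpoint_lightglue", "factor_graph", "overhead")
ALL_LAYERS = tuple(STAGE_UPPER_MS)


def worst_case_ms(stages):
    """Sum the upper bounds of the given stages."""
    return sum(STAGE_UPPER_MS[name] for name in stages)


def margin_ms(total_ms, limit_ms=AC7_LIMIT_MS):
    """Budget left over after spending total_ms."""
    return limit_ms - total_ms


if __name__ == "__main__":
    for label, stages in (("L1 only", L1_ONLY), ("L1+L2+L3", ALL_LAYERS)):
        total = worst_case_ms(stages)
        print(f"{label}: {total} ms worst case, {margin_ms(total)} ms margin")
```

The summed upper bounds come to 350ms (L1 only) and 650ms (all layers), which sit inside the rounded <500ms and <700ms totals. The 4300ms safety margin is the 5000ms limit minus the rounded 700ms total.
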
## Test Steps

### Step 1: Measure Easy Scenario Performance

**Action**: Process AD000001 → AD000002 (normal overlap, clear features)

**Expected Result**:

```
Image Load:     <50ms
L1 Processing:  80-120ms
Factor Graph:   30-50ms
Result Output:  <20ms
---
Total:          <240ms
Status:         WELL_UNDER_LIMIT (4.8% of budget)
```

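One way to collect the per-stage numbers in Steps 1-4 is a small wall-clock timer wrapped around each stage. The sketch below is an assumption about how such a harness could look: `StageTimer` and the stage names are hypothetical, and the `time.sleep` calls stand in for real pipeline work.

```python
import time
from contextlib import contextmanager

AC7_LIMIT_MS = 5000


class StageTimer:
    """Accumulates wall-clock time per named stage, in milliseconds."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock          # injectable so the timer can be tested
        self.stages_ms = {}

    @contextmanager
    def stage(self, name):
        start = self._clock()
        try:
            yield
        finally:
            elapsed_ms = (self._clock() - start) * 1000.0
            self.stages_ms[name] = self.stages_ms.get(name, 0.0) + elapsed_ms

    def total_ms(self):
        return sum(self.stages_ms.values())

    def budget_fraction(self, limit_ms=AC7_LIMIT_MS):
        """Share of the AC-7 budget consumed, as a fraction of limit_ms."""
        return self.total_ms() / limit_ms


if __name__ == "__main__":
    timer = StageTimer()
    # Placeholder work; a real run would call the pipeline stages here.
    with timer.stage("image_load"):
        time.sleep(0.01)
    with timer.stage("l1_processing"):
        time.sleep(0.02)
    for name, ms in timer.stages_ms.items():
        print(f"{name}: {ms:.1f} ms")
    print(f"total: {timer.total_ms():.1f} ms "
          f"({timer.budget_fraction():.1%} of budget)")
```

GPU work is typically launched asynchronously, so a host-side timer like this only reflects GPU time if the stage waits for the device to finish before the block exits (in PyTorch, for instance, by calling `torch.cuda.synchronize()`).
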
### Step 2: Measure Medium Scenario Performance

**Action**: Process AD000005 → AD000006 (agricultural texture)

**Expected Result**:

```
Image Load:     <50ms
L1 Processing:  100-150ms (more features)
Factor Graph:   40-60ms
Result Output:  <20ms
---
Total:          <280ms
Status:         UNDER_LIMIT (5.6% of budget)
```

### Step 3: Measure Hard Scenario Performance

**Action**: Process AD000032 → AD000033 (220.6m jump, minimal overlap)

**Expected Result**:

```
Image Load:     <50ms
L1 Processing:  150-200ms (adaptive depth)
L1 Confidence:  LOW → Triggers L2
L2 Processing:  150-200ms
L3 Refinement:  80-120ms
Factor Graph:   80-120ms (more complex)
Result Output:  <30ms
---
Total:          <720ms
Status:         UNDER_LIMIT (14.4% of budget)
```

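The escalation in this scenario (low L1 confidence triggering L2, followed by L3 refinement) can be sketched as plain control flow. Everything named here is hypothetical: the source does not state a confidence threshold or the stage interfaces, so `LOW_CONFIDENCE_THRESHOLD`, `process_frame`, and the `run_l1`/`run_l2`/`run_l3` callables are placeholders for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical value: the source only says that LOW confidence triggers L2.
LOW_CONFIDENCE_THRESHOLD = 0.5


@dataclass
class FrameResult:
    layers_run: list = field(default_factory=list)
    pose: object = None


def process_frame(frame, run_l1, run_l2, run_l3,
                  threshold=LOW_CONFIDENCE_THRESHOLD):
    """Always run L1; escalate to L2 and L3 only when L1 confidence is low.

    run_l1(frame) -> (pose, confidence)
    run_l2(frame) -> coarse estimate
    run_l3(frame, coarse) -> refined pose
    """
    result = FrameResult()
    pose, confidence = run_l1(frame)
    result.layers_run.append("L1")
    if confidence < threshold:
        coarse = run_l2(frame)          # L2 stage (AnyLoc in the source)
        result.layers_run.append("L2")
        pose = run_l3(frame, coarse)    # L3 refinement of the L2 estimate
        result.layers_run.append("L3")
    result.pose = pose
    return result
```

Recording `layers_run` per frame makes it possible to confirm that a timing sample for the hard scenario actually exercised all three layers, rather than passing only because L1 happened to succeed.
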
### Step 4: Measure Worst-Case Performance

**Action**: Process with all layers active + large factor graph

**Expected Result**:

```
Image Load:     80ms
L1 Processing:  200ms
L2 Processing:  200ms
L3 Refinement:  120ms
Factor Graph:   150ms (200+ nodes)
Result Output:  50ms
---
Total:          ≤800ms
Status:         UNDER_LIMIT (16% of budget)
```

### Step 5: Statistical Performance Analysis

**Action**: Process the 10 representative images and calculate latency statistics

**Note**: With only 10 samples, the 95th and 99th percentiles are determined by the slowest one or two images, so treat them as indicative. A longer run, such as the 100+ image sequence used for leak monitoring, gives more stable tail estimates.

**Expected Result**:

```
Mean processing time:    350ms
Median processing time:  280ms
90th percentile:         500ms
95th percentile:         650ms
99th percentile:         800ms
Max:                     <900ms
All:                     <5000ms (AC-7 requirement)
Status:                  PASS
```

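The statistics in this step can be computed with the Python standard library. The sketch below uses the nearest-rank percentile definition, which is one of several common conventions and is an assumption here; the source does not say which one the test uses. The sample latencies are synthetic values for illustration, not measurements.

```python
import math
import statistics


def percentile_nearest_rank(samples, pct):
    """Smallest sample with at least pct percent of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def summarize(latencies_ms, limit_ms=5000):
    """Summary statistics matching the rows of the expected-result table."""
    return {
        "mean": statistics.fmean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "p90": percentile_nearest_rank(latencies_ms, 90),
        "p95": percentile_nearest_rank(latencies_ms, 95),
        "p99": percentile_nearest_rank(latencies_ms, 99),
        "max": max(latencies_ms),
        "all_under_limit": all(t < limit_ms for t in latencies_ms),
    }


if __name__ == "__main__":
    # Synthetic latencies (ms) for illustration only, not measured data.
    sample = [200, 220, 240, 260, 280, 280, 300, 400, 520, 800]
    for key, value in summarize(sample).items():
        print(f"{key}: {value}")
```

With a 10-element sample, nearest-rank p95 and p99 both land on the largest value, which is why tail percentiles from such a small run say little beyond the maximum.
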
### Step 6: Verify TensorRT Optimization Impact

**Action**: Compare TensorRT FP16 vs PyTorch FP32 performance

**Expected Result**:

```
PyTorch FP32 (baseline):    800-1200ms per image
TensorRT FP16 (optimized):  250-400ms per image
Speedup:                    2.5-3.5x
Without TensorRT:           Passes AC-7 (16-24% of budget), but with reduced margin
With TensorRT:              Comfortably passes AC-7
```

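A comparison like this needs the same measurement procedure for both backends, with warm-up calls excluded as the preconditions require. The following is a minimal sketch under that assumption; `median_latency_ms` and `speedup` are hypothetical helper names, and the callables passed in would wrap whatever inference entry points the two builds expose.

```python
import statistics
import time


def median_latency_ms(fn, runs=10, warmup=1, clock=time.perf_counter):
    """Median wall-clock latency of fn() in ms, discarding warm-up calls."""
    for _ in range(warmup):
        fn()                       # excluded from timing (first-frame warm-up)
    samples = []
    for _ in range(runs):
        start = clock()
        fn()
        samples.append((clock() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms, optimized_ms):
    """How many times faster the optimized path is than the baseline."""
    return baseline_ms / optimized_ms


if __name__ == "__main__":
    # Placeholder workloads standing in for the two inference backends.
    baseline = median_latency_ms(lambda: time.sleep(0.009))
    optimized = median_latency_ms(lambda: time.sleep(0.003))
    print(f"baseline: {baseline:.1f} ms, optimized: {optimized:.1f} ms, "
          f"speedup: {speedup(baseline, optimized):.2f}x")
```

Pairing like ends of the two expected ranges gives 800/250 = 3.2x and 1200/400 = 3.0x, both inside the stated 2.5-3.5x band.
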
## Pass/Fail Criteria

**PASS if**:

- 100% of images process in <5000ms
- Mean processing time <1000ms (20% of budget)
- 99th percentile <2000ms (40% of budget)
- TensorRT FP16 optimization active and verified
- Performance consistent across easy/medium/hard scenarios

**FAIL if**:

- ANY image takes ≥5000ms
- Mean processing time >2000ms
- System cannot maintain <5s with TensorRT optimization
- Performance degrades over time (memory leak)

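
The numeric criteria above can be encoded as a single verdict function. This is a sketch, not the project's test harness: `evaluate` is a hypothetical name, and returning `INCONCLUSIVE` for runs that satisfy neither list (for example a mean between 1000ms and 2000ms) is an assumption, since the criteria leave that gap undefined. The scenario-consistency and degradation-over-time criteria are not covered here.

```python
import statistics

AC7_LIMIT_MS = 5000


def evaluate(latencies_ms, p99_ms, tensorrt_fp16_active):
    """Return 'PASS', 'FAIL', or 'INCONCLUSIVE' for one test run."""
    mean_ms = statistics.fmean(latencies_ms)

    # Numeric conditions from the "FAIL if" list.
    if any(t >= AC7_LIMIT_MS for t in latencies_ms):
        return "FAIL"
    if mean_ms > 2000:
        return "FAIL"

    # Numeric conditions from the "PASS if" list.
    if mean_ms < 1000 and p99_ms < 2000 and tensorrt_fp16_active:
        return "PASS"

    # Assumption: the source does not define this middle ground.
    return "INCONCLUSIVE"
```

For example, `evaluate([300, 400], p99_ms=400, tensorrt_fp16_active=True)` returns `"PASS"`, while a run containing a single 5000ms frame returns `"FAIL"` regardless of the mean.
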
## Hardware Requirements Validation

### RTX 2060 (Minimum)

- VRAM: 6GB
- Expected performance: 90th percentile <1000ms
- Status: Meets AC-7 with optimization

### RTX 3070 (Recommended)

- VRAM: 8GB
- Expected performance: 90th percentile <700ms
- Status: Comfortably exceeds AC-7

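
The per-GPU expectations can be kept in a small lookup so that the same measured 90th percentile is judged against the right tier. The thresholds below are the two values stated above; `P90_TARGET_MS` and `meets_p90_target` are illustrative names.

```python
# 90th-percentile targets (ms) from the two hardware tiers above.
P90_TARGET_MS = {"RTX 2060": 1000, "RTX 3070": 700}


def meets_p90_target(gpu_name, p90_ms):
    """True if the measured p90 is under the target for a known GPU tier."""
    try:
        return p90_ms < P90_TARGET_MS[gpu_name]
    except KeyError:
        raise ValueError(f"no p90 target defined for {gpu_name!r}") from None
```

A measured p90 of 850ms would meet the RTX 2060 target but miss the RTX 3070 one, even though both results are far below the 5000ms AC-7 limit.
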
## Performance Optimization Checklist

- TensorRT FP16 models compiled and loaded
- CUDA graphs enabled for inference
- Batch size = 1 (real-time constraint)
- Asynchronous GPU operations where possible
- Memory pre-allocated (no runtime allocation)
- Factor graph incremental updates (iSAM2)

## Monitoring and Profiling

- **NVIDIA Nsight**: GPU utilization >80% during processing
- **CPU Usage**: <50% (GPU-bound workload)
- **Memory**: Stable (no leaks over 100+ images)
- **Thermal**: GPU <85°C sustained

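
Degradation over a long run can be screened for by comparing latency at the start and end of the sequence. The sketch below checks latency drift only, which is a symptom of a leak rather than a direct memory measurement; the window size and the 1.2x tolerance are hypothetical values chosen for illustration.

```python
import statistics


def latency_drift_ratio(latencies_ms, window=20):
    """Mean latency of the last `window` frames divided by the first `window`."""
    if len(latencies_ms) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    first = statistics.fmean(latencies_ms[:window])
    last = statistics.fmean(latencies_ms[-window:])
    return last / first


def degrades_over_time(latencies_ms, window=20, tolerance=1.2):
    """Flag a run whose late frames are more than `tolerance` times slower."""
    return latency_drift_ratio(latencies_ms, window) > tolerance
```

A flat sequence yields a ratio of 1.0. A run that trips this check is a candidate for closer inspection with a profiler, since factor graph growth over a long flight could also raise latency without any leak.
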
## Notes

- AC-7 specifies "processing one image", interpreted here as latency per image
- The 5-second budget is generous given the ~500ms target for actual performance
- Margin allows for:
  - Older hardware (RTX 2060)
  - Complex scenarios (multiple layers active)
  - Factor graph growth over long flights
  - System overhead
- Real-time operation (<100ms) is not required; <5s is the operational target