Batch Validator Helper
Interface Definition
Interface Name: IBatchValidator
Interface Methods
    from abc import ABC, abstractmethod
    from typing import List

    # ImageBatch and ValidationResult are defined under Data Models.
    class IBatchValidator(ABC):
        @abstractmethod
        def validate_batch_size(self, batch: ImageBatch) -> ValidationResult:
            pass

        @abstractmethod
        def check_sequence_continuity(self, batch: ImageBatch, expected_start: int) -> ValidationResult:
            pass

        @abstractmethod
        def validate_naming_convention(self, filenames: List[str]) -> ValidationResult:
            pass

        @abstractmethod
        def validate_format(self, image_data: bytes) -> ValidationResult:
            pass
Component Description
Responsibilities
- Validate image batch integrity
- Check sequence continuity and naming conventions
- Validate image format and dimensions
- Ensure batch size constraints (10-50 images)
- Support strict sequential ordering (ADxxxxxx.jpg)
Scope
- Batch validation for F05 Image Input Pipeline
- Image format validation
- Filename pattern matching
- Sequence gap detection
API Methods
validate_batch_size(batch: ImageBatch) -> ValidationResult
Description: Validates batch contains 10-50 images.
Called By:
- F05 Image Input Pipeline (before queuing)
Input:
    batch: ImageBatch:
        images: List[bytes]
        filenames: List[str]
        start_sequence: int
        end_sequence: int
Output:
    ValidationResult:
        valid: bool
        errors: List[str]
Validation Rules:
- Minimum batch size: 10 images
- Maximum batch size: 50 images
- Reason: Balance between upload overhead and processing granularity
Error Conditions:
- Returns valid=False with an error message (not an exception)
Test Cases:
- Valid batch (20 images): Returns valid=True
- Too few images (5): Returns valid=False, error="Batch size 5 below minimum 10"
- Too many images (60): Returns valid=False, error="Batch size 60 exceeds maximum 50"
- Empty batch: Returns valid=False
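The size rule above can be sketched as follows. This is a minimal illustration, not the project's implementation: the dataclasses are stand-ins for the ImageBatch and ValidationResult models, the constants mirror the documented 10-50 limits, and counting `batch.images` (rather than `batch.filenames`) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MIN_BATCH_SIZE = 10  # documented minimum
MAX_BATCH_SIZE = 50  # documented maximum


@dataclass
class ValidationResult:
    """Stand-in for the ValidationResult model."""
    valid: bool
    errors: List[str] = field(default_factory=list)


@dataclass
class ImageBatch:
    """Stand-in carrying only the fields this check needs."""
    images: List[bytes]
    filenames: List[str]


def validate_batch_size(batch: ImageBatch) -> ValidationResult:
    """Return a result instead of raising, as the spec requires."""
    size = len(batch.images)  # assumption: size is the number of images
    if size < MIN_BATCH_SIZE:
        return ValidationResult(False, [f"Batch size {size} below minimum {MIN_BATCH_SIZE}"])
    if size > MAX_BATCH_SIZE:
        return ValidationResult(False, [f"Batch size {size} exceeds maximum {MAX_BATCH_SIZE}"])
    return ValidationResult(True)
```

An empty batch falls into the first branch, so it is reported as below the minimum rather than needing a special case.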
check_sequence_continuity(batch: ImageBatch, expected_start: int) -> ValidationResult
Description: Validates images form consecutive sequence with no gaps.
Called By:
- F05 Image Input Pipeline (before queuing)
Input:
    batch: ImageBatch
    expected_start: int  # Expected starting sequence number
Output:
    ValidationResult:
        valid: bool
        errors: List[str]
Validation Rules:
- Sequence starts at expected_start: First image sequence == expected_start
- Consecutive numbers: No gaps in sequence (AD000101, AD000102, AD000103, ...)
- Filename extraction: Parse sequence from ADxxxxxx.jpg pattern
- Strict ordering: Images must be in sequential order
Algorithm:
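    # Filenames must already match the naming convention;
    # extract_sequence raises ValueError on anything else.
    sequences = [extract_sequence(filename) for filename in batch.filenames]
    if not sequences:
        return invalid("Batch contains no filenames")
    if sequences[0] != expected_start:
        return invalid(f"Expected start {expected_start}, got {sequences[0]}")
    for i in range(len(sequences) - 1):
        if sequences[i+1] != sequences[i] + 1:
            return invalid(f"Gap detected: {sequences[i]} -> {sequences[i+1]}")
    return valid()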
Error Conditions:
- Returns valid=False with specific gap information
Test Cases:
- Valid sequence (101-150): expected_start=101 → valid=True
- Wrong start: expected_start=101, got 102 → valid=False
- Gap in sequence: AD000101, AD000103 (missing 102) → valid=False
- Out of order: AD000102, AD000101 → valid=False
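A runnable sketch of the continuity check that never raises, even for malformed filenames or an empty batch. The names `parse_sequence` and `continuity_errors` are hypothetical, and returning a plain list of error strings (empty meaning valid) is a simplification of ValidationResult chosen for this example.

```python
import re
from typing import List, Optional

# Same capture as extract_sequence: six digits after the AD prefix.
_SEQUENCE = re.compile(r"^AD(\d{6})\.")


def parse_sequence(filename: str) -> Optional[int]:
    """Return the sequence number, or None when the name does not match."""
    match = _SEQUENCE.match(filename)
    return int(match.group(1)) if match else None


def continuity_errors(filenames: List[str], expected_start: int) -> List[str]:
    """Return an empty list when the sequence is consecutive, else one error."""
    if not filenames:
        return ["Batch contains no filenames"]
    sequences = []
    for name in filenames:
        seq = parse_sequence(name)
        if seq is None:
            return [f"Invalid filename format: {name}"]
        sequences.append(seq)
    if sequences[0] != expected_start:
        return [f"Expected start {expected_start}, got {sequences[0]}"]
    # Compare each number with its successor; this catches gaps and reordering.
    for prev, cur in zip(sequences, sequences[1:]):
        if cur != prev + 1:
            return [f"Gap detected: {prev} -> {cur}"]
    return []
```

Because the comparison is strictly "next equals previous plus one", an out-of-order pair such as AD000102, AD000101 is reported through the same gap message as a missing file.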
validate_naming_convention(filenames: List[str]) -> ValidationResult
Description: Validates filenames match ADxxxxxx.jpg pattern.
Called By:
- Internal (during check_sequence_continuity)
- F05 Image Input Pipeline
Input:
    filenames: List[str]
Output:
    ValidationResult:
        valid: bool
        errors: List[str]
Validation Rules:
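- Pattern: AD\d{6}\.(jpg|JPG|png|PNG)
- Examples: AD000001.jpg, AD000237.JPG, AD002000.png
- Extension case: Accepts all-lowercase or all-uppercase extensions (.jpg, .JPG, .png, .PNG); a mixed-case extension such as .Jpg does not match the pattern as written and would require compiling it with re.IGNORECASE
- 6 digits required: Zero-padded to 6 digits
Regex Pattern: ^AD\d{6}\.(jpg|JPG|png|PNG)$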
Error Conditions:
- Returns valid=False listing the invalid filenames
Test Cases:
- Valid names: ["AD000001.jpg", "AD000002.jpg"] → valid=True
- Invalid prefix: "IMG_0001.jpg" → valid=False
- Wrong digit count: "AD001.jpg" (3 digits) → valid=False
- Missing extension: "AD000001" → valid=False
- Invalid extension: "AD000001.bmp" → valid=False
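The naming rules can be exercised with the documented regex directly. This is a small sketch; the helper name `invalid_filenames` is hypothetical, and using `fullmatch` instead of `match` is a choice made here so that a trailing newline is also rejected.

```python
import re
from typing import List

# Pattern copied from the spec.
FILENAME_PATTERN = re.compile(r"^AD\d{6}\.(jpg|JPG|png|PNG)$")


def invalid_filenames(filenames: List[str]) -> List[str]:
    """Return the filenames that do not match the ADxxxxxx convention."""
    return [name for name in filenames if FILENAME_PATTERN.fullmatch(name) is None]


# An empty list means every name passed.
print(invalid_filenames(["AD000001.jpg", "IMG_0001.jpg", "AD001.jpg"]))
```

With the pattern as written, a mixed-case extension such as "AD000001.Jpg" is reported as invalid.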
validate_format(image_data: bytes) -> ValidationResult
Description: Validates image file format and properties.
Called By:
- F05 Image Input Pipeline (per-image validation)
Input:
    image_data: bytes  # Raw image file bytes
Output:
    ValidationResult:
        valid: bool
        errors: List[str]
Validation Rules:
- Format: Valid JPEG or PNG
- Dimensions: 640×480 to 6252×4168 pixels
- File size: < 10MB per image
- Image readable: Not corrupted
- Color channels: RGB (3 channels)
Algorithm:
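    from io import BytesIO
    from PIL import Image

    try:
        # Check the byte length first; it needs no decoding
        if len(image_data) > 10 * 1024 * 1024:
            return invalid("File size exceeds 10MB")
        # Image.open reads only the header; truncated pixel data is
        # detected later, e.g. by image.load()
        image = Image.open(BytesIO(image_data))
        width, height = image.size
        if image.format not in ['JPEG', 'PNG']:
            return invalid("Format must be JPEG or PNG")
        if width < 640 or height < 480:
            return invalid("Dimensions too small")
        if width > 6252 or height > 4168:
            return invalid("Dimensions too large")
        # Enforce the RGB (3 channels) rule listed above
        if image.mode != 'RGB':
            return invalid("Image must be RGB (3 channels)")
        return valid()
    except Exception as e:
        return invalid(f"Corrupted image: {e}")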
Error Conditions:
- Returns valid=False with a specific error
Test Cases:
- Valid JPEG (2048×1536): valid=True
- Valid PNG (6252×4168): valid=True
- Too small (320×240): valid=False
- Too large (8000×6000): valid=False
- File too big (15MB): valid=False
- Corrupted file: valid=False
- BMP format: valid=False
Integration Tests
Test 1: Complete Batch Validation
- Create batch with 20 images, AD000101.jpg - AD000120.jpg
- validate_batch_size() → valid
- validate_naming_convention() → valid
- check_sequence_continuity(expected_start=101) → valid
- validate_format() for each image → all valid
Test 2: Invalid Batch Detection
- Create batch with 60 images → validate_batch_size() fails
- Create batch with gap (AD000101, AD000103) → check_sequence_continuity() fails
- Create batch with IMG_0001.jpg → validate_naming_convention() fails
- Create batch with corrupted image → validate_format() fails
Test 3: Edge Cases
- Batch with exactly 10 images → valid
- Batch with exactly 50 images → valid
- Batch with 51 images → invalid
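- Batch ending at AD999999.jpg (max sequence number, e.g. AD999990.jpg - AD999999.jpg) → valid; a batch starting at AD999995.jpg can hold only 5 images within the 6-digit limit, which is below the 10-image minimum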
Non-Functional Requirements
Performance
- validate_batch_size: < 1ms
- check_sequence_continuity: < 10ms for 50 images
- validate_naming_convention: < 5ms for 50 filenames
- validate_format: < 20ms per image (with PIL)
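- Total batch-level validation (size, naming, continuity): < 100ms for 50 images; per-image validate_format is budgeted separately, since 50 images at up to 20ms each can take up to 1s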
Reliability
- Never raises exceptions (returns ValidationResult with errors)
- Handles edge cases gracefully
- Clear, actionable error messages
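One way to honor the "never raises" requirement when several checks run together is to wrap each check and convert unexpected exceptions into error strings. The sketch below is an illustration under assumptions: `run_checks`, the `(name, callable)` pairing, and the tuple return value are all hypothetical and not part of the documented interface.

```python
from typing import Callable, List, Tuple

# A check is a label plus a zero-argument callable returning error strings.
Check = Tuple[str, Callable[[], List[str]]]


def run_checks(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Run every check, collecting errors; exceptions become errors too."""
    errors: List[str] = []
    for name, check in checks:
        try:
            errors.extend(f"{name}: {message}" for message in check())
        except Exception as exc:  # deliberately broad: callers must never see a raise
            errors.append(f"{name}: unexpected error: {exc}")
    return (not errors, errors)


def broken_check() -> List[str]:
    """Simulates a validator that crashes instead of returning errors."""
    raise RuntimeError("decoder crashed")


ok, errors = run_checks([
    ("size", lambda: []),        # passes
    ("format", broken_check),    # raises internally
])
print(ok, errors)
```

Prefixing each message with the check name keeps the aggregated error list actionable when several validators fail at once.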
Maintainability
- Configurable validation rules (min/max batch size, dimensions)
- Easy to add new validation rules
- Comprehensive error reporting
Dependencies
Internal Components
- None (pure utility, no internal dependencies)
External Dependencies
- Pillow (PIL): Image format validation and dimension checking
- re (regex): Filename pattern matching
Data Models
ImageBatch
    from typing import List
    from pydantic import BaseModel

    class ImageBatch(BaseModel):
        images: List[bytes]     # Raw image data
        filenames: List[str]    # e.g., ["AD000101.jpg", ...]
        start_sequence: int     # 101
        end_sequence: int       # 150
        batch_number: int       # Sequential batch number
ValidationResult
    class ValidationResult(BaseModel):
        valid: bool
        errors: List[str] = []      # Empty if valid
        warnings: List[str] = []    # Optional warnings
ValidationRules (Configuration)
    class ValidationRules(BaseModel):
        min_batch_size: int = 10
        max_batch_size: int = 50
        min_width: int = 640
        min_height: int = 480
        max_width: int = 6252
        max_height: int = 4168
        max_file_size_mb: int = 10
        allowed_formats: List[str] = ["JPEG", "PNG"]
        filename_pattern: str = r"^AD\d{6}\.(jpg|JPG|png|PNG)$"
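To show how configurable limits might feed a check, here is a sketch of the dimension rule driven by a rules object. `DimensionRules` is a stdlib dataclass standing in for the dimension fields of ValidationRules, and `dimension_errors` is a hypothetical helper; the message wording is illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DimensionRules:
    """Stand-in for the dimension fields of ValidationRules."""
    min_width: int = 640
    min_height: int = 480
    max_width: int = 6252
    max_height: int = 4168


def dimension_errors(width: int, height: int,
                     rules: DimensionRules = DimensionRules()) -> List[str]:
    """Return error strings for an image size; empty means within limits."""
    errors: List[str] = []
    if width < rules.min_width or height < rules.min_height:
        errors.append(
            f"Dimensions {width}x{height} below minimum "
            f"{rules.min_width}x{rules.min_height}"
        )
    if width > rules.max_width or height > rules.max_height:
        errors.append(
            f"Dimensions {width}x{height} exceed maximum "
            f"{rules.max_width}x{rules.max_height}"
        )
    return errors


# Tightening a limit needs no code change, only a different rules object.
strict = DimensionRules(min_width=1024, min_height=768)
print(dimension_errors(800, 600, strict))
```

Note that width and height are compared independently, so a portrait image of 480x640 is rejected under the default limits.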
Sequence Extraction
    import re

    def extract_sequence(filename: str) -> int:
        """
        Extracts sequence number from filename.
        Example: "AD000237.jpg" -> 237
        """
        match = re.match(r"AD(\d{6})\.", filename)
        if match:
            return int(match.group(1))
        # Callers that must not raise should validate names first
        raise ValueError(f"Invalid filename format: {filename}")