ai-training/_docs/02_document/modules/convert_annotations.md
Oleksandr Bezdieniezhnykh 142c6c4de8 Refactor constants management to use Pydantic BaseModel for configuration
- Replaced module-level path variables in constants.py with a structured Pydantic Config class.
- Updated all relevant modules (train.py, augmentation.py, exports.py, dataset-visualiser.py, manual_run.py) to access paths through the new config structure.
- Fixed bugs related to image processing and model saving.
- Enhanced test infrastructure to accommodate the new configuration approach.

This refactor improves code maintainability and clarity by centralizing configuration management.
2026-03-27 18:18:30 +02:00

Module: convert-annotations

Purpose

Standalone script that converts annotation files from external formats (Pascal VOC XML, oriented bounding box text) to YOLO format.

Public Interface

| Function | Signature | Returns | Description |
|---|---|---|---|
| convert | (folder, dest_folder, read_annotations, ann_format) | | Generic converter: reads images + annotations from folder, writes YOLO format to dest_folder |
| minmax2yolo | (width, height, xmin, xmax, ymin, ymax) -> tuple | (cx, cy, w, h) | Converts pixel min/max coords to normalized YOLO center format |
| read_pascal_voc | (width, height, s: str) -> list[str] | YOLO label lines | Parses Pascal VOC XML, maps class names to IDs, outputs YOLO lines |
| read_bbox_oriented | (width, height, s: str) -> list[str] | YOLO label lines | Parses 14-column oriented bbox format, outputs YOLO lines (hardcoded class 2) |
| rename_images | (folder) | | Renames files by trimming the last 7 chars and replacing the extension with .png |
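
The min/max to YOLO conversion listed above can be sketched as follows. This is a minimal illustration that assumes the argument order shown in the signature column and plain division by the image size; the module's actual body may differ (for example in rounding).

```python
def minmax2yolo(width, height, xmin, xmax, ymin, ymax):
    """Convert pixel min/max box edges to normalized YOLO (cx, cy, w, h)."""
    cx = (xmin + xmax) / 2 / width   # box center x, as a fraction of image width
    cy = (ymin + ymax) / 2 / height  # box center y, as a fraction of image height
    w = (xmax - xmin) / width        # box width, as a fraction of image width
    h = (ymax - ymin) / height       # box height, as a fraction of image height
    return cx, cy, w, h


# A 200x100 px box centered in a 400x200 px image.
print(minmax2yolo(400, 200, 100, 300, 50, 150))  # -> (0.5, 0.5, 0.5, 0.5)
```

All four values are fractions of the image size, so a box that lies fully inside the image yields values in the range 0 to 1.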

Internal Logic

  • convert(): Iterates image files in source folder, reads corresponding annotation file, calls format-specific reader, copies image and writes YOLO label to destination.
  • Pascal VOC: Parses XML <object> elements, maps class names via name_class_map (Truck→1, Car/Taxi→2), filters forbidden classes (Motorcycle). Default class = 1.
  • Oriented bbox: 14-column space-separated format, extracts min/max from columns 6-13, hardcodes class to 2.
  • Validation: Skips labels where normalized coordinates exceed 1.0 (out of bounds).
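
The oriented-bbox path and the bounds check described above might look roughly like the sketch below. It is not the module's code. It assumes the eight coordinate columns are the last eight of the 14 (zero-based indices 6 to 13) and hold four x, y corner pairs in pixels; the name read_bbox_oriented_sketch and the six-decimal output format are also assumptions.

```python
def read_bbox_oriented_sketch(width, height, s):
    """Return YOLO label lines for a 14-column oriented-bbox annotation string."""
    lines = []
    for row in s.splitlines():
        cols = row.split()
        if len(cols) != 14:
            continue  # skip blank or malformed rows
        # ASSUMPTION: zero-based columns 6..13 are x1 y1 x2 y2 x3 y3 x4 y4 in pixels.
        coords = [float(v) for v in cols[6:14]]
        xs, ys = coords[0::2], coords[1::2]
        xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
        # Axis-aligned box enclosing the four corners, normalized to the image size.
        cx = (xmin + xmax) / 2 / width
        cy = (ymin + ymax) / 2 / height
        w = (xmax - xmin) / width
        h = (ymax - ymin) / height
        if max(cx, cy, w, h) > 1.0:
            continue  # validation: drop labels that fall outside the image
        lines.append(f"2 {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")  # class hardcoded to 2
    return lines


sample = "0 0 0 0 0 0 100 50 300 50 300 150 100 150"
print(read_bbox_oriented_sketch(400, 200, sample))
```

Taking the min and max over the corners replaces the rotated box with its axis-aligned enclosing box, which is what a plain YOLO label can represent.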

Dependencies

  • cv2 (external) — image reading for dimensions
  • xml.etree.cElementTree (stdlib) — Pascal VOC XML parsing
  • os, shutil, pathlib (stdlib)

Consumers

None (standalone script).

Data Models

None.

Configuration

Hardcoded class mappings: name_class_map = {'Truck': 1, 'Car': 2, 'Taxi': 2}, forbidden_classes = ['Motorcycle'].
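
As an illustration of how these mappings could drive the Pascal VOC reader, here is a minimal sketch. It assumes the standard VOC layout (object/name and object/bndbox with xmin, ymin, xmax, ymax) and uses xml.etree.ElementTree, because cElementTree was removed in Python 3.9. The names read_pascal_voc_sketch and DEFAULT_CLASS are hypothetical, as is the six-decimal output format.

```python
import xml.etree.ElementTree as ET

name_class_map = {"Truck": 1, "Car": 2, "Taxi": 2}
forbidden_classes = ["Motorcycle"]
DEFAULT_CLASS = 1  # hypothetical name for the documented default class


def read_pascal_voc_sketch(width, height, s):
    """Return YOLO label lines for a Pascal VOC XML string."""
    lines = []
    for obj in ET.fromstring(s).iter("object"):
        name = obj.findtext("name", default="").strip()
        if name in forbidden_classes:
            continue  # forbidden classes produce no label
        class_id = name_class_map.get(name, DEFAULT_CLASS)
        box = obj.find("bndbox")
        if box is None:
            continue  # no bounding box to convert
        xmin, xmax = float(box.findtext("xmin")), float(box.findtext("xmax"))
        ymin, ymax = float(box.findtext("ymin")), float(box.findtext("ymax"))
        cx, cy = (xmin + xmax) / 2 / width, (ymin + ymax) / 2 / height
        w, h = (xmax - xmin) / width, (ymax - ymin) / height
        if max(cx, cy, w, h) > 1.0:
            continue  # out-of-bounds labels are skipped
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


xml_text = """<annotation>
  <object><name>Truck</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
  <object><name>Motorcycle</name>
    <bndbox><xmin>0</xmin><ymin>0</ymin><xmax>10</xmax><ymax>10</ymax></bndbox>
  </object>
  <object><name>Car</name>
    <bndbox><xmin>0</xmin><ymin>0</ymin><xmax>200</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>"""
print(read_pascal_voc_sketch(400, 200, xml_text))
```

With this input the Truck and Car objects each yield one label line (classes 1 and 2) and the Motorcycle object is dropped.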

External Integrations

Filesystem I/O only.

Security

None.

Tests

None.