[AZ-701] HTTP replay API service (FastAPI + magic-byte upload validation)

New replay_api component: FastAPI service wrapping the offline
gps-denied-replay pipeline. POST tlog+video (multipart) → either
sync 200 with result/map/report URLs, or async 202 + job id with
/jobs/{id} polling. Magic-byte validation, bearer auth, in-memory
JobRegistry with concurrency + queue caps (429 on overflow).
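
The magic-byte check mentioned above can be sketched roughly as follows. This is an illustration only, not the code in this commit: the function names are hypothetical, and the signatures are assumptions (MP4 files typically carry the ASCII tag "ftyp" at byte offset 4; a MAVLink tlog record is assumed to be an 8-byte timestamp followed by a packet starting with 0xFE or 0xFD).

```python
from __future__ import annotations

# Assumed signatures, not taken from the commit:
# - MP4 / ISO-BMFF: ASCII "ftyp" at byte offset 4.
# - MAVLink tlog: 8-byte timestamp, then a packet whose first byte is
#   0xFE (MAVLink 1) or 0xFD (MAVLink 2).
MAVLINK_START_BYTES = (0xFE, 0xFD)


def looks_like_mp4(head: bytes) -> bool:
    """Return True if the first bytes resemble an MP4 container."""
    return len(head) >= 8 and head[4:8] == b"ftyp"


def looks_like_tlog(head: bytes) -> bool:
    """Return True if the first record resembles a MAVLink tlog entry."""
    return len(head) >= 9 and head[8] in MAVLINK_START_BYTES
```

A handler would read only the first few bytes of each uploaded part and reject the request before any file is written to disk if either check fails.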

Helper accuracy_report.py promoted from tests/ to src/ because the
API needs the Markdown report writer at runtime; all AZ-699 imports
re-pointed. OpenAPI spec exported to docs.

18/18 unit tests pass (AC-1 sync, AC-2 async, AC-3 state machine,
AC-5 auth, AC-6 health, AC-8 concurrency, AC-9 magic-byte). Full
unit suite: 2251 pass, 86 skip, 1 pre-existing C12 cold-start flake
(unchanged). mypy --strict clean on the new surface.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-20 17:30:26 +03:00
parent b66b68ff76
commit 7d53cef0cf
22 changed files with 2854 additions and 13 deletions
@@ -0,0 +1,89 @@
"""AZ-701: per-job temp-file lifecycle.

One ``StorageRoot`` rooted at ``REPLAY_API_STORAGE_ROOT``.
Each job allocates a subdirectory ``<root>/<job_id>/`` containing
the uploaded ``tlog`` + ``video`` + ``calibration`` plus the
estimator's outputs (``emissions.jsonl``, the AZ-699 report, the
AZ-700 map).

The directory is deleted on job completion (``release_job``) and on
service shutdown (``cleanup_all``). The service deliberately does
NOT keep finished-job artefacts forever (invariant 2 in the
contract).
"""

from __future__ import annotations

import logging
import shutil
from dataclasses import dataclass
from pathlib import Path

__all__ = ["JobStorage", "StorageRoot"]

_LOGGER = logging.getLogger("gps_denied_onboard.replay_api.storage")


@dataclass(frozen=True, slots=True)
class JobStorage:
    """The per-job paths the handler hands to the runner."""

    root: Path
    tlog_path: Path
    video_path: Path
    calibration_path: Path
    output_dir: Path


class StorageRoot:
    """Parent of per-job storage directories.

    The class is intentionally thin: the registry calls
    ``allocate_job`` at submit-time and ``release_job`` at terminal
    transitions; nothing else owns mutation rights.
    """

    def __init__(self, root: Path) -> None:
        self._root = root
        self._root.mkdir(parents=True, exist_ok=True)

    @property
    def root(self) -> Path:
        return self._root

    def allocate_job(self, job_id: str) -> JobStorage:
        """Create ``<root>/<job_id>/`` and its ``output`` subdirectory.

        Raises ``FileExistsError`` if the job directory already exists.
        """
        job_root = self._root / job_id
        job_root.mkdir(parents=True, exist_ok=False)
        output_dir = job_root / "output"
        output_dir.mkdir(parents=True, exist_ok=True)
        return JobStorage(
            root=job_root,
            tlog_path=job_root / "input.tlog",
            video_path=job_root / "input.mp4",
            calibration_path=job_root / "calibration.json",
            output_dir=output_dir,
        )

    def release_job(self, job_id: str) -> None:
        """Delete the job's directory if present; log, don't raise, on failure."""
        target = self._root / job_id
        if not target.exists():
            return
        try:
            shutil.rmtree(target)
        except OSError as exc:
            _LOGGER.warning(
                "failed to delete per-job storage %s: %s", target, exc
            )

    def cleanup_all(self) -> None:
        """Delete every subdirectory of the root; failures are logged only."""
        for child in self._root.iterdir():
            if child.is_dir():
                try:
                    shutil.rmtree(child)
                except OSError as exc:
                    _LOGGER.warning(
                        "failed to delete per-job storage %s: %s",
                        child,
                        exc,
                    )
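
The allocate/release lifecycle above can be exercised in isolation with a small stand-in. The sketch below does not import the module from this commit; `allocate`, `release`, `demo`, and the id `"job-1"` are hypothetical names that mirror only the directory handling of `allocate_job` and `release_job`, run against a throwaway temporary directory.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def allocate(root: Path, job_id: str) -> Path:
    """Stand-in for allocate_job: create the job dir and its output dir."""
    job_root = root / job_id
    job_root.mkdir(parents=True, exist_ok=False)  # a duplicate id raises
    (job_root / "output").mkdir()
    return job_root


def release(root: Path, job_id: str) -> None:
    """Stand-in for release_job: a missing directory is a no-op."""
    target = root / job_id
    if target.exists():
        shutil.rmtree(target)


def demo() -> dict[str, bool]:
    """Walk one job through allocate -> duplicate allocate -> release."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        job_root = allocate(root, "job-1")
        created = (job_root / "output").is_dir()
        try:
            allocate(root, "job-1")
            duplicate_rejected = False
        except FileExistsError:
            duplicate_rejected = True
        release(root, "job-1")
        release(root, "job-1")  # second release finds nothing to delete
        removed = not job_root.exists()
    return {
        "created": created,
        "duplicate_rejected": duplicate_rejected,
        "removed": removed,
    }
```

Because `exist_ok=False` makes a reused job id fail loudly, a caller that generates ids itself can treat `FileExistsError` as a programming error rather than a condition to recover from.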