mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 23:11:13 +00:00
[AZ-701] HTTP replay API service (FastAPI + magic-byte upload validation)
ci/woodpecker/push/02-build-push Pipeline failed
New replay_api component: FastAPI service wrapping the offline
gps-denied-replay pipeline. POST tlog+video (multipart) → either
sync 200 with result/map/report URLs, or async 202 + job id with
/jobs/{id} polling. Magic-byte validation, bearer auth, in-memory
JobRegistry with concurrency + queue caps (429 on overflow).
Helper accuracy_report.py promoted from tests/ to src/ because the
API needs the Markdown report writer at runtime; all AZ-699 imports
re-pointed. OpenAPI spec exported to docs.
18/18 unit tests pass (AC-1 sync, AC-2 async, AC-3 state machine,
AC-5 auth, AC-6 health, AC-8 concurrency, AC-9 magic-byte). Full
unit suite: 2251 pass, 86 skip, 1 pre-existing C12 cold-start flake
(unchanged). mypy --strict clean on the new surface.
Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,251 @@
openapi: 3.1.0
info:
  title: gps-denied-onboard replay API
  description: HTTP wrapper around the offline `gps-denied-replay` pipeline. Upload
    (tlog + video [+ calibration]); receive GPS fixes + an accuracy report + an HTML
    map.
  version: 1.0.0
paths:
  /healthz:
    get:
      summary: Healthz
      operationId: healthz_healthz_get
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                additionalProperties:
                  type: string
                type: object
                title: Response Healthz Healthz Get
  /readyz:
    get:
      summary: Readyz
      operationId: readyz_readyz_get
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema: {}
  /replay:
    post:
      summary: Post Replay
      operationId: post_replay_replay_post
      parameters:
        - name: authorization
          in: header
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            title: Authorization
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/Body_post_replay_replay_post'
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema: {}
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
  /jobs/{job_id}:
    get:
      summary: Get Job
      operationId: get_job_jobs__job_id__get
      parameters:
        - name: job_id
          in: path
          required: true
          schema:
            type: string
            title: Job Id
        - name: authorization
          in: header
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            title: Authorization
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                type: object
                additionalProperties: true
                title: Response Get Job Jobs Job Id Get
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
  /jobs/{job_id}/result:
    get:
      summary: Get Result
      operationId: get_result_jobs__job_id__result_get
      parameters:
        - name: job_id
          in: path
          required: true
          schema:
            type: string
            title: Job Id
        - name: authorization
          in: header
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            title: Authorization
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema: {}
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
  /jobs/{job_id}/map:
    get:
      summary: Get Map
      operationId: get_map_jobs__job_id__map_get
      parameters:
        - name: job_id
          in: path
          required: true
          schema:
            type: string
            title: Job Id
        - name: authorization
          in: header
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            title: Authorization
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema: {}
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
  /jobs/{job_id}/report:
    get:
      summary: Get Report
      operationId: get_report_jobs__job_id__report_get
      parameters:
        - name: job_id
          in: path
          required: true
          schema:
            type: string
            title: Job Id
        - name: authorization
          in: header
          required: false
          schema:
            anyOf:
              - type: string
              - type: 'null'
            title: Authorization
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema: {}
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
components:
  schemas:
    Body_post_replay_replay_post:
      properties:
        tlog:
          type: string
          format: binary
          title: Tlog
        video:
          type: string
          format: binary
          title: Video
        calibration:
          anyOf:
            - type: string
              format: binary
            - type: 'null'
          title: Calibration
        pace:
          type: string
          title: Pace
          default: asap
        auto_trim:
          type: boolean
          title: Auto Trim
          default: true
      type: object
      required:
        - tlog
        - video
      title: Body_post_replay_replay_post
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
@@ -0,0 +1,135 @@
# Contract: `replay_api` HTTP service

**Owner**: AZ-701 (epic AZ-696 / cycle-2 multi-flight demo deliverables).
**Producer task**: AZ-701 (this contract).
**Consumer**: any HTTP client, including operator dashboards, the parent-suite UI, demo runners, and ad-hoc `curl` sessions.
**Version**: 1.0.0
**Status**: draft (in-testing on Jetson)
**Last Updated**: 2026-05-20
**Module-layout home**:
- `src/gps_denied_onboard/replay_api/app.py`: FastAPI app factory + uvicorn entrypoint.
- `src/gps_denied_onboard/replay_api/handlers.py`: request handlers (multipart parse, magic-byte validation, auth dependency).
- `src/gps_denied_onboard/replay_api/jobs.py`: in-memory `JobRegistry` + `JobRecord` + concurrency limit.
- `src/gps_denied_onboard/replay_api/storage.py`: per-job temp directory lifecycle + cleanup.
- `src/gps_denied_onboard/replay_api/interface.py`: `ReplayRunner` Protocol + DTOs (`ReplayJobResult`, `JobState`, `JobSnapshot`).
- `src/gps_denied_onboard/replay_api/errors.py`: typed HTTP error families.
- `src/gps_denied_onboard/cli/replay_api_entrypoint.py`: `replay-api` console-script.
- `docker/replay-api.Dockerfile`: operator-side container image.

## Purpose

Expose the offline replay pipeline (`gps-denied-replay` CLI from AZ-402, plus the `gps-denied-render-map` HTML renderer from AZ-700) over a single HTTP surface so external consumers can upload `(tlog + video [+ calibration])` and receive GPS fixes + an accuracy report + a map without installing the Python stack.

The service is **operator-side only**: it is NOT bundled into the airborne binary. It runs in its own container (`docker/replay-api.Dockerfile`) and is started via the `replay-api` console-script in the `[operator-tools]` optional-dependency group.

## Design invariants

1. **The service does not re-implement the estimator.** It shells out to the existing `gps-denied-replay` console-script. The estimator path is exactly what runs on the airborne binary; the service is a thin HTTP shim.
2. **No persistent state.** Jobs live in-process; uploads live in per-job temp directories that are deleted on completion or service shutdown. Operators that need durable history persist the JSONL + Markdown report + HTML map artefacts out-of-band.
3. **Sync vs. async is decided by the server, not by the client.** Videos ≤ `REPLAY_API_SYNC_MAX_BYTES` (default 200 MiB) run inline; larger uploads, and uploads submitted while the concurrency limit is reached, are queued and the client polls.
4. **Magic-byte file validation** is applied before any data is handed to the estimator. The service refuses uploads whose first bytes do not match the expected `.tlog` (MAVLink magic byte `0xFD` for v2.0) or `.mp4` (`ftyp` box at offset 4) signatures.
5. **Bearer-token auth** is the only auth surface. Default is **on**; `REPLAY_API_AUTH_REQUIRED=false` opts out for local dev and emits a WARN log on every request.
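
The signature checks in invariant 4 can be sketched as follows. This is a minimal illustration of the rule as the contract words it, not the code in `handlers.py`: the function names are hypothetical, and `ValueError` stands in for whatever typed error the service maps to `400 unsupported_file_kind`.

```python
MAVLINK_V2_MAGIC = 0xFD  # MAVLink v2 start-of-frame marker named in invariant 4


def looks_like_tlog(head: bytes) -> bool:
    """True when the first byte is the MAVLink v2 magic byte."""
    return len(head) >= 1 and head[0] == MAVLINK_V2_MAGIC


def looks_like_mp4(head: bytes) -> bool:
    """True when the `ftyp` box tag sits at byte offset 4."""
    return head[4:8] == b"ftyp"


def validate_upload(kind: str, head: bytes) -> None:
    """Raise ValueError (hypothetical stand-in for the 400 response) on mismatch.

    `kind` is the multipart field name; `head` is the first few bytes of the file.
    """
    checks = {"tlog": looks_like_tlog, "video": looks_like_mp4}
    if kind not in checks:
        raise ValueError(f"unknown upload kind: {kind}")
    if not checks[kind](head):
        raise ValueError("unsupported_file_kind")
```

Only the first eight bytes of each upload are needed, so a check of this kind can run before the body is fully written to the per-job temp directory.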

## Public API

The OpenAPI spec is the authoritative source; see `_docs/02_document/contracts/replay_api/openapi.yaml`. The summary below mirrors it for human readers.

### `POST /replay`

Multipart upload accepting:
- `tlog`: binary `.tlog` file (required).
- `video`: `.mp4` file (required).
- `calibration`: camera-calibration JSON (optional; defaults to the AZ-702 KHP20S30 factory-sheet if the operator built the image with that calibration baked in).
- `pace`: `asap` | `realtime` (form field, optional; default `asap`).
- `auto_trim`: `true` | `false` (form field, optional; default `true`).

Response shapes:

- **Sync mode** (video ≤ `REPLAY_API_SYNC_MAX_BYTES`):
  - `200 OK` with body `{ "job_id": "<uuid>", "state": "done", "emissions_jsonl_url": "...", "accuracy_report_md_url": "...", "map_html_url": "..." }`
- **Async mode** (video > `REPLAY_API_SYNC_MAX_BYTES` OR concurrency limit reached at submit time):
  - `202 Accepted` with `Location: /jobs/{id}` header and body `{ "job_id": "<uuid>", "state": "queued" | "running", "status_url": "/jobs/{id}" }`
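
One way to read the response rules above, together with the 429 queue cap described under Errors, is as a small decision function. The sketch below is an assumed model, not the `JobRegistry` logic: the `Limits` class, `choose_status`, and the order in which the conditions are tested are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical holder for the three documented limits and their defaults."""
    sync_max_bytes: int = 209_715_200  # REPLAY_API_SYNC_MAX_BYTES
    max_concurrent_jobs: int = 1       # REPLAY_API_MAX_CONCURRENT_JOBS
    max_queued_jobs: int = 8           # REPLAY_API_MAX_QUEUED_JOBS


def choose_status(video_bytes: int, running: int, queued: int,
                  limits: Limits = Limits()) -> int:
    """Return the HTTP status POST /replay would answer with under this model."""
    has_free_slot = running < limits.max_concurrent_jobs
    if video_bytes <= limits.sync_max_bytes and has_free_slot:
        return 200  # small video and a free slot: run inline
    if has_free_slot:
        return 202  # large video: async, job starts in state "running"
    if queued < limits.max_queued_jobs:
        return 202  # no free slot: async, job waits in state "queued"
    return 429      # the queue itself is full
```

Under this reading a small video submitted while another job is running still gets `202`, which matches the "concurrency limit reached at submit time" clause.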

### `GET /jobs/{id}`

Returns the job snapshot:

```json
{
  "job_id": "...",
  "state": "queued" | "running" | "done" | "failed",
  "submitted_at_utc": "...",
  "started_at_utc": "...",
  "finished_at_utc": "...",
  "error": "<string, present only when state=failed>",
  "result": { ... },
  "status_url": "...",
  "emissions_jsonl_url": "...",
  "accuracy_report_md_url": "...",
  "map_html_url": "..."
}
```
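
A client that receives `202 Accepted` is expected to poll this endpoint until the job reaches `done` or `failed`. The loop below is a minimal sketch of that pattern. `fetch_snapshot` is a hypothetical callable that performs the authenticated `GET /jobs/{id}` and returns the decoded JSON; the interval and timeout values are arbitrary choices, not part of the contract.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"done", "failed"}  # states after which polling stops


def wait_for_job(
    fetch_snapshot: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the snapshot reports a terminal state, then return it."""
    deadline = time.monotonic() + timeout_s
    while True:
        snapshot = fetch_snapshot()
        if snapshot.get("state") in TERMINAL_STATES:
            return snapshot
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(interval_s)
```

Injecting `fetch_snapshot` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise without a running service. States the client does not recognise are treated as non-terminal here, so the timeout is what bounds the wait.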

### `GET /jobs/{id}/result`

Streams the JSONL emissions file. `200 OK` with `Content-Type: application/x-ndjson`. `409 Conflict` when the job is not in state `done`.

### `GET /jobs/{id}/map`

Streams the HTML map produced by AZ-700. `200 OK` with `Content-Type: text/html`. `409 Conflict` when the job is not in state `done`.

### `GET /jobs/{id}/report`

Streams the Markdown accuracy report produced by AZ-699. `200 OK` with `Content-Type: text/markdown`. `409 Conflict` when the job is not in state `done`.

### `GET /healthz`

Liveness probe. `200 OK` with `{"status":"ok"}` whenever the FastAPI app can process requests.

### `GET /readyz`

Readiness probe. `200 OK` only when the `gps-denied-replay` console-script is resolvable on `PATH` AND the storage root is writable. `503 Service Unavailable` otherwise. Kubernetes / docker-compose health checks should use this, not `/healthz`.
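
The two readiness conditions can be checked with the standard library alone. The following is a sketch under the assumption that "resolvable on `PATH`" means what `shutil.which` reports and that "writable" means the process can create entries in an existing directory; `is_ready` and `readyz_status` are illustrative names, not the service's handlers.

```python
import os
import shutil


def is_ready(replay_binary: str, storage_root: str) -> bool:
    """True when the replay CLI resolves and the storage root accepts writes."""
    binary_found = shutil.which(replay_binary) is not None
    # Creating files in a directory needs both write and search permission.
    storage_writable = os.path.isdir(storage_root) and os.access(
        storage_root, os.W_OK | os.X_OK
    )
    return binary_found and storage_writable


def readyz_status(replay_binary: str, storage_root: str) -> int:
    """Map the readiness check onto the documented 200 / 503 status codes."""
    return 200 if is_ready(replay_binary, storage_root) else 503
```

With the documented defaults this would be called as `readyz_status("gps-denied-replay", "/var/azaion/replay_api")`.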

## Errors

All errors are JSON objects of shape `{ "error_code": "...", "message": "...", "details": { ... } }`.

| HTTP | `error_code` | When |
|------|-------------------------------|------|
| 400 | `unsupported_file_kind` | Magic-byte validation failed. |
| 400 | `multipart_missing_field` | Required field absent. |
| 401 | `unauthorized` | Missing or wrong bearer token (when auth required). |
| 404 | `job_not_found` | `GET /jobs/{id}*` for an unknown id. |
| 409 | `job_not_complete` | Result/map/report requested while job is not `done`. |
| 413 | `payload_too_large` | Upload exceeded `REPLAY_API_MAX_UPLOAD_BYTES`. |
| 429 | `concurrency_limit_reached` | The queue is full (`REPLAY_API_MAX_QUEUED_JOBS`). Reaching `REPLAY_API_MAX_CONCURRENT_JOBS` alone does not produce this code: the handler still accepts the job and queues it. |
| 500 | `replay_runner_failed` | The `gps-denied-replay` subprocess exited non-zero. `details.stderr_tail` carries the last 8 KB of stderr. |
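
To make the error envelope and the `stderr_tail` truncation concrete, here is a small sketch. It assumes "8 KB" means 8192 bytes and that the tail is decoded leniently as UTF-8; the helper names and the wording of `message` are illustrative and not taken from `errors.py`.

```python
from typing import Any, Dict, Optional

STDERR_TAIL_BYTES = 8 * 1024  # assumption: "8 KB" read as 8192 bytes


def error_body(error_code: str, message: str,
               details: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build the documented {error_code, message, details} envelope."""
    return {"error_code": error_code, "message": message, "details": details or {}}


def runner_failure_body(returncode: int, stderr: bytes) -> Dict[str, Any]:
    """Envelope for a non-zero replay subprocess exit, keeping only the stderr tail."""
    # Slicing from the end keeps the most recent output, where the failure usually is.
    tail = stderr[-STDERR_TAIL_BYTES:].decode("utf-8", errors="replace")
    return error_body(
        "replay_runner_failed",
        f"replay subprocess exited with code {returncode}",
        {"stderr_tail": tail},
    )
```

Because clients must tolerate unknown `error_code` values, a consumer of this envelope should branch on the HTTP status first and treat the code as a refinement.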

## Configuration

| Env var | Default | Meaning |
|--------------------------------------|----------------------|---------|
| `REPLAY_API_BEARER_TOKEN` | _none_ | Required when `REPLAY_API_AUTH_REQUIRED=true`. |
| `REPLAY_API_AUTH_REQUIRED` | `true` | Set to `false` to disable bearer-token auth (dev only; WARN logged). |
| `REPLAY_API_MAX_UPLOAD_BYTES` | `2147483648` (2 GiB) | Per-upload hard limit. |
| `REPLAY_API_SYNC_MAX_BYTES` | `209715200` (200 MiB) | Largest video size handled synchronously; larger videos switch to async. |
| `REPLAY_API_MAX_CONCURRENT_JOBS` | `1` | Max running estimator subprocesses. |
| `REPLAY_API_MAX_QUEUED_JOBS` | `8` | Max queued jobs. Above this the API returns 429. |
| `REPLAY_API_STORAGE_ROOT` | `/var/azaion/replay_api` | Per-job temp dir parent. |
| `REPLAY_API_REPLAY_BINARY` | `gps-denied-replay` | Override the replay CLI binary used by the runner. |
| `REPLAY_API_RENDER_BINARY` | `gps-denied-render-map` | Override the map-render CLI used by the runner. |
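
The table above maps naturally onto a typed settings object. The loader below is a sketch only: `ReplayApiConfig` and `load_config` are hypothetical names, and failing fast on a missing token and treating any value other than `false` as "auth on" are assumptions about behaviour the contract does not spell out.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReplayApiConfig:
    bearer_token: Optional[str]
    auth_required: bool
    max_upload_bytes: int
    sync_max_bytes: int
    max_concurrent_jobs: int
    max_queued_jobs: int
    storage_root: str
    replay_binary: str
    render_binary: str


def load_config(env: Mapping[str, str] = os.environ) -> ReplayApiConfig:
    """Read the REPLAY_API_* variables, falling back to the documented defaults."""
    auth_required = env.get("REPLAY_API_AUTH_REQUIRED", "true").strip().lower() != "false"
    token = env.get("REPLAY_API_BEARER_TOKEN") or None
    if auth_required and token is None:
        # Assumed behaviour: refuse to start rather than run with an empty token.
        raise ValueError("REPLAY_API_BEARER_TOKEN is required when auth is enabled")
    return ReplayApiConfig(
        bearer_token=token,
        auth_required=auth_required,
        max_upload_bytes=int(env.get("REPLAY_API_MAX_UPLOAD_BYTES", "2147483648")),
        sync_max_bytes=int(env.get("REPLAY_API_SYNC_MAX_BYTES", "209715200")),
        max_concurrent_jobs=int(env.get("REPLAY_API_MAX_CONCURRENT_JOBS", "1")),
        max_queued_jobs=int(env.get("REPLAY_API_MAX_QUEUED_JOBS", "8")),
        storage_root=env.get("REPLAY_API_STORAGE_ROOT", "/var/azaion/replay_api"),
        replay_binary=env.get("REPLAY_API_REPLAY_BINARY", "gps-denied-replay"),
        render_binary=env.get("REPLAY_API_RENDER_BINARY", "gps-denied-render-map"),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test, for example `load_config({"REPLAY_API_AUTH_REQUIRED": "false"})` for a local dev setup.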

## Versioning rules

- Breaking changes to request / response schemas bump the major version and ship under `/v2/replay`. The `/replay` path remains v1 for one release after `/v2` ships.
- The response shape may grow new fields without a version bump; clients MUST tolerate unknown fields.
- The `error_code` set is append-only; clients MUST tolerate unknown codes.
- The `state` enum may grow new terminal-style values (e.g. `cancelled`) only with a minor bump documented in the OpenAPI changelog block.

## Out of scope

- Persistent job database (see invariant 2).
- WebSocket / SSE progress streaming.
- Authentication beyond bearer token (mTLS / OAuth2 are deliberately out).
- Multi-node scheduling: a single host runs at most `REPLAY_API_MAX_CONCURRENT_JOBS` subprocesses.
- A built-in web UI; operator dashboards integrate over HTTP.