mirror of
https://github.com/azaion/annotations.git
synced 2026-06-21 22:21:07 +00:00
03f879206e
This commit captures everything produced during autodev existing-code Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec), together with the targeted auth + CORS re-sync triggered on 2026-05-14 when codebase drift was detected at Step 4 entry. None of this work was previously committed.

Step 1 (Document): 50+ _docs/02_document/ files covering problem, solution, architecture, system flows, glossary, module-layout, per-component specs (01..06), modules, deployment, diagrams, data model, FINAL report, verification log, discovery.

Step 2 (Architecture Baseline): architecture_compliance_baseline.md. Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec): _docs/02_document/tests/* (67 scenarios across blackbox, security, resilience, resource-limit, performance), plus e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh, scripts/run-performance-tests.sh. Coverage 88% over the active scope (40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync: replaces the deleted in-house token issuer with a JWKS-verifier model. AuthController and TokenService removed; JwtExtensions switched from HS256 symmetric to ES256 over admin's JWKS. ConfigurationResolver and CorsConfigurationValidator added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01, SEC-02, SEC-03 marked Closed. One new testability risk recorded in architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified): ES256, JWKS, alg pinning
- src/Program.cs (modified): DI wiring for ConfigurationResolver and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted): no in-service issuance
- src/Services/TokenService.cs (deleted): same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new): required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec captures the change-spec for downstream services that consumed the now-deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
56 lines
5.4 KiB
Markdown
# Azaion.Annotations: Problem statement (retrospective)

> Reverse-engineered from `_docs/02_document/architecture.md`, `system-flows.md`, the per-component specs, and `suite/_docs/01_annotations.md`. Not copied from a real PRD; this is a retrospective synthesis.
## What this system is

`Azaion.Annotations` is the **annotation lifecycle service** of the AZAION suite. It is the single owner of the `annotations` table, the YOLO-format label files on disk, the lifecycle event stream (RabbitMQ + SSE), and the dataset exploration surface that downstream tooling (annotator UIs, the AI training pipeline, the admin sync worker) relies on.

It is a single .NET 10 service backed by PostgreSQL and a content-addressed filesystem cache, packaged as an ARM64 Docker image, and deployed by the suite's branch-driven Woodpecker pipeline.
## Problem it solves

The suite's surveillance / detection pipeline produces a continuous stream of detected objects in video frames. Three independent consumers need that same data shaped differently:

1. **Annotator UI**: humans need to review, correct, accept, or reject each detection in near-real-time, frame by frame, with the underlying image visible. Every change another annotator makes must surface on their screen immediately, with no manual refresh.
2. **AI training pipeline**: needs the *finalized* annotations and image bytes as a durable, replayable feed so it can build training datasets at any cadence.
3. **Suite-level admin worker**: needs an audit-grade record of every state change (who, when, what) for cross-service synchronisation.

Without a dedicated lifecycle service, each of these consumers would poll the detection pipeline directly. That pipeline (a) exposes no lifecycle semantics (only "the model said this", not "a human accepted it"), (b) has no notion of soft delete, status transitions, or human authorship, and (c) cannot deliver both realtime updates to UIs and durable replay to batch consumers from the same source of truth.

`Azaion.Annotations` resolves that three-way mismatch by being **the one place** where annotation state lives, where state transitions are emitted, and where both push (SSE for humans) and durable-pull (RabbitMQ Stream for machines) consumers attach.
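The push plus durable-pull split described above can be sketched in a few lines of Python. This is an illustration only, not the service's .NET code: `LifecycleHub`, the in-memory `log` (standing in for the RabbitMQ stream), the `subscribers` list (standing in for SSE connections), and the status names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LifecycleHub:
    """One source of truth feeding push subscribers and a replayable log."""

    # Stand-in for the durable stream: an append-only list of events.
    log: list = field(default_factory=list)
    # Stand-in for connected SSE clients: callbacks invoked on every event.
    subscribers: list = field(default_factory=list)

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        self.subscribers.append(callback)

    def emit(self, event: dict) -> None:
        self.log.append(event)          # durable copy for batch consumers
        for push in self.subscribers:   # realtime fan-out for UIs
            push(event)

    def read_from(self, offset: int) -> list:
        """Replay every event at or after `offset` (pull-style consumer)."""
        return self.log[offset:]


hub = LifecycleHub()
ui_screen: list = []
hub.subscribe(ui_screen.append)

# Hypothetical status names, used only for the demonstration.
hub.emit({"id": "a1", "status": "Created"})
hub.emit({"id": "a1", "status": "Accepted"})

# Two pull consumers read the same log from independent offsets.
training_batch = hub.read_from(0)
admin_batch = hub.read_from(1)
```

Each pull consumer tracks its own offset, so a slow batch job never holds back the UI, and a consumer that restarts can replay from wherever it stopped.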
## Users (consumer roles)

| Consumer | How they reach the system | What they need |
|----------|---------------------------|----------------|
| Annotator UI (human-facing web app) | REST + SSE, JWT policy `ANN` | List + detail of annotations, mutations, real-time fan-out of every other annotator's edits |
| Dataset Explorer UI | REST under `/dataset`, JWT policy `DATASET` | Filterable read of the current annotation corpus + bulk-status writes |
| Detections service (upstream pipeline) | REST `POST /annotations`, JWT policy `ANN`; long-running tokens are refreshed against admin's `POST /token/refresh` (annotations is verifier-only) | Push raw detections + the original frame image; receive the assigned annotation id |
| AI training pipeline | RabbitMQ Stream consumer (`azaion-annotations`) | Durable, replayable lifecycle events with the full payload |
| Admin sync worker | RabbitMQ Stream consumer (`azaion-annotations`) | Same stream, different consumer offset; cross-service event correlation |
| Suite admin (humans) | REST `[ADM]` endpoints (planned for `/classes` per RB-06; service-account registration is owned by the admin service, not annotations) | Manage detection class catalog, register service accounts (against admin) |
## How it works at a high level

A detection arrives as `POST /annotations` with the original frame as `image_bytes` and a list of YOLO detections in normalised coordinates. The service:

1. **Content-addresses** the image: sampled hashing produces a stable 32-char hex id, so identical re-uploads collapse to the same row.
2. **Persists** the image, optionally a media row, the annotation row, and the detection rows. These writes are meant to form one transactional unit, but today FS + DB + outbox are not yet wrapped together; closing that gap is the agreed Refactor Backlog item RB-03.
3. **Writes** a YOLO `.txt` label file next to the image so the AI training pipeline can ingest the data with no transformation step.
4. **Publishes** an SSE event so every Annotator UI viewing that mission gets the new annotation immediately.
5. **Enqueues** a row in the transactional outbox `annotations_queue_records`. The in-process `FailsafeProducer` drains that outbox into the RabbitMQ stream as a MessagePack-gzip frame, so the AI pipeline and admin worker get a durable copy.
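Steps 1 and 3 above can be illustrated with a minimal Python sketch. The text does not say which bytes the sampled hashing reads or which hash function it uses, so the choices here are assumptions: head, middle, and tail samples, a 4096-byte sample size, the length prefix, and BLAKE2b with a 16-byte digest. The label writer uses the conventional YOLO line layout (`class x_center y_center width height`, all normalised); the `Detection` type and both function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass

SAMPLE_SIZE = 4096  # assumption: the real sample size is not stated


def content_id(image_bytes: bytes, sample_size: int = SAMPLE_SIZE) -> str:
    """Derive a 32-char hex id from the length plus three byte samples."""
    n = len(image_bytes)
    mid = max(0, n // 2 - sample_size // 2)
    h = hashlib.blake2b(digest_size=16)  # 16 bytes -> 32 hex characters
    h.update(n.to_bytes(8, "big"))       # length guards against truncation
    h.update(image_bytes[:sample_size])              # head
    h.update(image_bytes[mid:mid + sample_size])     # middle
    h.update(image_bytes[-sample_size:])             # tail
    return h.hexdigest()


@dataclass
class Detection:
    class_id: int
    x_center: float  # all four values are normalised to [0, 1]
    y_center: float
    width: float
    height: float


def yolo_label_text(detections: list) -> str:
    """Render one YOLO label line per detection."""
    return "".join(
        f"{d.class_id} {d.x_center:.6f} {d.y_center:.6f} "
        f"{d.width:.6f} {d.height:.6f}\n"
        for d in detections
    )


frame = b"\x89fake-image-bytes" * 1000
image_id = content_id(frame)
label = yolo_label_text([Detection(0, 0.5, 0.5, 0.25, 0.25)])
```

Sampling keeps hashing cost flat for large frames, at the price that two files differing only in unsampled regions would receive the same id. Whether the real service accepts that trade-off is not stated in the source.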
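The transactional outbox in step 5 can be sketched as follows, assuming SQLite in place of PostgreSQL and JSON plus gzip in place of the MessagePack-gzip frame (MessagePack is not in the Python standard library). Only the table name `annotations_queue_records` comes from the text; the column layout, the function names, and the status value `10` are hypothetical.

```python
import gzip
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE annotations (id TEXT PRIMARY KEY, status INTEGER)")
    db.execute(
        "CREATE TABLE annotations_queue_records ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, payload BLOB NOT NULL)"
    )
    return db


def create_annotation(db: sqlite3.Connection, annotation_id: str, status: int) -> None:
    """Write the annotation row and its outbox row in one transaction."""
    frame = gzip.compress(json.dumps({"id": annotation_id, "status": status}).encode())
    with db:  # commits both inserts together, or rolls both back on error
        db.execute("INSERT INTO annotations VALUES (?, ?)", (annotation_id, status))
        db.execute(
            "INSERT INTO annotations_queue_records (payload) VALUES (?)", (frame,)
        )


def drain_outbox(db: sqlite3.Connection, publish) -> int:
    """Publish queued frames in order; a row is removed only after publish succeeds."""
    rows = db.execute(
        "SELECT seq, payload FROM annotations_queue_records ORDER BY seq"
    ).fetchall()
    sent = 0
    for seq, payload in rows:
        publish(payload)  # if this raises, the row stays queued for a retry
        with db:
            db.execute("DELETE FROM annotations_queue_records WHERE seq = ?", (seq,))
        sent += 1
    return sent


db = open_db()
create_annotation(db, "abc", 10)  # 10 is a placeholder status value
stream: list = []                 # stand-in for the RabbitMQ stream
sent_count = drain_outbox(db, stream.append)
```

Because a row is deleted only after a successful publish, a crash between the two steps re-sends the frame on the next drain. Delivery in this sketch is therefore at-least-once, and consumers would need to tolerate duplicates.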
Mutations (Update / UpdateStatus / Delete) are meant to follow the same shape, but only once Refactor Backlog item RB-01 lands. Today they emit nothing on either SSE or the outbox, which is a known gap, not a design choice. Deletes are *soft*: status flips to `Deleted (40)`, files relocate to a `deleted_dir`, and the row stays.

The service also serves the dataset exploration surface (`/dataset/*`), the media upload pipeline (`/media/*`), and the system-metadata catalog (`/settings/*`, `/classes`).
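The soft-delete behaviour described above can be sketched like this. The status code `40` and the `deleted_dir` name come from the text; the record shape (a dict with a `files` list), the `soft_delete` function, and the placeholder starting status `10` are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path

DELETED = 40  # from the text: `Deleted (40)`


def soft_delete(record: dict, deleted_dir: Path) -> dict:
    """Move the record's files into deleted_dir and flip its status.

    The record itself is kept, so history and audit data survive.
    """
    deleted_dir.mkdir(parents=True, exist_ok=True)
    new_paths = []
    for path in map(Path, record["files"]):
        target = deleted_dir / path.name
        shutil.move(str(path), str(target))  # raises if the file is missing
        new_paths.append(str(target))
    return {**record, "status": DELETED, "files": new_paths}


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    image = root / "a.jpg"
    image.write_bytes(b"image")
    label = root / "a.txt"
    label.write_text("0 0.5 0.5 0.2 0.2\n")

    record = {"id": "a", "status": 10, "files": [str(image), str(label)]}
    result = soft_delete(record, root / "deleted")

    original_gone = not image.exists() and not label.exists()
    relocated = all(Path(p).exists() for p in result["files"])
```

Returning a new dict rather than mutating the input keeps the before and after states available to the caller, which is convenient if the transition is later emitted as a lifecycle event.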
## Cross-references

- Suite-level integration narrative: `suite/_docs/01_annotations.md`
- Architecture vision + 13 ADRs: `_docs/02_document/architecture.md`
- 8 verified system flows F1–F8: `_docs/02_document/system-flows.md`
- Component-level specs: `_docs/02_document/components/*/description.md`
- Glossary (canonical terminology): `_docs/02_document/glossary.md`
- README: none in repo (gap noted in `_docs/02_document/00_discovery.md`).