# Azaion.Annotations: Documentation Report

## Executive Summary

Reverse-engineered the `Azaion.Annotations` codebase bottom-up: 11 module docs → 6 component specs → 1 system architecture + 8 verified flows + ER diagram + deployment + glossary, then synthesized a retrospective `solution.md` and a 5-file problem extraction. Verification surfaced 8 behavioral discrepancies between the code and the suite-level `01_annotations.md` narrative; all 8 were resolved with stakeholder decisions, captured as 13 ADRs and 9 Refactor Backlog items (RB-01..RB-09) inside `architecture.md`.

## Problem Statement

`Azaion.Annotations` is the suite's annotation lifecycle service. It is the single owner of the `annotations` table, the YOLO label files on disk, and the lifecycle event stream. Three independent consumers (Annotator UI, AI training pipeline, admin sync worker) need the same data shaped differently: push for humans (SSE) and durable pull for machines (RabbitMQ Stream). This service is the one place that reconciles those needs. See `_docs/00_problem/problem.md`.

## Architecture Overview

Single .NET 10 ASP.NET Core service, single PostgreSQL state-of-record, content-addressed filesystem cache, in-process SSE channel + transactional outbox drained to RabbitMQ Stream by a hosted background service. JWT bearer with three policies (`ANN`, `DATASET`, `ADM`). An idempotent boot-time DDL migrator removes the need for a separate migration deploy step. 13 ADRs capture the choices; 9 Refactor Backlog items capture the agreed-upon next moves. Full detail: `_docs/02_document/architecture.md`.

**Technology stack**: .NET 10 + ASP.NET Core + Linq2DB + Npgsql + JwtBearer + RabbitMQ.Stream.Client + MessagePack + xxHash3 (per RB-04) on PostgreSQL 13+.

**Deployment**: ARM64 multi-arch Docker image; branch-driven Woodpecker CI emits `${BRANCH}-arm` tags; orchestrator-managed at the suite level.

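The transactional outbox named in the architecture overview can be illustrated with a minimal, language-neutral sketch. This is not the service's C# code: it uses Python with an in-memory SQLite database as a stand-in for PostgreSQL, and the table layout, column names, and the `create_annotation` / `drain_outbox` helpers are all hypothetical. It shows only the general pattern: the state row and the outbox row commit in one transaction, and a separate drain step publishes queued rows and deletes each one after a successful publish.

```python
import json
import sqlite3

# Hypothetical minimal schema; the real tables live in PostgreSQL and differ.
# IF NOT EXISTS keeps the DDL idempotent, so it can be re-applied at every boot.
SCHEMA = """
CREATE TABLE IF NOT EXISTS annotations (
    id     TEXT PRIMARY KEY,
    status TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS outbox (
    seq           INTEGER PRIMARY KEY AUTOINCREMENT,
    annotation_id TEXT NOT NULL,
    operation     TEXT NOT NULL,
    payload       TEXT NOT NULL
);
"""


def open_db():
    """Open an in-memory database and apply the idempotent schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def create_annotation(conn, annotation_id, status):
    """Write the annotation row and its outbox row in a single transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO annotations (id, status) VALUES (?, ?)",
            (annotation_id, status),
        )
        conn.execute(
            "INSERT INTO outbox (annotation_id, operation, payload) VALUES (?, ?, ?)",
            (annotation_id, "created", json.dumps({"id": annotation_id, "status": status})),
        )


def drain_outbox(conn, publish, batch_size=100):
    """Publish queued rows in order; delete each row only after publish succeeds."""
    rows = conn.execute(
        "SELECT seq, annotation_id, operation, payload FROM outbox ORDER BY seq LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for seq, annotation_id, operation, payload in rows:
        publish(annotation_id, operation, payload)  # if this raises, the row stays queued
        with conn:
            conn.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
        sent += 1
    return sent
```

Because a row is deleted only after its publish call returns, a crash between the two steps causes the same message to be published again on the next drain. That at-least-once behavior is why the report pairs the outbox with a consumer-side dedupe key (ADR-013).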
## Component Summary

| # | Component | Purpose | Dependencies | Epic |
|---|-----------|---------|--------------|------|
| 01 | annotations-rest | Annotation CRUD + image/thumbnail file routes; YOLO label write; lifecycle producer | 02, 06 | TBD (Phase B) |
| 02 | annotations-realtime-sync | In-process SSE channel + transactional outbox + `FailsafeProducer` to RabbitMQ Stream | 06 | TBD (Phase B) |
| 03 | media | Multipart media upload (single + batch), file download, soft delete | 06 | TBD (Phase B) |
| 04 | dataset | Dataset exploration: filters, class distribution, bulk status writes | 06 (today couples 01, see RB-08) | TBD (Phase B) |
| 05 | settings-metadata | System / directory / camera / user settings + detection class catalog | 06 | TBD (Phase B) |
| 06 | platform | Composition root, JWT, error envelope, path resolver, DB migrator | - | TBD (Phase B) |

**Implementation order** (logical layer dependency, not "to-build"; the codebase already exists):

1. `06 platform` is the foundation; every other component imports from it.
2. `02 annotations-realtime-sync` is the lifecycle substrate that `01` and (post RB-01) `04` feed into.
3. `01 annotations-rest`, `03 media`, `05 settings-metadata` sit on top of `06` directly.
4. `04 dataset` reads the storage `01` writes; today via direct DB coupling, post RB-08 via `AnnotationService`.

**Refactor sequencing** is what Phase B will plan (Steps 8 onward); the Refactor Backlog already orders the items by impact:

1. RB-01 (lifecycle observability across mutations): unblocks the RB-09 stream contract and most Step 14 audit work.
2. RB-03 (transactional outbox wrapper): required before RB-01 is testable.
3. RB-04 (xxHash3.Hash128): small, isolated, can run in parallel.
4. RB-02 (drop `silent_detection`): small cleanup, after RB-01.
5. RB-08 (decouple `04 dataset` writes): unblocks soft-delete read filtering.
6. RB-07 (`Flight*` → `Mission*` rename): high-touch, needs coordination with suite consumers.
7. RB-06 (admin-managed detection classes): feature, can run in parallel.
8. RB-05 (replace `catch { }` in `FailsafeProducer.cs:138`): trivial, anytime.
9. RB-09 (stream dedupe contract `(annotation_id, operation, date_time)`): depends on RB-01.

## System Flows

| Flow | Description | Key Components |
|------|-------------|---------------|
| F1 | Annotation create: content-address image, persist, write label, fan-out (SSE + outbox) | 01, 02, 06 |
| F2 | Annotation listing / detail | 01, 06 |
| F3 | Real-time SSE subscription per mission | 01, 02, 06 |
| F4 | Failsafe outbox drain (background loop) → RabbitMQ Stream | 02, 06 |
| F5 | Media upload (single + batch) | 03, 06 |
| F6 | Auth: login + refresh token rotation | 06 |
| F7 | Directory settings change → `pathResolver.Reset()` invariant | 05, 06 |
| F8 | Dataset bulk status update | 04, 06 |

Full sequence diagrams: `_docs/02_document/system-flows.md` and `_docs/02_document/diagrams/flows/`.

## Risk Summary

The risks here are the operational and behavioral risks captured during verification, plus the Step 14 candidates from `security_approach.md`. They live in `architecture.md` (Risks + Refactor Backlog) and will be mirrored into a formal risk register during Phase B Step 12.

| Level | Count | Key Risks |
|-------|-------|-----------|
| Critical | 0 | - |
| High | 3 | (1) silent mutation paths break downstream consumers (RB-01); (2) outbox not transactional with FS+DB write (RB-03); (3) outbox has no row-leasing → multi-instance double-publish (OP-02 / blocked by RB-09 contract). (Former SEC-01, JWT issuer/audience not validated, was closed by the auth refactor.) |
| Medium | 4 | xxHash64 collision tolerance (RB-04); `silent_detection` ambiguity (RB-02); Swagger in prod (SEC-04); `04 dataset` direct-DB coupling (RB-08). (Former SEC-03, CORS wide-open, was closed by `CorsConfigurationValidator`.) |
| Low | 6 | `Flight` vs `Mission` naming drift (RB-07); empty `catch{}` (RB-05); `detection_classes` not admin-CRUD (RB-06); upload MIME whitelist (SEC-05); rate limiting (SEC-06); audit log substrate (SEC-08). |

**Iterations completed**: 1 verification pass with stakeholder review.

**All Critical/High risks mitigated**: No. High items are tracked as RB-01, RB-03, and OP-02 (multi-instance constraint, time-boxed by the current single-instance deployment). SEC-01 / SEC-02 / SEC-03 (the original auth + CORS gaps) were closed by the auth + CORS refactor between Steps 1 and 4. Remaining mitigations are scheduled, not executed.

## Test Coverage

The repo currently has **zero automated tests** (`_docs/02_document/00_discovery.md`) and CI runs only build-and-push (`.woodpecker/build-arm.yml`). Test coverage is planned, not measured.

| Component | Integration | Performance | Security | Acceptance | AC Coverage |
|-----------|-------------|-------------|----------|------------|-------------|
| 01 annotations-rest | 0 / TBD (Step 3) | 0 / TBD (Step 15) | 0 / TBD (Step 14) | 0 / 8 ACs | 0 / 8 |
| 02 realtime-sync | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 4 ACs | 0 / 4 |
| 03 media | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 2 ACs | 0 / 2 |
| 04 dataset | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 2 ACs | 0 / 2 |
| 05 settings-metadata | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 3 ACs | 0 / 3 |
| 06 platform | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 5 ACs | 0 / 5 |

**Overall acceptance criteria coverage**: 0 / 24 functional ACs + 0 / 5 non-functional ACs (0%). The autodev existing-code Phase A Steps 3 (Test Spec) and 6 (Implement Tests) own filling this matrix.

## Epic Roadmap

The current invocation completed **Phase A Step 1 (Document)** of the autodev existing-code flow. No tracker epics have been opened yet.
The work that follows in Phase A produces the test surface and the security/perf baselines:

| Order | Phase A Step | Output | Effort | Dependencies |
|-------|--------------|--------|--------|-------------|
| 1 | Step 2: Documentation Quality Audit | gap log | S | this report |
| 2 | Step 3: Test Spec | per-component `tests.md` (functional + integration shape) | M | step 2 |
| 3 | Step 4: Risk Mitigations | `risk_mitigations.md` | S | step 2 |
| 4 | Step 5: Solution Extraction (already done; see `_docs/01_solution/solution.md`) | - | - | - |
| 5 | Step 6: Implement Tests | actual test project + green CI step | L | step 3 |
| 6 | Step 7: Test Audit | coverage report against AC matrix | S | step 6 |

Phase B (Feature Cycle) then runs per feature/refactor. The 9 Refactor Backlog items become the first batch of Phase B epics; sizing per the user's Jira complexity rules will be 2–5 points each, with RB-07 (rename across DTOs/controllers/consumers) likely at the top of that range (5).

**Total estimated effort**: not committed. Phase A Steps 2–7 are scoped against this report; Phase B sizes per epic.

## Key Decisions Made

These are the 13 ADRs from `architecture.md`. Eight of them came from the verification stakeholder review.

| # | Decision | Rationale | Alternatives rejected |
|---|----------|-----------|----------------------|
| ADR-001 | In-process SSE channel for UI fan-out, separate transactional outbox for durable consumers | Sub-ms UI latency without standing up a broker for the inner loop | Single broker for both (UI latency hit); Postgres LISTEN/NOTIFY for UI (delivery semantics insufficient) |
| ADR-002 (RETIRED) | Originally: symmetric HS256 JWT, no issuer/audience validation. Now: ES256 verifier-only over admin's JWKS, with `iss` / `aud` / `exp` / `alg` all enforced. | Identity is centralised in admin; annotations holds no signing material | The original symmetric scheme it replaced |
| ADR-003 | Linq2DB + idempotent SQL DDL migrator (no EF, no DbUp/FluentMigrator) | Lighter dependency surface; one fewer deploy step (ADR-007) | EF Core migrations (heavier); FluentMigrator (separate runner) |
| ADR-004 | Annotation id = `XxHash3.Hash128` over a sampled image-bytes window | 128-bit space tolerates the suite's annotation volume; sampling keeps large-frame ingest cheap | Full SHA-256 (CPU); xxHash64 (collision space too small; RB-04 was the upgrade) |
| ADR-005 | Swagger UI mounted unconditionally | Internal-only deployment; aids debugging | Gating by env (deferred to SEC-04) |
| ADR-006 (RETIRED) | Originally: CORS `AllowAny*`. Now: config-driven allow-list (`CorsConfig:AllowedOrigins` + opt-in `AllowAnyOrigin`) gated by `CorsConfigurationValidator` per environment. | Production cannot start without an explicit origin policy | The original wide-open default |
| ADR-007 | DDL applied at boot, not in CI | Single deploy step; matches container-immutable model | Separate migration job (deploy complexity) |
| ADR-008 | Business-transaction wrapper (transactional outbox) for annotation lifecycle | Atomicity across DB + outbox; FS write tolerated as best-effort with cleanup | DTC across FS + DB + RabbitMQ (heavyweight, not portable) |
| ADR-009 | Every mutation path emits SSE + enqueues outbox row | One observability contract for humans + machines | SSE-only (durability gap for AI/admin worker); outbox-only (UI latency) |
| ADR-010 | Remove `silent_detection` flag | Behavior is contradictory once ADR-009 holds | Keep flag and gate on it (forces every consumer to interpret it) |
| ADR-011 | Detection class catalog becomes admin-managed (CRUD + cache) | Catalog evolves with deployments; migrator-only is a deploy-time-only escape hatch | Static catalog (RB-06 supersedes) |
| ADR-012 | Canonical term is `Mission`; `Flight*` symbols renamed | Single suite-level vocabulary | Keep `Flight` in this service (drift cost grows over time) |
| ADR-013 | On-the-wire dedupe key: `(annotation_id, operation, date_time)` | Lets every downstream consumer dedupe re-deliveries safely | Per-consumer offset trust (fragile under outbox replay) |

## Open Questions

All 6 verification-pass questions were resolved during stakeholder review. The follow-ups that remain genuinely open are:

| # | Question | Impact | Assigned To |
|---|----------|--------|-------------|
| 1 | Are P50/P95/P99 latency / throughput targets contracted anywhere in the suite? | Bounds NFR ACs (`AC-N-*`) and Step 15 perf-test shape. | Suite ops / product |
| 2 | What is the upload format whitelist `/media` should enforce? | Bounds SEC-05 fix scope. | Detections-pipeline owner |
| 3 | RPO/RTO contract for `images_dir` and `deleted_dir`? | Bounds the soft-delete restore story (post RB-01). | Suite ops |
| 4 | Stream retention window for `azaion-annotations`? | Bounds the consumer replay window the AI pipeline depends on. | Suite ops |
| 5 | Is multi-tenancy on the roadmap within the doc horizon? | Decides whether SEC-07 is a Step 14 must-fix or a deferred gap. | Product |

## Artifact Index

| File | Description |
|------|-------------|
| `_docs/02_document/architecture.md` | Architecture vision, 13 ADRs, refactor backlog, NFRs |
| `_docs/02_document/system-flows.md` | F1–F8 verified flow narratives |
| `_docs/02_document/data_model.md` | ERD + per-table contract reproduced from `DatabaseMigrator.cs` |
| `_docs/02_document/glossary.md` | 36 canonical terms (suite + project + code-level) |
| `_docs/02_document/module-layout.md` | Module → component mapping |
| `_docs/02_document/04_verification_log.md` | Verification pass corrections + stakeholder resolutions |
| `_docs/02_document/components/01_annotations-rest/description.md` | Component 01 spec |
| `_docs/02_document/components/02_annotations-realtime-sync/description.md` | Component 02 spec |
| `_docs/02_document/components/03_media/description.md` | Component 03 spec |
| `_docs/02_document/components/04_dataset/description.md` | Component 04 spec |
| `_docs/02_document/components/05_settings-metadata/description.md` | Component 05 spec |
| `_docs/02_document/components/06_platform/description.md` | Component 06 spec |
| `_docs/02_document/modules/*.md` | 11 module-level deep-dives |
| `_docs/02_document/diagrams/components.md` | Component diagram (Mermaid) |
| `_docs/02_document/diagrams/flows/flow_annotation_create.md` | F1 sequence (verified) |
| `_docs/02_document/diagrams/flows/flow_sse_subscription.md` | F3 sequence |
| `_docs/02_document/diagrams/flows/flow_failsafe_drain.md` | F4 sequence |
| `_docs/02_document/deployment/containerization.md` | Dockerfile-derived deployment notes |
| `_docs/02_document/deployment/ci_cd_pipeline.md` | Woodpecker pipeline-derived notes |
| `_docs/02_document/deployment/environment_strategy.md` | Env-var contract + ASPNETCORE_ENVIRONMENT use |
| `_docs/02_document/deployment/observability.md` | Logging + `/health` + outbox depth gap |
| `_docs/02_document/common-helpers/01_http-error-envelope.md` | Suite error envelope contract |
| `_docs/01_solution/solution.md` | Retrospective per-component solution table |
| `_docs/00_problem/problem.md` | Retrospective problem statement |
| `_docs/00_problem/restrictions.md` | HW / SW / ENV / OP / suite-level restrictions |
| `_docs/00_problem/acceptance_criteria.md` | 24 functional + 5 non-functional ACs |
| `_docs/00_problem/input_data/data_parameters.md` | REST DTOs + env vars + seed data + wire format |
| `_docs/00_problem/security_approach.md` | Auth/AuthZ/secrets posture + 8 SEC-XX gaps |

## Cross-references

- Suite-level integration narrative: `suite/_docs/01_annotations.md`
- Repo-config (monorepo discovery): `_docs/_repo-config.yaml`
- Autodev state: `_docs/_autodev_state.md`
- Document skill internal state: `_docs/02_document/state.json`