annotations/_docs/02_document/FINAL_report.md
Oleksandr Bezdieniezhnykh 03f879206e docs+src: complete Steps 1-3 outcomes + auth re-sync baseline
This commit captures everything produced during autodev existing-code
Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec),
together with the targeted auth + CORS re-sync triggered on 2026-05-14
when codebase drift was detected at Step 4 entry. None of this work was
previously committed.

Step 1 (Document) — 50+ _docs/02_document/ files: problem, solution,
architecture, system flows, glossary, module-layout, per-component
specs (01..06), modules, deployment, diagrams, data model, FINAL
report, verification log, discovery.

Step 2 (Architecture Baseline) — architecture_compliance_baseline.md.
Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No
High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) — _docs/02_document/tests/* (67 scenarios across
blackbox, security, resilience, resource-limit, performance), plus
e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh,
scripts/run-performance-tests.sh. Coverage 88% over the active scope
(40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync — replaces the deleted in-house token
issuer with a JWKS-verifier model. AuthController and TokenService
removed; JwtExtensions switched from HS256 symmetric to ES256 over
admin's JWKS. ConfigurationResolver and CorsConfigurationValidator
added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01,
SEC-02, SEC-03 marked Closed. One new testability risk recorded in
architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) — ES256, JWKS, alg pinning
- src/Program.cs (modified) — DI wiring for ConfigurationResolver
  and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) — no in-service issuance
- src/Services/TokenService.cs (deleted) — same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) — required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec
captures the change-spec for downstream services that consumed the now
deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 20:19:05 +03:00

# Azaion.Annotations — Documentation Report
## Executive Summary
Reverse-engineered the `Azaion.Annotations` codebase bottom-up — 11 module docs → 6 component specs → 1 system architecture + 8 verified flows + ER diagram + deployment + glossary, then synthesized a retrospective `solution.md` and a 5-file problem extraction. Verification surfaced 8 behavioral discrepancies between code and the suite-level `01_annotations.md` narrative; all 8 were resolved with stakeholder decisions, captured as 13 ADRs and 9 Refactor Backlog items (RB-01..RB-09) inside `architecture.md`.
## Problem Statement
`Azaion.Annotations` is the suite's annotation lifecycle service. It is the single owner of the `annotations` table, the YOLO label files on disk, and the lifecycle event stream. Three independent consumers (Annotator UI, AI training pipeline, admin sync worker) need the same data shaped differently — push for humans (SSE), durable-pull for machines (RabbitMQ Stream) — and this service is the one place that reconciles those needs. See `_docs/00_problem/problem.md`.
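
The "push for humans" half of that split can be pictured as an in-process channel keyed by mission. The sketch below is illustrative only: it is written in Python rather than the service's .NET, and `MissionChannel`, the queue size, and the event fields are hypothetical names chosen for the example, not taken from the codebase.

```python
import asyncio
from collections import defaultdict


class MissionChannel:
    """In-process fan-out: one bounded queue per subscriber, grouped by mission."""

    def __init__(self, max_pending: int = 100) -> None:
        self._subscribers = defaultdict(set)  # mission_id -> set of asyncio.Queue
        self._max_pending = max_pending

    def subscribe(self, mission_id: str) -> asyncio.Queue:
        queue = asyncio.Queue(maxsize=self._max_pending)
        self._subscribers[mission_id].add(queue)
        return queue

    def unsubscribe(self, mission_id: str, queue: asyncio.Queue) -> None:
        self._subscribers[mission_id].discard(queue)

    def publish(self, mission_id: str, event: dict) -> int:
        """Deliver to every subscriber of the mission; return the delivery count."""
        delivered = 0
        for queue in list(self._subscribers[mission_id]):
            try:
                queue.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                # Assumed policy: a slow client misses the event; the writer never blocks.
                pass
        return delivered


async def demo() -> list:
    channel = MissionChannel()
    ui_a = channel.subscribe("mission-1")
    ui_b = channel.subscribe("mission-1")
    other = channel.subscribe("mission-2")
    channel.publish("mission-1", {"operation": "created", "annotation_id": "abc"})
    received = [await ui_a.get(), await ui_b.get()]
    assert other.empty()  # subscribers of another mission see nothing
    return received


events = asyncio.run(demo())
```

Machine consumers do not use this path; they read the durable stream, which is why the two delivery modes are kept separate.
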
## Architecture Overview
Single .NET 10 ASP.NET Core service, single PostgreSQL state-of-record, content-addressed filesystem cache, in-process SSE channel + transactional outbox drained to RabbitMQ Stream by a hosted background service. JWT bearer with three policies (`ANN`, `DATASET`, `ADM`). Idempotent boot-time DDL migrator removes a separate migration deploy step.
13 ADRs captured the choices; 9 Refactor Backlog items capture the agreed-upon next moves. Full detail: `_docs/02_document/architecture.md`.
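
As a rough illustration of the transactional outbox mentioned above, the following Python sketch uses SQLite as a stand-in for PostgreSQL and a plain callback as a stand-in for the RabbitMQ Stream producer. Table and column names (`outbox`, `seq`, `published`) and the event shape are assumptions made for the example, not the service's schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE annotations (id TEXT PRIMARY KEY, mission_id TEXT NOT NULL);
    CREATE TABLE outbox (
        seq INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def create_annotation(conn, annotation_id, mission_id):
    """Write the outbox row and the state row in one transaction."""
    event = {"annotation_id": annotation_id, "operation": "created"}
    with conn:  # commits on success, rolls back both inserts on any exception
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))
        conn.execute(
            "INSERT INTO annotations (id, mission_id) VALUES (?, ?)",
            (annotation_id, mission_id),
        )


def drain_once(conn, publish):
    """One pass of the background drain: publish pending rows in order."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)


published = []
create_annotation(conn, "a1", "mission-1")
try:
    create_annotation(conn, "a1", "mission-1")  # duplicate id: whole transaction rolls back
except sqlite3.IntegrityError:
    pass

first = drain_once(conn, published.append)   # publishes the single pending row
second = drain_once(conn, published.append)  # nothing left to publish
```

Because the row is marked published only after the publish call returns, a crash between the two steps leads to a re-publish on the next pass. Delivery is therefore at-least-once, which is the situation the dedupe key in ADR-013 addresses.
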
**Technology stack**: .NET 10 + ASP.NET Core + Linq2DB + Npgsql + JwtBearer + RabbitMQ.Stream.Client + MessagePack + xxHash3 (per RB-04) on PostgreSQL 13+.
**Deployment**: ARM64 multi-arch Docker image; branch-driven Woodpecker CI emits `${BRANCH}-arm` tags; orchestrator-managed at the suite level.
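
The idempotent boot-time migrator that removes the separate migration deploy step can be sketched as follows. This is a minimal Python illustration with SQLite standing in for PostgreSQL; the DDL shown is hypothetical, and the real statements live in `DatabaseMigrator.cs`.

```python
import sqlite3

# Hypothetical DDL for illustration; every statement is safe to re-run.
DDL_STATEMENTS = [
    """CREATE TABLE IF NOT EXISTS annotations (
           id TEXT PRIMARY KEY,
           mission_id TEXT NOT NULL,
           status TEXT NOT NULL
       )""",
    "CREATE INDEX IF NOT EXISTS ix_annotations_mission ON annotations (mission_id)",
]


def migrate(conn: sqlite3.Connection) -> None:
    """Apply all DDL at boot; running it again changes nothing."""
    with conn:
        for statement in DDL_STATEMENTS:
            conn.execute(statement)


conn = sqlite3.connect(":memory:")
migrate(conn)  # first boot creates the schema
migrate(conn)  # a later boot is a no-op and raises nothing

tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]
```

The property that matters is that each statement tolerates an already-migrated database, so every container start can run the full list.
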
## Component Summary
| # | Component | Purpose | Dependencies | Epic |
|---|-----------|---------|--------------|------|
| 01 | annotations-rest | Annotation CRUD + image/thumbnail file routes; YOLO label write; lifecycle producer | 02, 06 | TBD (Phase B) |
| 02 | annotations-realtime-sync | In-process SSE channel + transactional outbox + `FailsafeProducer` to RabbitMQ Stream | 06 | TBD (Phase B) |
| 03 | media | Multipart media upload (single + batch), file download, soft delete | 06 | TBD (Phase B) |
| 04 | dataset | Dataset exploration: filters, class distribution, bulk status writes | 06 (today couples 01 — RB-08) | TBD (Phase B) |
| 05 | settings-metadata | System / directory / camera / user settings + detection class catalog | 06 | TBD (Phase B) |
| 06 | platform | Composition root, JWT, error envelope, path resolver, DB migrator | — | TBD (Phase B) |
**Implementation order** (logical layer dependency, not "to-build" — the codebase already exists):
1. `06 platform` is the foundation; every other component imports from it.
2. `02 annotations-realtime-sync` is the lifecycle substrate `01` and (post RB-01) `04` feed into.
3. `01 annotations-rest`, `03 media`, `05 settings-metadata` sit on top of `06` directly.
4. `04 dataset` reads the storage `01` writes; today via direct DB coupling, post RB-08 via `AnnotationService`.
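
The layer order above follows from the Dependencies column of the component table. A small Python sketch (illustrative, not part of the repository) derives a valid order with a topological sort; it encodes only the declared dependencies and leaves out today's `04` to `01` coupling tracked by RB-08.

```python
from graphlib import TopologicalSorter

# Component -> components it depends on, copied from the Dependencies column.
DEPENDS_ON = {
    "06 platform": set(),
    "02 annotations-realtime-sync": {"06 platform"},
    "01 annotations-rest": {"02 annotations-realtime-sync", "06 platform"},
    "03 media": {"06 platform"},
    "04 dataset": {"06 platform"},
    "05 settings-metadata": {"06 platform"},
}

# static_order() yields each component only after all of its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

Any order this produces starts with `06 platform` and places `02` before `01`; the relative order of `03`, `04`, and `05` is unconstrained.
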
**Refactor sequencing** is what Phase B will plan (Steps 8 onward); the Refactor Backlog already orders the items by impact:
1. RB-01 (lifecycle observability across mutations) — unblocks RB-09 stream contract and most Step 14 audit work.
2. RB-03 (transactional outbox wrapper) — required before RB-01 is testable.
3. RB-04 (xxHash3.Hash128) — small, isolated, can run parallel.
4. RB-02 (drop `silent_detection`) — small cleanup, after RB-01.
5. RB-08 (decouple `04 dataset` writes) — unblocks soft-delete read filtering.
6. RB-07 (`Flight*` → `Mission*` rename): high-touch, needs coordination with suite consumers.
7. RB-06 (admin-managed detection classes) — feature, can run parallel.
8. RB-05 (replace `catch { }` in `FailsafeProducer.cs:138`) — trivial, anytime.
9. RB-09 (stream dedupe contract `(annotation_id, operation, date_time)`) — depends on RB-01.
## System Flows
| Flow | Description | Key Components |
|------|-------------|---------------|
| F1 | Annotation create — content-address image, persist, write label, fan-out (SSE + outbox) | 01, 02, 06 |
| F2 | Annotation listing / detail | 01, 06 |
| F3 | Real-time SSE subscription per mission | 01, 02, 06 |
| F4 | Failsafe outbox drain (background loop) → RabbitMQ Stream | 02, 06 |
| F5 | Media upload (single + batch) | 03, 06 |
| F6 | Auth: login + refresh token rotation | 06 |
| F7 | Directory settings change → `pathResolver.Reset()` invariant | 05, 06 |
| F8 | Dataset bulk status update | 04, 06 |
Full sequence diagrams: `_docs/02_document/system-flows.md` and `_docs/02_document/diagrams/flows/`.
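
The "content-address image" step in F1 means the annotation id is derived from the image bytes themselves. The Python sketch below illustrates the idea under stated assumptions: the standard library has no xxHash, so a 128-bit BLAKE2b digest stands in for `XxHash3.Hash128`, and the sampling layout (head, middle, and tail windows plus the total length) is hypothetical, since the report does not describe the actual window.

```python
import hashlib

WINDOW = 4096  # hypothetical sample size in bytes


def sampled_window(data: bytes, window: int = WINDOW) -> bytes:
    """Return head + middle + tail slices; small inputs are used whole."""
    if len(data) <= 3 * window:
        return data
    mid = len(data) // 2 - window // 2
    return data[:window] + data[mid:mid + window] + data[-window:]


def annotation_id(image_bytes: bytes) -> str:
    """128-bit content-derived id as 32 hex characters."""
    digest = hashlib.blake2b(digest_size=16)  # stand-in for the 128-bit xxHash3
    digest.update(len(image_bytes).to_bytes(8, "little"))  # length guards the sampling
    digest.update(sampled_window(image_bytes))
    return digest.hexdigest()


frame = bytes(range(256)) * 100  # 25,600 bytes, large enough to be sampled
same = annotation_id(frame) == annotation_id(bytes(frame))
different = annotation_id(frame) != annotation_id(frame + b"\x00")
```

Hashing only a sample keeps ingest of large frames cheap, at the cost that two images differing solely outside the sampled regions would map to the same id.
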
## Risk Summary
Risks here are the operational / behavioral risks captured during verification + Step 14 candidates from `security_approach.md`. They live in `architecture.md` (Risks + Refactor Backlog) and will be mirrored into a formal risk register during Phase B Step 12.
| Level | Count | Key Risks |
|-------|-------|-----------|
| Critical | 0 | — |
| High | 3 | (1) silent mutation paths break downstream consumers (RB-01); (2) outbox not transactional with FS+DB write (RB-03); (3) outbox has no row-leasing → multi-instance double-publish (OP-02 / blocked by RB-09 contract). (Former SEC-01 — JWT issuer/audience not validated — closed by the auth refactor.) |
| Medium | 4 | xxHash64 collision tolerance (RB-04); `silent_detection` ambiguity (RB-02); Swagger in prod (SEC-04); `04 dataset` direct-DB coupling (RB-08). (Former SEC-03, CORS wide-open, was closed by `CorsConfigurationValidator`.) |
| Low | 6 | `Flight` vs `Mission` naming drift (RB-07); empty `catch{}` (RB-05); `detection_classes` not admin-CRUD (RB-06); upload MIME whitelist (SEC-05); rate limiting (SEC-06); audit log substrate (SEC-08). |
**Iterations completed**: 1 verification pass with stakeholder review.
**All Critical/High risks mitigated**: No — High items are tracked as RB-01, RB-03, and OP-02 (multi-instance constraint, time-boxed by current single-instance deployment). SEC-01 / SEC-02 / SEC-03 (the original auth + CORS gaps) were closed by the auth + CORS refactor between Steps 1 and 4. Remaining mitigations are scheduled, not executed.
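
To make the CORS closure concrete, here is a minimal Python sketch of a fail-fast startup check in the spirit of `CorsConfigurationValidator`. The exact rules are assumptions: this version requires an explicit allow-list in Production, rejects `AllowAnyOrigin` there, and accepts any configuration elsewhere. The real validator's per-environment rules may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CorsConfig:
    allowed_origins: list = field(default_factory=list)  # CorsConfig:AllowedOrigins
    allow_any_origin: bool = False                        # opt-in AllowAnyOrigin


class CorsConfigurationError(ValueError):
    """Raised at startup when the origin policy is not acceptable."""


def validate_cors(config: CorsConfig, environment: str) -> None:
    if environment.lower() != "production":
        return  # assumed: non-production environments are not gated
    if config.allow_any_origin:
        raise CorsConfigurationError("AllowAnyOrigin is not accepted in Production")
    if not config.allowed_origins:
        raise CorsConfigurationError("AllowedOrigins must list at least one origin")
    for origin in config.allowed_origins:
        if not origin.startswith(("http://", "https://")):
            raise CorsConfigurationError(f"Not an absolute origin: {origin!r}")


def can_start(config: CorsConfig, environment: str) -> bool:
    """Convenience wrapper for the demo: True when validation passes."""
    try:
        validate_cors(config, environment)
        return True
    except CorsConfigurationError:
        return False


explicit = CorsConfig(allowed_origins=["https://annotator.example"])
results = {
    "prod_explicit": can_start(explicit, "Production"),
    "prod_empty": can_start(CorsConfig(), "Production"),
    "prod_any": can_start(CorsConfig(allow_any_origin=True), "Production"),
    "dev_any": can_start(CorsConfig(allow_any_origin=True), "Development"),
}
```

Raising during startup, rather than logging a warning, is what turns "production cannot start without an explicit origin policy" into an enforced property.
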
## Test Coverage
The repo currently has **zero automated tests** (`_docs/02_document/00_discovery.md`) and CI runs only build-and-push (`.woodpecker/build-arm.yml`). Test coverage is planned, not measured.
| Component | Integration | Performance | Security | Acceptance | AC Coverage |
|-----------|-------------|-------------|----------|------------|-------------|
| 01 annotations-rest | 0 / TBD (Step 3) | 0 / TBD (Step 15) | 0 / TBD (Step 14) | 0 / 8 ACs | 0 / 8 |
| 02 realtime-sync | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 4 ACs | 0 / 4 |
| 03 media | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 2 ACs | 0 / 2 |
| 04 dataset | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 2 ACs | 0 / 2 |
| 05 settings-metadata | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 3 ACs | 0 / 3 |
| 06 platform | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 5 ACs | 0 / 5 |
**Overall acceptance criteria coverage**: 0 / 24 functional ACs + 0 / 5 non-functional ACs (0%).
The autodev existing-code Phase A Steps 3 (Test Spec) and 6 (Implement Tests) own filling this matrix.
## Epic Roadmap
The current invocation completed **Phase A Step 1 (Document)** of the autodev existing-code flow. No tracker epics have been opened yet. The work that follows in Phase A produces the test surface and the security/perf baselines:
| Order | Phase A Step | Output | Effort | Dependencies |
|-------|--------------|--------|--------|-------------|
| 1 | Step 2 — Documentation Quality Audit | gap log | S | this report |
| 2 | Step 3 — Test Spec | per-component `tests.md` (functional + integration shape) | M | step 2 |
| 3 | Step 4 — Risk Mitigations | `risk_mitigations.md` | S | step 2 |
| 4 | Step 5 — Solution Extraction (already done — see `_docs/01_solution/solution.md`) | — | — | — |
| 5 | Step 6 — Implement Tests | actual test project + green CI step | L | step 3 |
| 6 | Step 7 — Test Audit | coverage report against AC matrix | S | step 6 |

Phase B (Feature Cycle) then runs per feature/refactor. The 9 Refactor Backlog items become the first batch of Phase B epics; sizing per the user's Jira complexity rules will be 2–5 points each, except RB-07 (rename across DTOs/controllers/consumers), which is likely 5.
**Total estimated effort**: not committed. Phase A Steps 2–7 are scoped against this report; Phase B sizes per epic.
## Key Decisions Made
These are the 13 ADRs from `architecture.md`. Eight of them came from the verification stakeholder review.
| # | Decision | Rationale | Alternatives rejected |
|---|----------|-----------|----------------------|
| ADR-001 | In-process SSE channel for UI fan-out, separate transactional outbox for durable consumers | Sub-ms UI latency without standing up a broker for the inner loop | Single broker for both (UI latency hit); Postgres LISTEN/NOTIFY for UI (delivery semantics insufficient) |
| ADR-002 (RETIRED) | Originally: symmetric HS256 JWT, no issuer/audience validation. Now: ES256 verifier-only over admin's JWKS, with `iss` / `aud` / `exp` / `alg` all enforced. | Identity is centralised in admin; annotations holds no signing material | The original symmetric scheme it replaced |
| ADR-003 | Linq2DB + idempotent SQL DDL migrator (no EF, no DbUp/FluentMigrator) | Lighter dependency surface; one less deploy step (ADR-007) | EF Core migrations (heavier); FluentMigrator (separate runner) |
| ADR-004 | Annotation id = `XxHash3.Hash128` over a sampled image-bytes window | 128-bit space tolerates the suite's annotation volume; sampled keeps large-frame ingest cheap | Full SHA-256 (CPU); xxHash64 (collision space too small — RB-04 was the upgrade) |
| ADR-005 | Swagger UI mounted unconditionally | Internal-only deployment; aids debugging | Gating by env (deferred to SEC-04) |
| ADR-006 (RETIRED) | Originally: CORS `AllowAny*`. Now: config-driven allow-list (`CorsConfig:AllowedOrigins` + opt-in `AllowAnyOrigin`) gated by `CorsConfigurationValidator` per environment. | Production cannot start without an explicit origin policy | The original wide-open default |
| ADR-007 | DDL applied at boot, not in CI | Single deploy step; matches container-immutable model | Separate migration job (deploy complexity) |
| ADR-008 | Business-transaction wrapper (transactional outbox) for annotation lifecycle | Atomicity across DB + outbox; FS write tolerated as best-effort with cleanup | DTC across FS + DB + RabbitMQ (heavyweight, not portable) |
| ADR-009 | Every mutation path emits SSE + enqueues outbox row | One observability contract for humans + machines | SSE-only (durability gap for AI/admin worker); outbox-only (UI latency) |
| ADR-010 | Remove `silent_detection` flag | Behavior is contradictory once ADR-009 holds | Keep flag and gate on it (forces every consumer to interpret it) |
| ADR-011 | Detection class catalog becomes admin-managed (CRUD + cache) | Catalog evolves with deployments; migrator-only is a deploy-time-only escape hatch | Static catalog (RB-06 supersedes) |
| ADR-012 | Canonical term is `Mission`; `Flight*` symbols renamed | Single suite-level vocabulary | Keep `Flight` in this service (drift cost grows over time) |
| ADR-013 | On-the-wire dedupe key: `(annotation_id, operation, date_time)` | Lets every downstream consumer dedupe re-deliveries safely | Per-consumer offset trust (fragile under outbox replay) |
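
ADR-013's dedupe key can be applied on the consumer side roughly as in the Python sketch below. It is an illustration only: `LifecycleEvent`, the operation names, and the timestamp format are hypothetical, and a real consumer would bound or persist the set of seen keys instead of holding it in memory indefinitely.

```python
from typing import Iterable, Iterator, NamedTuple


class LifecycleEvent(NamedTuple):
    annotation_id: str
    operation: str
    date_time: str  # timestamp as carried on the wire (format assumed)
    payload: dict


def dedupe(events: Iterable[LifecycleEvent]) -> Iterator[LifecycleEvent]:
    """Yield each (annotation_id, operation, date_time) combination once."""
    seen = set()
    for event in events:
        key = (event.annotation_id, event.operation, event.date_time)
        if key in seen:
            continue  # re-delivery, for example after an outbox replay
        seen.add(key)
        yield event


stream = [
    LifecycleEvent("a1", "created", "2026-05-14T10:00:00Z", {}),
    LifecycleEvent("a1", "created", "2026-05-14T10:00:00Z", {}),  # replayed copy
    LifecycleEvent("a1", "updated", "2026-05-14T10:05:00Z", {}),
]
unique = list(dedupe(stream))
```

Because the key travels with the message, every consumer can make this decision without relying on stream offsets.
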
## Open Questions
All 6 verification-pass questions were resolved during stakeholder review. The follow-ups below remain open:
| # | Question | Impact | Assigned To |
|---|----------|--------|-------------|
| 1 | Are P50/P95/P99 latency / throughput targets contracted anywhere in the suite? | Bounds NFR ACs (`AC-N-*`) and Step 15 perf-test shape. | Suite ops / product |
| 2 | What is the upload format whitelist `/media` should enforce? | Bounds SEC-05 fix scope. | Detections-pipeline owner |
| 3 | RPO/RTO contract for `images_dir` and `deleted_dir`? | Bounds the soft-delete restore story (post RB-01). | Suite ops |
| 4 | Stream retention window for `azaion-annotations`? | Bounds the consumer replay window the AI pipeline depends on. | Suite ops |
| 5 | Is multi-tenancy on the roadmap within the doc horizon? | Decides whether SEC-07 is a Step 14 must-fix or a deferred gap. | Product |
## Artifact Index
| File | Description |
|------|-------------|
| `_docs/02_document/architecture.md` | Architecture vision, 13 ADRs, refactor backlog, NFRs |
| `_docs/02_document/system-flows.md` | F1–F8 verified flow narratives |
| `_docs/02_document/data_model.md` | ERD + per-table contract reproduced from `DatabaseMigrator.cs` |
| `_docs/02_document/glossary.md` | 36 canonical terms (suite + project + code-level) |
| `_docs/02_document/module-layout.md` | Module → component mapping |
| `_docs/02_document/04_verification_log.md` | Verification pass corrections + stakeholder resolutions |
| `_docs/02_document/components/01_annotations-rest/description.md` | Component 01 spec |
| `_docs/02_document/components/02_annotations-realtime-sync/description.md` | Component 02 spec |
| `_docs/02_document/components/03_media/description.md` | Component 03 spec |
| `_docs/02_document/components/04_dataset/description.md` | Component 04 spec |
| `_docs/02_document/components/05_settings-metadata/description.md` | Component 05 spec |
| `_docs/02_document/components/06_platform/description.md` | Component 06 spec |
| `_docs/02_document/modules/*.md` | 11 module-level deep-dives |
| `_docs/02_document/diagrams/components.md` | Component diagram (Mermaid) |
| `_docs/02_document/diagrams/flows/flow_annotation_create.md` | F1 sequence (verified) |
| `_docs/02_document/diagrams/flows/flow_sse_subscription.md` | F3 sequence |
| `_docs/02_document/diagrams/flows/flow_failsafe_drain.md` | F4 sequence |
| `_docs/02_document/deployment/containerization.md` | Dockerfile-derived deployment notes |
| `_docs/02_document/deployment/ci_cd_pipeline.md` | Woodpecker pipeline-derived notes |
| `_docs/02_document/deployment/environment_strategy.md` | Env-var contract + ASPNETCORE_ENVIRONMENT use |
| `_docs/02_document/deployment/observability.md` | Logging + `/health` + outbox depth gap |
| `_docs/02_document/common-helpers/01_http-error-envelope.md` | Suite error envelope contract |
| `_docs/01_solution/solution.md` | Retrospective per-component solution table |
| `_docs/00_problem/problem.md` | Retrospective problem statement |
| `_docs/00_problem/restrictions.md` | HW / SW / ENV / OP / suite-level restrictions |
| `_docs/00_problem/acceptance_criteria.md` | 24 functional + 5 non-functional ACs |
| `_docs/00_problem/input_data/data_parameters.md` | REST DTOs + env vars + seed data + wire format |
| `_docs/00_problem/security_approach.md` | Auth/AuthZ/secrets posture + 8 SEC-XX gaps |
## Cross-references
- Suite-level integration narrative: `suite/_docs/01_annotations.md`
- Repo-config (monorepo discovery): `_docs/_repo-config.yaml`
- Autodev state: `_docs/_autodev_state.md`
- Document skill internal state: `_docs/02_document/state.json`