annotations/_docs/02_document/FINAL_report.md
Oleksandr Bezdieniezhnykh 03f879206e docs+src: complete Steps 1-3 outcomes + auth re-sync baseline
This commit captures everything produced during autodev existing-code
Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec),
together with the targeted auth + CORS re-sync triggered on 2026-05-14
when codebase drift was detected at Step 4 entry. None of this work was
previously committed.

Step 1 (Document) — 50+ _docs/02_document/ files: problem, solution,
architecture, system flows, glossary, module-layout, per-component
specs (01..06), modules, deployment, diagrams, data model, FINAL
report, verification log, discovery.

Step 2 (Architecture Baseline) — architecture_compliance_baseline.md.
Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No
High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) — _docs/02_document/tests/* (67 scenarios across
blackbox, security, resilience, resource-limit, performance), plus
e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh,
scripts/run-performance-tests.sh. Coverage 88% over the active scope
(40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync — replaces the deleted in-house token
issuer with a JWKS-verifier model. AuthController and TokenService
removed; JwtExtensions switched from HS256 symmetric to ES256 over
admin's JWKS. ConfigurationResolver and CorsConfigurationValidator
added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01,
SEC-02, SEC-03 marked Closed. One new testability risk recorded in
architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) — ES256, JWKS, alg pinning
- src/Program.cs (modified) — DI wiring for ConfigurationResolver
  and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) — no in-service issuance
- src/Services/TokenService.cs (deleted) — same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) — required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec
captures the change-spec for downstream services that consumed the now
deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 20:19:05 +03:00


Azaion.Annotations — Documentation Report

Executive Summary

Reverse-engineered the Azaion.Annotations codebase bottom-up — 11 module docs → 6 component specs → 1 system architecture + 8 verified flows + ER diagram + deployment + glossary, then synthesized a retrospective solution.md and a 5-file problem extraction. Verification surfaced 8 behavioral discrepancies between code and the suite-level 01_annotations.md narrative; all 8 were resolved with stakeholder decisions, captured as 13 ADRs and 9 Refactor Backlog items (RB-01..RB-09) inside architecture.md.

Problem Statement

Azaion.Annotations is the suite's annotation lifecycle service. It is the single owner of the annotations table, the YOLO label files on disk, and the lifecycle event stream. Three independent consumers (Annotator UI, AI training pipeline, admin sync worker) need the same data shaped differently — push for humans (SSE), durable-pull for machines (RabbitMQ Stream) — and this service is the one place that reconciles those needs. See _docs/00_problem/problem.md.

Architecture Overview

Single .NET 10 ASP.NET Core service, single PostgreSQL state-of-record, content-addressed filesystem cache, in-process SSE channel + transactional outbox drained to RabbitMQ Stream by a hosted background service. JWT bearer with three policies (ANN, DATASET, ADM). Idempotent boot-time DDL migrator removes a separate migration deploy step.
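
The idempotent boot-time migrator mentioned above can be pictured with a small sketch. It is written in Python with SQLite standing in for PostgreSQL purely for illustration (the service itself is .NET with Linq2DB); the table, column, and index names are hypothetical and are not taken from DatabaseMigrator.cs.

```python
import sqlite3

# Hypothetical DDL for illustration; the real statements live in DatabaseMigrator.cs.
DDL_STATEMENTS = [
    """CREATE TABLE IF NOT EXISTS annotations (
           id TEXT PRIMARY KEY,
           mission_id TEXT NOT NULL,
           status TEXT NOT NULL
       )""",
    "CREATE INDEX IF NOT EXISTS ix_annotations_mission ON annotations (mission_id)",
]


def migrate(conn: sqlite3.Connection) -> None:
    """Apply every DDL statement; safe to call on each boot."""
    with conn:  # commit on success, roll back on error
        for statement in DDL_STATEMENTS:
            conn.execute(statement)


def table_names(conn: sqlite3.Connection) -> list:
    """List user tables so the effect of repeated boots can be inspected."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # second boot: no error, no duplicate objects
```

Because every statement is guarded by IF NOT EXISTS, running the migrator on each start leaves existing data untouched, which is what lets the deploy skip a separate migration step.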

13 ADRs captured the choices; 9 Refactor Backlog items capture the agreed-upon next moves. Full detail: _docs/02_document/architecture.md.

Technology stack: .NET 10 + ASP.NET Core + Linq2DB + Npgsql + JwtBearer + RabbitMQ.Stream.Client + MessagePack + xxHash3 (per RB-04) on PostgreSQL 13+.
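
As a rough illustration of a content-addressed id built from a sampled byte window, the sketch below hashes a head and tail sample plus the total length into a 128-bit value. It is an assumption-heavy stand-in: the report does not describe the real sampling scheme, so the head-plus-tail window and its size are hypothetical, and `hashlib.blake2b` with a 16-byte digest replaces XxHash3.Hash128 only because the latter is not in the Python standard library.

```python
import hashlib

WINDOW = 4096  # hypothetical sample size in bytes


def sampled_window(data: bytes, window: int = WINDOW) -> bytes:
    """Return head + tail samples; small inputs are used whole."""
    if len(data) <= 2 * window:
        return data
    return data[:window] + data[-window:]


def content_id(data: bytes) -> str:
    """128-bit hex id from the sampled bytes plus the total length."""
    h = hashlib.blake2b(digest_size=16)  # stand-in for XxHash3.Hash128 (assumption)
    h.update(len(data).to_bytes(8, "little"))
    h.update(sampled_window(data))
    return h.hexdigest()
```

Under this hypothetical sampling, two large inputs of equal length that differ only outside the sampled windows map to the same id. That is the trade-off any sampled scheme accepts in exchange for cheap ingest of large frames.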

Deployment: ARM64 multi-arch Docker image; branch-driven Woodpecker CI emits ${BRANCH}-arm tags; orchestrator-managed at the suite level.

Component Summary

| # | Component | Purpose | Dependencies | Epic |
|---|---|---|---|---|
| 01 | annotations-rest | Annotation CRUD + image/thumbnail file routes; YOLO label write; lifecycle producer | 02, 06 | TBD (Phase B) |
| 02 | annotations-realtime-sync | In-process SSE channel + transactional outbox + FailsafeProducer to RabbitMQ Stream | 06 | TBD (Phase B) |
| 03 | media | Multipart media upload (single + batch), file download, soft delete | 06 | TBD (Phase B) |
| 04 | dataset | Dataset exploration: filters, class distribution, bulk status writes | 06 (today couples 01; see RB-08) | TBD (Phase B) |
| 05 | settings-metadata | System / directory / camera / user settings + detection class catalog | 06 | TBD (Phase B) |
| 06 | platform | Composition root, JWT, error envelope, path resolver, DB migrator | | TBD (Phase B) |

Implementation order (logical layer dependency, not "to-build" — the codebase already exists):

  1. 06 platform is the foundation; every other component imports from it.
  2. 02 annotations-realtime-sync is the lifecycle substrate 01 and (post RB-01) 04 feed into.
  3. 01 annotations-rest, 03 media, 05 settings-metadata sit on top of 06 directly.
  4. 04 dataset reads the storage 01 writes; today via direct DB coupling, post RB-08 via AnnotationService.
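
The layering above can be checked mechanically. The sketch below encodes the Dependencies column of the Component Summary as a graph and confirms that the documented order is a valid topological order. Treating today's 04-to-01 coupling (RB-08) as a dependency edge is this sketch's reading of the table, not something the report states as a formal rule.

```python
from graphlib import TopologicalSorter

# Component -> components it depends on, per the Component Summary.
# The 04 -> 01 edge reflects today's direct-DB coupling (RB-08); an assumption.
DEPENDS_ON = {
    "01": {"02", "06"},
    "02": {"06"},
    "03": {"06"},
    "04": {"06", "01"},
    "05": {"06"},
    "06": set(),
}


def is_valid_order(order, depends_on):
    """True if every component appears after all of its dependencies."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[dep] < position[name]
        for name, deps in depends_on.items()
        for dep in deps
    )


documented_order = ["06", "02", "01", "03", "05", "04"]

# Group components into layers that could be worked on in parallel.
layers = []
sorter = TopologicalSorter(DEPENDS_ON)
sorter.prepare()
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    layers.append(ready)
    sorter.done(*ready)
```

With these edges the layers come out as 06, then 02/03/05, then 01, then 04. The documented order is one valid linearisation of that graph.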

Refactor sequencing is what Phase B will plan (Steps 8 onward); the Refactor Backlog already orders the items by impact:

  1. RB-01 (lifecycle observability across mutations) — unblocks RB-09 stream contract and most Step 14 audit work.
  2. RB-03 (transactional outbox wrapper) — required before RB-01 is testable.
  3. RB-04 (xxHash3.Hash128) — small, isolated, can run parallel.
  4. RB-02 (drop silent_detection) — small cleanup, after RB-01.
  5. RB-08 (decouple 04 dataset writes) — unblocks soft-delete read filtering.
  6. RB-07 (Flight* to Mission* rename): high-touch, needs coordination with suite consumers.
  7. RB-06 (admin-managed detection classes) — feature, can run parallel.
  8. RB-05 (replace catch { } in FailsafeProducer.cs:138) — trivial, anytime.
  9. RB-09 (stream dedupe contract (annotation_id, operation, date_time)) — depends on RB-01.
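
To make the RB-09 dedupe contract concrete, here is a minimal consumer-side sketch in Python. Only the key fields (annotation_id, operation, date_time) come from the report; the `LifecycleEvent` type, the `payload` field, and the string timestamp format are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class LifecycleEvent(NamedTuple):
    annotation_id: str
    operation: str
    date_time: str  # timestamp as emitted by the producer (format assumed)
    payload: dict   # hypothetical field, not part of the dedupe key


def dedupe(events: Iterable[LifecycleEvent]) -> Iterator[LifecycleEvent]:
    """Yield each event once, keyed on (annotation_id, operation, date_time)."""
    seen = set()
    for event in events:
        key = (event.annotation_id, event.operation, event.date_time)
        if key in seen:
            continue  # re-delivery, for example after an outbox replay
        seen.add(key)
        yield event
```

The `seen` set grows without bound in this sketch. A real consumer would presumably cap it to the stream's replay window, which is still one of the open questions in this report.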

System Flows

| Flow | Description | Key Components |
|---|---|---|
| F1 | Annotation create: content-address image, persist, write label, fan-out (SSE + outbox) | 01, 02, 06 |
| F2 | Annotation listing / detail | 01, 06 |
| F3 | Real-time SSE subscription per mission | 01, 02, 06 |
| F4 | Failsafe outbox drain (background loop) → RabbitMQ Stream | 02, 06 |
| F5 | Media upload (single + batch) | 03, 06 |
| F6 | Auth: login + refresh token rotation | 06 |
| F7 | Directory settings change → pathResolver.Reset() invariant | 05, 06 |
| F8 | Dataset bulk status update | 04, 06 |

Full sequence diagrams: _docs/02_document/system-flows.md and _docs/02_document/diagrams/flows/.
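
The write side of F1 and the drain loop of F4 can be sketched together. This is a minimal Python illustration with SQLite standing in for PostgreSQL and a plain callable standing in for the RabbitMQ Stream producer. It shows the shape targeted by ADR-008 rather than the current code, and every table, column, and function name is hypothetical. The label-file write and the SSE fan-out are left out.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE annotations (id TEXT PRIMARY KEY, mission_id TEXT)")
    conn.execute(
        "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
    )
    return conn


def create_annotation(conn, annotation_id, mission_id):
    """Insert the annotation and its outbox row in one transaction."""
    event = {"annotation_id": annotation_id, "operation": "created"}
    with conn:  # both rows commit together or not at all
        conn.execute(
            "INSERT INTO annotations VALUES (?, ?)", (annotation_id, mission_id)
        )
        conn.execute("INSERT INTO outbox (body) VALUES (?)", (json.dumps(event),))


def drain_once(conn, publish):
    """Publish pending rows in order; delete a row only after publish returns."""
    rows = conn.execute("SELECT seq, body FROM outbox ORDER BY seq").fetchall()
    for seq, body in rows:
        publish(json.loads(body))  # an exception here leaves the row for the next pass
        with conn:
            conn.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
    return len(rows)
```

Deleting only after a successful publish gives at-least-once delivery, so a crash between publish and delete produces a re-delivery. That is why downstream consumers need a dedupe key.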

Risk Summary

Risks here are the operational / behavioral risks captured during verification + Step 14 candidates from security_approach.md. They live in architecture.md (Risks + Refactor Backlog) and will be mirrored into a formal risk register during Phase B Step 12.

| Level | Count | Key Risks |
|---|---|---|
| Critical | 0 | |
| High | 3 | (1) silent mutation paths break downstream consumers (RB-01); (2) outbox not transactional with FS+DB write (RB-03); (3) outbox has no row-leasing → multi-instance double-publish (OP-02 / blocked by RB-09 contract). (Former SEC-01, JWT issuer/audience not validated, was closed by the auth refactor.) |
| Medium | 4 | xxHash64 collision tolerance (RB-04); silent_detection ambiguity (RB-02); Swagger in prod (SEC-04); 04 dataset direct-DB coupling (RB-08). (Former SEC-03, CORS wide-open, was closed by CorsConfigurationValidator.) |
| Low | 6 | Flight vs Mission naming drift (RB-07); empty catch{} (RB-05); detection_classes not admin-CRUD (RB-06); upload MIME whitelist (SEC-05); rate limiting (SEC-06); audit log substrate (SEC-08). |

Iterations completed: 1 verification pass with stakeholder review. All Critical/High risks mitigated: No — High items are tracked as RB-01, RB-03, and OP-02 (multi-instance constraint, time-boxed by current single-instance deployment). SEC-01 / SEC-02 / SEC-03 (the original auth + CORS gaps) were closed by the auth + CORS refactor between Steps 1 and 4. Remaining mitigations are scheduled, not executed.
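
The startup gate that closed SEC-03 can be pictured roughly as follows. This Python sketch is a guess at the shape of such a validator, not a description of CorsConfigurationValidator: the report only states that production cannot start without an explicit origin policy, so the exact rule, the environment name check, and the behaviour outside production are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class CorsConfig:
    # Mirrors the CorsConfig:AllowedOrigins and AllowAnyOrigin settings named in ADR-006.
    allowed_origins: list = field(default_factory=list)
    allow_any_origin: bool = False


def validate_cors(config: CorsConfig, environment: str) -> None:
    """Raise ValueError at startup if Production has no explicit origin policy.

    Assumed rule: either a non-empty allow-list or an explicit opt-in counts
    as an explicit policy; other environments are not gated in this sketch.
    """
    has_policy = bool(config.allowed_origins) or config.allow_any_origin
    if environment == "Production" and not has_policy:
        raise ValueError("CORS: set AllowedOrigins or opt in to AllowAnyOrigin")
```

Failing at startup rather than at request time means a misconfigured deployment never serves traffic with an accidental wide-open policy.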

Test Coverage

The repo currently has zero automated tests (_docs/02_document/00_discovery.md) and CI runs only build-and-push (.woodpecker/build-arm.yml). Test coverage is planned, not measured.

| Component | Integration | Performance | Security | Acceptance | AC Coverage |
|---|---|---|---|---|---|
| 01 annotations-rest | 0 / TBD (Step 3) | 0 / TBD (Step 15) | 0 / TBD (Step 14) | 0 / 8 ACs | 0 / 8 |
| 02 realtime-sync | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 4 ACs | 0 / 4 |
| 03 media | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 2 ACs | 0 / 2 |
| 04 dataset | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 2 ACs | 0 / 2 |
| 05 settings-metadata | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 3 ACs | 0 / 3 |
| 06 platform | 0 / TBD | 0 / TBD | 0 / TBD | 0 / 5 ACs | 0 / 5 |

Overall acceptance criteria coverage: 0 / 24 functional ACs + 0 / 5 non-functional ACs (0%).

The autodev existing-code Phase A Steps 3 (Test Spec) and 6 (Implement Tests) own filling this matrix.

Epic Roadmap

The current invocation completed Phase A Step 1 (Document) of the autodev existing-code flow. No tracker epics have been opened yet. The work that follows in Phase A produces the test surface and the security/perf baselines:

| Order | Phase A Step | Output | Effort | Dependencies |
|---|---|---|---|---|
| 1 | Step 2: Documentation Quality Audit | gap log | S | this report |
| 2 | Step 3: Test Spec | per-component tests.md (functional + integration shape) | M | step 2 |
| 3 | Step 4: Risk Mitigations | risk_mitigations.md | S | step 2 |
| 4 | Step 5: Solution Extraction | (already done; see _docs/01_solution/solution.md) | | |
| 5 | Step 6: Implement Tests | actual test project + green CI step | L | step 3 |
| 6 | Step 7: Test Audit | coverage report against AC matrix | S | step 6 |

Phase B (Feature Cycle) then runs per feature/refactor. The 9 Refactor Backlog items become the first batch of Phase B epics; sizing per the user's Jira complexity rules will be 2–5 points each, except RB-07 (rename across DTOs/controllers/consumers; likely 5).

Total estimated effort: not committed. Phase A Steps 2–7 are scoped against this report; Phase B sizes per epic.

Key Decisions Made

These are the 13 ADRs from architecture.md. Eight of them came from the verification stakeholder review.

| # | Decision | Rationale | Alternatives rejected |
|---|---|---|---|
| ADR-001 | In-process SSE channel for UI fan-out, separate transactional outbox for durable consumers | Sub-ms UI latency without standing up a broker for the inner loop | Single broker for both (UI latency hit); Postgres LISTEN/NOTIFY for UI (delivery semantics insufficient) |
| ADR-002 | (RETIRED) Originally: symmetric HS256 JWT, no issuer/audience validation. Now: ES256 verifier-only over admin's JWKS, with iss / aud / exp / alg all enforced. | Identity is centralised in admin; annotations holds no signing material | The original symmetric scheme it replaced |
| ADR-003 | Linq2DB + idempotent SQL DDL migrator (no EF, no DbUp/FluentMigrator) | Lighter dependency surface; one less deploy step (ADR-007) | EF Core migrations (heavier); FluentMigrator (separate runner) |
| ADR-004 | Annotation id = XxHash3.Hash128 over a sampled image-bytes window | 128-bit space tolerates the suite's annotation volume; sampled keeps large-frame ingest cheap | Full SHA-256 (CPU); xxHash64 (collision space too small; RB-04 was the upgrade) |
| ADR-005 | Swagger UI mounted unconditionally | Internal-only deployment; aids debugging | Gating by env (deferred to SEC-04) |
| ADR-006 | (RETIRED) Originally: CORS AllowAny*. Now: config-driven allow-list (CorsConfig:AllowedOrigins + opt-in AllowAnyOrigin) gated by CorsConfigurationValidator per environment. | Production cannot start without an explicit origin policy | The original wide-open default |
| ADR-007 | DDL applied at boot, not in CI | Single deploy step; matches container-immutable model | Separate migration job (deploy complexity) |
| ADR-008 | Business-transaction wrapper (transactional outbox) for annotation lifecycle | Atomicity across DB + outbox; FS write tolerated as best-effort with cleanup | DTC across FS + DB + RabbitMQ (heavyweight, not portable) |
| ADR-009 | Every mutation path emits SSE + enqueues outbox row | One observability contract for humans + machines | SSE-only (durability gap for AI/admin worker); outbox-only (UI latency) |
| ADR-010 | Remove silent_detection flag | Behavior is contradictory once ADR-009 holds | Keep flag and gate on it (forces every consumer to interpret it) |
| ADR-011 | Detection class catalog becomes admin-managed (CRUD + cache) | Catalog evolves with deployments; migrator-only is a deploy-time-only escape hatch | Static catalog (RB-06 supersedes) |
| ADR-012 | Canonical term is Mission; Flight* symbols renamed | Single suite-level vocabulary | Keep Flight in this service (drift cost grows over time) |
| ADR-013 | On-the-wire dedupe key: (annotation_id, operation, date_time) | Lets every downstream consumer dedupe re-deliveries safely | Per-consumer offset trust (fragile under outbox replay) |
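
The verifier-only posture of ADR-002 (iss / aud / exp / alg all enforced) can be illustrated with a claims check. The Python sketch below deliberately covers only algorithm pinning and claim validation. It does not verify the ES256 signature or fetch a JWKS, both of which need a cryptography library, and the function name and error messages are hypothetical.

```python
import base64
import json
import time


def b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment as used in compact JWTs."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def check_token(token: str, issuer: str, audience: str, now: float = None) -> dict:
    """Validate the header alg and the iss/aud/exp claims.

    This sketch does NOT verify the signature; a real verifier must do so
    against the issuer's published key before trusting any claim.
    """
    header_b64, payload_b64, _signature = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    claims = json.loads(b64url_decode(payload_b64))
    if header.get("alg") != "ES256":  # alg pinning: rejects HS256, "none", etc.
        raise ValueError("unexpected alg")
    if claims.get("iss") != issuer:
        raise ValueError("unexpected iss")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise ValueError("unexpected aud")
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired or exp missing")
    return claims
```

Pinning the algorithm before looking at any claim is what blocks the classic downgrade in which a token signed with a symmetric key is presented to an asymmetric verifier.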

Open Questions

All 6 verification-pass questions were resolved during stakeholder review. Genuinely-open follow-ups now:

| # | Question | Impact | Assigned To |
|---|---|---|---|
| 1 | Are P50/P95/P99 latency / throughput targets contracted anywhere in the suite? | Bounds NFR ACs (AC-N-*) and Step 15 perf-test shape. | Suite ops / product |
| 2 | What is the upload format whitelist /media should enforce? | Bounds SEC-05 fix scope. | Detections-pipeline owner |
| 3 | RPO/RTO contract for images_dir and deleted_dir? | Bounds the soft-delete restore story (post RB-01). | Suite ops |
| 4 | Stream retention window for azaion-annotations? | Bounds the consumer replay window the AI pipeline depends on. | Suite ops |
| 5 | Is multi-tenancy on the roadmap within the doc horizon? | Decides whether SEC-07 is a Step 14 must-fix or a deferred gap. | Product |

Artifact Index

| File | Description |
|---|---|
| _docs/02_document/architecture.md | Architecture vision, 13 ADRs, refactor backlog, NFRs |
| _docs/02_document/system-flows.md | F1–F8 verified flow narratives |
| _docs/02_document/data_model.md | ERD + per-table contract reproduced from DatabaseMigrator.cs |
| _docs/02_document/glossary.md | 36 canonical terms (suite + project + code-level) |
| _docs/02_document/module-layout.md | Module → component mapping |
| _docs/02_document/04_verification_log.md | Verification pass corrections + stakeholder resolutions |
| _docs/02_document/components/01_annotations-rest/description.md | Component 01 spec |
| _docs/02_document/components/02_annotations-realtime-sync/description.md | Component 02 spec |
| _docs/02_document/components/03_media/description.md | Component 03 spec |
| _docs/02_document/components/04_dataset/description.md | Component 04 spec |
| _docs/02_document/components/05_settings-metadata/description.md | Component 05 spec |
| _docs/02_document/components/06_platform/description.md | Component 06 spec |
| _docs/02_document/modules/*.md | 11 module-level deep-dives |
| _docs/02_document/diagrams/components.md | Component diagram (Mermaid) |
| _docs/02_document/diagrams/flows/flow_annotation_create.md | F1 sequence (verified) |
| _docs/02_document/diagrams/flows/flow_sse_subscription.md | F3 sequence |
| _docs/02_document/diagrams/flows/flow_failsafe_drain.md | F4 sequence |
| _docs/02_document/deployment/containerization.md | Dockerfile-derived deployment notes |
| _docs/02_document/deployment/ci_cd_pipeline.md | Woodpecker pipeline-derived notes |
| _docs/02_document/deployment/environment_strategy.md | Env-var contract + ASPNETCORE_ENVIRONMENT use |
| _docs/02_document/deployment/observability.md | Logging + /health + outbox depth gap |
| _docs/02_document/common-helpers/01_http-error-envelope.md | Suite error envelope contract |
| _docs/01_solution/solution.md | Retrospective per-component solution table |
| _docs/00_problem/problem.md | Retrospective problem statement |
| _docs/00_problem/restrictions.md | HW / SW / ENV / OP / suite-level restrictions |
| _docs/00_problem/acceptance_criteria.md | 24 functional + 5 non-functional ACs |
| _docs/00_problem/input_data/data_parameters.md | REST DTOs + env vars + seed data + wire format |
| _docs/00_problem/security_approach.md | Auth/AuthZ/secrets posture + 8 SEC-XX gaps |

Cross-references

  • Suite-level integration narrative: suite/_docs/01_annotations.md
  • Repo-config (monorepo discovery): _docs/_repo-config.yaml
  • Autodev state: _docs/_autodev_state.md
  • Document skill internal state: _docs/02_document/state.json