docs+src: complete Steps 1-3 outcomes + auth re-sync baseline

This commit captures everything produced during autodev existing-code
Steps 1 (Document), 2 (Architecture Baseline Scan), and 3 (Test Spec),
together with the targeted auth + CORS re-sync triggered on 2026-05-14
when codebase drift was detected at Step 4 entry. None of this work was
previously committed.

Step 1 (Document) — 50+ _docs/02_document/ files: problem, solution,
architecture, system flows, glossary, module-layout, per-component
specs (01..06), modules, deployment, diagrams, data model, FINAL
report, verification log, discovery.

Step 2 (Architecture Baseline) — architecture_compliance_baseline.md.
Verdict PASS_WITH_WARNINGS (0 Critical, 0 High, 1 Medium, 2 Low). No
High/Critical findings; auto-chained to Step 3 per existing-code flow.

Step 3 (Test Spec) — _docs/02_document/tests/* (67 scenarios across
blackbox, security, resilience, resource-limit, performance), plus
e2e/docker-compose.test.yml, e2e/seed/run.sh, scripts/run-tests.sh,
scripts/run-performance-tests.sh. Coverage 88% over the active scope
(40 of 45 items covered, 6 RB-deferred, 5 documented-as-uncovered).

Targeted auth + CORS re-sync — replaces the deleted in-house token
issuer with a JWKS-verifier model. AuthController and TokenService
removed; JwtExtensions switched from HS256 symmetric to ES256 over
admin's JWKS. ConfigurationResolver and CorsConfigurationValidator
added under src/Infrastructure/. ADR-002 and ADR-006 retired; SEC-01,
SEC-02, SEC-03 marked Closed. One new testability risk recorded in
architecture.md Open Risks Section 6 (JWKS HTTPS gating).

Source changes:
- src/Auth/JwtExtensions.cs (modified) — ES256, JWKS, alg pinning
- src/Program.cs (modified) — DI wiring for ConfigurationResolver
  and CorsConfigurationValidator
- src/Controllers/AuthController.cs (deleted) — no in-service issuance
- src/Services/TokenService.cs (deleted) — same
- src/Infrastructure/ConfigurationResolver.cs (new)
- src/Infrastructure/CorsConfigurationValidator.cs (new)
- .env.example (new) — required env var documentation
- .gitignore (updated)

Cross-repo coordination: _docs/cross-repo/flights_h1_h2_h3_change_spec
captures the change-spec for downstream services that consumed the
now-deleted /auth endpoints.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-14 20:19:05 +03:00
parent 08eadc1158
commit 03f879206e
66 changed files with 6006 additions and 133 deletions
@@ -0,0 +1,44 @@
# Component diagram (Azaion.Annotations)
Derived from the **six-component** breakdown (user choice **B**: Annotations REST split from realtime + RabbitMQ sync).
```mermaid
flowchart LR
subgraph platform [06 Platform]
DB[(PostgreSQL)]
AUTH[JWT / refresh]
PATH[Paths + errors]
end
subgraph media [03 Media]
MAPI["/media"]
end
subgraph annRest [01 Annotations REST]
ARAPI["/annotations REST + files"]
end
subgraph annRT [02 Annotations realtime and sync]
SSE["SSE /events"]
RMQ[RabbitMQ stream]
end
subgraph dataset [04 Dataset]
DAPI["/dataset DATASET"]
end
subgraph settings [05 Settings and metadata]
SAPI["/settings /classes"]
end
platform --> media
platform --> annRest
platform --> annRT
platform --> dataset
platform --> settings
media --> annRest
annRest --> annRT
annRest --> dataset
```
**Shared source file:** `AnnotationsController.cs` is **split by concern** between **01** (REST + static files) and **02** (SSE `Events` action).
@@ -0,0 +1,83 @@
# Flow F1 — Annotation Create
Cross-reference: `system-flows.md` → Flow F1.
## Sequence (verified against `Services/AnnotationService.cs`)
```mermaid
sequenceDiagram
autonumber
participant Caller as Detections / UI
participant Ctrl as AnnotationsController (01)
participant Svc as AnnotationService (01)
participant Path as PathResolver (06)
participant DB as PostgreSQL (06)
participant FS as Filesystem
participant Evt as AnnotationEventService (02)
participant Q as annotations_queue_records (DB / 02)
Caller->>Ctrl: POST /annotations (CreateAnnotationRequest, JWT ANN)
Ctrl->>Svc: CreateAnnotation(request, userIdFromJwt)
alt request.Image bytes provided
Svc->>Svc: ComputeHash (XxHash64 over sampled bytes) -> id
Svc->>FS: write {id}.jpg under images_dir
Svc->>DB: SELECT media WHERE id = :id
opt media row missing
Svc->>DB: INSERT media (Image, MediaStatus.New, ...)
end
else MediaId provided
Svc->>DB: SELECT media WHERE id = :MediaId (404 if missing)
opt source media file exists & target image missing
Svc->>FS: copy media.Path -> images_dir/{id}.jpg
end
end
Svc->>DB: INSERT annotations
Svc->>DB: BulkCopy detection rows
Svc->>FS: write {id}.txt (YOLO label) under labels_dir
Svc->>Evt: PublishAsync(AnnotationEventDto)
Svc->>DB: SELECT system_settings (FirstOrDefault)
alt SilentDetection != true
Svc->>Q: FailsafeProducer.EnqueueAsync(db, id, QueueOperation.Created)
end
Svc-->>Ctrl: Annotation
Ctrl-->>Caller: 201 Created (Location: /annotations/{id})
```
## Flowchart
```mermaid
flowchart TD
start([POST /annotations]) --> auth{JWT valid + ANN claim?}
auth -->|no| rej401([401 / 403])
auth -->|yes| input{bytes or MediaId?}
input -->|neither| arg([400 ArgumentException])
input -->|bytes| hash["ComputeHash sampled XxHash64 -> id"]
input -->|MediaId| lookupMedia[SELECT media WHERE id = MediaId]
lookupMedia -->|missing| nf404([404 KeyNotFound])
lookupMedia -->|exists| copyImg[copy media.Path to images dir if missing]
hash --> writeImg["write {id}.jpg"]
writeImg --> mediaRow[INSERT media if absent]
mediaRow --> writeDb
copyImg --> writeDb[INSERT annotations + BulkCopy detections]
writeDb --> writeLabel["write {id}.txt YOLO label"]
writeLabel --> sse[PublishAsync SSE event]
sse --> readSettings[SELECT system_settings]
readSettings --> silentChk{SilentDetection?}
silentChk -->|yes| ok([201 Created])
silentChk -->|no| outbox[FailsafeProducer.EnqueueAsync Created]
outbox --> ok
writeImg -->|IOException| err500([500 via ErrorHandlingMiddleware])
writeDb -->|DB error| err500
writeLabel -->|IOException| err500
outbox -->|DB error| err500
```
## Notes
- Image hashing is `XxHash64` over a **sampled** input (length prefix plus 1 KB each from the head, middle, and tail) for inputs larger than 3072 bytes. See ADR-004 in `architecture.md` for collision implications.
- The implementation is **not transactional across FS + DB + outbox**. Partial failure can leave orphan files or unsent outbox rows. Captured in `system-flows.md` → Open Behavioral Questions §4.
- `Update`, `UpdateStatus`, `DeleteAnnotation` paths do **NOT** publish SSE or enqueue outbox today. Captured in `system-flows.md` → Open Behavioral Questions §1.
- Outbox row is consumed asynchronously by Flow F4 (`FailsafeProducer`).
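
The sampled-hash note above can be illustrated with a minimal Python sketch. It is not the service's C# code. Several details are assumptions: the middle-window offset, the 8-byte little-endian length prefix, hashing small inputs whole, and the use of `hashlib.blake2b` truncated to 64 bits as a stand-in for `XxHash64` (which is not in the Python standard library). The names `sample_for_hash` and `compute_id` are hypothetical.

```python
import hashlib

SAMPLE = 1024            # window size for head, middle, and tail
THRESHOLD = 3 * SAMPLE   # inputs larger than 3072 bytes are sampled


def sample_for_hash(data: bytes) -> bytes:
    """Return the bytes that would be fed to the hash function."""
    if len(data) <= THRESHOLD:
        # Assumption: small inputs are hashed in full.
        return data
    mid = (len(data) - SAMPLE) // 2            # assumed middle offset
    prefix = len(data).to_bytes(8, "little")   # assumed length encoding
    return prefix + data[:SAMPLE] + data[mid:mid + SAMPLE] + data[-SAMPLE:]


def compute_id(data: bytes) -> str:
    # Stand-in for XxHash64: blake2b truncated to 8 bytes (64 bits).
    return hashlib.blake2b(sample_for_hash(data), digest_size=8).hexdigest()


# Two same-length inputs that differ only outside the sampled windows
# produce the same id, which is the collision risk the note refers to.
a = bytes(8192)
b = bytearray(8192)
b[2000] = 1   # outside head [0, 1024), middle [3584, 4608), tail [7168, 8192)
same_id = compute_id(a) == compute_id(bytes(b))
```

Under these assumptions `same_id` is true: a change that falls between the sampled windows does not alter the id, regardless of which 64-bit hash is used underneath.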
@@ -0,0 +1,52 @@
# Flow F4 — Failsafe Outbox Drain → RabbitMQ Stream
Cross-reference: `system-flows.md` → Flow F4.
## Sequence
```mermaid
sequenceDiagram
autonumber
participant FP as FailsafeProducer (02, IHostedService)
participant DB
participant Path as PathResolver (06)
participant FS as Filesystem
participant RMQ as RabbitMQ Stream "azaion-annotations"
loop while host running
FP->>DB: SELECT FROM annotations_queue_records
DB-->>FP: pending rows (operation, annotation_ids JSON)
loop per row
alt operation = Created
FP->>Path: GetImagePath(annotationId)
FP->>FS: read bytes
end
FP->>FP: serialize MessagePack (Annotation*QueueMessage)
FP->>RMQ: publish stream entry
alt publish ok
FP->>DB: DELETE annotations_queue_records WHERE id = :id
else stream unavailable
FP->>FP: log + backoff
end
end
end
```
## State
```mermaid
stateDiagram-v2
[*] --> Idle
Idle --> Draining: queue rows present
Draining --> Publishing: row picked
Publishing --> Acked: stream publish ok
Acked --> Idle: row deleted
Publishing --> Backoff: stream unavailable
Backoff --> Idle: backoff elapsed
```
## Notes
- See ADR-003 in `architecture.md` for rationale.
- Multi-instance drain: the DB has no row leasing, so two instances can pick up the same row and duplicate stream entries are possible. The suite consumer contract should dedupe.
- Bulk message (`AnnotationBulkQueueMessage`) carries multiple annotation ids; `Created` semantics on bulk are out of scope here — confirm during Step 4 verification.
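
The publish-then-delete ordering in the sequence above, and the duplicate risk from the multi-instance note, can be sketched in Python with in-memory stand-ins. This is not the `FailsafeProducer` implementation: `InMemoryOutbox`, `FlakyStream`, and `drain` are hypothetical names, and the payload shape is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OutboxRow:
    id: int
    operation: str
    annotation_ids: list


@dataclass
class InMemoryOutbox:
    """Stand-in for the annotations_queue_records table."""
    rows: dict = field(default_factory=dict)

    def pending(self):
        return list(self.rows.values())

    def delete(self, row_id):
        self.rows.pop(row_id, None)


class FlakyStream:
    """Stand-in for the stream; fails the first `failures` publishes."""

    def __init__(self, failures=0):
        self.failures = failures
        self.entries = []

    def publish(self, payload):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("stream unavailable")
        self.entries.append(payload)


def drain(rows, outbox, stream):
    """Publish each row and delete it only after a successful publish.

    Returns True when a publish failed and the caller should back off.
    """
    for row in rows:
        try:
            stream.publish((row.operation, tuple(row.annotation_ids)))
        except ConnectionError:
            return True    # row stays in the outbox for the next pass
        outbox.delete(row.id)
    return False


# Two instances that read the same snapshot before either deletes
# will both publish the row: delivery is at-least-once.
outbox = InMemoryOutbox({7: OutboxRow(7, "Created", ["a7"])})
stream = FlakyStream()
snapshot_a, snapshot_b = outbox.pending(), outbox.pending()
drain(snapshot_a, outbox, stream)
drain(snapshot_b, outbox, stream)
duplicates = len(stream.entries) - len(set(stream.entries))
```

In this sketch `duplicates` ends up as 1, which is why the consumer side needs a dedupe key; the key used here (operation plus annotation ids) is only an illustration.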
@@ -0,0 +1,43 @@
# Flow F3 — Real-time SSE Subscription
Cross-reference: `system-flows.md` → Flow F3.
## Sequence
```mermaid
sequenceDiagram
autonumber
participant UI
participant Ctrl as AnnotationsController.Events (component 02 doc-ownership)
participant Evt as AnnotationEventService (02)
participant ProducerF1 as Flow F1 (annotation create)
participant ProducerF8 as Flow F8 (dataset bulk status)
UI->>Ctrl: GET /annotations/events (Accept: text/event-stream, JWT ANN)
Ctrl->>Ctrl: set Content-Type: text/event-stream, no-cache
Ctrl->>Evt: ReadAllAsync(cancellationToken)
par event sources
ProducerF1->>Evt: PublishAsync(eventDto)
ProducerF8->>Evt: PublishAsync(eventDto)
end
Evt-->>Ctrl: yield AnnotationEventDto
Ctrl-->>UI: data: {json}\n\n
UI--xCtrl: client disconnects
Ctrl->>Ctrl: cancellation token fires; loop exits
```
## State
```mermaid
stateDiagram-v2
[*] --> Subscribing
Subscribing --> Streaming: header sent + reader attached
Streaming --> Streaming: PublishAsync -> data frame
Streaming --> Closed: client cancel / process restart
Closed --> [*]
```
## Notes
- Channel is **unbounded**: a slow client cannot back-pressure the producer. If a client stalls indefinitely, memory growth is bounded by per-publisher cancellation tokens at the controller level. Step 4 verification candidate.
- Cross-pod fan-out is **not provided** — each pod has its own channel. Sticky sessions or a broker-backed bus required for horizontal scale.
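
As a rough illustration of the unbounded-channel note, the Python sketch below contrasts an unbounded queue with a bounded one while no reader is draining. It uses the standard-library `queue.Queue` as a stand-in for the service's channel. The capacity of 100, the event shape, and the helper names `publish` and `sse_frame` are assumptions; the bounded variant is shown only for contrast, not as the documented design.

```python
import json
import queue


def sse_frame(event: dict) -> str:
    # One SSE frame: "data: <json>" followed by a blank line.
    return "data: " + json.dumps(event, separators=(",", ":")) + "\n\n"


def publish(channel: queue.Queue, event: dict) -> bool:
    """Non-blocking publish; returns False only when a bounded channel is full."""
    try:
        channel.put_nowait(event)
        return True
    except queue.Full:
        return False


unbounded = queue.Queue()            # maxsize=0 means no limit
bounded = queue.Queue(maxsize=100)   # hypothetical capacity

# Simulate a stalled client: 1000 events are published and nothing is read.
accepted_unbounded = sum(publish(unbounded, {"id": str(i)}) for i in range(1000))
accepted_bounded = sum(publish(bounded, {"id": str(i)}) for i in range(1000))
```

With a stalled reader the unbounded queue accepts all 1000 events and holds them in memory, while the bounded queue accepts 100 and rejects the rest, which is the point at which a producer could apply a drop or back-pressure policy.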