[AZ-529] [AZ-530] Cycle-2 documentation refresh

Refreshes _docs/02_document/ to reflect the cycle-2 auth-modernization
+ CMMC hardening landings (AZ-531..AZ-538). Authoritative source for
the ripple set is ripple_log_cycle2.md.

Covered:
- architecture.md (section 1 rewritten, ADRs 6-9 added)
- data_model.md (sessions, audit_events, user columns, migrations)
- system-flows.md (F1 rewritten; F11-F17 added; F2/F7/F9 minor)
- module-layout.md (cycle-2 sub-component table)
- diagrams/flows/flow_login.md (dual-token + MFA)
- components/{01_data_layer,03_auth_and_security,05_admin_api}
- modules/ (12 new, 8 modified — full Argon2id/ES256/MFA/refresh
  /mission/session/audit/jwks rollup)
- tests/{blackbox,security,traceability-matrix}

Step 13 (Update Docs) output for cycle 2.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-14 09:22:53 +03:00
parent c2c659ef62
commit a77b3f8a59
35 changed files with 3624 additions and 468 deletions
+493 -34
@@ -1,45 +1,65 @@
# Azaion Admin API — System Flows
> **Cycle 1 (2026-05-13) note** — F4 (Hardware Check) was deleted by AZ-197; F3 no longer depends on hardware. Two new flows were added: F8 Detection Classes CRUD (AZ-513), F9 Device Auto-Provisioning (AZ-196). F10 OTA Update Check & Publish (AZ-183) was reverted later the same day after the security audit (finding F-1) — the OTA delivery model itself was deemed obsolete; see `_docs/05_security/security_report.md` for context. F3's narrative was updated to drop the hardware-check step.
> **Cycle 1 (2026-05-13) note** — F4 (Hardware Check) was deleted by AZ-197; F8 Detection Classes (AZ-513), F9 Device Auto-Provisioning (AZ-196) added; F10 OTA reverted after security audit F-1.
>
> **Cycle 2 (2026-05-14) note** — F3 (Encrypted Resource Download) and F6 (Installer Download) were removed entirely as obsolete. The encrypted-download support stack (`Security.GetApiEncryptionKey`, `EncryptTo`, `DecryptTo`, `ResourcesService.GetEncryptedResource`, `ResourcesService.GetInstaller`, `GetResourceRequest`, `WrongResourceName` (50)) and the installer config (`SuiteInstallerFolder`, `SuiteStageInstallerFolder`) all went with them. See `_docs/02_document/architecture.md` ADR-003 (retired).
> **Cycle 2 — early (2026-05-14)** — F3 (Encrypted Resource Download) and F6 (Installer Download) removed entirely as obsolete. ADR-003 retired.
>
> **Cycle 2 — Auth Modernization (2026-05-14)** — F1 was rebuilt around the new dual-token + MFA model (AZ-531/532/534/536/537). Seven new flows were added: F11 Refresh Token Rotation (AZ-531), F12 Logout / Revocation (AZ-535), F13 Mission Token Issuance (AZ-533), F14 MFA Enrollment & Confirmation (AZ-534), F15 Verifier Revocation Snapshot (AZ-535), F16 Account Lockout & Per-IP Rate Limit (AZ-537), F17 JWKS Publication (AZ-532). The legacy single-token narrative is no longer accurate.
## Flow Inventory
| # | Flow Name | Trigger | Primary Components | Criticality |
|---|-----------|---------|-------------------|-------------|
| F1 | User Login | POST /login | Admin API, User Mgmt, Auth & Security | High |
| F2 | User Registration | POST /users | Admin API, User Mgmt | High |
| ~~F3~~ | ~~Encrypted Resource Download~~ | ~~POST /resources/get~~ | — | **REMOVED — cycle 2 (obsolete)** |
| ~~F4~~ | ~~Hardware Check~~ | ~~POST /resources/check~~ | — | **REMOVED — AZ-197** |
| F5 | Resource Upload | POST /resources | Admin API, Resource Mgmt | Medium |
| ~~F6~~ | ~~Installer Download~~ | ~~GET /resources/get-installer~~ | — | **REMOVED — cycle 2 (obsolete)** |
| F7 | User Management (CRUD) | Various /users/* | Admin API, User Mgmt | Medium |
| F8 | Detection Classes CRUD *(AZ-513)* | POST/PATCH/DELETE /classes | Admin API, DetectionClassService | High |
| F9 | Device Auto-Provisioning *(AZ-196)* | POST /devices | Admin API, User Mgmt | High |
| ~~F10~~ | ~~OTA Update Check & Publish~~ | ~~POST /get-update + POST /resources/publish~~ | — | **REMOVED — post-cycle-1 (AZ-183 reverted, see security audit F-1)** |
| F1 | User Login (dual token + MFA) | `POST /login` (+ `/login/mfa`) | Admin API, User Mgmt, Auth & Security | **Critical** |
| F2 | User Registration | `POST /users` | Admin API, User Mgmt | High |
| ~~F3~~ | ~~Encrypted Resource Download~~ | | — | **REMOVED — cycle 2 early** |
| ~~F4~~ | ~~Hardware Check~~ | | — | **REMOVED — AZ-197** |
| F5 | Resource Upload | `POST /resources` | Admin API, Resource Mgmt | Medium |
| ~~F6~~ | ~~Installer Download~~ | | — | **REMOVED — cycle 2 early** |
| F7 | User Management (CRUD) | Various `/users/*` | Admin API, User Mgmt | Medium |
| F8 | Detection Classes CRUD | `POST/PATCH/DELETE /classes` | Admin API, DetectionClassService | High |
| F9 | Device Auto-Provisioning | `POST /devices` | Admin API, User Mgmt | High |
| ~~F10~~ | ~~OTA Update Check & Publish~~ | — | — | **REMOVED — post-cycle-1** |
| **F11** | **Refresh Token Rotation** *(AZ-531)* | `POST /token/refresh` | Admin API, RefreshTokenService, AuthService, SessionService | **Critical** |
| **F12** | **Logout / Revocation** *(AZ-535)* | `POST /logout`, `/logout/all`, `/sessions/{sid}/revoke` | Admin API, SessionService | High |
| **F13** | **Mission Token Issuance** *(AZ-533)* | `POST /sessions/mission` | Admin API, MissionTokenService, SessionService, AuthService | High |
| **F14** | **MFA Enrollment & Confirmation** *(AZ-534)* | `POST /users/me/mfa/{enroll,confirm,disable}` | Admin API, MfaService, AuditLog | High |
| **F15** | **Verifier Revocation Snapshot** *(AZ-535)* | `GET /sessions/revoked?since=` | Admin API, SessionService | **Critical** for verifier fleet |
| **F16** | **Account Lockout & Rate Limit** *(AZ-537)* | (cross-cuts F1) | Admin API rate-limiter middleware, UserService, AuditLog | High |
| **F17** | **JWKS Publication** *(AZ-532)* | `GET /.well-known/jwks.json` | Admin API, JwtSigningKeyProvider | **Critical** for verifier fleet |
## Flow Dependencies
| Flow | Depends On | Shares Data With |
|------|-----------|-----------------|
| F1 | — | All other flows (produces JWT token) |
| F2 | — | F1, F9 (creates user records — including device users via F9) |
| F5 | F1 (requires JWT) | — |
| F7 | F1 (requires JWT, ApiAdmin role) | — |
| F8 | F1 (requires JWT, ApiAdmin role) | UI Detection Classes table |
| F9 | F1 (requires JWT, ApiAdmin role) | F2 (writes a user row, but reuses `RegisterUser` end-to-end), F1 (provisioned devices later log in) |
| F1 | F17 (signing keys must exist), F16 (rate limit gate) | F11 (refresh chain), F12 (sid is the revocation key), F14 (MFA branch) |
| F2 | — | F1 (created users can log in) |
| F5 | F1 / F11 (access token) | — |
| F7 | F1 / F11 + ApiAdmin | F12 (disabling a user revokes their sessions) |
| F8 | F1 / F11 + ApiAdmin | UI |
| F9 | F1 / F11 + ApiAdmin | F1 (provisioned devices later log in) |
| F11 | F1 (created the family) | F12 (rotation is the same row store) |
| F12 | F1 / F11 (sid claim) | F15 (revoked rows surface here) |
| F13 | F1 / F11 (pilot's interactive token) | F12 (auto-revoke prior aircraft mission rows) |
| F14 | F1 (caller is authenticated) | F1 (the MFA branch consumes enrolled state) |
| F15 | — (verifier role only) | F12 (consumes revocation rows) |
| F16 | — | F1, F11 (gates them) |
| F17 | — | F1, F11, F13, F14 (every signed token), F15 (verifiers cache JWKS) |
---
## Flow F1: User Login
## Flow F1: User Login (dual token + MFA) *(rebuilt cycle 2)*
### Description
A user submits email/password credentials. The system validates them against the database and returns a signed JWT token for subsequent authenticated requests.
A user submits email/password credentials. The system enforces per-IP and per-account rate limits + lockout (F16), verifies the password with constant-time Argon2id (lazily migrating from SHA-384 if needed — AZ-536), and either:
- (no MFA) issues a short-lived ES256 access token + opaque refresh token bound to a new session row, OR
- (MFA enabled) issues a short-lived `mfa_token` (JWT, audience `mfa-step`, signed by the active ES256 key) and waits for `POST /login/mfa` to complete the second factor.
### Preconditions
- User account exists in the database
- User knows correct password
- User account exists, is enabled, and is not within an active lockout window
- Per-IP rate-limit bucket has remaining permits
- Per-account sliding-window failed-login count is below threshold
- For the MFA branch: user has previously enrolled and confirmed MFA (F14)
### Sequence Diagram
@@ -47,27 +67,84 @@ A user submits email/password credentials. The system validates them against the
sequenceDiagram
participant Client
participant API as Admin API
participant RL as RateLimiter (per-IP, AZ-537)
participant US as UserService
participant AL as AuditLog
participant Sec as Security (Argon2id, AZ-536)
participant DB as PostgreSQL
participant Mfa as MfaService
participant RT as RefreshTokenService
participant Auth as AuthService
participant SS as SessionService
Client->>API: POST /login {email, password}
API->>RL: per-IP sliding window check
alt rate-limited
RL-->>Client: 429 + Retry-After
end
API->>US: ValidateUser(request)
US->>DB: SELECT user WHERE email = ?
DB-->>US: User record
US->>US: Compare password hash
US->>DB: SELECT users WHERE email=? (read conn)
US->>AL: CountRecentFailedLogins(email, window)
alt account locked OR per-account threshold exceeded
US-->>API: BusinessException(AccountLocked / LoginRateLimited, RetryAfterSeconds)
API-->>Client: 423 / 429 + Retry-After
end
US->>Sec: VerifyPassword(presented, stored)
alt VerifyResult.Ok=false
US->>AL: RecordLoginFailed
US->>DB: UPDATE failed_login_count, lockout_until
US-->>API: WrongPassword (or NoEmailFound)
API-->>Client: 409
end
alt VerifyResult.NeedsRehash=true
US->>Sec: HashPassword (Argon2id)
US->>DB: UPDATE password_hash (lazy migrate)
end
US->>AL: RecordLoginSuccess
US->>DB: UPDATE failed_login_count=0, last_login=now()
US-->>API: User entity
API->>Auth: CreateToken(user)
Auth-->>API: JWT string
API-->>Client: 200 OK {token}
alt user.MfaEnabled
API->>Mfa: IssueMfaStepToken(userId)
Mfa-->>API: short-lived JWT (mfa_pending=true)
API-->>Client: 200 OK {mfa_required: true, mfa_token, expires_in: 300}
else
API->>RT: IssueForNewLogin(userId, mfaAuthenticated=false)
RT->>DB: INSERT INTO sessions (new id, family_id=id, refresh_hash, expires_at, mfa_authenticated=false)
RT-->>API: (opaqueRefreshToken, Session)
API->>Auth: CreateToken(user, sessionId=Session.Id, jti=new, amr=["pwd"])
Auth-->>API: AccessToken (ES256)
opt user.Role == CompanionPC
API->>SS: RevokeMissionsForAircraft(user.Id) // F13 / AZ-533 AC-4
end
API-->>Client: 200 OK LoginResponse {AccessToken, AccessExp, RefreshToken, RefreshExp}
end
Note over Client,API: MFA branch only:
Client->>API: POST /login/mfa {mfa_token, code}
API->>RL: per-IP sliding window check
API->>Mfa: ValidateMfaStepToken(mfa_token) -> userId
API->>US: GetById(userId)
API->>Mfa: VerifyForLogin(userId, code) -> amr
Mfa->>DB: TOTP verify against decrypted mfa_secret OR recovery code consume
Mfa->>AL: RecordMfaLoginSuccess (or MfaRecoveryUsed)
API->>RT: IssueForNewLogin(userId, mfaAuthenticated=true)
API->>Auth: CreateToken(user, sessionId, jti, amr=["pwd","mfa"])
API-->>Client: 200 OK LoginResponse
```
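The verify-then-lazily-rehash step in the diagram above can be sketched as follows. This is a minimal illustration, not the project's `Security` class: the Python standard library has no Argon2id, so PBKDF2-HMAC-SHA256 stands in for the modern hash, and the legacy format (unsalted hex SHA-384) and the `pbkdf2$` prefix are assumptions made only for this example.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass

# Stand-in tag for the modern format. The real system stores Argon2id hashes;
# PBKDF2 is used here only because it ships with the standard library.
MODERN_PREFIX = "pbkdf2$"
ITERATIONS = 100_000  # hypothetical work factor


@dataclass(frozen=True)
class VerifyResult:
    ok: bool
    needs_rehash: bool


def hash_password(password: str) -> str:
    """Hash with the modern (stand-in) algorithm and a fresh random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return f"{MODERN_PREFIX}{salt.hex()}${digest.hex()}"


def verify_password(presented: str, stored: str) -> VerifyResult:
    """Verify against either format; flag legacy hashes for migration."""
    if stored.startswith(MODERN_PREFIX):
        _, salt_hex, digest_hex = stored.split("$")
        candidate = hashlib.pbkdf2_hmac(
            "sha256", presented.encode(), bytes.fromhex(salt_hex), ITERATIONS
        )
        # compare_digest avoids leaking the mismatch position through timing.
        return VerifyResult(hmac.compare_digest(candidate.hex(), digest_hex), False)
    # Assumed legacy format: unsalted hex SHA-384.
    legacy = hashlib.sha384(presented.encode()).hexdigest()
    ok = hmac.compare_digest(legacy, stored)
    # Only a successful legacy match should trigger a rehash.
    return VerifyResult(ok, needs_rehash=ok)
```

A caller would store `hash_password(presented)` whenever `needs_rehash` is true, which mirrors the `UPDATE password_hash (lazy migrate)` step.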
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Email not found | UserService.ValidateUser | No DB record | 409: NoEmailFound (code 10) |
| Wrong password | UserService.ValidateUser | Hash mismatch | 409: WrongPassword (code 30) |
| Per-IP limit exceeded | Rate-limiter middleware | sliding window | 429 + `Retry-After` |
| Account locked | UserService.ValidateUser | `now() < lockout_until` | 423 `AccountLocked` (code 50) + `Retry-After` |
| Per-account threshold | UserService.ValidateUser | failed-login count over window | 429 `LoginRateLimited` (code 51) + `Retry-After` |
| Email not found | UserService.ValidateUser | No DB record | 409 `NoEmailFound` (code 10) |
| Wrong password | UserService.ValidateUser | `VerifyPassword.Ok=false` | 409 `WrongPassword` (code 30) — also increments `failed_login_count` |
| User disabled | UserService.ValidateUser | `is_enabled=false` | 409 `UserDisabled` (code 38) |
| MFA token invalid | MfaService.ValidateMfaStepToken | bad signature / wrong audience / expired | 401 `InvalidMfaToken` (code 61) |
| MFA code wrong | MfaService.VerifyForLogin | TOTP and recovery both miss | 401 `InvalidMfaCode` (code 59) — `mfa_login_failed` audit row |
---
@@ -96,7 +173,7 @@ sequenceDiagram
API->>US: RegisterUser(request)
US->>DB: SELECT user WHERE email = ?
DB-->>US: null (no duplicate)
US->>US: Hash password (SHA-384)
US->>US: Hash password (Argon2id, AZ-536)
US->>DB: INSERT user (admin connection)
DB-->>US: OK
US-->>API: void
@@ -170,7 +247,9 @@ Admin operations: list users, change role, enable/disable, update queue offsets,
### Preconditions
- Caller has ApiAdmin role (for most operations)
All operations follow the same pattern: API endpoint → UserService method → DbFactory.RunAdmin → PostgreSQL UPDATE/DELETE. Cache is invalidated for affected user keys after writes (the `UpdateQueueOffsets` path is the only remaining cache-invalidation site post-AZ-197).
All operations follow the same pattern: API endpoint → UserService method → DbFactory.RunAdmin → PostgreSQL UPDATE/DELETE. Cache is invalidated for affected user keys after writes.
> **Cycle 2 cross-cut**: `PUT /users/{email}/disable` now also calls `SessionService.RevokeAllForUser` so disabling a user instantly cuts every active session. Verifiers pick this up via F15 within their poll cadence.
---
@@ -241,7 +320,7 @@ sequenceDiagram
## Flow F9: Device Auto-Provisioning *(AZ-196, 2026-05-13)*
### Description
ApiAdmin requests a fresh CompanionPC device user. The server allocates the next sequential serial (`azj-NNNN`), generates a 32-char hex password, persists the user with the SHA-384 hash, and returns the plaintext credentials exactly once. The provisioning script (out-of-tree) embeds the values into the device's `device.conf`.
ApiAdmin requests a fresh CompanionPC device user. The server allocates the next sequential serial (`azj-NNNN`), generates a 32-char hex password, persists the user with an Argon2id hash (cycle 2 — AZ-536), and returns the plaintext credentials exactly once. The provisioning script (out-of-tree) embeds the values into the device's `device.conf`.
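The serial and password allocation described here can be sketched in a few lines. The function names, the regular expression, and the email layout `azj-NNNN@domain` are illustrative assumptions drawn from the description, not the actual `UserService` code.

```python
import re
import secrets
from typing import Optional


def next_serial(last_email: Optional[str]) -> str:
    """Return the next azj-NNNN serial; start at 0 when no device exists yet."""
    if last_email is None:
        number = 0
    else:
        match = re.match(r"azj-(\d+)@", last_email)
        if match is None:
            raise ValueError(f"unexpected device email: {last_email!r}")
        number = int(match.group(1)) + 1
    return f"azj-{number:04d}"


def new_device_password() -> str:
    """16 random bytes rendered as 32 hex characters."""
    return secrets.token_hex(16)
```

For example, `next_serial("azj-0041@azaion.com")` yields `azj-0042`, and `next_serial(None)` yields `azj-0000`.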
### Preconditions
- Caller has ApiAdmin role (`apiAdminPolicy`)
@@ -262,7 +341,7 @@ sequenceDiagram
US->>US: nextNumber = parse(lastEmail.suffix) + 1 (or 0)
US->>US: serial = "azj-" + nextNumber.PadLeft(4)
US->>US: password = ToHex(RandomBytes(16)) // 32 hex chars
US->>DB: INSERT user {Email=serial@domain, PasswordHash=SHA384(password), Role=CompanionPC, IsEnabled=true} (admin conn)
US->>DB: INSERT user {Email=serial@domain, PasswordHash=Argon2id(password), Role=CompanionPC, IsEnabled=true} (admin conn)
DB-->>US: OK
US-->>API: RegisterDeviceResponse {Serial, Email, Password}
API-->>Admin: 200 OK {Serial, Email, Password}
@@ -288,3 +367,383 @@ Reasons:
2. The OTA delivery model is itself a leftover from the installer-shipping era; the target architecture (browser-only SaaS + fTPM-secured Jetsons) does not need it.
The `apiUploaderPolicy` definition was removed from `Program.cs`; the `RoleEnum.ResourceUploader` enum value remains as data (the seed `uploader@azaion.com` user still uses it for negative-auth tests) but is no longer wired to any endpoint.
---
## Flow F11: Refresh Token Rotation *(AZ-531, 2026-05-14)*
### Description
The client presents an opaque refresh token; the server validates it, rotates it (marks the old row as `revoked_reason='rotated'`), inserts a new row in the same `family_id`, and mints a new ES256 access token. Reuse of an already-rotated token revokes the entire family with `revoked_reason='reuse_detected'`; the revoked rows then surface to verifiers via F15.
### Preconditions
- Refresh token is well-formed and corresponds to a non-revoked, non-expired session row
- The session is within both the sliding window (`SessionConfig.RefreshSlidingHours`) and the absolute cap (`SessionConfig.RefreshAbsoluteHours` measured from `family_started_at`)
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant API as Admin API
participant RT as RefreshTokenService
participant US as UserService
participant Auth as AuthService
participant SS as SessionService
participant DB as PostgreSQL
Client->>API: POST /token/refresh {refreshToken}
API->>RT: Rotate(opaqueToken)
RT->>DB: SELECT * FROM sessions WHERE refresh_hash = SHA256(token)
alt row missing
RT-->>API: 401 InvalidRefreshToken
end
alt row.revoked_reason = 'rotated' (reuse!)
RT->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='reuse_detected' WHERE family_id = row.family_id AND revoked_at IS NULL
RT-->>API: 401 InvalidRefreshToken
end
alt row.revoked_at IS NOT NULL OR row.expires_at <= now
RT-->>API: 401 InvalidRefreshToken
end
RT->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='rotated', last_used_at=now WHERE id = row.id
RT->>DB: INSERT INTO sessions (new id, family_id=row.family_id, refresh_hash=SHA256(newToken), parent_session_id=row.id, expires_at=now+sliding, mfa_authenticated=row.mfa_authenticated)
RT-->>API: (newOpaqueToken, newSession)
API->>US: GetById(newSession.UserId)
US-->>API: User
API->>Auth: CreateToken(user, sessionId=newSession.Id, jti=new, amr= ['pwd','mfa'] if mfaAuthenticated else ['pwd'])
Auth-->>API: AccessToken
opt user.Role == CompanionPC
API->>SS: RevokeMissionsForAircraft(user.Id)
end
API-->>Client: 200 OK LoginResponse {AccessToken, AccessExp, RefreshToken=newOpaqueToken, RefreshExp=newSession.ExpiresAt}
```
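The rotation and reuse-detection branches above can be modelled with an in-memory store. This is a sketch under stated assumptions: `RefreshStore`, the `SLIDING` value, and the dictionary keyed by hash are hypothetical stand-ins for `RefreshTokenService` and the `sessions` table, and the absolute cap on `family_started_at` is left out for brevity.

```python
import hashlib
import secrets
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple

SLIDING = timedelta(hours=12)  # hypothetical sliding window


class InvalidRefreshToken(Exception):
    pass


@dataclass
class Session:
    id: str
    family_id: str
    refresh_hash: str
    expires_at: datetime
    revoked_at: Optional[datetime] = None
    revoked_reason: Optional[str] = None


def _hash(token: str) -> str:
    # Only the SHA-256 of the opaque token is stored, never the token itself.
    return hashlib.sha256(token.encode()).hexdigest()


class RefreshStore:
    def __init__(self) -> None:
        self.by_hash: Dict[str, Session] = {}

    def issue(self, now: datetime, family_id: Optional[str] = None) -> Tuple[str, Session]:
        token = secrets.token_urlsafe(32)
        sid = str(uuid.uuid4())
        # A first login starts a new family whose id equals the session id.
        session = Session(sid, family_id or sid, _hash(token), now + SLIDING)
        self.by_hash[session.refresh_hash] = session
        return token, session

    def rotate(self, token: str, now: datetime) -> Tuple[str, Session]:
        row = self.by_hash.get(_hash(token))
        if row is None:
            raise InvalidRefreshToken("unknown token")
        if row.revoked_reason == "rotated":
            # Reuse of a rotated token: revoke every live row in the family.
            for s in self.by_hash.values():
                if s.family_id == row.family_id and s.revoked_at is None:
                    s.revoked_at, s.revoked_reason = now, "reuse_detected"
            raise InvalidRefreshToken("reuse detected")
        if row.revoked_at is not None or row.expires_at <= now:
            raise InvalidRefreshToken("revoked or expired")
        row.revoked_at, row.revoked_reason = now, "rotated"
        return self.issue(now, family_id=row.family_id)
```

Presenting the first token a second time invalidates the token that replaced it as well, which is the property that makes stolen refresh tokens detectable.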
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Token missing / not in DB | RefreshTokenService.Rotate | `SHA256(token)` not found | 401 `InvalidRefreshToken` |
| Reuse detected | RefreshTokenService.Rotate | row already `revoked_reason='rotated'` | 401 `InvalidRefreshToken` + entire family revoked (visible via F15) |
| Sliding window expired | RefreshTokenService.Rotate | `expires_at <= now()` | 401 `InvalidRefreshToken` |
| Absolute cap exceeded | RefreshTokenService.Rotate | `now() - family_started_at > RefreshAbsoluteHours` | 401 `InvalidRefreshToken` |
| User missing (race with deletion) | API | `UserService.GetById` returns null | 401 `InvalidRefreshToken` |
---
## Flow F12: Logout / Revocation *(AZ-535, 2026-05-14)*
### Description
Three endpoints share `SessionService.RevokeBySid` / `RevokeAllForUser`:
- `POST /logout` — revoke caller's current `sid` (idempotent; returns `{ alreadyRevoked }`)
- `POST /logout/all` — revoke every active session for the caller's user
- `POST /sessions/{sid}/revoke` *(ApiAdmin)* — admin revoke-by-sid
All revocations write `revoked_at`, `revoked_reason`, and `revoked_by_user_id`; the rows surface to verifiers via F15 within the next poll window.
### Preconditions
- `/logout` / `/logout/all` — caller is authenticated; the access token's `sid` claim is well-formed
- `/sessions/{sid}/revoke` — caller is `ApiAdmin`
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant API as Admin API
participant SS as SessionService
participant DB as PostgreSQL
Note over Client,API: Self logout
Client->>API: POST /logout (Bearer access)
API->>API: ParseSidClaim(user) -> sid
API->>API: ParseUserIdClaim(user) -> caller
API->>SS: RevokeBySid(sid, caller, 'logged_out')
SS->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='logged_out', revoked_by_user_id=caller WHERE id=sid AND revoked_at IS NULL
SS-->>API: alreadyRevoked: bool
API-->>Client: 200 OK { alreadyRevoked }
Note over Client,API: Logout-all
Client->>API: POST /logout/all
API->>SS: RevokeAllForUser(caller, caller, 'logged_out_all')
SS->>DB: UPDATE ... WHERE user_id=caller AND revoked_at IS NULL
SS-->>API: int (rows revoked)
API-->>Client: 200 OK { revoked }
Note over Client,API: Admin revoke-by-sid
Client->>API: POST /sessions/{sid}/revoke (ApiAdmin)
API->>SS: RevokeBySid(sid, admin, 'admin_revoked')
SS->>DB: UPDATE ... WHERE id=sid AND revoked_at IS NULL
SS-->>API: alreadyRevoked: bool
API-->>Client: 200 OK { alreadyRevoked }
```
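The idempotent `alreadyRevoked` result follows directly from the `WHERE ... revoked_at IS NULL` guard. The sketch below shows the idea with SQLite standing in for PostgreSQL; the trimmed table, `make_db`, and the use of `LookupError` for the not-found case are assumptions made for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical, trimmed-down sessions table for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "id TEXT PRIMARY KEY, user_id TEXT NOT NULL, "
        "revoked_at TEXT, revoked_reason TEXT, revoked_by_user_id TEXT)"
    )
    return conn


def revoke_by_sid(conn: sqlite3.Connection, sid: str, caller: str,
                  reason: str, now: str) -> bool:
    """Revoke one session; return True when it was already revoked."""
    cur = conn.execute(
        "UPDATE sessions SET revoked_at = ?, revoked_reason = ?, "
        "revoked_by_user_id = ? WHERE id = ? AND revoked_at IS NULL",
        (now, reason, caller, sid),
    )
    if cur.rowcount == 1:
        return False  # this call performed the revocation
    # Zero rows updated: either the row is already revoked or it does not exist.
    exists = conn.execute("SELECT 1 FROM sessions WHERE id = ?", (sid,)).fetchone()
    if exists is None:
        raise LookupError("SessionNotFound")
    return True
```

Because the guard leaves an already-revoked row untouched, the original `revoked_at` and `revoked_reason` are preserved on repeated calls.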
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Missing/malformed `sid` claim | ParseSidClaim | not a Guid | 401 `InvalidRefreshToken` |
| Sid not in DB (admin path) | SessionService.RevokeBySid | row not found | 404 `SessionNotFound` |
| Already revoked | SessionService.RevokeBySid | UPDATE affected 0 rows | 200 OK with `alreadyRevoked: true` (idempotent) |
---
## Flow F13: Mission Token Issuance *(AZ-533, 2026-05-14)*
### Description
A pilot (an authenticated interactive user) requests a long-lived no-refresh access token bound to one aircraft and one mission. Before signing the token, the server revokes any still-active mission sessions for that aircraft (`revoked_reason='aircraft_reconnected'`) and inserts a new `class='mission'` session row so that the token's `sid` is bound to it.
### Preconditions
- Caller is authenticated (interactive token; AMR can be `["pwd"]` or `["pwd","mfa"]` — F1 follow-up tightens this to require `mfa` once policy is set)
- `request.aircraftId` resolves to an existing user with `Role = CompanionPC`
- `request.missionId` matches the validation pattern; `request.plannedDurationH` is within bounds
### Sequence Diagram
```mermaid
sequenceDiagram
participant Pilot
participant API as Admin API
participant MTS as MissionTokenService
participant SS as SessionService
participant US as UserService
participant Auth as AuthService
participant DB as PostgreSQL
Pilot->>API: POST /sessions/mission {aircraftId, missionId, plannedDurationH, region}
API->>MTS: Issue(pilotId, request)
MTS->>US: GetById(aircraftId) (read conn)
alt aircraft missing or wrong role
MTS-->>API: 400 AircraftNotFound
end
MTS->>SS: RevokeMissionsForAircraft(aircraftId) // AC-4
SS->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='aircraft_reconnected' WHERE aircraft_id=? AND class='mission' AND revoked_at IS NULL
MTS->>DB: INSERT INTO sessions (id, user_id=aircraftId, class='mission', aircraft_id=aircraftId, refresh_hash=NULL, expires_at=now + plannedDurationH)
MTS->>Auth: CreateToken(aircraftUser, sessionId=newSid, jti, amr=['pwd','mission'])
Auth-->>MTS: AccessToken
MTS-->>API: MissionSessionResponse {access_token, expires_at, mission_id, aircraft_id}
API-->>Pilot: 200 OK
```
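The revoke-then-insert ordering in the diagram can be illustrated with a small in-memory model. `MissionSession` and `issue_mission_session` are hypothetical names; the real flow also validates the request and signs a token, which this sketch omits.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class MissionSession:
    id: str
    aircraft_id: str
    expires_at: datetime
    revoked_at: Optional[datetime] = None
    revoked_reason: Optional[str] = None


def issue_mission_session(sessions: List[MissionSession], aircraft_id: str,
                          planned_duration_h: float, now: datetime) -> MissionSession:
    # Step 1: revoke every still-active mission session for this aircraft.
    for s in sessions:
        if s.aircraft_id == aircraft_id and s.revoked_at is None:
            s.revoked_at = now
            s.revoked_reason = "aircraft_reconnected"
    # Step 2: insert the new row; its id becomes the token's sid claim.
    new = MissionSession(
        id=str(uuid.uuid4()),
        aircraft_id=aircraft_id,
        expires_at=now + timedelta(hours=planned_duration_h),
    )
    sessions.append(new)
    return new
```

At most one mission session per aircraft is therefore live at any time, and sessions for other aircraft are unaffected.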
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Validation failure | FluentValidation / MissionTokenService | bad `mission_id` pattern, `plannedDurationH` out of bounds | 400 `InvalidMissionRequest` (code 54) |
| Aircraft not a CompanionPC | MissionTokenService.Issue | role mismatch | 400 `AircraftNotFound` (code 55) |
---
## Flow F14: MFA Enrollment & Confirmation *(AZ-534, 2026-05-14)*
### Description
Three-step user-initiated lifecycle:
1. **Enroll** — server generates a new TOTP secret, encrypts it via `IDataProtector` (purpose `Azaion.Mfa.Secret`), persists with `mfa_enabled=false`, returns base32 secret + otpauth URL + QR PNG bytes.
2. **Confirm** — client submits a TOTP code; on success server flips `mfa_enabled=true`, generates 10 single-use Argon2id-hashed recovery codes, and returns them once.
3. **Disable** — requires both password + a current TOTP; server clears all MFA columns.
### Preconditions
- Caller is authenticated
- For Confirm: a prior Enroll call left the encrypted secret on the user
- For Disable: `mfa_enabled = true`
### Sequence Diagram
```mermaid
sequenceDiagram
participant User
participant API as Admin API
participant Mfa as MfaService
participant DP as IDataProtector
participant Sec as Security (Argon2id)
participant AL as AuditLog
participant DB as PostgreSQL
Note over User,API: ENROLL
User->>API: POST /users/me/mfa/enroll {password}
API->>Mfa: Enroll(userId, password)
Mfa->>DB: SELECT user
Mfa->>Sec: VerifyPassword(presented, stored)
Mfa->>Mfa: Generate 20-byte secret, base32 encode
Mfa->>DP: Protect(base32) -> encrypted base64
Mfa->>DB: UPDATE users SET mfa_secret = encrypted, mfa_enrolled_at = now, mfa_enabled=false
Mfa->>AL: RecordMfaEnroll
Mfa-->>API: MfaEnrollResponse { secret_base32, otpauth_url, qr_png }
API-->>User: 200 OK
Note over User,API: CONFIRM
User->>API: POST /users/me/mfa/confirm {code}
API->>Mfa: Confirm(userId, code)
Mfa->>DP: Unprotect(stored) -> base32 secret
Mfa->>Mfa: TOTP verify
alt code wrong
Mfa-->>API: 401 InvalidMfaCode
end
Mfa->>Mfa: Generate 10 recovery codes
Mfa->>Sec: HashPassword each (Argon2id)
Mfa->>DB: UPDATE users SET mfa_enabled=true, mfa_recovery_codes = jsonb([{ hash, used_at=null } x10]), mfa_last_used_window=current_step
Mfa->>AL: RecordMfaConfirm
Mfa-->>API: { recovery_codes: [...] }
API-->>User: 200 OK { mfaEnabled: true, recovery_codes }
Note over User,API: DISABLE
User->>API: POST /users/me/mfa/disable {password, code}
API->>Mfa: Disable(userId, password, code)
Mfa->>Sec: VerifyPassword
Mfa->>Mfa: TOTP verify
Mfa->>DB: UPDATE users SET mfa_enabled=false, mfa_secret=NULL, mfa_recovery_codes=NULL, mfa_enrolled_at=NULL, mfa_last_used_window=NULL
Mfa->>AL: RecordMfaDisable
Mfa-->>API: ok
API-->>User: 200 OK { mfaEnabled: false }
```
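The "TOTP verify" steps above follow RFC 6238 (HMAC-SHA1 over a 30-second counter). A minimal sketch using only the standard library is shown below; the ±1 step drift tolerance, the function names, and the way `last_used_step` models `mfa_last_used_window` are assumptions, not confirmed details of `MfaService`.

```python
import base64
import hashlib
import hmac
import struct
from typing import Optional


def totp(secret_b32: str, step: int, digits: int = 6) -> str:
    """Compute the code for one time step (RFC 4226 dynamic truncation)."""
    key = base64.b32decode(secret_b32)
    mac = hmac.new(key, struct.pack(">Q", step), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_totp(secret_b32: str, code: str, now_unix: int,
                last_used_step: Optional[int] = None,
                period: int = 30, drift: int = 1) -> Optional[int]:
    """Return the matched step, or None. Steps at or before last_used_step
    are skipped so a code cannot be replayed."""
    current = now_unix // period
    for step in range(current - drift, current + drift + 1):
        if last_used_step is not None and step <= last_used_step:
            continue
        if hmac.compare_digest(totp(secret_b32, step), code):
            return step
    return None
```

On success the caller would persist the returned step as the new last-used window, so the same code fails if presented again.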
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Already enrolled (Enroll) | MfaService.Enroll | `mfa_enabled=true` | 409 `MfaAlreadyEnabled` (code 56) |
| Not enrolling (Confirm) | MfaService.Confirm | `mfa_secret IS NULL` | 409 `MfaNotEnrolling` (code 57) |
| Not enabled (Disable) | MfaService.Disable | `mfa_enabled=false` | 409 `MfaNotEnabled` (code 58) |
| Wrong password | Sec.VerifyPassword | hash mismatch | 409 `WrongPassword` (code 30) |
| Wrong TOTP code | MfaService TOTP path | code/window miss | 401 `InvalidMfaCode` (code 59) |
---
## Flow F15: Verifier Revocation Snapshot *(AZ-535, 2026-05-14)*
### Description
A `Service`-role identity (verifier fleet) polls `GET /sessions/revoked?since={iso8601}` periodically. The server returns every session whose `revoked_at >= since` and `expires_at > now()` so verifiers can deny tokens whose `sid` appears in the snapshot.
The `since` parameter is **clamped to a 12-hour floor** server-side, so a buggy verifier asking for "everything since 1970" cannot trigger a multi-million-row table scan. Verifiers should tolerate clock skew by stepping `since` back ~30s on each poll.
### Preconditions
- Caller has role `Service` or `ApiAdmin` (`revocationReaderPolicy`)
### Sequence Diagram
```mermaid
sequenceDiagram
participant Verifier
participant API as Admin API
participant SS as SessionService
participant DB as PostgreSQL
Verifier->>API: GET /sessions/revoked?since=2026-05-14T05:30:00Z
API->>API: clamp since to max(now-12h, since)
API->>SS: GetRevokedSince(effectiveSince)
SS->>DB: SELECT id, expires_at, revoked_at, revoked_reason FROM sessions WHERE revoked_at >= ? AND expires_at > now() ORDER BY revoked_at
DB-->>SS: rows (uses sessions_revoked_at_idx)
SS-->>API: IReadOnlyList<RevokedSession>
API-->>Verifier: 200 OK [{ sid, exp, revokedAt, reason }, ...] + Cache-Control: no-cache
```
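The clamp step amounts to taking the later of the requested `since` and `now - 12h`, with a missing value falling back to the floor. A minimal sketch, with `effective_since` and `MAX_LOOKBACK` as illustrative names:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_LOOKBACK = timedelta(hours=12)


def effective_since(since: Optional[datetime], now: datetime) -> datetime:
    """Never look back further than MAX_LOOKBACK, whatever the caller asks."""
    floor = now - MAX_LOOKBACK
    if since is None:
        return floor
    return max(since, floor)
```

A request for 1970 is therefore served as if it had asked for the last 12 hours, while a recent `since` passes through unchanged.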
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Wrong role | API authorization | not Service/ApiAdmin | 403 Forbidden |
| `since` missing | API | bind null `DateTime?` | clamp falls back to `now-12h` |
---
## Flow F16: Account Lockout & Per-IP Rate Limit *(AZ-537, 2026-05-14)*
### Description
Cross-cuts F1 and F11. Two layers:
1. **Per-IP** — ASP.NET Core `RateLimiter` middleware (`SlidingWindowRateLimiter`) attached to `/login` and `/login/mfa` via the `login-per-ip` policy. Rejection sets `429` and stamps `Retry-After` from the lease metadata.
2. **Per-account + lockout** — DB-backed in `UserService.ValidateUser`:
- Read `failed_login_count` and `lockout_until` from `users`.
- If `now() < lockout_until` → throw `BusinessException(AccountLocked, RetryAfterSeconds = LockoutUntil - now)`.
- Else: count `audit_events` rows where `event_type='login_failed' AND email=? AND occurred_at >= now - PerAccountWindowSeconds`. If over threshold → throw `BusinessException(LoginRateLimited, RetryAfterSeconds = PerAccountWindowSeconds)`.
- On wrong password: `RecordLoginFailed` + UPDATE `failed_login_count = failed_login_count + 1`. If new count >= `ConsecutiveFailureThreshold` → set `lockout_until = now + LockoutSeconds`, `RecordLoginLockout`, throw `AccountLocked`.
- On success: `RecordLoginSuccess` + UPDATE `failed_login_count = 0`, `lockout_until = NULL`.
### Preconditions
- `AuthConfig.RateLimit.*` and `AuthConfig.Lockout.*` are non-zero
- `audit_events` table exists
### Sequence Diagram
```mermaid
sequenceDiagram
participant Client
participant Mid as RateLimiter middleware
participant API as Admin API
participant US as UserService
participant AL as AuditLog
participant DB as PostgreSQL
Client->>Mid: POST /login {email, password}
Mid->>Mid: SlidingWindow per-IP check
alt no permits
Mid-->>Client: 429 + Retry-After
end
Mid->>API: forward
API->>US: ValidateUser
US->>DB: SELECT users (read)
US->>AL: CountRecentFailedLogins(email, window)
alt account locked OR threshold exceeded
US->>AL: RecordLoginFailed (or RecordLoginLockout if newly locked)
US-->>API: BusinessException(AccountLocked / LoginRateLimited, RetryAfterSeconds)
API-->>Client: 423 / 429 + Retry-After
end
US->>US: VerifyPassword
alt wrong password
US->>AL: RecordLoginFailed
US->>DB: UPDATE failed_login_count++; lockout_until = now + LockoutSeconds (if newly over)
US-->>API: BusinessException(WrongPassword)
API-->>Client: 409
end
US->>AL: RecordLoginSuccess
US->>DB: UPDATE failed_login_count = 0, lockout_until = NULL, last_login = now
US-->>API: User
```
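The consecutive-failure lockout can be read as a small state machine over `failed_login_count` and `lockout_until`. The sketch below covers only that layer; the audit-window count and the per-IP limiter are omitted, and the threshold and lockout duration are hypothetical defaults rather than the configured `AuthConfig` values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Account:
    failed_login_count: int = 0
    lockout_until: Optional[datetime] = None


def on_login_attempt(acct: Account, password_ok: bool, now: datetime,
                     threshold: int = 5,
                     lockout: timedelta = timedelta(minutes=15)) -> str:
    # An active lockout rejects the attempt before the password is considered.
    if acct.lockout_until is not None and now < acct.lockout_until:
        return "AccountLocked"
    if not password_ok:
        acct.failed_login_count += 1
        if acct.failed_login_count >= threshold:
            acct.lockout_until = now + lockout
            return "AccountLocked"
        return "WrongPassword"
    # Success resets both columns.
    acct.failed_login_count = 0
    acct.lockout_until = None
    return "Ok"
```

Note that a correct password presented during the lockout window is still rejected, which matches the ordering of checks in the diagram.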
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| Per-IP limit | RateLimiter middleware | sliding window | 429 + `Retry-After` |
| Account locked | UserService.ValidateUser | `now < lockout_until` | 423 `AccountLocked` + `Retry-After` |
| Per-account threshold | UserService.ValidateUser | `audit_events` count over window | 429 `LoginRateLimited` + `Retry-After` |
---
## Flow F17: JWKS Publication *(AZ-532, 2026-05-14)*
### Description
`GET /.well-known/jwks.json` (anonymous) returns the JSON Web Key Set containing one entry per loaded ES256 key. Verifiers may cache the response for 1 hour (`Cache-Control: public, max-age=3600`).
### Preconditions
- `JwtConfig.KeysFolder` exists with at least one well-formed P-256 PEM
- `JwtConfig.ActiveKid` matches one of the loaded files (the others are still served, allowing verifiers to validate already-issued tokens during a key rotation)
### Sequence Diagram
```mermaid
sequenceDiagram
participant Verifier
participant API as Admin API
participant JKP as JwtSigningKeyProvider
participant FS as Filesystem
Note over JKP,FS: At app startup
API->>JKP: ctor (eager)
JKP->>FS: scan KeysFolder/*.pem
JKP->>JKP: validate P-256 curve, build EcdsaSecurityKey list
JKP-->>API: ready (or fail-fast if 0 keys)
Note over Verifier,API: Per-poll
Verifier->>API: GET /.well-known/jwks.json
API->>JKP: All
JKP-->>API: list of JwtSigningKey
API->>API: project to JWK { kty:EC, crv:P-256, kid, use:sig, alg:ES256, x, y }
API-->>Verifier: 200 OK { keys: [...] } + Cache-Control: public, max-age=3600
```
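The projection step maps each public key to a JWK whose `x` and `y` members are the curve coordinates, base64url-encoded without padding at the full 32-byte width for P-256. The sketch below takes the coordinates as plain integers to stay within the standard library; `to_jwk` and `b64url_uint` are illustrative names, and extracting the coordinates from a PEM is out of scope here.

```python
import base64
from typing import Dict


def b64url_uint(value: int, length: int = 32) -> str:
    """Fixed-width big-endian bytes, base64url without '=' padding."""
    raw = value.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def to_jwk(kid: str, x: int, y: int) -> Dict[str, str]:
    """Project one P-256 public key to the JWK shape served by the endpoint."""
    return {
        "kty": "EC",
        "crv": "P-256",
        "kid": kid,
        "use": "sig",
        "alg": "ES256",
        "x": b64url_uint(x),
        "y": b64url_uint(y),
    }
```

Keeping the fixed width matters: a coordinate with leading zero bytes must still encode to 43 characters, otherwise strict verifiers reject the key.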
### Error Scenarios
| Error | Where | Detection | Recovery |
|-------|-------|-----------|----------|
| No keys / malformed PEM | JwtSigningKeyProvider ctor | startup crash (intentional) | Operator fix + restart |
| Wrong curve in PEM | JwtSigningKeyProvider ctor | startup crash | Operator fix + restart |
> **Rotation procedure**: drop a new PEM into `KeysFolder`, set `JwtConfig:ActiveKid` to the new kid, restart. Already-issued tokens remain verifiable until their `exp`. Old PEMs are physically removed only after the longest possible token TTL has elapsed.