From a77b3f8a59524111679428bfa4967c710cbbc84f Mon Sep 17 00:00:00 2001
From: Oleksandr Bezdieniezhnykh
Date: Thu, 14 May 2026 09:22:53 +0300
Subject: [PATCH] [AZ-529] [AZ-530] Cycle-2 documentation refresh
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Refreshes _docs/02_document/ to reflect the cycle-2 auth-modernization +
CMMC hardening landings (AZ-531..AZ-538). Authoritative source for the
ripple set is ripple_log_cycle2.md.

Covered:
- architecture.md (section 1 rewritten, ADRs 6-9 added)
- data_model.md (sessions, audit_events, user columns, migrations)
- system-flows.md (F1 rewritten; F11-F17 added; F2/F7/F9 minor)
- module-layout.md (cycle-2 sub-component table)
- diagrams/flows/flow_login.md (dual-token + MFA)
- components/{01_data_layer,03_auth_and_security,05_admin_api}
- modules/ (12 new, 8 modified — full Argon2id/ES256/MFA/refresh
  /mission/session/audit/jwks rollup)
- tests/{blackbox,security,traceability-matrix}

Step 13 (Update Docs) output for cycle 2.

Co-authored-by: Cursor
---
 _docs/02_document/architecture.md | 184 +++--
 .../components/01_data_layer/description.md | 136 ++--
 .../03_auth_and_security/description.md | 173 +++--
 .../components/05_admin_api/description.md | 202 ++++--
 _docs/02_document/data_model.md | 236 +++++--
 .../02_document/diagrams/flows/flow_login.md | 92 ++-
 _docs/02_document/module-layout.md | 10 +-
 .../02_document/modules/admin_api_program.md | 141 ++--
 .../modules/common_business_exception.md | 55 +-
 .../modules/common_configs_auth_config.md | 58 ++
 .../modules/common_configs_jwt_config.md | 60 +-
 .../modules/common_database_azaion_db.md | 26 +-
 .../modules/common_database_schema_holder.md | 25 +-
 .../modules/common_entities_audit_event.md | 73 ++
 .../modules/common_entities_role_enum.md | 15 +-
 .../modules/common_entities_session.md | 85 +++
 .../modules/common_entities_user.md | 43 +-
 .../modules/common_requests_login_request.md | 6 +-
 .../modules/common_requests_login_response.md | 53 ++
 .../modules/common_requests_mfa_requests.md | 83 +++
 ...common_requests_mission_session_request.md | 59 ++
 .../02_document/modules/services_audit_log.md | 60 ++
 .../modules/services_auth_service.md | 67 +-
 .../services_jwt_signing_key_provider.md | 75 ++
 .../modules/services_mfa_service.md | 73 ++
 .../modules/services_mission_token_service.md | 69 ++
 .../modules/services_refresh_token_service.md | 63 ++
 .../02_document/modules/services_security.md | 62 +-
 .../modules/services_session_service.md | 65 ++
 .../modules/services_user_service.md | 85 ++-
 _docs/02_document/ripple_log_cycle2.md | 66 ++
 _docs/02_document/system-flows.md | 527 +++++++++++++-
 _docs/02_document/tests/blackbox-tests.md | 662 ++++++++++++++++++
 _docs/02_document/tests/security-tests.md | 307 ++++++++
 .../02_document/tests/traceability-matrix.md | 96 +++
 35 files changed, 3624 insertions(+), 468 deletions(-)
 create mode 100644 _docs/02_document/modules/common_configs_auth_config.md
 create mode 100644 _docs/02_document/modules/common_entities_audit_event.md
 create mode 100644 _docs/02_document/modules/common_entities_session.md
 create mode 100644 _docs/02_document/modules/common_requests_login_response.md
 create mode 100644 _docs/02_document/modules/common_requests_mfa_requests.md
 create mode 100644 _docs/02_document/modules/common_requests_mission_session_request.md
 create mode 100644 _docs/02_document/modules/services_audit_log.md
 create mode 100644 _docs/02_document/modules/services_jwt_signing_key_provider.md
 create mode 100644 _docs/02_document/modules/services_mfa_service.md
 create mode 100644 _docs/02_document/modules/services_mission_token_service.md
 create mode 100644 _docs/02_document/modules/services_refresh_token_service.md
 create mode 100644 _docs/02_document/modules/services_session_service.md
 create mode 100644 _docs/02_document/ripple_log_cycle2.md

diff --git a/_docs/02_document/architecture.md b/_docs/02_document/architecture.md
index 98a9923..962317e 100644
--- a/_docs/02_document/architecture.md
+++ b/_docs/02_document/architecture.md
@@ -2,24 +2,36 @@
 ## 1. System Context
-**Problem being solved**: Azaion Suite requires a centralized admin API to manage users, assign roles, and securely distribute encrypted software resources (DLLs, AI models, installers) to authorized devices and SaaS users.
+**Problem being solved**: Azaion Suite requires a centralized admin API to manage users + roles, authenticate humans (with optional second factor), authenticate UAVs for offline missions, and broker token revocation across a fleet of verifier services.
 **System boundaries**:
-- **Inside**: User management, authentication (JWT), role-based authorization, file-based resource storage (upload / list / clear).
-- **Outside**: Client applications (admin web panel at admin.azaion.com, fTPM-secured Jetson edge devices), PostgreSQL database, server filesystem for resource storage.
+- **Inside**: user management, password hashing (Argon2id), authentication (ES256 JWT + opaque refresh tokens with rotation + reuse detection), TOTP MFA, mission-token issuance, session revocation + verifier-poll snapshot, account lockout + per-IP and per-account rate limiting, JWKS publication, role-based authorization, file-based resource storage (upload / list / clear), HSTS + HTTPS redirect. +- **Outside**: admin web panel (`admin.azaion.com`), fTPM-secured Jetson edge devices (CompanionPC), verifier fleet (satellite-provider, gps-denied, ui — service-role identities), PostgreSQL, server filesystem. -> **Note (AZ-197, 2026-05-13)**: hardware-fingerprint binding (`User.Hardware`, `CheckHardwareHash`, `PUT /users/hardware/set`, `POST /resources/check`, `HardwareIdMismatch`/`BadHardware` error codes) was removed. Edge devices now ship as fTPM-secured Jetsons; server/desktop access is SaaS-only. The `User.Hardware` DB column remains as a nullable tombstone (no migration in AZ-197). - -> **Note (cycle 2, 2026-05-14)**: the encrypted resource download (`POST /resources/get/{dataFolder?}`) and both installer endpoints (`GET /resources/get-installer`, `GET /resources/get-installer/stage`) were removed as obsolete. Their orphaned support code went with them: `ResourcesService.GetEncryptedResource` / `GetInstaller`, `Security.GetApiEncryptionKey` / `EncryptTo` / `DecryptTo`, the `GetResourceRequest` DTO (+ `WrongResourceName` error code 50, gap kept), and the `ResourcesConfig.SuiteInstallerFolder` / `SuiteStageInstallerFolder` properties + their env var rows in every config artifact. The `Azaion.Test` unit-test project became empty and was removed from the solution. Per-user file encryption is no longer part of the system; resource delivery is now upload + list + clear only. ADR-003 below is **retired** as a result. +> **Note (AZ-197, cycle 1)**: hardware-fingerprint binding removed. 
+> +> **Note (cycle 2 early)**: encrypted resource download + installer endpoints removed; ADR-003 retired. +> +> **Note (cycle 2 — Auth Modernization, 2026-05-14, AZ-531..AZ-538)**: the entire authentication layer was rebuilt: +> - **AZ-536** — Argon2id password hashing replaced SHA-384; lazy migration on login. +> - **AZ-531** — opaque refresh tokens with server-side rotation, family-based reuse detection, sliding + absolute lifetimes (`SessionConfig`). +> - **AZ-532** — symmetric HS256 → asymmetric ES256 with file-system key store + JWKS endpoint. +> - **AZ-534** — TOTP MFA (enroll/confirm/disable, recovery codes, two-step login, `IDataProtector`-encrypted secret, `amr` claim). +> - **AZ-535** — logout (single + all) + admin revoke + verifier-poll snapshot of revoked sessions; new `Service` role for verifier identities. +> - **AZ-533** — long-lived no-refresh mission tokens for UAV ops, with auto-revoke on aircraft reconnect. +> - **AZ-537** — DB-backed account lockout + per-account sliding-window rate limit + per-IP token-bucket via ASP.NET `RateLimiter`; `audit_events` table. +> - **AZ-538** — CORS narrowed to single HTTPS origin, HSTS enabled (non-Development), HTTPS redirection (non-Development). +> - New ADRs **ADR-006** through **ADR-009** below capture the per-decision context. 
**External systems**: | System | Integration Type | Direction | Purpose | |--------|-----------------|-----------|---------| -| PostgreSQL | Database (linq2db) | Both | User data persistence | -| Server filesystem | File I/O | Both | Resource file storage and retrieval | -| Azaion Suite client | REST API | Inbound | Resource download, login | -| Admin web panel (admin.azaion.com) | REST API | Inbound | User management, resource upload | +| PostgreSQL | Database (linq2db) | Both | User + session + audit_events persistence | +| Server filesystem | File I/O | Both | Resource files; ES256 PEM key store; DataProtection key store (when `DataProtection:KeysFolder` is set) | +| Admin web panel (admin.azaion.com) | REST API | Inbound | User management, login, MFA, refresh, resource upload | +| Verifier fleet (Service role) | REST API | Inbound | Polls `/sessions/revoked`, fetches `/.well-known/jwks.json` | +| CompanionPC (Jetson) edge devices | REST API | Inbound | Login + refresh; mission-token consumer | ## 2. 
Technology Stack @@ -30,11 +42,15 @@ | Database | PostgreSQL | (server-side) | Open-source, robust relational DB | | ORM | linq2db | 5.4.1 | Lightweight, LINQ-native, no migrations overhead | | Cache | LazyCache (in-memory) | 2.4.0 | Simple async caching for user lookups | -| Auth | JWT Bearer | 10.0.3 | Stateless token authentication | +| Auth | JWT Bearer (ES256) | 10.0.3 | Stateless token auth; cycle 2 — switched from HS256 to ES256 with JWKS (AZ-532) | +| Password hashing | Konscious.Security.Cryptography (Argon2id) | (cycle 2 add) | Replaces SHA-384 (AZ-536) | +| MFA | OtpNet (TOTP) + QRCoder (PNG) | (cycle 2 add) | TOTP + recovery codes (AZ-534) | +| Rate limiting | Microsoft.AspNetCore.RateLimiting | 10.0 | Per-IP sliding window (AZ-537) | +| Data protection | Microsoft.AspNetCore.DataProtection | 10.0 | Encrypt MFA secret at rest (AZ-534) | | Validation | FluentValidation | 11.3.0 / 11.10.0 | Declarative request validation | | Logging | Serilog | 4.1.0 | Structured logging (console + file) | | API Docs | Swashbuckle (Swagger) | 10.1.4 | OpenAPI specification | -| Serialization | Newtonsoft.Json | 13.0.1 | JSON for DB field mapping and responses | +| Serialization | Newtonsoft.Json | 13.0.4 | JSON for DB field mapping and responses (bumped from 13.0.1 by audit D-1) | | Container | Docker | .NET 10.0 images | Multi-stage build, ARM64 support | | CI/CD | Woodpecker CI | — | Branch-based ARM64 builds | | Registry | docker.azaion.com | — | Private container registry | @@ -56,7 +72,11 @@ | Secrets | Environment variables (`ASPNETCORE_*`) | Environment variables | | Logging | Console + file | Console + rolling file (`logs/log.txt`) | | Swagger | Enabled | Disabled | -| CORS | Same as prod | `admin.azaion.com` | +| CORS | (same policy registered, allows `https://admin.azaion.com`) | `https://admin.azaion.com` only | +| HSTS | **Disabled** (Development bypass) | **Enabled** (1 y, includeSubDomains, preload) | +| HTTPS redirect | **Disabled** (Development bypass) | 
**Enabled** | +| ES256 keys | `JwtConfig.KeysFolder` — at least one PEM, `ActiveKid` selects | Same; persistent volume mandatory | +| DataProtection keys | Ephemeral OK (single-instance dev) | `DataProtection:KeysFolder` MUST be a persistent volume — otherwise MFA secrets are unrecoverable after restart | ## 4. Data Model Overview @@ -64,21 +84,25 @@ | Entity | Description | Owned By Component | |--------|-------------|--------------------| -| User | System user with email (UNIQUE-indexed via `users_email_uidx`), password hash, role, config (legacy `Hardware` column tombstoned per AZ-197). Subset of users have `Role = CompanionPC` and are auto-provisioned via `POST /devices` (AZ-196), which delegates the insert to `UserService.RegisterUser` (post-security-audit consolidation, finding F-3). | 01 Data Layer | -| UserConfig | JSON-serialized per-user configuration (queue offsets) | 01 Data Layer | -| RoleEnum | Authorization role hierarchy (None → ApiAdmin); `ResourceUploader` retained as data only after the OTA endpoints were retired | 01 Data Layer | -| DetectionClass *(AZ-513, cycle 1)* | Operator-managed detection-class catalogue (Name, ShortName, Color, MaxSizeM, PhotoMode?) backing the UI Detection Classes table | 01 Data Layer | -| ExceptionEnum | Business error code catalog (HW-related codes 40/45 removed by AZ-197) | Common Helpers | +| User | System user. Cycle 2 added `failed_login_count`, `lockout_until` (AZ-537) and `mfa_*` columns (AZ-534). `password_hash` is now Argon2id PHC; legacy SHA-384 base64 lazily upgraded on next login (AZ-536). | 01 Data Layer | +| Session *(AZ-531+535+533+534)* | One row per refresh token (interactive) or per mission token. Carries `family_id` (rotation chain), `revoked_at`/`revoked_reason`/`revoked_by_user_id`, `class` ∈ {`interactive`, `mission`}, `aircraft_id`, `mfa_authenticated`. 
| 01 Data Layer | +| AuditEvent *(AZ-537+534)* | Append-only `audit_events` row: login_failed/success/lockout, mfa_enroll/confirm/disable/login_success/login_failed/recovery_used. | 01 Data Layer | +| UserConfig | JSON-serialized per-user configuration (queue offsets). | 01 Data Layer | +| RoleEnum | Authorization role hierarchy. Cycle 2 added `Service = 60` for verifier identities (AZ-535). | 01 Data Layer | +| DetectionClass | Operator-managed catalogue. Unchanged in cycle 2. | 01 Data Layer | +| ExceptionEnum | Business error code catalog. Cycle 2 added codes 50–61 for the auth/MFA/refresh/mission/lockout paths. | Common Helpers | -> **Removed in cycle 1 / post-cycle-1**: the `Resource` entity, the `resources` table, and the OTA delivery flow (AZ-183 — F10) were reverted after the security audit (finding F-1). The data model no longer carries an OTA-artifact entity. - -**Key relationships**: -- User → RoleEnum: each user has exactly one role -- User → UserConfig: optional 1:1 JSON field containing queue offsets +**Key relationships** (cycle 2 additions): +- User 1 — N Session (`sessions.user_id` FK, ON DELETE CASCADE) +- User 1 — N Session (`sessions.aircraft_id` FK for mission rows, ON DELETE SET NULL) +- User 1 — N Session (`sessions.revoked_by_user_id` FK, ON DELETE SET NULL) +- Session 1 — N Session (`parent_session_id` rotation chain) **Data flow summary**: -- Client → API → UserService → PostgreSQL: user CRUD operations -- Client → API → ResourcesService → Filesystem: resource upload / list / clear (encrypted download + installer delivery were retired in cycle 2) +- Client → API → UserService → PostgreSQL: user CRUD + Argon2id verify/hash + lazy migration +- Client → API → RefreshTokenService / SessionService / MfaService / MissionTokenService → PostgreSQL `sessions` + `users` + `audit_events` +- Verifier → API → SessionService → PostgreSQL `sessions` (revoked-since snapshot) + JwtSigningKeyProvider (JWKS) +- Client → API → ResourcesService → Filesystem: 
resource upload / list / clear ## 5. Integration Points @@ -86,11 +110,13 @@ | From | To | Protocol | Pattern | Notes | |------|----|----------|---------|-------| -| Admin API | User Management | Direct DI call | Request-Response | Scoped service injection | -| Admin API | Auth & Security | Direct DI call | Request-Response | Scoped service injection | -| Admin API | Resource Management | Direct DI call | Request-Response | Scoped service injection | -| User Management | Data Layer | Direct DI call | Request-Response | Singleton DbFactory | -| Auth & Security | User Management | Direct DI call | Request-Response | IUserService.GetByEmail | +| Admin API | User Management | Direct DI call | Request-Response | Scoped | +| Admin API | AuthService | Direct DI call | Request-Response | Scoped — also reads `IJwtSigningKeyProvider` (singleton) | +| Admin API | RefreshTokenService / SessionService / MfaService / MissionTokenService / AuditLog | Direct DI call | Request-Response | Scoped | +| Admin API | Resource Management | Direct DI call | Request-Response | Scoped | +| User Management | AuditLog | Direct DI call | Request-Response | Failed/success/lockout audit + sliding-window count | +| MfaService | IDataProtector | Direct DI call | Request-Response | Encrypt/decrypt mfa_secret | +| All services | Data Layer | Direct DI call | Request-Response | Singleton DbFactory | ### External Integrations @@ -104,29 +130,40 @@ | Requirement | Target | Measurement | Priority | |------------|--------|-------------|----------| | Max upload size | 200 MB | Kestrel MaxRequestBodySize | High | -| Password hashing | SHA-384 | Per-user | Medium | +| Password hashing | Argon2id (parameters from `AuthConfig.PasswordHashing`) | Per-user, constant-time verify | High | +| Access token lifetime | `JwtConfig.AccessTokenLifetimeMinutes` (15 default) | Per token | High | +| Refresh token sliding lifetime | `SessionConfig.RefreshSlidingHours` | Per session row | High | +| Refresh token absolute 
lifetime | `SessionConfig.RefreshAbsoluteHours` | Per family | High | +| Mission token lifetime | `MissionSessionRequest.PlannedDurationH` (validation-bounded) | Per mission session | High | +| Per-IP login rate | `AuthConfig.RateLimit.PerIpPermitLimit` per `PerIpWindowSeconds` | Sliding window | High | +| Per-account login rate | `AuthConfig.RateLimit.PerAccountFailedThreshold` per `PerAccountWindowSeconds` | DB sliding window via `audit_events` | High | +| Account lockout | `AuthConfig.Lockout.ConsecutiveFailureThreshold` failures → `LockoutSeconds` lockout | DB-backed | High | +| HSTS | 1 y, includeSubDomains, preload (non-Development) | HTTP header | High | +| HTTPS redirect | Enabled (non-Development) | Middleware | High | | Cache TTL | 4 hours | User entity cache | Low | -> The "File encryption / AES-256-CBC" NFR was retired in cycle 2 along with the encrypted-download endpoint. See ADR-003. - No explicit availability, latency, throughput, or recovery targets found in the codebase. ## 7. Security Architecture -**Authentication**: JWT Bearer tokens (HMAC-SHA256 signed, validated for issuer/audience/lifetime/signing key). +**Authentication**: +- ES256 (ECDSA P-256) JWT bearer tokens (AZ-532). `ValidAlgorithms` pinned to `ES256` to prevent the HS256-with-public-key forgery class. +- Opaque refresh tokens with server-side rotation + reuse detection (AZ-531). Stored as SHA-256 hashes; never re-presented. +- TOTP MFA + recovery codes (AZ-534). Step-1 token is itself an ES256 JWT with a separate audience. +- Mission tokens (AZ-533) — long-lived, no refresh, bound to `aircraft_id`, auto-revoked on aircraft reconnect. 
**Authorization**: Role-based (RBAC) via ASP.NET Core authorization policies: -- `apiAdminPolicy` — requires `ApiAdmin` role +- `apiAdminPolicy` — requires `ApiAdmin` +- `revocationReaderPolicy` — requires `Service` OR `ApiAdmin` (verifier fleet) - General `[Authorize]` — any authenticated user -> The `apiUploaderPolicy` was added by AZ-183 and removed in the post-cycle-1 revert along with the OTA endpoints it guarded. `RoleEnum.ResourceUploader` remains as data only. - **Data protection**: -- At rest: resource files are stored as plain bytes on the server filesystem (per-user AES-256-CBC encryption was retired in cycle 2 — see ADR-003). -- In transit: HTTPS (assumed, not enforced in code) -- Secrets management: Environment variables (`ASPNETCORE_*` prefix) +- **At rest**: `mfa_secret` is encrypted via `IDataProtector` (purpose `Azaion.Mfa.Secret`). MFA recovery codes are individually Argon2id-hashed and single-use. Passwords are Argon2id PHC strings. ES256 PEM keys live in `JwtConfig.KeysFolder` — protect via filesystem permissions. +- **In transit**: HSTS + HTTPS redirection in non-Development environments (AZ-538). CORS narrowed to `https://admin.azaion.com` only. +- **Token revocation propagation**: `GET /sessions/revoked` provides a verifier-poll snapshot; verifiers are responsible for honoring it within their poll cadence (currently ~30s recommended). +- **Secrets management**: Environment variables (`ASPNETCORE_*` prefix). -**Audit logging**: No explicit audit trail. Serilog logs business exceptions (WARN) and general events (INFO). +**Audit logging**: `audit_events` table records login_success/failed/lockout and mfa_enroll/confirm/disable/login_success/login_failed/recovery_used events with normalised email + caller IP. Drives the per-account rate limit and provides forensic evidence. Serilog continues to log business exceptions (WARN) and general events (INFO). ## 8. 
Key Architectural Decisions @@ -174,3 +211,68 @@ The binding's only remaining effect was a real production failure mode (`Hardwar **Decision**: Use linq2db instead of Entity Framework Core. **Consequences**: No migration framework — schema managed via SQL scripts (`env/db/`). Lighter runtime footprint. Manual mapping configuration in `AzaionDbSchemaHolder`. + +### ADR-006: Asymmetric ES256 JWT signing with file-system key store + JWKS *(cycle 2 — AZ-532)* + +**Context**: Cycle-1 JWT signing was symmetric HS256 with the secret in environment configuration. The verifier fleet (satellite-provider, gps-denied, ui) needed to validate tokens without sharing the signing secret with every service. Sharing the HS256 secret would have made any verifier compromise also a token-forgery primitive. + +**Decision**: Switch to ES256 (ECDSA P-256). The Admin API holds the private key; verifiers fetch the public key set from `GET /.well-known/jwks.json`. Keys live as one PEM per kid in `JwtConfig.KeysFolder`. `JwtConfig.ActiveKid` selects the signer; ALL discovered keys are exposed in JWKS so existing tokens stay verifiable across rotations. + +**Alternatives rejected**: +- **Continue HS256 + share secret**: rejected — secret-distribution + verifier-compromise blast radius. +- **RS256**: equivalent security, larger keys, no operational benefit at our scale. +- **External KMS / HSM**: deferred — adds operational complexity (KMS auth, latency on every signing op) without near-term benefit. The PEM-on-disk approach is reversible to KMS later. + +**Consequences**: +- JwtBearer `ValidAlgorithms = [ES256]` is mandatory — without it, a token forged with `alg=HS256` using the public key as the HMAC secret would validate. +- The PEM directory MUST be a persistent volume. +- Key rotation is "drop a new PEM, set `ActiveKid`, restart" — the old kid keeps verifying tokens until physically removed. +- Verifiers MUST cache the JWKS for at most 1 hour to pick up new kids quickly. 
+ +### ADR-007: Refresh tokens as opaque rotating server-side rows (not JWT) *(cycle 2 — AZ-531)* + +**Context**: The dual-token model needs a refresh token. The two viable shapes are (a) signed self-describing JWT or (b) opaque server-stored value. Refresh tokens are long-lived; their threat model centres on theft + replay. + +**Decision**: Opaque random `Base64Url(32 bytes)` stored on the server as a SHA-256 hash. Each rotation marks the previous row as `revoked_reason='rotated'` and inserts a new row in the same `family_id`. Presenting an already-rotated token revokes the entire family with `reason='reuse_detected'`. + +**Alternatives rejected**: +- **JWT refresh token**: server cannot revoke without a denylist (which negates the "stateless" advantage). No reuse-detection without ALSO server state. +- **Sliding session ID alone (no rotation)**: theft is permanent until manual revocation. + +**Consequences**: +- Every refresh hits Postgres (one indexed lookup + one update + one insert in a transaction). Acceptable at current load; if it becomes a bottleneck, the `sessions_refresh_hash_idx` UNIQUE INDEX is the obvious caching boundary. +- Refresh-token theft is detectable on the next legitimate refresh. +- The session row is also the `sid` claim in the access token — the same row drives logout (F12), JWKS-independent revocation snapshots (F15), and AMR persistence across rotations (`mfa_authenticated`). + +### ADR-008: TOTP MFA secrets encrypted via `IDataProtector` *(cycle 2 — AZ-534)* + +**Context**: MFA secrets are TOTP shared secrets — possession of the database alone (DBA access, backup leak) must NOT yield the ability to mint TOTP codes for users. + +**Decision**: Encrypt `mfa_secret` with ASP.NET `IDataProtector` (purpose string `Azaion.Mfa.Secret`) before persisting. The DataProtection key store is configured via `DataProtection:KeysFolder` and MUST be a persistent volume in production. 
Recovery codes are individually Argon2id-hashed and stored as a `jsonb` array; single-use is enforced by setting `used_at` transactionally with the rest of the login. + +**Alternatives rejected**: +- **Plaintext**: explicit DB-leak escalation path. +- **Application-managed AES via env-var key**: re-introduces the very key-distribution problem ADR-006 solved for JWT signing. +- **External KMS for MFA secrets**: deferred for the same reason as ADR-006. + +**Consequences**: +- Loss of the DataProtection key folder = users must re-enroll MFA (no recovery path). This MUST be backed up alongside DB backups. +- DBA-only access does not yield MFA bypass. + +### ADR-009: Per-account lockout + DB-backed sliding-window rate limit alongside per-IP token bucket *(cycle 2 — AZ-537)* + +**Context**: ASP.NET `RateLimiter` is per-process and per-IP. CMMC AC.L2-3.1.8 requires per-account lockout that survives process restarts. Per-IP alone is insufficient (NAT'd attacker farm; bot rotates IPs). Per-account-only is insufficient (single IP can DoS many accounts at "just below threshold"). + +**Decision**: Both layers, both required to pass: +1. Per-IP — ASP.NET `RateLimiter` middleware with `SlidingWindowRateLimiter` on `/login` and `/login/mfa`. In-memory; resets on restart but recovers within seconds. +2. Per-account — DB-backed sliding window via `audit_events` (count `login_failed` rows for the email within `PerAccountWindowSeconds`). +3. Lockout — `users.failed_login_count` + `users.lockout_until`. After `ConsecutiveFailureThreshold` failures, `lockout_until = now + LockoutSeconds`. Subsequent logins throw `AccountLocked` with `RetryAfterSeconds` until the window passes. + +**Alternatives rejected**: +- **Redis token bucket per account**: avoids DB load but adds a new infra dependency for a low-write workload. The DB sliding window has acceptable cost (`audit_events_event_type_email_idx`). +- **Single combined rule**: harder to tune. 
+ +**Consequences**: +- `audit_events` will grow large (~14 GB/yr at projected fleet scale); operational follow-up to time-partition. +- The `Retry-After` header is set both by the per-IP middleware (lease metadata) and by the `BusinessExceptionHandler` (from `BusinessException.RetryAfterSeconds`), so clients see consistent backoff hints regardless of which layer rejected. +- All gating events go through `audit_events`, providing a single auditable history. diff --git a/_docs/02_document/components/01_data_layer/description.md b/_docs/02_document/components/01_data_layer/description.md index feb0ccc..7abf180 100644 --- a/_docs/02_document/components/01_data_layer/description.md +++ b/_docs/02_document/components/01_data_layer/description.md @@ -29,43 +29,66 @@ ### Entities -> **Cycle 1 (2026-05-13) note** — `DetectionClass` (AZ-513) entity was added. `Resource` (AZ-183) was added then removed in the same cycle (post-cycle-1 revert; security audit F-1 + the OTA delivery model itself was deemed obsolete). The `User.Hardware` column is left in place as a tombstone (nullable, unused) per AZ-197. A UNIQUE INDEX `users_email_uidx` was added on `users.email` (security audit F-3, `env/db/06_users_email_unique.sql`). +> **Cycle 1 (2026-05-13) note** — `DetectionClass` (AZ-513) added; `Resource` (AZ-183) added then reverted same cycle. `User.Hardware` left as a tombstone (AZ-197). UNIQUE INDEX `users_email_uidx` added on `users.email` (security audit F-3, `env/db/06_users_email_unique.sql`). > -> **Cycle 2 (2026-05-14) note** — `ResourcesConfig.SuiteInstallerFolder` and `SuiteStageInstallerFolder` were removed along with the installer endpoints (`GET /resources/get-installer[/stage]`); the POCO is now a single-property class (`ResourcesFolder`). +> **Cycle 2 — early (2026-05-14)** — `ResourcesConfig.SuiteInstallerFolder` / `SuiteStageInstallerFolder` removed with the installer endpoints; `ResourcesConfig` is now `ResourcesFolder`-only. 
+> +> **Cycle 2 — Auth Modernization (2026-05-14)** — significant data-layer changes: +> - **`User`** gained `FailedLoginCount`, `LockoutUntil` (AZ-537) and `MfaEnabled`, `MfaSecret` (DataProtection-encrypted), `MfaRecoveryCodes` (jsonb of Argon2id-hashed codes), `MfaEnrolledAt`, `MfaLastUsedWindow` (AZ-534). `PasswordHash` column unchanged in shape but now contains Argon2id PHC strings; legacy SHA-384 base64 values are accepted by `Security.VerifyPassword` and lazily upgraded on next login (AZ-536). +> - **New table `public.sessions`** (AZ-531 / AZ-535) — refresh-token rotation + revocation, mapped via `Common/Entities/Session`. +> - **New table `public.audit_events`** (AZ-537 + AZ-534) — append-only login + MFA event log, mapped via `Common/Entities/AuditEvent`. +> - **New `RoleEnum.Service = 60`** (AZ-535) — verifier-fleet identity used by the `revocationReaderPolicy`. +> - **New configs**: `AuthConfig` (rate limit + lockout + Argon2id parameters), `SessionConfig` (refresh sliding + absolute lifetimes). `JwtConfig` rebuilt around ES256 (`KeysFolder`, `ActiveKid`, `AccessTokenLifetimeMinutes`, `MfaStepTokenLifetimeMinutes`); the legacy `Secret` and `TokenLifetimeHours` fields are no longer read. +> - **Migrations** added: `07_auth_lockout_and_audit.sql`, `08_sessions.sql`, `09_sessions_logout_and_mission.sql`, `10_users_mfa.sql`. ``` User: Id: Guid (PK) Email: string (required) - PasswordHash: string (required) - Hardware: string? (optional — TOMBSTONED by AZ-197; nullable, unused; no application code reads or writes) + PasswordHash: string (required, Argon2id PHC; legacy SHA-384 base64 accepted on read, rehashed on next login — AZ-536) + Hardware: string? (TOMBSTONED — AZ-197) Role: RoleEnum (required) - CreatedAt: DateTime (required) - LastLogin: DateTime? (optional) - UserConfig: UserConfig? (optional, JSON-serialized) - IsEnabled: bool (required) - -UserConfig: - QueueOffsets: UserQueueOffsets? 
(optional) - -UserQueueOffsets: - AnnotationsOffset: ulong - AnnotationsConfirmOffset: ulong - AnnotationsCommandsOffset: ulong - -DetectionClass (AZ-513): - Id: int (PK, DB-assigned identity) - Name, ShortName, Color: string - MaxSizeM: double - PhotoMode: string? CreatedAt: DateTime + LastLogin: DateTime? + UserConfig: UserConfig? + IsEnabled: bool + FailedLoginCount: int (AZ-537 — reset on successful login) + LockoutUntil: DateTime? (AZ-537 — UTC; "now < LockoutUntil" → AccountLocked) + MfaEnabled: bool (AZ-534) + MfaSecret: string? (AZ-534 — IDataProtector-encrypted base32 TOTP secret) + MfaRecoveryCodes: List? (AZ-534 — jsonb of Argon2id-hashed single-use codes) + MfaEnrolledAt: DateTime? + MfaLastUsedWindow: long? (AZ-534 — anti-replay; last consumed TOTP step) -// Resource entity — REMOVED post-cycle-1 (AZ-183 reverted). The `resources` -// table no longer exists; see env/db/ for the current migration set. +Session (AZ-531 / AZ-535): + Id: Guid (PK — used as the JWT `sid` claim) + UserId: Guid (FK to users) + Class: string ("interactive" | "mission") + RefreshTokenHash: byte[]? (SHA-256 of opaque refresh; null for mission sessions) + RotatedFromTokenId: Guid? (chain pointer for reuse detection) + IssuedAt: DateTime + ExpiresAt: DateTime (sliding for interactive, absolute for mission) + RevokedAt: DateTime? + RevokedReason: string? (one of SessionRevokedReasons) + RevokedByUserId: Guid? + Ip: string? + UserAgent: string? + MfaAuthenticated: bool (AZ-534 — pinned at issue, inherited by rotations) + AircraftId: Guid? (mission-only) + MissionId: string? (mission-only) -RoleEnum: None=0, Operator=10, Validator=20, CompanionPC=30, Admin=40, ResourceUploader=50, ApiAdmin=1000 -// ResourceUploader is now data-only — no endpoint policy references it -// after AZ-183 was reverted. 
+AuditEvent (AZ-537 + AZ-534): + Id: long (PK identity) + EventType: string (one of AuditEventTypes — login_failed/success/lockout, mfa_*) + Email: string (lowercase normalised) + Ip: string? + OccurredAt: DateTime (UTC) + +DetectionClass (AZ-513): unchanged + +RoleEnum: None=0, Operator=10, Validator=20, CompanionPC=30, Admin=40, ResourceUploader=50, Service=60 (AZ-535), ApiAdmin=1000 +// ResourceUploader is data-only since AZ-183 revert. +// Service is the verifier-fleet identity used by revocationReaderPolicy. ``` ### Configuration POCOs @@ -75,16 +98,33 @@ ConnectionStrings: AzaionDb: string — read-only connection string AzaionDbAdmin: string — admin (read/write) connection string -JwtConfig: +JwtConfig (AZ-532): Issuer: string Audience: string - Secret: string - TokenLifetimeHours: double + KeysFolder: string — directory containing one PEM per kid + ActiveKid: string — selects the signing key + AccessTokenLifetimeMinutes: int — default 15 + MfaStepTokenLifetimeMinutes: int — default 5 (AZ-534) + # Secret + TokenLifetimeHours: no longer read; kept only for back-compat deserialisation + +SessionConfig (AZ-531): + RefreshSlidingHours: int — sliding window per rotate + RefreshAbsoluteHours: int — hard cap (no rotation past this) + RevokedSnapshotMinutes: int — verifier-poll grace window for /sessions/revoked + +AuthConfig (AZ-536 + AZ-537): + PasswordHashing: { TimeCost, MemoryCostKiB, Parallelism } — Argon2id parameters + RateLimit: + PerIpPermitLimit: int + PerIpWindowSeconds: int + PerAccountWindowSeconds: int + PerAccountFailedThreshold: int + Lockout: + ConsecutiveFailureThreshold: int + LockoutSeconds: int ResourcesConfig: ResourcesFolder: string - # SuiteInstallerFolder / SuiteStageInstallerFolder removed in cycle 2 with the installer endpoints. - # EncryptionMasterKey was added by AZ-183 and removed in the post-cycle-1 revert. ``` ## 3. External API Specification @@ -97,25 +137,34 @@ N/A — internal component. 
| Query | Frequency | Hot Path | Index Needed | |-------|-----------|----------|--------------| -| `SELECT * FROM users WHERE email = ?` | High | Yes | Yes — UNIQUE INDEX `users_email_uidx` on `email` (security audit F-3, `env/db/06_users_email_unique.sql`) | +| `SELECT * FROM users WHERE email = ?` | High | Yes | Yes — UNIQUE INDEX `users_email_uidx` on `email` | | `SELECT * FROM users` with optional filters | Medium | No | No | | `UPDATE users SET ... WHERE email = ?` | Medium | No | No | -| `INSERT INTO users` | Low | No | No (UNIQUE INDEX above also enforces single-row-per-email atomically) | +| `INSERT INTO users` | Low | No | UNIQUE INDEX above | | `DELETE FROM users WHERE email = ?` | Low | No | No | +| `SELECT * FROM sessions WHERE refresh_token_hash = ?` (AZ-531) | High | Yes | Yes — UNIQUE INDEX on `refresh_token_hash` (`08_sessions.sql`) | +| `UPDATE sessions SET revoked_at..., revoked_reason... WHERE id = ?` (AZ-535) | Medium | No | PK | +| `UPDATE sessions SET revoked_... WHERE user_id = ? AND revoked_at IS NULL` (AZ-535 logout/all) | Low | No | INDEX on `(user_id, revoked_at)` | +| `UPDATE sessions SET revoked_... WHERE aircraft_id = ? AND class='mission' AND revoked_at IS NULL` (AZ-533) | Low | No | INDEX on `(aircraft_id, class, revoked_at)` | +| `SELECT ... FROM sessions WHERE revoked_at >= ? AND expires_at > now()` (AZ-535 verifier poll) | High | Yes | INDEX on `revoked_at` | +| `SELECT count(*) FROM audit_events WHERE event_type='login_failed' AND email=? 
AND occurred_at >= ?` (AZ-537) | High | Yes | INDEX on `(email, event_type, occurred_at)` | +| `INSERT INTO audit_events (...)` (AZ-537 / AZ-534) | High | Yes | n/a | ### Caching Strategy | Data | Cache Type | TTL | Invalidation | |------|-----------|-----|-------------| -| User by email | In-memory (LazyCache) | 4 hours | On `UpdateQueueOffsets` (post-AZ-197 — hardware paths gone) | +| User by email | In-memory (LazyCache) | 4 hours | On `UpdateQueueOffsets`, on lazy-rehash (AZ-536), on MFA enroll/confirm/disable (AZ-534), on user enable/disable, on lockout state changes (AZ-537) | -> The `Resources.Latest.{arch}.{stage}` cache key (added by AZ-183) was removed in the post-cycle-1 revert. +> Refresh tokens, sessions, and audit events are NOT cached — they are read directly from Postgres on every request. The verifier-poll snapshot (`/sessions/revoked`) is the only "edge" cache and lives in the verifier process, not in this component. ### Storage Estimates | Table | Est. Row Count (1yr) | Row Size | Total Size | Growth Rate | |-------|---------------------|----------|------------|-------------| -| `users` | 100–1000 web users + 2000–10000 CompanionPC device users (AZ-196 grows this) | ~500 bytes | ~5 MB | Medium (device fleet) | +| `users` | 100–1000 web users + 2000–10000 CompanionPC device users | ~700 bytes (post-MFA columns) | ~7 MB | Medium | +| `sessions` (AZ-531) | 30 d retention (`RefreshAbsoluteHours`) × N active sessions per user × pruning job | ~400 bytes | ~50 MB ceiling | High during active fleet ops; bounded by retention | +| `audit_events` (AZ-537) | ~50 events/user/day × ~5000 users × 365 d | ~150 bytes | ~14 GB/yr | High — partition or archive after 90 d (operational follow-up) | | `detection_classes` (AZ-513) | 10–200 | ~250 bytes | ~50 KB | Low | ### Data Management @@ -182,12 +231,15 @@ N/A — internal component. 
## Modules Covered - `Common/Configs/ConnectionStrings` -- `Common/Configs/JwtConfig` +- `Common/Configs/JwtConfig` *(AZ-532 — ES256 + session config)* +- `Common/Configs/AuthConfig` *(new in cycle 2 — AZ-536 + AZ-537)* - `Common/Configs/ResourcesConfig` -- `Common/Entities/User` -- `Common/Entities/RoleEnum` +- `Common/Entities/User` *(extended in cycle 2 — AZ-537 + AZ-534)* +- `Common/Entities/RoleEnum` *(extended in cycle 2 — AZ-535 added `Service`)* +- `Common/Entities/Session` *(new in cycle 2 — AZ-531 + AZ-535)* +- `Common/Entities/AuditEvent` *(new in cycle 2 — AZ-537)* - `Common/Entities/DetectionClass` *(added cycle 1, AZ-513)* -- `Common/Database/AzaionDb` (now also holds the `DetectionClasses` table; the `Resources` ITable added by AZ-183 was removed in the post-cycle-1 revert) -- `Common/Database/AzaionDbSchemaHolder` +- `Common/Database/AzaionDb` (`Sessions` and `AuditEvents` ITables added in cycle 2) +- `Common/Database/AzaionDbSchemaHolder` (Session + AuditEvent mappings, jsonb for `MfaRecoveryCodes`) - `Common/Database/DbFactory` - `Services/Cache` diff --git a/_docs/02_document/components/03_auth_and_security/description.md b/_docs/02_document/components/03_auth_and_security/description.md index 5184b92..f8d6af6 100644 --- a/_docs/02_document/components/03_auth_and_security/description.md +++ b/_docs/02_document/components/03_auth_and_security/description.md @@ -1,90 +1,181 @@ # Authentication & Security -> **Cycle 1 (2026-05-13) note** — AZ-197 simplified `GetApiEncryptionKey` to `(email, password)` and removed `GetHWHash` outright. The hardware-binding threat model that motivated those primitives is no longer in scope (fTPM-anchored Jetsons + browser SaaS). +> **Cycle 1 (2026-05-13) note** — AZ-197 simplified `GetApiEncryptionKey` to `(email, password)` and removed `GetHWHash` outright. The hardware-binding threat model is no longer in scope. 
> -> **Cycle 2 (2026-05-14) note** — `GetApiEncryptionKey`, `EncryptTo`, and `DecryptTo` were all removed along with the encrypted-download endpoint. `Security` is now a one-method utility (`ToHash`) that backs SHA-384 password hashing. +> **Cycle 2 — early (2026-05-14, batches 01-04)** — `GetApiEncryptionKey`, `EncryptTo`, and `DecryptTo` were removed along with the encrypted-download endpoint. `Security` was briefly a one-method utility (`ToHash`) wrapping SHA-384. +> +> **Cycle 2 — Auth Modernization (2026-05-14, AZ-531..AZ-538)** — this component was rebuilt from a single-token issuer + SHA-384 hasher into the full session/refresh/MFA/audit/mission stack described below. Old single-token, symmetric-HS256, SHA-384 paths are gone. ## 1. High-Level Overview -**Purpose**: JWT token creation/validation and password hashing (`Security.ToHash`). +**Purpose**: end-to-end authentication, authorization, session management, second factor (TOTP), token signing/verification, mission credentials, audit, and request-time abuse protection (rate limiting / lockout). -**Architectural Pattern**: Service + static utility — `AuthService` is a DI-managed service for JWT operations; `Security` is a static class with a single SHA-384 helper. +**Architectural Pattern**: a cluster of focused DI-registered services backed by Postgres tables, fronted by Admin API endpoints. Token signing is asymmetric (ES256) with file-system key storage and JWKS publication. Refresh tokens use server-side rotation with reuse detection. MFA secrets are encrypted at rest via ASP.NET `IDataProtector`. -**Upstream dependencies**: Data Layer (JwtConfig, IUserService for GetByEmail), ASP.NET Core (IHttpContextAccessor). 
+**Upstream dependencies**: +- Data Layer (`AzaionDb`, `JwtConfig`, `SessionConfig`, `AuthConfig`, `IUserService.GetByEmail`) +- ASP.NET Core (`IHttpContextAccessor`, `IDataProtectionProvider`, `RateLimiter` middleware) +- File system (`JwtConfig.KeysFolder` for ES256 keys; one PEM per kid) -**Downstream consumers**: Admin API (token creation on login, current user resolution), User Management (password hashing for both web users and provisioned devices). +**Downstream consumers**: +- Admin API endpoints (`/login`, `/login/mfa`, `/token/refresh`, `/logout`, `/logout/all`, `/users/me/mfa/*`, `/sessions/{sid}/revoke`, `/aircraft/{id}/sessions`, `/sessions/revoked`, `/sessions/mission`, `/.well-known/jwks.json`) +- All authorized requests (JWT bearer middleware verifies via `IJwtSigningKeyProvider`; verifier services consult the revoked-sessions snapshot) +- User Management (Argon2id hashing for register/update; lazy migration on login) ## 2. Internal Interfaces -### Interface: IAuthService +### Service: `IAuthService` -| Method | Input | Output | Async | Error Types | -|--------|-------|--------|-------|-------------| -| `GetCurrentUser` | (none — reads from HttpContext) | `User?` | Yes | None | -| `CreateToken` | `User` | `string` (JWT) | No | None | +| Method | Input | Output | Async | Notes | +|--------|-------|--------|-------|-------| +| `GetCurrentUser` | (HttpContext) | `User?` | Yes | Reads `ClaimTypes.Name` (email) and looks up via `IUserService.GetByEmail` | +| `CreateToken` | `User`, `Guid sessionId`, `Guid jti`, `IEnumerable? amr` | `AccessToken` (record: `Jwt`, `ExpiresAt`) | No | ES256 signed; lifetime from `JwtConfig.AccessTokenLifetimeMinutes`. Stamps `sub` (`NameIdentifier`), `email` (`Name`), `role`, `sid`, `jti`, and one `amr` claim per value (defaults to `["pwd"]`).
| -### Static: Security +### Service: `IRefreshTokenService` *(AZ-531)* + +| Method | Input | Output | Notes | +|--------|-------|--------|-------| +| `IssueForNewLogin` | `Guid userId`, `bool mfaAuthenticated`, `CancellationToken` | `(string OpaqueToken, Session Session)` | Creates a new session family (the returned `Session.Id` is the `sid` claim) + initial refresh token. `MfaAuthenticated` is pinned on the session so refresh rotations inherit AMR strength. | +| `Rotate` | `string opaqueToken`, `CancellationToken` | `(string OpaqueToken, Session Session)` | Validates → marks old as rotated → inserts new row in same family. Presenting an already-rotated token revokes the entire family. | + +### Service: `ISessionService` *(AZ-535)* + +| Method | Input | Output | +|--------|-------|--------| +| `RevokeBySid` | `Guid sessionId`, `Guid? byUserId`, `string reason`, `CancellationToken` | `Task` (true = was already revoked = no-op) | +| `RevokeAllForUser` | `Guid userId`, `Guid? byUserId`, `string reason`, `CancellationToken` | `Task` (rows revoked) | +| `RevokeMissionsForAircraft` | `Guid aircraftId`, `CancellationToken` | `Task` (called from `MissionTokenService.Issue` and from any successful aircraft re-login) | +| `GetRevokedSince` | `DateTime since`, `CancellationToken` | `Task>` (sid, exp, revokedAt, reason) | + +### Service: `IMfaService` *(AZ-534)* + +| Method | Input | Output | +|--------|-------|--------| +| `Enroll` | `Guid userId`, `string password`, `CancellationToken` | `Task` (otpauth URL, base32 secret, QR PNG bytes — DataProtection-encrypted secret persisted) | +| `Confirm` | `Guid userId`, `string code`, `CancellationToken` | `Task` (sets `MfaEnabled=true`, generates and stores hashed recovery codes) | +| `Disable` | `Guid userId`, `string password`, `string code`, `CancellationToken` | `Task` | +| `IssueMfaStepToken` | `Guid userId` | `string` (short-lived JWT with `mfa_pending`, audience `mfa-step`, signed by active ES256 key) | +| 
`ValidateMfaStepToken` | `string token` | `Guid userId` | +| `VerifyForLogin` | `Guid userId`, `string code`, `CancellationToken` | `Task` — returns the AMR array (`["pwd","mfa"]` or with `"recovery"` appended); throws `InvalidMfaCode` on failure | + +### Service: `IMissionTokenService` *(AZ-533)* + +| Method | Input | Output | Notes | +|--------|-------|--------|-------| +| `Issue` | `Guid pilotUserId`, `MissionSessionRequest`, `CancellationToken` | `Task` | Validates aircraft is `CompanionPC`; auto-revokes prior mission sessions for the aircraft; inserts session row with `Class = "mission"` BEFORE signing so `sid` is bound; planned duration = absolute lifetime (no refresh). | + +### Service: `IJwtSigningKeyProvider` *(AZ-532)* + +| Member | Output | Notes | +|--------|--------|-------| +| `Active` | `JwtSigningKey` (`Kid`, `EcdsaSecurityKey SecurityKey`, `ECDsa Ecdsa`) | The signing key. Eager — constructed once at app start so missing/malformed keys fail-fast. | +| `All` | `IReadOnlyList` | Drives `/.well-known/jwks.json` and `IssuerSigningKeyResolver`. All discovered keys are exposed; only `Active` signs. | + +### Service: `IAuditLog` *(AZ-537 + AZ-534)* + +| Method | Purpose | +|--------|---------| +| `RecordLoginSuccess(email)` / `RecordLoginFailed(email)` / `RecordLoginLockout(email)` | Persists `audit_events` rows with normalised email + caller IP. | +| `RecordMfaEnroll/Confirm/Disable/LoginSuccess/LoginFailed/RecoveryUsed(email)` | One per MFA lifecycle event. | +| `CountRecentFailedLogins(email, windowSeconds)` | Backs the per-account sliding-window check in `UserService.ValidateUser`. 
| + +### Static: `Security` *(AZ-536 — replaces SHA-384)* | Method | Input | Output | Description | |--------|-------|--------|-------------| -| `ToHash` | `string` | `string` (Base64) | SHA-384 hash | +| `HashPassword` | `string` | `string` (PHC) | Argon2id, parameters from `AuthConfig.PasswordHashing` | +| `VerifyPassword` | `string presented`, `string stored` | `VerifyResult` (`Ok`, `NeedsRehash`) | Constant-time; recognizes legacy SHA-384 base64 strings and returns `Ok=true, NeedsRehash=true` so `UserService` can lazy-upgrade | **Removed**: -- `GetHWHash(string hardware)` — removed by AZ-197 (cycle 1). -- `GetApiEncryptionKey(string email, string password)` — removed in cycle 2 (no remaining callers after `POST /resources/get/{dataFolder?}` was deleted). -- `EncryptTo` / `DecryptTo` extension methods — removed in cycle 2 (no remaining callers; the only consumer was `ResourcesService.GetEncryptedResource`, also deleted). +- `ToHash(string)` — removed by AZ-536. All callers now use `HashPassword` / `VerifyPassword`. +- `GetHWHash`, `GetApiEncryptionKey`, `EncryptTo`, `DecryptTo` — removed earlier in cycle 2. ## 3. External API Specification -N/A — exposed through Admin API. +Exposed via Admin API (component 05). Cycle 2 added: + +- `POST /login` — now returns either `LoginResponse` (access + refresh + sid) or `MfaRequiredResponse` (mfa_token only when MFA is enabled). Per-IP sliding-window rate limit applied. +- `POST /login/mfa` — completes MFA login (anonymous + per-IP rate limit; the step-1 token is the proof of mid-flow) → `LoginResponse` +- `POST /token/refresh` — rotates refresh token + new access token (anonymous; the refresh token IS the proof) +- `POST /logout` — revokes the caller's current `sid` (read from the access-token claim). Idempotent. 
+- `POST /logout/all` — revokes every session for the caller's user +- `POST /users/me/mfa/enroll` / `confirm` / `disable` +- `POST /sessions/{sid:guid}/revoke` *(ApiAdmin)* +- `GET /sessions/revoked?since=...` *(verifier role / ApiAdmin via `revocationReaderPolicy`)* +- `POST /sessions/mission` *(authenticated; pilot's interactive token)* → mission `LoginResponse`-shaped reply +- `GET /.well-known/jwks.json` — anonymous; serves all loaded ES256 public keys (active + retiring); cached 1h. ## 4. Data Access Patterns -No direct database access. `AuthService.GetCurrentUser` delegates to `IUserService.GetByEmail`. +| Service | Tables touched | Pattern | +|---------|----------------|---------| +| `RefreshTokenService` | `public.sessions` | Insert on issue / rotate; update `RevokedAt`+`RevokedReason` on rotate / reuse-detected; index lookup by `RefreshTokenHash` | +| `SessionService` | `public.sessions` | Update by `Sid`; bulk update by `UserId`; range read for revoked-since snapshot | +| `MfaService` | `public.users` | Update MFA columns (`MfaEnabled`, `MfaSecret`, `MfaRecoveryCodes`, `MfaEnrolledAt`, `MfaLastUsedWindow`) | +| `MissionTokenService` | `public.sessions`, `public.users` | Insert mission session row; lookup aircraft user | +| `AuditLog` | `public.audit_events`, `public.users` | Insert events; update `FailedLoginCount` / `LockoutUntil` on the user | +| `AuthService` / `UserService` | `public.users` | Reads for current-user resolution and password verify; updates on lazy rehash | + +All tables are LinqToDB-mapped via `AzaionDbSchemaHolder`; recovery codes use `jsonb`. ## 5. Implementation Details -**Algorithmic Complexity**: SHA-384 hashing is O(n) where n is input length; in practice it operates on short password strings only. +**Argon2id parameters** (cycle 2 default): time=3, memory=64 MiB, parallelism=2 — overridable via `AuthConfig.PasswordHashing`.
Output is a PHC-format string self-describing all parameters; verification re-derives them from the stored value. -**State Management**: `AuthService` is stateless (reads claims from HTTP context per request). `Security` is purely static. +**ES256 keys**: one PEM file per kid in `JwtConfig.KeysFolder`. `ActiveKid` selects the signer; all PEMs with valid `P-256` curves are exposed via JWKS. Rotation procedure: drop a new PEM, set `ActiveKid` to it, restart. Old keys remain in JWKS until physically removed (by ops) so already-issued tokens stay verifiable. -**Key Dependencies**: +**Refresh token format**: opaque random `Base64Url(32 bytes)`. Server stores SHA-256 hash + family id (`Sid`) + `RotatedFromTokenId` to support reuse detection. Sliding window per `SessionConfig.RefreshSlidingHours`; absolute cap per `SessionConfig.RefreshAbsoluteHours`. -| Library | Version | Purpose | -|---------|---------|---------| -| System.IdentityModel.Tokens.Jwt | 7.1.2 | JWT token generation | -| Microsoft.AspNetCore.Authentication.JwtBearer | 10.0.3 | JWT middleware integration | +**Reuse detection**: presenting an already-rotated refresh token revokes the entire family (`Sid`) with reason `RefreshReuseDetected`. The next-snapshot poll picks this up. -**Error Handling Strategy**: -- JWT token creation does not throw (malformed config would cause runtime errors at middleware level). -- `GetCurrentUser` returns null if claims are missing or user not found. +**MFA**: +- Secret: 20 random bytes → base32; URL `otpauth://totp/Azaion:{email}?secret=...&issuer=Azaion`. +- QR: PNG generated with `QRCoder` and returned as bytes (only on enroll). +- Recovery codes: 10 codes, each `Argon2id`-hashed before storage. Single-use; checked on `VerifyForLoginAsync` after TOTP fails. +- Step-1 token: short-lived JWT (`mfa_pending = true`, audience `mfa-step`) signed by the active ES256 key. Lifetime `JwtConfig.MfaStepTokenLifetimeMinutes`. 
+- Replay defense: persisted `MfaLastUsedWindow` blocks reuse of the same TOTP window within the 30s step. + +**Rate limiting / lockout** (AZ-537): +- Per-IP sliding-window limit via ASP.NET Core `RateLimiter` on `/login`, `/login/mfa`, `/token/refresh`. +- Per-account sliding window via `IAuditLog.CountRecentFailedLoginsAsync`; threshold + window from `AuthConfig.RateLimit`. +- Lockout via `LockoutOptions`: N consecutive failures within window → `LockoutUntil` set; subsequent logins throw `AccountLocked` with `RetryAfterSeconds`. + +**HSTS / HTTPS / CORS** (AZ-538): +- HSTS enabled in non-Development with the standard 1y `includeSubDomains` policy. +- HTTPS redirection in non-Development. +- CORS narrowed to the configured admin origins; credentials allowed only for those origins. ## 6. Extensions and Helpers -None — `Security` itself is a utility consumed by other components. +- `Program.cs` helpers: `ParseSidClaim`, `ParseUserIdClaim` (both throw `InvalidRefreshToken` on malformed/missing claims so handlers don't need to repeat the check). +- `BusinessExceptionHandler` adds the `Retry-After` header for `AccountLocked` / `LoginRateLimited`. ## 7. Caveats & Edge Cases -**Known limitations**: -- Password hashing uses SHA-384 without per-user salt or key stretching. Not resistant to rainbow table attacks. (Unchanged by cycles 1 and 2.) -- `GetCurrentUserEmail` assumes `ClaimTypes.Name` is always present; accessing a missing key would throw `KeyNotFoundException`. - -**Removed in cycle 1**: hardware fingerprint hashing was a known weakness (static salt, no rotation); deleting it via AZ-197 also removed that attack surface. - -**Removed in cycle 2**: per-user file encryption (`GetApiEncryptionKey` + `EncryptTo` + `DecryptTo`). The hardcoded encryption-key salt and the in-memory `MemoryStream` round-trip are no longer attack / performance surfaces in this codebase. +- **Asymmetric key roll-forward only**: revoking a kid means deleting its PEM.
There is no per-kid revocation list separate from the file system. Operators must coordinate kid retirement with refresh-token expiry. +- **Verifier polling cadence**: `GET /sessions/revoked?since=` returns the snapshot since a timestamp. Verifiers must tolerate clock skew by stepping `since` back ~30 s. Snapshot rows are pruned only after the `expiry + grace` window has passed. +- **MFA recovery codes are single-use**: there is no `regenerate` endpoint in cycle 2. A user who burns all 10 codes and loses their authenticator must contact an admin, but the only disable path, `/users/me/mfa/disable`, requires the user's password + TOTP, so an admin currently CANNOT disable MFA on the user's behalf (flagged as a follow-up). +- **Mission tokens have no refresh**: `planned_duration_h` is the hard cap; expiry is absolute. Aircraft must re-request via the admin path on re-connect. +- **Lazy password rehash leak window**: a successful login with a SHA-384 stored hash returns `Ok=true, NeedsRehash=true` and `UserService` re-hashes via Argon2id within the same request. If that update fails (DB error), the legacy hash stays; the failure is surfaced via logs but does not block the login. ## 8. Dependency Graph -**Must be implemented after**: Data Layer (for JwtConfig, IUserService). +**Must be implemented after**: Data Layer (configs + DB tables `users`, `sessions`, `audit_events`). -**Can be implemented in parallel with**: User Management (shared dependency on Data Layer). - -**Blocks**: Admin API. (Resource Management no longer depends on this component after cycle 2 removed `EncryptTo` / `DecryptTo`.) +**Blocks**: Admin API (every authenticated endpoint), Verifier components (consume `GET /sessions/revoked` and JWKS). ## 9. Logging Strategy -No explicit logging in AuthService or Security. +- All MFA failures, lockouts, refresh-reuse events, and admin revocations log at `Warning` or above via `IAuditLog` and the structured logger. +- Successful logins log at `Information`.
+- Argon2id verification failures log only the audit row (no plaintext, no hash). ## Modules Covered - `Services/AuthService` - `Services/Security` +- `Services/RefreshTokenService` +- `Services/SessionService` +- `Services/MfaService` +- `Services/MissionTokenService` +- `Services/JwtSigningKeyProvider` +- `Services/AuditLog` diff --git a/_docs/02_document/components/05_admin_api/description.md b/_docs/02_document/components/05_admin_api/description.md index 8050e23..0817aab 100644 --- a/_docs/02_document/components/05_admin_api/description.md +++ b/_docs/02_document/components/05_admin_api/description.md @@ -2,133 +2,201 @@ ## 1. High-Level Overview -**Purpose**: HTTP API entry point — configures DI, middleware pipeline, authentication, authorization, CORS, Swagger, and defines all REST endpoints using ASP.NET Core Minimal API. +**Purpose**: HTTP API entry point — configures DI, middleware pipeline, authentication, authorization, CORS, HSTS, HTTPS redirection, rate limiting, Swagger, DataProtection, and defines all REST endpoints using ASP.NET Core Minimal API. -**Architectural Pattern**: Composition root + Minimal API endpoints — top-level statements configure the application and map HTTP routes to service methods. +**Architectural Pattern**: Composition root + Minimal API endpoints — top-level statements configure the application and map HTTP routes to service methods. A static `IssueDualTokens` helper centralises the access+refresh issuance pattern shared by `/login` (no MFA) and `/login/mfa` (with MFA), and a tiny `ParseSidClaim` / `ParseUserIdClaim` pair extracts session/user identity from the request principal. -**Upstream dependencies**: User Management (IUserService), Authentication & Security (IAuthService, Security), Resource Management (IResourcesService), Data Layer (IDbFactory, ICache, configs). 
+**Upstream dependencies**: Authentication & Security (AuthService, RefreshTokenService, SessionService, MissionTokenService, MfaService, JwtSigningKeyProvider, AuditLog, Security), User Management (IUserService), Resource Management (IResourcesService), Detection Classes (IDetectionClassService), Data Layer (IDbFactory, ICache, all configs). -**Downstream consumers**: None (top-level entry point, consumed by HTTP clients). +**Downstream consumers**: HTTP clients (admin web UI, verifier services, CompanionPC). ## 2. Internal Interfaces ### BusinessExceptionHandler -| Method | Input | Output | Async | Error Types | -|--------|-------|--------|-------|-------------| -| `TryHandleAsync` | `HttpContext, Exception, CancellationToken` | `bool` | Yes | None | +| Method | Input | Output | Async | +|--------|-------|--------|-------| +| `TryHandleAsync` | `HttpContext`, `Exception`, `CancellationToken` | `bool` | Yes | -Converts `BusinessException` to HTTP 409 JSON response: `{ ErrorCode: int, Message: string }`. +Cycle 2 (AZ-537 / AZ-531 / AZ-533 / AZ-534 / AZ-535) — the handler now maps `BusinessException` → an exception-specific HTTP status code via a `MapStatusCode` switch, preserves the legacy `409 Conflict` default, and stamps a `Retry-After` response header when `RetryAfterSeconds` is set. It also handles `BadHttpRequestException` → `400 Bad Request` with `{ ErrorCode: 0, Message }` so malformed payloads have a consistent shape with business errors. 
+ +| `ExceptionEnum` | HTTP status | +|-----------------|-------------| +| `AccountLocked` | `423 Locked` | +| `LoginRateLimited` | `429 Too Many Requests` | +| `InvalidRefreshToken` / `InvalidMfaCode` / `InvalidMfaToken` | `401 Unauthorized` | +| `SessionNotFound` | `404 Not Found` | +| `InvalidMissionRequest` / `AircraftNotFound` | `400 Bad Request` | +| `MfaAlreadyEnabled` / `MfaNotEnrolling` / `MfaNotEnabled` | `409 Conflict` | +| any other | `409 Conflict` (legacy default) | + +### Static helpers in `Program.cs` + +- `IssueDualTokens(user, authService, refreshTokens, sessionService, amr, ct)` — issues a refresh token + an access token, also auto-revokes any open mission sessions if the just-authenticated user is a `CompanionPC` (AZ-533 AC-4). +- `ParseSidClaim(ClaimsPrincipal)` / `ParseUserIdClaim(ClaimsPrincipal)` — read `sid` / `nameid` claims; throw `BusinessException(InvalidRefreshToken)` (→ 401) on missing/malformed. ## 3. External API Specification -> **Cycle 1 (2026-05-13) note** — endpoints below reflect the post-cycle-1 surface (AZ-513 Detection Classes CRUD, AZ-196 device auto-provisioning, AZ-197 hardware-binding removal). AZ-183 (OTA) shipped in cycle 1 but was reverted later the same day after the security audit (finding F-1) — the OTA delivery model itself was deemed obsolete. For per-endpoint cycle origins see `modules/admin_api_program.md`. +> **Cycle 2 (2026-05-14) — auth modernization**: `/login` is now multi-shape (MFA branch); `/login/mfa`, `/token/refresh`, `/logout`, `/logout/all`, `/sessions/*`, `/users/me/mfa/*`, `/.well-known/jwks.json` are all new. The legacy "single JWT" response is preserved as a `Token` getter on `LoginResponse` for compatibility with old clients (= same value as `AccessToken`). + +### Authentication & Sessions + +| Endpoint | Method | Auth | Cycle | Description | +|----------|--------|------|-------|-------------| +| `/login` | POST | Anonymous | AZ-531/534/537 | Validates credentials. 
Returns `LoginResponse` (access + refresh + sid) OR `MfaRequiredResponse` (`mfa_required: true`, short-lived `mfa_token`). Per-IP rate limited. | +| `/login/mfa` | POST | Anonymous | AZ-534 | Validates the step-1 `mfa_token` + the user's TOTP / recovery code. Returns `LoginResponse`. Per-IP rate limited. | +| `/token/refresh` | POST | Anonymous | AZ-531 | Rotates a refresh token. Reuse of a rotated token revokes the entire session family. | +| `/logout` | POST | Authenticated | AZ-535 | Revokes the caller's current `sid` (idempotent). | +| `/logout/all` | POST | Authenticated | AZ-535 | Revokes every active session for the caller's user. | +| `/sessions/{sid:guid}/revoke` | POST | ApiAdmin | AZ-535 | Admin-revoke by session id. | +| `/sessions/revoked` | GET | revocationReader (Service or ApiAdmin) | AZ-535 | Verifier-poll snapshot of revoked sessions still within their TTL. `since` is clamped to a 12 h floor to prevent table scans. | +| `/sessions/mission` | POST | Authenticated | AZ-533 | Pilot issues a long-lived no-refresh mission token bound to one aircraft + one mission. | +| `/.well-known/jwks.json` | GET | Anonymous | AZ-532 | All loaded ES256 public keys (active + retiring). `Cache-Control: public, max-age=3600`. | + +### MFA -### Authentication | Endpoint | Method | Auth | Description | |----------|--------|------|-------------| -| `/login` | POST | Anonymous | Validates credentials, returns JWT | +| `/users/me/mfa/enroll` | POST | Authenticated | Starts TOTP enrollment, returns secret + otpauth URL + PNG QR. | +| `/users/me/mfa/confirm` | POST | Authenticated | Confirms with a TOTP code. Returns `{ mfaEnabled: true }`. | +| `/users/me/mfa/disable` | POST | Authenticated | Requires password + TOTP. Returns `{ mfaEnabled: false }`. 
| ### User Management + | Endpoint | Method | Auth | Description | |----------|--------|------|-------------| -| `/users` | POST | ApiAdmin | Creates a new user | -| `/devices` | POST | ApiAdmin | **AZ-196**: provisions a CompanionPC device user (returns serial + email + plaintext password once) | -| `/users/current` | GET | Authenticated | Returns current user | -| `/users` | GET | ApiAdmin | Lists users (optional email/role filters) | -| `/users/queue-offsets/set` | PUT | Authenticated | Updates queue offsets | -| `/users/{email}/set-role/{role}` | PUT | ApiAdmin | Changes user role | -| `/users/{email}/enable` | PUT | ApiAdmin | Enables user | -| `/users/{email}/disable` | PUT | ApiAdmin | Disables user | -| `/users/{email}` | DELETE | ApiAdmin | Removes user | - -**Removed by AZ-197**: `PUT /users/hardware/set` (Hardware-binding feature deleted) +| `/users` | POST | ApiAdmin | Creates a new user (Argon2id-hashed password, AZ-536). | +| `/devices` | POST | ApiAdmin | Provisions a CompanionPC device user (returns serial + email + plaintext password once). | +| `/users/current` | GET | Authenticated | Returns current user. | +| `/users` | GET | ApiAdmin | Lists users (optional email/role filters). | +| `/users/queue-offsets/set` | PUT | Authenticated | Updates queue offsets. | +| `/users/{email}/set-role/{role}` | PUT | ApiAdmin | Changes user role. | +| `/users/{email}/enable` | PUT | ApiAdmin | Enables user. | +| `/users/{email}/disable` | PUT | ApiAdmin | Disables user (revokes all active sessions for that user via `SessionService`). | +| `/users/{email}` | DELETE | ApiAdmin | Removes user. 
| ### Resource Management + | Endpoint | Method | Auth | Description | |----------|--------|------|-------------| -| `/resources/{dataFolder?}` | POST | Authenticated | Uploads a file (up to 200 MB) | -| `/resources/list/{dataFolder?}` | GET | Authenticated | Lists files | -| `/resources/clear/{dataFolder?}` | POST | ApiAdmin | Clears folder | - -**Removed by AZ-197**: `POST /resources/check` (was the hardware-binding side-effect probe). -**Removed in post-cycle-1 revert**: `POST /get-update` and `POST /resources/publish` (AZ-183 reverted — security audit F-1; OTA delivery model itself obsolete). -**Removed in cycle 2 (2026-05-14)**: `POST /resources/get/{dataFolder?}`, `GET /resources/get-installer`, `GET /resources/get-installer/stage` — all obsolete; the encrypted-download support stack (`Security.GetApiEncryptionKey` / `EncryptTo` / `DecryptTo`, `ResourcesService.GetEncryptedResource` / `GetInstaller`, `GetResourceRequest`, `WrongResourceName = 50`, `ResourcesConfig.SuiteInstallerFolder` / `SuiteStageInstallerFolder`) was removed with them. ADR-003 retired. +| `/resources/{dataFolder?}` | POST | Authenticated | Uploads a file (up to 200 MB). | +| `/resources/list/{dataFolder?}` | GET | Authenticated | Lists files. | +| `/resources/clear/{dataFolder?}` | POST | ApiAdmin | Clears folder. | ### Detection Classes + | Endpoint | Method | Auth | Description | |----------|--------|------|-------------| -| `/classes` | POST | ApiAdmin | **AZ-513**: creates a detection class | -| `/classes/{id:int}` | PATCH | ApiAdmin | **AZ-513**: partial-merge update of a detection class | -| `/classes/{id:int}` | DELETE | ApiAdmin | **AZ-513**: deletes a detection class | +| `/classes` | POST | ApiAdmin | Creates a detection class. | +| `/classes/{id:int}` | PATCH | ApiAdmin | Partial-merge update. | +| `/classes/{id:int}` | DELETE | ApiAdmin | Deletes a detection class. 
| + +### Health + +| Endpoint | Method | Auth | Description | +|----------|--------|------|-------------| +| `/health/live` | GET | Anonymous (excluded from Swagger) | Process liveness; never touches DB. | +| `/health/ready` | GET | Anonymous (excluded from Swagger) | Pings both DB connections with a 2 s timeout; 503 on failure. | ### Authorization Policies -- **apiAdminPolicy**: requires `ApiAdmin` role (used on most admin endpoints) -> The `apiUploaderPolicy` was added by AZ-183 and removed in the post-cycle-1 revert along with the OTA endpoints it guarded. `RoleEnum.ResourceUploader` remains as data only. +| Policy | Roles | Notes | +|--------|-------|-------| +| `apiAdminPolicy` | `ApiAdmin` | The "admin endpoints" policy. | +| `revocationReaderPolicy` | `Service`, `ApiAdmin` | AZ-535 — verifier services authenticate as `Service`-role identities and are the only callers (besides admin) allowed to read `/sessions/revoked`. | -### CORS -- Allowed origins: `https://admin.azaion.com`, `http://admin.azaion.com` -- All methods/headers, credentials allowed +> The `apiUploaderPolicy` from AZ-183 was removed in the post-cycle-1 revert. `RoleEnum.ResourceUploader` remains as data only. + +### CORS, HSTS, HTTPS (AZ-538) + +- **CORS** — single origin `https://admin.azaion.com`, `AllowAnyMethod` + `AllowAnyHeader` + `AllowCredentials`. The legacy `http://` origin combined with credentials would have permitted credentialed cleartext traffic; cycle 2 removed it. +- **HSTS** — non-Development only: 1 y `MaxAge`, `IncludeSubDomains`, `Preload`. +- **HTTPS redirection** — non-Development only. Development skips both so `dotnet watch` on plain HTTP keeps working. + +### Rate limiting (AZ-537) + +- **Per-IP** — ASP.NET Core `RateLimiter` middleware with a `SlidingWindowRateLimiter`. Policy `login-per-ip` is attached to `/login` and `/login/mfa`. Permit limit + window seconds come from `AuthConfig.RateLimit`. Rejection sets `429` and stamps `Retry-After`. 
+- **Per-account** — DB-backed sliding-window check in `UserService.ValidateUser` via `IAuditLog.CountRecentFailedLogins`. Survives process restarts. +- **Per-account lockout** — `LockoutOptions` in `AuthConfig`. N consecutive failures → `LockoutUntil`; subsequent logins throw `AccountLocked` with `RetryAfterSeconds`. ## 4. Data Access Patterns -No direct data access — delegates to service components. +No direct data access — delegates to service components. The composition root also fails fast at startup on missing connection strings (`AzaionDb`, `AzaionDbAdmin`) and on a missing `JwtConfig` (`Issuer` + `Audience` required). ## 5. Implementation Details **State Management**: Stateless — ASP.NET Core request pipeline. +**DI registrations added in cycle 2**: +- `IJwtSigningKeyProvider` (singleton, eager-built before DI so it's the same instance JwtBearer's `IssuerSigningKeyResolver` uses) +- `IRefreshTokenService`, `ISessionService`, `IMissionTokenService`, `IMfaService` (scoped) +- `IAuditLog` (scoped) +- `IDataProtectionProvider` via `AddDataProtection().SetApplicationName("Azaion.AdminApi")` — production deployments MUST set `DataProtection:KeysFolder` to a persistent volume so encrypted MFA secrets survive restarts. + +**Middleware pipeline (cycle 2 order)**: +1. `UseSwagger`/`UseSwaggerUI` (Development only) +2. `UseHsts` + `UseHttpsRedirection` (non-Development only) +3. `UseCors("AdminCorsPolicy")` +4. `UseAuthentication` +5. `UseAuthorization` +6. `UseRateLimiter` +7. `UseRewriter` (root → `/swagger`) +8. Endpoint mappings +9. `UseExceptionHandler` (registered last so the `BusinessExceptionHandler` exception-handler component runs) + +**JWT Bearer config**: +- `ValidAlgorithms = [SecurityAlgorithms.EcdsaSha256]` — pinned to ES256 so a token forged with `alg=HS256` using the public key as the HMAC secret cannot pass validation (AZ-532 AC-5). 
+- `IssuerSigningKeyResolver` consults the same `IJwtSigningKeyProvider` instance the rest of the app uses; if the token has a `kid` it's matched, otherwise all loaded keys are returned. +- `ValidateIssuer`, `ValidateAudience`, `ValidateLifetime`, `ValidateIssuerSigningKey` all true. + **Key Dependencies**: -| Library | Version | Purpose | -|---------|---------|---------| -| Swashbuckle.AspNetCore | 10.1.4 | Swagger/OpenAPI documentation | -| FluentValidation.AspNetCore | 11.3.0 | Request validation pipeline | -| Serilog | 4.1.0 | Structured logging | -| Serilog.Sinks.Console | 6.0.0 | Console log output | -| Serilog.Sinks.File | 6.0.0 | Rolling file log output | +| Library | Purpose | +|---------|---------| +| Microsoft.AspNetCore.Authentication.JwtBearer | JWT bearer middleware | +| Microsoft.AspNetCore.RateLimiting | Per-IP sliding window | +| Microsoft.AspNetCore.DataProtection | Encrypt MFA secrets at rest | +| Microsoft.AspNetCore.Rewrite | `/` → `/swagger` redirect | +| Swashbuckle.AspNetCore | Swagger/OpenAPI | +| FluentValidation.AspNetCore | Request validation pipeline | +| Serilog | Structured logging (Console + rolling file) | **Error Handling Strategy**: -- `BusinessException` → `BusinessExceptionHandler` → HTTP 409 with JSON body. -- `UnauthorizedAccessException` → thrown in resource endpoints when current user is null. -- `FileNotFoundException` → thrown when installer not found. -- FluentValidation errors → automatic 400 Bad Request via middleware. -- Unhandled exceptions → default ASP.NET Core exception handling. +- `BusinessException` → `BusinessExceptionHandler` → per-enum status code (see table above) + optional `Retry-After`. +- `BadHttpRequestException` → `400 Bad Request` with `{ ErrorCode: 0, Message }`. +- FluentValidation errors → 400 via `Results.ValidationProblem`. +- Unhandled → default ASP.NET Core handling. ## 6. Extensions and Helpers -None. 
+- `IssueDualTokens` static helper (Program.cs) +- `ParseSidClaim` / `ParseUserIdClaim` static helpers (Program.cs) ## 7. Caveats & Edge Cases -**Known limitations**: -- All endpoints are defined in a single `Program.cs` file — no route grouping or controller separation. -- Swagger UI only available in Development environment. -- CORS origins are hardcoded (not configurable). -- Antiforgery disabled for resource upload endpoint. -- Root URL (`/`) redirects to `/swagger`. - -**Performance bottlenecks**: -- Kestrel max request body: 200 MB — allows large file uploads but could be a memory concern. +- All endpoints are still defined in a single `Program.cs` file — cycle 2 added significantly more endpoints; consider splitting into endpoint groups in a future cycle. +- Swagger UI only available in Development. +- CORS origins are hardcoded — moving to config is a follow-up. +- `BusinessExceptionHandler` lives under namespace `Azaion.Common` despite the file path `Azaion.AdminApi/`. Documented as historical accident; do not "fix" without coordinated rename. +- Antiforgery disabled on resource upload. +- Kestrel max request body 200 MB. +- The eager `JwtSigningKeyProvider` construction means a missing or malformed PEM crashes the app at startup. This is intentional — it's safer than serving requests with no signing key. ## 8. Dependency Graph **Must be implemented after**: All other components (composition root). -**Can be implemented in parallel with**: Nothing — depends on all services. - **Blocks**: Nothing. ## 9. 
Logging Strategy -| Log Level | When | Example | -|-----------|------|---------| -| WARN | Business exception caught | `BusinessExceptionHandler` logs the exception | -| INFO | Serilog minimum level | General application events | +| Log Level | When | Notes | +|-----------|------|-------| +| `Warning` | Business exception caught by `BusinessExceptionHandler` | Includes the full exception | +| `Warning` | `BadHttpRequestException` caught | | +| `Information` | Default for everything else | Serilog minimum level | **Log format**: Serilog structured logging with context enrichment. - **Log storage**: Console + rolling file (`logs/log.txt`, daily rotation). ## Modules Covered diff --git a/_docs/02_document/data_model.md b/_docs/02_document/data_model.md index c67929f..2102830 100644 --- a/_docs/02_document/data_model.md +++ b/_docs/02_document/data_model.md @@ -1,24 +1,72 @@ # Azaion Admin API — Data Model +> **Cycle 2 (2026-05-14) — Auth Modernization**: this doc has been rewritten to reflect the Postgres state after migrations `07`, `08`, `09`, `10`. Three new table/column clusters were added: account-lockout + audit (AZ-537), refresh-token sessions + revocation + mission tokens (AZ-531/535/533), TOTP MFA (AZ-534). 
+ ## Entity-Relationship Diagram ```mermaid erDiagram USERS { uuid id PK - varchar email "unique, not null" - varchar password_hash "not null" - text hardware "nullable" - varchar hardware_hash "nullable" - varchar role "not null (text enum)" - varchar user_config "nullable (JSON)" - timestamp created_at "not null, default now()" - timestamp last_login "nullable" - bool is_enabled "not null, default true" + varchar email "unique" + varchar password_hash "Argon2id PHC; legacy SHA-384 base64 lazily upgraded" + text hardware "tombstoned (AZ-197)" + varchar role + varchar user_config "JSON" + timestamp created_at + timestamp last_login + bool is_enabled + int failed_login_count "AZ-537" + timestamp lockout_until "AZ-537" + bool mfa_enabled "AZ-534" + text mfa_secret "AZ-534, IDataProtector-encrypted" + jsonb mfa_recovery_codes "AZ-534" + timestamp mfa_enrolled_at "AZ-534" + bigint mfa_last_used_window "AZ-534" } -``` -The system has a single table (`users`). There are no foreign key relationships. 
+ SESSIONS { + uuid id PK + uuid user_id FK + text refresh_hash "nullable for missions" + uuid family_id "AZ-531 reuse-detection key" + timestamp issued_at + timestamp last_used_at + timestamp expires_at + timestamp revoked_at + varchar revoked_reason + uuid parent_session_id FK + timestamp family_started_at + uuid revoked_by_user_id FK "AZ-535" + varchar class "AZ-533: interactive | mission" + uuid aircraft_id FK "AZ-533" + bool mfa_authenticated "AZ-534" + } + + AUDIT_EVENTS { + bigserial id PK + varchar event_type + timestamp occurred_at + varchar email + varchar ip + text metadata + } + + DETECTION_CLASSES { + int id PK + varchar name + varchar short_name + varchar color + double max_size_m + varchar photo_mode + timestamp created_at + } + + USERS ||--o{ SESSIONS : owns + USERS ||--o{ SESSIONS : "revoked_by (AZ-535)" + USERS ||--o{ SESSIONS : "aircraft (AZ-533)" + SESSIONS ||--o{ SESSIONS : "rotated_from (AZ-531)" +``` ## Table: `users` @@ -26,56 +74,147 @@ The system has a single table (`users`). There are no foreign key relationships. 
| Column | Type | Nullable | Default | Description | |--------|------|----------|---------|-------------| -| `id` | `uuid` | No | (application-generated) | Primary key, `Guid.NewGuid()` | -| `email` | `varchar(160)` | No | — | Unique user identifier | -| `password_hash` | `varchar(255)` | No | — | SHA-384 hash, Base64-encoded | -| `hardware` | `text` | Yes | null | Raw hardware fingerprint string | -| `hardware_hash` | `varchar(120)` | Yes | null | Defined in DDL but not used by application code | -| `role` | `varchar(20)` | No | — | Text representation of `RoleEnum` | -| `user_config` | `varchar(512)` | Yes | null | JSON-serialized `UserConfig` object | -| `created_at` | `timestamp` | No | `now()` | Account creation time | -| `last_login` | `timestamp` | Yes | null | Last hardware check / resource access time | -| `is_enabled` | `bool` | No | `true` | Account active flag | +| `id` | `uuid` | No | (application-generated) | Primary key | +| `email` | `varchar(160)` | No | — | Unique (UNIQUE INDEX `users_email_uidx`, security audit F-3) | +| `password_hash` | `varchar(255)` | No | — | **AZ-536**: Argon2id PHC string. Legacy SHA-384 base64 strings are accepted on verify and lazily re-hashed to Argon2id on next successful login. 
| +| `hardware` | `text` | Yes | null | TOMBSTONED (AZ-197) | +| `role` | `varchar(20)` | No | — | Text representation of `RoleEnum` (now includes `Service` — AZ-535) | +| `user_config` | `varchar(512)` | Yes | null | JSON-serialized `UserConfig` | +| `created_at` | `timestamp` | No | `now()` | | +| `last_login` | `timestamp` | Yes | null | Updated on successful login | +| `is_enabled` | `bool` | No | `true` | Setting to `false` triggers `SessionService.RevokeAllForUser` | +| `failed_login_count` | `int` | No | `0` | **AZ-537**: incremented on failed login; reset on success or lockout release | +| `lockout_until` | `timestamp` | Yes | null | **AZ-537**: UTC; `now() < lockout_until` → `BusinessException(AccountLocked)` with `Retry-After` | +| `mfa_enabled` | `boolean` | No | `false` | **AZ-534** | +| `mfa_secret` | `text` | Yes | null | **AZ-534**: base32 TOTP secret, IDataProtector-encrypted (purpose `Azaion.Mfa.Secret`), then base64 | +| `mfa_recovery_codes` | `jsonb` | Yes | null | **AZ-534**: array of `{ hash: , used_at: }`; single-use enforced by setting `used_at` | +| `mfa_enrolled_at` | `timestamp` | Yes | null | **AZ-534** | +| `mfa_last_used_window` | `bigint` | Yes | null | **AZ-534**: last accepted RFC 6238 step counter; anti-replay | -### ORM Mapping (linq2db) +### Indexes -Column names are auto-converted from PascalCase to snake_case via `AzaionDbSchemaHolder`: -- `User.PasswordHash` → `password_hash` -- `User.CreatedAt` → `created_at` +| Index | Type | Columns | +|-------|------|---------| +| `users_pkey` | PK | `id` | +| `users_email_uidx` | UNIQUE | `email` | -Special mappings: -- `Role`: stored as text, converted to/from `RoleEnum` via `Enum.Parse` -- `UserConfig`: stored as nullable JSON string, serialized/deserialized via `Newtonsoft.Json` +## Table: `sessions` *(AZ-531 + AZ-535 + AZ-533 + AZ-534)* + +One row per issued refresh token. Mission tokens are also rows here (`class='mission'`, `refresh_hash` null). 
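The rotation and reuse-detection rules that this table supports (see the Lifecycle notes for `sessions`) can be illustrated with a small in-memory model. This is a minimal sketch in Python rather than the project's C#, and it is not the actual `RefreshTokenService`: the class and method names, the hex encoding of the SHA-256 digest, and the choice to leave already-revoked rows untouched during a family revoke are all assumptions made for illustration.

```python
import hashlib
import secrets
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass
class SessionRow:
    """Stand-in for one `sessions` row; only the columns used below."""
    id: uuid.UUID
    family_id: uuid.UUID
    refresh_hash: str
    parent_session_id: Optional[uuid.UUID] = None
    revoked_at: Optional[datetime] = None
    revoked_reason: Optional[str] = None


class ReuseDetected(Exception):
    """Raised when an already-rotated refresh token is presented again."""


def hash_token(token: str) -> str:
    # Only the SHA-256 of the opaque token is stored (hex encoding is assumed).
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


class SessionStore:
    """Hypothetical in-memory replacement for the table, keyed by refresh_hash."""

    def __init__(self) -> None:
        self.by_hash: Dict[str, SessionRow] = {}

    def issue_for_new_login(self) -> Tuple[str, SessionRow]:
        token = secrets.token_urlsafe(32)
        sid = uuid.uuid4()
        # A fresh login starts a new family; the login diagram uses family_id = id.
        row = SessionRow(id=sid, family_id=sid, refresh_hash=hash_token(token))
        self.by_hash[row.refresh_hash] = row
        return token, row

    def rotate(self, presented: str) -> Tuple[str, SessionRow]:
        row = self.by_hash.get(hash_token(presented))
        if row is None:
            raise KeyError("unknown refresh token")
        now = datetime.now(timezone.utc)
        if row.revoked_at is not None:
            if row.revoked_reason == "rotated":
                # Reuse of a rotated token: revoke every still-active row in the family.
                for other in self.by_hash.values():
                    if other.family_id == row.family_id and other.revoked_at is None:
                        other.revoked_at = now
                        other.revoked_reason = "reuse_detected"
                raise ReuseDetected(str(row.family_id))
            raise PermissionError("session revoked")
        # Normal rotation: retire the old row, insert a child in the same family.
        row.revoked_at, row.revoked_reason = now, "rotated"
        new_token = secrets.token_urlsafe(32)
        child = SessionRow(
            id=uuid.uuid4(),
            family_id=row.family_id,
            refresh_hash=hash_token(new_token),
            parent_session_id=row.id,
        )
        self.by_hash[child.refresh_hash] = child
        return new_token, child
```

Keying the family revoke on `family_id` rather than walking the `parent_session_id` chain matches the purpose of the partial index `sessions_family_active_idx`, which covers only rows where `revoked_at IS NULL`.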
+ +### Columns + +| Column | Type | Nullable | Default | Description | +|--------|------|----------|---------|-------------| +| `id` | `uuid` | No | (application) | PK; used as the JWT `sid` claim | +| `user_id` | `uuid` | No | — | FK → `users.id` ON DELETE CASCADE | +| `refresh_hash` | `text` | Yes | — | SHA-256 of opaque refresh token. Required for `class='interactive'`; null for `class='mission'` (AZ-533) | +| `family_id` | `uuid` | No | — | **AZ-531**: shared by every rotation in the same login session; reuse detection revokes by `family_id` | +| `issued_at` | `timestamp` | No | `now()` | | +| `last_used_at` | `timestamp` | No | `now()` | Updated on rotate | +| `expires_at` | `timestamp` | No | — | Sliding for interactive (`SessionConfig.RefreshSlidingHours`), absolute for mission (`planned_duration_h`) | +| `revoked_at` | `timestamp` | Yes | null | Set on rotate (`rotated`), reuse detection (`reuse_detected`), logout (`logged_out`), logout/all (`logged_out_all`), admin revoke (`admin_revoked`), aircraft reconnect (`aircraft_reconnected`), user disable, refresh expiry sweep | +| `revoked_reason` | `varchar(64)` | Yes | null | One of `SessionRevokedReasons` constants | +| `parent_session_id` | `uuid` | Yes | null | FK → `sessions.id`; rotation chain pointer | +| `family_started_at` | `timestamp` | No | `now()` | Hard cap is `family_started_at + RefreshAbsoluteHours` | +| `revoked_by_user_id` | `uuid` | Yes | null | **AZ-535**: who revoked (admin id, system, or self for logout) | +| `class` | `varchar(32)` | No | `'interactive'` | **AZ-533**: `interactive` or `mission` | +| `aircraft_id` | `uuid` | Yes | null | **AZ-533**: FK → `users.id`; only set for `class='mission'` | +| `mfa_authenticated` | `boolean` | No | `false` | **AZ-534**: pinned at issue; refresh rotations inherit it | + +### Indexes + +| Index | Type | Columns | Notes | +|-------|------|---------|-------| +| `sessions_pkey` | PK | `id` | | +| `sessions_refresh_hash_idx` | UNIQUE | `refresh_hash` | 
O(1) lookup on rotate; nulls allowed (mission rows) | +| `sessions_family_active_idx` | partial | `family_id` WHERE `revoked_at IS NULL` | Reuse-detection family revoke; logout-all | +| `sessions_aircraft_active_idx` | partial | `(aircraft_id, class)` WHERE `revoked_at IS NULL AND aircraft_id IS NOT NULL` | **AZ-533** auto-revoke-on-reconnect | +| `sessions_revoked_at_idx` | partial | `revoked_at` WHERE `revoked_at IS NOT NULL` | **AZ-535** verifier-poll snapshot | + +### Lifecycle + +- **Issue (interactive)**: `RefreshTokenService.IssueForNewLogin` inserts a row with new `id` and `family_id`; `mfa_authenticated` reflects the login path. +- **Rotate**: `RefreshTokenService.Rotate` updates the existing row's `revoked_at`+`revoked_reason='rotated'` and inserts a new row in the same `family_id` with `parent_session_id` pointing to the old row. +- **Reuse detected**: presenting a refresh token whose row already has `revoked_reason='rotated'` → the entire `family_id` is revoked with `reason='reuse_detected'`. +- **Logout**: `SessionService.RevokeBySid(sid, caller, 'logged_out')`. Idempotent. +- **Logout all**: `SessionService.RevokeAllForUser(userId, caller, 'logged_out_all')`. +- **Admin revoke**: `SessionService.RevokeBySid(sid, admin, 'admin_revoked')`. +- **Mission issue**: `MissionTokenService.Issue` inserts row with `class='mission'`, `aircraft_id` set, `refresh_hash=null`, `expires_at = now + planned_duration_h`. **Before** signing the access token, prior mission rows for that `aircraft_id` are revoked with `reason='aircraft_reconnected'` (also called from successful login of a `CompanionPC` user). + +## Table: `audit_events` *(AZ-537 + AZ-534)* + +Append-only log used by the per-account sliding-window rate limit (AZ-537 AC-2) and as evidence for security audits. 
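As an illustration of how an append-only event log can back a per-account sliding-window limit, the following sketch counts recent `login_failed` events for one email. It is written in Python rather than the project's C#, and it is not the real `IAuditLog.CountRecentFailedLogins`; the function names, the `max_failures` threshold, and the window length are hypothetical parameters.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable


@dataclass(frozen=True)
class AuditEvent:
    """Stand-in for an `audit_events` row; only the columns used below."""
    event_type: str
    occurred_at: datetime
    email: str


def count_recent_failed_logins(
    events: Iterable[AuditEvent], email: str, window: timedelta, now: datetime
) -> int:
    # Rough SQL equivalent (assumed, not taken from the codebase):
    #   SELECT count(*) FROM audit_events
    #   WHERE event_type = 'login_failed' AND email = :email
    #     AND occurred_at >= :now - :window
    cutoff = now - window
    needle = email.lower()  # emails are stored lowercase-normalised
    return sum(
        1
        for e in events
        if e.event_type == "login_failed"
        and e.email == needle
        and e.occurred_at >= cutoff
    )


def is_rate_limited(
    events: Iterable[AuditEvent],
    email: str,
    window: timedelta,
    max_failures: int,
    now: datetime,
) -> bool:
    # The window slides with `now`, so old failures age out without any reset job.
    return count_recent_failed_logins(events, email, window, now) >= max_failures
```

Because the count is derived from persisted rows instead of in-process counters, the limit keeps working across restarts, and the filter columns line up with the `(event_type, email, occurred_at DESC)` index.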
+ +### Columns + +| Column | Type | Nullable | Default | Description | +|--------|------|----------|---------|-------------| +| `id` | `bigserial` | No | identity | PK | +| `event_type` | `varchar(64)` | No | — | One of: `login_failed`, `login_success`, `login_lockout`, `mfa_enroll`, `mfa_confirm`, `mfa_disable`, `mfa_login_success`, `mfa_login_failed`, `mfa_recovery_used` | +| `occurred_at` | `timestamp` | No | `now()` | | +| `email` | `varchar(160)` | Yes | null | Lowercase normalised on insert | +| `ip` | `varchar(64)` | Yes | null | `HttpContext.Connection.RemoteIpAddress` | +| `metadata` | `text` | Yes | null | Reserved (no current writer) | + +### Indexes + +| Index | Columns | +|-------|---------| +| `audit_events_pkey` | `id` | +| `audit_events_event_type_email_idx` | `(event_type, email, occurred_at DESC)` | ### Permissions | Role | Privileges | |------|-----------| -| `azaion_reader` | SELECT on `users` | -| `azaion_admin` | SELECT, INSERT, UPDATE, DELETE on `users` | -| `azaion_superadmin` | Superuser (DB owner) | +| `azaion_admin` | INSERT, SELECT, USAGE+SELECT on the sequence | +| `azaion_reader` | SELECT | -### Seed Data +> **Retention**: not yet partitioned. With ~50 events/user/day × ~5000 users × 365 d this is ~14 GB/yr; consider time-partition + 90-day archive in a future cycle. -Two default users (from `env/db/02_structure.sql`): +## Table: `detection_classes` -| Email | Role | -|-------|------| -| `admin@azaion.com` | `ApiAdmin` | -| `uploader@azaion.com` | `ResourceUploader` | +Unchanged in cycle 2. See `_docs/03_implementation/batch_06_report.md` for the original AZ-513 spec. + +## ORM Mapping (linq2db) + +Column names auto-converted from PascalCase → snake_case via `AzaionDbSchemaHolder`. 
Special mappings introduced in cycle 2: + +- `Session.RevokedReason` → enum-like text constants in `SessionRevokedReasons` (string-keyed; not a Postgres enum) +- `Session.Class` → string constants in `SessionClasses` (`"interactive"`, `"mission"`) +- `User.MfaRecoveryCodes` → `jsonb` via `Newtonsoft.Json` serialization (List on the read path; the persisted shape is `[{ hash, used_at }]`) +- `AuditEvent.EventType` → string constants in `AuditEventTypes` +- `User.Role` → text via `Enum.Parse` (now also recognises `Service`) + +## Permissions (post-cycle-2) + +| Role | Tables | Notes | +|------|--------|-------| +| `azaion_reader` | SELECT on `users`, `sessions`, `audit_events`, `detection_classes` | Used by the read-only `IDbFactory.Run` path | +| `azaion_admin` | SELECT/INSERT/UPDATE/DELETE on `users`; SELECT/INSERT/UPDATE on `sessions`; SELECT/INSERT on `audit_events`; full DML on `detection_classes` | Used by `IDbFactory.RunAdmin`. Note: no `DELETE` on `sessions` — revocation is logical via `revoked_at` | +| `azaion_superadmin` | DB owner | Migrations only | ## Schema Migration History Schema is managed via SQL scripts in `env/db/`: -1. `00_install.sh` — PostgreSQL installation and configuration -2. `01_permissions.sql` — Role creation (superadmin, admin, reader) -3. `02_structure.sql` — Table creation + seed data -4. 
`03_add_timestamp_columns.sql` — Adds `created_at`, `last_login`, `is_enabled` columns +| File | Cycle | Description | +|------|-------|-------------| +| `00_install.sh` | baseline | Postgres install + roles | +| `01_permissions.sql` | baseline | Role grants | +| `02_structure.sql` | baseline | `users` table + seed data (`admin@azaion.com`, `uploader@azaion.com`) | +| `03_add_timestamp_columns.sql` | baseline | `created_at`, `last_login`, `is_enabled` | +| `04_detection_classes.sql` | cycle 1 (AZ-513) | `detection_classes` | +| `06_users_email_unique.sql` | post-cycle-1 | Security audit F-3: UNIQUE on `users.email` | +| `07_auth_lockout_and_audit.sql` | cycle 2 (AZ-537) | `users.failed_login_count`, `users.lockout_until`, `audit_events` | +| `08_sessions.sql` | cycle 2 (AZ-531) | `sessions` table + indexes | +| `09_sessions_logout_and_mission.sql` | cycle 2 (AZ-535+533) | `sessions.revoked_by_user_id`, `class`, `aircraft_id`; relax `refresh_hash NOT NULL`; aircraft + revoked_at indexes | +| `10_users_mfa.sql` | cycle 2 (AZ-534) | `users.mfa_*`, `sessions.mfa_authenticated` | -No ORM migration framework is used. Schema changes are applied manually via SQL scripts. +No ORM migration framework is used — scripts are applied in numeric order by `env/db/00_install.sh`. Numbers are not contiguous (`05` is missing) by design — kept as gaps so cherry-picks land in their original slot. -## UserConfig JSON Schema +## UserConfig JSON Schema (unchanged) ```json { @@ -87,10 +226,9 @@ No ORM migration framework is used. Schema changes are applied manually via SQL } ``` -Stored in the `user_config` column. Deserialized to `UserConfig` → `UserQueueOffsets` on read. Default empty `UserConfig` is created when the field is null or empty. +## Observations / Caveats -## Observations - -- The `hardware_hash` column exists in the DDL but is not referenced in application code. The application stores the raw hardware string in `hardware` and computes hashes at runtime. 
-- No unique constraint on `email` column in the DDL — uniqueness is enforced at the application level (`UserService.RegisterUser` checks for duplicates before insert). -- `user_config` is limited to `varchar(512)`, which could be insufficient if queue offsets grow or additional config fields are added. +- `users.user_config` is still `varchar(512)`. With cycle 2 not adding to UserConfig, this is unchanged but remains a future-growth concern. +- `sessions.refresh_hash` UNIQUE INDEX accepts multiple NULLs (Postgres semantics) — that's intentional for mission rows. +- `audit_events` has no FK to `users` because it must survive user deletion (post-incident forensics). +- The `Service` role is data-only on the user table; no provisioning UI exists yet — verifier accounts are seeded out-of-band. diff --git a/_docs/02_document/diagrams/flows/flow_login.md b/_docs/02_document/diagrams/flows/flow_login.md index bf75f43..5a6b9b9 100644 --- a/_docs/02_document/diagrams/flows/flow_login.md +++ b/_docs/02_document/diagrams/flows/flow_login.md @@ -1,20 +1,92 @@ -# Flow: User Login +# Flow: User Login (dual token + MFA) + +> **Cycle 2 (2026-05-14)**: rebuilt around the AZ-531 + AZ-532 + AZ-534 + AZ-536 + AZ-537 stack. Single-token, SHA-384, HS256 path is gone. See `_docs/02_document/system-flows.md` F1 for the full narrative; this file is the canonical sequence diagram. 
```mermaid sequenceDiagram participant Client + participant Mid as RateLimiter (per-IP, AZ-537) participant API as Admin API participant US as UserService + participant Sec as Security (Argon2id, AZ-536) + participant AL as AuditLog + participant Mfa as MfaService + participant RT as RefreshTokenService + participant Auth as AuthService (ES256, AZ-532) + participant SS as SessionService participant DB as PostgreSQL - participant Auth as AuthService - Client->>API: POST /login {email, password} + Client->>Mid: POST /login {email, password} + Mid->>Mid: per-IP sliding-window check + alt no permits + Mid-->>Client: 429 + Retry-After + end + Mid->>API: forward API->>US: ValidateUser(request) - US->>DB: SELECT user WHERE email = ? - DB-->>US: User record - US->>US: Compare password hash (SHA-384) - US-->>API: User entity - API->>Auth: CreateToken(user) - Auth-->>API: JWT string (HMAC-SHA256) - API-->>Client: 200 OK {token} + US->>DB: SELECT users WHERE email = ? + US->>AL: CountRecentFailedLogins(email, window) + alt account locked OR per-account threshold exceeded + US-->>API: AccountLocked / LoginRateLimited (RetryAfterSeconds) + API-->>Client: 423 / 429 + Retry-After + end + US->>Sec: VerifyPassword(presented, stored) + alt VerifyResult.Ok=false + US->>AL: RecordLoginFailed + US->>DB: UPDATE failed_login_count++; lockout_until = now + LockoutSeconds (if newly over) + US-->>API: WrongPassword (or NoEmailFound) + API-->>Client: 409 + end + alt VerifyResult.NeedsRehash=true (legacy SHA-384) + US->>Sec: HashPassword (Argon2id) + US->>DB: UPDATE password_hash (lazy migrate) + end + US->>AL: RecordLoginSuccess + US->>DB: UPDATE failed_login_count = 0, lockout_until = NULL, last_login = now + US-->>API: User + + alt user.MfaEnabled + API->>Mfa: IssueMfaStepToken(userId) + Mfa-->>API: ES256 JWT (mfa_pending=true, audience=mfa-step, ~5 min) + API-->>Client: 200 OK MfaRequiredResponse {mfa_required, mfa_token, expires_in: 300} + + Note over Client,API: --- second factor --- + 
Client->>Mid: POST /login/mfa {mfa_token, code} + Mid->>Mid: per-IP sliding-window check + Mid->>API: forward + API->>Mfa: ValidateMfaStepToken(mfa_token) -> userId + API->>US: GetById(userId) -> User + API->>Mfa: VerifyForLogin(userId, code) + Mfa->>DB: TOTP verify decrypted mfa_secret OR consume recovery code + Mfa->>AL: RecordMfaLoginSuccess (or MfaRecoveryUsed) + Mfa-->>API: amr = ["pwd","mfa"] (+ "recovery" if used) + API->>RT: IssueForNewLogin(userId, mfaAuthenticated=true) + RT->>DB: INSERT INTO sessions (id, family_id=id, refresh_hash=SHA256(opaque), expires_at, mfa_authenticated=true) + RT-->>API: (opaqueRefreshToken, Session) + API->>Auth: CreateToken(user, sid=Session.Id, jti, amr=["pwd","mfa"]) + Auth-->>API: AccessToken (ES256) + opt user.Role == CompanionPC + API->>SS: RevokeMissionsForAircraft(user.Id) + end + API-->>Client: 200 OK LoginResponse {AccessToken, AccessExp, RefreshToken, RefreshExp} + else + API->>RT: IssueForNewLogin(userId, mfaAuthenticated=false) + RT->>DB: INSERT INTO sessions (..., mfa_authenticated=false) + RT-->>API: (opaqueRefreshToken, Session) + API->>Auth: CreateToken(user, sid=Session.Id, jti, amr=["pwd"]) + Auth-->>API: AccessToken (ES256) + opt user.Role == CompanionPC + API->>SS: RevokeMissionsForAircraft(user.Id) + end + API-->>Client: 200 OK LoginResponse {AccessToken, AccessExp, RefreshToken, RefreshExp} + end ``` + +## Related diagrams (cycle 2) + +- `flow_refresh_token.md` *(see system-flows.md F11)* +- `flow_logout_revocation.md` *(see system-flows.md F12)* +- `flow_mission_token.md` *(see system-flows.md F13)* +- `flow_mfa_lifecycle.md` *(see system-flows.md F14)* +- `flow_revocation_snapshot.md` *(see system-flows.md F15)* + +These are documented inline in `system-flows.md` rather than as standalone files; this `flow_login.md` is kept as a separate file because it is referenced from multiple ADRs and the security report. 
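The TOTP check in the second-factor leg of the diagram, combined with the `mfa_last_used_window` anti-replay column from the data model, can be sketched as follows. This is a minimal Python illustration of RFC 6238 verification, not the project's C# `MfaService`: the 30-second step, six digits, HMAC-SHA1, and the one-step clock-skew tolerance are the RFC defaults and are assumed here, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac
import struct
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP value for one counter (HMAC-SHA1, dynamic truncation)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % (10 ** digits)).zfill(digits)


def verify_totp(
    secret_b32: str,
    code: str,
    unix_time: int,
    last_used_window: Optional[int],
    step: int = 30,
    skew: int = 1,
) -> Optional[int]:
    """Return the accepted step counter, or None if the code is wrong or replayed.

    The caller would persist the returned value as the new last-used window.
    """
    secret = base64.b32decode(secret_b32, casefold=True)
    current = unix_time // step
    for window in range(current - skew, current + skew + 1):
        # Anti-replay: never accept a step at or before the last accepted one.
        if last_used_window is not None and window <= last_used_window:
            continue
        if hmac.compare_digest(hotp(secret, window), code):
            return window
    return None
```

Storing the accepted counter instead of the code itself means a captured code cannot be replayed inside its validity window, since the same step is rejected on the second attempt.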
diff --git a/_docs/02_document/module-layout.md b/_docs/02_document/module-layout.md index 1a36a3b..195c3b0 100644 --- a/_docs/02_document/module-layout.md +++ b/_docs/02_document/module-layout.md @@ -3,7 +3,7 @@ **Language**: csharp **Layout Convention**: solution-flat (legacy — pre-`src/` convention) **Root**: `./` (csproj folders sit at workspace root) -**Last Updated**: 2026-05-13 +**Last Updated**: 2026-05-14 *(refreshed for cycle 2 Auth Modernization — AZ-531..AZ-538)* ## Layout Rules @@ -50,12 +50,12 @@ These come from `_docs/02_document/components/` and exist for reading the codeba | # | Sub-component | Primary file locations | |---|----------------------|------------------------| -| 1 | Data Layer | `Azaion.Common/Database/`, `Azaion.Common/Configs/`, `Azaion.Common/Entities/` (incl. `DetectionClass.cs` added cycle 1; `Resource.cs` added then removed in same cycle — see post-cycle-1 revert) | -| 2 | User Management | `Azaion.Services/UserService.cs` (incl. `RegisterDevice` added cycle 1 / AZ-196 — calls `RegisterUser` end-to-end after security-audit consolidation, finding F-3), `Azaion.Common/Requests/Register{User,DeviceResponse}.cs`, `LoginRequest.cs`, `SetUserQueueOffsetsRequest.cs` | -| 3 | Auth & Security | `Azaion.Services/AuthService.cs`, `Azaion.Services/Security.cs` (post-cycle-2 — only `ToHash` remains; `GetApiEncryptionKey` / `EncryptTo` / `DecryptTo` removed with the encrypted-download endpoint), `Azaion.Services/Cache.cs` | +| 1 | Data Layer | `Azaion.Common/Database/`, `Azaion.Common/Configs/` (incl. cycle-2 `AuthConfig.cs` + `JwtConfig.cs` rebuilt for ES256 + new `SessionConfig`), `Azaion.Common/Entities/` (incl. 
cycle-1 `DetectionClass.cs`; cycle-2 `Session.cs` + `AuditEvent.cs`; `User.cs` extended with lockout + MFA columns; `RoleEnum.cs` + `Service = 60`) | +| 2 | User Management | `Azaion.Services/UserService.cs` (cycle-2 — Argon2id verify/hash + lazy migration + lockout + per-account rate-limit checks; new dependencies on `IAuditLog`, `IOptions`), `Azaion.Common/Requests/Register{User,DeviceResponse}.cs`, `LoginRequest.cs`, `LoginResponse.cs` *(new — AZ-531)*, `MfaRequests.cs` *(new — AZ-534)*, `MissionSessionRequest.cs` *(new — AZ-533)*, `SetUserQueueOffsetsRequest.cs` | +| 3 | Auth & Security | `Azaion.Services/AuthService.cs` (cycle-2 — ES256 + `AccessToken` record + sid/jti/amr claims), `Azaion.Services/Security.cs` (cycle-2 — Argon2id `HashPassword`/`VerifyPassword`; `ToHash` deleted), `Azaion.Services/RefreshTokenService.cs` *(new — AZ-531)*, `Azaion.Services/SessionService.cs` *(new — AZ-535)*, `Azaion.Services/MfaService.cs` *(new — AZ-534)*, `Azaion.Services/MissionTokenService.cs` *(new — AZ-533)*, `Azaion.Services/JwtSigningKeyProvider.cs` *(new — AZ-532)*, `Azaion.Services/AuditLog.cs` *(new — AZ-537)*, `Azaion.Services/Cache.cs` | | 4 | Resource Management | `Azaion.Services/ResourcesService.cs` (`GetResourceRequest.cs` removed in cycle 2 with `POST /resources/get`; `SetHWRequest.cs` removed by AZ-197; `ResourceUpdateService.cs` + `GetUpdateRequest.cs` + `PublishResourceRequest.cs` removed when AZ-183 was reverted) | | 4b | Detection Classes | `Azaion.Services/DetectionClassService.cs` + `Azaion.Common/Requests/{Create,Update}DetectionClassRequest.cs` (added cycle 1 / AZ-513) | -| 5 | Admin API (HTTP) | `Azaion.AdminApi/Program.cs`, `Azaion.AdminApi/BusinessExceptionHandler.cs`, `Azaion.AdminApi/appsettings*.json` | +| 5 | Admin API (HTTP) | `Azaion.AdminApi/Program.cs` (cycle-2 — significantly expanded: HSTS / HTTPS redirect, RateLimiter, DataProtection, eight new endpoints, `IssueDualTokens` + `ParseSidClaim`/`ParseUserIdClaim` helpers), 
`Azaion.AdminApi/BusinessExceptionHandler.cs` (cycle-2 — per-enum status mapping + `Retry-After` header), `Azaion.AdminApi/appsettings*.json` |

 ## Allowed Dependencies (csproj layering)

diff --git a/_docs/02_document/modules/admin_api_program.md b/_docs/02_document/modules/admin_api_program.md
index 21b3c58..151e047 100644
--- a/_docs/02_document/modules/admin_api_program.md
+++ b/_docs/02_document/modules/admin_api_program.md
@@ -5,28 +5,43 @@ Application entry point: configures DI, middleware, authentication, authorizatio

 ## Public Interface (HTTP Endpoints)

-> **Cycle 1 (2026-05-13) note** — endpoint surface changed by AZ-513 (detection-class CRUD), AZ-196 (device auto-registration), AZ-197 (hardware-binding removal). AZ-183 (OTA update check + publish) was reverted later the same day after the security audit (finding F-1) — the OTA delivery model itself was deemed obsolete; see `_docs/05_security/security_report.md` for context. The table reflects the post-cycle-1 state including that revert.
+> **Cycle 1 (2026-05-13) note** — endpoint surface changed by AZ-513 (detection-class CRUD), AZ-196 (device auto-registration), AZ-197 (hardware-binding removal). AZ-183 (OTA update check + publish) was reverted later the same day after the security audit (finding F-1).
 >
-> **Cycle 2 (2026-05-14) note** — three more endpoints were removed as obsolete: `POST /resources/get/{dataFolder?}`, `GET /resources/get-installer`, `GET /resources/get-installer/stage`. The encrypted-download support stack (`Security.GetApiEncryptionKey` / `EncryptTo` / `DecryptTo`, `ResourcesService.GetEncryptedResource` / `GetInstaller`, `GetResourceRequest` DTO, `WrongResourceName = 50` enum value, `ResourcesConfig.SuiteInstallerFolder` / `SuiteStageInstallerFolder`) went with them. ADR-003 in `architecture.md` was retired in the same change.
+> **Cycle 2 (2026-05-14) note A** — three resource endpoints removed as obsolete: `POST /resources/get/{dataFolder?}`, `GET /resources/get-installer`, `GET /resources/get-installer/stage`. The encrypted-download support stack went with them. ADR-003 in `architecture.md` was retired.
+>
+> **Cycle 2 (2026-05-14) note B (auth modernization)** — eight endpoints added or replaced as part of Epic AZ-529 (Auth Modernization) + AZ-530 (CMMC Hardening). The `/login` shape is now dual-token (access + refresh) when MFA is off, or `MfaRequiredResponse` when MFA is enabled. CORS dropped the cleartext origin (AZ-538). HSTS + HTTPS redirection are wired in non-Development environments. Per-IP sliding-window rate limit added to `/login` (and `/login/mfa`). Public-key JWKS feed live at `/.well-known/jwks.json` (AZ-532).

-| Method | Path | Auth | Summary | Cycle 1 origin |
-|--------|------|------|---------|----------------|
-| POST | `/login` | Anonymous | Validates credentials, returns JWT token | — |
-| POST | `/users` | ApiAdmin | Creates a new user | — |
-| POST | `/devices` | ApiAdmin | Creates a CompanionPC device user (auto serial / email / 32-hex password) | AZ-196 |
-| GET | `/users/current` | Any authenticated | Returns current user from JWT claims | — |
-| GET | `/users` | ApiAdmin | Lists users with optional email/role filters | — |
-| PUT | `/users/queue-offsets/set` | Any authenticated | Updates user's queue offsets | — |
-| PUT | `/users/{email}/set-role/{role}` | ApiAdmin | Changes a user's role | — |
-| PUT | `/users/{email}/enable` | ApiAdmin | Enables a user account | — |
-| PUT | `/users/{email}/disable` | ApiAdmin | Disables a user account | — |
-| DELETE | `/users/{email}` | ApiAdmin | Removes a user | — |
-| POST | `/resources/{dataFolder?}` | Any authenticated | Uploads a resource file | — |
-| GET | `/resources/list/{dataFolder?}` | Any authenticated | Lists files in a resource folder | — |
-| POST | `/resources/clear/{dataFolder?}` | ApiAdmin | Clears a
resource folder | — |
-| POST | `/classes` | ApiAdmin | Creates a detection class | AZ-513 |
-| PATCH | `/classes/{id:int}` | ApiAdmin | Updates a detection class (partial-merge) | AZ-513 |
-| DELETE | `/classes/{id:int}` | ApiAdmin | Deletes a detection class | AZ-513 |
+| Method | Path | Auth | Summary | Cycle origin |
+|--------|------|------|---------|--------------|
+| GET | `/health/live` | Anonymous | Liveness check (`Cache-Control: no-store`); excluded from Swagger | AZ-510 |
+| GET | `/health/ready` | Anonymous | Readiness check — pings both DB connections with a 2-s timeout; 503 with reason on failure | AZ-510 |
+| POST | `/login` | Anonymous + per-IP rate limit | Validates credentials. Returns `LoginResponse` (access + refresh) when MFA is off, `MfaRequiredResponse` when MFA is enabled. | AZ-531 / AZ-534 / AZ-537 |
+| POST | `/login/mfa` | Anonymous + per-IP rate limit | Second-factor verification (TOTP or recovery code). Returns `LoginResponse`. | AZ-534 |
+| POST | `/token/refresh` | Anonymous (token in body) | Rotates a refresh token; returns a fresh `LoginResponse`. Reuse-detection kills the family. | AZ-531 |
+| POST | `/logout` | Authenticated | Revokes the caller's current session (idempotent — returns `{ alreadyRevoked }`). | AZ-535 |
+| POST | `/logout/all` | Authenticated | Revokes every active session for the caller's user (returns `{ revoked: N }`). | AZ-535 |
+| POST | `/sessions/{sid:guid}/revoke` | ApiAdmin | Admin revoke-by-session-id. | AZ-535 |
+| GET | `/sessions/revoked` | revocationReader (Service or ApiAdmin) | Verifier-poll snapshot of revoked-but-not-yet-expired sessions. `Cache-Control: no-cache`; `since` clamped to `now - 12h`. | AZ-535 |
+| POST | `/sessions/mission` | Authenticated | Mints a long-lived no-refresh mission token bound to one aircraft. AZ-533 AC-6 step-up MFA gate is a TODO comment until org-wide MFA adoption.
| AZ-533 |
+| POST | `/users/me/mfa/enroll` | Authenticated | Returns TOTP secret + otpauth URL + QR PNG + 10 recovery codes (ONCE). | AZ-534 |
+| POST | `/users/me/mfa/confirm` | Authenticated | Validates one TOTP code and flips `mfa_enabled=true`. | AZ-534 |
+| POST | `/users/me/mfa/disable` | Authenticated | Removes MFA (requires password + valid code). | AZ-534 |
+| GET | `/.well-known/jwks.json` | Anonymous (excluded from Swagger) | Public JWKS feed for verifiers; `Cache-Control: public, max-age=3600`. | AZ-532 |
+| POST | `/users` | ApiAdmin | Creates a new user. | — |
+| POST | `/devices` | ApiAdmin | Creates a CompanionPC device user (auto serial / email / 32-hex password). | AZ-196 |
+| GET | `/users/current` | Any authenticated | Returns current user from JWT claims. | — |
+| GET | `/users` | ApiAdmin | Lists users with optional email/role filters. | — |
+| PUT | `/users/queue-offsets/set` | Any authenticated | Updates user's queue offsets. | — |
+| PUT | `/users/{email}/set-role/{role}` | ApiAdmin | Changes a user's role. | — |
+| PUT | `/users/{email}/enable` | ApiAdmin | Enables a user account. | — |
+| PUT | `/users/{email}/disable` | ApiAdmin | Disables a user account. | — |
+| DELETE | `/users/{email}` | ApiAdmin | Removes a user. | — |
+| POST | `/resources/{dataFolder?}` | Any authenticated | Uploads a resource file. | — |
+| GET | `/resources/list/{dataFolder?}` | Any authenticated | Lists files in a resource folder. | — |
+| POST | `/resources/clear/{dataFolder?}` | ApiAdmin | Clears a resource folder. | — |
+| POST | `/classes` | ApiAdmin | Creates a detection class. | AZ-513 |
+| PATCH | `/classes/{id:int}` | ApiAdmin | Updates a detection class (partial-merge). | AZ-513 |
+| DELETE | `/classes/{id:int}` | ApiAdmin | Deletes a detection class.
| AZ-513 |

 ### Removed endpoints

@@ -34,79 +49,99 @@ The following endpoints have been removed and now return `404`:

 | Method | Path | Removed in | Reason |
 |--------|------|------------|--------|
-| PUT | `/users/hardware/set` | cycle 1 (AZ-197) | hardware-binding feature deleted (no fielded clients in target architecture) |
-| POST | `/resources/check` | cycle 1 (AZ-197) | was the hardware-binding side-effect probe; no remaining purpose |
-| POST | `/get-update` | post-cycle-1 (AZ-183 reverted) | security audit F-1: endpoint disclosed plaintext per-resource encryption keys to any authenticated caller; the underlying installer-distribution flow is itself obsolete |
-| POST | `/resources/publish` | post-cycle-1 (AZ-183 reverted) | same revert as `/get-update` — the publish counterpart of the OTA flow |
-| POST | `/resources/get/{dataFolder?}` | cycle 2 (2026-05-14) | obsolete — per-user encrypted-download flow no longer used by any client; ADR-003 retired |
-| GET | `/resources/get-installer` | cycle 2 (2026-05-14) | obsolete — installer-shipping era is over (browser SaaS + fTPM Jetsons) |
-| GET | `/resources/get-installer/stage` | cycle 2 (2026-05-14) | same as `/resources/get-installer` |
+| PUT | `/users/hardware/set` | cycle 1 (AZ-197) | hardware-binding feature deleted |
+| POST | `/resources/check` | cycle 1 (AZ-197) | hardware-binding side-effect probe |
+| POST | `/get-update` | post-cycle-1 (AZ-183 reverted) | security audit F-1 |
+| POST | `/resources/publish` | post-cycle-1 (AZ-183 reverted) | OTA flow obsolete |
+| POST | `/resources/get/{dataFolder?}` | cycle 2 | obsolete; ADR-003 retired |
+| GET | `/resources/get-installer` | cycle 2 | installer-shipping era over |
+| GET | `/resources/get-installer/stage` | cycle 2 | same as above |

 ## Internal Logic

 ### DI Registration
+- `IJwtSigningKeyProvider` → `JwtSigningKeyProvider` (Singleton; eagerly built before `app.Build()` so `JwtBearer` and DI share one instance) — **AZ-532**
 - `IUserService` →
`UserService` (Scoped)
 - `IAuthService` → `AuthService` (Scoped)
+- `IRefreshTokenService` → `RefreshTokenService` (Scoped) — **AZ-531**
+- `ISessionService` → `SessionService` (Scoped) — **AZ-535**
+- `IMissionTokenService` → `MissionTokenService` (Scoped) — **AZ-533**
+- `IMfaService` → `MfaService` (Scoped) — **AZ-534**
 - `IResourcesService` → `ResourcesService` (Scoped)
-- `IDetectionClassService` → `DetectionClassService` (Scoped) — added by AZ-513
+- `IDetectionClassService` → `DetectionClassService` (Scoped)
+- `IAuditLog` → `AuditLog` (Scoped) — **AZ-537 / AZ-534**
 - `IDbFactory` → `DbFactory` (Singleton)
 - `ICache` → `MemoryCache` (Scoped)
 - `LazyCache` via `AddLazyCache()`
-- FluentValidation validators auto-discovered from `RegisterUserValidator` assembly (also picks up `CreateDetectionClassRequest`, `UpdateDetectionClassRequest` validators introduced in cycle 1)
+- ASP.NET Core `DataProtection` — `SetApplicationName("Azaion.AdminApi")`; if `DataProtection:KeysFolder` is set, persisted to filesystem (production requirement for MFA-secret durability) — **AZ-534**
+- FluentValidation validators auto-discovered from `RegisterUserValidator` assembly
 - `BusinessExceptionHandler` registered as exception handler

 ### Middleware Pipeline
-1. Swagger (dev only)
-2. CORS (`AdminCorsPolicy`)
-3. Authentication (JWT Bearer)
-4. Authorization
-5. URL rewrite: root `/` → `/swagger`
-6. Exception handler
+1. Swagger (Development only)
+2. **HSTS + HTTPS redirection (non-Development only)** — AZ-538
+3. CORS (`AdminCorsPolicy`)
+4. Authentication (JWT Bearer with `ValidAlgorithms = [ES256]` and an `IssuerSigningKeyResolver` that picks by `kid` from `IJwtSigningKeyProvider.All`)
+5. Authorization
+6. **Rate limiter (`UseRateLimiter`)** — AZ-537
+7. URL rewrite: root `/` → `/swagger`
+8.
Exception handler

 ### Authorization Policies
 - `apiAdminPolicy`: requires `RoleEnum.ApiAdmin` role
+- `revocationReaderPolicy`: requires `RoleEnum.Service` OR `RoleEnum.ApiAdmin` (gates `/sessions/revoked`) — **AZ-535**

-> The `apiUploaderPolicy` (`RoleEnum.ResourceUploader` OR `ApiAdmin`) was added by AZ-183 and removed in the same cycle when the OTA endpoints it guarded were retired (see "Removed in cycle 1" above). `RoleEnum.ResourceUploader` itself remains as a data value (the seed `uploader@azaion.com` still uses it) but is no longer wired to any endpoint policy.
+### Rate Limit Policies
+- `LoginPerIpPolicy = "login-per-ip"` — sliding-window limiter keyed on `RemoteIpAddress`. Configured from `AuthConfig.RateLimit.PerIpPermitLimit` / `PerIpWindowSeconds`. On rejection, sets `Retry-After` from the `RetryAfter` lease metadata. Applied to `/login` and `/login/mfa`.

 ### Configuration Sections
-- `JwtConfig` — JWT signing/validation
+- `JwtConfig` — JWT signing/validation (Issuer, Audience, KeysFolder, ActiveKid, AccessTokenLifetimeMinutes)
+- `SessionConfig` — refresh-token sliding/absolute window (RefreshSlidingHours, RefreshAbsoluteHours) — **AZ-531**
+- `AuthConfig` — rate-limit and lockout knobs — **AZ-537**
 - `ConnectionStrings` — DB connections
-- `ResourcesConfig` — file storage path (`ResourcesFolder`); the installer subfolders were dropped in cycle 2 along with the installer endpoints
+- `ResourcesConfig` — file storage path

 ### Kestrel
-- Max request body size: 200 MB (for file uploads)
+- Max request body size: 200 MB

 ### Logging
 - Serilog: console + rolling file (`logs/log.txt`)

 ### CORS
-- Allowed origins: `https://admin.azaion.com`, `http://admin.azaion.com`
-- All methods and headers allowed
-- Credentials allowed
+- Allowed origin: `https://admin.azaion.com` (the cleartext `http://` origin was dropped by AZ-538)
+- All methods and headers allowed; credentials allowed
+
+### Helpers
+Local static helpers used by logout / mission endpoints:
+-
`ParseSidClaim(ClaimsPrincipal)` — extracts the `sid` claim; throws `InvalidRefreshToken` (401) if missing/malformed.
+- `ParseUserIdClaim(ClaimsPrincipal)` — extracts `NameIdentifier`; same error semantics.
+- `IssueDualTokens(...)` — shared by `/login` and `/login/mfa`; calls `IRefreshTokenService.IssueForNewLogin`, `IAuthService.CreateToken`, plus `ISessionService.RevokeMissionsForAircraft` when the caller is `RoleEnum.CompanionPC` (AZ-533 AC-4 reconnect trigger).

 ## Dependencies
-All services, configs, entities, and request types from Azaion.Common and Azaion.Services.
+All services, configs, entities, and request types from `Azaion.Common` and `Azaion.Services`. New dependencies wired in cycle 2: `Microsoft.AspNetCore.RateLimiting`, `Microsoft.AspNetCore.DataProtection`.

 ## Consumers
-None — this is the application entry point.
+None — application entry point.

 ## Data Models
 None defined here.

 ## Configuration
-Reads `JwtConfig`, `ConnectionStrings`, `ResourcesConfig` from `IConfiguration`.
+Reads `JwtConfig`, `SessionConfig`, `AuthConfig`, `ConnectionStrings`, `ResourcesConfig` from `IConfiguration`. Optional `DataProtection:KeysFolder` for MFA-secret durability.

 ## External Integrations
 - PostgreSQL (via DI-registered `DbFactory`)
-- Local filesystem (via `ResourcesService`)
+- Local filesystem (via `ResourcesService` and `JwtSigningKeyProvider` for PEM keys)

 ## Security
-- JWT Bearer authentication with full validation (issuer, audience, lifetime, signing key)
-- Role-based authorization policies
-- CORS restricted to `admin.azaion.com`
-- Request body limit of 200 MB
-- Antiforgery disabled for resource upload endpoint
-- Password sent via POST body (not URL)
+- JWT Bearer with full validation: `ValidateIssuer`, `ValidateAudience`, `ValidateLifetime`, `ValidateIssuerSigningKey`, `ValidAlgorithms = [ES256]` (AZ-532 AC-5).
+- Issuer signing keys resolved per-`kid` via `IJwtSigningKeyProvider`; supports rotation overlap.
+- Public JWKS endpoint exposes only public components (`x`/`y` for EC); `Cache-Control: public, max-age=3600`.
+- Per-IP sliding-window rate limit on `/login` and `/login/mfa` (AZ-537).
+- HSTS (1 year, includeSubDomains, preload) + HTTPS redirect in non-Development envs (AZ-538).
+- CORS restricted to HTTPS origin only (AZ-538).
+- DataProtection key folder must be a persistent volume in Production so encrypted MFA secrets survive restarts (AZ-534 known operational requirement; **carry-forward F3** asks for a startup warning when running in Production with the folder unset).
+- Role-based authorization for admin endpoints; new `Service` role gates the verifier-poll feed.

 ## Tests
-None directly; tested indirectly through integration tests.
+None directly; tested through `e2e/Azaion.E2E/Tests/` (Login, RefreshToken, RateLimitLockout, Logout, Jwks, MissionToken, MfaEnrollment, MfaLogin, PasswordHashing).
diff --git a/_docs/02_document/modules/common_business_exception.md b/_docs/02_document/modules/common_business_exception.md
index 5a33b12..a1c7822 100644
--- a/_docs/02_document/modules/common_business_exception.md
+++ b/_docs/02_document/modules/common_business_exception.md
@@ -13,31 +13,54 @@ Custom exception type for domain-level errors, paired with an `ExceptionEnum` ca
 | `GetMessage` | `static string GetMessage(ExceptionEnum exEnum)` | Looks up human-readable message for an error code |

 ### ExceptionEnum
-| Value | Code | Description |
-|-------|------|-------------|
-| `NoEmailFound` | 10 | No such email found |
-| `EmailExists` | 20 | Email already exists |
-| `WrongPassword` | 30 | Passwords do not match |
-| `PasswordLengthIncorrect` | 32 | Password should be at least 12 characters (description text — actual validator threshold is 8 chars per `RegisterUserValidator`) |
-| `EmailLengthIncorrect` | 35 | Email is empty or invalid |
-| `WrongEmail` | 37 | (no description attribute) |
-| `UserDisabled` | 38 | User account is disabled |
-| `NoFileProvided` | 60
| No file provided |
+| Value | Code | Description | HTTP Status |
+|-------|------|-------------|-------------|
+| `NoEmailFound` | 10 | No such email found | 409 |
+| `EmailExists` | 20 | Email already exists | 409 |
+| `WrongPassword` | 30 | Passwords do not match | 409 |
+| `PasswordLengthIncorrect` | 32 | Password should be at least 12 characters | 409 |
+| `EmailLengthIncorrect` | 35 | Email is empty or invalid | 409 |
+| `WrongEmail` | 37 | (no description attribute) | 409 |
+| `UserDisabled` | 38 | User account is disabled | 409 |
+| `AccountLocked` | 50 | AZ-537 — account temporarily locked due to too many failed login attempts (carries `RetryAfterSeconds`) | **423 Locked** |
+| `LoginRateLimited` | 51 | AZ-537 — too many login attempts per account; try again later (carries `RetryAfterSeconds`) | **429 Too Many Requests** |
+| `InvalidRefreshToken` | 52 | AZ-531 — refresh token invalid / expired / revoked / reuse-detected | **401 Unauthorized** |
+| `SessionNotFound` | 53 | AZ-535 — admin tried to revoke a non-existent session | **404 Not Found** |
+| `InvalidMissionRequest` | 54 | AZ-533 — mission_id pattern fail or planned_duration_h out of bounds | **400 Bad Request** |
+| `AircraftNotFound` | 55 | AZ-533 — aircraft id missing or not a `CompanionPC` user | **400 Bad Request** |
+| `MfaAlreadyEnabled` | 56 | AZ-534 — `/users/me/mfa/enroll` called for a user that already has MFA on | **409 Conflict** |
+| `MfaNotEnrolling` | 57 | AZ-534 — confirm called without a prior enroll | **409 Conflict** |
+| `MfaNotEnabled` | 58 | AZ-534 — disable / verify-for-login called for a user without MFA | **409 Conflict** |
+| `InvalidMfaCode` | 59 | AZ-534 — TOTP code (and recovery code) failed to verify | **401 Unauthorized** |
+| `NoFileProvided` | 60 | No file provided | 409 |
+| `InvalidMfaToken` | 61 | AZ-534 — step-1 MFA token failed to validate (signature / audience / expiry) | **401 Unauthorized** |

-> **Cycle 1 (2026-05-13) note** — `HardwareIdMismatch = 40` and
`BadHardware = 45` were removed by AZ-197 (admin-side hardware-binding cleanup). Codes 40 and 45 should NOT be reused for a different meaning — older clients may still surface "Hardware mismatch" UX strings keyed on the integer. `UserDisabled = 38` was added earlier (still part of the baseline). See `_docs/03_implementation/batch_06_report.md`.
+### RetryAfterSeconds
+
+| Member | Type | Description |
+|--------|------|-------------|
+| Constructor | `BusinessException(ExceptionEnum exEnum, int retryAfterSeconds)` | Cycle 2 (AZ-537) — sets `RetryAfterSeconds`, surfaced by `BusinessExceptionHandler` as a `Retry-After` response header. Used by `AccountLocked` (returns remaining lockout seconds) and `LoginRateLimited` (returns the window seconds). |
+| `RetryAfterSeconds` | `int?` | Optional cooldown hint; null when the exception was constructed without a window. |
+
+> **Cycle 1 (2026-05-13) note** — `HardwareIdMismatch = 40` and `BadHardware = 45` were removed by AZ-197. Codes 40 and 45 should NOT be reused.
 >
-> **Cycle 2 (2026-05-14) note** — `WrongResourceName = 50` was removed along with the `GetResourceRequest` validator (the only consumer). Code 50 should NOT be reused — gap kept per the cycle-1 lesson on retired numeric codes.
+> **Cycle 2 (2026-05-14) note** — `WrongResourceName = 50` was removed early in the cycle along with the `GetResourceRequest` validator. The integer 50 has since been **reused for `AccountLocked`** as part of AZ-537 (since the previous user-facing string "Wrong resource name" is no longer surfaced anywhere). This is the one deliberate exception to the "gap kept" lesson — the old code had no remaining client surface and the auth modernization wanted a tightly-clustered range of new codes.

 ## Internal Logic
-Static constructor eagerly loads all `ExceptionEnum` descriptions into a dictionary via `EnumExtensions.GetDescriptions()`. Messages are retrieved by dictionary lookup with fallback to `ToString()`.
+Static constructor eagerly loads all `ExceptionEnum` descriptions into a dictionary via `EnumExtensions.GetDescriptions()`. Messages are retrieved by dictionary lookup with fallback to `ToString()`. The two-arg constructor sets `RetryAfterSeconds` for the lockout / rate-limit paths.

 ## Dependencies
 - `EnumExtensions` — for `GetDescriptions()`

 ## Consumers
-- `BusinessExceptionHandler` — catches and serializes to HTTP 409 response
-- `UserService` — throws for email/password validation failures (`NoEmailFound`, `WrongPassword`, `EmailExists`, `UserDisabled`)
+- `BusinessExceptionHandler` — catches and maps via `MapStatusCode`. The default mapping is 409; cycle 2 codes use a per-enum status map (`AccountLocked` → 423, `LoginRateLimited` → 429, refresh/MFA validation failures → 401, `SessionNotFound` → 404, mission validation failures → 400, MFA conflict states → 409). When `RetryAfterSeconds > 0` the handler also stamps a `Retry-After` response header.
+- `UserService` — throws for the auth path (`NoEmailFound`, `WrongPassword`, `EmailExists`, `UserDisabled`, `AccountLocked`, `LoginRateLimited`)
+- `RefreshTokenService` — throws `InvalidRefreshToken` on bad/expired/reuse-detected
+- `SessionService` — throws `SessionNotFound` for admin-revoke of missing sids
+- `MissionTokenService` — throws `InvalidMissionRequest`, `AircraftNotFound`
+- `MfaService` — throws `MfaAlreadyEnabled`, `MfaNotEnrolling`, `MfaNotEnabled`, `InvalidMfaCode`, `InvalidMfaToken`, `NoEmailFound`, `WrongPassword`
 - `ResourcesService` — throws `NoFileProvided` for missing file uploads
+- `Program.cs` `ParseSidClaim` / `ParseUserIdClaim` helpers — throw `InvalidRefreshToken` (401) on missing or malformed claims
 - FluentValidation validators — reference `ExceptionEnum` codes in `.WithErrorCode()`

 ## Data Models
@@ -50,7 +73,7 @@ None.
 None.

 ## Security
-Error codes are returned to the client via `BusinessExceptionHandler`. Codes are numeric and messages are user-facing.
+Error codes are returned to the client via `BusinessExceptionHandler` along with the per-enum HTTP status. The `Retry-After` header on lockout / rate-limit responses lets well-behaved clients back off without blind retries.

 ## Tests
 None.
diff --git a/_docs/02_document/modules/common_configs_auth_config.md b/_docs/02_document/modules/common_configs_auth_config.md
new file mode 100644
index 0000000..c84c88f
--- /dev/null
+++ b/_docs/02_document/modules/common_configs_auth_config.md
@@ -0,0 +1,58 @@
+# Module: Azaion.Common.Configs.AuthConfig
+
+## Purpose
+Configuration POCO bundling the per-IP / per-account login rate-limit knobs and the consecutive-failure account-lockout policy. Bound from `appsettings.json` section `AuthConfig`.
+
+> Added in cycle 2 (2026-05-14) by AZ-537 (Epic AZ-530, CMMC AC.L2-3.1.8).
+
+## Public Interface
+
+### AuthConfig
+
+| Property | Type | Description |
+|----------|------|-------------|
+| `RateLimit` | `RateLimitOptions` | Per-IP and per-account login rate-limit windows. |
+| `Lockout` | `LockoutOptions` | Consecutive-failure threshold and lockout duration. |
+
+### RateLimitOptions
+
+| Property | Type | Default | Description |
+|----------|------|---------|-------------|
+| `PerIpPermitLimit` | `int` | 10 | Allowed login attempts per IP per `PerIpWindowSeconds`. Enforced by ASP.NET Core's built-in sliding-window limiter on `/login` (and `/login/mfa`). |
+| `PerIpWindowSeconds` | `int` | 60 | Window length for the per-IP limiter. |
+| `PerAccountPermitLimit` | `int` | 5 | Allowed *failed* login attempts per email per `PerAccountWindowSeconds`. Enforced by `UserService.ValidateUser` against `AuditLog.CountRecentFailedLogins`. |
+| `PerAccountWindowSeconds` | `int` | 300 | Window length for the per-account limiter (5 min). |
+
+### LockoutOptions
+
+| Property | Type | Default | Description |
+|----------|------|---------|-------------|
+| `MaxAttempts` | `int` | 10 | Consecutive failed logins that trigger lockout.
Counter lives on `users.failed_login_count`. |
+| `DurationSeconds` | `int` | 900 | Lockout duration (15 min). Sets `users.lockout_until = now() + DurationSeconds`. |
+
+## Internal Logic
+None — pure data class.
+
+## Dependencies
+None.
+
+## Consumers
+- `Program.cs` — registers via `builder.Services.Configure<AuthConfig>(...)` and reads it eagerly to build the per-IP `SlidingWindowLimiter` partition.
+- `UserService.ValidateUser` — reads `RateLimit.PerAccountPermitLimit` / `PerAccountWindowSeconds` for the per-account rate limit and `Lockout.MaxAttempts` / `DurationSeconds` for lockout enforcement.
+
+## Data Models
+None.
+
+## Configuration
+Bound via `builder.Configuration.GetSection(nameof(AuthConfig))`. Override via env vars like `AuthConfig__Lockout__MaxAttempts=15`.
+
+## External Integrations
+None.
+
+## Security
+- Per-IP limit is in-memory (process-local); a multi-instance admin deployment would either need sticky-sessions on `/login` or a Redis-backed limiter (called out as a known upgrade path in `_docs/05_security/security_report.md`).
+- Per-account limit is DB-backed (via `audit_events`) so it survives process restarts and is consistent across instances.
+- Lockout precedence: a locked account returns 423 Locked even for a correct password until `lockout_until` passes (CMMC AC.L2-3.1.8 requires this).
+
+## Tests
+- `e2e/Azaion.E2E/Tests/RateLimitLockoutTests.cs` — covers AC-1..AC-6 of AZ-537 with the default values from this config.
diff --git a/_docs/02_document/modules/common_configs_jwt_config.md b/_docs/02_document/modules/common_configs_jwt_config.md
index 4c738e8..20534a8 100644
--- a/_docs/02_document/modules/common_configs_jwt_config.md
+++ b/_docs/02_document/modules/common_configs_jwt_config.md
@@ -1,38 +1,68 @@
-# Module: Azaion.Common.Configs.JwtConfig
+# Module: Azaion.Common.Configs.JwtConfig + SessionConfig

 ## Purpose
-Configuration POCO for JWT token generation parameters, bound from `appsettings.json` section `JwtConfig`.
+Configuration POCOs for JWT signing/validation and refresh-token TTLs. Bound from `appsettings.json` sections `JwtConfig` and `SessionConfig`. Both classes live in `Azaion.Common/Configs/JwtConfig.cs`.
+
+> **Cycle 2 (2026-05-14) note (AZ-531 / AZ-532)** — major reshape:
+> - HS256 shared-secret signing is gone. `Secret` is no longer read by any code path; the property is retained only as a temporary rollback escape hatch (AZ-532 spec).
+> - New: `KeysFolder` (PEM directory) and `ActiveKid` (currently-signing key id) for ES256.
+> - New: `AccessTokenLifetimeMinutes` (default 15) replaces the old `TokenLifetimeHours` (default 4) — short-lived access tokens are now paired with refresh-token rotation.
+> - New companion class `SessionConfig` carries refresh-token TTLs.

 ## Public Interface

-| Property | Type | Description |
-|----------|------|-------------|
-| `Issuer` | `string` | Token issuer claim |
-| `Audience` | `string` | Token audience claim |
-| `Secret` | `string` | HMAC-SHA256 signing key |
-| `TokenLifetimeHours` | `double` | Token expiry duration in hours |
+### JwtConfig
+
+| Property | Type | Default | Description |
+|----------|------|---------|-------------|
+| `Issuer` | `string` | (required) | Token `iss` claim. Validated by JwtBearer middleware. |
+| `Audience` | `string` | (required) | Token `aud` claim for interactive sessions. (Mission tokens override to `satellite-provider`; MFA step-1 tokens override to `azaion-mfa-step2`.) |
+| `KeysFolder` | `string` | `secrets/jwt-keys` | Directory containing one ES256 PEM per key. The kid is the filename without `.pem`. |
+| `ActiveKid` | `string?` | `null` | Kid currently used to sign new tokens. If null, falls back to the first PEM by ordinal filename order with a startup log warning. |
+| `AccessTokenLifetimeMinutes` | `int` | 15 | Access-token TTL.
|
+
+### SessionConfig
+
+| Property | Type | Default | Description |
+|----------|------|---------|-------------|
+| `RefreshSlidingHours` | `int` | 8 | Each rotation extends `expires_at` by this many hours from `now`. |
+| `RefreshAbsoluteHours` | `int` | 12 | Family is rejected past this many hours since `family_started_at`, regardless of sliding rotations. |

 ## Internal Logic
-None — pure data class.
+None — pure data classes.

 ## Dependencies
 None.

 ## Consumers
-- `Program.cs` — reads `JwtConfig` to configure JWT Bearer authentication middleware
-- `AuthService.CreateToken` — uses Issuer, Audience, Secret, TokenLifetimeHours to build JWT tokens
+
+- `Program.cs`
+  - reads `JwtConfig` eagerly to fail-fast on missing Issuer/Audience and to construct the `JwtSigningKeyProvider` before `app.Build()`
+  - registers `Configure<JwtConfig>` and `Configure<SessionConfig>` for downstream injection
+- `JwtSigningKeyProvider` — reads `KeysFolder`, `ActiveKid`
+- `AuthService.CreateToken` — reads `Issuer`, `Audience`, `AccessTokenLifetimeMinutes`
+- `RefreshTokenService` — reads `SessionConfig.RefreshSlidingHours`, `RefreshAbsoluteHours`
+- `MfaService.IssueMfaStepToken` / `ValidateMfaStepToken` — reads `Issuer` (audience is hard-coded to `azaion-mfa-step2`)
+- `MissionTokenService.MintToken` — reads `Issuer` (audience is hard-coded to `satellite-provider`)

 ## Data Models
 None.

 ## Configuration
-Bound via `builder.Configuration.GetSection(nameof(JwtConfig))`. Expected env var: `ASPNETCORE_JwtConfig__Secret`.
+
+Bound via `builder.Configuration.GetSection(nameof(JwtConfig))` and `Configure`. Override via env vars:
+- `JwtConfig__Issuer=…`, `JwtConfig__Audience=…`, `JwtConfig__KeysFolder=/var/lib/azaion/jwt-keys`, `JwtConfig__ActiveKid=kid-2026-05-14`
+- `SessionConfig__RefreshSlidingHours=8`, `SessionConfig__RefreshAbsoluteHours=12`

 ## External Integrations
-None.
+Filesystem (read-only on `KeysFolder`).

 ## Security
-`Secret` is the symmetric signing key for all JWT tokens.
Must be kept secret and sufficiently long for HMAC-SHA256.
+
+- Private signing keys live on disk only; the JWKS endpoint exports only public components. `chmod 600` is applied by `scripts/generate-jwt-key.sh`.
+- The legacy `Secret` field is retained but unused; remove on a follow-up cleanup ticket once the rollback window has closed.
+- `RefreshAbsoluteHours` is the hard cap on session lifetime — no rotation can extend past it. Bumping above 12 h needs a security review because it directly extends the leak-window of any one refresh token.

 ## Tests
-None.
+- `e2e/Azaion.E2E/Tests/JwksTests.cs` — exercises the rotation overlap (AC-3) by manipulating `KeysFolder` and `ActiveKid`.
+- `e2e/Azaion.E2E/Tests/RefreshTokenTests.cs` — exercises both the sliding and absolute caps (AC-4).
diff --git a/_docs/02_document/modules/common_database_azaion_db.md b/_docs/02_document/modules/common_database_azaion_db.md
index 394ed4e..325f064 100644
--- a/_docs/02_document/modules/common_database_azaion_db.md
+++ b/_docs/02_document/modules/common_database_azaion_db.md
@@ -3,34 +3,42 @@
 ## Purpose
 linq2db `DataConnection` subclass representing the application's database context.

+> **Cycle 1 (2026-05-13)** — `DetectionClasses` ITable added (AZ-513).
+>
+> **Cycle 2 (2026-05-14)** — `AuditEvents` ITable added (AZ-537+534), `Sessions` ITable added (AZ-531+535+533+534).
+ ## Public Interface | Member | Type | Description | |--------|------|-------------| | Constructor | `AzaionDb(DataOptions dataOptions)` | Initializes connection with pre-configured options | -| `Users` | `ITable<User>` | Typed table accessor for the `users` table | +| `Users` | `ITable<User>` | Typed accessor for `public.users` | +| `DetectionClasses` | `ITable<DetectionClass>` | Typed accessor for `public.detection_classes` | +| `AuditEvents` | `ITable<AuditEvent>` | **AZ-537+534** — typed accessor for `public.audit_events` | +| `Sessions` | `ITable<Session>` | **AZ-531+535+533+534** — typed accessor for `public.sessions` (one row per refresh-token rotation; mission tokens live here too) | ## Internal Logic -Delegates all connection management to the base `DataConnection` class. `Users` property calls `this.GetTable<User>()`. +Delegates all connection management to the base `DataConnection` class. Each property calls `this.GetTable<T>()`. The actual column mapping and conversions live in `AzaionDbShemaHolder`. ## Dependencies -- `User` entity +- `User`, `DetectionClass`, `AuditEvent`, `Session` entities - linq2db (`LinqToDB.Data.DataConnection`, `LinqToDB.ITable<T>`) ## Consumers -- `DbFactory` — creates `AzaionDb` instances inside `Run`/`RunAdmin` methods +- `DbFactory` — creates `AzaionDb` instances inside `Run`/`RunAdmin` +- `UserService`, `DetectionClassService`, `RefreshTokenService`, `SessionService`, `MissionTokenService`, `MfaService`, `AuditLog` — all consume the ITables via `IDbFactory.Run`/`RunAdmin` lambdas ## Data Models -Provides access to the `users` table. +Provides access to four tables: `users`, `detection_classes`, `audit_events`, `sessions`. ## Configuration -Receives `DataOptions` (containing connection string + mapping schema) from `DbFactory`. +Receives `DataOptions` (containing connection string + mapping schema) from `DbFactory`. The schema instance is shared between read and write `DataOptions` — produced by `AzaionDbShemaHolder.GetSchema()` once and reused. 
## External Integrations -PostgreSQL database via Npgsql. +PostgreSQL via Npgsql. ## Security -None at this level; connection string security is handled by `DbFactory`. +None at this level. `IDbFactory.Run` selects the read-only connection (`AzaionDb` connection string), `RunAdmin` selects the read/write one (`AzaionDbAdmin`). The grant set on each table determines what each connection can do — see `data_model.md` §Permissions. ## Tests -Indirectly used by `UserServiceTest`. +Exercised end-to-end via the e2e suite (`e2e/Azaion.E2E/Tests/*`). All cycle-2 services have dedicated test files (`RefreshTokenFlowTests`, `LogoutRevocationTests`, `MissionTokenTests`, `MfaLoginTests`, `LoginRateLimitTests`, `PasswordHashingTests`, `AsymmetricSigningTests`, `CorsHttpsTests`). diff --git a/_docs/02_document/modules/common_database_schema_holder.md b/_docs/02_document/modules/common_database_schema_holder.md index 8e62206..22d2f05 100644 --- a/_docs/02_document/modules/common_database_schema_holder.md +++ b/_docs/02_document/modules/common_database_schema_holder.md @@ -3,6 +3,10 @@ ## Purpose Static holder for the linq2db `MappingSchema` that maps C# entities to PostgreSQL table/column naming conventions and handles custom type conversions. +> **Cycle 1 (2026-05-13)** — `DetectionClass` mapping added (AZ-513). +> +> **Cycle 2 (2026-05-14)** — `AuditEvent` and `Session` mappings added; `User.MfaRecoveryCodes` mapped as `DataType.BinaryJson` (jsonb) to satisfy Npgsql's strict OID matching for jsonb columns (AZ-534). + ## Public Interface | Member | Type | Description | @@ -12,26 +16,27 @@ Static holder for the linq2db `MappingSchema` that maps C# entities to PostgreSQ ## Internal Logic Static constructor: 1. Creates a `MappingSchema` with a global callback that converts all column names to snake_case via `StringExtensions.ToSnakeCase`. -2. 
Uses `FluentMappingBuilder` to configure the `User` entity: - - Table name: `"users"` - - `Id`: primary key, `DataType.Guid` - - `Role`: stored as text, with custom conversion to/from `RoleEnum` via `Enum.Parse` - - `UserConfig`: stored as nullable JSON text, serialized/deserialized via `Newtonsoft.Json` +2. Uses `FluentMappingBuilder` to configure the entities: + - **`User`** — table `"users"`, `Id` PK (Guid), `Role` text with `Enum.Parse` round-trip, `UserConfig` JSON via `Newtonsoft.Json` round-trip, **`MfaRecoveryCodes`** (AZ-534) as `DataType.BinaryJson` so Npgsql sends the jsonb OID instead of text (otherwise inserts fail with "column is of type jsonb but expression is of type text"). + - **`DetectionClass`** — table `"detection_classes"`, `Id` PK + identity (DB-assigned). + - **`AuditEvent`** (AZ-537+534) — table `"audit_events"`, `Id` PK + identity. + - **`Session`** (AZ-531+535+533+534) — table `"sessions"`, `Id` PK (Guid). All other columns rely on the snake_case auto-mapping. ## Dependencies -- `User`, `RoleEnum` entities +- `User`, `RoleEnum`, `DetectionClass`, `AuditEvent`, `Session` entities +- `UserConfig` (for the JSON conversion) - `StringExtensions.ToSnakeCase` - linq2db `MappingSchema`, `FluentMappingBuilder` - `Newtonsoft.Json` ## Consumers -- `DbFactory.LoadOptions` — passes `MappingSchema` to `DataOptions.UseMappingSchema()` +- `DbFactory.LoadOptions` — passes `MappingSchema` to `DataOptions.UseMappingSchema()` for both read and write `DataOptions` (single shared instance). ## Data Models -Defines the ORM mapping for the `users` table. +Defines the ORM mapping for `users`, `detection_classes`, `audit_events`, `sessions` tables. ## Configuration -None — all mappings are compile-time. +None — all mappings are compile-time. The `MappingSchema` is built once at first use of the static class and shared across the entire process. ## External Integrations None directly; mappings are used when queries execute against PostgreSQL. 
@@ -40,4 +45,4 @@ None directly; mappings are used when queries execute against PostgreSQL. None. ## Tests -None. +Exercised end-to-end via the e2e suite. Misconfigured jsonb mapping would surface as a `42804` Postgres error (`column is of type jsonb but expression is of type text`) on the first MFA confirm — covered by `e2e/Azaion.E2E/Tests/MfaLoginTests.cs`. diff --git a/_docs/02_document/modules/common_entities_audit_event.md b/_docs/02_document/modules/common_entities_audit_event.md new file mode 100644 index 0000000..eacb18c --- /dev/null +++ b/_docs/02_document/modules/common_entities_audit_event.md @@ -0,0 +1,73 @@ +# Module: Azaion.Common.Entities.AuditEvent + +## Purpose +Append-only audit row for security-relevant events: login outcomes, lockouts, and the MFA enrollment / login lifecycle. Drives both the per-account sliding-window rate limit (AZ-537) and the human-readable security trail. + +> Added in cycle 2 (2026-05-14). Initial event types from AZ-537 (login_failed / login_success / login_lockout); MFA event types added by AZ-534 in the same cycle. + +## Public Interface + +### AuditEvent + +| Property | Type | Description | +|----------|------|-------------| +| `Id` | `long` | DB-assigned identity. | +| `EventType` | `string` | One of `AuditEventTypes`. | +| `OccurredAt` | `DateTime` | `now()` at insert. | +| `Email` | `string?` | Normalised lowercase. NULL for system events without a subject. | +| `Ip` | `string?` | Caller IP from `HttpContext.Connection.RemoteIpAddress`. NULL for background tasks. | +| `Metadata` | `string?` | Reserved for future structured payload. Not used today. | + +### AuditEventTypes (constants) + +| Value | When | +|-------|------| +| `login_failed` | Wrong password, locked account, or rate-limit reject. | +| `login_lockout` | Account just hit `MaxAttempts` and was locked. | +| `login_success` | Password verified, MFA not required. | +| `mfa_enroll` | `/users/me/mfa/enroll` succeeded. 
| +| `mfa_confirm` | `/users/me/mfa/confirm` succeeded; MFA now active. | +| `mfa_disable` | `/users/me/mfa/disable` succeeded. | +| `mfa_login_success` | `/login/mfa` succeeded with TOTP. | +| `mfa_login_failed` | `/login/mfa` rejected (the code was neither a valid TOTP nor a valid recovery code). | +| `mfa_recovery_used` | `/login/mfa` succeeded with a recovery code (also burns the code). | + +## Internal Logic + +None — pure data class. All write/read logic lives in `AuditLog`. + +## Dependencies + +None. + +## Consumers + +- `AuditLog` — produces every row; reads via `CountRecentFailedLogins`. +- `AzaionDb.AuditEvents` — `ITable<AuditEvent>` access. +- `AzaionDbSchemaHolder` — maps `AuditEvent` to the `audit_events` table. + +## Data Models + +Maps to PostgreSQL table `audit_events` (defined in `env/db/07_auth_lockout_and_audit.sql`). + +Columns: `id (bigserial PK)`, `event_type (varchar(64))`, `occurred_at (timestamp default now())`, `email (varchar(160) NULL)`, `ip (varchar(64) NULL)`, `metadata (text NULL)`. + +Index: `audit_events_event_type_email_idx (event_type, email, occurred_at DESC)` — supports the per-account sliding-window failed-login count in O(window-rows). + +## Configuration + +None. + +## External Integrations + +None. + +## Security + +- Append-only by convention — `azaion_admin` only has `INSERT, SELECT` on the table. +- Stores PII (email, IP); access is gated to `azaion_admin` and `azaion_reader` only. No public endpoint surfaces audit rows. +- The table backs CMMC AC.L2-3.1.8 ("limit unsuccessful logon attempts") — tampering with it bypasses the rate limit + lockout enforcement. + +## Tests + +Indirectly tested via `RateLimitLockoutTests`, `MfaEnrollmentTests`, `MfaLoginTests` (assertions on the resulting `audit_events` rows). 
diff --git a/_docs/02_document/modules/common_entities_role_enum.md b/_docs/02_document/modules/common_entities_role_enum.md index fe82199..155262f 100644 --- a/_docs/02_document/modules/common_entities_role_enum.md +++ b/_docs/02_document/modules/common_entities_role_enum.md @@ -3,6 +3,8 @@ ## Purpose Defines the authorization role hierarchy for the system. +> **Cycle 2 (2026-05-14) note** — `Service = 60` added by AZ-535 for service-to-service verifier identities (satellite-provider, gps-denied, ui). Each verifier deployment provisions one `Role=Service` user; the role is gated to read `/sessions/revoked` only (via `revocationReaderPolicy`) and is not valid for any user-facing endpoint. + ## Public Interface | Enum Value | Int Value | Description | @@ -10,9 +12,10 @@ Defines the authorization role hierarchy for the system. | `None` | 0 | No role assigned | | `Operator` | 10 | Annotator access only; can send annotations to queue | | `Validator` | 20 | Annotator + dataset explorer; can receive annotations from queue | -| `CompanionPC` | 30 | Companion PC role | +| `CompanionPC` | 30 | Companion PC role (UAV / aircraft identities; AZ-533 mission tokens are bound to these via `aircraft_id`) | | `Admin` | 40 | Admin role | -| `ResourceUploader` | 50 | Can upload DLLs and AI models | +| `ResourceUploader` | 50 | Data-only — `apiUploaderPolicy` was removed in the post-cycle-1 AZ-183 revert. The seed `uploader@azaion.com` user keeps this role for negative-auth tests. | +| `Service` | 60 | AZ-535 — service-to-service identity for verifiers polling `/sessions/revoked`. NOT valid for any user-facing endpoint. | | `ApiAdmin` | 1000 | Full access to all operations | ## Internal Logic @@ -24,11 +27,13 @@ None. 
## Consumers - `User.Role` property type - `RegisterUserRequest.Role` property type -- `Program.cs` — authorization policies (`apiAdminPolicy`, `apiUploaderPolicy`) +- `Program.cs` — authorization policies (`apiAdminPolicy`, `revocationReaderPolicy` cycle 2) - `AuthService.CreateToken` — embeds role as claim -- `AzaionDbSchemaHolder` — maps Role to/from text in DB +- `AzaionDbSchemaHolder` — maps Role to/from text in DB (text enum → `Enum.Parse(typeof(RoleEnum), v)`; the new `Service` value parses through the existing converter without migration) - `UserService.GetUsers` — filters by role - `UserService.ChangeRole` — updates user role +- `MissionTokenService.Issue` — validates `aircraft_id` resolves to a `CompanionPC` user +- `Program.cs` `IssueDualTokens` — fires `RevokeMissionsForAircraft` when the authenticated user has `Role = CompanionPC` ## Data Models Part of the `User` entity. @@ -40,7 +45,7 @@ None. None. ## Security -Core to the RBAC authorization model. `ApiAdmin` has unrestricted access; `ResourceUploader` can upload resources; other roles have endpoint-level restrictions. +Core to the RBAC authorization model. `ApiAdmin` has unrestricted access; `Service` is narrowly scoped to the `/sessions/revoked` verifier-poll feed; `ResourceUploader` is data-only after AZ-183 was reverted; other roles have endpoint-level restrictions. ## Tests None. diff --git a/_docs/02_document/modules/common_entities_session.md b/_docs/02_document/modules/common_entities_session.md new file mode 100644 index 0000000..6cce0d0 --- /dev/null +++ b/_docs/02_document/modules/common_entities_session.md @@ -0,0 +1,85 @@ +# Module: Azaion.Common.Entities.Session + +## Purpose +Domain entity representing one issued refresh token (interactive sessions) or one mission token (long-lived UAV sessions). One row per issued token; rotated rows chain via `ParentSessionId` and share a `FamilyId` so reuse-detection and family-wide revocation can key off it. + +> Added in cycle 2 (2026-05-14). 
Initial shape from AZ-531 (interactive refresh-token sessions); extended in the same cycle by AZ-535 (`RevokedByUserId`), AZ-533 (`Class`, `AircraftId`), and AZ-534 (`MfaAuthenticated`). + +## Public Interface + +### Session + +| Property | Type | Description | +|----------|------|-------------| +| `Id` | `Guid` | Primary key. | +| `UserId` | `Guid` | FK to `users.id`. | +| `RefreshHash` | `string?` | SHA-256 hex of the opaque refresh token. NULL for mission sessions (they have no refresh value). Unique-indexed. | +| `FamilyId` | `Guid` | All rotations of the same login share this id. For interactive root rows and for mission rows, `FamilyId == Id`. | +| `IssuedAt` | `DateTime` | Row creation time. | +| `LastUsedAt` | `DateTime` | Updated on rotation; informational. | +| `ExpiresAt` | `DateTime` | Sliding (interactive) or absolute (mission) expiry. | +| `RevokedAt` | `DateTime?` | Set on rotation, reuse-detection, logout, admin revoke, post-flight reconnect. | +| `RevokedReason` | `string?` | One of `SessionRevokedReasons`. | +| `ParentSessionId` | `Guid?` | The previous row in the family (set on rotation). | +| `FamilyStartedAt` | `DateTime` | First-issue time of the family — used for the absolute expiry check. | +| `RevokedByUserId` | `Guid?` | AZ-535 — audit trail of who revoked the session. NULL for system revocations (rotation, reuse, post-flight). | +| `Class` | `string` | AZ-533 — `"interactive"` (default) or `"mission"`. | +| `AircraftId` | `Guid?` | AZ-533 — for mission sessions, the `CompanionPC` user the mission token belongs to. Used by `RevokeMissionsForAircraft`. | +| `MfaAuthenticated` | `bool` | AZ-534 — pinned at issue; refresh rotation inherits the original AMR strength even if MFA is enabled/disabled mid-session. | + +### SessionRevokedReasons (constants) + +| Value | When | +|-------|------| +| `rotated` | Old row marked as superseded by a successful refresh rotation. 
| +| `reuse_detected` | OAuth 2.1 §6.1 — already-rotated refresh re-presented; whole family killed. | +| `logged_out` | User called `POST /logout`. | +| `logged_out_all` | User called `POST /logout/all`. | +| `admin_revoked` | Admin called `POST /sessions/{sid}/revoke`. | +| `post_flight_reconnect` | Aircraft reconnected; mission auto-revoked. | +| `family_revoked` | Reserved (manual family-wide revocation; not currently emitted). | + +### SessionClasses (constants) + +| Value | Meaning | +|-------|---------| +| `interactive` | Refresh-backed user session (AZ-531 default). | +| `mission` | Long-lived no-refresh UAV mission token (AZ-533). | + +## Internal Logic + +None — pure data class. All session lifecycle logic lives in `RefreshTokenService`, `SessionService`, `MissionTokenService`. + +## Dependencies + +None. + +## Consumers + +- `RefreshTokenService` — inserts root/family rows, updates on rotation/reuse-detection +- `SessionService` — revocation paths and the verifier-poll snapshot +- `MissionTokenService` — inserts mission-class rows +- `AzaionDb.Sessions` — `ITable<Session>` access +- `AzaionDbSchemaHolder` — maps `Session` to the `sessions` table + +## Data Models + +Maps to PostgreSQL table `sessions` (defined in `env/db/08_sessions.sql`, extended by `09_sessions_logout_and_mission.sql` and `10_users_mfa.sql`). + +## Configuration + +None. + +## External Integrations + +None. + +## Security + +- `refresh_hash` stores SHA-256 of the opaque token; the plaintext is never persisted. +- The `family_id` partial index `sessions_family_active_idx WHERE revoked_at IS NULL` keeps reuse-detection and `RevokeAllForUser` cheap even as the revoked tail grows. +- Auto-revoke-on-reconnect (`RevokeMissionsForAircraft`) closes the mission-token "lost UAV" risk when the aircraft phones home again; the partial index `sessions_aircraft_active_idx (aircraft_id, class) WHERE revoked_at IS NULL AND aircraft_id IS NOT NULL` keeps that check O(active mission rows). 
+ +## Tests + +Indirectly tested via `RefreshTokenTests`, `LogoutTests`, `MissionTokenTests`, and `MfaLoginTests` (which all exercise the entity through the service layer). diff --git a/_docs/02_document/modules/common_entities_user.md b/_docs/02_document/modules/common_entities_user.md index 76e8cac..b60ce75 100644 --- a/_docs/02_document/modules/common_entities_user.md +++ b/_docs/02_document/modules/common_entities_user.md @@ -5,18 +5,33 @@ Domain entity representing a system user, plus related value objects `UserConfig ## Public Interface +> **Cycle 2 (2026-05-14) note** — seven new properties: +> - **AZ-537 (CMMC AC.L2-3.1.8)**: `FailedLoginCount` (consecutive failed-login counter) and `LockoutUntil` (active lockout deadline). Both reset on successful login. +> - **AZ-534 (TOTP 2FA)**: `MfaEnabled`, `MfaSecret` (encrypted via `IDataProtector`), `MfaRecoveryCodes` (JSONB array of `{ hash, used_at }`), `MfaEnrolledAt`, `MfaLastUsedWindow` (RFC 6238 time-step counter — defends against in-window replay). +> +> `MfaEnabled`, `MfaSecret`, `MfaRecoveryCodes`, and `MfaLastUsedWindow` are `[JsonIgnore]` — they never leave the server in API responses. `PasswordHash` is also `[JsonIgnore]` (this attribute was always there). +> +> The `PasswordHash` column now holds an Argon2id PHC string for new + rehashed users (AZ-536); legacy SHA-384 entries still validate and are transparently upgraded on next successful login. 
+ ### User | Property | Type | Description | |----------|------|-------------| | `Id` | `Guid` | Primary key | | `Email` | `string` | Unique user email | -| `PasswordHash` | `string` | SHA-384 hash of plaintext password | -| `Hardware` | `string?` | Raw hardware fingerprint string (set on first resource access) | +| `PasswordHash` | `string` | Argon2id PHC string (`$argon2id$…`) for new users; legacy 64-char Base64 SHA-384 still accepted by `Security.VerifyPassword` | +| `Hardware` | `string?` | TOMBSTONED — kept nullable, not read or written by any code path (AZ-197 removed the hardware-binding feature) | | `Role` | `RoleEnum` | Authorization role | | `CreatedAt` | `DateTime` | Account creation timestamp | -| `LastLogin` | `DateTime?` | Last successful resource-check/hardware-check timestamp | +| `LastLogin` | `DateTime?` | Currently unused — left for forward compatibility | | `UserConfig` | `UserConfig?` | JSON-serialized user configuration | | `IsEnabled` | `bool` | Account active flag | +| `FailedLoginCount` | `int` | AZ-537 — consecutive failed-login counter; resets to 0 on success | +| `LockoutUntil` | `DateTime?` | AZ-537 — active lockout deadline (UTC). 
`>= now()` blocks login even with correct password | +| `MfaEnabled` | `bool` | AZ-534 — true after `/users/me/mfa/confirm` succeeds | +| `MfaSecret` | `string?` | AZ-534 — base32 TOTP secret encrypted at rest via `IDataProtector` (purpose `Azaion.Mfa.Secret.v1`) | +| `MfaRecoveryCodes` | `string?` | AZ-534 — JSONB array of `{ Hash, UsedAt }` | +| `MfaEnrolledAt` | `DateTime?` | AZ-534 — set by `Confirm` | +| `MfaLastUsedWindow` | `long?` | AZ-534 — RFC 6238 time-step counter of the most recently accepted code; rejects in-window replay | | Method | Signature | Description | |--------|-----------|-------------| @@ -41,22 +56,30 @@ Domain entity representing a system user, plus related value objects `UserConfig - `RoleEnum` ## Consumers -- All services (`UserService`, `AuthService`, `ResourcesService`) work with `User` +- All services (`UserService`, `AuthService`, `ResourcesService`, `MfaService`, `MissionTokenService`) work with `User` - `AzaionDb` exposes `ITable<User>` -- `AzaionDbSchemaHolder` maps `User` to the `users` PostgreSQL table +- `AzaionDbSchemaHolder` maps `User` to the `users` PostgreSQL table; `MfaRecoveryCodes` carries an explicit `DataType.BinaryJson` mapping so Npgsql sends the JSON oid (otherwise inserts fail with "column is of type jsonb but expression is of type text") - `SetUserQueueOffsetsRequest` uses `UserQueueOffsets` +- `Session` rows reference `User` via `UserId` (and via `AircraftId` for mission sessions targeting `RoleEnum.CompanionPC` users) ## Data Models -Maps to PostgreSQL table `users` with columns: `id`, `email`, `password_hash`, `hardware`, `role`, `user_config` (JSON text), `created_at`, `last_login`, `is_enabled`. 
+Maps to PostgreSQL table `users` with columns: `id`, `email`, `password_hash`, `hardware`, `role`, `user_config` (JSON text), `created_at`, `last_login`, `is_enabled`, `failed_login_count` (AZ-537), `lockout_until` (AZ-537), `mfa_enabled` (AZ-534), `mfa_secret` (AZ-534), `mfa_recovery_codes` (jsonb, AZ-534), `mfa_enrolled_at` (AZ-534), `mfa_last_used_window` (AZ-534). + +Migration files: `env/db/02_structure.sql` (initial), `03_add_timestamp_columns.sql`, `06_users_email_unique.sql` (UNIQUE INDEX on email), `07_auth_lockout_and_audit.sql` (AZ-537 lockout columns + `audit_events` table), `10_users_mfa.sql` (AZ-534 MFA columns). ## Configuration -None. +None directly. `MfaSecret` encryption depends on the application-level `DataProtection:KeysFolder` setting (Production must point this at a persistent volume). ## External Integrations -None. +None directly — but `MfaSecret` depends on ASP.NET Core DataProtection for at-rest encryption. ## Security -`PasswordHash` stores SHA-384 hash. `Hardware` stores raw hardware fingerprint (hashed for comparison via `Security.GetHWHash`). +- `PasswordHash` stores Argon2id PHC strings for new + rehashed users; legacy SHA-384 still accepted (lazy-migrated on next successful login). +- `MfaSecret` is encrypted at rest via `IDataProtector` (purpose `Azaion.Mfa.Secret.v1`). +- `MfaRecoveryCodes` are SHA-256-hashed at rest; the plaintext list is shown only in the `/users/me/mfa/enroll` response. +- `MfaLastUsedWindow` defends against in-window replay of the same TOTP code. +- `FailedLoginCount` + `LockoutUntil` enforce CMMC AC.L2-3.1.8 (lockout after 10 consecutive failed logins; 15-min default duration). +- `Hardware` is a tombstone (no application code reads or writes it) per AZ-197. ## Tests -Indirectly tested end-to-end via `e2e/Azaion.E2E/Tests/LoginTests.cs`, `UserManagementTests.cs`, and `DeviceTests.cs`. 
(The previous in-process `Azaion.Test/UserServiceTest` and `SecurityTest` were both removed by cycle 2 along with the `Azaion.Test` project.) +Indirectly tested end-to-end via `e2e/Azaion.E2E/Tests/LoginTests.cs`, `UserManagementTests.cs`, `DeviceTests.cs`, `RateLimitLockoutTests.cs`, `MfaEnrollmentTests.cs`, `MfaLoginTests.cs`. diff --git a/_docs/02_document/modules/common_requests_login_request.md b/_docs/02_document/modules/common_requests_login_request.md index db29305..11eb28a 100644 --- a/_docs/02_document/modules/common_requests_login_request.md +++ b/_docs/02_document/modules/common_requests_login_request.md @@ -3,6 +3,8 @@ ## Purpose Request DTO for the `/login` endpoint. +> **Cycle 2 (2026-05-14) note** — the `/login` response shape changed (AZ-531 added refresh tokens; AZ-534 added the MFA two-step branch), but the **request** body is unchanged. The new response DTOs live in companion files: see `common_requests_login_response.md` (`LoginResponse`, `RefreshTokenRequest`) and `common_requests_mfa_requests.md` (`MfaRequiredResponse`, `MfaLoginRequest`). The `Token` legacy single-token response is preserved via `LoginResponse.Token` for backward compatibility. + ## Public Interface | Property | Type | Description | @@ -17,8 +19,8 @@ None — pure data class. No FluentValidation validator defined for this request None. ## Consumers -- `Program.cs` `/login` endpoint — receives as request body -- `UserService.ValidateUser` — accepts as parameter +- `Program.cs` `/login` endpoint — receives as request body; the response is either `LoginResponse` (no MFA) or `MfaRequiredResponse` (MFA enabled) +- `UserService.ValidateUser` — accepts as parameter; throws lockout/rate-limit/wrong-password/disabled exceptions per AZ-537 + AZ-536 ## Data Models None. 
diff --git a/_docs/02_document/modules/common_requests_login_response.md b/_docs/02_document/modules/common_requests_login_response.md new file mode 100644 index 0000000..19f3c1e --- /dev/null +++ b/_docs/02_document/modules/common_requests_login_response.md @@ -0,0 +1,53 @@ +# Module: Azaion.Common.Requests.LoginResponse + RefreshTokenRequest + +## Purpose +Response DTO for `/login`, `/login/mfa`, and `/token/refresh` (dual-token shape), plus the request DTO for `/token/refresh`. + +> Added in cycle 2 (2026-05-14) by AZ-531 (Epic AZ-529, Refresh-token Flow). The pre-AZ-531 single-token `{ token }` shape is preserved via the `Token` accessor for backward compatibility — pre-AZ-531 clients see the same value via `Token` even though new clients consume `AccessToken` / `RefreshToken`. + +## Public Interface + +### LoginResponse + +| Property | Type | Description | +|----------|------|-------------| +| `AccessToken` | `string` | The 15-min ES256 JWT to be sent as `Authorization: Bearer <…>` on subsequent requests. | +| `AccessExp` | `DateTime` | Absolute expiry of `AccessToken` (UTC). | +| `RefreshToken` | `string` | Opaque base64url string (43 chars). Send to `/token/refresh` to rotate. NEVER decode — it is not a JWT. | +| `RefreshExp` | `DateTime` | Sliding expiry of the refresh token (UTC). | +| `Token` (read-only) | `string` | Backward-compat accessor returning `AccessToken`. Pre-AZ-531 clients that read `Token` keep working. | + +### RefreshTokenRequest + +| Property | Type | Description | +|----------|------|-------------| +| `RefreshToken` | `string` | The opaque token returned in the previous `LoginResponse.RefreshToken` (or in the previous successful `/token/refresh` response). | + +## Internal Logic +None — pure data classes. The `Token` getter is a read-only alias. + +## Dependencies +None. + +## Consumers +- `Program.cs` `/login` — returns `LoginResponse` (when MFA is not required) via the shared `IssueDualTokens` helper. 
+- `Program.cs` `/login/mfa` — returns `LoginResponse` via `IssueDualTokens` after second-factor success. +- `Program.cs` `/token/refresh` — accepts `RefreshTokenRequest`, returns `LoginResponse`. +- `RefreshTokenService.IssueForNewLogin` / `Rotate` — supplies the values that populate `LoginResponse`. + +## Data Models +None. + +## Configuration +None. + +## External Integrations +None. + +## Security +- `RefreshToken` is high-entropy (256 bits) and opaque. It is never logged and only ever returned in this response shape (HTTPS is mandatory in Production — see AZ-538 HSTS / HTTPS-redirect). +- `AccessToken` is a JWT carrying `sid`, `jti`, `amr`, role and email claims. Validation is configured in `Program.cs` (`ValidateIssuer`, `ValidateAudience`, `ValidateLifetime`, `ValidateIssuerSigningKey`, `ValidAlgorithms = [ES256]`). +- Backward-compat note — the `Token` accessor exists so pre-AZ-531 UI builds keep working during the transition. New clients should use `AccessToken` so they can also pick up `AccessExp` for proactive refresh scheduling. + +## Tests +- `e2e/Azaion.E2E/Tests/RefreshTokenTests.cs` — assertions on the shape (AC-1) and on rotation behaviour (AC-2..AC-5). diff --git a/_docs/02_document/modules/common_requests_mfa_requests.md b/_docs/02_document/modules/common_requests_mfa_requests.md new file mode 100644 index 0000000..8ae5ab8 --- /dev/null +++ b/_docs/02_document/modules/common_requests_mfa_requests.md @@ -0,0 +1,83 @@ +# Module: Azaion.Common.Requests.MfaRequests + +## Purpose +Request and response DTOs for the MFA enrollment / login surface introduced in cycle 2 by AZ-534 (Epic AZ-529, TOTP-based 2FA at credential login). All DTOs live in a single `MfaRequests.cs` file. + +## Public Interface + +### MfaEnrollRequest + +| Property | Type | Description | +|----------|------|-------------| +| `Password` | `string` | Re-auth required for enrollment (defends a stolen access token from silently flipping MFA on). 
| + +### MfaEnrollResponse + +| Property | Type | Description | +|----------|------|-------------| +| `Secret` | `string` | 32-char base32 TOTP shared secret. Shown once. | +| `OtpAuthUrl` | `string` | Standard `otpauth://` URL the authenticator app consumes. | +| `QrPngBase64` | `string` | PNG encoding of `OtpAuthUrl` (base64). UI inlines as `data:image/png;base64,…`. | +| `RecoveryCodes` | `string[]` | 10 single-use base32 codes (each ≥12 chars). Stored hashed in `users.mfa_recovery_codes`; the plaintext list is unrecoverable after this response. | + +### MfaConfirmRequest + +| Property | Type | Description | +|----------|------|-------------| +| `Code` | `string` | TOTP code that validates the enrolled secret. On success `users.mfa_enabled` flips to true. | + +### MfaDisableRequest + +| Property | Type | Description | +|----------|------|-------------| +| `Password` | `string` | Re-auth (same defence as enroll). | +| `Code` | `string` | A valid TOTP code (recovery codes are NOT accepted here — disable should be deliberate). | + +### MfaRequiredResponse + +Returned by `POST /login` when the user has MFA enabled instead of `LoginResponse`. + +| Property | Type | Description | +|----------|------|-------------| +| `MfaRequired` | `bool` | Always `true`. Lets dual-shape clients branch on a single field. | +| `MfaToken` | `string` | Short-lived (5 min) ES256 JWT with audience `azaion-mfa-step2`. Carry to `/login/mfa`. | +| `ExpiresIn` | `int` | Step-1 token TTL in seconds (300). | + +### MfaLoginRequest + +| Property | Type | Description | +|----------|------|-------------| +| `MfaToken` | `string` | The step-1 token from `MfaRequiredResponse`. | +| `Code` | `string` | A valid TOTP code OR a single-use recovery code. | + +## Internal Logic +None — pure data classes. + +## Dependencies +None. + +## Consumers +- `Program.cs` `/users/me/mfa/enroll` — `MfaEnrollRequest` → `MfaEnrollResponse`. +- `Program.cs` `/users/me/mfa/confirm` — `MfaConfirmRequest`. 
+- `Program.cs` `/users/me/mfa/disable` — `MfaDisableRequest`. +- `Program.cs` `/login` — returns `MfaRequiredResponse` when `user.MfaEnabled`. +- `Program.cs` `/login/mfa` — `MfaLoginRequest` → `LoginResponse`. +- `MfaService` — consumes every request type and produces the responses. + +## Data Models +None directly. + +## Configuration +None. + +## External Integrations +None — but `MfaToken` validation depends on `IJwtSigningKeyProvider` (ES256 keys) and `JwtConfig.Issuer`. + +## Security +- `Password` fields carry plaintext credentials; HTTPS is mandatory in Production (AZ-538 HSTS / HTTPS-redirect). +- `Secret` and `RecoveryCodes` are returned ONCE in `MfaEnrollResponse` — the client must show them immediately and never send them back. +- `MfaToken` is narrowly-scoped (audience `azaion-mfa-step2`) so it cannot be used against any non-MFA endpoint even if leaked. + +## Tests +- `e2e/Azaion.E2E/Tests/MfaEnrollmentTests.cs` — AC-1 (enroll shape), AC-2 (confirm), AC-5 (disable), AC-6 (encrypted at rest). +- `e2e/Azaion.E2E/Tests/MfaLoginTests.cs` — AC-3 (two-step + AMR claim), AC-4 (recovery code single-use). diff --git a/_docs/02_document/modules/common_requests_mission_session_request.md b/_docs/02_document/modules/common_requests_mission_session_request.md new file mode 100644 index 0000000..432ab9c --- /dev/null +++ b/_docs/02_document/modules/common_requests_mission_session_request.md @@ -0,0 +1,59 @@ +# Module: Azaion.Common.Requests.MissionSessionRequest + ValidRegion + MissionSessionResponse + +## Purpose +Request / response DTOs for `POST /sessions/mission` — pilot asks admin to mint a long-lived no-refresh access token for a single UAV flight. + +> Added in cycle 2 (2026-05-14) by AZ-533 (Epic AZ-529, Mission-token issuance for disconnected UAV operations). 
+ +## Public Interface + +### MissionSessionRequest + +| Property | Type | Required | Description | +|----------|------|----------|-------------| +| `MissionId` | `string` | Yes | Must match `^M-\d{4}-\d{2}-\d{2}-\d{3}$` (validated server-side; HTTP 400 with `InvalidMissionRequest` on miss). | +| `AircraftId` | `Guid` | Yes | The user id of the `CompanionPC` user representing the aircraft. Must exist; otherwise HTTP 400 with `AircraftNotFound`. | +| `PlannedDurationH` | `double` | Yes | ∈ `[0.1, 12.0]`. Outside range → 400 `InvalidMissionRequest`. The minted token's `exp` is `now + PlannedDurationH + 1.0 h` (the buffer covers post-flight reconnect grace). | +| `RequestedScope` | `IList?` | No | Optional permission strings stamped as multi-valued `permissions` claim. | +| `ValidRegion` | `ValidRegion?` | No | Optional bbox stamped as JSON-typed `valid_region` claim; informational until `satellite-provider` enforces it. | + +### ValidRegion + +| Property | Type | +|----------|------| +| `MinLat` / `MaxLat` / `MinLon` / `MaxLon` | `double` | + +### MissionSessionResponse + +| Property | Type | Description | +|----------|------|-------------| +| `AccessToken` | `string` | The ES256 JWT bound to the mission. Audience `satellite-provider`. | +| `AccessExp` | `DateTime` | Token expiry (UTC). | +| `TokenClass` | `string` | Always `"mission"`. | +| `SessionId` | `Guid` | The `sessions.id` row backing this token; verifiers see this in the `sid` claim. | + +## Internal Logic +None — pure data classes. Validation runs in `MissionTokenService.Validate`; the regex is compiled-once per process. + +## Dependencies +- `System.ComponentModel.DataAnnotations.Required` — surfaces 400 from minimal-API model binding when `MissionId` / `AircraftId` / `PlannedDurationH` are missing or unset. + +## Consumers +- `Program.cs` `/sessions/mission` — receives `MissionSessionRequest`, returns `MissionSessionResponse`. 
+- `MissionTokenService.Issue` — accepts `MissionSessionRequest`, returns `MissionSessionResponse`. + +## Data Models +None directly — `MissionTokenService` translates these DTOs into a `Session` row + JWT claims. + +## Configuration +None. + +## External Integrations +None directly. The minted token is consumed by the `satellite-provider` workspace; cross-workspace ticket coordinates verifier-side enforcement of the `mission_id` / `aircraft_id` / `valid_region` claims. + +## Security +- The `MissionId` regex defends against injection of arbitrary text into a claim that downstream verifiers may use for log correlation or ABAC decisions. +- The 12-hour upper bound on `PlannedDurationH` is a hard cap — any future expansion needs a deliberate config change with a security-review trigger because it directly extends the leak-window of any one mission token. + +## Tests +- `e2e/Azaion.E2E/Tests/MissionTokenTests.cs` — AC-1..AC-5 (lifetime, cap, claims, auto-revoke, auth required). diff --git a/_docs/02_document/modules/services_audit_log.md b/_docs/02_document/modules/services_audit_log.md new file mode 100644 index 0000000..9376fc5 --- /dev/null +++ b/_docs/02_document/modules/services_audit_log.md @@ -0,0 +1,60 @@ +# Module: Azaion.Services.AuditLog + +## Purpose +Append-only audit trail for security-relevant events (login attempts, lockouts, MFA lifecycle). Also exposes the per-account sliding-window failed-login count consumed by `UserService.ValidateUser`'s rate limit. + +> Added in cycle 2 (2026-05-14). Initially shipped with AZ-537 (login lockout + per-account rate-limit feed); MFA event types added by AZ-534 in the same cycle. + +## Public Interface + +### IAuditLog + +| Method | Signature | Description | +|--------|-----------|-------------| +| `RecordLoginFailed` | `Task RecordLoginFailed(string email, CancellationToken ct = default)` | Inserts `audit_events` row with `event_type='login_failed'`. 
| +| `RecordLoginLockout` | `Task RecordLoginLockout(string email, CancellationToken ct = default)` | Inserts `event_type='login_lockout'` (AZ-537 AC-6). | +| `RecordLoginSuccess` | `Task RecordLoginSuccess(string email, CancellationToken ct = default)` | Inserts `event_type='login_success'`. | +| `RecordMfaEnroll` / `RecordMfaConfirm` / `RecordMfaDisable` | `Task ...(string email, CancellationToken ct = default)` | MFA enrollment lifecycle. | +| `RecordMfaLoginSuccess` / `RecordMfaLoginFailed` / `RecordMfaRecoveryUsed` | `Task ...(string email, CancellationToken ct = default)` | MFA login outcomes. | +| `CountRecentFailedLogins` | `Task CountRecentFailedLogins(string email, int windowSeconds, CancellationToken ct = default)` | Number of `login_failed` rows for the email within the last `windowSeconds`. Drives the per-account sliding-window rate limit (AZ-537 AC-2). | + +## Internal Logic + +- **Email normalisation** — every insert and read lowercases the email (`ToLowerInvariant`) so case-variant addresses can't bypass the rate limit. +- **IP capture** — pulls `HttpContext.Connection.RemoteIpAddress` via `IHttpContextAccessor`. Null when there is no current request (background task). Null IPs are persisted as null, not omitted. +- **Insert path** uses `dbFactory.RunAdmin` (write privilege required); count uses `dbFactory.Run` (read-only). +- **Backing table** — `public.audit_events`, defined by `env/db/07_auth_lockout_and_audit.sql`. Supporting index `audit_events_event_type_email_idx (event_type, email, occurred_at DESC)` makes the per-account sliding-window count O(window-rows). + +## Dependencies + +- `IDbFactory` — read + admin connections +- `IHttpContextAccessor` — for the request IP +- `AuditEvent` entity, `AuditEventTypes` constants + +## Consumers + +- `UserService.ValidateUser` — calls `CountRecentFailedLogins` (per-account rate limit), `RecordLoginFailed`, `RecordLoginSuccess`, `RecordLoginLockout`. 
+- `MfaService` — calls every `RecordMfa*` method along the enroll/confirm/disable/login paths. + +## Data Models + +Operates on the `AuditEvent` entity via `AzaionDb.AuditEvents` table. + +## Configuration + +None directly. The window/threshold constants live on `AuthConfig.RateLimit` and `AuthConfig.Lockout`, consumed by the caller (`UserService.ValidateUser`). + +## External Integrations + +PostgreSQL via `IDbFactory`. + +## Security + +- Append-only by convention — no UPDATE/DELETE in code, and `azaion_admin` only has `INSERT, SELECT` on the table. +- The IP and email are PII; access to the table is gated to `azaion_admin` (insert + read) and `azaion_reader` (read-only). No public endpoint surfaces audit rows directly. +- The per-account sliding-window count is the foundation of CMMC AC.L2-3.1.8 enforcement; tampering with `audit_events` bypasses the rate limit. + +## Tests + +- `e2e/Azaion.E2E/Tests/RateLimitLockoutTests.cs` — exercises `RecordLoginFailed` + `CountRecentFailedLogins` end-to-end via the lockout/rate-limit ACs. +- `e2e/Azaion.E2E/Tests/MfaEnrollmentTests.cs` and `MfaLoginTests.cs` — assert the corresponding MFA `audit_events` rows after each lifecycle event. diff --git a/_docs/02_document/modules/services_auth_service.md b/_docs/02_document/modules/services_auth_service.md index dd7bd79..96e2435 100644 --- a/_docs/02_document/modules/services_auth_service.md +++ b/_docs/02_document/modules/services_auth_service.md @@ -1,48 +1,81 @@ # Module: Azaion.Services.AuthService ## Purpose -JWT token creation and current-user resolution from HTTP context claims. +Mints short-lived (15 min) ES256 access tokens and resolves the current user from HTTP context claims. + +> **Cycle 2 (2026-05-14) note (AZ-531 / AZ-532 / AZ-534)** — `CreateToken` was completely reshaped: +> - Signing switched from HMAC-HS256 (`JwtConfig.Secret`) to ES256 via `IJwtSigningKeyProvider` (AZ-532). 
+> - Lifetime is now `JwtConfig.AccessTokenLifetimeMinutes` (default 15) instead of the old `TokenLifetimeHours` (default 4). +> - Tokens stamp two new claims required by the refresh / logout flow: `sid` (session id) and `jti` (per-token unique id). +> - Tokens stamp the RFC 8176 `amr` claim (multi-valued; defaults to `["pwd"]`, becomes `["pwd","mfa"]` after `/login/mfa`, with `"recovery"` appended when a recovery code was used). +> - Returns an `AccessToken` record (`Jwt` + `ExpiresAt`) so callers can populate `LoginResponse.AccessExp` directly. ## Public Interface ### IAuthService | Method | Signature | Description | |--------|-----------|-------------| -| `GetCurrentUser` | `Task GetCurrentUser()` | Extracts email from JWT claims, returns full User entity | -| `CreateToken` | `string CreateToken(User user)` | Generates a signed JWT token for the given user | +| `GetCurrentUser` | `Task GetCurrentUser()` | Reads `ClaimTypes.Name` from `HttpContext.User`, delegates to `IUserService.GetByEmail`. | +| `CreateToken` | `AccessToken CreateToken(User user, Guid sessionId, Guid jti, IEnumerable? amr = null)` | Mints a 15-min ES256 access token bound to `sessionId`/`jti`, with the supplied `amr` values. | + +### `record AccessToken(string Jwt, DateTime ExpiresAt)` + +The token string + its absolute expiry (UTC). `Program.cs` packs this into `LoginResponse.AccessToken` / `LoginResponse.AccessExp`. ## Internal Logic -- **GetCurrentUser**: reads `ClaimTypes.Name` from `HttpContext.User.Claims`, then delegates to `IUserService.GetByEmail`. -- **CreateToken**: builds a `SecurityTokenDescriptor` with claims (NameIdentifier = user ID, Name = email, Role = role), signs with HMAC-SHA256 using the configured secret, sets expiry from `JwtConfig.TokenLifetimeHours`. -Private method: -- `GetCurrentUserEmail` — extracts email from claims dictionary. 
+- **CreateToken** builds claims: + - `ClaimTypes.NameIdentifier` = `user.Id` + - `ClaimTypes.Name` = `user.Email` + - `ClaimTypes.Role` = `user.Role.ToString()` + - `JwtRegisteredClaimNames.Sid` = `sessionId.ToString()` + - `JwtRegisteredClaimNames.Jti` = `jti.ToString()` + - One `amr` claim per element of the `amr` parameter (defaults to `["pwd"]`). +- Signs with `SigningCredentials(active.SecurityKey, SecurityAlgorithms.EcdsaSha256)` using the active key from `IJwtSigningKeyProvider`. The `kid` JWT header is auto-stamped because `ECDsaSecurityKey.KeyId` is set per loaded key. +- Lifetime: `now + JwtConfig.AccessTokenLifetimeMinutes`. +- **GetCurrentUser**: reads `ClaimTypes.Name` from `HttpContext.User.Claims` and delegates to `IUserService.GetByEmail` (which is cached). ## Dependencies + - `IHttpContextAccessor` — for accessing current HTTP context -- `IOptions` — JWT configuration +- `IOptions` — `Issuer`, `Audience`, `AccessTokenLifetimeMinutes` +- `IJwtSigningKeyProvider` (cycle 2 — ES256 active key) - `IUserService` — for `GetByEmail` lookup - `System.IdentityModel.Tokens.Jwt` - `Microsoft.IdentityModel.Tokens` ## Consumers -- `Program.cs` `/login` endpoint — calls `CreateToken` after successful validation -- `Program.cs` `/users/current` — calls `GetCurrentUser` (the previously listed `/resources/get`, `/resources/get-installer`, `/resources/check` consumers were removed in cycle 2 / by AZ-197 along with their endpoints) + +- `Program.cs` `/login` (after `UserService.ValidateUser`) → calls `CreateToken` via the shared `IssueDualTokens` helper. +- `Program.cs` `/login/mfa` → calls `CreateToken` with `amr` from `MfaService.VerifyForLogin`. +- `Program.cs` `/token/refresh` → calls `CreateToken` with `amr` reconstructed from the session's `MfaAuthenticated` flag. +- `Program.cs` `/users/current` → calls `GetCurrentUser`. 
+- `MfaService.IssueMfaStepToken` and `MissionTokenService.MintToken` mint their own tokens directly (separate audiences); they bypass `AuthService.CreateToken` on purpose. ## Data Models None. ## Configuration -Uses `JwtConfig` (Issuer, Audience, Secret, TokenLifetimeHours). + +`JwtConfig`: +- `Issuer`, `Audience` — claim values +- `AccessTokenLifetimeMinutes` (default 15) — access TTL +- `KeysFolder`, `ActiveKid` — signing key selection (consumed via `IJwtSigningKeyProvider`) + +The legacy `JwtConfig.Secret` field is **no longer read** — the codebase keeps the property only as a temporary rollback escape hatch and to avoid breaking any environment that still binds it. ## External Integrations -None. +None directly. Signing key material lives on disk in `JwtConfig.KeysFolder` (default `secrets/jwt-keys/`). ## Security -- Token includes user ID, email, and role as claims -- Signed with HMAC-SHA256 -- Expiry controlled by `TokenLifetimeHours` config -- Token validation parameters are configured in `Program.cs` (ValidateIssuer, ValidateAudience, ValidateLifetime, ValidateIssuerSigningKey) + +- Asymmetric ES256 signing — verifiers hold only the public key set (served at `/.well-known/jwks.json`). A compromised verifier can no longer mint admin tokens. +- `ValidAlgorithms = [SecurityAlgorithms.EcdsaSha256]` is pinned in `Program.cs` JwtBearer config to defeat the alg-confusion attack (forging a token with `alg=HS256` using the public key as the HMAC secret). +- Every token now carries `sid` and `jti`. `sid` is the AZ-535 logout / family-revocation key; `jti` reserves the option of a per-access denylist if revocation latency ever needs to drop below the verifier-poll interval. +- The 15-min access TTL plus refresh-token rotation (AZ-531) constrains the leak-window of a stolen access token to <15 min. ## Tests -None. +- `e2e/Azaion.E2E/Tests/RefreshTokenTests.cs` (AC-1, AC-2) — verifies `AccessExp ≈ now + 15m` and that rotation produces a fresh access token. 
+- `e2e/Azaion.E2E/Tests/JwksTests.cs` (AC-1) — verifies `alg=ES256` and `kid` header on issued tokens. +- `e2e/Azaion.E2E/Tests/MfaLoginTests.cs` (AC-3) — verifies the `amr` claim ordering across the two-step login. +- `e2e/Azaion.E2E/Tests/LogoutTests.cs` — exercises the `sid` claim path. diff --git a/_docs/02_document/modules/services_jwt_signing_key_provider.md b/_docs/02_document/modules/services_jwt_signing_key_provider.md new file mode 100644 index 0000000..2556720 --- /dev/null +++ b/_docs/02_document/modules/services_jwt_signing_key_provider.md @@ -0,0 +1,75 @@ +# Module: Azaion.Services.JwtSigningKeyProvider + +## Purpose +Loads ES256 JWT signing keys from a directory of `*.pem` files. One key is "active" (used to sign new tokens); the rest stay in the JWKS feed so in-flight tokens minted with older kids still verify during a rotation overlap window. + +> Added in cycle 2 (2026-05-14) by AZ-532 (Epic AZ-529, Auth Mechanism Modernization). Replaces the HS256 shared-secret path; `JwtConfig.Secret` is no longer read by the codebase. + +## Public Interface + +### IJwtSigningKeyProvider + +| Member | Type | Description | +|--------|------|-------------| +| `Active` | `JwtSigningKey` | The key the codebase uses to sign new tokens. Selected by `JwtConfig.ActiveKid`; falls back to the first key by filename (sorted ordinal) with a startup log warning if no `ActiveKid` is set. | +| `All` | `IReadOnlyList` | Every loaded key, ordered by `Kid`. Surfaced through `/.well-known/jwks.json`. | + +### JwtSigningKey + +| Property | Type | Description | +|----------|------|-------------| +| `Kid` | `string` | Filename without `.pem` extension. | +| `Ecdsa` | `ECDsa` | Underlying ECDSA instance (P-256). | +| `SecurityKey` | `ECDsaSecurityKey` | Microsoft.IdentityModel wrapper with `KeyId = Kid`. 
| + +## Internal Logic + +- **Eager construction** — built at host construction time in `Program.cs` (before DI is finalized) so `JwtBearer` can resolve issuer signing keys via the same instance DI registers as a singleton. Failures are fail-fast at startup, not at first-request. +- **Discovery** — `Directory.EnumerateFiles(folder, "*.pem")`, sorted ordinal. Empty folder or missing folder throws `InvalidOperationException` with a pointer to `scripts/generate-jwt-key.sh`. +- **Curve enforcement** — `EnsureP256` rejects any key whose curve OID is not `1.2.840.10045.3.1.7` / `nistP256` / `ECDSA_P256`. ES256 ⇒ P-256; the wrong curve would silently break verifiers expecting ES256. +- **Disposal** — `IDisposable` releases every loaded `ECDsa` instance. + +## Dependencies + +- `IOptions` — `KeysFolder` and `ActiveKid` +- `ILogger` — fallback warning when `ActiveKid` is unset +- System.Security.Cryptography (ECDsa, PEM import) +- Microsoft.IdentityModel.Tokens (ECDsaSecurityKey) + +## Consumers + +- `Program.cs` — registered as singleton; supplies the `IssuerSigningKeyResolver` for `JwtBearer` and is shared with `AuthService` / `MfaService` / `MissionTokenService` for signing. +- `AuthService.CreateToken` — uses `Active.SecurityKey` for `SigningCredentials`. +- `MfaService.IssueMfaStepToken` / `ValidateMfaStepToken` — same. +- `MissionTokenService.MintToken` — same. +- `Program.cs` `/.well-known/jwks.json` — exposes `All` as the JWKS feed. + +## Data Models + +None. + +## Configuration + +`JwtConfig.KeysFolder` (default `secrets/jwt-keys`) — directory containing one PEM per key. +`JwtConfig.ActiveKid` — kid of the currently-signing key. If unset, the first key by filename wins (with a startup log warning). + +## External Integrations + +Filesystem (read-only on `KeysFolder`). + +## Security + +- Private key material lives only on disk and in process memory. The JWKS endpoint exports public components only (`x`, `y` for EC). 
+- Key files are written with mode `600` by `scripts/generate-jwt-key.sh` (the generator script runs `chmod` after `openssl ecparam`). +- Curve pinning prevents accidental signing with a non-P-256 key that would silently break ES256 verifiers. +- Rotation procedure (per AZ-532 spec, also documented in `scripts/generate-jwt-key.sh`): + 1. Generate a new PEM with `scripts/generate-jwt-key.sh ` next to the existing one. + 2. Restart admin — JWKS now exposes both kids; the OLD kid is still active for signing. + 3. Wait out the verifier-cache TTL (`Cache-Control: max-age=3600` = 1 h). + 4. Set `JwtConfig__ActiveKid=` and restart admin. + 5. Wait until all old-kid access tokens have expired (TTL = 15 min). + 6. Delete the old PEM and restart admin — JWKS now lists only the new kid. + +## Tests + +- `e2e/Azaion.E2E/Tests/JwksTests.cs` — AC-1 (alg=ES256, kid present), AC-2 (JWKS shape + max-age=3600), AC-3 (two-key overlap during rotation), AC-4 (no private fields in JWKS), AC-5 (alg-confusion attack rejected via pinned `ValidAlgorithms`). diff --git a/_docs/02_document/modules/services_mfa_service.md b/_docs/02_document/modules/services_mfa_service.md new file mode 100644 index 0000000..b7c0ea0 --- /dev/null +++ b/_docs/02_document/modules/services_mfa_service.md @@ -0,0 +1,73 @@ +# Module: Azaion.Services.MfaService + +## Purpose +RFC 6238 TOTP-based 2FA at credential login. Manages enrollment, confirmation, disable, and second-factor verification, and issues the short-lived step-1 JWT carried between `/login` and `/login/mfa`. + +> Added in cycle 2 (2026-05-14) by AZ-534 (Epic AZ-529). Per-user opt-in initially; no policy yet enforces MFA by role. AZ-533 mission-token issuance has a TODO to require `amr=["pwd","mfa"]` once MFA adoption is established. 
+ +## Public Interface + +### IMfaService + +| Method | Signature | Description | +|--------|-----------|-------------| +| `Enroll` | `Task Enroll(Guid userId, string password, CancellationToken ct = default)` | Generates a TOTP secret + 10 single-use recovery codes, persists the encrypted secret + hashed recovery codes, returns the secret/otpauth-url/QR/recovery codes (ONCE — recovery codes are unrecoverable after this response). Requires fresh password re-auth. `mfa_enabled` stays false until `Confirm`. | +| `Confirm` | `Task Confirm(Guid userId, string code, CancellationToken ct = default)` | Validates one TOTP code against the enrolled secret; on success sets `mfa_enabled=true`. | +| `Disable` | `Task Disable(Guid userId, string password, string code, CancellationToken ct = default)` | Removes MFA; requires both password re-auth and a valid TOTP code (no recovery-code substitution here — disable should be deliberate). | +| `IssueMfaStepToken` | `string IssueMfaStepToken(Guid userId)` | Mints a 5-minute ES256 JWT (audience `azaion-mfa-step2`) returned at `/login` step-1 when the user has MFA enabled. The client carries it back to `/login/mfa`. | +| `ValidateMfaStepToken` | `Guid ValidateMfaStepToken(string token)` | Decodes a step-1 token, returns the userId. Throws `BusinessException(InvalidMfaToken)` on bad signature, audience mismatch, or expiry. | +| `VerifyForLogin` | `Task VerifyForLogin(Guid userId, string code, CancellationToken ct = default)` | Step-2 verification at login. Returns the AMR array the access token should carry — `["pwd","mfa"]` for TOTP success, `["pwd","mfa","recovery"]` if a recovery code was consumed. Throws `BusinessException(InvalidMfaCode)` on failure. | + +## Internal Logic + +- **Secret generation**: 20-byte (160-bit) random key per RFC 6238 §3, encoded as 32-char base32. Stored encrypted at rest via `IDataProtector` (purpose `Azaion.Mfa.Secret.v1`). 
+- **otpauth URL**: built via `OtpUri` (Otp.NET) with SHA-1 / 6 digits / 30-sec period — RFC 6238 defaults. +- **QR**: PNG generated via `QRCoder.QRCodeGenerator` (ECCLevel.M), returned as base64. The endpoint hands the raw PNG bytes back; the UI inlines the data URL. +- **Recovery codes**: 10 codes, each 10 random bytes → 16-char base32. Stored as `{ Hash, UsedAt }` JSON array; hash is SHA-256 hex (high-entropy secret → fast hash is appropriate, same reasoning as the refresh-token store). Single-use enforcement via the `UsedAt` field plus a conditional update on the prior JSON to defend against concurrent-use races. +- **TOTP verification** uses Otp.NET's `Totp.VerifyTotp` with `VerificationWindow.RfcSpecifiedNetworkDelay` (±1 step). Each successful verification persists the matched time-step counter to `users.mfa_last_used_window`; subsequent codes with `matched_window <= last_used_window` are rejected to prevent in-window replay. +- **Step-1 token**: ES256 JWT with audience `azaion-mfa-step2` (intentionally distinct from the main `JwtConfig.Audience` so the main JwtBearer middleware rejects it). Lifetime 5 min — matches AZ-534 AC-3. +- **Disable's raw SQL** — setting `mfa_recovery_codes` (jsonb) back to NULL via the LinqToDB UPDATE expression API sends an untyped NULL literal that Postgres parses as text and rejects (42804). A small parameterized SQL avoids the type-inference dance. 
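As a rough illustration of the recovery-code and replay-guard rules listed under Internal Logic, here is a minimal, language-neutral sketch in Python (the service itself is C#). The function names are hypothetical, and hashing the ASCII code string rather than the raw bytes is an assumption; the source only states that the stored hash is SHA-256 hex.

```python
import base64
import hashlib
import secrets

RECOVERY_CODE_COUNT = 10  # "10 codes" per the Internal Logic notes
RECOVERY_CODE_BYTES = 10  # 80 bits, which base32-encodes to exactly 16 chars


def generate_recovery_codes():
    """Return (plaintext_codes, stored_records).

    Hypothetical helper: plaintext is shown to the user once; only the
    {Hash, UsedAt} records would be persisted.
    """
    plaintext, stored = [], []
    for _ in range(RECOVERY_CODE_COUNT):
        raw = secrets.token_bytes(RECOVERY_CODE_BYTES)
        code = base64.b32encode(raw).decode("ascii")  # 10 bytes: no '=' padding
        plaintext.append(code)
        # Assumption: the hash input is the ASCII code string.
        digest = hashlib.sha256(code.encode("ascii")).hexdigest()
        stored.append({"Hash": digest, "UsedAt": None})
    return plaintext, stored


def accept_totp_window(matched_window, last_used_window):
    """Replay guard: accept only a time step strictly newer than the last one used."""
    return last_used_window is None or matched_window > last_used_window
```

The strict `>` comparison mirrors the rule that codes with `matched_window <= last_used_window` are rejected, so a code replayed inside its own 30-second step fails on the second attempt.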
+ +## Dependencies + +- `IDbFactory` — admin connection for user updates +- `IUserService` — user lookup by id +- `IDataProtectionProvider` — encrypts `mfa_secret` at rest (key storage configured via `DataProtection:KeysFolder`; defaults to per-machine ephemeral) +- `IJwtSigningKeyProvider` — ES256 signing for the step-1 token +- `IOptions` — issuer for the step-1 token +- `IAuditLog` — emits `mfa_enroll` / `mfa_confirm` / `mfa_disable` / `mfa_login_success` / `mfa_login_failed` / `mfa_recovery_used` +- `Security` — password verification (Argon2id) for re-auth on enroll/disable +- Otp.NET (TOTP), QRCoder (PNG generation) + +## Consumers + +- `Program.cs` `/users/me/mfa/enroll`, `/users/me/mfa/confirm`, `/users/me/mfa/disable` +- `Program.cs` `/login` — calls `IssueMfaStepToken` when `user.MfaEnabled` +- `Program.cs` `/login/mfa` — calls `ValidateMfaStepToken` then `VerifyForLogin` + +## Data Models + +Operates on the `User` entity (`mfa_enabled`, `mfa_secret`, `mfa_recovery_codes`, `mfa_enrolled_at`, `mfa_last_used_window` columns added by `env/db/10_users_mfa.sql`). + +## Configuration + +- `JwtConfig.Issuer` — used as the `iss` of the step-1 token. +- `DataProtection:KeysFolder` (production must set this to a persistent volume so encrypted MFA secrets survive container restarts; without it the per-machine ephemeral key store is discarded on redeploy and every stored MFA secret becomes undecryptable). + +## External Integrations + +PostgreSQL via `IDbFactory`. ASP.NET Core DataProtection for at-rest encryption. + +## Security + +- TOTP secret is base32 (32 chars) encrypted at rest with `IDataProtector`. Plaintext only exists in memory during enroll/verify. +- Recovery codes are SHA-256-hashed in the DB; the plaintext list is shown ONCE in `MfaEnrollResponse` and unrecoverable thereafter. +- `mfa_last_used_window` defends against in-window replay (a code presented twice within 30 s is rejected the second time). 
+- Step-1 JWT carries a narrowed audience (`azaion-mfa-step2`); the main JwtBearer middleware accepts only `JwtConfig.Audience` and rejects this token for any non-MFA endpoint. +- Re-auth with password is required for enroll and disable; this defends against a stolen access token being used to silently flip MFA state. +- **Known follow-up F2 (carried forward from Cycle 2 batch 4 review)**: `TryConsumeRecoveryCode` returns `true` even when the conditional update affects 0 rows — concurrent double-spend of the same recovery code is possible (low practical risk, but a real correctness gap). + +## Tests + +- `e2e/Azaion.E2E/Tests/MfaEnrollmentTests.cs` — AC-1 (enroll shape), AC-2 (confirm), AC-5 (disable), AC-6 (encrypted at rest). +- `e2e/Azaion.E2E/Tests/MfaLoginTests.cs` — AC-3 (two-step flow + AMR claim), AC-4 (recovery-code single-use). diff --git a/_docs/02_document/modules/services_mission_token_service.md b/_docs/02_document/modules/services_mission_token_service.md new file mode 100644 index 0000000..42db9e0 --- /dev/null +++ b/_docs/02_document/modules/services_mission_token_service.md @@ -0,0 +1,69 @@ +# Module: Azaion.Services.MissionTokenService + +## Purpose +Issues long-lived, single-mission access tokens for offline UAV missions (planned duration ≤ 12 h, plus a 1 h buffer). Distinct from `AuthService.CreateToken` because: +- Lifetime is per-mission (`planned_duration_h + 1 h` buffer), not the 15-minute interactive policy. +- Audience is narrowed to `satellite-provider`, not the broad admin audience. +- No refresh: a single token covers the entire flight, then dies. +- Carries mission-specific claims (`mission_id`, `aircraft_id`, `valid_region`, `permissions`). + +> Added in cycle 2 (2026-05-14) by AZ-533 (Epic AZ-529). Solves the "10 h offline UAV vs. 15 min interactive access token" tension without weakening interactive-session security. 
+ +## Public Interface + +### IMissionTokenService + +| Method | Signature | Description | +|--------|-----------|-------------| +| `Issue` | `Task Issue(Guid pilotUserId, MissionSessionRequest request, CancellationToken ct = default)` | Validates the request, persists a `class='mission'` row in `sessions`, mints an ES256 access token bound to that session id, returns the token + expiry + session id. | + +## Internal Logic + +- **Validation**: + - `mission_id` must match `^M-\d{4}-\d{2}-\d{2}-\d{3}$` (compiled regex). + - `planned_duration_h` ∈ `[0.1, 12.0]` — anything outside throws `BusinessException(InvalidMissionRequest)` (HTTP 400). + - `aircraft_id` must exist in `users` with `Role=CompanionPC`; otherwise `BusinessException(AircraftNotFound)`. +- **Session row first, then token** — the row is inserted *before* the JWT is minted so revocation lookups can never miss a token already in the wild. +- **Lifetime** = `planned_duration_h + 1.0` (the 1-hour buffer covers post-flight reconnect grace). +- **Family handling** — mission sessions are their own family (`family_id = id`); they never rotate. +- **Token claims**: `sub` = pilotUserId, `sid` = session id, `jti` = unique token id, `mission_id`, `aircraft_id`, `token_class="mission"`, optional `permissions` (multi-valued), optional `valid_region` (JSON-typed claim). Audience pinned to `satellite-provider`. +- **Auto-revoke on reconnect** is implemented in `Program.cs` via `ISessionService.RevokeMissionsForAircraft`, fired from `/login` and `/token/refresh` whenever the caller is a `CompanionPC` user. 
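The validation and lifetime rules above can be sketched as follows. This is a language-neutral illustration in Python, not the C# `MissionTokenService` code: `mission_token_expiry` and the `InvalidMissionRequest` exception class are hypothetical names, the aircraft lookup is omitted, and only the regex, the `[0.1, 12.0]` range, and the 1-hour buffer come from the text.

```python
import re
from datetime import datetime, timedelta, timezone

# Constants as listed under Internal Logic / Configuration.
MISSION_ID_PATTERN = re.compile(r"^M-\d{4}-\d{2}-\d{2}-\d{3}$")
MIN_DURATION_HOURS = 0.1
MAX_DURATION_HOURS = 12.0
LIFETIME_BUFFER_HOURS = 1.0


class InvalidMissionRequest(ValueError):
    """Hypothetical stand-in for BusinessException(InvalidMissionRequest)."""


def mission_token_expiry(mission_id, planned_duration_h, now):
    """Validate the request fields and return the token's expiry instant."""
    if not MISSION_ID_PATTERN.fullmatch(mission_id):
        raise InvalidMissionRequest("mission_id does not match M-YYYY-MM-DD-NNN")
    # A NaN duration fails this comparison and is rejected as well.
    if not (MIN_DURATION_HOURS <= planned_duration_h <= MAX_DURATION_HOURS):
        raise InvalidMissionRequest("planned_duration_h outside [0.1, 12.0]")
    return now + timedelta(hours=planned_duration_h + LIFETIME_BUFFER_HOURS)
```

Under these rules a 10-hour mission yields an 11-hour token, and the longest possible token lives 13 hours (12 h cap plus the buffer).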
+ +## Dependencies + +- `IDbFactory` — admin connection for inserting the mission session row, read connection for the aircraft existence check +- `IJwtSigningKeyProvider` — ES256 active key +- `IOptions` — issuer +- `Session` entity, `SessionClasses.Mission` constant +- `MissionSessionRequest` / `MissionSessionResponse` DTOs + +## Consumers + +- `Program.cs` `/sessions/mission` (requires interactive auth; per AZ-533 the AC-6 step-up MFA gate is a TODO until org-wide MFA adoption) + +## Data Models + +Operates on the `Session` entity (`class='mission'`, `aircraft_id` set, `refresh_hash` null). + +## Configuration + +- `JwtConfig.Issuer` — issuer claim of the minted token. +- Hard-coded constants: + - `MissionAudience = "satellite-provider"` — verifier-side audience gate. + - `MaxDurationHours = 12.0`, `MinDurationHours = 0.1`. + - `LifetimeBufferHours = 1.0`. + +## External Integrations + +PostgreSQL via `IDbFactory`. Token consumed by the `satellite-provider` workspace (verifier-side enforcement of `mission_id`/`aircraft_id`/`valid_region` is filed under that workspace). + +## Security + +- Long-lived tokens are inherently dangerous if leaked. Hardware binding (mTLS / DPoP / `cnf`) is the long-term answer; documented as a known risk in `_docs/05_security/security_report.md`. +- The narrowed audience (`satellite-provider`) prevents a stolen mission token from being usable against the admin API itself; admin endpoints still require `JwtConfig.Audience`. +- The `valid_region` bbox is informational until `satellite-provider` enforces it (cross-workspace coordination ticket). +- Mission tokens are auto-revoked the moment the aircraft reconnects (`/login` or `/token/refresh` from a `CompanionPC` user). Verifiers polling `/sessions/revoked` see the revocation within their poll interval (≤ 30 s). 
+ +## Tests + +- `e2e/Azaion.E2E/Tests/MissionTokenTests.cs` — AC-1 (correct lifetime + claims), AC-2 (12h cap), AC-3 (scope claims), AC-4 (auto-revoke on reconnect), AC-5 (auth required). diff --git a/_docs/02_document/modules/services_refresh_token_service.md b/_docs/02_document/modules/services_refresh_token_service.md new file mode 100644 index 0000000..b45991a --- /dev/null +++ b/_docs/02_document/modules/services_refresh_token_service.md @@ -0,0 +1,63 @@ +# Module: Azaion.Services.RefreshTokenService + +## Purpose +Issues, rotates, and validates opaque refresh tokens for interactive sessions. Implements OAuth 2.1 §6.1 reuse-detection: presenting an already-rotated refresh token kills the entire session family. + +> Added in cycle 2 (2026-05-14) by AZ-531 (Epic AZ-529, Auth Mechanism Modernization). Foundation for AZ-535 (logout/revocation) and AZ-534 (MFA — pins `mfa_authenticated` to the session so refresh rotation inherits the original AMR strength). + +## Public Interface + +### IRefreshTokenService + +| Method | Signature | Description | +|--------|-----------|-------------| +| `IssueForNewLogin` | `Task<(string OpaqueToken, Session Session)> IssueForNewLogin(Guid userId, bool mfaAuthenticated = false, CancellationToken ct = default)` | Mint a fresh refresh token at login; starts a new session family. The opaque token is returned to the caller; only its SHA-256 hash is persisted. `mfaAuthenticated` is pinned to the session row so rotation preserves AMR strength. | +| `Rotate` | `Task<(string OpaqueToken, Session Session)> Rotate(string opaqueToken, CancellationToken ct = default)` | Rotate the supplied refresh token. On success returns a new opaque token + the new session row. Throws `BusinessException(InvalidRefreshToken)` on bad/expired/revoked input; on reuse-detection (already-rotated token presented again) the entire session family is revoked first. 
| + +## Internal Logic + +- **Token format**: 32 random bytes (256 bits) base64url-encoded → 43-char string (no padding). Persisted as the SHA-256 hex digest in `sessions.refresh_hash`. The opaque value is never logged. +- **Family semantics**: each `IssueForNewLogin` creates a new family (`family_id == id`). Each `Rotate` inserts a new row in the same family with `parent_session_id` chained to the previous row, then marks the previous row `revoked_reason='rotated'`. +- **Reuse detection**: if a presented token is found with `revoked_reason='rotated'`, every active row in the same family is set to `revoked_reason='reuse_detected'` (per OAuth 2.1 §6.1) — even the row that succeeded last cycle stops working. +- **Sliding expiry**: each rotation moves `expires_at` to `now + RefreshSlidingHours` (default 8 h). +- **Absolute cap**: a family older than `RefreshAbsoluteHours` (default 12 h) since `family_started_at` is rejected even if every individual rotation stayed within the sliding window. +- **Concurrency**: rotation runs in a `Serializable` transaction so two concurrent refreshes of the same token can't both succeed. + +## Dependencies + +- `IDbFactory` — admin connection for inserts/updates +- `IOptions` — sliding/absolute window TTLs (defined alongside `JwtConfig` in `Azaion.Common/Configs/JwtConfig.cs`) +- `Session` entity, `SessionRevokedReasons` constants +- `BusinessException` / `ExceptionEnum.InvalidRefreshToken` +- `System.Security.Cryptography.RandomNumberGenerator` + `SHA256` + +## Consumers + +- `Program.cs` `/login` → calls `IssueForNewLogin` after `UserService.ValidateUser` succeeds +- `Program.cs` `/login/mfa` → calls `IssueForNewLogin` after MFA second factor +- `Program.cs` `/token/refresh` → calls `Rotate` + +## Data Models + +Operates on the `Session` entity via `AzaionDb.Sessions` table. 
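The token format and the two expiry rules described under Internal Logic can be illustrated with a short, language-neutral Python sketch (the service is C#). The function names are hypothetical, and hashing the encoded token string rather than the raw bytes is an assumption; the defaults of 8 h and 12 h are taken from the text.

```python
import base64
import hashlib
import secrets
from datetime import datetime, timedelta, timezone


def new_refresh_token():
    """Return (opaque_token, refresh_hash); only the hash would be persisted."""
    raw = secrets.token_bytes(32)  # 256 bits of randomness
    # base64url without padding: 32 bytes encode to 43 characters.
    opaque = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    # Assumption: the SHA-256 input is the encoded token string.
    return opaque, hashlib.sha256(opaque.encode("ascii")).hexdigest()


def next_sliding_expiry(now, sliding_hours=8):
    """Each rotation moves expires_at to now + RefreshSlidingHours."""
    return now + timedelta(hours=sliding_hours)


def family_expired(family_started_at, now, absolute_hours=12):
    """Absolute cap: a family older than RefreshAbsoluteHours is rejected."""
    return now - family_started_at > timedelta(hours=absolute_hours)
```

The two checks are independent: a session that rotates every few hours never trips the sliding window, but it is still cut off once the family passes the absolute cap.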
+ +## Configuration + +`SessionConfig` (bound from `appsettings.json` section `SessionConfig`): +- `RefreshSlidingHours` (default 8) +- `RefreshAbsoluteHours` (default 12) + +## External Integrations + +PostgreSQL via `IDbFactory`. + +## Security + +- Refresh tokens are opaque random strings, never JWTs — verifiers cannot decode or alter them. +- The plaintext token leaves the server only at issue/rotation; the DB stores only the SHA-256 hash. +- Reuse-detection is the primary defence against stolen-refresh-token attacks: the legitimate user's next refresh will be rejected and they'll be forced to re-authenticate, but the attacker's token also dies. +- Rotation is transactional (`Serializable`) so concurrent refresh races cannot leak two valid descendants. + +## Tests + +- `e2e/Azaion.E2E/Tests/RefreshTokenTests.cs` — covers AC-1 (login dual tokens), AC-2 (rotation invalidates old), AC-3 (reuse kills family), AC-4 (sliding + absolute expiry), AC-5 (opaque, not JWT). diff --git a/_docs/02_document/modules/services_security.md b/_docs/02_document/modules/services_security.md index 1b190b3..d399b45 100644 --- a/_docs/02_document/modules/services_security.md +++ b/_docs/02_document/modules/services_security.md @@ -1,39 +1,79 @@ # Module: Azaion.Services.Security ## Purpose -Static utility class providing the SHA-384 password hashing helper used by `UserService`. +Static utility class providing password hashing and verification. As of cycle 2, hashes new passwords with **Argon2id (RFC 9106)** and transparently re-hashes legacy SHA-384 entries on the next successful login. -> **Cycle 1 (2026-05-13) note** — `GetHWHash` was deleted and `GetApiEncryptionKey` was simplified from `(email, password, hardwareHash)` to `(email, password)` by AZ-197. +> **Cycle 1 (2026-05-13) note** — `GetHWHash` deleted; `GetApiEncryptionKey` simplified by AZ-197. 
> -> **Cycle 2 (2026-05-14) note** — `GetApiEncryptionKey`, `EncryptTo`, and `DecryptTo` were all removed along with the encrypted-download endpoint. Only `ToHash` remains; it still backs SHA-384 password hashing in `UserService` (`PasswordHash = request.Password.ToHash()`). The `Azaion.Test/SecurityTest.cs` unit tests went with the removed methods, leaving the `Azaion.Test` project empty (also removed from the solution). See `_docs/06_metrics/retro_2026-05-14.md` once cycle 2's retro lands. +> **Cycle 2 (2026-05-14) note A** — `GetApiEncryptionKey` / `EncryptTo` / `DecryptTo` removed with the encrypted-download endpoint. The `Azaion.Test` project went with them. +> +> **Cycle 2 (2026-05-14) note B (AZ-536)** — `ToHash` was removed and replaced with `HashPassword` + `VerifyPassword`. Hash format is now PHC: `$argon2id$v=19$m=65536,t=3,p=1$<salt>$<hash>`. Legacy SHA-384 hashes (64-char Base64, no `$` prefix) are still accepted for verification and the verify path returns `NeedsRehash=true` so `UserService.ValidateUser` can rewrite them on the success path. Epic AZ-530, CMMC IA.L2-3.5.10. ## Public Interface | Method | Signature | Description | |--------|-----------|-------------| -| `ToHash` | `static string ToHash(this string str)` | Extension: SHA-384 hash of input, returned as Base64 | +| `HashPassword` | `static string HashPassword(string plaintext)` | Generates a 16-byte salt, computes Argon2id with the conservative defaults below, returns a PHC string. | +| `VerifyPassword` | `static VerifyResult VerifyPassword(string plaintext, string stored)` | Detects format by prefix. Argon2id PHC → re-derives + constant-time compare; legacy SHA-384 → re-hashes + constant-time compare. Returns `Valid`, plus `NeedsRehash=true` when (a) the stored hash is legacy SHA-384, or (b) the stored Argon2 parameters are weaker than current defaults. | + +### `record VerifyResult(bool Valid, bool NeedsRehash)` + +Carries the verification outcome. 
`NeedsRehash` is the trigger for `UserService.RegisterSuccessfulLogin` to write a fresh Argon2id hash back to the row. ## Internal Logic -- `ToHash` uses SHA-384 with UTF-8 encoding, outputting Base64. + +**Defaults (RFC 9106 §4 conservative profile)**: +- Memory: 65536 KiB (64 MiB) +- Iterations: 3 +- Parallelism: 1 +- Salt: 16 bytes (128 bits), the length RFC 9106 §3.1 recommends for password hashing +- Hash output: 32 bytes (256 bits) + +**Format detection**: +- Argon2id PHC string starts with `$argon2id$`. +- Legacy SHA-384: exactly 64 base64 characters and does NOT start with `$`. +- Anything else fails verify with `Valid=false, NeedsRehash=false`. + +**PHC encoding** uses base64 *without* padding (PHC convention): +``` +$argon2id$v=19$m=<memory_kib>,t=<iterations>,p=<parallelism>$<salt>$<hash> +``` + +**Constant-time comparison** uses `CryptographicOperations.FixedTimeEquals` for both formats — addresses AZ-536 AC-5 (no remotely-observable timing leak). ## Dependencies -- `System.Security.Cryptography` (SHA384) -- `System.Text.Encoding` + +- `Konscious.Security.Cryptography.Argon2` (Argon2id implementation, pure C#) +- `System.Security.Cryptography.SHA384` (legacy verify path) +- `System.Security.Cryptography.RandomNumberGenerator` (salt entropy) +- `System.Security.Cryptography.CryptographicOperations` (constant-time compare) ## Consumers -- `Azaion.Services/UserService.cs` — `RegisterUser` (password storage) and `ValidateUser` (login comparison) both call `request.Password.ToHash()` + +- `Azaion.Services/UserService.cs` + - `RegisterUser` — calls `HashPassword(request.Password)` + - `ValidateUser` → `RegisterSuccessfulLogin` — calls `VerifyPassword`; on `NeedsRehash` writes a fresh Argon2id hash back transactionally (conditional on the original hash to avoid clobbering a parallel rehash) +- `Azaion.Services/MfaService.cs` + - `Enroll` and `Disable` — re-auth via `VerifyPassword(password, user.PasswordHash)` ## Data Models None. ## Configuration -None. +None directly. The defaults are class-level constants. 
Bumping them later automatically surfaces `NeedsRehash=true` for any older stored hash, so the upgrade is lazy and transparent. ## External Integrations None. ## Security -- Password hashing uses SHA-384 with no per-user salt and no key stretching. Not resistant to rainbow-table attacks (security audit F-7 — open). Unchanged by cycles 1 and 2. + +- Argon2id memory cost (64 MiB) makes GPU brute-force attacks orders of magnitude slower than the previous SHA-384 path. Each verify costs ~50–200 ms on commodity hardware (intentional latency floor). +- Legacy SHA-384 hashes are migrated on next successful login (lazy migration). Service accounts that never log in interactively (CompanionPC devices) need an admin-side bulk-reset rotation cycle to upgrade. +- The final hash comparison is constant-time for both formats via `FixedTimeEquals`, which satisfies AZ-536 AC-5. +- The "needs rehash" flag also covers future parameter bumps: raising `Argon2MemoryKib`/`Argon2Iterations` here will make all weaker stored hashes upgrade themselves on the next login. ## Tests -None at the unit-test level after the `Azaion.Test` project was removed in cycle 2. `ToHash` is exercised end-to-end through every login / register e2e test (`e2e/Azaion.E2E/Tests/`). + +- `e2e/Azaion.E2E/Tests/PasswordHashingTests.cs` — AC-1 (PHC format), AC-2 (legacy SHA-384 still validates), AC-3 (transparent re-hash), AC-4 (wrong password fails for both formats), AC-5 (constant-time verify). +- **Known follow-up** (carried from cycle 2 batch 4 review) — `PasswordHashingTests.AC5_Verify_uses_constant_time_comparator_no_obvious_timing_leak` is intermittently flaky under suite-level concurrency; widen the assertion bound or warm Argon2 with a non-test login first. +- `Azaion.Services` is exercised end-to-end through every login / register / MFA flow in `e2e/Azaion.E2E/Tests/`. 
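The format-detection and rehash rules of the `Security` module can be illustrated with a small sketch. It assumes nothing about the real implementation: `Classify`, `HasWeakerParameters`, and `VerifyLegacySha384` are hypothetical names, the Argon2id derivation itself is left out (it needs the Konscious package), and treating a lower `p` as "weaker" is an assumption.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Current defaults as listed under Internal Logic.
const int MemoryKib = 65536, Iterations = 3, Parallelism = 1;

// True when a stored Argon2id PHC string uses weaker parameters than the defaults.
bool HasWeakerParameters(string phc)
{
    // "$argon2id$v=19$m=..,t=..,p=..$salt$hash" splits into 6 parts (the first is empty).
    string[] parts = phc.Split('$');
    if (parts.Length != 6) return false;   // malformed; a real verifier would reject it
    int m = 0, t = 0, p = 0;
    foreach (string kv in parts[3].Split(','))
    {
        string[] pair = kv.Split('=');
        if (pair.Length != 2 || !int.TryParse(pair[1], out int value)) return false;
        if (pair[0] == "m") m = value;
        else if (pair[0] == "t") t = value;
        else if (pair[0] == "p") p = value;
    }
    return m < MemoryKib || t < Iterations || p < Parallelism;
}

// Legacy path: SHA-384 over UTF-8 bytes, Base64-encoded, compared in constant time.
bool VerifyLegacySha384(string plaintext, string stored)
{
    string recomputed = Convert.ToBase64String(SHA384.HashData(Encoding.UTF8.GetBytes(plaintext)));
    return CryptographicOperations.FixedTimeEquals(
        Encoding.ASCII.GetBytes(recomputed), Encoding.ASCII.GetBytes(stored));
}

// Format detection only: which verify path applies, and whether a rehash is due.
(bool Recognised, bool NeedsRehash) Classify(string stored)
{
    if (stored.StartsWith("$argon2id$", StringComparison.Ordinal))
        return (true, HasWeakerParameters(stored));
    if (stored.Length == 64 && !stored.StartsWith("$", StringComparison.Ordinal))
        return (true, true);               // legacy SHA-384 always needs a rehash
    return (false, false);                 // anything else fails verification
}

string legacy = Convert.ToBase64String(SHA384.HashData(Encoding.UTF8.GetBytes("hunter2")));
Console.WriteLine(Classify(legacy));
Console.WriteLine(VerifyLegacySha384("hunter2", legacy));
Console.WriteLine(Classify("$argon2id$v=19$m=19456,t=2,p=1$c2FsdA$aGFzaA"));
```

Keeping the rehash decision separate from the validity check is what lets a parameter bump roll out lazily: old hashes still verify, and they are only rewritten after a successful login.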
diff --git a/_docs/02_document/modules/services_session_service.md b/_docs/02_document/modules/services_session_service.md new file mode 100644 index 0000000..6045b67 --- /dev/null +++ b/_docs/02_document/modules/services_session_service.md @@ -0,0 +1,65 @@ +# Module: Azaion.Services.SessionService + +## Purpose +Logout / revocation surface and the verifier-poll snapshot. Distinct from `RefreshTokenService` (which rotates and reuse-detects); this service expresses the human / admin / system intent to kill a session and exposes the cross-service denylist feed. + +> Added in cycle 2 (2026-05-14) by AZ-535 (Epic AZ-529). The `RevokeMissionsForAircraft` path was added the same day for AZ-533 (mission-token auto-revoke on reconnect). + +## Public Interface + +### ISessionService + +| Method | Signature | Description | +|--------|-----------|-------------| +| `RevokeBySid` | `Task<bool> RevokeBySid(Guid sessionId, Guid? byUserId, string reason, CancellationToken ct = default)` | Revoke a single session by id. Returns `true` if the session was already revoked (no-op), `false` if this call performed the revocation. Throws `BusinessException(SessionNotFound)` if no row exists. | +| `RevokeAllForUser` | `Task<int> RevokeAllForUser(Guid userId, Guid? byUserId, string reason, CancellationToken ct = default)` | Revoke every active session for a user. Returns the number of rows newly revoked. | +| `RevokeMissionsForAircraft` | `Task RevokeMissionsForAircraft(Guid aircraftId, CancellationToken ct = default)` | AZ-533 — auto-revoke every open mission session for an aircraft. Fired on successful `/login` or `/token/refresh` from a `CompanionPC` user. | +| `GetRevokedSince` | `Task> GetRevokedSince(DateTime since, CancellationToken ct = default)` | Verifier-poll snapshot. Returns sessions revoked after `since` whose `exp` is still in the future (auto-prunes already-expired entries). | + +### `record RevokedSession(Guid Sid, DateTime Exp, DateTime RevokedAt, string? 
Reason)` + +Shape returned by `GetRevokedSince`. Field names match the JSON the `/sessions/revoked` endpoint serializes to verifiers. + +## Internal Logic + +- **Revocation reasons** are constants on `SessionRevokedReasons` (`logged_out`, `logged_out_all`, `admin_revoked`, `post_flight_reconnect`, `rotated`, `reuse_detected`, `family_revoked`). +- **Idempotency** — `RevokeBySid` reads first, then writes only if `revoked_at IS NULL`. The boolean return signals which side of the race the caller was on. +- **Mission auto-revoke** uses the partial index `sessions_aircraft_active_idx` (defined in `09_sessions_logout_and_mission.sql`) — O(active mission rows for that aircraft). +- **Snapshot pruning** — `GetRevokedSince` filters `expires_at > now()` so the response stays bounded even if revocation history grows large; the endpoint additionally clamps `since` to `now - 12 h` to prevent unbounded historical scans. + +## Dependencies + +- `IDbFactory` — admin connection for updates, read connection for the snapshot +- `Session` entity, `SessionRevokedReasons`, `SessionClasses` constants +- `BusinessException` / `ExceptionEnum.SessionNotFound` + +## Consumers + +- `Program.cs` `/logout` → `RevokeBySid` +- `Program.cs` `/logout/all` → `RevokeAllForUser` +- `Program.cs` `/sessions/{sid}/revoke` (admin-only) → `RevokeBySid` +- `Program.cs` `/sessions/revoked` (verifier-poll, gated by `revocationReaderPolicy`) → `GetRevokedSince` +- `Program.cs` `/login` and `/token/refresh` (when caller is `RoleEnum.CompanionPC`) → `RevokeMissionsForAircraft` + +## Data Models + +Operates on the `Session` entity via `AzaionDb.Sessions` table. + +## Configuration + +None directly. The `/sessions/revoked` endpoint hard-codes the 12-hour `since` floor; review if mission TTL is ever raised above 12 h. + +## External Integrations + +PostgreSQL via `IDbFactory`. + +## Security + +- The verifier-poll endpoint is gated by `revocationReaderPolicy` (`Service` or `ApiAdmin` role). 
Each verifier deployment (satellite-provider, gps-denied, ui) provisions one `Role=Service` user. +- The `Cache-Control: no-cache` header on `/sessions/revoked` prevents intermediaries from serving a stale copy of the denylist. +- The `revoked_by_user_id` column gives an audit trail of "who revoked this session" for admin and user-initiated revocations; system revocations (rotation, reuse, post-flight) leave it null on purpose. + +## Tests + +- `e2e/Azaion.E2E/Tests/LogoutTests.cs` — covers AC-1 (logout revokes session), AC-2 (logout/all), AC-3 (admin revoke), AC-4 (snapshot recent + prune expired), AC-5 (idempotent logout). +- `e2e/Azaion.E2E/Tests/MissionTokenTests.cs` — exercises `RevokeMissionsForAircraft` via AC-4 (auto-revoke on reconnect). diff --git a/_docs/02_document/modules/services_user_service.md b/_docs/02_document/modules/services_user_service.md index 11fc2c5..24027a1 100644 --- a/_docs/02_document/modules/services_user_service.md +++ b/_docs/02_document/modules/services_user_service.md @@ -1,69 +1,92 @@ # Module: Azaion.Services.UserService ## Purpose -Core business logic for user management: registration (web users + provisioned devices), authentication, role management, and account lifecycle. +Core business logic for user management: registration (web users + provisioned devices), authentication (with rate-limit + lockout enforcement), role management, and account lifecycle. -> **Cycle 1 (2026-05-13) note** — hardware-binding methods (`UpdateHardware`, `CheckHardwareHash`, private `UpdateLastLoginDate`) and the bound `IUserService` declarations were removed by AZ-197 (admin-side hardware-binding cleanup). Device auto-provisioning (`RegisterDevice`) was added by AZ-196. 
**Post-cycle-1 (security audit F-3)**: `RegisterDevice` was refactored to delegate the row insert to `RegisterUser`, and `RegisterUser` itself now relies on the new `users_email_uidx` UNIQUE INDEX (`env/db/06_users_email_unique.sql`) — the check-then-insert race is gone; `Npgsql.PostgresException(SqlState=23505)` is translated to `BusinessException(EmailExists)`. See `_docs/03_implementation/batch_05_report.md` and `batch_06_report.md`. +> **Cycle 1 (2026-05-13) note** — hardware-binding methods removed by AZ-197; device auto-provisioning (`RegisterDevice`) added by AZ-196. Post-cycle-1: `RegisterUser` now relies on the `users_email_uidx` UNIQUE INDEX; `Npgsql.PostgresException(SqlState=23505)` is translated to `BusinessException(EmailExists)`. +> +> **Cycle 2 (2026-05-14) note A (AZ-536)** — password hashing switched to Argon2id. `RegisterUser` calls `Security.HashPassword`; `ValidateUser` calls `Security.VerifyPassword`. On a `NeedsRehash=true` outcome the user's row is updated transactionally with a fresh Argon2id hash (conditional on the original `password_hash` to avoid clobbering a parallel rehash from a concurrent login). +> +> **Cycle 2 (2026-05-14) note B (AZ-537)** — `ValidateUser` now enforces account lockout (423) and a per-account sliding-window rate limit (429-equivalent via `BusinessException(LoginRateLimited)`). The lockout state lives on `users.failed_login_count` / `users.lockout_until`; the rate-limit feed is `audit_events` rows of type `login_failed`. `IAuditLog` and `IOptions<AuthConfig>` are new constructor dependencies. 
## Public Interface ### IUserService | Method | Signature | Description | |--------|-----------|-------------| -| `RegisterUser` | `Task RegisterUser(RegisterUserRequest request, CancellationToken ct)` | Creates a new user with hashed password | -| `RegisterDevice` | `Task RegisterDevice(CancellationToken ct)` | Creates a new `CompanionPC` user with auto-assigned `azj-NNNN` serial / email and a 32-char hex password (returned plaintext exactly once) | -| `ValidateUser` | `Task ValidateUser(LoginRequest request, CancellationToken ct)` | Validates email + password, returns user. Throws `NoEmailFound`, `WrongPassword`, or `UserDisabled` | -| `GetByEmail` | `Task GetByEmail(string? email, CancellationToken ct)` | Cached user lookup by email | -| `UpdateQueueOffsets` | `Task UpdateQueueOffsets(string email, UserQueueOffsets offsets, CancellationToken ct)` | Updates user's annotation queue offsets | -| `GetUsers` | `Task> GetUsers(string? searchEmail, RoleEnum? searchRole, CancellationToken ct)` | Lists users with optional email/role filters | -| `ChangeRole` | `Task ChangeRole(string email, RoleEnum newRole, CancellationToken ct)` | Changes a user's role | -| `SetEnableStatus` | `Task SetEnableStatus(string email, bool isEnabled, CancellationToken ct)` | Enables or disables a user account | -| `RemoveUser` | `Task RemoveUser(string email, CancellationToken ct)` | Permanently deletes a user | +| `RegisterUser` | `Task RegisterUser(RegisterUserRequest request, CancellationToken ct)` | Creates a new user with Argon2id-hashed password. Translates `users_email_uidx` 23505 violations to `BusinessException(EmailExists)`. | +| `RegisterDevice` | `Task RegisterDevice(CancellationToken ct)` | Creates a new `CompanionPC` user with auto-assigned `azj-NNNN` serial / email and a 32-char hex password (returned plaintext exactly once). 
| +| `ValidateUser` | `Task ValidateUser(LoginRequest request, CancellationToken ct)` | Validates email + password; enforces account lockout and per-account rate limit. Returns the user on success (with `failed_login_count` zeroed and any legacy SHA-384 hash transparently upgraded). Throws `NoEmailFound`, `AccountLocked` (with retry-after seconds), `LoginRateLimited` (with retry-after window), `WrongPassword`, or `UserDisabled`. | +| `GetByEmail` | `Task GetByEmail(string? email, CancellationToken ct)` | Cached user lookup by email. | +| `GetById` | `Task GetById(Guid userId, CancellationToken ct)` | Direct DB lookup by id (used by token-bound flows: refresh, MFA, mission). Not cached. | +| `UpdateQueueOffsets` | `Task UpdateQueueOffsets(string email, UserQueueOffsets offsets, CancellationToken ct)` | Updates user's annotation queue offsets. | +| `GetUsers` | `Task> GetUsers(string? searchEmail, RoleEnum? searchRole, CancellationToken ct)` | Lists users with optional email/role filters. | +| `ChangeRole` | `Task ChangeRole(string email, RoleEnum newRole, CancellationToken ct)` | Changes a user's role. | +| `SetEnableStatus` | `Task SetEnableStatus(string email, bool isEnabled, CancellationToken ct)` | Enables or disables a user account. | +| `RemoveUser` | `Task RemoveUser(string email, CancellationToken ct)` | Permanently deletes a user. | ## Internal Logic -- **RegisterUser**: hashes password via `Security.ToHash`, inserts via `RunAdmin`. Catches `Npgsql.PostgresException` with `SqlState == PostgresErrorCodes.UniqueViolation` (23505) on the `users_email_uidx` UNIQUE INDEX and rethrows as `BusinessException(EmailExists)`. The previous check-then-insert pattern was removed (race-prone before the index existed; redundant after). 
-- **RegisterDevice**: calls private `NextDeviceIdentity` (read-only) to compute the next `azj-NNNN` serial + matching email, generates a 32-char hex password from `RandomNumberGenerator.GetBytes(16)`, then delegates the row insert to `RegisterUser` (so any future change to user-creation policy applies here too). Returns `{Serial, Email, Password}` (plaintext password exposed exactly once at provisioning time). On a serial-allocation race, the second caller's insert hits the UNIQUE INDEX and surfaces `BusinessException(EmailExists)`; the caller can retry. -- **NextDeviceIdentity** (private): queries the most recent `RoleEnum.CompanionPC` user via `dbFactory.Run` (read connection), parses the `azj-NNNN` suffix (chars `[SerialNumberStart, SerialNumberLength)` of the email, constants on the class), increments by 1, returns `(serial, email)`. -- **ValidateUser**: finds user by email, compares password hash. Throws `NoEmailFound`, `WrongPassword`, or `UserDisabled`. -- **GetByEmail**: uses `ICache.GetFromCacheAsync` with key `User.{email}`. -- **UpdateQueueOffsets**: writes via `RunAdmin`, then invalidates the user cache. -- **GetUsers**: uses `WhereIf` for optional filter predicates. + +- **RegisterUser**: hashes password via `Security.HashPassword` (Argon2id), inserts via `RunAdmin`. Catches `PostgresException(23505)` on `users_email_uidx` and rethrows as `BusinessException(EmailExists)`. +- **RegisterDevice**: queries the most recent `RoleEnum.CompanionPC` user via `dbFactory.Run`, parses the `azj-NNNN` suffix, increments by 1, generates a 32-char hex password from `RandomNumberGenerator.GetBytes(16)`, then delegates the row insert to `RegisterUser` (so future user-creation policy changes apply here too). +- **ValidateUser** (sequence — order matters): + 1. Lookup by email; missing → `NoEmailFound`. + 2. **Lockout gate** — if `lockout_until > now()`, throw `AccountLocked` with the remaining seconds as `RetryAfterSeconds`. 
This precedes the password check (CMMC AC.L2-3.1.8 — even a correct password is rejected during lockout). + 3. **Per-account rate limit** — `IAuditLog.CountRecentFailedLogins` over `AuthConfig.RateLimit.PerAccountWindowSeconds`; if ≥ `PerAccountPermitLimit`, throw `LoginRateLimited` with the window as `RetryAfterSeconds`. + 4. **Password verify** via `Security.VerifyPassword`. Failure → `RegisterFailedLogin` (audit row + counter increment + maybe lockout) → throw `WrongPassword` (or `AccountLocked` if the failure crossed the threshold). + 5. `IsEnabled` check (after verify so wrong-password and disabled-account look identical to attackers from the outside). + 6. **Success path** — `RegisterSuccessfulLogin`: lazy Argon2id rehash if `NeedsRehash=true` (conditional on the original hash to avoid clobbering a parallel rehash), zero `failed_login_count`, clear `lockout_until`, invalidate cache, write `login_success` audit row. +- **RegisterFailedLogin**: writes `login_failed` audit row, increments `failed_login_count`. If the new count reaches `Lockout.MaxAttempts`, sets `lockout_until = now() + DurationSeconds`, writes a `login_lockout` audit row, and throws `AccountLocked` immediately so the caller learns the threshold was crossed. +- **GetByEmail**: cached via `ICache.GetFromCacheAsync` keyed `User.{email}`. +- **GetById**: not cached (used by token-bound flows where the user id is already authenticated). Private constants (device provisioning): - `DeviceEmailPrefix = "azj-"`, `DeviceEmailDomain = "@azaion.com"`, `SerialNumberStart = 4`, `SerialNumberLength = 4`, `DevicePasswordBytes = 16`. 
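The gate ordering in the `ValidateUser` sequence can be sketched as a pure function over already-loaded state. This is an illustration under stated assumptions, not the service's code: `Evaluate` and its parameters are hypothetical, outcomes are plain strings instead of `BusinessException` codes, and the audit, counter, and rehash bookkeeping of steps 4 and 6 is omitted.

```csharp
using System;

// Hypothetical stand-in for the decision order of steps 1-6.
(string Outcome, int RetryAfterSeconds) Evaluate(
    bool userExists, DateTime? lockoutUntil, int recentFailures,
    int permitLimit, int windowSeconds, bool passwordValid, bool isEnabled, DateTime now)
{
    if (!userExists) return ("NoEmailFound", 0);                          // step 1
    if (lockoutUntil is DateTime until && until > now)                    // step 2: before any password check
        return ("AccountLocked", (int)Math.Ceiling((until - now).TotalSeconds));
    if (recentFailures >= permitLimit)                                    // step 3: per-account rate limit
        return ("LoginRateLimited", windowSeconds);
    if (!passwordValid) return ("WrongPassword", 0);                      // step 4
    if (!isEnabled) return ("UserDisabled", 0);                           // step 5: only after a correct password
    return ("Success", 0);                                                // step 6
}

DateTime now = new DateTime(2026, 5, 14, 9, 0, 0, DateTimeKind.Utc);

// A correct password is still rejected while the lockout is active.
Console.WriteLine(Evaluate(userExists: true, lockoutUntil: now.AddSeconds(90), recentFailures: 0,
    permitLimit: 5, windowSeconds: 60, passwordValid: true, isEnabled: true, now: now));

// A disabled account with a wrong password is indistinguishable from any wrong password.
Console.WriteLine(Evaluate(userExists: true, lockoutUntil: null, recentFailures: 1,
    permitLimit: 5, windowSeconds: 60, passwordValid: false, isEnabled: false, now: now));
```

Placing the lockout and rate-limit gates ahead of the password check also means a locked or throttled request never pays the Argon2id verification cost.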
## Dependencies + - `IDbFactory` (database access) - `ICache` (user caching) -- `Security` (hashing — `ToHash`) +- `IAuditLog` (cycle 2 — audit row writes + per-account rate-limit feed) +- `IOptions<AuthConfig>` (cycle 2 — `RateLimit.*`, `Lockout.*` thresholds) +- `Security` (Argon2id hashing — `HashPassword` / `VerifyPassword`) - `System.Security.Cryptography.RandomNumberGenerator` (device password entropy) -- `Npgsql` (`PostgresException`, `PostgresErrorCodes.UniqueViolation` — used to translate UNIQUE-INDEX violations to `BusinessException(EmailExists)`) -- `BusinessException` (domain errors) +- `Npgsql` (`PostgresException`, `PostgresErrorCodes.UniqueViolation`) +- `BusinessException` / `ExceptionEnum` (`NoEmailFound`, `WrongPassword`, `EmailExists`, `UserDisabled`, `AccountLocked`, `LoginRateLimited`) - `QueryableExtensions.WhereIf` - `User`, `UserConfig`, `UserQueueOffsets`, `RoleEnum` - `RegisterUserRequest`, `LoginRequest`, `RegisterDeviceResponse` ## Consumers -- `Program.cs` — `/users/*` endpoints delegate to `IUserService` -- `Program.cs` — `POST /devices` calls `RegisterDevice` (added by AZ-196) + +- `Program.cs` `/users/*` endpoints — delegate to `IUserService` +- `Program.cs` `POST /devices` — calls `RegisterDevice` +- `Program.cs` `/login` — calls `ValidateUser` then either short-circuits to MFA step-1 or issues dual tokens +- `Program.cs` `/login/mfa`, `/token/refresh`, `/sessions/mission` — call `GetById` after token-side identity is established - `AuthService.GetCurrentUser` — calls `GetByEmail` +- `MfaService` — calls `GetById` for re-auth in `Enroll` / `Confirm` / `Disable` / `VerifyForLogin` ## Data Models -Operates on `User` entity via `AzaionDb.Users` table. The `User.Hardware` column is left in place (nullable, unused) per AZ-197 — see the entity doc. + +Operates on the `User` entity via `AzaionDb.Users`. Reads `failed_login_count` / `lockout_until` (AZ-537) and `mfa_enabled` (AZ-534). 
Writes `password_hash`, `failed_login_count`, `lockout_until` along the lockout/rehash paths. The `User.Hardware` column remains a tombstone (nullable, unused) per AZ-197. ## Configuration -None. +- `AuthConfig.RateLimit.PerAccountPermitLimit` / `PerAccountWindowSeconds` — sliding-window thresholds. +- `AuthConfig.Lockout.MaxAttempts` / `DurationSeconds` — consecutive-failure lockout. ## External Integrations PostgreSQL via `IDbFactory`. ## Security -- Passwords hashed with SHA-384 (via `Security.ToHash`) before storage. -- Device passwords are returned plaintext to the caller exactly once at provisioning; the persisted form is the SHA-384 hash. The plaintext is never re-derivable. + +- Passwords hashed with Argon2id (post-AZ-536). Legacy SHA-384 entries still validate and are transparently upgraded on next successful login. +- Device passwords are returned plaintext to the caller exactly once at provisioning; the persisted form is the Argon2id hash. The plaintext is never re-derivable. +- Lockout precedence (CMMC AC.L2-3.1.8): a locked account returns 423 even for a correct password until `lockout_until` passes. +- The per-account rate limit is DB-backed (via `audit_events`) so it survives process restarts — distinct from the in-memory per-IP limiter that lives in `Program.cs`. - Read operations use the read-only DB connection; writes use the admin connection. ## Tests -- `e2e/Azaion.E2E/Tests/DeviceTests.cs` — e2e for AZ-196 device-provisioning ACs -- `e2e/Azaion.E2E/Tests/UserManagementTests.cs` and `LoginTests.cs` — e2e coverage for the rest of the user lifecycle (login, register, role change, enable/disable, delete, queue offsets) - -(Unit-test coverage in `Azaion.Test/UserServiceTest.cs` was removed earlier with the AZ-197 hardware-binding cleanup; the `Azaion.Test` project itself was removed from the solution in cycle 2 once its only remaining file — `SecurityTest.cs` — was deleted with the encrypted-download stack.) 
+- `e2e/Azaion.E2E/Tests/RateLimitLockoutTests.cs` — AZ-537 ACs (per-IP 429, per-account 429, lockout 423, counter reset, lockout auto-expires, audit_events row on lockout). +- `e2e/Azaion.E2E/Tests/PasswordHashingTests.cs` — AZ-536 ACs (Argon2id format, legacy verify, transparent re-hash, wrong-password fail, constant-time verify). +- `e2e/Azaion.E2E/Tests/DeviceTests.cs` — AZ-196 device-provisioning ACs. +- `e2e/Azaion.E2E/Tests/UserManagementTests.cs` and `LoginTests.cs` — broader user lifecycle coverage. diff --git a/_docs/02_document/ripple_log_cycle2.md b/_docs/02_document/ripple_log_cycle2.md new file mode 100644 index 0000000..af93a86 --- /dev/null +++ b/_docs/02_document/ripple_log_cycle2.md @@ -0,0 +1,66 @@ +# Documentation Ripple Log — Cycle 2 (Auth Modernization, AZ-531..AZ-538) + +> Generated by `document` skill, Task Step 0.5 (Import-Graph Ripple), 2026-05-14. +> Source: cycle-2 implementation report (`_docs/03_implementation/implementation_report_auth_modernization_cycle2.md`). + +## Method + +For each source file changed by the cycle, identified C# namespace consumers via `rg "using Azaion\."`. Resolved consumer csproj membership via `module-layout.md`. Folded transitively-affected component / module docs into the refresh set. 
+ +## Direct + Ripple-affected docs (already refreshed in this cycle) + +| Trigger (changed in cycle 2) | Importing namespaces / files | Doc(s) refreshed | Reason | +|------------------------------|------------------------------|------------------|--------| +| `Azaion.Services.Security` (Argon2id rebuild — AZ-536) | `UserService`, `MfaService` | `modules/services_security.md`, `modules/services_user_service.md`, `modules/services_mfa_service.md` | API surface changed (`HashPassword`/`VerifyPassword` replace `ToHash`); both consumers had to be re-read | +| `Azaion.Services.AuthService` (ES256 — AZ-532) | `Azaion.AdminApi/Program.cs` | `modules/services_auth_service.md`, `modules/admin_api_program.md` | `CreateToken` signature (`sid`, `jti`, `amr`); JWKS publication wired in Program.cs | +| `Azaion.Services.RefreshTokenService` (new — AZ-531) | `Program.cs` | `modules/services_refresh_token_service.md` (new), `modules/admin_api_program.md` | New endpoints `/login`, `/login/mfa`, `/token/refresh` consume it | +| `Azaion.Services.SessionService` (new — AZ-535) | `Program.cs`, `MissionTokenService`, `UserService.SetEnableStatus` | `modules/services_session_service.md` (new), `modules/admin_api_program.md`, `modules/services_user_service.md`, `modules/services_mission_token_service.md` | `RevokeMissionsForAircraft` called from login/refresh; `RevokeAllForUser` called when user disabled | +| `Azaion.Services.MfaService` (new — AZ-534) | `Program.cs` | `modules/services_mfa_service.md` (new), `modules/admin_api_program.md` | New endpoints `/users/me/mfa/{enroll,confirm,disable}` + step-1 token in login | +| `Azaion.Services.MissionTokenService` (new — AZ-533) | `Program.cs` | `modules/services_mission_token_service.md` (new), `modules/admin_api_program.md` | `/sessions/mission` | +| `Azaion.Services.JwtSigningKeyProvider` (new — AZ-532) | `Program.cs`, `AuthService`, `MfaService` | `modules/services_jwt_signing_key_provider.md` (new), `modules/admin_api_program.md`, 
`modules/services_auth_service.md`, `modules/services_mfa_service.md` | Eager-built singleton; both JwtBearer `IssuerSigningKeyResolver` and AuthService consume it | +| `Azaion.Services.AuditLog` (new — AZ-537+534) | `UserService`, `MfaService`, `Program.cs` (DI only) | `modules/services_audit_log.md` (new), `modules/services_user_service.md`, `modules/services_mfa_service.md` | Per-account rate-limit + lifecycle audit | +| `Azaion.Common.Entities.User` (extended — AZ-537+534) | `UserService`, `MfaService`, `RefreshTokenService` (UserId), `SessionService`, `AuthService` | `modules/common_entities_user.md`, all services above | New columns drive new application logic | +| `Azaion.Common.Entities.Session` (new — AZ-531+535+533+534) | `RefreshTokenService`, `SessionService`, `MissionTokenService` | `modules/common_entities_session.md` (new); already-listed services | Direct ORM consumer | +| `Azaion.Common.Entities.AuditEvent` (new — AZ-537+534) | `AuditLog`, `UserService` | `modules/common_entities_audit_event.md` (new) | Direct ORM consumer | +| `Azaion.Common.Entities.RoleEnum` (extended — `Service` — AZ-535) | `Program.cs` (`revocationReaderPolicy`), `UserService` | `modules/common_entities_role_enum.md`, `modules/admin_api_program.md` | Authorization policy gate | +| `Azaion.Common.Configs.JwtConfig` (rebuilt — AZ-532) | `Program.cs`, `AuthService`, `MfaService`, `JwtSigningKeyProvider` | `modules/common_configs_jwt_config.md`, downstream services already covered | All ES256-related config | +| `Azaion.Common.Configs.AuthConfig` (new — AZ-536+537) | `Program.cs`, `UserService`, `Security` | `modules/common_configs_auth_config.md` (new), downstream covered | Argon2id parameters + rate limit + lockout | +| `Azaion.Common.Configs.SessionConfig` (new — AZ-531) | `Program.cs`, `RefreshTokenService` | folded into `modules/common_configs_jwt_config.md` (renamed JwtConfig + SessionConfig), downstream covered | Refresh sliding + absolute lifetimes | +| 
`Azaion.Common.Requests.LoginResponse` / `RefreshTokenRequest` (new — AZ-531) | `Program.cs` | `modules/common_requests_login_response.md` (new), `modules/admin_api_program.md`, `modules/common_requests_login_request.md` (cross-ref note) | New response shape; backward-compat `Token` getter | +| `Azaion.Common.Requests.MissionSessionRequest` / `MissionSessionResponse` (new — AZ-533) | `Program.cs`, `MissionTokenService` | `modules/common_requests_mission_session_request.md` (new) | New endpoint payload | +| `Azaion.Common.Requests.MfaRequests` (new — AZ-534) | `Program.cs`, `MfaService` | `modules/common_requests_mfa_requests.md` (new) | Five DTOs grouped in one file | +| `Azaion.Common.BusinessException` / `ExceptionEnum` (extended — AZ-531+533+534+535+537) | All services + `BusinessExceptionHandler` | `modules/common_business_exception.md`, `modules/admin_api_program.md` (handler section) | New error codes + `Retry-After` header support | +| `Azaion.Common.Database.AzaionDb` / `AzaionDbShemaHolder` (extended — Sessions + AuditEvents + jsonb mappings) | all services using them | covered transitively via component 01 Data Layer | New ITables; new mappings | + +## Component-level rollup + +| Component | Refreshed? 
| Why | +|-----------|------------|-----| +| 01 Data Layer | yes | `Session`, `AuditEvent`, extended `User`/`RoleEnum`, new `AuthConfig`/`SessionConfig`, rebuilt `JwtConfig`, new ITables, new indexes | +| 02 User Management | yes (within `services_user_service.md`) | Argon2id + lockout + rate-limit + audit | +| 03 Auth & Security | yes | Major rebuild — full rewrite of `components/03_auth_and_security/description.md` | +| 04 Resource Management | no | Cycle 2 auth-modernization did not touch resource code | +| 04b Detection Classes | no | Same | +| 05 Admin API | yes | Major endpoint surface expansion + middleware pipeline rewrite | + +## System-level docs refreshed + +- `system-flows.md` — F1 rewritten; F11–F17 added; F2/F7/F9 minor edits (Argon2id, session-revoke-on-disable) +- `data_model.md` — full rewrite to cover sessions / audit_events / new user columns / migrations / permissions +- `architecture.md` — section 1 rewritten, sections 2–7 updated, ADRs 6–9 added +- `module-layout.md` — sub-component table refreshed for cycle 2 services +- `diagrams/flows/flow_login.md` — full rewrite for the dual-token + MFA model + +## Tests (out-of-process) + +15 new e2e test files under `e2e/Azaion.E2E/Tests/` consume `Azaion.*` namespaces but are out-of-process HTTP tests; they do not have their own module docs by design (per `module-layout.md` §1). They are referenced from each module's "Tests" section. + +## Heuristic / parse-failure notes + +None. The C# `using` graph was directly resolvable for every changed namespace. 
+ +## Out of scope + +- `_docs/00_problem/*` — no AC / input-parameter changes from cycle 2 that aren't already captured in the per-task specs +- `_docs/04_deploy/*` — deployment ripple (ES256 PEM volume, DataProtection volume, HSTS/HTTPS rollout) is owned by the *deploy* skill (Step 14 of the autodev existing-code flow), not the *document* skill +- `_docs/05_security/*` — security report ripple is owned by the *security* skill diff --git a/_docs/02_document/system-flows.md b/_docs/02_document/system-flows.md index d807e30..faf276a 100644 --- a/_docs/02_document/system-flows.md +++ b/_docs/02_document/system-flows.md @@ -1,45 +1,65 @@ # Azaion Admin API — System Flows -> **Cycle 1 (2026-05-13) note** — F4 (Hardware Check) was deleted by AZ-197; F3 no longer depends on hardware. Two new flows were added: F8 Detection Classes CRUD (AZ-513), F9 Device Auto-Provisioning (AZ-196). F10 OTA Update Check & Publish (AZ-183) was reverted later the same day after the security audit (finding F-1) — the OTA delivery model itself was deemed obsolete; see `_docs/05_security/security_report.md` for context. F3's narrative was updated to drop the hardware-check step. +> **Cycle 1 (2026-05-13) note** — F4 (Hardware Check) was deleted by AZ-197; F8 Detection Classes (AZ-513), F9 Device Auto-Provisioning (AZ-196) added; F10 OTA reverted after security audit F-1. > -> **Cycle 2 (2026-05-14) note** — F3 (Encrypted Resource Download) and F6 (Installer Download) were removed entirely as obsolete. The encrypted-download support stack (`Security.GetApiEncryptionKey`, `EncryptTo`, `DecryptTo`, `ResourcesService.GetEncryptedResource`, `ResourcesService.GetInstaller`, `GetResourceRequest`, `WrongResourceName` (50)) and the installer config (`SuiteInstallerFolder`, `SuiteStageInstallerFolder`) all went with them. See `_docs/02_document/architecture.md` ADR-003 (retired). 
+> **Cycle 2 — early (2026-05-14)** — F3 (Encrypted Resource Download) and F6 (Installer Download) removed entirely as obsolete. ADR-003 retired. +> +> **Cycle 2 — Auth Modernization (2026-05-14)** — F1 was rebuilt around the new dual-token + MFA model (AZ-531/532/534/536/537). Seven new flows were added: F11 Refresh Token Rotation (AZ-531), F12 Logout / Revocation (AZ-535), F13 Mission Token Issuance (AZ-533), F14 MFA Enrollment & Confirmation (AZ-534), F15 Verifier Revocation Snapshot (AZ-535), F16 Account Lockout & Per-IP Rate Limit (AZ-537), F17 JWKS Publication (AZ-532). The legacy single-token narrative is no longer accurate. ## Flow Inventory | # | Flow Name | Trigger | Primary Components | Criticality | |---|-----------|---------|-------------------|-------------| -| F1 | User Login | POST /login | Admin API, User Mgmt, Auth & Security | High | -| F2 | User Registration | POST /users | Admin API, User Mgmt | High | -| ~~F3~~ | ~~Encrypted Resource Download~~ | ~~POST /resources/get~~ | — | **REMOVED — cycle 2 (obsolete)** | -| ~~F4~~ | ~~Hardware Check~~ | ~~POST /resources/check~~ | — | **REMOVED — AZ-197** | -| F5 | Resource Upload | POST /resources | Admin API, Resource Mgmt | Medium | -| ~~F6~~ | ~~Installer Download~~ | ~~GET /resources/get-installer~~ | — | **REMOVED — cycle 2 (obsolete)** | -| F7 | User Management (CRUD) | Various /users/* | Admin API, User Mgmt | Medium | -| F8 | Detection Classes CRUD *(AZ-513)* | POST/PATCH/DELETE /classes | Admin API, DetectionClassService | High | -| F9 | Device Auto-Provisioning *(AZ-196)* | POST /devices | Admin API, User Mgmt | High | -| ~~F10~~ | ~~OTA Update Check & Publish~~ | ~~POST /get-update + POST /resources/publish~~ | — | **REMOVED — post-cycle-1 (AZ-183 reverted, see security audit F-1)** | +| F1 | User Login (dual token + MFA) | `POST /login` (+ `/login/mfa`) | Admin API, User Mgmt, Auth & Security | **Critical** | +| F2 | User Registration | `POST /users` | Admin API, User Mgmt | High | +| ~~F3~~ | ~~Encrypted Resource 
Download~~ | — | — | **REMOVED — cycle 2 early** | +| ~~F4~~ | ~~Hardware Check~~ | — | — | **REMOVED — AZ-197** | +| F5 | Resource Upload | `POST /resources` | Admin API, Resource Mgmt | Medium | +| ~~F6~~ | ~~Installer Download~~ | — | — | **REMOVED — cycle 2 early** | +| F7 | User Management (CRUD) | Various `/users/*` | Admin API, User Mgmt | Medium | +| F8 | Detection Classes CRUD | `POST/PATCH/DELETE /classes` | Admin API, DetectionClassService | High | +| F9 | Device Auto-Provisioning | `POST /devices` | Admin API, User Mgmt | High | +| ~~F10~~ | ~~OTA Update Check & Publish~~ | — | — | **REMOVED — post-cycle-1** | +| **F11** | **Refresh Token Rotation** *(AZ-531)* | `POST /token/refresh` | Admin API, RefreshTokenService, AuthService, SessionService | **Critical** | +| **F12** | **Logout / Revocation** *(AZ-535)* | `POST /logout`, `/logout/all`, `/sessions/{sid}/revoke` | Admin API, SessionService | High | +| **F13** | **Mission Token Issuance** *(AZ-533)* | `POST /sessions/mission` | Admin API, MissionTokenService, SessionService, AuthService | High | +| **F14** | **MFA Enrollment & Confirmation** *(AZ-534)* | `POST /users/me/mfa/{enroll,confirm,disable}` | Admin API, MfaService, AuditLog | High | +| **F15** | **Verifier Revocation Snapshot** *(AZ-535)* | `GET /sessions/revoked?since=` | Admin API, SessionService | **Critical** for verifier fleet | +| **F16** | **Account Lockout & Rate Limit** *(AZ-537)* | (cross-cuts F1) | Admin API rate-limiter middleware, UserService, AuditLog | High | +| **F17** | **JWKS Publication** *(AZ-532)* | `GET /.well-known/jwks.json` | Admin API, JwtSigningKeyProvider | **Critical** for verifier fleet | ## Flow Dependencies | Flow | Depends On | Shares Data With | |------|-----------|-----------------| -| F1 | — | All other flows (produces JWT token) | -| F2 | — | F1, F9 (creates user records — including device users via F9) | -| F5 | F1 (requires JWT) | — | -| F7 | F1 (requires JWT, ApiAdmin role) | — | -| F8 | F1 (requires 
JWT, ApiAdmin role) | UI Detection Classes table | -| F9 | F1 (requires JWT, ApiAdmin role) | F2 (writes a user row, but reuses `RegisterUser` end-to-end), F1 (provisioned devices later log in) | +| F1 | F17 (signing keys must exist), F16 (rate limit gate) | F11 (refresh chain), F12 (sid is the revocation key), F14 (MFA branch) | +| F2 | — | F1 (created users can log in) | +| F5 | F1 / F11 (access token) | — | +| F7 | F1 / F11 + ApiAdmin | F12 (disabling a user revokes their sessions) | +| F8 | F1 / F11 + ApiAdmin | UI | +| F9 | F1 / F11 + ApiAdmin | F1 (provisioned devices later log in) | +| F11 | F1 (created the family) | F12 (rotation is the same row store) | +| F12 | F1 / F11 (sid claim) | F15 (revoked rows surface here) | +| F13 | F1 / F11 (pilot's interactive token) | F12 (auto-revoke prior aircraft mission rows) | +| F14 | F1 (caller is authenticated) | F1 (the MFA branch consumes enrolled state) | +| F15 | — (verifier role only) | F12 (consumes revocation rows) | +| F16 | — | F1, F11 (gates them) | +| F17 | — | F1, F11, F13, F14 (every signed token), F15 (verifiers cache JWKS) | --- -## Flow F1: User Login +## Flow F1: User Login (dual token + MFA) *(rebuilt cycle 2)* ### Description -A user submits email/password credentials. The system validates them against the database and returns a signed JWT token for subsequent authenticated requests. +A user submits email/password credentials. The system enforces per-IP and per-account rate limits + lockout (F16), verifies the password with constant-time Argon2id (lazily migrating from SHA-384 if needed — AZ-536), and either: +- (no MFA) issues a short-lived ES256 access token + opaque refresh token bound to a new session row, OR +- (MFA enabled) issues a short-lived `mfa_token` (JWT, audience `mfa-step`, signed by the active ES256 key) and waits for `POST /login/mfa` to complete the second factor. 
### Preconditions -- User account exists in the database -- User knows correct password +- User account exists, is enabled, and is not within an active lockout window +- Per-IP rate-limit bucket has remaining permits +- Per-account sliding-window failed-login count is below threshold +- For the MFA branch: user has previously enrolled and confirmed MFA (F14) ### Sequence Diagram @@ -47,27 +67,84 @@ A user submits email/password credentials. The system validates them against the sequenceDiagram participant Client participant API as Admin API + participant RL as RateLimiter (per-IP, AZ-537) participant US as UserService + participant AL as AuditLog + participant Sec as Security (Argon2id, AZ-536) participant DB as PostgreSQL + participant Mfa as MfaService + participant RT as RefreshTokenService participant Auth as AuthService + participant SS as SessionService Client->>API: POST /login {email, password} + API->>RL: per-IP sliding window check + alt rate-limited + RL-->>Client: 429 + Retry-After + end API->>US: ValidateUser(request) - US->>DB: SELECT user WHERE email = ? - DB-->>US: User record - US->>US: Compare password hash + US->>DB: SELECT users WHERE email=? 
(read conn) + US->>AL: CountRecentFailedLogins(email, window) + alt account locked OR per-account threshold exceeded + US-->>API: BusinessException(AccountLocked / LoginRateLimited, RetryAfterSeconds) + API-->>Client: 423 / 429 + Retry-After + end + US->>Sec: VerifyPassword(presented, stored) + alt VerifyResult.Ok=false + US->>AL: RecordLoginFailed + US->>DB: UPDATE failed_login_count, lockout_until + US-->>API: WrongPassword (or NoEmailFound) + API-->>Client: 409 + end + alt VerifyResult.NeedsRehash=true + US->>Sec: HashPassword (Argon2id) + US->>DB: UPDATE password_hash (lazy migrate) + end + US->>AL: RecordLoginSuccess + US->>DB: UPDATE failed_login_count=0, last_login=now() US-->>API: User entity - API->>Auth: CreateToken(user) - Auth-->>API: JWT string - API-->>Client: 200 OK {token} + + alt user.MfaEnabled + API->>Mfa: IssueMfaStepToken(userId) + Mfa-->>API: short-lived JWT (mfa_pending=true) + API-->>Client: 200 OK {mfa_required: true, mfa_token, expires_in: 300} + else + API->>RT: IssueForNewLogin(userId, mfaAuthenticated=false) + RT->>DB: INSERT INTO sessions (new id, family_id=id, refresh_hash, expires_at, mfa_authenticated=false) + RT-->>API: (opaqueRefreshToken, Session) + API->>Auth: CreateToken(user, sessionId=Session.Id, jti=new, amr=["pwd"]) + Auth-->>API: AccessToken (ES256) + opt user.Role == CompanionPC + API->>SS: RevokeMissionsForAircraft(user.Id) // F13 / AZ-533 AC-4 + end + API-->>Client: 200 OK LoginResponse {AccessToken, AccessExp, RefreshToken, RefreshExp} + end + + Note over Client,API: MFA branch only: + Client->>API: POST /login/mfa {mfa_token, code} + API->>RL: per-IP sliding window check + API->>Mfa: ValidateMfaStepToken(mfa_token) -> userId + API->>US: GetById(userId) + API->>Mfa: VerifyForLogin(userId, code) -> amr + Mfa->>DB: TOTP verify against decrypted mfa_secret OR recovery code consume + Mfa->>AL: RecordMfaLoginSuccess (or MfaRecoveryUsed) + API->>RT: IssueForNewLogin(userId, mfaAuthenticated=true) + API->>Auth: 
CreateToken(user, sessionId, jti, amr=["pwd","mfa"]) + API-->>Client: 200 OK LoginResponse ``` ### Error Scenarios | Error | Where | Detection | Recovery | |-------|-------|-----------|----------| -| Email not found | UserService.ValidateUser | No DB record | 409: NoEmailFound (code 10) | -| Wrong password | UserService.ValidateUser | Hash mismatch | 409: WrongPassword (code 30) | +| Per-IP limit exceeded | Rate-limiter middleware | sliding window | 429 + `Retry-After` | +| Account locked | UserService.ValidateUser | `now() < lockout_until` | 423 `AccountLocked` (code 50) + `Retry-After` | +| Per-account threshold | UserService.ValidateUser | failed-login count over window | 429 `LoginRateLimited` (code 51) + `Retry-After` | +| Email not found | UserService.ValidateUser | No DB record | 409 `NoEmailFound` (code 10) | +| Wrong password | UserService.ValidateUser | `VerifyPassword.Ok=false` | 409 `WrongPassword` (code 30) — also increments `failed_login_count` | +| User disabled | UserService.ValidateUser | `is_enabled=false` | 409 `UserDisabled` (code 38) | +| MFA token invalid | MfaService.ValidateMfaStepToken | bad signature / wrong audience / expired | 401 `InvalidMfaToken` (code 61) | +| MFA code wrong | MfaService.VerifyForLogin | TOTP and recovery both miss | 401 `InvalidMfaCode` (code 59) — `mfa_login_failed` audit row | --- @@ -96,7 +173,7 @@ sequenceDiagram API->>US: RegisterUser(request) US->>DB: SELECT user WHERE email = ? DB-->>US: null (no duplicate) - US->>US: Hash password (SHA-384) + US->>US: Hash password (Argon2id, AZ-536) US->>DB: INSERT user (admin connection) DB-->>US: OK US-->>API: void @@ -170,7 +247,9 @@ Admin operations: list users, change role, enable/disable, update queue offsets, ### Preconditions - Caller has ApiAdmin role (for most operations) -All operations follow the same pattern: API endpoint → UserService method → DbFactory.RunAdmin → PostgreSQL UPDATE/DELETE. 
Cache is invalidated for affected user keys after writes (the `UpdateQueueOffsets` path is the only remaining cache-invalidation site post-AZ-197). +All operations follow the same pattern: API endpoint → UserService method → DbFactory.RunAdmin → PostgreSQL UPDATE/DELETE. Cache is invalidated for affected user keys after writes. + +> **Cycle 2 cross-cut**: `PUT /users/{email}/disable` now also calls `SessionService.RevokeAllForUser` so disabling a user instantly cuts every active session. Verifiers pick this up via F15 within their poll cadence. --- @@ -241,7 +320,7 @@ sequenceDiagram ## Flow F9: Device Auto-Provisioning *(AZ-196, 2026-05-13)* ### Description -ApiAdmin requests a fresh CompanionPC device user. The server allocates the next sequential serial (`azj-NNNN`), generates a 32-char hex password, persists the user with the SHA-384 hash, and returns the plaintext credentials exactly once. The provisioning script (out-of-tree) embeds the values into the device's `device.conf`. +ApiAdmin requests a fresh CompanionPC device user. The server allocates the next sequential serial (`azj-NNNN`), generates a 32-char hex password, persists the user with an Argon2id hash (cycle 2 — AZ-536), and returns the plaintext credentials exactly once. The provisioning script (out-of-tree) embeds the values into the device's `device.conf`. 
### Preconditions - Caller has ApiAdmin role (`apiAdminPolicy`) @@ -262,7 +341,7 @@ sequenceDiagram US->>US: nextNumber = parse(lastEmail.suffix) + 1 (or 0) US->>US: serial = "azj-" + nextNumber.PadLeft(4) US->>US: password = ToHex(RandomBytes(16)) // 32 hex chars - US->>DB: INSERT user {Email=serial@domain, PasswordHash=SHA384(password), Role=CompanionPC, IsEnabled=true} (admin conn) + US->>DB: INSERT user {Email=serial@domain, PasswordHash=Argon2id(password), Role=CompanionPC, IsEnabled=true} (admin conn) DB-->>US: OK US-->>API: RegisterDeviceResponse {Serial, Email, Password} API-->>Admin: 200 OK {Serial, Email, Password} @@ -288,3 +367,383 @@ Reasons: 2. The OTA delivery model is itself a leftover from the installer-shipping era; the target architecture (browser-only SaaS + fTPM-secured Jetsons) does not need it. The `apiUploaderPolicy` definition was removed from `Program.cs`; the `RoleEnum.ResourceUploader` enum value remains as data (the seed `uploader@azaion.com` user still uses it for negative-auth tests) but is no longer wired to any endpoint. + +--- + +## Flow F11: Refresh Token Rotation *(AZ-531, 2026-05-14)* + +### Description +The client presents an opaque refresh token; the server validates it, rotates it (marks the old row as `revoked_reason='rotated'`), inserts a new row in the same `family_id`, and mints a new ES256 access token. Reuse of an already-rotated token revokes the entire family with `reason='reuse_detected'` (and triggers F15 surfacing for verifiers). 
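
The rotation and reuse-detection rules described for F11 can be sketched roughly as follows. This is an illustrative Python model, not the C# `RefreshTokenService`: the in-memory `SessionStore`, the 24-hour sliding window, and the exception class are assumptions made for the example, and the absolute lifetime cap is left out for brevity.

```python
import hashlib
import secrets
import uuid
from datetime import datetime, timedelta, timezone


class InvalidRefreshToken(Exception):
    """Stands in for the 401 InvalidRefreshToken outcome."""


def _sha256(token: str) -> str:
    return hashlib.sha256(token.encode()).hexdigest()


class SessionStore:
    """Hypothetical in-memory stand-in for the sessions table."""

    def __init__(self, sliding=timedelta(hours=24)):  # assumed window length
        self.rows = {}  # refresh_hash -> session row (dict)
        self.sliding = sliding

    def _insert(self, user_id, family_id, parent_id, now):
        token = secrets.token_urlsafe(32)  # opaque token; only its hash is stored
        session_id = uuid.uuid4()
        row = {
            "id": session_id,
            "user_id": user_id,
            # A fresh login starts a new family whose id equals the row id.
            "family_id": family_id if family_id is not None else session_id,
            "parent_session_id": parent_id,
            "expires_at": now + self.sliding,
            "revoked_at": None,
            "revoked_reason": None,
        }
        self.rows[_sha256(token)] = row
        return token, row

    def issue_for_new_login(self, user_id, now=None):
        now = now or datetime.now(timezone.utc)
        return self._insert(user_id, None, None, now)

    def rotate(self, token, now=None):
        now = now or datetime.now(timezone.utc)
        row = self.rows.get(_sha256(token))
        if row is None:
            raise InvalidRefreshToken("unknown token")
        if row["revoked_reason"] == "rotated":
            # Reuse of an already-rotated token: revoke every live row in the family.
            for other in self.rows.values():
                if other["family_id"] == row["family_id"] and other["revoked_at"] is None:
                    other["revoked_at"] = now
                    other["revoked_reason"] = "reuse_detected"
            raise InvalidRefreshToken("reuse detected")
        if row["revoked_at"] is not None or row["expires_at"] <= now:
            raise InvalidRefreshToken("revoked or expired")
        # Normal path: retire the presented row and chain a new one to it.
        row["revoked_at"] = now
        row["revoked_reason"] = "rotated"
        return self._insert(row["user_id"], row["family_id"], row["id"], now)
```

In this model every failure mode raises the same exception, mirroring the way the flow returns a single `InvalidRefreshToken` outcome regardless of the cause.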
+ +### Preconditions +- Refresh token is well-formed and corresponds to a non-revoked, non-expired session row +- The session is within both the sliding window (`SessionConfig.RefreshSlidingHours`) and the absolute cap (`SessionConfig.RefreshAbsoluteHours` measured from `family_started_at`) + +### Sequence Diagram + +```mermaid +sequenceDiagram + participant Client + participant API as Admin API + participant RT as RefreshTokenService + participant US as UserService + participant Auth as AuthService + participant SS as SessionService + participant DB as PostgreSQL + + Client->>API: POST /token/refresh {refreshToken} + API->>RT: Rotate(opaqueToken) + RT->>DB: SELECT * FROM sessions WHERE refresh_hash = SHA256(token) + alt row missing + RT-->>API: 401 InvalidRefreshToken + end + alt row.revoked_reason = 'rotated' (reuse!) + RT->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='reuse_detected' WHERE family_id = row.family_id AND revoked_at IS NULL + RT-->>API: 401 InvalidRefreshToken + end + alt row.revoked_at IS NOT NULL OR row.expires_at <= now + RT-->>API: 401 InvalidRefreshToken + end + RT->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='rotated', last_used_at=now WHERE id = row.id + RT->>DB: INSERT INTO sessions (new id, family_id=row.family_id, refresh_hash=SHA256(newToken), parent_session_id=row.id, expires_at=now+sliding, mfa_authenticated=row.mfa_authenticated) + RT-->>API: (newOpaqueToken, newSession) + API->>US: GetById(newSession.UserId) + US-->>API: User + API->>Auth: CreateToken(user, sessionId=newSession.Id, jti=new, amr= ['pwd','mfa'] if mfaAuthenticated else ['pwd']) + Auth-->>API: AccessToken + opt user.Role == CompanionPC + API->>SS: RevokeMissionsForAircraft(user.Id) + end + API-->>Client: 200 OK LoginResponse {AccessToken, AccessExp, RefreshToken=newOpaqueToken, RefreshExp=newSession.ExpiresAt} +``` + +### Error Scenarios + +| Error | Where | Detection | Recovery | +|-------|-------|-----------|----------| +| Token missing / not in 
DB | RefreshTokenService.Rotate | `SHA256(token)` not found | 401 `InvalidRefreshToken` | +| Reuse detected | RefreshTokenService.Rotate | row already `revoked_reason='rotated'` | 401 `InvalidRefreshToken` + entire family revoked (visible via F15) | +| Sliding window expired | RefreshTokenService.Rotate | `expires_at <= now()` | 401 `InvalidRefreshToken` | +| Absolute cap exceeded | RefreshTokenService.Rotate | `now() - family_started_at > RefreshAbsoluteHours` | 401 `InvalidRefreshToken` | +| User missing (race with deletion) | API | `UserService.GetById` returns null | 401 `InvalidRefreshToken` | + +--- + +## Flow F12: Logout / Revocation *(AZ-535, 2026-05-14)* + +### Description +Three endpoints share `SessionService.RevokeBySid` / `RevokeAllForUser`: +- `POST /logout` — revoke caller's current `sid` (idempotent; returns `{ alreadyRevoked }`) +- `POST /logout/all` — revoke every active session for the caller's user +- `POST /sessions/{sid}/revoke` *(ApiAdmin)* — admin revoke-by-sid + +All revocations write `revoked_at`, `revoked_reason`, and `revoked_by_user_id`; the rows surface to verifiers via F15 within the next poll window. 
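
The idempotent revoke shared by the three F12 endpoints can be illustrated with a conditional `UPDATE` whose affected-row count decides the `alreadyRevoked` flag. The sketch below is Python against SQLite purely for illustration; the real store is PostgreSQL, and the function name, the exception class, and the follow-up existence check are assumptions rather than the actual `SessionService` code.

```python
import sqlite3
from datetime import datetime, timezone


class SessionNotFound(Exception):
    """Stands in for the 404 SessionNotFound outcome (admin path)."""


def revoke_by_sid(conn, sid, revoked_by, reason):
    """Revoke one session; return True if it was already revoked."""
    now = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "UPDATE sessions SET revoked_at = ?, revoked_reason = ?, revoked_by_user_id = ? "
        "WHERE id = ? AND revoked_at IS NULL",
        (now, reason, revoked_by, sid),
    )
    if cur.rowcount == 1:
        return False  # this call performed the revocation
    # Zero rows touched: either the sid is unknown or it was revoked earlier.
    exists = conn.execute("SELECT 1 FROM sessions WHERE id = ?", (sid,)).fetchone()
    if exists is None:
        raise SessionNotFound(sid)
    return True  # idempotent repeat; the original reason is left untouched
```

Because the `WHERE` clause filters on `revoked_at IS NULL`, a repeated call never overwrites the first revocation's timestamp or reason.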
+ +### Preconditions +- `/logout` / `/logout/all` — caller is authenticated; the access token's `sid` claim is well-formed +- `/sessions/{sid}/revoke` — caller is `ApiAdmin` + +### Sequence Diagram + +```mermaid +sequenceDiagram + participant Client + participant API as Admin API + participant SS as SessionService + participant DB as PostgreSQL + + Note over Client,API: Self logout + Client->>API: POST /logout (Bearer access) + API->>API: ParseSidClaim(user) -> sid + API->>API: ParseUserIdClaim(user) -> caller + API->>SS: RevokeBySid(sid, caller, 'logged_out') + SS->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='logged_out', revoked_by_user_id=caller WHERE id=sid AND revoked_at IS NULL + SS-->>API: alreadyRevoked: bool + API-->>Client: 200 OK { alreadyRevoked } + + Note over Client,API: Logout-all + Client->>API: POST /logout/all + API->>SS: RevokeAllForUser(caller, caller, 'logged_out_all') + SS->>DB: UPDATE ... WHERE user_id=caller AND revoked_at IS NULL + SS-->>API: int (rows revoked) + API-->>Client: 200 OK { revoked } + + Note over Client,API: Admin revoke-by-sid + Client->>API: POST /sessions/{sid}/revoke (ApiAdmin) + API->>SS: RevokeBySid(sid, admin, 'admin_revoked') + SS->>DB: UPDATE ... WHERE id=sid AND revoked_at IS NULL + SS-->>API: alreadyRevoked: bool + API-->>Client: 200 OK { alreadyRevoked } +``` + +### Error Scenarios + +| Error | Where | Detection | Recovery | +|-------|-------|-----------|----------| +| Missing/malformed `sid` claim | ParseSidClaim | not a Guid | 401 `InvalidRefreshToken` | +| Sid not in DB (admin path) | SessionService.RevokeBySid | row not found | 404 `SessionNotFound` | +| Already revoked | SessionService.RevokeBySid | UPDATE affected 0 rows | 200 OK with `alreadyRevoked: true` (idempotent) | + +--- + +## Flow F13: Mission Token Issuance *(AZ-533, 2026-05-14)* + +### Description +A pilot (an authenticated interactive user) requests a long-lived no-refresh access token bound to one aircraft and one mission. 
Before signing the token, the server inserts a `class='mission'` session row (so `sid` is bound), and revokes any previously-active mission sessions for that aircraft (`reason='aircraft_reconnected'`). + +### Preconditions +- Caller is authenticated (interactive token; AMR can be `["pwd"]` or `["pwd","mfa"]` — F1 follow-up tightens this to require `mfa` once policy is set) +- `request.aircraftId` resolves to an existing user with `Role = CompanionPC` +- `request.missionId` matches the validation pattern; `request.plannedDurationH` is within bounds + +### Sequence Diagram + +```mermaid +sequenceDiagram + participant Pilot + participant API as Admin API + participant MTS as MissionTokenService + participant SS as SessionService + participant US as UserService + participant Auth as AuthService + participant DB as PostgreSQL + + Pilot->>API: POST /sessions/mission {aircraftId, missionId, plannedDurationH, region} + API->>MTS: Issue(pilotId, request) + MTS->>US: GetById(aircraftId) (read conn) + alt aircraft missing or wrong role + MTS-->>API: 400 AircraftNotFound + end + MTS->>SS: RevokeMissionsForAircraft(aircraftId) // AC-4 + SS->>DB: UPDATE sessions SET revoked_at=now, revoked_reason='aircraft_reconnected' WHERE aircraft_id=? 
AND class='mission' AND revoked_at IS NULL + MTS->>DB: INSERT INTO sessions (id, user_id=aircraftId, class='mission', aircraft_id=aircraftId, refresh_hash=NULL, expires_at=now + plannedDurationH) + MTS->>Auth: CreateToken(aircraftUser, sessionId=newSid, jti, amr=['pwd','mission']) + Auth-->>MTS: AccessToken + MTS-->>API: MissionSessionResponse {access_token, expires_at, mission_id, aircraft_id} + API-->>Pilot: 200 OK +``` + +### Error Scenarios + +| Error | Where | Detection | Recovery | +|-------|-------|-----------|----------| +| Validation failure | FluentValidation / MissionTokenService | bad `mission_id` pattern, `plannedDurationH` out of bounds | 400 `InvalidMissionRequest` (code 54) | +| Aircraft not a CompanionPC | MissionTokenService.Issue | role mismatch | 400 `AircraftNotFound` (code 55) | + +--- + +## Flow F14: MFA Enrollment & Confirmation *(AZ-534, 2026-05-14)* + +### Description +Three-step user-initiated lifecycle: +1. **Enroll** — server generates a new TOTP secret, encrypts it via `IDataProtector` (purpose `Azaion.Mfa.Secret`), persists with `mfa_enabled=false`, returns base32 secret + otpauth URL + QR PNG bytes. +2. **Confirm** — client submits a TOTP code; on success server flips `mfa_enabled=true`, generates 10 single-use Argon2id-hashed recovery codes, and returns them once. +3. **Disable** — requires both password + a current TOTP; server clears all MFA columns. 
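
The TOTP check behind Confirm and Disable can be sketched as below. This is a Python illustration only: the 30-second period, 6 digits, HMAC-SHA-1, and a drift of one step are the common RFC 6238 defaults and are assumed here, not taken from `MfaService`; the replay guard is one plausible reading of how `mfa_last_used_window` might be used.

```python
import base64
import hashlib
import hmac
import secrets
import struct


def generate_secret_base32() -> str:
    """20 random bytes, base32-encoded, as in the Enroll step."""
    return base64.b32encode(secrets.token_bytes(20)).decode()


def totp_at(secret_b32: str, step: int, digits: int = 6) -> str:
    """HOTP value (RFC 4226) for one time step."""
    key = base64.b32decode(secret_b32)
    digest = hmac.new(key, struct.pack(">Q", step), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_totp(secret_b32, code, unix_time, last_used_step=None, period=30, drift=1):
    """Return the matching time step, or None if the code is not acceptable."""
    current = int(unix_time) // period
    for step in range(max(0, current - drift), current + drift + 1):
        if last_used_step is not None and step <= last_used_step:
            continue  # assumed replay guard: never accept a step twice
        if hmac.compare_digest(totp_at(secret_b32, step), code):
            return step
    return None
```

Returning the matched step lets the caller persist it as the new last-used window, so the same code cannot be replayed within its validity period.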
+ +### Preconditions +- Caller is authenticated +- For Confirm: a prior Enroll call left the encrypted secret on the user +- For Disable: `mfa_enabled = true` + +### Sequence Diagram + +```mermaid +sequenceDiagram + participant User + participant API as Admin API + participant Mfa as MfaService + participant DP as IDataProtector + participant Sec as Security (Argon2id) + participant AL as AuditLog + participant DB as PostgreSQL + + Note over User,API: ENROLL + User->>API: POST /users/me/mfa/enroll {password} + API->>Mfa: Enroll(userId, password) + Mfa->>DB: SELECT user + Mfa->>Sec: VerifyPassword(presented, stored) + Mfa->>Mfa: Generate 20-byte secret, base32 encode + Mfa->>DP: Protect(base32) -> encrypted base64 + Mfa->>DB: UPDATE users SET mfa_secret = encrypted, mfa_enrolled_at = now, mfa_enabled=false + Mfa->>AL: RecordMfaEnroll + Mfa-->>API: MfaEnrollResponse { secret_base32, otpauth_url, qr_png } + API-->>User: 200 OK + + Note over User,API: CONFIRM + User->>API: POST /users/me/mfa/confirm {code} + API->>Mfa: Confirm(userId, code) + Mfa->>DP: Unprotect(stored) -> base32 secret + Mfa->>Mfa: TOTP verify + alt code wrong + Mfa-->>API: 401 InvalidMfaCode + end + Mfa->>Mfa: Generate 10 recovery codes + Mfa->>Sec: HashPassword each (Argon2id) + Mfa->>DB: UPDATE users SET mfa_enabled=true, mfa_recovery_codes = jsonb([{ hash, used_at=null } x10]), mfa_last_used_window=current_step + Mfa->>AL: RecordMfaConfirm + Mfa-->>API: { recovery_codes: [...] 
} + API-->>User: 200 OK { mfaEnabled: true, recovery_codes } + + Note over User,API: DISABLE + User->>API: POST /users/me/mfa/disable {password, code} + API->>Mfa: Disable(userId, password, code) + Mfa->>Sec: VerifyPassword + Mfa->>Mfa: TOTP verify + Mfa->>DB: UPDATE users SET mfa_enabled=false, mfa_secret=NULL, mfa_recovery_codes=NULL, mfa_enrolled_at=NULL, mfa_last_used_window=NULL + Mfa->>AL: RecordMfaDisable + Mfa-->>API: ok + API-->>User: 200 OK { mfaEnabled: false } +``` + +### Error Scenarios + +| Error | Where | Detection | Recovery | +|-------|-------|-----------|----------| +| Already enrolled (Enroll) | MfaService.Enroll | `mfa_enabled=true` | 409 `MfaAlreadyEnabled` (code 56) | +| Not enrolling (Confirm) | MfaService.Confirm | `mfa_secret IS NULL` | 409 `MfaNotEnrolling` (code 57) | +| Not enabled (Disable) | MfaService.Disable | `mfa_enabled=false` | 409 `MfaNotEnabled` (code 58) | +| Wrong password | Sec.VerifyPassword | hash mismatch | 409 `WrongPassword` (code 30) | +| Wrong TOTP code | MfaService TOTP path | code/window miss | 401 `InvalidMfaCode` (code 59) | + +--- + +## Flow F15: Verifier Revocation Snapshot *(AZ-535, 2026-05-14)* + +### Description +A `Service`-role identity (verifier fleet) polls `GET /sessions/revoked?since={iso8601}` periodically. The server returns every session whose `revoked_at >= since` and `expires_at > now()` so verifiers can deny tokens whose `sid` appears in the snapshot. + +The `since` parameter is **clamped to a 12-hour floor** server-side so a buggy verifier asking for "everything since 1970" doesn't trigger a multi-million-row table scan. Verifiers should clock-skew-tolerate by stepping `since` back ~30s on each poll. 
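
The 12-hour clamp and the verifier-side skew allowance can be expressed in a few lines. The Python below is a sketch under assumptions: the function names are hypothetical, and treating a missing `since` as "use the floor" follows the error table for this flow.

```python
from datetime import datetime, timedelta, timezone

MAX_LOOKBACK = timedelta(hours=12)  # server-side floor for `since`


def effective_since(since, now):
    """Server side: never look back further than MAX_LOOKBACK."""
    floor = now - MAX_LOOKBACK
    if since is None or since < floor:
        return floor
    return since


def next_poll_since(last_poll_started_at, skew=timedelta(seconds=30)):
    """Verifier side: step `since` back slightly to absorb clock skew."""
    return last_poll_started_at - skew
```

A verifier asking for everything since 1970 therefore receives at most the last 12 hours of revocations, while a well-behaved verifier overlaps consecutive polls by about 30 seconds and deduplicates by `sid`.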
+ +### Preconditions +- Caller has role `Service` or `ApiAdmin` (`revocationReaderPolicy`) + +### Sequence Diagram + +```mermaid +sequenceDiagram + participant Verifier + participant API as Admin API + participant SS as SessionService + participant DB as PostgreSQL + + Verifier->>API: GET /sessions/revoked?since=2026-05-14T05:30:00Z + API->>API: clamp since to max(now-12h, since) + API->>SS: GetRevokedSince(effectiveSince) + SS->>DB: SELECT id, expires_at, revoked_at, revoked_reason FROM sessions WHERE revoked_at >= ? AND expires_at > now() ORDER BY revoked_at + DB-->>SS: rows (uses sessions_revoked_at_idx) + SS-->>API: IReadOnlyList + API-->>Verifier: 200 OK [{ sid, exp, revokedAt, reason }, ...] + Cache-Control: no-cache +``` + +### Error Scenarios + +| Error | Where | Detection | Recovery | +|-------|-------|-----------|----------| +| Wrong role | API authorization | not Service/ApiAdmin | 403 Forbidden | +| `since` missing | API | bind null `DateTime?` | clamp falls back to `now-12h` | + +--- + +## Flow F16: Account Lockout & Per-IP Rate Limit *(AZ-537, 2026-05-14)* + +### Description +Cross-cuts F1 and F11. Two layers: +1. **Per-IP** — ASP.NET Core `RateLimiter` middleware (`SlidingWindowRateLimiter`) attached to `/login` and `/login/mfa` via the `login-per-ip` policy. Rejection sets `429` and stamps `Retry-After` from the lease metadata. +2. **Per-account + lockout** — DB-backed in `UserService.ValidateUser`: + - Read `failed_login_count` and `lockout_until` from `users`. + - If `now() < lockout_until` → throw `BusinessException(AccountLocked, RetryAfterSeconds = LockoutUntil - now)`. + - Else: count `audit_events` rows where `event_type='login_failed' AND email=? AND occurred_at >= now - PerAccountWindowSeconds`. If over threshold → throw `BusinessException(LoginRateLimited, RetryAfterSeconds = PerAccountWindowSeconds)`. + - On wrong password: `RecordLoginFailed` + UPDATE `failed_login_count = failed_login_count + 1`. 
If new count >= `ConsecutiveFailureThreshold` → set `lockout_until = now + LockoutSeconds`, `RecordLoginLockout`, throw `AccountLocked`. + - On success: `RecordLoginSuccess` + UPDATE `failed_login_count = 0`, `lockout_until = NULL`. + +### Preconditions +- `AuthConfig.RateLimit.*` and `AuthConfig.Lockout.*` are non-zero +- `audit_events` table exists + +### Sequence Diagram + +```mermaid +sequenceDiagram + participant Client + participant Mid as RateLimiter middleware + participant API as Admin API + participant US as UserService + participant AL as AuditLog + participant DB as PostgreSQL + + Client->>Mid: POST /login {email, password} + Mid->>Mid: SlidingWindow per-IP check + alt no permits + Mid-->>Client: 429 + Retry-After + end + Mid->>API: forward + API->>US: ValidateUser + US->>DB: SELECT users (read) + US->>AL: CountRecentFailedLogins(email, window) + alt account locked OR threshold exceeded + US->>AL: RecordLoginFailed (or RecordLoginLockout if newly locked) + US-->>API: BusinessException(AccountLocked / LoginRateLimited, RetryAfterSeconds) + API-->>Client: 423 / 429 + Retry-After + end + US->>US: VerifyPassword + alt wrong password + US->>AL: RecordLoginFailed + US->>DB: UPDATE failed_login_count++; lockout_until = now + LockoutSeconds (if newly over) + US-->>API: BusinessException(WrongPassword) + API-->>Client: 409 + end + US->>AL: RecordLoginSuccess + US->>DB: UPDATE failed_login_count = 0, lockout_until = NULL, last_login = now + US-->>API: User +``` + +### Error Scenarios + +| Error | Where | Detection | Recovery | +|-------|-------|-----------|----------| +| Per-IP limit | RateLimiter middleware | sliding window | 429 + `Retry-After` | +| Account locked | UserService.ValidateUser | `now < lockout_until` | 423 `AccountLocked` + `Retry-After` | +| Per-account threshold | UserService.ValidateUser | `audit_events` count over window | 429 `LoginRateLimited` + `Retry-After` | + +--- + +## Flow F17: JWKS Publication *(AZ-532, 2026-05-14)* + +### Description 
+`GET /.well-known/jwks.json` (anonymous) returns the JSON Web Key Set containing one entry per loaded ES256 key. Verifiers cache for 1 hour (`Cache-Control: public, max-age=3600`). + +### Preconditions +- `JwtConfig.KeysFolder` exists with at least one well-formed P-256 PEM +- `JwtConfig.ActiveKid` matches one of the loaded files (the others are still served, allowing verifiers to validate already-issued tokens during a key rotation) + +### Sequence Diagram + +```mermaid +sequenceDiagram + participant Verifier + participant API as Admin API + participant JKP as JwtSigningKeyProvider + participant FS as Filesystem + + Note over JKP,FS: At app startup + API->>JKP: ctor (eager) + JKP->>FS: scan KeysFolder/*.pem + JKP->>JKP: validate P-256 curve, build EcdsaSecurityKey list + JKP-->>API: ready (or fail-fast if 0 keys) + + Note over Verifier,API: Per-poll + Verifier->>API: GET /.well-known/jwks.json + API->>JKP: All + JKP-->>API: list of JwtSigningKey + API->>API: project to JWK { kty:EC, crv:P-256, kid, use:sig, alg:ES256, x, y } + API-->>Verifier: 200 OK { keys: [...] } + Cache-Control: public, max-age=3600 +``` + +### Error Scenarios + +| Error | Where | Detection | Recovery | +|-------|-------|-----------|----------| +| No keys / malformed PEM | JwtSigningKeyProvider ctor | startup crash (intentional) | Operator fix + restart | +| Wrong curve in PEM | JwtSigningKeyProvider ctor | startup crash | Operator fix + restart | + +> **Rotation procedure**: drop a new PEM into `KeysFolder`, set `JwtConfig:ActiveKid` to the new kid, restart. Already-issued tokens remain verifiable until their `exp`. Old PEMs are physically removed only after the longest possible token TTL has elapsed. 
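The "project to JWK" step in the F17 diagram can be sketched roughly as follows. This is a minimal illustration in Python, not the Admin API's actual code: the helper names (`b64url_uint`, `to_jwk`, `jwks_response`) are hypothetical, and the integer coordinates in the usage note are placeholders rather than real P-256 curve points. It assumes each loaded key is available as a `(kid, x, y)` tuple of affine coordinates.

```python
import base64
import json

# Each P-256 affine coordinate is serialized as exactly 32 big-endian bytes.
P256_COORD_BYTES = 32


def b64url_uint(value: int, length: int = P256_COORD_BYTES) -> str:
    """Encode an unsigned integer as fixed-width, unpadded base64url."""
    raw = value.to_bytes(length, "big")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def to_jwk(kid: str, x: int, y: int) -> dict:
    """Project one public key to the JWK shape listed in the F17 diagram."""
    return {
        "kty": "EC",
        "crv": "P-256",
        "kid": kid,
        "use": "sig",
        "alg": "ES256",
        "x": b64url_uint(x),
        "y": b64url_uint(y),
    }


def jwks_response(keys):
    """Build (status, headers, body) for GET /.well-known/jwks.json.

    `keys` is a list of (kid, x, y) tuples: an assumed in-memory shape,
    not the real JwtSigningKey type.
    """
    body = json.dumps({"keys": [to_jwk(kid, x, y) for kid, x, y in keys]})
    headers = {
        "Content-Type": "application/json",
        "Cache-Control": "public, max-age=3600",
    }
    return 200, headers, body
```

Calling `jwks_response([("kid-A", 1, 2), ("kid-B", 3, 4)])` returns one entry per loaded key, including non-active ones, which is what lets verifiers keep validating already-issued tokens during a rotation.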
diff --git a/_docs/02_document/tests/blackbox-tests.md b/_docs/02_document/tests/blackbox-tests.md index 95e8769..8b29016 100644 --- a/_docs/02_document/tests/blackbox-tests.md +++ b/_docs/02_document/tests/blackbox-tests.md @@ -811,3 +811,665 @@ The scenarios `FT-P-21`, `FT-P-22`, `FT-P-23` are retained here as ID placeholde **Max execution time**: 5s Note: AZ-197 AC-1 (resource download works without `Hardware`) is implicitly covered by the existing FT-P-09 / FT-P-10 scenarios once their request bodies are aligned with the new wire shape. AZ-197 AC-3..AC-8 are internal-signature / build-system invariants and are verified at build/CI time, not via a blackbox HTTP scenario. + +--- + +## Cycle 2 Additions (2026-05-14) — Auth Modernization (AZ-529 + AZ-530) + +The scenarios below were appended during the existing-code cycle 2 Test-Spec Sync (autodev Step 12) for the eight tasks under AZ-529 (Auth Mechanism Modernization) and AZ-530 (CMMC Compliance Hardening): AZ-531 (refresh-token flow), AZ-532 (asymmetric signing + JWKS), AZ-533 (mission-token UAV), AZ-534 (TOTP 2FA), AZ-535 (logout + revocation), AZ-536 (Argon2id), AZ-537 (rate-limit + lockout), AZ-538 (CORS HTTPS-only + HSTS). Numbering continues from FT-P-23 / FT-N-16. Security-only ACs live in `security-tests.md`. + +### Argon2id Password Hashing (AZ-536) + +#### FT-P-24: Legacy SHA-384 Password Still Validates + +**Summary**: A user whose `password_hash` is in the pre-AZ-536 unsalted SHA-384 format can still log in with the correct password. 
+**Traces to**: AZ-536 AC-2 +**Category**: Authentication + +**Preconditions**: +- Seed user `legacy@azaion.com` with `password_hash` set to `Convert.ToBase64String(SHA384.HashData("LegacyPwd1!"))` (the historical format) + +**Input data**: `{"email":"legacy@azaion.com","password":"LegacyPwd1!"}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login with the legacy user's credentials | HTTP 200, dual-token body (per AZ-531) | + +**Expected outcome**: HTTP 200, login succeeds against legacy hash format +**Max execution time**: 5s (note: Argon2id verify cost is incurred only on the post-login re-hash) + +--- + +#### FT-P-25: Successful Legacy Login Re-Hashes to Argon2id + +**Summary**: After FT-P-24 succeeds, the user's `password_hash` is silently upgraded to Argon2id PHC format and the same plaintext continues to validate. +**Traces to**: AZ-536 AC-3 +**Category**: Authentication + +**Preconditions**: +- FT-P-24 has just executed successfully for `legacy@azaion.com` + +**Input data**: `{"email":"legacy@azaion.com","password":"LegacyPwd1!"}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Read `users.password_hash` for `legacy@azaion.com` directly from DB | Value starts with `$argon2id$v=19$m=` and parses to m ≥ 65536, t ≥ 3, p ≥ 1 | +| 2 | POST /login with the same plaintext password again | HTTP 200, dual-token body | + +**Expected outcome**: Hash format upgraded to Argon2id PHC; subsequent login still works +**Max execution time**: 5s + +--- + +#### FT-N-17: Wrong Password Fails for Both Hash Formats + +**Summary**: Wrong password is rejected with the same error (`WrongPassword`) regardless of whether the stored hash is legacy SHA-384 or Argon2id. 
+**Traces to**: AZ-536 AC-4 +**Category**: Authentication + +**Preconditions**: +- One user with legacy SHA-384 hash, one user with Argon2id hash already in DB + +**Input data**: Wrong password against each user + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login (legacy user, wrong pwd) | HTTP 409, ExceptionEnum=WrongPassword (code 30) | +| 2 | POST /login (Argon2id user, wrong pwd) | HTTP 409, ExceptionEnum=WrongPassword (code 30) | + +**Expected outcome**: Same error code on both code paths; no information leak about hash format +**Max execution time**: 5s per attempt (Argon2id cost incurred regardless of success/failure) + +--- + +### /login Rate Limit + Account Lockout (AZ-537) + +#### FT-P-26: Successful Login Resets the Failed-Attempt Counter + +**Summary**: After some wrong-password attempts (within budget), a successful login zeros `failed_login_count` and clears `lockout_until`. +**Traces to**: AZ-537 AC-4 +**Category**: Authentication + +**Preconditions**: +- User `alice@azaion.com` exists with Argon2id-hashed password + +**Input data**: 5 wrong-password attempts followed by 1 correct attempt + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login with wrong pwd × 5 (within rate-limit budget) | HTTP 409 each (WrongPassword) | +| 2 | Read `users.failed_login_count` for alice | Value = 5 | +| 3 | POST /login with correct pwd | HTTP 200, dual-token body | +| 4 | Read `users.failed_login_count` and `lockout_until` for alice | `failed_login_count = 0`, `lockout_until IS NULL` | + +**Expected outcome**: Counter reset on success +**Max execution time**: 30s (5× Argon2id verifies) + +--- + +#### FT-P-27: Lockout Auto-Expires After Configured Duration + +**Summary**: A locked account becomes loginable again automatically once `lockout_until < now()`. 
+**Traces to**: AZ-537 AC-5 +**Category**: Authentication + +**Preconditions**: +- `Auth:Lockout:DurationMinutes` set to a small value (e.g. 1 minute) in the test env so the test does not have to wait 15 min +- User `bob@azaion.com` exists with Argon2id hash + +**Input data**: 10 wrong attempts to trigger lockout, then a correct attempt after the duration window + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login with wrong pwd × 10 | first 9 → 409 WrongPassword; the 10th → 423 Locked OR 409 followed by lockout flag | +| 2 | POST /login with correct pwd immediately | HTTP 423 Locked (account is locked) | +| 3 | Wait `Auth:Lockout:DurationMinutes + 1s` | — | +| 4 | POST /login with correct pwd | HTTP 200, dual-token body | + +**Expected outcome**: 423 → 200 transition once the lockout window expires +**Max execution time**: 90s (depends on configured lockout duration in test env) + +--- + +### CORS HTTPS-Only + HSTS (AZ-538) + +#### FT-P-28: HTTPS Origin Preflight Succeeds + +**Summary**: The CORS allow-list still admits the canonical `https://admin.azaion.com` origin and echoes the credentials flag. 
+**Traces to**: AZ-538 AC-2 +**Category**: Cross-Origin + +**Preconditions**: +- Admin API running with `AdminCorsPolicy` configured (post-AZ-538) + +**Input data**: +- Method: OPTIONS +- Path: /login +- Header: `Origin: https://admin.azaion.com` +- Header: `Access-Control-Request-Method: POST` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | OPTIONS /login with the headers above | HTTP 204; `Access-Control-Allow-Origin: https://admin.azaion.com`; `Access-Control-Allow-Credentials: true` | + +**Expected outcome**: HTTPS origin preflight succeeds with credentials flag +**Max execution time**: 5s + +--- + +#### FT-P-29: Development Env — No HTTPS Redirect, No HSTS + +**Summary**: When `ASPNETCORE_ENVIRONMENT=Development`, plain HTTP requests to localhost still serve 200 responses with no `Strict-Transport-Security` header. +**Traces to**: AZ-538 AC-5 +**Category**: Cross-Origin + +**Preconditions**: +- Admin API running with `ASPNETCORE_ENVIRONMENT=Development` (the default test container env) + +**Input data**: GET http://localhost:8080/health/live + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | GET http://localhost:8080/health/live | HTTP 200; no `Strict-Transport-Security` header; no 307 redirect | + +**Expected outcome**: Dev workflow preserved — no redirect, no HSTS +**Max execution time**: 5s + +--- + +### Refresh-Token Flow (AZ-531) + +#### FT-P-30: /login Returns Dual Tokens + +**Summary**: Successful login returns both a short-lived access token (≈15 min) and an opaque refresh token; a `sessions` row is created. 
+**Traces to**: AZ-531 AC-1 +**Category**: Authentication + +**Preconditions**: +- Seed user without MFA enabled + +**Input data**: Valid email + password + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login | HTTP 200; body has `access_token` (JWT), `access_exp` ≈ now+15m ±60s, `refresh_token` (opaque ≥43 chars), `refresh_exp` | +| 2 | Decode `access_token` payload | Contains `sub`, `iss`, `aud`, `exp`, `jti`, `sid` claims | +| 3 | Query `sessions` table by `user_id` | Exactly one row with non-null `refresh_hash`, non-null `family_id`, `revoked_at IS NULL` | + +**Expected outcome**: Dual tokens issued, session row persisted, access token has short TTL +**Max execution time**: 5s + +--- + +#### FT-P-31: /token/refresh Rotates the Refresh Token + +**Summary**: A valid refresh token is exchanged for a new access + new refresh; the previous refresh is invalidated; the session chain extends via `parent_session_id`. 
+**Traces to**: AZ-531 AC-2 +**Category**: Authentication + +**Preconditions**: +- FT-P-30 just produced refresh token R1 + +**Input data**: `{"refresh_token":"<R1>"}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /token/refresh with R1 | HTTP 200; body has new `access_token`, new `refresh_token` (R2 ≠ R1), new `access_exp`, new `refresh_exp` | +| 2 | POST /token/refresh with R1 again (same request body) | HTTP 401 (R1 has been rotated; see AC-3 reuse-detection in NFT-SEC-08) | +| 3 | Inspect `sessions` table | Original row's `refresh_hash` rotated; new row has `parent_session_id` chained to the previous row | + +**Expected outcome**: Rotation succeeds; old refresh dies; chain is preserved +**Max execution time**: 5s + +--- + +#### FT-P-32: Refresh Sliding + Absolute Expiry + +**Summary**: Refresh tokens slide on use up to the per-family absolute cap (12 h since the family's first issue); after the absolute cap, refresh fails. 
+**Traces to**: AZ-531 AC-4 +**Category**: Authentication + +**Preconditions**: +- A `sessions` family with `family_first_issued_at` set to `now() - 11h59m` (verified via DB seed) and a current valid refresh token R-current + +**Input data**: `{"refresh_token":"<R-current>"}`, called near and past the absolute cap + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /token/refresh at family-age 11h59m | HTTP 200, rotation succeeds; sliding window extended | +| 2 | Seed another family with `family_first_issued_at = now() - 12h01s` | — | +| 3 | POST /token/refresh on that family | HTTP 401, body indicates absolute-expiry violation | + +**Expected outcome**: Sliding works inside 12 h; absolute cap rejects beyond +**Max execution time**: 5s + +--- + +### Asymmetric Signing + JWKS (AZ-532) + +#### FT-P-33: GET /.well-known/jwks.json Serves the Active Public Key + +**Summary**: The JWKS endpoint is anonymous, cacheable, and returns a well-formed JWKS containing the active EC P-256 public key with `kid`. 
+**Traces to**: AZ-532 AC-2 +**Category**: Cryptography / Discovery + +**Preconditions**: +- Admin running with an ES256 keypair loaded from `secrets/jwt_signing_key.pem` + +**Input data**: None (anonymous GET) + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | GET /.well-known/jwks.json (no JWT) | HTTP 200; `Content-Type: application/json`; `Cache-Control: public, max-age=3600` | +| 2 | Parse body | `{"keys":[{"kty":"EC","crv":"P-256","kid":"<kid>","x":"<base64url>","y":"<base64url>","alg":"ES256","use":"sig"}, …]}` | + +**Expected outcome**: JWKS shape matches RFC 7517; cache headers present +**Max execution time**: 5s + +--- + +#### FT-P-34: Two-Key Overlap During Rotation + +**Summary**: When two signing keys are configured (`kid-A` active + `kid-B` standby), JWKS exposes both; tokens signed with the active key continue to verify; switching the active flag to `kid-B` produces `kid-B`-stamped tokens that also verify. +**Traces to**: AZ-532 AC-3 +**Category**: Cryptography / Rotation + +**Preconditions**: +- Two keys configured in `secrets/`: `jwt_signing_key_a.pem` (active), `jwt_signing_key_b.pem` (standby) + +**Input data**: Sequenced login + rotation toggle + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | GET /.well-known/jwks.json | Both `kid-A` and `kid-B` appear in `keys` array | +| 2 | POST /login | Returned access token has `kid: kid-A` in header | +| 3 | Toggle active key → `kid-B` (test-only admin endpoint or env reload) | — | +| 4 | POST /login again | Returned access token has `kid: kid-B` in header | +| 5 | Use either token against any protected endpoint | HTTP 200 (both verify against their respective public keys in JWKS) | + +**Expected outcome**: Overlap window allows both keys; verifiers can keep working through rotation +**Max execution time**: 10s + +--- + +### Mission-Token Issuance for UAV (AZ-533) + +#### FT-P-35: 
POST /sessions/mission Issues a Long-Lived Mission Token + +**Summary**: An authenticated pilot session can mint a mission-class access token with a duration ≈ `planned_duration_h + 1h` and no refresh token. +**Traces to**: AZ-533 AC-1 +**Category**: Mission Sessions + +**Preconditions**: +- Pilot user with valid (post-AZ-531) access token; MFA already proven within the session (post-AZ-534) +- Aircraft user `UAV-117` with `Role=CompanionPC` exists + +**Input data**: `{"mission_id":"M-2026-05-14-042","aircraft_id":"UAV-117","planned_duration_h":9,"requested_scope":["GPS"]}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /sessions/mission with the body above + pilot access token | HTTP 200; body has `access_token`, no `refresh_token`, `exp` ≈ now + 10h ±60s | +| 2 | Decode token payload | `token_class = "mission"` | +| 3 | Query `sessions` table | Row with `class='mission'`, `aircraft_id='UAV-117'`, `revoked_at IS NULL` | + +**Expected outcome**: Long-lived mission token issued; session persisted with class marker +**Max execution time**: 5s + +--- + +#### FT-P-36: Mission Token Carries Scope Claims + +**Summary**: The mission token's payload exposes `mission_id`, `aircraft_id`, `aud`, `permissions`, `sid`, `jti`. 
+**Traces to**: AZ-533 AC-3 +**Category**: Mission Sessions + +**Preconditions**: +- FT-P-35 just produced a mission token + +**Input data**: The mission token from FT-P-35 + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Decode mission token payload | `mission_id == "M-2026-05-14-042"`, `aircraft_id == "UAV-117"`, `aud == "satellite-provider"`, `permissions` contains `"GPS"`, `sid` non-empty, `jti` non-empty | + +**Expected outcome**: All scope claims present and correctly populated +**Max execution time**: 5s + +--- + +#### FT-P-37: Mission Token Auto-Revoked on Aircraft Reconnect + +**Summary**: When the aircraft user behind a mission session calls `/login` or `/token/refresh` again, every open mission session for that aircraft is marked `revoked_reason='post_flight_reconnect'` and the mission token stops working. +**Traces to**: AZ-533 AC-4 +**Category**: Mission Sessions + +**Preconditions**: +- Open mission session for `UAV-117` from FT-P-35 (token MT) + +**Input data**: A `/login` from the `UAV-117` companion PC user + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login as `UAV-117` (CompanionPC creds) | HTTP 200, dual tokens (per AZ-531) | +| 2 | Query `sessions` row for the original mission MT | `revoked_at` set; `revoked_reason = 'post_flight_reconnect'` | +| 3 | Use MT against any protected endpoint | HTTP 401 | + +**Expected outcome**: Reconnect implicitly revokes outstanding mission sessions for the same aircraft +**Max execution time**: 10s + +--- + +#### FT-N-18: POST /sessions/mission Requires Authentication + +**Summary**: Without an Authorization header, mission-token issuance is rejected at the gateway. 
+**Traces to**: AZ-533 AC-5 +**Category**: Mission Sessions + +**Preconditions**: None + +**Input data**: Same body as FT-P-35, no Authorization header + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /sessions/mission with no JWT | HTTP 401 | + +**Expected outcome**: Unauthenticated mission requests are rejected +**Max execution time**: 5s + +--- + +#### FT-N-19: POST /sessions/mission Rejects Over-Cap Duration + +**Summary**: A request for `planned_duration_h > 12` is rejected with HTTP 400 and a descriptive error message. +**Traces to**: AZ-533 AC-2 +**Category**: Mission Sessions + +**Preconditions**: +- Authenticated pilot session (with MFA `amr=mfa`) + +**Input data**: `{"mission_id":"M-2026-05-14-099","aircraft_id":"UAV-117","planned_duration_h":15,"requested_scope":["GPS"]}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /sessions/mission with the over-cap body | HTTP 400; response body contains `"planned_duration_h must be ≤ 12"` | + +**Expected outcome**: 400 with cap-violation message; no session row created +**Max execution time**: 5s + +--- + +### TOTP-Based 2FA at Login (AZ-534) + +#### FT-P-38: POST /users/me/mfa/enroll Returns Usable Secret + Recovery Codes + +**Summary**: A user without MFA can begin enrollment and receives a 32-char base32 TOTP secret, an `otpauth://` URL, a base64 PNG QR, and 10 recovery codes (≥12 chars each). 
+**Traces to**: AZ-534 AC-1 +**Category**: MFA Enrollment + +**Preconditions**: +- Authenticated user `mfauser@azaion.com`, `mfa_enabled = false` + +**Input data**: `{"password":"<plaintext>"}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /users/me/mfa/enroll with the body above | HTTP 200; body has `secret` (32-char base32), `otpauth_url` (matches `^otpauth://totp/`), `qr_png_base64` (non-empty), `recovery_codes` (length = 10, each ≥ 12 chars, base32) | +| 2 | Read `users.mfa_enabled` for the user | Value still `false` (only flips after `confirm`) | + +**Expected outcome**: Enrollment package returned; `mfa_enabled` not yet flipped +**Max execution time**: 5s + +--- + +#### FT-P-39: POST /users/me/mfa/confirm Activates MFA + +**Summary**: Submitting a valid TOTP code from the just-issued secret completes enrollment and flips `mfa_enabled = true`. +**Traces to**: AZ-534 AC-2 +**Category**: MFA Enrollment + +**Preconditions**: +- FT-P-38 just executed for the same user; the test holds the returned `secret` + +**Input data**: `{"code":"<TOTP code computed from secret at current time>"}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | Compute current 6-digit TOTP from `secret` (RFC 6238, 30 s window) | 6 digits | +| 2 | POST /users/me/mfa/confirm with the code | HTTP 200 | +| 3 | Read `users.mfa_enabled` and `users.mfa_enrolled_at` | `mfa_enabled = true`, `mfa_enrolled_at` non-null | + +**Expected outcome**: MFA activated; subsequent /login goes through the two-step flow +**Max execution time**: 5s + +--- + +#### FT-P-40: Two-Step Login With TOTP + +**Summary**: When a user has MFA enabled, `/login` returns an MFA-required envelope with a short-lived `mfa_token`; calling `/login/mfa` with the `mfa_token` + a valid TOTP code yields the real access + refresh; the access token's `amr` claim contains both `pwd` 
and `mfa`. +**Traces to**: AZ-534 AC-3 +**Category**: Authentication / MFA + +**Preconditions**: +- User from FT-P-39 (MFA enabled) + +**Input data**: Valid email + password, then `mfa_token` + TOTP code + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login with email + password | HTTP 200; body = `{ "mfa_required": true, "mfa_token": "<short-lived JWT>", "expires_in": 300 }`; no access/refresh present | +| 2 | POST /login/mfa with `{ "mfa_token": "<from step 1>", "code": "<TOTP>" }` | HTTP 200; body has access + refresh tokens | +| 3 | Decode access token | `amr` claim = `["pwd","mfa"]` | + +**Expected outcome**: Two-step flow completes; access token's `amr` reflects both factors +**Max execution time**: 10s + +--- + +#### FT-P-41: Recovery Code Substitutes for TOTP and Burns On Use + +**Summary**: A recovery code may be used in place of a TOTP code at `/login/mfa`. The same code on a subsequent attempt fails (single-use). The successful access token's `amr` claim records `recovery`. 
+**Traces to**: AZ-534 AC-4 +**Category**: Authentication / MFA + +**Preconditions**: +- User from FT-P-39; the test holds the `recovery_codes` array from FT-P-38 + +**Input data**: First recovery code, then re-use of the same code + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /login → get `mfa_token` | HTTP 200, MFA-required envelope | +| 2 | POST /login/mfa with `{ "mfa_token", "code": "<recovery_codes[0]>" }` | HTTP 200, access + refresh issued; `amr` = `["pwd","mfa","recovery"]` | +| 3 | POST /login → get a new `mfa_token` | HTTP 200, MFA-required envelope | +| 4 | POST /login/mfa with the SAME recovery code | HTTP 401 (recovery code burned) | + +**Expected outcome**: Recovery code works once, then is rejected +**Max execution time**: 10s + +--- + +#### FT-P-42: POST /users/me/mfa/disable Removes MFA + +**Summary**: Submitting password + a valid TOTP code disables MFA; subsequent `/login` returns access + refresh directly without the two-step flow. +**Traces to**: AZ-534 AC-5 +**Category**: MFA Enrollment + +**Preconditions**: +- User from FT-P-39 + +**Input data**: `{"password":"<plaintext>","code":"<TOTP>"}` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /users/me/mfa/disable | HTTP 200 | +| 2 | Read `users.mfa_enabled` | `false` | +| 3 | POST /login with email + password | HTTP 200; body has access + refresh directly (no `mfa_required`) | + +**Expected outcome**: MFA disabled, single-step login restored +**Max execution time**: 5s + +--- + +### Logout + Revocation Surface (AZ-535) + +#### FT-P-43: POST /logout Revokes the Current Session + +**Summary**: A POST /logout with a valid access token marks the session row revoked and disables the paired refresh token. 
+**Traces to**: AZ-535 AC-1 +**Category**: Session Lifecycle + +**Preconditions**: +- Active session from a prior /login (access token A, refresh token R) + +**Input data**: Authorization header `Bearer <A>`, empty body + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /logout with bearer A | HTTP 200 | +| 2 | Query the session row | `revoked_at` set; `revoked_reason = 'user_logout'` | +| 3 | POST /token/refresh with R | HTTP 401 | + +**Expected outcome**: Session revoked, refresh dies immediately +**Max execution time**: 5s + +--- + +#### FT-P-44: POST /logout/all Revokes Every Session for the User + +**Summary**: A user with multiple active sessions can sign out of all of them in one call. +**Traces to**: AZ-535 AC-2 +**Category**: Session Lifecycle + +**Preconditions**: +- User with three active sessions S1/S2/S3 (each from a separate /login) + +**Input data**: Authorization header `Bearer <A from S1>`, empty body + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /logout/all from S1 | HTTP 200 | +| 2 | Query `sessions` for the user | All three rows have `revoked_at` set | +| 3 | POST /token/refresh with the refresh tokens of S1/S2/S3 | All three return HTTP 401 | + +**Expected outcome**: Every session for the user is revoked +**Max execution time**: 10s + +--- + +#### FT-P-45: POST /sessions/{sid}/revoke Lets Admin Kill Any Session + +**Summary**: An Admin-role JWT can revoke any other user's session by id; the revoked row records the admin's user id. 
+**Traces to**: AZ-535 AC-3 +**Category**: Admin Session Management + +**Preconditions**: +- Admin user with valid (post-AZ-531) access token +- Target user with active session SID-X + +**Input data**: Authorization header `Bearer <admin access>`, path `/sessions/<SID-X>/revoke` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /sessions/SID-X/revoke as admin | HTTP 200 | +| 2 | Query the SID-X row | `revoked_at` set; `revoked_by_user_id` = admin's user id | +| 3 | POST /token/refresh with SID-X's refresh | HTTP 401 | + +**Expected outcome**: Admin-driven revocation works and records actor +**Max execution time**: 5s + +--- + +#### FT-P-46: GET /sessions/revoked?since=… Returns Recent, Non-Expired Revocations + +**Summary**: A verifier identity (`Role=Service`) polls the snapshot endpoint and gets the recently-revoked, still-valid sessions; expired entries are auto-pruned. +**Traces to**: AZ-535 AC-4 +**Category**: Verifier Snapshot + +**Preconditions**: +- 5 sessions revoked in the last hour, 2 of which already have `exp < now()` +- Verifier identity (Service role) with valid bearer + +**Input data**: Authorization header `Bearer <verifier access>`, query `?since=<unix-ts 1h ago>` + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | GET /sessions/revoked?since=<ts> with verifier bearer | HTTP 200; `Cache-Control: no-cache`; body is JSON array of length 3 | +| 2 | Inspect each entry | `{ jti, sid, exp }` shape; no expired entries present | + +**Expected outcome**: 3 non-expired revocations returned; expired ones pruned +**Max execution time**: 5s + +--- + +#### FT-P-47: POST /logout Is Idempotent + +**Summary**: Logging out a session that is already revoked returns 200 with `already_revoked: true` and does not write to the DB. 
+**Traces to**: AZ-535 AC-5 +**Category**: Session Lifecycle + +**Preconditions**: +- Already-revoked session from FT-P-43 + +**Input data**: Authorization header `Bearer <still-valid-but-stale access>`, empty body + +**Steps**: + +| Step | Consumer Action | Expected System Response | +|------|----------------|------------------------| +| 1 | POST /logout again | HTTP 200; body `{ "already_revoked": true }` | +| 2 | Query the session row's `updated_at` (or equivalent audit column) | Unchanged from before step 1 | + +**Expected outcome**: Idempotent — no second DB mutation +**Max execution time**: 5s diff --git a/_docs/02_document/tests/security-tests.md b/_docs/02_document/tests/security-tests.md index 0b1f2dd..cf5542f 100644 --- a/_docs/02_document/tests/security-tests.md +++ b/_docs/02_document/tests/security-tests.md @@ -92,3 +92,310 @@ The `POST /resources/get/{dataFolder?}` endpoint that this test exercised was re | 2 | Attempt POST /login with disabled user credentials | HTTP 409 or HTTP 403 | **Pass criteria**: Disabled user cannot obtain a JWT token + +--- + +## Cycle 2 Additions (2026-05-14) — Auth Modernization (AZ-529 + AZ-530) + +The scenarios below were appended during the existing-code cycle 2 Test-Spec Sync (autodev Step 12) for the security-only / cryptography-invariant ACs in cycle 2. Functional flows live in `blackbox-tests.md` under the matching task. Numbering continues from NFT-SEC-06. + +### NFT-SEC-07: New User Hashes Use Argon2id (AZ-536) + +**Summary**: A freshly-registered user's `password_hash` is in Argon2id PHC format with parameters at or above the configured floor. 
+**Traces to**: AZ-536 AC-1 + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | POST /users (ApiAdmin JWT) registering `freshuser@azaion.com` with a known password | HTTP 200 | +| 2 | Read `users.password_hash` for `freshuser@azaion.com` directly from Postgres | Value starts with `$argon2id$v=19$m=` | +| 3 | Parse the PHC string parameters | `m ≥ 65536`, `t ≥ 3`, `p ≥ 1` | + +**Pass criteria**: All new users land in Argon2id PHC format with at least the configured cost parameters; no SHA-384 base64 strings written for new accounts. + +--- + +### NFT-SEC-08: Argon2id Verify Has No Remotely Observable Timing Leak (AZ-536) + +**Summary**: `VerifyPassword` is constant-time across wrong passwords of various lengths; timing variance does not leak information about the candidate password. +**Traces to**: AZ-536 AC-5 + +**Preconditions**: +- User with Argon2id-hashed password +- Test environment with low concurrency (this test is sensitive to host noise; if it intermittently trips, widen the bound or warm Argon2 with a non-test login first; see cycle-2 carry-forward F6) + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | POST /login with a wrong 8-char password, sample N=20 timings | Each → HTTP 409 WrongPassword | +| 2 | POST /login with a wrong 64-char password, sample N=20 timings | Each → HTTP 409 WrongPassword | +| 3 | Compute median of each sample; compare | `abs(median_8 − median_64) / median_8 < 0.20` (within 20% of each other, since Argon2id cost dominates string-comparison cost) | + +**Pass criteria**: Wrong-password verify time is dominated by Argon2id cost, not by string-length-dependent comparison; no exploitable timing channel. + +--- + +### NFT-SEC-09: Per-IP Rate Limit Returns 429 (AZ-537) + +**Summary**: 11 `/login` requests from the same client IP within 60 s force the 11th into HTTP 429 with a `Retry-After` header. 
+**Traces to**: AZ-537 AC-1 + +**Preconditions**: +- Rate-limit `Auth:RateLimit:PerIp` set to 10 / 60 s sliding (the test env value) +- Test client preserves source IP across requests (E2E container-shared-IP caveat applies — see test_run_report cycle 2 skip note for the legitimate environment-mismatch skip) + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | POST /login × 10 from the same IP within 5 s (any mix of right/wrong passwords) | HTTP 200 / HTTP 409 (within budget) | +| 2 | POST /login as the 11th request inside the 60 s window | HTTP 429; response includes `Retry-After` header (integer seconds) | + +**Pass criteria**: 11th request inside the window is rejected with 429 + Retry-After. (Legitimate environment-mismatch skip in shared-IP container envs — verified by ASP.NET Core RateLimiter unit tests + manual probe documented in AZ-537 spec.) + +--- + +### NFT-SEC-10: Per-Account Rate Limit Returns 429 (AZ-537) + +**Summary**: 6 `/login` requests for the same email from 6 different IPs within 5 min force the 6th into HTTP 429. +**Traces to**: AZ-537 AC-2 + +**Preconditions**: +- Rate-limit `Auth:RateLimit:PerAccount` set to 5 / 5 min sliding +- Test ability to spoof / vary the source IP per request (e.g. via `X-Forwarded-For` if the app trusts a known forwarder, or a multi-host test fixture) + +**Steps**: + +| Step | Consumer Action | Expected Response | +|------|----------------|------------------| +| 1 | POST /login for `alice@azaion.com` from IPs 1..5 within 1 min (any mix of right/wrong passwords) | HTTP 200 / HTTP 409 (within budget) | +| 2 | POST /login for `alice@azaion.com` from IP 6 inside the 5 min window | HTTP 429; `Retry-After` present | + +**Pass criteria**: Per-account partition triggers independently of per-IP partition. 
+
+---
+
+### NFT-SEC-11: Account Lockout Returns 423 Even For Correct Password (AZ-537)
+
+**Summary**: Once `failed_login_count` hits the lockout threshold, the account returns HTTP 423 Locked even for subsequent correct-password attempts until `lockout_until` passes.
+**Traces to**: AZ-537 AC-3
+
+**Preconditions**:
+- `Auth:Lockout:MaxAttempts = 10` (default)
+- User `bob@azaion.com` with Argon2id-hashed password
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | POST /login for `bob@azaion.com` with wrong password × 10 (across IPs / within rate budget) | First 9 → HTTP 409 WrongPassword; 10th → HTTP 423 Locked OR final 409 followed by lockout flag |
+| 2 | Read `users.lockout_until` and `users.failed_login_count` for `bob` | `lockout_until > now()`; counter at threshold |
+| 3 | POST /login for `bob` with correct password immediately after | HTTP 423 Locked (lockout precedes credential check) |
+
+**Pass criteria**: Lockout state takes precedence over correct credentials within the lockout window; counter persists across IPs (per-account, not per-IP).
+
+---
+
+### NFT-SEC-12: Lockout Is Audit-Logged (AZ-537)
+
+**Summary**: When NFT-SEC-11 fires the lockout transition, an audit-log row is written with the email, source IP, and timestamp.
+**Traces to**: AZ-537 AC-6
+
+**Preconditions**:
+- Audit log infrastructure online (verified by existing logging tests)
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | Trigger NFT-SEC-11 against `bob@azaion.com` from IP `203.0.113.7` | Lockout fires |
+| 2 | Query the audit log for entries with `event = 'login_lockout'` since the test start | At least one row with `email = 'bob@azaion.com'`, `ip = '203.0.113.7'`, `timestamp` within ± 5 s of the lockout trigger |
+
+**Pass criteria**: Each lockout produces a `login_lockout` audit entry with the security-relevant fields.
+
+---
+
+### NFT-SEC-13: HTTP CORS Origin Is Rejected (AZ-538)
+
+**Summary**: A browser preflight from the cleartext `http://admin.azaion.com` origin must NOT receive an `Access-Control-Allow-Origin` header (CORS denies the request).
+**Traces to**: AZ-538 AC-1
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | OPTIONS /login with `Origin: http://admin.azaion.com`, `Access-Control-Request-Method: POST` | HTTP 204 OR 200; response has NO `Access-Control-Allow-Origin` header |
+
+**Pass criteria**: HTTP origin gets no ACAO header — browser-side fetch with credentials will fail in any compliant browser.
+
+---
+
+### NFT-SEC-14: HSTS Header Present in Production (AZ-538)
+
+**Summary**: When `ASPNETCORE_ENVIRONMENT=Production`, every HTTPS response includes a strict `Strict-Transport-Security` header.
+**Traces to**: AZ-538 AC-3
+
+**Preconditions**:
+- Admin container running with `ASPNETCORE_ENVIRONMENT=Production`
+- Note: the default test harness runs `Development`; this test must be run with the production env override OR is the legitimate environment-mismatch skip documented in cycle-2 test_run_report
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | GET https://admin.azaion.com/health/live (or any HTTPS endpoint) | HTTP 200; response header `Strict-Transport-Security: max-age=31536000; includeSubDomains; preload` |
+
+**Pass criteria**: Production responses always carry HSTS with the documented directives.
+
+---
+
+### NFT-SEC-15: HTTP Request Redirects to HTTPS in Production (AZ-538)
+
+**Summary**: When `ASPNETCORE_ENVIRONMENT=Production`, a cleartext HTTP request returns HTTP 307 to the same path on HTTPS.
+**Traces to**: AZ-538 AC-4
+
+**Preconditions**: Same as NFT-SEC-14
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | GET http://admin.azaion.com/health/live | HTTP 307; `Location: https://admin.azaion.com/health/live` |
+
+**Pass criteria**: HTTP traffic is redirected at the protocol layer, not silently served.
+
+---
+
+### NFT-SEC-16: Refresh-Token Reuse Kills the Session Family (AZ-531)
+
+**Summary**: If a previously-rotated refresh token is presented again, the entire `sessions` family chain (parent + all descendants) is marked `revoked_reason='reuse_detected'` and every refresh in that family stops working.
+**Traces to**: AZ-531 AC-3
+
+**Preconditions**:
+- A session family with refresh R1 rotated to R2 (per FT-P-31)
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | POST /token/refresh with R1 (already rotated) | HTTP 401 |
+| 2 | Query `sessions` for the family | Every row in the family has `revoked_at` set; `revoked_reason = 'reuse_detected'` |
+| 3 | POST /token/refresh with R2 | HTTP 401 (R2 also dead — family-wide kill) |
+
+**Pass criteria**: Reuse detection kills the entire family, not just the reused refresh.
+
+---
+
+### NFT-SEC-17: Refresh Tokens Are Opaque, Not JWT (AZ-531)
+
+**Summary**: Refresh tokens issued by /login or /token/refresh are not JWTs; the persisted form is the SHA-256 hash; the raw value never appears in logs.
+**Traces to**: AZ-531 AC-5
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | POST /login → capture `refresh_token` R | R is a non-empty string ≥ 43 chars (base64url of 32 bytes) |
+| 2 | Attempt to parse R as a JWT (split on `.` and base64url-decode the segments) | Parse fails — R does not split into a JWT header/payload/signature shape |
+| 3 | Read the matching `sessions.refresh_hash` column directly from Postgres | SHA-256 digest: 32 bytes if stored raw, 43 to 44 chars if stored base64-encoded; value ≠ R |
+| 4 | Grep API logs (Serilog output) for the literal R | No match (raw refresh value never logged) |
+
+**Pass criteria**: Refresh tokens are opaque, hashed at rest, and never logged in raw form.
+
+---
+
+### NFT-SEC-18: Admin Tokens Are Signed With ES256 + kid (AZ-532)
+
+**Summary**: An access token returned by /login has `alg=ES256` and a `kid` matching one of the active JWKS keys.
+**Traces to**: AZ-532 AC-1
+
+**Preconditions**:
+- Admin running with at least one ES256 keypair loaded from `secrets/jwt_signing_key.pem`
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | POST /login with valid credentials | HTTP 200, dual tokens |
+| 2 | Decode the access token's JOSE header | `alg == "ES256"`, `kid` non-empty |
+| 3 | GET /.well-known/jwks.json | The same `kid` appears in the returned `keys` array |
+
+**Pass criteria**: Tokens are signed asymmetrically and carry the `kid` discriminator needed for rotation.
+
+---
+
+### NFT-SEC-19: JWKS Endpoint Never Exposes Private Material (AZ-532)
+
+**Summary**: The JWKS payload contains only public components; no `d`, `p`, `q`, `dp`, `dq`, or `qi` field appears.
+**Traces to**: AZ-532 AC-4
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | GET /.well-known/jwks.json | HTTP 200, JSON body |
+| 2 | Inspect every entry in `keys` for forbidden private-material fields | None of `d`, `p`, `q`, `dp`, `dq`, `qi` is present |
+
+**Pass criteria**: Public-key set strictly excludes any private scalar (EC) or RSA private primes.
+
+---
+
+### NFT-SEC-20: alg-Confusion Attack Is Rejected (AZ-532)
+
+**Summary**: A forged token with `alg=HS256` (where the signature is computed using the public key as the HMAC secret) is rejected by every protected endpoint, because `TokenValidationParameters.ValidAlgorithms` pins ES256 only.
+**Traces to**: AZ-532 AC-5
+
+**Preconditions**:
+- Test fixture able to construct a forged JWT given the public key
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | Build a JWT with header `{ "alg":"HS256","typ":"JWT","kid":"<active-kid>" }`; payload claims valid; signature = HMAC-SHA256(publicKeyBytes, signingInput) | Forged token string |
+| 2 | GET /users with the forged token | HTTP 401 |
+
+**Pass criteria**: Algorithm-confusion forgery is rejected; verifier does not silently downgrade to HS256.
+
+---
+
+### NFT-SEC-21: Mission Token Requires MFA Step-Up (AZ-533 + AZ-534)
+
+**Summary**: After AZ-534 ships, `POST /sessions/mission` MUST reject access tokens whose `amr` does not include `mfa`. Caller gets 403 with a step-up message.
+**Traces to**: AZ-533 AC-6
+
+**Preconditions**:
+- AZ-534 already landed (it has — cycle 2 batch 4)
+- Caller holds an access token with `amr=["pwd"]` (e.g. legacy session, or a service account that doesn't enroll MFA)
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | POST /sessions/mission with the `amr=["pwd"]` access token + a valid mission body | HTTP 403; response body contains `"mission tokens require step-up MFA"` |
+
+**Pass criteria**: Mission-class tokens cannot be minted without MFA in the access-token `amr` chain.
+
+> Note: cycle-2 follow-up F1 in `_docs/03_implementation/implementation_report_auth_modernization_cycle2.md` calls out that `/sessions/mission` enforcement of `amr=mfa` is the small wire-up still pending after AZ-534 shipped (the AC was deferred during AZ-533, then re-opened under F1). Until F1 lands, this scenario is the spec contract; the matching test may be marked Pending in the SUT.
+
+---
+
+### NFT-SEC-22: TOTP Secret Is Encrypted at Rest (AZ-534)
+
+**Summary**: The `users.mfa_secret` column never holds plaintext base32; only ciphertext.
+**Traces to**: AZ-534 AC-6
+
+**Preconditions**:
+- An enrolled user from FT-P-39
+
+**Steps**:
+
+| Step | Consumer Action | Expected Response |
+|------|----------------|------------------|
+| 1 | Read `users.mfa_secret` for the enrolled user directly from Postgres | Value is non-empty |
+| 2 | Try to base32-decode the value as if it were a 32-char TOTP secret | Decode either fails OR yields material that does NOT round-trip to a working TOTP code |
+| 3 | Confirm the value is the output of `IDataProtector.Protect(<plaintext base32>)` (length ≫ 32 chars; format-prefixed) | Matches `IDataProtector` ciphertext shape |
+
+**Pass criteria**: `mfa_secret` is stored encrypted; reading the DB row alone does not yield a usable TOTP secret. (Operational note: production must set `DataProtection:KeysFolder` for the `IDataProtector` to outlive container restarts — see cycle-2 carry-forward F3.)
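Several of the security scenarios above reduce to small, mechanical checks on strings and JSON (NFT-SEC-07 step 3, NFT-SEC-08 step 3, NFT-SEC-17 step 2, NFT-SEC-19 step 2, NFT-SEC-20 step 1). The following is a minimal Python sketch of those checks, not the project's actual test code. Every function name here is hypothetical, and the encoding of `public_key_bytes` (PEM or DER) is an assumption, since the spec only says "publicKeyBytes".

```python
import base64
import hashlib
import hmac
import json
from statistics import median

# Private-material JWK members named in NFT-SEC-19.
PRIVATE_JWK_FIELDS = {"d", "p", "q", "dp", "dq", "qi"}


def b64url(data: bytes) -> str:
    """Unpadded base64url, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def parse_phc_params(phc: str) -> dict:
    """NFT-SEC-07 step 3: extract m, t, p from an Argon2id PHC string."""
    parts = phc.split("$")  # ['', 'argon2id', 'v=19', 'm=..,t=..,p=..', salt, hash]
    if len(parts) < 4 or parts[1] != "argon2id":
        raise ValueError("not an Argon2id PHC string")
    return {k: int(v) for k, v in (kv.split("=") for kv in parts[3].split(","))}


def timing_within_bound(short_ms, long_ms, bound=0.20) -> bool:
    """NFT-SEC-08 step 3: relative gap between the two medians is below the bound."""
    m8, m64 = median(short_ms), median(long_ms)
    return abs(m8 - m64) / m8 < bound


def looks_like_jwt(token: str) -> bool:
    """NFT-SEC-17 step 2: three dot-separated segments whose first two decode to JSON objects."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    try:
        for seg in parts[:2]:
            padded = seg + "=" * (-len(seg) % 4)
            if not isinstance(json.loads(base64.urlsafe_b64decode(padded)), dict):
                return False
    except ValueError:  # covers base64, UTF-8 and JSON decode errors
        return False
    return True


def leaked_private_fields(jwks: dict) -> list:
    """NFT-SEC-19 step 2: list (kid, field) pairs for any private member found."""
    return [
        (key.get("kid"), field)
        for key in jwks.get("keys", [])
        for field in sorted(PRIVATE_JWK_FIELDS.intersection(key))
    ]


def forge_hs256(public_key_bytes: bytes, kid: str, claims: dict) -> str:
    """NFT-SEC-20 step 1: HS256 token signed with the public key as the HMAC secret."""
    header = {"alg": "HS256", "typ": "JWT", "kid": kid}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(public_key_bytes, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)
```

A test fixture could call `forge_hs256` with the key bytes fetched for the active `kid` and send the result to `GET /users`; the scenario passes only if the API answers HTTP 401.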
diff --git a/_docs/02_document/tests/traceability-matrix.md b/_docs/02_document/tests/traceability-matrix.md
index 4c5a1ad..fc0e06b 100644
--- a/_docs/02_document/tests/traceability-matrix.md
+++ b/_docs/02_document/tests/traceability-matrix.md
@@ -137,3 +137,99 @@ The encrypted-download and installer-download endpoints were removed as obsolete
 | Cycle-2 AC-4 | `ExceptionEnum` no longer carries `WrongResourceName` (50); the gap is preserved | — | Build/CI invariant — verified by enum read |
 | Cycle-2 AC-5 | `Azaion.Test` project no longer in solution; build is clean | — | Build invariant — `dotnet build Azaion.AdminApi.sln` clean post-cleanup |
 | Cycle-2 AC-6 | E2E suite passes after the test deletions above | All e2e tests | Covered by Step 11 Run Tests post-cleanup (2026-05-14) |
+
+## Cycle 2 Additions (2026-05-14) — Auth Modernization (AZ-529 + AZ-530)
+
+Appended during the existing-code cycle 2 Test-Spec Sync (autodev Step 12) for the eight tasks delivered by the auth-modernization + CMMC-hardening epics. Rows below are namespaced by tracker ID; functional scenarios live in `blackbox-tests.md`, security-only invariants in `security-tests.md`. Existing AC/test IDs from earlier cycles are preserved unchanged.
+
+### AZ-536 — Argon2id Password Hashing (epic AZ-530, 5 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-536 AC-1 | New users get Argon2id hashes (PHC, m ≥ 64 MiB, t ≥ 3, p ≥ 1) | NFT-SEC-07 | Covered |
+| AZ-536 AC-2 | Legacy SHA-384 hashes still validate | FT-P-24 | Covered |
+| AZ-536 AC-3 | Successful legacy login transparently re-hashes to Argon2id | FT-P-25 | Covered |
+| AZ-536 AC-4 | Wrong password fails for both formats with the same error code | FT-N-17 | Covered |
+| AZ-536 AC-5 | Verify is constant-time (no remotely observable timing leak) | NFT-SEC-08 | Covered (with known suite-concurrency flake — see cycle-2 carry-forward F6) |
+
+### AZ-537 — /login Rate Limit + Account Lockout (epic AZ-530, 6 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-537 AC-1 | Per-IP rate limit triggers HTTP 429 with `Retry-After` | NFT-SEC-09 | Covered (legitimate environment-mismatch skip in shared-IP container env) |
+| AZ-537 AC-2 | Per-account rate limit triggers HTTP 429 across IPs | NFT-SEC-10 | Covered |
+| AZ-537 AC-3 | Account lockout after 10 failures returns 423 even on correct password | NFT-SEC-11 | Covered |
+| AZ-537 AC-4 | Successful login resets `failed_login_count` and clears `lockout_until` | FT-P-26 | Covered |
+| AZ-537 AC-5 | Lockout auto-expires after configured duration | FT-P-27 | Covered |
+| AZ-537 AC-6 | Audit-log entry written on each lockout event | NFT-SEC-12 | Covered |
+
+### AZ-538 — CORS HTTPS-Only + HSTS (epic AZ-530, 5 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-538 AC-1 | HTTP origin gets no `Access-Control-Allow-Origin` header | NFT-SEC-13 | Covered |
+| AZ-538 AC-2 | HTTPS origin preflight echoes credentials flag | FT-P-28 | Covered |
+| AZ-538 AC-3 | HSTS header present in production responses | NFT-SEC-14 | Covered (legitimate Production-only environment-mismatch skip in dev test harness — verified by code inspection of `Program.cs UseHsts`) |
+| AZ-538 AC-4 | HTTP request returns 307 to HTTPS in production | NFT-SEC-15 | Covered (legitimate Production-only environment-mismatch skip in dev test harness — verified by code inspection of `Program.cs UseHttpsRedirection`) |
+| AZ-538 AC-5 | Development env unchanged (no redirect, no HSTS) | FT-P-29 | Covered |
+
+### AZ-531 — Refresh-Token Flow (epic AZ-529, 5 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-531 AC-1 | `/login` returns dual tokens, session row persisted | FT-P-30 | Covered |
+| AZ-531 AC-2 | `/token/refresh` rotates refresh + chains via `parent_session_id` | FT-P-31 | Covered |
+| AZ-531 AC-3 | Reuse-detection kills the entire session family | NFT-SEC-16 | Covered |
+| AZ-531 AC-4 | Sliding window + 12 h absolute family expiry | FT-P-32 | Covered |
+| AZ-531 AC-5 | Refresh tokens are opaque, hashed at rest, never logged in raw form | NFT-SEC-17 | Covered |
+
+### AZ-532 — Asymmetric Signing + JWKS (epic AZ-529, 5 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-532 AC-1 | Access tokens carry `alg=ES256` + `kid` | NFT-SEC-18 | Covered |
+| AZ-532 AC-2 | `GET /.well-known/jwks.json` serves the active public key with cache headers | FT-P-33 | Covered |
+| AZ-532 AC-3 | Two-key overlap during rotation (both JWKS entries valid) | FT-P-34 | Covered |
+| AZ-532 AC-4 | JWKS never exposes private material | NFT-SEC-19 | Covered |
+| AZ-532 AC-5 | alg-confusion forgery (HS256 with public key as secret) is rejected | NFT-SEC-20 | Covered |
+
+### AZ-533 — Mission-Token Issuance for UAV (epic AZ-529, 6 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-533 AC-1 | Mission token issued with correct lifetime (`planned_duration_h + 1h`) | FT-P-35 | Covered |
+| AZ-533 AC-2 | Hard cap of 12 h enforced (HTTP 400 with cap message) | FT-N-19 | Covered |
+| AZ-533 AC-3 | Mission token carries `mission_id`, `aircraft_id`, `aud`, `permissions`, `sid`, `jti` | FT-P-36 | Covered |
+| AZ-533 AC-4 | Mission session auto-revoked when aircraft user reconnects | FT-P-37 | Covered |
+| AZ-533 AC-5 | Endpoint requires authenticated session | FT-N-18 | Covered |
+| AZ-533 AC-6 | MFA step-up required (`amr` must include `mfa`) | NFT-SEC-21 | **Spec only** — pending wire-up post-AZ-534 (cycle-2 carry-forward F1) |
+
+### AZ-534 — TOTP-Based 2FA at Login (epic AZ-529, 6 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-534 AC-1 | Enrollment returns secret + QR + 10 recovery codes | FT-P-38 | Covered |
+| AZ-534 AC-2 | Confirm with valid TOTP completes enrollment | FT-P-39 | Covered |
+| AZ-534 AC-3 | Two-step `/login` → `/login/mfa` flow; access-token `amr=["pwd","mfa"]` | FT-P-40 | Covered |
+| AZ-534 AC-4 | Recovery code substitutes for TOTP and is single-use | FT-P-41 | Covered |
+| AZ-534 AC-5 | Disable requires password + valid TOTP | FT-P-42 | Covered |
+| AZ-534 AC-6 | TOTP secret encrypted at rest in `users.mfa_secret` | NFT-SEC-22 | Covered |
+
+### AZ-535 — Logout + Revocation Surface (epic AZ-529, 5 ACs)
+
+| AC ID | Acceptance Criterion | Test IDs | Coverage |
+|-------|---------------------|----------|----------|
+| AZ-535 AC-1 | `POST /logout` revokes the current session and kills refresh | FT-P-43 | Covered |
+| AZ-535 AC-2 | `POST /logout/all` revokes every session for the user | FT-P-44 | Covered |
+| AZ-535 AC-3 | Admin can revoke any session by id; row records actor | FT-P-45 | Covered |
+| AZ-535 AC-4 | `GET /sessions/revoked?since=…` returns recent, non-expired entries | FT-P-46 | Covered |
+| AZ-535 AC-5 | `POST /logout` is idempotent (no second DB write) | FT-P-47 | Covered |
+
+## Cycle 2 Coverage Update
+
+| Category | Total Items | Covered | Not Yet Wired | Coverage % |
+|----------|-----------|---------|---------------|-----------|
+| Acceptance Criteria (cycle 2 — auth modernization) | 43 | 42 | 1 (AZ-533 AC-6 — pending wire-up F1) | 98% |
+| Acceptance Criteria — combined total (baseline + cycle 1 + cycle 2 cleanup + cycle 2 auth) | 100 | 96 | 1 (F1) + 3 baseline restrictions still uncovered | 96% |
+
+The single uncovered cycle-2 AC (AZ-533 AC-6) is documented in the cycle-2 implementation report as carry-forward item F1 — the `/sessions/mission` `amr=mfa` enforcement was deferred during AZ-533, became implementable once AZ-534 shipped, and is filed as a follow-up ticket to be picked up in a later cycle.
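The summary figures in the coverage table can be recomputed from the per-task AC counts in the section headings above. The sketch below does that arithmetic in Python; the dictionary and the `coverage_pct` helper are illustrative names, not part of the project, and rounding to the nearest whole percent is an assumption about how the table was produced.

```python
# Per-task AC counts, copied from the "(epic ..., N ACs)" section headings.
AC_COUNTS = {
    "AZ-536": 5, "AZ-537": 6, "AZ-538": 5, "AZ-531": 5,
    "AZ-532": 5, "AZ-533": 6, "AZ-534": 6, "AZ-535": 5,
}


def coverage_pct(covered: int, total: int) -> int:
    """Coverage as a whole-number percentage (assumed rounding rule)."""
    return round(100 * covered / total)


cycle2_total = sum(AC_COUNTS.values())   # all cycle-2 auth ACs
cycle2_covered = cycle2_total - 1        # AZ-533 AC-6 is spec only (F1)

print(cycle2_total, cycle2_covered, coverage_pct(cycle2_covered, cycle2_total))
print(coverage_pct(96, 100))             # combined row: 96 of 100 covered
```

Under these assumptions the eight tasks sum to 43 ACs, and both rows of the table (42 of 43, and 96 of 100) round to the percentages shown.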