[AZ-529] [AZ-530] Cycle-2 documentation refresh

Refreshes _docs/02_document/ to reflect the cycle-2 auth-modernization
+ CMMC hardening landings (AZ-531..AZ-538). Authoritative source for
the ripple set is ripple_log_cycle2.md.

Covered:
- architecture.md (section 1 rewritten, ADRs 6-9 added)
- data_model.md (sessions, audit_events, user columns, migrations)
- system-flows.md (F1 rewritten; F11-F17 added; F2/F7/F9 minor)
- module-layout.md (cycle-2 sub-component table)
- diagrams/flows/flow_login.md (dual-token + MFA)
- components/{01_data_layer,03_auth_and_security,05_admin_api}
- modules/ (12 new, 8 modified — full Argon2id/ES256/MFA/refresh
  /mission/session/audit/jwks rollup)
- tests/{blackbox,security,traceability-matrix}

Step 13 (Update Docs) output for cycle 2.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-14 09:22:53 +03:00
parent c2c659ef62
commit a77b3f8a59
35 changed files with 3624 additions and 468 deletions
@@ -0,0 +1,73 @@
# Module: Azaion.Services.MfaService
## Purpose
RFC 6238 TOTP-based 2FA at credential login. Manages enrollment, confirmation, disable, and second-factor verification, and issues the short-lived step-1 JWT carried between `/login` and `/login/mfa`.
> Added in cycle 2 (2026-05-14) by AZ-534 (Epic AZ-529). Per-user opt-in initially; no policy yet enforces MFA by role. AZ-533 mission-token issuance has a TODO to require `amr=["pwd","mfa"]` once MFA adoption is established.
## Public Interface
### IMfaService
| Method | Signature | Description |
|--------|-----------|-------------|
| `Enroll` | `Task<MfaEnrollResponse> Enroll(Guid userId, string password, CancellationToken ct = default)` | Generates a TOTP secret + 10 single-use recovery codes, persists the encrypted secret + hashed recovery codes, returns the secret/otpauth-url/QR/recovery codes (ONCE — recovery codes are unrecoverable after this response). Requires fresh password re-auth. `mfa_enabled` stays false until `Confirm`. |
| `Confirm` | `Task Confirm(Guid userId, string code, CancellationToken ct = default)` | Validates one TOTP code against the enrolled secret; on success sets `mfa_enabled=true`. |
| `Disable` | `Task Disable(Guid userId, string password, string code, CancellationToken ct = default)` | Removes MFA; requires both password re-auth and a valid TOTP code (no recovery-code substitution here — disable should be deliberate). |
| `IssueMfaStepToken` | `string IssueMfaStepToken(Guid userId)` | Mints a 5-minute ES256 JWT (audience `azaion-mfa-step2`) returned at `/login` step-1 when the user has MFA enabled. The client carries it back to `/login/mfa`. |
| `ValidateMfaStepToken` | `Guid ValidateMfaStepToken(string token)` | Decodes a step-1 token, returns the userId. Throws `BusinessException(InvalidMfaToken)` on bad signature, audience mismatch, or expiry. |
| `VerifyForLogin` | `Task<string[]> VerifyForLogin(Guid userId, string code, CancellationToken ct = default)` | Step-2 verification at login. Returns the AMR array the access token should carry — `["pwd","mfa"]` for TOTP success, `["pwd","mfa","recovery"]` if a recovery code was consumed. Throws `BusinessException(InvalidMfaCode)` on failure. |
## Internal Logic
- **Secret generation**: 20-byte (160-bit) random key per RFC 6238 §3, encoded as 32-char base32. Stored encrypted at rest via `IDataProtector` (purpose `Azaion.Mfa.Secret.v1`).
- **otpauth URL**: built via `OtpUri` (Otp.NET) with SHA-1 / 6 digits / 30-sec period — RFC 6238 defaults.
- **QR**: PNG generated via `QRCoder.QRCodeGenerator` (ECCLevel.M), returned as base64. The endpoint hands the raw PNG bytes back; the UI inlines the data URL.
- **Recovery codes**: 10 codes, each 10 random bytes → 16-char base32. Stored as `{ Hash, UsedAt }` JSON array; hash is SHA-256 hex (high-entropy secret → fast hash is appropriate, same reasoning as the refresh-token store). Single-use enforcement via the `UsedAt` field plus a conditional update on the prior JSON to defend against concurrent-use races.
- **TOTP verification** uses Otp.NET's `Totp.VerifyTotp` with `VerificationWindow.RfcSpecifiedNetworkDelay` (±1 step). Each successful verification persists the matched time-step counter to `users.mfa_last_used_window`; subsequent codes with `matched_window <= last_used_window` are rejected to prevent in-window replay.
- **Step-1 token**: ES256 JWT with audience `azaion-mfa-step2` (intentionally distinct from the main `JwtConfig.Audience` so the main JwtBearer middleware rejects it). Lifetime 5 min — matches AZ-534 AC-3.
- **Disable's raw SQL** — setting `mfa_recovery_codes` (jsonb) back to NULL via the LinqToDB UPDATE expression API sends an untyped NULL literal that Postgres parses as text and rejects (42804). A small parameterized SQL avoids the type-inference dance.
## Dependencies
- `IDbFactory` — admin connection for user updates
- `IUserService` — user lookup by id
- `IDataProtectionProvider` — encrypts `mfa_secret` at rest (key storage configured via `DataProtection:KeysFolder`; defaults to per-machine ephemeral)
- `IJwtSigningKeyProvider` — ES256 signing for the step-1 token
- `IOptions<JwtConfig>` — issuer for the step-1 token
- `IAuditLog` — emits `mfa_enroll` / `mfa_confirm` / `mfa_disable` / `mfa_login_success` / `mfa_login_failed` / `mfa_recovery_used`
- `Security` — password verification (Argon2id) for re-auth on enroll/disable
- Otp.NET (TOTP), QRCoder (PNG generation)
## Consumers
- `Program.cs` `/users/me/mfa/enroll`, `/users/me/mfa/confirm`, `/users/me/mfa/disable`
- `Program.cs` `/login` — calls `IssueMfaStepToken` when `user.MfaEnabled`
- `Program.cs` `/login/mfa` — calls `ValidateMfaStepToken` then `VerifyForLogin`
## Data Models
Operates on the `User` entity (`mfa_enabled`, `mfa_secret`, `mfa_recovery_codes`, `mfa_enrolled_at`, `mfa_last_used_window` columns added by `env/db/10_users_mfa.sql`).
## Configuration
- `JwtConfig.Issuer` — used as the `iss` of the step-1 token.
- `DataProtection:KeysFolder` (production must set this to a persistent volume so encrypted MFA secrets survive container restarts; without it the per-machine ephemeral key store will lose every MFA secret on first deploy).
## External Integrations
PostgreSQL via `IDbFactory`. ASP.NET Core DataProtection for at-rest encryption.
## Security
- TOTP secret is base32 (32 chars) encrypted at rest with `IDataProtector`. Plaintext only exists in memory during enroll/verify.
- Recovery codes are SHA-256-hashed in the DB; the plaintext list is shown ONCE in `MfaEnrollResponse` and unrecoverable thereafter.
- `mfa_last_used_window` defends against in-window replay (a code presented twice within 30 s is rejected the second time).
- Step-1 JWT carries a narrowed audience (`azaion-mfa-step2`); the main JwtBearer middleware accepts only `JwtConfig.Audience` and rejects this token for any non-MFA endpoint.
- Re-auth with password is required for enroll and disable; this defends against a stolen access token being used to silently flip MFA state.
- **Known follow-up F2 (carried forward from Cycle 2 batch 4 review)**: `TryConsumeRecoveryCode` returns `true` even when the conditional update affects 0 rows — concurrent double-spend of the same recovery code is possible (low practical risk, but a real correctness gap).
## Tests
- `e2e/Azaion.E2E/Tests/MfaEnrollmentTests.cs` — AC-1 (enroll shape), AC-2 (confirm), AC-5 (disable), AC-6 (encrypted at rest).
- `e2e/Azaion.E2E/Tests/MfaLoginTests.cs` — AC-3 (two-step flow + AMR claim), AC-4 (recovery-code single-use).