admin/_docs/03_implementation/batch_04_cycle2_report.md
Oleksandr Bezdieniezhnykh 1e1ded73f5
[AZ-534] TOTP-based 2FA at credential login
Add RFC 6238 TOTP enrollment, two-step /login flow, recovery codes, and
the amr=["pwd","mfa"] claim that propagates through refresh-token rotation.

- New endpoints: /users/me/mfa/{enroll,confirm,disable} and /login/mfa.
- /login short-circuits to a 5-min ES256 step-1 token (audience-pinned
  azaion-mfa-step2) when the user has MFA enabled; real access+refresh
  pair is minted only after /login/mfa.
- mfa_secret encrypted at rest via ASP.NET Core IDataProtector
  (purpose=Azaion.Mfa.Secret.v1; key folder configurable via
  DataProtection:KeysFolder for production persistence).
- Recovery codes (10 single-use, base32, ~80-bit entropy) hashed with
  SHA-256 and stored as JSONB; constant-time compare on lookup.
- RFC 6238 §5.2 replay defense via mfa_last_used_window per user.
- Sessions carry mfa_authenticated so /token/refresh re-stamps the
  amr claim correctly across the entire 30-day refresh window.
- New audit events: enroll, confirm, disable, login-success/failed,
  recovery-used.
- Schema: env/db/10_users_mfa.sql adds users.mfa_* columns and
  sessions.mfa_authenticated; mfa_recovery_codes mapped as BinaryJson
  in AzaionDbSchemaHolder; disable path uses raw parameterised SQL to
  avoid LinqToDB null-literal type-inference on jsonb columns.

E2E: 6 new tests in MfaLoginTests cover all six AC; full suite
82 passed / 0 failed / 3 intentional skips.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-14 06:21:28 +03:00


Batch Report

Batch: 4 (cycle 2)
Tasks: AZ-534 (totp_2fa_login)
Date: 2026-05-14
Total Complexity: 5 points
Epic: AZ-529 — Auth Mechanism Modernization

Task Results

| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|------|--------|----------------|-------|-------------|--------|
| AZ-534 | Done | 9 source + 1 sql migration + 1 test | 6/6 pass | 6/6 | None blocking — see review |

Files Touched

Source (production)

  • Azaion.AdminApi/Program.cs — DI for IMfaService; configure ASP.NET Core DataProtection (with optional DataProtection:KeysFolder for production persistence); /login short-circuits to step-1 token when user.MfaEnabled; new /login/mfa endpoint; new /users/me/mfa/{enroll,confirm,disable} endpoints; IssueDualTokens helper centralises access+refresh minting; /token/refresh propagates amr from the persisted MfaAuthenticated flag
  • Azaion.AdminApi/BusinessExceptionHandler.cs — map MfaAlreadyEnabled / MfaNotEnrolling / MfaNotEnabled → 409, InvalidMfaCode / InvalidMfaToken → 401
  • Azaion.Common/BusinessException.cs — add MfaAlreadyEnabled = 56, MfaNotEnrolling = 57, MfaNotEnabled = 58, InvalidMfaCode = 59, InvalidMfaToken = 61
  • Azaion.Common/Database/AzaionDbShemaHolder.cs — User.MfaRecoveryCodes mapped to DataType.BinaryJson so Npgsql sends the JSONB type oid on insert/update
  • Azaion.Common/Entities/User.cs — add MfaEnabled, MfaSecret, MfaRecoveryCodes, MfaEnrolledAt, MfaLastUsedWindow; sensitive fields [JsonIgnore]
  • Azaion.Common/Entities/Session.cs — add MfaAuthenticated (preserves AMR strength across refresh rotations)
  • Azaion.Common/Entities/AuditEvent.cs — new event type strings: MfaEnroll, MfaConfirm, MfaDisable, MfaLoginSuccess, MfaLoginFailed, MfaRecoveryUsed
  • Azaion.Common/Requests/MfaRequests.cs — new; MfaEnrollRequest/Response, MfaConfirmRequest, MfaDisableRequest, MfaRequiredResponse, MfaLoginRequest
  • Azaion.Services/AuthService.cs — CreateToken accepts optional amr collection; values stamped as repeated amr claims per RFC 8176
  • Azaion.Services/AuditLog.cs — new RecordMfa… helpers
  • Azaion.Services/MfaService.cs — new; TOTP enrol / confirm / disable / verify-for-login; ES256 step-1 token (5-min, audience-pinned azaion-mfa-step2); single-use recovery codes (SHA-256 hashed, JSONB-stored); RFC 6238 replay defence via MfaLastUsedWindow; IDataProtector encrypts mfa_secret at rest
  • Azaion.Services/RefreshTokenService.cs — IssueForNewLogin accepts mfaAuthenticated; Rotate carries the flag forward to the new session row
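
The audience pinning of the step-1 token listed for MfaService can be pictured with a small language-neutral sketch. It is written in Python for brevity and is not the service's C# code. It assumes the token's ES256 signature has already been verified and looks only at the decoded claims. The access-token audience name `azaion-api` and the helper `accepts` are hypothetical; only `azaion-mfa-step2` comes from this report.

```python
import time
from typing import Optional

ACCESS_AUDIENCE = "azaion-api"            # hypothetical name, not stated in the report
MFA_STEP2_AUDIENCE = "azaion-mfa-step2"   # audience named in the report


def accepts(claims: dict, expected_audience: str, now: Optional[float] = None) -> bool:
    """Check `aud` and `exp` on claims whose signature was already verified."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    # Both conditions must hold: right audience and not yet expired.
    return expected_audience in audiences and claims.get("exp", 0) > now
```

Under these assumptions a step-1 token is rejected wherever the access audience is expected, and an access token is rejected at the step-2 check, which is the separation the report describes.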

Migrations / infra

  • env/db/10_users_mfa.sql — new; ALTER TABLE adds mfa_enabled (default false), mfa_secret (text), mfa_recovery_codes (jsonb), mfa_enrolled_at (timestamp), mfa_last_used_window (bigint); sessions.mfa_authenticated (default false)
  • e2e/db-init/00_run_all.sh — apply 10_users_mfa.sql in test DB
  • e2e/Azaion.E2E/Azaion.E2E.csproj — add Otp.NET package (test-side TOTP code generation)
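
For readers unfamiliar with what test-side TOTP code generation involves, the following is a minimal RFC 6238 sketch in Python using only the standard library. It is an illustration of the algorithm, not the Otp.NET code the E2E project uses. The function name `totp` and its parameters are chosen for this example, and it assumes the HMAC-SHA1 variant with 30-second steps.

```python
import base64
import hashlib
import hmac
import struct


def totp(secret_b32: str, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given Unix time."""
    key = base64.b32decode(secret_b32, casefold=True)
    counter = unix_time // step                       # time-step number T
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                           # dynamic truncation (RFC 4226)
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)    # keep leading zeros


# RFC 6238 Appendix B test secret: the ASCII bytes "12345678901234567890".
RFC_SECRET = base64.b32encode(b"12345678901234567890").decode()
```

With the RFC test secret, `totp(RFC_SECRET, 59, digits=8)` reproduces the published SHA-1 vector `94287082`.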

Tests

  • e2e/Azaion.E2E/Tests/MfaLoginTests.cs — new; 6 tests (enrol payload shape, confirm activates, two-step login + amr, recovery single-use, disable round-trip, ciphertext-at-rest)
  • e2e/Azaion.E2E/Helpers/DbHelper.cs — add GetMfaSecretRaw, GetMfaEnabled

Test Run Results

Batch 4 only (--filter MfaLoginTests): 6 / 6 passed, ~14 s. Full suite: 82 passed, 0 failed, 3 skipped, ~77 s.

The PasswordHashingTests.AC5_Verify_uses_constant_time_comparator_no_obvious_timing_leak flake noted in batch 3 review passed cleanly in this run, confirming it as an environmental flake rather than a regression.

AC Coverage

  • AC-1: Enrol returns base32 secret (32 chars), otpauth:// URL, base64 PNG QR, 10 recovery codes ≥12 chars; DB still mfa_enabled=false — AC1_Enroll_returns_secret_otpauth_qr_and_recovery_codes
  • AC-2: Confirm with valid TOTP flips mfa_enabled=true — AC2_Confirm_enables_MFA
  • AC-3: /login returns {mfa_required, mfa_token, expires_in:300} then /login/mfa returns access+refresh with amr=["pwd","mfa"] — AC3_Login_returns_mfa_required_then_step2_returns_tokens_with_amr_pwd_mfa
  • AC-4: Recovery code works once (yields amr=["pwd","mfa","recovery"]); reuse rejected — AC4_Recovery_code_works_once_then_fails
  • AC-5: /users/me/mfa/disable requires password + valid TOTP; subsequent /login returns access+refresh directly without step 2 — AC5_Disable_requires_password_and_code_then_login_returns_tokens_directly
  • AC-6: users.mfa_secret read directly from Postgres is ciphertext (DataProtection envelope), not the base32 secret — AC6_Mfa_secret_is_encrypted_at_rest
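
The single-use behaviour behind AC-4 can be sketched as follows. This is a Python illustration under stated assumptions, not the MfaService code: codes are 10 random bytes in base32, only SHA-256 hashes are kept, every stored hash is compared in constant time, and a matched hash is removed. The helper names and the upper-case normalisation are hypothetical.

```python
import base64
import hashlib
import hmac
import secrets
from typing import List


def generate_recovery_codes(count: int = 10) -> List[str]:
    """Return `count` codes, each 10 random bytes encoded as 16 base32 chars."""
    return [base64.b32encode(secrets.token_bytes(10)).decode() for _ in range(count)]


def hash_code(code: str) -> str:
    """Hex SHA-256 of the normalised code; only this value would be stored."""
    return hashlib.sha256(code.strip().upper().encode()).hexdigest()


def consume_code(stored_hashes: List[str], candidate: str) -> bool:
    """Remove the matching hash and return True, or return False if none match."""
    candidate_hash = hash_code(candidate)
    match_index = -1
    # Walk every stored hash; compare_digest does not short-circuit on the
    # first differing byte, so timing does not reveal how close a guess was.
    for i, stored in enumerate(stored_hashes):
        if hmac.compare_digest(stored, candidate_hash) and match_index < 0:
            match_index = i
    if match_index < 0:
        return False
    del stored_hashes[match_index]   # single use: the hash is gone after success
    return True
```

A second call with the same code returns False because its hash is no longer in the list.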

Key Implementation Decisions

  1. IDataProtector for mfa_secret, not a hand-rolled AES wrapper. ASP.NET Core's DataProtection handles key generation, automatic 90-day rotation, and a versioned envelope format that survives key rolls without re-encrypting all rows. Custom AES-GCM would have given the same security guarantee but with three new test vectors and a manual rotation runbook. Purpose = "Azaion.Mfa.Secret.v1" namespaces the keys so an accidental cross-purpose decrypt fails. Key persistence is opt-in via DataProtection:KeysFolder — production deployments MUST set it (Program.cs comment is explicit), or restarts invalidate every enrolled secret.

  2. SHA-256 for recovery code hashing, not Argon2id. Recovery codes are 16-character base32 strings (~80 bits of entropy from KeyGeneration.GenerateRandomKey(10)). Argon2id at the calibrated ~250 ms cost would add 2.5 s to every wrong-code attempt (we walk all unused codes). High-entropy secrets need a fast hash, not a slow KDF — the same reasoning the refresh-token store uses. Constant-time compare via CryptographicOperations.FixedTimeEquals defends against timing oracles on the hash bytes.

  3. mfa_authenticated persisted on the session row, not re-derived from the access token. Refresh-token rotation produces a brand-new access token; we'd otherwise have no source of truth for "was this session born of MFA?" once the original access token expires. Storing the boolean on the session lets /token/refresh re-stamp amr=["pwd","mfa"] correctly across the entire 30-day refresh window. Costs one boolean column.

  4. Step-1 MFA token is ES256, audience-pinned azaion-mfa-step2. Re-uses the JWKS keypair so verifiers don't need to learn a second key. The narrow audience makes the main JwtBearer middleware reject this token for normal endpoints, and MfaService.ValidateMfaStepToken rejects any other audience — so a step-1 token cannot be presented at /users/me, and an access token cannot be presented at /login/mfa.

  5. VerifyTotpCode checks lastUsedWindow > matchedWindow first. RFC 6238 §5.2 requires that the verifier MUST NOT accept a second attempt of an OTP once a successful validation has been issued for the first. OtpNet.VerificationWindow.RfcSpecifiedNetworkDelay accepts the prior + current + next 30-second window. Without the per-user mfa_last_used_window check, a man-in-the-middle who captured the code mid-flight could replay it within the 30-90 s acceptance window. Persisting the matched window costs one extra UPDATE users per successful login.

  6. Disable uses raw SQL parameter for the JSONB null. LinqToDB's UpdateAsync lambda compiles MfaRecoveryCodes = null into an untyped NULL literal which Postgres parses as text and rejects against the jsonb column (42804). The BinaryJson mapping handles non-null values fine, but null literals in expression bodies bypass parameter typing. Switched the disable path to a single parameterised UPDATE … SET mfa_recovery_codes = NULL::jsonb …. Local fix, doesn't affect the enrol/confirm/login paths.
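
The replay defence in decision 5 can be sketched in Python as below. This is a minimal illustration, not the C# VerifyTotpCode: it assumes HMAC-SHA1 codes, a prior/current/next acceptance window, and a stored last-used window number. The sketch treats a match in the same window as the last accepted one as a replay, which is an assumption about the exact comparison; the function names are hypothetical.

```python
import hashlib
import hmac
import struct
from typing import Optional


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP value for one counter (TOTP uses the time-step number)."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify_totp(key: bytes, code: str, unix_time: int,
                last_used_window: Optional[int], step: int = 30) -> Optional[int]:
    """Return the matched window number, or None if the code is rejected."""
    current = unix_time // step
    for window in (current - 1, current, current + 1):
        if hmac.compare_digest(hotp(key, window), code):
            if last_used_window is not None and window <= last_used_window:
                return None   # this window was already consumed: treat as replay
            return window     # caller persists this as the new last-used window
    return None               # no window in the acceptance range matched
```

The caller would store the returned window for the user on success, so presenting the same code again fails even though it is still inside the acceptance range.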

Backward Compatibility

  • All new users columns default to MFA-off (mfa_enabled=false, others NULL). Existing rows untouched.
  • Pre-existing sessions rows default mfa_authenticated=false; /token/refresh against an old session continues to issue amr=["pwd"] — same behaviour as before.
  • /login response shape is unchanged for users without MFA enabled — no client-visible change for the existing CompanionPC fleet or any non-enrolled admin.
  • LoginResponse and LoginRequest DTOs unchanged. The MFA branch returns a different DTO (MfaRequiredResponse); clients that don't recognise the mfaRequired field will see an unexpected payload — UI workspace ticket flagged in the spec under "Risks / Notes".
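
The session-flag behaviour described above (old rows keep amr=["pwd"], MFA-born sessions keep amr=["pwd","mfa"] through every rotation) can be sketched as follows. This is a Python illustration with hypothetical names (`Session`, `amr_for`, `rotate`), assuming the flag is simply copied to the new session row on rotation; it is not the RefreshTokenService code.

```python
import secrets
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Session:
    refresh_token: str
    user_id: int
    mfa_authenticated: bool = False   # column default: pre-existing rows read as False


def amr_for(session: Session) -> List[str]:
    """Derive the amr claim values from the persisted flag."""
    return ["pwd", "mfa"] if session.mfa_authenticated else ["pwd"]


def rotate(session: Session) -> Tuple[Session, List[str]]:
    """Issue a new session row with a fresh token; the MFA flag is carried over."""
    new_session = replace(session, refresh_token=secrets.token_urlsafe(32))
    return new_session, amr_for(new_session)
```

Because the flag lives on the session rather than in the expired access token, any number of rotations yields the same amr values as the original login.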