diff --git a/_docs/05_security/dependency_scan_cycle2.md b/_docs/05_security/dependency_scan_cycle2.md new file mode 100644 index 0000000..a8cb8ec --- /dev/null +++ b/_docs/05_security/dependency_scan_cycle2.md @@ -0,0 +1,54 @@ +# Dependency Scan — Cycle 2 (Auth Modernization, AZ-531..AZ-538) + +**Date**: 2026-05-14 +**Scope**: delta from cycle 1's `dependency_scan.md` — focuses on packages added or version-bumped during cycle 2. +**Tooling**: `dotnet list package --vulnerable --include-transitive`, `dotnet list package --deprecated --include-transitive`. + +## Vulnerability scan result (all csprojs) + +``` +Project Azaion.AdminApi : no vulnerable packages +Project Azaion.Common : no vulnerable packages +Project Azaion.Services : no vulnerable packages +``` + +**Verdict**: 0 known CVEs across direct + transitive packages on the resolved sources (nuget.org + 3 internal feeds). + +## Packages added in cycle 2 + +| Package | Version | Project | Purpose | Security review | +|---------|---------|---------|---------|-----------------| +| `Konscious.Security.Cryptography.Argon2` | 1.3.1 | Azaion.Services | Argon2id password hashing (AZ-536) | No reported CVEs. Author Keef Aragon; widely used in the .NET community. Implements the Argon2 1.3 spec. Ensure `time/memory/parallelism` parameters in `AuthConfig.PasswordHashing` are tuned for the production host (current defaults: t=3, m=64 MiB, p=2). | +| `Otp.NET` | 1.4.1 | Azaion.Services | TOTP / HOTP (AZ-534) | No reported CVEs. Implements RFC 6238 and RFC 4226. MIT-licensed. Last updated 2024. | +| `QRCoder` | 1.8.0 | Azaion.Services | QR PNG generation for MFA enrollment (AZ-534) | No reported CVEs in 1.8.0. Note: an older version 1.3.7 had a Critical vulnerability — verify our pinned 1.8.0 stays past that boundary on every refresh. | +| `Microsoft.AspNetCore.DataProtection` | 10.0 (framework) | Azaion.AdminApi | Encrypt MFA secrets at rest (AZ-534) | Built-in to ASP.NET Core; CVE risk is folded into the framework version. | +| `Microsoft.AspNetCore.RateLimiting` | 10.0 (framework) | Azaion.AdminApi | Per-IP rate limit (AZ-537) | Built-in. | + +> No package was bumped to a new version during cycle 2 (cycle 1 already brought `Newtonsoft.Json` to 13.0.4 to close audit finding D-1). + +## Deprecated (Legacy) packages — unchanged from cycle 1 + +``` +Azaion.AdminApi: + > FluentValidation.AspNetCore 11.3.0 Legacy + +Azaion.Services: + > System.IdentityModel.Tokens.Jwt 7.1.2 Legacy + Transitive: + > Microsoft.IdentityModel.Abstractions 7.1.2 Legacy + > Microsoft.IdentityModel.JsonWebTokens 7.1.2 Legacy + > Microsoft.IdentityModel.Logging 7.1.2 Legacy + > Microsoft.IdentityModel.Tokens 7.1.2 Legacy +``` + +**Status**: deprecated ≠ vulnerable. Cycle-1 audit already flagged these (D-2, D-3, D-4). Cycle 2 brings these packages much more squarely into the security path because they now also handle ES256 signing + JWKS construction. **Recommendation upgraded** vs. cycle 1: schedule an upgrade window in cycle 3 to bump `Microsoft.IdentityModel.*` to a non-Legacy line. + +## DataProtection key store — operational note (NOT a CVE) + +`Azaion.AdminApi.Program.cs` lines 152–160 register DataProtection. If `DataProtection:KeysFolder` is unset in production, ASP.NET Core defaults to per-machine, ephemeral keys — restarts will silently invalidate every encrypted MFA secret in the database. This is **not** a code vulnerability but is a deployment-time misconfiguration risk; surfaced as a finding in `infrastructure_review_cycle2.md` (F-2026Q2-INFRA-1). + +## Recommendations (Phase 1 only) + +1. (Open from cycle 1, severity-elevated for cycle 2) Bump `Microsoft.IdentityModel.*` family from `7.1.2` (Legacy) to the current LTS line. Cycle-2 ES256 signing path runs through these packages. +2. (Open from cycle 1) Bump `FluentValidation.AspNetCore` from `11.3.0` (Legacy) to current. +3. (New) Pin a CI gate that re-runs `dotnet list package --vulnerable` weekly and fails the pipeline on any non-zero result. The cycle-1 audit recommended this; cycle 2 surface (Argon2id, OtpNet, QRCoder, JWT signing) makes it more important, not less. diff --git a/_docs/05_security/infrastructure_review_cycle2.md b/_docs/05_security/infrastructure_review_cycle2.md new file mode 100644 index 0000000..cd81f36 --- /dev/null +++ b/_docs/05_security/infrastructure_review_cycle2.md @@ -0,0 +1,100 @@ +# Infrastructure & Configuration Review — Cycle 2 (Delta) + +**Date**: 2026-05-14 +**Scope**: cycle-2 surface only — `Program.cs` middleware/CORS/HSTS, DataProtection wiring, ES256 key store, `secrets/`, `scripts/deploy.sh` + `start-services.sh` + `_lib.sh` + `health-check.sh`, `docker-compose.test.yml`, `.env.example`, `.woodpecker/`. Cycle-1 categories that did not change since `infrastructure_review.md` are out of scope here. +**Read order**: this file is a delta on `infrastructure_review.md`. Categories not listed here keep their cycle-1 status. + +## Cycle-2 Findings + +### F-2026Q2-INFRA-1: Deploy script hard-blocks on obsolete `JwtConfig__Secret` (CRITICAL — deploy-blocking) + +- **Location**: `scripts/start-services.sh:32`. +- **Description**: `start-services.sh` does `require_env ... ASPNETCORE_JwtConfig__Secret`, which kills the deploy if the variable is not set. AZ-532 removed `JwtConfig.Secret` entirely — `Program.cs:60-83` configures `JwtBearer` against `IssuerSigningKeyResolver` backed by `JwtSigningKeyProvider` (ES256 PEMs). A correctly-configured cycle-2 deploy that follows `.env.example` (which does NOT include `JwtConfig__Secret`) will fail at the deploy-script preflight. +- **Impact**: cycle-2 cannot deploy with the current scripts. Either the deploy fails on preflight, or the operator sets a dummy `JwtConfig__Secret=dummy` to get past the check — and then we hit F-INFRA-2 below. +- **Remediation**: replace the line with: + ``` + require_env ... ASPNETCORE_JwtConfig__KeysFolder ASPNETCORE_JwtConfig__ActiveKid + ``` + Drop `ASPNETCORE_JwtConfig__Secret` from the schema in `secrets/README.md` and from any documented env templates. + +### F-2026Q2-INFRA-2: ES256 keys folder not bind-mounted into container (CRITICAL — deploy-blocking) + +- **Location**: `scripts/start-services.sh:48-56` (the `docker run` line). +- **Description**: `JwtConfig.KeysFolder` defaults to `secrets/jwt-keys` (relative path, per `appsettings.json:15`). Inside the container, this resolves under `/app/`. The Dockerfile **does not** COPY `secrets/`, and `start-services.sh` **does not** add a `--volume` mapping for the host `secrets/jwt-keys` directory. Result: `JwtSigningKeyProvider.Load` (cycle-2 ctor) fails-fast at startup with "no PEM files found". +- **Impact**: container restart loop on every cycle-2 deploy. The only way to bring it up today is to manually `docker cp` the PEMs into the container — defeats reproducibility, no rotation story. +- **Remediation**: add a host-side directory (e.g. `/etc/azaion/jwt-keys` owned by the runtime user, mode 0700, PEMs mode 0400) and a corresponding `--volume "$DEPLOY_HOST_JWT_KEYS_DIR:/etc/azaion/jwt-keys:ro"` line in `start-services.sh`. Set `ASPNETCORE_JwtConfig__KeysFolder=/etc/azaion/jwt-keys` in the public env overlay. Document the host-side procedure in `secrets/README.md` and `_docs/04_deploy/`. +- **Cross-ref**: `e2e/test-keys` is correctly mounted in `docker-compose.test.yml:42` — the test stack works; only the prod deploy script is broken. + +### F-2026Q2-INFRA-3: DataProtection key store ephemeral by default — MFA secrets unrecoverable across restarts (HIGH) + +- **Location**: `Program.cs:147-160`, `scripts/start-services.sh` (no DataProtection bind-mount), `secrets/README.md` (no entry). +- **Description**: AZ-534 wraps `MfaSecret` ciphertext with `IDataProtector` (`Azaion.Mfa.Secret.v1` purpose). When `DataProtection:KeysFolder` is unset, ASP.NET Core writes its master keys to a per-machine path inside the container (`%LOCALAPPDATA%/ASP.NET/DataProtection-Keys` or `~/.aspnet/DataProtection-Keys` depending on platform), which is **lost on every container restart**. After the first restart, every existing `MfaSecret` becomes undecryptable; users with MFA enabled can no longer log in (their `/login/mfa` fails because the server can't unwrap the secret to verify TOTP), and they can't even self-disable MFA via `/users/me/mfa/disable` because that path also re-validates the existing TOTP. Recovery codes still work (SHA-256 hashed, no DataProtection involvement) — so the only escape is recovery-code-based login, then disable-and-re-enroll. +- **Impact**: catastrophic data loss for the auth surface. Every `docker stop && docker run` cycle locks every MFA user out. +- **Mitigating control (current)**: the cycle-2 test deploy is single-instance, so within one process lifetime the keys are stable. The risk crystallizes on first restart. +- **Remediation**: + 1. Add a host-side persistent directory (e.g. `/var/lib/azaion/dp-keys`) owned by the runtime user, mode 0700. + 2. Add `--volume "$DEPLOY_HOST_DP_KEYS_DIR:/var/lib/azaion/dp-keys"` to `start-services.sh`. + 3. Set `ASPNETCORE_DataProtection__KeysFolder=/var/lib/azaion/dp-keys` in `secrets/.public.env`. + 4. Add a fail-fast in `Program.cs:151-160`: if `app.Environment.IsProduction()` and `DataProtection:KeysFolder` is unset, throw at startup. This makes the misconfiguration loud instead of silent. + 5. Document key persistence and rotation in `secrets/README.md` and `_docs/04_deploy/`. + +### F-2026Q2-INFRA-4: `secrets/README.md` schema still lists HS256-era `JwtConfig__Secret` (HIGH — doc drift, deploy-blocking by proxy) + +- **Location**: `secrets/README.md:50-55` ("Schema (variables that MUST be in the encrypted file)"). +- **Description**: the documented schema still requires `ASPNETCORE_JwtConfig__Secret=<32 random bytes>`. This is the same root cause as F-INFRA-1 but on the documentation side — operators following the README will set a useless variable, miss `JwtConfig__KeysFolder` / `JwtConfig__ActiveKid`, and miss `DataProtection__KeysFolder`. +- **Impact**: misleads any operator onboarding to the project; reinforces the broken deploy script. +- **Remediation**: rewrite the "What goes where" + "Schema" sections to: + - Drop `ASPNETCORE_JwtConfig__Secret`. + - Add `ASPNETCORE_JwtConfig__KeysFolder=/etc/azaion/jwt-keys` (path; not a secret — belongs in `.public.env`). + - Add `ASPNETCORE_JwtConfig__ActiveKid=` (path; not a secret — belongs in `.public.env`). + - Add `ASPNETCORE_DataProtection__KeysFolder=/var/lib/azaion/dp-keys` (path; not a secret). + - Note: the PEM private keys themselves are NOT in sops; they live on the host filesystem at `KeysFolder`, owned by the runtime user, mode 0400. Rotation procedure is in `scripts/generate-jwt-key.sh`. + +### F-2026Q2-INFRA-5: HSTS preload+includeSubDomains may break legacy subdomains (MEDIUM) + +- **Location**: `Program.cs:217-225`. +- **Description**: HSTS is configured with `MaxAge = 365 days`, `IncludeSubDomains = true`, `Preload = true` in non-Development environments. If any current or future `*.azaion.com` subdomain serves over plain HTTP (legacy admin tools, internal monitoring, dev/staging mirrors of unrelated systems), browsers that have seen the header will refuse to connect to those subdomains. Worse, **HSTS preload registration is essentially permanent** — the Chrome HSTS preload list takes weeks/months to be removed from once submitted, even after the header is disabled. +- **Impact**: operational blast radius if a non-HTTPS subdomain exists or is added later. Preload makes the mistake hard to reverse. +- **Remediation**: + 1. Audit all `*.azaion.com` subdomains; confirm 100% HTTPS-only (including any internal-only ones — DNS hijacking can expose them to user browsers). + 2. Document the subdomain inventory in `_docs/04_deploy/`. + 3. Consider gating `Preload = true` behind an env var so staging and dev hosts don't trigger preload-list submission attempts. + 4. Do NOT submit to the public preload list (https://hstspreload.org) until the audit is complete and signed off. + +### F-2026Q2-INFRA-6: `audit_events` table has no retention policy (LOW — operational hygiene) + +- **Location**: `env/db/07_auth_lockout_and_audit.sql`. +- **Description**: `audit_events` is append-only with no documented retention or partitioning. Every login attempt writes a row. At 10K users × 5 attempts/day = 50K rows/day = ~18M rows/year. Postgres handles this fine, and the composite index `(event_type, email, occurred_at DESC)` keeps `CountRecentFailedLogins` sub-millisecond, but unbounded growth has compliance implications (GDPR / data minimization), backup/restore time, and storage cost. +- **Impact**: not a security risk per se — audit completeness is the goal — but the regulatory storage horizon needs a stated answer. +- **Remediation**: agree retention (CMMC says ≥1 year for audit logs), add a nightly `DELETE FROM audit_events WHERE occurred_at < now() - interval '13 months'` job (cron + small script), document in `_docs/04_deploy/`. Optional: switch to monthly partitions so the cleanup is `DROP PARTITION` instead of a row-by-row delete. + +### F-2026Q2-INFRA-7: `JwtSigningKeyProvider` silent fallback to first PEM (MEDIUM) + +- **Location**: `Azaion.Services/JwtSigningKeyProvider.cs:73-86`. +- **Description**: when `JwtConfig.ActiveKid` is unset, the provider picks the alphabetically-first PEM and only logs at `LogInformation`. Adding a new PEM with a name that sorts earlier silently changes the signing key on next restart. The deploy script in `scripts/start-services.sh` does not require `ActiveKid`, and `.env.example:25-26` calls it "optional". +- **Impact**: operator drops `kid-2026-04-aaa.pem` thinking it's a side-by-side rotation key, restart, and now all newly-minted tokens are signed under a kid the verifiers may not yet have in their JWKS cache (1-h max-age — fleet sees signature failures for up to 1 h). +- **Remediation**: + 1. Make `ActiveKid` required when more than one PEM is present (fail-fast at startup if ambiguous). + 2. If exactly one PEM exists, accept it but log at `Warning` (not `Information`). + 3. Update `.env.example` to mark `ActiveKid` as **required for prod** rather than "optional". +- **Cross-ref**: same finding documented in `static_analysis_cycle2.md` as F-2026Q2-AUTH-7. Listed here too because the operational-hardening fix (require it in `start-services.sh`) is in this scope. + +## Re-verified categories (no cycle-2 regression) + +| Area | Cycle-1 status | Cycle-2 verdict | +|------|----------------|-----------------| +| Container non-root user (F-6) | FAIL | **PASS** — Dockerfile now sets `USER app` (line 40) and chowns `/app/Content` + `/app/logs`. Closes F-6. | +| Production HTTPS enforcement (F-13) | FAIL | **PASS** — `app.UseHttpsRedirection()` + `app.UseHsts()` enabled in non-Development. Closes F-13 in code (still reverse-proxy fronting in deploy). | +| CORS | tight | **TIGHTER** — cycle-2 dropped the `http://admin.azaion.com` origin. Only `https://admin.azaion.com` remains. | +| Image pinned by digest | WARN | unchanged. Deferred. | +| Secrets via env vars | PASS | unchanged. | +| Test sidecar / E2E images | acceptable | unchanged. | +| Test compose `ASPNETCORE_ENVIRONMENT=Development` | acceptable for tests | unchanged. Flag operator risk: a misconfigured prod that inherits this value silently loses HTTPS enforcement and HSTS. | +| `.gitignore` excludes secrets | PASS | **PASS** — `secrets/jwt-keys/*` is gitignored; only `.gitkeep` tracked. Verified no PEMs in repo. | + +## Self-verification + +- [x] All cycle-2-touched infra files reviewed (Dockerfile, docker-compose.test.yml, scripts/*, secrets/*, .env.example, appsettings*.json, .woodpecker/*) +- [x] Each finding has file path + line number + remediation +- [x] Cycle-1 closures verified by re-reading the code (F-6 USER directive, F-13 HSTS+HttpsRedirection) +- [x] No false positives from test-only files (test fixtures are flagged as acceptable, not as findings) diff --git a/_docs/05_security/owasp_review_cycle2.md b/_docs/05_security/owasp_review_cycle2.md new file mode 100644 index 0000000..4a85113 --- /dev/null +++ b/_docs/05_security/owasp_review_cycle2.md @@ -0,0 +1,41 @@ +# OWASP Top 10 Review — Cycle 2 (Delta) + +**Date**: 2026-05-14 +**Framework**: [OWASP Top 10 — 2021](https://owasp.org/www-project-top-ten/) (current authoritative list). +**Scope**: cycle-2 surface only — categories that materially changed since `owasp_review.md` (2026-05-13). Every FAIL / PASS_WITH_WARNINGS cites the underlying finding ID from `static_analysis_cycle2.md` (`F-2026Q2-…`). +**Read order**: this file is a delta. Categories not listed here keep their cycle-1 status from `owasp_review.md`. + +## Cycle-2 Per-Category Delta + +| # | Category | Cycle-1 | Cycle-2 | Delta reason | +|---|----------|---------|---------|--------------| +| A01 | Broken Access Control | FAIL (F-2 path traversal) | **FAIL** (unchanged on F-2; cycle-2 adds **F-2026Q2-AUTH-4** MFA endpoints not rate-limited) | New surface added without per-IP throttle on MFA management endpoints. | +| A02 | Cryptographic Failures | PASS_WITH_WARNINGS (F-7 SHA-384 hardening) | **PASS** | F-7 closed by AZ-536 (Argon2id, lazy migration). Cycle-2 introduces ES256 for JWT (replaces HS256), `IDataProtector` for MFA secret, SHA-256 for high-entropy refresh tokens / recovery codes. All algorithm choices match RFC guidance. | +| A04 | Insecure Design | PASS_WITH_WARNINGS (F-8 no rate limiting) | **PASS_WITH_WARNINGS** | F-8 closed by AZ-537 (per-IP + per-account hybrid). New design risks: **F-2026Q2-AUTH-5** mission AMR loss, **F-2026Q2-AUTH-6** mission auto-revoke gap, **F-2026Q2-AUTH-7** silent kid fallback. None catastrophic individually. | +| A05 | Security Misconfiguration | FAIL (F-6 root container; F-13 no HTTPS; F-9 missing validators; F-11 placeholder credentials) | **FAIL** (improved) | F-13 closed by AZ-538 (HSTS + HttpsRedirection in non-Dev). F-6 / F-9 / F-11 unchanged. New risk: **F-INFRA-1** DataProtection ephemeral key store if `DataProtection:KeysFolder` unset (deployment-coupled). | +| A07 | Identification & Authentication Failures | PASS_WITH_WARNINGS (F-7, F-8 hardening) | **PASS_WITH_WARNINGS** | Cycle-2 modernized auth across the board (Argon2id, refresh rotation, TOTP MFA, lockout, JWKS). However the cycle-2 audit surfaced **F-2026Q2-AUTH-1** (user enumeration, upgraded High), **F-2026Q2-AUTH-2** (MFA brute-force not rate-limited, High), **F-2026Q2-AUTH-3** (disabled-account leak via auth ordering, Medium). These are blocking for the next deploy. | +| A08 | Software & Data Integrity Failures | PASS | **PASS** | No regressions. ES256 keys persisted to file system and read-only at runtime; DataProtection keys persisted (when configured). | +| A09 | Security Logging & Monitoring Failures | PASS_WITH_WARNINGS (no security-event-specific logger) | **PASS** | Cycle-2 introduced `IAuditLog` + `audit_events` table. Login success / failure / lockout / MFA enroll-confirm-disable / MFA login success-failure / MFA recovery-used are all persisted with email + IP + timestamp. Closes the cycle-1 A09 warning. **One residual gap** flagged under A07: `CountRecentFailedLogins` doesn't count `MfaLoginFailed` events, so the audit trail is complete but not all of it feeds the rate-limit decision (see F-2026Q2-AUTH-2). | +| A10 | SSRF | NOT_APPLICABLE | **NOT_APPLICABLE** | Unchanged — no outbound HTTP based on user-controlled URLs. JWKS endpoint serves; no external fetch. | + +Categories **A03 (Injection)** and **A06 (Vulnerable & Outdated Components)** unchanged from cycle 1 — re-verified clean for cycle-2 surface. See `static_analysis_cycle2.md` §"Vulnerability patterns scan" and `dependency_scan_cycle2.md` for evidence. + +## Cycle-2 Specific Verdict + +The cycle-2 modernization closes **two** outstanding cycle-1 hardening gaps (F-7 weak hashing, F-8 no rate limit) and **one** cycle-1 PASS_WITH_WARNINGS (A09 audit trail). Net category posture improves from **3 FAIL / 4 PASS_WITH_WARNINGS / 2 PASS / 1 N/A** to **2 FAIL / 2 PASS_WITH_WARNINGS / 5 PASS / 1 N/A**. + +However, the new auth surface introduces **3 cycle-2 blocking findings** that must be addressed before the next deploy: + +- **F-2026Q2-AUTH-1** (HIGH) — collapse `NoEmailFound` / `WrongPassword` / `UserDisabled` to a single `InvalidCredentials` error. +- **F-2026Q2-AUTH-2** (HIGH) — feed `MfaLoginFailed` into `failed_login_count` so MFA brute-force triggers the same lockout as password brute-force. +- **F-2026Q2-AUTH-4** (MEDIUM) — attach `LoginPerIpPolicy` (or a dedicated MFA policy) to `/users/me/mfa/{enroll,confirm,disable}`. + +Per-deploy gate decision recorded in the consolidated cycle-2 security report (Phase 5). + +## Self-verification + +- [x] Every FAIL has at least one finding with evidence (`F-2026Q2-…`) +- [x] Every PASS_WITH_WARNINGS has a stated remaining concern +- [x] Categories A02, A09 promotions justified by cycle-2 code changes +- [x] No PASS category is a regression from cycle 1 +- [x] NOT_APPLICABLE retains justification diff --git a/_docs/05_security/security_report_cycle2.md b/_docs/05_security/security_report_cycle2.md new file mode 100644 index 0000000..56aa263 --- /dev/null +++ b/_docs/05_security/security_report_cycle2.md @@ -0,0 +1,146 @@ +# Security Audit Report — Cycle 2 (Auth Modernization Delta) + +**Date**: 2026-05-14 +**Scope**: cycle-2 surface only (AZ-531 through AZ-538 — refresh tokens, ES256/JWKS, mission tokens, MFA, Argon2id, lockout/rate-limit, CORS/HSTS/HTTPS). Cycle-1 findings and closures remain authoritative in `security_report.md`. +**Verdict**: **FAIL — DO NOT DEPLOY.** 2 Critical deploy-blockers + 2 High auth findings discovered. Cycle-2 also closes 3 cycle-1 hardening gaps (F-7 weak hashing, F-8 no rate-limit, F-13 no HTTPS) and 1 cycle-1 PASS_WITH_WARNINGS (A09 audit trail). + +## Summary + +| Severity | Count | Closed in this audit | +|----------|-------|----------------------| +| Critical | 2 | 0 | +| High | 4 | 0 (3 cycle-1 hardening gaps closed by cycle-2 code itself: F-7, F-8, F-13 — see "Cycle-1 closures" below) | +| Medium | 7 | 0 | +| Low | 2 | 0 | + +## Findings (cycle-2 only, severity-ranked) + +| # | Severity | Category | Location | Title | +|---|----------|----------|----------|-------| +| F-2026Q2-INFRA-1 | **Critical** | A05 | `scripts/start-services.sh:32` | Deploy script hard-blocks on obsolete `JwtConfig__Secret` — cycle-2 cannot deploy with current scripts | +| F-2026Q2-INFRA-2 | **Critical** | A05 / A07 | `scripts/start-services.sh:48-56`, `appsettings.json:15` | ES256 keys folder not bind-mounted into container — `JwtSigningKeyProvider` fails-fast on startup | +| F-2026Q2-AUTH-1 | **High** | A07 | `Azaion.Services/UserService.cs:120-148`, `BusinessException.cs:33-52` | User enumeration via login error codes (`NoEmailFound` vs `WrongPassword`); upgraded to High because lockout amplifies the threat | +| F-2026Q2-AUTH-2 | **High** | A07 | `Azaion.Services/MfaService.cs:247-278`, `Azaion.Services/AuditLog.cs:53-63` | MFA brute-force not rate-limited — `MfaLoginFailed` events don't feed `failed_login_count` and aren't counted by `CountRecentFailedLogins` | +| F-2026Q2-INFRA-3 | **High** | A05 | `Program.cs:147-160`, `scripts/start-services.sh` | DataProtection key store ephemeral by default — every container restart locks every MFA user out | +| F-2026Q2-INFRA-4 | **High** | A05 | `secrets/README.md:50-55` | Operator README still lists HS256-era `JwtConfig__Secret`; misses `KeysFolder` / `ActiveKid` / `DataProtection__KeysFolder` | +| F-2026Q2-AUTH-3 | Medium | A07 | `Azaion.Services/UserService.cs:144-152` | Disabled-account leak via auth ordering (`IsEnabled` checked AFTER password verify) | +| F-2026Q2-AUTH-4 | Medium | A01 | `Azaion.AdminApi/Program.cs:452-481` | `/users/me/mfa/{enroll,confirm,disable}` not rate-limited; stolen access token can brute-force MFA disable | +| F-2026Q2-AUTH-5 | Medium | A04 | `Azaion.Services/MissionTokenService.cs:103-111`, `Program.cs:489` | Mission tokens missing `amr` claim; verifiers can't enforce MFA-pilot policy | +| F-2026Q2-AUTH-6 | Medium | A04 | `Azaion.Services/MissionTokenService.cs:46-87` | Mission token issuance does NOT auto-revoke prior aircraft missions; violates AZ-533 AC-4 | +| F-2026Q2-AUTH-7 | Medium | A04 / A05 | `Azaion.Services/JwtSigningKeyProvider.cs:73-86` | Silent fallback to first PEM if `ActiveKid` unset; rotation accidents silently change signing key | +| F-2026Q2-INFRA-5 | Medium | A05 | `Program.cs:217-225` | HSTS preload+includeSubDomains may break legacy `*.azaion.com` subdomains; preload registration is hard to undo | +| F-2026Q2-AUTH-9 | Medium (deployment-coupled) | A07 | `Azaion.Services/UserService.GetByEmail` (cycle-1) consumed by cycle-2 lockout/MFA reads | 4-hour user cache holds lockout / MFA state stale across instances; safe at single-instance, breaks horizontal scaling | +| F-2026Q2-AUTH-8 | Low | A09 | `Program.cs:261-280` | `/health/ready` leaks DB exception type to anonymous callers (mitigated by management-interface-only exposure) | +| F-2026Q2-INFRA-6 | Low | A05 | `env/db/07_auth_lockout_and_audit.sql` | `audit_events` table has no documented retention policy | +| F-2026Q2-DOCS-1 | Low | — | `_docs/02_document/components/03_auth_and_security/description.md`, `data_model.md` | Cycle-2 docs drift from code (Argon2 params, lockout field names, recovery-code hash algorithm) | + +Detailed evidence and remediation steps are in `static_analysis_cycle2.md` (F-AUTH-*) and `infrastructure_review_cycle2.md` (F-INFRA-*). + +## OWASP Top 10 (2021) — Cycle-2 Status + +Full reasoning is in `owasp_review_cycle2.md`. Net category posture moves from cycle-1 **3 FAIL / 4 PASS_WITH_WARNINGS / 2 PASS / 1 N/A** to cycle-2 **2 FAIL / 2 PASS_WITH_WARNINGS / 5 PASS / 1 N/A**. + +| Category | Cycle-1 | Cycle-2 | Reason | +|----------|---------|---------|--------| +| A01 Broken Access Control | FAIL | FAIL | F-2 still open + new F-AUTH-4 (MFA endpoints not rate-limited) | +| A02 Cryptographic Failures | PASS_WITH_WARNINGS | **PASS** | F-7 closed by Argon2id (AZ-536); ES256, IDataProtector, SHA-256 (high-entropy) all RFC-aligned | +| A04 Insecure Design | PASS_WITH_WARNINGS | PASS_WITH_WARNINGS | F-8 closed by hybrid rate-limit (AZ-537); new design risks F-AUTH-5/6/7 | +| A05 Security Misconfiguration | FAIL | FAIL | F-13 closed by HSTS+HttpsRedirection; F-6 closed by `USER app`; new criticals F-INFRA-1/2/3 + medium F-INFRA-5 | +| A07 Auth Failures | PASS_WITH_WARNINGS | PASS_WITH_WARNINGS | Cycle-2 modernized auth; new findings F-AUTH-1/2/3 | +| A09 Logging Failures | PASS_WITH_WARNINGS | **PASS** | `IAuditLog` + `audit_events` table closes the cycle-1 warning | + +## Cycle-1 Closures Confirmed in Cycle-2 Code + +The cycle-2 implementation directly closes three cycle-1 open hardening items and one cycle-1 PASS_WITH_WARNINGS. These are NOT new audit work — they are verifications that the AZ-53x tickets did the security-relevant thing they promised. + +| Cycle-1 finding | Cycle-2 resolution | Evidence | +|-----------------|-------------------|----------| +| **F-6** (container runs as root) | `USER app` directive added; `/app/Content` and `/app/logs` chowned | `Dockerfile:7-11, 39-40` | +| **F-7** (SHA-384 password hash, no salt/KDF) | Argon2id with PHC string format, lazy migration on next login | `Azaion.Services/Security.cs:1-135`, AZ-536 spec | +| **F-8** (no rate limiting on `/login`) | Per-IP sliding window (`RateLimiter`) + per-account DB-backed sliding window + lockout | `Program.cs:172-199, 308, 330`, `AuthConfig.cs`, `UserService.ValidateUser` | +| **F-13** (no HTTPS enforcement in code) | `app.UseHsts()` + `app.UseHttpsRedirection()` in non-Development | `Program.cs:217-240` | +| **A09** (no security-event audit log) | `IAuditLog` writes login_success/failed/lockout, MFA enroll/confirm/disable/login_success/failed/recovery_used to `audit_events` with email + IP + timestamp | `Azaion.Services/AuditLog.cs:1-80`, `env/db/07_auth_lockout_and_audit.sql` | + +These closures can be verified by inspection during the next deploy gate. + +## Verdict Logic + +**FAIL — do not deploy** because: + +1. **F-INFRA-1** (Critical): the deploy script's `require_env ASPNETCORE_JwtConfig__Secret` will fail-fast on every cycle-2 deploy attempt that follows the new `.env.example`. The deploy literally cannot start. +2. **F-INFRA-2** (Critical): even if the operator works around F-INFRA-1, the container will then fail-fast inside `JwtSigningKeyProvider` because `secrets/jwt-keys` is not bind-mounted. +3. **F-INFRA-3** (High): even if F-INFRA-1 and F-INFRA-2 are bypassed by `docker cp` and a dummy env var, the first container restart will lock every MFA-enrolled user out of their account because DataProtection master keys vanish. +4. **F-AUTH-1 / F-AUTH-2** (Highs): the new auth surface is exploitable — user enumeration is amplified by lockout, and MFA can be brute-forced from rotating IPs without ever locking the account. + +The combination of (1)+(2)+(3) means cycle-2 has **no working deploy path**. (4) means even if the deploy did work, the auth surface is below the cycle-1 baseline despite the modernization effort. + +## Recommendations + +### Pre-deploy (must be done before the next deploy attempt) + +The following CRITICAL and HIGH findings block deploy. Each is small enough to fit a single PR; bundle them as one cycle-2 hotfix sprint. + +1. **F-INFRA-1** — `scripts/start-services.sh`: replace `require_env ... ASPNETCORE_JwtConfig__Secret` with `require_env ... ASPNETCORE_JwtConfig__KeysFolder ASPNETCORE_JwtConfig__ActiveKid`. +2. **F-INFRA-2** — `scripts/start-services.sh`: add a `--volume "$DEPLOY_HOST_JWT_KEYS_DIR:/etc/azaion/jwt-keys:ro"` line; document `DEPLOY_HOST_JWT_KEYS_DIR` in `.env.example` and `secrets/README.md`. +3. **F-INFRA-3** — (a) add a `--volume "$DEPLOY_HOST_DP_KEYS_DIR:/var/lib/azaion/dp-keys"` line; (b) set `ASPNETCORE_DataProtection__KeysFolder=/var/lib/azaion/dp-keys` in `secrets/.public.env`; (c) fail-fast in `Program.cs:151-160` if Production and `KeysFolder` is unset. +4. **F-INFRA-4** — rewrite `secrets/README.md` "Schema" section to drop `JwtConfig__Secret` and add `JwtConfig__KeysFolder`, `JwtConfig__ActiveKid`, `DataProtection__KeysFolder`. +5. **F-AUTH-1** — `UserService.ValidateUser` + `BusinessException.cs`: introduce a single `InvalidCredentials` error code (e.g. 70). Map both `NoEmailFound`-equivalent and `WrongPassword`-equivalent paths to it. Move the `IsEnabled` check before the password verify (closes F-AUTH-3 too). +6. **F-AUTH-2** — `MfaService.VerifyForLogin`: increment `failed_login_count` and check the per-account rate limit (call into `UserService` or duplicate the logic). Also: extend `AuditLog.CountRecentFailedLogins` to count `MfaLoginFailed` rows alongside `LoginFailed`. + +### Short-term (Medium — file as separate tickets) + +7. **F-AUTH-4** — attach `LoginPerIpPolicy` (or a dedicated `MfaPerIpPolicy`) to `/users/me/mfa/{enroll,confirm,disable}` in `Program.cs`. +8. **F-AUTH-5** — `MissionTokenService.MintToken`: read the pilot's actual `amr` from the access token (or look it up via `Session.MfaAuthenticated`) and stamp it on the mission token. Reject if mission policy requires MFA and pilot has none. +9. **F-AUTH-6** — `MissionTokenService.Issue`: call `SessionService.RevokeMissionsForAircraft(aircraftId)` immediately before inserting the new mission session row. Also fix the cycle-2 docs which falsely claim this auto-revoke is happening. +10. **F-AUTH-7** — `JwtSigningKeyProvider`: fail-fast when `ActiveKid` is unset and >1 PEM exists; warn-only when exactly 1 PEM exists. Update `.env.example` to mark `ActiveKid` as required for prod. +11. **F-INFRA-5** — audit all `*.azaion.com` subdomains for HTTPS-only; document the inventory in `_docs/04_deploy/`; gate `Preload = true` behind an env var so staging doesn't trigger preload-list submission attempts. +12. **F-AUTH-9** — bypass the user cache for security-critical fields (`LockoutUntil`, `FailedLoginCount`, `MfaEnabled`, `MfaSecret`, `IsEnabled`) by reading them directly in `ValidateUser` and `MfaService`. Required before any horizontal scaling. + +### Long-term (Low / hardening) + +13. **F-AUTH-8** — `/health/ready`: log `ex.GetType().Name` internally; return only `{ status: "not-ready" }` externally. +14. **F-INFRA-6** — agree audit retention (≥1 year per CMMC); add a nightly cleanup job; document in `_docs/04_deploy/`. Consider monthly partitioning if volume warrants. +15. **F-DOCS-1** — sync cycle-2 docs to code: `AuthConfig` does not have `PasswordHashing`; `LockoutOptions.MaxAttempts` (not `ConsecutiveFailureThreshold`); `LockoutOptions.DurationSeconds` (not `LockoutSeconds`); `RateLimitOptions.PerAccountPermitLimit` (not `PerAccountFailedThreshold`); recovery codes are SHA-256 (not Argon2id). + +## Dependency Vulnerabilities (cycle 2) + +`dependency_scan_cycle2.md` is authoritative. Summary: no new CVEs in cycle-2 packages (`Konscious.Security.Cryptography.Argon2 1.3.1`, `Otp.NET 1.4.0`, `QRCoder 1.6.0`, ASP.NET Core `RateLimiting`, `DataProtection`). Two cycle-1 deprecation items are re-emphasized: `FluentValidation.AspNetCore 11.3.0` and `System.IdentityModel.Tokens.Jwt 7.1.2` should both move to maintained alternatives in a focused upgrade ticket. + +## Tracker Follow-Ups + +The 2 Critical + 4 High findings were filed as Jira tickets on 2026-05-14 as the cycle-2 hotfix sprint. Medium / Low items remain unfiled pending prioritization decision (see "Open" row below). + +### Filed (cycle-2 hotfix sprint — 2026-05-14) + +| Ticket | Finding | Title | Severity | Points | +|--------|---------|-------|----------|--------| +| [AZ-552](https://denyspopov.atlassian.net/browse/AZ-552) | F-INFRA-1 | Deploy script hard-blocks on obsolete `JwtConfig__Secret` | Critical | 1 | +| [AZ-553](https://denyspopov.atlassian.net/browse/AZ-553) | F-INFRA-2 | Bind-mount ES256 key folder in deploy script + host-side procedure | Critical | 2 | +| [AZ-554](https://denyspopov.atlassian.net/browse/AZ-554) | F-INFRA-3 | Persist DataProtection keys folder + fail-fast in Production | High | 2 | +| [AZ-555](https://denyspopov.atlassian.net/browse/AZ-555) | F-INFRA-4 | Rewrite `secrets/README.md` schema for ES256 + DataProtection | High | 1 | +| [AZ-556](https://denyspopov.atlassian.net/browse/AZ-556) | F-AUTH-1 + F-AUTH-3 | Collapse login error codes to `InvalidCredentials` + reorder IsEnabled check | High | 2 | +| [AZ-557](https://denyspopov.atlassian.net/browse/AZ-557) | F-AUTH-2 | Lock MFA brute-force into per-account lockout/rate-limit pipeline | High | 3 | + +Bundle: 11 story points. Labels: `security`, `cycle-2-hotfix`, `AZ-530-followup`. All gate the next deploy. + +### Open (Medium / Low — to be triaged) + +| Ticket suggestion | Finding | Title | Severity | Points | +|-------------------|---------|-------|----------|--------| +| AZ-NEW-7 | F-AUTH-4 | Add per-IP rate limiting to `/users/me/mfa/*` endpoints | Medium | 2 | +| AZ-NEW-8 | F-AUTH-5 | Stamp `amr` on mission tokens; gate by mission policy | Medium | 2 | +| AZ-NEW-9 | F-AUTH-6 | Auto-revoke prior aircraft mission on issuance (and fix the docs) | Medium | 2 | +| AZ-NEW-10 | F-AUTH-7 | Fail-fast on ambiguous `ActiveKid`; tighten `.env.example` | Medium | 1 | +| AZ-NEW-11 | F-INFRA-5 | Audit `*.azaion.com` subdomains for HTTPS-only before preload | Medium | 2 | +| AZ-NEW-12 | F-AUTH-9 | Bypass user-cache for security-critical fields (pre-scaling) | Medium | 3 | +| AZ-NEW-13 | F-AUTH-8 | `/health/ready` body redaction | Low | 1 | +| AZ-NEW-14 | F-INFRA-6 | Audit retention policy + cleanup job | Low | 2 | +| AZ-NEW-15 | F-DOCS-1 | Sync cycle-2 auth/security docs to code | Low | 1 | + +## Self-Verification + +- [x] Every Critical / High in this report has location + remediation +- [x] OWASP delta is reconciled against `owasp_review_cycle2.md` +- [x] Cycle-1 closures verified by re-reading the cycle-2 code (F-6, F-7, F-8, F-13, A09) +- [x] Verdict matches finding severity (FAIL because of 2 deploy-blocking Criticals, not a clerical "PASS_WITH_WARNINGS") +- [x] Tracker follow-ups grouped by severity for prioritization diff --git a/_docs/05_security/static_analysis_cycle2.md b/_docs/05_security/static_analysis_cycle2.md new file mode 100644 index 0000000..d72b53f --- /dev/null +++ b/_docs/05_security/static_analysis_cycle2.md @@ -0,0 +1,120 @@ +# Static Analysis — Cycle 2 (Auth Modernization, AZ-531..AZ-538) + +**Date**: 2026-05-14 +**Scope**: cycle-2 source files + cross-cutting changes; the cycle-1 surface was already audited in `static_analysis.md`. + +## Files reviewed + +- New: `Azaion.Services/{Security,RefreshTokenService,SessionService,MfaService,MissionTokenService,JwtSigningKeyProvider,AuditLog}.cs` +- Modified: `Azaion.Services/{AuthService,UserService}.cs`, `Azaion.AdminApi/{Program,BusinessExceptionHandler}.cs` +- Common: `Azaion.Common/{BusinessException,Configs/AuthConfig,Configs/JwtConfig,Configs/SessionConfig,Entities/Session,Entities/AuditEvent,Entities/User,Database/AzaionDb,Database/AzaionDbShemaHolder,Requests/*}.cs` +- DB: `env/db/{07_auth_lockout_and_audit,08_sessions,09_sessions_logout_and_mission,10_users_mfa}.sql` + +## Findings — cycle 2 (severity-ranked) + +### F-2026Q2-AUTH-1: User enumeration via login error codes (HIGH) + +- **Location**: `Azaion.Services/UserService.cs:120-148`, `Azaion.Common/BusinessException.cs:33-52`. +- **Description**: `ValidateUser` returns `BusinessException(NoEmailFound)` (code 10, "No such email found.") when the email doesn't exist, and `BusinessException(WrongPassword)` (code 30, "Passwords do not match.") when the email exists but the password is wrong. The two codes (and human-readable messages) are distinguishable to any client. This is a textbook user-enumeration leak. +- **Status**: **pre-existing in cycle 1** (not introduced by cycle 2) but the threat model is materially worse now — cycle 2 added per-account lockout that an attacker can selectively trigger only on real accounts. +- **Impact**: aids credential stuffing (attacker pre-filters their email list), aids targeted phishing, allows opportunistic DoS-via-lockout against known-real accounts. +- **Remediation**: collapse both branches to a single generic error — either `WrongPassword` for both, or a new `InvalidCredentials` code. Cycle-2 audit recommends: introduce `InvalidCredentials` (code 70 or similar) and migrate over a deprecation window. +- **Cross-ref**: `_docs/05_security/static_analysis.md` already captured this for cycle 1 but as Medium; **upgraded to High** for cycle 2. + +### F-2026Q2-AUTH-2: MFA brute force not rate-limited per account (HIGH) + +- **Location**: `Azaion.Services/MfaService.cs:247-278`. +- **Description**: `VerifyForLogin` records `audit_events.mfa_login_failed` on a wrong code but does NOT increment `users.failed_login_count` or `users.lockout_until`. The per-account sliding-window check in `UserService.ValidateUser` only counts `event_type='login_failed'` rows, **not** `mfa_login_failed`. The per-IP rate limiter on `/login/mfa` is the only throttle. +- **Impact**: an attacker with a leaked password can brute-force the 6-digit TOTP from rotating IPs. ~1M tries per minute saturate the 30s TOTP step before lockout; with rotation across IPs the per-IP limit is bypassed. The account never locks. +- **Remediation**: either (a) make `CountRecentFailedLogins` also count `mfa_login_failed` events, or (b) call `RegisterFailedLogin`-equivalent inside `VerifyForLogin` so `failed_login_count` increments. Recommend (b) because it surfaces the MFA failures into the existing lockout mechanism. + +### F-2026Q2-AUTH-3: Disabled-account leak via auth ordering (MEDIUM) + +- **Location**: `Azaion.Services/UserService.cs:144-152`. +- **Description**: `ValidateUser` checks `IsEnabled` AFTER password verification: + + ``` + var verify = Security.VerifyPassword(...); + if (!verify.Valid) throw WrongPassword; + if (!user.IsEnabled) throw UserDisabled; + ``` + + An attacker who knows the password of a disabled account learns the password is correct (sees `UserDisabled` instead of `WrongPassword`). +- **Impact**: confirms a credential pair is valid even if the account is disabled — useful for credential-replay against re-enabled accounts and for tracking which disabled accounts to keep targeting. +- **Remediation**: check `IsEnabled` BEFORE the password verify, return the same generic error as F-AUTH-1. + +### F-2026Q2-AUTH-4: MFA management endpoints not rate-limited (MEDIUM) + +- **Location**: `Azaion.AdminApi/Program.cs:452-481`. +- **Description**: `/users/me/mfa/{enroll,confirm,disable}` are mounted with `RequireAuthorization()` but NOT `RequireRateLimiting(LoginPerIpPolicy)`. Only `/login` and `/login/mfa` have the per-IP limit attached. An attacker with a stolen access token can brute-force the disable-confirm code unbounded. +- **Impact**: stolen-access-token escalation path that ends in MFA being silently turned off; account is then a single-factor target. +- **Remediation**: add a dedicated `mfa-per-account` policy (or attach `LoginPerIpPolicy`) to all three `/users/me/mfa/*` endpoints. + +### F-2026Q2-AUTH-5: Mission tokens missing `amr` claim (MEDIUM) + +- **Location**: `Azaion.Services/MissionTokenService.cs:103-111`. +- **Description**: The mission token claim list omits `amr`. The cycle-2 design intent (per `_docs/02_document/system-flows.md` F13 and the `// TODO (AZ-534)` comment in Program.cs:489) is `amr=["pwd","mfa"]`. Verifiers downstream cannot enforce "missions require MFA-authenticated pilots" because the claim is absent. +- **Impact**: a pilot logged in with `pwd` only can issue mission tokens that look identical to MFA-pilot ones to verifiers. The intended MFA-step-up gating is unenforceable. +- **Remediation**: read the pilot's current access-token AMR (or look it up via `sid` → `Session.MfaAuthenticated`) and stamp `amr=["pwd"]` or `amr=["pwd","mfa"]` accordingly. Reject if policy requires MFA and the pilot has none. + +### F-2026Q2-AUTH-6: Mission token issuance does not auto-revoke prior aircraft missions (MEDIUM) + +- **Location**: `Azaion.Services/MissionTokenService.cs:46-87`. +- **Description**: `Issue` inserts a new mission session row but does NOT call `SessionService.RevokeMissionsForAircraft(aircraftId)` first. Auto-revoke happens only when the aircraft itself logs in (`Program.cs:347-349`). Consequence: an admin can issue two concurrent mission tokens for the same aircraft and both remain valid until the aircraft re-connects. +- **Impact**: violates AZ-533 AC-4. Two pilots can hold parallel valid mission tokens for the same aircraft, breaking the "one mission per aircraft at a time" invariant. Audit / forensic confusion. +- **Remediation**: call `SessionService.RevokeMissionsForAircraft(request.AircraftId, ct)` immediately before the `db.InsertAsync(new Session ...)` for the mission row. +- **Doc impact**: also fix the cycle-2 docs (`system-flows.md` F13, `components/03_auth_and_security/description.md`) which currently *claim* this auto-revoke happens at issuance. + +### F-2026Q2-AUTH-7: Silent fallback to first PEM if `ActiveKid` is unset (MEDIUM) + +- **Location**: `Azaion.Services/JwtSigningKeyProvider.cs:73-86`. +- **Description**: When `JwtConfig.ActiveKid` is not configured, the provider falls back to the alphabetically-first PEM and only logs a `LogInformation`. Adding a new PEM whose filename sorts earlier than the current active kid silently changes the signing key. +- **Impact**: rotation accidents — operator drops `kid-2026-04-aaa.pem` thinking it's a side-by-side key, but it becomes the new signer because of name ordering. Tokens minted under the wrong kid; verifiers may reject (depending on JWKS cache state). +- **Remediation**: fail-fast when `ActiveKid` is unset and more than one PEM exists. If exactly one PEM exists, accept it but log at `Warning` (not `Information`). + +### F-2026Q2-AUTH-8: Health-readiness leaks DB exception type to anonymous callers (LOW) + +- **Location**: `Azaion.AdminApi/Program.cs:261-280`. +- **Description**: `/health/ready` returns `{ status: "not-ready", reason: ex.GetType().Name }` to any anonymous caller on DB failure. This leaks the underlying exception type (e.g., `NpgsqlException`, `TimeoutException`). +- **Impact**: minor info leak; no direct exploit but useful for fingerprinting in a compromise chain. +- **Remediation**: log the exception type internally; return only `{ status: "not-ready" }` externally. +- **Mitigating control**: per Program.cs line 252 comment, `/health/*` is "expected to be exposed only on the management interface (not via the public Nginx vhost)". If that nginx config is verified, drop this finding. + +### F-2026Q2-AUTH-9: User cache TTL holds lockout / MFA-state for 4h across instances (MEDIUM, deployment-coupled) + +- **Location**: `Azaion.Services/UserService.GetByEmail` (cycle 1) + cycle-2 surface that consumes `User.LockoutUntil` / `MfaEnabled`. +- **Description**: `GetByEmail` returns a 4-hour-cached `User` that includes `LockoutUntil`, `MfaEnabled`, `MfaSecret`. `cache.Invalidate` is local to the process. In a multi-instance deploy, instance A's lockout / MFA-disable / password-rehash is invisible to instance B for up to 4 h. +- **Impact**: a locked account can keep authenticating against a cold instance; an MFA-disabled user can keep being challenged for MFA against a stale instance; lazy-rehash race window grows. +- **Mitigating control**: per `architecture.md` §3, deployment is "self-hosted Linux server (single container deployment via deploy.cmd)". Single-instance = no cross-instance staleness today. Risk surfaces if/when the system scales horizontally. +- **Remediation**: bypass the cache for the security-critical fields (`LockoutUntil`, `FailedLoginCount`, `MfaEnabled`, `MfaSecret`, `IsEnabled`) by reading them directly in `ValidateUser` and `MfaService`, OR move to a shared-cache (Redis) before horizontal scaling. + +### F-2026Q2-DOCS-1: Documentation drift surfaced during audit (LOW) + +- **Location**: `_docs/02_document/components/03_auth_and_security/description.md`, `_docs/02_document/data_model.md` §Configuration POCOs, `_docs/02_document/modules/services_mfa_service.md`. +- **Description**: cycle-2 documentation drifted from the actual code on: + - `AuthConfig.PasswordHashing { TimeCost, MemoryCostKiB, Parallelism }` does NOT exist — Argon2id parameters are hardcoded constants in `Security.cs:16-23`. + - `LockoutOptions` has fields `MaxAttempts` (not `ConsecutiveFailureThreshold`) and `DurationSeconds` (not `LockoutSeconds`). + - `RateLimitOptions` has `PerAccountPermitLimit` (not `PerAccountFailedThreshold`). + - Recovery codes are SHA-256-hashed (high-entropy server-issued secrets), not Argon2id as docs claimed. +- **Impact**: misleads operators when tuning Argon2id; misleads ops when wiring `appsettings.json`. +- **Remediation**: docs-only fix in `Step 13` follow-up — to be addressed in the next docs sync. Tracked here so it's not forgotten. + +## Vulnerability patterns scan (clean) + +| Pattern | Result | +|---------|--------| +| SQL injection (string-interpolated SQL) | Clean — only `db.Execute("UPDATE ... WHERE id=@id", DataParameter)` and `SELECT 1`. All cycle-2 queries are LinqToDB expression-based. | +| Command injection / `Process.Start` / `subprocess` | Clean — no shell-out from C# in the cycle-2 surface. | +| XSS | N/A — JSON-only API; responses serialized via `Newtonsoft.Json` / `Results.Ok`. | +| Hardcoded secrets / credentials | Clean — no plaintext credentials in `Azaion.{AdminApi,Services,Common}/`. The seed `admin@azaion.com`/`uploader@azaion.com` rows in `env/db/02_structure.sql` use SHA-384 hashes; cycle 2 lazy-migrates these on first login. | +| Missing auth on endpoints | Clean — verified each `MapGet`/`MapPost`/`MapPut`/`MapDelete` carries either `RequireAuthorization`, `RequireAuthorization()`, or explicit `AllowAnyonymous` with documented justification. | +| Insecure deserialization | Clean — `Newtonsoft.Json` only deserializes typed DTOs; `System.Text.Json` deserializes the recovery-code array; jsonb mapping for `MfaRecoveryCodes` uses `DataType.BinaryJson`. | +| Sensitive data in logs | Clean — passwords, hashes, secrets, refresh tokens, recovery codes are not logged. `BusinessExceptionHandler` logs the exception type + message but no payload. | +| Cryptographic regressions | Clean — Argon2id (RFC 9106) for passwords; SHA-256 for high-entropy refresh tokens + recovery codes; ES256 for JWTs; SHA-256 of UTF-8 bytes for refresh-token hashing. No MD5 / SHA-1 / DES anywhere. | + +## Self-verification + +- [x] All cycle-2 source directories scanned +- [x] Each finding has file path and line numbers +- [x] No false positives from test files (test files were excluded except where they confirm a finding) +- [x] Vulnerability patterns swept (SQLi, cmd injection, secrets, missing auth, insecure deserialization, sensitive logs, weak crypto)