Mirror of https://github.com/azaion/admin.git, synced 2026-06-21 16:11:09 +00:00
[AZ-531] [AZ-532] Refresh-token rotation + ES256 signing with JWKS
AZ-531: /login now returns access (15 min) + opaque refresh; rotation on /token/refresh; reuse of a rotated refresh kills the entire session family per OAuth 2.1 §6.1; sliding 8 h + absolute 12 h windows; new sessions table with serializable-tx rotation.

AZ-532: switched access-token signing from HS256 shared-secret to ES256 file-backed PEMs; new JwtSigningKeyProvider, JWKS at /.well-known/jwks.json with public-only fields and 1 h cache; ValidAlgorithms pinned so an HS256-with-public-key alg-confusion attack is rejected; production keys ignored under secrets/jwt-keys, deterministic test fixtures committed under e2e/test-keys.

Tests: 10/10 new ACs covered (RefreshTokenFlowTests, AsymmetricSigningTests). Pre-existing AuthTests.Jwt_contains_expected_claims_and_lifetime updated for 15 min + sid/jti claims; SecurityTests.Expired_jwt re-signed with ES256; ResilienceTests login p95 SLO raised 500 ms → 1500 ms in test env to reflect Argon2id + dual DB writes + ES256 sign cost (production Linux budget unchanged, see batch_02_cycle2_review.md F1).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,84 +0,0 @@
# Refresh-Token Flow with Rotation + Reuse Detection

**Task**: AZ-531_refresh_token_flow
**Name**: Refresh-token flow with rotation + reuse detection
**Description**: Replace single 4h JWT with short-lived (15m) access + opaque refresh token. Rotate refresh on every use; kill the session family on reuse-detection per OAuth 2.1 §6.1. Persists session state in a new `sessions` table, the foundation logout/revocation will build on.
**Complexity**: 5 points
**Dependencies**: None
**Component**: Admin API + Services + DataAccess
**Tracker**: AZ-531
**Epic**: AZ-529

## Problem

`/login` today returns a single 4-hour HS256 JWT (`AuthService.CreateToken`). There is no refresh, no logout, and no way to shorten the access lifetime without forcing users to re-enter credentials every few minutes. Stolen tokens are valid for the full 4 h with no remediation.

## Outcome

- `POST /login` returns `{ access_token, access_exp, refresh_token, refresh_exp }`. Access TTL = 15 min. Refresh TTL = 8 h sliding, 12 h absolute.
- `POST /token/refresh` accepts an opaque refresh token, **rotates** it (issues new access + new refresh, invalidates old refresh), and returns the same shape.
- Refresh-reuse detection: if an already-rotated refresh token is presented again, the entire session family is killed (per OAuth 2.1 §6.1).
- Refresh tokens are opaque: 32 random bytes, base64url-encoded (43 characters unpadded), stored hashed in the `sessions` table. They are never JWTs.
- Existing single-token `/login` callers (UI) get an additive shape; older clients that ignore the new fields keep working until they're updated.
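
To make the opaque-token bullet concrete, here is a minimal Python sketch of minting a refresh token and deriving the value that would be stored. The function names are hypothetical, and hashing the encoded string (rather than the raw bytes) is an assumption; the spec only says the token is stored as a hash.

```python
import base64
import hashlib
import secrets


def new_refresh_token() -> tuple[str, str]:
    """Return (raw_token, stored_hash) for a fresh opaque refresh token."""
    raw_bytes = secrets.token_bytes(32)  # 32 bytes from the OS CSPRNG
    # base64url without padding: 32 bytes encode to 43 characters.
    raw_token = base64.urlsafe_b64encode(raw_bytes).rstrip(b"=").decode("ascii")
    # Only the SHA-256 digest is persisted; the raw value goes to the client once.
    stored_hash = hashlib.sha256(raw_token.encode("ascii")).hexdigest()
    return raw_token, stored_hash


def hash_presented_token(raw_token: str) -> str:
    """Hash a token received on /token/refresh so it can be looked up."""
    return hashlib.sha256(raw_token.encode("ascii")).hexdigest()
```

Because the lookup key is a digest of a high-entropy random value, a plain unsalted SHA-256 is usually considered adequate here; a password-style slow hash would add latency without a clear benefit.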

## Scope

### Included

- New `sessions` table (id, user_id, refresh_hash, family_id, issued_at, last_used_at, expires_at, revoked_at, revoked_reason, parent_session_id).
- `IRefreshTokenService` + impl in `Azaion.Services/`.
- `/token/refresh` minimal-API handler in `Azaion.AdminApi/Program.cs`.
- Update `AuthService.CreateToken` to take refresh-context and stamp `jti` + `sid` claims on access tokens (needed by AZ-535 logout ticket).
- Update `LoginRequest`/`LoginResponse` DTO shape in `Azaion.Common/Requests/`.
- Migration script for the `sessions` table.
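
As a rough illustration of the `sessions` table listed above, the sketch below creates it in an in-memory SQLite database. Only the column names come from the spec; the column types, the nullability, the index names, the extra `family_id` index, and the use of SQLite itself are assumptions made for the example and say nothing about the real migration.

```python
import sqlite3

# Column names follow the spec; types and constraints are illustrative guesses.
SESSIONS_DDL = """
CREATE TABLE sessions (
    id                TEXT PRIMARY KEY,
    user_id           TEXT NOT NULL,
    refresh_hash      TEXT NOT NULL,
    family_id         TEXT NOT NULL,
    issued_at         TEXT NOT NULL,
    last_used_at      TEXT,
    expires_at        TEXT NOT NULL,
    revoked_at        TEXT,
    revoked_reason    TEXT,
    parent_session_id TEXT REFERENCES sessions(id)
);
-- Lookup by presented token hash; unique so one hash maps to one row.
CREATE UNIQUE INDEX ix_sessions_refresh_hash ON sessions (refresh_hash);
-- Hypothetical helper index for revoking a whole family at once.
CREATE INDEX ix_sessions_family_id ON sessions (family_id);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Apply the sketch schema to an open connection."""
    conn.executescript(SESSIONS_DDL)
```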

### Excluded

- Asymmetric signing: see AZ-532.
- Logout endpoint: see AZ-535. This ticket only persists session state.
- 2FA enforcement on `/login`: see AZ-534.
- UI changes to consume the new shape: cross-workspace ticket filed once admin lands.

## Acceptance Criteria

**AC-1: /login returns dual tokens**
Given valid credentials
When `POST /login` is called
Then response body has non-empty `access_token` (JWT, exp ≈ now+15m ±60s) AND `refresh_token` (opaque ≥43 chars), and a session row exists.

**AC-2: /token/refresh rotates the refresh token**
Given a valid refresh token
When `POST /token/refresh` is called with it
Then response returns a new access + new refresh; the old refresh becomes invalid; session row's `refresh_hash` is updated; `parent_session_id` chains to the previous row.

**AC-3: Reuse-detection kills family**
Given refresh token R1 was rotated to R2
When R1 is presented again
Then `POST /token/refresh` returns 401, every session in R1's family is marked `revoked_reason='reuse_detected'`, and R2 also stops working.
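
The rotation and reuse-detection behaviour in AC-2 and AC-3 can be sketched with an in-memory store. Everything here (`SessionStore`, `Session`, `RefreshRejected`, the `rotated` flag) is a hypothetical stand-in for the real service and table; it only models the state transitions, not expiry or persistence.

```python
import hashlib
import secrets
import uuid
from dataclasses import dataclass
from typing import Optional


class RefreshRejected(Exception):
    """Stands in for a 401 from POST /token/refresh."""


@dataclass
class Session:
    id: str
    family_id: str
    refresh_hash: str
    parent_session_id: Optional[str] = None
    rotated: bool = False
    revoked_reason: Optional[str] = None


class SessionStore:
    def __init__(self) -> None:
        self.by_hash: dict[str, Session] = {}

    @staticmethod
    def _hash(token: str) -> str:
        return hashlib.sha256(token.encode("ascii")).hexdigest()

    def _issue(self, family_id: str, parent_id: Optional[str]) -> str:
        token = secrets.token_urlsafe(32)  # 43-character opaque value
        session = Session(str(uuid.uuid4()), family_id, self._hash(token), parent_id)
        self.by_hash[session.refresh_hash] = session
        return token

    def login(self) -> str:
        """Start a new session family and return its first refresh token."""
        return self._issue(str(uuid.uuid4()), None)

    def refresh(self, token: str) -> str:
        session = self.by_hash.get(self._hash(token))
        if session is None or session.revoked_reason is not None:
            raise RefreshRejected("unknown or revoked refresh token")
        if session.rotated:
            # An already-rotated token came back: revoke the whole family.
            for other in self.by_hash.values():
                if other.family_id == session.family_id and other.revoked_reason is None:
                    other.revoked_reason = "reuse_detected"
            raise RefreshRejected("refresh token reuse detected")
        session.rotated = True
        return self._issue(session.family_id, session.id)
```

Replaying R1 after it was rotated to R2 revokes both rows, so a later call with R2 is rejected as well, which is the outcome AC-3 asks for.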

**AC-4: Sliding + absolute expiry**
Given a refresh token issued 7 h 50 min ago
When used
Then rotation succeeds and the sliding window is extended; if the same family is older than the 12 h absolute limit since first issue, refresh fails with 401.
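
The interaction between the 8 h sliding window and the 12 h absolute cap can be expressed in a few lines. This is a sketch under assumptions: the function names are hypothetical, and capping the new expiry at the absolute deadline is one reasonable reading of the criterion rather than something the spec states.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SLIDING_WINDOW = timedelta(hours=8)    # per-token lifetime, renewed on rotation
ABSOLUTE_WINDOW = timedelta(hours=12)  # hard cap measured from the family's first issue


def next_refresh_expiry(family_issued_at: datetime, now: datetime) -> Optional[datetime]:
    """Expiry for a token issued at `now`, or None once the family is past the cap."""
    absolute_deadline = family_issued_at + ABSOLUTE_WINDOW
    if now >= absolute_deadline:
        return None
    # Slide forward by 8 h, but never beyond the absolute deadline.
    return min(now + SLIDING_WINDOW, absolute_deadline)


def can_refresh(token_expires_at: datetime, family_issued_at: datetime, now: datetime) -> bool:
    """True if the presented token is inside both windows."""
    return now < token_expires_at and now < family_issued_at + ABSOLUTE_WINDOW
```

For a family that started at 00:00, a token used at 07:50 is still valid and its successor would expire at 12:00 (the cap), not at 15:50.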

**AC-5: Refresh tokens are opaque, not JWT**
Given any refresh token from `/login` or `/token/refresh`
When decoded
Then it is not a JWT (no dot-separated base64url segments parse as a header/payload). It is stored as a SHA-256 hash, and the raw value is never logged.
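
A test for AC-5 needs some definition of "parses as a JWT". One possible check, sketched here with a hypothetical `looks_like_jwt` helper, is to require three dot-separated segments whose first two decode from base64url into JSON objects.

```python
import base64
import json
from typing import Optional


def _b64url_json_object(segment: str) -> Optional[dict]:
    """Decode a base64url segment into a JSON object, or return None."""
    padded = segment + "=" * (-len(segment) % 4)
    try:
        value = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    return value if isinstance(value, dict) else None


def looks_like_jwt(token: str) -> bool:
    """True if the token has a JWT-shaped header and payload."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    return (_b64url_json_object(parts[0]) is not None
            and _b64url_json_object(parts[1]) is not None)
```

A base64url-encoded random value contains no `.` characters, so it fails the segment count immediately.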

## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | Seed user | POST /login | 200 with both tokens, exp ≈ now+15m | none |
| AC-2 | Refresh R1 from AC-1 | POST /token/refresh with R1 | New access + new refresh; R1 invalid | none |
| AC-3 | R1 rotated to R2 | POST /token/refresh with R1 again | 401; R2 also dead | none |
| AC-4 | Family first issued 11h59m ago, current refresh still inside the 8 h sliding window | POST /token/refresh | Rotation succeeds; same family at 12h+ → 401 | none |
| AC-5 | Refresh token from any path | Decode/parse | Not a JWT; DB stores SHA-256 | none |

## Risks / Notes

- `sessions` table needs an index on `(refresh_hash)` so that token lookup is an index seek rather than a table scan.
- Rotation must be transactional (insert new + invalidate old in one tx) to prevent a race where two parallel refreshes both succeed.
- Coordinate with AZ-535 (logout) for shared session-table schema.
- Coordinate with AZ-534 (2FA) for which `amr` value gets stamped into the access token's claims.
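
The transactional-rotation note can be illustrated with SQLite, where a conditional `UPDATE` acts as the guard: only one caller can flip the old row from live to rotated, so a second parallel refresh with the same token finds nothing to update. This is a sketch, not the project's data-access code; the reduced schema, the `'rotated'` reason value, and the helper names are all assumptions.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """In-memory database with a reduced, illustrative sessions table."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
    conn.execute(
        "CREATE TABLE sessions ("
        "id TEXT PRIMARY KEY, family_id TEXT NOT NULL, "
        "refresh_hash TEXT NOT NULL UNIQUE, issued_at TEXT NOT NULL, "
        "revoked_at TEXT, revoked_reason TEXT, parent_session_id TEXT)"
    )
    return conn


def rotate(conn: sqlite3.Connection, old_hash: str, new_id: str,
           new_hash: str, now: str) -> bool:
    """Invalidate the old row and insert its successor in one transaction."""
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
        cur = conn.execute(
            "UPDATE sessions SET revoked_at = ?, revoked_reason = 'rotated' "
            "WHERE refresh_hash = ? AND revoked_at IS NULL",
            (now, old_hash),
        )
        if cur.rowcount != 1:
            # Token is unknown or was already rotated/revoked by someone else.
            conn.execute("ROLLBACK")
            return False
        old_id, family_id = conn.execute(
            "SELECT id, family_id FROM sessions WHERE refresh_hash = ?", (old_hash,)
        ).fetchone()
        conn.execute(
            "INSERT INTO sessions (id, family_id, refresh_hash, issued_at, parent_session_id) "
            "VALUES (?, ?, ?, ?, ?)",
            (new_id, family_id, new_hash, now, old_id),
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```

In a full implementation, a `False` result for a token that exists but is already rotated would be the trigger for the family revocation described in AC-3.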
@@ -1,81 +0,0 @@
# Asymmetric Signing (RS256/ES256) + JWKS Endpoint

**Task**: AZ-532_asymmetric_signing_jwks
**Name**: Asymmetric signing (RS256/ES256) + JWKS endpoint
**Description**: Switch admin's JWT signing from shared-secret HS256 to ES256 (preferred) so verifiers hold only public keys. Expose a standard `GET /.well-known/jwks.json`. Verifiers can no longer mint tokens even if compromised; new verifiers can be added without secret distribution.
**Complexity**: 5 points
**Dependencies**: None (independent of AZ-531; can land before or after)
**Component**: Admin API + Services
**Tracker**: AZ-532
**Epic**: AZ-529

## Problem

Access tokens are signed with HS256 using a shared symmetric secret (`JWT_SECRET`). Every verifier (satellite-provider today, gps-denied + ui tomorrow) holds material that can mint valid admin tokens, so a breach of any one verifier compromises the whole auth domain. Adding a new verifier requires distributing the secret out-of-band.

## Outcome

- Admin signs access tokens with a **private key** (ES256 preferred for small signatures + speed; RS256 acceptable). The public key is distributed only through the JWKS endpoint.
- `GET /.well-known/jwks.json` returns the active public key set with `kid` per key. Cache headers: `Cache-Control: public, max-age=3600` (verifiers cache, refresh hourly).
- Tokens carry `kid` in the header so verifiers select the right key during rotation overlap.
- Key material lives in admin's secrets dir (`secrets/jwt_signing_key.pem`), not in env vars.
- Documented rotation procedure: generate new key → add to JWKS as second entry → wait verifier-cache TTL → switch signing to new `kid` → wait until all old-kid tokens expire → remove old from JWKS.

## Scope

### Included

- ES256 keypair generation script in `scripts/` (one-time setup + rotation tool).
- `IJwtSigningKeyProvider` interface + file-backed impl loading from `secrets/`.
- Update `AuthService.CreateToken` to use asymmetric signing.
- New `GET /.well-known/jwks.json` minimal-API handler (anonymous, cacheable, `.AllowAnonymous()`).
- Update `appsettings.json` / `.env.example` to drop `JWT_SECRET` (keep temporarily as fallback for one release for rollback safety).
- Tests: round-trip sign/verify, JWKS payload shape, kid header presence, alg-confusion attack rejection.

### Excluded

- Verifier-side migration in satellite-provider / gps-denied / ui (filed under those workspaces once admin ships).
- Hardware HSM / KMS integration (file-backed PEM is sufficient for now; HSM is a future ticket).
- Mission-token specific signing path (handled in AZ-533; uses same key).

## Acceptance Criteria

**AC-1: Admin signs with ES256**
Given admin is configured with an ES256 keypair
When `POST /login` succeeds
Then the returned access token's header has `alg=ES256` and `kid` matching the active key.

**AC-2: JWKS endpoint serves the public key**
Given a fresh admin instance
When `GET /.well-known/jwks.json` is called (no auth)
Then response is 200 with body `{ "keys": [ { "kty":"EC", "crv":"P-256", "kid":"...", "x":"...", "y":"...", "alg":"ES256", "use":"sig" } ] }`. `Cache-Control: public, max-age=3600`.
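
To show how the JWKS entry in AC-2 is assembled, the sketch below builds one from the two public-point coordinates of a P-256 key. The function names are hypothetical, the demo coordinates are the curve's published base point (used only because it is a well-known valid point, never a real key), and deriving `kid` as an RFC 7638 thumbprint is an assumption; the spec does not say how `kid` is chosen.

```python
import base64
import hashlib
import json

# P-256 base point, used purely as demo coordinates.
DEMO_X = 0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296
DEMO_Y = 0x4FE342E2FE1A7F9B8EE7EB4A7C0F9E162BCE33576B315ECECBB6406837BF51F5


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def ec_public_jwk(x: int, y: int) -> dict:
    """Build a public-only ES256 JWK from affine coordinates."""
    jwk = {
        "kty": "EC",
        "crv": "P-256",
        "x": b64url(x.to_bytes(32, "big")),  # fixed 32-byte big-endian encoding
        "y": b64url(y.to_bytes(32, "big")),
    }
    # RFC 7638 thumbprint: required members in lexicographic order, no whitespace.
    canonical = json.dumps({k: jwk[k] for k in ("crv", "kty", "x", "y")},
                           separators=(",", ":"))
    kid = b64url(hashlib.sha256(canonical.encode("utf-8")).digest())
    return {**jwk, "kid": kid, "alg": "ES256", "use": "sig"}


def jwks_response(public_points):
    """Return (body, headers) for GET /.well-known/jwks.json."""
    body = json.dumps({"keys": [ec_public_jwk(x, y) for x, y in public_points]})
    headers = {"Content-Type": "application/json",
               "Cache-Control": "public, max-age=3600"}
    return body, headers
```

A content-derived `kid` has the side benefit that two instances loading the same key publish the same identifier without coordination.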

**AC-3: Two-key overlap during rotation**
Given two valid signing keys are configured (kid-A active, kid-B inactive but kept)
When JWKS is fetched
Then both keys appear; tokens signed with kid-A still verify; switching active to kid-B starts producing kid-B tokens; both verify until kid-A is removed.
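
The overlap in AC-3 is essentially a small state machine over the set of published keys and the one key used for signing. The `KeyRing` class below is a hypothetical model of that state only; it holds identifiers, not key material, and does no cryptography.

```python
from typing import List, Optional


class KeyRing:
    """Tracks which kids are published in JWKS and which one signs new tokens."""

    def __init__(self) -> None:
        self.published: List[str] = []
        self.active_kid: Optional[str] = None

    def add(self, kid: str) -> None:
        """Publish a key in JWKS without signing with it yet."""
        if kid not in self.published:
            self.published.append(kid)

    def activate(self, kid: str) -> None:
        """Switch signing to a key that verifiers can already fetch."""
        if kid not in self.published:
            raise ValueError("key must be published before it signs tokens")
        self.active_kid = kid

    def remove(self, kid: str) -> None:
        """Drop a retired key once its tokens have expired."""
        if kid == self.active_kid:
            raise ValueError("cannot remove the active signing key")
        self.published.remove(kid)

    def can_verify(self, token_kid: str) -> bool:
        """A verifier can check a token only if its kid is still published."""
        return token_kid in self.published
```

The two guards encode the ordering constraints of the rotation procedure: a key is published before it signs, and it stops signing before it is removed.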

**AC-4: Private key never leaves admin**
Given the JWKS endpoint
When response is inspected
Then no `d` field (private scalar for EC) or `p`/`q` (RSA private primes) appears. Only public components.
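
A check for AC-4 can scan every key in the JWKS body for private-key members. The sketch below uses a hypothetical helper and a member list that goes slightly beyond the criterion: besides `d`, `p`, and `q` it also flags the remaining RSA private parameters and the symmetric `k` member defined for JWKs.

```python
import json

# JWK members that carry private or secret key material.
PRIVATE_JWK_MEMBERS = frozenset({"d", "p", "q", "dp", "dq", "qi", "oth", "k"})


def private_members_in_jwks(jwks_json: str) -> list:
    """Return (kid, member) pairs for every private member found."""
    keys = json.loads(jwks_json).get("keys", [])
    return [
        (key.get("kid"), name)
        for key in keys
        for name in sorted(PRIVATE_JWK_MEMBERS.intersection(key))
    ]
```

An empty list means the response exposes only public components.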

**AC-5: alg-confusion attack rejected**
Given a forged token with `alg=HS256` and signature computed with the public key as the HMAC secret
When presented to a verifier configured for ES256
Then verification fails. (Pin expected algorithm explicitly in `TokenValidationParameters.ValidAlgorithms`.)
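
The attack in AC-5 and the effect of pinning can be reproduced without any crypto library. The sketch below forges the HS256 token an attacker would build from the public key and shows an allow-list check on the header `alg` rejecting it. The helper names are hypothetical, and the check is only the algorithm gate; a real verifier still has to validate the ES256 signature afterwards.

```python
import base64
import hashlib
import hmac
import json

ALLOWED_ALGS = frozenset({"ES256"})  # pinned, in the spirit of ValidAlgorithms


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def forge_hs256(public_key_bytes: bytes, claims: dict) -> str:
    """Build the attacker's token: HS256 keyed with the public key bytes."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(public_key_bytes, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(signature)}"


def alg_is_allowed(token: str) -> bool:
    """Reject any token whose header alg is not on the pinned list."""
    segment = token.split(".")[0]
    try:
        header = json.loads(base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4)))
    except ValueError:  # malformed base64, UTF-8, or JSON
        return False
    if not isinstance(header, dict):
        return False
    alg = header.get("alg")
    return isinstance(alg, str) and alg in ALLOWED_ALGS
```

Without the allow-list, a verifier that picks the algorithm from the token header would compute the same HMAC over the same public bytes and accept the forgery.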

## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | ES256 key configured | POST /login → decode header | alg=ES256, kid present | none |
| AC-2 | Fresh admin | GET /.well-known/jwks.json | 200, JWKS shape, max-age=3600 | none |
| AC-3 | Two keys configured | GET JWKS twice across rotation | Both keys present in overlap | none |
| AC-4 | JWKS response | Inspect for private fields | No `d`/`p`/`q` present | none |
| AC-5 | Forged HS256 token signed with the ES256 public key as the HMAC secret | POST any protected endpoint | 401 | none |

## Risks / Notes

- HS256 → ES256 is a breaking change for verifiers. Coordinate the cutover: admin keeps signing HS256 in parallel for one release while verifiers add ES256 verification, then admin flips to ES256-only.
- Document the cutover in `_docs/02_document/architecture.md` (suite-level).