[AZ-197] Remove hardware ID binding from resource flow

Sealed-Jetson + SaaS architecture eliminates the credential-reuse-across-
machines threat that motivated hardware fingerprint binding. The binding's
only remaining effect was a real production failure mode:
HardwareIdMismatch (error code 40) on legitimate drive replacements and
fTPM-attested rotations.

Production:
- Drop PUT /users/hardware/set and POST /resources/check.
- Simplify POST /resources/get/{dataFolder?} (no Hardware field).
- Remove CheckHardwareHash, UpdateHardware, Security.GetHWHash.
- GetApiEncryptionKey signature: (email, password) — no hardwareHash.
- Drop SetHWRequest DTO and Hardware property from GetResourceRequest.
- Remove HardwareIdMismatch (40) and BadHardware (45) ExceptionEnum
  entries; numeric codes left as a gap, not for reuse.

Wire-compat policy: drop entirely (no Loader; no in-flight legacy
clients). Stale callers of the two removed routes will get 404s, which is
the right loud failure.

Tombstones:
- User.Hardware DB column kept (nullable, unused); the migration goes to a
  separate cleanup ticket per the workspace "no rename without
  confirmation" rule.
- User.LastLogin is now never written by app code (only writer was inside
  the deleted CheckHardwareHash); flagged in batch_06_review for a future
  ticket.

Tests:
- Delete e2e HardwareBindingTests (165 lines) and Azaion.Test
  UserServiceTest (sole test was CheckHardwareHashTest).
- Drop Hardware payloads + /resources/check preconditions from e2e
  ResourceTests, SecurityTests, ResilienceTests; drop hardwareId arg
  from Azaion.Test SecurityTest.
- Add SecurityTests.Hardware_endpoints_are_removed_AZ_197 (AC-2 regression
  asserting both removed routes return 404).

Docs:
- architecture.md: System Context note, ADR-003 new key formula, ADR-004
  retired with rationale.
- diagrams/flows/flow_hardware_check.md: tombstoned.

Also archives the four batch-1+batch-2 task files into _docs/02_tasks/done/
(file moves were missed by the batch_05 commit).

Code review: PASS — see _docs/03_implementation/reviews/batch_06_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-13 04:46:39 +03:00
parent 5ca9ccab2c
commit 5e90512987
22 changed files with 359 additions and 490 deletions
@@ -0,0 +1,126 @@
# Remove Hardware ID Binding
**Task**: AZ-197_remove_hardware_id
**Name**: Remove hardware ID binding from resource flow (admin-side cleanup)
**Description**: Remove `CheckHardwareHash`, `UpdateHardware`, `HardwareService`, the `PUT /users/hardware/set` endpoint, and the hardware-hash component of API encryption-key derivation. The threat this protected against (credential reuse across machines via desktop installers) no longer exists in the target architecture.
**Complexity**: 3 points
**Dependencies**: None
**Component**: Admin API
**Tracker**: AZ-197
**Epic**: AZ-181
## Problem
The `Hardware` field on `User` and the `CheckHardwareHash` flow were designed to bind a user account to a specific physical machine, preventing credential reuse across machines when users had desktop installers.
The target architecture has eliminated that threat:
- **Edge devices** ship as **secured Jetsons with fTPM** (secure boot, fTPM-protected key storage, no user filesystem access, no desktop installers distributed). Hardware identity is anchored in the fTPM, not in a SHA-384 of CPU/GPU/Memory/DriveSerial strings.
- **Server / desktop access** uses the **SaaS** path (browser → admin API). There is no installer to copy and no hardware fingerprint to take.
- The Loader component itself has been **architecturally retired** (Scenario X = Watchtower + rclone + flight-gate; see `suite/_docs/_repo-config.yaml` `unresolved:loader-retirement-arch-doc` and the assumptions log entries dated 2026-04-19). Provisioning was relocated from `loader/scripts/` to `suite/_infra/provisioning/`. There is no `loader/` workspace in the suite anymore, so this ticket is now **purely admin-side**.
The hardware binding therefore adds:
- Unnecessary complexity in the encryption-key derivation chain.
- A real production failure mode (`HardwareIdMismatch`, error code 40) on legitimate drive-replacement / fTPM-attested rotations.
- A maintenance cost on every endpoint and DTO that still carries the `Hardware` field.
## Outcome
- Resource download flow no longer requires a hardware fingerprint.
- API encryption-key derivation simplified to email + password only.
- All admin-API hardware-binding code paths removed.
- Hardware-binding tests removed (unit + e2e); other tests updated to stop sending `Hardware`.
- DB column `User.Hardware` left in place but nullable and unused — no migration in this ticket (separate cleanup ticket if/when desired).
## Scope
### Included — Admin API production code
- Remove `CheckHardwareHash` and `UpdateHardware` from `IUserService` / `UserService` (`Azaion.Services/UserService.cs`).
- Remove `PUT /users/hardware/set` endpoint from `Azaion.AdminApi/Program.cs`.
- Simplify `POST /resources/get/{dataFolder}` (`Program.cs` + `Azaion.Services/ResourcesService.cs`): remove `request.Hardware` parameter usage; derive encryption key without the hardware hash.
- Simplify `POST /resources/check`: remove the hardware-binding side-effect entirely. If the endpoint becomes purely a "do I have any newer resources?" probe, keep it; if it becomes a no-op shell, remove it (decide based on what consumers still call today).
- Update `Security.GetApiEncryptionKey` (`Azaion.Services/Security.cs`) to drop the `hardwareHash` parameter from its signature; derive the key from `email + password` only.
- Remove (do not deprecate) `Security.GetHWHash` — the codebase has no Loader to coordinate with anymore.
- Remove `SetHWRequest` DTO (`Azaion.Common/Requests/SetHWRequest.cs`).
- Remove the `Hardware` property usage from `GetResourceRequest` (`Azaion.Common/Requests/GetResourceRequest.cs`). The wire field may still be **accepted** (deserialized and ignored) for one release cycle to keep any in-flight legacy clients from breaking on 400s — pick the simplest of (drop entirely / accept-and-ignore) and document the choice in the implementation report.
- Remove `HardwareIdMismatch` and `BadHardware` from `Azaion.Common/BusinessException.cs` `ExceptionEnum`.
- Leave `User.Hardware` column in DB (nullable, unused). No migration here.
### Included — Tests in this workspace
- Delete `e2e/Azaion.E2E/Tests/HardwareBindingTests.cs` entirely (every test in that file asserts behaviour that is being removed).
- Update `e2e/Azaion.E2E/Tests/ResourceTests.cs`, `ResilienceTests.cs`, `SecurityTests.cs` to stop sending the `Hardware` field on resource calls (or to assert the field is ignored, whichever matches the chosen wire-compat policy above).
- Update `Azaion.Test/UserServiceTest.cs` and `Azaion.Test/SecurityTest.cs` to remove tests asserting hardware-hash behaviour and to drop the `hardwareHash` argument from any retained `GetApiEncryptionKey` calls.
- Trim test fixtures in `db-init/` and `Azaion.Test` if they seed a `User.Hardware` value purely to satisfy hardware-binding flows.
### Included — Workspace docs (pointer-only updates, no full rewrite)
- Mark `_docs/02_document/diagrams/flows/flow_hardware_check.md` as obsolete (header note + link to AZ-197 implementation report) — full deletion is fine if cleaner.
- Mark `_docs/02_document/modules/common_requests_set_hw.md` as obsolete (the documented module no longer exists).
- Note in `_docs/02_document/architecture.md` (Security & Encryption section) that API encryption-key derivation no longer includes a hardware hash, and that `User.Hardware` is a tombstoned column.
### Excluded
- Database migration to drop the `hardware` column from `users` (separate ticket if/when desired; harmless to leave nullable).
- Changes to user registration or login flow (those don't touch the hardware path).
- Any change to the suite-level `_docs/00_top_level_architecture.md` "Security & Encryption" / "Binary Split Security" sections — that's part of `unresolved:loader-retirement-arch-doc` and is owned at the suite level, not by this ticket.
- Live-device decommissioning of fielded Loader containers (separate ops runway, tracked as a sibling of `loader-retirement-arch-doc`).
- Anything in the `loader/` workspace — it does not exist in the suite anymore.
## Acceptance Criteria
**AC-1: Resource download works without hardware**
Given a provisioned device user with valid email and password
When `POST /resources/get/{dataFolder}` is called without a `Hardware` field
Then the resource is returned and decrypts successfully using a key derived from email + password only
**AC-2: Hardware-set endpoint is gone**
Given the updated admin API
When `PUT /users/hardware/set` is called with any payload
Then the response is 404
**AC-3: Encryption-key derivation is simplified**
Given the updated `Security.GetApiEncryptionKey`
When it is called with `(email, password)`
Then it returns the key derived from `email + password` only — there is no `hardwareHash` parameter on the public signature
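
AC-3 can be pictured with a minimal sketch. The real formula is recorded in ADR-003 and is not reproduced in this ticket, so the hash function (SHA-384), the `-` separator, and the name `get_api_encryption_key` are illustrative assumptions, written in Python rather than the codebase's C#.

```python
import hashlib


def get_api_encryption_key(email: str, password: str) -> str:
    """Hypothetical stand-in for Security.GetApiEncryptionKey(email, password).

    Assumption: SHA-384 over "email-password", hex-encoded. The real
    derivation is defined in ADR-003 and may differ.
    """
    material = f"{email}-{password}".encode("utf-8")
    return hashlib.sha384(material).hexdigest()


# The same credentials yield the same key on any machine, because no
# hardware fingerprint enters the derivation.
key_a = get_api_encryption_key("device@example.com", "s3cret")
key_b = get_api_encryption_key("device@example.com", "s3cret")
```

The point of the sketch is the signature: with only two inputs, a drive replacement or fTPM rotation cannot change the derived key.
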
**AC-4: Hardware-binding tests are gone**
Given the updated test projects
When the test suite is built and listed
Then `HardwareBindingTests` does not exist and no remaining test asserts `HardwareIdMismatch` / error code 40 / hardware-hash binding
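
One way to check AC-4 mechanically is a source scan for the removed identifiers. The sketch below is an assumption about how such a check could look, not part of the workspace tooling; the helper name `find_hardware_references` and the choice to scan only `*.cs` files are hypothetical.

```python
from pathlib import Path

# Identifiers that this ticket removes; taken from the ticket text.
FORBIDDEN = (
    "HardwareBindingTests",
    "HardwareIdMismatch",
    "BadHardware",
    "CheckHardwareHash",
)


def find_hardware_references(root: Path) -> list[tuple[str, int, str]]:
    """Return (file, line number, identifier) for every leftover reference."""
    hits = []
    for path in sorted(root.rglob("*.cs")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for term in FORBIDDEN:
                if term in line:
                    hits.append((str(path), lineno, term))
    return hits
```

Running `find_hardware_references(Path("."))` from the repository root and expecting an empty list would cover both the test projects and production code. A plain substring match can also flag comments or changelog text, so hits still need a manual look.
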
**AC-5: Resource calls in remaining tests do not send `Hardware`**
Given the updated `ResourceTests`, `ResilienceTests`, `SecurityTests`
When the resource-download / resource-check requests are inspected
Then no test sends a `Hardware` field on any resource request (or, if accept-and-ignore wire-compat was chosen, tests assert the response is unchanged whether `Hardware` is present or absent)
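
The accept-and-ignore alternative mentioned in AC-5 can be sketched as follows. The `password` and `file_name` fields are placeholders, since the remaining properties of `GetResourceRequest` are not listed in this ticket, and both parser functions are hypothetical. The strict variant shows only one possible server reaction to a stale field; the ticket does not specify that behaviour.

```python
from dataclasses import dataclass, fields


@dataclass
class GetResourceRequest:
    # Hypothetical field names; the real DTO's properties are not shown
    # in this ticket.
    password: str
    file_name: str


def parse_accept_and_ignore(payload: dict) -> GetResourceRequest:
    """Accept-and-ignore: unknown keys such as "Hardware" are discarded."""
    known = {f.name for f in fields(GetResourceRequest)}
    return GetResourceRequest(**{k: v for k, v in payload.items() if k in known})


def parse_strict(payload: dict) -> GetResourceRequest:
    """Strict: any unknown key, including a stale "Hardware", is rejected."""
    known = {f.name for f in fields(GetResourceRequest)}
    unknown = sorted(set(payload) - known)
    if unknown:
        raise ValueError(f"unexpected fields: {unknown}")
    return GetResourceRequest(**payload)


legacy = {"password": "s3cret", "file_name": "model.bin", "Hardware": "cpu|gpu|ram|serial"}
modern = {"password": "s3cret", "file_name": "model.bin"}
```

Under accept-and-ignore, `parse_accept_and_ignore(legacy)` and `parse_accept_and_ignore(modern)` compare equal, which is the "response is unchanged" assertion AC-5 describes.
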
**AC-6: ExceptionEnum no longer has hardware codes**
Given `Azaion.Common/BusinessException.cs`
When the `ExceptionEnum` is read
Then `HardwareIdMismatch` and `BadHardware` entries are gone, and no production code references them
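
The commit message states that the numeric codes 40 and 45 stay as a gap and must not be reused. A small guard for that rule could look like the sketch below; the surviving enum members are placeholders, not the real contents of `Azaion.Common/BusinessException.cs`, and `reused_retired_codes` is a hypothetical helper.

```python
from enum import IntEnum

# Codes retired by AZ-197 (HardwareIdMismatch = 40, BadHardware = 45).
RETIRED_CODES = frozenset({40, 45})


class ExceptionEnum(IntEnum):
    # Hypothetical surviving entries; names and values are placeholders.
    EXAMPLE_ERROR_A = 10
    EXAMPLE_ERROR_B = 50


def reused_retired_codes(enum_cls: type[IntEnum]) -> list[str]:
    """Return the names of members that occupy a retired numeric code."""
    return [m.name for m in enum_cls if m.value in RETIRED_CODES]


offenders = reused_retired_codes(ExceptionEnum)
```

Keeping the gap means a client that still holds an old error code 40 can never misread it as an unrelated new error.
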
**AC-7: Build is clean**
Given the workspace after the changes
When `dotnet build` runs across the solution
Then it completes with no errors and no new warnings introduced by this ticket
**AC-8: Test suite passes**
Given the workspace after the changes
When the existing test suite (`Azaion.Test` + `e2e/Azaion.E2E`) is run via `docker-compose.test.yml`
Then all tests pass (the deleted `HardwareBindingTests` are not counted)
## Constraints
- Wire-compat policy on the `Hardware` field of resource requests must be chosen explicitly (drop / accept-and-ignore) and recorded in the implementation report — this is the only consumer-facing contract change in the ticket.
- Do not rename `User.Hardware` column or drop it from the entity in this ticket; only stop reading/writing it. Renaming/dropping requires a separate migration ticket per the workspace's "no rename without confirmation" rule.
## Cross-architecture context
This ticket is the admin-side half of an architectural transition that has already happened:
- Loader retirement (Scenario X): `suite/_docs/_repo-config.yaml` `unresolved:loader-retirement-arch-doc`
- Suite-root restructure (2026-04-19): see assumptions_log entries in the same file
- Admin-side hardware-binding cleanup: **this ticket** (AZ-197)
The matching suite-doc refresh (top-level architecture, Binary Split Security section) is tracked separately under the unresolved item above and is intentionally NOT in this ticket's scope.