[AZ-197] Remove hardware ID binding from resource flow

Sealed-Jetson + SaaS architecture eliminates the credential-reuse-across-
machines threat that motivated hardware fingerprint binding. The binding's
only remaining effect was a real production failure mode on legitimate
hardware events.

Production:
- Drop PUT /users/hardware/set and POST /resources/check.
- Simplify POST /resources/get/{dataFolder?} (no Hardware field).
- Remove CheckHardwareHash, UpdateHardware, Security.GetHWHash.
- GetApiEncryptionKey signature: (email, password) — no hardwareHash.
- Drop SetHWRequest DTO and Hardware property from GetResourceRequest.
- Remove HardwareIdMismatch (40) and BadHardware (45) ExceptionEnum
  entries; numeric codes left as a gap, not for reuse.

Wire-compat policy: drop entirely (no Loader; no in-flight legacy
clients). Stale callers will see 404s, which is the right loud failure.

Tombstones:
- User.Hardware DB column kept (nullable, unused) — separate cleanup
  ticket for the migration per workspace "no rename without confirmation".
- User.LastLogin is now never written by app code (only writer was inside
  the deleted CheckHardwareHash); flagged in batch_06_review for a future
  ticket.

Tests:
- Delete e2e HardwareBindingTests (165 lines) and Azaion.Test
  UserServiceTest (sole test was CheckHardwareHashTest).
- Drop Hardware payloads + /resources/check preconditions from e2e
  ResourceTests, SecurityTests, ResilienceTests; drop hardwareId arg
  from Azaion.Test SecurityTest.
- Add SecurityTests.Hardware_endpoints_are_removed_AZ_197 (AC-2 regression
  asserting both removed routes return 404).

Docs:
- architecture.md: System Context note, ADR-003 new key formula, ADR-004
  retired with rationale.
- diagrams/flows/flow_hardware_check.md: tombstoned.

Also archives the four batch-1+batch-2 task files into _docs/02_tasks/done/
(file moves were missed by the batch_05 commit).

Code review: PASS — see _docs/03_implementation/reviews/batch_06_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-13 04:46:39 +03:00
parent 5ca9ccab2c
commit 5e90512987
22 changed files with 359 additions and 490 deletions
@@ -0,0 +1,126 @@
# Remove Hardware ID Binding
**Task**: AZ-197_remove_hardware_id
**Name**: Remove hardware ID binding from resource flow (admin-side cleanup)
**Description**: Remove `CheckHardwareHash`, `UpdateHardware`, `HardwareService`, the `PUT /users/hardware/set` endpoint, and the hardware-hash component of API encryption-key derivation. The threat this protected against (credential reuse across machines via desktop installers) no longer exists in the target architecture.
**Complexity**: 3 points
**Dependencies**: None
**Component**: Admin API
**Tracker**: AZ-197
**Epic**: AZ-181
## Problem
The `Hardware` field on `User` and the `CheckHardwareHash` flow were designed to bind a user account to a specific physical machine, preventing credential reuse across machines when users had desktop installers.
The target architecture has eliminated that threat:
- **Edge devices** ship as **secured Jetsons with fTPM** (secure boot, fTPM-protected key storage, no user filesystem access, no desktop installers distributed). Hardware identity is anchored in the fTPM, not in a SHA-384 of CPU/GPU/Memory/DriveSerial strings.
- **Server / desktop access** uses the **SaaS** path (browser → admin API). There is no installer to copy and no hardware fingerprint to take.
- The Loader component itself has been **architecturally retired** (Scenario X = Watchtower + rclone + flight-gate; see `suite/_docs/_repo-config.yaml` `unresolved:loader-retirement-arch-doc` and the assumptions log entries dated 2026-04-19). Provisioning was relocated from `loader/scripts/` to `suite/_infra/provisioning/`. There is no `loader/` workspace in the suite anymore, so this ticket is now **purely admin-side**.
The hardware binding therefore adds:
- Unnecessary complexity in the encryption-key derivation chain.
- A real production failure mode (`HardwareIdMismatch`, error code 40) on legitimate drive-replacement / fTPM-attested rotations.
- A maintenance cost on every endpoint and DTO that still carries the `Hardware` field.
## Outcome
- Resource download flow no longer requires a hardware fingerprint.
- API encryption-key derivation simplified to email + password only.
- All admin-API hardware-binding code paths removed.
- Hardware-binding tests removed (unit + e2e); other tests updated to stop sending `Hardware`.
- DB column `User.Hardware` left in place but nullable and unused — no migration in this ticket (separate cleanup ticket if/when desired).
## Scope
### Included — Admin API production code
- Remove `CheckHardwareHash` and `UpdateHardware` from `IUserService` / `UserService` (`Azaion.Services/UserService.cs`).
- Remove `PUT /users/hardware/set` endpoint from `Azaion.AdminApi/Program.cs`.
- Simplify `POST /resources/get/{dataFolder}` (`Program.cs` + `Azaion.Services/ResourcesService.cs`): remove `request.Hardware` parameter usage; derive encryption key without the hardware hash.
- Simplify `POST /resources/check`: remove the hardware-binding side-effect entirely. If the endpoint becomes purely a "do I have any newer resources?" probe, keep it; if it becomes a no-op shell, remove it (decide based on what consumers still call today).
- Update `Security.GetApiEncryptionKey` (`Azaion.Services/Security.cs`) to drop the `hardwareHash` parameter from its signature; derive the key from `email + password` only.
- Remove (do not deprecate) `Security.GetHWHash` — the codebase has no Loader to coordinate with anymore.
- Remove `SetHWRequest` DTO (`Azaion.Common/Requests/SetHWRequest.cs`).
- Remove the `Hardware` property usage from `GetResourceRequest` (`Azaion.Common/Requests/GetResourceRequest.cs`). The wire field may still be **accepted** (deserialized and ignored) for one release cycle to keep any in-flight legacy clients from breaking on 400s — pick the simplest of (drop entirely / accept-and-ignore) and document the choice in the implementation report.
- Remove `HardwareIdMismatch` and `BadHardware` from `Azaion.Common/BusinessException.cs` `ExceptionEnum`.
- Leave `User.Hardware` column in DB (nullable, unused). No migration here.
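
The signature change to `Security.GetApiEncryptionKey` described above can be sketched roughly as follows. This is an illustration only, not the project's code: the class name `SecuritySketch`, the `email-password` concatenation, and the choice of SHA-384 with Base64 output are all assumptions made for the example. The authoritative formula is whatever ADR-003 in `architecture.md` records.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SecuritySketch
{
    // Hypothetical two-argument derivation: there is no hardwareHash parameter.
    public static string GetApiEncryptionKey(string email, string password)
    {
        // Assumed key-material layout, for illustration only.
        byte[] material = Encoding.UTF8.GetBytes($"{email}-{password}");

        // SHA-384 is an assumption here, not the documented formula.
        byte[] digest = SHA384.HashData(material);

        return Convert.ToBase64String(digest);
    }
}
```

With a derivation of this shape, the same credentials yield the same key on any machine, so a drive replacement no longer changes the key.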
### Included — Tests in this workspace
- Delete `e2e/Azaion.E2E/Tests/HardwareBindingTests.cs` entirely (every test in that file asserts behaviour that is being removed).
- Update `e2e/Azaion.E2E/Tests/ResourceTests.cs`, `ResilienceTests.cs`, `SecurityTests.cs` to stop sending the `Hardware` field on resource calls (or to assert the field is ignored, whichever matches the chosen wire-compat policy above).
- Update `Azaion.Test/UserServiceTest.cs` and `Azaion.Test/SecurityTest.cs` to remove tests asserting hardware-hash behaviour and to drop the `hardwareHash` argument from any retained `GetApiEncryptionKey` calls.
- Trim test fixtures in `db-init/` and `Azaion.Test` if they seed a `User.Hardware` value purely to satisfy hardware-binding flows.
### Included — Workspace docs (pointer-only updates, no full rewrite)
- Mark `_docs/02_document/diagrams/flows/flow_hardware_check.md` as obsolete (header note + link to AZ-197 implementation report) — full deletion is fine if cleaner.
- Mark `_docs/02_document/modules/common_requests_set_hw.md` as obsolete (the documented module no longer exists).
- Note in `_docs/02_document/architecture.md` (Security & Encryption section) that API encryption-key derivation no longer includes a hardware hash, and that `User.Hardware` is a tombstoned column.
### Excluded
- Database migration to drop the `hardware` column from `users` (separate ticket if/when desired; harmless to leave nullable).
- Changes to user registration or login flow (those don't touch the hardware path).
- Any change to the suite-level `_docs/00_top_level_architecture.md` "Security & Encryption" / "Binary Split Security" sections — that's part of `unresolved:loader-retirement-arch-doc` and is owned at the suite level, not by this ticket.
- Live-device decommissioning of fielded Loader containers (separate ops runway, tracked as a sibling of `loader-retirement-arch-doc`).
- Anything in the `loader/` workspace — it does not exist in the suite anymore.
## Acceptance Criteria
**AC-1: Resource download works without hardware**
Given a provisioned device user with valid email and password
When `POST /resources/get/{dataFolder}` is called without a `Hardware` field
Then the resource is returned and decrypts successfully using a key derived from email + password only
**AC-2: Hardware-set endpoint is gone**
Given the updated admin API
When `PUT /users/hardware/set` is called with any payload
Then the response is 404
**AC-3: Encryption-key derivation is simplified**
Given the updated `Security.GetApiEncryptionKey`
When it is called with `(email, password)`
Then it returns the key derived from `email + password` only — there is no `hardwareHash` parameter on the public signature
**AC-4: Hardware-binding tests are gone**
Given the updated test projects
When the test suite is built and listed
Then `HardwareBindingTests` does not exist and no remaining test asserts `HardwareIdMismatch` / error code 40 / hardware-hash binding
**AC-5: Resource calls in remaining tests do not send `Hardware`**
Given the updated `ResourceTests`, `ResilienceTests`, `SecurityTests`
When the resource-download / resource-check requests are inspected
Then no test sends a `Hardware` field on any resource request (or, if accept-and-ignore wire-compat was chosen, tests assert the response is unchanged whether `Hardware` is present or absent)
**AC-6: ExceptionEnum no longer has hardware codes**
Given `Azaion.Common/BusinessException.cs`
When the `ExceptionEnum` is read
Then `HardwareIdMismatch` and `BadHardware` entries are gone, and no production code references them
**AC-7: Build is clean**
Given the workspace after the changes
When `dotnet build` runs across the solution
Then it completes with no errors and no new warnings introduced by this ticket
**AC-8: Test suite passes**
Given the workspace after the changes
When the existing test suite (`Azaion.Test` + `e2e/Azaion.E2E`) is run via `docker-compose.test.yml`
Then all tests pass (the deleted `HardwareBindingTests` are not counted)
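
A regression check for AC-2 could look roughly like the sketch below. It assumes xUnit and a reachable API instance; the class name, the base address `http://localhost:8080`, and the empty JSON body are placeholders, not values from this workspace. `POST /resources/check` appears alongside `PUT /users/hardware/set` because the commit message records both routes as removed.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Xunit;

public class RemovedHardwareRoutesSketch
{
    // Assumed base address of the API under test; adjust to the e2e environment.
    private static readonly HttpClient Client = new()
    {
        BaseAddress = new Uri("http://localhost:8080")
    };

    [Theory]
    [InlineData("PUT", "/users/hardware/set")]
    [InlineData("POST", "/resources/check")]
    public async Task Removed_route_returns_404(string method, string path)
    {
        // Any payload should do: the route itself is expected not to exist.
        using var request = new HttpRequestMessage(new HttpMethod(method), path)
        {
            Content = new StringContent("{}", Encoding.UTF8, "application/json")
        };

        using var response = await Client.SendAsync(request);

        Assert.Equal(HttpStatusCode.NotFound, response.StatusCode);
    }
}
```

If the API applies a global fallback authorization policy, an unauthenticated call may return 401 before routing reports 404; in that case the request would need a valid token.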
## Constraints
- Wire-compat policy on the `Hardware` field of resource requests must be chosen explicitly (drop / accept-and-ignore) and recorded in the implementation report — this is the only consumer-facing contract change in the ticket.
- Do not rename `User.Hardware` column or drop it from the entity in this ticket; only stop reading/writing it. Renaming/dropping requires a separate migration ticket per the workspace's "no rename without confirmation" rule.
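
When weighing drop against accept-and-ignore, it may help to recall that `System.Text.Json` ignores unmapped JSON properties by default. The sketch below assumes the API deserializes with the web defaults and that no validator or strict unmapped-member setting is in play; the DTO shape (`Password`, `FileName`) is hypothetical and stands in for the real `GetResourceRequest`.

```csharp
using System;
using System.Text.Json;

// Hypothetical post-change DTO: no Hardware property.
public sealed class GetResourceRequestSketch
{
    public string Password { get; set; } = "";
    public string FileName { get; set; } = "";
}

public static class WireCompatSketch
{
    public static void Main()
    {
        // A legacy payload that still carries the removed field.
        const string legacyPayload =
            "{\"password\":\"secret\",\"fileName\":\"model.bin\",\"hardware\":\"cpu|gpu|drive\"}";

        // Web defaults: camelCase names, case-insensitive matching.
        var options = new JsonSerializerOptions(JsonSerializerDefaults.Web);

        var request = JsonSerializer.Deserialize<GetResourceRequestSketch>(legacyPayload, options);

        // The unmapped "hardware" property is skipped rather than rejected.
        Console.WriteLine(request!.FileName);
    }
}
```

Under those assumptions, removing the property from the DTO already behaves like accept-and-ignore for stale payloads; only a validator that requires the field, or an explicit strict unmapped-member setting, would turn it into a rejection.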
## Cross-architecture context
This ticket is the admin-side half of an architectural transition that has already happened:
- Loader retirement (Scenario X) — `suite/_docs/_repo-config.yaml` `unresolved:loader-retirement-arch-doc`
- Suite-root restructure (2026-04-19) — see assumptions_log entries in the same file
- Admin-side hardware-binding cleanup — **this ticket** (AZ-197)
The matching suite-doc refresh (top-level architecture, Binary Split Security section) is tracked separately under the unresolved item above and is intentionally NOT in this ticket's scope.
@@ -1,70 +0,0 @@
# Remove Hardware ID Binding
**Task**: AZ-197_remove_hardware_id
**Name**: Remove hardware ID binding from resource flow
**Description**: Remove CheckHardwareHash, UpdateHardware, HardwareService and simplify API encryption key derivation. Sealed Jetsons eliminate the credential-reuse threat this was protecting against.
**Complexity**: 3 points
**Dependencies**: None
**Component**: Admin API, Loader
**Tracker**: AZ-197
**Epic**: AZ-181
## Problem
The `Hardware` field on `User` and the `CheckHardwareHash` flow were designed to bind a user account to a specific physical machine, preventing credential reuse across machines when users had desktop installers. With sealed Jetsons (secure boot, fTPM, no user filesystem access, no installers distributed), this threat no longer exists. The hardware binding adds unnecessary complexity and failure modes (HardwareIdMismatch on drive replacement, etc.).
## Outcome
- Simpler resource download flow without hardware fingerprint requirement
- Simpler API encryption key derivation (email + password only)
- Removal of dead code paths related to hardware binding
- Fewer failure modes in production
## Scope
### Admin API changes
- Remove `CheckHardwareHash` and `UpdateHardware` from `IUserService` / `UserService`
- Remove `PUT /users/hardware/set` endpoint from `Program.cs`
- Simplify `POST /resources/get/{dataFolder}`: remove `request.Hardware` parameter, derive encryption key without hardware hash
- Simplify `POST /resources/check`: remove hardware check entirely (or remove the endpoint if unused)
- Update `Security.GetApiEncryptionKey` to not require `hardwareHash` parameter
- Remove or deprecate `Security.GetHWHash`
- Leave `User.Hardware` column nullable in DB (no migration needed, just stop writing/reading it)
- Remove `SetHWRequest` DTO
- Remove `HardwareIdMismatch` and `BadHardware` from `ExceptionEnum`
### Loader client changes
- Remove `HardwareService` class (`hardware_service.pyx`, `hardware_service.pxd`)
- Update `api_client.pyx` `load_bytes`: stop gathering hardware info, stop sending `hardware` field in resource request
- Update `security.pyx` `get_api_encryption_key`: remove `hardware_hash` parameter
- Update `security_provider.py`, `tpm_security_provider.py`, `legacy_security_provider.py`: remove `get_hw_hash` and update `get_api_encryption_key` signature
- Update `GetResourceRequest` validator to not require Hardware field
### Excluded
- Database migration to drop the `hardware` column (leave nullable, stop using it)
- Changes to user registration or login flow
## Acceptance Criteria
**AC-1: Resource download works without hardware**
Given a provisioned device with valid email and password
When the loader calls POST /resources/get without a hardware field
Then the resource is returned and can be decrypted using email + password only
**AC-2: No hardware endpoints remain**
Given the updated admin API
When PUT /users/hardware/set is called
Then 404 is returned
**AC-3: Encryption key derivation is simplified**
Given the updated Security class
When GetApiEncryptionKey is called
Then it derives the key from email + password only (no hardware hash)
**AC-4: HardwareService removed from loader**
Given the updated loader codebase
When the build is run
Then it compiles without hardware_service.pyx/pxd