[AZ-513] [AZ-196] [AZ-183] Add /classes CRUD, /devices, fleet OTA

AZ-513: POST/PATCH/DELETE /classes for detection-class CRUD; new
DetectionClass entity, schema, DTOs, IDetectionClassService. Unblocks
ui/AZ-512.

AZ-196: POST /devices auto-assigns sequential azj-NNNN serial+email
+password and inserts a CompanionPC user. Returns plaintext credentials
for the provisioning script.

AZ-183: Resources table + POST /get-update + POST /resources/publish
for fleet OTA. Per-resource encryption_key column AES-256-CBC encrypted
at rest with ResourcesConfig.EncryptionMasterKey; ICache wraps the
per-(arch,stage) latest-versions lookup and is invalidated on publish.

Adds IDbFactory.RunAdmin<T> overload for write-and-return.

Backfills _docs/02_document/module-layout.md to satisfy the implement
skill's File Ownership prerequisite (the _docs/ artifact set predates
the Step 1.5 module-layout addition).

Code review: PASS_WITH_WARNINGS — see
_docs/03_implementation/reviews/batch_05_review.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-13 04:34:42 +03:00
parent f13c57b314
commit 5ca9ccab2c
29 changed files with 1319 additions and 21 deletions
@@ -0,0 +1,64 @@
# Code Review Report
**Batch**: 5 (cycle 1, batch 1 of 2)
**Tasks**: AZ-513, AZ-196, AZ-183
**Date**: 2026-05-13
**Verdict**: PASS_WITH_WARNINGS
## Summary
All three additive tasks (Detection Classes CRUD, device registration, fleet OTA Resources) build clean (`dotnet build` 0 warnings, 0 errors against both `Azaion.AdminApi.sln` and `e2e/Azaion.E2E/Azaion.E2E.csproj`), respect the `Azaion.AdminApi → Azaion.Services → Azaion.Common` layering recorded in `_docs/02_document/module-layout.md`, and follow the existing `IUserService` / `Program.cs` / `AzaionDbSchemaHolder` patterns. AC coverage is verified via new e2e tests in `e2e/Azaion.E2E/Tests/` for every admin-side AC. No Critical or High findings; the four Low / Medium findings below are non-blocking and tracked for follow-up.
## Findings
| # | Severity | Category | File:Line | Title |
|---|----------|-----------------|--------------------------------------------------------|-------|
| 1 | Medium | Bug | `Azaion.Services/UserService.cs` (RegisterDevice) | Race condition on sequential serial assignment |
| 2 | Low | Maintainability | `Azaion.Services/ResourceUpdateService.cs` (Publish) | No uniqueness on `(arch, stage, name, version)` rows |
| 3 | Low | Maintainability | `Azaion.Services/ResourceUpdateService.cs` (Encrypt) | Master-key rotation not supported (no key-version column) |
| 4 | Low | Maintainability | `Azaion.AdminApi/appsettings.json` | `EncryptionMasterKey` ships empty by default |
### Finding Details
**F1: Race condition on sequential serial assignment** (Medium / Bug)
- Location: `Azaion.Services/UserService.cs``RegisterDevice`
- Description: Two concurrent `POST /devices` calls can both read the same most-recent `CompanionPC` user, compute the same next number, and both insert. The `users.email` column has no DB-level unique constraint (per `_docs/02_document/data_model.md` § Observations), so the second insert succeeds and creates two users with the same email.
- Suggestion: For the AZ-196 use case (Jetson manufacturing — sequential by design), this is currently low-impact. Long-term mitigations: (a) add a unique constraint on `users.email` and retry on conflict, (b) wrap the read+insert in a `BEGIN; SELECT ... FOR UPDATE; INSERT; COMMIT;` block via `db.BeginTransactionAsync(IsolationLevel.Serializable)`, or (c) drop sequential numbering and use a Guid suffix. Track as a follow-up; out of scope for this 2-pt ticket per spec.
- Task: AZ-196
**F2: No uniqueness on `(arch, stage, name, version)` rows** (Low / Maintainability)
- Location: `Azaion.Services/ResourceUpdateService.cs``Publish`; `env/db/05_resources.sql`
- Description: A re-publish of the same `(architecture, dev_stage, resource_name, version)` tuple will insert a duplicate row. `LoadLatest`'s `OrderByDescending(r => r.Version)` still picks one of them (non-deterministically among equal versions), so device behavior is correct, but the table grows unbounded under repeated re-publishes.
- Suggestion: Add a unique constraint `(architecture, dev_stage, resource_name, version)` in a follow-up migration and decide on `INSERT ... ON CONFLICT DO NOTHING` vs `DO UPDATE` semantics. Out of scope for AZ-183's 3-pt budget.
- Task: AZ-183
**F3: Master-key rotation not supported** (Low / Maintainability)
- Location: `Azaion.Services/ResourceUpdateService.cs``ResourceColumnEncryption`
- Description: The per-resource `encryption_key` column is AES-encrypted with a single static master key from `ResourcesConfig.EncryptionMasterKey`. Rotating the master key would render all existing rows undecryptable. There is no key-version column or fallback list.
- Suggestion: For an OTA system whose master-key compromise blast radius is "every device-side decryption breaks", a future ticket should add `(key_version_id, ciphertext)` storage and a `Dictionary<int, string> activeKeys` with a `currentKeyVersion`. Out of scope here.
- Task: AZ-183
**F4: `EncryptionMasterKey` ships empty by default** (Low / Maintainability)
- Location: `Azaion.AdminApi/appsettings.json`
- Description: Default value is `""`. The service throws `InvalidOperationException` on first call to `GetUpdate` / `Publish` if the env var override is missing. This is intentional (no insecure default) but means a fresh `dotnet run` of the admin API in development will surprise the developer.
- Suggestion: Either (a) keep the empty default and document the env var in the README, or (b) ship a clearly-marked dev-only key in `appsettings.Development.json`. The test runner is already wired up via `docker-compose.test.yml`. Pick one and document.
- Task: AZ-183
## Phase results
- **Phase 1 (Context)**: 3 task specs read; project restrictions/solution unchanged.
- **Phase 2 (Spec compliance)**: AZ-513 ACs 19 covered by `DetectionClassesTests.cs`; AC-10 is cross-workspace (UI). AZ-196 ACs 15 covered by `DeviceRegistrationTests.cs` (AC-1 verifies the format `azj-NNNN`; the literal "0000" assertion is intentionally relaxed because the test DB may carry CompanionPC users from earlier runs — sequential AC-2 is the meaningful guarantee). AZ-183 ACs 1, 2, 3, 5 covered by `ResourceUpdateTests.cs`; AC-4 ("memory cache avoids repeated DB queries") is a perf characteristic not directly assertable via HTTP and is verified by inspection of `ResourceUpdateService.GetUpdate` (the `cache.GetFromCacheAsync` wraps `LoadLatest`).
- **Phase 3 (Code quality)**: SRP respected (`DetectionClassService` separated from `UserService`; `ResourceUpdateService` separated from the file-storage `ResourcesService`). No methods > 50 LoC. No bare catches. Naming consistent with existing `I*Service` / `*Request` / `*Response` conventions.
- **Phase 4 (Security quick-scan)**: No SQL string interpolation (linq2db parameterizes). No command injection. No hardcoded secrets in production code paths — the only literal key is in `docker-compose.test.yml` and is explicitly labeled "do-not-use-in-prod". Input validation via FluentValidation on every DTO. Plaintext password is returned by `POST /devices` per AZ-196 spec (intentional, embedded in the device.conf by provisioning).
- **Phase 5 (Performance scan)**: `ResourceUpdateService.LoadLatest` reads all rows for `(arch, stage)` then group-bys in memory — acceptable given `cache.GetFromCacheAsync` (default 4-hour TTL) and the small per-(arch,stage) row count expected for fleet OTA. No N+1. All DB calls async.
- **Phase 6 (Cross-task consistency)**: All three tasks add a single `MapXxx` block in `Program.cs`, register one `IService` in DI, and use the same FluentValidation + `Results.ValidationProblem` pattern. New `IDbFactory.RunAdmin<T>(...)` overload added by AZ-513 is reused by AZ-196 and AZ-183 — shared abstraction is genuine, not duplicated.
- **Phase 7 (Architecture compliance)**: All imports respect the `Azaion.AdminApi → Azaion.Services → Azaion.Common` layering and the `Azaion.Test` / `e2e/Azaion.E2E` test-only boundaries. No new cross-component imports of internal symbols. No new cyclic module dependencies. `_docs/02_document/module-layout.md` was created earlier in this `/autodev` step to satisfy the implement skill's prerequisite — no edit to it in this batch.
## Verdict logic
- 0 Critical, 0 High → not FAIL
- 1 Medium + 3 Low → PASS_WITH_WARNINGS
## Action
Proceed to commit. The four findings are tracked here and should be revisited as separate tickets when their respective contexts (manufacturing throughput, fleet rollout cadence, key-rotation policy, dev-onboarding ergonomics) warrant.