[AZ-461] [AZ-464] [AZ-470] [AZ-472] Batch 5 - detection/bulk-validate/panel-width/classes tests
ci/woodpecker/push/build-arm Pipeline was successful

- AZ-461 sync image detect URL canary (FT-P-11) PASS;
  async-video QUARANTINE (FT-P-12) + X-Refresh-Token drift
  (FT-P-13) recorded as it.fails() with controls.
- AZ-464 bulk-validate URL + UI sync (≤2 s) PASS;
  body shape drift {annotationIds,status} vs contract
  {ids,targetStatus:30} captured as it.fails().
- AZ-470 panel-width debounce + rehydration: entire task
  is Phase-B target (useResizablePanel has no PUT writer
  / no rehydration); 3 ACs as it.fails() with controls.
- AZ-472 DetectionClasses load + click + fallback PASS;
  hotkey arithmetic P=0 PASS, P=20/P=40 it.fails() for
  classes[idx+P]-against-dense-array drift.

Code review: PASS (0 findings). Fast: 18/18 files,
102 passed / 13 skipped. Static: 21/21 PASS.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-11 04:38:22 +03:00
parent 1dd25edee3
commit 6d03643c2c
15 changed files with 1644 additions and 4 deletions
@@ -0,0 +1,117 @@
# Batch Report
**Batch**: 05
**Tasks**: AZ-461 (Detection endpoints sync/async/long-video), AZ-464 (Bulk-validate URL/body/UI sync), AZ-470 (Panel-width debounced PUT + rehydration), AZ-472 (DetectionClasses load + hotkeys + click + fallback)
**Date**: 2026-05-11
**Cycle**: Phase A baseline, Step 6 — Implement Tests
**Total complexity**: 9 pts (2 + 2 + 2 + 3)
## Task Results
| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|------|--------|---------------|-------|-------------|--------|
| AZ-461_test_detection_endpoints | Done | 1 created (`tests/detection_endpoints.test.tsx`); 1 e2e created (`e2e/tests/detection_endpoints.e2e.ts`) | 4 fast (2 pass + 2 `it.fails()` per spec QUARANTINE / drift, 2 controls); 2 e2e (1 PASS + 1 `test.fail`) | 3 / 3 ACs covered | 2 documented drifts: production POSTs single-endpoint `/api/detect/<id>` regardless of mediaType (no async-video route — AC-25 lifts QUARANTINE); `api.post` sets only Authorization header (no `X-Refresh-Token` — Phase B wires it) |
| AZ-464_test_bulk_validate | Done | 1 created (`tests/bulk_validate.test.tsx`); 1 e2e created (`e2e/tests/bulk_validate.e2e.ts`) | 3 fast (2 pass + 1 `it.fails()` for body-shape drift + 1 control); 3 e2e (2 PASS + 1 `test.fail`) | 3 / 3 ACs covered | 1 documented drift: production sends `{annotationIds, status: AnnotationStatus.Validated (=2)}` instead of contract `{ids, targetStatus: 30}` (flips with AC-04 wire enum scheme) |
| AZ-470_test_panel_width_persistence | Done | 1 created (`tests/panel_width_persistence.test.tsx`); 1 e2e created (`e2e/tests/panel_width_persistence.e2e.ts`) | 5 fast (3 `it.fails()` + 2 controls — every AC is `it.fails()` per spec note); 1 e2e (`test.fail`) | 3 / 3 ACs covered | 1 systemic drift: `useResizablePanel` hook holds local state only — no PUT to `/api/annotations/settings/user` on resize-end, no rehydration of seeded `panelWidths` on reload (entire task is Phase-B-target) |
| AZ-472_test_detection_classes | Done | 1 created (`tests/detection_classes.test.tsx`); 1 e2e created (`e2e/tests/detection_classes.e2e.ts`) | 7 fast (5 pass + 2 `it.fails()` for hotkey drift); 1 e2e (PASS) | 4 / 4 ACs covered | 1 documented drift: production hotkey logic uses `classes[idx + photoMode]` against a dense array — yields wrong class for P=20 and out-of-range for P=40 (flips with filter-then-index OR sparse length-60 array). P=0 PASS (coincidentally) |
## AC Test Coverage: All covered (13 / 13 ACs across the four tasks)
### AZ-461 — Detection endpoints (3 ACs, 6 scenarios)
| Scenario | Where | Profile | Status |
|----------|-------|---------|--------|
| AC-1 / FT-P-11 (sync image detect URL) | `tests/detection_endpoints.test.tsx` + `e2e/tests/detection_endpoints.e2e.ts` | fast + e2e | PASS — production POSTs `/api/detect/<numeric-id>` matching the contract regex |
| AC-2 / FT-P-12 (async video detect endpoint + SSE — QUARANTINE) | `tests/detection_endpoints.test.tsx` | fast | `it.fails()` — runs end-to-end, emits "FT-P-12 awaits AC-25 / async video detect impl" log per spec |
| AC-2 / control: production POSTs `/api/detect/<id>` regardless of mediaType (drift pin) | same | fast | PASS — pins single-endpoint drift |
| AC-3 / FT-P-13 (long-video detect carries `X-Refresh-Token`) | `tests/detection_endpoints.test.tsx` + `e2e/tests/detection_endpoints.e2e.ts` | fast + e2e | `it.fails()` (fast) + `test.fail` (e2e) — production sets only Authorization |
| AC-3 / control: production sets only `Authorization` on detect (current behavior) | `tests/detection_endpoints.test.tsx` | fast | PASS — proves spy machinery + Authorization presence |
**AC summary**:
- AC-1 sync URL canary → PASS today (numeric media id satisfies `^/api/detect/[0-9]+$`).
- AC-2 async video / SSE → `it.fails()` + control + log per QUARANTINE rule.
- AC-3 X-Refresh-Token header → `it.fails()` + control pinning Authorization-only drift.
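
The AC-1 and AC-3 checks reduce to two predicates over a captured request. The sketch below is illustrative only: `CapturedRequest`, `matchesDetectContract`, and `hasHeader` are hypothetical names, not the suite's actual spy helpers. Only the regex and the header names are taken from this report.

```typescript
// Hypothetical shape of a request captured by a test spy (assumption for illustration).
interface CapturedRequest {
  method: string;
  path: string;
  headers: Record<string, string>;
}

// Contract regex quoted in the AC-1 summary above.
const DETECT_URL = /^\/api\/detect\/[0-9]+$/;

// AC-1: sync detect must POST to /api/detect/<numeric-id>.
function matchesDetectContract(req: CapturedRequest): boolean {
  return req.method === "POST" && DETECT_URL.test(req.path);
}

// AC-3: HTTP header names are case-insensitive, so compare in lower case.
function hasHeader(req: CapturedRequest, name: string): boolean {
  const wanted = name.toLowerCase();
  return Object.keys(req.headers).some((k) => k.toLowerCase() === wanted);
}

// Example request mirroring the documented drift: Authorization only.
const today: CapturedRequest = {
  method: "POST",
  path: "/api/detect/42",
  headers: { Authorization: "Bearer test-token" },
};

console.log(matchesDetectContract(today)); // true
console.log(hasHeader(today, "Authorization")); // true
console.log(hasHeader(today, "X-Refresh-Token")); // false
```

Under these assumptions the first two predicates correspond to the PASS rows and the third to the `it.fails()` row for FT-P-13.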
### AZ-464 — Bulk-validate (3 ACs, 4 scenarios)
| Scenario | Where | Profile | Status |
|----------|-------|---------|--------|
| AC-1 / FT-P-20 URL canary | `tests/bulk_validate.test.tsx` + `e2e/tests/bulk_validate.e2e.ts` | fast + e2e | PASS — production POSTs `/api/annotations/dataset/bulk-status` |
| AC-2 / FT-P-20 body shape `{ids, targetStatus: 30}` | same | fast + e2e | `it.fails()` (fast) + `test.fail` (e2e) |
| AC-2 / control: body is `{annotationIds, status: AnnotationStatus.Validated}` (current shape) | `tests/bulk_validate.test.tsx` | fast | PASS — pins field-name + status-value drift |
| AC-3 / FT-P-21 + NFT-PERF-07 (UI sync ≤ 2 000 ms) | `tests/bulk_validate.test.tsx` + `e2e/tests/bulk_validate.e2e.ts` | fast + e2e | PASS — wall-clock from click to all rows showing Validated badge ≤ 2 s |
**AC summary**:
- AC-1 URL canary → PASS.
- AC-2 body shape → `it.fails()` + control proving production's drift shape (both field names AND status value differ from contract).
- AC-3 UI sync → PASS within 2 s (production calls `fetchItems()` after the 200 returns).
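
The AC-2 body-shape drift can be expressed as a type guard plus a small diff helper. This is a sketch under stated assumptions: `isContractBody` and `describeDrift` are hypothetical names, the element type of `ids` is not given in this report, and the sample id values are invented for illustration.

```typescript
// Contract body named in the report: {ids, targetStatus: 30}.
// The element type of `ids` is not stated, so it is left as unknown[] (assumption).
interface ContractBody {
  ids: unknown[];
  targetStatus: 30;
}

const CONTRACT_VALIDATED = 30; // contract wire value quoted in the report

// True only when the body uses the contract field names and status value.
function isContractBody(body: unknown): body is ContractBody {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return Array.isArray(b.ids) && b.targetStatus === CONTRACT_VALIDATED;
}

// Lists each way a body departs from the contract.
function describeDrift(body: Record<string, unknown>): string[] {
  const drifts: string[] = [];
  if (!Array.isArray(body.ids)) drifts.push("ids missing or not an array");
  if (body.targetStatus !== CONTRACT_VALIDATED) drifts.push("targetStatus is not 30");
  return drifts;
}

// Current production shape per the report (status 2 = AnnotationStatus.Validated).
const currentBody = { annotationIds: [1, 2, 3], status: 2 };
// Contract shape the it.fails() scenario asserts.
const contractBody = { ids: [1, 2, 3], targetStatus: 30 };

console.log(isContractBody(currentBody)); // false
console.log(describeDrift(currentBody).length); // 2
console.log(isContractBody(contractBody)); // true
```

Both drifts (field names and status value) surface independently, which matches the two separate rows in the Documented Drifts table.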
### AZ-470 — Panel-width debounced PUT + rehydration (3 ACs, 5 scenarios)
| Scenario | Where | Profile | Status |
|----------|-------|---------|--------|
| AC-1 / FT-P-37 + NFT-PERF-08 (debounce window) | `tests/panel_width_persistence.test.tsx` | fast | `it.fails()` — production never PUTs |
| AC-1 / control: production emits ZERO PUTs during a resize today | same | fast | PASS — pins no-writer drift |
| AC-2 / FT-P-37 (PUT body carries `panelWidths`) | same | fast | `it.fails()` — depends on AC-1 writer landing |
| AC-3 / FT-P-38 (rehydration on reload) | same + `e2e/tests/panel_width_persistence.e2e.ts` | fast + e2e | `it.fails()` (fast) + `test.fail` (e2e) — no rehydration effect |
| AC-3 / control: production renders panels at constructor defaults (250 / 200) ignoring seeded settings | `tests/panel_width_persistence.test.tsx` | fast | PASS — pins drift |
**AC summary**:
- Entire AZ-470 is a Phase-B-target group per task spec (`useResizablePanel` has no settings writer / reader today).
- Every AC is `it.fails()`; controls pin the current no-writer + constructor-default behavior.
- Tests flip green automatically once `useResizablePanel` is wired to `<UserSettings>` save/load.
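
For orientation, a debounced settings writer of the kind AC-1 and AC-2 expect might look like the sketch below. Everything here is an assumption: `createDebouncedWriter`, `Scheduler`, `ManualScheduler`, the panel keys `left` / `right`, and the 300 ms window are hypothetical and do not describe the Phase B implementation. The manual scheduler stands in for fake timers so the behaviour can be followed step by step.

```typescript
type PanelWidths = Record<string, number>;

// Minimal timer abstraction so the sketch runs without real time (hypothetical).
interface Scheduler {
  set(fn: () => void, ms: number): number;
  clear(id: number): void;
}

// Returns a save function that forwards only the last value once input goes quiet.
function createDebouncedWriter(
  put: (widths: PanelWidths) => void,
  delayMs: number,
  scheduler: Scheduler,
): (widths: PanelWidths) => void {
  let pending: number | undefined;
  return (widths: PanelWidths): void => {
    if (pending !== undefined) scheduler.clear(pending); // restart the window
    pending = scheduler.set(() => {
      pending = undefined;
      put(widths);
    }, delayMs);
  };
}

// Hand-driven clock used in place of fake timers.
class ManualScheduler implements Scheduler {
  private now = 0;
  private nextId = 1;
  private tasks = new Map<number, { fn: () => void; due: number }>();

  set(fn: () => void, ms: number): number {
    const id = this.nextId++;
    this.tasks.set(id, { fn, due: this.now + ms });
    return id;
  }

  clear(id: number): void {
    this.tasks.delete(id);
  }

  // Moves time forward and runs every task that has come due.
  advance(ms: number): void {
    this.now += ms;
    Array.from(this.tasks.entries()).forEach(([id, task]) => {
      if (task.due <= this.now) {
        this.tasks.delete(id);
        task.fn();
      }
    });
  }
}

const puts: PanelWidths[] = []; // stands in for PUT /api/annotations/settings/user
const clock = new ManualScheduler();
const save = createDebouncedWriter((w) => { puts.push(w); }, 300, clock);

save({ left: 260, right: 200 }); // first drag event at t=0
clock.advance(100);
save({ left: 300, right: 200 }); // second drag event at t=100 restarts the window
clock.advance(299);              // t=399: still inside the window, no PUT yet
clock.advance(1);                // t=400: window elapsed, exactly one PUT
console.log(puts.length);        // 1
console.log(puts[0].left);       // 300
```

The control test in this batch pins the opposite of the final two lines: zero PUTs during a resize today.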
### AZ-472 — DetectionClasses (4 ACs, 8 scenarios)
| Scenario | Where | Profile | Status |
|----------|-------|---------|--------|
| AC-1 / FT-P-44 (load contract) | `tests/detection_classes.test.tsx` + `e2e/tests/detection_classes.e2e.ts` | fast + e2e | PASS — GET `/api/annotations/classes` observed at mount; 9 entries rendered for P=0 |
| AC-2 / FT-P-45 P=0 (keys 1..9 → ids 0..8) | `tests/detection_classes.test.tsx` | fast | PASS — coincidentally aligns since offset is 0 |
| AC-2 / FT-P-45 P=20 (keys 1..9 → ids 20..28) | same | fast | `it.fails()` — production's `classes[idx + 20]` lands in the 40s window against the dense length-27 array |
| AC-2 / FT-P-45 P=40 (keys 1..9 → ids 40..48) | same | fast | `it.fails()`: `classes[idx + 40]` exceeds array length; `cls` is undefined |
| AC-3 / FT-P-46 (click path) | same | fast | PASS — `userEvent.click` fires `onSelect(c.id)` |
| AC-4 / FT-P-47 fallback on `[]` | same | fast | PASS — `FALLBACK_CLASS_NAMES` rendered when API returns empty |
| AC-4 / FT-P-47 fallback on 500 | same | fast | PASS — `FALLBACK_CLASS_NAMES` rendered on server error |
| AC-4 / fallback id set equals `[0..N-1, 20..20+N-1, 40..40+N-1]` | same | fast | PASS — pins fallback contract for downstream AZ-473 dependants |
**AC summary**:
- AC-1 load → PASS at mount.
- AC-2 hotkey arithmetic → P=0 PASS, P=20 + P=40 `it.fails()` for documented production drift.
- AC-3 click → PASS.
- AC-4 fallback → 3 scenarios PASS (empty, 500, id-set).
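
The hotkey drift is easiest to see with the arithmetic written out. The sketch below models the dense length-27 array described above (ids 0..8, 20..28, 40..48 stored back to back) and compares the reported `classes[idx + P]` lookup with one possible filter-then-index variant. `buildDenseClasses`, `pickByOffset`, and `pickByFilter` are hypothetical names, and the 20-wide id window used by the filter is an assumption about how that fix would be shaped.

```typescript
interface DetectionClass {
  id: number;
  name: string;
}

// Dense array: three groups of `perMode` classes with id bases 0, 20 and 40.
function buildDenseClasses(perMode = 9): DetectionClass[] {
  const out: DetectionClass[] = [];
  for (const base of [0, 20, 40]) {
    for (let i = 0; i < perMode; i++) {
      out.push({ id: base + i, name: `class-${base + i}` });
    }
  }
  return out;
}

// Lookup as described in the report: array index = key index + photoMode.
function pickByOffset(
  classes: DetectionClass[],
  keyIdx: number,
  photoMode: number,
): DetectionClass | undefined {
  return classes[keyIdx + photoMode];
}

// Assumed filter-then-index variant: narrow to the mode's id window, then index.
function pickByFilter(
  classes: DetectionClass[],
  keyIdx: number,
  photoMode: number,
): DetectionClass | undefined {
  return classes.filter((c) => c.id >= photoMode && c.id < photoMode + 20)[keyIdx];
}

const dense = buildDenseClasses();

console.log(pickByOffset(dense, 0, 0)?.id);  // 0  (P=0 aligns because the offset is 0)
console.log(pickByOffset(dense, 0, 20)?.id); // 42 (index 20 falls in the third group)
console.log(pickByOffset(dense, 0, 40));     // undefined (index 40 is past length 27)
console.log(pickByFilter(dense, 0, 20)?.id); // 20
console.log(pickByFilter(dense, 8, 40)?.id); // 48
```

With `perMode = N`, `buildDenseClasses(N).map((c) => c.id)` also reproduces the `[0..N-1, 20..20+N-1, 40..40+N-1]` id set pinned by the AC-4 fallback scenario.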
## Code Review Verdict: PASS
See `_docs/03_implementation/reviews/batch_05_review.md` for the full 7-phase walkthrough.
- 0 Critical, 0 High, 0 Medium, 0 Low findings.
- All `it.fails()` placements anchored to either explicit task-spec QUARANTINE direction (AZ-461 AC-2) or documented production drift with control test pinning the current shape.
- Architecture compliance (Phase 7): no layer-direction violations; tests are leaves of the import graph; no new cyclic dependencies; static profile (STC-S6, STC-S13) re-confirms.
## Auto-Fix Attempts: 0
PASS verdict — no auto-fix loop entered.
## Stuck Agents: None
Each task implemented in a single sequential pass. No file rewritten 3+ times; no approach pivots.
## Test Run Summary
- `bun run test:fast` — 18 files / 102 passed / 13 skipped / 7.31 s.
- `./scripts/run-tests.sh --static-only` — all 21 static checks PASS / 17.95 s.
- `ReadLints` — clean on all 8 changed files.
## Documented Drifts (cumulative across batch)
| Drift | Where | Spec/AC affected | Resolves when |
|-------|-------|------------------|---------------|
| Single-endpoint detect (no `/api/detect/video/...`) | `src/features/annotations/AnnotationsSidebar.tsx` (Detect button handler) | AZ-461 AC-2 | AC-25 (Phase B async-video path) |
| `X-Refresh-Token` header absent on detect | `src/api/client.ts` request fn | AZ-461 AC-3 | Phase B (header wiring per Step 4 / F7) |
| Bulk-validate body shape `{annotationIds, status}` vs contract `{ids, targetStatus}` | `src/features/dataset/DatasetPage.tsx` | AZ-464 AC-2 | AC-04 wire enum scheme |
| Status value `AnnotationStatus.Validated` (=2) vs contract 30 | same | AZ-464 AC-2 | AC-04 wire enum scheme |
| `useResizablePanel` has no PUT writer | `src/hooks/useResizablePanel.ts` | AZ-470 AC-1 + AC-2 | Phase B (debounced settings writer) |
| `useResizablePanel` has no rehydration reader | same | AZ-470 AC-3 | Phase B (reads `panelWidths` from settings on mount) |
| Hotkey index formula `classes[idx + P]` against dense array | `src/components/DetectionClasses.tsx` (keydown handler) | AZ-472 AC-2 (P=20, P=40) | Either filter-then-index switch OR sparse length-60 fixture |
## Next Batch: AZ-454, AZ-456 epics likely complete after this batch; 14 → 10 tasks remaining in `todo/`. Cumulative review (batches 04-06) triggers after the next batch per Step 14.5 (K=3 cadence).
@@ -0,0 +1,135 @@
# Code Review Report
**Batch**: 5 — AZ-461, AZ-464, AZ-470, AZ-472
**Date**: 2026-05-11
**Verdict**: PASS
**Mode**: Full (per-batch invocation by `/implement`)
## Inputs
- Task specs:
  - `_docs/02_tasks/todo/AZ-461_test_detection_endpoints.md` (3 ACs, 2 pts)
  - `_docs/02_tasks/todo/AZ-464_test_bulk_validate.md` (3 ACs, 2 pts)
  - `_docs/02_tasks/todo/AZ-470_test_panel_width_persistence.md` (3 ACs, 2 pts)
  - `_docs/02_tasks/todo/AZ-472_test_detection_classes.md` (4 ACs, 3 pts)
- Changed files (8 total, all under Blackbox Tests OWNED scope):
  - `tests/detection_endpoints.test.tsx`
  - `tests/bulk_validate.test.tsx`
  - `tests/panel_width_persistence.test.tsx`
  - `tests/detection_classes.test.tsx`
  - `e2e/tests/detection_endpoints.e2e.ts`
  - `e2e/tests/bulk_validate.e2e.ts`
  - `e2e/tests/panel_width_persistence.e2e.ts`
  - `e2e/tests/detection_classes.e2e.ts`
## Findings
| # | Severity | Category | File:Line | Title |
|---|----------|----------|-----------|-------|
| — | — | — | — | None |
No Critical, High, Medium, or Low findings.
## Phase Walkthrough
### Phase 1 — Context Loading
All 4 task specs read; ACs catalogued; module-layout.md consulted for OWNED/READ-ONLY/FORBIDDEN envelope. Every changed file falls under `tests/**` or `e2e/**`, both `Owns` globs of the `Blackbox Tests` cross-cutting component (epic AZ-455). No file outside the envelope was modified.
### Phase 2 — Spec Compliance
| Task | AC | Test | Today | Drift documented |
|------|----|------|-------|------------------|
| AZ-461 | AC-1 (FT-P-11 sync image URL) | `tests/detection_endpoints.test.tsx` | PASS | — |
| AZ-461 | AC-2 (FT-P-12 async video, QUARANTINE) | `it.fails()` + control | runs + emits "FT-P-12 awaits AC-25" log | spec mandates QUARANTINE marker |
| AZ-461 | AC-3 (FT-P-13 X-Refresh-Token header) | `it.fails()` + control | drift — production sets only Authorization | header wired in Phase B |
| AZ-464 | AC-1 (FT-P-20 URL) | `tests/bulk_validate.test.tsx` | PASS | — |
| AZ-464 | AC-2 (FT-P-20 body shape) | `it.fails()` + control | drift — `{annotationIds, status:2}` vs contract `{ids, targetStatus:30}` | flips with AC-04 wire enum |
| AZ-464 | AC-3 (FT-P-21 + NFT-PERF-07 ≤ 2 s) | wall-clock perf assertion | PASS | — |
| AZ-470 | AC-1 (FT-P-37 + NFT-PERF-08 debounce) | `it.fails()` + control | drift — `useResizablePanel` has no PUT writer | flips when PUT writer wired |
| AZ-470 | AC-2 (FT-P-37 body) | `it.fails()` | drift — depends on AC-1 writer | flips when writer wired |
| AZ-470 | AC-3 (FT-P-38 rehydration) | `it.fails()` + control | drift — no read of `panelWidths` from settings | flips with rehydration effect |
| AZ-472 | AC-1 (FT-P-44 load) | `tests/detection_classes.test.tsx` | PASS | — |
| AZ-472 | AC-2 P=0 (FT-P-45 hotkey) | direct assertion | PASS | — |
| AZ-472 | AC-2 P=20 (FT-P-45 hotkey) | `it.fails()` | drift — `classes[idx+P]` against dense array | flips with filter-then-index OR sparse array |
| AZ-472 | AC-2 P=40 (FT-P-45 hotkey) | `it.fails()` | drift — `classes[idx+40]` exceeds length | same as P=20 |
| AZ-472 | AC-3 (FT-P-46 click) | userEvent.click | PASS | — |
| AZ-472 | AC-4 (FT-P-47 fallback) | empty + 500 + id-set test | PASS | — |
Every AC has at least one test (running or `it.fails()` per spec direction). AC-2 and AC-3 of AZ-461 explicitly require running tests with documented drift markers — both satisfied. All `it.fails()` markers have inline justification anchored to a documented production behavior, with control tests pinning the current shape so a regression does not slip through silently.
No `Spec-Gap` findings.
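
The pairing of an `it.fails()` scenario with a control test relies on inverted pass/fail semantics: an expected-failure test passes while its assertion throws and fails once the assertion starts to hold. The sketch below is a plain TypeScript model of that logic, not Vitest itself; `runNormal`, `runExpectedFailure`, and `assertEqual` are hypothetical helpers, and the status values 2 and 30 are the AZ-464 figures from this report.

```typescript
type Outcome = "pass" | "fail";

// A normal test passes when its body does not throw.
function runNormal(body: () => void): Outcome {
  try {
    body();
    return "pass";
  } catch {
    return "fail";
  }
}

// An expected-failure test inverts that: it passes only while the body throws.
function runExpectedFailure(body: () => void): Outcome {
  return runNormal(body) === "fail" ? "pass" : "fail";
}

function assertEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${String(expected)}, got ${String(actual)}`);
  }
}

// Today: production sends status 2, the contract wants 30.
const sentToday: number = 2;
console.log(runExpectedFailure(() => assertEqual(sentToday, 30))); // "pass"
console.log(runNormal(() => assertEqual(sentToday, 2)));           // "pass" (control)

// After a hypothetical fix: production sends 30.
const sentAfterFix: number = 30;
console.log(runExpectedFailure(() => assertEqual(sentAfterFix, 30))); // "fail"
console.log(runNormal(() => assertEqual(sentAfterFix, 2)));           // "fail" (control)
```

In this model both tests change outcome at the moment the drift is resolved, which is why a control alongside each marker keeps a change in production behaviour from going unnoticed.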
### Phase 3 — Code Quality
- AAA pattern (`// Arrange / // Act / // Assert`) applied throughout, with sections elided where empty per `coderule.mdc` test convention.
- No bare catch / no error suppression. Every test uses MSW handlers + `seedBearer/clearBearer` deterministically.
- Helper functions (`captureDetectAndBootstrap`, `rigDatasetAndBulk`, `rigPanelEnv`, `captureClassesGets`) under 50 lines each; named for caller intent.
- No DRY violations across the batch — each task isolated; the only shared helper is `tests/helpers/auth` which already existed.
- `it.fails()` placements match documented drift. Comments explain *why* and *when each test flips green*, never narrating *what the code does*.
No findings.
### Phase 4 — Security Quick-Scan
- No SQL, no shell exec, no eval/new Function in any test.
- `seedBearer()` uses test-fixture token; no hardcoded production secrets.
- No sensitive data in logs (`console.log` exists in only one place — the AZ-461 AC-2 quarantine marker, mandated by spec).
No findings.
### Phase 5 — Performance Scan
- `waitFor` timeouts bounded (1000–3000 ms); no infinite waits.
- No N+1 patterns. `selectItemsWithCtrlClick` iterates the bounded `seedItems` (3 rows).
- Fake-timer use in `tests/panel_width_persistence.test.tsx` is correct (`shouldAdvanceTime: true`) and reset in `afterEach`.
- Wall-clock perf assertion (`elapsed ≤ 2000 ms`) for AC-3 of AZ-464 / NFT-PERF-07 measured from click time, not request-receipt time — slightly stricter than spec, which is fine.
No findings.
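
The note on NFT-PERF-07 can be illustrated with a small timeline model. Measuring from the click always yields an elapsed time at least as long as measuring from request receipt, so a run can pass the request-based budget and still fail the click-based one. The `Timeline` shape, the function names, and the sample timestamps are hypothetical; only the 2000 ms budget comes from this report.

```typescript
// Timestamps in milliseconds on a shared clock (hypothetical shape).
interface Timeline {
  clickAt: number;
  requestReceivedAt: number;
  uiSyncedAt: number;
}

const BUDGET_MS = 2000; // NFT-PERF-07 budget quoted in this report

// Measurement used by the tests in this batch: click to UI sync.
function elapsedFromClick(t: Timeline): number {
  return t.uiSyncedAt - t.clickAt;
}

// Looser alternative: request receipt to UI sync.
function elapsedFromRequest(t: Timeline): number {
  return t.uiSyncedAt - t.requestReceivedAt;
}

function withinBudget(elapsedMs: number): boolean {
  return elapsedMs <= BUDGET_MS;
}

// Invented sample: the request reaches the server 300 ms after the click.
const run: Timeline = { clickAt: 0, requestReceivedAt: 300, uiSyncedAt: 2100 };

console.log(withinBudget(elapsedFromRequest(run))); // true  (1800 ms)
console.log(withinBudget(elapsedFromClick(run)));   // false (2100 ms)
```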
### Phase 6 — Cross-Task Consistency
- All 4 fast test files share the same scaffolding shape: `server.use(...)`, `seedBearer()`, `renderWithProviders`, AAA structure, `clearBearer()`.
- No conflicting MSW patterns; each task's handler block is self-contained and uses the same `paginate` / `jsonResponse` / `errorResponse` helpers from `tests/msw/helpers`.
- All 4 tasks declare `Dependencies: AZ-456_test_infrastructure`, which is satisfied (test infra was completed in earlier batches).
- E2E companions follow the established Playwright pattern (`page.route` interception + `test.fail()` for known drifts + `test.skip(...)` for seed gaps).
No findings.
### Phase 7 — Architecture Compliance
- Layer direction: every import in the batch flows leaf-ward (test → production); no upstream production code added or modified.
- Public API respect: the tests import from `src/types`, `src/components/FlightContext`, `src/components/DetectionClasses`, `src/features/annotations/AnnotationsPage`, `src/features/annotations/classColors`, `src/features/dataset/DatasetPage`. Per `module-layout.md` Public API tables, all six are de-facto Public API entries of their owning components. Static profile (STC-S6, STC-S13) passes against the same rule set.
- No new cyclic dependencies — tests are leaves of the import graph.
- No duplicate symbols across components — each task's test helpers are file-private.
- No cross-cutting concerns re-implemented locally — all logging goes through `console.log` only at the spec-mandated AZ-461 AC-2 quarantine marker.
No findings.
## Baseline Delta
`_docs/02_document/architecture_compliance_baseline.md` does not exist for this workspace — no baseline delta to compute.
## Verdict Logic
- 0 Critical findings
- 0 High findings
- 0 Medium findings
- 0 Low findings
**PASS**
## Notes
- The batch is test-only. No production source was modified. Every `it.fails()` is paired with documented drift evidence in the task spec or in the test file's header comment.
- `bun run test:fast` — 18 files / 102 passed / 13 skipped (pre-existing skip count unchanged).
- `./scripts/run-tests.sh --static-only` — all checks PASS.
- No new lint errors introduced (ReadLints clean on all 8 changed files).
## Outputs (for /implement)
- `verdict`: PASS
- `findings`: []
- `critical_count`: 0
- `high_count`: 0
- `report_path`: `_docs/03_implementation/reviews/batch_05_review.md`