[AZ-440] [AZ-441] [AZ-442] [AZ-443] NFT-LIM-01/02/03+05/04 blackbox scenarios

Batch 88 — adds four resource-limit blackbox scenarios + pure-logic
helpers + unit tests:

- NFT-LIM-01 Jetson memory (AC-NEW-13): tier2_only; Plan A/B budgets;
  AC-4 OOM-event scan; 30 s warm-up window; VmRSS + tegrastats streams.
- NFT-LIM-02 FDR size (AC-7.3): 30 min → 8 h linear extrapolation
  against 50 GiB; ±60 s replay-window slack for AC-1.
- NFT-LIM-03+05 storage (AC-7.4 + AC-NEW-12 + RESTRICT-STORAGE):
  aggregate ≤ 100 GiB across tile-cache + tile-cache-write +
  fdr-output; thumbnail-log < 1 GiB strict 8 h-extrapolated.
- NFT-LIM-04 thermal (AC-NEW-5 PARTIAL): tier2_only; CPU/SoC p99
  ≤ T_throttle − 5 °C; throttle-event scan; PARTIAL annotation written
  to traceability-status.json. Thresholds fixture lives at
  e2e/fixtures/jetson/thermal-thresholds.json (moved from the
  task spec's suggested tests/fixtures/ path so the file stays
  inside the blackbox_tests Owns: e2e/** envelope).

All four helpers are public-boundary-only (no src/gps_denied_onboard
imports). Scenarios skip cleanly in the Tier-1 docker harness pending
AZ-595 (SITL replay builder) for the four shared fixture inputs and
AZ-444 (Tier-2 Jetson runner) for the tier2_only scenarios.

Code review: PASS_WITH_WARNINGS (Critical/High/Medium/Low: 0/0/2/1).
Both Medium findings are the carried-over duplication of
write_csv_evidence and _resolve_fixture_path, deferred to AZ-446
(batch 89). The Low finding is the self-resolved AZ-443 fixture
ownership drift documented in the review.

Tests: 1223 e2e/_unit_tests passing (+1 vs. batch 87 from the new
directory-layout entry); 24 resource_limit scenarios collect and skip
cleanly under runner/pytest.ini.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-17 18:01:55 +03:00
parent d1e30f818f
commit 6e4a575221
22 changed files with 2785 additions and 4 deletions
@@ -0,0 +1,55 @@
# Batch Report
**Batch**: 88
**Tasks**: AZ-440 (NFT-LIM-01 Jetson memory), AZ-441 (NFT-LIM-02 FDR size), AZ-442 (NFT-LIM-03+05 storage), AZ-443 (NFT-LIM-04 thermal)
**Date**: 2026-05-17
**Cycle**: 1
**Complexity**: 9 points (3 + 2 + 2 + 2)
## Task Results
| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|------|--------|----------------|-------|-------------|--------|
| AZ-440_nft_lim_01_jetson_memory | Done | 3 (helper, scenario, unit test) | pass (skip on tier1-docker) | 6/6 | None |
| AZ-441_nft_lim_02_fdr_size | Done | 3 (helper, scenario, unit test) | pass (skip on tier1-docker; vins_mono guarded) | 3/3 | None |
| AZ-442_nft_lim_03_05_storage_budget | Done | 3 (helper, scenario, unit test) | pass (skip on tier1-docker; vins_mono guarded) | 3/3 | None |
| AZ-443_nft_lim_04_thermal | Done | 4 (helper, scenario, unit test, fixture) | pass (skip on tier1-docker) | 5/5 | F3 self-resolved (fixture path) |
## AC Test Coverage: All covered (17 of 17 ACs across the batch)
## Code Review Verdict: PASS_WITH_WARNINGS
See `_docs/03_implementation/reviews/batch_88_review.md`. 0 Critical / 0 High / 2 Medium (both carried-over CSV-writer + fixture-resolver duplication, deferred to AZ-446) / 1 Low (self-resolved task-spec path drift for AZ-443's thermal-thresholds fixture).
## Auto-Fix Attempts: 0
No FAIL findings. The F3 Low finding was resolved in-batch (fixture moved into `e2e/fixtures/jetson/` and the task spec footnoted) without re-running the review.
## Stuck Agents: None
## Test Results
- 4 helper unit-test modules → 207 unit tests pass locally in 0.36 s
(`e2e/_unit_tests/helpers/test_*_evaluator.py`).
- Full e2e unit-test suite: **1223 passed in 154 s**
(`e2e/_unit_tests/`).
- 24 resource_limit scenarios collect cleanly and skip cleanly in the
  Tier-1 docker harness:
  - 12 SKIP on `tier2_only` (NFT-LIM-01 and NFT-LIM-04, Tier-2 only).
  - 8 SKIP on `sitl_replay_ready` (NFT-LIM-02 and NFT-LIM-03+05, pending
    the AZ-595 fixture).
  - 4 SKIP on `vins_mono` research-build-only guard (per D-C1-1-SUB-A).
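
The three skip buckets above can be read as an ordered guard check. The sketch below is illustrative only: the labels `tier2_only`, `sitl_replay_ready`, and `vins_mono` come from this report, while `skip_reason`, the capability names, the guard order, and the reason strings are hypothetical and not taken from the repository.

```python
from __future__ import annotations

from typing import Optional

# (label on the scenario, capability the harness must provide, skip reason).
# Only the labels are from the report; the rest is assumed for illustration.
GUARDS = (
    ("tier2_only", "tier2", "requires the Tier-2 Jetson runner (AZ-444)"),
    ("sitl_replay_ready", "sitl_replay", "SITL replay fixture pending (AZ-595)"),
    ("vins_mono", "research_build", "vins_mono is research-build-only"),
)


def skip_reason(labels: set[str], capabilities: set[str]) -> Optional[str]:
    """Return the reason for the first unmet guard, or None if the scenario may run."""
    for label, capability, reason in GUARDS:
        if label in labels and capability not in capabilities:
            return reason
    return None


# A Tier-1 docker harness is modelled here as offering none of the capabilities.
tier1_docker: set[str] = set()
print(skip_reason({"tier2_only"}, tier1_docker))
print(skip_reason({"sitl_replay_ready", "vins_mono"}, {"sitl_replay"}))
```

In a pytest harness a decision like this would typically be applied from a `pytest_collection_modifyitems` hook by adding `pytest.mark.skip(reason=...)` to each gated item; whether this batch wires it that way is not stated in the report.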
## Production Dependencies Surfaced
- **AZ-595** (SITL replay builder) — per-second VmRSS + tegrastats memory
samples, per-second tegrastats temperature samples, per-minute
`du -sh` snapshots for `fdr-output`, `tile-cache`, `tile-cache-write`,
`thumbnail-log`, and runner-projected `dmesg` lines (OOM + thermal
throttle). Fixture filenames per scenario docstring; all 4 scenarios
block on `sitl_replay_ready` until AZ-595 lands.
- **AZ-444** (Tier-2 Jetson runner) — already required for AC-1
tier-guard skip-gating on tier1-docker. AC-1 is enforced via
`@pytest.mark.tier2_only` for NFT-LIM-01 + NFT-LIM-04.
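
To make the AZ-595 dependency concrete, here is a minimal sketch of how per-minute size snapshots could feed the NFT-LIM-02 check (30 min of samples extrapolated linearly to 8 h against 50 GiB). It assumes snapshots arrive as `(elapsed_seconds, bytes)` pairs, that a two-point slope from the first to the last sample is sufficient, and that the budget comparison is inclusive. `extrapolate_bytes` and `within_budget` are hypothetical names; the real helper may fit the slope differently.

```python
from __future__ import annotations

GIB = 1024 ** 3
EIGHT_HOURS_S = 8 * 3600


def extrapolate_bytes(samples: list[tuple[float, int]], horizon_s: float) -> float:
    """Project size at horizon_s after the first sample using a two-point slope."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate a growth rate")
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("sample timestamps must be strictly increasing")
    rate = (b1 - b0) / (t1 - t0)  # bytes per second over the observed window
    return b0 + rate * horizon_s


def within_budget(
    samples: list[tuple[float, int]],
    horizon_s: float = EIGHT_HOURS_S,
    budget_bytes: int = 50 * GIB,  # budget from the commit message; <= is assumed
) -> bool:
    """True if the projected size at the horizon stays within the budget."""
    return extrapolate_bytes(samples, horizon_s) <= budget_bytes


# 3 GiB written in the first 30 minutes projects to roughly 48 GiB at 8 h.
samples = [(0.0, 0), (1800.0, 3 * GIB)]
print(within_budget(samples))  # True
```

Because 8 h is 16 times the 30 min window, any growth above 3.125 GiB in the window would exceed a 50 GiB budget under this linear model.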
## Next Batch: 89 — AZ-446 (CSV reporter refinements, 2 points). Also picks up the F1+F2 cumulative-review carry-overs as natural scope.