[AZ-440] [AZ-441] [AZ-442] [AZ-443] NFT-LIM-01/02/03+05/04 blackbox scenarios

Batch 88: adds four resource-limit blackbox scenarios + pure-logic helpers + unit tests:

- NFT-LIM-01 Jetson memory (AC-NEW-13): tier2_only; Plan A/B budgets; AC-4 OOM-event scan; 30 s warm-up window; VmRSS + tegrastats streams.
- NFT-LIM-02 FDR size (AC-7.3): 30 min → 8 h linear extrapolation against 50 GiB; ±60 s replay-window slack for AC-1.
- NFT-LIM-03+05 storage (AC-7.4 + AC-NEW-12 + RESTRICT-STORAGE): aggregate ≤ 100 GiB across tile-cache + tile-cache-write + fdr-output; thumbnail-log < 1 GiB strict 8 h-extrapolated.
- NFT-LIM-04 thermal (AC-NEW-5 PARTIAL): tier2_only; CPU/SoC p99 ≤ T_throttle − 5 °C; throttle-event scan; PARTIAL annotation written to traceability-status.json.

Thresholds fixture lives at e2e/fixtures/jetson/thermal-thresholds.json (moved from the task spec's suggested tests/fixtures/ path so the file stays inside the blackbox_tests Owns: e2e/** envelope).

All four helpers are public-boundary-only (no src/gps_denied_onboard imports).

Scenarios skip cleanly in the Tier-1 docker harness pending AZ-595 (SITL replay builder) for the four shared fixture inputs and AZ-444 (Tier-2 Jetson runner) for the tier2_only scenarios.

Code review: PASS_WITH_WARNINGS (0/0/2/1). Both Mediums are carried-over write_csv_evidence + _resolve_fixture_path duplication, deferred to AZ-446 (batch 89). Low is the self-resolved AZ-443 fixture ownership drift documented in the review.

Tests: 1223 e2e/_unit_tests passing (+1 vs. batch 87 from the new directory-layout entry); 24 resource_limit scenarios collect and skip cleanly under runner/pytest.ini.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-22 16:11:13 +00:00 · 2026-05-17 18:01:55 +03:00
parent d1e30f818f
commit 6e4a575221
22 changed files with 2785 additions and 4 deletions
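
The two numeric checks named in the commit message (the NFT-LIM-02 linear extrapolation from a 30 min run to 8 h against 50 GiB, and the NFT-LIM-04 p99 headroom of 5 °C below T_throttle) can be illustrated with a minimal sketch. The function names here are hypothetical and are not the helpers' actual API. The `<=` comparison for the FDR budget and the nearest-rank percentile method are assumptions, since the commit does not state either.

```python
from typing import Sequence

GIB = 1024 ** 3  # bytes per GiB


def extrapolate_bytes(measured_bytes: int, measured_s: float,
                      target_s: float = 8 * 3600) -> float:
    """Scale an observed size linearly from the measured window to target_s."""
    if measured_s <= 0:
        raise ValueError("measured_s must be positive")
    return measured_bytes * (target_s / measured_s)


def passes_fdr_budget(measured_bytes: int, measured_s: float,
                      budget_bytes: int = 50 * GIB) -> bool:
    """Assumed rule: the 8 h projection must not exceed the budget."""
    return extrapolate_bytes(measured_bytes, measured_s) <= budget_bytes


def p99_nearest_rank(samples: Sequence[float]) -> float:
    """99th percentile by the nearest-rank method (an assumed choice)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = -(-99 * len(ordered) // 100)  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def passes_thermal_headroom(samples_c: Sequence[float], t_throttle_c: float,
                            margin_c: float = 5.0) -> bool:
    """p99 temperature must stay at or below T_throttle minus the margin."""
    return p99_nearest_rank(samples_c) <= t_throttle_c - margin_c
```

Under these assumptions a 30 min run that wrote 3 GiB projects to 48 GiB over 8 h and stays within budget, while 4 GiB projects to 64 GiB and does not. Integer arithmetic for the rank avoids floating-point rounding at the percentile boundary.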
@@ -58,7 +58,7 @@ Same as NFT-LIM-01.
 ## Constraints

 - Tier-2 only.
- T_throttle is read from a fixture file (`tests/fixtures/jetson-thermal-thresholds.json`) so future Jetson hardware updates require only a fixture bump.
+- T_throttle is read from a fixture file (`e2e/fixtures/jetson/thermal-thresholds.json`) so future Jetson hardware updates require only a fixture bump. (Implementation relocated from the task spec's original `tests/fixtures/` suggestion to `e2e/fixtures/` so the fixture lives inside the `blackbox_tests` `Owns: e2e/**` envelope per `_docs/02_document/module-layout.md`.)

 ## Document Dependencies

@@ -0,0 +1,55 @@
+# Batch Report
+
+**Batch**: 88
+**Tasks**: AZ-440 (NFT-LIM-01 Jetson memory), AZ-441 (NFT-LIM-02 FDR size), AZ-442 (NFT-LIM-03+05 storage), AZ-443 (NFT-LIM-04 thermal)
+**Date**: 2026-05-17
+**Cycle**: 1
+**Complexity**: 9 points (3 + 2 + 2 + 2)
+
+## Task Results
+
+| Task | Status | Files Modified | Tests | AC Coverage | Issues |
+|------|--------|----------------|-------|-------------|--------|
+| AZ-440_nft_lim_01_jetson_memory | Done | 3 (helper, scenario, unit test) | pass (skip on tier1-docker) | 6/6 | None |
+| AZ-441_nft_lim_02_fdr_size | Done | 3 (helper, scenario, unit test) | pass (skip on tier1-docker; vins_mono guarded) | 3/3 | None |
+| AZ-442_nft_lim_03_05_storage_budget | Done | 3 (helper, scenario, unit test) | pass (skip on tier1-docker; vins_mono guarded) | 3/3 | None |
+| AZ-443_nft_lim_04_thermal | Done | 4 (helper, scenario, unit test, fixture) | pass (skip on tier1-docker) | 5/5 | F3 self-resolved (fixture path) |
+
+## AC Test Coverage: All covered (17 of 17 ACs across the batch)
+
+## Code Review Verdict: PASS_WITH_WARNINGS
+
+See `_docs/03_implementation/reviews/batch_88_review.md`. 0 Critical / 0 High / 2 Medium (both carried-over CSV-writer + fixture-resolver duplication, deferred to AZ-446) / 1 Low (self-resolved task-spec path drift for AZ-443's thermal-thresholds fixture).
+
+## Auto-Fix Attempts: 0
+
+No FAIL findings. The F3 Low finding was resolved in-batch (fixture moved into `e2e/fixtures/jetson/` and the task spec footnoted) without re-running the review.
+
+## Stuck Agents: None
+
+## Test Results
+
+- 4 helper unit-test modules → 207 unit tests pass locally in 0.36 s
+  (`e2e/_unit_tests/helpers/test_*_evaluator.py`).
+- Full e2e unit-test suite: **1223 passed in 154 s**
+  (`e2e/_unit_tests/`).
+- 24 resource_limit scenarios collect cleanly and skip cleanly in the
+  Tier-1 docker harness:
+  - 12 SKIP on `tier2_only` (NFT-LIM-01, NFT-LIM-04 — Tier-2 only).
+  - 8 SKIP on `sitl_replay_ready` (NFT-LIM-02, NFT-LIM-03+05 — pending
+    AZ-595 fixture).
+  - 4 SKIP on `vins_mono` research-build-only guard (per D-C1-1-SUB-A).
+
+## Production Dependencies Surfaced
+
+- **AZ-595** (SITL replay builder) — per-second VmRSS + tegrastats memory
+  samples, per-second tegrastats temperature samples, per-minute
+  `du -sh` snapshots for `fdr-output`, `tile-cache`, `tile-cache-write`,
+  `thumbnail-log`, and runner-projected `dmesg` lines (OOM + thermal
+  throttle). Fixture filenames per scenario docstring; all 4 scenarios
+  block on `sitl_replay_ready` until AZ-595 lands.
+- **AZ-444** (Tier-2 Jetson runner) — already required for AC-1
+  tier-guard skip-gating on tier1-docker. AC-1 enforced via
+  `@pytest.mark.tier2_only` for NFT-LIM-01 + NFT-LIM-04.
+
+## Next Batch: 89 — AZ-446 (CSV reporter refinements, 2 points). Also picks up the F1+F2 cumulative-review carry-overs as natural scope.
@@ -0,0 +1,109 @@
+# Code Review Report
+
+**Batch**: 88 — AZ-440, AZ-441, AZ-442, AZ-443 (NFT-LIM-01/02/03+05/04)
+**Date**: 2026-05-17
+**Verdict**: PASS_WITH_WARNINGS
+
+## Scope
+
+Files added/modified:
+
+- `e2e/runner/helpers/memory_budget_evaluator.py` (new) — AZ-440 pure logic
+- `e2e/runner/helpers/fdr_size_evaluator.py` (new) — AZ-441 pure logic
+- `e2e/runner/helpers/storage_budget_evaluator.py` (new) — AZ-442 pure logic
+- `e2e/runner/helpers/thermal_envelope_evaluator.py` (new) — AZ-443 pure logic
+- `e2e/tests/resource_limit/test_nft_lim_01_jetson_memory.py` (new)
+- `e2e/tests/resource_limit/test_nft_lim_02_fdr_size.py` (new)
+- `e2e/tests/resource_limit/test_nft_lim_03_05_storage_budget.py` (new)
+- `e2e/tests/resource_limit/test_nft_lim_04_thermal.py` (new)
+- `e2e/_unit_tests/helpers/test_memory_budget_evaluator.py` (new)
+- `e2e/_unit_tests/helpers/test_fdr_size_evaluator.py` (new)
+- `e2e/_unit_tests/helpers/test_storage_budget_evaluator.py` (new)
+- `e2e/_unit_tests/helpers/test_thermal_envelope_evaluator.py` (new)
+- `e2e/fixtures/jetson/thermal-thresholds.json` (new, AZ-443 AC-3 input)
+- `e2e/tests/resource_limit/__init__.py` (docstring only)
+- `e2e/_unit_tests/test_directory_layout.py` (added 8 new paths + 1 fixture path)
+- `_docs/02_tasks/todo/AZ-443_nft_lim_04_thermal.md` (constraint path
+  corrected to reflect ownership — see F3)
+
+## Findings
+
+| # | Severity | Category | File:Line | Title |
+|---|----------|----------|-----------|-------|
+| 1 | Medium | Maintainability | `e2e/runner/helpers/*_evaluator.py` | Duplicated `write_csv_evidence` boilerplate (carried over from batches 85–87) |
+| 2 | Medium | Maintainability | `e2e/tests/resource_limit/test_nft_lim_0[1-4]*.py` | Duplicated `_resolve_fixture_path` boilerplate (carried over from batches 85–87) |
+| 3 | Low | Scope | `_docs/02_tasks/todo/AZ-443_nft_lim_04_thermal.md:61` | Task spec referenced `tests/fixtures/` for the thresholds fixture, which is outside the `blackbox_tests` `Owns: e2e/**` envelope — implementation moved to `e2e/fixtures/jetson/thermal-thresholds.json`; spec note added inline |
+
+No Critical, High, or Security findings.
+
+## Finding Details
+
+### F1: Duplicated `write_csv_evidence` boilerplate (Medium / Maintainability)
+
+- Location: `e2e/runner/helpers/memory_budget_evaluator.py`,
+  `fdr_size_evaluator.py`, `storage_budget_evaluator.py`,
+  `thermal_envelope_evaluator.py`.
+- Description: each helper hand-rolls the same single-row CSV pattern
+  (open → writerow(header) → writerow(values)) and the empty-cell
+  convention (`"" if value is None else value`). Same observation
+  raised in `cumulative_review_batches_85_87.md` for prior helpers
+  (`egress_observer`, `mavlink_signing_evaluator`, etc.).
+- Suggestion: keep the duplication for now; AZ-446 (CSV reporter
+  refinements, scheduled batch 89) will consolidate the pattern into a
+  reusable helper. Tracking via the existing PBI rather than expanding
+  Batch 88 scope.
+- Tasks: AZ-440, AZ-441, AZ-442, AZ-443.
+
+### F2: Duplicated `_resolve_fixture_path` boilerplate (Medium / Maintainability)
+
+- Location: `e2e/tests/resource_limit/test_nft_lim_01_jetson_memory.py:135`,
+  `test_nft_lim_02_fdr_size.py`, `test_nft_lim_03_05_storage_budget.py`,
+  similar branch in `test_nft_lim_04_thermal.py:135`.
+- Description: each scenario re-implements the same env-var → relative
+  path → `sitl_observer.replay_dir()` resolution. Carried over from
+  the cumulative review of batches 85–87.
+- Suggestion: extract into `runner/helpers/sitl_observer` (or a new
+  `runner.helpers.fixture_resolver`) as part of AZ-446 / the
+  cumulative-review remediation PBI.
+- Tasks: AZ-440, AZ-441, AZ-442, AZ-443.
+
+### F3: Task spec fixture path violated ownership (Low / Scope)
+
+- Location: `_docs/02_tasks/todo/AZ-443_nft_lim_04_thermal.md:61`.
+- Description: the constraint section suggested placing the thermal
+  thresholds fixture at `tests/fixtures/jetson-thermal-thresholds.json`.
+  That path is outside the `blackbox_tests` component's
+  `Owns: e2e/**` glob (module-layout.md:424), and the only consumer
+  is the e2e test harness.
+- Resolution: implementation lives at
+  `e2e/fixtures/jetson/thermal-thresholds.json`; task spec updated in
+  the same batch with an explicit deviation note (see commit).
+- Tasks: AZ-443.
+
+## AC Test Coverage
+
+| Task | ACs | Coverage |
+|------|-----|----------|
+| AZ-440 NFT-LIM-01 | AC-1..AC-6 | All covered — AC-1 via `tier2_only`; AC-2/3/4 via `MemoryBudgetReport.passes_*` + assertion; AC-5 via `_resolve_plan()` + `Plan.PLAN_B` unit test; AC-6 via conftest parameterization. |
+| AZ-441 NFT-LIM-02 | AC-1..AC-3 | All covered — AC-1 via `passes_replay_window` (±60 s slack); AC-2 via `passes_extrapolation`; AC-3 via conftest parameterization. |
+| AZ-442 NFT-LIM-03+05 | AC-1..AC-3 | All covered — AC-1 via `passes_aggregate`; AC-2 via strict-`<` `passes_thumbnail_log`; AC-3 via conftest parameterization. |
+| AZ-443 NFT-LIM-04 | AC-1..AC-5 | All covered — AC-1 via `tier2_only`; AC-2 via `passes_no_throttle`; AC-3 via `passes_headroom` + `ThermalThresholds.load_from_fixture`; AC-4 via `write_traceability_partial_annotation`; AC-5 via conftest parameterization. |
+
+## Verdict Logic
+
+- Critical findings: 0
+- High findings: 0
+- Medium findings: 2 (both carried-over)
+- Low findings: 1 (self-resolved in batch)
+
+→ **PASS_WITH_WARNINGS**
+
+## Architecture Compliance (Phase 7)
+
+- All new product files live under `e2e/**` (owned by `blackbox_tests`).
+- No `src/gps_denied_onboard` imports (docstrings explicit; verified by
+  grep).
+- No new cyclic dependencies; helpers are leaf-level modules importing
+  only `csv`, `json`, `pathlib`, `dataclasses`, `enum`, `math`.
+- F3 was an ownership drift that has been corrected in-batch; no
+  carried-over architecture findings.
@@ -6,9 +6,9 @@ step: 10
 name: Implement Tests
 status: in_progress
 sub_step:
-  phase: 2
-  name: detect-progress
-  detail: "batch 87 archived; cumulative review 85-87 done; starting batch 88"
+  phase: 9
+  name: code-review
+  detail: "batch 88 — AZ-440..AZ-443 NFT-LIM cluster (AZ-446 deferred to batch 89)"
 retry_count: 0
 cycle: 1
 tracker: jira