mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 19:01:14 +00:00
[AZ-440] [AZ-441] [AZ-442] [AZ-443] NFT-LIM-01/02/03+05/04 blackbox scenarios
Batch 88 — adds four resource-limit blackbox scenarios + pure-logic helpers + unit tests:

- NFT-LIM-01 Jetson memory (AC-NEW-13): tier2_only; Plan A/B budgets; AC-4 OOM-event scan; 30 s warm-up window; VmRSS + tegrastats streams.
- NFT-LIM-02 FDR size (AC-7.3): 30 min → 8 h linear extrapolation against 50 GiB; ±60 s replay-window slack for AC-1.
- NFT-LIM-03+05 storage (AC-7.4 + AC-NEW-12 + RESTRICT-STORAGE): aggregate ≤ 100 GiB across tile-cache + tile-cache-write + fdr-output; thumbnail-log < 1 GiB strict 8 h-extrapolated.
- NFT-LIM-04 thermal (AC-NEW-5 PARTIAL): tier2_only; CPU/SoC p99 ≤ T_throttle − 5 °C; throttle-event scan; PARTIAL annotation written to traceability-status.json.

Thresholds fixture lives at e2e/fixtures/jetson/thermal-thresholds.json (moved from the task spec's suggested tests/fixtures/ path so the file stays inside the blackbox_tests Owns: e2e/** envelope).

All four helpers are public-boundary-only (no src/gps_denied_onboard imports).

Scenarios skip cleanly in the Tier-1 docker harness pending AZ-595 (SITL replay builder) for the four shared fixture inputs and AZ-444 (Tier-2 Jetson runner) for the tier2_only scenarios.

Code review: PASS_WITH_WARNINGS (0/0/2/1). Both Mediums are carried-over write_csv_evidence + _resolve_fixture_path duplication, deferred to AZ-446 (batch 89). Low is the self-resolved AZ-443 fixture ownership drift documented in the review.

Tests: 1223 e2e/_unit_tests passing (+1 vs. batch 87 from the new directory-layout entry); 24 resource_limit scenarios collect and skip cleanly under runner/pytest.ini.

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,77 +0,0 @@
# NFT-LIM-01 — Jetson memory budget

**Task**: AZ-440_nft_lim_01_jetson_memory
**Name**: Steady-state memory ≤ 4.5 GB (Plan A) / ≤ 6.0 GB (Plan B); peak ≤ 5.0 GB / 6.5 GB; no OOM (AC-NEW-13)
**Description**: Implement NFT-LIM-01 — Tier-2 ONLY; 5 min Derkachi replay + 30 s warm-up; sample memory at 1 Hz from `/proc/<pid>/status` AND `tegrastats`; assert steady-state and peak budgets per Plan A / Plan B; no OOM kills observed.
**Complexity**: 3 points
**Dependencies**: AZ-406, AZ-407, AZ-444
**Component**: Blackbox Tests / Resource Limit (epic AZ-262)
**Tracker**: AZ-440
**Epic**: AZ-262 (E-BBT)

## Problem

Jetson Orin Nano Super has 8 GB RAM total; the SUT must operate within Plan A (4.5 GB) or Plan B (6.0 GB) budgets to leave headroom for OS + suite. AC-NEW-13 prescribes the budgets and Plan-A/Plan-B switching rules.

## Outcome

- pytest scenario at `e2e/tests/resource_limit/test_nft_lim_01_jetson_memory.py`. Tier-2 ONLY.
- 5 min Derkachi replay + 30 s warm-up.
- Per-second memory sample from (a) `/proc/<pid>/status` `VmRSS`, (b) `tegrastats` (system-level memory).
- Compute steady-state (post-warm-up p50) and peak (post-warm-up max).
- Assert `steady_state ≤ 4.5 GB` (Plan A default) AND `peak ≤ 5.0 GB`. Plan B (6.0 / 6.5 GB) gated behind `MEMORY_PLAN=B` flag.
- Assert no OOM kills in `dmesg` during the run.
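
The budget evaluation described in the list above can be sketched as a small pure-logic helper. This is an illustrative sketch, not the helper shipped in the commit: the names `parse_vmrss_bytes`, `active_plan`, and `evaluate_memory` are hypothetical, and the spec's "GB" is read as GiB here, which is an assumption.

```python
import statistics

GIB = 2**30  # assumption: the spec's "GB" is interpreted as GiB

# (steady-state, peak) budgets per plan, taken from the task name.
BUDGETS_GIB = {"A": (4.5, 5.0), "B": (6.0, 6.5)}


def parse_vmrss_bytes(status_text):
    """Extract VmRSS from the text of /proc/<pid>/status (reported in kB)."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1]) * 1024
    raise ValueError("VmRSS line not found")


def active_plan(env):
    """Plan A is the default; Plan B applies only when MEMORY_PLAN=B is set."""
    return "B" if env.get("MEMORY_PLAN") == "B" else "A"


def evaluate_memory(samples, warmup_s=30.0, plan="A"):
    """samples: iterable of (seconds_since_start, rss_bytes) pairs."""
    steady_budget, peak_budget = BUDGETS_GIB[plan]
    # Discard everything inside the warm-up window.
    post = [rss for t, rss in samples if t >= warmup_s]
    if not post:
        raise ValueError("no samples after the warm-up window")
    steady = statistics.median(post) / GIB  # p50
    peak = max(post) / GIB
    return {
        "steady_gib": steady,
        "peak_gib": peak,
        "steady_ok": steady <= steady_budget,
        "peak_ok": peak <= peak_budget,
    }
```

In a scenario, the plan would typically come from `active_plan(os.environ)`, and the same evaluation could be applied to the `tegrastats` stream for the system-level check.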

## Scope

### Included

- `/proc/<pid>/status` + `tegrastats` sampling.
- Steady-state + peak computation.
- OOM detection in `dmesg`.
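
For the `tegrastats` side of the sampling, a parser along the following lines might be used. It assumes the memory field appears as `RAM <used>/<total>MB`, which should be verified against the JetPack release on the target; the function name `parse_tegrastats_ram_mb` and the sample line in the usage note are illustrative only.

```python
import re

# Assumed tegrastats memory field, e.g. "RAM 2448/7620MB".
# The exact layout can differ between JetPack releases; verify on the target.
_RAM_RE = re.compile(r"\bRAM (\d+)/(\d+)MB\b")


def parse_tegrastats_ram_mb(line):
    """Return (used_mb, total_mb) from one tegrastats line, or None if absent."""
    match = _RAM_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Given a line such as `RAM 2448/7620MB (lfb 2x4MB) SWAP 0/3810MB (cached 0MB)`, the helper returns `(2448, 7620)`; lines without a RAM field yield `None` so the caller can decide whether a missing sample is an error.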

### Excluded

- FDR size budget — owned by NFT-LIM-02 (AZ-441).
- Tier-1 memory measurement — irrelevant (Docker on x86 has different budgets).

## Acceptance Criteria

**AC-1: tier guard**
Given `tier == tier1-docker`
Then the test SKIPs.

**AC-2: steady-state budget (Plan A)**
Given the post-warm-up samples
Then `p50(VmRSS) ≤ 4.5 GB` AND `p50(tegrastats system memory) ≤ 4.5 GB`.

**AC-3: peak budget (Plan A)**
Given the same samples
Then `max(VmRSS) ≤ 5.0 GB`.

**AC-4: no OOM kills**
Given `dmesg --since "<run_start>"`
Then no entries match `oom-killer` or `Killed process .*gps-denied-onboard`.
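
The AC-4 scan reduces to matching the two patterns against captured `dmesg` text. A minimal sketch, assuming the log has already been captured as a string; `find_oom_events` is a hypothetical name and the sample kernel lines in the usage note are illustrative, not real output from the SUT.

```python
import re

# The two patterns named in AC-4.
OOM_PATTERNS = (
    re.compile(r"oom-killer"),
    re.compile(r"Killed process .*gps-denied-onboard"),
)


def find_oom_events(dmesg_text):
    """Return every dmesg line that matches one of the AC-4 patterns."""
    return [
        line
        for line in dmesg_text.splitlines()
        if any(pattern.search(line) for pattern in OOM_PATTERNS)
    ]
```

The scenario would then assert that `find_oom_events(captured_log)` is an empty list, and could attach the matching lines to the evidence bundle when it is not.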

**AC-5: Plan B gated**
Given `MEMORY_PLAN=B` env flag
Then the budgets relax to `steady ≤ 6.0 GB` AND `peak ≤ 6.5 GB`.

**AC-6: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

End-to-end on real hardware through public boundaries.

- **Allowed**: `/proc`, `tegrastats` (public OS / NVIDIA telemetry), `dmesg`.
- **Forbidden**: instrumenting SUT internal allocators.

## Constraints

- Tier-2 only.
- Plan A / Plan B switching is per the SUT's documented config; the test does NOT trigger the switch — it observes and reports the active plan.

## Document Dependencies

- `_docs/02_document/tests/resource-limit-tests.md` § NFT-LIM-01
- `_docs/02_document/tests/test-data.md` § Resource Limits (NFT-LIM-01 row)
@@ -1,57 +0,0 @@
# NFT-LIM-02 — FDR size budget

**Task**: AZ-441_nft_lim_02_fdr_size
**Name**: 8 h-extrapolated FDR size ≤ 50 GB (AC-7.3)
**Description**: Implement NFT-LIM-02 — replay 30 min Derkachi at typical fixed rates; measure FDR-archive growth (`du -sh fdr-output`); extrapolate to 8 h linearly; assert `8h_extrapolated_size ≤ 50 GB`.
**Complexity**: 2 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Resource Limit (epic AZ-262)
**Tracker**: AZ-441
**Epic**: AZ-262 (E-BBT)

## Problem

FDR size budget (AC-7.3) protects the on-board storage from being filled mid-flight. A 30 min run extrapolated to 8 h is the canonical measurement.

## Outcome

- pytest scenario at `e2e/tests/resource_limit/test_nft_lim_02_fdr_size.py`. Tier-1 OR Tier-2.
- 30 min Derkachi replay (loop the 8 min flight ~4×); per-minute `du -sh fdr-output` sampling.
- Linear extrapolation to 8 h: `extrapolated_size = (size_at_30min / 30) × (8 × 60)`.
- Assert `extrapolated_size ≤ 50 GB`.
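
The extrapolation above scales the 30 min size by 480 / 30 = 16, so a 50 GiB budget is met only when the FDR output is at most 3.125 GiB after 30 minutes. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the budget is expressed in GiB as in the commit message.

```python
GIB = 2**30  # the commit message states the budget in GiB


def extrapolate_bytes(size_bytes, elapsed_min=30.0, target_min=480.0):
    """Linearly scale a size measured after elapsed_min to target_min."""
    if elapsed_min <= 0:
        raise ValueError("elapsed_min must be positive")
    return size_bytes * (target_min / elapsed_min)


def fdr_within_budget(size_at_30min_bytes, budget_bytes=50 * GIB):
    """Return (projected_bytes, ok) for the 8 h FDR size budget."""
    projected = extrapolate_bytes(size_at_30min_bytes)
    return projected, projected <= budget_bytes
```

For example, 3 GiB after 30 minutes projects to 48 GiB and passes, while 3.2 GiB projects to 51.2 GiB and fails.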

## Scope

### Included

- 30 min replay (looped 8 min × 4).
- Per-minute size sampling.
- Linear extrapolation.

### Excluded

- Storage budget for tile cache + tiles → owned by NFT-LIM-03 (AZ-442).

## Acceptance Criteria

**AC-1: 30 min replay**
Given the test runs
Then the runner loops Derkachi for 30 min wall-clock.

**AC-2: extrapolation budget**
Given per-minute samples
Then `(size_at_30min / 30) × 480 ≤ 50 GB`.

**AC-3: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

End-to-end through public boundaries.

- **Allowed**: `du -sh` of mounted volumes.
- **Forbidden**: importing FDR writer state.

## Document Dependencies

- `_docs/02_document/tests/resource-limit-tests.md` § NFT-LIM-02
- `_docs/02_document/tests/test-data.md` § Resource Limits (NFT-LIM-02 row)
@@ -1,55 +0,0 @@
# NFT-LIM-03 + NFT-LIM-05 — Aggregate storage budget + thumbnail-log budget

**Task**: AZ-442_nft_lim_03_05_storage_budget
**Name**: Aggregate on-disk storage + thumbnail-log budget (AC-7.4 / AC-NEW-12 / RESTRICT-STORAGE)
**Description**: Combined coverage for two storage-budget scenarios that share the same volume measurement: NFT-LIM-03 (aggregate `tile-cache + tile-cache-write + fdr-output ≤ 100 GB`) and NFT-LIM-05 (thumbnail-log size component < 1 GB / 8 h).
**Complexity**: 2 points
**Dependencies**: AZ-406, AZ-407
**Component**: Blackbox Tests / Resource Limit (epic AZ-262)
**Tracker**: AZ-442
**Epic**: AZ-262 (E-BBT)

## Problem

The aggregate storage budget bounds how much disk the SUT may consume on the companion. Two related scenarios share this measurement and are combined to avoid duplication.

## Outcome

- pytest scenario at `e2e/tests/resource_limit/test_nft_lim_03_05_storage_budget.py`. Tier-1 OR Tier-2.
- 30 min Derkachi replay; per-minute `du -sh` of `tile-cache`, `tile-cache-write`, `fdr-output`, and the thumbnail-log subdirectory.
- NFT-LIM-03: assert `aggregate(tile-cache + tile-cache-write + fdr-output) ≤ 100 GB`.
- NFT-LIM-05: extrapolate thumbnail-log subdirectory to 8 h; assert `extrapolated_thumbnail_log < 1 GB`.
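
The two assertions differ in one detail worth making explicit: the aggregate budget is inclusive (≤), while the thumbnail-log budget is strict (<). A minimal sketch under the assumption that per-volume sizes are already available in bytes; the function names and the dictionary layout are hypothetical, and budgets are expressed in GiB as in the commit message.

```python
GIB = 2**30  # the commit message states both budgets in GiB

AGGREGATE_VOLUMES = ("tile-cache", "tile-cache-write", "fdr-output")


def aggregate_within_budget(sizes_bytes, budget_bytes=100 * GIB):
    """NFT-LIM-03: sum the three volumes; the budget is inclusive."""
    total = sum(sizes_bytes[name] for name in AGGREGATE_VOLUMES)
    return total, total <= budget_bytes


def thumbnail_within_budget(size_at_30min_bytes, budget_bytes=1 * GIB):
    """NFT-LIM-05: project 30 min to 8 h (factor 16); the budget is strict."""
    projected = size_at_30min_bytes * (480 / 30)
    return projected, projected < budget_bytes
```

With the factor of 16, a thumbnail log of exactly 64 MiB after 30 minutes projects to exactly 1 GiB and therefore fails the strict check.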

## Scope

### Included

- 30 min replay (shareable with NFT-LIM-02 if combined in CI orchestration).
- Per-volume size sampling.
- Both budget assertions.

### Excluded

- FDR-only size — owned by NFT-LIM-02.
- Mid-flight tile generation rate — owned by FT-P-17 (AZ-422).

## Acceptance Criteria

**AC-1: aggregate budget**
Given a 30 min replay
Then the summed `du` sizes of `tile-cache`, `tile-cache-write`, and `fdr-output` are ≤ 100 GB at the end of the run.

**AC-2: thumbnail-log 8 h budget**
Given the per-minute samples of the thumbnail-log subdirectory
Then `(size_at_30min_thumb / 30) × 480 < 1 GB`.

**AC-3: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

Same as NFT-LIM-02.

## Document Dependencies

- `_docs/02_document/tests/resource-limit-tests.md` § NFT-LIM-03, § NFT-LIM-05
- `_docs/02_document/tests/test-data.md` § Resource Limits
@@ -1,66 +0,0 @@
# NFT-LIM-04 — Thermal envelope on Jetson

**Task**: AZ-443_nft_lim_04_thermal
**Name**: Sustained thermal headroom + AC-NEW-5 PARTIAL acceptance (AC-NEW-5)
**Description**: Implement NFT-LIM-04 — Tier-2 ONLY; sustained 30 min Derkachi loop at workstation ambient; record CPU/GPU/SoC temperatures via `tegrastats`; assert no thermal throttling kicks in. Mark as PARTIAL coverage of AC-NEW-5 (the +50 °C chamber portion is a separate release-gate scenario).
**Complexity**: 2 points
**Dependencies**: AZ-406, AZ-407, AZ-444
**Component**: Blackbox Tests / Resource Limit (epic AZ-262)
**Tracker**: AZ-443
**Epic**: AZ-262 (E-BBT)

## Problem

Thermal behavior at workstation ambient provides only partial coverage of AC-NEW-5: it cannot prove the +50 °C envelope, but it can flag obvious thermal regressions.

## Outcome

- pytest scenario at `e2e/tests/resource_limit/test_nft_lim_04_thermal.py`. Tier-2 ONLY.
- 30 min Derkachi loop; per-second `tegrastats` capture; assert no thermal-throttling event in `dmesg` AND p99 `cpu_temp ≤ T_throttle - 5 °C` (5 °C headroom).
- Annotate `traceability-matrix.md` AC-NEW-5 status as PARTIAL (chamber required for full).

## Scope

### Included

- 30 min loop.
- `tegrastats` + `dmesg` capture.
- PARTIAL annotation in evidence bundle.

### Excluded

- +50 °C chamber portion — owned by a separate release-gate scenario, not in this CI scope.

## Acceptance Criteria

**AC-1: tier guard**
Given `tier == tier1-docker`
Then the test SKIPs.

**AC-2: no thermal throttle**
Given the 30 min loop
Then `dmesg --since "<run_start>"` shows no entries matching `thermal_throttle` / `tegra_thermal_zone`.

**AC-3: 5 °C headroom**
Given the per-second `tegrastats` samples
Then `p99(cpu_temp) ≤ T_throttle - 5 °C` AND `p99(soc_temp) ≤ T_throttle - 5 °C`. T_throttle is the hardware-documented value (97 °C for CPU, 95 °C for SoC on Orin Nano per NVIDIA docs; sourced from a fixture file at runtime).
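
The headroom check in AC-3 can be sketched as follows. The spec does not say which p99 definition applies, so this sketch uses the nearest-rank method as an assumption. The fixture schema (`{"cpu": ..., "soc": ...}`) and the function names are also hypothetical; only the 97 °C / 95 °C values come from the text above.

```python
import json
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def headroom_ok(temps_c, t_throttle_c, headroom_c=5.0):
    """True when p99 of the samples stays at least headroom_c below T_throttle."""
    return percentile_nearest_rank(temps_c, 99) <= t_throttle_c - headroom_c


# Hypothetical fixture content; the real file's schema may differ.
thresholds = json.loads('{"cpu": 97.0, "soc": 95.0}')
```

With these thresholds the CPU limit is 92 °C and the SoC limit is 90 °C. A single hot sample in 100 does not move the nearest-rank p99, but two do, which is the behavior a p99 criterion is meant to give.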

**AC-4: PARTIAL annotation**
Given the test completes
Then the evidence bundle includes a `traceability-status.json` entry `"AC-NEW-5": "PARTIAL — chamber required for full"`.
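
Writing the AC-4 entry could look like the sketch below. It assumes `traceability-status.json` is a flat JSON object mapping AC identifiers to status strings and that existing entries should be preserved; the function name `annotate_partial` and the bundle-directory argument are hypothetical.

```python
import json
from pathlib import Path

PARTIAL_STATUS = "PARTIAL — chamber required for full"  # string taken from AC-4


def annotate_partial(bundle_dir, ac_id="AC-NEW-5", status=PARTIAL_STATUS):
    """Add or update one entry in traceability-status.json, keeping the others."""
    path = Path(bundle_dir) / "traceability-status.json"
    data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    data[ac_id] = status
    # ensure_ascii=False keeps the status string readable in the file.
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

Merging into the existing file rather than overwriting it keeps the helper safe to call from several scenarios that share one evidence bundle.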

**AC-5: parameterization**
Given conftest parameterization
Then the scenario runs per `(fc_adapter, vio_strategy)`.

## System Under Test Boundary

Same as NFT-LIM-01.

## Constraints

- Tier-2 only.
- T_throttle is read from a fixture file (`tests/fixtures/jetson-thermal-thresholds.json`) so future Jetson hardware updates require only a fixture bump.

## Document Dependencies

- `_docs/02_document/tests/resource-limit-tests.md` § NFT-LIM-04
- `_docs/02_document/tests/traceability-matrix.md` § AC-NEW-5 (PARTIAL annotation)