mirror of
https://github.com/azaion/satellite-provider.git
synced 2026-06-21 23:31:15 +00:00
[AZ-504] Fix grep | wc -l pipefail crash in PT-08 batch counting
scripts/run-performance-tests.sh:416-417 used `grep -o ... | wc -l` to count `"status":"accepted"` and `"status":"rejected"` markers in the PT-08 batch response. On the happy path (rejected=0) grep -o exits 1, and under `set -o pipefail` + `set -e` (line 16) the pipeline killed the script before reaching any of PT-08's reporting code — reproducing twice in the cycle-3 perf-harness leftover (replay #2 + #3 post-AZ-500). Fix: neutralise grep's no-match exit locally with `|| true` on the grep stage of each pipeline. `grep -o | wc -l` is kept (not swapped for `grep -c`) because the PT-08 response is compact JSON — all items live on one line, so `grep -c` would always return 1 and lose occurrence-count semantics. An 8-line comment explains why grep cannot fail for I/O at this code path (file is curl-written, HTTP 200 gated). AC-1 + AC-2 verified in-place against a standalone harness under `set -e -o pipefail` (compact-JSON, mixed-status, edge-empty cases). AC-3 + AC-4 are Step 15 (Performance Test) obligations by spec design — the leftover deletion (AC-4) is "in the same commit" as the green full perf run. Batch report: _docs/03_implementation/batch_01_cycle5_report.md. Code review: _docs/03_implementation/reviews/batch_01_cycle5_review.md — PASS, no findings. Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -0,0 +1,94 @@
|
||||
# Perf script: fix grep | wc -l pipefail crash in PT-08
|
||||
|
||||
**Task**: AZ-504_perf_script_grep_pipefail_fix
|
||||
**Name**: Perf script: fix grep | wc -l pipefail crash in PT-08
|
||||
**Description**: `scripts/run-performance-tests.sh:416-417` uses `grep -o ... | wc -l` to count `"status":"rejected"` and `"status":"accepted"` markers in the PT-08 batch response. On the happy path (`rejected=0`), `grep -o` exits 1 (no matches), and because the script has `set -o pipefail` + `set -e`, the assignment kills the script silently right after `rejected=0`. The cycle-3 perf-harness leftover (`_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md`) has reproduced this crash twice (replay #2 + #3, post-AZ-500); PT-01..PT-07 stay green. Until this one-line fix lands, PT-08 cannot be measured against its 2000 ms p95 threshold and the leftover stays open.
|
||||
**Complexity**: 1 points
|
||||
**Dependencies**: None
|
||||
**Component**: scripts/run-performance-tests.sh
|
||||
**Tracker**: AZ-504
|
||||
**Epic**: AZ-483 — Multi-source tile storage + UAV upload (Layer 2)
|
||||
|
||||
## Problem
|
||||
|
||||
`scripts/run-performance-tests.sh` lines 416-417 currently read:
|
||||
|
||||
```bash
|
||||
accepted=$(grep -o '"status":"accepted"' "$PERF_TMP_DIR/pt08_resp.json" | wc -l | tr -d ' ')
|
||||
rejected=$(grep -o '"status":"rejected"' "$PERF_TMP_DIR/pt08_resp.json" | wc -l | tr -d ' ')
|
||||
```
|
||||
|
||||
With `set -o pipefail` (line 16) and `set -e`, when `grep -o` matches zero times it exits 1; the pipeline returns 1; the assignment kills the script. This is masked when both counts are ≥1 — the bug only surfaces on the happy path where `rejected=0`. Two consecutive full-harness replays (post-AZ-500 .NET 10 migration) reproduced the crash at the exact same line; PT-01..PT-07 are unaffected because they don't pipe potentially-empty `grep` output through `wc -l`.
|
||||
|
||||
The leftover documents that the actual perf-relevant data PT-08 captured before crashing was healthy: HTTP 200, batch latency 99 ms (well under the 2000 ms threshold), accepted=2, rejected=0. The harness is buggy, not the production code path.
|
||||
|
||||
## Outcome
|
||||
|
||||
- `scripts/run-performance-tests.sh` runs PT-08 to completion when `rejected=0` (and when `accepted=0`, defensively).
|
||||
- A full default-parameter perf run (`PERF_REPEAT_COUNT=20 PERF_UAV_BATCH_SIZE=10`) prints the PT-08 summary line with batch p50/p95, accepted total, rejected total, and exits 0 against an api built from `dev`.
|
||||
- The cycle-3 perf-harness leftover (`_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md`) is deleted once a full perf run is green (per AZ-500 Constraint).
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
- `scripts/run-performance-tests.sh` lines 416-417 replaced with pipefail-tolerant counting (`grep -c ... || true`).
|
||||
- A short shellcheck pass on the surrounding `pt08_summary` block so no similar empty-match-counting bug remains in the same scope.
|
||||
- Delete `_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md` after the green full perf run completes (post-implementation).
|
||||
|
||||
### Excluded
|
||||
- Changes to PT-01..PT-07 scenarios or any production code path — bug is harness-only.
|
||||
- Adding new perf scenarios — out of scope.
|
||||
- Renaming or restructuring `scripts/run-performance-tests.sh`.
|
||||
- Server-side per-call `UavTileQualityGate.Validate` timing instrumentation (already deferred in cycle 3 AZ-492 — out of scope here too).
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: PT-08 completes on zero-rejected response**
|
||||
Given the api is up against `docker-compose up -d --build` and a UAV batch upload returns `accepted ≥ 1, rejected = 0`
|
||||
When the harness runs PT-08 batch counting
|
||||
Then the assignment to `rejected` succeeds (no script exit); `rejected` is `0`; the loop continues to the next iteration.
|
||||
|
||||
**AC-2: PT-08 completes on zero-accepted response (defensive)**
|
||||
Given a UAV batch upload returns `accepted = 0, rejected ≥ 1` (e.g., quality gate rejects every item)
|
||||
When the harness runs PT-08 batch counting
|
||||
Then the assignment to `accepted` succeeds (no script exit); `accepted` is `0`; the loop continues.
|
||||
|
||||
**AC-3: PT-08 summary line prints in full run**
|
||||
Given a full default-parameter perf run (`PERF_REPEAT_COUNT=20 PERF_UAV_BATCH_SIZE=10`) against a healthy api
|
||||
When the harness reaches the PT-08 reporting block
|
||||
Then a summary line `PT-08 UAV batch upload: PASS p95=Xms / 2000ms (accepted=A, rejected=R, N=20)` is printed; the script exits 0.
|
||||
|
||||
**AC-4: Leftover deletion on green full run**
|
||||
Given AC-1 + AC-2 + AC-3 hold and the full perf run is green
|
||||
When the implementer verifies the run
|
||||
Then `_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md` is deleted in the same commit (per AZ-500 Constraint "leftover file is deleted ONLY when the full perf script runs cleanly").
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Compatibility**
|
||||
- The fix must work under `bash` (script's `#!/bin/bash`). No POSIX-sh constraint.
|
||||
- The fix must not silently swallow other errors in the surrounding block — only the empty-match case (`grep` exit 1) is tolerated; any other failure (file unreadable, pipe broken) must still propagate (`coderule.mdc` "never suppress errors silently").
|
||||
|
||||
## Constraints
|
||||
|
||||
- One-file edit. No new scripts, no helper extraction.
|
||||
- Preserve `set -o pipefail` and `set -e` semantics globally — only neutralise grep's exit-1-on-no-match locally.
|
||||
- Do not change PT-08's threshold (2000 ms p95, established by AZ-488).
|
||||
- Do not rename, move, or restructure `scripts/run-performance-tests.sh`.
|
||||
|
||||
## Risks & Mitigation
|
||||
|
||||
**Risk 1: `grep -c` counts lines containing the pattern, not occurrences**
|
||||
- *Risk*: `grep -c` counts matching LINES, not total occurrences. If the PT-08 response packs multiple `"status":"..."` markers on one line (compact JSON), the count diverges from `grep -o | wc -l`.
|
||||
- *Mitigation*: inspect a sample `pt08_resp.json` from a recent run; if the JSON is one-per-line the `grep -c` swap is exact, otherwise use `grep -o ... | wc -l` wrapped in `|| echo 0` to preserve occurrence-count semantics while neutralising the empty-match exit.
|
||||
|
||||
**Risk 2: Pipefail bug exists elsewhere in the same script**
|
||||
- *Risk*: similar `grep -o ... | wc -l` patterns elsewhere will crash on different happy paths.
|
||||
- *Mitigation*: in-scope shellcheck pass over the file flags every `grep -o ... | wc -l` / `grep ... | wc -l` site for review; same defensive treatment applied as needed.
|
||||
|
||||
## References
|
||||
|
||||
- `_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md` — replay #2 + #3 evidence; one-line fix proposed at lines 56-67 of the leftover.
|
||||
- `scripts/run-performance-tests.sh:416-417` — exact lines to fix.
|
||||
- `AZ-488` — PT-08 scenario owner; threshold context.
|
||||
- `AZ-500` Constraint — "leftover file is deleted ONLY when the full perf script runs cleanly".
|
||||
Reference in New Issue
Block a user