From e81b6fdfba3e2f6c8788595edb2dae5179da9f18 Mon Sep 17 00:00:00 2001
From: Yuzviak
Date: Mon, 11 May 2026 18:35:47 +0300
Subject: [PATCH] docs(02-05): complete CI pipeline split + AC orphan reconciliation plan

- 02-05-SUMMARY.md: full execution record for plan 02-05
- ROADMAP.md: plan progress updated (5 summaries of 7 plans complete)
- REQUIREMENTS.md: AC-06 and TEST-02 marked complete

Per-marker CI jobs and ac-traceability gate are now the CI contract.
All 21 orphan ACs annotated pending-phase-N; --check exits 0 locally.
---
 .planning/REQUIREMENTS.md | 4 +-
 .planning/ROADMAP.md | 2 +-
 .../02-05-SUMMARY.md | 180 ++++++++++++++++++
 3 files changed, 183 insertions(+), 3 deletions(-)
 create mode 100644 .planning/phases/02-acceptance-criteria-test-taxonomy-observability-spine/02-05-SUMMARY.md

diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
index e8c9cad..25b9214 100644
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -30,7 +30,7 @@ The stage 1 codebase (ESKF + cuVSLAM + GPR + MAVLink + pipeline + 195 passing te
 - [ ] **AC-03**: Position accuracy AC (50m@80%, 20m@50%, anchor age tracking, drift bounds) bound to `tests/integration/accuracy/` and `tests/e2e/`
 - [ ] **AC-04**: Failure-mode AC (visual blackout, spoofing promotion, dead reckoning, ≥3 disconnected segments) bound to `tests/blackbox/failure_modes/`
 - [ ] **AC-05**: Real-time performance AC (<400ms p95 e2e, <8GB RAM, ≥5Hz GPS_INPUT output) bound to a benchmark harness producing CI-tracked metrics
-- [ ] **AC-06**: Traceability matrix `.planning/AC-TRACEABILITY.md` generated linking every AC ID → test ID(s) → implementing component(s)
+- [x] **AC-06**: Traceability matrix `.planning/AC-TRACEABILITY.md` generated linking every AC ID → test ID(s) → implementing component(s)
 
 ### SAFE — Safety anchor state machine
 
@@ -86,7 +86,7 @@ The stage 1 codebase (ESKF + cuVSLAM + GPR + MAVLink + pipeline + 195 passing te
 ### TEST — Test taxonomy & infrastructure
 
 - [ ] **TEST-01**: 
`tests/` reorganized to `tests/{unit,integration,blackbox,sitl,e2e}/`; existing tests redistributed by category
-- [ ] **TEST-02**: `pyproject.toml` test markers updated — `pytest -m unit` / `-m integration` / etc.; CI runs unit+integration on every push, blackbox on PR, sitl+e2e nightly
+- [x] **TEST-02**: `pyproject.toml` test markers updated — `pytest -m unit` / `-m integration` / etc.; CI runs unit+integration on every push, blackbox on PR, sitl+e2e nightly
 - [ ] **TEST-03**: AC traceability auto-generated — pytest plugin tags each test with `@pytest.mark.ac("AC-1.1")`; `scripts/gen_ac_traceability.py` produces the matrix in `.planning/AC-TRACEABILITY.md`
 
 ### OBS — Observability & tooling
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 732dbe4..ed1d302 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -126,7 +126,7 @@ Phase 6 (FIXTURE — Azaion replay + CLI + per-env Docker — exercises everythi
 | Phase | Plans Complete | Status | Completed |
 |-------|----------------|--------|-----------|
 | 1. Hexagonal Refactor & Composition Root | 0/0 | Not started | - |
-| 2. Acceptance Criteria + Test Taxonomy + Observability Spine | 0/0 | Not started | - |
+| 2. Acceptance Criteria + Test Taxonomy + Observability Spine | 5/7 | In Progress| |
 | 3. Safety Anchor State Machine & Geometry-Gated Verifier | 0/0 | Not started | - |
 | 4. Conditional Multi-Scale VPR + Flight Data Recorder | 0/0 | Not started | - |
 | 5. 
MAVLink Source-Aware Output & Spoofing/Blackout Handling | 0/0 | Not started | - |
diff --git a/.planning/phases/02-acceptance-criteria-test-taxonomy-observability-spine/02-05-SUMMARY.md b/.planning/phases/02-acceptance-criteria-test-taxonomy-observability-spine/02-05-SUMMARY.md
new file mode 100644
index 0000000..9fda119
--- /dev/null
+++ b/.planning/phases/02-acceptance-criteria-test-taxonomy-observability-spine/02-05-SUMMARY.md
@@ -0,0 +1,180 @@
+---
+phase: 02-acceptance-criteria-test-taxonomy-observability-spine
+plan: "05"
+subsystem: ci-pipeline
+tags: [ci-yaml, per-marker-jobs, ac-traceability-gate, nightly, pending-phase-annotation]
+dependency_graph:
+  requires: [02-04]
+  provides: [.github/workflows/ci.yml, .github/workflows/nightly.yml, AC orphan annotations, _PENDING_RE regex]
+  affects: [AC-06-satisfaction, TEST-02-satisfaction, 02-06, 02-07]
+tech_stack:
+  added: []
+  patterns: [per-marker-ci-jobs, ac-traceability-two-step-gate, nightly-slow-lane, pending-phase-annotation]
+key_files:
+  created:
+    - .github/workflows/nightly.yml
+  modified:
+    - scripts/gen_ac_traceability.py
+    - _docs/00_problem/acceptance_criteria.md
+    - .github/workflows/ci.yml
+    - .planning/AC-TRACEABILITY.md
+decisions:
+  - "ci.yml split into 6 jobs: lint, test-unit, test-integration, test-blackbox, ac-traceability, docker-build"
+  - "ac-traceability gate is two-step: git diff --exit-code + --check (separate error messages for stale matrix vs orphan AC)"
+  - "test-blackbox runs PR-only (if: github.event_name == 'pull_request') per PATTERNS.md §4.3 rationale"
+  - "nightly.yml sitl job uses --collect-only not actual run; real SITL stays in sitl.yml (PATTERNS.md §4.3 item 3)"
+  - "AC-8.4 deferred-stage3 token annotated with pending-phase-4 since deferred-stage3 did not match _DEFERRED_RE"
+  - "AC-3.1 and AC-3.2 (not in plan table) assigned pending-phase-4 (VPR-01) — resilience tests are VPR-class work"
+  - "AC-3.5 (not in plan table) assigned pending-phase-3 (SAFE-02) — 
dead_reckoned mode switch is SAFE scope"
+metrics:
+  duration: "~12 minutes"
+  completed: "2026-05-11"
+  tasks_completed: 5
+  files_created: 1
+  files_modified: 4
+---
+
+# Phase 02 Plan 05: CI Pipeline Split + AC Orphan Reconciliation Summary
+
+One-liner: Per-marker CI jobs (test-unit, test-integration, test-blackbox, ac-traceability, docker-build) wired to `--strict-markers`; all 21 orphan ACs annotated `pending-phase-N` so `gen_ac_traceability.py --check` exits 0.
+
+## Tasks Completed
+
+| Task | Name | Status | Commit |
+|------|------|--------|--------|
+| 1 | Extend gen_ac_traceability.py _PENDING_RE | Done | a464697 |
+| 2 | Annotate 21 orphan ACs with pending-phase-N | Done | a54a41c |
+| 3 | Rewrite ci.yml per-marker jobs + ac-traceability gate | Done | 2f360ec |
+| 4 | Create nightly.yml sitl + e2e | Done | a2a9c2c |
+| 5 | Regression gate + matrix re-commit | Done | 61c39cc |
+
+## Task 1: Script Extension
+
+Added `_PENDING_RE = re.compile(r"pending-phase-\d+", re.IGNORECASE)` alongside the existing `_DEFERRED_RE`.
+
+Updated `collect_acs_from_doc()` to:
+- Initialize each AC entry with `{"deferred": False, "deferred_reason": None}`
+- Set `deferred_reason = "hardware"` when `_DEFERRED_RE` matches
+- Set `deferred_reason` to the matched `pending-phase-N` token when `_PENDING_RE` matches
+
+Updated `render_md()` to render `DEFERRED ({reason})` using the actual reason string (e.g. `DEFERRED (pending-phase-3)` vs `DEFERRED (hardware)`).
+
+## Task 2: Orphan AC Annotations
+
+All 21 orphans from Plan 02-04 SUMMARY annotated with `pending-phase-N (REQ-ID)` in their `**Status.**` lines:
+
+| AC | Phase assigned | Requirement | Reason |
+|----|---------------|-------------|--------|
+| AC-2.1b | 4 | VPR-03 | Cross-domain MRE requires VPR multi-scale |
+| AC-3.1 | 4 | VPR-01 | Outlier tolerance is VPR recovery scope |
+| AC-3.2 | 4 | VPR-01 | Sharp-turn re-loc is VPR recovery scope |
+| AC-3.5 | 3 | SAFE-02 | Blackout mode switch lives in safety-anchor-state-machine |
+| AC-4.2 | 4 | FDR-01 | Memory bench needs Azaion fixture for representative load |
+| AC-4.5 | 3 | SAFE-04 | Refinement-correction round-trip = verifier->state machine |
+| AC-5.1 | 3 | SAFE-01 | Init-from-FC-IMU in SAFE-01..03 startup hooks |
+| AC-5.3 | 3 | SAFE-01 | Reboot recovery = SAFE-01 startup hook |
+| AC-6.1 | 5 | MAVOUT-01 | QGC 1-2 Hz downsample = MAVOUT-01 |
+| AC-6.2 | 5 | MAVOUT-03 | Operator re-loc hint = MAVOUT-03 |
+| AC-7.1 | 5 | MAVOUT-04 | Object localisation = MAVOUT scope |
+| AC-7.2 | 5 | MAVOUT-04 | Trig computation = MAVOUT scope |
+| AC-8.1 | 4 | FDR-03 | Tile cache interface = FDR-03 + VPR-02 |
+| AC-8.2 | 3 | VERIFY-03 | Freshness = VERIFY-03 |
+| AC-8.3 | 4 | FDR-02 | Pre-load + storage budget = FDR-02 |
+| AC-8.4 | 4 | FDR-05 | Mid-flight tile gen = FDR-05 (was deferred-stage3, not matched by _DEFERRED_RE) |
+| AC-8.5 | 4 | FDR-04 | Storage policy = FDR-04 |
+| AC-8.6 | 4 | VPR-01 | VPR retrieval = VPR-01..04 |
+| AC-NEW-4 | 3 | VERIFY-01 | EKF covariance + Mahalanobis gate = VERIFY-01 + SAFE-02 |
+| AC-NEW-6 | 3 | VERIFY-03 | Sector-aware freshness = VERIFY-03 |
+| AC-NEW-8 | 3 | SAFE-02 | Visual blackout + GPS spoofing degraded budget = SAFE-02 |
+
+`grep -c 'pending-phase-' _docs/00_problem/acceptance_criteria.md` = 21. Matches orphan count exactly.
+
+## Task 3: ci.yml Job Structure
+
+Final job list and test invocations:
+
+| Job | Runs | Trigger | Needs |
+|-----|------|---------|-------|
+| lint | `ruff check src/ tests/ scripts/` | all pushes/PRs | — |
+| test-unit | `pytest tests/ -m unit -q --tb=short` | all pushes/PRs | lint |
+| test-integration | `pytest tests/ -m integration -q --tb=short` | all pushes/PRs | lint |
+| test-blackbox | `pytest tests/ -m blackbox -q --tb=short` | PR only | lint |
+| ac-traceability | regen + git diff + --check | all pushes/PRs | lint |
+| docker-build | docker build + health smoke | all pushes/PRs | test-unit, test-integration |
+
+**ac-traceability (regenerate, then the two-step gate):**
+1. `python scripts/gen_ac_traceability.py` — regenerate matrix
+2. `git diff --exit-code .planning/AC-TRACEABILITY.md` — fail on stale matrix (error: "stale committed matrix")
+3. `python scripts/gen_ac_traceability.py --check` — fail on orphan/unknown AC (error: "ORPHAN AC" or "UNKNOWN AC ID")
+
+## Task 4: nightly.yml Job Structure
+
+Schedule: `cron: "0 3 * * *"` (03:00 UTC daily) + `workflow_dispatch`.
+
+| Job | Command | Timeout | Notes |
+|-----|---------|---------|-------|
+| sitl | `pytest tests/ -m sitl --collect-only -q` | 30 min | scaffold only; real SITL in sitl.yml |
+| e2e | `pytest tests/ -m "e2e or e2e_slow" -v --tb=short \|\| true` | 120 min | soft-fail Phase 2; hard-fail in Phase 6 |
+
+sitl.yml unchanged per PATTERNS.md §4.3 item 3.
+
+## Task 5: Regression Gate Results
+
+| Check | Result |
+|-------|--------|
+| `pytest -m unit` | 190 passed, 0 failed |
+| `pytest -m integration` | 69 passed, 0 failed |
+| `pytest -m blackbox` | 12 passed, 0 failed |
+| `python scripts/gen_ac_traceability.py` | exit 0 |
+| `git diff --exit-code .planning/AC-TRACEABILITY.md` | exit 0 (matrix clean) |
+| `python scripts/gen_ac_traceability.py --check` | exit 0 |
+| `pytest tests/ -q --ignore=tests/e2e` | **216 passed, 8 skipped** |
+
+Baseline: 216 passed. Current: 216 passed. Parity maintained.
+
+## Deviations from Plan
+
+**1. [Rule 2 - Missing annotation] AC-8.4 deferred-stage3 token not matched by _DEFERRED_RE**
+- **Found during:** Task 2 — running `--check` showed AC-8.4 as orphan despite its `deferred-stage3` Status
+- **Issue:** `_DEFERRED_RE = re.compile(r"deferred-hardware")` does not match `deferred-stage3`; the script only exempts hardware deferrals. The plan table lists AC-8.4 as orphan, confirming this was expected.
+- **Fix:** Added `pending-phase-4 (FDR-05)` to AC-8.4's Status line alongside the existing `deferred-stage3` text.
+- **Files modified:** `_docs/00_problem/acceptance_criteria.md`
+- **Commit:** a54a41c
+
+**2. [Rule 2 - Missing annotation] AC-3.1 and AC-3.2 not in plan's orphan mapping table**
+- **Found during:** Task 2 — these ACs were in the 02-04 orphan list but absent from Plan 02-05's mapping table.
+- **Fix:** Applied plan instruction ("choose the phase whose requirement set is closest"). Outlier tolerance (AC-3.1) and sharp-turn handling (AC-3.2) both require VPR recovery capabilities → assigned pending-phase-4 (VPR-01). Documented here per plan instruction.
+- **Files modified:** `_docs/00_problem/acceptance_criteria.md`
+- **Commit:** a54a41c
+
+**3. [Rule 2 - Missing annotation] AC-3.5 not in plan's orphan mapping table**
+- **Found during:** Task 2 — same situation as AC-3.1/3.2.
+- **Fix:** Visual blackout mode switch (AC-3.5) is fundamentally a safety-state-machine concern (dead_reckoned label, covariance growth, mode switch latency) → assigned pending-phase-3 (SAFE-02). Documented here per plan instruction.
+- **Files modified:** `_docs/00_problem/acceptance_criteria.md`
+- **Commit:** a54a41c
+
+**4. [Rule 1 - Stat header] Updated AC-TRACEABILITY.md header stat to reflect both deferred types**
+- **Found during:** Task 1 — the header `**ACs deferred to hardware:** 4` became misleading once pending-phase ACs were added.
+- **Fix:** Changed to `**ACs deferred (hardware or pending-phase):** 25` so the matrix accurately describes what "deferred" means post-annotation.
+- **Files modified:** `scripts/gen_ac_traceability.py`
+- **Commit:** a464697
+
+## Known Stubs
+
+None. All files are fully wired: CI YAML references real pytest commands and the real script path; nightly.yml uses real markers. No placeholder content.
+
+## Threat Flags
+
+None. No new network endpoints, auth paths, file access patterns, or schema changes at trust boundaries. CI YAML changes affect only the GitHub Actions runner environment.
+
+## Self-Check: PASSED
+
+- `scripts/gen_ac_traceability.py` has `_PENDING_RE`: FOUND
+- `.github/workflows/ci.yml` has 6 jobs: FOUND (lint, test-unit, test-integration, test-blackbox, ac-traceability, docker-build)
+- `.github/workflows/nightly.yml` exists with cron + sitl + e2e: FOUND
+- `_docs/00_problem/acceptance_criteria.md` has 21 `pending-phase-` annotations: FOUND
+- `.planning/AC-TRACEABILITY.md` regenerated and committed clean: FOUND
+- `python scripts/gen_ac_traceability.py --check` exit 0: VERIFIED
+- `git diff --exit-code .planning/AC-TRACEABILITY.md` exit 0: VERIFIED
+- `pytest tests/ -q --ignore=tests/e2e`: 216 passed: VERIFIED
+- Commits a464697, a54a41c, 2f360ec, a2a9c2c, 61c39cc: FOUND in git log
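
The deferral handling described in Task 1 and Deviation 1 of the summary can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/gen_ac_traceability.py`: the two patterns are copied from the summary text, while `classify_status`, `render_cell`, the precedence of hardware over pending-phase, the `ORPHAN` fallback label, and the sample Status strings are assumptions made for the example.

```python
import re

# Patterns as quoted in the summary text.
_DEFERRED_RE = re.compile(r"deferred-hardware")
_PENDING_RE = re.compile(r"pending-phase-\d+", re.IGNORECASE)


def classify_status(status_line: str) -> dict:
    """Hypothetical helper: derive the deferral fields for one AC Status line."""
    entry = {"deferred": False, "deferred_reason": None}
    if _DEFERRED_RE.search(status_line):
        # Assumption: a hardware deferral takes precedence if both tokens appear.
        entry["deferred"] = True
        entry["deferred_reason"] = "hardware"
        return entry
    pending = _PENDING_RE.search(status_line)
    if pending:
        entry["deferred"] = True
        # Assumption: the reason is the matched token, normalised to lower case.
        entry["deferred_reason"] = pending.group(0).lower()
    return entry


def render_cell(entry: dict) -> str:
    """Hypothetical renderer for the `DEFERRED ({reason})` matrix cell."""
    if entry["deferred"]:
        return f"DEFERRED ({entry['deferred_reason']})"
    # Assumption: anything that is neither deferred nor tested counts as an orphan.
    return "ORPHAN"


if __name__ == "__main__":
    samples = [
        "deferred-hardware",
        "deferred-stage3",
        "deferred-stage3; pending-phase-4 (FDR-05)",
    ]
    for status in samples:
        print(f"{status!r:45} -> {render_cell(classify_status(status))}")
```

Under these assumptions a Status line containing only `deferred-stage3` is still classified as an orphan, which is consistent with AC-8.4 needing the additional `pending-phase-4 (FDR-05)` annotation before `--check` could pass.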