[AZ-899] [AZ-900] [AZ-901] Baseline doc + retro gate + EVIDENCE_OUT fix

AZ-899: create _docs/02_document/architecture_compliance_baseline.md seeded with 0 violations and the 2026-05-20 structural snapshot facts (15 inventory entries, 0 import cycles, 5 contract files). Documents the append-on-violation / mark-resolved-on-fix / snapshot-refresh protocol so cumulative reviews can emit Baseline Delta sections. Closes cycle-1 retro Top-3 #3 (third attempt).

AZ-900: codify LESSONS 2026-05-26 [process] in .cursor/skills/autodev/flows/existing-code.md - Re-Entry After Completion now hosts a Previous-Cycle Retro Existence Gate that BLOCKS the cycle increment if no _docs/06_metrics/retro_*.md file dated within [cycle_start, cycle_end] exists. Skipped on state.cycle == 1. Presents Choose A (author retro) / B (stub + leftover) / C (abort). state.md - Session Boundaries gains a cross-reference bullet.

AZ-901: fix e2e/runner/conftest.py:56 EVIDENCE_OUT default - host pytest now resolves <repo_root>/e2e-results/evidence/ instead of /e2e-results/evidence (container-only path; crashed on macOS / non-root Linux). Docker + Jetson harnesses unaffected (they pass --evidence-out explicitly). Verified locally: 24 SKIPPED, exit 0, evidence written. Closes leftover 2026-05-26_evidence_out_default_path.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-21 05:41:13 +00:00 · 2026-05-26 17:18:54 +03:00
parent 940066bee2
commit aa8b9f2ee9
8 changed files with 190 additions and 54 deletions
@@ -326,7 +326,7 @@ After retrospective completes:
 **Re-Entry After Completion**
 State-driven: `state.step == done` OR Step 17 (Retrospective) is completed for `state.cycle` AND Step 16.5 verdict was `Released` or `Released-with-override`. A `Rolled-Back` cycle does NOT trigger Re-Entry — the user must explicitly invoke `/autodev` again.

-Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:
+Action: The project completed a full cycle. Before incrementing the cycle counter, run the **Previous-Cycle Retro Existence Gate** below. If the gate passes (or its `state.cycle == 1` early exit applies), print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:

 ```
 ══════════════════════════════════════
@@ -341,6 +341,58 @@ Set `step: 9`, `status: not_started`, and **increment `cycle`** (`cycle: state.c

 Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security → Performance → Deploy → Release → Retrospective. The cycle only completes (and loops back to Step 9) on a `Released` or `Released-with-override` verdict; rolled-back or aborted releases stop the cycle.

+---
+
+**Previous-Cycle Retro Existence Gate** (AZ-900, codifies LESSONS 2026-05-26 [process])
+
+Trigger: run this gate at the start of Re-Entry After Completion, BEFORE the `cycle: state.cycle + 1` increment in the state file.
+
+Early-exit: if `state.cycle == 1`, the gate is **skipped** — cycle 1 has no previous cycle whose retro could exist. (A greenfield → existing-code transition on first entry to Phase B falls in this branch.)
+
+Otherwise (`state.cycle >= 2`):
+
+1. **Compute the date range for the cycle just completing.**
+   - `cycle_start = ` modification date of the latest `_docs/03_implementation/implementation_report_*_cycle{state.cycle-1}.md` file. If no implementation report exists for the previous cycle (e.g. cycle was rolled back at Step 16.5), use the modification date of the latest `_docs/06_metrics/retro_*.md` file as a lower bound, or fall back to "yesterday" if neither exists.
+   - `cycle_end = ` today (the date at which the gate runs).
+2. **Glob for the retro file**: `_docs/06_metrics/retro_*.md`, parse the `YYYY-MM-DD` portion of each filename, and check whether **any** file's date lies in the inclusive range `[cycle_start, cycle_end]`.
+3. **If at least one retro file is in range** → gate PASSES → continue with the cycle increment.
+4. **If no retro file is in range** → gate BLOCKS → play the notification sound per `.cursor/rules/human-attention-sound.mdc` and present the Choose block below.
+
+```
+══════════════════════════════════════
+ RETRO MISSING for cycle <state.cycle>
+══════════════════════════════════════
+ No `_docs/06_metrics/retro_*.md` file dated
+ within [<cycle_start>, <cycle_end>] was found.
+ Per LESSONS 2026-05-26 [process], the cycle
+ must close with a retro before cycle <state.cycle+1>
+ can start.
+══════════════════════════════════════
+ A) Author the missing retro now (invoke
+    .cursor/skills/retrospective/SKILL.md in
+    cycle-end mode against cycle <state.cycle>,
+    then re-run this gate)
+ B) Stub a backfilled retro and proceed (file
+    a leftover entry under
+    _docs/_process_leftovers/<YYYY-MM-DD>_retro_backfill_cycle<N>.md
+    naming what data is missing; create
+    _docs/06_metrics/retro_<today>_backfill_cycle<N>.md
+    with the available data; then continue
+    the cycle increment)
+ C) Abort and ask the user
+══════════════════════════════════════
+ Recommendation: A — a real retro keeps
+ LESSONS.md honest; B is a last resort when
+ cycle data is genuinely unrecoverable.
+══════════════════════════════════════
+```
+
+- **On A** → invoke `.cursor/skills/retrospective/SKILL.md` in cycle-end mode with `cycle: state.cycle`. When it completes successfully, re-run the gate from step 1; on PASS, continue with the cycle increment. If the retrospective itself fails, follow standard Failure Handling (`protocols.md`).
+- **On B** → create the leftover entry and stub retro as documented in the option label, then continue with the cycle increment. Surface this in the Status Summary footer of the next session via the leftovers folder.
+- **On C** → STOP. Do not increment `cycle`. Leave `state.step == done` so the user re-invokes `/autodev` after writing the retro by hand.
+
+Gate scope: this gate fires ONLY in `existing-code` flow. `greenfield` has no cycle counter (single Done step). `meta-repo` has no cycle counter (its cadence is `monorepo-status` re-runs, not feature cycles).
+
 ## Auto-Chain Rules

 ### Phase A — One-time baseline setup
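
The range check at the heart of the Previous-Cycle Retro Existence Gate added above (steps 2 to 4) can be sketched in Python. This is a minimal illustration, not the orchestrator's implementation: the function names `retro_dates` and `retro_gate` and the `"SKIPPED"` / `"PASS"` / `"BLOCK"` return strings are hypothetical, and `cycle_start` / `cycle_end` are assumed to have been computed already as described in step 1.

```python
from __future__ import annotations

import re
from datetime import date
from pathlib import Path

# Matches the YYYY-MM-DD portion that follows the "retro_" prefix.
RETRO_NAME = re.compile(r"^retro_(\d{4}-\d{2}-\d{2})")


def retro_dates(metrics_dir: Path) -> list[date]:
    """Parse the date out of every retro_*.md filename in metrics_dir."""
    found = []
    for path in metrics_dir.glob("retro_*.md"):
        match = RETRO_NAME.match(path.name)
        if not match:
            continue  # e.g. retro_notes.md carries no date
        try:
            found.append(date.fromisoformat(match.group(1)))
        except ValueError:
            continue  # digits present but not a real calendar date
    return sorted(found)


def retro_gate(cycle: int, cycle_start: date, cycle_end: date,
               metrics_dir: Path) -> str:
    """Return "SKIPPED", "PASS", or "BLOCK" (hypothetical labels)."""
    if cycle == 1:
        return "SKIPPED"  # early exit: the gate does not run on cycle 1
    # Inclusive range check, as in step 2 of the gate.
    in_range = [d for d in retro_dates(metrics_dir)
                if cycle_start <= d <= cycle_end]
    return "PASS" if in_range else "BLOCK"
```

Only a `"PASS"` or `"SKIPPED"` result would let the `cycle: state.cycle + 1` increment proceed; `"BLOCK"` corresponds to presenting the A/B/C Choose block. A backfilled stub named `retro_<today>_backfill_cycle<N>.md` (option B) still matches the pattern, because only the date prefix is parsed.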
@@ -146,6 +146,10 @@ A **session boundary** is a transition that explicitly breaks auto-chain. Which

 **Invariant**: a flow row without the `Session boundary` marker auto-chains unconditionally. Missing marker = missing boundary.

+**Cross-reference — content gates that can also stop auto-chain.** Some flow files declare additional gates that block a transition even when the row would otherwise auto-chain. These are NOT session boundaries (they don't end the conversation); they BLOCK the next step until a content prerequisite is satisfied. The orchestrator must respect them in the same place it respects session boundaries — between completing one step and starting the next. Currently declared:
+
+- `existing-code` Re-Entry After Completion → **Previous-Cycle Retro Existence Gate** (AZ-900) — blocks the `cycle: state.cycle + 1` increment if no `_docs/06_metrics/retro_*.md` file dated within the closing cycle's range exists. Skipped when `state.cycle == 1`. Presents an A/B/C choice (author now / stub-and-leftover / abort) when triggered. Full spec: `.cursor/skills/autodev/flows/existing-code.md` § "Previous-Cycle Retro Existence Gate".
+
 ### Orchestrator mechanism at a boundary

 1. Update the state file: mark the current step `completed`; set the next step with `status: not_started`; reset `sub_step: {phase: 0, name: awaiting-invocation, detail: ""}`; keep `retry_count: 0`.
@@ -0,0 +1,123 @@
+# Architecture Compliance Baseline
+
+> **Purpose.** Single canonical document against which every cumulative-review
+> report (per `.cursor/skills/code-review/SKILL.md` Phase 7 + the implement
+> skill's Step 14.5 cumulative review) computes its `## Baseline Delta` —
+> the count of **carried-over**, **resolved**, and **newly-introduced**
+> architecture violations. Without this file, cumulative reviews log
+> "baseline not found → no Baseline Delta section emitted" and structural
+> regressions are visible only pairwise per batch instead of cumulatively.
+
+**Baseline established**: 2026-05-26 (cycle-4 Step 10, batch 1, AZ-899)
+**Source-of-truth snapshot**: `_docs/06_metrics/structure_2026-05-20.md`
+**Initial violation count**: **0**
+**Cycle of last refresh**: 4
+
+## Source
+
+The "0 violations" claim is grounded in the structural facts captured by the
+cycle-1-close snapshot (`_docs/06_metrics/structure_2026-05-20.md`):
+
+| Fact | Value |
+|------|-------|
+| Inventory entries | 15 (14 production components C1–C13 + 1 cross-cutting `helpers/runtime_root` row) |
+| Import cycles in component graph | 0 (verified across batches 88–92 cumulative reviews; no back-edges) |
+| Contract files | 5 (`fdr_record_schema.md`, `fdr_client_protocol.md`, `log_record_schema.md`, the `shared_satellite_provider_ingest/` placeholder, `shared_flights_api/`) |
+| `_STRATEGY_REGISTRY` composition seam | `runtime_root.airborne_bootstrap` + `runtime_root.operator_bootstrap` (single composition root per binary, ADR-009) |
+| Layering rule | Layer-3 → Layer-4 imports **BANNED**; AZ-507 cross-component contract surface enforced by `tests/unit/test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies` lint |
+
+The architecture is documented in `_docs/02_document/architecture.md` (ADR-001
+monolith, ADR-002 build-time exclusion, ADR-009 interface-first DI,
+ADR-011 single-image live+replay). File ownership is documented in
+`_docs/02_document/module-layout.md`.
+
+## Violations
+
+*None at baseline.*
+
+This section is the append target for every cumulative-review run that
+detects an architecture finding (severity ≥ Medium, category =
+`Architecture`). The append schema is documented under § Update Protocol
+below.
+
+## Update Protocol
+
+### When a cumulative review finds a NEW architecture violation
+
+The reviewing skill (typically `.cursor/skills/code-review/SKILL.md` Phase 7,
+invoked from the implement skill's Step 14.5 cumulative review at every K=3
+batches) MUST append a row to § Violations using this schema:
+
+| Field | Example |
+|-------|---------|
+| Finding ID | `arch-2026-06-15-1` (date + sequence within the day) |
+| Batch range | `batches 17–19 cycle 4` |
+| Severity | `High` / `Medium` (Critical findings escalate immediately; Low findings stay in the per-batch report) |
+| Subcategory | `import-cycle` / `cross-component-import` / `parallel-pipeline` / `layer-violation` / `seam-bypass` |
+| File:line | `src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py:117` |
+| One-line summary | `c2_vpr imports c6_tile_cache directly, bypassing the consumer-side Protocol cut required by AZ-507` |
+| Cumulative-review report | `_docs/03_implementation/cumulative_review_batches_17-19_cycle4_report.md` |
+| Status | `OPEN` (newly introduced) |
+
+The append happens IN THIS FILE, not in the cumulative-review report. The
+cumulative-review report references this file's row by Finding ID.
+
+### When a violation is resolved
+
+Update the violating row in place: change `Status: OPEN` to
+`Status: RESOLVED in batch <N> cycle <M> via <commit-hash>`. Do NOT delete
+the row — the audit trail must show both the introduction and the
+resolution.
+
+### When the structural snapshot is refreshed
+
+Any cycle that materially changes structure — new component, new
+cross-component edge, new contract file, new composition root — re-snapshots
+to a fresh `_docs/06_metrics/structure_<YYYY-MM-DD>.md` (the cycle-end
+retrospective triggers this when the diff is non-trivial). When that
+happens:
+
+1. Update the `**Source-of-truth snapshot**` header pointer at the top of
+   this file to the new file.
+2. Update the `Cycle of last refresh` header to the cycle that produced the
+   new snapshot.
+3. Update the § Source table values (inventory-entry count, import-cycle
+   count, contract-file count) to match the new snapshot.
+4. Do NOT clear § Violations — open findings carry across snapshots.
+   Resolution status is per-finding, not per-snapshot.
+
+The refresh script is the same one that produced `structure_2026-05-20.md`
+(approach: count `src/gps_denied_onboard/components/*/` directories +
+`src/gps_denied_onboard/runtime_root/` + `helpers/`; run the AZ-270
+composition-root lint to detect cycles; enumerate
+`_docs/02_document/contracts/` subdirectories). If the script has been
+extracted into `tools/structure_snapshot.py` between cycles, use it;
+otherwise the manual approach is documented at the top of the source
+snapshot file.
+
+## Baseline Delta — how cumulative-review reports consume this file
+
+Every cumulative-review report MUST emit a `## Baseline Delta` section with
+three counts derived from this file:
+
+- **Carried-over**: count of rows that were already `Status: OPEN` (or
+  `Status: ACCEPTED-RISK`) at the start of this review's batch window and
+  are still unresolved at its end.
+- **Resolved**: count of rows that transitioned from `OPEN` to
+  `RESOLVED in batch ...` during this review's batch window.
+- **Newly-introduced**: count of rows added during this review's batch
+  window.
+
+An empty Baseline Delta (`0 new, 0 resolved, 0 carried-over`) is still
+emitted — its presence confirms the cumulative-review consulted the
+baseline rather than silently skipping the section as in cycles 1–3.
+
+## References
+
+- Cycle-3 retro § Top 3 Improvement Actions #3 — `_docs/06_metrics/retro_2026-05-26.md`
+- Cycle-1 retro § Top 3 Improvement Actions #3 (original) — `_docs/06_metrics/retro_2026-05-20.md`
+- Source snapshot — `_docs/06_metrics/structure_2026-05-20.md`
+- Existing-code flow Step 2 — `.cursor/skills/autodev/flows/existing-code.md` § "Step 2 — Architecture Baseline Scan"
+- Implement skill Step 14.5 — `.cursor/skills/implement/SKILL.md` § "Cumulative Code Review (every K batches)"
+- Architecture doc — `_docs/02_document/architecture.md`
+- Module-layout — `_docs/02_document/module-layout.md`
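
The three Baseline Delta counts defined in the new baseline file can be illustrated with a short Python sketch. Several things here are assumptions rather than part of the commit: the `Violation` record and its `introduced_batch` / `resolved_batch` fields are hypothetical, batches are assumed to carry a single increasing number, only the `OPEN` and `RESOLVED` states are modelled, and "carried-over" is read as "added before the window and still open".

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Violation:
    finding_id: str
    introduced_batch: int                 # hypothetical: batch that appended the row
    resolved_batch: Optional[int] = None  # None while the row is still OPEN


def baseline_delta(rows: Iterable[Violation], window_start: int,
                   window_end: int) -> Dict[str, int]:
    """Count carried-over, resolved, and newly-introduced rows for one window."""
    carried = resolved = new = 0
    for row in rows:
        # Newly-introduced: the row was added during this batch window.
        if window_start <= row.introduced_batch <= window_end:
            new += 1
        # Resolved: the OPEN -> RESOLVED transition happened in this window.
        if (row.resolved_batch is not None
                and window_start <= row.resolved_batch <= window_end):
            resolved += 1
        # Carried-over: added before the window and still unresolved.
        if row.introduced_batch < window_start and row.resolved_batch is None:
            carried += 1
    return {"carried_over": carried, "resolved": resolved,
            "newly_introduced": new}
```

With an empty § Violations table the function returns all zeros, which matches the requirement that an empty Baseline Delta is still emitted. Under this reading a row that is both added and resolved inside one window is counted in two buckets; the baseline file does not say how that case should be handled.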
@@ -1,51 +0,0 @@
-# Leftover: EVIDENCE_OUT default is a hardcoded container path
-
-**Created**: 2026-05-26
-**Last replay attempt**: 2026-05-26
-**Category**: Test infrastructure defect (non-tracker leftover — code fix, not a deferred tracker write)
-**Surfaced by**: autodev cycle 3 Step 15 (Performance Test) — `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` "Findings worth tracking" item 3.
-
-## Problem
-
-`e2e/runner/conftest.py:56`:
-
-```python
-default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")
-```
-
-The default path `/e2e-results/evidence` is the container mount inside the Tier-1 Docker harness and the Tier-2 Jetson run script. On a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly (no Docker, no Jetson), this hook fires in `nfr_recorder.pytest_sessionfinish` and tries to create the directory, failing with:
-
-```
-OSError: [Errno 30] Read-only file system: '/e2e-results'
-```
-
-(macOS — the volume `/` is read-only at the filesystem root.) On Linux hosts it would fail with `PermissionError` for the same reason — `/e2e-results` is not writable by a non-root user.
-
-## Workaround (used today)
-
-```bash
-EVIDENCE_OUT="$(pwd)/e2e-results/cycle3-tier1-probe/evidence" \
-  python -m pytest e2e/tests/performance/ -v --tb=short
-```
-
-This produced a clean exit-0 run with the expected 24 SKIPPED outcomes.
-
-## Proposed fix
-
-Change `e2e/runner/conftest.py:56` to default to a workspace-relative path when neither `--evidence-out` nor `EVIDENCE_OUT` is set. Two viable shapes:
-
-1. **Workspace-relative default**: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))`.
-2. **Lazy fallback inside the recorder**: leave the default unset; if `evidence_dir` is `None` at session finish, skip emission and warn — useful for `--collect-only` or smoke runs where evidence output is genuinely not needed.
-
-Either shape preserves backward compatibility with the Docker / Jetson scripts (they pass `--evidence-out` explicitly).
-
-## Why not fix in this cycle
-
-Per `coderule.mdc` § Scope discipline: "Unrelated issues elsewhere: do not silently fix them as part of this task. Either note them to the user at end of turn and ASK before expanding scope, or record in `_docs/_process_leftovers/` for later handling." Cycle 3 was pre-flight / route-driven seeding work; the EVIDENCE_OUT default has no relationship to that scope. Recording here for either:
-
- Next cycle's New Task step to pick up as a small (~1 pt) housekeeping ticket, OR
- A drive-by fix during the next test-infrastructure touch (e.g. when AZ-444 Tier-2 harness lands).
-
-## Replay condition
-
-This is a **code-fix leftover**, not a tracker-write leftover. There is nothing to "replay against the tracker". Resolution = land the conftest change above and verify a Tier-1 host run of `pytest e2e/tests/performance/` exits cleanly without `EVIDENCE_OUT` pre-set. Once that PR merges, delete this leftover.
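
The workspace-relative default (shape 1 in the leftover above, and the shape this commit lands in `conftest.py`) follows a simple precedence: an explicit `--evidence-out` value, then the `EVIDENCE_OUT` environment variable, then `<repo_root>/e2e-results/evidence`. The sketch below isolates that precedence in a standalone helper so it can be checked without pytest; `resolve_evidence_out` is a hypothetical name and does not exist in the repository.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_evidence_out(cli_value: Optional[str], env: Mapping[str, str],
                         conftest_file: Path) -> Path:
    """Pick the evidence directory: CLI flag, then env var, then repo default."""
    # conftest.py lives at <repo_root>/e2e/runner/conftest.py, so parents[2]
    # of the resolved path is the repository root.
    repo_default = (Path(conftest_file).resolve().parents[2]
                    / "e2e-results" / "evidence")
    if cli_value is not None:
        return Path(cli_value)  # the Docker / Jetson harnesses take this branch
    # Mirrors os.environ.get("EVIDENCE_OUT", <default>) from the commit.
    return Path(env.get("EVIDENCE_OUT", str(repo_default)))
```

Calling it as `resolve_evidence_out(None, os.environ, Path(__file__))` from inside `e2e/runner/conftest.py` would yield the repo-relative directory on a host run with nothing pre-set, which is the case that previously failed on `/e2e-results`.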
@@ -53,8 +53,16 @@ def pytest_addoption(parser: pytest.Parser) -> None:
    group.addoption(
        "--evidence-out",
        action="store",
-        default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence"),
-        help="Directory the evidence bundler writes per-run artifacts to.",
+        default=os.environ.get(
+            "EVIDENCE_OUT",
+            str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"),
+        ),
+        help="Directory the evidence bundler writes per-run artifacts to. "
+        "Default resolves to <repo_root>/e2e-results/evidence so host-direct "
+        "pytest runs don't crash on the container-mount path "
+        "/e2e-results/evidence (which is read-only on macOS, "
+        "non-writable on Linux). Docker / Jetson harnesses override this "
+        "explicitly via --evidence-out=/e2e-results/run-... (AZ-901).",
    )
    group.addoption(
        "--allow-no-skip-reason",