[AZ-899] [AZ-900] [AZ-901] Baseline doc + retro gate + EVIDENCE_OUT fix

AZ-899: create _docs/02_document/architecture_compliance_baseline.md
seeded with 0 violations and the 2026-05-20 structural snapshot facts
(15 inventory entries, 0 import cycles, 5 contract files). Documents
the append-on-violation / mark-resolved-on-fix / snapshot-refresh
protocol so cumulative reviews can emit Baseline Delta sections.
Closes cycle-1 retro Top-3 #3 (third attempt).

AZ-900: codify LESSONS 2026-05-26 [process] in
.cursor/skills/autodev/flows/existing-code.md - Re-Entry After
Completion now hosts a Previous-Cycle Retro Existence Gate that
BLOCKS the cycle increment if no _docs/06_metrics/retro_*.md file
dated within [cycle_start, cycle_end] exists. Skipped on
state.cycle == 1. Presents Choose A (author retro) / B (stub +
leftover) / C (abort). state.md - Session Boundaries gains a
cross-reference bullet.

AZ-901: fix e2e/runner/conftest.py:56 EVIDENCE_OUT default - host
pytest now resolves <repo_root>/e2e-results/evidence/ instead of
/e2e-results/evidence (container-only path; crashed on macOS /
non-root Linux). Docker + Jetson harnesses unaffected (they pass
--evidence-out explicitly). Verified locally: 24 SKIPPED, exit 0,
evidence written. Closes leftover 2026-05-26_evidence_out_default_path.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-26 17:18:54 +03:00
parent 940066bee2
commit aa8b9f2ee9
8 changed files with 190 additions and 54 deletions
+53 -1
@@ -326,7 +326,7 @@ After retrospective completes:
**Re-Entry After Completion**
State-driven: `state.step == done` OR Step 17 (Retrospective) is completed for `state.cycle` AND Step 16.5 verdict was `Released` or `Released-with-override`. A `Rolled-Back` cycle does NOT trigger Re-Entry — the user must explicitly invoke `/autodev` again.
-Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:
+Action: The project completed a full cycle. Before incrementing the cycle counter, run the **Previous-Cycle Retro Existence Gate** below. If the gate passes (or its `state.cycle == 1` early exit applies), print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:
```
══════════════════════════════════════
@@ -341,6 +341,58 @@ Set `step: 9`, `status: not_started`, and **increment `cycle`** (`cycle: state.c
Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security → Performance → Deploy → Release → Retrospective. The cycle only completes (and loops back to Step 9) on a `Released` or `Released-with-override` verdict; rolled-back or aborted releases stop the cycle.
---
**Previous-Cycle Retro Existence Gate** (AZ-900, codifies LESSONS 2026-05-26 [process])
Trigger: run this gate at the start of Re-Entry After Completion, BEFORE the `cycle: state.cycle + 1` increment in the state file.
Early-exit: if `state.cycle == 1`, the gate is **skipped** — cycle 1 has no previous cycle whose retro could exist. (A greenfield → existing-code transition on first entry to Phase B falls in this branch.)
Otherwise (`state.cycle >= 2`):
1. **Compute the date range for the cycle just completing.**
- `cycle_start = ` modification date of the latest `_docs/03_implementation/implementation_report_*_cycle{state.cycle-1}.md` file. If no implementation report exists for the previous cycle (e.g. cycle was rolled back at Step 16.5), use the modification date of the latest `_docs/06_metrics/retro_*.md` file as a lower bound, or fall back to "yesterday" if neither exists.
- `cycle_end = ` today (the date at which the gate runs).
2. **Glob for the retro file**: `_docs/06_metrics/retro_*.md`, parse the `YYYY-MM-DD` portion of each filename, and check whether **any** file's date lies in the inclusive range `[cycle_start, cycle_end]`.
3. **If at least one retro file is in range** → gate PASSES → continue with the cycle increment.
4. **If no retro file is in range** → gate BLOCKS → play the notification sound per `.cursor/rules/human-attention-sound.mdc` and present the Choose block below.
```
══════════════════════════════════════
RETRO MISSING for cycle <state.cycle>
══════════════════════════════════════
No `_docs/06_metrics/retro_*.md` file dated
within [<cycle_start>, <cycle_end>] was found.
Per LESSONS 2026-05-26 [process], the cycle
must close with a retro before cycle <state.cycle+1>
can start.
══════════════════════════════════════
A) Author the missing retro now (invoke
.cursor/skills/retrospective/SKILL.md in
cycle-end mode against cycle <state.cycle>,
then re-run this gate)
B) Stub a backfilled retro and proceed (file
a leftover entry under
_docs/_process_leftovers/<YYYY-MM-DD>_retro_backfill_cycle<N>.md
naming what data is missing; create
_docs/06_metrics/retro_<today>_backfill_cycle<N>.md
with the available data; then continue
the cycle increment)
C) Abort and ask the user
══════════════════════════════════════
Recommendation: A — a real retro keeps
LESSONS.md honest; B is a last resort when
cycle data is genuinely unrecoverable.
══════════════════════════════════════
```
- **On A** → invoke `.cursor/skills/retrospective/SKILL.md` in cycle-end mode with `cycle: state.cycle`. When it completes successfully, re-run the gate from step 1; on PASS, continue. If retrospective itself fails, follow standard Failure Handling (`protocols.md`).
- **On B** → create the leftover entry and stub retro as documented in the option label, then continue with the cycle increment. Surface this in the Status Summary footer of the next session via the leftovers folder.
- **On C** → STOP. Do not increment `cycle`. Leave `state.step == done` so the user re-invokes `/autodev` after writing the retro by hand.
Gate scope: this gate fires ONLY in `existing-code` flow. `greenfield` has no cycle counter (single Done step). `meta-repo` has no cycle counter (its cadence is `monorepo-status` re-runs, not feature cycles).
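The gate's decision logic (steps 2 to 4 above) can be sketched in Python. This is a minimal illustration, not the orchestrator's actual implementation: the function names `retro_dates` and `retro_gate` and the string verdicts are hypothetical, and `cycle_start` / `cycle_end` are assumed to have been computed already as described in step 1.

```python
import re
from datetime import date
from pathlib import Path

# Matches the YYYY-MM-DD portion at the start of names such as
# retro_2026-05-26.md or retro_2026-05-26_backfill_cycle3.md.
RETRO_DATE = re.compile(r"retro_(\d{4}-\d{2}-\d{2})")


def retro_dates(metrics_dir: Path) -> list:
    """Collect the dates encoded in retro_*.md filenames (hypothetical helper)."""
    found = []
    for path in sorted(metrics_dir.glob("retro_*.md")):
        match = RETRO_DATE.match(path.name)
        if match is None:
            continue  # filename carries no parseable date
        try:
            found.append(date.fromisoformat(match.group(1)))
        except ValueError:
            continue  # digits present but not a real calendar date
    return found


def retro_gate(cycle: int, metrics_dir: Path, cycle_start: date, cycle_end: date) -> str:
    """Return SKIPPED, PASS, or BLOCK for the cycle that is about to close."""
    if cycle == 1:
        return "SKIPPED"  # early exit: no previous cycle to check
    in_range = any(cycle_start <= d <= cycle_end for d in retro_dates(metrics_dir))
    return "PASS" if in_range else "BLOCK"
```

On `BLOCK` the caller would present the A/B/C Choose block and leave `cycle` untouched; on `PASS` or `SKIPPED` it would continue with the increment.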
## Auto-Chain Rules
### Phase A — One-time baseline setup
+4
@@ -146,6 +146,10 @@ A **session boundary** is a transition that explicitly breaks auto-chain. Which
**Invariant**: a flow row without the `Session boundary` marker auto-chains unconditionally. Missing marker = missing boundary.
**Cross-reference — content gates that can also stop auto-chain.** Some flow files declare additional gates that block a transition even when the row would otherwise auto-chain. These are NOT session boundaries (they don't end the conversation); they BLOCK the next step until a content prerequisite is satisfied. The orchestrator must respect them in the same place it respects session boundaries — between completing one step and starting the next. Currently declared:
- `existing-code` Re-Entry After Completion → **Previous-Cycle Retro Existence Gate** (AZ-900) — blocks the `cycle: state.cycle + 1` increment if no `_docs/06_metrics/retro_*.md` file dated within the closing cycle's range exists. Skipped when `state.cycle == 1`. Presents an A/B/C choice (author now / stub-and-leftover / abort) when triggered. Full spec: `.cursor/skills/autodev/flows/existing-code.md` § "Previous-Cycle Retro Existence Gate".
### Orchestrator mechanism at a boundary
1. Update the state file: mark the current step `completed`; set the next step with `status: not_started`; reset `sub_step: {phase: 0, name: awaiting-invocation, detail: ""}`; keep `retry_count: 0`.
@@ -0,0 +1,123 @@
# Architecture Compliance Baseline
> **Purpose.** Single canonical document against which every cumulative-review
> report (per `.cursor/skills/code-review/SKILL.md` Phase 7 + the implement
> skill's Step 14.5 cumulative review) computes its `## Baseline Delta` —
> the count of **carried-over**, **resolved**, and **newly-introduced**
> architecture violations. Without this file, cumulative reviews log
> "baseline not found → no Baseline Delta section emitted" and structural
> regressions are visible only pairwise per batch instead of cumulatively.
**Baseline established**: 2026-05-26 (cycle-4 Step 10, batch 1, AZ-899)
**Source-of-truth snapshot**: `_docs/06_metrics/structure_2026-05-20.md`
**Initial violation count**: **0**
**Cycle of last refresh**: 4
## Source
The "0 violations" claim is grounded in the structural facts captured by the
cycle-1-close snapshot (`_docs/06_metrics/structure_2026-05-20.md`):
| Fact | Value |
|------|-------|
| Inventory entries | 15 (14 production components C1-C13 + 1 cross-cutting `helpers/runtime_root` row) |
| Import cycles in component graph | 0 (verified across batches 88-92 cumulative reviews; no back-edges) |
| Contract files | 5 (`fdr_record_schema.md`, `fdr_client_protocol.md`, `log_record_schema.md`, the `shared_satellite_provider_ingest/` placeholder, `shared_flights_api/`) |
| `_STRATEGY_REGISTRY` composition seam | `runtime_root.airborne_bootstrap` + `runtime_root.operator_bootstrap` (single composition root per binary, ADR-009) |
| Layering rule | Layer-3 → Layer-4 imports **BANNED**; AZ-507 cross-component contract surface enforced by `tests/unit/test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies` lint |
The architecture is documented in `_docs/02_document/architecture.md` (ADR-001
monolith, ADR-002 build-time exclusion, ADR-009 interface-first DI,
ADR-011 single-image live+replay). File ownership is documented in
`_docs/02_document/module-layout.md`.
## Violations
*None at baseline.*
This section is the append target for every cumulative-review run that
detects an architecture finding (severity ≥ Medium, category =
`Architecture`). The append schema is documented under § Update Protocol
below.
## Update Protocol
### When a cumulative review finds a NEW architecture violation
The reviewing skill (typically `.cursor/skills/code-review/SKILL.md` Phase 7,
invoked from the implement skill's Step 14.5 cumulative review at every K=3
batches) MUST append a row to § Violations using this schema:
| Field | Example |
|-------|---------|
| Finding ID | `arch-2026-06-15-1` (date + sequence within the day) |
| Batch range | `batches 17-19 cycle 4` |
| Severity | `High` / `Medium` (Critical findings escalate immediately; Low findings stay in the per-batch report) |
| Subcategory | `import-cycle` / `cross-component-import` / `parallel-pipeline` / `layer-violation` / `seam-bypass` |
| File:line | `src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py:117` |
| One-line summary | `c2_vpr imports c6_tile_cache directly, bypassing the consumer-side Protocol cut required by AZ-507` |
| Cumulative-review report | `_docs/03_implementation/cumulative_review_batches_17-19_cycle4_report.md` |
| Status | `OPEN` (newly introduced) |
The append happens IN THIS FILE, not in the cumulative-review report. The
cumulative-review report references this file's row by Finding ID.
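As an illustration of the Finding ID convention (date plus sequence within the day), the next ID could be derived from the IDs already present in § Violations. This is a sketch under that assumption only; the helper name `next_finding_id` is hypothetical and is not part of any existing skill.

```python
from datetime import date


def next_finding_id(existing_ids, today: date) -> str:
    """Return the next `arch-<YYYY-MM-DD>-<seq>` ID for the given day."""
    prefix = f"arch-{today.isoformat()}-"
    # Sequence numbers already used on this day; malformed IDs are ignored.
    used = [
        int(fid[len(prefix):])
        for fid in existing_ids
        if fid.startswith(prefix) and fid[len(prefix):].isdigit()
    ]
    return f"{prefix}{max(used, default=0) + 1}"
```

The sequence restarts at 1 on each new date, so IDs from earlier days never affect the result.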
### When a violation is resolved
Update the violating row in place: change `Status: OPEN` to
`Status: RESOLVED in batch <N> cycle <M> via <commit-hash>`. Do NOT delete
the row — the audit trail must show both the introduction and the
resolution.
### When the structural snapshot is refreshed
Any cycle that materially changes structure — new component, new
cross-component edge, new contract file, new composition root — re-snapshots
to a fresh `_docs/06_metrics/structure_<YYYY-MM-DD>.md` (the cycle-end
retrospective triggers this when the diff is non-trivial). When that
happens:
1. Update the `**Source-of-truth snapshot**` header pointer at the top of
this file to the new file.
2. Update the `Cycle of last refresh` header to the cycle that produced the
new snapshot.
3. Update the § Source table values (inventory-entry count, import-cycle
count, contract-file count) to match the new snapshot.
4. Do NOT clear § Violations — open findings carry across snapshots.
Resolution status is per-finding, not per-snapshot.
The refresh script is the same one that produced `structure_2026-05-20.md`
(approach: count `src/gps_denied_onboard/components/*/` directories +
`src/gps_denied_onboard/runtime_root/` + `helpers/`; run the AZ-270
composition-root lint to detect cycles; enumerate
`_docs/02_document/contracts/` subdirectories). If the script has been
extracted into `tools/structure_snapshot.py` between cycles, use it;
otherwise the manual approach is documented at the top of the source
snapshot file.
## Baseline Delta — how cumulative-review reports consume this file
Every cumulative-review report MUST emit a `## Baseline Delta` section with
three counts derived from this file:
- **Carried-over**: count of rows that already had `Status: OPEN` (or
`Status: ACCEPTED-RISK`) at the start of this review's batch window and
whose status did not change during it.
- **Resolved**: count of rows that transitioned from `OPEN` to
`RESOLVED in batch ...` during this review's batch window.
- **Newly-introduced**: count of rows added during this review's batch
window.
An empty Baseline Delta (`0 new, 0 resolved, 0 carried-over`) is still
emitted — its presence confirms the cumulative-review consulted the
baseline rather than silently skipping the section as in cycles 1-3.
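The three counts can be sketched as follows. This is only one reading of the definitions above: the `Violation` record, its integer batch fields, and `baseline_delta` are hypothetical names, and batches are assumed to be numbered on a single increasing scale, which ignores the per-cycle batch numbering the real reports use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Violation:
    """One row of the Violations section (hypothetical in-memory shape)."""
    finding_id: str
    introduced_batch: int
    resolved_batch: Optional[int] = None  # None while the row is still OPEN


def baseline_delta(rows, first_batch: int, last_batch: int) -> dict:
    """Count carried-over, resolved, and newly-introduced rows for a window."""
    def in_window(batch: Optional[int]) -> bool:
        return batch is not None and first_batch <= batch <= last_batch

    carried = sum(
        1 for r in rows
        if r.introduced_batch < first_batch
        and (r.resolved_batch is None or r.resolved_batch > last_batch)
    )
    resolved = sum(1 for r in rows if in_window(r.resolved_batch))
    new = sum(1 for r in rows if in_window(r.introduced_batch))
    return {"carried_over": carried, "resolved": resolved, "newly_introduced": new}
```

With no rows at all, every count is zero, which corresponds to the empty Baseline Delta that must still be emitted.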
## References
- Cycle-3 retro § Top 3 Improvement Actions #3 — `_docs/06_metrics/retro_2026-05-26.md`
- Cycle-1 retro § Top 3 Improvement Actions #3 (original) — `_docs/06_metrics/retro_2026-05-20.md`
- Source snapshot — `_docs/06_metrics/structure_2026-05-20.md`
- Existing-code flow Step 2 — `.cursor/skills/autodev/flows/existing-code.md` § "Step 2 — Architecture Baseline Scan"
- Implement skill Step 14.5 — `.cursor/skills/implement/SKILL.md` § "Cumulative Code Review (every K batches)"
- Architecture doc — `_docs/02_document/architecture.md`
- Module-layout — `_docs/02_document/module-layout.md`
@@ -1,51 +0,0 @@
# Leftover: EVIDENCE_OUT default is a hardcoded container path
**Created**: 2026-05-26
**Last replay attempt**: 2026-05-26
**Category**: Test infrastructure defect (non-tracker leftover — code fix, not a deferred tracker write)
**Surfaced by**: autodev cycle 3 Step 15 (Performance Test) — `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` "Findings worth tracking" item 3.
## Problem
`e2e/runner/conftest.py:56`:
```python
default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")
```
The default path `/e2e-results/evidence` is the container mount inside the Tier-1 Docker harness and the Tier-2 Jetson run script. On a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly (no Docker, no Jetson), this hook fires in `nfr_recorder.pytest_sessionfinish` and tries to create the directory, failing with:
```
OSError: [Errno 30] Read-only file system: '/e2e-results'
```
(macOS — the volume `/` is read-only at the filesystem root.) On Linux hosts it would fail with `PermissionError` for the same reason — `/e2e-results` is not writable by a non-root user.
## Workaround (used today)
```bash
EVIDENCE_OUT="$(pwd)/e2e-results/cycle3-tier1-probe/evidence" \
python -m pytest e2e/tests/performance/ -v --tb=short
```
This produced a clean exit-0 run with the expected 24 SKIPPED outcomes.
## Proposed fix
Change `e2e/runner/conftest.py:56` to default to a workspace-relative path when neither `--evidence-out` nor `EVIDENCE_OUT` is set. Two viable shapes:
1. **Workspace-relative default**: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))`.
2. **Lazy fallback inside the recorder**: leave the default unset; if `evidence_dir` is `None` at session finish, skip emission and warn — useful for `--collect-only` or smoke runs where evidence output is genuinely not needed.
Either shape preserves backward compatibility with the Docker / Jetson scripts (they pass `--evidence-out` explicitly).
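To make the precedence in shape 1 concrete, here is a small sketch of how the evidence directory would resolve: an explicit `--evidence-out` value wins, then the `EVIDENCE_OUT` environment variable, then the workspace-relative default. The function `resolve_evidence_dir` and the example path `/work/repo` are hypothetical; the sketch uses `PurePosixPath` so it does not touch the filesystem, whereas the real option default would use `Path(__file__).resolve()`.

```python
from pathlib import PurePosixPath
from typing import Mapping, Optional


def resolve_evidence_dir(
    cli_value: Optional[str],
    environ: Mapping[str, str],
    conftest_path: PurePosixPath,
) -> str:
    """Resolve the evidence directory: CLI flag, then env var, then repo default."""
    if cli_value is not None:
        return cli_value  # Docker / Jetson harnesses take this branch
    # For <repo_root>/e2e/runner/conftest.py, parents[2] is <repo_root>.
    repo_root = conftest_path.parents[2]
    return environ.get("EVIDENCE_OUT", str(repo_root / "e2e-results" / "evidence"))
```

A host-direct run with neither the flag nor the variable set therefore lands under the checkout, not at the container mount path.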
## Why not fix in this cycle
Per `coderule.mdc` § Scope discipline: "Unrelated issues elsewhere: do not silently fix them as part of this task. Either note them to the user at end of turn and ASK before expanding scope, or record in `_docs/_process_leftovers/` for later handling." Cycle 3 was pre-flight / route-driven seeding work; the EVIDENCE_OUT default has no relationship to that scope. Recording here for either:
- Next cycle's New Task step to pick up as a small (~1 pt) housekeeping ticket, OR
- A drive-by fix during the next test-infrastructure touch (e.g. when AZ-444 Tier-2 harness lands).
## Replay condition
This is a **code-fix leftover**, not a tracker-write leftover. There is nothing to "replay against the tracker". Resolution = land the conftest change above and verify a Tier-1 host run of `pytest e2e/tests/performance/` exits cleanly without `EVIDENCE_OUT` pre-set. Once that PR merges, delete this leftover.
+10 -2
@@ -53,8 +53,16 @@ def pytest_addoption(parser: pytest.Parser) -> None:
     group.addoption(
         "--evidence-out",
         action="store",
-        default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence"),
-        help="Directory the evidence bundler writes per-run artifacts to.",
+        default=os.environ.get(
+            "EVIDENCE_OUT",
+            str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"),
+        ),
+        help="Directory the evidence bundler writes per-run artifacts to. "
+        "Default resolves to <repo_root>/e2e-results/evidence so host-direct "
+        "pytest runs don't crash on the container-mount path "
+        "/e2e-results/evidence (which is read-only on macOS, "
+        "non-writable on Linux). Docker / Jetson harnesses override this "
+        "explicitly via --evidence-out=/e2e-results/run-... (AZ-901).",
     )
     group.addoption(
         "--allow-no-skip-reason",