[AZ-899] [AZ-900] [AZ-901] Baseline doc + retro gate + EVIDENCE_OUT fix

AZ-899: create _docs/02_document/architecture_compliance_baseline.md
seeded with 0 violations and the 2026-05-20 structural snapshot facts
(15 inventory entries, 0 import cycles, 5 contract files). Documents
the append-on-violation / mark-resolved-on-fix / snapshot-refresh
protocol so cumulative reviews can emit Baseline Delta sections.
Closes cycle-1 retro Top-3 #3 (third attempt).

AZ-900: codify LESSONS 2026-05-26 [process] in
.cursor/skills/autodev/flows/existing-code.md - Re-Entry After
Completion now hosts a Previous-Cycle Retro Existence Gate that
BLOCKS the cycle increment if no _docs/06_metrics/retro_*.md file
dated within [cycle_start, cycle_end] exists. Skipped on
state.cycle == 1. Presents Choose A (author retro) / B (stub +
leftover) / C (abort). state.md - Session Boundaries gains a
cross-reference bullet.

AZ-901: fix e2e/runner/conftest.py:56 EVIDENCE_OUT default - host
pytest now resolves <repo_root>/e2e-results/evidence/ instead of
/e2e-results/evidence (container-only path; crashed on macOS / non-
root Linux). Docker + Jetson harnesses unaffected (they pass
--evidence-out explicitly). Verified locally: 24 SKIPPED, exit 0,
evidence written. Closes leftover 2026-05-26_evidence_out_default_path.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-26 17:18:54 +03:00
parent 940066bee2
commit aa8b9f2ee9
8 changed files with 190 additions and 54 deletions
@@ -0,0 +1,123 @@
# Architecture Compliance Baseline
> **Purpose.** Single canonical document against which every cumulative-review
> report (per `.cursor/skills/code-review/SKILL.md` Phase 7 + the implement
> skill's Step 14.5 cumulative review) computes its `## Baseline Delta` —
> the count of **carried-over**, **resolved**, and **newly-introduced**
> architecture violations. Without this file, cumulative reviews log
> "baseline not found → no Baseline Delta section emitted" and structural
> regressions are visible only pairwise per batch instead of cumulatively.
**Baseline established**: 2026-05-26 (cycle-4 Step 10, batch 1, AZ-899)
**Source-of-truth snapshot**: `_docs/06_metrics/structure_2026-05-20.md`
**Initial violation count**: **0**
**Cycle of last refresh**: 4
## Source
The "0 violations" claim is grounded in the structural facts captured by the
cycle-1-close snapshot (`_docs/06_metrics/structure_2026-05-20.md`):
| Fact | Value |
|------|-------|
| Inventory entries | 15 (14 production components C1-C13 + 1 cross-cutting `helpers/runtime_root` row) |
| Import cycles in component graph | 0 (verified across batches 88-92 cumulative reviews; no back-edges) |
| Contract files | 5 (`fdr_record_schema.md`, `fdr_client_protocol.md`, `log_record_schema.md`, the `shared_satellite_provider_ingest/` placeholder, `shared_flights_api/`) |
| `_STRATEGY_REGISTRY` composition seam | `runtime_root.airborne_bootstrap` + `runtime_root.operator_bootstrap` (single composition root per binary, ADR-009) |
| Layering rule | Layer-3 → Layer-4 imports **BANNED**; AZ-507 cross-component contract surface enforced by `tests/unit/test_az270_compose_root.test_ac6_only_compose_root_imports_concrete_strategies` lint |
The architecture is documented in `_docs/02_document/architecture.md` (ADR-001
monolith, ADR-002 build-time exclusion, ADR-009 interface-first DI,
ADR-011 single-image live+replay). File ownership is documented in
`_docs/02_document/module-layout.md`.
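The "0 import cycles, no back-edges" fact in the table above can be checked mechanically. The following is a minimal sketch, not the project's actual lint: it assumes the component graph has already been extracted into a plain dict, and the function name `find_import_cycle` and the example edges are hypothetical.

```python
from typing import Dict, List, Optional


def find_import_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a node list (first == last), or None.

    `graph` maps a component name to the components it imports. This input
    shape is an assumption made for illustration.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited / on the current DFS path / done
    colour = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for dep in graph.get(node, []):
            state = colour.get(dep, WHITE)
            if state == GREY:
                # Back-edge: dep is already on the current path, so we have a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(graph):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Illustrative edges only; the real graph comes from the structural snapshot.
acyclic = {"c2_vpr": ["c6_tile_cache"], "c6_tile_cache": []}
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}
```

A `None` result corresponds to the "0" recorded in the table; any returned path would be a candidate `import-cycle` finding.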
## Violations
*None at baseline.*
This section is the append target for every cumulative-review run that
detects an architecture finding (severity ≥ Medium, category =
`Architecture`). The append schema is documented under § Update Protocol
below.
## Update Protocol
### When a cumulative review finds a NEW architecture violation
The reviewing skill (typically `.cursor/skills/code-review/SKILL.md` Phase 7,
invoked from the implement skill's Step 14.5 cumulative review, which runs
every K=3 batches) MUST append a row to § Violations using this schema:
| Field | Example |
|-------|---------|
| Finding ID | `arch-2026-06-15-1` (date + sequence within the day) |
| Batch range | `batches 17-19 cycle 4` |
| Severity | `High` / `Medium` (Critical findings escalate immediately; Low findings stay in the per-batch report) |
| Subcategory | `import-cycle` / `cross-component-import` / `parallel-pipeline` / `layer-violation` / `seam-bypass` |
| File:line | `src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py:117` |
| One-line summary | `c2_vpr imports c6_tile_cache directly, bypassing the consumer-side Protocol cut required by AZ-507` |
| Cumulative-review report | `_docs/03_implementation/cumulative_review_batches_17-19_cycle4_report.md` |
| Status | `OPEN` (newly introduced) |
The append happens IN THIS FILE, not in the cumulative-review report. The
cumulative-review report references this file's row by Finding ID.
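As an illustration of the schema above, a row could be modelled and rendered as follows. This is a sketch under stated assumptions: the class `ViolationRow`, the helper `next_finding_id`, and the column order of the rendered row are hypothetical, since this file only fixes the field list, not the row layout.

```python
from dataclasses import dataclass
from typing import Iterable

ALLOWED_SEVERITIES = ("High", "Medium")  # Critical escalates; Low stays per-batch


@dataclass(frozen=True)
class ViolationRow:
    finding_id: str
    batch_range: str
    severity: str
    subcategory: str
    location: str      # "File:line" in the schema
    summary: str
    report: str        # path to the cumulative-review report
    status: str = "OPEN"

    def __post_init__(self) -> None:
        if self.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"severity {self.severity!r} does not belong in the baseline")

    def to_markdown(self) -> str:
        """Render one table row; the column order is an assumption."""
        cells = [
            f"`{self.finding_id}`", self.batch_range, self.severity,
            self.subcategory, f"`{self.location}`", self.summary,
            f"`{self.report}`", self.status,
        ]
        # Escape literal pipes so a summary cannot break the table.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"


def next_finding_id(existing_ids: Iterable[str], day: str) -> str:
    """Build `arch-<date>-<n>` with n = next sequence number within the day."""
    prefix = f"arch-{day}-"
    used = [int(i[len(prefix):]) for i in existing_ids if i.startswith(prefix)]
    return f"{prefix}{max(used, default=0) + 1}"


row = ViolationRow(
    finding_id=next_finding_id([], "2026-06-15"),
    batch_range="batches 17-19 cycle 4",
    severity="High",
    subcategory="cross-component-import",
    location="src/gps_denied_onboard/components/c2_vpr/ultra_vpr.py:117",
    summary="c2_vpr imports c6_tile_cache directly",
    report="_docs/03_implementation/cumulative_review_batches_17-19_cycle4_report.md",
)
```

Rejecting `Low` and `Critical` at construction time mirrors the severity rule in the schema: only `High` and `Medium` findings are appended here.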
### When a violation is resolved
Update the violating row in place: change `Status: OPEN` to
`Status: RESOLVED in batch <N> cycle <M> via <commit-hash>`. Do NOT delete
the row — the audit trail must show both the introduction and the
resolution.
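The in-place status update could be sketched like this. It assumes each violation is stored as a single Markdown table row whose first cell is the Finding ID and whose last cell is the status, with no escaped pipes inside cells; the function name `mark_resolved` and the commit hash in the example are placeholders.

```python
def mark_resolved(baseline_text: str, finding_id: str,
                  batch: int, cycle: int, commit: str) -> str:
    """Flip one OPEN row to RESOLVED without deleting it.

    Raises LookupError if no OPEN row with that Finding ID exists, so a
    mistyped ID cannot pass silently.
    """
    resolved = f"RESOLVED in batch {batch} cycle {cycle} via {commit}"
    out, hit = [], False
    for line in baseline_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("|"):
            cells = [c.strip() for c in stripped.strip("|").split("|")]
            if cells[0].strip("`") == finding_id and cells[-1] == "OPEN":
                cells[-1] = resolved
                line = "| " + " | ".join(cells) + " |"
                hit = True
        out.append(line)
    if not hit:
        raise LookupError(f"no OPEN row for {finding_id}")
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if baseline_text.endswith("\n") else "")


before = "| `arch-2026-06-15-1` | batches 17-19 cycle 4 | High | OPEN |\n"
after = mark_resolved(before, "arch-2026-06-15-1", batch=20, cycle=4, commit="abc1234")
```

Because the row is rewritten rather than removed, both the introduction and the resolution remain visible in the file history and in the table itself.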
### When the structural snapshot is refreshed
Any cycle that materially changes structure — new component, new
cross-component edge, new contract file, new composition root — re-snapshots
to a fresh `_docs/06_metrics/structure_<YYYY-MM-DD>.md` (the cycle-end
retrospective triggers this when the diff is non-trivial). When that
happens:
1. Update the `**Source-of-truth snapshot**` header pointer at the top of
this file to the new file.
2. Update the `Cycle of last refresh` header to the cycle that produced the
new snapshot.
3. Update the § Source table values (inventory-entry count, import-cycle
count, contract-file count) to match the new snapshot.
4. Do NOT clear § Violations — open findings carry across snapshots.
Resolution status is per-finding, not per-snapshot.
The refresh script is the same one that produced `structure_2026-05-20.md`
(approach: count `src/gps_denied_onboard/components/*/` directories +
`src/gps_denied_onboard/runtime_root/` + `helpers/`; run the AZ-270
composition-root lint to detect cycles; enumerate
`_docs/02_document/contracts/` subdirectories). If the script has been
extracted into `tools/structure_snapshot.py` between cycles, use it;
otherwise the manual approach is documented at the top of the source
snapshot file.
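A rough sketch of the counting part of that approach is shown below. It is not the project's refresh script: the function name `snapshot_counts` is hypothetical, it covers only the component directories and the contracts directory, and it counts every direct entry under `contracts/` (files and subdirectories alike) on the assumption that this matches how the § Source table lists contracts.

```python
from pathlib import Path
from typing import Dict, List


def _children(directory: Path) -> List[Path]:
    """Direct entries of a directory, skipping hidden and dunder names."""
    if not directory.is_dir():
        return []
    return [p for p in directory.iterdir() if not p.name.startswith((".", "_"))]


def snapshot_counts(repo_root) -> Dict[str, List[str]]:
    """Enumerate component directories and contract entries under repo_root."""
    root = Path(repo_root)
    components_dir = root / "src" / "gps_denied_onboard" / "components"
    contracts_dir = root / "_docs" / "02_document" / "contracts"
    return {
        "components": sorted(p.name for p in _children(components_dir) if p.is_dir()),
        "contracts": sorted(p.name for p in _children(contracts_dir)),
    }
```

The lengths of the two lists would feed the inventory and contract rows; the import-cycle row still has to come from the AZ-270 composition-root lint.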
## Baseline Delta — how cumulative-review reports consume this file
Every cumulative-review report MUST emit a `## Baseline Delta` section with
three counts derived from this file:
- **Carried-over**: count of rows that were already `Status: OPEN` (or
`Status: ACCEPTED-RISK`) at the start of this review's batch window and
are still in that state at its end.
- **Resolved**: count of rows that transitioned from `OPEN` to
`RESOLVED in batch ...` during this review's batch window.
- **Newly-introduced**: count of rows added during this review's batch
window.
An empty Baseline Delta (`0 new, 0 resolved, 0 carried-over`) is still
emitted — its presence confirms the cumulative-review consulted the
baseline rather than silently skipping the section as in cycles 1-3.
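The three counts can be derived as in the sketch below. It assumes each row can be reduced to the batch number in which it was introduced and, if applicable, the batch in which it was resolved; the `Finding` type, the `baseline_delta` function, and the integer batch window are hypothetical simplifications of the free-text fields in § Violations.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Finding:
    finding_id: str
    introduced_batch: int
    resolved_batch: Optional[int] = None  # None means still OPEN / ACCEPTED-RISK


def baseline_delta(findings: Iterable[Finding],
                   window: Tuple[int, int]) -> Dict[str, int]:
    """Count carried-over, resolved and newly-introduced rows for one window."""
    first, last = window
    delta = {"carried_over": 0, "resolved": 0, "newly_introduced": 0}
    for f in findings:
        if f.introduced_batch > last:
            continue  # added after this review's window; not our concern
        new_here = first <= f.introduced_batch <= last
        resolved_here = f.resolved_batch is not None and first <= f.resolved_batch <= last
        still_open = f.resolved_batch is None or f.resolved_batch > last
        if new_here:
            delta["newly_introduced"] += 1
        if resolved_here:
            delta["resolved"] += 1
        elif not new_here and still_open:
            delta["carried_over"] += 1
    return delta


def render(delta: Dict[str, int]) -> str:
    return (f"{delta['newly_introduced']} new, {delta['resolved']} resolved, "
            f"{delta['carried_over']} carried-over")


rows = [
    Finding("arch-a", introduced_batch=12),                     # carried over
    Finding("arch-b", introduced_batch=13, resolved_batch=18),  # resolved in window
    Finding("arch-c", introduced_batch=17),                     # newly introduced
    Finding("arch-d", introduced_batch=5, resolved_batch=9),    # closed long ago
]
```

Under this reading, a finding that is both introduced and resolved inside the same window counts once as new and once as resolved, which follows from the definitions above taken literally.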
## References
- Cycle-3 retro § Top 3 Improvement Actions #3 - `_docs/06_metrics/retro_2026-05-26.md`
- Cycle-1 retro § Top 3 Improvement Actions #3 (original) — `_docs/06_metrics/retro_2026-05-20.md`
- Source snapshot — `_docs/06_metrics/structure_2026-05-20.md`
- Existing-code flow Step 2 — `.cursor/skills/autodev/flows/existing-code.md` § "Step 2 — Architecture Baseline Scan"
- Implement skill Step 14.5 — `.cursor/skills/implement/SKILL.md` § "Cumulative Code Review (every K batches)"
- Architecture doc — `_docs/02_document/architecture.md`
- Module-layout — `_docs/02_document/module-layout.md`
@@ -1,51 +0,0 @@
# Leftover: EVIDENCE_OUT default is a hardcoded container path
**Created**: 2026-05-26
**Last replay attempt**: 2026-05-26
**Category**: Test infrastructure defect (non-tracker leftover — code fix, not a deferred tracker write)
**Surfaced by**: autodev cycle 3 Step 15 (Performance Test) — `_docs/06_metrics/perf_2026-05-26_cycle3-tier1-probe.md` "Findings worth tracking" item 3.
## Problem
`e2e/runner/conftest.py:56`:
```python
default=os.environ.get("EVIDENCE_OUT", "/e2e-results/evidence")
```
The default path `/e2e-results/evidence` is the container mount inside the Tier-1 Docker harness and the Tier-2 Jetson run script. On a developer Mac/Linux workstation invoking `python -m pytest e2e/tests/performance/` directly (no Docker, no Jetson), this hook fires in `nfr_recorder.pytest_sessionfinish` and tries to create the directory, failing with:
```
OSError: [Errno 30] Read-only file system: '/e2e-results'
```
(On macOS the root volume `/` is read-only.) On Linux hosts it would instead fail with `PermissionError`, because a non-root user cannot create `/e2e-results`.
## Workaround (used today)
```bash
EVIDENCE_OUT="$(pwd)/e2e-results/cycle3-tier1-probe/evidence" \
python -m pytest e2e/tests/performance/ -v --tb=short
```
This produced a clean exit-0 run with the expected 24 SKIPPED outcomes.
## Proposed fix
Change `e2e/runner/conftest.py:56` to default to a workspace-relative path when neither `--evidence-out` nor `EVIDENCE_OUT` is set. Two viable shapes:
1. **Workspace-relative default**: `default=os.environ.get("EVIDENCE_OUT", str(Path(__file__).resolve().parents[2] / "e2e-results" / "evidence"))`.
2. **Lazy fallback inside the recorder**: leave the default unset; if `evidence_dir` is `None` at session finish, skip emission and warn — useful for `--collect-only` or smoke runs where evidence output is genuinely not needed.
Either shape preserves backward compatibility with the Docker / Jetson scripts (they pass `--evidence-out` explicitly).
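Shape 1 can be sketched as a small helper, kept separate from pytest so it is easy to test. The helper name `default_evidence_dir` and its parameters are hypothetical; the only facts taken from this leftover are the `EVIDENCE_OUT` variable, the `parents[2]` hop from `e2e/runner/conftest.py` to the repo root, and the `e2e-results/evidence` suffix.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def default_evidence_dir(conftest_file: str,
                         environ: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the evidence directory used when --evidence-out is not passed.

    EVIDENCE_OUT wins if set; otherwise fall back to
    <repo_root>/e2e-results/evidence, where repo_root is two levels above the
    directory containing conftest.py (e2e/runner/ -> e2e/ -> repo root).
    """
    env = os.environ if environ is None else environ
    fallback = Path(conftest_file).resolve().parents[2] / "e2e-results" / "evidence"
    return env.get("EVIDENCE_OUT", str(fallback))


# Inside conftest.py this would presumably be wired in as:
#   parser.addoption("--evidence-out", default=default_evidence_dir(__file__))
```

Passing the environment as a parameter is a design choice for testability only; the one-line form in shape 1 behaves the same way when `os.environ` is used directly.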
## Why not fix in this cycle
Per `coderule.mdc` § Scope discipline: "Unrelated issues elsewhere: do not silently fix them as part of this task. Either note them to the user at end of turn and ASK before expanding scope, or record in `_docs/_process_leftovers/` for later handling." Cycle 3 was pre-flight / route-driven seeding work; the EVIDENCE_OUT default has no relationship to that scope. Recording here for either:
- Next cycle's New Task step to pick up as a small (~1 pt) housekeeping ticket, OR
- A drive-by fix during the next test-infrastructure touch (e.g. when AZ-444 Tier-2 harness lands).
## Replay condition
This is a **code-fix leftover**, not a tracker-write leftover. There is nothing to "replay against the tracker". Resolution = land the conftest change above and verify a Tier-1 host run of `pytest e2e/tests/performance/` exits cleanly without `EVIDENCE_OUT` pre-set. Once that PR merges, delete this leftover.