chore: sync .cursor skills from suite

2026-06-21 06:31:09 +00:00 · 2026-05-03 17:43:27 +03:00
parent de2405642f
commit 80aa818200
13 changed files with 369 additions and 90 deletions
@@ -3,7 +3,7 @@ name: autodev
 description: |
  Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
  Detects current project state from _docs/ folder, resumes from where it left off, and flows through
-  problem → research → plan → decompose → implement → deploy without manual skill invocation.
+  problem → research → plan → test specs → decompose → implement → tests → docs sync → deploy without manual skill invocation.
  Maximizes work per conversation by auto-transitioning between skills.
  Trigger phrases:
  - "autodev", "auto", "start", "continue"
@@ -1,6 +1,6 @@
 # Greenfield Workflow

-Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
+Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.

 ## Step Reference Table

@@ -10,13 +10,19 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
 | 2 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 |
 | 3 | Plan | plan/SKILL.md | Step 1–6 + Final |
 | 4 | UI Design | ui-design/SKILL.md | Phase 0–8 (conditional — UI projects only) |
-| 5 | Decompose | decompose/SKILL.md | Step 1–4 |
-| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
-| 7 | Run Tests | test-run/SKILL.md | Steps 1–4 |
-| 8 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
-| 9 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
-| 10 | Deploy | deploy/SKILL.md | Step 1–7 |
-| 11 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |
+| 5 | Test Spec | test-spec/SKILL.md | Phases 1–4 |
+| 6 | Decompose | decompose/SKILL.md | Step 1–4 |
+| 7 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
+| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
+| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 11 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
+| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
+| 14 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
+| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
+| 16 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |

 ## Detection Rules

@@ -80,12 +86,12 @@ If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FIN
 ---

 **Step 4 — UI Design (conditional)**
-Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_tasks/todo/` does not exist or has no task files.
+Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
 State-driven: reached by auto-chain from Step 3.

 Action: Read and execute `.cursor/skills/ui-design/SKILL.md`. The skill runs its own **Applicability Check**, which handles UI project detection and the user's A/B choice. It returns one of:

- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Decompose).
+- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Test Spec).
 - `outcome: skipped, reason: not-a-ui-project` → mark Step 4 as `skipped`, auto-chain to Step 5.
 - `outcome: skipped, reason: user-declined` → mark Step 4 as `skipped`, auto-chain to Step 5.

@@ -93,34 +99,153 @@ The autodev no longer inlines UI detection heuristics — they live in `ui-desig

 ---

-**Step 5 — Decompose**
-Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files
+**Step 5 — Test Spec**
+Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
+State-driven: reached by auto-chain from Step 4 (completed or skipped).

-Action: Read and execute `.cursor/skills/decompose/SKILL.md`
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md`.
+
+This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
+
+---
+
+**Step 6 — Decompose**
+Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_document/tests/traceability-matrix.md` exists AND `_docs/02_tasks/todo/` does not exist or has no implementation task files.
+
+Action: Read and execute `.cursor/skills/decompose/SKILL.md` in normal implementation mode. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality.

 If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.

 ---

-**Step 6 — Implement**
-Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file
+**Step 7 — Implement**
+Condition: `_docs/02_tasks/todo/` contains implementation task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any product `implementation_report_*.md` file.

 Action: Read and execute `.cursor/skills/implement/SKILL.md`

 If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.

+For folder fallback, **implementation task files** means task specs that are not test-only specs: exclude `*_test_infrastructure.md` and task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
+
+For folder fallback, a **product implementation report** is any `_docs/03_implementation/implementation_report_*.md` file except `_docs/03_implementation/implementation_report_tests.md` and refactor reports.
+
 ---

-**Step 7 — Run Tests**
-Condition (folder fallback): `_docs/03_implementation/` contains an `implementation_report_*.md` file.
-State-driven: reached by auto-chain from Step 6.
+**Step 8 — Code Testability Revision**
+Condition (folder fallback): `_docs/03_implementation/` contains a product implementation report AND `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` does not exist AND `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` does not exist AND `_docs/03_implementation/implementation_report_tests.md` does not exist AND `_docs/02_tasks/todo/` does not contain test task files.
+State-driven: reached by auto-chain from Step 7.
+
+**Purpose**: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
+
+**Scope — MINIMAL, SURGICAL fixes**: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
+
+**Allowed changes** in this phase:
+- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
+- Extract narrow interfaces for components that need stubbing in tests.
+- Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
+- Wrap global singletons in thin accessors that tests can override.
+- Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
+
+**NOT allowed** in this phase (defer to a later refactor task):
+- Renaming public APIs.
+- Moving code between files unless strictly required for isolation.
+- Changing algorithms or business logic.
+- Restructuring module boundaries or rewriting layers.
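A minimal sketch of what an allowed change can look like, assuming a Python codebase. Every name here (`ReportUploader`, `HttpClient`, `RealHttpClient`, `FakeHttpClient`, `REPORT_API_URL`, the example URL) is hypothetical and not part of any skill. It illustrates three of the allowed items: a narrow interface for stubbing, an optional constructor parameter that defaults to the existing behavior, and a hardcoded URL moved to an env var.

```python
import os
from typing import Optional, Protocol


class HttpClient(Protocol):
    """Narrow interface extracted so tests can stub the collaborator."""

    def post(self, url: str, body: bytes) -> int: ...


class RealHttpClient:
    """Stand-in for the production client (hypothetical)."""

    def post(self, url: str, body: bytes) -> int:
        raise RuntimeError("network access is not available in this sketch")


class ReportUploader:
    def __init__(
        self,
        client: Optional[HttpClient] = None,
        base_url: Optional[str] = None,
    ) -> None:
        # Optional parameters default to the previous behavior, so callers
        # that still use ReportUploader() keep working unchanged.
        self._client = client if client is not None else RealHttpClient()
        # The formerly hardcoded URL now comes from an argument or an env var,
        # with the old literal kept as the final fallback.
        self._base_url = base_url or os.environ.get(
            "REPORT_API_URL", "https://reports.example.invalid"
        )

    def upload(self, name: str, body: bytes) -> bool:
        status = self._client.post(f"{self._base_url}/reports/{name}", body)
        return status == 200


class FakeHttpClient:
    """Test double that records calls instead of touching the network."""

    def __init__(self) -> None:
        self.calls = []

    def post(self, url: str, body: bytes) -> int:
        self.calls.append((url, body))
        return 200
```

The public API and the upload logic are untouched, which keeps the change inside the allowed list; renaming `upload` or moving the class would belong to a later refactor task.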
+
+Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
+
+1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
+2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
+   - Hardcoded file paths or directory references
+   - Hardcoded configuration values (URLs, credentials, magic numbers)
+   - Global mutable state that cannot be overridden
+   - Tight coupling to external services without abstraction
+   - Missing dependency injection or non-configurable parameters
+   - Direct file system operations without path configurability
+   - Inline construction of heavy dependencies (models, clients)
+3. If ALL scenarios are testable as-is:
+   - Create `_docs/04_refactoring/01-testability-refactoring/`
+   - Write `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` with the scenarios reviewed and outcome "Code is testable — no changes needed"
+   - Mark Step 8 as `completed` with outcome "Code is testable — no changes needed"
+   - Auto-chain to Step 9 (Decompose Tests)
+4. If testability issues are found:
+   - Create `_docs/04_refactoring/01-testability-refactoring/`
+   - Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
+     - **Mode**: `guided`
+     - **Source**: `autodev-greenfield-testability-analysis`
+     - One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
+   - Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
+   - Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
+   - After execution, surface `RUN_DIR/testability_changes_summary.md` to the user via the Choose format (accept / request follow-up) before auto-chaining
+   - Copy or save the accepted summary as `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` so folder fallback can detect Step 8 completion
+   - Mark Step 8 as `completed`
+   - Auto-chain to Step 9 (Decompose Tests)
+
+---
+
+**Step 9 — Decompose Tests**
+Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND `_docs/03_implementation/` contains a product implementation report AND (`_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` exists OR `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` exists) AND (`_docs/02_tasks/todo/` does not exist or has no test task files) AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+State-driven: reached by auto-chain from Step 8.
+
+Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
+1. Run Step 1t (test infrastructure bootstrap)
+2. Run Step 3 (blackbox/e2e-capable test task decomposition)
+3. Run Step 4 (cross-verification against test coverage)
+
+If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
+
+---
+
+**Step 10 — Implement Tests**
+Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+State-driven: reached by auto-chain from Step 9.
+
+Action: Read and execute `.cursor/skills/implement/SKILL.md`
+
+The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.
+
+If `_docs/03_implementation/` has batch reports, the implement skill detects completed test tasks and continues.
+
+For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
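
The folder-fallback split between test task files and implementation task files could be sketched as below. This is an illustration only: the field layout (`**Component**: Blackbox Tests` on its own line) is an assumption, and `is_test_task` and `split_tasks` are hypothetical helper names, not functions of the implement or decompose skills.

```python
import re
from pathlib import Path

# Assumed field layout inside a task spec, e.g. "**Component**: Blackbox Tests".
_FIELD = re.compile(
    r"^\*\*(?:Component|Epic)\*\*[ \t]*:?[ \t]*(.*)$", re.MULTILINE
)


def is_test_task(path: Path) -> bool:
    """True for *_test_infrastructure.md or specs tagged Blackbox Tests."""
    if path.name.endswith("_test_infrastructure.md"):
        return True
    text = path.read_text(encoding="utf-8")
    return any("Blackbox Tests" in value for value in _FIELD.findall(text))


def split_tasks(todo_dir: Path) -> tuple[list[Path], list[Path]]:
    """Return (implementation_tasks, test_tasks) found in a todo folder."""
    if not todo_dir.is_dir():
        return [], []
    impl: list[Path] = []
    tests: list[Path] = []
    for path in sorted(todo_dir.glob("*.md")):
        (tests if is_test_task(path) else impl).append(path)
    return impl, tests
```

Under these assumptions, the Step 10 condition holds when the second list is non-empty, and the Step 7 condition uses the first list. Any non-task Markdown file kept in the same folder would need to be filtered out first.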
+
+---
+
+**Step 11 — Run Tests**
+Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
+State-driven: reached by auto-chain from Step 10.

 Action: Read and execute `.cursor/skills/test-run/SKILL.md`

+Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync.
+
 ---

-**Step 8 — Security Audit (optional)**
-State-driven: reached by auto-chain from Step 7.
+**Step 12 — Test-Spec Sync**
+State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
+
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
+
+The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing, mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
+
+After completion, auto-chain to Step 13 (Update Docs).
+
+---
+
+**Step 13 — Update Docs**
+State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
+
+Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all completed implementation and test task spec files plus the implementation reports.
+
+The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
+
+If `_docs/02_document/` does not contain existing docs, mark Step 13 as `skipped`.
+
+After completion, auto-chain to Step 14 (Security Audit).
+
+---
+
+**Step 14 — Security Audit (optional)**
+State-driven: reached by auto-chain from Step 13 (completed or skipped).

 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run security audit before deploy?`
@@ -128,12 +253,12 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A — catches vulnerabilities before production`
 - target-skill:    `.cursor/skills/security/SKILL.md`
- next-step:       Step 9 (Performance Test)
+- next-step:       Step 15 (Performance Test)

 ---

-**Step 9 — Performance Test (optional)**
-State-driven: reached by auto-chain from Step 8.
+**Step 15 — Performance Test (optional)**
+State-driven: reached by auto-chain from Step 14 (completed or skipped).

 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run performance/load tests before deploy?`
@@ -141,30 +266,30 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
 - target-skill:    `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
- next-step:       Step 10 (Deploy)
+- next-step:       Step 16 (Deploy)

 ---

-**Step 10 — Deploy**
-State-driven: reached by auto-chain from Step 9 (after Step 9 is completed or skipped).
+**Step 16 — Deploy**
+State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).

 Action: Read and execute `.cursor/skills/deploy/SKILL.md`.

-After the deploy skill completes successfully, mark Step 10 as `completed` and auto-chain to Step 11 (Retrospective).
+After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).

 ---

-**Step 11 — Retrospective**
-State-driven: reached by auto-chain from Step 10.
+**Step 17 — Retrospective**
+State-driven: reached by auto-chain from Step 16.

 Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. This closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` and appending the top-3 lessons to `_docs/LESSONS.md`.

-After retrospective completes, mark Step 11 as `completed` and enter "Done" evaluation.
+After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.

 ---

 **Done**
-State-driven: reached by auto-chain from Step 11. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
+State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)

 Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:

@@ -191,47 +316,65 @@ On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and r
 | Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
 | Research Decision → proceed | Auto-chain → Plan (3) |
 | Plan (3) | Auto-chain → UI Design detection (4) |
-| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
-| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
-| Implement (6) | Auto-chain → Run Tests (7) |
-| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
-| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
-| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
-| Deploy (10) | Auto-chain → Retrospective (11) |
-| Retrospective (11) | Report completion; rewrite state to existing-code flow, step 9 |
+| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
+| Test Spec (5) | Auto-chain → Decompose (6) |
+| Decompose (6) | **Session boundary** — suggest new conversation before Implement |
+| Implement (7) | Auto-chain → Code Testability Revision (8) |
+| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
+| Decompose Tests (9) | **Session boundary** — suggest new conversation before Implement Tests |
+| Implement Tests (10) | Auto-chain → Run Tests (11) |
+| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
+| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
+| Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
+| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
+| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
+| Deploy (16) | Auto-chain → Retrospective (17) |
+| Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
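
As a rough illustration, the rows for Steps 4 to 16 in the table above can be read as a lookup from the finished step to the next step plus a session-boundary flag. `Transition`, `GREENFIELD_CHAIN`, and `next_action` are hypothetical names; the autodev skill itself is prose-driven and does not ship this code.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    next_step: int
    session_boundary: bool  # True: suggest a new conversation first


# Steps 4-16 only; earlier rows and the Retrospective row are left out.
GREENFIELD_CHAIN: dict[int, Transition] = {
    4: Transition(5, False),    # UI Design (done or skipped) -> Test Spec
    5: Transition(6, False),    # Test Spec -> Decompose
    6: Transition(7, True),     # Decompose -> Implement (session boundary)
    7: Transition(8, False),    # Implement -> Code Testability Revision
    8: Transition(9, False),    # Code Testability Revision -> Decompose Tests
    9: Transition(10, True),    # Decompose Tests -> Implement Tests (boundary)
    10: Transition(11, False),  # Implement Tests -> Run Tests
    11: Transition(12, False),  # Run Tests -> Test-Spec Sync (all pass only)
    12: Transition(13, False),  # Test-Spec Sync -> Update Docs
    13: Transition(14, False),  # Update Docs -> Security Audit choice
    14: Transition(15, False),  # Security Audit -> Performance Test choice
    15: Transition(16, False),  # Performance Test -> Deploy
    16: Transition(17, False),  # Deploy -> Retrospective
}


def next_action(step: int) -> str:
    """Describe what happens after the given step completes."""
    transition = GREENFIELD_CHAIN[step]
    if transition.session_boundary:
        return f"pause before step {transition.next_step}"
    return f"auto-chain to step {transition.next_step}"
```

The two session boundaries sit before the two implement runs, so each batch-driven phase starts in a fresh conversation.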

 ## Status Summary — Step List

 Flow name: `greenfield`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.

-| # | Step Name          | Extra state tokens (beyond the shared set) |
-|---|--------------------|--------------------------------------------|
-| 1 | Problem            | — |
-| 2 | Research           | `DONE (N drafts)` |
-| 3 | Plan               | — |
-| 4 | UI Design          | — |
-| 5 | Decompose          | `DONE (N tasks)` |
-| 6 | Implement          | `IN PROGRESS (batch M of ~N)` |
-| 7 | Run Tests          | `DONE (N passed, M failed)` |
-| 8 | Security Audit     | — |
-| 9 | Performance Test   | — |
-| 10 | Deploy            | — |
-| 11 | Retrospective     | — |
+| # | Step Name                   | Extra state tokens (beyond the shared set) |
+|---|-----------------------------|--------------------------------------------|
+| 1 | Problem                     | — |
+| 2 | Research                    | `DONE (N drafts)` |
+| 3 | Plan                        | — |
+| 4 | UI Design                   | — |
+| 5 | Test Spec                   | — |
+| 6 | Decompose                   | `DONE (N tasks)` |
+| 7 | Implement                   | `IN PROGRESS (batch M of ~N)` |
+| 8 | Code Testability Revision   | — |
+| 9 | Decompose Tests             | `DONE (N tasks)` |
+| 10 | Implement Tests            | `IN PROGRESS (batch M)` |
+| 11 | Run Tests                  | `DONE (N passed, M failed)` |
+| 12 | Test-Spec Sync             | — |
+| 13 | Update Docs                | — |
+| 14 | Security Audit             | — |
+| 15 | Performance Test           | — |
+| 16 | Deploy                     | — |
+| 17 | Retrospective              | — |

-All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 8, 9 additionally accept `SKIPPED`.
+All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15 additionally accept `SKIPPED`.

 Row rendering format (step-number column is right-padded to 2 characters for alignment):

 ```
- Step 1   Problem             [<state token>]
- Step 2   Research            [<state token>]
- Step 3   Plan                [<state token>]
- Step 4   UI Design           [<state token>]
- Step 5   Decompose           [<state token>]
- Step 6   Implement           [<state token>]
- Step 7   Run Tests           [<state token>]
- Step 8   Security Audit      [<state token>]
- Step 9   Performance Test    [<state token>]
- Step 10  Deploy              [<state token>]
- Step 11  Retrospective       [<state token>]
+ Step 1   Problem                   [<state token>]
+ Step 2   Research                  [<state token>]
+ Step 3   Plan                      [<state token>]
+ Step 4   UI Design                 [<state token>]
+ Step 5   Test Spec                 [<state token>]
+ Step 6   Decompose                 [<state token>]
+ Step 7   Implement                 [<state token>]
+ Step 8   Code Testability Rev.     [<state token>]
+ Step 9   Decompose Tests           [<state token>]
+ Step 10  Implement Tests           [<state token>]
+ Step 11  Run Tests                 [<state token>]
+ Step 12  Test-Spec Sync            [<state token>]
+ Step 13  Update Docs               [<state token>]
+ Step 14  Security Audit            [<state token>]
+ Step 15  Performance Test          [<state token>]
+ Step 16  Deploy                    [<state token>]
+ Step 17  Retrospective             [<state token>]
 ```
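
The row format above can be reproduced with ordinary string padding. A minimal sketch: `render_row` is a hypothetical helper, the name-column width of 26 is an assumption read off the rows above, and the single leading space shown in the template is omitted.

```python
def render_row(number: int, name: str, state: str, name_width: int = 26) -> str:
    """Render one status row with aligned step-number and name columns."""
    # The step number is left-aligned in a 2-character column, followed by
    # two spaces; the name is padded so every "[" starts in the same column.
    return f"Step {number:<2}  {name:<{name_width}}[{state}]"


rows = [
    render_row(1, "Problem", "DONE"),
    render_row(8, "Code Testability Rev.", "NOT STARTED"),
    render_row(10, "Implement Tests", "IN PROGRESS (batch M)"),
]
print("\n".join(rows))
```

Step names longer than the column width are not truncated by this sketch, which is presumably why Step 8 is abbreviated to `Code Testability Rev.` in the template.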
@@ -111,6 +111,7 @@ Before entering a step from this table for the first time in a session, verify t
 |------|------|----------|----------------|
 | greenfield | Plan | Step 6 — Epics | Create epics for each component |
 | greenfield | Decompose | Step 1 + Step 2 + Step 3 — All tasks | Create ticket per task, link to epic |
+| greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |

@@ -13,7 +13,7 @@ The autodev persists its position to `_docs/_autodev_state.md`. This is a lightw

 ## Current Step
 flow: [greenfield | existing-code | meta-repo]
-step: [1-11 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
+step: [1-17 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
 name: [step name from the active flow's Step Reference Table]
 status: [not_started / in_progress / completed / skipped / failed]
 sub_step:
@@ -25,6 +25,7 @@ Phase details live in `phases/` — read the relevant file before executing each
 - **Delegate execution**: all code changes go through the implement skill via task files
 - **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
 - **Exact-fit recommendations**: do not recommend a replacement pattern, library, service, architecture, algorithm, or "modern approach" merely because it improves structure or solves a similar class of problem. It must fit confirmed product constraints, acceptance criteria, operating context, integration boundaries, and current code realities. Otherwise reject it, mark it experimental, or ask the user before adding it to the roadmap.
+- **Per-mode API capability verification on replacements**: when a refactor proposes replacing or adding a library/SDK/framework/service that exposes multiple modes or configurations, pin the exact mode the refactored code will use (inputs, outputs, runtime) and verify *that mode* via mandatory `context7` lookup plus a saved Minimum Viable Example before promoting the recommendation to `Selected`. Capability claims at the category level ("supports A, B, C modes") must be cross-checked against the literal mode enumeration — `A, B → A+B` style conflations are the recurring silent-failure path.

 ## Context Resolution

@@ -10,6 +10,17 @@
 2. Extract the **Project Constraint Matrix** from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, current architecture/docs, and actual code constraints. Include required inputs/outputs, operating context, lifecycle assumptions, integration boundaries, non-functional targets, and hard disqualifiers.
 3. Research modern approaches for similar systems
 4. For each alternative pattern/library/service/architecture/algorithm, research intrinsic implementation constraints: required inputs/outputs, runtime assumptions, supported deployment modes, resource needs, operational limits, licensing/security constraints, and known failure reports.
+
+   **API Capability Verification — Per-Mode (MANDATORY, BLOCKING for proposed replacements)**
+
+   When a refactor recommendation replaces (or adds) a library/SDK/framework/service, the same per-mode verification used by `/research` Step 2 applies — selecting a replacement on category fit alone is the same silent-failure path. For every replacement candidate that has multiple modes or configurations:
+
+   1. **Pin the exact mode/configuration** the refactored code will use, in one explicit sentence. Inputs (data shapes, sensor counts, payloads, rates), outputs (per `acceptance_criteria.md` and contract files), runtime (matching the project's deployment).
+   2. **Run `context7` (or equivalent docs lookup)** for the candidate. **Mandatory for every replacement library/SDK/framework candidate**, not optional. Minimum three queries per candidate: mode enumeration, project's exact mode (with input/output shapes), disqualifier probe ("does this mode produce the required output? are there published limitations on this runtime?"). Append URLs to `RUN_DIR/analysis/research_findings.md` references section.
+   3. **Save a Minimum Viable Example (MVE)** for the pinned mode under `RUN_DIR/analysis/mve_evidence.md` with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌. If no official example covers the project's exact configuration, the recommendation cannot be `Selected` based on category fit alone — it must be `Experimental only` (with required-evidence note) or `Rejected`.
+   4. **Treat "the same library in a different mode" as a different recommendation.** If the project's pinned mode is `<X>` but the only documented evidence covers `<Y>`, do not silently soften the description. Open a separate recommendation row, with its own MVE, fit assessment, and disqualifiers.
+   5. **Common silent-failure pattern**: a fact summary paraphrases docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes" — no `A+B` combination exists. Cross-check paraphrased capability claims against the literal mode enumeration.
+
 5. Identify what could be done differently
 6. Suggest improvements only when they fit the Project Constraint Matrix. A cleaner or more modern approach that violates product constraints must be marked `Rejected` or `Experimental only`, not added as a roadmap recommendation.

@@ -17,7 +28,8 @@ Write `RUN_DIR/analysis/research_findings.md`:
 - Current state analysis: patterns used, strengths, weaknesses
 - Alternative approaches per component: current vs alternative, pros/cons, migration effort
 - Prioritized recommendations: quick wins + strategic improvements
- Constraint-fit table: recommendation, constraints checked, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
+- Constraint-fit table: recommendation, **pinned mode/config**, constraints checked, **API capability evidence (MVE link)**, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
+- For every recommendation that replaces or adds a library/SDK/framework, append a **Restrictions × Candidate-Mode sub-matrix** that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's pinned mode, marking each cell ✅ Pass / ❌ Fail / ❓ Verify / N/A with cited evidence. A recommendation cannot be `Selected` while any cell is ❌ or ❓.
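
The selection rule for the Restrictions × Candidate-Mode sub-matrix can be sketched as a simple gate. `Cell` and `may_be_selected` are hypothetical names, and treating an empty matrix as not selectable is an assumption of this sketch, not something the text above states.

```python
from enum import Enum


class Cell(Enum):
    PASS = "✅"
    FAIL = "❌"
    VERIFY = "❓"
    NA = "N/A"


def may_be_selected(cells: dict[str, Cell]) -> bool:
    """True only when every restriction/criterion cell is Pass or N/A."""
    # Assumption: a recommendation with no sub-matrix at all is not selectable.
    if not cells:
        return False
    return all(cell in (Cell.PASS, Cell.NA) for cell in cells.values())


# Keys stand for numbered lines of restrictions.md / acceptance_criteria.md.
matrix = {"R1": Cell.PASS, "R2": Cell.NA, "AC3": Cell.VERIFY}
print(may_be_selected(matrix))
```

A single ❓ cell is enough to block `Selected`, which leaves `Experimental only`, `Rejected`, or `Needs user decision` as the remaining statuses until the evidence is gathered.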

 ## 2b. Solution Assessment & Hardening Tracks

@@ -88,6 +100,9 @@ Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files
 - [ ] Recommendations are grounded in actual code, not abstract
 - [ ] Every recommendation has been checked against the Project Constraint Matrix
 - [ ] No recommendation violates product restrictions, acceptance criteria, documented architecture decisions, or actual code integration boundaries
+- [ ] Every replacement library/SDK/framework recommendation has a pinned mode/config, a saved MVE in `mve_evidence.md`, and a Restrictions × Candidate-Mode sub-matrix with no ❌ or ❓ cells
+- [ ] `context7` (or equivalent) was consulted for every replacement library/SDK/framework recommendation
+- [ ] Paraphrased capability claims have been cross-checked against the literal mode-enumeration evidence (no `A, B → A+B` style conflation)
 - [ ] Rejected and experimental approaches are documented but not converted into implementation tasks without user approval
 - [ ] Roadmap phases are prioritized by impact
 - [ ] Epic created and all tasks linked to it
@@ -33,6 +33,24 @@ Transform vague topics raised by users into high-quality, deliverable research r
 - **Component option breadth** — for every component area, build a broad option landscape before selecting. Search direct candidates, adjacent-domain alternatives, commercial/open-source variants, classical/simple baselines, current SOTA, and "do not use" failure cases. A component may not be narrowed to one candidate until alternatives have been searched and rejected with evidence.
 - **Component research depth** — for every serious component candidate, go beyond discovery pages. Read official docs, repository/license files, issue discussions, benchmarks, deployment guides, version/platform requirements, security notes, maintenance signals, and real-world failure reports. Extract evidence for inputs/outputs, lifecycle assumptions, runtime/storage/latency fit, integration boundaries, licensing, operational risks, and unsupported scenarios before assigning any selection status.
 - **Exact-fit component selection** — never select a component, tool, library, service, architecture pattern, or algorithm merely because it solves a similar class of problem. It must be proven compatible with the project's explicit operating context, constraints, required inputs/outputs, non-functional requirements, lifecycle assumptions, and acceptance criteria. If fit is unproven or mismatched, mark it `Rejected`, `Experimental only`, or escalate for user decision before it can shape the solution.
+- **Per-mode API capability verification** *(applies only to technical-component selection — see Research Output Class below)* — when a candidate library/SDK/framework/service exposes multiple modes or configurations, *the candidate is not a single thing*. Pin the exact mode the project will use (one explicit sentence: inputs, outputs, runtime), and verify *that mode* against the project's required inputs/outputs via official docs (mandatory `context7` lookup) plus a saved Minimum Viable Example. Capability claims at the category level ("supports X, Y, Z modes") must be cross-checked against the literal mode enumeration before being treated as project-applicable. Two modes of one library are two distinct candidates for the purposes of the Component Applicability Gate. Does not apply to non-technical research (concept comparison, market/policy investigation, knowledge organization, etc.).
+
+## Research Output Class (BLOCKING — set in Step 1)
+
+Before applying any of the technical-component gates (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Modes sub-matrix, MVE evidence, mandatory `context7` lookup), classify the research output into one of two classes. Record the decision in `00_question_decomposition.md` once, near the top, so every downstream step honors it.
+
+| Class | What the output recommends or selects | Examples | Technical-component gates apply? |
+|-------|---------------------------------------|----------|----------------------------------|
+| **Technical-component selection** | One or more libraries, SDKs, frameworks, services, protocols, data formats, infrastructure patterns, algorithms, or APIs that will be implemented or operated against | "Pick a vector database", "Compare auth-token strategies for our API", "Should we use Kafka or RabbitMQ?", architecture / tech-stack / migration drafts (Mode A, Mode B) | **Yes — all gates active** |
+| **Non-technical investigation** | Concept comparisons, knowledge organization, root-cause investigation of an event, market/policy/regulatory/social analysis, literature review, decision support without committing to specific tooling | "Why did adoption stall in Q3?", "Compare phenomenology vs constructivism", "Map regulatory landscape for X", "What do practitioners say about onboarding under remote-first orgs?" | **No — skip API/MVE/sub-matrix gates; the rest of the 8-step engine still applies** |
+
+How to decide:
+1. Inspect the question and the input files (`problem.md`, `restrictions.md`, `acceptance_criteria.md`, or the standalone input file).
+2. If the deliverable will name specific software/services/protocols that someone will then build with or operate, it is **Technical-component selection**.
+3. If the deliverable is a report, comparison, or recommendation that does not commit to specific tooling, it is **Non-technical investigation**.
+4. **Mixed runs are valid.** Some research questions have a non-technical core but include one technical sub-question (or vice versa). In that case, classify each component area separately rather than the run as a whole, and note in `00_question_decomposition.md` which component areas trigger the technical-component gates.
+
+When the run is purely **Non-technical investigation**, the rest of the research engine (question decomposition, perspective rotation, exhaustive web search, fact extraction, comparison framework, reasoning chain, validation, deliverable formatting) still applies in full. Only the technical-component gates listed at the top of this section are skipped.
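
The per-area classification for mixed runs can be sketched as below. This is a minimal illustration, not part of the skill: the `ComponentArea` type and the `names_specific_tooling` flag are hypothetical names standing in for the "will the deliverable name specific software/services/protocols" question from the decision steps.

```python
from dataclasses import dataclass
from enum import Enum


class OutputClass(Enum):
    TECHNICAL = "Technical-component selection"
    NON_TECHNICAL = "Non-technical investigation"


@dataclass(frozen=True)
class ComponentArea:
    name: str
    # Hypothetical flag: True when the deliverable for this area commits to
    # specific software/services/protocols that someone will build with or operate.
    names_specific_tooling: bool


def classify(area: ComponentArea) -> OutputClass:
    """Classify one component area, as a mixed run requires."""
    if area.names_specific_tooling:
        return OutputClass.TECHNICAL
    return OutputClass.NON_TECHNICAL


def gated_areas(areas):
    """Return the names of areas that trigger the technical-component gates."""
    return [a.name for a in areas if classify(a) is OutputClass.TECHNICAL]


# A mixed run: one non-technical core question plus one technical sub-question.
areas = [
    ComponentArea("Why adoption stalled in Q3", names_specific_tooling=False),
    ComponentArea("Message broker choice", names_specific_tooling=True),
]
print(gated_areas(areas))
```

The list returned by `gated_areas` corresponds to the note that would be recorded in `00_question_decomposition.md`.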

 ## Context Resolution

@@ -96,7 +96,7 @@ When the research topic has Critical or High sensitivity level:
 ## Exact-Fit Validation (BLOCKING)

 - [ ] Project Constraint Matrix extracted from problem context before component selection
-- [ ] Component fit matrix includes `Component Area` and `Option Family` columns
+- [ ] Component fit matrix includes `Component Area`, `Option Family`, and `Pinned Mode/Config` columns
 - [ ] Every selected component/tool/library/service/pattern/algorithm has evidence for required inputs/outputs and integration boundaries
 - [ ] Every selected candidate has evidence for the operating context and lifecycle assumptions it must support
 - [ ] Every selected candidate has evidence for non-functional targets that are binding for the project
@@ -104,3 +104,21 @@ When the research topic has Critical or High sensitivity level:
 - [ ] Mismatches are recorded as disqualifiers, not softened into generic limitations
 - [ ] Any candidate with unproven fit is marked `Experimental only` or escalated for user decision
 - [ ] Any candidate with documented constraint conflict is marked `Rejected`
+
+## API Capability Verification (BLOCKING)
+
+**Applicability**: this checklist applies only when the run is classified as **Technical-component selection** (see SKILL.md → Research Output Class). For non-technical research (concept comparison, market/policy investigation, root-cause analysis, knowledge organization), skip this checklist entirely and note the skip in `05_validation_log.md`. For mixed runs, apply only to technical component areas.
+
+For every lead candidate that is a library/SDK/framework/service:
+
+- [ ] The exact mode/configuration the project will use is pinned in one explicit sentence (inputs, outputs, runtime); no vague "supports X" language
+- [ ] `context7` (or equivalent docs lookup) was run for the candidate, with at least 3 queries: mode enumeration, project's exact mode, disqualifier probe
+- [ ] All consulted URLs from context7 / official docs are appended to `01_source_registry.md`
+- [ ] A Minimum Viable Example (MVE) was saved for the pinned mode in `02_fact_cards.md` (or `02_mve_evidence.md`) with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌
+- [ ] When the MVE inputs or outputs do not exactly match the project's, the mismatch is cited from the official docs (not inferred), and the candidate is `Experimental only` or `Rejected`
+- [ ] When a library has multiple modes, each project-relevant mode appears as its own candidate row (not a single library row that softens across modes)
+- [ ] Restrictions × Candidate-Modes sub-matrix in `06_component_fit_matrix.md` is filled for every lead candidate, with one row per numbered restriction and per numbered acceptance criterion
+- [ ] Sub-matrix uses ✅ / ❌ / ❓ / N/A only — no free-form prose substitutes
+- [ ] No `Selected` candidate has any ❌ or ❓ cell in its sub-matrix
+- [ ] "Validation gate required" footnotes are explicitly classified as either *API capability* (must be resolved here) or *runtime quality* (may be carried forward)
+- [ ] Paraphrased capability claims in fact cards have been cross-checked against the literal mode-enumeration evidence (no `mono, inertial → mono-inertial` style conflation)
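
Three of the sub-matrix items in this checklist (one row per numbered line, symbols only, no ❌ or ❓ on a `Selected` candidate) are mechanical enough to check in code. The following is a rough sketch under assumed data shapes; `check_sub_matrix` and its parameters are hypothetical and do not correspond to any tool the skill ships.

```python
ALLOWED_CELLS = {"✅", "❌", "❓", "N/A"}
BLOCKING_CELLS = {"❌", "❓"}


def check_sub_matrix(status, required_ids, cells):
    """Return a list of checklist violations for one candidate's sub-matrix.

    status       -- candidate status, e.g. "Selected"
    required_ids -- every numbered restriction / AC id, e.g. ["R1", "AC-1.1"]
    cells        -- mapping of restriction / AC id -> result cell
    """
    problems = []
    # One row per numbered restriction and per numbered acceptance criterion.
    for line_id in required_ids:
        if line_id not in cells:
            problems.append(f"{line_id}: missing row")
    for line_id, cell in cells.items():
        if cell not in ALLOWED_CELLS:
            # Free-form prose is not an accepted substitute for a symbol.
            problems.append(f"{line_id}: free-form cell {cell!r}")
        elif status == "Selected" and cell in BLOCKING_CELLS:
            problems.append(f"{line_id}: {cell} blocks Selected")
    return problems


print(check_sub_matrix("Selected", ["R1", "R2", "AC-1.1"], {"R1": "✅", "R2": "❓"}))
```

An empty result means these three items pass; the remaining checklist items (MVE saved, sources registered, paraphrase cross-check) still need a human or agent review.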
@@ -160,7 +160,7 @@ Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). C

 **Tool Usage**:
 - Use `WebSearch` for broad searches; `WebFetch` to read specific pages
-- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation
+- Use the `context7` MCP server (`resolve-library-id` then `query-docs` / `get-library-docs`) for up-to-date library/framework documentation. **Mandatory per lead candidate** — see "API Capability Verification" below.
 - Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
 - When citing web sources, include the URL and date accessed

@@ -190,6 +190,48 @@ For every component/tool/library/service/pattern/algorithm that may be selected
 - Licensing, security, maintenance, and community-health constraints
 - Exact phrases from the project's restrictions and acceptance criteria combined with the candidate name

+**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for lead candidates)**:
+
+**Applicability**: this section applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section, and only to lead candidates that are libraries/SDKs/frameworks/services/protocols/data formats with multiple modes or configurations. For non-technical research (concept comparison, market/policy investigation, knowledge organization, root-cause analysis without tooling commitments), skip this entire sub-section and continue with the rest of Step 2 — the broader candidate implementation-limit search above is sufficient. State the skip explicitly once in `02_fact_cards.md`: `API Capability Verification: not applicable — this run is a Non-technical investigation, no library/SDK/service candidates`.
+
+Most libraries/SDKs/services expose **multiple modes or configurations** (e.g., monocular vs stereo VO, sync vs async API, batch vs streaming inference, write-through vs write-behind cache). Selecting a candidate "because it supports X" without pinning *which mode* the project will use, and *whether that exact mode produces the required outputs from the required inputs*, is the most common silent-failure path in research. A library can support a class of problem in mode A while being unusable for the project's specific configuration in mode B.
+
+For every lead candidate that is a library/SDK/framework/service with multiple modes or configurations, do the following — in this order, before marking the candidate `Selected`:
+
+1. **Pin the exact mode/configuration the project will use.**
+   Derived from the Project Constraint Matrix: which inputs are available (sensor count, sensor types, data shapes, rates), which outputs are required (per `acceptance_criteria.md` and contract files), which hardware/runtime is fixed (per `restrictions.md`). Write this as a single sentence: "We will use `<library>` in `<mode/config>` with inputs `<list>` and expect outputs `<list>` on `<runtime>`." Do not progress past this step on a vague mode description.
+
+2. **Run `context7` (or equivalent docs lookup) for the candidate** — this is **mandatory for every lead library/SDK/framework candidate**, not optional. Minimum three queries per candidate:
+   1. *Mode enumeration*: "What modes/configurations does `<library>` support? List every value of the mode/config enum and what each requires as input."
+   2. *Project's exact mode*: "Show a minimum runnable example of `<library>` in `<the pinned mode>` with `<the project's input shape>`. What does it produce?"
+   3. *Disqualifier probe*: "Does `<library>` `<the pinned mode>` produce `<the required output>`? Are there published limitations of `<the pinned mode>` for `<the project's runtime/hardware>`?"
+
+   For services without context7 coverage, use official docs site + WebFetch on the API reference page + the project's example/tutorial directory in the source repo. Append every consulted URL to `01_source_registry.md`.
+
+3. **Save a Minimum Viable Example (MVE) for the pinned mode.**
+   Append to `02_fact_cards.md` (or a sibling `02_mve_evidence.md`) at least one block per lead library candidate with:
+
+   ```markdown
+   ## MVE — <library> in <pinned mode>
+   - **Source**: <official URL or context7 reference, with date>
+   - **Inputs in the example**: <e.g., 2 calibrated cameras + IMU at 200 Hz>
+   - **Outputs in the example**: <e.g., 6-DoF pose with covariance>
+   - **Project inputs**: <e.g., 1 camera + IMU at 200 Hz>
+   - **Project outputs required**: <e.g., 6-DoF pose with metric translation>
+   - **Match assessment**: ✅ exact match / ⚠️ partial (specify dimension) / ❌ mismatch (specify dimension)
+   - **If ⚠️ or ❌**: cite the official-docs sentence that establishes the mismatch.
+   ```
+
+   If no official example covers the project's exact configuration → the candidate cannot be marked `Selected` based on category fit alone. Status must be `Experimental only` (with required-evidence note) or `Rejected` (when the docs explicitly disqualify the configuration).
+
+4. **Bind every numbered Restriction and Acceptance Criterion to the candidate's pinned mode.**
+   For each numbered line in `restrictions.md` and `acceptance_criteria.md`, decide one of: `Pass` (the pinned mode satisfies it with cited evidence), `Fail` (the pinned mode contradicts it with cited evidence), `Verify` (no evidence either way; deeper investigation required), `N/A` (the line is irrelevant to this component area). Record this in `02_fact_cards.md` under the candidate's MVE block. The structural matrix in Step 7.5 reads from these bindings.
+
+5. **Treat "the same library in a different mode" as a different candidate.**
+   If the project's pinned mode is `Monocular` but the only documented evidence covers `Stereo`, do not silently soften "rotation only" into "rotation + translation". Open a separate candidate row for the Monocular mode, with its own MVE, fit assessment, and disqualifiers. Two modes of one library are two distinct candidates for the purposes of this gate.
+
+**Common silent-failure pattern this guards against**: a fact card paraphrases the docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes". A category-level "Selected" decision then carries through every downstream artifact, masking that the project's required A+B combination does not exist as a single mode.
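
The conflation described above can be made concrete with a small sketch. It contrasts the flawed category-level check (pooling capabilities across all modes) with the per-mode check this gate asks for. The mode names and capability labels are invented for illustration and do not describe any real library.

```python
def union_covers(modes, required):
    """The flawed category-level check: pools capabilities across all modes."""
    pooled = set().union(*modes.values())
    return required <= pooled


def single_mode_covers(modes, required):
    """Return names of modes that cover every required input on their own."""
    return [name for name, caps in modes.items() if required <= caps]


# Hypothetical mode enumeration for an imaginary VO library (assumption).
modes = {
    "mono": {"mono-camera"},
    "inertial": {"imu"},
    "stereo-inertial": {"stereo-camera", "imu"},
}
# The project's pinned configuration needs both inputs in ONE mode.
required = {"mono-camera", "imu"}

print(union_covers(modes, required))        # True
print(single_mode_covers(modes, required))  # []
```

The pooled check passes while no single mode qualifies, which is the gap a paraphrased fact card hides. Treating each mode as its own candidate row makes the empty result visible.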
+
 **Search broadening strategies** (use when results are thin):
 - Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
 - Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
@@ -144,28 +144,61 @@ If using Y: [expected behavior]

 ### Step 7.5: Component Applicability Gate (BLOCKING)

-Before finalizing the solution draft, build an exact-fit matrix for every component/tool/library/service/pattern/algorithm that is selected, recommended, rejected, or treated as a fallback.
+**Applicability**: this gate applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section. For non-technical research (concept comparison, market/policy investigation, root-cause analysis without tooling, knowledge organization), skip this entire step and proceed to Step 8 — there are no components to gate. State the skip once in `05_validation_log.md`: `Step 7.5 (Component Applicability Gate): not applicable — Non-technical investigation`. For mixed runs (some component areas technical, some not), apply this gate only to the technical component areas; the non-technical ones do not produce 7.5 rows.
+
+Before finalizing the solution draft, build an exact-fit matrix for every component/tool/library/service/pattern/algorithm that is selected, recommended, rejected, or treated as a fallback. Free-form prose in a "Project Constraints Checked" column is **not sufficient** — mismatches hide inside rationale text. The matrix must be structured per restriction and per acceptance criterion.
+
+#### 7.5.1 Top-level Component Fit Matrix

 ```markdown
 # Component Fit Matrix

-| Component Area | Candidate | Option Family | Intended Role | Project Constraints Checked | Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
-|----------------|-----------|---------------|---------------|-----------------------------|----------|----------------------------|--------|--------------------|
-| [area] | [name] | [family] | [role] | [constraints] | [Fact # / Source #] | [none / list] | Selected / Rejected / Experimental only / Needs user decision | [why] |
+| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
+|----------------|-----------|--------------------|---------------|---------------|-------------------------|----------------------------|--------|--------------------|
+| [area] | [name] | [exact mode/config the project will use, copied verbatim from the MVE block in Step 2] | [family] | [role] | MVE: [link to MVE block in `02_fact_cards.md` or `02_mve_evidence.md`]; docs: [Source #] | [none / list] | Selected / Rejected / Experimental only / Needs user decision | [why] |
 ```

-Rules:
-- `Selected` is allowed only when the candidate's documented implementation assumptions match the project's explicit constraints and acceptance criteria.
-- `Experimental only` is required when a candidate might work but lacks proof for the exact operating context.
-- `Rejected` is required when documented assumptions conflict with project constraints.
-- `Needs user decision` is required when a mismatch changes scope, cost, safety, product behavior, or acceptance criteria.
+The new **Pinned Mode/Config** column is mandatory. A row without a pinned mode is incomplete. The new **API Capability Evidence** column links to the Minimum Viable Example saved during Step 2's API Capability Verification — without an MVE link the candidate cannot be `Selected`.
+
+#### 7.5.2 Restrictions × Candidate-Modes Sub-Matrix (MANDATORY)
+
+For each lead candidate row in the top-level matrix, append a structured cross-check that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's **pinned mode/config**.
+
+```markdown
+## Sub-Matrix — <Candidate Name> in <Pinned Mode>
+
+| Restriction / AC | Candidate-mode behavior | Result | Evidence |
+|------------------|-------------------------|--------|----------|
+| R1: <verbatim line from restrictions.md> | <how the pinned mode behaves under this restriction> | ✅ Pass / ❌ Fail / ❓ Verify / N/A | [Fact # / Source # / MVE link] |
+| R2: ... | ... | ... | ... |
+| ... | ... | ... | ... |
+| AC-1.1: <verbatim line from acceptance_criteria.md> | <how the pinned mode satisfies (or contradicts) this AC's measurable target> | ✅ / ❌ / ❓ / N/A | [Fact # / Source # / MVE link] |
+| AC-1.2: ... | ... | ... | ... |
+| ... | ... | ... | ... |
+```
+
+Cell semantics:
+- ✅ **Pass** — the candidate's pinned mode satisfies this line, with cited official-doc or MVE evidence.
+- ❌ **Fail** — the candidate's pinned mode contradicts this line, with cited evidence. Even one ❌ disqualifies the candidate from `Selected` status.
+- ❓ **Verify** — no evidence yet either way; further investigation required (loops back to Step 2 / Step 3.5). A row left ❓ at the end of analysis blocks the candidate.
+- **N/A** — the line is irrelevant to this component area (state why in one phrase).
+
+A candidate row may not be marked `Selected` while any cell is ❌ or ❓.
+
+#### 7.5.3 Decision Rules
+
+- `Selected` is allowed only when (a) the top-level row has an MVE link, (b) the sub-matrix has zero ❌, (c) the sub-matrix has zero ❓, and (d) the candidate's documented implementation assumptions match the project's explicit constraints and acceptance criteria.
+- `Experimental only` is required when a candidate might work but lacks proof for the exact operating context (e.g., MVE exists for a similar configuration but not the exact one).
+- `Rejected` is required when documented assumptions conflict with project constraints (any sub-matrix row is ❌ with cited evidence).
+- `Needs user decision` is required when a mismatch changes scope, cost, safety, product behavior, or acceptance criteria — and the user has not yet been consulted.
 - Each component area must include at least one selected or fallback-safe option, plus the most credible rejected/experimental alternatives discovered during web research.
 - A component area with only one candidate is incomplete unless `00_question_decomposition.md` documents the broader searches and why they yielded no realistic alternatives.
 - A candidate may not appear as the lead solution in Step 8 unless this gate marks it `Selected`.
+- "Validation gate required" footnotes are not equivalent to `Selected`. If the validation gate concerns API capability (does the mode produce the required output?), that is a Step-2 / Step-7.5 question and must be resolved here, not deferred to runtime. Only validation gates concerning *runtime quality* (e.g., "does this VO converge on this terrain class?") may be carried forward as `Selected with runtime gate`.

-**Save action**: Write `06_component_fit_matrix.md`.
+**Save action**: Write `06_component_fit_matrix.md` containing both 7.5.1 (top-level) and 7.5.2 (per-candidate sub-matrices).

-**BLOCKING**: If any lead candidate is `Experimental only`, `Rejected`, or `Needs user decision`, do not silently proceed. Ask the user or choose a different selected candidate.
+**BLOCKING**: If any lead candidate has a ❌ or ❓ cell in its sub-matrix, or carries the status `Experimental only`, `Rejected`, or `Needs user decision`, do not silently proceed. Ask the user or choose a different selected candidate.
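
The decision rules in 7.5.3 can be sketched as a small status function. This is an illustration under stated assumptions: the `Candidate` type is hypothetical, the precedence between `Needs user decision` and `Rejected` is a choice made here rather than something the rules specify, and mapping an unresolved ❓ or a missing MVE link to `Experimental only` is one reading of "lacks proof for the exact operating context". Condition (d), matching documented assumptions, still needs judgment and is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    has_mve_link: bool
    # Restriction / AC id -> "✅" | "❌" | "❓" | "N/A"
    cells: dict = field(default_factory=dict)
    # True when a mismatch changes scope, cost, safety, product behavior,
    # or acceptance criteria and the user has not yet been consulted.
    needs_user_decision: bool = False


def gate_status(c: Candidate) -> str:
    """Derive a status from the MVE link and the sub-matrix cells."""
    if c.needs_user_decision:       # assumed to take precedence
        return "Needs user decision"
    results = set(c.cells.values())
    if "❌" in results:              # any Fail row with cited evidence
        return "Rejected"
    if "❓" in results or not c.has_mve_link:
        return "Experimental only"  # assumption: unproven fit
    return "Selected"


def may_lead(c: Candidate) -> bool:
    """Only a Selected candidate may appear as the lead solution in Step 8."""
    return gate_status(c) == "Selected"


ok = Candidate("lib-a (mode X)", True, {"R1": "✅", "AC-1.1": "N/A"})
bad = Candidate("lib-b (mode Y)", True, {"R1": "❌"})
print(gate_status(ok), "|", gate_status(bad))
```

Anything other than `Selected` from `gate_status` corresponds to the BLOCKING condition: ask the user or pick a different candidate.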

 ---

@@ -10,17 +10,21 @@

 [Architecture solution that meets restrictions and acceptance criteria.]

+> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical research outputs (concept comparison, market/policy report, investigation answer), this Architecture section may be replaced with a comparison/analysis section that does not use these columns; or the columns may be marked `N/A` per row when the row describes a non-technical "component" (a process, a policy, an organizational construct). For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
+
 ### Component: [Component Name]

-| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
-|----------|-------|-----------|-------------|-------------|----------|------|-----|
-| [Option 1] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
-| [Option 2] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
+|----------|-------|--------------------|-----------|-------------|-------------|----------|------|-------------------------|-----|
+| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |

 **Exact-fit evidence**:
 - Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
 - Evidence: [Fact # / Source #]
 - Disqualifiers: [none or list]
+- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
+- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected

 [Repeat per component]

@@ -13,17 +13,21 @@

 [Architecture solution that meets restrictions and acceptance criteria.]

+> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical assessment outputs (e.g., reassessing a policy approach, comparing organizational designs), this Architecture section may be replaced with the assessment content that does not use these columns; or the columns may be marked `N/A` per row for non-technical "components". For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
+
 ### Component: [Component Name]

-| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
-|----------|-------|-----------|-------------|-------------|----------|------------|-----|
-| [Option 1] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
-| [Option 2] | [lib/platform] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
+|----------|-------|--------------------|-----------|-------------|-------------|----------|------------|-------------------------|-----|
+| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
+| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |

 **Exact-fit evidence**:
 - Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
 - Evidence: [Fact # / Source #]
 - Disqualifiers: [none or list]
+- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` § <Candidate Name>
+- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected

 [Repeat per component]

@@ -95,7 +95,7 @@ Examples:

 File: `expected_results/image_01_detections.json`

-```json
+```json
 {
  "input": "image_01.jpg",
  "expected": {
@@ -119,7 +119,7 @@ File: `expected_results/image_01_detections.json`
    ]
  }
 }
-```
+```
 ```

 ---