Greenfield Workflow
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy (optional) → Release (optional, only if Deploy ran) → Retrospective.
Step Reference Table
| Step | Name | Sub-Skill | Internal SubSteps |
|---|---|---|---|
| 1 | Problem | problem/SKILL.md | Phase 1–4 |
| 2 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 |
| 3 | Plan | plan/SKILL.md | Step 1, 2, 3, 4, 4.5 (ADR Capture), 5, 6 + Final |
| 4 | UI Design | ui-design/SKILL.md | Phase 0–8 (conditional — UI projects only) |
| 5 | Test Spec | test-spec/SKILL.md | Phases 1–4 |
| 6 | Decompose | decompose/SKILL.md (implementation task decomposition) | Step 1 + Step 1.5 + Step 2 + Step 4 |
| 7 | Implement | implement/SKILL.md | Batch loop + Product Implementation Completeness Gate |
| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 11 | Run Tests | test-run/SKILL.md | Steps 1–4 |
| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
| 14 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
| 16 | Deploy | deploy/SKILL.md | Step 1–7 (optional) |
| 16.5 | Release | release/SKILL.md | Phase 1–6 (optional — only if Step 16 completed) |
| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |
Detection Rules
Resolution: when a state file exists, state.step + state.status drive detection and the conditions below are not consulted. When no state file exists (cold start), walk the rules in order — first folder-probe match wins. Steps without a folder probe are state-driven only; they can only be reached by auto-chain from a prior step.
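The resolution order above can be sketched as follows. This is a minimal illustration, not the autodev implementation: `resolve_step`, the `Probe` type, and the shape of the `state` dict are hypothetical names chosen for this example.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical type: a folder probe inspects the project root and reports
# whether its step's cold-start condition matches.
Probe = Callable[[Path], bool]


def resolve_step(
    state: Optional[dict],
    probes: list[tuple[str, Probe]],
    root: Path,
) -> Optional[tuple[str, str]]:
    """Return (step, source) for the step to run next, or None if nothing matches."""
    if state is not None:
        # A state file exists: it drives detection and probes are not consulted.
        return str(state["step"]), "state"
    # Cold start: walk the rules in order; the first folder-probe match wins.
    for step, probe in probes:
        if probe(root):
            return step, "folder-probe"
    return None
```

Steps without a folder probe simply have no entry in `probes`, which is why they can only be reached through the state file or by auto-chain.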
Step 1 — Problem Gathering
Condition: _docs/00_problem/ does not exist, OR any of these are missing/empty:
- problem.md
- restrictions.md
- acceptance_criteria.md
- input_data/ (must contain at least one file)
Action: Read and execute .cursor/skills/problem/SKILL.md
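A folder probe for the Step 1 condition might look like the sketch below. The function name `problem_is_complete` is hypothetical, and counting files at any depth under input_data/ is an assumption; the condition above only says the folder must contain at least one file.

```python
from pathlib import Path

REQUIRED_FILES = ("problem.md", "restrictions.md", "acceptance_criteria.md")


def problem_is_complete(root: Path) -> bool:
    """True when _docs/00_problem/ has every required, non-empty artifact."""
    problem_dir = root / "_docs" / "00_problem"
    if not problem_dir.is_dir():
        return False
    for name in REQUIRED_FILES:
        path = problem_dir / name
        # A missing or zero-byte file counts as "missing/empty".
        if not path.is_file() or path.stat().st_size == 0:
            return False
    input_data = problem_dir / "input_data"
    # Assumption: a file anywhere below input_data/ satisfies the rule.
    return input_data.is_dir() and any(p.is_file() for p in input_data.rglob("*"))
```

Step 1 applies when this returns False; Step 2 and later probes only make sense once it returns True.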
Step 2 — Research (Initial)
Condition: _docs/00_problem/ is complete AND _docs/01_solution/ has no solution_draft*.md files
Action: Read and execute .cursor/skills/research/SKILL.md (will auto-detect Mode A)
Research Decision (inline gate between Step 2 and Step 3)
Condition: _docs/01_solution/ contains solution_draft*.md files AND _docs/01_solution/solution.md does not exist AND _docs/02_document/architecture.md does not exist
Action: Present the current research state to the user:
- How many solution drafts exist
- Whether tech_stack.md and security_analysis.md exist
- One-line summary from the latest draft
Then present using the Choose format:
══════════════════════════════════════
DECISION REQUIRED: Research complete — next action?
══════════════════════════════════════
A) Run another research round (Mode B assessment)
B) Proceed to planning with current draft
══════════════════════════════════════
Recommendation: [A or B] — [reason based on draft quality]
══════════════════════════════════════
- If user picks A → Read and execute .cursor/skills/research/SKILL.md (will auto-detect Mode B)
- If user picks B → auto-chain to Step 3 (Plan)
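The Choose format shown above is regular enough to render from data. The helper below is a hypothetical sketch; `render_choice` and the default bar width are assumptions for illustration, not part of the autodev protocol.

```python
def render_choice(
    title: str,
    options: list[str],
    recommendation: str,
    bar_width: int = 38,  # assumed width; adjust to match the banner in use
) -> str:
    """Render a decision prompt: banner, lettered options, recommendation."""
    bar = "═" * bar_width
    lines = [bar, f"DECISION REQUIRED: {title}", bar]
    for index, option in enumerate(options):
        letter = chr(ord("A") + index)  # A, B, C, ...
        lines.append(f"{letter}) {option}")
    lines += [bar, f"Recommendation: {recommendation}", bar]
    return "\n".join(lines)
```

Keeping the prompt in one helper means every gate in the flow presents options with the same layout.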
Step 3 — Plan
Condition: _docs/01_solution/ has solution_draft*.md files AND _docs/02_document/architecture.md does not exist
Action:
- The plan skill's Prereq 2 will rename the latest draft to solution.md; this is handled by the plan skill itself
- Read and execute .cursor/skills/plan/SKILL.md
If _docs/02_document/ exists but is incomplete (has some artifacts but no FINAL_report.md), the plan skill's built-in resumability handles it.
Step 4 — UI Design (conditional)
Condition (folder fallback): _docs/02_document/architecture.md exists AND _docs/02_document/tests/traceability-matrix.md does not exist.
State-driven: reached by auto-chain from Step 3.
Action: Read and execute .cursor/skills/ui-design/SKILL.md. The skill runs its own Applicability Check, which handles UI project detection and the user's A/B choice. It returns one of:
- outcome: completed → mark Step 4 as completed, auto-chain to Step 5 (Test Spec).
- outcome: skipped, reason: not-a-ui-project → mark Step 4 as skipped, auto-chain to Step 5.
- outcome: skipped, reason: user-declined → mark Step 4 as skipped, auto-chain to Step 5.
The autodev no longer inlines UI detection heuristics — they live in ui-design/SKILL.md under "Applicability Check".
Step 5 — Test Spec
Condition (folder fallback): _docs/02_document/FINAL_report.md exists AND _docs/02_document/architecture.md exists AND _docs/02_document/tests/traceability-matrix.md does not exist.
State-driven: reached by auto-chain from Step 4 (completed or skipped).
Action: Read and execute .cursor/skills/test-spec/SKILL.md.
This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
Step 6 — Decompose
Condition: _docs/02_document/ contains architecture.md AND _docs/02_document/components/ has at least one component AND _docs/02_document/tests/traceability-matrix.md exists AND _docs/02_tasks/todo/ does not exist or has no implementation task files.
Action: Invoke .cursor/skills/decompose/SKILL.md for implementation task decomposition. The greenfield flow selects the implementation entrypoint before handing off: Bootstrap Structure, Module Layout, Component Task Decomposition, and Cross-Task Verification.
Do not invoke Blackbox Test Task Decomposition from Step 6. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality and Step 8 can revise testability before test task files exist.
If _docs/02_tasks/ subfolders have some task files already, the decompose skill's resumability handles it.
Step 7 — Implement
Condition: _docs/02_tasks/todo/ contains implementation task files AND _dependencies_table.md exists AND _docs/03_implementation/ does not contain a valid product implementation report.
Action: Invoke .cursor/skills/implement/SKILL.md with task selection context Product implementation.
The implement skill must run its Product Implementation Completeness Gate before it writes any final product implementation report. This gate compares completed product task specs, architecture/component promises, and actual source code so scaffold-only implementations cannot advance to Step 8. A final product implementation report without _docs/03_implementation/implementation_completeness_cycle[N]_report.md is incomplete and must not be treated as Step 7 completion.
If _docs/03_implementation/ has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.
For folder fallback, implementation task files means task specs that are not test-only specs: exclude *_test_infrastructure.md and task specs whose **Component** or **Epic** identifies Blackbox Tests.
For folder fallback, a product implementation report is any _docs/03_implementation/implementation_report_*.md file except _docs/03_implementation/implementation_report_tests.md and refactor reports. It is valid for greenfield progression only when:
- the matching _docs/03_implementation/implementation_completeness_cycle[N]_report.md exists,
- that completeness report does not contain unresolved FAIL classifications, and
- _docs/02_tasks/todo/ contains no pending implementation task files.
If a product report exists but any of those validity checks fail, treat product implementation as incomplete and stay in Step 7.
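The three validity checks combine with a plain logical AND. The sketch below assumes the evidence has already been gathered from the folders; `ReportEvidence` and `report_is_valid` are hypothetical names, and how a completeness report is matched to its cycle or scanned for FAIL classifications is deliberately left out because this document does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportEvidence:
    """Facts collected from _docs/03_implementation/ and _docs/02_tasks/todo/."""
    completeness_report_exists: bool
    unresolved_fail_count: int
    pending_implementation_tasks: int


def report_is_valid(evidence: ReportEvidence) -> bool:
    """A product report allows greenfield progression only if all checks hold."""
    return (
        evidence.completeness_report_exists
        and evidence.unresolved_fail_count == 0
        and evidence.pending_implementation_tasks == 0
    )
```

When this returns False for an existing product report, the flow stays in Step 7.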
Step 8 — Code Testability Revision
Condition (folder fallback): _docs/03_implementation/ contains a valid product implementation report, _docs/03_implementation/implementation_completeness_cycle[N]_report.md exists without unresolved FAIL classifications, _docs/04_refactoring/01-testability-refactoring/testability_assessment.md does not exist, _docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md does not exist, _docs/03_implementation/implementation_report_tests.md does not exist, and _docs/02_tasks/todo/ does not contain test task files.
State-driven: reached by auto-chain from Step 7.
Purpose: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
Scope — MINIMAL, SURGICAL fixes: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
Allowed changes in this phase:
- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
- Extract narrow interfaces for components that need stubbing in tests.
- Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
- Wrap global singletons in thin accessors that tests can override.
- Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
NOT allowed in this phase (defer to a later refactor task):
- Renaming public APIs.
- Moving code between files unless strictly required for isolation.
- Changing algorithms or business logic.
- Restructuring module boundaries or rewriting layers.
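As an illustration of an allowed change, the sketch below replaces a hardcoded URL and an inline HTTP call with an environment variable and an optional constructor parameter. Everything here is hypothetical (`StatusClient`, `STATUS_BASE_URL`, the `/health` path); the point is only that the defaults preserve the existing behavior, so current callers do not break.

```python
import os
import urllib.request
from typing import Callable, Optional


def _default_fetch(url: str) -> bytes:
    """Existing behavior: perform a real HTTP GET."""
    with urllib.request.urlopen(url) as response:
        return response.read()


class StatusClient:
    def __init__(
        self,
        base_url: Optional[str] = None,
        fetch: Optional[Callable[[str], bytes]] = None,
    ) -> None:
        # Was a hardcoded URL; now overridable by argument or env var.
        self.base_url = base_url or os.environ.get(
            "STATUS_BASE_URL", "http://localhost:8080"
        )
        # Was an inline urlopen call; tests can inject a stub instead.
        self._fetch = fetch or _default_fetch

    def is_healthy(self) -> bool:
        return self._fetch(f"{self.base_url}/health") == b"ok"
```

A test can then construct `StatusClient(base_url="http://example.test", fetch=lambda url: b"ok")` without touching the network, while production code keeps calling `StatusClient()` unchanged.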
Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
- Read _docs/02_document/tests/traceability-matrix.md and all test scenario files in _docs/02_document/tests/.
- For each test scenario, check whether the code under test can be exercised in isolation. Look for:
  - Hardcoded file paths or directory references
  - Hardcoded configuration values (URLs, credentials, magic numbers)
  - Global mutable state that cannot be overridden
  - Tight coupling to external services without abstraction
  - Missing dependency injection or non-configurable parameters
  - Direct file system operations without path configurability
  - Inline construction of heavy dependencies (models, clients)
- If ALL scenarios are testable as-is:
  - Create _docs/04_refactoring/01-testability-refactoring/
  - Write _docs/04_refactoring/01-testability-refactoring/testability_assessment.md with the scenarios reviewed and outcome "Code is testable — no changes needed"
  - Mark Step 8 as completed with outcome "Code is testable — no changes needed"
  - Auto-chain to Step 9 (Decompose Tests)
- If testability issues are found:
  - Create _docs/04_refactoring/01-testability-refactoring/
  - Write list-of-changes.md in that directory using the refactor skill template (.cursor/skills/refactor/templates/list-of-changes.md), with:
    - Mode: guided
    - Source: autodev-greenfield-testability-analysis
    - One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
  - Invoke the refactor skill in guided mode: read and execute .cursor/skills/refactor/SKILL.md with the list-of-changes.md as input
  - Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
  - After execution, surface RUN_DIR/testability_changes_summary.md to the user via the Choose format (accept / request follow-up) before auto-chaining
  - Copy or save the accepted summary as _docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md so folder fallback can detect Step 8 completion
  - Mark Step 8 as completed
  - Auto-chain to Step 9 (Decompose Tests)
Step 9 — Decompose Tests
Condition (folder fallback): _docs/02_document/tests/traceability-matrix.md exists AND workspace contains source code files AND _docs/03_implementation/ contains a valid product implementation report AND _docs/03_implementation/implementation_completeness_cycle[N]_report.md exists without unresolved FAIL classifications AND (_docs/04_refactoring/01-testability-refactoring/testability_assessment.md exists OR _docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md exists) AND (_docs/02_tasks/todo/ does not exist or has no test task files) AND _docs/03_implementation/implementation_report_tests.md does not exist.
State-driven: reached by auto-chain from Step 8.
Action: Read and execute .cursor/skills/decompose/SKILL.md in tests-only mode (pass _docs/02_document/tests/ as input). The decompose skill will:
- Run Step 1t (test infrastructure bootstrap)
- Run Step 3 (blackbox/e2e-capable test task decomposition)
- Run Step 4 (cross-verification against test coverage)
If _docs/02_tasks/ subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
Step 10 — Implement Tests
Condition (folder fallback): _docs/02_tasks/todo/ contains test task files AND _dependencies_table.md exists AND _docs/03_implementation/implementation_report_tests.md does not exist.
State-driven: reached by auto-chain from Step 9.
Action: Invoke .cursor/skills/implement/SKILL.md with task selection context Test implementation.
The implement skill reads only test tasks from _docs/02_tasks/todo/ and implements them.
If _docs/03_implementation/ has batch reports, the implement skill detects completed test tasks and continues.
For folder fallback, test task files means *_test_infrastructure.md plus task specs whose **Component** or **Epic** identifies Blackbox Tests.
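The folder-fallback classification of task files can be sketched as below. The regular expression assumes task specs carry lines of the form `**Component**: <value>` or `**Epic**: <value>`; that exact layout is an assumption, and `is_test_task` is a hypothetical helper name.

```python
import re

# Assumed field layout: a line starting with **Component** or **Epic**,
# an optional colon, then the value on the same line.
_FIELD = re.compile(r"^\*\*(Component|Epic)\*\*[ \t]*:?[ \t]*(.*)$", re.MULTILINE)


def is_test_task(filename: str, text: str) -> bool:
    """True for test infrastructure specs and specs tagged as Blackbox Tests."""
    if filename.endswith("_test_infrastructure.md"):
        return True
    return any(
        "blackbox tests" in value.lower() for _field, value in _FIELD.findall(text)
    )
```

Under the same assumption, implementation task files for Step 7 are the complement: any task spec for which `is_test_task` returns False.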
Step 11 — Run Tests
Condition (folder fallback): _docs/03_implementation/implementation_report_tests.md exists.
State-driven: reached by auto-chain from Step 10.
Action: Read and execute .cursor/skills/test-run/SKILL.md
Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync. This is a hard product gate, not a harness-smoke gate: e2e/blackbox tests must exercise the actual implemented system through public runtime boundaries and compare actual outputs against _docs/00_problem/input_data/expected_results/results_report.md or referenced machine-readable expected-result files. Stubs are allowed only for external systems outside the product boundary; missing internal product implementation must fail or block the gate and send the flow back to Implement.
Step 12 — Test-Spec Sync
State-driven: reached by auto-chain from Step 11. Requires _docs/02_document/tests/traceability-matrix.md to exist — if missing, mark Step 12 skipped (see Action below).
Action: Read and execute .cursor/skills/test-spec/SKILL.md in cycle-update mode. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If traceability-matrix.md is missing, mark Step 12 as skipped — the next /test-spec full run will regenerate it.
After completion, auto-chain to Step 13 (Update Docs).
Step 13 — Update Docs
State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires _docs/02_document/ to contain existing documentation — if missing, mark Step 13 skipped (see Action below).
Action: Read and execute .cursor/skills/document/SKILL.md in Task mode. Pass all completed implementation and test task spec files plus the implementation reports.
The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
If _docs/02_document/ does not contain existing docs, mark Step 13 as skipped.
After completion, auto-chain to Step 14 (Security Audit).
Step 14 — Security Audit (optional)
State-driven: reached by auto-chain from Step 13 (completed or skipped).
Action: Apply the Optional Skill Gate (protocols.md → "Optional Skill Gate") with:
- question: Run security audit before deploy?
- option-a-label: Run security audit (recommended for production deployments)
- option-b-label: Skip — proceed directly to deploy
- recommendation: A — catches vulnerabilities before production
- target-skill: .cursor/skills/security/SKILL.md
- next-step: Step 15 (Performance Test)
Step 15 — Performance Test (optional)
State-driven: reached by auto-chain from Step 14 (completed or skipped).
Action: Apply the Optional Skill Gate (protocols.md → "Optional Skill Gate") with:
- question: Run performance/load tests before deploy?
- option-a-label: Run performance tests (recommended for latency-sensitive or high-load systems)
- option-b-label: Skip — proceed directly to deploy
- recommendation: A or B — base on whether acceptance criteria include latency, throughput, or load requirements
- target-skill: .cursor/skills/test-run/SKILL.md in perf mode (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
- next-step: Step 16 (Deploy)
Step 16 — Deploy (optional)
State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).
Action: Apply the Optional Skill Gate (protocols.md → "Optional Skill Gate") with:
- question: Run deploy planning (scripts, procedures, compose overlays) now?
- option-a-label: Run deploy — produce/update deploy artifacts and scripts
- option-b-label: Skip — continue development; deploy when ready for production
- recommendation: B when the product is not ready to ship; A when targeting a release soon
- target-skill: .cursor/skills/deploy/SKILL.md
- next-step: Step 16.5 (Release) — only when Step 16 was completed; otherwise Step 17 (Retrospective)
On skip: mark Step 16 and Step 16.5 as skipped; record in the release report (if one exists) or _docs/_autodev_state.md sub_step.detail that deploy/release were deferred; auto-chain to Step 17 (Retrospective in cycle-end mode).
On complete: mark Step 16 completed and auto-chain to Step 16.5 (Release).
Step 16.5 — Release (optional)
State-driven: reached by auto-chain from Step 16 only when Step 16 status is completed. If Step 16 was skipped, Step 16.5 is also skipped and the flow does not invoke /release.
Action: Read and execute .cursor/skills/release/SKILL.md. The release skill is responsible for selecting the target environment, executing the deploy artifacts, smoke-testing, watching the rollout, and producing a definitive verdict (Released, Released-with-override, Rolled-Back, or Aborted).
The release skill has its own internal BLOCKING gates (Phase 1 pre-release gate, Phase 2 strategy select, Phase 6 user confirmation when soft regression escalates). Autodev does NOT add a wrapping A/B/C gate — the release skill owns its own user interaction.
After the release skill exits:
- Verdict Released → mark Step 16.5 completed and auto-chain to Step 17 (Retrospective in cycle-end mode).
- Verdict Released-with-override → mark Step 16.5 completed AND auto-chain to Step 17 (Retrospective in incident mode); the override is itself an incident the retrospective must analyze.
- Verdict Rolled-Back → mark Step 16.5 failed. Auto-chain to Step 17 (Retrospective in incident mode). Do NOT consider the project "Done"; the user owns the next move (re-run /implement on a fix branch, re-run /deploy, re-run /release).
- Verdict Aborted → mark Step 16.5 not_started (the release was never started) OR failed if the abort came after Phase 3 had already touched the live system. Surface the abort reason and STOP; do not auto-chain to retrospective.
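The verdict handling above amounts to a small lookup with one conditional case. The sketch below uses hypothetical names (`ReleaseOutcome`, `handle_verdict`, `touched_live_system`); only the verdict strings, statuses, and retrospective modes come from this document.

```python
from typing import NamedTuple, Optional


class ReleaseOutcome(NamedTuple):
    step_status: str           # status recorded for Step 16.5
    retro_mode: Optional[str]  # retrospective mode, or None when the flow stops


def handle_verdict(verdict: str, touched_live_system: bool = False) -> ReleaseOutcome:
    """Map a release verdict to the Step 16.5 status and the follow-up action."""
    if verdict == "Released":
        return ReleaseOutcome("completed", "cycle-end")
    if verdict == "Released-with-override":
        # The override is itself an incident to analyze.
        return ReleaseOutcome("completed", "incident")
    if verdict == "Rolled-Back":
        return ReleaseOutcome("failed", "incident")
    if verdict == "Aborted":
        # Failed only if the abort came after the live system was touched.
        status = "failed" if touched_live_system else "not_started"
        return ReleaseOutcome(status, None)  # STOP: no retrospective
    raise ValueError(f"unknown release verdict: {verdict!r}")
```

A `retro_mode` of None signals the caller to surface the abort reason and stop rather than auto-chain.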
Step 17 — Retrospective
State-driven: reached by auto-chain from Step 16.5 (any verdict except Aborted, which stops the flow) OR when Steps 16 and 16.5 were both skipped (cycle-end mode; note in the retro report that deploy/release were deferred).
Action: Read and execute .cursor/skills/retrospective/SKILL.md. Mode selection:
- Step 16.5 verdict Released → cycle-end mode
- Step 16.5 verdict Released-with-override or Rolled-Back → incident mode
The retrospective closes the cycle's feedback loop by folding metrics into _docs/06_metrics/retro_<date>.md (or incident_<date>_release.md in incident mode) and appending the top-3 lessons to _docs/LESSONS.md.
After retrospective completes, mark Step 17 as completed and enter "Done" evaluation.
Done
State-driven: reached by auto-chain from Step 17. (Sanity check: if Step 16 was completed, _docs/04_deploy/ should contain the expected deploy artifacts. If Step 16.5 was completed, _docs/04_release/ should contain a release report with a definitive verdict. Skipped deploy/release is valid — no release report required.)
Action: Report project completion with summary. Then rewrite the state file so the next /autodev invocation enters the feature-cycle loop in the existing-code flow:
flow: existing-code
step: 9
name: New Task
status: not_started
sub_step:
  phase: 0
  name: awaiting-invocation
  detail: ""
retry_count: 0
cycle: 1
On the next invocation, Flow Resolution rule 1 reads flow: existing-code, and re-entry proceeds directly to existing-code Step 9 (New Task).
Auto-Chain Rules
| Completed Step | Next Action |
|---|---|
| Problem (1) | Auto-chain → Research (2) |
| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
| Research Decision → proceed | Auto-chain → Plan (3) |
| Plan (3) | Auto-chain → UI Design detection (4) |
| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
| Test Spec (5) | Auto-chain → Decompose (6) |
| Decompose (6) | Session boundary — suggest new conversation before Implement |
| Implement (7) | Auto-chain only after Product Implementation Completeness Gate passes → Code Testability Revision (8) |
| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
| Decompose Tests (9) | Session boundary — suggest new conversation before Implement Tests |
| Implement Tests (10) | Auto-chain → Run Tests (11) |
| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
| Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
| Performance Test (15, done or skipped) | Auto-chain → Deploy choice (16) |
| Deploy (16, completed) | Auto-chain → Release (16.5) |
| Deploy (16, skipped) | Mark 16.5 skipped → auto-chain → Retrospective (17, cycle-end mode) |
| Release (16.5, verdict Released) | Auto-chain → Retrospective (17, cycle-end mode) |
| Release (16.5, verdict Released-with-override) | Auto-chain → Retrospective (17, incident mode) |
| Release (16.5, verdict Rolled-Back) | Auto-chain → Retrospective (17, incident mode); do NOT enter Done |
| Release (16.5, verdict Aborted) | STOP — surface abort reason; do not auto-chain |
| Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
Status Summary — Step List
Flow name: greenfield. Render using the banner template in protocols.md → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|---|---|
| 1 | Problem | — |
| 2 | Research | DONE (N drafts) |
| 3 | Plan | — |
| 4 | UI Design | — |
| 5 | Test Spec | — |
| 6 | Decompose | DONE (N tasks) |
| 7 | Implement | IN PROGRESS (batch M of ~N) |
| 8 | Code Testability Revision | — |
| 9 | Decompose Tests | DONE (N tasks) |
| 10 | Implement Tests | IN PROGRESS (batch M) |
| 11 | Run Tests | DONE (N passed, M failed) |
| 12 | Test-Spec Sync | — |
| 13 | Update Docs | — |
| 14 | Security Audit | — |
| 15 | Performance Test | — |
| 16 | Deploy | — |
| 16.5 | Release | `DONE (Released |
| 17 | Retrospective | — |
All rows also accept the shared state tokens (DONE, IN PROGRESS, NOT STARTED, FAILED (retry N/3)); rows 4, 12, 13, 14, 15, 16, 16.5 additionally accept SKIPPED.
Row rendering format (step-number column is right-padded to 2 characters for alignment):
Step 1 Problem [<state token>]
Step 2 Research [<state token>]
Step 3 Plan [<state token>]
Step 4 UI Design [<state token>]
Step 5 Test Spec [<state token>]
Step 6 Decompose [<state token>]
Step 7 Implement [<state token>]
Step 8 Code Testability Rev. [<state token>]
Step 9 Decompose Tests [<state token>]
Step 10 Implement Tests [<state token>]
Step 11 Run Tests [<state token>]
Step 12 Test-Spec Sync [<state token>]
Step 13 Update Docs [<state token>]
Step 14 Security Audit [<state token>]
Step 15 Performance Test [<state token>]
Step 16 Deploy [<state token>]
Step 16.5 Release [<state token>]
Step 17 Retrospective [<state token>]