mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 09:01:14 +00:00

Oleksandr Bezdieniezhnykh cab7b5d020 [AZ-233] Update Docker Compose and enhance test documentation

- Modified the Docker Compose configuration to include an input root for replay tests and added an environment variable for enabling SITL.
- Enhanced documentation for various testing processes, including the addition of a Runtime Completeness Decomposition Gate and clarifications on internal module testing requirements.
- Updated the implementation completeness report to reflect the current state and added new test cases for performance and resilience scenarios.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-06 05:03:48 +03:00

Greenfield Workflow

Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.

Step Reference Table

Step	Name	Sub-Skill	Internal Sub-Steps
1	Problem	problem/SKILL.md	Phases 1–4
2	Research	research/SKILL.md	Mode A: Phases 1–4 · Mode B: Steps 0–8
3	Plan	plan/SKILL.md	Steps 1–6 + Final
4	UI Design	ui-design/SKILL.md	Phases 0–8 (conditional, UI projects only)
5	Test Spec	test-spec/SKILL.md	Phases 1–4
6	Decompose	decompose/SKILL.md (implementation task decomposition)	Step 1 + Step 1.5 + Step 2 + Step 4
7	Implement	implement/SKILL.md	Batch loop + Product Implementation Completeness Gate
8	Code Testability Revision	refactor/SKILL.md (guided mode)	Phases 0–7 (conditional)
9	Decompose Tests	decompose/SKILL.md (tests-only)	Step 1t + Step 3 + Step 4
10	Implement Tests	implement/SKILL.md	(batch-driven, no fixed sub-steps)
11	Run Tests	test-run/SKILL.md	Steps 1–4
12	Test-Spec Sync	test-spec/SKILL.md (cycle-update mode)	Phase 2 + Phase 3 (scoped)
13	Update Docs	document/SKILL.md (task mode)	Task Steps 0–5
14	Security Audit	security/SKILL.md	Phases 1–5 (optional)
15	Performance Test	test-run/SKILL.md (perf mode)	Steps 1–5 (optional)
16	Deploy	deploy/SKILL.md	Steps 1–7
17	Retrospective	retrospective/SKILL.md (cycle-end mode)	Steps 1–4

Detection Rules

Resolution: when a state file exists, state.step + state.status drive detection and the conditions below are not consulted. When no state file exists (cold start), walk the rules in order — first folder-probe match wins. Steps without a folder probe are state-driven only; they can only be reached by auto-chain from a prior step.
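The resolution order above can be sketched as a small function. This is a minimal illustration, not the autodev implementation: the `resolve_step` name, the shape of the `state` dict, and the `(step, probe)` rule pairs are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical rule shape: (step identifier, folder probe returning True on match).
Rule = Tuple[str, Callable[[], bool]]


def resolve_step(state: Optional[dict], rules: Sequence[Rule]) -> Optional[str]:
    """Pick the step to run: the state file wins, otherwise the first matching probe."""
    if state is not None:
        # A state file exists, so folder probes are never consulted.
        return str(state["step"])
    for step, probe in rules:
        if probe():
            return step  # first folder-probe match wins
    return None  # nothing matched; state-driven steps are unreachable on cold start
```

Because probes are only called on a cold start, a stale folder layout cannot override an existing state file in this sketch.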

Step 1 — Problem Gathering Condition: _docs/00_problem/ does not exist, OR any of these are missing/empty:

problem.md
restrictions.md
acceptance_criteria.md
input_data/ (must contain at least one file)

Action: Read and execute .cursor/skills/problem/SKILL.md
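As a rough sketch, the Step 1 condition could be probed as follows. The function name is hypothetical, "empty" is assumed to mean zero bytes, and files nested under input_data/ are assumed to count toward "at least one file".

```python
from pathlib import Path

REQUIRED_FILES = ("problem.md", "restrictions.md", "acceptance_criteria.md")


def problem_folder_incomplete(docs_root: Path) -> bool:
    """Return True when the Step 1 condition holds (problem gathering still needed)."""
    problem_dir = docs_root / "00_problem"
    if not problem_dir.is_dir():
        return True
    for name in REQUIRED_FILES:
        candidate = problem_dir / name
        # Assumption: a zero-byte file counts as "empty".
        if not candidate.is_file() or candidate.stat().st_size == 0:
            return True
    input_dir = problem_dir / "input_data"
    if not input_dir.is_dir():
        return True
    # Assumption: nested files (for example under a subfolder) also count.
    return not any(p.is_file() for p in input_dir.rglob("*"))
```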

Step 2 — Research (Initial) Condition: _docs/00_problem/ is complete AND _docs/01_solution/ has no solution_draft*.md files

Action: Read and execute .cursor/skills/research/SKILL.md (will auto-detect Mode A)

Research Decision (inline gate between Step 2 and Step 3) Condition: _docs/01_solution/ contains solution_draft*.md files AND _docs/01_solution/solution.md does not exist AND _docs/02_document/architecture.md does not exist

Action: Present the current research state to the user:

How many solution drafts exist
Whether tech_stack.md and security_analysis.md exist
One-line summary from the latest draft

Then present using the Choose format:

══════════════════════════════════════
 DECISION REQUIRED: Research complete — next action?
══════════════════════════════════════
 A) Run another research round (Mode B assessment)
 B) Proceed to planning with current draft
══════════════════════════════════════
 Recommendation: [A or B] — [reason based on draft quality]
══════════════════════════════════════

If user picks A → Read and execute .cursor/skills/research/SKILL.md (will auto-detect Mode B)
If user picks B → auto-chain to Step 3 (Plan)

Step 3 — Plan Condition: _docs/01_solution/ has solution_draft*.md files AND _docs/02_document/architecture.md does not exist

Action:

The plan skill's Prereq 2 renames the latest draft to solution.md; the plan skill handles this itself.
Read and execute .cursor/skills/plan/SKILL.md

If _docs/02_document/ exists but is incomplete (has some artifacts but no FINAL_report.md), the plan skill's built-in resumability handles it.

Step 4 — UI Design (conditional) Condition (folder fallback): _docs/02_document/architecture.md exists AND _docs/02_document/tests/traceability-matrix.md does not exist. State-driven: reached by auto-chain from Step 3.

Action: Read and execute .cursor/skills/ui-design/SKILL.md. The skill runs its own Applicability Check, which handles UI project detection and the user's A/B choice. It returns one of:

outcome: completed → mark Step 4 as completed, auto-chain to Step 5 (Test Spec).
outcome: skipped, reason: not-a-ui-project → mark Step 4 as skipped, auto-chain to Step 5.
outcome: skipped, reason: user-declined → mark Step 4 as skipped, auto-chain to Step 5.

Autodev no longer inlines UI detection heuristics; they live in ui-design/SKILL.md under "Applicability Check".
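The three outcomes listed for Step 4 map onto a status and a next step. A minimal sketch of that mapping follows; the function name and the returned dict shape are assumptions for illustration, not the state file schema.

```python
def apply_ui_design_outcome(outcome: str, reason: str = "") -> dict:
    """Translate the ui-design skill's result into a Step 4 status and next step."""
    if outcome == "completed":
        status = "completed"
    elif outcome == "skipped" and reason in {"not-a-ui-project", "user-declined"}:
        status = "skipped"
    else:
        raise ValueError(f"unexpected outcome: {outcome!r} (reason: {reason!r})")
    # Both completed and skipped auto-chain to Step 5 (Test Spec).
    return {"step": 4, "status": status, "next_step": 5}
```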

Step 5 — Test Spec Condition (folder fallback): _docs/02_document/FINAL_report.md exists AND _docs/02_document/architecture.md exists AND _docs/02_document/tests/traceability-matrix.md does not exist. State-driven: reached by auto-chain from Step 4 (completed or skipped).

Action: Read and execute .cursor/skills/test-spec/SKILL.md.

This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.

Step 6 — Decompose Condition: _docs/02_document/ contains architecture.md AND _docs/02_document/components/ has at least one component AND _docs/02_document/tests/traceability-matrix.md exists AND _docs/02_tasks/todo/ does not exist or has no implementation task files.

Action: Invoke .cursor/skills/decompose/SKILL.md for implementation task decomposition. The greenfield flow selects the implementation entrypoint before handing off: Bootstrap Structure, Module Layout, Component Task Decomposition, and Cross-Task Verification.

Do not invoke Blackbox Test Task Decomposition from Step 6. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality and Step 8 can revise testability before test task files exist.

If _docs/02_tasks/ subfolders have some task files already, the decompose skill's resumability handles it.

Step 7 — Implement Condition: _docs/02_tasks/todo/ contains implementation task files AND _dependencies_table.md exists AND _docs/03_implementation/ does not contain a valid product implementation report.

Action: Invoke .cursor/skills/implement/SKILL.md with the task selection context "Product implementation".

The implement skill must run its Product Implementation Completeness Gate before it writes any final product implementation report. This gate compares completed product task specs, architecture/component promises, and actual source code so scaffold-only implementations cannot advance to Step 8. A final product implementation report without _docs/03_implementation/implementation_completeness_cycle[N]_report.md is incomplete and must not be treated as Step 7 completion.

If _docs/03_implementation/ has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent; see the implement skill documentation for the naming convention.

For folder fallback, "implementation task files" means task specs that are not test-only specs: exclude *_test_infrastructure.md and task specs whose **Component** or **Epic** identifies Blackbox Tests.
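A sketch of how the test-only exclusion might be checked. The field layout (`**Component**: value` on one line) is an assumption about the task spec format, and `is_test_task` is a hypothetical helper name.

```python
import re

# Assumption: fields appear as "**Component**: value" or "**Epic**: value" on a single line.
_FIELD = re.compile(r"\*\*(?:Component|Epic)\*\*[ \t]*:?[ \t]*(.*)")


def is_test_task(filename: str, spec_text: str) -> bool:
    """Return True for test-only specs; everything else is an implementation task."""
    if filename.endswith("_test_infrastructure.md"):
        return True
    return any(
        "blackbox tests" in match.group(1).lower()
        for match in _FIELD.finditer(spec_text)
    )
```

An implementation task file is then any spec for which `is_test_task` returns False.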

For folder fallback, a product implementation report is any _docs/03_implementation/implementation_report_*.md file except _docs/03_implementation/implementation_report_tests.md and refactor reports. It is valid for greenfield progression only when:

the matching _docs/03_implementation/implementation_completeness_cycle[N]_report.md exists,
that completeness report does not contain unresolved FAIL classifications, and
_docs/02_tasks/todo/ contains no pending implementation task files.

If a product report exists but any of those validity checks fail, treat product implementation as incomplete and stay in Step 7.
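The validity checks above could be sketched like this. Several details are assumptions made only for illustration: refactor reports are recognized by "refactor" in the filename, the matching completeness report is selected by a caller-supplied cycle number, and an unresolved FAIL is any line containing FAIL without the word "resolved".

```python
import re
from pathlib import Path


def is_product_report(path: Path) -> bool:
    """Any implementation_report_*.md except the tests report and refactor reports."""
    name = path.name
    if not (name.startswith("implementation_report_") and name.endswith(".md")):
        return False
    # Assumption: refactor reports carry "refactor" in their filename.
    return name != "implementation_report_tests.md" and "refactor" not in name


def has_unresolved_fail(text: str) -> bool:
    # Hypothetical format: a FAIL line counts as resolved only if it says "resolved".
    return any(
        re.search(r"\bFAIL\b", line) and not re.search(r"\bresolved\b", line, re.I)
        for line in text.splitlines()
    )


def product_implementation_valid(impl_dir: Path, cycle: int, pending_impl_tasks: int) -> bool:
    """Apply the three validity checks; any failure means staying in Step 7."""
    if not impl_dir.is_dir():
        return False
    if not any(is_product_report(p) for p in impl_dir.glob("implementation_report_*.md")):
        return False
    completeness = impl_dir / f"implementation_completeness_cycle{cycle}_report.md"
    if not completeness.is_file():
        return False
    if has_unresolved_fail(completeness.read_text(encoding="utf-8")):
        return False
    return pending_impl_tasks == 0
```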

Step 8 — Code Testability Revision Condition (folder fallback): all of the following hold:

_docs/03_implementation/ contains a valid product implementation report,
_docs/03_implementation/implementation_completeness_cycle[N]_report.md exists without unresolved FAIL classifications,
_docs/04_refactoring/01-testability-refactoring/testability_assessment.md does not exist,
_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md does not exist,
_docs/03_implementation/implementation_report_tests.md does not exist, and
_docs/02_tasks/todo/ does not contain test task files.

State-driven: reached by auto-chain from Step 7.

Purpose: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.

Scope — MINIMAL, SURGICAL fixes: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.

Allowed changes in this phase:

Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
Extract narrow interfaces for components that need stubbing in tests.
Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
Wrap global singletons in thin accessors that tests can override.
Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
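To make the allowed changes concrete, here is a small sketch of two of them combined: a hardcoded URL moved to an environment variable, and an optional constructor parameter for dependency injection that defaults to the existing behavior. `TileLoader`, `TILE_BASE_URL`, and the default URL are hypothetical names invented for this example.

```python
import os
from typing import Callable, Optional

# Hypothetical value that used to be hardcoded inside the class.
DEFAULT_TILE_URL = "http://localhost:8080/tiles"


def _default_fetch(url: str) -> bytes:
    # Existing behavior: a real HTTP request. Imported lazily so tests never need it.
    from urllib.request import urlopen

    with urlopen(url) as response:
        return response.read()


class TileLoader:
    def __init__(
        self,
        base_url: Optional[str] = None,
        fetch: Optional[Callable[[str], bytes]] = None,
    ) -> None:
        # Env var replaces the hardcoded URL; callers passing nothing keep working.
        self.base_url = base_url or os.environ.get("TILE_BASE_URL", DEFAULT_TILE_URL)
        # Optional injection point; the default preserves the original behavior.
        self._fetch = fetch or _default_fetch

    def load(self, tile_id: str) -> bytes:
        return self._fetch(f"{self.base_url}/{tile_id}")
```

A test can then construct `TileLoader(base_url="http://stub", fetch=lambda url: b"fake")` without touching the network, while production callers remain unchanged.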

NOT allowed in this phase (defer to a later refactor task):

Renaming public APIs.
Moving code between files unless strictly required for isolation.
Changing algorithms or business logic.
Restructuring module boundaries or rewriting layers.

Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.

Read _docs/02_document/tests/traceability-matrix.md and all test scenario files in _docs/02_document/tests/.
For each test scenario, check whether the code under test can be exercised in isolation. Look for:
- Hardcoded file paths or directory references
- Hardcoded configuration values (URLs, credentials, magic numbers)
- Global mutable state that cannot be overridden
- Tight coupling to external services without abstraction
- Missing dependency injection or non-configurable parameters
- Direct file system operations without path configurability
- Inline construction of heavy dependencies (models, clients)
If ALL scenarios are testable as-is:
- Create _docs/04_refactoring/01-testability-refactoring/
- Write _docs/04_refactoring/01-testability-refactoring/testability_assessment.md with the scenarios reviewed and outcome "Code is testable — no changes needed"
- Mark Step 8 as completed with outcome "Code is testable — no changes needed"
- Auto-chain to Step 9 (Decompose Tests)
If testability issues are found:
- Create _docs/04_refactoring/01-testability-refactoring/
- Write list-of-changes.md in that directory using the refactor skill template (.cursor/skills/refactor/templates/list-of-changes.md), with:
  - Mode: guided
  - Source: autodev-greenfield-testability-analysis
  - One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
- Invoke the refactor skill in guided mode: read and execute .cursor/skills/refactor/SKILL.md with the list-of-changes.md as input
- Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
- After execution, surface RUN_DIR/testability_changes_summary.md to the user via the Choose format (accept / request follow-up) before auto-chaining
- Copy or save the accepted summary as _docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md so folder fallback can detect Step 8 completion
- Mark Step 8 as completed
- Auto-chain to Step 9 (Decompose Tests)

Step 9 — Decompose Tests Condition (folder fallback): all of the following hold:

_docs/02_document/tests/traceability-matrix.md exists,
the workspace contains source code files,
_docs/03_implementation/ contains a valid product implementation report,
_docs/03_implementation/implementation_completeness_cycle[N]_report.md exists without unresolved FAIL classifications,
_docs/04_refactoring/01-testability-refactoring/testability_assessment.md exists OR _docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md exists,
_docs/02_tasks/todo/ does not exist or has no test task files, and
_docs/03_implementation/implementation_report_tests.md does not exist.

State-driven: reached by auto-chain from Step 8.

Action: Read and execute .cursor/skills/decompose/SKILL.md in tests-only mode (pass _docs/02_document/tests/ as input). The decompose skill will:

Run Step 1t (test infrastructure bootstrap)
Run Step 3 (blackbox/e2e-capable test task decomposition)
Run Step 4 (cross-verification against test coverage)

If _docs/02_tasks/ subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.

Step 10 — Implement Tests Condition (folder fallback): _docs/02_tasks/todo/ contains test task files AND _dependencies_table.md exists AND _docs/03_implementation/implementation_report_tests.md does not exist. State-driven: reached by auto-chain from Step 9.

Action: Invoke .cursor/skills/implement/SKILL.md with the task selection context "Test implementation".

The implement skill reads only test tasks from _docs/02_tasks/todo/ and implements them.

If _docs/03_implementation/ has batch reports, the implement skill detects completed test tasks and continues.

For folder fallback, "test task files" means *_test_infrastructure.md plus task specs whose **Component** or **Epic** identifies Blackbox Tests.

Step 11 — Run Tests Condition (folder fallback): _docs/03_implementation/implementation_report_tests.md exists. State-driven: reached by auto-chain from Step 10.

Action: Read and execute .cursor/skills/test-run/SKILL.md

Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync. This is a hard product gate, not a harness-smoke gate: e2e/blackbox tests must exercise the actual implemented system through public runtime boundaries and compare actual outputs against _docs/00_problem/input_data/expected_results/results_report.md or referenced machine-readable expected-result files. Stubs are allowed only for external systems outside the product boundary; missing internal product implementation must fail or block the gate and send the flow back to Implement.

Step 12 — Test-Spec Sync State-driven: reached by auto-chain from Step 11. Requires _docs/02_document/tests/traceability-matrix.md to exist — if missing, mark Step 12 skipped (see Action below).

Action: Read and execute .cursor/skills/test-spec/SKILL.md in cycle-update mode. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.

The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If traceability-matrix.md is missing, mark Step 12 as skipped — the next /test-spec full run will regenerate it.

After completion, auto-chain to Step 13 (Update Docs).

Step 13 — Update Docs State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires _docs/02_document/ to contain existing documentation — if missing, mark Step 13 skipped (see Action below).

Action: Read and execute .cursor/skills/document/SKILL.md in Task mode. Pass all completed implementation and test task spec files plus the implementation reports.

The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.

If _docs/02_document/ does not contain existing docs, mark Step 13 as skipped.

After completion, auto-chain to Step 14 (Security Audit).

Step 14 — Security Audit (optional) State-driven: reached by auto-chain from Step 13 (completed or skipped).

Action: Apply the Optional Skill Gate (protocols.md → "Optional Skill Gate") with:

question: Run security audit before deploy?
option-a-label: Run security audit (recommended for production deployments)
option-b-label: Skip — proceed directly to deploy
recommendation: A — catches vulnerabilities before production
target-skill: .cursor/skills/security/SKILL.md
next-step: Step 15 (Performance Test)

Step 15 — Performance Test (optional) State-driven: reached by auto-chain from Step 14 (completed or skipped).

Action: Apply the Optional Skill Gate (protocols.md → "Optional Skill Gate") with:

question: Run performance/load tests before deploy?
option-a-label: Run performance tests (recommended for latency-sensitive or high-load systems)
option-b-label: Skip — proceed directly to deploy
recommendation: A or B — base on whether acceptance criteria include latency, throughput, or load requirements
target-skill: .cursor/skills/test-run/SKILL.md in perf mode (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
next-step: Step 16 (Deploy)
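The parameter blocks for Steps 14 and 15 share one shape, which can be modeled as a small record. The actual gate is defined in protocols.md and is not reproduced here; the class, the `resolve_gate` function, and the returned dict are assumptions sketched from the parameter names and the "completed or skipped" wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptionalSkillGate:
    question: str
    option_a_label: str
    option_b_label: str
    recommendation: str
    target_skill: str
    next_step: str


def resolve_gate(gate: OptionalSkillGate, choice: str) -> dict:
    """Map the user's A/B answer to an action; both paths continue to next_step."""
    if choice == "A":
        return {"action": "run", "skill": gate.target_skill, "then": gate.next_step}
    if choice == "B":
        return {"action": "skip", "status": "skipped", "then": gate.next_step}
    raise ValueError(f"expected 'A' or 'B', got {choice!r}")


# Values copied from the Step 14 parameters.
SECURITY_GATE = OptionalSkillGate(
    question="Run security audit before deploy?",
    option_a_label="Run security audit (recommended for production deployments)",
    option_b_label="Skip: proceed directly to deploy",
    recommendation="A",
    target_skill=".cursor/skills/security/SKILL.md",
    next_step="Step 15 (Performance Test)",
)
```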

Step 16 — Deploy State-driven: reached by auto-chain from Step 15 (completed or skipped).

Action: Read and execute .cursor/skills/deploy/SKILL.md.

After the deploy skill completes successfully, mark Step 16 as completed and auto-chain to Step 17 (Retrospective).

Step 17 — Retrospective State-driven: reached by auto-chain from Step 16.

Action: Read and execute .cursor/skills/retrospective/SKILL.md in cycle-end mode. This closes the cycle's feedback loop by folding metrics into _docs/06_metrics/retro_<date>.md and appending the top-3 lessons to _docs/LESSONS.md.

After retrospective completes, mark Step 17 as completed and enter "Done" evaluation.

Done State-driven: reached by auto-chain from Step 17. (Sanity check: _docs/04_deploy/ should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)

Action: Report project completion with summary. Then rewrite the state file so the next /autodev invocation enters the feature-cycle loop in the existing-code flow:

flow: existing-code
step: 9
name: New Task
status: not_started
sub_step:
  phase: 0
  name: awaiting-invocation
  detail: ""
retry_count: 0
cycle: 1

On the next invocation, Flow Resolution rule 1 reads flow: existing-code, so re-entry goes directly to existing-code Step 9 (New Task).
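A minimal sketch of the state rewrite, using only the standard library. The state file's location is not given in this section, so the path is a parameter, and `write_done_state` is a hypothetical helper name; the field values are copied from the block above.

```python
from pathlib import Path

# Field values copied verbatim from the Done state shown above.
DONE_STATE_LINES = (
    "flow: existing-code",
    "step: 9",
    "name: New Task",
    "status: not_started",
    "sub_step:",
    "  phase: 0",
    "  name: awaiting-invocation",
    '  detail: ""',
    "retry_count: 0",
    "cycle: 1",
)


def write_done_state(state_path: Path) -> str:
    """Overwrite the state file so the next run enters the existing-code flow."""
    text = "\n".join(DONE_STATE_LINES) + "\n"
    state_path.write_text(text, encoding="utf-8")
    return text
```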

Auto-Chain Rules

Completed Step	Next Action
Problem (1)	Auto-chain → Research (2)
Research (2)	Auto-chain → Research Decision (ask user: another round or proceed?)
Research Decision → proceed	Auto-chain → Plan (3)
Plan (3)	Auto-chain → UI Design detection (4)
UI Design (4, done or skipped)	Auto-chain → Test Spec (5)
Test Spec (5)	Auto-chain → Decompose (6)
Decompose (6)	Session boundary — suggest new conversation before Implement
Implement (7)	Auto-chain only after Product Implementation Completeness Gate passes → Code Testability Revision (8)
Code Testability Revision (8)	Auto-chain → Decompose Tests (9)
Decompose Tests (9)	Session boundary — suggest new conversation before Implement Tests
Implement Tests (10)	Auto-chain → Run Tests (11)
Run Tests (11, all pass)	Auto-chain → Test-Spec Sync (12)
Test-Spec Sync (12, done or skipped)	Auto-chain → Update Docs (13)
Update Docs (13, done or skipped)	Auto-chain → Security Audit choice (14)
Security Audit (14, done or skipped)	Auto-chain → Performance Test choice (15)
Performance Test (15, done or skipped)	Auto-chain → Deploy (16)
Deploy (16)	Auto-chain → Retrospective (17)
Retrospective (17)	Report completion; rewrite state to existing-code flow, step 9
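The table above reduces to a short transition function. This sketch assumes a single boolean stands in for the two conditional rows (the completeness gate for Step 7, all tests passing for Step 11); the action labels and the function name are illustrative only.

```python
from typing import Optional, Tuple

SESSION_BOUNDARY_AFTER = {6, 9}  # suggest a new conversation before the next step
CONDITIONAL_STEPS = {7, 11}      # chain only when the gate passes / all tests pass
LAST_STEP = 17


def next_action(step: int, condition_met: bool = True) -> Tuple[str, Optional[int]]:
    """Return (action, next step) for a completed greenfield step."""
    if not 1 <= step <= LAST_STEP:
        raise ValueError(f"unknown step: {step}")
    if step == LAST_STEP:
        return ("report_completion", None)
    if step in CONDITIONAL_STEPS and not condition_met:
        return ("blocked", step)  # stay put until the condition holds
    if step == 2:
        return ("ask_user", 3)    # Research Decision: another round or proceed
    if step in SESSION_BOUNDARY_AFTER:
        return ("session_boundary", step + 1)
    return ("auto_chain", step + 1)
```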

Status Summary — Step List

Flow name: greenfield. Render using the banner template in protocols.md → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.

#	Step Name	Extra state tokens (beyond the shared set)
1	Problem	—
2	Research	`DONE (N drafts)`
3	Plan	—
4	UI Design	—
5	Test Spec	—
6	Decompose	`DONE (N tasks)`
7	Implement	`IN PROGRESS (batch M of ~N)`
8	Code Testability Revision	—
9	Decompose Tests	`DONE (N tasks)`
10	Implement Tests	`IN PROGRESS (batch M)`
11	Run Tests	`DONE (N passed, M failed)`
12	Test-Spec Sync	—
13	Update Docs	—
14	Security Audit	—
15	Performance Test	—
16	Deploy	—
17	Retrospective	—

All rows also accept the shared state tokens (DONE, IN PROGRESS, NOT STARTED, FAILED (retry N/3)); rows 4, 12, 13, 14, 15 additionally accept SKIPPED.

Row rendering format (step-number column is right-padded to 2 characters for alignment):

 Step 1   Problem                   [<state token>]
 Step 2   Research                  [<state token>]
 Step 3   Plan                      [<state token>]
 Step 4   UI Design                 [<state token>]
 Step 5   Test Spec                 [<state token>]
 Step 6   Decompose                 [<state token>]
 Step 7   Implement                 [<state token>]
 Step 8   Code Testability Rev.     [<state token>]
 Step 9   Decompose Tests           [<state token>]
 Step 10  Implement Tests           [<state token>]
 Step 11  Run Tests                 [<state token>]
 Step 12  Test-Spec Sync            [<state token>]
 Step 13  Update Docs               [<state token>]
 Step 14  Security Audit            [<state token>]
 Step 15  Performance Test          [<state token>]
 Step 16  Deploy                    [<state token>]
 Step 17  Retrospective             [<state token>]
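The row format can be reproduced with a single format string. The name-column width of 26 is an assumption inferred from the sample rows, not a value stated in protocols.md, and `render_row` is a hypothetical helper.

```python
NAME_WIDTH = 26  # assumption: inferred from the sample rows above


def render_row(step: int, name: str, token: str) -> str:
    """Render one status row with the step number right-padded to 2 characters."""
    return f" Step {step:<2}  {name:<{NAME_WIDTH}}[{token}]"


if __name__ == "__main__":
    print(render_row(1, "Problem", "DONE"))
    print(render_row(10, "Implement Tests", "IN PROGRESS (batch 2)"))
```

Padding the number to two characters keeps the name column aligned for single- and double-digit steps.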
