From 35547e9b656a16f862678917ca799c229a429a08 Mon Sep 17 00:00:00 2001
From: Oleksandr Bezdieniezhnykh <oleksandr.bezdieniezhnykh@pwc.com>
Date: Sat, 2 May 2026 05:31:23 +0300
Subject: [PATCH] Update autodev workflow documentation to include new steps
 for Test Spec and Decompose Tests, enhancing the greenfield process. Revise
 existing steps to reflect changes in task flow and clarify conditions for
 implementation. Adjust current state to indicate progress in the Decompose
 phase.

---
 .cursor/skills/autodev/SKILL.md               |   2 +-
 .cursor/skills/autodev/flows/greenfield.md    | 271 ++++++++++++----
 .cursor/skills/autodev/protocols.md           |   1 +
 .cursor/skills/autodev/state.md               |   2 +-
 .../02_tasks/todo/AZ-219_initial_structure.md | 288 ++++++++++++++++++
 _docs/_autodev_state.md                       |  12 +-
 6 files changed, 504 insertions(+), 72 deletions(-)
 create mode 100644 _docs/02_tasks/todo/AZ-219_initial_structure.md
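
Note on the folder-fallback definitions added to greenfield.md (Steps 7 and 10): the split between implementation task files, test task files, and product implementation reports could be checked with something like the sketch below. It is a minimal illustration, not part of the autodev skills. The helper names (`is_test_task`, `is_product_report`, `split_todo`) are hypothetical, the `**Component**: value` line format is taken from the AZ-219 task spec, and skipping underscore-prefixed files such as `_dependencies_table.md` is an assumption. Refactor reports are not detected because the patch does not state how they are named.

```python
import re
from pathlib import Path

# Matches a metadata line such as "**Component**: Blackbox Tests".
_TEST_MARKER = re.compile(
    r"^\*\*(Component|Epic)\*\*:.*Blackbox Tests", re.MULTILINE
)


def is_test_task(name: str, text: str) -> bool:
    """True for test-only task specs, per the folder-fallback definition."""
    return name.endswith("_test_infrastructure.md") or bool(
        _TEST_MARKER.search(text)
    )


def is_product_report(name: str) -> bool:
    """True for implementation reports other than the tests report.

    The workflow also excludes refactor reports; their naming is not given
    in the patch, so this sketch does not try to recognise them.
    """
    return (
        name.startswith("implementation_report_")
        and name.endswith(".md")
        and name != "implementation_report_tests.md"
    )


def split_todo(todo_dir: Path) -> tuple[list[str], list[str]]:
    """Return (implementation task names, test task names) in todo_dir."""
    impl: list[str] = []
    tests: list[str] = []
    if todo_dir.is_dir():
        for path in sorted(todo_dir.glob("*.md")):
            # Assumption: underscore-prefixed files are bookkeeping, not tasks.
            if path.name.startswith("_"):
                continue
            text = path.read_text(encoding="utf-8")
            (tests if is_test_task(path.name, text) else impl).append(path.name)
    return impl, tests
```

Under these assumptions, Step 7 would be a candidate when the first list is non-empty and no file in `_docs/03_implementation/` satisfies `is_product_report`. Step 10 would be a candidate when the second list is non-empty and `implementation_report_tests.md` is absent.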

diff --git a/.cursor/skills/autodev/SKILL.md b/.cursor/skills/autodev/SKILL.md
index 5724174..3d511d3 100644
--- a/.cursor/skills/autodev/SKILL.md
+++ b/.cursor/skills/autodev/SKILL.md
@@ -3,7 +3,7 @@ name: autodev
 description: |
   Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
   Detects current project state from _docs/ folder, resumes from where it left off, and flows through
-  problem → research → plan → decompose → implement → deploy without manual skill invocation.
+  problem → research → plan → test specs → decompose → implement → tests → docs sync → deploy without manual skill invocation.
   Maximizes work per conversation by auto-transitioning between skills.
   Trigger phrases:
   - "autodev", "auto", "start", "continue"
diff --git a/.cursor/skills/autodev/flows/greenfield.md b/.cursor/skills/autodev/flows/greenfield.md
index 778bbf4..6f186da 100644
--- a/.cursor/skills/autodev/flows/greenfield.md
+++ b/.cursor/skills/autodev/flows/greenfield.md
@@ -1,6 +1,6 @@
 # Greenfield Workflow
 
-Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
+Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
 
 ## Step Reference Table
 
@@ -10,13 +10,19 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
 | 2 | Research | research/SKILL.md | Mode A: Phase 1–4 · Mode B: Step 0–8 |
 | 3 | Plan | plan/SKILL.md | Step 1–6 + Final |
 | 4 | UI Design | ui-design/SKILL.md | Phase 0–8 (conditional — UI projects only) |
-| 5 | Decompose | decompose/SKILL.md | Step 1–4 |
-| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
-| 7 | Run Tests | test-run/SKILL.md | Steps 1–4 |
-| 8 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
-| 9 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
-| 10 | Deploy | deploy/SKILL.md | Step 1–7 |
-| 11 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |
+| 5 | Test Spec | test-spec/SKILL.md | Phases 1–4 |
+| 6 | Decompose | decompose/SKILL.md | Step 1–4 |
+| 7 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 0–7 (conditional) |
+| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
+| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
+| 11 | Run Tests | test-run/SKILL.md | Steps 1–4 |
+| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
+| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
+| 14 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
+| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 1–5 (optional) |
+| 16 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 1–4 |
 
 ## Detection Rules
 
@@ -80,12 +86,12 @@ If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FIN
 ---
 
 **Step 4 — UI Design (conditional)**
-Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_tasks/todo/` does not exist or has no task files.
+Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
 State-driven: reached by auto-chain from Step 3.
 
 Action: Read and execute `.cursor/skills/ui-design/SKILL.md`. The skill runs its own **Applicability Check**, which handles UI project detection and the user's A/B choice. It returns one of:
 
-- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Decompose).
+- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Test Spec).
 - `outcome: skipped, reason: not-a-ui-project` → mark Step 4 as `skipped`, auto-chain to Step 5.
 - `outcome: skipped, reason: user-declined` → mark Step 4 as `skipped`, auto-chain to Step 5.
 
@@ -93,34 +99,153 @@ The autodev no longer inlines UI detection heuristics — they live in `ui-desig
 
 ---
 
-**Step 5 — Decompose**
-Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files
+**Step 5 — Test Spec**
+Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
+State-driven: reached by auto-chain from Step 4 (completed or skipped).
 
-Action: Read and execute `.cursor/skills/decompose/SKILL.md`
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md`.
+
+This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
+
+---
+
+**Step 6 — Decompose**
+Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_document/tests/traceability-matrix.md` exists AND `_docs/02_tasks/todo/` does not exist or has no implementation task files.
+
+Action: Read and execute `.cursor/skills/decompose/SKILL.md` in normal implementation mode. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality.
 
 If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.
 
 ---
 
-**Step 6 — Implement**
-Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file
+**Step 7 — Implement**
+Condition: `_docs/02_tasks/todo/` contains implementation task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any product `implementation_report_*.md` file.
 
 Action: Read and execute `.cursor/skills/implement/SKILL.md`
 
 If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.
 
+For folder fallback, **implementation task files** means task specs that are not test-only specs: exclude `*_test_infrastructure.md` and task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
+
+For folder fallback, a **product implementation report** is any `_docs/03_implementation/implementation_report_*.md` file except `_docs/03_implementation/implementation_report_tests.md` and refactor reports.
+
 ---
 
-**Step 7 — Run Tests**
-Condition (folder fallback): `_docs/03_implementation/` contains an `implementation_report_*.md` file.
-State-driven: reached by auto-chain from Step 6.
+**Step 8 — Code Testability Revision**
+Condition (folder fallback): `_docs/03_implementation/` contains a product implementation report AND `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` does not exist AND `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` does not exist AND `_docs/03_implementation/implementation_report_tests.md` does not exist AND `_docs/02_tasks/todo/` does not contain test task files.
+State-driven: reached by auto-chain from Step 7.
+
+**Purpose**: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
+
+**Scope — MINIMAL, SURGICAL fixes**: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
+
+**Allowed changes** in this phase:
+- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
+- Extract narrow interfaces for components that need stubbing in tests.
+- Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
+- Wrap global singletons in thin accessors that tests can override.
+- Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
+
+**NOT allowed** in this phase (defer to a later refactor task):
+- Renaming public APIs.
+- Moving code between files unless strictly required for isolation.
+- Changing algorithms or business logic.
+- Restructuring module boundaries or rewriting layers.
+
+Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
+
+1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
+2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
+   - Hardcoded file paths or directory references
+   - Hardcoded configuration values (URLs, credentials, magic numbers)
+   - Global mutable state that cannot be overridden
+   - Tight coupling to external services without abstraction
+   - Missing dependency injection or non-configurable parameters
+   - Direct file system operations without path configurability
+   - Inline construction of heavy dependencies (models, clients)
+3. If ALL scenarios are testable as-is:
+   - Create `_docs/04_refactoring/01-testability-refactoring/`
+   - Write `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` with the scenarios reviewed and outcome "Code is testable — no changes needed"
+   - Mark Step 8 as `completed` with outcome "Code is testable — no changes needed"
+   - Auto-chain to Step 9 (Decompose Tests)
+4. If testability issues are found:
+   - Create `_docs/04_refactoring/01-testability-refactoring/`
+   - Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
+     - **Mode**: `guided`
+     - **Source**: `autodev-greenfield-testability-analysis`
+     - One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
+   - Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
+   - Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
+   - After execution, surface `RUN_DIR/testability_changes_summary.md` to the user via the Choose format (accept / request follow-up) before auto-chaining
+   - Copy or save the accepted summary as `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` so folder fallback can detect Step 8 completion
+   - Mark Step 8 as `completed`
+   - Auto-chain to Step 9 (Decompose Tests)
+
+---
+
+**Step 9 — Decompose Tests**
+Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND `_docs/03_implementation/` contains a product implementation report AND (`_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` exists OR `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` exists) AND (`_docs/02_tasks/todo/` does not exist or has no test task files) AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+State-driven: reached by auto-chain from Step 8.
+
+Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
+1. Run Step 1t (test infrastructure bootstrap)
+2. Run Step 3 (blackbox/e2e-capable test task decomposition)
+3. Run Step 4 (cross-verification against test coverage)
+
+If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
+
+---
+
+**Step 10 — Implement Tests**
+Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
+State-driven: reached by auto-chain from Step 9.
+
+Action: Read and execute `.cursor/skills/implement/SKILL.md`
+
+The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.
+
+If `_docs/03_implementation/` has batch reports, the implement skill detects completed test tasks and continues.
+
+For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
+
+---
+
+**Step 11 — Run Tests**
+Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
+State-driven: reached by auto-chain from Step 10.
 
 Action: Read and execute `.cursor/skills/test-run/SKILL.md`
 
+This step verifies that the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync.
+
 ---
 
-**Step 8 — Security Audit (optional)**
-State-driven: reached by auto-chain from Step 7.
+**Step 12 — Test-Spec Sync**
+State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
+
+Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
+
+The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing, mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
+
+After completion, auto-chain to Step 13 (Update Docs).
+
+---
+
+**Step 13 — Update Docs**
+State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
+
+Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all completed implementation and test task spec files plus the implementation reports.
+
+The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
+
+If `_docs/02_document/` does not contain existing docs, mark Step 13 as `skipped`.
+
+After completion, auto-chain to Step 14 (Security Audit).
+
+---
+
+**Step 14 — Security Audit (optional)**
+State-driven: reached by auto-chain from Step 13 (completed or skipped).
 
 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run security audit before deploy?`
@@ -128,12 +253,12 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A — catches vulnerabilities before production`
 - target-skill:    `.cursor/skills/security/SKILL.md`
-- next-step:       Step 9 (Performance Test)
+- next-step:       Step 15 (Performance Test)
 
 ---
 
-**Step 9 — Performance Test (optional)**
-State-driven: reached by auto-chain from Step 8.
+**Step 15 — Performance Test (optional)**
+State-driven: reached by auto-chain from Step 14 (completed or skipped).
 
 Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
 - question:        `Run performance/load tests before deploy?`
@@ -141,30 +266,30 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
 - option-b-label:  `Skip — proceed directly to deploy`
 - recommendation:  `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
 - target-skill:    `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
-- next-step:       Step 10 (Deploy)
+- next-step:       Step 16 (Deploy)
 
 ---
 
-**Step 10 — Deploy**
-State-driven: reached by auto-chain from Step 9 (after Step 9 is completed or skipped).
+**Step 16 — Deploy**
+State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).
 
 Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
 
-After the deploy skill completes successfully, mark Step 10 as `completed` and auto-chain to Step 11 (Retrospective).
+After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).
 
 ---
 
-**Step 11 — Retrospective**
-State-driven: reached by auto-chain from Step 10.
+**Step 17 — Retrospective**
+State-driven: reached by auto-chain from Step 16.
 
 Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. This closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` and appending the top-3 lessons to `_docs/LESSONS.md`.
 
-After retrospective completes, mark Step 11 as `completed` and enter "Done" evaluation.
+After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.
 
 ---
 
 **Done**
-State-driven: reached by auto-chain from Step 11. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
+State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
 
 Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:
 
@@ -191,47 +316,65 @@ On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and r
 | Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
 | Research Decision → proceed | Auto-chain → Plan (3) |
 | Plan (3) | Auto-chain → UI Design detection (4) |
-| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
-| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
-| Implement (6) | Auto-chain → Run Tests (7) |
-| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
-| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
-| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
-| Deploy (10) | Auto-chain → Retrospective (11) |
-| Retrospective (11) | Report completion; rewrite state to existing-code flow, step 9 |
+| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
+| Test Spec (5) | Auto-chain → Decompose (6) |
+| Decompose (6) | **Session boundary** — suggest new conversation before Implement |
+| Implement (7) | Auto-chain → Code Testability Revision (8) |
+| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
+| Decompose Tests (9) | **Session boundary** — suggest new conversation before Implement Tests |
+| Implement Tests (10) | Auto-chain → Run Tests (11) |
+| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
+| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
+| Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
+| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
+| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
+| Deploy (16) | Auto-chain → Retrospective (17) |
+| Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
 
 ## Status Summary — Step List
 
 Flow name: `greenfield`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.
 
-| # | Step Name          | Extra state tokens (beyond the shared set) |
-|---|--------------------|--------------------------------------------|
-| 1 | Problem            | — |
-| 2 | Research           | `DONE (N drafts)` |
-| 3 | Plan               | — |
-| 4 | UI Design          | — |
-| 5 | Decompose          | `DONE (N tasks)` |
-| 6 | Implement          | `IN PROGRESS (batch M of ~N)` |
-| 7 | Run Tests          | `DONE (N passed, M failed)` |
-| 8 | Security Audit     | — |
-| 9 | Performance Test   | — |
-| 10 | Deploy            | — |
-| 11 | Retrospective     | — |
+| # | Step Name                   | Extra state tokens (beyond the shared set) |
+|---|-----------------------------|--------------------------------------------|
+| 1 | Problem                     | — |
+| 2 | Research                    | `DONE (N drafts)` |
+| 3 | Plan                        | — |
+| 4 | UI Design                   | — |
+| 5 | Test Spec                   | — |
+| 6 | Decompose                   | `DONE (N tasks)` |
+| 7 | Implement                   | `IN PROGRESS (batch M of ~N)` |
+| 8 | Code Testability Revision   | — |
+| 9 | Decompose Tests             | `DONE (N tasks)` |
+| 10 | Implement Tests            | `IN PROGRESS (batch M)` |
+| 11 | Run Tests                  | `DONE (N passed, M failed)` |
+| 12 | Test-Spec Sync             | — |
+| 13 | Update Docs                | — |
+| 14 | Security Audit             | — |
+| 15 | Performance Test           | — |
+| 16 | Deploy                     | — |
+| 17 | Retrospective              | — |
 
-All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 8, 9 additionally accept `SKIPPED`.
+All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15 additionally accept `SKIPPED`.
 
 Row rendering format (step-number column is right-padded to 2 characters for alignment):
 
 ```
- Step 1   Problem             [<state token>]
- Step 2   Research            [<state token>]
- Step 3   Plan                [<state token>]
- Step 4   UI Design           [<state token>]
- Step 5   Decompose           [<state token>]
- Step 6   Implement           [<state token>]
- Step 7   Run Tests           [<state token>]
- Step 8   Security Audit      [<state token>]
- Step 9   Performance Test    [<state token>]
- Step 10  Deploy              [<state token>]
- Step 11  Retrospective       [<state token>]
+ Step 1   Problem                   [<state token>]
+ Step 2   Research                  [<state token>]
+ Step 3   Plan                      [<state token>]
+ Step 4   UI Design                 [<state token>]
+ Step 5   Test Spec                 [<state token>]
+ Step 6   Decompose                 [<state token>]
+ Step 7   Implement                 [<state token>]
+ Step 8   Code Testability Rev.     [<state token>]
+ Step 9   Decompose Tests           [<state token>]
+ Step 10  Implement Tests           [<state token>]
+ Step 11  Run Tests                 [<state token>]
+ Step 12  Test-Spec Sync            [<state token>]
+ Step 13  Update Docs               [<state token>]
+ Step 14  Security Audit            [<state token>]
+ Step 15  Performance Test          [<state token>]
+ Step 16  Deploy                    [<state token>]
+ Step 17  Retrospective             [<state token>]
 ```
diff --git a/.cursor/skills/autodev/protocols.md b/.cursor/skills/autodev/protocols.md
index b170c41..beee18b 100644
--- a/.cursor/skills/autodev/protocols.md
+++ b/.cursor/skills/autodev/protocols.md
@@ -111,6 +111,7 @@ Before entering a step from this table for the first time in a session, verify t
 |------|------|----------|----------------|
 | greenfield | Plan | Step 6 — Epics | Create epics for each component |
 | greenfield | Decompose | Step 1 + Step 2 + Step 3 — All tasks | Create ticket per task, link to epic |
+| greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
 | existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |
 
diff --git a/.cursor/skills/autodev/state.md b/.cursor/skills/autodev/state.md
index adcdb87..9f59ffc 100644
--- a/.cursor/skills/autodev/state.md
+++ b/.cursor/skills/autodev/state.md
@@ -13,7 +13,7 @@ The autodev persists its position to `_docs/_autodev_state.md`. This is a lightw
 
 ## Current Step
 flow: [greenfield | existing-code | meta-repo]
-step: [1-11 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
+step: [1-17 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
 name: [step name from the active flow's Step Reference Table]
 status: [not_started / in_progress / completed / skipped / failed]
 sub_step:
diff --git a/_docs/02_tasks/todo/AZ-219_initial_structure.md b/_docs/02_tasks/todo/AZ-219_initial_structure.md
new file mode 100644
index 0000000..dc02260
--- /dev/null
+++ b/_docs/02_tasks/todo/AZ-219_initial_structure.md
@@ -0,0 +1,288 @@
+# Initial Project Structure
+
+**Task**: AZ-219_initial_structure
+**Name**: Initial Structure
+**Description**: Scaffold the project skeleton for the Jetson-hosted GPS-denied localization runtime, replay harness, local infrastructure, CI, and deployment evidence paths.
+**Complexity**: 5 points
+**Dependencies**: None
+**Component**: Bootstrap
+**Tracker**: AZ-219
+**Epic**: AZ-206
+
+## Project Folder Layout
+
+```text
+project-root/
+├── src/
+│   ├── __init__.py
+│   ├── camera_ingest_calibration/
+│   ├── basalt_vio_adapter/
+│   ├── safety_anchor_wrapper/
+│   ├── satellite_retrieval/
+│   ├── anchor_verification/
+│   ├── cache_tile_lifecycle/
+│   ├── mavlink_gcs_integration/
+│   ├── fdr_observability/
+│   ├── validation_harness/
+│   ├── shared/
+│   │   ├── contracts/
+│   │   ├── geo_geometry/
+│   │   ├── time_sync/
+│   │   ├── config/
+│   │   ├── errors/
+│   │   └── telemetry/
+│   └── native/
+│       ├── basalt_bridge/
+│       ├── feature_matching/
+│       └── tensor_rt/
+├── migrations/
+│   ├── postgresql/
+│   └── seed/
+├── tests/
+│   ├── unit/
+│   ├── integration/
+│   ├── blackbox/
+│   ├── fixtures/
+│   └── sitl/
+├── e2e/
+│   ├── replay/
+│   └── reports/
+├── deployment/
+│   ├── docker/
+│   ├── compose/
+│   ├── jetson/
+│   └── scripts/
+├── config/
+│   ├── development/
+│   ├── ci/
+│   ├── jetson/
+│   └── production/
+├── data/
+│   ├── input/
+│   ├── expected/
+│   ├── cache/
+│   ├── fdr/
+│   └── test-results/
+├── .github/
+│   └── workflows/
+│       └── ci.yml
+├── docker-compose.yml
+├── docker-compose.test.yml
+├── .dockerignore
+└── .env.example
+```
+
+### Layout Rationale
+
+The runtime is organized directly under `src/` because this repository already represents the GPS-denied onboard system. Component directories live at the source root, with native bridges isolated under `src/native/` for BASALT, feature matching, and TensorRT-sensitive paths. Shared contracts, geometry, time-sync, configuration, error envelopes, and telemetry DTOs are centralized so component tasks consume a single public interface instead of duplicating cross-cutting logic.
+
+The scaffold separates runtime source, migrations, tests, deployment assets, configuration, and mutable data. Production runs on Jetson hardware, while Docker/compose is used for replay, SITL, and deterministic CI environments.
+
+## DTOs and Interfaces
+
+### Shared DTOs
+
+| DTO Name | Used By Components | Fields Summary |
+|----------|--------------------|----------------|
+| `FramePacket` | Camera ingest, BASALT VIO, satellite retrieval, anchor verification, cache lifecycle, FDR | Frame ID, timestamp, image reference, calibration ID, occlusion status, quality metrics |
+| `TelemetrySample` | MAVLink/GCS, BASALT VIO, safety wrapper, FDR, validation harness | Timestamp, IMU, attitude, airspeed, altitude, GPS health |
+| `VioStatePacket` | BASALT VIO, safety wrapper, FDR, validation harness | Timestamp, relative pose, velocity, bias, tracking quality, covariance hint |
+| `PositionEstimate` | Safety wrapper, MAVLink/GCS, cache lifecycle, FDR, validation harness | WGS84 coordinates, covariance semi-major axis, source label, fix type, horizontal accuracy, anchor age |
+| `VprCandidate` | Satellite retrieval, anchor verification, FDR | Chunk ID, tile ID, score, footprint, freshness status |
+| `AnchorDecision` | Anchor verification, safety wrapper, FDR | Candidate ID, acceptance result, estimated pose, inliers, MRE, rejection reason |
+| `CacheTileRecord` | Cache lifecycle, satellite retrieval, anchor verification, FDR | Tile ID, type, CRS, meters per pixel, capture date, signature/hash status, trust level |
+| `FdrEvent` | All runtime components, validation harness | Event type, timestamp, component, severity, payload reference, mission/run ID |
+| `ScenarioReport` | Validation harness, CI/CD, release evidence | Scenario ID, result, metrics, artifacts, failure reason |
+
+### Component Interfaces
+
+| Component | Interface | Methods | Exposed To |
+|-----------|-----------|---------|------------|
+| Camera ingest/calibration | `FrameProvider` | `next_frame`, `detect_occlusion`, `classify_quality` | BASALT VIO, satellite retrieval, anchor verification, cache lifecycle |
+| BASALT VIO adapter | `VioAdapter` | `initialize`, `process`, `health` | Safety wrapper, validation harness |
+| Safety/anchor wrapper | `LocalizationStateMachine` | `update_vio`, `consider_anchor`, `degrade`, `propagate_imu_only`, `tile_write_eligibility` | MAVLink/GCS, cache lifecycle, FDR, validation harness |
+| Satellite retrieval | `CandidateRetriever` | `load_index`, `retrieve` | Safety wrapper, anchor verification |
+| Anchor verification | `AnchorVerifier` | `verify`, `benchmark_matcher` | Safety wrapper, FDR |
+| Cache/tile lifecycle | `CacheRepository` | `validate_cache`, `get_tile_window`, `write_generated_tile`, `package_sync` | Satellite retrieval, anchor verification, post-flight sync |
+| MAVLink/GCS integration | `MavlinkGateway` | `subscribe_telemetry`, `emit_gps_input`, `emit_status` | BASALT VIO, safety wrapper, QGC, FDR |
+| FDR/observability | `FlightRecorder` | `append_event`, `rollover`, `export` | All runtime components, validation harness |
+| Validation harness | `ScenarioRunner` | `validate_fixture`, `run_scenario` | CI/CD, release evidence review |
+
+## CI/CD Pipeline
+
+| Stage | Purpose | Trigger |
+|-------|---------|---------|
+| Format / lint | Enforce code style and static quality | Every PR and push to `dev` |
+| Unit tests | Validate component-local behavior and shared contracts | Every PR and push to `dev` |
+| Replay black-box smoke | Run deterministic still-image/cache/SITL subsets | Every PR |
+| Cache/security fixture tests | Validate signed manifests, stale-tile rejection, no provider calls | Every PR |
+| Plane SITL spoof/failsafe | Validate ArduPilot Plane `GPS_INPUT`, failsafe, spoofing promotion | Nightly and release candidate |
+| Public dataset replay | Exercise VIO, retrieval, and anchor behavior against pinned public slices | Nightly and release candidate |
+| Jetson latency/resource tests | Measure p95 latency, memory, cold start, TensorRT/ONNX fidelity | Release candidate |
+| Thermal/FDR endurance | Prove 8-hour 25 W no-throttle and <=64 GB FDR rollover | Hardware qualification |
+| Build / package | Produce replay-compatible and Jetson deployment artifacts | Release candidate |
+| Deploy staging evidence | Publish reports, tlogs, FDR summaries, cache validation artifacts | Release candidate |
+
+### Pipeline Configuration Notes
+
+CI uses Docker/compose for replay and SITL gates. Jetson, thermal, camera, and representative replay gates run on dedicated hardware runners and block release rather than every PR. Dataset downloads are license-tagged and not baked into images. Secrets, signing keys, and Satellite Service credentials are never cached.
+
+## Environment Strategy
+
+| Environment | Purpose | Configuration Notes |
+|-------------|---------|---------------------|
+| Development replay | Fast local iteration | Small fixture cache, local PostgreSQL/PostGIS, replay frames, test keys only |
+| CI replay | Deterministic PR checks | Docker services for runtime, replay consumer, cache stub, and reports |
+| Public dataset replay | Algorithm de-risking | Pinned dataset slices with license metadata and ground-truth reports |
+| Plane SITL | MAVLink/failsafe validation | ArduPilot Plane SITL, QGC/tlog observer, production-like `GPS_INPUT` parameters |
+| Jetson hardware validation | Production path profiling | JetPack/CUDA/TensorRT, local camera/GPU/MAVLink access, resource monitoring |
+| Representative flight/replay | Final acceptance evidence | Target-like UAV, FC, camera, synchronized telemetry, ground truth |
+| Production | Onboard mission runtime | Preloaded signed cache, local PostGIS, per-flight FDR, no provider network calls |
+
+### Environment Variables
+
+| Variable | Dev | Staging | Production | Description |
+|----------|-----|---------|------------|-------------|
+| `GPSD_ENV` | `development` | `staging` | `production` | Selects runtime profile |
+| `GPSD_CONFIG_DIR` | local path | staged config path | onboard config path | Versioned configuration root |
+| `GPSD_CACHE_DIR` | fixture cache | staged mission cache | onboard mission cache | COG, manifest, sidecar, descriptor root |
+| `GPSD_FDR_DIR` | temp output | staged evidence path | per-flight NVMe path | Flight data recorder output |
+| `GPSD_DATABASE_URL` | local PostGIS | staging PostGIS | onboard local PostGIS | Manifest, mission state, FDR event index |
+| `GPSD_MAVLINK_URL` | SITL/replay | SITL/hardware rig | FC telemetry link | MAVLink connection endpoint |
+| `GPSD_CAMERA_SOURCE` | fixture/replay | fixture or hardware | live camera | Navigation frame source |
+| `GPSD_SIGNING_KEY_REF` | test key ref | staging secret ref | mission secret ref | Cache/sidecar signature verification |
+| `GPSD_MAX_FDR_BYTES` | small fixture cap | release-like cap | `68719476736` | FDR rollover threshold in bytes (production value is 64 GiB) |
+| `GPSD_LOG_LEVEL` | debug/info | info | info/warn | Runtime logging level |
+
+## Database Migration Approach
+
+**Migration tool**: Versioned PostgreSQL migration scripts, with PostGIS extension setup and deterministic seed scripts.
+**Strategy**: Additive migrations by default. Database, table, and column renames require explicit approval before implementation. The runtime fails loudly when a required schema version is unknown.
+
+### Initial Schema
+
+- Mission profile tables for route geometry, sector classification, altitude band, and cache budget.
+- Camera calibration tables for camera model, intrinsics, extrinsics, and verification status.
+- Cache manifest tables for COG tiles, generated tiles, freshness, signatures, sidecar hashes, trust level, and PostGIS footprints.
+- VPR chunk tables for retrieval footprints, descriptor metadata, multi-scale index references, and sector top-K policy.
+- FDR event index tables for mission/run IDs, timestamps, event type, component, severity, and CBOR segment references.
+- Validation run tables or report records for scenario IDs, metrics, artifacts, and release evidence pointers.
+
+## Test Structure
+
+```text
+tests/
+├── unit/
+│   ├── shared/
+│   ├── camera_ingest_calibration/
+│   ├── basalt_vio_adapter/
+│   ├── safety_anchor_wrapper/
+│   ├── satellite_retrieval/
+│   ├── anchor_verification/
+│   ├── cache_tile_lifecycle/
+│   ├── mavlink_gcs_integration/
+│   ├── fdr_observability/
+│   └── validation_harness/
+├── integration/
+│   ├── contracts/
+│   ├── cache_postgis/
+│   ├── mavlink/
+│   └── fdr/
+├── blackbox/
+│   ├── still_image_geolocation/
+│   ├── satellite_anchor/
+│   ├── visual_blackout_spoofing/
+│   ├── cache_freshness/
+│   └── resource_limits/
+├── fixtures/
+│   ├── project_60_images/
+│   ├── expected_results/
+│   ├── satellite_cache/
+│   ├── telemetry/
+│   └── public_dataset_slices/
+└── sitl/
+    ├── plane_gps_input/
+    ├── spoofing_promotion/
+    └── failsafe/
+```
+
+### Test Configuration Notes
+
+Unit tests validate contracts and component behavior without external hardware. Integration tests exercise PostGIS, MAVLink, FDR segments, and cache fixtures. Black-box tests interact only through public inputs and outputs: frames, telemetry, offline cache, `GPS_INPUT`, QGC status, and FDR. Release gates add Jetson hardware, thermal/endurance, and representative replay evidence.
+
+## Health Checks
+
+| Service / Component | Liveness | Readiness |
+|---------------------|----------|-----------|
+| GPS-denied service | Process event loop responsive | Config, PostGIS, cache, FDR path, MAVLink, and camera/replay source validated |
+| Replay consumer | Runner process responsive | Fixtures, expected results, and output path available |
+| Satellite cache stub | Fixture volume mounted | Manifests, sidecars, descriptors, and signatures validated |
+| ArduPilot Plane SITL | SITL process responsive | MAVLink ports accepting telemetry and production parameters loaded |
+| QGC observer/log parser | Parser process responsive | Tlog/status stream connected |
+
+Each deployable service exposes `/health/live`, `/health/ready`, and `/metrics` where HTTP is available. Non-HTTP hardware processes write equivalent structured health events to FDR and CI reports.
+
+## Docker And Compose Requirements
+
+- Create a replay-compatible Dockerfile for the Python runtime and optional native stubs.
+- Create a replay-consumer Dockerfile for black-box test execution.
+- Create a satellite-cache-stub fixture image or volume contract.
+- Create an ArduPilot Plane SITL service for integration tests.
+- Use non-root users, pinned base images, multi-stage builds where native compilation is needed, and health checks for long-running services.
+- Provide `docker-compose.yml` for local development/replay and `docker-compose.test.yml` for black-box/SITL test execution.
+- Do not bake secrets, provider credentials, mission signing keys, or public dataset payloads into images.
+
+## Implementation Order
+
+| Order | Component | Reason |
+|-------|-----------|--------|
+| 1 | Bootstrap and shared contracts | Establish package, configuration, migrations, CI, DTOs, and error envelopes |
+| 2 | Shared geometry/time helpers | Foundational math and timestamp contracts used by camera, VIO, cache, wrapper, and validation |
+| 3 | Runtime configuration and error handling | Prevent duplicated config/error behavior across components |
+| 4 | Camera ingest/calibration | Produces the frame and occlusion signals required by VIO, anchor, cache, and tests |
+| 5 | MAVLink/GCS integration | Supplies FC telemetry DTOs and validates `GPS_INPUT` output contract early |
+| 6 | Cache/tile lifecycle | Owns PostGIS cache manifest, sidecars, COG access, and freshness gates |
+| 7 | FDR/observability | Provides audit path for all components and validation reports |
+| 8 | BASALT VIO adapter | Depends on frame and telemetry contracts and blocks wrapper integration |
+| 9 | Satellite retrieval | Depends on cache schema and frame DTOs and feeds anchor verification |
+| 10 | Anchor verification | Depends on retrieval candidates and cache tile access |
+| 11 | Safety/anchor wrapper | Consumes VIO, anchor, camera degradation, MAVLink, and FDR contracts |
+| 12 | Validation harness | Uses public interfaces once contracts and runtime components are stable |
+| 13 | Black-box, SITL, Jetson, and endurance test tasks | Exercise release gates and acceptance criteria end to end |
+
+## Acceptance Criteria
+
+**AC-1: Project scaffolded**
+Given the structure plan above
+When the implementer executes this task
+Then source, test, migration, deployment, configuration, and data directories exist with placeholder files where needed so empty directories are retained.
+
+**AC-2: Shared contracts initialized**
+Given the planned component interfaces
+When the scaffold is complete
+Then shared DTO, error, configuration, telemetry, geometry, and time-sync contract locations exist and are importable from component skeletons.
+
+**AC-3: Local infrastructure defined**
+Given the system requires local PostGIS, replay, SITL, cache, and FDR paths
+When the scaffold is complete
+Then `docker-compose.yml`, `docker-compose.test.yml`, `.env.example`, and migration seed directories describe all required local services and volumes.
+
+**AC-4: CI/CD configured**
+Given the pipeline stages in the planning artifacts
+When CI runs on a PR
+Then format/lint, unit tests, replay black-box smoke, cache/security fixture tests, and artifact collection stages are defined.
+
+**AC-5: Test harness skeleton ready**
+Given black-box tests must use public interfaces only
+When the scaffold is complete
+Then unit, integration, black-box, fixtures, and SITL test locations exist with runner entry points ready for implementation tasks.
+
+**AC-6: Deployment evidence paths ready**
+Given release requires Jetson, SITL, FDR, cache, and representative replay evidence
+When the scaffold is complete
+Then deployment scripts, report output paths, and documentation placeholders exist for those evidence artifacts.
+
+**AC-7: No production secrets or raw frames committed**
+Given onboard runtime must not retain raw frames and must protect signing credentials
+When the scaffold is reviewed
+Then `.gitignore`, `.dockerignore`, and `.env.example` exclude secrets, generated FDR payloads, raw frame dumps, cache payloads, and test result artifacts unless explicitly fixture-scoped.
diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md
index a8671b8..6d34b14 100644
--- a/_docs/_autodev_state.md
+++ b/_docs/_autodev_state.md
@@ -2,13 +2,13 @@
 
 ## Current Step
 flow: greenfield
-step: 3
-name: Plan
-status: completed
+step: 5
+name: Decompose
+status: in_progress
 tracker: jira
 sub_step:
-  phase: 10
-  name: plan-complete
-  detail: "FINAL_report.md and epics.md saved; Jira epics AZ-206 through AZ-218 created"
+  phase: 1
+  name: bootstrap-structure
+  detail: "AZ-219_initial_structure.md flattened to src/ root; awaiting structure confirmation"
 retry_count: 0
 cycle: 1