ui/.cursor/skills/implement/SKILL.md

---
name: implement
description: |
  Implement tasks sequentially with dependency-aware batching and integrated code review.
  Reads flat task files and _dependencies_table.md from TASKS_DIR, computes execution batches via topological sort,
  implements tasks one at a time in dependency order, runs code-review skill after each batch, and loops until done.
  Use after /decompose has produced task files.
  Trigger phrases:
  - "implement", "start implementation", "implement tasks"
  - "execute tasks"
category: build
tags: [implementation, batching, code-review]
disable-model-invocation: true
---

# Implementation Runner

Implement all tasks produced by the `/decompose` skill. This skill reads task specs, computes execution order, writes the code and tests for each task **sequentially** (no subagents, no parallel execution), validates results via the `/code-review` skill, and escalates issues.

For each task the main agent receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria — then moves on to the next task.

## Core Principles

- **Sequential execution**: implement one task at a time. Do NOT spawn subagents and do NOT run tasks in parallel. (See `.cursor/rules/no-subagents.mdc`.)
- **Dependency-aware ordering**: tasks run only when all their dependencies are satisfied
- **Batching for review, not parallelism**: tasks are grouped into batches so `/code-review` and commits operate on a coherent unit of work — all tasks inside a batch are still implemented one after the other
- **Integrated review**: `/code-review` skill runs automatically after each batch
- **Completeness before testing**: product implementation is not done until code is checked against task outcomes, included scope, architecture/component promises, named runtime dependencies, and unresolved scaffold/native placeholders — not just task AC tests
- **Runtime dependency reality**: production code cannot satisfy a task by exposing only a protocol, fake runner, deterministic fallback, or "native bridge" placeholder when the task/architecture promises a concrete internal capability such as BASALT VIO, FAISS retrieval, LightGlue matching, or a full A-Z localization pipeline. Stubs are allowed only for external systems and tests.
- **Auto-start**: batches start immediately — no user confirmation before a batch
- **Gate on failure**: user confirmation is required only when code review returns FAIL
- **Commit per batch**: after each batch is confirmed, commit. Ask the user whether to push to remote unless the user previously opted into auto-push for this session.

## Context Resolution

- TASKS_DIR: `_docs/02_tasks/`
- Task files: selected `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
- Dependency table: `TASKS_DIR/_dependencies_table.md`

### Task Selection Context

The invoking flow decides which task category this run should execute. The implement skill must honor that selected context instead of consuming every file in `todo/`.

| Context | Selected task files |
|---------|---------------------|
| Product implementation | Task specs that are not test-only and not refactoring specs |
| Test implementation | `*_test_infrastructure.md` plus task specs whose `Component` or `Epic` identifies `Blackbox Tests` |
| Refactoring | Task specs whose filename or task ID includes `_refactor_` |

If no explicit context is provided, infer it from the active autodev step:
- greenfield Step 7 or existing-code Step 10 → Product implementation
- greenfield Step 10 or existing-code Step 6 → Test implementation
- refactor Phase 4 → Refactoring

Unselected task files remain in `TASKS_DIR/todo/` for their later flow step.

### Task Lifecycle Folders

```
TASKS_DIR/
├── _dependencies_table.md
├── todo/        ← tasks ready for implementation (this skill reads from here)
├── backlog/     ← parked tasks (not scheduled yet, ignored by this skill)
└── done/        ← completed tasks (moved here after implementation)
```

## Prerequisite Checks (BLOCKING)

1. `TASKS_DIR/todo/` exists and contains at least one task file for the selected context — **STOP if missing**
   - Exception for Product implementation re-entry: if no selected product tasks remain in `todo/`, but the active autodev state is Step 7 or the latest product completeness report is missing/invalid/contains `FAIL`, skip directly to Step 15 (Product Implementation Completeness Gate). This gate may create remediation tasks and return to Step 1. Do not write a final implementation report from this state.
2. `_dependencies_table.md` exists — **STOP if missing**
3. At least one task is not yet completed — **STOP if all done**
4. **Working tree is clean** — run `git status --porcelain`; the output must be empty.
   - If dirty, STOP and present the list of changed files to the user via the Choose format:
     - A) Commit or stash stray changes manually, then re-invoke `/implement`
     - B) Agent commits stray changes as a single `chore: WIP pre-implement` commit and proceeds
     - C) Abort
   - Rationale: each batch ends with a commit. Unrelated uncommitted changes would get silently folded into batch commits otherwise.
   - This check is repeated at the start of each batch iteration (see step 6 / step 14 Loop).

## Algorithm

### 1. Parse

- Read selected task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
- Read `_dependencies_table.md` — parse into a dependency graph (DAG)
- Validate: no circular dependencies in the selected task graph, all referenced selected-task dependencies exist or are already completed in `TASKS_DIR/done/`

### 2. Detect Progress

- Scan the codebase to determine which tasks are already completed
- Match implemented code against task acceptance criteria
- Mark completed tasks as done in the DAG
- Report progress to user: "X of Y tasks completed"

### 3. Compute Next Batch

- Topological sort remaining tasks
- Select tasks whose dependencies are ALL satisfied (completed)
- A batch is simply a coherent group of tasks for review + commit. Within the batch, tasks are implemented sequentially in topological order.
- Cap the batch size at a reasonable review scope (default: 4 tasks)
- If the batch would exceed 20 total complexity points, suggest splitting and let the user decide

### 4. Assign File Ownership

The authoritative file-ownership map is `_docs/02_document/module-layout.md` (produced by the decompose skill's Step 1.5). Task specs are purely behavioral — they do NOT carry file paths. Derive ownership from the layout, not from the task spec's prose.

For each task in the batch:
- Read the task spec's **Component** field.
- Look up the component in `_docs/02_document/module-layout.md` → Per-Component Mapping.
- Set **OWNED** = the component's `Owns` glob (the files this task is allowed to write).
- Set **READ-ONLY** = Public API files of every component in the component's `Imports from` list, plus all `shared/*` Public API files.
- Set **FORBIDDEN** = every other component's `Owns` glob, and every other component's internal (non-Public API) files.
- If the task is a shared / cross-cutting task (lives under `shared/*`), OWNED = that shared directory; READ-ONLY = nothing; FORBIDDEN = every component directory.

Since execution is sequential, there is no parallel-write conflict to resolve; ownership here is a **scope discipline** check — it stops a task from drifting into unrelated components even when alone.

If `_docs/02_document/module-layout.md` is missing or the component is not found:
- STOP the batch.
- Instruct the user to run `/decompose` Step 1.5 or to manually add the component entry to `module-layout.md`.
- Do NOT guess file paths from the task spec — that is exactly the drift this file exists to prevent.

### 5. Update Tracker Status → In Progress

For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before starting work. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.

### 6. Implement Tasks Sequentially

**Per-batch dirty-tree re-check**: before starting the batch, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.

For each task in the batch **in topological order, one at a time**:
1. Read the task spec file.
2. Respect the file-ownership envelope computed in Step 4 (OWNED / READ-ONLY / FORBIDDEN).
3. Implement the feature and write/update tests for every acceptance criterion in the spec. Tests for internal product behavior must exercise the production implementation path. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still exist and skip/block with a clear prerequisite reason, but that skip does not make missing production code complete.
4. Run the relevant tests locally before moving on to the next task in the batch. If tests fail, fix in-place — do not defer.
5. Capture a short per-task status line (files changed, tests pass/fail, any blockers) for the batch report.

Do NOT spawn subagents and do NOT attempt to implement two tasks simultaneously, even if they touch disjoint files. See `.cursor/rules/no-subagents.mdc`.

### 7. Collect Status

- After all tasks in the batch are finished, aggregate the per-task status lines into a structured batch status.
- If any task reported "Blocked", log the blocker with the failing task's ID and continue — the batch report will surface it.

**Stuck detection** — while implementing a task, watch for these signals in your own progress:
- The same file has been rewritten 3+ times without tests going green → stop, mark the task Blocked, and move to the next task in the batch (the user will be asked at the end of the batch).
- You have tried 3+ distinct approaches without evidence-driven progress → stop, mark Blocked, move on.
- Do NOT loop indefinitely on a single task. Record the blocker and proceed.

### 8. AC Test Coverage Verification

Before code review, verify that every acceptance criterion in each task spec has at least one test that validates it. For each task in the batch:

1. Read the task spec's **Acceptance Criteria** section
2. Search the test files (new and existing) for tests that cover each AC
3. Classify each AC as:
   - **Covered**: a test directly validates this AC (running or skipped-with-reason)
   - **Not covered**: no test exists for this AC

If any AC is **Not covered**:
- This is a **BLOCKING** failure — the missing test must be written before proceeding
- Go back to the offending task, add tests for the specific ACs that lack coverage, then re-run this check
- If the test cannot run in the current environment (GPU required, platform-specific, external service), the test must still exist and skip with `pytest.mark.skipif` or `pytest.skip()` explaining the prerequisite
- A skipped test counts as **Covered** — the test exists and will run when the environment allows

Only proceed to Step 9 when every AC has a corresponding test.

### 9. Code Review

- Run `/code-review` skill on the batch's changed files + corresponding task specs
- The code-review skill produces a verdict: PASS, PASS_WITH_WARNINGS, or FAIL

### 10. Auto-Fix Gate

Bounded auto-fix loop — only applies to **mechanical** findings. Critical and Security findings are never auto-fixed.

**Auto-fix eligibility matrix:**

| Severity | Category | Auto-fix? |
|----------|----------|-----------|
| Low | any | yes |
| Medium | Style, Maintainability, Performance | yes |
| Medium | Bug, Spec-Gap, Security, Architecture | escalate |
| High | Style, Scope | yes |
| High | Bug, Spec-Gap, Performance, Maintainability, Architecture | escalate |
| Critical | any | escalate |
| any | Security | escalate |
| any | Architecture (cyclic deps) | escalate |

Flow:

1. If verdict is **PASS** or **PASS_WITH_WARNINGS**: show findings as info, continue to step 11
2. If verdict is **FAIL**:
   - Partition findings into auto-fix-eligible and escalate (using the matrix above)
   - For eligible findings, attempt fixes using location/description/suggestion, then re-run `/code-review` on modified files (max 2 rounds)
   - If all remaining findings are auto-fix-eligible and re-review now passes → continue to step 11
   - If any non-eligible finding exists at any point → stop auto-fixing, present the full list to the user (**BLOCKING**)
3. User must explicitly approve each non-auto-fix finding (accept, request manual fix, mark as out-of-scope) before proceeding.

Track `auto_fix_attempts` and `escalated_findings` in the batch report for retrospective analysis.

### 11. Commit (and optionally Push)

- After user confirms the batch (explicitly for FAIL, implicitly for PASS/PASS_WITH_WARNINGS):
  - `git add` all changed files from the batch
  - `git commit` with a message that includes ALL task IDs (tracker IDs or numeric prefixes) of tasks implemented in the batch, followed by a summary of what was implemented. Format: `[TASK-ID-1] [TASK-ID-2] ... Summary of changes`
  - Ask the user whether to push to remote, unless the user previously opted into auto-push for this session

### 12. Update Tracker Status → In Testing

After the batch is committed (and pushed if the user approved pushing), transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.

### 13. Archive Completed Tasks

Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.

For product implementation, this archive means "batch implementation accepted." The Product Implementation Completeness Gate can still require follow-up remediation tasks before the feature is complete; it does not move original task files back to `todo/`.

### 14. Loop

- Go back to step 2 until all tasks in `todo/` are done

### 14.5. Cumulative Code Review (every K batches)

- **Trigger**: every K completed batches (default `K = 3`; configurable per run via a `cumulative_review_interval` knob in the invocation context)
- **Purpose**: per-batch review (Step 9) catches batch-local issues; cumulative review catches issues that only appear when tasks are combined — architecture drift, cross-task inconsistency, duplicate symbols introduced across different batches, contracts that drifted across producer/consumer batches
- **Scope**: the union of files changed since the **last** cumulative review (or since the start of the run if this is the first)
- **Action**: invoke `.cursor/skills/code-review/SKILL.md` in **cumulative mode**. All 7 phases run, with emphasis on Phase 6 (Cross-Task Consistency), Phase 7 (Architecture Compliance), and duplicate-symbol detection across the accumulated code
- **Output**: write the report to `_docs/03_implementation/cumulative_review_batches_[NN-MM]_cycle[N]_report.md` where `[NN-MM]` is the batch range covered and `[N]` is the current `state.cycle`. When `_docs/02_document/architecture_compliance_baseline.md` exists, the report includes the `## Baseline Delta` section (carried over / resolved / newly introduced) per `code-review/SKILL.md` "Baseline delta".
- **Gate**:
  - `PASS` or `PASS_WITH_WARNINGS` → continue to next batch (step 14 loop)
  - `FAIL` → STOP. Present the report to the user via the Choose format:
    - A) Auto-fix findings using the Auto-Fix Gate matrix in step 10, then re-run cumulative review
    - B) Open a targeted refactor run (invoke refactor skill in guided mode with the findings as `list-of-changes.md`)
    - C) Manually fix, then re-invoke `/implement`
  - Do NOT loop to the next batch on `FAIL` — the whole point is to stop drift before it compounds
- **Interaction with Auto-Fix Gate**: Architecture findings (new category from code-review Phase 7) always escalate per the implement auto-fix matrix; they cannot silently auto-fix
- **Resumability**: if interrupted, the next invocation checks for the latest `cumulative_review_batches_*.md` and computes the changed-file set from batch reports produced after that review

### 15. Product Implementation Completeness Gate

Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate only when the remaining context is explicitly test implementation or refactoring, as determined by the task files and report filename rules.

**Goal**: catch the failure mode where narrow tests validate scaffold behavior while the task's actual outcome, included scope, architecture promise, or named integration remains unimplemented.

Inputs:

- Completed product task specs from `_docs/02_tasks/done/` for the current cycle
- `_docs/02_document/architecture.md`
- `_docs/02_document/system-flows.md`
- Relevant `_docs/02_document/components/*/description.md` files
- Current source code under each completed task's ownership envelope
- Batch reports and code-review reports for the current cycle

For each completed product task:

1. Read these sections from the task spec: `Description`, `Outcome`, `Scope / Included`, `Acceptance Criteria`, `Non-Functional Requirements`, `Constraints`, and explicit named technologies or integrations.
2. Compare those promises against actual source code, not only tests or report prose.
3. Search the task's owned component files for unresolved implementation markers: `placeholder`, `stub`, `reserved`, `TODO`, `NotImplemented`, `pass`, `deterministic`, `fake`, `mock`, `scaffold`, `native bridge`, and empty native/readme-only integration directories. Ignore test fixtures/mocks only when they are under test-owned paths and not used as production behavior.
4. Verify that each named runtime dependency in the task promise is integrated as production behavior, not merely represented by an interface. Examples: if a task promises FAISS, DINOv2, BASALT, LightGlue, OpenCV, RANSAC, a database, cloud service, or hardware SDK, the production code must either call that dependency or contain an adapter that loads and executes the real dependency package. A deterministic fallback, fake runner, empty `native/` package, or "bridge to be supplied later" is **FAIL** unless the task itself explicitly scoped the dependency out before implementation started.
5. Distinguish internal implementation from external prerequisites:
   - Internal product capabilities (VIO, anchor verification, cache retrieval, safety wrapper, FDR, MAVLink emission) must be implemented in production code before the task can pass.
   - External systems/hardware/data (Jetson device, physical camera, ArduPilot process, QGC, third-party service credentials, unavailable licensed dataset) may be `BLOCKED` only when production code exists and the missing prerequisite is outside the product boundary.
6. Verify tests exercise the real implementation path where local prerequisites exist. Environment-gated tests may skip only with an explicit prerequisite reason; they do not make missing production code complete.
7. For any architecture promise that describes an end-to-end user outcome, verify there is an executable production pipeline connecting the relevant components. Isolated component contracts and test-only harness orchestration are not enough.
8. Classify each task:
   - **PASS**: task promises are implemented or explicitly out of scope in the task itself.
   - **BLOCKED**: production code exists but cannot be fully verified due to external hardware/data/license/runtime prerequisites; the blocker is explicit and tests report blocked/skipped with reason.
   - **FAIL**: promised production behavior is missing, only scaffolded, or only represented in tests/reports.

Save the audit to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` with:

- Per-task classification
- Evidence files/symbols checked
- Any unresolved scaffold/native placeholders
- Any named promised technologies not integrated
- Required remediation task suggestions, each sized to 5 points or less

Gate:

- If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, continue to Final Test Run.
- If any product task is `FAIL`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
  - A) Create remediation tasks now and return to implementation
  - B) Mark the missing behavior explicitly out of scope in task/docs, then re-run this gate
  - C) Abort for manual correction
- Recommendation must normally be A unless the user deliberately accepts reduced scope.

Remediation task creation:

1. For each `FAIL`, create one or more task specs using `.cursor/skills/decompose/templates/task.md`; each remediation task must be sized at 5 points or less.
2. Save each task to `_docs/02_tasks/todo/` with a short name prefixed by `remediate_`.
3. Set **Component** to the failed task's component and set **Dependencies** to the failed task ID plus any remediation prerequisites.
4. Create or defer tracker tickets using the same tracker rules as decompose/new-task: if tracker is available, create tickets immediately; if the user explicitly chose `tracker: local`, keep numeric prefixes with `Tracker: pending` / `Epic: pending`.
5. Append the remediation tasks to `_docs/02_tasks/_dependencies_table.md`.
6. Return to Step 1 (Parse) in **Product implementation** context. The final product implementation report can be written only after remediation tasks complete and this gate reruns without `FAIL`.

### 16. Final Test Run

- After all batches are complete, run the full test suite once unless the invoking flow's immediate next step is `Run Tests`.
- If the next flow step is `Run Tests`, record a handoff in the final implementation report and let `.cursor/skills/test-run/SKILL.md` own the full-suite gate to avoid duplicate full runs.
- When this step does run, read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices).
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision.
- When tests pass, report final summary.

## Batch Report Persistence

After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. For product implementation, produce the FINAL implementation report only after the Product Implementation Completeness Gate passes. For test and refactor implementation, produce the FINAL report after all selected tasks complete and the full-suite gate is either run or handed off per Step 16. The filename depends on context:

- **Test implementation** (tasks from test decomposition): `_docs/03_implementation/implementation_report_tests.md`
- **Feature implementation**: `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` where `{feature_slug}` is derived from the batch task names (e.g., `implementation_report_core_api_cycle2.md`) and `{N}` is the current `state.cycle` from `_docs/_autodev_state.md`. If `state.cycle` is absent (pre-migration), default to `cycle1`.
- **Refactoring**: `_docs/03_implementation/implementation_report_refactor_{run_name}.md`

Determine the context from the task files being implemented: if all tasks have test-related names or belong to a test epic, use the tests filename; otherwise derive the feature slug from the component names and append the cycle suffix.

Batch report filenames must also include the cycle counter when running feature implementation: `_docs/03_implementation/batch_{NN}_cycle{N}_report.md` (test and refactor runs may use the plain `batch_{NN}_report.md` form since they are not cycle-scoped).

## Batch Report

After each batch, produce a structured report:

```markdown
# Batch Report

**Batch**: [N]
**Tasks**: [list]
**Date**: [YYYY-MM-DD]

## Task Results

| Task | Status | Files Modified | Tests | AC Coverage | Issues |
|------|--------|---------------|-------|-------------|--------|
| [TRACKER-ID]_[name] | Done | [count] files | [pass/fail] | [N/N ACs covered] | [count or None] |

## AC Test Coverage: [All covered / X of Y covered]
## Code Review Verdict: [PASS/FAIL/PASS_WITH_WARNINGS]
## Auto-Fix Attempts: [0/1/2]
## Stuck Agents: [count or None]

## Next Batch: [task list] or "All tasks complete"
```

## Stop Conditions and Escalation

| Situation | Action |
|-----------|--------|
| Same task rewritten 3+ times without green tests | Mark Blocked, continue batch, escalate at batch end |
| Task blocked on external dependency (not in task list) | Report and skip |
| File ownership violated (task wrote outside OWNED) | ASK user |
| Product completeness gate finds missing promised implementation | STOP — create remediation tasks or get explicit user scope reduction |
| Test failure after final test run | Delegate to test-run skill — blocking gate |
| All tasks complete | Report final summary, suggest final commit |
| `_dependencies_table.md` missing | STOP — run `/decompose` first |

## Recovery

Each batch commit serves as a rollback checkpoint. If recovery is needed:

- **Tests fail after final test run**: `git revert <batch-commit-hash>` using hashes from the batch reports in `_docs/03_implementation/`
- **Resuming after interruption**: Read `_docs/03_implementation/batch_*_report.md` files (filtered by current `state.cycle` for feature implementation) to determine which batches completed, then continue from the next batch
- **Multiple consecutive batches fail**: Stop and escalate to user with links to batch reports and commit hashes

## Safety Rules

- Never start a task whose dependencies are not yet completed
- Never run tasks in parallel and never spawn subagents — see `.cursor/rules/no-subagents.mdc`
- If a task is flagged as stuck, stop working on it and report — do not let it loop indefinitely
- Always run the Product Implementation Completeness Gate before final product reports
- Always run or hand off the full test suite after all batches complete (step 16)