# Autopilot State Management

## State File: `_docs/_autopilot_state.md`

The autopilot persists its state to `_docs/_autopilot_state.md`. This file is the primary source of truth for re-entry. Folder scanning is the fallback when the state file doesn't exist or is inconsistent with the `_docs/` folder contents.

### Format

```markdown
# Autopilot State

## Current Step
flow: [greenfield | existing-code]
step: [1-10 for greenfield, 1-12 for existing-code, or "done"]
name: [step name from the active flow's Step Reference Table]
status: [not_started | in_progress | completed | skipped | failed]
sub_step: [optional — sub-skill internal step number + name if interrupted mid-step]
retry_count: [0-3 — number of consecutive auto-retry attempts for current step, reset to 0 on success]

When updating `Current Step`, always write it as:
  flow: existing-code   ← active flow
  step: N               ← autopilot step (sequential integer)
  sub_step: M           ← sub-skill's own internal step/phase number + name
  retry_count: 0        ← reset on new step or success; increment on each failed retry
Example:
  flow: greenfield
  step: 3
  name: Plan
  status: in_progress
  sub_step: 4 — Architecture Review & Risk Assessment
  retry_count: 0
Example (failed after 3 retries):
  flow: existing-code
  step: 2
  name: Test Spec
  status: failed
  sub_step: 1b — Test Case Generation
  retry_count: 3

## Completed Steps

| Step | Name | Completed | Key Outcome |
|------|------|-----------|-------------|
| 1 | [name] | [date] | [one-line summary] |
| 2 | [name] | [date] | [one-line summary] |
| ... | ... | ... | ... |

## Key Decisions
- [decision 1: e.g. "Tech stack: Python + Rust for perf-critical, Postgres DB"]
- [decision N]

## Last Session
date: [date]
ended_at: Step [N] [Name] — SubStep [M] [sub-step name]
reason: [completed step / session boundary / user paused / context limit]
notes: [any context for next session]

## Retry Log
| Attempt | Step | Name | SubStep | Failure Reason | Timestamp |
|---------|------|------|---------|----------------|-----------|
| 1 | [step] | [name] | [sub_step] | [reason] | [date-time] |
| ... | ... | ... | ... | ... | ... |

(Append a row on each failed auto-retry. Clear this table when the step succeeds or the user resets.)

## Blockers
- [blocker 1, if any]
- [none]
```
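As an illustration of how a tool might read this format, here is a minimal Python sketch that extracts the `Current Step` fields. The function name `parse_current_step` and the `sample` text are assumptions made for this example; the format itself does not prescribe a parser.

```python
import re

# Keys listed under "## Current Step" in the format above.
FIELDS = ("flow", "step", "name", "status", "sub_step", "retry_count")


def parse_current_step(text: str) -> dict:
    """Return the fields found under the '## Current Step' heading."""
    # Capture everything between the heading and the next '## ' heading
    # (or the end of the file if no other heading follows).
    match = re.search(r"^## Current Step\n(.*?)(?=^## |\Z)", text, re.S | re.M)
    if match is None:
        return {}
    state = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # Keep the first occurrence of each known key; skip anything else.
        if sep and key in FIELDS and key not in state:
            state[key] = value.strip()
    # 'step' stays a string because it may be "done"; retry_count is numeric.
    if state.get("retry_count", "").isdigit():
        state["retry_count"] = int(state["retry_count"])
    return state


# Hypothetical state file content, used only to exercise the parser.
sample = """# Autopilot State

## Current Step
flow: existing-code
step: 4
name: Implement Tests
status: not_started
retry_count: 0

## Completed Steps
"""

print(parse_current_step(sample))
```

Optional keys such as `sub_step` are simply absent from the result when the file omits them.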

### State File Rules

1. **Create** the state file on the very first autopilot invocation (after state detection determines Step 1)
2. **Update** the state file after every step completion, every session boundary, every BLOCKING gate confirmation, and every failed retry attempt
3. **Read** the state file as the first action on every invocation — before folder scanning
4. **Cross-check**: after reading the state file, verify against actual `_docs/` folder contents. If they disagree (e.g., state file says Step 3 but `_docs/02_document/architecture.md` already exists), trust the folder structure and update the state file to match
5. **Never delete** the state file. It accumulates history across the entire project lifecycle
6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` when the step succeeds or the user manually resets. If `retry_count` reaches 3, set `status: failed` and add an entry to `Blockers`
7. **Failed state on re-entry**: if the state file shows `status: failed` with `retry_count: 3`, do NOT auto-retry — present the blocker to the user and wait for their decision before proceeding
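
Rules 6 and 7 can be sketched as a small state transition, shown below in Python. The `StepState` class and the helper names are hypothetical and exist only for this example; the limit of 3 comes from rule 6.

```python
from dataclasses import dataclass, field
from typing import List

MAX_RETRIES = 3  # rule 6: the step is marked failed when retry_count reaches 3


@dataclass
class StepState:
    """Hypothetical in-memory view of the retry-related state file fields."""
    status: str = "in_progress"
    retry_count: int = 0
    blockers: List[str] = field(default_factory=list)


def record_failed_retry(state: StepState, reason: str) -> None:
    """Rule 6: count the failed attempt; at the limit, mark failed and add a blocker."""
    state.retry_count += 1
    if state.retry_count >= MAX_RETRIES:
        state.status = "failed"
        state.blockers.append(reason)


def record_success(state: StepState) -> None:
    """Rule 6: a successful step resets the counter."""
    state.status = "completed"
    state.retry_count = 0


def may_auto_retry(state: StepState) -> bool:
    """Rule 7: never auto-retry a step that already failed at the retry limit."""
    return not (state.status == "failed" and state.retry_count >= MAX_RETRIES)


state = StepState()
for attempt in range(3):
    record_failed_retry(state, "Test Case Generation kept failing")
print(state.status, state.retry_count, may_auto_retry(state))
```

A manual reset by the user would restore `retry_count` to 0 in the same way as `record_success`, leaving the choice of the new `status` to the user's decision.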

## State Detection

Read `_docs/_autopilot_state.md` first. If it exists and is consistent with the folder structure, use the `Current Step` from the state file. If the state file doesn't exist or is inconsistent, fall back to folder scanning.

### Folder Scan Rules (fallback)

Scan `_docs/` to determine the current workflow position. The detection rules are defined in each flow file (`flows/greenfield.md` and `flows/existing-code.md`). Check the existing-code flow rules first (Step 1 detection), then the greenfield flow rules. The first match wins.
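
The "existing-code first, then greenfield, first match wins" ordering could look like the following Python sketch. The `Rule` tuple shape, the `scan_folders` name, and the predicates are assumptions for illustration; the real detection rules live in the flow files and are not reproduced here.

```python
from pathlib import Path
from typing import Callable, List, Optional, Tuple

# (flow, step, predicate). Each predicate is a hypothetical stand-in for one
# detection rule from flows/existing-code.md or flows/greenfield.md.
Rule = Tuple[str, int, Callable[[Path], bool]]


def scan_folders(
    docs: Path,
    existing_code_rules: List[Rule],
    greenfield_rules: List[Rule],
) -> Optional[Tuple[str, int]]:
    """Return (flow, step) for the first matching rule, or None if none match."""
    # Existing-code rules are checked before greenfield rules; first match wins.
    for flow, step, matches in [*existing_code_rules, *greenfield_rules]:
        if matches(docs):
            return flow, step
    return None


# Placeholder predicates: real rules would inspect the contents of `docs`.
existing_code_rules = [("existing-code", 1, lambda docs: False)]
greenfield_rules = [
    ("greenfield", 2, lambda docs: True),
    ("greenfield", 3, lambda docs: True),  # never reached: an earlier rule matched
]

print(scan_folders(Path("_docs"), existing_code_rules, greenfield_rules))
```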

## Re-Entry Protocol

When the user invokes `/autopilot` and work already exists:

1. Read `_docs/_autopilot_state.md`
2. Cross-check against `_docs/` folder structure
3. Present Status Summary with context from state file (key decisions, last session, blockers)
4. If the detected step's sub-skill has built-in resumability (plan, decompose, implement, and deploy all do), let the sub-skill handle mid-step recovery
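5. Continue execution from the detected state, unless the state file shows `status: failed` with `retry_count: 3`; in that case present the blocker to the user and wait for their decision (State File Rules, rule 7)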
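
The decision made during re-entry can be sketched as below, combining the cross-check (State File Rules, rule 4) with the failed-state guard (rule 7). The function name, the dictionary shapes, and the `rewrite_state_file` flag are hypothetical; positions are `(flow, step)` pairs with the step kept as a string so that `"done"` fits too.

```python
from typing import Optional, Tuple

MAX_RETRIES = 3  # retry limit from State File Rules, rule 6


def reentry_action(
    file_state: Optional[dict],
    folder_position: Optional[Tuple[str, str]],
) -> dict:
    """Decide how to resume, given the parsed state file and the folder scan result."""
    if file_state is None and folder_position is None:
        raise ValueError("no prior work detected; nothing to resume")

    file_position = None
    if file_state is not None:
        file_position = (file_state["flow"], str(file_state["step"]))

    # Rule 4: when the two sources disagree (or the state file is missing),
    # trust the folder structure and bring the state file in line with it.
    if folder_position is not None and folder_position != file_position:
        return {"action": "resume", "position": folder_position, "rewrite_state_file": True}

    # Rule 7: a step that failed at the retry limit is never auto-retried.
    if file_state["status"] == "failed" and file_state.get("retry_count", 0) >= MAX_RETRIES:
        return {"action": "ask_user", "position": file_position, "rewrite_state_file": False}

    return {"action": "resume", "position": file_position, "rewrite_state_file": False}


failed = {"flow": "existing-code", "step": 2, "status": "failed", "retry_count": 3}
print(reentry_action(failed, ("existing-code", "2")))
```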

## Session Boundaries

After any decompose or planning step completes, **do not auto-chain to the implement step**. Instead:

1. Update state file: mark the step as completed, set current step to the next implement step with status `not_started`
   - Existing-code flow: After Step 3 (Decompose Tests) → set current step to 4 (Implement Tests)
   - Existing-code flow: After Step 7 (New Task) → set current step to 8 (Implement)
   - Greenfield flow: After Step 5 (Decompose) → set current step to 6 (Implement)
2. Write `Last Session` section: `reason: session boundary`, `notes: Decompose complete, implementation ready`
3. Present a summary: number of tasks, estimated batches, total complexity points
4. Ask the user how to proceed, using the Choose format:

```
══════════════════════════════════════
 DECISION REQUIRED: Decompose complete — start implementation?
══════════════════════════════════════
 A) Start a new conversation for implementation (recommended for context freshness)
 B) Continue implementation in this conversation
══════════════════════════════════════
 Recommendation: A — implementation is the longest phase, fresh context helps
══════════════════════════════════════
```

These are the only hard session boundaries. All other transitions auto-chain.
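
The three boundaries listed above can be captured in a lookup table, as in this Python sketch. The table contents are copied from the list in this section; the `boundary_update` name and the shape of the returned dictionary are assumptions for illustration.

```python
from typing import Optional

# (flow, completed step) -> (next step, next step name), as listed above.
SESSION_BOUNDARIES = {
    ("existing-code", 3): (4, "Implement Tests"),
    ("existing-code", 7): (8, "Implement"),
    ("greenfield", 5): (6, "Implement"),
}


def boundary_update(flow: str, completed_step: int) -> Optional[dict]:
    """Return the state file updates for a hard session boundary, or None to auto-chain."""
    target = SESSION_BOUNDARIES.get((flow, completed_step))
    if target is None:
        return None  # not a hard boundary: the autopilot auto-chains
    step, name = target
    return {
        "current_step": {
            "flow": flow,
            "step": step,
            "name": name,
            "status": "not_started",
            "retry_count": 0,  # reset on a new step
        },
        "last_session": {
            "reason": "session boundary",
            "notes": "Decompose complete, implementation ready",
        },
    }


print(boundary_update("greenfield", 5))
print(boundary_update("greenfield", 2))
```

A `None` result means the transition is not a hard boundary, so the next step starts without asking the user.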