mirror of
https://github.com/azaion/ui.git
synced 2026-04-22 22:46:34 +00:00
395 lines
21 KiB
Markdown
# Autodev Protocols

## User Interaction Protocol

Every time the autodev or a sub-skill needs a user decision, use the **Choose A / B / C / D** format. This applies to:

- State transitions where multiple valid next actions exist
- Sub-skill BLOCKING gates that require user judgment
- Any fork where the autodev cannot confidently pick the right path
- Trade-off decisions (tech choices, scope, risk acceptance)

### When to Ask (MUST ask)

- The next action is ambiguous (e.g., "another research round or proceed?")
- The decision has irreversible consequences (e.g., architecture choices, skipping a step)
- The user's intent or preference cannot be inferred from existing artifacts
- A sub-skill's BLOCKING gate explicitly requires user confirmation
- Multiple valid approaches exist with meaningfully different trade-offs

### When NOT to Ask (auto-transition)

- Only one logical next step exists (e.g., Problem complete → Research is the only option)
- The transition is deterministic from the state (e.g., Plan complete → Decompose)
- The decision is low-risk and reversible
- Existing artifacts or prior decisions already imply the answer

### Choice Format

Always present decisions in this format:

```
══════════════════════════════════════
DECISION REQUIRED: [brief context]
══════════════════════════════════════
A) [Option A — short description]
B) [Option B — short description]
C) [Option C — short description, if applicable]
D) [Option D — short description, if applicable]
══════════════════════════════════════
Recommendation: [A/B/C/D] — [one-line reason]
══════════════════════════════════════
```

Rules:

1. Always provide 2–4 concrete options (never open-ended questions)
2. Always include a recommendation with a brief justification
3. Keep option descriptions to one line each
4. If only 2 options make sense, use A/B only — do not pad with filler options
5. Play the notification sound (per `.cursor/rules/human-attention-sound.mdc`) before presenting the choice
6. After the user picks, proceed immediately — no follow-up confirmation unless the choice was destructive

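The format and rules 1–4 above lend themselves to a mechanical check. The Python sketch below is one possible way to render a Choose block; `render_choice_block`, its parameters, and the rule width are hypothetical and not part of the autodev itself.

```python
RULE = "═" * 38  # horizontal rule; the exact width is an assumption


def render_choice_block(context, options, recommended, reason):
    """Render a Choose block; `options` is a list of one-line descriptions."""
    if not 2 <= len(options) <= 4:
        raise ValueError("provide 2-4 concrete options")
    if any("\n" in text for text in options):
        raise ValueError("keep each option description to one line")
    labels = "ABCD"[: len(options)]  # no filler: only as many labels as options
    if recommended not in labels:
        raise ValueError("the recommendation must name one of the options")
    lines = [RULE, f"DECISION REQUIRED: {context}", RULE]
    lines += [f"{label}) {text}" for label, text in zip(labels, options)]
    lines += [RULE, f"Recommendation: {recommended} — {reason}", RULE]
    return "\n".join(lines)
```

Calling it with two options yields an A/B block only, which matches rule 4. Rules 5 and 6 (sound, immediate continuation) concern behavior around the prompt and are outside this sketch.
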
## Optional Skill Gate (reusable template)

Several flow steps ask the user whether to run an optional skill (security audit, performance test, etc.) before auto-chaining. Instead of re-stating the Choose block and skip semantics at each such step, flow files invoke this shared template.

### Template shape

```
══════════════════════════════════════
DECISION REQUIRED: <question>
══════════════════════════════════════
A) <option-a-label>
B) <option-b-label>
══════════════════════════════════════
Recommendation: <A|B> — <reason>
══════════════════════════════════════
```

### Semantics (same for every invocation)

- **On A** → read and execute the target skill's `SKILL.md`; after it completes, auto-chain to `<next-step>`.
- **On B** → mark the current step `skipped` in the state file; auto-chain to `<next-step>`.
- **On skill failure** → standard Failure Handling (§Failure Handling) — retry ladder, then escalate via Choose block.
- **Sound before the prompt** — follow `.cursor/rules/human-attention-sound.mdc`.

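As a rough illustration of the A/B semantics above, the following Python sketch treats the state file as a plain dict. The function name, the `status` key, and the `run_skill` callback are assumptions made for the example.

```python
def resolve_optional_gate(choice, state, run_skill, next_step):
    """Apply the gate semantics to `state` and return the step to chain to."""
    if choice == "A":
        # Run the target skill; an exception here is left to propagate so the
        # caller can apply the standard Failure Handling.
        run_skill()
    elif choice == "B":
        state["status"] = "skipped"  # hypothetical key for the current step
    else:
        raise ValueError("the Optional Skill Gate accepts only A or B")
    return next_step  # both paths auto-chain to the same next step
```

Both branches return the same `next_step`, which is the property that makes a step eligible for this template.
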
### How flow files invoke it

Each flow-file step that needs this gate supplies only the variable parts:

```
Action: Apply the **Optional Skill Gate** (protocols.md → "Optional Skill Gate") with:
- question: <Choose-block header>
- option-a-label: <one-line A description>
- option-b-label: <one-line B description>
- recommendation: <A|B> — <short reason, may be dynamic>
- target-skill: <.cursor/skills/<name>/SKILL.md, plus any mode hint>
- next-step: Step <N> (<name>)
```

The resolved Choose block (shape above) is then rendered verbatim by substituting these variables. Do NOT reword the shared scaffolding — reword only the variable parts. If a step needs different semantics (e.g., "re-run same skill" rather than "skip to next step"), it MUST NOT use this template; it writes the Choose block inline with its own semantics.

### When NOT to use this template

- The user choice has **more than two options** (A/B/C/D).
- The choice is **not "run-or-skip-this-skill"** (e.g., "another round of the same skill", "pick tech stack", "proceed vs. rollback").
- The skipped path needs special bookkeeping beyond `status: skipped` (e.g., must also move artifacts, notify tracker, trigger a different skill).

For those cases, write the Choose block inline using the base format in §User Interaction Protocol.

## Work Item Tracker Authentication

All tracker detection, authentication, availability gating, `tracker: local` fallback semantics, and leftovers handling are defined in `.cursor/rules/tracker.mdc`. Follow that rule — do not restate its logic here.

Autodev-specific additions on top of the rule:

### Steps That Require Work Item Tracker

Before entering a step from this table for the first time in a session, verify tracker availability per `.cursor/rules/tracker.mdc`. If the user has already chosen `tracker: local`, skip the gate and proceed.

| Flow | Step | Sub-Step | Tracker Action |
|------|------|----------|----------------|
| greenfield | Plan | Step 6 — Epics | Create epics for each component |
| greenfield | Decompose | Step 1 + Step 2 + Step 3 — All tasks | Create ticket per task, link to epic |
| existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |

### State File Marker

Record the resolved choice in the state file once per session: `tracker: jira` or `tracker: local`. Subsequent steps read this marker instead of re-running the gate.

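A minimal sketch of reading the marker, assuming it is stored as a plain `tracker: <value>` line somewhere in the state file text. The actual layout of `_docs/_autodev_state.md` is not specified here, and both function names are hypothetical.

```python
import re


def read_tracker_marker(state_text):
    """Return the recorded tracker choice, or None if none is recorded."""
    match = re.search(r"^[ \t]*tracker:[ \t]*(\S+)", state_text, re.MULTILINE)
    return match.group(1) if match else None


def needs_tracker_gate(state_text):
    # The availability gate runs only while no choice has been recorded.
    return read_tracker_marker(state_text) is None
```
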
## Error Handling

All error situations that require user input MUST use the **Choose A / B / C / D** format.

| Situation | Action |
|-----------|--------|
| State detection is ambiguous (artifacts suggest two different steps) | Present findings and use Choose format with the candidate steps as options |
| Sub-skill fails or hits an unrecoverable blocker | Use Choose format: A) retry, B) skip with warning, C) abort and fix manually |
| User wants to skip a step | Use Choose format: A) skip (with dependency warning), B) execute the step |
| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |

## Failure Handling

One retry ladder covers all failure modes: explicit failure returned by a sub-skill, stuck loops detected while monitoring, and persistent failures across conversations. The single counter is `retry_count` in the state file; the single escalation is the Choose block below.

### Failure signals

Treat the sub-skill as **failed** when ANY of the following is observed:

- The sub-skill explicitly returns a failed result (including blocked subagents, auto-fix loop exhaustion, prerequisite violations).
- **Stuck signals**: the same artifact is rewritten 3+ times without meaningful change; the sub-skill re-asks a question that was already answered; no new artifact has been saved despite active execution.

### Retry ladder

```
Failure observed
│
├─ retry_count < 3 ?
│    YES → increment retry_count in state file
│        → re-read the sub-skill's SKILL.md and _docs/_autodev_state.md
│        → resume from the last recorded sub_step (restart from sub_step 1 only if corruption is suspected)
│        → loop
│
│    NO (retry_count = 3) →
│        → set status: failed and retry_count: 3 in Current Step
│        → play notification sound (.cursor/rules/human-attention-sound.mdc)
│        → escalate (Choose block below)
│        → do NOT auto-retry until the user intervenes
```

Rules:

1. **Auto-retry is immediate** — do not ask before retrying.
2. **Preserve `sub_step`** across retries unless the failure indicates artifact corruption.
3. **Reset `retry_count: 0` on success.**
4. The counter is **per step, per cycle**. It is not cleared by crossing a session boundary — persistence across conversations is intentional; it IS the circuit breaker.

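The ladder and its rules can be sketched in a few lines of Python. This is only an illustration that follows the diagram literally; the state file is modeled as a dict, and `on_failure` / `on_success` are hypothetical names.

```python
MAX_RETRIES = 3


def on_failure(state):
    """Apply one rung of the ladder to `state`; return the action to take."""
    count = state.get("retry_count", 0)
    if count < MAX_RETRIES:
        state["retry_count"] = count + 1
        return "retry"  # immediate, resuming from state["sub_step"]
    # Budget exhausted: record the failure and hand over to the user.
    state["status"] = "failed"
    state["retry_count"] = MAX_RETRIES
    return "escalate"  # no further auto-retry until the user intervenes


def on_success(state):
    state["retry_count"] = 0  # rule 3: reset on success
```

Because the counter lives in `state` rather than in a local variable, it survives a session boundary as long as the state is written back to disk, which is what lets it act as a circuit breaker.
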
### Escalation

```
══════════════════════════════════════
SKILL FAILED: [Skill Name] — 3 consecutive failures
══════════════════════════════════════
Step: [N] — [Name]
SubStep: [M] — [sub-step name]
Last failure reason: [reason]
══════════════════════════════════════
A) Retry with fresh context (new conversation)
B) Skip this step with warning
C) Abort — investigate and fix manually
══════════════════════════════════════
Recommendation: A — fresh context often resolves
                persistent failures
══════════════════════════════════════
```

### Re-entry after escalation

On the next invocation, if the state file shows `status: failed` AND `retry_count: 3`, do NOT auto-retry. Present the escalation block above first:

- User picks A → reset `retry_count: 0`, set `status: in_progress`, re-execute.
- User picks B → mark step `skipped`, proceed to the next step.
- User picks C → stop; return control to the user.

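The re-entry check and the three outcomes above could be expressed as follows. This is a sketch under the assumption that the state file is loaded into a dict with `status` and `retry_count` keys; the function names and return values are illustrative only.

```python
def must_escalate_on_entry(state):
    """True when the previous session ended with an exhausted retry budget."""
    return state.get("status") == "failed" and state.get("retry_count") == 3


def apply_escalation_choice(state, choice):
    """Update `state` for the user's A/B/C pick and return what happens next."""
    if choice == "A":
        state["retry_count"] = 0
        state["status"] = "in_progress"
        return "re-execute"
    if choice == "B":
        state["status"] = "skipped"
        return "next-step"
    if choice == "C":
        return "stop"  # state stays failed; control returns to the user
    raise ValueError("expected A, B, or C")
```
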
### Incident retrospective

Immediately after the user has made their A/B/C choice, invoke `.cursor/skills/retrospective/SKILL.md` in **incident mode**:

```
mode: incident
failing_skill: <skill name>
failure_summary: <last failure reason string>
```

This produces `_docs/06_metrics/incident_<YYYY-MM-DD>_<skill>.md` and appends 1–3 lessons to `_docs/LESSONS.md` under `process` or `tooling`. The retro runs even if the user picked Abort — the goal is to capture the pattern while it is fresh. If the retrospective skill itself fails, log the failure to `_docs/_process_leftovers/` but do NOT block the user's recovery choice from completing.

## Context Management Protocol

### Principle

Disk is memory. Never rely on in-context accumulation — read from `_docs/` artifacts, not from conversation history.

### Minimal Re-Read Set Per Skill

When re-entering a skill (new conversation or context refresh):

- Always read: `_docs/_autodev_state.md`
- Always read: the active skill's `SKILL.md`
- Conditionally read: only the `_docs/` artifacts the current sub-step requires (listed in each skill's Context Resolution section)
- Never bulk-read: do not load all `_docs/` files at once

### Mid-Skill Interruption

If context is filling up during a long skill (e.g., document, implement):

1. Save current sub-step progress to the skill's artifact directory
2. Update `_docs/_autodev_state.md` with exact sub-step position
3. Suggest a new conversation: "Context is getting long — recommend continuing in a fresh conversation for better results"
4. On re-entry, the skill's resumability protocol picks up from the saved sub-step

### Large Artifact Handling

When a skill needs to read large files (e.g., full solution.md, architecture.md):

- Read only the sections relevant to the current sub-step
- Use search tools (Grep, SemanticSearch) to find specific sections rather than reading entire files
- Summarize key decisions from prior steps in the state file so they don't need to be re-read

### Context Budget Heuristic

Agents cannot programmatically query context window usage. Use these heuristics to avoid degradation:

| Zone | Indicators | Action |
|------|-----------|--------|
| **Safe** | State file + SKILL.md + 2–3 focused artifacts loaded | Continue normally |
| **Caution** | 5+ artifacts loaded, or 3+ large files (architecture, solution, discovery), or conversation has 20+ tool calls | Complete current sub-step, then suggest session break |
| **Danger** | Repeated truncation in tool output, tool calls failing unexpectedly, responses becoming shallow or repetitive | Save immediately, update state file, force session boundary |

**Skill-specific guidelines**:

| Skill | Recommended session breaks |
|-------|---------------------------|
| **document** | After every ~5 modules in Step 1; between Step 4 (Verification) and Step 5 (Solution Extraction) |
| **implement** | Each batch is a natural checkpoint; if more than 2 batches completed in one session, suggest break |
| **plan** | Between Step 5 (Test Specifications) and Step 6 (Epics) for projects with many components |
| **research** | Between Mode A rounds; between Mode A and Mode B |

**How to detect caution/danger zone without API**:

1. Count tool calls made so far — if approaching 20+, context is likely filling up
2. If reading a file returns truncated content, context is under pressure
3. If the agent starts producing shorter or less detailed responses than earlier in the conversation, context quality is degrading
4. When in doubt, save and suggest a new conversation — re-entry is cheap thanks to the state file

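The zone table above reduces to a small decision function. The Python sketch below encodes its thresholds; the function and parameter names are hypothetical, and `degraded` stands in for the qualitative Danger indicators, which the agent has to judge for itself.

```python
def context_zone(artifacts, large_files, tool_calls, degraded=False):
    """Classify the session as 'safe', 'caution', or 'danger'."""
    if degraded:
        # Truncated tool output, unexpected tool failures, shallow replies.
        return "danger"
    if artifacts >= 5 or large_files >= 3 or tool_calls >= 20:
        return "caution"
    return "safe"
```

Danger is checked first so that a degraded session is never reported as merely cautionary, whatever the counts say.
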
## Rollback Protocol

### Implementation Steps (git-based)

Handled by the `/implement` skill — each batch commit is a rollback checkpoint via `git revert`.

### Planning/Documentation Steps (artifact-based)

For steps that produce `_docs/` artifacts (problem, research, plan, decompose, document):

1. **Before overwriting**: if re-running a step that already has artifacts, the sub-skill's prerequisite check asks the user (resume/overwrite/skip)
2. **Rollback to previous step**: use Choose format:

   ```
   ══════════════════════════════════════
   ROLLBACK: Re-run [step name]?
   ══════════════════════════════════════
   A) Re-run the step (overwrites current artifacts)
   B) Stay on current step
   ══════════════════════════════════════
   Warning: This will overwrite files in _docs/[folder]/
   ══════════════════════════════════════
   ```

3. **Git safety net**: artifacts are committed with each autodev step completion. To roll back, run `git log --oneline _docs/` to find the commit, then `git checkout <commit> -- _docs/<folder>/`
4. **State file rollback**: when rolling back artifacts, also update `_docs/_autodev_state.md` to reflect the rolled-back step (set it to `in_progress`, clear completed date)

## Debug Protocol

When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or an implementer subagent reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)

### Structured Debugging Workflow

When escalated to the user after implementation failure:

1. **Classify the failure** — determine the category:
   - **Missing dependency**: a package, service, or module the task needs but isn't available
   - **Logic error**: code runs but produces wrong results (assertion failures, incorrect output)
   - **Integration mismatch**: interfaces between components don't align (type errors, missing methods, wrong signatures)
   - **Environment issue**: Docker, database, network, or configuration problem
   - **Spec ambiguity**: the task spec is unclear or contradictory

2. **Reproduce** — isolate the failing behavior:
   - Run the specific failing test(s) in isolation
   - Check whether the failure is deterministic or intermittent
   - Capture the exact error message, stack trace, and relevant file:line

3. **Narrow scope** — focus on the minimal reproduction:
   - For logic errors: trace the data flow from input to the point of failure
   - For integration mismatches: compare the caller's expectations against the callee's actual interface
   - For environment issues: verify Docker services are running, DB is accessible, env vars are set

4. **Fix and verify** — apply the fix and confirm:
   - Make the minimal change that fixes the root cause
   - Re-run the failing test(s) to confirm the fix
   - Run the full test suite to check for regressions
   - If the fix changes a shared interface, check all consumers

5. **Report** — update the batch report with:
   - Root cause category
   - Fix applied (file:line, description)
   - Tests that now pass

### Common Recovery Patterns

| Failure Pattern | Typical Root Cause | Recovery Action |
|----------------|-------------------|----------------|
| ImportError / ModuleNotFoundError | Missing dependency or wrong path | Install dependency or fix import path |
| TypeError on method call | Interface mismatch between tasks | Align caller with callee's actual signature |
| AssertionError in test | Logic bug or wrong expected value | Fix logic or update test expectations |
| ConnectionRefused | Service not running | Start Docker services, check docker-compose |
| Timeout | Blocking I/O or infinite loop | Add timeout, fix blocking call |
| FileNotFoundError | Hardcoded path or missing fixture | Make path configurable, add fixture |

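For Python projects, the table above can double as a lookup keyed by exception type. The sketch below is illustrative only: mapping "ConnectionRefused" and "Timeout" to the built-ins `ConnectionRefusedError` and `TimeoutError` is an assumption, and `suggest_recovery` is a hypothetical helper.

```python
# Exception type -> (typical root cause, recovery action), taken from the table.
RECOVERY_PATTERNS = {
    ImportError: ("Missing dependency or wrong path",
                  "Install dependency or fix import path"),
    TypeError: ("Interface mismatch between tasks",
                "Align caller with callee's actual signature"),
    AssertionError: ("Logic bug or wrong expected value",
                     "Fix logic or update test expectations"),
    ConnectionRefusedError: ("Service not running",
                             "Start Docker services, check docker-compose"),
    TimeoutError: ("Blocking I/O or infinite loop",
                   "Add timeout, fix blocking call"),
    FileNotFoundError: ("Hardcoded path or missing fixture",
                        "Make path configurable, add fixture"),
}


def suggest_recovery(exc):
    """Return (root cause, recovery action) for `exc`, or None if unlisted."""
    for exc_type, advice in RECOVERY_PATTERNS.items():
        if isinstance(exc, exc_type):  # also matches subclasses
            return advice
    return None
```

`ModuleNotFoundError` is a subclass of `ImportError`, so the first row covers both names without a separate entry.
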
### Escalation

If debugging does not resolve the issue after 2 focused attempts:

```
══════════════════════════════════════
DEBUG ESCALATION: [failure description]
══════════════════════════════════════
Root cause category: [category]
Attempted fixes: [list]
Current state: [what works, what doesn't]
══════════════════════════════════════
A) Continue debugging with more context
B) Revert this batch and skip the task (move to backlog)
C) Simplify the task scope and retry
══════════════════════════════════════
```

## Status Summary

On every invocation, before executing any skill, present a status summary built from the state file (with folder scan fallback). For re-entry (state file exists), cross-check the current step against `_docs/` folder structure and present any `status: failed` state to the user before continuing.

### Banner Template (authoritative)

The banner shell is defined here once. Each flow file contributes only its step-list fragment and any flow-specific header/footer extras. Do not inline a full banner in flow files.

```
═══════════════════════════════════════════════════
AUTODEV STATUS (<flow-name>)<header-suffix>
═══════════════════════════════════════════════════
<step-list from the active flow file>
═══════════════════════════════════════════════════
Current: Step <N> — <Name><current-suffix>
SubStep: <M> — <sub-skill internal step name>
Retry: <N/3> ← omit row if retry_count is 0
Action: <what will happen next>
<footer-extras from the active flow file>
═══════════════════════════════════════════════════
```

### Slot rules

- `<flow-name>` — `greenfield`, `existing-code`, or `meta-repo`.
- `<header-suffix>` — optional, flow-specific. The existing-code flow appends ` — Cycle <N>` when `state.cycle > 1`; other flows leave it empty.
- `<step-list>` — a fixed-width table supplied by the active flow file (see that file's "Status Summary — Step List" section). Row format is standardized:

  ```
  Step <N> <Step Name> [<state token>]
  ```

  where `<state token>` comes from the state-token set defined per row in the flow's step-list table.
- `<current-suffix>` — optional, flow-specific. The existing-code flow appends ` (cycle <N>)` when `state.cycle > 1`; other flows leave it empty.
- `Retry:` row — omit entirely when `retry_count` is 0. Include it with `<N>/3` otherwise.
- `<footer-extras>` — optional, flow-specific. The meta-repo flow adds a `Config:` line with `_docs/_repo-config.yaml` state; other flows leave it empty.

### State token set (shared)

The common tokens all flows may emit are: `DONE`, `IN PROGRESS`, `NOT STARTED`, `SKIPPED`, `FAILED (retry N/3)`. Specific step rows may extend this with parenthetical detail (e.g., `DONE (N drafts)`, `DONE (N tasks)`, `IN PROGRESS (batch M of ~N)`, `DONE (N passed, M failed)`). The flow's step-list table declares which extensions each step supports.

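To make the `Retry:` row rule and the token set concrete, here is a small Python sketch. Both helpers (`state_token`, `current_rows`) and their parameters are hypothetical; the real banner is assembled from the state file and the active flow file.

```python
BASE_TOKENS = {"DONE", "IN PROGRESS", "NOT STARTED", "SKIPPED", "FAILED"}


def state_token(status, retry_count=0, detail=None):
    """Format a step's state token, with optional parenthetical detail."""
    if status not in BASE_TOKENS:
        raise ValueError(f"unknown state token: {status}")
    if status == "FAILED":
        return f"FAILED (retry {retry_count}/3)"
    return f"{status} ({detail})" if detail else status


def current_rows(step, name, sub_step, sub_name, retry_count, action):
    """Build the rows between the step list and the footer extras."""
    rows = [f"Current: Step {step} — {name}",
            f"SubStep: {sub_step} — {sub_name}"]
    if retry_count > 0:  # the Retry row is omitted entirely at 0
        rows.append(f"Retry: {retry_count}/3")
    rows.append(f"Action: {action}")
    return rows
```

The sketch does not check whether a given step supports a particular detail extension; that declaration lives in each flow's step-list table.
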