mirror of
https://github.com/azaion/detections.git
synced 2026-04-23 00:26:31 +00:00
[AZ-137] [AZ-138] Decompose test tasks and scaffold E2E test infrastructure
Made-with: Cursor
This commit is contained in:
@@ -106,6 +106,76 @@ All error situations that require user input MUST use the **Choose A / B / C / D
|
||||
| User wants to go back to a previous step | Use Choose format: A) re-run (with overwrite warning), B) stay on current step |
|
||||
| User asks "where am I?" without wanting to continue | Show Status Summary only, do not start execution |
|
||||
|
||||
## Skill Failure Retry Protocol
|
||||
|
||||
Sub-skills can return a **failed** result. Failures are often caused by missing user input, environment issues, or transient errors that resolve on retry. The autopilot auto-retries before escalating.
|
||||
|
||||
### Retry Flow
|
||||
|
||||
```
|
||||
Skill execution → FAILED
|
||||
│
|
||||
├─ retry_count < 3 ?
|
||||
│ YES → increment retry_count in state file
|
||||
│ → log failure reason in state file (Retry Log section)
|
||||
│ → re-read the sub-skill's SKILL.md
|
||||
│ → re-execute from the current sub_step
|
||||
│ → (loop back to check result)
|
||||
│
|
||||
│ NO (retry_count = 3) →
|
||||
│ → set status: failed in Current Step
|
||||
│ → add entry to Blockers section:
|
||||
│ "[Skill Name] failed 3 consecutive times at sub_step [M].
|
||||
│ Last failure: [reason]. Auto-retry exhausted."
|
||||
│ → present warning to user (see Escalation below)
|
||||
│ → do NOT auto-retry again until user intervenes
|
||||
```
|
||||
|
||||
### Retry Rules
|
||||
|
||||
1. **Auto-retry immediately**: when a skill fails, retry it without asking the user — the failure is often transient (missing user confirmation in a prior step, docker not running, file lock, etc.)
|
||||
2. **Preserve sub_step**: retry from the last recorded `sub_step`, not from the beginning of the skill — unless the failure indicates corruption, in which case restart from sub_step 1
|
||||
3. **Increment `retry_count`**: update `retry_count` in the state file's `Current Step` section on each retry attempt
|
||||
4. **Log each failure**: append the failure reason and timestamp to the state file's `Retry Log` section
|
||||
5. **Reset on success**: when the skill eventually succeeds, reset `retry_count: 0` and clear the `Retry Log` for that step
|
||||
|
||||
### Escalation (after 3 consecutive failures)
|
||||
|
||||
After 3 failed auto-retries of the same skill, the failure is likely not user-related. Stop retrying and escalate:
|
||||
|
||||
1. Update the state file:
|
||||
- Set `status: failed` in `Current Step`
|
||||
- Set `retry_count: 3`
|
||||
- Add a blocker entry describing the repeated failure
|
||||
2. Play notification sound (per `human-input-sound.mdc`)
|
||||
3. Present using Choose format:
|
||||
|
||||
```
|
||||
══════════════════════════════════════
|
||||
SKILL FAILED: [Skill Name] — 3 consecutive failures
|
||||
══════════════════════════════════════
|
||||
Step: [N] — [Name]
|
||||
SubStep: [M] — [sub-step name]
|
||||
Last failure reason: [reason]
|
||||
══════════════════════════════════════
|
||||
A) Retry with fresh context (new conversation)
|
||||
B) Skip this step with warning
|
||||
C) Abort — investigate and fix manually
|
||||
══════════════════════════════════════
|
||||
Recommendation: A — fresh context often resolves
|
||||
persistent failures
|
||||
══════════════════════════════════════
|
||||
```
|
||||
|
||||
### Re-Entry After Failure
|
||||
|
||||
On the next autopilot invocation (new conversation), if the state file shows `status: failed` and `retry_count: 3`:
|
||||
|
||||
- Present the blocker to the user before attempting execution
|
||||
- If the user chooses to retry → reset `retry_count: 0`, set `status: in_progress`, and re-execute
|
||||
- If the user chooses to skip → mark step as `skipped`, proceed to next step
|
||||
- Do NOT silently auto-retry — the user must acknowledge the persistent failure first
|
||||
|
||||
## Error Recovery Protocol
|
||||
|
||||
### Stuck Detection
|
||||
@@ -211,17 +281,18 @@ On every invocation, before executing any skill, present a status summary built
|
||||
═══════════════════════════════════════════════════
|
||||
AUTOPILOT STATUS (greenfield)
|
||||
═══════════════════════════════════════════════════
|
||||
Step 0 Problem [DONE / IN PROGRESS / NOT STARTED]
|
||||
Step 1 Research [DONE (N drafts) / IN PROGRESS / NOT STARTED]
|
||||
Step 2 Plan [DONE / IN PROGRESS / NOT STARTED]
|
||||
Step 3 Decompose [DONE (N tasks) / IN PROGRESS / NOT STARTED]
|
||||
Step 4 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED]
|
||||
Step 5 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED]
|
||||
Step 5b Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED]
|
||||
Step 6 Deploy [DONE / IN PROGRESS / NOT STARTED]
|
||||
Step 0 Problem [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 1 Research [DONE (N drafts) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2 Plan [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 3 Decompose [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 4 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 5 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 5b Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 6 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
═══════════════════════════════════════════════════
|
||||
Current: Step N — Name
|
||||
SubStep: M — [sub-skill internal step name]
|
||||
Retry: [N/3 if retrying, omit if 0]
|
||||
Action: [what will happen next]
|
||||
═══════════════════════════════════════════════════
|
||||
```
|
||||
@@ -232,19 +303,20 @@ On every invocation, before executing any skill, present a status summary built
|
||||
═══════════════════════════════════════════════════
|
||||
AUTOPILOT STATUS (existing-code)
|
||||
═══════════════════════════════════════════════════
|
||||
Pre Document [DONE / IN PROGRESS / NOT STARTED]
|
||||
Step 2b Blackbox Test Spec [DONE / IN PROGRESS / NOT STARTED]
|
||||
Step 2c Decompose Tests [DONE (N tasks) / IN PROGRESS / NOT STARTED]
|
||||
Step 2d Implement Tests [DONE / IN PROGRESS (batch M) / NOT STARTED]
|
||||
Step 2e Refactor [DONE / IN PROGRESS (phase N) / NOT STARTED]
|
||||
Step 2f New Task [DONE (N tasks) / IN PROGRESS / NOT STARTED]
|
||||
Step 2g Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED]
|
||||
Step 2h Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED]
|
||||
Step 2hb Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED]
|
||||
Step 2i Deploy [DONE / IN PROGRESS / NOT STARTED]
|
||||
Pre Document [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2b Blackbox Test Spec [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2c Decompose Tests [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2d Implement Tests [DONE / IN PROGRESS (batch M) / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2e Refactor [DONE / IN PROGRESS (phase N) / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2f New Task [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2g Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2h Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2hb Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
Step 2i Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
|
||||
═══════════════════════════════════════════════════
|
||||
Current: Step N — Name
|
||||
SubStep: M — [sub-skill internal step name]
|
||||
Retry: [N/3 if retrying, omit if 0]
|
||||
Action: [what will happen next]
|
||||
═══════════════════════════════════════════════════
|
||||
```
|
||||
|
||||
Reference in New Issue
Block a user