mirror of
https://github.com/azaion/autopilot.git
synced 2026-04-22 09:26:34 +00:00
248 lines
12 KiB
Markdown
---
name: retrospective
description: |
  Collect metrics from implementation batch reports and code review findings, analyze trends across cycles,
  and produce improvement reports with actionable recommendations.
  4-step workflow: collect metrics, analyze trends, produce report, update lessons log.
  Outputs to _docs/06_metrics/.
  Trigger phrases:
  - "retrospective", "retro", "run retro"
  - "metrics review", "feedback loop"
  - "implementation metrics", "analyze trends"
category: evolve
tags: [retrospective, metrics, trends, improvement, feedback-loop]
disable-model-invocation: true
---

# Retrospective

Collect metrics from implementation artifacts, analyze trends across development cycles, and produce actionable improvement reports.

## Core Principles

- **Data-driven**: conclusions come from metrics, not impressions
- **Actionable**: every finding must have a concrete improvement suggestion
- **Cumulative**: each retrospective compares against previous ones to track progress
- **Save immediately**: write artifacts to disk after each step
- **Non-judgmental**: focus on process improvement, not blame

## Context Resolution

Fixed paths:

- IMPL_DIR: `_docs/03_implementation/`
- METRICS_DIR: `_docs/06_metrics/`
- TASKS_DIR: `_docs/02_tasks/` (scan all subfolders: `todo/`, `backlog/`, `done/`)

Announce the resolved paths to the user before proceeding.

## Prerequisite Checks (BLOCKING)

1. `IMPL_DIR` exists and contains at least one `batch_*_report.md`. **STOP if missing** (nothing to analyze)
2. Create METRICS_DIR if it does not exist
3. Check for previous retrospective reports in METRICS_DIR to enable trend comparison

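The three checks above can be sketched in Python as follows. This is a minimal illustration rather than part of the skill: the function name `check_prerequisites` and the returned dictionary keys are hypothetical, and only the `batch_*_report.md` and `retro_*.md` file patterns come from this document.

```python
from pathlib import Path


def check_prerequisites(impl_dir: Path, metrics_dir: Path) -> dict:
    """Run the three blocking checks and return what was found."""
    batch_reports = sorted(impl_dir.glob("batch_*_report.md")) if impl_dir.is_dir() else []
    if not batch_reports:
        # Check 1: nothing to analyze, so stop before any other work is done.
        raise FileNotFoundError(f"no batch_*_report.md found in {impl_dir}")

    # Check 2: create the output directory if it is missing.
    metrics_dir.mkdir(parents=True, exist_ok=True)

    # Check 3: ISO dates in the file names sort chronologically as plain strings,
    # so the last element is the most recent retrospective.
    previous_retros = sorted(metrics_dir.glob("retro_*.md"))
    return {
        "batch_reports": batch_reports,
        "previous_retro": previous_retros[-1] if previous_retros else None,
    }
```

Raising before `metrics_dir` is created keeps a failed run from leaving an empty output directory behind, which is one reasonable reading of the STOP rule.
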
## Artifact Management

### Directory Structure

```
METRICS_DIR/
├── retro_[YYYY-MM-DD].md
├── structure_[YYYY-MM-DD].md
├── incident_[YYYY-MM-DD]_[skill].md
└── ...
```

## Invocation Modes

- **cycle-end mode** (default): invoked automatically at the end of a cycle by the autodev orchestrator, as greenfield Step 11 Retrospective (after Step 10 Deploy) and as existing-code Step 17 Retrospective (after Step 16 Deploy). Runs Steps 1–4. Output: `retro_<YYYY-MM-DD>.md` + LESSONS.md update.
- **incident mode**: invoked automatically after the failure retry protocol reaches `retry_count: 3` and the user has made a recovery choice. Runs Step 1 (scoped to the failing skill's artifacts only), Step 2 (focused on the failure), Step 3 (shorter report), and Step 4 (append 1–3 lessons in the `process` or `tooling` category). Output: `_docs/06_metrics/incident_<YYYY-MM-DD>_<skill>.md` + LESSONS.md update. Pass the invocation context with `mode: incident`, `failing_skill: <skill-name>`, and `failure_summary: <string>`.
- **on-demand mode**: user-triggered (trigger phrases above). Runs Steps 1–4 over the entire artifact set.

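As a rough sketch of how the mode determines the report file name, assuming the mode is passed as a plain string: only `mode: incident` is named in this document, so the strings `"cycle-end"` and `"on-demand"` and the helper `report_path` are hypothetical.

```python
from datetime import date
from typing import Optional

METRICS_DIR = "_docs/06_metrics"


def report_path(mode: str, day: date, failing_skill: Optional[str] = None) -> str:
    """Return the report file path for the given invocation mode."""
    stamp = day.isoformat()  # YYYY-MM-DD
    if mode == "incident":
        if not failing_skill:
            # Incident reports are named after the skill that failed.
            raise ValueError("incident mode requires failing_skill")
        return f"{METRICS_DIR}/incident_{stamp}_{failing_skill}.md"
    if mode in ("cycle-end", "on-demand"):  # assumed mode strings
        return f"{METRICS_DIR}/retro_{stamp}.md"
    raise ValueError(f"unknown mode: {mode}")
```

For example, `report_path("incident", date(2026, 4, 22), "deploy")` yields `_docs/06_metrics/incident_2026-04-22_deploy.md`, where `deploy` is only a placeholder skill name.
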
## Progress Tracking

At the start of execution, create a TodoWrite with all steps (1 through 4). Update status as each step completes.

## Workflow

### Step 1: Collect Metrics

**Role**: Data analyst
**Goal**: Parse all implementation artifacts and extract quantitative metrics
**Constraints**: Collection only; no interpretation yet

#### Sources

| Source | Metrics Extracted |
|--------|------------------|
| `batch_*_report.md` | Tasks per batch, batch count, task statuses (Done/Blocked/Partial) |
| Code review sections in batch reports | PASS/FAIL/PASS_WITH_WARNINGS ratios, finding counts by severity and category |
| Task spec files in TASKS_DIR | Complexity points per task, dependency count |
| `implementation_report_*.md` | Total tasks, total batches, overall duration |
| Git log (if available) | Commits per batch, files changed per batch |
| `cumulative_review_batches_*.md` `## Baseline Delta` | Architecture findings: carried over / resolved / newly introduced counts |
| `_docs/02_document/module-layout.md` + source import graph | Component count, cross-component edges, cycles, avg imports/module |
| `_docs/02_document/contracts/**/*.md` | Contract count, contracts per public-API symbol |

#### Metrics to Compute

**Implementation Metrics**:
- Total tasks implemented
- Total batches executed
- Average tasks per batch
- Average complexity points per batch
- Total complexity points delivered

**Quality Metrics**:
- Code review pass rate (PASS / total reviews)
- Code review findings by severity: Critical, High, Medium, Low counts
- Code review findings by category: Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope
- FAIL count (batches that required user intervention)

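The pass-rate arithmetic can be sketched as below, assuming each batch review has already been reduced to one verdict string. The function name `quality_metrics` and the result keys are hypothetical. Following the formula above, only plain `PASS` counts toward the pass rate, so `PASS_WITH_WARNINGS` lowers it; the `systemic_issue` flag mirrors the >50% FAIL rate rule from the Escalation Rules.

```python
from collections import Counter
from typing import List


def quality_metrics(verdicts: List[str]) -> dict:
    """Compute review pass rate and FAIL count from per-batch verdicts."""
    counts = Counter(verdicts)
    total = len(verdicts)
    # Guard against division by zero when no reviews were found.
    pass_rate = counts["PASS"] / total if total else 0.0
    fail_rate = counts["FAIL"] / total if total else 0.0
    return {
        "total_reviews": total,
        "pass_rate": pass_rate,
        "fail_count": counts["FAIL"],
        "systemic_issue": fail_rate > 0.5,  # escalation threshold
    }
```
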

**Structural Metrics** (skip only if `module-layout.md` is absent):
- Component count and change vs previous cycle
- Cross-component import edges and change vs previous cycle
- Cycles in the component import graph (should stay 0; any new cycle is a regression)
- Average imports per module
- New Architecture violations this cycle (from `## Baseline Delta` → Newly introduced)
- Resolved Architecture violations this cycle (from `## Baseline Delta` → Resolved)
- Net Architecture delta = new − resolved (negative is good)
- Percentage of public-API symbols covered by a contract file (contract count / public-API symbol count)
- `shared/*` entries used by ≥2 components (healthy) vs by ≤1 component (dead cross-cutting)

Persist the structural snapshot to `METRICS_DIR/structure_[YYYY-MM-DD].md` so future retros can compute deltas without re-deriving from source.

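One way to check the "cycles should stay 0" metric is a depth-first search over the component import graph. The sketch below assumes the graph has already been extracted into a dictionary mapping each component to the set of components it imports; that representation and the name `find_cycle` are illustrative assumptions, not something this skill prescribes.

```python
from typing import Dict, List, Optional, Set


def find_cycle(edges: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of components, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    color = {node: WHITE for node in edges}
    stack: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        stack.append(node)
        for dep in sorted(edges.get(node, ())):
            state = color.get(dep, WHITE)
            if state == GREY:
                # Back edge: dep is on the current path, so the path closes a cycle.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(edges):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```
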

**Efficiency Metrics**:
- Blocked task count and reasons
- Tasks completed on first attempt vs requiring fixes
- Batch with most findings (identify problem areas)

**Auto-lesson triggers** (feed Step 4 LESSONS.md generation):
- Net Architecture delta > 0 this cycle → `architecture` lesson
- Any structural metric regressed by >20% vs previous snapshot → `architecture` or `dependencies` lesson depending on the metric
- Contract coverage % decreased → `architecture` lesson

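The trigger rules above can be sketched as a small function over two structural snapshots. Several details are assumptions made for illustration: the snapshot key names, the choice of which metrics count as "higher is worse", and the metric-to-category mapping in `REGRESSION_CATEGORY` (this document only says the category depends on the metric). A metric whose previous value is 0 is skipped because a relative change is undefined there.

```python
# Assumption: for these metrics a higher value is worse; the mapping is illustrative.
REGRESSION_CATEGORY = {
    "cross_component_edges": "architecture",
    "avg_imports_per_module": "dependencies",
}


def auto_lessons(current: dict, previous: dict) -> list:
    """Return (category, reason) pairs for every trigger that fired."""
    lessons = []

    # Trigger 1: more Architecture violations introduced than resolved.
    net_delta = current["new_violations"] - current["resolved_violations"]
    if net_delta > 0:
        lessons.append(("architecture", f"net Architecture delta is +{net_delta}"))

    # Trigger 2: a structural metric got more than 20% worse.
    for metric, category in REGRESSION_CATEGORY.items():
        before, after = previous.get(metric), current.get(metric)
        if before and after is not None and (after - before) / before > 0.20:
            lessons.append((category, f"{metric} regressed by more than 20%"))

    # Trigger 3: contract coverage went down.
    if current["contract_coverage"] < previous["contract_coverage"]:
        lessons.append(("architecture", "contract coverage decreased"))
    return lessons
```
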

**Self-verification**:
- [ ] All batch reports parsed
- [ ] All metric categories computed
- [ ] No batch reports missed
- [ ] Structural snapshot written (or explicitly skipped with reason "module-layout.md absent")
- [ ] If a previous `structure_*.md` exists, deltas are computed against the most recent one

---

### Step 2: Analyze Trends

**Role**: Process improvement analyst
**Goal**: Identify patterns, recurring issues, and improvement opportunities
**Constraints**: Analysis must be grounded in the metrics from Step 1

1. If previous retrospective reports exist in METRICS_DIR, load the most recent one for comparison
2. Identify patterns:
   - **Recurring findings**: which code review categories appear most frequently?
   - **Problem components**: which components/files generate the most findings?
   - **Complexity accuracy**: do high-complexity tasks actually produce more issues?
   - **Blocker patterns**: what types of blockers occur and can they be prevented?
3. Compare against previous retrospective (if exists):
   - Which metrics improved?
   - Which metrics degraded?
   - Were previous improvement actions effective?
4. Identify top 3 improvement actions ranked by impact

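Two of the pattern questions above reduce to simple aggregations. The sketch below assumes Step 1 produced findings as dictionaries with a `category` key and tasks as dictionaries with `complexity` and `findings` keys; those record shapes and both function names are hypothetical.

```python
from collections import Counter, defaultdict


def recurring_categories(findings, top_n=3):
    """Most frequent code review categories as (category, count) pairs."""
    return Counter(f["category"] for f in findings).most_common(top_n)


def findings_by_complexity(tasks):
    """Average finding count for each complexity point value."""
    buckets = defaultdict(list)
    for task in tasks:
        buckets[task["complexity"]].append(task["findings"])
    # If the averages rise with the point value, complexity estimates
    # are tracking real difficulty reasonably well.
    return {points: sum(v) / len(v) for points, v in sorted(buckets.items())}
```
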

**Self-verification**:
- [ ] Patterns are grounded in specific metrics
- [ ] Comparison with previous retro included (if exists)
- [ ] Top 3 actions are concrete and actionable

---

### Step 3: Produce Report

**Role**: Technical writer
**Goal**: Write a structured retrospective report with metrics, trends, and recommendations
**Constraints**: Concise, data-driven, actionable

Write `METRICS_DIR/retro_[YYYY-MM-DD].md` using `templates/retrospective-report.md` as structure.

**Self-verification**:
- [ ] All metrics from Step 1 included
- [ ] Trend analysis from Step 2 included
- [ ] Top 3 improvement actions clearly stated
- [ ] Suggested rule/skill updates are specific

**Save action**: Write `retro_[YYYY-MM-DD].md` (in cycle-end / on-demand mode) or `incident_[YYYY-MM-DD]_[skill].md` (in incident mode).

Present the report summary to the user.

---

### Step 4: Update Lessons Log

**Role**: Process improvement analyst
**Goal**: Keep a short, frequently-consulted log of actionable lessons that downstream skills read before they plan or estimate.

1. Extract the **top 3 concrete lessons** from the current retrospective (or 1–3 lessons in incident mode, scoped to the failing skill). Each lesson must:
   - Be specific enough to change future behavior (not a platitude).
   - Be single-sentence.
   - Be tied to one of the categories: `estimation`, `architecture`, `testing`, `dependencies`, `tooling`, `process`.
2. Append one bullet per lesson to `_docs/LESSONS.md` using this format:

   ```
   - [YYYY-MM-DD] [category] one-line lesson statement.
     Source: _docs/06_metrics/retro_YYYY-MM-DD.md
   ```

3. After appending, trim `_docs/LESSONS.md` to keep only the last **15 entries** (ring buffer). Oldest entries drop off the top. Preserve the file's header section if present.
4. If `_docs/LESSONS.md` does not exist, create it with this skeleton before appending:

   ```markdown
   # Lessons Log

   A ring buffer of the last 15 actionable lessons extracted from retrospectives and incidents.
   Downstream skills consume this file:
   - `.cursor/skills/new-task/SKILL.md` (Step 2 Complexity Assessment)
   - `.cursor/skills/plan/steps/06_work-item-epics.md` (epic sizing)
   - `.cursor/skills/decompose/SKILL.md` (Step 2 task complexity)
   - `.cursor/skills/autodev/SKILL.md` (Execution Loop step 0: surface top 3 lessons)

   Categories: estimation · architecture · testing · dependencies · tooling · process
   ```

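The append-and-trim behaviour in items 2 and 3 can be sketched as a pure function over the file's text. This is an illustration under stated assumptions: an entry is taken to start at a line matching `- [YYYY-MM-DD] [category] `, everything before the first such line is treated as the header, and the names `ENTRY_START` and `append_and_trim` are hypothetical. The bullets in the skeleton header start with a backtick rather than a bracketed date, so they are not mistaken for entries.

```python
import re

# An entry begins with "- [YYYY-MM-DD] [category] "; later lines belong to it.
ENTRY_START = re.compile(r"^- \[\d{4}-\d{2}-\d{2}\] \[\w+\] ")
MAX_ENTRIES = 15


def append_and_trim(text: str, new_entries: list, limit: int = MAX_ENTRIES) -> str:
    """Append lesson entries to the log text and keep only the newest `limit`."""
    header, entries = [], []
    for line in text.splitlines():
        if ENTRY_START.match(line):
            entries.append([line])        # a new entry begins
        elif entries:
            entries[-1].append(line)      # continuation, e.g. the Source line
        else:
            header.append(line)           # everything before the first entry
    entries.extend(entry.splitlines() for entry in new_entries)
    kept = entries[-limit:]               # oldest entries drop off the top
    lines = header + [line for entry in kept for line in entry]
    return "\n".join(lines) + "\n"
```
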

**Self-verification**:
- [ ] 1–3 lessons extracted (3 in cycle-end / on-demand mode, 1–3 in incident mode)
- [ ] Each lesson is single-sentence, specific, and tagged with a valid category
- [ ] Each lesson includes a Source link back to its retro or incident file
- [ ] `_docs/LESSONS.md` trimmed to at most 15 entries
- [ ] Skeleton header preserved if file was just created

**Save action**: Write (or update) `_docs/LESSONS.md`.

---

## Escalation Rules

| Situation | Action |
|-----------|--------|
| No batch reports exist | **STOP**: nothing to analyze |
| Batch reports have inconsistent format | **WARN user**, extract what is available |
| No previous retrospective for comparison | PROCEED: report baseline metrics only |
| Metrics suggest systemic issue (>50% FAIL rate) | **WARN user**: suggest immediate process review |

## Methodology Quick Reference

```
┌────────────────────────────────────────────────────────────────┐
│ Retrospective (4-Step Method)                                  │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: batch reports exist in _docs/03_implementation/        │
│                                                                │
│ 1. Collect Metrics → parse batch reports, compute metrics      │
│ 2. Analyze Trends  → patterns, comparison, improvement areas   │
│ 3. Produce Report  → _docs/06_metrics/retro_[date].md          │
│ 4. Update Lessons  → append top-3 to _docs/LESSONS.md (≤15)    │
├────────────────────────────────────────────────────────────────┤
│ Principles: Data-driven · Actionable · Cumulative              │
│             Non-judgmental · Save immediately                  │
└────────────────────────────────────────────────────────────────┘
```