autopilot/.cursor/skills/retrospective/SKILL.md

---
name: retrospective
description: |
  Collect metrics from implementation batch reports and code review findings, analyze trends across cycles,
  and produce improvement reports with actionable recommendations.
  3-step workflow: collect metrics, analyze trends, produce report.
  Outputs to _docs/06_metrics/.
  Trigger phrases:
  - "retrospective", "retro", "run retro"
  - "metrics review", "feedback loop"
  - "implementation metrics", "analyze trends"
category: evolve
tags: [retrospective, metrics, trends, improvement, feedback-loop]
disable-model-invocation: true
---

# Retrospective

Collect metrics from implementation artifacts, analyze trends across development cycles, and produce actionable improvement reports.

## Core Principles

- **Data-driven**: conclusions come from metrics, not impressions
- **Actionable**: every finding must have a concrete improvement suggestion
- **Cumulative**: each retrospective compares against previous ones to track progress
- **Save immediately**: write artifacts to disk after each step
- **Non-judgmental**: focus on process improvement, not blame

## Context Resolution

Fixed paths:

- IMPL_DIR: `_docs/03_implementation/`
- METRICS_DIR: `_docs/06_metrics/`
- TASKS_DIR: `_docs/02_tasks/` (scan all subfolders: `todo/`, `backlog/`, `done/`)

Announce the resolved paths to the user before proceeding.

## Prerequisite Checks (BLOCKING)

1. `IMPL_DIR` exists and contains at least one `batch_*_report.md` — **STOP if missing** (nothing to analyze)
2. Create METRICS_DIR if it does not exist
3. Check for previous retrospective reports in METRICS_DIR to enable trend comparison

## Artifact Management

### Directory Structure

```
METRICS_DIR/
├── retro_[YYYY-MM-DD].md
├── retro_[YYYY-MM-DD].md
└── ...
```

## Invocation Modes

- **cycle-end mode** (default): invoked automatically at end of cycle by the autodev orchestrator — as greenfield Step 11 Retrospective (after Step 10 Deploy) and existing-code Step 17 Retrospective (after Step 16 Deploy). Runs Steps 1–4. Output: `retro_<YYYY-MM-DD>.md` + LESSONS.md update.
- **incident mode**: invoked automatically after the failure retry protocol reaches `retry_count: 3` and the user has made a recovery choice. Runs Steps 1 (scoped to the failing skill's artifacts only), 2 (focused on the failure), 3 (shorter report), 4 (append 1–3 lessons in the `process` or `tooling` category). Output: `_docs/06_metrics/incident_<YYYY-MM-DD>_<skill>.md` + LESSONS.md update. Pass the invocation context with `mode: incident`, `failing_skill: <skill-name>`, and `failure_summary: <string>`.
- **on-demand mode**: user-triggered (trigger phrases above). Runs Steps 1–4 over the entire artifact set.

## Progress Tracking

At the start of execution, create a TodoWrite with all steps (1 through 4). Update status as each step completes.

## Workflow

### Step 1: Collect Metrics

**Role**: Data analyst
**Goal**: Parse all implementation artifacts and extract quantitative metrics
**Constraints**: Collection only — no interpretation yet

#### Sources

| Source | Metrics Extracted |
|--------|------------------|
| `batch_*_report.md` | Tasks per batch, batch count, task statuses (Done/Blocked/Partial) |
| Code review sections in batch reports | PASS/FAIL/PASS_WITH_WARNINGS ratios, finding counts by severity and category |
| Task spec files in TASKS_DIR | Complexity points per task, dependency count |
| `implementation_report_*.md` | Total tasks, total batches, overall duration |
| Git log (if available) | Commits per batch, files changed per batch |
| `cumulative_review_batches_*.md` `## Baseline Delta` | Architecture findings: carried over / resolved / newly introduced counts |
| `_docs/02_document/module-layout.md` + source import graph | Component count, cross-component edges, cycles, avg imports/module |
| `_docs/02_document/contracts/**/*.md` | Contract count, contracts per public-API symbol |

#### Metrics to Compute

**Implementation Metrics**:
- Total tasks implemented
- Total batches executed
- Average tasks per batch
- Average complexity points per batch
- Total complexity points delivered

**Quality Metrics**:
- Code review pass rate (PASS / total reviews)
- Code review findings by severity: Critical, High, Medium, Low counts
- Code review findings by category: Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope
- FAIL count (batches that required user intervention)

**Structural Metrics** (skip only if `module-layout.md` is absent):
- Component count and change vs previous cycle
- Cross-component import edges and change vs previous cycle
- Cycles in the component import graph (should stay 0; any new cycle is a regression)
- Average imports per module
- New Architecture violations this cycle (from `## Baseline Delta` → Newly introduced)
- Resolved Architecture violations this cycle (from `## Baseline Delta` → Resolved)
- Net Architecture delta = new − resolved (negative is good)
- Percentage of public-API symbols covered by a contract file (contract count / public-API symbol count)
- `shared/*` entries used by ≥2 components (healthy) vs by ≤1 component (dead cross-cutting)

Persist the structural snapshot to `METRICS_DIR/structure_[YYYY-MM-DD].md` so future retros can compute deltas without re-deriving from source.

**Efficiency Metrics**:
- Blocked task count and reasons
- Tasks completed on first attempt vs requiring fixes
- Batch with most findings (identify problem areas)

**Auto-lesson triggers** (feed Step 4 LESSONS.md generation):
- Net Architecture delta > 0 this cycle → `architecture` lesson
- Any structural metric regressed by >20% vs previous snapshot → `architecture` or `dependencies` lesson depending on the metric
- Contract coverage % decreased → `architecture` lesson

**Self-verification**:
- [ ] All batch reports parsed
- [ ] All metric categories computed
- [ ] No batch reports missed
- [ ] Structural snapshot written (or explicitly skipped with reason "module-layout.md absent")
- [ ] If a previous `structure_*.md` exists, deltas are computed against the most recent one

---

### Step 2: Analyze Trends

**Role**: Process improvement analyst
**Goal**: Identify patterns, recurring issues, and improvement opportunities
**Constraints**: Analysis must be grounded in the metrics from Step 1

1. If previous retrospective reports exist in METRICS_DIR, load the most recent one for comparison
2. Identify patterns:
   - **Recurring findings**: which code review categories appear most frequently?
   - **Problem components**: which components/files generate the most findings?
   - **Complexity accuracy**: do high-complexity tasks actually produce more issues?
   - **Blocker patterns**: what types of blockers occur and can they be prevented?
3. Compare against previous retrospective (if exists):
   - Which metrics improved?
   - Which metrics degraded?
   - Were previous improvement actions effective?
4. Identify top 3 improvement actions ranked by impact

**Self-verification**:
- [ ] Patterns are grounded in specific metrics
- [ ] Comparison with previous retro included (if exists)
- [ ] Top 3 actions are concrete and actionable

---

### Step 3: Produce Report

**Role**: Technical writer
**Goal**: Write a structured retrospective report with metrics, trends, and recommendations
**Constraints**: Concise, data-driven, actionable

Write `METRICS_DIR/retro_[YYYY-MM-DD].md` using `templates/retrospective-report.md` as structure.

**Self-verification**:
- [ ] All metrics from Step 1 included
- [ ] Trend analysis from Step 2 included
- [ ] Top 3 improvement actions clearly stated
- [ ] Suggested rule/skill updates are specific

**Save action**: Write `retro_[YYYY-MM-DD].md` (in cycle-end / on-demand mode) or `incident_[YYYY-MM-DD]_[skill].md` (in incident mode).

Present the report summary to the user.

---

### Step 4: Update Lessons Log

**Role**: Process improvement analyst
**Goal**: Keep a short, frequently-consulted log of actionable lessons that downstream skills read before they plan or estimate.

1. Extract the **top 3 concrete lessons** from the current retrospective (or 1–3 lessons in incident mode, scoped to the failing skill). Each lesson must:
   - Be specific enough to change future behavior (not a platitude).
   - Be single-sentence.
   - Be tied to one of the categories: `estimation`, `architecture`, `testing`, `dependencies`, `tooling`, `process`.
2. Append one bullet per lesson to `_docs/LESSONS.md` using this format:

   ```
   - [YYYY-MM-DD] [category] one-line lesson statement.
     Source: _docs/06_metrics/retro_YYYY-MM-DD.md
   ```

3. After appending, trim `_docs/LESSONS.md` to keep only the last **15 entries** (ring buffer). Oldest entries drop off the top. Preserve the file's header section if present.
4. If `_docs/LESSONS.md` does not exist, create it with this skeleton before appending:

   ```markdown
   # Lessons Log

   A ring buffer of the last 15 actionable lessons extracted from retrospectives and incidents.
   Downstream skills consume this file:
   - `.cursor/skills/new-task/SKILL.md` (Step 2 Complexity Assessment)
   - `.cursor/skills/plan/steps/06_work-item-epics.md` (epic sizing)
   - `.cursor/skills/decompose/SKILL.md` (Step 2 task complexity)
   - `.cursor/skills/autodev/SKILL.md` (Execution Loop step 0 — surface top 3 lessons)

   Categories: estimation · architecture · testing · dependencies · tooling · process
   ```

**Self-verification**:
- [ ] 1–3 lessons extracted (3 in cycle-end / on-demand mode, 1–3 in incident mode)
- [ ] Each lesson is single-sentence, specific, and tagged with a valid category
- [ ] Each lesson includes a Source link back to its retro or incident file
- [ ] `_docs/LESSONS.md` trimmed to at most 15 entries
- [ ] Skeleton header preserved if file was just created

**Save action**: Write (or update) `_docs/LESSONS.md`.

---

## Escalation Rules

| Situation | Action |
|-----------|--------|
| No batch reports exist | **STOP** — nothing to analyze |
| Batch reports have inconsistent format | **WARN user**, extract what is available |
| No previous retrospective for comparison | PROCEED — report baseline metrics only |
| Metrics suggest systemic issue (>50% FAIL rate) | **WARN user** — suggest immediate process review |

## Methodology Quick Reference

```
┌────────────────────────────────────────────────────────────────┐
│              Retrospective (3-Step Method)                     │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: batch reports exist in _docs/03_implementation/        │
│                                                                │
│ 1. Collect Metrics  → parse batch reports, compute metrics     │
│ 2. Analyze Trends   → patterns, comparison, improvement areas  │
│ 3. Produce Report   → _docs/06_metrics/retro_[date].md         │
│ 4. Update Lessons   → append top-3 to _docs/LESSONS.md (≤15)   │
├────────────────────────────────────────────────────────────────┤
│ Principles: Data-driven · Actionable · Cumulative              │
│             Non-judgmental · Save immediately                  │
└────────────────────────────────────────────────────────────────┘
```