---
name: retrospective
description: |
  Collect metrics from implementation batch reports and code review findings,
  analyze trends across cycles, and produce improvement reports with actionable
  recommendations. 3-step workflow: collect metrics, analyze trends, produce
  report. Outputs to _docs/06_metrics/.
  Trigger phrases:
  - "retrospective", "retro", "run retro"
  - "metrics review", "feedback loop"
  - "implementation metrics", "analyze trends"
category: evolve
tags: [retrospective, metrics, trends, improvement, feedback-loop]
disable-model-invocation: true
---

# Retrospective

Collect metrics from implementation artifacts, analyze trends across development cycles, and produce actionable improvement reports.

## Core Principles

- **Data-driven**: conclusions come from metrics, not impressions
- **Actionable**: every finding must have a concrete improvement suggestion
- **Cumulative**: each retrospective compares against previous ones to track progress
- **Save immediately**: write artifacts to disk after each step
- **Non-judgmental**: focus on process improvement, not blame

## Context Resolution

Fixed paths:

- IMPL_DIR: `_docs/03_implementation/`
- METRICS_DIR: `_docs/06_metrics/`
- TASKS_DIR: `_docs/02_tasks/` (scan all subfolders: `todo/`, `backlog/`, `done/`)

Announce the resolved paths to the user before proceeding.

## Prerequisite Checks (BLOCKING)

1. `IMPL_DIR` exists and contains at least one `batch_*_report.md` — **STOP if missing** (nothing to analyze)
2. Create METRICS_DIR if it does not exist
3. Check for previous retrospective reports in METRICS_DIR to enable trend comparison

## Artifact Management

### Directory Structure

```
METRICS_DIR/
├── retro_[YYYY-MM-DD].md
├── retro_[YYYY-MM-DD].md
└── ...
```

## Progress Tracking

At the start of execution, create a TodoWrite with all steps (1 through 3). Update status as each step completes.
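As a minimal sketch of the blocking prerequisite checks, assuming Python: the fixed paths and the `batch_*_report.md` / `retro_*.md` filename patterns come from this document, while the function name `check_prerequisites` is illustrative.

```python
from pathlib import Path

# Fixed paths from Context Resolution (assumed relative to the project root)
IMPL_DIR = Path("_docs/03_implementation")
METRICS_DIR = Path("_docs/06_metrics")


def check_prerequisites() -> list[Path]:
    """Run the blocking prerequisite checks; return the batch reports to analyze."""
    # 1. IMPL_DIR must contain at least one batch report — STOP otherwise.
    reports = sorted(IMPL_DIR.glob("batch_*_report.md"))
    if not reports:
        raise SystemExit("STOP: no batch_*_report.md found — nothing to analyze")
    # 2. Create METRICS_DIR if it does not exist.
    METRICS_DIR.mkdir(parents=True, exist_ok=True)
    # 3. Previous retrospectives (if any) enable trend comparison in Step 2.
    previous = sorted(METRICS_DIR.glob("retro_*.md"))
    if previous:
        print(f"Trend comparison baseline: {previous[-1].name}")
    return reports
```

Returning the report list lets Step 1 consume exactly the files the prerequisite check validated, rather than re-scanning the directory.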
## Workflow

### Step 1: Collect Metrics

**Role**: Data analyst
**Goal**: Parse all implementation artifacts and extract quantitative metrics
**Constraints**: Collection only — no interpretation yet

#### Sources

| Source | Metrics Extracted |
|--------|-------------------|
| `batch_*_report.md` | Tasks per batch, batch count, task statuses (Done/Blocked/Partial) |
| Code review sections in batch reports | PASS/FAIL/PASS_WITH_WARNINGS ratios, finding counts by severity and category |
| Task spec files in TASKS_DIR | Complexity points per task, dependency count |
| `FINAL_implementation_report.md` | Total tasks, total batches, overall duration |
| Git log (if available) | Commits per batch, files changed per batch |

#### Metrics to Compute

**Implementation Metrics**:
- Total tasks implemented
- Total batches executed
- Average tasks per batch
- Average complexity points per batch
- Total complexity points delivered

**Quality Metrics**:
- Code review pass rate (PASS / total reviews)
- Code review findings by severity: Critical, High, Medium, Low counts
- Code review findings by category: Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope
- FAIL count (batches that required user intervention)

**Efficiency Metrics**:
- Blocked task count and reasons
- Tasks completed on first attempt vs. those requiring fixes
- Batch with most findings (identify problem areas)

**Self-verification**:
- [ ] All batch reports parsed
- [ ] All metric categories computed
- [ ] No batch reports missed

---

### Step 2: Analyze Trends

**Role**: Process improvement analyst
**Goal**: Identify patterns, recurring issues, and improvement opportunities
**Constraints**: Analysis must be grounded in the metrics from Step 1

1. If previous retrospective reports exist in METRICS_DIR, load the most recent one for comparison
2. Identify patterns:
   - **Recurring findings**: which code review categories appear most frequently?
   - **Problem components**: which components/files generate the most findings?
   - **Complexity accuracy**: do high-complexity tasks actually produce more issues?
   - **Blocker patterns**: what types of blockers occur, and can they be prevented?
3. Compare against the previous retrospective (if one exists):
   - Which metrics improved?
   - Which metrics degraded?
   - Were previous improvement actions effective?
4. Identify the top 3 improvement actions, ranked by impact

**Self-verification**:
- [ ] Patterns are grounded in specific metrics
- [ ] Comparison with previous retro included (if one exists)
- [ ] Top 3 actions are concrete and actionable

---

### Step 3: Produce Report

**Role**: Technical writer
**Goal**: Write a structured retrospective report with metrics, trends, and recommendations
**Constraints**: Concise, data-driven, actionable

Write `METRICS_DIR/retro_[YYYY-MM-DD].md` using `templates/retrospective-report.md` as the structure.

**Self-verification**:
- [ ] All metrics from Step 1 included
- [ ] Trend analysis from Step 2 included
- [ ] Top 3 improvement actions clearly stated
- [ ] Suggested rule/skill updates are specific

**Save action**: Write `retro_[YYYY-MM-DD].md`

Present the report summary to the user.

---

## Escalation Rules

| Situation | Action |
|-----------|--------|
| No batch reports exist | **STOP** — nothing to analyze |
| Batch reports have inconsistent format | **WARN user**, extract what is available |
| No previous retrospective for comparison | PROCEED — report baseline metrics only |
| Metrics suggest systemic issue (>50% FAIL rate) | **WARN user** — suggest immediate process review |

## Methodology Quick Reference

```
┌────────────────────────────────────────────────────────────────┐
│ Retrospective (3-Step Method)                                  │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: batch reports exist in _docs/03_implementation/        │
│                                                                │
│ 1. Collect Metrics → parse batch reports, compute metrics      │
│ 2. Analyze Trends  → patterns, comparison, improvement areas   │
│ 3. Produce Report  → _docs/06_metrics/retro_[date].md          │
├────────────────────────────────────────────────────────────────┤
│ Principles: Data-driven · Actionable · Cumulative              │
│             Non-judgmental · Save immediately                  │
└────────────────────────────────────────────────────────────────┘
```
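To make the Step 1 quality metrics and the >50% FAIL escalation threshold concrete, here is a minimal sketch, assuming Python: the PASS/FAIL/PASS_WITH_WARNINGS statuses and the escalation rule come from this document, while the function name `quality_metrics` and its input shape are illustrative assumptions.

```python
from collections import Counter


def quality_metrics(review_results: list[str]) -> dict:
    """Compute the Step 1 quality metrics from per-batch review results.

    Each entry is one of PASS, FAIL, or PASS_WITH_WARNINGS, as extracted
    from the code review section of a batch report.
    """
    tally = Counter(review_results)
    total = len(review_results) or 1  # avoid division by zero on empty input
    return {
        "pass_rate": tally["PASS"] / total,       # PASS / total reviews
        "fail_count": tally["FAIL"],              # batches needing user intervention
        "warn_count": tally["PASS_WITH_WARNINGS"],
        # Escalation rule: a >50% FAIL rate suggests a systemic issue
        "systemic_issue": tally["FAIL"] / total > 0.5,
    }
```

A result with `systemic_issue` set would trigger the "WARN user — suggest immediate process review" escalation above.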