---
name: code-review
description: Multi-phase code review against task specs with structured findings output. 7-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency, architecture compliance. Produces a structured report with severity-ranked findings and a PASS/FAIL/PASS_WITH_WARNINGS verdict. Invoked by the /implement skill after each batch, or manually. Trigger phrases: "code review", "review code", "review implementation", "check code quality", "review against specs"
category: review
tags: [code-review, quality, security-scan, performance, SOLID]
disable-model-invocation: true
---

# Code Review

Multi-phase code review that verifies implementation against task specs, checks code quality, and produces structured findings.

## Core Principles

  • Understand intent first: read the task specs before reviewing code — know what it should do before judging how
  • Structured output: every finding has severity, category, location, description, and suggestion
  • Deduplicate: same issue at the same location is reported once using {file}:{line}:{title} as key
  • Severity-ranked: findings sorted Critical > High > Medium > Low
  • Verdict-driven: clear PASS/FAIL/PASS_WITH_WARNINGS drives automation decisions
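
Under these principles, the dedup key and severity ordering can be sketched in Python (the `Finding` dict shape here is illustrative, not a fixed schema):

```python
# Deduplicate findings by (file, line, title) and sort Critical > High > Medium > Low.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def dedupe_and_rank(findings: list[dict]) -> list[dict]:
    seen: dict[tuple, dict] = {}
    for f in findings:
        key = (f["file"], f.get("line"), f["title"])  # {file}:{line}:{title}
        seen.setdefault(key, f)  # the first report of an issue wins
    return sorted(seen.values(), key=lambda f: SEVERITY_ORDER[f["severity"]])

findings = [
    {"file": "a.py", "line": 42, "title": "SQLi", "severity": "Critical"},
    {"file": "a.py", "line": 42, "title": "SQLi", "severity": "Critical"},  # duplicate
    {"file": "b.py", "line": 7, "title": "Dead code", "severity": "Low"},
    {"file": "c.py", "line": 1, "title": "AC-3 gap", "severity": "High"},
]
ranked = dedupe_and_rank(findings)  # 3 findings: Critical, High, Low
```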

## Input

  • List of task spec files that were just implemented (paths to [TRACKER-ID]_[short_name].md)
  • Changed files (detected via git diff or provided by the /implement skill)
  • Project context: _docs/00_problem/restrictions.md, _docs/01_solution/solution.md

## Phase 1: Context Loading

Before reviewing code, build understanding of intent:

  1. Read each task spec — acceptance criteria, scope, constraints, dependencies
  2. Read project restrictions and solution overview
  3. Map which changed files correspond to which task specs
  4. Understand what the code is supposed to do before judging how it does it

## Phase 2: Spec Compliance Review

For each task, verify implementation satisfies every acceptance criterion:

  • Walk through each AC (Given/When/Then) and trace it in the code
  • Check that unit tests cover each AC
  • Check that blackbox tests exist where specified in the task spec
  • Flag any AC that is not demonstrably satisfied as a Spec-Gap finding (severity: High)
  • Flag any scope creep (implementation beyond what the spec asked for) as a Scope finding (severity: Low)

Contract verification (for shared-models / shared-API tasks — any task with a ## Contract section):

  • Verify the referenced contract file exists at the stated path under _docs/02_document/contracts/.
  • Verify the implementation's public signatures (types, method shapes, endpoint paths, error variants) match the contract's Shape section.
  • Verify invariants from the contract's Invariants section are enforced in code (either structurally via types or via runtime checks with tests).
  • If the implementation and the contract disagree, emit a Spec-Gap finding (High severity) and note which side is drifting.

Consumer-side contract verification (for tasks whose Dependencies list a contract file):

  • Verify the consumer's imports and call sites match the contract's Shape.
  • If they diverge, emit a Spec-Gap finding (High severity) with a hint that the consumer, the contract, or the producer is drifting.

## Phase 3: Code Quality Review

Check implemented code against quality standards:

  • SOLID principles — single responsibility, open/closed, Liskov, interface segregation, dependency inversion
  • Error handling — consistent strategy, no bare catch/except, meaningful error messages
  • Naming — clear intent, follows project conventions
  • Complexity — functions longer than 50 lines or cyclomatic complexity > 10
  • DRY — duplicated logic across files
  • Test quality — tests assert meaningful behavior, not just "no error thrown"
  • Dead code — unused imports, unreachable branches
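
The length and complexity thresholds above can be approximated with a short `ast` pass. A sketch, assuming complexity is estimated as 1 + branching nodes rather than a full cyclomatic computation:

```python
import ast

# Rough heuristic: flag functions over 50 lines or with complexity > 10,
# where complexity is approximated as 1 + number of branching nodes.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.ExceptHandler)

def complexity_findings(source: str, filename: str) -> list[str]:
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = (node.end_lineno or node.lineno) - node.lineno + 1
            complexity = 1 + sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            if length > 50:
                findings.append(f"{filename}:{node.lineno}: {node.name} is {length} lines long")
            if complexity > 10:
                findings.append(f"{filename}:{node.lineno}: {node.name} complexity ~{complexity}")
    return findings
```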

## Phase 4: Security Quick-Scan

Lightweight security checks (defer deep analysis to the /security skill):

  • SQL injection via string interpolation
  • Command injection (subprocess with shell=True, exec, eval)
  • Hardcoded secrets, API keys, passwords
  • Missing input validation on external inputs
  • Sensitive data in logs or error messages
  • Insecure deserialization
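
A minimal sketch of the quick-scan using line-level regex heuristics; the patterns are illustrative starting points, not an exhaustive rule set (deep analysis stays with the /security skill):

```python
import re

# Heuristic patterns for the quick-scan; each hit becomes a candidate finding.
PATTERNS = {
    "SQL injection via string interpolation": re.compile(r'(execute|query)\(\s*f["\']', re.I),
    "Command injection risk": re.compile(r"shell\s*=\s*True|\beval\(|\bexec\("),
    "Hardcoded secret": re.compile(r'(api_key|password|secret|token)\s*=\s*["\'][^"\']+["\']', re.I),
}

def quick_scan(source: str, filename: str) -> list[tuple[str, int, str]]:
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for title, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((filename, lineno, title))
    return hits
```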

## Phase 5: Performance Scan

Check for common performance anti-patterns:

  • O(n^2) or worse algorithms where O(n) is possible
  • N+1 query patterns
  • Unbounded data fetching (missing pagination/limits)
  • Blocking I/O in async contexts
  • Unnecessary memory copies or allocations in hot paths
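
To illustrate the first two bullets: the quadratic shape the scan looks for usually reduces to a one-pass index (toy in-memory data standing in for database queries):

```python
# O(n*m) nested lookup vs. O(n+m) index build — the shape the scan flags.
customers = [{"id": i, "name": f"c{i}"} for i in range(3)]
orders = [{"customer_id": 0}, {"customer_id": 2}, {"customer_id": 0}]

# Anti-pattern: a linear search per order (quadratic; analogous to an N+1 query loop).
slow = [next(c for c in customers if c["id"] == o["customer_id"]) for o in orders]

# Fix: build an index once, then do O(1) lookups.
by_id = {c["id"]: c for c in customers}
fast = [by_id[o["customer_id"]] for o in orders]

assert slow == fast
```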

## Phase 6: Cross-Task Consistency

When multiple tasks were implemented in the same batch:

  • Interfaces between tasks are compatible (method signatures, DTOs match)
  • No conflicting patterns (e.g., one task uses repository pattern, another does raw SQL)
  • Shared code is not duplicated across task implementations
  • Dependencies declared in task specs are properly wired

## Phase 7: Architecture Compliance

Verify the implemented code respects the architecture documented in _docs/02_document/architecture.md and the component boundaries declared in _docs/02_document/module-layout.md.

Inputs:

  • _docs/02_document/architecture.md — layering, allowed dependencies, patterns
  • _docs/02_document/module-layout.md — per-component directories, Public API surface, Imports from lists, Allowed Dependencies table
  • The cumulative list of changed files (for per-batch invocation) or the full codebase (for baseline invocation)

Checks:

  1. Layer direction: for each import in a changed file, resolve the importer's layer (from the Allowed Dependencies table) and the importee's layer. Flag any import where the importee's layer is strictly higher than the importer's. Severity: High. Category: Architecture.

  2. Public API respect: for each cross-component import, verify the imported symbol lives in the target component's Public API file list (from module-layout.md). Importing an internal file of another component is an Architecture finding. Severity: High.

  3. No new cyclic module dependencies: build a module-level import graph of the changed files plus their direct dependencies. Flag any new cycle introduced by this batch. Severity: Critical (cycles are structurally hard to undo once wired). Category: Architecture.

  4. Duplicate symbols across components: scan changed files for class, function, or constant names that also appear in another component's code AND do not share an interface. If a shared abstraction was expected (via cross-cutting epic or shared/*), flag it. Severity: High. Category: Architecture.

  5. Cross-cutting concerns not locally re-implemented: if a file under a component directory contains logic that should live in shared/<concern>/ (e.g., custom logging setup, config loader, error envelope), flag it. Severity: Medium. Category: Architecture.
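
Check 1 can be sketched as follows, assuming a hypothetical layer map derived from the Allowed Dependencies table (lower number = lower layer; the component names are illustrative):

```python
# Illustrative layer map; in practice this is parsed from module-layout.md.
LAYERS = {"shared": 0, "domain": 1, "service": 2, "api": 3}

def layer_violations(imports: list[tuple[str, str]]) -> list[str]:
    """imports: (importer_component, importee_component) pairs from changed files."""
    findings = []
    for importer, importee in imports:
        # Flag imports pointing at a strictly higher layer than the importer's.
        if LAYERS[importee] > LAYERS[importer]:
            findings.append(
                f"Architecture/High: {importer} imports {importee} "
                f"(layer {LAYERS[importee]} > {LAYERS[importer]})"
            )
    return findings
```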

Detection approach (per language):

  • Python: parse import / from ... import statements; optionally AST with ast module for reliable symbol resolution.
  • TypeScript / JavaScript: parse import ... from '...' and require('...'); resolve via tsconfig.json paths.
  • C#: parse using directives and fully-qualified type references; respect .csproj ProjectReference layering.
  • Rust: parse use <crate>:: and mod declarations; respect Cargo.toml workspace members.
  • Go: parse import blocks; respect module path ownership.
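
For the Python branch, import collection with the `ast` module looks roughly like:

```python
import ast

# Collect the module names a Python source file imports.
def imported_modules(source: str) -> set[str]:
    mods = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module)
    return mods
```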

If a static analyzer tool is available on the project (ArchUnit, NsDepCop, tach, eslint-plugin-boundaries, etc.), prefer invoking it and parsing its output over hand-rolled analysis.

Invocation modes:

  • Full mode (default when invoked by the implement skill per batch): all 7 phases run.
  • Baseline mode: Phase 1 + Phase 7 only. Used for one-time architecture scan of an existing codebase (see existing-code flow Step 2 — Architecture Baseline Scan). Produces _docs/02_document/architecture_compliance_baseline.md instead of a batch review report.
  • Cumulative mode: all 7 phases on the union of changed files since the last cumulative review. Used mid-implementation (see implement skill Step 14.5).

Baseline delta (cumulative mode + full mode, when _docs/02_document/architecture_compliance_baseline.md exists):

After Phase 7 produces the current Architecture findings list, partition those findings against the baseline:

  • Carried over: a finding whose (file, category, rule) triple matches an entry in the baseline. Not new; still present.
  • Resolved: a baseline entry whose (file, category, rule) triple is NOT in the current findings AND whose target file is in scope of this review. The team fixed it.
  • Newly introduced: a current finding that was not in the baseline. The review cycle created this.

Emit a ## Baseline Delta section in the report with three tables (Carried over, Resolved, Newly introduced) and per-category counts. The verdict logic does not change — Critical / High still drive FAIL. The delta is additional signal for the user and feeds the retrospective's structural metrics.
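
The partition can be sketched as set operations over (file, category, rule) triples:

```python
# Partition current findings against the baseline by (file, category, rule) triple.
def baseline_delta(current: set[tuple], baseline: set[tuple], in_scope: set[str]):
    carried = current & baseline                               # still present
    newly = current - baseline                                 # introduced this cycle
    resolved = {f for f in baseline - current if f[0] in in_scope}  # fixed, and in review scope
    return carried, resolved, newly
```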

## Output Format

Produce a structured report with findings deduplicated and sorted by severity:

# Code Review Report

**Batch**: [task list]
**Date**: [YYYY-MM-DD]
**Verdict**: PASS | PASS_WITH_WARNINGS | FAIL

## Findings

| # | Severity | Category | File:Line | Title |
|---|----------|----------|-----------|-------|
| 1 | Critical | Security | src/api/auth.py:42 | SQL injection via f-string |
| 2 | High | Spec-Gap | src/service/orders.py | AC-3 not satisfied |

### Finding Details

**F1: SQL injection via f-string** (Critical / Security)
- Location: `src/api/auth.py:42`
- Description: User input interpolated directly into SQL query
- Suggestion: Use parameterized query via bind parameters
- Task: 04_auth_service

**F2: AC-3 not satisfied** (High / Spec-Gap)
- Location: `src/service/orders.py`
- Description: AC-3 requires order total recalculation on item removal, but no such logic exists
- Suggestion: Add recalculation in remove_item() method
- Task: 07_order_processing

## Severity Definitions

| Severity | Meaning | Blocks? |
|----------|---------|---------|
| Critical | Security vulnerability, data loss, crash | Yes — verdict FAIL |
| High | Spec gap, logic bug, broken test | Yes — verdict FAIL |
| Medium | Performance issue, maintainability concern, missing validation | No — verdict PASS_WITH_WARNINGS |
| Low | Style, minor improvement, scope creep | No — verdict PASS_WITH_WARNINGS |

## Category Values

Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architecture

Architecture findings come from Phase 7. They indicate layering violations, Public API bypasses, new cyclic dependencies, duplicate symbols, or cross-cutting concerns re-implemented locally.

## Verdict Logic

  • FAIL: any Critical or High finding exists
  • PASS_WITH_WARNINGS: only Medium or Low findings
  • PASS: no findings
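
Expressed as a function over finding severities:

```python
# Verdict from the severities of all deduplicated findings.
def verdict(severities: list[str]) -> str:
    if any(s in ("Critical", "High") for s in severities):
        return "FAIL"
    if severities:
        return "PASS_WITH_WARNINGS"
    return "PASS"
```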

## Integration with /implement

The /implement skill invokes this skill after each batch completes:

  1. Collects changed files from all implementer agents in the batch
  2. Passes task spec paths + changed files to this skill
  3. If verdict is FAIL — presents findings to user (BLOCKING), user fixes or confirms
  4. If verdict is PASS or PASS_WITH_WARNINGS — proceeds automatically (findings shown as info)

## Integration Contract

### Inputs (provided by the implement skill)

| Input | Type | Source | Required |
|-------|------|--------|----------|
| task_specs | list of file paths | Task .md files from _docs/02_tasks/todo/ for the current batch | Yes |
| changed_files | list of file paths | Files modified by implementer agents (from git diff or agent reports) | Yes |
| batch_number | integer | Current batch number (for report naming) | Yes |
| project_restrictions | file path | _docs/00_problem/restrictions.md | If exists |
| solution_overview | file path | _docs/01_solution/solution.md | If exists |

### Invocation Pattern

The implement skill invokes code-review by:

  1. Reading .cursor/skills/code-review/SKILL.md
  2. Providing the inputs above as context (read the files, pass content to the review phases)
  3. Executing all 7 phases sequentially (or the phase subset required by the invocation mode)
  4. Consuming the verdict from the output

### Outputs (returned to the implement skill)

| Output | Type | Description |
|--------|------|-------------|
| verdict | PASS / PASS_WITH_WARNINGS / FAIL | Drives the implement skill's auto-fix gate |
| findings | structured list | Each finding has: severity, category, file:line, title, description, suggestion, task reference |
| critical_count | integer | Number of Critical findings |
| high_count | integer | Number of High findings |
| report_path | file path | _docs/03_implementation/reviews/batch_[NN]_review.md |

### Report Persistence

Save the review report to _docs/03_implementation/reviews/batch_[NN]_review.md (create the reviews/ directory if it does not exist). The report uses the Output Format defined above.

The implement skill uses verdict to decide:

  • PASS / PASS_WITH_WARNINGS → proceed to commit
  • FAIL → enter auto-fix loop (up to 2 attempts), then escalate to user