mirror of https://github.com/azaion/ai-training.git
synced 2026-04-22 05:26:36 +00:00

Commit: Sync .cursor from detections
@@ -4,7 +4,12 @@ alwaysApply: true
---

# Coding preferences

- Always prefer simple solutions
- Follow the Single Responsibility Principle — a class or method should have one reason to change:
  - If a method is hard to name precisely from the caller's perspective, its responsibility is misplaced. Vague names like "candidate", "data", or "item" are a signal — fix the design, not just the name.
  - Logic specific to a platform, variant, or environment belongs in the class that owns that variant, not in the general coordinator. Passing a dependency through is preferable to leaking variant-specific concepts into shared code.
- Only use static methods for pure, self-contained computations (constants, simple math, stateless lookups). If a static method involves resource access, side effects, OS interaction, or logic that varies across subclasses or environments — use an instance method or factory class instead. Before implementing a non-trivial static method, ask the user.
- Generate concise code
- Never suppress errors silently — no `2>/dev/null`, empty `catch` blocks, bare `except: pass`, or discarded error returns. These hide the information you need most when something breaks. If an error is truly safe to ignore, log it or comment why.
- Do not put comments in the code, except in tests: every test must use the Arrange / Act / Assert pattern with language-appropriate comment syntax (`# Arrange` for Python, `// Arrange` for C#/Rust/JS/TS). Omit any section that is not needed (e.g. if there is no setup, skip Arrange; if act and assert are the same line, keep only Assert).
- Do not add logging unless handling an exception, or unless specifically asked
- Do not add code annotations unless specifically asked
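The Arrange / Act / Assert rule above can be illustrated with a minimal pytest-style sketch (the `add` function and both tests are invented for illustration, not taken from the codebase):

```python
# Hypothetical code under test.
def add(a, b):
    return a + b

def test_add_returns_sum():
    # Arrange
    a, b = 2, 3

    # Act
    result = add(a, b)

    # Assert
    assert result == 5

def test_add_zero():
    # Assert (act and assert fit on one line, so only Assert is kept)
    assert add(0, 0) == 0
```

Note how the second test drops the Arrange and Act sections entirely, as the rule prescribes.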

@@ -13,6 +18,7 @@ alwaysApply: true
- Mock data only in tests — never mock data in dev or prod environments
- Make the test environment (files, db, and so on) as close as possible to the production environment
- When you add new libraries or dependencies, make sure you use the same version as the rest of the codebase
- When writing code that calls a library API, verify the API actually exists in the pinned version. Check the library's changelog or migration guide for breaking changes between major versions. Never assume an API works at a given version — test the actual call path before committing.
- When a test fails due to a missing dependency, install it — do not fake or stub the module system. For normal packages, add them to the project's dependency file (requirements-test.txt, package.json devDependencies, test csproj, etc.) and install. Only consider stubbing if the dependency is heavy (e.g. hardware-specific SDK, large native toolchain) — and even then, ask the user first before choosing to stub.
- Do not solve environment or infrastructure problems (dependency resolution, import paths, service discovery, connection config) by hardcoding workarounds in source code. Fix them at the environment/configuration level.
- Before writing new infrastructure or workaround code, check how the existing codebase already handles the same concern. Follow established project patterns.
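A minimal sketch of how the pinned-version rule can be checked mechanically, using only the standard library (`api_exists` and the module/attribute names are hypothetical helpers, not part of the codebase):

```python
import importlib
import importlib.metadata

def api_exists(module_name: str, attr: str) -> bool:
    """Return True if `attr` is actually exported by the installed module."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)

# Check the installed version before relying on version-specific behavior.
try:
    print(importlib.metadata.version("pip"))
except importlib.metadata.PackageNotFoundError:
    print("pip is not installed in this environment")

print(api_exists("json", "dumps"))         # True — json.dumps exists
print(api_exists("json", "no_such_api"))   # False — would fail at runtime
```

A check like this is no substitute for exercising the actual call path, but it catches renamed or removed APIs early.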

@@ -14,6 +14,10 @@ globs: [".cursor/**"]
- Body under 500 lines; use `references/` directory for overflow content
- Templates live under their skill's `templates/` directory

## Command Files (.cursor/commands/)

- Plain markdown, no frontmatter
- Kebab-case filenames

## Agent Files (.cursor/agents/)

- Must have `name` and `description` in frontmatter
@@ -6,3 +6,5 @@ alwaysApply: true
- Work on the `dev` branch
- Commit message format: `[TRACKER-ID-1] [TRACKER-ID-2] Summary of changes`
- Commit message total length must not exceed 30 characters
- Do NOT push or merge unless the user explicitly asks you to. If a push or merge seems needed, always ask first.
@@ -31,3 +31,20 @@ When the user reacts negatively to generated code ("WTF", "what the hell", "why
**Preventive rules added to coderule.mdc**:

- "Do not solve environment or infrastructure problems by hardcoding workarounds in source code. Fix them at the environment/configuration level."
- "Before writing new infrastructure or workaround code, check how the existing codebase already handles the same concern. Follow established project patterns."
## Debugging Over Contemplation

When the root cause of a bug is not clear after ~5 minutes of reasoning, analysis, and assumption-making — **stop speculating and add debugging logs**. Observe actual runtime behavior before forming another theory. The pattern to follow:

1. Identify the last known-good boundary (e.g., "request enters handler") and the known-bad result (e.g., "callback never fires").
2. Add targeted `print(..., flush=True)` or log statements at each intermediate step to narrow the gap.
3. Read the output. Let evidence drive the next step — not inference chains built on unverified assumptions.
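The three steps above can be sketched as follows (the handler, `parse`, and `dispatch` are invented for illustration):

```python
# Hypothetical instrumentation narrowing the gap between a known-good
# boundary ("request enters handler") and a known-bad result.
def parse(payload):
    return payload.strip().lower()

def dispatch(parsed):
    return {"status": "ok", "value": parsed}

def handle_request(payload):
    print(f"1. entered handler: {payload!r}", flush=True)   # known-good boundary

    parsed = parse(payload)
    print(f"2. parsed: {parsed!r}", flush=True)             # intermediate step

    result = dispatch(parsed)
    print(f"3. dispatched: {result!r}", flush=True)         # just before the known-bad point
    return result

handle_request("  PING  ")
```

Whichever print is the last one to appear brackets the failure; the next theory starts from there, not from speculation.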

Prolonged mental contemplation without evidence is a time sink. A 15-minute instrumented run beats 45 minutes of "could it be X? but then Y... unless Z..." reasoning.
## Long Investigation Retrospective

When a problem takes significantly longer than expected (>30 minutes), perform a post-mortem before closing out:

1. **Identify the bottleneck**: Was the delay caused by assumptions that turned out wrong? Missing visibility into runtime state? Incorrect mental model of a framework or language boundary?
2. **Extract the general lesson**: What category of mistake was this? (e.g., "Python cannot call Cython `cdef` methods", "engine errors silently swallowed", "wrong layer to fix the problem")
3. **Propose a preventive rule**: Formulate it as a short, actionable statement. Present it to the user for approval.
4. **Write it down**: Add the approved rule to the appropriate `.mdc` file so it applies to all future sessions.
@@ -1,6 +1,6 @@
---
description: "Python coding conventions: PEP 8, type hints, pydantic, pytest, async patterns, project structure"
-globs: ["**/*.py", "**/pyproject.toml", "**/requirements*.txt"]
+globs: ["**/*.py", "**/*.pyx", "**/*.pxd", "**/pyproject.toml", "**/requirements*.txt"]
---

# Python
@@ -12,5 +12,10 @@ globs: ["**/*.py", "**/pyproject.toml", "**/requirements*.txt"]
- Catch specific exceptions, never bare `except:`; use custom exception classes
- Use `async`/`await` with `asyncio` for I/O-bound concurrency
- Use `pytest` for testing (not `unittest`); fixtures for setup/teardown
-- Use virtual environments (`venv` or `poetry`); pin dependencies
+- **NEVER install packages globally** (`pip install` / `pip3 install` without a venv). ALWAYS use a virtual environment (`venv`, `poetry`, or `conda env`). If no venv exists for the project, create one first (`python3 -m venv .venv && source .venv/bin/activate`) before installing anything. Pin dependencies.
- Format with `black`; lint with `ruff` or `flake8`
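A minimal sketch of the specific-exception rule combined with a custom exception class (`ConfigError` and `read_port` are illustrative names, not from the codebase):

```python
class ConfigError(Exception):
    """Raised when required configuration is missing or malformed."""

def read_port(config: dict) -> int:
    try:
        return int(config["port"])
    except KeyError as exc:
        # Specific catch: a missing key is a distinct, diagnosable failure.
        raise ConfigError("missing required key: port") from exc
    except ValueError as exc:
        # Specific catch: key present but not an integer.
        raise ConfigError(f"port is not an integer: {config['port']!r}") from exc

print(read_port({"port": "8080"}))  # 8080
```

Each `except` names exactly the failure it understands; anything unexpected still propagates instead of being swallowed by a bare `except:`.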

## Cython

- In `cdef class` methods, prefer `cdef` over `cpdef` unless the method must be callable from Python. `cdef` = C-only (fastest), `cpdef` = C + Python, `def` = Python-only. Check all call sites before choosing.
- **Python cannot call `cdef` methods.** If a `.py` file needs to call a `cdef` method on a Cython object, there are exactly two options: (a) convert the calling file to `.pyx`, `cimport` the class, and use a typed parameter so Cython dispatches the call at the C level; or (b) change the method to `cpdef` if it genuinely needs to be callable from both Python and Cython. Never leave a bare `except Exception: pass` around such a call — it will silently swallow the `AttributeError` and make the failure invisible for a very long time.
- When converting a `.py` file to `.pyx` to gain access to `cdef` methods: add the new extension to `setup.py`, add a `cimport` of the relevant `.pxd`, type the parameter(s) that carry the Cython object, and delete the old `.py` file. This ensures the cross-language call is resolved at compile time, not at runtime.
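The three method kinds above can be sketched in one illustrative `.pyx` file (all names are invented; this is Cython source that must be compiled via `setup.py`, not plain Python):

```
# engine.pyx — illustrative only
cdef class Engine:
    cdef double _scale          # C-level attribute, invisible to Python

    def __init__(self, double scale):
        self._scale = scale

    cdef double step(self, double x):        # C-only: NOT callable from .py files
        return x * self._scale

    cpdef double step_py(self, double x):    # C + Python: callable from both
        return self.step(x)

    def describe(self):                      # Python-only
        return f"Engine(scale={self._scale})"
```

From a `.py` file, `Engine(2.0).step(1.0)` raises `AttributeError` — only `step_py` and `describe` are visible. A `.pyx` caller that does `cimport`s the class and types its variable (`cdef Engine e`) can call `step` directly at the C level.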

@@ -16,11 +16,12 @@ Workflow for projects with an existing codebase. Starts with documentation, prod
| 8 | New Task | new-task/SKILL.md | Steps 1–8 (loop) |
| 9 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 10 | Run Tests | test-run/SKILL.md | Steps 1–4 |
-| 11 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
-| 12 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
-| 13 | Deploy | deploy/SKILL.md | Step 1–7 |
+| 11 | Update Docs | document/SKILL.md (task mode) | Task Steps 0–5 |
+| 12 | Security Audit | security/SKILL.md | Phase 1–5 (optional) |
+| 13 | Performance Test | (autopilot-managed) | Load/stress tests (optional) |
+| 14 | Deploy | deploy/SKILL.md | Step 1–7 |

-After Step 13, the existing-code workflow is complete.
+After Step 14, the existing-code workflow is complete.

## Detection Rules
@@ -157,8 +158,24 @@ Action: Read and execute `.cursor/skills/test-run/SKILL.md`

---

-**Step 11 — Security Audit (optional)**
-Condition: the autopilot state shows Step 10 (Run Tests) is completed AND the autopilot state does NOT show Step 11 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 11 — Update Docs**
+Condition: the autopilot state shows Step 10 (Run Tests) is completed AND the autopilot state does NOT show Step 11 (Update Docs) as completed AND `_docs/02_document/` contains existing documentation (module or component docs)
+
+Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all task spec files from `_docs/02_tasks/done/` that were implemented in the current cycle (i.e., tasks moved to `done/` during Steps 8–9 of this cycle).
+
+The document skill in Task mode:
+1. Reads each task spec to identify changed source files
+2. Updates affected module docs, component docs, and system-level docs
+3. Does NOT redo full discovery, verification, or problem extraction
+
+If `_docs/02_document/` does not contain existing docs (e.g., documentation step was skipped), mark Step 11 as `skipped`.
+
+After completion, auto-chain to Step 12 (Security Audit).
+
+---
+
+**Step 12 — Security Audit (optional)**
+Condition: the autopilot state shows Step 11 (Update Docs) is completed or skipped AND the autopilot state does NOT show Step 12 (Security Audit) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)

Action: Present using Choose format:
@@ -173,13 +190,13 @@ Action: Present using Choose format:
══════════════════════════════════════
```

-- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 12 (Performance Test).
-- If user picks B → Mark Step 11 as `skipped` in the state file, auto-chain to Step 12 (Performance Test).
+- If user picks A → Read and execute `.cursor/skills/security/SKILL.md`. After completion, auto-chain to Step 13 (Performance Test).
+- If user picks B → Mark Step 12 as `skipped` in the state file, auto-chain to Step 13 (Performance Test).

---
-**Step 12 — Performance Test (optional)**
-Condition: the autopilot state shows Step 11 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 12 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 13 — Performance Test (optional)**
+Condition: the autopilot state shows Step 12 (Security Audit) is completed or skipped AND the autopilot state does NOT show Step 13 (Performance Test) as completed or skipped AND (`_docs/04_deploy/` does not exist or is incomplete)

Action: Present using Choose format:
@@ -200,13 +217,13 @@ Action: Present using Choose format:
2. Otherwise, check if `_docs/02_document/tests/performance-tests.md` exists for test scenarios, detect appropriate load testing tool (k6, locust, artillery, wrk, or built-in benchmarks), and execute performance test scenarios against the running system
3. Present results vs acceptance criteria thresholds
4. If thresholds fail → present Choose format: A) Fix and re-run, B) Proceed anyway, C) Abort
-5. After completion, auto-chain to Step 13 (Deploy)
-- If user picks B → Mark Step 12 as `skipped` in the state file, auto-chain to Step 13 (Deploy).
+5. After completion, auto-chain to Step 14 (Deploy)
+- If user picks B → Mark Step 13 as `skipped` in the state file, auto-chain to Step 14 (Deploy).

---
-**Step 13 — Deploy**
-Condition: the autopilot state shows Step 10 (Run Tests) is completed AND (Step 11 is completed or skipped) AND (Step 12 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)
+**Step 14 — Deploy**
+Condition: the autopilot state shows Step 10 (Run Tests) is completed AND (Step 11 is completed or skipped) AND (Step 12 is completed or skipped) AND (Step 13 is completed or skipped) AND (`_docs/04_deploy/` does not exist or is incomplete)

Action: Read and execute `.cursor/skills/deploy/SKILL.md`
@@ -215,7 +232,7 @@ After deployment completes, the existing-code workflow is done.
---

**Re-Entry After Completion**
-Condition: the autopilot state shows `step: done` OR all steps through 13 (Deploy) are completed
+Condition: the autopilot state shows `step: done` OR all steps through 14 (Deploy) are completed

Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:

@@ -230,6 +247,8 @@ Action: The project completed a full cycle. Print the status banner and automati
Set `step: 8`, `status: not_started` in the state file, then auto-chain to Step 8 (New Task).

Note: the loop (Steps 8 → 14 → 8) ensures every feature cycle includes: New Task → Implement → Run Tests → Update Docs → Security → Performance → Deploy.
## Auto-Chain Rules

| Completed Step | Next Action |
@@ -243,10 +262,11 @@ Set `step: 8`, `status: not_started` in the state file, then auto-chain to Step
| Refactor (7, done or skipped) | Auto-chain → New Task (8) |
| New Task (8) | **Session boundary** — suggest new conversation before Implement |
| Implement (9) | Auto-chain → Run Tests (10) |
-| Run Tests (10, all pass) | Auto-chain → Security Audit choice (11) |
-| Security Audit (11, done or skipped) | Auto-chain → Performance Test choice (12) |
-| Performance Test (12, done or skipped) | Auto-chain → Deploy (13) |
-| Deploy (13) | **Workflow complete** — existing-code flow done |
+| Run Tests (10, all pass) | Auto-chain → Update Docs (11) |
+| Update Docs (11) | Auto-chain → Security Audit choice (12) |
+| Security Audit (12, done or skipped) | Auto-chain → Performance Test choice (13) |
+| Performance Test (13, done or skipped) | Auto-chain → Deploy (14) |
+| Deploy (14) | **Workflow complete** — existing-code flow done |

## Status Summary Template
@@ -264,9 +284,10 @@ Set `step: 8`, `status: not_started` in the state file, then auto-chain to Step
Step 8 New Task [DONE (N tasks) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
Step 9 Implement [DONE / IN PROGRESS (batch M of ~N) / NOT STARTED / FAILED (retry N/3)]
Step 10 Run Tests [DONE (N passed, M failed) / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
-Step 11 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
-Step 12 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
-Step 13 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+Step 11 Update Docs [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+Step 12 Security Audit [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+Step 13 Performance Test [DONE / SKIPPED / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
+Step 14 Deploy [DONE / IN PROGRESS / NOT STARTED / FAILED (retry N/3)]
═══════════════════════════════════════════════════
Current: Step N — Name
SubStep: M — [sub-skill internal step name]
@@ -19,9 +19,19 @@ disable-model-invocation: true

Analyze an existing codebase from the bottom up — individual modules first, then components, then system-level architecture — and produce the same `_docs/` artifacts that the `problem` and `plan` skills generate, without requiring a user interview.

+## File Index
+
+| File | Purpose |
+|------|---------|
+| `workflows/full.md` | Full / Focus Area / Resume modes — Steps 0–7 (discovery through final report) |
+| `workflows/task.md` | Task mode — lightweight incremental doc update triggered by task spec files |
+| `references/artifacts.md` | Directory structure, state.json format, resumability, save principles |
+
+**On every invocation**: read the appropriate workflow file based on mode detection below.
+
## Core Principles

-- **Bottom-up always**: module docs -> component specs -> architecture/flows -> solution -> problem extraction. Every higher level is synthesized from the level below.
+- **Bottom-up always**: module docs → component specs → architecture/flows → solution → problem extraction. Every higher level is synthesized from the level below.
- **Dependencies first**: process modules in topological order (leaves first). When documenting module X, all of X's dependencies already have docs.
- **Incremental context**: each module's doc uses already-written dependency docs as context — no ever-growing chain.
- **Verify against code**: cross-reference every entity in generated docs against the actual codebase. Catch hallucinations.

@@ -46,470 +56,16 @@ Announce resolved paths (and FOCUS_DIR if set) to user before proceeding.

Determine the execution mode before any other logic:

-| Mode | Trigger | Scope |
-|------|---------|-------|
-| **Full** | No input file, no existing state | Entire codebase |
-| **Focus Area** | User provides a directory path (e.g., `@src/api/`) | Only the specified subtree + transitive dependencies |
-| **Resume** | `state.json` exists in DOCUMENT_DIR | Continue from last checkpoint |
+| Mode | Trigger | Scope | Workflow File |
+|------|---------|-------|---------------|
+| **Full** | No input file, no existing state | Entire codebase | `workflows/full.md` |
+| **Focus Area** | User provides a directory path (e.g., `@src/api/`) | Only the specified subtree + transitive dependencies | `workflows/full.md` |
+| **Resume** | `state.json` exists in DOCUMENT_DIR | Continue from last checkpoint | `workflows/full.md` |
+| **Task** | User provides a task spec file AND `_docs/02_document/` has existing docs | Targeted update of docs affected by the task | `workflows/task.md` |

Focus Area mode produces module + component docs for the targeted area only. It can be run repeatedly for different areas — each run appends to the existing module and component docs without overwriting other areas.
After detecting the mode, read and follow the corresponding workflow file.

## Prerequisite Checks

- **Full / Focus Area / Resume** → read `workflows/full.md`
- **Task** → read `workflows/task.md`

1. If `_docs/` already exists and contains files AND mode is **Full**, ASK user: **overwrite, merge, or write to `_docs_generated/` instead?**
2. Create DOCUMENT_DIR, SOLUTION_DIR, and PROBLEM_DIR if they don't exist
3. If DOCUMENT_DIR contains a `state.json`, offer to **resume from last checkpoint or start fresh**
4. If FOCUS_DIR is set, verify the directory exists and contains source files — **STOP if missing**

## Progress Tracking

Create a TodoWrite with all steps (0 through 7). Update status as each step completes.

## Workflow

### Step 0: Codebase Discovery

**Role**: Code analyst
**Goal**: Build a complete map of the codebase (or targeted subtree) before analyzing any code.

**Focus Area scoping**: if FOCUS_DIR is set, limit the scan to that directory subtree. Still identify transitive dependencies outside FOCUS_DIR (modules that FOCUS_DIR imports) and include them in the processing order, but skip modules that are neither inside FOCUS_DIR nor dependencies of it.

Scan and catalog:

1. Directory tree (ignore `node_modules`, `.git`, `__pycache__`, `bin/`, `obj/`, build artifacts)
2. Language detection from file extensions and config files
3. Package manifests: `package.json`, `requirements.txt`, `pyproject.toml`, `*.csproj`, `Cargo.toml`, `go.mod`
4. Config files: `Dockerfile`, `docker-compose.yml`, `.env.example`, CI/CD configs (`.github/workflows/`, `.gitlab-ci.yml`, `azure-pipelines.yml`)
5. Entry points: `main.*`, `app.*`, `index.*`, `Program.*`, startup scripts
6. Test structure: test directories, test frameworks, test runner configs
7. Existing documentation: README, `docs/`, wiki references, inline doc coverage
8. **Dependency graph**: build a module-level dependency graph by analyzing imports/references. Identify:
   - Leaf modules (no internal dependencies)
   - Entry points (no internal dependents)
   - Cycles (mark for grouped analysis)
   - Topological processing order
   - If FOCUS_DIR: mark which modules are in-scope vs dependency-only
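The topological processing order from point 8 can be sketched with the standard library `graphlib` module (the module graph is invented for illustration; requires Python 3.9+):

```python
from graphlib import TopologicalSorter

# Invented module-level dependency graph: module -> internal modules it imports.
graph = {
    "app":      {"services", "models"},
    "services": {"models", "utils"},
    "models":   {"utils"},
    "utils":    set(),          # leaf module: no internal dependencies
}

# static_order() yields dependencies before dependents, so every module is
# processed only after all of its dependencies already have docs.
order = list(TopologicalSorter(graph).static_order())
print(order)  # leaves first, e.g. starting with 'utils' and ending with 'app'
```

Cycles raise `graphlib.CycleError`, which maps directly onto the "mark for grouped analysis" rule above.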

**Save**: `DOCUMENT_DIR/00_discovery.md` containing:
- Directory tree (concise, relevant directories only)
- Tech stack summary table (language, framework, database, infra)
- Dependency graph (textual list + Mermaid diagram)
- Topological processing order
- Entry points and leaf modules

**Save**: `DOCUMENT_DIR/state.json` with initial state:
```json
{
  "current_step": "module-analysis",
  "completed_steps": ["discovery"],
  "focus_dir": null,
  "modules_total": 0,
  "modules_documented": [],
  "modules_remaining": [],
  "module_batch": 0,
  "components_written": [],
  "last_updated": ""
}
```

Set `focus_dir` to the FOCUS_DIR path if in Focus Area mode, or `null` for Full mode.

---

### Step 1: Module-Level Documentation

**Role**: Code analyst
**Goal**: Document every identified module individually, processing in topological order (leaves first).

**Batched processing**: process modules in batches of ~5 (sorted by topological order). After each batch: save all module docs, update `state.json`, present a progress summary. Between batches, evaluate whether to suggest a session break.

For each module in topological order:

1. **Read**: read the module's source code. Assess complexity and what context is needed.
2. **Gather context**: collect already-written docs of this module's dependencies (available because of bottom-up order). Note external library usage.
3. **Write module doc** with these sections:
   - **Purpose**: one-sentence responsibility
   - **Public interface**: exported functions/classes/methods with signatures, input/output types
   - **Internal logic**: key algorithms, patterns, non-obvious behavior
   - **Dependencies**: what it imports internally and why
   - **Consumers**: what uses this module (from the dependency graph)
   - **Data models**: entities/types defined in this module
   - **Configuration**: env vars, config keys consumed
   - **External integrations**: HTTP calls, DB queries, queue operations, file I/O
   - **Security**: auth checks, encryption, input validation, secrets access
   - **Tests**: what tests exist for this module, what they cover
4. **Verify**: cross-check that every entity referenced in the doc exists in the codebase. Flag uncertainties.

**Cycle handling**: modules in a dependency cycle are analyzed together as a group, producing a single combined doc.

**Large modules**: if a module exceeds comfortable analysis size, split into logical sub-sections and analyze each part, then combine.

**Save**: `DOCUMENT_DIR/modules/[module_name].md` for each module.
**State**: update `state.json` after each module completes (move from `modules_remaining` to `modules_documented`). Increment `module_batch` after each batch of ~5.

**Session break heuristic**: after each batch, if more than 10 modules remain AND 2+ batches have already completed in this session, suggest a session break:

```
══════════════════════════════════════
SESSION BREAK SUGGESTED
══════════════════════════════════════
Modules documented: [X] of [Y]
Batches completed this session: [N]
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a fresh conversation (recommended)
══════════════════════════════════════
Recommendation: B — fresh context improves
analysis quality for remaining modules
══════════════════════════════════════
```

Re-entry is seamless: `state.json` tracks exactly which modules are done.

---

### Step 2: Component Assembly

**Role**: Software architect
**Goal**: Group related modules into logical components and produce component specs.

1. Analyze module docs from Step 1 to identify natural groupings:
   - By directory structure (most common)
   - By shared data models or common purpose
   - By dependency clusters (tightly coupled modules)
2. For each identified component, synthesize its module docs into a single component specification using `.cursor/skills/plan/templates/component-spec.md` as structure:
   - High-level overview: purpose, pattern, upstream/downstream
   - Internal interfaces: method signatures, DTOs (from actual module code)
   - External API specification (if the component exposes HTTP/gRPC endpoints)
   - Data access patterns: queries, caching, storage estimates
   - Implementation details: algorithmic complexity, state management, key libraries
   - Extensions and helpers: shared utilities needed
   - Caveats and edge cases: limitations, race conditions, bottlenecks
   - Dependency graph: implementation order relative to other components
   - Logging strategy
3. Identify common helpers shared across multiple components -> document in `common-helpers/`
4. Generate component relationship diagram (Mermaid)

**Self-verification**:
- [ ] Every module from Step 1 is covered by exactly one component
- [ ] No component has overlapping responsibility with another
- [ ] Inter-component interfaces are explicit (who calls whom, with what)
- [ ] Component dependency graph has no circular dependencies

**Save**:
- `DOCUMENT_DIR/components/[##]_[name]/description.md` per component
- `DOCUMENT_DIR/common-helpers/[##]_helper_[name].md` per shared helper
- `DOCUMENT_DIR/diagrams/components.md` (Mermaid component diagram)

**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms the component breakdown is correct.

---

### Step 3: System-Level Synthesis

**Role**: Software architect
**Goal**: From component docs, synthesize system-level documents.

All documents here are derived from component docs (Step 2) + module docs (Step 1). No new code reading should be needed. If it is, that indicates a gap in Steps 1-2 — go back and fill it.

#### 3a. Architecture

Using `.cursor/skills/plan/templates/architecture.md` as structure:

- System context and boundaries from entry points and external integrations
- Tech stack table from discovery (Step 0) + component specs
- Deployment model from Dockerfiles, CI configs, environment strategies
- Data model overview from per-component data access sections
- Integration points from inter-component interfaces
- NFRs from test thresholds, config limits, health checks
- Security architecture from per-module security observations
- Key ADRs inferred from technology choices and patterns

**Save**: `DOCUMENT_DIR/architecture.md`

#### 3b. System Flows

Using `.cursor/skills/plan/templates/system-flows.md` as structure:

- Trace main flows through the component interaction graph
- Entry point -> component chain -> output for each major flow
- Mermaid sequence diagrams and flowcharts
- Error scenarios from exception handling patterns
- Data flow tables per flow

**Save**: `DOCUMENT_DIR/system-flows.md` and `DOCUMENT_DIR/diagrams/flows/flow_[name].md`

#### 3c. Data Model

- Consolidate all data models from module docs
- Entity-relationship diagram (Mermaid ERD)
- Migration strategy (if ORM/migration tooling detected)
- Seed data observations
- Backward compatibility approach (if versioning found)

**Save**: `DOCUMENT_DIR/data_model.md`

#### 3d. Deployment (if Dockerfile/CI configs exist)

- Containerization summary
- CI/CD pipeline structure
- Environment strategy (dev, staging, production)
- Observability (logging patterns, metrics, health checks found in code)

**Save**: `DOCUMENT_DIR/deployment/` (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md — only files for which sufficient code evidence exists)

---
|
||||
### Step 4: Verification Pass
|
||||
|
||||
**Role**: Quality verifier
|
||||
**Goal**: Compare every generated document against actual code. Fix hallucinations, fill gaps, correct inaccuracies.
|
||||
|
||||
For each document generated in Steps 1-3:
|
||||
|
||||
1. **Entity verification**: extract all code entities (class names, function names, module names, endpoints) mentioned in the doc. Cross-reference each against the actual codebase. Flag any that don't exist.
|
||||
2. **Interface accuracy**: for every method signature, DTO, or API endpoint in component specs, verify it matches actual code.
|
||||
3. **Flow correctness**: for each system flow diagram, trace the actual code path and verify the sequence matches.
|
||||
4. **Completeness check**: are there modules or components discovered in Step 0 that aren't covered by any document? Flag gaps.
|
||||
5. **Consistency check**: do component docs agree with architecture doc? Do flow diagrams match component interfaces?
|
||||
|
||||
Apply corrections inline to the documents that need them.
|
||||
|
||||
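The entity-verification check can be sketched as a small script. This is a minimal illustration only — the backtick convention for entity mentions and the pre-built symbol table are assumptions, not part of the skill:

```python
import re

def verify_entities(doc_text: str, codebase_symbols: set) -> dict:
    """Flag identifiers mentioned in a doc that don't exist in the codebase."""
    # Assumption: generated docs mention code entities in `backticks`
    mentioned = set(re.findall(r"`([A-Za-z_][A-Za-z0-9_.]*)`", doc_text))
    flagged = sorted(mentioned - codebase_symbols)
    return {"verified": len(mentioned) - len(flagged), "flagged": flagged}

# Hypothetical symbol table gathered by scanning the source tree
symbols = {"UserService", "create_user", "auth.middleware"}
doc = "The `UserService` calls `create_user`; see `PaymentGateway` for billing."
print(verify_entities(doc, symbols))
# → {'verified': 2, 'flagged': ['PaymentGateway']}
```

In practice the symbol table would come from an AST or ctags pass; the regex only approximates how docs reference entities.
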
**Save**: `DOCUMENT_DIR/04_verification_log.md` with:
- Total entities verified vs flagged
- Corrections applied (which document, what changed)
- Remaining gaps or uncertainties
- Completeness score (modules covered / total modules)

**BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes.

**Session boundary**: After verification is confirmed, suggest a session break before proceeding to the synthesis steps (5–7). These steps produce different artifact types and benefit from fresh context:

```
══════════════════════════════════════
VERIFICATION COMPLETE — session break?
══════════════════════════════════════
Steps 0–4 (analysis + verification) are done.
Steps 5–7 (solution + problem extraction + report)
can run in a fresh conversation.
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a new conversation (recommended)
══════════════════════════════════════
```

If **Focus Area mode**: Steps 5–7 are skipped (they require full codebase coverage). Present a summary of modules and components documented for this area. The user can run `/document` again for another area, or run without FOCUS_DIR once all areas are covered to produce the full synthesis.

---

### Step 5: Solution Extraction (Retrospective)

**Role**: Software architect
**Goal**: From all verified technical documentation, retrospectively create `solution.md` — the same artifact the research skill produces. This makes downstream skills (`plan`, `deploy`, `decompose`) compatible with the documented codebase.

Synthesize from architecture (Step 3) + component specs (Step 2) + system flows (Step 3) + verification findings (Step 4):

1. **Product Solution Description**: what the system is, brief component interaction diagram (Mermaid)
2. **Architecture**: the architecture that is implemented, with per-component solution tables:

| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|--------------|--------------|----------|------|-----|
| [actual implementation] | [libs/platforms used] | [observed strengths] | [observed limitations] | [requirements met] | [security approach] | [cost indicators] | [fitness assessment] |

3. **Testing Strategy**: summarize integration/functional tests and non-functional tests found in the codebase
4. **References**: links to key config files, Dockerfiles, CI configs that evidence the solution choices

**Save**: `SOLUTION_DIR/solution.md` (`_docs/01_solution/solution.md`)

---

### Step 6: Problem Extraction (Retrospective)

**Role**: Business analyst
**Goal**: From all verified technical docs, retrospectively derive the high-level problem definition — producing the same documents the `problem` skill creates through interview.

This is the inverse of the normal workflow: instead of problem -> solution -> code, we go code -> technical docs -> problem understanding.

#### 6a. `problem.md`

- Synthesize from architecture overview + component purposes + system flows
- What is this system? What problem does it solve? Who are the users? How does it work at a high level?
- Cross-reference with README if one exists
- Free-form text, concise, readable by someone unfamiliar with the project

#### 6b. `restrictions.md`

- Extract from: tech stack choices, Dockerfile specs (OS, base images), CI configs (platform constraints), dependency versions, environment configs
- Categorize with headers: Hardware, Software, Environment, Operational
- Each restriction should be specific and testable

#### 6c. `acceptance_criteria.md`

- Derive from: test assertions (expected values, thresholds), performance configs (timeouts, rate limits, batch sizes), health check endpoints, validation rules in code
- Categorize with headers by domain
- Every criterion must have a measurable value — if only implied, note the source

#### 6d. `input_data/`

- Document data schemas found (DB schemas, API request/response types, config file formats)
- Create `data_parameters.md` describing what data the system consumes, formats, volumes, update patterns

#### 6e. `security_approach.md` (only if security code found)

- Authentication mechanisms, authorization patterns, encryption, secrets handling, CORS, rate limiting, input sanitization — all from code observations
- If no security-relevant code found, skip this file

**Save**: all files to `PROBLEM_DIR/` (`_docs/00_problem/`)

**BLOCKING**: Present all problem documents to user. These are the most abstracted and therefore most prone to interpretation error. Do NOT proceed until user confirms or requests corrections.

---

### Step 7: Final Report

**Role**: Technical writer
**Goal**: Produce `FINAL_report.md` integrating all generated documentation.

Using `.cursor/skills/plan/templates/final-report.md` as structure:

- Executive summary from architecture + problem docs
- Problem statement (transformed from problem.md, not copy-pasted)
- Architecture overview with tech stack one-liner
- Component summary table (number, name, purpose, dependencies)
- System flows summary table
- Risk observations from verification log (Step 4)
- Open questions (uncertainties flagged during analysis)
- Artifact index listing all generated documents with paths

**Save**: `DOCUMENT_DIR/FINAL_report.md`

**State**: update `state.json` with `current_step: "complete"`.

---

## Artifact Management

### Directory Structure

```
_docs/
├── 00_problem/                  # Step 6 (retrospective)
│   ├── problem.md
│   ├── restrictions.md
│   ├── acceptance_criteria.md
│   ├── input_data/
│   │   └── data_parameters.md
│   └── security_approach.md
├── 01_solution/                 # Step 5 (retrospective)
│   └── solution.md
└── 02_document/                 # DOCUMENT_DIR
    ├── 00_discovery.md          # Step 0
    ├── modules/                 # Step 1
    │   ├── [module_name].md
    │   └── ...
    ├── components/              # Step 2
    │   ├── 01_[name]/description.md
    │   ├── 02_[name]/description.md
    │   └── ...
    ├── common-helpers/          # Step 2
    ├── architecture.md          # Step 3
    ├── system-flows.md          # Step 3
    ├── data_model.md            # Step 3
    ├── deployment/              # Step 3
    ├── diagrams/                # Steps 2-3
    │   ├── components.md
    │   └── flows/
    ├── 04_verification_log.md   # Step 4
    ├── FINAL_report.md          # Step 7
    └── state.json               # Resumability
```

### Resumability

Maintain `DOCUMENT_DIR/state.json`:

```json
{
  "current_step": "module-analysis",
  "completed_steps": ["discovery"],
  "focus_dir": null,
  "modules_total": 12,
  "modules_documented": ["utils/helpers", "models/user"],
  "modules_remaining": ["services/auth", "api/endpoints"],
  "module_batch": 1,
  "components_written": [],
  "last_updated": "2026-03-21T14:00:00Z"
}
```

Update after each module/component completes. If interrupted, resume from the next undocumented module.

When resuming:
1. Read `state.json`
2. Cross-check against actual files in DOCUMENT_DIR (trust files over state if they disagree)
3. Continue from the next incomplete item
4. Inform user which steps are being skipped

### Save Principles

1. **Save immediately**: write each module doc as soon as analysis completes
2. **Incremental context**: each subsequent module uses already-written docs as context
3. **Preserve intermediates**: keep all module docs even after synthesis into component docs
4. **Enable recovery**: state file tracks exact progress for resume

## Escalation Rules

| Situation | Action |
|-----------|--------|
| Minified/obfuscated code detected | WARN user, skip module, note in verification log |
| Module too large for context window | Split into sub-sections, analyze parts separately, combine |
| Cycle in dependency graph | Group cycled modules, analyze together as one doc |
| Generated code (protobuf, swagger-gen) | Note as generated, document the source spec instead |
| No tests found in codebase | Note gap in acceptance_criteria.md, derive AC from validation rules and config limits only |
| Contradictions between code and README | Flag in verification log, ASK user |
| Binary files or non-code assets | Skip, note in discovery |
| `_docs/` already exists | ASK user: overwrite, merge, or use `_docs_generated/` |
| Code intent is ambiguous | ASK user, do not guess |

## Common Mistakes

- **Top-down guessing**: never infer architecture before documenting modules. Build up, don't assume down.
- **Hallucinating entities**: always verify that referenced classes/functions/endpoints actually exist in code.
- **Skipping modules**: every source module must appear in exactly one module doc and one component.
- **Monolithic analysis**: don't try to analyze the entire codebase in one pass. Module by module, in order.
- **Inventing restrictions**: only document constraints actually evidenced in code, configs, or Dockerfiles.
- **Vague acceptance criteria**: "should be fast" is not a criterion. Extract actual numeric thresholds from code.
- **Writing code**: this skill produces documents, never implementation code.

## Methodology Quick Reference

```
┌──────────────────────────────────────────────────────────────────┐
│            Bottom-Up Codebase Documentation (8-Step)             │
├──────────────────────────────────────────────────────────────────┤
│ MODE: Full / Focus Area (@dir) / Resume (state.json)             │
│ PREREQ: Check _docs/ exists (overwrite/merge/new?)               │
│ PREREQ: Check state.json for resume                              │
│                                                                  │
│ 0. Discovery → dependency graph, tech stack, topo order          │
│    (Focus Area: scoped to FOCUS_DIR + transitive deps)           │
│ 1. Module Docs → per-module analysis (leaves first)              │
│    (batched ~5 modules; session break between batches)           │
│ 2. Component Assembly → group modules, write component specs     │
│    [BLOCKING: user confirms components]                          │
│ 3. System Synthesis → architecture, flows, data model, deploy    │
│ 4. Verification → compare all docs vs code, fix errors           │
│    [BLOCKING: user reviews corrections]                          │
│    [SESSION BREAK suggested before Steps 5–7]                    │
│                 ── Focus Area mode stops here ──                 │
│ 5. Solution Extraction → retrospective solution.md               │
│ 6. Problem Extraction → retrospective problem, restrictions, AC  │
│    [BLOCKING: user confirms problem docs]                        │
│ 7. Final Report → FINAL_report.md                                │
├──────────────────────────────────────────────────────────────────┤
│ Principles: Bottom-up always · Dependencies first                │
│             Incremental context · Verify against code            │
│             Save immediately · Resume from checkpoint            │
│             Batch modules · Session breaks for large codebases   │
└──────────────────────────────────────────────────────────────────┘
```

For artifact directory structure and state.json format, see `references/artifacts.md`.

@@ -0,0 +1,70 @@
# Document Skill — Artifact Management

## Directory Structure

```
_docs/
├── 00_problem/                  # Step 6 (retrospective)
│   ├── problem.md
│   ├── restrictions.md
│   ├── acceptance_criteria.md
│   ├── input_data/
│   │   └── data_parameters.md
│   └── security_approach.md
├── 01_solution/                 # Step 5 (retrospective)
│   └── solution.md
└── 02_document/                 # DOCUMENT_DIR
    ├── 00_discovery.md          # Step 0
    ├── modules/                 # Step 1
    │   ├── [module_name].md
    │   └── ...
    ├── components/              # Step 2
    │   ├── 01_[name]/description.md
    │   ├── 02_[name]/description.md
    │   └── ...
    ├── common-helpers/          # Step 2
    ├── architecture.md          # Step 3
    ├── system-flows.md          # Step 3
    ├── data_model.md            # Step 3
    ├── deployment/              # Step 3
    ├── diagrams/                # Steps 2-3
    │   ├── components.md
    │   └── flows/
    ├── 04_verification_log.md   # Step 4
    ├── FINAL_report.md          # Step 7
    └── state.json               # Resumability
```

## State File (state.json)

Maintained in `DOCUMENT_DIR/state.json` for resumability:

```json
{
  "current_step": "module-analysis",
  "completed_steps": ["discovery"],
  "focus_dir": null,
  "modules_total": 12,
  "modules_documented": ["utils/helpers", "models/user"],
  "modules_remaining": ["services/auth", "api/endpoints"],
  "module_batch": 1,
  "components_written": [],
  "last_updated": "2026-03-21T14:00:00Z"
}
```

Update after each module/component completes. If interrupted, resume from the next undocumented module.

### Resume Protocol

1. Read `state.json`
2. Cross-check against actual files in DOCUMENT_DIR (trust files over state if they disagree)
3. Continue from the next incomplete item
4. Inform user which steps are being skipped

## Save Principles

1. **Save immediately**: write each module doc as soon as analysis completes
2. **Incremental context**: each subsequent module uses already-written docs as context
3. **Preserve intermediates**: keep all module docs even after synthesis into component docs
4. **Enable recovery**: state file tracks exact progress for resume

@@ -0,0 +1,376 @@
# Document Skill — Full / Focus Area / Resume Workflow

Covers three related modes that share the same 8-step pipeline:

- **Full**: entire codebase, no prior state
- **Focus Area**: scoped to a directory subtree + transitive dependencies
- **Resume**: continue from `state.json` checkpoint

## Prerequisite Checks

1. If `_docs/` already exists and contains files AND mode is **Full**, ASK user: **overwrite, merge, or write to `_docs_generated/` instead?**
2. Create DOCUMENT_DIR, SOLUTION_DIR, and PROBLEM_DIR if they don't exist
3. If DOCUMENT_DIR contains a `state.json`, offer to **resume from last checkpoint or start fresh**
4. If FOCUS_DIR is set, verify the directory exists and contains source files — **STOP if missing**

## Progress Tracking

Create a TodoWrite with all steps (0 through 7). Update status as each step completes.

## Steps

### Step 0: Codebase Discovery

**Role**: Code analyst
**Goal**: Build a complete map of the codebase (or targeted subtree) before analyzing any code.

**Focus Area scoping**: if FOCUS_DIR is set, limit the scan to that directory subtree. Still identify transitive dependencies outside FOCUS_DIR (modules that FOCUS_DIR imports) and include them in the processing order, but skip modules that are neither inside FOCUS_DIR nor dependencies of it.

Scan and catalog:

1. Directory tree (ignore `node_modules`, `.git`, `__pycache__`, `bin/`, `obj/`, build artifacts)
2. Language detection from file extensions and config files
3. Package manifests: `package.json`, `requirements.txt`, `pyproject.toml`, `*.csproj`, `Cargo.toml`, `go.mod`
4. Config files: `Dockerfile`, `docker-compose.yml`, `.env.example`, CI/CD configs (`.github/workflows/`, `.gitlab-ci.yml`, `azure-pipelines.yml`)
5. Entry points: `main.*`, `app.*`, `index.*`, `Program.*`, startup scripts
6. Test structure: test directories, test frameworks, test runner configs
7. Existing documentation: README, `docs/`, wiki references, inline doc coverage
8. **Dependency graph**: build a module-level dependency graph by analyzing imports/references. Identify:
   - Leaf modules (no internal dependencies)
   - Entry points (no internal dependents)
   - Cycles (mark for grouped analysis)
   - Topological processing order
   - If FOCUS_DIR: mark which modules are in-scope vs dependency-only

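The ordering in item 8 can be sketched with Kahn's algorithm: process leaves first, and any module never emitted must sit on a cycle. The module names below are hypothetical:

```python
from collections import defaultdict, deque

def topo_order(deps):
    """Order modules dependencies-first (leaves first); report cycle members."""
    nodes = set(deps) | {d for ds in deps.values() for d in ds}
    dependents = defaultdict(set)          # module -> modules that import it
    pending = {n: len(deps.get(n, set())) for n in nodes}
    for mod, ds in deps.items():
        for d in ds:
            dependents[d].add(mod)
    ready = deque(sorted(n for n, c in pending.items() if c == 0))  # leaves
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in sorted(dependents[n]):
            pending[m] -= 1
            if pending[m] == 0:
                ready.append(m)
    cycles = nodes - set(order)            # left over => part of a cycle
    return order, cycles

deps = {"api": {"services"}, "services": {"models", "utils"},
        "models": {"utils"}, "utils": set()}
print(topo_order(deps)[0])
# → ['utils', 'models', 'services', 'api']
```

Cycle members returned this way are exactly the modules the skill groups into one combined doc.
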
**Save**: `DOCUMENT_DIR/00_discovery.md` containing:
- Directory tree (concise, relevant directories only)
- Tech stack summary table (language, framework, database, infra)
- Dependency graph (textual list + Mermaid diagram)
- Topological processing order
- Entry points and leaf modules

**Save**: `DOCUMENT_DIR/state.json` with initial state (see `references/artifacts.md` for format).

---

### Step 1: Module-Level Documentation

**Role**: Code analyst
**Goal**: Document every identified module individually, processing in topological order (leaves first).

**Batched processing**: process modules in batches of ~5 (sorted by topological order). After each batch: save all module docs, update `state.json`, present a progress summary. Between batches, evaluate whether to suggest a session break.

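The batching itself is mechanical — a minimal sketch, with hypothetical module names:

```python
def batches(ordered_modules, size=5):
    """Split the topologically ordered module list into processing batches."""
    return [ordered_modules[i:i + size]
            for i in range(0, len(ordered_modules), size)]

print(batches(["m1", "m2", "m3", "m4", "m5", "m6", "m7"]))
# → [['m1', 'm2', 'm3', 'm4', 'm5'], ['m6', 'm7']]
```

Because the input is already topologically sorted, every batch only depends on modules documented in earlier batches.
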
For each module in topological order:

1. **Read**: read the module's source code. Assess complexity and what context is needed.
2. **Gather context**: collect already-written docs of this module's dependencies (available because of bottom-up order). Note external library usage.
3. **Write module doc** with these sections:
   - **Purpose**: one-sentence responsibility
   - **Public interface**: exported functions/classes/methods with signatures, input/output types
   - **Internal logic**: key algorithms, patterns, non-obvious behavior
   - **Dependencies**: what it imports internally and why
   - **Consumers**: what uses this module (from the dependency graph)
   - **Data models**: entities/types defined in this module
   - **Configuration**: env vars, config keys consumed
   - **External integrations**: HTTP calls, DB queries, queue operations, file I/O
   - **Security**: auth checks, encryption, input validation, secrets access
   - **Tests**: what tests exist for this module, what they cover
4. **Verify**: cross-check that every entity referenced in the doc exists in the codebase. Flag uncertainties.

**Cycle handling**: modules in a dependency cycle are analyzed together as a group, producing a single combined doc.

**Large modules**: if a module exceeds comfortable analysis size, split into logical sub-sections and analyze each part, then combine.

**Save**: `DOCUMENT_DIR/modules/[module_name].md` for each module.
**State**: update `state.json` after each module completes (move from `modules_remaining` to `modules_documented`). Increment `module_batch` after each batch of ~5.

**Session break heuristic**: after each batch, if more than 10 modules remain AND 2+ batches have already completed in this session, suggest a session break:

```
══════════════════════════════════════
SESSION BREAK SUGGESTED
══════════════════════════════════════
Modules documented: [X] of [Y]
Batches completed this session: [N]
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a fresh conversation (recommended)
══════════════════════════════════════
Recommendation: B — fresh context improves
analysis quality for remaining modules
══════════════════════════════════════
```

Re-entry is seamless: `state.json` tracks exactly which modules are done.

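The break heuristic reduces to a two-condition check — a minimal sketch:

```python
def should_suggest_break(modules_remaining, batches_this_session):
    """More than 10 modules left AND 2+ batches already completed this session."""
    return modules_remaining > 10 and batches_this_session >= 2

print(should_suggest_break(14, 2))  # → True
print(should_suggest_break(8, 3))   # → False
```
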
---

### Step 2: Component Assembly

**Role**: Software architect
**Goal**: Group related modules into logical components and produce component specs.

1. Analyze module docs from Step 1 to identify natural groupings:
   - By directory structure (most common)
   - By shared data models or common purpose
   - By dependency clusters (tightly coupled modules)
2. For each identified component, synthesize its module docs into a single component specification using `.cursor/skills/plan/templates/component-spec.md` as structure:
   - High-level overview: purpose, pattern, upstream/downstream
   - Internal interfaces: method signatures, DTOs (from actual module code)
   - External API specification (if the component exposes HTTP/gRPC endpoints)
   - Data access patterns: queries, caching, storage estimates
   - Implementation details: algorithmic complexity, state management, key libraries
   - Extensions and helpers: shared utilities needed
   - Caveats and edge cases: limitations, race conditions, bottlenecks
   - Dependency graph: implementation order relative to other components
   - Logging strategy
3. Identify common helpers shared across multiple components → document in `common-helpers/`
4. Generate component relationship diagram (Mermaid)

**Self-verification**:
- [ ] Every module from Step 1 is covered by exactly one component
- [ ] No component has overlapping responsibility with another
- [ ] Inter-component interfaces are explicit (who calls whom, with what)
- [ ] Component dependency graph has no circular dependencies

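The first checkbox ("exactly one component") is machine-checkable — a sketch, with hypothetical module and component names:

```python
def coverage_report(modules, components):
    """Check that every module from Step 1 lands in exactly one component."""
    counts = {m: 0 for m in modules}
    for members in components.values():
        for m in members:
            counts[m] = counts.get(m, 0) + 1
    return {"uncovered": sorted(m for m, c in counts.items() if c == 0),
            "duplicated": sorted(m for m, c in counts.items() if c > 1)}

# Hypothetical grouping where "models/user" was placed in two components
report = coverage_report(
    {"utils/helpers", "models/user", "services/auth"},
    {"01_core": {"utils/helpers", "models/user"}, "02_auth": {"models/user"}})
print(report)
# → {'uncovered': ['services/auth'], 'duplicated': ['models/user']}
```

A non-empty `uncovered` or `duplicated` list means the component breakdown fails self-verification before it is shown to the user.
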
**Save**:
- `DOCUMENT_DIR/components/[##]_[name]/description.md` per component
- `DOCUMENT_DIR/common-helpers/[##]_helper_[name].md` per shared helper
- `DOCUMENT_DIR/diagrams/components.md` (Mermaid component diagram)

**BLOCKING**: Present component list with one-line summaries to user. Do NOT proceed until user confirms the component breakdown is correct.

---

### Step 3: System-Level Synthesis

**Role**: Software architect
**Goal**: From component docs, synthesize system-level documents.

All documents here are derived from component docs (Step 2) + module docs (Step 1). No new code reading should be needed. If it is, that indicates a gap in Steps 1-2 — go back and fill it.

#### 3a. Architecture

Using `.cursor/skills/plan/templates/architecture.md` as structure:

- System context and boundaries from entry points and external integrations
- Tech stack table from discovery (Step 0) + component specs
- Deployment model from Dockerfiles, CI configs, environment strategies
- Data model overview from per-component data access sections
- Integration points from inter-component interfaces
- NFRs from test thresholds, config limits, health checks
- Security architecture from per-module security observations
- Key ADRs inferred from technology choices and patterns

**Save**: `DOCUMENT_DIR/architecture.md`

#### 3b. System Flows

Using `.cursor/skills/plan/templates/system-flows.md` as structure:

- Trace main flows through the component interaction graph
- Entry point → component chain → output for each major flow
- Mermaid sequence diagrams and flowcharts
- Error scenarios from exception handling patterns
- Data flow tables per flow

**Save**: `DOCUMENT_DIR/system-flows.md` and `DOCUMENT_DIR/diagrams/flows/flow_[name].md`

#### 3c. Data Model

- Consolidate all data models from module docs
- Entity-relationship diagram (Mermaid ERD)
- Migration strategy (if ORM/migration tooling detected)
- Seed data observations
- Backward compatibility approach (if versioning found)

**Save**: `DOCUMENT_DIR/data_model.md`

#### 3d. Deployment (if Dockerfile/CI configs exist)

- Containerization summary
- CI/CD pipeline structure
- Environment strategy (dev, staging, production)
- Observability (logging patterns, metrics, health checks found in code)

**Save**: `DOCUMENT_DIR/deployment/` (containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md — only files for which sufficient code evidence exists)

---

### Step 4: Verification Pass

**Role**: Quality verifier
**Goal**: Compare every generated document against actual code. Fix hallucinations, fill gaps, correct inaccuracies.

For each document generated in Steps 1-3:

1. **Entity verification**: extract all code entities (class names, function names, module names, endpoints) mentioned in the doc. Cross-reference each against the actual codebase. Flag any that don't exist.
2. **Interface accuracy**: for every method signature, DTO, or API endpoint in component specs, verify it matches actual code.
3. **Flow correctness**: for each system flow diagram, trace the actual code path and verify the sequence matches.
4. **Completeness check**: are there modules or components discovered in Step 0 that aren't covered by any document? Flag gaps.
5. **Consistency check**: do component docs agree with architecture doc? Do flow diagrams match component interfaces?

Apply corrections inline to the documents that need them.

**Save**: `DOCUMENT_DIR/04_verification_log.md` with:
- Total entities verified vs flagged
- Corrections applied (which document, what changed)
- Remaining gaps or uncertainties
- Completeness score (modules covered / total modules)

**BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes.

**Session boundary**: After verification is confirmed, suggest a session break before proceeding to the synthesis steps (5–7). These steps produce different artifact types and benefit from fresh context:

```
══════════════════════════════════════
VERIFICATION COMPLETE — session break?
══════════════════════════════════════
Steps 0–4 (analysis + verification) are done.
Steps 5–7 (solution + problem extraction + report)
can run in a fresh conversation.
══════════════════════════════════════
A) Continue in this conversation
B) Save and continue in a new conversation (recommended)
══════════════════════════════════════
```

If **Focus Area mode**: Steps 5–7 are skipped (they require full codebase coverage). Present a summary of modules and components documented for this area. The user can run `/document` again for another area, or run without FOCUS_DIR once all areas are covered to produce the full synthesis.

---

### Step 5: Solution Extraction (Retrospective)

**Role**: Software architect
**Goal**: From all verified technical documentation, retrospectively create `solution.md` — the same artifact the research skill produces.

Synthesize from architecture (Step 3) + component specs (Step 2) + system flows (Step 3) + verification findings (Step 4):

1. **Product Solution Description**: what the system is, brief component interaction diagram (Mermaid)
2. **Architecture**: the architecture that is implemented, with per-component solution tables:

| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|------------|--------------|--------------|----------|------|-----|
| [actual implementation] | [libs/platforms used] | [observed strengths] | [observed limitations] | [requirements met] | [security approach] | [cost indicators] | [fitness assessment] |

3. **Testing Strategy**: summarize integration/functional tests and non-functional tests found in the codebase
4. **References**: links to key config files, Dockerfiles, CI configs that evidence the solution choices

**Save**: `SOLUTION_DIR/solution.md` (`_docs/01_solution/solution.md`)

---

### Step 6: Problem Extraction (Retrospective)

**Role**: Business analyst
**Goal**: From all verified technical docs, retrospectively derive the high-level problem definition.

#### 6a. `problem.md`

- Synthesize from architecture overview + component purposes + system flows
- What is this system? What problem does it solve? Who are the users? How does it work at a high level?
- Cross-reference with README if one exists

#### 6b. `restrictions.md`

- Extract from: tech stack choices, Dockerfile specs, CI configs, dependency versions, environment configs
- Categorize: Hardware, Software, Environment, Operational

#### 6c. `acceptance_criteria.md`

- Derive from: test assertions, performance configs, health check endpoints, validation rules
- Every criterion must have a measurable value

#### 6d. `input_data/`

- Document data schemas (DB schemas, API request/response types, config file formats)
- Create `data_parameters.md` describing what data the system consumes

#### 6e. `security_approach.md` (only if security code found)

- Authentication, authorization, encryption, secrets handling, CORS, rate limiting, input sanitization

**Save**: all files to `PROBLEM_DIR/` (`_docs/00_problem/`)

**BLOCKING**: Present all problem documents to user. Do NOT proceed until user confirms or requests corrections.

---

### Step 7: Final Report

**Role**: Technical writer
**Goal**: Produce `FINAL_report.md` integrating all generated documentation.

Using `.cursor/skills/plan/templates/final-report.md` as structure:

- Executive summary from architecture + problem docs
- Problem statement (transformed from problem.md, not copy-pasted)
- Architecture overview with tech stack one-liner
- Component summary table (number, name, purpose, dependencies)
- System flows summary table
- Risk observations from verification log (Step 4)
- Open questions (uncertainties flagged during analysis)
- Artifact index listing all generated documents with paths

**Save**: `DOCUMENT_DIR/FINAL_report.md`

**State**: update `state.json` with `current_step: "complete"`.

---

## Escalation Rules

| Situation | Action |
|-----------|--------|
| Minified/obfuscated code detected | WARN user, skip module, note in verification log |
| Module too large for context window | Split into sub-sections, analyze parts separately, combine |
| Cycle in dependency graph | Group cycled modules, analyze together as one doc |
| Generated code (protobuf, swagger-gen) | Note as generated, document the source spec instead |
| No tests found in codebase | Note gap in acceptance_criteria.md, derive AC from validation rules and config limits only |
| Contradictions between code and README | Flag in verification log, ASK user |
| Binary files or non-code assets | Skip, note in discovery |
| `_docs/` already exists | ASK user: overwrite, merge, or use `_docs_generated/` |
| Code intent is ambiguous | ASK user, do not guess |

## Common Mistakes

- **Top-down guessing**: never infer architecture before documenting modules. Build up, don't assume down.
- **Hallucinating entities**: always verify that referenced classes/functions/endpoints actually exist in code.
- **Skipping modules**: every source module must appear in exactly one module doc and one component.
- **Monolithic analysis**: don't try to analyze the entire codebase in one pass. Module by module, in order.
- **Inventing restrictions**: only document constraints actually evidenced in code, configs, or Dockerfiles.
- **Vague acceptance criteria**: "should be fast" is not a criterion. Extract actual numeric thresholds from code.
- **Writing code**: this skill produces documents, never implementation code.

## Quick Reference
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────────────┐
|
||||
│ Bottom-Up Codebase Documentation (8-Step) │
|
||||
├──────────────────────────────────────────────────────────────────┤
|
||||
│ MODE: Full / Focus Area (@dir) / Resume (state.json) │
|
||||
│ PREREQ: Check _docs/ exists (overwrite/merge/new?) │
|
||||
│ PREREQ: Check state.json for resume │
|
||||
│ │
|
||||
│ 0. Discovery → dependency graph, tech stack, topo order │
|
||||
│ (Focus Area: scoped to FOCUS_DIR + transitive deps) │
|
||||
│ 1. Module Docs → per-module analysis (leaves first) │
|
||||
│ (batched ~5 modules; session break between batches) │
|
||||
│ 2. Component Assembly → group modules, write component specs │
|
||||
│ [BLOCKING: user confirms components] │
|
||||
│ 3. System Synthesis → architecture, flows, data model, deploy │
|
||||
│ 4. Verification → compare all docs vs code, fix errors │
|
||||
│ [BLOCKING: user reviews corrections] │
|
||||
│ [SESSION BREAK suggested before Steps 5–7] │
|
||||
│ ── Focus Area mode stops here ── │
|
||||
│ 5. Solution Extraction → retrospective solution.md │
|
||||
│ 6. Problem Extraction → retrospective problem, restrictions, AC │
|
||||
│ [BLOCKING: user confirms problem docs] │
|
||||
│ 7. Final Report → FINAL_report.md │
|
||||
├──────────────────────────────────────────────────────────────────┤
|
||||
│ Principles: Bottom-up always · Dependencies first │
|
||||
│ Incremental context · Verify against code │
|
||||
│ Save immediately · Resume from checkpoint │
|
||||
│ Batch modules · Session breaks for large codebases │
|
||||
└──────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
@@ -0,0 +1,90 @@
|
||||
# Document Skill — Task Mode Workflow
|
||||
|
||||
Lightweight, incremental documentation update triggered by task spec files. Updates only the docs affected by implemented tasks — does NOT redo full discovery, verification, or problem extraction.
|
||||
|
||||
## Trigger
|
||||
|
||||
- User provides one or more task spec files (e.g., `@_docs/02_tasks/done/AZ-173_*.md`)
|
||||
- AND `_docs/02_document/` already contains module/component docs
|
||||
|
||||
## Accepts
|
||||
|
||||
One or more task spec files from `_docs/02_tasks/todo/` or `_docs/02_tasks/done/`.
|
||||
|
||||
## Steps
|
||||
|
||||
### Task Step 0: Scope Analysis
|
||||
|
||||
1. Read each task spec — extract the "Files Modified" or "Scope / Included" section to identify which source files were changed
|
||||
2. Map changed source files to existing module docs in `DOCUMENT_DIR/modules/`
|
||||
3. Map affected modules to their parent components in `DOCUMENT_DIR/components/`
|
||||
4. Identify which higher-level docs might be affected (system-flows, data_model, data_parameters)
|
||||
|
||||
**Output**: a list of docs to update, organized by level:
|
||||
- Module docs (direct matches)
|
||||
- Component docs (parents of affected modules)
|
||||
- System-level docs (only if the task changed API endpoints, data models, or external integrations)
|
||||
- Problem-level docs (only if the task changed input parameters, acceptance criteria, or restrictions)
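The file-to-doc mapping in step 2 can be sketched as a pure function. The naming scheme below (`src/video/batcher.py` becomes `src__video__batcher.md`) is a hypothetical convention for illustration, not one mandated by this skill:

```python
from pathlib import Path

def map_files_to_module_docs(changed_files, existing_doc_names):
    """Map changed source paths to module doc names.

    Hypothetical naming scheme: 'src/video/batcher.py' -> 'src__video__batcher.md'.
    A value of None marks a new module that still needs a doc created.
    """
    mapping = {}
    for src in changed_files:
        doc_name = Path(src).with_suffix("").as_posix().replace("/", "__") + ".md"
        mapping[src] = doc_name if doc_name in existing_doc_names else None
    return mapping
```

Entries mapped to `None` feed directly into step 5 of Task Step 1 (create a new module doc).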

### Task Step 1: Module Doc Updates

For each affected module:

1. Read the current source file
2. Read the existing module doc
3. Diff the module doc against current code — identify:
   - New functions/methods/classes not in the doc
   - Removed functions/methods/classes still in the doc
   - Changed signatures or behavior
   - New/removed dependencies
   - New/removed external integrations
4. Update the module doc in-place, preserving the existing structure and style
5. If a module is entirely new (no existing doc), create a new module doc following the standard template from `workflows/full.md` Step 1
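The doc-vs-code diff in step 3 can be approximated mechanically for Python modules. This is a minimal sketch using the standard `ast` module; it assumes the module doc's entity list has already been parsed into a set of names:

```python
import ast

def public_names(source: str) -> set:
    """Collect top-level function, async function, and class names from module source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }

def doc_drift(documented: set, source: str) -> dict:
    """Compare names listed in the module doc against names actually in the code."""
    actual = public_names(source)
    return {
        "missing_from_doc": sorted(actual - documented),  # new entities to document
        "stale_in_doc": sorted(documented - actual),      # removed entities to delete
    }
```

A real run would also need signature and dependency comparison; this only covers the first two bullets of step 3.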

### Task Step 2: Component Doc Updates

For each affected component:

1. Read all module docs belonging to this component (including freshly updated ones)
2. Read the existing component doc
3. Update internal interfaces, dependency graphs, implementation details, and caveats sections
4. Do NOT change the component's purpose, pattern, or high-level overview unless the task fundamentally changed it

### Task Step 3: System-Level Doc Updates (conditional)

Only if the task changed API endpoints, system flows, data models, or external integrations:

1. Update `system-flows.md` — modify affected flow diagrams and data flow tables
2. Update `data_model.md` — if entities changed
3. Update `architecture.md` — only if new external integrations or architectural patterns were added

### Task Step 4: Problem-Level Doc Updates (conditional)

Only if the task changed API input parameters, configuration, or acceptance criteria:

1. Update `_docs/00_problem/input_data/data_parameters.md`
2. Update `_docs/00_problem/acceptance_criteria.md` — if new testable criteria emerged

### Task Step 5: Summary

Present a summary of all docs updated:

```
══════════════════════════════════════
 DOCUMENTATION UPDATE COMPLETE
══════════════════════════════════════
Task(s): [task IDs]
Module docs updated: [count]
Component docs updated: [count]
System-level docs updated: [list or "none"]
Problem-level docs updated: [list or "none"]
══════════════════════════════════════
```

## Principles

- **Minimal changes**: only update what the task actually changed. Do not rewrite unaffected sections.
- **Preserve style**: match the existing doc's structure, tone, and level of detail.
- **Verify against code**: for every entity added or changed in a doc, confirm it exists in the current source.
- **New modules**: if the task introduced an entirely new source file, create a new module doc from the standard template.
- **Dead references**: if the task removed code, remove the corresponding doc entries. Do not keep stale references.

@@ -37,7 +37,13 @@ For each file/area referenced in the input file:

Write per-component to `RUN_DIR/discovery/components/[##]_[name].md` (same format as automatic mode, but scoped to affected areas only).

### 1i. Produce List of Changes
### 1i. Logical Flow Analysis (guided mode)

Even in guided mode, perform the logical flow analysis from step 1c (automatic mode) — scoped to the areas affected by the input file. Cross-reference documented flows against actual implementation for the affected components. This catches issues the input file author may have missed.

Write findings to `RUN_DIR/discovery/logical_flow_analysis.md`.

### 1j. Produce List of Changes

1. Start from the validated input file entries
2. Enrich each entry with:
@@ -45,7 +51,8 @@ Write per-component to `RUN_DIR/discovery/components/[##]_[name].md` (same forma
   - Risk assessment (low/medium/high)
   - Dependencies between changes
3. Add any additional issues discovered during scoped analysis (1h)
4. Write `RUN_DIR/list-of-changes.md` using `templates/list-of-changes.md` format
4. **Add any logical flow contradictions** discovered during step 1i
5. Write `RUN_DIR/list-of-changes.md` using `templates/list-of-changes.md` format
   - Set **Mode**: `guided`
   - Set **Source**: path to the original input file

@@ -84,9 +91,36 @@ Also copy to project standard locations:
- `SOLUTION_DIR/solution.md`
- `DOCUMENT_DIR/system_flows.md`

### 1c. Produce List of Changes
### 1c. Logical Flow Analysis

From the component analysis and solution synthesis, identify all issues that need refactoring:
**Critical step — do not skip.** Before producing the change list, cross-reference documented business flows against actual implementation. This catches issues that static code inspection alone misses.

1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md`, and `SOLUTION_DIR/solution.md` (if they exist). Extract every documented business flow, data path, and architectural decision.

2. **Trace each flow through code**: For every documented flow (e.g., "video batch processing", "image tiling", "engine initialization"), walk the actual code path line by line. At each decision point ask:
   - Does the code match the documented/intended behavior?
   - Are there edge cases where the flow silently drops data, double-processes, or deadlocks?
   - Do loop boundaries handle partial batches, empty inputs, and last-iteration cleanup?
   - Are assumptions from one component (e.g., "batch size is dynamic") honored by all consumers?

3. **Check for logical contradictions**: Specifically look for:
   - **Fixed-size assumptions vs dynamic-size reality**: Does the code require exact batch alignment when the engine supports variable sizes? Does it pad, truncate, or drop data to fit a fixed size?
   - **Loop scoping bugs**: Are accumulators (lists, counters) reset at the right point? Does the last iteration flush remaining data? Are results from inside the loop duplicated outside?
   - **Wasted computation**: Is the system doing redundant work (e.g., duplicating frames to fill a batch, processing the same data twice)?
   - **Silent data loss**: Are partial batches, remaining frames, or edge-case inputs silently dropped instead of processed?
   - **Documentation drift**: Does the architecture doc describe components or patterns (e.g., "msgpack serialization") that are actually dead in the code?
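As an illustration of the loop-scoping and silent-data-loss patterns above, here is a minimal Python sketch of a batching loop that drops its trailing partial batch, next to the fixed version that flushes it:

```python
def batches_dropping_tail(frames, batch_size):
    """Buggy: only emits full batches; a trailing partial batch is silently lost."""
    batch, out = [], []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            out.append(batch)
            batch = []
    return out  # leftover frames still sitting in `batch` are dropped here

def batches_flushing_tail(frames, batch_size):
    """Fixed: flush the remaining partial batch after the loop ends."""
    batch, out = [], []
    for frame in frames:
        batch.append(frame)
        if len(batch) == batch_size:
            out.append(batch)
            batch = []
    if batch:  # the last-iteration cleanup the checklist asks about
        out.append(batch)
    return out
```

With five frames and a batch size of two, the buggy version returns two batches and loses the fifth frame; the fixed version returns three.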

4. **Classify each finding** as:
   - **Logic bug**: Incorrect behavior (data loss, double-processing)
   - **Performance waste**: Correct but inefficient (unnecessary padding, redundant inference)
   - **Design contradiction**: Code assumes X but system needs Y (fixed vs dynamic batch)
   - **Documentation drift**: Docs describe something the code doesn't do

Write findings to `RUN_DIR/discovery/logical_flow_analysis.md`.

### 1d. Produce List of Changes

From the component analysis, solution synthesis, and **logical flow analysis**, identify all issues that need refactoring:

1. Hardcoded values (paths, config, magic numbers)
2. Tight coupling between components
@@ -97,6 +131,8 @@ From the component analysis and solution synthesis, identify all issues that nee
7. Testability blockers (code that cannot be exercised in isolation)
8. Security concerns
9. Performance bottlenecks
10. **Logical flow contradictions** (from step 1c)
11. **Silent data loss or wasted computation** (from step 1c)

Write `RUN_DIR/list-of-changes.md` using `templates/list-of-changes.md` format:
- Set **Mode**: `automatic`
@@ -112,6 +148,8 @@ Write all discovery artifacts to RUN_DIR.
- [ ] Every referenced file in list-of-changes.md exists in the codebase
- [ ] Each change entry has file paths, problem, change description, risk, and dependencies
- [ ] Component documentation covers all areas affected by the changes
- [ ] **Logical flow analysis completed**: every documented business flow traced through code, contradictions identified
- [ ] **No silent data loss**: loop boundaries, partial batches, and edge cases checked for all processing flows
- [ ] In guided mode: all input file entries are validated or flagged
- [ ] In automatic mode: solution description covers all components
- [ ] Mermaid diagrams are syntactically correct

@@ -32,36 +32,13 @@ Check in order — first match wins:

If no runner detected → report failure and ask user to specify.

#### Docker Suitability Check
#### Execution Environment Check

Docker is the preferred test environment. Before using it, verify no constraints prevent easy Docker execution:

1. Check `_docs/02_document/tests/environment.md` for a "Test Execution" decision (if the test-spec skill already assessed this, follow that decision)
2. If no prior decision exists, check for disqualifying factors:
   - Hardware bindings: GPU, MPS, CUDA, TPU, FPGA, sensors, cameras, serial devices, host-level drivers
   - Host dependencies: licensed software, OS-specific services, kernel modules, proprietary SDKs
   - Data/volume constraints: large files (> 100MB) impractical to copy into a container
   - Network/environment: host networking, VPN, specific DNS/firewall rules
   - Performance: Docker overhead would invalidate benchmarks or latency measurements
3. If any disqualifying factor found → fall back to local test runner. Present to user using Choose format:

```
══════════════════════════════════════
DECISION REQUIRED: Docker is preferred but factors
preventing easy Docker execution detected
══════════════════════════════════════
Factors detected:
- [list factors]
══════════════════════════════════════
A) Run tests locally (recommended)
B) Run tests in Docker anyway
══════════════════════════════════════
Recommendation: A — detected constraints prevent
easy Docker execution
══════════════════════════════════════
```

4. If no disqualifying factors → use Docker (preferred default)
1. Check `_docs/02_document/tests/environment.md` for a "Test Execution" section. If the test-spec skill already assessed hardware dependencies and recorded a decision (local / docker / both), **follow that decision**.
2. If the "Test Execution" section says **local** → run tests directly on host (no Docker).
3. If the "Test Execution" section says **docker** → use Docker (docker-compose).
4. If the "Test Execution" section says **both** → run local first, then Docker (or vice versa), and merge results.
5. If no prior decision exists → fall back to the hardware-dependency detection logic from the test-spec skill's "Hardware-Dependency & Execution Environment Assessment" section. Ask the user if hardware indicators are found.
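The decision ladder in steps 1 through 5 can be sketched as a small function. The return values (`"ask-user"` and friends) are illustrative placeholders, not part of any real API:

```python
def choose_execution(recorded_decision, hardware_indicators):
    """Pick a test execution environment.

    recorded_decision: value of the "Test Execution" section in environment.md,
    or None when no prior decision exists.
    hardware_indicators: list of detected hardware-dependency signals.
    """
    if recorded_decision in ("local", "docker", "both"):
        return recorded_decision  # steps 1-4: follow the prior decision
    if recorded_decision is None:
        # step 5: fall back to hardware detection; ask the user on any indicator
        return "ask-user" if hardware_indicators else "docker"
    raise ValueError(f"unrecognized decision: {recorded_decision!r}")
```

Note that Docker remains the default only when no prior decision and no hardware indicators exist.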

### 2. Run Tests

@@ -79,42 +56,94 @@ Present a summary:
══════════════════════════════════════
```

**Important**: Collection errors (import failures, missing dependencies, syntax errors) count as failures — they are not "skipped" or ignorable.
**Important**: Collection errors (import failures, missing dependencies, syntax errors) count as failures — they are not "skipped" or ignorable. If a collection error is caused by a missing dependency, install it (add to the project's dependency file and install) before re-running. The test runner script (`run-tests.sh`) should install all dependencies automatically — if it doesn't, fix the script to do so.

### 4. Diagnose Failures
### 4. Diagnose Failures and Skips

Before presenting choices, list every failing/erroring test with a one-line root cause:
Before presenting choices, list every failing/erroring/skipped test with a one-line root cause:

```
Failures:
1. test_foo.py::test_bar — missing dependency 'netron' (not installed)
2. test_baz.py::test_qux — AssertionError: expected 5, got 3 (logic error)
3. test_old.py::test_legacy — ImportError: no module 'removed_module' (possibly obsolete)

Skips:
1. test_x.py::test_pre_init — runtime skip: engine already initialized (unreachable in current test order)
2. test_y.py::test_docker_only — explicit @skip: requires Docker (dead code in local runs)
```

Categorize each as: **missing dependency**, **broken import**, **logic/assertion error**, **possibly obsolete**, or **environment-specific**.
Categorize failures as: **missing dependency**, **broken import**, **logic/assertion error**, **possibly obsolete**, or **environment-specific**.

Categorize skips as: **explicit skip (dead code)**, **runtime skip (unreachable)**, **environment mismatch**, or **missing fixture/data**.

### 5. Handle Outcome

**All tests pass** → return success to the autopilot for auto-chain.
**All tests pass, zero skipped** → return success to the autopilot for auto-chain.

**Any test fails or errors** → this is a **blocking gate**. Never silently ignore or skip failures. Present using Choose format:
**Any test fails or errors** → this is a **blocking gate**. Never silently ignore failures. **Always investigate the root cause before deciding on an action.** Read the failing test code, read the error output, check service logs if applicable, and determine whether the bug is in the test or in the production code.

After investigating, present:

```
══════════════════════════════════════
TEST RESULTS: [N passed, M failed, K skipped, E errors]
══════════════════════════════════════
A) Investigate and fix failing tests/code, then re-run
B) Remove obsolete tests (if diagnosis shows they are no longer relevant)
C) Abort — fix manually
Failures:
1. test_X — root cause: [detailed reason] → action: [fix test / fix code / remove + justification]
══════════════════════════════════════
Recommendation: A — fix failures before proceeding
A) Apply recommended fixes, then re-run
B) Abort — fix manually
══════════════════════════════════════
Recommendation: A — fix root causes before proceeding
══════════════════════════════════════
```

- If user picks A → investigate root causes, attempt fixes, then re-run (loop back to step 2)
- If user picks B → confirm which tests to remove, delete them, then re-run (loop back to step 2)
- If user picks C → return failure to the autopilot
- If user picks A → apply fixes, then re-run (loop back to step 2)
- If user picks B → return failure to the autopilot

**Any test skipped** → this is also a **blocking gate**. Skipped tests mean something is wrong — either with the test, the environment, or the test design. **Never blindly remove a skipped test.** Always investigate the root cause first.

#### Investigation Protocol for Skipped Tests

For each skipped test:

1. **Read the test code** — understand what the test is supposed to verify and why it skips.
2. **Determine the root cause** — why did the skip condition fire?
   - Is the test environment misconfigured? (e.g., wrong ports, missing env vars, service not started correctly)
   - Is the test ordering wrong? (e.g., a fixture in an earlier test mutates shared state)
   - Is a dependency missing? (e.g., package not installed, fixture file absent)
   - Is the skip condition outdated? (e.g., code was refactored but the skip guard still checks the old behavior)
   - Is the test fundamentally untestable in the current setup? (e.g., requires Docker restart, different OS, special hardware)
3. **Try to fix the root cause first** — the goal is to make the test run, not to delete it:
   - Fix the environment or configuration
   - Reorder tests or isolate shared state
   - Install the missing dependency
   - Update the skip condition to match current behavior
4. **Only remove as last resort** — if the test truly cannot run in any realistic test environment (e.g., requires hardware not available, duplicates another test with identical assertions), then removal is justified. Document the reasoning.

#### Categorization

- **explicit skip (dead code)**: Has `@pytest.mark.skip` — investigate whether the reason in the decorator is still valid. Often these are temporary skips that became permanent by accident.
- **runtime skip (unreachable)**: `pytest.skip()` fires inside the test body — investigate why the condition always triggers. Often fixable by adjusting test order, environment, or the condition itself.
- **environment mismatch**: Test assumes a different environment — investigate whether the test environment setup can be fixed.
- **missing fixture/data**: Data or service not available — investigate whether it can be provided.
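A rough first-pass heuristic for the categorization above, assuming skip reasons are available as plain strings from the runner output. The keyword lists are illustrative assumptions, not an exhaustive taxonomy, and a human still confirms each classification:

```python
def classify_skip(reason: str) -> str:
    """Map a skip reason string to one of the categories above (best-effort heuristic)."""
    r = reason.lower()
    if "not installed" in r or "no module" in r or "missing" in r:
        return "missing fixture/data"
    if "requires docker" in r or "requires gpu" in r or "platform" in r:
        return "environment mismatch"
    if "already initialized" in r or "state" in r:
        return "runtime skip (unreachable)"
    return "explicit skip (dead code)"  # default: decorator skip with a stale reason
```

Anything falling into the default bucket deserves the closest scrutiny, since stale decorator skips are the easiest to overlook.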

After investigating, present findings:

```
══════════════════════════════════════
SKIPPED TESTS: K tests skipped
══════════════════════════════════════
1. test_X — root cause: [detailed reason] → action: [fix / restructure / remove + justification]
2. test_Y — root cause: [detailed reason] → action: [fix / restructure / remove + justification]
══════════════════════════════════════
A) Apply recommended fixes, then re-run
B) Accept skips and proceed (requires user justification per skip)
══════════════════════════════════════
```

Only option B allows proceeding with skips, and it requires explicit user approval with documented justification for each skip.

## Trigger Conditions

@@ -209,7 +209,7 @@ Based on all acquired data, acceptance_criteria, and restrictions, form detailed
- [ ] Expected results use comparison methods from `.cursor/skills/test-spec/templates/expected-results.md`
- [ ] Positive and negative scenarios are balanced
- [ ] Consumer app has no direct access to system internals
- [ ] Test environment matches project constraints (see Docker Suitability Assessment below)
- [ ] Test environment matches project constraints (see Hardware-Dependency & Execution Environment Assessment below)
- [ ] External dependencies have mock/stub services defined
- [ ] Traceability matrix has no uncovered AC or restrictions

@@ -337,43 +337,80 @@ When coverage ≥ 70% and all remaining tests have validated data AND quantifiab

---

### Docker Suitability Assessment (BLOCKING — runs before Phase 4)
### Hardware-Dependency & Execution Environment Assessment (BLOCKING — runs before Phase 4)

Docker is the **preferred** test execution environment (reproducibility, isolation, CI parity). Before generating scripts, check whether the project has any constraints that prevent easy Docker usage.
Docker is the **preferred** test execution environment (reproducibility, isolation, CI parity). However, hardware-dependent projects may require local execution to exercise the real code paths. This assessment determines the right execution strategy by scanning both documentation and source code.

**Disqualifying factors** (any one is sufficient to fall back to local):
- Hardware bindings: GPU, MPS, TPU, FPGA, accelerators, sensors, cameras, serial devices, host-level drivers (CUDA, Metal, OpenCL, etc.)
- Host dependencies: licensed software, OS-specific services, kernel modules, proprietary SDKs not installable in a container
- Data/volume constraints: large files (> 100MB) that would be impractical to copy into a container, databases that must run on the host
- Network/environment: tests that require host networking, VPN access, or specific DNS/firewall rules
- Performance: Docker overhead would invalidate benchmarks or latency-sensitive measurements
#### Step 1 — Documentation scan

**Assessment steps**:
1. Scan project source, config files, and dependencies for indicators of the factors above
2. Check `TESTS_OUTPUT_DIR/environment.md` for environment requirements
3. Check `_docs/00_problem/restrictions.md` and `_docs/01_solution/solution.md` for constraints
Check the following files for mentions of hardware-specific requirements:

**Decision**:
- If ANY disqualifying factor is found → recommend **local test execution** as fallback. Present to user using Choose format:
| File | Look for |
|------|----------|
| `_docs/00_problem/restrictions.md` | Platform requirements, hardware constraints, OS-specific features |
| `_docs/01_solution/solution.md` | Engine selection logic, platform-dependent paths, hardware acceleration |
| `_docs/02_document/architecture.md` | Component diagrams showing hardware layers, engine adapters |
| `_docs/02_document/components/*/description.md` | Per-component hardware mentions |
| `TESTS_OUTPUT_DIR/environment.md` | Existing environment decisions |

#### Step 2 — Code scan

Search the project source for indicators of hardware dependence. The project is **hardware-dependent** if ANY of the following are found:

| Category | Code indicators (imports, APIs, config) |
|----------|-----------------------------------------|
| GPU / CUDA | `import pycuda`, `import tensorrt`, `import pynvml`, `torch.cuda`, `nvidia-smi`, `CUDA_VISIBLE_DEVICES`, `runtime: nvidia` |
| Apple Neural Engine / CoreML | `import coremltools`, `CoreML`, `MLModel`, `ComputeUnit`, `MPS`, `sys.platform == "darwin"`, `platform.machine() == "arm64"` |
| OpenCL / Vulkan | `import pyopencl`, `clCreateContext`, vulkan headers |
| TPU / FPGA | `tf.distribute.TPUStrategy`, FPGA bitstream loaders |
| Sensors / Cameras | `cv2.VideoCapture(0)` (device index), serial port access, GPIO, V4L2 |
| OS-specific services | Kernel modules (`modprobe`), host-level drivers, platform-gated code (`sys.platform` branches selecting different backends) |

Also check dependency files (`requirements.txt`, `setup.py`, `pyproject.toml`, `Cargo.toml`, `*.csproj`) for hardware-specific packages.
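A scan like this can be automated with simple pattern matching. The sketch below mirrors a few rows of the indicator table; the pattern lists are deliberately incomplete and meant only as a starting point:

```python
import re

# Illustrative subset of the indicator table above, not an exhaustive registry.
HARDWARE_PATTERNS = {
    "GPU / CUDA": [
        r"\bimport\s+pycuda\b", r"\bimport\s+tensorrt\b",
        r"torch\.cuda", r"CUDA_VISIBLE_DEVICES",
    ],
    "Apple Neural Engine / CoreML": [
        r"\bimport\s+coremltools\b", r"sys\.platform\s*==\s*[\"']darwin[\"']",
    ],
    "Sensors / Cameras": [r"cv2\.VideoCapture\(\s*\d+\s*\)"],
}

def scan_for_hardware(source: str) -> list:
    """Return the categories whose indicators appear anywhere in `source`."""
    hits = []
    for category, patterns in HARDWARE_PATTERNS.items():
        if any(re.search(p, source) for p in patterns):
            hits.append(category)
    return hits
```

An empty result from both the documentation and code scans is what classifies the project as not hardware-dependent in Step 3.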
|
||||
|
||||
#### Step 3 — Classify the project
|
||||
|
||||
Based on Steps 1–2, classify the project:
|
||||
|
||||
- **Not hardware-dependent**: no indicators found → use Docker (preferred default), skip to "Record the decision" below
|
||||
- **Hardware-dependent**: one or more indicators found → proceed to Step 4
|
||||
|
||||
#### Step 4 — Present execution environment choice
|
||||
|
||||
Present the findings and ask the user using Choose format:
|
||||
|
||||
```
|
||||
══════════════════════════════════════
|
||||
DECISION REQUIRED: Test execution environment
|
||||
══════════════════════════════════════
|
||||
Docker is preferred, but factors preventing easy
|
||||
Docker execution detected:
|
||||
- [list factors found]
|
||||
Hardware dependencies detected:
|
||||
- [list each indicator found, with file:line]
|
||||
══════════════════════════════════════
|
||||
A) Local execution (recommended)
|
||||
B) Docker execution (constraints may cause issues)
|
||||
Running in Docker means these hardware code paths
|
||||
are NOT exercised — Docker uses a Linux VM where
|
||||
[specific hardware, e.g. CoreML / CUDA] is unavailable.
|
||||
The system would fall back to [fallback engine/path].
|
||||
══════════════════════════════════════
|
||||
Recommendation: A — detected constraints prevent
|
||||
easy Docker execution
|
||||
A) Local execution only (tests the real hardware path)
|
||||
B) Docker execution only (tests the fallback path)
|
||||
C) Both local and Docker (tests both paths, requires
|
||||
two test runs — recommended for CI with heterogeneous
|
||||
runners)
|
||||
══════════════════════════════════════
|
||||
Recommendation: [A, B, or C] — [reason]
|
||||
══════════════════════════════════════
|
||||
```
|
||||
|
||||
- If NO disqualifying factors → use Docker (preferred default)
|
||||
- Record the decision in `TESTS_OUTPUT_DIR/environment.md` under a "Test Execution" section
|
||||
#### Step 5 — Record the decision
|
||||
|
||||
Write or update a **"Test Execution"** section in `TESTS_OUTPUT_DIR/environment.md` with:
|
||||
|
||||
1. **Decision**: local / docker / both
|
||||
2. **Hardware dependencies found**: list with file references
|
||||
3. **Execution instructions** per chosen mode:
|
||||
- **Local mode**: prerequisites (OS, SDK, hardware), how to start services, how to run the test runner, environment variables
|
||||
- **Docker mode**: docker-compose profile/command, required images, how results are collected
|
||||
- **Both mode**: instructions for each, plus guidance on which CI runner type runs which mode
|
||||
|
||||
---
|
||||
|
||||
@@ -398,17 +435,20 @@ Docker is the **preferred** test execution environment (reproducibility, isolati
|
||||
3. Identify performance/load testing tools from dependencies (k6, locust, artillery, wrk, or built-in benchmarks)
|
||||
4. Read `TESTS_OUTPUT_DIR/environment.md` for infrastructure requirements
|
||||
|
||||
#### Step 2 — Generate `scripts/run-tests.sh`
|
||||
#### Step 2 — Generate test runner
|
||||
|
||||
Create `scripts/run-tests.sh` at the project root using `.cursor/skills/test-spec/templates/run-tests-script.md` as structural guidance. The script must:
|
||||
**Docker is the default.** Only generate a local `scripts/run-tests.sh` if the Hardware-Dependency Assessment determined **local** or **both** execution (i.e., the project requires real hardware like GPU/CoreML/TPU/sensors). For all other projects, use `docker-compose.test.yml` — it provides reproducibility, isolation, and CI parity without a custom shell script.
|
||||
|
||||
**If local script is needed** — create `scripts/run-tests.sh` at the project root using `.cursor/skills/test-spec/templates/run-tests-script.md` as structural guidance. The script must:
|
||||
|
||||

1. Set `set -euo pipefail` and trap cleanup on EXIT
2. Optionally accept a `--unit-only` flag to skip blackbox tests
3. Run unit/blackbox tests using the detected test runner:
   - **Local mode**: activate virtualenv (if present), run test runner directly on host
   - **Docker mode**: spin up docker-compose environment, wait for health checks, run test suite, tear down
4. Print a summary of passed/failed/skipped tests
5. Exit 0 on all pass, exit 1 on any failure
2. **Install all project and test dependencies** (e.g. `pip install -q -r requirements.txt -r e2e/requirements.txt`, `dotnet restore`, `npm ci`). This prevents collection-time import errors on fresh environments.
3. Optionally accept a `--unit-only` flag to skip blackbox tests
4. Run unit/blackbox tests using the detected test runner (activate virtualenv if present, run test runner directly on host)
5. Print a summary of passed/failed/skipped tests
6. Exit 0 on all pass, exit 1 on any failure
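
The flag handling, summary, and exit-code requirements above can be sketched as a minimal skeleton (variable names and the counter logic are illustrative, not mandated by the template):

```shell
#!/usr/bin/env bash
# Hypothetical skeleton of the local runner's control flow; the
# counters and the cleanup body are placeholders.
set -euo pipefail

cleanup() {
  # tear down anything this script started
  :
}
trap cleanup EXIT

UNIT_ONLY=false
for arg in "$@"; do
  if [ "$arg" = "--unit-only" ]; then
    UNIT_ONLY=true
  fi
done

PASSED=0; FAILED=0; SKIPPED=0
# ...run the detected test runner here and update the counters...

echo "Summary: $PASSED passed, $FAILED failed, $SKIPPED skipped"
# Exit 0 on all pass, 1 on any failure
if [ "$FAILED" -gt 0 ]; then
  exit 1
fi
```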

**If Docker** — generate or update `docker-compose.test.yml` that builds the test image, installs all dependencies inside the container, runs the test suite, and exits with the test runner's exit code.
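
A minimal sketch of such a compose file (the service name, dockerfile, and command are illustrative assumptions):

```yaml
# Hypothetical docker-compose.test.yml; names are illustrative.
services:
  tests:
    build:
      context: .
      dockerfile: Dockerfile.test   # installs all deps in the image
    command: ["sh", "-c", "pytest --junitxml=/results/unit.xml"]
    volumes:
      - ./test-results:/results
```

Running `docker compose -f docker-compose.test.yml up --build --exit-code-from tests` propagates the test service's exit code to the caller, which is what CI needs.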

#### Step 3 — Generate `scripts/run-performance-tests.sh`

@@ -2,7 +2,13 @@

Reference for generating `scripts/run-tests.sh` and `scripts/run-performance-tests.sh`.

## `scripts/run-tests.sh`
## When to generate a local `run-tests.sh`

A local shell script is needed **only** for hardware-dependent projects that require real hardware (GPU, CoreML, TPU, sensors, etc.) to exercise the actual code paths. If the Hardware-Dependency Assessment (Phase 4 prerequisite) determined **local** or **both** execution, generate this script.

For all other projects, **use Docker** (`docker-compose.test.yml` / `Dockerfile.test`). Docker is the default — it provides reproducibility, isolation, and CI parity. Do not generate a local `run-tests.sh` when Docker is sufficient.

## `scripts/run-tests.sh` (local / hardware-dependent only)

```bash
#!/usr/bin/env bash
@@ -20,23 +26,33 @@ for arg in "$@"; do
done

cleanup() {
  # tear down docker-compose if it was started
  # tear down services started by this script
}
trap cleanup EXIT

mkdir -p "$RESULTS_DIR"

# --- Install Dependencies ---
# MANDATORY: install all project + test dependencies before building or running.
# A fresh clone or CI runner may have nothing installed.
# Python: pip install -q -r requirements.txt -r e2e/requirements.txt
# .NET:   dotnet restore
# Rust:   cargo fetch
# Node:   npm ci

# --- Build (if needed) ---
# [e.g. Cython: python setup.py build_ext --inplace]

# --- Unit Tests ---
# [detect runner: pytest / dotnet test / cargo test / npm test]
# [run and capture exit code]
# [save results to $RESULTS_DIR/unit-results.*]

# --- Blackbox Tests (skip if --unit-only) ---
# if ! $UNIT_ONLY; then
#   [docker compose -f <compose-file> up -d]
#   [start mock services]
#   [start system under test]
#   [wait for health checks]
#   [run blackbox test suite]
#   [save results to $RESULTS_DIR/blackbox-results.*]
# fi

# --- Summary ---
@@ -61,6 +77,9 @@ trap cleanup EXIT

mkdir -p "$RESULTS_DIR"

# --- Install Dependencies ---
# [same as above — always install first]

# --- Start System Under Test ---
# [docker compose up -d or start local server]
# [wait for health checks]
@@ -80,6 +99,8 @@ mkdir -p "$RESULTS_DIR"

## Key Requirements

- **Docker is the default**: only generate a local `run-tests.sh` for hardware-dependent projects. Otherwise use `docker-compose.test.yml`.
- **Always install dependencies first**: the script must install all project and test dependencies before building or running tests. A fresh clone or CI runner may have nothing installed. Missing a single dependency causes collection errors that abort the entire test run.
- Both scripts must be idempotent (safe to run multiple times)
- Both scripts must work in CI (no interactive prompts, no GUI)
- Use `trap cleanup EXIT` to ensure teardown even on failure
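
The `trap cleanup EXIT` requirement can be illustrated as follows (the compose file name and the guard variable are assumptions for the sketch):

```shell
#!/usr/bin/env bash
# Illustrative teardown pattern: cleanup runs on every exit path,
# including failures under `set -e`. The guard avoids tearing down
# services the script never started, which also keeps reruns safe.
set -euo pipefail

STARTED_COMPOSE=false

cleanup() {
  if [ "$STARTED_COMPOSE" = true ]; then
    docker compose -f docker-compose.test.yml down -v
  fi
}
trap cleanup EXIT

# ...start services, set STARTED_COMPOSE=true, run tests...
```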