Add detailed file index and enhance skill documentation for autopilot, decompose, deploy, plan, and research skills. Introduce tests-only mode in decompose skill, clarify required files for deploy and plan skills, and improve prerequisite checks across skills for better user guidance and workflow efficiency.

Oleksandr Bezdieniezhnykh
2026-03-22 16:15:49 +02:00
parent 60ebe686ff
commit 3165a88f0b
60 changed files with 6324 additions and 1550 deletions
@@ -43,257 +43,51 @@ Determine the operating mode based on invocation before any other logic runs.
**Standalone mode** (explicit input file provided, e.g. `/research @some_doc.md`):
- INPUT_FILE: the provided file (treated as problem description)
- BASE_DIR: if specified by the caller, use it; otherwise default to `_standalone/`
- OUTPUT_DIR: `BASE_DIR/01_solution/`
- RESEARCH_DIR: `BASE_DIR/00_research/`
- Guardrails relaxed: only INPUT_FILE must exist and be non-empty
- `restrictions.md` and `acceptance_criteria.md` are optional — warn if absent, proceed if user confirms
- Mode detection uses OUTPUT_DIR for `solution_draft*.md` scanning
- Draft numbering works the same, scoped to OUTPUT_DIR
- **Final step**: after all research is complete, move INPUT_FILE into BASE_DIR
Announce the detected mode and resolved paths to the user before proceeding.
## Project Integration
### Prerequisite Guardrails (BLOCKING)
Before any research begins, verify the input context exists. **Do not proceed if guardrails fail.**
**Project mode:**
1. Check INPUT_DIR exists — **STOP if missing**, ask user to create it and provide problem files
2. Check `problem.md` in INPUT_DIR exists and is non-empty — **STOP if missing**
3. Check `restrictions.md` in INPUT_DIR exists and is non-empty — **STOP if missing**
4. Check `acceptance_criteria.md` in INPUT_DIR exists and is non-empty — **STOP if missing**
5. Check `input_data/` in INPUT_DIR exists and contains at least one file — **STOP if missing**
6. Read **all** files in INPUT_DIR to ground the investigation in the project context
7. Create OUTPUT_DIR and RESEARCH_DIR if they don't exist
**Standalone mode:**
1. Check INPUT_FILE exists and is non-empty — **STOP if missing**
2. Warn if no `restrictions.md` or `acceptance_criteria.md` were provided alongside INPUT_FILE — proceed if user confirms
3. Create OUTPUT_DIR and RESEARCH_DIR if they don't exist
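The blocking project-mode checks above can be sketched as a small helper (standalone mode reduces to the INPUT_FILE check). The function name and message strings are illustrative, not part of the skill:

```python
from pathlib import Path

REQUIRED_PROJECT_FILES = ["problem.md", "restrictions.md", "acceptance_criteria.md"]

def check_project_guardrails(input_dir: Path) -> list[str]:
    """Return a list of blocking problems; an empty list means guardrails pass."""
    if not input_dir.is_dir():
        return [f"INPUT_DIR {input_dir} is missing"]
    problems = []
    for name in REQUIRED_PROJECT_FILES:
        f = input_dir / name
        # Each required file must exist AND be non-empty.
        if not f.is_file() or f.stat().st_size == 0:
            problems.append(f"{name} is missing or empty")
    input_data = input_dir / "input_data"
    # input_data/ must exist and contain at least one file.
    if not input_data.is_dir() or not any(input_data.iterdir()):
        problems.append("input_data/ is missing or empty")
    return problems
```

Any non-empty return value means STOP and ask the user to supply the missing context before research begins.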
### Mode Detection
After guardrails pass, determine the execution mode:
1. Scan OUTPUT_DIR for files matching `solution_draft*.md`
2. **No matches found** → **Mode A: Initial Research**
3. **Matches found** → **Mode B: Solution Assessment** (use the highest-numbered draft as input)
4. **User override**: if the user explicitly says "research from scratch" or "initial research", force Mode A regardless of existing drafts
Inform the user which mode was detected and confirm before proceeding.
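The detection rules above can be sketched as follows; the `detect_mode` helper and the `force_initial` flag (representing the user's "research from scratch" override) are illustrative names:

```python
from pathlib import Path

def detect_mode(output_dir: Path, force_initial: bool = False) -> tuple:
    """Return ("A", None) for initial research, or ("B", latest_draft) for assessment."""
    # Lexicographic sort works because draft numbers are zero-padded to 2 digits.
    drafts = sorted(output_dir.glob("solution_draft*.md"))
    if force_initial or not drafts:
        return ("A", None)  # no drafts, or user explicitly forced initial research
    return ("B", drafts[-1])  # highest-numbered draft is the assessment input
```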
### Solution Draft Numbering
All final output is saved as `OUTPUT_DIR/solution_draft##.md` with a 2-digit zero-padded number:
1. Scan existing files in OUTPUT_DIR matching `solution_draft*.md`
2. Extract the highest existing number
3. Increment by 1
4. Zero-pad to 2 digits (e.g., `01`, `02`, ..., `10`, `11`)
Example: if `solution_draft01.md` through `solution_draft10.md` exist, the next output is `solution_draft11.md`.
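A minimal sketch of the numbering rule, assuming drafts live directly in OUTPUT_DIR (`next_draft_name` is an illustrative name):

```python
import re
from pathlib import Path

def next_draft_name(output_dir: Path) -> str:
    """Compute the next zero-padded solution draft filename."""
    numbers = [
        int(m.group(1))
        for p in output_dir.glob("solution_draft*.md")
        if (m := re.fullmatch(r"solution_draft(\d+)\.md", p.name))
    ]
    # No existing drafts -> start at 01; otherwise increment the highest number.
    return f"solution_draft{max(numbers, default=0) + 1:02d}.md"
```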
### Working Directory & Intermediate Artifact Management
#### Directory Structure
At the start of research, you **must** create a working directory under RESEARCH_DIR:
```
RESEARCH_DIR/
├── 00_ac_assessment.md # Mode A Phase 1 output: AC & restrictions assessment
├── 00_question_decomposition.md # Step 0-1 output
├── 01_source_registry.md # Step 2 output: all consulted source links
├── 02_fact_cards.md # Step 3 output: extracted facts
├── 03_comparison_framework.md # Step 4 output: selected framework and populated data
├── 04_reasoning_chain.md # Step 6 output: fact → conclusion reasoning
├── 05_validation_log.md # Step 7 output: use-case validation results
└── raw/ # Raw source archive (optional)
├── source_1.md
└── source_2.md
```
### Save Timing & Content
| Step | Save immediately after completion | Filename |
|------|-----------------------------------|----------|
| Mode A Phase 1 | AC & restrictions assessment tables | `00_ac_assessment.md` |
| Step 0-1 | Question type classification + sub-question list | `00_question_decomposition.md` |
| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` |
| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` |
| Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` |
| Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` |
| Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` |
| Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` |
### Save Principles
1. **Save immediately**: Write to the corresponding file as soon as a step is completed; don't wait until the end
2. **Incremental updates**: The same file can be updated multiple times; append new content or replace outdated sections
3. **Preserve process**: Keep intermediate files even after their content is integrated into the final report
4. **Enable recovery**: If research is interrupted, progress can be recovered from intermediate files
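Principles 1-2 amount to an append-on-completion helper; this sketch assumes plain-markdown artifacts, and `append_artifact` is an illustrative name:

```python
from pathlib import Path

def append_artifact(research_dir: Path, filename: str, content: str) -> None:
    """Save immediately: append a completed chunk to an intermediate artifact file."""
    research_dir.mkdir(parents=True, exist_ok=True)
    # Appending (not overwriting) preserves earlier process output for recovery.
    with (research_dir / filename).open("a", encoding="utf-8") as f:
        f.write(content.rstrip() + "\n\n")
```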
Read and follow `steps/00_project-integration.md` for prerequisite guardrails, mode detection, draft numbering, working directory setup, save timing, and output file inventory.
## Execution Flow
### Mode A: Initial Research
Triggered when no `solution_draft*.md` files exist in OUTPUT_DIR, or when the user explicitly requests initial research.
Read and follow `steps/01_mode-a-initial-research.md`.
#### Phase 1: AC & Restrictions Assessment (BLOCKING)
**Role**: Professional software architect
A focused preliminary research pass **before** the main solution research. The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them.
**Input**: All files from INPUT_DIR (or INPUT_FILE in standalone mode)
**Task**:
1. Read all problem context files thoroughly
2. **ASK the user about every unclear aspect** — do not assume:
- Unclear problem boundaries → ask
- Ambiguous acceptance criteria values → ask
- Missing context (no `security_approach.md`, no `input_data/`) → ask what they have
- Conflicting restrictions → ask which takes priority
3. Research the internet **extensively** — use multiple search queries per question, rephrase, and search from different angles:
- How realistic are the acceptance criteria for this specific domain? Search for industry benchmarks, standards, and typical values
- How critical is each criterion? Search for case studies where criteria were relaxed or tightened
- What domain-specific acceptance criteria are we missing? Search for industry standards, regulatory requirements, and best practices in the specific domain
- Impact of each criterion value on the whole system quality — search for research papers and engineering reports
- Cost/budget implications of each criterion — search for pricing, total cost of ownership analyses, and comparable project budgets
- Timeline implications — search for project timelines, development velocity reports, and comparable implementations
- What do practitioners in this domain consider the most important criteria? Search forums, conference talks, and experience reports
4. Research restrictions from multiple perspectives:
- Are the restrictions realistic? Search for comparable projects that operated under similar constraints
- Should any be tightened or relaxed? Search for what constraints similar projects actually ended up with
- Are there additional restrictions we should add? Search for regulatory, compliance, and safety requirements in this domain
- What restrictions do practitioners wish they had defined earlier? Search for post-mortem reports and lessons learned
5. Verify findings with authoritative sources (official docs, papers, benchmarks) — each key finding must have at least 2 independent sources
**Uses Steps 0-3 of the 8-step engine** (question classification, decomposition, source tiering, fact extraction) scoped to AC and restrictions assessment.
**📁 Save action**: Write `RESEARCH_DIR/00_ac_assessment.md` with format:
```markdown
# Acceptance Criteria Assessment
## Acceptance Criteria
| Criterion | Our Values | Researched Values | Cost/Timeline Impact | Status |
|-----------|-----------|-------------------|---------------------|--------|
| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed |
## Restrictions Assessment
| Restriction | Our Values | Researched Values | Cost/Timeline Impact | Status |
|-------------|-----------|-------------------|---------------------|--------|
| [name] | [current] | [researched range] | [impact] | Added / Modified / Removed |
## Key Findings
[Summary of critical findings]
## Sources
[Key references used]
```
**BLOCKING**: Present the AC assessment tables to the user. Wait for confirmation or adjustments before proceeding to Phase 2. The user may update `acceptance_criteria.md` or `restrictions.md` based on findings.
---
#### Phase 2: Problem Research & Solution Draft
**Role**: Professional researcher and software architect
Full 8-step research methodology. Produces the first solution draft.
**Input**: All files from INPUT_DIR (possibly updated after Phase 1) + Phase 1 artifacts
**Task** (drives the 8-step engine):
1. Research existing/competitor solutions for similar problems — search broadly across industries and adjacent domains, not just the obvious competitors
2. Research the problem thoroughly — all possible ways to solve it, split into components; search for how different fields approach analogous problems
3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
4. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
5. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
6. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
7. Include security considerations in each component analysis
8. Provide rough cost estimates for proposed solutions
Formulate concisely: the fewer words the better, but do not omit any important details.
**📁 Save action**: Write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
---
#### Phase 3: Tech Stack Consolidation (OPTIONAL)
**Role**: Software architect evaluating technology choices
Focused synthesis step — no new 8-step cycle. Uses research already gathered in Phase 2 to make concrete technology decisions.
**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + all files from INPUT_DIR
**Task**:
1. Extract technology options from the solution draft's component comparison tables
2. Score each option against: fitness for purpose, maturity, security track record, team expertise, cost, scalability
3. Produce a tech stack summary with selection rationale
4. Assess risks and learning requirements per technology choice
**📁 Save action**: Write `OUTPUT_DIR/tech_stack.md` with:
- Requirements analysis (functional, non-functional, constraints)
- Technology evaluation tables (language, framework, database, infrastructure, key libraries) with scores
- Tech stack summary block
- Risk assessment and learning requirements tables
---
#### Phase 4: Security Deep Dive (OPTIONAL)
**Role**: Security architect
Focused analysis step — deepens the security column from the solution draft into a proper threat model and controls specification.
**Input**: Latest `solution_draft##.md` from OUTPUT_DIR + `security_approach.md` from INPUT_DIR + problem context
**Task**:
1. Build threat model: asset inventory, threat actors, attack vectors
2. Define security requirements and proposed controls per component (with risk level)
3. Summarize authentication/authorization, data protection, secure communication, and logging/monitoring approach
**📁 Save action**: Write `OUTPUT_DIR/security_analysis.md` with:
- Threat model (assets, actors, vectors)
- Per-component security requirements and controls table
- Security controls summary
Phases: AC Assessment (BLOCKING) → Problem Research → Tech Stack (optional) → Security (optional).
---
### Mode B: Solution Assessment
Triggered when `solution_draft*.md` files exist in OUTPUT_DIR.
Read and follow `steps/02_mode-b-solution-assessment.md`.
**Role**: Professional software architect
Full 8-step research methodology applied to assessing and improving an existing solution draft.
**Input**: All files from INPUT_DIR + the latest (highest-numbered) `solution_draft##.md` from OUTPUT_DIR
**Task** (drives the 8-step engine):
1. Read the existing solution draft thoroughly
2. Research the internet extensively — for each component/decision in the draft, search for:
- Known problems and limitations of the chosen approach
- What practitioners say about using it in production
- Better alternatives that may have emerged recently
- Common failure modes and edge cases
- How competitors/similar projects solve the same problem differently
3. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
4. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
5. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
6. For each identified weak point, search for multiple solution approaches and compare them
7. Based on findings, form a new solution draft in the same format
**📁 Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
**Optional follow-up**: After Mode B completes, the user can request Phase 3 (Tech Stack Consolidation) or Phase 4 (Security Deep Dive) using the revised draft. These phases work identically to their Mode A descriptions above.
## Solution Draft Output Templates
- Mode A: `templates/solution_draft_mode_a.md`
- Mode B: `templates/solution_draft_mode_b.md`
## Escalation Rules
@@ -317,389 +111,12 @@ When the user wants to:
- Gather information and evidence for a decision
- Assess or improve an existing solution draft
**Keywords**:
- "deep research", "deep dive", "in-depth analysis"
- "research this", "investigate", "look into"
- "assess solution", "review draft", "improve solution"
- "comparative analysis", "concept comparison", "technical comparison"
**Differentiation from other Skills**:
- Needs a **visual knowledge graph** → use `research-to-diagram`
- Needs **written output** (articles/tutorials) → use `wsy-writer`
- Needs **material organization** → use `material-to-markdown`
- Needs **research + solution draft** → use this Skill
## Research Engine (8-Step Method)
The 8-step method is the core research engine used by both modes. Steps 0-1 and Step 8 have mode-specific behavior; Steps 2-7 are identical regardless of mode.
**Investigation phase** (Steps 0-3.5): Read and follow `steps/03_engine-investigation.md`.
Covers: question classification, novelty sensitivity, question decomposition, perspective rotation, exhaustive web search, fact extraction, iterative deepening.
**Analysis phase** (Steps 4-8): Read and follow `steps/04_engine-analysis.md`.
Covers: comparison framework, baseline alignment, reasoning chain, use-case validation, deliverable formatting.
### Step 0: Question Type Classification
First, classify the research question type and select the corresponding strategy:
| Question Type | Core Task | Focus Dimensions |
|---------------|-----------|------------------|
| **Concept Comparison** | Build comparison framework | Mechanism differences, applicability boundaries |
| **Decision Support** | Weigh trade-offs | Cost, risk, benefit |
| **Trend Analysis** | Map evolution trajectory | History, driving factors, predictions |
| **Problem Diagnosis** | Root cause analysis | Symptoms, causes, evidence chain |
| **Knowledge Organization** | Systematic structuring | Definitions, classifications, relationships |
**Mode-specific classification**:
| Mode / Phase | Typical Question Type |
|--------------|----------------------|
| Mode A Phase 1 | Knowledge Organization + Decision Support |
| Mode A Phase 2 | Decision Support |
| Mode B | Problem Diagnosis + Decision Support |
### Step 0.5: Novelty Sensitivity Assessment (BLOCKING)
Before starting research, assess the novelty sensitivity of the question (Critical/High/Medium/Low). This determines source time windows and filtering strategy.
**For full classification table, critical-domain rules, trigger words, and assessment template**: Read `references/novelty-sensitivity.md`
Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources within 6 months, mandatory version annotations, cross-validation from 2+ sources, and direct verification of official download pages.
**📁 Save action**: Append timeliness assessment to the end of `00_question_decomposition.md`
---
### Step 1: Question Decomposition & Boundary Definition
**Mode-specific sub-questions**:
**Mode A Phase 2** (Initial Research — Problem & Solution):
- "What existing/competitor solutions address this problem?"
- "What are the component parts of this problem?"
- "For each component, what are the state-of-the-art solutions?"
- "What are the security considerations per component?"
- "What are the cost implications of each approach?"
**Mode B** (Solution Assessment):
- "What are the weak points and potential problems in the existing draft?"
- "What are the security vulnerabilities in the proposed architecture?"
- "Where are the performance bottlenecks?"
- "What solutions exist for each identified issue?"
**General sub-question patterns** (use when applicable):
- **Sub-question A**: "What is X and how does it work?" (Definition & mechanism)
- **Sub-question B**: "What are the dimensions of relationship/difference between X and Y?" (Comparative analysis)
- **Sub-question C**: "In what scenarios is X applicable/inapplicable?" (Boundary conditions)
- **Sub-question D**: "What are X's development trends/best practices?" (Extended analysis)
#### Perspective Rotation (MANDATORY)
For each research problem, examine it from **at least 3 different perspectives**. Each perspective generates its own sub-questions and search queries.
| Perspective | What it asks | Example queries |
|-------------|-------------|-----------------|
| **End-user / Consumer** | What problems do real users encounter? What do they wish were different? | "X problems", "X frustrations reddit", "X user complaints" |
| **Implementer / Engineer** | What are the technical challenges, gotchas, hidden complexities? | "X implementation challenges", "X pitfalls", "X lessons learned" |
| **Business / Decision-maker** | What are the costs, ROI, strategic implications? | "X total cost of ownership", "X ROI case study", "X vs Y business comparison" |
| **Contrarian / Devil's advocate** | What could go wrong? Why might this fail? What are critics saying? | "X criticism", "why not X", "X failures", "X disadvantages real world" |
| **Domain expert / Academic** | What does peer-reviewed research say? What are theoretical limits? | "X research paper", "X systematic review", "X benchmarks academic" |
| **Practitioner / Field** | What do people who actually use this daily say? What works in practice vs theory? | "X in production", "X experience report", "X after 1 year" |
Select at least 3 perspectives relevant to the problem. Document the chosen perspectives in `00_question_decomposition.md`.
#### Question Explosion (MANDATORY)
For **each sub-question**, generate **at least 3-5 search query variants** before searching. This ensures broad coverage and avoids missing relevant information due to terminology differences.
**Query variant strategies**:
- **Specificity ladder**: broad ("indoor navigation systems") → narrow ("UWB-based indoor drone navigation accuracy")
- **Negation/failure**: "X limitations", "X failure modes", "when X doesn't work"
- **Comparison framing**: "X vs Y for Z", "X alternative for Z", "X or Y which is better for Z"
- **Practitioner voice**: "X in production experience", "X real-world results", "X lessons learned"
- **Temporal**: "X 2025", "X latest developments", "X roadmap"
- **Geographic/domain**: "X in Europe", "X for defense applications", "X in agriculture"
Record all planned queries in `00_question_decomposition.md` alongside each sub-question.
**⚠️ Research Subject Boundary Definition (BLOCKING - must be explicit)**:
When decomposing questions, you must explicitly define the **boundaries of the research subject**:
| Dimension | Boundary to define | Example |
|-----------|--------------------|---------|
| **Population** | Which group is being studied? | University students vs K-12 vs vocational students vs all students |
| **Geography** | Which region is being studied? | Chinese universities vs US universities vs global |
| **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture |
| **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational |
**Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.
**📁 Save action**:
1. Read all files from INPUT_DIR to ground the research in the project context
2. Create working directory `RESEARCH_DIR/`
3. Write `00_question_decomposition.md`, including:
- Original question
- Active mode (A Phase 2 or B) and rationale
- Summary of relevant problem context from INPUT_DIR
- Classified question type and rationale
- **Research subject boundary definition** (population, geography, timeframe, level)
- List of decomposed sub-questions
- **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
- **Search query variants** for each sub-question (at least 3-5 per sub-question)
4. Write TodoWrite to track progress
### Step 2: Source Tiering & Exhaustive Web Investigation
Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). Conclusions must be traceable to L1/L2; L3/L4 serve as supplementary and validation.
**For full tier definitions, search strategies, community mining steps, and source registry templates**: Read `references/source-tiering.md`
**Tool Usage**:
- Use `WebSearch` for broad searches; `WebFetch` to read specific pages
- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation
- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
- When citing web sources, include the URL and date accessed
#### Exhaustive Search Requirements (MANDATORY)
Do not stop at the first few results. The goal is to build a comprehensive evidence base.
**Minimum search effort per sub-question**:
- Execute **all** query variants generated in Step 1's Question Explosion (at least 3-5 per sub-question)
- Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion)
- If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems
**Search broadening strategies** (use when results are thin):
- Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
- Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
- Try different geographies: search in English + search for European/Asian approaches if relevant
- Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025"
- Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report"
**Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated.
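The saturation rule can be expressed as a check over how many substantially new facts each recent search produced; the function name and the per-search counting are assumptions for illustration:

```python
def is_saturated(new_fact_counts: list, window: int = 3) -> bool:
    """Saturated when the last `window` searches produced no substantially new facts."""
    recent = new_fact_counts[-window:]
    # Need a full window of results before declaring saturation.
    return len(recent) == window and all(n == 0 for n in recent)
```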
**📁 Save action**:
For each source consulted, **immediately** append to `01_source_registry.md` using the entry template from `references/source-tiering.md`.
### Step 3: Fact Extraction & Evidence Cards
Transform sources into **verifiable fact cards**:
```markdown
## Fact Cards
### Fact 1
- **Statement**: [specific fact description]
- **Source**: [link/document section]
- **Confidence**: High/Medium/Low
### Fact 2
...
```
**Key discipline**:
- Pin down facts first, then reason
- Distinguish "what officials said" from "what I infer"
- When conflicting information is found, annotate and preserve both sides
- Annotate confidence level:
- ✅ High: Explicitly stated in official documentation
- ⚠️ Medium: Mentioned in official blog but not formally documented
- ❓ Low: Inference or from unofficial sources
**📁 Save action**:
For each extracted fact, **immediately** append to `02_fact_cards.md`:
```markdown
## Fact #[number]
- **Statement**: [specific fact description]
- **Source**: [Source #number] [link]
- **Phase**: [Phase 1 / Phase 2 / Assessment]
- **Target Audience**: [which group this fact applies to, inherited from source or further refined]
- **Confidence**: ✅/⚠️/❓
- **Related Dimension**: [corresponding comparison dimension]
```
**⚠️ Target audience in fact statements**:
- If a fact comes from a "partially overlapping" or "reference only" source, the statement **must explicitly annotate the applicable scope**
- Wrong: "The Ministry of Education banned phones in classrooms" (doesn't specify who)
- Correct: "The Ministry of Education banned K-12 students from bringing phones into classrooms (does not apply to university students)"
### Step 3.5: Iterative Deepening — Follow-Up Investigation
After initial fact extraction, review what you have found and identify **knowledge gaps and new questions** that emerged from the initial research. This step ensures the research doesn't stop at surface-level findings.
**Process**:
1. **Gap analysis**: Review fact cards and identify:
- Sub-questions with fewer than 3 high-confidence facts → need more searching
- Contradictions between sources → need tie-breaking evidence
- Perspectives (from Step 1) that have no or weak coverage → need targeted search
- Claims that rely only on L3/L4 sources → need L1/L2 verification
2. **Follow-up question generation**: Based on initial findings, generate new questions:
- "Source X claims [fact] — is this consistent with other evidence?"
- "If [approach A] has [limitation], how do practitioners work around it?"
- "What are the second-order effects of [finding]?"
- "Who disagrees with [common finding] and why?"
- "What happened when [solution] was deployed at scale?"
3. **Targeted deep-dive searches**: Execute follow-up searches focusing on:
- Specific claims that need verification
- Alternative viewpoints not yet represented
- Real-world case studies and experience reports
- Failure cases and edge conditions
- Recent developments that may change the picture
4. **Update artifacts**: Append new sources to `01_source_registry.md`, new facts to `02_fact_cards.md`
**Exit criteria**: Proceed to Step 4 when:
- Every sub-question has at least 3 facts with at least one from L1/L2
- At least 3 perspectives from Step 1 have supporting evidence
- No unresolved contradictions remain (or they are explicitly documented as open questions)
- Follow-up searches are no longer producing new substantive information
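The first exit criterion can be sketched as a predicate over fact cards grouped by sub-question; the dict shape and `tier` key are assumptions for illustration, and the remaining criteria (perspective coverage, contradictions, search saturation) require judgment that is not encoded here:

```python
def ready_for_step4(facts_by_subquestion: dict) -> bool:
    """Every sub-question needs >= 3 facts, with at least one from an L1/L2 source."""
    return all(
        len(facts) >= 3 and any(f["tier"] in ("L1", "L2") for f in facts)
        for facts in facts_by_subquestion.values()
    )
```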
### Step 4: Build Comparison/Analysis Framework
Based on the question type, select fixed analysis dimensions. **For dimension lists** (General, Concept Comparison, Decision Support): Read `references/comparison-frameworks.md`
**📁 Save action**:
Write to `03_comparison_framework.md`:
```markdown
# Comparison Framework
## Selected Framework Type
[Concept Comparison / Decision Support / ...]
## Selected Dimensions
1. [Dimension 1]
2. [Dimension 2]
...
## Initial Population
| Dimension | X | Y | Factual Basis |
|-----------|---|---|---------------|
| [Dimension 1] | [description] | [description] | Fact #1, #3 |
| ... | | | |
```
### Step 5: Reference Point Baseline Alignment
Ensure all compared parties have clear, consistent definitions:
**Checklist**:
- [ ] Is the reference point's definition stable/widely accepted?
- [ ] Does it need verification, or can domain common knowledge be used?
- [ ] Does the reader's understanding of the reference point match mine?
- [ ] Are there ambiguities that need to be clarified first?
### Step 6: Fact-to-Conclusion Reasoning Chain
Explicitly write out the "fact → comparison → conclusion" reasoning process:
```markdown
## Reasoning Process
### Regarding [Dimension Name]
1. **Fact confirmation**: According to [source], X's mechanism is...
2. **Compare with reference**: While Y's mechanism is...
3. **Conclusion**: Therefore, the difference between X and Y on this dimension is...
```
**Key discipline**:
- Conclusions come from mechanism comparison, not "gut feelings"
- Every conclusion must be traceable to specific facts
- Uncertain conclusions must be annotated
**📁 Save action**:
Write to `04_reasoning_chain.md`:
```markdown
# Reasoning Chain
## Dimension 1: [Dimension Name]
### Fact Confirmation
According to [Fact #X], X's mechanism is...
### Reference Comparison
While Y's mechanism is... (Source: [Fact #Y])
### Conclusion
Therefore, the difference between X and Y on this dimension is...
### Confidence
✅/⚠️/❓ + rationale
---
## Dimension 2: [Dimension Name]
...
```
### Step 7: Use-Case Validation (Sanity Check)
Validate conclusions against a typical scenario:
**Validation questions**:
- Based on my conclusions, how should this scenario be handled?
- Is that actually the case?
- Are there counterexamples that need to be addressed?
**Review checklist**:
- [ ] Are draft conclusions consistent with Step 3 fact cards?
- [ ] Are there any important dimensions missed?
- [ ] Is there any over-extrapolation?
- [ ] Are conclusions actionable/verifiable?
**📁 Save action**:
Write to `05_validation_log.md`:
```markdown
# Validation Log
## Validation Scenario
[Scenario description]
## Expected Based on Conclusions
If using X: [expected behavior]
If using Y: [expected behavior]
## Actual Validation Results
[actual situation]
## Counterexamples
[yes/no, describe if yes]
## Review Checklist
- [x] Draft conclusions consistent with fact cards
- [x] No important dimensions missed
- [x] No over-extrapolation
- [ ] Issue found: [if any]
## Conclusions Requiring Revision
[if any]
```
### Step 8: Deliverable Formatting
Make the output **readable, traceable, and actionable**.
**📁 Save action**:
Integrate all intermediate artifacts. Write to `OUTPUT_DIR/solution_draft##.md` using the appropriate output template based on active mode:
- Mode A: `templates/solution_draft_mode_a.md`
- Mode B: `templates/solution_draft_mode_b.md`
Sources to integrate:
- Extract background from `00_question_decomposition.md`
- Reference key facts from `02_fact_cards.md`
- Organize conclusions from `04_reasoning_chain.md`
- Generate references from `01_source_registry.md`
- Supplement with use cases from `05_validation_log.md`
- For Mode A: include AC assessment from `00_ac_assessment.md`
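Resolving the draft file name itself is mechanical: scan OUTPUT_DIR for existing drafts and increment. A sketch under the assumption of two-digit zero-padded numbering (`solution_draft01.md`, `solution_draft02.md`, …):

```python
import re
from pathlib import Path

DRAFT = re.compile(r"solution_draft(\d+)\.md$")

def next_draft_path(output_dir: str) -> Path:
    """Resolve OUTPUT_DIR/solution_draft##.md for the next draft (01 if none exist)."""
    out = Path(output_dir)
    numbers = [
        int(m.group(1))
        for p in out.glob("solution_draft*.md")
        if (m := DRAFT.search(p.name))
    ]
    return out / f"solution_draft{max(numbers, default=0) + 1:02d}.md"
```

In Mode B this yields N+1 relative to the draft just assessed; in Mode A on a fresh directory it yields `solution_draft01.md`.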
## Solution Draft Output Templates
### Mode A: Initial Research Output
Use template: `templates/solution_draft_mode_a.md`
### Mode B: Solution Assessment Output
Use template: `templates/solution_draft_mode_b.md`
## Stakeholder Perspectives
Adjust content depth based on audience:
| Audience | Focus | Depth |
|----------|-------|-------|
| **Implementers** | Specific mechanisms, how-to | Detailed, emphasize how to do it |
| **Technical experts** | Details, boundary conditions, limitations | In-depth, emphasize accuracy |
## Output Files
Default intermediate artifacts location: `RESEARCH_DIR/`
**Required files** (automatically generated through the process):
| File | Content | When Generated |
|------|---------|----------------|
| `00_ac_assessment.md` | AC & restrictions assessment (Mode A only) | After Phase 1 completion |
| `00_question_decomposition.md` | Question type, sub-question list | After Step 0-1 completion |
| `01_source_registry.md` | All source links and summaries | Continuously updated during Step 2 |
| `02_fact_cards.md` | Extracted facts and sources | Continuously updated during Step 3 |
| `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion |
| `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion |
| `05_validation_log.md` | Use-case validation and review | After Step 7 completion |
| `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion |
| `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) |
| `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) |
**Optional files**:
- `raw/*.md` - Raw source archives (saved when content is lengthy)
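The required-files table above can be enforced with a pre-Step-8 check. A sketch; the Mode A flag is assumed to come from the mode-detection step, and "non-empty" is approximated by file size:

```python
from pathlib import Path
from typing import List

# Intermediate artifacts expected in RESEARCH_DIR for both modes
REQUIRED = [
    "00_question_decomposition.md",
    "01_source_registry.md",
    "02_fact_cards.md",
    "03_comparison_framework.md",
    "04_reasoning_chain.md",
    "05_validation_log.md",
]

def missing_artifacts(research_dir: str, mode_a: bool = True) -> List[str]:
    """List required intermediate artifacts absent or empty in RESEARCH_DIR."""
    names = (["00_ac_assessment.md"] if mode_a else []) + REQUIRED
    root = Path(research_dir)
    return [
        n for n in names
        if not (root / n).is_file() or (root / n).stat().st_size == 0
    ]
```

A non-empty return value means Step 8 should not run yet; each listed file points back at the step that produces it.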
## Methodology Quick Reference Card
```
┌──────────────────────────────────────────────────────────────────┐
│ Deep Research — Mode-Aware 8-Step Method │
├──────────────────────────────────────────────────────────────────┤
│ CONTEXT: Resolve mode (project vs standalone) + set paths │
│ GUARDRAILS: Check INPUT_DIR/INPUT_FILE exists + required files │
│ MODE DETECT: solution_draft*.md in 01_solution? → A or B │
│ │
│ MODE A: Initial Research │
│ Phase 1: AC & Restrictions Assessment (BLOCKING) │
│ Phase 2: Full 8-step → solution_draft##.md │
│ Phase 3: Tech Stack Consolidation (OPTIONAL) → tech_stack.md │
│ Phase 4: Security Deep Dive (OPTIONAL) → security_analysis.md │
│ │
│ MODE B: Solution Assessment │
│ Read latest draft → Full 8-step → solution_draft##.md (N+1) │
│ Optional: Phase 3 / Phase 4 on revised draft │
│ │
│ 8-STEP ENGINE: │
│ 0. Classify question type → Select framework template │
│ 0.5 Novelty sensitivity → Time windows for sources │
│ 1. Decompose question → sub-questions + perspectives + queries │
│ → Perspective Rotation (3+ viewpoints, MANDATORY) │
│ → Question Explosion (3-5 query variants per sub-Q) │
│ 2. Exhaustive web search → L1 > L2 > L3 > L4, broad coverage │
│ → Execute ALL query variants, search until saturation │
│ 3. Extract facts → Each with source, confidence level │
│ 3.5 Iterative deepening → gaps, contradictions, follow-ups │
│ → Keep searching until exit criteria met │
│ 4. Build framework → Fixed dimensions, structured compare │
│ 5. Align references → Ensure unified definitions │
│ 6. Reasoning chain → Fact→Compare→Conclude, explicit │
│ 7. Use-case validation → Sanity check, prevent armchairing │
│ 8. Deliverable → solution_draft##.md (mode-specific format) │
├──────────────────────────────────────────────────────────────────┤
│ Key discipline: Ask don't assume · Facts before reasoning │
│ Conclusions from mechanism, not gut feelings │
│ Search broadly, from multiple perspectives, until saturation │
└──────────────────────────────────────────────────────────────────┘
```
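The MODE DETECT line in the card reduces to a single directory scan. A hedged sketch of that decision (the return shape is illustrative):

```python
from pathlib import Path
from typing import Optional, Tuple

def detect_mode(output_dir: str) -> Tuple[str, Optional[Path]]:
    """Mode A when OUTPUT_DIR holds no solution_draft*.md; otherwise Mode B,
    along with the latest existing draft to read first."""
    drafts = sorted(Path(output_dir).glob("solution_draft*.md"))
    return ("B", drafts[-1]) if drafts else ("A", None)
```

Lexicographic sort is sufficient here because the skill's zero-padded `##` numbering keeps name order and numeric order aligned.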
## Usage Examples
For detailed execution flow examples (Mode A initial, Mode B assessment, standalone, force override): Read `references/usage-examples.md`
## Source Verifiability Requirements
Every cited piece of external information must be directly verifiable by the user. All links must be publicly accessible (annotate `[login required]` if not), citations must include exact section/page/timestamp, and unverifiable information must be annotated `[limited source]`. Full checklist in `references/quality-checklists.md`.
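The accessibility annotations can be derived from an HTTP probe of each cited URL. A minimal sketch; the exact status-to-annotation mapping is an assumption consistent with the rules above, not something this skill prescribes:

```python
import urllib.request
import urllib.error
from typing import Optional

def annotation_for_status(status: Optional[int]) -> str:
    """Map an HTTP status (None = unreachable) to a citation annotation."""
    if status is None:
        return "[limited source]"   # could not verify at all
    if status in (401, 403):
        return "[login required]"   # exists but not publicly accessible
    if 200 <= status < 400:
        return ""                   # publicly verifiable, no annotation needed
    return "[limited source]"       # 404 / 5xx: cannot be verified

def probe(url: str, timeout: float = 10.0) -> str:
    """Fetch headers for a cited URL and return the annotation to append."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return annotation_for_status(resp.status)
    except urllib.error.HTTPError as e:
        return annotation_for_status(e.code)
    except (urllib.error.URLError, TimeoutError):
        return annotation_for_status(None)
```

Section/page/timestamp precision still has to be checked by hand; the probe only covers the public-accessibility requirement.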
Before completing the solution draft, run through the checklists in `references/quality-checklists.md`.
When replying to the user after research is complete:
**Should include**:
- Active mode used (A or B) and which optional phases were executed
- One-sentence core conclusion
- Key findings summary (3-5 points)
- Paths to optional artifacts if produced: `tech_stack.md`, `security_analysis.md`
- If there are significant uncertainties, annotate points requiring further verification
**Must not include**:
- Process file listings (e.g., `00_question_decomposition.md`, `01_source_registry.md`, etc.)
- Detailed research step descriptions
- Working directory structure display