Enhance research methodology documentation by adding new guidelines for internet search depth, multi-perspective analysis, and question reformulation. Update quality checklists and source tiering strategies to emphasize comprehensive search practices and verification of findings from diverse sources.

Oleksandr Bezdieniezhnykh
2026-03-21 22:45:49 +02:00
parent 2a17590248
commit 60ebe686ff
3 changed files with 147 additions and 29 deletions
@@ -27,6 +27,9 @@ Transform vague topics raised by users into high-quality, deliverable research r
- **Prioritize authoritative sources: L1 > L2 > L3 > L4**
- **Intermediate results must be saved for traceability and reuse**
- **Ask, don't assume** — when any aspect of the problem, criteria, or restrictions is unclear, STOP and ask the user before proceeding
- **Internet-first investigation** — do not rely on training data for factual claims; search the web extensively for every sub-question, rephrase queries when results are thin, and keep searching until you have converging evidence from multiple independent sources
- **Multi-perspective analysis** — examine every problem from at least 3 different viewpoints (e.g., end-user, implementer, business decision-maker, contrarian, domain expert, field practitioner); each perspective should generate its own search queries
- **Question multiplication** — for each sub-question, generate multiple reformulated search queries (synonyms, related terms, negations, "what can go wrong" variants, practitioner-focused variants) to maximize coverage and uncover blind spots
## Context Resolution
@@ -153,18 +156,20 @@ A focused preliminary research pass **before** the main solution research. The g
- Ambiguous acceptance criteria values → ask
- Missing context (no `security_approach.md`, no `input_data/`) → ask what they have
- Conflicting restrictions → ask which takes priority
3. Research on the internet **extensively** — use multiple search queries per question, rephrase, and search from different angles:
- How realistic are the acceptance criteria for this specific domain? Search for industry benchmarks, standards, and typical values
- How critical is each criterion? Search for case studies where criteria were relaxed or tightened
- What domain-specific acceptance criteria are we missing? Search for industry standards, regulatory requirements, and best practices in the specific domain
- Impact of each criterion value on the whole system quality — search for research papers and engineering reports
- Cost/budget implications of each criterion — search for pricing, total cost of ownership analyses, and comparable project budgets
- Timeline implications — search for project timelines, development velocity reports, and comparable implementations
- What do practitioners in this domain consider the most important criteria? Search forums, conference talks, and experience reports
4. Research restrictions from multiple perspectives:
- Are the restrictions realistic? Search for comparable projects that operated under similar constraints
- Should any be tightened or relaxed? Search for what constraints similar projects actually ended up with
- Are there additional restrictions we should add? Search for regulatory, compliance, and safety requirements in this domain
- What restrictions do practitioners wish they had defined earlier? Search for post-mortem reports and lessons learned
5. Verify findings with authoritative sources (official docs, papers, benchmarks) — each key finding must have at least 2 independent sources
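The two-independent-sources rule in item 5 can be checked mechanically. A minimal sketch, assuming key findings are tracked as finding ID → list of source URLs; the data shape and the `unverified` helper are illustrative, not part of this skill's file formats:

```python
# Sketch: flag findings backed by fewer than 2 independent sources.
# Heuristic: sources on the same domain are not independent.
from urllib.parse import urlparse

def independent_domains(urls: list[str]) -> set[str]:
    """Distinct web domains backing a finding."""
    return {urlparse(u).netloc for u in urls}

def unverified(findings: dict[str, list[str]]) -> list[str]:
    """Return finding IDs that lack 2+ independent sources."""
    return [fid for fid, urls in findings.items()
            if len(independent_domains(urls)) < 2]

weak = unverified({
    "AC-1": ["https://docs.example.com/a", "https://docs.example.com/b"],
    "AC-2": ["https://docs.example.com/a", "https://forum.other.org/t/1"],
})
# AC-1 is flagged: two pages from one site are not independent sources
```

Domain comparison is a crude independence proxy (mirrors and syndicated content defeat it), but it catches the common failure of citing one site twice.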
**Uses Steps 0-3 of the 8-step engine** (question classification, decomposition, source tiering, fact extraction) scoped to AC and restrictions assessment.
@@ -205,12 +210,14 @@ Full 8-step research methodology. Produces the first solution draft.
**Input**: All files from INPUT_DIR (possibly updated after Phase 1) + Phase 1 artifacts
**Task** (drives the 8-step engine):
1. Research existing/competitor solutions for similar problems — search broadly across industries and adjacent domains, not just the obvious competitors
2. Research the problem thoroughly — all possible ways to solve it, split into components; search for how different fields approach analogous problems
3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
4. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
5. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
6. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
7. Include security considerations in each component analysis
8. Provide rough cost estimates for proposed solutions
Be concise in formulating: the fewer words, the better, but do not miss any important details.
@@ -272,11 +279,17 @@ Full 8-step research methodology applied to assessing and improving an existing
**Task** (drives the 8-step engine):
1. Read the existing solution draft thoroughly
2. Research on the internet extensively — for each component/decision in the draft, search for:
- Known problems and limitations of the chosen approach
- What practitioners say about using it in production
- Better alternatives that may have emerged recently
- Common failure modes and edge cases
- How competitors/similar projects solve the same problem differently
3. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
4. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
5. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
6. For each identified weak point, search for multiple solution approaches and compare them
7. Based on findings, form a new solution draft in the same format
**📁 Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
@@ -311,9 +324,10 @@ When the user wants to:
- "comparative analysis", "concept comparison", "technical comparison" - "comparative analysis", "concept comparison", "technical comparison"
**Differentiation from other Skills**: **Differentiation from other Skills**:
- Needs a **visual knowledge graph** → use `research-to-diagram`
- Needs **written output** (articles/tutorials) → use `wsy-writer`
- Needs **material organization** → use `material-to-markdown`
- Needs **research + solution draft** → use this Skill
- Needs **security audit** → use `/security`
- Needs **existing codebase documented** → use `/document`
## Research Engine (8-Step Method)
@@ -374,6 +388,35 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
- **Sub-question C**: "In what scenarios is X applicable/inapplicable?" (Boundary conditions)
- **Sub-question D**: "What are X's development trends/best practices?" (Extended analysis)
#### Perspective Rotation (MANDATORY)
For each research problem, examine it from **at least 3 different perspectives**. Each perspective generates its own sub-questions and search queries.
| Perspective | What it asks | Example queries |
|-------------|-------------|-----------------|
| **End-user / Consumer** | What problems do real users encounter? What do they wish were different? | "X problems", "X frustrations reddit", "X user complaints" |
| **Implementer / Engineer** | What are the technical challenges, gotchas, hidden complexities? | "X implementation challenges", "X pitfalls", "X lessons learned" |
| **Business / Decision-maker** | What are the costs, ROI, strategic implications? | "X total cost of ownership", "X ROI case study", "X vs Y business comparison" |
| **Contrarian / Devil's advocate** | What could go wrong? Why might this fail? What are critics saying? | "X criticism", "why not X", "X failures", "X disadvantages real world" |
| **Domain expert / Academic** | What does peer-reviewed research say? What are theoretical limits? | "X research paper", "X systematic review", "X benchmarks academic" |
| **Practitioner / Field** | What do people who actually use this daily say? What works in practice vs theory? | "X in production", "X experience report", "X after 1 year" |
Select at least 3 perspectives relevant to the problem. Document the chosen perspectives in `00_question_decomposition.md`.
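The rotation can be sketched as a small query planner. The `PERSPECTIVES` mapping below condenses the table above into illustrative templates; it is a sketch, not a fixed API of this skill:

```python
# Sketch: pick 3+ perspectives and derive search queries from each.
# Template strings are illustrative; tune them per topic in practice.
PERSPECTIVES = {
    "end-user": ["{topic} problems", "{topic} user complaints"],
    "implementer": ["{topic} pitfalls", "{topic} lessons learned"],
    "business": ["{topic} total cost of ownership", "{topic} ROI case study"],
    "contrarian": ["why not {topic}", "{topic} criticism"],
    "academic": ["{topic} research paper", "{topic} benchmarks"],
    "practitioner": ["{topic} in production", "{topic} experience report"],
}

def rotate(topic: str, chosen: list[str]) -> dict[str, list[str]]:
    """Each chosen perspective contributes its own search queries."""
    if len(chosen) < 3:
        raise ValueError("select at least 3 perspectives (MANDATORY)")
    return {p: [t.format(topic=topic) for t in PERSPECTIVES[p]] for p in chosen}

plan = rotate("LoRa mesh networking", ["implementer", "contrarian", "practitioner"])
```

The resulting `plan` is what gets recorded per perspective in `00_question_decomposition.md`.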
#### Question Explosion (MANDATORY)
For **each sub-question**, generate **at least 3-5 search query variants** before searching. This ensures broad coverage and avoids missing relevant information due to terminology differences.
**Query variant strategies**:
- **Specificity ladder**: broad ("indoor navigation systems") → narrow ("UWB-based indoor drone navigation accuracy")
- **Negation/failure**: "X limitations", "X failure modes", "when X doesn't work"
- **Comparison framing**: "X vs Y for Z", "X alternative for Z", "X or Y which is better for Z"
- **Practitioner voice**: "X in production experience", "X real-world results", "X lessons learned"
- **Temporal**: "X 2025", "X latest developments", "X roadmap"
- **Geographic/domain**: "X in Europe", "X for defense applications", "X in agriculture"
Record all planned queries in `00_question_decomposition.md` alongside each sub-question.
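The variant strategies above can be sketched as a simple expansion step. The strategy names and templates here are an illustrative subset of the list above, not an exhaustive implementation:

```python
# Sketch: expand one sub-question into query variants per strategy.
STRATEGIES = {
    "negation": ["{q} limitations", "{q} failure modes", "when {q} doesn't work"],
    "comparison": ["{q} vs alternatives", "{q} alternative approaches"],
    "practitioner": ["{q} in production experience", "{q} lessons learned"],
    "temporal": ["{q} 2025", "{q} latest developments"],
}

def explode(sub_question: str) -> list[str]:
    """Generate query variants to record in 00_question_decomposition.md."""
    return [t.format(q=sub_question)
            for templates in STRATEGIES.values() for t in templates]

queries = explode("UWB indoor drone navigation")
```

In practice the specificity-ladder and geographic/domain variants need hand-written terms per topic; template expansion only covers the mechanical reformulations.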
**⚠️ Research Subject Boundary Definition (BLOCKING - must be explicit)**:
When decomposing questions, you must explicitly define the **boundaries of the research subject**:
@@ -397,9 +440,11 @@ When decomposing questions, you must explicitly define the **boundaries of the r
- Classified question type and rationale
- **Research subject boundary definition** (population, geography, timeframe, level)
- List of decomposed sub-questions
- **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
- **Search query variants** for each sub-question (at least 3-5 per sub-question)
4. Write TodoWrite to track progress
### Step 2: Source Tiering & Exhaustive Web Investigation
Tier sources by authority and **prioritize primary sources** (L1 > L2 > L3 > L4). Conclusions must be traceable to L1/L2; L3/L4 serve as supplementary and validation.
@@ -411,6 +456,24 @@ Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). C
- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
- When citing web sources, include the URL and date accessed
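The traceability rule can be expressed as a one-line check. A minimal sketch, assuming each citation entry carries a `tier` field as recorded in `01_source_registry.md` (the dict shape is an assumption for illustration):

```python
# Sketch: a conclusion is traceable only if at least one citation is L1/L2.
def traceable(citations: list[dict]) -> bool:
    """L3/L4 sources are supplementary; every conclusion needs L1 or L2 backing."""
    return any(c["tier"] in ("L1", "L2") for c in citations)

ok = traceable([{"tier": "L4", "url": "community thread"},
                {"tier": "L1", "url": "official docs"}])
bad = traceable([{"tier": "L3", "url": "news article"},
                 {"tier": "L4", "url": "forum post"}])
```

Running this over every conclusion before Step 6 catches claims that silently rest on community sources alone.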
#### Exhaustive Search Requirements (MANDATORY)
Do not stop at the first few results. The goal is to build a comprehensive evidence base.
**Minimum search effort per sub-question**:
- Execute **all** query variants generated in Step 1's Question Explosion (at least 3-5 per sub-question)
- Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion)
- If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems
**Search broadening strategies** (use when results are thin):
- Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
- Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
- Try different geographies: search in English + search for European/Asian approaches if relevant
- Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025"
- Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report"
**Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated.
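The saturation rule amounts to a stopping condition on the search loop. A sketch under stated assumptions — `search` and `extract_facts` are hypothetical stand-ins for the agent's web search and fact extraction, not real APIs:

```python
# Sketch: run query variants until the last `window` searches add nothing new.
def search_until_saturated(queries, search, extract_facts, window: int = 3):
    """Stop when `window` consecutive searches yield no previously unseen facts."""
    seen: set[str] = set()
    fruitless = 0  # consecutive searches that only repeated known facts
    for q in queries:
        new = set(extract_facts(search(q))) - seen
        seen |= new
        fruitless = 0 if new else fruitless + 1
        if fruitless >= window:  # saturation: last 3 searches repeated old facts
            break
    return seen
```

Note the trade-off: a small `window` can stop before a late query surfaces something genuinely new, which is why the broadening strategies above should be tried before declaring saturation.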
**📁 Save action**:
For each source consulted, **immediately** append to `01_source_registry.md` using the entry template from `references/source-tiering.md`.
@@ -456,6 +519,40 @@ For each extracted fact, **immediately** append to `02_fact_cards.md`:
- Wrong: "The Ministry of Education banned phones in classrooms" (doesn't specify who)
- Correct: "The Ministry of Education banned K-12 students from bringing phones into classrooms (does not apply to university students)"
### Step 3.5: Iterative Deepening — Follow-Up Investigation
After initial fact extraction, review what you have found and identify **knowledge gaps and new questions** that emerged from the initial research. This step ensures the research doesn't stop at surface-level findings.
**Process**:
1. **Gap analysis**: Review fact cards and identify:
- Sub-questions with fewer than 3 high-confidence facts → need more searching
- Contradictions between sources → need tie-breaking evidence
- Perspectives (from Step 1) that have no or weak coverage → need targeted search
- Claims that rely only on L3/L4 sources → need L1/L2 verification
2. **Follow-up question generation**: Based on initial findings, generate new questions:
- "Source X claims [fact] — is this consistent with other evidence?"
- "If [approach A] has [limitation], how do practitioners work around it?"
- "What are the second-order effects of [finding]?"
- "Who disagrees with [common finding] and why?"
- "What happened when [solution] was deployed at scale?"
3. **Targeted deep-dive searches**: Execute follow-up searches focusing on:
- Specific claims that need verification
- Alternative viewpoints not yet represented
- Real-world case studies and experience reports
- Failure cases and edge conditions
- Recent developments that may change the picture
4. **Update artifacts**: Append new sources to `01_source_registry.md`, new facts to `02_fact_cards.md`
**Exit criteria**: Proceed to Step 4 when:
- Every sub-question has at least 3 facts with at least one from L1/L2
- At least 3 perspectives from Step 1 have supporting evidence
- No unresolved contradictions remain (or they are explicitly documented as open questions)
- Follow-up searches are no longer producing new substantive information
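The exit criteria above can be sketched as a gate over the fact cards. The card fields (`sub_q`, `tier`, `perspective`, `contradicts`) are assumptions about how entries in `02_fact_cards.md` might be represented, not a defined schema:

```python
# Sketch: Step 3.5 exit gate — returns blocking reasons, empty means proceed.
def ready_for_step_4(cards: list[dict], sub_questions: list[str]) -> list[str]:
    """Check the exit criteria over extracted fact cards."""
    blockers = []
    for sq in sub_questions:
        facts = [c for c in cards if c["sub_q"] == sq]
        if len(facts) < 3:
            blockers.append(f"{sq}: fewer than 3 facts")
        elif not any(c["tier"] in ("L1", "L2") for c in facts):
            blockers.append(f"{sq}: no L1/L2 fact")
    if len({c["perspective"] for c in cards}) < 3:
        blockers.append("fewer than 3 perspectives covered")
    if any(c.get("contradicts") for c in cards):
        blockers.append("unresolved contradictions")
    return blockers
```

A contradiction flagged here can still pass the gate in spirit if it is explicitly documented as an open question, per the criteria above.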
### Step 4: Build Comparison/Analysis Framework
Based on the question type, select fixed analysis dimensions. **For dimension lists** (General, Concept Comparison, Decision Support): Read `references/comparison-frameworks.md`
@@ -657,9 +754,15 @@ Default intermediate artifacts location: `RESEARCH_DIR/`
│                                                                  │
│  8-STEP ENGINE:                                                  │
│  0. Classify question type → Select framework template           │
│  0.5 Novelty sensitivity → Time windows for sources              │
│  1. Decompose question → sub-questions + perspectives + queries  │
│     → Perspective Rotation (3+ viewpoints, MANDATORY)            │
│     → Question Explosion (3-5 query variants per sub-Q)          │
│  2. Exhaustive web search → L1 > L2 > L3 > L4, broad coverage    │
│     → Execute ALL query variants, search until saturation        │
│  3. Extract facts → Each with source, confidence level           │
│  3.5 Iterative deepening → gaps, contradictions, follow-ups      │
│     → Keep searching until exit criteria met                     │
│  4. Build framework → Fixed dimensions, structured compare       │
│  5. Align references → Ensure unified definitions                │
│  6. Reasoning chain → Fact→Compare→Conclude, explicit            │
@@ -667,7 +770,8 @@ Default intermediate artifacts location: `RESEARCH_DIR/`
│  8. Deliverable → solution_draft##.md (mode-specific format)     │
├──────────────────────────────────────────────────────────────────┤
│  Key discipline: Ask don't assume · Facts before reasoning       │
│      Conclusions from mechanism, not gut feelings                │
│      Search broadly, from multiple perspectives, until saturation│
└──────────────────────────────────────────────────────────────────┘
```
@@ -10,6 +10,17 @@
- [ ] Every citation can be directly verified by the user (source verifiability)
- [ ] Structure hierarchy is clear; executives can quickly locate information
## Internet Search Depth
- [ ] Every sub-question was searched with at least 3-5 different query variants
- [ ] At least 3 perspectives from the Perspective Rotation were applied and searched
- [ ] Search saturation reached: last searches stopped producing new substantive information
- [ ] Adjacent fields and analogous problems were searched, not just direct matches
- [ ] Contrarian viewpoints were actively sought ("why not X", "X criticism", "X failure")
- [ ] Practitioner experience was searched (production use, real-world results, lessons learned)
- [ ] Iterative deepening completed: follow-up questions from initial findings were searched
- [ ] No sub-question relies solely on training data without web verification
## Mode A Specific
- [ ] Phase 1 completed: AC assessment was presented to and confirmed by user
@@ -25,6 +25,9 @@
- L3/L4 serve only as supplementary and validation
- L4 community discussions are used to discover "what users truly care about"
- Record all information sources
- **Search broadly before searching deeply** — cast a wide net with multiple query variants before diving deep into any single source
- **Cross-domain search** — when direct results are sparse, search adjacent fields, analogous problems, and related industries
- **Never rely on a single search** — each sub-question requires multiple searches from different angles (synonyms, negations, practitioner language, academic language)
## Timeliness Filtering Rules (execute based on Step 0.5 sensitivity level)