gps-denied-desktop/.cursor/skills/research/steps/03_engine-investigation.md

## Research Engine — Investigation Phase (Steps 0–3.5)

### Step 0: Question Type Classification

First, classify the research question type and select the corresponding strategy:

| Question Type | Core Task | Focus Dimensions |
|---------------|-----------|------------------|
| **Concept Comparison** | Build comparison framework | Mechanism differences, applicability boundaries |
| **Decision Support** | Weigh trade-offs | Cost, risk, benefit |
| **Trend Analysis** | Map evolution trajectory | History, driving factors, predictions |
| **Problem Diagnosis** | Root cause analysis | Symptoms, causes, evidence chain |
| **Knowledge Organization** | Systematic structuring | Definitions, classifications, relationships |

**Mode-specific classification**:

| Mode / Phase | Typical Question Type |
|--------------|----------------------|
| Mode A Phase 1 | Knowledge Organization + Decision Support |
| Mode A Phase 2 | Decision Support |
| Mode B | Problem Diagnosis + Decision Support |

### Step 0.5: Novelty Sensitivity Assessment (BLOCKING)

Before starting research, assess the novelty sensitivity of the question (Critical/High/Medium/Low). This determines source time windows and filtering strategy.

**For full classification table, critical-domain rules, trigger words, and assessment template**: Read `references/novelty-sensitivity.md`

Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources within 6 months, mandatory version annotations, cross-validation from 2+ sources, and direct verification of official download pages.

**Save action**: Append timeliness assessment to the end of `00_question_decomposition.md`

---

### Step 1: Question Decomposition & Boundary Definition

**Mode-specific sub-questions**:

**Mode A Phase 2** (Initial Research — Problem & Solution):
- "What existing/competitor solutions address this problem?"
- "What are the component parts of this problem?"
- "For each component, what are the state-of-the-art solutions?"
- "For each component, what are the practical alternatives across simple baseline, established production option, open-source option, commercial option, current SOTA, adjacent-domain option, and no-build/defer option?"
- "What are the security considerations per component?"
- "What are the cost implications of each approach?"

**Mode B** (Solution Assessment):
- "What are the weak points and potential problems in the existing draft?"
- "What are the security vulnerabilities in the proposed architecture?"
- "Where are the performance bottlenecks?"
- "What solutions exist for each identified issue?"
- "For each component already selected in the draft, what alternatives should be considered before keeping, replacing, or rejecting it?"

**General sub-question patterns** (use when applicable):
- **Sub-question A**: "What is X and how does it work?" (Definition & mechanism)
- **Sub-question B**: "What are the dimensions of relationship/difference between X and Y?" (Comparative analysis)
- **Sub-question C**: "In what scenarios is X applicable/inapplicable?" (Boundary conditions)
- **Sub-question D**: "What are X's development trends/best practices?" (Extended analysis)

#### Perspective Rotation (MANDATORY)

For each research problem, examine it from **at least 3 different perspectives**. Each perspective generates its own sub-questions and search queries.

| Perspective | What it asks | Example queries |
|-------------|-------------|-----------------|
| **End-user / Consumer** | What problems do real users encounter? What do they wish were different? | "X problems", "X frustrations reddit", "X user complaints" |
| **Implementer / Engineer** | What are the technical challenges, gotchas, hidden complexities? | "X implementation challenges", "X pitfalls", "X lessons learned" |
| **Business / Decision-maker** | What are the costs, ROI, strategic implications? | "X total cost of ownership", "X ROI case study", "X vs Y business comparison" |
| **Contrarian / Devil's advocate** | What could go wrong? Why might this fail? What are critics saying? | "X criticism", "why not X", "X failures", "X disadvantages real world" |
| **Domain expert / Academic** | What does peer-reviewed research say? What are theoretical limits? | "X research paper", "X systematic review", "X benchmarks academic" |
| **Practitioner / Field** | What do people who actually use this daily say? What works in practice vs theory? | "X in production", "X experience report", "X after 1 year" |

Select at least 3 perspectives relevant to the problem. Document the chosen perspectives in `00_question_decomposition.md`.

#### Question Explosion (MANDATORY)

For **each sub-question**, generate **at least 3-5 search query variants** before searching. This ensures broad coverage and avoids missing relevant information due to terminology differences.

**Query variant strategies**:
- **Specificity ladder**: broad ("indoor navigation systems") → narrow ("UWB-based indoor drone navigation accuracy")
- **Negation/failure**: "X limitations", "X failure modes", "when X doesn't work"
- **Comparison framing**: "X vs Y for Z", "X alternative for Z", "X or Y which is better for Z"
- **Practitioner voice**: "X in production experience", "X real-world results", "X lessons learned"
- **Temporal**: "X 2025", "X latest developments", "X roadmap"
- **Geographic/domain**: "X in Europe", "X for defense applications", "X in agriculture"

Record all planned queries in `00_question_decomposition.md` alongside each sub-question.

#### Component Option Breadth (MANDATORY)

Before Step 2, identify the component areas implied by the problem and create a search plan for options in each area. A component area is any replaceable tool, library, model, service, algorithm, data format, protocol, infrastructure pattern, or validation approach that could materially affect the solution.

For every component area, generate search queries for these option families unless clearly not applicable:
- **Simple baseline**: low-complexity classical or manual approach that can serve as a fallback or regression baseline.
- **Established production option**: mature library/service/pattern with field usage.
- **Open-source candidate**: permissive-license option with inspectable implementation and community history.
- **Commercial/vendor option**: paid or vendor-supported option, including SDK/platform constraints.
- **Current SOTA / research option**: recent model, paper, or benchmark leader that may be promising but immature.
- **Adjacent-domain option**: solution from a neighboring domain with similar constraints.
- **No-build / defer option**: whether the component can be avoided, simplified, or moved out of scope.
- **Known bad option**: candidate or family that appears attractive but has documented failure modes or disqualifiers.

For each component area, record:
- Candidate names and option families to search.
- At least 5 query variants covering alternatives, comparisons, limitations, licensing, runtime/scale, and exact project constraints.
- The minimum evidence needed to mark a candidate `Selected`, `Rejected`, `Experimental only`, or `Needs user decision`.

Add this as a "Component Option Search Plan" section in `00_question_decomposition.md`.

**Research Subject Boundary Definition (BLOCKING - must be explicit)**:

When decomposing questions, you must explicitly define the **boundaries of the research subject**:

| Dimension | Boundary to define | Example |
|-----------|--------------------|---------|
| **Population** | Which group is being studied? | University students vs K-12 vs vocational students vs all students |
| **Geography** | Which region is being studied? | Chinese universities vs US universities vs global |
| **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture |
| **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational |
| **Operating context** | What exact environment, lifecycle phase, and runtime conditions must the solution support? | In-flight embedded runtime vs offline post-processing; production web traffic vs admin batch job |
| **Required interfaces** | What inputs, outputs, protocols, data shapes, and ownership boundaries are fixed? | One camera vs stereo rig; REST API vs message queue; local file boundary vs service API |
| **Non-functional envelope** | What latency, throughput, storage, memory, availability, safety, security, cost, and maintainability targets are binding? | <400 ms p95, 8 GB RAM, 99.9% availability, reversible migrations |

**Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.

#### Decomposition Completeness Audit (MANDATORY)

After generating sub-questions, verify the decomposition covers all major dimensions of the problem — not just the ones that came to mind first.

1. **Domain discovery search**: Search the web for "key factors when [problem domain]" / "what to consider when [problem domain]" (e.g., "key factors GPS-denied navigation", "what to consider when choosing an edge deployment strategy"). Extract dimensions that practitioners and domain experts consider important but are absent from the current sub-questions.
2. **Run completeness probes**: Walk through each probe in `references/comparison-frameworks.md` → "Decomposition Completeness Probes" against the current sub-question list. For each probe, note whether it is covered, not applicable (state why), or missing.
3. **Fill gaps**: Add sub-questions (with search query variants) for any uncovered area. Do this before proceeding to Step 2.

Record the audit result in `00_question_decomposition.md` as a "Completeness Audit" section.

**Save action**:
1. Read all files from INPUT_DIR to ground the research in the project context
2. Create working directory `RESEARCH_DIR/`
3. Write `00_question_decomposition.md`, including:
   - Original question
   - Active mode (A Phase 2 or B) and rationale
   - Summary of relevant problem context from INPUT_DIR
   - Classified question type and rationale
   - **Research subject boundary definition** (population, geography, timeframe, level)
   - **Project Constraint Matrix summary** (operating context, required interfaces, non-functional envelope, lifecycle assumptions, and hard disqualifiers extracted from input files)
   - List of decomposed sub-questions
   - **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
   - **Search query variants** for each sub-question (at least 3-5 per sub-question)
   - **Component Option Search Plan** (component areas, option families, candidate names, query variants, required evidence)
   - **Completeness audit** (taxonomy cross-reference + domain discovery results)
4. Write TodoWrite to track progress

---

### Step 2: Source Tiering & Exhaustive Web Investigation

Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). Conclusions must be traceable to L1/L2; L3/L4 serve as supplementary and validation.

**For full tier definitions, search strategies, community mining steps, and source registry templates**: Read `references/source-tiering.md`

**Tool Usage**:
- Use `WebSearch` for broad searches; `WebFetch` to read specific pages
- Use the `context7` MCP server (`resolve-library-id` then `query-docs` / `get-library-docs`) for up-to-date library/framework documentation. **Mandatory per lead candidate** — see "API Capability Verification" below.
- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
- When citing web sources, include the URL and date accessed

#### Exhaustive Search Requirements (MANDATORY)

Do not stop at the first few results. The goal is to build a comprehensive evidence base.

**Minimum search effort per sub-question**:
- Execute **all** query variants generated in Step 1's Question Explosion (at least 3-5 per sub-question)
- Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion)
- If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems

**Minimum search effort per component area**:
- Search every option family from the "Component Option Search Plan" before choosing a lead candidate.
- For each lead, fallback, or rejected candidate, search at least one official/source-of-truth page and at least one independent validation source when available.
- Search `"[component] alternatives"`, `"[candidate] vs [alternative]"`, `"[candidate] limitations"`, `"[candidate] license"`, `"[candidate] production"`, and `"[candidate] [binding project constraint]"`.
- If fewer than 3 realistic candidates are found for a component area, explicitly document why the landscape is narrow and search adjacent domains before accepting that result.
- Include at least one simple baseline and one "do not use" or disqualified candidate per component area when possible; these prevent false confidence in the selected option.

**Candidate implementation-limit searches (MANDATORY)**:
For every component/tool/library/service/pattern/algorithm that may be selected or recommended, search for its intrinsic implementation constraints. Do not rely on product category labels, marketing summaries, or examples from a different operating context. Include query variants for:
- Official supported inputs/outputs, protocols, data formats, and deployment modes
- Required hardware/runtime/platform/version constraints
- Timing, throughput, memory, storage, synchronization, and scaling assumptions
- Lifecycle assumptions: offline vs online, batch vs real time, development vs production, single tenant vs multi tenant, local vs networked
- Known unsupported scenarios, limitations, issue reports, production failures, and workarounds
- Licensing, security, maintenance, and community-health constraints
- Exact phrases from the project's restrictions and acceptance criteria combined with the candidate name

**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for lead candidates)**:

**Applicability**: this section applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section, and only to lead candidates that are libraries/SDKs/frameworks/services/protocols/data formats with multiple modes or configurations. For non-technical research (concept comparison, market/policy investigation, knowledge organization, root-cause analysis without tooling commitments), skip this entire sub-section and continue with the rest of Step 2 — the broader candidate implementation-limit search above is sufficient. State the skip explicitly once in `02_fact_cards.md`: `API Capability Verification: not applicable — this run is a Non-technical investigation, no library/SDK/service candidates`.

Most libraries/SDKs/services expose **multiple modes or configurations** (e.g., monocular vs stereo VO, sync vs async API, batch vs streaming inference, write-through vs write-behind cache). Selecting a candidate "because it supports X" without pinning *which mode* the project will use, and *whether that exact mode produces the required outputs from the required inputs*, is the most common silent-failure path in research. A library can support a class of problem in mode A while being unusable for the project's specific configuration in mode B.

For every lead candidate that is a library/SDK/framework/service with multiple modes or configurations, do the following — in this order, before marking the candidate `Selected`:

1. **Pin the exact mode/configuration the project will use.**
   Derived from the Project Constraint Matrix: which inputs are available (sensor count, sensor types, data shapes, rates), which outputs are required (per `acceptance_criteria.md` and contract files), which hardware/runtime is fixed (per `restrictions.md`). Write this as a single sentence: "We will use `<library>` in `<mode/config>` with inputs `<list>` and expect outputs `<list>` on `<runtime>`." Do not progress past this step on a vague mode description.

2. **Run `context7` (or equivalent docs lookup) for the candidate** — this is **mandatory for every lead library/SDK/framework candidate**, not optional. Minimum three queries per candidate:
   1. *Mode enumeration*: "What modes/configurations does `<library>` support? List every value of the mode/config enum and what each requires as input."
   2. *Project's exact mode*: "Show a minimum runnable example of `<library>` in `<the pinned mode>` with `<the project's input shape>`. What does it produce?"
   3. *Disqualifier probe*: "Does `<library>` `<the pinned mode>` produce `<the required output>`? Are there published limitations of `<the pinned mode>` for `<the project's runtime/hardware>`?"

   For services without context7 coverage, use official docs site + WebFetch on the API reference page + the project's example/tutorial directory in the source repo. Append every consulted URL to `01_source_registry.md`.

3. **Save a Minimum Viable Example (MVE) for the pinned mode.**
   Append to `02_fact_cards.md` (or a sibling `02_mve_evidence.md`) at least one block per lead library candidate with:

   ```markdown
   ## MVE — <library> in <pinned mode>
   - **Source**: <official URL or context7 reference, with date>
   - **Inputs in the example**: <e.g., 2 calibrated cameras + IMU at 200 Hz>
   - **Outputs in the example**: <e.g., 6-DoF pose with covariance>
   - **Project inputs**: <e.g., 1 camera + IMU at 200 Hz>
   - **Project outputs required**: <e.g., 6-DoF pose with metric translation>
   - **Match assessment**: ✅ exact match / ⚠️ partial (specify dimension) / ❌ mismatch (specify dimension)
   - **If ⚠️ or ❌**: cite the official-docs sentence that establishes the mismatch.
   ```

   If no official example covers the project's exact configuration → the candidate cannot be marked `Selected` based on category fit alone. Status must be `Experimental only` (with required-evidence note) or `Rejected` (when the docs explicitly disqualify the configuration).

4. **Bind every numbered Restriction and Acceptance Criterion to the candidate's pinned mode.**
   For each numbered line in `restrictions.md` and `acceptance_criteria.md`, decide one of: `Pass` (the pinned mode satisfies it with cited evidence), `Fail` (the pinned mode contradicts it with cited evidence), `Verify` (no evidence either way; deeper investigation required), `N/A` (the line is irrelevant to this component area). Record this in `02_fact_cards.md` under the candidate's MVE block. The structural matrix in Step 7.5 reads from these bindings.

5. **Treat "the same library in a different mode" as a different candidate.**
   If the project's pinned mode is `Monocular` but the only documented evidence covers `Stereo`, do not silently soften "rotation only" into "rotation + translation". Open a separate candidate row for the Monocular mode, with its own MVE, fit assessment, and disqualifiers. Two modes of one library are two distinct candidates for the purposes of this gate.

**Common silent-failure pattern this guards against**: a fact card paraphrases the docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes". A category-level "Selected" decision then carries through every downstream artifact, masking that the project's required A+B combination does not exist as a single mode.

**Search broadening strategies** (use when results are thin):
- Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
- Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
- Try different geographies: search in English + search for European/Asian approaches if relevant
- Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025"
- Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report"
- Try disqualifier probes: "X unsupported", "X limitations", "X requirements", "X with [project constraint]", "X without [required input]", "X real-time [target]", "X production failure"

**Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated.

**Save action**:
For each source consulted, **immediately** append to `01_source_registry.md` using the entry template from `references/source-tiering.md`.

---

### Step 3: Fact Extraction & Evidence Cards

Transform sources into **verifiable fact cards**:

```markdown
## Fact Cards

### Fact 1
- **Statement**: [specific fact description]
- **Source**: [link/document section]
- **Confidence**: High/Medium/Low

### Fact 2
...
```

**Key discipline**:
- Pin down facts first, then reason
- Distinguish "what officials said" from "what I infer"
- When conflicting information is found, annotate and preserve both sides
- Annotate confidence level:
  - ✅ High: Explicitly stated in official documentation
  - ⚠️ Medium: Mentioned in official blog but not formally documented
  - ❓ Low: Inference or from unofficial sources

**Save action**:
For each extracted fact, **immediately** append to `02_fact_cards.md`:
```markdown
## Fact #[number]
- **Statement**: [specific fact description]
- **Source**: [Source #number] [link]
- **Phase**: [Phase 1 / Phase 2 / Assessment]
- **Target Audience**: [which group this fact applies to, inherited from source or further refined]
- **Confidence**: ✅/⚠️/❓
- **Related Dimension**: [corresponding comparison dimension]
- **Fit Impact**: [supports selection / disqualifies / makes experimental / needs user decision]
```

**Target audience in fact statements**:
- If a fact comes from a "partially overlapping" or "reference only" source, the statement **must explicitly annotate the applicable scope**
- Wrong: "The Ministry of Education banned phones in classrooms" (doesn't specify who)
- Correct: "The Ministry of Education banned K-12 students from bringing phones into classrooms (does not apply to university students)"

---

### Step 3.5: Iterative Deepening — Follow-Up Investigation

After initial fact extraction, review what you have found and identify **knowledge gaps and new questions** that emerged from the initial research. This step ensures the research doesn't stop at surface-level findings.

**Process**:

1. **Gap analysis**: Review fact cards and identify:
   - Sub-questions with fewer than 3 high-confidence facts → need more searching
   - Contradictions between sources → need tie-breaking evidence
   - Perspectives (from Step 1) that have no or weak coverage → need targeted search
   - Claims that rely only on L3/L4 sources → need L1/L2 verification

2. **Follow-up question generation**: Based on initial findings, generate new questions:
   - "Source X claims [fact] — is this consistent with other evidence?"
   - "If [approach A] has [limitation], how do practitioners work around it?"
   - "What are the second-order effects of [finding]?"
   - "Who disagrees with [common finding] and why?"
   - "What happened when [solution] was deployed at scale?"

3. **Targeted deep-dive searches**: Execute follow-up searches focusing on:
   - Specific claims that need verification
   - Alternative viewpoints not yet represented
   - Real-world case studies and experience reports
   - Failure cases and edge conditions
   - Recent developments that may change the picture

4. **Update artifacts**: Append new sources to `01_source_registry.md`, new facts to `02_fact_cards.md`

**Exit criteria**: Proceed to Step 4 when:
- Every sub-question has at least 3 facts with at least one from L1/L2
- At least 3 perspectives from Step 1 have supporting evidence
- No unresolved contradictions remain (or they are explicitly documented as open questions)
- Follow-up searches are no longer producing new substantive information