detections/.cursor/skills/monorepo-discover/SKILL.md
2026-04-18 22:04:05 +03:00


name: monorepo-discover
description: Scans a monorepo or meta-repo (git-submodule aggregators, npm/cargo workspaces, etc.) and generates a human-reviewable `_docs/_repo-config.yaml` that other `monorepo-*` skills (document, cicd, onboard, status) read. Produces inferred mappings tagged with evidence; never writes to the config's `confirmed_by_user` flag; the human does that. Use on first setup in a new monorepo, or to refresh the config after structural changes.

Monorepo Discover

Writes or refreshes _docs/_repo-config.yaml — the shared config file that every other monorepo-* skill depends on. Does NOT modify any other files.

Core principle

Discovery is a suggestion, not a commitment. The skill infers repo structure, but every inferred entry is tagged with confirmed: false + evidence. Action skills (monorepo-document, monorepo-cicd, monorepo-onboard) refuse to run until the human reviews the config and sets confirmed_by_user: true.

Mitigations against LLM inference errors (applies throughout)

| Rule | What it means |
| --- | --- |
| M1 Separation | This skill never triggers other skills. It stops after writing config. |
| M2 Evidence thresholds | No mapping gets recorded without at least one signal (name match, textual reference, directory convention, explicit statement). Zero-signal candidates go under unresolved: with a question. |
| M3 Factual vs. interpretive | Resolve factual questions alone (file exists? line says what?). Ask for interpretive ones (does A feed into B?) unless M2 evidence is present. Ask for conventional ones always (commit prefix? target branch?). |
| M4 Batch questions | Accumulate all unresolved: questions. Present at end of discovery, not drip-wise. |
| M5 Skip over guess | Never record a zero-evidence mapping under components: or docs:; always put it in unresolved: with a question. |
| M6 Assumptions footer | Every run ends with an explicit list of assumptions used. Also append to assumptions_log: in the config. |
| M7 Structural drift | If the config already exists, produce a diff of what would change and ask for approval before overwriting. Never silently regenerate. |

Guardrail

This skill writes ONLY _docs/_repo-config.yaml. It never edits unified docs, CI files, or component directories. If the workflow ever pushes you to modify anything else, stop.

Workflow

Phase 1: Detect repo type

Check which of these exists (first match wins):

  1. .gitmodules → git-submodules meta-repo
  2. package.json with workspaces field → npm/yarn/pnpm workspace
  3. pnpm-workspace.yaml → pnpm workspace
  4. Cargo.toml with [workspace] section → cargo workspace
  5. go.work → go workspace
  6. Multiple top-level subfolders each with their own package.json / Cargo.toml / pyproject.toml / *.csproj → ad-hoc monorepo

If none match → ask the user what kind of monorepo this is. Don't guess.

Record in repo.type and repo.component_registry.
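
The first-match-wins probe above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the skill's actual implementation: the function name `detect_repo_type` and the returned labels (for example `git-submodules` or `adhoc-monorepo`) are hypothetical, the `[workspace]` check is a plain regex rather than a TOML parse, and hidden folders are skipped in the ad-hoc check. A return value of `None` corresponds to the "ask the user" branch.

```python
import json
import re
from pathlib import Path
from typing import Optional

# Manifest files that mark a subfolder as its own project (rule 6).
MANIFESTS = ("package.json", "Cargo.toml", "pyproject.toml")


def _has_manifest(folder: Path) -> bool:
    """True if the folder directly contains a known project manifest."""
    if any((folder / name).is_file() for name in MANIFESTS):
        return True
    return any(folder.glob("*.csproj"))


def detect_repo_type(root: Path) -> Optional[str]:
    """Return a hypothetical repo-type label, or None if nothing matched."""
    if (root / ".gitmodules").is_file():
        return "git-submodules"

    pkg = root / "package.json"
    if pkg.is_file():
        try:
            data = json.loads(pkg.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            data = {}  # unreadable manifest: treat as "no workspaces field"
        if isinstance(data, dict) and "workspaces" in data:
            return "npm-workspace"

    if (root / "pnpm-workspace.yaml").is_file():
        return "pnpm-workspace"

    cargo = root / "Cargo.toml"
    if cargo.is_file():
        text = cargo.read_text(encoding="utf-8", errors="replace")
        if re.search(r"^\s*\[workspace\]", text, re.MULTILINE):
            return "cargo-workspace"

    if (root / "go.work").is_file():
        return "go-workspace"

    # Rule 6: at least two top-level subfolders with their own manifest.
    subdirs = [p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")]
    if sum(1 for p in subdirs if _has_manifest(p)) >= 2:
        return "adhoc-monorepo"

    return None  # caller should ask the user instead of guessing
```

Because the checks run in the listed order, a repo that has both `.gitmodules` and a `Cargo.toml` workspace is reported as a submodule meta-repo.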

Phase 2: Enumerate components

Based on repo type, parse the registry and list components. For each, collect:

  • name, path
  • stack — infer from files present (.csproj → .NET, pyproject.toml → Python, Cargo.toml → Rust, package.json → Node/TS, go.mod → Go). Multiple signals → pick dominant one. No signals → stack: unknown and add to unresolved:.
  • evidence — list of signals used (e.g., [gitmodules_entry, csproj_present])

Do NOT yet populate primary_doc, secondary_docs, ci_config, or deployment_tier — those come in Phases 4 and 5.
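
As a rough sketch of the stack inference in this phase, the function below takes the file paths found in one component and returns a stack plus the evidence tags. Everything named here is an assumption for illustration: `infer_stack`, the evidence tag strings other than `csproj_present`, and in particular the reading of "dominant" as "the stack with the most matching manifest files", with ties sent to unresolved rather than guessed.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# (filename pattern, stack label, evidence tag); tags are illustrative names.
SIGNALS: List[Tuple[str, str, str]] = [
    (".csproj", ".NET", "csproj_present"),
    ("pyproject.toml", "Python", "pyproject_present"),
    ("Cargo.toml", "Rust", "cargo_toml_present"),
    ("package.json", "Node/TS", "package_json_present"),
    ("go.mod", "Go", "go_mod_present"),
]


def infer_stack(filenames: Iterable[str]) -> Dict[str, object]:
    """Infer a component's stack from the manifest files it contains."""
    counts: Counter = Counter()
    evidence: List[str] = []
    for name in filenames:
        base = name.rsplit("/", 1)[-1]
        for pattern, stack, tag in SIGNALS:
            # Patterns starting with "." are extensions; others are exact names.
            hit = base.endswith(pattern) if pattern.startswith(".") else base == pattern
            if hit:
                counts[stack] += 1
                if tag not in evidence:
                    evidence.append(tag)

    if not counts:
        # No signals: record unknown and flag for the unresolved list.
        return {"stack": "unknown", "evidence": [], "unresolved": True}

    ranked = counts.most_common()
    tie = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
    # Assumption: a tie is not "dominant", so it is surfaced rather than guessed.
    stack = "unknown" if tie else ranked[0][0]
    return {"stack": stack, "evidence": evidence, "unresolved": tie}
```

For example, a component with two `.csproj` files and one `package.json` would come back as `.NET` with both evidence tags recorded.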

Phase 3: Locate docs root

Probe in order: _docs/, docs/, documentation/, or a root-level README with links to sub-docs.

  • Multiple candidates → ask user which is canonical
  • None → docs.root: null + flag under unresolved:

Once located, classify each *.md:

  • Primary doc: filename or H1 names a component/feature
  • Cross-cutting doc: describes repo-wide concerns (architecture, schema, auth, index)
  • Index: README.md, index.md, or _index.md

Detect filename convention (e.g., NN_<name>.md) and next unused prefix.
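
One way to detect an `NN_<name>.md` convention and propose the next prefix is sketched below. It assumes that "next unused" means "highest existing prefix plus one" (gaps are not refilled) and that the zero-padding width follows the existing files; the name `next_unused_prefix` is hypothetical.

```python
import re
from typing import Iterable, Optional

# Matches names such as "03_auth.md": digits, underscore, name, .md
PREFIX_RE = re.compile(r"^(\d+)_(.+)\.md$")


def next_unused_prefix(filenames: Iterable[str]) -> Optional[str]:
    """Return the next numeric prefix, or None if no file uses the convention."""
    matches = [m for m in (PREFIX_RE.match(f) for f in filenames) if m]
    if not matches:
        return None  # convention not detected
    width = max(len(m.group(1)) for m in matches)
    highest = max(int(m.group(1)) for m in matches)
    # Keep the padding width already used in the docs folder.
    return str(highest + 1).zfill(width)
```

Files that do not follow the pattern, such as `README.md`, are simply ignored when computing the prefix.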

Phase 4: Map components to docs (inference, M2-gated)

For each component, attempt to find its primary doc using the evidence rules. A mapping qualifies for components: (with confirmed: false) if at least ONE of these holds:

  • Name match — component name appears in the doc filename OR H1
  • Textual reference — doc body explicitly names the component path or git URL
  • Directory convention — doc lives inside the component's folder
  • Explicit statement — README, index, or comment asserts the mapping

No signal → entry goes under unresolved: with an A/B/C question, NOT under components: as a guess.

Cross-cutting docs go in docs.cross_cutting: with an owns: list describing what triggers updates to them. If you can't classify a doc, add an unresolved: entry asking the user.
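
The M2 gate described in this phase might look roughly like the sketch below. The `Doc` class, the function names, the evidence tag strings, and the entry shapes are all hypothetical; the real schema lives in the template file. Two simplifications are made and should be read as assumptions: the "explicit statement" signal is left out because it needs human-level reading, and a component that matches more than one doc is sent to unresolved instead of picking one.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


@dataclass
class Doc:
    path: str  # repo-relative path, forward slashes
    h1: str    # first heading of the doc
    body: str  # full text


def evidence_for(name: str, comp_path: str, doc: Doc) -> List[str]:
    """Collect the signals that link one component to one doc."""
    signals: List[str] = []
    needle = name.lower()
    if needle in PurePosixPath(doc.path).stem.lower() or needle in doc.h1.lower():
        signals.append("name_match")
    if comp_path in doc.body:
        signals.append("textual_reference")
    if doc.path.startswith(comp_path.rstrip("/") + "/"):
        signals.append("directory_convention")
    return signals


def place_mapping(name: str, comp_path: str, docs: List[Doc]) -> Tuple[str, Dict]:
    """Return the config section ('components' or 'unresolved') and the entry."""
    candidates = [(d.path, evidence_for(name, comp_path, d)) for d in docs]
    candidates = [(p, ev) for p, ev in candidates if ev]

    if len(candidates) == 1:
        path, ev = candidates[0]
        return "components", {
            "name": name, "primary_doc": path, "confirmed": False, "evidence": ev,
        }

    if candidates:
        question = f"Several docs reference '{name}'. Which one is its primary doc?"
    else:
        question = f"No doc shows any signal for '{name}'. Which doc covers it, if any?"
    return "unresolved", {
        "component": name,
        "question": question,
        "options": [p for p, _ in candidates],
    }
```

Note that an inferred mapping is never written with `confirmed` set to true; only the reviewer changes that.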

Phase 5: Detect CI tooling

Probe at repo root AND per-component for CI configs:

  • .github/workflows/*.yml → GitHub Actions
  • .gitlab-ci.yml → GitLab CI
  • .woodpecker/ or .woodpecker.yml → Woodpecker
  • .drone.yml → Drone
  • Jenkinsfile → Jenkins
  • bitbucket-pipelines.yml → Bitbucket
  • azure-pipelines.yml → Azure Pipelines
  • .circleci/config.yml → CircleCI

Probe for orchestration/infra at root:

  • docker-compose*.yml
  • kustomization.yaml, helm/
  • Makefile with build/deploy targets
  • *-install.sh, *-setup.sh
  • .env.example, .env.template

Record under ci:. For image tag formats, grep compose files for image: lines and record the pattern (e.g., ${REGISTRY}/${NAME}:${BRANCH}-${ARCH}).

Anything ambiguous → unresolved: entry.
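
The probes in this phase are essentially table lookups, so a sketch can stay small. The code below works on a list of repo-relative paths and on the text of a compose file; `detect_ci` and `image_patterns` are hypothetical names, and matching a CI file at any depth is one possible reading of "per-component".

```python
import re
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Patterns are relative to the repo root or to a component folder.
CI_PATTERNS = [
    (".github/workflows/*.yml", "GitHub Actions"),
    (".gitlab-ci.yml", "GitLab CI"),
    (".woodpecker/*", "Woodpecker"),
    (".woodpecker.yml", "Woodpecker"),
    (".drone.yml", "Drone"),
    ("Jenkinsfile", "Jenkins"),
    ("bitbucket-pipelines.yml", "Bitbucket"),
    ("azure-pipelines.yml", "Azure Pipelines"),
    (".circleci/config.yml", "CircleCI"),
]

# Captures the value of an "image:" key; commented-out lines do not match.
IMAGE_RE = re.compile(r"^[ \t]*image:[ \t]*[\"']?([^\"'\s#]+)", re.MULTILINE)


def detect_ci(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each detected CI tool to the config paths that revealed it."""
    found: Dict[str, List[str]] = {}
    for path in paths:
        for pattern, tool in CI_PATTERNS:
            # Match at the root, or nested inside a component folder.
            if fnmatchcase(path, pattern) or fnmatchcase(path, "*/" + pattern):
                found.setdefault(tool, []).append(path)
                break
    return found


def image_patterns(compose_text: str) -> List[str]:
    """Return the raw image references found in a compose file."""
    return IMAGE_RE.findall(compose_text)
```

Recording the paths alongside each tool keeps the evidence available for the human reviewer, which fits the evidence-tagging rule used elsewhere in the skill.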

Phase 6: Detect conventions

  • Commit prefix: git log --format=%s -50 → look for [PREFIX] consistency
  • Target/work branch: check CI config trigger branches; fall back to git remote show origin
  • Ticket ID pattern: grep commits and docs for regex like [A-Z]+-\d+
  • Image tag format: see Phase 5
  • Deployment tiers: scan root README and architecture docs for named tiers/environments

Record inferred conventions with confirmed: false.
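
A possible way to gather evidence for the commit prefix and ticket ID conventions is sketched below. It only collects counts; per M3 the conventions themselves are still confirmed with the user. The function names are hypothetical, "[PREFIX]" is read as a bracketed tag at the start of the subject line, and the ticket regex will also match strings such as `UTF-8`, so the result is a hint rather than a verdict.

```python
import re
import subprocess
from collections import Counter
from typing import Dict, Iterable, List

BRACKET_RE = re.compile(r"^\[([^\]]+)\]")    # e.g. "[AZ-101] Fix login"
TICKET_RE = re.compile(r"\b[A-Z]+-\d+\b")    # e.g. "AZ-101"


def recent_subjects(repo_dir: str, limit: int = 50) -> List[str]:
    """Read the last `limit` commit subjects via `git log --format=%s`."""
    result = subprocess.run(
        ["git", "log", "--format=%s", f"-{limit}"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def commit_convention_evidence(subjects: Iterable[str]) -> Dict[str, object]:
    """Count bracketed prefixes and ticket-like IDs in commit subjects."""
    subjects = list(subjects)
    bracketed = sum(1 for s in subjects if BRACKET_RE.match(s))
    projects = Counter(
        ticket.split("-")[0] for s in subjects for ticket in TICKET_RE.findall(s)
    )
    return {
        "total": len(subjects),
        "bracketed": bracketed,
        "ticket_projects": dict(projects),
    }
```

Calling `commit_convention_evidence(recent_subjects("."))` inside a checkout would produce the counts that back a line such as "prefix seen in N of the last commits" in the assumptions footer.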

Phase 7: Read existing config (if any) and produce diff

If _docs/_repo-config.yaml already exists:

  1. Parse it.
  2. Compare against what Phases 1-6 discovered.
  3. Produce a diff report:
    • Entries added (new components, new docs)
    • Entries changed (e.g., primary_doc changed due to doc renaming)
    • Entries removed (component removed from registry)
  4. Ask the user whether to apply the diff.
  5. If applied, preserve confirmed: true flags for entries that still match — don't reset human-approved mappings.
  6. If user declines, stop — leave config untouched.
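
The diff-and-preserve behavior in steps 3 and 5 can be sketched with plain dictionaries. The shapes here are assumptions for illustration: components are keyed by name, and an entry "still matches" when everything except `confirmed` and `evidence` is unchanged. The function names are hypothetical.

```python
from typing import Dict, List

Entry = Dict[str, object]


def _payload(entry: Entry) -> Entry:
    """The part of an entry a human actually approved (no bookkeeping keys)."""
    return {k: v for k, v in entry.items() if k not in ("confirmed", "evidence")}


def diff_components(old: Dict[str, Entry], new: Dict[str, Entry]) -> Dict[str, List[str]]:
    """Report added, removed, and changed component names."""
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in shared if _payload(old[k]) != _payload(new[k])),
    }


def merge_preserving_confirmed(old: Dict[str, Entry], new: Dict[str, Entry]) -> Dict[str, Entry]:
    """Apply the new discovery, keeping confirmed: true where nothing changed."""
    merged: Dict[str, Entry] = {}
    for name, entry in new.items():
        entry = dict(entry)  # copy so the caller's data is not mutated
        previous = old.get(name)
        if previous and previous.get("confirmed") is True and _payload(previous) == _payload(entry):
            entry["confirmed"] = True
        merged[name] = entry
    return merged
```

The merge is only meant to run after the user approves the diff; if they decline, the existing file is left untouched.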

Phase 8: Batch question checkpoint (M4)

Present ALL accumulated unresolved: questions in one round. For each, offer options (A/B/C) when possible; ask an open-ended question only when no options exist.

After answers, update the draft config with the resolutions.

Phase 9: Write config file

Write _docs/_repo-config.yaml using the schema in templates/repo-config.example.yaml.

  • Top-level confirmed_by_user: false ALWAYS — only the human flips this
  • Every entry has confirmed: <bool> and (when false) evidence: [...]
  • Append to assumptions_log: a new entry for this run
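
Before writing, the three rules above can be checked mechanically against the draft. The sketch below assumes the draft is held as a dictionary with a `components` list; the real layout is defined by templates/repo-config.example.yaml, which is not reproduced here, so the key layout and the name `check_draft` are hypothetical.

```python
from typing import Dict, List


def check_draft(config: Dict[str, object]) -> List[str]:
    """Return a list of rule violations; an empty list means safe to write."""
    problems: List[str] = []

    # Rule 1: the skill always writes confirmed_by_user: false.
    if config.get("confirmed_by_user") is not False:
        problems.append("confirmed_by_user must be false in a generated config")

    # Rule 2: every entry has a boolean flag, and evidence when unconfirmed.
    for index, entry in enumerate(config.get("components", [])):
        label = entry.get("name", f"components[{index}]")
        if not isinstance(entry.get("confirmed"), bool):
            problems.append(f"{label}: missing boolean 'confirmed'")
        elif entry["confirmed"] is False and not entry.get("evidence"):
            problems.append(f"{label}: unconfirmed entry has no evidence")

    # Rule 3: this run must have appended to the assumptions log.
    if not config.get("assumptions_log"):
        problems.append("assumptions_log has no entry for this run")

    return problems
```

A non-empty result would be a reason to stop and fix the draft rather than write a config that breaks the skill's own guarantees.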

Output:

Generated/refreshed _docs/_repo-config.yaml:
- N components discovered (X confirmed, Y inferred, Z unresolved)
- M docs located (K primary, L cross-cutting)
- CI tooling: <detected>
- P unresolved questions resolved this run; Q still open — see config
- Assumptions made during discovery:
  - Treated <path> as unified-docs root (only candidate found)
  - Inferred `<component>` primary doc = `<doc>` (name match)
  - Commit prefix `<prefix>` seen in N of last 50 commits

Next step: please review _docs/_repo-config.yaml, correct any wrong inferences,
and set `confirmed_by_user: true` at the top. After that, monorepo-document,
monorepo-cicd, monorepo-status, and monorepo-onboard will run.

Then stop.

What this skill will NEVER do

  • Modify any file other than _docs/_repo-config.yaml
  • Set confirmed_by_user: true
  • Record a mapping with zero evidence
  • Chain to another skill automatically
  • Commit the generated config

Failure / ambiguity handling

  • Internal contradictions in a component (README references files not in code) → surface to user, stop, do NOT silently reconcile
  • Docs root cannot be located → record docs.root: null and list unresolved question; do not create a new _docs/ folder
  • Parsing fails on _docs/_repo-config.yaml (existing file is corrupt) → surface to user, stop; never overwrite silently