mirror of
https://github.com/azaion/ui.git
synced 2026-04-23 05:26:34 +00:00
---
name: monorepo-discover
description: Scans a monorepo or meta-repo (git-submodule aggregators, npm/cargo workspaces, etc.) and generates a human-reviewable `_docs/_repo-config.yaml` that other `monorepo-*` skills (document, cicd, onboard, status) read. Produces inferred mappings tagged with evidence; never sets the config's `confirmed_by_user` flag to `true` (the human does that). Use on first setup in a new monorepo, or to refresh the config after structural changes.
---

# Monorepo Discover

Writes or refreshes `_docs/_repo-config.yaml`, the shared config file that every other `monorepo-*` skill depends on. Does NOT modify any other files.

## Core principle

**Discovery is a suggestion, not a commitment.** The skill infers repo structure, but every inferred entry is tagged with `confirmed: false` plus evidence. Action skills (`monorepo-document`, `monorepo-cicd`, `monorepo-onboard`) refuse to run until the human reviews the config and sets `confirmed_by_user: true`.

## Mitigations against LLM inference errors (applies throughout)

| Rule | What it means |
| ---- | ------------- |
| **M1** Separation | This skill never triggers other skills. It stops after writing config. |
| **M2** Evidence thresholds | No mapping gets recorded without at least one signal (name match, textual reference, directory convention, explicit statement). Zero-signal candidates go under `unresolved:` with a question. |
| **M3** Factual vs. interpretive | Resolve factual questions alone (file exists? line says what?). Ask for interpretive ones (does A feed into B?) unless M2 evidence is present. Ask for conventional ones always (commit prefix? target branch?). |
| **M4** Batch questions | Accumulate all `unresolved:` questions. Present at end of discovery, not drip-wise. |
| **M5** Skip over guess | Never record a zero-evidence mapping under `components:` or `docs:`; always put it in `unresolved:` with a question. |
| **M6** Assumptions footer | Every run ends with an explicit list of assumptions used. Also append to `assumptions_log:` in the config. |
| **M7** Structural drift | If the config already exists, produce a diff of what would change and ask for approval before overwriting. Never silently regenerate. |

## Guardrail

**This skill writes ONLY `_docs/_repo-config.yaml`.** It never edits unified docs, CI files, or component directories. If the workflow ever pushes you to modify anything else, stop.

## Workflow

### Phase 1: Detect repo type

Check which of these exists (first match wins):

1. `.gitmodules` → **git-submodules meta-repo**
2. `package.json` with `workspaces` field → **npm/yarn workspace**
3. `pnpm-workspace.yaml` → **pnpm workspace**
4. `Cargo.toml` with `[workspace]` section → **cargo workspace**
5. `go.work` → **go workspace**
6. Multiple top-level subfolders each with their own `package.json` / `Cargo.toml` / `pyproject.toml` / `*.csproj` → **ad-hoc monorepo**

If none match → **ask the user** what kind of monorepo this is. Don't guess.

Record the result in `repo.type` and `repo.component_registry`.

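The first-match-wins probe above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names and the returned type identifiers (such as `"package-json-workspaces"`) are hypothetical, and the `[workspace]` check is a plain line match rather than a full TOML parse.

```python
import json
import re
from pathlib import Path
from typing import Optional

# Manifest files that mark a folder as a component (check 6).
COMPONENT_MARKERS = ("package.json", "Cargo.toml", "pyproject.toml")


def has_component_marker(folder: Path) -> bool:
    """True if the folder holds one of the manifests listed in check 6."""
    if any((folder / name).is_file() for name in COMPONENT_MARKERS):
        return True
    return any(folder.glob("*.csproj"))


def detect_repo_type(root: Path) -> Optional[str]:
    """Return a (hypothetical) repo-type id, or None when the user must be asked."""
    if (root / ".gitmodules").is_file():
        return "git-submodules"

    pkg = root / "package.json"
    if pkg.is_file():
        try:
            manifest = json.loads(pkg.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            manifest = {}  # unreadable manifest: treat as no signal
        if isinstance(manifest, dict) and "workspaces" in manifest:
            return "package-json-workspaces"

    if (root / "pnpm-workspace.yaml").is_file():
        return "pnpm-workspace"

    cargo = root / "Cargo.toml"
    if cargo.is_file() and re.search(
        r"^\s*\[workspace\]", cargo.read_text(encoding="utf-8"), re.MULTILINE
    ):
        return "cargo-workspace"

    if (root / "go.work").is_file():
        return "go-workspace"

    # Ad-hoc monorepo: at least two visible top-level folders with a manifest.
    subfolders = [p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")]
    if sum(has_component_marker(p) for p in subfolders) >= 2:
        return "adhoc-monorepo"

    return None  # no match: ask the user, do not guess
```

Returning `None` instead of a default keeps the "don't guess" rule explicit at the call site.
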
### Phase 2: Enumerate components

Based on repo type, parse the registry and list components. For each, collect:

- `name`, `path`
- `stack`: infer from files present (`*.csproj` → .NET, `pyproject.toml` → Python, `Cargo.toml` → Rust, `package.json` → Node/TS, `go.mod` → Go). Multiple signals → pick the dominant one. No signals → `stack: unknown` and add to `unresolved:`.
- `evidence`: list of signals used (e.g., `[gitmodules_entry, csproj_present]`)

Do NOT yet populate `primary_doc`, `secondary_docs`, `ci_config`, or `deployment_tier`; those come in Phases 4 through 6.

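As a rough sketch of the stack inference, assuming "dominant" means the stack with the most manifest files under the component (the text above does not define it), and using hypothetical evidence tags modelled on `csproj_present`:

```python
from pathlib import Path
from typing import Dict, List, Tuple

# (glob pattern, stack label, evidence tag). Tags other than csproj_present
# are hypothetical names chosen for this example.
STACK_SIGNALS = [
    ("*.csproj", ".NET", "csproj_present"),
    ("pyproject.toml", "Python", "pyproject_present"),
    ("Cargo.toml", "Rust", "cargo_toml_present"),
    ("package.json", "Node/TS", "package_json_present"),
    ("go.mod", "Go", "go_mod_present"),
]


def infer_stack(component_dir: Path) -> Tuple[str, List[str]]:
    """Return (stack, evidence); stack is "unknown" when no signal is found."""
    counts: Dict[str, int] = {}
    evidence: List[str] = []
    for pattern, stack, tag in STACK_SIGNALS:
        # Skip vendored dependencies so they do not skew the count.
        hits = [p for p in component_dir.rglob(pattern) if "node_modules" not in p.parts]
        if hits:
            counts[stack] = len(hits)
            evidence.append(tag)
    if not counts:
        return "unknown", []  # caller adds an unresolved: question
    # Assumption: dominant = most manifest files; ties go to the earlier signal.
    dominant = max(counts, key=counts.get)
    return dominant, evidence
```

A component with two `*.csproj` files and one `package.json` would come back as `.NET` with both evidence tags recorded, so the reviewer can still see the minority signal.
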
### Phase 3: Locate docs root

Probe in order: `_docs/`, `docs/`, `documentation/`, or a root-level README with links to sub-docs.

- Multiple candidates → ask user which is canonical
- None → `docs.root: null` + flag under `unresolved:`

Once located, classify each `*.md`:

- **Primary doc**: filename or H1 names a component/feature
- **Cross-cutting doc**: describes repo-wide concerns (architecture, schema, auth, index)
- **Index**: `README.md`, `index.md`, or `_index.md`

Detect filename convention (e.g., `NN_<name>.md`) and next unused prefix.

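Detecting the `NN_<name>.md` convention and the next unused prefix might look like the sketch below. It assumes "next unused" means one past the highest existing number (gaps are not back-filled) and that the zero-padding width follows the existing files; `next_unused_prefix` is a hypothetical helper name.

```python
import re
from typing import Iterable, Optional

# Matches names such as "03_auth.md": digits, underscore, rest, ".md".
NUMBERED_DOC = re.compile(r"^(\d+)_.+\.md$")


def next_unused_prefix(filenames: Iterable[str]) -> Optional[str]:
    """Return the next zero-padded prefix, or None if no file is numbered."""
    prefixes = []
    for name in filenames:
        match = NUMBERED_DOC.match(name)
        if match:
            prefixes.append(match.group(1))
    if not prefixes:
        return None  # convention not in use; record nothing
    width = max(len(p) for p in prefixes)
    highest = max(int(p) for p in prefixes)
    return str(highest + 1).zfill(width)
```

For a docs folder holding `00_index.md`, `01_api.md`, and `03_auth.md`, this yields `04`; whether the gap at `02` should be reused is an interpretive question for the user.
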
### Phase 4: Map components to docs (inference, M2-gated)

For each component, attempt to find its **primary doc** using the evidence rules. A mapping qualifies for `components:` (with `confirmed: false`) if at least ONE of these holds:

- **Name match**: component name appears in the doc filename OR H1
- **Textual reference**: doc body explicitly names the component path or git URL
- **Directory convention**: doc lives inside the component's folder
- **Explicit statement**: README, index, or comment asserts the mapping

No signal → entry goes under `unresolved:` with an A/B/C question, NOT under `components:` as a guess.

Cross-cutting docs go in `docs.cross_cutting:` with an `owns:` list describing what triggers updates to them. If you can't classify a doc, add an `unresolved:` entry asking the user.

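The evidence gate can be illustrated with a small sketch. It covers only the three signals that can be checked mechanically; an explicit statement needs reading comprehension and is left out. The signal tags, the function names, and the shape of the `config` dict (lists under `components` and `unresolved`) are assumptions made for this example, not the real schema.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def first_h1(markdown: str) -> str:
    """Return the text of the first "# " heading, or an empty string."""
    for line in markdown.splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return ""


def collect_evidence(name: str, path: str, doc_path: str, doc_text: str,
                     git_url: Optional[str] = None) -> List[str]:
    """List the mechanical signals linking a component to a doc."""
    signals = []
    needle = name.lower()
    doc = PurePosixPath(doc_path)
    # Name match: substring test, so very short names can over-match.
    if needle in doc.name.lower() or needle in first_h1(doc_text).lower():
        signals.append("name_match")
    if path in doc_text or (git_url is not None and git_url in doc_text):
        signals.append("textual_reference")
    if PurePosixPath(path) in doc.parents:
        signals.append("directory_convention")
    return signals


def place_mapping(config: dict, name: str, doc_path: str, signals: List[str]) -> None:
    """Record under components: only when evidence exists, else under unresolved:."""
    if signals:
        config["components"].append(
            {"name": name, "primary_doc": doc_path, "confirmed": False, "evidence": signals}
        )
    else:
        config["unresolved"].append(
            {"component": name, "question": f"Which doc is the primary doc for `{name}`?"}
        )
```

Keeping the zero-signal branch inside the same function makes it hard to record a guess by accident.
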
### Phase 5: Detect CI tooling

Probe at repo root AND per-component for CI configs:

- `.github/workflows/*.yml` → GitHub Actions
- `.gitlab-ci.yml` → GitLab CI
- `.woodpecker/` or `.woodpecker.yml` → Woodpecker
- `.drone.yml` → Drone
- `Jenkinsfile` → Jenkins
- `bitbucket-pipelines.yml` → Bitbucket
- `azure-pipelines.yml` → Azure Pipelines
- `.circleci/config.yml` → CircleCI

Probe for orchestration/infra at root:

- `docker-compose*.yml`
- `kustomization.yaml`, `helm/`
- `Makefile` with build/deploy targets
- `*-install.sh`, `*-setup.sh`
- `.env.example`, `.env.template`

Record under `ci:`. For image tag formats, grep compose files for `image:` lines and record the pattern (e.g., `${REGISTRY}/${NAME}:${BRANCH}-${ARCH}`).

Anything ambiguous → `unresolved:` entry.

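A minimal sketch of the CI probe and the `image:` grep, assuming the probe is run once per directory (repo root, then each component). The function names are hypothetical, and the image regex is a simple line match, not a YAML parse.

```python
import re
from pathlib import Path
from typing import Dict, List

# (glob pattern relative to the probed directory, tool name) from the list above.
CI_PROBES = [
    (".github/workflows/*.yml", "GitHub Actions"),
    (".gitlab-ci.yml", "GitLab CI"),
    (".woodpecker", "Woodpecker"),
    (".woodpecker.yml", "Woodpecker"),
    (".drone.yml", "Drone"),
    ("Jenkinsfile", "Jenkins"),
    ("bitbucket-pipelines.yml", "Bitbucket"),
    ("azure-pipelines.yml", "Azure Pipelines"),
    (".circleci/config.yml", "CircleCI"),
]

# Captures the value of an "image:" line, without surrounding quotes.
IMAGE_LINE = re.compile(r"^\s*image:\s*[\"']?([^\s\"']+)", re.MULTILINE)


def detect_ci(directory: Path) -> Dict[str, List[str]]:
    """Map each detected CI tool to the relative paths that triggered it."""
    found: Dict[str, List[str]] = {}
    for pattern, tool in CI_PROBES:
        for hit in sorted(directory.glob(pattern)):
            found.setdefault(tool, []).append(hit.relative_to(directory).as_posix())
    return found


def image_patterns(compose_text: str) -> List[str]:
    """Return every image reference found in a compose file's text."""
    return IMAGE_LINE.findall(compose_text)
```

More than one tool in the result is a natural trigger for an `unresolved:` entry asking which pipeline is authoritative.
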
### Phase 6: Detect conventions

- **Commit prefix**: `git log --format=%s -50` → look for `[PREFIX]` consistency
- **Target/work branch**: check CI config trigger branches; fall back to `git remote show origin`
- **Ticket ID pattern**: grep commits and docs for regex like `[A-Z]+-\d+`
- **Image tag format**: see Phase 5
- **Deployment tiers**: scan root README and architecture docs for named tiers/environments

Record inferred conventions with `confirmed: false`.

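The commit-prefix and ticket-ID checks could be sketched as below. This assumes the prefix is a literal bracketed tag at the start of the subject line; the helper names are hypothetical. Counts are returned rather than a verdict, since what counts as "consistent" is left to the reviewer.

```python
import re
import subprocess
from collections import Counter
from typing import Iterable, List, Optional, Tuple

BRACKET_PREFIX = re.compile(r"^\[([^\]]+)\]")
TICKET_ID = re.compile(r"\b[A-Z]+-\d+\b")  # also matches strings like "UTF-8"


def recent_subjects(repo_dir: str, limit: int = 50) -> List[str]:
    """Read the last `limit` commit subjects via `git log --format=%s`."""
    result = subprocess.run(
        ["git", "log", "--format=%s", f"-{limit}"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def dominant_prefix(subjects: Iterable[str]) -> Optional[Tuple[str, int]]:
    """Return (prefix, count) for the most common bracketed prefix, if any."""
    counts: Counter = Counter()
    for subject in subjects:
        match = BRACKET_PREFIX.match(subject)
        if match:
            counts[match.group(1)] += 1
    if not counts:
        return None
    return counts.most_common(1)[0]


def ticket_projects(texts: Iterable[str]) -> Counter:
    """Count ticket IDs by project key (the part before the hyphen)."""
    return Counter(
        ticket.split("-")[0] for text in texts for ticket in TICKET_ID.findall(text)
    )
```

The count from `dominant_prefix` is the kind of number that belongs in the assumptions footer, next to the inferred prefix.
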
### Phase 7: Read existing config (if any) and produce diff

If `_docs/_repo-config.yaml` already exists:

1. Parse it.
2. Compare against what Phases 1–6 discovered.
3. Produce a **diff report**:
   - Entries added (new components, new docs)
   - Entries changed (e.g., `primary_doc` changed due to doc renaming)
   - Entries removed (component removed from registry)
4. **Ask the user** whether to apply the diff.
5. If applied, **preserve `confirmed: true` flags** for entries that still match; don't reset human-approved mappings.
6. If user declines, stop and leave the config untouched.

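Steps 3 and 5 might be sketched like this for component entries. The sketch assumes each entry is a dict with a unique `name`, and that "still match" means `path`, `stack`, and `primary_doc` are unchanged; both are illustrative assumptions rather than the real schema.

```python
from typing import Dict, List

# Fields whose change invalidates an earlier human confirmation (assumed set).
COMPARED_FIELDS = ("path", "stack", "primary_doc")


def by_name(entries: List[dict]) -> Dict[str, dict]:
    return {entry["name"]: entry for entry in entries}


def diff_components(old: List[dict], new: List[dict]) -> Dict[str, List[str]]:
    """Report which component names were added, removed, or changed."""
    old_map, new_map = by_name(old), by_name(new)
    changed = [
        name for name in old_map.keys() & new_map.keys()
        if any(old_map[name].get(f) != new_map[name].get(f) for f in COMPARED_FIELDS)
    ]
    return {
        "added": sorted(new_map.keys() - old_map.keys()),
        "removed": sorted(old_map.keys() - new_map.keys()),
        "changed": sorted(changed),
    }


def merge_preserving_confirmed(old: List[dict], new: List[dict]) -> List[dict]:
    """Apply the new discovery, keeping confirmed: true on unchanged entries."""
    old_map = by_name(old)
    changed = set(diff_components(old, new)["changed"])
    merged = []
    for entry in new:
        result = dict(entry)  # copy so the discovery result is not mutated
        previous = old_map.get(entry["name"])
        if previous and entry["name"] not in changed and previous.get("confirmed") is True:
            result["confirmed"] = True
        merged.append(result)
    return merged
```

The merge should only run after the user approves the diff; if they decline, nothing is written.
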
### Phase 8: Batch question checkpoint (M4)

Present ALL accumulated `unresolved:` questions in one round. For each, offer options (A/B/C) when possible; ask open-ended only when no options exist.

After answers, update the draft config with the resolutions.

### Phase 9: Write config file

Write `_docs/_repo-config.yaml` using the schema in [templates/repo-config.example.yaml](templates/repo-config.example.yaml).

- Top-level `confirmed_by_user: false` ALWAYS; only the human flips this
- Every entry has `confirmed: <bool>` and (when `false`) `evidence: [...]`
- Append to `assumptions_log:` a new entry for this run

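Before writing, the draft could be checked against the first two rules with something like the sketch below. It assumes the draft is already loaded as a Python dict with a `components` list; the real schema lives in the template file and may differ, and `check_draft` is a hypothetical name.

```python
from typing import List


def check_draft(config: dict) -> List[str]:
    """Return a list of rule violations; an empty list means safe to write."""
    problems = []
    # The skill itself must never emit confirmed_by_user: true.
    if config.get("confirmed_by_user") is not False:
        problems.append("confirmed_by_user must be false")
    for entry in config.get("components", []):
        label = entry.get("name", "<unnamed>")
        confirmed = entry.get("confirmed")
        if not isinstance(confirmed, bool):
            problems.append(f"{label}: confirmed must be a boolean")
        elif confirmed is False and not entry.get("evidence"):
            problems.append(f"{label}: unconfirmed entry needs evidence")
    return problems
```

Any non-empty result means a zero-evidence mapping slipped through and belongs under `unresolved:` instead.
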
### Phase 10: Review handoff + assumptions footer (M6)

Output:

```
Generated/refreshed _docs/_repo-config.yaml:
- N components discovered (X confirmed, Y inferred, Z unresolved)
- M docs located (K primary, L cross-cutting)
- CI tooling: <detected>
- P unresolved questions resolved this run; Q still open (see config)
- Assumptions made during discovery:
  - Treated <path> as unified-docs root (only candidate found)
  - Inferred `<component>` primary doc = `<doc>` (name match)
  - Commit prefix `<prefix>` seen in <count> of last 50 commits

Next step: please review _docs/_repo-config.yaml, correct any wrong inferences,
and set `confirmed_by_user: true` at the top. After that, monorepo-document,
monorepo-cicd, monorepo-status, and monorepo-onboard will run.
```

Then stop.

## What this skill will NEVER do

- Modify any file other than `_docs/_repo-config.yaml`
- Set `confirmed_by_user: true`
- Record a mapping with zero evidence
- Chain to another skill automatically
- Commit the generated config

## Failure / ambiguity handling

- Internal contradictions in a component (README references files not in code) → surface to user, stop, do NOT silently reconcile
- Docs root cannot be located → record `docs.root: null` and list unresolved question; do not create a new `_docs/` folder
- Parsing fails on `_docs/_repo-config.yaml` (existing file is corrupt) → surface to user, stop; never overwrite silently