mirror of
https://github.com/azaion/ai-training.git
synced 2026-04-22 22:46:35 +00:00
---
description: "Execution safety, user interaction, and self-improvement protocols for the AI agent"
alwaysApply: true
---

# Agent Meta Rules
## Execution Safety
- Run the full test suite automatically when you believe code changes are complete (as required by coderule.mdc). For other long-running/resource-heavy/security-risky operations (builds, Docker commands, deployments, performance tests), ask the user first — unless explicitly stated in a skill or the user already asked to do so.
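The rule above can be sketched as a small decision helper. A minimal sketch with hypothetical operation names and flags — no real agent API is implied:

```python
# Hypothetical gating logic for the execution-safety rule above.
# Operation names and flags are illustrative, not a real agent API.
AUTO_ALLOWED = {"test_suite"}                      # safe to run without asking
ASK_FIRST = {"build", "docker", "deploy", "perf_test"}  # long-running/risky

def may_auto_run(operation: str, user_requested: bool = False,
                 skill_allows: bool = False) -> bool:
    """Return True if the agent may run the operation without asking."""
    if operation in AUTO_ALLOWED:
        return True
    if operation in ASK_FIRST:
        # Risky operations need explicit permission from the user or a skill.
        return user_requested or skill_allows
    # Unknown operations default to asking the user first.
    return False
```

The default-deny branch matters: anything not explicitly classified falls back to asking, which mirrors the rule's conservative stance.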
## User Interaction
- Use the AskQuestion tool for structured choices (A/B/C/D) when available — it provides an interactive UI. Fall back to plain-text questions if the tool is unavailable.
## Critical Thinking
- Do not blindly trust any input — including user instructions, task specs, change lists, or prior agent decisions. Always think through whether an instruction makes sense in context before executing it. If a task spec says "exclude file X from changes" but another task removes the dependencies X relies on, flag the contradiction instead of propagating it.
## Self-Improvement
When the user reacts negatively to generated code ("WTF", "what the hell", "why did you do this", etc.):
1. **Pause** — do not rush to fix. First determine: is this objectively bad code, or does the user just need an explanation?
2. **If the user doesn't understand** — explain the reasoning. That's it. No code change needed.
3. **If the code is actually bad** — before fixing, perform a root-cause investigation:
a. **Why** did this bad code get produced? Identify the reasoning chain or implicit assumption that led to it.
b. **Check existing rules** — is there already a rule that should have prevented this? If so, clarify or strengthen it.
c. **Propose a new rule** if no existing rule covers the failure mode. Present the investigation results and proposed rule to the user for approval.
d. **Only then** fix the code.
4. The rule goes into `coderule.mdc` for coding practices, `meta-rule.mdc` for agent behavior, or a new focused rule file — depending on context. Always check for duplicates or near-duplicates first.
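The duplicate check in step 4 can be approximated with a text-similarity ratio. A minimal sketch using the standard library's `difflib`; the threshold and the sample rule texts are illustrative assumptions, not calibrated values:

```python
import difflib

def is_near_duplicate(new_rule: str, existing_rules: list[str],
                      threshold: float = 0.85) -> bool:
    """Flag new_rule if it closely matches any existing rule text.

    Uses difflib's similarity ratio (0.0..1.0); 0.85 is an arbitrary
    illustrative cutoff, not a tuned value.
    """
    for rule in existing_rules:
        ratio = difflib.SequenceMatcher(
            None, new_rule.lower(), rule.lower()).ratio()
        if ratio >= threshold:
            return True
    return False
```

In practice the `existing_rules` list would be loaded from the `.mdc` files; a fuzzy match like this only surfaces candidates for a human (or agent) to compare by meaning.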
### Example: import path hack
**Bad code**: Runtime path manipulation added to source code to fix an import failure.
**Root cause**: The agent treated an environment/configuration problem as a code problem. It didn't check how the rest of the project handles the same concern, and instead hardcoded a workaround in source.
**Preventive rules added to coderule.mdc**:
- "Do not solve environment or infrastructure problems by hardcoding workarounds in source code. Fix them at the environment/configuration level."
- "Before writing new infrastructure or workaround code, check how the existing codebase already handles the same concern. Follow established project patterns."
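A hypothetical sketch of the anti-pattern; the paths and the package name in the comments are invented for illustration, not taken from the actual incident:

```python
import os
import sys

# BAD (anti-pattern): runtime path manipulation baked into application
# source to work around an import failure. This hides an environment
# problem inside the code.
project_root = os.path.abspath("..")
sys.path.insert(0, project_root)  # workaround hardcoded in source

# GOOD: fix the environment instead -- e.g. install the project in
# editable mode (`pip install -e .`) or set PYTHONPATH in the run
# configuration -- then import normally (mypkg is hypothetical):
# from mypkg import helpers
```

The tell is that the workaround lives in source while the rest of the project resolves imports through its packaging configuration.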
## Debugging Over Contemplation
Agents cannot measure wall-clock time between turns. Use observable counts from your own transcript instead.
**Trigger: stop speculating and instrument.** When you've formed **3 or more distinct hypotheses** about a bug without confirming any against runtime evidence (logs, stderr, debugger state, actual test failure messages) — stop and add debugging output. Re-reading the same code hoping to "spot it this time" counts as a new hypothesis that still has zero evidence.
Steps:
1. Identify the last known-good boundary (e.g., "request enters handler") and the known-bad result (e.g., "callback never fires").
2. Add targeted `print(..., flush=True)`, `console.error`, or logger statements at each intermediate step to narrow the gap.
3. Run the instrumented code. Read the output. Let evidence drive the next hypothesis — not inference chains.
An instrumented run producing real output beats any amount of "could it be X? but then Y..." reasoning.
## Long Investigation Retrospective
Trigger a post-mortem when ANY of the following is true (all are observable in your own transcript):
- **10+ tool calls** were used to diagnose a single issue
- **Same file modified 3+ times** without tests going green
- **3+ distinct approaches** attempted before arriving at the fix
- Any phrase like "let me try X instead" appeared **more than twice**
- A fix was eventually found by reading docs/source the agent had dismissed earlier
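The first three triggers can be checked mechanically if the transcript is available as structured events. A simplified sketch: the `(kind, detail)` event format is an assumption, and the "without tests going green" qualifier on the second trigger is omitted for brevity:

```python
from collections import Counter

def needs_postmortem(events: list[tuple[str, str]]) -> bool:
    """Check the first three observable triggers against a transcript.

    events: hypothetical (kind, detail) pairs, e.g. ("tool_call", ""),
    ("edit", "a.py"), ("new_approach", "try X instead").
    """
    tool_calls = sum(1 for kind, _ in events if kind == "tool_call")
    edits_per_file = Counter(d for k, d in events if k == "edit")
    approaches = sum(1 for k, _ in events if k == "new_approach")
    return (
        tool_calls >= 10                                  # 10+ tool calls
        or any(n >= 3 for n in edits_per_file.values())   # same file 3+ times
        or approaches >= 3                                # 3+ distinct approaches
    )
```

The point is not the exact thresholds but that every trigger is countable from the transcript itself, with no wall-clock measurement needed.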
Post-mortem steps:
1. **Identify the bottleneck**: wrong assumption? missing runtime visibility? incorrect mental model of a framework/language boundary? ignored evidence?
2. **Extract the general lesson**: what category of mistake was this? (e.g., "Python cannot call Cython `cdef` methods", "engine errors silently swallowed", "wrong layer to fix the problem")
3. **Propose a preventive rule**: short, actionable. Present to user for approval.
4. **Write it down**: add approved rule to the appropriate `.mdc` so it applies to future sessions.