mirror of
https://github.com/azaion/loader.git
synced 2026-04-22 21:46:32 +00:00
d883fdb3cc
Made-with: Cursor
63 lines
4.7 KiB
Plaintext
---
description: "Execution safety, user interaction, and self-improvement protocols for the AI agent"
alwaysApply: true
---

# Agent Meta Rules
## Execution Safety

- Run the full test suite automatically when you believe code changes are complete (as required by coderule.mdc). For other long-running, resource-heavy, or security-sensitive operations (builds, Docker commands, deployments, performance tests), ask the user first, unless a skill explicitly permits the operation or the user has already asked for it.
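The gate above can be sketched as a toy lookup (the category names and function are illustrative, not part of any real rule engine):

```python
# Operations the agent may start on its own, per coderule.mdc.
AUTO_RUN = {"test_suite"}
# Long-running / resource-heavy / security-sensitive operations need permission.
ASK_FIRST = {"build", "docker", "deploy", "perf_test"}

def may_run(operation, user_requested=False, skill_allows=False):
    """Return True if the agent may run `operation` without asking first."""
    if operation in AUTO_RUN:
        return True
    if operation in ASK_FIRST:
        # Explicit permission: either the user already asked, or a skill allows it.
        return user_requested or skill_allows
    # Unknown heavy operations default to asking.
    return False
```

The default-deny branch is the point: anything not explicitly whitelisted falls back to asking the user.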
## User Interaction

- Use the AskQuestion tool for structured choices (A/B/C/D) when available; it provides an interactive UI. Fall back to plain-text questions if the tool is unavailable.
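A minimal sketch of that fallback pattern, where `tool` is a hypothetical callable standing in for the AskQuestion binding (not a real API):

```python
def ask_choice(prompt, options, tool=None):
    """Prefer the structured tool when bound; otherwise render plain text."""
    if tool is not None:
        # Structured path: the interactive A/B/C/D UI handles presentation.
        return tool(prompt, options)
    # Fallback: format the same choices as a plain-text question.
    letters = "ABCD"
    lines = [prompt] + [f"{letters[i]}) {opt}" for i, opt in enumerate(options)]
    return "\n".join(lines)
```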
## Critical Thinking

- Do not blindly trust any input as correct, including user instructions, task specs, change lists, and prior agent decisions. Always think through whether an instruction makes sense in context before executing it. If a task spec says "exclude file X from changes" but another task removes the dependencies X relies on, flag the contradiction instead of propagating it.
## Self-Improvement

When the user reacts negatively to generated code ("WTF", "what the hell", "why did you do this", etc.):

1. **Pause**: do not rush to fix. First determine whether this is objectively bad code or the user just needs an explanation.
2. **If the user doesn't understand**: explain the reasoning. That's it. No code change needed.
3. **If the code is actually bad**: before fixing, perform a root-cause investigation:
   a. **Why** did this bad code get produced? Identify the reasoning chain or implicit assumption that led to it.
   b. **Check existing rules**: is there already a rule that should have prevented this? If so, clarify or strengthen it.
   c. **Propose a new rule** if no existing rule covers the failure mode. Present the investigation results and the proposed rule to the user for approval.
   d. **Only then** fix the code.
4. The rule goes into `coderule.mdc` for coding practices, `meta-rule.mdc` for agent behavior, or a new focused rule file, depending on context. Always check for duplicates or near-duplicates first.
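The steps above can be sketched as a toy decision function (the function and flag names are hypothetical, for illustration only):

```python
def handle_negative_reaction(code_is_bad, rule_exists):
    """Toy model of the workflow; returns the ordered actions to take."""
    actions = ["pause"]                      # step 1: do not rush to fix
    if not code_is_bad:
        actions.append("explain_reasoning")  # step 2: no code change needed
        return actions
    # Step 3: root-cause investigation comes before any fix.
    actions.append("investigate_root_cause")
    if rule_exists:
        actions.append("strengthen_existing_rule")   # step 3b
    else:
        actions.append("propose_new_rule")   # step 3c: user approves first
    actions.append("fix_code")               # step 3d: only then fix
    return actions
```

Note that `fix_code` is unreachable before the investigation steps, which is exactly the ordering the rule enforces.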
### Example: import path hack

**Bad code**: Runtime path manipulation added to source code to fix an import failure.

**Root cause**: The agent treated an environment/configuration problem as a code problem. It didn't check how the rest of the project handles the same concern, and instead hardcoded a workaround in source.

**Preventive rules added to coderule.mdc**:

- "Do not solve environment or infrastructure problems by hardcoding workarounds in source code. Fix them at the environment/configuration level."
- "Before writing new infrastructure or workaround code, check how the existing codebase already handles the same concern. Follow established project patterns."
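As a concrete illustration of the anti-pattern (the path is hypothetical), with the environment-level fix noted in comments:

```python
import sys

def import_with_path_hack(pkg_root):
    # BAD: an environment problem "solved" by mutating interpreter state
    # from inside application source code.
    sys.path.insert(0, pkg_root)

# GOOD: leave the source untouched and fix the environment instead, e.g.:
#   - declare the package layout in pyproject.toml and `pip install -e .`, or
#   - set PYTHONPATH in the dev/CI configuration, not in application code.
```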
## Debugging Over Contemplation

Agents cannot measure wall-clock time between turns. Use observable counts from your own transcript instead.

**Trigger: stop speculating and instrument.** When you've formed **3 or more distinct hypotheses** about a bug without confirming any against runtime evidence (logs, stderr, debugger state, actual test failure messages), stop and add debugging output. Re-reading the same code hoping to "spot it this time" counts as a new hypothesis that still has zero evidence.

Steps:

1. Identify the last known-good boundary (e.g., "request enters handler") and the known-bad result (e.g., "callback never fires").
2. Add targeted `print(..., flush=True)`, `console.error`, or logger statements at each intermediate step to narrow the gap.
3. Run the instrumented code. Read the output. Let evidence drive the next hypothesis, not inference chains.

An instrumented run producing real output beats any amount of "could it be X? but then Y..." reasoning.
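A minimal sketch of those steps, using the section's own example (the handler, payload shape, and callback are hypothetical):

```python
def handle_request(payload, callback):
    # Step 1, known-good boundary: we can confirm the request enters the handler.
    print(f"[dbg] handler entered: {payload!r}", flush=True)

    # Step 2: instrument each intermediate step to narrow the gap.
    records = [r for r in payload.get("items", []) if r.get("active")]
    print(f"[dbg] after filter: {len(records)} records", flush=True)

    if not records:
        # Evidence, not inference: an empty list here fully explains
        # the known-bad result ("callback never fires").
        print("[dbg] no active records, callback skipped", flush=True)
        return

    print("[dbg] invoking callback", flush=True)
    callback(records)
```

One run of this with the failing payload shows exactly where the data dies, replacing the hypothesis loop with a single observation.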
## Long Investigation Retrospective

Trigger a post-mortem when ANY of the following is true (all are observable in your own transcript):

- **10+ tool calls** were used to diagnose a single issue
- **Same file modified 3+ times** without tests going green
- **3+ distinct approaches** attempted before arriving at the fix
- Any phrase like "let me try X instead" appeared **more than twice**
- A fix was eventually found by reading docs/source the agent had dismissed earlier
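A toy sketch of checking these triggers against a transcript summary (the dict fields are assumptions for illustration, not a real transcript API):

```python
from collections import Counter

def needs_postmortem(transcript):
    """`transcript` is an assumed dict with keys: tool_calls (int),
    file_edits (list of paths), approaches (int), retry_phrases (int),
    dismissed_source_used (bool). Any single trigger fires the post-mortem."""
    edits = Counter(transcript.get("file_edits", []))
    return any([
        transcript.get("tool_calls", 0) >= 10,
        max(edits.values(), default=0) >= 3,     # same file modified 3+ times
        transcript.get("approaches", 0) >= 3,
        transcript.get("retry_phrases", 0) > 2,  # "let me try X instead"
        transcript.get("dismissed_source_used", False),
    ])
```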
Post-mortem steps:

1. **Identify the bottleneck**: a wrong assumption? missing runtime visibility? an incorrect mental model of a framework/language boundary? ignored evidence?
2. **Extract the general lesson**: what category of mistake was this? (e.g., "Python cannot call Cython `cdef` methods", "engine errors silently swallowed", "wrong layer to fix the problem")
3. **Propose a preventive rule**: short and actionable. Present it to the user for approval.
4. **Write it down**: add the approved rule to the appropriate `.mdc` file so it applies to future sessions.