Refine coding standards and testing guidelines

- Updated coding rules to emphasize readability, meaningful comments, and maintainability.
- Adjusted test coverage thresholds to 75% for business logic and clarified expectations for test scenarios.
- Enhanced guidelines for handling skipped tests, emphasizing the need for investigation and resolution.
- Introduced a completeness audit for research decomposition to ensure thoroughness in addressing problem dimensions.

Made-with: Cursor
Oleksandr Bezdieniezhnykh
2026-04-17 20:29:15 +03:00
parent 567092188d
commit f46531fc6d
17 changed files with 275 additions and 90 deletions
+22 -10
@@ -5,7 +5,7 @@ alwaysApply: true
# Agent Meta Rules
## Execution Safety
-- Never run test suites, builds, Docker commands, or other long-running/resource-heavy/security-risky operations without asking the user first — unless it is explicitly stated in a skill or agent, or the user already asked to do so.
+- Run the full test suite automatically when you believe code changes are complete (as required by coderule.mdc). For other long-running/resource-heavy/security-risky operations (builds, Docker commands, deployments, performance tests), ask the user first — unless explicitly stated in a skill or the user already asked to do so.
## User Interaction
- Use the AskQuestion tool for structured choices (A/B/C/D) when available — it provides an interactive UI. Fall back to plain-text questions if the tool is unavailable.
@@ -33,18 +33,30 @@ When the user reacts negatively to generated code ("WTF", "what the hell", "why
- "Before writing new infrastructure or workaround code, check how the existing codebase already handles the same concern. Follow established project patterns."
## Debugging Over Contemplation
-When the root cause of a bug is not clear after ~5 minutes of reasoning, analysis, and assumption-making — **stop speculating and add debugging logs**. Observe actual runtime behavior before forming another theory. The pattern to follow:
+Agents cannot measure wall-clock time between turns. Use observable counts from your own transcript instead.
+**Trigger: stop speculating and instrument.** When you've formed **3 or more distinct hypotheses** about a bug without confirming any against runtime evidence (logs, stderr, debugger state, actual test failure messages) — stop and add debugging output. Re-reading the same code hoping to "spot it this time" counts as a new hypothesis that still has zero evidence.
Steps:
1. Identify the last known-good boundary (e.g., "request enters handler") and the known-bad result (e.g., "callback never fires").
-2. Add targeted `print(..., flush=True)` or log statements at each intermediate step to narrow the gap.
-3. Read the output. Let evidence drive the next step — not inference chains built on unverified assumptions.
+2. Add targeted `print(..., flush=True)`, `console.error`, or logger statements at each intermediate step to narrow the gap.
+3. Run the instrumented code. Read the output. Let evidence drive the next hypothesis — not inference chains.
-Prolonged mental contemplation without evidence is a time sink. A 15-minute instrumented run beats 45 minutes of "could it be X? but then Y... unless Z..." reasoning.
+An instrumented run producing real output beats any amount of "could it be X? but then Y..." reasoning.
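The instrumented-run pattern the rule describes might look like this in Python. This is a hypothetical sketch: `handle_request`, `parse`, and `run_callback` are illustrative stand-ins for whatever sits between the last known-good boundary and the known-bad result.

```python
# Sketch of the narrowing pattern: print at each step between the
# known-good boundary ("request enters handler") and the known-bad
# result ("callback never fires"), then read the output.

def parse(payload):
    # Illustrative stand-in for real parsing logic.
    return payload.strip().lower()

def run_callback(parsed):
    # Illustrative stand-in for the callback under suspicion.
    return {"ok": True, "value": parsed}

def handle_request(payload):
    print(f"[dbg] entered handler with {payload!r}", flush=True)   # known-good boundary
    parsed = parse(payload)
    print(f"[dbg] parsed -> {parsed!r}", flush=True)               # intermediate step
    result = run_callback(parsed)
    print(f"[dbg] callback returned {result!r}", flush=True)       # known-bad side
    return result
```

Each `[dbg]` line that does or does not appear in the output halves the search space; the last line printed before silence marks the step to inspect next.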
## Long Investigation Retrospective
-When a problem takes significantly longer than expected (>30 minutes), perform a post-mortem before closing out:
-1. **Identify the bottleneck**: Was the delay caused by assumptions that turned out wrong? Missing visibility into runtime state? Incorrect mental model of a framework or language boundary?
-2. **Extract the general lesson**: What category of mistake was this? (e.g., "Python cannot call Cython `cdef` methods", "engine errors silently swallowed", "wrong layer to fix the problem")
-3. **Propose a preventive rule**: Formulate it as a short, actionable statement. Present it to the user for approval.
-4. **Write it down**: Add the approved rule to the appropriate `.mdc` file so it applies to all future sessions.
+Trigger a post-mortem when ANY of the following is true (all are observable in your own transcript):
+- **10+ tool calls** were used to diagnose a single issue
+- **Same file modified 3+ times** without tests going green
+- **3+ distinct approaches** attempted before arriving at the fix
+- Any phrase like "let me try X instead" appeared **more than twice**
+- A fix was eventually found by reading docs/source the agent had dismissed earlier
+Post-mortem steps:
+1. **Identify the bottleneck**: wrong assumption? missing runtime visibility? incorrect mental model of a framework/language boundary? ignored evidence?
+2. **Extract the general lesson**: what category of mistake was this? (e.g., "Python cannot call Cython `cdef` methods", "engine errors silently swallowed", "wrong layer to fix the problem")
+3. **Propose a preventive rule**: short, actionable. Present to user for approval.
+4. **Write it down**: add approved rule to the appropriate `.mdc` so it applies to future sessions.
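Because every trigger in the new rule text is a count an agent can tally from its own transcript, the check reduces to a simple predicate. A minimal sketch, assuming the counts have already been tallied; the function name and parameters are illustrative, not part of the rule files:

```python
# Hypothetical helper: decide whether the retrospective rule fires,
# given counts tallied from the agent's own transcript.

def needs_postmortem(tool_calls: int,
                     edits_to_same_file: int,
                     approaches_tried: int,
                     try_instead_phrases: int,
                     fix_found_in_dismissed_docs: bool) -> bool:
    return (
        tool_calls >= 10                 # 10+ tool calls on one issue
        or edits_to_same_file >= 3       # same file modified 3+ times, tests still red
        or approaches_tried >= 3         # 3+ distinct approaches before the fix
        or try_instead_phrases > 2       # "let me try X instead" more than twice
        or fix_found_in_dismissed_docs   # fix came from previously dismissed docs/source
    )
```

Any single condition is enough to fire, mirroring the "ANY of the following" wording in the diff.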