[AZ-233] Update Docker Compose and enhance test documentation

- Modified the Docker Compose configuration to include an input root for replay tests and added an environment variable for enabling SITL.
- Enhanced documentation for various testing processes, including the addition of a Runtime Completeness Decomposition Gate and clarifications on internal module testing requirements.
- Updated the implementation completeness report to reflect the current state and added new test cases for performance and resilience scenarios.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-06 05:03:48 +03:00
parent 2485763d09
commit cab7b5d020
20 changed files with 265 additions and 41 deletions
+12 -1
View File
@@ -32,6 +32,17 @@ After selecting a mode, read its corresponding workflow below; do not mix them.
## Functional Mode
### 0. System-Under-Test Reality Gate
Before accepting any functional, blackbox, or e2e result as a pass, verify what the tests actually exercised.
1. If `_docs/00_problem/input_data/expected_results/results_report.md` exists, at least one e2e/blackbox run must compare actual product outputs against that mapping or the machine-readable files it references.
2. Stubs are allowed only for external systems outside the product boundary: flight controller/SITL, QGC observer, satellite-provider/Suite service, physical Jetson hardware, physical camera, unavailable licensed datasets, and network services.
3. Stubs, fakes, deterministic fallbacks, monkeypatches, or direct replacement of internal product modules are not allowed for the behavior under test. Internal examples include VIO, safety/anchor wrapper, satellite retrieval, anchor verification, tile manager, MAVLink output adapter, FDR, and the A-Z localization pipeline.
4. If tests pass only because an internal module is fake/scaffolded, classify the run as **failed** with category `missing product implementation`.
5. If a scenario is blocked because external hardware/data is absent, verify the production code path exists before accepting the block as legitimate. Missing internal production code is not an environment block.
6. If the test runner writes CSV/Markdown reports, inspect them. A zero exit code is not enough; blocked/internal-stubbed scenarios still require classification.
### 1. Detect Test Runner
Check in order — first match wins:
@@ -94,7 +105,7 @@ Categorize skips as: **explicit skip (dead code)**, **runtime skip (unreachable)
### 5. Handle Outcome
**All tests pass, zero skipped** → return success to the autodev for auto-chain.
**All tests pass, zero skipped, and the System-Under-Test Reality Gate passes** → return success to the autodev for auto-chain.
**Any test fails or errors** → this is a **blocking gate**. Never silently ignore failures. **Always investigate the root cause before deciding on an action.** Read the failing test code, read the error output, check service logs if applicable, and determine whether the bug is in the test or in the production code.