Autodev greenfield Step 8 closes with outcome "Code is testable — no changes needed" after reviewing the 41 test scenarios in _docs/02_document/tests/ against the codebase and the Step-8 allowed-changes checklist.

Key findings:
- Hardcoded paths are config defaults, overridable via Config dataclass
- All mutable registries expose clear_*_registry()/_reset_for_tests()
- Hot-path timing uses injected Clock; cosmetic timestamps are monkeypatch-safe (2105-test unit suite proves it)
- Heavy strategies (OKVIS2, VINS-Mono, FAISS, TRT) are BUILD_* gated
- compose_root(pre_constructed=...) (AZ-591) is the Tier-1 injection seam; tests/e2e/replay already drives it end-to-end

Artifacts:
- _docs/04_refactoring/01-testability-refactoring/testability_assessment.md
- State advanced to Step 9 (Decompose Tests)
- last_step_outcomes.step_8 recorded

Co-authored-by: Cursor <cursoragent@cursor.com>
Testability Assessment — Cycle 1 (Greenfield Step 8)
Run: _docs/04_refactoring/01-testability-refactoring/
Date: 2026-05-16
Outcome: Code is testable — no changes needed.
Auto-chain target: Step 9 — Decompose Tests
1. Inputs Reviewed
- _docs/02_document/tests/traceability-matrix.md
- _docs/02_document/tests/environment.md
- _docs/02_document/tests/test-data.md
- _docs/02_document/tests/blackbox-tests.md
- _docs/02_document/tests/resilience-tests.md
- _docs/02_document/tests/performance-tests.md
- _docs/02_document/tests/security-tests.md
- _docs/02_document/tests/resource-limit-tests.md
- _docs/03_implementation/implementation_completeness_cycle1_report.md (gate verdict: PASS-with-BLOCKED; zero FAIL after AZ-591)
2. Test Surface Snapshot
| Tier | Scenarios | Driver | Public boundaries exercised |
|---|---|---|---|
| Tier-1 (workstation Docker) | All FT-P-*, FT-N-*, NFT-RES-*, NFT-SEC-*, NFT-LIM-* except those below | e2e-runner (pytest in container) | frame source, FC inbound (MAVLink/MSP2 replayer), tile cache RO mount, FC outbound observed via SITL, FDR filesystem (post-run), GCS observed via mavproxy-listener |
| Tier-2 (Jetson hardware) | NFT-PERF-01..04, NFT-LIM-01, NFT-LIM-04, NFT-LIM-05, AC-NEW-5 chamber | hardware-attached runner | Same public boundaries; adds NVML/Tegra release file probes, which are correctly skipif-gated in pytest |
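The skipif gating mentioned for the Tier-2 probes could look roughly like the sketch below. The path /etc/nv_tegra_release comes from this assessment; the helper name `is_jetson_host` and its `release_file` parameter are hypothetical, introduced only so the probe itself can be exercised on a workstation.

```python
from pathlib import Path

# Path named in this assessment; everything else in this block is illustrative.
TEGRA_RELEASE_FILE = Path("/etc/nv_tegra_release")


def is_jetson_host(release_file: Path = TEGRA_RELEASE_FILE) -> bool:
    """Return True when the Tegra release file exists (hypothetical helper)."""
    return release_file.is_file()


# In a pytest module, the probe would typically feed a skipif marker, e.g.:
#   requires_jetson = pytest.mark.skipif(
#       not is_jetson_host(), reason="Tier-2 (Jetson hardware) only"
#   )
# Tests decorated with @requires_jetson are then skipped on Tier-1 hosts.
```

Because the file path is a parameter, both branches of the probe can be checked without Jetson hardware by pointing it at a temporary file.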
All scenarios are blackbox: they NEVER import SUT modules, NEVER touch private state, and observe SUT only via public I/O surfaces.
3. Testability Checklist Per Step-8 Allowed-Changes Categories
| Category | Verdict | Evidence |
|---|---|---|
| Hardcoded file paths / directory references | OK | Every hit (/var/lib/gps_denied_onboard/..., /var/lib/gps-denied/..., /tmp/replay.jsonl, /var/lib/azaion/c10/cache, /etc/nv_tegra_release) is a default value inside a dataclass config field (schema.py, c1_vio/config.py, c6_tile_cache/config.py, c7_inference/config.py, c12_operator_orchestrator/config.py). Tests override via Config(...) dataclass construction; e2e tests bind-mount the actual production paths inside a Docker volume. /etc/nv_tegra_release is read only by the Jetson host-tuple probe, already skipif-gated. |
| Hardcoded configuration values (URLs / credentials / magic numbers) | OK | No http:// / https:// URL hardcoded in src/. MAVLink signing passkey loaded via Docker secret. All magic numbers (rate limits, ms thresholds, drain sleeps) are either constants tagged to the AC that owns them or constructor params with documented defaults. |
| Global mutable state | OK | All registries (_STRATEGY_REGISTRY, _STATE_REGISTRY, _POSE_REGISTRY, _FC_REGISTRY, _GCS_REGISTRY, _COMPONENT_REGISTRY, _DEFAULT_REGISTRY in c7 architecture, _LAZY_NAMES) and caches (fdr_client._CACHE) export a clear_*_registry() or _reset_for_tests() companion. Confirmed by greps of clear_strategy_registry, clear_pose_registry, clear_state_registry, clear_component_registry, _reset_for_tests. AZ-591 added a per-process bootstrap (register_airborne_strategies()) that tests can isolate using the existing clear helpers. |
| Tight coupling to external services without abstraction | OK | Pymavlink / MSP2 adapters are built behind MavlinkTransport and MspTransport interfaces (c8). Paramiko SSH is built behind c12 operator-orchestrator's strategy factories. FAISS / TensorRT / ML runtimes are build-flag-gated (BUILD_FAISS, BUILD_TRT, BUILD_OKVIS2, etc.) and constructed via factory wrappers. Mock-suite-sat-service replaces the parent-suite Satellite API at the docker-compose layer; the SUT never embeds a real cloud client. |
| Missing dependency injection / non-configurable parameters | OK | compose_root(config, pre_constructed=...) (AZ-591) is the canonical injection seam. Every strategy/factory takes Config + named kwargs for its dependencies. FileFdrWriter takes flight_root, flight_id, config, fdr_clients, gcs_alert, on_rotation, record_kind_policy, drain_sleep_s, clock — all injectable. |
| Direct filesystem operations without path configurability | OK | All filesystem writes route through Path arguments bound at construction time (FDR writer, tile cache, descriptor index, c10 provisioner, replay JSONL sink). No module-level open() / Path() to fixed paths in business code. |
| Inline construction of heavy dependencies (models, clients) | OK | Heavy strategies — OKVIS2 ThreadedSlam, VINS-Mono, FAISS HNSW, ONNX-TRT runtimes, MegaLoc/MixVPR/SALAD/SelaVPR/UltraVPR/EigenPlaces models — are lazy-imported through per-component factories (vio_factory, vpr_factory, inference_factory, rerank_factory, matcher_factory, refiner_factory) and gated behind BUILD_* env flags. Default Tier-1 path runs KLT-RANSAC + no-VPR + no-rerank. |
| Time / clock | OK with note | Hot-path / safety-critical timing already uses injected Clock (c2_vpr engines, c8 FC adapters, c11 tile manager, c12 reloc service, c13 FDR writer, c1_vio strategies, c5_state estimators, c10 provisioner, etc.). Cosmetic datetime.now() calls (_iso_now, ts=datetime.now(tz=timezone.utc).isoformat()) are confined to ISO-timestamp helpers and overridable in tests via monkeypatch.setattr. The 2105-test unit suite proves this pattern works. |
4. Composition-Root Seam (AZ-591, just landed)
The Step-7 implementation report identified the compose_root(pre_constructed=...) extension as the production blocker; it was implemented in Batch 66.
Implication for tests:
- E2E (blackbox) tests get the full production composition by running docker compose up against docker/Dockerfile. They never touch pre_constructed.
- Unit and integration tests that drive compose_root directly (existing pattern in tests/e2e/replay/test_az401_compose_root_replay.py, tests/unit/test_az270_compose_root.py, tests/unit/runtime_root/test_az591_airborne_bootstrap.py) inject infrastructure stubs through pre_constructed.
- Tier-1 strategy selection happens entirely through Config(c1_vio=..., c2_vpr=..., ...); no test needs to monkeypatch _STRATEGY_REGISTRY for ordinary scenarios.
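The injection seam described above can be sketched as follows. Only the call shape `compose_root(config, pre_constructed=...)` and the `c1_vio` / `klt_ransac` names come from this document; the component keys, the `Runtime` container, and `_build_default` are hypothetical stand-ins for whatever the real composition root builds.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class Config:
    c1_vio: str = "klt_ransac"  # strategy selected through config, not monkeypatching


@dataclass
class Runtime:
    components: Dict[str, Any] = field(default_factory=dict)


# Hypothetical component names, used only for this illustration.
COMPONENT_NAMES = ("vio", "fc_transport", "fdr_writer")


def _build_default(name: str, config: Config) -> Any:
    # Stand-in for the production factories.
    if name == "vio":
        return f"vio:{config.c1_vio}"
    return f"real:{name}"


def compose_root(
    config: Config, pre_constructed: Optional[Mapping[str, Any]] = None
) -> Runtime:
    """Build the runtime, preferring any caller-supplied component."""
    provided = dict(pre_constructed or {})
    unknown = set(provided) - set(COMPONENT_NAMES)
    if unknown:
        raise KeyError(f"unknown pre_constructed components: {sorted(unknown)}")
    runtime = Runtime()
    for name in COMPONENT_NAMES:
        if name in provided:
            runtime.components[name] = provided[name]  # test-injected stub
        else:
            runtime.components[name] = _build_default(name, config)
    return runtime
```

A unit test would pass, for example, `pre_constructed={"fc_transport": stub}` and receive a runtime in which only that component is replaced while everything else follows the production path.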
5. Watch-Items (NON-Blocking)
These are not testability defects per the Step-8 allowed-changes list, but they are observations for future refactor cycles or test-spec sync (Step 12):
- Direct datetime.now() in c13_fdr/writer.py::_iso_now, c13_fdr/cap_policy.py::_iso_now, c11_tile_manager/tile_uploader.py::_iso_now: tests that assert exact ts field equality must monkeypatch the helper or use schema-shape assertions. The blackbox harness already does the latter: FDR records are validated by schema + value-range, not by exact timestamp.
- BUILD_OKVIS2 / BUILD_VINS_MONO strategies block-on-import (AZ-592 / AZ-593, deferred Tier-2): C++ binding linkage requires the Jetson toolchain. Tier-1 tests parameterize over okvis2 only when BUILD_OKVIS2=ON is honored by the docker build arg; default Tier-1 build pins BUILD_VINS_MONO=OFF and the matrix exercises klt_ransac everywhere. No source change needed; documented in environment.md.
- Component-internal registries (c7 _DEFAULT_REGISTRY) require explicit register() calls in test fixtures: c5_state and c7 architecture registries do not lazy-import on first lookup. Tests that exercise these strategies must call the relevant register() (e.g. gtsam_isam2_estimator.register()), or rely on register_airborne_strategies() which already chains the calls. This is by design (it keeps test isolation explicit), not a defect.
None of the watch-items requires a source-code change to enable Step-9 test decomposition.
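For the first watch-item, a schema-shape assertion on the ts field might look like the sketch below. The `_iso_now` body mirrors the expression quoted in the checklist; `make_record` and `assert_ts_shape` are hypothetical helpers, not part of the harness.

```python
from datetime import datetime, timezone


def _iso_now() -> str:
    # Cosmetic timestamp helper, as quoted in the checklist.
    return datetime.now(tz=timezone.utc).isoformat()


def make_record(kind: str) -> dict:
    """Hypothetical record producer that stamps each record with _iso_now()."""
    return {"kind": kind, "ts": _iso_now()}


def assert_ts_shape(record: dict, not_before: datetime, not_after: datetime) -> None:
    """Validate ts by format and value range instead of exact equality."""
    ts = datetime.fromisoformat(record["ts"])  # raises ValueError if malformed
    assert ts.tzinfo is not None, "ts must be timezone-aware"
    assert not_before <= ts <= not_after, "ts outside the expected window"


# Usage: bracket the call with two wall-clock reads and check the window.
before = datetime.now(tz=timezone.utc)
record = make_record("pose")
after = datetime.now(tz=timezone.utc)
assert_ts_shape(record, before, after)
```

Tests that genuinely need an exact value would instead replace `_iso_now` on its module with `monkeypatch.setattr`, as the checklist notes.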
6. Outcome
Code is testable — no changes needed.
The greenfield decomposition (Steps 1–7) produced a codebase whose every external boundary is named in _docs/02_document/components/, every dependency is constructor-injected, every heavy strategy is build-flag-gated, every mutable global has a reset helper, and the composition root accepts pre-constructed infrastructure for test injection. The 41 blackbox / NFR test scenarios in _docs/02_document/tests/ can be implemented against the existing public surfaces without modifying source code.
Step 8 closes with no list-of-changes.md and no testability_changes_summary.md. Auto-chain advances to Step 9 — Decompose Tests (test-task generation only, no source changes).