gps-denied-onboard/_docs/04_refactoring/01-testability-refactoring/testability_assessment.md
Oleksandr Bezdieniezhnykh 7a71579428 Step 8: Code Testability Revision — no changes needed
Autodev greenfield Step 8 closes with outcome
"Code is testable — no changes needed" after reviewing the 41 test
scenarios in _docs/02_document/tests/ against the codebase and the
Step-8 allowed-changes checklist.

Key findings:
- Hardcoded paths are config defaults, overridable via Config dataclass
- All mutable registries expose clear_*_registry()/_reset_for_tests()
- Hot-path timing uses injected Clock; cosmetic timestamps are
  monkeypatch-safe (2105-test unit suite proves it)
- Heavy strategies (OKVIS2, VINS-Mono, FAISS, TRT) are BUILD_* gated
- compose_root(pre_constructed=...) (AZ-591) is the Tier-1 injection
  seam; tests/e2e/replay already drives it end-to-end

Artifacts:
- _docs/04_refactoring/01-testability-refactoring/
  testability_assessment.md
- State advanced to Step 9 (Decompose Tests)
- last_step_outcomes.step_8 recorded

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-16 13:05:43 +03:00


Testability Assessment — Cycle 1 (Greenfield Step 8)

  • Run: _docs/04_refactoring/01-testability-refactoring/
  • Date: 2026-05-16
  • Outcome: Code is testable — no changes needed.
  • Auto-chain target: Step 9 — Decompose Tests

1. Inputs Reviewed

  • _docs/02_document/tests/traceability-matrix.md
  • _docs/02_document/tests/environment.md
  • _docs/02_document/tests/test-data.md
  • _docs/02_document/tests/blackbox-tests.md
  • _docs/02_document/tests/resilience-tests.md
  • _docs/02_document/tests/performance-tests.md
  • _docs/02_document/tests/security-tests.md
  • _docs/02_document/tests/resource-limit-tests.md
  • _docs/03_implementation/implementation_completeness_cycle1_report.md (gate verdict: PASS-with-BLOCKED; zero FAIL after AZ-591)

2. Test Surface Snapshot

| Tier | Scenario count | Driver | Public boundaries exercised |
|---|---|---|---|
| Tier-1 (workstation Docker) | All `FT-P-*`, `FT-N-*`, `NFT-RES-*`, `NFT-SEC-*`, `NFT-LIM-*` except those below | e2e-runner (pytest in container) | frame source, FC inbound (MAVLink/MSP2 replayer), tile cache RO mount, FC outbound observed via SITL, FDR filesystem (post-run), GCS observed via mavproxy-listener |
| Tier-2 (Jetson hardware) | NFT-PERF-01..04, NFT-LIM-01, NFT-LIM-04, NFT-LIM-05, AC-NEW-5 chamber | hardware-attached runner | Same public boundaries; adds NVML/Tegra release file probes which are correctly skipif-gated in pytest |

All scenarios are blackbox: they NEVER import SUT modules, NEVER touch private state, and observe SUT only via public I/O surfaces.
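
The skip gating mentioned for the Tier-2 hardware probes can be sketched with the standard library alone. This is an illustration under assumptions, not the project's test code: `on_jetson`, `HostTupleProbeTest` and `run_suite` are hypothetical names, and `unittest.skipUnless` stands in for the pytest `skipif` marker the suite is reported to use. Only the `/etc/nv_tegra_release` path is taken from this assessment.

```python
import unittest
from pathlib import Path

# Release file named in this assessment; present only on Jetson hosts.
TEGRA_RELEASE = Path("/etc/nv_tegra_release")


def on_jetson(release_file: Path = TEGRA_RELEASE) -> bool:
    """Hypothetical helper: treat the host as a Jetson if the release file exists."""
    return release_file.is_file()


class HostTupleProbeTest(unittest.TestCase):
    """Hypothetical Tier-2-only test; it is skipped, not failed, on a workstation."""

    @unittest.skipUnless(on_jetson(), "Tier-2 only: requires the Jetson release file")
    def test_release_file_is_non_empty(self) -> None:
        text = TEGRA_RELEASE.read_text(encoding="utf-8", errors="replace")
        self.assertTrue(text.strip())


def run_suite() -> unittest.TestResult:
    """Run the probe test and return the raw result for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HostTupleProbeTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

On a Tier-1 workstation the test is reported as skipped and the run still counts as successful, which is the behaviour the gate is meant to provide.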

3. Testability Checklist Per Step-8 Allowed-Changes Categories

| Category | Verdict | Evidence |
|---|---|---|
| Hardcoded file paths / directory references | OK | Every hit (/var/lib/gps_denied_onboard/..., /var/lib/gps-denied/..., /tmp/replay.jsonl, /var/lib/azaion/c10/cache, /etc/nv_tegra_release) is a default value inside a dataclass config field (schema.py, c1_vio/config.py, c6_tile_cache/config.py, c7_inference/config.py, c12_operator_orchestrator/config.py). Tests override via Config(...) dataclass construction; e2e tests bind-mount the actual production paths inside a Docker volume. /etc/nv_tegra_release is read only by the Jetson host-tuple probe, already skipif-gated. |
| Hardcoded configuration values (URLs / credentials / magic numbers) | OK | No http:// / https:// URL hardcoded in src/. MAVLink signing passkey loaded via Docker secret. All magic numbers (rate limits, ms thresholds, drain sleeps) are either constants tagged to the AC that owns them or constructor params with documented defaults. |
| Global mutable state | OK | All registries (_STRATEGY_REGISTRY, _STATE_REGISTRY, _POSE_REGISTRY, _FC_REGISTRY, _GCS_REGISTRY, _COMPONENT_REGISTRY, _DEFAULT_REGISTRY in c7 architecture, _LAZY_NAMES) and caches (fdr_client._CACHE) export a clear_*_registry() or _reset_for_tests() companion. Confirmed by greps of clear_strategy_registry, clear_pose_registry, clear_state_registry, clear_component_registry, _reset_for_tests. AZ-591 added a per-process bootstrap (register_airborne_strategies()) that tests can isolate using the existing clear helpers. |
| Tight coupling to external services without abstraction | OK | Pymavlink / MSP2 adapters built behind MavlinkTransport and MspTransport interfaces (c8). Paramiko SSH is built behind c12 operator-orchestrator's strategy factories. FAISS / TensorRT / ML runtimes are build-flag-gated (BUILD_FAISS, BUILD_TRT, BUILD_OKVIS2, etc.) and constructed via factory wrappers. Mock-suite-sat-service replaces the parent-suite Satellite API at the docker-compose layer; the SUT never embeds a real cloud client. |
| Missing dependency injection / non-configurable parameters | OK | compose_root(config, pre_constructed=...) (AZ-591) is the canonical injection seam. Every strategy/factory takes Config + named kwargs for its dependencies. FileFdrWriter takes flight_root, flight_id, config, fdr_clients, gcs_alert, on_rotation, record_kind_policy, drain_sleep_s, clock — all injectable. |
| Direct filesystem operations without path configurability | OK | All filesystem writes route through Path arguments bound at construction time (FDR writer, tile cache, descriptor index, c10 provisioner, replay JSONL sink). No module-level open() / Path() to fixed paths in business code. |
| Inline construction of heavy dependencies (models, clients) | OK | Heavy strategies — OKVIS2 ThreadedSlam, VINS-Mono, FAISS HNSW, ONNX-TRT runtimes, MegaLoc/MixVPR/SALAD/SelaVPR/UltraVPR/EigenPlaces models — are lazy-imported through per-component factories (vio_factory, vpr_factory, inference_factory, rerank_factory, matcher_factory, refiner_factory) and gated behind BUILD_* env flags. Default Tier-1 path runs KLT-RANSAC + no-VPR + no-rerank. |
| Time / clock | OK with note | Hot-path / safety-critical timing already uses injected Clock (c2_vpr engines, c8 FC adapters, c11 tile manager, c12 reloc service, c13 FDR writer, c1_vio strategies, c5_state estimators, c10 provisioner, etc.). Cosmetic datetime.now() calls (_iso_now, ts=datetime.now(tz=timezone.utc).isoformat()) are confined to ISO-timestamp helpers and overrideable in tests via monkeypatch.setattr. The 2105-test unit suite proves this pattern works. |
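
The two timing patterns in the last row can be sketched as follows. This is a minimal illustration under assumptions, not the project's code: the `Clock` protocol and its `monotonic_s` method, `FakeClock`, `RateLimiter`, `make_record` and `demo_patched_timestamp` are hypothetical, and `unittest.mock.patch.dict` on the module globals stands in for pytest's `monkeypatch.setattr` so that the sketch needs only the standard library. Only the `_iso_now` helper name and its `datetime.now(tz=timezone.utc).isoformat()` body come from the assessment.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Protocol
from unittest import mock


class Clock(Protocol):
    """Hypothetical minimal clock interface; the real Clock may differ."""

    def monotonic_s(self) -> float: ...


@dataclass
class FakeClock:
    """Deterministic clock that a test advances by hand."""

    now_s: float = 0.0

    def monotonic_s(self) -> float:
        return self.now_s

    def advance(self, dt_s: float) -> None:
        self.now_s += dt_s


class RateLimiter:
    """Hypothetical hot-path component: all timing comes from the injected clock."""

    def __init__(self, min_interval_s: float, clock: Clock) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_s: Optional[float] = None

    def allow(self) -> bool:
        now = self._clock.monotonic_s()
        if self._last_s is not None and now - self._last_s < self._min_interval_s:
            return False
        self._last_s = now
        return True


def _iso_now() -> str:
    """Cosmetic wall-clock helper, same shape as the helpers named in the table."""
    return datetime.now(tz=timezone.utc).isoformat()


def make_record(kind: str) -> Dict[str, str]:
    # _iso_now is looked up at call time, so replacing the module attribute works.
    return {"kind": kind, "ts": _iso_now()}


def demo_patched_timestamp() -> Dict[str, str]:
    # Temporarily replace the helper; the original is restored on exit.
    fixed = "2026-05-16T00:00:00+00:00"
    with mock.patch.dict(globals(), {"_iso_now": lambda: fixed}):
        return make_record("pose")
```

The design point is the split itself: behaviour that depends on elapsed time never reads the wall clock, so a test controls it through the constructor, while the cosmetic timestamp is isolated in a single helper that can be swapped out.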

4. Composition-Root Seam (AZ-591, just landed)

The Step-7 implementation report identified the compose_root(pre_constructed=...) extension as the production blocker; it was implemented in Batch 66.

Implication for tests:

  • E2E (blackbox) tests get the full production composition by running docker compose up against docker/Dockerfile. They never touch pre_constructed.
  • Unit and integration tests that drive compose_root directly (existing pattern in tests/e2e/replay/test_az401_compose_root_replay.py, tests/unit/test_az270_compose_root.py, tests/unit/runtime_root/test_az591_airborne_bootstrap.py) inject infrastructure stubs through pre_constructed.
  • Tier-1 strategy selection happens entirely through Config(c1_vio=..., c2_vpr=..., ...); no test needs to monkeypatch _STRATEGY_REGISTRY for ordinary scenarios.
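
The shape of this seam can be sketched as below. It is a heavily reduced illustration under assumptions: the real `compose_root` signature, return type and component names are not shown in this assessment, so `Root`, `VioConfig`, `FileFdrSink`, the `fdr_root` field, its `/var/lib/example/fdr` default and the `_DEFAULT_FACTORIES` table are all hypothetical. Only the `compose_root(config, pre_constructed=...)` call shape, the `Config(c1_vio=...)` selection style and the `klt_ransac` default are taken from the text.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Dict, Mapping, Optional


@dataclass(frozen=True)
class VioConfig:
    strategy: str = "klt_ransac"  # Tier-1 default strategy named in the assessment


@dataclass(frozen=True)
class Config:
    """Hypothetical stand-in for the project's Config dataclass."""

    c1_vio: VioConfig = field(default_factory=VioConfig)
    # A path default lives in the config, so tests override it by construction.
    fdr_root: Path = Path("/var/lib/example/fdr")


class FileFdrSink:
    """Hypothetical component whose filesystem root is bound at construction."""

    def __init__(self, root: Path) -> None:
        self.root = root


@dataclass
class Root:
    vio: Any
    fdr_writer: Any


# Production factories, keyed by component name (hypothetical names).
_DEFAULT_FACTORIES: Dict[str, Callable[[Config], Any]] = {
    "vio": lambda cfg: "vio:" + cfg.c1_vio.strategy,
    "fdr_writer": lambda cfg: FileFdrSink(cfg.fdr_root),
}


def compose_root(
    config: Config, pre_constructed: Optional[Mapping[str, Any]] = None
) -> Root:
    """Build every component from config unless the caller supplies an instance."""
    supplied = dict(pre_constructed or {})
    unknown = supplied.keys() - _DEFAULT_FACTORIES.keys()
    if unknown:
        raise KeyError("unknown pre_constructed components: %s" % sorted(unknown))
    parts = {
        name: supplied[name] if name in supplied else factory(config)
        for name, factory in _DEFAULT_FACTORIES.items()
    }
    return Root(**parts)
```

A unit test would then call something like `compose_root(Config(), pre_constructed={"fdr_writer": stub})` and receive the production wiring with only the stubbed component replaced, while strategy choice stays a matter of `Config(c1_vio=VioConfig(strategy=...))`.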

5. Watch-Items (NON-Blocking)

These are not testability defects per the Step-8 allowed-changes list, but they are observations for future refactor cycles or test-spec sync (Step 12):

  1. Direct datetime.now() in c13_fdr/writer.py::_iso_now, c13_fdr/cap_policy.py::_iso_now, c11_tile_manager/tile_uploader.py::_iso_now: tests that assert exact ts field equality must monkeypatch the helper or use schema-shape assertions. The blackbox harness already does the latter — FDR records are validated by schema + value-range, not by exact timestamp.
  2. BUILD_OKVIS2/BUILD_VINS_MONO strategies block-on-import (AZ-592 / AZ-593, deferred Tier-2): C++ binding linkage requires the Jetson toolchain. Tier-1 tests parameterize over okvis2 only when BUILD_OKVIS2=ON is honored by the docker build arg; default Tier-1 build pins BUILD_VINS_MONO=OFF and the matrix exercises klt_ransac everywhere. No source change needed; documented in environment.md.
  3. Component-internal registries (c7 _DEFAULT_REGISTRY) require explicit register() calls in test fixtures: c5_state and c7 architecture registries do not lazy-import on first lookup. Tests that exercise these strategies must call the relevant register() (e.g. gtsam_isam2_estimator.register()), or rely on register_airborne_strategies() which already chains the calls. This is by design — keeps test isolation explicit — not a defect.

None of the watch-items requires a source-code change to enable Step-9 test decomposition.

6. Outcome

Code is testable — no changes needed.

The greenfield decomposition (Steps 1–7) produced a codebase whose every external boundary is named in _docs/02_document/components/, every dependency is constructor-injected, every heavy strategy is build-flag-gated, every mutable global has a reset helper, and the composition root accepts pre-constructed infrastructure for test injection. The 41 blackbox / NFR test scenarios in _docs/02_document/tests/ can be implemented against the existing public surfaces without modifying source code.

Step 8 closes with no list-of-changes.md and no testability_changes_summary.md. Auto-chain advances to Step 9 — Decompose Tests (test-task generation only, no source changes).