mirror of https://github.com/azaion/gps-denied-onboard.git synced 2026-06-21 07:11:13 +00:00

Oleksandr Bezdieniezhnykh 827d4fe644 [AZ-240] Update product implementation and task decomposition processes

- Refined task decomposition steps to ensure implementation tasks are atomic and complexity does not exceed 5 points.
- Enhanced the product implementation process with a completeness gate to verify task outcomes against architecture promises before proceeding to testing.
- Updated dependencies table to reflect new tasks and their relationships, ensuring all test tasks are linked to product remediation tasks.
- Adjusted workflow documentation to clarify entry points for task decomposition and implementation contexts.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-05 01:02:25 +03:00


Test Infrastructure

Task: AZ-233_test_infrastructure
Name: Test Infrastructure
Description: Scaffold the blackbox and e2e test project: runner, deterministic fixtures, isolated replay/SITL environment, reporting, and external dependency stubs.
Complexity: 5 points
Dependencies: AZ-240_native_vio_backend_integration, AZ-241_real_satellite_vpr_descriptor_retrieval, AZ-242_real_anchor_feature_matching_ransac
Component: Blackbox Tests
Tracker: AZ-233
Epic: AZ-218

Test Project Folder Layout

e2e/
├── replay/
│   ├── run_replay.py
│   ├── scenarios/
│   └── reports/
├── fixtures/
│   ├── cache/
│   ├── mavlink/
│   ├── telemetry/
│   └── expected/
├── tests/
│   ├── test_still_image_replay.py
│   ├── test_vio_replay.py
│   ├── test_satellite_anchor.py
│   ├── test_blackout_spoofing.py
│   ├── test_resource_limits.py
│   └── test_security_gates.py
├── mocks/
│   ├── satellite_cache_stub/
│   ├── ardupilot_sitl/
│   └── qgc_observer/
└── reports/

Layout Rationale

The test project keeps blackbox/e2e runner code outside product runtime internals. Scenario definitions, fixtures, mocks, and reports are separated so tests can reset state between runs and produce release evidence without importing private component modules.

Test implementation starts only after remediation tasks AZ-240, AZ-241, and AZ-242 close the native VIO, real satellite VPR, and real anchor matching gaps found during autodev verification.

Mock Services

Mock Service	Replaces	Interfaces	Behavior
`satellite_cache_stub`	Offline Azaion Suite Satellite Service cache package	Local COG/manifest/descriptor fixture volume	Serves preloaded valid, stale, unsigned, hash-mismatched, and low-resolution cache fixtures; never performs network fetches during flight-mode tests.
`ardupilot_sitl`	ArduPilot Plane flight controller	MAVLink telemetry and `GPS_INPUT` receiving path	Emits generated IMU, attitude, GPS health, spoofing, and failsafe traces; records injected `GPS_INPUT` for assertions.
`qgc_observer`	QGroundControl status consumer	MAVLink/tlog parser	Records downsampled `STATUSTEXT`, status, and failsafe messages for rate and content assertions.

Mock Control API

Each mock or runner fixture must expose deterministic scenario controls for normal replay, stale cache, missing cache, spoofed GPS, blackout, restart, and resource-load modes. Recorded interactions must be queryable after each test run for assertions.
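
The control surface described above could be sketched as follows. This is a minimal illustration, not the project's API: `ScenarioMode`, `MockControl`, and their methods are hypothetical names, and the `fix_type` payload is a placeholder value. Only the seven mode names come from the paragraph above.

```python
from dataclasses import dataclass, field
from enum import Enum


class ScenarioMode(Enum):
    # The seven modes named in the Mock Control API paragraph.
    NORMAL_REPLAY = "normal_replay"
    STALE_CACHE = "stale_cache"
    MISSING_CACHE = "missing_cache"
    SPOOFED_GPS = "spoofed_gps"
    BLACKOUT = "blackout"
    RESTART = "restart"
    RESOURCE_LOAD = "resource_load"


@dataclass
class MockControl:
    """Hypothetical control surface for one mock or runner fixture."""
    name: str
    mode: ScenarioMode = ScenarioMode.NORMAL_REPLAY
    _interactions: list = field(default_factory=list, repr=False)

    def set_mode(self, mode: ScenarioMode) -> None:
        self.mode = mode

    def record(self, kind: str, payload: dict) -> None:
        # Copy the payload so later mutation by the caller cannot alter the log.
        self._interactions.append(
            {"mode": self.mode.value, "kind": kind, "payload": dict(payload)}
        )

    def interactions(self, kind=None) -> list:
        # Query recorded interactions, optionally filtered by message kind.
        return [i for i in self._interactions if kind is None or i["kind"] == kind]

    def reset(self) -> None:
        # Return to a known state before the next scenario group.
        self.mode = ScenarioMode.NORMAL_REPLAY
        self._interactions.clear()


# Usage: switch the SITL mock to spoofing mode and record an injected message.
sitl = MockControl("ardupilot_sitl")
sitl.set_mode(ScenarioMode.SPOOFED_GPS)
sitl.record("GPS_INPUT", {"fix_type": 0})  # placeholder payload
recorded = sitl.interactions("GPS_INPUT")
```

Tagging each recorded interaction with the active mode lets an assertion confirm which scenario produced it.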

Docker Test Environment

`docker-compose.test.yml` Structure

Service	Image / Build	Purpose	Depends On
`gps-denied-service`	Project runtime image or local package mount	System under test	`satellite-cache-stub`
`replay-consumer`	Python replay/test harness	Feeds frames, telemetry, cache data, and faults	`gps-denied-service`, mock services
`satellite-cache-stub`	Fixture volume/service	Provides offline cache manifests, sidecars, descriptors, and generated invalid variants	none
`ardupilot-plane-sitl`	SITL container or local process wrapper	Validates `GPS_INPUT`, spoofing, and failsafe behavior	`gps-denied-service`
`qgc-observer`	MAVLink log parser	Verifies GCS-visible status output	`ardupilot-plane-sitl`
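
As a sanity check on the "Depends On" column, the start order it implies can be derived with a topological sort. The sketch below uses the standard-library `graphlib` module (Python 3.9+). It assumes that "mock services" in the `replay-consumer` row means all three mocks; `DEPENDS_ON` and `startup_order` are illustrative names, not part of the project.

```python
from graphlib import TopologicalSorter

# Service -> services it depends on, transcribed from the table above.
# Assumption: "mock services" expands to the three mock containers.
DEPENDS_ON = {
    "satellite-cache-stub": set(),
    "gps-denied-service": {"satellite-cache-stub"},
    "ardupilot-plane-sitl": {"gps-denied-service"},
    "qgc-observer": {"ardupilot-plane-sitl"},
    "replay-consumer": {
        "gps-denied-service",
        "satellite-cache-stub",
        "ardupilot-plane-sitl",
        "qgc-observer",
    },
}


def startup_order(depends_on: dict) -> list:
    # static_order() yields dependencies before dependents and raises
    # graphlib.CycleError if the table contains a cycle.
    return list(TopologicalSorter(depends_on).static_order())


order = startup_order(DEPENDS_ON)
```

Under that assumption the cache stub starts first and the replay consumer starts last, so faults are injected only once every observer is in place.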

Networks and Volumes

replay-net: connects the runtime, replay consumer, and satellite-cache stub.
sitl-net: connects the runtime, ArduPilot Plane SITL, and QGC observer.
input-data: read-only mount for _docs/00_problem/input_data/.
expected-results: read-only mount for expected coordinate and report fixtures.
derkachi-replay: read-only mount for flight_derkachi.mp4 and data_imu.csv.
satellite-cache: fixture cache volume with valid and invalid manifests.
fdr-output: fresh per-run output volume for FDR and report artifacts.

Test Runner Configuration

Framework: Python pytest-style replay harness.
Entry point: run-blackbox-replay or an equivalent pytest command that executes scenario groups and writes reports.
Reports: CSV summary plus FDR validation Markdown.

Fixture Strategy

Fixture	Scope	Purpose
`project_60_still_images`	session	Provides 60 nadir images and expected WGS84 centers.
`derkachi_video_telemetry`	session	Provides synchronized video, IMU, and `GLOBAL_POSITION_INT` replay data.
`cache_integrity_fixtures`	function	Provides valid, stale, unsigned, hash-mismatched, and low-resolution cache variants.
`sitl_spoofing_scenarios`	function	Provides generated GPS loss/spoofing and blackout traces.
`public_nadir_vio_candidates`	optional/session	Provides public or representative synchronized datasets when available.

Test Data Fixtures

Data Set	Source	Format	Used By
`project_60_still_images`	`_docs/00_problem/input_data/`	JPG + metadata	Still-image accuracy, confidence, latency smoke
`expected_frame_centers`	`_docs/00_problem/input_data/coordinates.csv` and expected-results report	CSV/Markdown	Geolocation assertions
`derkachi_video_telemetry`	`_docs/00_problem/input_data/flight_derkachi/`	MP4 + CSV	VIO replay, latency, resilience
`cache_integrity_fixtures`	generated fixture volume	COG/manifest/sidecar/index fixtures	Cache freshness, poisoning, no-fetch tests
`sitl_spoofing_scenarios`	generated by SITL harness	MAVLink/tlog traces	Spoofing, blackout, failsafe, GCS status
`public_nadir_vio_candidates`	pinned external fixtures	dataset-specific	Final VIO and satellite-anchor validation

Data Isolation

Every run uses read-only input fixtures and fresh run-scoped output directories. FDR, generated tiles, tlogs, and reports are written only to per-run output volumes. Mock state and generated fixtures are reset before each scenario group.
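
One way to enforce run-scoped outputs in the harness is to create a uniquely named directory per run and refuse any write path that resolves outside it. The sketch below is an assumption about how that might look: `new_run_dir` and `resolve_output` are hypothetical helpers, and a temporary directory stands in for the fdr-output volume.

```python
import tempfile
import uuid
from pathlib import Path


def new_run_dir(output_root: Path) -> Path:
    """Create a fresh, uniquely named output directory for one run."""
    run_dir = output_root / f"run-{uuid.uuid4().hex}"
    run_dir.mkdir(parents=True, exist_ok=False)  # fail rather than reuse a directory
    return run_dir


def resolve_output(run_dir: Path, relative: str) -> Path:
    """Return a path under run_dir, rejecting anything that escapes it."""
    target = (run_dir / relative).resolve()
    if not target.is_relative_to(run_dir.resolve()):  # Python 3.9+
        raise ValueError(f"{relative!r} escapes the run-scoped output directory")
    return target


# Usage: the temporary directory stands in for the per-run output volume.
with tempfile.TemporaryDirectory() as root:
    run_dir = new_run_dir(Path(root))
    fdr_path = resolve_output(run_dir, "fdr/flight.log")
    fdr_path.parent.mkdir(parents=True)
    fdr_path.write_text("placeholder FDR content\n")
    wrote_inside_run_dir = fdr_path.is_file()
```

Read-only mounts remain the primary guard for input fixtures; the path check only catches harness code that builds an output path incorrectly.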

Test Reporting

Format: CSV summary and Markdown evidence report.
Output paths: test-results/blackbox-report.csv and test-results/fdr-validation-summary.md.
Required columns: test ID, test name, input dataset, execution time, result, error distance, source label, covariance 95% semi-major, GPS_INPUT.fix_type, and error message.
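
A report writer that rejects rows with missing columns could look like the sketch below. The snake_case header spellings, the `write_report` helper, and the example row are assumptions for illustration; the spec fixes only the list of required columns, not their exact names or units.

```python
import csv
import io

# Assumed header spellings for the ten required columns.
REQUIRED_COLUMNS = [
    "test_id", "test_name", "input_dataset", "execution_time", "result",
    "error_distance", "source_label", "covariance_95_semi_major",
    "gps_input_fix_type", "error_message",
]


def write_report(rows, stream) -> None:
    """Write rows as CSV, refusing any row that lacks a required column."""
    # DictWriter itself raises ValueError for unexpected extra keys.
    writer = csv.DictWriter(stream, fieldnames=REQUIRED_COLUMNS)
    writer.writeheader()
    for row in rows:
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            raise ValueError(f"row {row.get('test_id')!r} is missing {missing}")
        writer.writerow(row)


# Usage: a placeholder row for a blocked test; metric columns stay empty.
example_row = {column: "" for column in REQUIRED_COLUMNS}
example_row.update(
    test_id="EXAMPLE-001",
    test_name="placeholder",
    result="blocked",
    error_message="prerequisite unavailable",
)
buffer = io.StringIO()
write_report([example_row], buffer)
```

When writing to test-results/blackbox-report.csv instead of an in-memory buffer, open the file with `newline=""` as the `csv` module documentation recommends.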

Acceptance Criteria

AC-1: Test environment starts
Given the Docker/replay test environment
When the test stack starts
Then the runtime, replay consumer, cache fixture, SITL, and observer services are reachable or report a clear blocked prerequisite.

AC-2: External dependency stubs are deterministic
Given a scenario config for cache, MAVLink, QGC, or fixture behavior
When the replay consumer executes it
Then mocks produce repeatable responses and expose recorded interactions for assertions.

AC-3: Test runner executes scenario groups
Given valid fixtures and a running test environment
When the test runner starts
Then it discovers and executes blackbox, performance, resilience, security, and resource-limit scenario groups.

AC-4: Reports are generated
Given a completed or blocked test run
When reporting finishes
Then CSV and Markdown evidence files are written with the required columns, metrics, artifact paths, and blocked-prerequisite reasons.

Non-Functional Requirements

Reliability

Missing hardware, public datasets, calibration, or SITL prerequisites are reported as blocked, not passed.
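
The distinction between blocked and passed can be made explicit in the harness by checking prerequisites before the test body runs at all. The following is a minimal sketch under that assumption; `Outcome`, `Prerequisite`, and `evaluate` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    FAILED = "failed"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class Prerequisite:
    name: str
    available: bool
    reason: str = ""


def evaluate(prerequisites, run_test):
    """Return (outcome, message); the test body runs only if nothing is missing."""
    missing = [p for p in prerequisites if not p.available]
    if missing:
        # Never fall through to PASSED when a prerequisite is absent.
        reasons = "; ".join(f"{p.name}: {p.reason or 'unavailable'}" for p in missing)
        return Outcome.BLOCKED, reasons
    try:
        run_test()
    except AssertionError as exc:
        return Outcome.FAILED, str(exc)
    return Outcome.PASSED, ""


# Usage: a missing SITL binary blocks the test instead of passing it.
outcome, message = evaluate(
    [Prerequisite("ardupilot-plane-sitl", False, "SITL binary not installed")],
    run_test=lambda: None,
)
```

The returned message doubles as the blocked-prerequisite reason that the report requires.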

Security

Fixture stubs must not access external satellite-provider or Suite service networks during flight-mode test scenarios.

Data Isolation

No test may mutate source fixtures or write FDR/generated-tile artifacts outside run-scoped output paths.

Constraints

The test suite must use public runtime boundaries only: navigation frames, telemetry, offline cache, MAVLink output, QGC status, and FDR outputs.
The suite must not import private estimator, BASALT, wrapper, or tile-manager internals.
Hardware-specific Jetson gates remain release-gate tests and may be skipped or blocked in ordinary local replay.
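
The import constraint can be checked statically before the suite runs. The sketch below parses a test file with the standard-library `ast` module and lists imports whose dotted path contains a forbidden component. The entries in `FORBIDDEN` and the `gps_denied` package name are hypothetical; the real private module names would need to be substituted.

```python
import ast

# Hypothetical module-name components standing in for the private internals
# named in the constraint (estimator, BASALT, wrapper, tile manager).
FORBIDDEN = ("estimator", "basalt", "wrapper", "tile_manager")


def forbidden_imports(source: str) -> list:
    """Return imported dotted names that contain a forbidden component."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            base = node.module or ""  # empty for "from . import x"
            names = [f"{base}.{a.name}" if base else a.name for a in node.names]
        else:
            continue
        for name in names:
            if any(part in FORBIDDEN for part in name.split(".")):
                found.append(name)
    return found


# Usage: scan the text of one test module.
sample = (
    "import json\n"
    "from gps_denied.estimator import Filter\n"
    "import gps_denied.tile_manager.cache\n"
)
violations = forbidden_imports(sample)
```

This only catches static `import` statements; dynamic imports through `importlib` would need a separate check.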

Risks & Mitigation

Risk 1: Environment prerequisites hide real failures

Risk: Missing hardware, calibration, or datasets could be treated as success.
Mitigation: Report unavailable prerequisites as blocked with explicit artifact evidence.

Risk 2: Fixture mutation contaminates later runs

Risk: Generated FDR, cache, or SITL output changes expected input fixtures.
Mitigation: Use read-only fixture mounts and fresh run-scoped output volumes for every execution.
