# Testing Patterns

**Analysis Date:** 2026-04-01

## Test Framework

**Runner:**
- `pytest` >= 8.0
- Config: `pyproject.toml` `[tool.pytest.ini_options]` section
- `testpaths = ["tests"]`
- `asyncio_mode = "auto"`: async tests are collected and run without a per-test `@pytest.mark.asyncio` decorator (some tests still apply it explicitly for clarity)
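
As a rough illustration of what `asyncio_mode = "auto"` allows, the sketch below writes an async test as a plain coroutine function. `fetch_altitude` is a hypothetical stand-in, not a function from this codebase.

```python
import asyncio


async def fetch_altitude() -> float:
    """Hypothetical coroutine standing in for any awaitable under test."""
    await asyncio.sleep(0)  # yield to the event loop once
    return 120.0


# With asyncio_mode = "auto", pytest-asyncio collects this coroutine
# function as a test without an explicit @pytest.mark.asyncio marker.
async def test_fetch_altitude():
    assert await fetch_altitude() == 120.0
```

Outside pytest, the same coroutine can be driven manually with `asyncio.run(test_fetch_altitude())`.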

**Async Extension:**
- `pytest-asyncio` >= 0.24

**HTTP Client for API tests:**
- `httpx` >= 0.28 with `ASGITransport` — in-process FastAPI testing, no real server needed

**Run Commands:**
```bash
pytest                        # Run all tests
pytest tests/test_<module>.py # Run single file
pytest -v                     # Verbose with test names
pytest -x                     # Stop on first failure
# No coverage command configured; no pytest-cov in dependencies
```

## Test File Organization

**Location:** All tests live in a flat `tests/` directory with no sub-directories.

**Naming:** `test_<module>.py` maps to the component being tested:

| Test File | Component Tested | Component ID |
|-----------|-----------------|--------------|
| `test_coordinates.py` | `CoordinateTransformer` | F13 |
| `test_vo.py` | `SequentialVisualOdometry` | F07 |
| `test_gpr.py` | `GlobalPlaceRecognition` | F08 |
| `test_metric.py` | `MetricRefinement` | F09 |
| `test_graph.py` | `FactorGraphOptimizer` | F10 |
| `test_recovery.py` | `FailureRecoveryCoordinator` | F11 |
| `test_chunk_manager.py` | `RouteChunkManager` | F12 |
| `test_rotation.py` | `ImageRotationManager` | F06 |
| `test_pipeline.py` | `ImageInputPipeline` | F05 |
| `test_satellite.py` | `SatelliteDataManager` + `mercator` utils | F04, H06 |
| `test_models.py` | `ModelManager` | F16 |
| `test_processor_full.py` | `FlightProcessor` orchestration | F15 (Stage 10) |
| `test_acceptance.py` | Full pipeline acceptance scenarios | AC-1 through AC-6 |
| `test_api_flights.py` | REST API endpoints | HTTP integration |
| `test_schemas.py` | Pydantic schemas + config | Domain validation |
| `test_database.py` | `FlightRepository` + DB models | Persistence layer |
| `test_health.py` | `/health` endpoint | Smoke test |

No `conftest.py` exists — fixtures are defined locally per test file.

## Test Count Summary

| File | Test Functions |
|------|---------------|
| `test_schemas.py` | 16 (class-based: 6 `TestXxx` classes) |
| `test_database.py` | 9 |
| `test_acceptance.py` | 6 |
| `test_api_flights.py` | 5 |
| `test_vo.py` | 5 |
| `test_satellite.py` | 5 |
| `test_coordinates.py` | 4 |
| `test_graph.py` | 4 |
| `test_rotation.py` | 4 |
| `test_processor_full.py` | 4 |
| `test_pipeline.py` | 3 |
| `test_gpr.py` | 3 |
| `test_metric.py` | 3 |
| `test_models.py` | 3 |
| `test_chunk_manager.py` | 3 |
| `test_recovery.py` | 2 |
| `test_health.py` | 1 |
| **Total** | **80 tests** |

## Test Structure

**Fixture-based setup (dominant pattern):**
```python
@pytest.fixture
def transformer():
    return CoordinateTransformer()

def test_gps_to_enu(transformer):
    ...
```

**Class-based grouping (schemas only):**
```python
class TestGPSPoint:
    def test_valid(self): ...
    def test_lat_out_of_range(self): ...
```
Only `test_schemas.py` uses class-based grouping. All other files use module-level test functions.

**Async fixtures:**
```python
@pytest.fixture
async def session():
    engine = create_async_engine("sqlite+aiosqlite://", echo=False)
    ...
    async with async_session() as s:
        yield s
    await engine.dispose()
```

## Mocking Patterns

**`unittest.mock.MagicMock` / `AsyncMock`:**
Used to stub repository and SSE streamer in processor/acceptance tests:
```python
repo = MagicMock()
streamer = MagicMock()
streamer.push_event = AsyncMock()
proc = FlightProcessor(repo, streamer)
```
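
Because `push_event` is an `AsyncMock`, a test can let the code under test await it and then assert on the recorded awaits. The sketch below uses a hypothetical `TinyProcessor` in place of `FlightProcessor`, whose real methods are not shown in this document; the event payload shape is likewise an assumption.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class TinyProcessor:
    """Hypothetical stand-in for FlightProcessor (illustration only)."""

    def __init__(self, repo, streamer):
        self._repo = repo
        self._streamer = streamer

    async def process_frame(self, frame_id: int) -> None:
        # Emit one event per processed frame.
        await self._streamer.push_event({"frame_id": frame_id})


async def run_frames(n: int) -> AsyncMock:
    repo = MagicMock()
    streamer = MagicMock()
    streamer.push_event = AsyncMock()
    proc = TinyProcessor(repo, streamer)
    for i in range(n):
        await proc.process_frame(i)
    return streamer.push_event


push_event = asyncio.run(run_frames(3))
assert push_event.await_count == 3               # one await per frame
push_event.assert_awaited_with({"frame_id": 2})  # checks the most recent await
```

Using `AsyncMock` only for the awaited method keeps the rest of the streamer a plain `MagicMock`, which matches the pattern shown above.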

**`monkeypatch`:**
Used to force specific VO outcomes or override alignment methods mid-test:
```python
monkeypatch.setattr(processor._vo, "compute_relative_pose", bad_vo)
monkeypatch.setattr(processor._recovery, "process_chunk_recovery", lambda *a, **k: False)
```
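
One way to build a replacement such as `bad_vo` is a small wrapper that fails on a chosen call and delegates otherwise. This is a sketch under assumptions: `fail_on_call`, `FakeVO`, and the use of `None` as the failure value are hypothetical, since the real `compute_relative_pose` signature and failure convention are not shown here.

```python
from typing import Any, Callable


def fail_on_call(real: Callable[..., Any], failing_call: int) -> Callable[..., Any]:
    """Wrap `real` so that the N-th call (1-based) returns None instead."""
    calls = {"n": 0}

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        calls["n"] += 1
        if calls["n"] == failing_call:
            return None  # assumed failure signal; the real code may differ
        return real(*args, **kwargs)

    return wrapper


class FakeVO:
    """Hypothetical stand-in for the visual odometry component."""

    def compute_relative_pose(self, prev, curr):
        return ("pose", prev, curr)


vo = FakeVO()
# Inside a pytest test, monkeypatch.setattr(vo, "compute_relative_pose", ...)
# would do the same and also restore the original at teardown.
setattr(vo, "compute_relative_pose", fail_on_call(vo.compute_relative_pose, 5))
results = [vo.compute_relative_pose(i, i + 1) for i in range(6)]
```

Only the fifth call fails, so frames before and after it still go through the original method.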

**`MockInferenceEngine` (production mock, not test mock):**
Located in `src/gps_denied/core/models.py`. `ModelManager` always returns `MockInferenceEngine` instances — no real ML models are loaded at any point. The mock generates deterministic random numpy arrays of the expected shapes. This means all component tests run against fake inference, not real SuperPoint/LightGlue/LiteSAM.

**What is NOT mocked:**
- Coordinate math (`CoordinateTransformer`) — tested with real arithmetic
- SQLite database — in-memory `aiosqlite` is used for all DB tests (real ORM, real SQL)
- FastAPI app — `ASGITransport` runs the real app in-process
- Mercator utilities — tested with real computations

## Database Test Pattern

All DB tests and API tests use in-memory SQLite to avoid test pollution:
```python
engine = create_async_engine("sqlite+aiosqlite://", echo=False)
async with engine.begin() as conn:
    await conn.run_sync(Base.metadata.create_all)
```
Foreign key cascade enforcement requires explicit SQLite pragma (applied via SQLAlchemy event):
```python
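from sqlalchemy import event

@event.listens_for(engine.sync_engine, "connect")
def _set_sqlite_pragma(dbapi_connection, connection_record):
    # SQLite leaves foreign key enforcement off by default on each new connection.
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()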
```
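
The effect of the pragma can be seen without SQLAlchemy. The sketch below uses the standard-library `sqlite3` module rather than the project's async engine, and the `flights`/`waypoints` tables are illustrative only; they do not mirror the real schema.

```python
import sqlite3


def count_orphans(enable_fk: bool) -> int:
    """Delete a parent row and return how many child rows survive."""
    conn = sqlite3.connect(":memory:")
    if enable_fk:
        # Must run outside a transaction, so issue it right after connecting.
        conn.execute("PRAGMA foreign_keys=ON")
    conn.execute("CREATE TABLE flights (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE waypoints ("
        " id INTEGER PRIMARY KEY,"
        " flight_id INTEGER REFERENCES flights(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO flights (id) VALUES (1)")
    conn.execute("INSERT INTO waypoints (id, flight_id) VALUES (10, 1)")
    conn.execute("DELETE FROM flights WHERE id = 1")
    (remaining,) = conn.execute("SELECT COUNT(*) FROM waypoints").fetchone()
    conn.close()
    return remaining


print(count_orphans(enable_fk=False))  # child row survives: cascade ignored
print(count_orphans(enable_fk=True))   # child row removed by ON DELETE CASCADE
```

Without the pragma, a cascade-delete test would pass or fail for the wrong reason, which is why the listener is attached to every new connection.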

## API Test Pattern

```python
@pytest.fixture
async def client(override_get_session) -> AsyncClient:
    async with AsyncClient(
        transport=ASGITransport(app=app), base_url="http://test"
    ) as ac:
        yield ac
```
The `override_get_session` fixture patches `app.dependency_overrides[get_session]` with the in-memory SQLite session. Tests issue HTTP requests through the in-process ASGI transport via `client.post(...)` and `client.get(...)`; no network socket is opened.

## Test Data

**No external fixtures or data files.** All test data is constructed inline:
- `GPSPoint`, `CameraParameters`, `Waypoint` objects instantiated directly in tests
- Images: `np.zeros(...)` or `np.random.randint(...)` numpy arrays
- FLIGHT_PAYLOAD dict defined at module level in `test_api_flights.py`
- Database CAM dict defined at module level in `test_database.py`

**No `fixtures/` or `data/` directory exists.**

## Acceptance Tests (`test_acceptance.py`)

Six scenarios are implemented, covering key pipeline behaviors:

| Test | Scenario |
|------|----------|
| `test_ac1_normal_flight` | 20 frames, no crash, 20 SSE events emitted |
| `test_ac2_tracking_loss_and_recovery` | VO fails on frame 5 → RECOVERY → back to NORMAL |
| `test_ac3_performance_per_frame` | max < 5s/frame, avg < 1s (mock pipeline) |
| `test_ac4_user_anchor_fix` | `add_absolute_factor(is_user_anchor=True)` updates trajectory |
| `test_ac5_sustained_throughput` | 50 frames < 30s total |
| `test_ac6_graph_optimization_convergence` | `optimize()` reports `converged=True` |

These use the fully wired `FlightProcessor` with all real components attached (but mock ML inference).
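
The per-frame budget in `test_ac3_performance_per_frame` can be sketched as a simple timing loop. This is a synchronous simplification under assumptions: `time_frames` and `fake_process` are hypothetical, and the real test presumably drives the async `FlightProcessor` instead.

```python
import time
from statistics import mean
from typing import Callable, List


def time_frames(process: Callable[[int], None], n_frames: int) -> List[float]:
    """Return wall-clock seconds spent on each of `n_frames` calls."""
    durations = []
    for frame_id in range(n_frames):
        start = time.perf_counter()
        process(frame_id)
        durations.append(time.perf_counter() - start)
    return durations


def fake_process(frame_id: int) -> None:
    """Hypothetical stand-in for one pipeline step."""
    sum(range(1000))


durations = time_frames(fake_process, 20)
assert max(durations) < 5.0   # per-frame ceiling from AC-3
assert mean(durations) < 1.0  # average budget from AC-3
```

Since inference is mocked, these thresholds bound orchestration overhead only and say nothing about real model latency.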

## Coverage Gaps

### Not Tested At Unit Level

**SSE streamer (`src/gps_denied/core/sse.py`):**
No dedicated `test_sse.py`. SSE push calls are verified only indirectly via `AsyncMock.assert_called_*` in processor tests.

**Results manager (`src/gps_denied/core/results.py`):**
No test file. Not exercised directly.

**App lifespan / startup (`src/gps_denied/app.py`):**
Component wiring in the FastAPI lifespan handler is not tested. API integration tests bypass lifespan by overriding the session dependency only.

**`pixel_to_gps` accuracy:**
`test_coordinates.py` exercises the round trip only against the placeholder math. The real ray-casting implementation is explicitly noted as a placeholder, so no test verifies the correct geometric result.

**`image_object_to_gps`:**
Tested only that it doesn't crash and returns the origin (because the underlying `pixel_to_gps` is a fake). No accuracy assertion possible until real implementation exists.

**`transform_points`:**
`CoordinateTransformer.transform_points` is a stub that returns the input unchanged. No test covers it.

**MAVLink output / flight controller integration:**
Not testable from the current codebase — no `mavlink` module exists yet. Blackbox tests FT-P-05, FT-P-09 require SITL.

**Confidence scoring / `FlightStatusResponse.confidence` field:**
`FlightStatusResponse` has no `confidence` field in the schema; the SSE event dict sets `confidence` but it is a `float` (0.0–1.0), not the `HIGH`/`MEDIUM`/`LOW` tier described in blackbox tests FT-P-07, FT-P-08.
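
To make the mismatch concrete, the sketch below maps a float score to a tier label. The function name and both thresholds (0.8 and 0.5) are purely hypothetical; neither the schema nor the blackbox document, as summarized here, defines them.

```python
def confidence_tier(score: float, high: float = 0.8, medium: float = 0.5) -> str:
    """Map a 0.0-1.0 confidence score to HIGH / MEDIUM / LOW.

    The default thresholds are illustrative assumptions, not project values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0.0, 1.0], got {score}")
    if score >= high:
        return "HIGH"
    if score >= medium:
        return "MEDIUM"
    return "LOW"
```

Until a mapping of this kind is agreed on, FT-P-07 and FT-P-08 cannot be asserted against the current SSE payload.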

**IMU / ESKF dead reckoning:**
Not implemented. No tests. Referenced in blackbox FT-N-06.

**`/flights/{flightId}/delete` (bulk), `/flights` (list), `/flights/{flightId}/frames/{frameId}/object-to-gps`:**
`test_api_flights.py` only covers create, get detail, upload batch, user-fix, and status. Delete, list, and object-to-GPS endpoints are not covered by API integration tests.

### Blackbox Tests vs. Implemented Tests

The `_docs/02_document/tests/blackbox-tests.md` defines **14 positive** (FT-P-01 to FT-P-14) and **7 negative** (FT-N-01 to FT-N-07) scenarios = **21 planned blackbox tests**, none of which are implemented as runnable code. They require:
- SITL ArduPilot environment
- camera-replay Docker service
- satellite tile server
- MAVLink capture infrastructure
- ground truth `coordinates.csv` and `flight-sequence-60` dataset

All 21 blackbox scenarios are documentation-only at this point.

### Traceability Matrix Coverage

Per `_docs/02_document/tests/traceability-matrix.md`:
- **26 Acceptance Criteria**: 24/26 covered by planned blackbox tests; 2 explicitly not coverable (AC-25 MRE internal metric, AC-26 imagery age)
- **16 Restrictions**: 15/16 covered; 1 not coverable (RESTRICT-05 sunny weather)
- Overall planned blackbox coverage: 93% of requirements
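
The 93% figure follows from pooling the two rows above, as this short check shows (the variable names are illustrative):

```python
# (covered, total) pairs taken from the traceability summary above
covered = {"acceptance_criteria": (24, 26), "restrictions": (15, 16)}

total_covered = sum(c for c, _ in covered.values())
total = sum(t for _, t in covered.values())
overall_pct = 100 * total_covered / total

print(f"{total_covered}/{total} = {overall_pct:.1f}%")  # 39/42 = 92.9%
```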

The gap between **planned** (blackbox doc) and **implemented** (pytest suite) is large: all 21 blackbox scenarios remain unimplemented.

## Coverage Configuration

No `pytest-cov` in dependencies. No coverage targets defined. No `.coveragerc` or `coverage` configuration in `pyproject.toml`.

Coverage is not enforced in CI (no CI configuration file detected).
