ui/_docs/02_tasks/todo/AZ-456_test_infrastructure.md
2026-05-11 00:59:46 +03:00

Test Infrastructure

Task: AZ-456_test_infrastructure
Name: Test Infrastructure
Description: Scaffold the Blackbox test project: Vitest + jsdom + MSW for the fast profile, Playwright + the suite docker-compose stack for the e2e profile, ripgrep + bun scripts for the static profile. Includes mock services (owm-stub, tile-stub), test data fixtures, isolation strategy, and CSV+JUnit reporting.
Complexity: 5 points
Dependencies: None
Component: Blackbox Tests
Tracker: AZ-456
Epic: AZ-455

Test Project Folder Layout

tests/                          # fast profile (colocated with src/, top-level dir for shared helpers)
├── setup.ts                    # Vitest global setup — RTL matchers + jsdom polyfills + MSW server.listen
├── msw/
│   ├── server.ts               # MSW Node server (Vitest) — used by every fast test
│   ├── handlers/
│   │   ├── admin.ts            # /api/admin/* default handlers (auth, users, classes-write, GPS)
│   │   ├── flights.ts          # /api/flights/* default handlers (CRUD + live-GPS SSE simulator)
│   │   ├── annotations.ts      # /api/annotations/* default handlers (media + annotations + dataset + SSE)
│   │   ├── detect.ts           # /api/detect/* default handlers (sync image detect)
│   │   ├── loader.ts           # /api/loader/* default handlers
│   │   ├── resource.ts         # /api/resource/* default handlers
│   │   ├── owm.ts              # owm-stub equivalent for fast (OpenWeatherMap canned wind data)
│   │   └── tiles.ts            # tile-stub equivalent for fast (returns a 1×1 transparent PNG)
│   └── helpers.ts              # buildResponse, errorResponse, sse, latency, drop helpers
├── fixtures/
│   ├── enum_spec_snapshot.ts   # re-exports the committed _docs/00_problem/input_data/enum_spec_snapshot.json
│   ├── seed_users.ts           # 4 users matching seed_users (test-data.md)
│   ├── seed_aircraft.ts        # 3 aircraft, one default
│   ├── seed_flights.ts         # 5 flights, one with live-GPS wire-up
│   ├── seed_classes.ts         # [0..N-1, 20..20+N-1, 40..40+N-1] N≥9
│   ├── seed_media.ts           # 6 media with mediaStatus enum coverage
│   ├── seed_annotations.ts     # AI + Manual, splitTile valid + malformed
│   └── seed_user_settings.ts   # known selectedFlightId + panelWidths for op_alice
├── helpers/
│   ├── render.tsx              # renderWithProviders (router + auth + i18n)
│   ├── auth.ts                 # seedBearer / clearBearer (uses setToken from src/api/client.ts)
│   ├── navigate.ts             # seedNavigateToLogin spy (uses setNavigateToLogin from src/api/client.ts)
│   └── sse-mock.ts             # EventSource polyfill helper for MSW SSE handlers
└── (test files colocated alongside source, e.g. src/api/client.test.ts, src/features/flights/FlightsPage.test.tsx)

e2e/                            # e2e profile
├── docker-compose.suite-e2e.yml    # Suite stack + azaion-ui + owm-stub + tile-stub + playwright-runner
├── playwright.config.ts        # Chromium + Firefox projects per AC-18; baseURL = http://azaion-ui:80
├── stubs/
│   ├── owm/                    # ./testing/stubs/owm — tiny Node/Bun HTTP server, canned /data/2.5/* responses
│   │   ├── Dockerfile
│   │   └── server.ts
│   └── tile/                   # ./testing/stubs/tile — returns 256x256 PNG for /{z}/{x}/{y} and /sat/{z}/{y}/{x}
│       ├── Dockerfile
│       └── server.ts
├── runner/                     # ./testing/runner — Playwright runner container image
│   ├── Dockerfile
│   └── entrypoint.sh           # bun run test:e2e + writes reports to /output
├── tests/
│   ├── auth.e2e.ts             # Group 1 e2e coverage
│   ├── flights.e2e.ts          # Groups 4, 11, 16
│   ├── annotations.e2e.ts      # Groups 3, 10, 16
│   ├── dataset.e2e.ts          # Group 5
│   ├── admin.e2e.ts            # Groups 1, 13
│   ├── i18n.e2e.ts             # Group 8
│   ├── upload.e2e.ts           # Group 6
│   └── bundle.e2e.ts           # Group 7 (post-build inspection)
├── fixtures/
│   └── seeds.sql               # suite-side seed loader hooked into admin/flights/annotations init
└── results/                    # mounted to playwright-runner:/output; CSV + JUnit + traces

scripts/
├── run-tests.sh                # ALREADY EXISTS — extend in Step 6 to actually shell out to vitest / playwright
└── run-performance-tests.sh    # ALREADY EXISTS — extend in Step 6

Layout Rationale

  • Colocated *.test.tsx next to source (fast): Vitest convention; one test file per src file keeps the import graph shallow and the change-radius obvious. Shared MSW handlers + fixtures + helpers live under tests/ so they're discoverable in one place.
  • Dedicated e2e/ tree: Playwright + the docker-compose stack are a hermetic black-box layer; physically separating them from tests/ prevents accidental cross-imports between fast and e2e (the e2e runner has no access to src/).
  • testing/stubs/ (named per environment.md): the suite's existing convention for stub services. Keeps Dockerfiles small and independently buildable.
  • Reports under test-output/: matches environment.md § Reporting and the Docker compose snippet exactly — no path drift.

Mock Services

| Mock Service | Replaces | Endpoints | Behavior |
|---|---|---|---|
| owm-stub (e2e Docker) | OpenWeatherMap (E10) | GET /data/2.5/weather?lat=&lon=&appid=&units=metric | Returns { wind: { speed, deg } } matching the wind computation expected by flightPlanUtils.test.ts. Two canned response sets keyed by lat,lon. |
| tile-stub (e2e Docker) | OSM + Esri tile servers | GET /{z}/{x}/{y}.png, GET /sat/{z}/{y}/{x} | Always returns a 256×256 PNG; logs every request so tile-coverage tests can assert. |
| tests/msw/handlers/*.ts (fast) | Every suite service + OWM + tiles | per file | Per-test handler override via server.use(...). Resets between tests. SSE via MSW WebSocket adapter or EventSourcePolyfill test double. |
| Real admin/, flights/, annotations/, detect/, loader/, resource/, gps-denied-{desktop,onboard}/, autopilot/ (e2e Docker) | n/a (they ARE the system in e2e) | per suite contract | Provided by the parent suite's images. Test-mode endpoints (POST /test-only/reset, embedded LiveGPS + annotation-status event generators) enabled via a non-production build flag. |

Mock Control API

  • Fast (MSW): per-test server.use(...) for handler overrides; server.resetHandlers() in afterEach. No HTTP control API needed — it's in-process.
  • owm-stub + tile-stub (e2e): POST /mock/config to swap canned response sets; GET /mock/log to read recorded requests. Used by resilience and performance tests.
  • admin/ test-only: POST /test-only/reset resets all three DBs to seed state between isolation buckets (per test-data.md).
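
The stub control API described above can be sketched as a small in-memory core. This is only an illustration under stated assumptions: the class name OwmStubCore, the POST /mock/config payload shape (an object keyed by "lat,lon"), and the 204/404 status codes are hypothetical and not taken from the suite; the HTTP server wrapper around it is omitted.

```typescript
// Hypothetical in-memory core of owm-stub. Payload shapes and status codes are assumptions.
type Wind = { speed: number; deg: number };
type StubResponse = { status: number; body: unknown };

class OwmStubCore {
  // Canned wind data keyed by "lat,lon"; swapped wholesale via POST /mock/config.
  private canned = new Map<string, Wind>();
  // Every non-control request, in arrival order; exposed via GET /mock/log.
  private requestLog: string[] = [];

  handle(method: string, rawUrl: string, body?: Record<string, Wind>): StubResponse {
    const queryStart = rawUrl.indexOf("?");
    const path = queryStart === -1 ? rawUrl : rawUrl.slice(0, queryStart);
    const query = queryStart === -1 ? "" : rawUrl.slice(queryStart + 1);

    if (method === "POST" && path === "/mock/config") {
      this.canned = new Map(Object.entries(body ?? {}));
      return { status: 204, body: null };
    }
    if (method === "GET" && path === "/mock/log") {
      return { status: 200, body: this.requestLog.slice() };
    }

    // Control calls above are not recorded; everything else is.
    this.requestLog.push(`${method} ${rawUrl}`);
    if (method === "GET" && path === "/data/2.5/weather") {
      const params = new Map<string, string>();
      for (const pair of query.split("&")) {
        const eq = pair.indexOf("=");
        if (eq > 0) params.set(pair.slice(0, eq), pair.slice(eq + 1));
      }
      const wind = this.canned.get(`${params.get("lat")},${params.get("lon")}`);
      if (wind) return { status: 200, body: { wind } };
    }
    return { status: 404, body: { message: "no canned response" } };
  }
}
```

Keeping the core free of any HTTP dependency would let the same logic back both the Docker stub's server.ts and a unit test of the stub itself.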

Docker Test Environment

docker-compose.suite-e2e.yml Structure

| Service | Image / Build | Purpose | Depends On |
|---|---|---|---|
| azaion-ui | Build . (project Dockerfile, ARM64) | SPA under test (the system) | admin, flights, annotations, detect |
| admin | azaion/admin:test | Auth + RBAC + classes-write | test-db |
| flights | azaion/flights:test | Flight CRUD + live-GPS SSE simulator | test-db |
| annotations | azaion/annotations:test | Media + annotations + dataset + status SSE | test-db |
| detect | azaion/detect:test | Sync image detect | |
| loader, resource, gps-denied-desktop, gps-denied-onboard, autopilot | suite images | Auxiliary services hit by SPA | |
| test-db | postgres:16-alpine | Backs admin/flights/annotations | |
| owm-stub | Build ./e2e/stubs/owm | OpenWeatherMap stub | |
| tile-stub | Build ./e2e/stubs/tile | OSM + Esri tile stub | |
| playwright-runner | Build ./e2e/runner | Test runner (Chromium + Firefox) | azaion-ui, owm-stub, tile-stub |

Networks and Volumes

  • Network: azaion-test-net — isolated, no internet egress; the two stubs replace the only external hops (OWM, tiles).
  • Volumes:
    • test-db-data → test-db:/var/lib/postgresql/data (wiped with down -v between e2e runs).
    • seed-fixtures → admin:/seed, flights:/seed, annotations:/seed (read-only), loaded at service start from e2e/fixtures/seeds.sql.
    • test-output → playwright-runner:/output (mounted to ./test-output/ on host).

Test Runner Configuration

Fast profile

  • Framework: Vitest 3.x (Vite 6-compatible; matches S3 pin), run via bun run test:fast (a package.json script alias).
  • Environment: jsdom (default for component tests); a small subset of node-only tests opt in via // @vitest-environment node.
  • Plugins / libraries:
    • @testing-library/react ^16.x (React 19 compatible)
    • @testing-library/jest-dom ^6.x
    • @testing-library/user-event ^14.x
    • msw ^2.x (Node server + per-test handler override)
    • @vitest/coverage-v8 ^3.x
  • Entry point: bun run test:fast → vitest run --coverage --reporter=verbose --reporter=junit --outputFile.junit=./test-output/fast-report.xml.
  • Setup file: tests/setup.ts — wires jest-dom matchers, starts MSW server, polyfills EventSource if not in jsdom, registers global afterEach(() => { server.resetHandlers(); cleanup() }).
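
The fast-profile settings listed above could also be expressed in a config file rather than on the command line. The following is a minimal sketch, assuming a vitest.config.ts at the project root (the file name and its location are assumptions; the project may instead merge these options into an existing vite.config.ts).

```typescript
// vitest.config.ts (hypothetical location): mirrors the CLI flags of test:fast.
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    // jsdom by default; node-only tests opt out with `// @vitest-environment node`.
    environment: 'jsdom',
    // Wires jest-dom matchers, the MSW server, and the global afterEach.
    setupFiles: ['./tests/setup.ts'],
    // Colocated test files next to source.
    include: ['**/*.test.{ts,tsx}'],
    // Equivalent of --reporter=verbose --reporter=junit --outputFile.junit=...
    reporters: ['verbose', 'junit'],
    outputFile: { junit: './test-output/fast-report.xml' },
    // Equivalent of --coverage with @vitest/coverage-v8.
    coverage: { enabled: true, provider: 'v8' },
  },
});
```

With the options in the config, the test:fast script could shrink to vitest run; keeping the flags on the command line instead keeps the report path visible in package.json.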

E2E profile

  • Framework: Playwright ^1.49 with @playwright/test. Two browser projects: Chromium and Firefox (per AC-18).
  • Entry point: bun run test:e2e → PLAYWRIGHT_JUNIT_OUTPUT_NAME=./test-output/e2e-report.xml playwright test --reporter=list,junit --output=./test-output/e2e. The JUnit reporter takes its output path from that environment variable (or from the outputFile reporter option in the config), not from a CLI flag.
  • Config file: e2e/playwright.config.ts → baseURL: http://azaion-ui:80, workers: 1 for SSE-sensitive tests, retries: 1 on dev/stage, traces on first retry.
  • Network gate: Playwright context.route('**/api.openweathermap.org/**', route => route.abort()) plus an equivalent abort route for **/unpkg.com/** as defense-in-depth. The run fails if any external host is hit despite the env-var redirection that landed in autodev Step 4.
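
A possible shape for e2e/playwright.config.ts, given the settings above. This is a sketch, not the final file: it applies workers: 1 and retries: 1 globally for simplicity (the task scopes them to SSE-sensitive tests and dev/stage respectively), and the relative report paths are assumptions that may need adjusting to how the runner container mounts /output.

```typescript
// e2e/playwright.config.ts: minimal sketch under the assumptions stated above.
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  testMatch: '**/*.e2e.ts',
  // Playwright artifacts, including traces.
  outputDir: '../test-output/e2e',
  // Simplification: applied to every test, not only SSE-sensitive ones.
  workers: 1,
  // Simplification: applied in every environment, not only dev/stage.
  retries: 1,
  reporter: [
    ['list'],
    ['junit', { outputFile: '../test-output/e2e-report.xml' }],
  ],
  use: {
    baseURL: 'http://azaion-ui:80',
    trace: 'on-first-retry',
  },
  // Two browser projects per AC-18.
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
  ],
});
```

Setting the JUnit path through the reporter's outputFile option keeps it next to the other settings, so the entry-point command does not need an environment variable.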

Static profile

  • Framework: existing scripts/run-tests.sh (Step 4 deliverable) extended in Step 6 to run:
    • tsc -b --noEmit
    • ripgrep checks: no unpkg.com, no banned libraries (ML, signature, persistence, WS/GraphQL/SSR), no service-worker registration, no literal OWM key.
    • vite build + bundle-size assertion.
  • Entry point: ./scripts/run-tests.sh --static-only.
  • Output: ./test-output/static-report.csv.
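
The ripgrep checks above boil down to "no line in the tree matches a banned pattern, one report row per check". The sketch below shows that logic in TypeScript for illustration only; the task runs the real checks with ripgrep from scripts/run-tests.sh, and the two patterns and the names STATIC_RULES and runStaticChecks are hypothetical.

```typescript
// Illustrative only: the real checks run via ripgrep in scripts/run-tests.sh.
type CheckRow = { check: string; result: "PASS" | "FAIL"; detail: string };

// Hypothetical subset of the banned patterns.
const STATIC_RULES: { check: string; pattern: RegExp }[] = [
  { check: "no-unpkg", pattern: /unpkg\.com/ },
  { check: "no-service-worker", pattern: /serviceWorker\s*\.\s*register/ },
];

// files maps a path to its text content; returns one row per check.
function runStaticChecks(files: Record<string, string>): CheckRow[] {
  return STATIC_RULES.map(({ check, pattern }): CheckRow => {
    const hits: string[] = [];
    for (const [file, text] of Object.entries(files)) {
      text.split("\n").forEach((line, index) => {
        if (pattern.test(line)) hits.push(`${file}:${index + 1}`);
      });
    }
    return { check, result: hits.length === 0 ? "PASS" : "FAIL", detail: hits.join(" ") };
  });
}
```

A runner would exit non-zero if any row has result FAIL, matching the exit-code requirement of the static profile.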

Fixture Strategy

| Fixture | Scope | Purpose |
|---|---|---|
| renderWithProviders | function | Mount component with the same Router + AuthProvider + I18nextProvider tree as the production app |
| seedBearer(token) / clearBearer() | function | Set/clear the in-memory bearer via setToken (production accessor, no monkey-patching) |
| seedNavigateToLogin(spy) | function | Replace navigateToLoginImpl via setNavigateToLogin (production accessor from autodev Step 4 / C06) |
| mswServer | module-level | Single MSW Node server shared by all fast tests; reset between each test |
| loadEnumSnapshot() | function | Reads _docs/00_problem/input_data/enum_spec_snapshot.json for AC-04 / AC-29 contract assertions |
| Playwright page fixture | per-test | Standard Playwright fixture; extended with request-log interceptor that records every outbound request for assertion |

Test Data Fixtures

| Data Set | Source | Format | Used By |
|---|---|---|---|
| seed_users | e2e/fixtures/seeds.sql + tests/fixtures/seed_users.ts (mirror) | SQL (e2e) / TS (fast) | Groups 1, 13, 14, 16 |
| seed_aircraft | as above | SQL / TS | Groups 7, 16 |
| seed_flights | as above | SQL / TS | Groups 4, 11, 16 |
| seed_classes | as above | SQL / TS | Group 16 (DetectionClasses + AdminPage class CRUD) |
| seed_media | as above | SQL / TS | Groups 3, 5, 16, 17 |
| seed_annotations | as above | SQL / TS | Groups 3, 16, 17 |
| seed_user_settings | as above | SQL / TS | Group 4 (AC-21), Group 11 (AC-06) |
| enum_spec_snapshot | _docs/00_problem/input_data/enum_spec_snapshot.json (committed) | JSON | Group 2 (AC-04 / AC-29 static contract checks) |
| bundle_artifact | dist/ from vite build | filesystem | Group 7 (AC-11 / AC-31 / AC-33); static profile only |
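
The seed_classes ID layout noted in the folder layout ([0..N-1, 20..20+N-1, 40..40+N-1], N≥9) can be generated rather than hand-written. A minimal sketch follows; the function name seedClassIds is hypothetical, and the upper bound N ≤ 20 is an added assumption (above it the three ranges would overlap), not something the task states.

```typescript
// Generates the three ID ranges [0..N-1], [20..20+N-1], [40..40+N-1].
// Assumption: N is capped at 20 so the ranges never overlap.
function seedClassIds(n: number): number[] {
  if (!Number.isInteger(n) || n < 9 || n > 20) {
    throw new RangeError("N must be an integer between 9 and 20");
  }
  const ids: number[] = [];
  for (const offset of [0, 20, 40]) {
    for (let i = 0; i < n; i++) ids.push(offset + i);
  }
  return ids;
}
```

The same function could feed both tests/fixtures/seed_classes.ts and a script that emits the matching rows for e2e/fixtures/seeds.sql, which would keep the TS mirror and the SQL seed from drifting apart.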

Data Isolation

  • Fast: MSW handlers are per-test; server.resetHandlers() in afterEach. React tree unmounted via RTL cleanup. In-memory bearer + navigate-to-login spy reset via afterEach calls to setToken(null) + setNavigateToLogin(default).
  • E2E: fresh docker compose down -v && docker compose up -d per CI run. Within a run, tests are grouped into isolation buckets by data set; bucket teardown calls admin/'s POST /test-only/reset to restore the seed state. Buckets on different CI machines run on independent compose stacks. No cross-test order dependencies: any test must be able to boot the bucket's seed snapshot and run alone.
  • Static: no shared state; each script run is hermetic.
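
The e2e bucket scheme can be pictured as a plan of "run" and "reset" steps. The sketch below is illustrative only: the types and the planBuckets name are hypothetical, and the actual reset is the POST /test-only/reset call on admin/, represented here as a step rather than performed.

```typescript
// Hypothetical planner: groups tests by data set and appends a reset per bucket.
type E2eTest = { id: string; dataSet: string };
type Step = { kind: "run"; testId: string } | { kind: "reset"; bucket: string };

function planBuckets(tests: E2eTest[]): Step[] {
  // Map preserves insertion order, so buckets run in first-seen order.
  const buckets = new Map<string, string[]>();
  for (const test of tests) {
    const ids = buckets.get(test.dataSet) ?? [];
    ids.push(test.id);
    buckets.set(test.dataSet, ids);
  }
  const steps: Step[] = [];
  buckets.forEach((ids, bucket) => {
    for (const testId of ids) steps.push({ kind: "run", testId });
    // Bucket teardown: POST /test-only/reset restores the seed state.
    steps.push({ kind: "reset", bucket });
  });
  return steps;
}
```

Because every bucket ends with a reset, the next bucket always starts from the seed snapshot regardless of what the previous one mutated.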

Test Reporting

  • Format: CSV (suite rollup) + JUnit XML (CI ingestion) + Playwright traces (.zip) on failure.
  • Columns (CSV): Test ID, Test Name, Profile, Execution Time (ms), Result (PASS|FAIL|SKIP|QUARANTINE), Error Message, Traces to AC, Traces to results_report.md row.
  • Output paths:
    • ./test-output/fast-report.xml (JUnit)
    • ./test-output/fast-report.csv (rollup written by scripts/run-tests.sh)
    • ./test-output/e2e/ (Playwright artifacts incl. traces)
    • ./test-output/e2e-report.xml (JUnit)
    • ./test-output/static-report.csv
    • ./test-output/summary.csv (suite-level rollup combining the three)
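
The CSV rollup with the columns listed above could be serialized as follows. This is a sketch under assumptions: the ReportRow field names and the toReportCsv name are hypothetical, and quoting follows the usual CSV convention (fields containing commas, quotes, or newlines are wrapped in double quotes with inner quotes doubled).

```typescript
// Hypothetical row shape; the column titles come from the Test Reporting section.
type TestResult = "PASS" | "FAIL" | "SKIP" | "QUARANTINE";
interface ReportRow {
  testId: string;
  testName: string;
  profile: "fast" | "e2e" | "static";
  executionTimeMs: number;
  result: TestResult;
  errorMessage: string;
  tracesToAc: string;
  tracesToResultsRow: string;
}

const REPORT_HEADER: string[] = [
  "Test ID", "Test Name", "Profile", "Execution Time (ms)", "Result",
  "Error Message", "Traces to AC", "Traces to results_report.md row",
];

// Quote a field only when it contains a comma, quote, or line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toReportCsv(rows: ReportRow[]): string {
  const lines = [REPORT_HEADER.map(csvField).join(",")];
  for (const r of rows) {
    lines.push([
      r.testId, r.testName, r.profile, r.executionTimeMs, r.result,
      r.errorMessage, r.tracesToAc, r.tracesToResultsRow,
    ].map(csvField).join(","));
  }
  return lines.join("\n") + "\n";
}
```

The summary.csv rollup would then be the same serializer applied to the concatenated rows of the three profiles.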

Acceptance Criteria

AC-1: Test environment starts (e2e)
Given e2e/docker-compose.suite-e2e.yml,
When docker compose -f e2e/docker-compose.suite-e2e.yml up -d is executed,
Then azaion-ui, admin, flights, annotations, detect, owm-stub, tile-stub, test-db, and playwright-runner all become healthy and http://azaion-ui:80 returns the SPA HTML.

AC-2: Mock services respond (e2e)
Given the test environment is running,
When the test runner sends GET http://owm-stub:8081/data/2.5/weather?lat=0&lon=0&appid=test and GET http://tile-stub:8082/1/0/0.png,
Then owm-stub returns the canned OpenWeatherMap shape and tile-stub returns a 256×256 PNG.

AC-3: MSW intercepts every outbound API call (fast)
Given the fast profile is configured per this task,
When any test under the fast profile triggers a fetch to /api/<service>/...,
Then the MSW handler matches, no real network call leaves the process, and afterEach resets handlers to defaults.

AC-4: Fast runner executes
Given the fast profile is configured,
When bun run test:fast is executed,
Then Vitest discovers **/*.test.ts(x) files, runs them under jsdom, and produces ./test-output/fast-report.xml (JUnit) + a coverage summary.

AC-5: E2E runner executes both browsers
Given the e2e environment is running,
When bun run test:e2e is executed inside the playwright-runner container,
Then Playwright runs every e2e/tests/**/*.e2e.ts under both Chromium and Firefox projects and writes ./test-output/e2e-report.xml.

AC-6: Static runner executes
Given the working tree at HEAD,
When ./scripts/run-tests.sh --static-only is executed,
Then it runs tsc -b --noEmit, the ripgrep static checks, and vite build, and writes ./test-output/static-report.csv. Exit code is non-zero on any failure.

AC-7: Reports correctly shaped
Given any of the three runners completes,
When the corresponding JUnit XML or CSV is opened,
Then it contains the columns listed under "Test Reporting" above and one row per test (or a JUnit <testcase> per test).

AC-8: External-host firewall
Given the e2e network has no internet egress,
When any test inadvertently triggers a fetch to api.openweathermap.org, unpkg.com, or *.tile.openstreetmap.org,
Then the Playwright route-abort guard fires AND tile-stub/owm-stub log no requests under those hosts. The test fails with a "leaked external request" assertion.

Non-Functional Requirements

Compatibility

  • Vite 6 (S3), Bun 1.3.11 (S4), React 19, TypeScript 5.7. Compatibility of Vitest 3.x and Playwright 1.49+ with this stack is confirmed via a context7 lookup in Step 6, before any install.

Security

  • No production credentials in seed-fixtures/. Test bearer + OWM API key are placeholders.
  • Test-only endpoints (POST /test-only/reset) gated behind a non-production build flag in the suite — the UI test runner does not embed this concern.

Operational

  • Total fast suite wall-clock budget ≤ 5 min (per environment.md CI/CD Integration).
  • Total e2e suite wall-clock budget ≤ 30 min.
  • Each e2e test's Max execution time is honored per the scenario in blackbox-tests.md / etc.

Constraints

  • The runner picks (Vitest, MSW, Playwright) ARE the testability runners environment.md flags as "chosen at Step 5". They MUST match exactly the SPA's stack and the published test profiles. No alternative runner introduced silently.
  • No src/ import from any test except for the typed wire-contract enum shapes declared in src/types/index.ts (per P9 / environment.md § Black-box discipline).
  • _docs/00_problem/input_data/enum_spec_snapshot.json is the contract pin for Group 2 (AC-04, AC-29). Any UI drift it carries is asserted by static checks, not silently accepted.
  • The Playwright + MSW pair is the only outbound-network observation surface. No test reads src/ private state or mounts internal hooks.
  • E2E reports MUST land in ./test-output/ (mounted volume) — never inside the runner container alone.

Risks & Mitigation

Risk 1 — MSW Service Worker mode in fast

  • Risk: MSW has two runtimes (Node setupServer and browser setupWorker); fast tests must use Node (setupServer), not the SW mode. Mixing them silently bypasses MSW.
  • Mitigation: tests/msw/server.ts exports only the Node server. An ESLint rule (added in Step 6) bans importing setupWorker anywhere under tests/.

Risk 2 — Playwright route-abort hides a real regression

  • Risk: AC-8 aborts external requests. If a code path regresses to a hardcoded URL, the test can still "pass", because the calling code catches the aborted request and falls back to a friendly null.
  • Mitigation: the abort handler records each aborted host in an externalHosts set; the test fixture asserts externalHosts.size === 0 in afterEach. A leaked request fails the test loudly.
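
The mitigation can be sketched as a small guard object that is independent of Playwright. This is a hedged illustration: the class name ExternalRequestGuard, its method names, and the allow-list approach are assumptions; only externalHosts and the "leaked external request" message come from this task.

```typescript
// Hypothetical guard: records every host outside the allow-list.
class ExternalRequestGuard {
  readonly externalHosts = new Set<string>();
  private readonly allowedHosts: Set<string>;

  constructor(allowedHosts: string[]) {
    this.allowedHosts = new Set(allowedHosts);
  }

  // Returns true when the request should be aborted (host not allow-listed).
  shouldAbort(requestUrl: string): boolean {
    const match = /^[a-z][a-z0-9+.-]*:\/\/([^/:?#]+)/i.exec(requestUrl);
    // Relative URLs carry no host and are treated as same-origin.
    if (!match) return false;
    const host = match[1].toLowerCase();
    if (this.allowedHosts.has(host)) return false;
    this.externalHosts.add(host);
    return true;
  }

  // Call from afterEach: fails loudly if anything leaked during the test.
  assertClean(): void {
    if (this.externalHosts.size > 0) {
      const hosts = Array.from(this.externalHosts).join(", ");
      throw new Error(`leaked external request: ${hosts}`);
    }
  }
}
```

In a Playwright fixture this could be wired as context.route('**/*', route => guard.shouldAbort(route.request().url()) ? route.abort() : route.continue()), with guard.assertClean() called in afterEach so that a swallowed abort still fails the test.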

Risk 3 — SSE testing under MSW

  • Risk: MSW 2.x does not have first-class EventSource support; we lean on a polyfill or a custom stream adapter.
  • Mitigation: tests/helpers/sse-mock.ts exposes simulateSseStream(handler, events[]), which returns a fake EventSource instance. Tests can inject it via setEventSourceFactory only if src/api/sse.ts exports such an accessor, which it does not today. Two options remain: a follow-up testability change ("Wrap EventSource constructor in an accessor") in a later refactor run, or letting the e2e Playwright path cover SSE definitively while fast uses the polyfill only for narrow cases.
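
A fake EventSource for the fast profile might look like the sketch below. It is deliberately simplified and makes several assumptions: the signature drops the handler parameter named in this task, event data is JSON-serialized, the returned flush function is hypothetical, and only the parts of the EventSource interface that a typical consumer touches (onmessage, addEventListener, close) are modeled.

```typescript
// Simplified test double; not a full EventSource implementation.
type SseEvent = { event?: string; data: unknown };
type SseMessage = { type: string; data: string };
type SseListener = (message: SseMessage) => void;

class FakeEventSource {
  onmessage: SseListener | null = null;
  closed = false;
  private listeners = new Map<string, SseListener[]>();

  addEventListener(type: string, listener: SseListener): void {
    const existing = this.listeners.get(type) ?? [];
    existing.push(listener);
    this.listeners.set(type, existing);
  }

  // Delivers one event synchronously; ignored after close().
  emit(event: SseEvent): void {
    if (this.closed) return;
    const type = event.event ?? "message";
    const message: SseMessage = { type, data: JSON.stringify(event.data) };
    // As with a real EventSource, onmessage only sees unnamed events.
    if (type === "message" && this.onmessage) this.onmessage(message);
    for (const listener of this.listeners.get(type) ?? []) listener(message);
  }

  close(): void {
    this.closed = true;
  }
}

// Hypothetical variant of simulateSseStream: returns the fake plus a flush trigger.
function simulateSseStream(events: SseEvent[]): { source: FakeEventSource; flush: () => void } {
  const source = new FakeEventSource();
  return { source, flush: () => events.forEach((e) => source.emit(e)) };
}
```

Delivering events synchronously on flush keeps fast tests deterministic; timing-sensitive SSE behavior would remain the job of the e2e profile.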

Risk 4 — Suite-image availability

  • Risk: azaion/admin:test etc. tags must exist in the registry; pulling them is the e2e bootstrap. If the parent suite has not published a tagged build, the e2e profile is dead in the water.
  • Mitigation: defer to the parent suite's CI; ASK before Step 6 starts if the registry has a published :test tag for each service. If absent, gate e2e tests behind a CI-skip flag until the suite publishes.

Blackbox Tests

| AC Ref | Initial Data / Conditions | What to Test | Expected Behavior | NFR References |
|---|---|---|---|---|
| AC-1 | Clean host with Docker 24+ | docker compose -f e2e/docker-compose.suite-e2e.yml up -d → wait for healthchecks → curl http://azaion-ui:80/ | 200 OK with the SPA HTML | Compat |
| AC-2 | Stubs running | curl http://owm-stub:8081/data/2.5/weather?...; curl http://tile-stub:8082/1/0/0.png | OWM JSON shape; 256×256 PNG | Compat |
| AC-3 | Fast profile configured | A trivial component test that triggers fetch('/api/admin/me'); no MSW handler registered | The test fails on the unhandled request (requires server.listen({ onUnhandledRequest: 'error' }), since MSW's default only logs a warning); a handler override makes it pass | Compat |
| AC-4 | Fast profile configured | bun run test:fast | Exit 0; ./test-output/fast-report.xml exists with at least one <testcase> | Operational |
| AC-5 | E2E env up | bun run test:e2e | Exit 0; ./test-output/e2e-report.xml exists; Playwright HTML report exists | Operational |
| AC-6 | Working tree at HEAD | ./scripts/run-tests.sh --static-only | Exit 0; ./test-output/static-report.csv exists with one row per check | Compat |
| AC-7 | All three runners ran | Open each report | Columns match the Test Reporting spec | Operational |
| AC-8 | E2E env up | Patch a test to fetch('https://api.openweathermap.org/...') | Test fails with "leaked external request"; tile-stub/owm-stub logs show 0 requests under the external host | Security |

Out of Scope (deferred)

  • Writing the actual blackbox / perf / resilience / security / resource-limit test code (that's Step 3 + Step 6 of autodev).
  • Publishing the parent suite's :test images (parent suite's CI concern).
  • Visual-regression snapshots — not required by any current AC; can be revisited when AC-40 / AC-25 land in Phase B.
  • Performance-runner bundle-size threshold tuning (already in run-performance-tests.sh from Step 4; final thresholds confirmed in Step 7).