Scaffolds the Blackbox test project per AZ-456 / environment.md across
the three profiles:
- fast : Vitest 3.x + jsdom + MSW 2.x + RTL/jest-dom; tests/setup.ts
boots the MSW Node server with onUnhandledRequest:'error',
afterEach resets handlers, clears bearer + navigate-to-login
spy. Default handlers ship for every suite service plus OWM
and tile stand-ins. Fixtures mirror seed_* in test-data.md.
- e2e : Playwright ^1.49 with chromium + firefox projects against the
suite docker-compose stack; owm-stub + tile-stub Bun servers,
playwright-runner image, seeds.sql for the test-db.
- static: scripts/run-tests.sh extended — tsc --noEmit (test config),
vite build, ripgrep checks (with grep -r fallback), CSV
report at test-output/static-report.csv per AC-7 columns.
Smoke tests cover AC-3, AC-4 (fast, 5 tests, PASS) and AC-1, AC-2,
AC-5, AC-8 (e2e, gated by Risk 4 docker availability). Static profile
(13 checks) PASS — STC-SEC1 (no literal OWM key) lifted from
QUARANTINE per AZ-447 with a narrowed pattern.
Files:
+24 tests/**, +10 e2e/**, +vitest.config.ts, +tsconfig.test.json
~package.json (test scripts + devDeps for vitest, @testing-library/*,
msw, @playwright/test, jsdom, @types/node, @vitest/coverage-v8)
~scripts/run-tests.sh, scripts/run-performance-tests.sh — switched
RESULTS_DIR to test-output/, compose path to project-local
~.gitignore — added /test-output/
Verification:
bun run test:fast → 11 / 11 PASS
./scripts/run-tests.sh → static 13/13 + fast 11/11 PASS, exit 0
Tracker: AZ-456 → In Testing.
Co-authored-by: Cursor <cursoragent@cursor.com>
Test Infrastructure
Task: AZ-456_test_infrastructure
Name: Test Infrastructure
Description: Scaffold the Blackbox test project — Vitest + jsdom + MSW for the fast profile, Playwright + the suite docker-compose stack for the e2e profile, ripgrep + bun scripts for static. Includes mock services (owm-stub, tile-stub), test data fixtures, isolation strategy, and CSV+JUnit reporting.
Complexity: 5 points
Dependencies: None
Component: Blackbox Tests
Tracker: AZ-456
Epic: AZ-455
Test Project Folder Layout
tests/ # fast profile (colocated with src/, top-level dir for shared helpers)
├── setup.ts # Vitest global setup — RTL matchers + jsdom polyfills + MSW server.listen
├── msw/
│ ├── server.ts # MSW Node server (Vitest) — used by every fast test
│ ├── handlers/
│ │ ├── admin.ts # /api/admin/* default handlers (auth, users, classes-write, GPS)
│ │ ├── flights.ts # /api/flights/* default handlers (CRUD + live-GPS SSE simulator)
│ │ ├── annotations.ts # /api/annotations/* default handlers (media + annotations + dataset + SSE)
│ │ ├── detect.ts # /api/detect/* default handlers (sync image detect)
│ │ ├── loader.ts # /api/loader/* default handlers
│ │ ├── resource.ts # /api/resource/* default handlers
│ │ ├── owm.ts # owm-stub equivalent for fast (OpenWeatherMap canned wind data)
│ │ └── tiles.ts # tile-stub equivalent for fast (returns a 1×1 transparent PNG)
│ └── helpers.ts # buildResponse, errorResponse, sse, latency, drop helpers
├── fixtures/
│ ├── enum_spec_snapshot.ts # re-exports the committed _docs/00_problem/input_data/enum_spec_snapshot.json
│ ├── seed_users.ts # 4 users matching seed_users (test-data.md)
│ ├── seed_aircraft.ts # 3 aircraft, one default
│ ├── seed_flights.ts # 5 flights, one with live-GPS wire-up
│ ├── seed_classes.ts # [0..N-1, 20..20+N-1, 40..40+N-1] N≥9
│ ├── seed_media.ts # 6 media with mediaStatus enum coverage
│ ├── seed_annotations.ts # AI + Manual, splitTile valid + malformed
│ └── seed_user_settings.ts # known selectedFlightId + panelWidths for op_alice
├── helpers/
│ ├── render.tsx # renderWithProviders (router + auth + i18n)
│ ├── auth.ts # seedBearer / clearBearer (uses setToken from src/api/client.ts)
│ ├── navigate.ts # seedNavigateToLogin spy (uses setNavigateToLogin from src/api/client.ts)
│ └── sse-mock.ts # EventSource polyfill helper for MSW SSE handlers
└── (test files colocated alongside source, e.g. src/api/client.test.ts, src/features/flights/FlightsPage.test.tsx)
e2e/ # e2e profile
├── docker-compose.suite-e2e.yml # Suite stack + azaion-ui + owm-stub + tile-stub + playwright-runner
├── playwright.config.ts # Chromium + Firefox projects per AC-18; baseURL = http://azaion-ui:80
├── stubs/
│ ├── owm/ # ./testing/stubs/owm — tiny Node/Bun HTTP server, canned /data/2.5/* responses
│ │ ├── Dockerfile
│ │ └── server.ts
│ └── tile/ # ./testing/stubs/tile — returns 256x256 PNG for /{z}/{x}/{y} and /sat/{z}/{y}/{x}
│ ├── Dockerfile
│ └── server.ts
├── runner/ # ./testing/runner — Playwright runner container image
│ ├── Dockerfile
│ └── entrypoint.sh # bun run test:e2e + writes reports to /output
├── tests/
│ ├── auth.e2e.ts # Group 1 e2e coverage
│ ├── flights.e2e.ts # Groups 4, 11, 16
│ ├── annotations.e2e.ts # Groups 3, 10, 16
│ ├── dataset.e2e.ts # Group 5
│ ├── admin.e2e.ts # Groups 1, 13
│ ├── i18n.e2e.ts # Group 8
│ ├── upload.e2e.ts # Group 6
│ └── bundle.e2e.ts # Group 7 (post-build inspection)
├── fixtures/
│ └── seeds.sql # suite-side seed loader hooked into admin/flights/annotations init
└── results/ # mounted to playwright-runner:/output; CSV + JUnit + traces
scripts/
├── run-tests.sh # ALREADY EXISTS — extend in Step 6 to actually shell out to vitest / playwright
└── run-performance-tests.sh # ALREADY EXISTS — extend in Step 6
Layout Rationale
- Colocated `*.test.tsx` next to source (fast): Vitest convention; one test file per src file keeps the import graph shallow and the change-radius obvious. Shared MSW handlers + fixtures + helpers live under `tests/` so they're discoverable in one place.
- Dedicated `e2e/` tree: Playwright + the docker-compose stack are a hermetic black-box layer; physically separating them from `tests/` prevents accidental cross-imports between fast and e2e (the e2e runner has no access to `src/`).
- `testing/stubs/` (named per `environment.md`): the suite's existing convention for stub services. Keeps Dockerfiles small and independently buildable.
- Reports under `test-output/`: matches `environment.md` § Reporting and the Docker compose snippet exactly, so there is no path drift.
Mock Services
| Mock Service | Replaces | Endpoints | Behavior |
|---|---|---|---|
| `owm-stub` (e2e Docker) | OpenWeatherMap (E10) | `GET /data/2.5/weather?lat=&lon=&appid=&units=metric` | Returns `{ wind: { speed, deg } }` matching `flightPlanUtils.test.ts` expected wind compute. Two canned response sets keyed by `lat,lon`. |
| `tile-stub` (e2e Docker) | OSM + Esri tile servers | `GET /{z}/{x}/{y}.png`, `GET /sat/{z}/{y}/{x}` | Always returns a 256×256 PNG; logs every request so tile-coverage tests can assert. |
| `tests/msw/handlers/*.ts` (fast) | Every suite service + OWM + tiles | per file | Per-test handler override via `server.use(...)`. Resets between tests. SSE via MSW WebSocket adapter or `EventSourcePolyfill` test double. |
| Real `admin/`, `flights/`, `annotations/`, `detect/`, `loader/`, `resource/`, `gps-denied-{desktop,onboard}/`, `autopilot/` (e2e Docker) | n/a; they ARE the system in e2e | per suite contract | Provided by the parent suite's images. Test-mode endpoints (`POST /test-only/reset`, embedded LiveGPS + annotation-status event generators) enabled via a non-production build flag. |
Mock Control API
- Fast (MSW): per-test `server.use(...)` for handler overrides; `server.resetHandlers()` in `afterEach`. No HTTP control API is needed because the server runs in-process.
- owm-stub + tile-stub (e2e): `POST /mock/config` to swap canned response sets; `GET /mock/log` to read recorded requests. Used by resilience and performance tests.
- `admin/` test-only: `POST /test-only/reset` resets all three DBs to seed state between isolation buckets (per `test-data.md`).
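
The e2e stub control surface above can be pictured as a small request dispatcher. The following is a minimal sketch, not the stub's actual implementation: only the routes `POST /mock/config`, `GET /mock/log` and `/data/2.5/weather` come from this document, while the payload shape `{ set }`, the set names `calm` / `gusty`, the wind values, and the function names are hypothetical. Keying the canned sets by `lat,lon` is left out for brevity.

```typescript
// Hypothetical sketch of the owm-stub control surface; payload shapes are assumptions.
type Wind = { speed: number; deg: number };
type StubResponse = { status: number; body: unknown };

interface StubState {
  activeSet: string;                         // name of the canned set currently served
  sets: Record<string, Wind>;                // canned wind data per set (illustrative values)
  log: { method: string; path: string }[];   // recorded requests, returned by GET /mock/log
}

function createStubState(): StubState {
  return {
    activeSet: "calm",
    sets: { calm: { speed: 1.5, deg: 90 }, gusty: { speed: 14, deg: 270 } },
    log: [],
  };
}

function handleStubRequest(
  state: StubState,
  method: string,
  path: string,
  body?: { set?: string },
): StubResponse {
  // Control routes are answered without being added to the request log.
  if (method === "GET" && path === "/mock/log") {
    return { status: 200, body: state.log.slice() };
  }
  if (method === "POST" && path === "/mock/config") {
    const next = body?.set;
    if (next === undefined || !Object.prototype.hasOwnProperty.call(state.sets, next)) {
      return { status: 400, body: { error: "unknown response set" } };
    }
    state.activeSet = next;
    return { status: 204, body: null };
  }
  // Everything else counts as traffic from the system under test.
  state.log.push({ method, path });
  if (method === "GET" && path.startsWith("/data/2.5/weather")) {
    return { status: 200, body: { wind: state.sets[state.activeSet] } };
  }
  return { status: 404, body: { error: "not found" } };
}
```

Keeping the dispatcher free of any HTTP server dependency means the same function could be wrapped by whichever server the stub image uses and also exercised directly in a unit test.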
Docker Test Environment
docker-compose.suite-e2e.yml Structure
| Service | Image / Build | Purpose | Depends On |
|---|---|---|---|
| `azaion-ui` | Build `.` (project Dockerfile, ARM64) | SPA under test (the system) | admin, flights, annotations, detect |
| `admin` | `azaion/admin:test` | Auth + RBAC + classes-write | test-db |
| `flights` | `azaion/flights:test` | Flight CRUD + live-GPS SSE simulator | test-db |
| `annotations` | `azaion/annotations:test` | Media + annotations + dataset + status SSE | test-db |
| `detect` | `azaion/detect:test` | Sync image detect | none |
| `loader`, `resource`, `gps-denied-desktop`, `gps-denied-onboard`, `autopilot` | suite images | Auxiliary services hit by SPA | none |
| `test-db` | `postgres:16-alpine` | Backs admin/flights/annotations | none |
| `owm-stub` | Build `./e2e/stubs/owm` | OpenWeatherMap stub | none |
| `tile-stub` | Build `./e2e/stubs/tile` | OSM + Esri tile stub | none |
| `playwright-runner` | Build `./e2e/runner` | Test runner (Chromium + Firefox) | azaion-ui, owm-stub, tile-stub |
Networks and Volumes
- Network: `azaion-test-net`, isolated with no internet egress; the two stubs replace the only external hops (OWM, tiles).
- Volumes:
  - `test-db-data` → `test-db:/var/lib/postgresql/data` (wiped with `down -v` between e2e runs).
  - `seed-fixtures` → `admin:/seed`, `flights:/seed`, `annotations:/seed` (read-only), loaded at service start from `e2e/fixtures/seeds.sql`.
  - `test-output` → `playwright-runner:/output` (mounted to `./test-output/` on host).
Test Runner Configuration
Fast profile
- Framework: Vitest 3.x (Vite 6-compatible; matches S3 pin) under `bun run test:fast` (a `package.json` script alias).
- Environment: jsdom (default for component tests); a small subset of node-only tests opt in via `// @vitest-environment node`.
- Plugins / libraries:
  - `@testing-library/react` ^16.x (React 19 compatible)
  - `@testing-library/jest-dom` ^6.x
  - `@testing-library/user-event` ^14.x
  - `msw` ^2.x (Node server + per-test handler override)
  - `@vitest/coverage-v8` ^3.x
- Entry point: `bun run test:fast` → `vitest run --coverage --reporter=verbose --reporter=junit --outputFile.junit=./test-output/fast-report.xml`.
- Setup file: `tests/setup.ts` wires jest-dom matchers, starts the MSW server, polyfills `EventSource` if it is not present in jsdom, and registers a global `afterEach(() => { server.resetHandlers(); cleanup() })`.
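
The setup file described above could look roughly like the sketch below. It assumes that `tests/msw/server.ts` exports the Node server under the name `server` and that `tests/helpers/auth.ts` exports `clearBearer`; the export names are assumptions, and the `EventSource` polyfill and the navigate-to-login spy reset are omitted because this document does not specify their default values.

```typescript
// tests/setup.ts (sketch only; assumes the packages listed above are installed)
import "@testing-library/jest-dom/vitest"; // registers jest-dom matchers on Vitest's expect
import { cleanup } from "@testing-library/react";
import { afterAll, afterEach, beforeAll } from "vitest";
import { server } from "./msw/server";     // assumed export name for the Node-only MSW server
import { clearBearer } from "./helpers/auth";

// Any request without a matching handler fails the test instead of reaching the network.
beforeAll(() => server.listen({ onUnhandledRequest: "error" }));

afterEach(() => {
  server.resetHandlers(); // drop per-test server.use(...) overrides
  cleanup();              // unmount React trees rendered by RTL
  clearBearer();          // reset the in-memory bearer between tests
});

afterAll(() => server.close());
```

For Vitest to pick the file up, it would be listed under `test.setupFiles` in `vitest.config.ts`.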
E2E profile
- Framework: Playwright ^1.49 with `@playwright/test`. Two browser projects: Chromium and Firefox (per AC-18).
- Entry point: `bun run test:e2e` → `playwright test --reporter=list,junit --output=./test-output/e2e`. The JUnit path `./test-output/e2e-report.xml` is set through the `junit` reporter's `outputFile` option in the config file, because `playwright test` has no `--reporter-output` flag.
- Config file: `e2e/playwright.config.ts` with `baseURL: http://azaion-ui:80`, `workers: 1` for SSE-sensitive tests, `retries: 1` on `dev`/`stage`, traces on first retry.
- Network gate: Playwright `context.route('**/api.openweathermap.org/**', route => route.abort())` plus the same abort for `**/unpkg.com/**`, as defense-in-depth. The run fails if any external host is hit despite the env-var redirection that landed in autodev Step 4.
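
A config matching the bullets above might be sketched as follows. The relative paths assume the config sits in `e2e/` and that reports go to the repository-level `test-output/`; `testMatch` is inferred from the `*.e2e.ts` file names in the layout. The `dev`/`stage` condition on retries is not specified here, so the sketch sets `retries: 1` unconditionally.

```typescript
// e2e/playwright.config.ts (sketch under the assumptions stated above)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./tests",
  testMatch: "**/*.e2e.ts",
  outputDir: "../test-output/e2e",            // traces and other artifacts
  workers: 1,                                  // serial execution for SSE-sensitive tests
  retries: 1,                                  // the source limits this to dev/stage
  reporter: [
    ["list"],
    ["junit", { outputFile: "../test-output/e2e-report.xml" }],
  ],
  use: {
    baseURL: "http://azaion-ui:80",
    trace: "on-first-retry",
  },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
  ],
});
```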
Static profile
- Framework: existing `scripts/run-tests.sh` (Step 4 deliverable) extended in Step 6 to run:
  - `tsc -b --noEmit`
  - `ripgrep` checks: no `unpkg.com`, no banned libraries (ML, signature, persistence, WS/GraphQL/SSR), no service-worker registration, no literal OWM key.
  - `vite build` + bundle-size assertion.
- Entry point: `./scripts/run-tests.sh --static-only`.
- Output: `./test-output/static-report.csv`.
Fixture Strategy
| Fixture | Scope | Purpose |
|---|---|---|
| `renderWithProviders` | function | Mount component with the same Router + AuthProvider + I18nextProvider tree as the production app |
| `seedBearer(token)` / `clearBearer()` | function | Set/clear the in-memory bearer via `setToken` (production accessor, no monkey-patching) |
| `seedNavigateToLogin(spy)` | function | Replace `navigateToLoginImpl` via `setNavigateToLogin` (production accessor from autodev Step 4 / C06) |
| `mswServer` | module-level | Single MSW Node server shared by all fast tests; reset between each test |
| `loadEnumSnapshot()` | function | Reads `_docs/00_problem/input_data/enum_spec_snapshot.json` for AC-04 / AC-29 contract assertions |
| Playwright `page` fixture | per-test | Standard Playwright fixture; extended with request-log interceptor that records every outbound request for assertion |
Test Data Fixtures
| Data Set | Source | Format | Used By |
|---|---|---|---|
| `seed_users` | `e2e/fixtures/seeds.sql` + `tests/fixtures/seed_users.ts` (mirror) | SQL (e2e) / TS (fast) | Groups 1, 13, 14, 16 |
| `seed_aircraft` | as above | SQL / TS | Groups 7, 16 |
| `seed_flights` | as above | SQL / TS | Groups 4, 11, 16 |
| `seed_classes` | as above | SQL / TS | Group 16 (DetectionClasses + AdminPage class CRUD) |
| `seed_media` | as above | SQL / TS | Groups 3, 5, 16, 17 |
| `seed_annotations` | as above | SQL / TS | Groups 3, 16, 17 |
| `seed_user_settings` | as above | SQL / TS | Group 4 (AC-21), Group 11 (AC-06) |
| `enum_spec_snapshot` | `_docs/00_problem/input_data/enum_spec_snapshot.json` (committed) | JSON | Group 2 (AC-04 / AC-29 static contract checks) |
| `bundle_artifact` | `dist/` from `vite build` | filesystem | Group 7 (AC-11 / AC-31 / AC-33), static profile only |
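
The folder layout describes `seed_classes` as the id pattern `[0..N-1, 20..20+N-1, 40..40+N-1]` with N≥9. A possible generator for that layout is sketched below; the function name `seedClassIds` is hypothetical and only the id pattern and the lower bound come from this document.

```typescript
// Builds the seed_classes id layout [0..N-1, 20..20+N-1, 40..40+N-1].
// Hypothetical helper; the real fixture may be a hand-written list.
function seedClassIds(n: number): number[] {
  if (!Number.isInteger(n) || n < 9) {
    throw new RangeError("N must be an integer >= 9");
  }
  const ids: number[] = [];
  for (const base of [0, 20, 40]) {
    for (let i = 0; i < n; i++) {
      ids.push(base + i);
    }
  }
  return ids;
}
```

For N above 20 the three ranges would overlap. The document does not state an upper bound, so the sketch does not enforce one.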
Data Isolation
- Fast: MSW handlers are per-test; `server.resetHandlers()` in `afterEach`. React tree unmounted via RTL `cleanup`. In-memory bearer + navigate-to-login spy reset via `afterEach` calls to `setToken(null)` + `setNavigateToLogin(default)`.
- E2E: fresh `docker compose down -v && up -d` per CI run. Within a run, tests are grouped into isolation buckets by data set; bucket teardown calls `admin/`'s `POST /test-only/reset` to restore the seed state. Buckets across CI machines run on independent compose stacks. There are no cross-test order dependencies: any test must be able to boot the bucket's seed snapshot and run alone.
- Static: no shared state; each script run is hermetic.
Test Reporting
- Format: CSV (suite rollup) + JUnit XML (CI ingestion) + Playwright traces (`.zip`) on failure.
- Columns (CSV): `Test ID, Test Name, Profile, Execution Time (ms), Result (PASS|FAIL|SKIP|QUARANTINE), Error Message, Traces to AC, Traces to results_report.md row`.
- Output paths:
  - `./test-output/fast-report.xml` (JUnit)
  - `./test-output/fast-report.csv` (rollup written by `scripts/run-tests.sh`)
  - `./test-output/e2e/` (Playwright artifacts incl. traces)
  - `./test-output/e2e-report.xml` (JUnit)
  - `./test-output/static-report.csv`
  - `./test-output/summary.csv` (suite-level rollup combining the three)
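
To illustrate the CSV shape, the sketch below serializes rows with the columns listed above. It assumes that the parenthetical `(PASS|FAIL|SKIP|QUARANTINE)` lists allowed values and is not part of the header text, and that fields follow the usual double-quote escaping; the type and function names are hypothetical.

```typescript
type Result = "PASS" | "FAIL" | "SKIP" | "QUARANTINE";

interface ReportRow {
  testId: string;
  testName: string;
  profile: "fast" | "e2e" | "static";
  executionTimeMs: number;
  result: Result;
  errorMessage: string;
  tracesToAc: string;
  tracesToResultsReportRow: string;
}

const CSV_HEADER: string[] = [
  "Test ID", "Test Name", "Profile", "Execution Time (ms)",
  "Result", "Error Message", "Traces to AC", "Traces to results_report.md row",
];

// Quote a field only when it contains a comma, quote, or line break.
function csvEscape(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsvLine(cells: string[]): string {
  return cells.map(csvEscape).join(",");
}

// One header line followed by one line per test, each terminated by "\n".
function toCsv(rows: ReportRow[]): string {
  const lines = [toCsvLine(CSV_HEADER)];
  for (const r of rows) {
    lines.push(toCsvLine([
      r.testId, r.testName, r.profile, String(r.executionTimeMs),
      r.result, r.errorMessage, r.tracesToAc, r.tracesToResultsReportRow,
    ]));
  }
  return lines.join("\n") + "\n";
}
```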
Acceptance Criteria
AC-1: Test environment starts (e2e)
Given e2e/docker-compose.suite-e2e.yml,
When docker compose -f e2e/docker-compose.suite-e2e.yml up -d is executed,
Then azaion-ui, admin, flights, annotations, detect, owm-stub, tile-stub, test-db, and playwright-runner all become healthy and http://azaion-ui:80 returns the SPA HTML.
AC-2: Mock services respond (e2e)
Given the test environment is running,
When the test runner sends GET http://owm-stub:8081/data/2.5/weather?lat=0&lon=0&appid=test, and GET http://tile-stub:8082/1/0/0.png,
Then owm-stub returns the canned OpenWeatherMap shape and tile-stub returns a 256×256 PNG.
AC-3: MSW intercepts every outbound API call (fast)
Given the fast profile is configured per this task,
When any test under the fast profile triggers a fetch to /api/<service>/...,
Then the MSW handler matches, no real network call leaves the process, and afterEach resets handlers to defaults.
AC-4: Fast runner executes
Given the fast profile is configured,
When bun run test:fast is executed,
Then Vitest discovers **/*.test.ts(x) files, runs them under jsdom, and produces ./test-output/fast-report.xml (JUnit) + a coverage summary.
AC-5: E2E runner executes both browsers
Given the e2e environment is running,
When bun run test:e2e is executed inside the playwright-runner container,
Then Playwright runs every e2e/tests/**/*.e2e.ts under both Chromium and Firefox projects and writes ./test-output/e2e-report.xml.
AC-6: Static runner executes
Given the working tree at HEAD,
When ./scripts/run-tests.sh --static-only is executed,
Then it runs tsc -b --noEmit, the ripgrep static checks, and vite build, and writes ./test-output/static-report.csv. Exit code is non-zero on any failure.
AC-7: Reports correctly shaped
Given any of the three runners completes,
When the corresponding JUnit XML or CSV is opened,
Then it contains the columns listed under "Test Reporting" above and one row per test (or a JUnit <testcase> per test).
AC-8: External-host firewall
Given the e2e network has no internet egress,
When any test inadvertently triggers a fetch to api.openweathermap.org, unpkg.com, or *.tile.openstreetmap.org,
Then the Playwright route-abort guard fires AND tile-stub/owm-stub log no requests under those hosts. The test fails with a "leaked external request" assertion.
Non-Functional Requirements
Compatibility
- Vite 6 (S3), Bun 1.3.11 (S4), React 19, TypeScript 5.7. Vitest 3.x and Playwright 1.49+ are confirmed compatible with this stack via a `context7` lookup in Step 6, before installs.
Security
- No production credentials in `seed-fixtures/`. Test bearer + OWM API key are placeholders.
- Test-only endpoints (`POST /test-only/reset`) are gated behind a non-production build flag in the suite; the UI test runner does not embed this concern.
Operational
- Total `fast` suite wall-clock budget ≤ 5 min (per `environment.md` CI/CD Integration).
- Total `e2e` suite wall-clock budget ≤ 30 min.
- Each `e2e` test's `Max execution time` is honored per the scenario in `blackbox-tests.md` etc.
Constraints
- The runner picks (Vitest, MSW, Playwright) ARE the testability runners environment.md flags as "chosen at Step 5". They MUST match exactly the SPA's stack and the published test profiles. No alternative runner introduced silently.
- No `src/` import from any test except for the typed wire-contract enum shapes declared in `src/types/index.ts` (per `P9` / `environment.md` § Black-box discipline).
- `_docs/00_problem/input_data/enum_spec_snapshot.json` is the contract pin for Group 2 (AC-04, AC-29). Any UI drift it carries is asserted by static checks, not silently accepted.
- The Playwright + MSW pair is the only outbound-network observation surface. No test reads `src/` private state or mounts internal hooks.
- E2E reports MUST land in `./test-output/` (mounted volume), never inside the runner container alone.
Risks & Mitigation
Risk 1 — MSW Service Worker mode in fast
- Risk: MSW has two runtimes (Node `setupServer` and browser `setupWorker`); fast tests must use Node (`setupServer`), not the SW mode. Mixing them silently bypasses MSW.
- Mitigation: `tests/msw/server.ts` exports only the Node server. An ESLint rule (added in Step 6) bans `import { setupWorker }` from anywhere under `tests/`.
Risk 2 — Playwright route-abort hides a real regression
- Risk: AC-8 aborts external requests. If a code-path inadvertently regresses to a hardcoded URL, the abort makes the test still "pass" because the inner promise rejects to a friendly null.
- Mitigation: the abort handler records each aborted host in `externalHosts`; the test fixture asserts `externalHosts.size === 0` in `afterEach`. A leaked request fails the test loudly.
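
The mitigation above can be sketched as a small guard object kept outside Playwright so it can be unit-tested on its own. The class name `ExternalHostGuard` and the allow-list of compose host names are assumptions; `externalHosts` and the "leaked external request" message come from this document.

```typescript
// Hypothetical guard: decides which requests to abort and remembers leaked hosts.
class ExternalHostGuard {
  readonly externalHosts = new Set<string>();
  private readonly allowedHosts: readonly string[];

  constructor(allowedHosts: readonly string[]) {
    this.allowedHosts = allowedHosts;
  }

  // Returns true when the request targets a host outside the allow-list
  // and should therefore be aborted by the caller.
  record(url: string): boolean {
    const host = new URL(url).hostname;
    if (this.allowedHosts.includes(host)) {
      return false;
    }
    this.externalHosts.add(host);
    return true;
  }

  // Intended for afterEach: fails loudly if anything leaked during the test.
  assertClean(): void {
    if (this.externalHosts.size !== 0) {
      const hosts = Array.from(this.externalHosts).join(", ");
      throw new Error(`leaked external request: ${hosts}`);
    }
  }
}
```

In a Playwright fixture the guard would typically be wired through `context.route('**/*', ...)`, calling `route.abort()` when `record(route.request().url())` returns true and `route.continue()` otherwise, with `assertClean()` invoked in the fixture teardown.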
Risk 3 — SSE testing under MSW
- Risk: MSW 2.x does not have first-class EventSource support; we lean on a polyfill or a custom stream adapter.
- Mitigation: `tests/helpers/sse-mock.ts` exposes `simulateSseStream(handler, events[])` that returns a fake `EventSource` instance. Tests inject it via `setEventSourceFactory` if `src/api/sse.ts` exports one. NOTE: it does not today. This is flagged for a follow-up "Wrap EventSource constructor in an accessor" testability change in a later refactor run, OR the e2e Playwright path covers SSE definitively while fast uses the polyfill only for narrow cases.
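
A fake `EventSource` of the kind described could be as small as the sketch below. It is a simplification under stated assumptions: the `handler` parameter of `simulateSseStream` is left out because its role is not defined here, the returned `flush` function is hypothetical, and only `addEventListener`, `close` and `readyState` of the real `EventSource` interface are imitated.

```typescript
type SseEvent = { type?: string; data: string };
type SseListener = (event: { type: string; data: string }) => void;

// Minimal stand-in for EventSource: synchronous, no network, no reconnection.
class FakeEventSource {
  readyState = 1; // 1 = OPEN, 2 = CLOSED (same numeric values as EventSource)
  private listeners = new Map<string, SseListener[]>();

  addEventListener(type: string, listener: SseListener): void {
    const list = this.listeners.get(type) ?? [];
    list.push(listener);
    this.listeners.set(type, list);
  }

  // Delivers one event to listeners of its type; unnamed events use "message".
  emit(event: SseEvent): void {
    if (this.readyState === 2) return; // closed sources deliver nothing
    const type = event.type ?? "message";
    for (const listener of this.listeners.get(type) ?? []) {
      listener({ type, data: event.data });
    }
  }

  close(): void {
    this.readyState = 2;
  }
}

// Hypothetical simplified signature: returns the fake plus a flush() that
// replays the scripted events in order once the test has attached listeners.
function simulateSseStream(events: SseEvent[]): { source: FakeEventSource; flush: () => void } {
  const source = new FakeEventSource();
  return { source, flush: () => events.forEach((e) => source.emit(e)) };
}
```

Replaying events synchronously keeps fast tests deterministic; timing-dependent SSE behavior would remain the e2e profile's responsibility.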
Risk 4 — Suite-image availability
- Risk: `azaion/admin:test` etc. tags must exist in the registry; pulling them is the e2e bootstrap. If the parent suite has not published a tagged build, the e2e profile is dead in the water.
- Mitigation: defer to the parent suite's CI; ASK before Step 6 starts whether the registry has a published `:test` tag for each service. If absent, gate e2e tests behind a CI-skip flag until the suite publishes.
Blackbox Tests
| AC Ref | Initial Data / Conditions | What to Test | Expected Behavior | NFR References |
|---|---|---|---|---|
| AC-1 | Clean host with Docker 24+ | `docker compose -f e2e/docker-compose.suite-e2e.yml up -d` → wait for healthchecks → `curl http://azaion-ui:80/` | 200 OK with the SPA HTML | Compat |
| AC-2 | Stubs running | `curl http://owm-stub:8081/data/2.5/weather?...`; `curl http://tile-stub:8082/1/0/0.png` | OWM JSON shape; 256×256 PNG | Compat |
| AC-3 | Fast profile configured | A trivial component test that triggers `fetch('/api/admin/me')`; no MSW handler registered | The test fails on MSW's unhandled-request error (`onUnhandledRequest: 'error'`) by default; a handler override makes it pass | Compat |
| AC-4 | Fast profile configured | `bun run test:fast` | Exit 0; `./test-output/fast-report.xml` exists with at least one `<testcase>` | Operational |
| AC-5 | E2E env up | `bun run test:e2e` | Exit 0; `./test-output/e2e-report.xml` exists; Playwright HTML report exists | Operational |
| AC-6 | Working tree at HEAD | `./scripts/run-tests.sh --static-only` | Exit 0; `./test-output/static-report.csv` exists with one row per check | Compat |
| AC-7 | All three runners ran | Open each report | Columns match the Test Reporting spec | Operational |
| AC-8 | E2E env up | Patch a test to `fetch('https://api.openweathermap.org/...')` | Test fails with "leaked external request"; `tile-stub`/`owm-stub` logs show 0 requests under the external host | Security |
Out of Scope (deferred)
- Writing the actual blackbox / perf / resilience / security / resource-limit test code (that's Step 3 + Step 6 of autodev).
- Publishing the parent suite's `:test` images (parent suite's CI concern).
- Visual-regression snapshots: not required by any current AC; can be revisited when AC-40 / AC-25 land in Phase B.
- Performance-runner bundle-size threshold tuning (already in `run-performance-tests.sh` from Step 4; final thresholds confirmed in Step 7).