mirror of
https://github.com/azaion/satellite-provider.git
synced 2026-06-22 17:31:14 +00:00
[AZ-493] Cycle 3 batch 3: integration test DB-reset hook
AZ-493 (2 SP): replace the cycle-2 wallclock-seeded _coordinateCounter workaround with a proper Postgres state-reset hook that runs at integration test runner startup, eliminating the per-source-unique-index collision risk that the persistent docker-compose Postgres volume introduced post-AZ-484.

The reset is split into two surfaces:

* SatelliteProvider.TestSupport.IntegrationTestResetGuard - pure static class, I/O-free, unit-tested. Two independent guards: (a) ASPNETCORE_ENVIRONMENT must equal "Testing", (b) DB_CONNECTION_STRING Host must be in the allowed-host list (postgres, localhost, 127.0.0.1). Failure of either guard surfaces a structured operator-friendly InvalidOperationException.
* SatelliteProvider.IntegrationTests.IntegrationTestDatabaseReset - instance class owning the Npgsql side effects. Calls the guard then runs TRUNCATE TABLE route_regions, route_points, routes, regions, tiles RESTART IDENTITY CASCADE inside a single Npgsql transaction.

Spec-vs-reality: the task spec prescribed "DB name contains _test" as Guard 2; the actual compose file uses Database=satelliteprovider and DB rename is gated on user confirmation per coderule.mdc. Substituted a Host allowlist as the equivalent guard (intent identical: reject remote / production hosts). Recorded as Low/Spec-Gap in the review.

Program.cs adds --keep-state CLI flag and INTEGRATION_KEEP_STATE env var (1/true) opt-outs so a developer can inspect leftover state when debugging. Startup banner shows which path executed. docker-compose.tests.yml gets ASPNETCORE_ENVIRONMENT=Testing + passthrough for INTEGRATION_KEEP_STATE. scripts/run-tests.sh wires the --keep-state flag through to compose.

UavUploadTests._coordinateCounter wallclock seed is retained as defense-in-depth (per the task spec's implementer choice). The reset is the primary isolation path; the seed is the belt-and-suspenders fallback for --keep-state runs.

8 new unit tests in SatelliteProvider.Tests/TestSupport/IntegrationTestResetGuardTests.cs cover Production/Staging/missing-env throw, allowed-host case-insensitivity, disallowed-host rejection with representative prod hostnames, and the AllowedHosts contract.

tests_integration.md gains a Reliability section that documents the hook, the two guards, the truncate order, and the three opt-out forms. module-layout.md TestSupport entry extended with the new pure guard and the explicit "Npgsql stays in IntegrationTests" boundary.

Test-suite gate (AC-6) deferred to Step 16 Final Test Run per implement skill convention. Per-batch review verdict: PASS_WITH_WARNINGS with 1 Low (spec-vs-reality on Guard 2, non-blocking).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,133 +0,0 @@
# Integration test DB-reset hook

**Task**: AZ-493_integration_test_db_reset_hook
**Name**: Integration test DB-reset hook
**Description**: Replace the cycle-2 wallclock-seeded `_coordinateCounter` workaround in `UavUploadTests` with a real Postgres state-reset hook executed at integration test runner startup, so future integration tests don't have to invent their own collision-avoidance scheme against the persistent Docker volume.
**Complexity**: 2 points
**Dependencies**: None
**Component**: Test infrastructure (`SatelliteProvider.IntegrationTests`) + optionally `scripts/run-tests.sh`
**Tracker**: AZ-493
**Epic**: none (cycle-3 test-infrastructure hardening)

## Problem

During cycle 2 Step 11 (Run Tests), the new `UavUploadTests.MultiSourceCoexistence_AZ484_Cycle2` failed with `duplicate key value violates unique constraint "idx_tiles_unique_location_source"` while seeding a `google_maps` row. Root cause: the `_coordinateCounter` field in `UavUploadTests` reset to 0 on every test-runner process start, while the Postgres named volume in `docker-compose.yml` persists `tiles` rows across `docker-compose down` cycles.

The cycle-2 fix (`dc3dabe [AZ-488] fix: seed UavUploadTests coordinate counter from wall-clock`) seeded `_coordinateCounter` from `(int)((DateTime.UtcNow.Ticks / TimeSpan.TicksPerSecond) % 1_000_000)`. This makes per-run coordinates *probabilistically* unique enough to avoid collision in normal usage, but it is a workaround, not isolation:

- Two fast back-to-back runs within the same second produce the same seed.
- Parallel test execution against the same DB will collide.
- Every new integration test that inserts into `tiles` will have to invent its own collision-avoidance scheme.
- The DB grows monotonically across cycles, making `docker-compose down` (without `-v`) progressively slower.

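The first bullet can be checked directly against the seed expression quoted above. The following is a minimal sketch that wraps that expression so it can be evaluated for arbitrary timestamps; the names `WallclockSeedSketch` and `SeedFrom` are hypothetical and are not part of `UavUploadTests`.

```csharp
using System;

// Hypothetical helper (illustration only): evaluates the cycle-2 seed
// expression for a supplied timestamp instead of DateTime.UtcNow.
public static class WallclockSeedSketch
{
    public static int SeedFrom(DateTime utcNow) =>
        // Whole seconds since DateTime.MinValue, wrapped to six digits.
        (int)((utcNow.Ticks / TimeSpan.TicksPerSecond) % 1_000_000);
}
```

Two timestamps inside the same second (for example 12:00:00.100 and 12:00:00.900 UTC) map to the same seed, which is the collision the first bullet describes. Because of the `% 1_000_000`, the seed also wraps every 1,000,000 seconds (roughly 11.6 days), so it is not unique across a long-lived volume either.
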
Cycle-2 retrospective Improvement Action 3 (`_docs/06_metrics/retro_2026-05-11_cycle2.md` § 7) promoted this to a top-3 action. LESSONS.md `[testing]` carries the lesson "Integration tests must explicitly reset DB state at startup; relying on wallclock seeds is a workaround, not isolation".

## Outcome

- Integration test runner explicitly resets relevant DB tables (`tiles`, `regions`, `routes`, `route_points`, `route_regions`) to a known empty state at startup before any test class runs.
- The wallclock-seeded `_coordinateCounter` in `UavUploadTests` is no longer required: `_coordinateCounter` can return to 0-initialized behavior. (Implementer's call whether to actually revert the seed; the safety net works either way.)
- The DB-reset behavior is opt-out: running with an explicit flag (e.g., `--keep-state`) skips the reset so a developer can inspect leftover state.
- `scripts/run-tests.sh` invokes the reset path by default; only `--keep-state` skips it.
- `_docs/02_document/modules/tests_integration.md` documents the new convention so future test authors know to rely on it rather than invent their own scheme.
- Existing integration tests pass end-to-end against the reset hook.

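The opt-out decision can be kept as a small pure function. This is a minimal sketch, assuming the two opt-out forms named in the commit message (the `--keep-state` CLI flag and an `INTEGRATION_KEEP_STATE` env var accepting `1`/`true`); the class and method names are hypothetical, and treating `true` case-insensitively is an assumption rather than documented behavior.

```csharp
#nullable enable
using System;

// Hypothetical helper (illustration only); Program.cs may structure this differently.
public static class KeepStateSketch
{
    // Returns true when the startup reset should be skipped.
    public static bool ShouldKeepState(string[] args, string? envValue)
    {
        // CLI form: an exact "--keep-state" argument anywhere in args.
        bool flagSet = Array.IndexOf(args, "--keep-state") >= 0;

        // Env form: "1" or "true" (case-insensitivity is an assumption).
        bool envSet = envValue == "1"
            || string.Equals(envValue, "true", StringComparison.OrdinalIgnoreCase);

        return flagSet || envSet;
    }
}
```

A runner could call it as `KeepStateSketch.ShouldKeepState(args, Environment.GetEnvironmentVariable("INTEGRATION_KEEP_STATE"))` and log which path was taken before any test class is constructed.
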
## Scope

### Included

- Add a DB-reset routine to `SatelliteProvider.IntegrationTests/Program.cs` (or a new helper class) that:
  - Connects to Postgres via the same connection string used by other tests.
  - Truncates (in FK-safe order: `route_regions` → `route_points` → `routes` → `regions` → `tiles`) all tables that integration tests insert into.
  - Runs ONLY when `ASPNETCORE_ENVIRONMENT=Testing` OR an explicit env var (e.g. `INTEGRATION_RESET_STATE=1`) is set. Never runs against a non-test environment.
- Wire the reset routine to execute exactly once at startup, before any test class instantiation, with structured logging of what was truncated.
- Add `--keep-state` flag handling so a developer can run `./scripts/run-tests.sh --full --keep-state` for debugging.
- Optionally: extend `scripts/run-tests.sh` with a `--reset-volume` flag that does `docker-compose down -v` between runs; the in-runner truncate remains the primary path.
- Optionally revert the `_coordinateCounter` wallclock-seed workaround in `UavUploadTests`, OR keep it as defense-in-depth. Implementer's choice; document the decision in the batch report.
- Update `_docs/02_document/modules/tests_integration.md` § Reliability with a short paragraph: "Integration tests rely on a startup DB-reset; new tests can use deterministic seed values."

### Excluded

- Any change to production code or the API container.
- Migration to a per-test-class transaction-rollback pattern (e.g., test fixtures wrapping each test in a transaction); that is a larger refactor and a separate PBI if ever needed.
- Resetting the file system (`./tiles/`, `./ready/`). UAV uploads do write to `./tiles/uav/`, but the file-side collision risk is lower; address only if a test demonstrably hits it.
- Renaming or restructuring `UavUploadTests` itself.

## Acceptance Criteria

**AC-1: Empty-state on startup**
Given the integration test runner starts up with `ASPNETCORE_ENVIRONMENT=Testing`
When the runner connects to Postgres
Then `SELECT count(*)` against each of `tiles`, `regions`, `routes`, `route_points`, and `route_regions` returns 0 before the first test class begins.

**AC-2: Wallclock workaround no longer needed**
Given `UavUploadTests._coordinateCounter` is reverted to a 0-initialized default (implementer's choice)
When the full integration suite runs back-to-back twice within the same calendar second
Then both runs pass without any `duplicate key value violates unique constraint "idx_tiles_unique_location_source"` errors.

**AC-3: Opt-out preserves state**
Given `scripts/run-tests.sh --full --keep-state` is invoked
When the runner starts
Then the DB-reset routine is skipped, AND rows from a previous run are still present at runner startup.

**AC-4: Reset only fires in test environment**
Given the runner is somehow misconfigured for a non-test environment or connection string (e.g., `ASPNETCORE_ENVIRONMENT=Production`)
When the runner starts
Then the DB-reset routine refuses to execute and exits with a clear error message.

**AC-5: Documentation reflects new convention**
Given the post-PBI repo
When `_docs/02_document/modules/tests_integration.md` is read
Then a Reliability paragraph documents the startup-reset convention and the `--keep-state` opt-out.

**AC-6: Existing tests pass unchanged**
Given the full integration test suite
When it runs against the reset hook
Then all existing test scenarios produce the same pass/fail outcomes as before this PBI.

## Non-Functional Requirements

**Reliability**
- Reset MUST be idempotent: multiple invocations produce the same final state.
- Reset MUST run inside a transaction (or be otherwise atomic) so a mid-truncate failure doesn't leave the DB in a half-empty state.

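A minimal sketch of the transactional shape these two requirements imply. It is written against the `System.Data.Common` base types so it carries no extra dependency; the class name `TransactionalResetSketch` is hypothetical, and it assumes the caller passes a not-yet-opened connection (for example an `NpgsqlConnection`) and owns its disposal.

```csharp
using System.Data.Common;
using System.Threading.Tasks;

// Hypothetical helper (illustration only), not the repository's reset class.
public static class TransactionalResetSketch
{
    public static async Task ResetAsync(DbConnection connection, string truncateSql)
    {
        // Assumption: the connection is closed on entry.
        await connection.OpenAsync();

        await using DbTransaction transaction = await connection.BeginTransactionAsync();

        await using DbCommand command = connection.CreateCommand();
        command.Transaction = transaction;
        command.CommandText = truncateSql;
        await command.ExecuteNonQueryAsync();

        // Commit only after the statement succeeded. If an exception is thrown
        // earlier, the transaction is disposed without a commit, so nothing is
        // persisted.
        await transaction.CommitAsync();
    }
}
```

With Npgsql this could be invoked as `await TransactionalResetSketch.ResetAsync(new NpgsqlConnection(connectionString), sql)`, where `sql` is the single `TRUNCATE TABLE ... RESTART IDENTITY CASCADE` statement quoted in the commit message. Truncating already-empty tables changes nothing, so repeated calls end in the same state.
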
**Performance**
- Reset on an empty DB completes in < 100 ms.
- Reset on a DB with O(10K) rows completes in < 1 s (PostgreSQL `TRUNCATE` does not scan rows, so its cost is largely independent of table size, but FK-cascade order matters).

**Safety**
- The reset code path MUST refuse to run if connected to a database whose name does not contain `_test` or whose environment is not `Testing`. Two independent guards (env var + connection-string check).

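A minimal sketch of the two-guard check as this spec words it (environment plus database name). All names here are hypothetical. The exact, case-sensitive match on `Testing` and the use of the `Database` connection-string keyword are assumptions. The commit message records that the shipped guard checks the connection-string Host against an allowlist instead, so treat this as an illustration of the spec text, not of the final code.

```csharp
#nullable enable
using System;
using System.Data.Common;

// Hypothetical pure guard (illustration only): no I/O, throws on any violation.
public static class ResetGuardSketch
{
    public static void EnsureSafeToReset(string? environment, string connectionString)
    {
        // Guard 1: the environment must be "Testing" (exact match is an assumption).
        if (!string.Equals(environment, "Testing", StringComparison.Ordinal))
        {
            throw new InvalidOperationException(
                $"Refusing to reset: ASPNETCORE_ENVIRONMENT is '{environment ?? "<unset>"}', expected 'Testing'.");
        }

        // Guard 2: the database name must contain "_test".
        // DbConnectionStringBuilder parses key=value pairs; keywords are case-insensitive.
        var builder = new DbConnectionStringBuilder { ConnectionString = connectionString };
        string? database = builder.TryGetValue("Database", out var value) ? value?.ToString() : null;

        if (database is null || !database.Contains("_test", StringComparison.OrdinalIgnoreCase))
        {
            throw new InvalidOperationException(
                $"Refusing to reset: database '{database ?? "<unset>"}' does not look like a test database.");
        }
    }
}
```

Keeping the guard free of database calls lets the AC-4 unit test exercise it with plain strings, with no Postgres instance involved.
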
## Unit Tests

| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-4 | Reset routine called with `ASPNETCORE_ENVIRONMENT=Production` | Throws (or no-ops with structured error log); does NOT execute truncates |
| AC-1 | Truncate-order helper (if extracted as a method) | Returns the table names in FK-safe order |

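If the truncate order is extracted as suggested in the AC-1 row, it could look like the sketch below. The class and member names are hypothetical; the table order comes from the Scope section and the statement shape from the commit message.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (illustration only): single source of truth for the
// tables the reset touches, listed children-first.
public static class TruncateOrderSketch
{
    public static IReadOnlyList<string> TablesInFkSafeOrder { get; } = new[]
    {
        "route_regions",
        "route_points",
        "routes",
        "regions",
        "tiles",
    };

    // Builds one parameterless TRUNCATE covering every listed table.
    public static string BuildTruncateSql() =>
        $"TRUNCATE TABLE {string.Join(", ", TablesInFkSafeOrder)} RESTART IDENTITY CASCADE";
}
```

A unit test can then assert on the list and on the generated string without a database. When every table is named in one `TRUNCATE` command, PostgreSQL accepts them in any order, so the children-first order mainly matters if the helper is ever changed to issue one statement per table.
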
## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | DB seeded with leftover rows from a previous run | Run `./scripts/run-tests.sh --full` | All target tables empty at start of test execution; full suite passes | Reliability |
| AC-2 | Two back-to-back full runs within 1s | Run `./scripts/run-tests.sh --full` twice without `--keep-state` | Both runs pass with zero unique-constraint violations | Reliability |
| AC-3 | DB seeded with previous-run rows | Run `./scripts/run-tests.sh --full --keep-state` | Rows present at startup; full suite still passes (or fails on the AC-2 collision, which is expected) | None |

## Constraints

- Reset must use parameterless TRUNCATE (no per-test data injection from the helper); keep the helper minimal.
- Must not require new NuGet dependencies; Npgsql + Dapper are already present.
- Connection-string detection MUST use the same `ConnectionStrings:Postgres` key as the production code; do not introduce a second connection-string concept for tests.

## Risks & Mitigation

**Risk 1: Accidental truncate against production DB**
- *Risk*: A misconfigured CI runner or developer with `JWT_SECRET` overridden but `ConnectionStrings:Postgres` pointing at a real DB would have their data wiped.
- *Mitigation*: Two-guard model: refuse to truncate unless BOTH (a) `ASPNETCORE_ENVIRONMENT=Testing` AND (b) the DB name contains `_test` (or equivalent). The integration test compose file already uses a separate DB; verify and document the name pattern.

**Risk 2: Truncate ordering wrong for future schema additions**
- *Risk*: A future PBI adds a new table with a foreign key to `tiles` but doesn't update the reset order; reset fails on FK constraint.
- *Mitigation*: Document the order in `_docs/02_document/modules/tests_integration.md`. Add a checklist row to the new-task / decompose templates: "If your task adds a table referenced by integration tests, update the DB-reset helper's truncate order."

**Risk 3: Tests that expect existing data from a previous run**
- *Risk*: If a developer pattern emerged (post-cycle-2) where one test class inserts and a later test class reads, those tests will break.
- *Mitigation*: Search for cross-test-class data dependencies before this PBI lands. Cycle-1 / cycle-2 integration tests do not exhibit this pattern; each test class seeds its own data. AC-6 catches any unknown coupling.