mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 05:41:13 +00:00
[AZ-777] Rewrite spec: real satellite-provider + production C10/C11
Original spec called for direct OSM/CARTO downloads, contradicting architecture (C11 owns tile network I/O against parent-suite satellite-provider .NET 8 service; C10 batches descriptors over the populated C6, never touches the upstream). Rewritten spec drives the production C10/C11 pipeline against the real satellite-provider running in docker-compose.test.yml, replacing the mock-suite-sat-service GET stub.

Complexity 5 -> 8 pts (single-ticket override). Decision log: _docs/_process_leftovers/2026-05-21_az777_complexity_override.md.

Jira AZ-777 description + summary synced. Autodev state pauses for next session to pick up Phase 1 (satellite-provider stand-up + smoke test).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,193 +1,196 @@
|
||||
# Derkachi C6 reference tile cache + descriptor index (OSM/CARTO basemap)
|
||||
# Derkachi e2e: wire real satellite-provider + production C10/C11 pipeline into the operator pre-flight fixture
|
||||
|
||||
**Task**: AZ-777_derkachi_c6_reference_fixture
|
||||
**Name**: Build the C6 reference tile cache + FAISS descriptor index for the Derkachi flight bbox so the full-protocol C1+C2+C3+C4+C5 pipeline can produce satellite anchors during e2e replay
|
||||
**Description**: Add a reproducible build script that downloads OSM/CARTO basemap tiles for the Derkachi flight bbox (approx 50.05–50.15 lat, 36.05–36.15 lon), pre-computes feature descriptors via the same C7 backbone the airborne binary uses (DINOv2 or the configured VPR backbone), populates the C6 tile store + FAISS HNSW index, and integrates them into the e2e replay harness. Unblocks the two remaining `@xfail`-masked Derkachi tests on Jetson (`test_ac3_within_100m_80pct_of_ticks` and `test_az699_real_flight_validation_emits_verdict_and_report`) and produces the first honest AZ-699 accuracy verdict.
|
||||
**Complexity**: 5 points
|
||||
**Dependencies**: AZ-776_eskf_open_loop_composition_profile
|
||||
**Component**: c6_tile_cache / e2e fixtures / input_data
|
||||
**Name**: Drive the production C10/C11 pre-flight pipeline against a real parent-suite `satellite-provider` service in the e2e harness so the Derkachi clip produces a real FAISS-anchored C4/C5 satellite-fix loop end-to-end
|
||||
**Description**: Replace the e2e harness's `mock-suite-sat-service` `/healthz`-only stub on the GET tile path with the real `satellite-provider` .NET 8 service (sibling repo at `../satellite-provider`). Seed satellite-provider's Derkachi-bbox tile catalog from a CC-BY-licensed basemap source. Replace the `operator_pre_flight_setup` placeholder fixture in `tests/e2e/replay/conftest.py` with a real fixture that drives the production C11 `HttpTileDownloader` + C10 `DescriptorBatcher` pipeline against the running service, builds C6 (Postgres metadata + filesystem tile store + FAISS HNSW descriptor index), and mounts the populated cache into the e2e-runner container. Un-xfail the Derkachi AC-3 + AZ-699 verdict tests on Tier-2 Jetson; produce the first honest AZ-699 horizontal-error verdict report.
|
||||
**Complexity**: 8 points (explicit override of the standard 5-pt PBI cap — see decision log entry 2026-05-21 under `_docs/_process_leftovers/2026-05-21_az777_complexity_override.md`; single-ticket containment is preferred over decomposition because the four sub-deliverables only deliver demo-confidence value when shipped together)
|
||||
**Dependencies**: AZ-776_eskf_open_loop_composition_profile (done — AZ-776 unblocks compose; this task closes the satellite-anchoring loop)
|
||||
**Component**: e2e fixtures / c6_tile_cache / c10_provisioning / c11_tile_manager / docker compose
|
||||
**Tracker**: AZ-777
|
||||
**Epic**: AZ-602
|
||||
|
||||
## Problem
|
||||
|
||||
The Derkachi e2e fixture (`_docs/00_problem/input_data/flight_derkachi/`) ships the real flight inputs (video, tlog, IMU, camera calibration) but DOES NOT ship the C6 tile-cache artifacts that the replay protocol requires the operator's pre-flight C10 stage to produce:
|
||||
The Derkachi e2e fixture (`_docs/00_problem/input_data/flight_derkachi/`) ships real flight inputs (video, tlog, IMU, camera calibration) but DOES NOT ship the populated C6 tile cache + FAISS descriptor index the replay protocol requires (`replay_protocol.md` Invariant 12: "Real C6 cache in replay: the airborne binary in replay mode reads the same pre-built C6 tile cache the operator built via the normal pre-flight C10/C11/C12 flow"). Two architectural gaps stop the full-protocol C1+C2+C3+C4+C5 pipeline from running against Derkachi today:
|
||||
|
||||
- `c6_tile_store` — persistent JPEG tiles covering the flight area at the chosen zoom levels
|
||||
- `c6_descriptor_index` — FAISS index of VPR-backbone descriptors over those tiles
|
||||
1. **`mock-suite-sat-service` is `/healthz`-only.** The stub at `tests/fixtures/mock-suite-sat-service/main.py` exposes only `GET /healthz` and does NOT implement the `/api/satellite/tiles` contract that C11 `HttpTileDownloader` (production code at `src/gps_denied_onboard/components/c11_tile_manager/tile_downloader.py`) queries. Any e2e test that exercises the production tile-download path against the stub gets HTTP 404 as soon as C11 requests `_LIST_PATH` (`/api/satellite/tiles`).
|
||||
2. **`operator_pre_flight_setup` is a placeholder.** The fixture at `tests/e2e/replay/conftest.py` (lines 293-310) `mkdir`s an empty `operator_cache` directory and yields. It does NOT drive C11 download or C10 descriptor-batcher; it does NOT populate C6. The fixture's docstring explicitly calls itself "a stub" pending this ticket.
|
||||
|
||||
Without these artifacts:
|
||||
The production architecture says (per `architecture.md` Principle #5 + the C10/C11 component descriptions):
|
||||
|
||||
- C2 VPR has no haystack to look up against — `c2_vpr.lookup` returns empty.
|
||||
- C3 matcher has nothing to match against (depends on C2 candidates).
|
||||
- C4 pose has no anchors — cannot estimate satellite-frame pose.
|
||||
- C5 state has no anchors to fuse — runs open-loop on VIO only.
|
||||
- C10 does NOT touch satellite-provider — tile network I/O lives in C11.
|
||||
- C11 `HttpTileDownloader` is the production path: authenticated GETs against the parent-suite `satellite-provider` .NET 8 REST service (sibling repo at `../satellite-provider/`, real implementation with `SatelliteProvider.Api`, region-onboarding flows, integration tests).
|
||||
- `satellite-provider` owns the OSM/CARTO tile network I/O + license attribution + multi-flight voting layer — the onboard companion is read-only against it (via C11) during pre-flight and read-only against C6 during flight.
|
||||
- `mock-suite-sat-service` exists specifically for the D-PROJ-2 ingest (POST upload) endpoint that the parent-suite has not yet shipped — NOT for the GET tile-fetch path.
|
||||
|
||||
When `c5_state.strategy = gtsam_isam2` (the default that AZ-699's e2e exercises), the composition reaches the per-frame loop but `iSAM2.update` crashes at frame 1 with:

```
EstimatorFatalError: compute_marginals failed: Attempting to at the
key 'x2', which does not exist in the Values.
```

This happens because no C4 anchor was ever inserted (C2/C3/C4 have nothing to match against).
|
||||
|
||||
AZ-776 (sibling, prerequisite) makes the open-loop C1+C5(ESKF) composition runnable, but that path skips C2–C4 entirely and accepts unbounded drift. To validate the FULL protocol-compliant pipeline against Derkachi, i.e. AC-3 (`≤100 m for 80 % of ticks`) and the AZ-699 horizontal-error verdict, we need real C6 fixtures.
|
||||
|
||||
The replay protocol (`replay_protocol.md` line 214) explicitly states "`BUILD_FAISS_INDEX` is ON in the airborne binary (live and replay alike). C2 in replay queries the **real** C6 `FaissDescriptorIndex`, populated by the pre-flight C10 build. This is the architectural change vs. v1.0.0 of this contract." We have no such build for Derkachi.
|
||||
The current AZ-777 spec ("write a script under `scripts/build_derkachi_c6_fixture.py` that downloads OSM/CARTO basemap tiles directly") was inconsistent with this architecture: it asked the onboard companion to do network I/O against an external imagery source instead of going through C11→satellite-provider. The corrected scope (this revision) drives the production pipeline end-to-end.
|
||||
|
||||
## Outcome
|
||||
|
||||
- A reproducible build script under `scripts/` produces the C6 artifacts (`tile_store` + `descriptor_index`) given the Derkachi bbox + zoom levels + camera calibration, deterministically on a clean checkout, in under 30 minutes on a developer workstation.
|
||||
- Reference imagery source is OSM-tile-server-distributed basemap (CARTO Voyager or equivalent CC-BY-licensed source). Each tile carries the source URL + license attribution in its metadata sidecar.
|
||||
- The Derkachi fixture directory documents the build invocation; tiles + index are EITHER committed to the repo (if total size ≤ 100 MB) OR built on-demand from the script (if larger) — decision recorded in the fixture README.
|
||||
- `tests/e2e/replay/conftest.py`'s `operator_pre_flight_setup` fixture is replaced (or extended) to mount the prebuilt artifacts into the e2e-runner container. The mock-suite-sat-service stub is retired for the C6-served paths (it remains for the C12 operator-workflow AC-8).
|
||||
- After this task ships (with AZ-776), un-xfail `test_ac3_within_100m_80pct_of_ticks` (`test_derkachi_1min.py` line 174) AND `test_az699_real_flight_validation_emits_verdict_and_report` (`test_derkachi_real_tlog.py` line 174); both pass on the Jetson harness.
|
||||
- The first honest AZ-699 verdict lands at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` with the full horizontal-error distribution. Whether the verdict is PASS or FAIL is the honest finding — this task's success is that the verdict is *produced* against the real pipeline, not that it is necessarily green.
|
||||
- The e2e harness `docker-compose.test.yml` runs the real `satellite-provider` .NET 8 service (built from `../satellite-provider/SatelliteProvider.Api/Dockerfile`) alongside the existing `mock-sat` (which is retained only for the D-PROJ-2 POST/upload contract until the parent suite ships it).
|
||||
- `satellite-provider`'s tile catalog is seeded with the Derkachi bbox (≈50.05–50.15 lat, 36.05–36.15 lon) at the camera-AGL-appropriate zoom levels (15–18) via the service's existing region-onboarding flow (CC-BY-licensed basemap source; license + attribution baked into the seeded catalog's metadata).
|
||||
- `tests/e2e/replay/conftest.py::operator_pre_flight_setup` is replaced by a real fixture that:
|
||||
1. Resolves the Derkachi bbox + camera-derived zoom range from the existing flight fixture.
|
||||
2. Invokes C11 `HttpTileDownloader` against the running `satellite-provider` to populate C6 (Postgres metadata + filesystem tile store).
|
||||
3. Invokes C10 `DescriptorBatcher` against the populated C6 to build the FAISS HNSW descriptor index via the production NetVLAD backbone (C2 default per `c2_vpr/config.py:67`).
|
||||
4. Verifies all three sidecar files (`.index`, `.sha256`, `.meta.json`) per the FAISS sidecar coherence invariant (AZ-306).
|
||||
5. Yields the populated cache directory + Postgres connection string for the e2e-runner to mount.
|
||||
- The populated C6 is mounted into the `e2e-runner` container via named volumes that survive across pytest sessions (so repeated test runs reuse the cache).
|
||||
- AC-3 (`test_ac3_within_100m_80pct_of_ticks` in `tests/e2e/replay/test_derkachi_1min.py`) un-xfails and passes on Tier-2 Jetson with ≥ 80 % of ticks within 100 m of ground truth.
|
||||
- AZ-699 verdict test (`test_az699_real_flight_validation_emits_verdict_and_report` in `tests/e2e/replay/test_derkachi_real_tlog.py`) un-xfails and produces the first honest horizontal-error distribution report at `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md`.
|
||||
|
||||
## Scope
|
||||
|
||||
### Included
|
||||
|
||||
- `scripts/build_derkachi_c6_fixture.py` (or equivalent module under `e2e/fixtures/derkachi_c6/`): reproducible build pipeline that:
|
||||
- Reads the Derkachi bbox + zoom levels from a small YAML config (`tests/fixtures/derkachi_c6/bbox.yaml`).
|
||||
- Downloads OSM/CARTO basemap tiles into `<output>/tiles/{zoom}/{x}/{y}.jpg` mirroring `satellite-provider`'s on-disk layout (per architecture principle #5).
|
||||
- Computes per-tile descriptors via the same C7 backbone the airborne binary uses (configurable; defaults to whatever `config.components.c2_vpr.strategy`'s feature dimension is — e.g. UltraVPR or NetVLAD).
|
||||
- Builds a FAISS HNSW index over the descriptors, writes via `faiss.write_index` + atomicwrites + SHA-256 content-hash gate (per D-C10-3).
|
||||
- Emits a manifest JSON recording tile count, bbox, zoom levels, backbone, descriptor dimension, FAISS index parameters, source URL template, license, and the SHA-256 of every artifact.
|
||||
- `tests/fixtures/derkachi_c6/bbox.yaml`: the bbox + zoom + backbone config consumed by the build script. Committed.
|
||||
- `tests/fixtures/derkachi_c6/README.md`: how to rebuild + license attribution + estimated artifact size.
|
||||
- Build the artifacts once, decide commit vs on-demand:
|
||||
- If total size ≤ 100 MB → commit to `_docs/00_problem/input_data/flight_derkachi/c6_cache/` (under LFS).
|
||||
- If > 100 MB → keep build-on-demand only, document the build invocation in the fixture README, and add a `scripts/run-tests-jetson.sh` pre-step that builds if absent.
|
||||
- `tests/e2e/replay/conftest.py`: replace `operator_pre_flight_setup`'s mock with a real fixture that mounts the prebuilt artifacts into the e2e-runner container at the expected paths (`/opt/tiles/`, `/opt/descriptor_index.index`).
|
||||
- `docker-compose.test.yml` + `docker-compose.test.jetson.yml`: mount the artifacts into the `e2e-runner` service (bind mount or named volume), set `c6_tile_store.path` + `c6_descriptor_index.path` env vars.
|
||||
- `tests/e2e/replay/test_derkachi_1min.py`: remove the `@pytest.mark.xfail` decorator on AC-3 (line 174).
|
||||
- `tests/e2e/replay/test_derkachi_real_tlog.py`: remove the `@pytest.mark.xfail` decorator on AZ-699 (line 174).
|
||||
- `_docs/00_problem/input_data/flight_derkachi/README.md`: document the new C6 artifacts + build invocation + license attribution.
|
||||
- `_docs/02_document/contracts/c6_tile_cache/`: if a contract file exists for the descriptor-index format, append a Consumer entry naming this fixture; if not, no new contract needed.
|
||||
**Phase 1 — satellite-provider stand-up in the e2e harness**
|
||||
|
||||
- `docker-compose.test.yml`: add a `satellite-provider` service that builds from `../satellite-provider/SatelliteProvider.Api/Dockerfile`. Service depends on a `satellite-provider-db` Postgres instance (separate from the existing `db` service for c6 metadata to avoid cross-tenant table collisions). Service exposes port 5101 (`satellite-provider` standard) inside the compose network.
|
||||
- `e2e-runner` env: replace `SATELLITE_PROVIDER_URL: http://mock-sat:5100` with `SATELLITE_PROVIDER_URL: http://satellite-provider:5101` for the C11 download path. Keep `MOCK_SAT_UPLOAD_URL: http://mock-sat:5100` for the D-PROJ-2 POST stub (until D-PROJ-2 ships).
|
||||
- `docker-compose.test.jetson.yml`: mirror the same satellite-provider service for Tier-2 (build context unchanged). Jetson needs an arm64 image of the parent-suite .NET service: verify in this task whether the existing Dockerfile produces an arm64-capable image; if it does not, file a follow-up.
|
||||
- Smoke test in `tests/e2e/satellite_provider/test_smoke.py`: brings up the docker-compose stack, GETs `/healthz` against the real service, runs a single C11 `HttpTileDownloader.download_for_bbox` call against a 1-tile bbox, asserts the tile arrives in C6 + the metadata row is inserted. Gated by `RUN_REPLAY_E2E=1`.
|
||||
|
||||
**Phase 2 — Derkachi tile catalog seeding**
|
||||
|
||||
- `tests/fixtures/derkachi_c6/seed_region.py` (new): a Python helper that calls the real `satellite-provider` region-onboarding endpoint to register the Derkachi bbox + zoom range. The route is presumed to be `/api/regions`; verify the actual contract against the .NET source at `../satellite-provider/SatelliteProvider.Api`. The seed run uses CARTO Voyager Basemap as the upstream imagery source (CC-BY-3.0; satellite-provider owns the actual tile download from CARTO and applies the freshness gate).
|
||||
- `tests/fixtures/derkachi_c6/bbox.yaml`: Derkachi bbox + zoom levels + imagery source + license attribution metadata. The values match the seed script's payload.
|
||||
- `tests/fixtures/derkachi_c6/README.md`: how to re-seed if the satellite-provider DB is wiped; license attribution operators must propagate.
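
As a rough sizing aid for the bbox and zoom range above, the following sketch counts tiles with the standard slippy-map (XYZ / Web Mercator) tile formula. That satellite-provider indexes its `{zoom}/{x}/{y}` layout with this scheme is an assumption to verify against the service, and `DERKACHI_BBOX`, `deg2tile`, `tile_range` and `tile_count` are hypothetical names, not part of the repo.

```python
import math
from typing import Tuple

# Approximate Derkachi bbox from the spec: (lat_min, lon_min, lat_max, lon_max).
DERKACHI_BBOX = (50.05, 36.05, 50.15, 36.15)


def deg2tile(lat_deg: float, lon_deg: float, zoom: int) -> Tuple[int, int]:
    """Return the slippy-map (x, y) tile containing a WGS84 point."""
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_range(bbox, zoom: int) -> Tuple[int, int, int, int]:
    """Return (x_min, x_max, y_min, y_max); tile y grows southwards."""
    lat_min, lon_min, lat_max, lon_max = bbox
    x_min, y_max = deg2tile(lat_min, lon_min, zoom)  # south-west corner
    x_max, y_min = deg2tile(lat_max, lon_max, zoom)  # north-east corner
    return x_min, x_max, y_min, y_max


def tile_count(bbox, zoom: int) -> int:
    """Number of tiles needed to cover the bbox at one zoom level."""
    x_min, x_max, y_min, y_max = tile_range(bbox, zoom)
    return (x_max - x_min + 1) * (y_max - y_min + 1)


if __name__ == "__main__":
    # Zoom levels 15-18 are the range named in the spec.
    for zoom in range(15, 19):
        print(zoom, tile_count(DERKACHI_BBOX, zoom))
```

The count grows roughly fourfold per zoom level, so the highest zoom level dominates both the seed time and the cold-cache download time.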
|
||||
|
||||
**Phase 3 — replace `operator_pre_flight_setup` with a real fixture**
|
||||
|
||||
- `tests/e2e/replay/conftest.py::operator_pre_flight_setup`: replace the placeholder. The new fixture:
|
||||
- Reads the Derkachi bbox from `tests/fixtures/derkachi_c6/bbox.yaml`.
|
||||
- Invokes C11 `HttpTileDownloader` against the running satellite-provider service.
|
||||
- Invokes C10 `DescriptorBatcher` against the populated C6 (NetVLAD backbone per c2_vpr default).
|
||||
- Verifies sidecar coherence (`.index` + `.sha256` + `.meta.json` triple-consistency check per AZ-306).
|
||||
- Yields a `PopulatedC6Cache` dataclass that the test bodies consume.
|
||||
- The fixture's outputs are mounted into the e2e-runner container via named volumes that survive across pytest sessions, so a later test run against the same docker-compose stack reuses the populated cache (re-seeding takes minutes, re-downloading takes longer).
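
The sidecar coherence step in the list above could look roughly like the sketch below. It assumes the three files share a stem (`<name>.index`, `<name>.sha256`, `<name>.meta.json`) and that the meta file stores the hash under a `sha256` key; both are guesses, since the AZ-306 invariant is not spelled out here. `SidecarMismatchError` is a hypothetical stand-in for the repo's `IndexUnavailableError`.

```python
import hashlib
import json
from pathlib import Path


class SidecarMismatchError(RuntimeError):
    """Hypothetical stand-in for the repo's IndexUnavailableError."""


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecars(index_path: Path) -> str:
    """Check that the index, its .sha256 file and its .meta.json agree."""
    sha_path = index_path.with_suffix(".sha256")      # assumed naming
    meta_path = index_path.with_suffix(".meta.json")  # assumed naming
    for path in (index_path, sha_path, meta_path):
        if not path.is_file():
            raise SidecarMismatchError(f"missing sidecar: {path}")

    actual = sha256_of(index_path)
    tokens = sha_path.read_text(encoding="utf-8").split()
    recorded = tokens[0] if tokens else ""
    # The "sha256" key is an assumed field name in the meta file.
    in_meta = json.loads(meta_path.read_text(encoding="utf-8")).get("sha256")

    if not (actual == recorded == in_meta):
        raise SidecarMismatchError(
            f"hash mismatch: index={actual} sha256={recorded} meta={in_meta}"
        )
    return actual
```

Running the check immediately before the fixture yields means a tampered or half-written index surfaces as a fixture error rather than as a confusing failure inside a test body.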
|
||||
|
||||
**Phase 4 — un-xfail the Tier-2 tests**
|
||||
|
||||
- `tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks`: remove `@pytest.mark.xfail` (still gated by `RUN_REPLAY_E2E=1` env + `tier2` marker — only runs on Tier-2 harness).
|
||||
- `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report`: remove `@pytest.mark.xfail`. The test body MUST emit the verdict report to `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` regardless of PASS/FAIL — the success criterion is that the report exists with the honest distribution, not that the verdict is necessarily green.
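
To illustrate the "report regardless of PASS/FAIL" requirement, here is a minimal sketch that writes the verdict file before any assertion could abort the test. The report layout, the function names and the percentile choice are all hypothetical; only the file-name pattern and the 80 % / 100 m gate come from the spec.

```python
import datetime as dt
import math
from pathlib import Path
from typing import Sequence


def nearest_rank(sorted_vals: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (0 < pct <= 100) of an ascending sequence."""
    rank = max(1, math.ceil(pct / 100.0 * len(sorted_vals)))
    return sorted_vals[rank - 1]


def write_verdict_report(
    errors_m: Sequence[float],
    out_dir: Path,
    day: dt.date,
    threshold_m: float = 100.0,
    required_fraction: float = 0.80,
) -> Path:
    """Write the horizontal-error report and return its path."""
    if not errors_m:
        raise ValueError("no per-tick errors to report")
    ordered = sorted(errors_m)
    within = sum(e <= threshold_m for e in ordered) / len(ordered)
    verdict = "PASS" if within >= required_fraction else "FAIL"

    lines = [
        f"# Real flight validation {day.isoformat()}",
        "",
        f"- Verdict: {verdict}",
        f"- Ticks: {len(ordered)}",
        f"- Within {threshold_m:.0f} m: {within:.1%}",
        f"- p50 error (m): {nearest_rank(ordered, 50):.1f}",
        f"- p80 error (m): {nearest_rank(ordered, 80):.1f}",
        f"- p95 error (m): {nearest_rank(ordered, 95):.1f}",
        f"- Max error (m): {ordered[-1]:.1f}",
    ]
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"real_flight_validation_{day.isoformat()}.md"
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

A test body built this way would call `write_verdict_report` first and only then assert on the verdict, so a FAIL still leaves the distribution on disk.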
|
||||
|
||||
**Phase 5 — documentation**
|
||||
|
||||
- `_docs/02_document/contracts/replay/replay_protocol.md`: Invariant 12 already states "Real C6 cache in replay" — append the AZ-777 / e2e-runner integration detail in a new sub-section under **Composition root extension** describing the operator_pre_flight_setup fixture's behaviour.
|
||||
- `_docs/00_problem/input_data/flight_derkachi/README.md`: add a Derkachi C6 section pointing at the seed script + bbox config.
|
||||
- `_docs/02_document/architecture.md`: append a new sub-section to the existing satellite-provider entry (line ~28) noting that the e2e harness now stands up the real service via `docker-compose.test.yml`; `mock-suite-sat-service` is retained only for the unshipped D-PROJ-2 POST contract.
|
||||
|
||||
### Excluded
|
||||
|
||||
- Multi-flight fixtures — just Derkachi. (Other flights would each need their own C6 build invocation.)
|
||||
- Online tile download at test time — the e2e harness MUST remain offline (per replay protocol Invariant 5 / RESTRICT-SAT-1 / NFT-SEC-02; the docker compose `internal: true` network). The build script downloads tiles AT BUILD TIME from the developer workstation; the e2e harness only sees the prebuilt artifacts.
|
||||
- Replacing the mock-suite-sat-service stub for the C12 operator-workflow `test_ac8_operator_workflow` test — that test exercises the D-PROJ-2 ingest contract which is parent-suite work, not in scope here.
|
||||
- Building tiles for any backbone other than the airborne-default. If the operator wants a different backbone, they re-run the script with a different `--backbone` flag; this task only commits the default-backbone artifacts.
|
||||
- Switching the airborne C6 backend from Postgres-mirroring to anything else — the build script writes the same on-disk layout the production C6 expects.
|
||||
- AZ-776 (sibling): this task does NOT introduce the `c4_pose.enabled` flag or the open-loop composition profile. AZ-776 must land first to unblock the open-loop xfails (AC-1, AC-2, AC-5, AC-6); this task targets the full-GTSAM xfails (AC-3, AZ-699).
|
||||
- The D-PROJ-2 POST/upload contract — still gated on the parent-suite design landing. `mock-suite-sat-service` continues to handle the POST stub.
|
||||
- Multi-flight fixtures — Derkachi only. Other flights each need their own bbox seed and re-run.
|
||||
- Switching C2 default backbone away from `net_vlad` — out of scope; if the operator wants UltraVPR or DINOv2, they re-run C10 with a different backbone configuration.
|
||||
- Cross-compilation of satellite-provider for Jetson arm64 if the existing Dockerfile does not produce arm64 — file a follow-up ticket if needed; this task does NOT attempt to land arm64 support in the .NET service.
|
||||
- Modifying any file under `../satellite-provider/` (sibling repo) — this task is purely additive on the gps-denied-onboard side + docker-compose orchestration. If the .NET service is missing an endpoint the C11 client requires, file a parent-suite ticket and STOP.
|
||||
- Persisting the populated C6 to git/LFS — the named-volume approach above keeps the cache out of the repo. If repo-committed artifacts become a requirement later, file a follow-up to evaluate LFS size.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
**AC-1: Reproducible build**
|
||||
Given a clean checkout
|
||||
When `python scripts/build_derkachi_c6_fixture.py --output tests/fixtures/derkachi_c6/out --bbox tests/fixtures/derkachi_c6/bbox.yaml` runs
|
||||
Then it produces a `tiles/` directory in the documented `{zoom}/{x}/{y}.jpg` layout, a FAISS `.index` file with a SHA-256-verified content hash, and a `manifest.json` recording tile count, bbox, backbone, descriptor dimension, FAISS parameters, source URL template, license, and per-artifact SHA-256, in under 30 minutes on a developer workstation
|
||||
**AC-1: Real satellite-provider runs in the e2e harness**
|
||||
Given `docker-compose.test.yml` with the new `satellite-provider` service
|
||||
When `docker compose -f docker-compose.test.yml up satellite-provider` is invoked
|
||||
Then the service builds from `../satellite-provider/SatelliteProvider.Api/Dockerfile`, comes up healthy on port 5101, and `GET /healthz` returns 200
|
||||
|
||||
**AC-2: License attribution**
|
||||
Given the produced artifacts
|
||||
When the manifest is inspected
|
||||
Then it records the tile source URL template, the license name (CC-BY-3.0 or CC-BY-4.0 as applicable), and the attribution string the operator must surface in any derived publication
|
||||
**AC-2: C11 downloads against real satellite-provider succeed**
|
||||
Given the running satellite-provider service + a seeded Derkachi-bbox tile catalog
|
||||
When `tests/e2e/satellite_provider/test_smoke.py` runs C11 `HttpTileDownloader.download_for_bbox` for a single tile
|
||||
Then the tile arrives in the C6 filesystem store, the metadata row is inserted into C6's Postgres, and the freshness label is `fresh` (per the C6 freshness gate)
|
||||
|
||||
**AC-3: Offline e2e harness**
|
||||
Given the prebuilt C6 artifacts mounted into the e2e-runner container
|
||||
When `scripts/run-tests-jetson.sh` runs on Jetson with `RUN_REPLAY_E2E=1 GPS_DENIED_TIER=2` and the Docker compose network is `internal: true`
|
||||
Then the test harness never reaches out to any external host; all C6 queries are served from the mounted artifacts
|
||||
**AC-3: operator_pre_flight_setup drives the production pipeline**
|
||||
Given the running satellite-provider with Derkachi tiles seeded
|
||||
When `tests/e2e/replay/conftest.py::operator_pre_flight_setup` runs
|
||||
Then C11 `HttpTileDownloader` downloads the Derkachi-bbox tiles into C6, C10 `DescriptorBatcher` builds the FAISS HNSW index over them using the NetVLAD backbone, the three sidecar files (`.index` + `.sha256` + `.meta.json`) pass the AZ-306 triple-consistency check, and the fixture yields a `PopulatedC6Cache` with all three artifact paths populated
|
||||
|
||||
**AC-4: Full-protocol e2e passes**
|
||||
Given AZ-776 has landed AND the C6 artifacts are mounted AND the YAML config selects `c5_state.strategy = gtsam_isam2` with `c4_pose.enabled = True`
|
||||
When `gps-denied-replay` runs the Derkachi 1-min fixture on Jetson
|
||||
Then it exits with code 0, emits one EstimatorOutput per video frame, `test_ac3_within_100m_80pct_of_ticks` un-xfails and passes (≥80 % of ticks within 100 m of ground truth), and the per-frame loop emits `replay.satellite_anchor_inserted` log lines (not the existing `satellite_anchoring_not_wired` warning)
|
||||
**AC-4: AC-3 Derkachi test un-xfails on Tier-2**
|
||||
Given AZ-776 landed + the populated C6 from AC-3 mounted into the e2e-runner + the airborne binary configured with `c5_state.strategy = gtsam_isam2` + `c4_pose.enabled = True`
|
||||
When `tests/e2e/replay/test_derkachi_1min.py::test_ac3_within_100m_80pct_of_ticks` runs on Tier-2 Jetson
|
||||
Then it un-xfails, the test passes (≥ 80 % of ticks within 100 m of ground truth), and the per-frame loop emits `replay.satellite_anchor_inserted` log lines (not the existing `satellite_anchoring_not_wired` warning)
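
For reference, the "≥ 80 % of ticks within 100 m" gate can be sketched as a haversine distance per tick followed by a simple ratio. This is an illustration, not the test's actual implementation: it uses a spherical Earth model and assumes the estimates and the ground truth are already aligned tick by tick; `haversine_m` and `fraction_within` are hypothetical names.

```python
import math
from typing import Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation

LatLon = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in metres between two WGS84 points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, h)))


def fraction_within(
    estimates: Sequence[LatLon],
    truth: Sequence[LatLon],
    threshold_m: float = 100.0,
) -> float:
    """Share of ticks whose horizontal error is at most threshold_m."""
    if len(estimates) != len(truth) or not truth:
        raise ValueError("estimates and truth must be non-empty and aligned")
    hits = sum(haversine_m(e, t) <= threshold_m for e, t in zip(estimates, truth))
    return hits / len(truth)
```

With this helper the gate itself reduces to `fraction_within(estimates, truth) >= 0.80`.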
|
||||
|
||||
**AC-5: AZ-699 produces an honest verdict**
|
||||
Given AZ-776 has landed AND the C6 artifacts are mounted AND the real flight video + factory calibration are present (already are)
|
||||
When `test_az699_real_flight_validation_emits_verdict_and_report` runs on Jetson
|
||||
Then it un-xfails, the test runs to completion within the 15-min NFR budget, and `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` records the horizontal-error distribution with the honest PASS/FAIL verdict against the ≥80 % within 100 m gate
|
||||
**AC-5: AZ-699 verdict report is produced**
|
||||
Given AZ-776 landed + the populated C6 from AC-3 + the real flight video + factory calibration
|
||||
When `tests/e2e/replay/test_derkachi_real_tlog.py::test_az699_real_flight_validation_emits_verdict_and_report` runs on Tier-2 Jetson
|
||||
Then it un-xfails, the test runs to completion within the 15-min NFR budget, and `_docs/06_metrics/real_flight_validation_<YYYY-MM-DD>.md` records the horizontal-error distribution with the honest PASS/FAIL verdict against the ≥ 80 % within 100 m gate (PASS not required for the AC; HONEST report required)
|
||||
|
||||
**AC-6: Fixture README documents rebuild**
|
||||
Given the updated `_docs/00_problem/input_data/flight_derkachi/README.md`
|
||||
When a new contributor reads it
|
||||
Then it documents (i) what C6 artifacts now exist, (ii) the exact `python scripts/build_derkachi_c6_fixture.py …` invocation to rebuild, (iii) the license attribution operators must propagate, (iv) the size-on-disk decision (committed vs. build-on-demand)
|
||||
**AC-6: Documentation captures the new architecture seam**
|
||||
Given the rewritten replay protocol doc + the Derkachi fixture README + the architecture sub-section
|
||||
When a new contributor reads them
|
||||
Then they understand (i) why the real satellite-provider runs in the e2e harness, (ii) how to re-seed the Derkachi catalog, (iii) which path goes through `mock-sat` vs. real satellite-provider (POST vs. GET), and (iv) what license attribution operators must propagate
|
||||
|
||||
## Non-Functional Requirements
|
||||
|
||||
**Performance**
|
||||
- Build script completes in ≤ 30 minutes on a developer workstation (Apple Silicon or x86 Linux, no GPU required for OSM tile download + descriptor pre-compute via the CPU-fallback path of the backbone).
|
||||
- Built artifacts do not regress the airborne C2 lookup latency budget — the FAISS HNSW parameters MUST match what production C6 expects (M, efConstruction, efSearch); the index is built once and never rebuilt at runtime.
|
||||
- `operator_pre_flight_setup` completes in ≤ 5 minutes on first invocation (cold cache), ≤ 30 seconds on subsequent invocations within the same docker-compose session (warm cache via named volume).
|
||||
- Built C6 artifacts (tile store + descriptor index) MUST NOT regress the airborne C2 lookup latency budget: FAISS HNSW parameters MUST match what production C6 expects (M, efConstruction, efSearch); the index is built once per session, never rebuilt mid-test.
|
||||
|
||||
**Compatibility**
|
||||
- Tile on-disk layout `{zoom}/{x}/{y}.jpg` MUST be byte-equivalent to `satellite-provider`'s layout (architecture principle #5) so a future post-landing upload would be byte-identical.
|
||||
- FAISS index format MUST be loadable by the airborne `c6_descriptor_index.FaissDescriptorIndex` impl without code changes.
|
||||
- Descriptor dimension MUST match the configured C7 backbone's output dimension — the build script asserts this at start.
|
||||
- Tile on-disk layout `{zoom}/{x}/{y}.jpg` MUST be byte-equivalent to satellite-provider's layout (architecture principle #5) — this is automatic because C11 writes via the C6 production code path.
|
||||
- FAISS index format MUST be loadable by the airborne `c6_descriptor_index.FaissDescriptorIndex.from_config` impl without code changes — this is automatic because C10 writes via the C6 production code path.
|
||||
- The .NET satellite-provider service's `/api/satellite/tiles` contract version MUST be compatible with the C11 `HttpTileDownloader._LIST_PATH` / `_GET_PATH` constants (`/api/satellite/tiles`). Mismatch is a parent-suite bug; this task does not patch C11 around it.
|
||||
|
||||
**Reliability**
|
||||
- Build script MUST fail loud on partial downloads (network error, HTTP 429/500, malformed tile) rather than silently producing an incomplete tile store. Resume-from-partial is allowed but each resumed run re-verifies SHA-256 of every committed tile.
|
||||
- The SHA-256 content-hash gate on the FAISS index (per D-C10-3) MUST be enforced — operator can verify a downloaded fixture matches what was built.
|
||||
- The smoke test (AC-2) MUST fail loud if the satellite-provider service is unreachable, returns malformed responses, or rate-limits — no silent skip.
|
||||
- The `operator_pre_flight_setup` fixture MUST clean up partial cache state on failure (no half-built FAISS index left around).
|
||||
- The SHA-256 content-hash gate on the FAISS index (per D-C10-3) MUST be verified at every fixture yield — mismatch raises `IndexUnavailableError`.
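
One way to meet the clean-up-on-failure requirement above is to build into a staging directory and promote it only when the build succeeds. The sketch below is an assumption about how the fixture might do this, not the repo's code, and `staged_cache_dir` is a hypothetical helper. Note that it replaces the final directory by rename, which would not work if that directory is itself the mount point of a named volume; in that case the staging and final directories would both need to live inside the volume.

```python
import contextlib
import os
import shutil
import tempfile
from pathlib import Path
from typing import Iterator


@contextlib.contextmanager
def staged_cache_dir(final_dir: Path) -> Iterator[Path]:
    """Yield a staging directory that replaces final_dir only on success."""
    final_dir.parent.mkdir(parents=True, exist_ok=True)
    # Staging lives next to the target so the final rename stays on one filesystem.
    staging = Path(
        tempfile.mkdtemp(prefix=final_dir.name + ".staging-", dir=final_dir.parent)
    )
    try:
        yield staging
    except BaseException:
        # Any failure: drop the half-built cache and propagate the error.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    else:
        # Success: swap the fully built cache into place.
        if final_dir.exists():
            shutil.rmtree(final_dir)
        os.replace(staging, final_dir)
```

A fixture could wrap the C11 download and the C10 index build in `with staged_cache_dir(cache_dir) as staging:` so that an exception from either stage leaves no partial FAISS index behind.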
|
||||
|
||||
**Security**
|
||||
- Reference imagery URLs MUST be HTTPS. Tile metadata MUST record the exact source URL so license auditors can verify attribution.
|
||||
- No API keys committed to the repo — if the chosen tile source requires registration, the build script reads the key from an env var and documents the env var name in the fixture README.
|
||||
- Reference imagery source URLs MUST be HTTPS. License attribution recorded in the seeded catalog's metadata so operators can verify before any derived publication.
|
||||
- No JWT secrets committed — the satellite-provider service in docker-compose reads `JWT_SECRET` from a `.env.test` file that's `.gitignore`'d; the test environment uses a development-only key.
|
||||
- C11 download MUST go through the production auth path (Bearer token from satellite-provider's `/api/auth`) — no auth bypass for tests.
|
||||
|
||||
## Unit Tests
|
||||
|
||||
| AC Ref | What to Test | Required Outcome |
|--------|--------------|------------------|
| AC-1 | Build script produces `tiles/`, `descriptor_index.index`, `manifest.json` on a small mock bbox | All three artifacts exist, manifest fields populated |
| AC-1 | SHA-256 of `descriptor_index.index` recorded in manifest matches actual file hash | Hashes match |
| AC-2 | Manifest records source URL template + license + attribution | All three fields non-empty |
| AC-2 | License field matches the source's documented license | Round-trips against an enum |
| AC-6 | Fixture README documents the build invocation | Invocation string greps cleanly |
| AC-1 | docker-compose.test.yml validates `satellite-provider` service definition | YAML lints; service has correct build context + port |
| AC-2 | C11 `HttpTileDownloader.download_for_bbox` against a stubbed real satellite-provider response | Returns expected `DownloadBatchReport` with `outcome=SUCCESS` |
| AC-3 | `operator_pre_flight_setup` fixture yields a `PopulatedC6Cache` with non-empty tile store + FAISS index | All three sidecar files exist + sha256 triple-consistency holds |
| AC-3 | Sidecar SHA-256 coherence check inside the fixture | `IndexUnavailableError` raised when one of the three files is tampered |
| AC-6 | Fixture README documents the seed invocation | Invocation string + license attribution greps cleanly |
|
||||
|
||||
## Blackbox Tests

| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|--------------|-------------------|----------------|
| AC-3 | Prebuilt C6 artifacts + e2e-runner with `internal: true` network | Run `scripts/run-tests-jetson.sh` end-to-end | No outbound network calls observed by Docker network logs; all C6 queries return from local index | Security, Reliability |
| AC-4 | AZ-776 landed + C6 artifacts mounted + full-GTSAM YAML | `test_ac3_within_100m_80pct_of_ticks` un-xfailed | Test passes (≥80 % of ticks within 100 m); `satellite_anchor_inserted` log lines visible | Perf, Compat |
| AC-5 | AZ-776 landed + C6 artifacts mounted + real flight video + factory calibration | `test_az699_real_flight_validation_emits_verdict_and_report` un-xfailed | Test runs to completion ≤ 15 min, verdict report written to `_docs/06_metrics/` | Perf |
| AC-1 | docker-compose.test.yml + satellite-provider service definition | `docker compose up satellite-provider` | Service comes up healthy in ≤ 60 s | Perf |
| AC-2 | Real satellite-provider running + 1-tile-bbox query | C11 HttpTileDownloader against the live service | Tile arrives in C6 + metadata row inserted + freshness=fresh | Reliability |
| AC-3 | Seeded Derkachi catalog + e2e-runner | `operator_pre_flight_setup` cold + warm invocation | Cold ≤ 5 min, warm ≤ 30 s, all three sidecar files coherent | Perf |
| AC-4 | AZ-776 landed + populated C6 mounted + full-GTSAM YAML | `test_ac3_within_100m_80pct_of_ticks` un-xfailed on Tier-2 Jetson | Test passes (≥ 80 % within 100 m); `satellite_anchor_inserted` log lines visible | Perf, Compat |
| AC-5 | AZ-776 landed + populated C6 mounted + real flight video + factory calibration | `test_az699_real_flight_validation_emits_verdict_and_report` un-xfailed | Test completes ≤ 15 min, verdict report written to `_docs/06_metrics/` | Perf |

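The accuracy gate in the AC-4 rows (at least 80 % of ticks within 100 m) reduces to a simple ratio over per-tick horizontal errors. The sketch below shows one way to compute it; the function names, the `(lat, lon)` tuple layout, and the use of a spherical-Earth haversine distance are assumptions for illustration, since the spec does not say how the test measures horizontal error.

```python
import math
from typing import Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation

LatLon = Tuple[float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS-84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def fraction_within(
    estimates: Sequence[LatLon],
    truths: Sequence[LatLon],
    threshold_m: float = 100.0,
) -> float:
    """Share of ticks whose horizontal error is at most threshold_m."""
    if not estimates or len(estimates) != len(truths):
        raise ValueError("need equal-length, non-empty tick sequences")
    hits = sum(
        1
        for (elat, elon), (tlat, tlon) in zip(estimates, truths)
        if haversine_m(elat, elon, tlat, tlon) <= threshold_m
    )
    return hits / len(estimates)
```

Under these assumptions the gate itself is `fraction_within(estimates, truths) >= 0.80`. Reporting the full error distribution alongside the ratio keeps a failing run informative.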
## Constraints

- Reference imagery source MUST be OSM/CARTO basemap (CC-BY-licensed). Operator chose this during AZ-777 scoping (cycle-3 Step 9, 2026-05-21) over Maxar Open Data (license uncertainty for in-repo redistribution) and video-self-orthorectification (self-referential, makes AC-3 a smoke test rather than a real accuracy gate). The trade-off — lower-resolution reference imagery may produce a higher residual on the AC-3 horizontal-error metric than satellite imagery would — is an HONEST finding the AZ-699 verdict will surface.
- The build script MUST NOT depend on `satellite-provider` running. The script's only network dependency is the chosen OSM/CARTO tile server (HTTPS, public, no auth).
- The committed artifact size budget (if AC-6 chooses commit-to-repo) is 100 MB total across `tiles/` + `descriptor_index.index`. Over budget → switch to build-on-demand, document in README.
- The `mock-suite-sat-service` stub stays in place for `test_ac8_operator_workflow` — that test exercises the D-PROJ-2 contract which this task does not address.
- Per replay protocol Invariant 5: ZERO outbound network from the e2e-runner. The build script runs on the developer workstation; the harness only sees prebuilt artifacts.
- ZERO modifications to files under `../satellite-provider/` (sibling repo). If a parent-suite API gap is discovered (e.g., `/api/satellite/tiles` returns 404 because the endpoint isn't wired), STOP and file a parent-suite ticket; do not work around it on the onboard side.
- Per replay protocol Invariant 5: ZERO outbound network from the e2e-runner once the cache is populated. The cache-population phase needs network (satellite-provider downloads from CARTO upstream), but once the docker-compose `e2e-runner` service is `internal: true`-networked for the airborne replay run, no external host is reachable. Verify with Docker network inspection during AC-4.
- Imagery source MUST be CC-BY-licensed (CARTO Voyager Basemap or equivalent). The seeded catalog records the license + attribution string operators must propagate in any derived publication.
- The seeded Derkachi catalog size budget is 100 MB on the satellite-provider DB side. Over budget → reduce zoom-level coverage; document the trade-off in `bbox.yaml` and `tests/fixtures/derkachi_c6/README.md`.

## Risks & Mitigation

**Risk 1: OSM basemap residual is too coarse for the AC-3 threshold**
- *Risk*: AC-3's `≤100 m for 80 %` gate may be physically unmeetable when the reference imagery is OSM rasterized basemap (street-level features, not satellite features) — the visual descriptors may not lock against the aerial nav-camera frames at all.
- *Mitigation*: This is an honest discovery. If AC-3 still fails after this task lands, the failure mode shifts from "no anchors at all" (current) to "anchors exist but VPR similarity is too low to produce ≥80 % within 100 m". The AZ-699 verdict report will surface the actual horizontal-error distribution; if it lands at e.g. p50 = 250 m, that becomes evidence for a follow-up ticket to switch to satellite imagery. The xfail is removed in either case because the test now exercises the real pipeline — the verdict, not the xfail, becomes the honest signal.

**Risk 1: satellite-provider's `/api/satellite/tiles` contract drifts from what C11 expects**
- *Risk*: C11 `HttpTileDownloader` was implemented against an older satellite-provider contract; recent satellite-provider changes may have moved or renamed the endpoint.
- *Mitigation*: AC-1 smoke test fires the C11 call against the real service before any test depends on it. Any 404/400/contract mismatch surfaces immediately; the failure points at a parent-suite ticket, not an onboard bug. The onboard code path is the standard production code; this task does not modify it.

**Risk 2: Tile source rate-limits or goes offline mid-build**
- *Risk*: Public OSM/CARTO tile servers may rate-limit or temporarily go down, breaking reproducibility on a re-build.
- *Mitigation*: Build script implements exponential backoff + resume-from-partial. Document the chosen tile-server URL in the fixture README so an operator can swap to a mirror if needed. If commit-to-repo is chosen for the artifacts, future re-builds are unnecessary — the committed artifacts are the source of truth.

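The exponential-backoff mitigation for rate-limited tile servers can be sketched as below. This is a generic retry loop, not the build script's actual code: `fetch_with_backoff`, `TileDownloadError`, the retryable status set, and the injected `fetch` callable (returning an HTTP status and body) are all hypothetical names chosen for the example. Retries stay bounded so the "fail loud" requirement still holds once they are exhausted.

```python
import time
from typing import Callable, Tuple

# Statuses treated as transient in this sketch (assumption).
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


class TileDownloadError(RuntimeError):
    """Raised when a tile cannot be fetched; hypothetical error type."""


def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 5,
    base_s: float = 1.0,
    cap_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it returns HTTP 200, doubling the wait each retry."""
    for attempt in range(max_attempts):
        status, body = fetch()
        if status == 200:
            return body
        if status not in RETRYABLE_STATUS:
            # Client errors such as 404 will not heal with time: fail at once.
            raise TileDownloadError(f"non-retryable HTTP {status}")
        if attempt < max_attempts - 1:
            sleep(min(cap_s, base_s * 2 ** attempt))
    raise TileDownloadError(f"gave up after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A production version would likely add jitter so parallel workers do not retry in lockstep.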
**Risk 2: CARTO Voyager basemap residual is too coarse for AC-4**
- *Risk*: CC-BY basemap is OSM-derived (street-level features, not satellite features). NetVLAD descriptors may not lock against nadir camera frames well enough for ≥ 80 % within 100 m.
- *Mitigation*: This is an honest discovery surface. AC-4 may fail on accuracy after this task lands — the failure mode shifts from "no anchors at all" (current) to "anchors exist but VPR similarity is too low". The AZ-699 verdict report (AC-5) surfaces the actual horizontal-error distribution; if it lands at e.g. p50 = 250 m, that becomes evidence for a follow-up ticket to seed a satellite-imagery source (Maxar Open Data, Sentinel-2, etc.). The xfail is removed in either case because the test now exercises the real pipeline — the verdict, not the xfail, is the honest signal.

**Risk 3: Repo size pressure if artifacts are committed**
- *Risk*: Tile store + FAISS index could exceed 100 MB depending on bbox + zoom levels; committing them under LFS still costs LFS storage and bandwidth.
- *Mitigation*: First build run measures the size. If under 100 MB → commit. If over → build-on-demand documented in README + `scripts/run-tests-jetson.sh` pre-step. Either choice is acceptable per AC-6.

**Risk 3: satellite-provider doesn't build on arm64 (Jetson)**
- *Risk*: The existing `SatelliteProvider.Api/Dockerfile` uses `mcr.microsoft.com/dotnet/aspnet:10.0` which is amd64-default. Tier-2 Jetson is arm64.
- *Mitigation*: First check whether the multi-arch manifest exists for the dotnet/aspnet image at the pinned version. If yes → no action needed. If no → file a follow-up ticket to multi-arch the satellite-provider Dockerfile; AC-4 + AC-5 stay BLOCKED on Tier-2 until that ticket lands, but Phases 1–3 + AC-1/2/3/6 still complete on Tier-1 in this ticket's scope.

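The multi-arch check in the Risk 3 mitigation can be automated by reading the JSON that `docker manifest inspect <image>` prints. The sketch below assumes the manifest-list layout, a top-level `manifests` array whose entries carry a `platform` object with `architecture` and `os`; a single-arch image returns a plain image manifest with no such array. The helper names are hypothetical.

```python
import json
import subprocess
from typing import Any, Dict


def inspect_image(image: str) -> Dict[str, Any]:
    """Return the parsed output of `docker manifest inspect` for image."""
    result = subprocess.run(
        ["docker", "manifest", "inspect", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def supports_platform(
    manifest: Dict[str, Any], architecture: str, os_name: str = "linux"
) -> bool:
    """True if a manifest list advertises the given OS/architecture pair."""
    for entry in manifest.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("architecture") == architecture and platform.get("os") == os_name:
            return True
    # No "manifests" key means a single-arch image manifest: treat as unsupported.
    return False
```

For Phase 1 this could run as `supports_platform(inspect_image("mcr.microsoft.com/dotnet/aspnet:10.0"), "arm64")`, with a `False` result triggering the follow-up ticket rather than a workaround.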
**Risk 4: Backbone descriptor dimension mismatch**
- *Risk*: If the operator changes the airborne C2 backbone (UltraVPR → NetVLAD, etc.) without rebuilding the index, the FAISS load will fail at runtime with a dimension mismatch.
- *Mitigation*: Manifest records the descriptor dimension. C6 loader asserts the manifest's dimension matches the configured backbone's output dimension at compose time; mismatch surfaces as an `AirborneBootstrapError` naming both numbers + the rebuild invocation.

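A compose-time dimension guard of the kind described in the mitigation might look like the following. It is a sketch only: the manifest key `descriptor_dim`, the function name, and the local `AirborneBootstrapError` stand-in are assumptions, and the rebuild command is passed in by the caller because the spec does not give it.

```python
from typing import Any, Dict


class AirborneBootstrapError(RuntimeError):
    """Local stand-in for the project's bootstrap error type (assumption)."""


def assert_descriptor_dim(
    manifest: Dict[str, Any],
    backbone_name: str,
    backbone_dim: int,
    rebuild_cmd: str,
) -> None:
    """Fail fast if the index was built for a different descriptor width."""
    index_dim = manifest["descriptor_dim"]  # hypothetical manifest key
    if index_dim != backbone_dim:
        raise AirborneBootstrapError(
            f"descriptor index was built with dim={index_dim}, but backbone "
            f"'{backbone_name}' outputs dim={backbone_dim}; rebuild with: {rebuild_cmd}"
        )
```

Running the check before the index is loaded turns a late FAISS failure into an early, actionable message. A loaded FAISS index also exposes its dimension as `index.d`, which could serve as a second cross-check against the manifest.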
**Risk 4: docker-compose stand-up flakiness slows down the test suite**
- *Risk*: Cold-bringing up satellite-provider + its Postgres + the gps-denied-onboard companion + e2e-runner across CI pipelines adds wall-clock time.
- *Mitigation*: Named volumes for both the satellite-provider DB and the populated C6 mean only the first run in a CI session pays the cost. Subsequent runs are warm. Document the named volumes in the docker-compose comments + the fixture README so an operator knows to `docker volume prune` if they want to force a re-seed.

**Risk 5: Single-ticket 8-pt complexity exceeds the standard PBI cap**
- *Risk*: The task is intentionally above the 5-pt cap stated in the project's PBI complexity rule; this can mask the failure mode where a sub-phase blocks and the whole ticket grinds.
- *Mitigation*: The five phases above are explicit handoff points. If Phase 1 (satellite-provider stand-up) fails for reasons outside this ticket's scope (e.g., parent-suite contract drift, arm64 issue), the implementer STOPS at the phase boundary, reports the blocker, and proposes a split into smaller follow-up tickets. The "single ticket" property is preserved as long as the work proceeds linearly; if it grinds at any phase boundary, decomposition is the escape hatch.

### ADR Impact

> Affects ADR-001 (composition root is single registration site): unchanged — C6 is built outside the composition root by the operator-side build script; the airborne binary still just loads what's on disk.
> Implements architecture principle #4 (no in-air network I/O) and principle #5 (all persistent imagery in `satellite-provider` on-disk layout) — this is the FIRST executable artifact that demonstrates both principles end-to-end against a real flight.
> Affects ADR-002 (build-time exclusion): unchanged — C11 is already operator-side-only via process-level isolation (architecture Principle #4 + ADR-004); this task just exercises that path against the real upstream.
> Affects ADR-011 (replay is a configuration): unchanged — the per-frame loop is mode-agnostic; this task closes the gap between the live and replay paths' upstream tile source (both now go through the real satellite-provider).
> Implements architecture principle #5 (satellite-provider on-disk layout) end-to-end against a real flight for the first time.
> No new ADR — the architectural decision is "wire the production C10/C11 pipeline into the e2e harness", which is execution of existing decisions, not a new one.

+18
-2
@@ -4,12 +4,28 @@
flow: existing-code
step: 10
name: Implement
status: in_progress
status: paused
sub_step:
phase: 7
name: batch-loop
detail: "batch 103 cycle3: AZ-776 committed + transitioned to In Testing; AZ-777 next"
detail: "batch 104 cycle3: AZ-777 spec rewritten (architecture-aligned, 8 pts, single ticket) + Jira synced; In Progress in Jira; Phase 1 (satellite-provider stand-up in docker-compose.test.yml) ready for next /autodev session"
retry_count: 0
cycle: 3
tracker: jira
last_completed_batch: 103
session_handoff:
current_task: AZ-777
jira_status: in_progress
canonical_spec: _docs/02_tasks/todo/AZ-777_derkachi_c6_reference_fixture.md
decision_log: _docs/_process_leftovers/2026-05-21_az777_complexity_override.md
next_session_phase: "Phase 1 — satellite-provider stand-up in docker-compose.test.yml + smoke test at tests/e2e/satellite_provider/test_smoke.py"
parent_suite_paths:
satellite_provider_repo: ../satellite-provider/
api_dockerfile: ../satellite-provider/SatelliteProvider.Api/Dockerfile
api_port_default: 8080
integration_test_compose: ../satellite-provider/docker-compose.tests.yml
notes:
- "C2 default backbone is net_vlad (c2_vpr/config.py:67) — Phase 3 fixture uses it."
- "STOP gates apply between phases — see canonical spec Risk 5 + Phase headers."
- "If satellite-provider's /api/satellite/tiles contract drifts from C11 expectations, STOP and file parent-suite ticket; do not patch C11."
- "Tier-2 arm64 of satellite-provider not yet validated; check multi-arch manifest in Phase 1 or file follow-up."
@@ -0,0 +1,46 @@
# AZ-777 — Complexity override (8 pts, single ticket)

**Timestamp**: 2026-05-21T13:30:00+03:00
**Type**: Decision log (not a blocked tracker write)
**Decision-maker**: user (explicit choice via /autodev questionnaire 2026-05-21)

## Context

The standard PBI complexity rule in `user_rules` says:

> Create PBI with 2 or 3 points of complexity, could be 5. Do not create very complex PBIs with more than 5 points.

AZ-777 was originally a 5-pt task ("write a script that downloads OSM/CARTO basemap tiles directly"). During cycle-3 Step 10 implementation, the agent surfaced that the task spec contradicted the architecture (C10 does not touch satellite-provider; C11 owns that path against the real parent-suite .NET service). The user was asked to choose among:

- A) Decompose AZ-777 into 4 sub-tickets (AZ-777-a/b/c/d), cancel original
- B) Rewrite AZ-777 in place, expand to 8 pts, keep single ticket, multi-session implementation
- C) Implement original spec as-written (ignore architecture mismatch)
- D) Close cycle, pick up later

User chose B.

## Override rationale

The four sub-deliverables (satellite-provider stand-up, Derkachi catalog seeding, operator_pre_flight_setup rewrite, Tier-2 AC-4/AC-5 validation) only deliver demo-confidence value when shipped together. Splitting them into four PBIs would create a half-shipped state where:

- AZ-777-a alone leaves the e2e harness with a satellite-provider service that nothing consumes.
- AZ-777-b alone seeds a tile catalog that nothing queries.
- AZ-777-c alone tries to drive a fixture without the upstream service in place.

The user's preference is single-ticket containment with explicit phase boundaries documented in the task spec (Phases 1–5 + STOP gates per phase). This is the "single ticket but staged execution" pattern, not the "decompose into sub-tickets" pattern.

## STOP-gate enforcement

The rewritten AZ-777 spec includes explicit STOP gates between phases:

1. **Phase 1 → Phase 2**: If satellite-provider stand-up fails for parent-suite reasons (contract drift, arm64 issue), STOP and file a parent-suite ticket. Do not work around on the onboard side.
2. **Phase 2 → Phase 3**: If satellite-provider's region-onboarding endpoint shape differs from what the seed script expects, STOP and file a parent-suite ticket.
3. **Any phase → next**: If the implementation runs into work that materially exceeds the remaining phase's budget, STOP and propose decomposition (escape hatch into the 4-ticket split that was option A above).

The "single ticket" property is preserved as long as work proceeds linearly. If it grinds at any phase boundary, decomposition becomes the escape hatch. The user has been informed of this escape via the task spec's Risk 5.

## Replay obligation

This is NOT a tracker write blocker — Jira is reachable and the AZ-777 description + story points update is being made in the same /autodev turn that this decision log is being written. This file is the AUDIT TRAIL for the override, not a deferred-write record.

No replay action required on subsequent /autodev invocations. The file can be deleted once AZ-777 is moved to `done/`, but it's small enough that keeping it as historical documentation of the decision is fine.