[AZ-839] Fix C3 fixture path mismatch (batch 108b)

The batch 108 fixture built tile_store + descriptor_index from
the static operator config (root_dir baked into YAML) but built
the AC-3/AC-6 verifier from cache_root/descriptor.index (fresh
tmp path). On Tier-2 the descriptor_batcher would write under
the YAML root and the verifier would open the tmp path, raising
IndexUnavailableError before the fixture could yield a
PopulatedC6Cache. Unit tests missed it because every test
stubbed descriptor_index_factory.

Mutate the c6_tile_cache config block in-memory at fixture entry
so root_dir = cache_root and faiss_index_path falls back to
<cache_root>/descriptor.index. Production C6 components and the
verifier now share one path source. Align tile_store_path with
PostgresFilesystemStore's <root_dir>/tiles layout so the
integration test's tile_store_path.is_dir() assertion holds.

Driver and unit tests are path-agnostic and unaffected. Batch
108b report documents the defect, the fix, and the self-review
miss.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-23 15:20:14 +03:00
parent bfcac2cb9f
commit 8c4be9ace0
2 changed files with 200 additions and 2 deletions
@@ -0,0 +1,179 @@
# Batch 108b — Cycle 3 — AZ-839 conftest path-mismatch fix
**Date**: 2026-05-23
**Tasks**: AZ-839 (C3 — Epic AZ-835).
**Story points**: 0 (defect fix on top of the AZ-839 batch 108
ship; counts under the existing 5 SP envelope).
**Jira status**: AZ-839 reopened (In Testing → In Progress) at the
start of this batch on the 2026-05-23 self-review finding;
re-transitions to In Testing at commit step.
## Why this batch exists
The AZ-839 batch 108 self-review verdict was PASS_WITH_WARNINGS
based on 11 driver unit tests + 28 replay-suite passes. While
reading the C3 fixture to plan the AZ-840 orchestrator, a real
path-mismatch defect surfaced that **AC-3 / AC-6 unit tests
could not catch** because every unit test stubs the
`descriptor_index_factory`. The defect was not introduced by
batch 108b — it was missed by batch 108's self-review and would
have failed the AC-9 Tier-2 integration test on first execution.
Per `meta-rule.mdc` "Real Results, Not Simulated Ones" the work
was paused before any AZ-840 code was written, the user was given
a Choose A/B/C/D, and option A (reopen AZ-839, fix, recommit) was
selected.
## The defect
In `tests/e2e/replay/conftest.py::_build_operator_pre_flight_cache`:
* `tile_store = build_tile_store(config)` constructed a
`PostgresFilesystemStore` whose filesystem root came from
`config.components["c6_tile_cache"].root_dir` — i.e. the static
YAML path baked into the operator config (default
`/var/lib/gps-denied/tiles`).
* `descriptor_index = build_descriptor_index(config)` constructed
a `FaissDescriptorIndex` at
`<config.root_dir>/descriptor.index`.
* `_descriptor_index_factory()` (the AC-3 / AC-6 verifier seam)
constructed a SEPARATE `FaissDescriptorIndex` at
`cache_root / "descriptor.index"` — the freshly-mktemp'd
fixture path.
* On Tier-2 those two paths cannot be equal: `cache_root` is
generated at test time by `tmp_path_factory`; the static YAML
carries a path that is fixed at config-load time.
* Result: `descriptor_batcher.populate_descriptors()` writes the
rebuilt FAISS triple under the static YAML root; the verifier
then opens `cache_root/descriptor.index` and finds nothing,
raising `IndexUnavailableError` from `FaissDescriptorIndex._load`.
The fixture would have failed to ever yield a `PopulatedC6Cache`
on Tier-2 — AC-3 (paths populated) and AC-6 (sidecar coherence)
both unreachable.
The same shape applied to the tile filesystem: `tile_store_path =
cache_root / "tile_store"` did not match the actual
`PostgresFilesystemStore` layout (`<root_dir>/tiles/`).
## The fix
`_build_operator_pre_flight_cache` now mutates the in-memory
`c6_tile_cache` config block so the production C6 components and
the verifier all read/write under the fixture's `cache_root`:
```python
c6_block = config.components["c6_tile_cache"]
c6_block_overridden = dataclasses.replace(
c6_block,
root_dir=str(cache_root),
faiss_index_path="", # force fallback to <root_dir>/descriptor.index
)
config = dataclasses.replace(
config,
components={**config.components, "c6_tile_cache": c6_block_overridden},
)
tile_store_path = cache_root / "tiles"
faiss_index_path = cache_root / "descriptor.index"
```
After the override:
* `build_tile_store(config)` writes under `cache_root/tiles/`.
* `build_descriptor_index(config)` rebuilds at
`cache_root/descriptor.index` (+ `.sha256` + `.meta.json`).
* `_descriptor_index_factory()` reads from the same
`cache_root/descriptor.index` — triple-consistency check now has
files to validate.
* `PopulatedC6Cache.tile_store_path` matches the
`PostgresFilesystemStore.__init__` layout (`self._tiles_dir =
self._root_dir / "tiles"`); the integration test's
`populated.tile_store_path.is_dir()` assertion will hold.
The existing operator-config YAML stays unchanged — the override
is in-memory, scoped to the fixture session, and never touches the
disk file the operator wrote.
## Files changed
* `tests/e2e/replay/conftest.py` — added `import dataclasses`;
added the c6_tile_cache override block + comment in
`_build_operator_pre_flight_cache`; renamed
`tile_store_path = cache_root / "tile_store"`
`cache_root / "tiles"` to match `PostgresFilesystemStore` layout;
removed the unused `tile_store_path.mkdir(...)` (the store's
constructor creates it).
No driver, unit-test, or integration-test changes. The driver's
public API (`populate_c6_from_route`, `PopulatedC6Cache`) is
unchanged.
## AC coverage delta
The minimal fix narrows AC-3 (paths populated) and AC-6 (sidecar
coherence) from "would have failed on Tier-2" to "actually
verifiable on Tier-2". No AC was previously claimed PASS that
this batch downgrades.
## Test run results
```
$ .venv/bin/pytest tests/e2e/replay/ -v --tb=short --timeout=60
============================ 28 passed, 9 skipped in 3.08s ===========================
```
Same outcome as batch 108. The unit suite is path-agnostic (every
test in `test_operator_pre_flight_driver.py` injects its own
paths through `_build_harness`) so the fix has no observable
effect on the green path. The 9 skipped tests are
RUN_REPLAY_E2E + Tier-2 gated; they will exercise the fix on the
Jetson harness when AZ-839's AC-9 integration test next runs.
## Code review (self-review of batch 108b)
Verdict: **PASS** (single-finding fix; no new findings).
| Phase | Result |
|-------|--------|
| 1. Context loading | Re-read `storage_factory.py` + `postgres_filesystem_store.py` + `faiss_descriptor_index.py` to confirm where `root_dir` / `faiss_index_path` are honoured. |
| 2. Spec compliance | AZ-839 AC-3 / AC-6 are now reachable on Tier-2; AC-9 entry point unchanged. |
| 3. Code quality | Comment names the failure mode the override prevents. `dataclasses.replace` is used twice rather than mutating frozen dataclasses. The new `tile_store_path` matches the production layout exactly. |
| 4. Security quick-scan | The override only changes paths; no DSN, JWT, or env-secret handling moved. |
| 5. Performance scan | No-op — the override runs once per session, before any heavy I/O. |
| 6. Cross-task consistency | Single-defect batch — N/A. |
| 7. Architecture compliance | The fixture stays in `tests/`; mutating `config.components` is a documented composition-root pattern (see `Config.with_blocks`). No new src/ writes. |
## Self-review meta — why batch 108 missed this
The batch 108 self-review went through all 7 review phases but
relied on the unit-test pass count for AC-3 / AC-6 confidence.
Every unit test injected its own `descriptor_index_factory`, so
the fixture's wiring of that factory to `cache_root` was never
exercised against the real production wiring of `descriptor_index`
to `config.root_dir`. Phase 7 (Architecture compliance) noted
"the conftest fixture wires deps via the existing `runtime_root`
factories — does not import concrete impl modules directly" but
did not check that the wiring was internally consistent.
Preventive lesson (no rule change yet — surfacing for AZ-840
follow-up): **when a fixture wires production components from a
config and ALSO constructs a side verifier from a different
source of truth, the two paths must be derived from a single
upstream value or asserted equal at fixture-setup time.** This
goes into the AZ-839 leftover note for AZ-840 to act on or to
escalate to a `coderule.mdc` rule update.
## Notes for follow-up
* AZ-840 (e2e orchestrator test) — this batch unblocks AZ-840
AC-3 (which hard-depends on the C3 fixture producing a usable
cache). AZ-840 will additionally need to feed the airborne
replay binary a config that points at the same `cache_root`
(the binary takes a single `--config <path>` and cannot read
the in-memory mutation); the cleanest path is for AZ-840 to
write an effective YAML at runtime from the same override
recipe used here. AZ-840's batch report will record the choice.
* AZ-839's batch 108 self-review process is being noted as a
partially-effective gate. No `coderule.mdc` rule change yet —
the `meta-rule.mdc` "Real Results" rule already covers the
general case; AZ-840's planning will check whether a more
specific fixture-vs-config-wiring rule is warranted.
+21 -2
View File
@@ -15,6 +15,7 @@ import shutil
import subprocess import subprocess
import sys import sys
from collections.abc import Iterator from collections.abc import Iterator
import dataclasses
from dataclasses import dataclass from dataclasses import dataclass
from pathlib import Path from pathlib import Path
from typing import Any from typing import Any
@@ -430,8 +431,26 @@ def _build_operator_pre_flight_cache(
config = load_config(os.environ, paths=[config_path]) config = load_config(os.environ, paths=[config_path])
cache_root = tmp_path_factory.mktemp("operator_pre_flight_cache") cache_root = tmp_path_factory.mktemp("operator_pre_flight_cache")
tile_store_path = cache_root / "tile_store" # PostgresFilesystemStore writes JPEGs under `<root_dir>/tiles/`;
tile_store_path.mkdir(parents=True, exist_ok=True) # FaissDescriptorIndex falls back to `<root_dir>/descriptor.index`
# when `faiss_index_path` is empty. Override the c6_tile_cache
# block in-memory so the production components built below
# (build_tile_store / build_descriptor_index / batcher) write to
# the same `cache_root` PopulatedC6Cache advertises. Without this
# the static YAML at GPS_DENIED_OPERATOR_CONFIG_PATH would route
# writes to its baked-in `root_dir` while the verifier read from
# the fixture's tmp path, breaking AC-3 / AC-6 on Tier-2.
c6_block = config.components["c6_tile_cache"]
c6_block_overridden = dataclasses.replace(
c6_block,
root_dir=str(cache_root),
faiss_index_path="",
)
config = dataclasses.replace(
config,
components={**config.components, "c6_tile_cache": c6_block_overridden},
)
tile_store_path = cache_root / "tiles"
faiss_index_path = cache_root / "descriptor.index" faiss_index_path = cache_root / "descriptor.index"
route_spec = extract_route_from_tlog( route_spec = extract_route_from_tlog(