[AZ-589] [AZ-590] Close completeness gate cycle 1: VIO remediation tasks

The Product Implementation Completeness Gate (cycle 1, 2026-05-16)
audited 107 done product tasks. 105 PASS / 0 BLOCKED / 2 FAIL.

FAIL findings — both AZ-332 (OKVIS2) and AZ-333 (VINS-Mono) ship a
real Python facade + AC-tested fake backend, but their native pybind11
bindings (_native/okvis2_binding.cpp, _native/vins_mono_binding.cpp)
are skeletons: _build_estimator() sets estimator_built_ = false; the
first add_frame() raises *FatalException("estimator not yet wired").
Production-default VIO and the comparative-study path both crash on
the first nav-camera frame.

Remediation tasks created in _docs/02_tasks/todo/:
  - AZ-589  remediate_okvis2_threadedkfvio_wiring  (5pt)
  - AZ-590  remediate_vins_mono_estimator_wiring   (5pt)

Both tasks also seed the per-binary bootstrap register_strategy() call
sites — the existing strategy registry in runtime_root/__init__.py is
never invoked in src/ today.

Artifacts:
  - _docs/03_implementation/implementation_completeness_cycle1_report.md
  - _docs/02_tasks/todo/AZ-589_remediate_okvis2_threadedkfvio_wiring.md
  - _docs/02_tasks/todo/AZ-590_remediate_vins_mono_estimator_wiring.md
  - _docs/02_tasks/_dependencies_table.md  (+2 rows; totals refreshed)
  - _docs/_autodev_state.md                (Step 7 phase 1 parse;
                                            current_batch: 66)

Returning to implement-skill Step 1 to parse Batch 66 against these
remediation tasks (per Step 15 option A).

Co-authored-by: Cursor <cursoragent@cursor.com>
Oleksandr Bezdieniezhnykh
2026-05-16 10:24:38 +03:00
parent c5ffc14fe9
commit be5c6d20aa
5 changed files with 597 additions and 8 deletions
@@ -0,0 +1,377 @@
# Product Implementation Completeness Gate — Cycle 1
**Date**: 2026-05-16
**Cycle**: 1
**Tasks audited**: 107 done product tasks under `_docs/02_tasks/done/` (the
six hygiene-only specs and AZ-525-class follow-ups are included as PASS
because they don't promise new runtime behavior).
**Audit scope**: every task spec's `Description` / `Outcome` /
`Scope.Included` / `Acceptance Criteria` / `Non-Functional Requirements` /
`Constraints` / `Runtime Completeness` block against actual source under
`src/gps_denied_onboard/`.
## Verdict
**FAIL — Step 7 must not advance.**
Two product tasks (AZ-332 OKVIS2, AZ-333 VINS-Mono) shipped a *Python
facade + pybind11 binding skeleton* but DID NOT wire the actual upstream
estimator (`okvis::ThreadedKFVio` / `vins_estimator::Estimator`). The
binding compiles and loads, then throws a fatal exception on the first
`add_frame` call. The production-default C1 VioStrategy therefore cannot
process a single nav-camera frame on a real binary.
The AZ-332 task spec explicitly anticipated this split: AZ-332 §
`Implementation Notes (2026-05-12, batch 23)` names the follow-up
`AZ-332_tier2_validation` and states that this gate (Step 15) is the
designated creator. AZ-333 carries the same skeleton pattern but has no
self-deferral note in its spec. This report executes that contract for
both tasks.
Per `implement/SKILL.md` § 15 ("If any product task is `FAIL`, STOP. Do
not write the final product implementation report and do not proceed to
any downstream autodev step."), Step 7 stays `in_progress`; remediation
tasks are proposed below; the original task files remain in `done/` and
do NOT regress to `todo/`.
## FAIL findings
### AZ-332 — C1 OKVIS2 Strategy (production-default VIO)
**Promised capability**: "production-default `VioStrategy` ... Python
facade over the OKVIS2 C++ tightly-coupled keyframe-based VIO core"
(`AZ-332_c1_okvis2_strategy.md` § Description). `Runtime Completeness`
explicitly lists "real per-frame OKVIS2 estimator update; real covariance
read from OKVIS2's internal Hessian" as required, and explicitly forbids
"a pre-built deterministic-fallback `VioOutput` while OKVIS2 is 'compiled
out'".
**Evidence**:
- `src/gps_denied_onboard/components/c1_vio/okvis2.py` — 339-line Python
facade. Conforms to the AZ-331 `VioStrategy` Protocol. PASS.
- `src/gps_denied_onboard/components/c1_vio/_native/okvis2_binding.cpp`
— pybind11 module compiles + loads but `_build_estimator()` always
sets `estimator_built_ = false`. `_drive_estimator()` (called on the
first `add_frame`) throws `OkvisFatalException("OKVIS2 estimator not
yet wired — this binding is the AZ-332 skeleton")`. FAIL.
- OKVIS2 upstream is never included: the
`#include <okvis/ThreadedKFVio.hpp>` line (line 48 of the binding) is
commented out.
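
The failure mode described in the evidence above can be modelled in a few lines of Python. This is a minimal sketch, not the binding's actual code: the real binding is C++/pybind11, and `SkeletonBinding`, `FatalBindingError`, and `first_frame_smoke_check` are hypothetical names introduced only for illustration. It shows why load-time and construction-time checks pass while the first `add_frame` call fails.

```python
class FatalBindingError(RuntimeError):
    """Hypothetical stand-in for the binding's fatal exception type."""


class SkeletonBinding:
    """Models the reported skeleton: constructs fine, fails on the first frame."""

    def __init__(self) -> None:
        self.estimator_built = False
        self._build_estimator()

    def _build_estimator(self) -> None:
        # The audited skeleton never constructs the upstream estimator,
        # so the flag stays False after construction.
        self.estimator_built = False

    def add_frame(self, timestamp_ns: int, image: bytes) -> None:
        if not self.estimator_built:
            raise FatalBindingError("estimator not yet wired")
        # A wired binding would forward the frame to the estimator here.


def first_frame_smoke_check(binding) -> bool:
    """Return True if the binding accepts one frame without a fatal error."""
    try:
        binding.add_frame(0, b"\x00")
    except FatalBindingError:
        return False
    return True
```

A first-frame check of this kind, run against the real native module rather than the fake backend, is one way such a skeleton could be detected before an end-of-cycle audit.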
**Self-documentation**: AZ-332 task spec, Implementation Notes (2026-05-12,
batch 23) — "This batch — production-quality Python facade ... pybind11
binding source that compiles + loads but throws ... ; Tier-2 follow-up —
actual `okvis::ThreadedKFVio` wiring ... The follow-up task is named
`AZ-332_tier2_validation` and will be created by the Product
Implementation Completeness Gate at end-of-cycle (Step 15) per
`implement/SKILL.md`."
**Blast radius**: the deployment binary (`config.vio.strategy = "okvis2"`,
`BUILD_OKVIS2=ON`) cannot run F3 (Steady-state per-frame estimation) —
the first nav-camera frame raises `VioFatalError`. C5 fusion, C8
outbound, GCS telemetry, mid-flight tile gen all sit on top of this.
### AZ-333 — C1 VINS-Mono Strategy (research-only VIO)
**Promised capability**: `Runtime Completeness` requires "real `VinsMonoStrategy`
class implementing the AZ-331 Protocol; real pybind11 binding to
`cpp/vins_mono/` (real VINS-Mono upstream, de-ROSified); real per-frame
estimator update".
**Evidence**:
- `src/gps_denied_onboard/components/c1_vio/vins_mono.py` — 448-line
Python facade. Conforms to the AZ-331 Protocol. PASS.
- `src/gps_denied_onboard/components/c1_vio/_native/vins_mono_binding.cpp`
— same skeleton pattern as OKVIS2. `_drive_estimator()` throws
`VinsMonoFatalException("VINS-Mono estimator not yet wired — this
binding is the AZ-333 skeleton")`. FAIL.
**Self-documentation**: no explicit Implementation Notes block (unlike
AZ-332), but the binding's source comment names "AZ-333's tier2
deliverable bundle".
**Blast radius**: limited — VINS-Mono is research-only
(`BUILD_VINS_MONO=ON`) and not linked into the deployment binary
(ADR-002). The IT-12 comparative-study research binary cannot run today;
the deployment binary is unaffected by AZ-333 specifically.
## PASS — by component
107 audited tasks, 105 PASS, 0 BLOCKED, 2 FAIL.
Tasks classified as PASS have at least one of:
- A substantial Python/C++ source artifact under the task's owned
component (`module-layout.md` ownership envelope).
- A self-contained pure-Python implementation backed by the named
third-party dependency (OpenCV, GTSAM, FAISS, TensorRT, ONNX-Runtime,
PyTorch, pymavlink, psycopg, atomicwrites, httpx).
- For "Implementation Notes" tasks (AZ-300 / AZ-301 / AZ-302), the named
capability is implemented and the deferral covers either a warm-up
optimization, a Tier-2 NVML test skip, or a Tier-2 hot-path perf
microbench — none of which materially block runtime behavior.
### Foundation + cross-cutting (12 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-263 | initial structure | `src/` skeleton present; package importable. |
| AZ-266 | log module | `gps_denied_onboard.logging` package. |
| AZ-267 | fdr log bridge | producer-id-tagged log → FDR records. |
| AZ-268 | log schema contract test | shipped in tests. |
| AZ-269 | config loader | `gps_denied_onboard.config` (env + YAML). |
| AZ-270 | compose root | `runtime_root/__init__.py` (`compose_root`, `compose_operator`). |
| AZ-271 | config precedence tests | shipped. |
| AZ-507 | hygiene module-layout AZ-270 alignment | lint test `tests/unit/test_az270_compose_root.py`. |
| AZ-508 / AZ-526 | iso-timestamp consolidation | `helpers/iso_timestamps.py`. |
| AZ-527 | engine-dim assertion consolidation | `components/c2_vpr/_engine_dim_assertion.py` + sibling under c3. |
| AZ-528 | c1 vio facade spine consolidation | `_facade_spine.py`. |
### FDR / FdrClient (4 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-272 | fdr record schema | `fdr_client/records.py`. |
| AZ-273 | fdr client ringbuf | `fdr_client/client.py`. |
| AZ-274 | fdr overrun emission | producer-side overrun records. |
| AZ-275 | fake fdr sink | test fixture, used by every component's unit tests. |
### Shared helpers (8 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-276 | imu_preintegrator | `helpers/imu_preintegrator.py` (real GTSAM `CombinedImuFactor` substrate). |
| AZ-277 | se3_utils | `helpers/se3_utils.py`. |
| AZ-278 | lightglue_runtime | `helpers/lightglue_runtime.py` (TRT engine handle). |
| AZ-279 | wgs_converter | `helpers/wgs_converter.py`. |
| AZ-280 | sha256 sidecar | `helpers/sha256_sidecar.py`. |
| AZ-281 | engine filename schema | `helpers/engine_filename.py`. |
| AZ-282 | ransac filter | `helpers/ransac_filter.py` (cv2 essential-matrix). |
| AZ-283 | descriptor normaliser | `helpers/descriptor_normaliser.py`. |
### C13 FDR writer (6 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-291 | writer thread | `c13_fdr/writer.py` (real single-writer thread + ringbuf consumer). |
| AZ-292 | flight header/footer | persistent records. |
| AZ-293 | capacity cap policy | `≤ 64 GB` enforcement, oldest-first rollover. |
| AZ-294 | mid-flight tile snapshot | C6 → C13 hook. |
| AZ-295 | thumbnail rate limiter | ≤ 0.1 Hz failed-tile thumbnail log. |
| AZ-296 | open-error takeoff abort | `take_off` aborts with exit 2 + structured ERROR. |
### C7 Inference (6 tasks) — all PASS (notable deferrals are documented + non-blocking)
| Task | Title | Evidence |
|------|-------|----------|
| AZ-297 | runtime protocol | `c7_inference/interface.py`. |
| AZ-298 | tensorrt runtime | 1263-line `tensorrt_runtime.py`; lazy-imports real `tensorrt` (line 497). |
| AZ-299 | onnxrt fallback | 666-line `onnx_trt_ep_runtime.py`; lazy-imports `onnxruntime` (line 213). |
| AZ-300 | pytorch baseline | 339-line `pytorch_fp16_runtime.py`. Warm-up deferred to Tier-2 (documented in spec); first real `infer` does implicit warm-up, no AC blocked. |
| AZ-301 | engine gate | `engine_gate.py`. AC-8 NVML/Jetpack test is Tier-2-skip — production helper code exists. |
| AZ-302 | thermal publisher | `thermal_publisher.py` + `_JtopSource` + `_PynvmlSource`. AC-7 perf microbench Tier-2-deferred — runtime code exists. |
### C6 Tile cache (6 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-303 | storage interfaces | `c6_tile_cache/interface.py`. |
| AZ-304 | postgres schema | SQL migration shipped. |
| AZ-305 | postgres+filesystem store | `postgres_filesystem_store.py` (real `psycopg` + atomicwrites). |
| AZ-306 | faiss descriptor index | `faiss_descriptor_index.py` (real `faiss` import). |
| AZ-307 | freshness gate | `freshness_gate.py`. |
| AZ-308 | cache budget eviction | `cache_budget_enforcer.py`. |
### C11 Tile manager (5 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-316 | tile downloader | `c11_tile_manager/tile_downloader.py` (real `httpx`). |
| AZ-317 | flight state gate | superseded by C12 SRP refactor; C11 carries no gate. |
| AZ-318 | signing key | `signing_key.py` (per-flight key + FDR rotation log). |
| AZ-319 | tile uploader | `tile_uploader.py` (real ingest contract). |
| AZ-320 | idempotent retry | `IdempotentRetryTileUploader` decorator. |
### C10 Provisioning (5 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-321 | engine compiler | `c10_provisioning/provisioner.py` (real TRT engine compile via C7). |
| AZ-322 | descriptor batcher | batched C2 descriptor gen. |
| AZ-323 | manifest builder | `manifest_builder.py` (real SHA-256 manifest). |
| AZ-324 | manifest verifier | content-hash gate. |
| AZ-325 | cache provisioner | end-to-end F1 build path. |
### C12 Operator orchestrator (6 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-326 | cli app | `c12_operator_orchestrator/cli.py`. |
| AZ-327 | companion bringup | `paramiko_ssh_session.py`. |
| AZ-328 | build cache orchestrator | `remote_c10_invoker.py`. |
| AZ-329 | post-landing upload | `PostLandingUploadOrchestrator` (real FDR footer gate). |
| AZ-330 | operator reloc service | `OperatorReLocService` + `OperatorCommandTransport` Protocol. |
| AZ-489 | flights api client | `flights_api/httpx_client.py`. |
### C1 VIO (5 tasks) — 3 PASS, 2 FAIL
| Task | Title | Evidence |
|------|-------|----------|
| AZ-331 | strategy protocol | `c1_vio/interface.py`. PASS. |
| AZ-332 | OKVIS2 production-default | **FAIL** — native binding is a skeleton (see above). |
| AZ-333 | VINS-Mono research-only | **FAIL** — same skeleton pattern. |
| AZ-334 | KLT/RANSAC simple baseline | 706-line `klt_ransac.py`, pure-Python OpenCV; no native dep; functional. PASS. |
| AZ-335 | warm start recovery | `warm_start_store.py`. PASS. |
### C2 VPR (6 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-336 | strategy protocol | `c2_vpr/interface.py`. |
| AZ-337 | UltraVPR (production-default) | 441-line `ultra_vpr.py`; consumes C7 TRT engine. |
| AZ-338 | NetVLAD baseline | 500-line `net_vlad.py` + `_net_vlad_architecture.py` + PyTorch FP16 path. |
| AZ-339 | MegaLoc + MixVPR | substantial impls. |
| AZ-340 | SelaVPR + EigenPlaces + SALAD | substantial impls. |
| AZ-341 | faiss retrieve wiring | `_faiss_bridge.py`. |
Note: `src/gps_denied_onboard/components/c2_vpr/_native/__init__.py`
contains only the line `"""Native bindings for VPR runtime — placeholder."""`.
The C2 strategies route inference through C7 (TensorRT / ONNX-RT /
PyTorch), so this `_native/` directory is empty by design (no extant
task promises VPR-specific C++ code). Recommend deleting the directory
in a future hygiene pass; not a FAIL today.
### C2.5 Re-rank (2 tasks) — both PASS (with one noted concern, see § Notes)
| Task | Title | Evidence |
|------|-------|----------|
| AZ-342 | strategy protocol | `c2_5_rerank/interface.py`. |
| AZ-343 | inlier-count reranker | `inlier_based_reranker.py` (real LightGlue inlier counting). |
### C3 Cross-domain matcher (4 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-344 | matcher protocol | `c3_matcher/interface.py`. |
| AZ-345 | DISK + LightGlue (production-default) | 288-line `disk_lightglue.py`; consumes C7 + helpers. |
| AZ-346 | ALIKED + LightGlue (secondary) | 289-line `aliked_lightglue.py`. |
| AZ-347 | XFeat (alternate) | 544-line `xfeat.py`. |
Note: `c3_matcher/_native/__init__.py` is similarly an empty placeholder
— same situation as C2's `_native/`. Hygiene cleanup, not a FAIL.
### C3.5 AdHoP refinement (2 tasks) — both PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-348 | refiner protocol | `c3_5_adhop/interface.py`. |
| AZ-349 | AdHoP refiner | 509-line `adhop_refiner.py`; real C7-backed AdHoP engine load. Note: `runtime_root/refiner_factory.py` docstring still calls AdHoPRefiner "placeholder today" — that comment is stale; the production class is real. Hygiene fix recommended (one-line doc update). |
### C4 Pose estimation (3 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-355 | pose protocol | `c4_pose/interface.py`. |
| AZ-358 | OpenCV `solvePnPRansac` + GTSAM Marginals | `opencv_gtsam_estimator.py` (real cv2 + gtsam). |
| AZ-361 | Jacobian thermal hybrid | D-CROSS-LATENCY-1 auto-degrade path. |
### C5 State estimator (10 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-381 | protocol | `c5_state/interface.py`. |
| AZ-382 | iSAM2 smoother wiring | `gtsam_isam2_estimator.py` (real `gtsam.IncrementalFixedLagSmoother`). |
| AZ-383 | factor adds | factor-graph construction. |
| AZ-384 | marginals outputs | covariance recovery via `gtsam.Marginals`. |
| AZ-385 | source-label spoof gate | `SourceLabelStateMachine`. |
| AZ-386 | ESKF baseline | `eskf_baseline.py` (mandatory engine-rule baseline). |
| AZ-387 | smoothed history FDR | retroactive smoothing → FDR. |
| AZ-388 | AC-5.2 fallback | FC-IMU-only fallback path. |
| AZ-389 | orthorectifier → C6 mid-flight tiles | `_orthorectifier.py` + compose-root `_C6MidFlightIngestAdapter`. |
| AZ-490 | set_takeoff_origin | operator-origin warm-start hook. |
### C8 FC adapter (9 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-390 | adapter protocol | `c8_fc_adapter/interface.py`. |
| AZ-391 | inbound subscription | `pymavlink` + `msp2` decoders. |
| AZ-392 | covariance projector | 2×2 horizontal sub-block → `horiz_accuracy`. |
| AZ-393 | ardupilot outbound | `pymavlink_ardupilot_adapter.py`. |
| AZ-394 | inav outbound | `msp2_inav_adapter.py`. |
| AZ-395 | mavlink signing | per-flight key rotation + FDR record. |
| AZ-396 | source-set switch | `MAV_CMD_SET_EKF_SOURCE_SET` flow. |
| AZ-397 | qgc telemetry adapter | `mavlink_gcs_adapter.py`. |
| AZ-558 | mavlink transport routing | seam between encoder + serial transport. |
### Replay path (8 tasks) — all PASS
| Task | Title | Evidence |
|------|-------|----------|
| AZ-398 | frame source + clock | `replay_input/` + `frame_source/`. |
| AZ-399 | tlog adapter | `replay_input/tlog_adapter.py`. |
| AZ-400 | jsonl sink | `c8_fc_adapter/replay_sink.py`. |
| AZ-401 | replay compose | `runtime_root/_replay_branch.py`. |
| AZ-402 | replay cli | `cli/replay.py`. |
| AZ-403 | replay dockerfile + ci | shipped under `Dockerfile.replay` + `.github/workflows/`. |
| AZ-404 | replay e2e fixture | `tests/e2e/replay/`. |
| AZ-405 | replay auto-sync | `replay_input/auto_sync.py`. |
## Notes / non-blocking observations
1. **Production composition root has no per-binary bootstrap module
registering strategies.** `runtime_root/__init__.py` defines a strategy
registry (`register_strategy`, `_resolve_strategy`) and topo-sorts
constructed components, but `register_strategy` is never called
anywhere in `src/`. `compose_root(config)` would raise
`StrategyNotLinkedError` on every C1-C8 slug if invoked today. This
is the "per-binary bootstrap module" the AZ-263 / AZ-270 prose
anticipates — a separate concern from any one task and arguably out
of scope for this gate (the registry seam exists; the actual
registration lives in a not-yet-written bootstrap module). Recommend
surfacing as a separate cross-cutting task (`E-CC-CONF` or
`E-BOOT`).
2. **`helpers/feature_extractor.py::OpenCvOrbExtractor`** is documented
as a placeholder ("Production deployments MUST replace this
extractor with a deep-learning backbone before flight (tracked under
the future C2.5 backbone-extractor task)"). No DISK/ALIKED extractor
exists. C2.5 (AZ-343) uses an injected `FeatureExtractor`; the only
concrete impl is ORB. AZ-343's spec does NOT name DISK/ALIKED, so
this is a known-future-task gap rather than an AZ-343 FAIL — but the
prod composition root will need a non-ORB extractor before flight.
Recommend surfacing as a follow-up task (5 points or less).
3. **`_types/tile.py`** scaffolding DTOs (`Tile`, `TileRecord`) are no
longer referenced by any module under `src/`. Dead code per
`coderule.mdc`. Recommend deleting in a hygiene PBI; not a Gate FAIL.
4. **`runtime_root/refiner_factory.py`** docstring describes AdHoPRefiner
as "placeholder today" — stale comment; the production class is
real. One-line doc fix.
5. **`c2_vpr/_native/__init__.py` and `c3_matcher/_native/__init__.py`**
are empty placeholder modules. C2/C3 strategies route inference
through C7; no native code is owed. Recommend deleting both
directories.
6. **Process leftover `2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`**
remains open (gtsam is still on numpy 1.x). Not blocking for this gate.
## Remediation tasks proposed
Per `implement/SKILL.md` § 15 remediation task creation rules: each
remediation task is sized at ≤ 5 points; depends on its failed parent;
goes to `_docs/02_tasks/todo/`; tracker tickets to be created on user
approval (Jira availability gate per `.cursor/rules/tracker.mdc`).
### Proposed task 1 — `remediate_AZ-332_okvis2_threadedkfvio_wiring`
- **Parent FAIL**: AZ-332.
- **Goal**: wire `okvis::ThreadedKFVio` inside
`_native/okvis2_binding.cpp` (`_build_estimator()` and
`_drive_estimator()`); enable the commented-out includes; instantiate
the estimator from `yaml_config_`; attach the output callback that
fills `latest_output_` under `output_mtx_`; CI matrix that installs
Ceres + initialises OKVIS2's vendored submodules.
- **Complexity**: 5 points.
- **Dependencies**: AZ-332, AZ-276, AZ-277.
- **Out of scope**: AC-9 honest-covariance Tier-2 validation against
Derkachi-class fixtures (separate Tier-2 perf task).
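
The output-callback part of the goal above (fill `latest_output_` under `output_mtx_`) follows a common latest-value-under-lock pattern. The sketch below shows that pattern in Python for readability only; the actual wiring would be C++ inside the binding, and `VioOutputSketch`, `LatestOutputSlot`, and their methods are hypothetical names.

```python
import threading
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VioOutputSketch:
    """Illustrative output record; the field names are assumptions."""
    timestamp_ns: int
    position: Tuple[float, float, float]


class LatestOutputSlot:
    """Holds the most recent estimator output behind a mutex."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._latest: Optional[VioOutputSketch] = None

    def on_estimator_output(self, output: VioOutputSketch) -> None:
        # Called from the estimator's worker thread.
        with self._lock:
            self._latest = output

    def read_latest(self) -> Optional[VioOutputSketch]:
        # Called from the facade thread; None means no output yet.
        with self._lock:
            return self._latest
```

Keeping the record immutable means a reader can use the returned value after the lock is released without racing the next callback.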
### Proposed task 2 — `remediate_AZ-333_vins_mono_estimator_wiring`
- **Parent FAIL**: AZ-333.
- **Goal**: wire `vins_estimator::Estimator` + `feature_tracker` inside
`_native/vins_mono_binding.cpp`; enable the de-ROSified VINS-Mono pin
build; ensure the same Protocol-conforming output shape as OKVIS2;
research-only.
- **Complexity**: 5 points.
- **Dependencies**: AZ-333, AZ-276, AZ-277.
- **Out of scope**: IT-12 comparative-study harness (lives in E-BBT).
If either remediation task grows beyond 5 points during decomposition,
split into infrastructure + estimator-wiring + per-frame-cov-read
sub-tasks before scheduling.
## Gate decision
Per `implement/SKILL.md` § 15:
> If any product task is `FAIL`, STOP. Do not write the final product
> implementation report and do not proceed to any downstream autodev
> step. Completed original task files remain in `done/`; the missing
> work is represented by remediation tasks.
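
The quoted rule reduces to a simple predicate over task verdicts. A minimal sketch, assuming verdicts are held in a mapping keyed by task ID: `Verdict`, `gate_may_advance`, and the placeholder PASS IDs are hypothetical, and only the `FAIL` rule quoted above is modelled (how `BLOCKED` is handled is not described here).

```python
from collections import Counter
from enum import Enum
from typing import Mapping


class Verdict(Enum):
    PASS = "PASS"
    BLOCKED = "BLOCKED"
    FAIL = "FAIL"


def gate_may_advance(verdicts: Mapping[str, Verdict]) -> bool:
    """Return False if any audited task is FAIL."""
    return not any(v is Verdict.FAIL for v in verdicts.values())


# Cycle 1 tally from this report: 105 PASS, 0 BLOCKED, 2 FAIL.
# The PASS task IDs below are placeholders, not real ticket numbers.
verdicts = {f"pass-task-{i}": Verdict.PASS for i in range(105)}
verdicts["AZ-332"] = Verdict.FAIL
verdicts["AZ-333"] = Verdict.FAIL

tally = Counter(verdicts.values())
failed = sorted(task for task, v in verdicts.items() if v is Verdict.FAIL)
```
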
**State**: Step 7 stays `in_progress`. The Choose block in the next
agent message presents the operator A/B/C options. The two remediation
tasks above will be created on user direction.
End of report.