mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 08:41:12 +00:00
[AZ-300] [AZ-301] [AZ-302] [AZ-304] docs: sync module-layout for c6+c7
Cumulative review of batches 23-27 (cycle 1) surfaced three Medium
documentation-drift findings on module-layout.md. All three fixed
inline per user direction:
F1: c7_inference Internal list expanded with architecture_registry,
config, engine_gate, errors, manifest, thermal_publisher (added
across AZ-300/301/302).
F2: c6_tile_cache `connection.py` re-attributed from AZ-304 (which
deferred it) to AZ-305 with a "planned, not landed yet" tag.
F3: c7_inference Public API description rewritten by category
(Protocol + DTOs + component services + config + error family)
with a pointer to __init__.py's __all__ for the canonical list.
Cumulative review report: _docs/03_implementation/cumulative_review_
batches_23-27_cycle1_report.md (PASS_WITH_WARNINGS).
Autodev state moved to status: paused_user_requested per user
choice; /autodev will resume at greenfield Step 7 (next batch
selection) on next invocation.
Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
@@ -145,7 +145,7 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
|
||||
- `_native/` (`cpp/faiss_index/` wrapper)
|
||||
- `migrations.py` (`apply_migrations(config) -> MigrationResult` runner invoked by the composition root at startup; AZ-304 + later)
|
||||
- `_uuid_namespace.py` (pinned `TILE_NAMESPACE_UUID` + `derive_tile_id` / `derive_location_hash` helpers; cross-repo coordinated with `satellite-provider`; AZ-304)
|
||||
- `connection.py` (`psycopg_pool` ConnectionPool helper; AZ-304)
|
||||
- `connection.py` (planned — `psycopg_pool` ConnectionPool helper; AZ-305, not landed yet)
|
||||
- **Owns**: `src/gps_denied_onboard/components/c6_tile_cache/**`, `cpp/faiss_index/**`, `tests/unit/c6_tile_cache/**`, `db/migrations/**` (project-level Alembic env owned by c6 — `alembic.ini` at repo root points here; `0001_initial.py` shipped by AZ-263 bootstrap, `0002_c6_tile_identity_and_lru.py` and forward owned by AZ-304+ migrations)
|
||||
- **Imports from**: `_types`, `helpers.sha256_sidecar`, `helpers.wgs_converter`, `config`, `logging`, `fdr_client`
|
||||
- **Consumed by**: `c2_vpr`, `c2_5_rerank`, `c3_matcher`, `c10_provisioning`, `c11_tile_manager`, `runtime_root`
|
||||
@@ -154,13 +154,22 @@ Bootstrap reference: `_docs/02_tasks/todo/AZ-263_initial_structure.md`. Architec
|
||||
|
||||
- **Epic**: AZ-249 (E-C7 Inference Runtime)
|
||||
- **Directory**: `src/gps_denied_onboard/components/c7_inference/`
|
||||
- **Public API**:
|
||||
- `__init__.py` (re-exports `InferenceRuntime`, `EngineCacheEntry`)
|
||||
- **Public API** (`__init__.py` `__all__`; consult the module for the canonical list):
|
||||
- `interface.py` (`InferenceRuntime` Protocol)
|
||||
- DTOs re-exported from `_types.inference` + `_types.thermal`: `BuildConfig`, `EngineCacheEntry`, `EngineHandle`, `OptimizationProfile`, `PrecisionMode`, `ThermalState`
|
||||
- Component services: `EngineGate` + `HostTuple` (AZ-301), `ThermalStatePublisher` + `ThermalReading` + `ThermalSource` Protocol (AZ-302), `ManifestReader` + `ManifestReaderProtocol` + `DeploymentManifest` (AZ-301), `ArchitectureFactory` + `default_registry` + `register_architecture` (AZ-300)
|
||||
- Config block: `C7InferenceConfig` (registered on import)
|
||||
- Error family rooted at `RuntimeError` with documented subtypes (`InferenceError`, `EngineBuildError`, `EngineDeserializeError`, `EngineHashMismatchError`, `EngineSchemaMismatchError`, `EngineSidecarMissingError`, `CalibrationCacheError`, `OutOfMemoryError`, `TelemetryUnavailableError`)
|
||||
- **Internal**:
|
||||
- `tensorrt_runtime.py` (production-default; TensorRT 10.3)
|
||||
- `onnx_trt_runtime.py` (ONNX Runtime + TensorRT EP)
|
||||
- `pytorch_fp16_runtime.py` (research-only baseline)
|
||||
- `architecture_registry.py` (AZ-300; family of registered `ArchitectureFactory` callables consumed by `PytorchFp16Runtime`)
|
||||
- `config.py` (`C7InferenceConfig` dataclass; registered on import)
|
||||
- `engine_gate.py` (AZ-301; D-C10-3 + D-C10-7 takeoff validator)
|
||||
- `errors.py` (component error family)
|
||||
- `manifest.py` (AZ-301; `DeploymentManifest` + `ManifestReader` for engine sidecar manifests)
|
||||
- `onnx_trt_runtime.py` (ONNX Runtime + TensorRT EP, pending)
|
||||
- `pytorch_fp16_runtime.py` (AZ-300; research-only / simple-baseline strategy)
|
||||
- `tensorrt_runtime.py` (production-default; TensorRT 10.3, pending)
|
||||
- `thermal_publisher.py` (AZ-302; 1 Hz background poller, jtop/NVML fallback)
|
||||
- **Owns**: `src/gps_denied_onboard/components/c7_inference/**`, `tests/unit/c7_inference/**`
|
||||
- **Imports from**: `_types`, `helpers.engine_filename_schema`, `helpers.sha256_sidecar`, `config`, `logging`, `fdr_client`
|
||||
- **Consumed by**: `c2_vpr`, `c2_5_rerank`, `c3_matcher`, `c10_provisioning`, `runtime_root`
|
||||
|
||||
@@ -0,0 +1,226 @@
|
||||
# Cumulative Code Review — Batches 23–27 / Cycle 1
|
||||
|
||||
**Date**: 2026-05-12
|
||||
**Mode**: Cumulative (all 7 phases, emphasis on Phases 6 + 7)
|
||||
**Batches covered**: 23, 24, 25, 26, 27
|
||||
**Tasks covered**: AZ-332 (c1_vio OKVIS2), AZ-300 (c7 PyTorch FP16 baseline), AZ-301 (c7 EngineGate), AZ-302 (c7 ThermalStatePublisher), AZ-304 (c6 Postgres schema 0002)
|
||||
**Changed files in scope**: 54
|
||||
**Verdict**: **PASS_WITH_WARNINGS**
|
||||
|
||||
## Summary
|
||||
|
||||
No Critical or High findings. Three Medium findings, all documentation
|
||||
drift on `_docs/02_document/module-layout.md`: the c7_inference
|
||||
component grew six new internal files across batches 24–26 that are
|
||||
not listed in the layout doc, the c6_tile_cache entry still claims
|
||||
`connection.py` / `psycopg_pool` for AZ-304 even though AZ-304
|
||||
explicitly deferred them to AZ-305, and the c7_inference Public API
|
||||
description is stale (claims `__init__.py` re-exports only
|
||||
`InferenceRuntime` + `EngineCacheEntry` but the actual `__all__` lists
|
||||
26 symbols).
|
||||
|
||||
No cross-component Public API bypasses, no new module-import cycles,
|
||||
no duplicate symbols across components, no layer-direction
|
||||
violations. All cross-component imports of the new code go through
|
||||
the composition root (`runtime_root/storage_factory.py`) or stay
|
||||
intra-component as required by ADR-009.
|
||||
|
||||
## Findings
|
||||
|
||||
| # | Severity | Category | File:Line | Title |
|
||||
|---|----------|----------|-----------|-------|
|
||||
| 1 | Medium | Architecture | `_docs/02_document/module-layout.md:153` | c7_inference Internal list missing 6 new files (engine_gate, manifest, thermal_publisher, architecture_registry, config, errors) |
|
||||
| 2 | Medium | Architecture | `_docs/02_document/module-layout.md:136` | c6_tile_cache Internal list references `connection.py` (AZ-304) but AZ-304 deferred it to AZ-305 |
|
||||
| 3 | Medium | Maintainability | `_docs/02_document/module-layout.md:158` | c7_inference Public API description stale (claims only `InferenceRuntime`/`EngineCacheEntry` re-exported; actual `__all__` has 26 symbols) |
|
||||
|
||||
### Finding Details
|
||||
|
||||
**F1: c7_inference Internal list missing 6 new files** (Medium / Architecture)
|
||||
|
||||
- Location: `_docs/02_document/module-layout.md:153`
|
||||
- Description: Batches 24–26 added the following modules under
|
||||
`src/gps_denied_onboard/components/c7_inference/`:
|
||||
`engine_gate.py` (AZ-301), `manifest.py` (AZ-301),
|
||||
`thermal_publisher.py` (AZ-302), `architecture_registry.py`
|
||||
(AZ-300), `config.py` (older), `errors.py` (older), and
|
||||
`pytorch_fp16_runtime.py` (AZ-300, already listed). None of the
|
||||
newly-added internals are documented as Internal in the layout doc.
|
||||
This is documentation drift, not a behavioural issue: every
|
||||
cross-component user only reaches into the component via
|
||||
`__init__.py`'s `__all__` (verified by grep across `src/` and
|
||||
`tests/`).
|
||||
- Suggestion: append to the c7_inference Internal bullets:
|
||||
`architecture_registry.py`, `config.py`, `engine_gate.py`,
|
||||
`errors.py`, `manifest.py`, `thermal_publisher.py`. Attribute each
|
||||
to its AZ ticket for traceability.
|
||||
- Task: AZ-300 / AZ-301 / AZ-302 (incremental drift)
|
||||
|
||||
**F2: c6_tile_cache Internal list references deferred `connection.py`** (Medium / Architecture)
|
||||
|
||||
- Location: `_docs/02_document/module-layout.md:136`
|
||||
- Description: The c6_tile_cache Internal list (just updated in
|
||||
batch 27) still includes the line:
|
||||
`` `connection.py` (`psycopg_pool` ConnectionPool helper; AZ-304) ``
|
||||
AZ-304 explicitly deferred `connection.py` and `psycopg_pool` to
|
||||
AZ-305 (`c6_postgres_filesystem_store`) after the dependency-pin
|
||||
reality check; the file does NOT exist on disk and AZ-305 has not
|
||||
shipped yet. The line is forward-leaning fiction.
|
||||
- Suggestion: either remove the line, or re-attribute it to AZ-305
|
||||
with a "(planned)" suffix so readers know it has not landed yet.
|
||||
Preferred: re-attribute to AZ-305 so the layout doc stays a
|
||||
forward-looking plan.
|
||||
- Task: AZ-304 (introduced the entry under the wrong attribution)
|
||||
|
||||
**F3: c7_inference Public API description stale** (Medium / Maintainability)
|
||||
|
||||
- Location: `_docs/02_document/module-layout.md:158`
|
||||
- Description: The Public API bullets say:
|
||||
`` `__init__.py` (re-exports `InferenceRuntime`, `EngineCacheEntry`) ``
|
||||
The actual `c7_inference/__init__.py` `__all__` exports 26 symbols:
|
||||
`ArchitectureFactory`, `BuildConfig`, `C7InferenceConfig`,
|
||||
`CalibrationCacheError`, `DeploymentManifest`, `EngineBuildError`,
|
||||
`EngineCacheEntry`, `EngineDeserializeError`, `EngineGate`,
|
||||
`EngineHandle`, `EngineHashMismatchError`,
|
||||
`EngineSchemaMismatchError`, `EngineSidecarMissingError`,
|
||||
`HostTuple`, `InferenceError`, `InferenceRuntime`,
|
||||
`ManifestReader`, `ManifestReaderProtocol`, `OptimizationProfile`,
|
||||
`OutOfMemoryError`, `PrecisionMode`, `RuntimeError`,
|
||||
`TelemetryUnavailableError`, `ThermalReading`, `ThermalSource`,
|
||||
`ThermalState`, `ThermalStatePublisher`, `default_registry`,
|
||||
`register_architecture`. Tests confirm no cross-component code
|
||||
reaches into c7_inference internals — Public API surface is wider
|
||||
than documented but consistent with intent (re-exports from
|
||||
`_types.inference` and `_types.thermal`, the new EngineGate /
|
||||
ManifestReader / ThermalStatePublisher classes, the error family,
|
||||
the config block, and the architecture registry).
|
||||
- Suggestion: rewrite the Public API bullet for c7_inference to list
|
||||
the four documented surfaces (Protocol + DTOs + Errors +
|
||||
ThermalStatePublisher + EngineGate + ManifestReader) at category
|
||||
level, not enumerate every symbol. Optionally add a "see `__all__`
|
||||
for the full list" pointer.
|
||||
- Task: AZ-297 / AZ-300 / AZ-301 / AZ-302 (drift accumulated over
|
||||
Public-API expansions in these batches)
|
||||
|
||||
## Phase-by-Phase Notes
|
||||
|
||||
### Phase 1 — Context Loading
|
||||
|
||||
Read the five batch reports (`batch_23..27_cycle1_report.md`), the
|
||||
five task specs (now under `_docs/02_tasks/done/`), and
|
||||
`_docs/00_problem/restrictions.md` (not re-read this cycle —
|
||||
unchanged since the 01-22 cumulative).
|
||||
|
||||
### Phase 2 — Spec Compliance
|
||||
|
||||
Per-batch reviews were the source-of-truth for batch-local AC
|
||||
coverage (existing batch reports list every AC → test mapping).
|
||||
Cumulative spot-check on AZ-304: AC-1 through AC-12 all have named
|
||||
tests; Tier-2 docker tests are gated by `@pytest.mark.docker` and
|
||||
skip on Tier-1. No re-litigation here.
|
||||
|
||||
### Phase 3 — Code Quality
|
||||
|
||||
No new SOLID / SRP / naming / complexity findings beyond what
|
||||
per-batch reviews surfaced. Spot-check on the AZ-304 additions:
|
||||
|
||||
- `_uuid_namespace.py` — pure stdlib `uuid.uuid5`; clean SRP; all
|
||||
helpers private (`_normalize_*`).
|
||||
- `migrations.py` — `apply_migrations` is the only public function;
|
||||
retry logic is documented inline; `MigrationResult` is `frozen`.
|
||||
- `0002_c6_tile_identity_and_lru.py` — `upgrade()` and `downgrade()`
|
||||
are flat / linear; revision metadata correct.
|
||||
|
||||
### Phase 4 — Security Quick-Scan
|
||||
|
||||
No SQL string interpolation (the migration uses `op.*` parameterised
|
||||
DDL; the test helpers parameterise via psycopg `cursor.execute(sql,
|
||||
params)`). No hardcoded secrets. No subprocess. No
|
||||
deserialization. Logging payloads carry only revision strings + the
|
||||
pinned namespace UUID — no PII.
|
||||
|
||||
### Phase 5 — Performance
|
||||
|
||||
`_uuid_namespace.derive_*` are pure functions, called once per
|
||||
INSERT in production. The migration `0002` is bounded by index
|
||||
creation on an empty `tiles` table (greenfield) — no real-world
|
||||
data-volume risk. The migration runner's retry-without-sleep is
|
||||
deliberate (see batch 27 report § 4) — no measurable performance
|
||||
regression vs. sleep-then-retry.
|
||||
|
||||
### Phase 6 — Cross-Task Consistency
|
||||
|
||||
- DTO consistency: `TileMetadata.location_hash` field added in AZ-304
|
||||
is consumed by no other batch (AZ-305 will be the first consumer);
|
||||
no producer/consumer drift today.
|
||||
- Protocol consistency: `VioStrategy` (c1_vio), `InferenceRuntime`
|
||||
(c7_inference), `TileMetadataStore` (c6_tile_cache) — all three
|
||||
Protocols are independently scoped and have no shared method
|
||||
signatures or DTOs across components.
|
||||
- Shared code: `_types.thermal.ThermalState` (introduced earlier) is
|
||||
re-exported by c7_inference's `__init__.py` — single home,
|
||||
re-export consistent with module-layout § 5 cross-cutting concern
|
||||
rule.
|
||||
- No duplicate class/function names across components. `EngineGate`,
|
||||
`ThermalStatePublisher`, `ManifestReader`, `DeploymentManifest`,
|
||||
`MigrationResult`, `MigrationError`, `derive_tile_id`,
|
||||
`derive_location_hash` — all unique. Verified by grep across
|
||||
`src/`.
|
||||
|
||||
### Phase 7 — Architecture Compliance
|
||||
|
||||
1. **Layer direction**: spot-checked all cross-module imports in the
|
||||
54 changed files. No import resolves to a higher-layer importee.
|
||||
Composition root (`runtime_root/storage_factory.py`) is the only
|
||||
place that imports concrete strategy modules
|
||||
(`postgres_filesystem_store`, `faiss_descriptor_index`) — allowed
|
||||
per ADR-009.
|
||||
|
||||
2. **Public API respect**: no file in another component imports a
|
||||
non-`__init__`/`interface` file of c1_vio, c6_tile_cache, or
|
||||
c7_inference. Verified by:
|
||||
```
|
||||
grep -nR 'from gps_denied_onboard.components.<c>.<internal_module>'
|
||||
```
|
||||
on every internal module of the three components. All hits are
|
||||
either intra-component or under `runtime_root/` (composition
|
||||
root, explicitly allowed).
|
||||
|
||||
3. **No new module-import cycles**: spot-built import graph for
|
||||
`c1_vio`, `c6_tile_cache`, `c7_inference` (changed files +
|
||||
direct deps). No cycles introduced. The `c7_inference.__init__`
|
||||
→ `_types.inference` / `_types.thermal` edges are one-way as
|
||||
designed.
|
||||
|
||||
4. **Duplicate symbols**: none (see Phase 6).
|
||||
|
||||
5. **Cross-cutting concerns**: no logging / config / error-envelope
|
||||
re-implementations spotted. `migrations.py` uses
|
||||
`logging.structured.get_logger` (the shared concern). Both
|
||||
`_uuid_namespace` and `migrations.py` import from the shared
|
||||
`_types.*` / `config.schema` / `logging.structured` modules,
|
||||
not local copies.
|
||||
|
||||
## Baseline Delta
|
||||
|
||||
Not applicable — `_docs/02_document/architecture_compliance_baseline.
|
||||
md` does not exist on this branch (greenfield project, no architecture
|
||||
baseline scan run yet). Baseline-delta tables omitted.
|
||||
|
||||
## Auto-Fix Eligibility (per implement skill § 10 matrix)
|
||||
|
||||
All three findings are Medium / Architecture or Medium /
|
||||
Maintainability against a single documentation file
|
||||
(`_docs/02_document/module-layout.md`). Per the auto-fix matrix:
|
||||
|
||||
- Medium / Architecture → **escalate**
|
||||
- Medium / Maintainability → **escalate**
|
||||
|
||||
All three findings escalate to the user. None are auto-fix eligible.
|
||||
|
||||
## Verdict
|
||||
|
||||
**PASS_WITH_WARNINGS** — no Critical / High findings; three Medium
|
||||
documentation-drift findings on `module-layout.md`. The implement
|
||||
skill's loop may continue to batch 28; the user is the final
|
||||
authority on whether to repair `module-layout.md` now or roll it
|
||||
into a doc-sync task.
|
||||
@@ -4,11 +4,11 @@
|
||||
flow: greenfield
|
||||
step: 7
|
||||
name: Implement
|
||||
status: in_progress
|
||||
status: paused_user_requested
|
||||
sub_step:
|
||||
phase: 3
|
||||
name: compute-batch
|
||||
detail: "batch 27/cycle1: AZ-304 complete (additive 0002 migration + UUIDv5 namespace + runner + v1.2.0 contract bump). Awaiting next batch selection."
|
||||
phase: 14.5
|
||||
name: cumulative-review-aftermath
|
||||
detail: "batches 23-27 cumulative review: PASS_WITH_WARNINGS (3 Medium doc-drift findings on module-layout.md). User approved fix-now for findings, then chose stop_here. Module-layout.md updated; user will re-invoke /autodev when ready to continue."
|
||||
retry_count: 0
|
||||
cycle: 1
|
||||
tracker: jira
|
||||
|
||||
Reference in New Issue
Block a user