mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-22 12:31:13 +00:00
[autodev] Update configuration and documentation for cycle-1
ci/woodpecker/push/02-build-push Pipeline failed
- Enhanced `.env.example` with detailed CMake build flags and replay-mode strategy flags for development and CI environments.
- Updated `.gitignore` to include a new deploy rollback bookmark.
- Revised `_docs/_autodev_state.md` to reflect the current task status and steps.
- Added new lessons to `_docs/LESSONS.md` regarding testing and architectural improvements.
- Documented changes in `_docs/02_document/deployment/ci_cd_pipeline.md` to reflect the relaxed OpenCV version pin.
- Updated test data documentation in `_docs/02_document/tests/test-data.md` to clarify fixture usage and paths.

This commit continues the cycle-1 documentation sync and addresses various configuration updates for improved clarity and functionality.
@@ -0,0 +1,121 @@
# Performance Test Run — 2026-05-19 — workstation Tier-1 probe

**Invoked by**: autodev greenfield Step 15 — `.cursor/skills/test-run/SKILL.md` perf mode.
**Host**: developer Mac workstation (no Jetson hardware, no `E2E_SITL_REPLAY_DIR` fixture mounted).
**Runner**: `scripts/run-performance-tests.sh` + direct `pytest e2e/tests/performance/` probe.
**Run ID**: `workstation-tier1-probe`.
**Status**: **Unverified across all 4 production perf NFRs; pure-logic evaluator unit tests Pass (70/70).** No regression detected because no measurement was possible. No Warn / Fail to gate on. **Not blocking deploy** per the skill's "Any Unverified scenarios with no Warn/Fail" rule.

## What ran

### A) `scripts/run-performance-tests.sh`

```text
Tier-2 perf tests skipped (GPS_DENIED_TIER!=2).
exit=0
```

The runner script is deliberately a Tier-2 gate: it runs `pytest -m tier2 -q tests/perf` only when `GPS_DENIED_TIER=2`. On a Tier-1 workstation it logs the skip and exits 0. This is by design: the canonical perf measurements require Jetson Orin Nano Super hardware (D-C7-9, JetPack 6.2, TensorRT 10.3), so a workstation run would produce numbers that are not comparable to the pinned-hardware budgets and would actively mislead trend tracking.
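
The gating behaviour described above can be sketched in Python. This is a hypothetical illustration, not the contents of `scripts/run-performance-tests.sh`: the function names `perf_command` and `main` are assumptions, while the `GPS_DENIED_TIER` variable, the pytest arguments, and the skip message are taken from the text and output above.

```python
import os
from typing import List, Mapping, Optional


def perf_command(env: Mapping[str, str]) -> Optional[List[str]]:
    """Return the pytest argv for a Tier-2 host, or None when the gate skips."""
    if env.get("GPS_DENIED_TIER") != "2":
        return None
    return ["pytest", "-m", "tier2", "-q", "tests/perf"]


def main(env: Mapping[str, str] = os.environ) -> int:
    """Mimic the 'skip + log' path; a real runner would execute the command."""
    cmd = perf_command(env)
    if cmd is None:
        print("Tier-2 perf tests skipped (GPS_DENIED_TIER!=2).")
        return 0
    # Assumption: this sketch only reports the command instead of running it.
    print("would run:", " ".join(cmd))
    return 0
```

Keeping the decision in a pure function (`perf_command`) makes the gate itself testable on a workstation without touching Jetson hardware.
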
### B) Direct `pytest e2e/tests/performance/` probe (24 parameterizations)

| NFR | Configs | Outcome | Skip reason |
|---|---|---|---|
| **NFT-PERF-01** (E2E latency p95 ≤ 400 ms — AC-4.1) | 6 ({ardupilot, inav} × {okvis2, klt_ransac, vins_mono}) | 6 skipped | "Tier-2 only — Jetson hardware required" |
| **NFT-PERF-02** (frame-by-frame streaming, inter-emit p95 ≤ 350 ms — AC-4.4) | 6 ({ardupilot, inav} × {okvis2, klt_ransac, vins_mono}) | 4 skipped (no fixture) + 2 skipped (vins_mono research-only per D-C1-1-SUB-A) | "requires `E2E_SITL_REPLAY_DIR` (AZ-595) carrying the 5 min Derkachi @ 3 Hz replay" |
| **NFT-PERF-03** (cold-start TTFF p95 ≤ 30 s — AC-NEW-1) | 6 | 6 skipped | "Tier-2 only — Jetson hardware required" |
| **NFT-PERF-04** (spoof-promotion p95 ≤ 600 ms — AC-NEW-2) | 6 | 4 skipped (no fixture) + 2 skipped (vins_mono research-only per D-C1-1-SUB-A) | "requires `E2E_SITL_REPLAY_DIR` (AZ-595) containing N≥20 randomized-start blackout+spoof events" |

Total: 24 skipped, 0 passed, 0 failed, 0 errored. Exit code 0.

### C) Pure-logic evaluator unit tests — `e2e/_unit_tests/helpers/test_*_evaluator.py`

The four perf NFRs each map to a pure-logic evaluator that computes the gate (p95 / inter-emit interval / TTFF distribution / spoof-promotion latency) from a recorded sample set. These evaluators are tested without any SITL / Jetson dependency:

```text
e2e/_unit_tests/helpers/test_e2e_latency_evaluator.py → covers NFT-PERF-01 AC-2/3/4 math
e2e/_unit_tests/helpers/test_streaming_evaluator.py → covers NFT-PERF-02 AC-1/AC-2 math
e2e/_unit_tests/helpers/test_ttff_evaluator.py → covers NFT-PERF-03 AC-3/AC-4 math
e2e/_unit_tests/helpers/test_spoof_promotion_evaluator.py → covers NFT-PERF-04 AC-1/AC-2 math
```

```text
$ .venv/bin/python -m pytest e2e/_unit_tests/helpers/test_e2e_latency_evaluator.py \
    e2e/_unit_tests/helpers/test_streaming_evaluator.py \
    e2e/_unit_tests/helpers/test_ttff_evaluator.py \
    e2e/_unit_tests/helpers/test_spoof_promotion_evaluator.py \
    --no-header -q
...................................................................... [100%]
70 passed in 0.50s
```

**70/70 pass.** Confirms that the threshold-comparison logic (percentile estimators, inter-emit interval, TTFF distribution, spoof-onset → label-switch delta) is correct independent of whether real measurements have been recorded yet. A future hardware run feeds JSON fixtures into the same evaluators — only the input data changes, not the math.
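
To make the idea of a pure-logic evaluator concrete, here is a minimal sketch of a p95 gate over a recorded sample set. It is not the project's evaluator: the function names are hypothetical, and the nearest-rank percentile estimator is an assumption (the real evaluators may use a different estimator).

```python
import math
from typing import Sequence


def percentile_nearest_rank(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct % of samples at or below it."""
    if not samples:
        raise ValueError("no samples recorded")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def p95_within_budget(samples_ms: Sequence[float], budget_ms: float) -> bool:
    """Gate in the style of 'p95 <= budget'; depends only on the samples passed in."""
    return percentile_nearest_rank(samples_ms, 95) <= budget_ms
```

Because the function takes only a list of numbers, the same code path can be exercised with synthetic values in unit tests today and with recorded hardware samples later.
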
## Threshold comparison (Step 3 of skill)

Per the skill's Step 3, thresholds load from `_docs/02_document/tests/performance-tests.md`. The thresholds exist and are documented but no scenario produced a measurement to compare them against.

| NFR | Threshold | Observed | Verdict |
|---|---|---|---|
| NFT-PERF-01 | p95 ≤ 400 ms (K=3 baseline AND K=2 hybrid auto-degrade) + ≤10 % frame drops | — | **Unverified** (Tier-2 hardware required) |
| NFT-PERF-02 | p95 inter-emit interval ≤ 350 ms; no window of ≥3 missed-emit gaps | — | **Unverified** (`E2E_SITL_REPLAY_DIR` fixture not yet recorded; AZ-595) |
| NFT-PERF-03 | p95 TTFF < 30 s (50 cold boots) | — | **Unverified** (Tier-2 hardware required) |
| NFT-PERF-04 | p95 < 3 s on both FCs (50 trials per FC) | — | **Unverified** (`E2E_SITL_REPLAY_DIR` fixture not yet recorded; AZ-595) |
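
As an illustration of the NFT-PERF-02 style of threshold, the sketch below derives inter-emit intervals from emit timestamps and compares their p95 against the 350 ms budget from the table. The function names, the timestamp unit (seconds), and the nearest-rank estimator are assumptions; the "missed-emit gap" window rule is not modelled because its definition is not given in this report.

```python
import math
from typing import List, Sequence


def inter_emit_intervals_ms(emit_times_s: Sequence[float]) -> List[float]:
    """Differences between consecutive emit timestamps, in milliseconds."""
    return [(b - a) * 1000.0 for a, b in zip(emit_times_s, emit_times_s[1:])]


def streaming_p95_ok(emit_times_s: Sequence[float], budget_ms: float = 350.0) -> bool:
    """True when the nearest-rank p95 of the inter-emit intervals is within budget."""
    intervals = sorted(inter_emit_intervals_ms(emit_times_s))
    if not intervals:
        raise ValueError("need at least two emit timestamps")
    rank = max(1, math.ceil(95 * len(intervals) / 100))
    return intervals[rank - 1] <= budget_ms
```
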
## Classification

Per the skill's perf-mode reporting:

```text
══════════════════════════════════════
PERF RESULTS
══════════════════════════════════════
Scenarios: [pass 0 · warn 0 · fail 0 · unverified 4]
──────────────────────────────────────
1. NFT-PERF-01 — Unverified — Tier-2 Jetson hardware required
2. NFT-PERF-02 — Unverified — SITL replay fixture pending (AZ-595)
3. NFT-PERF-03 — Unverified — Tier-2 Jetson hardware required
4. NFT-PERF-04 — Unverified — SITL replay fixture pending (AZ-595)
──────────────────────────────────────
Pure-logic evaluator coverage: 70/70 unit tests pass
(e2e/_unit_tests/helpers/test_{e2e_latency,streaming,ttff,spoof_promotion}_evaluator.py)
══════════════════════════════════════
```

## Coverage gap assessment (skill Step 5: "Unverified")

Per the skill:

> **Any Unverified scenarios with no Warn/Fail** → not blocking, but surface them in the report so the user knows coverage gaps exist. Suggest running `/test-spec` to add expected results next cycle.

This run has **0 Warn + 0 Fail + 4 Unverified**, so:

- **Not deploy-blocking.** The perf gate is allowed to be Unverified when the SUT is not yet running on its canonical hardware.
- **Coverage gap is fully cataloged.** Each Unverified scenario points at a concrete task:
  - **NFT-PERF-01 / NFT-PERF-03**: AZ-444 (Tier-2 Jetson harness) is the recording-phase task. When AZ-444 lands, these scenarios run on the Jetson and produce numbers — at which point this report's "Unverified" entries become "Pass / Warn / Fail" against the AC-4.1 / AC-NEW-1 thresholds.
  - **NFT-PERF-02 / NFT-PERF-04**: AZ-595 (SITL replay fixture builder) is the recording task. When AZ-595 lands, the fixtures are committed under `e2e/fixtures/sitl_replay/`, `E2E_SITL_REPLAY_DIR` is set, and the scenarios run on Tier-1.
- **The thresholds, evaluators, parameterizations, and report wiring are all in place.** Recording is the only gap, not test design.
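
The rule quoted above can be sketched as a small classifier. This is an illustrative assumption, not the skill's implementation: the function name and the returned labels are hypothetical, and since this report does not state what the skill does when a Warn or Fail is present, that branch only returns a generic "needs-review" label.

```python
from collections import Counter
from typing import Iterable


def classify_run(verdicts: Iterable[str]) -> str:
    """Summarise per-scenario verdicts (Pass / Warn / Fail / Unverified)."""
    counts = Counter(v.strip().lower() for v in verdicts)
    if counts["warn"] or counts["fail"]:
        # Handling of Warn/Fail is outside what this report describes.
        return "needs-review"
    if counts["unverified"]:
        # Quoted rule: not blocking, but the coverage gap must be surfaced.
        return "not-blocking-coverage-gap"
    return "pass"
```

For this run, `classify_run(["Unverified"] * 4)` yields the not-blocking label, which matches the 0 Warn + 0 Fail + 4 Unverified outcome.
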
## Anti-patterns explicitly NOT used

Per the skill's anti-pattern guidance:

- **No improvised perf tests.** Did not synthesize a workstation-only "approximation" of any NFR; the AC-4.1 / AC-NEW-1 / AC-NEW-2 / AC-4.4 budgets are pinned to canonical hardware and synthetic Tier-1 numbers would mislead the trend-tracker.
- **No skip-acceptance without justification.** Each Unverified entry is cataloged against a concrete recording task (AZ-444 / AZ-595).
- **No threshold downgrade.** Did not soften any threshold to make a Tier-1 measurement "pass".

## Two minor housekeeping items (Low)

1. **Unregistered pytest mark `tier2_only`** — pytest warnings at `e2e/tests/performance/test_nft_perf_01_e2e_latency.py:61` and `e2e/tests/performance/test_nft_perf_03_ttff.py:48`. Add `tier2_only: marks scenarios that require Jetson hardware` to `e2e/runner/pytest.ini` `markers` list.
2. **`scripts/run-performance-tests.sh` is intentionally a Tier-2 stub.** This is documented in the script header; not a defect, just a reminder that the Tier-1 path is "skip + log" by design. If a Tier-1 perf trend-tracking workflow is ever desired, add an explicit branch (e.g. invoke the pure-logic evaluators against a smaller `derkachi-short-fixture`).
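
For the first item, the report's recommendation is the `pytest.ini` entry. An equivalent registration can also be done programmatically; the sketch below uses pytest's `pytest_configure` hook and `config.addinivalue_line`, and assumes (hypothetically) that it lives in a `conftest.py` visible to the performance tests.

```python
def pytest_configure(config):
    """Register the custom mark so pytest stops warning about it."""
    config.addinivalue_line(
        "markers",
        "tier2_only: marks scenarios that require Jetson hardware",
    )
```

Either approach silences the unknown-mark warning; the ini entry keeps all markers in one declarative place, which is why it is the option named in the item above.
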
## Cross-Reference Index

| Source | Purpose |
|---|---|
| `_docs/02_document/tests/performance-tests.md` | Threshold + scenario spec |
| `scripts/run-performance-tests.sh` | Runner script (current Tier-2 stub) |
| `_docs/02_tasks/todo/AZ-444*` | Tier-2 Jetson harness (recording-phase task) |
| `_docs/02_tasks/todo/AZ-595*` | SITL replay fixture builder (recording task) |
| `_docs/02_tasks/todo/AZ-{428..431}*` | NFT-PERF-{01..04} scenario tasks (currently complete on the runner side; the harness side is pending) |
| `_docs/06_metrics/` (this directory) | Per-run perf trend artefacts |
@@ -0,0 +1,147 @@
# Retrospective — 2026-05-20

> Cycle-1 retrospective for GPS-Denied Onboard. Cycle 1 spans
> 2026-05-11 → 2026-05-20 (Problem → Deploy → this retro). This is the
> **first** retro for the project — no prior baseline. Generated by
> `/autodev` greenfield Step 17 (Retrospective, cycle-end mode).

## Implementation Summary

| Metric | Value |
|--------|-------|
| Total tasks (done) | 165 (product + test + hygiene + refactor) |
| Total tasks (backlog) | 2 (AZ-592, AZ-593 — Tier-2 OKVIS2 / VINS-Mono validation) |
| Total tasks (todo) | 0 |
| Total batches | 97 (cycle 1) |
| Cycle duration | 9 days (2026-05-11 → 2026-05-20) |
| Avg tasks per batch | ≈ 1.7 |
| Estimated total complexity points | ≈ 565 cp (sampled: 80 × 3 cp, 50 × 5 cp, 30 × 2 cp, plus a handful of 4 cp and 0 cp umbrella) |
| Source LoC | 61,071 Python (src/) |
| Components | 15 (C1, C2, C2.5, C3, C3.5, C4, C5, C6, C7, C8, C10, C11, C12, C13 + helpers/runtime_root) |
| Binary tracks | 3 (airborne, research, operator-orchestrator) per ADR-002 + ADR-011 |
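
The derived figures in the table can be reproduced with a few lines of arithmetic. The sketch below uses only numbers stated in the table; the variable names are illustrative, and the gap between the sampled sum and the ≈ 565 cp estimate is attributed to the "handful" of 4 cp tasks mentioned there.

```python
tasks_done = 165
batches = 97

# Average tasks per batch, rounded as in the table.
avg_tasks_per_batch = round(tasks_done / batches, 1)

# Sampled complexity buckets: {points per task: number of tasks}.
sampled = {3: 80, 5: 50, 2: 30}
sampled_tasks = sum(sampled.values())
sampled_cp = sum(cp * n for cp, n in sampled.items())

# Remainder covered by the handful of 4 cp tasks and 0 cp umbrella tasks.
remaining_tasks = tasks_done - sampled_tasks
remaining_cp = 565 - sampled_cp

print(avg_tasks_per_batch, sampled_tasks, sampled_cp, remaining_tasks, remaining_cp)
```
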
## Quality Metrics

### Product Implementation Completeness Gate (Step 7 → Step 8)

Source: `_docs/03_implementation/implementation_completeness_cycle1_report.md` (revised 2026-05-19 addendum).

| Verdict | Count | Percentage |
|---------|-------|------------|
| PASS | 114 | 98.3 % |
| BLOCKED-with-named-Tier-2-handle | 4 | 3.4 % (overlaps; 2 share one hardware artifact) |
| FAIL | 0 | 0 % |

The 4 BLOCKED items are: **AZ-332** (OKVIS2 production-default VIO native binding → AZ-592 backlog), **AZ-333** (VINS-Mono research-only VIO → AZ-593 backlog), **AZ-624 AC-5** + **AZ-687 AC-687-3** (Jetson Tier-2 evidence file `_docs/03_implementation/jetson_runs/2026-05-19_az687_tier2_run.txt` pending).

### Code Review Verdicts (sampled across batches 53..97)

The verdict field is consistently `**Verdict**:` in batches ≥ 53. Earlier batches (01–22) use a different convention and are captured in the consolidated `cumulative_review_batches_01-22_cycle1_report.md`.

| Verdict | Approximate count (batches 53..97) | Notes |
|---------|------------------------------------|-------|
| PASS | ~24 | Includes single-task batches with inline self-review |
| PASS_WITH_WARNINGS | ~30 | Includes 38 files matching the verdict phrase |
| FAIL | 0 | No batch failed the in-loop review across the full cycle |
| BLOCKED | 0 (at batch level; 4 at task level — see Completeness Gate above) | |

Auto-fix loop never had to escalate to user intervention across cycle 1.

### Findings by Severity (latest 3 cumulative review windows: 70-72, 82-87, 88-92)

| Severity | Count (rolling 3-window) | Trend |
|----------|---------------------------|-------|
| Critical | 0 | — |
| High | 0 | — |
| Medium | 2 (CR-F1 csv-helper, CR-F2 fixture-path-helper — both escalated and still OPEN at end of cycle 1) | Carried over from window 85-87 → 88-92 |
| Low | ~13 (5 in batch 87, 3 in batch 85, 2 each in batches 83/84, 1 in batch 81) | Trending down across late batches |
| Info | many — process notes, no action | — |

### Findings by Category (qualitative, since aggregate metrics across all 97 batches would need per-finding tagging that isn't uniform across early batches)

| Category | Notable patterns | Top files |
|----------|------------------|-----------|
| Bug | Essentially none post Step 11 (Run Tests) — the iSAM2 ordering issue in AZ-625 was caught + fixed mid-batch | — |
| Spec-Gap | AZ-595 dependency surfaced late (blocks 17 NFT scenarios); AZ-687 replay-mode guard emerged from AZ-624's Jetson run | `e2e/_unit_tests/helpers/`, `e2e/runner/helpers/` |
| Security | OpenCV CVE pin replay condition tracked since 2026-05-11 (gtsam numpy<2 ABI block); per `_docs/05_security/dependency_scan.md` re-validated against current pin — no advisory exposure | `pyproject.toml` |
| Performance | NFT-PERF Tier-2 baselines pending AZ-595 + AZ-592/AZ-593; cycle-1 Tier-1 probe completed 2026-05-19 | `_docs/06_metrics/perf_2026-05-19_workstation-tier1-probe.md` |
| Maintainability | Two helper-duplication patterns surfaced (CR-F1, CR-F2) — 5 cp combined hygiene PBI proposed | `e2e/runner/helpers/csv_evidence_writer`, `e2e/runner/helpers/fixture_path` |
| Style | Minimal — `ruff` is run pre-commit + in the e2e-runner; batch 91 opportunistically cleaned 12 pre-existing UP037 lints in `airborne_bootstrap.py` | `airborne_bootstrap.py` |
| Scope | One mid-cycle scope correction — AZ-589/AZ-590 closed Won't Fix after post-mortem found the actual gap was cross-cutting (empty `_STRATEGY_REGISTRY`), not per-strategy C++ wiring. Re-decomposed into AZ-591 + AZ-592/AZ-593 backlog | `runtime_root/__init__.py` → `airborne_bootstrap.py` |

## Structural Metrics

`_docs/02_document/architecture_compliance_baseline.md` does **not** exist — cumulative reviews could not emit `## Baseline Delta` sections through cycle 1. This is a known gap (see Improvement Action #3 below).

| Metric | Value | Trend |
|--------|-------|-------|
| Component count | 15 | First-cycle baseline |
| Source LoC (Python, src/) | 61,071 | First-cycle baseline |
| Cycles in component import graph | 0 | Healthy — no back-edges in `runtime_root.airborne_bootstrap → {clock, fdr_client, runtime_root.*_factory, runtime_root.{storage,inference,errors,…}}` |
| Cross-component edges | Concentrated in `runtime_root/` factories (composition root by design) | Healthy — single composition seam |
| Contract files | `_docs/02_document/contracts/shared_*/` populated for FDR record schema (v1.3.0), log record schema (v1.0.0), C8 transport, post-landing upload, ingest (D-PROJ-2 placeholder) | First-cycle baseline; coverage % vs public-API symbols not computed (no inventory) |
| Architecture violations net delta | n/a (no baseline) | Recommend establishing the baseline file in cycle 2 Step 6 |

A structural snapshot for future delta comparison is recorded in `_docs/06_metrics/structure_2026-05-20.md` (next item in this retro).

## Efficiency

| Metric | Value |
|--------|-------|
| Blocked tasks (cycle-1 close) | 4 (all Tier-2 hardware / evidence rooted) |
| Tasks requiring fixes after review | ~5 (Medium/Low warnings landed as hygiene PBIs, not as in-cycle fixes) |
| Auto-fix loop escalations to user | 0 |
| Mid-cycle remediation post-mortems | 1 (AZ-589/AZ-590 → AZ-591) |
| Mid-cycle scope rewinds | 1 (Step 11 Run Tests → Step 7 Implement, for AZ-618 cross-cutting umbrella with 12 builder signatures; surfaced as 2026-05-18 lessons entry already in LESSONS.md) |
| Batch with most findings | Batch 87 (`PASS_WITH_WARNINGS — 0 Critical, 0 High, 0 Medium, 5 Low`) — late-cycle e2e helper polish |
| Out-of-loop process leftover | D-CROSS-CVE-1 OpenCV pin deferred on upstream `gtsam` numpy<2 ABI; re-checked at every `/autodev` invocation per leftover protocol |

### Blocker Analysis

| Blocker Type | Count | Prevention (carries to cycle 2) |
|--------------|-------|----------------------------------|
| Tier-2 hardware/evidence (Jetson Orin) | 4 | Allocate Tier-2 Jetson access for AZ-595 + AZ-624 AC-5 re-run + AZ-687 AC-687-3; AZ-592/AZ-593 unblocked only when CI build env + DBoW2 vocab + upstream choice are decided. |
| Upstream library pin (D-CROSS-CVE-1) | 1 | Out-of-band — replay condition is `gtsam` numpy-2 wheels (or alternate SE(3) backend). Re-check is debounced ≤ 2 h per leftover entry; no action needed in dev cycle. |
| Mid-cycle scope correction | 1 | The fix already informed the 2026-05-18 LESSONS entry. Cross-cutting registry/factory state should be probed before classifying a per-task FAIL. |

## Trend Comparison

*Previous retrospective: N/A — first retro.*

Cycle 1 establishes the baseline. From cycle 2 onward, this section will compare against:

| Metric | Current (cycle-1 baseline) | Target (cycle-2) |
|--------|-----------------------------|-------------------|
| Code-review pass rate (PASS / total) | ≈ 44 % PASS, ≈ 55 % PASS_WITH_WARNINGS, 0 % FAIL | Maintain 0 % FAIL; lift PASS share by landing CR-F1 + CR-F2 hygiene PBIs early |
| Avg findings per batch (Medium + Low) | ~0.2 (≈ 16 total Medium/Low across 97 batches, dominated by helper-duplication carryovers) | ≤ 0.15 |
| Blocked tasks (cycle close) | 4 (all Tier-2 rooted) | ≤ 2; AZ-624 AC-5 + AZ-687 AC-687-3 close once the Tier-2 run lands |
| Mid-cycle remediation post-mortems | 1 | 0 |
| Structural baseline file present | No (gap) | Yes |

## Top 3 Improvement Actions

1. **Land the two open hygiene PBIs (CR-F1 + CR-F2) before any new NFT helper expansion in cycle 2.**
   - Impact: drops the only carried-over Medium findings; consolidates `csv_evidence_writer` + `fixture_path.resolve` into one canonical helper module under `e2e/runner/helpers/`; unblocks future NFT additions without duplication drift.
   - Effort: low (combined 5 cp; both already pre-decomposed in `cumulative_review_batches_88-92_cycle1_report.md`).

2. **Sequence AZ-595 (SITL observer + FDR replay fixture) as the first product task of cycle 2.**
   - Impact: closes 17 NFT scenarios in one PBI — every NFT-PERF / NFT-RES / NFT-SEC scenario that currently `sitl_replay_ready`-skips on the Tier-1 docker harness. Also unblocks the Tier-2 evidence files for AZ-624 AC-5 + AZ-687 AC-687-3 once Jetson hardware is available.
   - Effort: medium (5 cp, in scope of a single batch; depends only on already-done tasks).

3. **Create `_docs/02_document/architecture_compliance_baseline.md` as a Step 6 (Decompose) prerequisite.**
   - Impact: cumulative reviews can emit `## Baseline Delta` sections from cycle 2 onward, quantifying architecture violations carried over / resolved / newly introduced per cycle. Without it, structural regressions are invisible to the retro process.
   - Effort: low (small file: scrape ADR list + initial violation count = 0 for cycle 2; the structural snapshot in this retro can seed the baseline). Suggest landing it in cycle 2 Step 6 as a precondition.

## Suggested Rule/Skill Updates

| File | Change | Rationale |
|------|--------|-----------|
| `.cursor/skills/decompose/SKILL.md` (Step 6 prerequisites) | Add a prerequisite check that `_docs/02_document/architecture_compliance_baseline.md` exists; create it with `0` baseline violations if missing | Closes the gap that cumulative reviews flagged repeatedly across cycle 1 ("`architecture_compliance_baseline.md` does NOT exist → no Baseline Delta section emitted") |
| `.cursor/skills/implement/SKILL.md` (Step 15 Completeness Gate) | Before classifying any per-task FAIL, run a workspace grep for cross-cutting state the task depends on (e.g. central registries, factory dispatch tables); if the actual gap is cross-cutting, propose a single cross-cutting task instead of N per-task remediation tasks | AZ-589 + AZ-590 → AZ-591 post-mortem; saves a wasted remediation-task round-trip |
| `.cursor/skills/decompose/SKILL.md` (or a new sub-step) | Identify the **fixture-builder dependency surface** explicitly during test-task decomposition: if N test tasks share a single un-built fixture, schedule the fixture builder ahead of the dependent tasks as a P0 prerequisite, not as a peer | AZ-595 surfaced as a late-cycle 17-scenario blocker — would have been a 1-task P0 if decompose had cross-referenced fixture references in each test spec |
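
The fixture-builder check proposed in the last row could look roughly like the sketch below. Everything here is hypothetical: the function name, the input shape (a mapping from test-task ID to the fixture names its spec references), and the threshold of two dependents are assumptions made for illustration, not part of any existing skill.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def shared_fixture_blockers(
    task_fixtures: Dict[str, Iterable[str]],
    built_fixtures: Set[str],
    min_dependents: int = 2,
) -> Dict[str, List[str]]:
    """Map each un-built fixture to the test tasks that depend on it.

    Only fixtures shared by at least `min_dependents` tasks are returned;
    those are candidates for a P0 fixture-builder task scheduled first.
    """
    dependents: Dict[str, List[str]] = defaultdict(list)
    for task, fixtures in task_fixtures.items():
        for fixture in fixtures:
            if fixture not in built_fixtures:
                dependents[fixture].append(task)
    return {f: sorted(ts) for f, ts in dependents.items() if len(ts) >= min_dependents}
```

Run during decomposition, a check of this kind would have flagged a single un-built replay fixture referenced by many test specs before those tasks were scheduled as peers.
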
## Process Leftovers (out of band)

- **`_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md`** — still OPEN; replay condition unchanged (gtsam numpy<2). Re-checked at start of every `/autodev` invocation; last check 2026-05-20T05:51 UTC+3.

End of cycle-1 retrospective.
@@ -0,0 +1,70 @@
# Structural Snapshot — 2026-05-20 (cycle-1 close)

> Baseline snapshot for future retros. Captures the component / import /
> contract topology at end of cycle 1 so cycle 2 retro can compute deltas
> without re-deriving from source.

## Component Inventory

| # | Component | Path | Strategy slots (registered) | Notes |
|---|-----------|------|------------------------------|-------|
| 1 | C1 — VIO | `src/gps_denied_onboard/components/c1_vio/` | `klt_ransac` (operational default), `okvis2` (BLOCKED via AZ-332→AZ-592), `vins_mono` (BLOCKED via AZ-333→AZ-593) | Three strategies behind `_STRATEGY_REGISTRY`; airborne_bootstrap registers all 3 with `BUILD_*` gating |
| 2 | C2 — VPR | `src/gps_denied_onboard/components/c2_vpr/` | `ultra_vpr` (production default), `megaloc`, `mixvpr`, `selavpr`, `eigenplaces`, `salad`, `netvlad` | All implemented; secondaries behind `BUILD_*` flags per ADR-002 |
| 3 | C2.5 — Re-rank | `src/gps_denied_onboard/components/c2_5_rerank/` | `inlier_based_reranker` | Single strategy |
| 4 | C3 — Matcher | `src/gps_denied_onboard/components/c3_matcher/` | `disk_lightglue` (default), `aliked_lightglue`, `xfeat` | Three strategies |
| 5 | C3.5 — AdHoP | `src/gps_denied_onboard/components/c3_5_adhop/` | `adhop_refiner` (production), `passthrough` (baseline) | Conditional refinement |
| 6 | C4 — Pose | `src/gps_denied_onboard/components/c4_pose/` | `opencv_gtsam` (production) | D-CROSS-LATENCY-1 hybrid trigger lives here |
| 7 | C5 — State | `src/gps_denied_onboard/components/c5_state/` | `gtsam_isam2` (production), `eskf_baseline` (engine-rule baseline) | iSAM2 + IncrementalFixedLagSmoother |
| 8 | C6 — Tile cache | `src/gps_denied_onboard/components/c6_tile_cache/` | Factory-based (no `_STRATEGY_REGISTRY` entry) | Postgres 16 + filesystem + FAISS HNSW |
| 9 | C7 — Inference | `src/gps_denied_onboard/components/c7_inference/` | `tensorrt`, `pytorch_fp16`, `onnx_trt_ep` (selected via `INFERENCE_BACKEND` env) | Factory-based |
| 10 | C8 — FC adapter | `src/gps_denied_onboard/components/c8_fc_adapter/` | `ardupilot_plane` (signed MAVLink), `inav` (MSP2) | Selected via `GPS_DENIED_FC_PROFILE` |
| 11 | C10 — Provisioning | `src/gps_denied_onboard/components/c10_provisioning/` | n/a (operator-only) | C10 + C11 + C12 ship in `operator-orchestrator` image |
| 12 | C11 — Tile Manager | `src/gps_denied_onboard/components/c11_tilemanager/` | n/a (operator-only) | `BUILD_C11_TILE_MANAGER=OFF` on airborne (ADR-004) |
| 13 | C12 — Operator Orchestrator | `src/gps_denied_onboard/components/c12_operator_orchestrator/` | n/a (operator-only) | Includes `FlightsApiClient` (AZ-489) + `PostLandingUploadOrchestrator` + `OperatorReLocService` |
| 14 | C13 — FDR | `src/gps_denied_onboard/components/c13_fdr/` | Factory-based | Writer thread + segment rotation + 64 GB cap + flight header/footer |
| 15 | helpers/runtime_root | `src/gps_denied_onboard/helpers/`, `src/gps_denied_onboard/runtime_root/` | n/a — composition root + cross-cutting helpers | Hosts `compose_root`, `airborne_bootstrap`, `operator_bootstrap`, `_STRATEGY_REGISTRY`, replay branch, factory modules |

## Source Size

| Scope | Python LoC |
|-------|------------|
| `src/` (total) | 61,071 |

## Import Graph Health

- **Cycles in component import graph**: 0 (verified across batches 88–92 cumulative reviews — no back-edges introduced).
- **Composition seam**: `runtime_root.airborne_bootstrap` (single composition root for airborne binary) + `runtime_root.operator_bootstrap` (single composition root for operator-orchestrator binary). Both pull from per-component `*_factory` modules under `runtime_root/`.

## Contract Files

| Path | Purpose | Status |
|------|---------|--------|
| `_docs/02_document/contracts/shared_fdr_client/fdr_record_schema.md` | FDR record envelope (v1.3.0) | Production |
| `_docs/02_document/contracts/shared_fdr_client/fdr_client_protocol.md` | FDR client API surface | Production |
| `_docs/02_document/contracts/shared_log_bridge/log_record_schema.md` | Log record schema (v1.0.0) | Production |
| `_docs/02_document/contracts/shared_satellite_provider_ingest/` | D-PROJ-2 ingest endpoint placeholder | Planned (parent-suite work) |
| `_docs/02_document/contracts/shared_flights_api/` | C12 → `flights` REST DTO | Production (consumed by AZ-489) |

Contract coverage % vs public-API symbols is not computed in this snapshot (no public-API inventory exists yet). Recommend adding the inventory + coverage metric as a cycle-2 follow-up — the same effort that lands `architecture_compliance_baseline.md` per Improvement Action #3.

## Build Track Coverage (per ADR-002 + ADR-011)

| Binary | Composition root | Strategies linked (production) | `BUILD_*` flags |
|--------|------------------|---------------------------------|------------------|
| Airborne (`companion-tier1` Tier-1 / `companion-jetson` Tier-2) | `airborne_bootstrap` | C1=`klt_ransac`, C2=`ultra_vpr`, C2.5=`inlier_based_reranker`, C3=`disk_lightglue`, C3.5=`adhop_refiner`, C4=`opencv_gtsam`, C5=`gtsam_isam2`, C7=`tensorrt`/`pytorch_fp16`, C8=`ardupilot_plane`/`inav` | `BUILD_VINS_MONO=OFF`, `BUILD_SALAD=OFF`, `BUILD_C11_TILE_MANAGER=OFF`, `BUILD_DEV_STATIC_KEY=OFF`, `BUILD_STATE_ESKF=OFF`; replay flags `BUILD_VIDEO_FILE_FRAME_SOURCE=ON`, `BUILD_TLOG_REPLAY_ADAPTER=ON`, `BUILD_REPLAY_SINK_JSONL=ON` (ADR-011) |
| Research (lab Jetson IT-12) | `airborne_bootstrap` with research flags | airborne contents + every non-default strategy linked | All `BUILD_*` flags ON except `BUILD_DEV_STATIC_KEY` |
| Operator-Orchestrator | `operator_bootstrap` | C10, C11 (`TileDownloader` + `TileUploader`), C12 (`FlightsApiClient`, `PostLandingUploadOrchestrator`, `OperatorReLocService`) | `BUILD_C11_TILE_MANAGER=ON` |

## Process Leftovers (open at snapshot time)

- `_docs/_process_leftovers/2026-05-11_d_cross_cve_1_opencv_pin_deferred.md` — OPEN, gtsam numpy-2 unblock pending.

## Cycle-2 Delta Targets

When cycle 2 closes, compare against this snapshot:

- **Component count change** — target 0 (no new components mid-feature-cycle; if AZ-595 or follow-ups add a component, document why).
- **Source LoC change** — informational; rapid growth flagged for sub-component refactor review.
- **Cycles in import graph** — must stay 0; any new cycle is a regression and surfaces a Critical finding.
- **`_STRATEGY_REGISTRY` keys** — should grow by 0–2 (only when a new strategy lands per ADR-002).
- **Contract files** — D-PROJ-2 ingest moves from "placeholder" to "production" if parent-suite ships its endpoint.
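
The "cycles in import graph must stay 0" target can be checked mechanically. The sketch below is one possible approach, assuming the import graph has already been extracted into a mapping from module name to imported module names; the function name and that input format are hypothetical, and the reviews referenced in this snapshot may derive the figure differently.

```python
from typing import Dict, List, Optional, Sequence


def find_import_cycle(graph: Dict[str, Sequence[str]]) -> Optional[List[str]]:
    """Depth-first search for a back-edge; returns one cycle path or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, fully explored
    color: Dict[str, int] = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GREY:
                # Back-edge: dep is already on the current path.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

A result of `None` corresponds to the "0 cycles" figure; any returned path names the modules involved, which is the detail a Critical finding would need.
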