diff --git a/_docs/02_document/deployment/ci_cd_pipeline.md b/_docs/02_document/deployment/ci_cd_pipeline.md index cd0e870..1654ba3 100644 --- a/_docs/02_document/deployment/ci_cd_pipeline.md +++ b/_docs/02_document/deployment/ci_cd_pipeline.md @@ -28,13 +28,29 @@ Other branches do NOT build (PR builds, feature-branch builds, tag builds — no | `tsc --noEmit` | Type-check the whole project | Already part of `bun run build` (`tsc -b && vite build`) | | `bun test` (or vitest / jest) | Run test suite | **Required** — there is no test runner today | | `eslint` / `biome` | Lint | Not configured today | -| Vulnerability scan | CVE scan on the image | `trivy` or `grype` candidates | -| SBOM emission | Software bill of materials | `syft` candidate | -| Image signing | Supply-chain trust | `cosign` candidate | +| `bun audit --severity high` | Block build on new HIGH/CRITICAL CVEs in deps | Tracked as Phase B follow-up F-INF-1 (cycle 2 security audit). Today the audit is run manually; without a CI gate the dev-only Vite/PostCSS HIGH advisories that AZ-502 closed could re-enter the lockfile undetected. | +| Vulnerability scan (image) | CVE scan on the image | `trivy` or `grype` candidates — Phase B follow-up F-INF-3 | +| SBOM emission | Software bill of materials | `syft` candidate — Phase B follow-up F-INF-4 | +| Image signing | Supply-chain trust | `cosign` candidate — Phase B follow-up F-INF-4 | | Multi-arch build | Add AMD64 alongside ARM64 | `docker buildx` candidates | These are tracked as Step 4–7 deliverables under autodev; the current pipeline is correct but minimal. +## 2a. Dependency overrides (AZ-502, cycle 2) + +Both `package.json` and `mission-planner/package.json` carry an `overrides` block: + +```json +"overrides": { + "vite": ">=6.4.2", + "postcss": ">=8.5.10" +} +``` + +**Why**: `bun audit` flagged 3 advisories (1 HIGH, 2 MODERATE) in `vite <= 6.4.1` and `postcss < 8.5.10` introduced via nested transitive copies through `vitest` / `vite-node`. 
A direct `bun update vite` did not displace those nested copies. Forcing a floor via `overrides` plus a clean reinstall (`rm -rf node_modules bun.lock && bun install`) cleared the advisories. + +**Maintenance rule**: do NOT remove these overrides until both `vite` and `postcss` are direct (non-transitive) at safe versions everywhere — verify with `bun pm ls vite postcss` before deleting. The `bun audit` CI gate (F-INF-1) will catch regressions if the overrides drift. + ## 3. Secrets & registry - `${REGISTRY_HOST}` — provided by Woodpecker secrets at runtime. diff --git a/_docs/02_document/deployment/environment_strategy.md b/_docs/02_document/deployment/environment_strategy.md index 28556ef..857cebd 100644 --- a/_docs/02_document/deployment/environment_strategy.md +++ b/_docs/02_document/deployment/environment_strategy.md @@ -25,11 +25,12 @@ The SPA bundle is **fully static**. No env vars are read at runtime by the bundl | Satellite tile provider URL (mission-planner) | `mission-planner/.env.example` declares its own independent `VITE_SATELLITE_TILE_URL` | mission-planner only; not deployed | | OpenWeatherMap API key + base URL (main SPA) | `.env.example` declares `VITE_OWM_API_KEY` + `VITE_OWM_BASE_URL`; resolved by `getOwmBaseUrl()` and the `flightPlanUtils.ts` builder. Closed AZ-448 / AZ-449 (no longer hardcoded). | | OpenWeatherMap API key + base URL (mission-planner) | `mission-planner/.env.example` declares `VITE_OWM_API_KEY` + `VITE_OWM_BASE_URL`; `WeatherService.getWeatherData(lat, lon)` returns `null` and issues NO outbound `fetch` when the key is unset (fail-soft). Closed cycle 2 / AZ-499. The previously-committed literal value MUST be revoked at the OWM dashboard (manual deliverable — AC-42 / AZ-499 AC-7); `STC-SEC1C` defends against re-introduction. 
| +| Google Geocode API key (mission-planner) | `mission-planner/.env.example` declares `VITE_GOOGLE_GEOCODE_KEY`; `GeocodeService.geocodeAddress(address)` returns `null` and issues NO outbound `fetch` when the key is unset (fail-soft, console.warn). Closed cycle 2 / AZ-501 (AC-43). The previously-committed literal value MUST be revoked at the Google Cloud Console (manual deliverable — AC-43 / AZ-501 AC-6); `STC-SEC1D` defends against re-introduction. | | `AZAION_REVISION` | Stamped into image at build time | For diagnostics | ## 3. `.env` strategy -Step 4 testability + cycle 2 added a workspace `.env.example` (resolved by Vite at build time via `import.meta.env.VITE_*`). Today it declares: `VITE_OWM_API_KEY`, `VITE_OWM_BASE_URL` (AZ-448 / AZ-449), and `VITE_SATELLITE_TILE_URL` (AZ-498). `mission-planner/.env.example` mirrors the OWM pair (AZ-499) and keeps its own independent `VITE_SATELLITE_TILE_URL`. +Step 4 testability + cycle 2 added a workspace `.env.example` (resolved by Vite at build time via `import.meta.env.VITE_*`). Today it declares: `VITE_OWM_API_KEY`, `VITE_OWM_BASE_URL` (AZ-448 / AZ-449), and `VITE_SATELLITE_TILE_URL` (AZ-498). `mission-planner/.env.example` mirrors the OWM pair (AZ-499), declares its own independent `VITE_SATELLITE_TILE_URL`, and (AZ-501) adds `VITE_GOOGLE_GEOCODE_KEY` for the address-search lookup. **Trade-off**: Vite resolves `import.meta.env.VITE_*` at build time, so `dist/` is environment-specific once a non-empty `VITE_OWM_API_KEY` is baked in — the OpenWeatherMap key (and any future build-time config) cannot be changed without a rebuild. This trades promotability for the air-gap-friendly pattern that lets a deploy ship with `VITE_OWM_API_KEY=""` (no OWM call, fail-soft `null` return) when the deployment must not touch the internet. @@ -50,4 +51,4 @@ In practice: branch separation is the gating mechanism. Once dev → stage → m - **`bun.lock`**: committed (per `package.json`'s `packageManager` field). 
`package-lock.json` is gitignored. - **`.idea/`, `.claude/`, `.superpowers/`**: gitignored — IDE / agent metadata. - **Playwright entries in `.gitignore`**: present but aspirational — Playwright is not installed (Step 5–7 territory). -- **mission-planner**: has its own `.env.example` declaring `VITE_SATELLITE_TILE_URL` and (cycle 2 / AZ-499) `VITE_OWM_API_KEY` + `VITE_OWM_BASE_URL`. Runs as a sibling Vite app; not bundled into the deployed image (per AC-31 / NFT-RES-LIM-04). +- **mission-planner**: has its own `.env.example` declaring `VITE_SATELLITE_TILE_URL`, (cycle 2 / AZ-499) `VITE_OWM_API_KEY` + `VITE_OWM_BASE_URL`, and (cycle 2 / AZ-501) `VITE_GOOGLE_GEOCODE_KEY`. Runs as a sibling Vite app; not bundled into the deployed image (per AC-31 / NFT-RES-LIM-04). Despite not being deployed, the keys must still be revoked at their respective dashboards because the literals were committed and exist in git history. diff --git a/_docs/03_implementation/deploy_planning_sync_cycle2.md b/_docs/03_implementation/deploy_planning_sync_cycle2.md new file mode 100644 index 0000000..5fb0091 --- /dev/null +++ b/_docs/03_implementation/deploy_planning_sync_cycle2.md @@ -0,0 +1,41 @@ +# Cycle 2 Step 16 — Deploy Planning Sync (planning-only) + +**Date**: 2026-05-12 +**Cycle**: 2 (autodev Step 16) +**Outcome**: Planning sync completed; **prod cutover deferred** (see leftovers). +**Decision basis**: user skipped the structured choice; agent defaulted to option B +(planning-only) because option A required unverifiable cross-workspace state and +option C would have lost the planning information. + +## What was synced + +| Document | Cycle 2 delta captured | +|----------|------------------------| +| `_docs/02_document/deployment/environment_strategy.md` | Section 2: new row for `VITE_GOOGLE_GEOCODE_KEY` (AZ-501, mission-planner) mirroring the OWM-mission-planner row. Section 3: `mission-planner/.env.example` now lists four env vars (OWM pair + tile URL + new Google key).
Section 5: mission-planner local-dev bullet updated with the new key + reminder that committed-then-removed literals must still be revoked at the upstream dashboards. | +| `_docs/02_document/deployment/ci_cd_pipeline.md` | Section 2 (Missing steps): `bun audit --severity high` row added with rationale (linked to F-INF-1 from the cycle 2 security audit) and explicit notes against re-introducing the AZ-502 advisories. New §2a "Dependency overrides (AZ-502, cycle 2)": documents the `vite >=6.4.2` and `postcss >=8.5.10` `overrides` block in both `package.json`s, why it exists, and the maintenance rule for removing it safely. | +| `_docs/02_document/deployment/containerization.md` | No changes — Vite 6.4.2 upgrade does not affect the Dockerfile or the runtime image. | +| `_docs/02_document/deployment/observability.md` | No changes — cycle 2 added no client-telemetry surface. | + +## What was NOT done (deferred) + +Three pieces of work could not be completed this cycle. Each is recorded in +`_docs/_process_leftovers/2026-05-12_az-498-deploy-and-key-revocations.md` with a full +replay procedure: + +| ID | Item | Reason | Owner | +|----|------|--------|-------| +| L-AZ-498-DEPLOY | UI tile-swap prod cutover | Cross-workspace gate: satellite-provider cookie-auth migration on `GET /tiles/{z}/{x}/{y}` must merge + deploy first. Deploying the UI side alone produces a broken map. | Cross-workspace + user | +| L-AZ-499-OWM-REVOKE | OWM key revocation at OWM dashboard | Manual third-party-console action; cannot be automated from CI. AZ-499 AC-7 / AC-42 pending evidence attachment. | User | +| L-AZ-501-GOOGLE-REVOKE | Google Geocode key revocation at Google Cloud Console | Same reason as above. AZ-501 AC-6 / AC-43 pending evidence attachment. | User | + +## Verification + +- Read-after-write check: each modified deployment doc was re-read in this session; + the new content is present and the surrounding sections are intact.
+- No source-code changes — this is a documentation-only step. +- No pipeline / Docker / nginx changes — those are deferred to the Phase B follow-ups + F-INF-1..F-INF-5 already tracked in `_docs/05_security/infrastructure_review.md`. + +## Auto-chain + +→ Step 17 (Retrospective) for cycle 2. diff --git a/_docs/06_metrics/perf_2026-05-12_cycle2.md b/_docs/06_metrics/perf_2026-05-12_cycle2.md new file mode 100644 index 0000000..330f5d9 --- /dev/null +++ b/_docs/06_metrics/perf_2026-05-12_cycle2.md @@ -0,0 +1,75 @@ +# Performance Test Report — Cycle 2 + +**Date**: 2026-05-12 +**Cycle**: 2 (Phase B, autodev Step 15) +**Runner**: `scripts/run-performance-tests.sh --static-only` +**Toolchain**: bun 1.3.11, vite 6.4.2 (post-AZ-502 override), node 24.10 +**Trigger**: pre-deploy gate after Cycle 2 Step 14 (security audit + AZ-501/AZ-502 inline fixes) + +## Summary + +``` +Scenarios: pass 1 · warn 0 · fail 0 · unverified 9 (deferred) · quarantined 3 +Verdict: PASS — bundle size budget honored after Vite 6.4.2 upgrade +``` + +The only enforced metric this cycle (NFT-PERF-01, gzipped initial JS bundle ≤ 2 MB) +passes with a wide margin. All other NFT-PERF-* scenarios are runtime-observable in +Playwright; the perf-mode Playwright project (`e2e/playwright.perf.config.ts`) is not +yet wired (deferred to per-AC test tasks AZ-457..AZ-482), so they are recorded as +**Unverified** rather than failed. Three scenarios remain quarantined pending +upstream code fixes (NFT-PERF-03, NFT-PERF-08, NFT-PERF-09). 
+ +## Per-Scenario Results + +| Scenario | Verdict | Measured | Threshold | Source row | +|----------|---------|----------|-----------|------------| +| NFT-PERF-01 (initial JS bundle, gzipped) | **Pass** | 290 465 B (~283.7 KB) | ≤ 2 097 152 B (2 MB) | results_report row 40 / AC-11 | +| NFT-PERF-02 (auth refresh round-trips) | Unverified | — | exactly 1 refresh per cycle | results_report row 12 | +| NFT-PERF-03 (SSE bearer-rotation reconnect) | Quarantine | — | ≤ 5 000 ms | Step 8 hardening (SSE refresh rotation) | +| NFT-PERF-04 (live-GPS SSE open after select) | Unverified | — | ≤ 5 000 ms | results_report row 34 | +| NFT-PERF-05 (live-GPS SSE close after deselect) | Unverified | — | ≤ 1 000 ms | results_report row 35 | +| NFT-PERF-06 (annotation-status SSE unmount close) | Unverified | — | ≤ 1 000 ms | results_report row 25 | +| NFT-PERF-07 (bulk-validate UI reflect) | Unverified | — | ≤ 2 000 ms | results_report row 37 | +| NFT-PERF-08 (panel-width persistence debounce) | Quarantine | — | exactly 1 PUT ≤ 1 000 ms | Step 4 fix (panel-width persistence) | +| NFT-PERF-09 (settings save error surfacing) | Quarantine | — | ≤ 2 000 ms | Step 4 fix (settings save error surfacing) | +| NFT-PERF-10 (FCP on /flights, edge profile) | Unverified | — | ≤ 3 000 ms | results_report row 98 | + +## Bundle Size Detail (NFT-PERF-01) + +Vite 6.4.2 fresh build (`bun run build` after `rm -rf dist`): + +| Chunk | Raw | Gzipped | +|-------|-----|---------| +| `dist/index.html` | 0.43 KB | 0.30 KB | +| `dist/assets/index-*.css` | 53.76 KB | 13.50 KB | +| `dist/assets/index-*.js` (initial entry) | 923.12 KB | **290.45 KB** | + +Headroom against the 2 MB gate: 1 806 687 B (~1.72 MB) unused (~86.15% of budget). + +**No bundle regression introduced by AZ-502 Vite/PostCSS upgrade** — pre- and post-upgrade +bundles measured identically at 290 465 B (cached `dist/` and freshly rebuilt `dist/` produced +the same byte total).
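The NFT-PERF-01 budget comparison can be sketched as below. This is a minimal illustration, not the logic of `scripts/run-performance-tests.sh`: the helper names `gzippedSize` and `checkBudget` are hypothetical, and only the 2 097 152 B budget and the 290 465 B measurement are taken from this report.

```typescript
import { gzipSync } from "node:zlib";

// Budget from the NFT-PERF-01 row: 2 MB expressed as 2 097 152 bytes.
const BUDGET_BYTES = 2 * 1024 * 1024;

interface BudgetReport {
  gzippedBytes: number;
  headroomBytes: number;
  usedFraction: number;
  pass: boolean;
}

// Hypothetical helper: gzip a buffer in memory and return the compressed size in bytes.
function gzippedSize(contents: Uint8Array): number {
  return gzipSync(contents).length;
}

// Hypothetical helper: compare a gzipped size against a byte budget.
function checkBudget(gzippedBytes: number, budgetBytes: number = BUDGET_BYTES): BudgetReport {
  return {
    gzippedBytes,
    headroomBytes: budgetBytes - gzippedBytes,
    usedFraction: gzippedBytes / budgetBytes,
    pass: gzippedBytes <= budgetBytes,
  };
}

// Feed in the measured value from the results table above.
const report = checkBudget(290_465);
console.log(report.pass, report.headroomBytes, (report.usedFraction * 100).toFixed(2) + "%");
// prints: true 1806687 13.85%
```

In a real gate, `gzippedSize` would presumably be applied to the bytes of the built `dist/assets/index-*.js` entry chunk, and a `pass` of `false` would fail the step.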
+ +### Pre-existing build warnings (not introduced this cycle) + +- `Some chunks are larger than 500 kB after minification` — single 923.12 KB unsplit `index-*.js` chunk. Mitigation candidates listed in build output (dynamic `import()`, `manualChunks`). Track separately if/when CI enforces a stricter chunk-size budget. +- One CSS lint note about `flex` value (compiler suggestion). Pre-existing; unrelated to AZ-502. + +## Coverage Gaps + +The 6 Unverified scenarios (NFT-PERF-02, -04, -05, -06, -07, -10) measure runtime UI timings +that require the Playwright perf project. Per the runner script: + +> Awaiting NFT-PERF-* task implementations (AZ-457..AZ-482); until then the e2e perf +> scenarios are SKIPPED. + +Recommended next step (cycle 3+): enable the perf Playwright project alongside the +existing e2e harness so these thresholds can be enforced pre-deploy. + +## Outcome + +**PASS — auto-chain to autodev Step 16 (Deploy)**. + +No regression detected. All enforced thresholds met. Unverified scenarios are deferred +gaps tracked in the performance-tests spec, not blocking failures. 
diff --git a/_docs/06_metrics/retro_2026-05-12_cycle2.md b/_docs/06_metrics/retro_2026-05-12_cycle2.md new file mode 100644 index 0000000..45d8b02 --- /dev/null +++ b/_docs/06_metrics/retro_2026-05-12_cycle2.md @@ -0,0 +1,177 @@ +# Retrospective — 2026-05-12 (Phase B Cycle 2) + +**Mode**: cycle-end (autodev existing-code Step 17) +**Scope**: Phase B, cycle 2 (`state.cycle = 2`) +**Epic**: AZ-497 (`Self-Hosted Satellite Tiles — SPA Integration`) + ad-hoc security tickets AZ-501 / AZ-502 spawned by Step 14 +**Cycle duration**: 2 batches over 1 working day (2026-05-12) +**Previous retro**: `_docs/06_metrics/retro_2026-05-12.md` (cycle 1, same calendar day) + +## Implementation Summary + +| Metric | Value | Δ vs cycle 1 | +|--------|-------|--------------| +| Total tasks | 4 (AZ-498, AZ-499, AZ-501, AZ-502) | +2 (+100 %) | +| Total batches | 2 (batch 11 = AZ-498 + AZ-499; batch 12 = AZ-501 + AZ-502 inline-fix sub-step under Step 14) | 0 | +| Total complexity points | 11 (AZ-498=5, AZ-499=2, AZ-501≈2, AZ-502≈2) | +1 (+10 %) | +| Avg tasks per batch | 2 | +1 | +| Avg complexity per batch | 5.5 | +0.5 | +| Source files mutated | 12 production + 1 e2e harness + 4 i18n/MSW + 2 scripts + 4 test files + 9 docs | n/a (different shape vs cycle 1's refactor focus) | + +Sources: `_docs/03_implementation/batch_11_report.md`, `_docs/03_implementation/batch_12_report.md`, `_docs/03_implementation/test_run_report_phase_b_cycle2.md`, `_docs/03_implementation/deploy_planning_sync_cycle2.md`. + +## Quality Metrics + +### Code Review Results + +| Verdict | Count | Percentage | Δ vs cycle 1 | +|---------|-------|-----------|--------------| +| PASS | 0 | 0 % | −2 | +| PASS_WITH_WARNINGS | 1 | 50 % | +1 | +| FAIL | 0 | 0 % | 0 | +| (no formal review — security inline-fix sub-step) | 1 | 50 % | n/a | + +Note: batch 12 (AZ-501 + AZ-502) was executed as a Step-14 inline-fix sub-step, not as a Step-10 implement batch, so it did not pass through the implement skill's per-batch self-review path. 
Static + fast tests covered all 5 ACs implemented in code; the manual-deliverable ACs (AC-6 / AC-7) cannot be verified by tests at all. + +### Findings by Severity (code review only — security-audit findings tracked separately below) + +| Severity | Count | Δ vs cycle 1 | +|----------|-------|--------------| +| Critical | 0 | 0 | +| High | 0 | 0 | +| Medium | 0 | 0 | +| Low | 1 (`F1` — trim-trailing-slash idiom duplication; pre-existing pattern across 4 call sites in 2 vite roots; consolidation deferred to a future shared-helper extraction task) | +1 | + +### Findings by Category (code review) + +| Category | Count | Top Files | +|----------|-------|-----------| +| Bug | 0 | — | +| Spec-Gap | 0 | — | +| Security | 0 (in code review) | — | +| Performance | 0 | — | +| Maintainability | 1 (Low, pre-existing) | `src/features/flights/types.ts`, `mission-planner/src/services/{Weather,Geocode}Service.ts`, `src/features/flights/flightPlanUtils.ts` | +| Style | 0 | — | +| Scope | 0 | — | + +### Security-Audit Findings (Step 14, separate from code review) + +12 findings total. 
Inline-fixed this cycle: + +| ID | Severity | Status | +|----|----------|--------| +| F-SAST-1 (Google Geocode key in mission-planner port-source) | HIGH | RESOLVED (AZ-501) | +| F-DEP-1 (Vite ≤ 6.4.1 + PostCSS < 8.5.10 dev-only WebSocket file-read CVEs) | HIGH | RESOLVED (AZ-502) | +| F-SAST-2 (`unpkg.com` CDN ref) | MEDIUM | DEFERRED (Phase B follow-up) | +| F-SAST-3 (`STC-SEC2` coverage gap) | MEDIUM | DEFERRED | +| F-SAST-4 (third-party tile fallbacks) | LOW | DEFERRED | +| F-INF-1 (no CI `bun audit` gate) | MEDIUM | DEFERRED (tracked in `_docs/05_security/infrastructure_review.md`) | +| F-INF-2 (missing nginx headers + log redaction) | MEDIUM | DEFERRED | +| F-INF-3 (no Trivy image scan) | MEDIUM | DEFERRED | +| F-INF-4 (no SBOM + cosign signing) | MEDIUM | DEFERRED | +| F-INF-5 (nginx as root, no HEALTHCHECK) | MEDIUM | DEFERRED | +| F-OWASP-1 (security misconfiguration: nginx headers) | MEDIUM | covered by F-INF-2 | +| F-OWASP-2 (vulnerable & outdated components) | MEDIUM | RESOLVED via F-DEP-1 closure (AZ-502) | + +**Security verdict trajectory**: cycle 2 audit overall verdict was FAIL → after AZ-501 + AZ-502 inline fixes, code-level surface returns to PASS_WITH_WARNINGS (Phase B infrastructure follow-ups remain). All 5 deferred F-INF-* items are tracked as concrete next-cycle backlog candidates, not silent gaps. + +## Structural Metrics + +Source: cycle 1 baseline `_docs/06_metrics/structure_2026-05-12.md` (no new structural snapshot needed — cycle 2 introduced no architecture changes). 
+ +| Metric | Cycle 1 close | Cycle 2 close | Δ | +|--------|--------------|--------------|---| +| Component count | 12 | 12 | 0 | +| Public-API barrels | 11 / 11 (100 %) | 11 / 11 (100 %) | 0 | +| Commit-time static gates | 31 / 31 PASS | **33 / 33 PASS** | +2 (`STC-SEC1C`, `STC-SEC1D`) | +| Architecture cycles | 0 | 0 | 0 | +| Architecture findings open (baseline F1–F9) | 7 of 9 | 7 of 9 | 0 | +| Newly introduced architecture violations | 0 | 0 | 0 | +| Net architecture delta this cycle | −2 (improvement) | **0** | — | +| Wire-contract assertions (`endpoints.test.ts`) | 36 | 36 | 0 | +| Fast-profile suite | 209 PASS / 13 SKIP / 0 FAIL | **229 PASS / 13 SKIP / 0 FAIL** | +20 PASS, 0 SKIP delta | +| Bundle (gzipped initial JS) | not measured | **290 465 B** (~14 % of 2 MB budget) | new metric (NFT-PERF-01 baseline) | + +### Auto-lesson triggers (per skill Step 1) + +- Net Architecture delta > 0? **No** — delta is 0; no `architecture` regression lesson required. +- Structural metric regression > 20 %? **No** — every structural metric held or improved. +- Contract coverage % decreased? **N/A** — `endpoints.test.ts` count held at 36; project still uses code-derived contracts. +- New finding category emerged? **Yes — `security`** (Step 14 audit fired for the first time this cycle). One of the lessons below captures the rotation-discipline pattern that resulted. 
+ +## Efficiency + +| Metric | Value | Δ vs cycle 1 | +|--------|-------|--------------| +| Blocked tasks (cycle-internal) | 0 | 0 | +| Tasks pending external user action | 2 (AZ-499 AC-7 OWM revocation, AZ-501 AC-6 Google revocation) | +2 (new pattern) | +| Cross-workspace gates outstanding | 1 (AZ-498 deploy via satellite-provider cookie-auth) | +1 (new pattern) | +| Tasks requiring fixes after review | 0 | 0 | +| Batch with most findings | batch 11 (1 Low pre-existing) | n/a | +| Auto-fix loops invoked | 0 | −1 | +| Stuck-agent incidents | 0 | 0 | + +### Blocker Analysis + +| Blocker Type | Count | Prevention | +|--------------|-------|-----------| +| Manual third-party-console action (key revocation) | 2 | Folded into the new "external-secret" task template (Improvement Action #2 below) | +| Cross-workspace ticket dependency (deploy gate) | 1 | Surface during Step 9 (New Task) when ticket scope crosses workspace boundaries; capture in the task spec's `Dependencies` field as it was for AZ-498 | + +### User-decision points (cycle 2 only) + +- Step 14 outcome (HIGH findings): user chose A (fix both inline) — produced AZ-501 + AZ-502. +- Step 15 perf: user chose A (run perf tests) — confirmed bundle stays under budget. +- Commit decision: user chose B (commit + push to remote `dev`) — `f7dd6c9` pushed. +- Step 16 deploy gate: **user skipped** the structured choice; agent defaulted to planning-only sync (option B in the absence of an answer) and recorded the prod cutover + key revocations as leftovers. Rationale: option A (full deploy) required external state the agent could not verify, and option C (skip entirely) would have lost the planning information.
+ +## Trend Comparison + +| Trend | Cycle 1 | Cycle 2 | Direction | +|-------|---------|---------|-----------| +| Code review pass rate | 100 % | 50 % (1 PASS_WITH_WARNINGS, 1 no-formal-review sub-step) | ⬇ but explainable: PWW finding was pre-existing Low, not introduced this cycle | +| Test count | +46 (cumulative this cycle) | +20 (this cycle on top of cycle 1) | continued positive growth | +| Static gate count | +2 | +2 | continued positive growth (now both axes: arch + security literal-scan) | +| Architecture findings open | 7 (−2) | 7 (0) | held; cycle 2 was config/wire + security, no architecture surface touched | +| Pending USER actions at cycle close | 0 | 2 (revocations) + 1 (cross-workspace gate) | ⬆ — first cycle to exit with non-zero user-action backlog; visible in leftovers | + +The cycle 2 user-action backlog is a **structural side-effect of running Step 14 (Security Audit)** for the first time, not a process regression. The Phase A baseline never scanned for committed secrets; cycle 2's audit surfaced two such secrets that could only be neutralized via vendor-console action. Both are tracked in `_docs/_process_leftovers/2026-05-12_az-498-deploy-and-key-revocations.md` with full replay procedures. + +## Top 3 Improvement Actions + +1. **Run Step 14 (Security Audit) earlier in the cycle, ideally as a pre-flight to Step 9 (New Task)**. + This cycle's audit caught two HIGH findings (Google Geocode key + Vite CVEs) **after** the implement work was complete, forcing the inline-fix detour and producing AZ-501 / AZ-502 mid-cycle. Running a lightweight static-only audit pre-Step-9 (read `mission-planner/src/config.ts`, `mission-planner/src/services/`, top-level deps) would have surfaced the Google key during AZ-499's planning — both `mission-planner/` keys could have been externalized in the same batch as AZ-499. 
+ - Impact: high — would have collapsed AZ-499 + AZ-501 into a single batch with a single rotation discipline; would have caught F-DEP-1 before AZ-498 implementation began (cleaner branch state). + - Effort: low — add a `pre-cycle` mode to `.cursor/skills/security/SKILL.md` that runs Phase 1 (deps) + Phase 2 (SAST) only, callable from Step 9 of the existing-code flow. + +2. **Standardize an "external-secret externalization" task template**. + AZ-499 and AZ-501 are mechanically identical: extract to service module → env var via `import.meta.env.VITE_*` → fail-soft return → add literal-scan static gate (`STC-SEC1x`) → document in `.env.example` with `` placeholder → leave the actual revocation as a manual deliverable AC. The third such task (whichever comes next) should copy a checklist, not re-derive the pattern. + - Impact: medium-high — directly addresses the cycle-2 user-action backlog as a structural pattern; the next external-secret task lands in a single PR with all 6 steps already scoped. + - Effort: low — add `_docs/02_tasks/_templates/external_secret_externalization.md` (new) and reference it from `.cursor/skills/new-task/SKILL.md`'s "Task Type Detection" section. + +3. **Enforce `bun audit --severity high` in CI (close F-INF-1)**. + F-DEP-1 (Vite/PostCSS CVEs) was found by manual `bun audit` invocation during the audit. A CI gate would have caught it within hours of the advisory being published, instead of waiting for the next manual audit cycle. The fix is small — one Woodpecker step before the build stage — and the AZ-502 `package.json` overrides already make the gate green today. + - Impact: medium — closes a known coverage gap before the next dependency CVE lands; pairs with action #1 (security earlier in cycle) to push security from "audit" to "continuous gate". + - Effort: low — single step addition to `.woodpecker/build-arm.yml`. 
+ +## Suggested Rule / Skill Updates + +| File | Change | Rationale | +|------|--------|-----------| +| `.cursor/skills/security/SKILL.md` | Add a `pre-cycle` invocation mode that runs Phase 1 (deps) + Phase 2 (SAST) only, with a 5-minute time budget. Wire it into `.cursor/skills/autodev/flows/existing-code.md` as an optional pre-Step-9 gate. | §Top 3 Improvement Action #1. | +| `_docs/02_tasks/_templates/external_secret_externalization.md` | NEW file. Template with the 6-step checklist (extract to service module → env var → fail-soft → STC-SECx literal-scan → `.env.example` placeholder → manual revocation AC). Include AZ-499 and AZ-501 as canonical examples. | §Top 3 Improvement Action #2. | +| `.cursor/skills/new-task/SKILL.md` (Task Type Detection) | Add an "external-secret-externalization" trigger phrase set ("hardcoded API key", "rotate credential", "externalize secret") that suggests the new template. | §Top 3 Improvement Action #2 enablement. | +| `.woodpecker/build-arm.yml` | Add a `bun audit --severity high` step before the build stage (closes F-INF-1). | §Top 3 Improvement Action #3 + audit infrastructure_review.md F-INF-1. | +| `_docs/LESSONS.md` (top) | Append the 3 lessons in §LESSONS Append below; trim to ≤ 15 entries. | Skill Step 4. | + +## Notes — Step 16 outcome + +Step 16 (Deploy) ran in **planning-only mode** because: +- The user skipped the structured deploy-gate choice; the agent defaulted to option B (plan only) since option A required unverifiable cross-workspace state and option C would have lost the planning information. +- The actual prod cutover for AZ-498 + the two key revocations are tracked as leftovers — see `_docs/_process_leftovers/2026-05-12_az-498-deploy-and-key-revocations.md` (3 entries, each with a full replay procedure). 
+- `_docs/02_document/deployment/{environment_strategy,ci_cd_pipeline}.md` were updated to reflect cycle 2 changes (new env var + override block) so the next cycle's Step 16 starts from accurate planning artifacts. + +## LESSONS Append (top 3, single-sentence, tagged) + +1. **[process]** When externalizing a committed API key, always follow the 4-step rotation discipline: (a) extract to env-var via a service module so unit tests can stub it, (b) add a literal-scan static gate (STC-SECx) against the rotated value as defense-in-depth, (c) document in `.env.example` using the established `` placeholder convention, (d) leave the actual key revocation as a manual deliverable AC with evidence-attachment requirement — never assume the static gate alone neutralizes the leaked credential. +2. **[dependencies]** When `bun audit` reports advisories on a transitive dep that direct `bun update ` does not clear (because nested copies persist under sibling tools, e.g. `vitest/node_modules/`), use `package.json` `"overrides"` to floor the resolution AND clean reinstall (`rm -rf node_modules bun.lock && bun install`) — a direct update alone cannot displace nested copies, and Bun honors the npm-compatible `overrides` field exactly as npm does. +3. **[tooling]** When the autodev orchestrator delegates to a sub-skill that ends in a HIGH-severity blocking gate (e.g. security audit FAIL → user picks "fix inline"), capture the inline-fix sub-step results as a separate batch report (`batch_NN_report.md`) — not as an extension of the prior batch — so the cycle metrics correctly attribute findings, ACs, and complexity to the work boundary that produced them. 
diff --git a/_docs/LESSONS.md b/_docs/LESSONS.md index 2441906..43a0215 100644 --- a/_docs/LESSONS.md +++ b/_docs/LESSONS.md @@ -8,6 +8,33 @@ Categories: estimation · architecture · testing · dependencies · tooling · --- +- [2026-05-12] [process] When externalizing a committed API key, always follow + the 4-step rotation discipline: (a) extract to env-var via a service module + so unit tests can stub it, (b) add a literal-scan static gate (STC-SECx) + against the rotated value as defense-in-depth, (c) document in + `.env.example` using the established `` placeholder convention, + (d) leave the actual key revocation as a manual deliverable AC with + evidence-attachment requirement — never assume the static gate alone + neutralizes the leaked credential. + Source: _docs/06_metrics/retro_2026-05-12_cycle2.md + +- [2026-05-12] [dependencies] When `bun audit` reports advisories on a + transitive dep that direct `bun update ` does not clear (because + nested copies persist under sibling tools, e.g. + `vitest/node_modules/`), use `package.json` `"overrides"` to floor + the resolution AND clean reinstall (`rm -rf node_modules bun.lock && + bun install`) — a direct update alone cannot displace nested copies, and + Bun honors the npm-compatible `overrides` field exactly as npm does. + Source: _docs/06_metrics/retro_2026-05-12_cycle2.md + +- [2026-05-12] [tooling] When the autodev orchestrator delegates to a + sub-skill that ends in a HIGH-severity blocking gate (e.g. security audit + FAIL → user picks "fix inline"), capture the inline-fix sub-step results + as a separate batch report (`batch_NN_report.md`) — not as an extension + of the prior batch — so the cycle metrics correctly attribute findings, + ACs, and complexity to the work boundary that produced them. 
+ Source: _docs/06_metrics/retro_2026-05-12_cycle2.md + - [2026-05-12] [architecture] When adding an architecture gate (STC-ARCH-*), extend the existing single-script dispatcher with a new `--mode` flag instead of forking a second script; same walker, same comment-skip, same diff --git a/_docs/_autodev_state.md b/_docs/_autodev_state.md index 9c37940..ad99ceb 100644 --- a/_docs/_autodev_state.md +++ b/_docs/_autodev_state.md @@ -2,24 +2,19 @@ ## Current Step flow: existing-code -step: 15 -name: Performance Test -status: not_started +step: 9 +name: New Task +status: in_progress sub_step: - phase: 0 - name: awaiting-invocation + phase: 1 + name: gather-feature-description detail: "" retry_count: 0 -cycle: 2 +cycle: 3 tracker: jira ## Notes -- Cycle 2 Step 14 CLOSED. Audit: `_docs/05_security/` (5 reports). Verdict: - FAIL (1 HIGH F-SAST-1, 1 HIGH F-DEP-1, 7 MED, 2 LOW). User chose A — - fixed both HIGH inline (AZ-501 Google key, AZ-502 Vite/PostCSS). - Implementation report: `_docs/03_implementation/batch_12_report.md`. - Static + fast: 229 PASS / 13 SKIP / 0 FAIL. Both tickets transitioned to - "In Progress" in Jira. PENDING USER: AZ-501 AC-6 (Google key revocation - at Google Cloud Console) + AZ-499 AC-7 (OWM key revocation, carried from - earlier). PENDING CROSS-WORKSPACE: AZ-498 deploy gate (Step 16). - Phase B follow-ups deferred: F-INF-1..F-INF-5 in security audit report. +- Cycle 3 entered via auto-loop from cycle 2 retrospective. +- Cycle 2 leftovers carried forward (`_docs/_process_leftovers/2026-05-12_az-498-deploy-and-key-revocations.md`): + - L-AZ-498-DEPLOY → scheduled for cycle 3 Step 16 (cross-workspace gate). + - L-AZ-499-OWM-REVOKE / L-AZ-501-GOOGLE-REVOKE → await user manual action at OWM / Google Cloud dashboards. 
diff --git a/_docs/_process_leftovers/2026-05-12_az-498-deploy-and-key-revocations.md b/_docs/_process_leftovers/2026-05-12_az-498-deploy-and-key-revocations.md new file mode 100644 index 0000000..8c84ced --- /dev/null +++ b/_docs/_process_leftovers/2026-05-12_az-498-deploy-and-key-revocations.md @@ -0,0 +1,99 @@ +# Cycle 2 Step 16 — Deferred deploy + manual revocations + +**Created**: 2026-05-12T01:44:00Z (autodev Step 16, planning-only outcome) +**Cycle**: 2 + +This file tracks deploy-related work that could not complete this cycle because each +item depends on action outside this workspace. + +--- + +## L-AZ-498-DEPLOY — UI tile-swap prod cutover (cross-workspace gate) + +**What is blocked**: prod deploy of the UI changes from cycle 2 batch 11 that route +the map's `` through the suite's own `satellite-provider` (`/tiles/{z}/{x}/{y}`) +with same-origin cookie auth. The image will build cleanly today (the source change is in +`dev`), but cutting prod traffic over before satellite-provider's auth migration lands +will break the map for all users. + +**Cross-workspace prerequisite**: a separate AZAION ticket on the **satellite-provider** +workspace must publish a cookie-auth variant of `GET /tiles/{z}/{x}/{y}` AND deploy that +change to all environments the UI is promoted into (dev / stage / prod). Today the UI +sets `crossOrigin="use-credentials"` on tile images, but the server still expects an +`Authorization: Bearer ...` header (which Leaflet `` requests cannot send). + +**Replay procedure** (run at the start of the next cycle's Step 16, or sooner on user +request): + +1. Verify the satellite-provider workspace has merged the cookie-auth change to dev, + stage, and main equivalents. +2. Verify the satellite-provider deploys are live in each environment (smoke check: + `curl --cookie ... https:///tiles/0/0/0` returns 200 with `Content-Type: image/jpeg`). +3. 
Run the UI tile-render smoke check: `bunx playwright test e2e/tests/infrastructure.e2e.ts -g "tile"` + against each environment. +4. Build + push the UI image (the Woodpecker pipeline already does this on every `dev` + push; cycle 2 commit `f7dd6c9` is on `dev` as of 2026-05-12). +5. Promote: `dev → stage → main` per the standard branch model + (`_docs/02_document/deployment/ci_cd_pipeline.md` §1). +6. Post-deploy verification: load `/flights` on each environment, pan the map, watch + network panel — every `/tiles/...` request returns 200 and the request is sent with + the auth cookie attached. + +**Escalation**: if the satellite-provider ticket is still not landed by the next cycle's +Step 16 review, surface to the user via Choose A/B/C/D — the gate cannot be silently +bypassed because doing so produces a visibly broken map in production. + +--- + +## L-AZ-499-OWM-REVOKE — OpenWeatherMap key revocation + +**What is blocked**: closing AZ-499 acceptance criterion AC-7 (and the equivalent +project-wide AC-42), which requires the OWM key `335799082893fad97fa36118b131f919` +that was previously committed to the repo to be revoked at the OWM dashboard. + +**Why this can't be done from this workspace**: revocation requires authenticated +access to `https://home.openweathermap.org/api_keys` — a third-party UI that cannot +be automated from CI without storing OWM credentials, which is out of scope. + +**Replay procedure** (manual, requires user): + +1. Sign into `https://home.openweathermap.org/api_keys`. +2. Locate the key `335799082893fad97fa36118b131f919`. +3. Disable / regenerate / delete it. Capture evidence: dashboard screenshot OR a + timestamped URL showing the key is no longer active. +4. Attach the evidence to Jira ticket **AZ-499** (or to the parent epic if the user + prefers). +5. Transition AZ-499 to **Done** in Jira. +6. Delete this leftover entry once steps 1–5 are complete. 
+ +**Compensating control already in place**: `STC-SEC1C` (in `scripts/check-banned-deps.mjs` ++ `tests/security/banned-deps.json`) prevents the literal value from re-entering the +source tree. + +--- + +## L-AZ-501-GOOGLE-REVOKE — Google Geocode key revocation + +**What is blocked**: closing AZ-501 acceptance criterion AC-6 (and the project-wide +AC-43), which requires the Google Geocode key `AIzaSyAhvDeYukuyWVrQYbRhuv91bsi_jj5_Iys` +that was previously committed in `mission-planner/src/config.ts` to be revoked at the +Google Cloud Console. + +**Why this can't be done from this workspace**: same reason as AZ-499 — revocation +requires authenticated access to `https://console.cloud.google.com/google/maps-apis/credentials`, +which cannot be automated. + +**Replay procedure** (manual, requires user): + +1. Sign into `https://console.cloud.google.com/google/maps-apis/credentials`. +2. Locate the key `AIzaSyAhvDeYukuyWVrQYbRhuv91bsi_jj5_Iys`. +3. Restrict the key to no APIs / no referrers (effectively revoke) OR regenerate it. + Capture evidence: dashboard screenshot OR a timestamped URL showing the + restriction. +4. Attach the evidence to Jira ticket **AZ-501**. +5. Transition AZ-501 to **Done** in Jira. +6. Delete this leftover entry once steps 1–5 are complete. + +**Compensating control already in place**: `STC-SEC1D` (registered in `scripts/run-tests.sh` +under `run_static`, with the literal in `tests/security/banned-deps.json` → +`google_key_in_source`) prevents the literal value from re-entering the source tree.
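The literal-scan gates referenced above (`STC-SEC1C`, `STC-SEC1D`) can be pictured with a minimal sketch. This is an illustration only, not the contents of `scripts/check-banned-deps.mjs`: the function name `findBannedLiterals`, the `BannedLiteral` and `Finding` shapes, the file paths, and the placeholder key value are all assumptions made for the example.

```typescript
// Hypothetical sketch of a literal-scan static gate. Names and shapes are
// illustrative assumptions, not taken from scripts/check-banned-deps.mjs.

interface BannedLiteral {
  id: string;      // identifier for the rule, e.g. a gate or config key
  literal: string; // exact string that must never appear in source
}

interface Finding {
  id: string;
  file: string;
  line: number; // 1-based line number of the match
}

// Scan in-memory file contents (path -> text) for exact occurrences of each
// banned literal and report every matching line.
function findBannedLiterals(
  files: Record<string, string>,
  banned: BannedLiteral[],
): Finding[] {
  const findings: Finding[] = [];
  for (const [file, text] of Object.entries(files)) {
    const lines = text.split("\n");
    lines.forEach((content, index) => {
      for (const { id, literal } of banned) {
        if (content.includes(literal)) {
          findings.push({ id, file, line: index + 1 });
        }
      }
    });
  }
  return findings;
}

// Usage with placeholder values (deliberately not real keys).
const banned: BannedLiteral[] = [
  { id: "example_key_in_source", literal: "EXAMPLE-REVOKED-KEY-0000" },
];
const files: Record<string, string> = {
  "src/config.ts": 'export const KEY = "EXAMPLE-REVOKED-KEY-0000";\n',
  "src/weather.ts": "export const KEY = import.meta.env.VITE_OWM_API_KEY;\n",
};
const findings = findBannedLiterals(files, banned);
// A real gate would exit non-zero when findings is non-empty.
console.log(findings);
```

A scan like this only stops the literal from re-entering the source tree. It does nothing about copies already in git history or elsewhere, which is why the revocation steps remain a separate manual deliverable.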