115 Commits

Author SHA1 Message Date
Oleksandr Bezdieniezhnykh 7d3ba1c3fd Enhance .cursor documentation and workflows
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Updated the README.md to reflect new skill commands and improved descriptions for various workflows, including the addition of new skills like /test-spec, /code-review, and /release. Enhanced clarity on session boundaries and flow resolutions in the auto-chaining process.

Removed the implementer agent file as it is no longer needed. Updated coderule and meta-rule documents to enforce stricter testing and implementation standards, ensuring real results are produced rather than simulated ones.

Revised the autodev flow documentation to include a new release step and clarified the retrospective process. Adjusted the testing rules to specify coverage thresholds for critical paths.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-21 13:52:34 +03:00
Oleksandr Bezdieniezhnykh 19c0371fd6 [no-ticket] Sync .cursor with suite root
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Bring this repo's .cursor/ in line with the suite monorepo root .cursor/
so rules, skills, and autodev artifacts stay consistent across
submodules and sibling repos.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-17 13:11:02 +03:00
Oleksandr Bezdieniezhnykh af661359c7 [AZ-505] Cycle 6 Step 17: retrospective + close cycle
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Single-task cycle delivering AZ-505 (3 SP); 1 batch, PASS verdict
after a single auto-fix round (ComputeLocationHash duplication
consolidated into Uuidv5.LocationHashForTile). Step 14 Security
Audit skipped; Step 15 Performance Test PASS (8/8, exit 0) and
closes the cycle-3 perf-harness leftover that carried across
cycles 3-5.

Top 3 lessons appended to LESSONS.md ring buffer:
- Kestrel Http1AndHttp2 requires TLS for ALPN; spec-time decision
- Dapper-bypassing test paths own the Npgsql type contract
- Test fixtures naming specific schema artifacts need migration awareness

Cycle 7 opens at Step 9 (New Task).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 23:07:19 +03:00
Oleksandr Bezdieniezhnykh ba3bdb1918 [AZ-505] Cycle 6 Steps 15-16 perf + deploy report
Step 15 (Performance Test): 8/8 PT scenarios PASS in a single
default-parameter run (exit 0). Adapts scripts/run-performance-tests.sh
for the new TLS+ALPN dev listener via CURL_OPTS=(--cacert ./certs/api.crt).
Report at _docs/06_metrics/perf_2026-05-12_cycle6.md. The clean exit-0
satisfies the cycle-3 perf-harness leftover deletion criterion that
carried across cycles 3-5; leftover file deleted.

Step 16 (Deploy): _docs/03_implementation/deploy_cycle6.md captures the
shipping payload (inventory endpoint, HTTP/2 TLS+ALPN, tiles_leaflet_path
covering index, migration 015), the dev-cert plumbing for local-docker +
integration-tests parity, the production-TLS topology note (terminate at
ingress; never promote the dev cert), and the operator runbook for
promoting cycle-6 past dev.

NU1902 / CA2227 / ASPDEPR002 / Serilog-10.x re-listed as carry-overs
unchanged; admin-team iss/aud confirmation unchanged.

State advanced to Step 17 (Retrospective).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 23:02:00 +03:00
Oleksandr Bezdieniezhnykh 5d84d2839e [AZ-505] Test-spec sync + task-mode doc updates for cycle 6
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Step 12 (Test-Spec Sync, cycle-update mode):
- blackbox-tests.md: append BT-23..BT-26 for AZ-505's new
  observable behaviors (inventory order/shape; leaflet
  most-recent via location_hash; HTTP/2 multiplex over TLS+ALPN;
  request validation).
- performance-tests.md: append PT-09 (inventory p95 ≤ 1000ms /
  2500 tiles); records cycle-6 measured p95=66ms; documents
  promotion path to scripts/run-performance-tests.sh if budget
  ever tightens.
- traceability-matrix.md: resolve the 5 AZ-503 deferrals
  (AC-5/6/9/10/12) by pointing at AZ-505 test names + add 7
  AZ-505 AC rows (AC-1..AC-7) + bump totals (90 -> 94 tests,
  56/56 -> 63/63 in-scope) + add cycle-6 coverage shape notes
  (budget relaxation rationale, voting-filter deferral note,
  TLS+ALPN pivot, NFR propagation).

Step 13 (Update Docs, task mode):
- common_dtos.md: add 5 new TileInventory DTOs.
- common_interfaces.md: add ITileService.GetInventoryAsync.
- services_tile_service.md: document TileService.GetInventoryAsync
  steps + the XOR-validation-in-handler note.
- dataaccess_migrator.md: bump migration count 14 -> 15;
  describe migration 015 (AZ-505 leaflet covering index, lock
  window, INCLUDE-list trade-off).
- system-flows.md: add F7 (Leaflet Tile Serving, AZ-310 +
  AZ-505 location_hash rewire + TLS+ALPN) and F8 (Tile
  Inventory Bulk Lookup) with sequence diagrams, validation
  surface, and AC-4 perf evidence. Update Flow Inventory +
  Dependencies tables accordingly.
- glossary.md: add "Tile Inventory" entry pointing at the
  v1.0.0 contract.
- ripple_log_cycle6.md: new file — exhaustive reverse-dependency
  analysis confirms zero stale downstream module docs.

Advance autodev state from step 11 -> 14 (skipping 12+13 as
completed in this commit; auto-chain through Step 14 = Security
Audit optional gate).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 22:29:22 +03:00
Oleksandr Bezdieniezhnykh c74a2339aa [AZ-505] AC-5 fix: enable TLS for HTTP/2 via ALPN
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Kestrel with HttpProtocols.Http1AndHttp2 on a plaintext listener
silently downgrades to HTTP/1.1-only (logs "HTTP/2 is not enabled
... TLS is not enabled"), so AC-5's multiplexed-GET test failed
with HTTP_1_1_REQUIRED. ALPN cannot run over plaintext, so the
fix switches the dev listener to TLS on https://+:8080:

- scripts/run-tests.sh generates a self-signed dev cert idempotently
  (./certs/api.pfx + api.crt) via openssl in an alpine container;
  certs/ is gitignored.
- docker-compose.yml binds Kestrel to ASPNETCORE_URLS=https://+:8080
  with Kestrel__Certificates__Default__Path bound to the .pfx.
- docker-compose.tests.yml mounts api.crt into the integration-tests
  container's CA store and runs update-ca-certificates so HttpClient
  trusts the cert transparently; default API_URL is now https://api:8080.
- Drop the obsolete Http2UnencryptedSupport AppContext switch from
  Http2MultiplexingTests; ALPN over TLS handles negotiation.

Test-data fixes caught on the post-TLS rerun (independent of the TLS
switch but surfaced together):

- Http2MultiplexingTests: switch slippy coords from (154321, 95812)
  -- which Google Maps returns 404 for -- to (158485, 91707), the
  slippy projection of (47.461747, 37.647063) already exercised by
  JwtIntegrationTests.
- TileInventoryTests + LeafletPathIndexOnlyTests: SpecifyKind to
  Unspecified at the binding site for raw Npgsql seed paths writing
  into tiles.captured_at / created_at / updated_at (TIMESTAMP without
  tz). Npgsql v6+ refuses Kind=Utc into plain timestamp columns;
  production goes through Dapper and never hits this code path.
- MigrationTests Az503NewUniqueIndexCoversIntegerKeyAndFlightId:
  accept either idx_tiles_location_hash (migration 014) or its
  AZ-505 successor tiles_leaflet_path (migration 015) -- both have
  location_hash as the leading column, which is the AC-9 intent.

Docs updated to reflect the TLS+ALPN path: tile-inventory.md
Non-Goals, modules/api_program.md, module-layout.md, the AZ-505
task spec's Risk 3, and the cycle 6 implementation + completeness
reports. The full integration test suite passes (mode=full, exit 0).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 22:19:26 +03:00
Oleksandr Bezdieniezhnykh da40534b49 chore: advance autodev state to Step 11 (Run Tests) after AZ-505 batch 1
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 21:17:23 +03:00
Oleksandr Bezdieniezhnykh 909f69cb3a [AZ-505] Tile inventory endpoint + HTTP/2 + Leaflet covering index
Production code:
- POST /api/satellite/tiles/inventory (XOR body, 5000-cap,
  most-recent-per-location_hash select, present/absent shaping).
- Kestrel HttpProtocols.Http1AndHttp2 on every listener (AC-5).
- Migration 015 creates tiles_leaflet_path covering index over
  (location_hash, captured_at DESC, updated_at DESC, id DESC)
  INCLUDE (file_path, source); drops superseded idx_tiles_location_hash.
- TileRepository.GetByTileCoordinatesAsync rewired to filter by
  location_hash (Index Only Scan via tiles_leaflet_path).
- TileRepository.GetTilesByLocationHashesAsync added with Npgsql-
  direct ANY($1::uuid[]) binding (Dapper IEnumerable expansion is
  incompatible with the array form).
- Uuidv5.LocationHashForTile centralises the UUIDv5(TileNamespace,
  "{z}/{x}/{y}") formula — single source of truth for the cross-repo
  invariant (gps-denied-onboard parity).

Contracts:
- New: contracts/api/tile-inventory.md v1.0.0.
- Bumped: contracts/data-access/tile-storage.md to v2.0.0 (joint
  ownership by AZ-503-foundation + AZ-505: schema + covering index +
  GetByTileCoordinatesAsync rewrite).

Tests:
- TileInventoryTests covers AC-1, AC-2 (DB-level), AC-4, AC-6.
- Http2MultiplexingTests covers AC-5 (20 concurrent multiplexed GETs
  over h2c via SocketsHttpHandler + AppContext Http2Unencrypted switch).
- LeafletPathIndexOnlyTests covers AC-3 (EXPLAIN (ANALYZE, BUFFERS)
  asserts Index Only Scan over tiles_leaflet_path with heap_blocks=0).

Docs:
- architecture.md, system-flows.md, data_model.md, module-layout.md,
  glossary.md, modules/api_program.md, modules/dataaccess_tile_repository.md,
  components/02_data_access/description.md all updated to reference the
  v2.0.0 tile-storage contract + new tile-inventory contract + AC-7.

Reports:
- batch_01_cycle6_report.md, batch_01_cycle6_review.md,
  implementation_completeness_cycle6_report.md (PASS),
  implementation_report_tile_inventory_cycle6.md.

Task spec moved todo/ -> done/.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 21:16:37 +03:00
Oleksandr Bezdieniezhnykh 3c7cd4e56b chore: update autodev state to Step 10 (Implement) and refine task details for AZ-505
- Advanced the autodev state to Step 10, reflecting the implementation phase.
- Updated task name for AZ-505 to "Implement" and adjusted its status to "not_started."
- Enhanced the dependencies table with detailed context for AZ-505, clarifying its relationship with AZ-503 and its role in cycle 6.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 20:17:57 +03:00
Oleksandr Bezdieniezhnykh aa1a1bf19f chore: open cycle 6 — state advanced to Step 9 (New Task)
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 18:08:17 +03:00
Oleksandr Bezdieniezhnykh ea278afb37 [AZ-503] [AZ-504] Cycle 5 Step 17: retrospective + close cycle
retro_2026-05-12_cycle5.md captures the cycle-end retrospective:
- Implementation: 2 tasks (AZ-504 + AZ-503-foundation), 4 SP total,
  100% first-attempt pass rate, 1 mid-implement scope-split
  (AZ-503 → AZ-503-foundation + AZ-505, blocked-linked).
- Quality: 50/50 PASS/PASS_WITH_WARNINGS, 0 new Medium+, 1 new Low
  (defensive contentSha256 soft-NULL guard).
- Security: PASS_WITH_WARNINGS, 0 new Critical/High/Medium, 2 new
  Low informational (F1 flightId provenance, F2 pgcrypto runbook
  gap).
- Performance: PASS_WITH_INFRA_WARNINGS — first measurable PT-08
  ever (Run #1 199ms, Run #2 117ms vs 2000ms threshold); PT-01/02
  failed on recurring local Docker/colima DNS cold-start, not an
  app regression.
- Structural: +1 ProjectReference edge (IntegrationTests → Common),
  +1 minor contract bump (uav-tile-upload 1.0.0 → 1.1.0), +1 DB
  migration (014_AddTileIdentityColumns.sql), 0 NuGet bumps,
  0 csproj additions, DAG still acyclic at 9 projects.

structure_2026-05-12_cycle5.md captures the structural snapshot.

LESSONS.md updated with 3 cycle-5 entries (oldest dropped to
preserve the 15-entry ring buffer):
- [architecture] Cross-repo cryptographic invariants must live as
  code constants in both repos with reference-vector tests.
- [tooling] When perf-mode "one re-run" fires twice with the same
  DNS root cause, escalate from re-run to harness fix.
- [process] Spec contradicts live code by >=2 prerequisites →
  prefer split into foundation + follow-up (A/B/C option C).

Top 3 follow-up actions (cycle 6 candidates):
- Action 1 (1 SP): DNS pre-warm in scripts/run-performance-tests.sh
  → closes the cycle-3 perf-harness leftover.
- Action 2 (5 SP): AZ-505 — inventory endpoint + HTTP/2 + Leaflet
  covering index (blocked-linked on AZ-503-foundation, this cycle).
- Action 3 (1 SP): pgcrypto pre-install runbook step (F2-cy5 doc
  fix).

Cycle 5 closed. Autodev state advanced for cycle 6 by the next
/autodev invocation.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 18:07:57 +03:00
Oleksandr Bezdieniezhnykh 0e05fc519a [AZ-503] [AZ-504] Cycle 5 Step 16 deploy report
deploy_cycle5.md captures everything operators need to promote
cycle 5 beyond dev:

- Code shipped: AZ-503-foundation (deterministic UUIDv5 tile
  identity, integer-only flight-aware UPSERT, per-flight on-disk
  paths) + AZ-504 (perf script grep-pipefail fix).
- NEW database migration 014_AddTileIdentityColumns.sql adds
  flight_id, location_hash, content_sha256, legacy_id; enables
  pgcrypto; swaps the AZ-484 float index for the new
  idx_tiles_unique_identity integer index. Idempotent under
  DbUp's journal.
- NEW contract version uav-tile-upload.md 1.0.0 → 1.1.0 (adds
  optional flightId; derived tileId in response).
- NEW per-flight on-disk path layout for UAV tiles (additive;
  legacy paths preserved).
- No env-var changes. Container image base unchanged from cycle 4.
- Verification gates passed: PASS (Step 11), PASS (Steps 12+13),
  PASS_WITH_WARNINGS (Step 14), PASS_WITH_INFRA_WARNINGS (Step 15).
- Cycle-3 perf-harness leftover stays OPEN with two clean follow-up
  paths recorded (DNS pre-warm in script, OR move perf gate to CI).
- Operator runbook includes pgcrypto pre-install check for managed
  Postgres providers.

Autodev state advanced to Step 17 (Retrospective).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 18:01:49 +03:00
Oleksandr Bezdieniezhnykh 61612044fb [AZ-503] [AZ-504] Cycle 5 Steps 11-15 sync
Wrap up cycle 5 verification + documentation:
- Steps 10/11 wrap-up reports (implementation_completeness +
  implementation_report) for the AZ-503-foundation + AZ-504 batch.
- Step 12 test-spec sync: AZ-503-foundation/AZ-504 ACs appended;
  AZ-505 deferred ACs recorded.
- Step 13 update-docs: architecture, data-model, glossary, module-
  layout, uav-tile-upload contract (v1.1.0), DataAccess + Services
  + Tests module docs synced; new common_uuidv5.md module doc.
- Step 14 security audit: PASS_WITH_WARNINGS; 0 new Critical/High;
  2 new Low informational (F1 flightId provenance, F2 pgcrypto
  deploy gap).
- Step 15 performance test: PASS_WITH_INFRA_WARNINGS; PT-08
  passed twice (AZ-504 fix verified); PT-01/02 failed due to
  recurring local Docker/colima DNS cold-start (not an app
  regression). Cycle-3 perf-harness leftover stays OPEN with
  replay #5 documented.
- Autodev state moved to Step 16 (Deploy).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 18:01:27 +03:00
Oleksandr Bezdieniezhnykh c646aa93e2 [AZ-503] Tile identity → UUIDv5 + integer UPSERT (foundation)
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Foundation half of original AZ-503 (split during /autodev step 10 batch 2
on user choice; deferred work moved to AZ-505 with a Blocks link).

Adds deterministic tile identity (UUIDv5 over (z, x, y, source, flight_id))
shared cross-repo with gps-denied-onboard via the pinned TileNamespace
5b8d0c2e-7f1a-4d3b-9c5e-1f3a8e7d2b6c, switches the tiles UPSERT key from
floats to integers with per-flight separation, plumbs FlightId through
UavTileMetadata + handler, and writes UAV evidence to per-flight
on-disk directories so two flights at the same (z, x, y) coexist.

- Common: pure-C# RFC 9562 Uuidv5 (no third-party dep) + FlightId DTO
  field; 10 Python-reference unit vectors verify byte parity.
- DataAccess: migration 014 adds flight_id (uuid NULL), location_hash
  (uuid NOT NULL, backfilled via session-scoped pg_temp.uuidv5),
  content_sha256 (bytea NULL), legacy_id (uuid NULL = preserves
  pre-AZ-503 random id one cycle); drops idx_tiles_unique_location_source
  (AZ-484) and adds idx_tiles_unique_identity keyed on
  (tile_zoom, tile_x, tile_y, tile_size_meters, source,
   COALESCE(flight_id, '00000000-...'::uuid)) + idx_tiles_location_hash.
- TileRepository: ColumnList + UPSERT updated; id never updated on
  conflict (preserves AC-2 idempotence). UpdateAsync extended.
- Services: TileService and UavTileUploadHandler compute deterministic
  Id + LocationHash + ContentSha256 before insert; UAV file path
  becomes ./tiles/uav/{flight_id or 'none'}/{z}/{x}/{y}.jpg.
- Tests: Uuidv5Tests (10 reference vectors), UavTileFilePathTests
  (per-flight + anonymous paths), UavTileUploadHandlerTests (AC-2,
  AC-3, AC-7, AC-11 unit-level), UavUploadTests (AC-3 + AC-4
  integration: multi-flight DB coexistence with shared location_hash
  + distinct file_path; float-different lat/lon collapse to 1 row),
  MigrationTests (column shape, idx_tiles_unique_identity supersedes
  AZ-484 index, deterministic backfill).
- IntegrationTests project references Common to reuse Uuidv5 in raw
  SQL seeds.
- AZ-488 MultiSourceCoexistence seed fixed to populate location_hash
  (otherwise migration 014's NOT NULL constraint fails).

ACs covered: AC-1, AC-2, AC-3, AC-4, AC-7, AC-8, AC-11.
ACs deferred to AZ-505: AC-5, AC-6, AC-9, AC-10, AC-12.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 17:07:35 +03:00
Oleksandr Bezdieniezhnykh f6197499a4 chore: update autodev state after AZ-504 batch 1
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 16:33:08 +03:00
Oleksandr Bezdieniezhnykh ab437a15df [AZ-504] Fix grep | wc -l pipefail crash in PT-08 batch counting
scripts/run-performance-tests.sh:416-417 used `grep -o ... | wc -l`
to count `"status":"accepted"` and `"status":"rejected"` markers in
the PT-08 batch response. On the happy path (rejected=0) grep -o
exits 1, and under `set -o pipefail` + `set -e` (line 16) the
pipeline killed the script before reaching any of PT-08's reporting
code — reproducing twice in the cycle-3 perf-harness leftover
(replay #2 + #3 post-AZ-500).

Fix: neutralise grep's no-match exit locally with `|| true` on the
grep stage of each pipeline. `grep -o | wc -l` is kept (not swapped
for `grep -c`) because the PT-08 response is compact JSON — all
items live on one line, so `grep -c` would always return 1 and lose
occurrence-count semantics. An 8-line comment explains why grep
cannot fail for I/O at this code path (file is curl-written, HTTP
200 gated).

AC-1 + AC-2 verified in-place against a standalone harness under
`set -e -o pipefail` (compact-JSON, mixed-status, edge-empty
cases). AC-3 + AC-4 are Step 15 (Performance Test) obligations by
spec design — the leftover deletion (AC-4) is "in the same commit"
as the green full perf run.

Batch report: _docs/03_implementation/batch_01_cycle5_report.md.
Code review: _docs/03_implementation/reviews/batch_01_cycle5_review.md
— PASS, no findings.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 16:32:36 +03:00
Oleksandr Bezdieniezhnykh 8e509b550c [AZ-503] [AZ-504] cycle 5 new-task: tile identity + perf-script-fix
- AZ-503 (3 SP, epic AZ-483) — Tile identity → UUIDv5 deterministic id;
  integer-only UPSERT with COALESCE(flight_id) per-flight separation;
  content_sha256 column; POST /api/satellite/tiles/inventory bulk-list
  endpoint; HTTP/2 at Kestrel edge. Cross-workspace handoff from
  gps-denied-onboard (AZ-304 / AZ-316 counterpart). Supersedes the
  AZ-484 UPSERT-conflict-key portion.

- AZ-504 (1 SP, epic AZ-483) — Fix scripts/run-performance-tests.sh
  lines 416-417: grep -o | wc -l + set -o pipefail kills PT-08 when
  rejected=0. Closes the replay obligation for the cycle-3 perf-harness
  leftover (leftover deletion gated on green full perf run, AC-4).

Updates _dependencies_table.md with cycle 5 entries and records
replay attempt #4 against the perf-cycle3 leftover (PBI opened —
leftover still stays until AZ-504 lands and full perf run is green).

State advanced to Step 10 (Implement).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 16:27:40 +03:00
Oleksandr Bezdieniezhnykh e31f59211d [AZ-500] Cycle 4 Step 17: retrospective + close cycle
Adds retro_2026-05-12_cycle4.md, structure_2026-05-12_cycle4.md, and
the deploy_cycle4.md report that was dropped from the Steps 12-15
sync commit. Appends 3 new lessons to LESSONS.md (12/15 ring buffer)
on transitive major-version bumps, exposed pre-existing bugs, and
single-task-cycle metric framing. State advances to cycle 5 / step 9
(awaiting next New Task invocation).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 06:14:43 +03:00
Oleksandr Bezdieniezhnykh af4219fce6 [AZ-500] Cycle 4 Steps 12-15 sync (test-spec / docs / security / perf)
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Step 12 (Test-Spec Sync) - cycle-update mode
  - traceability-matrix: 8 AZ-500 AC rows + .NET 10 runtime
    restriction supersession + Cycle-4 coverage shape note
    (no new tests; ACs verified by re-running existing 78-test
    suite + build pipeline + manifest grep)

Step 13 (Update Docs) - task mode
  - FINAL_report, 00_discovery, architecture, module-layout,
    api_program, tests_unit: .NET 8 -> .NET 10 / C# 12 -> 14 /
    Swashbuckle 6.6.2 -> 10.1.7 + Microsoft.OpenApi 2.x
    refactor note in api_program; Serilog.AspNetCore 8.0.3
    fallback documented inline per AZ-500 Risk #4
  - deployment/{containerization, ci_cd_pipeline}: Docker
    aspnet/sdk:8.0 -> :10.0
  - ripple_log_cycle4: empty import-graph ripple recorded
    (Program.cs is entry point; ParameterDescriptionFilter only
    consumed by Program.cs; csproj/global.json/Dockerfile have
    no import edges)

Step 14 (Security Audit) - resume mode
  - dependency_scan_cycle4: AZ-500 19-package delta scanned;
    cycle-3 D1+D3 (CVE-2026-26130) closed by major-version
    bump; cycle-3 D2 (Test.Sdk 17.8.0 NuGet.Frameworks flag)
    carried over - explicitly out of AZ-500 scope
  - security_report_cycle4: PASS_WITH_WARNINGS (only carry-over
    Medium open; AZ-500 introduced 0 new Critical/High); cycle-3
    static_analysis/owasp_review/infrastructure_review carried
    forward unchanged (AZ-500 made no source-level edits to
    those surfaces)

Step 15 (Performance Test) - perf mode, full default-param run
  - perf_2026-05-12_cycle4: 7 Pass + 1 Unverified (PT-08 hit
    pre-existing scripts/run-performance-tests.sh:417 grep-
    pipefail bug, NOT a .NET 10 regression)
  - PT-07 warm p95 = 301ms (7.7x improvement vs cycle-3 short
    variant - .NET 10 pipeline + N=20 dilution); cold p95 =
    2782ms (-14%); PT-06 90ms (-49%)
  - AZ-500 NFR (Performance) MET for 7/8 scenarios
  - Cycle-3 perf-harness leftover updated with replay #3
    results; STAYS OPEN per AZ-500 Constraint (deletes only on
    fully clean run)

Recommended follow-up PBIs (out of cycle-4 scope, surfaced for
the backlog):
  - 1 SP fix scripts/run-performance-tests.sh:416-417 grep-
    pipefail (replace grep -o ... | wc -l with grep -c ... ||
    true) - unblocks PT-08 + closes the cycle-3 perf leftover
  - 3 SP migrate WithOpenApi(...) callsites to ASP.NET Core 10
    minimal-API metadata extensions (clears 8 ASPDEPR002
    warnings; recorded in batch_01_cycle4_review.md)
  - 1 SP Microsoft.OpenApi 2.x nullable cleanup (CS8604 in
    ParameterDescriptionFilter.cs:25)
  - 1 SP bump Microsoft.NET.Test.Sdk 17.8.0 -> 17.13.0+
    (closes cycle-3 D2 NuGet.Frameworks transitive flag)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 06:05:29 +03:00
Oleksandr Bezdieniezhnykh de609cffa1 [AZ-500] Cycle 4 implement-skill wrap-up reports
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Adds the cycle-4 product implementation completeness gate report
(verdict: PASS) and the final implementation report for the .NET 10
migration. Records the Step 16 handoff to Step 11 (test-run skill)
to avoid duplicating the full-suite run already executed during
AC-6 verification (271/271 unit + full integration suite green).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 05:32:27 +03:00
Oleksandr Bezdieniezhnykh 813136326f [AZ-500] .NET 8 -> .NET 10 migration
Coordinated cross-cutting bump: 9 csproj TFMs net8.0 -> net10.0;
global.json sdk.version 8.0.0 -> 10.0.0; all Dockerfiles + scripts/
+ .woodpecker on mcr.microsoft.com/dotnet/{sdk,aspnet,runtime}:10.0;
all Microsoft.AspNetCore.* (8.0.25) and Microsoft.Extensions.* (9.0.10)
packages -> 10.0.7. Serilog.AspNetCore retained at 8.0.3 (10.0.0
requires Serilog.Sinks.File >= 7.0.0; out of AZ-500 scope per "no
unrelated package bumps") -- documented in AGENTS.md. Swashbuckle
9.x bumped to 10.1.7 to track Microsoft.OpenApi 2.x; Program.cs +
ParameterDescriptionFilter.cs refactored for the 2.x namespace
(Microsoft.OpenApi), OpenApiSecuritySchemeReference, JsonSchemaType
enum, and IOpenApiSchema dictionary properties. Fixed implicit AC-5
prereq: scripts/run-performance-tests.sh PERF_DLL path bin/Release/
net8.0 -> net10.0. Docs sync: architecture.md + AGENTS.md.

ACs verified: AC-1..AC-4 + AC-7 + AC-8 by grep + build; AC-6 by
./scripts/run-tests.sh --full (271/271 unit tests + full integration
suite green); AC-5 short bootstrap-smoke (PERF_REPEAT_COUNT=2
PERF_UAV_BATCH_SIZE=2) succeeded at the bootstrap step (no exit 3),
PT-01..PT-07 PASS. PT-08 surfaced a pre-existing grep-pipefail bug
in run-performance-tests.sh:417 -- not an SDK problem; recorded as
follow-up in the perf-cycle3 leftover. Code review verdict:
PASS_WITH_WARNINGS (2 Medium deferred per scope discipline:
WithOpenApi ASPDEPR002 deprecation x8, CS8604 nullable in
ParameterDescriptionFilter.cs; both targeted at follow-up PBIs).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 05:28:01 +03:00
Oleksandr Bezdieniezhnykh c0f004d2c9 [AZ-500] Cycle 4 Step 9: new-task .NET 10 migration
Closes Step 9 (New Task) of cycle 4. AZ-500 spec defines the
.NET 8 -> .NET 10 migration (TFM bump on 9 csprojs, global.json
SDK pin to 10.0.0, both Dockerfiles + run-tests.sh + woodpecker
to mcr.microsoft.com/dotnet/*:10.0, Microsoft.AspNetCore.* and
Microsoft.Extensions.* to the 10.x line, Serilog.AspNetCore to
10.x or documented 8.0.3 fallback, plus arch.md + AGENTS.md doc
sync). Closes the cycle-3 perf-harness leftover via AC-5
(bootstrap smoke after migration).

Also logs the cycle-4 perf-leftover replay attempt that
discovered the host-SDK / project-SDK mismatch and rolls the
state file from cycle 3 -> cycle 4 (Step 9 done -> Step 10
ready).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 04:35:52 +03:00
Oleksandr Bezdieniezhnykh ca0ca9f2a4 [AZ-491] [AZ-492] [AZ-493] [AZ-494] [AZ-495] [AZ-496] Cycle 3 Step 17: retrospective + close cycle
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Cycle-3 retrospective:
  - 6 tasks (AZ-491..AZ-496), 5 batches, 18 SP delivered.
  - 100% code review pass rate (5/5 PASS_WITH_WARNINGS, 0 FAIL).
  - 0 Critical/High/Medium review findings; 7 distinct Low.
  - Security audit PASS_WITH_WARNINGS: 0 new Medium, 3 Low (all
    test-only or operator-CLI), 2 Informational, 1 False Positive.
  - Net Architecture delta: **-3** (F-AUTH-2 + D1 + D3 RESOLVED;
    only new findings are Low test-side surfaces). First
    net-negative cycle on record.
  - 5 of 6 tasks completed first attempt (no post-review fix
    commits). Cycle-2's 2 prior-retro actions all translated to
    closed work (AZ-491 from Action 1, AZ-492 from Action 2,
    AZ-493 from Action 3).

Top 3 cycle-4 improvement actions surfaced:
  1. Execute the perf harness to capture PT-07/PT-08 baseline.
  2. Bump TestSupport JWT pins 7.0.3 → 7.1.2+ (D4 NU1902 cleanup).
  3. Add `workspace:` tag to cross-repo ACs in task-spec writing
     and render them separately in the traceability matrix.

3 new ring-buffer lessons appended to _docs/LESSONS.md:
  - [process] Option-B forcing functions for cross-team blockers.
  - [process] ACs prescribing a measurement should also prescribe
    the collection path.
  - [process] Cross-repo-write ACs need workspace tags.

Structural snapshot at structure_2026-05-12_cycle3.md records the
new SatelliteProvider.TestSupport project (+2 ProjectReference edges
into it; no production-layer dependents) and the AZ-496 package
bumps (8.0.21 → 8.0.25).

Cycle 3 COMPLETE. State advanced to Step 9 (New Task) for cycle 4
per existing-code flow Re-Entry After Completion.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 03:46:41 +03:00
Oleksandr Bezdieniezhnykh 65cdfae970 [AZ-491] [AZ-492] [AZ-493] [AZ-494] [AZ-495] [AZ-496] Cycle 3 Step 15 skip + Step 16 deploy report
Step 15 (Performance Test): SKIPPED. User skipped the optional gate
question. Per meta-rule.mdc, performance tests require explicit
approval; a skipped question is not approval. Recorded as leftover at
_docs/_process_leftovers/2026-05-12_perf-cycle3-harness-execution.md
for replay at next /autodev invocation.

Step 16 (Deploy): COMPLETED. Produced deploy_cycle3.md mirroring the
cycle-2 shape. Covers:
  - 9 cycle-3 commits + zero DB migration
  - Config changes (JWT_ISSUER/JWT_AUDIENCE env vars w/ fail-fast,
    8.0.25 package bumps, new TestSupport project)
  - Pre-deploy gate recap (Steps 11-15)
  - Cycle-3 operational risks R1-R4 (admin-team iss/aud confirm,
    cross-repo doc deferral, cycle-2 R1/R3 carry-overs, test-runner
    log line)
  - Rollback plan, post-deploy verification (incl. wrong-iss / wrong-
    aud smoke probes), CI/CD push policy
  - Resolved this cycle: F-AUTH-2, D1, D3, PT-07/PT-08 leftover
  - Follow-up backlog: D4 NU1902 bump, F-DBR-2 third guard, F-PERF-1
    token-history hardening, image-fixture consolidation, AC-7 cross-
    repo write, no-revocation-list residual

Next: Step 17 (Retrospective, cycle-end mode).
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 03:42:10 +03:00
Oleksandr Bezdieniezhnykh 314d1dec39 [AZ-491] [AZ-492] [AZ-493] [AZ-494] [AZ-496] Cycle 3 Step 14: security audit refresh
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
All 5 phases refreshed against cycle-3 delta:

Phase 1 (Dependency Scan):
  - D1 RESOLVED (AZ-496): Microsoft.AspNetCore.OpenApi 8.0.21 → 8.0.25
  - D3 RESOLVED (AZ-496): JwtBearer 8.0.21 → 8.0.25
  - D4 NEW (Low, test-only): System.IdentityModel.Tokens.Jwt 7.0.3 +
    Microsoft.IdentityModel.Tokens 7.0.3 pinned in TestSupport carry
    CVE-2024-21319 (JWE DoS). Bump to ≥ 7.1.2 tracked as future PBI.

Phase 2 (Static Analysis):
  - F-AUTH-3 (Info): test runner Program.cs logs iss/aud at startup;
    production API does NOT (verified by grep).
  - F-AUTH-4 (Info): DEV-ONLY iss/aud placeholders in
    appsettings.Development.json + .env.example — by design per
    Option B for AZ-494.
  - F-DBR-1: TRUNCATE string interpolation in
    IntegrationTestDatabaseReset.cs — false positive (hard-coded
    table list).
  - F-DBR-2 (Low): TRUNCATE guard is operator-bypassable. Two-guard
    model is conservative-by-default and unit-tested.
  - F-PERF-1 (Low): perf-bootstrap --mint-only writes a 4-hour
    GPS-permission token to stdout. Operator-trusted machine assumed.

Phase 3 (OWASP Top 10):
  - A03 carries D1/D3 RESOLVED + D4 NEW.
  - A07 flips F-AUTH-2 to RESOLVED (AZ-494); residual revocation-list
    Low recorded.
  - A05 status unchanged (F-DBR-1 false positive).
  - A08 picks up F-DBR-2.

Phase 4 (Infrastructure):
  - JWT_ISSUER / JWT_AUDIENCE flow .env → compose → Kestrel config,
    same pattern as JWT_SECRET.
  - INTEGRATION_TEST_DB_RESET + ASPNETCORE_ENVIRONMENT=Testing wired
    for AZ-493 reset gate.
  - SatelliteProvider.TestSupport is IsPackable=false — never ships
    in a production container image.
  - New operational gate added to deploy runbook: grep for DEV-ONLY-
    in the rendered deploy environment must return zero hits.

Phase 5 (Security Report):
  - Verdict: PASS_WITH_WARNINGS (cycle 3 does not escalate).
  - 0 Critical, 0 High, 0 new Medium.
  - Cycle-2 F-AUTH-2 (Medium) RESOLVED; cycle-1 D1 + cycle-2 D3
    RESOLVED.

Autodev state advanced to Step 14 completed. Next: Step 15
(Performance Test, optional gate).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 03:13:04 +03:00
Oleksandr Bezdieniezhnykh e42bf62152 [AZ-491] [AZ-492] [AZ-493] [AZ-494] [AZ-495] [AZ-496] Cycle 3 Steps 11-13: test-spec sync + ripple log
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Step 11 (Run Tests) is recorded as PASS based on the implement skill's
internal Step 16 gate (./scripts/run-tests.sh --full, all-green) per
test-run/SKILL.md § Functional Mode — same runner, immediately
preceding invocation, no value in a second run.

Step 12 (Test-Spec Sync, cycle-update mode):
  - traceability-matrix.md: rows added for AZ-491 AC-1..AC-6,
    AZ-493 AC-1..AC-6, AZ-495 (doc convention), AZ-496 AC-1..AC-N
    (dependency bump); AZ-494 AC-1/AC-2 rows now cross-reference
    new SEC-12 / SEC-13 blackbox IDs.
  - security-tests.md: SEC-12 (wrong iss returns 401) and SEC-13
    (wrong aud returns 401) appended for AZ-494.
  - environment.md: Environment Variables table extended with
    GOOGLE_MAPS_API_KEY, JWT_SECRET, JWT_ISSUER, JWT_AUDIENCE,
    INTEGRATION_TEST_DB_RESET. Closes a cycle-2 oversight where
    JWT_SECRET was never recorded.

Step 13 (Update Docs, task mode):
  - tests_unit.md: consolidated the duplicate
    AuthenticationServiceCollectionExtensionsTests entry that
    spanned AZ-487 + AZ-494 into one coherent block.
  - ripple_log_cycle3.md created: per-task source files +
    every doc that was touched (architecture, module-layout,
    api_program, tests_unit, tests_integration, traceability,
    performance-tests, security-tests, environment, security_report,
    owasp_review, deploy_cycle2, retro_2026-05-11_cycle2). Notes
    which docs were intentionally NOT touched and the open
    cross-repo doc ripple (AC-7).

Autodev state advanced to Step 13 completed. Next: Step 14 Security
Audit (optional gate).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 02:57:45 +03:00
Oleksandr Bezdieniezhnykh 495605f51b [AZ-494] [AZ-492] Cycle 3 Step 16: full test suite green; close batches
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Final cumulative review for batches 04-05 (PASS_WITH_WARNINGS, 4 Low
findings, all non-blocking). Combined with the prior 01-03 cumulative,
this closes the per-cycle batch coverage with two PASS_WITH_WARNINGS
verdicts.

scripts/run-tests.sh --full green: format check + 13 cycle-3 unit
tests (including the 4 new AZ-494 fail-fast cases for missing /
empty iss / aud) + the full integration suite (including the 2 new
WrongIssuer / WrongAudience 401 assertions).

Fixed a stale "leave blank to fall back" comment in .env.example
that contradicted the "REQUIRED" line right above it; the integration
runner reads env vars directly with no appsettings fallback so blank
values now fail-fast.

Advanced _docs/_autodev_state.md to mark Step 10 (Implement) status:
completed.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 02:35:10 +03:00
Oleksandr Bezdieniezhnykh f979e18811 [AZ-494] Enable JWT iss/aud validation with fail-fast startup
Option B per user decision: production ships with empty Jwt.Issuer /
Jwt.Audience in appsettings.json so the API process refuses to start
unless JWT_ISSUER + JWT_AUDIENCE env vars are supplied. Development
ships with grep-friendly DEV-ONLY- placeholders so local + docker
flows keep working unchanged.

AuthenticationServiceCollectionExtensions flips ValidateIssuer +
ValidateAudience to true and wires ValidIssuer / ValidAudience via a
new ResolveRequiredOrThrow helper that all three required values
(secret, iss, aud) now share. JwtTokenFactory.Create + CreateExpired
gain optional iss / aud parameters (default null) so existing call
sites compile unchanged. JwtTestHelpers adds MintAuthenticated /
MintExpired wrappers that resolve iss + aud from env, plus
ResolveIssuerOrThrow / ResolveAudienceOrThrow. PerfBootstrap.MintToken
+ Program.cs JWT bootstrap migrated to the new surface so the perf
harness and the integration runner both validate against the same
contract.

Adds 4 fail-fast unit tests (missing/empty issuer + audience), 2
negative integration scenarios (WrongIssuer_Returns401,
WrongAudience_Returns401), and re-tags every existing integration
mint site via MintAuthenticated.

Compose, .env.example, run-tests.sh, run-performance-tests.sh all
load + export JWT_ISSUER + JWT_AUDIENCE alongside JWT_SECRET.

Resolves F-AUTH-2 (security_report.md + owasp_review.md). AC-7
(cross-repo suite/_docs/10_auth.md write) deferred — outside this
workspace; tracked in deploy_cycle2.md R3 follow-up.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 02:28:48 +03:00
Oleksandr Bezdieniezhnykh 080441db5d [AZ-492] Cycle 3 batch 4: perf harness PT-07 + PT-08 + JWT-attach
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Drains all three deferred perf-harness items in one batch:
- PT-01..PT-06 now carry Authorization: Bearer minted via the canonical
  SatelliteProvider.TestSupport.JwtTokenFactory (AZ-491) — no third copy
  of JWT logic in the shell.
- PT-07 implemented as cold + warm dual-pass distribution (N=20 each),
  reports p50/p95 for both passes and fails if warm p95 >= cold p95.
- PT-08 implemented as 20-batch upload distribution with batch p95 gated
  at the AZ-488 2000 ms target; per-item gate cost reported as derived
  proxy (batch_p95 / batch_size).

New SatelliteProvider.IntegrationTests/PerfBootstrap.cs adds two CLI
short-circuit subcommands (--mint-only and --gen-uav-fixture <path>)
invoked by the shell so the perf script never inlines the JWT or
JPEG-fixture logic. The dispatch sits at the top of Program.cs Main
and runs before any HTTP / DB / readiness setup.

performance-tests.md PT-07 + PT-08 flip from Deferred to Implemented.
traceability-matrix.md PT-07 + PT-08 rows move from recorded to covered
(PT-08 partial due to per-item proxy — flagged Low in batch-4 review).
_docs/_process_leftovers/2026-05-11_perf-pt07-harness.md deleted; the
leftovers directory is now empty.

Closes cycle-2 retro Action 2; LESSONS.md [process] rule about Deferred
NFRs remains in force as a guardrail.

Also includes the previously-uncommitted cumulative review report for
cycle-3 batches 01-03 (generated at the end of batch 3 but not staged).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 01:52:25 +03:00
Oleksandr Bezdieniezhnykh 745f4840e6 [AZ-493] Cycle 3 batch 3: integration test DB-reset hook
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
AZ-493 (2 SP): replace the cycle-2 wallclock-seeded _coordinateCounter
workaround with a proper Postgres state-reset hook that runs at
integration test runner startup, eliminating the per-source-unique-index
collision risk that the persistent docker-compose Postgres volume
introduced post-AZ-484.

The reset is split into two surfaces:

* SatelliteProvider.TestSupport.IntegrationTestResetGuard - pure
  static class, I/O-free, unit-tested. Two independent guards: (a)
  ASPNETCORE_ENVIRONMENT must equal "Testing", (b) DB_CONNECTION_STRING
  Host must be in the allowed-host list (postgres, localhost, 127.0.0.1).
  Failure of either guard surfaces a structured operator-friendly
  InvalidOperationException.
* SatelliteProvider.IntegrationTests.IntegrationTestDatabaseReset -
  instance class owning the Npgsql side effects. Calls the guard then
  runs TRUNCATE TABLE route_regions, route_points, routes, regions,
  tiles RESTART IDENTITY CASCADE inside a single Npgsql transaction.

Spec-vs-reality: the task spec prescribed "DB name contains _test" as
Guard 2; the actual compose file uses Database=satelliteprovider and
DB rename is gated on user confirmation per coderule.mdc. Substituted
a Host allowlist as the equivalent guard (intent identical: reject
remote / production hosts). Recorded as Low/Spec-Gap in the review.

Program.cs adds --keep-state CLI flag and INTEGRATION_KEEP_STATE env
var (1/true) opt-outs so a developer can inspect leftover state when
debugging. Startup banner shows which path executed.
docker-compose.tests.yml gets ASPNETCORE_ENVIRONMENT=Testing +
passthrough for INTEGRATION_KEEP_STATE. scripts/run-tests.sh wires the
--keep-state flag through to compose.

UavUploadTests._coordinateCounter wallclock seed is retained as
defense-in-depth (per the task spec's implementer choice). The reset
is the primary isolation path; the seed is the belt-and-suspenders
fallback for --keep-state runs.

8 new unit tests in SatelliteProvider.Tests/TestSupport/
IntegrationTestResetGuardTests.cs cover Production/Staging/missing-env
throw, allowed-host case-insensitivity, disallowed-host rejection
with representative prod hostnames, and the AllowedHosts contract.

tests_integration.md gains a Reliability section that documents the
hook, the two guards, the truncate order, and the three opt-out forms.
module-layout.md TestSupport entry extended with the new pure guard
and the explicit "Npgsql stays in IntegrationTests" boundary.

Test-suite gate (AC-6) deferred to Step 16 Final Test Run per implement
skill convention. Per-batch review verdict: PASS_WITH_WARNINGS with 1
Low (spec-vs-reality on Guard 2, non-blocking).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 01:38:42 +03:00
Oleksandr Bezdieniezhnykh c396740644 [AZ-491] Cycle 3 batch 2: consolidate JWT test-mint helpers into TestSupport
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
AZ-491 (3 SP): eliminate the cycle-2 duplicate of JWT-minting logic
that existed in both SatelliteProvider.Tests/TestUtilities/
JwtTokenFactory.cs (unit-side) and SatelliteProvider.IntegrationTests/
JwtTestHelpers.cs (integration-side), where the same Expires <
NotBefore bug needed parallel fixes in commits f64d0d7 + 11b7074.

Option A chosen: new SatelliteProvider.TestSupport class library
(no test framework) holds the canonical JwtTokenFactory.Create /
CreateExpired / TamperSignature. Both Tests and IntegrationTests
consume it via ProjectReference; production projects (Api, Common,
DataAccess, Services.*) cannot depend on it. The notBefore-shift
workaround is preserved with an inline regression-prevention comment
back-referencing the cycle-2 fix commits.

SatelliteProvider.IntegrationTests/JwtTestHelpers.cs is stripped to
runner-only concerns: ResolveSecretOrThrow, AttachDefaultAuthorization,
and the DefaultSubject = "integration-tests" constant. Call sites in
Program.cs, JwtIntegrationTests.cs, and UavUploadTests.cs (10 sites)
switched to JwtTokenFactory.* with JwtTestHelpers.DefaultSubject
explicitly passed for the runner subject - behavior parity preserved.

Dockerfile for IntegrationTests gets the new TestSupport csproj
in its pre-restore COPY layer. Api Dockerfile unchanged (TestSupport
is NOT a production dependency).

A new code-review SKILL.md Phase 6 checklist row flags near-identical
helper logic across test projects as a Medium / Maintainability
finding with explicit cycle-2 retro back-reference, so this whole
pattern stops at one occurrence.

module-layout.md adds a TestSupport Shared/Cross-Cutting entry
documenting the production-isolation invariant. tests_unit.md +
tests_integration.md updated to describe the consolidated layout.
sln updated.

Test-suite gate (AC-2 + AC-3) deferred to Step 16 Final Test Run
per implement-skill convention. Per-batch review verdict:
PASS_WITH_WARNINGS with 1 Low (pre-existing 7.0.3 version pin
preserved verbatim from cycle-2 IntegrationTests csproj for parity;
not blocking; deferred bump).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 01:32:24 +03:00
Oleksandr Bezdieniezhnykh 9cfd80babe [AZ-495] [AZ-496] Cycle 3 batch 1: doc convention + AspNetCore 8.0.25
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
AZ-495 (1 SP): formalize the modules-only documentation convention for
the WebApi component. _docs/02_document/module-layout.md now carries an
explicit Documentation Layout section anchoring WebApi docs at
modules/api_program.md; the components/06_web_api/ folder is
intentionally absent. .cursor/skills/new-task/SKILL.md Step 4 directs
future agents at the correct path. Cycle-1 + cycle-2 F1 findings in the
two batch-review files are marked RESOLVED with back-reference to
AZ-495. Cycle-2 retrospective decision-item list F1 updated.

AZ-496 (2 SP): bump Microsoft.AspNetCore.OpenApi and JwtBearer in
SatelliteProvider.Api.csproj from 8.0.21 to 8.0.25, closing CVE-
2026-26130 (SignalR DoS - not reachable in this app, but the runtime
patch is the recommended hardening per cycle-1 D1 + cycle-2 D3).
SatelliteProvider.Tests.csproj has no direct JwtBearer reference - it
consumes JwtBearer transitively via ProjectReference to Api, so no
edit needed there. Dockerfiles use floating mcr.microsoft.com/
dotnet/aspnet:8.0 / sdk:8.0 / runtime:8.0 tags which auto-resolve to
>= 8.0.25 on rebuild. Security artifacts (dependency_scan.md,
security_report.md) and current-state docs (module-layout.md,
architecture.md, modules/api_program.md, modules/tests_unit.md)
updated to reflect 8.0.25.

Batch report + code review report (verdict PASS_WITH_WARNINGS with 2
Low findings, neither blocking) written under _docs/03_implementation.

Test suite gate deferred to Step 16 (Final Test Run) per implement
skill convention. Patch-level bump within .NET 8 LTS; regression risk
very low.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 01:24:48 +03:00
Oleksandr Bezdieniezhnykh 76076cbd90 [AZ-491] [AZ-492] [AZ-493] [AZ-494] [AZ-495] [AZ-496] Cycle 3 Step 9: 6 task specs
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Drains the cycle-2 retrospective top-3 improvement actions plus three
carried-forward security and process items, into 6 individually-
trackable PBIs:

- AZ-491 (3 SP) consolidate JWT test-mint helpers (Retro Action 1)
- AZ-492 (3 SP) perf harness PT-07 + PT-08 + JWT-attach (Retro Act. 2)
- AZ-493 (2 SP) integration test DB-reset hook (Retro Action 3)
- AZ-494 (2 SP) JWT iss/aud validation (cycle-2 F-AUTH-2 Medium)
- AZ-495 (1 SP) doc-folder convention for WebApi (cycle 1+2 F1 carry)
- AZ-496 (2 SP) bump AspNetCore 8.0.21 -> 8.0.25 (cycle 1+2 D1+D3)

All 6 at-or-below the user-rule 5 SP cap. AZ-494 gated on admin-team
confirming iss/aud values. Cycle 3 total: 13 SP.

Autodev pointer advances to Step 10 Implement.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 01:15:21 +03:00
Oleksandr Bezdieniezhnykh b69cf5640e [AZ-487] [AZ-488] retro: cycle 2 report + structural snapshot
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Cycle-2 retrospective covering AZ-487 + AZ-488. Captures six patterns
(duplicate JWT helpers diverged then both broke; pre-existing
test bugs unmasked by downstream test pressure; cycle 1 perf-NFR
action stopped adding scenarios but did not drain backlog; doc-path
F1 carried over twice with no decision; integration test DB
isolation = wallclock workaround; 8 SP friction observable even
with user override). Top-3 improvement actions: consolidate JWT
mint helpers, promote PT-07/PT-08/JWT-attach to real PBI, real
integration DB-reset hook.

LESSONS.md ring buffer now holds 6 entries (testing x3, process x2,
estimation x1).

Structural snapshot: 6 components / 12 PR edges unchanged; contract
coverage 14% -> 29%; new external NuGet edges (JwtBearer 8.0.21 +
ImageSharp 3.1.11) tied to cycle-2 security findings.

Autodev pointer advances to cycle 3 / Step 9 New Task.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 00:43:27 +03:00
Oleksandr Bezdieniezhnykh e9f4e84adb [AZ-487] [AZ-488] docs: cycle 2 deploy report
Per-cycle deploy report covering AZ-487 (JWT baseline) + AZ-488 (UAV
tile batch upload). Lists all 12 cycle-2 commits already pushed to
origin/dev, recaps Steps 11-15 gate outcomes, flags three
operator-gated risks (R1 consumer Bearer-token coordination, R2
JWT_SECRET prod-distinct verification, R3 GPS-permission claim
provisioning), documents rollback (image flip; zero schema change),
and lists deferred follow-ups (PT-07/PT-08 harness + run-perf script
JWT-attach, F1 doc-folder choice).

Advances autodev pointer to Step 17 (Retrospective).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 00:37:32 +03:00
Oleksandr Bezdieniezhnykh cbbb26bd28 [AZ-487] [AZ-488] chore: cycle 2 Step 15 skip + record JWT-attach script rot
Step 15 (Performance Test) — skipped per gate (option B). Recording two
deferred items in the existing perf leftover:

* PT-07 + PT-08 remain Deferred. Both NFRs depend on the same
  baseline-capture harness that has not landed; the integration-test
  fixtures needed for PT-08 already exist (UavUploadTests +
  UavTileImageFactory), so PT-08 attaches to the same harness as PT-07
  when implemented.
* scripts/run-performance-tests.sh PT-01..PT-06 currently return 401
  against the post-AZ-487 build because they attach no Bearer token.
  Script must mint an HS256 token from JWT_SECRET at script start
  before any curl call. Tracked in the leftover so PT-01..PT-06 are
  runnable again the same cycle PT-07/PT-08 are activated.

No code change in this commit — leftover + state advance only.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 00:35:18 +03:00
Oleksandr Bezdieniezhnykh 5214a4a647 [AZ-487] [AZ-488] security: cycle 2 delta audit (PASS_WITH_WARNINGS)
Step 14 (Security Audit) for cycle 2 — delta scan against the cycle-1
baseline. Verdict remains PASS_WITH_WARNINGS; no Critical/High.

Scope: JWT auth boundary (AZ-487) and UAV multipart upload + ImageSharp
decode of attacker-controlled bytes (AZ-488). Both new packages
(JwtBearer 8.0.21, ImageSharp 3.1.11 in Services.TileDownloader)
checked.

Cycle-2 delta:
* 0 Critical / 0 High
* 2 Medium: F-AUTH-2 (iss/aud not validated — by design until admin
  team publishes values, AZ-487 § Constraints), F-UAV-1 (ImageSharp
  decode now runs on attacker-controlled bytes — mitigations
  sufficient; pin to GHSA subscribe-and-bump policy).
* 4 Low: F-AUTH-1 (DEV-ONLY secret in appsettings.Development.json —
  accepted), F-AUTH-3 (rate-limit gap extends to 401 floods — folds
  into cycle-1 I3), F-UAV-2 (JsonDocument.Parse on signature-validated
  claims — bounded by Kestrel header cap), D3 (JwtBearer shares D1
  patch line).
* 1 Informational: F-UAV-3 (reject reasons disclose gate structure —
  accepted UX trade-off; documented in contract).

OWASP refresh: A01 / A07 move from N/A (with caveat) to
PASS_WITH_WARNINGS (per-tenant authz absent; iss/aud + revocation
gaps tracked).

Pre-deploy operational gate added: deploy pipeline must verify
JWT_SECRET != DEV-ONLY placeholder before promoting api.

Artifacts: dependency_scan.md, static_analysis.md, owasp_review.md,
infrastructure_review.md, security_report.md — all appended with a
"Cycle 2 Delta" section preserving cycle-1 finding IDs.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 00:13:58 +03:00
Oleksandr Bezdieniezhnykh e3cd388577 [AZ-487] [AZ-488] docs: cycle 2 doc sync (task mode)
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Step 13 (Update Docs) for cycle 2. Most cross-cutting docs were
already updated during Step 10 (architecture.md, glossary.md,
components/03_tile_downloader, modules/api_program.md, data_model.md,
contracts/api/uav-tile-upload.md). This commit completes the remaining
module-level + module-layout updates and writes the cycle-2 ripple log.

* modules/common_configs.md: + UavQualityConfig section and
  appsettings-section row (UavQuality).
* modules/common_dtos.md: + UavTileMetadata, UavTileBatchMetadataPayload,
  UavTileBatchUploadResponse, UavTileUploadResultItem, UavTileUploadStatus,
  UavTileRejectReasons (closed enumeration v1.0.0).
* module-layout.md: refresh Common (+ UavQualityConfig + UAV DTOs),
  TileDownloader (+ UavTileQualityGate, UavTileUploadHandler, +
  SixLabors.ImageSharp 3.1.11 PackageReference), and WebApi (+
  Authentication/*, DTOs/UavTileBatchUploadRequest, + JwtBearer 8.0.21
  PackageReference). Updates the "Last Updated" stamp to cycle 2.
* modules/tests_unit.md: replace the obsolete "only a dummy test"
  description; add cycle-2 AZ-487 / AZ-488 test classes
  (AuthenticationServiceCollectionExtensionsTests, JwtTokenFactoryTests,
  UavTileQualityGateTests, UavTileUploadHandlerTests, UavTileFilePathTests,
  PermissionsRequirementTests) + new ProjectReference / package
  references.
* modules/tests_integration.md: + JwtIntegrationTests, UavUploadTests
  (incl. wall-clock-seeded coordinate counter rationale from the Step 11
  fix), and the StubAndErrorContractTests update for the removed 501
  stub.
* ripple_log_cycle2.md (new): cycle-2 reverse-dependency scan results
  showing every importer of the new symbols resolves inside the three
  already-updated components (WebApi, TileDownloader, Common). No
  unexpected ripple, no heuristic fallback needed.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 00:04:05 +03:00
Oleksandr Bezdieniezhnykh 98cdcd17c1 [AZ-487] [AZ-488] docs: cycle 2 test-spec sync
Append cycle 2 entries to test-spec artifacts (cycle-update mode):

* security-tests.md: SEC-05..SEC-09 (AZ-487 JWT 401/403/parity)
  + SEC-10..SEC-11 (AZ-488 permission + reject-detail leak hygiene).
* blackbox-tests.md: BT-13..BT-17 (UAV happy / mixed / multi-source
  coexistence / same-source UPSERT / rule-ordering) + BT-18 (existing
  endpoints parity with Bearer token).
* resource-limit-tests.md: RL-05..RL-07 (MaxBatchSize, per-item MaxBytes,
  Kestrel/Form envelope cap).
* performance-tests.md: untouched (PT-08 already landed with AZ-488 as
  Deferred — see _docs/_process_leftovers/2026-05-11_perf-pt07-harness).
* traceability-matrix.md: append AC rows for AZ-487 AC-1..AC-8 and
  AZ-488 AC-1..AC-10 + AC-7a..AC-7e; annotate "No authentication"
  restriction as superseded by AZ-487+AZ-488; add NFR rows (perf,
  security, reliability, compatibility) for both tasks; refresh totals
  (78 tests; 47/47 ACs; 8/8 restrictions).

Coverage shape: AZ-487 AC-7 (Swagger Authorize) and the perf NFRs are
recorded but not actively measured this commit (manual UI smoke +
deferred PT-08 harness, respectively).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 00:00:14 +03:00
Oleksandr Bezdieniezhnykh dc3dabe7bd [AZ-488] fix: seed UavUploadTests coordinate counter from wall-clock
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Postgres data volume persists across docker-compose runs. The previous
`int _coordinateCounter = 0` reset on every test-runner process start
so the SECOND `--full` run collided with rows seeded by the first
`--smoke` run (the AC-3 MultiSourceCoexistence test does a raw INSERT
for the pre-seed step, not an UPSERT, and the unique constraint fires).

Seed the counter from a wall-clock value (~Unix epoch seconds mod 1M)
so each runner process picks a distinct coordinate band. Eliminates
inter-run collisions without coupling the test to docker volume reset.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 23:53:44 +03:00
Oleksandr Bezdieniezhnykh 1802d32107 [AZ-488] UAV tile batch upload + 5-rule quality gate
Replaces the 501 stub at POST /api/satellite/upload with a multipart
batch endpoint that ingests UAV-captured tiles, runs each item through
a 5-rule quality gate, and persists accepted tiles via the AZ-484
multi-source storage path with source='uav'.

Quality gate (in fixed order, first failure wins): JPEG format
(content-type + magic), size band 5 KiB-5 MiB, exact 256x256
dimensions, captured-at age (no future >30 s skew, no older than
7 days), luminance variance on 32x32 downsample. Closed reject-reason
enumeration in v1.0.0 contract.

Authorization: custom PermissionsRequirement / PermissionsAuthorization
Handler that reads the JWT `permissions` claim (tolerates both
repeated-string and JSON-array shapes). Endpoint protected by
RequiresGpsPermission policy; 401 without token, 403 without GPS perm.

Persistence: file-first to ./tiles/uav/{z}/{x}/{y}.jpg, then
ITileRepository.InsertAsync UPSERT (per-source UPSERT contract from
AZ-484). Per-item failures reported in response without aborting the
batch. Kestrel MaxRequestBodySize and FormOptions limits set to
MaxBatchSize x MaxBytes (default 100 x 5 MiB = 500 MiB).

New frozen contract: _docs/02_document/contracts/api/uav-tile-upload.md
v1.0.0. PT-08 NFR added to performance-tests.md as Deferred (harness
work tracked in PT-07 leftover, per AZ-488 § Risk 4).

Tests: 11 quality-gate unit tests, 5 handler unit tests, 3 file-path
unit tests, 12 permission-handler unit tests, 7 integration tests
(AC-1..AC-6, AC-8). All 253 unit tests + smoke integration suite
green.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 23:50:49 +03:00
Oleksandr Bezdieniezhnykh 11b7074485 [AZ-487] fix: integration-test JWT factory handles negative lifetime
Same fix as f64d0d7 applied to the integration tests' own copy of the
JWT mint helper. MintExpiredToken passes a negative lifetime which made
Expires < NotBefore and the JwtSecurityToken constructor rejected the
token before it could exercise lifetime-validation. Shift NotBefore
behind Expires for non-positive lifetimes.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 23:47:26 +03:00
Oleksandr Bezdieniezhnykh f64d0d760a [AZ-487] fix: JWT factory + tests now pass on net8.0
- JwtTokenFactory.Create: negative `lifetime` produced Expires < NotBefore
  which `JwtSecurityToken` rejects at construction time. Shift NotBefore
  behind Expires whenever the requested lifetime is non-positive so the
  expired-token fixture round-trips and lifetime validation can fire.
- JwtTokenFactoryTests: validate against a handler with
  `MapInboundClaims = false` so assertions read the factory's own claim
  names ("sub", "email", "permissions") rather than the .NET-default
  remapped ClaimTypes.* aliases.

These were latent — masked by the CS0104 build break fixed in 753be43.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 23:45:12 +03:00
Oleksandr Bezdieniezhnykh 753be43d11 [AZ-487] fix: resolve CS0104 ambiguity in AuthN tests
`AuthenticationServiceCollectionExtensions` is also a built-in
.NET type under `Microsoft.Extensions.DependencyInjection`. With
both namespaces imported the unqualified references in this test
file failed with CS0104, breaking the entire test project build.
Resolved via a `using` alias so the call sites stay short while
the build stays unambiguous.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 23:41:31 +03:00
Oleksandr Bezdieniezhnykh 96cd3c4495 [AZ-487] JWT validation baseline (HS256, all endpoints)
ci/woodpecker/push/01-test Pipeline failed
ci/woodpecker/push/02-build-push unknown status
Adds Microsoft.AspNetCore.Authentication.JwtBearer 8.0.21 and the
SatelliteProvider.Api.Authentication.AddSatelliteJwt extension that
validates HS256 tokens against a shared JWT_SECRET (>=32 bytes, fail
fast at startup). Every minimal-API endpoint now carries
.RequireAuthorization(); the middleware chain is UseExceptionHandler ->
UseHttpsRedirection -> UseCors -> UseAuthentication -> UseAuthorization
-> endpoints. Swagger UI gets a Bearer security definition so the
Authorize button works.

Test infrastructure: JwtTokenFactory (unit) and JwtTestHelpers
(integration) mint deterministic tokens against the same secret; the
integration test runner attaches a default Bearer token to its shared
HttpClient so existing tests continue to exercise protected endpoints.
JwtIntegrationTests adds AC-1..AC-4 and AC-7 (Swagger advertises
Bearer) end-to-end; AuthenticationServiceCollectionExtensionsTests
covers AC-5 (missing/empty/short secret fail-fast) plus env-var
precedence; JwtTokenFactoryTests covers AC-6 (claims pass through
the JwtSecurityTokenHandler.ValidateToken path JwtBearer uses).

docker-compose and scripts/run-tests.sh now propagate JWT_SECRET to
the api and integration-tests containers, with a >=32-byte guard.
.env.example documents the required keys; .env stays gitignored.

Code review verdict: PASS_WITH_WARNINGS (2 Low findings surfaced
in _docs/03_implementation/reviews/batch_01_cycle2_review.md).

Cross-component coordination: gps-denied-onboard and the mission
planner UI must attach Bearer tokens before this lands in dev.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 23:06:23 +03:00
Oleksandr Bezdieniezhnykh 8e15e53782 chore: cycle 2 step 9 task plan artifacts + step 10 state
Carries forward new-task research + solution drafts under
_docs/02_task_plans/uav-batch-upload/ that were not included in
the Step 9 task-spec commit (42a3cc7). Also marks the autodev
state as Step 10 in_progress for cycle 2 implementation.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 22:54:36 +03:00
Oleksandr Bezdieniezhnykh 42a3cc7467 [AZ-487] [AZ-488] Cycle 2 Step 9: JWT baseline + UAV upload task specs
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Created two PBIs for cycle 2 under epic AZ-483 (multi-source tile
storage + UAV upload). Splits the originally-planned single AZ-485
into 2 cohesive tasks because the combined scope was ~10 SP and
JWT auth is independently shippable:

- AZ-487 (2 SP) JWT validation baseline. Adds HS256 JwtBearer
  middleware against JWT_SECRET env var per the suite-level auth
  contract (suite/_docs/10_auth.md). Applies .RequireAuthorization()
  on all existing endpoints. Skips iss/aud validation (suite doc
  does not specify). No /users/me endpoints. Hard prerequisite for
  AZ-488.

- AZ-488 (8 SP, over-cap user-accepted) UAV tile upload endpoint
  with batch + 5-rule quality gate. Replaces the 501 stub. Multipart
  batch DTO, 5 quality rules (format, size band, dimensions,
  captured_at age 7d, blank/uniform variance heuristic). UAV files
  land at ./tiles/uav/{z}/{x}/{y}.jpg; google_maps grandfathered
  at bare ./tiles/{z}/{x}/{y}.jpg. Per-source UPSERT via the
  AZ-484 ITileRepository.InsertAsync. Sync 200 with per-item
  results. Requires GPS permission claim. Produces frozen contract
  uav-tile-upload.md v1.0.0.

Both Jira tickets created and linked. Dependencies table updated.
Autodev state advanced to cycle 2 Step 10 (Implement).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 21:04:49 +03:00
Oleksandr Bezdieniezhnykh 18609656f9 [AZ-484] Cycle 1 Step 17 Retrospective: report + structural snapshot
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Closes the AZ-484 cycle:
- retro_2026-05-11.md: 5 patterns identified (code-review-PASS does
  not imply runtime PASS; spec-authorship under-specifies wire
  format / test sites; NFR test-spec entries decoupled from runner
  scripts; pre-existing module doc staleness; pre-existing security
  Mediums now visible). Top-3 actions ranked by impact, with target
  rule/skill files and owners.
- structure_2026-05-11.md: baseline structural snapshot for future
  retro deltas (6 components, 12 ProjectReference edges, 0 cycles
  in import graph, 0 net architecture violations, 1 frozen
  contract, ~14% contract coverage).
- LESSONS.md: header rewritten to describe the two-layer format
  (deep lessons + 15-entry ring buffer); appended 3 new ring-buffer
  entries (testing/process/estimation) sourced from this retro.
- _autodev_state.md: cycle 2 starting at Step 9 New Task.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 10:08:47 +03:00
Oleksandr Bezdieniezhnykh 51b572108a [AZ-484] Cycle 1 Steps 12-16: docs, security, perf, deploy report
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Captures the post-implementation autodev gates for AZ-484 multi-source
tile storage:

- Step 12 (Test-Spec Sync): added 7 AC rows (AZ-484 AC-1..AC-7) and a
  PT-07 NFR row to traceability-matrix.md; added PT-07 scenario to
  performance-tests.md.
- Step 13 (Update Docs): refreshed data_model.md (tiles columns +
  indexes + selection rule + UPSERT contract + migrations 012/013),
  module-layout.md (Common/Enums section with L-001 guidance,
  DataAccess imports-from now lists 6 sites), 6 module / component
  docs to reflect the new repo signatures, source/captured_at fields,
  and Dapper enum bypass workaround. ripple_log_cycle1.md records
  zero out-of-scope ripple.
- Step 14 (Security Audit): PASS_WITH_WARNINGS - 0 Critical, 0 High,
  5 Medium, 5 Low. AZ-484 itself added zero new findings. Hardening
  items (Postgres default creds, .env in build context, GMaps key
  rotation, ASP.NET Core 8.0.21 -> 8.0.25, rate limiter) recorded
  for separate tickets.
- Step 15 (Performance Test): all PT-01..PT-07 scenarios Unverified
  (non-blocking); PT-07 baseline-comparison harness deferred to a
  leftover for next cycle.
- Step 16 (Deploy): cycle deploy report covering migration safety,
  rollback path, post-deploy verification, security caveats.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 10:03:05 +03:00
Oleksandr Bezdieniezhnykh e9d6db077c [AZ-484] Fix multi-source tile reads: drop Dapper enum handler
Two integration-test failures uncovered after the initial commit:

1) GetTilesByRegionAsync outer ORDER BY referenced 'updated_at' but
   the inner DISTINCT ON subquery aliased it to 'UpdatedAt' (Postgres
   folds to 'updatedat'). DISTINCT ON already guarantees one row per
   (latitude, longitude, ...) so the third tiebreak was unreachable;
   removed it.

2) Dapper 2.1.35 silently bypasses SqlMapper.TypeHandler<T> for enum
   types during read deserialization (Dapper issue #259). The
   TileSourceTypeHandler worked for writes but reads fell through to
   Enum.TryParse, which cannot map 'google_maps' to GoogleMaps.

   Pivoted: TileEntity.Source is now a string (the wire value).
   TileSource enum stays as the public producer surface in
   Common.Enums; TileSourceConverter (Common.Enums) provides
   ToWireValue / FromWireValue / IsValidWireValue at the boundary.
   TileSourceTypeHandler deleted; registration removed from
   DapperEnumTypeHandlers.RegisterAll.

   tile-storage.md Inv-5 amended to document the storage choice.
   _docs/LESSONS.md L-001 records the Dapper bypass for future cycles.

Full suite passes (213 unit + integration suite incl. AZ-484
AC-1..AC-5, security SEC-01..SEC-04, AZ-356/362/357).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 06:44:34 +03:00
Oleksandr Bezdieniezhnykh 687d6bdd5b [AZ-484] Multi-source tile storage: source + captured_at
Add per-source tile rows to support multi-provider imagery (Google
Maps + future UAV). Migration 013 (transactional) introduces
source/captured_at columns, backfills existing rows to
(source='google_maps', captured_at=created_at), and replaces the
4-column unique index with a 5-column index that includes source.

TileRepository:
- ColumnList includes source + captured_at
- GetByTileCoordinatesAsync returns most-recent row across sources
  (ORDER BY captured_at DESC, updated_at DESC, id DESC)
- GetTilesByRegionAsync uses DISTINCT ON to pick the most-recent
  tile per cell, restoring caller-facing row order
- Insert/Update upsert on the new 5-column conflict key

TileSource enum lives in Common.Enums. Snake_case wire format
(google_maps, uav) is enforced by a focused TileSourceTypeHandler
because the generic ToLowerInvariant pattern would emit
"googlemaps", violating contract v1.0.0.

TileService stamps Source=GoogleMaps + CapturedAt=UtcNow on every
new tile. Tile-storage contract is now frozen at v1.0.0.

AC coverage 7/7. New unit + integration tests cover all ACs;
existing 200 unit + 5 smoke tests preserved.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 06:21:59 +03:00
Oleksandr Bezdieniezhnykh 5ba58b6c8d [AZ-484] [AZ-483] Add task spec + tile-storage v1.0.0 contract draft
Step-9 (new-task) cycle 1 artifacts for the AZ-483 multi-source tile
storage epic. AZ-485 (UAV upload + quality gate) deferred to a future
Step-9 loop and recorded as planned in the dependencies table.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 06:05:06 +03:00
Oleksandr Bezdieniezhnykh 08451df027 [AZ-350] Close 03-code-quality-refactoring: Phase 6+7 + FINAL_report
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Phase 6 (Verification): smoke run green (format gate + 200/200
unit + integration smoke). verification_report.md captures
metric deltas vs Phase 0 baseline; all 5 ACs met, all 4
constraints honored, 0 regressions.

Phase 7 (Documentation):
- module-layout.md: corrected DataAccess->Common dependency
  (was mistakenly documented as "Imports from: (none)" by
  prior AZ-315 baseline; csproj reference + 7 import sites
  have actually been there since AZ-309).
- architecture_compliance_baseline.md: F5 entry revised to
  reflect the actual layering invariant (one-way: Common
  MUST NOT import from DataAccess, but DataAccess MAY
  import from Common).
- 00_discovery.md: added "Updates Since Baseline" section
  enumerating the AZ-309 split + AZ-350 27-change run +
  AZ-372 tooling additions; original tree kept as a
  2026-05-10 snapshot.

FINAL_report: complete run summary (10 batches, 27 tasks,
3 K=3 cumulative reviews, baseline->final metric table,
remaining items, lessons learned).

Autodev state: advance Step 8 -> Step 9 (New Task);
sub_step reset to phase 0 awaiting-invocation.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 05:23:35 +03:00
Oleksandr Bezdieniezhnykh 9a53bff92e [AZ-375] [AZ-377] HashSet tile lookup + consolidate Earth constants
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Batch 24 of 03-code-quality-refactoring run; closes the run.

AZ-375 (C22): GoogleMapsDownloaderV2.DownloadTilesGridAsync now
builds a HashSet<(int X, int Y, int Z)> once from existingTiles
and tests Contains((x, y, zoomLevel)) per cell. Removes the per-cell
FirstOrDefault tolerance scan and the unused _processingConfig
.LatLonTolerance reference at this site.

AZ-377 (C24): promote Earth + tile-pixel constants to a single
home. GeoUtils now exposes EarthRadiusMeters, EarthEquatorial
CircumferenceMeters, MetersPerDegreeLatitude as public const.
MapConfig adds DefaultTileSizePixels (const) wired as the
TileSizePixels property default. TileRepository and Google
MapsDownloaderV2 read those constants instead of duplicating
the literals 6378137, 40075016.686, 111000.0, and 256.

Tests: +6 new (DownloaderRefactorTests, extended GeoUtils
RefactorTests). 200/200 unit tests pass.

Cumulative K=3 review (batches 22-24): PASS_WITH_WARNINGS,
4 Low findings only — see
_docs/03_implementation/reviews/cumulative_review_22-24.md.

Tooling fix: scripts/run-tests.sh --unit-only path now restores
before testing (was failing on SixLabors resolution in clean
container). Stripped stray BOM from MapConfig.cs to satisfy the
.editorconfig charset gate.

Updates _dependencies_table.md to reflect all 27 03-code-quality-
refactoring tasks done; updates _autodev_state.md to refactor
phase 5 (test-sync).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 05:14:51 +03:00
Oleksandr Bezdieniezhnykh 6099d1c86b [AZ-376] [AZ-378] [AZ-379] [AZ-380] Repo cleanup: dead code, logger discipline, ColumnList consts
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Batch 23 of refactor 03-code-quality-refactoring (4 tasks, 5 SP):

- AZ-376 (C23): Delete unused FindExistingTileAsync from
  ITileRepository / TileRepository. No callers; method also took the
  obsolete `version` arg removed by C06/AZ-357.
- AZ-378 (C25): Repository _logger discipline.
  TileRepository.GetTilesByRegionAsync now emits LogWarning when the
  query exceeds SlowQueryThresholdMs (500 ms). RegionRepository and
  RouteRepository drop the unused ILogger<TRepo> field, parameter, and
  using; Program.cs DI registrations updated.
- AZ-379 (C26): Extract `private const string ColumnList` per repo
  (TileRepository, RegionRepository, RouteRepository); SELECTs use
  $@"SELECT {ColumnList} FROM ..." (C# 10+ const interpolation).
  INSERT/UPDATE/DELETE unchanged; route_points single-site SELECT left
  inline.
- AZ-380 (C27): Delete dead alias GeoUtils.CalculatePolygonDiagonalDistance.

Tests: +9 new (RepositoryRefactorTests x8, GeoUtilsRefactorTests x1)
covering each AC via reflection / file-content assertions; pattern
mirrors ToolingConfigurationTests (b22) and AcceptanceCriteriaRT2Tests
(b19). Unit suite 181 -> 190, all green. dotnet format clean.

Code review: PASS_WITH_WARNINGS (3 Low findings, all informational or
out-of-scope for this batch). See
_docs/03_implementation/reviews/batch_23_review.md.

Cumulative review counter 2/3; next K=3 review fires after batch 24.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:57:49 +03:00
Oleksandr Bezdieniezhnykh 534ab41b8e [AZ-372] Apply dotnet format whitespace cleanup; archive batch 22
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Pure whitespace-only cleanup uncovered by the new format gate from the
previous commit. Verified via `git diff -w --stat`: only 4 files differ
when whitespace is ignored, and those differ only by the BOM byte.

Cleanup kinds applied across 22 source files:
- BOM removal (MapConfig.cs, SatTile.cs, GeoUtils.cs,
  IntegrationTests/Program.cs)
- CRLF -> LF (IntegrationTests/Program.cs)
- Trailing whitespace on blank lines (Common, Api, DataAccess,
  IntegrationTests, Services.RegionProcessing,
  Services.TileDownloader)
- Final newline added (RoutePoint.cs, GeoPoint.cs, others)

After this commit `dotnet format whitespace SatelliteProvider.sln
--verify-no-changes` exits 0; AC-1 is enforceable from `scripts/
run-tests.sh` going forward.

Also lands the batch 22 report, code-review report
(PASS_WITH_WARNINGS, 2 Low findings — both deferred per spec),
dependency-table status update (AZ-372 -> Done (In Testing)), task
archive (todo/ -> done/), and autodev state update.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:43:08 +03:00
Oleksandr Bezdieniezhnykh 68359350fc [AZ-372] Add .editorconfig, Directory.Build.props, format/coverage wiring
Wires the C19 tooling baseline so dotnet format and Coverlet gate the
test script and a small NetAnalyzers ruleset (CA1001, CA1051, CA1816,
CA2227) at warning severity is visible from the next build.

- .editorconfig (new, root=true): whitespace rules, per-extension
  indent sizes, C# style preferences as suggestions, initial CA rules.
- Directory.Build.props (new): EnableNETAnalyzers=true,
  AnalysisLevel=latest, AnalysisMode=None so only rules explicitly
  enabled in .editorconfig fire; EnforceCodeStyleInBuild=false to keep
  build clean from style.
- scripts/run-tests.sh: Step 0 runs dotnet format whitespace
  --verify-no-changes via Docker SDK; unit/integration test calls now
  collect XPlat Code Coverage into TestResults/. New --skip-format
  escape hatch.
- .gitignore: TestResults/, coverage.cobertura.xml, *.coverage.
- SatelliteProvider.Tests/ToolingConfigurationTests.cs (new, 6 tests):
  runtime assertions that the config files, script wiring, and
  coverlet.collector reference are all in place; mirrors the
  AcceptanceCriteriaRT2Tests pattern.

Whitespace cleanup that the new format gate uncovers is staged for the
next commit (per AZ-372 spec: "commit cleanup as a separate batch").

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:42:45 +03:00
Oleksandr Bezdieniezhnykh 8fee955bb5 [AZ-350] autodev state: ready for batch 22 (AZ-372)
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:25:19 +03:00
Oleksandr Bezdieniezhnykh a7c622204f [AZ-350] Cumulative K=3 review for batches 19-21: PASS_WITH_WARNINGS
F1 (Low/Maintainability): module-layout.md docs stale on DataAccess
project reference after AZ-370; tracked for refactor Phase 7.
F2 (Low/Maintainability): redundant builder.Services.AddHttpClient()
in Program.cs after AZ-374; deferred per batch 21 design note.
No Critical/High findings; auto-chain to next batch (AZ-372).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:23:28 +03:00
Oleksandr Bezdieniezhnykh fae0d1cc34 [AZ-374] Update autodev state: cumulative-review pending
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:18:08 +03:00
Oleksandr Bezdieniezhnykh 7736cfd761 [AZ-374] Add batch 21 report; autodev state batch 21 complete
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:11:57 +03:00
Oleksandr Bezdieniezhnykh 89b4bfd245 [AZ-374] Refactor C21: named GoogleMapsTiles HttpClient in DI
- Register IHttpClientFactory named client "GoogleMapsTiles" inside
  AddTileDownloader() with User-Agent and 100s timeout (preserves
  HttpClient's implicit default).
- Resolve the same named client from all three CreateClient() call
  sites in GoogleMapsDownloaderV2 (session token, single-tile,
  batch-tile retry lambda) and drop the duplicated per-call
  UserAgent.ParseAdd setup.
- Expose USER_AGENT, the client name, and the timeout as internal
  consts on GoogleMapsDownloaderV2 so the extension and the
  downloader share one source of truth.
- Add AC test that builds the DI container, resolves the named
  client, and asserts both the User-Agent header and the timeout.
- Archive AZ-374 task file: todo/ -> done/.

175 unit + 5 smoke pass.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:11:57 +03:00
Oleksandr Bezdieniezhnykh 1ca8c80d7b [AZ-373] Add batch 20 report; autodev state batch 20 complete
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:05:45 +03:00
Oleksandr Bezdieniezhnykh 7c37636fdf [AZ-373] Refactor C20: drop MapsVersion from new writes (option a)
- Stop writing "downloaded_YYYY-MM-DD" into tiles.maps_version: new rows
  bind @MapsVersion to NULL via TileService.BuildTileEntity.
- Retain the tiles.maps_version column (coderule.mdc forbids unprompted
  column drops); pre-existing rows keep their values for forensics.
- Remove MapsVersion property from DownloadTileResponse (API wire shape)
  and TileMetadata (internal DTO); OpenAPI schema regenerates from the
  DTO via Swashbuckle.
- Add 3 AC tests in TileServiceTests covering the captured-entity write
  (AC-1) and the DTO/wire-shape removal (AC-2).
- Update integration-test local DTO + console output; refresh docs in
  common_dtos.md, services_tile_service.md, data_model.md.
- Archive AZ-373 task file: todo/ -> done/.

174 unit + 5 smoke pass.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 04:05:40 +03:00
Oleksandr Bezdieniezhnykh 45f7852fb2 [AZ-370] Add batch 19 report
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:56:27 +03:00
Oleksandr Bezdieniezhnykh 23ab05766d [AZ-370] Refactor C17: status / point-type enums + AC RT2 update
Replaces bare strings with two enums in Common/Enums/:
  RegionStatus { Queued, Processing, Completed, Failed }
  RoutePointType { Start, End, Action, Intermediate }

Adds a Dapper EnumStringTypeHandler<T> (DataAccess/TypeHandlers/)
that round-trips enums to/from lowercase strings, registered once
at startup via DapperEnumTypeHandlers.RegisterAll(). DataAccess now
references Common (project ref) so entities can carry the enum types.

Sites converted: RegionService (5), RouteProcessingService (3),
RoutePointGraphBuilder (4), entity Status/PointType columns. Log
message and summary file format preserved via .ToLowerInvariant().

API JSON contract preserved by adding JsonStringEnumConverter with
JsonNamingPolicy.CamelCase to the http JSON options — single-word
enum members serialize to the same lowercase strings as before.

DTO renamed: Common.DTO.RegionStatus -> RegionStatusResponse to
free the RegionStatus name for the new enum (forced by the task's
explicit enum name); the renamed DTO has no public-API impact at
the JSON wire level. Stale doc references updated.

AC RT2 in _docs/00_problem/acceptance_criteria.md now lists all 4
point types (start/end/action/intermediate).

Tests: 171 / 171 unit + 5 / 5 smoke green (was 141 + 5; +30 new tests
covering type handler round-trip, set/parse, unknown-value rejection,
idempotent registration, and the AC RT2 doc check).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:55:22 +03:00
Oleksandr Bezdieniezhnykh 6d98c8f8d1 [AZ-350] Cumulative K=3 review for batches 16-18: PASS (0 findings)
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:34:01 +03:00
Oleksandr Bezdieniezhnykh aa8beaa684 [AZ-371] Archive task file: todo -> done
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:30:25 +03:00
Oleksandr Bezdieniezhnykh 4456542cec [AZ-350] autodev state: batch 18 (AZ-371) complete
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:30:24 +03:00
Oleksandr Bezdieniezhnykh 1dcd089d39 [AZ-371] Refactor C18: magic numbers to ProcessingConfig/MapConfig
Promotes 8 operational levers into config keys with defaults that match
the prior source literals byte-for-byte:
  ProcessingConfig: RegionProcessingTimeoutSeconds (300),
  RouteProcessingPollIntervalSeconds (5),
  MaxRoutePointSpacingMeters (200), LatLonTolerance (0.0001).
  MapConfig: TileSizePixels (256), AllowedZoomLevels ([15..19]),
  RetryBaseDelaySeconds (1), RetryMaxDelaySeconds (30).

Sites updated: RegionService, RouteProcessingService,
RoutePointGraphBuilder, RouteValidator, RouteService 4-arg ctor,
RouteImageRenderer, GoogleMapsDownloaderV2, TileService. Closes LF-2 by
forwarding HttpContext.RequestAborted from GetTileByLatLon into the
downloader. appsettings.json gains the 8 new keys at default values.

Tests: 141 / 141 unit + 5 / 5 smoke green. New ConfigDefaultsTests pins
defaults to original literals; new TileService unit test asserts CT
identity from caller to downloader (AZ-371 AC-3).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:30:07 +03:00
Oleksandr Bezdieniezhnykh ee42b1716b [AZ-364] [AZ-360] Archive task files: todo -> done
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:13:01 +03:00
Oleksandr Bezdieniezhnykh ea1ca714f6 [AZ-350] autodev state: batch 17 (AZ-364, folds AZ-360) complete
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:12:59 +03:00
Oleksandr Bezdieniezhnykh 6f23120c49 [AZ-364] [AZ-360] Refactor C11+C08: decompose RouteProcessingService
Extracts RouteRegionMatcher, RouteCsvWriter, RouteSummaryWriter,
RouteImageRenderer, TilesZipBuilder, RegionFileCleaner from the
~750-LOC RouteProcessingService god-class. Moves TileInfo to its
own file as a sealed record. Replaces IServiceProvider scope-
locator with a direct IRegionService injection (folds AZ-360 / C08).
Updates DI registration and tests.

Tests: 133 / 133 unit + 5 / 5 smoke green; integration suite exit 0.
Pixel-equivalent stitched route image and byte-equivalent CSV /
summary / ZIP outputs verified through the smoke run.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 03:12:49 +03:00
Oleksandr Bezdieniezhnykh 70a0a2c4d5 [AZ-367] Archive task file: todo -> done
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:55:51 +03:00
Oleksandr Bezdieniezhnykh 3603dd319c [AZ-350] autodev state: batch 16 (AZ-367) complete
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:55:29 +03:00
Oleksandr Bezdieniezhnykh 10d31b4c1c [AZ-367] Refactor C14: extract shared TileGridStitcher
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:55:25 +03:00
Oleksandr Bezdieniezhnykh 23d513b24c [AZ-350] Remove processed leftover after archive replay
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:41:38 +03:00
Oleksandr Bezdieniezhnykh 7729cee2d3 [AZ-350] Archive previously-completed task files todo -> done
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:40:19 +03:00
Oleksandr Bezdieniezhnykh c98f6fde48 [AZ-350] Record leftover: 3 un-archived task files from batches 11-13
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:39:13 +03:00
Oleksandr Bezdieniezhnykh 1b62b3268a [AZ-350] autodev state: batch 15 (AZ-365) complete; K=3 review done
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:08:28 +03:00
Oleksandr Bezdieniezhnykh bcb9bf5130 [AZ-350] Cumulative code review batches 13-15: PASS, 0 findings
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:08:26 +03:00
Oleksandr Bezdieniezhnykh f7ad7aa5ab [AZ-365] Refactor C12: decompose RouteService.CreateRouteAsync
Extract RouteValidator (aggregating validator), RoutePointGraphBuilder
(point interpolation + sequence numbering), GeofenceGridCalculator
(NW/SE region centers), and RouteResponseMapper (entity -> DTO; also
used by GetRouteAsync, eliminating duplicate DTO assembly).

CreateRouteAsync shrinks 184 -> 52 LOC of orchestration. RouteService.cs
shrinks 295 -> 138 LOC overall. Validation aggregates all failures into
a single ArgumentException (AC-2); single-violation messages preserved
verbatim so existing RouteServiceTests pass unchanged. 28 new unit
tests for the four helpers (112/112 unit tests, smoke green).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 02:08:21 +03:00
Oleksandr Bezdieniezhnykh d327000fb6 [AZ-350] autodev state: batch 14 (AZ-369) complete
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 01:55:33 +03:00
Oleksandr Bezdieniezhnykh d2d9f6352b [AZ-369] Refactor C16: move inline DTOs out of Program.cs
Move 5 cross-component DTOs (GetSatelliteTilesResponse, SatelliteTile,
SaveResult, DownloadTileResponse, RequestRegionRequest) to
SatelliteProvider.Common/DTO/. Keep UploadImageRequest in the API
project under SatelliteProvider.Api.DTOs (IFormFile depends on
Microsoft.AspNetCore.Http; pulling it into Common would force an
ASP.NET framework reference into the foundation layer and break the
module-layout "Common: Imports from: (none)" invariant). Move
ParameterDescriptionFilter to SatelliteProvider.Api.Swagger.

Program.cs shrinks from 366 to 257 lines and now contains only
endpoint wiring (AC-1). JSON wire shape and Swagger schema names are
preserved (AC-2). 84 unit + full integration suite green (AC-3).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 01:54:35 +03:00
Oleksandr Bezdieniezhnykh 4c88201e5c [AZ-350] autodev state: batch 13 (AZ-368) complete
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 01:08:19 +03:00
Oleksandr Bezdieniezhnykh 89260d0ec4 [AZ-368] Refactor C15: extract shared TileCsvWriter
Both RegionService.GenerateCsvFileAsync and
RouteProcessingService.GenerateRouteCsvAsync wrote the same CSV
shape: header "latitude,longitude,file_path", same
OrderByDescending(Latitude).ThenBy(Longitude) ordering, same F6
numeric format. Two near-identical writers with no shared abstraction.

Extracted TileCsvWriter (instance class, no DI dependencies) plus a
TileCsvRow record bridging the per-pipeline DTOs (TileMetadata vs
TileInfo) to a single contract. The header constant, ordering rule,
and StreamWriter lifecycle now live in one place.

Both call sites collapse to a one-line projection plus a delegated
WriteAsync call. Region method becomes static (no longer references
instance state). Route method preserves its existing logger line.

Coverage:
- 7 new unit tests including a byte-for-byte equivalence test that
  writes the same input via both the new TileCsvWriter and the
  inlined-original code path side by side and asserts file bytes
  are identical.
- Integration smoke + full suite green; route + region CSV outputs
  unchanged across all existing scenarios (verified by extended-route
  CSV verification step in the integration suite).
- 84/84 unit tests pass (was 77).

Side improvement: writer now respects CancellationToken mid-loop.
The pre-refactor inline code did not. Strict improvement; consistent
with every other async API in the codebase.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 01:07:51 +03:00
Oleksandr Bezdieniezhnykh 098f905796 [AZ-350] Cumulative code review batches 10-12: PASS, 0 findings
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 01:01:46 +03:00
Oleksandr Bezdieniezhnykh cc54001e4d [AZ-350] autodev state: batch 12 (AZ-366) complete; K=3 review due
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:57:18 +03:00
Oleksandr Bezdieniezhnykh 330bccd724 [AZ-366] Refactor C13: consolidate Haversine + tile-coord parsing
RouteProcessingService.CalculateDistance(double, double, double, double)
re-implemented Haversine using EARTH_RADIUS=6371000 alongside the
canonical GeoUtils.CalculateDistance(GeoPoint, GeoPoint) which uses
EARTH_RADIUS=6378137. Two implementations of the same formula for the
same problem.

Separately, ExtractTileCoordinatesFromFilename in RouteProcessingService
parsed the tile_{z}_{x}_{y}_{ts}.jpg filename pattern that's *generated*
by StorageConfig.GetTileFilePath in another assembly — the writer and
parser were coupled by string convention only and a format change in
one would silently break the other.

Both fixes:

(a) Deleted the duplicate Haversine in RouteProcessingService. The
single call site (region-matching nearest-neighbor OrderBy) now uses
GeoUtils.CalculateDistance with GeoPoint instances. The constant
difference is monotonic-equivalent for OrderBy purposes — same region
is picked.

(b) Added static StorageConfig.TryExtractTileCoordinates(string, out
int, out int): bool — pure parser, co-located with GetTileFilePath so
the inverse-pair invariant is structural, not by-convention.
RouteProcessingService.ExtractTileCoordinatesFromFilename becomes a
thin wrapper that delegates to the helper and emits the existing
warning log on malformed input — the AZ-352 tests for warning behavior
all still pass.

Verification:
- 77/77 unit tests green (was 71 → +6 new StorageConfigTests including
  a writer/parser round-trip test for AC-2).
- Smoke + full integration suite green.
- AC-1 grep verification: Math.Sin/EARTH_RADIUS patterns are now
  confined to GeoUtils.cs only.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:56:46 +03:00
Oleksandr Bezdieniezhnykh fd8fc74249 [AZ-350] autodev state: batch 11 (AZ-362) complete
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:46:29 +03:00
Oleksandr Bezdieniezhnykh 2393bff1f2 [AZ-362] Refactor C09: idempotent POST contract for caller-supplied GUIDs
Both POST /api/satellite/request and POST /api/satellite/route accept
a caller-supplied id (Guid). Before this change, a retried POST with
the same id would either crash with a unique-key violation (regions)
or quietly create a divergent row (routes), neither of which matched
the documented intent of caller-supplied GUIDs.

RegionService.RequestRegionAsync and RouteService.CreateRouteAsync
now check for an existing row by id at the top of the method. If one
is found, the existing resource is returned with HTTP 200 and the
side effects (insert + enqueue + point regeneration + geofence-region
queueing) are all skipped. The Information-level log line on the
idempotent path makes retries observable.

OpenAPI Description metadata documents the contract on both endpoints
so client integrators see it in Swagger.

Coverage:
- 2 new unit tests (one per service) assert that on duplicate id no
  insert / enqueue / point-generation / region-queueing call is made.
- 2 new integration tests (IdempotentPostTests.cs) exercise the
  contract end-to-end via HTTP, asserting both calls return 200 and
  CreatedAt matches within 1ms (PostgreSQL truncates TIMESTAMP to
  microseconds while .NET DateTime keeps 100ns ticks; a real
  re-insertion would shift CreatedAt by milliseconds at minimum).

Note: the check-first pattern leaves a TOCTOU window for concurrent
retries. The repository unique key still surfaces the race as a
PostgresException which AZ-353 maps to a clean error. Acceptable for
realistic sequential-retry patterns; recorded in batch report as a
non-blocking observation.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:45:51 +03:00
Oleksandr Bezdieniezhnykh 546ddb3e6c [AZ-357] AC-2 follow-up: populated-duplicates migration test
Closes the partial-coverage gap from batch 10. Adds two integration
tests in MigrationTests.cs:

- DedupeSqlCollapsesDuplicatesByLatestUpdatedAt_AZ357_AC2: seeds a
  session-scoped temp table with intentional 4-column duplicates
  (varying updated_at and id), runs the exact dedupe SQL from
  migration 012, asserts only the expected rows survive (newest
  updated_at wins; ties broken by largest id).
- NewUniqueConstraintExistsOnFourColumns_AZ357_AC2: queries
  pg_indexes against the live DB to assert idx_tiles_unique_location
  is a unique 4-column btree and excludes the version column.

Also wires Npgsql 9.0.2 into the integration-tests project, exposes
DB_CONNECTION_STRING + postgres healthcheck dependency to the test
container in docker-compose.tests.yml, and registers the new tests
in both smoke and full suites.

Implementation note: first attempt used CREATE TEMP TABLE
ON COMMIT DROP, which dropped the table immediately because each
Npgsql command runs in its own implicit transaction. Removed
ON COMMIT DROP — session-scoped temps are dropped on connection
close, which is what we want.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:45:24 +03:00
Oleksandr Bezdieniezhnykh 581dff206e [AZ-357] Refactor C06: drop tile Version concept; cumulative review batches 7-9
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
AZ-357 — eliminate year-based tile cache expiry (LF-1):
- Migration 012: drop 5-col unique index, dedupe by (lat,lon,zoom,
  size) keeping max(updated_at), add new 4-col unique index, make
  version column nullable + drop default. Column itself preserved
  per coderule (column drops require explicit confirmation; tracked
  in AZ-373 / C20).
- TileEntity.Version, TileMetadata.Version, DownloadTileResponse.
  Version: int -> int? (HTTP shape preserved; field still in JSON).
- TileService.DownloadAndStoreTilesAsync: drop currentVersion year
  computation and the .Where(t => t.Version == currentVersion)
  cache filter. BuildTileEntity: drop year arg; write Version=null.
- TileRepository: ON CONFLICT now 4-col; lookup queries
  ORDER BY updated_at DESC instead of version DESC.
- Tests: replace inverted BT02b with positive AZ357_AC1
  (prior-year cached tile is reused). Add BuildTileEntity_
  DoesNotPopulateVersion_AZ357 to enforce the no-write contract.
- 69 unit + 5 smoke + 3 stub-contract integration tests pass.

Cumulative code review (batches 7-9, 7 tasks): VERDICT=PASS.
Report at _docs/03_implementation/reviews/batch_09_review.md.
Zero Critical/High/Medium/Low findings. Architecture baseline
remains clean.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:20:47 +03:00
Oleksandr Bezdieniezhnykh 5a28f67d33 [AZ-359] Refactor C07: collapse RegionService catch ladder via RegionFailureClassifier
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Replace 9 nearly-identical catch blocks in
RegionService.ProcessRegionAsync with a single catch (Exception ex)
that delegates to RegionFailureClassifier.Classify, returning a
typed (category, errorMessage) pair. Preserves all original error
messages stored in region summary files; failure-path call into
HandleProcessingFailureAsync is unchanged.

Net source delta: -38 lines in RegionService, +71 lines in new
RegionFailureClassifier (pure static), +10 unit tests covering
each category, precedence, status-code propagation, null guard.

Tests: 68 unit (was 58) + 5 smoke + 3 stub-contract integration
tests pass.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 00:07:31 +03:00
Oleksandr Bezdieniezhnykh 1d89cd9997 [AZ-353][AZ-354][AZ-356] Refactor 03 batch 2: harden API surface
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
AZ-353: Centralize 500 handling via GlobalExceptionHandler /
AddProblemDetails / UseExceptionHandler. Sanitized ProblemDetails
body carries a generic title, RFC9110 type link, and the request's
TraceIdentifier as correlationId; the leaky exception message stays
server-side in the ERR log entry. Strip per-endpoint
try/catch (Exception) wrappers and the unused ILogger<Program>
parameters they served. Preserve the typed ArgumentException catch
in CreateRoute (AC-3). The handler maps BadHttpRequestException
back to its framework-supplied StatusCode so model-binding /
malformed-body failures stay 4xx instead of being promoted to 500.

AZ-354: Extract CorsConfigurationValidator (pure static helpers)
and wire it into Program.cs. Production with empty
CorsConfig:AllowedOrigins and no CorsConfig:AllowAnyOrigin opt-in
now throws InvalidOperationException at host startup. Development
keeps the permissive default but logs a warning post-build. Adds
the explicit CorsConfig:AllowAnyOrigin escape hatch.

AZ-356: GetSatelliteTilesByMgrs and UploadImage now return
Results.Problem(StatusCode 501) with ProblemDetails. Added
.ProducesProblem(501) so swagger.json documents the
not-implemented status.

Tests: SatelliteProvider.Tests now references SatelliteProvider.Api
(downward, idiomatic) so unit tests can reach the new helpers.
+9 CorsConfigurationValidator unit tests, +3
GlobalExceptionHandler unit tests, +3 StubAndErrorContractTests
integration tests (added to smoke + full suites).

58/58 unit + 5/5 smoke + 3/3 stub-contract pass.
Code review verdict: PASS.
Batch report: _docs/03_implementation/batch_08_report.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 23:52:52 +03:00
Oleksandr Bezdieniezhnykh de4d4fa760 [AZ-351][AZ-352][AZ-363] Refactor 03 batch 1: critical defensive fixes
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
AZ-351: Resolve ILogger<DatabaseMigrator> directly from DI in
Program.cs instead of casting ILogger<Program> (which always
returned null). Migrator now logs through Serilog at startup.

AZ-352: Drop empty catch in
RouteProcessingService.ExtractTileCoordinatesFromFilename. Convert
the method from private static to internal instance so it can use
the existing _logger (per coderule: side-effecting code must not be
static). Add typed null-guard via ArgumentNullException.ThrowIfNull
so unexpected exceptions propagate. Adds InternalsVisibleTo on the
RouteManagement csproj for SatelliteProvider.Tests, plus 4 unit
tests in RouteProcessingServiceTests.cs covering AC-1 (valid /
malformed / non-numeric) and AC-2 (null path propagation).

AZ-363: Delete _totalEnqueued / _totalDequeued fields and the two
non-atomic ++ writes in RegionRequestQueue. Fields were write-only
dead code and a thread-safety hazard.

Tests: 44/44 unit + 5/5 smoke (scripts/run-tests.sh --smoke).
Code review verdict: PASS, 0 findings, 0 auto-fix attempts.
Batch report: _docs/03_implementation/batch_07_report.md.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 23:34:17 +03:00
Oleksandr Bezdieniezhnykh ff030a9521 [AZ-350] Refactor 03 Phase 2: roadmap + 27 task specs + safety net
Adds Phase 0 (baseline metrics, .gitignore tweaks), Phase 1
(research findings, list-of-changes), and Phase 2 (refactoring
roadmap, epic AZ-350, 27 task specs AZ-351..AZ-380, dependency
table updates) for the 03-code-quality-refactoring run.

Phase 3 (Safety Net) re-verified: 40/40 unit + 5/5 smoke
integration pass; documented in test_specs/existing_coverage.md.
Coverage % gating deferred to ticket C19 (AZ-372) which adds
Coverlet + reportgenerator.

Auto-chains to Phase 4 (Execution) via /implement starting at
batch 1 (Phase 1 critical fixes).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 23:26:07 +03:00
Oleksandr Bezdieniezhnykh 524809d77d [AZ-309] Close coupling refactor with FINAL_report
Adds _docs/04_refactoring/02-coupling-refactoring/FINAL_report.md
recording goal achievement, baseline-vs-final metrics, batch
summaries, and lessons learned. Advances autodev state from
Step 8 (Refactor) to Step 9 (New Task) — Phase A baseline setup
is now complete; Phase B feature cycle starts on next invocation.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 14:48:33 +03:00
Oleksandr Bezdieniezhnykh 6b373082c8 [AZ-315] Sync architecture docs after coupling refactor
Phase C of architecture coupling refactor (epic AZ-309). Closes the
last baseline finding (F5 — DataAccess incorrectly documented as
importing Common) and synchronizes the rest of _docs/02_document/
with the post-split project layout from AZ-312/313/314:

- module-layout.md: per-component sections for the three new csprojs
  with explicit ProjectReferences and the no-cross-sibling-reference
  invariant the split enforces.
- architecture.md: components and internal-communication tables
  updated to show calls flow through Common interfaces.
- architecture_compliance_baseline.md: F1..F5 marked Resolved with
  task IDs and commit refs; baseline summary now 0 findings.
- diagrams/components.md, components/03_tile_downloader/description.md,
  modules/{common_interfaces,services_tile_service,
  services_google_maps_downloader,tests_unit}.md updated for the
  split, RateLimitException relocation, and new ITileService methods.

Documentation-only batch — no code, no tests, no build changes.
Epic AZ-309 complete (6 tasks across 3 batches).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 07:25:21 +03:00
Oleksandr Bezdieniezhnykh 8b0ddae075 [AZ-312] [AZ-313] [AZ-314] Split Services into per-component csprojs
Phase B of architecture coupling refactor (epic AZ-309). Replaces
the monolithic SatelliteProvider.Services with three per-component
csprojs to add a compiler-enforced module boundary (resolves F4):

- SatelliteProvider.Services.TileDownloader
- SatelliteProvider.Services.RegionProcessing
- SatelliteProvider.Services.RouteManagement

DI registrations relocated into per-component AddTileDownloader /
AddRegionProcessing / AddRouteManagement extension methods called
from Program.cs. RateLimitException moved to Common/Exceptions/ to
keep the three new csprojs as siblings (no Region->TileDownloader
ProjectReference). Dockerfiles and consumer csprojs (Api, Tests)
rewired to the new project paths. No DI lifetime or hosted-service
order changes.

Build: 0 warn, 0 err. Unit tests: 40/40. Smoke integration: green.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 07:15:44 +03:00
Oleksandr Bezdieniezhnykh 12b582deac [AZ-310] [AZ-311] Route tile endpoints through ITileService
Move cache+DB+download logic for /tiles/{z}/{x}/{y} and
/api/satellite/tiles/latlon out of Program.cs into TileService.
Endpoints now inject only ITileService + ILogger. Service owns
IMemoryCache (1h absolute / 30min sliding preserved). Added
TileBytes DTO; ITileService gains GetOrDownloadTileAsync and
DownloadAndStoreSingleTileAsync. 5 new unit tests cover cache
hit, repo hit, downloader fallback, and AZ-311 happy + error.

Build clean (0/0), unit suite 40/40. Resolves architecture
baseline F3 in code (docs handled by AZ-315).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 06:06:11 +03:00
Oleksandr Bezdieniezhnykh 220277b9c7 [AZ-309] Refactor 02-coupling-refactoring Phase 0-2 artifacts
- Baseline metrics, list of changes, and analysis (research findings,
  refactoring roadmap) for the coupling refactor run
- Six task specs AZ-310..AZ-315 covering endpoint routing through
  ITileService, Services csproj split, consumer rewire, DI extension
  methods, and docs sync
- Existing test coverage assessment for Phase 3 safety net gate
- Dependencies table updated with the refactor block

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 05:53:29 +03:00
Oleksandr Bezdieniezhnykh cc0a876168 [AZ-AUTODEV] Step 7 (Run Tests): smoke profile + green run
Add a fast integration profile so Step 7 (and future autodev
re-entries) can verify the full stack in ~2 min instead of ~15 min,
without losing access to the long-running coverage when needed.

- TestRunMode.cs: smoke flag + tightened poll/timeout values.
- Program.cs: env var INTEGRATION_TESTS_MODE / --smoke|--full CLI
  switch; smoke runs Tile + 200m region + simple route + ZIP route +
  Security; full keeps the existing sequence.
- RegionTests / ExtendedRouteTests: read timeouts from TestRunMode
  instead of hardcoding 120/180/360.
- docker-compose.tests.yml: forwards INTEGRATION_TESTS_MODE to the
  integration-tests container (default 'full').
- scripts/run-tests.sh: adds --unit-only / --smoke / --full flags,
  loads .env automatically, fails fast if GOOGLE_MAPS_API_KEY is
  missing.

Step 7 result: all tests passed in 111.86 s wall-clock (35/35 unit +
5/5 smoke integration scenarios incl. SEC-01..04). Report saved to
_docs/03_implementation/test_run_step7.md.

State advanced to Step 8 (Refactor).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 05:29:10 +03:00
Oleksandr Bezdieniezhnykh d1624e6d54 [AZ-AUTODEV] Step 6 (Implement Tests) complete; advance to Step 7
Final implementation report for the test step. All six test tasks
(AZ-285..AZ-290) completed across three batches. Code review verdicts:
all PASS_WITH_WARNINGS. Run-Tests handoff recorded for Step 7 per
implement skill Step 16.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 05:12:38 +03:00
Oleksandr Bezdieniezhnykh 7822841587 [AZ-289] [AZ-290] Batch 3 tests: integration ZIP cap, perf, security, queue
AZ-289 — RL-01 50MB ZIP cap added to RunRouteWithTilesZipTest;
existing integration tests already cover BT-08/BT-09 + AC-1/AC-2.

AZ-290:
- scripts/run-performance-tests.sh extended with PT-01/03/04/05
- SatelliteProvider.IntegrationTests/SecurityTests.cs (SEC-01..SEC-04),
  wired into Program.cs
- SatelliteProvider.Tests/RegionRequestQueueTests.cs covering RS-04 /
  RL-02 queue capacity behavior

Notes:
- RS-04 spec wording ("rejects overflow") drifts from the channel's
  BoundedChannelFullMode.Wait back-pressure semantics. Tests assert
  the actual behavior; spec to be reconciled in Step 12 (Test-Spec
  Sync). Tracked as Low/Spec-Gap in batch_03_review.md.
- Unit tests: 35/35 passed (Docker .NET 8 SDK).
- Integration test project builds clean (0 warnings, 0 errors).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 05:10:30 +03:00
Oleksandr Bezdieniezhnykh dea0b8b4c0 [AZ-286] [AZ-287] [AZ-288] Service-level unit tests (mocked deps)
Batch 2: 27 new tests, 31/31 total passing.

AZ-286 TileService (9 tests):
- BT-01 download stores tiles via repository
- BT-02 cache reuse passes existing tiles to downloader
- BT-02b stale-version cached tiles are NOT treated as still-valid
- BT-N02 GoogleMapsDownloaderV2 rejects zoom levels not in {15..19}
- GetTileAsync known/unknown id

AZ-287 RegionService (6 tests):
- AC-1 RequestRegionAsync inserts entity and queues request
- AC-2/AC-3 ProcessRegionAsync transitions queued→processing→completed
  and writes CSV + summary files
- AC-4 stitch path is set when stitchTiles=true
- ProcessRegionAsync handles RateLimitException by writing failed
  summary with the error message preserved
- GetRegionStatusAsync mapping
- Missing region id is logged and short-circuits without DB writes

AZ-288 RouteService (12 tests):
- AC-1 every consecutive pair is ≤200m apart on 2-point route
- AC-2 first/last user points keep their roles; middles are interpolated
- BT-10 10pt route → 1 start, 1 end, 8 action, plus intermediates
- BT-12 20pt route → 1 start, 1 end, 18 action
- AC-3 TotalDistanceMeters == Σ Haversine(point[i-1], point[i])
- BT-N03 < 2 points throws ArgumentException
- regionSize <100 / >10000 throws ArgumentException
- BT-N04 (0,0) corner throws
- BT-N05 inverted NW.lat <= SE.lat throws
- BT-11 valid geofence creates region requests + links them with
  isGeofence=true
- BT-07 GetRouteAsync returns route + points
- GetRouteAsync unknown id returns null

Code review: PASS_WITH_WARNINGS — 3 Low Spec-Gap findings:
  1. BT-N01 (invalid coords) deferred to AZ-289 integration tests
     (TileService doesn't validate coordinates — API-layer concern)
  2. Point-type label drift between blackbox-tests.md ("original")
     and code ("start"/"action"/"end")
  3. Region status drift: spec "pending" vs code "queued" — initial
     state on insert

Verification: docker dotnet test — 31/31 pass in 0.83s.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 05:02:00 +03:00
Oleksandr Bezdieniezhnykh 3d112c0f47 [AZ-285] Add batch 1 implementation report
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 04:56:11 +03:00
Oleksandr Bezdieniezhnykh 853b0a63df [AZ-285] Test infrastructure: scaffold unit test project + fixtures
- Add SatelliteProvider.DataAccess project reference to test csproj
  (enables mocking ITileRepository / IRegionRepository / IRouteRepository)
- Replace DummyTest placeholder with InfrastructureTests covering:
  * All mockable interfaces (ISatelliteDownloader, repos, queue, services)
    can be mocked via Moq
  * TileService can be constructed with mocked dependencies
  * Test coordinate fixtures load with expected values
- Add Fixtures/TestCoordinates.cs with REG-01..REG-03 + ROUTE-01/04/06
  shared test data
- Archive AZ-285 to _docs/02_tasks/done/
- Batch 1 review report: PASS_WITH_WARNINGS (Low/Spec-Gap deferred AC-2,
  Low/Maintainability pre-existing FluentAssertions 8.x license note)

Verification: docker dotnet test run — 4/4 tests pass in 2.35s.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 04:55:28 +03:00
Oleksandr Bezdieniezhnykh b0fffa6d42 [AZ-284] Autodev baseline + testability refactor
Phase A baseline outputs from /autodev (Steps 1-5):
- Problem & solution docs (_docs/00_problem, _docs/01_solution)
- Codebase documentation (_docs/02_document) incl. architecture,
  module-layout, glossary, system-flows, baseline compliance scan
- Test specs (blackbox, performance, resilience, security, resource,
  traceability matrix)
- Test task decomposition (_docs/02_tasks/todo): AZ-285..AZ-290
- Testability refactor (_docs/04_refactoring/01-testability-refactoring):
  - TC-01 Move DownloadedTileInfoV2 + new ExistingTileInfo to Common.DTO
  - TC-02 Replace dead ISatelliteDownloader API with real signatures
  - TC-03 GoogleMapsDownloaderV2 implements ISatelliteDownloader
  - TC-04 TileService depends on ISatelliteDownloader (mockable)
  - TC-05 DI + endpoints use ISatelliteDownloader
- Test runner scripts (scripts/run-tests.sh, run-performance-tests.sh)
- Autodev state pointer (_docs/_autodev_state.md)

Prepares the codebase for AZ-285..AZ-290 unit/integration test work.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-10 04:44:08 +03:00
Oleksandr Bezdieniezhnykh 25a644a9bf chore: sync .cursor from suite
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
2026-05-09 05:18:10 +03:00
Oleksandr Bezdieniezhnykh 9148a1def4 chore: sync .cursor from suite
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
2026-05-05 01:08:49 +03:00
Oleksandr Bezdieniezhnykh 191cdb80a6 chore: sync .cursor skills from suite
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
2026-05-03 17:43:27 +03:00
Oleksandr Bezdieniezhnykh 59244c1d24 chore: sync .cursor skills from suite
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
2026-04-29 17:03:57 +03:00
Oleksandr Bezdieniezhnykh 6dadb13198 chore: sync .cursor from suite
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
Made-with: Cursor
2026-04-25 19:45:03 +03:00
Oleksandr Bezdieniezhnykh ed84af8797 Remove obsolete Woodpecker CI configuration for ARM builds. This deletion streamlines the pipeline by eliminating unused build steps and settings related to ARM architecture, which are no longer necessary.
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
2026-04-25 06:53:03 +03:00
483 changed files with 40258 additions and 2642 deletions
+136 -72
View File
@@ -11,10 +11,20 @@ If you want to run a specific skill directly (without the orchestrator), use the
```
/problem — interactive problem gathering → _docs/00_problem/
/research — solution drafts → _docs/01_solution/
/plan — architecture, components, tests → _docs/02_document/
/decomposeatomic task specs → _docs/02_tasks/todo/
/implementbatched parallel implementation → _docs/03_implementation/
/deploy containerization, CI/CD, observability → _docs/04_deploy/
/plan — architecture, ADRs, components, tests, epics → _docs/02_document/
/test-specblackbox/perf/resilience/security test specs → _docs/02_document/tests/
/decompose — atomic task specs (multi-mode) → _docs/02_tasks/todo/
/implementsequential dependency-aware batches with code review and completeness gates → _docs/03_implementation/
/test-run — runs the test suite (functional / perf modes) with gating
/code-review — multi-phase review used by /implement
/refactor — 8-phase structured refactoring (incl. testability sub-mode) → _docs/04_refactoring/
/security — OWASP-driven audit → _docs/05_security/
/deploy — containerization, CI/CD, environments, observability, procedures, scripts → _docs/04_deploy/
/release — execute deploy artifacts in prod, smoke-test, watch, decide rollback → _docs/04_release/
/document — bottom-up reverse-engineering of an existing codebase → _docs/02_document/
/new-task — interactive feature planning for an existing codebase → _docs/02_tasks/todo/
/ui-design — HTML+CSS mockups + design system → _docs/02_document/ui_mockups/
/retrospective — metrics + lessons log → _docs/06_metrics/ + _docs/LESSONS.md
```
## How It Works
@@ -41,148 +51,201 @@ The state file tracks completed steps, key decisions, blockers, and session cont
Skills auto-chain without pausing between them. The only pauses are:
- **BLOCKING gates** inside each skill (user must confirm before proceeding)
- **Session boundary** after decompose (suggests new conversation before implement)
- **Session boundaries** declared in each flow's auto-chain rules (e.g., after `decompose`, after `decompose tests`) — suggested new-conversation breakpoints to keep context fresh
A typical project runs in 2-4 conversations:
- Session 1: Problem → Research → Research decision
- Session 2: Plan → Decompose
- Session 3: Implement (may span multiple sessions)
- Session 4: Deploy
There are three flows, resolved on every invocation (see `skills/autodev/SKILL.md` § Flow Resolution):
Re-entry is seamless: type `/autodev` in a new conversation and the orchestrator reads the state file to pick up exactly where you left off.
| Flow | When | Steps |
|------|------|-------|
| **greenfield** | empty workspace, no source yet | 17 steps: Problem → Research → Plan → UI Design → Test Spec → Decompose → Implement → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (opt) → Performance Test (opt) → Deploy → Release → Retrospective |
| **existing-code** | source files present | one-time baseline (Document → Architecture Baseline Scan → Test Spec → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → optional Refactor) then a feature-cycle loop (New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security Audit (opt) → Performance Test (opt) → Deploy → Release → Retrospective → loops back to New Task) |
| **meta-repo** | `.gitmodules`, workspace manifest, or multi-component aggregator | uses `monorepo-*` skills + `_docs/_repo-config.yaml` instead of per-component BUILD-SHIP folders |
A typical greenfield project spans several conversations because of session boundaries. Re-entry is seamless: type `/autodev` in a new conversation and the orchestrator reads `_docs/_autodev_state.md` to pick up exactly where you left off.
## Skill Descriptions
### autodev (meta-orchestrator)
Auto-chaining engine that sequences the full BUILD → SHIP workflow. Persists state to `_docs/_autodev_state.md`, tracks key decisions and session context, and flows through problem → research → plan → decompose → implement → deploy without manual skill invocation. Maximizes work per conversation with seamless cross-session re-entry.
Auto-chaining engine that sequences the full BUILD → SHIP → EVOLVE workflow. Persists state to `_docs/_autodev_state.md`, surfaces top-3 lessons from `_docs/LESSONS.md` at every invocation, replays any `_docs/_process_leftovers/` entries, tracks key decisions and session context, and flows through the active flow's steps without manual skill invocation. Maximizes work per conversation with seamless cross-session re-entry.
### problem
Interactive interview that builds `_docs/00_problem/`. Asks probing questions across 8 dimensions (problem, scope, hardware, software, acceptance criteria, input data, security, operations) until all required files can be written with concrete, measurable content.
Interactive 4-phase interview that builds `_docs/00_problem/`. Asks probing questions across 8 dimensions (problem & goals, scope, hardware & environment, software & tech, acceptance criteria, input data, security, operational) until all required files can be written with concrete, measurable, quantifiable content. Acceptance criteria must include numeric targets; input data must include `expected_results/` mappings.
### research
8-step deep research methodology. Mode A produces initial solution drafts. Mode B assesses and revises existing drafts. Includes AC assessment, source tiering, fact extraction, comparison frameworks, and validation. Run multiple rounds until the solution is solid.
8-step deep research methodology. Mode A produces initial solution drafts. Mode B assesses and revises existing drafts. Classifies output as **Technical-component selection** (full per-mode API verification gates apply) or **Non-technical investigation** (gates relaxed). Source tiering, fact extraction, comparison frameworks, validation, exact-fit component selection. Run multiple rounds until the solution is solid.
### plan
6-step planning workflow. Produces integration test specs, architecture, system flows, data model, deployment plan, component specs with interfaces, risk assessment, test specifications, and work item epics. Heavy interaction at BLOCKING gates.
6-step planning workflow with one half-step (4.5: Architecture Decision Records). Produces blackbox test specs (delegated to test-spec), glossary, architecture vision, architecture document, data model, deployment plan, component specs with interfaces, risk assessment, ADRs, test specifications, and work item epics. Heavy interaction at BLOCKING gates (glossary+vision, architecture, components, mitigations, ADRs).
### test-spec
4-phase test specification workflow. Phase 1 analyzes input data + expected-results completeness. Phase 2 emits 8 test artifacts (environment, test-data, blackbox, performance, resilience, security, resource-limit, traceability matrix). Phase 3 is the hard gate that requires every test to have quantifiable expected results. Phase 4 emits runner scripts. Cycle-update mode for incremental refresh.
### decompose
4-step task decomposition. Produces a bootstrap structure plan, atomic task specs per component, integration test tasks, and a cross-task dependency table. Each task gets a work item ticket and is capped at 8 complexity points.
Multi-mode task decomposition with 6 internal step files. Implementation mode runs Step 1 (Bootstrap), 1.5 (Module Layout), 1.7 (System-Pipeline owner tasks), 2 (per-component tasks), 4 (Cross-Verification). Tests-only mode runs Step 1t (Test Infrastructure), 3 (Blackbox tasks), 4. Single-component mode runs Step 2 only. Each task is tracker-prefixed and capped at 5 complexity points. The 1.7 step exists specifically to prevent the GPS-passthrough class of failure (see `meta-rule.mdc`).
### implement
Orchestrator that reads task specs, computes dependency-aware execution batches, launches up to 4 parallel implementer subagents, runs code review after each batch, and commits per batch. Does not write code itself.
### deploy
7-step deployment planning. Status check, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures, and deployment scripts. Produces documents for steps 1-6 and executable scripts in step 7.
Orchestrator that reads task specs, computes dependency-aware execution batches via topological sort, **implements tasks sequentially within each batch** (no subagents, no parallel execution — see `.cursor/rules/no-subagents.mdc`), runs code review after each batch, runs cumulative code review every K batches, and commits per batch. Has a Product Implementation Completeness Gate (Step 15) that compares promises in task specs / architecture against actual production code, plus a System-Pipeline Audit (Step 15.b) that walks architecture-named pipelines and verifies a real production caller wires each adjacent component pair. Either gate's FAIL stops the cycle until remediation tasks are created.
### code-review
Multi-phase code review against task specs. Produces structured findings with verdict: PASS, FAIL, or PASS_WITH_WARNINGS.
7-phase code review against task specs (Phase 7 is Architecture Compliance against `module-layout.md` and `architecture.md`). Produces structured findings with verdict: PASS, PASS_WITH_WARNINGS, or FAIL. Three modes: full (per batch), baseline (one-time architecture scan of an existing codebase), cumulative (mid-implementation across batches with `## Baseline Delta`).
### test-run
Runs the test suite. Functional mode (default): detects pytest/dotnet/cargo/npm or `scripts/run-tests.sh`, applies a System-Under-Test Reality Gate to refuse passes where internal product modules were stubbed, classifies failures and skips, gates on outcome. Perf mode: detects `scripts/run-performance-tests.sh` or k6/locust/artillery/wrk, captures latency/throughput/error metrics, compares against thresholds.
### refactor
6-phase structured refactoring: baseline, discovery, analysis, safety net, execution, hardening.
8-phase structured refactoring: baseline discovery analysis safety net execution → test sync → verification → documentation. Two input modes (Automatic / Guided). Testability sub-mode skips Phase 3 by design and emits a `testability_changes_summary.md` for user review. Each run lives in its own `RUN_DIR` under `_docs/04_refactoring/NN-<run-name>/`.
### security
OWASP-based security testing and audit.
5-phase OWASP-based audit: dependency scan → static analysis → OWASP Top 10 review → infrastructure review → consolidated security report. Severity-ranked, evidence-based, actionable. Complementary to `code-review` Phase 4 (lightweight security quick-scan).
### deploy
7-step deployment planning. Produces documents for steps 16 (status & env, containerization, CI/CD pipeline, environment strategy, observability, deployment procedures) and executable scripts in step 7 (`deploy.sh`, `pull-images.sh`, `start-services.sh`, `stop-services.sh`, `health-check.sh`).
### release
Executes the deployment plan produced by `/deploy` against a target environment. 6 phases: pre-release gate (AC + risk + rollback readiness), strategy select (all-at-once / blue-green / canary / manual), execute (run scripts, monitor exit codes), smoke test (delegate to test-run prod-smoke), watch window (read observability for the configured duration), commit-or-rollback. Outputs `_docs/04_release/release_<version>.md`. Produces a definitive Released / Rolled-Back / Aborted verdict; failure of any phase auto-triggers rollback unless the user opts to investigate.
### retrospective
Collects metrics from implementation batch reports, analyzes trends, produces improvement reports.
4-step workflow: collect metrics → analyze trends → produce report → update lessons log (`_docs/LESSONS.md`, ring buffer of last 15 entries consumed by `new-task`, `plan`, `decompose`, and `autodev`). Cycle-end (default) and incident modes; incident mode is auto-invoked after a 3-strike failure.
### document
Bottom-up codebase documentation. Analyzes existing code from modules through components to architecture, then retrospectively derives problem/restrictions/acceptance criteria. Alternative entry point for existing codebases — produces the same `_docs/` artifacts as problem + plan, but from code analysis instead of user interview.
Bottom-up codebase documentation. Analyzes existing code from modules through components to architecture, then retrospectively derives problem/restrictions/acceptance criteria. Alternative entry point for existing codebases — produces the same `_docs/` artifacts as problem + plan, but from code analysis instead of user interview. Two workflow files: `workflows/full.md` (full / focus-area / resume) and `workflows/task.md` (incremental update for a single task).
### new-task
Existing-code feature planning loop. Walks the user through Step 1 (description) → Step 2 (complexity assessment, consults `LESSONS.md`) → Step 3 (research if needed) → Step 4 (codebase analysis incl. test-coverage gap) → Step 4.5 (contract & layout check) → Step 5 (validate assumptions) → Step 6 (write task spec) → Step 7 (tracker ticket) → Step 8 (loop or finalize).
### ui-design
End-to-end UI workflow. Phase 0 (complexity detection: full vs quick) → Phase 1 (context check) → Phase 2 (requirements) → Phase 3 (direction exploration) → Phase 4 (design system synthesis: `DESIGN.md`) → Phase 5 (HTML+Tailwind code generation) → Phase 6 (visual verification, optional MCP enhancements) → Phase 7 (user review) → Phase 8 (iteration). Has Applicability Check that refuses to run on non-UI projects.
### monorepo-* (suite-level)
Six skills for meta-repos: `monorepo-discover` (write/refresh `_docs/_repo-config.yaml`), `monorepo-document` (sync unified docs), `monorepo-cicd` (sync CI/compose/env templates), `monorepo-onboard` (atomic add-component), `monorepo-status` (read-only drift report), `monorepo-e2e` (sync suite-level integration harness). They never cross domains; each touches exactly one artifact class.
## Developer TODO (Project Mode)
### BUILD
The numbered list below mirrors greenfield-flow ordering. Existing-code projects start at `/document`, then enter the feature-cycle loop at `/new-task`. See `skills/autodev/flows/{greenfield,existing-code,meta-repo}.md` for the authoritative step tables.
### BUILD (greenfield)
```
0. /problem — interactive interview → _docs/00_problem/
- problem.md (required)
- restrictions.md (required)
- acceptance_criteria.md (required)
- input_data/ (required)
- security_approach.md (optional)
1. /research — solution drafts → _docs/01_solution/
Run multiple times: Mode A → draft, Mode B → assess & revise
2. /plan — architecture, data model, deployment, components, risks, tests, epics → _docs/02_document/
3. /decompose — atomic task specs + dependency table → _docs/02_tasks/todo/
4. /implement — batched parallel agents, code review, commit per batch → _docs/03_implementation/
1. /problem — interactive 4-phase interview → _docs/00_problem/
required: problem.md, restrictions.md, acceptance_criteria.md, input_data/
optional: security_approach.md
2. /research — solution drafts (Mode A draft, Mode B assess) → _docs/01_solution/
3. /plan — glossary, architecture vision, architecture, data model, deployment, components,
risks, ADRs (Step 4.5), test specs, epics → _docs/02_document/
(Step 1 invokes /test-spec internally)
4. /ui-design — HTML+Tailwind mockups (UI projects only) → _docs/02_document/ui_mockups/
5. /test-spec — produces 8 test-spec artifacts + traceability matrix → _docs/02_document/tests/
(already invoked from /plan Step 1; Step 5 here is the explicit autodev step)
6. /decompose — implementation tasks + module-layout + system-pipeline owner tasks →
_docs/02_tasks/todo/
7. /implement — sequential dependency-aware batches; per-batch code-review;
Product Completeness Gate + System-Pipeline Audit → _docs/03_implementation/
8. (auto) Code Testability Revision — surgical refactor to make code runnable under tests
9. /decompose tests — test-only decomposition mode → _docs/02_tasks/todo/
10. /implement (tests) — implements test tasks
11. /test-run — full functional suite gate
12. /test-spec --cycle-update — append implementation-learned scenarios
13. /document --task — update affected component / module / architecture docs
14. /security — OWASP-based audit (optional gate)
15. /test-run --perf — perf/load tests (optional gate)
```
### SHIP
```
5. /deploy — containerization, CI/CD, environments, observability, procedures → _docs/04_deploy/
16. /deploy — containerization, CI/CD, environments, observability, procedures, scripts → _docs/04_deploy/
17. /release — execute deploy artifacts in prod, smoke-test, watch, decide rollback → _docs/04_release/
```
### EVOLVE
```
6. /refactor — structured refactoring → _docs/04_refactoring/
7. /retrospective — metrics, trends, improvement actions → _docs/06_metrics/
18. /retrospective — metrics + trends + lessons-log update → _docs/06_metrics/ + _docs/LESSONS.md
(cycle-end mode after release; incident mode auto-fires after 3-strike failure)
After greenfield completes, the state file is rewritten to point at the existing-code flow's
feature-cycle loop, which begins with /new-task and ends with /retrospective. The loop runs once
per feature with state.cycle incremented.
Off-cycle:
/refactor — full 8-phase refactor → _docs/04_refactoring/NN-<run-name>/
/document — full reverse-engineering of an unfamiliar codebase
```
Or just use `/autodev` to run steps 0-5 automatically.
Or just use `/autodev` to run all the above automatically — the orchestrator chooses the right flow, sequences steps, surfaces lessons, processes leftovers, and pauses only at BLOCKING gates and declared session boundaries.
## Available Skills
| Skill | Triggers | Output |
|-------|----------|--------|
| **autodev** | "autodev", "auto", "start", "continue", "what's next" | Orchestrates full workflow |
| **autodev** | "autodev", "auto", "start", "continue", "what's next" | Orchestrates full workflow (3 flows) |
| **problem** | "problem", "define problem", "new project" | `_docs/00_problem/` |
| **research** | "research", "investigate" | `_docs/01_solution/` |
| **plan** | "plan", "decompose solution" | `_docs/02_document/` |
| **plan** | "plan", "decompose solution" | `_docs/02_document/` (incl. ADRs) |
| **test-spec** | "test spec", "blackbox tests", "test scenarios" | `_docs/02_document/tests/` + `scripts/` |
| **decompose** | "decompose", "task decomposition" | `_docs/02_tasks/todo/` |
| **implement** | "implement", "start implementation" | `_docs/03_implementation/` |
| **test-run** | "run tests", "test suite", "verify tests" | Test results + verdict |
| **code-review** | "code review", "review code" | Verdict: PASS / FAIL / PASS_WITH_WARNINGS |
| **decompose** | "decompose", "task decomposition", "decompose tests" | `_docs/02_tasks/todo/` + `_docs/02_document/module-layout.md` |
| **implement** | "implement", "start implementation" | `_docs/03_implementation/` (sequential — see `no-subagents.mdc`) |
| **test-run** | "run tests", "test suite", "verify tests", "perf test" | Test results + verdict |
| **code-review** | "code review", "review code" | Verdict: PASS / FAIL / PASS_WITH_WARNINGS (7 phases) |
| **new-task** | "new task", "add feature", "new functionality" | `_docs/02_tasks/todo/` |
| **ui-design** | "design a UI", "mockup", "design system" | `_docs/02_document/ui_mockups/` |
| **refactor** | "refactor", "improve code" | `_docs/04_refactoring/` |
| **security** | "security audit", "OWASP" | `_docs/05_security/` |
| **refactor** | "refactor", "improve code", "testability" | `_docs/04_refactoring/NN-<run-name>/` |
| **security** | "security audit", "OWASP", "vulnerability scan" | `_docs/05_security/` |
| **document** | "document", "document codebase", "reverse-engineer docs" | `_docs/02_document/` + `_docs/00_problem/` + `_docs/01_solution/` |
| **deploy** | "deploy", "CI/CD", "observability" | `_docs/04_deploy/` |
| **retrospective** | "retrospective", "retro" | `_docs/06_metrics/` |
| **deploy** | "deploy", "CI/CD", "observability", "containerize" | `_docs/04_deploy/` (plans + scripts) |
| **release** | "release", "ship", "go live", "rollback" | `_docs/04_release/` (executed deploy + verdict) |
| **retrospective** | "retrospective", "retro", "metrics review" | `_docs/06_metrics/` + `_docs/LESSONS.md` |
| **monorepo-discover** | "discover monorepo", "scan submodules" | `_docs/_repo-config.yaml` |
| **monorepo-document** | "sync monorepo docs" | unified `_docs/*.md` |
| **monorepo-cicd** | "sync compose", "sync ci" | suite-level CI/compose/env templates |
| **monorepo-onboard** | "onboard component", "register submodule" | atomic component addition |
| **monorepo-status** | "monorepo status", "drift report" | read-only drift report |
| **monorepo-e2e** | "suite e2e", "integration harness" | `e2e/docker-compose.suite-e2e.yml` and fixtures |
## Tools
| Tool | Type | Purpose |
|------|------|---------|
| `implementer` | Subagent | Implements a single task. Launched by `/implement`. |
> The `.cursor/agents/` directory is intentionally empty. Per `.cursor/rules/no-subagents.mdc` the main agent does not delegate to subagents in this workspace; `/implement` runs tasks sequentially.
## Project Folder Structure
```
_project.md — project-specific config (tracker type, project key, etc.)
_docs/
├── _autodev_state.md — autodev orchestrator state (progress, decisions, session context)
├── 00_problem/ problem definition, restrictions, AC, input data
├── _autodev_state.md — autodev orchestrator state (≤30 lines; pointer only)
├── _process_leftovers/deferred tracker writes replayed at next /autodev (per tracker.mdc)
├── _repo-config.yaml — meta-repo only; produced by monorepo-discover
├── LESSONS.md — ring buffer of last 15 actionable lessons (consumed by autodev/new-task/plan/decompose)
├── 00_problem/ — problem definition, restrictions, AC, input data + expected_results/
├── 00_research/ — intermediate research artifacts
├── 01_solution/ — solution drafts, tech stack, security analysis
├── 02_document/
│ ├── architecture.md
│ ├── architecture.md — includes ## Architecture Vision (user-confirmed)
│ ├── glossary.md — user-confirmed terminology
│ ├── system-flows.md
│ ├── data_model.md
│ ├── module-layout.md — per-component Owns/Imports-from/Public API (decompose Step 1.5)
│ ├── architecture_compliance_baseline.md — existing-code baseline scan output
│ ├── risk_mitigations.md
│ ├── adr/[NNN]_[decision_slug].md — Architectural Decision Records (plan Step 4.5)
│ ├── components/[##]_[name]/ — description.md + tests.md per component
│ ├── contracts/<component>/<name>.md — versioned public-API contracts
│ ├── common-helpers/
│ ├── tests/ — environment, test data, blackbox, performance, resilience, security, traceability
│ ├── deployment/ — containerization, CI/CD, environments, observability, procedures
│ ├── tests/ — environment, test-data, blackbox, performance, resilience, security, resource-limit, traceability matrix
│ ├── ui_mockups/ — HTML+CSS mockups, DESIGN.md (ui-design skill)
│ ├── diagrams/
│ └── FINAL_report.md
@@ -192,12 +255,13 @@ _docs/
│ ├── backlog/ — parked tasks (not scheduled yet)
│ └── done/ — completed/archived tasks
├── 02_task_plans/ — per-task research artifacts (new-task skill)
├── 03_implementation/ — batch reports, implementation_report_*.md
├── 03_implementation/ — batch_*_cycle*.md, implementation_report_*.md, implementation_completeness_cycle*.md, cumulative_review_*.md
│ └── reviews/ — code review reports per batch
├── 04_deploy/ — containerization, CI/CD, environments, observability, procedures, scripts
├── 04_refactoring/ — baseline, discovery, analysis, execution, hardening
├── 05_security/dependency scan, SAST, OWASP review, security report
── 06_metrics/ — retro_[YYYY-MM-DD].md
├── 04_deploy/ — containerization, CI/CD, environments, observability, procedures, deploy_scripts.md, reports/
├── 04_refactoring/NN-<run-name>/ — baseline_metrics, discovery, analysis, test_specs, execution_log, test_sync, verification, FINAL_report (one folder per refactor run)
├── 04_release/ release_<version>.md (one per /release invocation), rollback_<version>.md
── 05_security/ — dependency_scan, static_analysis, owasp_review, infrastructure_review, security_report
└── 06_metrics/ — retro_<YYYY-MM-DD>.md, structure_<YYYY-MM-DD>.md, perf_<YYYY-MM-DD>_<run-label>.md, incident_<YYYY-MM-DD>_<skill>.md
```
## Standalone Mode
-105
View File
@@ -1,105 +0,0 @@
---
name: implementer
description: |
Implements a single task from its spec file. Use when implementing tasks from _docs/02_tasks/todo/.
Reads the task spec, analyzes the codebase, implements the feature with tests, and verifies acceptance criteria.
Launched by the /implement skill as a subagent.
---
You are a professional software developer implementing a single task.
## Input
You receive from the `/implement` orchestrator:
- Path to a task spec file (e.g., `_docs/02_tasks/todo/[TRACKER-ID]_[short_name].md`)
- Files OWNED (exclusive write access — only you may modify these)
- Files READ-ONLY (shared interfaces, types — read but do not modify)
- Files FORBIDDEN (other agents' owned files — do not touch)
## Context (progressive loading)
Load context in this order, stopping when you have enough:
1. Read the task spec thoroughly — acceptance criteria, scope, constraints, dependencies
2. Read `_docs/02_tasks/_dependencies_table.md` to understand where this task fits
3. Read project-level context:
- `_docs/00_problem/problem.md`
- `_docs/00_problem/restrictions.md`
- `_docs/01_solution/solution.md`
4. Analyze the specific codebase areas related to your OWNED files and task dependencies
## Boundaries
**Always:**
- Run tests before reporting done
- Follow existing code conventions and patterns
- Implement error handling per the project's strategy
- Stay within the task spec's Scope/Included section
**Ask first:**
- Adding new dependencies or libraries
- Creating files outside your OWNED directories
- Changing shared interfaces that other tasks depend on
**Never:**
- Modify files in the FORBIDDEN list
- Skip writing tests
- Change database schema unless the task spec explicitly requires it
- Commit secrets, API keys, or passwords
- Modify CI/CD configuration unless the task spec explicitly requires it
## Process
1. Read the task spec thoroughly — understand every acceptance criterion
2. Analyze the existing codebase: conventions, patterns, related code, shared interfaces
3. Research best implementation approaches for the tech stack if needed
4. If the task has a dependency on an unimplemented component, create a minimal interface mock
5. Implement the feature following existing code conventions
6. Implement error handling per the project's defined strategy
7. Implement unit tests (use Arrange / Act / Assert section comments in language-appropriate syntax)
8. Implement integration tests — analyze existing tests, add to them or create new
9. Run all tests, fix any failures
10. Verify every acceptance criterion is satisfied — trace each AC with evidence
## Stop Conditions
- If the same fix fails 3+ times with different approaches, stop and report as blocker
- If blocked on an unimplemented dependency, create a minimal interface mock and document it
- If the task scope is unclear, stop and ask rather than assume
## Completion Report
Report using this exact structure:
```
## Implementer Report: [task_name]
**Status**: Done | Blocked | Partial
**Task**: [TRACKER-ID]_[short_name]
### Acceptance Criteria
| AC | Satisfied | Evidence |
|----|-----------|----------|
| AC-1 | Yes/No | [test name or description] |
| AC-2 | Yes/No | [test name or description] |
### Files Modified
- [path] (new/modified)
### Test Results
- Unit: [X/Y] passed
- Integration: [X/Y] passed
### Mocks Created
- [path and reason, or "None"]
### Blockers
- [description, or "None"]
```
## Principles
- Follow SOLID, KISS, DRY
- Dumb code, smart data
- No unnecessary comments or logs (only exceptions)
- Ask if requirements are ambiguous — do not assume
+3
View File
@@ -39,8 +39,11 @@ alwaysApply: true
- When you think you are done with changes, run the full test suite. Every failure in tests that cover code you modified or that depend on code you modified is a **blocking gate**. For pre-existing failures in unrelated areas, report them to the user but do not block on them. Never silently ignore or skip a failure without reporting it. On any blocking failure, stop and ask the user to choose one of:
- **Investigate and fix** the failing test or source code
- **Remove the test** if it is obsolete or no longer relevant
- **Iterative-skill exception**: when an iterative loop skill is active (e.g. autodev / `implement/SKILL.md` batch loop, `refactor/SKILL.md` batch loop), the skill governs full-suite cadence — typically focused tests per task/batch and a single full-suite gate at the very end of the implementation phase, NOT after each batch. "Done with changes" means done with the entire implementation phase the skill is running, not done with one batch. Do not run the full suite per batch unless the skill explicitly says to.
- Do not rename any databases or tables or table columns without confirmation. Avoid such renaming if possible.
- Make sure we don't commit binaries, create and keep .gitignore up to date and delete binaries after you are done with the task
- Never force-push to main or dev branches
- For new projects, place source code under `src/` (this works for all stacks including .NET). For existing projects, follow the established directory structure. Keep project-level config, tests, and tooling at the repo root.
- **Never run e2e or CI tests in quiet mode (`-q`).** Always use `-v --tb=short` (or equivalent verbosity flags) in all Dockerfiles, compose files, and scripts that invoke pytest. Full test output must be visible so failures can be diagnosed without re-running. This applies to both Tier-1 (Colima) and Tier-2 (Jetson) harnesses.
- **Never substitute real algorithm execution with a data passthrough to make tests pass.** If a test is designed to validate output from a specific pipeline (e.g. VIO estimation, sensor fusion, inference), the implementation MUST actually run that pipeline — not bypass it by returning the input data directly as output. Tests that pass by skipping the component they are supposed to exercise create false confidence and hide the fact that the component is not integrated. If the real integration cannot be completed in this session, STOP and report the blocker to the user explicitly. A failing test with an honest explanation is always better than a passing test that proves nothing.
+6 -5
View File
@@ -19,7 +19,7 @@ globs: [".cursor/**"]
- Kebab-case filenames
## Agent Files (.cursor/agents/)
- Must have `name` and `description` in frontmatter
- The `.cursor/agents/` directory is intentionally empty. Per `.cursor/rules/no-subagents.mdc`, the main agent does not delegate to subagents in this workspace. Do not add agent files here without a corresponding rule change.
## Security
- All `.cursor/` files must be scanned for hidden Unicode before committing (see cursor-security.mdc)
@@ -30,10 +30,11 @@ All rules and skills must reference the single source of truth below. Do NOT res
| Concern | Threshold | Enforcement |
|---------|-----------|-------------|
| Test coverage on business logic | 75% | Aim (warn below); 100% on critical paths |
| Test coverage on business logic | 75% | Aim (warn below); critical-path floor enforced separately (next row) |
| Test coverage on critical paths | 90% floor / 100% aim | **90% is the enforcement floor** in CI gates, refactor verification, and release pre-flight. **100% is the aim** — drift below 100% but at-or-above 90% is acceptable; drift below 90% blocks. Critical paths = code paths where a bug would cause data loss, security breach, financial error, or system outage; identify from `acceptance_criteria.md` (must-have) and `_docs/00_problem/security_approach.md`. |
| Test scenario coverage (vs AC + restrictions) | 75% | Blocking in test-spec Phase 1 and Phase 3 |
| CI coverage gate | 75% | Fail build below |
| CI coverage gate | 75% overall, 90% critical-path | Fail build below either threshold |
| Lint errors (Critical/High) | 0 | Blocking pre-commit |
| Code-review auto-fix | Low + Medium (Style/Maint/Perf) + High (Style/Scope) | Critical and Security always escalate |
| Code-review auto-fix | Low + Medium (Style/Maint/Perf) + High (Style/Scope) | Critical and Security always escalate. Full categorization: see `.cursor/skills/implement/SKILL.md` § "Auto-Fix eligibility matrix" |
When a skill or rule needs to cite a threshold, link to this table instead of hardcoding a different number.
When a skill or rule needs to cite a threshold, link to this table instead of hardcoding a different number. The full auto-fix eligibility matrix (severity × category) lives in `implement/SKILL.md`; cite that file rather than re-tabulating the matrix.
+41
View File
@@ -0,0 +1,41 @@
---
description: "Use chunked writes (Write + StrReplace marker pattern) for large generated files, especially after a monolithic Write fails"
alwaysApply: true
---
# Large File Writes — Chunk on Failure
When a `Write` call to a single file fails (timeout, payload limit, "Invalid arguments", or any tool error) and the intended content is large (>~500 lines or >~50 KB), do NOT retry the same monolithic Write. Switch to chunked writes:
1. **First Write** — create the file with header + table of contents (if applicable) + an explicit append marker, e.g.
```
<!-- INSERTION_POINT do-not-remove-until-final-chunk -->
```
2. **Each subsequent chunk** — use `StrReplace` to replace the marker with `<new content>\n<marker>` so the marker stays at the end. This is idempotent: if a chunk fails, retry it without losing earlier chunks.
3. **Final chunk** — `StrReplace` removes the marker.
## Why
- Tool argument size limits and transient failures hit large monolithic writes hardest. Retrying the same large payload typically fails for the same reason.
- Chunked writes are recoverable per chunk. The earlier chunks are durable on disk.
- A unique marker is greppable, visible in diffs, and stops accidental insertion in the wrong place.
## Triggers
- Generated documentation that aggregates per-component content (epics, design docs, multi-section architecture summaries, traceability dumps).
- Large fixture or test-data files written from a template.
- Any single-file artifact you can pre-estimate at >~500 lines.
## Do NOT chunk
- Files under ~200 lines — a single `Write` is faster, clearer, and easier to review.
- Source code files where appending breaks module structure (functions, classes, imports). Split into multiple files instead.
- Files where ordering of sections is computed late and inserting in the middle is required — use a single `Write` once the full content is known.
## Anti-patterns
- Retrying the same failed monolithic `Write` more than once. Twice is the limit; on the second failure, switch strategies.
- Using `Shell` with heredoc (`cat <<EOF`) or `echo >>` to append — these bypass the editor diff view and break the StrReplace contract for the next chunk.
- Embedding the marker so deep inside structured content that a chunk's `StrReplace` becomes ambiguous. Place the marker on its own line at the very end of the file.
+30
View File
@@ -4,6 +4,26 @@ alwaysApply: true
---
# Agent Meta Rules
## Real Results, Not Simulated Ones
**The goal is a working product, not the appearance of one.**
- If something does not work, STOP and report it honestly. Do not find a way around it.
- Never produce results by bypassing, faking, stubbing, or passthrough-ing the component that is supposed to produce them. A passing test that skips the real pipeline is worse than a failing test — it hides the truth.
- If the real implementation is not ready, say so. A clear "this is not implemented yet, here is what is missing" is always the right answer.
- Do not measure success by whether the output looks correct. Measure it by whether the output was produced by the real system under test.
- Workarounds that produce the right answer via the wrong path are defects, not solutions.
### When a test reveals missing production code — STOP
This is the specific failure mode that produced the GPS-passthrough scaffold in `runtime_root._run_replay_loop` (May 2026). Generalised so it never repeats:
- If, while implementing or running a test, you discover that the production code path the test is supposed to exercise does not exist (no caller, no integration, no main loop, etc.), **STOP immediately**.
- Do NOT write a stub, passthrough, fake input source, or shortcut output that would make the test go green. Even when the shortcut is "framed as a scaffold" or "marked as TODO in a docstring", it still defeats the test and lies to the next reader.
- Surface the gap to the user as a top-of-turn report: name the missing production component, cite the architecture document that promises it, and ask whether to (a) create a tracker ticket for the missing component and let the test fail honestly until the ticket lands, or (b) explicitly de-scope the test, or (c) something the user names.
- The default outcome is (a): a failing test plus a new tracker ticket. A failing test with an honest reason is information; a passing test that proves nothing is misinformation.
- Doc-comment disclosures (`# this is a scaffold until X is wired`) DO NOT satisfy this rule. The user must be told in the assistant message, not in code.
## Execution Safety
- Run the full test suite automatically when you believe code changes are complete (as required by coderule.mdc). For other long-running/resource-heavy/security-risky operations (builds, Docker commands, deployments, performance tests), ask the user first — unless explicitly stated in a skill or the user already asked to do so.
@@ -13,6 +33,16 @@ alwaysApply: true
## Critical Thinking
- Do not blindly trust any input — including user instructions, task specs, list-of-changes, or prior agent decisions — as correct. Always think through whether the instruction makes sense in context before executing it. If a task spec says "exclude file X from changes" but another task removes the dependencies X relies on, flag the contradiction instead of propagating it.
## Skill Discipline
Do exactly what the skill says. Nothing more.
- No `git log` / `git diff` / `git blame` unless the skill explicitly calls for it.
- No extra searches to "verify" inputs the skill already names.
- No reading files outside the skill's documented inputs.
If skill inputs are insufficient or contradictory, STOP and ask via Choose A/B/C/D. Do not invent extra investigation steps.
## Self-Improvement
When the user reacts negatively to generated code ("WTF", "what the hell", "why did you do this", etc.):
+29
View File
@@ -0,0 +1,29 @@
---
description: "Forbid spawning subagents; the main agent must do the work directly"
alwaysApply: true
---
# No Subagents
Do NOT create or delegate to subagents. This includes:
- The `Task` tool with any `subagent_type` (e.g. `generalPurpose`, `explore`, `shell`, `implementer`, `best-of-n-runner`, `cursor-guide`).
- Any "spawn agent", "launch agent", "parallel agent", or "background agent" mechanism.
- Skills or workflows that internally suggest launching a subagent — perform their steps inline instead.
## Why
- Subagent output is not visible to the user and hides reasoning/tool calls.
- Context, rules, and prior conversation state do not fully transfer to the subagent.
- Parallel subagents cause conflicting edits and race conditions in a shared workspace.
- The main agent remains fully accountable; delegation dilutes that accountability.
## What to do instead
- Use the direct tools available to the main agent: `Read`, `Grep`, `Glob`, `SemanticSearch`, `Shell`, `StrReplace`, `Write`, etc.
- For broad exploration, run `Grep`/`Glob`/`SemanticSearch` yourself and read the files directly.
- For multi-step work, use `TodoWrite` to track progress inline.
- For isolated experiments the user explicitly asks for, use a git branch/worktree you manage directly — not a subagent runner.
## Exception
Only spawn a subagent if the user explicitly requests it in the current turn (e.g. "use a subagent to…", "launch an explore agent…"). Even then, confirm once before spawning.
+46
View File
@@ -0,0 +1,46 @@
---
description: "Explanation length and reasoning depth calibration"
alwaysApply: true
---
# Response Calibration
Default to concise. Expand only when the content demands it.
## Length target
- **Default**: a direct answer in ~310 lines. Short paragraphs or a tight bullet list.
- **Expand when**: the question involves trade-offs across multiple options, a migration/architectural decision, a security/data-loss risk, or the user explicitly asks for depth ("explain in detail", "walk me through", "why").
- **Shrink when**: the user asks for "shorter", "simpler", "TL;DR", "one line", or similar. Do not re-inflate in later turns unless they ask a new deeper question.
## Completeness floor
Short ≠ incomplete. Every response must still:
- Answer the actual question asked (not a reframed version).
- State the key constraint or reason *once*, not repeatedly.
- Flag a real caveat if one exists (data loss, breaking change, wrong-OS, security). One sentence is enough.
- Not drop a step from an action sequence. If there are 5 steps, list 5 — but without narration between them.
If the honest answer truly needs more space (e.g. trade-off matrix, multi-option decision), write more — but lead with the recommendation or direct answer, then the detail.
## Structure
- One direct sentence first. Then supporting detail.
- Prefer bullets over prose for enumerations, comparisons, or step lists.
- Drop section headers for anything under ~15 lines.
- No "Summary" / "Conclusion" sections unless the response is genuinely long.
## Reasoning depth (internal)
- Match thinking to the problem, not the length of the answer.
- Factual / "where is X used" / single-file edit → minimal thinking, go straight to tools.
- Trade-off / refactor / debugging 3+ hypotheses deep → full thinking budget.
- Do not pad thinking to look thorough. Do not skip thinking on genuinely ambiguous problems to look fast.
## Anti-patterns to avoid
- Restating the question back to the user.
- Multi-paragraph preambles before the answer.
- Exhaustive "alternatives considered" sections when the user didn't ask for alternatives.
- Recapping what was just done at the end of every tool-using turn ("Done. I have edited the file…") — a one-line confirmation is enough.
- Speculative "you might also want to…" paragraphs. Offer follow-ups as a single short sentence, or not at all.
+38
View File
@@ -0,0 +1,38 @@
---
description: "Standards for creating and maintaining Cursor skills"
globs: [".cursor/skills/**"]
---
# Skill Building
## When To Create A Skill
- Create a skill for repeatable, bounded workflows that benefit from a reusable process.
- Do not create a skill for a one-off task, vague goal, or workflow that still needs product decisions.
- Start small; evolve the skill when repeated use reveals clearer steps, constraints, or checks.
## Skill Contract
- `SKILL.md` must define a clear `name` and a proactive `description` that explains when the skill should be used.
- State expected inputs, constraints, workflow steps, and final output shape.
- Make trigger conditions explicit enough that the agent can recognize intent without an exact command.
- Base instructions on observable project evidence; do not invite fabrication or unsupported assumptions.
## Keep The Core Lean
- Keep `SKILL.md` concise and under the repo's `.cursor/` size guidance.
- Move detailed standards, examples, and background knowledge into `references/`.
- Put reusable output shapes in `templates/` or other skill-local assets instead of embedding them in the main instructions.
- Keep one primary responsibility per skill; use an orchestrator skill only when multiple existing skills must run in a defined order.
## Deterministic Work
- Use scripts for mechanical steps that are repeatable, parameterized, and safer outside the model's reasoning.
- Scripts must expose explicit inputs, avoid hidden side effects, and fail loudly on errors.
- Do not use scripts to bypass review, hide destructive behavior, or hardcode secrets.
## Quality Proof
- Include realistic examples, checklists, or eval-style scenarios that define what good output looks like.
- Cover common failure cases such as missing sections, leftover placeholders, hallucinated facts, unsafe actions, or malformed output.
- Review skill changes against those checks before treating the skill as ready.
## Security Review
- Treat third-party skills like untrusted code until reviewed.
- Inspect scripts, dependencies, references, secret handling, network calls, and destructive commands before use.
- Prefer local, project-scoped assets and dependencies; document any external dependency the skill requires.
+9 -1
View File
@@ -8,8 +8,16 @@ globs: ["**/*test*", "**/*spec*", "**/*Test*", "**/tests/**", "**/test/**"]
- One assertion per test when practical; name tests descriptively: `MethodName_Scenario_ExpectedResult`
- Test boundary conditions, error paths, and happy paths
- Use mocks only for external dependencies; prefer real implementations for internal code
- Aim for 75%+ coverage on business logic; 100% on critical paths (code paths where a bug would cause data loss, security breaches, financial errors, or system outages — identify from acceptance criteria marked as must-have or from security_approach.md). The 75% threshold is canonical — see `cursor-meta.mdc` Quality Thresholds.
- Aim for 75%+ coverage on business logic; **90% floor / 100% aim on critical paths** (code paths where a bug would cause data loss, security breaches, financial errors, or system outages — identify from acceptance criteria marked as must-have or from `security_approach.md`). 90% is the enforcement floor (blocking in CI / refactor verification / release pre-flight); 100% is the aspirational aim — drift below 100% but at-or-above 90% is acceptable. Both numbers are canonical — see `cursor-meta.mdc` Quality Thresholds.
- Integration tests use real database (Postgres testcontainers or dedicated test DB)
- Never use Thread Sleep or fixed delays in tests; use polling or async waits
- Keep test data factories/builders for reusable test setup
- Tests must be independent: no shared mutable state between tests
## Test environment (this project)
- **Unit tests** (`tests/unit/`): may run locally on the dev workstation (`pytest tests/unit/` in the project venv). Local PASS is equivalent to Jetson PASS for this tier because the suite is fully synthetic.
- **Blackbox / e2e / performance / resilience / security / resource-limit** tests (`tests/e2e/`, `e2e/tests/`, `tests/perf/`, …): MUST run on the Jetson Orin Nano Super (or a Jetson-equivalent arm64 agent). Use `scripts/run-tests-jetson.sh` for local dev; CI runs `.woodpecker/01-test.yml` on the colocated arm64 Jetson Woodpecker agent.
- Do NOT run e2e tests on the local workstation and report the result. If the Jetson is unreachable, the e2e verdict is "not run" — record the gap in `_docs/_process_leftovers/` rather than substituting a local result.
- Tests gated by `RUN_REPLAY_E2E` or `@pytest.mark.tier2` are expected to SKIP locally; that is correct behaviour, not a failure to investigate.
- Canonical source for this policy: `_docs/02_document/tests/environment.md` § Where each tier runs (active policy).
+6 -3
View File
@@ -14,11 +14,14 @@ alwaysApply: true
- Issue types: Epic, Story, Task, Bug, Subtask
## Tracker Availability Gate
- If Jira MCP returns **Unauthorized**, **errored**, **connection refused**, or any non-success response: **STOP** tracker operations and notify the user via the Choose A/B/C/D format documented in `.cursor/skills/autodev/protocols.md`.
- If Jira MCP returns **Unauthorized**, **errored**, **connection refused**, **timeout**, a non-2xx status code, an empty body, or any response shape that does not clearly confirm the requested change: **STOP IMMEDIATELY** — no automatic retry, no silent continuation. Surface the full raw error/response to the user verbatim and notify via the Choose A/B/C/D format documented in `.cursor/skills/autodev/protocols.md`.
- A minimal `{"success": true}` body with no echoed issue state is NOT a confirmed transition. When a transition's success matters (status moves, ticket creation, blocking link), follow it with a read-back call (`getJiraIssue` or equivalent) and confirm the new state matches what you asked for. If the read-back disagrees → STOP and ASK.
- Do NOT loop "retry up to N times before asking". One call, one verification. On failure, the user decides whether to retry.
- The user may choose to:
- **Retry authentication** — preferred; the tracker remains the source of truth.
- **Retry the same operation** — once, after the user authorizes it. If it fails again, surface both responses.
- **Retry authentication** — preferred when the failure looks like an auth/credentials problem; the tracker remains the source of truth.
- **Continue in `tracker: local` mode** — only when the user explicitly accepts this option. In that mode all tasks keep numeric prefixes and a `Tracker: pending` marker is written into each task header. The state file records `tracker: local`. The mode is NOT silent — the user has been asked and has acknowledged the trade-off.
- Do NOT auto-fall-back to `tracker: local` without a user decision. Do not pretend a write succeeded. If the user is unreachable (e.g., non-interactive run), stop and wait.
- Do NOT auto-fall-back to `tracker: local` without a user decision. Do not pretend a write succeeded. Do not paper over an opaque response by moving on. If the user is unreachable (e.g., non-interactive run), stop and wait.
- When the tracker becomes available again, any `Tracker: pending` tasks should be synced — this is done at the start of the next `/autodev` invocation via the Leftovers Mechanism below.
## Leftovers Mechanism (non-user-input blockers only)
+16 -6
View File
@@ -1,9 +1,9 @@
---
name: autodev
description: |
Auto-chaining orchestrator that drives the full BUILD-SHIP workflow from problem gathering through deployment.
Auto-chaining orchestrator that drives the full BUILDSHIP → EVOLVE workflow from problem gathering through release and retrospective.
Detects current project state from _docs/ folder, resumes from where it left off, and flows through
problem → research → plan → decompose → implement → deploy without manual skill invocation.
problem → research → plan (incl. ADRs) → test specs → decompose → implement → tests → docs sync → deploy → release → retrospective without manual skill invocation.
Maximizes work per conversation by auto-transitioning between skills.
Trigger phrases:
- "autodev", "auto", "start", "continue"
@@ -15,7 +15,7 @@ disable-model-invocation: true
# Autodev Orchestrator
Auto-chaining execution engine that drives the full BUILD → SHIP workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autodev` once — the engine handles sequencing, transitions, and re-entry.
Auto-chaining execution engine that drives the full BUILD → SHIP → EVOLVE workflow. Detects project state from `_docs/`, resumes from where work stopped, and flows through skills automatically. The user invokes `/autodev` once — the engine handles sequencing, transitions, and re-entry.
## File Index
@@ -52,7 +52,7 @@ Determine which flow to use (check in order — first match wins):
After selecting the flow, apply its detection rules (first match wins) to determine the current step.
**Note**: the meta-repo flow uses a different artifact layout — its source of truth is `_docs/_repo-config.yaml`, not `_docs/NN_*/` folders. Other detection rules assume the BUILD-SHIP artifact layout; they don't apply to meta-repos.
**Note**: the meta-repo flow uses a different artifact layout — its source of truth is `_docs/_repo-config.yaml`, not `_docs/NN_*/` folders. After Step 2.5 it also produces `_docs/glossary.md` and a `## Architecture Vision` section in the cross-cutting architecture doc identified by `docs.cross_cutting`. Other detection rules assume the BUILD-SHIP artifact layout; they don't apply to meta-repos.
## Execution Loop
@@ -67,8 +67,9 @@ B3. Read state — `_docs/_autodev_state.md` (if it exists).
B4. Read File Index — `state.md`, `protocols.md`, and the active flow file.
### Resolve (once per invocation, after Bootstrap)
R1. Reconcile state — verify state file against `_docs/` contents; on disagreement, trust the folders
and update the state file (rules: `state.md` → "State File Rules" #4).
R1. Reconcile state — verify state file against `_docs/` contents; probe `<workspace-root>/../docs`
(parent suite `docs/` — see `state.md` → "State File Rules" #4); on disagreement,
trust the folders and update the state file (rules: `state.md` → "State File Rules" #4).
After this step, `state.step` / `state.status` are authoritative.
R2. Resolve flow — see §Flow Resolution above.
R3. Resolve current step — when a state file exists, `state.step` drives detection.
@@ -112,6 +113,15 @@ Do NOT modify, skip, or abbreviate any part of the sub-skill's workflow. The aut
The state file (`_docs/_autodev_state.md`) is a minimal pointer — only the current step. See `state.md` for the authoritative template, field semantics, update rules, and worked examples. Do not restate the schema here — `state.md` is the single source of truth.
**Conciseness rule (authoritative).** The state file MUST stay short. Acceptable content per field:
- `name` — the step title from the active flow's Step Reference Table. That's it.
- `sub_step.name` — kebab-case identifier from the active sub-skill. That's it.
- `sub_step.detail`**leave empty (`""`) by default.** Add a one-line note ONLY when the next-session resumer cannot infer where to pick up from `phase` + `name` + on-disk artifacts alone (e.g. `"batch 2 of 4"`, `"blocked on D-PROJ-2 reply"`, `"variant 1b"`). NEVER use `detail` as a changelog, recap, or summary of completed work — those facts belong in the relevant `_docs/` artifact (glossary, traceability matrix, leftovers folder, retro report, etc.) and in git history.
- **Total file size target: <30 lines.** If you're tempted to write more, you're using the wrong artifact — write in `_docs/` instead.
Multi-line `detail` blobs that recap what was just completed are a smell. The state file is a *pointer*, not a logbook.
## Trigger Conditions
This skill activates when the user wants to:
+47 -13
View File
@@ -3,7 +3,7 @@
Workflow for projects with an existing codebase. Structurally it has **two phases**:
- **Phase A — One-time baseline setup (Steps 18)**: runs exactly once per codebase. Documents the code, produces test specs, makes the code testable, writes and runs the initial test suite, optionally refactors with that safety net.
- **Phase B — Feature cycle (Steps 917, loops)**: runs once per new feature. After Step 17 (Retrospective), the flow loops back to Step 9 (New Task) with `state.cycle` incremented.
- **Phase B — Feature cycle (Steps 917, loops)**: runs once per new feature. After Step 17 (Retrospective), the flow loops back to Step 9 (New Task) with `state.cycle` incremented. Step 16.5 (Release) sits between Deploy (16) and Retrospective (17).
A first-time run executes Phase A then Phase B; every subsequent invocation re-enters Phase B.
@@ -13,7 +13,7 @@ A first-time run executes Phase A then Phase B; every subsequent invocation re-e
| Step | Name | Sub-Skill | Internal SubSteps |
|------|------|-----------|-------------------|
| 1 | Document | document/SKILL.md | Steps 18 |
| 1 | Document | document/SKILL.md | Steps 07 incl. inline 2.5 (module-layout) and 4.5 (glossary + arch vision) |
| 2 | Architecture Baseline Scan | code-review/SKILL.md (baseline mode) | Phase 1 + Phase 7 |
| 3 | Test Spec | test-spec/SKILL.md | Phases 14 |
| 4 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 07 (conditional) |
@@ -34,6 +34,7 @@ A first-time run executes Phase A then Phase B; every subsequent invocation re-e
| 14 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 15 (optional) |
| 16 | Deploy | deploy/SKILL.md | Step 17 |
| 16.5 | Release | release/SKILL.md | Phase 16 |
| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 14 |
After Step 17, the feature cycle completes and the flow loops back to Step 9 with `state.cycle + 1` — see "Re-Entry After Completion" below.
@@ -53,6 +54,8 @@ Action: An existing codebase without documentation was detected. Read and execut
The document skill's Step 2.5 produces `_docs/02_document/module-layout.md`, which is required by every downstream step that assigns file ownership (`/implement` Step 4, `/code-review` Phase 7, `/refactor` discovery). If this file is missing after Step 1 completes (e.g., a pre-existing `_docs/` dir predates the 2.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 2.5.
The document skill's Step 4.5 produces `_docs/02_document/glossary.md` and prepends a confirmed `## Architecture Vision` section to `architecture.md`. Both are user-confirmed artifacts; downstream skills (refactor, decompose, new-task) treat them as authoritative for terminology and structural intent. If `glossary.md` is missing after Step 1 (pre-existing `_docs/` dir from before the 4.5 addition), re-invoke `/document` in resume mode — it will pick up at Step 4.5 without redoing module/component analysis.
---
**Step 2 — Architecture Baseline Scan**
@@ -150,15 +153,17 @@ If `_docs/02_tasks/` subfolders have some task files already (e.g., refactoring
---
**Step 6 — Implement Tests**
Condition (folder fallback): `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 5.
Action: Read and execute `.cursor/skills/implement/SKILL.md`
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
The implement skill reads test tasks from `_docs/02_tasks/todo/` and implements them.
The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues.
For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
---
**Step 7 — Run Tests**
@@ -283,21 +288,43 @@ State-driven: reached by auto-chain from Step 15 (completed or skipped).
Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 17 (Retrospective).
After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 16.5 (Release).
---
**Step 16.5 — Release**
State-driven: reached by auto-chain from Step 16, for the current `state.cycle`.
Action: Read and execute `.cursor/skills/release/SKILL.md`. The release skill owns its own user interaction (Phase 1 pre-release gate, Phase 2 strategy select, Phase 6 escalation). Autodev does NOT add a wrapping A/B/C gate. Pass cycle context (`cycle: state.cycle`).
After the release skill exits, route on the verdict:
- **Verdict `Released`** → mark Step 16.5 `completed` and auto-chain to Step 17 (Retrospective in cycle-end mode).
- **Verdict `Released-with-override`** → mark Step 16.5 `completed` AND auto-chain to Step 17 (Retrospective in **incident mode**).
- **Verdict `Rolled-Back`** → mark Step 16.5 `failed`. Auto-chain to Step 17 (Retrospective in **incident mode**). The cycle does NOT loop back to Step 9.
- **Verdict `Aborted`** → mark Step 16.5 `not_started` (no live-system change) OR `failed` (live-system touched before abort). Surface the abort reason and STOP. Next `/autodev` invocation re-evaluates Phase B from the failed step.
---
**Step 17 — Retrospective**
State-driven: reached by auto-chain from Step 16, for the current `state.cycle`.
State-driven: reached by auto-chain from Step 16.5 with a `Released`, `Released-with-override`, or `Rolled-Back` verdict, for the current `state.cycle`.
Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. Pass cycle context (`cycle: state.cycle`) so the retro report and LESSONS.md entries record which feature cycle they came from.
Action: Read and execute `.cursor/skills/retrospective/SKILL.md`. Mode selection:
After retrospective completes, mark Step 17 as `completed` and enter "Re-Entry After Completion" evaluation.
- Step 16.5 verdict `Released` → cycle-end mode
- Step 16.5 verdict `Released-with-override` or `Rolled-Back` → incident mode
Pass cycle context (`cycle: state.cycle`) so the retro report and LESSONS.md entries record which feature cycle they came from.
After retrospective completes:
- If Step 16.5 verdict was `Released` or `Released-with-override` → mark Step 17 as `completed` and enter "Re-Entry After Completion" evaluation (loop back to Step 9 for cycle N+1).
- If Step 16.5 verdict was `Rolled-Back` → mark Step 17 as `completed` but do NOT loop back. Surface the incident retro path and STOP.
---
**Re-Entry After Completion**
State-driven: `state.step == done` OR Step 17 (Retrospective) is completed for `state.cycle`.
State-driven: `state.step == done` OR Step 17 (Retrospective) is completed for `state.cycle` AND Step 16.5 verdict was `Released` or `Released-with-override`. A `Rolled-Back` cycle does NOT trigger Re-Entry — the user must explicitly invoke `/autodev` again.
Action: The project completed a full cycle. Print the status banner and automatically loop back to New Task — do NOT ask the user for confirmation:
@@ -312,7 +339,7 @@ Action: The project completed a full cycle. Print the status banner and automati
Set `step: 9`, `status: not_started`, and **increment `cycle`** (`cycle: state.cycle + 1`) in the state file, then auto-chain to Step 9 (New Task). Reset `sub_step` to `phase: 0, name: awaiting-invocation, detail: ""` and `retry_count: 0`.
Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security → Performance → Deploy → Retrospective.
Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New Task → Implement → Run Tests → Test-Spec Sync → Update Docs → Security → Performance → Deploy → Release → Retrospective. The cycle only completes (and loops back to Step 9) on a `Released` or `Released-with-override` verdict; rolled-back or aborted releases stop the cycle.
## Auto-Chain Rules
@@ -340,8 +367,13 @@ Note: the loop (Steps 9 → 17 → 9) ensures every feature cycle includes: New
| Update Docs (13) | Auto-chain → Security Audit choice (14) |
| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
| Deploy (16) | Auto-chain → Retrospective (17) |
| Retrospective (17) | **Cycle complete** — loop back to New Task (9) with incremented cycle counter |
| Deploy (16) | Auto-chain → Release (16.5) |
| Release (16.5, verdict Released) | Auto-chain → Retrospective (17, cycle-end mode) |
| Release (16.5, verdict Released-with-override) | Auto-chain → Retrospective (17, **incident mode**) |
| Release (16.5, verdict Rolled-Back) | Auto-chain → Retrospective (17, **incident mode**); cycle does NOT loop back |
| Release (16.5, verdict Aborted) | STOP — surface abort reason; do not auto-chain |
| Retrospective (17, after Released / Released-with-override) | **Cycle complete** — loop back to New Task (9) with incremented cycle counter |
| Retrospective (17, after Rolled-Back) | Cycle remains incomplete — STOP and surface incident retro path |
## Status Summary — Step List
@@ -377,6 +409,7 @@ Flow-specific slot values:
| 14 | Security Audit | — |
| 15 | Performance Test | — |
| 16 | Deploy | — |
| 16.5 | Release | `DONE (Released | Released-with-override | Rolled-Back | Aborted)` |
| 17 | Retrospective | — |
All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2, 4, 8, 12, 13, 14, 15 additionally accept `SKIPPED`.
@@ -402,5 +435,6 @@ Row rendering format (renders with a phase separator between Step 8 and Step 9):
Step 14 Security Audit [<state token>]
Step 15 Performance Test [<state token>]
Step 16 Deploy [<state token>]
Step 16.5 Release [<state token>]
Step 17 Retrospective [<state token>]
```
+247 -67
View File
@@ -1,6 +1,6 @@
# Greenfield Workflow
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Decompose → Implement → Run Tests → Security Audit (optional) → Performance Test (optional) → Deploy → Retrospective.
Workflow for new projects built from scratch. Flows linearly: Problem → Research → Plan → UI Design (if applicable) → Test Spec → Decompose → Implement + Product Completeness Gate → Code Testability Revision → Decompose Tests → Implement Tests → Run Tests → Test-Spec Sync → Update Docs → Security Audit (optional) → Performance Test (optional) → Deploy → Release → Retrospective.
## Step Reference Table
@@ -8,15 +8,22 @@ Workflow for new projects built from scratch. Flows linearly: Problem → Resear
|------|------|-----------|-------------------|
| 1 | Problem | problem/SKILL.md | Phase 14 |
| 2 | Research | research/SKILL.md | Mode A: Phase 14 · Mode B: Step 08 |
| 3 | Plan | plan/SKILL.md | Step 16 + Final |
| 3 | Plan | plan/SKILL.md | Step 1, 2, 3, 4, 4.5 (ADR Capture), 5, 6 + Final |
| 4 | UI Design | ui-design/SKILL.md | Phase 08 (conditional — UI projects only) |
| 5 | Decompose | decompose/SKILL.md | Step 14 |
| 6 | Implement | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 7 | Run Tests | test-run/SKILL.md | Steps 14 |
| 8 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 9 | Performance Test | test-run/SKILL.md (perf mode) | Steps 15 (optional) |
| 10 | Deploy | deploy/SKILL.md | Step 17 |
| 11 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 14 |
| 5 | Test Spec | test-spec/SKILL.md | Phases 14 |
| 6 | Decompose | decompose/SKILL.md (implementation task decomposition) | Step 1 + Step 1.5 + Step 2 + Step 4 |
| 7 | Implement | implement/SKILL.md | Batch loop + Product Implementation Completeness Gate |
| 8 | Code Testability Revision | refactor/SKILL.md (guided mode) | Phases 07 (conditional) |
| 9 | Decompose Tests | decompose/SKILL.md (tests-only) | Step 1t + Step 3 + Step 4 |
| 10 | Implement Tests | implement/SKILL.md | (batch-driven, no fixed sub-steps) |
| 11 | Run Tests | test-run/SKILL.md | Steps 14 |
| 12 | Test-Spec Sync | test-spec/SKILL.md (cycle-update mode) | Phase 2 + Phase 3 (scoped) |
| 13 | Update Docs | document/SKILL.md (task mode) | Task Steps 05 |
| 14 | Security Audit | security/SKILL.md | Phase 15 (optional) |
| 15 | Performance Test | test-run/SKILL.md (perf mode) | Steps 15 (optional) |
| 16 | Deploy | deploy/SKILL.md | Step 17 |
| 16.5 | Release | release/SKILL.md | Phase 16 |
| 17 | Retrospective | retrospective/SKILL.md (cycle-end mode) | Steps 14 |
## Detection Rules
@@ -80,12 +87,12 @@ If `_docs/02_document/` exists but is incomplete (has some artifacts but no `FIN
---
**Step 4 — UI Design (conditional)**
Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_tasks/todo/` does not exist or has no task files.
Condition (folder fallback): `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
State-driven: reached by auto-chain from Step 3.
Action: Read and execute `.cursor/skills/ui-design/SKILL.md`. The skill runs its own **Applicability Check**, which handles UI project detection and the user's A/B choice. It returns one of:
- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Decompose).
- `outcome: completed` → mark Step 4 as `completed`, auto-chain to Step 5 (Test Spec).
- `outcome: skipped, reason: not-a-ui-project` → mark Step 4 as `skipped`, auto-chain to Step 5.
- `outcome: skipped, reason: user-declined` → mark Step 4 as `skipped`, auto-chain to Step 5.
@@ -93,34 +100,162 @@ The autodev no longer inlines UI detection heuristics — they live in `ui-desig
---
**Step 5 — Decompose**
Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_tasks/todo/` does not exist or has no task files
**Step 5 — Test Spec**
Condition (folder fallback): `_docs/02_document/FINAL_report.md` exists AND `_docs/02_document/architecture.md` exists AND `_docs/02_document/tests/traceability-matrix.md` does not exist.
State-driven: reached by auto-chain from Step 4 (completed or skipped).
Action: Read and execute `.cursor/skills/decompose/SKILL.md`
Action: Read and execute `.cursor/skills/test-spec/SKILL.md`.
This step converts the greenfield problem statement, acceptance criteria, solution, architecture, component docs, and UI design artifacts (if any) into test specifications before implementation begins. The test spec should cover unit, integration, blackbox, and e2e scenarios where those levels are applicable to the project.
---
**Step 6 — Decompose**
Condition: `_docs/02_document/` contains `architecture.md` AND `_docs/02_document/components/` has at least one component AND `_docs/02_document/tests/traceability-matrix.md` exists AND `_docs/02_tasks/todo/` does not exist or has no implementation task files.
Action: Invoke `.cursor/skills/decompose/SKILL.md` for **implementation task decomposition**. The greenfield flow selects the implementation entrypoint before handing off: Bootstrap Structure, Module Layout, Component Task Decomposition, and Cross-Task Verification.
Do not invoke Blackbox Test Task Decomposition from Step 6. Test tasks are intentionally deferred to Step 9 (Decompose Tests) so the first implementation batch stays focused on product functionality and Step 8 can revise testability before test task files exist.
If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it.
---
**Step 6 — Implement**
Condition: `_docs/02_tasks/todo/` contains task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain any `implementation_report_*.md` file
**Step 7 — Implement**
Condition: `_docs/02_tasks/todo/` contains implementation task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/` does not contain a valid product implementation report.
Action: Read and execute `.cursor/skills/implement/SKILL.md`
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Product implementation**.
The implement skill must run its **Product Implementation Completeness Gate** before it writes any final product implementation report. This gate compares completed product task specs, architecture/component promises, and actual source code so scaffold-only implementations cannot advance to Step 8. A final product implementation report without `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` is incomplete and must not be treated as Step 7 completion.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed tasks and continues. The FINAL report filename is context-dependent — see implement skill documentation for naming convention.
For folder fallback, **implementation task files** means task specs that are not test-only specs: exclude `*_test_infrastructure.md` and task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
For folder fallback, a **product implementation report** is any `_docs/03_implementation/implementation_report_*.md` file except `_docs/03_implementation/implementation_report_tests.md` and refactor reports. It is valid for greenfield progression only when:
- the matching `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists,
- that completeness report does not contain unresolved `FAIL` classifications, and
- `_docs/02_tasks/todo/` contains no pending implementation task files.
If a product report exists but any of those validity checks fail, treat product implementation as incomplete and stay in Step 7.
---
**Step 7Run Tests**
Condition (folder fallback): `_docs/03_implementation/` contains an `implementation_report_*.md` file.
State-driven: reached by auto-chain from Step 6.
**Step 8Code Testability Revision**
Condition (folder fallback): `_docs/03_implementation/` contains a valid product implementation report, `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications, `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` does not exist, `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` does not exist, `_docs/03_implementation/implementation_report_tests.md` does not exist, and `_docs/02_tasks/todo/` does not contain test task files.
State-driven: reached by auto-chain from Step 7.
**Purpose**: verify the newly built code can be exercised by the planned tests before writing the test suite. Greenfield code should be testable by design; this step catches accidental hardcoded paths, singletons, direct external service construction, or other implementation choices that would make meaningful tests impossible.
**Scope — MINIMAL, SURGICAL fixes**: this is not a general refactor. It is the smallest set of changes required to make the implemented code runnable under tests.
**Allowed changes** in this phase:
- Replace hardcoded URLs / file paths / credentials / magic numbers with env vars or constructor arguments.
- Extract narrow interfaces for components that need stubbing in tests.
- Add optional constructor parameters for dependency injection; default to the existing behavior so callers do not break.
- Wrap global singletons in thin accessors that tests can override.
- Split a function ONLY when necessary to stub one of its collaborators — do not split for clarity alone.
**NOT allowed** in this phase (defer to a later refactor task):
- Renaming public APIs.
- Moving code between files unless strictly required for isolation.
- Changing algorithms or business logic.
- Restructuring module boundaries or rewriting layers.
Action: Analyze the codebase against the test specs to determine whether the code can be tested as-is.
1. Read `_docs/02_document/tests/traceability-matrix.md` and all test scenario files in `_docs/02_document/tests/`.
2. For each test scenario, check whether the code under test can be exercised in isolation. Look for:
- Hardcoded file paths or directory references
- Hardcoded configuration values (URLs, credentials, magic numbers)
- Global mutable state that cannot be overridden
- Tight coupling to external services without abstraction
- Missing dependency injection or non-configurable parameters
- Direct file system operations without path configurability
- Inline construction of heavy dependencies (models, clients)
3. If ALL scenarios are testable as-is:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` with the scenarios reviewed and outcome "Code is testable — no changes needed"
- Mark Step 8 as `completed` with outcome "Code is testable — no changes needed"
- Auto-chain to Step 9 (Decompose Tests)
4. If testability issues are found:
- Create `_docs/04_refactoring/01-testability-refactoring/`
- Write `list-of-changes.md` in that directory using the refactor skill template (`.cursor/skills/refactor/templates/list-of-changes.md`), with:
- **Mode**: `guided`
- **Source**: `autodev-greenfield-testability-analysis`
- One change entry per testability issue found (change ID, file paths, problem, proposed change, risk, dependencies). Each entry must fit the allowed-changes list above; reject entries that drift into full refactor territory and log them under "Deferred refactor candidates" instead.
- Invoke the refactor skill in **guided mode**: read and execute `.cursor/skills/refactor/SKILL.md` with the `list-of-changes.md` as input
- Phase 3 (Safety Net) is skipped for this testability run because the test suite has not been implemented yet
- After execution, surface `RUN_DIR/testability_changes_summary.md` to the user via the Choose format (accept / request follow-up) before auto-chaining
- Copy or save the accepted summary as `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` so folder fallback can detect Step 8 completion
- Mark Step 8 as `completed`
- Auto-chain to Step 9 (Decompose Tests)
---
**Step 9 — Decompose Tests**
Condition (folder fallback): `_docs/02_document/tests/traceability-matrix.md` exists AND workspace contains source code files AND `_docs/03_implementation/` contains a valid product implementation report AND `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` exists without unresolved `FAIL` classifications AND (`_docs/04_refactoring/01-testability-refactoring/testability_assessment.md` exists OR `_docs/04_refactoring/01-testability-refactoring/testability_changes_summary.md` exists) AND (`_docs/02_tasks/todo/` does not exist or has no test task files) AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 8.
Action: Read and execute `.cursor/skills/decompose/SKILL.md` in **tests-only mode** (pass `_docs/02_document/tests/` as input). The decompose skill will:
1. Run Step 1t (test infrastructure bootstrap)
2. Run Step 3 (blackbox/e2e-capable test task decomposition)
3. Run Step 4 (cross-verification against test coverage)
If `_docs/02_tasks/` subfolders have some task files already, the decompose skill's resumability handles it — it appends test tasks alongside existing completed implementation tasks.
---
**Step 10 — Implement Tests**
Condition (folder fallback): `_docs/02_tasks/todo/` contains test task files AND `_dependencies_table.md` exists AND `_docs/03_implementation/implementation_report_tests.md` does not exist.
State-driven: reached by auto-chain from Step 9.
Action: Invoke `.cursor/skills/implement/SKILL.md` with task selection context **Test implementation**.
The implement skill reads only test tasks from `_docs/02_tasks/todo/` and implements them.
If `_docs/03_implementation/` has batch reports, the implement skill detects completed test tasks and continues.
For folder fallback, **test task files** means `*_test_infrastructure.md` plus task specs whose `**Component**` or `**Epic**` identifies `Blackbox Tests`.
---
**Step 11 — Run Tests**
Condition (folder fallback): `_docs/03_implementation/implementation_report_tests.md` exists.
State-driven: reached by auto-chain from Step 10.
Action: Read and execute `.cursor/skills/test-run/SKILL.md`
Verifies the implemented unit, integration, blackbox, and e2e tests pass before proceeding to spec and documentation sync. This is a hard product gate, not a harness-smoke gate: e2e/blackbox tests must exercise the actual implemented system through public runtime boundaries and compare actual outputs against `_docs/00_problem/input_data/expected_results/results_report.md` or referenced machine-readable expected-result files. Stubs are allowed only for external systems outside the product boundary; missing internal product implementation must fail or block the gate and send the flow back to Implement.
---
**Step 8Security Audit (optional)**
State-driven: reached by auto-chain from Step 7.
**Step 12Test-Spec Sync**
State-driven: reached by auto-chain from Step 11. Requires `_docs/02_document/tests/traceability-matrix.md` to exist — if missing, mark Step 12 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/test-spec/SKILL.md` in **cycle-update mode**. Pass the completed implementation task specs, completed test task specs, and implementation reports as inputs.
The skill appends implementation-learned acceptance criteria, scenarios, and NFR updates to the existing test-spec files without rewriting unaffected sections. If `traceability-matrix.md` is missing, mark Step 12 as `skipped` — the next `/test-spec` full run will regenerate it.
After completion, auto-chain to Step 13 (Update Docs).
---
**Step 13 — Update Docs**
State-driven: reached by auto-chain from Step 12 (completed or skipped). Requires `_docs/02_document/` to contain existing documentation — if missing, mark Step 13 `skipped` (see Action below).
Action: Read and execute `.cursor/skills/document/SKILL.md` in **Task mode**. Pass all completed implementation and test task spec files plus the implementation reports.
The document skill in Task mode updates affected module docs, component docs, system-level docs, and test documentation without redoing full discovery, verification, or problem extraction.
If `_docs/02_document/` does not contain existing docs, mark Step 13 as `skipped`.
After completion, auto-chain to Step 14 (Security Audit).
---
**Step 14 — Security Audit (optional)**
State-driven: reached by auto-chain from Step 13 (completed or skipped).
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run security audit before deploy?`
@@ -128,12 +263,12 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A — catches vulnerabilities before production`
- target-skill: `.cursor/skills/security/SKILL.md`
- next-step: Step 9 (Performance Test)
- next-step: Step 15 (Performance Test)
---
**Step 9 — Performance Test (optional)**
State-driven: reached by auto-chain from Step 8.
**Step 15 — Performance Test (optional)**
State-driven: reached by auto-chain from Step 14 (completed or skipped).
Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Gate") with:
- question: `Run performance/load tests before deploy?`
@@ -141,30 +276,51 @@ Action: Apply the **Optional Skill Gate** (`protocols.md` → "Optional Skill Ga
- option-b-label: `Skip — proceed directly to deploy`
- recommendation: `A or B — base on whether acceptance criteria include latency, throughput, or load requirements`
- target-skill: `.cursor/skills/test-run/SKILL.md` in **perf mode** (the skill handles runner detection, threshold comparison, and its own A/B/C gate on threshold failures)
- next-step: Step 10 (Deploy)
- next-step: Step 16 (Deploy)
---
**Step 10 — Deploy**
State-driven: reached by auto-chain from Step 9 (after Step 9 is completed or skipped).
**Step 16 — Deploy**
State-driven: reached by auto-chain from Step 15 (after Step 15 is completed or skipped).
Action: Read and execute `.cursor/skills/deploy/SKILL.md`.
After the deploy skill completes successfully, mark Step 10 as `completed` and auto-chain to Step 11 (Retrospective).
After the deploy skill completes successfully, mark Step 16 as `completed` and auto-chain to Step 16.5 (Release).
---
**Step 11 — Retrospective**
State-driven: reached by auto-chain from Step 10.
**Step 16.5 — Release**
State-driven: reached by auto-chain from Step 16.
Action: Read and execute `.cursor/skills/retrospective/SKILL.md` in **cycle-end mode**. This closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` and appending the top-3 lessons to `_docs/LESSONS.md`.
Action: Read and execute `.cursor/skills/release/SKILL.md`. The release skill is responsible for selecting the target environment, executing the deploy artifacts, smoke-testing, watching the rollout, and producing a definitive verdict (`Released`, `Released-with-override`, `Rolled-Back`, or `Aborted`).
After retrospective completes, mark Step 11 as `completed` and enter "Done" evaluation.
The release skill has its own internal BLOCKING gates (Phase 1 pre-release gate, Phase 2 strategy select, Phase 6 user confirmation when soft regression escalates). Autodev does NOT add a wrapping A/B/C gate — the release skill owns its own user interaction.
After the release skill exits:
- **Verdict `Released`** → mark Step 16.5 `completed` and auto-chain to Step 17 (Retrospective in cycle-end mode).
- **Verdict `Released-with-override`** → mark Step 16.5 `completed` AND auto-chain to Step 17 (Retrospective in **incident mode**) — the override is itself an incident the retrospective must analyze.
- **Verdict `Rolled-Back`** → mark Step 16.5 `failed`. Auto-chain to Step 17 (Retrospective in **incident mode**). Do NOT consider the project "Done" — the user owns the next move (re-run /implement on a fix branch, re-run /deploy, re-run /release).
- **Verdict `Aborted`** → mark Step 16.5 `not_started` (the release was never started) OR `failed` if the abort came after Phase 3 had already touched the live system. Surface the abort reason and STOP — do not auto-chain to retrospective.
---
**Step 17 — Retrospective**
State-driven: reached by auto-chain from Step 16.5 with a `Released` or `Released-with-override` verdict, OR from a `Rolled-Back` verdict (in incident mode).
Action: Read and execute `.cursor/skills/retrospective/SKILL.md`. Mode selection:
- Step 16.5 verdict `Released` → cycle-end mode
- Step 16.5 verdict `Released-with-override` or `Rolled-Back` → incident mode
The retrospective closes the cycle's feedback loop by folding metrics into `_docs/06_metrics/retro_<date>.md` (or `incident_<date>_release.md` in incident mode) and appending the top-3 lessons to `_docs/LESSONS.md`.
After retrospective completes, mark Step 17 as `completed` and enter "Done" evaluation.
---
**Done**
State-driven: reached by auto-chain from Step 11. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md.)
State-driven: reached by auto-chain from Step 17. (Sanity check: `_docs/04_deploy/` should contain all expected artifacts — containerization.md, ci_cd_pipeline.md, environment_strategy.md, observability.md, deployment_procedures.md, deploy_scripts.md. `_docs/04_release/` should contain at least one `release_<version>_<env>_<timestamp>.md` with a `Released` verdict — or the user has explicitly chosen to handle release outside autodev.)
Action: Report project completion with summary. Then **rewrite the state file** so the next `/autodev` invocation enters the feature-cycle loop in the existing-code flow:
@@ -191,47 +347,71 @@ On the next invocation, Flow Resolution rule 1 reads `flow: existing-code` and r
| Research (2) | Auto-chain → Research Decision (ask user: another round or proceed?) |
| Research Decision → proceed | Auto-chain → Plan (3) |
| Plan (3) | Auto-chain → UI Design detection (4) |
| UI Design (4, done or skipped) | Auto-chain → Decompose (5) |
| Decompose (5) | **Session boundary** — suggest new conversation before Implement |
| Implement (6) | Auto-chain → Run Tests (7) |
| Run Tests (7, all pass) | Auto-chain → Security Audit choice (8) |
| Security Audit (8, done or skipped) | Auto-chain → Performance Test choice (9) |
| Performance Test (9, done or skipped) | Auto-chain → Deploy (10) |
| Deploy (10) | Auto-chain → Retrospective (11) |
| Retrospective (11) | Report completion; rewrite state to existing-code flow, step 9 |
| UI Design (4, done or skipped) | Auto-chain → Test Spec (5) |
| Test Spec (5) | Auto-chain → Decompose (6) |
| Decompose (6) | **Session boundary** — suggest new conversation before Implement |
| Implement (7) | Auto-chain only after Product Implementation Completeness Gate passes → Code Testability Revision (8) |
| Code Testability Revision (8) | Auto-chain → Decompose Tests (9) |
| Decompose Tests (9) | **Session boundary** — suggest new conversation before Implement Tests |
| Implement Tests (10) | Auto-chain → Run Tests (11) |
| Run Tests (11, all pass) | Auto-chain → Test-Spec Sync (12) |
| Test-Spec Sync (12, done or skipped) | Auto-chain → Update Docs (13) |
| Update Docs (13, done or skipped) | Auto-chain → Security Audit choice (14) |
| Security Audit (14, done or skipped) | Auto-chain → Performance Test choice (15) |
| Performance Test (15, done or skipped) | Auto-chain → Deploy (16) |
| Deploy (16) | Auto-chain → Release (16.5) |
| Release (16.5, verdict Released) | Auto-chain → Retrospective (17, cycle-end mode) |
| Release (16.5, verdict Released-with-override) | Auto-chain → Retrospective (17, **incident mode**) |
| Release (16.5, verdict Rolled-Back) | Auto-chain → Retrospective (17, **incident mode**); do NOT enter Done |
| Release (16.5, verdict Aborted) | STOP — surface abort reason; do not auto-chain |
| Retrospective (17) | Report completion; rewrite state to existing-code flow, step 9 |
## Status Summary — Step List
Flow name: `greenfield`. Render using the banner template in `protocols.md` → "Banner Template (authoritative)". No header-suffix, current-suffix, or footer-extras — all empty for this flow.
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|--------------------|--------------------------------------------|
| 1 | Problem | — |
| 2 | Research | `DONE (N drafts)` |
| 3 | Plan | — |
| 4 | UI Design | — |
| 5 | Decompose | `DONE (N tasks)` |
| 6 | Implement | `IN PROGRESS (batch M of ~N)` |
| 7 | Run Tests | `DONE (N passed, M failed)` |
| 8 | Security Audit | — |
| 9 | Performance Test | — |
| 10 | Deploy | — |
| 11 | Retrospective | — |
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|-----------------------------|--------------------------------------------|
| 1 | Problem | — |
| 2 | Research | `DONE (N drafts)` |
| 3 | Plan | — |
| 4 | UI Design | — |
| 5 | Test Spec | — |
| 6 | Decompose | `DONE (N tasks)` |
| 7 | Implement | `IN PROGRESS (batch M of ~N)` |
| 8 | Code Testability Revision | — |
| 9 | Decompose Tests | `DONE (N tasks)` |
| 10 | Implement Tests | `IN PROGRESS (batch M)` |
| 11 | Run Tests | `DONE (N passed, M failed)` |
| 12 | Test-Spec Sync | — |
| 13 | Update Docs | — |
| 14 | Security Audit | — |
| 15 | Performance Test | — |
| 16 | Deploy | — |
| 16.5 | Release | `DONE (Released | Released-with-override | Rolled-Back | Aborted)` |
| 17 | Retrospective | — |
All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 8, 9 additionally accept `SKIPPED`.
All rows also accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4, 12, 13, 14, 15 additionally accept `SKIPPED`.
Row rendering format (step-number column is right-padded to 2 characters for alignment):
```
Step 1 Problem [<state token>]
Step 2 Research [<state token>]
Step 3 Plan [<state token>]
Step 4 UI Design [<state token>]
Step 5 Decompose [<state token>]
Step 6 Implement [<state token>]
Step 7 Run Tests [<state token>]
Step 8 Security Audit [<state token>]
Step 9 Performance Test [<state token>]
Step 10 Deploy [<state token>]
Step 11 Retrospective [<state token>]
Step 1 Problem [<state token>]
Step 2 Research [<state token>]
Step 3 Plan [<state token>]
Step 4 UI Design [<state token>]
Step 5 Test Spec [<state token>]
Step 6 Decompose [<state token>]
Step 7 Implement [<state token>]
Step 8 Code Testability Rev. [<state token>]
Step 9 Decompose Tests [<state token>]
Step 10 Implement Tests [<state token>]
Step 11 Run Tests [<state token>]
Step 12 Test-Spec Sync [<state token>]
Step 13 Update Docs [<state token>]
Step 14 Security Audit [<state token>]
Step 15 Performance Test [<state token>]
Step 16 Deploy [<state token>]
Step 16.5 Release [<state token>]
Step 17 Retrospective [<state token>]
```
+314 -32
View File
@@ -5,7 +5,8 @@ Workflow for **meta-repositories** — repos that aggregate multiple components
This flow differs fundamentally from `greenfield` and `existing-code`:
- **No problem/research/plan phases** — meta-repos don't build features, they coordinate existing ones
- **No test spec / implement / run tests** — the meta-repo has no code to test
- **No test spec / run tests** — the meta-repo has no code to test
- **`implement` is scoped to suite-level work only** — cross-repo concerns, repo/folder renames, suite-root infra additions (e.g., `.gitmodules`, `_infra/`, suite `e2e/`). Per-component implementation lives in each component's own workspace `/autodev` cycle. The meta-repo's implement step (Step 3.5) executes only when `_docs/tasks/todo/` is non-empty AND the user explicitly opts in; placement is **before** the sync skills so subsequent Doc/E2E/CICD sync propagates the post-implementation state.
- **No `_docs/00_problem/` artifacts** — documentation target is `_docs/*.md` unified docs, not per-feature `_docs/NN_feature/` folders
- **Primary artifact is `_docs/_repo-config.yaml`** — generated by `monorepo-discover`, read by every other step
@@ -15,8 +16,11 @@ This flow differs fundamentally from `greenfield` and `existing-code`:
|------|------|-----------|-------------------|
| 1 | Discover | monorepo-discover/SKILL.md | Phase 110 |
| 2 | Config Review | (human checkpoint, no sub-skill) | — |
| 2.5 | Glossary & Architecture Vision | (inline, no sub-skill) | Steps 15 |
| 3 | Status | monorepo-status/SKILL.md | Sections 15 |
| 3.5 | Suite Implement | implement/SKILL.md (suite-level invocation context) | Steps 114 + 16 (Step 14.5 + Step 15 skipped); conditional on `_docs/tasks/todo/` non-empty AND user opt-in |
| 4 | Document Sync | monorepo-document/SKILL.md | Phase 17 (conditional on doc drift) |
| 4.5 | Integration Test Sync | monorepo-e2e/SKILL.md | Phase 16 (conditional on suite-e2e drift; skipped if `suite_e2e:` block absent in config) |
| 5 | CICD Sync | monorepo-cicd/SKILL.md | Phase 17 (conditional on CI drift) |
| 6 | Loop | (auto-return to Step 3 on next invocation) | — |
@@ -58,17 +62,121 @@ Action: This is a **hard session boundary**. The skill cannot proceed until a hu
══════════════════════════════════════
```
- If user picks A → verify `confirmed_by_user: true` is now set in the config. If still `false`, re-ask. If true, auto-chain to **Step 3 (Status)**.
- If user picks A → verify `confirmed_by_user: true` is now set in the config. If still `false`, re-ask. If true, auto-chain to **Step 2.5 (Glossary & Architecture Vision)**.
- If user picks B → mark Step 2 as `in_progress`, update state file, end the session. Tell the user to invoke `/autodev` again after reviewing.
**Do NOT auto-flip `confirmed_by_user`.** Only the human does that.
---
**Step 2.5 — Glossary & Architecture Vision** (one-shot)
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` does NOT exist OR the cross-cutting architecture doc identified in `docs.cross_cutting` does NOT contain a `## Architecture Vision` section).
State-driven: reached by auto-chain from Step 2 (user picked A).
**Goal**: Capture meta-repo-wide terminology and the user's architecture vision **once**, after the config is confirmed but before any sync skill runs. Without this, `monorepo-document` will faithfully propagate per-component changes but never surface a unified mental model of the meta-repo to the user, and the AI will keep re-inferring the same project terminology on every invocation.
**Why inline (no sub-skill)**: `monorepo-discover` is hard-guarded to write only `_repo-config.yaml`; `monorepo-document` only edits *existing* docs. Glossary and architecture-vision creation is a first-time, user-confirmed write that crosses both guarantees, so it lives directly in the flow.
**Inputs**:
- `_docs/_repo-config.yaml` (component list, doc map, conventions, assumptions log)
- Cross-cutting docs listed under `docs.cross_cutting` (existing architecture doc, if any)
- Each component's `primary_doc` (read-only, for terminology + responsibility extraction)
- Root `README.md` if `repo.root_readme` is referenced
**Outputs**:
- `_docs/glossary.md` (or `<docs.root>/glossary.md` if `docs.root``_docs/`) — NEW
- The cross-cutting architecture doc updated in place: a `## Architecture Vision` section is prepended (or merged into an existing "Vision" / "Overview" heading)
- One new entry appended to `_docs/_repo-config.yaml` under `assumptions_log:` recording the run
- A new top-level config entry: `glossary_doc: <path>` so future `monorepo-status` and `monorepo-document` runs treat the glossary as a known cross-cutting doc
**Procedure**:
1. **Draft glossary** from `_repo-config.yaml` + each component's primary doc. Include:
- Component codenames as they appear in the config (`name` field) and any rename pairs the user noted in `unresolved:` resolutions
- Domain terms that recur across ≥2 component docs
- Acronyms / abbreviations
- Convention names from `conventions:` (e.g., commit prefix, deployment tier names)
- Stakeholder personas if cross-cutting docs reference them
Each entry: one-line definition + source (`source: components.<name>.primary_doc` or `source: _repo-config.yaml conventions`). Skip generic terms.
2. **Draft architecture vision** from the meta-repo perspective:
- **One paragraph**: what the system as a whole is, what each component contributes, the runtime topology (one binary / N services / N clients + 1 server / hybrid), how components communicate (REST / gRPC / queue / DB-shared / file-shared)
- **Components & responsibilities** (one-line each), pulled directly from `_repo-config.yaml` `components:` list
- **Cross-cutting concerns ownership**: which doc owns which concern (auth, schema, deployment, etc.) — pulled from `docs.cross_cutting[].owns`
- **Architectural principles / non-negotiables** the user has implied across components (e.g., "all components share a single Postgres", "submodules own their own CI", "deployment is per-tier, not per-component")
- **Open questions / structural drift signals**: components missing from `docs.cross_cutting`, components in registry but not in config (registry mismatch), or contradictions between component primary docs
3. **Present condensed view** to the user (NOT the full draft files):
```
══════════════════════════════════════
REVIEW: Meta-Repo Glossary + Architecture Vision
══════════════════════════════════════
Glossary (N terms drafted from config + component docs):
- <Term>: <one-line definition>
- ...
Architecture Vision — meta-repo level:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Cross-cutting ownership:
- <concern> → <doc>
- ...
Principles / non-negotiables:
- <principle>
- ...
Open questions / drift signals:
- <q1>
- <q2>
══════════════════════════════════════
A) Looks correct — write the files
B) Add / correct entries (provide diffs)
C) Resolve open questions / drift signals first
══════════════════════════════════════
Recommendation: pick C if drift signals exist;
otherwise B if components or principles
don't match your intent; A only when
the inferred vision is exactly right.
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present, loop until A.
- On C → ask the listed open questions in one batch, integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `_docs/glossary.md` (alphabetical) with `**Status**: confirmed-by-user` + date.
- Update the cross-cutting architecture doc identified in `docs.cross_cutting` (or create one at `_docs/00_architecture.md` if none exists and the user's option-B input named one): prepend `## Architecture Vision` with the confirmed paragraph + components + ownership + principles. Preserve every existing H2 below verbatim.
- Append to `_docs/_repo-config.yaml`:
- Top-level `glossary_doc: <path-relative-to-repo-root>` (sibling of `docs.root`)
- New `assumptions_log:` entry: `{ date: <today>, skill: autodev-meta-repo Step 2.5, run_notes: "Captured glossary + architecture vision", assumptions: [...] }`
- Do NOT flip any `confirmed: false` → `confirmed: true` in the config; this step writes its own confirmed artifact, it does not retroactively confirm config inferences.
**Self-verification**:
- [ ] Every glossary entry traces to either the config or a component primary doc
- [ ] Every component listed in the vision matches a `components:` entry in the config
- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
- [ ] The cross-cutting architecture doc still contains every H2 it had before this step
- [ ] User picked option A on the latest condensed view
**Idempotency**: if both `_docs/glossary.md` exists AND the architecture doc already has a `## Architecture Vision` section, this step is **skipped on re-invocation**. To refresh, the user invokes `/autodev` after deleting `glossary.md` (or running `monorepo-discover` with structural changes that justify a re-confirmation).
After completion, auto-chain to **Step 3 (Status)**.
---
**Step 3 — Status**
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true`.
State-driven: reached by auto-chain from Step 2 (user picked A), or entered on any re-invocation after a completed cycle.
Condition (folder fallback): `_docs/_repo-config.yaml` exists AND `confirmed_by_user: true` AND (`_docs/glossary.md` exists OR `glossary_doc:` is recorded in the config).
State-driven: reached by auto-chain from Step 2.5, or entered on any re-invocation after a completed cycle.
Action: Read and execute `.cursor/skills/monorepo-status/SKILL.md`.
@@ -78,11 +186,16 @@ The status report identifies:
- Registry/config mismatches
- Unresolved questions
Based on the report, auto-chain branches:
Based on the report, auto-chain branches in this evaluation order (first match wins):
- If **doc drift** found → auto-chain to **Step 4 (Document Sync)**
- Else if **CI drift** (only) found → auto-chain to **Step 5 (CICD Sync)**
- Else if **registry mismatch** found (new components not in config) → present Choose format:
1. **Registry mismatch** (new components not in config, or config component not in registry) → present the Choose format below FIRST. After the user resolves it (A: refresh discover, B: onboard, C: continue with mismatch acknowledged), proceed to the next rule. This rule has priority because a stale config would mislead Step 3.5's ownership-envelope synthesis and any sync skill's component scope.
2. **Pre-routing gate (Step 3.5 detection)** — check `_docs/tasks/todo/` for suite-level task files (`*.md` excluding files starting with `_`). If ≥1 task is present, auto-chain to **Step 3.5 (Suite Implement)**. After Step 3.5 returns (regardless of A/B outcome), the post-implement re-status applies rules 36 below to the post-implementation state.
3. If **doc drift** found → auto-chain to **Step 4 (Document Sync)**
4. Else if **CI drift** (only) found → auto-chain to **Step 5 (CICD Sync)**
5. Else if **suite-e2e drift** (only) found → auto-chain to **Step 4.5 (Integration Test Sync)** (only when `suite_e2e:` block exists in config)
6. Else → **workflow done for this cycle**.
**Registry mismatch Choose format** (rule 1):
```
══════════════════════════════════════
@@ -99,7 +212,134 @@ Based on the report, auto-chain branches:
══════════════════════════════════════
```
- Else → **workflow done for this cycle**. Report "No drift. Meta-repo is in sync." Loop waits for next invocation.
When rule 6 fires (no drift, no todo tasks), report "No drift. Meta-repo is in sync." and end the cycle. Loop waits for next invocation.
---
**Step 3.5 — Suite Implement**
Condition (folder fallback): `_docs/tasks/todo/` exists AND contains ≥1 file matching `*.md` excluding files starting with `_` (e.g., `_dependencies_table.md` is excluded by convention).
State-driven: reached by auto-chain from Step 3 when the pre-routing gate detected todo tasks. Inserted **before** the sync skills (Step 4 / 4.5 / 5) by deliberate design: implementing renames + cross-repo edits first means the subsequent sync skills propagate the actual landed state rather than the pre-change state, avoiding a second cycle to fix downstream drift.
**Skip condition**: `_docs/tasks/todo/` is empty, missing, or contains only `_*` files. In that case Step 3.5 is skipped entirely and the cycle proceeds with Step 3's existing drift-based routing.
**Goal**: Execute suite-level implementation tasks — cross-repo concerns (e.g., `autopilot` + `ui` + suite `e2e/` cutover in a coordinated change-set), folder renames (e.g., `git mv flights missions` + `.gitmodules` edit + `_infra/` path refs), and suite-root infrastructure additions (e.g., `_infra/dev/docker-compose.dev.yml`). Per-component implementation work stays in each component's own workspace `/autodev` cycle.
**Why this exists**: the meta-repo's existing sync skills (`monorepo-document`, `monorepo-cicd`, `monorepo-e2e`) only **propagate** changes that already landed. They cannot **execute** a task spec. Without Step 3.5, suite-level tickets like AZ-543 (B4 repo rename) or AZ-506 (new dev compose) have no flow path forward — they require operator action outside autodev.
**Inputs**:
- `_docs/tasks/todo/*.md` (excluding `_*`) — task specs in the existing format (`Task` / `Component` / `Dependencies` / `Acceptance criteria` headers)
- `_docs/_repo-config.yaml` — `components[].path` list, used to compute the suite-level OWNED envelope (workspace root EXCLUDING any path under a component's folder)
- `_docs/tasks/_dependencies_table.md` — synthesized by this step if missing (see Procedure)
- `_docs/tasks/_suite_module_layout.md` — synthesized by this step if missing (see Procedure)
**Procedure**:
1. **Detection (already done by Step 3 pre-routing gate)**. List task files in `_docs/tasks/todo/` (excluding `_*`). If 0 → skip Step 3.5. If ≥1 → continue.
2. **Present Choose**:
```
══════════════════════════════════════
DECISION REQUIRED: <N> suite-level task(s) in _docs/tasks/todo/
══════════════════════════════════════
Task(s) detected:
- AZ-XXX: <title> (deps: <list or "—">)
- AZ-YYY: <title> (deps: <list or "—">)
...
A) Run implement skill on these task(s) now (then continue to Doc / E2E / CICD sync)
B) Skip implement this cycle — continue to Doc / E2E / CICD sync without executing tasks
C) Pause — review the tasks before deciding (end session, no state changes)
══════════════════════════════════════
Recommendation: A — running implement BEFORE syncs means subsequent
sync skills propagate the post-implementation state.
B is appropriate when tasks are blocked on user input
or external coordination. C when the tasks themselves
need owner clarification before execution.
══════════════════════════════════════
```
3. **On user A — Pre-flight**:
a. **Working tree clean check**. Run `git status --porcelain`. If non-empty, surface to the user with a Choose A/B/C identical to the implement skill's prerequisite gate (commit/stash manually; agent commits as `chore: WIP pre-implement`; abort).
b. **Synthesize `_docs/tasks/_dependencies_table.md`** if missing. Parse each in-scope task's `Dependencies:` field. Write a minimal table of the form:
```markdown
# Suite-Level Task Dependencies
| Task ID | Depends on | Notes |
|---------|------------|-------|
| AZ-XXX | (none) | — |
| AZ-YYY | AZ-XXX | — |
```
If a task lists a dependency that is neither in `todo/` nor `done/`, log a warning in the synthesized file but do not block — implement skill's Step 1 (Parse) will surface the issue if it actually blocks execution.
c. **Synthesize `_docs/tasks/_suite_module_layout.md`** if missing. Default content:
```markdown
# Suite-Level Module Layout (synthetic)
Generated by autodev meta-repo Step 3.5. The suite root has no per-feature decomposition; ownership is defined at the component-boundary level only.
## Per-Component Mapping
| Component | Owns | Imports from |
|-----------|----------------------------------|--------------|
| suite | (workspace root) excluding any path listed under `_repo-config.yaml.components[].path` | (read-only) every component's primary doc + `_docs/*.md` |
Suite-level tasks operate on: `.gitmodules`, `_infra/**`, `_docs/**` (excluding `_docs/tasks/_*` regenerated files), root `README.md`, `e2e/**` (suite e2e harness only).
Forbidden paths for suite-level tasks: `<component>/**` for every component listed in `_repo-config.yaml.components[].path` — those edits live in the component's own workspace `/autodev` cycle.
```
d. **Prepare invocation context**:
```
suite_level: true
TASKS_DIR: _docs/tasks/
module_layout_path: _docs/tasks/_suite_module_layout.md
```
4. **Invoke implement skill**. Read and execute `.cursor/skills/implement/SKILL.md` with the prepared context. The skill's "Suite-level invocation context" subsection (added in tandem with this flow change) honors the three flags above and skips:
- Step 14.5 (cumulative code review) — no `architecture_compliance_baseline.md` exists at the suite level; cross-task drift is captured by the next `monorepo-status` cycle instead.
- Step 15 (Product Implementation Completeness Gate) — the gate's inputs (`_docs/02_document/architecture.md`, `system-flows.md`, `components/*/description.md`) do not exist in the meta-repo artifact layout. Suite tasks are infrastructure / coordination work, not feature implementation.
All other implement skill steps (114, 16) execute unchanged. Tracker integration (Step 5: In Progress, Step 12: In Testing) runs normally.
5. **Post-implement re-status**. After the implement skill completes (last batch committed, all originally-todo tasks moved to `_docs/tasks/done/`), silently re-run Step 3's drift detection logic — do NOT re-render the full Status report; just re-evaluate the drift signals against the post-implementation tree. Then auto-chain per the post-implementation drift findings:
- Doc drift → Step 4 (Document Sync)
- Suite-e2e drift only → Step 4.5
- CI drift only → Step 5
- No drift → cycle complete
Note: the post-implement re-status is exactly why Step 3.5 is placed before sync. A repo rename will typically introduce doc + CI drift; the next invocation of Step 4 / Step 5 catches it on the same cycle.
6. **On user B (skip)** → mark Step 3.5 `skipped` in state file. Apply Step 3's original drift-based routing (compute from the pre-Step-3.5 Status report).
7. **On user C (pause)** → end session. Update state to `step: 3.5, status: in_progress, sub_step: {phase: 0, name: awaiting-task-review, detail: "<N> tasks pending review"}`. Tell the user to invoke `/autodev` again after deciding. **Do NOT modify any files** — pre-flight has not run yet.
**Self-verification** (executed before invoking implement):
- [ ] Working tree is clean (or user explicitly chose B in the WIP-stash sub-Choose)
- [ ] `_docs/tasks/_dependencies_table.md` exists (synthesized if it didn't)
- [ ] `_docs/tasks/_suite_module_layout.md` exists (synthesized if it didn't)
- [ ] All in-scope task files have a `Component:` field (skip + report any that don't — don't guess ownership)
- [ ] Tracker availability gate satisfied per `protocols.md` (or `tracker: local` previously chosen)
**Failure handling**:
- If implement returns FAILED → standard Failure Handling (`protocols.md`): retry up to 3 times, then escalate.
- If implement is interrupted mid-batch → next invocation re-detects via the implement skill's resumability protocol (read latest `_docs/03_implementation/suite_batch_*.md`). Step 3.5 itself is reentrant: on re-entry, if `todo/` still has tasks, it presents the Choose again with the remaining set.
- **Half-applied state risk** (acknowledged): if implement is interrupted between commits, the working tree is clean at the last commit boundary but the in-flight batch is lost. The user is responsible for inspecting and re-invoking. This is intentional — automated rollback of suite-level renames + `.gitmodules` edits is more dangerous than a human-driven recovery.
**Idempotency**: if `_docs/tasks/todo/` becomes empty after this step (all tasks moved to `done/`), the next `/autodev` invocation skips Step 3.5 entirely and proceeds with normal Status → sync flow.
---
@@ -115,6 +355,28 @@ The skill:
3. Applies doc edits
4. Skips any component with unconfirmed mapping (M5), reports
After completion:
- If the status report ALSO flagged suite-e2e drift → auto-chain to **Step 4.5 (Integration Test Sync)**
- Else if the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
- Else → end cycle, report done
---
**Step 4.5 — Integration Test Sync**
State-driven: reached by auto-chain from Step 3 (when status report flagged suite-e2e drift and no doc drift) or from Step 4 (when both doc and suite-e2e drift were flagged).
**Skip condition**: if `_docs/_repo-config.yaml` has no `suite_e2e:` block, this step is skipped entirely — there's no harness to sync. The status report should not flag suite-e2e drift in that case; if it does, that's a status-skill bug.
Action: Read and execute `.cursor/skills/monorepo-e2e/SKILL.md` with scope = components flagged by status.
The skill:
1. Verifies every path under `suite_e2e.*` exists (binary fixtures excepted — see the skill's Phase 1)
2. Classifies each flagged change against the suite-e2e impact table
3. Applies edits to `e2e/docker-compose.suite-e2e.yml`, `e2e/fixtures/init.sql`, `e2e/fixtures/expected_detections.json` metadata, and `e2e/runner/tests/*.spec.ts` selectors as needed
4. Bumps baseline `fixture_version` with a `-stale` suffix and appends a `_docs/_process_leftovers/` entry whenever the detection model revision changes (binary fixture cannot be regenerated automatically)
5. Reports synced files; does not run the suite e2e itself
After completion:
- If the status report ALSO flagged CI drift → auto-chain to **Step 5 (CICD Sync)**
- Else → end cycle, report done
@@ -123,11 +385,11 @@ After completion:
**Step 5 — CICD Sync**
State-driven: reached by auto-chain from Step 3 (when status report flagged CI drift and no doc drift) or from Step 4 (when both doc and CI drift were flagged).
State-driven: reached by auto-chain from Step 3 (when status report flagged CI drift and no doc/suite-e2e drift), Step 4, or Step 4.5.
Action: Read and execute `.cursor/skills/monorepo-cicd/SKILL.md` with scope = components flagged by status.
After completion, end cycle. Report files updated across both doc and CI sync.
After completion, end cycle. Report files updated across doc, suite-e2e, and CI sync.
---
@@ -156,14 +418,24 @@ After onboarding completes, the config is updated. Auto-chain back to **Step 3 (
| Completed Step | Next Action |
|---------------|-------------|
| Discover (1) | Auto-chain → Config Review (2) |
| Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Status (3) |
| Config Review (2, user picked A, confirmed_by_user: true) | Auto-chain → Glossary & Architecture Vision (2.5) |
| Config Review (2, user picked B) | **Session boundary** — end session, await re-invocation |
| Status (3, doc drift) | Auto-chain → Document Sync (4) |
| Status (3, CI drift only) | Auto-chain → CICD Sync (5) |
| Status (3, no drift) | **Cycle complete** — end session, await re-invocation |
| Glossary & Architecture Vision (2.5) | Auto-chain → Status (3) |
| Status (3, todo tasks present) | Auto-chain → Suite Implement (3.5) — pre-routing gate fires before drift-based routing |
| Status (3, no todo tasks, doc drift) | Auto-chain → Document Sync (4) |
| Status (3, no todo tasks, suite-e2e drift only) | Auto-chain → Integration Test Sync (4.5) |
| Status (3, no todo tasks, CI drift only) | Auto-chain → CICD Sync (5) |
| Status (3, no todo tasks, no drift) | **Cycle complete** — end session, await re-invocation |
| Status (3, registry mismatch) | Ask user (A: discover, B: onboard, C: continue) |
| Document Sync (4) + CI drift pending | Auto-chain → CICD Sync (5) |
| Document Sync (4) + no CI drift | **Cycle complete** |
| Suite Implement (3.5, user picked A, success) | Silent re-status; auto-chain per post-implementation drift (Step 4 / 4.5 / 5 / cycle complete) |
| Suite Implement (3.5, user picked B) | Mark `skipped`; auto-chain per Step 3's original drift findings |
| Suite Implement (3.5, user picked C) | **Session boundary** — end session, await re-invocation |
| Suite Implement (3.5, FAILED ×3) | Standard Failure Handling escalation (`protocols.md`) |
| Document Sync (4) + suite-e2e drift pending | Auto-chain → Integration Test Sync (4.5) |
| Document Sync (4) + CI drift only pending | Auto-chain → CICD Sync (5) |
| Document Sync (4) + no further drift | **Cycle complete** |
| Integration Test Sync (4.5) + CI drift pending | Auto-chain → CICD Sync (5) |
| Integration Test Sync (4.5) + no CI drift | **Cycle complete** |
| CICD Sync (5) | **Cycle complete** |
## Status Summary — Step List
@@ -178,30 +450,40 @@ Flow-specific slot values:
Config: _docs/_repo-config.yaml [confirmed_by_user: <true|false>, last_updated: <date>]
```
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|------------------|--------------------------------------------|
| 1 | Discover | — |
| 2 | Config Review | `IN PROGRESS (awaiting human)` |
| 3 | Status | `DONE (no drift)`, `DONE (N drifts)` |
| 4 | Document Sync | `DONE (N docs)`, `SKIPPED (no doc drift)` |
| 5 | CICD Sync | `DONE (N files)`, `SKIPPED (no CI drift)` |
| # | Step Name | Extra state tokens (beyond the shared set) |
|---|------------------------------------|--------------------------------------------|
| 1 | Discover | — |
| 2 | Config Review | `IN PROGRESS (awaiting human)` |
| 2.5 | Glossary & Architecture Vision | `SKIPPED (already captured)` |
| 3 | Status | `DONE (no drift)`, `DONE (N drifts)` |
| 3.5 | Suite Implement | `DONE (N tasks)`, `SKIPPED (no todo tasks)`, `SKIPPED (user picked B)`, `IN PROGRESS (batch M of ~N)`, `IN PROGRESS (awaiting-task-review)` |
| 4 | Document Sync | `DONE (N docs)`, `SKIPPED (no doc drift)` |
| 4.5 | Integration Test Sync | `DONE (N files)`, `SKIPPED (no suite-e2e drift)`, `SKIPPED (no suite_e2e config block)` |
| 5 | CICD Sync | `DONE (N files)`, `SKIPPED (no CI drift)` |
All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 4 and 5 additionally accept `SKIPPED`.
All rows accept the shared state tokens (`DONE`, `IN PROGRESS`, `NOT STARTED`, `FAILED (retry N/3)`); rows 2.5, 3.5, 4, 4.5, and 5 additionally accept `SKIPPED`.
Row rendering format:
```
Step 1 Discover [<state token>]
Step 2 Config Review [<state token>]
Step 3 Status [<state token>]
Step 4 Document Sync [<state token>]
Step 5 CICD Sync [<state token>]
Step 1 Discover [<state token>]
Step 2 Config Review [<state token>]
Step 2.5 Glossary & Architecture Vision [<state token>]
Step 3 Status [<state token>]
Step 3.5 Suite Implement [<state token>]
Step 4 Document Sync [<state token>]
Step 4.5 Integration Test Sync [<state token>]
Step 5 CICD Sync [<state token>]
```
## Notes for the meta-repo flow
- **No session boundary except Step 2**: unlike existing-code flow (which has boundaries around decompose), meta-repo flow only pauses at config review. Syncing is fast enough to complete in one session.
- **Session boundaries**: Step 2 (Config Review pending), Step 2.5 (one-shot glossary/vision review), and Step 3.5 (when user picks C "Pause"). Step 3.5's A/B picks do NOT cross a session boundary — they auto-chain to syncs in the same session.
- **Cyclical, not terminal**: no "done forever" state. Each invocation completes a drift cycle; next invocation starts fresh.
- **No tracker integration**: this flow does NOT create Jira/ADO tickets. Maintenance is not a feature — if a feature-level ticket spans the meta-repo's concerns, it lives in the per-component workspace.
- **Tracker integration scope**: this flow does NOT create Jira/ADO tickets in its sync skills (Status / Document Sync / E2E / CICD). Step 3.5 (Suite Implement) IS tracker-integrated — it transitions existing tickets In Progress → In Testing per the implement skill's standard tracker handling. Suite-level tickets are authored manually by the operator (typically as children of an Epic that spans multiple components, like AZ-539); the flow doesn't auto-create them.
- **Per-component vs. suite-level work**:
- Tickets that touch component source code (`<component>/src/**`) belong in that component's own workspace `/autodev` cycle. The meta-repo flow does NOT execute them.
- Tickets that touch suite-root paths only (`.gitmodules`, `_infra/**`, suite `e2e/**`, root `README.md`, suite `_docs/**` outside `tasks/_*`) are eligible for Step 3.5.
- Tickets that span both (e.g., AZ-550 B11 consumer cutover, which touches `autopilot/`, `ui/`, AND suite `e2e/`) are NOT executable from a single workspace by design — split the ticket so the suite-level slice can run in Step 3.5 and the component slices run in their owning workspaces.
- **Onboarding is opt-in**: never auto-onboarded. User must explicitly request.
- **Failure handling**: uses the same retry/escalation protocol as other flows (see `protocols.md`).
+6 -4
View File
@@ -110,9 +110,11 @@ Before entering a step from this table for the first time in a session, verify t
| Flow | Step | Sub-Step | Tracker Action |
|------|------|----------|----------------|
| greenfield | Plan | Step 6 — Epics | Create epics for each component |
| greenfield | Decompose | Step 1 + Step 2 + Step 3All tasks | Create ticket per task, link to epic |
| greenfield | Decompose | Implementation decomposition Step 1 + Step 2Product tasks | Create ticket per product task, link to epic |
| greenfield | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | Decompose Tests | Step 1t + Step 3 — All test tasks | Create ticket per task, link to epic |
| existing-code | New Task | Step 7 — Ticket | Create ticket per task, link to epic |
| meta-repo | Suite Implement | Step 3.5 — implement skill Step 5 / Step 12 | Transition existing tickets In Progress → In Testing per implement skill (does NOT create new tickets — operator authors them) |
### State File Marker
@@ -138,7 +140,7 @@ One retry ladder covers all failure modes: explicit failure returned by a sub-sk
Treat the sub-skill as **failed** when ANY of the following is observed:
- The sub-skill explicitly returns a failed result (including blocked subagents, auto-fix loop exhaustion, prerequisite violations).
- The sub-skill explicitly returns a failed result (including blocked tasks, auto-fix loop exhaustion, prerequisite violations).
- **Stuck signals**: the same artifact is rewritten 3+ times without meaningful change; the sub-skill re-asks a question that was already answered; no new artifact has been saved despite active execution.
### Retry ladder
@@ -291,7 +293,7 @@ For steps that produce `_docs/` artifacts (problem, research, plan, decompose, d
## Debug Protocol
When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or an implementer subagent reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)
When the implement skill's auto-fix loop fails (code review FAIL after 2 auto-fix attempts) or a task reports a blocker, the user is asked to intervene. This protocol guides the debugging process. (Retry budget and escalation are covered by Failure Handling above; this section is about *how* to diagnose once the user has been looped in.)
### Structured Debugging Workflow
@@ -387,7 +389,7 @@ The banner shell is defined here once. Each flow file contributes only its step-
where `<state token>` comes from the state-token set defined per row in the flow's step-list table.
- `<current-suffix>` — optional, flow-specific. The existing-code flow appends ` (cycle <N>)` when `state.cycle > 1`; other flows leave it empty.
- `Retry:` row — omit entirely when `retry_count` is 0. Include it with `<N>/3` otherwise.
- `<footer-extras>` — optional, flow-specific. The meta-repo flow adds a `Config:` line with `_docs/_repo-config.yaml` state; other flows leave it empty.
- `<footer-extras>` — optional, flow-specific. The meta-repo flow adds a `Config:` line with `_docs/_repo-config.yaml` state; other flows leave it empty unless **parent suite docs** apply: if `<workspace-root>/../docs` exists and is a directory, append `Suite docs (parent): <absolute path>` on its own line (or `Suite docs (parent): absent` is **not** required — omit when missing). This line is orthogonal to flow-specific footer lines; both may appear.
### State token set (shared)
+15 -2
View File
@@ -13,7 +13,7 @@ The autodev persists its position to `_docs/_autodev_state.md`. This is a lightw
## Current Step
flow: [greenfield | existing-code | meta-repo]
step: [1-11 for greenfield, 1-17 for existing-code, 1-6 for meta-repo, or "done"]
step: [1-17 for greenfield (incl. fractional 16.5), 1-17 for existing-code (incl. fractional 16.5), 1-6 for meta-repo (incl. fractional 2.5 and 3.5), or "done"]
name: [step name from the active flow's Step Reference Table]
status: [not_started / in_progress / completed / skipped / failed]
sub_step:
@@ -82,6 +82,19 @@ retry_count: 0
cycle: 1
```
```
flow: meta-repo
step: 3.5
name: Suite Implement
status: in_progress
sub_step:
phase: 7
name: batch-loop
detail: "AZ-543 batch 1 of 1; suite-level"
retry_count: 0
cycle: 1
```
```
flow: existing-code
step: 10
@@ -100,7 +113,7 @@ cycle: 3
1. **Create** on the first autodev invocation (after state detection determines Step 1)
2. **Update** after every change — this includes: batch completion, sub-step progress, step completion, session boundary, failed retry, or any meaningful state transition. The state file must always reflect the current reality.
3. **Read** as the first action on every invocation — before folder scanning
4. **Cross-check**: verify against actual `_docs/` folder contents. If they disagree, trust the folder structure and update the state file
4. **Cross-check**: verify against actual `_docs/` folder contents. If they disagree, trust the folder structure and update the state file. **Parent suite `docs/`**: on every invocation, also probe `<workspace-root>/../docs` (the parent directorys `docs` folder — typical suite-level shared documentation next to a component repo). If it exists, mention it in the Status Summary footer per `protocols.md`; use it only as supplemental reading context unless a flow step explicitly ties detection to it. It never replaces workspace `_docs/` for step detection by default.
5. **Never delete** the state file
6. **Retry tracking**: increment `retry_count` on each failed auto-retry; reset to `0` on success. If `retry_count` reaches 3, set `status: failed`
7. **Failed state on re-entry**: if `status: failed` with `retry_count: 3`, do NOT auto-retry — present the issue to the user first
+12 -6
View File
@@ -2,7 +2,7 @@
name: code-review
description: |
Multi-phase code review against task specs with structured findings output.
6-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency.
7-phase workflow: context loading, spec compliance, code quality, security quick-scan, performance scan, cross-task consistency, architecture compliance.
Produces a structured report with severity-ranked findings and a PASS/FAIL/PASS_WITH_WARNINGS verdict.
Invoked by /implement skill after each batch, or manually.
Trigger phrases:
@@ -106,11 +106,12 @@ When multiple tasks were implemented in the same batch:
## Phase 7: Architecture Compliance
Verify the implemented code respects the architecture documented in `_docs/02_document/architecture.md` and the component boundaries declared in `_docs/02_document/module-layout.md`.
Verify the implemented code respects the architecture documented in `_docs/02_document/architecture.md`, the component boundaries declared in `_docs/02_document/module-layout.md`, and the **accepted Architectural Decision Records** under `_docs/02_document/adr/`.
**Inputs**:
- `_docs/02_document/architecture.md` — layering, allowed dependencies, patterns
- `_docs/02_document/module-layout.md` — per-component directories, Public API surface, `Imports from` lists, Allowed Dependencies table
- `_docs/02_document/adr/` — every `Status: Accepted` ADR is an enforceable structural rule. `Status: Proposed`, `Status: Deprecated`, and `Status: Superseded` ADRs are NOT enforced (Proposed = not yet ratified; Deprecated/Superseded = a later ADR overturned it). If the directory does not exist or has only the index file, ADRs are skipped — log this skip in the report so the absence is visible.
- The cumulative list of changed files (for per-batch invocation) or the full codebase (for baseline invocation)
**Checks**:
@@ -125,6 +126,11 @@ Verify the implemented code respects the architecture documented in `_docs/02_do
5. **Cross-cutting concerns not locally re-implemented**: if a file under a component directory contains logic that should live in `shared/<concern>/` (e.g., custom logging setup, config loader, error envelope), flag it. Severity: Medium. Category: Architecture.
6. **ADR compliance**: for each `Status: Accepted` ADR, confirm the changed code does not contradict the ADR's `Decision`. Two failure modes are flagged:
- **ADR-Violation**: the changed code does the opposite of an Accepted ADR's `Decision`. Example: ADR-002 says "We will use Postgres for transactional data" and the changed code introduces a SQLite dependency for a transactional path. Severity: **Critical**. Category: Architecture. The finding cites the ADR by `NNN_<slug>` and the offending file/line.
- **ADR-Drift**: the changed code does something the ADR did not anticipate AND that materially affects the ADR's `Consequences` (positive or negative). Example: ADR-004 says "Event-driven cross-component comms" and a changed file introduces a new synchronous HTTP call between two components. Severity: **High**. Category: Architecture. The finding either proposes "Update ADR-NNN to acknowledge the new pattern" or "Remove the drift to align with ADR-NNN" — never silently accepts.
The check skips ADRs that are explicitly out of scope of the changed batch (e.g., ADR-001 about deployment pipeline when the batch only touches business-logic files). Use the ADR's `Evidence` section to determine scope: if no Evidence path overlaps with any changed file, skip the ADR for this batch.
**Detection approach (per language)**:
- Python: parse `import` / `from ... import` statements; optionally AST with `ast` module for reliable symbol resolution.
@@ -197,7 +203,7 @@ Produce a structured report with findings deduplicated and sorted by severity:
Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architecture
`Architecture` findings come from Phase 7. They indicate layering violations, Public API bypasses, new cyclic dependencies, duplicate symbols, or cross-cutting concerns re-implemented locally.
`Architecture` findings come from Phase 7. They indicate layering violations, Public API bypasses, new cyclic dependencies, duplicate symbols, cross-cutting concerns re-implemented locally, **ADR-Violation** (changed code contradicts an `Accepted` ADR's Decision — Critical), or **ADR-Drift** (changed code introduces a pattern that materially affects an `Accepted` ADR's Consequences without superseding it — High).
## Verdict Logic
@@ -209,7 +215,7 @@ Bug, Spec-Gap, Security, Performance, Maintainability, Style, Scope, Architectur
The `/implement` skill invokes this skill after each batch completes:
1. Collects changed files from all implementer agents in the batch
1. Collects changed files from all tasks implemented in the batch
2. Passes task spec paths + changed files to this skill
3. If verdict is FAIL — presents findings to user (BLOCKING), user fixes or confirms
4. If verdict is PASS or PASS_WITH_WARNINGS — proceeds automatically (findings shown as info)
@@ -221,7 +227,7 @@ The `/implement` skill invokes this skill after each batch completes:
| Input | Type | Source | Required |
|-------|------|--------|----------|
| `task_specs` | list of file paths | Task `.md` files from `_docs/02_tasks/todo/` for the current batch | Yes |
| `changed_files` | list of file paths | Files modified by implementer agents (from `git diff` or agent reports) | Yes |
| `changed_files` | list of file paths | Files modified by the tasks in the batch (from `git diff`) | Yes |
| `batch_number` | integer | Current batch number (for report naming) | Yes |
| `project_restrictions` | file path | `_docs/00_problem/restrictions.md` | If exists |
| `solution_overview` | file path | `_docs/01_solution/solution.md` | If exists |
@@ -232,7 +238,7 @@ The implement skill invokes code-review by:
1. Reading `.cursor/skills/code-review/SKILL.md`
2. Providing the inputs above as context (read the files, pass content to the review phases)
3. Executing all 6 phases sequentially
3. Executing all 7 phases sequentially
4. Consuming the verdict from the output
### Outputs (returned to the implement skill)
+44 -24
View File
@@ -2,8 +2,8 @@
name: decompose
description: |
Decompose planned components into atomic implementable tasks with bootstrap structure plan.
4-step workflow: bootstrap structure plan, component task decomposition, blackbox test task decomposition, and cross-task verification.
Supports full decomposition (_docs/ structure), single component mode, and tests-only mode.
Workflow entrypoints: implementation task decomposition, single component decomposition, and tests-only decomposition.
The invoking flow decides which entrypoint to run; this skill executes that selected sequence.
Trigger phrases:
- "decompose", "decompose features", "feature decomposition"
- "task decomposition", "break down components"
@@ -20,7 +20,7 @@ Decompose planned components into atomic, implementable task specs with a bootst
## Core Principles
- **Atomic tasks**: each task does one thing; if it exceeds 8 complexity points, split it
- **Atomic tasks**: each task does one thing; if it exceeds 5 complexity points, split it
- **Behavioral specs, not implementation plans**: describe what the system should do, not how to build it
- **Flat structure**: all tasks are tracker-ID-prefixed files in TASKS_DIR — no component subdirectories
- **Save immediately**: write artifacts to disk after each task; never accumulate unsaved work
@@ -30,14 +30,15 @@ Decompose planned components into atomic, implementable task specs with a bootst
## Context Resolution
Determine the operating mode based on invocation before any other logic runs.
Resolve the selected entrypoint from the invocation context before any other logic runs. The caller decides whether this is implementation, single component, or tests-only decomposition; this skill only executes the selected sequence.
**Default** (no explicit input file provided):
**Implementation task decomposition** (default; selected by flows before invoking this skill):
- DOCUMENT_DIR: `_docs/02_document/`
- TASKS_DIR: `_docs/02_tasks/`
- TASKS_TODO: `_docs/02_tasks/todo/`
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, DOCUMENT_DIR
- Produces only implementation tasks. Blackbox/e2e test task files are produced only when the invoking flow selects tests-only decomposition.
**Single component mode** (provided file is within `_docs/02_document/` and inside a `components/` subdirectory):
@@ -55,24 +56,25 @@ Determine the operating mode based on invocation before any other logic runs.
- TESTS_DIR: `DOCUMENT_DIR/tests/`
- Reads from: `_docs/00_problem/`, `_docs/01_solution/`, TESTS_DIR
Announce the detected mode and resolved paths to the user before proceeding.
Announce the selected entrypoint and resolved paths to the user before proceeding.
### Step Applicability by Mode
| Step | File | Default | Single | Tests-only |
|------|------|:-------:|:------:|:----------:|
| Step | File | Implementation | Single | Tests-only |
|------|------|:--------------:|:------:|:----------:|
| 1 Bootstrap Structure | `steps/01_bootstrap-structure.md` | ✓ | — | — |
| 1t Test Infrastructure | `steps/01t_test-infrastructure.md` | — | — | ✓ |
| 1.5 Module Layout | `steps/01-5_module-layout.md` | ✓ | — | — |
| 1.7 System-Pipeline Tasks | `steps/01-7_system-pipeline-tasks.md` | ✓ | — | — |
| 2 Task Decomposition | `steps/02_task-decomposition.md` | ✓ | ✓ | — |
| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | | — | ✓ |
| 3 Blackbox Test Tasks | `steps/03_blackbox-test-decomposition.md` | | — | ✓ |
| 4 Cross-Verification | `steps/04_cross-verification.md` | ✓ | — | ✓ |
## Input Specification
### Required Files
**Default:**
**Implementation task decomposition:**
| File | Purpose |
|------|---------|
@@ -80,10 +82,11 @@ Announce the detected mode and resolved paths to the user before proceeding.
| `_docs/00_problem/restrictions.md` | Constraints and limitations |
| `_docs/00_problem/acceptance_criteria.md` | Measurable acceptance criteria |
| `_docs/01_solution/solution.md` | Finalized solution |
| `DOCUMENT_DIR/architecture.md` | Architecture from plan skill |
| `DOCUMENT_DIR/architecture.md` | Architecture from plan/document skill (must contain a `## Architecture Vision` H2 — confirmed user intent) |
| `DOCUMENT_DIR/glossary.md` | Project terminology (confirmed by user in plan Phase 2a.0 or document Step 4.5). Use it to keep task names, component references, and AC wording consistent with the user's vocabulary |
| `DOCUMENT_DIR/system-flows.md` | System flows from plan skill |
| `DOCUMENT_DIR/components/[##]_[name]/description.md` | Component specs from plan skill |
| `DOCUMENT_DIR/tests/` | Blackbox test specs from plan skill |
| `DOCUMENT_DIR/tests/` | Optional product acceptance context from test-spec skill; do not create test task files from it in this entrypoint |
**Single component mode:**
@@ -110,7 +113,7 @@ Announce the detected mode and resolved paths to the user before proceeding.
### Prerequisite Checks (BLOCKING)
**Default:**
**Implementation task decomposition:**
1. DOCUMENT_DIR contains `architecture.md` and `components/`**STOP if missing**
2. Create TASKS_DIR and TASKS_TODO if they do not exist
@@ -144,6 +147,8 @@ TASKS_DIR/
**Naming convention**: Each task file is initially saved in `TASKS_TODO/` with a temporary numeric prefix (`[##]_[short_name].md`). After creating the work item ticket, rename the file to use the work item ticket ID as prefix (`[TRACKER-ID]_[short_name].md`). For example: `todo/01_initial_structure.md``todo/AZ-42_initial_structure.md`.
If tracker availability fails, follow `.cursor/rules/tracker.mdc` before continuing. Only when the user explicitly chooses `tracker: local` may the numeric prefix remain; in that mode set `Tracker: pending` and `Epic: pending` in the task header and keep the task eligible for later tracker sync.
### Save Timing
| Step | Save immediately after | Filename |
@@ -165,11 +170,11 @@ If TASKS_DIR subfolders already contain task files:
## Progress Tracking
At the start of execution, create a TodoWrite with all applicable steps for the detected mode (see Step Applicability table). Update status as each step/component completes.
At the start of execution, create a TodoWrite with all applicable steps for the selected entrypoint (see Step Applicability table). Update status as each step/component completes.
## Workflow
### Step 1: Bootstrap Structure Plan (default mode only)
### Step 1: Bootstrap Structure Plan (implementation mode only)
Read and follow `steps/01_bootstrap-structure.md`.
@@ -181,25 +186,39 @@ Read and follow `steps/01t_test-infrastructure.md`.
---
### Step 1.5: Module Layout (default mode only)
### Step 1.5: Module Layout (implementation mode only)
Read and follow `steps/01-5_module-layout.md`.
---
### Step 2: Task Decomposition (default and single component modes)
### Step 1.7: System-Pipeline Tasks (implementation mode only)
Read and follow `steps/01-7_system-pipeline-tasks.md`.
This step exists because per-component task decomposition (Step 2)
produces one task per component but NEVER produces a task whose
deliverable is "the production code that drives the end-to-end
pipeline by calling each component in order against real inputs".
The architecture document describes the loop; nobody owns it. The
GPS-passthrough incident (May 2026) is the canonical failure this
step prevents.
---
### Step 2: Task Decomposition (implementation and single component modes)
Read and follow `steps/02_task-decomposition.md`.
---
### Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
### Step 3: Blackbox Test Task Decomposition (tests-only mode only)
Read and follow `steps/03_blackbox-test-decomposition.md`.
---
### Step 4: Cross-Task Verification (default and tests-only modes)
### Step 4: Cross-Task Verification (implementation and tests-only modes)
Read and follow `steps/04_cross-verification.md`.
@@ -207,7 +226,7 @@ Read and follow `steps/04_cross-verification.md`.
- **Coding during decomposition**: this workflow produces specs, never code
- **Over-splitting**: don't create many tasks if the component is simple — 1 task is fine
- **Tasks exceeding 8 points**: split them; no task should be too complex for a single implementer
- **Tasks exceeding 5 points**: split them; no task should be too complex for a single implementer
- **Cross-component tasks**: each task belongs to exactly one component
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Creating git branches**: branch creation is an implementation concern, not a decomposition one
@@ -220,7 +239,7 @@ Read and follow `steps/04_cross-verification.md`.
| Situation | Action |
|-----------|--------|
| Ambiguous component boundaries | ASK user |
| Task complexity exceeds 8 points after splitting | ASK user |
| Task complexity exceeds 5 points after splitting | ASK user |
| Missing component specs in DOCUMENT_DIR | ASK user |
| Cross-component dependency conflict | ASK user |
| Tracker epic not found for a component | ASK user for Epic ID |
@@ -232,15 +251,16 @@ Read and follow `steps/04_cross-verification.md`.
┌────────────────────────────────────────────────────────────────┐
│ Task Decomposition (Multi-Mode) │
├────────────────────────────────────────────────────────────────┤
│ CONTEXT: Resolve mode (default / single component / tests-only) │
│ CONTEXT: Invoke the selected entrypoint (implementation / single / tests-only) │
│ │
DEFAULT MODE:
IMPLEMENTATION TASK DECOMPOSITION:
│ 1. Bootstrap Structure → steps/01_bootstrap-structure.md │
│ [BLOCKING: user confirms structure] │
│ 1.5 Module Layout → steps/01-5_module-layout.md │
│ [BLOCKING: user confirms layout] │
│ 1.7 System-Pipeline → steps/01-7_system-pipeline-tasks.md │
│ [BLOCKING: user confirms pipeline owners] │
│ 2. Component Tasks → steps/02_task-decomposition.md │
│ 3. Blackbox Tests → steps/03_blackbox-test-decomposition.md │
│ 4. Cross-Verification → steps/04_cross-verification.md │
│ [BLOCKING: user confirms dependencies] │
│ │
@@ -16,7 +16,8 @@
3. Each component owns ONE top-level directory. Shared code goes under `<root>/shared/` (or language equivalent).
4. Public API surface = files in the layout's `public:` list for each component; everything else is internal and MUST NOT be imported from other components.
5. Cross-cutting concerns (logging, error handling, config, telemetry, auth middleware, feature flags, i18n) each get ONE entry under Shared / Cross-Cutting; per-component tasks consume them (see Step 2 cross-cutting rule).
6. Write `_docs/02_document/module-layout.md` using `templates/module-layout.md` format.
6. **ADR cross-check**: if `_docs/02_document/adr/` exists, read every `Status: Accepted` ADR. For each, confirm the proposed module layout does not contradict the ADR's `Decision` (e.g., an ADR mandating an event-bus boundary between two components must show up as a `Imports from` exclusion in the layout; an ADR locking a layering style must show up in the Layering table). If an ADR conflicts with the language-conventional layout from step 2, the ADR wins — record the conflict in a `## ADR-driven exceptions to the conventional layout` section of `module-layout.md` with `See ADR NNN_<slug>` references. If the ADR conflict is irreconcilable (the ADR demands something the language genuinely cannot express), STOP and ask the user A/B/C: (A) update the ADR via plan Step 4.5 supersede flow, (B) accept a layered exception with documented rationale, (C) re-open architecture.
7. Write `_docs/02_document/module-layout.md` using `templates/module-layout.md` format. Each Per-Component Mapping entry that is governed by an ADR includes a trailing `> See ADR NNN_<slug>` line.
## Self-verification
@@ -26,6 +27,8 @@
- [ ] No component's `Imports from` list points at a higher layer
- [ ] Paths follow the detected language's convention
- [ ] No two components own overlapping paths
- [ ] If `_docs/02_document/adr/` exists with Accepted ADRs, every layout decision that an ADR governs has a trailing `> See ADR NNN_<slug>` reference
- [ ] No Accepted ADR is contradicted by the layout without a documented exception
## Save action
@@ -0,0 +1,72 @@
# Step 1.7: System-Pipeline Tasks (implementation mode only)
**Role**: Professional software architect, integration-focused.
**Goal**: For every end-to-end pipeline named in `_docs/02_document/architecture.md` and `_docs/02_document/system-flows.md`, ensure there is exactly ONE explicit task that owns the production code that drives that pipeline against real inputs. This step prevents the failure mode where every individual component is "complete" but no production code wires them together (May 2026 GPS-passthrough incident — see `meta-rule.mdc` "When a test reveals missing production code").
**Constraints**:
- This step produces *integration* tasks, not per-component tasks. Per-component tasks come from Step 2.
- An integration task's owner is typically the composition root, runtime root, main loop, or whichever component the module layout (Step 1.5) names as the "system spine". It is NEVER a leaf component.
- Each integration task must be sized at 5 points or fewer. If the pipeline is too large for one task, split it into per-stage integration tasks (e.g. "wire ingress → C1", then "wire C1 → C5") rather than one giant task.
## Inputs
| File | Purpose |
|------|---------|
| `_docs/02_document/architecture.md` | Source of named end-to-end pipelines and their component sequences |
| `_docs/02_document/system-flows.md` | Source of operational flows (per-frame loop, request lifecycle, batch job, etc.) |
| `_docs/02_document/module-layout.md` | Produced by Step 1.5. Names the "system spine" component(s) — typically `runtime_root`, `app`, `main`, `composition`, or equivalent. |
| `_docs/02_document/components/*/description.md` | Per-component contracts so you can tell which side of a seam each method lives on |
## Steps
1. **Enumerate end-to-end pipelines.** Read `architecture.md` and `system-flows.md`. For each named pipeline / flow that spans 2+ components, record:
- The pipeline name (e.g. "per-frame nav loop", "tile-cache build", "operator pre-flight verification").
- The ordered sequence of components it touches (e.g. `frame_source → c1_vio → c2_vpr → ... → c5_state → replay_sink`).
- The trigger (per-frame, per-request, scheduled, manual).
- The output (what the pipeline emits and to whom).
2. **For each pipeline, locate the owner.** Use `module-layout.md` to find the component that owns the orchestration (the "spine"). If `module-layout.md` does not name one, STOP and ASK the user which component owns the pipeline. Do NOT silently default to the bootstrap structure task — bootstrap is about project skeleton, not behavior.
3. **Check whether the pipeline is already covered by an existing task spec or by the bootstrap-structure task.** A pipeline is "covered" only if:
- A task spec's `Outcome` or `Acceptance Criteria` section explicitly names "drives the {pipeline_name} end-to-end against real production components", AND
- That task's owned files include the orchestration code (typically the spine component's main loop / entrypoint).
4. **For every uncovered pipeline, create a system-integration task spec** in `_docs/02_tasks/todo/` using `.cursor/skills/decompose/templates/task.md`:
- **Component**: the spine component from step 2 (e.g. `runtime_root`).
- **Outcome**: the production callsite that drives the pipeline exists and runs end-to-end on real inputs.
- **Scope / Included**: the orchestration code (loop body, dispatcher, scheduler, entrypoint); explicit list of every component it must call in order; the data type at each seam.
- **Acceptance Criteria** (write each as testable):
- At least one production caller of every component method in the pipeline can be found by grep — name the methods explicitly.
- The orchestration runs against the real production component instances (NOT mocks, NOT a passthrough that bypasses them).
- At least one integration test exercises the orchestration end-to-end against real inputs.
- **Dependencies**: every per-component task whose component appears in the pipeline.
- **Complexity points**: ≤5; split the pipeline if it doesn't fit.
- **Tracker**: create a ticket immediately (per `decompose/SKILL.md` "Tracker inline" principle); rename the file to `[TRACKER-ID]_pipeline_<name>.md`.
5. **Mark the integration task as `Dependencies` for the integration test task.** If `tests-only` decomposition has already produced an e2e/integration test task for this pipeline, append the new integration task to its `Dependencies` field so the test cannot be "made green" before the integration ships.
## Anti-patterns this step explicitly blocks
- **"compose_root returns a wired runtime"** prose interpreted as "the loop exists". Composition assembles the graph; it is NOT the loop. The loop is the code that pulls inputs, drives each node, and emits outputs. If grep finds zero callers of the leaf components, the loop does not exist regardless of what compose_root does.
- **Treating the bootstrap-structure task as the home of the main loop.** Bootstrap is project skeleton (package layout, CLI scaffold, build files). It is NOT the main loop. Main loop is its own task.
- **Per-component tasks claiming integration scope.** A C1 VIO task's deliverable is "C1 works in isolation against unit tests". A C1 task's acceptance criteria MUST NOT include "C1 is wired into the runtime" — that's the integration task's job.
## Self-verification
- [ ] Every pipeline named in `architecture.md` / `system-flows.md` is listed in your enumeration.
- [ ] Every enumerated pipeline either (a) has an existing covered task, or (b) has a new integration task in `todo/`.
- [ ] No integration task exceeds 5 complexity points.
- [ ] Every integration task names every component in the pipeline as a `Dependencies` entry.
- [ ] No integration task is owned by a leaf component — every owner is named in `module-layout.md` as a spine / orchestrator.
- [ ] Every integration task has a tracker ticket created and the filename renamed to `[TRACKER-ID]_pipeline_<name>.md`.
## Save action
Write the new integration task files into `_docs/02_tasks/todo/`. They will be picked up by Step 2 (Task Decomposition's dependency-table writer) and by Step 4 (Cross-Verification).
## Blocking
**BLOCKING**: Present the pipeline enumeration + the list of new integration tasks to the user. Do NOT proceed to Step 2 until the user confirms:
- The enumeration matches what they expect from the architecture documents.
- Every uncovered pipeline now has an integration task.
- The chosen spine owners are correct.
If the user identifies a pipeline you missed, add it before proceeding. If the user names a different spine owner, update the task and re-run self-verification.
@@ -26,7 +26,7 @@ For each component (or the single provided component):
4. Do not create tasks for other components — only tasks for the current component
5. Each task should be atomic, containing 1 API or a list of semantically connected APIs
6. Write each task spec using `templates/task.md`
7. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
8. Note task dependencies (referencing tracker IDs of already-created dependency tasks, e.g., `AZ-42_initial_structure`)
9. **Cross-cutting rule**: if a concern spans ≥2 components (logging, config loading, auth/authZ, error envelope, telemetry, feature flags, i18n), create ONE shared task under the cross-cutting epic. Per-component tasks declare it as a dependency and consume it; they MUST NOT re-implement it locally. Duplicate local implementations are an `Architecture` finding (High) in code-review Phase 7 and a `Maintainability` finding in Phase 6.
10. **Shared-models / shared-API rule**: classify the task as shared if ANY of the following is true:
@@ -43,16 +43,32 @@ For each component (or the single provided component):
Consumers read the contract file, not the producer's task spec. This prevents interface drift when the producer's implementation detail leaks into consumers.
11. **Immediately after writing each task file**: create a work item ticket, link it to the component's epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
## Runtime Completeness Decomposition Gate
Before Step 2 is considered complete, scan `architecture.md`, `system-flows.md`, component descriptions, and the solution for named internal runtime capabilities and dependencies. Examples include BASALT/OpenVINS/Kimera, FAISS, DINOv2, ONNX/TensorRT, ALIKED/DISK, LightGlue, RANSAC, PostGIS, MAVLink emission, FDR rollover, and any "A-Z" user-visible pipeline.
For every named internal capability:
1. Ensure at least one implementation task explicitly owns the production integration or production algorithm.
2. Do not treat "define protocol", "create adapter boundary", "add deterministic fallback", "create scaffold", or "prepare native bridge" as implementation of the capability unless the architecture explicitly says the real capability is out of scope.
3. If a capability needs external hardware/data to verify, still create the production implementation task. Verification may be hardware-gated later; implementation must not be omitted.
4. Add a `## Runtime Completeness` section to any affected task with:
- named capability/dependency,
- production code that must exist,
- allowed external stubs, if any,
- unacceptable substitutes such as fake/deterministic/internal stubs.
## Self-verification (per component)
- [ ] Every task is atomic (single concern)
- [ ] No task exceeds 8 complexity points
- [ ] No task exceeds 5 complexity points
- [ ] Task dependencies reference correct tracker IDs
- [ ] Tasks cover all interfaces defined in the component spec
- [ ] No tasks duplicate work from other components
- [ ] Every task has a work item ticket linked to the correct epic
- [ ] Every shared-models / shared-API task has a contract file at `_docs/02_document/contracts/<component>/<name>.md` and a `## Contract` section linking to it
- [ ] Every cross-cutting concern appears exactly once as a shared task, not N per-component copies
- [ ] Every named internal runtime capability has a production implementation task, not only an interface/scaffold/fallback task
## Save action
@@ -1,4 +1,4 @@
# Step 3: Blackbox Test Task Decomposition (default and tests-only modes)
# Step 3: Blackbox Test Task Decomposition (tests-only mode only)
**Role**: Professional Quality Assurance Engineer
**Goal**: Decompose blackbox test specs into atomic, implementable task specs.
@@ -6,7 +6,6 @@
## Numbering
- In default mode: continue sequential numbering from where Step 2 left off.
- In tests-only mode: start from 02 (01 is the test infrastructure bootstrap from Step 1t).
## Steps
@@ -14,21 +13,26 @@
1. Read all test specs from `DOCUMENT_DIR/tests/` (`blackbox-tests.md`, `performance-tests.md`, `resilience-tests.md`, `security-tests.md`, `resource-limit-tests.md`)
2. Group related test scenarios into atomic tasks (e.g., one task per test category or per component under test)
3. Each task should reference the specific test scenarios it implements and the environment/test-data specs
4. Dependencies:
- In default mode: blackbox test tasks depend on the component implementation tasks they exercise
4. Add a **System Under Test Boundary** section to every e2e/blackbox test task:
- The test must drive the product through public runtime boundaries and compare actual outputs to `_docs/00_problem/input_data/expected_results/results_report.md` and any referenced machine-readable expected-result files.
- Stubs are allowed only for external systems outside the product boundary: flight controller/SITL, QGC observer, satellite-provider/Suite service, physical Jetson hardware, physical camera, licensed public datasets, and network services.
- Stubs, fakes, deterministic fallbacks, monkeypatches, or direct imports are not allowed for internal product modules that the scenario is meant to validate, such as VIO, safety/anchor wrapper, satellite retrieval, anchor verification, tile manager, MAVLink output adapter, or FDR.
- If an internal module is not implemented, the test must fail/block as missing product implementation; it must not pass by replacing that module with a test stub.
5. Dependencies:
- In tests-only mode: blackbox test tasks depend on the test infrastructure bootstrap task (Step 1t)
5. Write each task spec using `templates/task.md`
6. Estimate complexity per task (1, 2, 3, 5, 8 points); no task should exceed 8 points — split if it does
7. Note task dependencies (referencing tracker IDs of already-created dependency tasks)
8. **Immediately after writing each task file**: create a work item ticket under the "Blackbox Tests" epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
6. Write each task spec using `templates/task.md`
7. Estimate complexity per task (1, 2, 3, 5 points); no task should exceed 5 points — split if it does
8. Note task dependencies (referencing tracker IDs of already-created dependency tasks)
9. **Immediately after writing each task file**: create a work item ticket under the "Blackbox Tests" epic, write the work item ticket ID and Epic ID back into the task header, then rename the file from `todo/[##]_[short_name].md` to `todo/[TRACKER-ID]_[short_name].md`.
## Self-verification
- [ ] Every scenario from `tests/blackbox-tests.md` is covered by a task
- [ ] Every scenario from `tests/performance-tests.md`, `tests/resilience-tests.md`, `tests/security-tests.md`, and `tests/resource-limit-tests.md` is covered by a task
- [ ] No task exceeds 8 complexity points
- [ ] Dependencies correctly reference the dependency tasks (component tasks in default mode, test infrastructure in tests-only mode)
- [ ] No task exceeds 5 complexity points
- [ ] Dependencies correctly reference the test infrastructure task
- [ ] Every task has a work item ticket linked to the "Blackbox Tests" epic
- [ ] Every e2e/blackbox task forbids internal product stubs/fakes and requires comparison against expected-results artifacts
## Save action
@@ -1,4 +1,4 @@
# Step 4: Cross-Task Verification (default and tests-only modes)
# Step 4: Cross-Task Verification (implementation and tests-only modes)
**Role**: Professional software architect and analyst
**Goal**: Verify task consistency and produce `_dependencies_table.md`.
@@ -8,17 +8,20 @@
1. Verify task dependencies across all tasks are consistent
2. Check no gaps:
- In default mode: every interface in `architecture.md` has tasks covering it
- In implementation mode: every product interface in `architecture.md` has implementation task coverage
- In tests-only mode: every test scenario in `traceability-matrix.md` is covered by a task
- In implementation mode: every named internal runtime capability/dependency from architecture, solution, system flows, and component descriptions has a production implementation task, not only an interface/scaffold/fallback task
- In tests-only mode: every e2e/blackbox task has a System Under Test Boundary section that forbids stubbing internal product modules and requires comparison to expected-results artifacts
3. Check no overlaps: tasks don't duplicate work
4. Check no circular dependencies in the task graph
5. Produce `_dependencies_table.md` using `templates/dependencies-table.md`
## Self-verification
### Default mode
### Implementation mode
- [ ] Every architecture interface is covered by at least one task
- [ ] Every product interface in `architecture.md` is covered by at least one implementation task
- [ ] Every named internal runtime capability has a production implementation task
- [ ] No circular dependencies in the task graph
- [ ] Cross-component dependencies are explicitly noted in affected task specs
- [ ] `_dependencies_table.md` contains every task with correct dependencies
@@ -26,6 +29,7 @@
### Tests-only mode
- [ ] Every test scenario from `traceability-matrix.md` "Covered" entries has a corresponding task
- [ ] Every e2e/blackbox task validates actual product behavior and allows stubs only for external systems
- [ ] No circular dependencies in the task graph
- [ ] Test task dependencies reference the test infrastructure bootstrap
- [ ] `_dependencies_table.md` contains every task with correct dependencies
@@ -28,4 +28,4 @@ Use this template after cross-task verification. Save as `TASKS_DIR/_dependencie
- Dependencies column lists tracker IDs (e.g., "AZ-43, AZ-44") or "None"
- No circular dependencies allowed
- Tasks should be listed in recommended execution order
- The `/implement` skill reads this table to compute parallel batches
- The `/implement` skill reads this table to compute dependency-aware batches; task execution remains sequential
@@ -1,6 +1,6 @@
# Module Layout Template
The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to implementer subagents. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.
The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to each task. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.
Save as `_docs/02_document/module-layout.md`. This file is produced by the decompose skill (Step 1.5 module layout) and consumed by the implement skill (Step 4 file ownership). Task specs remain purely behavioral — they do NOT carry file paths. The layout is the single place where component → filesystem mapping lives.
@@ -104,4 +104,4 @@ The implement skill's Step 4 (File Ownership) reads this file and, for each task
3. Set READ-ONLY = the Public API files of every component listed in `Imports from`, plus `shared/*` Public API files.
4. Set FORBIDDEN = every other component's Owns glob.
If two tasks in the same batch map to the same component, the implement skill schedules them sequentially (one implementer at a time for that component) to avoid file conflicts on shared internal files.
Execution inside a batch is already sequential (one task at a time). This mapping is still required because it enforces scope discipline per task — preventing a task from drifting into files that belong to another component.
+2 -3
View File
@@ -11,7 +11,7 @@ Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[T
**Task**: [TRACKER-ID]_[short_name]
**Name**: [short human name]
**Description**: [one-line description of what this task delivers]
**Complexity**: [1|2|3|5|8] points
**Complexity**: [1|2|3|5] points
**Dependencies**: [AZ-43_shared_models, AZ-44_db_migrations] or "None"
**Component**: [component name for context]
**Tracker**: [TASK-ID]
@@ -102,8 +102,7 @@ Consumers MUST read that file — not this task spec — to discover the interfa
- 2 points: Non-trivial, low complexity, minimal coordination
- 3 points: Multi-step, moderate complexity, potential alignment needed
- 5 points: Difficult, interconnected logic, medium-high risk
- 8 points: High difficulty, high ambiguity or coordination, multiple components
- 13 points: Too complex — split into smaller tasks
- 8+ points: Too complex — split into smaller tasks
## Output Guidelines
@@ -26,7 +26,8 @@
- Application components under test
- Test runner container (black-box, no internal imports)
- Isolated database with seed data
- All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit`
- All tests runnable via `docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner`
- See the Woodpecker two-workflow contract in [`../templates/ci_cd_pipeline.md`](../templates/ci_cd_pipeline.md) — the test runner entry point defined here becomes the first step of `.woodpecker/01-test.yml`.
7. Define image tagging strategy: `<registry>/<project>/<component>:<git-sha>` for CI, `latest` for local dev only
## Self-verification
@@ -29,7 +29,7 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`.
### Test
- Unit tests: [framework and command]
- Blackbox tests: [framework and command, uses docker-compose.test.yml]
- Coverage threshold: 75% overall, 90% critical paths
- Coverage threshold: 75% overall, 90% critical-path floor (100% aim) — per `.cursor/rules/cursor-meta.mdc` Quality Thresholds
- Coverage report published as pipeline artifact
### Security
@@ -85,3 +85,140 @@ Save as `_docs/04_deploy/ci_cd_pipeline.md`.
| Deploy success | [Slack] | [team] |
| Deploy failure | [Slack/email + PagerDuty] | [on-call] |
```
---
## Reference Implementation: Woodpecker CI two-workflow contract
Use this when the project's CI is **Woodpecker** and the test layout follows the autodev e2e contract from [`../../decompose/templates/test-infrastructure-task.md`](../../decompose/templates/test-infrastructure-task.md) (an `e2e/` folder containing `Dockerfile`, `docker-compose.test.yml`, `conftest.py`, `requirements.txt`, `mocks/`, `fixtures/`, `tests/`).
The contract is **two workflows in `.woodpecker/`**, scheduled on the same agent label, with the build workflow gated on a successful test run:
- `.woodpecker/01-test.yml` — runs the e2e contract, publishes `results/report.csv` as an artifact, fails the pipeline on any test failure.
- `.woodpecker/02-build-push.yml``depends_on: [01-test]`. Builds the image, tags it `${CI_COMMIT_BRANCH}-${TAG_SUFFIX}`, pushes it to the registry. Skipped automatically if test failed.
The agent label is parameterized via `matrix:` so a single workflow file fans out across architectures: `labels: platform: ${PLATFORM}` routes each matrix entry to the matching agent. Both workflows for a repo must use the same matrix so test and build run on the same machine and share Docker layer cache. New architectures = new matrix entries; never new files.
### Multi-arch matrix conventions
| Variable | Meaning | Typical values |
|----------|---------|----------------|
| `PLATFORM` | Woodpecker agent label — selects which physical machine runs the entry. | `arm64`, `amd64` |
| `TAG_SUFFIX` | Image tag suffix appended after the branch name. | `arm`, `amd` |
| `DOCKERFILE` *(only when arches need different Dockerfiles)* | Path to the Dockerfile for this entry. | `Dockerfile`, `Dockerfile.jetson` |
Most repos use the same `Dockerfile` for both arches (multi-arch base images handle the rest), so `DOCKERFILE` can be omitted from the matrix and hardcoded in the build command. Repos with split per-arch Dockerfiles (e.g., `detections` uses `Dockerfile.jetson` on Jetson with TensorRT/CUDA-on-L4T) declare `DOCKERFILE` as a matrix var.
When only one architecture is currently in use, keep the matrix block with a single entry and the second entry commented out — adding a new arch is then a one-line uncomment, not a structural change.
### `.woodpecker/01-test.yml`
```yaml
when:
event: [push, pull_request, manual]
branch: [dev, stage, main]
matrix:
include:
- PLATFORM: arm64
TAG_SUFFIX: arm
# - PLATFORM: amd64
# TAG_SUFFIX: amd
labels:
platform: ${PLATFORM}
steps:
- name: e2e
image: docker
commands:
- cd e2e
- docker compose -f docker-compose.test.yml up --abort-on-container-exit --exit-code-from e2e-runner --build
- docker compose -f docker-compose.test.yml down -v
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- name: report
image: docker
when:
status: [success, failure]
commands:
- test -f e2e/results/report.csv && cat e2e/results/report.csv || echo "no report"
volumes:
- /var/run/docker.sock:/var/run/docker.sock
```
Notes:
- `--abort-on-container-exit` shuts the whole compose down as soon as ANY service exits, so a crashed dependency surfaces immediately instead of hanging the runner.
- `--exit-code-from e2e-runner` ensures the pipeline's exit code reflects the test runner's, not the SUT's.
- The `report` step runs on `[success, failure]` so the report is always published; without this the CSV is lost on red builds.
- `down -v` between runs drops mock state and DB volumes — every test run starts clean.
### `.woodpecker/02-build-push.yml`
```yaml
when:
event: [push, manual]
branch: [dev, stage, main]
depends_on:
- 01-test
matrix:
include:
- PLATFORM: arm64
TAG_SUFFIX: arm
# - PLATFORM: amd64
# TAG_SUFFIX: amd
labels:
platform: ${PLATFORM}
steps:
- name: build-push
image: docker
environment:
REGISTRY_HOST:
from_secret: registry_host
REGISTRY_USER:
from_secret: registry_user
REGISTRY_TOKEN:
from_secret: registry_token
commands:
- echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
- export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
- export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
- |
docker build -f Dockerfile \
--build-arg CI_COMMIT_SHA=$CI_COMMIT_SHA \
--label org.opencontainers.image.revision=$CI_COMMIT_SHA \
--label org.opencontainers.image.created=$BUILD_DATE \
--label org.opencontainers.image.source=$CI_REPO_URL \
-t $REGISTRY_HOST/azaion/<service>:$TAG .
- docker push $REGISTRY_HOST/azaion/<service>:$TAG
volumes:
- /var/run/docker.sock:/var/run/docker.sock
```
Notes:
- `depends_on: [01-test]` is enforced by Woodpecker — a failed `01-test` (any matrix entry) skips this workflow.
- The build workflow does NOT trigger on `pull_request` events: PRs get test signal only; pushes to `dev`/`stage`/`main` produce images. Avoids polluting the registry with PR images.
- Replace `<service>` with the actual service name (matches the registry namespace pattern `azaion/<service>`).
- For repos with split per-arch Dockerfiles, add `DOCKERFILE: Dockerfile.jetson` (or similar) to the matrix entry and substitute `${DOCKERFILE}` for `Dockerfile` in the `docker build -f` line.
### Variations by stack
The contract is language-agnostic because the runner is `docker compose`. The Dockerfile inside `e2e/` selects the test framework:
| Stack | `e2e/Dockerfile` runs |
|-------|----------------------|
| Python | `pytest --csv=/results/report.csv -v` |
| .NET | `dotnet test --logger:"trx;LogFileName=/results/report.trx"` (convert to CSV in a final step if needed) |
| Node/UI | `npm test -- --reporters=default --reporters=jest-junit --outputDirectory=/results` |
| Rust | `cargo test --no-fail-fast -- --format json > /results/report.json` |
When the repo has **only unit tests** (no `e2e/docker-compose.test.yml`), drop the compose orchestration and run the native test command directly inside a stack-appropriate image. Keep the same two-workflow split — `01-test.yml` runs unit tests, `02-build-push.yml` is unchanged.
### Manual-trigger override (test infrastructure not yet validated)
If a repo ships a complete `e2e/` layout but the test fixtures are not yet validated end-to-end (e.g., expected-results data is still being authored), gate `01-test.yml` on `event: [manual]` only and add a TODO comment pointing to the unblocking task. The `02-build-push.yml` workflow drops its `depends_on` clause for the manual-only window — an explicit and reversible exception, not a permanent split.
@@ -31,6 +31,7 @@ _docs/
│ ├── components.md
│ └── flows/
├── 04_verification_log.md # Step 4
├── glossary.md # Step 4.5 (confirmed-by-user)
├── FINAL_report.md # Step 7
└── state.json # Resumability
```
@@ -49,6 +50,7 @@ Maintained in `DOCUMENT_DIR/state.json` for resumability:
"modules_remaining": ["services/auth", "api/endpoints"],
"module_batch": 1,
"components_written": [],
"step_4_5_glossary_vision": "not_started",
"last_updated": "2026-03-21T14:00:00Z"
}
```
+102 -2
View File
@@ -15,7 +15,7 @@ Covers three related modes that share the same 8-step pipeline:
## Progress Tracking
Create a TodoWrite with all steps (0 through 7). Update status as each step completes.
Create a TodoWrite with all steps (0 through 7, including the inline Step 2.5 Module Layout Derivation and Step 4.5 Glossary & Architecture Vision). Update status as each step completes.
## Steps
@@ -251,7 +251,107 @@ Apply corrections inline to the documents that need them.
**BLOCKING**: Present verification summary to user. Do NOT proceed until user confirms corrections are acceptable or requests additional fixes.
**Session boundary**: After verification is confirmed, suggest a session break before proceeding to the synthesis steps (57). These steps produce different artifact types and benefit from fresh context:
---
### Step 4.5: Glossary & Architecture Vision (BLOCKING)
**Role**: Software architect + business analyst
**Goal**: Reconcile the AI's verified understanding of the codebase with the user's intended terminology and architecture vision. Existing-code projects often carry domain language and structural intent that is invisible from code alone (synonyms, deprecated names, modules that are "supposed to" be split, components the user thinks of as one logical unit even though they live in two folders). This step makes that intent explicit before any downstream skill (refactor, decompose, new-task) acts on the docs.
**When this step runs**:
- Always, after Step 4 (Verification Pass) — for Full and Resume modes.
- **Skipped** in Focus Area mode (the glossary/vision is system-wide; running it on a partial scan would produce a partial glossary). Resume the user once a full pass exists.
**Inputs** (already on disk after Step 4):
- `DOCUMENT_DIR/architecture.md`, `system-flows.md`, `data_model.md`, `deployment/*`
- `DOCUMENT_DIR/components/*/description.md`
- `DOCUMENT_DIR/modules/*.md`
- `DOCUMENT_DIR/04_verification_log.md` (so the AI knows which doc parts are confirmed vs. flagged)
**Outputs**:
- `DOCUMENT_DIR/glossary.md` (NEW)
- `DOCUMENT_DIR/architecture.md` updated in place: a new `## Architecture Vision` section is prepended (or merged into an existing "Overview" / "Vision" heading if already present); existing technical sections are preserved verbatim
**Procedure**:
1. **Draft glossary** from verified docs:
- Domain entities, processes, roles named in module/component docs
- Acronyms / abbreviations
- Internal codenames (project, service, model names) that recur in the codebase
- Synonym pairs the AI noticed (e.g., the codebase uses "flight" but module comments say "mission")
- Stakeholder personas if any docs reference them
Each entry: one-line definition + source reference (`source: components/03_flights/description.md`). Skip generic CS/industry terms.
2. **Draft architecture vision** as the AI currently understands the codebase:
- **One paragraph**: what the system is, who runs it, the runtime topology shape (monolith / services / pipeline / library / hybrid), and the dominant pattern (e.g., "submodule-based meta-repo with REST + SSE between UI and backend").
- **Components & responsibilities** (one-line each), pulled from `components/*/description.md`.
- **Major data flows** (one or two sentences each), pulled from `system-flows.md`.
- **Architectural principles / non-negotiables** the AI inferred from the code (e.g., "DB-driven config", "all UI traffic via REST + SSE only", "no per-component shared state"). Mark each with `inferred-from: <source>`.
- **Open questions / drift signals**: places where the code disagrees with itself, or where the AI cannot tell intent from implementation (e.g., two components doing similar work — is that legacy duplication or deliberate?).
3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
```
══════════════════════════════════════
REVIEW: Glossary + Architecture Vision (existing code)
══════════════════════════════════════
Glossary (N terms drafted from verified docs):
- <Term>: <one-line definition>
- ...
Architecture Vision — as inferred from the codebase:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Principles / non-negotiables (inferred):
- <principle> [inferred-from: <source>]
- ...
Open questions / drift signals:
- <q1>
- <q2>
══════════════════════════════════════
A) Inferred vision matches my intent — write the files
B) Add / correct entries (provide diffs — terms, components,
principles, or rename pairs)
C) Resolve the open questions / drift signals first
══════════════════════════════════════
Recommendation: pick C if any drift signals exist;
otherwise B if the vision misses
project-specific intent; A only when
the inferred vision is exactly right.
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present, loop until A.
- On C → ask the listed open questions in one batch (M4-style), integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `DOCUMENT_DIR/glossary.md`, alphabetical, with a top-line `**Status**: confirmed-by-user` and the date.
- Update `DOCUMENT_DIR/architecture.md`:
- If a `## Architecture Vision` (or `## Vision` / `## Overview`) section already exists at the top, replace its body with the confirmed paragraph + components + principles.
- Otherwise, insert `## Architecture Vision` as the first H2 after the title; preserve every existing H2 below.
- Do NOT delete or re-order existing technical sections (Tech Stack, Deployment Model, Data Model, NFRs, ADRs).
6. **Update `state.json`**: mark `step_4_5_glossary_vision: confirmed`. Resume on rerun must skip this step unless the user explicitly invokes `/document --refresh-vision`.
**Self-verification**:
- [ ] Every glossary entry traces to at least one file under `DOCUMENT_DIR/`
- [ ] Every component listed in the vision matches a folder under `DOCUMENT_DIR/components/`
- [ ] All open questions are answered or explicitly deferred (with the user's acknowledgement)
- [ ] `architecture.md` still contains all H2 sections it had before this step
- [ ] User picked option A on the latest condensed view
**BLOCKING**: Do NOT proceed to the session boundary / Step 5 until both files are saved and the user has picked A.
---
**Session boundary**: After Step 4.5 is confirmed, suggest a session break before proceeding to the synthesis steps (57). These steps produce different artifact types and benefit from fresh context:
```
══════════════════════════════════════
+188 -56
View File
@@ -1,41 +1,59 @@
---
name: implement
description: |
Orchestrate task implementation with dependency-aware batching, parallel subagents, and integrated code review.
Implement tasks sequentially with dependency-aware batching and integrated code review.
Reads flat task files and _dependencies_table.md from TASKS_DIR, computes execution batches via topological sort,
launches up to 4 implementer subagents in parallel, runs code-review skill after each batch, and loops until done.
implements tasks one at a time in dependency order, runs code-review skill after each batch, and loops until done.
Use after /decompose has produced task files.
Trigger phrases:
- "implement", "start implementation", "implement tasks"
- "run implementers", "execute tasks"
- "execute tasks"
category: build
tags: [implementation, orchestration, batching, parallel, code-review]
tags: [implementation, batching, code-review]
disable-model-invocation: true
---
# Implementation Orchestrator
# Implementation Runner
Orchestrate the implementation of all tasks produced by the `/decompose` skill. This skill is a **pure orchestrator** — it does NOT write implementation code itself. It reads task specs, computes execution order, delegates to `implementer` subagents, validates results via the `/code-review` skill, and escalates issues.
Implement all tasks produced by the `/decompose` skill. This skill reads task specs, computes execution order, writes the code and tests for each task **sequentially** (no subagents, no parallel execution), validates results via the `/code-review` skill, and escalates issues.
The `implementer` agent is the specialist that writes all the code — it receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria.
For each task the main agent receives a task spec, analyzes the codebase, implements the feature, writes tests, and verifies acceptance criteria — then moves on to the next task.
## Core Principles
- **Orchestrate, don't implement**: this skill delegates all coding to `implementer` subagents
- **Dependency-aware batching**: tasks run only when all their dependencies are satisfied
- **Max 4 parallel agents**: never launch more than 4 implementer subagents simultaneously
- **File isolation**: no two parallel agents may write to the same file
- **Sequential execution**: implement one task at a time. Do NOT spawn subagents and do NOT run tasks in parallel. (See `.cursor/rules/no-subagents.mdc`.)
- **Dependency-aware ordering**: tasks run only when all their dependencies are satisfied
- **Batching for review, not parallelism**: tasks are grouped into batches so `/code-review` and commits operate on a coherent unit of work — all tasks inside a batch are still implemented one after the other
- **Integrated review**: `/code-review` skill runs automatically after each batch
- **Auto-start**: batches launch immediately — no user confirmation before a batch
- **Completeness before testing**: product implementation is not done until code is checked against task outcomes, included scope, architecture/component promises, named runtime dependencies, and unresolved scaffold/native placeholders — not just task AC tests
- **Runtime dependency reality**: production code cannot satisfy a task by exposing only a protocol, fake runner, deterministic fallback, or "native bridge" placeholder when the task/architecture promises a concrete internal capability such as BASALT VIO, FAISS retrieval, LightGlue matching, or a full A-Z localization pipeline. Stubs are allowed only for external systems and tests.
- **Auto-start**: batches start immediately — no user confirmation before a batch
- **Gate on failure**: user confirmation is required only when code review returns FAIL
- **Commit per batch**: after each batch is confirmed, commit. Ask the user whether to push to remote unless the user previously opted into auto-push for this session.
## Context Resolution
- TASKS_DIR: `_docs/02_tasks/`
- Task files: all `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
- Task files: selected `*.md` files in `TASKS_DIR/todo/` (excluding files starting with `_`)
- Dependency table: `TASKS_DIR/_dependencies_table.md`
### Task Selection Context
The invoking flow decides which task category this run should execute. The implement skill must honor that selected context instead of consuming every file in `todo/`.
| Context | Selected task files |
|---------|---------------------|
| Product implementation | Task specs that are not test-only and not refactoring specs |
| Test implementation | `*_test_infrastructure.md` plus task specs whose `Component` or `Epic` identifies `Blackbox Tests` |
| Refactoring | Task specs whose filename or task ID includes `_refactor_` |
If no explicit context is provided, infer it from the active autodev step:
- greenfield Step 7 or existing-code Step 10 → Product implementation
- greenfield Step 10 or existing-code Step 6 → Test implementation
- refactor Phase 4 → Refactoring
Unselected task files remain in `TASKS_DIR/todo/` for their later flow step.
### Task Lifecycle Folders
```
@@ -46,9 +64,31 @@ TASKS_DIR/
└── done/ ← completed tasks (moved here after implementation)
```
### Suite-level invocation context (meta-repo flow)
When invoked from `.cursor/skills/autodev/flows/meta-repo.md` Step 3.5 (or any caller that supplies the same context envelope), the skill receives:
```
suite_level: true
TASKS_DIR: <override> # e.g., _docs/tasks/ (vs. default _docs/02_tasks/)
module_layout_path: <override> # e.g., _docs/tasks/_suite_module_layout.md
```
When `suite_level: true` is present, the following gate adjustments apply — and ONLY these. All other steps (114, 16) execute unchanged:
1. **TASKS_DIR override** is honored throughout the skill (Step 1 Parse, Step 13 Archive, Step 15 input paths if it ran). Default `_docs/02_tasks/` is replaced by the supplied path.
2. **module_layout_path override** is read instead of the hardcoded `_docs/02_document/module-layout.md` in Step 4 (Assign File Ownership). The supplied file uses the same `Per-Component Mapping` schema. If both the override and the hardcoded path are missing, behavior is unchanged from default mode (STOP and instruct).
3. **Step 14.5 (Cumulative Code Review) — SKIPPED**. The meta-repo has no `_docs/02_document/architecture_compliance_baseline.md`; cross-task drift is captured by the next `monorepo-status` cycle instead.
4. **Step 15 (Product Implementation Completeness Gate) — SKIPPED**. The gate's hard inputs (`_docs/02_document/architecture.md`, `system-flows.md`, `components/*/description.md`) do not exist in the meta-repo artifact layout. Suite-level tasks are infrastructure / coordination work (renames, cross-repo edits, suite-root infra additions), not feature implementation; the equivalent completeness signal is the next `monorepo-status` drift report (which the meta-repo flow re-runs immediately after Step 3.5 returns).
5. **Final report filename**: `_docs/03_implementation/suite_implementation_report_{run_name}.md` (in addition to the existing feature/test/refactor variants). Batch reports follow `_docs/03_implementation/suite_batch_{NN}_report.md`.
6. **Tracker integration** (Step 5: In Progress, Step 12: In Testing) runs unchanged — suite-level tickets follow the same tracker rules as any other.
Without `suite_level: true`, none of these adjustments apply and the skill runs exactly as documented in default mode.
## Prerequisite Checks (BLOCKING)
1. `TASKS_DIR/todo/` exists and contains at least one task file — **STOP if missing**
1. `TASKS_DIR/todo/` exists and contains at least one task file for the selected context **STOP if missing**
- Exception for Product implementation re-entry: if no selected product tasks remain in `todo/`, but the active autodev state is Step 7 or the latest product completeness report is missing/invalid/contains `FAIL`, skip directly to Step 15 (Product Implementation Completeness Gate). This gate may create remediation tasks and return to Step 1. Do not write a final implementation report from this state.
2. `_dependencies_table.md` exists — **STOP if missing**
3. At least one task is not yet completed — **STOP if all done**
4. **Working tree is clean** — run `git status --porcelain`; the output must be empty.
@@ -56,16 +96,16 @@ TASKS_DIR/
- A) Commit or stash stray changes manually, then re-invoke `/implement`
- B) Agent commits stray changes as a single `chore: WIP pre-implement` commit and proceeds
- C) Abort
- Rationale: implementer subagents edit files in parallel and commit per batch. Unrelated uncommitted changes get silently folded into batch commits otherwise.
- Rationale: each batch ends with a commit. Unrelated uncommitted changes would get silently folded into batch commits otherwise.
- This check is repeated at the start of each batch iteration (see step 6 / step 14 Loop).
## Algorithm
### 1. Parse
- Read all task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
- Read selected task `*.md` files from `TASKS_DIR/todo/` (excluding files starting with `_`)
- Read `_dependencies_table.md` — parse into a dependency graph (DAG)
- Validate: no circular dependencies, all referenced dependencies exist
- Validate: no circular dependencies in the selected task graph, all referenced selected-task dependencies exist or are already completed in `TASKS_DIR/done/`
### 2. Detect Progress
@@ -78,22 +118,23 @@ TASKS_DIR/
- Topological sort remaining tasks
- Select tasks whose dependencies are ALL satisfied (completed)
- If a ready task depends on any task currently being worked on in this batch, it must wait for the next batch
- Cap the batch at 4 parallel agents
- A batch is simply a coherent group of tasks for review + commit. Within the batch, tasks are implemented sequentially in topological order.
- Cap the batch size at a reasonable review scope (default: 4 tasks)
- If the batch would exceed 20 total complexity points, suggest splitting and let the user decide
### 4. Assign File Ownership
The authoritative file-ownership map is `_docs/02_document/module-layout.md` (produced by the decompose skill's Step 1.5). Task specs are purely behavioral — they do NOT carry file paths. Derive ownership from the layout, not from the task spec's prose.
The authoritative file-ownership map is `_docs/02_document/module-layout.md` (produced by the decompose skill's Step 1.5), unless `suite_level: true` was supplied in the invocation context — in which case the `module_layout_path` override is read instead (see "Suite-level invocation context" above). Task specs are purely behavioral — they do NOT carry file paths. Derive ownership from the layout, not from the task spec's prose.
For each task in the batch:
- Read the task spec's **Component** field.
- Look up the component in `_docs/02_document/module-layout.md` → Per-Component Mapping.
- Set **OWNED** = the component's `Owns` glob (exclusive write for the duration of the batch).
- Set **OWNED** = the component's `Owns` glob (the files this task is allowed to write).
- Set **READ-ONLY** = Public API files of every component in the component's `Imports from` list, plus all `shared/*` Public API files.
- Set **FORBIDDEN** = every other component's `Owns` glob, and every other component's internal (non-Public API) files.
- If the task is a shared / cross-cutting task (lives under `shared/*`), OWNED = that shared directory; READ-ONLY = nothing; FORBIDDEN = every component directory.
- If two tasks in the same batch map to the same component or overlapping `Owns` globs, schedule them sequentially instead of in parallel.
Since execution is sequential, there is no parallel-write conflict to resolve; ownership here is a **scope discipline** check — it stops a task from drifting into unrelated components even when alone.
If `_docs/02_document/module-layout.md` is missing or the component is not found:
- STOP the batch.
@@ -102,31 +143,30 @@ If `_docs/02_document/module-layout.md` is missing or the component is not found
### 5. Update Tracker Status → In Progress
For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before launching the implementer. If `tracker: local`, skip this step.
For each task in the batch, transition its ticket status to **In Progress** via the configured work item tracker (see `protocols.md` for tracker detection) before starting work. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
### 6. Launch Implementer Subagents
### 6. Implement Tasks Sequentially
**Per-batch dirty-tree re-check**: before launching subagents, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.
**Per-batch dirty-tree re-check**: before starting the batch, run `git status --porcelain`. On the first batch this is guaranteed clean by the prerequisite check. On subsequent batches, the previous batch ended with a commit so the tree should still be clean. If the tree is dirty at this point, STOP and surface the dirty files to the user using the same A/B/C choice as the prerequisite check. The most likely causes are a failed commit in the previous batch, a user who edited files mid-loop, or a pre-commit hook that re-wrote files and was not captured.
For each task in the batch, launch an `implementer` subagent with:
- Path to the task spec file
- List of files OWNED (exclusive write access)
- List of files READ-ONLY
- List of files FORBIDDEN
- **Explicit instruction**: the implementer must write or update tests that validate each acceptance criterion in the task spec. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still be written and skip with a clear reason.
For each task in the batch **in topological order, one at a time**:
1. Read the task spec file.
2. Respect the file-ownership envelope computed in Step 4 (OWNED / READ-ONLY / FORBIDDEN).
3. Implement the feature and write/update tests for every acceptance criterion in the spec. Tests for internal product behavior must exercise the production implementation path. If a test cannot run in the current environment (e.g., TensorRT requires GPU), the test must still exist and skip/block with a clear prerequisite reason, but that skip does not make missing production code complete.
4. Run the relevant tests locally before moving on to the next task in the batch. If tests fail, fix in-place — do not defer.
5. Capture a short per-task status line (files changed, tests pass/fail, any blockers) for the batch report.
Launch all subagents immediately — no user confirmation.
Do NOT spawn subagents and do NOT attempt to implement two tasks simultaneously, even if they touch disjoint files. See `.cursor/rules/no-subagents.mdc`.
### 7. Monitor
### 7. Collect Status
- Wait for all subagents to complete
- Collect structured status reports from each implementer
- If any implementer reports "Blocked", log the blocker and continue with others
- After all tasks in the batch are finished, aggregate the per-task status lines into a structured batch status.
- If any task reported "Blocked", log the blocker with the failing task's ID and continue — the batch report will surface it.
**Stuck detection** — while monitoring, watch for these signals per subagent:
- Same file modified 3+ times without test pass rate improving → flag as stuck, stop the subagent, report as Blocked
- Subagent has not produced new output for an extended period → flag as potentially hung
- If a subagent is flagged as stuck, do NOT let it continue looping — stop it and record the blocker in the batch report
**Stuck detection** — while implementing a task, watch for these signals in your own progress:
- The same file has been rewritten 3+ times without tests going green → stop, mark the task Blocked, and move to the next task in the batch (the user will be asked at the end of the batch).
- You have tried 3+ distinct approaches without evidence-driven progress → stop, mark Blocked, move on.
- Do NOT loop indefinitely on a single task. Record the blocker and proceed.
### 8. AC Test Coverage Verification
@@ -139,8 +179,8 @@ Before code review, verify that every acceptance criterion in each task spec has
- **Not covered**: no test exists for this AC
If any AC is **Not covered**:
- This is a **BLOCKING** failure — the implementer must write the missing test before proceeding
- Re-launch the implementer with the specific ACs that need tests
- This is a **BLOCKING** failure — the missing test must be written before proceeding
- Go back to the offending task, add tests for the specific ACs that lack coverage, then re-run this check
- If the test cannot run in the current environment (GPU required, platform-specific, external service), the test must still exist and skip with `pytest.mark.skipif` or `pytest.skip()` explaining the prerequisite
- A skipped test counts as **Covered** — the test exists and will run when the environment allows
@@ -189,18 +229,22 @@ Track `auto_fix_attempts` and `escalated_findings` in the batch report for retro
### 12. Update Tracker Status → In Testing
After the batch is committed and pushed, transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step.
After the batch is committed (and pushed if the user approved pushing), transition the ticket status of each task in the batch to **In Testing** via the configured work item tracker. If `tracker: local`, skip this step. If a tracker operation fails unexpectedly, follow `.cursor/rules/tracker.mdc`.
### 13. Archive Completed Tasks
Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.
For product implementation, this archive means "batch implementation accepted." The Product Implementation Completeness Gate can still require follow-up remediation tasks before the feature is complete; it does not move original task files back to `todo/`.
### 14. Loop
- Go back to step 2 until all tasks in `todo/` are done
### 14.5. Cumulative Code Review (every K batches)
**Skipped entirely when `suite_level: true`** (see "Suite-level invocation context" above) — the meta-repo has no `architecture_compliance_baseline.md` to evaluate against; cross-task drift is captured by the next `monorepo-status` cycle.
- **Trigger**: every K completed batches (default `K = 3`; configurable per run via a `cumulative_review_interval` knob in the invocation context)
- **Purpose**: per-batch review (Step 9) catches batch-local issues; cumulative review catches issues that only appear when tasks are combined — architecture drift, cross-task inconsistency, duplicate symbols introduced across different batches, contracts that drifted across producer/consumer batches
- **Scope**: the union of files changed since the **last** cumulative review (or since the start of the run if this is the first)
@@ -216,22 +260,108 @@ Move each completed task file from `TASKS_DIR/todo/` to `TASKS_DIR/done/`.
- **Interaction with Auto-Fix Gate**: Architecture findings (new category from code-review Phase 7) always escalate per the implement auto-fix matrix; they cannot silently auto-fix
- **Resumability**: if interrupted, the next invocation checks for the latest `cumulative_review_batches_*.md` and computes the changed-file set from batch reports produced after that review
### 15. Final Test Run
### 15. Product Implementation Completeness Gate
- After all batches are complete, run the full test suite once
- Read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices)
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision
- When tests pass, report final summary
Run this gate after all **product implementation** tasks are complete and before writing any final product implementation report or allowing autodev to proceed to testability/test decomposition. Skip this gate when (a) the remaining context is explicitly test implementation or refactoring (as determined by the task files and report filename rules), OR (b) `suite_level: true` was supplied in the invocation context (the gate's inputs do not exist in the meta-repo artifact layout — see "Suite-level invocation context" above).
**Goal**: catch the failure mode where narrow tests validate scaffold behavior while the task's actual outcome, included scope, architecture promise, or named integration remains unimplemented.
Inputs:
- Completed product task specs from `_docs/02_tasks/done/` for the current cycle
- `_docs/02_document/architecture.md`
- `_docs/02_document/system-flows.md`
- Relevant `_docs/02_document/components/*/description.md` files
- Current source code under each completed task's ownership envelope
- Batch reports and code-review reports for the current cycle
For each completed product task:
1. Read these sections from the task spec: `Description`, `Outcome`, `Scope / Included`, `Acceptance Criteria`, `Non-Functional Requirements`, `Constraints`, and explicit named technologies or integrations.
2. Compare those promises against actual source code, not only tests or report prose.
3. Search the task's owned component files for unresolved implementation markers: `placeholder`, `stub`, `reserved`, `TODO`, `NotImplemented`, `pass`, `deterministic`, `fake`, `mock`, `scaffold`, `native bridge`, and empty native/readme-only integration directories. Ignore test fixtures/mocks only when they are under test-owned paths and not used as production behavior.
4. Verify that each named runtime dependency in the task promise is integrated as production behavior, not merely represented by an interface. Examples: if a task promises FAISS, DINOv2, BASALT, LightGlue, OpenCV, RANSAC, a database, cloud service, or hardware SDK, the production code must either call that dependency or contain an adapter that loads and executes the real dependency package. A deterministic fallback, fake runner, empty `native/` package, or "bridge to be supplied later" is **FAIL** unless the task itself explicitly scoped the dependency out before implementation started.
5. Distinguish internal implementation from external prerequisites:
- Internal product capabilities (VIO, anchor verification, cache retrieval, safety wrapper, FDR, MAVLink emission) must be implemented in production code before the task can pass.
- External systems/hardware/data (Jetson device, physical camera, ArduPilot process, QGC, third-party service credentials, unavailable licensed dataset) may be `BLOCKED` only when production code exists and the missing prerequisite is outside the product boundary.
6. Verify tests exercise the real implementation path where local prerequisites exist. Environment-gated tests may skip only with an explicit prerequisite reason; they do not make missing production code complete.
7. For any architecture promise that describes an end-to-end user outcome, verify there is an executable production pipeline connecting the relevant components. Isolated component contracts and test-only harness orchestration are not enough.
8. Classify each task:
- **PASS**: task promises are implemented or explicitly out of scope in the task itself.
- **BLOCKED**: production code exists but cannot be fully verified due to external hardware/data/license/runtime prerequisites; the blocker is explicit and tests report blocked/skipped with reason.
- **FAIL**: promised production behavior is missing, only scaffolded, or only represented in tests/reports.
#### 15.b System-Pipeline Check (runs ONCE per gate invocation, after per-task classification)
The per-task classification above (steps 18) operates on `_docs/02_tasks/done/`. It catches missing component-local behavior but it CANNOT catch a missing *integration* — there is no task to fail if no task ever owned the integration in the first place. The GPS-passthrough incident (May 2026) escaped this gate because every per-component task in `done/` was honestly complete; the missing piece was the cross-component loop, which had no owning task.
The system-pipeline check fixes that by walking the architecture documents directly, independent of `done/`.
**Inputs**:
- `_docs/02_document/architecture.md`
- `_docs/02_document/system-flows.md`
- Full source tree under the project's production directory (e.g. `src/`).
**Procedure**:
1. **Enumerate end-to-end pipelines.** Read `architecture.md` and `system-flows.md`. For each named pipeline / operational flow that spans 2+ components, record the ordered component sequence and the trigger (per-frame, per-request, scheduled, manual).
2. **Grep for production callers of each seam method.** For each adjacent pair `A → B` in a pipeline, find a production source file (not under `tests/`, not under a `bench/` package, not a doc) that calls `A`'s public output method AND passes the result into `B`'s public input method.
3. **Classify the pipeline**:
- **WIRED**: a production caller exists and the chain is complete from the first to the last component in the sequence.
- **PARTIALLY WIRED**: some adjacent pairs have callers but at least one seam is missing.
- **NOT WIRED**: no production code calls the pipeline's components in order. Bench tools, unit tests, and microbenchmarks do NOT count as "wiring".
4. **Distinguish "wired but stubbed" from "wired with real components"**: a caller that invokes a passthrough / GPS-from-tlog / mock-output-generator instead of the real component is `NOT WIRED` for the purposes of this gate. The seam exists in the source file but the production behavior is faked. Grep for the same scaffold markers Step 15 already enumerates (`placeholder`, `stub`, `passthrough`, `scaffold until`, etc.) inside the caller's body.
5. **Output**: append a `## System Pipeline Audit` section to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md`. Per-pipeline row: name, sequence, classification, evidence file (the caller, or "NONE FOUND"), remediation suggestion if not `WIRED`.
**Pipeline classification feeds the combined gate below.** Any pipeline that is not `WIRED` is a system-level FAIL that the per-task gate cannot rescue.
**Why this is here and not only in decompose**: decompose Step 1.7 creates integration tasks up front; this check verifies the integration tasks actually got implemented (or, if they were never created, surfaces the gap before the cycle closes). The two layers are belt-and-suspenders by design.
Save the audit to `_docs/03_implementation/implementation_completeness_cycle[N]_report.md` with:
- Per-task classification
- Evidence files/symbols checked
- Any unresolved scaffold/native placeholders
- Any named promised technologies not integrated
- **System Pipeline Audit table** (per pipeline: name, sequence, WIRED / PARTIALLY WIRED / NOT WIRED, evidence file, remediation suggestion)
- Required remediation task suggestions, each sized to 5 points or less
Gate:
- If every product task is `PASS` or `BLOCKED` with explicit prerequisite evidence, AND every enumerated pipeline is `WIRED`, continue to Final Test Run.
- If any product task is `FAIL` OR any pipeline is `PARTIALLY WIRED` / `NOT WIRED`, STOP. Do not write the final product implementation report and do not proceed to any downstream autodev step. Completed original task files remain in `done/`; the missing work is represented by remediation tasks. Present a Choose block:
- A) Create remediation tasks now and return to implementation. (For pipeline FAILs the remediation task is a NEW integration task owned by the spine component per `_docs/02_document/module-layout.md`; it is NOT a test task and NOT a doc task; its deliverable is production code that drives the pipeline against real components.)
- B) Mark the missing behavior explicitly out of scope in task/docs, then re-run this gate
- C) Abort for manual correction
- Recommendation must normally be A unless the user deliberately accepts reduced scope.
Remediation task creation:
1. For each `FAIL`, create one or more task specs using `.cursor/skills/decompose/templates/task.md`; each remediation task must be sized at 5 points or less.
2. Save each task to `_docs/02_tasks/todo/` with a short name prefixed by `remediate_`.
3. Set **Component** to the failed task's component and set **Dependencies** to the failed task ID plus any remediation prerequisites.
4. Create or defer tracker tickets using the same tracker rules as decompose/new-task: if tracker is available, create tickets immediately; if the user explicitly chose `tracker: local`, keep numeric prefixes with `Tracker: pending` / `Epic: pending`.
5. Append the remediation tasks to `_docs/02_tasks/_dependencies_table.md`.
6. Return to Step 1 (Parse) in **Product implementation** context. The final product implementation report can be written only after remediation tasks complete and this gate reruns without `FAIL`.
### 16. Final Test Run
- After all batches are complete, run the full test suite once unless the invoking flow's immediate next step is `Run Tests`.
- If the next flow step is `Run Tests`, record a handoff in the final implementation report and let `.cursor/skills/test-run/SKILL.md` own the full-suite gate to avoid duplicate full runs.
- When this step does run, read and execute `.cursor/skills/test-run/SKILL.md` (detect runner, run suite, diagnose failures, present blocking choices).
- Test failures are a **blocking gate** — do not proceed until the test-run skill completes with a user decision.
- When tests pass, report final summary.
## Batch Report Persistence
After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. When all tasks are complete, produce a FINAL implementation report with a summary of all batches. The filename depends on context:
After each batch completes, save the batch report to `_docs/03_implementation/batch_[NN]_cycle[N]_report.md` for feature implementation (or `batch_[NN]_report.md` for test/refactor runs). Create the directory if it doesn't exist. For product implementation, produce the FINAL implementation report only after the Product Implementation Completeness Gate passes. For test and refactor implementation, produce the FINAL report after all selected tasks complete and the full-suite gate is either run or handed off per Step 16. The filename depends on context:
- **Test implementation** (tasks from test decomposition): `_docs/03_implementation/implementation_report_tests.md`
- **Feature implementation**: `_docs/03_implementation/implementation_report_{feature_slug}_cycle{N}.md` where `{feature_slug}` is derived from the batch task names (e.g., `implementation_report_core_api_cycle2.md`) and `{N}` is the current `state.cycle` from `_docs/_autodev_state.md`. If `state.cycle` is absent (pre-migration), default to `cycle1`.
- **Refactoring**: `_docs/03_implementation/implementation_report_refactor_{run_name}.md`
- **Suite-level** (when `suite_level: true` was supplied — see "Suite-level invocation context" above): `_docs/03_implementation/suite_implementation_report_{run_name}.md`. Batch reports use `_docs/03_implementation/suite_batch_{NN}_report.md`. `{run_name}` is derived from the batch task IDs (e.g., `suite_implementation_report_az543_az549_az550.md`).
Determine the context from the task files being implemented: if all tasks have test-related names or belong to a test epic, use the tests filename; otherwise derive the feature slug from the component names and append the cycle suffix.
Determine the context from the task files being implemented: if all tasks have test-related names or belong to a test epic, use the tests filename; if `suite_level: true` was supplied, use the suite filename; otherwise derive the feature slug from the component names and append the cycle suffix.
Batch report filenames must also include the cycle counter when running feature implementation: `_docs/03_implementation/batch_{NN}_cycle{N}_report.md` (test and refactor runs may use the plain `batch_{NN}_report.md` form since they are not cycle-scoped).
@@ -264,9 +394,10 @@ After each batch, produce a structured report:
| Situation | Action |
|-----------|--------|
| Implementer fails same approach 3+ times | Stop it, escalate to user |
| Same task rewritten 3+ times without green tests | Mark Blocked, continue batch, escalate at batch end |
| Task blocked on external dependency (not in task list) | Report and skip |
| File ownership conflict unresolvable | ASK user |
| File ownership violated (task wrote outside OWNED) | ASK user |
| Product completeness gate finds missing promised implementation | STOP — create remediation tasks or get explicit user scope reduction |
| Test failure after final test run | Delegate to test-run skill — blocking gate |
| All tasks complete | Report final summary, suggest final commit |
| `_dependencies_table.md` missing | STOP — run `/decompose` first |
@@ -281,7 +412,8 @@ Each batch commit serves as a rollback checkpoint. If recovery is needed:
## Safety Rules
- Never launch tasks whose dependencies are not yet completed
- Never allow two parallel agents to write to the same file
- If a subagent fails or is flagged as stuck, stop it and report — do not let it loop indefinitely
- Always run the full test suite after all batches complete (step 15)
- Never start a task whose dependencies are not yet completed
- Never run tasks in parallel and never spawn subagents — see `.cursor/rules/no-subagents.mdc`
- If a task is flagged as stuck, stop working on it and report — do not let it loop indefinitely
- Always run the Product Implementation Completeness Gate before final product reports
- Always run or hand off the full test suite after all batches complete (step 16)
@@ -3,29 +3,31 @@
## Topological Sort with Batch Grouping
The `/implement` skill uses a topological sort to determine execution order,
then groups tasks into batches for parallel execution.
then groups tasks into batches for code review and commit. Execution within a
batch is **sequential** — see `.cursor/rules/no-subagents.mdc`.
## Algorithm
1. Build adjacency list from `_dependencies_table.md`
2. Compute in-degree for each task node
3. Initialize batch 0 with all nodes that have in-degree 0
3. Initialize the ready set with all nodes that have in-degree 0
4. For each batch:
a. Select up to 4 tasks from the ready set
b. Check file ownership — if two tasks would write the same file, defer one to the next batch
c. Launch selected tasks as parallel implementer subagents
d. When all complete, remove them from the graph and decrement in-degrees of dependents
e. Add newly zero-in-degree nodes to the next batch's ready set
a. Select up to 4 tasks from the ready set (default batch size cap)
b. Implement the selected tasks one at a time in topological order
c. When all tasks in the batch complete, remove them from the graph and
decrement in-degrees of dependents
d. Add newly zero-in-degree nodes to the ready set
5. Repeat until the graph is empty
## File Ownership Conflict Resolution
## Ordering Inside a Batch
When two tasks in the same batch map to overlapping files:
- Prefer to run the lower-numbered task first (it's more foundational)
- Defer the higher-numbered task to the next batch
- If both have equal priority, ask the user
Tasks inside a batch are executed in topological order — a task is only
started after every task it depends on (inside the batch or in a previous
batch) is done. When two tasks have the same topological rank, prefer the
lower-numbered (more foundational) task first.
## Complexity Budget
Each batch should not exceed 20 total complexity points.
If it does, split the batch and let the user choose which tasks to include.
The budget exists to keep the per-batch code review scope reviewable.
+2 -1
View File
@@ -129,7 +129,8 @@ If `_docs/_repo-config.yaml` already exists:
- Entries removed (component removed from registry)
4. **Ask the user** whether to apply the diff.
5. If applied, **preserve `confirmed: true` flags** for entries that still match — don't reset human-approved mappings.
6. If user declines, stop — leave config untouched.
6. **Preserve user-owned top-level keys verbatim**: `glossary_doc:` (written by autodev meta-repo Step 2.5) and any `assumptions_log:` entries are NEVER edited or removed by this skill. Carry them through unchanged. If the file referenced by `glossary_doc:` no longer exists on disk, surface as an `unresolved:` question — do not auto-clear the field.
7. If user declines, stop — leave config untouched.
### Phase 8: Batch question checkpoint (M4)
@@ -15,6 +15,8 @@ Propagates component changes into the unified documentation set. Strictly scoped
| Root `README.md` **only** if `_repo-config.yaml` lists it as a doc target (e.g., services table) | Install scripts (`ci-*.sh`) → use `monorepo-cicd` |
| Docs index (`_docs/README.md` or similar) cross-reference tables | Component-internal docs (`<component>/README.md`, `<component>/docs/*`) |
| Cross-cutting docs listed in `docs.cross_cutting` | `_docs/_repo-config.yaml` itself (only `monorepo-discover` and `monorepo-onboard` write it) |
| Body of cross-cutting docs **except** the `## Architecture Vision` section (preserved verbatim — owned by autodev meta-repo Step 2.5) | The file at `glossary_doc:` (user-confirmed; only autodev meta-repo Step 2.5 rewrites it). New project terms surfaced during sync are reported back to the user, not silently appended |
| `## Architecture Vision` body — read-only, may be referenced for terminology consistency but never edited | — |
If a component change requires CI/env updates too, tell the user to also run `monorepo-cicd`. This skill does NOT cross domains.
@@ -166,6 +168,8 @@ Append to `_docs/_repo-config.yaml` under `assumptions_log:`:
- Change `confirmed_by_user` or any `confirmed: <bool>` flag
- Auto-commit or push
- Guess a mapping not in the config
- Edit `glossary_doc:` (the file recorded under the config's `glossary_doc:` key)
- Edit the `## Architecture Vision` section of any cross-cutting doc; if a sync would conflict with that section, surface the conflict to the user and skip — do not silently rewrite user-confirmed content
## Edge cases
+152
View File
@@ -0,0 +1,152 @@
---
name: monorepo-e2e
description: Syncs the suite-level integration e2e harness (`e2e/docker-compose.suite-e2e.yml`, fixtures, Playwright runner) when component contracts drift in ways that affect the cross-service scenario. Reads `_docs/_repo-config.yaml` to know which suite-e2e artifacts are in play. Touches ONLY suite-e2e files — never per-component CI, docs, or component internals. Use when a component changes a port, env var, public API endpoint, DB schema column, or detection model that the suite e2e exercises.
---
# Monorepo Suite-E2E
Propagates component changes into the suite-level integration e2e harness. Strictly scoped — never edits docs, component internals, per-component CI configs, or the production deploy compose.
## Scope — explicit
| In scope | Out of scope |
| -------- | ------------ |
| `e2e/docker-compose.suite-e2e.yml` (overlay, healthchecks, seed services) | Production `_infra/deploy/<target>/docker-compose.yml``monorepo-cicd` owns it |
| `e2e/fixtures/init.sql` (seeded rows that the spec depends on) | Component DB migrations — owned by each component |
| `e2e/fixtures/expected_detections.json` (detection baseline) | Detection model itself — owned by `detections/` |
| `e2e/runner/tests/*.spec.ts` selector / contract-driven edits | New scenarios (user-driven, not drift-driven) |
| `e2e/runner/Dockerfile` / `package.json` Playwright version bumps | Net-new e2e infrastructure (use `monorepo-onboard` or initial scaffolding) |
| `.woodpecker/suite-e2e.yml` (suite-level pipeline) | Per-component `.woodpecker/01-test.yml` / `02-build-push.yml``monorepo-cicd` owns those |
| Suite-e2e leftover entries under `_docs/_process_leftovers/` | Per-component leftovers — owned by each component |
If a component change needs doc updates too, tell the user to also run `monorepo-document`. If it needs production-deploy or per-component CI updates, run `monorepo-cicd`. This skill **only** updates the suite-e2e surface.
## Preconditions (hard gates)
1. `_docs/_repo-config.yaml` exists.
2. Top-level `confirmed_by_user: true`.
3. `suite_e2e.*` section is populated in config (see "Required config block" below). If absent, abort and ask the user to extend the config via `monorepo-discover`.
4. Components-in-scope have confirmed contract mappings (port, public API path, DB tables touched), OR user explicitly approves inferred ones.
## Required config block
This skill expects `_docs/_repo-config.yaml` to carry:
```yaml
suite_e2e:
overlay: e2e/docker-compose.suite-e2e.yml
fixtures:
init_sql: e2e/fixtures/init.sql
baseline_json: e2e/fixtures/expected_detections.json
binary_fixtures:
- e2e/fixtures/sample.mp4
- e2e/fixtures/model.tar.gz
runner:
dockerfile: e2e/runner/Dockerfile
package_json: e2e/runner/package.json
spec_dir: e2e/runner/tests
pipeline: .woodpecker/suite-e2e.yml
scenario:
description: "Upload video → detect → overlays → dataset → DB persistence"
components_exercised:
- ui
- annotations
- detections
- postgres-local
api_contracts:
- component: ui
path: /api/admin/auth/login
- component: annotations
path: /api/annotations/media/batch
- component: annotations
path: /api/annotations/media/{id}/annotations
db_tables:
- media
- annotations
- detection
- detection_classes
model_pin:
detections_repo_path: <path-to-model-config-or-classes-source>
classes_source: annotations/src/Database/DatabaseMigrator.cs
```
If `suite_e2e:` is missing the skill **stops** — it does not invent a default mapping.
## Mitigations (M1M7)
- **M1** Separation: this skill only touches suite-e2e files; no production deploy compose, no per-component CI, no docs, no component internals.
- **M3** Factual vs. interpretive: port, env var, API path, DB column — FACTUAL, read from the components' code. Whether a baseline still matches the model — DEFERRED to the user (the skill flags drift, never silently re-records).
- **M4** Batch questions at checkpoints.
- **M5** Skip over guess: a component change that doesn't map cleanly to one of the in-scope artifacts → skip and report.
- **M6** Assumptions footer + append to `_repo-config.yaml` `assumptions_log`.
- **M7** Drift detection: verify every path under `suite_e2e.*` exists on disk; stop if not.
## Workflow
### Phase 1: Drift check (M7)
Verify every file listed under `suite_e2e.*` (excluding `binary_fixtures`, which are gitignored) exists on disk. Missing file → stop and ask:
- Run `monorepo-discover` to refresh, OR
- Skip the missing artifact (recorded in report)
For `binary_fixtures` paths that are absent (expected — they live in S3/LFS), check whether `expected_detections.json._meta.video_sha256` is still a `TBD-...` placeholder. If yes, surface this as a known leftover (`_docs/_process_leftovers/2026-04-22_suite-e2e-binary-fixtures.md`) and continue.
### Phase 2: Determine scope
Same as `monorepo-cicd` Phase 2 — ask the user, or auto-detect. For **auto-detect**, flag commits that touch suite-e2e-relevant concerns:
| Commit pattern | Suite-e2e impact |
| -------------- | ---------------- |
| New port exposed by `<component>` | Healthcheck override may change in `e2e/docker-compose.suite-e2e.yml` |
| New required env var on `<component>` | `e2e/docker-compose.suite-e2e.yml` `e2e-runner` env block + `init.sql` seed |
| Public API path renamed / removed | Spec selector / API call path in `e2e/runner/tests/*.spec.ts` |
| DB schema column renamed in a `db_tables` entry | `init.sql` column reference + spec `pg.query` text |
| New required DB table referenced by spec | `init.sql` insert block (skip if owned by component migration) |
| Detection model rev change in `detections/` | `expected_detections.json` `_meta.model.revision` + flag baseline as stale |
| New canonical detection class added | `expected_detections.json._meta` annotation |
Present the flagged list; confirm.
### Phase 3: Classify changes per component
| Change type | Target suite-e2e files |
| ----------- | ---------------------- |
| Port / env var change | `e2e/docker-compose.suite-e2e.yml` |
| API path / contract change | `e2e/runner/tests/*.spec.ts` |
| DB schema reference change | `e2e/fixtures/init.sql` and spec SQL queries |
| Model / class catalog change | `e2e/fixtures/expected_detections.json` (mark `_meta.fixture_version` bump + leftover entry for binary refresh) |
| Playwright dependency drift | `e2e/runner/package.json` + `e2e/runner/Dockerfile` |
| Suite scenario steps gone stale | **Stop and ask** — scenario edits are user-driven, not drift-driven |
### Phase 4: Apply edits
Edit each in-scope file. After each batch, run `ReadLints` on touched files. Do NOT run the suite e2e itself — that's a downstream pipeline operation, not a sync-skill responsibility.
For `expected_detections.json`: when the model revision changes, the skill **does not** re-record the baseline — the binary fixture cannot be regenerated from the dev environment. Instead:
1. Set `_meta.model.revision` to the new revision.
2. Set `_meta.fixture_version` to a new bumped version with a `-stale` suffix (e.g., `0.2.0-stale`).
3. Append a new entry to `_docs/_process_leftovers/` describing the required re-record.
4. Leave `expected.by_class` untouched — the spec's tolerance check will fail loudly until the binary refresh lands.
### Phase 5: Update assumptions log
Append a new `assumptions_log:` entry to `_docs/_repo-config.yaml` recording:
- Date, components in scope, which suite-e2e files were touched
- Any inferred contract mappings still tagged `confirmed: false`
- Any leftover entries created
### Phase 6: Report
Render a Choose-format summary of the synced files, surface any `_process_leftovers/` entries created, and end. Do NOT auto-commit.
## Self-verification
- [ ] No file outside `e2e/`, `.woodpecker/suite-e2e.yml`, or `_docs/_process_leftovers/` was edited
- [ ] `_docs/_repo-config.yaml` `suite_e2e:` block was not silently mutated except for `assumptions_log` append
- [ ] `expected_detections.json` was not re-recorded (only metadata bumped + leftover added)
- [ ] Every spec edit traces to a flagged commit pattern in Phase 2
- [ ] `ReadLints` clean on every touched file
## Failure handling
Same retry / escalation protocol as `monorepo-cicd` — see `protocols.md`. The most common failure mode is the binary-fixture leftover (sample.mp4 missing or SHA-mismatched); this skill does not attempt to resolve it, only surfaces it.
+4
View File
@@ -59,6 +59,8 @@ Mark each as `complete` / `partial` / `missing` and explain.
- Every component in `components:` appears in the registry — flag mismatches
- Every `docs.root` file cross-referenced in config exists on disk — flag missing
- Every `ci.orchestration_files` and `ci.install_scripts` exists — flag missing
- `glossary_doc:` (if recorded in config) points to a file that exists on disk — flag missing
- The cross-cutting architecture doc identified by `docs.cross_cutting` contains a `## Architecture Vision` section — flag missing (signals the meta-repo flow's Step 2.5 was skipped or the section was removed)
### Section 5: Unresolved questions
@@ -113,6 +115,8 @@ In registry, not in config: [list or "(none)"]
In config, not in registry: [list or "(none)"]
Config-referenced docs missing: [list or "(none)"]
Config-referenced CI files missing: [list or "(none)"]
glossary_doc: [path or "not recorded — run /autodev to capture"]
Architecture Vision section: [present | missing in <doc>]
═══════════════════════════════════════════════════
Unresolved questions
+61 -14
View File
@@ -75,7 +75,7 @@ Record the description verbatim for use in subsequent steps.
**Role**: Technical analyst
**Goal**: Determine whether deep research is needed.
Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md, components/, system-flows.md).
Read the user's description and the existing codebase documentation from DOCUMENT_DIR (architecture.md including its `## Architecture Vision` section, glossary.md, components/, system-flows.md). Use `glossary.md` to keep the new task's name, acceptance-criteria wording, and component references aligned with the user's confirmed vocabulary; flag the task to the user if the request appears to violate an Architecture Vision principle, do not silently allow it.
**Consult LESSONS.md**: if `_docs/LESSONS.md` exists, read it and look for entries in categories `estimation`, `architecture`, `dependencies` that might apply to the task under consideration. If a relevant lesson exists (e.g., "estimation: auth-related changes historically take 2x estimate"), bias the classification and recommendation accordingly. Note in the output which lessons (if any) were applied.
@@ -84,29 +84,66 @@ Assess the change along these dimensions:
- **Novelty**: does it involve libraries, protocols, or patterns not already in the codebase?
- **Risk**: could it break existing functionality or require architectural changes?
Classification:
### 2a. Complexity-Points Estimate
Project policy (per the workspace user-rule on ADO points): aim for tasks at 23 points (rarely 5). Tasks at 8 points are high risk; tasks at 13 are too complex and MUST be broken down. The new-task skill enforces this here, before producing a single-file task spec.
Map the Scope/Novelty/Risk profile to a points estimate using this table:
| Profile | Points | Examples |
|---------|--------|----------|
| All three low | **12** | One-line config change; trivial CRUD field addition |
| Two low + one medium | **3** | Localized refactor; add one well-understood endpoint |
| One low + two medium, OR all medium | **5** | New small feature touching 23 components; integration with a known library |
| Any high, OR two medium + one high | **8** | Cross-cutting concern across 4+ components; integration with an unfamiliar protocol; significant architectural change |
| Two or three high | **13** | New subsystem; unfamiliar tech across the stack; multiple unknown unknowns |
If a relevant LESSONS.md entry biases the estimate (e.g., "auth-related changes historically take 2× estimate"), apply the multiplier and round up to the next discrete point on the scale (1, 2, 3, 5, 8, 13).
### 2b. Routing by Complexity
| Estimate | Default routing | Override path |
|----------|-----------------|---------------|
| **15** | Continue this skill at Step 3 (Research) or Step 4 (Codebase Analysis) — see classification below | — |
| **8** | **STOP this skill and recommend handoff to `/decompose @<feature_description>`** (single-component decompose mode if the affected scope fits inside one component, default mode if it does not). The user may override and proceed in `/new-task`, but the override must be explicitly chosen. | C) Proceed in /new-task anyway with the user's acknowledgement that the resulting task is high-risk and may need to be re-decomposed mid-implementation |
| **13** | **STOP this skill — auto-handoff is mandatory.** A 13-point feature cannot be a single task spec. Invoke `/decompose @<feature_description>` (default mode) before writing any task file. Surface the handoff to the user with no override path; this is a hard policy gate. | None — must decompose |
For the auto-handoff path:
1. Render a one-paragraph description of the feature suitable to feed `/decompose` (combine Step 1's verbatim user description with the complexity-points reasoning).
2. Save it to `_docs/02_task_plans/<feature_slug>/feature-description.md` so the decompose skill has a stable input file.
3. Either (a) directly auto-chain into `.cursor/skills/decompose/SKILL.md` in default mode with this file as input, or (b) report the handoff to the user along with the exact `/decompose` invocation and stop. Pick (a) only if the user has explicitly enabled auto-chain across skills (e.g., we are inside an `/autodev` invocation); otherwise pick (b).
### 2c. Research vs Skip Research (only for ≤5 estimates)
Classification (independent of points; runs only when points ≤ 5 and Step 2b chose Continue):
| Category | Criteria | Action |
|----------|----------|--------|
| **Needs research** | New libraries/frameworks, unfamiliar protocols, significant architectural change, multiple unknowns | Proceed to Step 3 (Research) |
| **Needs research** | New libraries/frameworks, unfamiliar protocols, multiple unknowns | Proceed to Step 3 (Research) |
| **Skip research** | Extends existing functionality, uses patterns already in codebase, straightforward new component with known tech | Skip to Step 4 (Codebase Analysis) |
Present the assessment to the user:
Present the full assessment to the user:
```
══════════════════════════════════════
COMPLEXITY ASSESSMENT
══════════════════════════════════════
Scope: [low / medium / high]
Novelty: [low / medium / high]
Risk: [low / medium / high]
Scope: [low / medium / high]
Novelty: [low / medium / high]
Risk: [low / medium / high]
Points: [1 / 2 / 3 / 5 / 8 / 13] (project aim: 23, rarely 5)
Routing: [Continue in /new-task | Hand off to /decompose]
══════════════════════════════════════
Recommendation: [Research needed / Skip research]
Reason: [one-line justification]
Recommendation: [Research needed | Skip research | Decompose required]
Reason: [one-line justification, including any LESSONS.md influence]
══════════════════════════════════════
```
**BLOCKING**: Ask the user to confirm or override the recommendation before proceeding.
**BLOCKING**:
- If points ≤ 5 → ask the user to confirm or override the research recommendation before proceeding.
- If points = 8 → ask the user to choose between hand-off to /decompose (recommended) and continuing in /new-task with explicit risk acknowledgement.
- If points = 13 → STOP and present the handoff plan; do not offer a continue-anyway override.
---
@@ -134,7 +171,8 @@ The `<task_slug>` is a short kebab-case name derived from the feature descriptio
**Goal**: Determine where and how to insert the new functionality, and whether existing tests cover the new requirements.
1. Read the codebase documentation from DOCUMENT_DIR:
- `architecture.md` — overall structure
- `architecture.md` — overall structure (the `## Architecture Vision` H2 is user-confirmed intent and must not be violated by the new task without explicit approval)
- `glossary.md` — project terminology; reuse the user's vocabulary in task names, AC, and component references
- `components/` — component specs
- `system-flows.md` — data flows (if exists)
- `data_model.md` — data model (if exists)
@@ -202,7 +240,13 @@ Apply the four shared-task triggers from `.cursor/skills/decompose/SKILL.md` Ste
2. Add the layout edit to the task's deliverables; the implementer writes it alongside the code change.
3. If `module-layout.md` does not exist, STOP and instruct the user to run `/document` first (existing-code flow) or `/decompose` default mode (greenfield). Do not guess.
Record the classification and any contract/layout deliverables in the working notes; they feed Step 5 (Validate Assumptions) and Step 6 (Create Task).
- **ADR cross-check** — runs unconditionally for every new-task in any of the three classifications above:
1. If `_docs/02_document/adr/` exists, scan every `Status: Accepted` ADR. For each, ask: "would the proposed task either contradict this ADR's `Decision` or materially affect its `Consequences`?"
2. **Conflict** (task contradicts an Accepted ADR) → STOP and Choose A/B/C: **A)** Re-scope the task to comply with the ADR, **B)** Propose superseding the ADR — the task spec then includes a deliverable to invoke `/plan --adr-only` (or the next `/plan` cycle's Step 4.5) with `Supersedes: ADR-NNN`, and the new task does NOT proceed until that supersede ADR is `Accepted`, **C)** Park the task in `backlog/` with a `Blocked-By: ADR-NNN review` note. Do not silently approve a contradictory task.
3. **Drift** (task changes assumptions an ADR depends on but does not directly contradict it) → record the affected ADR(s) under a new `### ADR Impact` section in the task spec with `> Affects ADR NNN_<slug>: <one-line summary>`. The implementer surfaces this at code-review Phase 7 (which then classifies it as ADR-Drift if not addressed).
4. **Aligned** (task implements something an Accepted ADR mandates) → cite the ADR(s) under `### ADR Compliance` in the task spec with `> Implements ADR NNN_<slug>`. Code-review Phase 7 then expects matching evidence in the implemented code.
Record the classification, any contract/layout deliverables, and any ADR cross-check outcomes in the working notes; they feed Step 5 (Validate Assumptions) and Step 6 (Create Task).
**BLOCKING**: none — this step surfaces findings; the user confirms them in Step 5.
@@ -262,6 +306,9 @@ Present using the Choose format for each decision that has meaningful alternativ
- [ ] If Step 4.5 classified the task as producer, the `## Contract` section exists and points at a contract file
- [ ] If Step 4.5 classified the task as consumer, `### Document Dependencies` lists the relevant contract file
- [ ] If Step 4.5 flagged a layout delta, the task's Scope.Included names the `module-layout.md` edit
- [ ] If Step 4.5 flagged an ADR conflict, the task is either re-scoped (A), explicitly blocked on a supersede ADR (B), or parked in backlog (C) — never silently bypassed
- [ ] If Step 4.5 flagged ADR drift, the task spec has an `### ADR Impact` section listing the affected ADR(s)
- [ ] If Step 4.5 flagged ADR alignment, the task spec has an `### ADR Compliance` section citing the implemented ADR(s)
---
@@ -281,7 +328,7 @@ Present using the Choose format for each decision that has meaningful alternativ
- Update **Epic** field: `[EPIC-ID]`
3. Rename the file from `[##]_[short_name].md` to `[TICKET-ID]_[short_name].md`
If the work item tracker is not authenticated or unavailable (`tracker: local`):
If the work item tracker is not authenticated or unavailable, follow `.cursor/rules/tracker.mdc` before continuing. Only if the user explicitly chooses `tracker: local`:
- Keep the numeric prefix
- Set **Tracker** to `pending`
- Set **Epic** to `pending`
@@ -336,7 +383,7 @@ After the user chooses **Done**:
| Research skill hits a blocker | Follow research skill's own escalation rules |
| Codebase analysis reveals conflicting architectures | **ASK** user which pattern to follow |
| Complexity exceeds 5 points | **WARN** user and suggest splitting into multiple tasks |
| Work item tracker MCP unavailable | **WARN**, continue with local-only task files |
| Work item tracker MCP unavailable | Follow `.cursor/rules/tracker.mdc`; do not continue in local mode unless the user explicitly chooses it |
## Trigger Conditions
+21 -6
View File
@@ -15,7 +15,7 @@ disable-model-invocation: true
# Solution Planning
Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, tests, and work item epics through a systematic 6-step workflow.
Decompose a problem and solution into architecture, data model, deployment plan, system flows, components, ADRs, tests, and work item epics through a systematic workflow with seven step files (1, 2, 3, 4, 4.5, 5, 6) plus a Final quality checklist.
## Core Principles
@@ -55,7 +55,7 @@ Read `steps/01_artifact-management.md` for directory structure, save timing, sav
## Progress Tracking
At the start of execution, create a TodoWrite with all steps (1 through 6 plus Final). Update status as each step completes.
At the start of execution, create a TodoWrite with all steps (1, 2, 3, 4, 4.5, 5, 6 plus Final). Update status as each step completes. The fractional Step 4.5 (ADR Capture) sits between Architecture Review (Step 4) and Test Specifications (Step 5).
## Workflow
@@ -69,7 +69,7 @@ Capture any new questions, findings, or insights that arise during test specific
### Step 2: Solution Analysis
Read and follow `steps/02_solution-analysis.md`.
Read and follow `steps/02_solution-analysis.md`. The step opens with **Phase 2a.0: Glossary & Architecture Vision** (BLOCKING) — drafts `_docs/02_document/glossary.md` and a one-paragraph architecture vision, presents the condensed view to the user, iterates until confirmed, then proceeds into the architecture, data-model, and deployment phases. The confirmed vision becomes the first `## Architecture Vision` H2 of `architecture.md`.
---
@@ -85,6 +85,16 @@ Read and follow `steps/04_review-risk.md`.
---
### Step 4.5: Architecture Decision Records (ADRs)
Read and follow `steps/04-5_adr-capture.md`.
This step captures the architecture and tech-stack decisions that were made (or revised) in Steps 24 as durable, dated, immutable records under `_docs/02_document/adr/`. ADRs are the single thing in `_docs/` that explain the **why** of each major decision after the conversation history is gone. They are consumed by `decompose` (when bootstrapping module layout), `new-task` (when assessing a new feature against existing decisions), `refactor` (when proposing replacements), and any future code-review cycle that needs to confirm a structural choice was deliberate.
This step is **BLOCKING**: the ADR set must be reviewed and confirmed by the user before Step 5 begins.
---
### Step 5: Test Specifications
Read and follow `steps/05_test-specifications.md`.
@@ -107,6 +117,7 @@ Read and follow `steps/07_quality-checklist.md`.
- **Coding during planning**: this workflow produces documents, never code
- **Multi-responsibility components**: if a component does two things, split it
- **Skipping BLOCKING gates**: never proceed past a BLOCKING marker without user confirmation
- **Skipping the glossary/vision gate (Phase 2a.0)**: drafting `architecture.md` from raw `solution.md` without confirming terminology and vision means the AI's mental model is not aligned with the user's; every downstream artifact will inherit that drift
- **Diagrams without data**: generate diagrams only after the underlying structure is documented
- **Copy-pasting problem.md**: the architecture doc should analyze and transform, not repeat the input
- **Vague interfaces**: "component A talks to component B" is not enough; define the method, input, output
@@ -119,7 +130,7 @@ Read and follow `steps/07_quality-checklist.md`.
|-----------|--------|
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — planning cannot proceed |
| Ambiguous requirements | ASK user |
| Input data coverage below 75% | Search internet for supplementary data, ASK user to validate |
| Input data coverage below the canonical threshold (`cursor-meta.mdc` Quality Thresholds) | Search internet for supplementary data, ASK user to validate |
| Technology choice with multiple valid options | ASK user |
| Component naming | PROCEED, confirm at next BLOCKING gate |
| File structure within templates | PROCEED |
@@ -137,12 +148,16 @@ Read and follow `steps/07_quality-checklist.md`.
│ │
│ 1. Blackbox Tests → test-spec/SKILL.md │
│ [BLOCKING: user confirms test coverage] │
│ 2. Solution Analysis → architecture, data model, deployment
[BLOCKING: user confirms architecture]
│ 2. Solution Analysis → glossary + vision, architecture,
data model, deployment
│ [BLOCKING 2a.0: user confirms glossary + vision] │
│ [BLOCKING 2a: user confirms architecture] │
│ 3. Component Decomp → component specs + interfaces │
│ [BLOCKING: user confirms components] │
│ 4. Review & Risk → risk register, iterations │
│ [BLOCKING: user confirms mitigations] │
│ 4.5 ADR Capture → _docs/02_document/adr/NNN_*.md │
│ [BLOCKING: user confirms ADR set] │
│ 5. Test Specifications → per-component test specs │
│ 6. Work Item Epics → epic per component + bootstrap │
│ ───────────────────────────────────────────────── │
@@ -26,6 +26,10 @@ DOCUMENT_DIR/
│ └── deployment_procedures.md
├── risk_mitigations.md
├── risk_mitigations_02.md (iterative, ## as sequence)
├── adr/
│ ├── 001_[decision_slug].md
│ ├── 002_[decision_slug].md
│ └── ...
├── components/
│ ├── 01_[name]/
│ │ ├── description.md
@@ -66,6 +70,8 @@ DOCUMENT_DIR/
| Step 3 | Common helpers generated | `common-helpers/[##]_helper_[name].md` |
| Step 3 | Diagrams generated | `diagrams/` |
| Step 4 | Risk assessment complete | `risk_mitigations.md` |
| Step 4.5 | Each ADR captured | `adr/NNN_[decision_slug].md` |
| Step 4.5 | ADR index updated | `adr/README.md` |
| Step 5 | Tests written per component | `components/[##]_[name]/tests.md` |
| Step 6 | Epics created in work item tracker | Tracker via MCP |
| Final | All steps complete | `FINAL_report.md` |
@@ -85,3 +91,15 @@ If DOCUMENT_DIR already contains artifacts:
2. Identify the last completed step based on which artifacts exist
3. Resume from the next incomplete step
4. Inform the user which steps are being skipped
#### Step 4.5 (ADR Capture) resumption rule
ADR files have a `Status` field that disambiguates "step in progress" from "step done":
- `Status: Proposed` → Step 4.5 is **in progress**. The user has not yet hit the BLOCKING gate (or hit it and chose B/C/D, which kept files at `Proposed`). Resume Step 4.5 at Phase 4.5f and re-present the BLOCKING Choose to the user. Do NOT skip to Step 5.
- `Status: Accepted` AND `adr/README.md` index exists AND every Accepted ADR is referenced in the index → Step 4.5 is **done**. Skip to Step 5.
- `Status: Accepted` but `adr/README.md` is missing or out of date → Step 4.5 is **partially complete**. Resume at Phase 4.5d (Maintain the ADR Index) before moving on.
- Mixed `Proposed` + `Accepted` files in the same directory → Step 4.5 is **in progress** with prior partial confirmations. Resume at Phase 4.5f and re-present only the still-`Proposed` ADRs.
- Empty `adr/` directory or no `adr/` directory → Step 4.5 has not started yet. Begin at Phase 4.5a.
The `Date` field on every Accepted ADR is the date the user confirmed it; do not regenerate it during resumption.
@@ -4,20 +4,105 @@
**Goal**: Produce `architecture.md`, `system-flows.md`, `data_model.md`, and `deployment/` from the solution draft
**Constraints**: No code, no component-level detail yet; focus on system-level view
### Phase 2a.0: Glossary & Architecture Vision (BLOCKING)
**Role**: Software architect + business analyst
**Goal**: Align the AI's mental model of the project with the user's intent BEFORE drafting `architecture.md`. Capture domain terminology and the user's high-level architecture vision so every downstream artifact (architecture, components, flows, tests, epics) is grounded in confirmed user intent — not in AI inference.
**Inputs**:
- `_docs/00_problem/problem.md`, `acceptance_criteria.md`, `restrictions.md`
- `_docs/00_problem/input_data/*`
- `_docs/01_solution/solution.md` (and any earlier `solution_draft*.md` siblings)
- Any blackbox-test findings produced in Step 1
**Outputs**:
- `_docs/02_document/glossary.md` (NEW)
- A confirmed "Architecture Vision" paragraph + bullet list held in working memory and used as the spine of Phase 2a's `architecture.md`
**Procedure**:
1. **Draft glossary** — extract project-specific terminology from inputs (NOT generic software terms). Include:
- Domain entities, processes, and roles
- Acronyms / abbreviations
- Internal codenames or product names
- Synonym pairs in active use (e.g., "flight" vs. "mission")
- Stakeholder personas referenced in problem.md
Each entry: one-line definition, plus a parenthetical source (`source: problem.md`, `source: solution.md §3`).
Skip terms that have a single well-known industry meaning (REST, JSON, etc.).
2. **Draft architecture vision** — synthesize from inputs:
- **One paragraph**: what the system is, who uses it, the shape of the runtime topology (monolith / services / pipeline / library / hybrid).
- **Components & responsibilities** (one-line each). At this stage these are *intent-level*, not the formal decomposition that Step 3 produces.
- **Major data flows** (one or two sentences each).
- **Architectural principles / non-negotiables** the user has implied (e.g., "DB-driven config", "no per-component state outside Redis", "all UI traffic via REST + SSE only").
- **Open architectural questions** the AI cannot resolve from inputs alone.
3. **Present condensed view** to the user (NOT the full draft files — a synopsis only):
```
══════════════════════════════════════
REVIEW: Glossary + Architecture Vision
══════════════════════════════════════
Glossary (N terms drafted):
- <Term>: <one-line definition>
- ...
Architecture Vision:
<one-paragraph synopsis>
Components / responsibilities:
- <component>: <one-line>
- ...
Principles / non-negotiables:
- <principle>
- ...
Open questions (AI could not resolve):
- <q1>
- <q2>
══════════════════════════════════════
A) Looks correct — write glossary.md, use vision for Phase 2a
B) I want to add / correct entries (provide diffs)
C) Answer the open questions first, then re-present
══════════════════════════════════════
Recommendation: pick C if open questions exist, otherwise A
══════════════════════════════════════
```
4. **Iterate**:
- On B → integrate the user's diffs/additions, re-present the condensed view, loop until A.
- On C → ask the listed open questions one round (M4-style batch), integrate answers, re-present.
- **Do NOT proceed to step 5 until the user picks A.**
5. **Save**:
- Write `_docs/02_document/glossary.md` with terms in alphabetical order. Include a top-line `**Status**: confirmed-by-user` and the date.
- Hold the confirmed vision (paragraph + components + principles) in working memory; Phase 2a will materialize it into `architecture.md` and **must** preserve every confirmed principle and component intent verbatim.
**Self-verification**:
- [ ] Every glossary entry traces to at least one input file (no invented terms)
- [ ] Every component listed in the vision is one the inputs reference
- [ ] All open questions are either answered or explicitly deferred (with the user's acknowledgement)
- [ ] User picked option A on the latest condensed view
**BLOCKING**: Do NOT proceed to Phase 2a until `glossary.md` is saved and the user has confirmed the architecture vision.
### Phase 2a: Architecture & Flows
1. Read all input files thoroughly
2. Incorporate findings, questions, and insights discovered during Step 1 (blackbox tests)
3. Research unknown or questionable topics via internet; ask user about ambiguities
4. Document architecture using `templates/architecture.md` as structure
5. Document system flows using `templates/system-flows.md` as structure
3. **Apply confirmed vision from Phase 2a.0**: the architecture document must include a top-level `## Architecture Vision` section that contains the user-confirmed paragraph, components, and principles verbatim. The rest of `architecture.md` (tech stack, deployment model, NFRs, ADRs) builds on top of that section, never contradicts it
4. Research unknown or questionable topics via internet; ask user about ambiguities
5. Document architecture using `templates/architecture.md` as structure
6. Document system flows using `templates/system-flows.md` as structure
**Self-verification**:
- [ ] `architecture.md` opens with a `## Architecture Vision` section matching Phase 2a.0
- [ ] Architecture covers all capabilities mentioned in solution.md
- [ ] System flows cover all main user/system interactions
- [ ] No contradictions with problem.md or restrictions.md
- [ ] No contradictions with problem.md, restrictions.md, or the confirmed vision
- [ ] Technology choices are justified
- [ ] Blackbox test findings are reflected in architecture decisions
- [ ] Every term used in `architecture.md` that is project-specific appears in `glossary.md`
**Save action**: Write `architecture.md` and `system-flows.md`
@@ -0,0 +1,187 @@
# Step 4.5: Architecture Decision Records (ADRs)
**Role**: Architect / technical writer
**Goal**: Capture every major architecture, tech-stack, data-model, and integration decision made during Steps 24 as a durable, dated, immutable record under `_docs/02_document/adr/`.
**Constraints**: ADRs only — do not re-open architecture; do not make new decisions in this step. Document what has been decided, not what is still open.
ADRs are the single thing in `_docs/` that explains the **why** of each major decision after the conversation history is gone. They are consumed by:
- `decompose` Step 1.5 (`steps/01-5_module-layout.md`) — every Accepted ADR is cross-checked against the module-layout proposal; conflicts trigger an explicit Choose between supersede / exception / re-open.
- `new-task` Step 4.5 (`SKILL.md` § "Step 4.5: Contract & Layout Check") — every new task is classified against Accepted ADRs as Conflict / Drift / Aligned; conflicts STOP the task with a Choose A/B/C; drift adds an `### ADR Impact` section; alignment adds an `### ADR Compliance` section.
- `refactor` Phase 2b.1 (`phases/02-analysis.md`) — every Accepted ADR is diffed against the proposed roadmap; Violations trigger a BLOCKING supersede gate that produces a `supersede_adr_NNN.md` task before any refactor task is created.
- `code-review` Phase 7 (`SKILL.md` § "Phase 7: Architecture Compliance") — every changed-files batch is checked against Accepted ADRs; ADR-Violation findings are Critical, ADR-Drift findings are High.
Discipline that still relies on the human: when a downstream skill detects a Drift case, the resulting task spec MUST land its `## ADR Impact` / `## ADR Compliance` section; the implementer must address it; the next code-review batch then has the context it needs. Drift left undocumented is the silent-failure path — every consumer hook above is designed to make it visible.
## Inputs
- `_docs/02_document/architecture.md` (incl. confirmed `## Architecture Vision`)
- `_docs/02_document/glossary.md`
- `_docs/02_document/data_model.md`
- `_docs/02_document/system-flows.md`
- `_docs/02_document/risk_mitigations.md` (and any `risk_mitigations_NN.md` iterations from Step 4)
- `_docs/02_document/components/[##]_[name]/description.md`
- `_docs/02_document/deployment/` (CI/CD, environments, observability)
- `_docs/00_problem/restrictions.md` and `_docs/00_problem/acceptance_criteria.md` (each ADR must reference relevant constraints / AC by ID)
- Optional: `_docs/01_solution/solution.md` and `_docs/01_solution/tech_stack.md` (research output)
- Optional: `_docs/LESSONS.md` — surface any lesson categories of `architecture` / `dependencies` that bias the recommendation
## What is an ADR (and what is not)
Capture an ADR when **all** of the following hold:
1. The decision picks between two or more genuinely valid approaches with meaningful trade-offs.
2. The decision has **downstream consequences** that other decisions, code, or tasks inherit from.
3. The decision is **non-obvious** to a future reader who only sees the final code — they would ask "why was it built this way?" rather than discovering the answer by reading the source.
Do NOT create an ADR for:
- Naming, formatting, or purely cosmetic choices.
- A choice that is fully implied by a single explicit restriction (`restrictions.md` is itself the record — link to it from the architecture doc instead).
- A choice the team has not actually made yet — open questions live in `risk_mitigations.md` or `_docs/_process_leftovers/`, not in ADRs.
- A technology selection where research already produced an exact-fit selection with one viable option (the research doc is the record — link to the relevant `solution_draft*.md` section).
## Process
### Phase 4.5a: Decision Inventory
Walk the inputs and list candidate decisions. For each candidate, record a one-liner:
```
- [decision] — [trade-off summary] — [downstream consumers] — [evidence file:section]
```
Inspect at minimum:
| Inspection target | Typical decisions surfaced |
|-------------------|----------------------------|
| `architecture.md` § layering | Layering style (clean vs hex vs n-tier), which layer owns transactions, how cross-cutting concerns enter |
| `architecture.md` § Architecture Vision | The North Star principle (e.g., "edge-first, sync-second"); ADR captures the implication for one specific subsystem |
| `data_model.md` | Datastore choice (Postgres vs Mongo), partitioning, soft vs hard deletes, schema evolution strategy |
| `system-flows.md` | Sync vs async boundaries, idempotency strategy, retry policy ownership, error envelope shape |
| `components/*/description.md` § interfaces | Public-API style (REST vs RPC vs event), versioning strategy, auth/authorization placement |
| `deployment/containerization.md` | Single container vs sidecar vs init container, base image lineage |
| `deployment/ci_cd_pipeline.md` | Trunk-based vs feature-branch, gate ordering, deploy strategy (blue-green / canary / all-at-once) |
| `deployment/observability.md` | Logging stack, metric backend, sampling rate decisions, retention |
| `risk_mitigations.md` | Risk-acceptance trade-offs (e.g., "we accept N% data loss in exchange for sub-100ms p99") |
| Tech-stack from `_docs/01_solution/tech_stack.md` | Anything where research recorded ≥2 candidates and a winner |
Drop any candidate that fails the three "what is an ADR" criteria above. Keep the rest.
### Phase 4.5b: Numbering and Slugs
ADRs are numbered globally per project, monotonically, never re-used.
1. List existing files under `_docs/02_document/adr/` matching `^[0-9]{3}_.+\.md$`.
2. The next ADR number is `max(existing) + 1`, zero-padded to 3 digits.
3. The slug is kebab-case, ≤6 words, derived from the decision summary. Example: `001_use-postgres-for-transactional-data.md`, `004_event-driven-cross-component-comms.md`.
### Phase 4.5c: Render One ADR Per Decision
For each kept candidate, render the ADR using `templates/adr.md`. Required sections (do NOT omit any):
| Section | Content |
|---------|---------|
| **Number** | `NNN` |
| **Title** | One-line decision statement (matches slug) |
| **Status** | `Proposed` (only during Step 4.5 iteration) → `Accepted` (after user confirmation at the BLOCKING gate) |
| **Date** | YYYY-MM-DD (the date the user confirmed) |
| **Deciders** | The user (project owner) — the AI is not a decider |
| **Context** | The problem this decision addresses, including links to AC IDs, restriction IDs, risks, and (where relevant) the research draft section |
| **Decision** | The chosen approach in one sentence, then the supporting detail |
| **Alternatives Considered** | Each alternative with a one-line "rejected because…" |
| **Consequences** | Positive (what becomes easier / cheaper / faster) and negative (what becomes harder / locked in / costly to undo). Be honest — every decision has a downside. |
| **Supersedes / Superseded by** | Empty initially; updated when a future ADR overturns this one |
| **Evidence** | File-and-section pointers into `_docs/` showing where the decision is reflected (architecture.md § layering, components/02_*/description.md § interface, etc.) |
After rendering, write each file to `_docs/02_document/adr/NNN_<slug>.md`. Keep `Status: Proposed` until the BLOCKING gate.
### Phase 4.5d: Maintain the ADR Index
Write or update `_docs/02_document/adr/README.md` with this exact shape:
```markdown
# Architecture Decision Records
This index lists every ADR for this project, in number order. ADRs are immutable once `Accepted`
new decisions that overturn a prior ADR are recorded as new ADRs whose `Supersedes` field points
back, and the original ADR's `Superseded by` field is updated.
| # | Title | Status | Date | Supersedes |
|---|-------|--------|------|------------|
| 001 | Use Postgres for transactional data | Accepted | 2026-05-21 | — |
| 002 | Event-driven cross-component comms | Accepted | 2026-05-21 | — |
| ... | ... | ... | ... | ... |
```
Sort by `#` ascending. Include all ADRs ever written, even superseded ones — the audit trail is the point.
### Phase 4.5e: Cross-Link from architecture.md
In `architecture.md`, every section that reflects an ADR decision gets a one-line trailing reference:
```markdown
> See ADR 001 (Use Postgres for transactional data), ADR 003 (Event-driven cross-component comms).
```
Place the reference at the end of the section, after the prose. This lets a future reader of `architecture.md` jump straight to the rationale.
### Phase 4.5f: BLOCKING Gate — User Confirmation
Present the ADR set to the user using the Choose format from `.cursor/skills/autodev/protocols.md` (or plain text if AskQuestion is unavailable):
```
══════════════════════════════════════
DECISION REQUIRED: ADR set captured (N records)
══════════════════════════════════════
001 — [title]
002 — [title]
...
══════════════════════════════════════
A) Accept all ADRs as written
B) Edit specific ADRs (numbers and edits)
C) Add a missed decision (description)
D) Remove an ADR (number and reason)
══════════════════════════════════════
Recommendation: A — review the rendered set and confirm; corrections are quick on Round 2
══════════════════════════════════════
```
Loop:
- **A** → flip every ADR's `Status` from `Proposed` to `Accepted`, set `Date` to today's date, save, exit step.
- **B** → apply edits, re-present the modified ADRs, loop.
- **C** → run Phase 4.5a4.5e for the missed decision only, append to the set, re-present, loop.
- **D** → confirm with the user that the candidate fails the three "what is an ADR" criteria, remove the file, update the index, loop.
Do NOT mark `Accepted` without an explicit user A.
## Self-verification
- [ ] Every kept candidate from Phase 4.5a has a corresponding file under `adr/`
- [ ] Every ADR has all required sections (none empty except `Supersedes` / `Superseded by`)
- [ ] `Decision` sections are one-sentence-then-detail, not "we'll figure it out"
- [ ] `Alternatives Considered` lists at least one rejected alternative per ADR
- [ ] `Consequences` lists both positive AND negative consequences (an ADR with no negatives is suspect)
- [ ] `Evidence` points at real `_docs/` sections that exist on disk
- [ ] `adr/README.md` index lists every file in the directory and matches their `Status` / `Date`
- [ ] `architecture.md` has a trailing `See ADR …` reference at every section that an ADR reflects
- [ ] The user confirmed the set via Choose A; every ADR is `Accepted` with today's date
## Common mistakes
- **Re-opening architecture**: Step 4.5 records, it does not decide. If a candidate decision turns out to be unsettled, that's a Step 2 / Step 4 gap — return there, do not paper over it with a wishy-washy ADR.
- **Decision-of-the-week**: do not write an ADR for every minor pattern choice. The bar is "non-obvious to a future reader". 515 ADRs is typical for a planning round; 40+ is over-capture.
- **Negative consequences left empty**: every real decision has costs. If you cannot name one, the decision was not actually weighed.
- **Vague evidence**: `architecture.md` is not enough — point at the specific section. `architecture.md § Layering``architecture.md`.
- **Numbering reuse**: never recycle a number from a deleted ADR. The audit trail is more important than tidy numbering.
- **Superseding without recording**: when a later cycle overturns an ADR, the new ADR must point at the old one via `Supersedes`, AND the old ADR's `Superseded by` field must be updated. Index reflects both. (This is enforced when `decompose` or `refactor` later updates ADRs.)
## Escalation
| Situation | Action |
|-----------|--------|
| Candidate decision is unsettled (the team has not actually decided) | Return to the originating step (2 / 3 / 4); do NOT write a placeholder ADR |
| Two candidates in Phase 4.5a turn out to be the same decision phrased differently | Merge into one ADR, list both phrasings in `Context` |
| User picks D (remove an ADR) and the AI judges the decision is genuinely worth recording | Surface the disagreement, ASK why the user wants it removed, defer to user |
| Existing `adr/` directory has files but `adr/README.md` is missing or stale | Rebuild the index from the directory before adding new ADRs |
@@ -2,7 +2,7 @@
**Role**: Professional Quality Assurance Engineer
**Goal**: Write test specs for each component achieving minimum 75% acceptance criteria coverage
**Goal**: Write test specs for each component achieving the canonical minimum acceptance-criteria coverage (currently 75% — see `.cursor/rules/cursor-meta.mdc` Quality Thresholds; do not restate a different number here)
**Constraints**: Test specs only — no test code. Each test must trace to an acceptance criterion.
@@ -58,4 +58,4 @@ Do NOT create minimal epics with just a summary and short description. The epic
8. **Create "Blackbox Tests" epic** — this epic will parent the blackbox test tasks created by the `/decompose` skill. It covers implementing the test scenarios defined in `tests/`.
**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If `tracker: local`, save locally only.
**Save action**: Epics created via the configured tracker MCP. Also saved locally in `epics.md` with ticket IDs. If tracker availability fails, follow `.cursor/rules/tracker.mdc`; only if the user explicitly chooses `tracker: local`, save locally only with pending tracker markers.
+67
View File
@@ -0,0 +1,67 @@
# ADR-{NNN}: {decision-title}
- **Status**: {Proposed | Accepted | Deprecated | Superseded}
- **Date**: {YYYY-MM-DD}
- **Deciders**: {user / project owner}
- **Supersedes**: {ADR-NNN | —}
- **Superseded by**: {ADR-NNN | —}
## Context
What problem does this decision address? Cite the relevant constraint(s), acceptance criterion / criteria, and risk(s) by ID.
- Acceptance criteria addressed: AC-{ID-1}, AC-{ID-2}
- Restrictions addressed: R-{ID-1}, R-{ID-2}
- Risks addressed: RISK-{ID-1}
- Research source (if any): `_docs/01_solution/solution_draftN.md` § {section}
A short paragraph (36 sentences) explaining why a choice is required now and what makes it non-trivial. Do not pre-announce the decision here — that goes in `Decision`. Focus on the forces at play (load, scale, team familiarity, hardware constraints, regulatory drivers, third-party limits).
## Decision
One declarative sentence: **"We will …"** Then 13 paragraphs of supporting detail explaining how the decision will be implemented at the boundaries between components.
Be specific. "We will use Postgres" is too thin; "We will use Postgres 16 with logical replication for read scaling, restricting JSONB columns to top-level metadata only, with all transactional data in normalized tables" is the right resolution.
## Alternatives Considered
| Alternative | Rejected because |
|-------------|------------------|
| {Alt 1 — short label} | {one line: the cost / mismatch / risk that ruled it out, ideally referencing a measurable criterion} |
| {Alt 2 — short label} | {one line} |
| {Alt 3 — short label} | {one line} |
At least one rejected alternative is mandatory. If only one option was ever considered, this is not an ADR — link to the source restriction or research selection from the parent doc instead.
## Consequences
### Positive
- {What becomes easier / cheaper / faster, with concrete examples where possible}
- {…}
### Negative
- {What becomes harder / locked in / costly to undo}
- {…}
Every real decision has both. If the negatives section is hard to fill, the alternatives were probably not weighed seriously — return to the prior step.
### Neutral / Open
- {What is unchanged but worth flagging for future readers (e.g., "this does not change the auth boundary; auth remains in component 02_user_management as decided in ADR-003")}
## Evidence
Where this decision is reflected on disk. Use `file:section` links so future readers can jump.
- `_docs/02_document/architecture.md` § {section}
- `_docs/02_document/data_model.md` § {section}
- `_docs/02_document/components/{##_name}/description.md` § {section}
- `_docs/02_document/system-flows.md` § {flow name}
- `_docs/02_document/deployment/{file}.md` § {section}
- {add more as needed}
## Notes
Optional. Use for caveats that did not fit above, links to external research, or follow-ups that the team agreed to revisit on a known trigger ("re-evaluate after 6 months in production" / "re-evaluate when load exceeds 10× baseline").
+1 -1
View File
@@ -133,4 +133,4 @@ Link to architecture.md and relevant component spec.]
- `component` — a normal per-component epic
- `cross-cutting` — a shared concern that spans ≥2 components
- `tests` — the blackbox-tests epic (always exactly one)
- Complexity points for child issues follow the project standard: 1, 2, 3, 5, 8. Do not create issues above 5 points — split them.
- Complexity points for child issues follow the project standard: 1, 2, 3, 5. Do not create issues above 5 points — split them.
@@ -1,6 +1,6 @@
# Final Planning Report Template
Use this template after completing all 6 steps and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.
Use this template after completing all steps (1, 2, 3, 4, 4.5, 5, 6) and the quality checklist. Save as `_docs/02_document/FINAL_report.md`.
---
+2
View File
@@ -181,6 +181,8 @@ Categorized measurable criteria with markdown headers and bullet points:
Every criterion must have a measurable value. Vague criteria like "should be fast" are not acceptable — push for "less than 400ms end-to-end".
**AC must be design-independent**: describe testable outcomes only — no libraries, algorithms, params, or design choices. Implementation follows AC, never reverse. (IEEE 830 / Atlassian / GitScrum)
### input_data/
At least one file. Options:
+5 -3
View File
@@ -24,6 +24,8 @@ Phase details live in `phases/` — read the relevant file before executing each
- **Save immediately**: write artifacts to disk after each phase
- **Delegate execution**: all code changes go through the implement skill via task files
- **Ask, don't assume**: when scope or priorities are unclear, STOP and ask the user
- **Exact-fit recommendations**: do not recommend a replacement pattern, library, service, architecture, algorithm, or "modern approach" merely because it improves structure or solves a similar class of problem. It must fit confirmed product constraints, acceptance criteria, operating context, integration boundaries, and current code realities. Otherwise reject it, mark it experimental, or ask the user before adding it to the roadmap.
- **Per-mode API capability verification on replacements**: when a refactor proposes replacing or adding a library/SDK/framework/service that exposes multiple modes or configurations, pin the exact mode the refactored code will use (inputs, outputs, runtime) and verify *that mode* via mandatory `context7` lookup plus a saved Minimum Viable Example before promoting the recommendation to `Selected`. Capability claims at the category level ("supports A, B, C modes") must be cross-checked against the literal mode enumeration — `A, B → A+B` style conflations are the recurring silent-failure path.
## Context Resolution
@@ -57,7 +59,7 @@ Create REFACTOR_DIR and RUN_DIR if missing. If a RUN_DIR with the same name alre
Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-changes.md`). Both modes then convert that file into task files in TASKS_DIR during Phase 2.
**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file to avoid duplication.
**Guided mode cleanup**: after `RUN_DIR/list-of-changes.md` is created from the input file, delete the original input file only if it lives outside `RUN_DIR`. If the provided file is already the canonical `RUN_DIR/list-of-changes.md`, keep it as the audit record.
## Workflow
@@ -79,10 +81,10 @@ Both modes produce `RUN_DIR/list-of-changes.md` (template: `templates/list-of-ch
- "refactor [specific target]" → skip phase 1 if docs exist
- Default → all phases
**Testability-run specifics** (guided mode invoked by autodev existing-code flow Step 4):
**Testability-run specifics** (guided mode invoked by autodev existing-code Step 4 or greenfield Step 8):
- Run name is `01-testability-refactoring`.
- Phase 3 (Safety Net) is skipped by design — no tests exist yet. Compensating control: the `list-of-changes.md` gate in Phase 1 must be reviewed and approved by the user before Phase 4 runs.
- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see existing-code flow Step 4 for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for Step 8 (optional full refactor) consideration.
- Scope is MINIMAL and surgical; reject change entries that drift into full refactor territory (see the invoking flow's testability step for allowed/disallowed lists). Flagged entries go to `RUN_DIR/deferred_to_refactor.md` for the next optional full-refactor step or backlog consideration.
- After Phase 4 (Execution) completes, write `RUN_DIR/testability_changes_summary.md` as Phase 4.5. Format: one bullet per applied change.
```markdown
# Testability Changes Summary ({{run_name}})
@@ -95,7 +95,7 @@ Also copy to project standard locations:
**Critical step — do not skip.** Before producing the change list, cross-reference documented business flows against actual implementation. This catches issues that static code inspection alone misses.
1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md`, `DOCUMENT_DIR/module-layout.md`, every file under `DOCUMENT_DIR/contracts/`, and `SOLUTION_DIR/solution.md` (whichever exist). Extract every documented business flow, data path, architectural decision, module ownership boundary, and contract shape.
1. **Read documented flows**: Load `DOCUMENT_DIR/system-flows.md`, `DOCUMENT_DIR/architecture.md` (paying special attention to its `## Architecture Vision` section — that's the user-confirmed structural intent), `DOCUMENT_DIR/glossary.md`, `DOCUMENT_DIR/module-layout.md`, every file under `DOCUMENT_DIR/contracts/`, and `SOLUTION_DIR/solution.md` (whichever exist). Extract every documented business flow, data path, architectural decision, module ownership boundary, and contract shape. Any refactor change that contradicts a confirmed Architecture Vision principle must either be rejected or surfaced to the user before being added to `list-of-changes.md` — those principles are not refactor targets without explicit user approval.
2. **Trace each flow through code**: For every documented flow (e.g., "video batch processing", "image tiling", "engine initialization"), walk the actual code path line by line. At each decision point ask:
- Does the code match the documented/intended behavior?
+73 -4
View File
@@ -7,14 +7,29 @@
## 2a. Deep Research
1. Analyze current implementation patterns
2. Research modern approaches for similar systems
3. Identify what could be done differently
4. Suggest improvements based on state-of-the-art practices
2. Extract the **Project Constraint Matrix** from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, current architecture/docs, and actual code constraints. Include required inputs/outputs, operating context, lifecycle assumptions, integration boundaries, non-functional targets, and hard disqualifiers.
3. Research modern approaches for similar systems
4. For each alternative pattern/library/service/architecture/algorithm, research intrinsic implementation constraints: required inputs/outputs, runtime assumptions, supported deployment modes, resource needs, operational limits, licensing/security constraints, and known failure reports.
**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for proposed replacements)**
When a refactor recommendation replaces (or adds) a library/SDK/framework/service, the same per-mode verification used by `/research` Step 2 applies — selecting a replacement on category fit alone is the same silent-failure path. For every replacement candidate that has multiple modes or configurations:
1. **Pin the exact mode/configuration** the refactored code will use, in one explicit sentence. Inputs (data shapes, sensor counts, payloads, rates), outputs (per `acceptance_criteria.md` and contract files), runtime (matching the project's deployment).
2. **Run `context7` (or equivalent docs lookup)** for the candidate. **Mandatory for every replacement library/SDK/framework candidate**, not optional. Minimum three queries per candidate: mode enumeration, project's exact mode (with input/output shapes), disqualifier probe ("does this mode produce the required output? are there published limitations on this runtime?"). Append URLs to `RUN_DIR/analysis/research_findings.md` references section.
3. **Save a Minimum Viable Example (MVE)** for the pinned mode under `RUN_DIR/analysis/mve_evidence.md` with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌. If no official example covers the project's exact configuration, the recommendation cannot be `Selected` based on category fit alone — it must be `Experimental only` (with required-evidence note) or `Rejected`.
4. **Treat "the same library in a different mode" as a different recommendation.** If the project's pinned mode is `<X>` but the only documented evidence covers `<Y>`, do not silently soften the description. Open a separate recommendation row, with its own MVE, fit assessment, and disqualifiers.
5. **Common silent-failure pattern**: a fact summary paraphrases docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes" — no `A+B` combination exists. Cross-check paraphrased capability claims against the literal mode enumeration.
5. Identify what could be done differently
6. Suggest improvements only when they fit the Project Constraint Matrix. A cleaner or more modern approach that violates product constraints must be marked `Rejected` or `Experimental only`, not added as a roadmap recommendation.
Write `RUN_DIR/analysis/research_findings.md`:
- Current state analysis: patterns used, strengths, weaknesses
- Alternative approaches per component: current vs alternative, pros/cons, migration effort
- Prioritized recommendations: quick wins + strategic improvements
- Constraint-fit table: recommendation, **pinned mode/config**, constraints checked, **API capability evidence (MVE link)**, evidence, mismatches/disqualifiers, status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
- For every recommendation that replaces or adds a library/SDK/framework, append a **Restrictions × Candidate-Mode sub-matrix** that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's pinned mode, marking each cell ✅ Pass / ❌ Fail / ❓ Verify / N/A with cited evidence. A recommendation cannot be `Selected` while any cell is ❌ or ❓.
## 2b. Solution Assessment & Hardening Tracks
@@ -22,6 +37,45 @@ Write `RUN_DIR/analysis/research_findings.md`:
2. Identify weak points in codebase, map to specific code areas
3. Perform gap analysis: acceptance criteria vs current state
4. Prioritize changes by impact and effort
5. Reject or escalate any proposed refactor that improves code structure while weakening required behavior, integration contracts, runtime constraints, safety/security posture, or acceptance criteria
### 2b.1. ADR Superseding Gate (BLOCKING)
A refactor that improves code structure while overturning a documented architecture decision is the silent-drift class the project repeatedly burns on (see `meta-rule.mdc` § GPS-passthrough postmortem and the auto-lessons it produced). This gate makes drift visible and forces a deliberate ADR update.
1. **List candidate ADRs**: read every `Status: Accepted` file in `_docs/02_document/adr/`. If the directory does not exist or contains only the index, log `No ADRs in scope` to `RUN_DIR/analysis/adr_impact.md` and skip the rest of this gate.
2. **Diff each candidate against the proposed refactor roadmap**: for each ADR, ask the same two questions as code-review Phase 7:
- **Violation**: does any roadmap item do the *opposite* of the ADR's `Decision`?
- **Drift**: does any roadmap item materially affect the ADR's `Consequences` (positive or negative) without contradicting the Decision outright?
3. **Classify each impacted ADR** in `RUN_DIR/analysis/adr_impact.md`:
| ADR | Roadmap item | Impact | Required action |
|-----|--------------|--------|-----------------|
| NNN | `roadmap-item-NN` | Violation / Drift / Aligned | (filled by Choose A/B/C below) |
4. **For every Violation row, present a BLOCKING Choose**:
```
══════════════════════════════════════
DECISION REQUIRED: Refactor would violate ADR-NNN (<title>)
══════════════════════════════════════
A) Update the ADR via supersede: the refactor produces a NEW ADR
(`Supersedes: NNN`) capturing the new Decision, and ADR-NNN's
`Superseded by` field is updated. The supersede ADR is itself a
deliverable of this refactor run (added to RUN_DIR/analysis/adr_impact.md
and to TASKS_DIR as a task) and must be `Accepted` before Phase 4.
B) Reduce the refactor scope to NOT violate ADR-NNN
C) Re-evaluate ADR-NNN: keep the refactor but only after ADR-NNN is
formally re-opened in a new /plan Step 4.5 round
══════════════════════════════════════
Recommendation: A — supersede is the only path that keeps the audit
trail intact while letting the refactor land
══════════════════════════════════════
```
5. **For every Drift row**: do not block, but the roadmap item must include a `## ADR Impact` section in its task spec citing the affected ADR(s). The implementer surfaces this at code-review Phase 7, which would otherwise classify the change as ADR-Drift (High) without context.
6. **For every Aligned row**: cite the ADR in the roadmap item's task spec under `## ADR Compliance`. No further action.
7. **Self-supersede deliverable**: any Choose A path adds a `[##]_supersede_adr_NNN.md` task file to the refactor run's TASKS_DIR with the new ADR text drafted (using `.cursor/skills/plan/templates/adr.md`). The task's only Acceptance Criterion is "ADR file exists at `_docs/02_document/adr/<next>_<slug>.md` with `Status: Accepted`, ADR-NNN's `Superseded by` field updated, and `_docs/02_document/adr/README.md` index reflects both."
Present optional hardening tracks for user to include in the roadmap:
@@ -47,6 +101,11 @@ Write `RUN_DIR/analysis/refactoring_roadmap.md`:
- Gap analysis: what's missing, what needs improvement
- Phased roadmap: Phase 1 (critical fixes), Phase 2 (major improvements), Phase 3 (enhancements)
- Selected hardening tracks and their items
- Applicability gate: each roadmap item must state constraint fit, mismatches, required evidence, and status (`Selected` / `Rejected` / `Experimental only` / `Needs user decision`)
**BLOCKING applicability gate**: Before 2c and 2d, every recommendation in the roadmap must be `Selected`. Items marked `Rejected` are excluded. Items marked `Experimental only` or `Needs user decision` require a user decision before task creation.
**BLOCKING ADR-supersede gate**: Before 2c and 2d, every Violation row in `RUN_DIR/analysis/adr_impact.md` (from 2b.1) must be resolved via Choose A, B, or C. A Violation row with no chosen path blocks task creation.
## 2c. Create Epic
@@ -55,7 +114,7 @@ Create a work item tracker epic for this refactoring run:
1. Epic name: the RUN_DIR name (e.g., `01-testability-refactoring`)
2. Create the epic via configured tracker MCP
3. Record the Epic ID — all tasks in 2d will be linked under this epic
4. If tracker unavailable, use `PENDING` placeholder and note for later
4. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; only use `PENDING` placeholders if the user explicitly chooses `tracker: local`
## 2d. Task Decomposition
@@ -79,6 +138,12 @@ Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files
**Self-verification**:
- [ ] All acceptance criteria are addressed in gap analysis
- [ ] Recommendations are grounded in actual code, not abstract
- [ ] Every recommendation has been checked against the Project Constraint Matrix
- [ ] No recommendation violates product restrictions, acceptance criteria, documented architecture decisions, or actual code integration boundaries
- [ ] Every replacement library/SDK/framework recommendation has a pinned mode/config, a saved MVE in `mve_evidence.md`, and a Restrictions × Candidate-Mode sub-matrix with no ❌ or ❓ cells
- [ ] `context7` (or equivalent) was consulted for every replacement library/SDK/framework recommendation
- [ ] Paraphrased capability claims have been cross-checked against the literal mode-enumeration evidence (no `A, B → A+B` style conflation)
- [ ] Rejected and experimental approaches are documented but not converted into implementation tasks without user approval
- [ ] Roadmap phases are prioritized by impact
- [ ] Epic created and all tasks linked to it
- [ ] Every entry in list-of-changes.md has a corresponding task file in TASKS_DIR
@@ -86,6 +151,10 @@ Convert the finalized `RUN_DIR/list-of-changes.md` into implementable task files
- [ ] Task dependencies are consistent (no circular dependencies)
- [ ] `_dependencies_table.md` includes all refactoring tasks
- [ ] Every task has a work item ticket (or PENDING placeholder)
- [ ] If `_docs/02_document/adr/` exists with Accepted ADRs, `RUN_DIR/analysis/adr_impact.md` has been written and every Violation row is resolved (A/B/C) — no implicit overrides
- [ ] For every Violation resolved via Choose A, a `[##]_supersede_adr_NNN.md` task exists in TASKS_DIR with the drafted supersede ADR
- [ ] For every Drift row, the corresponding roadmap-item task spec has a `## ADR Impact` section
- [ ] For every Aligned row, the corresponding roadmap-item task spec has a `## ADR Compliance` section
**Save action**: Write analysis artifacts to RUN_DIR, task files to TASKS_DIR
@@ -15,9 +15,9 @@ Before designing or implementing any new tests, check what already exists:
1. Scan the project for existing test files (unit tests, integration tests, blackbox tests)
2. Run the existing test suite — record pass/fail counts
3. Measure current coverage against the areas being refactored (from `RUN_DIR/list-of-changes.md` file paths)
4. Assess coverage against thresholds:
4. Assess coverage against thresholds (canonical: see `.cursor/rules/cursor-meta.mdc` Quality Thresholds — never hardcode a different number):
- Minimum overall coverage: 75%
- Critical path coverage: 90%
- Critical path coverage: **90% floor / 100% aim** — 90% is the enforcement floor (blocks Phase 4 if not met); 100% is the aspirational target. Refactors are NOT permitted to drop below 90% on the critical paths covered by the in-scope changes.
- All public APIs must have blackbox tests
- All error handling paths must be tested
@@ -47,7 +47,7 @@ For each uncovered critical area, write test specs to `RUN_DIR/test_specs/[##]_[
4. Document any discovered issues
**Self-verification**:
- [ ] Coverage requirements met (75% overall, 90% critical paths) across existing + new tests
- [ ] Coverage requirements met (75% overall, 90% critical-path floor — 100% aim — per canonical `cursor-meta.mdc` Quality Thresholds) across existing + new tests
- [ ] All tests pass on current codebase
- [ ] All public APIs in refactoring scope have blackbox tests
- [ ] Test data fixtures are configured
@@ -10,7 +10,7 @@
- All `[TRACKER-ID]_refactor_*.md` files are present
- Each task file has valid header fields (Task, Name, Description, Complexity, Dependencies)
2. Verify `TASKS_DIR/_dependencies_table.md` includes the refactoring tasks
3. Verify all tests pass (safety net from Phase 3 is green)
3. Verify all tests pass (safety net from Phase 3 is green), unless this is a testability run where Phase 3 was intentionally skipped
4. If any check fails, go back to the relevant phase to fix
## 4b. Delegate to Implement Skill
@@ -21,9 +21,9 @@ The implement skill will:
1. Parse task files and dependency graph from TASKS_DIR
2. Detect already-completed tasks (skip non-refactoring tasks from prior workflow steps)
3. Compute execution batches for the refactoring tasks
4. Launch implementer subagents (up to 4 in parallel)
4. Implement tasks sequentially in topological order (no subagents, no parallelism)
5. Run code review after each batch
6. Commit and push per batch
6. Commit per batch and push only when the user approved pushing
7. Update work item ticket status
Do NOT modify, skip, or abbreviate any part of the implement skill's workflow. The refactor skill is delegating execution, not optimizing it.
@@ -47,7 +47,7 @@ After the implement skill completes:
For each successfully completed refactoring task:
1. Transition the work item ticket status to **Done** via the configured tracker MCP
2. If tracker unavailable, note the pending status transitions in `RUN_DIR/execution_log.md`
2. If tracker is unavailable, follow `.cursor/rules/tracker.mdc`; if the user explicitly chose `tracker: local`, note the pending status transitions in `RUN_DIR/execution_log.md`
For any failed or blocked tasks, leave their status as-is (the implement skill already set them to In Testing or blocked).
@@ -45,7 +45,7 @@ Write `RUN_DIR/test_sync/new_tests.md`:
- [ ] All obsolete tests removed or merged
- [ ] All pre-existing tests pass after updates
- [ ] New code from Phase 4 has test coverage
- [ ] Overall coverage meets or exceeds Phase 3 baseline (75% overall, 90% critical paths)
- [ ] Overall coverage meets or exceeds Phase 3 baseline (75% overall, 90% critical-path floor / 100% aim — per `.cursor/rules/cursor-meta.mdc` Quality Thresholds)
- [ ] No tests reference removed or renamed code
**Save action**: Write test_sync artifacts; implemented tests go into the project's test folder
@@ -32,7 +32,7 @@ For each component doc affected:
## 7d. Update System-Level Documentation
If structural changes were made (new modules, removed modules, changed interfaces):
1. Update `_docs/02_document/architecture.md` if architecture changed
1. Update `_docs/02_document/architecture.md` if architecture changed — but **never edit the `## Architecture Vision` section**. That section is user-confirmed (plan Phase 2a.0 / document Step 4.5); if a refactor invalidates a vision principle, surface it to the user and let them update the vision themselves before continuing. Update only the technical sections below the Vision H2.
2. Update `_docs/02_document/system-flows.md` if flow sequences changed
3. Update `_docs/02_document/diagrams/components.md` if component relationships changed
@@ -23,6 +23,7 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
- **Problem**: [what makes this problematic / untestable / coupled]
- **Change**: [what to do — behavioral description, not implementation steps]
- **Rationale**: [why this change is needed]
- **Constraint Fit**: [which product constraints / acceptance criteria / integration boundaries this preserves; or "Rejected — violates ..."]
- **Risk**: [low | medium | high]
- **Dependencies**: [other change IDs this depends on, or "None"]
@@ -31,6 +32,7 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
- **Problem**: [description]
- **Change**: [description]
- **Rationale**: [description]
- **Constraint Fit**: [description]
- **Risk**: [low | medium | high]
- **Dependencies**: [C01, or "None"]
```
@@ -44,6 +46,8 @@ Save as `RUN_DIR/list-of-changes.md`. Produced during Phase 1 (Discovery).
- **File(s)** must reference actual files verified to exist in the codebase
- **Problem** describes the current state, not the desired state
- **Change** describes what the system should do differently — behavioral, not prescriptive
- **Constraint Fit** proves the change preserves confirmed product requirements, restrictions, acceptance criteria, architecture decisions, and integration contracts
- Do not include changes whose only benefit is structural cleanliness if they weaken required behavior or violate constraints; record those as rejected in analysis instead
- **Dependencies** reference other change IDs within this list; cross-run dependencies use tracker IDs
- In guided mode, the input file entries are validated against actual code and enriched with file paths, risk, and dependencies before writing
- In automatic mode, entries are derived from Phase 1 component analysis and Phase 2 research findings
+290
View File
@@ -0,0 +1,290 @@
---
name: release
description: |
Executes the deployment plan produced by /deploy against a target environment.
Closes the loop between "we have a plan" and "the new version is running in production with a verdict on disk."
6-phase workflow: pre-release gate, strategy select, execute, smoke test, watch window, commit-or-rollback.
Outputs _docs/04_release/release_<version>.md with a definitive Released / Rolled-Back / Aborted verdict.
Trigger phrases:
- "release", "ship", "go live", "release this version"
- "deploy to prod", "promote to staging", "roll out"
- "rollback", "abort the release"
category: ship
tags: [release, deployment, rollback, smoke-test, observability, production]
disable-model-invocation: true
---
# Release Execution
The `/deploy` skill produces a plan and scripts. The `/release` skill **runs** them, verifies the live system, watches it for a defined window, and produces a definitive verdict on disk.
## Core Principles
- **Real execution, not simulation**: every phase must actually run against the target environment. If a phase cannot be executed (missing scripts, no SSH access, disabled secrets, registry auth failure), STOP — do not pretend a step succeeded. See `meta-rule.mdc` § "Real Results, Not Simulated Ones".
- **Verifiable rollback path**: the release does not start until rollback is proven viable for this version. "We can roll back" without evidence is not a rollback path.
- **Quiet failure is a release failure**: a deploy script that exits 0 but emits no observable signal in the watch window is treated as a regression, not a success.
- **One release per invocation**: a single `/release` execution targets exactly one version against exactly one environment. Multi-stage promotion (staging → prod) is two invocations, not one.
- **Never skip the watch window**: even successful deploys can degrade after 560 minutes (cache warm-up, scheduled jobs, downstream backpressure). The watch window is mandatory.
- **Autonomous rollback on hard regressions**: critical health-check failure, error-rate spike above threshold, or smoke-test failure → automatic rollback. Soft regressions (latency drift, capacity warnings) escalate to the user.
## Context Resolution
Fixed paths:
- DEPLOY_DIR: `_docs/04_deploy/`
- RELEASE_DIR: `_docs/04_release/`
- SCRIPTS_DIR: `scripts/`
- DEPLOY_SCRIPT: `scripts/deploy.sh`
- HEALTH_SCRIPT: `scripts/health-check.sh`
- ENV_TEMPLATE: `.env.example`
- OBSERVABILITY_DOC: `_docs/04_deploy/observability.md`
- ENVIRONMENT_DOC: `_docs/04_deploy/environment_strategy.md`
- PROCEDURES_DOC: `_docs/04_deploy/deployment_procedures.md`
- ARCHITECTURE: `_docs/02_document/architecture.md`
- RESTRICTIONS: `_docs/00_problem/restrictions.md`
Announce the resolved paths and the **target environment + version + strategy** to the user before any phase that touches the live system.
## Inputs (BLOCKING prerequisites)
| Input | Required | Source |
|-------|----------|--------|
| Target environment | Yes — ASK user | `environment_strategy.md` enumerates valid options |
| Target version / image tag | Yes — ASK user | Must exist in the registry; verified in Phase 1 |
| Rollback target version | Yes — ASK user | Defaults to currently-deployed version if discoverable |
| `scripts/deploy.sh` | Yes | Produced by `/deploy` Step 7. STOP if missing → run `/deploy` first |
| `scripts/health-check.sh` | Yes | Same |
| `_docs/04_deploy/deployment_procedures.md` | Yes | Defines per-environment runbook, manual approval rules, change-window restrictions |
| `_docs/04_deploy/observability.md` | Yes | Defines watch metrics, thresholds, and dashboards |
| `_docs/04_deploy/environment_strategy.md` | Yes | Defines target hostnames, registries, secrets, deploy strategy per env |
## Outputs
```
RELEASE_DIR/
├── release_<version>_<env>_<YYYY-MM-DD-HHmm>.md (mandatory; one per invocation)
├── rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md (only when rollback fires; pairs with the release file)
└── manual_approvals/
└── approval_<version>_<env>.md (when restrictions require manual approval, written before Phase 3)
```
The release report (`templates/release-report.md`) is appended to as each phase completes — it is durable across phase failures and reflects partial progress so the next operator can resume or audit.
## Phases
```
┌────────────────────────────────────────────────────────────────┐
│ Release Execution (6-Phase Method) │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: deploy artifacts on disk; tests green at HEAD │
│ │
│ 1. Pre-Release Gate → AC + change summary + readiness │
│ [BLOCKING: user confirms or aborts] │
│ 2. Strategy Select → all-at-once / blue-green / canary │
│ [BLOCKING: user picks strategy] │
│ 3. Execute → run deploy.sh, capture exit + logs │
│ [AUTO-ROLLBACK on non-zero exit] │
│ 4. Smoke Test → /test-run prod-smoke in target env │
│ [AUTO-ROLLBACK on failure] │
│ 5. Watch Window → poll observability for N minutes │
│ [AUTO-ROLLBACK on hard threshold breach] │
│ 6. Commit or Rollback → finalize verdict, update tracker │
│ [BLOCKING: user confirms only if soft regression escalated] │
├────────────────────────────────────────────────────────────────┤
│ Verdicts: Released · Rolled-Back · Aborted │
└────────────────────────────────────────────────────────────────┘
```
### Phase 1: Pre-Release Gate
**Goal**: Refuse to start if the system is not ready for a real release.
1. **Acceptance criteria check**: read `_docs/00_problem/acceptance_criteria.md`. If any AC is marked unmet OR if any AC has no associated test marked `Passed` in the latest `test-run` report, STOP and surface the unmet items. Do not let the user override with "ship anyway" without a recorded reason in the release report.
2. **Test status check**: read the most recent `_docs/06_metrics/perf_*.md` (if perf is required by restrictions) and the latest functional test report. Any failing or skipped test that maps to a critical-path AC blocks the release.
3. **Change summary**: read the git log between the version-tag-of-last-release and HEAD (or, if no prior release exists, from the project root commit). Render a short list grouped by component: features, fixes, breaking changes, security fixes. Cross-reference against the latest implementation reports under `_docs/03_implementation/`.
4. **Rollback readiness**:
- Confirm the previous version's image is still pullable from the registry (do not deploy without this).
- Confirm `scripts/deploy.sh --rollback` works as documented (read the script; if `--rollback` flag is missing, STOP — that is a deploy-skill bug).
- Confirm a rollback target exists (e.g., previously-deployed image tag) and is recorded in the release report under `Rollback Plan`.
5. **Restrictions**: read `_docs/00_problem/restrictions.md` for change-window rules, manual-approval rules, blackout windows, regulatory requirements (e.g., 4-eyes review, ITAR controls). If any apply, gate accordingly — write a `manual_approvals/approval_<version>_<env>.md` file once received.
6. **Tracker check**: list tracker tickets in the release scope (per `tracker.mdc` rules). Any ticket still in `In Progress` or `Code Review` that maps to a change in the release scope blocks Phase 1. Move-and-deploy is not allowed.
**BLOCKING gate**: present the assembled summary to the user using Choose A/B/C:
```
══════════════════════════════════════
PRE-RELEASE GATE
══════════════════════════════════════
Target env: {env}
Target version: {version} ({git-sha})
Rollback target: {previous-version}
Changes: N tickets, M components
- {summary list}
Open risks: {summary or "none"}
Blocking issues: {summary or "none"}
══════════════════════════════════════
A) Proceed to Strategy Select
B) Abort — fix blocking issue and re-invoke
C) Edit release scope — exclude a ticket and reassemble
══════════════════════════════════════
```
If A → write Phase 1 section to release report, proceed. If B → write `Aborted` verdict to release report with reason, exit. If C → loop back into Phase 1 with edited scope.
### Phase 2: Strategy Select
**Goal**: Pick the deployment strategy that fits the change risk and environment capability.
Read `environment_strategy.md` and `deployment_procedures.md` to learn which strategies the target env supports. Strategies and when each is appropriate:
| Strategy | When to pick | Risk if wrong |
|----------|--------------|---------------|
| **all-at-once** | Internal tools, low traffic, well-rehearsed change, env supports nothing else | All users hit the new version simultaneously — bug blast radius is 100% |
| **blue-green** | Stateless services with a load balancer, env has dual-stack capability | Cutover is binary — observability must be ready to detect issues fast |
| **canary** | Customer-facing, traffic-tier load balancer in place, gradual rollout possible | Canary metric thresholds must be well-tuned or canary fails for harmless reasons |
| **manual** | Non-automatable env (one-off VMs, regulated infrastructure, non-Docker host) | The whole release becomes a runbook and the watch window phases are operator-driven; the release skill records but does not execute |
Recommend a default based on:
- Risk level inferred from change summary (any breaking change → bias toward canary or blue-green)
- Restrictions (e.g., regulatory rules forcing manual approval at each step)
- Environment capability (some envs may only support all-at-once)
**BLOCKING gate**: Choose A/B/C/D between strategies. Record the choice in the release report.
### Phase 3: Execute
**Goal**: Actually run the deploy. Capture exit code and full stdout/stderr.
1. Validate environment file (`.env`) exists, all required vars from `.env.example` are set, no placeholder secrets remain.
2. Source the env file and run `scripts/deploy.sh` against the target host. The script produced by `/deploy` Step 7 is the point of execution; do NOT bypass it. If a strategy-specific flag is needed (e.g., `--canary 5%`), pass it through.
3. Stream stdout/stderr to the release report, with timestamps, in a fenced code block under `## Phase 3: Execute`.
4. Capture exit code.
5. **AUTO-ROLLBACK trigger**: non-zero exit code → immediately invoke Phase 6 with verdict `Rolled-Back: deploy script failure`. Do NOT continue to Phase 4.
If `deploy.sh` emits no output for more than the configured idle threshold (default 5 minutes; check `deployment_procedures.md` for an explicit value), treat it as hung — capture a snapshot of what's running on the target, kill the script, and AUTO-ROLLBACK with reason `Deploy hung — manual investigation required`.
**Manual strategy**: if Phase 2 picked `manual`, write a checklist of operator steps from `deployment_procedures.md` to the release report and pause until the user types `done` or `failed`. Phase 3 then records the user's report verbatim.
### Phase 4: Smoke Test
**Goal**: Verify the new version is *actually serving traffic correctly* in the target environment.
1. Resolve the smoke-test command from `_docs/02_document/tests/blackbox-tests.md` § Production Smoke Tests, OR delegate to `/test-run` in `--prod-smoke` mode against the target environment.
2. The smoke-test set must (a) hit each public endpoint of each component, (b) include at least one read AND one write per public endpoint where applicable, and (c) complete in under 5 minutes total.
3. Capture pass/fail per case to the release report.
4. **AUTO-ROLLBACK trigger**: any smoke-test failure → invoke Phase 6 with verdict `Rolled-Back: smoke test failure: <test-name>`.
If smoke tests are **missing** for the target environment (no production-mode test set), STOP — write a leftover entry to `_docs/_process_leftovers/` per `tracker.mdc`, do not proceed to watch window without smoke coverage. Write `Aborted: smoke tests missing for prod-mode target` and ASK the user.
### Phase 5: Watch Window
**Goal**: Observe the live system for a defined window to catch latent regressions.
1. Read `observability.md` for the project's metrics, dashboards, and threshold definitions. Required watch metrics for any production target (per cursor-meta convention) include error rate, request rate, p99 latency, and saturation (CPU/memory/queue-depth).
2. Compute the watch-window duration from `deployment_procedures.md`. If unspecified, default to **15 minutes** for staging and **60 minutes** for production.
3. Poll the observability backend at 1-minute intervals (or the configured cadence). For each interval, record metric snapshots to the release report.
4. Threshold rules:
- **Hard breach** (auto-rollback): error-rate ≥ 2× baseline, p99 latency ≥ 3× baseline, any health-check failure persisting for 2 consecutive intervals.
- **Soft breach** (escalate): metric drift between 1.5× and 2× baseline, single-interval health blip, queue-depth steady but elevated.
- **No data** (escalate): if metrics are not flowing within the first 3 minutes, treat the absence as a hard breach — observability is itself broken.
5. **AUTO-ROLLBACK trigger**: hard breach at any interval. Move to Phase 6 with verdict `Rolled-Back: <metric> breached <multiplier>× baseline at T+<minutes>`.
6. **ESCALATE trigger**: soft breach. Pause polling, surface the metric, and ask the user A/B/C:
- A) Continue watch — accept current drift, keep polling
- B) Roll back now — treat soft drift as hard
- C) Extend watch window by N minutes
7. End of watch window with no breach → proceed to Phase 6.
The watch window cannot be skipped. If the user explicitly demands skipping (e.g., emergency rollforward), record the override reason in the release report and continue, but mark the verdict as `Released-with-override` — this triggers an automatic incident retrospective per `retrospective/SKILL.md`.
### Phase 6: Commit or Rollback
**Goal**: Finalize the release with a definitive verdict on disk.
**Path A — Commit (clean release)**:
1. Update tracker tickets: every ticket in scope moves to `Released` (or `Done`, per project convention defined in `tracker.mdc` / `_docs/_repo-config.yaml`).
2. Tag the git HEAD with `release/<version>` (or the project's tag convention from `deployment_procedures.md`).
3. Write the final `Released` verdict to the release report with a summary table.
4. Trigger `/retrospective --cycle-end` with this release as the cycle terminus.
5. Auto-chain to autodev's next step (Retrospective in greenfield, or feature-cycle loop start in existing-code).
**Path B — Rollback (auto-fired or user-elected)**:
1. Run `scripts/deploy.sh --rollback` with the rollback target captured in Phase 1.
2. Stream output to a new file `RELEASE_DIR/rollback_<version>_<env>_<YYYY-MM-DD-HHmm>.md` AND append a summary to the original release report under `## Rollback`.
3. Re-run Phase 4 (smoke test) and a 5-minute mini watch window against the rolled-back version. If THAT also fails, escalate immediately — the system is in an unknown state and needs human takeover.
4. Update tracker tickets back to `Ready for Release` (or the project's pre-release status).
5. Write the final `Rolled-Back` verdict with full reason chain.
6. Auto-trigger `/retrospective --incident` with this release as the incident anchor (per `retrospective/SKILL.md` incident mode).
7. Do NOT auto-chain to anything else — the user owns the next step.
**Path C — Aborted**:
Reached only via Phase 1 Choose B, Phase 4 smoke-tests-missing escalation, or any phase that detects a precondition violation. Write `Aborted: <reason>` to the release report. Do not auto-chain.
## Self-verification
- [ ] Release report exists at `RELEASE_DIR/release_<version>_<env>_<timestamp>.md` with verdict (Released / Rolled-Back / Aborted)
- [ ] Every phase that ran has a section in the release report with timestamps and tool output
- [ ] On Released: tracker tickets moved to release status; git tag pushed (if convention)
- [ ] On Rolled-Back: rollback report exists at `RELEASE_DIR/rollback_<version>_<env>_<timestamp>.md`; tracker tickets moved back to pre-release status; incident retrospective scheduled
- [ ] On Aborted: reason recorded; no live-system changes attempted; no tracker movement
- [ ] No phase was skipped without an explicit reason recorded in the release report
## Escalation Rules
| Situation | Action |
|-----------|--------|
| `scripts/deploy.sh` missing or `--rollback` unsupported | STOP — return to `/deploy` Step 7, do not patch the script in `/release` |
| Registry auth failure during pre-release | STOP — fix credentials at infra layer (per `coderule.mdc`); do not embed creds in the script |
| Smoke tests missing for prod target | STOP — write a leftover; do not improvise smoke tests in `/release` |
| Observability backend unreachable | STOP — observability blindness is itself a release blocker |
| User asks to skip the watch window | Record override, mark verdict `Released-with-override`, fire incident retro |
| Rollback also fails its smoke test | ESCALATE to user — system is in unknown state; do not loop deploys |
| Tracker MCP returns Unauthorized during ticket movement | Per `tracker.mdc`, write a leftover entry; do NOT silently continue without confirming the move |
| Multiple environments named in user request | STOP — one release per invocation; ask user to pick one |
| Production smoke test would touch real customer data | STOP — that is a `coderule.mdc` violation; ask user to define a smoke endpoint or test account |
## Common Mistakes
- **Skipping the watch window when "everything looks fine after deploy"** — a deploy that exited 0 is not a release that's stable. Watch is mandatory.
- **Faking smoke tests** to pass the gate when the prod test set is incomplete. STOP and surface the gap; do not embed prod URLs into ad-hoc curl commands.
- **Rolling forward through a failure** ("the next deploy will fix it"). Roll back first, fix the cause, then deploy a real fix.
- **Treating the release report as optional** when only an internal tool changed. Every release writes a report — the audit trail is the value, not the prose volume.
- **Approving manual gates yourself** without the user's input when restrictions require human approval. The release skill records, the human approves.
- **Reusing `release_<version>` filenames** across attempted releases. Always include the timestamp in the filename so re-attempts are visible side-by-side.
- **Letting tracker drift silently** between release attempts. If Phase 6 cannot move tickets, the release is not complete — write a leftover and stop.
## Project Mode vs Standalone
- **Project mode** (default): autodev invokes `/release` after `/deploy`. State writes occur under `_docs/_autodev_state.md`. Full integration with retrospective and feature-cycle loop.
- **Standalone mode**: `/release` invoked directly with `@<artifact>` (rare; usually only for re-running a rollback against a specific version). All outputs still go to `RELEASE_DIR/`.
## Methodology Quick Reference
```
┌────────────────────────────────────────────────────────────────┐
│ Release (6 phases, 3 verdicts) │
├────────────────────────────────────────────────────────────────┤
│ Phase 1 Pre-Release Gate │
│ AC + tests + change summary + rollback path │
│ [BLOCKING — user A/B/C] │
│ Phase 2 Strategy Select │
│ all-at-once · blue-green · canary · manual │
│ [BLOCKING — user picks] │
│ Phase 3 Execute │
│ scripts/deploy.sh, capture exit code + logs │
│ [AUTO-ROLLBACK on non-zero or hang] │
│ Phase 4 Smoke Test │
│ /test-run --prod-smoke against target │
│ [AUTO-ROLLBACK on any failure] │
│ Phase 5 Watch Window │
│ Poll observability for N minutes │
│ [AUTO-ROLLBACK on hard breach; escalate on soft] │
│ Phase 6 Commit or Rollback │
│ Released → tracker, tag, retrospective │
│ Rolled-Back → tracker reset, incident retrospective │
│ Aborted → no live-system change │
├────────────────────────────────────────────────────────────────┤
│ Principles: real execution · verifiable rollback · │
│ quiet failure = release failure · │
│ watch window mandatory │
└────────────────────────────────────────────────────────────────┘
```
@@ -0,0 +1,114 @@
# Release Report — {version} → {env}
- **Date**: {YYYY-MM-DD HH:MM} {timezone}
- **Operator**: {user}
- **Strategy**: {all-at-once | blue-green | canary | manual}
- **Verdict**: {Released | Released-with-override | Rolled-Back | Aborted}
- **Verdict reason**: {one-line summary}
## Pre-Release Gate (Phase 1)
### Acceptance Criteria
| AC ID | Status | Evidence |
|-------|--------|----------|
| AC-001 | Met / Unmet | path:section, test report, etc. |
### Test Status
| Suite | Pass | Fail | Skip | Source |
|-------|------|------|------|--------|
| Functional | N | N | N | _docs/03_implementation/{batch}.md |
| Performance | N | N | N | _docs/06_metrics/perf_*.md |
### Change Summary
| Component | Tickets | Type |
|-----------|---------|------|
| {component} | TKT-001, TKT-002 | feature / fix / breaking / security |
### Rollback Plan
- Previous version: `{previous-version}` (registry digest: `{sha}`)
- Rollback script: `scripts/deploy.sh --rollback`
- Rollback target verified pullable: yes / no
- Rollback target verified bootable in target env: yes / no
### Restrictions / Approvals
- Change-window restrictions: {none | description}
- Manual approvals required: {none | reference to approval file}
### Tracker State at Gate
- Tickets in scope: {N}
- Tickets blocking release: {0 — list any}
## Strategy Select (Phase 2)
- Recommended: {strategy} — reasoning
- Chosen: {strategy} — reasoning (if differs from recommended)
## Execute (Phase 3)
- Start: {timestamp}
- End: {timestamp}
- Exit code: {0 / non-zero}
```
<scripts/deploy.sh stdout/stderr stream, with timestamps>
```
## Smoke Test (Phase 4)
- Mode: {/test-run --prod-smoke | manual smoke set}
- Start: {timestamp}
- End: {timestamp}
| Test | Result | Notes |
|------|--------|-------|
| {name} | Pass / Fail | response time, status, etc. |
## Watch Window (Phase 5)
- Duration: {minutes}
- Cadence: {minutes per poll}
- Backend: {observability source — Prometheus, CloudWatch, Datadog, etc.}
| T+min | error_rate | rps | p99_latency | saturation | health | notes |
|-------|------------|-----|-------------|------------|--------|-------|
| 0 | … | … | … | … | OK | … |
| 1 | … | … | … | … | OK | … |
| … | … | … | … | … | … | … |
### Threshold breaches
- {None | "p99 latency 1.7× baseline at T+8 — soft breach, user accepted continuation"}
## Commit or Rollback (Phase 6)
### If Released
- Tracker tickets moved: {list}
- Git tag pushed: {tag} → {sha}
- Retrospective scheduled: yes — {/retrospective --cycle-end output path}
### If Rolled-Back
- Trigger: {auto / user-elected}
- Reason: {phase + one-line cause}
- Rollback start: {timestamp}
- Rollback end: {timestamp}
- Post-rollback smoke: pass / fail
- Tracker tickets moved back: {list}
- Incident retrospective scheduled: yes — {/retrospective --incident output path}
### If Aborted
- Phase that aborted: {1 / 2 / 3 / 4 / 5}
- Reason: {one-line cause}
- No live-system changes attempted: yes / no (if live changes, document under Phase 3 above and treat as Rolled-Back instead)
## Lessons (one-liners; full incident retro if Rolled-Back / Released-with-override)
- {Optional: short one-liner observations the operator wants the next /retrospective to consider}
+21
View File
@@ -30,6 +30,27 @@ Transform vague topics raised by users into high-quality, deliverable research r
- **Internet-first investigation** — do not rely on training data for factual claims; search the web extensively for every sub-question, rephrase queries when results are thin, and keep searching until you have converging evidence from multiple independent sources
- **Multi-perspective analysis** — examine every problem from at least 3 different viewpoints (e.g., end-user, implementer, business decision-maker, contrarian, domain expert, field practitioner); each perspective should generate its own search queries
- **Question multiplication** — for each sub-question, generate multiple reformulated search queries (synonyms, related terms, negations, "what can go wrong" variants, practitioner-focused variants) to maximize coverage and uncover blind spots
- **Component option breadth** — for every component area, build a broad option landscape before selecting. Search direct candidates, adjacent-domain alternatives, commercial/open-source variants, classical/simple baselines, current SOTA, and "do not use" failure cases. A component may not be narrowed to one candidate until alternatives have been searched and rejected with evidence.
- **Component research depth** — for every serious component candidate, go beyond discovery pages. Read official docs, repository/license files, issue discussions, benchmarks, deployment guides, version/platform requirements, security notes, maintenance signals, and real-world failure reports. Extract evidence for inputs/outputs, lifecycle assumptions, runtime/storage/latency fit, integration boundaries, licensing, operational risks, and unsupported scenarios before assigning any selection status.
- **Exact-fit component selection** — never select a component, tool, library, service, architecture pattern, or algorithm merely because it solves a similar class of problem. It must be proven compatible with the project's explicit operating context, constraints, required inputs/outputs, non-functional requirements, lifecycle assumptions, and acceptance criteria. If fit is unproven or mismatched, mark it `Rejected`, `Experimental only`, or escalate for user decision before it can shape the solution.
- **Per-mode API capability verification** *(applies only to technical-component selection — see Research Output Class below)* — when a candidate library/SDK/framework/service exposes multiple modes or configurations, *the candidate is not a single thing*. Pin the exact mode the project will use (one explicit sentence: inputs, outputs, runtime), and verify *that mode* against the project's required inputs/outputs via official docs (mandatory `context7` lookup) plus a saved Minimum Viable Example. Capability claims at the category level ("supports X, Y, Z modes") must be cross-checked against the literal mode enumeration before being treated as project-applicable. Two modes of one library are two distinct candidates for the purposes of the Component Applicability Gate. Does not apply to non-technical research (concept comparison, market/policy investigation, knowledge organization, etc.).
## Research Output Class (BLOCKING — set in Step 1)
Before applying any of the technical-component gates (per-mode API capability verification, Component Applicability Gate, Restrictions × Candidate-Mode sub-matrix, MVE evidence, mandatory `context7` lookup), classify the research output into one of two classes. Record the decision in `00_question_decomposition.md` once, near the top, so every downstream step honors it.
| Class | What the output recommends or selects | Examples | Technical-component gates apply? |
|-------|---------------------------------------|----------|----------------------------------|
| **Technical-component selection** | One or more libraries, SDKs, frameworks, services, protocols, data formats, infrastructure patterns, algorithms, or APIs that will be implemented or operated against | "Pick a vector database", "Compare auth-token strategies for our API", "Should we use Kafka or RabbitMQ?", architecture / tech-stack / migration drafts (Mode A, Mode B) | **Yes — all gates active** |
| **Non-technical investigation** | Concept comparisons, knowledge organization, root-cause investigation of an event, market/policy/regulatory/social analysis, literature review, decision support without committing to specific tooling | "Why did adoption stall in Q3?", "Compare phenomenology vs constructivism", "Map regulatory landscape for X", "What do practitioners say about onboarding under remote-first orgs?" | **No — skip API/MVE/sub-matrix gates; the rest of the 8-step engine still applies** |
How to decide:
1. Inspect the question and the input files (`problem.md`, `restrictions.md`, `acceptance_criteria.md`, or the standalone input file).
2. If the deliverable will name specific software/services/protocols that someone will then build with or operate, it is **Technical-component selection**.
3. If the deliverable is a report, comparison, or recommendation that does not commit to specific tooling, it is **Non-technical investigation**.
4. **Mixed runs are valid.** Some research questions have a non-technical core but include one technical sub-question (or vice versa). In that case classify per component area within the run, not the run as a whole, and note in `00_question_decomposition.md` which component areas trigger the technical-component gates.
When the run is purely **Non-technical investigation**, the rest of the research engine — question decomposition, perspective rotation, exhaustive web search, fact extraction, comparison framework, reasoning chain, validation, deliverable formatting — still applies in full. The sections that get skipped are explicitly the technical gates listed in the table above.
## Context Resolution
@@ -27,13 +27,26 @@
- [ ] Iterative deepening completed: follow-up questions from initial findings were searched
- [ ] No sub-question relies solely on training data without web verification
## Component Option Breadth
- [ ] `00_question_decomposition.md` contains a Component Option Search Plan
- [ ] Every component area was searched across simple baseline, established production, open-source, commercial/vendor, current SOTA, adjacent-domain, no-build/defer, and known-bad options where applicable
- [ ] Every component area has at least 3 realistic candidates, or a documented explanation of why broad searches found fewer
- [ ] Each lead candidate has official/source-of-truth evidence plus independent validation when available
- [ ] Each component area includes at least one baseline/fallback option and at least one rejected or experimental option when possible
- [ ] Alternative names, synonyms, and neighboring-domain terms were searched before declaring the option landscape complete
- [ ] Licensing, runtime, platform, maintenance, and unsupported-scenario searches were performed for every lead, fallback, and rejected candidate
## Mode A Specific
- [ ] Phase 1 completed: AC assessment was presented to and confirmed by user
- [ ] AC assessment consistent: Solution draft respects the (possibly adjusted) acceptance criteria and restrictions
- [ ] Competitor analysis included: Existing solutions were researched
- [ ] All components have comparison tables: Each component lists alternatives with tools, advantages, limitations, security, cost
- [ ] Component options are broad: component tables include baseline, production, open-source, commercial/vendor, SOTA/research, adjacent-domain, defer/no-build, and disqualified options where applicable
- [ ] Tools/libraries verified: Suggested tools actually exist and work as described
- [ ] Component fit matrix completed: `06_component_fit_matrix.md` (or `06_component_fit_matrix/` if split) exists and every selected component/tool/pattern is marked `Selected`
- [ ] No field-adjacent substitution: no selected candidate is chosen only because it solves a similar class of problem while failing the project's explicit constraints
- [ ] Testing strategy covers AC: Tests map to acceptance criteria
- [ ] Tech stack documented (if Phase 3 ran): `tech_stack.md` has evaluation tables, risk assessment, and learning requirements
- [ ] Security analysis documented (if Phase 4 ran): `security_analysis.md` has threat model and per-component controls
@@ -45,6 +58,9 @@
- [ ] New draft is self-contained: Written as if from scratch, no "updated" markers
- [ ] Performance column included: Mode B comparison tables include performance characteristics
- [ ] Previous draft issues addressed: Every finding in the table is resolved in the new draft
- [ ] Existing selected components were challenged against a broad alternative landscape before being kept
- [ ] Existing component fit audited: every old and new component/tool/pattern was checked against `restrictions.md`, `acceptance_criteria.md`, and the Project Constraint Matrix
- [ ] Rejected/experimental candidates are not lead recommendations unless the user explicitly accepted the risk
## Timeliness Check (High-Sensitivity Domain BLOCKING)
@@ -64,7 +80,7 @@ When the research topic has Critical or High sensitivity level:
## Target Audience Consistency Check (BLOCKING)
- [ ] Research boundary clearly defined: `00_question_decomposition.md` has clear population/geography/timeframe/level boundaries
- [ ] Every source has target audience annotated in `01_source_registry.md`
- [ ] Every source has target audience annotated in `01_source_registry.md` (or category files under `01_source_registry/` if split)
- [ ] Mismatched sources properly handled (excluded, annotated, or marked reference-only)
- [ ] No audience confusion in fact cards: Every fact has target audience consistent with research boundary
- [ ] No audience confusion in the report: Policies/research/data cited have consistent target audiences
@@ -76,3 +92,33 @@ When the research topic has Critical or High sensitivity level:
- [ ] Cited facts have corresponding statements in the original text (no over-interpretation)
- [ ] Source publication/update dates annotated; technical docs include version numbers
- [ ] Unverifiable information annotated `[limited source]` and not sole support for core conclusions
## Exact-Fit Validation (BLOCKING)
- [ ] Project Constraint Matrix extracted from problem context before component selection
- [ ] Component fit matrix includes `Component Area`, `Option Family`, and `Pinned Mode/Config` columns
- [ ] Every selected component/tool/library/service/pattern/algorithm has evidence for required inputs/outputs and integration boundaries
- [ ] Every selected candidate has evidence for the operating context and lifecycle assumptions it must support
- [ ] Every selected candidate has evidence for non-functional targets that are binding for the project
- [ ] Known unsupported scenarios and failure reports were searched for every selected candidate
- [ ] Mismatches are recorded as disqualifiers, not softened into generic limitations
- [ ] Any candidate with unproven fit is marked `Experimental only` or escalated for user decision
- [ ] Any candidate with documented constraint conflict is marked `Rejected`
## API Capability Verification (BLOCKING)
**Applicability**: this checklist applies only when the run is classified as **Technical-component selection** (see SKILL.md → Research Output Class). For non-technical research (concept comparison, market/policy investigation, root-cause analysis, knowledge organization), skip this checklist entirely and note the skip in `05_validation_log.md`. For mixed runs, apply only to technical component areas.
For every lead candidate that is a library/SDK/framework/service:
- [ ] The exact mode/configuration the project will use is pinned in one explicit sentence (inputs, outputs, runtime); no vague "supports X" language
- [ ] `context7` (or equivalent docs lookup) was run for the candidate, with at least 3 queries: mode enumeration, project's exact mode, disqualifier probe
- [ ] All consulted URLs from context7 / official docs are appended to `01_source_registry.md` (or files under `01_source_registry/` if split)
- [ ] A Minimum Viable Example (MVE) was saved for the pinned mode in `02_fact_cards.md` / `02_fact_cards/` (or `02_mve_evidence.md`) with: source, inputs in example, outputs in example, project inputs, project outputs required, match assessment ✅/⚠️/❌
- [ ] When the MVE inputs or outputs do not exactly match the project's, the mismatch is cited from the official docs (not inferred), and the candidate is `Experimental only` or `Rejected`
- [ ] When a library has multiple modes, each project-relevant mode appears as its own candidate row (not a single library row that softens across modes)
- [ ] Restrictions × Candidate-Modes sub-matrix in `06_component_fit_matrix.md` (or files under `06_component_fit_matrix/` if split) is filled for every lead candidate, with one row per numbered restriction and per numbered acceptance criterion
- [ ] Sub-matrix uses ✅ / ❌ / ❓ / N/A only — no free-form prose substitutes
- [ ] No `Selected` candidate has any ❌ or ❓ cell in its sub-matrix
- [ ] "Validation gate required" footnotes are explicitly classified as either *API capability* (must be resolved here) or *runtime quality* (may be carried forward)
- [ ] Paraphrased capability claims in fact cards have been cross-checked against the literal mode-enumeration evidence (no `mono, inertial → mono-inertial` style conflation)
@@ -89,7 +89,7 @@ Value Translation:
## Source Registry Entry Template
For each source consulted, immediately append to `01_source_registry.md`:
For each source consulted, immediately append to `01_source_registry.md` (or the appropriate category file under `01_source_registry/` if the artifact has been split — see splittable-artifacts convention in `steps/00_project-integration.md`):
```markdown
## Source #[number]
- **Title**: [source title]
@@ -57,22 +57,49 @@ RESEARCH_DIR/
├── 03_comparison_framework.md # Step 4 output: selected framework and populated data
├── 04_reasoning_chain.md # Step 6 output: fact → conclusion reasoning
├── 05_validation_log.md # Step 7 output: use-case validation results
├── 06_component_fit_matrix.md # Step 7.5 output: component exact-fit gate
└── raw/ # Raw source archive (optional)
├── source_1.md
└── source_2.md
```
#### Splittable artifacts — Layout convention
The following three artifacts MAY equivalently be a **folder** of the same base name when the single-file form has grown unwieldy (typically ≳ 1000 lines or ≳ 200 KB):
- `01_source_registry.md``01_source_registry/`
- `02_fact_cards.md``02_fact_cards/`
- `06_component_fit_matrix.md``06_component_fit_matrix/`
When using the folder form:
- Place a `00_summary.md` index file at the folder root with a short common summary table and the cross-cutting status the single-file form would have carried in its preamble.
- Split per-entry content into category files (e.g. one file per sub-question or per component): `SQ1_*.md`, `C1_*.md`, etc. Keep entry numbering global across the folder so cross-references like "Source #42" still resolve to exactly one place.
- Cross-references from outside the folder may point at either `01_source_registry/00_summary.md` (for the index) or directly at the relevant category file.
```
RESEARCH_DIR/01_source_registry/ # split form (when single-file is too large)
├── 00_summary.md # index + investigation status + compact source table
├── SQ1_existing_systems.md # category file
├── SQ2_canonical_pipeline.md # category file
├── C1_vio.md # per-component file
└── ...
```
Throughout the rest of this skill (other steps, references, templates), the singular `XX.md` form is used as a logical name; treat each occurrence as applying equally to the folder form when the artifact has been split.
### Save Timing & Content
| Step | Save immediately after completion | Filename |
|------|-----------------------------------|----------|
| Mode A Phase 1 | AC & restrictions assessment tables | `00_ac_assessment.md` |
| Step 0-1 | Question type classification + sub-question list | `00_question_decomposition.md` |
| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` |
| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` |
| Step 2 | Each consulted source link, tier, summary | `01_source_registry.md` *(splittable, see convention)* |
| Step 3 | Each fact card (statement + source + confidence) | `02_fact_cards.md` *(splittable, see convention)* |
| Step 4 | Selected comparison framework + initial population | `03_comparison_framework.md` |
| Step 6 | Reasoning process for each dimension | `04_reasoning_chain.md` |
| Step 7 | Validation scenarios + results + review checklist | `05_validation_log.md` |
| Step 7.5 | Component exact-fit gate and selection status | `06_component_fit_matrix.md` *(splittable, see convention)* |
| Step 8 | Complete solution draft | `OUTPUT_DIR/solution_draft##.md` |
### Save Principles
@@ -90,11 +117,12 @@ RESEARCH_DIR/
|------|---------|----------------|
| `00_ac_assessment.md` | AC & restrictions assessment (Mode A only) | After Phase 1 completion |
| `00_question_decomposition.md` | Question type, sub-question list | After Step 0-1 completion |
| `01_source_registry.md` | All source links and summaries | Continuously updated during Step 2 |
| `02_fact_cards.md` | Extracted facts and sources | Continuously updated during Step 3 |
| `01_source_registry.md` *(splittable)* | All source links and summaries | Continuously updated during Step 2 |
| `02_fact_cards.md` *(splittable)* | Extracted facts and sources | Continuously updated during Step 3 |
| `03_comparison_framework.md` | Selected framework and populated data | After Step 4 completion |
| `04_reasoning_chain.md` | Fact → conclusion reasoning | After Step 6 completion |
| `05_validation_log.md` | Use-case validation and review | After Step 7 completion |
| `06_component_fit_matrix.md` *(splittable)* | Exact-fit matrix for every proposed component/tool/pattern with status `Selected` / `Rejected` / `Experimental only` / `Needs user decision` | Before Step 8 deliverable formatting |
| `OUTPUT_DIR/solution_draft##.md` | Complete solution draft | After Step 8 completion |
| `OUTPUT_DIR/tech_stack.md` | Tech stack evaluation and decisions | After Phase 3 (optional) |
| `OUTPUT_DIR/security_analysis.md` | Threat model and security controls | After Phase 4 (optional) |
@@ -6,7 +6,9 @@ Triggered when no `solution_draft*.md` files exist in OUTPUT_DIR, or when the us
**Role**: Professional software architect
A focused preliminary research pass **before** the main solution research. The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them.
> **AC must be design-independent**: describe testable outcomes only — no libraries, algorithms, params, or design choices. Implementation follows AC, never reverse. (IEEE 830 / Atlassian / GitScrum)
A focused preliminary research pass **before** the main solution research. The goal is to validate that the acceptance criteria and restrictions are realistic before designing a solution around them. Any revision proposed in this phase must respect the design-independence rule above — propose AC changes as outcome/budget edits, not as implementation prescriptions.
**Input**: All files from INPUT_DIR (or INPUT_FILE in standalone mode)
@@ -73,16 +75,18 @@ Full 8-step research methodology. Produces the first solution draft.
**Task** (drives the 8-step engine):
1. Research existing/competitor solutions for similar problems — search broadly across industries and adjacent domains, not just the obvious competitors
2. Research the problem thoroughly — all possible ways to solve it, split into components; search for how different fields approach analogous problems
3. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
4. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
5. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
6. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
7. Include security considerations in each component analysis
8. Provide rough cost estimates for proposed solutions
3. Derive a **Project Constraint Matrix** before evaluating component options. Extract exact constraints from `problem.md`, `restrictions.md`, `acceptance_criteria.md`, input data notes, and the Phase 1 AC assessment. Include required inputs/outputs, operating context, runtime envelope, data availability, lifecycle boundaries, non-functional targets, integration boundaries, security constraints, and explicit out-of-scope decisions.
4. For each component, research all possible solutions and find the most efficient state-of-the-art approaches — use multiple query variants and perspectives from Step 1
5. For each promising approach, search for real-world deployment experience: success stories, failure reports, lessons learned, and practitioner opinions
6. Search for contrarian viewpoints — who argues against the common approaches and why? What failure modes exist?
7. Verify that suggested tools/libraries actually exist and work as described — check official repos, latest releases, and community health (stars, recent commits, open issues)
8. For every candidate component/tool/library/service/pattern/algorithm, prove exact fit against the Project Constraint Matrix. A field-adjacent solution is not selectable unless its documented implementation assumptions match the project's constraints. Mismatches must be recorded as disqualifiers and the candidate marked `Rejected`, `Experimental only`, or `Needs user decision`.
9. Include security considerations in each component analysis
10. Provide rough cost estimates for proposed solutions
Be concise in formulating. The fewer words, the better, but do not miss any important details.
**Save action**: Write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
**Save action**: Write `RESEARCH_DIR/06_component_fit_matrix.md` (or its split-folder equivalent under `RESEARCH_DIR/06_component_fit_matrix/`, per the splittable-artifacts convention in `00_project-integration.md`) before the final draft, then write `OUTPUT_DIR/solution_draft##.md` using template: `templates/solution_draft_mode_a.md`
---
@@ -10,18 +10,25 @@ Full 8-step research methodology applied to assessing and improving an existing
**Task** (drives the 8-step engine):
1. Read the existing solution draft thoroughly
2. Research in internet extensively — for each component/decision in the draft, search for:
2. Derive or refresh the **Project Constraint Matrix** from all files in INPUT_DIR. Include required inputs/outputs, operating context, runtime envelope, data availability, lifecycle boundaries, non-functional targets, integration boundaries, security constraints, and explicit out-of-scope decisions.
3. Audit every component/decision in the existing draft against the Project Constraint Matrix before researching alternatives:
- If a component's documented implementation assumptions match the project constraints, keep it eligible and record evidence.
- If fit is unproven, mark it `Experimental only` until evidence is found.
- If constraints conflict, mark it `Rejected` and search for alternatives.
- If rejecting it changes product behavior or risk materially, escalate for user decision.
4. Research in internet extensively — for each component/decision in the draft, search for:
- Known problems and limitations of the chosen approach
- What practitioners say about using it in production
- Better alternatives that may have emerged recently
- Common failure modes and edge cases
- How competitors/similar projects solve the same problem differently
3. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
4. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
5. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
6. For each identified weak point, search for multiple solution approaches and compare them
7. Based on findings, form a new solution draft in the same format
5. Search specifically for contrarian views: "why not [chosen approach]", "[chosen approach] criticism", "[chosen approach] failure"
6. Identify security weak points and vulnerabilities — search for CVEs, security advisories, and known attack vectors for each technology in the draft
7. Identify performance bottlenecks — search for benchmarks, load test results, and scalability reports
8. For each identified weak point, search for multiple solution approaches and compare them
9. For every revised candidate, prove exact fit against the Project Constraint Matrix. Do not select field-adjacent or "similar problem" options unless their intrinsic implementation constraints match the project.
10. Based on findings, form a new solution draft in the same format
**Save action**: Write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
**Save action**: Write `RESEARCH_DIR/06_component_fit_matrix.md` (or its split-folder equivalent under `RESEARCH_DIR/06_component_fit_matrix/`, per the splittable-artifacts convention in `00_project-integration.md`) before the final draft, then write `OUTPUT_DIR/solution_draft##.md` (incremented) using template: `templates/solution_draft_mode_b.md`
**Optional follow-up**: After Mode B completes, the user can request Phase 3 (Tech Stack Consolidation) or Phase 4 (Security Deep Dive) using the revised draft. These phases work identically to their Mode A descriptions in `steps/01_mode-a-initial-research.md`.
@@ -40,6 +40,7 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
- "What existing/competitor solutions address this problem?"
- "What are the component parts of this problem?"
- "For each component, what are the state-of-the-art solutions?"
- "For each component, what are the practical alternatives across simple baseline, established production option, open-source option, commercial option, current SOTA, adjacent-domain option, and no-build/defer option?"
- "What are the security considerations per component?"
- "What are the cost implications of each approach?"
@@ -48,6 +49,7 @@ Key principle: Critical-sensitivity topics (AI/LLMs, blockchain) require sources
- "What are the security vulnerabilities in the proposed architecture?"
- "Where are the performance bottlenecks?"
- "What solutions exist for each identified issue?"
- "For each component already selected in the draft, what alternatives should be considered before keeping, replacing, or rejecting it?"
**General sub-question patterns** (use when applicable):
- **Sub-question A**: "What is X and how does it work?" (Definition & mechanism)
@@ -84,6 +86,27 @@ For **each sub-question**, generate **at least 3-5 search query variants** befor
Record all planned queries in `00_question_decomposition.md` alongside each sub-question.
#### Component Option Breadth (MANDATORY)
Before Step 2, identify the component areas implied by the problem and create a search plan for options in each area. A component area is any replaceable tool, library, model, service, algorithm, data format, protocol, infrastructure pattern, or validation approach that could materially affect the solution.
For every component area, generate search queries for these option families unless clearly not applicable:
- **Simple baseline**: low-complexity classical or manual approach that can serve as a fallback or regression baseline.
- **Established production option**: mature library/service/pattern with field usage.
- **Open-source candidate**: permissive-license option with inspectable implementation and community history.
- **Commercial/vendor option**: paid or vendor-supported option, including SDK/platform constraints.
- **Current SOTA / research option**: recent model, paper, or benchmark leader that may be promising but immature.
- **Adjacent-domain option**: solution from a neighboring domain with similar constraints.
- **No-build / defer option**: whether the component can be avoided, simplified, or moved out of scope.
- **Known bad option**: candidate or family that appears attractive but has documented failure modes or disqualifiers.
For each component area, record:
- Candidate names and option families to search.
- At least 5 query variants covering alternatives, comparisons, limitations, licensing, runtime/scale, and exact project constraints.
- The minimum evidence needed to mark a candidate `Selected`, `Rejected`, `Experimental only`, or `Needs user decision`.
Add this as a "Component Option Search Plan" section in `00_question_decomposition.md`.
**Research Subject Boundary Definition (BLOCKING - must be explicit)**:
When decomposing questions, you must explicitly define the **boundaries of the research subject**:
@@ -94,6 +117,9 @@ When decomposing questions, you must explicitly define the **boundaries of the r
| **Geography** | Which region is being studied? | Chinese universities vs US universities vs global |
| **Timeframe** | Which period is being studied? | Post-2020 vs full historical picture |
| **Level** | Which level is being studied? | Undergraduate vs graduate vs vocational |
| **Operating context** | What exact environment, lifecycle phase, and runtime conditions must the solution support? | In-flight embedded runtime vs offline post-processing; production web traffic vs admin batch job |
| **Required interfaces** | What inputs, outputs, protocols, data shapes, and ownership boundaries are fixed? | One camera vs stereo rig; REST API vs message queue; local file boundary vs service API |
| **Non-functional envelope** | What latency, throughput, storage, memory, availability, safety, security, cost, and maintainability targets are binding? | <400 ms p95, 8 GB RAM, 99.9% availability, reversible migrations |
**Common mistake**: User asks about "university classroom issues" but sources include policies targeting "K-12 students" — mismatched target populations will invalidate the entire research.
@@ -116,9 +142,11 @@ Record the audit result in `00_question_decomposition.md` as a "Completeness Aud
- Summary of relevant problem context from INPUT_DIR
- Classified question type and rationale
- **Research subject boundary definition** (population, geography, timeframe, level)
- **Project Constraint Matrix summary** (operating context, required interfaces, non-functional envelope, lifecycle assumptions, and hard disqualifiers extracted from input files)
- List of decomposed sub-questions
- **Chosen perspectives** (at least 3 from the Perspective Rotation table) with rationale
- **Search query variants** for each sub-question (at least 3-5 per sub-question)
- **Component Option Search Plan** (component areas, option families, candidate names, query variants, required evidence)
- **Completeness audit** (taxonomy cross-reference + domain discovery results)
4. Write TodoWrite to track progress
@@ -132,7 +160,7 @@ Tier sources by authority, **prioritize primary sources** (L1 > L2 > L3 > L4). C
**Tool Usage**:
- Use `WebSearch` for broad searches; `WebFetch` to read specific pages
- Use the `context7` MCP server (`resolve-library-id` then `get-library-docs`) for up-to-date library/framework documentation
- Use the `context7` MCP server (`resolve-library-id` then `query-docs` / `get-library-docs`) for up-to-date library/framework documentation. **Mandatory per lead candidate** — see "API Capability Verification" below.
- Always cross-verify training data claims against live sources for facts that may have changed (versions, APIs, deprecations, security advisories)
- When citing web sources, include the URL and date accessed
@@ -145,17 +173,77 @@ Do not stop at the first few results. The goal is to build a comprehensive evide
- Consult at least **2 different source tiers** per sub-question (e.g., L1 official docs + L4 community discussion)
- If initial searches yield fewer than 3 relevant sources for a sub-question, **broaden the search** with alternative terms, related domains, or analogous problems
**Minimum search effort per component area**:
- Search every option family from the "Component Option Search Plan" before choosing a lead candidate.
- For each lead, fallback, or rejected candidate, search at least one official/source-of-truth page and at least one independent validation source when available.
- Search `"[component] alternatives"`, `"[candidate] vs [alternative]"`, `"[candidate] limitations"`, `"[candidate] license"`, `"[candidate] production"`, and `"[candidate] [binding project constraint]"`.
- If fewer than 3 realistic candidates are found for a component area, explicitly document why the landscape is narrow and search adjacent domains before accepting that result.
- Include at least one simple baseline and one "do not use" or disqualified candidate per component area when possible; these prevent false confidence in the selected option.
**Candidate implementation-limit searches (MANDATORY)**:
For every component/tool/library/service/pattern/algorithm that may be selected or recommended, search for its intrinsic implementation constraints. Do not rely on product category labels, marketing summaries, or examples from a different operating context. Include query variants for:
- Official supported inputs/outputs, protocols, data formats, and deployment modes
- Required hardware/runtime/platform/version constraints
- Timing, throughput, memory, storage, synchronization, and scaling assumptions
- Lifecycle assumptions: offline vs online, batch vs real time, development vs production, single tenant vs multi tenant, local vs networked
- Known unsupported scenarios, limitations, issue reports, production failures, and workarounds
- Licensing, security, maintenance, and community-health constraints
- Exact phrases from the project's restrictions and acceptance criteria combined with the candidate name
**API Capability Verification — Per-Mode (MANDATORY, BLOCKING for lead candidates)**:
**Applicability**: this section applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section, and only to lead candidates that are libraries/SDKs/frameworks/services/protocols/data formats with multiple modes or configurations. For non-technical research (concept comparison, market/policy investigation, knowledge organization, root-cause analysis without tooling commitments), skip this entire sub-section and continue with the rest of Step 2 — the broader candidate implementation-limit search above is sufficient. State the skip explicitly once in `02_fact_cards.md` (or in `02_fact_cards/00_summary.md` if split): `API Capability Verification: not applicable — this run is a Non-technical investigation, no library/SDK/service candidates`.
Most libraries/SDKs/services expose **multiple modes or configurations** (e.g., monocular vs stereo VO, sync vs async API, batch vs streaming inference, write-through vs write-behind cache). Selecting a candidate "because it supports X" without pinning *which mode* the project will use, and *whether that exact mode produces the required outputs from the required inputs*, is the most common silent-failure path in research. A library can support a class of problem in mode A while being unusable for the project's specific configuration in mode B.
For every lead candidate that is a library/SDK/framework/service with multiple modes or configurations, do the following — in this order, before marking the candidate `Selected`:
1. **Pin the exact mode/configuration the project will use.**
Derived from the Project Constraint Matrix: which inputs are available (sensor count, sensor types, data shapes, rates), which outputs are required (per `acceptance_criteria.md` and contract files), which hardware/runtime is fixed (per `restrictions.md`). Write this as a single sentence: "We will use `<library>` in `<mode/config>` with inputs `<list>` and expect outputs `<list>` on `<runtime>`." Do not progress past this step on a vague mode description.
2. **Run `context7` (or equivalent docs lookup) for the candidate** — this is **mandatory for every lead library/SDK/framework candidate**, not optional. Minimum three queries per candidate:
1. *Mode enumeration*: "What modes/configurations does `<library>` support? List every value of the mode/config enum and what each requires as input."
2. *Project's exact mode*: "Show a minimum runnable example of `<library>` in `<the pinned mode>` with `<the project's input shape>`. What does it produce?"
3. *Disqualifier probe*: "Does `<library>` `<the pinned mode>` produce `<the required output>`? Are there published limitations of `<the pinned mode>` for `<the project's runtime/hardware>`?"
For services without context7 coverage, use official docs site + WebFetch on the API reference page + the project's example/tutorial directory in the source repo. Append every consulted URL to `01_source_registry.md` (or the appropriate category file under `01_source_registry/` if split — see splittable-artifacts convention in `00_project-integration.md`).
3. **Save a Minimum Viable Example (MVE) for the pinned mode.**
Append to `02_fact_cards.md` / `02_fact_cards/` (or a sibling `02_mve_evidence.md`) at least one block per lead library candidate with:
```markdown
## MVE — <library> in <pinned mode>
- **Source**: <official URL or context7 reference, with date>
- **Inputs in the example**: <e.g., 2 calibrated cameras + IMU at 200 Hz>
- **Outputs in the example**: <e.g., 6-DoF pose with covariance>
- **Project inputs**: <e.g., 1 camera + IMU at 200 Hz>
- **Project outputs required**: <e.g., 6-DoF pose with metric translation>
- **Match assessment**: ✅ exact match / ⚠️ partial (specify dimension) / ❌ mismatch (specify dimension)
- **If ⚠️ or ❌**: cite the official-docs sentence that establishes the mismatch.
```
If no official example covers the project's exact configuration → the candidate cannot be marked `Selected` based on category fit alone. Status must be `Experimental only` (with required-evidence note) or `Rejected` (when the docs explicitly disqualify the configuration).
4. **Bind every numbered Restriction and Acceptance Criterion to the candidate's pinned mode.**
For each numbered line in `restrictions.md` and `acceptance_criteria.md`, decide one of: `Pass` (the pinned mode satisfies it with cited evidence), `Fail` (the pinned mode contradicts it with cited evidence), `Verify` (no evidence either way; deeper investigation required), `N/A` (the line is irrelevant to this component area). Record this in `02_fact_cards.md` (or the candidate's per-component file under `02_fact_cards/` if split) under the candidate's MVE block. The structural matrix in Step 7.5 reads from these bindings.
5. **Treat "the same library in a different mode" as a different candidate.**
If the project's pinned mode is `Monocular` but the only documented evidence covers `Stereo`, do not silently soften "rotation only" into "rotation + translation". Open a separate candidate row for the Monocular mode, with its own MVE, fit assessment, and disqualifiers. Two modes of one library are two distinct candidates for the purposes of this gate.
**Common silent-failure pattern this guards against**: a fact card paraphrases the docs as "supports A, B, C, D modes" when the docs actually mean "supports A; B; C and D as separate orthogonal modes". A category-level "Selected" decision then carries through every downstream artifact, masking that the project's required A+B combination does not exist as a single mode.
**Search broadening strategies** (use when results are thin):
- Try adjacent fields: if researching "drone indoor navigation", also search "robot indoor navigation", "warehouse AGV navigation"
- Try different communities: academic papers, industry whitepapers, military/defense publications, hobbyist forums
- Try different geographies: search in English + search for European/Asian approaches if relevant
- Try historical evolution: "history of X", "evolution of X approaches", "X state of the art 2024 2025"
- Try failure analysis: "X project failure", "X post-mortem", "X recall", "X incident report"
- Try disqualifier probes: "X unsupported", "X limitations", "X requirements", "X with [project constraint]", "X without [required input]", "X real-time [target]", "X production failure"
**Search saturation rule**: Continue searching until new queries stop producing substantially new information. If the last 3 searches only repeat previously found facts, the sub-question is saturated.
**Save action**:
For each source consulted, **immediately** append to `01_source_registry.md` using the entry template from `references/source-tiering.md`.
For each source consulted, **immediately** append to `01_source_registry.md` (or the appropriate category file under `01_source_registry/` if split) using the entry template from `references/source-tiering.md`.
---
@@ -185,7 +273,7 @@ Transform sources into **verifiable fact cards**:
- ❓ Low: Inference or from unofficial sources
**Save action**:
For each extracted fact, **immediately** append to `02_fact_cards.md`:
For each extracted fact, **immediately** append to `02_fact_cards.md` (or the appropriate category file under `02_fact_cards/` if split):
```markdown
## Fact #[number]
- **Statement**: [specific fact description]
@@ -194,6 +282,7 @@ For each extracted fact, **immediately** append to `02_fact_cards.md`:
- **Target Audience**: [which group this fact applies to, inherited from source or further refined]
- **Confidence**: ✅/⚠️/❓
- **Related Dimension**: [corresponding comparison dimension]
- **Fit Impact**: [supports selection / disqualifies / makes experimental / needs user decision]
```
**Target audience in fact statements**:
@@ -229,7 +318,7 @@ After initial fact extraction, review what you have found and identify **knowled
- Failure cases and edge conditions
- Recent developments that may change the picture
4. **Update artifacts**: Append new sources to `01_source_registry.md`, new facts to `02_fact_cards.md`
4. **Update artifacts**: Append new sources to `01_source_registry.md`, new facts to `02_fact_cards.md` (use the appropriate category files under `01_source_registry/` and `02_fact_cards/` if split)
**Exit criteria**: Proceed to Step 4 when:
- Every sub-question has at least 3 facts with at least one from L1/L2
@@ -24,6 +24,18 @@ Write to `03_comparison_framework.md`:
| ... | | | |
```
**Required exact-fit dimensions for component/tool decisions**:
When the output selects or recommends a component, tool, library, service, architecture pattern, or algorithm, the framework MUST include these dimensions unless explicitly not applicable:
- Option family (`Simple baseline`, `Established production`, `Open-source`, `Commercial/vendor`, `Current SOTA`, `Adjacent-domain`, `No-build/defer`, `Known bad`)
- Required inputs/outputs and ownership boundaries
- Operating context and lifecycle fit
- Non-functional envelope fit
- Implementation assumptions and hard disqualifiers
- Evidence quality and source tier
- Selection status (`Selected`, `Rejected`, `Experimental only`, `Needs user decision`)
For each component area, include multiple candidates in the initial population. Do not present only the preferred option unless the investigation found no realistic alternatives; if so, state the searches that proved the narrow landscape.
---
### Step 5: Reference Point Baseline Alignment
@@ -97,6 +109,8 @@ Validate conclusions against a typical scenario:
- [ ] Are there any important dimensions missed?
- [ ] Is there any over-extrapolation?
- [ ] Are conclusions actionable/verifiable?
- [ ] Does every selected component/tool/pattern match the Project Constraint Matrix?
- [ ] Are mismatches marked as disqualifiers instead of hidden as generic "limitations"?
**Save action**:
Write to `05_validation_log.md`:
@@ -128,6 +142,66 @@ If using Y: [expected behavior]
---
### Step 7.5: Component Applicability Gate (BLOCKING)
**Applicability**: this gate applies only when the run is classified as **Technical-component selection** in the SKILL's Research Output Class section. For non-technical research (concept comparison, market/policy investigation, root-cause analysis without tooling, knowledge organization), skip this entire step and proceed to Step 8 — there are no components to gate. State the skip once in `05_validation_log.md`: `Step 7.5 (Component Applicability Gate): not applicable — Non-technical investigation`. For mixed runs (some component areas technical, some not), apply this gate only to the technical component areas; the non-technical ones do not produce 7.5 rows.
Before finalizing the solution draft, build an exact-fit matrix for every component/tool/library/service/pattern/algorithm that is selected, recommended, rejected, or treated as a fallback. Free-form prose in a "Project Constraints Checked" column is **not sufficient** — mismatches hide inside rationale text. The matrix must be structured per restriction and per acceptance criterion.
#### 7.5.1 Top-level Component Fit Matrix
```markdown
# Component Fit Matrix
| Component Area | Candidate | Pinned Mode/Config | Option Family | Intended Role | API Capability Evidence | Mismatches / Disqualifiers | Status | Decision Rationale |
|----------------|-----------|--------------------|---------------|---------------|-------------------------|----------------------------|--------|--------------------|
| [area] | [name] | [exact mode/config the project will use, copied verbatim from the MVE block in Step 2] | [family] | [role] | MVE: [link to MVE block in `02_fact_cards.md` / `02_fact_cards/` or `02_mve_evidence.md`]; docs: [Source #] | [none / list] | Selected / Rejected / Experimental only / Needs user decision | [why] |
```
The new **Pinned Mode/Config** column is mandatory. A row without a pinned mode is incomplete. The new **API Capability Evidence** column links to the Minimum Viable Example saved during Step 2's API Capability Verification — without an MVE link the candidate cannot be `Selected`.
#### 7.5.2 Restrictions × Candidate-Modes Sub-Matrix (MANDATORY)
For each lead candidate row in the top-level matrix, append a structured cross-check that walks every numbered line of `restrictions.md` and `acceptance_criteria.md` against the candidate's **pinned mode/config**.
```markdown
## Sub-Matrix — <Candidate Name> in <Pinned Mode>
| Restriction / AC | Candidate-mode behavior | Result | Evidence |
|------------------|-------------------------|--------|----------|
| R1: <verbatim line from restrictions.md> | <how the pinned mode behaves under this restriction> | ✅ Pass / ❌ Fail / ❓ Verify / N/A | [Fact # / Source # / MVE link] |
| R2: ... | ... | ... | ... |
| ... | ... | ... | ... |
| AC-1.1: <verbatim line from acceptance_criteria.md> | <how the pinned mode satisfies (or contradicts) this AC's measurable target> | ✅ / ❌ / ❓ / N/A | [Fact # / Source # / MVE link] |
| AC-1.2: ... | ... | ... | ... |
| ... | ... | ... | ... |
```
Cell semantics:
- ✅ **Pass** — the candidate's pinned mode satisfies this line, with cited official-doc or MVE evidence.
- ❌ **Fail** — the candidate's pinned mode contradicts this line, with cited evidence. Even one ❌ disqualifies the candidate from `Selected` status.
- ❓ **Verify** — no evidence yet either way; further investigation required (loops back to Step 2 / Step 3.5). A row left ❓ at the end of analysis blocks the candidate.
- **N/A** — the line is irrelevant to this component area (state why in one phrase).
A candidate row may not be marked `Selected` while any cell is ❌ or ❓.
#### 7.5.3 Decision Rules
- `Selected` is allowed only when (a) the top-level row has an MVE link, (b) the sub-matrix has zero ❌, (c) the sub-matrix has zero ❓, and (d) the candidate's documented implementation assumptions match the project's explicit constraints and acceptance criteria.
- `Experimental only` is required when a candidate might work but lacks proof for the exact operating context (e.g., MVE exists for a similar configuration but not the exact one).
- `Rejected` is required when documented assumptions conflict with project constraints (any sub-matrix row is ❌ with cited evidence).
- `Needs user decision` is required when a mismatch changes scope, cost, safety, product behavior, or acceptance criteria — and the user has not yet been consulted.
- Each component area must include at least one selected or fallback-safe option, plus the most credible rejected/experimental alternatives discovered during web research.
- A component area with only one candidate is incomplete unless `00_question_decomposition.md` documents the broader searches and why they yielded no realistic alternatives.
- A candidate may not appear as the lead solution in Step 8 unless this gate marks it `Selected`.
- "Validation gate required" footnotes are not equivalent to `Selected`. If the validation gate concerns API capability (does the mode produce the required output?), that is a Step-2 / Step-7.5 question and must be resolved here, not deferred to runtime. Only validation gates concerning *runtime quality* (e.g., "does this VO converge on this terrain class?") may be carried forward as `Selected with runtime gate`.
**Save action**: Write `06_component_fit_matrix.md` (or, when split, the equivalent files under `06_component_fit_matrix/` — typically `00_summary.md` for the top-level matrix plus per-component sub-matrix files) containing both 7.5.1 (top-level) and 7.5.2 (per-candidate sub-matrices).
**BLOCKING**: If any lead candidate has ❌, ❓, `Experimental only`, `Rejected`, or `Needs user decision` status, do not silently proceed. Ask the user or choose a different selected candidate.
---
### Step 8: Deliverable Formatting
Make the output **readable, traceable, and actionable**.
@@ -139,8 +213,8 @@ Integrate all intermediate artifacts. Write to `OUTPUT_DIR/solution_draft##.md`
Sources to integrate:
- Extract background from `00_question_decomposition.md`
- Reference key facts from `02_fact_cards.md`
- Reference key facts from `02_fact_cards.md` (or files under `02_fact_cards/` if split)
- Organize conclusions from `04_reasoning_chain.md`
- Generate references from `01_source_registry.md`
- Generate references from `01_source_registry.md` (or files under `01_source_registry/` if split)
- Supplement with use cases from `05_validation_log.md`
- For Mode A: include AC assessment from `00_ac_assessment.md`
@@ -10,12 +10,21 @@
[Architecture solution that meets restrictions and acceptance criteria.]
> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical research outputs (concept comparison, market/policy report, investigation answer), this Architecture section may be replaced with a comparison/analysis section that does not use these columns; or the columns may be marked `N/A` per row when the row describes a non-technical "component" (a process, a policy, an organizational construct). For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
### Component: [Component Name]
| Solution | Tools | Advantages | Limitations | Requirements | Security | Cost | Fit |
|----------|-------|-----------|-------------|-------------|----------|------|-----|
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [cost] | [fit assessment] |
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Cost | API Capability Evidence | Fit |
|----------|-------|--------------------|-----------|-------------|-------------|----------|------|-------------------------|-----|
| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [cost] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
**Exact-fit evidence**:
- Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
- Evidence: [Fact # / Source #]
- Disqualifiers: [none or list]
- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` (or `06_component_fit_matrix/` if split) § <Candidate Name>
- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected
[Repeat per component]
@@ -13,12 +13,21 @@
[Architecture solution that meets restrictions and acceptance criteria.]
> **Applicability** — the table columns `Pinned Mode/Config` and `API Capability Evidence` apply only to technical-component runs (per SKILL.md → Research Output Class). For non-technical assessment outputs (e.g., reassessing a policy approach, comparing organizational designs), this Architecture section may be replaced with the assessment content that does not use these columns; or the columns may be marked `N/A` per row for non-technical "components". For mixed runs, fill the columns only on rows that describe libraries/SDKs/frameworks/services/protocols/data formats/algorithms.
### Component: [Component Name]
| Solution | Tools | Advantages | Limitations | Requirements | Security | Performance | Fit |
|----------|-------|-----------|-------------|-------------|----------|------------|-----|
| [Option 1] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
| [Option 2] | [lib/platform] | [pros] | [cons] | [reqs] | [security] | [perf] | [fit assessment] |
| Solution | Tools | Pinned Mode/Config | Advantages | Limitations | Requirements | Security | Performance | API Capability Evidence | Fit |
|----------|-------|--------------------|-----------|-------------|-------------|----------|------------|-------------------------|-----|
| [Option 1] | [lib/platform] | [exact mode/config used: inputs, outputs, runtime] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link to MVE block]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision — cite exact-fit evidence and disqualifiers] |
| [Option 2] | [lib/platform] | [exact mode/config used] | [pros] | [cons] | [intrinsic requirements] | [security] | [perf] | MVE: [link]; docs: [Source #] | [Selected / Rejected / Experimental only / Needs user decision] |
**Exact-fit evidence**:
- Project constraints checked: [inputs/outputs, operating context, lifecycle, NFRs, acceptance criteria]
- Evidence: [Fact # / Source #]
- Disqualifiers: [none or list]
- Restrictions × Candidate-Modes sub-matrix: see `06_component_fit_matrix.md` (or `06_component_fit_matrix/` if split) § <Candidate Name>
- API capability gates: ✅ MVE saved / ⚠️ partial — see disqualifiers / ❌ no MVE — candidate is Experimental only or Rejected
[Repeat per component]
+4 -4
View File
@@ -2,9 +2,9 @@
name: retrospective
description: |
Collect metrics from implementation batch reports and code review findings, analyze trends across cycles,
and produce improvement reports with actionable recommendations.
3-step workflow: collect metrics, analyze trends, produce report.
Outputs to _docs/06_metrics/.
and produce improvement reports plus a lessons-log update with actionable recommendations.
4-step workflow: collect metrics, analyze trends, produce report, update lessons log.
Outputs to _docs/06_metrics/ and appends to _docs/LESSONS.md (ring buffer, last 15).
Trigger phrases:
- "retrospective", "retro", "run retro"
- "metrics review", "feedback loop"
@@ -232,7 +232,7 @@ Present the report summary to the user.
```
┌────────────────────────────────────────────────────────────────┐
│ Retrospective (3-Step Method) │
│ Retrospective (4-Step Method) │
├────────────────────────────────────────────────────────────────┤
│ PREREQ: batch reports exist in _docs/03_implementation/ │
│ │
+13 -2
View File
@@ -22,7 +22,7 @@ test-run has two modes. The caller passes the mode explicitly; if missing, defau
| Mode | Scope | Typical caller | Input artifacts |
|------|-------|---------------|-----------------|
| `functional` (default) | Unit / integration / blackbox tests — correctness | autodev Steps that verify after Implement Tests or Implement | `scripts/run-tests.sh`, `_docs/02_document/tests/environment.md`, `_docs/02_document/tests/blackbox-tests.md` |
| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 9, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |
| `perf` | Performance / load / stress / soak tests — latency, throughput, error-rate thresholds | autodev greenfield Step 15, existing-code Step 15 (pre-deploy) | `scripts/run-performance-tests.sh`, `_docs/02_document/tests/performance-tests.md`, AC thresholds in `_docs/00_problem/acceptance_criteria.md` |
Direct user invocation (`/test-run`) defaults to `functional`. If the user says "perf tests", "load test", "performance", or passes a performance scenarios file, run `perf` mode.
@@ -32,6 +32,17 @@ After selecting a mode, read its corresponding workflow below; do not mix them.
## Functional Mode
### 0. System-Under-Test Reality Gate
Before accepting any functional, blackbox, or e2e result as a pass, verify what the tests actually exercised.
1. If `_docs/00_problem/input_data/expected_results/results_report.md` exists, at least one e2e/blackbox run must compare actual product outputs against that mapping or the machine-readable files it references.
2. Stubs are allowed only for external systems outside the product boundary: flight controller/SITL, QGC observer, satellite-provider/Suite service, physical Jetson hardware, physical camera, unavailable licensed datasets, and network services.
3. Stubs, fakes, deterministic fallbacks, monkeypatches, or direct replacement of internal product modules are not allowed for the behavior under test. Internal examples include VIO, safety/anchor wrapper, satellite retrieval, anchor verification, tile manager, MAVLink output adapter, FDR, and the A-Z localization pipeline.
4. If tests pass only because an internal module is fake/scaffolded, classify the run as **failed** with category `missing product implementation`.
5. If a scenario is blocked because external hardware/data is absent, verify the production code path exists before accepting the block as legitimate. Missing internal production code is not an environment block.
6. If the test runner writes CSV/Markdown reports, inspect them. A zero exit code is not enough; blocked/internal-stubbed scenarios still require classification.
### 1. Detect Test Runner
Check in order — first match wins:
@@ -94,7 +105,7 @@ Categorize skips as: **explicit skip (dead code)**, **runtime skip (unreachable)
### 5. Handle Outcome
**All tests pass, zero skipped** → return success to the autodev for auto-chain.
**All tests pass, zero skipped, and the System-Under-Test Reality Gate passes** → return success to the autodev for auto-chain.
**Any test fails or errors** → this is a **blocking gate**. Never silently ignore failures. **Always investigate the root cause before deciding on an action.** Read the failing test code, read the error output, check service logs if applicable, and determine whether the bug is in the test or in the production code.
+4 -3
View File
@@ -202,12 +202,12 @@ If invoked in `cycle-update` mode (see "Invocation Modes" above), read and follo
| Missing acceptance_criteria.md, restrictions.md, or input_data/ | **STOP** — specification cannot proceed |
| Missing input_data/expected_results/results_report.md | **STOP** — ask user to provide expected results mapping using the template |
| Ambiguous requirements | ASK user |
| Input data coverage below 75% (Phase 1) | Search internet for supplementary data, ASK user to validate |
| Input data coverage below the canonical threshold (Phase 1) | Search internet for supplementary data, ASK user to validate. See `.cursor/rules/cursor-meta.mdc` Quality Thresholds for the canonical 75% number — do not hardcode a different threshold here. |
| Expected results missing or not quantifiable (Phase 1) | ASK user to provide quantifiable expected results before proceeding |
| Test scenario conflicts with restrictions | ASK user to clarify intent |
| System interfaces unclear (no architecture.md) | ASK user or derive from solution.md |
| Test data or expected result not provided for a test scenario (Phase 3) | WARN user and REMOVE the test |
| Final coverage below 75% after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec |
| Final coverage below the canonical threshold after removals (Phase 3) | BLOCK — require user to supply data or accept reduced spec (see `cursor-meta.mdc` Quality Thresholds) |
## Common Mistakes
@@ -252,7 +252,8 @@ When the user wants to:
│ │
│ Phase 3: Test Data & Expected Results Validation Gate (HARD GATE) │
│ → phases/03-data-validation-gate.md │
│ [BLOCKING: coverage ≥ 75% required to pass]
│ [BLOCKING: coverage ≥ canonical threshold required to pass —
│ see cursor-meta.mdc Quality Thresholds (75%)] │
│ │
│ Hardware-Dependency Assessment (BLOCKING, pre-Phase-4) │
│ → phases/hardware-assessment.md │
@@ -1,7 +1,7 @@
# Phase 3: Test Data & Expected Results Validation Gate (HARD GATE)
**Role**: Professional Quality Assurance Engineer
**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above 75%.
**Goal**: Ensure every test scenario produced in Phase 2 has concrete, sufficient test data. Remove tests that lack data. Verify final coverage stays above the canonical threshold (currently 75% — see `.cursor/rules/cursor-meta.mdc` Quality Thresholds; never hardcode a different number in any phase).
**Constraints**: This phase is MANDATORY and cannot be skipped.
## Step 1 — Build the requirements checklist
@@ -95,7 +95,7 @@ Examples:
File: `expected_results/image_01_detections.json`
```json
```json
{
"input": "image_01.jpg",
"expected": {
@@ -119,7 +119,7 @@ File: `expected_results/image_01_detections.json`
]
}
}
```
```
```
---
+42
View File
@@ -0,0 +1,42 @@
root = true
[*]
indent_style = space
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
[*.cs]
indent_size = 4
[*.{csproj,props,targets,nuspec,resx}]
indent_size = 2
[*.{json,yml,yaml}]
indent_size = 2
[*.{md,sql}]
trim_trailing_whitespace = false
[*.cs]
csharp_new_line_before_open_brace = all
csharp_new_line_before_else = true
csharp_new_line_before_catch = true
csharp_new_line_before_finally = true
csharp_new_line_before_members_in_object_initializers = true
csharp_new_line_before_members_in_anonymous_types = true
csharp_new_line_between_query_expression_clauses = true
csharp_style_namespace_declarations = file_scoped:suggestion
csharp_using_directive_placement = outside_namespace:suggestion
dotnet_style_qualification_for_field = false:suggestion
dotnet_style_qualification_for_property = false:suggestion
dotnet_style_qualification_for_method = false:suggestion
dotnet_style_qualification_for_event = false:suggestion
dotnet_diagnostic.CA1001.severity = warning
dotnet_diagnostic.CA1051.severity = warning
dotnet_diagnostic.CA1816.severity = warning
dotnet_diagnostic.CA2227.severity = warning
+35
View File
@@ -0,0 +1,35 @@
# Satellite Provider environment configuration template.
# Copy this file to `.env` and replace placeholder values before running
# docker-compose or scripts/run-tests.sh.
#
# IMPORTANT: `.env` is gitignored on purpose. Never commit real secrets.
# Google Maps Platform API key for satellite imagery downloads.
GOOGLE_MAPS_API_KEY=
# HMAC-SHA256 signing key for JWT validation (suite-level auth contract,
# `suite/_docs/10_auth.md`). MUST be at least 32 bytes (UTF-8) — the API
# fails fast on startup if this is unset or shorter.
#
# Generate a strong secret with, for example:
# openssl rand -hex 32
#
# Test/CI runs may use a clearly tagged TEST-ONLY value (still >=32 bytes).
JWT_SECRET=
# JWT issuer / audience claims (AZ-494). Both are REQUIRED — the API
# fails fast at startup if either is unset or whitespace-only.
#
# Production values MUST be confirmed by the admin team before deploy
# (the admin API stamps the `iss` claim; satellite-provider validates
# the `aud` claim).
#
# For local dev / CI: use the DEV-ONLY values below. The integration
# test runner and scripts/run-tests.sh read these directly from the
# environment (no appsettings fallback on the test side), so leaving
# them blank will cause run-tests.sh to refuse to start.
#
# NEVER ship these DEV-ONLY values to prod — they exist only to make
# local-dev mints validate against appsettings.Development.json:
JWT_ISSUER=DEV-ONLY-iss-admin-azaion-local
JWT_AUDIENCE=DEV-ONLY-aud-satellite-provider
+8 -1
View File
@@ -10,4 +10,11 @@ Content/
.env
tiles/
ready/
.DS_Store
.DS_Store
TestResults/
coverage.cobertura.xml
coverage.opencover.xml
*.coverage
_docs/03_implementation/test_runs/
_docs/04_run_results/
certs/
+13
View File
@@ -0,0 +1,13 @@
when:
event: [push, pull_request, manual]
branch: [dev, stage, main]
labels:
platform: arm64
steps:
- name: unit-tests
image: mcr.microsoft.com/dotnet/sdk:10.0
commands:
- dotnet restore SatelliteProvider.sln
- dotnet test SatelliteProvider.Tests/SatelliteProvider.Tests.csproj --no-restore --configuration Release --logger "console;verbosity=normal" --logger "trx;LogFileName=test-results.trx" --results-directory /app/test-results
@@ -2,8 +2,20 @@ when:
event: [push, manual]
branch: [dev, stage, main]
depends_on:
- 01-test
# Multi-arch matrix. Adding amd64 = uncommenting the second entry once an
# amd64 agent is online.
matrix:
include:
- PLATFORM: arm64
TAG_SUFFIX: arm
# - PLATFORM: amd64
# TAG_SUFFIX: amd
labels:
platform: arm64
platform: ${PLATFORM}
steps:
- name: build-push
@@ -17,7 +29,7 @@ steps:
from_secret: registry_token
commands:
- echo "$REGISTRY_TOKEN" | docker login "$REGISTRY_HOST" -u "$REGISTRY_USER" --password-stdin
- export TAG=${CI_COMMIT_BRANCH}-arm
- export TAG=${CI_COMMIT_BRANCH}-${TAG_SUFFIX}
- export BUILD_DATE=$(date -u +%Y-%m-%dT%H:%M:%SZ)
- |
docker build -f SatelliteProvider.Api/Dockerfile \
+7 -5
View File
@@ -2,11 +2,11 @@
## System Overview
This is a .NET 8.0 ASP.NET Web API service that downloads, stores, and manages satellite imagery tiles from Google Maps. The service supports region-based tile requests, route planning with intermediate points, and geofencing capabilities.
This is a .NET 10 ASP.NET Web API service that downloads, stores, and manages satellite imagery tiles from Google Maps. The service supports region-based tile requests, route planning with intermediate points, and geofencing capabilities.
## Tech Stack
- **.NET 8.0** with ASP.NET Core Web API
- **.NET 10** with ASP.NET Core Web API
- **PostgreSQL** database (via Docker)
- **Dapper** for data access (ORM)
- **DbUp** for database migrations
@@ -236,10 +236,12 @@ Development defaults:
## Dependencies and Versions
Key packages (all .NET 8.0):
- Microsoft.AspNetCore.OpenApi 8.0.21
Key packages (all .NET 10):
- Microsoft.AspNetCore.OpenApi 10.0.7
- Microsoft.AspNetCore.Authentication.JwtBearer 10.0.7
- Microsoft.Extensions.* 10.0.7
- Swashbuckle.AspNetCore 6.6.2
- Serilog.AspNetCore 8.0.3
- Serilog.AspNetCore 8.0.3 (intentional — 10.0.0 requires Serilog.Sinks.File ≥ 7.0.0; bumping Serilog.Sinks.File is out of AZ-500 scope per "no unrelated package bumps")
- SixLabors.ImageSharp 3.1.11
- Newtonsoft.Json 13.0.4
- Dapper (check DataAccess csproj)
+10
View File
@@ -0,0 +1,10 @@
<Project>
<PropertyGroup>
<EnableNETAnalyzers>true</EnableNETAnalyzers>
<AnalysisLevel>latest</AnalysisLevel>
<AnalysisMode>None</AnalysisMode>
<EnforceCodeStyleInBuild>false</EnforceCodeStyleInBuild>
</PropertyGroup>
</Project>
@@ -0,0 +1,96 @@
using System.Text;
using Microsoft.AspNetCore.Authentication.JwtBearer;
using Microsoft.IdentityModel.Tokens;
namespace SatelliteProvider.Api.Authentication;
public static class AuthenticationServiceCollectionExtensions
{
public const string JwtSecretEnvVar = "JWT_SECRET";
public const string JwtSecretConfigKey = "Jwt:Secret";
public const string JwtIssuerEnvVar = "JWT_ISSUER";
public const string JwtIssuerConfigKey = "Jwt:Issuer";
public const string JwtAudienceEnvVar = "JWT_AUDIENCE";
public const string JwtAudienceConfigKey = "Jwt:Audience";
public const int MinSecretByteLength = 32;
public static IServiceCollection AddSatelliteJwt(this IServiceCollection services, IConfiguration configuration)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(configuration);
var secret = ResolveSecretOrThrow(configuration);
var issuer = ResolveRequiredOrThrow(configuration, JwtIssuerEnvVar, JwtIssuerConfigKey, "JWT issuer");
var audience = ResolveRequiredOrThrow(configuration, JwtAudienceEnvVar, JwtAudienceConfigKey, "JWT audience");
var signingKey = new SymmetricSecurityKey(Encoding.UTF8.GetBytes(secret));
services
.AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
.AddJwtBearer(options =>
{
options.TokenValidationParameters = new TokenValidationParameters
{
ValidateIssuerSigningKey = true,
IssuerSigningKey = signingKey,
ValidateLifetime = true,
ClockSkew = TimeSpan.FromSeconds(30),
ValidateIssuer = true,
ValidIssuer = issuer,
ValidateAudience = true,
ValidAudience = audience,
RequireSignedTokens = true,
RequireExpirationTime = true
};
});
return services;
}
internal static string ResolveSecretOrThrow(IConfiguration configuration)
{
var secret = Environment.GetEnvironmentVariable(JwtSecretEnvVar);
if (string.IsNullOrWhiteSpace(secret))
{
secret = configuration[JwtSecretConfigKey];
}
if (string.IsNullOrWhiteSpace(secret))
{
throw new InvalidOperationException(
$"JWT secret is not configured. Set the {JwtSecretEnvVar} environment variable " +
$"or the {JwtSecretConfigKey} configuration key to a value of at least {MinSecretByteLength} bytes.");
}
var byteLength = Encoding.UTF8.GetByteCount(secret);
if (byteLength < MinSecretByteLength)
{
throw new InvalidOperationException(
$"JWT secret is too short ({byteLength} bytes). HMAC-SHA256 requires at least {MinSecretByteLength} bytes " +
$"per RFC 2104 §3. Set {JwtSecretEnvVar} or {JwtSecretConfigKey} to a longer value.");
}
return secret;
}
// AZ-494: required non-secret config (iss / aud). Fail-fast contract mirrors
// JWT_SECRET — missing or whitespace-only values throw at startup so a
// production deploy without the operator-confirmed values cannot silently
// accept tokens with arbitrary issuer/audience claims.
internal static string ResolveRequiredOrThrow(IConfiguration configuration, string envVar, string configKey, string humanLabel)
{
var value = Environment.GetEnvironmentVariable(envVar);
if (string.IsNullOrWhiteSpace(value))
{
value = configuration[configKey];
}
if (string.IsNullOrWhiteSpace(value))
{
throw new InvalidOperationException(
$"{humanLabel} is not configured. Set the {envVar} environment variable " +
$"or the {configKey} configuration key. (See AZ-494 task spec.)");
}
return value;
}
}
@@ -0,0 +1,119 @@
using System.Security.Claims;
using System.Text.Json;
using Microsoft.AspNetCore.Authorization;
namespace SatelliteProvider.Api.Authentication;
// AZ-488: enforces a required permission from the `permissions` JWT claim.
// The claim may arrive as either:
// - a JWT array claim (multiple ClaimType="permissions" entries — Microsoft.IdentityModel
// splits arrays this way), OR
// - a single JSON-array string ("[\"GPS\",\"FL\"]") when a producer mis-encodes.
// Both shapes are matched so the satellite-provider remains tolerant of upstream
// producers that do not split array claims out of the box.
public sealed class PermissionsRequirement : IAuthorizationRequirement
{
public PermissionsRequirement(string requiredPermission)
{
if (string.IsNullOrWhiteSpace(requiredPermission))
{
throw new ArgumentException("Required permission must be a non-empty string.", nameof(requiredPermission));
}
RequiredPermission = requiredPermission;
}
public string RequiredPermission { get; }
}
public sealed class PermissionsAuthorizationHandler : AuthorizationHandler<PermissionsRequirement>
{
public const string ClaimType = "permissions";
protected override Task HandleRequirementAsync(AuthorizationHandlerContext context, PermissionsRequirement requirement)
{
ArgumentNullException.ThrowIfNull(context);
ArgumentNullException.ThrowIfNull(requirement);
var user = context.User;
if (user?.Identity?.IsAuthenticated != true)
{
return Task.CompletedTask;
}
if (UserHasPermission(user, requirement.RequiredPermission))
{
context.Succeed(requirement);
}
return Task.CompletedTask;
}
private static bool UserHasPermission(ClaimsPrincipal user, string requiredPermission)
{
foreach (var claim in user.FindAll(ClaimType))
{
if (string.Equals(claim.Value, requiredPermission, StringComparison.Ordinal))
{
return true;
}
if (TryReadJsonArray(claim.Value, out var values))
{
foreach (var value in values)
{
if (string.Equals(value, requiredPermission, StringComparison.Ordinal))
{
return true;
}
}
}
}
return false;
}
private static bool TryReadJsonArray(string value, out IReadOnlyList<string> items)
{
items = Array.Empty<string>();
if (string.IsNullOrWhiteSpace(value) || value[0] != '[')
{
return false;
}
try
{
using var document = JsonDocument.Parse(value);
if (document.RootElement.ValueKind != JsonValueKind.Array)
{
return false;
}
var result = new List<string>(document.RootElement.GetArrayLength());
foreach (var element in document.RootElement.EnumerateArray())
{
if (element.ValueKind == JsonValueKind.String)
{
var text = element.GetString();
if (!string.IsNullOrEmpty(text))
{
result.Add(text);
}
}
}
items = result;
return result.Count > 0;
}
catch (JsonException)
{
return false;
}
}
}
public static class SatellitePermissions
{
public const string Gps = "GPS";
public const string UavUploadPolicy = "RequiresGpsPermission";
}
@@ -0,0 +1,41 @@
namespace SatelliteProvider.Api;
public static class CorsConfigurationValidator
{
public const string MissingOriginsMessage =
"CORS is misconfigured: CorsConfig:AllowedOrigins is empty and CorsConfig:AllowAnyOrigin is not true. " +
"Refusing to start in Production with a permissive CORS policy. " +
"Set CorsConfig:AllowedOrigins to a non-empty array, or set CorsConfig:AllowAnyOrigin=true to opt in.";
public const string PermissiveDefaultWarning =
"CorsConfig:AllowedOrigins is empty and CorsConfig:AllowAnyOrigin is not true. " +
"Permissive CORS is being applied for environment {Environment}; do not run with this configuration in Production.";
public static void EnsureSafeForEnvironment(
string[] allowedOrigins,
bool allowAnyOrigin,
string environmentName)
{
ArgumentNullException.ThrowIfNull(allowedOrigins);
ArgumentNullException.ThrowIfNull(environmentName);
if (allowedOrigins.Length == 0
&& !allowAnyOrigin
&& string.Equals(environmentName, "Production", StringComparison.OrdinalIgnoreCase))
{
throw new InvalidOperationException(MissingOriginsMessage);
}
}
public static bool ShouldUsePermissivePolicy(string[] allowedOrigins, bool allowAnyOrigin)
{
ArgumentNullException.ThrowIfNull(allowedOrigins);
return allowAnyOrigin || allowedOrigins.Length == 0;
}
public static bool ShouldWarnAboutPermissiveDefault(string[] allowedOrigins, bool allowAnyOrigin)
{
ArgumentNullException.ThrowIfNull(allowedOrigins);
return allowedOrigins.Length == 0 && !allowAnyOrigin;
}
}
@@ -0,0 +1,16 @@
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
namespace SatelliteProvider.Api.DTOs;
// AZ-488 / `uav-tile-upload.md` v1.0.0 — multipart envelope. `Metadata` carries the
// JSON array of UavTileMetadata records; `Files` provides one IFormFile per record
// at the matching ordinal index.
public record UavTileBatchUploadRequest
{
[FromForm(Name = "metadata")]
public string Metadata { get; init; } = string.Empty;
[FromForm(Name = "files")]
public IFormFileCollection? Files { get; init; }
}
+5 -3
View File
@@ -1,14 +1,16 @@
FROM mcr.microsoft.com/dotnet/aspnet:8.0 AS base
FROM mcr.microsoft.com/dotnet/aspnet:10.0 AS base
WORKDIR /app
EXPOSE 8080
EXPOSE 8081
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
WORKDIR /src
COPY ["SatelliteProvider.Api/SatelliteProvider.Api.csproj", "SatelliteProvider.Api/"]
COPY ["SatelliteProvider.Common/SatelliteProvider.Common.csproj", "SatelliteProvider.Common/"]
COPY ["SatelliteProvider.DataAccess/SatelliteProvider.DataAccess.csproj", "SatelliteProvider.DataAccess/"]
COPY ["SatelliteProvider.Services/SatelliteProvider.Services.csproj", "SatelliteProvider.Services/"]
COPY ["SatelliteProvider.Services.TileDownloader/SatelliteProvider.Services.TileDownloader.csproj", "SatelliteProvider.Services.TileDownloader/"]
COPY ["SatelliteProvider.Services.RegionProcessing/SatelliteProvider.Services.RegionProcessing.csproj", "SatelliteProvider.Services.RegionProcessing/"]
COPY ["SatelliteProvider.Services.RouteManagement/SatelliteProvider.Services.RouteManagement.csproj", "SatelliteProvider.Services.RouteManagement/"]
RUN dotnet restore "SatelliteProvider.Api/SatelliteProvider.Api.csproj"
COPY . .
WORKDIR "/src/SatelliteProvider.Api"
@@ -0,0 +1,76 @@
using Microsoft.AspNetCore.Diagnostics;
using Microsoft.AspNetCore.Mvc;
namespace SatelliteProvider.Api;
public sealed class GlobalExceptionHandler : IExceptionHandler
{
private readonly ILogger<GlobalExceptionHandler> _logger;
public GlobalExceptionHandler(ILogger<GlobalExceptionHandler> logger)
{
_logger = logger;
}
public async ValueTask<bool> TryHandleAsync(
HttpContext httpContext,
Exception exception,
CancellationToken cancellationToken)
{
// Framework-level request-binding/parsing failures carry their own HTTP status
// (typically 400/415). Honor that status so we don't promote a client error to 5xx.
if (exception is BadHttpRequestException badRequest)
{
await WriteClientErrorAsync(httpContext, badRequest, cancellationToken);
return true;
}
var correlationId = httpContext.TraceIdentifier;
_logger.LogError(
exception,
"Unhandled exception while processing {Method} {Path} (correlationId={CorrelationId})",
httpContext.Request.Method,
httpContext.Request.Path,
correlationId);
httpContext.Response.StatusCode = StatusCodes.Status500InternalServerError;
var problem = new ProblemDetails
{
Status = StatusCodes.Status500InternalServerError,
Title = "Internal Server Error",
Detail = "An unexpected error occurred. Use the correlationId to look up the server log entry.",
Type = "https://datatracker.ietf.org/doc/html/rfc9110#name-500-internal-server-error",
};
problem.Extensions["correlationId"] = correlationId;
await httpContext.Response.WriteAsJsonAsync(
problem,
options: null,
contentType: "application/problem+json",
cancellationToken: cancellationToken);
return true;
}
private static async Task WriteClientErrorAsync(
HttpContext httpContext,
BadHttpRequestException badRequest,
CancellationToken cancellationToken)
{
httpContext.Response.StatusCode = badRequest.StatusCode;
var problem = new ProblemDetails
{
Status = badRequest.StatusCode,
Title = "Bad Request",
Detail = badRequest.Message,
};
await httpContext.Response.WriteAsJsonAsync(
problem,
options: null,
contentType: "application/problem+json",
cancellationToken: cancellationToken);
}
}
+273 -332
View File
@@ -1,97 +1,168 @@
using System.ComponentModel.DataAnnotations;
using Microsoft.AspNetCore.Authorization;
using Microsoft.AspNetCore.Http.Features;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.OpenApi.Models;
using Microsoft.AspNetCore.Server.Kestrel.Core;
using Microsoft.OpenApi;
using Swashbuckle.AspNetCore.SwaggerGen;
using SatelliteProvider.Api;
using SatelliteProvider.Api.Authentication;
using SatelliteProvider.Api.DTOs;
using SatelliteProvider.Api.Swagger;
using SatelliteProvider.DataAccess;
using SatelliteProvider.DataAccess.Models;
using SatelliteProvider.DataAccess.Repositories;
using SatelliteProvider.DataAccess.TypeHandlers;
using SatelliteProvider.Common.Configs;
using SatelliteProvider.Common.DTO;
using SatelliteProvider.Common.Interfaces;
using SatelliteProvider.Common.Utils;
using SatelliteProvider.Services;
using SatelliteProvider.Services.RegionProcessing;
using SatelliteProvider.Services.RouteManagement;
using SatelliteProvider.Services.TileDownloader;
using Serilog;
var builder = WebApplication.CreateBuilder(args);
builder.Host.UseSerilog((context, configuration) =>
builder.Host.UseSerilog((context, configuration) =>
configuration.ReadFrom.Configuration(context.Configuration));
var connectionString = builder.Configuration.GetConnectionString("DefaultConnection")
var connectionString = builder.Configuration.GetConnectionString("DefaultConnection")
?? throw new InvalidOperationException("Connection string 'DefaultConnection' not found.");
DapperEnumTypeHandlers.RegisterAll();
builder.Services.Configure<MapConfig>(builder.Configuration.GetSection("MapConfig"));
builder.Services.Configure<StorageConfig>(builder.Configuration.GetSection("StorageConfig"));
builder.Services.Configure<ProcessingConfig>(builder.Configuration.GetSection("ProcessingConfig"));
builder.Services.Configure<UavQualityConfig>(builder.Configuration.GetSection("UavQuality"));
var uavQuality = builder.Configuration.GetSection("UavQuality").Get<UavQualityConfig>() ?? new UavQualityConfig();
var uavBatchBodyLimit = checked((long)uavQuality.MaxBatchSize * uavQuality.MaxBytes);
builder.Services.Configure<KestrelServerOptions>(options =>
{
options.Limits.MaxRequestBodySize = uavBatchBodyLimit;
// AZ-505: enable HTTP/2 alongside HTTP/1.1 on every Kestrel endpoint so
// programmatic clients (httpx http2=True, .NET HttpClient) can multiplex
// tile reads on a single TCP connection. Kestrel requires TLS+ALPN for
// HTTP/2 — the dev/test compose files mount a self-signed cert at
// /app/certs/api.pfx and set ASPNETCORE_URLS=https://+:8080; production
// is expected to terminate TLS at the same layer or upstream. Browsers
// negotiate HTTP/2 via ALPN once TLS is present; legacy HTTP/1.1
// callers continue to work over the same listener. HTTP/3/QUIC is
// intentionally out of scope (see AZ-505 task spec § Excluded).
options.ConfigureEndpointDefaults(listen =>
{
listen.Protocols = HttpProtocols.Http1AndHttp2;
});
});
builder.Services.Configure<FormOptions>(options =>
{
options.MultipartBodyLengthLimit = uavBatchBodyLimit;
options.ValueLengthLimit = Math.Max(options.ValueLengthLimit, uavQuality.MaxBatchSize * 512);
});
builder.Services.AddSingleton<ITileRepository>(sp => new TileRepository(connectionString, sp.GetRequiredService<ILogger<TileRepository>>()));
builder.Services.AddSingleton<IRegionRepository>(sp => new RegionRepository(connectionString, sp.GetRequiredService<ILogger<RegionRepository>>()));
builder.Services.AddSingleton<IRouteRepository>(sp => new RouteRepository(connectionString, sp.GetRequiredService<ILogger<RouteRepository>>()));
builder.Services.AddSingleton<IRegionRepository>(sp => new RegionRepository(connectionString));
builder.Services.AddSingleton<IRouteRepository>(sp => new RouteRepository(connectionString));
builder.Services.AddHttpClient();
builder.Services.AddMemoryCache();
builder.Services.AddSingleton<GoogleMapsDownloaderV2>();
builder.Services.AddSingleton<ITileService, TileService>();
builder.Services.AddTileDownloader();
builder.Services.AddRegionProcessing();
builder.Services.AddRouteManagement();
builder.Services.AddSatelliteJwt(builder.Configuration);
builder.Services.AddSingleton<IAuthorizationHandler, PermissionsAuthorizationHandler>();
builder.Services.AddAuthorization(options =>
{
options.AddPolicy(SatellitePermissions.UavUploadPolicy, policy =>
{
policy.RequireAuthenticatedUser();
policy.Requirements.Add(new PermissionsRequirement(SatellitePermissions.Gps));
});
});
var allowedOrigins = builder.Configuration.GetSection("CorsConfig:AllowedOrigins").Get<string[]>() ?? Array.Empty<string>();
var allowAnyOrigin = builder.Configuration.GetValue<bool>("CorsConfig:AllowAnyOrigin");
CorsConfigurationValidator.EnsureSafeForEnvironment(allowedOrigins, allowAnyOrigin, builder.Environment.EnvironmentName);
builder.Services.AddCors(options =>
{
options.AddPolicy("TilesCors", policy =>
{
if (allowedOrigins.Length > 0)
policy.WithOrigins(allowedOrigins).AllowAnyHeader().AllowAnyMethod();
else
if (CorsConfigurationValidator.ShouldUsePermissivePolicy(allowedOrigins, allowAnyOrigin))
policy.AllowAnyOrigin().AllowAnyHeader().AllowAnyMethod();
else
policy.WithOrigins(allowedOrigins).AllowAnyHeader().AllowAnyMethod();
});
});
var processingConfig = builder.Configuration.GetSection("ProcessingConfig").Get<ProcessingConfig>() ?? new ProcessingConfig();
builder.Services.AddSingleton<IRegionRequestQueue>(sp =>
{
var logger = sp.GetRequiredService<ILogger<RegionRequestQueue>>();
return new RegionRequestQueue(processingConfig.QueueCapacity, logger);
});
builder.Services.AddSingleton<IRegionService, RegionService>();
builder.Services.AddHostedService<RegionProcessingService>();
builder.Services.AddSingleton<IRouteService, RouteService>();
builder.Services.AddHostedService<RouteProcessingService>();
builder.Services.AddProblemDetails();
builder.Services.AddExceptionHandler<GlobalExceptionHandler>();
builder.Services.ConfigureHttpJsonOptions(options =>
{
options.SerializerOptions.PropertyNamingPolicy = System.Text.Json.JsonNamingPolicy.CamelCase;
options.SerializerOptions.PropertyNameCaseInsensitive = true;
options.SerializerOptions.Converters.Add(
new System.Text.Json.Serialization.JsonStringEnumConverter(System.Text.Json.JsonNamingPolicy.CamelCase));
});
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen(c =>
{
c.SwaggerDoc("v1", new OpenApiInfo { Title = "Satellite Provider API", Version = "v1" });
c.MapType<UploadImageRequest>(() => new OpenApiSchema
c.AddSecurityDefinition("Bearer", new OpenApiSecurityScheme
{
Type = "object",
Properties = new Dictionary<string, OpenApiSchema>
{
["timestamp"] = new() { Type = "string", Format = "date-time", Description = "Image capture timestamp" },
["image"] = new() { Type = "string", Format = "binary", Description = "Image file to upload" },
["lat"] = new() { Type = "number", Format = "double", Description = "Latitude coordinate where image was captured" },
["lon"] = new() { Type = "number", Format = "double", Description = "Longitude coordinate where image was captured" },
["height"] = new() { Type = "number", Format = "double", Description = "Height/altitude in meters where image was captured" },
["focalLength"] = new() { Type = "number", Format = "double", Description = "Camera focal length in millimeters" },
["sensorWidth"] = new() { Type = "number", Format = "double", Description = "Camera sensor width in millimeters" },
["sensorHeight"] = new() { Type = "number", Format = "double", Description = "Camera sensor height in millimeters" }
},
Required = new HashSet<string> { "timestamp", "image", "lat", "lon", "height", "focalLength", "sensorWidth", "sensorHeight" }
Name = "Authorization",
Type = SecuritySchemeType.Http,
Scheme = "bearer",
BearerFormat = "JWT",
In = ParameterLocation.Header,
Description = "JWT Authorization header using the Bearer scheme. Example: 'Bearer {token}'"
});
c.AddSecurityRequirement(_ => new OpenApiSecurityRequirement
{
{
new OpenApiSecuritySchemeReference("Bearer"),
new List<string>()
}
});
c.MapType<UavTileBatchUploadRequest>(() => new OpenApiSchema
{
Type = JsonSchemaType.Object,
Properties = new Dictionary<string, IOpenApiSchema>
{
["metadata"] = new OpenApiSchema
{
Type = JsonSchemaType.String,
Description = "JSON document `{ \"items\": [ { \"latitude\", \"longitude\", \"tileZoom\", \"tileSizeMeters\", \"capturedAt\" } ] }` where item ordinal index aligns with the matching file in `files`."
},
["files"] = new OpenApiSchema
{
Type = JsonSchemaType.Array,
Description = "UAV tile JPEG files in the same order as `metadata.items`.",
Items = new OpenApiSchema { Type = JsonSchemaType.String, Format = "binary" }
}
},
Required = new HashSet<string> { "metadata", "files" }
});
c.OperationFilter<ParameterDescriptionFilter>();
});
var app = builder.Build();
var logger = app.Services.GetRequiredService<ILogger<Program>>();
var migrator = new DatabaseMigrator(connectionString, logger as ILogger<DatabaseMigrator>);
if (CorsConfigurationValidator.ShouldWarnAboutPermissiveDefault(allowedOrigins, allowAnyOrigin))
{
app.Services
.GetRequiredService<ILogger<Program>>()
.LogWarning(CorsConfigurationValidator.PermissiveDefaultWarning, app.Environment.EnvironmentName);
}
var migratorLogger = app.Services.GetRequiredService<ILogger<DatabaseMigrator>>();
var migrator = new DatabaseMigrator(connectionString, migratorLogger);
if (!migrator.RunMigrations())
{
throw new Exception("Database migration failed. Application cannot start.");
@@ -107,212 +178,205 @@ if (app.Environment.IsDevelopment())
app.UseSwaggerUI();
}
app.UseExceptionHandler();
app.UseHttpsRedirection();
app.UseCors("TilesCors");
app.UseAuthentication();
app.UseAuthorization();
app.MapGet("/tiles/{z:int}/{x:int}/{y:int}", ServeTile)
.RequireAuthorization()
.WithOpenApi(op => new(op) { Summary = "Get satellite tile image by z/x/y coordinates (Slippy Map tile server)" });
app.MapGet("/api/satellite/tiles/latlon", GetTileByLatLon)
.RequireAuthorization()
.WithOpenApi(op => new(op) { Summary = "Get satellite tile by latitude and longitude coordinates" });
app.MapGet("/api/satellite/tiles/mgrs", GetSatelliteTilesByMgrs)
.WithOpenApi(op => new(op) { Summary = "Get satellite tiles by MGRS coordinates" });
.RequireAuthorization()
.ProducesProblem(StatusCodes.Status501NotImplemented)
.WithOpenApi(op => new(op) { Summary = "Get satellite tiles by MGRS coordinates (NOT IMPLEMENTED)" });
app.MapPost("/api/satellite/upload", UploadImage)
.Accepts<UploadImageRequest>("multipart/form-data")
.WithOpenApi(op => new(op) { Summary = "Upload image with metadata and save to /maps folder" })
app.MapPost("/api/satellite/tiles/inventory", GetTilesInventory)
.RequireAuthorization()
.Accepts<TileInventoryRequest>("application/json")
.Produces<TileInventoryResponse>(StatusCodes.Status200OK)
.ProducesProblem(StatusCodes.Status400BadRequest)
.WithOpenApi(op => new(op)
{
Summary = "Bulk tile inventory lookup by (z,x,y) coords or location_hash",
Description = "AZ-505 / `tile-inventory.md` v1.0.0. Body MUST populate exactly one of `tiles` (array of `{tileZoom,tileX,tileY}`) OR `locationHashes` (array of UUIDv5). Response order matches request order. Returns one entry per request item with `present: true|false`; when present, identity + recency fields are included. Hard cap: 5000 entries per call (HTTP 400 above)."
});
app.MapPost("/api/satellite/upload", UploadUavTileBatch)
.RequireAuthorization(SatellitePermissions.UavUploadPolicy)
.Accepts<UavTileBatchUploadRequest>("multipart/form-data")
.Produces<UavTileBatchUploadResponse>(StatusCodes.Status200OK)
.ProducesProblem(StatusCodes.Status400BadRequest)
.WithOpenApi(op => new(op)
{
Summary = "Upload a batch of UAV-captured satellite tiles",
Description = "AZ-488 / `uav-tile-upload.md` v1.0.0. Multipart form: a JSON `metadata` field and an aligned `files` collection. Each item is graded by the 5-rule quality gate and persisted with `source='uav'` when accepted. Returns 200 with per-item results (mixed accept/reject), 400 for envelope-level errors (malformed metadata, missing files, oversized batch), 401 without a valid JWT, 403 without the `GPS` permission claim."
})
.DisableAntiforgery();
app.MapPost("/api/satellite/request", RequestRegion)
.WithOpenApi(op => new(op) { Summary = "Request tiles for a region" });
.RequireAuthorization()
.WithOpenApi(op => new(op)
{
Summary = "Request tiles for a region",
Description = "Idempotent (AZ-362): POSTing the same `id` twice returns the existing region resource with HTTP 200 and does not enqueue duplicate background processing.",
});
app.MapGet("/api/satellite/region/{id:guid}", GetRegionStatus)
.RequireAuthorization()
.WithOpenApi(op => new(op) { Summary = "Get region status and file paths" });
app.MapPost("/api/satellite/route", CreateRoute)
.WithOpenApi(op => new(op) { Summary = "Create a route with intermediate points" });
.RequireAuthorization()
.WithOpenApi(op => new(op)
{
Summary = "Create a route with intermediate points",
Description = "Idempotent (AZ-362): POSTing the same `id` twice returns the existing route resource with HTTP 200 and does not regenerate intermediate points or re-queue geofence regions.",
});
app.MapGet("/api/satellite/route/{id:guid}", GetRoute)
.RequireAuthorization()
.WithOpenApi(op => new(op) { Summary = "Get route information with calculated points" });
app.Run();
async Task<IResult> ServeTile(int z, int x, int y, HttpContext httpContext, ITileRepository tileRepository, GoogleMapsDownloaderV2 downloader, IMemoryCache cache, ILogger<Program> logger)
async Task<IResult> ServeTile(int z, int x, int y, HttpContext httpContext, ITileService tileService)
{
var cacheKey = $"tile_{z}_{x}_{y}";
try
{
if (cache.TryGetValue(cacheKey, out byte[]? cachedBytes) && cachedBytes != null)
{
httpContext.Response.Headers.CacheControl = "public, max-age=86400";
httpContext.Response.Headers.ETag = $"\"{z}_{x}_{y}\"";
return Results.Bytes(cachedBytes, "image/jpeg");
}
string? filePath = null;
var tile = await tileRepository.GetByTileCoordinatesAsync(z, x, y);
if (tile != null && File.Exists(tile.FilePath))
{
filePath = tile.FilePath;
}
else
{
var tileCenter = GeoUtils.TileToWorldPos(x, y, z);
var downloadedTile = await downloader.DownloadSingleTileAsync(tileCenter.Lat, tileCenter.Lon, z);
var now = DateTime.UtcNow;
var tileEntity = new TileEntity
{
Id = Guid.NewGuid(),
TileZoom = z,
TileX = downloadedTile.X,
TileY = downloadedTile.Y,
Latitude = downloadedTile.CenterLatitude,
Longitude = downloadedTile.CenterLongitude,
TileSizeMeters = downloadedTile.TileSizeMeters,
TileSizePixels = 256,
ImageType = "jpg",
MapsVersion = $"downloaded_{now:yyyy-MM-dd}",
Version = now.Year,
FilePath = downloadedTile.FilePath,
CreatedAt = now,
UpdatedAt = now
};
await tileRepository.InsertAsync(tileEntity);
filePath = tileEntity.FilePath;
}
var bytes = await File.ReadAllBytesAsync(filePath);
cache.Set(cacheKey, bytes, new MemoryCacheEntryOptions
{
AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(1),
SlidingExpiration = TimeSpan.FromMinutes(30)
});
httpContext.Response.Headers.CacheControl = "public, max-age=86400";
httpContext.Response.Headers.ETag = $"\"{z}_{x}_{y}\"";
return Results.Bytes(bytes, "image/jpeg");
}
catch (Exception ex)
{
logger.LogError(ex, "Failed to serve tile {Z}/{X}/{Y}", z, x, y);
return Results.Problem(detail: ex.Message, statusCode: 500);
}
var tile = await tileService.GetOrDownloadTileAsync(z, x, y, httpContext.RequestAborted);
httpContext.Response.Headers.CacheControl = $"public, max-age={(long)tile.MaxAge.TotalSeconds}";
httpContext.Response.Headers.ETag = tile.ETag;
return Results.Bytes(tile.Bytes, tile.ContentType);
}
async Task<IResult> GetTileByLatLon([FromQuery] double Latitude, [FromQuery] double Longitude, [FromQuery] int ZoomLevel, GoogleMapsDownloaderV2 downloader, ITileRepository tileRepository, ILogger<Program> logger)
async Task<IResult> GetTileByLatLon([FromQuery] double Latitude, [FromQuery] double Longitude, [FromQuery] int ZoomLevel, HttpContext httpContext, ITileService tileService)
{
try
var tile = await tileService.DownloadAndStoreSingleTileAsync(Latitude, Longitude, ZoomLevel, httpContext.RequestAborted);
var response = new DownloadTileResponse
{
var downloadedTile = await downloader.DownloadSingleTileAsync(
Latitude,
Longitude,
ZoomLevel);
Id = tile.Id,
ZoomLevel = tile.TileZoom,
Latitude = tile.Latitude,
Longitude = tile.Longitude,
TileSizeMeters = tile.TileSizeMeters,
TileSizePixels = tile.TileSizePixels,
ImageType = tile.ImageType,
Version = tile.Version,
FilePath = tile.FilePath,
CreatedAt = tile.CreatedAt,
UpdatedAt = tile.UpdatedAt
};
var now = DateTime.UtcNow;
var currentVersion = now.Year;
var tileEntity = new TileEntity
{
Id = Guid.NewGuid(),
TileZoom = downloadedTile.ZoomLevel,
TileX = downloadedTile.X,
TileY = downloadedTile.Y,
Latitude = downloadedTile.CenterLatitude,
Longitude = downloadedTile.CenterLongitude,
TileSizeMeters = downloadedTile.TileSizeMeters,
TileSizePixels = 256,
ImageType = "jpg",
MapsVersion = $"downloaded_{now:yyyy-MM-dd}",
Version = currentVersion,
FilePath = downloadedTile.FilePath,
CreatedAt = now,
UpdatedAt = now
};
await tileRepository.InsertAsync(tileEntity);
var response = new DownloadTileResponse
{
Id = tileEntity.Id,
ZoomLevel = tileEntity.TileZoom,
Latitude = tileEntity.Latitude,
Longitude = tileEntity.Longitude,
TileSizeMeters = tileEntity.TileSizeMeters,
TileSizePixels = tileEntity.TileSizePixels,
ImageType = tileEntity.ImageType,
MapsVersion = tileEntity.MapsVersion,
Version = currentVersion,
FilePath = tileEntity.FilePath,
CreatedAt = tileEntity.CreatedAt,
UpdatedAt = tileEntity.UpdatedAt
};
return Results.Ok(response);
}
catch (Exception ex)
{
logger.LogError(ex, "Failed to get tile");
return Results.Problem(detail: ex.Message, statusCode: 500);
}
return Results.Ok(response);
}
IResult GetSatelliteTilesByMgrs(string mgrs, double squareSideMeters)
{
return Results.Ok(new GetSatelliteTilesResponse());
return Results.Problem(
statusCode: StatusCodes.Status501NotImplemented,
title: "Not implemented",
detail: "MGRS-based tile retrieval is not implemented.");
}
IResult UploadImage([FromForm] UploadImageRequest request)
async Task<IResult> GetTilesInventory(
[FromBody] TileInventoryRequest? request,
HttpContext httpContext,
ITileService tileService)
{
return Results.Ok(new SaveResult { Success = false });
if (request is null)
{
return Results.Problem(
statusCode: StatusCodes.Status400BadRequest,
title: "Invalid tile inventory request",
detail: "Request body is required.");
}
var tileCount = request.Tiles?.Count ?? 0;
var hashCount = request.LocationHashes?.Count ?? 0;
if ((tileCount == 0) == (hashCount == 0))
{
return Results.Problem(
statusCode: StatusCodes.Status400BadRequest,
title: "Invalid tile inventory request",
detail: "Populate exactly one of `tiles` or `locationHashes`. Sending both, or neither, is not allowed.");
}
var totalCount = Math.Max(tileCount, hashCount);
if (totalCount > TileInventoryLimits.MaxEntriesPerRequest)
{
return Results.Problem(
statusCode: StatusCodes.Status400BadRequest,
title: "Invalid tile inventory request",
detail: $"Inventory request capped at {TileInventoryLimits.MaxEntriesPerRequest} entries; got {totalCount}.");
}
var response = await tileService.GetInventoryAsync(request, httpContext.RequestAborted);
return Results.Ok(response);
}
async Task<IResult> RequestRegion([FromBody] RequestRegionRequest request, IRegionService regionService, ILogger<Program> logger)
async Task<IResult> UploadUavTileBatch(
HttpContext httpContext,
IUavTileUploadHandler handler,
[FromForm] UavTileBatchUploadRequest request)
{
try
{
if (request.SizeMeters < 100 || request.SizeMeters > 10000)
{
return Results.BadRequest(new { error = "Size must be between 100 and 10000 meters" });
}
ArgumentNullException.ThrowIfNull(request);
var status = await regionService.RequestRegionAsync(
request.Id,
request.Latitude,
request.Longitude,
request.SizeMeters,
request.ZoomLevel,
request.StitchTiles);
return Results.Ok(status);
}
catch (Exception ex)
var files = request.Files ?? (IFormFileCollection)new FormFileCollection();
var uploadFiles = new List<UavUploadFile>(files.Count);
foreach (var file in files)
{
logger.LogError(ex, "Failed to request region");
return Results.Problem(detail: ex.Message, statusCode: 500);
await using var stream = file.OpenReadStream();
using var buffer = new MemoryStream(checked((int)file.Length));
await stream.CopyToAsync(buffer, httpContext.RequestAborted);
uploadFiles.Add(new UavUploadFile(file.FileName, file.ContentType, buffer.ToArray()));
}
var result = await handler.HandleAsync(request.Metadata, uploadFiles, httpContext.RequestAborted);
if (result.EnvelopeRejected)
{
return Results.Problem(
statusCode: StatusCodes.Status400BadRequest,
title: "Invalid UAV tile batch",
detail: result.EnvelopeError);
}
return Results.Ok(result.Response);
}
async Task<IResult> GetRegionStatus(Guid id, IRegionService regionService, ILogger<Program> logger)
async Task<IResult> RequestRegion([FromBody] RequestRegionRequest request, IRegionService regionService)
{
try
if (request.SizeMeters < 100 || request.SizeMeters > 10000)
{
var status = await regionService.GetRegionStatusAsync(id);
if (status == null)
{
return Results.NotFound(new { error = $"Region {id} not found" });
}
return Results.BadRequest(new { error = "Size must be between 100 and 10000 meters" });
}
return Results.Ok(status);
}
catch (Exception ex)
var status = await regionService.RequestRegionAsync(
request.Id,
request.Latitude,
request.Longitude,
request.SizeMeters,
request.ZoomLevel,
request.StitchTiles);
return Results.Ok(status);
}
async Task<IResult> GetRegionStatus(Guid id, IRegionService regionService)
{
var status = await regionService.GetRegionStatusAsync(id);
if (status == null)
{
logger.LogError(ex, "Failed to get region status");
return Results.Problem(detail: ex.Message, statusCode: 500);
return Results.NotFound(new { error = $"Region {id} not found" });
}
return Results.Ok(status);
}
async Task<IResult> CreateRoute([FromBody] CreateRouteRequest request, IRouteService routeService, ILogger<Program> logger)
@@ -327,139 +391,16 @@ async Task<IResult> CreateRoute([FromBody] CreateRouteRequest request, IRouteSer
logger.LogWarning(ex, "Invalid route request");
return Results.BadRequest(new { error = ex.Message });
}
catch (Exception ex)
}
async Task<IResult> GetRoute(Guid id, IRouteService routeService)
{
var route = await routeService.GetRouteAsync(id);
if (route == null)
{
logger.LogError(ex, "Failed to create route");
return Results.Problem(detail: ex.Message, statusCode: 500);
}
}
async Task<IResult> GetRoute(Guid id, IRouteService routeService, ILogger<Program> logger)
{
try
{
var route = await routeService.GetRouteAsync(id);
if (route == null)
{
return Results.NotFound(new { error = $"Route {id} not found" });
}
return Results.Ok(route);
}
catch (Exception ex)
{
logger.LogError(ex, "Failed to get route");
return Results.Problem(detail: ex.Message, statusCode: 500);
}
}
public record GetSatelliteTilesResponse
{
public List<SatelliteTile> Tiles { get; set; } = new();
}
public record SatelliteTile
{
public string TileId { get; set; } = string.Empty;
public byte[] ImageData { get; set; } = Array.Empty<byte>();
public double Lat { get; set; }
public double Lon { get; set; }
public int ZoomLevel { get; set; }
}
public record UploadImageRequest
{
[Required]
public DateTime Timestamp { get; set; }
[Required]
public IFormFile? Image { get; set; }
[Required]
public double Lat { get; set; }
[Required]
public double Lon { get; set; }
[Required]
public double Height { get; set; }
[Required]
public double FocalLength { get; set; }
[Required]
public double SensorWidth { get; set; }
[Required]
public double SensorHeight { get; set; }
}
public record SaveResult
{
public bool Success { get; set; }
public string? Exception { get; set; }
}
public record DownloadTileResponse
{
public Guid Id { get; set; }
public int ZoomLevel { get; set; }
public double Latitude { get; set; }
public double Longitude { get; set; }
public double TileSizeMeters { get; set; }
public int TileSizePixels { get; set; }
public string ImageType { get; set; } = string.Empty;
public string? MapsVersion { get; set; }
public int Version { get; set; }
public string FilePath { get; set; } = string.Empty;
public DateTime CreatedAt { get; set; }
public DateTime UpdatedAt { get; set; }
}
public record RequestRegionRequest
{
[Required]
public Guid Id { get; set; }
[Required]
public double Latitude { get; set; }
[Required]
public double Longitude { get; set; }
[Required]
public double SizeMeters { get; set; }
[Required]
public int ZoomLevel { get; set; } = 18;
public bool StitchTiles { get; set; } = false;
}
public class ParameterDescriptionFilter : IOperationFilter
{
public void Apply(OpenApiOperation operation, OperationFilterContext context)
{
if (operation.Parameters == null) return;
var parameterDescriptions = new Dictionary<string, string>
{
["lat"] = "Latitude coordinate where image was captured",
["lon"] = "Longitude coordinate where image was captured",
["mgrs"] = "MGRS coordinate string",
["squareSideMeters"] = "Square side size in meters",
["Latitude"] = "Latitude coordinate of the tile center",
["Longitude"] = "Longitude coordinate of the tile center",
["ZoomLevel"] = "Zoom level for the tile (higher values = more detail)"
};
foreach (var parameter in operation.Parameters)
{
if (parameterDescriptions.TryGetValue(parameter.Name, out var description))
{
parameter.Description = description;
}
}
return Results.NotFound(new { error = $"Route {id} not found" });
}
return Results.Ok(route);
}
@@ -1,24 +1,27 @@
<Project Sdk="Microsoft.NET.Sdk.Web">
<PropertyGroup>
<TargetFramework>net8.0</TargetFramework>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.AspNetCore.OpenApi" Version="8.0.21"/>
<PackageReference Include="Microsoft.AspNetCore.Authentication.JwtBearer" Version="10.0.7" />
<PackageReference Include="Microsoft.AspNetCore.OpenApi" Version="10.0.7"/>
<PackageReference Include="Newtonsoft.Json" Version="13.0.4" />
<PackageReference Include="Serilog.AspNetCore" Version="8.0.3" />
<PackageReference Include="Serilog.Sinks.File" Version="6.0.0" />
<PackageReference Include="SixLabors.ImageSharp" Version="3.1.11" />
<PackageReference Include="Swashbuckle.AspNetCore" Version="6.6.2"/>
<PackageReference Include="Swashbuckle.AspNetCore" Version="10.1.7"/>
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\SatelliteProvider.Common\SatelliteProvider.Common.csproj" />
<ProjectReference Include="..\SatelliteProvider.DataAccess\SatelliteProvider.DataAccess.csproj" />
<ProjectReference Include="..\SatelliteProvider.Services\SatelliteProvider.Services.csproj" />
<ProjectReference Include="..\SatelliteProvider.Services.TileDownloader\SatelliteProvider.Services.TileDownloader.csproj" />
<ProjectReference Include="..\SatelliteProvider.Services.RegionProcessing\SatelliteProvider.Services.RegionProcessing.csproj" />
<ProjectReference Include="..\SatelliteProvider.Services.RouteManagement\SatelliteProvider.Services.RouteManagement.csproj" />
</ItemGroup>
</Project>
@@ -0,0 +1,31 @@
using Microsoft.OpenApi;
using Swashbuckle.AspNetCore.SwaggerGen;
namespace SatelliteProvider.Api.Swagger;
public class ParameterDescriptionFilter : IOperationFilter
{
public void Apply(OpenApiOperation operation, OperationFilterContext context)
{
if (operation.Parameters == null) return;
var parameterDescriptions = new Dictionary<string, string>
{
["lat"] = "Latitude coordinate where image was captured",
["lon"] = "Longitude coordinate where image was captured",
["mgrs"] = "MGRS coordinate string",
["squareSideMeters"] = "Square side size in meters",
["Latitude"] = "Latitude coordinate of the tile center",
["Longitude"] = "Longitude coordinate of the tile center",
["ZoomLevel"] = "Zoom level for the tile (higher values = more detail)"
};
foreach (var parameter in operation.Parameters)
{
if (parameterDescriptions.TryGetValue(parameter.Name, out var description))
{
parameter.Description = description;
}
}
}
}
@@ -10,6 +10,11 @@
"ConnectionStrings": {
"DefaultConnection": "Host=localhost;Port=5432;Database=satelliteprovider;Username=postgres;Password=postgres"
},
"Jwt": {
"Secret": "DEV-ONLY-DO-NOT-USE-IN-PROD-replace-with-real-secret-via-JWT_SECRET-env-var",
"Issuer": "DEV-ONLY-iss-admin-azaion-local",
"Audience": "DEV-ONLY-aud-satellite-provider"
},
"MapConfig": {
"Service": "GoogleMaps",
"ApiKey": ""
+24 -2
View File
@@ -23,9 +23,27 @@
"ConnectionStrings": {
"DefaultConnection": "Host=localhost;Database=satelliteprovider;Username=postgres;Password=postgres"
},
"Jwt": {
"Secret": "",
"Issuer": "",
"Audience": ""
},
"UavQuality": {
"MinBytes": 5120,
"MaxBytes": 5242880,
"MaxAgeDays": 7,
"CapturedAtFutureSkewSeconds": 30,
"MinLuminanceVariance": 10.0,
"MaxBatchSize": 100,
"LuminanceSampleSize": 32
},
"MapConfig": {
"Service": "GoogleMaps",
"ApiKey": ""
"ApiKey": "",
"TileSizePixels": 256,
"AllowedZoomLevels": [ 15, 16, 17, 18, 19 ],
"RetryBaseDelaySeconds": 1,
"RetryMaxDelaySeconds": 30
},
"StorageConfig": {
"TilesDirectory": "./tiles",
@@ -37,7 +55,11 @@
"DefaultZoomLevel": 20,
"QueueCapacity": 1000,
"DelayBetweenRequestsMs": 50,
"SessionTokenReuseCount": 100
"SessionTokenReuseCount": 100,
"RegionProcessingTimeoutSeconds": 300,
"RouteProcessingPollIntervalSeconds": 5,
"MaxRoutePointSpacingMeters": 200.0,
"LatLonTolerance": 0.0001
},
"CorsConfig": {
"AllowedOrigins": []
+13 -3
View File
@@ -1,7 +1,17 @@
namespace SatelliteProvider.Common.Configs;
namespace SatelliteProvider.Common.Configs;
public class MapConfig
{
public string Service { get; set; } = null!;
public string ApiKey { get; set; } = null!;
}
public string ApiKey { get; set; } = null!;
// AZ-371 / C18 — Google Maps tile constants promoted from source literals.
// AZ-377 / C24 — DefaultTileSizePixels is the canonical pixel size of a Google Maps tile;
// sites that cannot inject IOptions<MapConfig> (e.g. DataAccess.TileRepository) reference it
// directly so the value still has one source of truth.
public const int DefaultTileSizePixels = 256;
public int TileSizePixels { get; set; } = DefaultTileSizePixels;
public int[] AllowedZoomLevels { get; set; } = new[] { 15, 16, 17, 18, 19 };
public int RetryBaseDelaySeconds { get; set; } = 1;
public int RetryMaxDelaySeconds { get; set; } = 30;
}
@@ -8,5 +8,11 @@ public class ProcessingConfig
public int QueueCapacity { get; set; } = 100;
public int DelayBetweenRequestsMs { get; set; } = 50;
public int SessionTokenReuseCount { get; set; } = 100;
// AZ-371 / C18 — operational levers promoted from source literals.
public int RegionProcessingTimeoutSeconds { get; set; } = 300;
public int RouteProcessingPollIntervalSeconds { get; set; } = 5;
public double MaxRoutePointSpacingMeters { get; set; } = 200.0;
public double LatLonTolerance { get; set; } = 0.0001;
}
@@ -18,5 +18,30 @@ public class StorageConfig
var fileName = $"tile_{zoomLevel}_{tileX}_{tileY}_{timestamp}.jpg";
return Path.Combine(subdirectory, fileName);
}
// Inverse of GetTileFilePath: parses tile_{zoom}_{x}_{y}_{ts}.jpg.
// Co-located with the writer per AZ-366 / C13 so format changes can never
// drift between the two ends. Pure: no I/O, no logging, no exceptions for
// malformed input — caller decides how to react to a `false` return.
public static bool TryExtractTileCoordinates(string filePath, out int tileX, out int tileY)
{
tileX = -1;
tileY = -1;
ArgumentNullException.ThrowIfNull(filePath);
var filename = Path.GetFileNameWithoutExtension(filePath);
var parts = filename.Split('_');
if (parts.Length >= 4 && parts[0] == "tile" &&
int.TryParse(parts[2], out var x) &&
int.TryParse(parts[3], out var y))
{
tileX = x;
tileY = y;
return true;
}
return false;
}
}
@@ -0,0 +1,14 @@
namespace SatelliteProvider.Common.Configs;
// AZ-488: tunable thresholds for the UAV tile-upload quality gate.
// Defaults are documented in `_docs/02_document/contracts/api/uav-tile-upload.md` v1.0.0.
public class UavQualityConfig
{
public int MinBytes { get; set; } = 5 * 1024;
public int MaxBytes { get; set; } = 5 * 1024 * 1024;
public int MaxAgeDays { get; set; } = 7;
public int CapturedAtFutureSkewSeconds { get; set; } = 30;
public double MinLuminanceVariance { get; set; } = 10.0;
public int MaxBatchSize { get; set; } = 100;
public int LuminanceSampleSize { get; set; } = 32;
}
@@ -9,12 +9,12 @@ public class CreateRouteRequest
public string? Description { get; set; }
public double RegionSizeMeters { get; set; }
public int ZoomLevel { get; set; }
public List<RoutePoint> Points { get; set; } = new();
[JsonPropertyName("geofences")]
public Geofences? Geofences { get; set; }
public bool RequestMaps { get; set; } = false;
public bool CreateTilesZip { get; set; } = false;
}
@@ -0,0 +1,16 @@
namespace SatelliteProvider.Common.DTO;
public record DownloadTileResponse
{
public Guid Id { get; set; }
public int ZoomLevel { get; set; }
public double Latitude { get; set; }
public double Longitude { get; set; }
public double TileSizeMeters { get; set; }
public int TileSizePixels { get; set; }
public string ImageType { get; set; } = string.Empty;
public int? Version { get; set; }
public string FilePath { get; set; } = string.Empty;
public DateTime CreatedAt { get; set; }
public DateTime UpdatedAt { get; set; }
}
@@ -0,0 +1,6 @@
namespace SatelliteProvider.Common.DTO;
public record DownloadedTileInfoV2(
int X, int Y, int ZoomLevel,
double CenterLatitude, double CenterLongitude,
string FilePath, double TileSizeMeters);
@@ -0,0 +1,3 @@
namespace SatelliteProvider.Common.DTO;
public record ExistingTileInfo(double Latitude, double Longitude, int TileZoom);
+3 -3
View File
@@ -5,10 +5,10 @@ namespace SatelliteProvider.Common.DTO;
public class GeoPoint
{
const double PRECISION_TOLERANCE = 0.00005;
[JsonPropertyName("lat")]
public double Lat { get; set; }
[JsonPropertyName("lon")]
public double Lon { get; set; }
@@ -35,4 +35,4 @@ public class GeoPoint
public static bool operator ==(GeoPoint left, GeoPoint right) => Equals(left, right);
public static bool operator !=(GeoPoint left, GeoPoint right) => !Equals(left, right);
}
}

Some files were not shown because too many files have changed in this diff Show More