Files
satellite-provider/_docs/06_metrics/retro_2026-05-11_cycle2.md
T
Oleksandr Bezdieniezhnykh b69cf5640e
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
[AZ-487] [AZ-488] retro: cycle 2 report + structural snapshot
Cycle-2 retrospective covering AZ-487 + AZ-488. Captures six patterns
(duplicate JWT helpers diverged then both broke; pre-existing
test bugs unmasked by downstream test pressure; cycle 1 perf-NFR
action stopped adding scenarios but did not drain backlog; doc-path
F1 carried over twice with no decision; integration test DB
isolation = wallclock workaround; 8 SP friction observable even
with user override). Top-3 improvement actions: consolidate JWT
mint helpers, promote PT-07/PT-08/JWT-attach to real PBI, real
integration DB-reset hook.

LESSONS.md ring buffer now holds 6 entries (testing x3, process x2,
estimation x1).

Structural snapshot: 6 components / 12 PR edges unchanged; contract
coverage 14% -> 29%; new external NuGet edges (JwtBearer 8.0.21 +
ImageSharp 3.1.11) tied to cycle-2 security findings.

Autodev pointer advances to cycle 3 / Step 9 New Task.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 00:43:27 +03:00

19 KiB

Retrospective — Cycle 2 (AZ-487 + AZ-488, 2026-05-11)

Cycle: 2 (Phase B feature cycle, two tasks) Tasks: AZ-487 (JWT validation baseline, 2 SP) + AZ-488 (UAV tile upload + 5-rule quality gate, 8 SP — user-accepted over-cap) Mode: cycle-end (autodev Step 17) Previous retro: retro_2026-05-11.md (cycle 1, AZ-484)

1. Implementation Metrics

Metric Value Δ vs cycle 1
Tasks implemented 2 (AZ-487, AZ-488) +1
Batches 2 (batch 01 + batch 02 of cycle 2) +1
Avg tasks / batch 1.0 unchanged
Complexity points delivered 10 SP (2 + 8) +5 SP
Avg complexity / batch 5.0 SP unchanged
Tasks at-or-below 5 SP cap 1 of 2 (AZ-487) AZ-488 above cap
Source files modified (production) 6 +2
Source files added (production) 11 (JWT extension, permission requirement, UAV gate + handler + DTOs + config + request envelope) +8
Test files added 11 (4 JWT-side unit, 1 permission unit, 3 UAV-gate/handler/path unit, 1 image factory, 1 UAV integration suite, 1 JWT integration suite) +10
Doc files added 4 (uav-tile-upload.md contract, ripple_log_cycle2.md, batch_01_cycle2_report.md, batch_02_cycle2_report.md, plus 2 review files, deploy report) +6
Migrations added 0 (AZ-488 reuses AZ-484 schema) -1
Frozen contracts produced 1 (uav-tile-upload v1.0.0) unchanged trend (cycle 1 also produced 1)
Total source diff in cycle +2,488 / -80 lines across 28 files
Total doc diff in cycle +1,811 / -64 lines across 35 files

2. Quality Metrics

Metric Value Δ vs cycle 1
Code review pass rate 2/2 = 100% (both PASS_WITH_WARNINGS, neither FAIL) unchanged
Code review findings — Critical 0 unchanged
Code review findings — High 0 unchanged
Code review findings — Medium 0 unchanged
Code review findings — Low 6 (2 in batch 01, 4 in batch 02 — but batch-02 F1 is the same item as batch-01 F1 carried over, so 5 distinct items) +5
Distinct findings (deduped) 5 (F1 doc-path missing, F2 dev JWT secret committed, batch-02 F2 mutable JpegMagicBytes, F3 UAV path divergence, F4 byte[].ToArray() per write) +5
Code review FAIL count 0 unchanged
Auto-fix attempts during code review 1 (batch 02 in-flight build fix: removed unused using Microsoft.AspNetCore.Http;) +1
Post-code-review test-failure iterations 5 (CS0104 ambiguity, IDX12401 unit-side, IDX12401 integration-side, uniform-grey test, _coordinateCounter collision) +3 vs cycle 1 (which had 2)
Security audit verdict PASS_WITH_WARNINGS unchanged
Security findings introduced by cycle 2 2 new Medium (F-AUTH-2 iss/aud unvalidated, F-UAV-1+F-DEPS-UAV ImageSharp decode surface) + 4 new Low + 1 Informational +2 Medium

3. Structural Metrics (snapshot: structure_2026-05-11_cycle2.md)

Metric Value Δ vs cycle 1
Components 6 0
ProjectReference edges 12 0 (no new cross-component edges)
Cycles in import graph 0 0 (still clean DAG)
Avg ProjectReferences / component 2.0 0
Architecture violations newly introduced 0
Architecture violations resolved 0
Net architecture delta 0 second clean cycle in a row
Frozen contracts 2 +1 (uav-tile-upload v1.0.0)
Contract coverage % ~29% +15 pp
External NuGet edges added +2 (JwtBearer 8.0.21, ImageSharp 3.1.11)

4. Efficiency Metrics

Metric Value Δ vs cycle 1
Blocked task count 0 unchanged
Tasks completed first attempt (post-review) 0 of 2 — both required test-gate fix iterations unchanged (0 of 1 in cycle 1)
Tasks requiring multiple post-code-review fix commits 2 of 2 — AZ-487 had 3 separate fix: commits; AZ-488 had 1 wallclock fix + 1 in-flight build fix +1 (cycle 1 had 1 of 1)
First-attempt-runtime-pass rate 0 / 2 = 0% unchanged (cycle 1 had 0/1)
Most-findings batch batch 02 (4 findings, all Low)
Spec accuracy ("doc paths exist") 0 of 2 task specs accurate — both referenced _docs/02_document/components/01_web_api/description.md which doesn't exist new pattern this cycle (was AZ-484 specific in cycle 1)
Performance gate scenarios run 0 of 7 active (PT-01..PT-06) + 0 of 2 deferred (PT-07, PT-08) — Step 15 skipped by user, run-performance-tests.sh doesn't attach JWT 0 → 0 (still no perf runs landing)

5. Patterns Identified

Pattern 1 — Duplicate JWT mint helpers diverged then both needed identical fixes

SatelliteProvider.Tests/TestUtilities/JwtTokenFactory.cs and SatelliteProvider.IntegrationTests/JwtTestHelpers.cs were authored as two near-identical copies of the same logic during AZ-487. They were not the same file because the unit and integration test projects don't share a reference. Both contained the same Expires < NotBefore bug for negative-lifetime tokens; both needed the same fix in separate commits (f64d0d7 for unit-side, 11b7074 for integration-side).

This is a classic copy-paste-diverge pattern. The unit-side helper is more mature (has happy + tampered + expired variants); the integration-side helper started as a subset and inherited the same bug. The fix-twice cost was small this time (one extra commit), but if the divergence had been a security-relevant difference the consequences would be larger.

Pattern 2 — Pre-existing test-side bugs unmasked by downstream test pressure

The AZ-487 batch report shipped with status "test gate not yet run" because the implement skill committed before Step 11. Three separate AZ-487 test bugs (CS0104, IDX12401 unit, IDX12401 integration) did not surface until AZ-488 layered new tests on top and ran the full suite. The bugs were introduced by AZ-487 but only observable during AZ-488's test gate.

The fix flow worked correctly (separate fix: commits, no scope creep), but the metric "AZ-487 batch report: 8/8 ACs covered, PASS_WITH_WARNINGS" is misleading — at that point the new tests had not actually been executed end-to-end. A more honest framing: "AZ-487 declared done while its own tests had compile errors masked by build resolution order."

Pattern 3 — Cycle 1 retro Action 2 not executed; perf gate now has 3 deferred items

Cycle 1 Action 2 ("couple NFR additions in test-spec with runner-script updates in the same step") was approved and the lesson appended to LESSONS.md. Cycle 2 added PT-08 to performance-tests.md under the Deferred branch sanctioned by that lesson — which is correct by the letter of the rule. But the underlying harness work (scripts/run-performance-tests.sh PT-07 implementation) still has not landed, and cycle 2 added a third perf-side item (the runner script doesn't attach JWT tokens against the new auth-required endpoints).

The improvement action helped us stop adding undocumented Unverified scenarios. It did NOT help us reduce the backlog. The PT-07 leftover is now 1 cycle old; PT-08 was born deferred; the JWT-attach rot was discovered during the cycle 2 Step 15 gate. The perf gate is currently 0 actual scenarios verified for 2 cycles running.

Pattern 4 — Task-spec doc-path drift repeated, no decision made

Both AZ-487 and AZ-488 task specs referenced _docs/02_document/components/01_web_api/description.md — a folder that has never existed. Both code reviews flagged it as F1 (Low / Style). Both batches worked around it by updating modules/api_program.md instead. The decision item (create the stub folder vs. formalize the WebApi-lives-in-modules convention) is a 1-turn decision; until it's made, every cycle that touches WebApi will repeat the same finding and the same workaround.

Pattern 5 — Integration test state leakage from persistent Postgres volume

UavUploadTests failed mid-test-run with duplicate key value violates unique constraint "idx_tiles_unique_location_source" because _coordinateCounter reset to 0 on every test-runner process start, while the Postgres named volume in docker-compose.yml persists tile rows across docker-compose down cycles. The wallclock-seeded counter (dc3dabe) makes per-run coordinates unique enough to avoid collision, but it's a workaround — the test isn't isolated, it just probabilistically doesn't collide.

The real fixture-isolation options are: (a) docker-compose down -v in scripts/run-tests.sh, (b) explicit DELETE-from-tiles at integration test startup, or (c) a per-test transaction with rollback. Option (b) is the smallest fix; option (a) is the easiest. Option (c) is the proper TDD answer but probably overkill for this codebase.

Pattern 6 — 8 SP task came with measurable friction even with explicit user override

AZ-488 was 8 SP (over the 5 SP user-rule cap), and the cycle landed cleanly — but at observable cost: 1 in-flight build fix (auto-fix during implement), 1 wallclock-counter fix during test-run, 1 PT-08 NFR carried as a Deferred entry, 1 in-flight DTO-layering refactor (Common vs Api placement). None of these were failures. All of them were small, in-isolation harmless edits. But this is exactly the friction profile that the 5 SP guidance is designed to flag, and the cumulative cost should be visible in retros so future "should I split or accept the user override?" decisions are calibrated.

This is NOT a recommendation to never accept user-overrides on SP cap — the cycle 2 user-override was correct given AZ-488's natural unit-of-work shape. It IS a recommendation to track the over-cap friction so we can see if the pattern persists across cycles.

6. Comparison vs. previous retro

Metric Cycle 1 Cycle 2 Direction
First-attempt-runtime-pass rate 0/1 0/2 unchanged at 0%
Code review pass rate 100% 100% unchanged
Test-failure iterations post-review 2 5 worsened — but cycle 2 had a 2-batch surface area (5/2 ≈ 2.5 per batch, slightly higher than cycle 1's 2/1)
Net architecture delta 0 0 unchanged (good)
Contracts coverage % 14% 29% improved +15 pp
Security findings introduced 0 2 Medium worsened — but both are bounded by existing mitigations
Performance scenarios run 0/7 0/8 unchanged at 0% (Action 2 didn't move the needle)

Effectiveness check on cycle 1's top-3 actions:

  • Cycle 1 Action 1 (persisted-enum DB-roundtrip rule) — landed in LESSONS.md and coderule.mdc. AZ-484-style enum bugs did not recur in cycle 2 (cycle 2 didn't introduce any new persisted enums, so the rule wasn't exercised but also wasn't violated). Verdict: cannot evaluate — no triggering work in cycle 2.
  • Cycle 1 Action 2 (NFR + runner same step) — landed in LESSONS.md. Cycle 2 PT-08 was added as Deferred (legal per the rule). But the underlying harness work didn't progress; one new perf-rot item appeared (JWT-attach). Verdict: rule prevented one new failure mode, did not reduce the existing backlog.
  • Cycle 1 Action 3 (periodic doc-base-rate refresh) — recorded as backlog item, never scheduled. The pre-existing tests_unit.md staleness was incidentally refreshed during cycle 2's Step 13 doc sync (we walked the file as part of touching it). Verdict: opportunistic fix happened; the recurring task was not scheduled.

7. Top 3 Improvement Actions (ranked by impact)

Action 1 — Consolidate JWT token-mint helpers into one shared utility

Why: Pattern 1. Same bug existed in two near-identical files (SatelliteProvider.Tests/TestUtilities/JwtTokenFactory.cs and SatelliteProvider.IntegrationTests/JwtTestHelpers.cs); both needed independent fixes. The two files will continue to drift, and the next time they drift on something security-relevant (claim handling, signing algorithm, audience binding) the cost will be larger than one extra commit.

Concrete action:

  • Move the canonical JwtTokenFactory to a shared location that both unit and integration test projects can reference. Two viable shapes:
    • Option A (preferred): create a new tiny test-support project SatelliteProvider.TestSupport (class library, no test framework) holding JwtTokenFactory, UavTileImageFactory, and similar. Reference it from both SatelliteProvider.Tests and SatelliteProvider.IntegrationTests. Cost: ~1 PBI, ~3 SP.
    • Option B (cheaper): have SatelliteProvider.IntegrationTests add a ProjectReference to SatelliteProvider.Tests and consume the existing TestUtilities/JwtTokenFactory.cs directly. Cost: ~1 SP. Downside: integration tests acquire a dependency on unit tests, slightly unusual.
  • Either way, delete IntegrationTests/JwtTestHelpers.cs::MintValidToken and have the integration tests call the unified factory.
  • Add a code-review checklist item to .cursor/skills/code-review/SKILL.md Phase 6 (Cross-Task Consistency): "If two test projects contain the same helper logic, flag as a Medium finding for consolidation."

Owner: whoever picks up the next test-infrastructure PBI. Recommend scheduling as AZ-XXX Unify test JWT mint helper at 3 SP. Estimated impact: removes the dual-fix tax on every future JWT-side change; prevents drift on the surface that mints test credentials.

Action 2 — Schedule PT-07 harness work as actual feature work, not a leftover

Why: Pattern 3 + Comparison checkpoint. Cycle 1 Action 2 made the spec-side honest (no more silent Unverified scenarios) but did nothing to drain the backlog. Cycle 2 added a 3rd deferred item (the script-rot on run-performance-tests.sh). The perf gate has been 0-of-N for two cycles running and is now actively misleading — anyone reading the gate banner sees "skipped, will check next cycle" without realizing "next cycle" never arrives.

Concrete action:

  • Promote _docs/_process_leftovers/2026-05-11_perf-pt07-harness.md items into a real Jira PBI (e.g. AZ-XXX Perf harness: implement PT-07 + PT-08 + JWT-attach in run-performance-tests.sh). 3 SP estimate.
  • The PBI's exit criteria: PT-07 + PT-08 + at least one of PT-01..PT-06 actually return a real measurement (not Unverified) in _docs/06_metrics/perf_<date>.md.
  • After the PBI lands, delete the leftover file. Until it lands, do not accept new Deferred-status NFR entries — either the harness exists or the spec entry doesn't get added. Update .cursor/skills/test-spec/SKILL.md accordingly.

Owner: next planning loop. Sequence it before any AC that has a hard latency or throughput requirement. Estimated impact: turns the perf gate from theatre into a real gate. Reduces the surface area where regressions can hide.

Action 3 — Reset integration-test DB state between runs (real fix, not workaround)

Why: Pattern 5. The wallclock-seed for _coordinateCounter is a workaround that lets cycle 2 ship but doesn't fix the fundamental issue: integration tests are not isolated from the persistent Postgres volume. The next test class that inserts into tiles will have to invent its own collision-avoidance scheme. Eventually two tests will collide despite both using wallclock seeds (e.g. parallel test execution, fast back-to-back runs in CI).

Concrete action:

  • Add a truncate_or_reset_test_data step to SatelliteProvider.IntegrationTests/Program.cs startup that explicitly clears tiles, regions, routes, route_points, route_regions (in FK-safe order) when ASPNETCORE_ENVIRONMENT=Testing (or when an explicit --reset-state flag is passed). The test runner has direct Postgres access already (used by UavUploadTests for seeding).
  • Alternative: extend scripts/run-tests.sh to call docker-compose down -v between --full runs by default, with an explicit --keep-state flag for debugging. Choose this if data-reset-in-runner is too invasive.
  • Document the chosen pattern in _docs/02_document/modules/tests_integration.md so future integration-test authors don't reinvent collision-avoidance.

Owner: same PBI as Action 2 ("Test infrastructure hardening: perf harness + DB isolation") OR a separate 2 SP PBI. Estimated impact: enables parallel test execution in the future; eliminates the recurring "why are my inserts colliding?" debug loop.

Target file Change
.cursor/skills/code-review/SKILL.md (Phase 6) Add the duplicate-helper detection rule from Action 1
.cursor/skills/test-spec/SKILL.md Tighten the cycle-1 Action 2 rule: "Deferred-status entries are allowed at most once per NFR. If a Deferred NFR has not landed by the end of the next cycle in which it was deferred, the spec MUST be locked to that NFR and the harness work scheduled as a real PBI before any new NFR is accepted as Deferred."
.cursor/skills/implement/SKILL.md (test-gate step) Add: "A batch report's 'AC test coverage' claim is provisional until Step 11 (Run Tests) has executed end-to-end. Mark the batch as 'tests-not-yet-exercised' in its report header if the tests were committed but not run." — guards against Pattern 2.
(optional) .cursor/rules/coderule.mdc § Testing Add a note that integration-test data isolation is required, not optional — link to Action 3's chosen pattern when implemented.

9. Decision items carried over (operator)

These are not part of the top-3 improvement actions because they're 1-turn decisions, not multi-cycle initiatives:

  • F1 (cycle 1 + cycle 2 carry): _docs/02_document/components/01_web_api/description.md doesn't exist. Choose: (a) create stub folder that defers to modules/api_program.md, (b) formalize "WebApi has no components/* folder" in the module-layout doc and update all task spec templates to point at modules/api_program.md. Recommendation: (b) — modules/api_program.md is already the de-facto canonical location.
  • R2 (cycle 2 security report): confirm with admin team the expected iss / aud values for production JWT tokens; flip ValidateIssuer / ValidateAudience to true in a small follow-up PBI when the values are known.
  • R3 (cycle 2 deploy report): coordinate with gps-denied-onboard and mission-planner-UI teams to attach Bearer tokens BEFORE promoting cycle-2 image past dev.

10. Self-verification

  • All cycle 2 implementation artifacts parsed (2 batch reports, 2 reviews, 0 cumulative review since K=3 not reached, 1 deploy report)
  • All metric categories computed
  • Structural snapshot written (structure_2026-05-11_cycle2.md); cycle 1 baseline (structure_2026-05-11.md) used for delta calc
  • Comparison vs previous retro (retro_2026-05-11.md) included
  • Top 3 actions are concrete, owner-named, SP-estimated
  • Suggested rule/skill updates are file-specific
  • Cycle-1 action effectiveness explicitly checked