mirror of https://github.com/azaion/satellite-provider.git synced 2026-06-21 09:31:13 +00:00

Oleksandr Bezdieniezhnykh ea278afb37 [AZ-503] [AZ-504] Cycle 5 Step 17: retrospective + close cycle

retro_2026-05-12_cycle5.md captures the cycle-end retrospective:
- Implementation: 2 tasks (AZ-504 + AZ-503-foundation), 4 SP total,
  100% first-attempt pass rate, 1 mid-implement scope-split
  (AZ-503 → AZ-503-foundation + AZ-505, blocked-linked).
- Quality: 50/50 PASS/PASS_WITH_WARNINGS, 0 new Medium+, 1 new Low
  (defensive contentSha256 soft-NULL guard).
- Security: PASS_WITH_WARNINGS, 0 new Critical/High/Medium, 2 new
  Low informational (F1 flightId provenance, F2 pgcrypto runbook
  gap).
- Performance: PASS_WITH_INFRA_WARNINGS — first measurable PT-08
  ever (Run #1 199ms, Run #2 117ms vs 2000ms threshold); PT-01/02
  failed on recurring local Docker/colima DNS cold-start, not an
  app regression.
- Structural: +1 ProjectReference edge (IntegrationTests → Common),
  +1 minor contract bump (uav-tile-upload 1.0.0 → 1.1.0), +1 DB
  migration (014_AddTileIdentityColumns.sql), 0 NuGet bumps,
  0 csproj additions, DAG still acyclic at 9 projects.

structure_2026-05-12_cycle5.md captures the structural snapshot.

LESSONS.md updated with 3 cycle-5 entries (oldest dropped to
preserve the 15-entry ring buffer):
- [architecture] Cross-repo cryptographic invariants must live as
  code constants in both repos with reference-vector tests.
- [tooling] When perf-mode "one re-run" fires twice with the same
  DNS root cause, escalate from re-run to harness fix.
- [process] Spec contradicts live code by >=2 prerequisites →
  prefer split into foundation + follow-up (A/B/C option C).

Top 3 follow-up actions (cycle 6 candidates):
- Action 1 (1 SP): DNS pre-warm in scripts/run-performance-tests.sh
  → closes the cycle-3 perf-harness leftover.
- Action 2 (5 SP): AZ-505 — inventory endpoint + HTTP/2 + Leaflet
  covering index (blocked-linked on AZ-503-foundation, this cycle).
- Action 3 (1 SP): pgcrypto pre-install runbook step (F2-cy5 doc
  fix).

Cycle 5 closed. Autodev state advanced for cycle 6 by the next
/autodev invocation.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-05-12 18:07:57 +03:00

Engineering Lessons

Recurring bugs, surprising library behaviors, and process insights extracted from completed cycles. Newest at the top. Keep entries short — this is for fast scanning at the start of new cycles, not exhaustive history.

This file has two layers:

- Deep engineering lessons (L-NNN): library bugs, architectural insights, multi-paragraph context. Persist forever.
- Ring buffer at the bottom: most recent 15 single-sentence lessons emitted by the retrospective skill, consumed by new-task / plan / decompose / autodev Step 0. Newest entries sit at the top of the buffer; the oldest entry drops off the bottom when a new one is added.

Categories: estimation · architecture · testing · dependencies · tooling · process

L-001 — Dapper `TypeHandler<T>` is bypassed for enum types during read deserialization

Cycle: 1 (AZ-484)
Discovered by: integration test failure (Error parsing column 12 (source=google_maps - String)); root-caused via web search to long-standing Dapper issue #259.
Affects: Dapper 2.1.35 (and most other versions until the proposed Settings.PreferTypeHandlersForEnums opt-in in PR #2200, not yet merged).

What happens

Registering an enum handler with SqlMapper.AddTypeHandler(new MyEnumHandler()), even one that derives from SqlMapper.TypeHandler<TEnum>, works for writes (the handler's SetValue is invoked for parameter binding) but is silently bypassed for reads. Dapper's IL-emitted deserializer checks IsEnum first and, for string columns, falls back to a case-insensitive name parse (Enum.Parse(type, value, ignoreCase: true)), so the handler's read path never runs.

Why this is dangerous

If the enum's wire string happens to match a member name case-insensitively (e.g., RegionStatus.Failed ↔ "failed"), the bypass goes unnoticed and the round trip works by accident. The bug surfaces only when the wire format diverges from the C# member name (e.g., TileSource.GoogleMaps ↔ "google_maps": no member is named google_maps, so the name parse fails on the underscore and Dapper reports a column-parsing error).

Recommended approach

- Do not rely on SqlMapper.TypeHandler<TEnum> for read-side enum mapping unless the wire values match the enum member names case-insensitively.
- For enums whose wire format diverges (snake_case, kebab-case, custom IDs), store the entity field as string and provide an explicit converter (*Converter.ToWireValue / FromWireValue) for use at the service-layer boundary. This is what AZ-484 does for TileEntity.Source ↔ TileSourceConverter.
- Unit-test the converter directly. Do not assume that round-tripping through Dapper proves anything for enums.
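
The converter approach above can be sketched roughly as follows. This is an illustration, not the AZ-484 code: only the names TileSourceConverter, ToWireValue, FromWireValue, TileSource.GoogleMaps and the "google_maps" wire value come from this lesson. The Uav member, its "uav" wire value, the exception types, and the ConverterDemo class are assumptions made for the example.

```csharp
using System;

public enum TileSource
{
    GoogleMaps,
    Uav, // hypothetical second member, for illustration only
}

public static class TileSourceConverter
{
    // Enum -> wire string stored in the database column.
    public static string ToWireValue(TileSource source) => source switch
    {
        TileSource.GoogleMaps => "google_maps",
        TileSource.Uav => "uav", // assumed wire value
        _ => throw new ArgumentOutOfRangeException(nameof(source), source, "No wire value mapped"),
    };

    // Wire string -> enum. Exact, case-sensitive match; unknown or null values throw.
    public static TileSource FromWireValue(string wire) => wire switch
    {
        "google_maps" => TileSource.GoogleMaps,
        "uav" => TileSource.Uav,
        _ => throw new FormatException($"Unknown TileSource wire value: '{wire}'"),
    };
}

public static class ConverterDemo
{
    public static void Main()
    {
        // A case-insensitive name parse cannot map the snake_case wire string.
        bool parsed = Enum.TryParse("google_maps", ignoreCase: true, out TileSource _);
        Console.WriteLine(parsed); // False

        Console.WriteLine(TileSourceConverter.FromWireValue("google_maps"));       // GoogleMaps
        Console.WriteLine(TileSourceConverter.ToWireValue(TileSource.GoogleMaps)); // google_maps
    }
}
```

With this shape the entity keeps a plain string property, so Dapper never has to deserialize an enum, and the mapping can be unit-tested without a database.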

Detection

- Unit tests of the type handler in isolation will pass even when the handler is bypassed at runtime.
- Failure surfaces only at integration-test time when the actual SELECT runs.
- If you must keep an enum-typed field, write at minimum one integration test that reads the enum back through Dapper from a real database row.

Ring buffer (last 15 entries — newest at top)

[2026-05-12] [architecture] Cross-repo cryptographic invariants (UUID namespaces, deterministic-key formulas, base32/64 alphabets, tile-zoom conventions) MUST live as code-level constants in BOTH repos with reference-vector tests on BOTH sides — documentation alone is insufficient because constant drift surfaces only as 100% lookup misses in production, harder to detect than a unit-test failure (cycle 5: AZ-503 introduced TileNamespace = 5b8d0c2e-7f1a-4d3b-9c5e-1f3a8e7d2b6c which must byte-match the same constant in gps-denied-onboard/components/c6_tile_cache/_uuid.py; the satellite-provider side has the constant + 10 Python-generated reference vectors in Uuidv5Tests.cs and the sibling repo will mirror). Source: _docs/06_metrics/retro_2026-05-12_cycle5.md
[2026-05-12] [tooling] Local Docker/colima DNS cold-start is a recurring class of failure that contaminates the Step-15 perf gate — when the perf-mode "one re-run" rule fires twice across consecutive cycles with the same root-cause class (DNS / NTP / resolver), the harness must escalate from "re-run" to a deterministic fix at the harness layer (DNS pre-warm in script, OR move gate to CI), not just another re-run (cycle 5: PT-01 failed Run #1 on tile.googleapis.com cold-start, then Run #2 on mt0.google.com cold-start; the warmup probe between runs only touched the hostnames it explicitly named). Source: _docs/06_metrics/retro_2026-05-12_cycle5.md
[2026-05-12] [process] When a /autodev cycle's task spec contradicts the live codebase by ≥2 missing prerequisites, the implement skill should preferentially split into foundation + follow-up via A/B/C (option C) rather than (A) silently expand the SP budget or (B) defer the entire task — both halves remain individually shippable and individually testable, the cross-PBI dependency is captured as a blocked-link in the tracker (cycle 5: AZ-503 → AZ-503-foundation + AZ-505 split when flight_id / FlightId / voting_status all missing from live code; AZ-503-foundation shipped this cycle, AZ-505 blocks-on-it for cycle 6). Source: _docs/06_metrics/retro_2026-05-12_cycle5.md
[2026-05-12] [dependencies] Major-version bumps of direct deps cascade through transitives; the task spec must list the transitive packages whose major version changes as a result OR explicitly note "transitive major-version drift not analyzed in spec" — verify with dotnet restore --dry-run against a scratch branch before writing the spec (cycle 4: AZ-500 surprise-bumped Microsoft.OpenApi 1.x → 2.x via the Microsoft.AspNetCore.OpenApi 8.0.25 → 10.0.7 path; forced an unscheduled Swashbuckle bump + Program.cs refactor mid-implementation). Source: _docs/06_metrics/retro_2026-05-12_cycle4.md
[2026-05-12] [process] When a scope-protected task newly exposes a pre-existing bug elsewhere in the codebase (vs. introducing a new one), surface it as a recommended follow-up PBI in the batch report AND list it as a "newly exposed bug" separate from "newly introduced findings" in the deploy report — bugs that already existed don't count as cycle-introduced regressions, but they must not be silently re-buried (cycle 4: AZ-500's bootstrap fix unmasked the pre-existing scripts/run-performance-tests.sh:417 grep -o | wc -l + pipefail bug). Source: _docs/06_metrics/retro_2026-05-12_cycle4.md
[2026-05-12] [process] When a cycle has a single non-functional task (migration / refactor / dependency hygiene), the retro must reframe the metric set around continuity (0 regressions), forward-resolution (prior findings closed by the bump itself), and unblocking (capabilities now exercisable end-to-end) — task count + complexity points read as misleading flatlines that look like under-productivity (cycle 4: AZ-500 alone delivered 5 SP vs cycle 3's 18 SP, but the cycle's value was forward-resolving 2 cycle-3 advisories and finally executing PT-01..PT-08 end-to-end against the migrated build). Source: _docs/06_metrics/retro_2026-05-12_cycle4.md
[2026-05-12] [process] For cross-team blockers (admin team must supply config values, etc.), prefer an Option-B forcing function (ship the validation/scaffolding with prod-empty config that fails-fast at deploy) over deferring the entire task — the fail-fast contract makes the cross-team conversation impossible to skip and ships the in-workspace work in the current cycle (cycle 3: AZ-494 shipped iss/aud validation with empty prod appsettings so deploy must supply real values). Source: _docs/06_metrics/retro_2026-05-12_cycle3.md
[2026-05-12] [process] ACs that prescribe a specific measurement or sentinel mechanism (e.g. "per-item latency < 50ms", "guard fires when DB name contains _test") should also prescribe — or explicitly defer — the path for collecting / enforcing it, or implementations will substitute proxies / equivalents that look like spec drift in review (cycle 3: AZ-492 PT-08 per-item gate cost became a derived proxy; AZ-493 DB-name guard became Host-allowlist). Source: _docs/06_metrics/retro_2026-05-12_cycle3.md
[2026-05-12] [process] ACs that require cross-repo writes should be tagged with the target workspace and rendered separately in the traceability matrix — mixing them with in-workspace ACs makes "correctly deferred" indistinguishable from "incomplete work" (cycle 3: AZ-494 AC-7 deferred for the suite-repo write; matrix renders as ◐ deferred which is ambiguous). Source: _docs/06_metrics/retro_2026-05-12_cycle3.md
[2026-05-11] [testing] Test helpers shared across unit + integration projects must live in one consolidated location — duplicate near-identical copies will diverge and require parallel fixes (cycle 2: JwtTokenFactory.cs and JwtTestHelpers.cs had the same Expires < NotBefore bug fixed in separate commits). Source: _docs/06_metrics/retro_2026-05-11_cycle2.md
[2026-05-11] [process] Deferred-status NFR entries are allowed at most ONCE per NFR — if a Deferred NFR has not landed by the end of the cycle that follows the one in which it was deferred, the harness work must be promoted to a real PBI before any new NFR is accepted as Deferred (cycle 2 inherited cycle 1's PT-07 + added PT-08 + JWT-attach script-rot). Source: _docs/06_metrics/retro_2026-05-11_cycle2.md
[2026-05-11] [testing] Integration tests must explicitly reset DB state at startup — relying on wallclock seeds or "tests probably won't collide" is a workaround, not isolation; the persistent Postgres volume in docker-compose makes test data accumulation the default state (cycle 2: UavUploadTests._coordinateCounter collision was patched with a wallclock seed instead of a real DB-reset hook). Source: _docs/06_metrics/retro_2026-05-11_cycle2.md
[2026-05-11] [testing] Persisted enums need a Dapper read-roundtrip integration test — unit-testing the type handler in isolation does not prove read-side behavior (see L-001). Source: _docs/06_metrics/retro_2026-05-11.md
[2026-05-11] [process] NFR test-spec additions must include the runner-script implementation in the same step, or be tagged "Deferred — harness work tracked in "; otherwise scenarios accumulate as Unverified across cycles. Source: _docs/06_metrics/retro_2026-05-11.md
[2026-05-11] [estimation] Task-spec test-site-count estimates must be backed by an explicit grep evidence block, not pattern-matched against neighboring code (AZ-484 spec said ~3 sites in RegionServiceTests; actual = 0). Source: _docs/06_metrics/retro_2026-05-11.md
