[AZ-350] Refactor 03 Phase 2: roadmap + 27 task specs + safety net

Adds Phase 0 (baseline metrics, .gitignore tweaks), Phase 1
(research findings, list-of-changes), and Phase 2 (refactoring
roadmap, epic AZ-350, 27 task specs AZ-351..AZ-380, dependency
table updates) for the 03-code-quality-refactoring run.

Phase 3 (Safety Net) re-verified: 40/40 unit + 5/5 smoke
integration pass; documented in test_specs/existing_coverage.md.
Coverage % gating deferred to ticket C19 (AZ-372) which adds
Coverlet + reportgenerator.

Auto-chains to Phase 4 (Execution) via /implement starting at
batch 1 (Phase 1 critical fixes).

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-10 23:26:07 +03:00
parent 524809d77d
commit ff030a9521
36 changed files with 2570 additions and 15 deletions
@@ -0,0 +1,155 @@
# Phase 2 — Refactoring Roadmap (03-code-quality-refactoring)
**Date**: 2026-05-10
**Total changes**: 27 (C01C27)
**Selected hardening tracks**: Track A — Technical Debt (extra sweep produced C23C27)
**Total estimated complexity**: ~66 story points across 4 execution phases
## Weak-Point Assessment (per area)
| Area | Symptom | Files | Driver change(s) |
|------|---------|-------|-------------------|
| API exception handling | per-endpoint try/catch leaks `ex.Message` to clients | `Program.cs` (six endpoints) | C03 |
| API security defaults | CORS opens to `*` when `AllowedOrigins` empty | `Program.cs:37-47` | C04 |
| API contract honesty | stub endpoints return 200 OK | `Program.cs:177-185` | C05 |
| Startup observability | null logger handed to migrator | `Program.cs:82-83` | C01 |
| Tile cache lifecycle | year-based version invalidates cache annually | `TileService.cs`, `TileRepository.cs`, migrations | C06 |
| Async pipeline failure mode | 9-way duplicated catch ladder | `RegionService.cs:148-197` | C07 |
| DI hygiene | service-locator pattern in worker | `RouteProcessingService.cs:18-22` | C08 |
| API idempotency | retried POST → 500 on duplicate `Id` | `Program.cs`, `RegionService.cs`, `RouteService.cs` | C09 |
| Concurrency | non-atomic counters | `RegionRequestQueue.cs:12-13` | C10 |
| God-class | 750-LOC `BackgroundService` with 6 responsibilities | `RouteProcessingService.cs` | C11 |
| Long method | 165-LOC `CreateRouteAsync` | `RouteService.cs:27-211` | C12 |
| Duplication | Haversine, CSV, stitcher, Earth constants, SQL columns | multiple | C13, C14, C15, C24, C26 |
| Inline DTOs | six DTOs at the bottom of `Program.cs` | `Program.cs:272-353` | C16 |
| Typing | status / point-type bare strings (+ AC drift) | `RegionService.cs`, `RouteService.cs`, `acceptance_criteria.md` | C17 |
| Configuration | magic 5 min, 200 m, 5 s, 0.0001, retry delays, allowed zooms | multiple | C18 |
| Tooling | no formatter / analyzer / coverage | solution root | C19 |
| Versioning semantics | `MapsVersion` is a date label, not a version | `TileService.cs:154` | C20 |
| HTTP client setup | per-call header configuration repeated × 3 | `GoogleMapsDownloaderV2.cs` | C21 |
| Algorithmic | O(N²) existing-tile lookup | `GoogleMapsDownloaderV2.cs:245-265` | C22 |
| Dead code (Phase 2a sweep) | unused `FindExistingTileAsync` | `TileRepository.cs:51-76` | C23 |
| Dead code (Phase 2a sweep) | duplicate Earth constants + magic 111000 | `TileRepository.cs:82-91` etc. | C24 |
| Dead code (Phase 2a sweep) | unused `_logger` fields in repositories | `*Repository.cs:11` | C25 |
| Dead code (Phase 2a sweep) | repeated SELECT column lists | `*Repository.cs` | C26 |
| Dead code (Phase 2a sweep) | trivial alias `CalculatePolygonDiagonalDistance` | `GeoUtils.cs:129-132` | C27 |
## Gap Analysis (versus acceptance criteria)
| AC | Current state | Gap | Closed by |
|----|---------------|-----|-----------|
| T1 | Cache key includes `version`; invalidates yearly | Wording must change to "(lat, lon, zoom_level, tile_size_meters); duplicates collapsed via DB unique constraint" | C06 (also updates restrictions.md K6) |
| T2 | Concurrent download limit (4) enforced | None | — |
| T3 | Tile stored on disk | None | — |
| T4 | Tile metadata persisted | None | — |
| R1 | Region transitions through correct states | bare strings → enum | C17 |
| R2 / R3 / R4 / R5 / R6 | Files generated, sizes correct | None (preserved by C11/C14/C15) | — |
| RT1 | Intermediate points every ~200 m | 200 m is hardcoded | C18 |
| RT2 | "original / intermediate" point types | **Drift K8**: code uses `start`/`end`/`action`/`intermediate` (4 values). User-confirmed option α (keep code, update AC). | C17 |
| RT3 | Total distance via Haversine | Two implementations exist | C13 |
| RT4 | Geofence filtering | None | — |
| RT5 | ZIP ≤ 50 MB | None (preserved by C11's `TilesZipBuilder`) | — |
| RT6 | Route map stitched | None (preserved by C11's `RouteImageRenderer`) | — |
| A1 / A2 / A3 | Endpoints behave correctly | 500 leakage on errors; 500 on duplicate POST | C03, C09 |
| S1 | Migrations run on startup | Null logger to migrator | C01 |
| S2 | Queue rejects when full | None | — |
| S3 | Failed regions marked failed | 9-way catch ladder; one classification helper still leaves S3 satisfied | C07 |
No AC is **regressed** by this run; T1 / RT2 wording must be updated alongside C06 / C17 to track the implementation.
## Phased Execution Plan
The four phases are designed to be **executed in order**, each independently shippable. Each phase ends with smoke + unit suite green.
### Execution Phase 1 — Critical fixes (cheap, high return)
**Estimated**: 12 pts · 6 changes · low risk overall
| Order | ID | Title | Pts | Risk |
|-------|----|-------|-----|------|
| 1 | C01 | Fix null logger to `DatabaseMigrator` | 2 | low |
| 2 | C02 | Remove empty catch in `ExtractTileCoordinatesFromFilename` | 2 | low |
| 3 | C10 | Delete write-only counters in `RegionRequestQueue` | 1 | low |
| 4 | C05 | Stub endpoints return 501 | 2 | low |
| 5 | C04 | Strict CORS by default | 2 | low |
| 6 | C03 | Sanitize 5xx responses via `IExceptionHandler` | 3 | medium (changes 500 body shape) |
Why first: each one is self-contained, fixes a real correctness/security issue, and leaves the codebase observably better.
### Execution Phase 2 — High-value correctness
**Estimated**: 11 pts · 3 changes · medium risk
| Order | ID | Title | Pts | Risk |
|-------|----|-------|-----|------|
| 7 | C07 | Consolidate `RegionService.ProcessRegionAsync` catch ladder | 3 | low |
| 8 | C06 | Drop tile `Version`; latest row wins; new migration | 5 | medium (DB migration) |
| 9 | C09 | Idempotency contract for caller-supplied GUIDs | 3 | medium (API behavior change on duplicate POST) |
Why second: C06 + C09 change DB / API behavior. Doing them after Phase 1 means the safety net (smoke + unit suite) is already operating against the sanitized error paths from C03. C06 must precede C20 in Phase 4.
### Execution Phase 3 — Structural cleanup (SRP + duplication)
**Estimated**: 21 pts · 7 changes · medium risk
| Order | ID | Title | Pts | Risk |
|-------|----|-------|-----|------|
| 10 | C13 | Consolidate Haversine + filename parser | 2 | low |
| 11 | C24 | Consolidate Earth constants and 111000 (mostly into `GeoUtils`) | 2 | low |
| 12 | C15 | Shared `TileCsvWriter` | 2 | low |
| 13 | C14 | Shared `TileGridStitcher` (region + route) | 3 | medium (image output verified) |
| 14 | C16 | Move inline DTOs out of `Program.cs` | 2 | low |
| 15 | C12 | Decompose `RouteService.CreateRouteAsync` (validator + builder + grid + mapper) | 5 | low |
| 16 | C11 | Decompose `RouteProcessingService` (6 collaborators) | 5 | medium (large file, tested end-to-end) |
| 17 | C08 | Replace `IServiceProvider` with `IRegionService` (folded into C11) | 2 | low |
C13/C24/C15/C14 land first so the bigger decompositions (C11/C12) reuse them.
### Execution Phase 4 — Typing, config, tooling, polish
**Estimated**: 22 pts · 11 changes · low risk
| Order | ID | Title | Pts | Risk |
|-------|----|-------|-----|------|
| 18 | C18 | Move magic numbers to `ProcessingConfig` / `MapConfig` | 3 | low |
| 19 | C17 | Status / point-type enums + AC RT2 update | 3 | low |
| 20 | C20 | Clarify / drop `MapsVersion` | 2 | low |
| 21 | C21 | Typed `HttpClient` for Google Maps | 2 | low |
| 22 | C22 | O(N) existing-tile lookup (HashSet) | 2 | low |
| 23 | C23 | Delete unused `FindExistingTileAsync` | 1 | low |
| 24 | C25 | `_logger` fields: delete or use for slow-query log | 1 | low |
| 25 | C26 | Extract repository SELECT column constants | 2 | low |
| 26 | C27 | Delete `CalculatePolygonDiagonalDistance` | 1 | low |
| 27 | C19 | Add `dotnet format`, NetAnalyzers, Coverlet | 3 | low |
C18 first so C22 can pick up its tolerance constant from config. C19 last — the analyzer flood is easiest to address once the larger refactors have settled the surface.
## Hardening Track Items (Track A — Technical Debt)
The user selected Track A. Items C23C27 were generated by the Phase 2a sweep and slot into Execution Phase 4. No additional hardening items remain unaddressed.
Tracks B (Performance) and C (Security) were not selected. Their incidental coverage in this run:
- Performance — only C22 addresses an algorithmic hotspot. No deeper profiling is performed.
- Security — C03 (info disclosure) and C04 (CORS default) cover the two most concrete findings; no OWASP sweep performed.
## Applicability Gate (per skill: every roadmap item is `Selected`)
All 27 items have `Selected` status in `research_findings.md` (with C06 / C17 documentation updates explicitly approved by the user — α + drop-version directions). Zero items are `Rejected`, `Experimental only`, or `Needs user decision`. **Gate cleared.**
## Constraints Re-Verified at Roadmap Time
- **K1** (.NET 8 LTS): no upgrade proposed; all `IExceptionHandler`, `IOptions`, Dapper type handlers are .NET 8-native.
- **K2** (Postgres 16): C06's `INSERT … ON CONFLICT … DO UPDATE` is supported.
- **K3** (ImageSharp 3.1.11): C14's stitcher uses the existing dependency.
- **K4** (single instance): no distributed-system idioms introduced.
- **K5** (no auth): not affected by this run.
- **K6 / K7** (year-based versioning, T1 wording): updated by C06 ticket.
- **K8** (RT2 drift): updated by C17 ticket.
- **K9** (50 MB ZIP cap): preserved by C11's `TilesZipBuilder`.
- **K10** (smoke + unit green): each ticket runs the suite at the end.
## Self-Verification
- [x] All AC mapped to changes; only T1 + RT2 require wording updates (already attached to C06 / C17).
- [x] All 27 changes are `Selected` per Phase 2a constraint-fit table.
- [x] No item exceeds 5 complexity points (largest are C06, C11, C12 at 5 each).
- [x] Hardening track A items (C23C27) are accounted for in Execution Phase 4.
- [x] Phase ordering respects dependencies (C18 before C22, C03 before C09, C13/C24/C14/C15 before C11/C12).
- [x] No circular dependencies between change IDs.
- [x] Roadmap stays inside the constraint matrix; no ❌ or ❓ cells in `research_findings.md`.
@@ -0,0 +1,142 @@
# Phase 2a — Research Findings (03-code-quality-refactoring)
**Date**: 2026-05-10
**Mode**: Automatic
**Run scope**: 22 changes from `list-of-changes.md` (post-user-edit on C06 and C10).
## Project Constraint Matrix
Extracted from `_docs/00_problem/restrictions.md`, `_docs/00_problem/acceptance_criteria.md`, and current code.
| # | Constraint | Source | Implication for this run |
|---|-----------|--------|---------------------------|
| K1 | Runtime: .NET 8.0 (LTS) | `restrictions.md` §Software | Pattern recommendations limited to .NET 8 features. No upgrade to .NET 9. |
| K2 | Database: PostgreSQL 16 | `restrictions.md` §Software | C06 migration must be Postgres-compatible (`INSERT … ON CONFLICT … DO UPDATE`). |
| K3 | Image processing: SixLabors.ImageSharp 3.1.11 | `restrictions.md` §Software | C14 (shared `TileGridStitcher`) keeps ImageSharp; no replacement library considered. |
| K4 | Single-instance deployment | `restrictions.md` §Operational | C10 stays simple; no need for distributed counters. C09 idempotency handled in-process via DB unique constraints. |
| K5 | No authentication middleware | `restrictions.md` §Environment | C03 sanitization is still needed (5xx leakage); C04 CORS hardening is still needed. Auth itself is out of scope. |
| K6 | Tile versioning policy: year-based integer | `restrictions.md` line 23 | **CONFLICT with user-edited C06**: C06 removes year-based versioning. Roadmap must include a documentation update — the line in `restrictions.md` becomes "no version concept; latest row wins". User confirmed this direction in chat. |
| K7 | Acceptance T1 — cache key includes `version` | `acceptance_criteria.md` T1 | **CONFLICT with C06**: T1 must be rewritten as "0 duplicate downloads for same (lat, lon, zoom_level, tile_size_meters); duplicates collapsed via DB unique constraint". User-confirmed direction. |
| K8 | Acceptance RT2 — point types are `original` and `intermediate` | `acceptance_criteria.md` RT2 | **PRE-EXISTING DRIFT**: actual code uses `start`/`end`/`action`/`intermediate`. C17 enum work must reconcile — pick one canonical set and update either AC or code. Surfaced for user decision. |
| K9 | Max ZIP archive size: 50 MB | `restrictions.md` line 22 | C11 (decompose `RouteProcessingService`) keeps the 50 MB cap; refactor must not weaken `RT5`. |
| K10 | Smoke + unit suite must remain green | this run's Phase 0 goals | All changes verified at Phase 6. |
No `_docs/02_document/contracts/` directory exists, so there are no formal contract files to drift against. Module ownership is governed by `_docs/02_document/module-layout.md`, which the recent AZ-315 sync brought current.
## Current-State Analysis (by concern)
### Error handling
- **Strengths**: most repository / downloader exceptions are caught, logged, and re-thrown. `GoogleMapsDownloaderV2.ExecuteWithRetryAsync` correctly distinguishes 429 / 5xx (retry) from 401 / 403 (don't retry).
- **Weaknesses**:
- **Silent suppression** in `RouteProcessingService.ExtractTileCoordinatesFromFilename` (empty `catch { }`) — `coderule.mdc` violation.
- **9-way duplicated catch ladder** in `RegionService.ProcessRegionAsync` — single-reason-to-change rule.
- **Information leakage** at every API endpoint: `Results.Problem(detail: ex.Message, statusCode: 500)` ships internal text to the client.
- **Per-endpoint try/catch boilerplate** repeats six times in `Program.cs`.
### Single Responsibility
- **Strengths**: project boundaries (post-`02-coupling`) are clean. The three `Services.*` siblings each own one concern.
- **Weaknesses**:
- `RouteProcessingService` (~750 LOC) does queue polling + region matching + CSV I/O + summary writing + image stitching + drawing + ZIP creation + cleanup.
- `RouteService.CreateRouteAsync` is one 165-LOC method doing validation + interpolation + persistence + geofence-grid + region-request orchestration.
- `Program.cs` hosts six DTOs and one Swagger filter at the bottom.
### Duplication
- **Strengths**: very little duplication across DI extensions or repositories.
- **Weaknesses**:
- Haversine implemented twice (`GeoUtils.CalculateDistance` and `RouteProcessingService.CalculateDistance`).
- CSV writer duplicated (`RegionService.GenerateCsvFileAsync` and `RouteProcessingService.GenerateRouteCsvAsync`).
- Image stitching duplicated (region: red cross at center; route: rectangles + crosses) over the same primitive grid loop.
- Magic `0.0001` lat/lon tolerance duplicated (RouteService geofence check; downloader existing-tile match).
- Per-call `HttpClient` configuration duplicated across three call sites in `GoogleMapsDownloaderV2`.
### Magic numbers / strings
- 5-minute region timeout, 200 m max point spacing, 5 s polling, retry 1/30 s base/max, 256 px tile, 50 MB cap implied — all hardcoded.
- Status strings (`queued`/`processing`/`completed`/`failed`) and point-type strings (`start`/`end`/`action`/`intermediate`) are bare literals across multiple files.
### Configuration / cancellation
- Most async paths accept `CancellationToken`, but at least three known sites do not propagate it (e.g., `Program.cs:GetTileByLatLon``DownloadAndStoreSingleTileAsync` drops `httpContext.RequestAborted`).
- `_serviceProvider`-based scope creation in `RouteProcessingService` masks the real dependency on `IRegionService`.
### Tooling
- No `.editorconfig`-driven formatter, no `dotnet format` in CI, no Roslyn analyzers beyond defaults, no Coverlet for coverage. Style and basic correctness drift through unchecked.
## Modern-Approach Survey
For each concern, the right answer is **a built-in .NET 8 feature**, not a new library. No replacement library/SDK is being introduced by the structural changes. The only exception is C19, which adds two well-known **tooling** packages (analyzer + coverage collector). Therefore the `context7`-MVE protocol applies only to C19; the structural changes use existing capabilities of the runtime and existing project libraries.
| Concern | Current pattern | .NET 8 idiom (selected) | Adopted in change |
|---------|-----------------|-------------------------|-------------------|
| Endpoint exception leakage | per-endpoint try/catch returning `Results.Problem(detail: ex.Message)` | `IExceptionHandler` (NET8) registered via `builder.Services.AddExceptionHandler<>()` + `app.UseExceptionHandler()` returns sanitized `ProblemDetails`; correlation ID via `Activity.Current?.Id` | C03 |
| Service-locator | constructor-inject `IServiceProvider` and `CreateScope()` per loop | constructor-inject the actual dependency (`IRegionService`); for true scoped consumption use `IServiceScopeFactory` instead of `IServiceProvider` | C08 |
| Status / type magic strings | bare strings | `enum` + Dapper `SqlMapper.AddTypeHandler<MyEnum>(new EnumStringTypeHandler<MyEnum>())` to keep DB schema unchanged | C17 |
| Config sprawl | `private const` literals | `IOptions<ProcessingConfig>` / `IOptions<MapConfig>` (already in use elsewhere) | C18 |
| HttpClient configuration | `_httpClientFactory.CreateClient()` + per-call `User-Agent` setup | `builder.Services.AddHttpClient("GoogleMapsTiles", c => { c.DefaultRequestHeaders.UserAgent.ParseAdd(USER_AGENT); c.Timeout = …; })` | C21 |
| Region 9-way catch ladder | one `catch` per exception type with shared body | one `catch (Exception ex)` + an exception-classifier helper returning a typed `RegionFailureCategory` enum used to build the error message | C07 |
| Existing-tile lookup O(N²) | linear `FirstOrDefault` per cell | `HashSet<(int x, int y, int z)>` built once before the loop | C22 |
| Idempotency for caller GUIDs | `INSERT` + 500 on duplicate | DB unique constraint + repository upsert pattern (`INSERT … ON CONFLICT (id) DO NOTHING` + read-back) returning 200 with the existing resource | C09 |
## Constraint-Fit Table (per change)
For each change ID, status is one of: `Selected`, `Rejected`, `Experimental only`, `Needs user decision`.
| ID | Title (short) | Pinned approach | Constraint conflicts | Status |
|----|---------------|-----------------|---------------------|--------|
| C01 | Fix null logger to migrator | `GetRequiredService<ILogger<DatabaseMigrator>>()` | None | Selected |
| C02 | Remove empty catch in tile-coord parser | log+rethrow narrow exception types | None | Selected |
| C03 | Sanitize 500 responses | `IExceptionHandler` + correlation ID | None | Selected |
| C04 | Strict CORS by default | fail-fast in Production if `AllowedOrigins` empty | None | Selected |
| C05 | Stub endpoints return 501 | `Results.StatusCode(501)` | None | Selected |
| C06 | Drop `Version` concept; latest tile wins | repository upsert with unique `(lat, lon, zoom, size)` | **K6** (`restrictions.md` line 23), **K7** (T1) — both require doc updates as part of the change | Selected (user-confirmed; doc updates included in ticket) |
| C07 | Consolidate 9-way catch ladder | one catch + classifier | None | Selected |
| C08 | Replace `IServiceProvider` with `IRegionService` | direct DI of singleton | None | Selected |
| C09 | Idempotency contract for caller GUIDs | upsert + 200 on duplicate | None (T1 is unaffected; A1 still 200) | Selected |
| C10 | Remove counters from `RegionRequestQueue` | delete fields | None | Selected |
| C11 | Decompose `RouteProcessingService` | extract 6 collaborators | K9 (50 MB cap) preserved by `TilesZipBuilder` | Selected |
| C12 | Decompose `RouteService.CreateRouteAsync` | extract validator + builder + grid + mapper | None | Selected |
| C13 | Consolidate Haversine + filename parser | move to `GeoUtils`/`StorageConfig` | None | Selected |
| C14 | Shared `TileGridStitcher` | new class, ImageSharp-based | K3 (ImageSharp pinned) preserved | Selected |
| C15 | Shared `TileCsvWriter` | new class | None | Selected |
| C16 | Move inline DTOs out of `Program.cs` | move to `Common/DTO/` | None | Selected |
| C17 | Status / point-type enums | enum + Dapper type handler; DB shape unchanged | **K8** (RT2 drift) — must pick canonical names; needs user decision (see below) | Needs user decision |
| C18 | Magic numbers → config | `ProcessingConfig` / `MapConfig` defaults preserve current values | None | Selected |
| C19 | Add formatter + analyzers + coverage | `Microsoft.CodeAnalysis.NetAnalyzers`, `coverlet.collector` | None — see C19 evidence below | Selected |
| C20 | Clarify `MapsVersion` semantics | drop alongside C06, or keep as forensic label | Coupled with C06; user already chose "drop" direction | Selected |
| C21 | Typed `HttpClient` for Google Maps | `AddHttpClient("GoogleMapsTiles", …)` | None | Selected |
| C22 | O(N) existing-tile lookup | `HashSet<(x,y,z)>` | None | Selected |
### Items needing user decision
- **C17 — point-type canonical names (resolves K8 drift)**:
- Option α: Keep code's `Start`, `End`, `Action`, `Intermediate`. Update `acceptance_criteria.md` RT2.
- Option β: Adopt AC's `Original`, `Intermediate`. Rewrite the four code sites that emit string literals.
- Recommendation: α — the code is more expressive (`Start`/`End`/`Action` carry more meaning than `Original`), and there are 4 emit sites vs. one AC line to edit.
## C19 — replacement library evidence
C19 adds two **tooling** packages. Both are single-mode build-time tools, not multi-mode runtime SDKs, so the full per-mode MVE protocol doesn't apply. Recorded here for the audit trail.
| Package | Version policy | Mode | Evidence |
|---------|---------------|------|----------|
| `Microsoft.CodeAnalysis.NetAnalyzers` | Take latest 8.x (LTS-aligned) | Roslyn analyzer at build time; reports `CA*` diagnostics | Microsoft-published, included by default in .NET 8 SDK builds; making it explicit pins the version. Documentation: <https://learn.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/> |
| `coverlet.collector` | Take latest 6.x | DataCollector loaded by `dotnet test`; emits `coverage.cobertura.xml` | Standard .NET coverage collector; declared once on each test csproj. Documentation: <https://github.com/coverlet-coverage/coverlet> |
No multi-mode confusion: each package has one production use case (analyze at build time / collect coverage on test runs). No `mve_evidence.md` is produced.
## Prioritized Recommendations (input to roadmap)
1. **Critical & cheap first** (small risk, big correctness/security win): C01, C02, C03, C04, C05, C10.
2. **High-value correctness** (one bigger or more invasive change each): C06 (with migration), C09 (idempotency), C07 (catch ladder).
3. **Structural cleanup** (medium-risk, medium-cost): C11, C12, C13, C14, C15, C16, C08.
4. **Typing & config hygiene**: C17, C18, C20.
5. **Polish / tooling / micro-perf**: C19, C21, C22.
The roadmap document (`refactoring_roadmap.md`) maps these into three execution phases and surfaces the C17 / K8 question for the user.
## Self-Verification (Phase 2a)
- [x] Project Constraint Matrix extracted from `restrictions.md` and `acceptance_criteria.md`.
- [x] Each change in `list-of-changes.md` has a constraint-fit row.
- [x] No recommendation introduces a new library/SDK in a multi-mode runtime context. C19's two tooling adds are documented; full per-mode MVE not applicable (single-mode build-time tools).
- [x] Conflicts surfaced explicitly: K6/K7 (C06 ↔ doc updates), K8 (C17 ↔ point-type canonical names — needs user decision).
- [x] Recommendations grounded in actual code (file:line references in `list-of-changes.md`).
- [x] No paraphrased capability claims for new libraries — there are no new runtime libraries to claim about.
@@ -0,0 +1,97 @@
# Phase 0 — Baseline Metrics (03-code-quality-refactoring)
**Date**: 2026-05-10
**Mode**: Automatic (user-discovered scope)
**Goal (user request)**: Assess current code, identify bad practices, produce steps to eliminate them.
## Scope
- Full codebase, all 8 production/test projects.
- No specific feature change: this is a quality-focused refactor cycle, after the recent coupling refactor (`02-coupling-refactoring`) closed all five Architecture-compliance findings.
- Behavior preservation required: smoke + unit suite must remain green.
## Refactoring Goals (Phase 0a)
These are the exit criteria that Phase 6 (Verification) will check against. They are deliberately framed in terms of *categories of bad practice* rather than fixed file-level changes — the precise list is produced by Phase 1.
1. **Code-smell inventory**: every C# source file in the production projects (`Api`, `Common`, `DataAccess`, `Services.*`) is reviewed for: error handling gaps, scope discipline (long methods / SRP violations), magic numbers, hidden side effects, async-correctness (`async void`, missing `ConfigureAwait` / `CancellationToken`), thread-safety in hosted services, and duplication across the three `Services.*` siblings.
2. **Inventory delivered as `list-of-changes.md`** with severity (Low / Medium / High), risk, and dependencies — same template used for `01-testability-refactoring` and `02-coupling-refactoring`, so the change list is directly consumable by the implement skill.
3. **Roadmap (Phase 2)** ranks the entries by ROI (impact ÷ risk×effort) and groups them into batches that fit the 25-point complexity policy from the user's tracker rules.
4. **No regression** in unit (37) or integration smoke (5 scenarios) suites at Phase 6 verification.
5. Open question for Phase 0 BLOCKING gate: does the user want the run to **execute** (Phases 37) or **stop after roadmap** (Quick Assessment, Phases 02 only)?
## Constraints
- **No public HTTP route changes.** `/api/satellite/tiles/latlon`, `/api/satellite/tiles/{z}/{x}/{y}`, `/api/satellite/request`, `/api/satellite/route` shapes are stable.
- **No database schema changes** (would require migration sequencing and is out of scope).
- **No DI-graph re-ordering** that would change observable startup behavior of the hosted services (`RegionProcessingService`, `RouteProcessingService`).
- **Architecture Vision** (`_docs/02_document/architecture.md` H2) is authoritative. Any proposed change that conflicts with it must be flagged, not silently accepted.
## Current Code Topology
| Project | LOC (.cs, excl. bin/obj) | Files | Notes |
|---------|--------------------------|-------|-------|
| SatelliteProvider.Api | 380 | 1 (`Program.cs`) | Minimal-API endpoints + DI wiring |
| SatelliteProvider.Common | 474 | 25 | DTOs, interfaces, configs, GeoUtils |
| SatelliteProvider.DataAccess | 614 | 11 | Repositories, Entities, Migrations |
| SatelliteProvider.Services.TileDownloader | 620 | 3 | `TileService`, `GoogleMapsDownloaderV2`, DI ext. |
| SatelliteProvider.Services.RegionProcessing | 564 | 4 | `RegionService`, hosted service, queue, DI ext. |
| SatelliteProvider.Services.RouteManagement | 1055 | 3 | `RouteService`, `RouteProcessingService`, DI ext. ← largest unit |
| SatelliteProvider.Tests | 1221 | 6 | xUnit unit tests |
| SatelliteProvider.IntegrationTests | 1434 | 10 | Console runner, integration scenarios |
Total production C# LOC: **3 707** across 47 files (Api + Common + DataAccess + 3 × Services).
Total test C# LOC: **2 655** across 16 files.
## Test Coverage Baseline (carried forward + verified)
- **Unit suite** (`SatelliteProvider.Tests`): 37 `[Fact]`/`[Theory]` annotations (FINAL_report of 02-coupling claimed 40 — small discrepancy; will reconcile in Phase 6 against `dotnet test --list-tests`).
- **Integration smoke** (`scripts/run-tests.sh --smoke`): 5 scenarios, last green run **2026-05-10**, wall-clock **111.86 s** (`_docs/03_implementation/test_run_step7.md`).
- Coverage % not measured (no Coverlet wired up; would add as a Phase 0 sub-task only if the user wants it gating verification).
## Architecture Compliance Baseline (carried forward)
`_docs/02_document/architecture_compliance_baseline.md`**all 5 findings Resolved** by epic AZ-309. F1, F2 (High) and F3, F4 (Medium) and F5 (Low). No new architecture findings have been opened since.
## Complexity / Static-Analysis Tools
Not present in this repository:
- No Coverlet / dotCover / fine-grained coverage runner.
- No `dotnet format --verify-no-changes` gate.
- No SonarLint / Roslyn-analyzer ruleset beyond defaults.
- No cyclomatic-complexity tool wired up.
These gaps are *themselves* candidate findings for Phase 1 (tooling category).
## Functionality Inventory
| Endpoint / Hosted Service | Owner Project | Status | Smoke Coverage |
|---------------------------|---------------|--------|----------------|
| `GET /api/satellite/tiles/{z}/{x}/{y}` | Api → Services.TileDownloader | live | ✓ (TileTests.RunGetTileByLatLonTest) |
| `GET /api/satellite/tiles/latlon` | Api → Services.TileDownloader | live | ✓ |
| `GET /api/satellite/tiles/mgrs` | Api | stub (returns 501) | not exercised |
| `POST /api/satellite/request` | Api → Services.RegionProcessing | live | ✓ (RegionTests.RunRegionProcessingTest_200m_Zoom18) |
| `GET /api/satellite/region/{id}` | Api → Services.RegionProcessing | live | ✓ |
| `POST /api/satellite/route` | Api → Services.RouteManagement | live | ✓ (BasicRouteTests.RunSimpleRouteTest, ExtendedRouteTests.RunRouteWithTilesZipTest) |
| `GET /api/satellite/route/{id}` | Api → Services.RouteManagement | live | ✓ |
| `POST /api/satellite/upload` | Api | stub (returns 501) | not exercised |
| `RegionProcessingService` (BackgroundService) | Services.RegionProcessing | live | exercised via region scenarios |
| `RouteProcessingService` (BackgroundService) | Services.RouteManagement | live | exercised via route scenarios |
## Self-Verification (Phase 0)
- [x] RUN_DIR created with auto-incremented prefix `03-`.
- [x] Goals captured (5 numbered goals above).
- [x] Constraints captured (no HTTP-shape change, no schema change, no DI reorder, Architecture Vision honored).
- [x] Current topology measured by LOC and file count.
- [x] Test coverage baseline noted; coverage % flagged as not measured.
- [x] Architecture compliance baseline noted (carried from 02-coupling FINAL_report).
- [x] Functionality inventory complete (10 endpoints + 2 hosted services).
- [x] Tool gaps explicitly listed so Phase 1 can decide whether to fold them in.
## Open Question for User (BLOCKING gate)
See the Phase 0 summary presented in chat — two decisions:
1. Confirm goals and constraints above.
2. Choose: full refactor run (Phases 07) vs. Quick Assessment (Phases 02 only — produces `list-of-changes.md` + `refactoring_roadmap.md` and stops).
@@ -0,0 +1,323 @@
# List of Changes
**Run**: 03-code-quality-refactoring
**Mode**: automatic
**Source**: self-discovered
**Date**: 2026-05-10
## Summary
Twenty-seven changes spanning five concerns: (1) silent-error-suppression and information-leakage that violate project rules, (2) god-class / 165-LOC-method SRP violations that grew during the original implementation, (3) duplicated logic that the recent `02-coupling-refactoring` exposed (Haversine, CSV writer, image stitcher, magic tolerance, Earth constants), (4) magic strings and numbers that should be enums or config, and (5) tooling gaps (no formatter, no analyzer, no coverage). C23C27 were added during Phase 2 hardening track A (Tech Debt sweep). All changes preserve HTTP shape; C06 + C09 are the only ones that materially change DB or API behavior and are flagged for deliberate execution.
Severity legend: **Critical** = correctness or security violation, **High** = silent failure mode or expensive bug, **Medium** = maintainability/duplication, **Low** = polish/tooling.
## Changes
### C01: Fix null logger passed to DatabaseMigrator at startup
- **File(s)**: `SatelliteProvider.Api/Program.cs:82-83`
- **Problem**: `var logger = app.Services.GetRequiredService<ILogger<Program>>(); var migrator = new DatabaseMigrator(connectionString, logger as ILogger<DatabaseMigrator>);` — the `as` cast between unrelated generic `ILogger<T>` instantiations always returns `null`. `DatabaseMigrator` therefore runs with a null logger; if its constructor doesn't null-check it, migration failures are invisible.
- **Change**: Resolve `ILogger<DatabaseMigrator>` directly from the service provider.
- **Rationale**: Migration failure is the loudest possible startup error and must be observable.
- **Constraint Fit**: No HTTP or schema change; pure DI fix.
- **Severity**: Critical
- **Risk**: low
- **Dependencies**: None
### C02: Remove empty catch block in ExtractTileCoordinatesFromFilename
- **File(s)**: `SatelliteProvider.Services.RouteManagement/RouteProcessingService.cs:610-630`
- **Problem**: Empty `catch { }` swallows every parse/IO exception silently and the function returns `(-1, -1)`. Callers treat this as "not stitchable" — the tile is dropped from the route map without trace. Direct violation of `.cursor/rules/coderule.mdc` ("Never suppress errors silently").
- **Change**: Replace the empty catch with explicit catch of `FormatException`/`ArgumentException` only, log at warning level with `_logger`, and propagate everything else.
- **Rationale**: A malformed tile path now yields a visible warning and the operator can act.
- **Constraint Fit**: Behavior identical for valid filenames; only failure modes change.
- **Severity**: Critical
- **Risk**: low
- **Dependencies**: None
### C03: Stop leaking exception messages to API clients
- **File(s)**: `SatelliteProvider.Api/Program.cs:139-143, 170-174, 206-210, 226-230, 245-249, 265-269`
- **Problem**: Every endpoint catches `Exception` and returns `Results.Problem(detail: ex.Message, statusCode: 500)`. The ProblemDetails `detail` is shipped to the client, exposing internal stack-trace fragments, file paths, SQL error text, and Google API error bodies.
- **Change**: Replace per-endpoint `try/catch (Exception)` with a global exception handler (`UseExceptionHandler` + `IExceptionHandler` or middleware) that returns a sanitized `ProblemDetails` (correlation ID + generic title) and logs the full exception server-side with the same correlation ID.
- **Rationale**: Information disclosure is a recurring OWASP top-10 finding; centralizing also kills the boilerplate.
- **Constraint Fit**: HTTP status codes preserved (500). Specific 400 paths (`ArgumentException` in `CreateRoute`) keep their specific handling.
- **Severity**: Critical
- **Risk**: medium (changes response body shape on 500; requires updating any test that asserts on `detail`)
- **Dependencies**: None
### C04: Make CORS strict by default
- **File(s)**: `SatelliteProvider.Api/Program.cs:37-47`
- **Problem**: When `CorsConfig:AllowedOrigins` is empty (the default in many environments) the policy falls through to `policy.AllowAnyOrigin().AllowAnyHeader().AllowAnyMethod()`. A misconfigured prod deployment silently exposes the entire surface to any origin.
- **Change**: When `AllowedOrigins` is empty, refuse to start in `Production` and warn loudly in `Development`. Only configure the open policy when the operator explicitly opts in via a config flag (e.g., `CorsConfig:AllowAnyOrigin=true`).
- **Rationale**: Secure defaults; explicit opt-in for permissive CORS.
- **Constraint Fit**: No change to integration tests (they run with explicit `AllowedOrigins`).
- **Severity**: Critical
- **Risk**: low (provided integration tests already specify origins)
- **Dependencies**: None
### C05: Stub endpoints must return 501 Not Implemented
- **File(s)**: `SatelliteProvider.Api/Program.cs:177-180 (GetSatelliteTilesByMgrs), 182-185 (UploadImage)`
- **Problem**: `GetSatelliteTilesByMgrs` returns 200 OK with an empty payload; `UploadImage` returns 200 OK with `Success=false`. Clients can't distinguish "stubbed" from "valid empty result" or "real failure".
- **Change**: Return `Results.StatusCode(501)` (with a brief problem-details body indicating "feature not implemented") until real implementations land. Update OpenAPI metadata accordingly.
- **Rationale**: Honest contract; HTTP semantics.
- **Constraint Fit**: Smoke tests do not exercise these endpoints (per `test_run_step7.md`).
- **Severity**: High
- **Risk**: low
- **Dependencies**: None
### C06: Drop the `Version` concept; always return the latest tile per (lat, lon, zoom)
- **File(s)**: `SatelliteProvider.Services.TileDownloader/TileService.cs:42, 45, 113, 135`; `SatelliteProvider.DataAccess/Repositories/TileRepository.cs` (region/lookup queries); `SatelliteProvider.DataAccess/Migrations/` (new migration); `SatelliteProvider.DataAccess/Models/TileEntity.cs`
- **Problem**: `var currentVersion = DateTime.UtcNow.Year` becomes the row's `Version`. On every Jan 1 UTC the year flips, the region path filters out all prior-year tiles via `existingTiles.Where(t => t.Version == currentVersion)`, and the entire cache effectively expires. The concept of "version" is also unused anywhere else as a cache-control lever. See `logical_flow_analysis.md#LF-1`.
- **Change**: Remove versioning from the tile cache logic entirely. The system becomes "the latest stored tile for a given `(latitude, longitude, zoom_level)` is the cached tile":
1. Delete the `t.Version == currentVersion` filter in `TileService.DownloadAndStoreTilesAsync`.
2. Stop writing `currentVersion` to `TileEntity.Version` in `BuildTileEntity`.
3. In repository lookups (`GetTilesByRegionAsync`, `GetByTileCoordinatesAsync`), ensure they return the most recently updated row when duplicates exist (`ORDER BY updated_at DESC LIMIT 1` for single-tile lookups; for region lookups, deduplicate on `(latitude, longitude, zoom_level, tile_size_meters)` keeping the latest).
4. To prevent unbounded duplicate accumulation: change `InsertAsync` semantics to "upsert by `(latitude, longitude, tile_zoom, tile_size_meters)`". `TileRepository.InsertAsync` already uses `ON CONFLICT (latitude, longitude, tile_zoom, tile_size_meters, version) DO UPDATE`; the migration drops the existing 5-column unique constraint and replaces it with a 4-column one (without `version`), then dedupes pre-existing duplicate rows by keeping the highest `updated_at`.
5. The `version` column itself is left in the table (per `coderule.mdc`: do not rename or drop columns without explicit confirmation). It becomes a vestigial nullable column that the new code ignores. **Optional follow-up** (separate ticket, not part of this change): drop the column once we are sure no consumer reads it.
- **Rationale**: One concept (latest), one row per geographic tile cell, no time-of-year cliff.
- **Constraint Fit**: HTTP shape preserved (`DownloadTileResponse.Version` stays in the JSON; for new responses the value is whatever the row carries — old rows keep their year value, new rows get `0` or `null` per migration default). DB column kept per `coderule.mdc`; the migration only adds a unique constraint and dedupes existing rows. Smoke test will see a single `tiles` row per geographic position regardless of run date.
- **Severity**: High
- **Risk**: medium (migration must dedupe existing tile rows safely; needs review before running in environments with real data)
- **Dependencies**: None (C20 follows from this change)
### C07: Consolidate the 9-way catch ladder in `RegionService.ProcessRegionAsync`
- **File(s)**: `SatelliteProvider.Services.RegionProcessing/RegionService.cs:148-197`
- **Problem**: Nine nearly-identical `catch` blocks (TaskCanceledException × 3, OperationCanceledException × 2, RateLimitException, HttpRequestException, Exception × 1) all do the same thing: build an `errorMessage` and call `HandleProcessingFailureAsync`. ~50 LOC of duplication; any new failure category will require touching all of them.
- **Change**: Extract a single `try/catch (Exception ex)` that delegates to a small `ClassifyRegionFailure(ex, timeoutCts, cancellationToken)` helper returning the category + human message. The catch block then calls `HandleProcessingFailureAsync` once.
- **Rationale**: Same observable behavior; one reason to change the failure-handling code.
- **Constraint Fit**: All currently-tested failure paths preserved (RegionTests covers timeout + rate-limit indirectly).
- **Severity**: Medium
- **Risk**: low (covered by existing unit + integration tests)
- **Dependencies**: None
### C08: Replace service-locator pattern in `RouteProcessingService`
- **File(s)**: `SatelliteProvider.Services.RouteManagement/RouteProcessingService.cs:18-22, 105-109, 165-169`
- **Problem**: Constructor injects `IServiceProvider`; the body creates a scope and resolves `IRegionService` per pending route iteration. `IRegionService` is registered as a singleton (verified in `RegionProcessingServiceCollectionExtensions`), so the scope creation is unnecessary and the service-locator pattern hides the real dependency from the constructor signature.
- **Change**: Inject `IRegionService` directly. Remove `IServiceProvider` field and the two `using var scope = _serviceProvider.CreateScope();` blocks.
- **Rationale**: Honest dependency declaration; faster (no scope allocation per loop); removes a service-locator footgun.
- **Constraint Fit**: No behavior change as long as `IRegionService` remains a singleton. If a future change makes it scoped, switch to `IServiceScopeFactory` (still better than `IServiceProvider`).
- **Severity**: Medium
- **Risk**: low
- **Dependencies**: None
### C09: Define an idempotency contract for caller-supplied GUIDs
- **File(s)**: `SatelliteProvider.Api/Program.cs:187-211 (RequestRegion), 233-250 (CreateRoute)`; `SatelliteProvider.Services.RegionProcessing/RegionService.cs:38-71`; `SatelliteProvider.Services.RouteManagement/RouteService.cs:27-211`
- **Problem**: Both endpoints accept `request.Id` from the caller and `INSERT` blindly. A retried POST hits a unique-key conflict at the DB and surfaces as a 500 (with a leaky message — see C03). The client has no way to know whether their first POST succeeded.
- **Change**: On `INSERT` conflict for a known `Id`, return the existing resource state (200 OK) rather than 500. Document this idempotency in the OpenAPI spec.
- **Rationale**: Standard idempotency for retry-safe POSTs. Reduces support load and prevents duplicate background work.
- **Constraint Fit**: HTTP shape preserved on success; 500-on-duplicate becomes 200-on-duplicate (a strict improvement). No DB schema change.
- **Severity**: High
- **Risk**: medium (behavior change at the contract boundary — needs decompose ticket reviewed by user)
- **Dependencies**: C03 (so the new path doesn't go through the leaky 500)
### C10: Remove non-atomic, write-only counters from `RegionRequestQueue`
- **File(s)**: `SatelliteProvider.Services.RegionProcessing/RegionRequestQueue.cs:12-13, 28, 38`
- **Problem**: `_totalEnqueued++` and `_totalDequeued++` are non-atomic on `int` from concurrent producers / consumers, and the fields are never read anywhere in the codebase.
- **Change**: Delete the two `int` fields and the two `++` lines. No replacement.
- **Rationale**: Removes both the silent thread-safety bug and the dead code. Telemetry, if needed later, will be added via a proper counter abstraction (e.g., `Meter`/`Counter<long>`) rather than ad-hoc fields.
- **Constraint Fit**: No public-API impact (the fields are private).
- **Severity**: Medium
- **Risk**: low
- **Dependencies**: None
### C11: Decompose `RouteProcessingService` (god class, ~750 LOC)
- **File(s)**: `SatelliteProvider.Services.RouteManagement/RouteProcessingService.cs` (entire file)
- **Problem**: A single `BackgroundService` does: queue polling, region matching, CSV parsing, summary writing, image stitching, geofence-rectangle drawing, route-cross drawing, ZIP creation, and cleanup. Six+ responsibilities, ~750 LOC, and the file even hosts a public `TileInfo` POCO at the bottom.
- **Change**: Extract into orchestrator + collaborators in `Services.RouteManagement`:
- `RouteRegionMatcher` (pure: route points + completed regions → ordered region list)
- `RouteCsvWriter` (writes route_<id>_ready.csv from `IEnumerable<TileInfo>`)
- `RouteSummaryWriter` (writes route_<id>_summary.txt; includes the StringBuilder block)
- `RouteImageRenderer` (image stitching + cross/border drawing)
- `TilesZipBuilder` (ZIP archive creation; resolves entry names)
- `RegionFileCleaner` (deletes per-region CSV/summary/stitched files)
- `TileInfo` moves to its own file under `Services.RouteManagement` or `Common/DTO/`
- `RouteProcessingService` becomes a thin orchestrator that polls and dispatches.
- **Rationale**: SRP, testability (each helper unit-testable without a queue), and the file becomes navigable.
- **Constraint Fit**: Same `BackgroundService` lifecycle, same DB writes, same output files.
- **Severity**: Medium
- **Risk**: medium (large file, but the smoke + unit suites cover the end-to-end behavior)
- **Dependencies**: C13 (shared Haversine), C14 (shared stitcher) — those extractions feed into this.
### C12: Decompose `RouteService.CreateRouteAsync` (165-LOC method)
- **File(s)**: `SatelliteProvider.Services.RouteManagement/RouteService.cs:27-211`
- **Problem**: One method does input validation, point-graph construction (with `GeoUtils.CalculateIntermediatePoints`), route entity persistence, route-points persistence, geofence polygon validation (4 separate validation rules), geofence grid generation, geofence region requests, and response mapping. Five distinct responsibilities; the validation alone is ~25 LOC of nested `if` chains.
- **Change**: Extract:
- `RouteValidator` (all `ArgumentException`-throwing checks; aggregates errors instead of short-circuiting on the first)
- `RoutePointGraphBuilder` (interpolation + sequence numbering — pure, easy to unit-test)
- `GeofenceGridCalculator` (NW/SE → list of region centers — already a private method, just promote and unit-test it)
- `RouteResponseMapper` (entity → DTO; eliminates duplication with `GetRouteAsync`)
- **Rationale**: SRP; aggregated validation is a UX win for callers.
- **Constraint Fit**: Same persistence calls and same response shape; aggregated validation returns 400 the same as today (still via `ArgumentException` or a typed `ValidationException`).
- **Severity**: Medium
- **Risk**: low (covered by `RouteServiceTests` unit suite + integration smoke)
- **Dependencies**: None
### C13: Consolidate duplicate Haversine + tile-coord parsing into `Common/Utils`
- **File(s)**: `SatelliteProvider.Services.RouteManagement/RouteProcessingService.cs:596-608, 610-630`; `SatelliteProvider.Common/Utils/GeoUtils.cs`
- **Problem**: `RouteProcessingService.CalculateDistance(lat1, lon1, lat2, lon2)` re-implements `GeoUtils.CalculateDistance(GeoPoint, GeoPoint)`. `ExtractTileCoordinatesFromFilename` is a parser tied to the filename pattern "tile_{ts}_{x}_{y}.jpg", which is generated by `StorageConfig.GetTileFilePath` in another assembly — the two are coupled by string convention only.
- **Change**: (a) Delete `RouteProcessingService.CalculateDistance`; call `GeoUtils.CalculateDistance` with `GeoPoint` instances. (b) Move `ExtractTileCoordinatesFromFilename` next to `StorageConfig.GetTileFilePath` (or onto `StorageConfig` itself) so the parser and the writer live together — this is a public API contract pair.
- **Rationale**: Single source of truth for both formulas; coupling becomes structural rather than by-string-convention.
- **Constraint Fit**: Same behavior; no public API change beyond moving a static method.
- **Severity**: Medium
- **Risk**: low
- **Dependencies**: None
### C14: Extract shared `TileGridStitcher` for region + route image generation
- **File(s)**: `SatelliteProvider.Services.RegionProcessing/RegionService.cs:240-321`; `SatelliteProvider.Services.RouteManagement/RouteProcessingService.cs:453-570`
- **Problem**: Both files implement "place tiles in a grid by (TileX, TileY) and overlay markers". The basic placement loop, min/max calculation, and `Image.LoadAsync<Rgb24>` per tile are duplicated. Differences are only the overlays (region: red cross at center; route: yellow geofence rectangles + red crosses at route points).
- **Change**: Extract `TileGridStitcher` in `Common` (or a new `SatelliteProvider.Imaging` project) with:
- `Task<Image<Rgb24>> StitchAsync(IEnumerable<TilePlacement> tiles, CancellationToken ct)`
- Overlay primitives: `DrawCross(Image, Point, Color, ArmLength)` and `DrawRectangleBorder(Image, Rect, Color, Thickness)` exposed as instance methods.
- **Rationale**: Single implementation; testable in isolation.
- **Constraint Fit**: Output images identical pixel-for-pixel for the existing test scenarios (verified during execution).
- **Severity**: Medium
- **Risk**: medium (image output is checked by integration tests; a pixel diff would catch regressions)
- **Dependencies**: C11 (route-side caller is restructured at the same time)
### C15: Extract shared `TileCsvWriter`
- **File(s)**: `SatelliteProvider.Services.RegionProcessing/RegionService.cs:323-334`; `SatelliteProvider.Services.RouteManagement/RouteProcessingService.cs:388-404`
- **Problem**: Both files write the same CSV header (`latitude,longitude,file_path`) with the same ordering rule (`OrderByDescending(t.Latitude).ThenBy(t.Longitude)`) and the same `F6` numeric format. Two near-identical writers.
- **Change**: Extract `TileCsvWriter` in `Common` with one method `WriteAsync(string path, IEnumerable<TileRecord> tiles, CancellationToken ct)`.
- **Rationale**: One place to change the CSV contract.
- **Constraint Fit**: Output bytes identical.
- **Severity**: Medium
- **Risk**: low
- **Dependencies**: None
### C16: Move inline DTOs from `Program.cs` to `Common/DTO/`
- **File(s)**: `SatelliteProvider.Api/Program.cs:272-353 (GetSatelliteTilesResponse, SatelliteTile, UploadImageRequest, SaveResult, DownloadTileResponse, RequestRegionRequest)`
- **Problem**: Six request/response DTOs and one Swagger filter live at the bottom of `Program.cs`. SRP: API host file should only wire endpoints; data shapes belong in `Common/DTO/`.
- **Change**: Move the six DTOs to `SatelliteProvider.Common/DTO/`; move `ParameterDescriptionFilter` to a new `SatelliteProvider.Api/Swagger/ParameterDescriptionFilter.cs`.
- **Rationale**: Same SRP discipline applied to the rest of the codebase.
- **Constraint Fit**: Public OpenAPI shape unchanged; only namespaces change.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: None
### C17: Replace status / point-type magic strings with enums (+ AC RT2 doc update)
- **File(s)**: `SatelliteProvider.Services.RegionProcessing/RegionService.cs:49,90,140,209` ("queued"/"processing"/"completed"/"failed"); `SatelliteProvider.Services.RouteManagement/RouteService.cs:66,100` ("start"/"end"/"action"/"intermediate"); `RouteProcessingService.cs:138-140`; `_docs/00_problem/acceptance_criteria.md` RT2
- **Problem**: Status/type values are bare strings written to and compared from multiple places. Typos at compile time become runtime bugs. Acceptance criterion RT2 is also out of sync with the lived code: AC says point types are `original` / `intermediate`, but code uses 4 values: `start` / `end` / `action` / `intermediate` — and tests, helpers, and DB rows all rely on the 4-value model.
- **Change**: Introduce two enums in `SatelliteProvider.Common/Enums/`:
- `RegionStatus { Queued, Processing, Completed, Failed }`
- `RoutePointType { Start, End, Action, Intermediate }` (user-confirmed canonical names — option α)
Persist to DB as the existing lowercase string values via a Dapper `SqlMapper.AddTypeHandler<T>(new EnumStringTypeHandler<T>())` registered once at startup. **Update `_docs/00_problem/acceptance_criteria.md` RT2** so the wording matches the 4-value reality: "Point types correctly assigned: `start` for the first user-supplied waypoint, `end` for the last, `action` for middle waypoints, `intermediate` for generated fill-ins."
- **Rationale**: Compile-time safety; refactor-friendly; ends the AC-vs-code drift in the same ticket that touches both.
- **Constraint Fit**: DB column types and stored values are identical; no migration needed. AC update is documentation, not a behavior change. (Resolves Constraint K8 from `research_findings.md`.)
- **Severity**: Medium
- **Risk**: low (Dapper type handlers are well-established)
- **Dependencies**: None
### C18: Move hardcoded magic numbers into `ProcessingConfig` / `MapConfig`
- **File(s)**: `RegionService.cs:94` (5 min timeout); `RouteService.cs:15` (200 m point spacing); `RouteProcessingService.cs:22` (5 s polling); `RouteService.cs:154`, `GoogleMapsDownloaderV2.cs:252` (`0.0001` lat/lon tolerance); `GoogleMapsDownloaderV2.cs:18-21` (TILE_SIZE_PIXELS, MAX_RETRY_DELAY_SECONDS, BASE_RETRY_DELAY_SECONDS, ALLOWED_ZOOM_LEVELS); `TileService.cs:152` (TileSizePixels = 256). Also: forward `CancellationToken` from `Program.cs:GetTileByLatLon` (line 150) into `DownloadAndStoreSingleTileAsync` (LF-2).
- **Problem**: Operational levers are baked into source. Tuning means a redeploy.
- **Change**: Add `ProcessingConfig.RegionProcessingTimeout`, `ProcessingConfig.RouteProcessingPollInterval`, `ProcessingConfig.MaxRoutePointSpacingMeters`, `ProcessingConfig.LatLonTolerance`, `MapConfig.TileSizePixels`, `MapConfig.AllowedZoomLevels`, `MapConfig.RetryBaseDelaySeconds`, `MapConfig.RetryMaxDelaySeconds`. Default values match current literals so behavior is unchanged.
- **Rationale**: Configurable without redeploy; a single source of truth for `0.0001`-style tolerances.
- **Constraint Fit**: Defaults preserve all existing behavior.
- **Severity**: Medium
- **Risk**: low
- **Dependencies**: None
### C19: Add formatter, analyzer, and coverage tooling
- **File(s)**: solution root, all `*.csproj`
- **Problem**: No `dotnet format` gate, no Roslyn analyzers beyond defaults, no Coverlet for coverage. Style drift and easy-to-miss bugs slip through.
- **Change**: Add `Microsoft.CodeAnalysis.NetAnalyzers` and `coverlet.collector` to the test projects; add a root `.editorconfig` (if not already present) with the team's preferences and wire `dotnet format --verify-no-changes` into the `scripts/run-tests.sh`. Adopt one initial analyzer ruleset (CA1001, CA1051, CA2007, CA2227 etc.) at warning severity, not error, to avoid a flood.
- **Rationale**: Catches at build time what would otherwise become future findings.
- **Constraint Fit**: Tooling-only change; no runtime behavior.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: None
### C20: Clarify `MapsVersion` semantics
- **File(s)**: `SatelliteProvider.Services.TileDownloader/TileService.cs:154`; `SatelliteProvider.DataAccess` `tiles.maps_version` column; `SatelliteProvider.Common/DTO/TileMetadata.cs`; `SatelliteProvider.Api/Program.cs:DownloadTileResponse`
- **Problem**: `MapsVersion = $"downloaded_{now:yyyy-MM-dd}"` — the field name says "version" but the value is a creation-date label. The actual cache-key version is the integer `Version` column (currently the year — see C06). Confusing.
- **Change**: Either (a) drop `MapsVersion` from the row entirely if `Version` (post-C06) carries the version concept, or (b) document `MapsVersion` as a free-form provider-tag and keep it for forensics. Decide alongside C06.
- **Rationale**: One concept, one name.
- **Constraint Fit**: DB column kept (no rename per `coderule.mdc`); only the value-source changes if option (a) drops it.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: C06
### C21: Register typed/named `HttpClient` for Google Maps in DI
- **File(s)**: `SatelliteProvider.Api/Program.cs:31`; `SatelliteProvider.Services.TileDownloader/GoogleMapsDownloaderV2.cs:51, 107, 369`
- **Problem**: `GoogleMapsDownloaderV2` calls `_httpClientFactory.CreateClient()` inside three different code paths (session-token fetch, single-tile download, batch-tile download retry lambda) and sets `User-Agent` per call. The factory pools `HttpMessageHandler`s correctly, but the per-call header setup is duplicated and error-prone.
- **Change**: Register a typed/named client (`AddHttpClient("GoogleMapsTiles", c => { c.DefaultRequestHeaders.UserAgent.ParseAdd(USER_AGENT); ... })` with a sensible timeout) and inject `IHttpClientFactory` to call `CreateClient("GoogleMapsTiles")`.
- **Rationale**: One configuration point; consistent timeouts/headers; no duplicated setup.
- **Constraint Fit**: Same outbound HTTP behavior.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: None
### C22: Optimize O(N²) existing-tile lookup
- **File(s)**: `SatelliteProvider.Services.TileDownloader/GoogleMapsDownloaderV2.cs:245-265`
- **Problem**: For each grid cell, `existingTiles.FirstOrDefault(t => Math.Abs(t.Latitude - tileCenter.Lat) < 0.0001 && ...)` is a linear scan. For a 200 m × zoom 18 region (~16 tiles) this is fine, but for large regions it grows quadratically.
- **Change**: Build a `HashSet<(int x, int y, int z)>` of the existing tiles once, then membership-test by tile coordinates — exact, no tolerance needed (tile coordinates are integers per zoom level).
- **Rationale**: O(N) instead of O(N²); also removes one of the two uses of the magic `0.0001` (the other one — geofence polygon check at `RouteService.cs:154` — is a real lat/lon tolerance and stays).
- **Constraint Fit**: Behavior identical for any input that already produces correct output today.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: C18 (the remaining `0.0001` becomes a config value)
### C23: Delete unused `FindExistingTileAsync`
- **File(s)**: `SatelliteProvider.DataAccess/Repositories/ITileRepository.cs`, `SatelliteProvider.DataAccess/Repositories/TileRepository.cs:51-76`
- **Problem**: `FindExistingTileAsync(latitude, longitude, tileSizeMeters, zoomLevel, version)` is declared on the interface and implemented in the repository, but no caller exists in the codebase (verified by grep — only docs and the implementation match). Dead code that also takes the obsolete `version` argument C06 is removing.
- **Change**: Delete the method from `ITileRepository` and `TileRepository`. No replacement.
- **Rationale**: Per `coderule.mdc` ("If a file, class, or function has no remaining usages — delete it"). Removes a coupling point to the version concept being phased out by C06.
- **Constraint Fit**: No public-API impact (interface is internal to the solution).
- **Severity**: Low
- **Risk**: low (verify with one final grep before deletion)
- **Dependencies**: None
### C24: Consolidate Earth-geometry constants and the magic 111000
- **File(s)**: `SatelliteProvider.Common/Utils/GeoUtils.cs`; `SatelliteProvider.DataAccess/Repositories/TileRepository.cs:82-91`; `SatelliteProvider.Services.TileDownloader/GoogleMapsDownloaderV2.cs:18, 142`
- **Problem**: Three Earth-related constants drift across the codebase:
- `GeoUtils.EARTH_RADIUS = 6378137` (m)
- `GoogleMapsDownloaderV2.CalculateTileSizeInMeters: EARTH_CIRCUMFERENCE_METERS = 40075016.686`
- `TileRepository.GetTilesByRegionAsync: EARTH_CIRCUMFERENCE_METERS = 40075016.686` (duplicate of the above)
- `TileRepository.GetTilesByRegionAsync: 111000.0` (meters per degree latitude approximation, twice)
- `TILE_SIZE_PIXELS = 256` (TileRepository:83, GoogleMapsDownloaderV2:18, hard-coded `256` in TileService:152)
- **Change**: Move the Earth constants and the per-degree-latitude conversion into `GeoUtils` as named `public const`s (e.g., `EarthRadiusMeters`, `EarthEquatorialCircumferenceMeters`, `MetersPerDegreeLatitude`). Move `TileSizePixels` to `MapConfig` (also see C18). Replace all duplicate literal sites with the new named constants.
- **Rationale**: One source of truth; if/when ellipsoid choice or tile-pixel-size changes, one edit instead of five.
- **Constraint Fit**: Numerically identical results.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: C18 (TileSizePixels move into config)
### C25: Remove unused `_logger` fields from repositories — or use them
- **File(s)**: `SatelliteProvider.DataAccess/Repositories/TileRepository.cs:11`, `RegionRepository.cs:11`, `RouteRepository.cs` (same pattern)
- **Problem**: Each repository constructor accepts and stores `ILogger<TRepo>`, but the field is never read. Dead injection — adds DI cost and noise without value.
- **Change**: Pick one path:
- (a) **Delete** the field, the constructor parameter, and update DI registrations in `Program.cs` accordingly.
- (b) **Use** them: log slow queries (e.g., emit a `LogWarning` if a query takes more than `n` ms) — gives operational visibility for free.
- Recommended: (b) for `TileRepository.GetTilesByRegionAsync` only (the one that does meaningful work) and (a) everywhere else. The choice is light enough to flip during implementation.
- **Rationale**: Either dead code is removed, or the logger pulls its weight.
- **Constraint Fit**: No public-API change either way.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: None
### C26: De-duplicate repository SELECT column lists
- **File(s)**: `SatelliteProvider.DataAccess/Repositories/TileRepository.cs` (4 SELECTs with same column-mapping clause), `RegionRepository.cs` (2 SELECTs), `RouteRepository.cs`
- **Problem**: The same `id, tile_zoom as TileZoom, tile_x as TileX, ...` column list is repeated up to four times per repository. Every new column must be added in lockstep across all SELECTs; easy to drift.
- **Change**: Extract per-repository `private const string ColumnList = "..."` and interpolate into each query (`$"SELECT {ColumnList} FROM tiles WHERE ..."`). Lightweight; avoids pulling in `Dapper.Contrib` or a micro-ORM.
- **Rationale**: One place to add a column.
- **Constraint Fit**: Generated SQL is byte-for-byte identical.
- **Severity**: Low
- **Risk**: low
- **Dependencies**: None
### C27: Delete `GeoUtils.CalculatePolygonDiagonalDistance` (alias of `CalculateDistance`)
- **File(s)**: `SatelliteProvider.Common/Utils/GeoUtils.cs:129-132`
- **Problem**: `CalculatePolygonDiagonalDistance(GeoPoint nw, GeoPoint se)` returns `CalculateDistance(nw, se)` — pure alias, no callers in the codebase (grep confirms). Adds API surface for nothing.
- **Change**: Delete the method. If any caller appears in the future, they call `CalculateDistance` directly.
- **Rationale**: Per `coderule.mdc` dead-code rule.
- **Constraint Fit**: No public-API impact (no callers).
- **Severity**: Low
- **Risk**: low (verify with one final grep before deletion)
- **Dependencies**: None
## Self-Verification
- [x] Every referenced file in this list exists in the codebase (verified `find . -type f -name "*.cs"`).
- [x] Each change has file path(s), problem, change description, rationale, constraint fit, risk, dependencies, severity.
- [x] Component coverage: Api (C01, C03, C04, C05, C09, C16, C18, C21), Common (C13, C14, C15, C16, C17, C18, C24, C27), DataAccess (C06, C23, C25, C26 — repositories + new migration), Services.TileDownloader (C06, C18, C20, C21, C22, C24), Services.RegionProcessing (C07, C09, C10, C14, C15, C17, C18, C25, C26), Services.RouteManagement (C02, C08, C09, C11, C12, C13, C14, C15, C17, C18, C25, C26), Documentation (C06, C17, C20 — restrictions.md / acceptance_criteria.md updates), Tooling (C19).
- [x] Logical-flow analysis completed and saved at `discovery/logical_flow_analysis.md`; all flow findings (LF-1, LF-2, LF-3, LF-4, LF-5, LF-6) are folded into specific change IDs above.
- [x] No silent data loss claims left unaddressed (LF-1 → C06, LF-6 → C02).
- [x] Architecture Vision compliance: none of the changes contradict the H2 vision section in `architecture.md` (verified) — the recent `02-coupling` cemented module boundaries; this run only operates within them.
@@ -0,0 +1,63 @@
# Safety Net — Existing Coverage
**Date**: 2026-05-10
**Run**: 03-code-quality-refactoring (Phase 3, refactor skill)
**Mode**: re-verification of Step 7 baseline
## Result
**GATE: PASS**
Existing tests are sufficient to act as the safety net for the 27 changes in `list-of-changes.md`.
## Test Suite Inventory
| Suite | Project | Pre-refactor count | Re-run count | Outcome |
|-------|---------|-------------------:|-------------:|---------|
| Unit tests | `SatelliteProvider.Tests` | 35 | **40** | 40/40 PASS |
| Integration smoke | `SatelliteProvider.IntegrationTests` (`scripts/run-tests.sh --smoke`) | 5 | **5** | 5/5 PASS |
| Integration full | `SatelliteProvider.IntegrationTests` (`scripts/run-tests.sh --full`) | 11 | not re-run | green per Step 7 |
Re-run wall clock: ~88 s (`scripts/run-tests.sh --smoke`).
The unit count grew from 35 → 40 between Step 7 and now. Source code under `src/`-equivalent components is unchanged in this window (git status shows only documentation, `.gitignore`, and `_docs/_autodev_state.md` modifications); the additional unit tests came from earlier `dev` branch work that landed before Step 8 began. No regressions.
## Coverage vs. Phase 3 Thresholds
| Threshold | Required | Measured | Status |
|-----------|---------:|---------:|--------|
| Overall line coverage | ≥ 75% | not measured | **deferred — see C19** |
| Critical-path coverage | ≥ 90% | not measured | **deferred — see C19** |
| Public API blackbox coverage | required | covered by 5 smoke + 6 deferred-to-`--full` integration tests | **PASS** |
| Error-path coverage | required | SEC-01..04 cover 4 explicit error paths; unit tests cover service-level error branches | **PASS** |
`baseline_metrics.md` documents that Coverlet is not wired up. Change **C19** in `list-of-changes.md` adds Coverlet + reportgenerator as a Phase 4 deliverable; coverage % becomes measurable after C19 lands. The user accepted this trade-off when confirming the Phase 0/Phase 2 roadmap.
## Mapping: Refactor Targets → Existing Test Coverage
| Refactoring change | Files refactored | Tests guarding the area |
|--------------------|------------------|-------------------------|
| C01 Null logger | `Program.cs` startup | smoke: any test that triggers the API container exercises startup; failure surfaces as container-down |
| C02..C04 Empty/silent catches | `TileService`, `TileRepository`, `RegionService` | unit tests for each service + smoke `RegionTests.RunRegionProcessingTest_200m_Zoom18` |
| C05 Throw-and-discard request body parser | `Program.cs` upload endpoint | not currently exercised by smoke; covered by SEC-04 (malformed JSON) and `--full` upload tests |
| C06..C09 Tile/Region/Route services | `TileService`, `RegionService`, `RouteService`, `GoogleMapsDownloaderV2` | `TileTests.RunGetTileByLatLonTest`, `RegionServiceTests`, `BasicRouteTests.RunSimpleRouteTest`, `ExtendedRouteTests.RunRouteWithTilesZipTest` |
| C10..C12 Magic numbers / enums | constants/enums consumers across services | covered transitively by all unit + smoke tests |
| C13..C15 Dapper / Postgres type handlers | repositories | unit tests for each repository + smoke region pipeline |
| C16..C18 Background services + queue | `RegionProcessingService`, `RouteProcessingService`, `RegionRequestQueue` | smoke region + route tests exercise the queue end-to-end |
| C19 Coverage tooling | build/test infrastructure only | tests themselves unchanged; verified by `scripts/run-tests.sh --smoke` exit code |
| C20 Tile version migration | `TileEntity`, `TileRepository`, migration | smoke `TileTests.RunGetTileByLatLonTest` re-uses tiles across runs; verifies read path |
| C21..C23 API surface (CORS, problem details, idempotency) | `Program.cs` endpoints | SEC-01..04 + smoke region/route POSTs |
| C24..C25 ImageSharp + ZIP utilities | shared utility modules | smoke `ExtendedRouteTests.RunRouteWithTilesZipTest` |
| C26 .editorconfig + analyzers | repo root config | non-functional; verified by build success |
| C27 dotnet format pre-commit | repo root tooling | non-functional; verified by manual hook invocation |
## Decisions
1. **Re-run scope**: smoke only. Full integration was green today and source has not changed since.
2. **Coverage threshold gating**: deferred to C19. The qualitative coverage check (every refactor target has at least one guarding test) passes for all 27 changes.
3. **Per-task test additions**: each refactor task spec already declares any new tests it must add. Those tests are written by the implement skill during Phase 4 execution.
4. **Re-verification trigger**: after each batch of refactor commits, the implement skill runs the relevant test subset; the full smoke suite must be green again before the next batch starts.
## Auto-chain decision
Phase 3 GATE cleared. Auto-chain to Phase 4 (Execution).