Files
satellite-provider/_docs/02_document/architecture.md
T
Oleksandr Bezdieniezhnykh af4219fce6
ci/woodpecker/push/01-test Pipeline was successful
ci/woodpecker/push/02-build-push Pipeline was successful
[AZ-500] Cycle 4 Steps 12-15 sync (test-spec / docs / security / perf)
Step 12 (Test-Spec Sync) - cycle-update mode
  - traceability-matrix: 8 AZ-500 AC rows + .NET 10 runtime
    restriction supersession + Cycle-4 coverage shape note
    (no new tests; ACs verified by re-running existing 78-test
    suite + build pipeline + manifest grep)

Step 13 (Update Docs) - task mode
  - FINAL_report, 00_discovery, architecture, module-layout,
    api_program, tests_unit: .NET 8 -> .NET 10 / C# 12 -> 14 /
    Swashbuckle 6.6.2 -> 10.1.7 + Microsoft.OpenApi 2.x
    refactor note in api_program; Serilog.AspNetCore 8.0.3
    fallback documented inline per AZ-500 Risk #4
  - deployment/{containerization, ci_cd_pipeline}: Docker
    aspnet/sdk:8.0 -> :10.0
  - ripple_log_cycle4: empty import-graph ripple recorded
    (Program.cs is entry point; ParameterDescriptionFilter only
    consumed by Program.cs; csproj/global.json/Dockerfile have
    no import edges)

Step 14 (Security Audit) - resume mode
  - dependency_scan_cycle4: AZ-500 19-package delta scanned;
    cycle-3 D1+D3 (CVE-2026-26130) closed by major-version
    bump; cycle-3 D2 (Test.Sdk 17.8.0 NuGet.Frameworks flag)
    carried over - explicitly out of AZ-500 scope
  - security_report_cycle4: PASS_WITH_WARNINGS (only carry-over
    Medium open; AZ-500 introduced 0 new Critical/High); cycle-3
    static_analysis/owasp_review/infrastructure_review carried
    forward unchanged (AZ-500 made no source-level edits to
    those surfaces)

Step 15 (Performance Test) - perf mode, full default-param run
  - perf_2026-05-12_cycle4: 7 Pass + 1 Unverified (PT-08 hit
    pre-existing scripts/run-performance-tests.sh:417 grep-
    pipefail bug, NOT a .NET 10 regression)
  - PT-07 warm p95 = 301ms (7.7x improvement vs cycle-3 short
    variant - .NET 10 pipeline + N=20 dilution); cold p95 =
    2782ms (-14%); PT-06 90ms (-49%)
  - AZ-500 NFR (Performance) MET for 7/8 scenarios
  - Cycle-3 perf-harness leftover updated with replay #3
    results; STAYS OPEN per AZ-500 Constraint (deletes only on
    fully clean run)

Recommended follow-up PBIs (out of cycle-4 scope, surfaced for
the backlog):
  - 1 SP fix scripts/run-performance-tests.sh:416-417 grep-
    pipefail (replace grep -o ... | wc -l with grep -c ... ||
    true) - unblocks PT-08 + closes the cycle-3 perf leftover
  - 3 SP migrate WithOpenApi(...) callsites to ASP.NET Core 10
    minimal-API metadata extensions (clears 8 ASPDEPR002
    warnings; recorded in batch_01_cycle4_review.md)
  - 1 SP Microsoft.OpenApi 2.x nullable cleanup (CS8604 in
    ParameterDescriptionFilter.cs:25)
  - 1 SP bump Microsoft.NET.Test.Sdk 17.8.0 -> 17.13.0+
    (closes cycle-3 D2 NuGet.Frameworks transitive flag)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 06:05:29 +03:00

201 lines
16 KiB
Markdown

# Satellite Provider — Architecture
## Architecture Vision
Satellite Provider is a self-hosted .NET 10 backend service that pre-downloads, caches, and composites Google Maps satellite imagery for offline use. It runs as a single containerized monolith with PostgreSQL, processing requests asynchronously via in-process queues. The dominant pattern is a layered architecture (API → Services → DataAccess → PostgreSQL) with background hosted services for long-running work.
**Components & responsibilities** (each owns its own `.csproj` since AZ-309):
- **Common** (`SatelliteProvider.Common`) — Shared contracts: DTOs, service interfaces, common exceptions, configuration models, geospatial math
- **DataAccess** (`SatelliteProvider.DataAccess`) — PostgreSQL persistence via Dapper + DbUp migrations
- **TileDownloader** (`SatelliteProvider.Services.TileDownloader`) — Provider-agnostic tile acquisition via `ISatelliteDownloader` interface (first implementation: Google Maps) with deduplication, concurrency control, and an in-memory tile-byte cache owned by `TileService`
- **RegionProcessing** (`SatelliteProvider.Services.RegionProcessing`) — Batch tile downloads for geographic areas, stitching, output generation
- **RouteManagement** (`SatelliteProvider.Services.RouteManagement`) — Route interpolation, geofenced region generation, consolidated map output
The three Layer-3 service components are compile-time siblings: each only references `SatelliteProvider.Common` and `SatelliteProvider.DataAccess`. Cross-component runtime calls flow exclusively through interfaces in `SatelliteProvider.Common.Interfaces`.
**Major data flows**:
- *Tile acquisition*: HTTP request → cache check → Google Maps download → disk + DB persistence
- *Region processing*: Request queued → background worker calculates tile grid → downloads all tiles → produces CSV/summary/stitched image
- *Route expansion*: Waypoints → interpolated points every ~200m → geofence filtering → region requests per point → optional ZIP archive
**Architectural principles** (inferred):
- Single-instance deployment, no horizontal scaling requirements (`inferred-from: Channel-based queue, no distributed state`)
- Append-by-source tile storage — multiple producers (Google Maps, UAV upload, future SatAR, …) can each persist a row per `(latitude, longitude, tile_zoom, tile_size_meters)` cell. Reads return the most-recent row across sources, ordered by `captured_at DESC` with deterministic `(updated_at DESC, id DESC)` tie-breaks. The single-row-per-cell-per-source invariant is enforced by the 5-column unique index `idx_tiles_unique_location_source` introduced in migration 013 (AZ-484). The `tiles.version` column is vestigial since AZ-357 dropped year-based cache invalidation in favour of cell-level overwrite. (`inferred-from: tiles table + AZ-484/AZ-357 migrations + tile-storage contract v1.0.0`)
- Fire-and-forget async processing with status polling (`inferred-from: queue + background service + status endpoint`)
- JWT-validated callers only — every HTTP endpoint requires a valid HS256-signed Bearer token, validated locally against a shared `JWT_SECRET` per the suite-level auth contract (`suite/_docs/10_auth.md`). Issuer/audience are intentionally not validated yet; signature + lifetime + ≥32-byte key are. Per-endpoint permission claims (e.g. `permissions: ["GPS"]` on the UAV upload) layer on top of this baseline.
**Authentication & Authorization** (AZ-487):
- Validation library: `Microsoft.AspNetCore.Authentication.JwtBearer` 10.0.7 (matches `Microsoft.AspNetCore.OpenApi` 10.0.7; AZ-496 bumped both packages from 8.0.21 → 8.0.25 in cycle 3 to close the cycle-1 D1 + cycle-2 D3 supply-chain findings, then AZ-500 bumped both 8.0.25 → 10.0.7 in cycle 4 as part of the .NET 8 → .NET 10 migration). The `TokenValidationParameters` shape is unchanged across the JwtBearer 8 → 10 jump — AZ-487/AZ-494 integration tests are the gate and all pass on .NET 10.
- Signing key: read from the `JWT_SECRET` environment variable (preferred) or the `Jwt:Secret` configuration key. Startup fails fast if the resolved secret is unset, empty, or shorter than 32 bytes (HMAC-SHA256 minimum per RFC 2104 §3).
- Token contract: `ValidateIssuerSigningKey = true`, `ValidateLifetime = true`, `RequireSignedTokens = true`, `RequireExpirationTime = true`, `ValidateIssuer = true` + `ValidIssuer = $JWT_ISSUER`, `ValidateAudience = true` + `ValidAudience = $JWT_AUDIENCE` (AZ-494), `ClockSkew = 30s`. The 5-minute JwtBearer default is intentionally tightened.
- Authorization model: every endpoint registered in `Program.cs` is decorated with `.RequireAuthorization()`. AZ-488 adds `permissions`-claim policies on top of this baseline (UAV upload requires `GPS`).
- Test infrastructure: `JwtTokenFactory` (unit tests) and `JwtTestHelpers` (integration tests) mint deterministic tokens against the same `JWT_SECRET`; the integration test runner attaches a default Bearer token to its shared `HttpClient` so legacy non-auth tests continue to exercise the protected endpoints unchanged.
**Planned features** (confirmed by user, currently stubs):
- MGRS endpoint — tile access via Military Grid Reference System coordinates
**Multi-source tile producers** (live as of AZ-488):
- *Google Maps* — `TileService.DownloadAndStoreTilesAsync` / `DownloadAndStoreSingleTileAsync` stamp `source='google_maps'` on every persisted row; tile JPEGs live under `{StorageConfig.TilesDirectory}/{zoom}/...` per the legacy grandfathered layout.
- *UAV* — `POST /api/satellite/upload` (AZ-488) accepts a multipart batch of UAV-captured tiles, runs each item through a 5-rule quality gate (`UavTileQualityGate`), and persists accepted items via `ITileRepository.InsertAsync` with `source='uav'`. UAV JPEGs live under `{StorageConfig.TilesDirectory}/uav/{zoom}/{x}/{y}.jpg`. Requires the `GPS` permission claim on top of the JWT baseline.
The N-source storage contract is authoritative in `_docs/02_document/contracts/data-access/tile-storage.md` (v1.0.0). The UAV upload contract is authoritative in `_docs/02_document/contracts/api/uav-tile-upload.md` (v1.0.0). Anything that reads or writes `tiles` MUST follow those contracts rather than re-deriving the rules from prose here.
**Drift signals**:
- `geofence_polygons` mentioned in AGENTS.md as a routes table column but does not exist in schema or entity — documentation drift
## 1. System Context
**Problem being solved**: A GPS-denied UAV navigation service requires satellite imagery for positioning and route planning without GPS. This service pre-downloads satellite tiles from one or more imagery sources (currently Google Maps; future sources including UAV nadir camera upload and additional providers such as SatAR) for specified regions and routes, stores them alongside each other under a per-source storage key, and serves the most-recent row across sources on access. Tiles are stitched into composite images and packaged for offline use.
**System boundaries**: The Satellite Provider is a self-contained backend service. It receives HTTP requests (region/route definitions), downloads tiles from Google Maps, stores them on disk and in PostgreSQL, and produces output files (images, CSVs, ZIPs).
**External systems**:
| System | Integration Type | Direction | Purpose |
|--------|-----------------|-----------|---------|
| Satellite imagery provider (e.g., Google Maps) | HTTPS (tile download) | Outbound | First implementation of the multi-source `tiles` storage; provider-agnostic via `ISatelliteDownloader`. Stamps `source='google_maps'` on every persisted row. |
| GPS-Denied Service (UAV) | REST API (multipart) | Inbound | Producer of `source='uav'` rows via `POST /api/satellite/upload` (AZ-488). Authenticates with a JWT carrying the `GPS` permission claim; items pass through the 5-rule quality gate before persistence. |
| PostgreSQL | TCP (Npgsql) | Both | Tile metadata, region/route state |
| File System | Local disk | Both | Tile image storage, output artifacts |
| HTTP Clients | REST API | Inbound | Region/route requests, tile queries |
## 2. Technology Stack
| Layer | Technology | Version | Rationale |
|-------|-----------|---------|-----------|
| Language | C# | 14.0 (was 12.0 through cycle 3 — AZ-500) | .NET ecosystem, strong typing |
| Framework | ASP.NET Core (Minimal API) | 10.0 (was 8.0 through cycle 3 — AZ-500) | Lightweight HTTP hosting |
| Database | PostgreSQL | 15+ | Reliable RDBMS, spatial-friendly |
| ORM | Dapper | latest | Micro-ORM, raw SQL control |
| Migrations | DbUp | latest | Simple SQL-file-based schema migrations |
| Image Processing | SixLabors.ImageSharp | 3.1.11 | Cross-platform image manipulation |
| Logging | Serilog | 8.0.3 | Structured logging with file sinks |
| Hosting | Docker (docker-compose) | — | Containerized deployment |
| CI/CD | Woodpecker CI | — | Lightweight self-hosted CI |
## 3. Deployment Model
**Environments**: Development (docker-compose), Production (Docker)
**Infrastructure**:
- Docker-based containerized deployment
- PostgreSQL as a separate container
- Shared volumes for tile storage and output artifacts
- No cloud provider dependency (self-hosted capable)
**Environment-specific configuration**:
| Config | Development | Production |
|--------|-------------|------------|
| Database | localhost:5432 (Docker) | Container network `db:5432` |
| Secrets | appsettings.Development.json | Environment variables |
| Logging | Console + File | File (./logs/) |
| API URL | http://localhost:5100 | http://0.0.0.0:5100 |
## 4. Data Model Overview
**Core entities**:
| Entity | Description | Owned By Component |
|--------|-------------|--------------------|
| Tile | A single satellite image tile with coordinates and zoom | TileDownloader |
| Region | A square area request with processing status | RegionProcessing |
| Route | A named path with geofence polygons | RouteManagement |
| RoutePoint | An individual point (original or interpolated) on a route | RouteManagement |
**Key relationships**:
- Route → RoutePoint: one-to-many (a route has many sequential points)
- Route → Region: many-to-many via `route_regions` (each route point generates a region)
- Region → Tile: implicit (a processed region references tiles by coordinate/zoom)
**Data flow summary**:
- Client → API → Queue → BackgroundService → GoogleMaps → FileSystem + DB: tile acquisition pipeline
- Client → API → RouteService → PointInterpolation → RegionCreation → Queue: route-to-region expansion
## 5. Integration Points
### Internal Communication
| From | To | Protocol | Pattern | Notes |
|------|----|----------|---------|-------|
| WebApi | RegionProcessing | In-process queue (Channel) | Fire-and-forget | Request queued, status polled. Uses `IRegionService` / `IRegionRequestQueue` from Common. |
| WebApi | TileDownloader | `ITileService` (Common interface) | Request-Response | Single-tile reads (`GetOrDownloadTileAsync`) and writes (`DownloadAndStoreSingleTileAsync`) flow through `ITileService` since AZ-310 / AZ-311. No direct dependency on the concrete `GoogleMapsDownloaderV2`. |
| RegionProcessing | TileDownloader | `ITileService` (Common interface) | Request-Response | Per-tile within region processing. Resolved through DI; no compile-time `ProjectReference` between RegionProcessing and TileDownloader csprojs. |
| RouteManagement | RegionProcessing | `IRegionService` / `IRegionRequestQueue` (Common interfaces) | Fire-and-forget | Route regions submitted to queue. No compile-time `ProjectReference` between RouteManagement and RegionProcessing csprojs. |
| All Services | DataAccess | Direct method call (via repository interfaces) | Repository pattern | Dapper queries |
### External Integrations
| External System | Protocol | Auth | Rate Limits | Failure Mode |
|----------------|----------|------|-------------|--------------|
| Satellite imagery provider (abstracted via `ISatelliteDownloader`; first implementation: Google Maps) | HTTPS GET | Provider-specific (e.g., session token) | Configured concurrency (MaxConcurrentDownloads) | Retry with backoff, mark region failed |
## 6. Non-Functional Requirements
| Requirement | Target | Measurement | Priority |
|------------|--------|-------------|----------|
| Concurrent Downloads | 4 (configurable) | SemaphoreSlim limit | High |
| Concurrent Regions | 20 (configurable) | Processing config | Medium |
| Queue Capacity | 1000 requests | Channel bounded capacity | Medium |
| Tile Deduplication | 100% (no re-download) | DB lookup before fetch | High |
| Max Zip Size | 50 MB | Route zip output | Medium |
## 7. Security Architecture
**Authentication**: HS256 JWT Bearer tokens (AZ-487 + AZ-494). Signing key from `JWT_SECRET` env var (≥ 32 bytes, validated at startup). Issuer and audience claims are validated against `JWT_ISSUER` / `JWT_AUDIENCE` env vars (AZ-494) — both required, fail-fast at startup if unset. `Microsoft.AspNetCore.Authentication.JwtBearer` validates signature, lifetime, signing key, issuer, and audience. ClockSkew tightened from JwtBearer default (5 min) to 30 s. Tokens are minted by the centralized Admin API per `suite/_docs/10_auth.md`; their `iss` and `aud` claims MUST match the satellite-provider configured values or validation rejects with 401.
**Authorization**: Every endpoint requires authentication via `.RequireAuthorization()`. Permission-claim enforcement is layered on top through the `PermissionsRequirement` authorization handler, which reads the `permissions` claim (accepting either repeated string claims OR a single JSON-array string). AZ-488 wires the `RequiresGpsPermission` policy on `POST /api/satellite/upload` — callers without `GPS` receive HTTP 403; other endpoints accept any authenticated principal.
**Data protection**:
- At rest: No encryption (tiles stored as plain JPEG files)
- In transit: HTTPS for Google Maps calls; API itself runs HTTP behind Kestrel (TLS termination is a deployment-layer concern)
- Secrets management: `JWT_SECRET` and `GOOGLE_MAPS_API_KEY` from environment variables / `.env` (gitignored); `.env.example` documents the required keys. Production deployments MUST supply both via the host environment, never via the appsettings files.
**Audit logging**: Serilog writes to file; logs exceptions and processing state transitions. 401/403 responses are emitted by the JwtBearer middleware via the `WWW-Authenticate` header; no body leakage of internal details.
## 8. Key Architectural Decisions
### ADR-001: Minimal API over Controller-based
**Context**: Project needed a lightweight HTTP layer for a small set of endpoints.
**Decision**: Use ASP.NET Core Minimal APIs (no controllers, no MVC).
**Consequences**: Less ceremony, all routing in `Program.cs`, but less structure for future growth.
### ADR-002: Dapper over Entity Framework
**Context**: Database access is straightforward CRUD with some spatial queries.
**Decision**: Use Dapper for raw SQL control and performance, paired with DbUp for schema migrations.
**Consequences**: Full SQL control, no ORM overhead; trade-off is manual mapping and no change tracking.
### ADR-003: In-Process Queue over External Message Broker
**Context**: Region/route processing needs to be asynchronous but the system is a single service.
**Decision**: Use `System.Threading.Channels` as an in-process bounded queue.
**Consequences**: Simple, no external dependencies; but limited to single-instance deployment — no horizontal scaling of workers.
### ADR-004: File-Based Tile Storage
**Context**: Tiles are immutable JPEG images that need fast random access.
**Decision**: Store tiles as files in a directory hierarchy with metadata in PostgreSQL. The layout is per-source so the bytes for `google_maps` and `uav` writes for the same cell remain individually addressable on disk:
- Google Maps (legacy, grandfathered): `{StorageConfig.TilesDirectory}/{zoom}/{x_bucket}/{y_bucket}/tile_{zoom}_{x}_{y}_{timestamp}.jpg`
- UAV (AZ-488): `{StorageConfig.TilesDirectory}/uav/{zoom}/{x}/{y}.jpg`
The authoritative source marker is the `tiles.source` column; the per-source on-disk path matters only for write isolation between producers.
**Consequences**: Fast reads, easy backup/migration, both producers can run without colliding on bytes, but requires shared filesystem for multi-instance (which is not currently needed). No migration of pre-AZ-488 Google Maps files is shipped — the legacy layout stays intact.
### ADR-005: Background Hosted Services for Processing
**Context**: Region and route processing is long-running and should not block HTTP requests.
**Decision**: Use `IHostedService` implementations that consume from the in-process queue.
**Consequences**: Clean separation of request handling and processing; lifecycle managed by the host.