satellite-provider/_docs/05_security/infrastructure_review.md
Oleksandr Bezdieniezhnykh 51b572108a
[AZ-484] Cycle 1 Steps 12-16: docs, security, perf, deploy report
Captures the post-implementation autodev gates for AZ-484 multi-source
tile storage:

- Step 12 (Test-Spec Sync): added 7 AC rows (AZ-484 AC-1..AC-7) and a
  PT-07 NFR row to traceability-matrix.md; added PT-07 scenario to
  performance-tests.md.
- Step 13 (Update Docs): refreshed data_model.md (tiles columns +
  indexes + selection rule + UPSERT contract + migrations 012/013),
  module-layout.md (Common/Enums section with L-001 guidance,
  DataAccess imports-from now lists 6 sites), 6 module / component
  docs to reflect the new repo signatures, source/captured_at fields,
  and Dapper enum bypass workaround. ripple_log_cycle1.md records
  zero out-of-scope ripple.
- Step 14 (Security Audit): PASS_WITH_WARNINGS - 0 Critical, 0 High,
  5 Medium, 5 Low. AZ-484 itself added zero new findings. Hardening
  items (Postgres default creds, .env in build context, GMaps key
  rotation, ASP.NET Core 8.0.21 -> 8.0.25, rate limiter) recorded
  for separate tickets.
- Step 15 (Performance Test): all PT-01..PT-07 scenarios Unverified
  (non-blocking); PT-07 baseline-comparison harness deferred to a
  leftover for next cycle.
- Step 16 (Deploy): cycle deploy report covering migration safety,
  rollback path, post-deploy verification, security caveats.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-11 10:03:05 +03:00

Phase 4 — Configuration & Infrastructure Review

Date: 2026-05-11
Scope: Dockerfile (API + integration tests), docker-compose.yml, docker-compose.tests.yml, .dockerignore, .woodpecker/01-test.yml, .woodpecker/02-build-push.yml, appsettings*.json, .env handling.

Findings

I1 — Dockerfile runs as root (Low)

  • Location: SatelliteProvider.Api/Dockerfile (no USER directive)
  • Description: The final stage of the API image inherits root from mcr.microsoft.com/dotnet/aspnet:8.0. The .NET 8 images ship a predefined non-root user (app, UID 1654, exposed as APP_UID) but still default to root unless a USER directive overrides it. Any process compromise, even a low-impact one, therefore runs as uid 0 inside the container.
  • Impact: Container escape primitives (e.g., kernel CVE, sloppy bind-mount of /var/run/docker.sock) become host-root rather than host-uid-1000. The 02-build-push.yml step itself bind-mounts /var/run/docker.sock into the build container — that's a separate concern (build host, not runtime), but it underscores why "least privilege at runtime" matters even on a single-tenant box.
  • Remediation: Add to the final stage:
    RUN adduser --disabled-password --gecos "" --uid 10001 satellite && \
        chown -R satellite:satellite /app
    USER satellite
    
    Also verify ./tiles, ./ready, ./logs host volumes are writable by uid 10001 in deployment manifests.
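
The volume check above could also be enforced at startup. The following is a minimal sketch, not the project's code: the IsWritable helper is hypothetical, and the container paths /app/tiles and /app/ready are assumptions (only /app/logs appears in this review). Environment.IsPrivilegedProcess requires .NET 8 or later.

```csharp
using System;
using System.IO;

// Hypothetical startup probe: warn if the process still runs privileged,
// and if any mounted data directory is not writable by the current user.
if (Environment.IsPrivilegedProcess)
{
    Console.Error.WriteLine("WARNING: process is running with elevated privileges");
}

// Assumed container-side mount points; adjust to the real volume targets.
foreach (var dir in new[] { "/app/tiles", "/app/ready", "/app/logs" })
{
    if (!IsWritable(dir))
    {
        Console.Error.WriteLine($"WARNING: {dir} is missing or not writable");
    }
}

// Creates a throwaway probe file (deleted when the handle closes) to test
// write permission. A missing directory is reported as not writable.
static bool IsWritable(string directory)
{
    try
    {
        var probe = Path.Combine(directory, $".write-probe-{Guid.NewGuid():N}");
        using (File.Create(probe, 1, FileOptions.DeleteOnClose)) { }
        return true;
    }
    catch (Exception ex) when (ex is UnauthorizedAccessException or IOException)
    {
        return false;
    }
}
```

Running this once in the deployed container, under the new uid, surfaces ownership mismatches on the host volumes before the first tile write fails.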

I2 — No security headers middleware (Low)

  • Location: SatelliteProvider.Api/Program.cs (no app.UseSecurityHeaders() / app.Use(headers …) block)
  • Description: API responses do not set X-Content-Type-Options: nosniff, Referrer-Policy: no-referrer, X-Frame-Options: DENY, or HSTS (Strict-Transport-Security); only app.UseHttpsRedirection() is wired. For a JSON-only API whose primary client is not a browser, this is low impact. Separately, no Cache-Control default is set, so intermediary proxies may heuristically cache some error responses.
  • Impact: Limited — JSON-only responses, no cookies, no browser session. The Swagger UI (Development-only) does render HTML; a permissive default there is more of a hygiene issue than a real risk.
  • Remediation: Add a small middleware that sets the standard hardening headers, or install NWebsec.AspNetCore.Middleware for the nosniff / frame-options defaults and wire app.UseHsts() (ASP.NET Core also ships a built-in UseHsts()). Low effort, and no change to API behaviour.
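
A minimal sketch of the inline-middleware option, assuming a standard minimal-hosting Program.cs. The /ping endpoint is a hypothetical placeholder, and the header values are the ones listed in the description above.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Hosting;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Register the headers via OnStarting so they are written just before the
// response headers are flushed, whatever later middleware produced the body.
app.Use(async (context, next) =>
{
    context.Response.OnStarting(() =>
    {
        var headers = context.Response.Headers;
        headers["X-Content-Type-Options"] = "nosniff";
        headers["Referrer-Policy"] = "no-referrer";
        headers["X-Frame-Options"] = "DENY";
        return Task.CompletedTask;
    });
    await next(context);
});

if (!app.Environment.IsDevelopment())
{
    // Built-in HSTS middleware; emits Strict-Transport-Security on HTTPS responses.
    app.UseHsts();
}
app.UseHttpsRedirection();

// Hypothetical endpoint, present only so the sketch runs end to end.
app.MapGet("/ping", () => Results.Ok(new { status = "ok" }));

app.Run();
```

Keeping the headers in one middleware avoids a new package dependency; NWebsec becomes worthwhile mainly if a Content-Security-Policy is added later.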

I3 — No rate limiting on any HTTP endpoint (Medium)

  • Location: SatelliteProvider.Api/Program.cs (no app.UseRateLimiter(), no AddRateLimiter())
  • Description: There is internal concurrency control on outbound Google Maps calls (SemaphoreSlim, MaxConcurrentDownloads), but no inbound rate limiting. An attacker can:
    • Submit N POST /api/satellite/request calls in a tight loop, filling the bounded IRegionRequestQueue (capacity 1000) and DoS-ing the background processor.
    • Submit N GET /api/satellite/tiles/latlon calls with novel lat/lon pairs, forcing the upstream Google Maps quota to drain.
  • Impact: Service-degradation DoS. Combined with finding A01-caveat (no auth), the only protection is the network boundary.
  • Remediation: Wire Microsoft.AspNetCore.RateLimiting (built into .NET 8 — no new package). Conservative starting point:
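    using System.Threading.RateLimiting;   // PartitionedRateLimiter, FixedWindowRateLimiterOptions

    builder.Services.AddRateLimiter(options =>
    {
        // Reject with 429 instead of the middleware's default 503.
        options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
        // One fixed-window bucket per client IP: 60 requests per minute.
        options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
            RateLimitPartition.GetFixedWindowLimiter(
                partitionKey: ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
                factory: _ => new FixedWindowRateLimiterOptions { PermitLimit = 60, Window = TimeSpan.FromMinutes(1) }));
    });

    // After builder.Build():
    app.UseRateLimiter();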
    
    Tune per-endpoint after observing baseline production load.
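
Per-endpoint tuning could look like the sketch below. The policy name "enqueue", the limit of 10 requests per minute, and the placeholder handler are all assumptions for illustration; only the route comes from this review.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Named per-IP policy for the queue-filling POST (name and numbers assumed).
    options.AddPolicy("enqueue", ctx =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromMinutes(1),
            }));
});

var app = builder.Build();
app.UseRateLimiter();

// Placeholder handler; the real endpoint enqueues a region request.
app.MapPost("/api/satellite/request", () => Results.Accepted())
   .RequireRateLimiting("enqueue");

app.Run();
```

If the API sits behind a reverse proxy, RemoteIpAddress is the proxy's address unless forwarded headers are configured, so every client would share one bucket.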

I4 — No security-event logs / alerting (Low)

  • Location: Logging strategy across Program.cs, GlobalExceptionHandler, CorsConfigurationValidator
  • Description: Operational logs are well-structured (Serilog → file rotation; correlationId propagation), but there are no log entries for what would be security-relevant events: validation failures (BadRequest stream), repeated 4xx from a single IP, malformed input bursts, or migration failures. The migration failure path does throw and crash startup (good signal), but this leaves no trail in the file logs.
  • Impact: No way to detect abuse of the unauthenticated endpoints from logs alone. For an internal-only deployment this is acceptable; if the API ever moves toward a less-trusted network, post-deploy log-mining will not be able to reconstruct attack patterns.
  • Remediation: Defer until/unless the trust boundary changes. When required: add a structured log line for each 400/404 (with Method, Path, RemoteIp, correlationId) and a counter for "validation failures per minute per IP".
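
If the trust boundary does change, the log line and counter might be sketched as follows. ClientErrorCounter is a hypothetical in-memory helper, and HttpContext.TraceIdentifier stands in for the project's correlationId, whose propagation mechanism is not shown in this review.

```csharp
using System;
using System.Collections.Concurrent;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();
var counter = new ClientErrorCounter();

// After the rest of the pipeline has run, log every 400/404 with the fields
// named in the remediation, plus a running per-IP count for the current minute.
app.Use(async (context, next) =>
{
    await next(context);

    var status = context.Response.StatusCode;
    if (status is 400 or 404)
    {
        var ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        var countThisMinute = counter.Increment(ip, DateTimeOffset.UtcNow);
        app.Logger.LogWarning(
            "Client error {StatusCode} {Method} {Path} from {RemoteIp}, {CountThisMinute} this minute, correlationId {CorrelationId}",
            status, context.Request.Method, context.Request.Path.Value, ip,
            countThisMinute, context.TraceIdentifier);
    }
});

app.Run();

// Hypothetical helper: counts events per (IP, minute) bucket in memory.
// Old buckets are never evicted here; a real implementation would prune them.
sealed class ClientErrorCounter
{
    private readonly ConcurrentDictionary<(string Ip, long Minute), int> _counts = new();

    public int Increment(string ip, DateTimeOffset now)
    {
        var minute = now.ToUnixTimeSeconds() / 60;
        return _counts.AddOrUpdate((ip, minute), 1, (_, n) => n + 1);
    }
}
```

Because the count is emitted as a structured property, an alert on "CountThisMinute above a threshold" can be built in the log pipeline without adding a metrics stack.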

I5 — .env is NOT in .dockerignore (Medium)

  • Location: .dockerignore (line by line review — no .env entry); SatelliteProvider.Api/Dockerfile:15 (COPY . .)
  • Description: The Dockerfile's COPY . . step copies the entire build context into /src. The build context starts at the repo root, where .env lives. .env IS in .gitignore, so the dev-only Google Maps key never reaches the git repo, but it WILL be baked into the build-stage image layer. The final image does not contain it, because the final stage only does COPY --from=publish /app/publish . and nothing else from the build stage survives. The build stage itself still retains .env, though, and the file is recoverable by anyone who can inspect that intermediate layer.
  • Impact:
    • Anyone with read access to the registry can docker pull <build-stage-tag> (if the build stage is ever exported) and recover the API key from the layer.
    • The final image is safe only by construction: any future Dockerfile change that does COPY . /app instead of COPY --from=publish would silently include the file, and any BuildKit cache that retains the build stage carries it as well.
  • Remediation: Add .env to .dockerignore:
    .env
    .env.*
    !.env.example
    
    This is a trivial fix and complements finding S4.

I6 — docker-compose.yml exposes Postgres on 0.0.0.0:5432 (Medium — duplicate of S2; restated here for infra-domain completeness)

  • Location: docker-compose.yml:9-10
  • Description / Remediation: See S2.

Items checked clean

  • Secrets management in CI: .woodpecker/02-build-push.yml uses from_secret: registry_host / registry_user / registry_token — no plaintext credentials. The docker login step pipes the token via --password-stdin, which avoids leaking the token via process list. ✓
  • Image attribution: build step labels images with org.opencontainers.image.revision, …created, …source — good provenance hygiene. ✓
  • Healthcheck on Postgres: pg_isready -U postgres configured. (Note: relies on the weak default user from S2.)
  • Log volume layout: ./logs:/app/logs mounted; not exposed via the API. ✓
  • Test runner isolation: docker-compose.tests.yml extends the API service (good — same image) but uses restart: "no" so a flapping integration test doesn't loop and amplify load.

Self-verification

  • All Dockerfiles reviewed (Api + IntegrationTests)
  • All CI/CD configs reviewed (.woodpecker/01-test.yml, 02-build-push.yml)
  • All env / config files reviewed (appsettings*.json, .env, docker-compose*.yml)