Update demo replay validation and testing documentation

- Modified the autodev state to reflect the current testing phase and details of the new `jetson-e2e` tests.
- Enhanced the "How to Test" documentation to provide clearer instructions on the demo replay validation process, including video and tlog alignment steps.
- Updated architectural documentation to include the new demo replay operator flow and its dependencies.
- Documented the removal of deprecated auto-sync features and clarified the operator-facing UI for replay validation.
- Added new entries in the dependencies table for upcoming tasks related to the demo replay flow.

These changes improve clarity and usability for operators and developers working with the demo replay system.
Oleksandr Bezdieniezhnykh
2026-06-20 11:24:43 +03:00
parent 12d0008763
commit 1f634c2604
175 changed files with 20701 additions and 41 deletions
@@ -0,0 +1,95 @@
syntax = "proto3";

package satellite.v1;

import "google/protobuf/timestamp.proto";

option csharp_namespace = "Satellite.V1";

service RouteTileDelivery {
  rpc DeliverRouteTiles(DeliverRouteTilesRequest) returns (stream RouteTileEvent);
}

message DeliverRouteTilesRequest {
  RouteSpec route = 1;
  repeated ClientTileRecord client_tiles = 2;
}

message RouteSpec {
  string route_id = 1;
  repeated Waypoint waypoints = 2;
  double region_size_meters = 3;
  int32 zoom = 4;
  repeated GeofencePolygon geofences = 5;
  bool include_geofence_tiles = 6;
}

message Waypoint {
  double lat = 1;
  double lon = 2;
}

message GeofencePolygon {
  repeated Waypoint vertices = 1;
}

message ClientTileRecord {
  int32 z = 1;
  int32 x = 2;
  int32 y = 3;
  double resolution_m_per_px = 4;
  google.protobuf.Timestamp captured_at = 5;
  optional string source = 6;
  bytes content_sha256 = 7;
}

message RouteTileEvent {
  oneof payload {
    RouteManifest manifest = 1;
    TileBatch batch = 2;
    ProgressUpdate progress = 3;
    DeliveryComplete complete = 4;
    DeliveryError error = 5;
  }
}

message RouteManifest {
  uint32 total_candidates = 1;
  uint32 skipped_by_client = 2;
  uint32 to_deliver = 3;
}

message TileBatch {
  uint32 batch_seq = 1;
  repeated TilePayload tiles = 2;
}

message TilePayload {
  int32 z = 1;
  int32 x = 2;
  int32 y = 3;
  double resolution_m_per_px = 4;
  google.protobuf.Timestamp captured_at = 5;
  string source = 6;
  bytes jpeg = 7;
  bytes content_sha256 = 8;
  uint32 route_priority = 9;
}

message ProgressUpdate {
  uint32 delivered = 1;
  uint32 total = 2;
  uint32 downloading = 3;
}

message DeliveryComplete {
  uint32 delivered = 1;
  uint32 skipped_client = 2;
  uint32 skipped_server_filter = 3;
}

message DeliveryError {
  string code = 1;
  string message = 2;
  bool retryable = 3;
}
@@ -0,0 +1,143 @@
# Contract: RouteTileDelivery (gRPC)
**Component**: c11_tilemanager (consumer), satellite-provider (producer)
**Epic**: AZ-976
**ADR**: ADR-013 (architecture.md)
**Proto**: `tile_provision.proto`, `package satellite.v1`
**Version**: 0.3.0
**Status**: proposed
**Last Updated**: 2026-06-19
## Purpose
Operator-side **pre-flight cache provisioning**. Client sends route + onboard tile catalog once; server streams `RouteTileEvent` messages until `DeliveryComplete` or `DeliveryError`.
satellite-provider does **not** receive `flight_id` — that is a C6 bookkeeping concern on the gps-denied side only (`route_id` is the wire correlation id).
C11/C12 run on the **operator workstation** only. Per ADR-004, the airborne image must not import the stubs or open this channel.
## RPC
```protobuf
service RouteTileDelivery {
  rpc DeliverRouteTiles(DeliverRouteTilesRequest) returns (stream RouteTileEvent);
}
```
| Concern | Rule |
|---------|------|
| Auth | gRPC metadata `authorization: Bearer <JWT>` |
| TLS | Required in production; `SATELLITE_PROVIDER_TLS_INSECURE=1` dev knob |
| Idempotency | `RouteSpec.route_id` (UUID string) |
| Resume | Client persists last acked `batch_seq` per `route_id` locally (not on wire) |
## Request
### `DeliverRouteTilesRequest`
| Field | Description |
|-------|-------------|
| `route` | Corridor geometry + single zoom |
| `client_tiles` | Onboard inventory snapshot (route intersection only) |
### `RouteSpec`
| Field | Maps from gps-denied |
|-------|----------------------|
| `route_id` | Client-generated UUID per provision job |
| `waypoints` | `replay_input.tlog_route.RouteSpec.waypoints` |
| `region_size_meters` | `RouteSpec.suggested_region_size_meters` |
| `zoom` | Single slippy zoom level (confirmed sufficient) |
| `geofences` | Optional inclusion polygons |
| `include_geofence_tiles` | Union geofence tiles with corridor grid |
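
The `(z, x, y)` tile key at a single zoom can be derived from a waypoint with the standard slippy-map (Web Mercator XYZ) formula. A minimal sketch, assuming the contract follows that common convention; `waypoint_to_tile` is a hypothetical helper, and how satellite-provider expands `region_size_meters` into a corridor grid is not shown because the contract does not specify it.

```python
import math
from typing import Tuple

# Latitude limit of the Web Mercator projection used by slippy maps.
MAX_MERCATOR_LAT = 85.05112878


def waypoint_to_tile(lat: float, lon: float, zoom: int) -> Tuple[int, int, int]:
    """Return the (z, x, y) slippy tile containing a WGS84 point (assumed convention)."""
    n = 2 ** zoom  # number of tiles per axis at this zoom
    lat = max(min(lat, MAX_MERCATOR_LAT), -MAX_MERCATOR_LAT)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that lon == 180 or lat at the limit stays inside the grid.
    return zoom, min(max(x, 0), n - 1), min(max(y, 0), n - 1)


print(waypoint_to_tile(0.0, 0.0, 1))    # (1, 1, 1)
print(waypoint_to_tile(0.0, 180.0, 2))  # (2, 3, 2)
```
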
### `ClientTileRecord`
Canonical key: **`(z, x, y)`**. `source` is informational only — **not** used in skip logic.
| Field | C6 mapping |
|-------|------------|
| `resolution_m_per_px` | RESTRICT-SAT-4 (lower = better) |
| `captured_at` | `TileMetadata.capture_timestamp` |
| `content_sha256` | `TileMetadata.content_sha256_hex`, decoded from hex to the raw 32 bytes |
## Server skip rule (client catalog)
For each server candidate tile, **omit from stream** when `client_tiles` has matching `(z,x,y)` and **any** of:
1. `client.content_sha256` is non-empty and **equals** server payload hash → skip (byte-identical)
2. `client.resolution_m_per_px <= server.resolution_m_per_px` **and** `client.captured_at >= server.captured_at` → skip (metadata-sufficient)
`source` is **not** compared.
`RouteManifest.skipped_by_client` counts tiles removed by this rule.
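
The skip rule above can be sketched as a pure function. This is an illustration under stated assumptions, not the satellite-provider implementation: `TileMeta`, `should_skip`, and `plan_delivery` are hypothetical names, and the dataclass stands in for the generated proto messages.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class TileMeta:
    """Hypothetical in-memory view of tile metadata (not the generated proto class)."""
    z: int
    x: int
    y: int
    resolution_m_per_px: float
    captured_at: datetime
    content_sha256: bytes = b""


def should_skip(client: Optional[TileMeta], server: TileMeta) -> bool:
    """Return True when the server may omit `server` from the stream."""
    if client is None or (client.z, client.x, client.y) != (server.z, server.x, server.y):
        return False
    # Rule 1: byte-identical content. An empty client hash never matches.
    if client.content_sha256 and client.content_sha256 == server.content_sha256:
        return True
    # Rule 2: the client copy is at least as sharp and at least as recent.
    return (client.resolution_m_per_px <= server.resolution_m_per_px
            and client.captured_at >= server.captured_at)


def plan_delivery(client_tiles: Iterable[TileMeta],
                  candidates: List[TileMeta]) -> Tuple[List[TileMeta], int]:
    """Return (tiles to deliver, skipped_by_client count). `source` is never consulted."""
    catalog = {(t.z, t.x, t.y): t for t in client_tiles}
    to_deliver = [c for c in candidates
                  if not should_skip(catalog.get((c.z, c.x, c.y)), c)]
    return to_deliver, len(candidates) - len(to_deliver)


old = datetime(2026, 1, 1, tzinfo=timezone.utc)
new = datetime(2026, 6, 1, tzinfo=timezone.utc)
server_tile = TileMeta(18, 10, 20, 0.5, new, b"\x01" * 32)
stale_client = TileMeta(18, 10, 20, 0.3, old)  # sharper but older: not skipped
print(should_skip(stale_client, server_tile))  # False
```
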
## Sector — not on this wire
**Sector** (`active_conflict` vs `stable_rear`) controls **how stale a tile may be before C6 rejects it on write** (AC-NEW-6 freshness). It is an operator decision about the geographic area, not something satellite-provider needs to deliver tiles.
| Layer | Who applies sector |
|-------|-------------------|
| satellite-provider | Does not need sector — streams tiles by route geometry |
| C11 client write | Reads sector from **C11/C12 config** (same as today) when calling C6 freshness gate |
No `SectorClass` field on the gRPC request.
## Response stream: `RouteTileEvent`
Typical sequence:
1. **`RouteManifest`** — `total_candidates`, `skipped_by_client`, `to_deliver`
2. **`TileBatch`** — monotonic `batch_seq`; on-disk hits first, then freshly fetched
3. **`ProgressUpdate`** — optional
4. **`DeliveryComplete`** or **`DeliveryError`**
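
The typical sequence above can be checked on the client with a small order validator. A minimal sketch, assuming events are reduced to `(kind, batch_seq)` pairs (a hypothetical shape, not the generated proto classes), that `batch_seq` is strictly increasing within one stream, and that `DeliveryError` may end the stream at any point, including before the manifest. The contract states none of these three assumptions explicitly.

```python
from typing import Iterable, Optional, Tuple

Event = Tuple[str, Optional[int]]  # (kind, batch_seq); batch_seq is None unless kind == "batch"


def check_event_order(events: Iterable[Event]) -> Optional[str]:
    """Return None when the order is acceptable, else a description of the first problem."""
    seen_manifest = False
    last_seq: Optional[int] = None
    terminal: Optional[str] = None
    for kind, batch_seq in events:
        if terminal is not None:
            return f"{kind} after terminal {terminal}"
        if kind == "error":
            terminal = kind  # assumed: an error may arrive at any point
        elif kind == "manifest":
            if seen_manifest:
                return "duplicate manifest"
            seen_manifest = True
        elif not seen_manifest:
            return f"{kind} before manifest"
        elif kind == "batch":
            if last_seq is not None and batch_seq <= last_seq:
                return f"batch_seq {batch_seq} not greater than {last_seq}"
            last_seq = batch_seq
        elif kind == "complete":
            terminal = kind
        elif kind != "progress":
            return f"unknown event kind {kind}"
    return None if terminal else "stream ended without complete or error"


stream = [("manifest", None), ("batch", 0), ("progress", None), ("batch", 1), ("complete", None)]
print(check_event_order(stream))  # None
```
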
### `DeliveryComplete` counters
| Field | Meaning |
|-------|---------|
| `delivered` | Tiles actually sent in `TileBatch` messages |
| `skipped_client` | Same as manifest `skipped_by_client` (echo for client verify) |
| `skipped_server_filter` | Tiles SP required but **did not send** after client dedup — see below |
#### `skipped_server_filter` — what counts
Tiles that entered the post-client-dedup work queue but never appeared in a batch:
| Reason | Example |
|--------|---------|
| **Fetch failed** | External imagery provider 404/timeout after retries |
| **Below SP min resolution** | SP refuses to store/serve below its configured floor |
| **Geometry clip** | Tile dropped after server-side corridor/geofence validation |
| **Operational cap** | Job hit max-tiles / rate limit (if SP enforces) |
Tiles skipped by the **client catalog rule** are **not** included here (they are `skipped_client`).
If SP has no server-side filters in v1, `skipped_server_filter` may be **0**; the field is reserved for observability.
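
A client can cross-check the counters once `DeliveryComplete` arrives. A minimal sketch with hypothetical names (`Manifest`, `Complete`, `reconcile`). Only the `skipped_client` echo is stated by this contract; the two sum identities are assumptions inferred from the field descriptions and are labeled as such in the code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Manifest:
    total_candidates: int
    skipped_by_client: int
    to_deliver: int


@dataclass(frozen=True)
class Complete:
    delivered: int
    skipped_client: int
    skipped_server_filter: int


def reconcile(manifest: Manifest, complete: Complete, tiles_received: int) -> List[str]:
    """Return a list of human-readable counter mismatches (empty when consistent)."""
    problems = []
    # Stated by the contract: skipped_client echoes the manifest value.
    if complete.skipped_client != manifest.skipped_by_client:
        problems.append("skipped_client does not echo manifest.skipped_by_client")
    # Stated by the contract: delivered counts tiles actually sent in batches.
    if complete.delivered != tiles_received:
        problems.append("delivered does not match tiles received in batches")
    # ASSUMPTION: candidates split into client-skipped and to-deliver.
    if manifest.total_candidates != manifest.skipped_by_client + manifest.to_deliver:
        problems.append("total_candidates != skipped_by_client + to_deliver")
    # ASSUMPTION: every queued tile is either delivered or server-filtered.
    if manifest.to_deliver != complete.delivered + complete.skipped_server_filter:
        problems.append("to_deliver != delivered + skipped_server_filter")
    return problems


print(reconcile(Manifest(10, 4, 6), Complete(5, 4, 1), tiles_received=5))  # []
```
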
### `TilePayload`
| Field | Notes |
|-------|-------|
| `content_sha256` | 32-byte SHA-256 of `jpeg`; matches C6 DB invariant |
| `route_priority` | Lower = earlier along route |
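
The `content_sha256` invariant can be verified on receipt with the standard library. A minimal sketch; `verify_tile_hash` and `to_c6_hex` are hypothetical helper names, and the hex form is assumed to be what `TileMetadata.content_sha256_hex` stores.

```python
import hashlib


def verify_tile_hash(jpeg: bytes, content_sha256: bytes) -> bool:
    """Check that content_sha256 is the raw 32-byte SHA-256 digest of the JPEG bytes."""
    if len(content_sha256) != 32:
        return False  # e.g. a hex string was sent instead of the raw digest
    return hashlib.sha256(jpeg).digest() == content_sha256


def to_c6_hex(content_sha256: bytes) -> str:
    """Convert the wire digest to lowercase hex (assumed C6 storage form)."""
    return content_sha256.hex()


payload = b"\xff\xd8 not a real jpeg \xff\xd9"
digest = hashlib.sha256(payload).digest()
print(verify_tile_hash(payload, digest))            # True
print(bytes.fromhex(to_c6_hex(digest)) == digest)   # True
```
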
## Client write path (gps-denied)
`RouteTileDeliveryClient` (C11):
- Assigns C6 `flight_id` from operator context locally (not from SP)
- Applies RESTRICT-SAT-4, **sector-based freshness**, AZ-308 budget, download journal
- Resumes via persisted `route_id` + `batch_seq`
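
The local resume state (last acked `batch_seq` per `route_id`) could be persisted as below. A minimal sketch: `ResumeJournal`, the JSON file layout, and the file path are all hypothetical, and how C11 uses the stored value when it reconnects is deliberately left out because the contract does not define it.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Dict, Optional


class ResumeJournal:
    """Hypothetical on-disk map of route_id -> last acked batch_seq."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._state: Dict[str, int] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def last_acked(self, route_id: str) -> Optional[int]:
        return self._state.get(route_id)

    def ack(self, route_id: str, batch_seq: int) -> None:
        """Record a batch as fully written, then persist atomically."""
        self._state[route_id] = batch_seq
        # Write to a temp file and rename so a crash cannot leave a half-written journal.
        tmp = self._path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self._state))
        os.replace(tmp, self._path)


journal_path = Path(tempfile.mkdtemp()) / "route_journal.json"
journal = ResumeJournal(journal_path)
journal.ack("3f2c9a", 7)  # placeholder route_id; real ids are UUID strings
print(ResumeJournal(journal_path).last_acked("3f2c9a"))  # 7
```
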
## Migration
REST `route_client` + `HttpTileDownloader` remain the fallback until the AZ-979 benchmark.
## Change log
| Version | Date | Change |
|---------|------|--------|
| 0.3.0 | 2026-06-19 | `ClientTileRecord.content_sha256`; sequential field nums on `TilePayload`; sector/flight_id off wire; skip rule + `skipped_server_filter` defined |
| 0.2.0 | 2026-06-19 | `satellite.v1.RouteTileDelivery` + `RouteTileEvent` oneof |
| 0.1.0 | 2026-06-19 | Initial draft (superseded) |
@@ -289,7 +289,9 @@ The two **invalid** cells (`true` + `eskf` and `false` + `gtsam_isam2`) raise `C
**Sub-invariant 14.c (auto-sync deprecation — AZ-895)**: the `replay_input.auto_sync` module (AZ-405) is reduced to a deprecated no-op stub that raises `ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")` from every public entry point. The CLI flags `--time-offset-ms`, `--skip-auto-sync`, and `--auto-trim` are accepted with a deprecation warning and ignored. The justification: with a single canonical clock at the CSV row level (14.a), there is no second clock to align against — the operator authors the CSV with the correct row-0 alignment, and the fixture verifies row 0's `Time == 0`. Hard removal of the deprecated surface is tracked in AZ-908; this cycle ships only the stub + warnings to preserve source-compat for any downstream caller built against AZ-405's pre-deprecation shape.
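
The stub-plus-warning shape described in 14.c might look like the sketch below. Everything except the error message and the flag names is an assumption: `auto_sync` as the entry-point name, the local `ReplayInputAdapterError` stand-in, the flag types, and the `parse_args` helper are hypothetical and do not reproduce the real `replay_input` module or CLI.

```python
import argparse
import warnings
from typing import List


class ReplayInputAdapterError(Exception):
    """Stand-in for the project's error type of the same name."""


def auto_sync(*_args, **_kwargs):
    """Deprecated no-op stub: every public entry point raises."""
    raise ReplayInputAdapterError("auto-sync removed; supply --imu CSV instead")


DEPRECATED_FLAGS = ("--time-offset-ms", "--skip-auto-sync", "--auto-trim")


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="gps-denied-replay")
    parser.add_argument("--video", required=True)
    parser.add_argument("--imu", required=True)
    # Flag types are assumed; the flags are still accepted for source compatibility.
    parser.add_argument("--time-offset-ms", type=float, default=None)
    parser.add_argument("--skip-auto-sync", action="store_true")
    parser.add_argument("--auto-trim", action="store_true")
    args = parser.parse_args(argv)
    for flag in DEPRECATED_FLAGS:
        attr = flag.lstrip("-").replace("-", "_")
        value = getattr(args, attr)
        if value is None or value is False:
            continue  # flag not supplied
        warnings.warn(f"{flag} is deprecated and ignored", DeprecationWarning, stacklevel=2)
        delattr(args, attr)  # ignored: downstream code never sees the value
    return args


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ns = parse_args(["--video", "flight.mp4", "--imu", "imu.csv", "--auto-trim"])
print(len(caught), hasattr(ns, "auto_trim"))  # 1 False
```
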
**Sub-invariant 14.d (operator-facing UI — AZ-897, future cycle)**: the cycle-4 deliverable is the headless `gps-denied-replay --video X --imu Y` shape. An operator-facing web UI (single-page React + Tailwind form that uploads a paired `(video, CSV)` and tails the verdict) is tracked separately in AZ-897 and is NOT on the critical path of the CSV redesign; this sub-invariant exists only to record that the format spec (AZ-896) and the CSV adapter (AZ-894) MUST stay UI-friendly (CSV example, format docs link, clear error messages on row-0-misalignment) so AZ-897 lands without contract drift.
**Sub-invariant 14.d (operator-facing UI — AZ-897, superseded by Invariant 15)**: retained for historical cycle-4 CSV-only upload spec. Default demo entry is now F11 / AZ-969.
15. **Operator demo replay path (cycle 5 — AZ-969 / F11)**: the default product demo accepts raw `(video, tlog, calibration)` from the suite UI. Alignment is operator-visible (dual timeline bars + explicit refine); the backend exports an AZ-896 CSV whose `Time` column is the single canonical replay clock (Invariant 14.a). Steps: preview timelines (AZ-970) → coarse align + refine (AZ-897, AZ-971) → export CSV (AZ-972) → seed corridor cache from tlog GPS (AZ-974) → run `gps-denied-replay` (AZ-973) → map + verdict. The `(video, pre-authored CSV)` bypass (AZ-959) is optional, not default. E2E tests MUST use the same orchestration modules as production — no parallel test-only graph. AZ-908 (hard removal of alignment stubs) is deferred until AZ-971 ships.
## Producer / Consumer Split