Data Model
Date: 2026-05-09 (Plan Phase 2b — initial draft).

Inputs: `_docs/02_document/architecture.md` § 4 (Data Model Overview), § 5, § 7; `_docs/02_document/glossary.md`; `_docs/00_problem/{acceptance_criteria,restrictions}.md`; `_docs/01_solution/solution.md`; parent-suite `satellite-provider` schema (mirrored target for byte-identical post-landing upload).

Scope: persistent and on-disk artifacts produced or consumed by the onboard companion, the operator-side pre-flight tooling (C12), the operator-side Tile Manager (C11 — pre-flight tile download + post-landing tile upload), and the FDR (C13). In-flight runtime DTOs (per-frame estimates, IMU windows, etc.) are summarised at the end of this document but are NOT persistent and NOT in the migration scope.
1. Overview
The onboard system is mostly streaming / in-memory: the per-frame pipeline (NavCameraFrame → C1 + C2 → C2.5 → C3/C3.5 → C4 → C5 → C8) operates entirely on in-process Python DTOs and never persists those frames or their derived per-frame estimates as relational rows. What IS persistent splits into four backing stores; the data model below is the union of all four.
| Store | Technology | What lives here | Persistence scope |
|---|---|---|---|
| Relational | PostgreSQL 16 (mirror of `satellite-provider`'s schema; one DB per companion) | `tiles` (mirrored), `flights`, `sector_classifications`, `manifests`, `engine_cache_entries` | Across flights; survives reboot |
| Filesystem (tile cache) | Local NVM (`./tiles/{zoomLevel}/{x}/{y}.jpg` + `./tiles/{zoomLevel}/{x}/{y}.json` sidecar) | Tile JPEG bodies; per-tile metadata sidecar | Across flights; survives reboot |
| Filesystem (artifacts) | Local NVM | FAISS HNSW `.index`; TRT engines + INT8 calibration caches; camera calibration JSON; manifest JSON | Across flights; rebuildable from the relational tables + `satellite-provider` |
| Filesystem (FDR) | Local NVM, ≤ 64 GB ring per flight | `FdrRecord` segments; AC-8.5 ≤ 0.1 Hz failed-tile thumbnails | Per flight; rolling, oldest-segment-dropped on overrun |
Architecture-driven principles (carried verbatim from architecture.md Principle list):
- The persistent tile schema MUST stay byte-compatible with `satellite-provider`'s so post-landing upload (D-PROJ-2) is a copy, not a transformation. Onboard-only fields are added as additive columns; the canonical Google-Maps-sourced columns must keep their semantics and types.
- No raw-frame storage (AC-8.5). The only image persistence path is the orthorectified `Tile` (mid-flight or pre-flight). The single forensic exception is the AC-8.5 ≤ 0.1 Hz failed-tile thumbnail log inside the FDR.
- The camera calibration artifact is the ONLY way camera-specific math enters the system (Principle #1). Test fixtures (`adti26.json`) and production deployments (`adti20.json`) load different artifacts on the same code path; the schema MUST treat `adti20` vs `adti26` as data, not as a code branch.
- No persistent secrets on the airborne companion image. The MAVLink-2.0 signing key and the per-flight onboard tile-signing key are generated at takeoff load, logged to the FDR, and discarded with the FDR ring. No long-lived secrets table.
- Append-only / additive-only by default. Migrations may add tables, columns, indexes, and CHECK constraints, but MUST NOT rename or drop existing columns without an ADR-recorded deprecation window. The `tiles` schema specifically is frozen on the canonical columns and only extensible via additive onboard-only columns.
- Honest covariance is a schema concern, not a tuning knob. Every persisted estimate-derived structure (`TileQualityMetadata`, FDR-emitted-position records) carries the same 2×2 horizontal sub-matrix that drives AC-NEW-4 / AC-NEW-7. Lossy storage of covariance is a defect.
What is NOT in scope for this document:
- Per-frame in-memory DTOs (`VioOutput`, `VprQuery`, `VprResult`, `RerankResult`, `MatchResult`, `PoseEstimate`, `EmittedExternalPosition`) — they are listed as "non-persistent runtime entities" in § 5 below for completeness only; their schema lives in component specs (Step 3).
- The `satellite-provider`-side voting / trust schema (D-PROJ-2 design task #2). The onboard side writes `voting_status = pending` and reads only `voting_status = trusted` (or operator-overridden `pending`); the actual promotion logic is parent-suite work.
- The operator-workstation pre-flight tooling UI state (C12) — that is operator-side code with its own persistence story, out of the onboard companion's data scope.
2. Persistent Entities
2.1 tiles (PostgreSQL — MIRRORS satellite-provider; additive onboard columns)
The tile is the single most important persistent entity. The schema deliberately starts from satellite-provider's existing PostgreSQL tiles table (Google-Maps-sourced) and extends it with onboard-only columns. The canonical columns are read-and-writeable from both sides; the onboard-only columns are NULL on satellite-provider-sourced rows and populated only on companion-orthorectified rows.
Storage split: row in PostgreSQL ↔ JPEG body on filesystem at ./tiles/{zoomLevel}/{x}/{y}.jpg. The JPEG is byte-identical to what satellite-provider produces / consumes; the sidecar ./tiles/{zoomLevel}/{x}/{y}.json carries the same row content as a JSON dump for the Tile Manager's batched HTTP form-data payload (D-PROJ-2 contract).
2.1.1 Columns
| Column | Type | Origin | Constraints | Description |
|---|---|---|---|---|
| `id` | `bigserial PK` | both | NOT NULL | Surrogate primary key (consistent with satellite-provider). |
| `zoom_level` | `int` | both | NOT NULL, `CHECK (zoom_level BETWEEN 10 AND 22)` | Slippy/XYZ zoom; same semantics as satellite-provider. |
| `tile_x` | `int` | both | NOT NULL | Slippy/XYZ X index. |
| `tile_y` | `int` | both | NOT NULL | Slippy/XYZ Y index. |
| `latitude` | `double precision` | both | NOT NULL | Tile center latitude (WGS84). |
| `longitude` | `double precision` | both | NOT NULL | Tile center longitude (WGS84). |
| `tile_size_meters` | `double precision` | both | NOT NULL, `CHECK (tile_size_meters > 0)` | Ground footprint side length in metres (lat-adjusted m/px × pixels). |
| `tile_size_pixels` | `int` | both | NOT NULL, `CHECK (tile_size_pixels > 0)` | Pixel side length (square tiles). |
| `capture_timestamp` | `timestamptz` | both | NOT NULL | When the tile imagery was captured (Google Maps ingest time for `googlemaps`; UAV nav-camera frame timestamp for `onboard_ingest`). |
| `compression` | `text` | both | NOT NULL, default `'jpeg'` | Currently `jpeg` only; reserved for future formats. |
| `crs` | `text` | both | NOT NULL, default `'EPSG:3857'` | Spherical Mercator; same as satellite-provider. |
| `source` | `text` | both | NOT NULL, `CHECK (source IN ('googlemaps', 'onboard_ingest'))` | New CHECK domain; existing satellite-provider rows pre-migration are backfilled `'googlemaps'`. |
| `flight_id` | `uuid` | onboard-only | NULL for `googlemaps`; NOT NULL for `onboard_ingest`; FK → `flights.id` | Which flight produced the tile (D-PROJ-2 design task #1 contract). |
| `companion_id` | `text` | onboard-only | NULL for `googlemaps`; NOT NULL for `onboard_ingest` | Deployed unit identifier; threat-model boundary marker. |
| `quality_metadata` | `jsonb` | onboard-only | NULL for `googlemaps`; NOT NULL for `onboard_ingest`; schema § 2.1.3 | Sufficient inputs for parent-suite voting (D-PROJ-2 #2). |
| `voting_status` | `text` | onboard-only | NULL for `googlemaps`; NOT NULL for `onboard_ingest`; `CHECK (voting_status IN ('pending', 'trusted', 'rejected'))`; default `'pending'` | Cache-poisoning safety budget gate (AC-NEW-7). The companion never writes `'trusted'`; promotion is parent-suite-side only. |
| `freshness_status` | `text` | both | NOT NULL, `CHECK (freshness_status IN ('fresh', 'stale_warn', 'stale_reject'))`; default `'fresh'` | Computed at ingest from `capture_timestamp` vs. the area's `SectorClassification` (active-conflict 6 mo / stable rear 12 mo); demoted on age regardless of `voting_status`. |
| `signature` | `bytea` | onboard-only | NULL for `googlemaps`; NOT NULL for `onboard_ingest` | Per-flight onboard signing key signature over the upload payload (D-PROJ-2 § Design task #1). |
| `created_at` | `timestamptz` | both | NOT NULL, default `now()` | Row creation time on the companion (or on satellite-provider for `googlemaps` rows). |
| `updated_at` | `timestamptz` | both | NOT NULL, default `now()` | Updated by trigger on `voting_status` / `freshness_status` change. |
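The `tile_size_meters` description ("lat-adjusted m/px × pixels") can be made concrete with the standard Web Mercator ground-resolution formula. The sketch below is illustrative only: the function names are hypothetical, and the 256-pixel default is an assumption rather than a value stated in this document.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius, as used by EPSG:3857


def meters_per_pixel(latitude_deg: float, zoom_level: int, tile_size_pixels: int = 256) -> float:
    """Ground resolution of a slippy-map tile at the given latitude and zoom."""
    equator_circumference_m = 2.0 * math.pi * EARTH_RADIUS_M
    world_width_px = tile_size_pixels * (2 ** zoom_level)
    # Mercator stretches the map by 1/cos(lat), so ground metres per pixel shrink by cos(lat).
    return equator_circumference_m * math.cos(math.radians(latitude_deg)) / world_width_px


def tile_size_meters(latitude_deg: float, zoom_level: int, tile_size_pixels: int = 256) -> float:
    """Ground footprint side length: lat-adjusted m/px multiplied by the pixel side length."""
    return meters_per_pixel(latitude_deg, zoom_level, tile_size_pixels) * tile_size_pixels
```

Each zoom step halves the footprint, and a tile at 60° latitude covers half the ground width of an equatorial tile at the same zoom.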
Composite uniqueness: UNIQUE (zoom_level, tile_x, tile_y, source, flight_id) — (zoom_level, tile_x, tile_y) is unique per (source, flight_id) tuple, so the same area can have one googlemaps row plus N onboard_ingest rows from different flights without collision. flight_id is treated as the empty UUID '00000000-...-...-000000000000' for googlemaps rows so the composite uniqueness still holds.
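The reason a sentinel `flight_id` matters is that SQL UNIQUE constraints treat NULLs as distinct from one another by default, so two `googlemaps` rows with a NULL `flight_id` would not collide. The sketch below demonstrates this with Python's built-in `sqlite3`, which shares that default with PostgreSQL; the trimmed table and the `SENTINEL_FLIGHT_ID` stand-in string are assumptions made for illustration, not the real schema or the real sentinel value.

```python
import sqlite3

# Hypothetical stand-in; the document's actual sentinel UUID is not reproduced here.
SENTINEL_FLIGHT_ID = "sentinel-flight-id"

DDL = """
CREATE TABLE tiles (
    zoom_level INTEGER NOT NULL,
    tile_x     INTEGER NOT NULL,
    tile_y     INTEGER NOT NULL,
    source     TEXT    NOT NULL,
    flight_id  TEXT,
    UNIQUE (zoom_level, tile_x, tile_y, source, flight_id)
)
"""


def duplicate_is_rejected(flight_id) -> bool:
    """Insert the same googlemaps tile twice; return True if the second insert is rejected."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(DDL)
        sql = "INSERT INTO tiles VALUES (?, ?, ?, ?, ?)"
        row = (18, 1234, 5678, "googlemaps", flight_id)
        conn.execute(sql, row)
        try:
            conn.execute(sql, row)
        except sqlite3.IntegrityError:
            return True
        return False
    finally:
        conn.close()


print(duplicate_is_rejected(None))                # NULL flight_id: the duplicate is accepted
print(duplicate_is_rejected(SENTINEL_FLIGHT_ID))  # sentinel flight_id: the duplicate is rejected
```

On PostgreSQL 15 and later, `UNIQUE NULLS NOT DISTINCT` is an alternative way to get the same effect without a sentinel value.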
Indexes:
- B-tree on `(zoom_level, tile_x, tile_y)` — primary spatial lookup path for VPR retrieval and pre-flight cache hydration.
- B-tree on `(latitude, longitude)` — bounding-box queries for sector classification and spatial-coverage reports.
- B-tree on `voting_status`, partial WHERE `source = 'onboard_ingest'` — operator-orchestrator queries for "which mid-flight tiles are still pending promotion?".
- B-tree on `flight_id` — FDR cross-reference; post-landing upload batching.
- B-tree on `created_at` — pruning / rollover queries.
Filesystem invariant: for every row with source = 'onboard_ingest', the JPEG body MUST exist at ./tiles/{zoom_level}/{tile_x}/{tile_y}.jpg and a sidecar ./tiles/{zoom_level}/{tile_x}/{tile_y}.json MUST exist with the row's full JSON form. The Tile Manager's TileUploader (C11) reads both and the row may be deleted from the local PostgreSQL only after satellite-provider returns 2xx for that (flight_id, zoom_level, tile_x, tile_y) tuple.
2.1.2 Lifecycle
| Trigger | Action on `tiles` row | Action on filesystem | Action on FDR |
|---|---|---|---|
| Pre-flight cache hydration (C11 `TileDownloader` download from satellite-provider) | INSERT `source='googlemaps'`, `voting_status=NULL`, `freshness_status` computed | WRITE `./tiles/{z}/{x}/{y}.jpg` (atomic) | n/a (pre-flight) |
| Mid-flight orthorectify-and-store (in-air) | INSERT `source='onboard_ingest'`, `voting_status='pending'`, `quality_metadata` populated | WRITE `./tiles/{z}/{x}/{y}.jpg` (atomic, dedup by composite key) + sidecar JSON | EVENT `tile_emitted` with row PK |
| Freshness re-evaluation (background, periodic per AC-8.2 / AC-NEW-6) | UPDATE `freshness_status` | n/a | EVENT `freshness_demoted` if changed |
| Post-landing upload success (C11 `TileUploader`, op-workstation) | DELETE local row after satellite-provider 2xx | DELETE local JPEG + sidecar | EVENT `tile_uploaded` with `(flight_id, z, x, y)` and remote ack ID |
| Post-landing upload failure (retryable) | UPDATE `updated_at`; row retained for retry | retain JPEG + sidecar | EVENT `tile_upload_failed` with reason |
No DELETE in flight: rows produced in flight (source='onboard_ingest') MUST NOT be deleted while flight_state == IN_AIR. Defense-in-depth on Principle #4 (process-level isolation already prevents the upload code path from being loaded).
2.1.3 quality_metadata jsonb sub-schema
For source = 'onboard_ingest' rows, quality_metadata carries the full input the parent-suite voting layer (D-PROJ-2 design task #2) needs without re-derivation. Schema is fixed; new keys are additive only.
{
"estimator_label": "satellite_anchored" | "visual_propagated" | "dead_reckoned",
"covariance_2x2": [[<float σ_xx>, <float σ_xy>], [<float σ_yx>, <float σ_yy>]],
"last_anchor_age_ms": <int >= 0>,
"mre_px": <float >= 0>, // reprojection error at the contributing match
"imu_bias_norm": <float >= 0>, // VIO health proxy
"vio_strategy": "okvis2" | "vins_mono" | "klt_ransac",
"vpr_strategy": "ultra_vpr" | "mega_loc" | "mix_vpr" | "sela_vpr" | "eigen_places" | "net_vlad" | "salad",
"matcher": "disk_lightglue" | "aliked_lightglue" | "xfeat",
"adhop_invoked": <bool>,
"thermal_throttle_active": <bool>, // D-CROSS-LATENCY-1 hybrid was active at emit time
"build_kind": "deployment" | "research" // ADR-002 disambiguation; voting layer may down-weight research-binary tiles
}
The tiles_quality_metadata_schema_v1 jsonschema lives at _docs/02_document/schemas/tiles_quality_metadata_v1.json and is referenced by both the local INSERT path and the C11 TileUploader payload validator. Schema versioning: bumping any field requires a new *_schema_v2.json and the row stamps the schema version into the jsonb under _v (additive-only).
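As a rough illustration of what the payload validator has to enforce, the sketch below checks a `quality_metadata` document against the sub-schema shown above using only the standard library. It is a hand-rolled stand-in, not the referenced jsonschema file; the function and constant names are hypothetical, and unknown keys are deliberately ignored on the assumption that additive keys must not break older validators.

```python
import numbers

ENUM_FIELDS = {
    "estimator_label": {"satellite_anchored", "visual_propagated", "dead_reckoned"},
    "vio_strategy": {"okvis2", "vins_mono", "klt_ransac"},
    "vpr_strategy": {"ultra_vpr", "mega_loc", "mix_vpr", "sela_vpr",
                     "eigen_places", "net_vlad", "salad"},
    "matcher": {"disk_lightglue", "aliked_lightglue", "xfeat"},
    "build_kind": {"deployment", "research"},
}


def _is_real(value) -> bool:
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, numbers.Real) and not isinstance(value, bool)


def validate_quality_metadata(doc: dict) -> list:
    """Return a list of problems; an empty list means the document passed."""
    problems = []
    for key, allowed in ENUM_FIELDS.items():
        value = doc.get(key)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{key}: not an allowed value")
    age = doc.get("last_anchor_age_ms")
    if isinstance(age, bool) or not isinstance(age, int) or age < 0:
        problems.append("last_anchor_age_ms: expected int >= 0")
    for key in ("mre_px", "imu_bias_norm"):
        value = doc.get(key)
        if not _is_real(value) or value < 0:
            problems.append(f"{key}: expected number >= 0")
    for key in ("adhop_invoked", "thermal_throttle_active"):
        if not isinstance(doc.get(key), bool):
            problems.append(f"{key}: expected bool")
    cov = doc.get("covariance_2x2")
    shape_ok = (
        isinstance(cov, list) and len(cov) == 2
        and all(isinstance(row, list) and len(row) == 2
                and all(_is_real(v) for v in row) for row in cov)
    )
    if not shape_ok:
        problems.append("covariance_2x2: expected a 2x2 list of numbers")
    return problems


sample = {
    "estimator_label": "satellite_anchored",
    "covariance_2x2": [[1.5, 0.1], [0.1, 2.0]],
    "last_anchor_age_ms": 120,
    "mre_px": 0.8,
    "imu_bias_norm": 0.02,
    "vio_strategy": "okvis2",
    "vpr_strategy": "salad",
    "matcher": "xfeat",
    "adhop_invoked": False,
    "thermal_throttle_active": False,
    "build_kind": "deployment",
}
```

Calling `validate_quality_metadata(sample)` returns an empty list; changing `estimator_label` to an unlisted value or making `mre_px` negative adds one problem each.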
2.2 flights (PostgreSQL — onboard-only)
A lightweight tracking row per flight, used by the FDR's manifest, the Tile Manager TileUploader batching, and the cache-poisoning safety audit (AC-NEW-7).
| Column | Type | Constraints | Description |
|---|---|---|---|
| `id` | `uuid PK` | NOT NULL | Generated at takeoff load. |
| `companion_id` | `text` | NOT NULL | Same as `tiles.companion_id`. |
| `started_at` | `timestamptz` | NOT NULL, default `now()` | First MAVLink HEARTBEAT received post-takeoff. |
| `ended_at` | `timestamptz` | NULL until landing | Set on `flight_state` transition IN_AIR → ON_GROUND. |
| `signing_key_fingerprint` | `bytea` | NOT NULL | SHA-256 of the per-flight onboard signing pubkey; private key is never persisted. |
| `mavlink_signing_key_fingerprint` | `bytea` | NULL on iNav-only flights | SHA-256 of the per-flight MAVLink-2.0 signing key (D-C8-9 = (d), AP path only). |
| `fc_profile` | `text` | NOT NULL, `CHECK (fc_profile IN ('ardupilot_plane', 'inav'))` | Which FC profile this flight ran. |
| `vio_strategy` | `text` | NOT NULL | Active C1 strategy this flight (ADR-001 — locked at startup). |
| `vpr_strategy` | `text` | NOT NULL | Active C2 strategy. |
| `build_kind` | `text` | NOT NULL, `CHECK (build_kind IN ('deployment', 'research'))` | Disambiguates IT-12 research-binary flights from production flights. |
| `manifest_id` | `uuid` | NOT NULL, FK → `manifests.id` | Pre-flight manifest used for this flight (ties to model + calibration + corpus + sector). |
| `fdr_path` | `text` | NOT NULL | Local path to the FDR ring directory for this flight. |
| `created_at` | `timestamptz` | NOT NULL, default `now()` | |
Lifecycle: row INSERTed at takeoff load; UPDATE on landing; never UPDATEd in flight after the takeoff-load row has been written. Row is never DELETED on the companion (the FDR retention policy decides physical removal).
2.3 sector_classifications (PostgreSQL — operator-set, onboard-side cache)
Mirrors operator-orchestrator C12's authoritative sector classification onto the companion so the freshness gate (AC-8.2 / AC-NEW-6) can be evaluated locally without a network call.
| Column | Type | Constraints | Description |
|---|---|---|---|
| `id` | `bigserial PK` | NOT NULL | |
| `polygon_geojson` | `jsonb` | NOT NULL | GeoJSON polygon (WGS84) of the classified area. |
| `classification` | `text` | NOT NULL, `CHECK (classification IN ('active_conflict', 'stable_rear'))` | Drives the freshness threshold. |
| `freshness_threshold_days` | `int` | NOT NULL, `CHECK (freshness_threshold_days IN (180, 365))` | Materialized: 180 for `active_conflict`, 365 for `stable_rear`. |
| `set_by` | `text` | NOT NULL | Operator identifier from C12. |
| `set_at` | `timestamptz` | NOT NULL, default `now()` | |
| `revoked_at` | `timestamptz` | NULL until revoked | Only set when a polygon's classification is changed; the new row supersedes; the old row is retained for audit. |
| `manifest_id` | `uuid` | NOT NULL, FK → `manifests.id` | Which pre-flight manifest baked this classification. |
Indexes: GiST on polygon_geojson for spatial containment queries; B-tree on manifest_id.
Note: this table is the onboard cache of operator authority. The companion never originates a classification; it only consumes the rows the operator-workstation tooling (C12) staged into the manifest. A row mismatch between the manifest's expected classifications and the table is a content-hash gate failure (D-C10-3) → companion refuses takeoff.
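The local freshness check that this cache enables reduces to comparing a tile's age against the materialized threshold for its sector. A minimal sketch follows, assuming timezone-aware timestamps; the function name is hypothetical, and the mapping from "threshold exceeded" to `stale_warn` versus `stale_reject` is intentionally left out because this document does not define that boundary.

```python
from datetime import datetime, timedelta, timezone

# Materialized thresholds from the sector_classifications table.
FRESHNESS_THRESHOLD_DAYS = {"active_conflict": 180, "stable_rear": 365}


def exceeds_freshness_threshold(capture_timestamp: datetime,
                                classification: str,
                                now: datetime = None) -> bool:
    """True when the tile imagery is older than the sector's freshness threshold."""
    if now is None:
        now = datetime.now(timezone.utc)
    threshold = timedelta(days=FRESHNESS_THRESHOLD_DAYS[classification])
    return (now - capture_timestamp) > threshold


reference_now = datetime(2026, 5, 9, tzinfo=timezone.utc)
captured = reference_now - timedelta(days=200)
print(exceeds_freshness_threshold(captured, "active_conflict", reference_now))
print(exceeds_freshness_threshold(captured, "stable_rear", reference_now))
```

A 200-day-old tile exceeds the 180-day threshold of an `active_conflict` sector but is still within the 365-day threshold of a `stable_rear` one.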
2.4 manifests (PostgreSQL — pre-flight idempotence gate)
Implements D-C10-1 (pre-flight idempotence): the same input bundle (model + calibration + tile corpus + sector classifications) must produce a stable manifest hash. The manifest is the takeoff-load content-hash root; D-C10-3 (atomic write + content-hash gate) verifies it.
| Column | Type | Constraints | Description |
|---|---|---|---|
| `id` | `uuid PK` | NOT NULL | |
| `manifest_hash` | `bytea` | NOT NULL, UNIQUE | SHA-256 over the canonicalized manifest contents. |
| `model_bundle_hash` | `bytea` | NOT NULL | SHA-256 over the (VIO + VPR + matcher + AdHoP) model bundle. |
| `calibration_artifact_hash` | `bytea` | NOT NULL | SHA-256 over the loaded camera calibration JSON. |
| `corpus_hash` | `bytea` | NOT NULL | SHA-256 over the sorted list of `(zoom_level, tile_x, tile_y, content_hash)` for `source='googlemaps'` rows in the manifest's spatial scope. |
| `sector_classifications_hash` | `bytea` | NOT NULL | SHA-256 over the sorted list of `sector_classifications` rows. |
| `engine_cache_bundle_hash` | `bytea` | NOT NULL | SHA-256 over the linked TRT engines + INT8 calibration caches. |
| `descriptor_index_hash` | `bytea` | NOT NULL | SHA-256 over the FAISS `.index` file. |
| `created_at` | `timestamptz` | NOT NULL, default `now()` | |
| `staged_by` | `text` | NOT NULL | Operator workstation identifier. |
| `verified_at` | `timestamptz` | NULL until verified | Set by C10 at takeoff load when the content-hash gate passes. |
Indexes: UNIQUE on manifest_hash; B-tree on created_at.
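The idempotence requirement hinges on hashing inputs in a deterministic order. A minimal sketch of how `corpus_hash` could be computed is shown below; the compact-JSON serialization is an assumption chosen for illustration (this document only specifies "SHA-256 over the sorted list"), and the function name is hypothetical.

```python
import hashlib
import json


def corpus_hash(tiles) -> bytes:
    """SHA-256 over the sorted (zoom_level, tile_x, tile_y, content_hash) tuples.

    Sorting first makes the digest independent of the order in which rows
    were read from the database.
    """
    ordered = sorted(tiles)
    # Assumed serialization: compact JSON with fixed separators.
    payload = json.dumps(ordered, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).digest()


rows = [
    (18, 1234, 5679, "bb" * 32),
    (18, 1234, 5678, "aa" * 32),
]
print(corpus_hash(rows).hex())
```

Because of the sort, `corpus_hash(rows)` and `corpus_hash(list(reversed(rows)))` produce the same 32-byte digest, while changing any tile's `content_hash` changes it.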
2.5 engine_cache_entries (PostgreSQL — TRT engine catalogue)
Implements D-C10-7: TRT engines are keyed by the (SM, JetPack, TRT, precision) tuple so a deployed image with the wrong tuple cannot accidentally consume an engine compiled for a different SM.
| Column | Type | Constraints | Description |
|---|---|---|---|
| `id` | `bigserial PK` | NOT NULL | |
| `model_name` | `text` | NOT NULL | Logical name (`disk`, `lightglue`, `ultra_vpr`, `aliked`, `xfeat`, …). |
| `model_revision` | `text` | NOT NULL | Upstream commit / weight revision pin. |
| `sm` | `text` | NOT NULL | Jetson SM (e.g., `87` for Orin). |
| `jetpack_version` | `text` | NOT NULL | E.g., `6.2`. |
| `tensorrt_version` | `text` | NOT NULL | E.g., `10.3`. |
| `precision` | `text` | NOT NULL, `CHECK (precision IN ('fp32', 'fp16', 'int8', 'int8_fp16_mixed'))` | |
| `engine_path` | `text` | NOT NULL | Filesystem path to the `.engine` binary. |
| `engine_size_bytes` | `bigint` | NOT NULL | |
| `engine_hash` | `bytea` | NOT NULL | SHA-256 over the engine file. |
| `int8_calibration_path` | `text` | NULL for non-INT8 | Filesystem path to the INT8 calibration cache. |
| `int8_calibration_hash` | `bytea` | NULL for non-INT8 | SHA-256 over the calibration cache. |
| `created_at` | `timestamptz` | NOT NULL, default `now()` | |
Composite uniqueness: UNIQUE (model_name, model_revision, sm, jetpack_version, tensorrt_version, precision).
Filesystem invariant: engine_path and int8_calibration_path (when set) MUST be valid local files; their on-disk SHA-256 MUST match engine_hash / int8_calibration_hash. C7 verifies the hash on engine load; mismatch ⇒ takeoff refused.
2.6 Camera calibration artifact (filesystem JSON, NOT in PostgreSQL)
Per Principle #1, camera-specific math enters only via this file. It is a single JSON document loaded once at startup. It is not stored in PostgreSQL because it is treated as an image of the camera (test fixture vs. production unit) rather than a queryable record — different binaries / different fixtures point at different files via config, and the schema is the same on every code path.
Path convention:
- Production: `/etc/gps-denied/calibration/adti20.json` (per-deployed-unit, post D-PROJ-1 hybrid)
- Test fixtures: `tests/fixtures/calibration/adti26.json`
Schema (camera_calibration_v1):
{
"schema_version": 1,
"camera_name": "adti20" | "adti26" | <other>,
"model": "ADTi 20MP 20L V1" | <other>,
"sensor": {
"width_px": 5472,
"height_px": 3648,
"pixel_size_um": 3.76,
"sensor_size_mm": [23.6, 15.7]
},
"intrinsics": {
"fx_px": <float>, "fy_px": <float>,
"cx_px": <float>, "cy_px": <float>
},
"distortion": {
"model": "opencv_radtan" | "opencv_kannala_brandt",
"coefficients": [<float>, ...]
},
"body_to_camera": {
"rotation_quaternion_xyzw": [<float>, <float>, <float>, <float>],
"translation_m": [<float>, <float>, <float>]
},
"acquisition_method": "factory_sheet" | "checkerboard_refined" | "hybrid",
"acquisition_timestamp": <ISO 8601>,
"acquisition_notes": <string>,
"content_hash": <sha256-hex of the canonicalized JSON without this field>
}
The artifact's content_hash is the value referenced by manifests.calibration_artifact_hash.
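Since `content_hash` lives inside the document it covers, the field has to be excluded before hashing. The sketch below shows one way to do that; the canonicalization (sorted keys, compact separators, UTF-8) is an assumption, as this document does not spell out the canonical form, and the function names are hypothetical.

```python
import hashlib
import json


def calibration_content_hash(artifact: dict) -> str:
    """SHA-256 hex digest of the artifact with its own content_hash field removed."""
    body = {key: value for key, value in artifact.items() if key != "content_hash"}
    # Assumed canonical form: sorted keys, no insignificant whitespace.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_calibration(artifact: dict) -> bool:
    """True when the stored content_hash matches the recomputed one."""
    return artifact.get("content_hash") == calibration_content_hash(artifact)


artifact = {
    "schema_version": 1,
    "camera_name": "adti26",
    "intrinsics": {"fx_px": 4000.0, "fy_px": 4000.0, "cx_px": 2736.0, "cy_px": 1824.0},
}
artifact["content_hash"] = calibration_content_hash(artifact)
print(verify_calibration(artifact))
```

The intrinsics values above are made-up placeholders. Editing any field after stamping (for example `fx_px`) makes `verify_calibration` return `False`, which is the behaviour the takeoff-load content-hash gate relies on.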
2.7 FAISS HNSW descriptor index (filesystem binary, NOT in PostgreSQL)
VPR descriptors are stored in a FAISS-native .index file. This is the on-disk form of VprQuery candidates' lookup index, not a queryable relational entity. Implementation details:
- Path: `/var/lib/gps-denied/descriptors/{vpr_strategy}_{model_revision}.index`
- Atomic write: per D-C10-3 — write to `*.index.tmp`, fsync, rename via `atomicwrites`.
- Content-hash gate: SHA-256 over the file body, recorded in `manifests.descriptor_index_hash`.
- Companion of the `.index` is a `*.meta.json` carrying `{tile_id → descriptor_offset}` for the rows under the manifest's spatial scope. The mapping uses `tiles.id` PKs, so a tile pruned from the relational table invalidates the offset and forces a re-build of the index.
The descriptor index is rebuildable from (tiles, model_bundle) + a deterministic VPR feature-extraction pass; therefore it is not part of any backup / DR target — losing the .index re-runs C10 descriptor generation on the operator workstation pre-flight.
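The write-to-temp, fsync, rename sequence described for the index can be sketched with the standard library alone. This is not the `atomicwrites` package the document names; it is a dependency-free illustration of the same pattern, with a hypothetical function name and a throwaway temporary directory standing in for the real descriptor path.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(target: Path, data: bytes) -> None:
    """Write data to target so readers see either the old file or the new one, never a partial."""
    tmp = target.with_name(target.name + ".tmp")
    with open(tmp, "wb") as handle:
        handle.write(data)
        handle.flush()
        os.fsync(handle.fileno())  # force the bytes to stable storage before the rename
    # os.replace is atomic when source and destination are on the same filesystem.
    os.replace(tmp, target)


with tempfile.TemporaryDirectory() as workdir:
    index_path = Path(workdir) / "example.index"
    atomic_write_bytes(index_path, b"first build")
    atomic_write_bytes(index_path, b"second build")
    final_bytes = index_path.read_bytes()
    leftover_tmp = index_path.with_name("example.index.tmp").exists()
```

A production version would typically also fsync the containing directory so the rename itself survives a power loss; that step is omitted here for brevity.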
2.8 Flight Data Recorder (FdrRecord, filesystem ring, NOT in PostgreSQL)
The FDR is a per-flight ≤ 64 GB local ring buffer of structured records. It is append-only while the flight is IN_AIR; rollover (oldest segment dropped first) is logged but never silent (AC-NEW-3).
Layout (per flight):
/var/lib/gps-denied/fdr/{flight_id}/
manifest.json # flight metadata snapshot (mirror of flights row + manifests row)
segments/
seg_00001.bin # rolling segments (~256 MB each)
seg_00002.bin
...
thumbnails/ # AC-8.5 ≤ 0.1 Hz failed-tile thumbnail log (forensic exception)
YYYYMMDD-HHMMSS-<short-uuid>.jpg
rollover.log # one line per dropped segment
Record schema (FdrRecord, length-prefixed binary stream):
record_header (16 bytes):
magic u32 = 0x47464452 # "GFDR"
version u16 = 1
type u16 # see record types below
monotonic_ms u64 # since flight start
record_body: variable, schema per `type`
record_crc32 u32
Record types (type field — additive-only):
| Type ID | Name | Body |
|---|---|---|
| `0x0001` | `EmittedExternalPosition` | WGS84 + 6×6 covariance + provenance label + `last_satellite_anchor_age_ms` + per-FC encoding (AP `GPS_INPUT` or iNav `MSP2_SENSOR_GPS`) |
| `0x0002` | `ImuTrace` | timestamped accel + gyro window |
| `0x0003` | `ReceivedMavlinkRaw` | raw MAVLink frame (signed/unsigned), captured tlog-style |
| `0x0004` | `ReceivedMsp2Raw` | raw MSP2 frame (iNav profile only) |
| `0x0005` | `SystemHealth` | CPU%, GPU%, temp, throttle flag, RAM, VRAM, NVM remaining |
| `0x0006` | `SourceLabelTransition` | `{satellite_anchored, visual_propagated, dead_reckoned}` transition |
| `0x0007` | `MidFlightTileEmitted` | reference to `tiles.id` + `quality_metadata` snapshot |
| `0x0008` | `MidFlightTileFailed` | reason + thumbnail filename (AC-8.5 forensic exception) |
| `0x0009` | `MavlinkSigningKeyRotated` | new key fingerprint + reason (D-C8-9) |
| `0x000A` | `EkfSourceSetCommand` | D-C8-2 source-set switch event |
| `0x000B` | `VisualBlackoutEvent` | start / end + reason (AC-3.5, AC-NEW-8) |
| `0x000C` | `SpoofingPromotionEvent` | promoted / rejected + 10 s-gate state (AC-NEW-2, AC-NEW-8) |
| `0x000D` | `ContentHashGateFail` | takeoff-load gate fail (D-C10-3) |
| `0x000E` | `ThermalThrottleHybridSwitch` | K=3 ↔ K=2 switch event (D-CROSS-LATENCY-1, ADR-006) |
| `0x000F` | `ComponentLifecycleEvent` | per-component start / stop / fail |
Backward compatibility: new record types are appended; readers MUST skip records they don't recognise (the record_header length is enough to advance the cursor). No record type is ever renumbered or removed; deprecation is by ceasing to emit.
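To illustrate the skip-unknown-types rule, the sketch below frames and reads records with Python's `struct` module. Several details are assumptions not fixed by this document: little-endian byte order, a `u32` length prefix placed before each record header, and a CRC32 (zlib polynomial) computed over header plus body. The function names are hypothetical.

```python
import struct
import zlib

FDR_MAGIC = 0x47464452               # "GFDR"
HEADER = struct.Struct("<IHHQ")      # magic u32, version u16, type u16, monotonic_ms u64
KNOWN_TYPES = set(range(0x0001, 0x0010))  # 0x0001..0x000F from the table above


def encode_record(record_type: int, monotonic_ms: int, body: bytes, version: int = 1) -> bytes:
    """Frame one record as: [assumed u32 length][16-byte header][body][u32 crc32]."""
    header = HEADER.pack(FDR_MAGIC, version, record_type, monotonic_ms)
    crc = zlib.crc32(header + body) & 0xFFFFFFFF
    framed = header + body + struct.pack("<I", crc)
    return struct.pack("<I", len(framed)) + framed


def iter_known_records(stream: bytes):
    """Yield (type, monotonic_ms, body) for known types; skip types this reader does not know."""
    offset = 0
    while offset < len(stream):
        (length,) = struct.unpack_from("<I", stream, offset)
        start = offset + 4
        offset = start + length  # advance the cursor first, so skipping is always possible
        magic, _version, record_type, monotonic_ms = HEADER.unpack_from(stream, start)
        if magic != FDR_MAGIC:
            raise ValueError("bad magic; stream is corrupt or misaligned")
        if record_type not in KNOWN_TYPES:
            continue  # forward compatibility: ignore record types added later
        (stored_crc,) = struct.unpack_from("<I", stream, offset - 4)
        if zlib.crc32(stream[start:offset - 4]) & 0xFFFFFFFF != stored_crc:
            raise ValueError("crc mismatch")
        yield record_type, monotonic_ms, stream[start + HEADER.size:offset - 4]


stream = (
    encode_record(0x0005, 10, b"abc")
    + encode_record(0x7FFF, 11, b"future")  # a type this reader does not know
    + encode_record(0x0001, 12, b"")
)
records = list(iter_known_records(stream))
```

Here `records` contains only the `0x0005` and `0x0001` entries; the unknown `0x7FFF` record is stepped over using the length prefix. Truncated streams are not handled in this sketch.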
Retention: per-flight ring; on IN_AIR → ON_GROUND transition, the ring is sealed and the operator-orchestrator FDR-retrieval workflow (C12) copies it off the companion. The companion auto-prunes flights older than the configured retention window (default: 30 days) — the prune log itself is its own FDR record on the next flight.
2.9 Tile JPEG bodies (filesystem)
JPEG bodies live at ./tiles/{zoomLevel}/{x}/{y}.jpg. A sidecar ./tiles/{zoomLevel}/{x}/{y}.json carries the full row content for upload-time payload assembly. Both files are atomic-written (via atomicwrites); both are removed only after the corresponding tiles row's lifecycle says it is safe (see § 2.1.2). Filesystem and PostgreSQL drift is treated as a defect: the operator-orchestrator C12 has a periodic consistency_audit that reports any orphan files / missing files.
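The drift check described above amounts to a set difference between the paths the database expects and the paths found on disk. A minimal sketch follows, assuming every supplied key should have both a `.jpg` and a `.json` file; the function name and signature are hypothetical and do not describe the actual `consistency_audit` implementation.

```python
import tempfile
from pathlib import Path


def audit_tile_files(tiles_root: Path, db_keys):
    """Compare (zoom_level, tile_x, tile_y) keys from the tiles table with files on disk.

    Returns (missing, orphans) as sorted lists of paths relative to tiles_root.
    """
    expected = {
        f"{z}/{x}/{y}{ext}"
        for z, x, y in db_keys
        for ext in (".jpg", ".json")
    }
    on_disk = {
        path.relative_to(tiles_root).as_posix()
        for path in tiles_root.glob("*/*/*")
        if path.suffix in (".jpg", ".json")
    }
    return sorted(expected - on_disk), sorted(on_disk - expected)


with tempfile.TemporaryDirectory() as workdir:
    root = Path(workdir)
    (root / "18" / "1").mkdir(parents=True)
    for name in ("2.jpg", "2.json", "9.jpg"):  # 9.jpg has no matching row
        (root / "18" / "1" / name).write_bytes(b"")
    missing, orphans = audit_tile_files(root, {(18, 1, 2), (18, 3, 4)})
```

In this run `missing` lists the two files expected for tile `(18, 3, 4)` and `orphans` lists `18/1/9.jpg`.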
3. Mermaid ERD
erDiagram
flights ||--o{ tiles : "1 flight produces N onboard tiles"
flights ||--|| manifests : "1 flight references 1 manifest"
manifests ||--o{ sector_classifications : "manifest captures N classifications"
manifests ||--o{ engine_cache_entries : "manifest pins N engines"
sector_classifications ||--o{ tiles : "polygon contains N tiles (spatial)"
flights {
uuid id PK
text companion_id
timestamptz started_at
timestamptz ended_at
bytea signing_key_fingerprint
bytea mavlink_signing_key_fingerprint
text fc_profile
text vio_strategy
text vpr_strategy
text build_kind
uuid manifest_id FK
text fdr_path
timestamptz created_at
}
tiles {
bigserial id PK
int zoom_level
int tile_x
int tile_y
double latitude
double longitude
double tile_size_meters
int tile_size_pixels
timestamptz capture_timestamp
text compression
text crs
text source
uuid flight_id FK
text companion_id
jsonb quality_metadata
text voting_status
text freshness_status
bytea signature
timestamptz created_at
timestamptz updated_at
}
sector_classifications {
bigserial id PK
jsonb polygon_geojson
text classification
int freshness_threshold_days
text set_by
timestamptz set_at
timestamptz revoked_at
uuid manifest_id FK
}
manifests {
uuid id PK
bytea manifest_hash
bytea model_bundle_hash
bytea calibration_artifact_hash
bytea corpus_hash
bytea sector_classifications_hash
bytea engine_cache_bundle_hash
bytea descriptor_index_hash
timestamptz created_at
text staged_by
timestamptz verified_at
}
engine_cache_entries {
bigserial id PK
text model_name
text model_revision
text sm
text jetpack_version
text tensorrt_version
text precision
text engine_path
bigint engine_size_bytes
bytea engine_hash
text int8_calibration_path
bytea int8_calibration_hash
timestamptz created_at
}
Filesystem-only artifacts (camera calibration JSON, FAISS `.index`, FDR records, tile JPEG bodies) are NOT shown in the ERD. Their referential ties to PostgreSQL are: `manifests.calibration_artifact_hash` ↔ calibration JSON; `manifests.descriptor_index_hash` ↔ FAISS `.index`; `flights.fdr_path` ↔ FDR ring directory; `tiles.id` × `(zoom_level, tile_x, tile_y)` ↔ tile JPEG body + sidecar.
4. Migration Strategy
4.1 Versioning tool
Alembic (SQLAlchemy-based migrations) is the chosen tool. Justification:
- Python is the host language (per `architecture.md` § 2). Alembic ships with the SQLAlchemy ecosystem the project is already using for any DB-touching code.
- Alembic supports both autogenerate and hand-written migrations; the team will use hand-written migrations only (autogenerate is acceptable as a starting point for a migration but must be human-reviewed before commit — autogenerate cannot infer the additive-only / freeze-canonical rules).
- Alembic's `downgrade()` direction satisfies the reversibility requirement below for additive migrations; non-reversible migrations (rare; only for the controlled-deprecation cases) are explicitly marked.
- Alembic plays well with the per-companion local PostgreSQL deployment model — no shared upstream migration coordinator is needed.
Coordination with satellite-provider: the parent suite owns the canonical satellite-provider schema (whatever migration tool — likely EF Core — they use is irrelevant to the onboard side). The onboard project's tiles migration baseline MUST be cross-checked against the latest satellite-provider tiles schema at every onboard cycle; the onboard-only columns are added on top. A Plan-phase carryforward (Step 4 risk register) captures the ongoing coordination obligation.
4.2 Reversibility requirement
Default = reversible. Every migration MUST implement a working downgrade() that returns the schema to the prior state without data loss. Migrations that cannot be reversed (e.g., DROP COLUMN) are forbidden by default and require an ADR + user sign-off as part of the migration commit.
Concretely:
- `op.add_column(...)` ⇄ `op.drop_column(...)` — reversible.
- `op.create_table(...)` ⇄ `op.drop_table(...)` — reversible (data loss is intentional for new tables).
- `op.create_index(...)` ⇄ `op.drop_index(...)` — reversible.
- `op.alter_column(...)` for type changes — only reversible if the type widening is truly bidirectional; otherwise this is a forbidden-by-default migration that requires an ADR.
- Backfills / data migrations — must be either (a) idempotent (safe to re-run in either direction) or (b) split into a separate "data-only" migration whose reverse is a no-op with a documented data-loss note.
4.3 Naming convention
migrations/versions/YYYY_MM_DD_HHMM_<short_slug>.py
Where:
- `YYYY_MM_DD_HHMM` is the migration's authored timestamp in UTC (matches Alembic's natural ordering and is operator-friendly).
- `<short_slug>` is snake_case (lowercase words joined by underscores), describing the change in ≤ 6 words: `add_voting_status_to_tiles`, `create_engine_cache_entries`, `freshness_threshold_check`, etc.
- The Alembic `revision` and `down_revision` IDs are 12-character random hex (Alembic default).
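The naming rule can be checked mechanically. The helper below is a hypothetical sketch, not part of the project's tooling: it builds a filename from a timezone-aware timestamp and a slug, and it interprets "≤ 6 words" as at most six underscore-separated lowercase tokens, which is an assumption.

```python
import re
from datetime import datetime, timezone

# Lowercase alphanumeric words joined by single underscores, as in the examples above.
SLUG_RE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")
MAX_SLUG_WORDS = 6


def migration_filename(authored_at: datetime, slug: str) -> str:
    """Build YYYY_MM_DD_HHMM_<short_slug>.py with the timestamp expressed in UTC."""
    if authored_at.tzinfo is None:
        raise ValueError("authored_at must be timezone-aware")
    if not SLUG_RE.fullmatch(slug):
        raise ValueError(f"slug is not lowercase words joined by underscores: {slug!r}")
    if len(slug.split("_")) > MAX_SLUG_WORDS:
        raise ValueError(f"slug has more than {MAX_SLUG_WORDS} words: {slug!r}")
    stamp = authored_at.astimezone(timezone.utc).strftime("%Y_%m_%d_%H%M")
    return f"{stamp}_{slug}.py"


example = migration_filename(
    datetime(2026, 5, 9, 14, 30, tzinfo=timezone.utc),
    "add_voting_status_to_tiles",
)
print(example)
```

A check like this could run in CI against `migrations/versions/` so that misnamed files are caught before review.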
migrations/env.py uses the gps_denied.db.metadata SQLAlchemy MetaData object for autogenerate diffs; the connection URL is read from the same config file the runtime uses (no separate migration config).
4.4 Migration baseline
The first migration 0001_initial.py creates flights, manifests, sector_classifications, engine_cache_entries, and tiles (with the canonical satellite-provider columns). The first onboard-extension migration 0002_add_onboard_columns_to_tiles.py adds the onboard-only columns (flight_id, companion_id, quality_metadata, voting_status, freshness_status, signature) plus the indexes specific to those columns. Splitting the baseline this way keeps the diff against satellite-provider's migration history mechanically clear.
5. Seed Data
| Table | dev-tier1 (Tier-1 Docker) | staging-tier1 / staging-tier2 (CI) | production |
|---|---|---|---|
| `flights` | One synthetic flight row keyed against `tests/fixtures/flight_derkachi/` | Synthetic flights per replay scenario; matches IT-12 comparative-study fixture set | Real flight data — written by takeoff-load code; NEVER seeded |
| `tiles` (`source='googlemaps'`) | Loaded from `tests/fixtures/tiles_corpus/` (a curated, ≤100 MB subset of satellite-provider's output for the Derkachi area) | Same as dev-tier1; the e2e-test mock-suite-sat-service fixture populates additional rows on demand for upload-side scenarios | Written by C11 `TileDownloader` from the operator workstation; NEVER seeded |
| `tiles` (`source='onboard_ingest'`) | Synthetic injected for IT / NFT replays | Synthetic injected for NFT-SEC-01 cache-poisoning Monte-Carlo runs | NEVER seeded (only takeoff-load orthorectification code can write) |
| `sector_classifications` | One `active_conflict` polygon over Derkachi + one `stable_rear` polygon (test fixtures) | Same | Loaded by C10 from operator-staged manifest; NEVER seeded |
| `manifests` | One pre-built manifest for the Derkachi fixture set | Multiple — one per CI scenario | Loaded by C10 from operator-staged manifest; NEVER seeded |
| `engine_cache_entries` | Tier-1 has no GPU engines; rows reference Tier-2-built engines pulled from a CI artifact cache | Tier-2 builds engines on first run; rows generated then | Loaded by C10 from operator-staged engine cache; NEVER seeded |
Hard rule: production NEVER seeds any of these tables. The pre-flight cache build flow (C11 TileDownloader writes tiles rows; C10 writes manifests and engine_cache_entries) is the only writer of canonical pre-flight rows; the takeoff-load flow is the only writer of flight rows; the in-flight orthorectifier is the only writer of source='onboard_ingest' rows. A test fixture writing into a production database is a defect (per coderule.mdc "mocking data is needed only for tests, never mock data for dev or prod env").
Seed mechanism (Tier-1 / staging): tests/fixtures/seed_db.py is the canonical Tier-1 seeder; it issues only INSERTs against a dedicated test database. It is invoked by the test harness (pytest) and by the docker compose developer setup. It does NOT use Alembic — it operates on the schema Alembic has already created.
6. Backward Compatibility
6.1 Default: additive-only
Every schema change is additive by default:
- New tables — additive; existing code paths unaffected.
- New columns — additive; default value or NULL; existing rows backfilled per migration data step.
- New indexes — additive; not visible to read-paths semantically, only to performance.
- New CHECK constraints — additive only if all existing rows already satisfy the constraint; otherwise the migration MUST split into "backfill rows" then "add constraint".
- New jsonb keys (e.g., in `tiles.quality_metadata`): additive; readers MUST tolerate missing keys (treat as absent / null).
- New FDR record types: additive; readers skip unknown types.
6.2 Forbidden by default
The following changes are forbidden by default and require an ADR + user sign-off:
- Renaming a column or table.
- Dropping a column or table.
- Tightening a column type (narrowing).
- Tightening a CHECK constraint that existing rows could violate.
- Removing a value from a CHECK constraint enum.
This rule is the data-model expression of coderule.mdc's "Do not rename any databases or tables or table columns without confirmation."
6.3 The tiles schema specifically: frozen on canonical columns
Because the tiles schema MUST stay byte-compatible with satellite-provider's on-disk and ingest format, the canonical columns (those marked "both" in § 2.1.1) are frozen: this project never renames them, narrows their types, or drops them. Coordination with the parent suite is the only path to canonical-column changes; the onboard-only columns can evolve under the additive-only rule above without parent-suite coordination.
If satellite-provider introduces a canonical-column change in a future cycle, the onboard project must:
- Open an ADR documenting the upstream change.
- Run a migration that mirrors the upstream change exactly.
- Re-validate that the post-landing upload payload (D-PROJ-2 contract) still matches.
- Re-run the post-landing upload integration test (`mock-suite-sat-service` swapped for the real `satellite-provider`).
6.4 Schema versioning for jsonb sub-schemas
tiles.quality_metadata (and any future jsonb fields with non-trivial structure) carries a _v integer key. Readers branch on _v:
- `_v == 1` (current): the schema in § 2.1.3.
- `_v` missing: treat as `_v == 1` (back-compat for rows written before `_v` was added).
- `_v == 2+`: future additive evolution; readers MUST tolerate unknown additional keys.
Schema-version bumps are tracked in _docs/02_document/schemas/ (a new tiles_quality_metadata_v{N}.json per bump).
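The reader-side branching on `_v` can be sketched as follows. This is a minimal illustration, assuming the jsonb value has already been decoded into a Python mapping; the function names and the `example_key` used in the usage note are hypothetical, since the real keys are defined in § 2.1.3.

```python
from typing import Any, Mapping


def quality_metadata_version(doc: Mapping[str, Any]) -> int:
    """Return the sub-schema version, treating a missing `_v` as 1."""
    version = doc.get("_v", 1)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ValueError(f"invalid _v: {version!r}")
    return version


def read_quality_field(doc: Mapping[str, Any], key: str, default: Any = None) -> Any:
    """Read one key, tolerating missing keys and ignoring unknown ones."""
    quality_metadata_version(doc)  # validates `_v`; any version >= 1 is readable
    return doc.get(key, default)
```

For example, `read_quality_field({"_v": 2, "added_later": 1}, "example_key")` returns `None` rather than failing: the unknown key is ignored and the missing key is treated as absent.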
6.5 FDR file-format compatibility
The FDR record_header is fixed at version 1. Every FDR reader (operator-orchestrator, replay tools) MUST:
- Validate `magic == 0x47464452` and skip a corrupt segment.
- Read the `version` field; on `version != 1`, refuse to interpret the body and emit an "unknown FDR version" diagnostic.
- For known `version == 1`, read the `type` field; on unknown `type`, advance the cursor by `body_length + 4` (CRC) and continue.
The FDR file format will be bumped to version 2 only if a structural change to record_header is required; that change is gated on the same ADR + user sign-off rule as a forbidden-by-default schema change.
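The reader rules in this section can be sketched as a small iterator. The byte layout used here is an assumption for illustration only: a little-endian header of `u32 magic`, `u16 version`, `u16 type`, `u32 body_length`, followed by the body and a 4-byte CRC. The authoritative `record_header` layout is the one defined in § 2.8, and the type codes in `KNOWN_TYPES` are hypothetical. CRC verification is omitted to keep the sketch short.

```python
import logging
import struct
from typing import Iterator, Tuple

log = logging.getLogger(__name__)

FDR_MAGIC = 0x47464452  # ASCII "GFDR"
# ASSUMPTION: hypothetical header layout, not the real record_header.
HEADER = struct.Struct("<IHHI")  # magic, version, type, body_length
CRC_LEN = 4
KNOWN_TYPES = {1, 2}  # hypothetical type codes


def iter_known_records(segment: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, body) for each known version-1 record in one segment."""
    offset = 0
    while offset + HEADER.size <= len(segment):
        magic, version, rtype, body_len = HEADER.unpack_from(segment, offset)
        if magic != FDR_MAGIC:
            return  # corrupt segment: skip whatever remains of it
        if version != 1:
            # body_length cannot be trusted under an unknown header version.
            log.warning("unknown FDR version %d at offset %d", version, offset)
            return
        body_start = offset + HEADER.size
        next_offset = body_start + body_len + CRC_LEN
        if next_offset > len(segment):
            return  # truncated trailing record
        if rtype in KNOWN_TYPES:
            yield rtype, segment[body_start:body_start + body_len]
        # Unknown types fall through: advance past body and CRC, continue.
        offset = next_offset
```

Stopping at the first unknown version, rather than trying to skip ahead, is a design choice of this sketch: a future header version may move or resize `body_length`, so the cursor arithmetic is only safe for version 1.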
7. Non-Persistent Runtime Entities (referenced for completeness only)
The following DTOs flow through the per-frame pipeline in memory and are NOT persistent (except as derived FDR records — see § 2.8 above). They are documented here because § 4 of architecture.md lists them as "Core entities" — but the data-model migration scope does not include them. Their full schema lives in component specs (Plan Step 3).
| DTO | Producer | Consumers | Persisted as |
|---|---|---|---|
| `NavCameraFrame` | Camera ingest thread | C1, C2 | Never (AC-8.5 forbids raw-frame storage; the only forensic exception is AC-8.5 ≤ 0.1 Hz failed-tile thumbnails inside the FDR) |
| `ImuSample` / `ImuWindow` | C8 inbound side | C1, C5 | FDR `ImuTrace` records (lossy summary, not full stream) |
| `VioOutput` | C1 | C5 | FDR `SystemHealth` (covariance summary only); not as a row |
| `VprQuery` / `VprResult` | C2 | C2.5 | Never (transient retrieval state) |
| `RerankResult` | C2.5 | C3 / C3.5 | Never |
| `MatchResult` | C3 / C3.5 | C4 | Never |
| `PoseEstimate` | C4 → C5 | C8, C13 | FDR `EmittedExternalPosition` records (post-emission); also feeds `tiles.quality_metadata` for in-flight orthorectified tiles |
| `EmittedExternalPosition` | C8 | FC, FDR | FDR record |
| `FlightStateSignal` | C8 inbound side | flight-state guard, FDR | FDR `ComponentLifecycleEvent` on transition |
| `CameraCalibration` (loaded once) | calibration loader | C1, C3, C4 | NOT in PostgreSQL; see § 2.6 |
8. Open Items / Plan-Phase Carryforward
- `satellite-provider` schema-drift coordination (recurring): the canonical-columns freeze depends on an explicit cross-project schema review every cycle. Captured as a Plan-Step-4 risk register entry; carryforward.
- D-PROJ-2 #1 ingest-endpoint contract: the `signature` column's exact algorithm (Ed25519 vs ECDSA) and the per-flight key distribution are parent-suite design decisions; the onboard side is contract-flexible and treats `signature` as opaque `bytea`.
- D-PROJ-2 #2 voting-layer schema: parent-suite-side; this onboard data model writes `voting_status='pending'` and reads `'trusted'` only. The actual promotion table lives in `satellite-provider`'s schema and is out of scope here.
- GeoJSON polygon precision (`sector_classifications.polygon_geojson`): GeoJSON is precision-bounded by JSON number representation; if AC-NEW-7 cache-poisoning safety needs sub-metre polygon edges, a future migration can switch to PostGIS `geography(Polygon, 4326)`. Captured as carryforward (currently no AC requirement to do so).
- FDR retention policy default: 30 days post-landing is a reasonable default but is not pinned in any AC; carryforward to the operator-orchestrator spec (C12) for confirmation.