Update autodev state, architecture documentation, and glossary terms

Transitioned the autodev state to phase 21, reflecting the completion of Step 5 and the drafting of Step 6 epics. Revised the architecture documentation to clarify the roles of the Tile Manager and its components, ensuring accurate representation of the system's operational flow. Updated glossary entries for Flight State and Operator to incorporate recent changes and enhance clarity on component interactions and responsibilities.
Oleksandr Bezdieniezhnykh
2026-05-10 00:21:34 +03:00
parent 723f574b14
commit 64542d32fc
52 changed files with 8789 additions and 88 deletions
@@ -0,0 +1,32 @@
# Common Helper — `ImuPreintegrator`
## Purpose
Shared GTSAM `CombinedImuFactor` preintegration buffer. Both C1 (VIO) and C5 (StateEstimator) consume the same FC IMU window and want the same preintegrated Δ-pose / Δ-velocity / Δ-bias quantities. Centralising the preintegrator avoids two inconsistent integrations of the same IMU stream.
## Used By
- C1 — Visual / Visual-Inertial Odometry.
- C5 — State Estimator.
## Interface (sketch)
```
class ImuPreintegrator:
    def reset_with_bias(bias: ImuBias) -> None
    def integrate_sample(sample: ImuSample) -> None
    def integrate_window(window: ImuWindow) -> None
    def current_preintegration() -> CombinedImuFactor
    def reset_for_new_keyframe() -> CombinedImuFactor  # returns the closed factor; clears state
```
## Implementation Notes
- Wraps GTSAM's `PreintegrationCombinedParams` + `PreintegratedCombinedMeasurements`.
- Holds the gyro/accel noise model from the camera-calibration artifact (which carries IMU-noise covariances per-deployment).
- Single-threaded by design — composition root binds one preintegrator instance to one writer thread.
## Caveats
- Tracking bias drift is the responsibility of the consumers (C1 and C5), which must call `reset_with_bias(...)` whenever their estimate of the IMU bias changes.
- The preintegrator does not own a clock — every `integrate_*` call requires a monotonic timestamp on the IMU sample.
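Because the helper owns no clock, the integration interval for each sample has to be derived from consecutive sample timestamps. The following is a minimal sketch of that guard, not the actual implementation: `DtTracker` and the `t_ns` / `accel` / `gyro` field layout of `ImuSample` are hypothetical names introduced here for illustration, since the real `ImuSample` type is not specified in this document.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImuSample:
    # Hypothetical layout; the real ImuSample type is defined elsewhere.
    t_ns: int  # monotonic timestamp in nanoseconds
    accel: tuple[float, float, float]
    gyro: tuple[float, float, float]


class DtTracker:
    """Derives the integration interval from consecutive sample timestamps."""

    def __init__(self) -> None:
        self._last_t_ns: Optional[int] = None

    def next_dt(self, sample: ImuSample) -> Optional[float]:
        """Return seconds since the previous sample, or None for the first one.

        Raises ValueError if the timestamp does not strictly increase.
        """
        last = self._last_t_ns
        if last is not None and sample.t_ns <= last:
            raise ValueError(
                f"non-monotonic IMU timestamp: {sample.t_ns} <= {last}"
            )
        self._last_t_ns = sample.t_ns
        return None if last is None else (sample.t_ns - last) * 1e-9

    def reset(self) -> None:
        """Forget the previous timestamp (e.g. when a new keyframe starts)."""
        self._last_t_ns = None
```

In this sketch the first sample after a reset only seeds the tracker; whether the real helper skips that sample or carries the last timestamp across keyframes is a design decision this document does not settle.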
@@ -0,0 +1,30 @@
# Common Helper — `SE3Utils`
## Purpose
SE(3) ↔ pose-matrix conversion and Lie-algebra exponential/logarithm. Used wherever a 4×4 transformation matrix needs to be converted to/from a 6-vector, or where Jacobians of SE(3) operations are needed for covariance recovery.
## Used By
- C1 — Visual / Visual-Inertial Odometry (relative pose updates).
- C4 — Pose Estimation (`solvePnPRansac` 4×4 → SE(3) for the GTSAM factor).
- C5 — State Estimator (iSAM2 graph keys + smoothed history).
## Interface (sketch)
```
def matrix_to_se3(T_4x4: ndarray) -> SE3
def se3_to_matrix(pose: SE3) -> ndarray
def exp_map(xi: Vector6) -> SE3
def log_map(pose: SE3) -> Vector6
def adjoint(pose: SE3) -> Matrix6
```
## Implementation Notes
- Backed by GTSAM `Pose3` + Eigen Lie-algebra primitives where available; otherwise pure numpy.
- Rotations are assumed to be proper (orthonormal, determinant +1). The helper does not enforce this; the caller is responsible for orthogonalising input rotation matrices before calling `matrix_to_se3`.
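A caller that wants to check its input before calling `matrix_to_se3` could use something like the sketch below. It is a pure-Python illustration, not part of the helper's interface: `is_proper_rotation`, `det3`, and the tolerance default are hypothetical names and values chosen here.

```python
import math
from typing import Sequence

Matrix3 = Sequence[Sequence[float]]


def det3(m: Matrix3) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def is_proper_rotation(r: Matrix3, tol: float = 1e-6) -> bool:
    """True if r is orthonormal (R^T R = I) with determinant +1, within tol."""
    for i in range(3):
        for j in range(3):
            # Entry (i, j) of R^T R is the dot product of columns i and j.
            dot = sum(r[k][i] * r[k][j] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return abs(det3(r) - 1.0) <= tol


# Usage: a 30-degree rotation about z passes; a reflection does not.
c, s = math.cos(math.radians(30.0)), math.sin(math.radians(30.0))
rot_z = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]
```

The check only detects a bad input; it does not repair it. Orthogonalisation itself (for example via an SVD projection) stays with the caller, as stated above.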
## Caveats
- Library-grade Lie-algebra functions exist in `manifpy` and `pylie`; we use GTSAM's primitives directly to avoid pulling in a second math library. If a future strategy needs richer manifold ops, evaluate `manifpy` then.
@@ -0,0 +1,30 @@
# Common Helper — `LightGlueRuntime`
## Purpose
Shared LightGlue inference handle. C2.5 (Re-rank) does single-pair LightGlue matching for inlier counting on K=10 candidates per frame; C3 (CrossDomainMatcher) does the heavier matching pass on the surviving N=3 candidates. Both use the same LightGlue engine; sharing the engine avoids paying the engine-build / GPU-memory cost twice.
## Used By
- C2.5 — Inlier-based Re-rank.
- C3 — Cross-domain Matcher.
## Interface (sketch)
```
class LightGlueRuntime:
    def __init__(engine_handle: EngineHandle): ...
    def match(features_a: KeypointSet, features_b: KeypointSet) -> CorrespondenceSet
    def match_batch(features_a_list, features_b_list) -> list[CorrespondenceSet]
    def descriptor_dim() -> int
```
## Implementation Notes
- Owned by the composition root; the same instance is constructor-injected into both C2.5 and C3.
- Backed by C7's `InferenceRuntime.deserialize_engine(LIGHTGLUE_ENGINE_CACHE_ENTRY)` at takeoff.
- Single CUDA stream; concurrent calls forbidden — composition root binds the runtime to the single F3 hot-path thread.
## Caveats
- The input features MUST come from the backbone that the LightGlue engine was trained for (DISK in the production default; ALIKED / XFeat in the alternates). Mixing backbones is a runtime error caught by the matcher's input shape check.
@@ -0,0 +1,43 @@
# Common Helper — `WgsConverter`
## Purpose
WGS84 ↔ local tangent-plane (ENU/NED) ↔ tile pixel-coordinate conversions. Required by every component that handles geographic positions: C4's pose estimation, C5's state graph, C6's tile-bounding-box queries, C8's per-FC encoding, C10's bbox provisioning, and C12's operator UX.
## Used By
- C4 — Pose Estimation.
- C5 — State Estimator.
- C6 — Tile Cache + Spatial Index (bbox queries).
- C8 — FC Adapter (per-FC encoding of LatLonAlt → MAVLink/MSP2).
- C10 — Pre-flight Cache Provisioning (bbox → tile-id list).
- C12 — Operator Pre-flight Tooling (operator-entered bbox).
## Interface (sketch)
```
class WgsConverter:
    @staticmethod
    def latlonalt_to_ecef(p: LatLonAlt) -> Vector3
    @staticmethod
    def ecef_to_latlonalt(p: Vector3) -> LatLonAlt
    @staticmethod
    def latlonalt_to_local_enu(origin: LatLonAlt, p: LatLonAlt) -> Vector3
    @staticmethod
    def local_enu_to_latlonalt(origin: LatLonAlt, p_enu: Vector3) -> LatLonAlt
    @staticmethod
    def latlon_to_tile_xy(zoom: int, lat: float, lon: float) -> tuple[int, int]
    @staticmethod
    def tile_xy_to_latlon_bounds(zoom: int, x: int, y: int) -> BoundingBox
```
## Implementation Notes
- Stateless; pure functions.
- Backed by `pyproj` for the geodesy primitives; tile_xy math uses the standard slippy-map convention (matches `satellite-provider`'s on-disk layout).
- All conversions use WGS84 ellipsoid; no datum-shift complexity.
## Caveats
- The static-only design satisfies the coderule.mdc constraint ("only use static methods for pure self-contained computations"). If a future deployment needs alternative datum support, switch to an instance-based factory then.
- Tile-coordinate math is zoom-level-sensitive; callers MUST pass the right zoom level for the tile in question (typically zoomLevel from `TileMetadata`).
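The slippy-map tile math behind `latlon_to_tile_xy` and `tile_xy_to_latlon_bounds` can be sketched with the standard Web Mercator formulas. This is an illustration under assumptions, not the helper's actual code: the `BoundingBox` field names, the latitude clamp constant, and the clamping of out-of-range indices are choices made here, and the functions are written as free functions rather than static methods for brevity.

```python
import math
from dataclasses import dataclass

# Web Mercator is undefined at the poles; tiles cover roughly +/-85.0511 degrees.
MAX_MERCATOR_LAT = 85.05112878


@dataclass(frozen=True)
class BoundingBox:
    # Hypothetical field names; the real BoundingBox type is defined elsewhere.
    north: float
    south: float
    west: float
    east: float


def latlon_to_tile_xy(zoom: int, lat: float, lon: float) -> tuple:
    """Return the (x, y) index of the slippy-map tile containing (lat, lon)."""
    n = 1 << zoom  # number of tiles per axis at this zoom level
    lat = max(-MAX_MERCATOR_LAT, min(MAX_MERCATOR_LAT, lat))
    lat_rad = math.radians(lat)
    x = int(math.floor((lon + 180.0) / 360.0 * n))
    y = int(math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n))
    # Clamp so lon == 180 or a clamped latitude still maps to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def _tile_corner_latlon(zoom: int, x: int, y: int) -> tuple:
    """Latitude/longitude of the north-west corner of tile (x, y)."""
    n = 1 << zoom
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * y / n))))
    return lat, lon


def tile_xy_to_latlon_bounds(zoom: int, x: int, y: int) -> BoundingBox:
    """Geographic bounds of tile (x, y); y grows southwards in this convention."""
    north, west = _tile_corner_latlon(zoom, x, y)
    south, east = _tile_corner_latlon(zoom, x + 1, y + 1)
    return BoundingBox(north=north, south=south, west=west, east=east)
```

The same point maps to different `(x, y)` pairs at different zoom levels (each zoom step doubles the tile count per axis), which is why a mismatched zoom silently yields the wrong tile rather than an error.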
@@ -0,0 +1,36 @@
# Common Helper — `Sha256Sidecar`
## Purpose
Atomic-write + SHA-256 content-hash sidecar pattern (D-C10-3). Every persistent artifact that takeoff-load (F2) verifies must be written atomically AND have a `.sha256` sidecar that the verifier can independently recompute. Centralising the pattern avoids slightly different implementations across C6 (FAISS index, tile metadata), C7 (engine + calibration cache), and C10 (the Manifest itself).
## Used By
- C6 — Tile Cache + Spatial Index (FAISS `.index`, descriptor sidecar; tile pixels do NOT use sidecars individually — there are too many; the Manifest covers the tile-tree hash collectively).
- C7 — Inference Runtime (engine cache files + INT8 calibration cache; D-C10-6 calibration-cache trust depends on this).
- C10 — Pre-flight Cache Provisioning (Manifest itself; aggregate hash of the cache root).
## Interface (sketch)
```
class Sha256Sidecar:
    @staticmethod
    def write_atomic(path: Path, payload: bytes) -> sha256
    @staticmethod
    def write_atomic_and_sidecar(path: Path, payload: bytes) -> sha256
    @staticmethod
    def verify(path: Path) -> bool  # checks payload hash against sidecar
    @staticmethod
    def aggregate_hash(paths: list[Path]) -> sha256  # for Manifest covering many files
```
## Implementation Notes
- Backed by the `atomicwrites` package for atomic rename and Python's `hashlib.sha256` for digesting.
- Sidecars are written as `<path>.sha256` containing the hex digest (no JSON wrapper — keeps verification trivial).
- `aggregate_hash` is order-deterministic (sorts paths first) so two runs that read the same files yield the same aggregate.
## Caveats
- The atomic rename is filesystem-level — works on POSIX local filesystems, not on NFS / SMB / overlayfs. For production deployments the cache root MUST live on a local filesystem.
- The sidecar is NOT cryptographically signed; it protects against accidental corruption + file-replacement-after-staging, NOT against an attacker with write access to the cache root. Threat model treats the operator workstation as trusted; the companion's write access is restricted to F4 (mid-flight tile gen) which has its own per-flight signing key path.
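The write-then-verify flow can be sketched with the standard library alone. Note the swap: this sketch uses `tempfile` plus `os.replace` for the atomic rename instead of the `atomicwrites` package named above, purely so that it is self-contained. Returning the digest as a hex string, the `sidecar_path` helper, and hashing only per-file content digests in `aggregate_hash` are assumptions made here, not confirmed details of the real helper.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sidecar_path(path: Path) -> Path:
    """`<path>.sha256`, next to the payload file."""
    return path.with_name(path.name + ".sha256")


def write_atomic(path: Path, payload: bytes) -> str:
    """Write to a temp file in the same directory, fsync, then rename over path."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_name, path)  # atomic on a local POSIX filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return hashlib.sha256(payload).hexdigest()


def write_atomic_and_sidecar(path: Path, payload: bytes) -> str:
    """Write the payload first, then its bare hex digest as the sidecar."""
    digest = write_atomic(path, payload)
    write_atomic(sidecar_path(path), digest.encode("ascii"))
    return digest


def verify(path: Path) -> bool:
    """Recompute the payload hash and compare it with the sidecar contents."""
    try:
        expected = sidecar_path(path).read_text(encoding="ascii").strip()
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
    except (OSError, UnicodeDecodeError):
        return False  # missing or unreadable payload/sidecar fails verification
    return actual == expected


def aggregate_hash(paths: list) -> str:
    """Hash of per-file content digests, taken in sorted-path order."""
    h = hashlib.sha256()
    for p in sorted(paths):
        h.update(hashlib.sha256(p.read_bytes()).digest())
    return h.hexdigest()
```

Writing the payload before the sidecar means a crash between the two renames leaves a payload without a matching sidecar, which `verify` reports as a failure rather than as a false pass.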
@@ -0,0 +1,50 @@
# Common Helper — `EngineFilenameSchema`
## Purpose
Self-describing `.engine` filename schema per D-C10-7. TensorRT engines are NOT portable across `(SM, JetPack, TRT, precision)` tuples; the filename schema makes mismatch instantly visible at takeoff load (F2) so refusing-to-deserialize-on-mismatch becomes trivial.
## Used By
- C7 — Inference Runtime (writes engines with this schema; reads them on `deserialize_engine`).
- C10 — Pre-flight Cache Provisioning (compiles engines via C7 with this schema; writes them to the cache root).
## Interface (sketch)
```
class EngineFilenameSchema:
    @staticmethod
    def build(model_name: str, sm: int, jetpack: str, trt: str, precision: str) -> str
    @staticmethod
    def parse(filename: str) -> EngineCacheKey
    @staticmethod
    def matches_host(filename: str, host_capabilities: HostCapabilities) -> bool
```
Filename format: `{model}__sm{SM}_jp{JP_dotted}_trt{TRT_dotted}_{precision}.engine`
Example: `ultravpr__sm87_jp6.2_trt10.3_fp16.engine`
```
EngineCacheKey:
    model_name: string
    sm: int (e.g. 87 for Jetson Orin Nano Super)
    jetpack: string (e.g. "6.2")
    trt: string (e.g. "10.3")
    precision: enum {fp16, int8, mixed}
HostCapabilities:
    current_sm: int
    current_jetpack: string
    current_trt: string
```
## Implementation Notes
- Stateless; pure string parsing.
- `matches_host` returns true iff every tuple element matches exactly. F2 takeoff load uses this to decide which engines to deserialize and which to refuse.
## Caveats
- The dotted-version format must round-trip cleanly through filesystems (no `/` or `\` in dotted versions; safe).
- Adding a new tuple dimension (e.g., a per-binary `BUILD_*` flag combination) requires extending the schema AND every existing `.engine` filename. Versioning the schema itself is a Plan-phase carryforward if/when needed.
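The schema lends itself to a single regular expression shared by `build`, `parse`, and `matches_host`. The sketch below is one possible reading of the format string above, with assumptions labelled: the allowed character set for model names is a guess, precision is kept as a plain string rather than an enum, and `matches_host` compares only SM, JetPack, and TRT because `HostCapabilities` as sketched carries no precision field.

```python
import re
from dataclasses import dataclass

# Assumed grammar: model names use [A-Za-z0-9_.-]; versions are dotted integers.
_PATTERN = re.compile(
    r"^(?P<model>[A-Za-z0-9_.-]+?)__sm(?P<sm>\d+)"
    r"_jp(?P<jp>\d+(?:\.\d+)*)_trt(?P<trt>\d+(?:\.\d+)*)"
    r"_(?P<precision>fp16|int8|mixed)\.engine$"
)


@dataclass(frozen=True)
class EngineCacheKey:
    model_name: str
    sm: int
    jetpack: str
    trt: str
    precision: str


@dataclass(frozen=True)
class HostCapabilities:
    current_sm: int
    current_jetpack: str
    current_trt: str


def parse(filename: str) -> EngineCacheKey:
    """Parse a schema-conformant filename; raise ValueError otherwise."""
    m = _PATTERN.match(filename)
    if m is None:
        raise ValueError(f"not a schema-conformant engine filename: {filename!r}")
    return EngineCacheKey(m["model"], int(m["sm"]), m["jp"], m["trt"], m["precision"])


def build(model_name: str, sm: int, jetpack: str, trt: str, precision: str) -> str:
    """Format the filename and reject inputs that would not round-trip."""
    filename = f"{model_name}__sm{sm}_jp{jetpack}_trt{trt}_{precision}.engine"
    if parse(filename) != EngineCacheKey(model_name, sm, jetpack, trt, precision):
        raise ValueError(f"inputs do not round-trip through the schema: {filename!r}")
    return filename


def matches_host(filename: str, host: HostCapabilities) -> bool:
    """Exact match on the host-side fields; unparseable names never match."""
    try:
        key = parse(filename)
    except ValueError:
        return False
    return (key.sm, key.jetpack, key.trt) == (
        host.current_sm,
        host.current_jetpack,
        host.current_trt,
    )
```

Validating in `build` by parsing its own output keeps the two directions from drifting apart: anything `build` returns is guaranteed to be readable by `parse`.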
@@ -0,0 +1,49 @@
# Common Helper — `RansacFilter`
## Purpose
Thin wrapper around OpenCV's RANSAC + reprojection-residual computation. Used by the cross-domain matcher (C3 — RANSAC over 2D-2D correspondences for the per-candidate inlier count), the conditional refiner (C3.5 — recompute residual after AdHoP refinement), and the pose estimator (C4 — RANSAC inside `solvePnPRansac` is OpenCV-internal, but C4 also computes per-frame final reprojection residual via this helper for FDR provenance).
## Used By
- C3 — Cross-domain Matcher.
- C3.5 — AdHoP-conditional Refiner.
- C4 — Pose Estimation.
## Interface (sketch)
```
class RansacFilter:
    @staticmethod
    def filter_correspondences(
        correspondences: ndarray[N, 4],
        ransac_threshold_px: float,
        min_inliers: int,
    ) -> RansacResult
    @staticmethod
    def compute_reprojection_residual(
        correspondences: ndarray[I, 4],
        K: ndarray[3, 3],
        distortion: ndarray,
        pose: SE3,
    ) -> float  # median residual in pixels
```
```
RansacResult:
    inlier_correspondences: ndarray[I, 4]
    inlier_count: int
    outlier_count: int
    median_residual_px: float
```
## Implementation Notes
- Backed by `cv2.findHomography(..., cv2.RANSAC)` for the 2D-2D case and `cv2.projectPoints(...)` for the post-pose residual.
- Stateless; pure function.
## Caveats
- The RANSAC threshold is a tunable; defaults are documented per-component (C3, C3.5, C4) in their specs.
- For 2D-3D RANSAC inside C4's `solvePnPRansac`, OpenCV does it internally — this helper is for the standalone reprojection-residual computation that lives outside the PnP call.
@@ -0,0 +1,33 @@
# Common Helper — `DescriptorNormaliser`
## Purpose
L2-normalise descriptors so that cosine similarity, inner product, and Euclidean distance all induce the same ranking. This is required because FAISS HNSW operates on Euclidean / inner-product spaces, while the upstream backbones (UltraVPR, MegaLoc, MixVPR, etc.) emit embeddings that are meant to be compared by cosine similarity and are not guaranteed to be unit-length. The same normalisation MUST be applied on both the **corpus** side (C10 during F1 provisioning) and the **query** side (C2 at runtime); centralising the helper guarantees they don't drift apart.
## Used By
- C2 — VPR (query side; per-frame embedding before FAISS lookup).
- C10 — Pre-flight Cache Provisioning (corpus side; per-tile embedding before FAISS index population).
## Interface (sketch)
```
class DescriptorNormaliser:
    @staticmethod
    def l2_normalise(descriptor: ndarray[D]) -> ndarray[D]
    @staticmethod
    def l2_normalise_batch(descriptors: ndarray[N, D]) -> ndarray[N, D]
    @staticmethod
    def descriptor_metric() -> str  # always "inner_product"; used for FAISS index config
```
## Implementation Notes
- Stateless; pure function.
- Backed by numpy / numpy-CUDA depending on the runtime; preserves dtype (fp16 in → fp16 out, fp32 in → fp32 out).
- The `descriptor_metric()` static return is the source of truth for the FAISS HNSW index distance metric — both C6's `DescriptorIndex.search_topk` and C10's index-build code consult it.
## Caveats
- Zero-norm vectors are returned as the zero vector (no division-by-zero); callers must filter or accept that such descriptors will match nothing.
- The choice of "inner product on L2-normalised" rather than "cosine" is FAISS-idiomatic — FAISS does not have a built-in cosine metric; cosine is achieved by L2-normalising and using inner product.
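The equivalence the helper relies on can be checked in a few lines. The sketch below is a pure-Python stand-in for the numpy-backed helper, written only to show the arithmetic; `inner_product`, `cosine_similarity`, and `squared_distance` are hypothetical helper names introduced for the demonstration.

```python
import math
from typing import List, Sequence


def l2_normalise(descriptor: Sequence[float]) -> List[float]:
    """Scale to unit L2 norm; a zero vector is returned unchanged as zeros."""
    norm = math.sqrt(sum(v * v for v in descriptor))
    if norm == 0.0:
        return [0.0] * len(descriptor)
    return [v / norm for v in descriptor]


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    return inner_product(a, b) / math.sqrt(inner_product(a, a) * inner_product(b, b))


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Raw (unnormalised) descriptors and their unit-length versions.
query, tile = [3.0, 4.0], [4.0, 3.0]
q_hat, t_hat = l2_normalise(query), l2_normalise(tile)

# On unit vectors the inner product equals the cosine of the raw vectors,
# and squared Euclidean distance is 2 - 2 * inner product, so all three
# measures rank neighbours identically.
ip = inner_product(q_hat, t_hat)
cos = cosine_similarity(query, tile)
d2 = squared_distance(q_hat, t_hat)
```

Here `ip` and `cos` agree up to floating-point rounding, and `d2` equals `2 - 2 * ip` to the same precision. A zero-norm descriptor has inner product 0 with everything, which is the "matches nothing" behaviour noted in the caveats.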