Update autodev state, architecture documentation, and glossary terms

Transitioned the autodev state to phase 21, reflecting the completion of Step 5 and the drafting of Step 6 epics. Revised the architecture documentation to clarify the roles of the Tile Manager and its components, ensuring accurate representation of the system's operational flow. Updated glossary entries for Flight State and Operator to incorporate recent changes and enhance clarity on component interactions and responsibilities.
2026-06-22 09:21:12 +00:00 · 2026-05-10 00:21:34 +03:00
parent 723f574b14
commit 64542d32fc
52 changed files with 8789 additions and 88 deletions
@@ -0,0 +1,32 @@
+# Common Helper — `ImuPreintegrator`
+
+## Purpose
+
+Shared GTSAM `CombinedImuFactor` preintegration buffer. Both C1 (VIO) and C5 (StateEstimator) consume the same FC IMU window and want the same preintegrated Δ-pose / Δ-velocity / Δ-bias quantities. Centralising the preintegrator avoids two inconsistent integrations of the same IMU stream.
+
+## Used By
+
+- C1 — Visual / Visual-Inertial Odometry.
+- C5 — State Estimator.
+
+## Interface (sketch)
+
+```
+class ImuPreintegrator:
+    def reset_with_bias(self, bias: ImuBias) -> None
+    def integrate_sample(self, sample: ImuSample) -> None
+    def integrate_window(self, window: ImuWindow) -> None
+    def current_preintegration(self) -> CombinedImuFactor
+    def reset_for_new_keyframe(self) -> CombinedImuFactor   # returns the closed factor; clears state
+```
+
+## Implementation Notes
+
+- Wraps GTSAM's `PreintegrationCombinedParams` + `PreintegratedCombinedMeasurements`.
+- Holds the gyro/accel noise model from the camera-calibration artifact (which carries IMU-noise covariances per-deployment).
+- Single-threaded by design — composition root binds one preintegrator instance to one writer thread.
+
+## Caveats
+
+- Tracking bias drift is the responsibility of the consumers (C1 + C5), which must call `reset_with_bias(...)` whenever their estimate of the IMU bias changes.
+- The preintegrator does not own a clock — every `integrate_*` call requires a monotonic timestamp on the IMU sample.
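The timestamp caveat above can be illustrated with a small guard. This is a minimal sketch, not the actual implementation: `ImuSample`'s field layout and the `TimestampGuard` helper are hypothetical names introduced here, and the guard only derives `dt` from consecutive sample timestamps; the preintegration itself would still be done by GTSAM.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ImuSample:
    # Hypothetical layout; the real ImuSample type is not shown in this document.
    t_ns: int                              # monotonic timestamp, nanoseconds
    gyro: Tuple[float, float, float]       # rad/s
    accel: Tuple[float, float, float]      # m/s^2


class TimestampGuard:
    """Derives dt from sample timestamps and rejects non-increasing ones."""

    def __init__(self) -> None:
        self._last_t_ns: Optional[int] = None

    def dt_seconds(self, sample: ImuSample) -> float:
        # The first sample only establishes the time base.
        if self._last_t_ns is None:
            self._last_t_ns = sample.t_ns
            return 0.0
        if sample.t_ns <= self._last_t_ns:
            raise ValueError(
                f"non-monotonic IMU timestamp: {sample.t_ns} <= {self._last_t_ns}"
            )
        dt = (sample.t_ns - self._last_t_ns) * 1e-9
        self._last_t_ns = sample.t_ns
        return dt

    def reset(self) -> None:
        # Called when a new keyframe window starts.
        self._last_t_ns = None
```

Because the helper owns no clock, a duplicated or reordered sample surfaces as an explicit error instead of a silent zero or negative integration step.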
@@ -0,0 +1,30 @@
+# Common Helper — `SE3Utils`
+
+## Purpose
+
+SE(3) ↔ pose-matrix conversion and Lie-algebra exponential/logarithm. Used wherever a 4×4 transformation matrix needs to be converted to/from a 6-vector, or where Jacobians of SE(3) operations are needed for covariance recovery.
+
+## Used By
+
+- C1 — Visual / Visual-Inertial Odometry (relative pose updates).
+- C4 — Pose Estimation (`solvePnPRansac` 4×4 → SE(3) for the GTSAM factor).
+- C5 — State Estimator (iSAM2 graph keys + smoothed history).
+
+## Interface (sketch)
+
+```
+def matrix_to_se3(T_4x4: ndarray) -> SE3
+def se3_to_matrix(pose: SE3) -> ndarray
+def exp_map(xi: Vector6) -> SE3
+def log_map(pose: SE3) -> Vector6
+def adjoint(pose: SE3) -> Matrix6
+```
+
+## Implementation Notes
+
+- Backed by GTSAM `Pose3` + Eigen Lie-algebra primitives where available; otherwise pure numpy.
+- `matrix_to_se3` assumes the rotation block is a proper rotation (orthonormal, determinant +1) and does not repair it; the caller is responsible for orthogonalising input rotation matrices before calling `matrix_to_se3`.
+
+## Caveats
+
+- Library-grade Lie-algebra functions exist in `manifpy` and `pylie`; we use GTSAM's primitives directly to avoid pulling in a second math library. If a future strategy needs richer manifold ops, evaluate `manifpy` then.
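Since the caller owns the orthogonalisation step, a cheap pre-check on the rotation block can be useful before calling `matrix_to_se3`. The following is a minimal sketch in pure Python; `det3`, `is_proper_rotation`, and the tolerance default are hypothetical names and values chosen for illustration, not part of the helper's interface.

```python
import math
from typing import List

Matrix3 = List[List[float]]


def det3(R: Matrix3) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
        - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
        + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0])
    )


def is_proper_rotation(R: Matrix3, tol: float = 1e-6) -> bool:
    """True iff R^T R is the identity and det(R) is +1, both within tol."""
    for i in range(3):
        for j in range(3):
            # Dot product of columns i and j of R.
            dot = sum(R[k][i] * R[k][j] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return abs(det3(R) - 1.0) <= tol
```

A reflection such as `diag(1, 1, -1)` passes the orthonormality test but fails the determinant test, which is why both conditions are checked.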
@@ -0,0 +1,30 @@
+# Common Helper — `LightGlueRuntime`
+
+## Purpose
+
+Shared LightGlue inference handle. C2.5 (Re-rank) does single-pair LightGlue matching for inlier counting on K=10 candidates per frame; C3 (CrossDomainMatcher) does the heavier matching pass on the surviving N=3 candidates. Both use the same LightGlue engine; sharing the engine avoids paying the engine-build / GPU-memory cost twice.
+
+## Used By
+
+- C2.5 — Inlier-based Re-rank.
+- C3 — Cross-domain Matcher.
+
+## Interface (sketch)
+
+```
+class LightGlueRuntime:
+    def __init__(self, engine_handle: EngineHandle) -> None
+    def match(self, features_a: KeypointSet, features_b: KeypointSet) -> CorrespondenceSet
+    def match_batch(self, features_a_list, features_b_list) -> list[CorrespondenceSet]
+    def descriptor_dim(self) -> int
+```
+
+## Implementation Notes
+
+- Owned by the composition root; the same instance is constructor-injected into both C2.5 and C3.
+- Backed by C7's `InferenceRuntime.deserialize_engine(LIGHTGLUE_ENGINE_CACHE_ENTRY)` at takeoff.
+- Single CUDA stream; concurrent calls forbidden — composition root binds the runtime to the single F3 hot-path thread.
+
+## Caveats
+
+- The features fed in MUST come from the same backbone that the LightGlue engine was trained for (DISK in production-default; ALIKED / XFeat in alternates). Mixing backbones is a runtime error caught by the matcher's input shape check.
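The ownership rule in the implementation notes (one runtime, constructor-injected into both consumers) might look like the sketch below. Every class here is a hypothetical stand-in: `StubLightGlueRuntime` only counts calls, and `Reranker` / `CrossDomainMatcher` are placeholders for C2.5 and C3, whose real constructors are not shown in this document.

```python
from typing import Any, List, Tuple


class StubLightGlueRuntime:
    """Stand-in for the shared runtime; records how often it is used."""

    def __init__(self, engine_handle: Any) -> None:
        self.engine_handle = engine_handle
        self.calls = 0

    def match(self, features_a: Any, features_b: Any) -> List[Any]:
        self.calls += 1
        return []  # a real runtime would return correspondences


class Reranker:  # placeholder for C2.5
    def __init__(self, runtime: StubLightGlueRuntime) -> None:
        self.runtime = runtime


class CrossDomainMatcher:  # placeholder for C3
    def __init__(self, runtime: StubLightGlueRuntime) -> None:
        self.runtime = runtime


def build_matching_stack(engine_handle: Any) -> Tuple[Reranker, CrossDomainMatcher]:
    """Composition root: build the engine-backed runtime once, inject it twice."""
    runtime = StubLightGlueRuntime(engine_handle)
    return Reranker(runtime), CrossDomainMatcher(runtime)
```

Neither consumer constructs the runtime itself, so the engine-build and GPU-memory cost is paid once regardless of how many components match with it.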
@@ -0,0 +1,43 @@
+# Common Helper — `WgsConverter`
+
+## Purpose
+
+WGS84 ↔ local tangent-plane (ENU/NED) ↔ tile pixel-coordinate conversions. Required by every component that interacts with geographic positions: C4's pose estimation, C5's state graph, C6's tile-bounding-box queries, C8's per-FC encoding, C10's bbox provisioning, and C12's operator UX.
+
+## Used By
+
+- C4 — Pose Estimation.
+- C5 — State Estimator.
+- C6 — Tile Cache + Spatial Index (bbox queries).
+- C8 — FC Adapter (per-FC encoding of LatLonAlt → MAVLink/MSP2).
+- C10 — Pre-flight Cache Provisioning (bbox → tile-id list).
+- C12 — Operator Pre-flight Tooling (operator-entered bbox).
+
+## Interface (sketch)
+
+```
+class WgsConverter:
+    @staticmethod
+    def latlonalt_to_ecef(p: LatLonAlt) -> Vector3
+    @staticmethod
+    def ecef_to_latlonalt(p: Vector3) -> LatLonAlt
+    @staticmethod
+    def latlonalt_to_local_enu(origin: LatLonAlt, p: LatLonAlt) -> Vector3
+    @staticmethod
+    def local_enu_to_latlonalt(origin: LatLonAlt, p_enu: Vector3) -> LatLonAlt
+    @staticmethod
+    def latlon_to_tile_xy(zoom: int, lat: float, lon: float) -> tuple[int, int]
+    @staticmethod
+    def tile_xy_to_latlon_bounds(zoom: int, x: int, y: int) -> BoundingBox
+```
+
+## Implementation Notes
+
+- Stateless; pure functions.
+- Backed by `pyproj` for the geodesy primitives; tile_xy math uses the standard slippy-map convention (matches `satellite-provider`'s on-disk layout).
+- All conversions use the WGS84 ellipsoid; no datum-shift complexity.
+
+## Caveats
+
+- The static-only design satisfies the coderule.mdc constraint ("only use static methods for pure self-contained computations"). If a future deployment needs alternative datum support, switch to an instance-based factory then.
+- Tile-coordinate math is zoom-level-sensitive; callers MUST pass the right zoom level for the tile in question (typically `zoomLevel` from `TileMetadata`).
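For reference, the standard slippy-map (Web Mercator) tile math mentioned in the implementation notes can be sketched in pure Python as follows. The `BoundingBox` field layout (`south, west, north, east`) and the clamping at the grid edges are assumptions made for this example; the real `BoundingBox` type is not defined in this document.

```python
import math
from typing import NamedTuple, Tuple


class BoundingBox(NamedTuple):
    # Hypothetical field layout, in degrees.
    south: float
    west: float
    north: float
    east: float


def latlon_to_tile_xy(zoom: int, lat: float, lon: float) -> Tuple[int, int]:
    """Slippy-map tile index containing (lat, lon) at the given zoom."""
    n = 1 << zoom  # number of tiles per axis
    lat_rad = math.radians(lat)
    x = int(math.floor((lon + 180.0) / 360.0 * n))
    y = int(math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n))
    # Clamp so lon = 180 and latitudes beyond the Mercator limit stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_xy_to_latlon_bounds(zoom: int, x: int, y: int) -> BoundingBox:
    """Geographic bounds of tile (x, y); y grows southwards."""
    n = 1 << zoom

    def lon_of(xi: int) -> float:
        return xi / n * 360.0 - 180.0

    def lat_of(yi: int) -> float:
        return math.degrees(math.atan(math.sinh(math.pi * (1.0 - 2.0 * yi / n))))

    return BoundingBox(south=lat_of(y + 1), west=lon_of(x),
                       north=lat_of(y), east=lon_of(x + 1))
```

The same point maps to different `(x, y)` indices at different zoom levels, which is why the zoom argument must match the tile being addressed.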
@@ -0,0 +1,36 @@
+# Common Helper — `Sha256Sidecar`
+
+## Purpose
+
+Atomic-write + SHA-256 content-hash sidecar pattern (D-C10-3). Every persistent artifact that takeoff-load (F2) verifies must be written atomically AND have a `.sha256` sidecar that the verifier can independently recompute. Centralising the pattern avoids slightly different implementations across C6 (FAISS index, tile metadata), C7 (engine + calibration cache), and C10 (Manifest itself).
+
+## Used By
+
+- C6 — Tile Cache + Spatial Index (FAISS `.index`, descriptor sidecar; tile pixels do NOT use sidecars individually — there are too many; the Manifest covers the tile-tree hash collectively).
+- C7 — Inference Runtime (engine cache files + INT8 calibration cache; D-C10-6 calibration-cache trust depends on this).
+- C10 — Pre-flight Cache Provisioning (Manifest itself; aggregate hash of the cache root).
+
+## Interface (sketch)
+
+```
+class Sha256Sidecar:
+    @staticmethod
+    def write_atomic(path: Path, payload: bytes) -> sha256
+    @staticmethod
+    def write_atomic_and_sidecar(path: Path, payload: bytes) -> sha256
+    @staticmethod
+    def verify(path: Path) -> bool                           # checks payload hash against sidecar
+    @staticmethod
+    def aggregate_hash(paths: list[Path]) -> sha256          # for Manifest covering many files
+```
+
+## Implementation Notes
+
+- Backed by the `atomicwrites` package for atomic rename and Python's `hashlib.sha256` for digesting.
+- Sidecars are written as `<path>.sha256` containing the hex digest (no JSON wrapper — keeps verification trivial).
+- `aggregate_hash` is order-deterministic (sorts paths first) so two runs that read the same files yield the same aggregate.
+
+## Caveats
+
+- The atomic rename is filesystem-level — works on POSIX local filesystems, not on NFS / SMB / overlayfs. For production deployments the cache root MUST live on a local filesystem.
+- The sidecar is NOT cryptographically signed; it protects against accidental corruption + file-replacement-after-staging, NOT against an attacker with write access to the cache root. Threat model treats the operator workstation as trusted; the companion's write access is restricted to F4 (mid-flight tile gen), which has its own per-flight signing key path.
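The write-then-verify pattern described above can be sketched with the standard library alone. Note the substitution: this example uses a temporary file plus `os.replace` rather than the `atomicwrites` package named in the implementation notes, and it exposes module-level functions instead of static methods. `sidecar_path` is a hypothetical helper, and returning the hex digest as `str` is an assumption about what the `sha256` return type stands for.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sidecar_path(path: Path) -> Path:
    """`<path>.sha256`, next to the payload."""
    return path.with_name(path.name + ".sha256")


def write_atomic(path: Path, payload: bytes) -> str:
    """Write payload to a temp file in the same directory, then rename over path."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())      # make sure the bytes reach the disk first
        os.replace(tmp_name, path)    # atomic on POSIX local filesystems
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return hashlib.sha256(payload).hexdigest()


def write_atomic_and_sidecar(path: Path, payload: bytes) -> str:
    """Payload first, sidecar second; a crash in between makes verify() fail."""
    digest = write_atomic(path, payload)
    write_atomic(sidecar_path(path), digest.encode("ascii"))
    return digest


def verify(path: Path) -> bool:
    """Recompute the payload hash and compare it with the sidecar's hex digest."""
    sidecar = sidecar_path(path)
    if not path.is_file() or not sidecar.is_file():
        return False
    expected = sidecar.read_text(encoding="ascii").strip()
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == expected
```

Writing the payload before the sidecar means an interrupted write leaves a missing or stale sidecar, which the verifier treats as a failure rather than a false pass.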
@@ -0,0 +1,50 @@
+# Common Helper — `EngineFilenameSchema`
+
+## Purpose
+
+Self-describing `.engine` filename schema per D-C10-7. TensorRT engines are NOT portable across `(SM, JetPack, TRT, precision)` tuples; the filename schema makes mismatch instantly visible at takeoff load (F2) so refusing-to-deserialize-on-mismatch becomes trivial.
+
+## Used By
+
+- C7 — Inference Runtime (writes engines with this schema; reads them on `deserialize_engine`).
+- C10 — Pre-flight Cache Provisioning (compiles engines via C7 with this schema; writes them to the cache root).
+
+## Interface (sketch)
+
+```
+class EngineFilenameSchema:
+    @staticmethod
+    def build(model_name: str, sm: int, jetpack: str, trt: str, precision: str) -> str
+    @staticmethod
+    def parse(filename: str) -> EngineCacheKey
+    @staticmethod
+    def matches_host(filename: str, host_capabilities: HostCapabilities) -> bool
+```
+
+Filename format: `{model}__sm{SM}_jp{JP_dotted}_trt{TRT_dotted}_{precision}.engine`
+
+Example: `ultravpr__sm87_jp6.2_trt10.3_fp16.engine`
+
+```
+EngineCacheKey:
+  model_name:       string
+  sm:               int               (e.g. 87 for Jetson Orin Nano Super)
+  jetpack:          string            (e.g. "6.2")
+  trt:              string            (e.g. "10.3")
+  precision:        enum {fp16, int8, mixed}
+
+HostCapabilities:
+  current_sm:       int
+  current_jetpack:  string
+  current_trt:      string
+```
+
+## Implementation Notes
+
+- Stateless; pure string parsing.
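+- `matches_host` returns true iff the filename's `sm`, `jetpack`, and `trt` fields exactly equal the corresponding `HostCapabilities` fields (`HostCapabilities` carries no precision field, so precision is not part of this comparison). F2 takeoff load uses this to decide which engines to deserialize and which to refuse.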
+
+## Caveats
+
+- The dotted-version format must round-trip cleanly through filesystems (no `/` or `\` in dotted versions; safe).
+- Adding a new tuple dimension (e.g., a per-binary `BUILD_*` flag combination) requires extending the schema AND renaming every existing `.engine` file to match. Versioning the schema itself is a Plan-phase carryforward if/when needed.
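A minimal sketch of the filename schema, written as module-level functions for brevity. The regular expression, the decision to return `False` for unparseable filenames in `matches_host`, and the use of plain strings for `precision` are assumptions made for this example; only the format string and the field names come from the text above.

```python
import re
from dataclasses import dataclass

# {model}__sm{SM}_jp{JP_dotted}_trt{TRT_dotted}_{precision}.engine
_ENGINE_RE = re.compile(
    r"^(?P<model>.+?)__sm(?P<sm>\d+)"
    r"_jp(?P<jp>\d+(?:\.\d+)*)"
    r"_trt(?P<trt>\d+(?:\.\d+)*)"
    r"_(?P<precision>fp16|int8|mixed)\.engine$"
)


@dataclass(frozen=True)
class EngineCacheKey:
    model_name: str
    sm: int
    jetpack: str
    trt: str
    precision: str


@dataclass(frozen=True)
class HostCapabilities:
    current_sm: int
    current_jetpack: str
    current_trt: str


def build(model_name: str, sm: int, jetpack: str, trt: str, precision: str) -> str:
    return f"{model_name}__sm{sm}_jp{jetpack}_trt{trt}_{precision}.engine"


def parse(filename: str) -> EngineCacheKey:
    m = _ENGINE_RE.match(filename)
    if m is None:
        raise ValueError(f"not a schema-conformant engine filename: {filename!r}")
    return EngineCacheKey(m["model"], int(m["sm"]), m["jp"], m["trt"], m["precision"])


def matches_host(filename: str, host: HostCapabilities) -> bool:
    try:
        key = parse(filename)
    except ValueError:
        return False  # assumption: unparseable names are refused, not raised
    # Exact comparison of the host-side fields; versions are compared as strings.
    return (key.sm, key.jetpack, key.trt) == (
        host.current_sm, host.current_jetpack, host.current_trt
    )
```

Because versions are compared as strings, `"6.2"` and `"6.2.0"` count as a mismatch, which is consistent with the exact-match rule in the implementation notes.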
@@ -0,0 +1,49 @@
+# Common Helper — `RansacFilter`
+
+## Purpose
+
+Thin wrapper around OpenCV's RANSAC + reprojection-residual computation. Used by the cross-domain matcher (C3 — RANSAC over 2D-2D correspondences for the per-candidate inlier count), the conditional refiner (C3.5 — recompute residual after AdHoP refinement), and the pose estimator (C4 — RANSAC inside `solvePnPRansac` is OpenCV-internal, but C4 also computes per-frame final reprojection residual via this helper for FDR provenance).
+
+## Used By
+
+- C3 — Cross-domain Matcher.
+- C3.5 — AdHoP-conditional Refiner.
+- C4 — Pose Estimation.
+
+## Interface (sketch)
+
+```
+class RansacFilter:
+    @staticmethod
+    def filter_correspondences(
+        correspondences: ndarray[N, 4],
+        ransac_threshold_px: float,
+        min_inliers: int,
+    ) -> RansacResult
+
+    @staticmethod
+    def compute_reprojection_residual(
+        correspondences: ndarray[I, 4],
+        K: ndarray[3, 3],
+        distortion: ndarray,
+        pose: SE3,
+    ) -> float                                      # median residual in pixels
+```
+
+```
+RansacResult:
+  inlier_correspondences:        ndarray[I, 4]
+  inlier_count:                  int
+  outlier_count:                 int
+  median_residual_px:            float
+```
+
+## Implementation Notes
+
+- Backed by `cv2.findHomography(..., cv2.RANSAC)` for the 2D-2D case and `cv2.projectPoints(...)` for the post-pose residual.
+- Stateless; pure function.
+
+## Caveats
+
+- The RANSAC threshold is a tunable; defaults are documented per-component (C3, C3.5, C4) in their specs.
+- For 2D-3D RANSAC inside C4's `solvePnPRansac`, OpenCV does it internally — this helper is for the standalone reprojection-residual computation that lives outside the PnP call.
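To make the `RansacResult` fields concrete, the sketch below classifies correspondences against an already-estimated homography and reports the median residual. It is not RANSAC itself (model estimation would come from `cv2.findHomography`); it only shows the bookkeeping, in pure Python. The function names, the dict return value, the `[x1, y1, x2, y2]` row layout, and computing the median over inliers only are all assumptions made for illustration.

```python
import math
from statistics import median
from typing import Dict, List, Sequence

Row = Sequence[float]  # assumed layout: [x1, y1, x2, y2]


def transfer_errors(H: Sequence[Sequence[float]], correspondences: Sequence[Row]) -> List[float]:
    """Pixel distance between H applied to (x1, y1) and the observed (x2, y2)."""
    errors = []
    for x1, y1, x2, y2 in correspondences:
        w = H[2][0] * x1 + H[2][1] * y1 + H[2][2]
        if abs(w) < 1e-12:
            errors.append(math.inf)  # point maps to infinity; never an inlier
            continue
        px = (H[0][0] * x1 + H[0][1] * y1 + H[0][2]) / w
        py = (H[1][0] * x1 + H[1][1] * y1 + H[1][2]) / w
        errors.append(math.hypot(px - x2, py - y2))
    return errors


def split_inliers(H: Sequence[Sequence[float]], correspondences: Sequence[Row],
                  ransac_threshold_px: float) -> Dict[str, object]:
    """Fill RansacResult-like fields for a fixed model H."""
    errors = transfer_errors(H, correspondences)
    inliers = [c for c, e in zip(correspondences, errors) if e <= ransac_threshold_px]
    inlier_errors = [e for e in errors if e <= ransac_threshold_px]
    return {
        "inlier_correspondences": inliers,
        "inlier_count": len(inliers),
        "outlier_count": len(correspondences) - len(inliers),
        "median_residual_px": median(inlier_errors) if inlier_errors else math.inf,
    }
```

A caller would then compare `inlier_count` against its own `min_inliers` threshold, which, like the pixel threshold, is a per-component tunable.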
@@ -0,0 +1,33 @@
+# Common Helper — `DescriptorNormaliser`
+
+## Purpose
+
+L2-normalise descriptors so that cosine similarity, inner product, and Euclidean distance all produce the same ranking. This is required because FAISS HNSW operates on Euclidean / inner-product spaces, while the upstream backbones (UltraVPR, MegaLoc, MixVPR, etc.) emit unnormalised embeddings that are meant to be compared by cosine similarity. The same normalisation MUST be applied at both the **corpus** side (C10 during F1 provisioning) and the **query** side (C2 at runtime); centralising the helper guarantees they don't drift apart.
+
+## Used By
+
+- C2 — VPR (query side; per-frame embedding before FAISS lookup).
+- C10 — Pre-flight Cache Provisioning (corpus side; per-tile embedding before FAISS index population).
+
+## Interface (sketch)
+
+```
+class DescriptorNormaliser:
+    @staticmethod
+    def l2_normalise(descriptor: ndarray[D]) -> ndarray[D]
+    @staticmethod
+    def l2_normalise_batch(descriptors: ndarray[N, D]) -> ndarray[N, D]
+    @staticmethod
+    def descriptor_metric() -> str                        # always "inner_product" — for FAISS index config
+```
+
+## Implementation Notes
+
+- Stateless; pure function.
+- Backed by numpy / numpy-CUDA depending on the runtime; preserves dtype (fp16 in → fp16 out, fp32 in → fp32 out).
+- The `descriptor_metric()` static return is the source of truth for the FAISS HNSW index distance metric — both C6's `DescriptorIndex.search_topk` and C10's index-build code consult it.
+
+## Caveats
+
+- Zero-norm vectors are returned as the zero vector (no division-by-zero); callers must filter or accept that such descriptors will match nothing.
+- The choice of "inner product on L2-normalised" rather than "cosine" is FAISS-idiomatic — FAISS does not have a built-in cosine metric; cosine is achieved by L2-normalising and using inner product.
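The equivalence this helper relies on (inner product of L2-normalised vectors equals cosine similarity of the originals) and the zero-norm behaviour can be checked with a short pure-Python sketch. List-based vectors are used here only to keep the example dependency-free; the production helper is described as array-backed, and `inner_product` / `cosine_similarity` are hypothetical helper names.

```python
import math
from typing import List, Sequence


def l2_normalise(descriptor: Sequence[float]) -> List[float]:
    """Scale to unit L2 norm; a zero vector is returned unchanged (all zeros)."""
    norm = math.sqrt(sum(v * v for v in descriptor))
    if norm == 0.0:
        return [0.0] * len(descriptor)
    return [v / norm for v in descriptor]


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Reference definition, used only to check the equivalence."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)
```

For unit vectors the squared Euclidean distance is `2 - 2 * inner_product`, so ranking by inner product and ranking by Euclidean distance agree once both corpus and query are normalised.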