[AZ-918] [AZ-919] [AZ-920] [AZ-921] [AZ-922] VIO/ESKF baseline fixes

Derkachi e2e Tier-2 divergence had three stacked root causes; this
commit ships fixes for all three plus the IMU prerequisite they
depend on, plus a baseline cheirality gate for cv2.recoverPose.

AZ-918  MAVLink IMU adapters now convert raw mG/mrad-s + FRD body to
        SI m/s^2 + rad/s + FLU body via helpers.imu_units. Without
        this the ESKF receives values ~1000x too small with wrong-
        sign Y/Z and cannot function at all.

AZ-919  Composition root wires EskfNominalAltitudeProvider into the
        KLT/RANSAC strategy via the AZ-331 factory introspect path;
        OKVIS2 and VINS-Mono are unaffected.

AZ-920  KLT/RANSAC recovers metric translation via Ground Sampling
        Distance when AGL is available; otherwise falls through with
        scale_quality=direction_only/unknown (no fake scale invented).

AZ-921  VioOutput.scale_quality signal; ESKF add_vio adapts R_meas
        position block based on the flag (1e6 inflation when scale is
        direction_only/unknown to keep the filter consistent).

AZ-922  KLT/RANSAC cheirality gate rejects single-frame rotations
        beyond a config threshold (default 30 deg), catching
        cv2.recoverPose twisted-pair flips that cause immediate ESKF
        divergence on low-parallax aerial scenes.

Verification:
- Tier-1 (macOS) unit suite: 2346 passed, 0 failed.
- Tier-2 (Jetson) Derkachi e2e: divergence moves from frame 5
  (mahalanobis^2 3757) to frame 233 (mahalanobis^2 212). Remaining
  drift is open-loop attitude accumulation, not cheirality.

Follow-up tickets filed:
- AZ-923 closed as misdiagnosed: EskfNominalAltitudeProvider was
  already correct (nominal_pos.z IS the AGL when takeoff origin sits
  at ground level); the early-frame AGL near zero reflects the drone
  being stationary on the ground, not a provider bug.
- AZ-942 filed: cross-check VIO rotation against IMU preintegrator
  (consistency gate) - more physically grounded than the coarse
  AZ-922 threshold and likely required to absorb the frame-233 drift.

Co-authored-by: Cursor <cursoragent@cursor.com>
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-05-27 22:28:40 +03:00
parent 38170b3499
commit 94d2358c8b
32 changed files with 2320 additions and 82 deletions
@@ -127,6 +127,12 @@ Lives at `src/gps_denied_onboard/runtime_root/vio_factory.py`. Selects the strat
- **6×6 SPD covariance always returned**: `pose_covariance_6x6` is symmetric and positive-definite for every `VioOutput`. Implementations MUST NOT return a "tightened" covariance (smaller Frobenius norm) during a degradation event; honest covariance is the safety floor for AC-NEW-4 and AC-NEW-7. A test (covariance-monotonicity contract test, deferred to Step 9 / E-BBT) asserts this across all three strategies.
- **`frame_id` echo**: `VioOutput.frame_id` equals the input `NavCameraFrame.frame_id`. C5 relies on this for time-aligned factor insertion.
- **`relative_pose_T.translation()` is in metres** (NOT pixels, NOT unit-length). Every strategy MUST emit metric translation; C5 fuses it directly into the state estimator without further scaling. Monocular strategies (KLT/RANSAC) recover scale through an injected `AltitudeProvider` (see AZ-919/AZ-920); stereo / VIO strategies (OKVIS2, VINS-Mono) get scale from their backend optimization.
- **`scale_quality` carries the per-frame degraded-mode signal** (AZ-921). Three values:
- `"metric"` — translation is in metres, fully trustworthy. ESKF consumes `pose_covariance_6x6` as-is.
- `"direction_only"` — translation direction is informative but magnitude is not (near-vertical motion in a nadir camera; banked turn). ESKF overrides `R_meas[0:3, 0:3]` to `_DIRECTION_ONLY_TRANSLATION_SIGMA_M² = 64 m²` so the rotation update is honoured and the position update contributes little.
- `"unknown"` — translation is not trustworthy at all (AGL missing, zero inlier flow, hover, stationary). ESKF overrides `R_meas[0:3, 0:3]` to `_UNKNOWN_TRANSLATION_SIGMA_M² = 1e6 m²` so the position update is effectively skipped while the rotation update remains active.
Default `"unknown"` on the `VioOutput` dataclass keeps legacy strategies bug-for-bug compatible until they opt in to the AZ-919 `AltitudeProvider` plumbing.
- **Single-threaded by contract**: each `VioStrategy` instance is bound to one writer thread (the camera ingest thread). Concurrent calls to `process_frame` on the same instance are undefined behaviour. The composition root binds one instance per ingest thread.
- **`reset_to_warm_start` is destructive**: clears the strategy's keyframe window, IMU integration state, and feature track buffer; subsequent `process_frame` calls re-initialise from the hint. Calling `reset_to_warm_start` mid-flight is allowed (F8 reboot recovery) but must not be issued concurrently with a `process_frame` call on the same instance.
- **`current_strategy_label()` is constant per instance**: returns the same string for the lifetime of the instance and matches `config.vio.strategy` exactly. The label is FDR-stamped on every `VioHealth` event for AC-NEW-3 audit.
@@ -47,16 +47,24 @@ but the column **names** must match exactly (case-sensitive).
### Required columns
| # | Column | Unit | Type | Notes |
|---|--------|------|------|-------|
CSV columns use the MAVLink wire format (mG accel, mrad/s gyro, FRD
body frame). The parser converts to SI / FLU at the `ImuSample`
boundary via
`gps_denied_onboard.helpers.imu_units.mavlink_imu_to_si_flu` (AZ-918)
so downstream C5 ESKF + `imu_preintegrator` consumers see the contract
they were built for. **Operator-facing CSV files keep the raw scaling**
— the conversion is a parser-internal concern.
| # | Column | Unit (CSV) | Type | Notes |
|---|--------|------------|------|-------|
| 1 | `timestamp(ms)` | ms | float | Pixhawk wall clock at sample capture. **Ignored by the replay pipeline** — kept only for trace-back to the original tlog. |
| 2 | `Time` | s | float | **Canonical replay clock.** Must start at `0.0`, increase monotonically, and be uniformly spaced. The replay loop uses this column for every timestamp it emits. |
| 3 | `SCALED_IMU2.xacc` | mg | float | Body-frame X accelerometer, MAVLink `SCALED_IMU2` raw scaling. Forwarded unchanged into `ImuSample.accel_xyz[0]`. |
| 4 | `SCALED_IMU2.yacc` | mg | float | Body-frame Y accelerometer. |
| 5 | `SCALED_IMU2.zacc` | mg | float | Body-frame Z accelerometer. |
| 6 | `SCALED_IMU2.xgyro` | mrad/s | float | Body-frame X gyro, MAVLink `SCALED_IMU2` raw scaling. Forwarded unchanged into `ImuSample.gyro_xyz[0]`. |
| 7 | `SCALED_IMU2.ygyro` | mrad/s | float | Body-frame Y gyro. |
| 8 | `SCALED_IMU2.zgyro` | mrad/s | float | Body-frame Z gyro. |
| 3 | `SCALED_IMU2.xacc` | mg, FRD | float | Body-frame X accelerometer, MAVLink `SCALED_IMU2` raw scaling. Converted by the parser to m/s² in `ImuSample.accel_xyz[0]` (FLU body). |
| 4 | `SCALED_IMU2.yacc` | mg, FRD | float | Body-frame Y accelerometer; sign-flipped during FRD→FLU. |
| 5 | `SCALED_IMU2.zacc` | mg, FRD | float | Body-frame Z accelerometer; sign-flipped during FRD→FLU. |
| 6 | `SCALED_IMU2.xgyro` | mrad/s, FRD | float | Body-frame X gyro, MAVLink `SCALED_IMU2` raw scaling. Converted to rad/s in `ImuSample.gyro_xyz[0]` (FLU body). |
| 7 | `SCALED_IMU2.ygyro` | mrad/s, FRD | float | Body-frame Y gyro; sign-flipped during FRD→FLU. |
| 8 | `SCALED_IMU2.zgyro` | mrad/s, FRD | float | Body-frame Z gyro; sign-flipped during FRD→FLU. |
| 9 | `GLOBAL_POSITION_INT.lat` | degrees | float | WGS84 latitude. **Already in decimal degrees** (Derkachi dump convention — pre-divided by 1e7 from MAVLink's int representation). |
| 10 | `GLOBAL_POSITION_INT.lon` | degrees | float | WGS84 longitude (same convention as `lat`). |
| 11 | `GLOBAL_POSITION_INT.alt` | mm | float | MSL altitude. Parser divides by 1000 to emit metres. |