gps-denied-onboard/_docs/00_problem/acceptance_criteria.md
Oleksandr Bezdieniezhnykh 9eba1689b3 - Introduced a new document detailing the current state of the autodev process, including steps, status, and findings.
- Revised acceptance criteria in the acceptance_criteria.md file to clarify metrics and expectations, including updates to GPS accuracy and image processing quality.
- Enhanced restrictions documentation to reflect operational parameters and constraints for UAV flights, including camera specifications and satellite imagery usage.
- Added new research documents for acceptance criteria assessment and question decomposition to support ongoing project evaluation and decision-making.
2026-04-26 14:28:10 +03:00

Acceptance Criteria

Last revised: 2026-04-26 (post Mode B Solution Assessment + user-driven addendum on VPR granularity & change-robustness + user lock-in of Mode B open items Q1–Q5). Changes vs. previous version (2026-04-25): AC-1.2 split into hard-floor + stretch; AC-1.4 made quantitative; AC-2.2 split per pipeline stage; AC-3.4 dual-trigger; AC-4.3 autopilot-pinned; AC-5.2 N pinned; AC-7.1 scoped to level flight; AC-8.2 freshness by sector; six new AC added (AC-NEW-1 … AC-NEW-6). Changes 2026-04-26: AC-4.3 extended to dual-channel hybrid (GPS_INPUT primary + ODOMETRY auxiliary); AC-8.6 added (VPR retrieval-unit + change-robustness); AC-NEW-7 added with confirmed numeric thresholds (cache-poisoning safety budget).

Position Accuracy

  • AC-1.1 — The system shall determine GPS coordinates of frame centers within 50 m of true GPS for ≥80% of photos in normal flight segments.
  • AC-1.2 — The system shall determine GPS coordinates of frame centers within 20 m of true GPS for ≥50% of photos in normal flight segments.
  • AC-1.3 — Maximum cumulative VO drift between two consecutive satellite-anchored fixes shall be <100 m (VO-only fallback) or <50 m (when IMU is fused). Drift is measured as ‖VO-extrapolated centre − next anchor centre‖ at the moment of the anchor fix.
  • AC-1.4 — The system shall report a quantitative confidence score per position estimate, comprising:
    • the 95% covariance ellipse semi-major axis in meters, AND
    • a categorical label {satellite_anchored, vo_extrapolated, dead_reckoned}.
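
The confidence score in AC-1.4 can be illustrated with a minimal sketch. It assumes the horizontal position covariance is available as a 2×2 matrix in m² and that the error is Gaussian, so the 95 % ellipse uses the chi-square quantile for 2 degrees of freedom. The function names and the report layout are hypothetical, not part of the specification.

```python
import math

# 95 % chi-square quantile for 2 degrees of freedom (standard statistical constant).
CHI2_2DOF_95 = 5.991

# Categorical labels listed in AC-1.4.
SOURCE_LABELS = {"satellite_anchored", "vo_extrapolated", "dead_reckoned"}


def semi_major_axis_95(cov_xx: float, cov_xy: float, cov_yy: float) -> float:
    """95 % ellipse semi-major axis (m) of a 2x2 horizontal covariance (m^2)."""
    # Largest eigenvalue of the symmetric matrix [[cov_xx, cov_xy], [cov_xy, cov_yy]].
    mean = 0.5 * (cov_xx + cov_yy)
    diff = 0.5 * (cov_xx - cov_yy)
    lambda_max = mean + math.hypot(diff, cov_xy)
    return math.sqrt(CHI2_2DOF_95 * lambda_max)


def confidence_report(cov_xx: float, cov_xy: float, cov_yy: float, label: str) -> dict:
    """Bundle the two parts of the AC-1.4 score (hypothetical layout)."""
    if label not in SOURCE_LABELS:
        raise ValueError(f"unknown source label: {label}")
    return {
        "semi_major_95_m": semi_major_axis_95(cov_xx, cov_xy, cov_yy),
        "source": label,
    }
```

For example, a covariance with σ_x = 2 m and σ_y = 1 m and no correlation gives a semi-major axis of about 4.9 m.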

Image Processing Quality

  • AC-2.1 — Image registration rate >95% for normal flight segments (defined as: nadir flight ±10° bank / pitch, ≥40% overlap with prior frame, daytime, season-matched satellite tile).
  • AC-2.2 — Mean Reprojection Error (MRE):
    • <1.0 px for VO frame-to-frame homography on overlapping aerial pairs;
    • <2.5 px for satellite-anchored cross-domain (UAV photo ↔ ortho satellite tile) registration.
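
As an illustration of AC-2.2, the sketch below computes a mean reprojection error for point correspondences under a 3×3 homography and compares it with the per-stage limit. It assumes MRE is the mean Euclidean pixel distance between projected source points and their matches; whether the mean is taken over all matches or only RANSAC inliers is not stated in this document, and the stage keys are hypothetical.

```python
import math

# Per-stage limits from AC-2.2 (pixels); the dictionary keys are hypothetical names.
MRE_LIMITS_PX = {"vo": 1.0, "satellite": 2.5}


def apply_homography(H, x, y):
    """Project (x, y) through a 3x3 homography given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def mean_reprojection_error(H, src_pts, dst_pts):
    """Mean pixel distance between H(src) and dst over all correspondences."""
    errors = [
        math.dist(apply_homography(H, sx, sy), (dx, dy))
        for (sx, sy), (dx, dy) in zip(src_pts, dst_pts)
    ]
    return sum(errors) / len(errors)


def mre_passes(mre_px: float, stage: str) -> bool:
    """True when the error is strictly below the AC-2.2 limit for the stage."""
    return mre_px < MRE_LIMITS_PX[stage]
```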

Resilience & Edge Cases

  • AC-3.1 — The system shall continue operating correctly in the presence of outliers of up to 350 m between two consecutive photos (caused by airframe tilt up to ±20°).
  • AC-3.2 — The system shall continue operating correctly during sharp turns where the next photo overlaps <5% with the previous, drifts <200 m, and changes heading <70°. Sharp-turn frames are expected to fail VO and shall be handled by satellite-based re-localization (place recognition over the satellite tile cache).
  • AC-3.3 — The system shall handle ≥3 disconnected segments per flight, connecting each new segment to the previous trajectory via global descriptor retrieval + RANSAC pose-graph relocalization. This is a core capability, not a degraded mode.
  • AC-3.4 — When the system cannot determine position for ≥3 consecutive frames AND ≥2 s, it shall send a re-localization request to the ground station via telemetry. While waiting, it continues VO/IMU dead reckoning and the flight controller uses last known position + IMU extrapolation.
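
The dual trigger in AC-3.4 (both the frame count and the elapsed time must be exceeded) can be sketched as a small state machine. This is only an illustration: it assumes the 2 s window is measured from the first failed frame of the current run, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RelocTrigger:
    """Tracks consecutive frames without a position fix (AC-3.4 dual trigger)."""
    min_frames: int = 3        # ">=3 consecutive frames" from AC-3.4
    min_seconds: float = 2.0   # ">=2 s" from AC-3.4
    failed_frames: int = 0
    first_failure_t: Optional[float] = None

    def update(self, t: float, has_fix: bool) -> bool:
        """Feed one frame at time t (s); return True when a request is due."""
        if has_fix:
            # Any successful frame resets both conditions.
            self.failed_frames = 0
            self.first_failure_t = None
            return False
        if self.first_failure_t is None:
            self.first_failure_t = t
        self.failed_frames += 1
        return (
            self.failed_frames >= self.min_frames
            and t - self.first_failure_t >= self.min_seconds
        )
```

With frames arriving every 0.33 s, three failures alone do not fire the request; it fires once the failure run has also lasted 2 s.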

Real-Time Onboard Performance

  • AC-4.1 — End-to-end latency from camera capture to GPS coordinate output to the flight controller shall be <400 ms p95. Up to ~10% of frames may be dropped under sustained load (skip-allowed).
  • AC-4.2 — Memory usage shall remain below 8 GB shared on Jetson Orin Nano Super (CPU and GPU share the same 8 GB LPDDR5 pool).
  • AC-4.3 — The system shall output its position estimate to the flight controller via two parallel MAVLink channels, both emitted by pymavlink (general telemetry uses MAVSDK):
    • Primary: GPS_INPUT targeting ArduPilot with GPS1_TYPE=14 (MAVLink GPS substitute). Matches the "replacement for the GPS module" framing of the build.
    • Auxiliary (when the EKF emits a fix with full 6-DoF covariance and quality > VISO_QUAL_MIN): ODOMETRY so EKF3 can fuse the richer covariance + native yaw error + quality field. ArduPilot's own dev docs designate ODOMETRY as the preferred external-nav channel for non-GPS substitution; we hybridise to keep AC-4.3's GPS-substitute framing while not throwing away the covariance fidelity that AC-NEW-4 depends on.
    • FC source priorities are configured so GPS_INPUT remains the failover path if ODOMETRY trips a parameter gate.
    • v1 scope clause (added 2026-04-26 — see solution_draft03 finding M-30): v1 ships GPS_INPUT only; the ODOMETRY auxiliary channel is intentionally disabled in v1 because feeding both GPS_INPUT and ODOMETRY for overlapping axes triggers ArduPilot EKF3 double-fusion bugs (issues #30076 / #32506). EK3_SRC1_*=GPS+Compass; ODOMETRY emission re-enables in v1.1 once F-T9 SITL confirms PR #30080-class clean source-switching. Tests therefore assert v1 emits GPS_INPUT only and that ODOMETRY is intentionally absent on the wire.
    • (Decision rationale: MAVSDK has no native GPS_INPUT support — see _docs/00_research/00_ac_assessment.md Q-1; ODOMETRY hybrid rationale — see Mode B finding M-1 in _docs/00_research/02_fact_cards.md; v1 single-channel rationale — see Mode B round-2 finding M-30 in _docs/00_research/02_fact_cards.md / solution_draft03.)
  • AC-4.4 — Position estimates are streamed to the flight controller frame-by-frame; the system shall not batch or delay output.
  • AC-4.5 — The system may refine previously calculated positions and send corrections to the flight controller as updated estimates.
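
To make the primary channel in AC-4.3 concrete, the sketch below packs a position estimate into the fields of the MAVLink GPS_INPUT message (field names and ignore-flag bit values follow the MAVLink common dialect). Everything else is an assumption for illustration: the helper name, ignoring the DOP and velocity fields, the placeholder satellite count, and the fixed 18 s GPS-UTC leap offset.

```python
# Bit values of GPS_INPUT_IGNORE_FLAGS in the MAVLink common dialect.
IGNORE_HDOP = 2
IGNORE_VDOP = 4
IGNORE_VEL_HORIZ = 8
IGNORE_VEL_VERT = 16
IGNORE_SPEED_ACCURACY = 32

GPS_EPOCH_UNIX_S = 315964800  # 1980-01-06T00:00:00Z as a UNIX timestamp
GPS_UTC_LEAP_S = 18           # assumption: GPS-UTC offset in force since 2017
SECONDS_PER_WEEK = 604800


def gps_input_fields(unix_time_s, lat_deg, lon_deg, alt_m, h_acc_m, v_acc_m):
    """Return a dict keyed by GPS_INPUT field names (hypothetical helper)."""
    gps_s = unix_time_s - GPS_EPOCH_UNIX_S + GPS_UTC_LEAP_S
    week, sec_of_week = divmod(gps_s, SECONDS_PER_WEEK)
    return {
        "time_usec": int(unix_time_s * 1e6),
        "gps_id": 0,
        # This sketch supplies position and accuracy only.
        "ignore_flags": IGNORE_HDOP | IGNORE_VDOP | IGNORE_VEL_HORIZ
        | IGNORE_VEL_VERT | IGNORE_SPEED_ACCURACY,
        "time_week_ms": int(sec_of_week * 1000),
        "time_week": int(week),
        "fix_type": 3,                 # 3D fix
        "lat": round(lat_deg * 1e7),   # degE7, WGS84 (AC-6.3)
        "lon": round(lon_deg * 1e7),   # degE7, WGS84 (AC-6.3)
        "alt": alt_m,
        "hdop": 0.0, "vdop": 0.0,      # ignored via ignore_flags
        "vn": 0.0, "ve": 0.0, "vd": 0.0,
        "speed_accuracy": 0.0,
        "horiz_accuracy": h_acc_m,     # carries the covariance the FC relies on
        "vert_accuracy": v_acc_m,
        "satellites_visible": 10,      # placeholder value, assumption
    }
```

pymavlink's generated `gps_input_send` takes these same fields as arguments; the exact call site and connection setup are left out here because they depend on the deployment.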

Startup & Failsafe

  • AC-5.1 — The system shall initialise using the last known valid GPS position from the flight controller's EKF, plus IMU-extrapolated position at the moment of GPS denial.
  • AC-5.2 — If the system fails to produce any position estimate for >3 s, the flight controller shall fall back to IMU-only dead reckoning and the system shall log the failure.
  • AC-5.3 — On companion computer reboot mid-flight, the system shall attempt to re-initialise from the flight controller's current IMU-extrapolated position. See AC-NEW-1 for the cold-start time-to-first-fix budget.

Ground Station & Telemetry

  • AC-6.1 — Position estimates and confidence scores shall be streamed to QGroundControl via the MAVLink telemetry link. High-rate (per-frame) content stays on the local link for forensics; the GCS link is downsampled to 1–2 Hz for situational awareness.
  • AC-6.2 — The ground station can send commands to the onboard system (e.g., operator-assisted re-localization hint with approximate coordinates) via STATUSTEXT, NAMED_VALUE_FLOAT, or a custom MAVLink dialect.
  • AC-6.3 — Output coordinates are in WGS84 format (matches GPS_INPUT spec).

Object Localization (AI Camera)

  • AC-7.1 — Other onboard AI systems may request GPS coordinates of objects detected by the AI camera. Localization accuracy is consistent with the frame-center accuracy of the GPS-Denied system in level flight (bank/pitch <5°). In maneuvering flight, ground-projection error is bounded by altitude × |sin(unknown_bank_or_pitch)| and the system shall publish that bound alongside the estimate.
  • AC-7.2 — The system computes object coordinates trigonometrically using: current UAV GPS position (from GPS-Denied), known AI-camera gimbal angle, zoom, and current flight altitude. Flat-terrain assumption applies.
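
A minimal sketch of the flat-terrain geometry in AC-7.1 and AC-7.2 follows. It assumes the AI-camera optical axis is described by an off-nadir angle and a compass azimuth, uses a spherical-Earth small-offset conversion, and ignores zoom (which would only matter for pixels away from the image centre). All function and parameter names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def object_position(uav_lat_deg, uav_lon_deg, alt_agl_m, off_nadir_deg, azimuth_deg):
    """Ground point hit by the optical axis, assuming flat terrain (AC-7.2)."""
    ground_range = alt_agl_m * math.tan(math.radians(off_nadir_deg))
    north = ground_range * math.cos(math.radians(azimuth_deg))
    east = ground_range * math.sin(math.radians(azimuth_deg))
    dlat = math.degrees(north / EARTH_RADIUS_M)
    dlon = math.degrees(
        east / (EARTH_RADIUS_M * math.cos(math.radians(uav_lat_deg)))
    )
    return uav_lat_deg + dlat, uav_lon_deg + dlon


def maneuver_error_bound_m(alt_agl_m, unknown_angle_deg):
    """Ground-projection error bound from AC-7.1: altitude x |sin(angle)|."""
    return alt_agl_m * abs(math.sin(math.radians(unknown_angle_deg)))
```

At 1000 m altitude, an unknown bank of 5° gives a bound of roughly 87 m, which is why AC-7.1 limits the frame-center accuracy claim to level flight.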

Satellite Reference Imagery

  • AC-8.1 — Satellite reference imagery is provided by the Azaion Suite Satellite Service (a separate component of the Suite). The runtime onboard system consumes this service through an offline tile cache interface; it does not call commercial providers (Maxar, Airbus, Planet, etc.) directly. The Satellite Service is responsible for upstream sourcing and is out of scope for this build. Required resolution at the cache interface: at least 0.5 m/pixel, ideally 0.3 m/pixel.
  • AC-8.2 — Satellite tiles consumed at runtime shall be:
    • <6 months old for active-conflict sectors;
    • <12 months old for stable rear sectors. System shall reject or downgrade-confidence on tiles older than these thresholds (see AC-NEW-6).
  • AC-8.3 — Satellite imagery for the operational area shall be pre-loaded and pre-processed onto the companion computer before flight. Offline preprocessing time is not time-critical (minutes/hours). Pre-extracted tile descriptors (e.g., SuperPoint keypoints/descriptors and DINOv2-VLAD global descriptors) are part of the cache.
  • AC-8.4 — Mid-flight tile generation & write-back: during flight, the system shall continuously orthorectify navigation-camera frames into tiles aligned with the basemap projection and store them in the local cache, deduplicated so each ground sector is stored at most once (latest / highest-quality tile wins). On landing, the companion computer shall upload newly generated tiles back to the Azaion Suite Satellite Service so that the next mission cache contains imagery refreshed by the previous flight.
  • AC-8.5 — Storage policy: the system shall not retain raw navigation-camera frames or AI-camera frames as part of normal operation. Tiles are the only persistent imagery artifact. Forensic exception: a low-rate (≤0.1 Hz) thumbnail log of frames that failed tile generation may be retained for debugging within the FDR budget (AC-NEW-3).
  • AC-8.6 — VPR retrieval unit + change-robustness:
    • The Visual Place Recognition (Component 2) FAISS index shall be built over ground-footprint-sized "VPR chunks" (~600–800 m at the deployment altitude band, with 40–50 % overlap between adjacent chunks), decoupled from the slippy-XYZ storage tile (z=20). Any UAV frame footprint shall fall fully inside ≥1 chunk regardless of position.
    • The index shall be multi-scale: in addition to fine-scale chunks (derived from z=20 storage), a coarser-scale chunk descriptor set (z=17 or z=18 effective scale) shall be maintained for change-robust retrieval in active-conflict sectors where building destruction or major scene change is expected.
    • VPR top-K shall be dynamically sized by sector classification (AC-NEW-6) and EKF position covariance: K=5 in stable sectors with σ_xy ≤ 20 m; K=20 in active-conflict sectors; K=50 on expanding-window fallback.
    • VPR shall be invoked conditionally, not on every frame: in steady state (last anchor age < 2 s, σ_xy < 20 m, VO healthy), the system uses a geometric prior from IMU+VO predicted position to rank candidate chunks by distance alone. VPR's DINOv2 forward is invoked on re-loc triggers (cold start AC-NEW-1, sharp turn AC-3.2, disconnected segment AC-3.3, σ_xy > 50 m, or VO failure for ≥2 frames).
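
The top-K sizing and conditional invocation rules in AC-8.6 can be sketched as two small functions. This is an illustration under stated assumptions: the sector strings are hypothetical, and K for a stable sector with σ_xy above 20 m is not given in AC-8.6, so the sketch falls back to the active-conflict value as a conservative guess.

```python
def vpr_top_k(sector: str, sigma_xy_m: float, expanding_window: bool = False) -> int:
    """Pick K per AC-8.6; sector is 'stable' or 'active_conflict' (hypothetical names)."""
    if expanding_window:
        return 50
    if sector == "active_conflict":
        return 20
    if sector == "stable" and sigma_xy_m <= 20.0:
        return 5
    # Assumption: AC-8.6 does not state K for this case; reuse K=20.
    return 20


def should_invoke_vpr(
    sigma_xy_m: float,
    vo_failed_frames: int,
    cold_start: bool = False,
    sharp_turn: bool = False,
    new_segment: bool = False,
) -> bool:
    """True on any re-localization trigger listed in AC-8.6."""
    if cold_start or sharp_turn or new_segment:
        return True
    return sigma_xy_m > 50.0 or vo_failed_frames >= 2


def in_steady_state(anchor_age_s: float, sigma_xy_m: float, vo_healthy: bool) -> bool:
    """Steady state per AC-8.6: rank chunks by geometric prior only."""
    return anchor_age_s < 2.0 and sigma_xy_m < 20.0 and vo_healthy
```

The band between steady state and the triggers (for example σ_xy between 20 m and 50 m) is not defined in this document, so the sketch keeps the two predicates separate rather than guessing a policy.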

New AC (added in Phase 1 assessment, expanded with rationale & validation)

AC-NEW-1 — Time-to-first-fix on cold start

Statement. From companion-computer boot, the system shall emit its first valid GPS_INPUT message in <30 s, given an IMU-extrapolated initial position handed over from the flight controller's EKF.

Why it matters. A mid-flight reboot (brown-out, watchdog reset, OS panic) is a realistic scenario on a fixed-wing UAV running an 8-hour mission. The autopilot continues to fly on IMU dead reckoning during the gap; a 30 s budget keeps that drift under ~500 m at 60 km/h cruise, which the EKF can absorb when our first fix arrives.

Implementation drivers. TensorRT engines must be built at install time (not at first run); CUDA / TRT init <5 s; tile-cache mmap warm at start; FAISS index loaded before MAVLink connect; first VPR retrieval + cross-view match must succeed at full resolution within the remaining budget.

Validation. Bench: cold-boot the companion 50× with simulated FC-pose input; record time from boot to first valid GPS_INPUT MAVLink frame. Pass = 95th percentile <30 s.
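
The pass criterion for this bench can be sketched as follows. The percentile definition is an assumption (nearest-rank), since this document does not say which estimator to use, and the function names are hypothetical.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct % at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100.0 * len(ordered))
    return ordered[max(rank, 1) - 1]


def ttff_passes(boot_times_s, budget_s=30.0):
    """AC-NEW-1 check: 95th percentile of time-to-first-fix strictly below budget."""
    return percentile_nearest_rank(boot_times_s, 95) < budget_s
```

With 50 boots, the nearest-rank 95th percentile is the 48th smallest sample, so under this definition at most two boots may exceed 30 s.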

AC-NEW-2 — Spoofing-promotion latency

Statement. When the flight controller signals GPS denial or spoofing (ArduPilot fix-loss / EKF lane-switch event; PX4 EKF2_GPS_SPOOFED flag if PX4 ever returns to scope), the GPS-Denied system shall promote its own estimate to the FC's primary GPS source within <3 s.

Why it matters. Without this gate, the FC may continue to follow a spoofed real-GPS source while our valid estimate sits idle. 3 s is short enough to keep the FC from acting on a malicious heading change but long enough to ride out a single-frame anomaly.

Implementation drivers. Subscribe to GPS_RAW_INT, EKF_STATUS_REPORT, SYS_STATUS. Maintain an internal "real-GPS health" rolling average; switch to "primary" mode (raise our GPS_INPUT fix_type to 3D and assert) when health drops below threshold for ≥1 s. Emit STATUSTEXT to QGC on every promotion / demotion.
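
The rolling-average promotion logic described above might look like the sketch below. It assumes a scalar health score in [0, 1] has already been derived from GPS_RAW_INT, EKF_STATUS_REPORT and SYS_STATUS; the window size, threshold and immediate demotion on recovery are illustrative assumptions, and the class name is hypothetical.

```python
from collections import deque


class GpsHealthMonitor:
    """Promote to primary when rolling real-GPS health stays low for hold_s."""

    def __init__(self, window: int = 10, threshold: float = 0.5, hold_s: float = 1.0):
        self.samples = deque(maxlen=window)  # most recent health scores
        self.threshold = threshold
        self.hold_s = hold_s                 # ">=1 s" from the drivers above
        self.below_since = None
        self.primary = False

    def update(self, t: float, health: float) -> bool:
        """Feed one health sample at time t (s); return current primary state."""
        self.samples.append(health)
        average = sum(self.samples) / len(self.samples)
        if average < self.threshold:
            if self.below_since is None:
                self.below_since = t
            if t - self.below_since >= self.hold_s:
                self.primary = True   # a STATUSTEXT would be emitted here
        else:
            # Assumption: demote as soon as the average recovers.
            self.below_since = None
            self.primary = False
        return self.primary
```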

Validation. SITL: simulate spoofing (inject false GPS_RAW_INT from a malicious node); measure time from spoof onset to our promotion. Pass = 95th percentile <3 s.

AC-NEW-3 — Flight Data Recorder

Statement. The system shall retain to non-volatile storage, per flight: per-frame position estimates with covariance and source-label, IMU traces from the FC at full rate, all emitted GPS_INPUT frames, MAVLink raw stream (tlog), system health (CPU / GPU / temp / throttle), tiles generated mid-flight (AC-8.4), and a low-rate (≤0.1 Hz) thumbnail log of frames that failed tile generation. Raw nav-cam frames and AI-cam frames are NOT retained (AC-8.5). Storage cap 64 GB / flight; recorder rolls over (oldest segment dropped first) after cap.

Why it matters. Tiles, telemetry traces, and IMU are the operationally useful artifacts: they reproduce the mission, feed the next mission's cache (AC-8.4), and let post-mission analysis explain any false-position event (AC-NEW-4). Raw frames are large and redundant once tiles exist.

Implementation drivers. Per-day directory layout; fixed-size segment files; rollover policy on segment-close, not on every write. NVMe ≥64 GB on top of the persistent satellite-tile cache.
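
The segment-close rollover policy can be modelled with a short sketch that tracks sizes only (no file I/O). It assumes one segment of headroom is reserved for the open segment so the total stays within the cap; the class name and the counter standing in for a logged rollover event are hypothetical.

```python
from collections import deque


class SegmentRecorder:
    """Size-only model of fixed-size segments with oldest-first rollover."""

    def __init__(self, cap_bytes: int, segment_bytes: int):
        self.cap_bytes = cap_bytes
        self.segment_bytes = segment_bytes
        self.closed = deque()      # sizes of closed segments, oldest first
        self.open_size = 0
        self.rollover_events = 0   # stands in for logged rollover events

    def write(self, n_bytes: int) -> None:
        self.open_size += n_bytes
        if self.open_size >= self.segment_bytes:
            self._close_segment()

    def _close_segment(self) -> None:
        # Rollover is evaluated here, on segment close, not on every write.
        self.closed.append(self.open_size)
        self.open_size = 0
        # Keep one segment of headroom for the next open segment.
        while sum(self.closed) > self.cap_bytes - self.segment_bytes:
            self.closed.popleft()
            self.rollover_events += 1

    def total_bytes(self) -> int:
        return sum(self.closed) + self.open_size
```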

Validation. Bench: run an 8-hour synthetic load (3 Hz nav frames replayed from disk), assert the FDR ends ≤64 GB and no payload class is silently dropped without a logged rollover event.

AC-NEW-4 — False-position safety budget

Statement.

  • P(reported estimate error > 500 m) <0.1 % per flight.
  • P(reported estimate error > 1 km) <0.01 % per flight.

Why it matters. A single 1-km-off GPS_INPUT frame can hand the FC a heading that flies the UAV outside the geofence in seconds. The covariance carried in GPS_INPUT (h_acc) is the FC's only defense; this AC bounds the probability of our covariance under-reporting reality.

Implementation drivers. EKF covariance must be calibrated, not optimistic. Cross-view fixes with low inlier ratio must be rejected, not down-weighted to "small but non-zero". Outlier rejection at the EKF stage (Mahalanobis gate) is mandatory.
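
The two rejection rules named above (hard reject on low inlier ratio, then a Mahalanobis gate on the innovation) can be sketched for the 2-D horizontal case. The 99 % gate level and the 0.3 inlier-ratio floor are illustrative assumptions, not values from this document, and the function names are hypothetical.

```python
# 99 % chi-square quantile for 2 degrees of freedom; the gate level is an assumption.
CHI2_2DOF_99 = 9.210


def mahalanobis_sq_2d(dx, dy, s_xx, s_xy, s_yy):
    """Squared Mahalanobis distance of innovation (dx, dy) under covariance S."""
    det = s_xx * s_yy - s_xy * s_xy
    if det <= 0.0:
        raise ValueError("innovation covariance must be positive definite")
    # Uses the closed-form inverse of a symmetric 2x2 matrix.
    return (s_yy * dx * dx - 2.0 * s_xy * dx * dy + s_xx * dy * dy) / det


def accept_fix(dx, dy, s_xx, s_xy, s_yy, inlier_ratio, min_inlier_ratio=0.3):
    """Reject outright on low inlier ratio, then apply the Mahalanobis gate."""
    if inlier_ratio < min_inlier_ratio:
        return False  # rejected, not down-weighted
    return mahalanobis_sq_2d(dx, dy, s_xx, s_xy, s_yy) <= CHI2_2DOF_99
```

A 5 m innovation passes when the innovation standard deviation is 5 m per axis and fails when it is 1 m per axis, which is the behaviour an optimistic covariance would otherwise mask.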

Validation. Monte Carlo over the AerialVL public dataset (S03) and our own recorded Mavic flights, with synthetic IMU injection where applicable; report error CDF; pass = both probabilities below budget across ≥100 simulated flights worth of frames.

AC-NEW-5 — Operational environmental envelope

Statement. Operating temperature −20 °C to +50 °C; vibration / shock per RTCA DO-160G low-altitude UAV-class envelope. The cooling solution shall sustain the 25 W power mode at the upper temperature bound for the full 8-hour duty cycle without thermal throttling.

Why it matters. Without this, all latency / accuracy ACs are conditional on a benign thermal day. Eastern/southern Ukraine summers easily exceed +35 °C ambient inside a UAV bay; without active cooling, the Jetson throttles to 15 W mode and our 400 ms latency budget collapses.

Implementation drivers. Forced-air or active heatsink sized for 25 W continuous at +50 °C ambient bay temperature; thermal sensors logged in FDR (AC-NEW-3); throttle event = automatic STATUSTEXT warning to QGC.

Validation. Hot-soak chamber test: 25 W workload at +50 °C ambient for 8 h; assert no throttle. Cold-soak: −20 °C cold-start to first fix within AC-NEW-1 budget.

AC-NEW-6 — Imagery freshness enforcement

Statement. The system shall reject (or downgrade confidence on) any satellite tile whose capture date violates AC-8.2 (>6 months old in active-conflict sectors; >12 months old in stable rear sectors). Tiles generated mid-flight (AC-8.4) and not yet uploaded to the Suite Satellite Service are timestamped with the current flight date and treated as fresh.

Why it matters. Stale satellite tiles are the dominant cross-view-matching failure mode in active-conflict sectors (cratering, dam destruction, road realignment). A confident match against a stale tile is worse than no match.

Implementation drivers. Each tile carries capture_date metadata in the cache index. Sector classification (active vs stable) is part of the operational area definition handed in pre-flight. Confidence weight = 1.0 if within freshness budget, linearly decayed to 0.0 over a 30-day grace zone past the budget, hard reject beyond the grace.
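
The confidence-weight curve described above can be written as a short function. The conversion of "6 months" and "12 months" to 183 and 365 days is an assumption made for the sketch, as are the sector strings and the function name.

```python
# Assumption: month budgets from AC-8.2 expressed in days.
FRESHNESS_BUDGET_DAYS = {"active_conflict": 183.0, "stable": 365.0}


def freshness_weight(age_days: float, sector: str, grace_days: float = 30.0) -> float:
    """1.0 inside the budget, linear decay over the grace zone, 0.0 (reject) beyond."""
    budget = FRESHNESS_BUDGET_DAYS[sector]
    if age_days <= budget:
        return 1.0
    overshoot = age_days - budget
    if overshoot >= grace_days:
        return 0.0  # hard reject beyond the grace zone
    return 1.0 - overshoot / grace_days
```

For example, a tile 15 days past the budget in either sector gets a weight of 0.5, and a weight of 0.0 means the tile must not contribute to a satellite_anchored fix.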

Validation. Inject tiles with synthetic age into the cache; verify rejection / decay curve matches spec; verify a stale-tile match never produces a satellite_anchored source label.

AC-NEW-7 — Cache-poisoning safety budget

Statement. Per flight, across all onboard tiles written by Component 1b (in-flight ortho-tile generator):

  • P(onboard tile geo-misaligned > 30 m) <1 %.
  • P(onboard tile geo-misaligned > 100 m) <0.1 %.

Why it matters. Onboard tiles feed back into the Suite Satellite Service's basemap (AC-8.4). Without this AC, a confidently-bad EKF pose can write a misaligned tile that, after Service ingest, becomes the next flight's satellite anchor — producing cross-flight error compounding that AC-NEW-4 (single-flight false-position budget) does not capture. This AC bounds the probability that an onboard tile's claimed geo-alignment is wrong by a margin that would propagate to a downstream flight.

Implementation drivers.

  • Service-source tiles are immutable within freshness budget (AC-8.2); onboard tiles overwrite only stale or other-onboard tiles.
  • The Suite Satellite Service ingest applies a 2-flight voting layer: an onboard tile gets promoted to "trusted basemap" only after N≥2 independent flights confirm consistent geo-alignment within X m of each other. (Active sectors per AC-NEW-6 may use single-flight promotion when σ_xy ≤ 3 m AND OSM-road-overlap ≥ 70 %.)
  • The Component-1b parent-pose covariance is a hard gate in the local quality score: σ_xy ≤ 5 m for a hard write (trust_level = candidate); σ_xy ≤ 3 m for trust_level = candidate with full quality; tiles written in the σ_xy ∈ (3, 5] m band are marked trust_level = soft in the sidecar.
  • Eligibility check (Component 1b) tightens generation gate from σ_xy ≤ 10 m to σ_xy ≤ 5 m.
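
The covariance gate for onboard tile writes can be sketched as below. The drivers above can be read in more than one way for the (3, 5] m band, so this sketch adopts one reading and labels it as an assumption: σ_xy ≤ 3 m writes a candidate tile, σ_xy in (3, 5] m writes a soft tile, and anything above 5 m is not written. The function name is hypothetical.

```python
from typing import Optional


def tile_trust_level(sigma_xy_m: float) -> Optional[str]:
    """Map parent-pose sigma_xy (m) to a sidecar trust_level, or None for no write.

    Assumed reading of AC-NEW-7: <=3 m -> 'candidate', (3, 5] m -> 'soft',
    >5 m -> tile is not generated (eligibility gate).
    """
    if sigma_xy_m <= 3.0:
        return "candidate"
    if sigma_xy_m <= 5.0:
        return "soft"
    return None
```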

Validation. Multi-flight Monte Carlo replay over AerialVL + Mavic + AerialExtreMatch with synthetic over-confidence injection (artificially deflate EKF covariance by 1.5×–3×): assert both probabilities below budget across ≥100 simulated flights worth of frames. Independently, Service-side voting layer is exercised in F-T3 to verify candidate tiles are not promoted to trusted basemap before N-flight confirmation.