Refactor acceptance criteria, problem description, and restrictions for UAV GPS-Denied system. Enhance clarity and detail in performance metrics, image processing requirements, and operational constraints. Introduce new sections for UAV specifications, camera details, satellite imagery, and onboard hardware.

Oleksandr Bezdieniezhnykh
2026-03-17 09:00:06 +02:00
parent 767874cb90
commit f2aa95c8a2
35 changed files with 4857 additions and 26 deletions
+40 -11
@@ -1,21 +1,50 @@
# Position Accuracy
- The system should determine the GPS coordinates of frame centers for 80% of photos to within a 50 m error of the real GPS
- The system should determine the GPS coordinates of frame centers for 60% of photos to within a 20 m error of the real GPS
- Maximum cumulative VO drift between satellite correction anchors should be less than 100 meters
- System should report a confidence score per position estimate (high = satellite-anchored, low = VO-extrapolated with drift)
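The 80%/50 m and 60%/20 m criteria can be checked offline against a ground-truth GPS log. A minimal sketch, assuming per-frame (lat, lon) pairs; `haversine_m` and `accuracy_report` are illustrative helper names, not part of the system:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    R = 6_371_000.0  # mean Earth radius, meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

def accuracy_report(estimated, truth, thresholds=(20.0, 50.0)):
    """Fraction of frames whose estimated center falls within each error threshold."""
    errors = [haversine_m(e[0], e[1], t[0], t[1]) for e, t in zip(estimated, truth)]
    return {th: sum(err <= th for err in errors) / len(errors) for th in thresholds}
```

The acceptance check is then `accuracy_report(...)[50.0] >= 0.8` and `[20.0] >= 0.6` over a full flight log.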
# Image Processing Quality
- Image Registration Rate > 95% for normal flight segments, i.e., the system finds enough matching features to confidently calculate the camera's 6-DoF pose and stitch that image into the trajectory
- Mean Reprojection Error (MRE) < 1.0 pixels, where MRE is the distance, in pixels, between a feature's original pixel location and its re-projected location
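The MRE metric can be computed from matched feature pairs and the estimated transform. A minimal pure-Python sketch using a 3x3 homography for illustration (the real pipeline estimates a full 6-DoF pose; function names are hypothetical):

```python
import math

def project(H, pt):
    """Apply a 3x3 homography (nested lists) to a 2D pixel coordinate."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def mean_reprojection_error(H, src_pts, dst_pts):
    """Mean pixel distance between reprojected source points and their matches."""
    errs = [math.dist(project(H, s), d) for s, d in zip(src_pts, dst_pts)]
    return sum(errs) / len(errs)
```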
# Resilience & Edge Cases
- The system should correctly continue work even in the presence of up to 350m outlier between 2 consecutive photos (due to tilt of the plane)
- System should correctly continue work during sharp turns, where the next photo doesn't overlap at all or overlaps less than 5%. The next photo should be within 200m drift and at an angle of less than 70 degrees. Sharp-turn frames are expected to fail VO and should be handled by satellite-based re-localization
- System should operate when the UAV makes a sharp turn and the next photos have no common points with the previous route. In that situation it should determine the location of the new route segment and connect it to the previous route. There may be more than two such disconnected segments, so this strategy must be core to the system
- If the system cannot determine the position of 3 consecutive frames by any means, it should send a re-localization request to the ground station operator via the telemetry link. While waiting for operator input, the system keeps attempting VO/IMU dead reckoning, and the flight controller uses the last known position plus IMU extrapolation
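The escalation policy in the bullets above (satellite anchor, then VO, then IMU dead reckoning, then an operator request after 3 consecutive failures) can be sketched as a small decision function. Names and thresholds here are illustrative, not a prescribed design:

```python
from enum import Enum

class PositionSource(Enum):
    SATELLITE = "high"    # satellite-anchored match, high confidence
    VO = "medium"         # visual-odometry extrapolation, drift accumulates
    IMU = "low"           # IMU dead reckoning only
    OPERATOR = "waiting"  # re-localization request sent to ground station

def next_source(sat_ok: bool, vo_ok: bool, consecutive_failures: int) -> PositionSource:
    """Pick the position source for the current frame (illustrative policy).
    After 3 consecutive frames with neither satellite nor VO, escalate to operator."""
    if sat_ok:
        return PositionSource.SATELLITE
    if vo_ok:
        return PositionSource.VO
    if consecutive_failures >= 3:
        return PositionSource.OPERATOR
    return PositionSource.IMU
```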
# Real-Time Onboard Performance
- Less than 400ms end-to-end per frame: from camera capture to GPS coordinate output to the flight controller (camera shoots at ~3fps)
- Memory usage should stay below 8GB shared memory (Jetson Orin Nano Super — CPU and GPU share the same 8GB LPDDR5 pool)
- The system must output calculated GPS coordinates directly to the flight controller via MAVLink GPS_INPUT messages (using MAVSDK)
- Position estimates are streamed to the flight controller frame-by-frame; the system does not batch or delay output
- The system may refine previously calculated positions and send corrections to the flight controller as updated estimates
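Because the output path is MAVLink GPS_INPUT, each estimate must be converted to the message's units; per the MAVLink common message set, lat/lon are int32 values in degrees * 1e7 (degE7). A sketch of the field preparation for a subset of fields, with hypothetical helper names (the actual send would go through MAVSDK):

```python
from dataclasses import dataclass

@dataclass
class GpsInputFields:
    """Subset of MAVLink GPS_INPUT fields (illustrative; see the MAVLink
    common message set for the full message definition)."""
    lat: int        # latitude, degE7 (int32)
    lon: int        # longitude, degE7 (int32)
    alt_m: float    # altitude MSL, meters
    hdop: float     # horizontal dilution of precision
    fix_type: int   # 3 = 3D fix

def to_gps_input(lat_deg: float, lon_deg: float, alt_m: float,
                 hdop: float = 1.0) -> GpsInputFields:
    """Convert a WGS84 estimate to the integer units GPS_INPUT expects."""
    return GpsInputFields(lat=round(lat_deg * 1e7), lon=round(lon_deg * 1e7),
                          alt_m=alt_m, hdop=hdop, fix_type=3)
```

The confidence score could be mapped onto `hdop` so the flight controller's estimator weights low-confidence (VO-extrapolated) fixes less.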
# Startup & Failsafe
- The system initializes using the last known valid GPS position from the flight controller before GPS denial begins
- If the system completely fails to produce any position estimate for more than N seconds (TBD), the flight controller should fall back to IMU-only dead reckoning and the system should log the failure
- On companion computer reboot mid-flight, the system should attempt to re-initialize from the flight controller's current IMU-extrapolated position
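The N-second failsafe could be implemented as a simple watchdog around the estimator loop. `EstimateWatchdog` is a hypothetical name, and the timeout value remains TBD as stated above:

```python
import time

class EstimateWatchdog:
    """Tracks time since the last successful position estimate.
    timeout_s corresponds to the TBD 'N seconds' in the failsafe requirement."""
    def __init__(self, timeout_s: float, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock          # injectable clock for testing
        self._last_ok = clock()

    def estimate_produced(self):
        """Call whenever a valid position estimate is emitted."""
        self._last_ok = self._clock()

    def should_fallback(self) -> bool:
        """True when the flight controller should drop to IMU-only dead reckoning."""
        return self._clock() - self._last_ok > self.timeout_s
```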
# Ground Station & Telemetry
- Position estimates and confidence scores should be streamed to the ground station via telemetry link for operator situational awareness
- The ground station can send commands to the onboard system (e.g., operator-assisted re-localization hint with approximate coordinates)
- Output coordinates in WGS84 format
# Object Localization
- Other onboard AI systems can request GPS coordinates of objects detected by the AI camera
- The GPS-Denied system calculates object coordinates trigonometrically using: current UAV GPS position (from GPS-Denied), known AI camera angle, zoom, and current flight altitude. Flat terrain is assumed
- Accuracy is consistent with the frame-center position accuracy of the GPS-Denied system
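The flat-terrain trigonometry described above amounts to projecting the camera's line of sight onto the ground plane. A sketch, assuming tilt is measured from nadir (0 = straight down) and using a small-offset meters-to-degrees conversion; the function name and angle conventions are illustrative:

```python
import math

def object_position(uav_lat, uav_lon, alt_m, cam_tilt_deg, cam_bearing_deg):
    """Flat-terrain object localization. cam_tilt_deg is measured from nadir;
    cam_bearing_deg is the compass bearing of the camera's line of sight.
    Returns (lat, lon) of the ground intersection point."""
    ground_range = alt_m * math.tan(math.radians(cam_tilt_deg))  # meters from UAV
    north = ground_range * math.cos(math.radians(cam_bearing_deg))
    east = ground_range * math.sin(math.radians(cam_bearing_deg))
    dlat = north / 111_320.0                                     # meters per degree latitude
    dlon = east / (111_320.0 * math.cos(math.radians(uav_lat)))
    return uav_lat + dlat, uav_lon + dlon
```

Zoom affects which pixel maps to which tilt/bearing offset; once the line-of-sight angles are known, the ground intersection follows as above.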
# Satellite Reference Imagery
- Satellite reference imagery resolution must be at least 0.5 m/pixel, ideally 0.3 m/pixel
- Satellite imagery for the operational area should be less than 2 years old where possible
- Satellite imagery must be pre-processed and loaded onto the companion computer before flight. Offline preprocessing time is not time-critical (can take minutes/hours)
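The pre-flight tile budget can be estimated from coverage area and resolution. A rough sketch; the 0.7 bytes/pixel figure is an assumed average for JPEG-compressed RGB imagery, not a measured value:

```python
def tile_storage_bytes(area_km2: float, m_per_px: float = 0.5,
                       bytes_per_px: float = 0.7) -> float:
    """Rough storage estimate for satellite reference tiles covering area_km2."""
    px = (area_km2 * 1_000_000.0) / (m_per_px ** 2)  # total pixels at given GSD
    return px * bytes_per_px
```

For example, 100 km² at 0.5 m/pixel is about 280 MB under this assumption, which is what the limited onboard storage must accommodate.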
+2 -4
@@ -1,4 +1,2 @@
We have a wing-type UAV with a downward-pointing camera that can take photos 3 times per second at a resolution of 6200*4100. The plane also carries a flight controller with an IMU. GPS coordinates are known at the start of the flight, but during the flight GPS can be disabled or spoofed. We need to determine the GPS coordinates of the center of each subsequent camera frame, as well as the coordinates of the center of any object in these photos. We can use an external satellite provider for ground checks on the existing photos, so before the flight the UAV's operator should upload the satellite photos to the plane's companion PC.
The real-world examples are in the input_data folder, but the distance between consecutive photos there is much larger than it would be from a real plane: in that example, photos were taken once every 2-3 seconds, whereas in a real-world scenario frames would arrive at intervals of no more than 500 ms, or even 400 ms.
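The gap between the sample data and real flight conditions can be quantified as forward overlap between consecutive frames. A sketch assuming a nadir-pointing camera over flat terrain; the focal length and sensor height used below are hypothetical example values, not the actual camera's:

```python
def frame_overlap(speed_mps: float, interval_s: float,
                  alt_m: float, focal_mm: float, sensor_h_mm: float) -> float:
    """Forward overlap fraction between consecutive nadir frames over flat terrain.
    Ground footprint along track = altitude * sensor_height / focal_length."""
    footprint_m = alt_m * sensor_h_mm / focal_mm
    spacing_m = speed_mps * interval_s          # ground distance between exposures
    return max(0.0, 1.0 - spacing_m / footprint_m)
```

At 400 ms intervals the overlap is much higher than in the sample data, which is why VO is expected to work frame-to-frame on the real plane.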
+37 -11
@@ -1,11 +1,37 @@
# UAV & Flight
- Photos are taken by only airplane (fixed-wing) type UAVs
- Photos are taken by the camera pointing downwards and fixed, but it is not autostabilized
- The flying range is restricted by the eastern and southern parts of Ukraine (to the left of the Dnipro River)
- Altitude is predefined and no more than 1km. The height of the terrain can be neglected
- Flights are done mostly in sunny weather
- During the flight, UAVs can make sharp turns, so that the next photo may be absolutely different from the previous one (no same objects), but it is rather an exception than the rule
- Number of photos per flight could be up to 3000, usually in the 500-1500 range
# Cameras
- UAV has two cameras:
1. **Navigation camera** — fixed, pointing downwards, not autostabilized. Used by GPS-Denied system for position estimation
2. **AI camera** — main camera with configurable angle and zoom, used by onboard AI detection systems
- Navigation camera resolution: FullHD to 6252*4168. Camera parameters are known: focal length, sensor width, resolution, etc.
- Cameras are connected to the companion computer (interface TBD: USB, CSI, or GigE)
- Terrain is assumed flat (eastern/southern Ukraine operational area); height differences are negligible
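Since the navigation camera's parameters are known, its ground sample distance (GSD) at nadir follows from similar triangles: GSD = altitude * sensor_width / (focal_length * image_width). A sketch with hypothetical parameter values, useful for matching camera frames against the 0.3-0.5 m/pixel satellite reference imagery:

```python
def ground_sample_distance(alt_m: float, sensor_w_mm: float,
                           focal_mm: float, image_w_px: int) -> float:
    """Meters of ground covered per pixel for a nadir-pointing camera."""
    # sensor and focal lengths share units (mm), so they cancel out
    return (alt_m * sensor_w_mm) / (focal_mm * image_w_px)
```

For instance, at 1 km altitude with a (hypothetical) 35 mm focal length, 35 mm sensor width, and 6252 px image width, the GSD is about 0.16 m/pixel, i.e., finer than the satellite reference.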
# Satellite Imagery
- We can use satellite providers, but we're limited right now to Google Maps, which could be outdated for some regions
- Satellite imagery for the operational area must be pre-loaded onto the companion computer before flight
# Onboard Hardware
- Processing is done on a Jetson Orin Nano Super (67 TOPS, 8GB shared LPDDR5, 25W TDP)
- The companion computer runs JetPack (Ubuntu-based) with CUDA/TensorRT available
- Onboard storage for satellite imagery is limited (exact capacity TBD, but must be accounted for in tile preparation)
- Sustained GPU load may cause thermal throttling; the processing pipeline must stay within thermal envelope
# Sensors & Integration
- Abundant IMU data is available (via the flight controller)
- The system communicates with the flight controller via MAVLink protocol using MAVSDK library
- The system must output GPS coordinates to the flight controller as a replacement for the real GPS module (MAVLink GPS_INPUT message)
- Ground station telemetry link is available but bandwidth-limited; it is not the primary output channel