Files
gps-denied-onboard/.planning/phases/01-eskf-core/01-02-PLAN.md
T

14 KiB

phase, plan, type, wave, depends_on, files_modified, autonomous, requirements, must_haves
phase plan type wave depends_on files_modified autonomous requirements must_haves
01-eskf-core 02 execute 1
src/gps_denied/core/coordinates.py
true
ESKF-06
truths artifacts key_links
pixel_to_gps uses camera intrinsic matrix K to unproject pixel to camera ray, not fake 0.1m/pixel scaling
Camera-to-body transform uses T_cam_body rotation (nadir-pointing camera mount)
Body-to-NED rotation uses a quaternion parameter (from ESKF attitude), not hardcoded identity
Ray-ground intersection at known altitude produces NED position, then converts to WGS84
gps_to_pixel is the inverse: WGS84 -> NED -> body -> camera -> pixel
transform_points applies a real 3x3 homography via cv2.perspectiveTransform
path provides contains
src/gps_denied/core/coordinates.py Full coordinate chain: pixel -> camera ray -> body -> NED -> WGS84 np.linalg.inv
from to via pattern
src/gps_denied/core/coordinates.py numpy K matrix construction, quaternion rotation, ray-ground intersection np.array
from to via pattern
src/gps_denied/core/coordinates.py cv2 perspectiveTransform for homography application cv2.perspectiveTransform
Replace all FAKE Math stubs in CoordinateTransformer with the real pixel-to-GPS coordinate chain: pixel -> camera ray (K matrix) -> body frame (T_cam_body) -> NED (quaternion) -> WGS84.

Purpose: Without real coordinate transforms, all position estimates, object localization, and satellite matching produce wrong results. The 0.1m/pixel approximation ignores altitude, camera intrinsics, and UAV attitude entirely.

Output: Updated src/gps_denied/core/coordinates.py with real geometric projections.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/codebase/CONVENTIONS.md @_docs/01_solution/solution.md (lines 187-222 — coordinate system spec) @src/gps_denied/core/coordinates.py @src/gps_denied/schemas/__init__.py

From src/gps_denied/schemas/__init__.py:

class GPSPoint(BaseModel):
    lat: float = Field(..., ge=-90, le=90)
    lon: float = Field(..., ge=-180, le=180)

class CameraParameters(BaseModel):
    focal_length: float  # mm
    sensor_width: float  # mm
    sensor_height: float  # mm
    resolution_width: int  # pixels
    resolution_height: int  # pixels
    principal_point: tuple[float, float] | None = None
    distortion_coefficients: list[float] | None = None

From _docs/01_solution/solution.md (camera intrinsics):

  • ADTI 20L V1 + 16mm lens: fx = fy = 16 * 5456 / 23.2 = 3763 pixels
  • cx = 2728, cy = 1816
  • Camera mount: nadir-pointing (Z-down), R_cam_body = Rx(180deg)
Task 1: Replace fake coordinate transforms with real camera projection math src/gps_denied/core/coordinates.py - src/gps_denied/core/coordinates.py (current file — understand existing methods, ENU convention, OriginNotSetError) - src/gps_denied/schemas/__init__.py (CameraParameters fields: focal_length in mm, sensor_width/height in mm, resolution_width/height in pixels, principal_point optional) - _docs/01_solution/solution.md (lines 187-222 for reference frames: Camera C, Body B, NED N, WGS84; transformation chain; T_cam_body spec) Modify `src/gps_denied/core/coordinates.py`. Keep all existing methods (set_enu_origin, get_enu_origin, gps_to_enu, enu_to_gps) UNCHANGED. Replace the implementations of `pixel_to_gps`, `gps_to_pixel`, `image_object_to_gps`, and `transform_points`.

Add imports at top:

import cv2
import numpy as np

Add module-level helper function:

def _build_intrinsic_matrix(cam: CameraParameters) -> np.ndarray:
    """Build 3x3 camera intrinsic matrix K from CameraParameters."""
    fx = cam.focal_length * cam.resolution_width / cam.sensor_width
    fy = cam.focal_length * cam.resolution_height / cam.sensor_height
    if cam.principal_point is not None:
        cx, cy = cam.principal_point
    else:
        cx = cam.resolution_width / 2.0
        cy = cam.resolution_height / 2.0
    return np.array([
        [fx,  0, cx],
        [ 0, fy, cy],
        [ 0,  0,  1],
    ], dtype=np.float64)

Add module-level helper:

def _cam_to_body_rotation() -> np.ndarray:
    """Camera-to-body rotation for nadir-pointing camera.

    Camera frame: Z forward (optical axis), X right, Y down (OpenCV convention).
    Body frame: X forward (nose), Y right (starboard), Z down.
    Camera points nadir: camera Z-axis aligns with body -Z (downward).

    The rotation is Rx(180deg): flips Y and Z axes.
    R_cam_body = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
    """
    return np.array([
        [1,  0,  0],
        [0, -1,  0],
        [0,  0, -1],
    ], dtype=np.float64)

Add module-level helper:

def _quat_to_rotation_matrix(q: np.ndarray) -> np.ndarray:
    """Convert [w, x, y, z] quaternion to 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ], dtype=np.float64)

Replace pixel_to_gps method:

def pixel_to_gps(
    self,
    flight_id: str,
    pixel: tuple[float, float],
    frame_pose: dict,
    camera_params: CameraParameters,
    altitude: float,
    quaternion: np.ndarray | None = None,
) -> GPSPoint:
    """Unproject pixel to GPS via ray-ground intersection.

    Chain: pixel -> camera ray (K^-1) -> body (T_cam_body) -> ENU (quaternion) -> WGS84.

    Args:
        flight_id: Flight identifier for ENU origin lookup.
        pixel: (u, v) pixel coordinates.
        frame_pose: Dict with "position" key -> [east, north, up] in ENU meters.
        camera_params: Camera intrinsic parameters.
        altitude: UAV altitude above ground in meters (positive up).
        quaternion: [w, x, y, z] body-to-ENU rotation quaternion.
                    If None, assumes identity (level flight, north-facing).
    """
    # Step 1: Pixel -> camera ray
    K = _build_intrinsic_matrix(camera_params)
    K_inv = np.linalg.inv(K)
    pixel_h = np.array([pixel[0], pixel[1], 1.0])
    ray_cam = K_inv @ pixel_h  # direction in camera frame

    # Step 2: Camera -> body
    R_cam_body = _cam_to_body_rotation()
    ray_body = R_cam_body @ ray_cam

    # Step 3: Body -> ENU
    if quaternion is not None:
        R_body_enu = _quat_to_rotation_matrix(quaternion)
    else:
        R_body_enu = np.eye(3)
    ray_enu = R_body_enu @ ray_body

    # Step 4: Ray-ground intersection
    # UAV is at altitude (Up component), ground is at Up=0
    # Ray: P = P_uav + t * ray_enu
    # Ground plane: Up = 0 -> t = -altitude / ray_enu[2]
    if abs(ray_enu[2]) < 1e-10:
        # Ray parallel to ground — return UAV position projected
        frame_enu = frame_pose.get("position", [0, 0, 0])
        return self.enu_to_gps(flight_id, (frame_enu[0], frame_enu[1], 0.0))

    t = -altitude / ray_enu[2]
    if t < 0:
        # Ray points upward — use UAV position
        frame_enu = frame_pose.get("position", [0, 0, 0])
        return self.enu_to_gps(flight_id, (frame_enu[0], frame_enu[1], 0.0))

    frame_enu = frame_pose.get("position", [0, 0, 0])
    ground_east = frame_enu[0] + t * ray_enu[0]
    ground_north = frame_enu[1] + t * ray_enu[1]

    # Step 5: ENU -> WGS84
    return self.enu_to_gps(flight_id, (ground_east, ground_north, 0.0))

Replace gps_to_pixel method:

def gps_to_pixel(
    self,
    flight_id: str,
    gps: GPSPoint,
    frame_pose: dict,
    camera_params: CameraParameters,
    altitude: float,
    quaternion: np.ndarray | None = None,
) -> tuple[float, float]:
    """Project GPS coordinate to image pixel.

    Inverse of pixel_to_gps: WGS84 -> ENU -> body -> camera -> pixel.
    """
    # Step 1: GPS -> ENU
    enu = self.gps_to_enu(flight_id, gps)

    # Step 2: ENU point relative to UAV
    frame_enu = frame_pose.get("position", [0, 0, 0])
    point_enu = np.array([
        enu[0] - frame_enu[0],
        enu[1] - frame_enu[1],
        -altitude,  # ground is at -altitude relative to UAV (UAV is at +altitude)
    ])

    # Step 3: ENU -> body
    if quaternion is not None:
        R_body_enu = _quat_to_rotation_matrix(quaternion)
    else:
        R_body_enu = np.eye(3)
    point_body = R_body_enu.T @ point_enu

    # Step 4: Body -> camera
    R_cam_body = _cam_to_body_rotation()
    point_cam = R_cam_body.T @ point_body

    # Step 5: Camera -> pixel
    if abs(point_cam[2]) < 1e-10:
        cx = camera_params.resolution_width / 2.0
        cy = camera_params.resolution_height / 2.0
        return (cx, cy)

    K = _build_intrinsic_matrix(camera_params)
    pixel_h = K @ (point_cam / point_cam[2])
    return (float(pixel_h[0]), float(pixel_h[1]))

Replace image_object_to_gps method:

def image_object_to_gps(
    self,
    flight_id: str,
    frame_id: int,
    object_pixel: tuple[float, float],
    frame_pose: dict | None = None,
    camera_params: CameraParameters | None = None,
    altitude: float = 100.0,
    quaternion: np.ndarray | None = None,
) -> GPSPoint:
    """Convert object pixel coordinates to GPS using full projection chain.

    If frame_pose or camera_params are not provided, uses defaults for
    backward compatibility (level flight, ADTI 20L V1 camera).
    """
    if frame_pose is None:
        frame_pose = {"position": [0, 0, 0]}
    if camera_params is None:
        camera_params = CameraParameters(
            focal_length=16.0,
            sensor_width=23.2,
            sensor_height=15.4,
            resolution_width=5456,
            resolution_height=3632,
        )
    return self.pixel_to_gps(
        flight_id, object_pixel, frame_pose, camera_params, altitude, quaternion
    )

Replace transform_points method:

def transform_points(
    self,
    points: list[tuple[float, float]],
    transformation: list[list[float]],
) -> list[tuple[float, float]]:
    """Apply 3x3 homography to a list of 2D points using cv2.perspectiveTransform."""
    if not points:
        return []
    H = np.array(transformation, dtype=np.float64)
    pts = np.array(points, dtype=np.float64).reshape(-1, 1, 2)
    transformed = cv2.perspectiveTransform(pts, H)
    return [(float(p[0][0]), float(p[0][1])) for p in transformed]

Per ESKF-06: complete coordinate chain replacing all FAKE Math stubs. python -c " import numpy as np from gps_denied.core.coordinates import CoordinateTransformer, _build_intrinsic_matrix from gps_denied.schemas import CameraParameters, GPSPoint

ct = CoordinateTransformer() ct.set_enu_origin('f1', GPSPoint(lat=48.0, lon=37.0))

Test: image center at altitude 600m with identity quaternion should project near UAV position

cam = CameraParameters(focal_length=16.0, sensor_width=23.2, sensor_height=15.4, resolution_width=5456, resolution_height=3632) pose = {'position': [0, 0, 0]} q_identity = np.array([1.0, 0.0, 0.0, 0.0])

gps = ct.pixel_to_gps('f1', (2728.0, 1816.0), pose, cam, 600.0, q_identity)

Center pixel should project to UAV nadir position

assert abs(gps.lat - 48.0) < 0.001, f'lat={gps.lat}' assert abs(gps.lon - 37.0) < 0.001, f'lon={gps.lon}'

Test K matrix

K = _build_intrinsic_matrix(cam) assert K.shape == (3, 3) fx_expected = 16.0 * 5456 / 23.2 assert abs(K[0, 0] - fx_expected) < 1.0, f'fx={K[0,0]}, expected={fx_expected}'

print('Coordinate chain tests OK') " <acceptance_criteria> - src/gps_denied/core/coordinates.py contains import numpy as np - src/gps_denied/core/coordinates.py contains import cv2 - src/gps_denied/core/coordinates.py contains def _build_intrinsic_matrix(cam: CameraParameters) -> np.ndarray - src/gps_denied/core/coordinates.py contains def _cam_to_body_rotation() -> np.ndarray - src/gps_denied/core/coordinates.py contains def _quat_to_rotation_matrix(q: np.ndarray) -> np.ndarray - src/gps_denied/core/coordinates.py contains K_inv = np.linalg.inv(K) in pixel_to_gps - src/gps_denied/core/coordinates.py contains ray_cam = K_inv @ pixel_h in pixel_to_gps - src/gps_denied/core/coordinates.py contains R_cam_body @ ray_cam in pixel_to_gps - src/gps_denied/core/coordinates.py contains t = -altitude / ray_enu[2] in pixel_to_gps (ray-ground intersection) - src/gps_denied/core/coordinates.py contains cv2.perspectiveTransform in transform_points - The string "FAKE Math" does NOT appear in the file - The string "0.1" scaling factor does NOT appear in pixel_to_gps - gps_to_enu and enu_to_gps methods are UNCHANGED from original - image_object_to_gps uses ADTI 20L V1 defaults: focal_length=16.0, sensor_width=23.2, resolution_width=5456 </acceptance_criteria> All FAKE Math stubs replaced. pixel_to_gps uses K^-1 unprojection, T_cam_body rotation, quaternion body-to-ENU, and ray-ground intersection. gps_to_pixel is the exact inverse. transform_points uses cv2.perspectiveTransform. image_object_to_gps uses real ADTI 20L V1 camera defaults.

1. `python -c "from gps_denied.core.coordinates import CoordinateTransformer, _build_intrinsic_matrix, _cam_to_body_rotation, _quat_to_rotation_matrix; print('Imports OK')"` — all new functions importable 2. `ruff check src/gps_denied/core/coordinates.py` — no lint errors 3. `grep -c "FAKE Math" src/gps_denied/core/coordinates.py` — returns 0 (all stubs removed)

<success_criteria>

  • pixel_to_gps implements real K^-1 unprojection with camera intrinsics
  • Camera-to-body rotation R_cam_body accounts for nadir-pointing mount
  • Body-to-ENU rotation uses quaternion parameter
  • Ray-ground intersection uses altitude for ground plane
  • Image center pixel projects to UAV nadir position (verified numerically)
  • gps_to_pixel is the exact inverse of pixel_to_gps
  • transform_points uses cv2.perspectiveTransform
  • No FAKE Math stubs remain </success_criteria>
After completion, create `.planning/phases/01-eskf-core/01-02-SUMMARY.md`