AZ-408 (3pt): Replace AZ-406 injector scaffolds with concrete generators
- outlier.py: deterministic stride + far-away tile replacement; AC-2 ≥350m offset
- blackout_spoof.py: paired video blackout + FC GPS spoof with ≤40ms alignment; AC-4 realistic fix_type/hdop; AC-NEW-8 200-500m inter-spoof deltas
- multi_segment.py: ≥3 disjoint windows, ≥30s gaps, ≤25% coverage
- fc_proxy.py: timed-splice runtime proxy with pre-activate RuntimeError guard
- _common.py: derive_rng + tile-manifest reader + tmpfs helpers
- injector_fixtures.py: pytest fixtures wired via runner conftest

AZ-410 (3pt): FT-P-02 cumulative drift between satellite anchors
- anchor_pair_detector.py: AC-1 detection, AC-2/3 pass-fraction, AC-4 monotonicity check, CSV evidence
- test_ft_p_02_derkachi_drift.py: scenario gated on upstream helper NotImplementedError (frame_source_replay / fdr_reader / imu_replay)

AZ-411 (2pt): FT-P-03 + FT-P-14 schema + WGS84
- estimate_schema.py: AC-1 schema completeness, AC-2 source-label set containment, AC-3 WGS84 range + int32 1e-7 decode
- test_ft_p_03_14_schema_wgs84.py: shared single-image-push scenario

Tests: 248 unit tests pass (+91 vs batch 68).
Reports: batch_69_report.md, batch_69_review.md (PASS), cumulative_review_batches_67-69_cycle1_report.md (PASS).

Co-authored-by: Cursor <cursoragent@cursor.com>
Code Review Report
Batch: 69 (AZ-408, AZ-410, AZ-411)
Date: 2026-05-16
Verdict: PASS
Findings
(none — see "Findings Sweep" below for the per-phase enumeration)
Findings Sweep
Phase 1 — Context Loading
Loaded task specs AZ-408_fixture_builders_synth_injectors.md,
AZ-410_ft_p_02_derkachi_drift.md, AZ-411_ft_p_03_14_schema_wgs84.md
plus _docs/02_document/module-layout.md (blackbox_tests cross-cutting
entry) and _docs/00_problem/input_data/flight_derkachi/ for fixture
schema.
Phase 2 — Spec Compliance
Per-AC walk:
AZ-408
- AC-1 (outlier seed-deterministic):
  test_outlier.py: test_build_is_seed_deterministic, test_different_seeds_produce_different_replacements, test_density_ratio_maps_to_correct_stride[light|medium|heavy] ✓
- AC-2 (≥350 m offset):
  test_outlier.py: test_every_replacement_exceeds_min_offset, test_far_away_indices_filters_by_distance ✓
- AC-3 (blackout_spoof ≤40 ms alignment):
  test_fc_proxy.py: test_alignment_err_below_40ms_when_clock_matches_first_blackout, test_alignment_err_within_budget_under_normal_clock_skew, test_proxy_spoofs_inside_window; schedule-side: test_blackout_spoof.py::test_schedule_has_max_alignment_err_per_ac3 ✓
- AC-4 (spoof realistic + AC-NEW-8 200-500 m deltas):
  test_blackout_spoof.py: test_spoof_fields_are_realistic, test_spoof_track_inter_position_delta_in_range ✓
- AC-5 (multi_segment ≥3 disjoint, ≥30 s gaps, ≤25 % coverage):
  test_multi_segment.py: test_produces_three_disjoint_segments, test_segments_are_at_least_30_seconds_apart, test_total_blackout_below_25_percent, test_rejects_overlapping_gap, test_rejects_too_few_segments ✓
- AC-6 (tmpfs auto-cleared):
  test_outlier.py: test_build_writes_only_under_out_root, test_build_overwrites_existing_out_root, test_cleanup_tmpfs_removes_scratch, test_cleanup_tmpfs_is_silent_for_missing_path ✓
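For readers unfamiliar with the AC-1 property, the following is a minimal sketch of what "seed-deterministic stride replacement" can look like. It is not the reviewed outlier.py: the function names `derive_rng_sketch` and `plan_replacements`, the SHA-256 sub-seed derivation, and the argument shapes are all assumptions made for illustration.

```python
import hashlib
import random


def derive_rng_sketch(seed: int, label: str) -> random.Random:
    """Hypothetical stand-in for a derive_rng helper.

    Hashes (seed, label) so each injector gets its own reproducible stream
    that does not depend on Python's per-process hash randomisation.
    """
    digest = hashlib.sha256(f"{seed}:{label}".encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def plan_replacements(
    n_frames: int, stride: int, n_candidates: int, seed: int
) -> dict[int, int]:
    """Map every stride-th frame index to a candidate tile index.

    The frame indices are fully determined by the stride; only the choice of
    candidate tile is random, and it is driven by the seeded RNG.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    if n_candidates < 1:
        raise ValueError("need at least one candidate tile")
    rng = derive_rng_sketch(seed, "outlier")
    return {
        frame: rng.randrange(n_candidates)
        for frame in range(0, n_frames, stride)
    }
```

Two calls with the same seed return the same plan, which is the property a test such as test_build_is_seed_deterministic would assert.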
AZ-410
- AC-1 (anchor-pair detection):
  test_anchor_pair_detector.py: five tests covering first-anchor-skip, visual-only, IMU-fused, dead-reckoned, and multi-pair flights ✓
- AC-2 (visual-only drift <100 m, ≥95 %):
  test_pass_fraction_all_pass, test_pass_fraction_partial, test_aggregate_round_trip ✓
- AC-3 (IMU-fused drift <50 m, ≥95 %):
  test_aggregate_round_trip (covers visual/IMU segregation); pass-fraction helper covers the bound check ✓
- AC-4 (monotonic distribution):
  test_check_monotonic_passes_for_increasing_medians, test_check_monotonic_flags_regression, test_check_monotonic_flags_2x_jump, test_bin_drifts_default_edges ✓
- AC-5 (parametrize across (fc_adapter, vio_strategy)):
  scenario test_ft_p_02_derkachi_drift.py requests both fixtures and is collected as 6 variants ✓ (verified via pytest --collect-only)
- Full Derkachi end-to-end (AC-1.3 runtime):
  documented NOT COVERED at unit-test time; gated by _harness_helpers_implemented until runner.helpers.{frame_source_replay,fdr_reader,imu_replay} land (owned by AZ-441 + AZ-407 leftovers). Same pattern as batch 68's AZ-444 hardware-loop ACs.
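The AC-2/AC-3 pass-fraction and AC-4 monotonicity checks can be sketched as below. This is an illustrative reading, not the reviewed anchor_pair_detector.py: the function signatures are hypothetical, and the interpretation of a "2x jump" as "a bin median more than twice the previous bin's median" is an assumption.

```python
def pass_fraction(drifts_m: list[float], bound_m: float) -> float:
    """Fraction of anchor-pair drifts strictly below the bound (e.g. 100 m)."""
    if not drifts_m:
        raise ValueError("no drift samples to evaluate")
    return sum(1 for d in drifts_m if d < bound_m) / len(drifts_m)


def check_monotonic(
    bin_medians: list[float], max_jump_ratio: float = 2.0
) -> list[str]:
    """Return a list of violations; an empty list means the check passes.

    Assumed rules: a median lower than the previous bin's is a regression,
    and a median above max_jump_ratio times the previous one is a jump.
    """
    problems = []
    for i in range(1, len(bin_medians)):
        prev, cur = bin_medians[i - 1], bin_medians[i]
        if cur < prev:
            problems.append(f"bin {i}: regression {prev} -> {cur}")
        elif prev > 0 and cur > max_jump_ratio * prev:
            problems.append(f"bin {i}: jump {prev} -> {cur}")
    return problems
```

Under this reading, AC-2 holds when `pass_fraction(visual_drifts, 100.0) >= 0.95` and AC-3 when `pass_fraction(imu_drifts, 50.0) >= 0.95`.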
AZ-411
- AC-1 (schema completeness):
  test_estimate_schema.py: test_valid_record_passes_schema, test_missing_field_caught, test_int_typed_field_rejected_when_wrong_type, test_bool_does_not_silently_satisfy_int, test_required_fields_table_is_what_the_spec_says ✓
- AC-2 (source-label set containment):
  test_each_allowed_label_passes[satellite_anchored|visual_propagated|dead_reckoned], test_unknown_label_rejected, test_non_string_label_rejected ✓
- AC-3 (WGS84 range):
  test_valid_wgs84_inside_range, test_lat_above_90_rejected, test_lon_below_minus_180_rejected, test_nan_rejected, test_decode_lat_lon_int32_round_trip, test_decode_lat_lon_int32_rejects_out_of_int32_range ✓
- AC-4 (parametrize):
  scenario test_ft_p_03_14_schema_wgs84.py collected as 12 variants (6 per test method) ✓
- Single-image push runtime:
  documented NOT COVERED at unit-test time; gated on the same upstream helpers as AZ-410.
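A minimal sketch of the AC-3 checks (WGS84 range plus int32 1e-7 decode) follows. The function names and signatures are assumptions chosen to mirror the test names above; the reviewed estimate_schema.py may differ.

```python
import math

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def decode_lat_lon_int32(lat_e7: int, lon_e7: int) -> tuple[float, float]:
    """Decode degrees scaled by 1e7 (int32 on the wire) into float degrees."""
    for name, raw in (("lat", lat_e7), ("lon", lon_e7)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(raw, bool) or not isinstance(raw, int):
            raise TypeError(f"{name} must be an int, got {type(raw).__name__}")
        if not INT32_MIN <= raw <= INT32_MAX:
            raise ValueError(f"{name}={raw} does not fit in int32")
    return lat_e7 / 1e7, lon_e7 / 1e7


def is_valid_wgs84(lat_deg: float, lon_deg: float) -> bool:
    """True when both values are finite and inside the WGS84 degree ranges."""
    if not (math.isfinite(lat_deg) and math.isfinite(lon_deg)):
        return False
    return -90.0 <= lat_deg <= 90.0 and -180.0 <= lon_deg <= 180.0
```

The explicit bool rejection reflects the concern behind test_bool_does_not_silently_satisfy_int: without it, `True` would pass an `isinstance(..., int)` check.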
No Spec-Gap findings.
Phase 3 — Code Quality
- SRP respected: each injector module owns one scenario; _common.py holds shared concerns (seeds, tile-cache reader, tmpfs root) so the per-injector modules stay narrow.
- Error handling: every injector raises FileNotFoundError with explicit "build the X first" guidance when an input is missing; multi_segment._plan_segments raises ValueError with a remediation hint on infeasible plans.
- Naming: dataclass names are CamelCase and function names are snake_case, per project convention.
- Complexity: longest function is outlier.build at ~70 lines (over the 50-line guideline target by the strict reading, but it's a linear pipeline). All other functions are short.
- Tests assert behaviour (window length, geodesic offset, schema field presence) rather than "no exception", so they are meaningful.
- Dead code: removed obsolete OutlierInjectionPlan.target_segment_seconds / n_outliers (AZ-406 scaffold fields); the contract test was updated to the new shape.
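To illustrate the "ValueError with a remediation hint" style noted under error handling, here is a hedged sketch of validating a multi-segment blackout plan against the AZ-408 AC-5 limits. The function `validate_segments`, its signature, and the reading of "gap" as the time between one window's end and the next window's start are assumptions, not the reviewed multi_segment._plan_segments.

```python
def validate_segments(
    segments: list[tuple[float, float]],
    flight_duration_s: float,
    min_segments: int = 3,
    min_gap_s: float = 30.0,
    max_coverage: float = 0.25,
) -> float:
    """Check (start_s, end_s) windows and return the covered fraction."""
    if len(segments) < min_segments:
        raise ValueError(
            f"need at least {min_segments} segments, got {len(segments)}; "
            "add more blackout windows"
        )
    ordered = sorted(segments)
    for start, end in ordered:
        if not 0 <= start < end <= flight_duration_s:
            raise ValueError(f"segment ({start}, {end}) is outside the flight")
    # A negative gap means the windows overlap, so this also enforces disjointness.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end
        if gap < min_gap_s:
            raise ValueError(
                f"gap of {gap:.1f} s is below {min_gap_s} s; "
                "move the windows apart or use a longer flight"
            )
    coverage = sum(end - start for start, end in ordered) / flight_duration_s
    if coverage > max_coverage:
        raise ValueError(
            f"coverage {coverage:.0%} exceeds {max_coverage:.0%}; "
            "shorten the blackout windows"
        )
    return coverage
```

Each message names the violated limit and a concrete fix, which is what makes such errors actionable when a plan is infeasible.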
Phase 4 — Security
No SQL, no subprocess(shell=True), no credentials, no deserialization. The CLI declares --seed as an int and filesystem arguments as Path, so input is validated by argparse plus downstream type checks.
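As a sketch of the typed-argument pattern this phase relies on, the parser below converts `--seed` to `int` and a path argument to `pathlib.Path` at parse time. Only `--seed` is named in this report; the `--out-root` flag and both function names are hypothetical.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose arguments are converted and validated by argparse."""
    parser = argparse.ArgumentParser(description="Injector CLI sketch")
    # type=int makes argparse reject non-integer seeds before any code runs.
    parser.add_argument("--seed", type=int, required=True)
    # --out-root is a hypothetical flag; type=Path yields a pathlib.Path object.
    parser.add_argument("--out-root", type=Path, required=True)
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse an explicit argv list (easier to unit-test than sys.argv)."""
    return build_parser().parse_args(argv)
```

A malformed seed such as `--seed abc` makes argparse print a usage error and exit with status 2, so the injector body never sees untyped input.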
Phase 5 — Performance
- Injector tests build PIL JPEG frames, which is slow but a pre-existing pattern (batch 67/68 fixture tests have the same characteristic): 165 s for 83 fixture tests continues the pattern seen in batch 68 (12 s for 26 fixture-only tests). Acceptable in unit-test context.
- anchor_pair_detector is O(N) over the FDR stream; bin computation is O(N + bins).
- estimate_schema validators are O(1) per record; aggregate is O(N).
Phase 6 — Cross-Task Consistency
- AZ-408's _common.derive_rng is consumed by both outlier and blackout_spoof: shared seed discipline.
- AZ-410's anchor_pair_detector uses runner.helpers.geo.distance_m (pyproj WGS84), consistent with the project's existing distance helper.
- AZ-411's estimate_schema does not overlap with anchor_pair_detector (different concerns: schema/transport vs trajectory analysis).
- All three new helper modules under runner/helpers/ are independent: no inter-module imports between AZ-410 and AZ-411 deliverables. Tests cover the helpers independently.
- Scenario files (test_ft_p_02_*, test_ft_p_03_14_*) share the same _harness_helpers_implemented pattern (probe NotImplementedError on upstream helpers; skip with clear reason). Consistent style.
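The shared gating pattern (probe the upstream helpers, skip with a clear reason on NotImplementedError) can be sketched as follows. The function name, its return shape, and the stub helper are assumptions for illustration; the actual _harness_helpers_implemented may be structured differently.

```python
from typing import Callable, Iterable


def harness_helpers_implemented(
    probes: Iterable[Callable[[], object]],
) -> tuple[bool, str]:
    """Call each upstream helper probe; report the first unimplemented one.

    Returns (True, "") when every probe runs, otherwise (False, reason).
    Only NotImplementedError is treated as "not landed yet"; any other
    exception propagates so real bugs are not hidden behind a skip.
    """
    for probe in probes:
        try:
            probe()
        except NotImplementedError as exc:
            name = getattr(probe, "__name__", repr(probe))
            return False, f"upstream helper not implemented: {name} ({exc})"
    return True, ""


def frame_source_replay_stub() -> None:
    """Hypothetical stand-in for a helper that has not landed yet."""
    raise NotImplementedError("frame_source_replay pending")
```

In a pytest scenario the returned reason string would typically be passed to `pytest.skip(reason)`, so the skip report names the missing helper.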
Phase 7 — Architecture Compliance
- Layer direction: every new file is under e2e/**; no imports of gps_denied_onboard.*, verified by the test_no_sut_imports.py invariant (passes). The blackbox_tests cross-cutting entry in module-layout.md sits outside the production layering table; this batch respects its envelope.
- Public API respect: _common.py is a private module (leading underscore) consumed only by the three injectors; cross-injector consumption goes through documented public names (derive_rng, cleanup_tmpfs, tmpfs_root, read_tile_manifest, haversine_m, far_away_indices).
- No new cyclic dependencies: the import graph is linear. outlier / blackout_spoof / multi_segment → _common; fc_proxy is standalone; injector_fixtures → injectors; scenario files → runner.helpers.{anchor_pair_detector,estimate_schema} only.
- Duplicate symbols: _common.haversine_m is a deliberate duplicate of the project's geo.distance_m (Vincenty); the docstring explains the reason: injectors run in minimal Docker images without pyproj, while the runner image always has pyproj. Acceptable.
- Cross-cutting concerns: pytest plugin registration (injector_fixtures added to pytest_plugins) follows the existing pattern from csv_reporter / evidence_bundler / nfr_recorder.
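For context on the duplicate-symbol note, a dependency-free haversine distance and a distance filter can be sketched with the standard library alone. The names match those listed above, but the signatures, the mean-radius constant, and the 350 m default (taken from AZ-408 AC-2) are assumptions about the shape of the real helpers.

```python
import math

# Mean Earth radius in metres; a spherical approximation, so results differ
# slightly from an ellipsoidal WGS84 geodesic such as one computed via pyproj.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding pushing sqrt(a) just above 1.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def far_away_indices(
    origin: tuple[float, float],
    tiles: list[tuple[float, float]],
    min_offset_m: float = 350.0,
) -> list[int]:
    """Indices of tiles at least min_offset_m away from the origin."""
    return [
        i
        for i, (lat, lon) in enumerate(tiles)
        if haversine_m(origin[0], origin[1], lat, lon) >= min_offset_m
    ]
```

At offsets of a few hundred metres the gap between the spherical and ellipsoidal results is small compared with the 350 m threshold, which is consistent with the report treating the duplication as acceptable.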
No Architecture findings.
Baseline delta: _docs/02_document/architecture_compliance_baseline.md does not exist for this project — baseline delta section omitted.
AC Test Coverage Summary
| Task | ACs Covered | Test File(s) | Notes |
|---|---|---|---|
| AZ-408 | 1, 2, 3, 4, 5, 6 | test_outlier.py, test_blackout_spoof.py, test_multi_segment.py, test_fc_proxy.py, test_injectors_contract.py | 60 new unit tests; all pass |
| AZ-410 | 1, 2, 3, 4, 5 (collection) | test_anchor_pair_detector.py | 15 new unit tests; runtime AC-1.3 hardware-loop NOT COVERED (docker harness leftover) |
| AZ-411 | 1, 2, 3, 4 (collection) | test_estimate_schema.py | 18 new unit tests; runtime single-image push NOT COVERED (docker harness leftover) |
Code Review Verdict: PASS
No Critical, High, Medium, or Low findings. Implementation matches the
three task specs' AC sets at the unit-test layer; runtime end-to-end
paths for AZ-410 / AZ-411 are correctly gated and documented as
hardware-loop ACs pending the upstream frame_source_replay /
fdr_reader / imu_replay / sitl_observer helpers landing.
Auto-Fix Attempts: 0
No code-review failures — auto-fix gate not entered.
Stuck Agents: 0
None.