# Autopilot State

## Current Step
flow: existing-code
step: 7
name: Refactor
status: completed
sub_step: done
retry_count: 0

## Completed Steps

| Step | Name | Completed | Key Outcome |
|------|------|-----------|-------------|
| 1 | Document | 2026-03-21 | 10 modules, 4 components, full _docs/ generated from existing codebase |
| 2 | Test Spec | 2026-03-21 | 39 test scenarios (16 positive, 8 negative, 11 non-functional), 85% total coverage, 5 artifacts produced |
| 3 | Code Testability Rev. | 2026-03-29 | Engine factory refactoring completed: polymorphic EngineClass pattern (TensorRT/CoreML/ONNX) with auto-detection. Hardcoded values aligned with Docker Compose. |
| 4 | Decompose Tests | 2026-03-23 | 11 tasks (AZ-138..AZ-148), 35 complexity points, 3 batches. Phase 3 test data gate PASSED: 39/39 scenarios validated, 12 data files provided. |
| 5 | Implement Tests | 2026-03-23 | 11 tasks implemented across 4 batches, 38 tests (2 skipped), all code reviews PASS_WITH_WARNINGS. Commits: 5418bd7, a469579, 861d4f0, f0e3737. |
| 6 | Run Tests | 2026-03-30 | 23 passed, 0 failed, 0 skipped, 0 errors in 11.93s. Fixed: Cython `__reduce_cython__` (clean rebuild), missing Pillow dep, relative MEDIA_DIR paths. Removed 14 dead/unreachable tests. Updated test-run skill to treat skips as blocking gate. |
| 7 | Refactor | 2026-03-31 | Engine-centric dynamic batch refactoring. Moved source to src/. Redesigned engine pipeline: preprocess/postprocess/process_frames now live in base InferenceEngine, with dynamic batching per engine (CoreML=1, TensorRT=GPU-calculated, ONNX=config). Fixed: video partial batch flush, image accumulation data loss, frame-is-None crash. Removed detect_single_image (POST /detect delegates to run_detect). Removed dead code: msgpack, serialize methods, unused constants/fields. Made classes.json path, log paths, and HTTP timeouts configurable. 28 e2e tests pass. |

## Key Decisions
- User chose to document existing codebase before proceeding
- Component breakdown: 4 components (Domain, Inference Engines, Inference Pipeline, API)
- Verification: 4 legacy issues found and documented (unused serialize/from_msgpack, orphaned queue declarations)
- Input data coverage approved at ~90% (Phase 1a)
- Test coverage approved at 85% (21/22 AC, 13/18 restrictions) with all gaps justified
- User chose refactor path (decompose tests → implement tests → refactor)
- Integration Tests Epic: AZ-137
- Test Infrastructure: AZ-138 (5 pts)
- 10 integration test tasks decomposed: AZ-139 through AZ-148 (30 pts)
- Total: 11 tasks, 35 complexity points, 3 batches
- Phase 3 (Test Data Validation Gate) PASSED: 39/39 scenarios have data, 85% coverage, 0 tests removed
- Test data: 6 images, 3 videos, 1 ONNX model, 1 classes.json provided by user
- User confirmed dependency table and test data gate
- Jira MCP auth skipped — tickets not transitioned to In Testing
- Test run: removed 14 dead/unreachable tests (explicit @skip + runtime always-skip), added .c to .gitignore
- User chose to refactor (option A) — clean up legacy dead code
- User requested: move code to src/, thorough re-analysis, exhaustive refactoring list
- Refactoring round: 01-code-cleanup, automatic mode, 15 changes identified
- User feedback: analyze logical flow contradictions, not just static code. Updated refactor skill Phase 1 with logical flow analysis.
- User chose: split scope — engine refactoring as Step 7, architecture shift (streaming, DB config, media storage, Jetson) as Step 8
- User chose: remove detect_single_image, POST /detect delegates to run_detect
- GPU memory fraction: 80% for inference, 20% buffer (Jetson 40% deferred to Step 8)
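The GPU memory split and the partial batch flush fix noted above can be illustrated with a minimal sketch. It is not the project's implementation: the names `max_batch_size`, `iter_batches`, `free_bytes`, and `bytes_per_frame` are hypothetical, and the per-frame memory cost is assumed to be known up front. Only the 80% inference share comes from the decision recorded here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# 80% of GPU memory for inference, 20% kept as buffer (per the decision above).
INFERENCE_MEMORY_PERCENT = 80


def max_batch_size(free_bytes: int, bytes_per_frame: int,
                   percent: int = INFERENCE_MEMORY_PERCENT) -> int:
    """Largest batch that fits in the usable share of memory, never below 1.

    Hypothetical helper: integer math avoids floating-point rounding.
    """
    if bytes_per_frame <= 0:
        raise ValueError("bytes_per_frame must be positive")
    usable = free_bytes * percent // 100
    return max(1, usable // bytes_per_frame)


def iter_batches(frames: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield full batches, then flush the trailing partial batch.

    The last slice may be shorter than batch_size; it is still yielded,
    so no frames are dropped at the end of a video.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(frames), batch_size):
        yield list(frames[start:start + batch_size])
```

For example, `list(iter_batches(list(range(7)), 3))` produces two full batches and one final batch holding the single remaining frame.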

## Last Session
date: 2026-03-31
ended_at: Step 7 complete — all 11 todos done, 28 e2e tests pass
reason: Refactoring complete
notes: Engine-centric dynamic batch refactoring implemented. Source moved to src/. InferenceEngine base class now owns preprocess/postprocess/process_frames with per-engine max_batch_size. CoreML overrides preprocess (direct PIL, no blob reversal) and postprocess. TensorRT calculates max_batch_size from GPU memory (80% fraction) with optimization profiles for dynamic batch. All logical flow bugs fixed (LF-01 through LF-09). Dead code removed (msgpack, serialize, unused constants). POST /detect unified through run_detect. Next: Step 8 (architecture shift — streaming media, DB-backed config, media storage, Jetson support).

## Blockers
- none