Fixed dynamic ONNX input

Fix dynamic ONNX input
Update docs with correct file name for tests
This commit is contained in:
Roman Meshko
2026-04-19 20:55:51 +03:00
committed by GitHub
parent e90ec69131
commit 7d897df380
10 changed files with 230 additions and 14 deletions
+187
View File
@@ -0,0 +1,187 @@
# Codex Context Bridge
This file is a compact compatibility layer for Codex. It explains how the repository uses `.cursor/`, where project memory lives, and what should be read first in a new chat.
## First Read Order
When starting a new Codex session in this repository, read in this order:
1. `AGENTS.md`
2. `.cursor/CODEX_CONTEXT.md`
3. `_docs/_autopilot_state.md`
4. The skill file relevant to the user's request under `.cursor/skills/*/SKILL.md`
5. Only the `_docs/` artifacts and `.cursor/rules/*.mdc` files relevant to that request
Do not bulk-read all of `_docs/` or all skill files unless the task truly needs it.
## Mental Model
- `.cursor/` is the workflow engine, policy layer, and skill library
- `_docs/` is the persisted working memory for the project
- `src/`, `tests/`, `e2e/`, and related runtime files are the implementation layer
For Codex, the important distinction is:
- `.cursor/` tells you **how the team wants work to happen**
- `_docs/` tells you **what has already been decided or completed**
## Project Snapshot
- Product: `Azaion.Detections`
- Type: Python/Cython microservice for aerial object detection
- API: FastAPI + SSE
- Engines: TensorRT on compatible NVIDIA GPUs, ONNX Runtime fallback
- Main code areas: `src/`, `tests/`, `e2e/`, `scripts/`
- Workflow memory: `_docs/`
Relevant documented architecture:
- 4 components: Domain, Inference Engines, Inference Pipeline, API
- 10 documented modules under `_docs/02_document/modules/`
- External services: Loader service, Annotations service
## Current Workflow State
As of `2026-04-15`, the persisted workflow state says:
- Flow: `existing-code`
- Current step: `2`
- Current step name: `Test Spec`
- Current status: `in_progress`
- Current sub-step: `Phase 3 - Test Data Validation Gate`
Important rollback note from `_docs/_autopilot_state.md`:
- On `2026-04-10`, the workflow was rolled back from Step 8 (`New Task`) to Step 2 (`Test Spec`)
- Reason: expected-result artifacts were incomplete for verification
Concrete blocker confirmed from current files:
- `_docs/00_problem/input_data/expected_results/results_report.md` still contains `?` for most expected detection counts
- Per-file expected-result CSVs for non-empty datasets are header-only
- That means black-box tests cannot verify detection correctness yet
Practical unblocker:
1. Populate the expected-result CSVs for the non-empty image/video fixtures
2. Replace `?` counts in `results_report.md` with real values
3. Re-run or continue the `test-spec` workflow from Phase 3
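The header-only check described in the blocker above can be automated with a short script. This is a hedged sketch: the `find_blockers` name is hypothetical, the `*_expected.csv` naming follows the fixture tables in this commit, and the directory argument is whatever holds the expected-result CSVs (e.g. under `_docs/00_problem/input_data/expected_results/`).

```python
import csv
from pathlib import Path

def find_blockers(expected_dir):
    """Return expected-result CSVs that are still header-only.

    A header-only file has at most one row, meaning no expected
    detections were ever recorded for that fixture.
    """
    blockers = []
    for csv_path in sorted(Path(expected_dir).glob("*_expected.csv")):
        with open(csv_path, newline="") as f:
            rows = list(csv.reader(f))
        if len(rows) <= 1:  # header row only (or empty file)
            blockers.append(csv_path.name)
    return blockers
```

Any filename the function returns still needs real expected-detection rows before the `test-spec` workflow can resume from Phase 3.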
## Cursor Asset Map
### Core entry points
- `.cursor/README.md`: high-level overview of the whole Cursor system
- `.cursor/skills/autopilot/SKILL.md`: orchestrator contract
- `.cursor/skills/autopilot/flows/existing-code.md`: active flow for this repository
- `.cursor/skills/autopilot/protocols.md`: decision, retry, and re-entry rules
- `.cursor/agents/implementer.md`: only defined subagent
### Rules
Always-check rules:
- `.cursor/rules/meta-rule.mdc`
- `.cursor/rules/techstackrule.mdc`
- `.cursor/rules/git-workflow.mdc`
- `.cursor/rules/quality-gates.mdc`
Highly relevant contextual rules for this repo:
- `.cursor/rules/python.mdc`
- `.cursor/rules/testing.mdc`
- `.cursor/rules/docker.mdc`
- `.cursor/rules/cursor-meta.mdc`
Other rules exist for security, trackers, OpenAPI, React, Rust, SQL, and .NET. Read them only if the task touches those domains.
## Skills Index
Use this table as the fast map instead of opening every skill up front.
| Skill | Primary use | Main outputs |
|------|-------------|--------------|
| `autopilot` | Continue the end-to-end workflow | state progression in `_docs/_autopilot_state.md` |
| `problem` | Gather or refine the problem definition | `_docs/00_problem/` |
| `research` | Investigate solutions or unknowns | `_docs/01_solution/` or standalone research folder |
| `plan` | Architecture, components, risks, tests, epics | `_docs/02_document/` |
| `test-spec` | Black-box test specifications and test runners | `_docs/02_document/tests/`, `scripts/run-tests.sh`, `scripts/run-performance-tests.sh` |
| `decompose` | Break plan or tests into atomic tasks | `_docs/02_tasks/` |
| `implement` | Batch orchestration of coding tasks | `_docs/03_implementation/` plus code changes |
| `test-run` | Execute and diagnose test suites | test results and pass/fail guidance |
| `code-review` | Review implemented batches against specs | review report and verdict |
| `new-task` | Plan new functionality for existing code | `_docs/02_tasks/todo/` and optional `_docs/02_task_plans/` |
| `refactor` | Structured refactoring with safety checks | `_docs/04_refactoring/` |
| `security` | Security audit and OWASP-style review | `_docs/05_security/` |
| `document` | Reverse-engineer or update docs from code | `_docs/02_document/` and related problem/solution docs |
| `deploy` | Containerization, CI/CD, observability | `_docs/04_deploy/` |
| `retrospective` | Review implementation metrics and trends | `_docs/06_metrics/` |
| `ui-design` | UI mockups and design system artifacts | `_docs/02_document/ui_mockups/` |
## Agents
Defined agent:
- `implementer`
- File: `.cursor/agents/implementer.md`
- Role: implement one task spec with tests and AC verification
- Invoked by: `implement` skill
No other `.cursor/agents/` definitions are currently present.
## Codex Operating Notes
### When the user asks for Cursor-style continuation
If the user says things like:
- "continue autopilot"
- "what's next"
- "continue workflow"
- "/autopilot"
then:
1. Read `_docs/_autopilot_state.md`
2. Read `.cursor/skills/autopilot/SKILL.md`
3. Read `.cursor/skills/autopilot/protocols.md`
4. Read the active flow file
5. Read only the specific downstream skill file needed for the current step
### When the user asks for direct coding help
You do not need to force the full Cursor workflow. Work directly in the codebase, but still:
- respect `.cursor/rules/*.mdc`
- use `_docs/` as authoritative project memory
- preserve alignment with existing task specs and documented architecture when relevant
### Context discipline
- Prefer progressive loading over reading everything
- Treat disk artifacts as the source of truth, not prior chat history
- Cross-check state file claims against actual files when something seems inconsistent
## Most Relevant Files For This Repo
- `AGENTS.md`
- `.cursor/CODEX_CONTEXT.md`
- `.cursor/README.md`
- `.cursor/skills/autopilot/SKILL.md`
- `.cursor/skills/autopilot/flows/existing-code.md`
- `.cursor/skills/test-spec/SKILL.md`
- `.cursor/agents/implementer.md`
- `_docs/_autopilot_state.md`
- `_docs/00_problem/`
- `_docs/01_solution/solution.md`
- `_docs/02_document/`
- `_docs/02_tasks/`
## Short Version
If you only have a minute:
- This repo uses Cursor as a workflow framework and `_docs/` as persistent memory
- The project is already documented and mid-workflow
- The current workflow is blocked in `test-spec` because expected-result data is incomplete
- For future Codex chats, start with `AGENTS.md`, this file, and `_docs/_autopilot_state.md`
+3
View File
@@ -80,3 +80,6 @@ data/
# Runtime logs
Logs/
+#IDEA
+.idea/
@@ -49,9 +49,9 @@ For videos, the additional field:
| # | Input File | Description | Expected Result File | Notes |
|---|------------|-------------|---------------------|-------|
-| 7 | `video_short01.mp4` | Standard test video | `video_short01_expected.csv` | Primary async/SSE/video test. List key-frame detections. |
-| 8 | `video_short02.mp4` | Video variant | `video_short02_expected.csv` | Used for resilience and concurrent tests |
-| 9 | `video_long03.mp4` | Long video (288MB), generates >100 SSE events | `video_long03_expected.csv` | SSE overflow test. Only key-frame samples needed. |
+| 7 | `video_test01.mp4` | Standard test video | `video_test01_expected.csv` | Primary async/SSE/video test. List key-frame detections. |
+| 8 | `video_1.mp4` | Video variant | `video_1_expected.csv` | Secondary local fixture for resilience and concurrent-style validation. |
+| 9 | `video_1_faststart.mp4` | Faststart video variant | `video_1_faststart_expected.csv` | Streaming compatibility variant. Separate long-video overflow fixture is not currently present in local fixtures. |
## How to Fill
+6 -6
View File
@@ -12,9 +12,9 @@
| image-dense-02 | `input_data/image_dense02.jpg` | JPEG 1920×1080 — dense scene variant, borderline tiling | FT-P-06 (variant) | Volume mount to consumer `/media/` | N/A (read-only) |
| image-different-types | `input_data/image_different_types.jpg` | JPEG 900×1600 — varied object classes for class variant tests | FT-P-13 | Volume mount to consumer `/media/` | N/A (read-only) |
| image-empty-scene | `input_data/image_empty_scene.jpg` | JPEG 1920×1080 — clean scene with no detectable objects | Edge case (zero detections) | Volume mount to consumer `/media/` | N/A (read-only) |
-| video-short-01 | `input_data/video_short01.mp4` | MP4 video — standard async/SSE/video detection tests | FT-P-08..12, FT-N-04, 07, NFT-PERF-04, NFT-RES-02, NFT-SEC-03 | Volume mount to consumer `/media/` | N/A (read-only) |
-| video-short-02 | `input_data/video_short02.mp4` | MP4 video — variant for concurrent and resilience tests | NFT-RES-02 (variant), NFT-RES-04 | Volume mount to consumer `/media/` | N/A (read-only) |
-| video-long-03 | `input_data/video_long03.mp4` | MP4 long video (288MB) — generates >100 SSE events for overflow tests | FT-N-08, NFT-RES-LIM-02 | Volume mount to consumer `/media/` | N/A (read-only) |
+| video-test-01 | `input_data/video_test01.mp4` | MP4 video — standard async/SSE/video detection tests | FT-P-08..12, FT-N-04, 07, NFT-PERF-04, NFT-RES-02, NFT-SEC-03 | Volume mount to consumer `/media/` | N/A (read-only) |
+| video-1 | `input_data/video_1.mp4` | MP4 video — local variant for concurrent and resilience-style tests | NFT-RES-02 (variant), NFT-RES-04 | Volume mount to consumer `/media/` | N/A (read-only) |
+| video-1-faststart | `input_data/video_1_faststart.mp4` | MP4 video — faststart/local streaming variant | Streaming compatibility checks | Volume mount to consumer `/media/` | N/A (read-only) |
| empty-image | Generated at build time | Zero-byte file | FT-N-01 | Generated in e2e/fixtures/ | N/A |
| corrupt-image | Generated at build time | Random binary garbage (not valid image format) | FT-N-02 | Generated in e2e/fixtures/ | N/A |
| jwt-token | Generated at runtime | Valid JWT with exp claim (not signature-verified by detections) | FT-P-08, 09, FT-N-04, 07, NFT-SEC-03 | Generated by consumer at runtime | N/A |
@@ -35,9 +35,9 @@ Each test run starts with fresh containers (`docker compose down -v && docker co
| image_dense02.jpg | `_docs/00_problem/input_data/image_dense02.jpg` | Dense scene 1920×1080 | Dedup variant |
| image_different_types.jpg | `_docs/00_problem/input_data/image_different_types.jpg` | Varied classes 900×1600 | Class variant tests |
| image_empty_scene.jpg | `_docs/00_problem/input_data/image_empty_scene.jpg` | Empty scene 1920×1080 | Zero-detection edge case |
-| video_short01.mp4 | `_docs/00_problem/input_data/video_short01.mp4` | Standard video | Async, SSE, video, perf tests |
-| video_short02.mp4 | `_docs/00_problem/input_data/video_short02.mp4` | Video variant | Resilience, concurrent tests |
-| video_long03.mp4 | `_docs/00_problem/input_data/video_long03.mp4` | Long video (288MB) | SSE overflow, queue depth tests |
+| video_test01.mp4 | `_docs/00_problem/input_data/video_test01.mp4` | Standard video | Async, SSE, video, perf tests |
+| video_1.mp4 | `_docs/00_problem/input_data/video_1.mp4` | Video variant | Resilience, concurrent tests |
+| video_1_faststart.mp4 | `_docs/00_problem/input_data/video_1_faststart.mp4` | Faststart video variant | Streaming compatibility checks |
| classes.json | repo root `classes.json` | 19 detection classes | All tests |
## External Dependency Mocks
+1 -3
View File
@@ -1,3 +1,4 @@
cimport constants_inf
from loader_http_client cimport LoaderHttpClient, LoadResult
@@ -42,7 +43,6 @@ class EngineFactory:
    def build_and_cache(self, bytes source_bytes, LoaderHttpClient loader_client, str models_dir):
        cdef LoadResult res
-        import constants_inf
        engine_bytes, engine_filename = self.build_from_source(source_bytes, loader_client, models_dir)
        res = loader_client.upload_big_small_resource(engine_bytes, engine_filename, models_dir)
        if res.err is not None:
@@ -56,7 +56,6 @@ class OnnxEngineFactory(EngineFactory):
        return OnnxEngine(model_bytes)
    def get_source_filename(self):
-        import constants_inf
        return constants_inf.AI_ONNX_MODEL_FILE
@@ -81,7 +80,6 @@ class TensorRTEngineFactory(EngineFactory):
        return TensorRTEngine.get_engine_filename()
    def get_source_filename(self):
-        import constants_inf
        return constants_inf.AI_ONNX_MODEL_FILE
    def build_from_source(self, onnx_bytes, loader_client, models_dir):
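Stripped of Cython typing, the `build_and_cache` flow in the hunk above looks roughly like this in plain Python. The stub classes are illustrative stand-ins mirroring names from the diff (`LoadResult`, `upload_big_small_resource`), not the project's real `LoaderHttpClient` or engine factories.

```python
# Plain-Python sketch of the build_and_cache flow shown in the diff above.
# All classes here are stand-ins for illustration, not the project's API.
class LoadResult:
    def __init__(self, err=None):
        self.err = err

class StubLoaderClient:
    def upload_big_small_resource(self, engine_bytes, filename, models_dir):
        # Pretend the upload to the Loader service succeeded.
        return LoadResult(err=None)

class StubEngineFactory:
    def build_from_source(self, source_bytes, loader_client, models_dir):
        # A real factory would compile the ONNX bytes into an engine here.
        return source_bytes, "model.engine"

    def build_and_cache(self, source_bytes, loader_client, models_dir):
        # Build the engine, then persist it via the loader client,
        # surfacing any upload error, as in the hunk above.
        engine_bytes, engine_filename = self.build_from_source(
            source_bytes, loader_client, models_dir)
        res = loader_client.upload_big_small_resource(
            engine_bytes, engine_filename, models_dir)
        if res.err is not None:
            raise RuntimeError(res.err)
        return engine_bytes, engine_filename
```

The commit's actual change to this file is small: the function-local `import constants_inf` statements are dropped in favor of the module-level `cimport`.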
+2
View File
@@ -8,6 +8,8 @@ cdef class OnnxEngine(InferenceEngine):
    cdef object model_inputs
    cdef str input_name
    cdef object input_shape
+    cdef object _resolved_input_hw
+    cdef tuple _resolve_input_hw(self, object metadata)
    cdef tuple get_input_shape(self)
    cdef run(self, input_data)
+28 -2
View File
@@ -2,6 +2,7 @@ from engines.inference_engine cimport InferenceEngine
import onnxruntime as onnx
cimport constants_inf
+import ast
import os
def _select_providers():
@@ -29,15 +30,40 @@ cdef class OnnxEngine(InferenceEngine):
        model_meta = self.session.get_modelmeta()
        constants_inf.log(f"Metadata: {model_meta.custom_metadata_map}")
+        self._resolved_input_hw = self._resolve_input_hw(model_meta.custom_metadata_map)
+        self._cpu_session = None
+        if any("CoreML" in p for p in self.session.get_providers()):
+            constants_inf.log(<str>'CoreML active — creating CPU fallback session')
+            self._cpu_session = onnx.InferenceSession(
+                model_bytes, providers=["CPUExecutionProvider"])
+    cdef tuple _resolve_input_hw(self, object metadata):
+        cdef object h = self.input_shape[2] if len(self.input_shape) > 2 else None
+        cdef object w = self.input_shape[3] if len(self.input_shape) > 3 else None
+        cdef int resolved_h
+        cdef int resolved_w
+        if isinstance(h, int) and h > 0 and isinstance(w, int) and w > 0:
+            return <tuple>(h, w)
+        try:
+            imgsz = metadata.get("imgsz") if metadata is not None else None
+            if imgsz:
+                parsed = ast.literal_eval(imgsz)
+                if isinstance(parsed, (list, tuple)) and len(parsed) == 2:
+                    resolved_h = int(parsed[0])
+                    resolved_w = int(parsed[1])
+                    if resolved_h > 0 and resolved_w > 0:
+                        return <tuple>(resolved_h, resolved_w)
+        except Exception:
+            pass
+        # Dynamic ONNX models are expected to use the project's canonical 1280x1280 input.
+        return <tuple>(1280, 1280)
    cdef tuple get_input_shape(self):
-        shape = self.input_shape
-        return <tuple>(shape[2], shape[3])
+        return self._resolved_input_hw
    cdef run(self, input_data):
        try:
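The new resolution logic can be exercised as plain Python. This is a minimal sketch with the Cython typing removed; the `metadata` argument stands in for ONNX Runtime's `custom_metadata_map` dict, and the precedence mirrors the diff: static dims first, then the exporter's `imgsz` metadata, then the 1280x1280 default.

```python
import ast

def resolve_input_hw(input_shape, metadata):
    """Resolve (height, width) for an ONNX model input.

    Static dims win; dynamic dims fall back to `imgsz` metadata,
    then to the project's canonical 1280x1280 input.
    """
    h = input_shape[2] if len(input_shape) > 2 else None
    w = input_shape[3] if len(input_shape) > 3 else None
    # Static shape: dynamic axes come back as strings/None, not ints.
    if isinstance(h, int) and h > 0 and isinstance(w, int) and w > 0:
        return (h, w)
    # Dynamic shape: try the exporter-provided `imgsz` metadata string.
    try:
        imgsz = metadata.get("imgsz") if metadata is not None else None
        if imgsz:
            parsed = ast.literal_eval(imgsz)
            if isinstance(parsed, (list, tuple)) and len(parsed) == 2:
                resolved_h, resolved_w = int(parsed[0]), int(parsed[1])
                if resolved_h > 0 and resolved_w > 0:
                    return (resolved_h, resolved_w)
    except Exception:
        pass
    # Canonical default for dynamic models without usable metadata.
    return (1280, 1280)
```

For example, a dynamic model exported with `imgsz: "[736, 1280]"` resolves to `(736, 1280)`, while one with no usable metadata falls back to `(1280, 1280)`.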