Refactor task management structure and update documentation

- Changed the directory structure for task specifications to include a dedicated `todo/` folder within `_docs/02_tasks/` for tasks ready for implementation.
- Updated references in various skills and documentation to reflect the new task lifecycle, including changes in the `implementer` and `decompose` skills.
- Enhanced the README and flow documentation to clarify the new task organization and its implications for the implementation process.

These updates improve task management clarity and streamline the implementation workflow.
Oleksandr Bezdieniezhnykh
2026-03-28 01:17:45 +02:00
parent 8c665bd0a4
commit cbf370c765
35 changed files with 1348 additions and 58 deletions
@@ -0,0 +1,99 @@
# Resource Limit Tests
**Task**: AZ-148_test_resource_limits
**Name**: Resource Limit Tests
**Description**: Implement E2E tests verifying ThreadPoolExecutor worker limit, SSE queue depth cap, max detections per frame, SSE overflow handling, and log file rotation
**Complexity**: 3 points
**Dependencies**: AZ-138_test_infrastructure, AZ-142_test_async_sse
**Component**: Integration Tests
**Jira**: AZ-148
**Epic**: AZ-137
## Problem
The system enforces several resource limits: 2 concurrent inference workers, 100-event SSE queue depth, 300 max detections per frame, and daily log rotation. Tests must verify these limits are enforced correctly and that overflow conditions are handled gracefully.
## Outcome
- ThreadPoolExecutor limited to 2 concurrent inference operations
- SSE queue capped at 100 events per client, overflow silently dropped
- No response contains more than 300 detections per frame
- Log files use date-based naming with daily rotation
- SSE overflow does not crash the service or the detection pipeline
## Scope
### Included
- FT-N-08: SSE queue overflow is silently dropped
- NFT-RES-LIM-01: ThreadPoolExecutor worker limit (2 concurrent)
- NFT-RES-LIM-02: SSE queue depth limit (100 events)
- NFT-RES-LIM-03: Max 300 detections per frame
- NFT-RES-LIM-04: Log file rotation and retention
### Excluded
- Memory limits (OS-level, not application-enforced)
- Disk space limits
- Network bandwidth throttling
## Acceptance Criteria
**AC-1: Worker limit**
Given an initialized engine
When 4 concurrent POST /detect requests are sent
Then the first 2 requests complete at roughly the same time and the remaining 2 complete afterwards (2-at-a-time processing)
And all 4 requests eventually succeed
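One way to check the 2-at-a-time pattern is to group completion timestamps into "waves". The sketch below is illustrative only: `fake_detect` is a stand-in for a real blocking POST /detect call, the local pool merely models the service-side limit, and `group_into_waves`, the 0.3 s duration, and the 0.15 s gap are hypothetical choices, not part of the existing test suite.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def group_into_waves(completion_times, gap):
    """Group completion times into waves separated by more than `gap` seconds."""
    waves = []
    for t in sorted(completion_times):
        if waves and t - waves[-1][-1] <= gap:
            waves[-1].append(t)
        else:
            waves.append([t])
    return waves


def fake_detect(duration):
    # Stand-in for a blocking POST /detect call (assumption for this sketch).
    time.sleep(duration)
    return time.monotonic()


def run_concurrent(n_requests, max_workers, duration):
    # A pool with `max_workers` threads models the 2-worker inference limit.
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fake_detect, duration) for _ in range(n_requests)]
        return [f.result() - start for f in futures]


times = run_concurrent(n_requests=4, max_workers=2, duration=0.3)
waves = group_into_waves(times, gap=0.15)
```

Against the real service, `times` would come from the measured response arrivals of the 4 requests, and the gap threshold would need tuning to the observed inference latency.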
**AC-2: SSE queue depth**
Given an SSE client connected but not reading (stalled)
When async detection produces > 100 events
Then the stalled client receives <= 100 events when it resumes reading
And no OOM or connection errors occur
**AC-3: SSE overflow handling**
Given an SSE client that pauses reading
When async detection generates many events
Then detection completes normally (no error from overflow)
And stalled client receives at most 100 buffered events
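The behavior AC-2 and AC-3 describe can be modeled with a bounded queue whose producer never blocks. This is a sketch under assumptions: the service's actual queue type is not stated in this task, and dropping the newest event on overflow (rather than the oldest) is a choice made here for illustration. `publish` and `simulate_stalled_client` are hypothetical names.

```python
import queue

MAX_QUEUE_DEPTH = 100  # per-client cap stated in this task


def publish(client_queue, event):
    """Enqueue an event; drop it silently if the client's queue is full."""
    try:
        client_queue.put_nowait(event)
        return True
    except queue.Full:
        return False  # overflow: event is dropped, the producer keeps going


def simulate_stalled_client(total_events):
    """Produce events while the client is stalled, then drain the buffer."""
    client_queue = queue.Queue(maxsize=MAX_QUEUE_DEPTH)
    dropped = 0
    for i in range(total_events):
        if not publish(client_queue, {"id": i}):
            dropped += 1
    # The client resumes and reads whatever was buffered during the stall.
    received = []
    while not client_queue.empty():
        received.append(client_queue.get_nowait())
    return received, dropped


received, dropped = simulate_stalled_client(250)
```

The two assertions a test would make map directly onto this model: the producer loop finishes without raising (AC-3), and the drained buffer never exceeds 100 events (AC-2).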
**AC-4: Max detections per frame**
Given an initialized engine and a dense scene image
When POST /detect is called
Then response contains at most 300 detections
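A small helper can separate "the cap was not exceeded" from "the cap was actually reached", which matters when the fixture image yields fewer than 300 detections. The sketch assumes the response body can be reduced to a list of detection objects; that shape and the name `check_detection_cap` are assumptions, not the documented API.

```python
MAX_DETECTIONS_PER_FRAME = 300  # cap stated in this task


def check_detection_cap(detections, limit=MAX_DETECTIONS_PER_FRAME):
    """Return (within_cap, cap_reached) for one frame's detections."""
    count = len(detections)
    return count <= limit, count == limit
```

A test would assert on `within_cap` and could log `cap_reached` so that a run which never hits the limit is visible in the report.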
**AC-5: Log file rotation**
Given the service is running with Logs/ volume mounted
When detection requests are made
Then a log file exists at Logs/log_inference_YYYYMMDD.txt with today's date
And the log content contains structured INFO/DEBUG/WARNING entries
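The naming check in AC-5 could be expressed as follows. The file name pattern comes from this task; the helper names are hypothetical, and the level check assumes only that each entry contains the level as a standalone word, since the exact log line format is not specified here.

```python
import re
from datetime import date
from pathlib import Path

LEVEL_PATTERN = re.compile(r"\b(INFO|DEBUG|WARNING)\b")


def expected_log_path(log_dir, day=None):
    """Build the date-based log path, defaulting to today's date."""
    day = day or date.today()
    return Path(log_dir) / f"log_inference_{day.strftime('%Y%m%d')}.txt"


def has_level_entries(text):
    """True if at least one line carries an INFO, DEBUG, or WARNING level."""
    return any(LEVEL_PATTERN.search(line) for line in text.splitlines())
```

In a test, `expected_log_path("Logs").exists()` would be asserted after sending a detection request, followed by `has_level_entries` on the file's contents.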
## Non-Functional Requirements
**Reliability**
- Resource limits must be enforced without crash or undefined behavior
## Integration Tests
| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | Engine warm | 4 concurrent POST /detect | 2-at-a-time processing pattern | Max 60s |
| AC-2 | Engine warm, stalled SSE | Async detection > 100 events | <= 100 events buffered | Max 120s |
| AC-3 | Engine warm, stalled SSE | Detection pipeline behavior | Completes normally | Max 120s |
| AC-4 | Engine warm, dense scene image | POST /detect | <= 300 detections | Max 30s |
| AC-5 | Service running, Logs/ mounted | Detection requests | Date-named log file exists | Max 10s |
## Constraints
- Worker limit test requires precise timing measurement of response arrivals
- SSE overflow test requires ability to pause/resume SSE client reading
- Detection cap test requires an image producing many detections (may not reach 300 with test fixture)
- Log rotation test verifies naming convention; full 30-day retention requires long-running test
## Risks & Mitigation
**Risk 1: Insufficient detections for cap test**
- *Risk*: Test image may not produce 300 detections to actually hit the cap
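- *Mitigation*: Assert that the detection count is <= 300 and accept the test as passing if the count is under the limit; note that a run below 300 only shows the cap is not exceeded, not that it is enforced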
**Risk 2: SSE client stall implementation**
- *Risk*: HTTP client libraries may not support controlled read pausing
- *Mitigation*: Use raw socket or thread-based approach to control when events are consumed
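The thread-based mitigation for Risk 2 might look like the sketch below. It is a local demonstration only: `socket.socketpair()` stands in for the HTTP connection to the SSE endpoint, and `StallableReader` and `parse_sse_events` are hypothetical helpers. The reader thread does not touch the socket until the test sets `resume`, which emulates a stalled client.

```python
import socket
import threading


def parse_sse_events(raw):
    """Extract the payloads of `data:` lines from a raw SSE byte stream."""
    events = []
    for block in raw.decode("utf-8").split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(line[len("data:"):].strip())
    return events


class StallableReader(threading.Thread):
    """Reads from a socket only after `resume` is set."""

    def __init__(self, sock):
        super().__init__(daemon=True)
        self.sock = sock
        self.resume = threading.Event()
        self.raw = b""

    def run(self):
        self.resume.wait()  # stalled: nothing is read from the socket yet
        while True:
            chunk = self.sock.recv(4096)
            if not chunk:  # peer closed the stream
                break
            self.raw += chunk


server_side, client_side = socket.socketpair()
reader = StallableReader(client_side)
reader.start()
for i in range(5):
    server_side.sendall(f"data: event-{i}\n\n".encode("utf-8"))
server_side.close()  # writer finishes while the reader is still stalled
reader.resume.set()  # client resumes reading
reader.join(timeout=5)
client_side.close()
events = parse_sse_events(reader.raw)
```

Against the real service the reader would wrap the socket of an open SSE request, and the test would count `events` after resuming to confirm that at most 100 were buffered.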