Update demo replay validation and testing documentation
ci/woodpecker/push/02-build-push Pipeline failed

- Modified the autodev state to reflect the current testing phase and details of the new `jetson-e2e` tests.
- Enhanced the "How to Test" documentation to provide clearer instructions on the demo replay validation process, including video and tlog alignment steps.
- Updated architectural documentation to include the new demo replay operator flow and its dependencies.
- Documented the removal of deprecated auto-sync features and clarified the operator-facing UI for replay validation.
- Added new entries in the dependencies table for upcoming tasks related to the demo replay flow.

These changes improve clarity and usability for operators and developers working with the demo replay system.
This commit is contained in:
Oleksandr Bezdieniezhnykh
2026-06-20 11:24:43 +03:00
parent 12d0008763
commit 1f634c2604
175 changed files with 20701 additions and 41 deletions
@@ -0,0 +1,133 @@
# API Contract Template
A contract is the **frozen, reviewed interface** between two or more components. When task A produces a shared model, DTO, schema, event payload, or public API, and task B consumes it, they must not reverse-engineer each other's implementation — they must read the contract.
Save the filled contract at `_docs/02_document/contracts/<component>/<name>.md`. Reference it from the producing task's `## Contract` section and from every consuming task's `## Dependencies` section.
---
```markdown
# Contract: [contract-name]
**Component**: [component-name]
**Producer task**: [TRACKER-ID] — [task filename]
**Consumer tasks**: [list of TRACKER-IDs or "TBD at decompose time"]
**Version**: 1.0.0
**Status**: [draft | frozen | deprecated]
**Last Updated**: [YYYY-MM-DD]
## Purpose
Short statement of what this contract represents and why it is shared (13 sentences).
## Shape
Choose ONE of the following shape forms per the contract type:
### For data models (DTO / schema / event)
```[language]
// language-native type definitions — e.g., Python dataclass, C# record, TypeScript interface, Rust struct, JSON Schema
```
For each field:
| Field | Type | Required | Description | Constraints |
|-------|------|----------|-------------|-------------|
| `id` | `string` (UUID) | yes | Unique identifier | RFC 4122 v4 |
| `created_at` | `datetime` (ISO 8601 UTC) | yes | Creation timestamp | |
| `...` | ... | ... | ... | ... |
### For function / method APIs
| Name | Signature | Throws / Errors | Blocking? |
|------|-----------|-----------------|-----------|
| `do_x` | `(input: InputDto) -> Result<OutputDto, XError>` | `XError::NotFound`, `XError::Invalid` | sync |
| ... | ... | ... | ... |
### For HTTP / RPC endpoints
| Method | Path | Request body | Response | Status codes |
|--------|------|--------------|----------|--------------|
| `POST` | `/api/v1/resource` | `CreateResource` | `Resource` | 201, 400, 409 |
| ... | ... | ... | ... | ... |
## Invariants
Properties that MUST hold for every valid instance or every allowed interaction. These survive refactors.
- Invariant 1: [statement]
- Invariant 2: [statement]
## Non-Goals
Things this contract intentionally does NOT cover. Helps prevent scope creep.
- Not covered: [statement]
## Versioning Rules
- **Breaking changes** (field renamed/removed, type changed, required→optional flipped) require a new major version and a deprecation path for consumers.
- **Non-breaking additions** (new optional field, new error variant consumers already tolerate) require a minor version bump.
## Test Cases
Representative cases that both producer and consumer tests must cover. Keep short — this is the contract test surface, not an exhaustive suite.
| Case | Input | Expected | Notes |
|------|-------|----------|-------|
| valid-minimal | minimal valid instance | accepted | |
| invalid-missing-required | missing `id` | rejected with specific error | |
| edge-case-x | ... | ... | |
## Change Log
| Version | Date | Change | Author |
|---------|------|--------|--------|
| 1.0.0 | YYYY-MM-DD | Initial contract | [agent/user] |
```
---
## Decompose-skill rules for emitting contracts
A task is a **shared-models / shared-API task** when ANY of the following is true:
- The component spec lists it as a shared component (under `shared/*` in `module-layout.md`).
- The task's **Scope.Included** mentions any of: "public interface", "DTO", "schema", "event", "contract", "API endpoint", "shared model".
- The task is parented to a cross-cutting epic (`epic_type: cross-cutting`).
- The task is depended on by ≥2 other tasks across different components.
For every shared-models / shared-API task:
1. Create a contract file at `_docs/02_document/contracts/<component>/<name>.md` using this template.
2. Fill in Shape, Invariants, Non-Goals, Versioning Rules, and at least 3 Test Cases.
3. Add a mandatory `## Contract` section to the task spec that links to the contract file:
```markdown
## Contract
This task produces/implements the contract at `_docs/02_document/contracts/<component>/<name>.md`.
Consumers MUST read that file — not this task spec — to discover the interface.
```
4. For every consuming task, add the contract path to its `## Dependencies` section as a document dependency (not a task dependency):
```markdown
### Document Dependencies
- `_docs/02_document/contracts/<component>/<name>.md` — API contract produced by [TRACKER-ID].
```
5. If the contract changes after it was frozen, the producer task must bump the `Version` and note the change in `Change Log`. Consumers referenced in the contract header must be notified (surface to user via Choose format).
## Code-review-skill rules for verifying contracts
Phase 2 (Spec Compliance) adds a check:
- For every task with a `## Contract` section:
- Verify the referenced contract file exists at the stated path.
- Verify the implementation's public signatures (types, method shapes, endpoint paths) match the contract's Shape section.
- If they diverge, emit a `Spec-Gap` finding with High severity.
- For every consuming task's Document Dependencies that reference a contract:
- Verify the consumer's imports / calls match the contract's Shape.
- If they diverge, emit a `Spec-Gap` finding with High severity and a hint that either the contract or the consumer is drifting.
@@ -0,0 +1,31 @@
# Dependencies Table Template
Use this template after cross-task verification. Save as `TASKS_DIR/_dependencies_table.md`.
---
```markdown
# Dependencies Table
**Date**: [YYYY-MM-DD]
**Total Tasks**: [N]
**Total Complexity Points**: [N]
| Task | Name | Complexity | Dependencies | Epic |
|------|------|-----------|-------------|------|
| [TRACKER-ID] | initial_structure | [points] | None | [EPIC-ID] |
| [TRACKER-ID] | [short_name] | [points] | [TRACKER-ID] | [EPIC-ID] |
| [TRACKER-ID] | [short_name] | [points] | [TRACKER-ID] | [EPIC-ID] |
| [TRACKER-ID] | [short_name] | [points] | [TRACKER-ID], [TRACKER-ID] | [EPIC-ID] |
| ... | ... | ... | ... | ... |
```
---
## Guidelines
- Every task from TASKS_DIR must appear in this table
- Dependencies column lists tracker IDs (e.g., "AZ-43, AZ-44") or "None"
- No circular dependencies allowed
- Tasks should be listed in recommended execution order
- The `/implement` skill reads this table to compute dependency-aware batches; task execution remains sequential
@@ -0,0 +1,135 @@
# Initial Structure Task Template
Use this template for the bootstrap structure plan. Save as `TASKS_DIR/01_initial_structure.md` initially, then rename to `TASKS_DIR/[TRACKER-ID]_initial_structure.md` after work item ticket creation.
---
```markdown
# Initial Project Structure
**Task**: [TRACKER-ID]_initial_structure
**Name**: Initial Structure
**Description**: Scaffold the project skeleton — folders, shared models, interfaces, stubs, CI/CD, DB migrations, test structure
**Complexity**: [3|5] points
**Dependencies**: None
**Component**: Bootstrap
**Tracker**: [TASK-ID]
**Epic**: [EPIC-ID]
## Project Folder Layout
```
project-root/
├── [folder structure based on tech stack and components]
└── ...
```
### Layout Rationale
[Brief explanation of why this structure was chosen — language conventions, framework patterns, etc.]
## DTOs and Interfaces
### Shared DTOs
| DTO Name | Used By Components | Fields Summary |
|----------|-------------------|---------------|
| [name] | [component list] | [key fields] |
### Component Interfaces
| Component | Interface | Methods | Exposed To |
|-----------|-----------|---------|-----------|
| [name] | [InterfaceName] | [method list] | [consumers] |
## CI/CD Pipeline
| Stage | Purpose | Trigger |
|-------|---------|---------|
| Build | Compile/bundle the application | Every push |
| Lint / Static Analysis | Code quality and style checks | Every push |
| Unit Tests | Run unit test suite | Every push |
| Blackbox Tests | Run blackbox test suite | Every push |
| Security Scan | SAST / dependency check | Every push |
| Deploy to Staging | Deploy to staging environment | Merge to staging branch |
### Pipeline Configuration Notes
[Framework-specific notes: CI tool, runners, caching, parallelism, etc.]
## Environment Strategy
| Environment | Purpose | Configuration Notes |
|-------------|---------|-------------------|
| Development | Local development | [local DB, mock services, debug flags] |
| Staging | Pre-production testing | [staging DB, staging services, production-like config] |
| Production | Live system | [production DB, real services, optimized config] |
### Environment Variables
| Variable | Dev | Staging | Production | Description |
|----------|-----|---------|------------|-------------|
| [VAR_NAME] | [value/source] | [value/source] | [value/source] | [purpose] |
## Database Migration Approach
**Migration tool**: [tool name]
**Strategy**: [migration strategy — e.g., versioned scripts, ORM migrations]
### Initial Schema
[Key tables/collections that need to be created, referencing component data access patterns]
## Test Structure
```
tests/
├── unit/
│ ├── [component_1]/
│ ├── [component_2]/
│ └── ...
├── integration/
│ ├── test_data/
│ └── [test files]
└── ...
```
### Test Configuration Notes
[Test runner, fixtures, test data management, isolation strategy]
## Implementation Order
| Order | Component | Reason |
|-------|-----------|--------|
| 1 | [name] | [why first — foundational, no dependencies] |
| 2 | [name] | [depends on #1] |
| ... | ... | ... |
## Acceptance Criteria
**AC-1: Project scaffolded**
Given the structure plan above
When the implementer executes this task
Then all folders, stubs, and configuration files exist
**AC-2: Tests runnable**
Given the scaffolded project
When the test suite is executed
Then all stub tests pass (even if they only assert true)
**AC-3: CI/CD configured**
Given the scaffolded project
When CI pipeline runs
Then build, lint, and test stages complete successfully
```
---
## Guidance Notes
- This is a PLAN document, not code. The `/implement` skill executes it.
- Focus on structure and organization decisions, not implementation details.
- Reference component specs for interface and DTO details — don't repeat everything.
- The folder layout should follow conventions of the identified tech stack.
- Environment strategy should account for secrets management and configuration.
@@ -0,0 +1,107 @@
# Module Layout Template
The module layout is the **authoritative file-ownership map** used by the `/implement` skill to assign OWNED / READ-ONLY / FORBIDDEN files to each task. It is derived from `_docs/02_document/architecture.md` and the component specs at `_docs/02_document/components/`, and it follows the target language's standard project-layout conventions.
Save as `_docs/02_document/module-layout.md`. This file is produced by the decompose skill (Step 1.5 module layout) and consumed by the implement skill (Step 4 file ownership). Task specs remain purely behavioral — they do NOT carry file paths. The layout is the single place where component → filesystem mapping lives.
---
```markdown
# Module Layout
**Language**: [python | csharp | rust | typescript | go | mixed]
**Layout Convention**: [src-layout | crates-workspace | packages-workspace | custom]
**Root**: [src/ | crates/ | packages/ | ./]
**Last Updated**: [YYYY-MM-DD]
## Layout Rules
1. Each component owns ONE top-level directory under the root.
2. Shared code lives under `<root>/shared/` (or language equivalent: `src/shared/`, `crates/shared/`, `packages/shared/`).
3. Cross-cutting concerns (logging, config, error handling, telemetry) live under `<root>/shared/<concern>/`.
4. Public API surface per component = files listed in `public:` below. Everything else is internal — other components MUST NOT import it directly.
5. Tests live outside the component tree in a separate `tests/` or `<component>/tests/` directory per the language's test convention.
## Per-Component Mapping
### Component: [component-name]
- **Epic**: [TRACKER-ID]
- **Directory**: `src/<path>/`
- **Public API**: files in this list are importable by other components
- `src/<path>/public_api.py` (or `mod.rs`, `index.ts`, `PublicApi.cs`, etc.)
- `src/<path>/types.py`
- **Internal (do NOT import from other components)**:
- `src/<path>/internal/*`
- `src/<path>/_helpers.py`
- **Owns (exclusive write during implementation)**: `src/<path>/**`
- **Imports from**: [list of other components whose Public API this component may use]
- **Consumed by**: [list of components that depend on this component's Public API]
### Component: [next-component]
...
## Shared / Cross-Cutting
### shared/models
- **Directory**: `src/shared/models/`
- **Purpose**: DTOs, value types, schemas shared across components
- **Owned by**: whoever implements task `[TRACKER-ID]_shared_models`
- **Consumed by**: all components
### shared/logging
- **Directory**: `src/shared/logging/`
- **Purpose**: structured logging setup
- **Owned by**: cross-cutting task `[TRACKER-ID]_logging`
- **Consumed by**: all components
### shared/[other concern]
...
## Allowed Dependencies (layering)
Read top-to-bottom; an upper layer may import from a lower layer but NEVER the reverse.
| Layer | Components | May import from |
|-------|------------|-----------------|
| 4. API / Entry | [list] | 1, 2, 3 |
| 3. Application | [list] | 1, 2 |
| 2. Domain | [list] | 1 |
| 1. Shared / Foundation | shared/* | (none) |
Violations of this table are **Architecture** findings in code-review Phase 7 and are High severity.
## Layout Conventions (reference)
| Language | Root | Per-component path | Public API file | Test path |
|----------|------|-------------------|-----------------|-----------|
| Python | `src/<pkg>/` | `src/<pkg>/<component>/` | `src/<pkg>/<component>/__init__.py` (re-exports) | `tests/<component>/` |
| C# (.NET) | `src/` | `src/<Component>/` | `src/<Component>/<Component>.cs` (namespace root) | `tests/<Component>.Tests/` |
| Rust | `crates/` | `crates/<component>/` | `crates/<component>/src/lib.rs` | `crates/<component>/tests/` |
| TypeScript / React | `packages/` or `src/` | `src/<component>/` | `src/<component>/index.ts` (barrel) | `src/<component>/__tests__/` or `tests/<component>/` |
| Go | `./` | `internal/<component>/` or `pkg/<component>/` | `internal/<component>/doc.go` + exported symbols | `internal/<component>/*_test.go` |
```
---
## Self-verification for the decompose skill
When writing `_docs/02_document/module-layout.md`, verify:
- [ ] Every component in `_docs/02_document/components/` has a Per-Component Mapping entry.
- [ ] Every shared / cross-cutting epic has an entry in the Shared section.
- [ ] Layering table rows cover every component.
- [ ] No component's `Imports from` list contains a component at a higher layer.
- [ ] Paths follow the detected language's convention.
- [ ] No two components own overlapping paths.
## How the implement skill consumes this
The implement skill's Step 4 (File Ownership) reads this file and, for each task in the batch:
1. Resolve the task's Component field to a Per-Component Mapping entry.
2. Set OWNED = the component's `Owns` glob.
3. Set READ-ONLY = the Public API files of every component listed in `Imports from`, plus `shared/*` Public API files.
4. Set FORBIDDEN = every other component's Owns glob.
Execution inside a batch is already sequential (one task at a time). This mapping is still required because it enforces scope discipline per task — preventing a task from drifting into files that belong to another component.
+124
View File
@@ -0,0 +1,124 @@
# Task Specification Template
Create a focused behavioral specification that describes **what** the system should do, not **how** it should be built.
Save as `TASKS_DIR/[##]_[short_name].md` initially, then rename to `TASKS_DIR/[TRACKER-ID]_[short_name].md` after work item ticket creation.
---
```markdown
# [Feature Name]
**Task**: [TRACKER-ID]_[short_name]
**Name**: [short human name]
**Description**: [one-line description of what this task delivers]
**Complexity**: [1|2|3|5] points
**Dependencies**: [AZ-43_shared_models, AZ-44_db_migrations] or "None"
**Component**: [component name for context]
**Tracker**: [TASK-ID]
**Epic**: [EPIC-ID]
## Problem
Clear, concise statement of the problem users are facing.
## Outcome
- Measurable or observable goal 1
- Measurable or observable goal 2
- ...
## Scope
### Included
- What's in scope for this task
### Excluded
- Explicitly what's NOT in scope
## Acceptance Criteria
**AC-1: [Title]**
Given [precondition]
When [action]
Then [expected result]
**AC-2: [Title]**
Given [precondition]
When [action]
Then [expected result]
## Non-Functional Requirements
**Performance**
- [requirement if relevant]
**Compatibility**
- [requirement if relevant]
**Reliability**
- [requirement if relevant]
## Unit Tests
| AC Ref | What to Test | Required Outcome |
|--------|-------------|-----------------|
| AC-1 | [test subject] | [expected result] |
## Blackbox Tests
| AC Ref | Initial Data/Conditions | What to Test | Expected Behavior | NFR References |
|--------|------------------------|-------------|-------------------|----------------|
| AC-1 | [setup] | [test subject] | [expected behavior] | [NFR if any] |
## Constraints
- [Architectural pattern constraint if critical]
- [Technical limitation]
- [Integration requirement]
## Risks & Mitigation
**Risk 1: [Title]**
- *Risk*: [Description]
- *Mitigation*: [Approach]
## Contract
<!--
OMIT this section for behavioral-only tasks.
INCLUDE this section ONLY for shared-models / shared-API / contract tasks.
See decompose/SKILL.md Step 2 shared-models rule and decompose/templates/api-contract.md.
-->
This task produces/implements the contract at `_docs/02_document/contracts/<component>/<name>.md`.
Consumers MUST read that file — not this task spec — to discover the interface.
```
---
## Complexity Points Guide
- 1 point: Trivial, self-contained, no dependencies
- 2 points: Non-trivial, low complexity, minimal coordination
- 3 points: Multi-step, moderate complexity, potential alignment needed
- 5 points: Difficult, interconnected logic, medium-high risk
- 8+ points: Too complex — split into smaller tasks
## Output Guidelines
**DO:**
- Focus on behavior and user experience
- Use clear, simple language
- Keep acceptance criteria testable (Gherkin format)
- Include realistic scope boundaries
- Write from the user's perspective
- Include complexity estimation
- Reference dependencies by tracker ID (e.g., AZ-43_shared_models)
**DON'T:**
- Include implementation details (file paths, classes, methods)
- Prescribe technical solutions or libraries
- Add architectural diagrams or code examples
- Specify exact API endpoints or data structures
- Include step-by-step implementation instructions
- Add "how to build" guidance
@@ -0,0 +1,129 @@
# Test Infrastructure Task Template
Use this template for the test infrastructure bootstrap (Step 1t in tests-only mode). Save as `TASKS_DIR/01_test_infrastructure.md` initially, then rename to `TASKS_DIR/[TRACKER-ID]_test_infrastructure.md` after work item ticket creation.
---
```markdown
# Test Infrastructure
**Task**: [TRACKER-ID]_test_infrastructure
**Name**: Test Infrastructure
**Description**: Scaffold the Blackbox test project — test runner, mock services, Docker test environment, test data fixtures, reporting
**Complexity**: [3|5] points
**Dependencies**: None
**Component**: Blackbox Tests
**Tracker**: [TASK-ID]
**Epic**: [EPIC-ID]
## Test Project Folder Layout
```
e2e/
├── conftest.py
├── requirements.txt
├── Dockerfile
├── mocks/
│ ├── [mock_service_1]/
│ │ ├── Dockerfile
│ │ └── [entrypoint file]
│ └── [mock_service_2]/
│ ├── Dockerfile
│ └── [entrypoint file]
├── fixtures/
│ └── [test data files]
├── tests/
│ ├── test_[category_1].py
│ ├── test_[category_2].py
│ └── ...
└── docker-compose.test.yml
```
### Layout Rationale
[Brief explanation of directory structure choices — framework conventions, separation of mocks from tests, fixture management]
## Mock Services
| Mock Service | Replaces | Endpoints | Behavior |
|-------------|----------|-----------|----------|
| [name] | [external service] | [endpoints it serves] | [response behavior, configurable via control API] |
### Mock Control API
Each mock service exposes a `POST /mock/config` endpoint for test-time behavior control (e.g., simulate downtime, inject errors). A `GET /mock/[resource]` endpoint returns recorded interactions for assertion.
## Docker Test Environment
### docker-compose.test.yml Structure
| Service | Image / Build | Purpose | Depends On |
|---------|--------------|---------|------------|
| [system-under-test] | [build context] | Main system being tested | [mock services] |
| [mock-1] | [build context] | Mock for [external service] | — |
| [e2e-consumer] | [build from e2e/] | Test runner | [system-under-test] |
### Networks and Volumes
[Isolated test network, volume mounts for test data, model files, results output]
## Test Runner Configuration
**Framework**: [e.g., pytest]
**Plugins**: [e.g., pytest-csv, sseclient-py, requests]
**Entry point**: [e.g., pytest --csv=/results/report.csv]
### Fixture Strategy
| Fixture | Scope | Purpose |
|---------|-------|---------|
| [name] | [session/module/function] | [what it provides] |
## Test Data Fixtures
| Data Set | Source | Format | Used By |
|----------|--------|--------|---------|
| [name] | [volume mount / generated / API seed] | [format] | [test categories] |
### Data Isolation
[Strategy: fresh containers per run, volume cleanup, mock state reset]
## Test Reporting
**Format**: [e.g., CSV]
**Columns**: [e.g., Test ID, Test Name, Execution Time (ms), Result, Error Message]
**Output path**: [e.g., /results/report.csv → mounted to host]
## Acceptance Criteria
**AC-1: Test environment starts**
Given the docker-compose.test.yml
When `docker compose -f docker-compose.test.yml up` is executed
Then all services start and the system-under-test is reachable
**AC-2: Mock services respond**
Given the test environment is running
When the e2e-consumer sends requests to mock services
Then mock services respond with configured behavior
**AC-3: Test runner executes**
Given the test environment is running
When the e2e-consumer starts
Then the test runner discovers and executes test files
**AC-4: Test report generated**
Given tests have been executed
When the test run completes
Then a report file exists at the configured output path with correct columns
```
---
## Guidance Notes
- This is a PLAN document, not code. The `/implement` skill executes it.
- Focus on test infrastructure decisions, not individual test implementations.
- Reference environment.md and test-data.md from the test specs — don't repeat everything.
- Mock services must be deterministic: same input always produces same output.
- The Docker environment must be self-contained: `docker compose up` sufficient.