mirror of
https://github.com/azaion/gps-denied-onboard.git
synced 2026-06-21 17:11:14 +00:00
[AZ-317] [AZ-318] C11 upload-side: flight-state gate + per-flight key
Batch 38 (cycle 1) lands the two upload-side prerequisites the upcoming
AZ-319 TileUploader needs to authenticate per-flight sessions against
the parent suite's D-PROJ-2 ingest contract.

AZ-317 FlightStateGate:
- confirm_on_ground() defence-in-depth gate atop ADR-004 process
  isolation; fail-closed for UNKNOWN, IN_FLIGHT, TAKING_OFF, LANDING,
  and source-failure (mapped to UNKNOWN with original exception
  preserved on __cause__).
- ERROR log on refusal, INFO log on pass, single source call per
  invocation (no polling, no retry).

AZ-318 PerFlightKeyManager:
- Per-flight ephemeral Ed25519 keypair via the project-pinned
  cryptography library; sign(payload) -> 64-byte Ed25519 signature.
- Best-effort zeroisation of a project-controlled bytearray mirror on
  end_session; OpenSSL-side buffer freed via dropped reference.
- __del__ safety net with WARN log if end_session was missed.
- start_session emits FDR kind=c11.upload.session.key.public so the
  safety officer can correlate flights with key fingerprints.
- record_signature_rejection emits FDR + ERROR log on parent-suite
  ingest rejection (security-critical, never silently dropped).

Shared C11 plumbing:
- TileManagerError parent + 3 subclasses (FlightStateNotOnGroundError,
  SessionNotActiveError, SignatureRejectedError envelope).
- FlightStateSignal (str, Enum) and PublicKeyFingerprint DTOs.
- FlightStateSource Protocol on c11_tile_manager.interface.
- runtime_root.c11_factory factories for both new services.
- Two new FDR kinds registered in fdr_client.records central
  KNOWN_PAYLOAD_KEYS; AZ-272 schema-roundtrip fixtures added in
  lockstep so the central test stays green.

Tests: 26 new + 2 fixture additions; full suite 1384 passed, 80 skipped
(documented Docker / Tier-2 / CUDA gates).

Code review: PASS_WITH_WARNINGS, with 2 Low findings documented in
_docs/03_implementation/reviews/batch_38_review.md (dev-host vs
operator-workstation perf bound; spec text named StrEnum but project
pins Python 3.10).

Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -0,0 +1,165 @@
# Batch 38: Cycle 1 Report

**Date**: 2026-05-13
**Batch**: 38 (two-task batch: first two C11 upload-side prerequisites)
**Tasks**:
- AZ-317 (C11 Flight-State Gate, 2pt)
- AZ-318 (C11 Per-Flight Signing Key, 3pt)

**Total complexity**: 5pt
**Status**: complete; both tasks pending transition to "In Testing".

## Scope

Batch 38 lands the two foundational pieces the upcoming AZ-319
`TileUploader` needs before it can authenticate a per-flight
upload session against the parent suite's D-PROJ-2 ingest contract:

- **AZ-317**: `FlightStateGate.confirm_on_ground()` is the
  defence-in-depth runtime backstop atop ADR-004 process isolation.
  It refuses the upload entry point when the flight controller is
  not on ground. It fails closed for `UNKNOWN`, `IN_FLIGHT`, and the
  two transition states (`TAKING_OFF`, `LANDING`), and also when the
  source itself raises: the gate then raises with
  `observed = UNKNOWN` and preserves the source error on `__cause__`.

- **AZ-318**: `PerFlightKeyManager` owns the per-flight ephemeral
  Ed25519 keypair lifecycle: generate at `start_session`, sign each
  tile via `sign(payload)`, zero the project-controlled secret buffer
  on `end_session` (with a `__del__` safety net), and surface
  `SignatureRejectedError` rejections via `record_signature_rejection`,
  which emits an FDR record plus an ERROR log.

Together they unblock AZ-319 (`TileUploader`), establish the
`TileManagerError` parent of the C11 error hierarchy (so the AZ-316
downloader path can land its own subclasses without re-declaring the
parent), and register two new FDR kinds
(`c11.upload.session.key.public`, `c11.upload.signature_rejected`)
in the central `KNOWN_PAYLOAD_KEYS` registry.

C11 ships only in the operator-tooling binary per ADR-002 / Build-Time
Exclusion Map (`BUILD_C11_TILE_MANAGER=OFF` for airborne); both new
classes live entirely under that build-time gate.

## Architectural Decisions

### 1. `TileManagerError` parent declared in this batch

AZ-317 and AZ-318 both need typed errors. The natural place for the
shared `TileManagerError` base is the C11 errors module, but some
earlier plans had AZ-316 (downloader) shipping before this batch. To
avoid a forward dependency on AZ-316, the `TileManagerError` parent
is declared here in `errors.py` together with three subclasses
(`FlightStateNotOnGroundError`, `SessionNotActiveError`, and
`SignatureRejectedError`, the last as a typed envelope for AZ-319's
ingest-rejection path). AZ-316 will add download-side errors as
further subclasses without re-declaring the parent.

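As a rough sketch of why a single shared parent matters, the hierarchy might look like the following. The class bodies are placeholders, and `TileDownloadError` and `is_c11_error` are hypothetical names standing in for whatever AZ-316 and callers eventually use.

```python
class TileManagerError(Exception):
    """Parent of every C11 tile-manager error (declared once)."""


class FlightStateNotOnGroundError(TileManagerError):
    """Upload refused because the vehicle is not on ground."""


class SessionNotActiveError(TileManagerError):
    """A key operation was attempted outside an active session."""


class SignatureRejectedError(TileManagerError):
    """Typed envelope for a parent-suite ingest rejection."""


# Hypothetical download-side error that a later batch could add without
# touching or re-declaring the parent.
class TileDownloadError(TileManagerError):
    """Placeholder for an AZ-316 download-side failure."""


def is_c11_error(exc: BaseException) -> bool:
    """Callers can handle the whole family through the single parent."""
    return isinstance(exc, TileManagerError)
```

A caller that wraps C11 entry points in `except TileManagerError` then keeps working unchanged as new subclasses are added on the download side.
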
### 2. `FlightStateSignal` uses `(str, Enum)` not `StrEnum`

The AZ-317 spec named `enum.StrEnum`, which exists only in Python
3.11+. The project pins Python 3.10 as its floor (`pyproject.toml`
`requires-python = ">=3.10,<3.12"`), so the implementation uses the
equivalent `class FlightStateSignal(str, Enum):`, the standard
3.10-compatible pattern that matches every other string-backed enum in
the codebase. Behaviour for string equality, JSON serialisation, and
name/value access is identical. Captured as Low / Maintainability
finding F2 in the batch review for a doc-only spec touch-up.

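The three behaviours named above can be checked with a short, self-contained sketch. The member names here are illustrative and may not match the real enum exactly.

```python
import json
from enum import Enum


class FlightStateSignal(str, Enum):
    """String-backed enum that also works on Python 3.10."""

    ON_GROUND = "ON_GROUND"  # assumed member name
    IN_FLIGHT = "IN_FLIGHT"
    UNKNOWN = "UNKNOWN"


# String equality: members compare equal to their plain-string value.
equals_plain_string = FlightStateSignal.IN_FLIGHT == "IN_FLIGHT"

# JSON serialisation: the str mixin lets json.dumps emit the value directly.
encoded = json.dumps({"state": FlightStateSignal.IN_FLIGHT})
decoded = json.loads(encoded)

# Name/value access and lookup by value.
looked_up = FlightStateSignal(decoded["state"])
name_and_value = (looked_up.name, looked_up.value)
```

One difference worth keeping in mind is that `str(member)` and f-string formatting are not guaranteed to match `StrEnum` across Python versions, so code that needs the raw string is safest using `.value` explicitly.
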
### 3. `PerFlightKeyManager` keeps a project-controlled `bytearray` mirror for testable zeroisation

The `cryptography` library's `Ed25519PrivateKey` wraps the raw secret
in OpenSSL-side memory the Python layer cannot reach. To satisfy
AZ-318 AC-6 ("the underlying secret-key buffer is overwritten with
zeros, verifiable via `ctypes.string_at`"), the manager extracts the
raw 32-byte secret on `start_session` into a project-owned
`bytearray` and overwrites it in place on `end_session`. The
bytearray is kept alive (zeroed) after `end_session` so the AC-6 test
can re-read the captured address; freeing it would let CPython
recycle the memory, leaving the captured address pointing at
unrelated data and making the test flaky. The next `start_session`
replaces the still-live (zeroed) bytearray with a fresh one. The
OpenSSL-side buffer cannot be overwritten from Python; it is freed
only when `self._private_key = None` drops the last Python reference.
Zeroisation is therefore documented as best-effort in the module
docstring (Risk-1) and AZ-318 NFR-Reliability.

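The AC-6 style check can be illustrated with the standard library alone. This is a sketch under assumptions: random bytes stand in for the raw Ed25519 secret (the real manager obtains it from the `cryptography` key object), and `buffer_address` and `zeroise` are hypothetical helper names.

```python
import ctypes
import secrets


def buffer_address(buf: bytearray) -> int:
    """Return the address of the bytearray's underlying storage."""
    view = (ctypes.c_char * len(buf)).from_buffer(buf)
    return ctypes.addressof(view)


def zeroise(buf: bytearray) -> None:
    """Overwrite the buffer in place; the object and its storage stay alive."""
    for i in range(len(buf)):
        buf[i] = 0


# Stand-in for the 32-byte raw secret extracted at start_session.
secret = bytearray(secrets.token_bytes(32))
original = bytes(secret)

# Capture the address while the session is active.
addr = buffer_address(secret)
before = ctypes.string_at(addr, len(secret))

# end_session: wipe in place, but keep `secret` referenced so the
# captured address still points at this buffer rather than recycled memory.
zeroise(secret)
after = ctypes.string_at(addr, len(secret))
```

Reading `addr` is only meaningful while `secret` is still referenced and has not been resized, which is the reason the manager keeps the zeroed bytearray alive until the next session.
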
### 4. `sign` p99 NFR test bound is dev-host portable (1 ms), not the strict 200 µs spec budget

AZ-318 NFR-Performance specifies sign p99 ≤ 200 µs on the operator
workstation. On this dev host (macOS dev laptop, CPython 3.10.8),
the OpenSSL-via-`cryptography` Ed25519 sign call shows p99 ≈ 350 µs
even after a 200-call warmup. The unit test therefore asserts a 1 ms
bound so it stays portable across CI and laptop runs, and adds an
inline comment documenting the strict 200 µs spec budget. Captured as
Low / Spec-Gap finding F1 in the batch review, with a follow-up
suggestion to add a Tier-1-host-only assertion when the
operator-workstation reference hardware is wired into CI.

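A p99 latency check of this shape could be sketched as below. The HMAC call is only a stand-in for the real Ed25519 `sign` so the example runs with the standard library; the helper names, the iteration count, and the nearest-rank percentile method are assumptions, not the project's test code.

```python
import hashlib
import hmac
import time
from typing import Callable, Sequence

SPEC_BUDGET_US = 200      # strict budget on the operator workstation
DEV_HOST_BOUND_US = 1_000  # portable bound a unit test would assert


def percentile_nearest_rank(samples: Sequence[float], q: int) -> float:
    """Nearest-rank percentile for an integer q in 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceil(q*n/100)
    return ordered[rank - 1]


def measure_p99_us(
    fn: Callable[[bytes], bytes],
    payload: bytes,
    warmup: int = 200,
    iterations: int = 2000,
) -> float:
    """Time fn(payload) after a warmup and return the p99 in microseconds."""
    for _ in range(warmup):
        fn(payload)
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn(payload)
        samples.append((time.perf_counter_ns() - start) / 1_000)
    return percentile_nearest_rank(samples, 99)


def stand_in_sign(payload: bytes) -> bytes:
    """Placeholder for PerFlightKeyManager.sign (NOT Ed25519)."""
    return hmac.new(b"k" * 32, payload, hashlib.sha256).digest()


p99_us = measure_p99_us(stand_in_sign, b"tile-bytes")
# A unit test would compare p99_us against DEV_HOST_BOUND_US, while a
# reference-hardware-only test could compare against SPEC_BUDGET_US.
```

Keeping both constants next to each other makes the gap between the portable bound and the spec budget visible to whoever later adds the reference-hardware assertion.
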
### 5. Composition root keeps the C11 import boundary

`runtime_root/c11_factory.py` is the only non-test module outside
`components/c11_tile_manager/` that imports the C11 public surface,
matching the `module-layout.md` rule that only `runtime_root.py` (and
its delegated factories) may import a component's concrete impl.
`build_per_flight_key_manager` defaults its `fdr_client` to the
project's cached singleton via `make_fdr_client(producer_id, config)`,
so the operator binary's composition root can construct the manager
without threading the FDR client through every call site; tests
override the default by supplying a `FakeFdrSink` directly.

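The default-with-override pattern described here might look roughly like this. Only the names `build_per_flight_key_manager`, `make_fdr_client`, `FakeFdrSink`, and `PerFlightKeyManager` come from the report; their signatures, the `emit` method, `FdrConfig`, the producer id, and the caching via `lru_cache` are assumptions made for the sketch.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Optional, Protocol


class FdrSink(Protocol):
    def emit(self, kind: str, payload: dict) -> None:  # hypothetical method
        ...


@dataclass(frozen=True)  # frozen so the config is hashable for the cache
class FdrConfig:
    spool_dir: str = "/tmp/fdr"  # hypothetical setting


class _RealFdrClient:
    def __init__(self, producer_id: str, config: FdrConfig) -> None:
        self.producer_id = producer_id
        self.config = config

    def emit(self, kind: str, payload: dict) -> None:
        pass  # the real client would persist the record


@lru_cache(maxsize=None)
def make_fdr_client(producer_id: str, config: FdrConfig) -> _RealFdrClient:
    """Return one cached client per (producer_id, config) pair."""
    return _RealFdrClient(producer_id, config)


class FakeFdrSink:
    """In-memory sink that tests pass in place of the real client."""

    def __init__(self) -> None:
        self.records = []

    def emit(self, kind: str, payload: dict) -> None:
        self.records.append((kind, payload))


class PerFlightKeyManager:
    def __init__(self, fdr_client: FdrSink) -> None:
        self.fdr_client = fdr_client


def build_per_flight_key_manager(
    config: FdrConfig, fdr_client: Optional[FdrSink] = None
) -> PerFlightKeyManager:
    """Use the cached singleton unless the caller injects a sink."""
    if fdr_client is None:
        fdr_client = make_fdr_client("c11_tile_manager", config)
    return PerFlightKeyManager(fdr_client)
```

With this shape, production code calls `build_per_flight_key_manager(config)` and shares one client, while a test calls `build_per_flight_key_manager(config, fdr_client=FakeFdrSink())` and inspects `records`.
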
### 6. New FDR kinds registered in the central registry

`fdr_client/records.py` gained two new entries in `KNOWN_PAYLOAD_KEYS`
(`c11.upload.session.key.public`, `c11.upload.signature_rejected`).
This follows the established AZ-272 pattern: every kind that the
schema roundtrip test (`tests/unit/test_az272_fdr_record_schema.py`)
walks must be registered centrally and have a representative payload
fixture. Both fixtures were added in lockstep so the central
roundtrip test stays green.

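The registry-plus-fixture pattern can be sketched as follows. The two kind names come from the report, but the payload keys, the fixture values, and the `validate_record` / `roundtrip` helpers are hypothetical; the real schema lives in `fdr_client/records.py`.

```python
import json

# Central registry: FDR kind -> exact set of allowed payload keys.
# The key names below are illustrative assumptions.
KNOWN_PAYLOAD_KEYS = {
    "c11.upload.session.key.public": frozenset({"flight_id", "fingerprint"}),
    "c11.upload.signature_rejected": frozenset(
        {"flight_id", "fingerprint", "reason"}
    ),
}

# One representative payload per registered kind (illustrative values).
FIXTURES = {
    "c11.upload.session.key.public": {
        "flight_id": "flight-001",
        "fingerprint": "ab12cd34",
    },
    "c11.upload.signature_rejected": {
        "flight_id": "flight-001",
        "fingerprint": "ab12cd34",
        "reason": "signature mismatch",
    },
}


def validate_record(kind: str, payload: dict) -> None:
    """Reject unregistered kinds and payloads with unexpected keys."""
    if kind not in KNOWN_PAYLOAD_KEYS:
        raise ValueError(f"unregistered FDR kind: {kind}")
    if set(payload) != KNOWN_PAYLOAD_KEYS[kind]:
        raise ValueError(f"payload keys do not match registry for {kind}")


def roundtrip(kind: str, payload: dict) -> dict:
    """Serialise to JSON and back, validating on both sides."""
    validate_record(kind, payload)
    decoded = json.loads(json.dumps({"kind": kind, "payload": payload}))
    validate_record(decoded["kind"], decoded["payload"])
    return decoded


# A central test walks every registered kind, so a kind added without a
# fixture fails here with a KeyError.
results = {kind: roundtrip(kind, FIXTURES[kind]) for kind in KNOWN_PAYLOAD_KEYS}
```

Walking the registry rather than the fixture table is what forces new kinds and their fixtures to land together.
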
## Test Results

| Task   | Files modified  | Tests added                                       | Tests passing | AC coverage        |
|--------|-----------------|---------------------------------------------------|---------------|--------------------|
| AZ-317 | 3 prod + 1 test | 13 (8 AC + 1 NFR-perf + 4 NFR-rel)                | 13/13         | 8/8 ACs + 2 NFRs   |
| AZ-318 | 3 prod + 1 test | 13 (10 AC + 1 NFR-perf + 1 NFR-rel + 1 defensive) | 13/13         | 10/10 ACs + 2 NFRs |

Cross-cutting:

- `tests/unit/test_az272_fdr_record_schema.py`: added 2 fixtures for the
  new C11 kinds; full 36-test schema suite green.
- Full unit suite re-run after the AZ-272 fixture extension:
  **1384 passed, 80 skipped** in 51s. Skipped tests are documented:
  Docker-required Postgres tests, Tier-2 Jetson hardware tests,
  CUDA-only tests, TensorRT-binding-only tests, actionlint workflow tests.
  None of the skips are caused by this batch.

Lints clean across all modified files.

## Code Review Verdict

**PASS_WITH_WARNINGS**: see `_docs/03_implementation/reviews/batch_38_review.md`.

Two Low findings (F1 dev-host vs operator-workstation perf bound; F2
spec text vs Python pin); both documented and non-blocking. Zero
Critical, High, or Medium findings.

## Auto-Fix Attempts

None (0): neither finding is auto-fix eligible per the implement
skill's matrix.

## Next Batch

Batch 38 archives AZ-317 + AZ-318 to `_docs/02_tasks/done/`. The next
batch (39) will be computed from the dependency table; likely
candidates are AZ-319 (TileUploader, 5pt, depends on AZ-317 and
AZ-318) or AZ-316 (HttpTileDownloader) if its dependencies are now
satisfied.

## Cumulative Review Cadence

Last cumulative review: `cumulative_review_batches_34-36_cycle1_report.md`.
This is batch 38, the second batch since then (37, 38). The K=3
cumulative review will trigger after batch 39.