Files
fresnel-fourier/phase4_iter5b_plan_v2.md
T
marfrit 5abea730a0 iter5b Phase 4 v2: re-plan with option β — CreateContext-centric OUTPUT lifecycle
Supersedes phase4_iter5b_plan.md (the α' plan rejected at Phase 7).
β architecture: strip OUTPUT-side V4L2 device state from
RequestCreateSurfaces2 entirely; move it to RequestCreateContext
where config_id (and therefore the bound profile) is unambiguously
known. CreateSurfaces2 becomes ID-allocation + per-surface
bookkeeping only.

9 contract clauses (C1..C9). Reuses 2 of 3 reverted iter5b commits
(codec.h/codec.c helper; object_config->pixelformat wire-up at
CreateConfig). New work: C3 strip surface.c, C4 build out
context.c — predicted ~120 LOC into context.c, ~190 LOC stripped
from surface.c (net ~70 LOC delta).

Risk register: 7 items; highest is multi-context resolution change
within shared driver_data (medium impact, mitigated by existing
DestroyContext teardown). α''s destructive teardown failure mode
disappears because β has no in-CreateSurfaces2 teardown branch.

Phase 5 review focus: error-recovery branches in CreateContext,
per-surface destination_* fill semantics (format-uniform fields
at CreateContext vs per-slot fields at BeginPicture), ohm
backwards-compat verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 12:57:54 +00:00

268 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Iteration 5b — Phase 4 v2 (plan, option β)
Captured 2026-05-12 after Phase 7 → Phase 4 loopback on the failed α' attempt. Re-plan with **option β: defer all OUTPUT-side V4L2 device state from `RequestCreateSurfaces2` to `RequestCreateContext`** — the lifecycle redesign the `surface.c:164-171` TODO comment anticipated and the Phase 5 reviewer-of-Phase-2 originally recommended.
## Why β, this time
Phase 7 empirically demonstrated α' fails because ffmpeg-vaapi calls `vaCreateSurfaces2` multiple times per session, with `vaCreateContext + STREAMON + frame decode` interleaved. α''s pixfmt-change-triggers-teardown gate corrupts state when called mid-stream. β eliminates the problem at the architectural level:
- **`CreateSurfaces2` no longer touches V4L2 device state.** No S_FMT, no CREATE_BUFS, no cap_pool. It allocates surface_object IDs and stores width/height. That's it.
- **`CreateContext` owns the full lifecycle.** It looks up `config_object->profile` (already does at context.c:71), derives the right pixelformat via `pixelformat_for_profile()`, does `S_FMT(OUTPUT, …)`, queries CAPTURE format, runs `cap_pool_init`, walks the bound `surface_ids[]` to fill per-surface `destination_*`. Then proceeds to existing `request_pool_init` + STREAMON + device-wide controls.
- **`DestroyContext` already handles teardown correctly.** STREAMOFF both queues, REQBUFS(0), cap_pool_destroy at context.c:276-307. No change needed.
The α' gate's destructive failure mode disappears because there's no gate — each `CreateContext` is a fresh setup, and the corresponding `DestroyContext` cleans up. mpv probe-then-real pattern: probe context creates with H264_SLICE (or whatever the probe profile), then DestroyContext, then real CreateContext with the real profile — STREAMOFF + REQBUFS(0) happens in DestroyContext between them.
## Reusable from failed iter5b
Of the three reverted commits, two are reusable without modification for β:
| iter5b commit (reverted) | Reusable for β? | Notes |
|---|---|---|
| `ce304ef` Commit A — `codec.h/codec.c` helper | **YES, identical** | `pixelformat_for_profile()` is profile→FOURCC map. Used at CreateConfig (commit B). |
| `f8256e6` Commit B — `last_output_pixelformat` field + `object_config->pixelformat` wire-up | **PARTIAL** | `object_config->pixelformat` wire-up at CreateConfig is reusable; the `last_output_pixelformat` field in driver_data is unnecessary in β (gate semantics move to per-CreateContext). |
| `4b2288f` Commit C — `find_sole_active_pixelformat` + surface.c gate change | **NO** | Wrong approach. β has no heap walk (CreateContext has config_id directly) and no surface.c gate (lifecycle is CreateContext-centric). |
## Contract clauses
### C1 — `pixelformat_for_profile()` helper (re-introduce from reverted commit A)
**New files**: `src/codec.h`, `src/codec.c` (identical to reverted commit A).
Helper:
```c
unsigned int pixelformat_for_profile(VAProfile profile);
```
Maps `VAProfile*``V4L2_PIX_FMT_*_SLICE/FRAME`. Returns 0 for unhandled.
Register in `meson.build`: add `'codec.c'` to sources, `'codec.h'` to headers.
### C2 — Wire `object_config->pixelformat` at CreateConfig (re-introduce from reverted commit B)
**Edit**: `src/config.c::RequestCreateConfig` at line ~98.
```diff
+ #include "codec.h"
...
config_object->profile = profile;
config_object->entrypoint = entrypoint;
+ config_object->pixelformat = pixelformat_for_profile(profile);
```
Populates the previously-dead `object_config->pixelformat` field (config.h:46) with the right FOURCC for each profile. CreateContext reads this field.
### C3 — Strip OUTPUT-side lifecycle from `RequestCreateSurfaces2`
**Edit**: `src/surface.c::RequestCreateSurfaces2`.
Remove:
- Lines 173-232 (OUTPUT format S_FMT + teardown branch + last_output_*).
- Lines 237-274 (`driver_data->video_format` probe — moves to CreateContext).
- Lines 276-280 (G_FMT CAPTURE — moves to CreateContext).
- Lines 282-326 (destination_* computation — moves to CreateContext).
- Lines 305-313 (cap_pool_init — moves to CreateContext).
- Inner per-surface destination_* fill at lines 340-359 (deferred to CreateContext walk).
Keep:
- Line 135-153: function entry, declarations.
- Line 234-235: `VA_RT_FORMAT_YUV420` validation.
- Lines 328-377: per-surface object_heap_allocate + minimal field init. Specifically: `width`, `height`, `status = VASurfaceReady`, `current_slot = NULL`, etc. Skip the destination_* fill (do it at CreateContext).
Result: `RequestCreateSurfaces2` becomes ~60 LOC (was ~250 LOC). Pure ID allocation + per-surface bookkeeping.
### C4 — Move OUTPUT-side lifecycle into `RequestCreateContext`
**Edit**: `src/context.c::RequestCreateContext`.
Current flow (post-revert):
```c
config_object = CONFIG(driver_data, config_id); // line 71
...
request_pool_init(&driver_data->output_pool, ...); // line 103
...
v4l2_set_controls(... H264_DECODE_MODE/START_CODE ...); // line 142+
v4l2_set_controls(... HEVC_DECODE_MODE/START_CODE ...); // line 164+
switch (config_object->profile) { // line 205
case VAProfileH264*/HEVCMain: h264_start_code = true;
default: h264_start_code = false;
}
v4l2_set_stream(... output_type, true); // line 219 STREAMON
v4l2_set_stream(... capture_type, true); // line 225 STREAMON
```
New flow:
```c
config_object = CONFIG(driver_data, config_id);
// β: derive OUTPUT pixelformat from the bound config's profile.
unsigned int pixelformat = config_object->pixelformat;
if (pixelformat == 0) {
// Defensive: shouldn't happen because CreateConfig rejects unhandled profiles,
// but Phase 5 review insists on never deref-ing 0 from helper return.
status = VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
goto error;
}
unsigned int output_type_pre = v4l2_type_video_output(true); // assume mplane; CAPTURE probe sets true/false below
rc = v4l2_set_format(driver_data->video_fd, output_type_pre, pixelformat,
picture_width, picture_height);
if (rc < 0) { status = VA_STATUS_ERROR_OPERATION_FAILED; goto error; }
// β: probe CAPTURE format (moved from surface.c:237-274).
if (driver_data->video_format == NULL) {
/* SUNXI_TILED_NV12 / NV12 probe identical to surface.c logic */
...
driver_data->video_format = ...;
}
video_format = driver_data->video_format;
output_type = v4l2_type_video_output(video_format->v4l2_mplane);
capture_type = v4l2_type_video_capture(video_format->v4l2_mplane);
// β: G_FMT CAPTURE to get sizes (moved from surface.c:276-280).
rc = v4l2_get_format(driver_data->video_fd, capture_type,
&format_width, &format_height,
destination_bytesperlines, destination_sizes, NULL);
if (rc < 0) { ... }
// β: cap_pool_init (moved from surface.c:305-313).
if (!driver_data->capture_pool.initialized) {
unsigned int pool_count = surfaces_count > MIN_CAP_POOL ?
surfaces_count : MIN_CAP_POOL;
rc = cap_pool_init(&driver_data->capture_pool,
driver_data->video_fd, capture_type,
pool_count, video_format->v4l2_buffers_count);
if (rc < 0) { ... }
}
// β: walk surface_ids and fill destination_* (moved from surface.c).
unsigned int destination_planes_count = video_format->planes_count;
if (video_format->v4l2_buffers_count == 1) {
destination_sizes[0] = destination_bytesperlines[0] * format_height;
for (j = 1; j < destination_planes_count; j++)
destination_sizes[j] = destination_sizes[0] / 2;
}
for (i = 0; i < surfaces_count; i++) {
struct object_surface *surface_object = SURFACE(driver_data, surfaces_ids[i]);
if (surface_object == NULL) { ... }
surface_object->destination_planes_count = destination_planes_count;
surface_object->destination_buffers_count = video_format->v4l2_buffers_count;
/* Fill destination_offsets[], destination_sizes[], destination_bytesperlines[]
per video_format->v4l2_buffers_count branch (1 vs n). Same logic as surface.c
lines 340-359. */
...
}
// existing flow continues:
request_pool_init(...);
v4l2_set_controls(... H264_DECODE_MODE/START_CODE ...);
v4l2_set_controls(... HEVC_DECODE_MODE/START_CODE ...);
switch (config_object->profile) { ... h264_start_code ... }
v4l2_set_stream(... output_type, true);
v4l2_set_stream(... capture_type, true);
```
Predicted LOC: ~120 LOC added to context.c (mostly transplanted from surface.c).
### C5 — Resolution-change handling moves to DestroyContext
The existing DestroyContext at context.c:276-307 already does:
- STREAMOFF both queues
- REQBUFS(0) on OUTPUT
- cap_pool_destroy
No change needed — that path is already a complete teardown. mpv's probe-then-real pattern works because each context cycle does its own setup + teardown.
### C6 — `surface.h` per-surface fields stay the same
The `object_surface` struct still has `destination_planes_count`, `destination_buffers_count`, `destination_sizes`, `destination_offsets`, `destination_bytesperlines`, `destination_data`, etc. These are now filled at CreateContext time instead of CreateSurfaces2 time, but the field layout doesn't change. No header edit needed.
### C7 — No `last_output_*` fields needed in driver_data
β doesn't need the `last_output_width/height/pixelformat` gate because the lifecycle is CreateContext-centric. The existing `last_output_width/height` fields in driver_data become dead.
**Two sub-options** for what to do with the dead fields:
a) Leave them in `request_data` with a `/* DEAD post-iter5b-β; was used by pre-β CreateSurfaces2 gate */` comment.
b) Delete them entirely.
Recommend **(b) delete**: cleaner, the dev-process discipline favors removing dead code. Also delete `surface_reset_format_cache` at surface.c:75-79 which assigns to these dead fields.
### C8 — Phase 7 verification matrix
Re-run iter5 Phase 3 sweep. Expected hash matrix:
| Codec | Pre-β libva | Expected post-β libva | kdirect (anchor) |
|---|---|---|---|
| H.264 | `71ac099b…` keyframe partial | unchanged (Bug 4 deferred) | `1e7a0bc9…` |
| HEVC | `06b2c5a0…` all-zero | **`9340b832…`** | `9340b832…` |
| VP9 | `06b2c5a0…` all-zero | **`4f1565e8…`** | `4f1565e8…` |
| MPEG-2 | `19eefbf4…` worked | `19eefbf4…` (unchanged) | `19eefbf4…` |
| VP8 | `06b2c5a0…` all-zero | **`136ce5cb…`** | `136ce5cb…` |
PASS criteria same as Phase 4 v1: 3 codecs unblocked (HEVC + VP9 + VP8) + MPEG-2 unchanged + H.264 unchanged.
Plus an extra robustness check: **re-run the sweep within the same backend process** (or back-to-back ffmpeg invocations) to confirm no SIGSEGV. The α' failure mode was multi-CreateSurfaces2 mid-stream; β should be robust because each session is a self-contained CreateContext lifecycle.
### C9 — Phase 8 close criteria
- C8 verification 3/3 (HEVC/VP9/VP8) libva == kdirect.
- No regression on H.264 keyframe + MPEG-2.
- `phase8_iteration5b_close.md` documenting commits + verification.
- Memory updates:
- Extend `feedback_unconditional_codec_state.md` with the pixelformat instance.
- Add `feedback_va_lifecycle_profile_threading.md` describing the VA-API CreateSurfaces-has-no-config_id observation and why CreateContext-centric lifecycle (β) is the right architecture for v4l2-request style backends.
- Add (or extend) a rule about **trust the Phase 2 reviewer's architectural recommendation when the smaller alternative carries lifecycle ordering risk** — the iter5b α' failure cost half a day's wallclock that β would have avoided.
- Campaign scoreboard: "5/5 with 4 direct + 1 (H.264 inter) partial" — same goal as iter5b-α' tried to reach.
## Risk register
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| `video_format` probe at CreateContext fails because CAPTURE format isn't queryable until OUTPUT format is set first | LOW | MEDIUM | The S_FMT(OUTPUT) call happens BEFORE the CAPTURE probe in C4's flow; hantro's CAPTURE-derivation happens at OUTPUT S_FMT time. Order is correct. |
| Stripping CreateSurfaces2's logic accidentally drops state that consumers expect | LOW | MEDIUM | Per VA-API spec, surfaces between CreateSurfaces and CreateContext are inert. Consumers should not query `destination_*` before CreateContext binds them. Phase 5 reviewer verifies. |
| `cap_pool_init` at CreateContext when surfaces_count==0 (ffmpeg vaapi-copy might pass 0) | LOW | LOW | Existing code at surface.c:306 already handles this via `surfaces_count > MIN_CAP_POOL ?` clamp. Transplanted as-is. |
| Multiple CreateContext calls in the same session (e.g., ffmpeg-vaapi re-init with different config) | MEDIUM | LOW | First CreateContext: clean setup. Subsequent CreateContext: skips `video_format` and `cap_pool_init` if already initialized (existing `if (!driver_data->video_format)` and `if (!driver_data->capture_pool.initialized)` checks transplant). For multi-context resolution OR profile change, mpv/ffmpeg patterns call DestroyContext between contexts, which already STREAMOFF+REQBUFS(0)+cap_pool_destroy. Phase 7 verifies. |
| Resolution-change within a single context (no DestroyContext) | LOW | LOW | Not in scope for iter5b-β; consumers that do this would need a separate iteration (the TODO comment's "for the next iteration after this one" target). |
| Failing CreateContext leaves V4L2 state half-set (S_FMT done but cap_pool_init failed) | LOW | MEDIUM | Error paths goto `error:` which frees the context. Need to also REQBUFS(0) + STREAMOFF if partial state was created. Phase 5 reviewer verifies the error-recovery branch. |
| `picture_width`/`picture_height` from VA-API differ from surface width/height | LOW | LOW | VA-API spec: surfaces and context picture dimensions should match for hardware decode. If they differ, S_FMT uses picture_*, which is the right kernel-facing dimension. |
## Phase 5 review concerns to invite
When Phase 4 v2 lands and Phase 5 reviewer engages, attack:
1. **Error-recovery branches in CreateContext** — if S_FMT succeeds but cap_pool_init fails, do we correctly REQBUFS(0) and STREAMOFF? Walk the `goto error:` flow.
2. **Multi-context with shared driver_data**`video_format` and `cap_pool` are driver_data-level (shared across contexts). First CreateContext sets them up; subsequent CreateContexts skip the init. Is the skip-logic correct? What if the second context has a different resolution?
3. **Surface ID lookup at CreateContext** — what if a `surfaces_ids[i]` doesn't resolve via `SURFACE()` macro (already-destroyed)? Existing code path handles NULL via error return. Phase 5 verifies.
4. **picture_width vs surface->width/height**`surface_object` stores width/height from CreateSurfaces2 (the surface's own dimensions). CreateContext uses picture_width/picture_height (the context's picture dimensions). What if they differ? Use picture_* for S_FMT (kernel state); fill surface_object->{width,height} with picture_*.
5. **`destination_data` fill ordering** — currently filled at surface_bind_slot (BeginPicture time, with the cap_pool_slot's actual mmap pointer). β's CreateContext walks should fill ONLY format-uniform fields (sizes, offsets, bytesperlines, planes_count, buffers_count); destination_data stays NULL until BeginPicture binds a slot. Phase 5 verifies the fields are correctly partitioned: per-format-uniform (CreateContext) vs per-slot (BeginPicture).
6. **Backwards compat** — the libva-multiplanar (ohm) campaign uses the same fork. ohm tests H.264 + MPEG-2 only. Both worked pre-iter5b (with the H264_SLICE hardcode happenstance-correct for H.264, and hantro's tolerance for MPEG-2). Post-β: H.264 path on ohm hantro gets explicit H264_SLICE; MPEG-2 path gets explicit MPEG2_SLICE. Both should still work or improve. Phase 5 verifies via ohm's Phase 3 anchors if available.
## Predicted iter5b-β cadence
- Phase 5 review: ~30 min (medium surface).
- Phase 6 implementation:
- Re-introduce commit A + commit B (mostly identical to iter5b commits A + B): ~15 min.
- Commit C: strip surface.c — ~30 min careful editing.
- Commit D: build out context.c — ~45 min careful editing.
- Build + install on fresnel — ~10 min.
- Phase 7 verification:
- Re-run sweep — ~5 min.
- Multi-codec back-to-back stress test — ~10 min.
- Phase 8 close — ~30 min.
Total: 2-3 hours wallclock.
## Phase 4 v2 → Phase 5 handoff
This document supersedes `phase4_iter5b_plan.md` (the rejected α' plan). Phase 5 reviewer reads this fresh; the prior review (`phase5_iter5b_review.md`) is historical context only.
Most relevant context for the reviewer:
- `phase7_iter5b_verification.md` — the empirical failure mode that mandates β.
- `phase2_iter5b_situation.md` — original lifecycle analysis. Recommended β; this Phase 4 v2 implements that recommendation.
- `phase0_iter5_loopback.md` — root cause of Bug 2 (surface.c:173 hardcode).
- `phase3_iter5_baseline.md` — pre-fix per-codec hash matrix (anchors for Phase 7).
Reviewer should specifically verify the C4 logic transplant from surface.c to context.c — the per-surface destination_* walk has tricky single-buffer vs multi-buffer-per-frame logic that mustn't break.