# Iteration 5b — Phase 5 review v2 (of option-β plan) Phase 5 architect review of `phase4_iter5b_plan_v2.md` by sonnet-architect, 2026-05-12. Second Phase 5 pass for iter5b after the first Phase 5 cleared α' (which then failed at Phase 7). ## Executive summary Option β's architectural choice is correct — the α' failure mode is genuinely eliminated. CreateContext is the right lifecycle site because `config_id` is unambiguously available there. **But two Critical blocking issues exist** that the plan glossed over: (1) `video_format == NULL` guard at context.c:64-66 rejects every first β CreateContext; (2) DestroyContext lacks `request_pool_destroy`, leaving stale output_pool state across context cycles. Stripping surface.c (C3) without fixing both makes the second CreateContext fail in a worse way than α' did. Plus three Important findings and four Non-Amendment confirmations. ## Critical findings (BLOCKING Phase 6) ### CRIT-1 — `video_format == NULL` guard at context.c:64-66 will reject every first β CreateContext In current code: ```c video_format = driver_data->video_format; // context.c:64 if (video_format == NULL) // context.c:65 return VA_STATUS_ERROR_OPERATION_FAILED; // context.c:66 output_type = v4l2_type_video_output(video_format->v4l2_mplane); // :68 capture_type = v4l2_type_video_capture(video_format->v4l2_mplane); // :69 ``` Pre-β: CreateSurfaces2 (surface.c:253) sets `driver_data->video_format` before CreateContext runs. Under β, that probe MOVES into CreateContext. First CreateContext call: `driver_data->video_format == NULL` → early return. **Required fix**: Remove the early-return at lines 65-66. Move CAPTURE-format probe (which sets `driver_data->video_format`) to the top of CreateContext. Derive `output_type`/`capture_type` after the probe. ### CRIT-2 — `request_pool_destroy` missing from DestroyContext: second CreateContext re-uses stale pool DestroyContext does: - v4l2_set_stream(OUTPUT, false) (line 276) - v4l2_set_stream(CAPTURE, false) (line 280) - RequestDestroySurfaces (line 286) - v4l2_request_buffers(OUTPUT, 0) (line 296) — REQBUFS(0) frees kernel-side OUTPUT buffers - cap_pool_destroy (line 306-307) — frees CAPTURE side - surface_reset_format_cache (line 319) **Missing**: `request_pool_destroy(&driver_data->output_pool)`. Empirical grep: ``` src/request_pool.c:93: request_pool_destroy(pool); # internal error path src/request_pool.c:97:void request_pool_destroy(...) # definition src/request.c:361: request_pool_destroy(...) # Terminate src/surface.c:220: request_pool_destroy(...) # resolution-change in CreateSurfaces2 ``` Only surface.c:220 calls it from a per-session path — and β strips that line. `request_pool_init` at request_pool.c:35-36 has `if (pool->initialized) return 0;`. Under β: 1. First CreateContext: request_pool_init succeeds, pool->initialized=true. 2. DestroyContext: REQBUFS(0) frees kernel buffers BUT pool->initialized stays true, slot mmap pointers stay populated. 3. Second CreateContext: request_pool_init sees initialized=true, returns 0 — pool retains stale slot indices pointing at kernel buffers that no longer exist. 4. Next QBUF on any slot → EINVAL. **Required fix**: Add `request_pool_destroy(&driver_data->output_pool)` to DestroyContext BEFORE `v4l2_request_buffers(OUTPUT, 0)`. Order matters: munmap (request_pool_destroy) first, then REQBUFS(0) (kernel buffer free). Mirror surface.c:219-220's existing `if (output_pool.initialized) request_pool_destroy(...)` shape. C3 (surface.c strip) and CRIT-2 fix (DestroyContext add) MUST land in the same commit or strictly sequenced. ## Important findings ### IMP-1 — `output_type` mplane assumption: hardcoded `v4l2_type_video_output(true)` is wrong for SUNXI_TILED_NV12 Plan's C4 opens with `output_type_pre = v4l2_type_video_output(true);` to set OUTPUT format before CAPTURE probe. Works on fresnel (NV12 = mplane=true) but fails on cedrus/Allwinner targets where SUNXI_TILED_NV12 has v4l2_mplane=false. **Fix**: Reorder — probe CAPTURE first (the find_format calls don't require OUTPUT to be set), derive output_type from `video_format->v4l2_mplane`, THEN S_FMT(OUTPUT). Eliminates the `output_type_pre` hack. Empirical correctness: `v4l2_find_format` enumerates available formats without requiring any S_FMT first — confirmed by surface.c:238-253 doing exactly that probe. ### IMP-2 — `driver_data->video_format` retained across DestroyContext; `surface_reset_format_cache` becomes dead `video_format` is a pointer into static `formats[]` in video.c, never freed. Skipping the probe on second CreateContext via `if (!driver_data->video_format)` is intentional and correct. But: DestroyContext's `surface_reset_format_cache` call (line 319) and its comment (lines 309-318) reference `last_output_width/height` which C7 deletes. Under β, `surface_reset_format_cache` is empty/dead. **Fix**: C7 should also delete `surface_reset_format_cache` and the call in DestroyContext, plus update the comment to reflect β's CreateContext-centric lifecycle. ### IMP-3 — Error recovery in CreateContext (β): partial state on S_FMT-succeeds + cap_pool_init-fails If S_FMT(OUTPUT) succeeds then cap_pool_init fails, the existing error path (context.c:244-252) frees ids+context_object but leaves OUTPUT format set. cap_pool's own error_cleanup at cap_pool.c:125-137 handles its own REQBUFS(0), so cap_pool failure is self-recovering. OUTPUT S_FMT is not destructive when no buffers exist yet. **Safe but messy.** Phase 6 should add a comment in the error path noting what state remains, so future readers don't think the rollback is complete. ## Non-amendment findings ### NC-1 — α' failure mode eliminated by β: CONFIRMED α' crash: CreateSurfaces2 fires mid-stream → cap_pool_destroy → stale destination_data → SIGSEGV. Under β: CreateSurfaces2 touches no V4L2 device state and allocates no V4L2 buffers. No cap_pool_destroy from CreateSurfaces2. Multi-CreateSurfaces2 mid-stream is harmless. ### NC-2 — Multi-context without intervening DestroyContext: SAFE Second CreateContext sees STREAMON-on-OUTPUT EBUSY → returns OPERATION_FAILED. No silent corruption. Real consumers don't do this anyway. ### NC-3 — Field partitioning for surface_bind_slot: CORRECT Format-uniform fields (`destination_planes_count`, `destination_sizes`, `destination_offsets`, `destination_bytesperlines`, `destination_buffers_count`) move to CreateContext (β). Per-slot fields (`destination_index`, `destination_map[]`, `destination_data[]`) stay at BeginPicture's surface_bind_slot. The split is exactly what surface.h:63-72 documents. Minor: `destination_buffers_count` is also set at surface_bind_slot (line 101). Redundant but harmless (slot->buffers_count matches video_format->v4l2_buffers_count). ### NC-4 — mpv probe-then-real pattern: CORRECT under β with CRIT-2 fix Probe-CreateContext → DestroyContext (with CRIT-2 fix: full pool teardown) → Real-CreateContext. Clean rebuild at the new resolution. Without CRIT-2 fix this path breaks; with CRIT-2 it works. ## Plan-specific concern answers | Concern | Verdict | |---|---| | 1 (Error-recovery branches) | IMP-3 | | 2 (Multi-context shared driver_data) | NC-2 + IMP-2 | | 3 (Surface ID lookup fails) | Handled via standard SURFACE() NULL check + error return | | 4 (picture_width vs surface->width/height) | Pre-existing quirk, β doesn't introduce | | 5 (destination_data fill ordering) | NC-3 | | 6 (ohm backwards compat) | Safe — MPEG-2 gets explicit format that hantro previously substituted; H.264 unchanged | | 7 (CreateContext without DestroyContext between) | NC-2 | | 8 (mpv probe-then-real) | NC-4 (depends on CRIT-2 fix) | | 9 (destination_* transplant correctness) | Mechanical; preserve all 3 branches of `video_format->v4l2_buffers_count` switch | | 10 (CreateConfig error paths) | object_heap_allocate zeroes the struct; if CreateConfig returns ERROR before reaching the pixelformat assignment, the field is 0; CreateContext's `if (pixelformat == 0)` defensive check catches it | ## Summary scorecard | Clause | Status | Finding | |---|---|---| | C1 (codec.h/c helper) | PASS | Identical to reverted commit; no issue | | C2 (CreateConfig wire-up) | PASS | Correct | | C3 (strip surface.c) | CONDITIONAL | Must land with CRIT-2 fix | | C4 (CreateContext new flow) | FAIL: CRIT-1 + IMP-1 | NULL guard removal; probe order before S_FMT | | C5 (DestroyContext: "no change") | **FAIL: CRIT-2** | Plan said "no change needed" — wrong; must add request_pool_destroy | | C6 (surface.h unchanged) | PASS | No edit needed | | C7 (delete last_output_*) | PASS | Recommend extending to delete surface_reset_format_cache too (IMP-2) | | C8 (Phase 7 verification) | PASS | Same methodology | | C9 (Phase 8 close) | CONDITIONAL | Depends on CRIT fixes | **Phase 6 BLOCKED pending CRIT-1 + CRIT-2 fixes.** ## Required amendments before Phase 6 1. **CRIT-1**: Remove `video_format == NULL` guard at context.c:65-66. Move CAPTURE-format probe to top of CreateContext. Derive output_type/capture_type after probe. 2. **CRIT-2**: Add `request_pool_destroy(&driver_data->output_pool)` to DestroyContext BEFORE `v4l2_request_buffers(OUTPUT, 0)`. Mirror surface.c:219-220's existing pattern. 3. **IMP-1**: Reorder C4 — CAPTURE probe FIRST, then derive output_type from `video_format->v4l2_mplane`, then S_FMT(OUTPUT). Eliminates `output_type_pre` hack. 4. **IMP-2 (folded into C7)**: Delete `surface_reset_format_cache` (function + call in DestroyContext + the stale comment block). Field deletion + function deletion happen together. 5. **C3 + CRIT-2 commit ordering**: Phase 6 must commit C3 (strip surface.c) and CRIT-2 fix (DestroyContext add) together or strictly sequenced. Cannot commit C3 alone — it would leave NO call site for request_pool_destroy in per-session teardown. ## Author re-verification (post-review, 2026-05-12) Per `feedback_review_empirical_over_theoretical.md` Direction 2: ### CRIT-1 — CONFIRMED at code level context.c:64-66 verified by direct read. The early-return guarantees first-CreateContext failure under β. ### CRIT-2 — CONFIRMED at code level Grep of `request_pool_destroy` callsites: only surface.c:220 and request.c:361 (Terminate) plus internal request_pool.c:93. β strips surface.c:220. Without amendment, ZERO per-session callsites. ### IMP-1, IMP-2 — CONFIRMED mplane=false case exists for SUNXI_TILED_NV12 (video.c entries). `surface_reset_format_cache` reads only `last_output_width/height` — fully dead under C7's field deletion. ### Lessons The first Phase 5 review (of α') cleared a plan that then failed Phase 7. The second Phase 5 review (this one) caught two architectural gaps before Phase 6. Empirical-over-theoretical worked: the reviewer read the source to verify "DestroyContext does the right teardown" — the plan claimed it did, but the source disagreed. The author (Phase 4 v2 writer, me) read context.c:276-307 and noted "STREAMOFF + REQBUFS(0) + cap_pool_destroy at context.c:276-307. No change needed." — missing that `request_pool_destroy` is absent. Surface-level read vs full empirical traversal. ### Memory rule reinforcement `feedback_review_empirical_over_theoretical.md` (Direction 2): the reviewer here applied it correctly by enumerating `request_pool_destroy` callsites via grep before claiming the function existed only in surface.c. The author should apply Direction 1 in writing future plans: don't claim "no change needed" without enumerating call/use sites empirically. Worth a fresh memory rule? Maybe: **"Before writing 'no change needed' in a plan, grep for the symbol/function the change would touch. List the call sites. Confirm the change wouldn't break any of them."** Pin this lesson when iter5b closes.