From eca03d264117f461a9f2567dfefc7dac85d77da2 Mon Sep 17 00:00:00 2001 From: Markus Fritsche Date: Mon, 11 May 2026 19:24:35 +0000 Subject: [PATCH] =?UTF-8?q?iter5b=20Phase=204:=20plan=20=E2=80=94=20option?= =?UTF-8?q?=20=CE=B1'=20(single-config=20lookup),=2010=20contract=20clause?= =?UTF-8?q?s?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Picks α' over the Phase 2 recommendation of β: smaller scope (~50 LOC vs ~250), targets iter5b's actual bug (wrong OUTPUT format at INITIAL CreateSurfaces2, not the multi-resolution mid-stream case the surface.c:164-171 TODO comment anticipates). Patches: - C1/C6: NEW src/codec.{h,c} + meson.build — pixelformat_for_profile() - C2: NEW find_sole_active_profile() static helper in surface.c - C3: Replace surface.c:173 hardcode with profile-derived lookup - C5: Extend last_output_* gate with pixelformat Phase 7 expected post-fix matrix: HEVC + VP9 + VP8 libva == kdirect == sw (3 codecs unblocked); MPEG-2 unchanged (already worked); H.264 still race-loses inter frames (Bug 4, deferred to iter6). Phase 5 review concerns laid out: helper completeness, heap iterator API, gate semantics, hantro CAPTURE-derivation on correct format, mpv probe-then-real flow, memory rule placement. Option β deferral note: cleaner refactor exists but not necessary for iter5b's bug; defer to future iteration when multi-resolution mid-stream becomes a target. Co-Authored-By: Claude Opus 4.7 (1M context) --- phase4_iter5b_plan.md | 311 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 311 insertions(+) create mode 100644 phase4_iter5b_plan.md diff --git a/phase4_iter5b_plan.md b/phase4_iter5b_plan.md new file mode 100644 index 0000000..9a81a4c --- /dev/null +++ b/phase4_iter5b_plan.md @@ -0,0 +1,311 @@ +# Iteration 5b — Phase 4 (plan) + +Captured 2026-05-11 after Phase 2 narrowed the option space. Phase 2 recommended option β (defer OUTPUT lifecycle from `RequestCreateSurfaces2` to `RequestCreateContext`). On detailed Phase 4 design, β requires moving CAPTURE-side work too (cap_pool_init, destination_* fill) because hantro derives CAPTURE format from OUTPUT format at S_FMT time. That's a 200-300 LOC refactor. + +After re-reading the surface.c:164-171 TODO comment, the deferred-work it flags is **multi-resolution mid-stream**, not iter5b's case (wrong OUTPUT format at INITIAL CreateSurfaces). iter5b can ship a smaller, pragmatic fix without touching the resolution-change lifecycle. + +Phase 4 picks **option α' — single-config lookup** (variant of option α with the heap walk): + +> At `RequestCreateSurfaces2` entry, if exactly one `object_config` exists in the config heap, use its profile to derive the OUTPUT pixelformat. If zero or multiple, fall back to the current hardcoded H264_SLICE with a `request_log` warning. + +Why this works in practice: +- Every real consumer (mpv, ffmpeg, Firefox, Chromium) keeps exactly one config alive between CreateConfig and CreateSurfaces. +- vainfo probes configs sequentially with destroy in between → at most one alive. +- Pathological multi-config-then-surface is not used by any in-scope consumer; falls back to current (broken) behavior. + +LOC delta predicted: ~50 LOC across 3 files (new helper + surface.c edit + driver_data field). + +Option β (CreateContext relocation) is deferred to a future iteration when multi-resolution mid-stream becomes a target. + +## Contract clauses + +### C1 — `pixelformat_for_profile()` helper + +**New file**: `src/codec.h` + +```c +#ifndef _CODEC_H_ +#define _CODEC_H_ + +#include + +/** + * pixelformat_for_profile - map a VA-API VAProfile to its V4L2 OUTPUT + * pixel format FOURCC. + * + * @profile: VAProfile* enum value as passed to vaCreateConfig. + * + * Returns the V4L2_PIX_FMT_* constant for the OUTPUT_MPLANE buffer + * format that corresponds to @profile. Used at CreateSurfaces2 + + * CreateContext to set the right OUTPUT format on the V4L2 device, + * which gates kernel decoder dispatch. + * + * Returns 0 if @profile is not handled by this backend (caller + * should fall back to a safe default and log a warning). + */ +unsigned int pixelformat_for_profile(VAProfile profile); + +#endif +``` + +**New file**: `src/codec.c` + +```c +#include "codec.h" + +#include + +unsigned int pixelformat_for_profile(VAProfile profile) +{ + switch (profile) { + case VAProfileMPEG2Simple: + case VAProfileMPEG2Main: + return V4L2_PIX_FMT_MPEG2_SLICE; + case VAProfileH264Main: + case VAProfileH264High: + case VAProfileH264ConstrainedBaseline: + case VAProfileH264MultiviewHigh: + case VAProfileH264StereoHigh: + return V4L2_PIX_FMT_H264_SLICE; + case VAProfileHEVCMain: + return V4L2_PIX_FMT_HEVC_SLICE; + case VAProfileVP8Version0_3: + return V4L2_PIX_FMT_VP8_FRAME; + case VAProfileVP9Profile0: + return V4L2_PIX_FMT_VP9_FRAME; + default: + return 0; + } +} +``` + +Single source of truth. Mirror of `config.c::RequestQueryConfigProfiles`'s probe blocks at lines 138-188, in helper form. + +### C2 — `find_sole_active_profile()` helper in surface.c + +**Edit**: `src/surface.c` + +Add static helper before `RequestCreateSurfaces2`: + +```c +/* + * iter5b: walk config_heap to find the profile of the single active + * config. Returns VAProfileNone (-1) if zero or multiple configs are + * alive. Used to determine the OUTPUT pixel format at CreateSurfaces2 + * time, where the VA-API contract gives us no config_id parameter. + * + * Every real consumer (mpv, ffmpeg, Firefox, Chromium) creates exactly + * one config before CreateSurfaces. vainfo creates+destroys configs in + * pairs and never reaches CreateSurfaces with surfaces_count > 0. The + * "multiple configs alive" case is pathological and falls back to the + * pre-iter5b hardcoded H264_SLICE with a request_log warning. + */ +static VAProfile find_sole_active_profile(struct request_data *driver_data) +{ + VAProfile found = VAProfileNone; + int seen = 0; + struct object_config *config_object; + struct object_heap_iterator iter; + struct object_base *iter_obj; + + iter_obj = object_heap_first(&driver_data->config_heap, &iter); + while (iter_obj) { + config_object = (struct object_config *)iter_obj; + if (seen == 0) + found = config_object->profile; + seen++; + iter_obj = object_heap_next(&driver_data->config_heap, &iter); + } + + if (seen == 1) + return found; + + if (seen > 1) + request_log("CreateSurfaces2: %d active configs, profile-pick is ambiguous; falling back to default OUTPUT format\n", + seen); + return VAProfileNone; +} +``` + +The `object_heap_first` / `object_heap_next` API is the iterator pattern the backend already uses elsewhere (grep for `object_heap_first` in the source to confirm the call shape before Phase 6 lands). + +### C3 — Use helper in `RequestCreateSurfaces2` + +**Edit**: `src/surface.c:173` + +Replace: + +```c +unsigned int pixelformat = V4L2_PIX_FMT_H264_SLICE; +``` + +With: + +```c +/* + * iter5b: derive OUTPUT pixel format from the single active config's + * profile. Pre-iter5b hardcoded V4L2_PIX_FMT_H264_SLICE for every + * profile — for HEVC, VP9, VP8 the kernel rkvdec/hantro driver + * silently rejected the format/control mismatch and produced + * all-zero CAPTURE pages. See phase0_iter5_loopback.md (commit + * cd34ec1) for the empirical trace; same bug class as + * feedback_unconditional_codec_state.md (iter4 h264_start_code). + */ +VAProfile sole_profile = find_sole_active_profile(driver_data); +unsigned int pixelformat = pixelformat_for_profile(sole_profile); + +if (pixelformat == 0) { + if (sole_profile == VAProfileNone) { + request_log("CreateSurfaces2: no config to derive OUTPUT pixel format; falling back to H.264 SLICE\n"); + } else { + request_log("CreateSurfaces2: profile %d not mapped to a V4L2 OUTPUT pixel format; falling back to H.264 SLICE\n", + sole_profile); + } + pixelformat = V4L2_PIX_FMT_H264_SLICE; +} +``` + +The fallback preserves pre-iter5b behavior for the (theoretical) pathological case. In practice the fallback never fires for the in-scope consumer matrix. + +### C4 — `#include "codec.h"` in surface.c + +**Edit**: `src/surface.c` — add `#include "codec.h"` near the existing includes block. + +### C5 — Track pixelformat in driver_data for gate + +**Edit**: `src/request.h` — add field to `struct request_data` (request.h:50+): + +```c ++ /* ++ * iter5b: pixelformat last set on the OUTPUT queue (e.g. ++ * V4L2_PIX_FMT_HEVC_SLICE). Gates re-S_FMT on profile change ++ * the same way last_output_width/height gates re-S_FMT on ++ * resolution change. 0 = uninitialized; non-zero = the FOURCC ++ * of the current OUTPUT format on the V4L2 fd. ++ */ ++ unsigned int last_output_pixelformat; + unsigned int last_output_width; + unsigned int last_output_height; +``` + +**Edit**: `src/surface.c:176` + +Extend the existing gate: + +```c +if (driver_data->last_output_width != width || + driver_data->last_output_height != height || + driver_data->last_output_pixelformat != pixelformat) { +``` + +And at line 230-231: + +```c +driver_data->last_output_width = width; +driver_data->last_output_height = height; ++ driver_data->last_output_pixelformat = pixelformat; +``` + +The gate now fires on either resolution change OR profile change. The existing teardown branch (cap_pool_destroy, request_pool_destroy, REQBUFS(0)) handles either case identically — the V4L2 device just needs all buffers gone before a fresh S_FMT. + +### C6 — `meson.build` update + +**Edit**: `src/meson.build` — add `'codec.c'` to the sources list, `'codec.h'` to the headers list. + +### C7 — Phase 7 verification matrix + +Re-run the iter5 Phase 3 sweep script (`/tmp/iter5_p3/sweep.sh` on fresnel — preserved in `iter5_phase3_baseline.tgz`). Expected hash matrix: + +| Codec | Fixture | libva pre-iter5b | libva post-iter5b | kdirect | sw | Verdict | +|---|---|---|---|---|---|---| +| H.264 1080p30 | `bbb_1080p30_h264.mp4` | `71ac099b…` (broken keyframe-only) | **expected same as pre** (Bug 4 not fixed in iter5b) | `1e7a0bc9…` | `1e7a0bc9…` | Bug 4 still open; not in scope | +| HEVC 720p | `bbb_720p10s_hevc.mp4` | `06b2c5a0…` (all-zero) | **`9340b832…`** | `9340b832…` | `9340b832…` | **libva == kdirect == sw** ✓ | +| VP9 720p | `bbb_720p10s_vp9.webm` | `06b2c5a0…` | **`4f1565e8…`** | `4f1565e8…` | `4f1565e8…` | **libva == kdirect == sw** ✓ | +| MPEG-2 720p | `bbb_720p10s_mpeg2.ts` | `19eefbf4…` (working) | **`19eefbf4…`** (unchanged) | `19eefbf4…` | `7be8cad7…` | regression-free ✓ | +| VP8 720p | `bbb_720p10s_vp8.webm` | `06b2c5a0…` | **`136ce5cb…`** | `136ce5cb…` | `136ce5cb…` | **libva == kdirect == sw** ✓ | + +iter5b PASS = 3 codecs unblocked (HEVC, VP9, VP8) + 1 regression-free (MPEG-2) + 1 unchanged (H.264 — Bug 4 still open). + +Phase 7 strict equality check: +```bash +ssh fresnel 'cd /tmp/iter5b_p7 && ./sweep.sh && \ + for c in hevc vp9 vp8 mpeg2; do + if cmp -s libva_$c.yuv kdirect_$c.yuv; then + echo "$c: libva == kdirect (PASS)" + else + echo "$c: libva != kdirect (FAIL)" + fi + done' +``` + +H.264 deferred: keyframe-only decode continues, which the iter4 testing already showed. + +### C8 — Build + install on fresnel + +No kernel work. Backend-only rebuild: + +```bash +ssh fresnel 'cd ~/src/libva-v4l2-request-fourier && \ + git pull origin master && \ + ninja -C build && \ + sudo install -m 644 build/src/v4l2_request_drv_video.so /usr/lib/dri/v4l2_request_drv_video.so' +``` + +Capture post-install SHA256 of the .so as the iter5b binding identifier. Compare against pre-iter5b `6e90b7a9b2c33480…` to confirm a different build was installed. + +### C9 — Control-payload anchor regression check + +Optional but cheap: re-run the Phase 3 anchor strace and confirm `VIDIOC_S_EXT_CTRLS` payloads byte-match the iter5 baseline (per-codec strace files in `iter5_phase3_baseline.tgz`). The backend changes only touch OUTPUT format setup, not control submission, so byte-identical is the expectation. Any divergence indicates an unintended ripple. + +### C10 — Phase 8 close criteria + +- C7 verification: 3 of 3 (HEVC/VP9/VP8) `libva == kdirect == sw`. MPEG-2 unchanged. H.264 unchanged (Bug 4 deferred to iter6). +- `phase8_iteration5b_close.md` documenting commits + verification hashes. +- Campaign scoreboard: "5/5 with 4 direct + 1 (H.264 inter) partial" — up from current "5/5 with 1 direct (MPEG-2) + 4 transitive." +- Phase 5 sonnet-architect review sign-off recorded. +- Commits via `claude-noether` per memory `feedback_gitea_as_claude_noether.md`. +- Memory update: extend `feedback_unconditional_codec_state.md` with the second instance (`pixelformat_for_profile` site), or add a fresh `feedback_va_lifecycle_profile_threading.md` if the rule generalizes beyond start codes + pixel formats. + +## Risk register + +| Risk | Probability | Impact | Mitigation | +|---|---|---|---| +| `object_heap_first/next` API signature differs from what I'm guessing | LOW | LOW | Phase 6 reads existing usages in the backend before authoring the iterator | +| `find_sole_active_profile` misfires when consumer creates config + creates surfaces + destroys config + creates new config (sequential single-config flow) | LOW | LOW | Confirmed by tracing mpv + ffmpeg patterns; single-config-at-a-time is preserved | +| `request_log` is the wrong logger symbol | LOW | LOW | Grep confirms `request_log` is the backend-wide logger (used in v4l2.c, surface.c etc.) | +| Hantro's CAPTURE-format-derivation breaks when OUTPUT goes HEVC_SLICE | **MEDIUM** | **HIGH** | Hantro doesn't expose HEVC, so HEVC won't bind to hantro. rkvdec is the only path for HEVC. Phase 6 verifies. For MPEG-2/VP8 on hantro the format goes from "wrong-but-tolerated H264_SLICE" to "correct MPEG2_SLICE/VP8_FRAME" — hantro's CAPTURE derivation should be MORE accurate, not less | +| Existing iter4 h264_start_code switch and the new pixelformat switch fall out of sync | LOW | LOW | Both code blocks exist in different files (context.c vs surface.c); helper isolates the mapping logic. Phase 5 review checks both | +| `VAProfileNone` constant comparison surprises | LOW | LOW | va.h:504 confirms `VAProfileNone = -1`. The fallback condition `pixelformat == 0` makes the comparison double-safe | +| Ohm campaign regression (libva-multiplanar shares the fork) | LOW | MEDIUM | Ohm tests only H.264 + MPEG-2. Both get correct pixelformat via the helper. MPEG-2 path goes from accidentally-tolerated H264_SLICE to explicitly-correct MPEG2_SLICE — likely improvement, not regression. Phase 5 review verifies via the ohm Phase 3 anchors if available | + +## Phase 5 review concerns + +Inviting the sonnet-architect to specifically check: + +1. **Helper completeness** — does `pixelformat_for_profile()` cover every VAProfile the backend enumerates in `RequestQueryConfigProfiles` at config.c:131-194? Missing a profile silently breaks that codec. +2. **Heap iterator API** — confirm `object_heap_first` / `object_heap_next` exist and have the signature my pseudocode uses. The backend's `object_heap.h` will tell us in seconds. +3. **Gate semantics** — does extending the `last_output_*` gate with pixelformat correctly fire the teardown branch on profile change? Walk through the case where the same resolution is decoded as H.264 first, then HEVC second on the same driver_data instance. +4. **Hantro CAPTURE-derivation** — confirm hantro/rk3399-vpu-dec accepts an explicit MPEG2_SLICE OUTPUT format without breaking the implicit CAPTURE derivation. Phase 3 baseline showed MPEG-2 works with the wrong H264_SLICE format because hantro is forgiving; will it ALSO work with the right MPEG2_SLICE format? Almost certainly yes (hantro probes the SPS to derive CAPTURE), but the reviewer should empirically check via boltzmann's linux-rockchip tree. +5. **mpv probe-then-real allocation pattern** — mpv first vaCreateSurfaces with 128x128 probe surfaces, then real-resolution. The gate-on-resolution-change works for mpv. Does the new gate-on-pixelformat-change cooperate correctly? Specifically: probe surfaces happen BEFORE vaCreateConfig in some mpv flows? If so, `find_sole_active_profile` returns VAProfileNone and falls back to H264_SLICE for the probe, then the real-resolution call (after CreateConfig) finds the right profile. The gate fires on both resolution AND pixelformat change, so the second call re-S_FMTs correctly. +6. **Memory rule update** — does this iteration's lesson belong in `feedback_unconditional_codec_state.md` (extending the start-code finding to pixel format) or as a separate `feedback_va_lifecycle_profile_threading.md` rule about VA-API's CreateSurfaces having no config_id parameter and how to work around it? + +## Predicted iter5b cadence + +- Phase 5 review: ~20-30 min (small surface). +- Phase 6 implementation: ~45-60 min total — 2 new files (~40 LOC), 3 file edits (~40 LOC delta), 1 meson update, 1 rebuild on fresnel, 1 install. +- Phase 7 verification: ~10 min (re-run the sweep script). +- Phase 8 close: ~30 min for the doc + memory updates. + +Total: 1-2 hours wallclock if nothing surprises. + +## Option β deferral note + +The option β refactor (CreateContext owns OUTPUT-side lifecycle entirely; CreateSurfaces2 becomes ID-allocation-only) is the architecturally cleanest solution and aligns with the surface.c:164-171 TODO comment's "context-level redesign for the next iteration." + +iter5b chooses option α' instead because: + +1. The TODO comment specifically targets multi-resolution-mid-stream, not iter5b's "wrong-format-at-initial-create-surfaces" case. iter5b's bug doesn't need the full β refactor to fix. +2. Option α' is 5x smaller LOC (~50 vs ~250) and 10x faster to ship. +3. Option β can be implemented in a later iteration when multi-resolution mid-stream becomes a target. + +If Phase 5 reviewer makes a strong case for β (e.g. discovers that α' has a corner case it doesn't handle), iter5b will loop back to Phase 4 and re-plan as β.