Files
fresnel-fourier/phase4_iter5b_plan.md
T
marfrit eca03d2641 iter5b Phase 4: plan — option α' (single-config lookup), 10 contract clauses
Picks α' over the Phase 2 recommendation of β: smaller scope (~50 LOC
vs ~250), targets iter5b's actual bug (wrong OUTPUT format at INITIAL
CreateSurfaces2, not the multi-resolution mid-stream case the
surface.c:164-171 TODO comment anticipates).

Patches:
- C1/C6: NEW src/codec.{h,c} + meson.build — pixelformat_for_profile()
- C2: NEW find_sole_active_profile() static helper in surface.c
- C3: Replace surface.c:173 hardcode with profile-derived lookup
- C5: Extend last_output_* gate with pixelformat

Phase 7 expected post-fix matrix: HEVC + VP9 + VP8 libva == kdirect
== sw (3 codecs unblocked); MPEG-2 unchanged (already worked);
H.264 still race-loses inter frames (Bug 4, deferred to iter6).

Phase 5 review concerns laid out: helper completeness, heap iterator
API, gate semantics, hantro CAPTURE-derivation on correct format,
mpv probe-then-real flow, memory rule placement.

Option β deferral note: cleaner refactor exists but not necessary
for iter5b's bug; defer to future iteration when multi-resolution
mid-stream becomes a target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 19:24:35 +00:00

16 KiB
Raw Blame History

Iteration 5b — Phase 4 (plan)

Captured 2026-05-11 after Phase 2 narrowed the option space. Phase 2 recommended option β (defer OUTPUT lifecycle from RequestCreateSurfaces2 to RequestCreateContext). On detailed Phase 4 design, β requires moving CAPTURE-side work too (cap_pool_init, destination_* fill) because hantro derives CAPTURE format from OUTPUT format at S_FMT time. That's a 200-300 LOC refactor.

After re-reading the surface.c:164-171 TODO comment, the deferred-work it flags is multi-resolution mid-stream, not iter5b's case (wrong OUTPUT format at INITIAL CreateSurfaces). iter5b can ship a smaller, pragmatic fix without touching the resolution-change lifecycle.

Phase 4 picks option α' — single-config lookup (variant of option α with the heap walk):

At RequestCreateSurfaces2 entry, if exactly one object_config exists in the config heap, use its profile to derive the OUTPUT pixelformat. If zero or multiple, fall back to the current hardcoded H264_SLICE with a request_log warning.

Why this works in practice:

  • Every real consumer (mpv, ffmpeg, Firefox, Chromium) keeps exactly one config alive between CreateConfig and CreateSurfaces.
  • vainfo probes configs sequentially with destroy in between → at most one alive.
  • Pathological multi-config-then-surface is not used by any in-scope consumer; falls back to current (broken) behavior.

LOC delta predicted: ~50 LOC across 3 files (new helper + surface.c edit + driver_data field).

Option β (CreateContext relocation) is deferred to a future iteration when multi-resolution mid-stream becomes a target.

Contract clauses

C1 — pixelformat_for_profile() helper

New file: src/codec.h

#ifndef _CODEC_H_
#define _CODEC_H_

#include <va/va.h>

/**
 * pixelformat_for_profile - map a VA-API VAProfile to its V4L2 OUTPUT
 *                            pixel format FOURCC.
 *
 * @profile: VAProfile* enum value as passed to vaCreateConfig.
 *
 * Returns the V4L2_PIX_FMT_* constant for the OUTPUT_MPLANE buffer
 * format that corresponds to @profile. Used at CreateSurfaces2 +
 * CreateContext to set the right OUTPUT format on the V4L2 device,
 * which gates kernel decoder dispatch.
 *
 * Returns 0 if @profile is not handled by this backend (caller
 * should fall back to a safe default and log a warning).
 */
unsigned int pixelformat_for_profile(VAProfile profile);

#endif

New file: src/codec.c

#include "codec.h"

#include <linux/videodev2.h>

unsigned int pixelformat_for_profile(VAProfile profile)
{
	switch (profile) {
	case VAProfileMPEG2Simple:
	case VAProfileMPEG2Main:
		return V4L2_PIX_FMT_MPEG2_SLICE;
	case VAProfileH264Main:
	case VAProfileH264High:
	case VAProfileH264ConstrainedBaseline:
	case VAProfileH264MultiviewHigh:
	case VAProfileH264StereoHigh:
		return V4L2_PIX_FMT_H264_SLICE;
	case VAProfileHEVCMain:
		return V4L2_PIX_FMT_HEVC_SLICE;
	case VAProfileVP8Version0_3:
		return V4L2_PIX_FMT_VP8_FRAME;
	case VAProfileVP9Profile0:
		return V4L2_PIX_FMT_VP9_FRAME;
	default:
		return 0;
	}
}

Single source of truth. Mirror of config.c::RequestQueryConfigProfiles's probe blocks at lines 138-188, in helper form.

C2 — find_sole_active_profile() helper in surface.c

Edit: src/surface.c

Add static helper before RequestCreateSurfaces2:

/*
 * iter5b: walk config_heap to find the profile of the single active
 * config. Returns VAProfileNone (-1) if zero or multiple configs are
 * alive. Used to determine the OUTPUT pixel format at CreateSurfaces2
 * time, where the VA-API contract gives us no config_id parameter.
 *
 * Every real consumer (mpv, ffmpeg, Firefox, Chromium) creates exactly
 * one config before CreateSurfaces. vainfo creates+destroys configs in
 * pairs and never reaches CreateSurfaces with surfaces_count > 0. The
 * "multiple configs alive" case is pathological and falls back to the
 * pre-iter5b hardcoded H264_SLICE with a request_log warning.
 */
static VAProfile find_sole_active_profile(struct request_data *driver_data)
{
	VAProfile found = VAProfileNone;
	int seen = 0;
	struct object_config *config_object;
	struct object_heap_iterator iter;
	struct object_base *iter_obj;

	iter_obj = object_heap_first(&driver_data->config_heap, &iter);
	while (iter_obj) {
		config_object = (struct object_config *)iter_obj;
		if (seen == 0)
			found = config_object->profile;
		seen++;
		iter_obj = object_heap_next(&driver_data->config_heap, &iter);
	}

	if (seen == 1)
		return found;

	if (seen > 1)
		request_log("CreateSurfaces2: %d active configs, profile-pick is ambiguous; falling back to default OUTPUT format\n",
			    seen);
	return VAProfileNone;
}

The object_heap_first / object_heap_next API is the iterator pattern the backend already uses elsewhere (grep for object_heap_first in the source to confirm the call shape before Phase 6 lands).

C3 — Use helper in RequestCreateSurfaces2

Edit: src/surface.c:173

Replace:

unsigned int pixelformat = V4L2_PIX_FMT_H264_SLICE;

With:

/*
 * iter5b: derive OUTPUT pixel format from the single active config's
 * profile. Pre-iter5b hardcoded V4L2_PIX_FMT_H264_SLICE for every
 * profile — for HEVC, VP9, VP8 the kernel rkvdec/hantro driver
 * silently rejected the format/control mismatch and produced
 * all-zero CAPTURE pages. See phase0_iter5_loopback.md (commit
 * cd34ec1) for the empirical trace; same bug class as
 * feedback_unconditional_codec_state.md (iter4 h264_start_code).
 */
VAProfile sole_profile = find_sole_active_profile(driver_data);
unsigned int pixelformat = pixelformat_for_profile(sole_profile);

if (pixelformat == 0) {
	if (sole_profile == VAProfileNone) {
		request_log("CreateSurfaces2: no config to derive OUTPUT pixel format; falling back to H.264 SLICE\n");
	} else {
		request_log("CreateSurfaces2: profile %d not mapped to a V4L2 OUTPUT pixel format; falling back to H.264 SLICE\n",
			    sole_profile);
	}
	pixelformat = V4L2_PIX_FMT_H264_SLICE;
}

The fallback preserves pre-iter5b behavior for the (theoretical) pathological case. In practice the fallback never fires for the in-scope consumer matrix.

C4 — #include "codec.h" in surface.c

Edit: src/surface.c — add #include "codec.h" near the existing includes block.

C5 — Track pixelformat in driver_data for gate

Edit: src/request.h — add field to struct request_data (request.h:50+):

+	/*
+	 * iter5b: pixelformat last set on the OUTPUT queue (e.g.
+	 * V4L2_PIX_FMT_HEVC_SLICE). Gates re-S_FMT on profile change
+	 * the same way last_output_width/height gates re-S_FMT on
+	 * resolution change. 0 = uninitialized; non-zero = the FOURCC
+	 * of the current OUTPUT format on the V4L2 fd.
+	 */
+	unsigned int last_output_pixelformat;
	unsigned int last_output_width;
	unsigned int last_output_height;

Edit: src/surface.c:176

Extend the existing gate:

if (driver_data->last_output_width != width ||
    driver_data->last_output_height != height ||
    driver_data->last_output_pixelformat != pixelformat) {

And at line 230-231:

driver_data->last_output_width = width;
driver_data->last_output_height = height;
+ driver_data->last_output_pixelformat = pixelformat;

The gate now fires on either resolution change OR profile change. The existing teardown branch (cap_pool_destroy, request_pool_destroy, REQBUFS(0)) handles either case identically — the V4L2 device just needs all buffers gone before a fresh S_FMT.

C6 — meson.build update

Edit: src/meson.build — add 'codec.c' to the sources list, 'codec.h' to the headers list.

C7 — Phase 7 verification matrix

Re-run the iter5 Phase 3 sweep script (/tmp/iter5_p3/sweep.sh on fresnel — preserved in iter5_phase3_baseline.tgz). Expected hash matrix:

Codec Fixture libva pre-iter5b libva post-iter5b kdirect sw Verdict
H.264 1080p30 bbb_1080p30_h264.mp4 71ac099b… (broken keyframe-only) expected same as pre (Bug 4 not fixed in iter5b) 1e7a0bc9… 1e7a0bc9… Bug 4 still open; not in scope
HEVC 720p bbb_720p10s_hevc.mp4 06b2c5a0… (all-zero) 9340b832… 9340b832… 9340b832… libva == kdirect == sw
VP9 720p bbb_720p10s_vp9.webm 06b2c5a0… 4f1565e8… 4f1565e8… 4f1565e8… libva == kdirect == sw
MPEG-2 720p bbb_720p10s_mpeg2.ts 19eefbf4… (working) 19eefbf4… (unchanged) 19eefbf4… 7be8cad7… regression-free ✓
VP8 720p bbb_720p10s_vp8.webm 06b2c5a0… 136ce5cb… 136ce5cb… 136ce5cb… libva == kdirect == sw

iter5b PASS = 3 codecs unblocked (HEVC, VP9, VP8) + 1 regression-free (MPEG-2) + 1 unchanged (H.264 — Bug 4 still open).

Phase 7 strict equality check:

ssh fresnel 'cd /tmp/iter5b_p7 && ./sweep.sh && \
  for c in hevc vp9 vp8 mpeg2; do
    if cmp -s libva_$c.yuv kdirect_$c.yuv; then
      echo "$c: libva == kdirect (PASS)"
    else
      echo "$c: libva != kdirect (FAIL)"
    fi
  done'

H.264 deferred: keyframe-only decode continues, which the iter4 testing already showed.

C8 — Build + install on fresnel

No kernel work. Backend-only rebuild:

ssh fresnel 'cd ~/src/libva-v4l2-request-fourier && \
  git pull origin master && \
  ninja -C build && \
  sudo install -m 644 build/src/v4l2_request_drv_video.so /usr/lib/dri/v4l2_request_drv_video.so'

Capture post-install SHA256 of the .so as the iter5b binding identifier. Compare against pre-iter5b 6e90b7a9b2c33480… to confirm a different build was installed.

C9 — Control-payload anchor regression check

Optional but cheap: re-run the Phase 3 anchor strace and confirm VIDIOC_S_EXT_CTRLS payloads byte-match the iter5 baseline (per-codec strace files in iter5_phase3_baseline.tgz). The backend changes only touch OUTPUT format setup, not control submission, so byte-identical is the expectation. Any divergence indicates an unintended ripple.

C10 — Phase 8 close criteria

  • C7 verification: 3 of 3 (HEVC/VP9/VP8) libva == kdirect == sw. MPEG-2 unchanged. H.264 unchanged (Bug 4 deferred to iter6).
  • phase8_iteration5b_close.md documenting commits + verification hashes.
  • Campaign scoreboard: "5/5 with 4 direct + 1 (H.264 inter) partial" — up from current "5/5 with 1 direct (MPEG-2) + 4 transitive."
  • Phase 5 sonnet-architect review sign-off recorded.
  • Commits via claude-noether per memory feedback_gitea_as_claude_noether.md.
  • Memory update: extend feedback_unconditional_codec_state.md with the second instance (pixelformat_for_profile site), or add a fresh feedback_va_lifecycle_profile_threading.md if the rule generalizes beyond start codes + pixel formats.

Risk register

Risk Probability Impact Mitigation
object_heap_first/next API signature differs from what I'm guessing LOW LOW Phase 6 reads existing usages in the backend before authoring the iterator
find_sole_active_profile misfires when consumer creates config + creates surfaces + destroys config + creates new config (sequential single-config flow) LOW LOW Confirmed by tracing mpv + ffmpeg patterns; single-config-at-a-time is preserved
request_log is the wrong logger symbol LOW LOW Grep confirms request_log is the backend-wide logger (used in v4l2.c, surface.c etc.)
Hantro's CAPTURE-format-derivation breaks when OUTPUT goes HEVC_SLICE MEDIUM HIGH Hantro doesn't expose HEVC, so HEVC won't bind to hantro. rkvdec is the only path for HEVC. Phase 6 verifies. For MPEG-2/VP8 on hantro the format goes from "wrong-but-tolerated H264_SLICE" to "correct MPEG2_SLICE/VP8_FRAME" — hantro's CAPTURE derivation should be MORE accurate, not less
Existing iter4 h264_start_code switch and the new pixelformat switch fall out of sync LOW LOW Both code blocks exist in different files (context.c vs surface.c); helper isolates the mapping logic. Phase 5 review checks both
VAProfileNone constant comparison surprises LOW LOW va.h:504 confirms VAProfileNone = -1. The fallback condition pixelformat == 0 makes the comparison double-safe
Ohm campaign regression (libva-multiplanar shares the fork) LOW MEDIUM Ohm tests only H.264 + MPEG-2. Both get correct pixelformat via the helper. MPEG-2 path goes from accidentally-tolerated H264_SLICE to explicitly-correct MPEG2_SLICE — likely improvement, not regression. Phase 5 review verifies via the ohm Phase 3 anchors if available

Phase 5 review concerns

Inviting the sonnet-architect to specifically check:

  1. Helper completeness — does pixelformat_for_profile() cover every VAProfile the backend enumerates in RequestQueryConfigProfiles at config.c:131-194? Missing a profile silently breaks that codec.
  2. Heap iterator API — confirm object_heap_first / object_heap_next exist and have the signature my pseudocode uses. The backend's object_heap.h will tell us in seconds.
  3. Gate semantics — does extending the last_output_* gate with pixelformat correctly fire the teardown branch on profile change? Walk through the case where the same resolution is decoded as H.264 first, then HEVC second on the same driver_data instance.
  4. Hantro CAPTURE-derivation — confirm hantro/rk3399-vpu-dec accepts an explicit MPEG2_SLICE OUTPUT format without breaking the implicit CAPTURE derivation. Phase 3 baseline showed MPEG-2 works with the wrong H264_SLICE format because hantro is forgiving; will it ALSO work with the right MPEG2_SLICE format? Almost certainly yes (hantro probes the SPS to derive CAPTURE), but the reviewer should empirically check via boltzmann's linux-rockchip tree.
  5. mpv probe-then-real allocation pattern — mpv first vaCreateSurfaces with 128x128 probe surfaces, then real-resolution. The gate-on-resolution-change works for mpv. Does the new gate-on-pixelformat-change cooperate correctly? Specifically: probe surfaces happen BEFORE vaCreateConfig in some mpv flows? If so, find_sole_active_profile returns VAProfileNone and falls back to H264_SLICE for the probe, then the real-resolution call (after CreateConfig) finds the right profile. The gate fires on both resolution AND pixelformat change, so the second call re-S_FMTs correctly.
  6. Memory rule update — does this iteration's lesson belong in feedback_unconditional_codec_state.md (extending the start-code finding to pixel format) or as a separate feedback_va_lifecycle_profile_threading.md rule about VA-API's CreateSurfaces having no config_id parameter and how to work around it?

Predicted iter5b cadence

  • Phase 5 review: ~20-30 min (small surface).
  • Phase 6 implementation: ~45-60 min total — 2 new files (~40 LOC), 3 file edits (~40 LOC delta), 1 meson update, 1 rebuild on fresnel, 1 install.
  • Phase 7 verification: ~10 min (re-run the sweep script).
  • Phase 8 close: ~30 min for the doc + memory updates.

Total: 1-2 hours wallclock if nothing surprises.

Option β deferral note

The option β refactor (CreateContext owns OUTPUT-side lifecycle entirely; CreateSurfaces2 becomes ID-allocation-only) is the architecturally cleanest solution and aligns with the surface.c:164-171 TODO comment's "context-level redesign for the next iteration."

iter5b chooses option α' instead because:

  1. The TODO comment specifically targets multi-resolution-mid-stream, not iter5b's "wrong-format-at-initial-create-surfaces" case. iter5b's bug doesn't need the full β refactor to fix.
  2. Option α' is 5x smaller LOC (~50 vs ~250) and 10x faster to ship.
  3. Option β can be implemented in a later iteration when multi-resolution mid-stream becomes a target.

If Phase 5 reviewer makes a strong case for β (e.g. discovers that α' has a corner case it doesn't handle), iter5b will loop back to Phase 4 and re-plan as β.