Phase 4 plan for iter2 HEVC fix. Structured per the
feedback_dev_process.md Phase 6 contract-before-code worked example
(0012-h264-omit-scaling-matrix-frame-based.patch shape): contract
clauses with citations first, then code changes mapping 1:1 to
clauses.
10 contract clauses cited from authoritative sources:
Clause 1 — Per-frame batched VIDIOC_S_EXT_CTRLS, count=5
Authority: linux/v4l2-controls.h:2090-2300 (8 HEVC stateless CIDs)
Reference impl: FFmpeg libavcodec/v4l2_request_hevc.c:505-565
(v4l2_request_hevc_queue_decode)
Empirical anchor: Phase 3 Baseline B verbatim payload
Clause 2 — v4l2_ctrl_hevc_sps layout (40 bytes)
Authority: linux/v4l2-controls.h:2096+ (struct + 9 SPS_FLAG_* bits)
Field-by-field VAAPI source mapping table; existing
h265_fill_sps logic preserved, just routed to flags bitmask
Phase 3 Baseline B BBB SPS bytes: flags=SAO|STRONG_INTRA_SMOOTHING
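The "routed to flags bitmask" step in Clause 2 can be sketched as below. The EX_* bit values and the ex_va_sps_bits struct are hypothetical stand-ins; real code would use the V4L2_HEVC_SPS_FLAG_* macros from linux/v4l2-controls.h and the VA-API picture parameter bitfields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for two of the kernel's V4L2_HEVC_SPS_FLAG_* bits. */
#define EX_SPS_FLAG_SAO                    (1ULL << 0)
#define EX_SPS_FLAG_STRONG_INTRA_SMOOTHING (1ULL << 1)

static_assert((EX_SPS_FLAG_SAO & EX_SPS_FLAG_STRONG_INTRA_SMOOTHING) == 0,
	      "flag bits must not overlap");

/* Hypothetical subset of the one-bit fields a VA-API SPS source carries. */
struct ex_va_sps_bits {
	unsigned int sample_adaptive_offset_enabled_flag : 1;
	unsigned int strong_intra_smoothing_enabled_flag : 1;
};

/* Collapse per-field booleans into a single 64-bit flags word. */
static uint64_t ex_sps_flags(const struct ex_va_sps_bits *va)
{
	uint64_t flags = 0;

	if (va->sample_adaptive_offset_enabled_flag)
		flags |= EX_SPS_FLAG_SAO;
	if (va->strong_intra_smoothing_enabled_flag)
		flags |= EX_SPS_FLAG_STRONG_INTRA_SMOOTHING;

	return flags;
}
```

With both source bits set, the result is the SAO|STRONG_INTRA_SMOOTHING combination that the Baseline B SPS bytes show.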
Clause 3 — v4l2_ctrl_hevc_pps layout (64 bytes, 19 flags)
Authority: linux/v4l2-controls.h:2126-2150
Field source: VAPictureParameterBufferHEVC + slice (for
dependent_slice_segment_flag)
Clause 4 — v4l2_ctrl_hevc_slice_params (variable; dynamic-array)
Authority: kernel exposes 0xa40a92 elems=1 dims=[600] dynamic-array
Submission shape: size = sizeof(slice_params) * num_slices_in_frame
Reference impl: FFmpeg v4l2_request_hevc.c:540-547
BEHAVIORAL CHANGE: per-slice accumulation in codec_store_buffer
(replace overwrite with append-to-array)
DPB MOVES OUT of slice_params to DECODE_PARAMS (Clause 6)
Clause 5 — v4l2_ctrl_hevc_scaling_matrix (size M; conditional)
Conditional on kernel availability (probed via VIDIOC_QUERY_EXT_CTRL
at init), NOT on bitstream flag (Phase 3 baseline corrects Phase 2
assumption)
Spec defaults from ISO/IEC 23008-2 Tables 7-5 and 7-6 when iqmatrix_set==false
PROTOCOL: transcribe defaults from Phase 3 Baseline B verbatim
SCALING_MATRIX bytes, NOT from spec recall (per
memory feedback_review_empirical_over_theoretical.md)
Clause 6 — v4l2_ctrl_hevc_decode_params layout (328 bytes)
NEW in modern API (didn't exist in staging-era)
Contains: DPB array (16 entries), POC, num_active_dpb_entries,
num_poc_st_curr_before/after, num_poc_lt_curr,
poc_st_curr_before[16], etc.
Source: existing h265_fill_slice_params lines 269-315 logic
preserved, routed to new struct
Clause 7 — Device-wide DECODE_MODE + START_CODE menus
Set once at init via v4l2_set_controls(...request_fd=-1, 2 ctrls)
rkvdec accepts: FRAME_BASED + ANNEX_B (only options per kernel menu
constraints, Phase 0 v4l2_inventory)
Default location: extend src/context.c:142-155 device-init block
Clause 8 — config.c HEVCMain case must break;
Authority: C semantics; iter1 Bug 1 pattern verbatim
Empirical anchor: Phase 3 Baseline D scratch confirmed
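The class of bug in Clause 8 can be illustrated with a self-contained switch. The enum and function names are hypothetical and the real config.c case list differs; the sketch only shows how a case without a break lands in the reject path.

```c
#include <assert.h>

enum ex_profile { EX_PROFILE_MPEG2, EX_PROFILE_HEVC_MAIN, EX_PROFILE_OTHER };

/* Buggy shape: the HEVC case has no break and falls into the reject. */
static int ex_validate_buggy(enum ex_profile profile)
{
	switch (profile) {
	case EX_PROFILE_MPEG2:
		break;
	case EX_PROFILE_HEVC_MAIN:
		/* no break here, so control falls into the reject path */
	default:
		return -1;
	}
	return 0;
}

/* Fixed shape: the HEVC case breaks out and is accepted. */
static int ex_validate_fixed(enum ex_profile profile)
{
	switch (profile) {
	case EX_PROFILE_MPEG2:
		break;
	case EX_PROFILE_HEVC_MAIN:
		break;
	default:
		return -1;
	}
	return 0;
}

/* Usage: the two variants disagree only on the HEVC case. */
static void ex_demo(void)
{
	assert(ex_validate_buggy(EX_PROFILE_HEVC_MAIN) == -1);
	assert(ex_validate_fixed(EX_PROFILE_HEVC_MAIN) == 0);
	assert(ex_validate_fixed(EX_PROFILE_OTHER) == -1);
}
```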
Clause 9 — picture.c::codec_set_controls HEVCMain dispatch
Authority: existing MPEG-2 dispatch pattern at picture.c:186-191
Replace the explicit "Fourier-local: HEVC stripped" reject with an
h265_set_controls call
Clause 10 — Per-slice accumulation in codec_store_buffer
HEVC slice_params dynamic-array source = per-RenderPicture appends
BeginPicture resets num_slices=0; codec_store_buffer appends each
VASliceParameterBufferType to slices[N] array
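The append-instead-of-overwrite behavior of Clauses 4 and 10 can be sketched as follows. All ex_* names are hypothetical, and ex_slice_params is a two-field stand-in for struct v4l2_ctrl_hevc_slice_params; only the capacity of 64 and the dims=[600] limit come from this plan.

```c
#include <assert.h>
#include <stddef.h>

#define EX_MAX_SLICES 64	/* matches slices[64] in the diff scope */

/* The kernel advertises dims=[600]; the local array must fit under it. */
static_assert(EX_MAX_SLICES <= 600, "local capacity exceeds kernel dims");

/* Hypothetical stand-in for the real per-slice control struct. */
struct ex_slice_params {
	unsigned int bit_size;
	unsigned int data_byte_offset;
};

struct ex_h265_params {
	struct ex_slice_params slices[EX_MAX_SLICES];
	unsigned int num_slices;
};

/* BeginPicture: start a new frame with an empty slice array. */
static void ex_begin_picture(struct ex_h265_params *p)
{
	p->num_slices = 0;
}

/* RenderPicture: append one slice; returns -1 when the array is full. */
static int ex_store_slice(struct ex_h265_params *p,
			  const struct ex_slice_params *slice)
{
	if (p->num_slices >= EX_MAX_SLICES)
		return -1;
	p->slices[p->num_slices++] = *slice;
	return 0;
}

/* Payload size for the dynamic-array control: element size * count. */
static size_t ex_slice_payload_size(const struct ex_h265_params *p)
{
	return sizeof(p->slices[0]) * p->num_slices;
}
```

Rejecting the 65th slice explicitly, rather than overwriting the last entry, is a design choice of this sketch; the plan does not say how overflow is handled.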
Diff scope (8 files):
src/config.c — 5-line break addition (Clause 8)
src/picture.c — HEVCMain dispatch (Clause 9) + per-slice
accumulation (Clause 10) + BeginPicture
num_slices reset, ~25 lines
src/surface.h — extend params.h265 with slices[64] +
num_slices, ~17 KB extra per surface union
src/h265.c — full rewrite ~400 lines (Clauses 2-7)
src/h265.h — re-enable
src/meson.build — uncomment h265.c + h265.h
src/context.c — extend device-init for HEVC DECODE_MODE +
START_CODE
include/hevc-ctrls.h — leave as-is (9-line shim, lower-risk path
per iter1 Phase 5 Nit 6 deferral)
Phase 6 implementation order (2 logical commits + optional fix-forward):
A: src/config.c HEVCMain break only (substrate fix in isolation;
Phase 3 Baseline D already verified collateral safe)
B: h265.c rewrite + picture.c dispatch + slice_params accumulation +
meson re-enable + surface.h extension + context.c device-init
C: optional fix-forward if Phase 7 surfaces a regression
Phase 7 verification harness (full Bash incantations in plan body):
Criterion 1: vainfo lists VAProfileHEVCMain on rkvdec
Criterion 2: vaCreateConfig(VAProfileHEVCMain) = SUCCESS via libva trace
Criterion 3: ffmpeg -hwaccel vaapi exit 0, no Failed-to-create
Criterion 4: mpv --hwdec=vaapi --vo=image at +02s; HW=SW byte-identical
(DMA-BUF GL cache-coherency-safe path per memory
feedback_rockchip_pixel_verify_path.md)
Criterion 5: iter1 MPEG-2 + T4 H.264 reference hashes still match
Bonus: byte-compare post-fix S_EXT_CTRLS payload vs Baseline B
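For the Bonus byte-compare, reporting the first differing offset is more useful than a bare equal/not-equal result, because the offset points at a struct field. A minimal sketch follows; ex_first_diff is a hypothetical helper name, and how the two payloads are captured is out of scope here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the offset of the first byte where a and b differ,
 * or n when the first n bytes are identical.
 */
static size_t ex_first_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
	assert(a != NULL && b != NULL);

	for (size_t i = 0; i < n; i++) {
		if (a[i] != b[i])
			return i;
	}
	return n;
}
```

A return value equal to n means the post-fix payload matches the baseline over the compared length.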
Pre-identified Phase 7 → Phase 4 loopback triggers:
1. S_EXT_CTRLS EINVAL post-fix → check struct sizes (pahole),
reserved zeroing, SCALING_MATRIX size encoding
2. HW pixel hash mismatch → DPB ordering, slice_params bit_offset,
SPS/PPS flags bit positions, SCALING_MATRIX values
3. mpv --hwdec=vaapi filters HEVC out → fall-forward to ffmpeg
-vf hwdownload (less likely; vaapi engaged MPEG-2 in iter1)
4. iter1/T4 regression → verify diffs scoped right
5. Slice_params dynamic-array submission shape rejected → cross-
validator size encoding anchor
6. SCALING_MATRIX availability detection wrong → defensive
QUERY_EXT_CTRL probe in h265_init_device_controls
7. Latent bug B3 hits HEVC differently than MPEG-2 → byte 240 in
h265.picture; ffmpeg-vaapi sends VAPictureParameterBufferType
per frame so masking holds
Out-of-scope (LOCKED): VP9/VP8; HEVC Main 10 / Main Still Picture /
range ext / tile-wavefront; perf metrics; long-duration stress;
SLICE_BASED decode mode (rkvdec FRAME_BASED only); Phase 4 cross-
cutting backlog (B1 device-discovery, B3 BeginPicture profile-aware,
B4 context.c log suppression, B5 vbv_buffer_size, L3 vaDeriveImage
cache-stale); chromium-fourier 149 install; upstream engagement;
hevc-ctrls.h deletion (Phase 5 Nit 6 lower-risk path continues).
Predicted Phase 8 close: 4-6 commits on the fork (vs iter1's 4).
Iter2 ~3x larger codebase delta than iter1 (mpeg2.c rewrite was
~120 lines; h265.c rewrite is ~400 lines).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fresnel-fourier
TL;DR
Peer campaign to libva-multiplanar, targeting fresnel (Pinebook Pro / Rockchip RK3399) instead of ohm (PineTab2 / RK3566). Same deliverable shape (make libva-v4l2-request-fourier work end-to-end for production VA-API consumers), but on a different SoC with a different (and broader) V4L2 decoder surface.
The libva backend fork itself (libva-v4l2-request-fourier) is shared: it lives at ../libva-multiplanar/libva-v4l2-request-fourier/. This campaign does not nest a second copy. Code-side work that turns out to be RK3399-specific lands either as #ifdef/runtime-detected paths inside that fork's master or, if scope diverges sharply, on a feature branch — Phase 2 source-read of each iteration decides.
Origin
libva-multiplanar reached iter5-close on 2026-05-05 with the ohm path solid: hantro (rk3568-vpu DT compatible) decodes H.264 to NV12 dmabufs end-to-end, three iterations of bugs fixed, mpv --hwdec=vaapi smooth, firefox-fourier RDD-sandboxed Firefox engages the backend without MOZ_DISABLE_RDD_SANDBOX=1, chromium-fourier 149 confirmed as the regression-check consumer.
That campaign's README explicitly names fresnel (RK3399) and ampere/boltzmann (RK3588) as "future iterations after ohm path is solid" (libva-multiplanar/README.md:87). fresnel-fourier is the formal peer campaign for the RK3399 leg of that promise.
Topology choice (LOCKED 2026-05-07): peer campaign, not child of libva-multiplanar. Each runs its own 8(+1) phase loop. Cross-link only — fresnel-fourier results do not gate libva-multiplanar's Phase 8 close (which already happened iteration-by-iteration on ohm).
Hardware target
fresnel — Pinebook Pro laptop. See reference_fresnel_kernel_constraints.md for the custom-OC-kernel discipline (don't let pacman -Syu clobber the OC DTB) and project_fresnel.md for fleet placement.
| Property | Value |
|---|---|
| SoC | Rockchip RK3399 (2× Cortex-A72 + 4× Cortex-A53) |
| GPU | Mali-T860 MP4 (Midgard, panfrost) |
| Decoder block 1 | rkvdec (/dev/video3) — H.264 + HEVC + VP9 |
| Decoder block 2 | hantro-vpu-dec (rk3399-vpu-dec, /dev/video5) — MPEG-2 + VP8 (RK3399 hantro does not advertise H.264; corrected 2026-05-07 from empirical V4L2 enumeration — see phase0_evidence/2026-05-07/v4l2_inventory_findings.md) |
| Encoder block | hantro-vpu-enc (/dev/video4) — JPEG only |
| OS | EndeavourOS-ARM (Arch derivative; same pacman + marfrit-packages mechanism as ohm) |
| Kernel | linux-eos-arm 6.19.9-99 — CONFIG_FTRACE=y, CONFIG_FUNCTION_TRACER=y, CONFIG_DYNAMIC_FTRACE=y, CONFIG_TRACING=y verified 2026-05-07. No rebuild needed for trace work. |
The decode-side surface area is genuinely broader than ohm. ohm has hantro (H.264 + MPEG-2 + VP8) and an rkvdec block whose mainline rkvdec2/vdpu346 driver isn't merged. fresnel has hantro (MPEG-2 + VP8 only — empirical, see correction note in the table above) plus a fully-driven mainline rkvdec covering H.264 + HEVC + VP9. So this campaign exercises codecs (HEVC, VP9) the libva-v4l2-request-fourier fork has never run on real hardware to date, and there is exactly one decoder bind for H.264 (rkvdec) — no two-block routing decision.
GPU side: panfrost on Mali-T860 (Midgard) is a different generation than ohm's Mali-G52 (Bifrost). KWin / Mesa / panfrost stack regressions or wins on T860 are not assumed to track G52 — the kwin-fourier verdict from fourier_attribution doesn't transfer for free.
Scope (LOCKED 2026-05-07 in phase0_findings.md)
In scope:
- libva-v4l2-request-fourier backend exercised on fresnel V4L2 decode nodes (rkvdec + hantro-vpu-dec).
- Codecs: everything decode-capable: H.264 + HEVC + VP9 (via rkvdec) + MPEG-2 + VP8 (via hantro-vpu-dec). This is the explicit broadening from libva-multiplanar's H.264-first locked scope.
- Test consumers: vainfo, mpv --hwdec=vaapi, Firefox via media.ffmpeg.vaapi.enabled, chromium-fourier 149 (regression check).
- Phase 1 success criterion (matching libva-multiplanar): boolean correctness, "libva accepted + providing access to hardware decoder for each codec." Performance metrics deferred.
- Phase 0 task 1: recover fresnel from the SDDM greeter crash-loop (per ~/.claude/plans/dynamic-forging-piglet.md). Recovery is bookkept as substrate work inside this campaign, not as a separate prereq.
Out of scope:
- Front-end libva (API library). Backend only.
- Other hardware (ohm + ampere/boltzmann are libva-multiplanar's iterations).
- AV1 (no decoder block on RK3399 supports it).
- Performance metrics — fresnel CPU/GPU benchmarking is a separate iteration after correctness lands.
- cros-codecs Rust replacement (per user_stance_rust.md).
- Bootlin / Collabora upstreaming default-deferred (per feedback_no_upstream.md). Same discipline as libva-multiplanar.
- KWin / panfrost / Mali-T860 work: orthogonal until proven otherwise; a parallel kwin-fourier-fresnel campaign would be a separate decision.
Process
8(+1) phase loop per feedback_dev_process.md. Phase 0 substrate is in phase0_findings.md. Phase 5 review uses the sonnet-architect subagent pattern (Plan with model: sonnet).
In-session-acquired data discipline per feedback_replicate_baseline_first.md: libva-multiplanar's ohm-side measurements are reference history, not threshold sources for fresnel-fourier cells.
Predecessor work this campaign builds on
- ../libva-multiplanar/: five closed iterations on ohm. Read in order:
  1. README.md: current state, codec scope, file map.
  2. phase8_iteration5_close.md: most recent close. The iter5-end backend is the substrate fresnel-fourier starts from.
  3. phase0_findings.md (and phase0_findings_iter[2-5].md): locked-scope precedent for codec breadth and consumer matrix on ohm; useful frame-of-reference when locking fresnel-side scope.
  4. phase8_iteration1_close.md: iter1 surface-export DMA-BUF lifecycle race + multi-resolution cache + 64-pitch alignment bugs. Likely re-surface candidates on RK3399.
- ../libva-multiplanar/libva-v4l2-request-fourier/: the fork itself. 12 commits ahead of bootlin tip plus iter1..iter5 work. See git log for the actual landing record.
- ~/.claude/plans/dynamic-forging-piglet.md: fresnel SDDM greeter crash diagnosis + recovery plan. Phase 0 task 1 picks up from here.
- ~/src/fourier_attribution/: ohm-only attribution matrix. The chromium-fourier WHEAT-but-fragile verdict and Cell E (vanilla Chromium 149 control) item are ohm-side context, not fresnel data.
External reference (carry-over from libva-multiplanar):
- Mozilla bug 1833354 / 1965646 (Firefox HW decode on RK35xx via libva-v4l2-request).
- Bootlin upstream bootlin/libva-v4l2-request: dormant since 2021.
- Linux kernel drivers/staging/media/rkvdec/: RK3399 rkvdec H.264/HEVC/VP9 control protocol reference.
- Linux kernel drivers/media/platform/verisilicon/hantro_*: RK3399-vpu-dec MPEG-2/H.264/VP8 control protocol reference.
Repository layout
~/src/fresnel-fourier/ <- this campaign (its own git repo)
├── README.md <- this file
├── phase0_findings.md <- locked research question + Phase 0 work list
├── (worklist.md, phase[2-8]*.md as phases land)
└── (the libva fork is NOT here — see ../libva-multiplanar/libva-v4l2-request-fourier/)
The campaign repo and the fork repo stay separate. fresnel-fourier commits its findings here; code changes to the backend land on the fork's master (or a branch named per the iteration if scope diverges from libva-multiplanar's ohm-side master).
Operator-facing repo URL: git.reauktion.de/marfrit/fresnel-fourier — created empty during scaffolding, no push until first iteration finds something worth publishing.
Non-upstreaming default
Inherited from libva-multiplanar / feedback_no_upstream.md. Patches must be aligned to upstream in syntax and semantics; PR/MR/bug-report only on explicit operator instruction.
Build infrastructure
distcc/cross-build path is the existing fleet: aarch64crosscompiler LXD on data, tesla LXD on hertz, dcc1 on dcw3. See reference_distcc_kernel_builds.md for invocation. Per the locked Phase 0 answer for libva-multiplanar (item 9), no distcc for libva builds — libva is small and links fast, hand-build on fresnel directly. Same default applies here unless a specific reason emerges.
For chromium-fourier 149 / firefox-fourier rebuilds against fresnel-side findings, the boltzmann LXD container path from libva-multiplanar iter3 is reusable.