fresnel-fourier/phase4_iter22_plan.md
marfrit bf67900cd8 iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
iter20-23: kernel printk in rkvdec_hevc_run + v4l2_ctrl_request_setup
iter24:    pinpointed rkvdec_s_ctrl returning -EBUSY for HEVC_SPS due
           to vb2_is_busy(CAPTURE) — libva pre-allocates 24 CAPTURE bufs
           before first per-frame S_EXT_CTRLS, blocking image_fmt reset
iter25 α-25: synthetic SPS injection before cap_pool_init seeds
           ctx->image_fmt to RKVDEC_IMG_FMT_420_8BIT while CAPTURE is
           still empty. H264 Bug 4 fully fixed (byte-equal kdirect).
           HEVC Bug 5 frame 1 fixed (byte-equal kdirect).
iter26 α-26: populate decode_params.short_term_ref_pic_set_size from
           picture->st_rps_bits (VAAPI does expose it). Bytes 4-5 of
           dp now match kdirect. HEVC frame 2+ still diverges
           (separate bug, likely DPB entry mapping).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:10:56 +00:00

Iteration 22 — Phase 4 (plan)

Opens 2026-05-14 following iter21's smoking-gun finding: libva's request-clone-handler is missing 6 of 7 HEVC stateless controls registered in main_hdl.

Locked research question (iter22)

"At which control_id does v4l2_ctrl_request_clone's iteration break for libva, and what error code does handler_new_ref return?"

Approach

Add three printks to v4l2_ctrl_request_clone in drivers/media/v4l2-core/v4l2-ctrls-request.c:

pr_info("iter22_clone_start: new_hdl=%p from=%p\n", hdl, from);
// per iteration:
pr_info("iter22_clone_step: id=0x%x err=%d hdl_error=%d new_ref=%p\n",
        ctrl->id, err, hdl->error, new_ref);
// on break:
pr_info("iter22_clone_break: at id=0x%x err=%d hdl_error=%d\n",
        ctrl->id, err, hdl->error);
// on end:
pr_info("iter22_clone_end: hdl=%p err=%d\n", hdl, err);

Build as linux-fresnel-fourier 7.0-6 (pkgrel 5→6). Deploy, reboot, then run libva HEVC and kdirect HEVC and diff the iter22_clone_* output of the two runs.
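To make the intended placement of the four messages unambiguous, here is a minimal userspace model of the clone loop. It is not the kernel code: mock_new_ref, mock_clone and struct mock_clone_cfg are hypothetical stand-ins, printf replaces pr_info, and the mock fails at one configurable ID instead of calling handler_new_ref.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical test knob: the mock fails with fail_err when it sees fail_id. */
struct mock_clone_cfg {
    unsigned int fail_id;
    int fail_err;
};

/* Stand-in for handler_new_ref(): sets *new_ref only on success. */
static int mock_new_ref(const struct mock_clone_cfg *cfg, unsigned int id,
                        void **new_ref)
{
    static int token; /* any non-NULL address will do */

    *new_ref = NULL;
    if (id == cfg->fail_id)
        return cfg->fail_err;
    *new_ref = &token;
    return 0;
}

/*
 * Walk ids[] the way the clone loop walks the source handler's refs.
 * Returns the number of successful steps; *break_id receives the ID the
 * loop broke at, or 0 if it ran to completion.
 */
size_t mock_clone(const unsigned int *ids, size_t n,
                  const struct mock_clone_cfg *cfg, unsigned int *break_id)
{
    int err = 0;
    size_t done = 0;

    *break_id = 0;
    printf("iter22_clone_start: ids=%zu\n", n);          /* once, before the loop */
    for (size_t i = 0; i < n; i++) {
        void *new_ref;

        err = mock_new_ref(cfg, ids[i], &new_ref);
        printf("iter22_clone_step: id=0x%x err=%d new_ref=%p\n",
               ids[i], err, new_ref);                    /* every iteration */
        if (err) {
            printf("iter22_clone_break: at id=0x%x err=%d\n",
                   ids[i], err);                         /* only on the break path */
            *break_id = ids[i];
            break;
        }
        done++;
    }
    printf("iter22_clone_end: err=%d\n", err);           /* once, after the loop */
    return done;
}
```

In this model a run with no iter22_clone_break line and no step for a given ID means the ID was never in the source list, which is the case the diff has to tell apart from a real break.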

Outcome interpretation

| handler_new_ref return | hdl->error | Diagnosis |
|---|---|---|
| 0, new_ref=valid | 0 | Loop step succeeded; clone wouldn't break here. Look further. |
| 0, new_ref=NULL | 0 | Duplicate (skip silently). Means main_hdl has duplicate ctrl_refs; unlikely. |
| -ENOMEM | -ENOMEM | kzalloc failed. Memory pressure analysis needed. |
| 0, hdl->error=X | non-zero | Earlier auto-class-control insertion failed; subsequent handler_new_ref short-circuits. |
| -EINVAL | varies | Validation failed (e.g., overlapping ID range). |
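The table can be applied mechanically to each iter22_clone_step line. The sketch below is a userspace helper, not kernel code; the enum and function names are hypothetical, and it covers only the rows listed above, returning ITER22_UNCLASSIFIED for anything else.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical labels, one per row of the interpretation table. */
enum iter22_diag {
    ITER22_STEP_OK,       /* err=0, new_ref valid, hdl->error=0 */
    ITER22_DUPLICATE,     /* err=0, new_ref NULL,  hdl->error=0 */
    ITER22_ENOMEM,        /* err=-ENOMEM: allocation failed */
    ITER22_STICKY_ERROR,  /* err=0 but hdl->error already non-zero */
    ITER22_EINVAL,        /* err=-EINVAL: validation failed */
    ITER22_UNCLASSIFIED   /* any combination the table does not list */
};

/* Map the three values logged per step onto a table row. */
enum iter22_diag iter22_classify(int err, int hdl_error, const void *new_ref)
{
    if (err == -ENOMEM)
        return ITER22_ENOMEM;
    if (err == -EINVAL)
        return ITER22_EINVAL;
    if (err != 0)
        return ITER22_UNCLASSIFIED;
    if (hdl_error != 0)
        return ITER22_STICKY_ERROR;
    return new_ref ? ITER22_STEP_OK : ITER22_DUPLICATE;
}
```

Checking the error codes before hdl_error keeps the -ENOMEM row, where both values are set, from being reported as a sticky error.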

Coordinate with iter21 finding

If iter22 shows the loop breaks at 0xa40905 (H264_PRED_WEIGHTS) and again at 0xa40a91 (HEVC_PPS), the break must be unreached by libva's iteration, which means the source main_hdl itself doesn't have these controls.

If iter22 shows the loop reaches 0xa40a91 with err=0 (i.e., NOT a break), then libva's clone-hdl actually DOES contain HEVC_PPS, and our iter21 printk was missing it (e.g., a list-ordering bug in the iteration). Unlikely but worth checking.
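For reading the id= field in the log, the two IDs above decompose into the stateless codec class, the stateless base, and a per-control offset. The sketch below redefines the relevant values locally under a hypothetical ITER22_ prefix so it builds without kernel headers; the numbers are taken from include/uapi/linux/v4l2-controls.h as I understand it and should be checked against the tree actually being built.

```c
#include <stdint.h>

/* Local copies of uapi values (assumed; verify against v4l2-controls.h). */
#define ITER22_CLASS_CODEC_STATELESS  0x00a40000u
#define ITER22_CID_STATELESS_BASE     (ITER22_CLASS_CODEC_STATELESS | 0x900u)
#define ITER22_CID_H264_PRED_WEIGHTS  (ITER22_CID_STATELESS_BASE + 5)
#define ITER22_CID_HEVC_SPS           (ITER22_CID_STATELESS_BASE + 400)
#define ITER22_CID_HEVC_PPS           (ITER22_CID_STATELESS_BASE + 401)

/* Translate a logged control ID into a short name for the diff. */
const char *iter22_cid_name(uint32_t id)
{
    switch (id) {
    case ITER22_CID_H264_PRED_WEIGHTS:
        return "H264_PRED_WEIGHTS";
    case ITER22_CID_HEVC_SPS:
        return "HEVC_SPS";
    case ITER22_CID_HEVC_PPS:
        return "HEVC_PPS";
    default:
        return "unknown";
    }
}
```

Only the controls named in this plan are listed; other IDs fall through to "unknown" rather than being guessed.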

Substrate state at iter22 open

  • Kernel linux-fresnel-fourier 7.0-6 building on boltzmann (PID 1613982, log /tmp/iter22-kbuild.log).
  • Backend SHA c1d4bb53… — unchanged from iter15.
  • Fork tip e109306 — unchanged.
  • 5-codec anchors: unchanged.

Phase 5 review

Diagnostic-only kernel patch (printk-only, no behavior change). Skipped per iter17 precedent.

Phase 7 plan

After 7.0-6 deploys:

  1. Reboot fresnel; sddm autologin reseats mfritsche.
  2. sudo dmesg -C.
  3. Run libva HEVC; capture iter22_clone_* lines.
  4. sudo dmesg -C.
  5. Run kdirect HEVC; capture same.
  6. Diff. Localize the break or absence-from-source.
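Step 6 can be partly automated. The sketch below is a userspace helper with hypothetical function names: it pulls the control IDs out of the iter22_clone_step lines of a captured dmesg text and lists the IDs seen in one capture but not in the other. It assumes the line format given under Approach and that each capture fits in memory as one string.

```c
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/*
 * Collect the control ID from every "iter22_clone_step:" line in log.
 * Returns the number of IDs stored in ids[] (at most max).
 */
size_t iter22_collect_ids(const char *log, unsigned int *ids, size_t max)
{
    static const char tag[] = "iter22_clone_step: id=0x";
    const char *p = log;
    size_t n = 0;

    while (n < max && (p = strstr(p, tag)) != NULL) {
        unsigned int id;

        p += sizeof(tag) - 1;            /* skip past the tag to the hex digits */
        if (sscanf(p, "%x", &id) == 1)
            ids[n++] = id;
    }
    return n;
}

/*
 * Store in out[] every ID that occurs in a[] but not in b[], once each.
 * Returns the number of IDs stored (at most max).
 */
size_t iter22_missing_ids(const unsigned int *a, size_t na,
                          const unsigned int *b, size_t nb,
                          unsigned int *out, size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < na && n < max; i++) {
        int found = 0;

        for (size_t j = 0; j < nb && !found; j++)
            found = (b[j] == a[i]);      /* present in the other capture */
        for (size_t k = 0; k < n && !found; k++)
            found = (out[k] == a[i]);    /* already reported */
        if (!found)
            out[n++] = a[i];
    }
    return n;
}
```

Calling iter22_missing_ids with the kdirect IDs as a and the libva IDs as b yields the controls libva's clone never stepped over. Whether that is a break or absence from the source still has to be read off the iter22_clone_break lines.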