57051b665c
Per iter16 close (Bug 4/5/6 confirmed kernel-side, libva byte-correct),
add a single pr_info at rkvdec_hevc_run entry dumping key state values
from run->sps / pps / slices_params[0] / decode_params. Build 7.0-3,
deploy, reboot, run libva-HEVC + kdirect-HEVC, diff dmesg output.
Outcome interpretations:
  identical -> bug is in rkvdec assemble_hw_*/config_registers/HW path
  different -> libva somehow leaks different struct contents via a
               non-ioctl path despite identical V4L2 ioctls
Build running on boltzmann via kernel-agent workflow; pkgrel 7.0-2 -> 7.0-3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Iteration 17 — Phase 0 + Phase 6 (kernel-side investigation)
Opened 2026-05-14, after iter16 confirmed that Bug 4, Bug 5, and Bug 6 are kernel-side. Per `feedback_libva_byte_correct_kernel_bug.md`, the libva-side hypothesis space is exhausted. The productive direction is kernel investigation via the kernel-agent workflow.

## Locked research question (iter17)

> *"At the `rkvdec_hevc_run()` entry on RK3399, are the struct values pointed to by `run->sps`, `run->pps`, `run->slices_params[0]`, `run->decode_params` byte-equal between libva-triggered and kdirect-triggered HEVC decodes? If yes, the divergence is inside rkvdec's per-call hardware setup (`assemble_hw_*`, `config_registers`, IRQ handler). If no, libva is somehow passing different struct contents despite ioctls being byte-equal."*

## Approach

Add a kernel printk diagnostic at `rkvdec_hevc_run` entry, after `rkvdec_hevc_run_preamble` has populated `run` from V4L2 control state:

```c
pr_info("rkvdec_hevc_run: sps_id=%u dpb_buf=%u reorder=%u"
        " w=%u h=%u bd_l=%u bd_c=%u chroma=%u"
        " num_short_st=%u num_long_lt=%u"
        " slices=%u nal_unit_type=%u slice_type=%u"
        " decode_flags=0x%x\n", ...);
```
Build linux-fresnel-fourier 7.0-3 with this printk, deploy, reboot, run libva-HEVC and kdirect-HEVC once each, capture dmesg after each run, and diff the two captures.

## Substrate state at iter17 open

- Fork tip `111f8ba` on noether + fresnel + gitea (α-20 reverted).
- Backend SHA `80e65c5a…` on fresnel.
- Kernel `7.0-2` currently. iter17 will deploy `7.0-3` with the diagnostic printk.
- Pkgrel bumped to 3 in PKGBUILD on boltzmann.

## Why this is the right next investigation

iter11–iter16 cumulative empirical findings:

- libva submits byte-identical V4L2 controls (every rkvdec-read field).
- libva submits byte-identical OUTPUT bitstream (HEVC frame 1: 96890 bytes match input exactly; VP8: 300614 bytes match input exactly minus header).
- libva ioctl sequence has been brought to structural near-parity with kdirect (S_FMT CAPTURE, DMA_BUF_IOCTL_SYNC, IRAP/IDR flags, POC strip, timestamp counter).

At the V4L2 ioctl layer, libva == kdirect for every observable byte. The kernel must therefore see different state internally, either through some non-ioctl path (vb2 allocator difference, buffer alignment, request_fd ordering) or within rkvdec's per-call processing (`assemble_hw_*`, `config_registers`).

The printk at `rkvdec_hevc_run` entry shows the kernel's view of the `run->*` struct contents. If those are byte-equal between libva and kdirect, the bug is in assemble/config_registers/IRQ. If they differ, libva is somehow leaking state through a path we haven't traced.
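
The printk samples selected fields, while the locked question asks about byte-equality. One way to cover every byte, sketched here in user space, is to log a checksum of each struct alongside the sampled fields. The sketch uses 32-bit FNV-1a purely for illustration, and `struct demo_params` is a hypothetical stand-in, not the kernel's layout. In-kernel code would more likely use an existing helper such as `crc32_le()`; that choice is an assumption, not something this plan specifies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a over an arbitrary byte range. */
uint32_t fnv1a32(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;          /* FNV offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Hypothetical stand-in for one control struct; NOT the kernel layout. */
struct demo_params {
    uint8_t  nal_unit_type;
    uint8_t  slice_type;
    uint32_t flags;                    /* padding typically precedes this */
};

/* Checksum the whole struct, padding included. Callers should memset()
 * the struct to zero before filling it so padding bytes are deterministic. */
uint32_t demo_params_sum(const struct demo_params *p)
{
    return fnv1a32(p, sizeof(*p));
}
```

Two captures whose checksums match for every frame would support the "identical" outcome more strongly than matching sampled fields alone; a mismatch with matching sampled fields would point at a field the printk does not print.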
## Phase 5 review acknowledgment

This is a diagnostic-only kernel patch (single printk, no behavior change). No Phase 5 review is needed for the patch itself; the methodology is direct empirical comparison.

If the printk reveals a real bug, the actual fix will require Phase 5 architectural review per CLAUDE.md "reviews never skippable."

## Phase 7 plan

After 7.0-3 deploys:

1. Reboot fresnel (sddm autologin kicks in).
2. Clear the kernel log with `sudo dmesg -C`.
3. Run libva HEVC; capture dmesg; expect three frames' worth of printk lines.
4. Clear again with `sudo dmesg -C`.
5. Run kdirect HEVC; capture dmesg.
6. Diff the two dmesg captures.
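
A practical wrinkle in step 6: if dmesg prints its usual bracketed timestamps, every line differs between runs even when the payload is identical. A minimal sketch of a normalizer, assuming the default `[   12.345678] ` prefix format (the function name is hypothetical):

```c
#include <stddef.h>

/* Return a pointer past a leading "[  12.345678] " prefix, or the original
 * pointer when the line does not start with a bracketed numeric timestamp. */
const char *skip_dmesg_timestamp(const char *line)
{
    const char *p = line;

    if (*p != '[')
        return line;
    p++;
    while (*p == ' ')                  /* dmesg right-aligns the seconds */
        p++;
    if (*p < '0' || *p > '9')          /* e.g. "[drm] ..." is not a timestamp */
        return line;
    while ((*p >= '0' && *p <= '9') || *p == '.')
        p++;
    if (*p != ']')
        return line;
    p++;
    if (*p == ' ')                     /* drop the single separator space */
        p++;
    return p;
}
```

Applying this to each captured line before diffing leaves only the `rkvdec_hevc_run: ...` payload to compare.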
## What outcomes mean

- **Identical**: the kernel sees the same `run->*` data from both backends. Bug 4/5/6 is in rkvdec's `assemble_hw_*`/`config_registers`/HW path. Next step: instrument those.
- **Different**: libva somehow gets different struct contents into rkvdec despite identical ioctls. Next step: trace the non-ioctl paths listed above (vb2 allocator, buffer alignment, request_fd ordering).
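
The two outcomes can be decided mechanically. A small sketch, assuming both captures have already been normalized (timestamps removed) and split into lines; the function name is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Compare two normalized captures line by line.
 * Returns -1 when they are identical ("Identical" outcome), otherwise the
 * 0-based index of the first differing line ("Different" outcome). A length
 * mismatch counts as a difference at the index where the shorter one ends. */
long first_divergence(const char *const *a, size_t na,
                      const char *const *b, size_t nb)
{
    size_t n = na < nb ? na : nb;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(a[i], b[i]) != 0)
            return (long)i;
    }
    return na == nb ? -1 : (long)n;
}
```

Returning the index rather than a boolean gives the frame at which the two backends first disagree, which is the natural starting point for the "Different" branch.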