dfebd8017f
Entry condition: iter2 F1 closed with deterministic x1=0x51a0
evidence + 'our new controls don't reach the kernel' strace.
Substrate:
- kernel source ampere:~/src/linux-rockchip @ ampere-minimal-devices
(same tree as boltzmann's linux-rk3588-marfrit branch)
- module-only rebuild path: rockchip_vdec.ko, ~30s on boltzmann
16-core, deploy via scp + rmmod/insmod cycle (no reboot needed)
5 open questions for Phase 1:
Q1 decode 0x51a0 (candidate: 261*80=sizeof × count?)
Q2 where does ctrl->p_cur.p = 0x51a0 happen? (printk every
assignment)
Q3 is ctx->has_sps_st_rps true even w/o backend S_EXT_CTRLS?
Q4 (CHEAPEST) why don't our new CIDs reach the kernel — log
h265_populate_ext_sps_rps_cache return path. NO KERNEL REBUILD.
Q4 first; informs all others.
Q5 RK3588 routes through vdpu381-hevc.c or vdpu383-hevc.c?
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Phase 0 — iter3 substrate (HEVC kernel-side investigation)
Opened 2026-05-16 evening, immediately following iter2's F1 close. Entry conditions are already concrete; this Phase 0 is brief.
Research question
What kernel-side state causes run->ext_sps_st_rps to deterministically equal 0x51a0 in rkvdec_hevc_prepare_hw_st_rps on ampere, and what's the minimal kernel patch that makes the kernel's HEVC RPS preparation safe against the userspace inputs ampere's libva backend actually supplies?
Locked-in evidence carried from iter2
| Observation | Source | Status |
|---|---|---|
| OOPS at `__pi_memcmp+0x10/0x110` called from `rkvdec_hevc_prepare_hw_st_rps+0x38/0x300` | ampere dmesg, 3 captures | reproducible 100 % |
| Faulting argument: `x1 = 0x51a0` (`run->ext_sps_st_rps`), `pgd=0` (no page-table mapping) | dmesg register dump | deterministic across reboots |
| `x0 = ffff000…` (valid kernel heap, the cache arg), `x2 = 0x48` (72 bytes) | same | normal-looking |
| Backend's S_EXT_CTRLS for `0xa40a98` (HEVC_EXT_SPS_ST_RPS) + `0xa40a99` (_LT_RPS) never appear in ioctl trace | iter2 strace (`/tmp/iter2_after.strace.*` on ampere) | confirmed |
| Backend's standard 5-control submission returns EINVAL with `error_idx=5` | same strace | kernel rejects whole batch |
| Kernel `ctx->has_sps_st_rps` only goes true via `\|= !!(ctrl->has_changed)` in `rkvdec.c::rkvdec_s_ctrl` | source-read | gate path identified |
| Kernel control descriptor for EXT_SPS_*_RPS declares `.cfg.dims = { 65 }` (dimensional-array, not plain compound) | `rkvdec.c::vdpu38x_hevc_ctrl_descs[]` | dynamic-array protocol semantics |
| Backend infrastructure landed: vendored GStreamer 1.28.2 parser, UAPI shim, per-fd probe, h265 set_controls gate. Build clean, install clean. | iter2 commits `f91c3f5..1a2c958` | reusable |
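A rough triage sketch for two rows of the table, in Python. It assumes the arm64 procedure-call convention (first three arguments in `x0`..`x2`, so the faulting call reads as `memcmp(x0, x1, x2)`), a 48-bit kernel virtual address layout in which kernel pointers carry `0xffff` in the top 16 bits, and the documented V4L2 meaning of `error_idx == count` for extended-control ioctls. The helper names are hypothetical and are not taken from the kernel or backend sources.

```python
# Triage helpers for the oops registers and the EINVAL batch.
# All names are illustrative; the address layout is an assumption (48-bit VA).


def classify_arm64_address(addr: int) -> str:
    """Coarse classification of a register value as a possible pointer."""
    if addr == 0:
        return "null"
    if (addr >> 48) == 0xFFFF:
        return "kernel"  # top 16 bits set: plausible kernel virtual address
    if addr < 0x10000:
        return "small-integer"  # looks like an offset or a size, not a pointer
    return "non-kernel"


def describe_memcmp_call(x0: int, x1: int, x2: int) -> str:
    """Render the three argument registers as the memcmp(a, b, n) call."""
    return (
        f"memcmp(a={x0:#x} [{classify_arm64_address(x0)}], "
        f"b={x1:#x} [{classify_arm64_address(x1)}], n={x2})"
    )


def interpret_error_idx(error_idx: int, count: int) -> str:
    """V4L2 ext-ctrl rule: error_idx == count blames no single control."""
    if error_idx == count:
        return "validation failed; no control was applied"
    if 0 <= error_idx < count:
        return f"control at index {error_idx} failed"
    return "unexpected error_idx"
```

Under these assumptions `classify_arm64_address(0x51a0)` returns `"small-integer"`, which matches the reading that `x1` holds something offset-like rather than a pointer, and `interpret_error_idx(5, 5)` matches the "kernel rejects whole batch" status in the table.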
Substrate
- Kernel source: `ampere:~/src/linux-rockchip` branch `ampere-minimal-devices`, tip `7c241f2e2835`. Identical mirror also at `boltzmann:~/src/linux-rockchip @ linux-rk3588-marfrit` (per ampere iter1 phase0).
- Target build artefact: `drivers/media/platform/rockchip/rkvdec/rockchip_vdec.ko` only; module-incremental rebuild, NOT full kernel. ~30 s on boltzmann's 16-core after first full pass.
- Module-deploy path: scp built `.ko` to ampere, `sudo rmmod rockchip_vdec; sudo insmod /tmp/rockchip_vdec.ko`. Avoids reboot (cheap iteration).
- Build invocation: kernel-agent dispatch OR hand-build via `make M=drivers/media/platform/rockchip/rkvdec modules` against pre-configured tree.
- dmesg capture path: `sudo dmesg --time-format=ctime | grep rkvdec` post-test.
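The iteration loop described by these bullets can be written down as a dry-run command list. This is a minimal sketch, assuming the build runs from `~/src/linux-rockchip` on the build host, that `ampere` is reachable over ssh under that name, and that `make -C <tree>` is an acceptable stand-in for running `make` inside the tree. The function name and its parameters are hypothetical; the sketch only assembles strings and executes nothing.

```python
import shlex

MODULE_DIR = "drivers/media/platform/rockchip/rkvdec"
MODULE_NAME = "rockchip_vdec"


def iteration_commands(build_tree: str = "~/src/linux-rockchip",
                       target_host: str = "ampere") -> list[str]:
    """Return the build, copy, reload and log-capture commands in order."""
    built_ko = f"{build_tree}/{MODULE_DIR}/{MODULE_NAME}.ko"
    remote_ko = f"/tmp/{MODULE_NAME}.ko"
    # Same rmmod/insmod pair as in the substrate notes, run on the target.
    reload_cmd = f"sudo rmmod {MODULE_NAME}; sudo insmod {remote_ko}"
    dmesg_cmd = "sudo dmesg --time-format=ctime | grep rkvdec"
    return [
        f"make -C {build_tree} M={MODULE_DIR} modules",
        f"scp {built_ko} {target_host}:{remote_ko}",
        # Quote the remote command so ';' and '|' reach the remote shell.
        f"ssh {target_host} {shlex.quote(reload_cmd)}",
        f"ssh {target_host} {shlex.quote(dmesg_cmd)}",
    ]


if __name__ == "__main__":
    for command in iteration_commands():
        print(command)
```

Keeping the loop as data rather than running it directly makes it easy to review each step before pasting it into a shell, which suits a path that unloads a live kernel module.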
Open questions tabled for Phase 1
- What concretely is `0x51a0`? Three candidate decompositions:
  - `0x51a0` = 20896. The `261 × 80` candidate (where 80 is `sizeof(struct v4l2_ctrl_hevc_ext_sps_st_rps)` per our header) is close but not exact: 261 × 80 = 20880 = `0x5190`, so `0x51a0` = 261 × 80 + 16.
  - `0x51a0` mod 8 = 0, mod 16 = 0: aligned; rules out "random heap fragment".
  - `0x51a0` ÷ 4 = `0x1468` (5224). Doesn't map to anything obvious yet.
  - Look for any kernel literal `0x51a0` or struct field that would be at that offset in `v4l2_ctrl` or `rkvdec_ctx`.
- Where does the `ctrl->p_cur.p = 0x51a0` assignment happen? Trace via printk: every `rkvdec_s_ctrl` call (does our backend's S_EXT_CTRLS hit this?), every `v4l2_ctrl_handler_init` + `v4l2_ctrl_new_custom` for the EXT_SPS_*_RPS controls (during driver probe), every assignment to `ctrl->p_cur.p` for these controls.
- Is `ctx->has_sps_st_rps` ever observably true on a backend that doesn't set these controls? Phase 1 hypothesis if yes: there's a synthetic `has_changed=true` set during ctrl_handler init for dimensional-array controls. If no, then we're hitting a different code path entirely (maybe an alternate `prepare_hw_st_rps` call site we haven't found).
- Why does our backend's S_EXT_CTRLS for the new CIDs not appear in strace? Cheap to diagnose: add `request_log` inside `h265_populate_ext_sps_rps_cache` to print return code + source_data SPS-NAL-found status. Doesn't require kernel rebuild. Do this FIRST in Phase 6: it answers a question that's orthogonal to the kernel-side instrumentation but informs the eventual fix path.
- What other rkvdec drivers exist in this kernel source that could be the actual run-target? ampere has `rkvdec-vdpu381-hevc.c` AND `rkvdec-vdpu383-hevc.c`; both call `rkvdec_hevc_assemble_hw_rps`. Which one fires on RK3588 (which variant is the CoolPi GenBook)? Phase 2 source-read.
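The arithmetic behind the first question can be checked directly. The sketch below, in Python, takes the 80-byte struct size quoted above as given (it is not verified against the header here) and uses hypothetical helper names. It computes the quotient and remainder for an element-size candidate and the prime factorisation of `0x51a0`, which limits what element sizes could divide it exactly.

```python
ADDR = 0x51A0
ST_RPS_SIZE = 80  # sizeof(struct v4l2_ctrl_hevc_ext_sps_st_rps), as quoted above


def decompose(addr: int, elem_size: int) -> tuple[int, int]:
    """Return (whole elements, leftover bytes) for addr as an array offset."""
    return divmod(addr, elem_size)


def prime_factors(n: int) -> dict[int, int]:
    """Trial-division factorisation: maps each prime to its exponent."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


if __name__ == "__main__":
    print(decompose(ADDR, ST_RPS_SIZE))
    print(prime_factors(ADDR))
```

With these inputs `decompose(ADDR, ST_RPS_SIZE)` gives `(261, 16)`: 261 whole 80-byte elements plus 16 bytes, not an exact multiple. The factorisation is 2^5 × 653, so an element size with a factor of 3, 5 or 7 cannot divide `0x51a0` evenly; a candidate of the form "array offset plus a 16-byte field or header" is one reading worth checking against the struct layouts.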
Phase 0 close
Substrate locked. iter2's evidence is the binding-cell starting condition. Five open questions for Phase 1 to lock; Q4 (cheap backend log) gates the other questions and goes first in Phase 6.