Markus Fritsche 53e247724f iter6 v2 attempt-2 close — 0004 headless-clean but compositor-wedges
Step 2.1 (0004 vb2 helper alone) on lockdep+ramoops kernel:
- Headless smoke test PASS: ffmpeg rc=0, 4147200B output, zero
  lockdep splat, iter4-DIAG validate_sps per-frame, iter5-IRQ DEC_RDY=1
- ~10 min later when GPU compositor + concurrent vb2 traffic active:
  silent wedge → panic_timeout=10 → boot loop
- PROVE_LOCKING + 9 other lock-debug flags DID NOT catch it before wedge

Architect H1 hypothesis (classical lock-order inversion lockdep models)
is incomplete. Refined hypothesis ladder for v3:
- H1' cross-context wait not modeled by lockdep (KCSAN)
- H2 use-after-free / double-signal of dma_fence (KASAN)
- H3 race in fence alloc/init under concurrency (KCSAN)
- H4 panthor consumer-side bug (separate non-vb2 test)

Sub-bug fixed during execution: forcibly-rebuild-all-vb2-consumers
required after 0004 (touch core.h + make -j8 modules), otherwise
ABI mismatch produces 0-byte ffmpeg output.

DT-based ramoops reservation works (memmap= cmdline is x86-only).
Verified 2465B+70818B ramoops capture across panic+reset.

Recovery via WeChat stick chroot — 4 modules restored to pre-base
md5s, extlinux default flipped to arch_devices vanilla. Ampere
confirmed clean.

v3 plan pending: add KASAN+KCSAN+DEBUG_PREEMPT, full rebuild,
intentionally reproduce wedge under compositor. ~75 min build.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 19:26:30 +00:00

ampere-kernel-decoders

Kernel-side sibling campaign to ampere-fourier (which validated the userspace libva backend on RK3588 at iter1 close, 3 codecs working) and fresnel-fourier (the RK3399 peer). Focuses on the kernel-side enablement required to unblock the three codec blockers iter1 of ampere-fourier surfaced:

  1. HEVC kernel OOPS in rkvdec_hevc_prepare_hw_st_rps (__pi_memcmp fault, cascades to v4l2_mem2mem wedge). Tracked at marfrit/kernel-agent#11 [ka:experiment].
  2. VP9 not exposed on RK3588 rkvdec — kernel doesn't register V4L2_PIX_FMT_VP9_FRAME on the VDPU381/383 variant_ops. Tracked at marfrit/kernel-agent#12 [ka:experiment].
  3. (AV1 backend iter39 is userspace work, not in this campaign's scope; tracked at marfrit/libva-v4l2-request-fourier#2.)

Topology

Property Value
Working tree (substrate kernel) boltzmann:~/src/linux-rockchip branch linux-rk3588-marfrit
Board patches boltzmann:~/src/misc_patches/genbook/kernel/000*.patch (6 board DTS/config patches; the ampere baseline package consumes these)
Experiment target host ampere (CoolPi CM5 GenBook, RK3588)
Baseline kernel for regression-checks linux-ampere-fourier 7.0rc3.kafr1-1 (vanilla torvalds v7.0-rc3 + genbook/kernel/000* only — NO codec patches)
Patch destination kernel-agent experiment branches per issue; promoted to linux-ampere-fourier only via explicit operator decision per feedback_characterize_before_change (campaign-experiment patches stay OUT of baseline)
Build host boltzmann (primary aarch64 builder per kernel-agent README) + ampere as fallback secondary
Patch landing pad marfrit/kernel-agent/patches/{soc,driver}/... scope-tagged per kernel-agent's tree convention

Scope (in flux until Phase 0 close)

In scope (iter1 candidate):

  • HEVC OOPS fix — minimal-blast-radius candidate patch for rkvdec_hevc_prepare_hw_st_rps, validated on ampere by re-running ampere-fourier Phase 3 with HEVC added.
  • Survey upstream prior art (linux-rockchip, linux-media, linux-mm, Kwiboo, Bootlin, Collabora) before writing any code — there may already be a fix in v7.0-rc4+ or in an out-of-tree branch.

Out of scope (likely iter2+):

  • VP9 enablement (broader work — VDPU381/383 variant_ops surface, DTS bindings, potentially backend codec table updates). Separate iteration.
  • Multi-core support for rkvdec/hantro (the "missing multi-core support, ignoring this instance" lines in dmesg). Upstream work; out of fleet-scope.

Explicitly NOT in scope:

  • Codec patches on the baseline linux-ampere-fourier package. Per operator policy 2026-05-16, ampere baseline stays clean mainline + board DTS. Codec enablement = kernel-agent experiment branches only.

Process

8(+1)-phase loop per feedback_dev_process.md, same as ampere-fourier / fresnel-fourier. Notable for this campaign:

  • Phase 0 includes an upstream prior-art survey (linux-rockchip + linux-media + linux-mm + Kwiboo + Bootlin). This is non-optional — writing a kernel patch without first checking whether the fix already exists upstream is the wrong workflow. Memory feedback_no_upstream says no PR/MR/RFC from us by default; that's about publishing, not consuming. Consume aggressively.
  • Phase 5 review uses the sonnet-architect subagent pattern (Plan with model: sonnet), same as ampere-fourier. Memory rule feedback_review_empirical_over_theoretical applies — test-compile reviewer-suggested struct/field mappings before adopting amendments.
  • Phase 6 implementation produces kernel patches in ~/src/linux-rockchip on boltzmann + accompanying kernel-agent experiment branch entries. No patches land in marfrit-packages/arch/linux-ampere-fourier/ directly.
  • Phase 7 verification re-runs ampere-fourier iter1's Phase 3 scripts with HEVC added to the codec list. Predicted outcome anchors against ampere-fourier iter1's baseline numbers.

Predecessor work this campaign builds on

  • ../ampere-fourier/ iter1 — the baseline floor this campaign regresses against. Re-anchoring to ampere-fourier's N=3 FPS numbers + per-codec SSIM floors for any codec that flips from "blocked" to "validated."
  • marfrit/kernel-agent — the experiment-branch home + issue tracker.
  • marfrit/kernel-agent/issues/11 — HEVC bug ticket with the OOPS trace + reproducer.
  • marfrit/kernel-agent/issues/12 — VP9 enablement ticket.
  • Memory feedback_rkvdec_patch_reachability — VDPU381/383 vs RK3399 legacy path boundary.
  • Memory feedback_characterize_before_change — campaign-process rule from ampere-fourier iter1.

Operator-facing repo URL

git.reauktion.de/marfrit/ampere-kernel-decoders — to be created at iter1 close if there's something publish-worthy (matches the fresnel-fourier / ampere-fourier convention). Local-only at ~/src/ampere-kernel-decoders/ on noether for now.

S
Description
RK3588 decoder enablement on ampere (CoolPi CM5 GenBook). Sibling kernel-side campaign to marfrit/ampere-fourier (userspace consumer). Meta-campaign coordinating HEVC OOPS investigation (iter2: F1 negative-result), VP9 enablement (iter4), AV1 backend probe (sibling). Process: 8(+1)-phase loop.
Readme 194 KiB
Languages
Shell 100%