Markus Fritsche 11d2dde8ab iter6 post-mortem Phase 0: substrate + safeguards
Locks in forensics from boot -9..-1 journal:
- Silent watchdog reset, no oops/panic in logs
- vblank/edp WARNs are pre-existing (also fire on recovered boot 0)
- vb2_buffer_attach_release_fence / dma_fence / dma_resv NEVER
  appear in iter6-boot kernel logs: the deadlock sits at a level
  where the kernel can't reach printk
- Hardware Synopsys DesignWare watchdog is the reset mechanism
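
The negative finding above (no fence-path symbols in any iter6 boot) could be re-checked along these lines. This is a minimal sketch, assuming journald uses persistent storage so that boots -9..-1 are still retained; the function names count_fence_hits and scan_boots are illustrative and not part of the campaign's scripts.

```shell
#!/bin/sh
# Count lines on stdin that mention any of the fence-path symbols.
# grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
count_fence_hits() {
    grep -c -E 'vb2_buffer_attach_release_fence|dma_fence|dma_resv' || true
}

# Walk the kernel logs of boots -9..-1 and print one summary line per boot.
# Assumes persistent journald storage; a missing boot simply reports 0 hits.
scan_boots() {
    b=9
    while [ "$b" -ge 1 ]; do
        hits=$(journalctl -k --boot="-$b" --no-pager 2>/dev/null | count_fence_hits)
        printf 'boot -%s: %s fence-related lines\n' "$b" "$hits"
        b=$((b - 1))
    done
}
```

Running scan_boots on the target host prints nine lines; a count of 0 on every boot would match the finding recorded above.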

Six non-negotiable safeguards for any retry:
1. backup .ko AND off-device archive before sudo install
2. CONFIG_PROVE_LOCKING + DEBUG_ATOMIC_SLEEP + LOCKDEP etc
3. bisect-apply one patch at a time, reboot+test between
4. SDDM auto-login OFF (done — file renamed .disabled-iter6postmortem)
5. pstore.backend=ramoops to capture kernel oops across reset
6. Phase 5 architect review of plan + 0007 source before apply
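
Safeguards 1 and 2 lend themselves to a small pre-install check. The following is a hedged sketch only: the helper names backup_module and check_debug_config are hypothetical, the off-device copy is left as a comment because the archive host is not named in the source, and the config file is assumed to be a plain-text kernel .config.

```shell
#!/bin/sh
# backup_module SRC DEST_DIR: copy a .ko into DEST_DIR and record its checksum.
# The off-device archive step (e.g. copying DEST_DIR to another machine)
# is intentionally not shown; the destination host is an open choice.
backup_module() {
    src=$1
    dest_dir=$2
    base=$(basename "$src")
    mkdir -p "$dest_dir" || return 1
    cp -p "$src" "$dest_dir/$base" || return 1
    ( cd "$dest_dir" && sha256sum "$base" > "$base.sha256" )
}

# check_debug_config CONFIG_FILE: list lock-debugging options not set to y.
# Returns non-zero if any of them is missing.
check_debug_config() {
    cfg=$1
    missing=0
    for opt in CONFIG_PROVE_LOCKING CONFIG_DEBUG_ATOMIC_SLEEP CONFIG_LOCKDEP; do
        if ! grep -q "^${opt}=y" "$cfg"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

A retry would call check_debug_config on the build tree's .config before packaging, and backup_module on the currently installed module before any sudo install.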

Four gating questions for Phase 1, starting with bisect:
- which of 4 patches is the actual vector
- lockdep splat hidden by CONFIG_PROVE_LOCKING=n
- why no oops in journal
- producer-side fence-alloc hang vs consumer-side wait hang
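
The one-patch-at-a-time bisect can be kept honest with a tiny state file that survives reboots. This is a sketch under stated assumptions: the series directory layout, the state file, and the helper names next_patch and mark_applied are all hypothetical.

```shell
#!/bin/sh
# next_patch SERIES_DIR STATE_FILE: print the first *.patch in SERIES_DIR
# (glob order) whose file name is not yet listed in STATE_FILE.
# Returns non-zero once every patch has been recorded as applied.
next_patch() {
    series_dir=$1
    state=$2
    touch "$state"
    for p in "$series_dir"/*.patch; do
        [ -e "$p" ] || continue
        name=$(basename "$p")
        if ! grep -qxF "$name" "$state"; then
            echo "$name"
            return 0
        fi
    done
    return 1
}

# mark_applied NAME STATE_FILE: record NAME as applied.
mark_applied() {
    echo "$1" >> "$2"
}
```

One cycle would then be: take the name from next_patch, apply that single patch (for example with git am), call mark_applied, rebuild, reboot, and test before asking for the next one.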

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 16:02:09 +00:00

ampere-kernel-decoders

Kernel-side sibling campaign to ampere-fourier (which validated the userspace libva backend on RK3588 at iter1 close, with 3 codecs working) and fresnel-fourier (the RK3399 peer). It focuses on the kernel-side enablement required to clear the three codec blockers that iter1 of ampere-fourier surfaced:

  1. HEVC kernel OOPS in rkvdec_hevc_prepare_hw_st_rps (__pi_memcmp fault, cascades to v4l2_mem2mem wedge). Tracked at marfrit/kernel-agent#11 [ka:experiment].
  2. VP9 not exposed on RK3588 rkvdec — kernel doesn't register V4L2_PIX_FMT_VP9_FRAME on the VDPU381/383 variant_ops. Tracked at marfrit/kernel-agent#12 [ka:experiment].
  3. (AV1 backend iter39 is userspace work, not in this campaign's scope; tracked at marfrit/libva-v4l2-request-fourier#2.)

Topology

Property | Value
Working tree (substrate kernel) | boltzmann:~/src/linux-rockchip, branch linux-rk3588-marfrit
Board patches | boltzmann:~/src/misc_patches/genbook/kernel/000*.patch (6 board DTS/config patches; the ampere baseline package consumes these)
Experiment target host | ampere (CoolPi CM5 GenBook, RK3588)
Baseline kernel for regression-checks | linux-ampere-fourier 7.0rc3.kafr1-1 (vanilla torvalds v7.0-rc3 + genbook/kernel/000* only; NO codec patches)
Patch destination | kernel-agent experiment branches per issue; promoted to linux-ampere-fourier only via explicit operator decision per feedback_characterize_before_change (campaign-experiment patches stay OUT of baseline)
Build host | boltzmann (primary aarch64 builder per kernel-agent README) + ampere as fallback secondary
Patch landing pad | marfrit/kernel-agent/patches/{soc,driver}/... scope-tagged per kernel-agent's tree convention

Scope (in flux until Phase 0 close)

In scope (iter1 candidate):

  • HEVC OOPS fix — minimal-blast-radius candidate patch for rkvdec_hevc_prepare_hw_st_rps, validated on ampere by re-running ampere-fourier Phase 3 with HEVC added.
  • Survey upstream prior art (linux-rockchip, linux-media, linux-mm, Kwiboo, Bootlin, Collabora) before writing any code — there may already be a fix in v7.0-rc4+ or in an out-of-tree branch.
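
The prior-art survey can start as a commit-message search over whichever upstream trees are fetched locally. This is a minimal sketch: survey_prior_art is a hypothetical helper, and the revision range, search pattern, and driver path passed to it are assumptions the operator has to supply.

```shell
#!/bin/sh
# survey_prior_art REPO RANGE PATTERN [PATH...]
# List non-merge commits in RANGE whose message matches PATTERN
# (case-insensitive), optionally limited to the given paths.
survey_prior_art() {
    repo=$1
    range=$2
    pattern=$3
    shift 3
    git -C "$repo" log --oneline --no-merges -i --grep="$pattern" "$range" -- "$@"
}

# Example call (range and path are placeholders, not verified locations):
#   survey_prior_art ~/src/linux-rockchip v7.0-rc3..FETCH_HEAD 'st_rps' path/to/rkvdec
```

An empty result is not proof that no fix exists, since a fix may not name the function in its message; it only narrows where to read next.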

Out of scope (likely iter2+):

  • VP9 enablement (broader work — VDPU381/383 variant_ops surface, DTS bindings, potentially backend codec table updates). Separate iteration.
  • Multi-core support for rkvdec/hantro (the "missing multi-core support, ignoring this instance" lines in dmesg). Upstream work; out of fleet-scope.

Explicitly NOT in scope:

  • Codec patches on the baseline linux-ampere-fourier package. Per operator policy 2026-05-16, ampere baseline stays clean mainline + board DTS. Codec enablement = kernel-agent experiment branches only.

Process

8(+1)-phase loop per feedback_dev_process.md, same as ampere-fourier / fresnel-fourier. Notable for this campaign:

  • Phase 0 includes an upstream prior-art survey (linux-rockchip + linux-media + linux-mm + Kwiboo + Bootlin + Collabora). This is non-optional: writing a kernel patch without first checking whether the fix already exists upstream is the wrong workflow. Memory feedback_no_upstream says no PR/MR/RFC from us by default; that rule is about publishing, not consuming. Consume aggressively.
  • Phase 5 review uses the sonnet-architect subagent pattern (Plan with model: sonnet), same as ampere-fourier. Memory rule feedback_review_empirical_over_theoretical applies — test-compile reviewer-suggested struct/field mappings before adopting amendments.
  • Phase 6 implementation produces kernel patches in ~/src/linux-rockchip on boltzmann + accompanying kernel-agent experiment branch entries. No patches land in marfrit-packages/arch/linux-ampere-fourier/ directly.
  • Phase 7 verification re-runs ampere-fourier iter1's Phase 3 scripts with HEVC added to the codec list. Predicted outcome anchors against ampere-fourier iter1's baseline numbers.
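
Anchoring Phase 7 results against baseline numbers amounts to a per-codec floor comparison. The sketch below assumes a simple "codec measured floor" text format on stdin; that format and the helper names meets_floor and report_floors are hypothetical, and no real baseline values are implied.

```shell
#!/bin/sh
# meets_floor MEASURED FLOOR: succeed if MEASURED >= FLOOR.
# awk does the comparison because POSIX sh has no floating-point arithmetic.
meets_floor() {
    awk -v m="$1" -v f="$2" 'BEGIN { exit !(m + 0 >= f + 0) }'
}

# report_floors: read "codec measured floor" lines from stdin, print a
# verdict per codec, and return non-zero if any codec is below its floor.
report_floors() {
    status=0
    while read -r codec measured floor; do
        if meets_floor "$measured" "$floor"; then
            echo "$codec ok"
        else
            echo "$codec below-floor"
            status=1
        fi
    done
    return "$status"
}
```

The same comparison works for FPS and for SSIM, since both are "higher is better" metrics with a floor.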

Predecessor work this campaign builds on

  • ../ampere-fourier/ iter1 is the baseline floor this campaign regresses against. Any codec that flips from "blocked" to "validated" is re-anchored to ampere-fourier's N=3 FPS numbers and per-codec SSIM floors.
  • marfrit/kernel-agent — the experiment-branch home + issue tracker.
  • marfrit/kernel-agent/issues/11 — HEVC bug ticket with the OOPS trace + reproducer.
  • marfrit/kernel-agent/issues/12 — VP9 enablement ticket.
  • Memory feedback_rkvdec_patch_reachability — VDPU381/383 vs RK3399 legacy path boundary.
  • Memory feedback_characterize_before_change — campaign-process rule from ampere-fourier iter1.

Operator-facing repo URL

git.reauktion.de/marfrit/ampere-kernel-decoders — to be created at iter1 close if there's something publish-worthy (matches the fresnel-fourier / ampere-fourier convention). Local-only at ~/src/ampere-kernel-decoders/ on noether for now.

Description
RK3588 decoder enablement on ampere (CoolPi CM5 GenBook). Sibling kernel-side campaign to marfrit/ampere-fourier (userspace consumer). Meta-campaign coordinating HEVC OOPS investigation (iter2: F1 negative-result), VP9 enablement (iter4), AV1 backend probe (sibling). Process: 8(+1)-phase loop.