Commit 63c4db0095ca on boltzmann vp9-enablement-iter1 branch adds the full register-packing path: RCB addresses, block clock-gating defaults, frame-area timeout threshold (with rkvdec_schedule_watchdog), 8-segment packing helper, ref/mode-deltas bit-packing into 28/14-bit combined fields, and first-cut probability storage aliasing. Branch is now at 4 commits, 1390 LoC across 4 files. Module compiles clean against 7.0.0-rc3 ARM64 kernel. What remains potentially-needed for first-light is in reg103_frame_flags (prob_update_en, ref/mode/single/comp refresh enables, etc.) — currently zero-init; will tune in Phase 6 if byte-compare diverges from SW. Phase 3 (hardware install + first decode) is the natural inflection point. Module artifact is at boltzmann:~/src/linux-rockchip/drivers/media /platform/rockchip/rkvdec/rockchip-vdec.ko, ready to install on ampere after backing up the current sibling-campaign module. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
4.5 KiB
Phase 2.1 close — 5 register-packing TODOs implemented
Date: 2026-05-17 ~03:10. Phase 2.1 closed in the same session as Phase 2 — module builds clean with the full register-packing path.
Commit on boltzmann:~/src/linux-rockchip:vp9-enablement-iter1
63c4db0095ca — media: rkvdec: VP9 (vdpu381): implement Phase 2.1 register-packing TODOs (+128, -10)
Stacks on Phase 2's 3 commits (47431635801d, da8482271938, 71cc0d96d212). Total branch delta now 1390 LoC across 4 files.
What changed
| TODO | Implementation |
|---|---|
| 1. RCB addresses | for (i = 0; i < rkvdec_rcb_buf_count(ctx); i++) regs->common_addr.rcb_base[i] = rkvdec_rcb_buf_dma_addr(ctx, i); — direct mirror of vdpu381-HEVC pattern |
| 2. Block clock-gating defaults | All reg026_block_gating_en.* fields set from HEVC defaults (busifd_auto_gating_e = 0 per pattern, rest = 1) |
| 3. Frame-area timeout threshold | RKVDEC_TIMEOUT_1080p/_4K/_8K/_MAX based on pixel count; rkvdec_schedule_watchdog() replaces the fixed 2-second schedule_delayed_work |
| 4. Seg-register packing | vdpu381_config_seg_register(vp9_ctx, segid) helper called 8× — full port of legacy config_seg_registers to reg067_074_segid[] |
| 5. Loop-filter bit-packing | ref_deltas[0..3] → 28-bit packed into reg094.ref_deltas_lastframe; mode_deltas[0..1] → 14-bit packed into reg075.mode_deltas_lastframe (using (x & 0x7f) << (i*7)) |
| Bonus: probability storage | reg160_delta_prob_base, reg162_last_prob_base, reg172_update_prob_wr_base aliased to priv_tbl.probs for first-cut; per-slot rotation deferred to Phase 6 if adaptive entropy diverges |
| Bonus: prob_idx | reg028.vp9_rd_prob_idx = vp9_wr_prob_idx = 0 first-cut; BSP rotates across frames |
Build verification
boltzmann$ PATH=/home/mfritsche/bin:$PATH make -j$(nproc) \
ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
drivers/media/platform/rockchip/rkvdec/
CC [M] drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-vp9.o
LD [M] drivers/media/platform/rockchip/rkvdec/rockchip-vdec.o
Clean — zero warnings, zero errors.
What remains potentially-needed for first-light decode
| Field | Risk | Mitigation |
|---|---|---|
reg103_frame_flags.txfmmode_rfsh_en, ref_mode_rfsh_en, single_ref_rfsh_en, comp_ref_rfsh_en, inter_coef_rfsh_flag, last_key_frame_flag, prob_update_en, prob_save_en |
Frame may decode but adaptive entropy may diverge from SW after a few frames | Currently zero-init. Verify via byte-compare on frames 1-5 vs SW reference. If divergence, mirror BSP vp9d_vdpu382_gen_regs setting logic |
| Probability slot rotation | First-cut single-slot may corrupt frame_context across multiple inter frames | Watch dmesg + byte-compare frames 2+ vs SW reference |
| RCB sizing | HEVC-tuned for 8K, oversized for 4K VP9 | Wastes SRAM but doesn't fail; tune in Phase 6 |
Phase 3 entry point (unchanged, hardware-domain)
Phase 3 needs hardware:
- Retry
git push gitea vp9-enablement-iter1(or pushlinux-rk3588-marfritbase first) - Backup ampere's current
/lib/modules/7.0.0-rc3-devices+/kernel/drivers/media/platform/rockchip/rkvdec/rockchip-vdec.ko(with off-device archive per feedback_backup_before_module_replace) scpboltzmann'srockchip-vdec.koto ampere;modprobe -r/modprobecycle (no reboot needed — kernel decoder module hot-swappable)- Sanity:
v4l2-ctl --device=/dev/video0 --list-formats-outshowsVP9Falongside H264_SLICE + HEVC_SLICE - Pre-flight IRQ diagnostic:
pr_warn("vp9 irq status=0x%x\n", status)invdpu381_irq_handlerfor first 5 decodes (Janet Amendment C) - First decode attempt:
LIBVA_DRIVER_NAME=v4l2_request ffmpeg -hwaccel vaapi -i bbb_30s_720p.vp9.webm -vf hwdownload,format=nv12 -frames:v 5 -f rawvideo /tmp/hw-vp9.nv12 - Watch dmesg for IRQ status, watchdog firing, error patterns
- If first-light works: SW byte-compare on frame 100 against
ffmpeg -c:v vp9reference
Tonight's campaign summary
Started: ampere-vp9-enablement campaign opened at ~01:00, sibling to ampere-kernel-decoders close. Closed: 4 commits on branch + 7 docs in campaign repo, module compiles clean on RK3588 target kernel.
Wall-time: ~2h15m for Phase 0 (substrate) + Phase 1 (plan + 2 architect reviews) + Phase 2 (1400 LoC implementation) + Phase 2.1 (full register-packing). All architecturally-correct work; bit-level register correctness verified next session against hardware.
Next session continues at the hardware testing inflection point.