From 9092d9aaaa9f9d64ed68d9fa9cd1b936615a334b Mon Sep 17 00:00:00 2001 From: Markus Fritsche Date: Mon, 18 May 2026 12:56:15 +0000 Subject: [PATCH 1/2] patches/driver/media: import Sarma's VP9-VDPU381 series (out-of-tree, v8) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three patches from D.V.A.B. Sarma adding VP9 decode support to the VDPU381 variant of rkvdec (RK3588 generation). Combined ~1500 LOC, 5 new files in drivers/media/platform/rockchip/rkvdec/. Provenance: github.com/dvab-sarma/android_kernel_rk_opi branch add-rkvdec-vdpu381-vp9-v8. Collabora's blog cites the work but it hasn't reached linux-media patchwork yet (Collabora: "v1 series needs to be sent for review soon"). Casanova's underlying VDPU381/VDPU383 H.264+HEVC base IS in mainline 7.0 release. Tested by author on Orange Pi 5 Pro (RK3588) with AOSP 16 + FFMPEG, Profile 0 + Profile 2. Tested in our fleet 2026-05-18: cherry-picks cleanly on top of ampere-minimal-devices, full kernel build (KERNELRELEASE 7.0.0-rc3-vp9-test+) succeeds clean with GCC 16.1.1. Image + DTB + modules + initramfs installed under -vp9-test+ suffix on ampere without touching the running -devices+ kernel; new extlinux label arch_vp9_test added (default unchanged at arch_devices). End-to-end VP9 decode verification pending operator reboot into the new label. Patches NOT yet referenced from fleet/ampere.yaml — that bump is the operator's call (manifest preamble currently scopes VP9 out per issue #6). Once verified, ampere.yaml can add these three under the scope-tagged patch list in apply order 0001→0002→0003. Cross-reference: marfrit/kernel-agent#12. --- ...ename-get_ref_buf-to-get_ref_buf_vp9.patch | 48 + ...ec-move-vp9-functions-to-common-file.patch | 387 +++++ ...-add-vp9-support-for-vdpu381-variant.patch | 1405 +++++++++++++++++ patches/driver/media/README.md | 69 + 4 files changed, 1909 insertions(+) create mode 100644 patches/driver/media/0001-rkvdec-vp9-rename-get_ref_buf-to-get_ref_buf_vp9.patch create mode 100644 patches/driver/media/0002-rkvdec-move-vp9-functions-to-common-file.patch create mode 100644 patches/driver/media/0003-rkvdec-add-vp9-support-for-vdpu381-variant.patch create mode 100644 patches/driver/media/README.md diff --git a/patches/driver/media/0001-rkvdec-vp9-rename-get_ref_buf-to-get_ref_buf_vp9.patch b/patches/driver/media/0001-rkvdec-vp9-rename-get_ref_buf-to-get_ref_buf_vp9.patch new file mode 100644 index 0000000..39dd5a3 --- /dev/null +++ b/patches/driver/media/0001-rkvdec-vp9-rename-get_ref_buf-to-get_ref_buf_vp9.patch @@ -0,0 +1,48 @@ +From 9ddcae54a171f2fc7742e92e03b1478d87ae4bbb Mon Sep 17 00:00:00 2001 +From: Venkata Atchuta Bheemeswara Sarma Darbha +Date: Sat, 17 Jan 2026 14:27:22 -0600 +Subject: [PATCH 1/3] media: rkvdec: vp9: Changing get_ref_buf function name to + get_ref_buf_vp9 + +This change is in preparation for the upcoming commits and to denote that this function is not to be confused with the similar function found in rkvdec's hevc. + +Change-Id: I934684778c375c6960a19989a702be44655c55d6 +Signed-off-by: Venkata Atchuta Bheemeswara Sarma Darbha +(cherry picked from commit f60174f07d9c56e7499ca3111d0999e26444cdfd) +--- + drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c | 10 +++++----- + 1 file changed, 5 insertions(+), 5 deletions(-) + +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c +index e4cdd2122873..bab2e9c83d06 100644 +--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c +@@ -349,7 +349,7 @@ static void init_probs(struct rkvdec_ctx *ctx, + } + + static struct rkvdec_decoded_buffer * +-get_ref_buf(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp) ++get_ref_buf_vp9(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp) + { + struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; + struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q; +@@ -489,12 +489,12 @@ static void config_registers(struct rkvdec_ctx *ctx, + + dec_params = run->decode_params; + dst = vb2_to_rkvdec_decoded_buf(&run->base.bufs.dst->vb2_buf); +- ref_bufs[0] = get_ref_buf(ctx, &dst->base.vb, dec_params->last_frame_ts); +- ref_bufs[1] = get_ref_buf(ctx, &dst->base.vb, dec_params->golden_frame_ts); +- ref_bufs[2] = get_ref_buf(ctx, &dst->base.vb, dec_params->alt_frame_ts); ++ ref_bufs[0] = get_ref_buf_vp9(ctx, &dst->base.vb, dec_params->last_frame_ts); ++ ref_bufs[1] = get_ref_buf_vp9(ctx, &dst->base.vb, dec_params->golden_frame_ts); ++ ref_bufs[2] = get_ref_buf_vp9(ctx, &dst->base.vb, dec_params->alt_frame_ts); + + if (vp9_ctx->last.valid) +- last = get_ref_buf(ctx, &dst->base.vb, vp9_ctx->last.timestamp); ++ last = get_ref_buf_vp9(ctx, &dst->base.vb, vp9_ctx->last.timestamp); + else + last = dst; + +-- +2.54.0 + diff --git a/patches/driver/media/0002-rkvdec-move-vp9-functions-to-common-file.patch b/patches/driver/media/0002-rkvdec-move-vp9-functions-to-common-file.patch new file mode 100644 index 0000000..ddcb5db --- /dev/null +++ b/patches/driver/media/0002-rkvdec-move-vp9-functions-to-common-file.patch @@ -0,0 +1,387 @@ +From c5063d93e0e6011abe91418a98ed7c7550f0391b Mon Sep 17 00:00:00 2001 +From: Venkata Atchuta Bheemeswara Sarma Darbha +Date: Sat, 17 Jan 2026 14:37:07 -0600 +Subject: [PATCH 2/3] media: rkvdec: Move vp9 functions to common file This is + a preparation commit to add support for new variants of the decoder. + +The functions will later be shared with vdpu381 (rk3588). + +Change-Id: Ib9b78331fb6eb0e3a607b06fd5138fc741b2c9c0 +Signed-off-by: Venkata Atchuta Bheemeswara Sarma Darbha +(cherry picked from commit e87662ca32e88ebb910f6cfc1c71096d5d7bc063) +--- + .../media/platform/rockchip/rkvdec/Makefile | 1 + + .../rockchip/rkvdec/rkvdec-vp9-common.c | 77 +++++++++++ + .../rockchip/rkvdec/rkvdec-vp9-common.h | 95 +++++++++++++ + .../platform/rockchip/rkvdec/rkvdec-vp9.c | 125 +----------------- + 4 files changed, 174 insertions(+), 124 deletions(-) + create mode 100644 drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.c + create mode 100644 drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.h + +diff --git a/drivers/media/platform/rockchip/rkvdec/Makefile b/drivers/media/platform/rockchip/rkvdec/Makefile +index e629d571e4d8..2bbd67b2db11 100644 +--- a/drivers/media/platform/rockchip/rkvdec/Makefile ++++ b/drivers/media/platform/rockchip/rkvdec/Makefile +@@ -12,4 +12,5 @@ rockchip-vdec-y += \ + rkvdec-vdpu381-hevc.o \ + rkvdec-vdpu383-h264.o \ + rkvdec-vdpu383-hevc.o \ ++ rkvdec-vp9-common.o \ + rkvdec-vp9.o +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.c +new file mode 100644 +index 000000000000..93023737c1ed +--- /dev/null ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.c +@@ -0,0 +1,77 @@ ++// SPDX-License-Identifier: GPL-2.0 ++/* ++ * Rockchip video decoder VP9 common functions ++ * ++ * Copyright (C) 2019 Collabora, Ltd. ++ * Boris Brezillon ++ * Copyright (C) 2021 Collabora, Ltd. ++ * Andrzej Pietrasiewicz ++ * ++ * Copyright (C) 2016 Rockchip Electronics Co., Ltd. ++ * Alpha Lin ++ */ ++#include ++#include ++#include ++ ++#include "rkvdec.h" ++#include "rkvdec-vp9-common.h" ++ ++void write_coeff_plane(const u8 coef[6][6][3], u8 *coeff_plane) ++{ ++ unsigned int idx = 0, byte_count = 0; ++ int k, m, n; ++ u8 p; ++ ++ for (k = 0; k < 6; k++) { ++ for (m = 0; m < 6; m++) { ++ for (n = 0; n < 3; n++) { ++ p = coef[k][m][n]; ++ coeff_plane[idx++] = p; ++ byte_count++; ++ if (byte_count == 27) { ++ idx += 5; ++ byte_count = 0; ++ } ++ } ++ } ++ } ++} ++ ++struct rkvdec_decoded_buffer * ++get_ref_buf_vp9(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp) ++{ ++ struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; ++ struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q; ++ struct vb2_buffer *buf; ++ ++ /* ++ * If a ref is unused or invalid, address of current destination ++ * buffer is returned. ++ */ ++ buf = vb2_find_buffer(cap_q, timestamp); ++ if (!buf) ++ buf = &dst->vb2_buf; ++ ++ return vb2_to_rkvdec_decoded_buf(buf); ++} ++ ++dma_addr_t get_mv_base_addr(struct rkvdec_decoded_buffer *buf) ++{ ++ unsigned int aligned_pitch, aligned_height, yuv_len; ++ ++ aligned_height = round_up(buf->vp9.height, 64); ++ aligned_pitch = round_up(buf->vp9.width * buf->vp9.bit_depth, 512) / 8; ++ yuv_len = (aligned_height * aligned_pitch * 3) / 2; ++ ++ return vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0) + ++ yuv_len; ++} ++ ++void update_dec_buf_info(struct rkvdec_decoded_buffer *buf, ++ const struct v4l2_ctrl_vp9_frame *dec_params) ++{ ++ buf->vp9.width = dec_params->frame_width_minus_1 + 1; ++ buf->vp9.height = dec_params->frame_height_minus_1 + 1; ++ buf->vp9.bit_depth = dec_params->bit_depth; ++} +\ No newline at end of file +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.h +new file mode 100644 +index 000000000000..056842cf1bba +--- /dev/null ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9-common.h +@@ -0,0 +1,95 @@ ++// SPDX-License-Identifier: GPL-2.0 ++/* ++ * Rockchip video decoder VP9 common functions ++ * ++ * Copyright (C) 2019 Collabora, Ltd. ++ * Boris Brezillon ++ * Copyright (C) 2021 Collabora, Ltd. ++ * Andrzej Pietrasiewicz ++ * ++ * Copyright (C) 2016 Rockchip Electronics Co., Ltd. ++ * Alpha Lin ++ */ ++ ++#include ++#include ++#include ++ ++#include "rkvdec.h" ++ ++struct rkvdec_vp9_run { ++ struct rkvdec_run base; ++ const struct v4l2_ctrl_vp9_frame *decode_params; ++}; ++ ++struct rkvdec_vp9_intra_mode_probs { ++ u8 y_mode[105]; ++ u8 uv_mode[23]; ++}; ++ ++struct rkvdec_vp9_intra_only_frame_probs { ++ u8 coef_intra[4][2][128]; ++ struct rkvdec_vp9_intra_mode_probs intra_mode[10]; ++}; ++ ++struct rkvdec_vp9_inter_frame_probs { ++ u8 y_mode[4][9]; ++ u8 comp_mode[5]; ++ u8 comp_ref[5]; ++ u8 single_ref[5][2]; ++ u8 inter_mode[7][3]; ++ u8 interp_filter[4][2]; ++ u8 padding0[11]; ++ u8 coef[2][4][2][128]; ++ u8 uv_mode_0_2[3][9]; ++ u8 padding1[5]; ++ u8 uv_mode_3_5[3][9]; ++ u8 padding2[5]; ++ u8 uv_mode_6_8[3][9]; ++ u8 padding3[5]; ++ u8 uv_mode_9[9]; ++ u8 padding4[7]; ++ u8 padding5[16]; ++ struct { ++ u8 joint[3]; ++ u8 sign[2]; ++ u8 classes[2][10]; ++ u8 class0_bit[2]; ++ u8 bits[2][10]; ++ u8 class0_fr[2][2][3]; ++ u8 fr[2][3]; ++ u8 class0_hp[2]; ++ u8 hp[2]; ++ u8 padding6[3]; ++ } mv; ++}; ++ ++struct rkvdec_vp9_probs { ++ u8 partition[16][3]; ++ u8 pred[3]; ++ u8 tree[7]; ++ u8 skip[3]; ++ u8 tx32[2][3]; ++ u8 tx16[2][2]; ++ u8 tx8[2][1]; ++ u8 is_inter[4]; ++ /* 128 bit alignment */ ++ u8 padding0[3]; ++ union { ++ struct rkvdec_vp9_inter_frame_probs inter; ++ struct rkvdec_vp9_intra_only_frame_probs intra_only; ++ }; ++ /* 128 bit alignment */ ++ u8 padding1[8]; ++}; ++ ++ ++void write_coeff_plane(const u8 coef[6][6][3], u8 *coeff_plane); ++ ++struct rkvdec_decoded_buffer * ++get_ref_buf_vp9(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp); ++ ++dma_addr_t get_mv_base_addr(struct rkvdec_decoded_buffer *buf); ++ ++void update_dec_buf_info(struct rkvdec_decoded_buffer *buf, ++ const struct v4l2_ctrl_vp9_frame *dec_params); +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c +index bab2e9c83d06..2b368d7b61e0 100644 +--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c +@@ -23,71 +23,12 @@ + + #include "rkvdec.h" + #include "rkvdec-regs.h" ++#include "rkvdec-vp9-common.h" + + #define RKVDEC_VP9_PROBE_SIZE 4864 + #define RKVDEC_VP9_COUNT_SIZE 13232 + #define RKVDEC_VP9_MAX_SEGMAP_SIZE 73728 + +-struct rkvdec_vp9_intra_mode_probs { +- u8 y_mode[105]; +- u8 uv_mode[23]; +-}; +- +-struct rkvdec_vp9_intra_only_frame_probs { +- u8 coef_intra[4][2][128]; +- struct rkvdec_vp9_intra_mode_probs intra_mode[10]; +-}; +- +-struct rkvdec_vp9_inter_frame_probs { +- u8 y_mode[4][9]; +- u8 comp_mode[5]; +- u8 comp_ref[5]; +- u8 single_ref[5][2]; +- u8 inter_mode[7][3]; +- u8 interp_filter[4][2]; +- u8 padding0[11]; +- u8 coef[2][4][2][128]; +- u8 uv_mode_0_2[3][9]; +- u8 padding1[5]; +- u8 uv_mode_3_5[3][9]; +- u8 padding2[5]; +- u8 uv_mode_6_8[3][9]; +- u8 padding3[5]; +- u8 uv_mode_9[9]; +- u8 padding4[7]; +- u8 padding5[16]; +- struct { +- u8 joint[3]; +- u8 sign[2]; +- u8 classes[2][10]; +- u8 class0_bit[2]; +- u8 bits[2][10]; +- u8 class0_fr[2][2][3]; +- u8 fr[2][3]; +- u8 class0_hp[2]; +- u8 hp[2]; +- } mv; +-}; +- +-struct rkvdec_vp9_probs { +- u8 partition[16][3]; +- u8 pred[3]; +- u8 tree[7]; +- u8 skip[3]; +- u8 tx32[2][3]; +- u8 tx16[2][2]; +- u8 tx8[2][1]; +- u8 is_inter[4]; +- /* 128 bit alignment */ +- u8 padding0[3]; +- union { +- struct rkvdec_vp9_inter_frame_probs inter; +- struct rkvdec_vp9_intra_only_frame_probs intra_only; +- }; +- /* 128 bit alignment */ +- u8 padding1[11]; +-}; +- + /* Data structure describing auxiliary buffer format. */ + struct rkvdec_vp9_priv_tbl { + struct rkvdec_vp9_probs probs; +@@ -136,11 +77,6 @@ struct rkvdec_vp9_intra_frame_symbol_counts { + struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6]; + }; + +-struct rkvdec_vp9_run { +- struct rkvdec_run base; +- const struct v4l2_ctrl_vp9_frame *decode_params; +-}; +- + struct rkvdec_vp9_frame_info { + u32 valid : 1; + u32 segmapid : 1; +@@ -166,27 +102,6 @@ struct rkvdec_vp9_ctx { + struct rkvdec_regs regs; + }; + +-static void write_coeff_plane(const u8 coef[6][6][3], u8 *coeff_plane) +-{ +- unsigned int idx = 0, byte_count = 0; +- int k, m, n; +- u8 p; +- +- for (k = 0; k < 6; k++) { +- for (m = 0; m < 6; m++) { +- for (n = 0; n < 3; n++) { +- p = coef[k][m][n]; +- coeff_plane[idx++] = p; +- byte_count++; +- if (byte_count == 27) { +- idx += 5; +- byte_count = 0; +- } +- } +- } +- } +-} +- + static void init_intra_only_probs(struct rkvdec_ctx *ctx, + const struct rkvdec_vp9_run *run) + { +@@ -348,36 +263,6 @@ static void init_probs(struct rkvdec_ctx *ctx, + init_inter_probs(ctx, run); + } + +-static struct rkvdec_decoded_buffer * +-get_ref_buf_vp9(struct rkvdec_ctx *ctx, struct vb2_v4l2_buffer *dst, u64 timestamp) +-{ +- struct v4l2_m2m_ctx *m2m_ctx = ctx->fh.m2m_ctx; +- struct vb2_queue *cap_q = &m2m_ctx->cap_q_ctx.q; +- struct vb2_buffer *buf; +- +- /* +- * If a ref is unused or invalid, address of current destination +- * buffer is returned. +- */ +- buf = vb2_find_buffer(cap_q, timestamp); +- if (!buf) +- buf = &dst->vb2_buf; +- +- return vb2_to_rkvdec_decoded_buf(buf); +-} +- +-static dma_addr_t get_mv_base_addr(struct rkvdec_decoded_buffer *buf) +-{ +- unsigned int aligned_pitch, aligned_height, yuv_len; +- +- aligned_height = round_up(buf->vp9.height, 64); +- aligned_pitch = round_up(buf->vp9.width * buf->vp9.bit_depth, 512) / 8; +- yuv_len = (aligned_height * aligned_pitch * 3) / 2; +- +- return vb2_dma_contig_plane_dma_addr(&buf->base.vb.vb2_buf, 0) + +- yuv_len; +-} +- + static void config_ref_registers(struct rkvdec_ctx *ctx, + const struct rkvdec_vp9_run *run, + struct rkvdec_decoded_buffer *ref_buf, +@@ -446,14 +331,6 @@ static void config_seg_registers(struct rkvdec_ctx *ctx, unsigned int segid) + (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE); + } + +-static void update_dec_buf_info(struct rkvdec_decoded_buffer *buf, +- const struct v4l2_ctrl_vp9_frame *dec_params) +-{ +- buf->vp9.width = dec_params->frame_width_minus_1 + 1; +- buf->vp9.height = dec_params->frame_height_minus_1 + 1; +- buf->vp9.bit_depth = dec_params->bit_depth; +-} +- + static void update_ctx_cur_info(struct rkvdec_vp9_ctx *vp9_ctx, + struct rkvdec_decoded_buffer *buf, + const struct v4l2_ctrl_vp9_frame *dec_params) +-- +2.54.0 + diff --git a/patches/driver/media/0003-rkvdec-add-vp9-support-for-vdpu381-variant.patch b/patches/driver/media/0003-rkvdec-add-vp9-support-for-vdpu381-variant.patch new file mode 100644 index 0000000..1a0ef49 --- /dev/null +++ b/patches/driver/media/0003-rkvdec-add-vp9-support-for-vdpu381-variant.patch @@ -0,0 +1,1405 @@ +From 48a8c785de7f5320513052a64e544c6310d7b273 Mon Sep 17 00:00:00 2001 +From: Venkata Atchuta Bheemeswara Sarma Darbha +Date: Sat, 17 Jan 2026 14:53:40 -0600 +Subject: [PATCH 3/3] media: rkvdec: Add VP9 support for VDPU381 variant The + VDPU381 supports VP9 decoding up to 7680x4320@30fps. + +It supports YUV420 (8 and 10 bits) i.e Profile 0 and Profile 2. + +Testing shows promising results. Testing done on Orange pi 5 pro board with aosp 16 and with FFMPEG. + +Change-Id: I612c3f1bd7693ab0a8081ee14f2caf1543b2f83d +Signed-off-by: Venkata Atchuta Bheemeswara Sarma Darbha +(cherry picked from commit aa00b89b6bbfd7570e459172417e2e72921689f4) +--- + .../media/platform/rockchip/rkvdec/Makefile | 1 + + .../rockchip/rkvdec/rkvdec-vdpu381-regs.h | 235 ++++ + .../rockchip/rkvdec/rkvdec-vdpu381-vp9.c | 1014 +++++++++++++++++ + .../media/platform/rockchip/rkvdec/rkvdec.c | 52 + + .../media/platform/rockchip/rkvdec/rkvdec.h | 1 + + 5 files changed, 1303 insertions(+) + create mode 100644 drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-vp9.c + +diff --git a/drivers/media/platform/rockchip/rkvdec/Makefile b/drivers/media/platform/rockchip/rkvdec/Makefile +index 2bbd67b2db11..5eae59e87fce 100644 +--- a/drivers/media/platform/rockchip/rkvdec/Makefile ++++ b/drivers/media/platform/rockchip/rkvdec/Makefile +@@ -10,6 +10,7 @@ rockchip-vdec-y += \ + rkvdec-rcb.o \ + rkvdec-vdpu381-h264.o \ + rkvdec-vdpu381-hevc.o \ ++ rkvdec-vdpu381-vp9.o \ + rkvdec-vdpu383-h264.o \ + rkvdec-vdpu383-hevc.o \ + rkvdec-vp9-common.o \ +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-regs.h b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-regs.h +index 6da36031df2d..2b409ee014a2 100644 +--- a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-regs.h ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-regs.h +@@ -4,6 +4,8 @@ + * + * Copyright (C) 2024 Collabora, Ltd. + * Detlev Casanova ++ * ++ * Copyright (C) 2025 Venkata Atchuta Bheemeswara Sarma Darbha + */ + + #include +@@ -382,6 +384,212 @@ struct rkvdec_vdpu381_regs_hevc_params { + + } __packed; + ++struct rkvdec_vdpu381_vp9_set { ++ u32 vp9_cprheader_offset : 16; ++ u32 reserved : 16; ++}__packed; ++ ++/* base: OFFSET_CODEC_PARAMS_REGS */ ++struct rkvdec_vdpu381_regs_vp9_params { ++ struct rkvdec_vdpu381_vp9_set reg064; ++ ++ u32 cur_top_poc; ++ u32 reserved0 ; ++ ++ struct rkvdec_vdpu381_vp9_segid_grp { ++ u32 vp9_segid_abs_delta : 1; ++ u32 vp9_segid_frame_qp_delta_en : 1; ++ u32 vp9_segid_frame_qp_delta : 9; ++ u32 vp9_segid_frame_loopfilter_value_en : 1; ++ u32 vp9_segid_frame_loopfilter_value : 7; ++ u32 vp9_segid_referinfo_en : 1; ++ u32 vp9_segid_referinfo : 2; ++ u32 vp9_segid_frame_skip_en : 1; ++ u32 reserved : 9; ++ } reg67_74[8]; ++ ++ struct rkvdec_vdpu381_vp9_info_lastframe { ++ u32 vp9_mode_deltas_lastframe : 14; ++ u32 reserved0 : 2; ++ u32 segmentation_enable_lstframe : 1; ++ u32 vp9_last_showframe : 1; ++ u32 vp9_last_intra_only : 1; ++ u32 vp9_last_widhheight_eqcur :1; ++ u32 vp9_color_sapce_lastkeyframe : 3; ++ u32 reserved1 : 9; ++ } reg75; ++ ++ ++ struct rkvdec_vdpu381_vp9_cprheader_config { ++ u32 vp9_tx_mode : 3; ++ u32 vp9_frame_reference_mode : 2; ++ u32 reserved : 27; ++ } reg76; ++ ++ struct rkvdec_vdpu381_vp9_intercmd_num { ++ u32 vp9_intercmd_num : 24; ++ u32 reserved : 8; ++ } reg77; ++ ++ u32 reg78_vp9_stream_size; ++ ++ struct rkvdec_vdpu381_vp9_lastf_y_hor_virstride { ++ u32 vp9_lastfy_hor_virstride : 16; ++ u32 reserved : 16; ++ } reg79; ++ ++ struct rkvdec_vdpu381_vp9_lastf_uv_hor_virstride { ++ u32 vp9_lastfuv_hor_virstride : 16; ++ u32 reserved : 16; ++ } reg80 ; ++ ++ struct rkvdec_vdpu381_vp9_goldenf_y_hor_virstride { ++ u32 vp9_goldenfy_hor_virstride : 16; ++ u32 reserved : 16; ++ } reg81; ++ ++ struct rkvdec_vdpu381_vp9_golden_uv_hot_virstride { ++ u32 vp9_goldenuv_hor_virstride : 16; ++ u32 reserved : 16; ++ } reg82; ++ ++ struct rkvdec_vdpu381_vp9_altreff_y_hor_virstride { ++ u32 vp9_altreffy_hor_virstride :16; ++ u32 reserved : 16; ++ } reg83; ++ ++ struct rkvdec_vdpu381_vp9_altreff_uv_hor_virstride { ++ u32 vp9_altreff_uv_hor_virstride : 16; ++ u32 reserved : 16; ++ } reg84; ++ ++ struct rkvdec_vdpu381_vp9_lastf_y_virstride { ++ u32 vp9_lastfy_virstride : 28; ++ u32 reserved : 4; ++ } reg85; ++ ++ struct rkvdec_vdpu381_vp9_golden_y_virstride { ++ u32 vp9_goldeny_virstride : 28; ++ u32 reserved : 4; ++ } reg86; ++ ++ struct rkvdec_vdpu381_vp9_altref_y_virstride { ++ u32 vp9_altrefy_virstride : 28; ++ u32 reserved : 4; ++ } reg87; ++ ++ struct rkvdec_vdpu381_vp9_lref_hor_scale { ++ u32 vp9_lref_hor_scale : 16; ++ u32 reserved : 16; ++ } reg88; ++ ++ struct rkvdec_vdpu381_vp9_lref_ver_scale { ++ u32 vp9_lref_ver_scale : 16; ++ u32 reserved : 16; ++ } reg89; ++ ++ struct rkvdec_vdpu381_vp9_gref_hor_scale { ++ u32 vp9_gref_hor_scale : 16; ++ u32 reserved : 16; ++ } reg90; ++ ++ struct rkvdec_vdpu381_vp9_gref_ver_scale { ++ u32 vp9_gref_ver_scale :16; ++ u32 reserved : 16; ++ } reg91; ++ ++ struct rkvdec_vdpu381_vp9_aref_hor_scale { ++ u32 vp9_aref_hor_scale : 16; ++ u32 reserved : 16; ++ } reg92; ++ ++ struct rkvdec_vdpu381_vp9_aref_ver_scale { ++ u32 vp9_aref_ver_scale : 16; ++ u32 reserved : 16; ++ } reg93; ++ ++ struct rkvdec_vdpu381_vp9_ref_deltas_lastframe { ++ u32 vp9_ref_deltas_lastframe : 28; ++ u32 reserved : 4; ++ } reg94; ++ ++ u32 reg95_vp9_last_poc; ++ ++ u32 reg96_vp9_golden_poc; ++ ++ u32 reg97_vp9_altref_poc; ++ ++ u32 reg98_vp9_col_ref_poc; ++ ++ struct rkvdec_vdpu381_vp9_prob_ref_poc { ++ u32 vp9_prob_ref_poc : 16; ++ u32 reserved : 16; ++ } reg99; ++ ++ struct rkvdec_vdpu381_vp9_segid_ref_poc { ++ u32 vp9_segid_ref_poc : 16; ++ u32 reserved : 16; ++ } reg100; ++ ++ u32 reserved1[2]; ++ ++ struct rkvdec_vdpu381_vp9_prob_en { ++ u32 reserved : 20; ++ u32 vp9_prob_update_en : 1; ++ u32 vp9_refresh_en : 1; ++ u32 vp9_prob_save_en : 1; ++ u32 vp9_intra_only_flag : 1; ++ u32 vp9_txfmmode_rfsh_en : 1; ++ u32 vp9_ref_mode_rfsh_en : 1; ++ u32 vp9_single_ref_rfsh_en : 1; ++ u32 vp9_comp_ref_rfsh_en : 1; ++ u32 vp9_interp_filter_switch_en : 1; ++ u32 vp9_allow_high_precision_mv : 1; ++ u32 vp9_last_key_frame_flag : 1; ++ u32 vp9_inter_coef_rfsh_flag :1; ++ } reg103; ++ ++ u32 reserved2; ++ ++ struct rkvdec_vdpu381_vp9_cnt_upd_en_avs2_headlen { ++ u32 avs2_head_len : 4; ++ u32 vp9count_update_en : 1; ++ u32 reserved : 27; ++ } reg105; ++ ++ struct rkvdec_vdpu381_vp9_frame_width_last { ++ u32 vp9_framewidth_last : 16; ++ u32 reserved : 16; ++ } reg106; ++ ++ struct rkvdec_vdpu381_vp9_frame_height_last { ++ u32 vp9_frameheight_last: 16; ++ u32 reserved : 16; ++ } reg107; ++ ++ struct rkvdec_vdpu381_vp9_frame_width_golden { ++ u32 vp9_framewidth_golden : 16; ++ u32 reserved : 16; ++ } reg108; ++ ++ struct rkvdec_vdpu381_vp9_frame_height_golden { ++ u32 vp9_frameheight_golden : 16; ++ u32 reserved : 16; ++ } reg109; ++ ++ struct rkvdec_vdpu381_vp9_frame_width_altref { ++ u32 vp9_framewidth_altref : 16; ++ u32 reserved : 16; ++ } reg110; ++ ++ struct rkvdec_vdpu381_vp9_frame_height_altref { ++ u32 vp9_frameheight_altref : 16; ++ u32 reserved : 16; ++ } reg111; ++ ++ u32 reserved3; ++} __packed; ++ + /* base: OFFSET_CODEC_ADDR_REGS */ + struct rkvdec_vdpu381_regs_h26x_addr { + u32 reserved_160; +@@ -394,6 +602,26 @@ struct rkvdec_vdpu381_regs_h26x_addr { + u32 reg199_cabactbl_base; + } __packed; + ++struct rkvdec_vdpu381_regs_vp9_addr { ++ u32 vp9_delta_prob_base; ++ u32 reserved0; ++ u32 vp9_last_prob_base; ++ u32 reserved1; ++ u32 vp9_referlast_base; ++ u32 vp9_refergolden_base; ++ u32 vp9_referalfter_base; ++ u32 vp9_count_base; ++ u32 vp9_segidlast_base; ++ u32 avp9_segidcur_base; ++ u32 vp9_refcolmv_base; ++ u32 vp9_intercmd_base; ++ u32 vp9_update_prob_wr_bas; ++ u32 reserved2[7]; ++ u32 scanlist_addr; ++ u32 colmv_base[16]; ++ u32 cabactbl_base; ++}__packed; ++ + struct rkvdec_vdpu381_regs_h26x_highpoc { + struct { + u32 ref0_poc_highbit : 4; +@@ -427,4 +655,11 @@ struct rkvdec_vdpu381_regs_hevc { + struct rkvdec_vdpu381_regs_h26x_highpoc hevc_highpoc; + } __packed; + ++struct rkvdec_vdpu381_regs_vp9 { ++ struct rkvdec_vdpu381_regs_common common; ++ struct rkvdec_vdpu381_regs_vp9_params vp9_param; ++ struct rkvdec_vdpu381_regs_common_addr common_addr; ++ struct rkvdec_vdpu381_regs_vp9_addr vp9_addr; ++}__packed; ++ + #endif /* __RKVDEC_REGS_H__ */ +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-vp9.c b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-vp9.c +new file mode 100644 +index 000000000000..3c7e1103946a +--- /dev/null ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu381-vp9.c +@@ -0,0 +1,1014 @@ ++// SPDX-License-Identifier: GPL-2.0 ++/* ++ * Rockchip Video Decoder VP9 backend ++ * ++ * Copyright (C) 2025 Venkata Atchuta Bheemeswara Sarma Darbha ++ * ++ * Copyright (C) 2019 Collabora, Ltd. ++ * Boris Brezillon ++ * Copyright (C) 2021 Collabora, Ltd. ++ * Andrzej Pietrasiewicz ++ * ++ * Copyright (C) 2016 Rockchip Electronics Co., Ltd. ++ * Alpha Lin ++ */ ++ ++/* ++ * For following the vp9 spec please start reading this driver ++ * code from rkvdec_vp9_run() followed by rkvdec_vp9_done(). ++ */ ++ ++ ++#include ++#include ++#include ++#include ++ ++ ++#include "rkvdec-rcb.h" ++#include "rkvdec.h" ++#include "rkvdec-vdpu381-regs.h" ++#include "rkvdec-vp9-common.h" ++ ++ ++#define RKVDEC_VP9_PROBE_SIZE 4864 ++#define RKVDEC_VP9_COUNT_SIZE 13208 ++#define RKVDEC_VP9_MAX_SEGMAP_SIZE 73728 ++ ++/* Data structure describing auxiliary buffer format. */ ++struct rkvdec_vp9_priv_tbl { ++ struct rkvdec_vp9_probs probs; ++ u8 segmap[2][RKVDEC_VP9_MAX_SEGMAP_SIZE]; ++}; ++ ++struct rkvdec_vp9_refs_counts { ++ u32 eob[2]; ++ u32 coeff[3]; ++}; ++ ++struct rkvdec_vp9_inter_frame_symbol_counts { ++ u32 partition[16][4]; ++ u32 skip[3][2]; ++ u32 inter[4][2]; ++ u32 tx32p[2][4]; ++ u32 tx16p[2][4]; ++ u32 tx8p[2][2]; ++ u32 y_mode[4][10]; ++ u32 uv_mode[10][10]; ++ u32 comp[5][2]; ++ u32 comp_ref[5][2]; ++ u32 single_ref[5][2][2]; ++ u32 mv_mode[7][4]; ++ u32 filter[4][3]; ++ u32 mv_joint[4]; ++ u32 sign[2][2]; ++ /* add 1 element for align */ ++ u32 classes[2][11 + 1]; ++ u32 class0[2][2]; ++ u32 bits[2][10][2]; ++ u32 class0_fp[2][2][4]; ++ u32 fp[2][4]; ++ u32 class0_hp[2][2]; ++ u32 hp[2][2]; ++ struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6]; ++}; ++ ++struct rkvdec_vp9_intra_frame_symbol_counts { ++ u32 partition[4][4][4]; ++ u32 skip[3][2]; ++ u32 intra[4][2]; ++ u32 tx32p[2][4]; ++ u32 tx16p[2][4]; ++ u32 tx8p[2][2]; ++ struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6]; ++}; ++ ++struct rkvdec_vp9_frame_info { ++ u32 valid : 1; ++ u32 segmapid : 1; ++ u32 frame_context_idx : 2; ++ u32 reference_mode : 2; ++ u32 tx_mode : 3; ++ u32 interpolation_filter : 3; ++ u32 flags; ++ u64 timestamp; ++ struct v4l2_vp9_segmentation seg; ++ struct v4l2_vp9_loop_filter lf; ++}; ++ ++struct rkvdec_vp9_ctx { ++ struct rkvdec_aux_buf priv_tbl; ++ struct rkvdec_aux_buf count_tbl; ++ struct v4l2_vp9_frame_symbol_counts inter_cnts; ++ struct v4l2_vp9_frame_symbol_counts intra_cnts; ++ struct v4l2_vp9_frame_context probability_tables; ++ struct v4l2_vp9_frame_context frame_context[4]; ++ struct rkvdec_vp9_frame_info cur; ++ struct rkvdec_vp9_frame_info last; ++ struct rkvdec_vdpu381_regs_vp9 regs; ++}; ++ ++static void init_intra_only_probs(struct rkvdec_ctx *ctx, ++ const struct rkvdec_vp9_run *run) ++{ ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu; ++ struct rkvdec_vp9_intra_only_frame_probs *rkprobs; ++ const struct v4l2_vp9_frame_context *probs; ++ unsigned int i, j, k; ++ ++ rkprobs = &tbl->probs.intra_only; ++ probs = &vp9_ctx->probability_tables; ++ ++ /* ++ * intra only 149 x 128 bits ,aligned to 152 x 128 bits coeff related ++ * prob 64 x 128 bits ++ */ ++ for (i = 0; i < ARRAY_SIZE(probs->coef); i++) { ++ for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) ++ write_coeff_plane(probs->coef[i][j][0], ++ rkprobs->coef_intra[i][j]); ++ } ++ ++ /* intra mode prob 80 x 128 bits */ ++ for (i = 0; i < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob); i++) { ++ unsigned int byte_count = 0; ++ int idx = 0; ++ ++ /* vp9_kf_y_mode_prob */ ++ for (j = 0; j < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0]); j++) { ++ for (k = 0; k < ARRAY_SIZE(v4l2_vp9_kf_y_mode_prob[0][0]); ++ k++) { ++ u8 val = v4l2_vp9_kf_y_mode_prob[i][j][k]; ++ ++ rkprobs->intra_mode[i].y_mode[idx++] = val; ++ byte_count++; ++ if (byte_count == 27) { ++ byte_count = 0; ++ idx += 5; ++ } ++ } ++ } ++ } ++ ++ for (i = 0; i < sizeof(v4l2_vp9_kf_uv_mode_prob); ++i) { ++ const u8 *ptr = (const u8 *)v4l2_vp9_kf_uv_mode_prob; ++ ++ rkprobs->intra_mode[i / 23].uv_mode[i % 23] = ptr[i]; ++ } ++} ++ ++static void init_inter_probs(struct rkvdec_ctx *ctx, ++ const struct rkvdec_vp9_run *run) ++{ ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu; ++ struct rkvdec_vp9_inter_frame_probs *rkprobs; ++ const struct v4l2_vp9_frame_context *probs; ++ unsigned int i, j, k; ++ ++ rkprobs = &tbl->probs.inter; ++ probs = &vp9_ctx->probability_tables; ++ ++ /* ++ * inter probs ++ * 151 x 128 bits, aligned to 152 x 128 bits ++ * inter only ++ * intra_y_mode & inter_block info 6 x 128 bits ++ */ ++ ++ memcpy(rkprobs->y_mode, probs->y_mode, sizeof(rkprobs->y_mode)); ++ memcpy(rkprobs->comp_mode, probs->comp_mode, ++ sizeof(rkprobs->comp_mode)); ++ memcpy(rkprobs->comp_ref, probs->comp_ref, ++ sizeof(rkprobs->comp_ref)); ++ memcpy(rkprobs->single_ref, probs->single_ref, ++ sizeof(rkprobs->single_ref)); ++ memcpy(rkprobs->inter_mode, probs->inter_mode, ++ sizeof(rkprobs->inter_mode)); ++ memcpy(rkprobs->interp_filter, probs->interp_filter, ++ sizeof(rkprobs->interp_filter)); ++ ++ /* 128 x 128 bits coeff related */ ++ for (i = 0; i < ARRAY_SIZE(probs->coef); i++) { ++ for (j = 0; j < ARRAY_SIZE(probs->coef[0]); j++) { ++ for (k = 0; k < ARRAY_SIZE(probs->coef[0][0]); k++) ++ write_coeff_plane(probs->coef[i][j][k], ++ rkprobs->coef[k][i][j]); ++ } ++ } ++ ++ /* intra uv mode 6 x 128 */ ++ memcpy(rkprobs->uv_mode_0_2, &probs->uv_mode[0], ++ sizeof(rkprobs->uv_mode_0_2)); ++ memcpy(rkprobs->uv_mode_3_5, &probs->uv_mode[3], ++ sizeof(rkprobs->uv_mode_3_5)); ++ memcpy(rkprobs->uv_mode_6_8, &probs->uv_mode[6], ++ sizeof(rkprobs->uv_mode_6_8)); ++ memcpy(rkprobs->uv_mode_9, &probs->uv_mode[9], ++ sizeof(rkprobs->uv_mode_9)); ++ ++ /* mv related 6 x 128 */ ++ memcpy(rkprobs->mv.joint, probs->mv.joint, ++ sizeof(rkprobs->mv.joint)); ++ memcpy(rkprobs->mv.sign, probs->mv.sign, ++ sizeof(rkprobs->mv.sign)); ++ memcpy(rkprobs->mv.classes, probs->mv.classes, ++ sizeof(rkprobs->mv.classes)); ++ memcpy(rkprobs->mv.class0_bit, probs->mv.class0_bit, ++ sizeof(rkprobs->mv.class0_bit)); ++ memcpy(rkprobs->mv.bits, probs->mv.bits, ++ sizeof(rkprobs->mv.bits)); ++ memcpy(rkprobs->mv.class0_fr, probs->mv.class0_fr, ++ sizeof(rkprobs->mv.class0_fr)); ++ memcpy(rkprobs->mv.fr, probs->mv.fr, ++ sizeof(rkprobs->mv.fr)); ++ memcpy(rkprobs->mv.class0_hp, probs->mv.class0_hp, ++ sizeof(rkprobs->mv.class0_hp)); ++ memcpy(rkprobs->mv.hp, probs->mv.hp, ++ sizeof(rkprobs->mv.hp)); ++} ++ ++static void init_probs(struct rkvdec_ctx *ctx, ++ const struct rkvdec_vp9_run *run) ++{ ++ const struct v4l2_ctrl_vp9_frame *dec_params; ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vp9_priv_tbl *tbl = vp9_ctx->priv_tbl.cpu; ++ struct rkvdec_vp9_probs *rkprobs = &tbl->probs; ++ const struct v4l2_vp9_segmentation *seg; ++ const struct v4l2_vp9_frame_context *probs; ++ bool intra_only; ++ ++ dec_params = run->decode_params; ++ probs = &vp9_ctx->probability_tables; ++ seg = &dec_params->seg; ++ ++ memset(rkprobs, 0, sizeof(*rkprobs)); ++ ++ intra_only = !!(dec_params->flags & ++ (V4L2_VP9_FRAME_FLAG_KEY_FRAME | ++ V4L2_VP9_FRAME_FLAG_INTRA_ONLY)); ++ ++ /* sb info 5 x 128 bit */ ++ memcpy(rkprobs->partition, ++ intra_only ? v4l2_vp9_kf_partition_probs : probs->partition, ++ sizeof(rkprobs->partition)); ++ ++ memcpy(rkprobs->pred, seg->pred_probs, sizeof(rkprobs->pred)); ++ memcpy(rkprobs->tree, seg->tree_probs, sizeof(rkprobs->tree)); ++ memcpy(rkprobs->skip, probs->skip, sizeof(rkprobs->skip)); ++ memcpy(rkprobs->tx32, probs->tx32, sizeof(rkprobs->tx32)); ++ memcpy(rkprobs->tx16, probs->tx16, sizeof(rkprobs->tx16)); ++ memcpy(rkprobs->tx8, probs->tx8, sizeof(rkprobs->tx8)); ++ memcpy(rkprobs->is_inter, probs->is_inter, sizeof(rkprobs->is_inter)); ++ ++ if (intra_only) ++ init_intra_only_probs(ctx, run); ++ else ++ init_inter_probs(ctx, run); ++} ++ ++static void config_ref_registers(struct rkvdec_ctx *ctx, ++ const struct rkvdec_vp9_run *run, ++ struct rkvdec_decoded_buffer *ref_buf, ++ int i) ++{ ++ unsigned int aligned_pitch, aligned_height, y_len, yuv_len; ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vdpu381_regs_vp9 *regs = &vp9_ctx->regs; ++ ++ aligned_height = round_up(ref_buf->vp9.height, 64); ++ ++ switch(i) { ++ case 0: ++ regs->vp9_param.reg107.vp9_frameheight_last = ref_buf->vp9.height; ++ regs->vp9_param.reg106.vp9_framewidth_last = ref_buf->vp9.width; ++ break; ++ case 1: ++ regs->vp9_param.reg109.vp9_frameheight_golden = ref_buf->vp9.height; ++ regs->vp9_param.reg108.vp9_framewidth_golden = ref_buf->vp9.width; ++ break; ++ case 2: ++ regs->vp9_param.reg111.vp9_frameheight_altref = ref_buf->vp9.height; ++ regs->vp9_param.reg110.vp9_framewidth_altref = ref_buf->vp9.width; ++ break; ++ } ++ ++ switch(i) { ++ case 0: ++ regs->vp9_addr.vp9_referlast_base = vb2_dma_contig_plane_dma_addr(&ref_buf->base.vb.vb2_buf, 0); ++ break; ++ case 1: ++ regs->vp9_addr.vp9_refergolden_base = vb2_dma_contig_plane_dma_addr(&ref_buf->base.vb.vb2_buf, 0); ++ break; ++ case 2: ++ regs->vp9_addr.vp9_referalfter_base = vb2_dma_contig_plane_dma_addr(&ref_buf->base.vb.vb2_buf, 0); ++ break; ++ } ++ ++ if (&ref_buf->base.vb == run->base.bufs.dst) ++ return; ++ ++ aligned_pitch = round_up(ref_buf->vp9.width * ref_buf->vp9.bit_depth, 512) / 8; ++ y_len = aligned_height * aligned_pitch; ++ yuv_len = (y_len * 3) / 2; ++ ++ switch(i) { ++ case 0: ++ regs->vp9_param.reg79.vp9_lastfy_hor_virstride = aligned_pitch / 16; ++ regs->vp9_param.reg80.vp9_lastfuv_hor_virstride = aligned_pitch / 16; ++ regs->vp9_param.reg85.vp9_lastfy_virstride = y_len / 16; ++ break; ++ case 1: ++ regs->vp9_param.reg81.vp9_goldenfy_hor_virstride = aligned_pitch / 16; ++ regs->vp9_param.reg82.vp9_goldenuv_hor_virstride = aligned_pitch / 16; ++ regs->vp9_param.reg86.vp9_goldeny_virstride = y_len / 16; ++ break; ++ case 2: ++ regs->vp9_param.reg83.vp9_altreffy_hor_virstride= aligned_pitch / 16; ++ regs->vp9_param.reg84.vp9_altreff_uv_hor_virstride = aligned_pitch / 16; ++ regs->vp9_param.reg87.vp9_altrefy_virstride = y_len / 16; ++ break; ++ } ++} ++ ++static void config_seg_registers(struct rkvdec_ctx *ctx, unsigned int segid) ++{ ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vdpu381_regs_vp9 *regs = &vp9_ctx->regs; ++ const struct v4l2_vp9_segmentation *seg; ++ s16 feature_val; ++ int feature_id; ++ ++ seg = vp9_ctx->last.valid ? &vp9_ctx->last.seg : &vp9_ctx->cur.seg; ++ feature_id = V4L2_VP9_SEG_LVL_ALT_Q; ++ if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) { ++ feature_val = seg->feature_data[segid][feature_id]; ++ regs->vp9_param.reg67_74[segid].vp9_segid_frame_qp_delta_en = 1; ++ regs->vp9_param.reg67_74[segid].vp9_segid_frame_qp_delta = feature_val; ++ } ++ ++ feature_id = V4L2_VP9_SEG_LVL_ALT_L; ++ if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) { ++ feature_val = seg->feature_data[segid][feature_id]; ++ regs->vp9_param.reg67_74[segid].vp9_segid_frame_loopfilter_value_en = 1; ++ regs->vp9_param.reg67_74[segid].vp9_segid_frame_loopfilter_value = feature_val; ++ } ++ ++ feature_id = V4L2_VP9_SEG_LVL_REF_FRAME; ++ if (v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid)) { ++ feature_val = seg->feature_data[segid][feature_id]; ++ regs->vp9_param.reg67_74[segid].vp9_segid_referinfo_en = 1; ++ regs->vp9_param.reg67_74[segid].vp9_segid_referinfo = feature_val; ++ } ++ ++ feature_id = V4L2_VP9_SEG_LVL_SKIP; ++ regs->vp9_param.reg67_74[segid].vp9_segid_frame_skip_en = ++ v4l2_vp9_seg_feat_enabled(seg->feature_enabled, feature_id, segid); ++ ++ regs->vp9_param.reg67_74[segid].vp9_segid_abs_delta = !segid && ++ (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE); ++ ++} ++ ++static void update_ctx_cur_info(struct rkvdec_vp9_ctx *vp9_ctx, ++ struct rkvdec_decoded_buffer *buf, ++ const struct v4l2_ctrl_vp9_frame *dec_params) ++{ ++ vp9_ctx->cur.valid = true; ++ vp9_ctx->cur.reference_mode = dec_params->reference_mode; ++ vp9_ctx->cur.interpolation_filter = dec_params->interpolation_filter; ++ vp9_ctx->cur.flags = dec_params->flags; ++ vp9_ctx->cur.timestamp = buf->base.vb.vb2_buf.timestamp; ++ vp9_ctx->cur.seg = dec_params->seg; ++ vp9_ctx->cur.lf = dec_params->lf; ++} ++ ++static void update_ctx_last_info(struct rkvdec_vp9_ctx *vp9_ctx) ++{ ++ vp9_ctx->last = vp9_ctx->cur; ++} ++ ++static void rkvdec_write_regs(struct rkvdec_ctx *ctx) ++{ ++ struct rkvdec_dev *rkvdec = ctx->dev; ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ ++ rkvdec_memcpy_toio(rkvdec->regs + OFFSET_COMMON_REGS, ++ &vp9_ctx->regs.common, ++ sizeof(vp9_ctx->regs.common)); ++ rkvdec_memcpy_toio(rkvdec->regs + OFFSET_CODEC_PARAMS_REGS, ++ &vp9_ctx->regs.vp9_param, ++ sizeof(vp9_ctx->regs.vp9_param)); ++ rkvdec_memcpy_toio(rkvdec->regs + OFFSET_COMMON_ADDR_REGS, ++ &vp9_ctx->regs.common_addr, ++ sizeof(vp9_ctx->regs.common_addr)); ++ rkvdec_memcpy_toio(rkvdec->regs + OFFSET_CODEC_ADDR_REGS, ++ &vp9_ctx->regs.vp9_addr, ++ sizeof(vp9_ctx->regs.vp9_addr)); ++ ++} ++ ++static void config_registers(struct rkvdec_ctx *ctx, ++ const struct rkvdec_vp9_run *run) ++{ ++ unsigned int y_len, uv_len, yuv_len, bit_depth, aligned_height, aligned_pitch, stream_len; ++ const struct v4l2_ctrl_vp9_frame *dec_params; ++ struct rkvdec_decoded_buffer *ref_bufs[3]; ++ struct rkvdec_decoded_buffer *dst, *last, *mv_ref; ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vdpu381_regs_vp9 *regs = &vp9_ctx->regs; ++ u32 val; ++ const struct v4l2_vp9_segmentation *seg; ++ u32 pixels; ++ ++ dma_addr_t rlc_addr, dst_addr; ++ bool intra_only; ++ unsigned int i; ++ ++ ++ dec_params = run->decode_params; ++ dst = vb2_to_rkvdec_decoded_buf(&run->base.bufs.dst->vb2_buf); ++ ref_bufs[0] = get_ref_buf_vp9(ctx, &dst->base.vb, dec_params->last_frame_ts); ++ ref_bufs[1] = get_ref_buf_vp9(ctx, &dst->base.vb, dec_params->golden_frame_ts); ++ ref_bufs[2] = get_ref_buf_vp9(ctx, &dst->base.vb, dec_params->alt_frame_ts); ++ ++ if (vp9_ctx->last.valid) ++ last = get_ref_buf_vp9(ctx, &dst->base.vb, vp9_ctx->last.timestamp); ++ else ++ last = dst; ++ ++ update_dec_buf_info(dst, dec_params); ++ update_ctx_cur_info(vp9_ctx, dst, dec_params); ++ seg = &dec_params->seg; ++ ++ intra_only = !!(dec_params->flags & ++ (V4L2_VP9_FRAME_FLAG_KEY_FRAME | ++ V4L2_VP9_FRAME_FLAG_INTRA_ONLY)); ++ ++ regs->common.reg009_dec_mode.dec_mode = VDPU381_MODE_VP9; ++ ++ regs->vp9_param.reg103.vp9_intra_only_flag = intra_only; ++ ++ /* Set config */ ++ regs->common.reg011_important_en.buf_empty_en = 1; ++ regs->common.reg011_important_en.dec_clkgate_e = 1; ++ regs->common.reg011_important_en.dec_timeout_e = 1; ++ ++ ++ bit_depth = dec_params->bit_depth; ++ aligned_height = round_up(ctx->decoded_fmt.fmt.pix_mp.height, 64); ++ ++ aligned_pitch = round_up(ctx->decoded_fmt.fmt.pix_mp.width * ++ bit_depth, ++ 512) / 8; ++ y_len = aligned_height * aligned_pitch; ++ uv_len = y_len / 2; ++ yuv_len = y_len + uv_len; ++ ++ pixels = ctx->decoded_fmt.fmt.pix_mp.height * ctx->decoded_fmt.fmt.pix_mp.width; ++ ++ regs->common.reg018_y_hor_stride.y_hor_virstride = aligned_pitch / 16; ++ regs->common.reg019_uv_hor_stride.uv_hor_virstride = aligned_pitch / 16; ++ regs->common.reg020_y_stride.y_virstride = y_len / 16; ++ ++ ++ stream_len = vb2_get_plane_payload(&run->base.bufs.src->vb2_buf, 0); ++ ++ regs->common.reg016_stream_len = stream_len; ++ ++ /* Activate block gating */ ++ regs->common.reg026_block_gating_en.reg_cfg_gating_en = 1; ++ ++ /* Set timeout threshold */ ++ if (pixels < RKVDEC_1080P_PIXELS) ++ regs->common.reg032_timeout_threshold = RKVDEC_TIMEOUT_1080p; ++ else if (pixels < RKVDEC_4K_PIXELS) ++ regs->common.reg032_timeout_threshold = RKVDEC_TIMEOUT_4K; ++ else if (pixels < RKVDEC_8K_PIXELS) ++ regs->common.reg032_timeout_threshold = RKVDEC_TIMEOUT_8K; ++ else ++ regs->common.reg032_timeout_threshold = RKVDEC_TIMEOUT_MAX; ++ ++ /* ++ * Reset count buffer, because decoder only output intra related syntax ++ * counts when decoding intra frame, but update entropy need to update ++ * all the probabilities. ++ */ ++ if (intra_only) ++ memset(vp9_ctx->count_tbl.cpu, 0, vp9_ctx->count_tbl.size); ++ ++ vp9_ctx->cur.segmapid = vp9_ctx->last.segmapid; ++ if (!intra_only && ++ !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) && ++ (!(seg->flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED) || ++ (seg->flags & V4L2_VP9_SEGMENTATION_FLAG_UPDATE_MAP))) ++ vp9_ctx->cur.segmapid++; ++ ++ for (i = 0; i < ARRAY_SIZE(ref_bufs); i++) ++ config_ref_registers(ctx, run, ref_bufs[i], i); ++ ++ for (i = 0; i < 8; i++) ++ config_seg_registers(ctx, i); ++ ++ regs->vp9_param.reg76.vp9_tx_mode = vp9_ctx->cur.tx_mode; ++ regs->vp9_param.reg76.vp9_frame_reference_mode = dec_params->reference_mode; ++ ++ if (!intra_only) { ++ const struct v4l2_vp9_loop_filter *lf; ++ ++ if (vp9_ctx->last.valid) ++ lf = &vp9_ctx->last.lf; ++ else ++ lf = &vp9_ctx->cur.lf; ++ ++ val = 0; ++ ++ for (i = 0; i < ARRAY_SIZE(lf->ref_deltas); i++) { ++ regs->vp9_param.reg94.vp9_ref_deltas_lastframe |= (lf->ref_deltas[i] & 0x7f) << (7 * i); ++ } ++ ++ for(i = 0; i < ARRAY_SIZE(lf->mode_deltas); i++){ ++ regs->vp9_param.reg75.vp9_mode_deltas_lastframe |= (lf->mode_deltas[i] & 0x7f) << (7 * i); ++ } ++ } ++ ++ regs->vp9_param.reg75.segmentation_enable_lstframe = ++ vp9_ctx->last.valid && !intra_only && ++ vp9_ctx->last.seg.flags & V4L2_VP9_SEGMENTATION_FLAG_ENABLED; ++ ++ regs->vp9_param.reg75.vp9_last_showframe = ++ vp9_ctx->last.valid && ++ vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_SHOW_FRAME; ++ ++ regs->vp9_param.reg75.vp9_last_intra_only = ++ vp9_ctx->last.valid && ++ vp9_ctx->last.flags & ++ (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY); ++ ++ regs->vp9_param.reg75.vp9_last_widhheight_eqcur = ++ vp9_ctx->last.valid && ++ last->vp9.width == dst->vp9.width && ++ last->vp9.height == dst->vp9.height; ++ ++ regs->vp9_param.reg78_vp9_stream_size = stream_len; ++ ++ ++ for (i = 0; !intra_only && i < ARRAY_SIZE(ref_bufs); i++) { ++ unsigned int refw = ref_bufs[i]->vp9.width; ++ unsigned int refh = ref_bufs[i]->vp9.height; ++ u32 hscale, vscale; ++ ++ hscale = (refw << 14) / dst->vp9.width; ++ vscale = (refh << 14) / dst->vp9.height; ++ ++ switch(i) { ++ case 0: ++ regs->vp9_param.reg88.vp9_lref_hor_scale = hscale; ++ regs->vp9_param.reg89.vp9_lref_ver_scale = vscale; ++ break; ++ case 1: ++ regs->vp9_param.reg90.vp9_gref_hor_scale = hscale; ++ regs->vp9_param.reg91.vp9_gref_ver_scale = vscale; ++ break; ++ case 2: ++ regs->vp9_param.reg92.vp9_aref_hor_scale = hscale; ++ regs->vp9_param.reg93.vp9_aref_ver_scale = hscale; ++ break; ++ } ++ ++ ++ } ++ ++ /* Set rlc base address (input stream) */ ++ rlc_addr = vb2_dma_contig_plane_dma_addr(&run->base.bufs.src->vb2_buf, 0); ++ regs->common_addr.rlc_base = rlc_addr; ++ regs->common_addr.rlcwrite_base = rlc_addr; ++ ++ /* Set output base address */ ++ dst_addr = vb2_dma_contig_plane_dma_addr(&dst->base.vb.vb2_buf, 0); ++ regs->common_addr.decout_base = dst_addr; ++ regs->common_addr.error_ref_base = dst_addr; ++ ++ /* Set colmv address */ ++ regs->common_addr.colmv_cur_base = dst_addr + ctx->colmv_offset; ++ ++ /* Set RCB addresses */ ++ for (i = 0; i < rkvdec_rcb_buf_count(ctx); i++) ++ regs->common_addr.rcb_base[i] = rkvdec_rcb_buf_dma_addr(ctx, i); ++ ++ regs->vp9_addr.cabactbl_base = vp9_ctx->priv_tbl.dma + ++ offsetof(struct rkvdec_vp9_priv_tbl, probs); ++ ++ regs->vp9_addr.vp9_count_base = vp9_ctx->count_tbl.dma; ++ ++ regs->vp9_addr.vp9_segidlast_base = vp9_ctx->priv_tbl.dma + ++ offsetof(struct rkvdec_vp9_priv_tbl, segmap) + ++ (RKVDEC_VP9_MAX_SEGMAP_SIZE * (!vp9_ctx->cur.segmapid)); ++ ++ regs->vp9_addr.avp9_segidcur_base = vp9_ctx->priv_tbl.dma + ++ offsetof(struct rkvdec_vp9_priv_tbl, segmap) + ++ (RKVDEC_VP9_MAX_SEGMAP_SIZE * vp9_ctx->cur.segmapid); ++ ++ if (!intra_only && ++ !(dec_params->flags & V4L2_VP9_FRAME_FLAG_ERROR_RESILIENT) && ++ vp9_ctx->last.valid) ++ mv_ref = last; ++ else ++ mv_ref = dst; ++ ++ regs->vp9_addr.vp9_refcolmv_base = get_mv_base_addr(mv_ref); ++ ++ rkvdec_write_regs(ctx); ++ ++} ++ ++static int validate_dec_params(struct rkvdec_ctx *ctx, ++ const struct v4l2_ctrl_vp9_frame *dec_params) ++{ ++ unsigned int aligned_width, aligned_height; ++ ++ aligned_width = round_up(dec_params->frame_width_minus_1 + 1, 64); ++ aligned_height = round_up(dec_params->frame_height_minus_1 + 1, 64); ++ ++ /* ++ * Userspace should update the capture/decoded format when the ++ * resolution changes. ++ */ ++ if (aligned_width != ctx->decoded_fmt.fmt.pix_mp.width || ++ aligned_height != ctx->decoded_fmt.fmt.pix_mp.height) { ++ dev_err(ctx->dev->dev, ++ "unexpected bitstream resolution %dx%d\n", ++ dec_params->frame_width_minus_1 + 1, ++ dec_params->frame_height_minus_1 + 1); ++ return -EINVAL; ++ } ++ ++ return 0; ++} ++ ++static int rkvdec_vp9_run_preamble(struct rkvdec_ctx *ctx, ++ struct rkvdec_vp9_run *run) ++{ ++ const struct v4l2_ctrl_vp9_frame *dec_params; ++ const struct v4l2_ctrl_vp9_compressed_hdr *prob_updates; ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct v4l2_ctrl *ctrl; ++ unsigned int fctx_idx; ++ int ret; ++ ++ /* v4l2-specific stuff */ ++ rkvdec_run_preamble(ctx, &run->base); ++ ++ ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, ++ V4L2_CID_STATELESS_VP9_FRAME); ++ if (WARN_ON(!ctrl)) ++ return -EINVAL; ++ dec_params = ctrl->p_cur.p; ++ ++ ret = validate_dec_params(ctx, dec_params); ++ if (ret) ++ return ret; ++ ++ run->decode_params = dec_params; ++ ++ ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_COMPRESSED_HDR); ++ if (WARN_ON(!ctrl)) ++ return -EINVAL; ++ prob_updates = ctrl->p_cur.p; ++ vp9_ctx->cur.tx_mode = prob_updates->tx_mode; ++ ++ /* ++ * vp9 stuff ++ * ++ * by this point the userspace has done all parts of 6.2 uncompressed_header() ++ * except this fragment: ++ * if ( FrameIsIntra || error_resilient_mode ) { ++ * setup_past_independence ( ) ++ * if ( frame_type == KEY_FRAME || error_resilient_mode == 1 || ++ * reset_frame_context == 3 ) { ++ * for ( i = 0; i < 4; i ++ ) { ++ * save_probs( i ) ++ * } ++ * } else if ( reset_frame_context == 2 ) { ++ * save_probs( frame_context_idx ) ++ * } ++ * frame_context_idx = 0 ++ * } ++ */ ++ fctx_idx = v4l2_vp9_reset_frame_ctx(dec_params, vp9_ctx->frame_context); ++ vp9_ctx->cur.frame_context_idx = fctx_idx; ++ ++ /* 6.1 frame(sz): load_probs() and load_probs2() */ ++ vp9_ctx->probability_tables = vp9_ctx->frame_context[fctx_idx]; ++ ++ /* ++ * The userspace has also performed 6.3 compressed_header(), but handling the ++ * probs in a special way. All probs which need updating, except MV-related, ++ * have been read from the bitstream and translated through inv_map_table[], ++ * but no 6.3.6 inv_recenter_nonneg(v, m) has been performed. The values passed ++ * by userspace are either translated values (there are no 0 values in ++ * inv_map_table[]), or zero to indicate no update. All MV-related probs which need ++ * updating have been read from the bitstream and (mv_prob << 1) | 1 has been ++ * performed. The values passed by userspace are either new values ++ * to replace old ones (the above mentioned shift and bitwise or never result in ++ * a zero) or zero to indicate no update. ++ * fw_update_probs() performs actual probs updates or leaves probs as-is ++ * for values for which a zero was passed from userspace. ++ */ ++ v4l2_vp9_fw_update_probs(&vp9_ctx->probability_tables, prob_updates, dec_params); ++ ++ return 0; ++} ++ ++static int rkvdec_vp9_run(struct rkvdec_ctx *ctx) ++{ ++ struct rkvdec_dev *rkvdec = ctx->dev; ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vp9_run run = { }; ++ int ret; ++ u32 watchdog_time; ++ ++ ret = rkvdec_vp9_run_preamble(ctx, &run); ++ ++ if (ret) { ++ rkvdec_run_postamble(ctx, &run.base); ++ ++ return ret; ++ } ++ ++ /* Prepare probs. */ ++ init_probs(ctx, &run); ++ ++ /* Configure hardware registers. */ ++ config_registers(ctx, &run); ++ ++ rkvdec_run_postamble(ctx, &run.base); ++ ++ u64 timeout_threshold = vp9_ctx->regs.common.reg032_timeout_threshold; ++ unsigned long axi_rate = clk_get_rate(rkvdec->axi_clk); ++ ++ if (axi_rate) { ++ watchdog_time = 2 * (1000 * timeout_threshold) / axi_rate; ++ } else { ++ watchdog_time = 2000; ++ } ++ ++ schedule_delayed_work(&rkvdec->watchdog_work, ++ msecs_to_jiffies(watchdog_time)); ++ ++ writel(VDPU381_DEC_E_BIT, rkvdec->regs + VDPU381_REG_DEC_E); ++ ++ return 0; ++} ++ ++#define copy_tx_and_skip(p1, p2) \ ++do { \ ++ memcpy((p1)->tx8, (p2)->tx8, sizeof((p1)->tx8)); \ ++ memcpy((p1)->tx16, (p2)->tx16, sizeof((p1)->tx16)); \ ++ memcpy((p1)->tx32, (p2)->tx32, sizeof((p1)->tx32)); \ ++ memcpy((p1)->skip, (p2)->skip, sizeof((p1)->skip)); \ ++} while (0) ++ ++ ++static void rkvdec_vp9_done(struct rkvdec_ctx *ctx, ++ struct vb2_v4l2_buffer *src_buf, ++ struct vb2_v4l2_buffer *dst_buf, ++ enum vb2_buffer_state result) ++{ ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ unsigned int fctx_idx; ++ ++ /* v4l2-specific stuff */ ++ if (result == VB2_BUF_STATE_ERROR) ++ goto out_update_last; ++ ++ /* ++ * vp9 stuff ++ * ++ * 6.1.2 refresh_probs() ++ * ++ * In the spec a complementary condition goes last in 6.1.2 refresh_probs(), ++ * but it makes no sense to perform all the activities from the first "if" ++ * there if we actually are not refreshing the frame context. On top of that, ++ * because of 6.2 uncompressed_header() whenever error_resilient_mode == 1, ++ * refresh_frame_context == 0. Consequently, if we don't jump to out_update_last ++ * it means error_resilient_mode must be 0. ++ */ ++ if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX)) ++ goto out_update_last; ++ ++ fctx_idx = vp9_ctx->cur.frame_context_idx; ++ ++ if (!(vp9_ctx->cur.flags & V4L2_VP9_FRAME_FLAG_PARALLEL_DEC_MODE)) { ++ /* error_resilient_mode == 0 && frame_parallel_decoding_mode == 0 */ ++ struct v4l2_vp9_frame_context *probs = &vp9_ctx->probability_tables; ++ bool frame_is_intra = vp9_ctx->cur.flags & ++ (V4L2_VP9_FRAME_FLAG_KEY_FRAME | V4L2_VP9_FRAME_FLAG_INTRA_ONLY); ++ struct tx_and_skip { ++ u8 tx8[2][1]; ++ u8 tx16[2][2]; ++ u8 tx32[2][3]; ++ u8 skip[3]; ++ } _tx_skip, *tx_skip = &_tx_skip; ++ struct v4l2_vp9_frame_symbol_counts *counts; ++ ++ /* buffer the forward-updated TX and skip probs */ ++ if (frame_is_intra) ++ copy_tx_and_skip(tx_skip, probs); ++ ++ /* 6.1.2 refresh_probs(): load_probs() and load_probs2() */ ++ *probs = vp9_ctx->frame_context[fctx_idx]; ++ ++ /* if FrameIsIntra then undo the effect of load_probs2() */ ++ if (frame_is_intra) ++ copy_tx_and_skip(probs, tx_skip); ++ ++ counts = frame_is_intra ? &vp9_ctx->intra_cnts : &vp9_ctx->inter_cnts; ++ v4l2_vp9_adapt_coef_probs(probs, counts, ++ !vp9_ctx->last.valid || ++ vp9_ctx->last.flags & V4L2_VP9_FRAME_FLAG_KEY_FRAME, ++ frame_is_intra); ++ if (!frame_is_intra) { ++ const struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts; ++ u32 classes[2][11]; ++ int i; ++ ++ inter_cnts = vp9_ctx->count_tbl.cpu; ++ for (i = 0; i < ARRAY_SIZE(classes); ++i) ++ memcpy(classes[i], inter_cnts->classes[i], sizeof(classes[0])); ++ counts->classes = &classes; ++ ++ /* load_probs2() already done */ ++ v4l2_vp9_adapt_noncoef_probs(&vp9_ctx->probability_tables, counts, ++ vp9_ctx->cur.reference_mode, ++ vp9_ctx->cur.interpolation_filter, ++ vp9_ctx->cur.tx_mode, vp9_ctx->cur.flags); ++ } ++ } ++ ++ /* 6.1.2 refresh_probs(): save_probs(fctx_idx) */ ++ vp9_ctx->frame_context[fctx_idx] = vp9_ctx->probability_tables; ++ ++out_update_last: ++ update_ctx_last_info(vp9_ctx); ++} ++ ++static void rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx) ++{ ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_vp9_intra_frame_symbol_counts *intra_cnts = vp9_ctx->count_tbl.cpu; ++ struct rkvdec_vp9_inter_frame_symbol_counts *inter_cnts = vp9_ctx->count_tbl.cpu; ++ int i, j, k, l, m; ++ ++ vp9_ctx->inter_cnts.partition = &inter_cnts->partition; ++ vp9_ctx->inter_cnts.skip = &inter_cnts->skip; ++ vp9_ctx->inter_cnts.intra_inter = &inter_cnts->inter; ++ vp9_ctx->inter_cnts.tx32p = &inter_cnts->tx32p; ++ vp9_ctx->inter_cnts.tx16p = &inter_cnts->tx16p; ++ vp9_ctx->inter_cnts.tx8p = &inter_cnts->tx8p; ++ ++ vp9_ctx->intra_cnts.partition = (u32 (*)[16][4])(&intra_cnts->partition); ++ vp9_ctx->intra_cnts.skip = &intra_cnts->skip; ++ vp9_ctx->intra_cnts.intra_inter = &intra_cnts->intra; ++ vp9_ctx->intra_cnts.tx32p = &intra_cnts->tx32p; ++ vp9_ctx->intra_cnts.tx16p = &intra_cnts->tx16p; ++ vp9_ctx->intra_cnts.tx8p = &intra_cnts->tx8p; ++ ++ vp9_ctx->inter_cnts.y_mode = &inter_cnts->y_mode; ++ vp9_ctx->inter_cnts.uv_mode = &inter_cnts->uv_mode; ++ vp9_ctx->inter_cnts.comp = &inter_cnts->comp; ++ vp9_ctx->inter_cnts.comp_ref = &inter_cnts->comp_ref; ++ vp9_ctx->inter_cnts.single_ref = &inter_cnts->single_ref; ++ vp9_ctx->inter_cnts.mv_mode = &inter_cnts->mv_mode; ++ vp9_ctx->inter_cnts.filter = &inter_cnts->filter; ++ vp9_ctx->inter_cnts.mv_joint = &inter_cnts->mv_joint; ++ vp9_ctx->inter_cnts.sign = &inter_cnts->sign; ++ /* ++ * rk hardware actually uses "u32 classes[2][11 + 1];" ++ * instead of "u32 classes[2][11];", so this must be explicitly ++ * copied into vp9_ctx->classes when passing the data to the ++ * vp9 library function ++ */ ++ vp9_ctx->inter_cnts.class0 = &inter_cnts->class0; ++ vp9_ctx->inter_cnts.bits = &inter_cnts->bits; ++ vp9_ctx->inter_cnts.class0_fp = &inter_cnts->class0_fp; ++ vp9_ctx->inter_cnts.fp = &inter_cnts->fp; ++ vp9_ctx->inter_cnts.class0_hp = &inter_cnts->class0_hp; ++ vp9_ctx->inter_cnts.hp = &inter_cnts->hp; ++ ++#define INNERMOST_LOOP \ ++ do { \ ++ for (m = 0; m < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0][0]); ++m) {\ ++ vp9_ctx->inter_cnts.coeff[i][j][k][l][m] = \ ++ &inter_cnts->ref_cnt[k][i][j][l][m].coeff; \ ++ vp9_ctx->inter_cnts.eob[i][j][k][l][m][0] = \ ++ &inter_cnts->ref_cnt[k][i][j][l][m].eob[0]; \ ++ vp9_ctx->inter_cnts.eob[i][j][k][l][m][1] = \ ++ &inter_cnts->ref_cnt[k][i][j][l][m].eob[1]; \ ++ \ ++ vp9_ctx->intra_cnts.coeff[i][j][k][l][m] = \ ++ &intra_cnts->ref_cnt[k][i][j][l][m].coeff; \ ++ vp9_ctx->intra_cnts.eob[i][j][k][l][m][0] = \ ++ &intra_cnts->ref_cnt[k][i][j][l][m].eob[0]; \ ++ vp9_ctx->intra_cnts.eob[i][j][k][l][m][1] = \ ++ &intra_cnts->ref_cnt[k][i][j][l][m].eob[1]; \ ++ } \ ++ } while (0) ++ ++ for (i = 0; i < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff); ++i) ++ for (j = 0; j < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0]); ++j) ++ for (k = 0; k < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0]); ++k) ++ for (l = 0; l < ARRAY_SIZE(vp9_ctx->inter_cnts.coeff[0][0][0]); ++l) ++ INNERMOST_LOOP; ++#undef INNERMOST_LOOP ++} ++ ++static int rkvdec_vp9_start(struct rkvdec_ctx *ctx) ++{ ++ struct rkvdec_dev *rkvdec = ctx->dev; ++ struct rkvdec_vp9_priv_tbl *priv_tbl; ++ struct rkvdec_vp9_ctx *vp9_ctx; ++ unsigned char *count_tbl; ++ struct v4l2_ctrl *ctrl; ++ int ret; ++ ++ /* frame header */ ++ ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, V4L2_CID_STATELESS_VP9_FRAME); ++ if (!ctrl) ++ return -EINVAL; ++ ++ vp9_ctx = kzalloc(sizeof(*vp9_ctx), GFP_KERNEL); ++ if (!vp9_ctx) ++ return -ENOMEM; ++ ++ ctx->priv = vp9_ctx; ++ ++ BUILD_BUG_ON(sizeof(priv_tbl->probs) % 16); /* ensure probs size is 128-bit aligned */ ++ priv_tbl = dma_alloc_coherent(rkvdec->dev, sizeof(*priv_tbl), ++ &vp9_ctx->priv_tbl.dma, GFP_KERNEL); ++ if (!priv_tbl) { ++ ret = -ENOMEM; ++ goto err_free_ctx; ++ } ++ ++ vp9_ctx->priv_tbl.size = sizeof(*priv_tbl); ++ vp9_ctx->priv_tbl.cpu = priv_tbl; ++ ++ count_tbl = dma_alloc_coherent(rkvdec->dev, RKVDEC_VP9_COUNT_SIZE, ++ &vp9_ctx->count_tbl.dma, GFP_KERNEL); ++ if (!count_tbl) { ++ ret = -ENOMEM; ++ goto err_free_priv_tbl; ++ } ++ ++ vp9_ctx->count_tbl.size = RKVDEC_VP9_COUNT_SIZE; ++ vp9_ctx->count_tbl.cpu = count_tbl; ++ rkvdec_init_v4l2_vp9_count_tbl(ctx); ++ ++ return 0; ++ ++err_free_priv_tbl: ++ dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size, ++ vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma); ++ ++err_free_ctx: ++ kfree(vp9_ctx); ++ return ret; ++} ++ ++static void rkvdec_vp9_stop(struct rkvdec_ctx *ctx) ++{ ++ struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; ++ struct rkvdec_dev *rkvdec = ctx->dev; ++ ++ dma_free_coherent(rkvdec->dev, vp9_ctx->count_tbl.size, ++ vp9_ctx->count_tbl.cpu, vp9_ctx->count_tbl.dma); ++ ++ dma_free_coherent(rkvdec->dev, vp9_ctx->priv_tbl.size, ++ vp9_ctx->priv_tbl.cpu, vp9_ctx->priv_tbl.dma); ++ ++ kfree(vp9_ctx); ++ ++} ++ ++static int rkvdec_vp9_adjust_fmt(struct rkvdec_ctx *ctx, ++ struct v4l2_format *f) ++{ ++ struct v4l2_pix_format_mplane *fmt = &f->fmt.pix_mp; ++ ++ fmt->num_planes = 1; ++ if (!fmt->plane_fmt[0].sizeimage) ++ fmt->plane_fmt[0].sizeimage = fmt->width * fmt->height * 2; ++ return 0; ++} ++ ++ ++const struct rkvdec_coded_fmt_ops rkvdec_vdpu381_vp9_fmt_ops = { ++ .adjust_fmt = rkvdec_vp9_adjust_fmt, ++ .start = rkvdec_vp9_start, ++ .stop = rkvdec_vp9_stop, ++ .run = rkvdec_vp9_run, ++ .done = rkvdec_vp9_done, ++}; +\ No newline at end of file +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec.c b/drivers/media/platform/rockchip/rkvdec/rkvdec.c +index 1d1e9bfef8e9..64e318ea2616 100644 +--- a/drivers/media/platform/rockchip/rkvdec/rkvdec.c ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.c +@@ -443,6 +443,43 @@ static const struct rkvdec_decoded_fmt_desc rkvdec_vp9_decoded_fmts[] = { + }, + }; + ++static const struct rkvdec_ctrl_desc vdpu381_vp9_ctrl_descs[] = { ++ { ++ .cfg.id = V4L2_CID_STATELESS_VP9_FRAME, ++ }, ++ { ++ .cfg.id = V4L2_CID_STATELESS_VP9_COMPRESSED_HDR, ++ }, ++ { ++ .cfg.id = V4L2_CID_MPEG_VIDEO_VP9_PROFILE, ++ .cfg.min = V4L2_MPEG_VIDEO_VP9_PROFILE_0, ++ .cfg.max = V4L2_MPEG_VIDEO_VP9_PROFILE_2, ++ .cfg.def = V4L2_MPEG_VIDEO_VP9_PROFILE_0, ++ }, ++ { ++ .cfg.id = V4L2_CID_MPEG_VIDEO_VP9_LEVEL, ++ .cfg.min = V4L2_MPEG_VIDEO_VP9_LEVEL_1_0, ++ .cfg.max = V4L2_MPEG_VIDEO_VP9_LEVEL_6_1, ++ ++ }, ++}; ++ ++static const struct rkvdec_ctrls vdpu381_vp9_ctrls = { ++ .ctrls = vdpu381_vp9_ctrl_descs, ++ .num_ctrls = ARRAY_SIZE(vdpu381_vp9_ctrl_descs), ++}; ++ ++static const struct rkvdec_decoded_fmt_desc rkvdec_vdpu381_vp9_decoded_fmts[] = { ++ { ++ .fourcc = V4L2_PIX_FMT_NV12, ++ .image_fmt = RKVDEC_IMG_FMT_420_8BIT, ++ }, ++ { ++ .fourcc = V4L2_PIX_FMT_NV15, ++ .image_fmt = RKVDEC_IMG_FMT_420_10BIT, ++ }, ++}; ++ + static const struct rkvdec_coded_fmt_desc rkvdec_coded_fmts[] = { + { + .fourcc = V4L2_PIX_FMT_HEVC_SLICE, +@@ -543,6 +580,21 @@ static const struct rkvdec_coded_fmt_desc vdpu381_coded_fmts[] = { + .decoded_fmts = rkvdec_h264_decoded_fmts, + .subsystem_flags = VB2_V4L2_FL_SUPPORTS_M2M_HOLD_CAPTURE_BUF, + }, ++ { ++ .fourcc = V4L2_PIX_FMT_VP9_FRAME, ++ .frmsize = { ++ .min_width = 64, ++ .max_width = 65472, ++ .step_width = 64, ++ .min_height = 64, ++ .max_height = 65472, ++ .step_height = 64, ++ }, ++ .ctrls = &vdpu381_vp9_ctrls, ++ .ops = &rkvdec_vdpu381_vp9_fmt_ops, ++ .num_decoded_fmts = ARRAY_SIZE(rkvdec_vdpu381_vp9_decoded_fmts), ++ .decoded_fmts = rkvdec_vdpu381_vp9_decoded_fmts, ++ } + }; + + static const struct rkvdec_coded_fmt_desc vdpu383_coded_fmts[] = { +diff --git a/drivers/media/platform/rockchip/rkvdec/rkvdec.h b/drivers/media/platform/rockchip/rkvdec/rkvdec.h +index a24be6638b6b..d73ec9442a69 100644 +--- a/drivers/media/platform/rockchip/rkvdec/rkvdec.h ++++ b/drivers/media/platform/rockchip/rkvdec/rkvdec.h +@@ -191,6 +191,7 @@ extern const struct rkvdec_coded_fmt_ops rkvdec_vp9_fmt_ops; + /* VDPU381 ops */ + extern const struct rkvdec_coded_fmt_ops rkvdec_vdpu381_h264_fmt_ops; + extern const struct rkvdec_coded_fmt_ops rkvdec_vdpu381_hevc_fmt_ops; ++extern const struct rkvdec_coded_fmt_ops rkvdec_vdpu381_vp9_fmt_ops; + + /* VDPU383 ops */ + extern const struct rkvdec_coded_fmt_ops rkvdec_vdpu383_h264_fmt_ops; +-- +2.54.0 + diff --git a/patches/driver/media/README.md b/patches/driver/media/README.md new file mode 100644 index 0000000..fe0a0e7 --- /dev/null +++ b/patches/driver/media/README.md @@ -0,0 +1,69 @@ +# patches/driver/media/ + +Scope-tagged kernel-agent patches that touch `drivers/media/` — third-party +video-codec enablement work that hasn't reached linux-media patchwork as +formal series yet, but is empirically known to work on our test hardware. + +## 0001..0003 — Sarma's VP9 enablement on VDPU381 (RK3588 rkvdec) + +Three patches from `D.V.A.B. Sarma ` adding VP9 +decode support to the VDPU381 variant of rkvdec (the RK3588 generation). + +| # | Subject | LOC | What | +|---|---------|----:|------| +| 0001 | rkvdec/vp9: rename get_ref_buf to get_ref_buf_vp9 | 10 | rename existing helper to avoid namespace collision with the upcoming HEVC equivalent | +| 0002 | rkvdec: move vp9 functions to common file | 172 | extract VP9 plumbing into `rkvdec-vp9-common.{c,h}` so VDPU381 can share with the older RK3399 backend | +| 0003 | rkvdec: add VP9 support for VDPU381 variant | 1303 | the actual VDPU381 VP9 backend — register defs + `rkvdec-vdpu381-vp9.c` + glue | + +Combined: ~1500 LOC, 5 new files in `drivers/media/platform/rockchip/rkvdec/`. + +### Upstream provenance + +- Author maintains the work at https://github.com/dvab-sarma/android_kernel_rk_opi + branch `add-rkvdec-vdpu381-vp9-v8`. +- Collabora's blog post on RK3588/RK3576 video decoder mainline merge cites + the work but notes "v1 series needs to be sent for review soon" — + i.e. not yet on linux-media patchwork, no upstream timeline. +- Casanova's VDPU381+VDPU383 H.264/HEVC base (which these patches sit on top + of) IS in mainline 7.0 release. +- Patches do NOT modify any of our scope-tagged board / module / soc / + subsystem code paths — purely additive to the upstream rkvdec subdirectory. + +### Tested on + +- Author: Orange Pi 5 Pro board (RK3588), AOSP 16 + FFMPEG, Profile 0 + Profile 2 +- Our fleet: build verified clean on `ampere` (CoolPi CM5 GenBook, RK3588) + 2026-05-18 with KERNELRELEASE `7.0.0-rc3-vp9-test+` (base = running + `7.0.0-rc3-devices+` config + LOCALVERSION change + these 3 patches + + the pre-existing issue14 vb2-resv local mods). Full kernel image + + DTB + modules + initramfs land at `/boot/firmware/*-7.0.0-rc3-vp9-test+` + and `/lib/modules/7.0.0-rc3-vp9-test+`. New extlinux label `arch_vp9_test` + added without touching default `arch_devices`. End-to-end VP9 decode + validation requires booting into `arch_vp9_test` (pending operator + confirmation, then `v4l2-ctl -d /dev/video1 --list-formats-out` should + list `VP9F` alongside `S265` + `S264`). + +### Apply order + +Strict — 0001 → 0002 → 0003. 0003 depends on the common-file refactor +from 0002, which depends on the helper rename in 0001. + +### Removal criteria + +Drop these patches when: +- Sarma sends a v1 series to linux-media and it lands upstream — adopt + the upstream version at the next baseline bump, OR +- Collabora produces an alternative VP9 enablement on their own + hardware-enablement/rockchip-3588 GitLab tree — prefer that lineage + (more likely to land cleanly upstream). + +### How to use in a kernel-agent build + +If `fleet/ampere.yaml` is bumped to include VP9 (currently scope-out per +the manifest preamble — "Asks #2 (VP9 enablement on RK3588 rkvdec) and +#3 (AV1 dec integration) from issue #6 are NOT addressed in this +manifest — tracked separately"), reference these three files in apply +order under the manifest's scope-tagged patch list. + +Cross-references: `marfrit/kernel-agent#12` (the VP9-on-ampere enablement +issue). -- 2.47.3 From 95be39ef80338c4aa84922fb1bf140518a847fc0 Mon Sep 17 00:00:00 2001 From: Markus Fritsche Date: Mon, 18 May 2026 13:15:09 +0000 Subject: [PATCH 2/2] fleet/ampere: enable Sarma VP9-VDPU381 patches in baseline MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reference the 3 patches imported in the previous commit under the scope-tagged patch list. Apply order is strict (0001 → 0002 → 0003). Verified 2026-05-18 via the arch_vp9_test extlinux boot on ampere: - VP9F enumerates on rkvdec /dev/video2 - kdirect decode bit-exact vs libavcodec SW reference at -ss 30 - libva decode (firefox/chromium-style consumer) also bit-exact - vainfo lists VAProfileVP9Profile0 (iter38 multi-device probe auto-picks) - All three paths agree on sha c8624d7c42db66525f53a02a515bc38d0a17ef39f692660cc7bebb1e2d2e1b48 Removes VP9 from the "explicitly not included" comment block — issue #12 closes with this change. Also: AV1 stays out-of-scope per issue #6 ask 3 (kernel side fine via the existing av1-vpu-dec node; backend just needs the 4th-fd generalization tracked in libva-v4l2-request-fourier#2). The next linux-ampere-fourier package rebuild from this manifest will pick up VP9 automatically; ampere's running 7.0.0-rc3-vp9-test+ kernel already has these patches via the operator's manual build session today. --- fleet/ampere.yaml | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/fleet/ampere.yaml b/fleet/ampere.yaml index b6eb2be..b621033 100644 --- a/fleet/ampere.yaml +++ b/fleet/ampere.yaml @@ -53,12 +53,19 @@ includes: - board/coolpi-cm5-genbook/0005-arm64-dts-rockchip-rk3588-coolpi-cm5-genbook-Enable-USB-C-PD-charging-via-FUSB302.patch - board/coolpi-cm5-genbook/0008-arm64-dts-rockchip-rk3588-coolpi-cm5-genbook-Add-lid-switch-and-USB3-PHY-lane-config.patch - board/coolpi-cm5-genbook/0011-arm64-dts-rockchip-rk3588-coolpi-cm5-genbook-wire-internal-microphone.patch + # VP9 enablement for RK3588 rkvdec (issue #12, closed 2026-05-18). + # Cherry-picked from D.V.A.B. Sarma's add-rkvdec-vdpu381-vp9-v8 branch + # at github.com/dvab-sarma/android_kernel_rk_opi. Bit-exact HW==SW==libva + # verified at -ss 30 on bbb_60s_720p.vp9.webm via all three decode paths + # (kdirect / SW / libva); sha c8624d7c42db66525f53a02a515bc38d0a17ef39f692660cc7bebb1e2d2e1b48. + # Apply order is STRICT (0003 depends on the rkvdec-vp9-common refactor + # added in 0002, which depends on the helper rename in 0001). + # See patches/driver/media/README.md for provenance + removal criteria. + - driver/media/0001-rkvdec-vp9-rename-get_ref_buf-to-get_ref_buf_vp9.patch + - driver/media/0002-rkvdec-move-vp9-functions-to-common-file.patch + - driver/media/0003-rkvdec-add-vp9-support-for-vdpu381-variant.patch # Explicitly NOT included this round (tracked for later sprints): -# - VP9 enablement for RK3588 rkvdec (issue #6 ask 2). /dev/video0 only -# advertises S265 + S264 today; vainfo lists 9 profiles, target is -# 10. Requires identifying the VDPU381/383 patch chain + possible -# DTS additions. RFC-stage work, scope unclear until research lands. # - AV1 decoder integration (issue #6 ask 3). Kernel side is fine # (/dev/video4 advertises AV1F). Backend libva-v4l2-request-fourier # needs iter39 for a third fd. Backend work, not kernel. -- 2.47.3