h264: V3D shaders for the 8 diagonal qpel positions #33
Completes the put_ qpel QPU matrix. All 15 useful positions are now on QPU (mc00 is an integer copy and needs no shader).
8 new shaders for mc11/12/13/21/23/31/32/33. Each composes 2 half-pel anchors (from mc20/mc02/mc22) via an L2 rounded average, with the position-specific (r±1, c±1) offsets per H.264 §8.4.2.2.1. ~88 lines per shader. Generated from a Python template against the same POSITIONS table as the C ref from fourier PR #18.
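For reviewers, the composition can be sketched per pixel in plain Python. This is a minimal sketch, not the shader template or the C ref: the helper names and the `DIAG` table are hypothetical, and the offsets in it come from my reading of the spec's sample layout (only +1 shifts appear here), not from the actual POSITIONS table. It assumes 8-bit samples and a pixel at least 2 samples from the top/left edge and 4 from the bottom/right edge.

```python
def clip8(x):
    return max(0, min(255, x))

def tap6(a, b, c, d, e, f):
    # H.264 6-tap kernel (1, -5, 20, 20, -5, 1), not yet normalised
    return a - 5 * b + 20 * c + 20 * d - 5 * e + f

def half_h(img, r, c):
    # mc20-style anchor: horizontal half-pel between (r, c) and (r, c+1)
    return clip8((tap6(*img[r][c - 2:c + 4]) + 16) >> 5)

def half_v(img, r, c):
    # mc02-style anchor: vertical half-pel between (r, c) and (r+1, c)
    col = [img[r + k][c] for k in range(-2, 4)]
    return clip8((tap6(*col) + 16) >> 5)

def half_hv(img, r, c):
    # mc22-style anchor: vertical filter over unrounded horizontal intermediates
    mid = [tap6(*img[r + k][c - 2:c + 4]) for k in range(-2, 4)]
    return clip8((tap6(*mid) + 512) >> 10)

def l2(a, b):
    # rounded average of two samples
    return (a + b + 1) >> 1

# Hypothetical table: position -> two (anchor, row offset, col offset) entries.
DIAG = {
    "mc11": ((half_h, 0, 0), (half_v, 0, 0)),
    "mc31": ((half_h, 0, 0), (half_v, 0, 1)),
    "mc13": ((half_h, 1, 0), (half_v, 0, 0)),
    "mc33": ((half_h, 1, 0), (half_v, 0, 1)),
    "mc21": ((half_h, 0, 0), (half_hv, 0, 0)),
    "mc23": ((half_h, 1, 0), (half_hv, 0, 0)),
    "mc12": ((half_v, 0, 0), (half_hv, 0, 0)),
    "mc32": ((half_v, 0, 1), (half_hv, 0, 0)),
}

def qpel_diag(img, r, c, pos):
    # Compose the two anchors for `pos` with one rounded average.
    (fa, ra, ca), (fb, rb, cb) = DIAG[pos]
    return l2(fa(img, r + ra, c + ca), fb(img, r + rb, c + cb))
```

On a horizontal ramp such as `img[r][c] = 4 * c`, the mc1x positions should land a quarter step right of the integer sample and the mc3x positions three quarters, which is a quick way to catch a transposed column offset.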
Shared dispatch_h264_qpel_diag_qpu helper (mc22-style src envelope: src_max = src_off + 10*stride + 11 covers rows -2..+10 and cols -2..+10 for any offset).

All 8 PASS 2048/2048 bytes bit-exact on the first try. This is meaningful because the asymmetric (r±1, c±1) shifts are easy to transpose between positions; a pass on all 8 means the position-specific shifts are correct in every template.
put_ QPU matrix complete: mc{00,10,20,30,01,11,21,31,02,12,22,32,03,13,23,33}. avg_ qpel (15 more positions) remains on CPU NEON. It can land as a follow-up, since avg_ is just put_ + one extra L2.
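As a sketch of why the follow-up should be small: assuming "L2" here means the usual rounded average `(a + b + 1) >> 1`, an avg_ result is the put_ result averaged once more with what is already in dst. The function names below are hypothetical and operate on flat lists of 8-bit samples.

```python
def l2(a, b):
    # rounded average of two 8-bit samples
    return (a + b + 1) >> 1

def avg_from_put(dst, put_result):
    # avg_ variant: blend the existing dst samples with the put_ prediction
    return [l2(d, p) for d, p in zip(dst, put_result)]
```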