fresnel-fourier/phase7_iter39_close.md
marfrit 7d8d720631 iter39 Phase 7 CLOSE: vainfo + iter38 baseline PASS; Hi10P kernel/HW gap on RK3399
Phase 7 verification on fresnel (kernel 7.0-14 / linux-fresnel-fourier).

C1 vainfo enumeration: PASS — VAProfileH264High10 + VAProfileHEVCMain10
both listed; iter38 baseline 10 profiles intact at 12 total.

C5 iter38 5/5 baseline preserved: PASS — H.264 / HEVC / VP9 / VP8 /
MPEG-2 all libva == kdirect bit-exact, no regression from iter39
backend changes.

C2 Hi10P bit-exact vs kdirect: N/A — kdirect ALSO fails with EINVAL
(0 bytes output). The kernel ctrl table advertises Hi10P + NV15
CAPTURE but RK3399 HW doesn't actually decode 10-bit H264. Verified:
S_FMT(CAPTURE, NV15) succeeds; decode submits cleanly; CAPTURE buffer
returns all-zero. xxd 64 bytes of 0x00. SW reference has 222 unique
luma bytes.

C3 Main10 bit-exact vs kdirect: untested — system x265 is 8-bit-only
build, no kvazaar/x265-hbd in Arch repos, no Main10 sample downloaded
successfully. Same kernel-vs-HW caveat may apply.

Two backend fixes landed during Phase 7 (both pushed to gitea master):

  a13215d — skip pre-S_FMT NV15 CAPTURE format probe (rkvdec only
            advertises NV15 AFTER S_FMT(OUTPUT) + S_EXT_CTRLS(SPS))
  63fed87 — advertise P010 unconditionally in QueryImageFormats
            (ffmpeg-vaapi queries before CreateContext fires; gating
            on is_10bit hid the format from early consumers)

Without these the 10-bit decode pipeline can't even start. With them
it reaches the kernel cleanly.

Memory entry filed:
  feedback_rk3399_h264_hi10p_advertised_not_functional.md
  (kernel ctrl table necessary but NOT sufficient — always cross-check
   with kdirect before treating a profile as truly HW-supported)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 16:40:57 +00:00


Phase 7 close — iter39 sub-profile verification on fresnel

Closed 2026-05-17 evening. Backend tip 63fed87 on master (pushed to gitea). Fresnel back online on kernel 7.0.0-fresnel-fourier (i.e. linux-fresnel-fourier 7.0-14-equivalent).

Verification matrix

| Criterion | Result | Notes |
|---|---|---|
| C1: vainfo enumeration | PASS | VAProfileH264High10 + VAProfileHEVCMain10 both listed; iter38 baseline (10 profiles) intact at 12 total |
| C2: Hi10P decode bit-exact vs kdirect | N/A (kdirect also fails) | kdirect emits Invalid argument and produces 0 bytes for Hi10P input |
| C3: Main10 decode bit-exact vs kdirect | untested (no Main10 fixture) | system x265 is an 8-bit-only build; no x265-hbd in Arch repos; no Main10 sample could be downloaded |
| C4: SSIM_Y ≥ 0.999 vs SW | n/a | no decode to compare |
| C5: iter38 5/5 baseline preserved | PASS | H.264 / HEVC / VP9 / VP8 / MPEG-2 all libva == kdirect bit-exact, no regression from iter39 backend changes |
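"Bit-exact" in C2, C3 and C5 means the raw frames dumped by the libva path and by the kdirect path are byte-identical. A minimal sketch of such a comparison follows, assuming both paths write raw frames to files. The function names and the empty-file guard are illustrative assumptions, not the logic of phase7_iter39_test_rig.sh.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-megabyte raw dumps stay cheap on memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bit_exact(path_a, path_b):
    """Return True only if both dumps are non-empty and byte-identical."""
    # Two empty files hash the same, so an empty dump must never count as a match.
    if os.path.getsize(path_a) == 0 or os.path.getsize(path_b) == 0:
        return False
    return sha256_of(path_a) == sha256_of(path_b)
```

The empty-file guard matters for cases like C2: a decoder that writes nothing should produce "no comparison possible", not an accidental PASS.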

Two backend fixes landed during Phase 7

63fed87: advertise P010 unconditionally in RequestQueryImageFormats. ffmpeg-vaapi calls vaQueryImageFormats during hwframes context setup, BEFORE vaCreateContext fires; the previous is_10bit gate meant P010 wasn't in the catalog at that early query → hwdownload,format=p010le rejected with "Invalid output format" before decode could even attempt. Safe: P010 unpack path is independently gated on image->format.fourcc == VA_FOURCC_P010.

a13215d: skip pre-S_FMT NV15 CAPTURE format probe for 10-bit profiles. RK3399 rkvdec only advertises NV15 in VIDIOC_ENUM_FMT(CAPTURE) AFTER S_FMT(OUTPUT) + S_EXT_CTRLS(SPS) resolve image_fmt to 420_10BIT. Pre-flight v4l2_find_format(NV15) always returned 0 → CreateContext returned OPERATION_FAILED → ffmpeg-vaapi hwaccel init failed with "Failed to create decode context: 1". The fix looks up the NV15 video_format entry directly; the subsequent S_FMT(CAPTURE) commits the actual mode.

Without these two fixes the 10-bit decode pipeline can't even start. With them the pipeline runs end-to-end — kernel accepts S_FMT NV15 (sizeimage=2188800, bytesperline=1600 for 1280x720), submits OUTPUT bytes, dequeues CAPTURE.
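The S_FMT numbers above can be sanity-checked by hand. NV15 packs four 10-bit samples into five bytes, so a 1280-pixel row takes 1600 bytes. The sketch below reproduces both values. The extra 128 bytes per 16x16 macroblock is an assumption based on how the mainline rkvdec driver is understood to append motion-vector storage to the CAPTURE plane; treat it as a plausible explanation of the remainder rather than a confirmed reading of this kernel build.

```python
def nv15_layout(width, height):
    """Return (bytesperline, sizeimage) for a 4:2:0 NV15 frame.

    Assumes width is a multiple of 4 so the 10-bit packing is exact.
    """
    bytesperline = width * 10 // 8            # 4 samples -> 5 bytes
    luma = bytesperline * height              # full-resolution Y plane
    chroma = bytesperline * (height // 2)     # interleaved CbCr, half height
    # Assumption: driver appends 128 bytes of motion-vector data per 16x16 block.
    mb_cols = -(-width // 16)                 # ceiling division
    mb_rows = -(-height // 16)
    mv_storage = 128 * mb_cols * mb_rows
    return bytesperline, luma + chroma + mv_storage


print(nv15_layout(1280, 720))  # (1600, 2188800)
```

For 1280x720 this gives 1152000 (luma) + 576000 (chroma) + 460800 (assumed motion-vector area) = 2188800, matching the sizeimage the kernel reported.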

RK3399 Hi10P kernel-vs-HW gap

Strace shows the kernel accepts everything cleanly. But libva HW output is all zeros (verified via xxd: 64 bytes of 0x00 at offset 0; only 2 unique byte values across the 13.8 MB output). SW reference for the same fixture has 222 unique luma bytes — real content with bright pixels around 0xd500 (P010 = high 10 bits used).
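The two observations above (almost no distinct byte values in the HW output, and SW samples around 0xd500) can be checked with a few lines. This is a sketch only: the helper names and the `max_unique` threshold are hypothetical and were not part of the test rig. P010 stores each 10-bit sample in the upper bits of a 16-bit little-endian word, so the sample value is the word shifted right by 6.

```python
def unique_byte_values(data):
    """Count how many distinct byte values occur in a raw dump."""
    return len(set(data))


def looks_blank(data, max_unique=2):
    """Heuristic: a decoded frame with this few distinct bytes has no real content.

    max_unique=2 is an illustrative threshold chosen to match the observation above.
    """
    return unique_byte_values(data) <= max_unique


def p010_to_10bit(word):
    """Extract the 10-bit sample from a 16-bit P010 word (sample sits in the high bits)."""
    return word >> 6


print(looks_blank(bytes(64)))   # True: 64 bytes of 0x00
print(p010_to_10bit(0xD500))    # 852, out of a 10-bit maximum of 1023
```

A value of 852 out of 1023 is consistent with the "bright pixels" reading of the SW reference.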

kdirect (ffmpeg -hwaccel v4l2request) also fails on the same Hi10P input:

Task finished with error code: -22 (Invalid argument)
Nothing was written into output file

That eliminates our backend as the cause. Either:

  • RK3399's rkvdec HW genuinely doesn't have 10-bit H264 decode despite the kernel's rkvdec_h264_decoded_fmts[] listing NV15 / RKVDEC_IMG_FMT_420_10BIT. The kernel advertisement appears to be aspirational (or VDPU38x-driven inheritance into the legacy rk3399_variant_ops that isn't backed by actual silicon support).
  • A kernel-side ctrl path is missing that BOTH ffmpeg-vaapi-via-our-backend AND ffmpeg-v4l2request need.

Either way the gap is below our backend's control. Phase 0 source-read claimed Hi10P PASS (kernel ctrl cfg.max=HIGH_422_INTRA with bit_depth path live in rkvdec-h264-common.c:196); empirically that read overstated the HW capability.

Recommended scoping post-iter39

Two options:

A. Keep Hi10P enumerated, document as advertised-not-functional: vainfo lists both profiles, decode reaches kernel cleanly, no crash. Consumers that try Hi10P discover empty frames rather than a hard failure — graceful degradation. Phase 8 memory entry captures the kernel-vs-HW gap so future iterations don't re-investigate.

B. Conditionally drop Hi10P from RequestQueryConfigProfiles for RK3399 rkvdec: probe more deeply (e.g., try a synthetic SPS submission and check for error), only enumerate when probe succeeds. Cleaner consumer experience but adds probe complexity. Main10 likely needs the same treatment (untested).

Recommend A for this iteration close — the kernel-side gap is the right place to fix this if it gets fixed at all, and our backend already does the right thing structurally.

Memory entry to file

feedback_rk3399_h264_hi10p_advertised_not_functional.md: per-empirical-test, RK3399 rkvdec advertises H264 Hi10P in its V4L2 ctrl table (cfg.max=HIGH_422_INTRA) and accepts S_FMT(CAPTURE) NV15, but actual decode produces all-zero CAPTURE buffer. Confirmed both libva and kdirect (ffmpeg-v4l2request) fail equivalently. The kernel advertisement does NOT mean the HW does the decode. When evaluating "does RK3399 support codec X profile Y": (1) check kernel ctrl table — necessary but not sufficient; (2) try a SW-reference fixture through kdirect; (3) only treat as supported if kdirect produces real content. iter39 (libva sub-profile) close 2026-05-17.
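The three-step check in the memory entry can be written down as a small decision function. This is a sketch under stated assumptions: the enum, the function name, and the "at most 2 distinct bytes means no real content" heuristic are hypothetical, and the caller is assumed to have already gathered the inputs (whether the ctrl table advertises the profile, whether kdirect ran without error, and the bytes kdirect wrote).

```python
from enum import Enum


class Support(Enum):
    NOT_ADVERTISED = "not advertised"
    ADVERTISED_NOT_FUNCTIONAL = "advertised, not functional"
    SUPPORTED = "supported"


def classify_profile(advertised, kdirect_ok, kdirect_output):
    """Apply the memory-entry rule: advertisement is necessary but not sufficient."""
    # Step 1: the kernel ctrl table must list the profile at all.
    if not advertised:
        return Support.NOT_ADVERTISED
    # Step 2: kdirect must complete and actually write something.
    if not kdirect_ok or len(kdirect_output) == 0:
        return Support.ADVERTISED_NOT_FUNCTIONAL
    # Step 3: the output must contain real content (hypothetical blank-frame heuristic).
    if len(set(kdirect_output)) <= 2:
        return Support.ADVERTISED_NOT_FUNCTIONAL
    return Support.SUPPORTED
```

With the fresnel Hi10P result (advertised, kdirect fails with Invalid argument, 0 bytes written) this returns `Support.ADVERTISED_NOT_FUNCTIONAL`.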

Commits delivered this Phase 7 session

63fed87 iter39 fresnel fix: advertise P010 unconditionally in QueryImageFormats
a13215d iter39 fresnel fix: skip pre-S_FMT NV15 CAPTURE format probe

Both pushed to gitea master.

Open follow-ups

  1. Real Main10 fixture acquisition — without a properly-encoded Main10 HEVC sample, the Main10 path can't be empirically verified. Once a fixture is available the same test script (phase7_iter39_test_rig.sh) covers it; verification is a 5-minute run.
  2. Re-test iter39 on ampere (RK3588) — vpu981 is supposed to support 10-bit decode properly. If iter39 PASSes on ampere it's a strong signal the backend is right and the fresnel result is purely a kernel/HW issue.
  3. Memory entry filed (see above).