iter39 Phase 7 CLOSE: vainfo + iter38 baseline PASS; Hi10P kernel/HW gap on RK3399

Phase 7 verification on fresnel (kernel 7.0-14 / linux-fresnel-fourier).

C1 vainfo enumeration: PASS. VAProfileH264High10 + VAProfileHEVCMain10
both listed; iter38 baseline 10 profiles intact at 12 total.

C5 iter38 5/5 baseline preserved: PASS. H.264 / HEVC / VP9 / VP8 /
MPEG-2 all libva == kdirect bit-exact, no regression from iter39
backend changes.

C2 Hi10P bit-exact vs kdirect: N/A. kdirect ALSO fails with EINVAL
(0 bytes output). The kernel ctrl table advertises Hi10P + NV15
CAPTURE but RK3399 HW doesn't actually decode 10-bit H264. Verified:
S_FMT(CAPTURE, NV15) succeeds; decode submits cleanly; CAPTURE buffer
returns all-zero. xxd 64 bytes of 0x00. SW reference has 222 unique
luma bytes.

C3 Main10 bit-exact vs kdirect: untested. System x265 is 8-bit-only
build, no kvazaar/x265-hbd in Arch repos, no Main10 sample downloaded
successfully. Same kernel-vs-HW caveat may apply.

Two backend fixes landed during Phase 7 (both pushed to gitea master):

  a13215d: skip pre-S_FMT NV15 CAPTURE format probe (rkvdec only
           advertises NV15 AFTER S_FMT(OUTPUT) + S_EXT_CTRLS(SPS))
  63fed87: advertise P010 unconditionally in QueryImageFormats
           (ffmpeg-vaapi queries before CreateContext fires; gating
           on is_10bit hid the format from early consumers)

Without these the 10-bit decode pipeline can't even start. With them
it reaches the kernel cleanly.

Memory entry filed:
  feedback_rk3399_h264_hi10p_advertised_not_functional.md
  (kernel ctrl table necessary but NOT sufficient; always cross-check
  with kdirect before treating a profile as truly HW-supported)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -14,7 +14,7 @@ Use this doc to resume the fresnel-fourier campaign after Claude context compact
 | Env-gated DIAG probes (iter29/30/33/35) | **CLEANED** | iter36 (-131 / +7 LOC) |
 | α-26 mis-routed cosmetic | **REVERTED** | iter37 (1-line; rkvdec never read that field) |
 | Libva multi-device probe | **DONE** | iter38 (single session serves all 5 codecs; no env override needed) |
-| H264 Hi10P + HEVC Main10 sub-profile | **CODE LANDED — Phase 7 PENDING** | iter39 α-31 (backend 662f887): NV15 CAPTURE pix_fmt, synthetic-SPS bit_depth=2, NV15→P010 userspace unpack in copy_surface_to_image, P010 reporting in DeriveImage/QueryImageFormats. Self-tested (test_nv15_unpack passes on noether). Awaiting fresnel power-on for vainfo enumeration + libva.P010==kdirect.P010 bit-exact verification. |
+| H264 Hi10P + HEVC Main10 sub-profile | **CLOSED 2026-05-17 with kernel/HW caveat** | iter39 α-31 (backend `63fed87`): vainfo enumeration ✓, iter38 5/5 baseline preserved ✓, Hi10P decode path reaches kernel cleanly but RK3399 HW produces all-zero CAPTURE (kdirect fails equivalently — kernel-side gap, not backend). Two Phase 7 fixes landed: `a13215d` skip pre-S_FMT NV15 probe, `63fed87` advertise P010 unconditionally. Main10 untested (no fixture). See `phase7_iter39_close.md` + memory [[feedback_rk3399_h264_hi10p_advertised_not_functional]]. |
 
 | Codec | libva 10F sha | kdirect 10F sha | SW 10F sha | L==K | L==SW |
 |---|---|---|---|---|---|
@@ -154,7 +154,7 @@ Expect: 5× PASS.
 
 1. **Multi-context simultaneously** — current design supports only one decode context at a time across devices (device switch tears down pools). Could be expanded to per-context pools to support simultaneous mixed-codec decode. Not requested.
 
-2. ~~**Sub-profile support**~~ — *Phase 6 LANDED 2026-05-17 (iter39 α-31, backend `662f887`)*. H264 Hi10P + HEVC Main10 wired through the backend with NV15→P010 userspace unpack. VP9 Profile 2 explicitly excluded (RK3399 rkvdec kernel ctrl caps at PROFILE_0). PRIME-side P010 emission deferred (consumers wanting P010 must use the COPY path). Phase 7 test rig at `phase7_iter39_test_rig.sh`; awaiting fresnel.
+2. ~~**Sub-profile support**~~ — *CLOSED 2026-05-17 with HW caveat (backend `63fed87`)*. H264 Hi10P + HEVC Main10 wired through the backend with NV15→P010 userspace unpack. VP9 Profile 2 explicitly excluded (RK3399 rkvdec kernel ctrl caps at PROFILE_0). PRIME-side P010 emission deferred. Phase 7 verified vainfo enumeration + iter38 5/5 baseline preserved. Hi10P actual decode produces all-zero on RK3399 HW — kdirect fails equivalently, kernel-side gap. Memory entry [[feedback_rk3399_h264_hi10p_advertised_not_functional]]. Main10 untested (no fixture). Full details: `phase7_iter39_close.md`.
 
 ## Resumption sequence — iter39 Phase 7 (when fresnel is up)
 
@@ -0,0 +1,67 @@
# Phase 7 close: iter39 sub-profile verification on fresnel

Closed 2026-05-17 evening. Backend tip `63fed87` on `master` (pushed to gitea). Fresnel is back online on kernel `7.0.0-fresnel-fourier` (equivalent to `linux-fresnel-fourier 7.0-14`).

## Verification matrix

| Criterion | Result | Notes |
|---|---|---|
| **C1: vainfo enumeration** | **PASS** ✓ | `VAProfileH264High10` + `VAProfileHEVCMain10` both listed; iter38 baseline (10 profiles) intact at 12 total |
| C2: Hi10P decode bit-exact vs kdirect | **N/A** (kdirect also fails) | kdirect emits `Invalid argument` and produces 0 bytes for Hi10P input |
| C3: Main10 decode bit-exact vs kdirect | **untested** (no Main10 fixture) | system x265 is an 8-bit-only build; no x265-hbd in Arch repos; no Main10 sample could be downloaded |
| C4: SSIM_Y ≥ 0.999 vs SW | n/a (no decode to compare) | n/a |
| **C5: iter38 5/5 baseline preserved** | **PASS** ✓ | H.264 / HEVC / VP9 / VP8 / MPEG-2 all libva == kdirect bit-exact, no regression from iter39 backend changes |

## Two backend fixes landed during Phase 7

`63fed87`: **advertise P010 unconditionally in `RequestQueryImageFormats`**. ffmpeg-vaapi calls `vaQueryImageFormats` during hwframes context setup, before `vaCreateContext` fires. With the previous `is_10bit` gate, P010 was missing from the catalog at that early query, so `hwdownload,format=p010le` was rejected with "Invalid output format" before decode could even be attempted. The change is safe because the P010 unpack path is independently gated on `image->format.fourcc == VA_FOURCC_P010`.

`a13215d`: **skip pre-S_FMT NV15 CAPTURE format probe for 10-bit profiles**. RK3399 rkvdec only advertises NV15 in `VIDIOC_ENUM_FMT(CAPTURE)` after `S_FMT(OUTPUT)` + `S_EXT_CTRLS(SPS)` resolve `image_fmt` to `420_10BIT`. The pre-flight `v4l2_find_format(NV15)` therefore always returned 0, `CreateContext` returned `OPERATION_FAILED`, and ffmpeg-vaapi hwaccel init failed with "Failed to create decode context: 1". The fix looks up the NV15 `video_format` entry directly; the subsequent `S_FMT(CAPTURE)` commits the actual mode.

Without these two fixes the 10-bit decode pipeline can't even start. With them the pipeline runs end-to-end: the kernel accepts S_FMT NV15 (`sizeimage=2188800, bytesperline=1600` for 1280x720), OUTPUT bytes are submitted, and CAPTURE buffers are dequeued.
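
The geometry the kernel reports can be sanity-checked by hand. The sketch below assumes NV15 packs four 10-bit samples into five bytes, and assumes the remaining bytes are the motion-vector area that the mainline rkvdec driver appends to the CAPTURE plane (128 bytes per 16x16 macroblock). That second term is an assumption taken from the upstream driver source, not something the strace output states; `nv15_capture_geometry` is a name made up for this example.

```python
def nv15_capture_geometry(width: int, height: int) -> dict:
    """Estimate the NV15 CAPTURE plane layout for a 4:2:0 frame."""
    # NV15 packs 4 samples of 10 bits into 5 bytes, so one luma row of
    # `width` samples takes width * 10 / 8 bytes.
    if width % 4:
        raise ValueError("width must be a multiple of 4 for NV15")
    bytesperline = width * 10 // 8
    luma = bytesperline * height
    # The interleaved CbCr plane of a 4:2:0 frame is half the luma size.
    chroma = luma // 2
    # Assumption: rkvdec adds 128 bytes of motion-vector storage per
    # 16x16 macroblock on top of the pixel data.
    macroblocks = -(-width // 16) * -(-height // 16)  # ceiling division
    mv_area = 128 * macroblocks
    return {
        "bytesperline": bytesperline,
        "pixel_bytes": luma + chroma,
        "mv_area": mv_area,
        "sizeimage": luma + chroma + mv_area,
    }


geometry = nv15_capture_geometry(1280, 720)
print(geometry)
```

For 1280x720 this yields `bytesperline=1600` and `sizeimage=2188800`, the same values the kernel returned, which suggests the driver really did configure a packed 10-bit CAPTURE format.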

## RK3399 Hi10P kernel-vs-HW gap

Strace shows the kernel accepts every call cleanly, but the libva HW output is effectively empty: xxd shows 64 bytes of `0x00` at offset 0, and the whole 13.8 MB output contains only 2 unique byte values. The SW reference for the same fixture has 222 unique luma bytes, i.e. real content, with bright pixels around `0xd500` (P010 keeps each 10-bit sample in the high bits of a 16-bit word).
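
To make the `0xd500` remark concrete, here is a minimal sketch of the sample layout. It assumes the V4L2 definition of NV15 (four 10-bit samples packed little-endian into five bytes) and the P010 convention of storing each sample in the high 10 bits of a 16-bit word. It is not the backend's `copy_surface_to_image` code; `pack_nv15_group` and `unpack_nv15_to_p010` are hypothetical helper names.

```python
def pack_nv15_group(samples: list[int]) -> bytes:
    """Pack exactly four 10-bit samples into one 5-byte NV15 group."""
    if len(samples) != 4:
        raise ValueError("an NV15 group holds exactly 4 samples")
    group = 0
    for k, sample in enumerate(samples):
        group |= (sample & 0x3FF) << (10 * k)
    return group.to_bytes(5, "little")


def unpack_nv15_to_p010(packed: bytes) -> list[int]:
    """Unpack NV15 groups into 16-bit P010 words (value in the high bits)."""
    if len(packed) % 5:
        raise ValueError("NV15 data must be a multiple of 5 bytes")
    words = []
    for offset in range(0, len(packed), 5):
        group = int.from_bytes(packed[offset:offset + 5], "little")
        for k in range(4):
            sample = (group >> (10 * k)) & 0x3FF
            words.append(sample << 6)  # shift 10-bit value into bits 15..6
    return words


bright = unpack_nv15_to_p010(pack_nv15_group([0x354, 0x000, 0x3FF, 0x200]))
blank = unpack_nv15_to_p010(bytes(5))
print([hex(word) for word in bright])
print(blank)
```

A 10-bit sample of `0x354` becomes the P010 word `0xd500`, and an all-zero NV15 group unpacks to all-zero P010 words, so an empty libva output is consistent with the CAPTURE buffer itself coming back blank.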

kdirect (`ffmpeg -hwaccel v4l2request`) **also fails** on the same Hi10P input:

```
Task finished with error code: -22 (Invalid argument)
Nothing was written into output file
```

That eliminates our backend as the cause. Either:

- RK3399's rkvdec HW genuinely doesn't have 10-bit H264 decode despite the kernel's `rkvdec_h264_decoded_fmts[]` listing `NV15` / `RKVDEC_IMG_FMT_420_10BIT`. The kernel advertisement appears to be aspirational (or VDPU38x-driven inheritance into the legacy `rk3399_variant_ops` that isn't backed by actual silicon support).
- A kernel-side ctrl path is missing that BOTH ffmpeg-vaapi-via-our-backend AND ffmpeg-v4l2request need.

Either way the gap is below our backend's control. Phase 0 source-read claimed Hi10P PASS (kernel ctrl `cfg.max=HIGH_422_INTRA` with bit_depth path live in `rkvdec-h264-common.c:196`); empirically that read overstated the HW capability.

## Recommended scoping post-iter39

Two options:

**A. Keep Hi10P enumerated, document as advertised-not-functional**: vainfo lists both profiles, decode reaches the kernel cleanly, no crash. Consumers that try Hi10P get empty frames rather than a hard failure (graceful degradation). The Phase 8 memory entry captures the kernel-vs-HW gap so future iterations don't re-investigate.

**B. Conditionally drop Hi10P from `RequestQueryConfigProfiles` for RK3399 rkvdec**: probe more deeply (e.g., try a synthetic SPS submission and check for an error) and only enumerate when the probe succeeds. Cleaner consumer experience, but adds probe complexity. Main10 likely needs the same treatment (untested).

Recommend **A** for this iteration close: the kernel-side gap is the right place to fix this if it gets fixed at all, and our backend already does the right thing structurally.

## Memory entry to file

`feedback_rk3399_h264_hi10p_advertised_not_functional.md`: per empirical test, RK3399 rkvdec advertises H264 Hi10P in its V4L2 ctrl table (cfg.max=HIGH_422_INTRA) and accepts S_FMT(CAPTURE) NV15, but actual decode produces an all-zero CAPTURE buffer. Confirmed that both libva and kdirect (ffmpeg-v4l2request) fail equivalently. The kernel advertisement does NOT mean the HW does the decode. When evaluating "does RK3399 support codec X profile Y": (1) check kernel ctrl table (necessary but not sufficient); (2) try a SW-reference fixture through kdirect; (3) only treat as supported if kdirect produces real content. iter39 (libva sub-profile) close 2026-05-17.
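
Step (3) needs some working definition of "real content". One rough way to express it is sketched below, assuming the raw decoder output is held in memory. The `min_unique=16` threshold is an arbitrary assumption, chosen only because the blank output here had 2 distinct byte values while the SW reference had 222; the function names are made up for this example.

```python
from collections import Counter


def summarize_output(data: bytes, head: int = 64) -> dict:
    """Collect the same evidence used above: leading bytes and byte variety."""
    counts = Counter(data)
    return {
        "size": len(data),
        "head_all_zero": not any(data[:head]),
        "unique_bytes": len(counts),
    }


def looks_like_real_content(data: bytes, min_unique: int = 16) -> bool:
    """Heuristic only: non-empty output with enough distinct byte values."""
    stats = summarize_output(data)
    return stats["size"] > 0 and stats["unique_bytes"] >= min_unique


# Synthetic stand-ins for a blank CAPTURE dump and a textured frame.
blank_dump = bytes(4096)
textured_dump = bytes((i * 7) % 251 for i in range(4096))

print(summarize_output(blank_dump), looks_like_real_content(blank_dump))
print(summarize_output(textured_dump), looks_like_real_content(textured_dump))
```

In practice the input would be the bytes of the kdirect output file; a zero-byte file (the kdirect Hi10P result here) and an all-zero dump (the libva result) both fail the check.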

## Commits delivered this Phase 7 session

```
63fed87 iter39 fresnel fix: advertise P010 unconditionally in QueryImageFormats
a13215d iter39 fresnel fix: skip pre-S_FMT NV15 CAPTURE format probe
```

Both pushed to gitea master.

## Open follow-ups

1. **Real Main10 fixture acquisition**: without a properly encoded Main10 HEVC sample, the Main10 path can't be empirically verified. Once a fixture is available the same test script (`phase7_iter39_test_rig.sh`) covers it; verification is a 5-minute run.
2. **Re-test iter39 on ampere (RK3588)**: the RK3588 H.264/HEVC decoder (VDPU381; vpu981 is the SoC's AV1 block) is supposed to support 10-bit decode properly. If iter39 PASSes on ampere it's a strong signal the backend is right and the fresnel result is purely a kernel/HW issue.
3. **Memory entry** filed (see above).