# Phase 7 close — iter39 sub-profile verification on fresnel
Closed 2026-05-17 evening. Backend tip `63fed87` on `master` (pushed to gitea). Fresnel back online on kernel `7.0.0-fresnel-fourier` (i.e. `linux-fresnel-fourier 7.0-14`-equivalent).
## Verification matrix
| Criterion | Result | Notes |
|---|---|---|
| **C1: vainfo enumeration** | **PASS** ✓ | `VAProfileH264High10` + `VAProfileHEVCMain10` both listed; iter38 baseline (10 profiles) intact, 12 total |
| C2: Hi10P decode bit-exact vs kdirect | **N/A** (kdirect also fails) | kdirect emits `Invalid argument` and produces 0 bytes for Hi10P input |
| C3: Main10 decode bit-exact vs kdirect | **untested** (no Main10 fixture) | system x265 is an 8-bit-only build; no x265-hbd in Arch repos; no accessible Main10 sample downloaded successfully |
| C4: SSIM_Y ≥ 0.999 vs SW | N/A (no decode to compare) | - |
| **C5: iter38 5/5 baseline preserved** | **PASS** ✓ | H.264 / HEVC / VP9 / VP8 / MPEG-2 all libva == kdirect bit-exact, no regression from iter39 backend changes |
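C2, C3 and C5 all hinge on a bit-exact comparison between raw frame dumps. The following is a minimal sketch of such a check, assuming both pipelines write raw frames to files. The helper names and the PASS/FAIL/N/A mapping are illustrative; `phase7_iter39_test_rig.sh` may implement this differently.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large frame dumps need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bit_exact_result(libva_dump, kdirect_dump):
    """Return 'PASS', 'FAIL' or 'N/A' for a libva-vs-kdirect comparison."""
    # A missing or empty reference cannot validate anything (the C2 situation).
    if not os.path.exists(kdirect_dump) or os.path.getsize(kdirect_dump) == 0:
        return "N/A"
    if not os.path.exists(libva_dump):
        return "FAIL"
    # Cheap size check first, then compare content hashes.
    if os.path.getsize(libva_dump) != os.path.getsize(kdirect_dump):
        return "FAIL"
    same = sha256_of(libva_dump) == sha256_of(kdirect_dump)
    return "PASS" if same else "FAIL"
```

Treating an empty reference as N/A rather than FAIL mirrors how C2 is reported in the table: when kdirect writes 0 bytes there is nothing to be bit-exact against.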
## Two backend fixes landed during Phase 7
`63fed87`: **advertise P010 unconditionally in `RequestQueryImageFormats`**. ffmpeg-vaapi calls `vaQueryImageFormats` during hwframes context setup, before `vaCreateContext` fires. The previous `is_10bit` gate kept P010 out of the catalog at that early query, so `hwdownload,format=p010le` was rejected with "Invalid output format" before decode could even be attempted. The change is safe because the P010 unpack path is independently gated on `image->format.fourcc == VA_FOURCC_P010`.

`a13215d`: **skip pre-S_FMT NV15 CAPTURE format probe for 10-bit profiles**. RK3399 rkvdec only advertises NV15 in `VIDIOC_ENUM_FMT(CAPTURE)` after `S_FMT(OUTPUT)` + `S_EXT_CTRLS(SPS)` resolve `image_fmt` to `420_10BIT`. The pre-flight `v4l2_find_format(NV15)` therefore always returned 0, `CreateContext` returned `OPERATION_FAILED`, and ffmpeg-vaapi hwaccel init failed with "Failed to create decode context: 1". The fix looks up the NV15 `video_format` entry directly; the subsequent `S_FMT(CAPTURE)` commits the actual mode.

Without these two fixes the 10-bit decode pipeline can't even start. With them the pipeline runs end-to-end: the kernel accepts S_FMT NV15 (`sizeimage=2188800, bytesperline=1600` for 1280x720), the backend submits OUTPUT bytes, and CAPTURE buffers dequeue.
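The reported `bytesperline=1600` follows from NV15 packing four 10-bit samples into five bytes. The sketch below reproduces both reported numbers under one assumption that this report does not state: that rkvdec pads `sizeimage` with 128 bytes per 16x16 macroblock (for example, for motion-vector storage). The function names are illustrative.

```python
def nv15_bytesperline(width):
    """Bytes per row for NV15: 10 bits per sample, tightly packed."""
    return (width * 10 + 7) // 8


def nv15_sizeimage(width, height, extra_bytes_per_mb=128):
    """Estimate the CAPTURE buffer size for an NV15 4:2:0 frame.

    extra_bytes_per_mb is an ASSUMPTION (per-macroblock padding by the
    driver), chosen because it reproduces the size seen in the report.
    """
    stride = nv15_bytesperline(width)
    pixel_bytes = stride * height * 3 // 2           # luma plane + 4:2:0 chroma
    macroblocks = -(-width // 16) * -(-height // 16)  # ceil-divide both axes
    return pixel_bytes + extra_bytes_per_mb * macroblocks


print(nv15_bytesperline(1280), nv15_sizeimage(1280, 720))  # 1600 2188800
```

For 1280x720 the pixel data alone is 1600 × 720 × 1.5 = 1,728,000 bytes; the remaining 460,800 bytes equal 128 × 3600 macroblocks, which is what motivates the padding assumption.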
## RK3399 Hi10P kernel-vs-HW gap
Strace shows the kernel accepts everything cleanly, but the libva HW output is effectively **all zeros** (verified via xxd: 64 bytes of `0x00` at offset 0; only 2 unique byte values across the 13.8 MB output). The SW reference for the same fixture has 222 unique luma byte values: real content, with bright pixels around `0xd500` (P010 stores each sample in the high 10 bits of a 16-bit word).
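The blank-output check can be reproduced without xxd. This is a minimal sketch, assuming the decoder output was dumped as raw P010 (little-endian 16-bit words, sample in the high 10 bits). The function names and the `max_unique=2` threshold are illustrative choices based on the figures above, not part of the test rig.

```python
def unique_byte_values(path, chunk_size=1 << 20):
    """Count distinct byte values in a raw dump, reading it in chunks."""
    seen = set()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            seen.update(chunk)  # iterating bytes yields ints 0..255
    return len(seen)


def looks_blank(path, max_unique=2):
    """Heuristic: a real decoded frame uses far more than a couple of byte values."""
    return unique_byte_values(path) <= max_unique


def p010_sample(word):
    """Extract the 10-bit sample from one 16-bit P010 word."""
    return (word >> 6) & 0x3FF


print(p010_sample(0xD500))  # 852
```

A value of 852 out of a possible 1023 is why words near `0xd500` in the SW reference correspond to bright pixels.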
kdirect (`ffmpeg -hwaccel v4l2request`) **also fails** on the same Hi10P input:
```
Task finished with error code: -22 (Invalid argument)
Nothing was written into output file
```
That eliminates our backend as the cause. Either:
- RK3399's rkvdec HW genuinely lacks 10-bit H264 decode despite the kernel's `rkvdec_h264_decoded_fmts[]` listing `NV15` / `RKVDEC_IMG_FMT_420_10BIT`. In that case the kernel advertisement is aspirational (or a VDPU38x-driven inheritance into the legacy `rk3399_variant_ops` that isn't backed by actual silicon support).
- A kernel-side ctrl path is missing that both ffmpeg-vaapi-via-our-backend and ffmpeg-v4l2request need.

Either way the gap is below our backend's control. The Phase 0 source-read claimed Hi10P PASS (kernel ctrl `cfg.max=HIGH_422_INTRA` with the bit_depth path live in `rkvdec-h264-common.c:196`); empirically, that read overstated the HW capability.
## Recommended scoping post-iter39
Two options:

**A. Keep Hi10P enumerated, document as advertised-not-functional**: vainfo lists both profiles, decode reaches the kernel cleanly, and nothing crashes. Consumers that try Hi10P get empty frames rather than a hard failure (graceful degradation). The Phase 8 memory entry captures the kernel-vs-HW gap so future iterations don't re-investigate.

**B. Conditionally drop Hi10P from `RequestQueryConfigProfiles` for RK3399 rkvdec**: probe more deeply (e.g. try a synthetic SPS submission and check for an error) and only enumerate when the probe succeeds. This gives a cleaner consumer experience but adds probe complexity. Main10 likely needs the same treatment (untested).

Recommend **A** for this iteration close: the kernel is the right place to fix this if it gets fixed at all, and our backend already does the right thing structurally.
## Memory entry to file
`feedback_rk3399_h264_hi10p_advertised_not_functional.md`: per empirical test, RK3399 rkvdec advertises H264 Hi10P in its V4L2 ctrl table (`cfg.max=HIGH_422_INTRA`) and accepts `S_FMT(CAPTURE)` NV15, but actual decode produces an all-zero CAPTURE buffer. Confirmed that both libva and kdirect (ffmpeg-v4l2request) fail on the same fixture. The kernel advertisement does NOT mean the HW does the decode. When evaluating "does RK3399 support codec X profile Y": (1) check the kernel ctrl table (necessary but not sufficient); (2) run a fixture that has a SW reference through kdirect; (3) only treat the profile as supported if kdirect produces real content. iter39 (libva sub-profile) close 2026-05-17.
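The three-step rule in the memory entry can be written down as a small decision function. This is only a sketch: the enum, the parameter names and the blank-output threshold are hypothetical, and the inputs are assumed to come from a manual ctrl-table read plus a kdirect run.

```python
from enum import Enum


class Support(Enum):
    NOT_ADVERTISED = "not advertised"
    ADVERTISED_NOT_FUNCTIONAL = "advertised, not functional"
    SUPPORTED = "supported"


def classify_support(in_ctrl_table, kdirect_exit_code, unique_output_bytes,
                     blank_threshold=2):
    """Apply the rule: ctrl table is necessary, real kdirect output is sufficient.

    unique_output_bytes is the number of distinct byte values in the kdirect
    dump (0 if nothing was written). blank_threshold is an assumed cutoff.
    """
    # Step 1: the kernel must list the profile at all.
    if not in_ctrl_table:
        return Support.NOT_ADVERTISED
    # Steps 2 and 3: kdirect must succeed and produce non-blank content.
    if kdirect_exit_code != 0 or unique_output_bytes <= blank_threshold:
        return Support.ADVERTISED_NOT_FUNCTIONAL
    return Support.SUPPORTED
```

For the fresnel Hi10P case, where the profile is in the ctrl table but kdirect fails and writes nothing, `classify_support(True, 1, 0)` returns `Support.ADVERTISED_NOT_FUNCTIONAL` (the exit code `1` stands in for any non-zero status).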
## Commits delivered this Phase 7 session
```
63fed87 iter39 fresnel fix: advertise P010 unconditionally in QueryImageFormats
a13215d iter39 fresnel fix: skip pre-S_FMT NV15 CAPTURE format probe
```
Both pushed to gitea master.
## Open follow-ups
1. **Real Main10 fixture acquisition**: without a properly encoded Main10 HEVC sample, the Main10 path can't be empirically verified. Once a fixture is available, the same test script (`phase7_iter39_test_rig.sh`) covers it; verification is a 5-minute run.
2. **Re-test iter39 on ampere (RK3588)**: vpu981 is supposed to support 10-bit decode properly. If iter39 passes on ampere, that is a strong signal the backend is right and the fresnel result is purely a kernel/HW issue.
3. **Memory entry**: filed (see above).
## Phase 7 Option B implementation (post-close addendum)
User-directed Option B was applied 2026-05-17 evening, after a cross-test on ampere:

- **Web research** (lwn.net/Articles/950434, patchwork.kernel.org Karlman v6→v10 series) showed that the kernel and HW are ready; the missing piece is the ffmpeg-v4l2-request userspace plumbing for the new uAPI controls. Karlman's series explicitly states "FFmpeg patches required to fully runtime test."
- **Cross-test on ampere (RK3588)**: same all-zero failure mode as fresnel, and kdirect also returns EINVAL on both SoCs. This confirms the gap is in upstream ffmpeg userspace, not in our backend or device-specific HW.
- **Backend commit `6bc12fe`** drops `VAProfileH264High10` + `VAProfileHEVCMain10` from `RequestQueryConfigProfiles`. All other iter39 backend code (codec.c, context.c, image.c, surface.c, nv15.c, request.h `is_10bit` flag) is retained. To re-enable, uncomment the two `profiles[index++]` lines and bump the H264 guard from `(-5)` back to `(-6)` once upstream ffmpeg-v4l2request learns Hi10P.

vainfo post-Option-B: 10 profiles on fresnel (matches the iter38 baseline exactly) and 9 profiles on ampere (matches ampere-fourier iter1; no VP9 on the vpu981 path).

iter38 5/5 PASS preserved on fresnel post-Option-B (no other codec touched). Cross-device cleanliness maintained.
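The post-Option-B vainfo expectation (a fixed profile count with neither 10-bit profile listed) can be checked mechanically. This is a minimal sketch, assuming vainfo prints one `VAProfile… : VAEntrypoint…` pair per line; the helper names are illustrative and the sample text is made up for demonstration, not captured from fresnel or ampere.

```python
import re

# Matches lines such as "      VAProfileH264Main : VAEntrypointVLD".
PROFILE_LINE = re.compile(r"^\s*(VAProfile\w+)\s*:\s*VAEntrypoint\w+", re.MULTILINE)
TEN_BIT_PROFILES = {"VAProfileH264High10", "VAProfileHEVCMain10"}


def listed_profiles(vainfo_text):
    """Return the set of distinct profile names found in vainfo output."""
    return set(PROFILE_LINE.findall(vainfo_text))


def option_b_holds(vainfo_text, expected_count):
    """True if the profile count matches and no 10-bit profile is advertised."""
    profiles = listed_profiles(vainfo_text)
    return len(profiles) == expected_count and not (profiles & TEN_BIT_PROFILES)


# Synthetic two-profile sample, for illustration only.
sample = (
    "      VAProfileH264Main               : VAEntrypointVLD\n"
    "      VAProfileHEVCMain               : VAEntrypointVLD\n"
)
print(option_b_holds(sample, expected_count=2))  # True
```

On real hardware the text would come from running `vainfo`, with `expected_count=10` on fresnel and `expected_count=9` on ampere per the counts above.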