diff --git a/arch/chromium-fourier/STUDY.md b/arch/chromium-fourier/STUDY.md new file mode 100644 index 000000000..bc6dba119 --- /dev/null +++ b/arch/chromium-fourier/STUDY.md @@ -0,0 +1,165 @@ +# chromium-fourier — Chromium with V4L2-stateless / VAAPI HW decode on mainline-kernel ARM Linux + +## Goal + +Patch upstream Chromium so it can do hardware video decode through +`VaapiVideoDecoder` on a plain Linux Wayland system on Rockchip +(RK3566 / RK3588), via our `marfrit/libva-v4l2-request-fourier` libva +backend. Target consumers: Brave (rebuilt from the same patched +Chromium), upstream Chromium itself for Arch Linux ARM, and any other +Blink-based browser that picks up the patches. + +This fills a niche **no shipping fork** currently fills — every existing +ARM-Linux Chromium with HW decode (7Ji's `chromium-mpp`, Ubuntu's +`liujianfeng1994/rockchip-multimedia` PPA, JeffyCN's recipe, etc.) goes +through the **vendor MPP path** on **5.10 BSP kernel + X11 + Mali blob ++ panfork**. We're going for **mainline kernel + V4L2 stateless + +Wayland + panfrost / panthor**, the direction Fourier is heading anyway. + +## What's actually broken in stock Chromium 147+ + +1. **`media/gpu/chromeos/video_decoder_pipeline.cc`** is selected on Linux + when it shouldn't be. It then tries to initialize a V4L2 + `ImageProcessor` (a ChromeOS-specific m2m chip block for color + conversion / scaling) that doesn't exist on a plain Linux Wayland + system. Failure surface: + + ``` + PickDecoderOutputFormat(): Initializing ImageProcessor; max buffers: 16 + ERROR: failed Initialize()ing the frame pool + ``` + +2. **`VaapiVideoDecoder`** *is* compiled into stock Chromium / brave-bin + (verified: `strings /opt/brave-bin/brave | grep VaapiVideoDecoder`), + but the chromeos pipeline preempts it. Once the pipeline fails, the + decoder selection bails — it never falls back to direct + `VaapiVideoDecoder`. + +3. **`V4L2VideoDecoder`** factory is `BUILDFLAG(IS_CHROMEOS)`-gated. + Feature flags `UseChromeOSDirectVideoDecoder` / + `V4L2FlatStatelessVideoDecoder` are no-ops on a non-ChromeOS build — + their flag strings don't even appear in the binary. + +## The patch shape (sketch) + +### Patch 1 — bypass the chromeos pipeline on Linux + +`media/gpu/chromeos/video_decoder_pipeline.cc` is reached from +`media/gpu/vaapi/vaapi_video_decoder.cc` somewhere in +`ApplyResolutionChangeWithScreenSizes`. Either: + +- Skip the chromeos pipeline entirely on Linux non-ChromeOS, going + straight to `VaapiVideoDecoder::CreateContextAndScopedVASurfaces` for + frame allocation, OR +- Replace `PickDecoderOutputFormat`'s ImageProcessor probing with a + no-op that just trusts the VAAPI surface format (since on real + ChromeOS the IP does color conversion that we don't need in our + dmabuf-passthrough path). + +Reference: [crbug 40192819] "VaapiVideoDecoder on linux" +(). + +### Patch 2 — un-gate `V4L2VideoDecoder` factory for non-ChromeOS Linux + +Optional but useful as a fallback. The class is compiled in; only the +factory registration is `IS_CHROMEOS`-gated. The `igel-oss/meta-browser-hwdecode` +patch +shows the historical pattern for the V4L2VDA factory; the modern factory +is in `media/gpu/v4l2/v4l2_video_decoder.{h,cc}`. + +This is **plan B** if Patch 1 turns out harder than expected — V4L2VDA +talks to `/dev/videoN` directly without going through libva, which means +we don't depend on the `libva-v4l2-request-fourier` decode-submission +path either. Shorter end-to-end critical path, but a more invasive +factory change. + +### Patch 3 — point libva at our backend (build-time) + +Already configurable at runtime via `LIBVA_DRIVER_NAME=v4l2_request`, +but a Brave/Chromium binary that defaults to it would be cleaner. Could +patch the GPU process startup to set the env var if not already set. +Cosmetic, optional. + +## Reference forks (read these side-by-side with our patch) + +- **JeffyCN/meta-rockchip chromium recipe** — the upstream of the V4L2VDA + factory un-gating and `libv4l-rkmpp` shim that every shipping fork + uses. + Caveat: targets the V4L2VDA path (Plan B), not the modern + `VaapiVideoDecoder` (Plan A) we want. +- **igel-oss/meta-browser-hwdecode** — Yocto layer with the original + `0001-Add-support-for-V4L2VDA-on-Linux.patch`. 2017-vintage but the + pattern still applies. +- **7Ji-PKGBUILDs/chromium-mpp** — the most recent ALARM-shipping + variant, Chromium 132 with MPP. Useful for the **PKGBUILD shape** and + the patch-set list, even though we're not using MPP. + +- **amazingfate/chromium-debian-build** — Debian flavour of the same + approach, for reference. + +## Build environment + +- **Source tree**: `gn` / `depot_tools`, hosted on **fermi** (Arch ARM + aarch64 LXC on hertz). Fetch is ~30 GB; full chromium tree is ~100 GB + with build artifacts. Fermi's storage budget needs checking — may + need to bind-mount a hertz path with more headroom. +- **Build acceleration**: `distcc-avahi` is already deployed. Wire the + chromium build to use distcc through CT108 + tesla as compile workers. + Pump mode can shave further; chromium's ninja will accept distcc'd + cc/c++ via wrappers. +- **First build wall time estimate**: 6–10 hours initial on fermi alone; + 3–5 hours with distcc-avahi if the network throughput holds. After + the first build, incrementals on small patches are ~10–15 min. +- **Configure flags**: `gn args` — start from Arch's `chromium` + PKGBUILD, add `use_vaapi=true use_v4l2_codec=true is_official_build=false`. +- **Output**: `chromium-fourier--1-aarch64.pkg.tar.zst` + shipping `/usr/bin/chromium`. `provides=(chromium) conflicts=(chromium)` + shape, same as `ffmpeg-v4l2-request-git`'s replacement of stock ffmpeg. + +## Validation path + +Once a patched binary builds: + +1. Launch with the same env we use for mpv-vaapi: + `LIBVA_DRIVER_NAME=v4l2_request LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video0` +2. Open `chrome://gpu` — should show "Video Decode: Hardware accelerated" + (not "Disabled" or "Software only"). +3. Open a 1080p H.264 file:// URL, watch CPU on `/dev/video0` openers, + measure Brave's GPU process CPU% during playback. +4. Cross-reference with `mpv --hwdec=vaapi` numbers once the + libva-v4l2-request-fourier decode-submission path also lands. + +## Order of operations + +Phase A (this session): workspace + STUDY.md (this file). **Done.** + +Phase B: build environment — install `depot_tools` on fermi, run +`fetch chromium`, get a baseline (unpatched) build to confirm fermi can +build chromium at all. ~one full day. + +Phase C: identify the exact line to flip for Patch 1 by reading +`media/gpu/chromeos/video_decoder_pipeline.cc` + +`media/gpu/vaapi/vaapi_video_decoder.cc` against current Chromium +master. Iterate the patch on a real build. + +Phase D: package as `chromium-fourier` PKGBUILD, hook into +marfrit-packages CI on fermi (already has the pattern from +ffmpeg-v4l2-request-git). + +Phase E: rebase Patch 1 (and Patch 2 if needed) onto Brave's source +tree, ship as `brave-fourier` next to `chromium-fourier`. Brave's tree +adds ~50 patches on top of upstream Chromium; the chromeos-pipeline +seam should be unchanged across that delta, so this should be a +mechanical rebase. + +## Out of scope + +- HW video **encode** (cameras, webcam streams). RK3566 has a separate + encoder block on `/dev/video2`; not a Fourier priority. +- HEVC / VP9 / AV1 — RK3566 has no HW for these. RK3588 has VDPU381 + but our libva backend doesn't speak HEVC yet. +- WebGL / WebGPU performance — separate concern, not part of video + decode. +- Brave-specific features (Shields, Wallet, etc.) — they all live in + Brave's source tree on top of Chromium and are unaffected by our + decode patches.