Files
test0r 3144b31ac9 ohm browser HW decode: link to chromium-fourier side-project
marfrit-packages/arch/chromium-fourier/STUDY.md now holds the plan for
patching upstream Chromium to reach VaapiVideoDecoder directly on
Linux Wayland, talking to libva-v4l2-request-fourier. README points at
it from the existing 'why not Brave's V4L2 / use-system-ffmpeg path'
section so a cold-start session sees the next step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 22:51:30 +00:00

591 lines
32 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Fourier — Hardware-Assisted Video Decoding on the Rockchip Fleet
One umbrella project to bring up working HW video decode on all four Rockchip
devices in Markus' fleet. Named 2026-04-24.
## Target fleet
| Host | SoC | Role | Parent umbrella |
|-----------|--------|---------------------------|-----------------|
| fresnel | RK3399 | Pinebook Pro fleet laptop | — |
| ohm | RK3566 | PineTab2 tablet | — |
| boltzmann | RK3588 | Rock 5 ITX+ (always on) | Volta |
| ampere | RK3588 | CoolPi GenBook laptop | Coulomb |
Current priority: **ohm**. The other three follow in the order of easiest-win
(fresnel, mature mainline) → incremental (ampere, RK3588 landed Feb 2026) →
hardest (boltzmann, currently on vendor kernel).
## Related Rockchip projects
Fourier touches hardware that already has its own umbrellas. Keep display/boot
concerns in those projects; keep video decode in Fourier.
- **Coulomb** — ampere / CoolPi GenBook stack. Subprojects:
- **Bin** — u-boot, display bringup (eDP + keyboard)
- **MegabitChip** — DDR init blob matching-decomp / HIL
- **RockHard** — mainline kernel (Collabora TF-A, LP5-3200 OC)
- **Volta** — boltzmann / Rock 5 ITX+ stack. Subprojects:
- **Quark** — edk2-rk3588 UEFI firmware
- **Neutron** — mainline / vendor kernel, UEFI-booted
- **fresnel** — Pinebook Pro, custom overclocked kernel package (no umbrella)
- **ohm** — PineTab2, DanctNIX base, UART capture rig for ampere DDR work
Cross-cutting notes in `/home/mfritsche/.claude/projects/-home-mfritsche-claude/memory/`
— see `MEMORY.md` for the index.
## Meet His
When Fourier needs infra muscle (wake a sleeping host, reach a VPN peer, bump
DHCP on the Fritz, find the right lmcp token, set up distcc for a kernel
build), summon **His** — the Home Infrastructure Specialist.
- As a subagent: `Agent(subagent_type: "his", prompt: "...")` — takes over a
task end-to-end and returns a report.
- As a skill: `Skill(skill: "his")` — loads the runbook cheatsheet into the
current session.
His knows the distcc workers (tesla, CT108, dcc1), the Fritz!DECT plug blast
radii (careful with Himbeere + Office), the `/opt/herding/` layout on hertz,
the lmcp endpoint token map, and the VPN naming (`<host>.vpn` via shannon's
dnsmasq). Canonical runbook: `/home/mfritsche/claude/CLAUDE.md` on noether.
For Fourier specifically, expect to lean on His for: waking ohm over LAN +
VPN, confirming ampere is at .vpn (not the stale .88.x LAN IP), setting up
cross-builds of patched ffmpeg via distcc, and nudging kernel packages into
the marfrit repo.
## Two kernel paths
Rockchip hardware decode has **two incompatible kernel driver families**.
Userspace config diverges accordingly — don't mix flags.
### Path A — mainline V4L2 stateless
- Kernel: `rkvdec` / `rkvdec2` stateless V4L2 drivers (mainline or out-of-tree
patch series).
- /dev node: V4L2 m2m `/dev/videoN` with compressed H.264/HEVC/VP9 queues +
uncompressed output queue.
- FFmpeg: `ffmpeg-v4l2-request-git` from AUR (Jernej's patchset). Upstream
v2 patchset posted on ffmpeg-devel 2024-08, **not yet merged** as of
2026-04-24.
- GStreamer: `v4l2codecs` plugin in gst-plugins-bad. 1.26 added Rockchip
buffer formats; 1.28 added the new HEVC UAPI controls.
- mpv: `mpv --hwdec=drm --vo=gpu-next` (wayland compositor) or
`--vo=dmabuf-wayland` (wlroots).
- Kodi: native V4L2-request API support, independent of FFmpeg.
- VA-API: `libva-v4l2-request` bridge for Firefox / Chromium paths.
### Path B — Rockchip MPP vendor
- Kernel: `mpp_rkvdec` / `mpp_rkvdec2` from Rockchip BSP (the rkr* kernel
series).
- /dev node: `/dev/mpp_service` chardev, not V4L2.
- FFmpeg: build with `--enable-rkmpp` against `librockchip-mpp`.
- GStreamer: Rockchip-specific plugins (`rkximagesink`, `mppvideodec`, …).
- Present today on **boltzmann** (kernel 6.1.75-rkr3).
## rkvdec mainline status (2026-04-24)
| SoC | Block | Mainline status | Codecs (mainline) |
|---------|--------------------|------------------------------|------------------------------------------------------|
| RK3399 | Hantro G1/G2 | mature | MPEG-2, VP8, H.264, HEVC, VP9 |
| RK3566 | rkvdec2 (VDPU346) | **not yet** (Collabora TODO) | rkvdec1: MPEG-2/VP8/H.264; HEVC/VP9 need out-of-tree |
| RK3588 | VDPU381 | **merged 2026-02-27** | H.264, H.265 (no VP9/AV1 upstream yet) |
| RK3576 | VDPU383 | merged 2026-02-27 | H.264, H.265 |
The RK3588 / RK3576 landing was the Feb 2026 Collabora drop. VDPU346 for the
RK356x family is on the roadmap but not merged; ohm depends on DanctNIX
carrying out-of-tree rkvdec2 patches if we want HEVC/VP9.
## ohm — first priority, known recipe
From Martin Chang's blog (clehaxze.tw, 2023, still mirrored by DanctNIX):
```sh
sudo pacman -S mpv
yay -S ffmpeg-v4l2-request-git
mpv --hwdec=drm --vo=gpu-next --wayland-disable-vsync=yes input.mp4
# wlroots compositor variant:
mpv --hwdec=drm --vo=dmabuf-wayland input.mp4
```
2023 coverage: **MPEG-2, VP8, H.264**. 1080p60 H.264 hit ~80% of one CPU core
— decode was on the hardware; the CPU was in the compositor path.
### Recon 2026-04-24 (ohm)
- Running kernel: `6.19.10-danctnix1-1-pinetab2`.
- Decoder: `hantro-vpu` on `/dev/video1` (DT `rockchip,rk3568-vpu` at
`fdea0400`); encoder on `/dev/video2`.
- Exposed stateless formats: `S264` (H.264), `MG2S` (MPEG-2), `VP8F` (VP8)
`NV12`. No HEVC, no VP9.
- No rkvdec2 DT node, no rkvdec2 driver, no decoder firmware blob
(`/lib/firmware/rockchip/` only carries `dptx.bin`). HEVC/VP9 hardware
decode is **not available on this image**. Tree-side confirmed: the
kernel build ships only the mainline `rockchip-vdec` module
(`CONFIG_VIDEO_ROCKCHIP_VDEC=m`, Collabora, rkvdec1) whose DT aliases are
`rockchip,rk{3288,3328,3399}-vdec` — none of which match RK3566. No
rkvdec2 module in `/lib/modules/.../rockchip/`; DanctNIX carries no
out-of-tree rkvdec2 patches. HEVC/VP9 on ohm needs either (a) our own
carrier patch series (VDPU346 driver + DT + firmware, rebuild
`linux-pinetab2`) or (b) waiting for VDPU346 to land upstream.
- Userspace: `ffmpeg 8.1` (stock, v4l2m2m only — **no v4l2request hwaccel**,
does not drive the stateless decoder), `gstreamer 1.28.2` with
`v4l2codecs` exposing `v4l2sl{h264,mpeg2,vp8}dec`, `mpv 0.41.0`,
`libva 2.23.0`. No `ffmpeg-v4l2-request-git`, no `libva-v4l2-request`,
no `kodi`.
- Noise: `bes2600` Wi-Fi OOT driver spams `WARN` every ~30 s. Unrelated to
decode; ignore for this umbrella.
So: 2023 recipe works verbatim for H.264 / MPEG-2 / VP8 once
`ffmpeg-v4l2-request-git` is installed. GStreamer `v4l2slh264dec` is the
shortest path to a first decode proof. HEVC/VP9 park until the kernel side
picks up rkvdec2 patches.
### Plan (tasks #1824 in the noether task list)
1. **Live recon** — kernel, `/dev/video*`, `v4l2-ctl --list-devices`, dmesg
rkvdec, `/lib/firmware/rockchip/`, installed ffmpeg/gstreamer/mpv/kodi
2. **SW baseline** — 1080p H.264 / HEVC / VP9 benchmark with ffmpeg and mpv
hwdec=no, record CPU% + fps
3. **Driver binding** — confirm V4L2 stateless decode node exposed; if not,
diagnose kconfig / DT / firmware
4. **GStreamer v4l2codecs** — cleanest proof; `v4l2slh264dec` pipeline with
`fakesink`, then with display sink
5. **FFmpeg v4l2-request** — install `ffmpeg-v4l2-request-git` (AUR) or
verify a DanctNIX-shipped ffmpeg with v4l2-request enabled
6. **Kodi + mpv validation** — 1080p HEVC at <30% of one core target
7. **Document** — freeze the final recipe in this README
**Status 2026-04-24**: ohm reachable; steps 1 + 2 + 3 + 4 + 5 done for
H.264. GStreamer `v4l2slh264dec → waylandsink` is 67 % CPU with
zero-copy dmabuf NV12. FFmpeg with `-hwaccel v4l2request -hwaccel_output_format
drm_prime` is 14 % realtime / 93.5 fps peak — hantro's 1080p H.264
ceiling on RK3566 is ≈95 fps. marfrit `ffmpeg-v4l2-request-git` lives in
the repo and installs cleanly via `pacman -S`. Task (b) (Brave HW decode)
blocked on multiplanar `libva-v4l2-request` rework — see section below.
**Note**: DanctNIX already ships `ffmpeg-v4l2-request 2:8.1-3` in its
`danctnix` repo; the marfrit build is our stripped variant (no X11/AMF/
CUDA/etc, Kwiboo tip pinned by commit) and sorts strictly newer in pacman
vercmp.
### Baseline numbers (2026-04-24, ohm, `bbb_1080p30_h264.mp4`)
Clip SHA-16 `dcf8a7170fbd49bb` pulled from doppler
`/moviedata/fourier-test/` via hertz `lxc file pull` → http bridge → ohm
(same bytes on every fleet device). **Source frame rate is 24 fps** per
H.264 caps (`framerate=24/1`) — the `1080p30` in the file name is a
misnomer carried from Blender's encode metadata; the actual media rate is
24 fps. That's the number the target/delivered columns below are measured
against.
| Test | CPU% | Frames / wall | Effective fps | Dropped |
|--------------------------------------------------------------|-------|-------------------|---------------|---------|
| SW: `ffmpeg -hwaccel none -f null -` | 319 % | 1440 / 18.55 s | 77.6 | n/a |
| SW: `ffmpeg -re -hwaccel none -f null -` | 90 % | 1440 / 60.27 s | 24.0 (paced) | n/a |
| SW: `mpv --hwdec=no --vo=gpu-next`, DSI-1 | 127 % | 1440 source / 60 s| ~7.8 delivered| **973** |
| HW: `gst v4l2slh264dec → fakesink sync=false` | 89 % | 1800 / 48.9 s | 36.8 | n/a |
| HW: `gst v4l2slh264dec → waylandsink` (dmabuf), DSI-1 1:1 | 7 % | 1488 / 62 s | 24.0 (paced) | 0 (progressreport 1:1) |
| HW: `gst v4l2slh264dec → waylandsink fullscreen=true`, scaled | 6 % | 1488 / 62 s | 24.0 (paced) | 0 (progressreport 1:1) |
| HW: `ffmpeg -hwaccel v4l2request -f null -` | 105 % | 1440 / 38.4 s | 37.5 | n/a |
| HW: `ffmpeg -re -hwaccel v4l2request -f null -` | 67 % | 1440 / 59.9 s | 24.0 (paced) | n/a |
| HW: `ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime -f null -` | 51 % | 1440 / 15.4 s | **93.5** | n/a |
| HW: `ffmpeg -re + drm_prime -f null -` | **14 %** | 1440 / 59.9 s | 24.0 (paced) | n/a |
| HW: `mpv --hwdec=v4l2request --vo=gpu-next`, DSI-1 | 122 % | 1440 source / 60 s| ~8.5 delivered| **930** |
Reading:
- SW decode alone has ~3.2× headroom over source rate (77.6 / 24 fps) but
costs ~3.2 cores unpaced or 1 core at realtime pace.
- SW mpv via `gpu-next` through KWin drops **68 %** of frames while only
using 127 % CPU — the compositor / VO path is the bottleneck, not
decode. Exactly the "compositor-bound ≠ decode-bound" gotcha below.
- HW decode to `fakesink` clocks ~36.8 fps (~1.5× source rate), 89 % CPU.
- **HW decode to `waylandsink` via zero-copy dmabuf (`DMA_DRM` NV12)
drops the CPU to 67 %** — an ≈18× drop from the SW mpv number. The
GStreamer `v4l2codecs → waylandsink` path on KWin negotiates
dmabuf-direct, bypassing any GL upload.
- **Fit-to-screen scaling is effectively free.** Native 1920×1080 in a
window is 7 % CPU; `fullscreen=true` with KWin scaling the dmabuf down
to 1280×800 is 6 % CPU. Almost certainly the VOP2 HW scaling planes on
RK3566 are doing the downscale during scanout, not the GPU.
- Frame-drop count validated by `progressreport update-freq=5`: stream
position advances 1:1 with wall clock for the full 62 s run — zero
drops, full 24 fps delivery.
- **FFmpeg path needs `-hwaccel_output_format drm_prime`** to avoid the
default CPU readback. Without it, `-hwaccel v4l2request` costs 67 % CPU
at realtime 24 fps; with it, 14 %. Peak throughput 93.5 fps vs 37.5 fps.
The hantro VPU's real 1080p H.264 ceiling on RK3566 is ≈95 fps — enough
for 1080p60 with headroom. GStreamer's `v4l2codecs → waylandsink` path
still wins on CPU (67 %) because its dmabuf-direct scanout avoids every
copy; ffmpeg's `null` muxer costs a few percent even with drm_prime.
- **mpv with `--hwdec=v4l2request` does engage the HW decoder** (mpv
loads the `drmprime` hwdec driver, mpv's `--hwdec=help` lists
`h264-v4l2request` after the marfrit ffmpeg install), but with
`--vo=gpu-next` we hit the same compositor bottleneck as SW mpv: 122 %
CPU + 930 / 1440 dropped frames. The decode is essentially free; KWin /
gpu-next's GL composition path still drops most frames waiting on the
compositor. `--vo=dmabuf-wayland` would be the right VO for zero-copy
to KWin, but mpv's hwdec → dmabuf-wayland format negotiation fails
("hardware format not supported", `yuv420p → drm_prime` upload fails).
Until that's untangled, the **recommended ohm playback recipe is
GStreamer `v4l2slh264dec → waylandsink`** for graphical use, and
`ffmpeg -hwaccel v4l2request -hwaccel_output_format drm_prime` for any
scripted / headless work.
### Browser HW decode (Brave / Chromium) — partially wired, library-blocked
Attempted as task (b) on 2026-04-24. State reached:
- Installed bootlin `libva-v4l2-request-git` from AUR with local patches
(patch archive on ohm: `~/fourier-test/libva-patches/fourier-local.patch`):
- HEVC stripped from the build (`h265.c` + `h265.h` removed from
`src/meson.build`, HEVC case in `picture.c` returns
`UNSUPPORTED_PROFILE`). The library's HEVC UAPI binding predates the
mainline rename (`V4L2_CID_MPEG_VIDEO_HEVC_*``V4L2_CID_STATELESS_HEVC_*`)
and ohm has no HW HEVC anyway.
- Missing `#include "utils.h"` added to `src/h264.c` (pre-existing bug
the library grew into as compilers got stricter about implicit
declarations).
- `src/config.c` probe extended to try both `V4L2_BUF_TYPE_VIDEO_OUTPUT`
and `V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE` when enumerating profiles.
- `vainfo LIBVA_DRIVER_NAME=v4l2_request LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video1`
enumerates profiles: **MPEG-2 Simple/Main + H.264 Main/High/ConstrainedBaseline/
MultiviewHigh/StereoHigh**. Library is up and talking to the hantro
decoder.
- Brave launched with `--enable-features=VaapiVideoDecoder,VaapiIgnoreDriverChecks,AcceleratedVideoDecodeLinuxGL --use-gl=angle --use-angle=gl-egl --ozone-platform=wayland`
visibly activates VaapiVideoDecoder in the GPU process argv and attempts
VA context creation on the video file.
Failure mode:
```
ERROR:media/gpu/vaapi/vaapi_wrapper.cc:2407] vaCreateContext failed,
VA error: operation failed
ERROR:media/gpu/vaapi/vaapi_video_decoder.cc:1219] failed creating VAContext
```
Root cause: the bootlin `libva-v4l2-request` was written for Allwinner's
sunxi-cedrus (single-plane V4L2). The probe patch gets it past format
enumeration, but `context.c` / `picture.c` / `v4l2.c` still hardcode
single-plane buffer setup, `VIDIOC_S_FMT` paths, and request-API frame
submission in ways that don't match a multiplanar decoder. A real port to
multiplanar is a side project — the upstream is effectively unmaintained
(last meaningful commit years ago; Collabora talked about a replacement
but it has not shipped).
Decision: **browser HW video decode on ohm is parked until a multiplanar
`libva-v4l2-request` rework exists**, either ours or someone else's.
Non-browser HW decode (GStreamer `v4l2codecs`, FFmpeg via
`ffmpeg-v4l2-request-git`) is unaffected and works. The patch set on
ohm is a useful starting point for anyone who wants to pick up the
multiplanar port.
#### Why not "Brave via our system ffmpeg" or Chromium's own V4L2 decoder?
- **System ffmpeg is not a decoder source for Chromium/Brave.** Chromium
links a vendored ffmpeg fork; even the Arch-style `use_system_ffmpeg`
builds use system `libavcodec` only as a pure software decoder
(`FfmpegVideoDecoder`), with no `hw_device_ctx` / hwaccel wiring.
Installing our `ffmpeg-v4l2-request-git` would not give Brave HW decode.
- **Chromium's own `V4L2VideoDecoder` class is compiled into `brave-bin`**
(verified via `strings`) but gated behind ChromeOS-only runtime factory
checks. `UseChromeOSDirectVideoDecoder` / `V4L2FlatStatelessVideoDecoder`
feature flag names do **not** appear as strings in the brave-bin binary
— only `AcceleratedVideoDecodeLinuxGL` does, which is a VA-API gate, not
a V4L2 gate. Enabling those flags at the command line is a no-op on this
build. Unlocking V4L2 on Linux Brave would require either (a) a Chromium
rebuild with different buildflags, or (b) upstream patches to extend the
V4L2 decoder factory to Linux non-ChromeOS targets.
#### chromium-fourier (parallel side-project)
Workspace at `marfrit-packages/arch/chromium-fourier/STUDY.md`
patching upstream Chromium so it reaches `VaapiVideoDecoder` directly
on Linux Wayland (skipping the chromeos pipeline that fails on Brave),
then talks to our `marfrit/libva-v4l2-request-fourier` libva backend.
No shipping fork covers this niche today (every ARM-Linux Chromium-with-
HW-decode goes through the **vendor MPP path** on **5.10 BSP kernel +
X11 + Mali blob + panfork** — opposite of where Fourier is going). See
that STUDY for the patch shape, reference forks, build plan on fermi
with distcc-avahi acceleration, and validation path.
### Acceptance criterion
**1080p @ 30 fps, no dropped frames.** Applies to every device and every
codec we claim support for. Same bar everywhere — don't grade RK3566 on a
curve vs RK3588. Dropped-frame count comes from `mpv --msg-level=all=info`
or `ffmpeg -f null -`; "no dropped frames" means zero across a 60 s clip.
### Test corpus
**Big Buck Bunny** is the canonical open test clip (Blender Foundation 2008,
CC-BY), ubiquitous in HW-decode testing across a decade of embedded
silicon. Various encodes readily available at
[peach.blender.org](https://peach.blender.org/download/) and mirrors.
Plan: park a fixed set on doppler under `/moviedata/fourier-test/` so Gerbera
re-indexes it and every fleet device on LAN + VPN streams identical bits.
Canonical encodes:
| File | Codec | Res | Notes |
|-------------------------------|--------|-------|------------------------------|
| bbb_1080p30_h264.mp4 | H.264 | 1080p | covers all four devices |
| bbb_1080p30_hevc.mp4 | HEVC | 1080p | ampere/boltzmann full; ohm pending rkvdec2 |
| bbb_1080p30_vp9.webm | VP9 | 1080p | ohm pending; RK3588 pending upstream |
| bbb_1080p30_mpeg2.ts | MPEG-2 | 1080p | Path A fresnel + rkvdec1 baseline |
| bbb_1080p30_vp8.webm | VP8 | 1080p | fresnel + RK356x rkvdec1 baseline |
| bbb_2160p60_hevc.mp4 | HEVC | 4K60 | RK3588 stretch goal (ampere/boltzmann only) |
Stretch: add **Tears of Steel** (2012, also CC-BY) for HDR / high-bitrate
HEVC once the stretch goal is viable.
## After ohm
- **ampere** (RK3588) — once RockHard kernel tracks a tree with VDPU381
merged (>= Linux 6.14-ish with the Feb 2026 backport), add `v4l2codecs` and
`ffmpeg-v4l2-request` packaging to RockHard's output. Should be a small
step.
- **boltzmann** (RK3588) — dual-path decision. Short-term: use Path B with
rkmpp (the kernel already exposes it). Long-term: migrate Neutron to a
mainline kernel with VDPU381 and move to Path A for symmetry with ampere.
- **fresnel** (RK3399) — Hantro is mature upstream; likely only needs
userspace install + recipe validation. Endeavour OS package names TBD.
Patches upstreamed along the way (v4l2-request to FFmpeg, VDPU346 to
linux-media) count double — they benefit the whole fleet.
## Known gotchas / watch-for
Sign-posting only — details in linked sources. If you hit one of these,
expand the bullet with your specific find.
- **IOMMU faults.** Collabora's Feb 2026 VDPU381 landing included specific
IOMMU fixes. If dmesg shows rockchip-iommu faults during decode, check
you're on a kernel that carries the fixes, not just the driver.
([Collabora post](https://www.collabora.com/news-and-blog/news-and-events/rk3588-and-rk3576-video-decoders-support-merged-in-the-upstream-linux-kernel.html))
- **Firmware blob.** rkvdec / rkvdec2 need a firmware file under
`/lib/firmware/rockchip/` (varies by SoC). Missing blob → driver probe
fails silently (no /dev/videoN). Check linux-firmware package currency.
- **HEVC UAPI version gating.** New explicit RPS controls arrived with
the Feb 2026 landing; GStreamer 1.28 + FFmpeg-preliminary honour them.
Older userspace on a newer kernel (or vice versa) fails in unintuitive
ways. Pin both sides when diagnosing.
- **Compositor-bound ≠ decode-bound.** The 2023 clehaxze result saw 80%
CPU at 1080p60 H.264 — that was the compositor + dmabuf path, not
decode. Always attribute CPU with `perf top` / `top -H` before calling
decode slow. Use `mpv --vo=null` to isolate the decode path.
- **VOP2 cfg_done (RK3588).** Decoded frames appear via VP0/VP1/VP2; if
the shadow-register latch dance is wrong, frames don't show. See
`feedback_rk3588_cfg_done.md` + `project_bin_vp0_theory.md` in noether
memory. Touches Coulomb / Bin territory — if it gets there, ask.
- **Multi-core VDPU381 on RK3588.** Silicon has two decoder cores;
upstream multi-core support is still landing. Until then, ampere /
boltzmann run on one core and under-deliver vs headline specs.
## Build infrastructure
Fourier touches two kinds of builds: userspace (FFmpeg + patches, GStreamer,
libva-v4l2-request) and kernel (DT + module compiles if we carry out-of-tree
rkvdec2 / VDPU346). Both benefit from distcc — CT108 (14-core aarch64 on
data, wake via `wake-data` on hertz) plus always-on tesla on hertz.
Zeroconf is broken in containers — hand-wire `DISTCC_HOSTS`.
### Kernel / DT modules (cross-build from x86 host)
```sh
DISTCC_HOSTS="localhost/4 192.168.88.208:3632/14,lzo,cpp tesla.lxd:3632/4,lzo,cpp" \
pump make -j80 O=build/ ARCH=arm64 \
CROSS_COMPILE='distcc aarch64-linux-gnu-' HOSTCC='ccache gcc'
```
Pump mode sometimes misbehaves on aarch64 cross builds — fall back to plain
`make -j80` without `pump` if include-server throws.
### FFmpeg / userspace (native aarch64 host — ohm, ampere, fermi build CT)
```sh
export DISTCC_HOSTS="localhost/4 192.168.88.208:3632/14,lzo,cpp tesla.lxd:3632/4,lzo,cpp"
export MAKEFLAGS="-j80"
makepkg -s # AUR / marfrit-packages workflow
# or, from raw tree:
./configure && pump make -j80
```
`makepkg` honours `MAKEFLAGS` but not `DISTCC_HOSTS` for all build hooks —
put `DISTCC_HOSTS` in `/etc/makepkg.conf` or export before invocation. CC/CXX
wrappers: `/usr/lib/distcc/` symlinks on native aarch64 (NOT
`/usr/lib/distcc/bin/`).
Remember to re-run `wake-data` if CT108 is asleep — see `his` skill or the
`project_distcc_infra.md` memory for the full recipe. When that memory and
this README diverge, fix both.
## Packaging — marfrit-packages
New fleet hosts should get Fourier userspace via `pacman -S` / `apt install`,
not per-device AUR rebuilds. Mirror plan in `marfrit-packages` (layout:
`arch/<pkg>/PKGBUILD`, `debian/<pkg>/build-deb.sh + debian/`):
| Package | Source | CI runner | Target | Status |
|------------------------------|-----------------------------------------|-----------|---------|--------|
| `ffmpeg-v4l2-request-git` | Kwiboo `v4l2-request-n8.1` (rebase of Jernej's series onto ffmpeg 8.1), pinned to `_commit` | **fermi** (Arch ARM aarch64) | Path A: ohm, fresnel, ampere | scaffolded 2026-04-24 |
| `ffmpeg-v4l2-request` | same patches against Debian ffmpeg src | **feynman** (Debian aarch64) | Path A Debian hosts | pending |
| `gst-plugins-bad-fourier` | only if stock distro package is <1.28 | fermi / feynman | 1.28 HEVC UAPI uplift | not needed yet (stock ALARM ships 1.28.2) |
| `libva-v4l2-request` | upstream tag | fermi / feynman | VA-API bridge | pending |
Why not the AUR `ffmpeg-v4l2-request-git`: it tracks a 6.1.1 branch with
`epoch=2`, which makes it older than stock Arch `2:8.1-3` — install would
**downgrade** system ffmpeg. It also pulls X11/AMF/CUDA/FireWire/AviSynth/
OpenMPT/Bluray/etc, none relevant on a Wayland ARM video-decode fleet, and
has no commit pin. The marfrit fork tracks Kwiboo's `v4l2-request-n8.1`
(ffmpeg 8.1 base, sideways swap with stock), pins a commit for
reproducibility, and strips the unused deps.
The Arch package is the priority (ohm + fresnel are Arch-based). Debian
package comes second (relevant if a fleet host ever runs Debian — kepler or
similar). Both are `provides=(ffmpeg) conflicts=(ffmpeg)`, so install
deliberately replaces the distro ffmpeg on a given host.
CI trigger on each push. Artifacts land in packages.reauktion.de
(`reference_marfrit_repo_bootstrap.md` in noether memory has the client
bootstrap).
Out of scope: rkmpp-based FFmpeg (Path B). Boltzmann stays on rkr* until
Neutron migrates; that host doesn't need a marfrit-packages entry until
migration.
## Upstreaming policy
**Default: upstream-first.** Anything generic (FFmpeg v4l2-request polish,
GStreamer fixes, linux-media DT nodes for rkvdec2 / VDPU346 on RK356x,
kernel compatible-string additions) goes to the relevant maintainer list
(`ffmpeg-devel`, `linux-media`, `linux-rockchip`, `gstreamer-devel`,
gitlab.fd.o) — not to marfrit-packages as a downstream carrier.
**Fallback: when upstream ideology blocks success**, stop fighting the
review. Mark the patch `# NOT FOR UPSTREAM`, carry it in marfrit-packages,
and document it in this README under a "fleet-local patches" section. The
stance then becomes: *Markus has the only devices in the world which are
hardware-decoding-capable, and whoever wants to follow should get a Claude
plan*. Fleet keeps working; replication path is reproducible; no energy
burnt arguing with maintainers who won't merge.
The bar for switching: a concrete rejection on ideological grounds (licensing
stance, design-religion, NIH), not a normal review cycle. Normal review
friction — respin until it lands.
Known RockHard precedent for downstream-only: the `GCC_PLUGINS` patch
(`project_linux_rk3588.md`). Tag new carrier-only patches the same way for
consistency.
## Working agreements
Standing rules for how we run this project — inherited from the broader
collaboration canon (`feedback_*` / `project_*` in noether's memory system),
captured here so a cold-start Fourier session doesn't re-learn them.
### ReCAP — ReContextualization After Pruning
Claude's context gets compacted when long sessions hit the window limit, and
any live-session state not in durable storage vanishes. Our counter:
- Memory files (`/home/mfritsche/.claude/projects/-home-mfritsche-claude/memory/`,
`MEMORY.md` is the index) are the long-term substrate. Don't keep
load-bearing facts only in conversation.
- Project READMEs (this file) carry the research dossier + current plan.
Update them when state changes, not later.
- Task list on noether carries active work. Status transitions are cheap —
mark in_progress when starting, completed when done.
- After a `/compact`, re-read relevant memory files + README + recent git log
before reasoning. Don't infer from conversation alone.
See `project_recap.md` in memory for the full protocol.
### Commit per experiment
Every experiment that touches the tree — a kconfig change, a DT tweak, a
ffmpeg build flag, a test clip benchmark — gets its own commit on a WIP
branch, with a short message naming what changed and what was observed.
- Dirty trees are tech debt. If the tree is dirty for >30 min, commit.
- Include benchmark numbers / dmesg excerpts / observation in the commit
body, not just the code diff.
- Rebase / squash later if a clean series matters; don't delay the commit
waiting for it.
See `feedback_commit_cadence.md` in memory.
### Ask before flash / reboot
This project spans fleet laptops and always-on hosts. We have **real blast
radius**.
- Never flash anything without a verified, tested backup. Fritz!Box 7490 was
bricked by flashing with an empty backup on 2026-04-12 — we don't repeat
that class of mistake. See `feedback_flash_critical.md`.
- Shared hardware reboots pause the user's other work. Ask before rebooting
data (Proxmox), hertz (home-LAN spine), boltzmann (always-on build host).
See `feedback_no_bulldoze_reboots.md`.
- For kernel / u-boot / firmware changes: simulate first where possible
(QEMU, module-style load) before flashing silicon. See
`feedback_simulate_first.md`.
- **Never** run `update-initramfs` or equivalent on a remote host that boots
from a non-standard root (ZFS, btrfs-with-subvols). See
`feedback_initramfs.md`.
### Off-machine backups
Fleet laptops back up to data via backintime (Anacron mode, daily attempts,
VPN-guarded, 30-day staleness alert by email). Hertz backs up to data
weekly / quarterly via `backup-hertz.sh` (vitruvius dashboard shows
progress). Data itself lives on ZFS RAIDZ2 + snapshots.
- Before any invasive change on a fleet laptop, confirm its last successful
backintime snapshot in `/opt/herding/var/backintime-status/<host>` on
hertz.
- New per-device state (kernel patch series, ffmpeg build tree) lives in a
Gitea repo + gets backed up with the rest of the working tree — don't
leave unique work on a single disk.
- External photos / family data on hertz is mirrored via the hertz backup to
data. Restore recipe is in the two-step `restore-hertz-step1-sd.sh` /
`restore-hertz-step2-lxd.sh` on data.
### See also — broader feedback canon
- **Test the observer first** (`feedback_observer_first.md`) — before
drawing decode-performance conclusions, confirm the rig (v4l2-ctl reports
sensible caps? ffmpeg actually routing through v4l2request?).
- **Three strikes then verify** (`feedback_three_strikes.md`) — after two
failed fixes, stop guessing; verify the binary / wire / protocol.
- **TRM or nothing** (`feedback_trm_or_nothing.md`) — register writes need
documentation backing. Relevant for any DT or driver patches we ship.
- **Trust Markus' eyes** (`feedback_trust_user_eyes.md`) — if he reports
"it plays smoothly", that's primary evidence. Don't over-qualify.
- **No budget framing** (`feedback_no_budget_framing.md`) — don't pre-shrink
scope citing "session cost"; Markus sets pace.
## References
### Mainline kernel state
- [Rockchip RK3588 / RK3576 H.264 and H.265 video decoders gain mainline Linux support (CNX Software, 2026-02-27)](https://www.cnx-software.com/2026/02/27/rockchip-rk3588-rk3576-h-264-and-h-265-video-decoders-mainline-linux/)
- [RK3588 and RK3576 video decoders support merged in the upstream Linux Kernel (Collabora)](https://www.collabora.com/news-and-blog/news-and-events/rk3588-and-rk3576-video-decoders-support-merged-in-the-upstream-linux-kernel.html)
- [Upstream support for Rockchip's RK3588: Progress and future plans (Collabora)](https://www.collabora.com/news-and-blog/news-and-events/rockchip-rk3588-upstream-support-progress-future-plans.html)
- [media: rkvdec: Add support for VDPU381 and VDPU383 (LWN)](https://lwn.net/Articles/1053556/)
- [RKVDEC2 Driver Posted For Accelerated Video Decoding On Newer Rockchip SoCs (Phoronix)](https://www.phoronix.com/news/RKVDEC2-Rockchip-Video-Decode)
### FFmpeg V4L2 request API
- [PATCH v2 0/8 Add V4L2 Request API hwaccels for MPEG2, H.264 and HEVC (ffmpeg-devel, 2024-08)](https://ffmpeg.org/pipermail/ffmpeg-devel/2024-August/332034.html)
- [Miouyouyou/FFmpeg-V4L2-Request (build script)](https://github.com/Miouyouyou/FFmpeg-V4L2-Request)
- [ffmpeg v4l2 requests 4.4.3 patchset (artemis.sh, 2023-03)](https://artemis.sh/2023/03/06/ffmpeg-v4l2-requests-4.4.3.html)
### GStreamer v4l2codecs
- [GStreamer v4l2codecs plugin docs](https://gstreamer.freedesktop.org/documentation/v4l2codecs/index.html)
- [Adding VP9 and MPEG2 stateless support in v4l2codecs for GStreamer (Collabora, 2021)](https://www.collabora.com/news-and-blog/blog/2021/06/23/adding-vp9-and-mpeg2-stateless-support-in-v4l2codecs-for-gstreamer/)
- [GStreamer 1.26 — improved hardware efficiency (Collabora)](https://www.collabora.com/news-and-blog/news-and-events/gstreamer-126-improved-hardware-efficiency.html)
### PineTab2 specific
- [Hardware accelerated playback on PineTab 2 (clehaxze.tw, 2023-09)](https://clehaxze.tw/gemlog/2023/09-17-hardware-accelerated-playback-on-pinetab2.gmi)
- [PINE64 Mainline Hardware Decoding wiki](https://wiki.pine64.org/wiki/Mainline_Hardware_Decoding)
- [dreemurrs-embedded/linux-pinetab2 releases (6.15.2-danctnix2 latest)](https://github.com/dreemurrs-embedded/linux-pinetab2/releases)