ohm step 2: SW baseline — 127% CPU, 54% frame drops via KWin/gpu-next
ffmpeg -hwaccel none, bbb_1080p30_h264.mp4 (fleet-canonical, SHA-16 dcf8a7170fbd49bb, pulled from doppler /moviedata/fourier-test/ via hertz lxc file pull + ephemeral http bridge). - Max throughput (no -re, no display): 1440 frames in 18.55 s → 77.6 fps, 319% CPU (≈3.2 of 4 A55 cores). 2.6× realtime headroom. - Realtime -re (no display): 1440 frames in 60.27 s, 90% CPU (≈1 core). Per-frame decode cost ≈40 ms. - mpv --hwdec=no --vo=gpu-next through KWin on DSI-1: 127% CPU but 973 dropped frames out of 1800 target → 54% drop rate, ~14 fps delivered. Decode is not the bottleneck; the compositor/VO path is. HW decode will cut decode CPU but not by itself fix delivery drops — the "compositor- bound ≠ decode-bound" gotcha the README flagged. Direct KMS scanout (--vo=drm) or dmabuf-wayland will be the lever for the delivery side. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -158,10 +158,29 @@ picks up rkvdec2 patches.
|
||||
6. **Kodi + mpv validation** — 1080p HEVC at <30% of one core target
|
||||
7. **Document** — freeze the final recipe in this README
|
||||
|
||||
**Status 2026-04-24**: ohm reachable; step 1 (live recon) complete; step 3
|
||||
effectively done for H.264 / MPEG-2 / VP8 (driver bound, node exposed).
|
||||
Open threads: step 2 (SW baseline numbers) and step 5 (ffmpeg-v4l2-request
|
||||
build + install).
|
||||
**Status 2026-04-24**: ohm reachable; steps 1 + 2 + 3 done for H.264 on
|
||||
ohm. SW baseline numbers below; step 5 (ffmpeg-v4l2-request build +
|
||||
install via marfrit-packages) is next.
|
||||
|
||||
### SW baseline numbers (2026-04-24, ohm, `bbb_1080p30_h264.mp4`)
|
||||
|
||||
Clip SHA-16 `dcf8a7170fbd49bb` pulled from doppler
|
||||
`/moviedata/fourier-test/` via hertz `lxc file pull` → http bridge → ohm
|
||||
(same bytes on every fleet device).
|
||||
|
||||
| Test | CPU% | Frames / wall | Effective fps | Dropped |
|
||||
|----------------------------------------|-------|----------------|---------------|---------|
|
||||
| `ffmpeg -hwaccel none -f null -` | 319 % | 1440 / 18.55 s | 77.6 | n/a |
|
||||
| `ffmpeg -re -hwaccel none -f null -` | 90 % | 1440 / 60.27 s | 23.9 (paced) | n/a |
|
||||
| `mpv --hwdec=no --vo=gpu-next`, DSI-1 | 127 % | 1800 target | ~14 delivered | **973** |
|
||||
|
||||
Reading: raw SW decode has ~2.6× headroom over realtime on the 4× A55
|
||||
(≈1 core saturated per 30 fps stream). But mpv with `gpu-next` through
|
||||
KWin drops **54 %** of frames while only using 127 % CPU — the
|
||||
compositor / VO path is the bottleneck, not decode. Exactly the
|
||||
"compositor-bound ≠ decode-bound" gotcha below. HW decode will cut the
|
||||
decode cost but won't by itself fix the composition drops; `--vo=drm`
|
||||
(direct KMS scanout) or a dmabuf-capable VO is the lever for that.
|
||||
|
||||
### Acceptance criterion
|
||||
|
||||
|
||||
Reference in New Issue
Block a user