# Pre-Compact Handoff — Session 2026-05-14 (FINAL post iter38)
Use this doc to resume the fresnel-fourier campaign after Claude context compaction. **Campaign at definitive close: 5/5 codecs PASS in a single libva session, no env-override required.**
## TL;DR
| Bug / Item | Status | Fix iter |
|---|---|---|
| Bug 4 (H.264 keyframe-partial) | **FIXED** | iter25 α-25 (rkvdec image_fmt pre-seed via synthetic SPS at CreateContext) |
| Bug 5 (HEVC libva all-zero CAPTURE) | **FIXED** | iter25 α-25 (frame 1) + iter31 α-29 (frames 2+: slice_params.short_term_ref_pic_set_size from VAAPI st_rps_bits) |
| VP8 wrong output through libva | **FIXED** | iter33 α-30 (prepend 10/3 byte VP8 uncompressed header to OUTPUT — ffmpeg-vaapi strips it) |
| MPEG-2 HW differs from SW | **NOT A BUG** | hantro IDCT precision (≤3 LSB / pixel, SSIM > 0.9999); libva == kdirect bit-exact |
| Kernel diagnostic printks | **CLEANED** | iter32 (7.0-11) + iter34 (7.0-14) |
| Env-gated DIAG probes (iter29/30/33/35) | **CLEANED** | iter36 (-131 / +7 LOC) |
| α-26 mis-routed cosmetic | **REVERTED** | iter37 (1-line; rkvdec never read that field) |
| Libva multi-device probe | **DONE** | iter38 (single session serves all 5 codecs; no env override needed) |
| Codec | libva 10F sha | kdirect 10F sha | SW 10F sha | L==K | L==SW |
|---|---|---|---|---|---|
| H.264 | dd4f5f2d552c07bc | same | same | ✓ | ✓ |
| HEVC | 108f925bb6cbb6c9 | same | same | ✓ | ✓ |
| VP9 | cf35908ae0f9ab60 | same | same | ✓ | ✓ |
| VP8 | d3231e5b6c0ee10b | same | same | ✓ | ✓ |
| MPEG-2 | 95c5905890c937d4 | same | 933b744134e47ba4 | ✓ | ~ (≤3 LSB IDCT precision) |
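**5/5 PASS** the libva-vs-kdirect bit-exact correctness contract; 4/5 are also bit-identical to the SW decode (MPEG-2 differs only by hantro IDCT precision). Each hash is the first 16 hex digits of the SHA-256 over the first 10 decoded NV12 frames ("10F").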
`vainfo` with NO env override enumerates the union of profiles from rkvdec + hantro:
```
v4l2-request: auto-selected codec device: /dev/video3 + /dev/media1
v4l2-request: iter38: also opened hantro-vpu decoder at /dev/video2 + /dev/media0
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointVLD
VAProfileH264High : VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264MultiviewHigh : VAEntrypointVLD
VAProfileH264StereoHigh : VAEntrypointVLD
VAProfileHEVCMain : VAEntrypointVLD
VAProfileVP8Version0_3 : VAEntrypointVLD
VAProfileVP9Profile0 : VAEntrypointVLD
```
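A quick regression guard for this enumeration could count the `VAProfile…` lines instead of comparing them by eye. The sketch below is an assumption-laden helper, not part of the campaign repo: `count_vld_profiles` is a hypothetical name, and it only matches lines shaped like the listing above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count "VAProfile<name> : VAEntrypointVLD" lines read from stdin.
count_vld_profiles() {
    # grep -c still prints 0 when nothing matches, but exits 1; "|| true" hides that status.
    grep -c 'VAProfile[A-Za-z0-9_]*[[:space:]]*:[[:space:]]*VAEntrypointVLD' || true
}

# Usage on fresnel (the listing above contains 10 such lines):
#   env LIBVA_DRIVER_NAME=v4l2_request \
#       LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
#       vainfo 2>/dev/null | count_vld_profiles
```

A count below 10 on a clean 7.0-14 boot would suggest that one of the two decoders was not opened.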
## Substrate state
| Component | Location | Tip |
|---|---|---|
| Campaign repo (this) | `/home/mfritsche/src/fresnel-fourier/` | `ba4b6fd` on gitea master |
| Libva backend fork (noether) | `/home/mfritsche/src/libva-multiplanar/libva-v4l2-request-fourier/` | `7ac934e` on gitea master |
| Libva backend (fresnel deploy) | `/home/mfritsche/src/libva-v4l2-request-fourier/` | sync to gitea master, `ninja -C build` |
| Kernel source (boltzmann) | `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/` | pkgrel=14 clean |
| Kernel running on fresnel | `linux-fresnel-fourier 7.0-14` | clean shipping kernel, no diagnostic printks |
| Test fixtures (fresnel) | `/home/mfritsche/fourier-test/bbb_*.{mp4,ts,webm}` | 5 codecs at 720p10s or 1080p30 |
| Memory | `~/.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/` | see entries below |
## Identity for gitea pushes
All `git.reauktion.de` interactions use the `claude-noether` identity (per memory `feedback_gitea_as_claude_noether.md`). Backend remote URL: `ssh://gitea@git.reauktion.de.claude-noether/marfrit/libva-v4l2-request-fourier.git`.
## Device map on 7.0-14
`/dev/video*` and `/dev/media*` numbers SHIFT between kernel boots based on probe order. On the current 7.0-14 boot:
| Driver | /dev/videoN | /dev/mediaN |
|---|---|---|
| rockchip-rga | video0 | n/a |
| rk3399-vpu-enc | video1 | (shared) |
| rk3399-vpu-dec (hantro) | **video2** | **media0** |
| rkvdec | **video3** | **media1** |
If the mapping is uncertain after a fresh boot, check it with `v4l2-ctl --info` and `media-ctl -p`. Iter38 makes this irrelevant for typical use, since libva auto-probes both decoders.
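Because the numbering shifts between boots, scripts that still need an explicit node can look it up by name rather than hard-coding `video2`/`video3`. The following is a minimal sketch that reads the `name` attribute under `/sys/class/video4linux`; `find_video_node` is a hypothetical helper, and the patterns in the usage lines are guesses derived from the driver names in the table, so check what the nodes actually report before relying on them.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print /dev/videoN for the first V4L2 node whose sysfs
# "name" matches an extended regex.
# $1: pattern; $2: sysfs root (defaults to /sys/class/video4linux, overridable for testing).
find_video_node() {
    local pattern=$1 root=${2:-/sys/class/video4linux} entry
    for entry in "$root"/video*; do
        [ -r "$entry/name" ] || continue
        if grep -Eq -- "$pattern" "$entry/name"; then
            printf '/dev/%s\n' "${entry##*/}"
            return 0
        fi
    done
    return 1
}

# Usage (patterns are assumptions, not confirmed driver strings):
#   find_video_node 'rkvdec'
#   find_video_node 'vpu-dec'
```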
## Backend commits delivered (chronological, this campaign day)
```
7ac934e iter38b: bounds check uses MAX_PROFILES (11), not MAX_CONFIG_ATTRIBUTES (10)
c56a77b iter38: multi-device probe — single libva session serves all 5 codecs ← architectural close
25d3e5f iter37: revert α-26 — decode_params.short_term_ref_pic_set_size back to 0
7db15a5 iter36: remove env-gated DIAG probes (iter29/30/33/35)
48fd028 iter35 DIAG: env-gated dump of v4l2_ctrl_mpeg2_* contents (removed iter36)
7e0848d iter33 α-30: prepend VP8 uncompressed frame header to OUTPUT buffer ← VP8 fix
bf3e3d8 iter33: extend VP8 DIAG to dump VAAPI probability struct directly (removed iter36)
4b3c21b iter33 DIAG: env-gated dump of v4l2_ctrl_vp8_frame contents (removed iter36)
23eb1bd iter31 α-29: slice_params.short_term_ref_pic_set_size = picture->st_rps_bits ← HEVC fix
68dbbdd iter30 DIAG: LIBVA_TS_SCALE env-gated timestamp multiplier (removed iter36)
0eca3ff iter29 DIAG: env-gated dump of HEVC slice_data trailing 80 bytes (removed iter36)
6646b16 Revert iter28b DIAG: trim=40 universal-trim broke IDR frame 1
cd286d9 iter28 α-28: bit_size = (slice_data_size - data_byte_offset) * 8 for HEVC
754be1d iter27 diag: env-gated VAAPI slice fields dump
719d813 iter27 α-27: populate slice_params.num_entry_point_offsets (no-op)
66ef848 iter26 α-26: decode_params.short_term_ref_pic_set_size from VAAPI (reverted iter37)
d062fec iter25 α-25 fix: FRAME_MBS_ONLY flag for H264 dummy SPS
db0b7f9 iter25 α-25: inject synthetic SPS before cap_pool_init to seed image_fmt ← H264+HEVC frame 1 fix
```
Load-bearing commits: `db0b7f9 + d062fec` (α-25), `23eb1bd` (α-29), `7e0848d` (α-30), `c56a77b + 7ac934e` (iter38 multi-device).
## Campaign repo commits delivered (today's arc)
```
ba4b6fd iter38 close: multi-device probe — 5/5 codecs in one libva session
7e3eadf iter36 close: env-gated DIAG removed, 5/5 PASS retained
7c06c51 iter35 close: MPEG-2 verified libva-correct; HW IDCT precision intrinsic
70ddbd6 iter34 close: kernel 7.0-14 CLEAN ship — 5/5 codecs PASS
cd2d077 iter33: MPEG-2 closed (libva==kdirect bit-exact) — 5/5 codecs PASS
51eee19 iter33 α-30 close: VP8 FIXED — 4/5 codecs PASS
acacf3d iter32 close: kernel substrate cleanup landed → 7.0-11 SHIPPING
85cc178 Update campaign session doc: full-day arc closes at 3/3 PASS
fde8a25 Update handoff doc: HEVC Bug 5 fully fixed (3/3 PASS)
c1f9738 iter31 α-29 close: HEVC Bug 5 remainder FIXED — 3/3 PASS
422ecaf Add pre-compact handoff doc for session resumption
… earlier in day: c15fc6c, 8b17bf7, 02c4192, bf67900 (iter20-28 chain)
```
## How to verify the current state
Run on fresnel (post-7.0-14 boot, no env override needed):
```bash
for codec in h264:bbb_1080p30_h264.mp4 hevc:bbb_720p10s_hevc.mp4 vp9:bbb_720p10s_vp9.webm vp8:bbb_720p10s_vp8.webm mpeg2:bbb_720p10s_mpeg2.ts; do
    name="${codec%%:*}"; fixture="${codec#*:}"
    # Drop stale output so a failed decode cannot reuse an old file
    rm -f "/tmp/L_${name}.yuv" "/tmp/K_${name}.yuv"
    # L: decode through libva (VAAPI) with the v4l2_request backend
    env LIBVA_DRIVER_NAME=v4l2_request \
        LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
        ffmpeg -hide_banner -loglevel error -y \
        -hwaccel vaapi -hwaccel_output_format vaapi \
        -i "/home/mfritsche/fourier-test/$fixture" \
        -vf "hwdownload,format=nv12" -frames:v 10 \
        -f rawvideo -pix_fmt nv12 "/tmp/L_${name}.yuv"
    # K: kdirect reference through ffmpeg's v4l2request hwaccel
    ffmpeg -hide_banner -loglevel error -y -hwaccel v4l2request -hwaccel_output_format drm_prime \
        -i "/home/mfritsche/fourier-test/$fixture" -vf "hwdownload,format=nv12" \
        -frames:v 10 -f rawvideo -pix_fmt nv12 "/tmp/K_${name}.yuv"
    L=$(sha256sum "/tmp/L_${name}.yuv" | cut -c1-16)
    K=$(sha256sum "/tmp/K_${name}.yuv" | cut -c1-16)
    # Require non-empty outputs so two failed decodes cannot compare equal
    if [ -s "/tmp/L_${name}.yuv" ] && [ -s "/tmp/K_${name}.yuv" ] && [ "$L" = "$K" ]; then
        echo "$name: PASS"
    else
        echo "$name: FAIL"
    fi
done
```
Expect: 5× PASS.
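The hash comparison only covers the libva-vs-kdirect contract. The MPEG-2 HW-vs-SW gap from the TL;DR (≤3 LSB per pixel) needs a per-byte comparison. A minimal sketch using `cmp -l` follows, assuming a software-decoded reference has been written separately; `max_abs_diff` and the `/tmp/SW_mpeg2.yuv` filename are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the largest absolute per-byte difference between
# two equal-sized files (0 if they are identical).
max_abs_diff() {
    local a=$1 b=$2
    # A size mismatch is a different kind of failure, so refuse it explicitly.
    [ "$(wc -c < "$a")" -eq "$(wc -c < "$b")" ] || { echo "size mismatch" >&2; return 2; }
    # cmp -l lists each differing byte as: <offset> <octal value in a> <octal value in b>.
    # cmp exits 1 when the files differ, which is expected here.
    { cmp -l "$a" "$b" || true; } | awk '
        function oct(s,   i, n) { n = 0; for (i = 1; i <= length(s); i++) n = n * 8 + substr(s, i, 1); return n }
        { d = oct($2) - oct($3); if (d < 0) d = -d; if (d > max) max = d }
        END { print max + 0 }'
}

# Usage (SW reference filename is an assumption):
#   max_abs_diff /tmp/L_mpeg2.yuv /tmp/SW_mpeg2.yuv
```

A result above 3 for MPEG-2 would contradict the IDCT-precision explanation and would be worth investigating before treating the codec as closed.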
## Root cause summary
**Bug 4 + Bug 5 frame 1 (iter25 α-25)**: `rkvdec_s_ctrl` returns -EBUSY when the first SPS triggers an image_fmt reset on a busy CAPTURE queue. libva had pre-allocated 24 CAPTURE buffers at CreateContext (iter5b-β design), before the per-frame S_EXT_CTRLS. Fix: inject a synthetic SPS at CreateContext, before cap_pool_init, while the CAPTURE queue is still empty.
**Bug 5 frame 2+ (iter31 α-29)**: libva backend set `slice_params->short_term_ref_pic_set_size = 0` (stale "VAAPI doesn't expose" comment). rkvdec's `assemble_sw_rps` (rkvdec-hevc.c:386-389) reads this; when zero with `num_short_term_ref_pic_sets <= 1`, falls back to 0 → entropy decoder consumes slice-header bits as long-term-RPS → garbage for every non-IDR slice. IDR is gated by `!IDR_PIC` so frame 1 was unaffected. Fix: `slice_params->short_term_ref_pic_set_size = picture->st_rps_bits` (VAAPI's field IS the slice-header bit count, per `va_dec_hevc.h` doc). α-26 had mis-routed this value into `decode_params` (same field name in V4L2, different semantics — SPS-side bit count) — reverted in iter37.
**VP8 (iter33 α-30)**: ffmpeg-vaapi strips the VP8 uncompressed frame header (3 bytes interframe / 10 bytes keyframe) before submitting via VAAPI. ffmpeg-v4l2request keeps it. Hantro hard-codes `first_part_offset = V4L2_VP8_FRAME_IS_KEY_FRAME(hdr) ? 10 : 3` and uses it for both `mb_offset_bits` and `dct_part_offset`. Without the prepended header in libva's OUTPUT, hantro's offset arithmetic lands inside the compressed bitstream and the entropy decoder produces garbage. Fix: in `codec_store_buffer`, prepend `header_size` zero bytes to OUTPUT for VP8 profile (hantro skips these bytes for actual parsing, uses ctrl-struct values).
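The actual fix lives in the C backend's `codec_store_buffer`. The shell sketch below only illustrates the resulting OUTPUT byte layout, which can be handy when building a test vector by hand. `vp8_prepend_header` is a hypothetical helper, and the key-frame flag is passed in explicitly because the stripped payload no longer carries the frame tag.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write <header_size> zero bytes followed by the payload,
# mirroring the offsets hantro assumes (10 bytes for a key frame, 3 for an interframe).
# $1: 1 for a key frame, 0 for an interframe; $2: payload file; $3: output file.
vp8_prepend_header() {
    local is_key=$1 payload=$2 out=$3 header_size
    if [ "$is_key" -eq 1 ]; then header_size=10; else header_size=3; fi
    { head -c "$header_size" /dev/zero; cat -- "$payload"; } > "$out"
}

# Usage (filenames are placeholders):
#   vp8_prepend_header 1 keyframe_payload.bin keyframe_output.bin
```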
**Multi-device probe (iter38)**: VA_DRIVER_INIT opens BOTH rkvdec + hantro fds. `RequestCreateConfig` retargets `driver_data->{video,media}_fd` to the right device per profile (tearing down pools on switch). `RequestQueryConfigProfiles` unions across all open fds. iter38b fixed a latent off-by-one: bounds checks used `MAX_CONFIG_ATTRIBUTES` (10) but profile array is sized by `MAX_PROFILES` (11) — pre-iter38 never returned more than 9 profiles so the bug never bit.
## Open items (low priority, optional polish)
1. **Simultaneous multi-context decode** — the current design supports only one decode context at a time across devices (a device switch tears down the pools). It could be extended to per-context pools to support simultaneous mixed-codec decode. Not requested.
2. **Sub-profile support** — H264 Hi10P, HEVC Main10, VP9 Profile 2 are HW-supported on RK3399 but the libva backend has no entries in `pixelformat_for_profile` and elsewhere. Out of scope for this campaign.
## Memory entries (full campaign set)
- `feedback_rkvdec_image_fmt_pre_seed.md` — α-25 (Bug 4 + Bug 5 frame 1)
- `feedback_va_st_rps_bits_is_slice_field.md` — α-29 (Bug 5 frame 2+)
- `feedback_vaapi_strips_vp8_uncompressed_header.md` — α-30 (VP8)
- `feedback_mpeg2_hw_sw_idct_precision.md` — MPEG-2 PASS criterion = libva==kdirect (HW vs SW gap intrinsic per spec)
- `feedback_multi_device_probe_design.md` — iter38 dual-fd architecture + MAX_PROFILES bounds gotcha
- `feedback_libva_byte_correct_kernel_bug.md` — **FULLY OVERTURNED** (both Bug 4 + Bug 5 are libva-side fixes)
- `reference_fresnel_kernel_substrate.md` — 7.0-14 clean, device-enumeration-shift caveat
- MEMORY.md index updated
## Key commands quick reference
```bash
# Sync backend on fresnel + rebuild
ssh fresnel 'cd ~/src/libva-v4l2-request-fourier && git fetch && git reset --hard origin/master && ninja -C build'
# 5-codec smoke (above script). Each codec ~5s.
# Identify which video device is rkvdec vs hantro after a fresh boot
ssh fresnel 'for v in /dev/video*; do v4l2-ctl -d $v --info 2>/dev/null | grep -E "^Card type" | head -1 | awk -v dev=$v "{print dev,\$0}"; done'
# vainfo (auto-detects + opens both decoders since iter38)
ssh fresnel 'env LIBVA_DRIVER_NAME=v4l2_request \
LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
vainfo'
# kdirect reference (works for any codec; hwaccel auto-routes)
ssh fresnel 'ffmpeg -hide_banner -loglevel error -y -hwaccel v4l2request -hwaccel_output_format drm_prime \
-i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
-vf "hwdownload,format=nv12" -frames:v 10 -f rawvideo -pix_fmt nv12 /tmp/y.yuv'
# Force single-device mode (skip iter38 alt-probe)
env LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media1 ...
# Reboot fresnel (sddm autologin reseats mfritsche)
ssh fresnel 'sudo systemctl reboot'; sleep 60
```
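The fixed `sleep 60` after a reboot can be replaced by polling until the host answers again. This is a minimal sketch: `retry_until` is a hypothetical helper, and the 30 × 5 s budget in the usage line is an arbitrary assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds, at most $1 times,
# sleeping $2 seconds between attempts. Returns 1 if every attempt fails.
retry_until() {
    local tries=$1 delay=$2 i
    shift 2
    for ((i = 1; i <= tries; i++)); do
        "$@" && return 0
        # No point sleeping after the final attempt.
        [ "$i" -lt "$tries" ] && sleep "$delay"
    done
    return 1
}

# Usage: reboot, then wait until ssh works again instead of sleeping blindly.
#   ssh fresnel 'sudo systemctl reboot'
#   retry_until 30 5 ssh -o ConnectTimeout=5 -o BatchMode=yes fresnel true
```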
## Safe vs needs-confirmation actions
**Safe (no confirmation needed)**:
- Read/grep on noether, boltzmann, fresnel
- Push to gitea (claude-noether identity)
- Reboot fresnel (sddm autologin restores session)
- Build kernel on boltzmann via `makepkg -ef --skipinteg --noconfirm`
- Deploy kernel via `scp` + `sudo pacman -U`
- Run ffmpeg/cmp tests on fresnel
**Needs confirmation**:
- Significant rebuild (~25-30 min CPU on boltzmann, e.g. ffmpeg full rebuild or fresh kernel build)
- Per-context pool refactor (item 1 — would allow simultaneous mixed-codec decode but is invasive)
- Sub-profile rollout (item 2)