From 422ecafca983befc6c0d88e34e0bb196a5e36d71 Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Thu, 14 May 2026 14:55:15 +0000
Subject: [PATCH] Add pre-compact handoff doc for session resumption

---
 PRE_COMPACT_HANDOFF.md | 194 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)
 create mode 100644 PRE_COMPACT_HANDOFF.md

diff --git a/PRE_COMPACT_HANDOFF.md b/PRE_COMPACT_HANDOFF.md
new file mode 100644
index 0000000..0c4d3a3
--- /dev/null
+++ b/PRE_COMPACT_HANDOFF.md
@@ -0,0 +1,194 @@
+# Pre-Compact Handoff: Session 2026-05-14
+
+Use this doc to resume the fresnel-fourier campaign after Claude context compaction.
+
+## TL;DR (read first)
+
+- **Bug 4 (H.264 keyframe-partial): FIXED.** The first 10 H.264 frames are byte-equal to the SW reference.
+- **Bug 5 (HEVC libva all-zero): PARTIAL.** Frame 1 is byte-equal to SW; frames 2+ diverge (separate ffmpeg-vaapi slice_data inflation bug, deferred).
+- **VP9**: unchanged (HW=SW byte-equal).
+- **MPEG-2 / VP8**: untestable through libva on current kernel boot (pre-existing libva single-device profile-probe limitation; auto-select picks rkvdec which doesn't expose those profiles).
+- Root cause identified after 6 kernel-printk iterations: `rkvdec_s_ctrl` returns -EBUSY when the first SPS triggers an `image_fmt` reset on a busy CAPTURE queue. Fixed by synthetic SPS injection at libva CreateContext.
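The ordering problem in the last bullet can be modelled in a few lines of C. This is a toy sketch, not driver or backend code: `toy_ctx`, `toy_set_sps` and the `TOY_*` names are hypothetical stand-ins, and the format check is simplified to a plain inequality.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for the driver state; NOT the real rkvdec structures. */
enum toy_image_fmt { TOY_FMT_ANY = 0, TOY_FMT_420_8BIT };

struct toy_ctx {
    enum toy_image_fmt image_fmt; /* starts as ANY, like ctx->image_fmt at open */
    bool capture_busy;            /* true once CAPTURE buffers are allocated */
};

/* Models the SPS path: a format change on a busy CAPTURE queue is refused. */
static int toy_set_sps(struct toy_ctx *ctx, enum toy_image_fmt sps_fmt)
{
    assert(sps_fmt != TOY_FMT_ANY); /* an SPS always resolves to a concrete format */
    if (ctx->image_fmt != sps_fmt) {
        if (ctx->capture_busy)
            return -EBUSY;
        ctx->image_fmt = sps_fmt;
    }
    return 0;
}

/* Pre-fix ordering: allocate CAPTURE first, then send the first real SPS. */
static int toy_alloc_then_sps(void)
{
    struct toy_ctx ctx = { TOY_FMT_ANY, false };
    ctx.capture_busy = true;
    return toy_set_sps(&ctx, TOY_FMT_420_8BIT);
}

/* Post-fix ordering: seed the format, allocate CAPTURE, then send the real SPS. */
static int toy_seed_then_alloc_then_sps(void)
{
    struct toy_ctx ctx = { TOY_FMT_ANY, false };
    int ret = toy_set_sps(&ctx, TOY_FMT_420_8BIT); /* synthetic SPS */
    if (ret)
        return ret;
    ctx.capture_busy = true;
    return toy_set_sps(&ctx, TOY_FMT_420_8BIT);    /* first per-frame SPS */
}
```

In this model `toy_alloc_then_sps()` returns `-EBUSY` and `toy_seed_then_alloc_then_sps()` returns 0, which is the whole point of seeding the format before the CAPTURE pool exists.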
+
+## Substrate state (where things live)
+
+| Component | Location | Tip |
+|---|---|---|
+| Campaign repo (this) | `/home/mfritsche/src/fresnel-fourier/` | `c15fc6c` on gitea master |
+| Libva backend fork (noether) | `/home/mfritsche/src/libva-multiplanar/libva-v4l2-request-fourier/` | `6646b16` on gitea master |
+| Libva backend (fresnel deploy) | `/home/mfritsche/src/libva-v4l2-request-fourier/` | sync to gitea master, run `ninja -C build` |
+| Kernel source (boltzmann) | `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/` | pkgrel=9 with iter17/20/21/22/23/27 diag printks |
+| Kernel running on fresnel | `linux-fresnel-fourier 7.0-9` | diagnostic build; revert to clean 7.0-X before any production work |
+| Test fixtures (fresnel) | `/home/mfritsche/fourier-test/bbb_*.{mp4,ts,webm}` | 5 codecs at 720p10s or 1080p30 |
+| OUTPUT-buffer dumps (fresnel) | `/tmp/out_dump/output_*.bin` | from α-16 env `LIBVA_V4L2_DUMP_OUTPUT` |
+| Memory | `/home/mfritsche/.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/` | `feedback_rkvdec_image_fmt_pre_seed.md` is the key new entry |
+
+## Identity for gitea pushes
+
+All `git.reauktion.de` interactions use `claude-noether` identity (per memory `feedback_gitea_as_claude_noether.md`). Backend remote URL: `ssh://gitea@git.reauktion.de.claude-noether/marfrit/libva-v4l2-request-fourier.git`.
+
+## Backend commits delivered this session
+
+```
+6646b16 Revert iter28b DIAG: trim=40 universal-trim breaks IDR frame 1
+c555788 iter28b DIAG: env-gated trim of HEVC slice_data trailing N bytes (reverted)
+cd286d9 iter28 α-28: bit_size = (slice_data_size - data_byte_offset) * 8 for HEVC (no-op, rkvdec ignores)
+754be1d iter27 diag: env-gated VAAPI slice fields dump
+c9bfa21 iter27: remove request_log diag
+719d813 iter27 α-27: populate slice_params.num_entry_point_offsets from VAAPI (no-op)
+66ef848 iter26 α-26: populate decode_params.short_term_ref_pic_set_size from VAAPI st_rps_bits
+d062fec iter25 α-25 fix: add FRAME_MBS_ONLY to H264 dummy SPS
+db0b7f9 iter25 α-25: inject synthetic SPS before cap_pool_init to seed image_fmt ← THE FIX
+```
+
+## Campaign repo commits delivered
+
+```
+c15fc6c iter28b DIAG documented: universal trim=40 breaks IDR (reverted)
+8b17bf7 Final session summary: H264 + VP9 + HEVC frame 1 byte-equal to SW
+02c4192 iter27/28: probe HEVC frame 2+ divergence; α-27/α-28 no-op; ffmpeg-vaapi slice_data inflation localized
+bf67900 iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
+```
+
+Phase docs (chronological): `phase4_iter21_plan.md`, `phase4_iter22_plan.md`, `phase8_iteration20_close.md` … `phase8_iteration27_close.md`, `CAMPAIGN_SESSION_2026_05_14.md`.
+
+## How to verify the current state
+
+Run on fresnel after `git pull` + `ninja -C build` in `~/src/libva-v4l2-request-fourier`:
+
+```bash
+# H.264: should be byte-equal to SW (Bug 4 fixed)
+env LIBVA_DRIVER_NAME=v4l2_request \
+    LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
+    ffmpeg -hide_banner -loglevel error -y -hwaccel vaapi -hwaccel_output_format vaapi \
+    -i /home/mfritsche/fourier-test/bbb_1080p30_h264.mp4 \
+    -vf "hwdownload,format=nv12,crop=1920:1080:0:0" \
+    -frames:v 10 -f rawvideo -pix_fmt nv12 /tmp/libva_h264.yuv
+ffmpeg -hide_banner -loglevel error -y \
+    -i /home/mfritsche/fourier-test/bbb_1080p30_h264.mp4 \
+    -frames:v 10 -f rawvideo -pix_fmt nv12 /tmp/sw_h264.yuv
+cmp /tmp/libva_h264.yuv /tmp/sw_h264.yuv  # SHOULD print nothing (equal)
+
+# HEVC: frame 1 byte-equal, frames 2+ differ
+env LIBVA_DRIVER_NAME=v4l2_request \
+    LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
+    ffmpeg -hide_banner -loglevel error -y -hwaccel vaapi -hwaccel_output_format vaapi \
+    -i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
+    -vf "hwdownload,format=nv12,crop=1280:720:0:0" \
+    -frames:v 1 -f rawvideo -pix_fmt nv12 /tmp/libva_hevc1.yuv
+ffmpeg -hide_banner -loglevel error -y \
+    -i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
+    -frames:v 1 -f rawvideo -pix_fmt nv12 /tmp/sw_hevc1.yuv
+cmp /tmp/libva_hevc1.yuv /tmp/sw_hevc1.yuv  # SHOULD print nothing
+```
+
+## Root cause (saved to memory)
+
+`rkvdec_s_ctrl` on first HEVC_SPS / H264_SPS:
+
+```c
+image_fmt = desc->ops->get_image_fmt(ctx, ctrl);
+if (rkvdec_image_fmt_changed(ctx, image_fmt)) {
+    vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
+    if (vb2_is_busy(vq))
+        return -EBUSY;   // ← THIS
+    ctx->image_fmt = image_fmt;
+    rkvdec_reset_decoded_fmt(ctx);
+}
+```
+
+`ctx->image_fmt` defaults to `RKVDEC_IMG_FMT_ANY` at open. The first per-frame SPS resolves to a concrete value (e.g., `RKVDEC_IMG_FMT_420_8BIT`); since ANY ≠ concrete, `image_fmt_changed` returns true → the driver tries a reset → `vb2_is_busy` returns true (libva pre-allocated 24 CAPTURE buffers at CreateContext) → -EBUSY → the setup loop breaks → the SPS is never committed to `ctx->ctrl_hdl` → `rkvdec_hevc_run` reads zeros → all-zero CAPTURE.
+
+## The fix (α-25)
+
+In `src/context.c::RequestCreateContext`, BEFORE `cap_pool_init` (line ~215), inject one synthetic SPS via non-request `v4l2_set_controls`:
+
+- VAProfileHEVCMain: synthetic `v4l2_ctrl_hevc_sps` with chroma_format_idc=1 (4:2:0), bit depth fields (`*_minus8`) = 0 (8-bit), width/height from caller.
+- VAProfileH264*: synthetic `v4l2_ctrl_h264_sps` with the same chroma format and bit depth, plus the `V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY` flag (otherwise `rkvdec_h264_validate_sps` doubles the height and rejects the SPS).
+
+At this point CAPTURE is empty → `vb2_is_busy=false` → `rkvdec_s_ctrl` succeeds → `ctx->image_fmt = RKVDEC_IMG_FMT_420_8BIT`. From then on, each per-frame SPS finds `image_fmt_changed=false` → skips the reset → commits successfully.
+
+Source: see commit `db0b7f9` for the full diff.
+
+## Open items (deferred)
+
+### 1. HEVC frame 2+ divergence
+
+For non-IDR HEVC frames, libva's `slice_data_size` from VAAPI is consistently 40 bytes larger than `ffmpeg-v4l2request`'s `size` parameter (= the `nal->raw_size` value libavcodec dispatches). The 40 extra bytes inflate libva's OUTPUT buffer → rkvdec reads past the slice payload → wrong reference → visual garbage from frame 2 onward.
+
+Evidence (from iter27 kernel printk):
+
+```
+libva frame 2 OUTPUT = 5552 bytes (3 prefix + 5549 slice_data)
+kdirect frame 2 OUTPUT = 5512 bytes (3 prefix + 5509)
+diff = 40 bytes per slice, P/B-frame specific (IDR is correct)
+```
+
+A universal trim=40 was tested as iter28b; it broke the IDR (frame 1), which had been correct, and was reverted. A real fix requires one of:
+
+**Option A**: Rebuild ffmpeg with `fprintf(stderr, "size=%u\n", size)` at the top of `v4l2_request_hevc_decode_slice` in `libavcodec/v4l2_request_hevc.c:564` to confirm what `size` libavcodec actually dispatches. A probe was added during the session, but the build was killed mid-link due to context length. To redo: the source is at `/home/mfritsche/src/aur/ffmpeg-git/src/FFmpeg/` on fresnel, restored to a clean state. Add the probe, run `nohup make -j4 ffmpeg > /tmp/log 2>&1 &`, wait ~25 min, then run libva HEVC and read the actual size values dispatched by libavcodec.
+
+**Option B**: Write a libva-side HEVC slice trailing-bits parser to find the rbsp_stop_one_bit position dynamically. Scan the slice_data buffer backwards, identify the byte containing the stop bit (pattern: data...1 0...0), and trim slices_size to that position. Complicated by the fact that the 40 trailing bytes for BBB frame 2 look like real entropy data (not zeros), so a simple "trim trailing zeros" doesn't work.
+
+**Option C**: Patch ffmpeg-vaapi so that slice_data_size matches `nal->raw_size` from libavcodec exactly (the suspicion is some internal inflation in the `vaapi_hevc_decode_slice`/`ff_vaapi_decode_make_slice_buffer` path). Upstream ffmpeg work.
+
+### 2. MPEG-2 / VP8 untestable through libva on current kernel boot
+
+Libva backend's `find_codec_device` (in `src/request.c:427`) selects ONE device for the entire session. On RK3399 with both rkvdec (`/dev/media0`+`/dev/video1` this boot) and hantro (`/dev/media1`+`/dev/video2`+`/dev/video3`), the backend picks rkvdec, which exposes H264/HEVC/VP9 only, not MPEG-2/VP8.
+
+Override with `LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media1` to force hantro for MPEG-2/VP8 testing. But that disables H264/HEVC/VP9 at the same time, and the unconditional HEVC DECODE_MODE/START_CODE controls libva sets at CreateContext (context.c:343-379) fail on hantro with `Unable to set control(s): Invalid argument`. This failure is pre-existing, not an iter25 regression.
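To make the single-device limitation concrete, a per-codec lookup could look roughly like the sketch below. Everything here is hypothetical (`toy_codec`, `toy_device`, `toy_device_for_codec`); the real backend keys on VAProfile and probes devices at runtime, and the paths are just the ones observed on this boot, with the codec split taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codec identifiers; the backend keys on VAProfile instead. */
enum toy_codec { TOY_H264, TOY_HEVC, TOY_VP9, TOY_MPEG2, TOY_VP8, TOY_CODEC_COUNT };

/* Catch a new codec being added without updating the switch below. */
static_assert(TOY_CODEC_COUNT == 5, "update toy_device_for_codec for new codecs");

struct toy_device {
    const char *name;
    const char *media_path; /* example paths from this boot; they can change */
    const char *video_path;
};

static const struct toy_device toy_rkvdec = { "rkvdec", "/dev/media0", "/dev/video1" };
static const struct toy_device toy_hantro = { "hantro", "/dev/media1", "/dev/video3" };

/* Returns the decoder assumed to expose the codec, or NULL if unknown. */
static const struct toy_device *toy_device_for_codec(enum toy_codec codec)
{
    switch (codec) {
    case TOY_H264:
    case TOY_HEVC:
    case TOY_VP9:
        return &toy_rkvdec;
    case TOY_MPEG2:
    case TOY_VP8:
        return &toy_hantro;
    default:
        return NULL;
    }
}
```

A static table like this only illustrates the dispatch shape; a real implementation would have to discover which profiles each device exposes rather than hard-code them.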
+
+Fix would require either:
+- Libva backend multi-device probe + per-codec dispatch (~200-400 LOC, called out in `phase0_findings_iter7.md`).
+- Conditional codec-init controls (skip controls hantro doesn't support).
+
+### 3. Kernel substrate cleanup
+
+`linux-fresnel-fourier 7.0-9` has 5+ accumulated `pr_info` diagnostic patches in `drivers/media/v4l2-core/v4l2-ctrls-request.c` and `drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c`. Before any production work, revert to clean 7.0-X (i.e., apply only the 3 PBP DTS patches + RFC v2 fence series, without diagnostics). Or just bump to 7.0-X and ship without diagnostics.
+
+## Memory entries this session
+
+- New: `feedback_rkvdec_image_fmt_pre_seed.md`: root cause + α-25 fix summary.
+- Updated: `feedback_libva_byte_correct_kernel_bug.md` flagged as partially overturned (the byte-correctness claim was right; the kernel-side bug claim was misleading, since the actual bug was libva-side CAPTURE-pool timing interacting with kernel state).
+
+## Key commands quick reference
+
+```bash
+# Sync backend on fresnel + rebuild
+ssh fresnel 'cd ~/src/libva-v4l2-request-fourier && git fetch && git reset --hard origin/master && ninja -C build'
+
+# Run libva HEVC + capture rkvdec kernel printk
+ssh fresnel 'sudo dmesg -C; env LIBVA_DRIVER_NAME=v4l2_request \
+    LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
+    ffmpeg -hide_banner -loglevel error -y -hwaccel vaapi -hwaccel_output_format vaapi \
+    -i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
+    -vf "hwdownload,format=nv12,crop=1280:720:0:0" -frames:v 3 -f rawvideo -pix_fmt nv12 /tmp/x.yuv;
+    sudo dmesg | grep -E "rkvdec|iter2[0-9]_"'
+
+# kdirect (ffmpeg-v4l2request) reference
+ssh fresnel 'ffmpeg -hide_banner -loglevel error -y \
+    -hwaccel v4l2request -hwaccel_output_format drm_prime \
+    -i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
+    -vf "hwdownload,format=nv12,crop=1280:720:0:0" -frames:v 3 -f rawvideo -pix_fmt nv12 /tmp/y.yuv'
+
+# Force hantro path (untested with backend, see open-item 2)
+env LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media1 ...
+
+# Reboot fresnel (sddm autologin reseats mfritsche per /etc/sddm.conf.d/20-autologin.conf)
+ssh fresnel 'sudo systemctl reboot'; sleep 60
+```
+
+## What's safe to do without user confirmation
+
+- Read/grep on noether, boltzmann, fresnel.
+- Push to gitea (claude-noether identity).
+- Reboot fresnel (sddm autologin restores session).
+- Build kernel on boltzmann via `makepkg -e --noconfirm` in `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/`.
+- Deploy kernel via `scp` + `sudo pacman -U`.
+- Run ffmpeg/cmp tests on fresnel.
+
+## What needs user confirmation
+
+- Significant ffmpeg rebuild (~25 min CPU time).
+- Reverting kernel-substrate diagnostics to ship a clean kernel.
+- Decisions on whether to invest in HEVC frame 2+ fix or MPEG-2/VP8 multi-device probe.
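For the ffmpeg/cmp tests, `cmp` reports only the first differing byte offset. A small helper that maps a byte difference to a frame index may make multi-frame comparisons easier to read. This is a sketch under stated assumptions: the function names are made up for illustration, the buffers are assumed to hold tightly packed NV12 frames (as produced by `-f rawvideo -pix_fmt nv12`), and width and height are assumed even.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one NV12 frame: full-size Y plane plus half-size interleaved UV. */
static size_t nv12_frame_size(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/*
 * Returns the 0-based index of the first frame in which a and b differ,
 * or -1 if the first len bytes are identical.
 */
static long first_differing_frame(const unsigned char *a, const unsigned char *b,
                                  size_t len, size_t width, size_t height)
{
    size_t frame = nv12_frame_size(width, height);

    assert(frame > 0); /* zero-sized frames would make the division meaningless */
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            return (long)(i / frame);
    }
    return -1;
}
```

The index is 0-based, so a return value of 1 corresponds to what this doc calls frame 2. Loading `/tmp/x.yuv` and a SW reference into memory and calling the helper with 1280x720 is left to the caller.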