# Pre-Compact Handoff — Session 2026-05-14
Use this doc to resume the fresnel-fourier campaign after Claude context compaction.
## TL;DR (read first)
- **Bug 4 (H.264 keyframe-partial): FIXED** — H.264 10F byte-equal to SW reference.
- **Bug 5 (HEVC libva all-zero): PARTIAL** — frame 1 byte-equal to SW; frame 2+ diverges (separate ffmpeg-vaapi slice_data inflation bug, deferred).
- **VP9**: unchanged (HW=SW byte-equal).
- **MPEG-2 / VP8**: untestable through libva on current kernel boot (pre-existing libva single-device profile-probe limitation; auto-select picks rkvdec which doesn't expose those profiles).
- Root cause identified after 6 kernel-printk iterations: `rkvdec_s_ctrl` returns -EBUSY when first SPS triggers `image_fmt` reset on a busy CAPTURE queue. Fixed by synthetic SPS injection at libva CreateContext.
## Substrate state (where things live)
| Component | Location | Tip / state |
|---|---|---|
| Campaign repo (this) | `/home/mfritsche/src/fresnel-fourier/` | `c15fc6c` on gitea master |
| Libva backend fork (noether) | `/home/mfritsche/src/libva-multiplanar/libva-v4l2-request-fourier/` | `6646b16` on gitea master |
| Libva backend (fresnel deploy) | `/home/mfritsche/src/libva-v4l2-request-fourier/` | sync to gitea master, run `ninja -C build` |
| Kernel source (boltzmann) | `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/` | pkgrel=9 with iter17/20/21/22/23/27 diag printks |
| Kernel running on fresnel | `linux-fresnel-fourier 7.0-9` | diagnostic build; revert to clean 7.0-X before any production work |
| Test fixtures (fresnel) | `/home/mfritsche/fourier-test/bbb_*.{mp4,ts,webm}` | 5 codecs at 720p10s or 1080p30 |
| OUTPUT-buffer dumps (fresnel) | `/tmp/out_dump/output_*.bin` | from α-16 env `LIBVA_V4L2_DUMP_OUTPUT` |
| Memory | `/home/mfritsche/.claude/projects/-home-mfritsche-src-fresnel-fourier/memory/` | `feedback_rkvdec_image_fmt_pre_seed.md` is the key new entry |
## Identity for gitea pushes
All `git.reauktion.de` interactions use `claude-noether` identity (per memory `feedback_gitea_as_claude_noether.md`). Backend remote URL: `ssh://gitea@git.reauktion.de.claude-noether/marfrit/libva-v4l2-request-fourier.git`.
## Backend commits delivered this session
```
6646b16 Revert iter28b DIAG: trim=40 universal-trim breaks IDR frame 1
c555788 iter28b DIAG: env-gated trim of HEVC slice_data trailing N bytes (reverted)
cd286d9 iter28 α-28: bit_size = (slice_data_size - data_byte_offset) * 8 for HEVC (no-op, rkvdec ignores)
754be1d iter27 diag: env-gated VAAPI slice fields dump
c9bfa21 iter27: remove request_log diag
719d813 iter27 α-27: populate slice_params.num_entry_point_offsets from VAAPI (no-op)
66ef848 iter26 α-26: populate decode_params.short_term_ref_pic_set_size from VAAPI st_rps_bits
d062fec iter25 α-25 fix: add FRAME_MBS_ONLY to H264 dummy SPS
db0b7f9 iter25 α-25: inject synthetic SPS before cap_pool_init to seed image_fmt ← THE FIX
```
## Campaign repo commits delivered
```
c15fc6c iter28b DIAG documented: universal trim=40 breaks IDR (reverted)
8b17bf7 Final session summary: H264 + VP9 + HEVC frame 1 byte-equal to SW
02c4192 iter27/28: probe HEVC frame 2+ divergence; α-27/α-28 no-op; ffmpeg-vaapi slice_data inflation localized
bf67900 iter20-26: kernel-side root-cause localization, α-25/α-26 fix Bug 4, partial Bug 5
```
Phase docs (chronological): `phase4_iter21_plan.md`, `phase4_iter22_plan.md`, `phase8_iteration20_close.md` through `phase8_iteration27_close.md`, `CAMPAIGN_SESSION_2026_05_14.md`.
## How to verify the current state
Run on fresnel after `git pull` + `ninja -C build` in `~/src/libva-v4l2-request-fourier`:
```bash
# H.264 — should be byte-equal to SW (Bug 4 fixed)
env LIBVA_DRIVER_NAME=v4l2_request \
LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
ffmpeg -hide_banner -loglevel error -y -hwaccel vaapi -hwaccel_output_format vaapi \
-i /home/mfritsche/fourier-test/bbb_1080p30_h264.mp4 \
-vf "hwdownload,format=nv12,crop=1920:1080:0:0" \
-frames:v 10 -f rawvideo -pix_fmt nv12 /tmp/libva_h264.yuv
ffmpeg -hide_banner -loglevel error -y \
-i /home/mfritsche/fourier-test/bbb_1080p30_h264.mp4 \
-frames:v 10 -f rawvideo -pix_fmt nv12 /tmp/sw_h264.yuv
cmp /tmp/libva_h264.yuv /tmp/sw_h264.yuv # SHOULD print nothing (equal)
# HEVC — frame 1 byte-equal, frames 2+ differ
env LIBVA_DRIVER_NAME=v4l2_request \
LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
ffmpeg -hide_banner -loglevel error -y -hwaccel vaapi -hwaccel_output_format vaapi \
-i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
-vf "hwdownload,format=nv12,crop=1280:720:0:0" \
-frames:v 1 -f rawvideo -pix_fmt nv12 /tmp/libva_hevc1.yuv
ffmpeg -hide_banner -loglevel error -y \
-i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
-frames:v 1 -f rawvideo -pix_fmt nv12 /tmp/sw_hevc1.yuv
cmp /tmp/libva_hevc1.yuv /tmp/sw_hevc1.yuv # SHOULD print nothing
```
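`cmp` only reports the first differing byte, so when checking HEVC beyond frame 1 it can help to turn that into a frame index. A minimal sketch, assuming tightly packed NV12 dumps as produced by the commands above; `nv12_frame_bytes` and `first_diff_frame` are hypothetical helper names, not part of the backend or the campaign repo:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes in one NV12 frame: full-size Y plane plus a half-size interleaved UV plane. */
size_t nv12_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0); /* 4:2:0 needs even dimensions */
    return width * height * 3 / 2;
}

/*
 * Return the 0-based index of the first frame that differs between two raw
 * dumps, or -1 if every complete frame within `len` bytes is identical.
 * Both buffers must hold at least `len` bytes.
 */
long first_diff_frame(const unsigned char *a, const unsigned char *b,
                      size_t len, size_t frame_bytes)
{
    assert(a != NULL && b != NULL && frame_bytes > 0);
    size_t frames = len / frame_bytes;
    for (size_t i = 0; i < frames; i++) {
        if (memcmp(a + i * frame_bytes, b + i * frame_bytes, frame_bytes) != 0)
            return (long)i;
    }
    return -1;
}
```

The index is 0-based, so a return value of 1 corresponds to "frame 2" in the wording used in this doc. For the 720p HEVC fixture the frame size is `nv12_frame_bytes(1280, 720)`, i.e. 1382400 bytes.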
## Root cause (saved to memory)
`rkvdec_s_ctrl` on first HEVC_SPS / H264_SPS:
```c
image_fmt = desc->ops->get_image_fmt(ctx, ctrl);
if (rkvdec_image_fmt_changed(ctx, image_fmt)) {
    vq = v4l2_m2m_get_vq(ctx->fh.m2m_ctx, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE);
    if (vb2_is_busy(vq))
        return -EBUSY;  // ← THIS
    ctx->image_fmt = image_fmt;
    rkvdec_reset_decoded_fmt(ctx);
}
```
`ctx->image_fmt` defaults to `RKVDEC_IMG_FMT_ANY` at open. The first per-frame SPS resolves to a concrete value (e.g., `RKVDEC_IMG_FMT_420_8BIT`), and since ANY ≠ concrete, `rkvdec_image_fmt_changed` returns true. The driver then attempts the format reset, but `vb2_is_busy` is already true because libva pre-allocated 24 CAPTURE buffers at CreateContext, so `rkvdec_s_ctrl` returns -EBUSY. The failure chain from there: setup loop breaks → SPS never committed to `ctx->ctrl_hdl` → `rkvdec_hevc_run` reads zeroed control data → all-zero CAPTURE.
## The fix (α-25)
In `src/context.c::RequestCreateContext`, BEFORE `cap_pool_init` (line ~215), inject one synthetic SPS via non-request `v4l2_set_controls`:
- VAProfileHEVCMain: synthetic `v4l2_ctrl_hevc_sps` with `chroma_format_idc=1` (4:2:0), `bit_depth_luma_minus8` and `bit_depth_chroma_minus8` set to 0 (8-bit), width/height from caller.
- VAProfileH264*: synthetic `v4l2_ctrl_h264_sps` with the same chroma and bit-depth values plus `V4L2_H264_SPS_FLAG_FRAME_MBS_ONLY` set (otherwise `rkvdec_h264_validate_sps` doubles the height and rejects the SPS).
At this point CAPTURE is empty → `vb2_is_busy=false` → rkvdec_s_ctrl succeeds → `ctx->image_fmt = RKVDEC_IMG_FMT_420_8BIT`. From then on per-frame SPS finds `image_fmt_changed=false` → skip reset → commits successfully.
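The ordering dependency can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the kernel code: `model_ctx`, `model_s_ctrl_sps`, `model_without_seed` and `model_with_seed` are hypothetical names, and the model keeps only the `image_fmt` comparison and the busy check from the snippet quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the rkvdec state discussed above; not the kernel's types. */
enum img_fmt { IMG_FMT_ANY = 0, IMG_FMT_420_8BIT };

struct model_ctx {
    enum img_fmt image_fmt; /* starts as ANY at open */
    bool capture_busy;      /* true once CAPTURE buffers are allocated */
    bool sps_committed;     /* true once an SPS was accepted */
};

/* Mirrors the quoted branch: a format change on a busy CAPTURE queue fails. */
int model_s_ctrl_sps(struct model_ctx *ctx, enum img_fmt sps_fmt)
{
    assert(ctx != NULL);
    if (ctx->image_fmt != sps_fmt) {
        if (ctx->capture_busy)
            return -EBUSY;
        ctx->image_fmt = sps_fmt;
    }
    ctx->sps_committed = true;
    return 0;
}

/* Original ordering: allocate the CAPTURE pool, then send the first SPS. */
int model_without_seed(void)
{
    struct model_ctx ctx = { IMG_FMT_ANY, false, false };
    ctx.capture_busy = true;                          /* cap_pool_init */
    return model_s_ctrl_sps(&ctx, IMG_FMT_420_8BIT);  /* first real SPS */
}

/* α-25 ordering: seed image_fmt with a synthetic SPS before the pool exists. */
int model_with_seed(void)
{
    struct model_ctx ctx = { IMG_FMT_ANY, false, false };
    int ret = model_s_ctrl_sps(&ctx, IMG_FMT_420_8BIT); /* synthetic SPS */
    if (ret)
        return ret;
    ctx.capture_busy = true;                            /* cap_pool_init */
    return model_s_ctrl_sps(&ctx, IMG_FMT_420_8BIT);    /* first real SPS */
}
```

In this model `model_without_seed()` returns `-EBUSY` and `model_with_seed()` returns 0, which matches the behaviour observed before and after `db0b7f9`.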
Source: see commit `db0b7f9` for the full diff.
## Open items (deferred)
### 1. HEVC frame 2+ divergence
For non-IDR HEVC frames, libva's `slice_data_size` from VAAPI is consistently 40 bytes larger than `ffmpeg-v4l2request`'s `size` parameter (= the `nal->raw_size` value libavcodec dispatches). The 40 extra bytes inflate libva's OUTPUT buffer → rkvdec reads past the slice payload → wrong reference → frame 2+ visual garbage.
Evidence (from iter27 kernel printk):
```
libva frame 2 OUTPUT = 5552 bytes (3 prefix + 5549 slice_data)
kdirect frame 2 OUTPUT = 5512 bytes (3 prefix + 5509)
diff = 40 bytes per slice, P/B-frame specific (IDR is correct)
```
Universal trim=40 was tested as iter28b → broke IDR (frame 1), which had been correct → reverted. A real fix requires one of:
**Option A**: Rebuild ffmpeg with `fprintf(stderr, "size=%u\n", size)` at the top of `v4l2_request_hevc_decode_slice` (`libavcodec/v4l2_request_hevc.c:564`) to confirm which `size` libavcodec actually dispatches. The probe was added during this session, but the build was killed mid-link due to context length. To redo: the source is at `/home/mfritsche/src/aur/ffmpeg-git/src/FFmpeg/` on fresnel, restored to a clean state. Add the probe, run `nohup make -j4 ffmpeg > /tmp/log 2>&1 &`, wait ~25 min, then run libva HEVC and check the actual `size` values dispatched by libavcodec.
**Option B**: Write a libva-side HEVC slice trailing-bits parser to find the `rbsp_stop_one_bit` position dynamically. Scan the slice_data buffer backwards, identify the byte containing the stop bit (pattern: data...1 0...0), and trim `slice_data_size` to that position. Complicated by the fact that the 40 trailing bytes for BBB frame 2 look like real entropy data (not zeros), so a simple "trim trailing zeros" does not work.
**Option C**: Patch ffmpeg-vaapi to ensure slice_data_size matches `nal->raw_size` from libavcodec exactly (suspect there's some internal inflation in `vaapi_hevc_decode_slice`/`ff_vaapi_decode_make_slice_buffer` path). Upstream ffmpeg work.
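For reference, the naive backward scan that Option B starts from could look like the sketch below. `trim_to_stop_bit` and `stop_bit_position` are hypothetical names. The sketch assumes the buffer ends with the stop bit followed only by zero bits and zero bytes, and it does not handle emulation-prevention bytes. As noted in Option B, this assumption does not hold for the 40 extra bytes seen on BBB frame 2, so this is a starting point, not a fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length in bytes up to and including the last non-zero byte,
 * which under the stated assumption is the byte holding the stop bit.
 * Returns 0 if the buffer is all zeros.
 */
size_t trim_to_stop_bit(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0 && buf[len - 1] == 0x00)
        len--;
    return len;
}

/*
 * Bit position (0 = least significant) of the stop bit inside the last
 * payload byte: the lowest set bit, since only zero bits may follow it.
 * Returns -1 for a zero byte.
 */
int stop_bit_position(uint8_t last_byte)
{
    for (int bit = 0; bit < 8; bit++) {
        if (last_byte & (1u << bit))
            return bit;
    }
    return -1;
}
```

A dynamic trim would then set the slice size to the value returned by `trim_to_stop_bit`, but only after something else has established where the real slice payload ends.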
### 2. MPEG-2 / VP8 untestable through libva on current kernel boot
Libva backend's `find_codec_device` (in `src/request.c:427`) selects ONE device for the entire session. On RK3399 with both rkvdec (`/dev/media0`+`/dev/video1` this boot) and hantro (`/dev/media1`+`/dev/video2`+`/dev/video3`), the backend picks rkvdec — which exposes H264/HEVC/VP9 only, not MPEG-2/VP8.
Override with `LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media1` to force hantro for MPEG-2/VP8 testing. But that disables H264/HEVC/VP9 simultaneously, and the unconditional HEVC DECODE_MODE/START_CODE controls libva sets at CreateContext (context.c:343-379) fail on hantro with `Unable to set control(s): Invalid argument` — pre-existing, not iter25 regression.
Fix would require either:
- Libva backend multi-device probe + per-codec dispatch (~200-400 LOC, called out in `phase0_findings_iter7.md`).
- Conditional codec-init controls (skip controls hantro doesn't support).
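The per-codec dispatch idea can be sketched as a lookup table. This is an illustration only: `decoder_dev` and `pick_device` are hypothetical names, the codec lists are the ones stated above, and the device nodes are hardcoded to the values seen on this boot, whereas a real probe would enumerate the formats each device exposes.

```c
#include <assert.h>
#include <stddef.h>

enum codec { CODEC_H264, CODEC_HEVC, CODEC_VP9, CODEC_MPEG2, CODEC_VP8, CODEC_COUNT };

struct decoder_dev {
    const char *name;
    const char *media_path;
    const char *video_path;
    unsigned codecs; /* bitmask of (1u << enum codec) */
};

/* Node numbers are the ones observed on this boot and can change across reboots. */
static const struct decoder_dev devices[] = {
    { "rkvdec", "/dev/media0", "/dev/video1",
      (1u << CODEC_H264) | (1u << CODEC_HEVC) | (1u << CODEC_VP9) },
    { "hantro", "/dev/media1", "/dev/video3",
      (1u << CODEC_MPEG2) | (1u << CODEC_VP8) },
};

/* Return the first device that lists the codec, or NULL if none does. */
const struct decoder_dev *pick_device(enum codec c)
{
    assert((unsigned)c < CODEC_COUNT);
    for (size_t i = 0; i < sizeof devices / sizeof devices[0]; i++) {
        if (devices[i].codecs & (1u << c))
            return &devices[i];
    }
    return NULL;
}
```

With a table like this, codec-specific init controls (such as the HEVC DECODE_MODE/START_CODE controls) would only be set when the selected device actually serves that codec, which covers the second bullet as well.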
### 3. Kernel substrate cleanup
`linux-fresnel-fourier 7.0-9` has 5+ accumulated `pr_info` diagnostic patches in `drivers/media/v4l2-core/v4l2-ctrls-request.c` and `drivers/media/platform/rockchip/rkvdec/rkvdec-hevc.c`. Before any production work, revert to clean 7.0-X (i.e., apply only the 3 PBP DTS patches + RFC v2 fence series, without diagnostics). Or just bump to 7.0-X and ship without diagnostics.
## Memory entries this session
- New: `feedback_rkvdec_image_fmt_pre_seed.md` — root cause + α-25 fix summary.
- Updated: `feedback_libva_byte_correct_kernel_bug.md` flagged as partially overturned (the byte-correctness claim was right; the kernel-side bug claim was misleading — actual bug was libva-side CAPTURE-pool timing interacting with kernel state).
## Key commands quick reference
```bash
# Sync backend on fresnel + rebuild
ssh fresnel 'cd ~/src/libva-v4l2-request-fourier && git fetch && git reset --hard origin/master && ninja -C build'
# Run libva HEVC + capture rkvdec kernel printk
ssh fresnel 'sudo dmesg -C; env LIBVA_DRIVER_NAME=v4l2_request \
LIBVA_DRIVERS_PATH=/home/mfritsche/src/libva-v4l2-request-fourier/build/src \
ffmpeg -hide_banner -loglevel error -y -hwaccel vaapi -hwaccel_output_format vaapi \
-i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
-vf "hwdownload,format=nv12,crop=1280:720:0:0" -frames:v 3 -f rawvideo -pix_fmt nv12 /tmp/x.yuv;
sudo dmesg | grep -E "rkvdec|iter2[0-9]_"'
# kdirect (ffmpeg-v4l2request) reference
ssh fresnel 'ffmpeg -hide_banner -loglevel error -y \
-hwaccel v4l2request -hwaccel_output_format drm_prime \
-i /home/mfritsche/fourier-test/bbb_720p10s_hevc.mp4 \
-vf "hwdownload,format=nv12,crop=1280:720:0:0" -frames:v 3 -f rawvideo -pix_fmt nv12 /tmp/y.yuv'
# Force hantro path (untested with backend, see open-item 2)
env LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media1 ...
# Reboot fresnel (sddm autologin reseats mfritsche per /etc/sddm.conf.d/20-autologin.conf)
ssh fresnel 'sudo systemctl reboot'; sleep 60
```
## What's safe to do without user confirmation
- Read/grep on noether, boltzmann, fresnel.
- Push to gitea (claude-noether identity).
- Reboot fresnel (sddm autologin restores session).
- Build kernel on boltzmann via `makepkg -e --noconfirm` in `~/src/kernel-agent-bootstrap/build/marfrit-packages/arch/linux-fresnel-fourier/`.
- Deploy kernel via `scp` + `sudo pacman -U`.
- Run ffmpeg/cmp tests on fresnel.
## What needs user confirmation
- Significant ffmpeg rebuild (~25 min CPU time).
- Reverting kernel-substrate diagnostics to ship a clean kernel.
- Decisions on whether to invest in HEVC frame 2+ fix or MPEG-2/VP8 multi-device probe.