fresnel-fourier/phase0_findings_iter7.md
marfrit fc44a1e63c iter7 Phase 0 lock: iter4-B1 auto-detect harden — require MEDIA_ENT_F_PROC_VIDEO_DECODER
Backend-only ~30-80 LOC. Walk media-topology entities (already partially
done at iter4 Commit Z); require at least one entity with function ==
MEDIA_ENT_F_PROC_VIDEO_DECODER. Eliminates the hantro encoder false-match
that breaks vainfo + ffmpeg-vaapi on every other reboot.

5 boolean Phase 1 criteria locked. No kernel work. No pixel-correctness
chasing. Quality-of-life delivery; removes per-session env-override
friction.

Predicted lowest-difficulty iteration since iter1. 2-3 hours wallclock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 23:25:18 +00:00


# Iteration 7 — Phase 0 (substrate / motivation / inventory) → Phase 1 lock
Opens 2026-05-12 immediately after iter6 PARTIAL close ([`phase8_iteration6_close.md`](phase8_iteration6_close.md), commit `8ce00d3`).
iter6 narrowed Bug 6 to kernel-side (H-E) and closed PARTIAL. iter7 pivots to a smaller, lower-risk delivery: **iter4-B1 auto-detect device discrimination**. Pure backend fix, no kernel work, no pixel-correctness chasing.
## Locked research question (iteration 7)
> *"Backend auto-detect picks the correct V4L2 decode device on every fresnel boot, regardless of `/dev/media*` enumeration order. After fix: a fresh-boot `vainfo` lists all 5 codec profiles correctly without any `LIBVA_V4L2_REQUEST_*` env override."*
### Pass/fail (boolean)
1. **Fresh-boot vainfo enumerates all 5 codecs**. `ssh fresnel 'env LIBVA_DRIVER_NAME=v4l2_request vainfo'` (no `LIBVA_V4L2_REQUEST_VIDEO_PATH` / `_MEDIA_PATH` override) lists `VAProfileH264*` + `VAProfileHEVCMain` + `VAProfileVP9Profile0` + `VAProfileMPEG2*` + `VAProfileVP8Version0_3`.
2. **Auto-detect correctly routes H.264/HEVC/VP9 to rkvdec**. `ffmpeg -hwaccel vaapi -i bbb_720p10s_hevc.mp4 -frames:v 1 -f null -` without env override engages rkvdec (verifiable via strace showing `/dev/video<rkvdec>` opened).
3. **Auto-detect correctly routes MPEG-2/VP8 to hantro-vpu-dec**. Same shape for MPEG-2 + VP8 fixtures; strace shows hantro decoder opened (NOT hantro encoder at `/dev/video<hantro-enc>`).
4. **No regression on any locked iter5b-β / iter6 state**. VP9 still PASS direct, MPEG-2 still PASS, H.264 keyframe-partial unchanged, HEVC + VP8 unchanged in their existing partial states (Bugs 5/6 not in iter7 scope).
5. **Multi-boot stability**: at least 2 reboots of fresnel (different `/dev/media*` enumeration orders if achievable) confirm auto-detect routes correctly each time.
Clean iter7 close = all 5 criteria green. Phase 7 → Phase 4 loopback per `feedback_dev_process.md` if any fail.
## Mechanism the question targets
Per `phase4_iter5b_plan_v2.md` C5 risk-register and iter5b Phase 7 retro: today's auto-detect at `request.c::v4l2_request_init` walks `/dev/media*` in enumeration order, picks the first one whose `MEDIA_IOC_DEVICE_INFO.driver` name matches an allow-list (`{rkvdec, hantro-vpu, cedrus, sun4i_csi}`). The allow-list doesn't discriminate decoder vs encoder.
On RK3399 today, `hantro-vpu` is the kernel driver name for BOTH:
- `/dev/media0` or `/dev/media1` (boot-dependent) → `rockchip,rk3399-vpu-enc` (the encoder card)
- `/dev/media0` or `/dev/media1` (boot-dependent) → `rockchip,rk3399-vpu-dec` (the decoder card)
The walk picks the first hantro-vpu match, which is sometimes the encoder. The encoder doesn't expose decode formats; vainfo enumerates nothing; ffmpeg-vaapi fails.
iter4 Phase 6 Commit Z established the media-topology-walk pattern (better than enumeration-order /dev/video*). iter4 Phase 7 + iter5/iter5b/iter6 still hit the issue because the topology walk reads the driver name only, not the entity types.
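The failure mode above can be reproduced on paper with a small model. This is a sketch only: `struct mock_media_dev`, `pick_by_driver_only` and `pick_with_decoder_check` are hypothetical names for illustration, not code from `request.c`, and the `has_decoder_entity` flag stands in for a real topology query.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one /dev/media* node as the walk sees it. */
struct mock_media_dev {
    const char *path;       /* e.g. "/dev/media0" */
    const char *driver;     /* what MEDIA_IOC_DEVICE_INFO.driver would report */
    int has_decoder_entity; /* 1 if the topology holds a decoder-function entity */
};

/* Driver-name allow-list as quoted in the text. */
static const char *const allow_list[] = {
    "rkvdec", "hantro-vpu", "cedrus", "sun4i_csi"
};

static int driver_allowed(const char *driver)
{
    for (size_t i = 0; i < sizeof(allow_list) / sizeof(allow_list[0]); i++)
        if (strcmp(driver, allow_list[i]) == 0)
            return 1;
    return 0;
}

/* Current behaviour: first node whose driver name is allow-listed. */
int pick_by_driver_only(const struct mock_media_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (driver_allowed(devs[i].driver))
            return (int)i;
    return -1;
}

/* Hardened behaviour: the node must also expose a decoder entity. */
int pick_with_decoder_check(const struct mock_media_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (driver_allowed(devs[i].driver) && devs[i].has_decoder_entity)
            return (int)i;
    return -1;
}
```

With the encoder enumerated first (`{"/dev/media0", "hantro-vpu", 0}` followed by `{"/dev/media1", "hantro-vpu", 1}`), `pick_by_driver_only` returns index 0 (the encoder) while `pick_with_decoder_check` returns index 1. With the order swapped, both return the decoder, which matches the boot-dependent behaviour described above.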
### The fix shape
Walk `/dev/media*`, do `MEDIA_IOC_DEVICE_INFO` (driver name check), THEN walk media-topology entities and require at least one entity with function `MEDIA_ENT_F_PROC_VIDEO_DECODER`. Only accept the media device if a decoder entity is present.
This rejects the encoder's media device, so the walk moves on to the decoder regardless of enumeration order. Predicted fix size: ~30-80 LOC in `request.c`.
## Substrate state at iter7 open
| Property | Value |
|---|---|
| Kernel | `7.0.0-fresnel-fourier` (linux-fresnel-fourier 7.0-1). Unchanged from iter5b/iter6. |
| Fork tip | `70196f8` (iter5b-β Phase 6 Commit D). Unchanged through iter6. |
| Backend installed | SHA `2c6ff82cbdc156ff8910d0c7fe58e75eeecdfd6e6a1caabb049c8adf43a098b8`. Unchanged. |
| Test fixtures | unchanged. |
| Bugs 4/5/6 | still open, deferred to future iterations. |
| iter6 narrowing | Bug 6 confirmed kernel-side (H-E); 4 of 5 hypotheses eliminated. |
## Scope locks
**In scope**:
- `src/request.c::v4l2_request_init` auto-detect path.
- Media-topology entity-walk via `MEDIA_IOC_G_TOPOLOGY` (already partially used per iter4 Commit Z).
- Add `MEDIA_ENT_F_PROC_VIDEO_DECODER` entity-function check to the topology walk.
- Optional: scope the driver-name allow-list to encoder/decoder-aware variants if the kernel exposes them.
- 5-codec sweep regression-verify on the fixed backend.
**Out of scope**:
- Any pixel-correctness chasing (Bugs 4/5/6).
- Kernel patches.
- Performance metrics.
- Multi-decoder per-driver-data routing (the "use rkvdec for some codecs + hantro for others on the same backend instance" challenge — known as iter4-B1's "walk-and-pick-first" sub-issue).
- Front-end libva.
- AV1 / other-hardware.
## Phase 2 source-read targets
- `src/request.c::v4l2_request_init` — current auto-detect implementation (iter4 Commit Z `7f8fa93`).
- `<linux/media.h>`: `MEDIA_IOC_G_TOPOLOGY`, `MEDIA_ENT_F_*` constants.
- iter4 Phase 6 commit Z body — what the walk does today.
## Phase 3 baseline
iter4-B1 is well-known: env-override required per boot. Phase 3 captures the empirical baseline:
1. Fresh boot. Enumerate `/dev/media*` driver names.
2. `vainfo` with auto-detect (no env override). Observe what gets picked.
3. Show that on the boot where hantro-vpu encoder enumerates first, vainfo lists NO profiles.
iter6 Phase 3 already captured device-mapping artifacts inadvertently (per logs). Phase 3 of iter7 may reuse that.
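Step 1 of the baseline could be captured with a small helper like the one below. It is a sketch, not part of the backend: `media_query_info` and `dump_media_nodes` are hypothetical names, and the assumption is that nodes are named `/dev/media0`, `/dev/media1`, and so on. It also prints `model` and `bus_info`, since the driver name alone cannot tell the two hantro devices apart.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/media.h>

/* Fill *info for one media node. Returns 0 on success, -1 on any failure. */
int media_query_info(const char *path, struct media_device_info *info)
{
    int fd, ret;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    memset(info, 0, sizeof(*info));
    ret = ioctl(fd, MEDIA_IOC_DEVICE_INFO, info);
    close(fd);

    return ret < 0 ? -1 : 0;
}

/*
 * Print driver, model and bus_info for /dev/media0 .. /dev/media<max_nodes-1>.
 * Returns the number of nodes that answered the ioctl.
 */
int dump_media_nodes(int max_nodes)
{
    struct media_device_info info;
    char path[32];
    int found = 0;

    for (int i = 0; i < max_nodes; i++) {
        snprintf(path, sizeof(path), "/dev/media%d", i);
        if (media_query_info(path, &info) != 0)
            continue;
        /* Bounded prints: do not rely on the fixed-size fields being terminated. */
        printf("%s driver=%.*s model=%.*s bus=%.*s\n", path,
               (int)sizeof(info.driver), info.driver,
               (int)sizeof(info.model), info.model,
               (int)sizeof(info.bus_info), info.bus_info);
        found++;
    }
    return found;
}
```

Calling `dump_media_nodes(8)` once per boot and saving the output would record which enumeration order that boot produced.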
## Phase 4 plan shape (predicted)
Mechanical:
1. After `MEDIA_IOC_DEVICE_INFO` matches the allow-list driver name, do `MEDIA_IOC_G_TOPOLOGY` (already happens at iter4 Commit Z).
2. Walk the topology's entities array. For each entity, check `function` field.
3. Accept the device only if AT LEAST ONE entity has `function == MEDIA_ENT_F_PROC_VIDEO_DECODER`.
4. Else skip and continue.
LOC estimate: 30-80 LOC in `request.c`. One commit. Maybe a follow-up commit for any cosmetic logging.
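Steps 1-4 above might look roughly like the following. This is a sketch under assumptions, not the planned patch: `entities_have_decoder` and `media_fd_has_decoder` are hypothetical names, and the tri-state return value is one possible way to leave the "no topology info" policy to the caller. It uses the usual two-call `MEDIA_IOC_G_TOPOLOGY` pattern: the first call with null pointers returns counts only, the second fills the entity array.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/media.h>

/* Return 1 if any entity in the array has the video-decoder function. */
int entities_have_decoder(const struct media_v2_entity *ents, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++)
        if (ents[i].function == MEDIA_ENT_F_PROC_VIDEO_DECODER)
            return 1;
    return 0;
}

/*
 * Query the topology behind an open media fd.
 * Returns 1 if a decoder entity is present, 0 if none is,
 * and -1 if the topology could not be read (caller decides the policy).
 */
int media_fd_has_decoder(int media_fd)
{
    struct media_v2_topology topo;
    struct media_v2_entity *ents;
    int ret;

    /* Pass 1: all ptr_* fields are zero, so the kernel only fills counts. */
    memset(&topo, 0, sizeof(topo));
    if (ioctl(media_fd, MEDIA_IOC_G_TOPOLOGY, &topo) < 0)
        return -1;
    if (topo.num_entities == 0)
        return 0;

    ents = calloc(topo.num_entities, sizeof(*ents));
    if (!ents)
        return -1;

    /* Pass 2: ask for the entity array only; other ptr_* fields stay null. */
    topo.ptr_entities = (uint64_t)(uintptr_t)ents;
    if (ioctl(media_fd, MEDIA_IOC_G_TOPOLOGY, &topo) < 0) {
        /* Includes the case where the topology grew between the two calls. */
        free(ents);
        return -1;
    }

    ret = entities_have_decoder(ents, topo.num_entities);
    free(ents);
    return ret;
}
```

In the walk, a return of 1 would accept the device and 0 would skip it. What to do on -1 is the fallback question that Phase 4 still has to decide.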
## Phase 5 review concerns to invite
- Does the v7.0-fresnel-fourier kernel's hantro / rkvdec set `MEDIA_ENT_F_PROC_VIDEO_DECODER` on the right entities? Verify empirically by reading the topology of each media device.
- Edge case: a media device with both encoder AND decoder entities (e.g., some SoCs have one combined video subsystem). Would the new code accept it? Yes (decoder entity present) — that's correct.
- Edge case: a media device with no entity-type info (older kernels). Fall back to current driver-name-only check, or refuse the device? Phase 4 picks.
## Predicted iter7 cadence
Small. ~15-30 min per phase.
- Phase 0: this doc.
- Phase 2: source-read request.c + topology UAPI. ~15 min.
- Phase 3: baseline empirical capture. ~15 min.
- Phase 4: plan. ~15 min.
- Phase 5: sonnet-architect review. ~30 min.
- Phase 6: implement, build, install. ~30 min.
- Phase 7: verify, reboot test. ~30 min.
- Phase 8: close. ~15 min.
Total: 2-3 hours wallclock, contingent on fresnel reboot availability.
## What "iteration 7 close" looks like
Per `feedback_dev_process.md` Phase 8:
- All 5 Phase 1 criteria green.
- `phase8_iteration7_close.md` summarizing the commit + verification.
- Memory entry update: `iter4-B1` removed from backlog; auto-detect harden documented (or fold into existing media-topology rule if it exists).
- Campaign scoreboard: unchanged on pixel-correctness axis; +1 quality-of-life delivery (no more env-override per session).
Predicted iter7 difficulty: **lowest of any iter since iter1**. Pure backend mechanical fix. No new bug classes anticipated.