# Iteration 7: Phase 0 (substrate / motivation / inventory) → Phase 1 lock

Opens 2026-05-12 immediately after iter6 PARTIAL close ([`phase8_iteration6_close.md`](phase8_iteration6_close.md), commit `8ce00d3`).

iter6 narrowed Bug 6 to kernel-side (H-E) and closed PARTIAL. iter7 pivots to a smaller, lower-risk delivery: **iter4-B1 auto-detect device discrimination**. Pure backend fix, no kernel work, no pixel-correctness chasing.

## Locked research question (iteration 7)

> *"Backend auto-detect picks the correct V4L2 decode device on every fresnel boot, regardless of `/dev/media*` enumeration order. After fix: a fresh-boot `vainfo` lists all 5 codec profiles correctly without any `LIBVA_V4L2_REQUEST_*` env override."*

### Pass/fail (boolean)

1. **Fresh-boot vainfo enumerates all 5 codecs**. `ssh fresnel 'env LIBVA_DRIVER_NAME=v4l2_request vainfo'` (no `LIBVA_V4L2_REQUEST_VIDEO_PATH` / `_MEDIA_PATH` override) lists `VAProfileH264*` + `VAProfileHEVCMain` + `VAProfileVP9Profile0` + `VAProfileMPEG2*` + `VAProfileVP8Version0_3`.
2. **Auto-detect correctly routes H.264/HEVC/VP9 to rkvdec**. `ffmpeg -hwaccel vaapi -i bbb_720p10s_hevc.mp4 -frames:v 1 -f null -` without env override engages rkvdec (verifiable via strace showing `/dev/video` opened).
3. **Auto-detect correctly routes MPEG-2/VP8 to hantro-vpu-dec**. Same shape for MPEG-2 + VP8 fixtures; strace shows hantro decoder opened (NOT hantro encoder at `/dev/video`).
4. **No regression on any locked iter5b-β / iter6 state**. VP9 still PASS direct, MPEG-2 still PASS, H.264 keyframe-partial still unchanged, HEVC + VP8 still in their existing partial states (Bugs 5/6 not in iter7 scope).
5. **Multi-boot stability**: at least 2 reboots of fresnel (different `/dev/media*` enumeration orders if achievable) confirm auto-detect routes correctly each time.

Clean iter7 close = all 5 criteria green. Phase 7 → Phase 4 loopback per `feedback_dev_process.md` if any fail.
## Mechanism the question targets

Per `phase4_iter5b_plan_v2.md` C5 risk-register and iter5b Phase 7 retro: today's auto-detect at `request.c::v4l2_request_init` walks `/dev/media*` in enumeration order and picks the first one whose `MEDIA_IOC_DEVICE_INFO.driver` name matches an allow-list (`{rkvdec, hantro-vpu, cedrus, sun4i_csi}`). The allow-list doesn't discriminate decoder vs encoder. On RK3399 today, `hantro-vpu` is the kernel driver name for BOTH:

- `/dev/media0` or `/dev/media1` (boot-dependent) → `rockchip,rk3399-vpu-enc` (the encoder card)
- `/dev/media0` or `/dev/media1` (boot-dependent) → `rockchip,rk3399-vpu-dec` (the decoder card)

The walk picks the first hantro-vpu match, which is sometimes the encoder. The encoder doesn't expose decode formats; vainfo enumerates nothing; ffmpeg-vaapi fails.

iter4 Phase 6 Commit Z established the media-topology-walk pattern (better than enumeration-order /dev/video*). iter4 Phase 7 + iter5/iter5b/iter6 still hit the issue because the topology walk reads the driver name only, not the entity types.

### The fix shape

Walk `/dev/media*`, do `MEDIA_IOC_DEVICE_INFO` (driver name check), THEN walk media-topology entities and require at least one entity with function `MEDIA_ENT_F_PROC_VIDEO_DECODER`. Only accept the media device if a decoder entity is present. This eliminates the encoder.

Predicted fix size: ~50-100 LOC in `request.c`.

## Substrate state at iter7 open

| Property | Value |
|---|---|
| Kernel | `7.0.0-fresnel-fourier` (linux-fresnel-fourier 7.0-1). Unchanged from iter5b/iter6. |
| Fork tip | `70196f8` (iter5b-β Phase 6 Commit D). Unchanged through iter6. |
| Backend installed | SHA `2c6ff82cbdc156ff8910d0c7fe58e75eeecdfd6e6a1caabb049c8adf43a098b8`. Unchanged. |
| Test fixtures | unchanged. |
| Bugs 4/5/6 | still open, deferred to future iterations. |
| iter6 narrowing | Bug 6 confirmed kernel-side (H-E); 4 of 5 hypotheses eliminated. |

## Scope locks

**In scope**:

- `src/request.c::v4l2_request_init` auto-detect path.
- Media-topology entity-walk via `MEDIA_IOC_G_TOPOLOGY` (already partially used per iter4 Commit Z).
- Add `MEDIA_ENT_F_PROC_VIDEO_DECODER` entity-function check to the topology walk.
- Optional: scope the driver-name allow-list to encoder/decoder-aware variants if the kernel exposes them.
- 5-codec sweep regression-verify on the fixed backend.

**Out of scope**:

- Any pixel-correctness chasing (Bugs 4/5/6).
- Kernel patches.
- Performance metrics.
- Multi-decoder per-driver-data routing (the "use rkvdec for some codecs + hantro for others on the same backend instance" challenge, known as iter4-B1's "walk-and-pick-first" sub-issue).
- Front-end libva.
- AV1 / other-hardware.

## Phase 2 source-read targets

- `src/request.c::v4l2_request_init`: current auto-detect implementation (iter4 Commit Z `7f8fa93`).
- ``: `MEDIA_IOC_G_TOPOLOGY`, `MEDIA_ENT_F_*` enum.
- iter4 Phase 6 commit Z body: what the walk does today.

## Phase 3 baseline

iter4-B1 is well-known: env-override required per boot. Phase 3 captures the empirical baseline:

1. Fresh boot. Enumerate `/dev/media*` driver names.
2. `vainfo` with auto-detect (no env override). Observe what gets picked.
3. Show that on the boot where hantro-vpu encoder enumerates first, vainfo lists NO profiles.

iter6 Phase 3 already captured device-mapping artifacts inadvertently (per logs). Phase 3 of iter7 may reuse that.

## Phase 4 plan shape (predicted)

Mechanical:

1. After `MEDIA_IOC_DEVICE_INFO` matches the allow-list driver name, do `MEDIA_IOC_G_TOPOLOGY` (already happens at iter4 Commit Z).
2. Walk the topology's entities array. For each entity, check `function` field.
3. Accept the device only if AT LEAST ONE entity has `function == MEDIA_ENT_F_PROC_VIDEO_DECODER`.
4. Else skip and continue.

LOC estimate: 30-80 LOC in `request.c`. One commit. Maybe a follow-up commit for any cosmetic logging.
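
The entity-function check in the steps above could look roughly like the sketch below. It is not the code in `request.c`: the helper names (`entities_have_decoder`, `media_fd_has_decoder`, `media_path_has_decoder`) are hypothetical, and it assumes the usual two-call `MEDIA_IOC_G_TOPOLOGY` pattern (first call for counts, second call for the entity array). Treating any ioctl or allocation failure as "no decoder" is also an assumption; the fallback policy for kernels without entity-function info is still a Phase 4 decision.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <linux/media.h>

/*
 * Hypothetical helper (not from request.c): true if at least one entity in
 * the array reports the video-decoder function.
 */
static bool entities_have_decoder(const struct media_v2_entity *entities,
                                  uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        if (entities[i].function == MEDIA_ENT_F_PROC_VIDEO_DECODER)
            return true;
    }
    return false;
}

/*
 * Hypothetical helper: fetch the entity list of an open media device and
 * check it for a decoder entity. Any failure is treated as "no decoder",
 * which is an assumption of this sketch, not a settled policy.
 */
static bool media_fd_has_decoder(int media_fd)
{
    struct media_v2_topology topology;
    struct media_v2_entity *entities;
    bool found = false;

    /* First call with all array pointers NULL: only the counts are filled. */
    memset(&topology, 0, sizeof(topology));
    if (ioctl(media_fd, MEDIA_IOC_G_TOPOLOGY, &topology) < 0)
        return false;
    if (topology.num_entities == 0)
        return false;

    entities = calloc(topology.num_entities, sizeof(*entities));
    if (entities == NULL)
        return false;

    /* Second call: request the entity array only (other pointers stay NULL). */
    topology.ptr_entities = (uintptr_t)entities;
    if (ioctl(media_fd, MEDIA_IOC_G_TOPOLOGY, &topology) == 0)
        found = entities_have_decoder(entities, topology.num_entities);

    free(entities);
    return found;
}

/* Hypothetical convenience wrapper taking a /dev/media* path. */
static bool media_path_has_decoder(const char *path)
{
    int fd = open(path, O_RDWR);
    bool found;

    if (fd < 0)
        return false;
    found = media_fd_has_decoder(fd);
    close(fd);
    return found;
}
```

In the `/dev/media*` walk, a call such as `media_fd_has_decoder(media_fd)` would sit right after the driver-name match, with a `false` result meaning "skip this device and continue". Keeping the array scan in its own pure function makes the accept/reject rule testable without hardware.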
## Phase 5 review concerns to invite

- Does the v7.0-fresnel-fourier kernel's hantro / rkvdec set `MEDIA_ENT_F_PROC_VIDEO_DECODER` on the right entities? Verify empirically by reading the topology of each media device.
- Edge case: a media device with both encoder AND decoder entities (e.g., some SoCs have one combined video subsystem). Would the new code accept it? Yes (decoder entity present), and that is the correct outcome.
- Edge case: a media device with no entity-type info (older kernels). Fall back to current driver-name-only check, or refuse the device? Phase 4 picks.

## Predicted iter7 cadence

Small: roughly 15-30 min per phase.

- Phase 0: this doc.
- Phase 2: source-read request.c + topology UAPI. ~15 min.
- Phase 3: baseline empirical capture. ~15 min.
- Phase 4: plan. ~15 min.
- Phase 5: sonnet-architect review. ~30 min.
- Phase 6: implement, build, install. ~30 min.
- Phase 7: verify, reboot test. ~30 min.
- Phase 8: close. ~15 min.

Total: 2-3 hours wallclock, contingent on fresnel reboot availability.

## What "iteration 7 close" looks like

Per `feedback_dev_process.md` Phase 8:

- All 5 Phase 1 criteria green.
- `phase8_iteration7_close.md` summarizing the commit + verification.
- Memory entry update: `iter4-B1` removed from backlog; auto-detect hardening documented (or folded into the existing media-topology rule if it exists).
- Campaign scoreboard: unchanged on pixel-correctness axis; +1 quality-of-life delivery (no more env-override per session).

Predicted iter7 difficulty: **lowest of any iter since iter1**. Pure backend mechanical fix. No new bug classes anticipated.