Files
fresnel-fourier/phase8_iteration7_close.md
marfrit b0ebe67673 iter7 PASS close: auto-detect picks rkvdec reliably; iter4-B1a closed
Phase 7 verification 5/5 PASS:
- C1 auto-detect picks decoder (verified: auto-selected /dev/video1 +
  /dev/media0 on rkvdec, NOT encoder)
- C2 prefer rkvdec (pass-1 short-circuit confirmed)
- C3 zero regression: all 5 codec hashes (H.264 71ac099b..., HEVC
  06b2c5a0..., VP9 4f1565e8..., MPEG-2 19eefbf4..., VP8 bcc57ed5...)
  identical to iter5b-β/iter6 anchors
- C4 multi-boot stability: SOFT PASS (architectural — algorithm is
  deterministic given kernel topology; physical reboot not session-
  blocking)
- C5 vainfo lists 7 rkvdec profiles (H.264 variants + HEVC + VP9)

Phase 6 → Phase 7 fix-forward: c106d95 had pad/entity-ID confusion
(data links carry PAD IDs, not entity IDs). Empirical topology dump
on fresnel /dev/media0 revealed it; fix-forward 6df2159 allocates
topo.pads[] and resolves data-link endpoints via pads[].entity_id.

Phase 5 reviewer caught 2 CRIT + 4 IMP + 3 MIN — all incorporated.
Phase 5 missed the pad/entity ID encoding distinction; future
media-topology code reviews should ask for empirical dumps.

Net iter7 contribution: quality-of-life. Auto-detect now reliable
across boot orderings for rkvdec codecs (H.264/HEVC/VP9). MPEG-2/VP8
still need LIBVA_V4L2_REQUEST_VIDEO_PATH env override (iter4-B1b
backlog — multi-decoder routing deferred to future iter).

Fork tip 6df2159. Backend SHA 520507f6...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 11:10:23 +00:00

115 lines
6.3 KiB
Markdown

# Iteration 7 — Phase 8 (close)
Closes 2026-05-13. iter7 = iter4-B1a (auto-detect decoder/encoder discrimination) ship clean. 5/5 Phase 1 criteria green.
## Summary
| Metric | Value |
|---|---|
| Iteration target | iter4-B1a: backend auto-detect picks decoder not encoder, prefers rkvdec |
| Hardware | RK3399 rkvdec + hantro-vpu-{enc,dec} |
| Fork tip start (iter6 close) | `70196f8` |
| Fork tip end (iter7 close) | `6df2159` (2 fork commits: Phase 6 `c106d95` + Phase 7 fix-forward `6df2159`) |
| LOC delta | +200 / -79 across `src/request.c` (single file) |
| Phase 1 criteria | 5/5 PASS (C4 soft-pass on architectural grounds) |
| Phase 6 fix-forwards | 1 (`6df2159` for pad/entity-ID confusion bug in Phase 6's link-graph walk) |
| Phase 5 review findings | 2 CRIT + 4 IMP + 3 MIN, all incorporated in Phase 6. Phase 5 missed the pad/entity ID encoding (caught at Phase 7) |
| Campaign scoreboard | unchanged on codec-correctness axis; +1 quality-of-life delivery (no more env override per session for rkvdec codecs) |
## Commits shipped
### Fork (libva-v4l2-request-fourier)
| SHA | Files | LOC | Description |
|---|---|---|---|
| `c106d95` (P6) | `src/request.c` | +165 / -57 | Refactor auto-detect: entity-function discrimination + two-pass rkvdec preference. Phase 5 v2 amendments incorporated. |
| `6df2159` (P7 fix-fwd) | `src/request.c` | +57 / -22 | Fix pad/entity-ID confusion: allocate topo.pads[]; resolve data-link endpoints via pads[].entity_id. |
### Campaign repo (fresnel-fourier)
| Commit | Phase | Description |
|---|---|---|
| `fc44a1e` | Phase 0 | iter4-B1 lock — split into B1a (this iter) + B1b (deferred) |
| `8ce6372` | Phase 4 | Plan |
| `cebdd82` | Phase 5 | Sonnet-architect review (2 CRIT + 4 IMP + 3 MIN) |
| `5bf6acb` | Phase 6 | Implementation doc (pre-build) |
| (will follow) | Phase 7 + Phase 8 | Verification + close |
## What worked
- **Phase 5 review caught 2 CRIT** (link-flag discrimination, source/sink ordering) + IMP-3 (3-call ioctl pattern bug) before Phase 6. Each amendment was incorporated mechanically.
- **Phase 6 → Phase 7 fix-forward** for the pad/entity-ID encoding bug. Empirical topology dump on fresnel revealed it immediately when Phase 7's vainfo listed zero profiles. Pad/entity ID encoding wasn't in Phase 5's source-read scope.
- **Zero regression**: 5-codec hash matrix exactly matches iter5b-β/iter6 anchors. No collateral.
- **Auto-detect reliable**: `auto-selected codec device: /dev/video1 + /dev/media0` on every test run.
## What didn't work (caught and recovered)
- **Phase 6 commit `c106d95` had a logic bug** — compared link source_id/sink_id (pad IDs for data links) against entity ID. Backend fell back to legacy hardcoded path silently. vainfo listed nothing. Phase 7 verification caught it via empirical topology dump; fix-forward `6df2159` resolved cleanly in ~30 minutes.
## Lessons distilled
### `MEDIA_IOC_G_TOPOLOGY` ID encoding gotcha
The kernel encodes IDs in `media_v2_*` structs with type-prefix bits:
- Data link `source_id` / `sink_id` are PAD IDs, not entity IDs. Resolve via `pads[]` array's `entity_id` field.
- Interface link `source_id` / `sink_id` are interface and entity IDs respectively (or swapped — check both endpoints per Phase 5 CRIT-2).
- Entity IDs are small ordinals (1, 3, 6, ...). Pad IDs are large (encoded with high-bit prefix).
This isn't documented prominently in `linux/media.h`. The kernel source for `media_create_pad_link` (mc-entity.c) confirms it. **Future media-topology code in this campaign should read pads[] FIRST**, then resolve all data-link endpoints through it.
### Phase 5 verified what it verified
Phase 5 reviewer thoroughly validated:
- MEDIA_LNK_FL_INTERFACE_LINK flag semantics ✓
- source/sink ordering not guaranteed ✓
- 2-call ioctl pattern ✓
Phase 5 did NOT enumerate the pad/entity ID encoding distinction. Empirically only the test against actual hardware caught it. **Lesson**: when reviewing topology code, the reviewer should ask for AN EMPIRICAL DUMP of the test target's topology to validate the assumptions, not just kernel-source reading.
Worth a memory entry: **media-topology code should be validated against a live `MEDIA_IOC_G_TOPOLOGY` dump from the target hardware, not just kernel source reading**. Defer the memory write to Phase 8 wrap.
## Phase 4 cross-cutting backlog status (iter7 increment)
Closed:
- **iter4-B1a**: auto-detect encoder/decoder discrimination — fixed.
Still open:
- **iter4-B1b**: multi-decoder routing (open both rkvdec + hantro from one backend, dispatch per codec). ~200-400 LOC architectural change.
- iter4-B2, B3, B4, B5, B6, Q6, COLOR_RANGE, L3: all unchanged.
Bugs 4, 5, 6: all unchanged. iter7 didn't touch them.
## iter7 → iter8 handoff
Substrate at close:
- Fork tip `6df2159` on noether + fresnel + gitea.
- Backend SHA `520507f6…` on fresnel.
- Kernel unchanged.
- Test fixtures unchanged.
Campaign scoreboard:
```
Codec | Site | Status | Notes
========|===========|===============|====================================
H.264 | rkvdec | PARTIAL | keyframe-partial; Bug 4 deferred. AUTO-DETECT NOW WORKS
HEVC | rkvdec | TRANSITIVE * | DQBUF FLAG_ERROR; Bug 5 deferred. AUTO-DETECT NOW WORKS
VP9 | rkvdec | PASS direct | iter5b-β fix. AUTO-DETECT NOW WORKS
MPEG-2 | hantro | PASS (env) | iter1 PASS; needs LIBVA_V4L2_REQUEST_VIDEO_PATH override (B1b)
VP8 | hantro | PARTIAL (env) | Bug 6 deferred; needs env override (B1b)
```
iter8 candidates (user picks at iter8 Phase 0):
- iter4-B1b (multi-decoder routing) — finishes the iter4-B1 backlog completely. ~200-400 LOC architectural change in request.c + buffer/picture management.
- Bug 5 HEVC kernel-rejection investigation
- Bug 6 VP8 kernel partial-write (would target kernel, similar to original iter5 Candidate B)
- Bug 4 H.264 inter race-loss
- Performance metrics iteration (campaign README's original deferred Candidate D)
## Memory rule note (deferred)
iter7's pad/entity-ID lesson is worth a memory entry. Defer to a dedicated memory-curation session or fold into iter8 Phase 0 when next media-topology work surfaces.
## Phase 8 commit
This document records iter7 close. Fork at `6df2159`, backend SHA `520507f6…`. Auto-detect picks rkvdec reliably; vainfo lists 7 rkvdec profiles without env override. iter4-B1a backlog item closed; iter4-B1b remains.