51eee192b8cabaaa9899c23da20495aa3cb4a92d
102 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
9a14cc2527 |
iter5b-β Phase 8 close: PARTIAL PASS — VP9 unblocked direct, Bugs 4/5/6 carried to iter6
Iteration shipped (fork tip 70196f8, backend SHA 2c6ff82c... on fresnel):
- VP9 directly verifiable (Phase 1 criterion 1 met for 1 of 3 target codecs)
- MPEG-2 maintained (no regression after Commit D fix-forward)
- H.264 unchanged (Bug 4 deferred per Phase 1 lock)
- Architecture cleaned: CreateSurfaces2 ~70 LOC (single-responsibility), CreateContext owns OUTPUT lifecycle, no α'-style failure mode possible.
Surfaced bugs for iter6+:
- Bug 5: HEVC libva DQBUF FLAG_ERROR (pre-existing; iter2's transitive PASS verified control payload but not decode outcome)
- Bug 6: VP8 libva produces non-zero non-matching output (slot rotation or partial fill, masked pre-β by all-zero state)
- Bug 4: H.264 inter-frame race-loss (carried from iter4 P7)
Lessons distilled to memory:
- feedback_grep_callsites_before_no_change.md (Phase 5 v2 CRIT-2 caught request_pool_destroy not in DestroyContext after C3 stripped its only per-session caller)
- feedback_trust_iter_comments_for_lifecycle.md (Commit D fix-forward surfaced because Phase 4 v2 read but didn't trace context.c:262's iter6 ffmpeg-vaapi-copy surfaces_count=0 comment)
Campaign scoreboard: 5/5 with 2 direct (VP9 new, MPEG-2 maintained) + 3 mixed (H.264 keyframe partial, VP8 partial new, HEVC transitive-only direct-FAIL).
iter6 awaits Phase 0 research-question lock.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c773c3d2c1 |
iter5b-β Phase 7: PARTIAL PASS — VP9 unblocked, MPEG-2 maintained, HEVC+VP8 partial
Two acts:
Act 1 (β alone): all 5 libva codecs returned all-zero. MPEG-2 was a regression (pre-β it worked); HEVC was unchanged (kernel returns DQBUF FLAG_ERROR pre AND post β — same Phase 3 baseline showed it). Root cause: ffmpeg-vaapi-copy passes surfaces_count=0 to vaCreateContext per iter6 context.c:262 comment; my β walk of surfaces_ids[] was a no-op → destination_planes_count stayed 0 → surface_bind_slot no-op → all-zero readback.
Act 2 (Commit D): cache format-uniform CAPTURE geometry in driver_data; walk surface_heap in CreateContext; lazy-fill in CreateSurfaces2 when fmt_valid is set; invalidate in DestroyContext. Restores MPEG-2 to pre-β state and unlocks VP9.
Per Phase 1 criteria: criterion 1 PARTIAL (VP9 of HEVC+VP9+VP8); criteria 2-4 PASS.
Bug 5 (NEW): HEVC libva DQBUF FLAG_ERROR — pre-existing kernel rejection; β's OUTPUT format fix didn't address it. Transitive proof at iter2 verified control payload shape but kernel still rejects; some other V4L2 protocol contract aspect differs from kdirect.
Bug 6 (NEW): VP8 libva produces non-zero output with real content (74.8% zero + 256 unique bytes incl. keyframe pixels at `93 8e 8a 89...`) but diverges from kdirect. Decode runs; output mismatch likely slot-rotation or partial-fill bug.
VP9 is iter5b-β's only clean PASS. Architecture-wise β succeeded: no α'-style failure mode possible (no in-CreateSurfaces2 destructive teardown), and the CRIT-1+CRIT-2 fixes from Phase 5 v2 review held.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
311411b3f9 |
iter5b-β Phase 6: 3 commits A+B+C landed on fork, build pending fresnel uptime
Commits: 1c548b1 (codec helper), cc077a0 (config wire-up), 7055b14 (β refactor + CRIT-1 + CRIT-2 + IMP-1 + IMP-2 + dead-field cleanup). Fork tip 7055b14.
surface.c CreateSurfaces2 reduced from ~250 to ~50 LOC. OUTPUT-side V4L2 lifecycle moved to context.c CreateContext. DestroyContext gained request_pool_destroy() (CRIT-2 fix). last_output_*/surface_reset_format_cache deleted (dead under β).
All 5 Phase 5 v2 amendments (CRIT-1, CRIT-2, IMP-1, IMP-2, IMP-3) incorporated.
Fresnel offline at push time — build+install+verify deferred to Phase 7.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
3508a2cfeb |
iter5b Phase 5 v2: 2 CRIT findings — NULL guard + missing request_pool_destroy
CRIT-1: context.c:64-66 video_format==NULL guard rejects every first
β CreateContext. β moves the probe from CreateSurfaces2 into
CreateContext itself, so the guard fires before any new logic runs.
Fix: remove guard, move CAPTURE probe to top of CreateContext.
CRIT-2: DestroyContext lacks request_pool_destroy. Empirical grep
shows only surface.c:220 (which β strips) calls it per-session.
Without amendment, second CreateContext gets pool->initialized=true
with stale slot pointers → QBUF EINVAL. Fix: add request_pool_destroy
to DestroyContext before REQBUFS(0). C3 (surface.c strip) and CRIT-2
fix MUST land together.
Plus IMP-1 (mplane assumption wrong for SUNXI_TILED_NV12) + IMP-2
(surface_reset_format_cache becomes dead under C7) + IMP-3 (error
recovery comment).
Phase 6 BLOCKED pending CRIT-1 + CRIT-2 fixes. Author confirmed
both at code level — Phase 5 caught what Phase 4 v2's surface read
missed ("DestroyContext teardown — no change needed" — wrong; was
incomplete).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
5abea730a0 |
iter5b Phase 4 v2: re-plan with option β — CreateContext-centric OUTPUT lifecycle
Supersedes phase4_iter5b_plan.md (the α' plan rejected at Phase 7).
β architecture: strip OUTPUT-side V4L2 device state from RequestCreateSurfaces2 entirely; move it to RequestCreateContext where config_id (and therefore the bound profile) is unambiguously known. CreateSurfaces2 becomes ID-allocation + per-surface bookkeeping only.
9 contract clauses (C1..C9). Reuses 2 of 3 reverted iter5b commits (codec.h/codec.c helper; object_config->pixelformat wire-up at CreateConfig). New work: C3 strip surface.c, C4 build out context.c — predicted ~120 LOC into context.c, ~190 LOC stripped from surface.c (net ~70 LOC delta).
Risk register: 7 items; highest is multi-context resolution change within shared driver_data (medium impact, mitigated by existing DestroyContext teardown). α''s destructive teardown failure mode disappears because β has no in-CreateSurfaces2 teardown branch.
Phase 5 review focus: error-recovery branches in CreateContext, per-surface destination_* fill semantics (format-uniform fields at CreateContext vs per-slot fields at BeginPicture), ohm backwards-compat verification.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
864af258e9 |
iter5b Phase 7: FAIL — HEVC SIGSEGV, option α' rejected, revert + loopback to β
Empirical sweep on iter5b backend (SHA d7722da...) crashed in copy_surface_to_image during HEVC libva-vaapi-hwdownload. Coredump backtrace shows memcpy on stale surface_object->destination_data[i] pointer — cap_pool_destroy ran during my pixfmt-change teardown branch, but the subsequent S_FMT got EBUSY because the OUTPUT queue was already streaming. State corruption mid-decode.
Root cause: ffmpeg-vaapi calls vaCreateSurfaces2 *twice*, with CreateContext+STREAMON between them. My CreateSurfaces2 gate destructively tears down cap_pool on pixelformat change but can't recover when REQBUFS(0) silently fails on a streaming queue.
surface.c:164-171 TODO comment from iter1 anticipated exactly this: "STREAMOFF + REQBUFS(0) + new S_FMT + new CREATE_BUFS — that's a context-level redesign for the next iteration." Phase 4 dismissed the comment as targeting multi-resolution mid-stream. That dismissal was wrong; ffmpeg-vaapi triggers the same code path.
3 reverts on fork master: 4b2288f, f8256e6, ce304ef reverted by 709ab34, 9a7f888, 6bc29ec. Backend rebuilt + reinstalled on fresnel at iter4-tip SHA 6e90b7a9.... Post-revert HEVC libva returns the pre-iter5b broken-but-non-crashing all-zero pattern.
Per Phase 1 lock: criteria 1 FAIL (HEVC/VP9/VP8 still all-zero); criteria 2-4 PASS (no regression on MPEG-2/H.264 keyframe/control payloads). iter5b does not close.
Phase 7 → Phase 4 loopback: re-plan as option β (defer OUTPUT-side S_FMT+CREATE_BUFS to CreateContext where config_id is known and streams haven't started).
User pick: revert + re-plan with β.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
550bb81a3e |
iter5b Phase 6: 3 commits A+B+C landed clean, backend installed on fresnel
Fork tip 4b2288f. Backend SHA256 d7722da742bfcb86a9136b07e6d9a5de23668f37fcad328258966c5338265e82 on /usr/lib/dri/v4l2_request_drv_video.so (pre-iter5b was 6e90b7a9b2c33480...).
LOC: 188 across 5 modified files + 2 new (codec.h, codec.c). All 4 Phase 5 amendments (CRIT-1 + 3 IMPs) incorporated in the actual commits, no follow-ups needed.
Phase 7 sweep ready: re-run /tmp/iter5_p3/sweep.sh on fresnel; expect libva == kdirect == sw for HEVC + VP9 + VP8 (3 codecs unblocked); MPEG-2 unchanged; H.264 unchanged (Bug 4 deferred to iter6).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
7d1c44bd90 |
iter5b Phase 5: review — CRIT-1 mechanical pseudocode fix, 3 IMP amendments
Sonnet-architect found one Critical pseudocode error and three Important amendments. All mechanical; no structural plan change.
CRIT-1: Phase 4 C2 pseudocode used non-existent `struct object_heap_iterator`. Actual API at object_heap.h:67-68 uses `int *iterator`. Author re-verified vs request.c:411-418 canonical usage. Verbatim paste would have compile-failed.
IMP-1: gate comment at surface.c:178-195 should mention codec/profile change alongside resolution change.
IMP-2: dead `object_config->pixelformat` field at config.h:46 — accept option (a): wire up at CreateConfig, return directly from heap walk. Saves one pixelformat_for_profile() call in surface.c path.
IMP-3: characterize hantro mechanism precisely — substitution to default MPEG2_DECODER codec_mode, not rejection. Explains why MPEG-2 worked but VP8 didn't pre-fix.
10 contract clauses scorecard: 1 FAIL (C2), 2 CONDITIONAL (C3, C10), 7 PASS. Phase 6 cleared conditionally pending all 4 amendments.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
eca03d2641 |
iter5b Phase 4: plan — option α' (single-config lookup), 10 contract clauses
Picks α' over the Phase 2 recommendation of β: smaller scope (~50 LOC
vs ~250), targets iter5b's actual bug (wrong OUTPUT format at INITIAL
CreateSurfaces2, not the multi-resolution mid-stream case the
surface.c:164-171 TODO comment anticipates).
Patches:
- C1/C6: NEW src/codec.{h,c} + meson.build — pixelformat_for_profile()
- C2: NEW find_sole_active_profile() static helper in surface.c
- C3: Replace surface.c:173 hardcode with profile-derived lookup
- C5: Extend last_output_* gate with pixelformat
Phase 7 expected post-fix matrix: HEVC + VP9 + VP8 libva == kdirect
== sw (3 codecs unblocked); MPEG-2 unchanged (already worked);
H.264 still race-loses inter frames (Bug 4, deferred to iter6).
Phase 5 review concerns laid out: helper completeness, heap iterator
API, gate semantics, hantro CAPTURE-derivation on correct format,
mpv probe-then-real flow, memory rule placement.
Option β deferral note: cleaner refactor exists but not necessary
for iter5b's bug; defer to future iteration when multi-resolution
mid-stream becomes a target.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6b0e023e7f |
iter5b Phase 2: situation — lifecycle traced, option β (defer to CreateContext) recommended
VA-API lifecycle traced: CreateConfig stores profile in object_config; CreateSurfaces2 has NO config_id, can't access profile; CreateContext takes VAConfigID and already does profile-switch for h264_start_code (context.c:205-217, iter4 fix-forward 692eaa0).
surface.c:164-171 already flags this as deferred-work in a TODO comment: "that's a context-level redesign for the next iteration." iter5b picks up that deferred work.
Three options analyzed empirically:
- α: thread current_profile through driver_data (15 LOC, fragile semantic)
- β: move OUTPUT-side lifecycle to CreateContext (80-150 LOC, clean)
- γ: lazy at BeginPicture (architecturally wrong site)
Recommendation: option β. iter4 reviewer accepted the deferred-work flag in surface.c; iter5b is the iteration that addresses it.
object_config->pixelformat field at config.h:46 is declared but never assigned — opportunity for wiring up cleanly via the profile→pixelformat map.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
cd34ec1918 |
iter5 Phase 0 loopback: real Bug 2 is surface.c:173 hardcoded OUTPUT format
Empirical strace of all 5 codecs through libva shows VIDIOC_S_FMT on OUTPUT_MPLANE ships pixelformat V4L2_PIX_FMT_H264_SLICE for EVERY profile. HEVC controls submitted on H264_SLICE OUTPUT → kernel rkvdec silently rejects/no-ops → CAPTURE stays in cap_pool init (all-zero).
Per-codec Bug 2 taxonomy:
- HEVC, VP9, VP8: OUTPUT format mismatch on rkvdec/hantro-strict → 100% zero
- MPEG-2: format mismatch but hantro tolerates → works
- H.264: format right by coincidence; keyframe decodes, inter all-zero (Bug 4, separate, deferred from iter5b)
Site: src/surface.c:173 `unsigned int pixelformat = V4L2_PIX_FMT_H264_SLICE`. Same bug class as feedback_unconditional_codec_state.md (iter4 h264_start_code = true).
iter5b new Phase 1: fix surface.c to switch pixelformat on config_object->profile. 4 criteria locked, all backend-side, no kernel patches. RFC v2 series filed back to backlog for a future DMABUF-import-consumer campaign.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
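The profile-to-pixelformat switch described above can be sketched as follows. This is a minimal illustration, not the fork's codec.c: the `demo_` enum and function names are hypothetical, and the fourcc macro is a local stand-in for the kernel's `v4l2_fourcc()` so the sketch compiles without `<linux/videodev2.h>`. The five fourcc spellings follow the mainline stateless-codec OUTPUT formats.

```c
#include <stdint.h>

/* Local stand-in for v4l2_fourcc(); same little-endian packing. */
#define DEMO_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Hypothetical profile enum; a real backend would switch on VAProfile. */
enum demo_profile {
	DEMO_PROFILE_MPEG2,
	DEMO_PROFILE_H264,
	DEMO_PROFILE_HEVC,
	DEMO_PROFILE_VP8,
	DEMO_PROFILE_VP9,
	DEMO_PROFILE_UNKNOWN,
};

/*
 * Map a decode profile to the OUTPUT-queue pixelformat.
 * Returns 0 for profiles the sketch does not know, so the caller can
 * fail early instead of silently defaulting to H264_SLICE.
 */
static uint32_t demo_pixelformat_for_profile(enum demo_profile profile)
{
	switch (profile) {
	case DEMO_PROFILE_MPEG2:
		return DEMO_FOURCC('M', 'G', '2', 'S'); /* MPEG2_SLICE */
	case DEMO_PROFILE_H264:
		return DEMO_FOURCC('S', '2', '6', '4'); /* H264_SLICE */
	case DEMO_PROFILE_HEVC:
		return DEMO_FOURCC('S', '2', '6', '5'); /* HEVC_SLICE */
	case DEMO_PROFILE_VP8:
		return DEMO_FOURCC('V', 'P', '8', 'F'); /* VP8_FRAME */
	case DEMO_PROFILE_VP9:
		return DEMO_FOURCC('V', 'P', '9', 'F'); /* VP9_FRAME */
	default:
		return 0;
	}
}
```

Returning 0 for an unknown profile is a design choice of the sketch; it makes the "format right by coincidence" case impossible because no profile inherits another codec's format.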
||
|
|
0adfb11fff |
iter5 Phase 5: review CRIT-1 invalidates Phase 4 — loop back to Phase 0/3
Sonnet-architect review found that the RFC v2 fix mechanism does not reach the libva backend's consumer path:
- Backend uses V4L2_MEMORY_MMAP for both OUTPUT + CAPTURE buffers.
- For MMAP buffers, vb->planes[].dbuf stays NULL.
- RFC v2 helper's plane loop skips planes with !dbuf, fence attached to no dma_resv.
- EXPBUF (vb2_dc_get_dmabuf) creates a fresh disjoint dma_resv.
- The fence-mechanism fix would be a no-op for the cap_pool path even if it did reach the right resv, because RequestSyncSurface already blocks on media_request_wait_completion + v4l2_dequeue_buffer.
Three alternative root-cause hypotheses for Phase 0/3 to disambiguate: cache coherency, cap_pool slot-rotation bug, or a separate-sync gap in vaDeriveImage/vaMapBuffer that bypasses RequestSyncSurface.
Phase 5 saved ~half a session of build-install-test wallclock that would have ended in a Phase 7 → Phase 0 loopback anyway. Three Important + 2 Minor findings also recorded for when iter5 reopens.
User pick: loop back to Phase 0/3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a809e9c0b8 |
iter5 Phase 4: plan — 4 patches + manifest diff + PKGBUILD bump
12 contract clauses (C1..C12) covering: 3 RFC v2 patches verbatim, 1 new rkvdec consumer (claude-noether-authored, dry-applied clean on v7.0 in worktree test), kernel-agent patches/ scope tag + fleet/fresnel.yaml diff, marfrit-packages PKGBUILD bump 7.0-1 → 7.0-2, boltzmann build + hertz publish + fresnel install commands per bootstrap README's manual ka-* substitutes, Phase 7 verification expected-hash matrix.
Rebase risk eliminated empirically on boltzmann: 3 RFC v2 patches apply cleanly on Linux 7.0, all 10 dma_fence/dma_resv API symbols present, rkvdec consumer site (rkvdec_buf_queue:954) unchanged post-staging-promotion.
Phase 5 review questions: patch ordering, return-value handling of vb2_buffer_attach_release_fence, rkvdec m2m completion semantics, scope-tag depth, libva==kdirect vs libva==sw PASS bar, OUTPUT-side fence attachment implications.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
3c05564e99 |
iter5 Phase 3: baseline — 4/5 libva codecs race-lose, MPEG-2 wins, kdirect clean
5-codec sweep matrix on linux-fresnel-fourier 7.0-1 confirms:
- libva path returns all-zero cap_pool init pattern for H.264 (mostly), HEVC, VP9, VP8 (always). MPEG-2 wins the race (fastest hantro decode).
- kernel-direct ffmpeg-v4l2request hwdownload byte-matches SW for all 4 race-losing codecs.
- B4 cosmetic init-probe EINVAL noise reproduced on hantro (2 ioctls per codec); MPEG-2 + VP8 stateless control submissions follow at = 0.
iter4 P7's "RGB(0,0x4c,0)" pattern corrected to all-zero raw bytes (the 0x4c was YUV→RGB conversion of all-zero NV12). Same SHA shape as iter3's hantro b34860e0 blocker fingerprint.
Control-payload strace anchors persisted as phase-7 invariants.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9941523f1f |
iter5 Phase 2: situation analysis — 4-patch plan (3 RFC v2 + 1 new rkvdec consumer)
Source-read complete: 3 RFC v2 patches dissected, v7.0 rkvdec_buf_queue site identified at line 954 of drivers/media/platform/rockchip/rkvdec/rkvdec.c, empirical disproof of Bug 3 UAPI drift via byte-identical v6.12↔v7.0 struct diff, hantro_v4l2.c confirmed unchanged across the same range.
Rebase risk concentrated in videobuf2-core.c (medium — vb2 core sees regular activity); deferred to Phase 4 when boltzmann is reachable for the git apply --3way verification.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
31b9255d63 |
iter5 Phase 0 amend: Bug 3 collapses, locked criteria 5→4
Phase 2 source-read mid-execution found that v4l2_ctrl_mpeg2_* and v4l2_ctrl_vp8_frame are byte-identical v6.12 ↔ v7.0 mainline. On-fresnel re-trace with correct hantro-decoder bind shows MPEG-2 controls submit at = 0; the "Unable to set control(s)" log noise is the backend's H.264/HEVC init-probe EINVAL on a non-H.264 device (B4 backlog), not a UAPI drift.
iter5 locked scope is now vb2_dma_resv (4 patches: 3 existing operator-authored RFC v2 + new rkvdec consumer). Criteria reduced from 5 to 4. B4 stays in backlog.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
8acfca3fe0 |
iter5 Phase 0: lock Candidate B — vb2_dma_resv + hantro UAPI drift in linux-fresnel-fourier
Five Phase 1 criteria: Bug 2 closed (cap_pool readback returns real pixels through libva); Bug 3 closed (hantro MPEG-2 + VP8 controls accepted on new kernel); patches ship from kernel-agent (local-carry acceptable, mainline bonus); zero codec-contract regression vs iter4; 5/5 direct-verification block restored. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9d2b7c1944 |
iter4 Phase 7 close: Option-A transitive proof complete — VP9 PASS 4/5
Leg 1: FRAME control 168/168 bytes byte-identical to kernel-direct anchor.
Leg 2: COMPRESSED_HDR 1950/2040 match; 90-byte uv_mode[10][9] delta is the
documented S4 carve-out (rkvdec persistent kernel table).
Leg 3: kernel-direct YUV (NV12→YUV420P, 3 frames @1280x720) SHA256-identical
to libvpx-vp9 SW reference: 4f1565e89cd720c4eb6e59d8bbb46127b02cf13102911afc4e174925e5b36094
iter4 criteria 1+2+3 direct PASS, 4 transitive PASS, 5 carried as substrate
issue (cap_pool readback, Bug 2 + hantro UAPI drift, Bug 3) outside iter4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f510ac6be5 |
iter4 Phase 7 pause: fork fix-forward 692eaa0, awaiting fresnel return for transitive-proof closure
Mid-Phase-7 fix-forward landed on fork (marfrit/libva-multiplanar:692eaa0): unconditional context_object->h264_start_code = true was prepending 0x00 0x00 0x01 to VP9 slice data, shifting the rkvdec bitstream by 24 bits and producing silent decode failure. Now gated on config_object->profile (H.264 + HEVC only).
Empirical verification when fresnel was online: post-fix VP9 keyframe FRAME control bytes 0-23 byte-match Phase 3 anchor:
lf.flags=0x03 (DELTA_ENABLED|DELTA_UPDATE) — was 0x01
base_q_idx=0x2e=46 — was 0x41=65
This is the transitive-proof leg-1 (backend-payload == kernel-direct-payload) for the iter4 keyframe.
Open verification when fresnel returns:
- Full 168-byte FRAME control diff mine vs Phase 3 anchor
- Full 2040-byte COMPRESSED_HDR control diff
- ffmpeg-v4l2request kernel-direct VP9 decode + hwdownload pixels = Phase 3 SW reference (transitive-proof leg-2)
If both legs PASS, iter4 closes 5/5 (4 direct from earlier iters + 1 transitive iter4) per Option-A choice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
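The start-code gating in this fix-forward can be illustrated with a small sketch. It assumes a simplified codec enum and a flat destination buffer; the `demo_` names are hypothetical and do not come from the fork. The point it shows is that a 3-byte `0x00 0x00 0x01` prefix belongs only in front of H.264 and HEVC slice data, while VP9 payloads must be copied unshifted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical codec tag; a real backend would inspect the VAProfile. */
enum demo_codec { DEMO_H264, DEMO_HEVC, DEMO_VP9 };

/* Only the Annex-B style codecs get a start code in this sketch. */
static int demo_needs_start_code(enum demo_codec codec)
{
	return codec == DEMO_H264 || codec == DEMO_HEVC;
}

/*
 * Copy slice data into dst, prepending 0x00 0x00 0x01 when the codec
 * wants it. Returns 0 and stores the byte count in *out_len on success,
 * or -1 if dst is too small.
 */
static int demo_pack_slice(enum demo_codec codec,
			   const uint8_t *src, size_t len,
			   uint8_t *dst, size_t cap, size_t *out_len)
{
	static const uint8_t start_code[3] = { 0x00, 0x00, 0x01 };
	size_t off = 0;

	if (demo_needs_start_code(codec)) {
		if (cap < sizeof(start_code))
			return -1;
		memcpy(dst, start_code, sizeof(start_code));
		off = sizeof(start_code);
	}
	if (len > cap - off)
		return -1;
	memcpy(dst + off, src, len);
	*out_len = off + len;
	return 0;
}
```

With an unconditional prefix, the VP9 case would start 3 bytes (24 bits) late, which matches the shift described in the commit message.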
||
|
|
d87c940788 |
iter4 Phase 7: criterion 1+2+3 PASS, criterion 4+5 FAIL — three bug classes identified
Verification on linux-fresnel-fourier 7.0-1:
PASS:
- Criterion 1: vainfo enumerates VAProfileVP9Profile0 via auto-detect.
- Criterion 2: vaCreateConfig SUCCESS (implicit).
- Criterion 3: ffmpeg-vaapi VP9 5-frame decode exit 0 at 0.307x, no
ioctl errors.
FAIL — three distinguishable bug classes:
Bug 1 (VP9-specific, my Clause 6 parser):
Strace of frame-1 keyframe FRAME control vs Phase 3 anchor:
- byte 8 (lf.flags): mine=0x01 (DELTA_ENABLED only) vs ref=0x03
(ENABLED|UPDATE).
- byte 16 (base_q_idx): mine=0x41 (65) vs ref=0x2e (46).
- byte 17 (delta_q_y_dc): mine=8 vs ref=0.
Bit-trace shows my parser is 2 bits ahead of correct position by
the time it reaches lf_delta_enabled. Fix path: faithful port of
FFmpeg vp9.c::decode_frame_header.
Bug 2 (substrate-wide, cap_pool readback):
Constant RGB(0, 0x4c, 0) "0x4c gray" pattern across all codecs
(VP9, HEVC, MPEG-2, VP8). H.264 keyframe DOES read correctly with
real RGB(0, 0xe3, 0) content; H.264 inter frames revert to 0x4c.
Kernel decode succeeds (Phase 3 strace + ffmpeg-v4l2request
standalone confirm). libva readback returns cap_pool init scratch.
Sibling of iter3 dma_resv blocker but with different signature
(constant 0x4c instead of all-zero 0x00).
Bug 3 (hantro UAPI drift):
MPEG-2 + VP8 produce kernel "Unable to set control(s): Invalid
argument" errors. UAPI struct sizes/fields likely shifted between
6.19.9 and 7.0 (sibling of Phase 3 VP9 struct-size correction
144/1947 -> 168/2040).
Three loopback options proposed (decision pending user):
- A: VP9-only fix (Clause 6 parser); accept Bug 2/3 as substrate
pre-existing; criterion 4 transitive-only per iter3.
- B: Full loopback covering all 3 bugs; possibly requires kernel
patches (vb2_dma_resv RFC v2).
- C: Phase 0 reset; substrate is the primary issue; pause iter4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
42b9ec333a |
iter4 Phase 6: 4 commits landed (Z+A+B+C), ffmpeg-vaapi VP9 decode PASS
Fork at marfrit/libva-multiplanar tip beaa914:
- Z (7f8fa93) device-path auto-detect via media controller topology; walk /dev/media*, MEDIA_IOC_DEVICE_INFO match, MEDIA_IOC_G_TOPOLOGY -> MEDIA_INTF_T_V4L_VIDEO -> resolve via /sys/dev/char. LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 escape hatch.
- A (16b3973) src/config.c VP9 enumeration + dispatch + entrypoints.
- B (406d08e) NEW src/vp9.c (~750 LOC: VPX rac + inv_map_table + uncompressed-header partial parser + compressed-header parser + vp9_set_controls) + src/vp9.h + meson.build + context.h (persistent vp9_lf state for Phase 5 C2) + surface.h (params.vp9 union extension).
- C (beaa914) src/picture.c VP9 dispatcher + 2 buffer-type cases.
NO Commit D — buffer.c allow-list already permissive for VP9's 3 buffer types (Picture, Slice, SliceData; all in iter3 baseline).
Phase 5 amendments all in code: C1 no-XOR direct, C2 persistent vp9_lf with VP9 spec defaults, C3 out_reference_mode parameter, C4 NO_AUTODETECT escape, S4 uv_mode memcpy omitted.
Plan amendment to Commit Z section in phase4_iter4_plan.md documents the canonical media-topology approach (replacing the original /dev/video* walk).
Verification empirically on fresnel:
- Criterion 1: vainfo enumerates VAProfileVP9Profile0 alongside H.264 + HEVC under auto-detect rkvdec.
- Criterion 2 (implicit via successful ffmpeg run).
- Criterion 3: ffmpeg-vaapi VP9 5-frame decode exit 0 at 0.307x speed, no ioctl errors.
- Criterion 4: deferred to Phase 7 verification.
- Criterion 5: rkvdec codecs work without env override; hantro (MPEG-2/VP8) still need env override per iter4-B1 backlog.
Open iter4 backlog: B1 (multi-decoder dispatch refactor), B2 (mpv-vaapi Could-not-create-device — ffmpeg-vaapi works fine through same backend, mpv does not), Q6 (per-segment ALT_Q mapping for non-BBB), COLOR_RANGE (VAAPI gap).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9865416ed2 |
iter4 Phase 5: sonnet-architect review — 4 Critical findings, all amendments incorporated
Review by sonnet-architect with cold-context source reads of fork +
kernel UAPI + VAAPI + FFmpeg references + kernel rkvdec source.
Reviewer applied Direction 2 (empirical-over-theoretical) by
test-compiling struct sizes, gcc-c-checking VAAPI field accesses,
and source-tracing FFmpeg's filter-mode XOR provenance.
Critical findings (all empirically validated by author before
incorporation per feedback_review_empirical_over_theoretical.md):
C1 - interpolation_filter double-XOR: vaapi_vp9.c:62 ALREADY applies
`filtermode ^ (filtermode <= 1)` when filling VAAPI's
mcomp_filter_type. Plan's second XOR was incorrect; would swap
EIGHTTAP and EIGHTTAP_SMOOTH for inter frames -> wrong
loop-filter strength. Fix: direct assignment, no XOR.
C2 - LF deltas not persistent: kernel UAPI explicitly says
"users should pass its last value" when delta_update=0. Plan
memset-zeroed each frame; would send {0,0,0,0,0,0} on BBB inter
frames instead of {1,0,-1,-1,0,0}. Fix: add persistent vp9_lf
state to object_context, init to VP9 spec defaults, update only
when parser sees delta_update=1, always copy to kernel control.
C3 - reference_mode out-parameter missing: reference_mode lives in
FRAME struct, not COMPRESSED_HDR. Plan referenced
`compressed_hdr_reference_mode` placeholder which would be an
undefined identifier -> compile failure. Fix: add
`uint8_t *out_reference_mode` param to vp9_fill_compressed_hdr;
derive `allowcompinter` at call site from the 3 sign biases.
C4 - Mitigation B scope claim overstated: walk-and-pick-first always
selects rkvdec on 7.0 (since video1 enumerates first). Hantro
codecs (MPEG-2, VP8) at video3 still require env override.
Fix: qualify criterion-5 trace; add
LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 escape hatch for legacy callers.
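Findings C1 and C2 can be made concrete with a short sketch. It is illustrative only: the `demo_` names and the struct layout are hypothetical, not the fork's vp9.c. The first function shows that the `filtermode ^ (filtermode <= 1)` remap is its own inverse, so applying it a second time undoes the first. The second part shows loop-filter deltas kept in persistent state, initialised to the {1,0,-1,-1} / {0,0} defaults cited in the review and overwritten only when the parser reports delta_update=1.

```c
#include <stdint.h>
#include <string.h>

/* C1: the remap swaps values 0 and 1 and leaves every other value alone. */
static unsigned demo_filter_remap(unsigned filtermode)
{
	return filtermode ^ (filtermode <= 1);
}

/* C2: loop-filter deltas that must survive from frame to frame. */
struct demo_vp9_lf {
	int8_t ref_deltas[4];
	int8_t mode_deltas[2];
};

/* Initialise the persistent state to the defaults quoted in the review. */
static void demo_vp9_lf_init(struct demo_vp9_lf *lf)
{
	static const int8_t ref_defaults[4] = { 1, 0, -1, -1 };

	memcpy(lf->ref_deltas, ref_defaults, sizeof(ref_defaults));
	memset(lf->mode_deltas, 0, sizeof(lf->mode_deltas));
}

/*
 * Per frame: update the persistent state only when the header carried
 * new deltas, then always copy the persistent state to the outgoing
 * control. parsed may be NULL when delta_update is 0.
 */
static void demo_vp9_lf_apply(struct demo_vp9_lf *state, int delta_update,
			      const struct demo_vp9_lf *parsed,
			      struct demo_vp9_lf *out)
{
	if (delta_update && parsed)
		*state = *parsed;
	*out = *state;
}
```

Zeroing the outgoing struct each frame, as the original plan did, would correspond to skipping `demo_vp9_lf_init` and the state copy, which is how {0,0,0,0,0,0} would have been sent on inter frames.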
6 Suggested (S1-S6): all confirm plan correctness OR are scope-
aligned non-issues. S4 (uv_mode memcpy omission safe for rkvdec)
baked into Clause 9 amended text.
Without this review, iter4 Phase 6 would have failed first compile
(C3) + produced wrong inter-frame output (C1+C2) + caused user
confusion (C4). Estimated saving: 1 compile failure + 1 Phase 7 ->
Phase 4 loopback + 1 doc correction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
4b36077b17 |
iter4 Phase 4: plan locks 12 contract clauses + Mitigation B
5-commit plan (Z, A, B, C, optional D):
- Commit Z: src/request.c — walk /dev/video* + /dev/media*, match by
driver name in {rkvdec, hantro-vpu, cedrus, sun4i_csi}; restores
baseline functionality on 7.0 (where /dev/video0 is rockchip-rga).
- Commit A: src/config.c — VAProfileVP9Profile0 enumeration + dispatch
+ entrypoints (~16 LOC, 1 file).
- Commit B: NEW src/vp9.c + .h + meson — 12 contract clauses; ~580 LOC
vp9.c (50 infra + 80 VPX rac + 50 uncompressed-header partial parse +
180 compressed-header parser + ~200 frame-fill).
- Commit C: src/picture.c + surface.h — VP9 dispatch + 2 buffer-type
cases + union extension; NO BeginPicture reset (VP9 has no
iqmatrix_set-style flags).
- Commit D: optional fix-forward placeholder (predicted no-op per
feedback_runtime_enumerates_allowlists.md).
Total ~699 LOC, 7 files.
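The driver-name matching planned for Commit Z can be sketched as a pure function, leaving out the actual /dev walk and ioctls. The allow-list below is copied from the plan text; the `demo_` function names and the idea of passing driver names in as an array are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Allow-list as given in the plan text. */
static const char *const demo_decoder_drivers[] = {
	"rkvdec", "hantro-vpu", "cedrus", "sun4i_csi",
};

/* Returns 1 if the driver name is on the allow-list, else 0. */
static int demo_is_decoder_driver(const char *driver)
{
	size_t i;
	size_t n = sizeof(demo_decoder_drivers) / sizeof(demo_decoder_drivers[0]);

	if (!driver)
		return 0;
	for (i = 0; i < n; i++) {
		if (strcmp(driver, demo_decoder_drivers[i]) == 0)
			return 1;
	}
	return 0;
}

/*
 * Walk-and-pick-first: return the index of the first enumerated device
 * whose driver is on the allow-list, or -1 if none matches.
 */
static int demo_pick_first_decoder(const char *const *drivers, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (demo_is_decoder_driver(drivers[i]))
			return (int)i;
	}
	return -1;
}
```

Given an enumeration such as rockchip-rga, rkvdec, hantro-vpu, the sketch skips the non-codec device and stops at rkvdec, so a later hantro node is never reached without an explicit override.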
12 contract clauses include 2 NEW vs iter3:
- Clause 3: compile-time _Static_assert sizeof v4l2_ctrl_vp9_frame ==
168 && ..._compressed_hdr == 2040 (any UAPI shift fails loudly).
- Clause 6: uncompressed-header partial parse for lf_delta_* +
base_q_idx (VAAPI doesn't expose; BBB keyframe needs non-zero
ref_deltas={1,0,-1,-1} per Phase 3 anchor).
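The compile-time size check in Clause 3 can be sketched with stand-in types. The struct below is hypothetical and only borrows two array shapes quoted elsewhere in this log (the coef table and the 90-byte uv_mode table); the real assertions would target the kernel's `v4l2_ctrl_vp9_frame` and `v4l2_ctrl_vp9_compressed_hdr` from `<linux/v4l2-controls.h>`.

```c
#include <stdint.h>

/* Hypothetical stand-in for part of a probability-table control. */
struct demo_probs {
	uint8_t coef[4][2][2][6][6][3];
	uint8_t uv_mode[10][9];
};

/* 4*2*2*6*6*3 = 1728 bytes; the build fails if the shape ever changes. */
_Static_assert(sizeof(((struct demo_probs *)0)->coef) == 1728,
	       "coef table size changed");

/* 10*9 = 90 bytes, the size of the uv_mode carve-out. */
_Static_assert(sizeof(((struct demo_probs *)0)->uv_mode) == 90,
	       "uv_mode table size changed");

/* Byte-only members, so the struct has no padding in this sketch. */
_Static_assert(sizeof(struct demo_probs) == 1728 + 90,
	       "unexpected padding in demo_probs");

/* Runtime accessors, handy for logging the sizes next to a strace. */
static unsigned demo_coef_bytes(void)
{
	return (unsigned)sizeof(((struct demo_probs *)0)->coef);
}

static unsigned demo_uv_mode_bytes(void)
{
	return (unsigned)sizeof(((struct demo_probs *)0)->uv_mode);
}
```

Because the checks run at compile time, a header update that shifts a size stops the build instead of producing a control payload the kernel rejects at runtime.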
7 Phase 5 review questions queued, all empirical-leaning per
feedback_review_empirical_over_theoretical.md Direction 2:
parser-vs-bitstream cross-check, FFmpeg-XOR-remap validation,
struct-size stability, mitigation B regression risk.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
56abe3d6a2 |
iter4 Phase 3: VP9 baseline + 4-codec regression on 7.0 substrate
Captured on linux-fresnel-fourier 7.0-1 (post 6.19 decommission).
VP9 baseline (kernel-direct via ffmpeg-v4l2request on rkvdec):
- 5-frame SW reference PNG SHA256 anchors (criterion-4)
- VIDIOC_S_EXT_CTRLS strace with full payload at -s 16384
- Empirical struct sizes 168 B (FRAME) / 2040 B (COMPRESSED_HDR) supersede Phase 2 estimates of 144 / 1947
- Probe pattern: count=1 (FRAME-only) then count=2 (FRAME + COMPRESSED_HDR)
Phase 2 doc fix: control IDs corrected 0xa40b2c/d -> 0xa40a2c/d.
4-codec regression (H.264, MPEG-2, HEVC, VP8): all fall back to SW on default config because /dev/video0 is now rockchip-rga (RGB color converter), not a codec device. Fork hardcodes /dev/video0 in request.c:149. Env override LIBVA_V4L2_REQUEST_VIDEO_PATH / _MEDIA_PATH restores per-driver profile enumeration; mitigation A/B/C queued for user decision.
New contract clauses surfaced:
- Clause 11: uncompressed-header partial parse for lf_delta / base_q_idx (VAAPI doesn't expose these; keyframe ref_deltas non-zero for BBB so leave-at-zero is wrong)
- Clause 12: compile-time sizeof asserts on the two control structs so future UAPI shifts fail loudly
iter4_phase3.tgz: full Phase 3 artifact bundle (strace + PNG refs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2651e4cfdf |
iter4 Phase 2: situation analysis — VP9 backend gaps + compressed-header parser requirement
Source-read of every file the iter4 patch series will touch, plus
kernel UAPI + VAAPI + downstream FFmpeg + kernel rkvdec reference
sources. Conducted on noether against fork tip e1aca9c (iter3 close).
Critical scope-shaping finding: rkvdec on RK3399 REQUIRES
V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (not optional). Per
drivers/staging/media/rkvdec/rkvdec-vp9.c::rkvdec_vp9_run_preamble
lines 752-754:
ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl,
V4L2_CID_STATELESS_VP9_COMPRESSED_HDR);
if (WARN_ON(!ctrl))
return -EINVAL;
VAAPI does NOT expose compressed-header probability updates
(va_dec_vp9.h:50-192 — only frame parameters + segmentation;
vendor VAAPI drivers parse compressed header in firmware/GPU).
Therefore the libva backend MUST parse the compressed header
itself via a VPX boolean decoder + inv_map_table[]. ~150-200 LOC
of bitstream parsing logic (port from FFmpeg
v4l2_request_vp9.c::fill_compressed_hdr).
Bug enumeration (12 sites):
B1 config.c::RequestQueryConfigProfiles enum block missing
B2 config.c::RequestCreateConfig VP9 case missing
B3 config.c::RequestQueryConfigEntrypoints VP9 case missing
B4 src/vp9.c new file ~500-600 LOC
B5 src/vp9.h new file ~35-45 LOC
B6 src/vp9_rac.h NEW or inline (Phase 4 plan locks Option A: inline in vp9.c)
B7 picture.c::codec_set_controls VP9 dispatch missing
B8 picture.c::codec_store_buffer 2 buffer-type cases (Picture + Slice; NOT 4 like VP8)
B9 picture.c::RequestBeginPicture predicted no reset needed (no flag-state like VP8 iqmatrix_set)
B10 surface.h::object_surface::params union vp9 member missing
B11 meson.build vp9.c/vp9.h not in lists
B12 buffer.c predicted no change needed (VP9 uses Picture/Slice/SliceData — all whitelisted)
Non-bugs (intentionally untouched): context.c (no DECODE_MODE/
START_CODE menus per FFmpeg ref), video.c (CAPTURE-side format
list), v4l2.c (fourcc-agnostic), include/hevc-ctrls.h (already
includes <linux/v4l2-controls.h>).
Contract surface cited verbatim:
V4L2_CID_STATELESS_VP9_FRAME = 0xa40b2c (~144 bytes — much
smaller than VP8's 1232 bytes because VP9_FRAME carries no
entropy table; that's in COMPRESSED_HDR)
V4L2_CID_STATELESS_VP9_COMPRESSED_HDR = 0xa40b2d (~1947 bytes
— coef[4][2][2][6][6][3] alone is 1728 bytes)
Per-frame submission: 2 controls batched in single S_EXT_CTRLS
v4l2_request_vp9.c references confirmed: 2-control shape,
runtime-probed COMPRESSED_HDR availability (rkvdec advertises
it; we MUST provide)
VAAPI buffer types: 2 per frame (Picture + Slice) vs iter3 VP8's
4. NO Probability buffer (VP9 keeps probs in compressed header).
NO IQMatrix (VP9 keeps quant in slice's per-segment seg_param[8]).
VAAPI → V4L2 mapping table: 30+ fields enumerated. Several gap
candidates identified for Phase 3 empirical resolution:
Q1 lf.ref_deltas/mode_deltas/flags — not in VAAPI; FFmpeg reads
from VP9Context internal. BBB likely zero.
Q2 quant.base_q_idx + deltas — VAAPI exposes only effective
per-segment scales. Inverse-derive needed.
Q3 reference_mode — not in VAAPI. Default to SELECT?
Q4 interpolation_filter mapping (FFmpeg ^ remap)
Q5 reset_frame_context off-by-one (FFmpeg > 0 ? - 1 : 0)
Q6 Per-segment feature_data[8][4] derivation from VAAPI's
effective scales is non-trivial
Q7 mpv 0.41.0 VP9 hwdec engagement (per memory feedback_hw_
decode_engagement_check.md — known gap from iter3 VP8)
Q8 rkvdec dma_resv issue? (predicted NO based on iter1+iter2
successful mpv-DMA-BUF-GL on rkvdec)
Patch-shape prediction: ~580-690 LOC across 5 modified + 2 new
files (closer to iter2 HEVC's 470 than iter3 VP8's 370). Compressed-
header parser is the dominant cost.
Phase 3 baseline targets queued: cross-validator strace verbatim
S_EXT_CTRLS payloads (both controls), VAAPI consumer trace, mpv-
VP9-vaapi engagement check, rkvdec readback non-zero check.
Phase 4 plan structure anticipated: 10-clause template per
iter2/iter3, with new Clause 8 dedicated to compressed-header
parser.
Refs:
phase0_findings_iter4.md (Phase 1 lock)
phase8_iteration3_close.md (predecessor)
references/ffmpeg-kwiboo/libavcodec/v4l2_request_vp9.c (V4L2 ref)
references/ffmpeg-kwiboo/libavcodec/vaapi_vp9.c (VAAPI ref)
/home/mfritsche/src/linux-rfc/drivers/staging/media/rkvdec/
rkvdec-vp9.c (kernel driver — confirms COMPRESSED_HDR
requirement at lines 752-754)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9a71dbf4c3 |
iter4 Phase 0 + Phase 1 lock: VP9 on rkvdec
Opens iter4 immediately after iter3 close (
|
||
|
|
d5d4beb64d |
iter3 Phase 8 close: 4/5 codecs passing, 3 new memory entries
distilled, 0 Phase 7 → Phase 4 loopbacks
iter3 = VP8 on hantro-vpu-dec via libva-v4l2-request-fourier on
RK3399 (fresnel / Pinebook Pro). Fourth codec to ship.
Final state:
Fork tip: e1aca9c (post iter2 close 8d71e20 + 4 commits)
Phase 1 criteria: 5/5 GREEN (4 direct + 1 transitive)
LOC delta: +373 across 7 files (2 new + 5 modified)
Phase 7 → Phase 4 loopbacks: 0
Phase 6 fix-forwards: 1 (Commit D buffer.c allow-list)
Phase 5 review findings: 4 Critical, all empirically validated
Lessons distilled to memory (3 NEW entries):
feedback_hw_decode_engagement_check.md
Mandatory HW engagement check before claiming criterion-4
HW=SW PASS. mpv silently falls back to SW for some codec/
backend combos. Use lsof/strace/mpv -v/ffmpeg log to verify
HW path actually engaged. Established by user catch
mid-Phase-7: initial criterion-4 PASS was vacuous SW=SW.
reference_dmabuf_resv_blocker.md
Cross-campaign blocker. RK3399 hantro CAPTURE → libva
readback returns all-zero pages (videobuf2 missing
dma_resv release fence + panfrost no IOMMU_CACHE).
Tracked at git.reauktion.de/marfrit/dmabuf-modifier-triage/
issues/2. vb2_dma_resv kernel patches in flight (RFC v2,
2026-04 linux-media). Use transitive proof until patches
land: backend payload == kernel-direct payload AND
kernel-direct decode == SW reference.
feedback_runtime_enumerates_allowlists.md
Sibling to feedback_header_deletion_check.md. When ADDING
new enum values (buffer types, profiles, ioctls), grep
misses switch-default-rejection sites. Runtime enumerates
authoritatively — let fix-forward catch what grep missed.
Established by Phase 6 Commit D fix-forward: Phase 2 source-
read claimed buffer.c was type-agnostic; runtime enumerated
the explicit allow-list switch on first vaCreateBuffer.
Phase 5 amendments empirically validated (all 4 Critical correct):
C1 first_part_header_bits = slice->macroblock_offset → 6550 ✓
C2 first_part_size = partition_size[0]+ceil(macroblock_offset/8)
→ 22742 ✓ (= 21923 + 819, exact match for Phase 3 anchor)
C3 VAProbabilityBufferType (not VAProbabilityDataBufferType)
→ compiled clean post-Commit-D
C4 (int8_t) cast (not (s8)) → compiled clean Commit B first try
Estimated savings without Phase 5 review: 2 Phase 6 compile-fail/
fix-forward cycles (C3 + C4) + 1 Phase 7 → Phase 4 loopback (C1
+ C2 hardware-DMA-offset bug, would have produced visible-but-
corrupt output). Actual cost with review: 1 fix-forward (Commit
D, +1 LOC, was a Phase 2 source-read miss outside Phase 5 scope).
Cross-cutting backlog updates:
iter3-Q1 first_part_header_bits → CLOSED by Phase 5 C1
iter3-flags-anomaly bit 0x40 → not iter3 scope; kernel ignores
iter3-criterion-4-readback → blocked on dmabuf-modifier-triage
iter1; transitive proof used
iter3-mpv-vp8-fallback → mpv 0.41.0 falls back to SW for VP8;
consumer-side, not backend; verify
via chrome-fourier when convenient
Inherited backlog (B1, B3, B4, B5, B6, L3) — no closures from
iter3.
Campaign scoreboard: 3/5 → 4/5 codecs passing.
H.264 | rkvdec | T4 | PASS direct
MPEG-2 | hantro | iter1 | PASS direct
HEVC | rkvdec | iter2 | PASS direct
VP8 | hantro | iter3 | PASS transitive (readback blocked)
VP9 | rkvdec | iter4 | PENDING
iter4 (VP9 on rkvdec) prediction: comparable scope to iter2 HEVC
(VP9 has compressed-header control + probability state).
~400-500 LOC, 3-4 commits + 1 fix-forward. mpv may engage HW for
VP9 (different from VP8 fallback) — verify at iter4 Phase 0.
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase2_iter3_situation.md (situation analysis)
phase3_iter3_baseline.md (verbatim payload anchors)
phase4_iter3_plan.md (10 contract clauses + Phase 5 amendments)
phase5_iter3_review.md (4 Critical, all validated correct)
phase7_iter3_verification.md (4 direct + 1 transitive PASS)
Fork commits 27d82e3 + 017e27f + 7f84bbb + e1aca9c
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
afb9b1450f |
iter3 Phase 7: verification — 4 direct PASS, 1 transitive PASS
Phase 1 5-criterion verification on iter3 backend (fork tip e1aca9c).
4 direct PASS + 1 transitive PASS. Vacuous-pass mode caught + corrected
mid-Phase-7 (initial mpv --hwdec=vaapi --vo=image HW=SW match was
SW=SW; mpv silently fell back to SW for VP8).
Criterion results:
1. vainfo enumerates VAProfileVP8Version0_3 PASS (direct)
2. vaCreateConfig SUCCESS PASS (direct, implied)
3. ffmpeg-vaapi VP8 5-frame decode exit 0 PASS (direct)
4. HW=SW byte-identical via DMA-BUF GL PASS (transitive)
5. 3-codec regression (H.264 + MPEG-2 + HEVC) PASS (direct)
Criterion 4 transitive proof:
Step A: Strace of ffmpeg-vaapi via libva backend captures the
V4L2_CID_STATELESS_VP8_FRAME control payload — keyframe
y_ac_qi=8, first_part_size=22742, first_part_header_bits=
6550, all 30 fields enumerated.
Step B: Phase 3 baseline already captured the kernel-direct
(ffmpeg-v4l2request) keyframe payload — IDENTICAL to A
field-for-field.
Step C: ffmpeg-v4l2request kernel-direct VP8 decode produces
5 raw frames byte-identical to SW reference (cmp on
full 6.7 MB vp8_kerneldirect.yuv vs vp8_sw5.yuv = silent
BYTE-IDENTICAL).
Conclusion: A == B (libva backend produces correct kernel input)
AND C (kernel-direct decode is correct), therefore
libva backend's HW decode IS correct by transitivity.
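
The shape of the transitive argument can be written down as a small
check. This is an illustrative sketch, not the harness used in the
campaign: the function names are hypothetical, the payloads are
assumed to be already decoded into field dictionaries, and the frame
data is assumed to be available as raw bytes.

```python
def payloads_match(libva_payload: dict, kdirect_payload: dict) -> bool:
    """Steps A/B: the two control payloads agree field for field."""
    return libva_payload == kdirect_payload


def decode_matches_reference(kdirect_yuv: bytes, sw_yuv: bytes) -> bool:
    """Step C: kernel-direct output is byte-identical to the SW reference."""
    return kdirect_yuv == sw_yuv


def transitive_pass(libva_payload: dict, kdirect_payload: dict,
                    kdirect_yuv: bytes, sw_yuv: bytes) -> bool:
    """PASS only if A == B and C both hold; either failure breaks the chain."""
    return (payloads_match(libva_payload, kdirect_payload)
            and decode_matches_reference(kdirect_yuv, sw_yuv))


# Keyframe fields quoted in Step A (three of the 30, for illustration).
keyframe = {"y_ac_qi": 8, "first_part_size": 22742,
            "first_part_header_bits": 6550}
print(transitive_pass(keyframe, dict(keyframe), b"\x10\x20", b"\x10\x20"))
```

The argument only covers what the kernel is fed. It says nothing
about readback, which is why the direct check stays on the backlog.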
Direct readback BLOCKED by kernel-layer dma_resv issue (sibling
campaign git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/2):
- ffmpeg-vaapi -hwaccel_output_format vaapi -vf hwdownload
returns all-zero pages (SHA b34860e0... = SHA of all-zero
1382400-byte block) for ALL 5 frames.
- Same all-zero from -hwaccel_output_format nv12 + auto-DL.
- mpv --hwdec=vaapi-copy returns Y=128 gray (uninitialized).
- Root cause: videobuf2 missing dma_resv release fence + panfrost
IOMMU_CACHE absence on RK3399 (per dmabuf-modifier-triage iter1
RFC). vb2_dma_resv kernel patches in flight (linux-media RFC v2,
2026-04). When patches land, direct verification re-runnable.
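
A readback that returns all-zero pages is easy to detect before any
hash comparison. The sketch below assumes 8-bit 4:2:0 frames at
1280x720, which matches the 1382400-byte block size quoted above.
The helper names are hypothetical.

```python
import hashlib


def yuv420_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit 4:2:0 frame (full-size luma plus quarter-size Cb and Cr)."""
    return width * height * 3 // 2


def is_all_zero(frame: bytes) -> bool:
    """True when the frame is non-empty and every byte is 0x00."""
    return len(frame) > 0 and frame.count(0) == len(frame)


def zero_frame_sha256(width: int, height: int) -> str:
    """SHA-256 of an all-zero frame, for comparison against captured hashes."""
    return hashlib.sha256(bytes(yuv420_frame_bytes(width, height))).hexdigest()


print(yuv420_frame_bytes(1280, 720))
print(is_all_zero(bytes(yuv420_frame_bytes(1280, 720))))
```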
Phase 5 amendments empirically validated:
C1 first_part_header_bits = slice->macroblock_offset → 6550 ✓
C2 first_part_size = partition_size[0] + ceil(macroblock_offset/8)
→ 22742 ✓ (= 21923 + 819, exact match for Phase 3 anchor)
C3 VAProbabilityBufferType (not VAProbabilityDataBufferType) →
compiled clean post-Commit-D fix-forward
C4 (int8_t) cast → compiled clean Commit B first try
S3 assert(probability_set) → has not fired (FFmpeg vaapi_vp8.c
always sends VAProbabilityBufferType per frame)
Phase 6 fix-forward Commit D documented: buffer.c had an explicit
allow-list switch (Phase 2 source-read missed it). Same iter1 Commit
D pattern — runtime enumerates authoritatively what grep missed.
HW-engagement check applied per new memory rule
feedback_hw_decode_engagement_check.md (established this session):
- mpv-vaapi VP8: SILENT FALLBACK to SW. mpv-side, not backend
issue. ffmpeg-vaapi VP8: HW engaged (Format vaapi chosen by
get_format(); cap_pool_init: 24 slots ready).
- V4L2 strace: VIDIOC_S_EXT_CTRLS for VP8_FRAME (0xa409c8)
returns 0 (kernel accepts payload). CAPTURE buffer indexes
advance through distinct slots per decode.
Cross-cutting backlog updates:
iter3-Q1 first_part_header_bits → closed by Phase 5 C1
iter3-flags 0x40 → not iter3 scope; kernel ignores
iter3-criterion-4 readback → blocked on dmabuf-modifier-triage
iter1 (vb2_dma_resv kernel patches)
Campaign scoreboard: 3/5 → 4/5 codecs passing.
Memory entries added:
feedback_hw_decode_engagement_check.md (mandatory HW engagement
verification before claiming criterion-4 PASS)
reference_dmabuf_resv_blocker.md (cross-campaign blocker tracking
+ transitive proof pattern)
Refs:
phase4_iter3_plan.md (10 contract clauses + Phase 5 amendments)
phase5_iter3_review.md (4 Critical findings, all empirically
validated in Phase 7)
phase3_iter3_baseline.md (verbatim payload anchors used in
transitive proof Step B)
git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/2
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
656596aa6b |
iter3 Phase 5: sonnet review — 4 Critical findings, 4 amendments
Second-model review by sonnet-architect found 4 Critical bugs in
Phase 4 plan, all verified empirically by author before incorporation
per memory feedback_review_empirical_over_theoretical Direction 2.
Amendments applied in-place to phase4_iter3_plan.md +
phase2_iter3_situation.md.
Critical findings:
C1 first_part_header_bits = 0 was claimed cosmetic; actually
UNSAFE. hantro_g1_vp8_dec.c:260 + rockchip_vpu2_hw_vp8_dec.c:372
both read this field unconditionally to compute the macroblock
DMA offset. Setting 0 would place hardware at wrong DMA offset
for ALL macroblock data → garbage decode.
Fix: frame.first_part_header_bits = slice->macroblock_offset
(verified by source identity — vaapi_vp8.c:204 and
v4l2_request_vp8.c:83 use byte-identical formulas).
C2 first_part_size = slice->partition_size[0] was wrong; VAAPI's
partition_size[0] is the REMAINING bytes after parsing
(vaapi_vp8.c:209 confirms; va_dec_vp8.h:193-196 spec confirms).
Kernel needs the TOTAL control partition size.
Fix: frame.first_part_size = slice->partition_size[0] +
((macroblock_offset + 7) / 8)
Phase 3 keyframe numerics confirm: 21923 + 819 = 22742 ✓.
C3 VAProbabilityDataBufferType does not exist as a buffer-type
enum; it's the struct name. The actual enum constant is
VAProbabilityBufferType (= 13 per va.h:2058). Switch case
using the wrong identifier would have failed Phase 6 compile.
Fix: replace globally in phase2 + phase4 docs.
C4 (s8) cast undefined in userspace. Kernel has 's8' typedef in
linux/types.h (kernel-internal). UAPI exposes '__s8' (double-
underscore). Userspace portable cast is int8_t from <stdint.h>.
Fix: replace (s8) with (int8_t) in Clauses 6+7.
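
The C1 and C2 fixes reduce to two lines of arithmetic, sketched here
in Python with the Phase 3 keyframe numbers as a worked check. The
function name is hypothetical; the formulas are the ones stated in
C1 and C2 above.

```python
def first_partition_fields(partition_size0: int, macroblock_offset: int):
    """Return (first_part_header_bits, first_part_size) for the V4L2 frame.

    partition_size0   -- VAAPI slice->partition_size[0] (remaining bytes)
    macroblock_offset -- VAAPI slice->macroblock_offset (in bits)
    """
    first_part_header_bits = macroblock_offset
    # Add back the already-parsed header, rounded up to whole bytes.
    first_part_size = partition_size0 + (macroblock_offset + 7) // 8
    return first_part_header_bits, first_part_size


# Phase 3 keyframe: 21923 + ceil(6550 / 8) = 21923 + 819 = 22742
print(first_partition_fields(21923, 6550))
```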
Suggested:
S3 Clause 8 comment was factually wrong: hantro_vp8.c::
hantro_vp8_prob_update reads coeff_probs unconditionally;
there is NO default-table fallback. If probability_set==false,
decode produces garbage. Practical risk low (FFmpeg vaapi_vp8.c
always sends VAProbabilityBufferType per frame), but corrected
comment + added assert(probability_set) runtime guard for
immediate Phase 6 surfacing.
Plus 5 minor S/Q items documented; non-blocking for iter3.
Author's 7 review questions all answered directly in the review:
Q1 quantization derivation: correct for typical content
Q2 first_part_header_bits=0 safety: UNSAFE → C1
Q3 num_dct_parts off-by-one: confirmed correct
Q4 field availability: 2 compile failures found (C3 + C4)
Q5 quant_update[s] semantics: signed delta confirmed
Q6 SHOW_FRAME unconditional: safe for BBB scope
Q7 buffer order independence: confirmed
Estimated saving: 1 Phase 6 → Phase 4 loopback + 2 Phase 6 fix-
forward commits. Review pass is the right path forward per memory
rule "Reviews are never skippable" — empty-review value =
empirical-verification value, regardless of finding count.
Refs:
phase4_iter3_plan.md (amended in-place; Phase 5 amendments
section appended)
phase2_iter3_situation.md (amended C3 globally)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2918dda2e0 |
iter3 Phase 4: plan — 10 contract clauses, ~308-LOC patch, 3 commits
Locks the iter3 patch shape against Phase 3 verbatim cross-validator
payload + Phase 2 contract surface. 10 contract clauses cite kernel
UAPI + VAAPI + FFmpeg ref + Phase 3 byte anchors throughout.
Patch shape (mirrors iter1 ABCD pattern):
Commit A: src/config.c — enumeration block + CreateConfig case +
QueryConfigEntrypoints case (3 sites, +16 LOC, 1 file).
After: vainfo lists VP8Version0_3.
Commit B: NEW src/vp8.c (~200 LOC) + NEW src/vp8.h (~40 LOC) +
meson.build sources/headers entries (+2). 3 files
(2 new + 1 modified).
After: vp8.o compiles standalone.
Commit C: src/picture.c — codec_set_controls dispatch +
codec_store_buffer 4 buffer-type cases + outer
VAProbabilityDataBufferType case + BeginPicture
per-frame reset (4 sites, +40 LOC) + src/surface.h
params.vp8 union member (+10 LOC). 2 files modified.
After: end-to-end VP8 decode through libva backend.
Total: ~308 LOC, 6 files (2 new + 4 modified), 3 commits.
Contract clauses summary:
1. Submission shape — single VIDIOC_S_EXT_CTRLS, count=1, which=
V4L2_CTRL_WHICH_REQUEST_VAL (0x0f010000), id=0xa409c8 (class
V4L2_CTRL_CLASS_CODEC_STATELESS = 0x00a40000), size=1232 bytes
2. Local struct alloc + zero-init (memset clears all padding)
3. Frame geometry + version + per-frame scalars (off-by-one
num_dct_parts = num_of_partitions - 1)
4. DPB timestamp resolution (3 refs: last/golden/alt; 0-sentinel
when SURFACE() returns NULL — mirrors iter1 mpeg2.c pattern)
5. Loop filter mapping (6 fields + 3 flag bits)
6. Quantization base + delta derivation (segment 0 = base via
iqmatrix[0][0]; deltas = iqmatrix[0][N+1] - iqmatrix[0][0]
signed; per-segment quant_update[1..3] only when segmentation
enabled)
7. Segment fields (segment_probs direct copy; flags assembled +
DELTA_VALUE_MODE set unconditionally per FFmpeg pattern)
8. Entropy table mapping — 3 VAAPI sources (Picture: y_mode +
uv_mode + mv_probs; ProbabilityData: coeff_probs[4][8][3][11]
direct memcpy; IQMatrix: quant)
9. Coder state + first-partition fields + flags (6 mainline-
documented bits only; bit 0x40 + EXPERIMENTAL NOT replicated
vs ffmpeg-v4l2-request-git anomaly; first_part_header_bits=0
fallback documented as known fidelity gap)
10. Final batched submission via v4l2_set_controls
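
Clauses 3 and 6 are the two places where the mapping is arithmetic
rather than a direct copy. A minimal sketch, assuming the VAAPI
quantization index rows are laid out as [base, y_dc, y2_dc, y2_ac,
uv_dc, uv_ac]; that column order is an assumption to verify against
vaapi_vp8.c. The function names and the dictionary keys are
hypothetical stand-ins for the struct fields.

```python
DELTA_NAMES = ("y_dc_delta", "y2_dc_delta", "y2_ac_delta",
               "uv_dc_delta", "uv_ac_delta")  # assumed column order


def num_dct_parts(num_of_partitions: int) -> int:
    """Clause 3: the kernel counts DCT partitions only (VAAPI count minus 1)."""
    return num_of_partitions - 1


def quant_from_iqmatrix(quantization_index):
    """Clause 6: recover base index and signed deltas from segment 0.

    quantization_index -- 4 rows (segments) x 6 columns of effective indices.
    """
    base = quantization_index[0][0]
    quant = {"y_ac_qi": base}
    for n, name in enumerate(DELTA_NAMES):
        quant[name] = quantization_index[0][n + 1] - base
    return quant


print(num_dct_parts(2))
print(quant_from_iqmatrix([[8, 8, 8, 8, 8, 8]] * 4))
```

If the VAAPI producer clamps the effective indices, the subtraction
cannot recover the original delta at the extremes, so the derivation
should be treated as exact only for unclamped content.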
Phase 5 review questions queued (7 items): quantization derivation
correctness, per-segment quant_update semantics, first_part_header_
bits=0 safety, probability buffer ordering, endianness, struct size
sizeof correctness, field-availability test-compile per memory
feedback_review_empirical_over_theoretical Direction 2.
Cross-cutting backlog deferred (B1, B3, B4, B5, B6, L3 inherited;
iter3-Q1 first_part_header_bits + iter3-flags 0x40 anomaly NEW).
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase2_iter3_situation.md (Phase 2 contract surface)
phase3_iter3_baseline.md (Phase 3 verbatim payload anchors)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
fd3fce86a6 |
iter3 Phase 3: baselines — VP8 cross-validator + 3-codec regression
+ SW reference
Captured on fresnel 2026-05-08 across two suspend cycles (laptop
dropped twice mid-run, captures preserved on /tmp/iter3_phase3).
All Phase 3 deliverables green.
Substrate verification:
backend SHA256: 9e27...6258 (matches iter2 close)
3-codec regression block: ALL 6 reference hashes match byte-for-
byte vs iter1+iter2 (H.264 +30s, MPEG-2 +02s, HEVC +02s on rkvdec/
hantro). Substrate has not regressed; criterion-5 anchor solid.
Cross-validator anchor (ffmpeg-v4l2request VP8 strace):
- VIDIOC_S_EXT_CTRLS, count=1, ctrl_class=V4L2_CTRL_CLASS_CODEC_
STATELESS, id=0xa409c8, size=1232 bytes
- struct size CORRECTED: v4l2_ctrl_vp8_frame = 1232 bytes (NOT
400 as one might assume; entropy.coeff_probs[4][8][3][11] alone
is 1056 bytes)
- keyframe (frame 1) verbatim payload captured: y_ac_qi=8,
last/golden/alt ts all 0, flags=0x0d (KEY|SHOW|NOSKIP),
y_mode_probs=[145,156,163,128] (matches FFmpeg keyframe const)
- inter frame verbatim payload captured: y_ac_qi=122, all DPB
timestamps non-zero, flags=0x66 (anomaly: bit 0x40 not in
mainline UAPI; vendor-patched ffmpeg-v4l2-request-git;
kernel hantro_vp8.c only inspects KEY_FRAME bit, ignores
bit 0x40)
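
The two flag words and the struct-size remark can be checked
mechanically. This sketch is in the spirit of a payload decoder but
is not decode_vp8.py itself. The flag bit values are taken from the
mainline V4L2_VP8_FRAME_FLAG_* definitions as an assumption, and the
function name is hypothetical.

```python
# Assumed bit assignments from mainline <linux/v4l2-controls.h>.
VP8_FRAME_FLAGS = {
    0x01: "KEY_FRAME",
    0x02: "EXPERIMENTAL",
    0x04: "SHOW_FRAME",
    0x08: "MB_NO_SKIP_COEFF",
    0x10: "SIGN_BIAS_GOLDEN",
    0x20: "SIGN_BIAS_ALT",
}


def decode_flags(flags: int):
    """Split a flags word into known flag names and leftover unknown bits."""
    known = [name for bit, name in sorted(VP8_FRAME_FLAGS.items())
             if flags & bit]
    unknown = flags & ~sum(VP8_FRAME_FLAGS)
    return known, unknown


COEFF_PROBS_BYTES = 4 * 8 * 3 * 11  # entropy.coeff_probs[4][8][3][11], u8

print(decode_flags(0x0D))
print(decode_flags(0x66))
print(COEFF_PROBS_BYTES)
```

Reporting the leftover bits separately is what makes an anomaly such
as 0x40 visible instead of silently dropped.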
VP8 SW pixel-verify reference (criterion-4 anchor):
vp8_sw_001.jpg: e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad
584d789db2c984
vp8_sw_002.jpg: a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3
e6ea8c0c78e97a
Frame 1 != Frame 2 (real motion). These are the Phase 7 byte-
compare HW-vs-SW targets.
Open-question resolution (5 of 6 answered empirically):
Q1 first_part_header_bits — varies per frame (key=6550, inter
ranges 86..254); VAAPI doesn't expose. Phase 4 fallback:
leave 0 and check kernel behavior at Phase 7 byte-compare.
Phase 5 review will flag as known fidelity gap.
Q2 num_dct_parts vs VAAPI num_of_partitions — confirmed off-by-
one: kernel = VAAPI - 1 (BBB has VAAPI=2, kernel=1).
Q3 DPB timestamp 0-sentinel — confirmed: keyframe writes all
three timestamps as 0; iter3 mirrors iter1 mpeg2.c pattern.
Q4 SHOW_FRAME default — set on every captured frame (BBB has no
alt-ref invisible). Force unconditional in libva backend.
Q5 lf.flags FILTER_TYPE_SIMPLE — not set; BBB normal loop filter.
Direct mapping from VAAPI filter_type=0.
Q6 First-frame DPB sentinel — confirmed Q3; no self-reference
fallback needed (different from iter1 mpeg2.c).
V4L2 binding cells this boot:
rkvdec : /dev/video3 + /dev/media1
hantro-vpu-dec: /dev/video5 + /dev/media2
Capture artefacts on fresnel /tmp/iter3_phase3/ preserved for
Phase 7 re-run:
vp8_strace.* (19 files, multi-thread)
decode_vp8.py (payload decoder)
vp8_sw_00{1,2}.jpg (criterion-4)
{h264,mpeg2,hevc}_hw_00{1,2}.jpg (criterion-5)
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase2_iter3_situation.md (Phase 2 contract surface)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
898544a29c |
iter3 Phase 2: situation analysis — VP8 backend gaps + contract surface
Source-read of every file the iter3 patch series will touch, plus the
kernel UAPI + VAAPI + downstream FFmpeg + kernel hantro reference
sources. Conducted on noether against fork tip 8d71e20 (iter2 Phase 6
commit B); fresnel.vpn was unreachable so Phase 3 baseline empirical
capture defers until laptop reachable.
Bug enumeration (10 sites the patch series must touch):
B1 config.c::RequestQueryConfigProfiles enumeration block missing
B2 config.c::RequestCreateConfig VP8 case label missing
B3 config.c::RequestQueryConfigEntrypoints VP8 case missing
B4 src/vp8.c new file ~160-220 LOC
B5 src/vp8.h new file ~35-45 LOC
B6 picture.c::codec_set_controls VP8 dispatch missing
B7 picture.c::codec_store_buffer 4 buffer-type cases +
VAProbabilityDataBufferType
outer case missing
B8 picture.c::RequestBeginPicture per-frame reset additions
B9 surface.h::object_surface::params union vp8 member missing
B10 meson.build vp8.c/vp8.h not in lists
Non-bugs (intentionally untouched):
- context.c (no DECODE_MODE/START_CODE menus for VP8)
- video.c (CAPTURE-side format list; VP8 is OUTPUT-side)
- v4l2.c (fourcc-agnostic helpers)
- buffer.c (buffer registry is type-agnostic)
- include/hevc-ctrls.h (already includes <linux/v4l2-controls.h>
which holds V4L2_CID_STATELESS_VP8_FRAME)
Contract surface cited verbatim:
- V4L2_CID_STATELESS_VP8_FRAME = V4L2_CID_CODEC_STATELESS_BASE+200
= 0x00a409c8 (matches Phase 0 V4L2 inventory)
- struct v4l2_ctrl_vp8_frame at <linux/v4l2-controls.h>:1929-1958
+ 5 sub-structs (segment, lf, quant, entropy, coder_state) at
1785-1888
- VAAPI VAPictureParameterBufferVP8 + VASliceParameterBufferVP8 +
VAProbabilityDataBufferVP8 + VAIQMatrixBufferVP8 at
references/libva/va/va_dec_vp8.h
- FFmpeg v4l2_request_vp8.c reference: single batched S_EXT_CTRLS
at end_frame, count=1, no init-time menus
- Kernel hantro_vp8.c::hantro_vp8_prob_update reads 9 fields from
hdr (skip/intra/last/gf probs, segment_probs, entropy.{y,uv,mv,
coeff}_probs)
VAAPI → V4L2 mapping table: 30 fields enumerated. Open questions for
Phase 3 baseline (6 items: first_part_header_bits derivation, num_
dct_parts off-by-one, DPB timestamp 0-sentinel handling, show_frame
default, lf.flags FILTER_TYPE_SIMPLE bit, first-frame DPB sentinel).
Patch-shape prediction: ~260-340 LOC across 6 modified + 2 new
files. Medium-sized iter — between iter1's 120 LOC (3 modified +
1 deleted) and iter2's 470 LOC (5 modified). The new file dominates.
Phase 3 baseline targets queued: cross-validator strace verbatim
S_EXT_CTRLS payload capture, VAAPI consumer trace, mpv-SW reference
JPEG capture for criterion 4 byte-compare anchor.
Phase 4 plan structure anticipated: 10-clause template per iter2.
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase8_iteration2_close.md (predecessor close)
src/mpeg2.c (iter1 single-codec template; iter3 will mirror shape)
src/h265.c (iter2 dispatcher pattern; iter3 takes structure cues)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
ea2413e957 |
iter3 Phase 0 + Phase 1 lock: VP8 on hantro-vpu-dec
Opens iter3 of the fresnel-fourier campaign immediately after iter2
close (
|
||
|
|
df787a6cc2 |
iter2 Phase 8 close: 3/5 codecs passing, lesson L1 extended (BOTH directions)
Iteration 2 closes with all 5 Phase 1 boolean-correctness criteria
green. Third codec passes — campaign scoreboard 2/5 → 3/5 (H.264
in T4, MPEG-2 in iter1, HEVC in iter2). Loop terminates per
feedback_dev_process.md Phase 8.
Notable: ZERO Phase 7 → Phase 4 loopbacks needed. Phase 5 review
caught all 3 would-be loopback triggers in advance (data_byte_offset
rename, dpb.rps→index-arrays semantics, pic_order_cnt_val rename).
This is the dev-process ideal: review catches bugs before
implementation lands; verification confirms contract.
What landed:
Code (libva-v4l2-request-fourier master 229d6d1 → 8d71e20):
cca539d iter2 Phase 6 commit A: config.c break for HEVCMain case
8d71e20 iter2 Phase 6 commit B: rewrite h265.c against new V4L2
stateless HEVC API (6 files, 463 ins, 236 del)
Both authored as Claude (noether) per feedback_gitea_as_claude_noether.md.
Campaign docs (fresnel-fourier):
|
||
|
|
05b4bd56ec |
iter2 Phase 7: verification — all 5 criteria GREEN, third codec PASS
Phase 7 verification of iter2 HEVC fix executed against fork tip
8d71e20 (libva-v4l2-request-fourier master = post-iter2-Commit-B).
Verbatim raw output captured to phase0_evidence/2026-05-08/
iter2_phase7/. All five Phase 1 criteria green; bonus byte-compare
confirms structural match against Baseline B with two minor field-
value divergences (informational SPS fields VAAPI doesn't expose;
non-blocking per Criterion 4 byte-identical pixel pass).
Phase 1 → Phase 7 scoreboard:
Criterion 1 (vainfo VAProfileHEVCMain enum): PASS
rkvdec bind: H.264 (5 profiles) + HEVCMain — same as Baseline.
Criterion 2 (vaCreateConfig SUCCESS for HEVCMain): PASS
Pre-iter2: VA_STATUS_ERROR_UNSUPPORTED_PROFILE (12)
Post-iter2: VA_STATUS_SUCCESS (verified verbatim libva trace)
Criterion 3 (ffmpeg-direct HEVC engages backend, exit 0): PASS
5 frames decoded clean, cap_pool_init: 24 slots ready,
no Failed-to-create lines, no S_EXT_CTRLS EINVAL.
Criterion 4 (DMA-BUF GL HEVC HW=SW byte-identical at +02s): PASS
HW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
SW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
HW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
SW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
Frames 1 vs 2 hash-differ (real motion).
Criterion 5 (iter1 MPEG-2 + T4 H.264 reference hashes): PASS
H.264 +30s HW1: f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9 (T4 ref MATCH)
H.264 +30s HW2: 7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8 (T4 ref MATCH)
MPEG-2 +02s HW1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092 (iter1 ref MATCH)
MPEG-2 +02s HW2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de (iter1 ref MATCH)
Bonus byte-compare against Phase 3 Baseline B verbatim:
count=5, which=V4L2_CTRL_WHICH_REQUEST_VAL=0x0f010000 (control class V4L2_CTRL_CLASS_CODEC_STATELESS=0x00a40000):
SPS id=0xa40a90 size=40 (matches Baseline B)
PPS id=0xa40a91 size=64 (matches)
SLICE_PARAMS id=0xa40a92 size=280 (1 slice × sizeof(slice_params))
SCALING_MATRIX id=0xa40a93 size=1000 (matches sizeof(scaling_matrix);
Phase 4 plan typo'd 1296 — actual
struct sums to 1000 = 96+384+384+
128+6+2)
DECODE_PARAMS id=0xa40a94 size=328 (matches)
All return = 0 (kernel accepts every batched call).
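
The 1000-byte SCALING_MATRIX size follows directly from the array
dimensions. A short sketch, assuming the field names and dimensions
of the mainline struct v4l2_ctrl_hevc_scaling_matrix (all u8
arrays); check them against the header in use.

```python
# Assumed layout of struct v4l2_ctrl_hevc_scaling_matrix (u8 arrays).
SCALING_MATRIX_FIELDS = {
    "scaling_list_4x4": (6, 16),
    "scaling_list_8x8": (6, 64),
    "scaling_list_16x16": (6, 64),
    "scaling_list_32x32": (2, 64),
    "scaling_list_dc_coef_16x16": (6,),
    "scaling_list_dc_coef_32x32": (2,),
}


def field_bytes(dims) -> int:
    """Size in bytes of a u8 array with the given dimensions."""
    size = 1
    for dim in dims:
        size *= dim
    return size


sizes = {name: field_bytes(dims) for name, dims in SCALING_MATRIX_FIELDS.items()}
total = sum(sizes.values())

print(list(sizes.values()), total)
```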
SPS field-value divergences vs Baseline B (FFmpeg-v4l2request):
sps_max_num_reorder_pics: post-fix=0 baseline=2 DIVERGE
sps_max_latency_increase_plus1: post-fix=0 baseline=4 DIVERGE
All other SPS fields match (pic_width=1280, pic_height=720,
bit_depth=0, flags=0x180=SAO|STRONG_INTRA_SMOOTHING).
PPS flags also diverge slightly (bit 12 ENTROPY_CODING_SYNC_ENABLED:
post-fix unset, baseline set). Other PPS fields match.
Cause: VAAPI's VAPictureParameterBufferHEVC doesn't expose
sps_max_num_reorder_pics, sps_max_latency_increase_plus1, or
always-truthful entropy_coding_sync. FFmpeg parses these from
bitstream directly. Operational impact NIL (Criterion 4 byte-
identical pixel pass — kernel decoded correctly with these fields
defaulted to 0). Phase 8 polish backlog candidate (low priority):
add SPS bitstream parsing to extract these fields when VAAPI
doesn't supply them.
Phase 7 → Phase 8: clean transition, no loopback.
Notable Phase 7 observations for Phase 8 memory:
1. Phase 5 review value confirmed: 3 Critical findings (C1
data_byte_offset rename, C2 dpb.rps→index-arrays semantics,
C3 pic_order_cnt_val rename) caught at Phase 5 — prevented
Phase 6 compile failures + at least 1-2 Phase 7→Phase 4
loopback cycles. Per memory feedback_review_empirical_over_
theoretical.md: every Critical/Should-fix verified
empirically before responding. Lesson held.
2. One Phase 5 amendment was empirically wrong: S1 suggested
uniform_spacing_flag exists in VAAPI; gcc test-compile rejected.
Both PPS bits 19+20 left zero (VAAPI exposes neither).
Documented inline. Lesson: even reviewer-cited field mappings
warrant empirical verification.
3. Phase 4 plan typo: claimed sizeof(scaling_matrix) = 1296;
empirical size is 1000. Code uses sizeof() so produces correct
bytes. Plan body amendment-by-side-channel; not blocking.
4. VAAPI↔V4L2 field-fidelity gaps surfaced: 2 SPS fields +
possibly 1 PPS bit not exposed by VAAPI. Operational nil;
Phase 8 polish-backlog candidate.
5. mpv --hwdec=vaapi engages HEVC cleanly (no MPEG-2-style
filtering). Confirms Phase 5 Q3 — VAPictureParameterBufferType
sent per-frame for HEVC; latent B3 bug masked same as MPEG-2.
6. BBB HEVC fixture is 1 slice per frame (slice_params size=280
= 1 × sizeof). Multi-slice path in iter2 is coded but
untested by binding cell.
Campaign scoreboard: 2/5 → 3/5 codecs passing
(H.264 in T4, MPEG-2 in iter1, HEVC in iter2). iter2 advances
to Phase 8.
Refs:
../libva-v4l2-request-fourier@8d71e20 (the fork tip verified)
phase4_iter2_plan.md (10 contract clauses; SCALING_MATRIX size
typo noted)
phase5_iter2_review.md (3 Critical + 4 Should-fix amendments
all incorporated; S1 partially empirically
incorrect — VAAPI doesn't expose
uniform_spacing_flag)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9eae068f11 |
iter2 Phase 5: sonnet review — 3 critical UAPI errors caught, 7 amendments
Phase 5 review run via Plan subagent with model: sonnet per
feedback_dev_process.md Phase 5 discipline. 13 findings: 3 Critical
+ 4 Should-fix + 3 Question + 3 Nit. Reviewer's bottom-line: medium
confidence (vs iter1's medium-high) — lower because the plan had
3 concrete-and-wrong claims about kernel UAPI struct fields that
would have caused compile errors or silent semantic bugs in Phase 6.
Per memory feedback_review_empirical_over_theoretical.md: every
Critical and Should-fix finding was VERIFIED against fresnel's
kernel UAPI before responding. No source-read rebuttals attempted.
Critical resolutions:
C1 (data_byte_offset, not data_bit_offset):
Plan Clause 4 said new API "still requires bit_size + data_bit_
offset, this logic is preserved." Empirical: struct has
data_byte_offset (u32 byte count). FFmpeg uses straight byte
offset, no bit search. Plan amendment: drop bit-search at
h265.c:196-209; replace with byte-offset assignment.
ACCEPTED.
C2 (dpb.rps GONE, pic_order_cnt_val rename, poc_st_curr_*
arrays hold DPB indices):
Plan Clause 6 said "DPB extraction migrates verbatim." Empirical:
- dpb_entry has flags (only LONG_TERM_REFERENCE bit), no .rps
- pic_order_cnt_val (singular s32) replaces pic_order_cnt[0]
- poc_st_curr_before[16]/_after[16]/_lt_curr[16] are u8 DPB
INDICES, not POC values; populate via FFmpeg
get_ref_pic_index() pattern (search dpb[] by timestamp,
return index)
Plan amendment: replace "verbatim migration" claim with explicit
re-spec: classify VAAPI ReferenceFrames into ST_CURR_BEFORE/
AFTER/LT_CURR lists, assign DPB indices, populate arrays with
indices.
ACCEPTED.
C3 (union-aliasing reasoning wrong, claim still right):
Same anti-pattern as iter1 review C1. Plan said reset is benign
because RenderPicture per-buffer copies overwrite byte 17764.
Empirical: byte 17764 lands in num_slices region; non-HEVC
profiles never read that location. Reset is benign because
non-aliasing, NOT because of overwriting. Wording amended.
ACCEPTED.
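
The C2 re-spec (index arrays instead of POC values) can be sketched
as a timestamp lookup. This is an illustration of the pattern
described above, not FFmpeg's get_ref_pic_index() itself: the
function names, the dictionary-based DPB entries, and the choice to
skip references that are not found are all assumptions.

```python
V4L2_HEVC_DPB_ENTRIES_NUM_MAX = 16


def find_dpb_index(dpb, timestamp):
    """Return the index of the DPB entry with this timestamp, or None."""
    for index, entry in enumerate(dpb):
        if entry["timestamp"] == timestamp:
            return index
    return None


def fill_index_array(dpb, ref_timestamps):
    """Build one poc_*_curr-style array: DPB indices, not POC values.

    Returns (array, count); unused slots stay 0.
    """
    array = [0] * V4L2_HEVC_DPB_ENTRIES_NUM_MAX
    count = 0
    for timestamp in ref_timestamps:
        index = find_dpb_index(dpb, timestamp)
        if index is None:
            continue  # assumption: missing references are skipped
        array[count] = index
        count += 1
    return array, count


dpb = [{"timestamp": 1000, "pic_order_cnt_val": 4},
       {"timestamp": 2000, "pic_order_cnt_val": 8}]
print(fill_index_array(dpb, [2000, 1000]))
```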
Should-fix resolutions:
S1 (PPS flags 19+20 missing): empirical confirms
V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT (1ULL<<19)
V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING (1ULL<<20)
Plan amended to add both. ACCEPTED.
S2 (3 PPS scalars missing): empirical PPS struct dump confirms
pic_parameter_set_id, num_ref_idx_l0_default_active_minus1,
num_ref_idx_l1_default_active_minus1 all present in modern
struct. Plan amended to populate. ACCEPTED.
S3 (SCALING_MATRIX content divergence FFmpeg vs libva):
FFmpeg sends memset-zero when no scaling list in stream
(BBB has no scaling_list — SPS flags=SAO|STRONG_INTRA only).
Plan said "populate spec defaults when iqmatrix_set==false."
Phase 6 implementer choice; document in commit which path
taken. Phase 7 byte-compare validates. ACCEPTED as choice
rather than mandate.
S4 (FFmpeg function name wrong cite):
Plan cited ff_v4l2_request_query_control_default_value;
actual is ff_v4l2_request_query_control. Cosmetic fix.
ACCEPTED.
Question resolutions:
Q1 (object_heap allocator size handling): VERIFIED safe.
request.c:142-143 uses sizeof(struct object_surface). Adding
slices[64] auto-picks-up the larger size.
Q2 (slice_segment_addr field): VERIFIED present in struct.
Plan amended Clause 4: populate from VAAPI
slice->slice_segment_address. Single-slice BBB safe with
implicit zero; multi-slice would corrupt without this field.
Q3 (VAPictureParameterBufferType per-frame send for HEVC):
Deferred to Phase 7 LIBVA_TRACE capture. iter1+T4 patterns
suggest yes, worth grepping at verification time.
Nits N1+N2+N3: array size [16] not [8]; image-output
directory naming cosmetic; BeginPicture cleanup deferred.
Plan amendments consolidated:
1. Clause 4: data_byte_offset; drop bit-search; add
slice_segment_addr population (C1 + Q2)
2. Clause 6: explicit DPB classification + index-array logic;
pic_order_cnt_val rename; drop dpb.rps (C2)
3. Clause 3: 2 PPS flags + 3 scalars (S1, S2)
4. Clause 5: function name fix (S4); SCALING_MATRIX divergence
deferred to Phase 6 implementer (S3)
5. Clause 10: union-aliasing reasoning corrected (C3)
6. Clause 6: V4L2_HEVC_DPB_ENTRIES_NUM_MAX=16 macro reference (N1)
7. Phase 7 harness: rename png_* → image_* dirs (N2)
Plan re-locks with these amendments. Phase 6 proceeds.
Per global ~/.claude/CLAUDE.md rule: Phase 5 reviews are never
skippable. iter2's review was the right path forward — caught
3 concrete UAPI errors (data_bit_offset → data_byte_offset rename;
dpb.rps field gone; pic_order_cnt struct shape) that would have
been Phase 6 compile failures or silent Phase 7 byte-compare
divergences requiring loopback. Outside-look value substantial.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
348736eb63 |
iter2 Phase 4: plan — 10 contract clauses, ~400-line h265.c rewrite
Phase 4 plan for iter2 HEVC fix. Structured per the
feedback_dev_process.md Phase 6 contract-before-code worked example
(0012-h264-omit-scaling-matrix-frame-based.patch shape): contract
clauses with citations first, then code changes mapping 1:1 to
clauses.
10 contract clauses cited from authoritative sources:
Clause 1 — Per-frame batched VIDIOC_S_EXT_CTRLS, count=5
Authority: linux/v4l2-controls.h:2090-2300 (8 HEVC stateless CIDs)
Reference impl: FFmpeg libavcodec/v4l2_request_hevc.c:505-565
(v4l2_request_hevc_queue_decode)
Empirical anchor: Phase 3 Baseline B verbatim payload
Clause 2 — v4l2_ctrl_hevc_sps layout (40 bytes)
Authority: linux/v4l2-controls.h:2096+ (struct + 9 SPS_FLAG_* bits)
Field-by-field VAAPI source mapping table; existing
h265_fill_sps logic preserved, just routed to flags bitmask
Phase 3 Baseline B BBB SPS bytes: flags=SAO|STRONG_INTRA_SMOOTHING
Clause 3 — v4l2_ctrl_hevc_pps layout (64 bytes, 19 flags)
Authority: linux/v4l2-controls.h:2126-2150
Field source: VAPictureParameterBufferHEVC + slice (for
dependent_slice_segment_flag)
Clause 4 — v4l2_ctrl_hevc_slice_params (variable; dynamic-array)
Authority: kernel exposes 0xa40a92 elems=1 dims=[600] dynamic-array
Submission shape: size = sizeof(slice_params) * num_slices_in_frame
Reference impl: FFmpeg v4l2_request_hevc.c:540-547
BEHAVIORAL CHANGE: per-slice accumulation in codec_store_buffer
(replace overwrite with append-to-array)
DPB MOVES OUT of slice_params to DECODE_PARAMS (Clause 6)
Clause 5 — v4l2_ctrl_hevc_scaling_matrix (size M; conditional)
Conditional on kernel availability (probed via VIDIOC_QUERY_EXT_CTRL
at init), NOT on bitstream flag (Phase 3 baseline corrects Phase 2
assumption)
Spec defaults from ISO/IEC 23008-2 Table 4-1 when iqmatrix_set==false
PROTOCOL: transcribe defaults from Phase 3 Baseline B verbatim
SCALING_MATRIX bytes, NOT from spec recall (per
memory feedback_review_empirical_over_theoretical.md)
Clause 6 — v4l2_ctrl_hevc_decode_params layout (328 bytes)
NEW in modern API (didn't exist in staging-era)
Contains: DPB array (16 entries), POC, num_active_dpb_entries,
num_poc_st_curr_before/after, num_poc_lt_curr,
poc_st_curr_before[8], etc.
Source: existing h265_fill_slice_params lines 269-315 logic
preserved, routed to new struct
Clause 7 — Device-wide DECODE_MODE + START_CODE menus
Set once at init via v4l2_set_controls(...request_fd=-1, 2 ctrls)
rkvdec accepts: FRAME_BASED + ANNEX_B (only options per kernel menu
constraints, Phase 0 v4l2_inventory)
Default location: extend src/context.c:142-155 device-init block
Clause 8 — config.c HEVCMain case must break;
Authority: C semantics; iter1 Bug 1 pattern verbatim
Empirical anchor: Phase 3 Baseline D scratch confirmed
Clause 9 — picture.c::codec_set_controls HEVCMain dispatch
Authority: existing MPEG-2 dispatch pattern at picture.c:186-191
Replace the explicit reject (comment "Fourier-local: HEVC stripped")
with an h265_set_controls call
Clause 10 — Per-slice accumulation in codec_store_buffer
HEVC slice_params dynamic-array source = per-RenderPicture appends
BeginPicture resets num_slices=0; codec_store_buffer appends each
VASliceParameterBufferType to slices[N] array
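The append-not-overwrite behaviour of Clauses 4 and 10 can be modelled in a few lines. This is a toy sketch, not the backend's C code: the class SurfaceSlices and its method names are hypothetical, and refusing a slice once the array is full is an assumption about how overflow might be handled.

```python
MAX_SLICES = 64  # matches the slices[64] array named in the diff scope


class SurfaceSlices:
    """Toy model of per-surface HEVC slice state (hypothetical)."""

    def __init__(self) -> None:
        self.slices: list[bytes] = []

    def begin_picture(self) -> None:
        # BeginPicture resets num_slices to 0.
        self.slices.clear()

    def store_slice(self, slice_params: bytes) -> bool:
        # Append rather than overwrite. Assumption: refuse once the array is full.
        if len(self.slices) >= MAX_SLICES:
            return False
        self.slices.append(slice_params)
        return True

    def payload(self) -> bytes:
        # One dynamic-array control: size = sizeof(slice_params) * num_slices.
        return b"".join(self.slices)
```

With the old overwrite behaviour, payload() would only ever hold the last slice; here its length grows by one element per VASliceParameterBufferType until the next begin_picture().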
Diff scope (8 files):
src/config.c — 5-line break addition (Clause 8)
src/picture.c — HEVCMain dispatch (Clause 9) + per-slice
accumulation (Clause 10) + BeginPicture
num_slices reset, ~25 lines
src/surface.h — extend params.h265 with slices[64] +
num_slices, ~17 KB extra per surface union
src/h265.c — full rewrite ~400 lines (Clauses 2-7)
src/h265.h — re-enable
src/meson.build — uncomment h265.c + h265.h
src/context.c — extend device-init for HEVC DECODE_MODE +
START_CODE
include/hevc-ctrls.h — leave as-is (9-line shim, lower-risk path
per iter1 Phase 5 Nit 6 deferral)
Phase 6 implementation order (2 logical commits + optional fix-forward):
A: src/config.c HEVCMain break only (substrate fix in isolation;
Phase 3 Baseline D already verified collateral safe)
B: h265.c rewrite + picture.c dispatch + slice_params accumulation +
meson re-enable + surface.h extension + context.c device-init
C: optional fix-forward if Phase 7 surfaces a regression
Phase 7 verification harness (full Bash incantations in plan body):
Criterion 1: vainfo lists VAProfileHEVCMain on rkvdec
Criterion 2: vaCreateConfig(VAProfileHEVCMain) = SUCCESS via libva trace
Criterion 3: ffmpeg -hwaccel vaapi exit 0, no Failed-to-create
Criterion 4: mpv --hwdec=vaapi --vo=image at +02s; HW=SW byte-identical
(DMA-BUF GL cache-coherency-safe path per memory
feedback_rockchip_pixel_verify_path.md)
Criterion 5: iter1 MPEG-2 + T4 H.264 reference hashes still match
Bonus: byte-compare post-fix S_EXT_CTRLS payload vs Baseline B
Pre-identified Phase 7 → Phase 4 loopback triggers:
1. S_EXT_CTRLS EINVAL post-fix → check struct sizes (pahole),
reserved zeroing, SCALING_MATRIX size encoding
2. HW pixel hash mismatch → DPB ordering, slice_params bit_offset,
SPS/PPS flags bit positions, SCALING_MATRIX values
3. mpv --hwdec=vaapi filters HEVC out → fall-forward to ffmpeg
-vf hwdownload (less likely; vaapi engaged MPEG-2 in iter1)
4. iter1/T4 regression → verify diffs scoped right
5. Slice_params dynamic-array submission shape rejected → cross-
validator size encoding anchor
6. SCALING_MATRIX availability detection wrong → defensive
QUERY_EXT_CTRL probe in h265_init_device_controls
7. Latent bug B3 hits HEVC differently than MPEG-2 → byte 240 in
h265.picture; ffmpeg-vaapi sends VAPictureParameterBufferType
per frame so masking holds
Out-of-scope (LOCKED): VP9/VP8; HEVC Main 10 / Main Still Picture /
range ext / tile-wavefront; perf metrics; long-duration stress;
SLICE_BASED decode mode (rkvdec FRAME_BASED only); Phase 4 cross-
cutting backlog (B1 device-discovery, B3 BeginPicture profile-aware,
B4 context.c log suppression, B5 vbv_buffer_size, L3 vaDeriveImage
cache-stale); chromium-fourier 149 install; upstream engagement;
hevc-ctrls.h deletion (Phase 5 Nit 6 lower-risk path continues).
Predicted Phase 8 close: 4-6 commits on the fork (vs iter1's 4).
Iter2 ~3x larger codebase delta than iter1 (mpeg2.c rewrite was
~120 lines; h265.c rewrite is ~400 lines).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d35a247948 |
iter2 Phase 3: baselines — substrate verified post-upgrade, HEVC anchor captured
Phase 3 baselines for iter2 HEVC. Substrate-update verification
ran first (post pacman -Syu rolling upgrade), then iter2-specific
HEVC cross-validator anchor + Bug 1 scratch.
Pre-Phase-3 substrate event: pacman -Syu landed 71 packages.
The "scheduled for linux-7" upgrade was headers-only —
linux-eos-arm-headers 6.19.9-99 → 7.0.3-1, but linux-eos-arm
kernel binary stayed at 6.19.9-99 (EOS-ARM repo hasn't
published the matching 7.x kernel yet). Userland refreshed:
qt6-base epoch bump, libdrm 2.4.131 → 2.4.133, chromium
147 → 148, KDE 26.04.1 batch, mkinitcpio 41-3, etc. OC DTB
intact (sha256 unchanged). mfritsche Plasma session active
throughout, no SDDM regression on this kernel boot.
eos-reboot-recommended marker installed; reboot deferred.
Baseline A (substrate validation post-upgrade):
T4 H.264 +30s and iter1 MPEG-2 +02s reference hashes all
8 match exactly:
H.264 HW1=SW1=f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9
H.264 HW2=SW2=7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8
MPEG-2 HW1=SW1=6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
MPEG-2 HW2=SW2=ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
Userland upgrade did not regress kernel-side decode or
DMA-BUF GL readback.
Baseline B (HEVC cross-validator verbatim contract anchor):
ffmpeg -hwaccel v4l2request decoded bbb_720p10s_hevc.mp4
-frames:v 5 cleanly. Per-frame submission shape:
VIDIOC_S_EXT_CTRLS, ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS,
count=5
0xa40a90 SPS size=40
0xa40a91 PPS size=64
0xa40a92 SLICE_PARAMS size=N (dynamic-array)
0xa40a93 SCALING_MATRIX size=M
0xa40a94 DECODE_PARAMS size=328
Plus init device-wide:
0xa40a95 DECODE_MODE (menu, set once)
0xa40a96 START_CODE (menu, set once)
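The control IDs above follow from the UAPI base plus a per-control offset. A short sketch, assuming the class value 0x00a40000 and the 0x900 base offset from linux/v4l2-controls.h; the list HEVC_CONTROLS and the helper hevc_cid are hypothetical names.

```python
V4L2_CTRL_CLASS_CODEC_STATELESS = 0x00A40000
V4L2_CID_CODEC_STATELESS_BASE = V4L2_CTRL_CLASS_CODEC_STATELESS | 0x900

# Order follows the IDs listed in Baseline B (base + 400 .. base + 406).
HEVC_CONTROLS = [
    "SPS",
    "PPS",
    "SLICE_PARAMS",
    "SCALING_MATRIX",
    "DECODE_PARAMS",
    "DECODE_MODE",
    "START_CODE",
]


def hevc_cid(name: str) -> int:
    """Return the control ID for one of the seven HEVC stateless controls."""
    return V4L2_CID_CODEC_STATELESS_BASE + 400 + HEVC_CONTROLS.index(name)
```

For example, hex(hevc_cid("DECODE_PARAMS")) gives 0xa40a94, the ID of the 328-byte per-frame control.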
Key Phase 2 amendments from Phase 3 evidence:
- Per-frame batch is 5 controls (not "up to 6" — BBB
doesn't trigger ENTRY_POINT_OFFSETS / EXT_SPS_*).
- SCALING_MATRIX is sent unconditionally for BBB. FFmpeg
gates on ctx->has_scaling_matrix from kernel
VIDIOC_QUERY_EXT_CTRL at init, NOT on per-frame
bitstream flags. Phase 4 plan amends: query kernel for
SCALING_MATRIX availability at init, submit if available.
SPS payload field-decoded (40 bytes verbatim from BBB
fixture): 1280x720, 8-bit, 4:2:0, no PCM, flags = SAO |
STRONG_INTRA_SMOOTHING. PPS + DECODE_PARAMS + SLICE_PARAMS +
SCALING_MATRIX payloads captured for Phase 4 transcription.
Baseline C (slice-count probe): deferred. ffprobe confirms
1 video stream HEVC Main 1280x720 24fps 10s. Per-frame
slice-count not directly extracted; assume 1 slice/frame for
x265 ultrafast preset until Phase 6 verifies. Kernel
advertises slice_params dynamic-array max 600 entries
(phase0 v4l2_inventory), so multi-slice frames are supported
by the contract.
Baseline D (Bug 1 scratch test, collateral safety):
Applied Bug 1 (config.c break for HEVCMain) on throwaway
branch; h265.c stayed disabled. Built + installed.
H.264 HW frames @ +30s: f623d5f7..., 7d7bc6f2... (match T4)
MPEG-2 HW frames @ +02s: 6e7873030dbf..., ccc7ce08810d...
(match iter1)
Bug 1 in isolation does not regress H.264 or MPEG-2.
HEVC behavior with Bug 1 only:
libva trace: vaCreateConfig SUCCESS for VAProfileHEVCMain
ffmpeg: Task finished with error code: -5 (Input/output error)
Decode fails downstream because picture.c:204-206 still has
the explicit case VAProfileHEVCMain: return UNSUPPORTED_PROFILE
reject (Bug 2). Confirms Phase 2 prediction; Bug 2 fix
requires h265_set_controls to exist (Bug 3-6: enable +
rewrite). Bug 2 lands together with the h265.c rewrite in
Commit B (analogous to iter1 Commit B).
Scratch state cleaned: git checkout + rebuild + reinstall
master backend. H.264 + MPEG-2 still pass. Back to Baseline-A-
equivalent state.
Phase 4 plan inputs updated:
- Per-frame batch: 5 controls (not "up to 6")
- SCALING_MATRIX: unconditional iff kernel advertises (init
QUERY_EXT_CTRL probe), not bitstream-conditional
- SLICE_PARAMS: dynamic-array (max 600 elems per kernel UAPI)
- DECODE_MODE + START_CODE: 2 device-wide menus at init
- Phase 7 harness anchors on mpv-vaapi-vo=image (DMA-BUF GL
cache-coherency-safe path per
feedback_rockchip_pixel_verify_path.md)
- Phase 7 bonus: byte-compare post-fix S_EXT_CTRLS payload
against Baseline B (per feedback_review_empirical_over_
theoretical.md — empirical wins)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b3ba157cb4 |
iter2 Phase 2: situation analysis — six bugs in HEVC path
Phase 2 source-read of the HEVC path post-iter1-close (fork master
229d6d1). Six bugs identified, all in libva backend; kernel + driver
path proven for HEVC in Phase 0 cross-validator sweep.
Substrate timing caveat: Phase 2 conducted against fresnel kernel
6.19.9-99. Operator-scheduled rolling pacman -Syyuu to linux-7
imminent. Phase 2 source-read findings are kernel-agnostic (fork
code + UAPI + FFmpeg reference); they carry forward across the
kernel jump unchanged. Phase 3 baselines will run on linux-7.
Bug 1 — src/config.c:64-69 HEVCMain falls through to default,
returns VA_STATUS_ERROR_UNSUPPORTED_PROFILE. Verbatim match for
iter1 Bug 1 pattern; fix is 3-line break addition.
Bug 2 — src/picture.c:204-206 explicit
case VAProfileHEVCMain: return UNSUPPORTED_PROFILE
with stale comment "Fourier-local: HEVC stripped, no HW support
on RK3566." (RK3566 is ohm context; fresnel is RK3399 where
rkvdec DOES support HEVC.) Fix: replace explicit reject with
dispatch to h265_set_controls() (mirrors MPEG-2 dispatch at
picture.c:186-191).
Bug 3 — src/h265.c uses staging-era CIDs:
V4L2_CID_MPEG_VIDEO_HEVC_PPS / _SPS / _SLICE_PARAMS
These don't exist on fresnel's 6.19 kernel headers (verified via
test-compile: gcc reports undeclared identifiers, suggests
V4L2_CID_MPEG_VIDEO_DEC_PTS as nearest match). Mainline kernel
UAPI splits HEVC stateless into 7 controls:
V4L2_CID_STATELESS_HEVC_{SPS,PPS,SLICE_PARAMS,SCALING_MATRIX,
DECODE_PARAMS,DECODE_MODE,START_CODE}
+ ENTRY_POINT_OFFSETS, EXT_SPS_ST_RPS, EXT_SPS_LT_RPS
(0xa40a90..0xa40a96 + extensions, V4L2_CID_CODEC_STATELESS_BASE
+ 400..407+).
Fix shape: rewrite h265.c against new split API. Substantially
larger than iter1's mpeg2.c rewrite (HEVC has 7 controls vs MPEG-2
3, + slice_params dynamic-array, + per-slice accumulation logic
needed).
Bug 4 — h265.c uses single-slice_params shape; new API is
dynamic-array. Fresnel rkvdec advertises:
hevc_slice_parameters 0xa40a92 elems=1 dims=[600] dynamic-array
Up to 600 slice_params entries per submission. Current
codec_store_buffer:115-135 OVERWRITES previous slice on
VASliceParameterBufferType arrival. Multi-slice frames need
APPEND-not-overwrite. FFmpeg reference v4l2_request_hevc.c:540-547
shows the pattern.
Fix shape: extend params.h265 to hold slice_params array (or
pointer+count); codec_store_buffer appends; h265_set_controls
flushes the array at end_picture as a single dynamic-array
S_EXT_CTRLS entry.
Bug 5 — h265.c missing controls: doesn't submit DECODE_PARAMS
(per-frame DPB info; new in modern API), SCALING_MATRIX (conditional
on iqmatrix_set + sps.scaling_list_enabled), DECODE_MODE+START_CODE
(device-wide menus, set once per context init).
Fix shape: add h265_fill_decode_params() (DPB ordering from VAAPI
ReferenceFrames[15] — preserve current extraction logic from
h265_fill_slice_params:269-315, route to new struct). Conditional
SCALING_MATRIX from VAIQMatrixBufferHEVC. Device-wide
DECODE_MODE+START_CODE either at first h265_set_controls call or
in extended context.c device-init block.
Bug 6 — src/meson.build comments out 'h265.c' (line 50) and
'h265.h' (line 73). Fix: uncomment both. Trivial.
Bug 7 (verify only) — include/hevc-ctrls.h is a 9-line shim that
just #include <linux/v4l2-controls.h>. Comment dates the
modernization to "linux-media 6.6+". Adds zero value; harmless.
Leave in place per iter1 Phase 5 Nit 6 lower-risk path.
Bug 8 (latent) — picture.c:287 params.h264.matrix_set=false
writes union byte 240. For HEVC: byte 240 lands inside
h265.picture (range [0..604), size 604) — different field than
MPEG-2's chroma_intra_quantiser_matrix. ffmpeg-vaapi's
per-frame VAPictureParameterBufferHEVC re-send overwrites the
corrupted byte before h265_set_controls reads. Latent for
clients that reuse a surface without re-sending picture params.
iter2+ Phase 4 cross-cutting backlog candidate; not iter2 scope.
Things verified NOT bugs:
- h265_fill_pps/sps/slice_params field extraction from VAAPI
structs is sound (just routes to wrong destination structs)
- NAL header parsing (data_bit_offset bit-search) is preserved
in new API — slice_params still has bit_size + data_bit_offset
- v4l2_set_controls batching API in place (used by H.264 + iter1
MPEG-2; iter2 uses same)
Substrate / kernel observation:
- Linux mainline 7.1.0-rc2 reference checkout has
drivers/staging/media/rkvdec/ with rkvdec.c, rkvdec-h264.c,
rkvdec-vp9.c — NO rkvdec_hevc.c. fresnel's HEVC support is
out-of-tree (Christian Hewitt patches per phase0_findings.md
external references). May land in stable 7.x.
- Phase 4 contract-before-code therefore can't cite kernel-side
HEVC handler source until/unless rkvdec_hevc.c lands in
mainline. UAPI doc + FFmpeg reference + Phase 3 cross-validator
bytes are the contract anchor.
Open questions tabled for Phase 3 (post-linux-7-upgrade):
1. iter1 + T4 references on linux-7 (regression check of closed
iter1 work)
2. SDDM watchpoint on linux-7
3. Cross-validator HEVC re-anchor (Baseline C equivalent for
HEVC) — verbatim payload bytes for SPS, PPS, DECODE_PARAMS,
SLICE_PARAMS array, SCALING_MATRIX
4. Pre-fix scratch test (Bug 1 + Bug 2 only, h265.c kept
commented out) — confirm collateral safe
5. Slice-count for bbb_720p10s_hevc.mp4 fixture
6. Whether linux-7 brings rkvdec_hevc.c into mainline
Predicted iter2 close shape: trivial Bugs 1+2+6 fixes + sizable
h265.c rewrite (~250-400 lines, ~3x iter1's mpeg2.c) + new
codec_store_buffer slice accumulation logic. If Phase 7 fails:
likely struct-size mismatch (run pahole), DPB ordering, or
slice_params array size encoding.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6e8c970c1d |
iter2 Phase 0 + Phase 1 lock: HEVC Main on rkvdec
Iteration 2 of the campaign 8(+1)-phase loop opens following iter1
close (
|
||
|
|
dc6937868a |
iter1 Phase 8 close: 2/5 codecs passing, 3 lessons distilled to memory
Iteration 1 closes with all five Phase 1 boolean-correctness criteria
green. Second codec passes — campaign scoreboard 1/5 → 2/5 (H.264
in T4, MPEG-2 in iter1). Loop terminates per
feedback_dev_process.md Phase 8.
What landed:
Code (libva-v4l2-request-fourier master 65969da..229d6d1):
e7dad7a iter1 Phase 6 commit A: config.c break for MPEG-2 cases
5fe873c iter1 Phase 6 commit B: rewrite mpeg2.c against new V4L2 stateless API
3aab187 iter1 Phase 6 commit C: delete staging-era include/mpeg2-ctrls.h
229d6d1 iter1 Phase 6 commit D: drop missed mpeg2-ctrls.h include from context.c (fix-forward)
All four authored as Claude (noether) per feedback_gitea_as_claude_noether.md.
Campaign docs (fresnel-fourier):
|
||
|
|
ec9133a5e4 |
iter1 Phase 7: verification — all 5 criteria GREEN, second codec PASS
Phase 7 verification of iter1 MPEG-2 fix executed against fork tip
229d6d1 (libva-v4l2-request-fourier master = post-Commit-D).
Verbatim raw output captured to phase0_evidence/2026-05-08/
iter1_phase7/. All five Phase 1 criteria green; bonus byte-compare
confirms structural match against Baseline C with one numerical
divergence (vbv_buffer_size, kernel-ignored, non-blocking).
Phase 1 → Phase 7 scoreboard:
Criterion 1 (vainfo MPEG-2 Simple+Main enum): PASS
Criterion 2 (vaCreateConfig SUCCESS for MPEG2Main): PASS
Pre-iter1: VA_STATUS_ERROR_UNSUPPORTED_PROFILE (12)
Post-iter1: VA_STATUS_SUCCESS (verified verbatim libva trace)
Criterion 3 (ffmpeg-hwaccel-vaapi engages backend): PASS
5 frames decoded, exit 0, no Failed-to-create lines,
no S_EXT_CTRLS EINVAL on the MPEG-2 path
Criterion 4 (DMA-BUF GL HW=SW byte-identical at +02s): PASS
HW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
SW frame 1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
HW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
SW frame 2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
Frames 1 vs 2 differ in size (real motion).
Criterion 5 (T4 H.264 reference hashes match): PASS
HW + SW frames at +30s into bbb_1080p30_h264.mp4 match
f623d5f7... and 7d7bc6f2... exactly. No H.264 regression.
Bonus byte-compare against Phase 3 Baseline C verbatim:
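count=3, ctrl_class=0xf010000 (strace prints the which/ctrl_class
union; the value is V4L2_CTRL_WHICH_REQUEST_VAL, while
V4L2_CTRL_CLASS_CODEC_STATELESS itself is 0xa40000):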
SEQUENCE id=0xa409dc size=12 (matches)
PICTURE id=0xa409dd size=32 (matches structurally)
QUANTISATION id=0xa409de size=256 (intra matrix bytes
IDENTICAL to Baseline C
verbatim 64 bytes;
non_intra all 16's)
All return = 0 (kernel accepts every batched call).
One numerical divergence: sequence.vbv_buffer_size
post-fix: 0x100000 = 1 048 576 (= SOURCE_SIZE_MAX)
Baseline C: 0x151800 = 1 382 400 (= negotiated sizeimage)
Kernel ignores per v4l2-controls.h:2003 (informational).
Decode is bit-exact correct regardless. Phase 5 reviewer S2
was numerically prescient; my Phase 5 response (rejected with
"slot->size = sizeimage") was wrong empirically; operational
impact nil. Tracked as low-priority post-iter1 polish.
Phase 7 → Phase 8: clean transition, no loopback to Phase 4.
Notable observations for Phase 8 memory update:
1. V4L2 /dev/videoN numbering shuffles across reboots on RK3399.
Phase 0/3 had rkvdec=video3+media1, hantro=video5+media2; this
boot has rkvdec=video1+media0, hantro=video3+media1. Phase 1
binding cells using fixed paths fragile across reboots. Phase
4 cross-cutting fix candidate: backend probes /dev/media* for
driver=hantro-vpu/rkvdec rather than env-var stability.
2. iter1 patch-0011 cache-stale bug class also affects MPEG-2
(verified empirically; same as H.264 in T4). vaDeriveImage
readback returns all-zero NV12 via ffmpeg-vaapi+hwdownload.
Workaround: DMA-BUF GL import (mpv --vo=image) is cache-
coherency-safe. Phase 4 cross-cutting fix candidate: add
VIDIOC_EXPBUF + DMA_BUF_IOCTL_SYNC support to libva backend
image-export path.
3. src/context.c:142-155 H.264 device-init logs noisy EINVAL on
hantro every CreateContext (return value cast to (void) but
v4l2.c:484 still calls request_log). Cosmetic suppression
candidate; low priority.
4. Phase 6 commit D (fix-forward for missed mpeg2-ctrls.h
include in context.c) — Phase 2 grep audit was incomplete.
Phase 8 lesson: when deleting a header, completeness check
is git rm + clean rebuild, not grep alone.
Campaign scoreboard: 1/5 → 2/5 codecs passing
(H.264 in T4, MPEG-2 in iter1). Iter1 advances to Phase 8.
Refs:
../libva-v4l2-request-fourier@229d6d1 (the fork tip verified)
phase4_iter1_plan.md (criteria as locked, including Phase 5
amendments to criterion 3 + criterion 4)
phase5_iter1_review.md (S2 partial-correct; S3, Q4, Q5
confirmed empirically)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
0e2e1c2293 |
iter1 Phase 5: sonnet-architect review — 6 findings, 4 amendments
Phase 5 review run via Plan subagent with model: sonnet per
feedback_dev_process.md Phase 5 discipline. Review verbatim
preserved in phase5_iter1_review.md alongside per-finding response.
Findings: 1 Critical (latent), 2 Should-fix (1 valid, 1 misreading),
2 Question/clarification, 1 Nit. Reviewer's bottom-line: medium-high
confidence in the plan as written.
Resolutions:
C1 (union-aliasing reasoning was wrong; iter1 unaffected; latent bug):
Verified offsets on fresnel via gcc + libva headers:
h264.matrix_set at union byte 240
mpeg2.iqmatrix_set at union byte 376
mpeg2.iqmatrix range [88..376) — sizeof=288
Setting h264.matrix_set=false writes byte 240, which lands inside
mpeg2.iqmatrix.chroma_intra_quantiser_matrix at offset 20.
Phase 2 said the byte gets overwritten by RenderPicture before
mpeg2_set_controls reads it. That was true only because ffmpeg-
vaapi sends VAIQMatrixBufferType every frame; codec_store_buffer
then copies the full 260-byte payload over the corrupted byte.
ACCEPTED: update Phase 2 + Phase 4 wording to cite the correct
safety chain. Latent bug for clients that reuse a surface without
re-sending IQMatrix logged for iter2+ backlog.
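The aliasing in C1 can be reproduced with a placeholder union. Only the offsets (240, 88, 376) and the 288-byte size come from the review; the ctypes layouts below are padding-only stand-ins for the real structs, and all class and function names are hypothetical.

```python
import ctypes


class H264Params(ctypes.Structure):
    # Placeholder layout: only the offset of matrix_set (240) is from the review.
    _fields_ = [("leading", ctypes.c_uint8 * 240),
                ("matrix_set", ctypes.c_uint8)]


class Mpeg2Params(ctypes.Structure):
    # Placeholder layout: iqmatrix spans [88..376), iqmatrix_set sits at 376.
    _fields_ = [("picture", ctypes.c_uint8 * 88),
                ("iqmatrix", ctypes.c_uint8 * 288),
                ("iqmatrix_set", ctypes.c_uint8)]


class Params(ctypes.Union):
    _fields_ = [("h264", H264Params), ("mpeg2", Mpeg2Params)]


def reset_h264_matrix_set(params: Params) -> None:
    """Model of the reset: h264.matrix_set = false."""
    params.h264.matrix_set = 0


def aliased_iqmatrix_index() -> int:
    """Index inside mpeg2.iqmatrix that the h264.matrix_set byte overlaps."""
    return H264Params.matrix_set.offset - Mpeg2Params.iqmatrix.offset
```

Writing a non-zero byte at that index and then calling reset_h264_matrix_set() clears it, which is the corruption that the per-frame VAIQMatrixBufferType copy later hides.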
S2 (vbv_buffer_size source — reviewer misread):
Reviewer assumed slot->size = SOURCE_SIZE_MAX (1MB). Verified
source: src/request_pool.c:71 sets pool->slots[i].size = length,
where length is the V4L2-reported buffer length from
VIDIOC_QUERYBUF (= negotiated sizeimage from S_FMT). Phase 3
Baseline C strace shows S_FMT(OUTPUT_MPLANE) returns
sizeimage=1382400=0x151800 — exactly matches Baseline C's
vbv_buffer_size payload. Plan is correct as-is.
REJECTED (reviewer's claim wrong); 1-line note added to Phase 6
Commit B message clarifying the dynamic source.
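The two candidate vbv_buffer_size values in S2 are easy to check numerically. A sketch under one assumption: that the negotiated sizeimage coincides with an NV12-sized frame for 1280x720. That is an observation about the numbers, not a claim about how the driver computes it; nv12_frame_bytes is a hypothetical helper.

```python
SOURCE_SIZE_MAX = 0x100000       # the 1 MB value the reviewer assumed
BASELINE_C_SIZEIMAGE = 0x151800  # value seen in the Baseline C strace


def nv12_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit 4:2:0 frame: full-size luma plus half-size chroma."""
    return width * height * 3 // 2
```

nv12_frame_bytes(1280, 720) evaluates to 1382400, the same number as 0x151800, while SOURCE_SIZE_MAX is 1048576.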
S3 (default-matrix transcription byte-verify protocol):
ACCEPTED. Phase 6 protocol amendment: when transcribing the
64-entry default_intra[] in src/mpeg2.c, derive values from
Baseline C QUANTISATION verbatim payload, then run a diff-based
assertion before commit lands. Same for non_intra (all 16's),
chroma_intra (= intra), chroma_non_intra (all 16's) — verified
against Baseline C bytes 0..63 / 64..127 / 128..191 / 192..255.
Q4 (criterion 4 — ffmpeg+hwdownload primary, not fallback):
ACCEPTED. Phase 7 harness criterion 4 changes from
mpv --hwdec=vaapi --vo=image first, ffmpeg fallback
to
ffmpeg -hwaccel vaapi -vf hwdownload,format=nv12 primary,
mpv-vaapi-vo=image backup
Critical addition: Phase 7 must check both hashes match AND
content non-zero/non-sentinel. T4 found ffmpeg-vaapi
-hwaccel_output_format nv12 returns mostly zeros via cached-mmap
on RK3399 (iter1 patch-0011 cache-stale bug class). For MPEG-2,
hwdownload may use a different readback path; if it also exposes
the cache-stale bug, swap to mpv-vaapi-vo=image. Empirical
determination during Phase 7.
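The "hashes match AND content non-zero" rule from Q4 can be sketched as below. Assumptions: SHA-256 as the hash (consistent with the 64-hex-digit hashes in this log, but not confirmed by it), and an all-zero buffer as the only sentinel checked; a "mostly zeros" readback would need a stricter threshold. Function names are hypothetical.

```python
import hashlib


def frame_digest(data: bytes) -> str:
    """SHA-256 hex digest of one frame's bytes (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def frames_verified(hw: bytes, sw: bytes) -> bool:
    """True only if HW output is non-zero and byte-identical to SW output."""
    if not any(hw):
        # Empty or all-zero readback: matching hashes would prove nothing.
        return False
    return frame_digest(hw) == frame_digest(sw)
```

The non-zero guard matters because two all-zero frames hash identically, so a cache-stale readback on both sides would otherwise pass.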
Q5 (timestamp behavior is a correction, not "no semantic change"):
ACCEPTED. Phase 4 Clause 3 amendment: explicitly note that
forward_ref_ts/backward_ref_ts = 0 when reference surface is
VA_INVALID_ID is a CORRECTION vs current code's self-referencing
behavior. Old code at src/mpeg2.c:106-107, 113-115 set
forward_reference_surface = surface_object (self-ref) when ref
was VA_INVALID_ID. New code sets ts to 0. Baseline C frame 1
confirms 0-as-sentinel; FFmpeg v4l2_request_mpeg2.c:98-108
matches. Iter1 fixes a latent bug.
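The Q5 correction reduces to one branch. A sketch, assuming libva's VA_INVALID_ID value of 0xffffffff; the helper ref_timestamp and the surface_timestamps mapping are hypothetical stand-ins for the backend's surface lookup.

```python
VA_INVALID_ID = 0xFFFFFFFF  # libva's invalid-ID sentinel


def ref_timestamp(ref_surface_id: int, surface_timestamps: dict[int, int]) -> int:
    """Timestamp to submit for a forward/backward reference.

    New behaviour per Q5: an invalid reference yields 0 instead of
    falling back to the current surface (the old self-reference).
    """
    if ref_surface_id == VA_INVALID_ID:
        return 0
    return surface_timestamps[ref_surface_id]
```

For an I-picture both references are VA_INVALID_ID, so both timestamps come out as 0, matching the 0-as-sentinel seen in Baseline C frame 1.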
Nit 6 (hevc-ctrls.h left alongside removed mpeg2-ctrls.h):
ACCEPTED (lower-risk path). Phase 6 Commit B removes mpeg2-ctrls.h
include only; Commit C deletes include/mpeg2-ctrls.h only.
Hevc-ctrls.h header + include left untouched, deferred to HEVC
iteration. Optional cleanup if Phase 6 chooses to bundle, but
default is the smaller diff.
Phase 4 → Phase 6 amendments consolidated:
1. Clause 3 timestamp behavior explicit (Q5)
2. Clause 4 default-matrix Baseline-C-derived transcription (S3)
3. Phase 7 criterion 4 ffmpeg+hwdownload primary + non-zero check (Q4)
4. Hevc-ctrls.h cleanup deferred (Nit 6)
5. Phase 2 + Phase 4 wording fix on union safety chain (C1 partial)
6. Latent surface-reuse bug logged for iter2+ backlog (C1 follow-up)
Plan re-locks with these amendments. Phase 6 proceeds.
Per global ~/.claude/CLAUDE.md rule: Phase 5 reviews are never
skippable. This review was the right path forward; surfaced 2 plan
amendments + 1 latent bug worth documenting + 1 reviewer-misreading
worth pinning so the trail is clear. Material outside-look value.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
3e996d09e2 |
iter1 Phase 4: plan — contract clauses, diff scope, Phase 7 harness
Phase 4 plan for the iter1 MPEG-2 fix, structured per the
feedback_dev_process.md Phase 6 contract-before-code worked example
(0012-h264-omit-scaling-matrix-frame-based.patch shape): contract
clauses with citations first, then code changes mapping 1:1 to
clauses.
Phase 1 criterion #3 re-locked per Phase 3 → Phase 1 loopback:
Original: "mpv --hwdec=vaapi-copy ... engages backend"
Adjusted: "ffmpeg -hwaccel vaapi ... engages backend"
Phase 3 Baseline A established mpv silently filters MPEG-2 out
before libva is loaded; the original wording was unfalsifiable.
ffmpeg-direct exercises the path. mpv-driven testing → separate
follow-up task.
Other 4 criteria unchanged (vainfo regression, vaCreateConfig
SUCCESS, DMA-BUF GL pixel verify HW=SW at +02s, T4 H.264
regression).
Six contract clauses cited from authoritative sources:
Clause 1 — Three split controls in one batched VIDIOC_S_EXT_CTRLS
Authority: linux/v4l2-controls.h:1988-2105
Reference impl: FFmpeg libavcodec/v4l2_request_mpeg2.c:130-155
Empirical anchor: Phase 3 Baseline C verbatim payload
Clause 2 — v4l2_ctrl_mpeg2_sequence layout (12 bytes)
Authority: linux/v4l2-controls.h:2009-2017
Field-by-field VAAPI source mapping table
Note: progressive_frame is used as proxy for progressive_sequence
(VAAPI doesn't expose the latter; same bit for BBB).
Clause 3 — v4l2_ctrl_mpeg2_picture layout (32 bytes)
Authority: linux/v4l2-controls.h:2056-2065
reserved[5] MUST be zeroed (kernel doc 2052)
8 picture flags decoded; field-by-field VAAPI mapping
Clause 4 — v4l2_ctrl_mpeg2_quantisation layout (256 bytes)
Authority: linux/v4l2-controls.h:2089-2096
Matrices in zigzag scanning order; no permutation in libva backend
(kernel hantro_mpeg2_dec_copy_qtable handles zigzag-to-raster)
Decision: when iqmatrix_set is false, populate from MPEG-2 spec
defaults (ISO/IEC 13818-2 Table 7-3) to avoid kernel rejecting a
batch missing the QUANTISATION control.
Clause 5 — Per-frame submission via v4l2_set_controls
Authority: existing src/h264.c:986 pattern
surface_object->request_fd binds controls to per-surface request
Clause 6 — config.c MPEG-2 case must break;
Authority: C semantics; H.264 case shape at config.c:62-63
Empirical anchor: Phase 3 Baseline B confirmed scratch-fix shape.
Diff scope:
src/config.c — 3 lines added (break for MPEG-2 cases) + drop stale
#include <mpeg2-ctrls.h>
src/mpeg2.c — full rewrite of mpeg2_set_controls against new split
API; ~120 lines replaced; switches from 2× v4l2_set_control(single)
to 1× batched v4l2_set_controls(3-control array)
include/mpeg2-ctrls.h — DELETE (staging-era, masks kernel UAPI)
src/picture.c, src/context.c, meson.build — no changes (verified
Phase 2 + Phase 3)
Phase 6 implementation order (3 logical commits):
Commit A: config.c break — substrate fix in isolation
Commit B: mpeg2.c rewrite + drop mpeg2-ctrls.h includes
Commit C: delete include/mpeg2-ctrls.h
Phase 7 verification harness (full Bash incantations included in plan body):
Criterion 1: vainfo MPEG-2 enumeration regression check
Criterion 2: vaCreateConfig SUCCESS via libva trace
Criterion 3: ffmpeg -hwaccel vaapi exit 0, no Failed-to-create
Criterion 4: mpv --hwdec=vaapi --vo=image at +02s seek; HW=SW
byte-identical hashes for 2 distinct frames (fallback to ffmpeg
hwdownload if mpv-vaapi also filters MPEG-2; criterion holds,
harness adapts)
Criterion 5: T4 H.264 hashes still f623d5f7... and 7d7bc6f2...
Bonus: byte-compare post-fix S_EXT_CTRLS payload vs Baseline C
Pre-identified Phase 7 → Phase 4 loopback triggers:
1. S_EXT_CTRLS EINVAL post-fix → check pic.reserved[5] memset,
struct sizes, flag value collisions
2. Pixel hash mismatch → check f_code packing, field/frame
interpretation, ref timestamps, IQ matrix order
3. mpv-vaapi filters MPEG-2 out (same as -copy) → fall-forward to
ffmpeg hwdownload pixel verify (criterion holds, harness adapts;
do not redefine criterion)
4. H.264 regression → re-locate the offending change in Bug 1
5. Header deletion breaks unaudited consumer → git grep audit
Out of scope (LOCKED): HEVC/VP9/VP8 (later iterations);
vaDeriveImage cache-stale fix; chromium-fourier 149 install; perf
metrics; long-duration stress; other MPEG-2 containers; mpv-hwdec
follow-up; context.c H.264 device-init EINVAL (auxiliary,
intentional); profile/chroma/progressive_sequence refinement;
upstream engagement.
Phase 5 entry: artifacts handover (no summary, raw bundle) per
feedback_dev_process.md — phase0_findings_iter1.md,
phase2_iter1_situation.md, phase3_iter1_baseline.md,
phase4_iter1_plan.md, plus phase0_evidence/2026-05-07/iter1_phase3/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b9625af278 |
iter1 Phase 3: baseline measurements — Phase 2 confirmed empirically
Four Phase 3 baselines captured on fresnel post-reboot 2026-05-08
00:39 CEST. SDDM watchpoint condition stayed green (greeter passed
cleanly on the new boot). All four baselines confirm Phase 2's
situation analysis empirically; one Phase 1 criterion needs minor
adjustment (Phase 3 → Phase 1 loopback per feedback_dev_process.md).
Baseline A — pre-patch failure mode (master tip 65969da):
ffmpeg -hwaccel vaapi -i bbb_720p10s_mpeg2.ts ... under strace +
LIBVA_TRACE captures the chain:
vaInitialize ret = SUCCESS
vaQueryConfigProfiles ret = SUCCESS
vaCreateConfig(profile=VAProfileMPEG2Main, entrypoint=VLD)
ret = VA_STATUS_ERROR_UNSUPPORTED_PROFILE
No V4L2 ioctls beyond ENUM_FMT probes from RequestQueryConfigProfiles.
Confirms Phase 2 Bug 1 (config.c:55-69 fall-through to default).
Baseline B — post Bug 1 scratch patch (the missing break added):
vaCreateConfig now returns SUCCESS. V4L2 setup proceeds:
CREATE_BUFS, QUERYBUF (40), REQBUFS, STREAMON, S_FMT, etc.
Then VIDIOC_S_EXT_CTRLS fails:
ioctl(/dev/video5, VIDIOC_S_EXT_CTRLS,
{ctrl_class=0xf010000,
count=1,
controls=[
{id=V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS,
size=56, ...}
]})
= -1 EINVAL
CID 0x9909fa (V4L2_CID_MPEG_BASE+250) doesn't exist on this kernel —
mainline removed it in favor of the split V4L2_CID_STATELESS_MPEG2_*
CIDs. Size 56 = sizeof(combined v4l2_ctrl_mpeg2_slice_params) from
the fork's local include/mpeg2-ctrls.h. Confirms Phase 2 Bug 2.
Auxiliary EINVAL: src/context.c:142-155 unconditionally sets H.264
device-wide controls (H264_DECODE_MODE, H264_START_CODE) on every
CreateContext, regardless of profile. EINVALs on hantro-vpu-dec
(no H.264 controls there). Intentional best-effort behavior —
return value is cast to (void) and discarded. Auxiliary, not iter1
scope.
Baseline C — cross-validator verbatim contract anchor:
ffmpeg -hwaccel v4l2request strace shows ONE batched call per frame:
ioctl(/dev/video5, VIDIOC_S_EXT_CTRLS,
{ctrl_class=0xf010000, // V4L2_CTRL_WHICH_REQUEST_VAL (the "which" field; strace labels it ctrl_class)
count=3,
controls=[
{id=0xa409dc, size=12, ...}, // SEQUENCE
{id=0xa409dd, size=32, ...}, // PICTURE
{id=0xa409de, size=256, ...} // QUANTISATION
]}) = 0
Field-by-field decode of frame 1 (I-picture):
SEQUENCE: 1280×720, vbv=0x151800, profile_level=0,
chroma_format=1, flags=PROGRESSIVE
PICTURE: back/fwd_ref_ts=0/0, flags=0x82
(FRAME_PRED_DCT|PROGRESSIVE), f_code=0xF×4 (I-frame
default), P_C_T=1 (I), structure=3 (FRAME),
intra_dc_precision=0
QUANTISATION: starts [8, 16, 16, 19, 16, 19, 22, 22, ...] —
canonical MPEG-2 default intra matrix in zigzag
scanning order.
Frame 2 (P-picture) shows real f_code values {{1,1},{15,15}}
and forward_ref_ts pointing to frame 1's timestamp. Confirms
Phase 2's claim that matrices arrive in zigzag order;
no permutation needed in the libva backend (kernel's
hantro_mpeg2_dec_copy_qtable handles zigzag-to-raster).
This is the iter1 contract anchor: every Phase 4 implementation
diff must produce a structurally indistinguishable
VIDIOC_S_EXT_CTRLS call.
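A minimal sketch of how the Baseline C anchor could be checked mechanically, assuming the (id, size) pairs of a captured batch have already been parsed out of a strace dump. The struct and function names are hypothetical and not part of the backend; only the three IDs and sizes come from the trace above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one control in a captured
 * VIDIOC_S_EXT_CTRLS batch: just the CID and payload size. */
struct sketch_ctrl_desc {
    uint32_t id;
    uint32_t size;
};

/* The three (id, size) pairs taken verbatim from Baseline C. */
static const struct sketch_ctrl_desc sketch_anchor[3] = {
    { 0xa409dc, 12 },  /* SEQUENCE */
    { 0xa409dd, 32 },  /* PICTURE */
    { 0xa409de, 256 }, /* QUANTISATION */
};

/* Returns 1 when the batch has the same count, order, IDs and sizes
 * as the anchor, 0 otherwise. Payload bytes are not compared here. */
int sketch_batch_matches_anchor(const struct sketch_ctrl_desc *ctrls,
                                size_t count)
{
    size_t i;

    if (count != 3)
        return 0;
    for (i = 0; i < 3; i++) {
        if (ctrls[i].id != sketch_anchor[i].id ||
            ctrls[i].size != sketch_anchor[i].size)
            return 0;
    }
    return 1;
}
```

Fed the Baseline B shape (one control, id 0x9909fa, size 56) the check fails; fed the Baseline C shape it passes. A full Phase 7 comparison would still need the byte-by-byte payload compare, which this sketch leaves out.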
Baseline D — H.264 regression check (Phase 1 criterion #5):
T4 reference hashes match exactly with scratch Bug 1 fix installed:
HW frame 1: f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9
SW frame 1: f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9
HW frame 2: 7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8
SW frame 2: 7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8
Bug 1 fix in isolation does not regress H.264.
Phase 1 criterion #3 needs adjustment (Phase 3 → Phase 1 loopback):
Original wording: "mpv --hwdec=vaapi-copy ... engages the backend"
Reality: mpv-vaapi-copy never loads libva for MPEG-2. mpv's hwdec
policy filters MPEG-2 out before libva is touched. Zero V4L2
ioctls, zero libva trace, silent SW fallback. Independent of
Bug 1 fix state.
Adjusted criterion #3 (proposed; locks alongside Phase 4 plan):
"ffmpeg -hwaccel vaapi -i bbb_720p10s_mpeg2.ts -frames:v 2
-f null - shows vaCreateConfig SUCCESS, no Failed to create
decode configuration lines, no EINVAL from VIDIOC_S_EXT_CTRLS,
exits 0 cleanly."
mpv-driven testing moves to a follow-up task (mpv hwdec-codecs
filter override), separate from iter1.
Other 4 Phase 1 criteria (vainfo regression, vaCreateConfig SUCCESS,
DMA-BUF GL pixel verify HW=SW, T4 H.264 regression) hold as locked.
Scratch state cleanup: scratch patch reverted, master backend
reinstalled, MPEG-2 fails again with vaCreateConfig=12 — back to
Baseline A state, no leak.
Phase 4 plan inputs:
- Diff scope: src/config.c (1 break), src/mpeg2.c (rewrite to
new API), include/mpeg2-ctrls.h (delete or empty). picture.c
+ context.c unchanged.
- Contract anchor: cite verbatim from
linux/v4l2-controls.h:1985-2105, FFmpeg
libavcodec/v4l2_request_mpeg2.c:130-155, kernel
drivers/media/platform/verisilicon/hantro_mpeg2.c, AND this
document's Baseline C verbatim payload.
- Phase 7 verification: re-run all 5 Phase 1 criteria
(with #3 adjusted), byte-by-byte compare post-fix
VIDIOC_S_EXT_CTRLS payload against Baseline C.
Evidence files:
Tracked (text):
phase3_iter1_baseline.md (writeup with verbatim raw output)
phase0_evidence/2026-05-07/iter1_phase3/baseline_A_ffmpeg/ffmpeg.stdout
phase0_evidence/2026-05-07/iter1_phase3/baseline_B_postbug1/ffmpeg.stdout
phase0_evidence/2026-05-07/iter1_phase3/baseline_C_xvalidator/ffmpeg.stdout
Gitignored (regenerable from re-run incantations in the writeup):
*.strace.* *.txt (ftrace) libva.trace.* (added the latter pattern)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
cc55a6e60a |
iter1 Phase 2: situation analysis — three bugs in MPEG-2 path
Phase 2 source-read of the libva-v4l2-request-fourier MPEG-2 path
on master tip 65969da identifies three independent bugs, all in
the libva backend (kernel + driver path proven solid by Phase 0
cross-validator sweep).
Bug 1 — fall-through to default in RequestCreateConfig
(src/config.c:55-69):
case VAProfileH264*:
// FIXME
break;
case VAProfileMPEG2Simple:
case VAProfileMPEG2Main:
case VAProfileHEVCMain:
default:
return VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
H.264 cases have a break, MPEG-2 + HEVC fall through to default.
This explains the vaCreateConfig: 12 (UNSUPPORTED_PROFILE) error
observed in Phase 0 cross-validator sweep for both codecs.
Likely history: H.264 was the libva-multiplanar focus through
iter1-iter5; the FIXME comment suggests profile-specific validation
logic was expected but never landed, so MPEG-2 stayed in the
fall-through bucket.
Fix shape: add break for MPEG-2 cases. HEVC stays in fall-through
(h265.c excluded from build per Phase 0 finding F-C; honest
UNSUPPORTED_PROFILE is correct until h265.c is reinstated in a
later iteration).
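The fall-through and the fix shape can be reproduced in isolation. This is a sketch only: the enum names are hypothetical stand-ins for VAProfile and VAStatus, and the real RequestCreateConfig does more than this switch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the libva profile and status types. */
enum sketch_profile {
    SKETCH_PROFILE_H264_MAIN,
    SKETCH_PROFILE_MPEG2_SIMPLE,
    SKETCH_PROFILE_MPEG2_MAIN,
    SKETCH_PROFILE_HEVC_MAIN
};

enum sketch_status {
    SKETCH_STATUS_SUCCESS = 0,
    SKETCH_STATUS_UNSUPPORTED_PROFILE = 12 /* the "12" seen in the traces */
};

/* Pre-fix shape: the MPEG-2 labels have no break of their own, so
 * control runs on into default and the profile is rejected. */
enum sketch_status sketch_create_config_before(enum sketch_profile p)
{
    switch (p) {
    case SKETCH_PROFILE_H264_MAIN:
        break;
    case SKETCH_PROFILE_MPEG2_SIMPLE:
    case SKETCH_PROFILE_MPEG2_MAIN:
    case SKETCH_PROFILE_HEVC_MAIN:
    default:
        return SKETCH_STATUS_UNSUPPORTED_PROFILE;
    }
    return SKETCH_STATUS_SUCCESS;
}

/* Post-fix shape: MPEG-2 gets a break; HEVC is left to fall through
 * on purpose, matching the "honest UNSUPPORTED_PROFILE" decision. */
enum sketch_status sketch_create_config_after(enum sketch_profile p)
{
    switch (p) {
    case SKETCH_PROFILE_H264_MAIN:
        break;
    case SKETCH_PROFILE_MPEG2_SIMPLE:
    case SKETCH_PROFILE_MPEG2_MAIN:
        break;
    case SKETCH_PROFILE_HEVC_MAIN:
    default:
        return SKETCH_STATUS_UNSUPPORTED_PROFILE;
    }
    return SKETCH_STATUS_SUCCESS;
}
```

Calling both variants with the MPEG-2 Main stand-in shows the difference: the first returns 12, the second returns 0, while HEVC returns 12 in both.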
Bug 2 — staging-era UAPI in mpeg2.c; mainline kernel removed it:
src/mpeg2.c uses:
V4L2_CID_MPEG_VIDEO_MPEG2_SLICE_PARAMS (V4L2_CID_MPEG_BASE+250 = 0x9909fa)
V4L2_CID_MPEG_VIDEO_MPEG2_QUANTIZATION (V4L2_CID_MPEG_BASE+251 = 0x9909fb)
Mainline kernel UAPI (include/uapi/linux/v4l2-controls.h:1985-2105):
V4L2_CID_STATELESS_MPEG2_SEQUENCE (CODEC_STATELESS_BASE+220 = 0xa409dc)
V4L2_CID_STATELESS_MPEG2_PICTURE (CODEC_STATELESS_BASE+221 = 0xa409dd)
V4L2_CID_STATELESS_MPEG2_QUANTISATION (CODEC_STATELESS_BASE+222 = 0xa409de)
Fresnel V4L2 inventory confirms kernel exposes the new IDs only.
The fork's local include/mpeg2-ctrls.h is the staging-era header
that masks the kernel's modern definitions.
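The hex IDs quoted above follow from the V4L2 convention that a control base is the class value OR'd with 0x900. The sketch below re-derives them; the class constants are copied by hand from the kernel UAPI header so the snippet builds without kernel headers, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Class values mirrored from <linux/v4l2-controls.h>. */
#define SKETCH_CTRL_CLASS_CODEC           0x00990000u /* old MPEG/codec class */
#define SKETCH_CTRL_CLASS_CODEC_STATELESS 0x00a40000u

/* A control ID is (class | 0x900) plus a per-control offset. */
uint32_t sketch_cid(uint32_t ctrl_class, uint32_t offset)
{
    return (ctrl_class | 0x900u) + offset;
}

/* Recover the class from a control ID (same mask the kernel's
 * V4L2_CTRL_ID2CLASS macro uses). */
uint32_t sketch_cid_class(uint32_t cid)
{
    return cid & 0x0fff0000u;
}
```

With these, offset 250 in the old class gives 0x9909fa and offsets 220 to 222 in the stateless class give 0xa409dc to 0xa409de, matching the values seen in the straces.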
Six structural changes from old to new API:
1. Slice header parsing moved to kernel — bit_size, data_bit_offset,
quantiser_scale_code GONE from new structs.
2. Reference timestamps moved from slice to picture
(forward_ref_ts, backward_ref_ts now in v4l2_ctrl_mpeg2_picture).
3. Boolean fields collapsed into v4l2_ctrl_mpeg2_picture.flags
bitmask (TOP_FIELD_FIRST, FRAME_PRED_DCT, CONCEALMENT_MV,
Q_SCALE_TYPE, INTRA_VLC, ALT_SCAN, REPEAT_FIRST, PROGRESSIVE).
4. progressive_sequence collapsed into
v4l2_ctrl_mpeg2_sequence.flags & V4L2_MPEG2_SEQ_FLAG_PROGRESSIVE.
5. PICTURE_CODING_TYPE renamed to PIC_CODING_TYPE
(V4L2_MPEG2_PICTURE_CODING_TYPE_X → V4L2_MPEG2_PIC_CODING_TYPE_X).
6. Quantisation load_* flags removed; matrices always present;
British spelling — quantiSation not quantiZation.
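Changes 2 and 3 can be illustrated with local mirrors of the new sequence and picture structs. This is a sketch under assumptions: the layouts and flag values are copied by hand from the mainline UAPI header, and the sketch_* names plus the input struct of booleans are hypothetical, not the backend's own types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of v4l2_ctrl_mpeg2_sequence / v4l2_ctrl_mpeg2_picture. */
struct sketch_mpeg2_sequence {
    uint16_t horizontal_size;
    uint16_t vertical_size;
    uint32_t vbv_buffer_size;
    uint16_t profile_and_level_indication;
    uint8_t chroma_format;
    uint8_t flags;
};

struct sketch_mpeg2_picture {
    uint64_t backward_ref_ts; /* reference timestamps now live here */
    uint64_t forward_ref_ts;
    uint32_t flags;           /* former booleans, packed as a bitmask */
    uint8_t f_code[2][2];
    uint8_t picture_coding_type;
    uint8_t picture_structure;
    uint8_t intra_dc_precision;
    uint8_t reserved[5];      /* must be zero */
};

/* Sizes must match the 12 and 32 seen in the Baseline C strace. */
_Static_assert(sizeof(struct sketch_mpeg2_sequence) == 12, "sequence size");
_Static_assert(sizeof(struct sketch_mpeg2_picture) == 32, "picture size");

/* Picture flag bits mirrored from the UAPI header. */
#define SKETCH_PIC_FLAG_TOP_FIELD_FIRST 0x01u
#define SKETCH_PIC_FLAG_FRAME_PRED_DCT  0x02u
#define SKETCH_PIC_FLAG_CONCEALMENT_MV  0x04u
#define SKETCH_PIC_FLAG_Q_SCALE_TYPE    0x08u
#define SKETCH_PIC_FLAG_INTRA_VLC       0x10u
#define SKETCH_PIC_FLAG_ALT_SCAN        0x20u
#define SKETCH_PIC_FLAG_REPEAT_FIRST    0x40u
#define SKETCH_PIC_FLAG_PROGRESSIVE     0x80u

/* Hypothetical input: one 0/1 field per former boolean. */
struct sketch_picture_bools {
    int top_field_first, frame_pred_frame_dct, concealment_mv;
    int q_scale_type, intra_vlc, alt_scan, repeat_first;
    int progressive_frame;
};

uint32_t sketch_picture_flags(const struct sketch_picture_bools *b)
{
    uint32_t f = 0;

    if (b->top_field_first)      f |= SKETCH_PIC_FLAG_TOP_FIELD_FIRST;
    if (b->frame_pred_frame_dct) f |= SKETCH_PIC_FLAG_FRAME_PRED_DCT;
    if (b->concealment_mv)       f |= SKETCH_PIC_FLAG_CONCEALMENT_MV;
    if (b->q_scale_type)         f |= SKETCH_PIC_FLAG_Q_SCALE_TYPE;
    if (b->intra_vlc)            f |= SKETCH_PIC_FLAG_INTRA_VLC;
    if (b->alt_scan)             f |= SKETCH_PIC_FLAG_ALT_SCAN;
    if (b->repeat_first)         f |= SKETCH_PIC_FLAG_REPEAT_FIRST;
    if (b->progressive_frame)    f |= SKETCH_PIC_FLAG_PROGRESSIVE;
    return f;
}

/* Zero the whole struct first so reserved[] is clean, then pack. */
void sketch_fill_picture(struct sketch_mpeg2_picture *pic,
                         const struct sketch_picture_bools *b,
                         uint8_t coding_type, uint8_t structure)
{
    memset(pic, 0, sizeof(*pic));
    pic->flags = sketch_picture_flags(b);
    pic->picture_coding_type = coding_type;
    pic->picture_structure = structure;
}
```

A frame-predicted progressive picture packs to 0x82, the same flags value that the cross-validator strace shows for the first I-picture.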
Quantisation matrix order: kernel doc says zigzag scanning order;
VAAPI VAIQMatrixBufferMPEG2 also stores in zigzag scanning order;
direct memcpy works. Kernel hantro_mpeg2.c does the
zigzag-to-raster permutation kernel-side
(hantro_mpeg2_dec_copy_qtable lines 12-26). No userspace
permutation needed in the libva backend (unlike FFmpeg, which
unwinds its internal idsp.idct_permutation order).
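Since both sides store the matrices in zigzag order, the userspace half reduces to straight copies. A minimal sketch, assuming a simplified VA-side struct: sketch_va_iq_matrix is a hypothetical stand-in for VAIQMatrixBufferMPEG2 with shortened field names, and the destination mirrors the four 64-byte arrays of the kernel's quantisation control. What the backend should do when a VA load_* flag is clear is not settled by this sketch; it copies unconditionally.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for VAIQMatrixBufferMPEG2. */
struct sketch_va_iq_matrix {
    int32_t load_intra, load_non_intra;
    int32_t load_chroma_intra, load_chroma_non_intra;
    uint8_t intra[64];
    uint8_t non_intra[64];
    uint8_t chroma_intra[64];
    uint8_t chroma_non_intra[64];
};

/* Local mirror of the kernel's MPEG-2 quantisation control. */
struct sketch_mpeg2_quantisation {
    uint8_t intra_quantiser_matrix[64];
    uint8_t non_intra_quantiser_matrix[64];
    uint8_t chroma_intra_quantiser_matrix[64];
    uint8_t chroma_non_intra_quantiser_matrix[64];
};

/* Size must match the 256 seen in the Baseline C strace. */
_Static_assert(sizeof(struct sketch_mpeg2_quantisation) == 256,
               "quantisation size");

/* Copy all four matrices as-is: no reordering, because source and
 * destination both use zigzag scanning order. */
void sketch_copy_quantisation(struct sketch_mpeg2_quantisation *dst,
                              const struct sketch_va_iq_matrix *src)
{
    memcpy(dst->intra_quantiser_matrix, src->intra, 64);
    memcpy(dst->non_intra_quantiser_matrix, src->non_intra, 64);
    memcpy(dst->chroma_intra_quantiser_matrix, src->chroma_intra, 64);
    memcpy(dst->chroma_non_intra_quantiser_matrix,
           src->chroma_non_intra, 64);
}
```

If the intra matrix on the VA side starts 8, 16, 16, 19, 16, 19, 22, 22, the kernel-side copy starts with the same eight bytes at the same indexes.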
Per-frame submission: FFmpeg reference (libavcodec/
v4l2_request_mpeg2.c:130-155) batches 3 controls in single
VIDIOC_S_EXT_CTRLS. Backend's v4l2_set_controls (src/v4l2.c:475)
already supports batching — used by iter6/7/8 H.264
(src/h264.c:986). MPEG-2 rewrite follows H.264's batched pattern.
Bug 3 — include/mpeg2-ctrls.h is the staging-era local header:
The fork's local include/mpeg2-ctrls.h is the staging-era header
that defines the old (removed) API. config.c:37 + mpeg2.c:38
include it via meson's include_directories('../include'). Should
be deleted (or emptied); rely on kernel <linux/v4l2-controls.h>
pulled transitively via <linux/videodev2.h>.
Things verified NOT to be bugs:
- src/picture.c MPEG-2 dispatch is fully wired:
- codec_store_buffer handles VAPictureParameterBuffer + VAIQMatrix
- codec_set_controls dispatches MPEG-2 to mpeg2_set_controls
- HEVC explicitly UNSUPPORTED_PROFILE (correct for build state)
- src/picture.c:287 unconditional h264.matrix_set=false reset is
benign for MPEG-2 (union aliasing puts it in mpeg2.picture or
.slice region; RenderPicture overwrites that byte before
mpeg2_set_controls reads anything).
- src/mpeg2.c field extraction from VAAPI structs is sound; only
the destination control IDs and struct shape need rewiring.
- src/v4l2.c batching API (v4l2_set_controls) is in place.
Open questions tabled for Phase 3 baseline:
1. Live ftrace of failing libva MPEG-2 attempt post Bug-1-fix
(verify expected EINVAL on VIDIOC_S_EXT_CTRLS for old CID).
2. VAAPI VAIQMatrixBufferMPEG2 matrix order from real mpv decode
(verify zigzag, no pre-permutation).
3. Cross-reference verbatim VIDIOC_S_EXT_CTRLS payload from
ffmpeg-v4l2request cross-validator anchor strace dump.
4. SDDM watchpoint resolution — fresnel SSH returned "No route to
host" at Phase 2 start (cause undetermined: network event, SDDM
regression, or operator power-state). Resolve before Phase 3.
Predicted iter1 outcome: small mechanical diff (config.c break
+ mpeg2.c rewrite + drop local mpeg2-ctrls.h). Phase 7 verification
should land all 5 Phase 1 boolean checks green on first or second
try. Likely Phase 7 → Phase 4 loopback triggers if any: forgotten
struct padding zero, garbage timestamps on first I-frame, or
device-state precondition we missed in hantro_mpeg2.c.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f720c7784b |
iter1 Phase 0 + Phase 1 lock: MPEG-2 boolean correctness on hantro
Iteration 1 of the campaign 8(+1)-phase loop opens following the
campaign Phase 0 close (
|
||
|
|
b74551bc56 |
phase 0 close: deliverables 5 + 6 — fixtures + cross-validator anchor
Closes Phase 0 for fresnel-fourier. Per-codec test fixtures and
cross-validator contract traces complete the campaign-locked
boolean-correctness baseline.
Deliverable #5 — per-codec test fixtures (test_fixtures.md):
Generated 4 new fixtures on fresnel from the bbb_1080p30_h264.mp4
master via stock ffmpeg (libx265 ultrafast, libvpx-vp9 speed 5,
mpeg2video, libvpx vp8). All 720p 10s 8-bit yuv420p — matching the
silicon-supported profile/pixfmt for each codec on RK3399:
bbb_720p10s_hevc.mp4   620 KB  (HEVC Main, rkvdec target)
bbb_720p10s_vp9.webm   3.4 MB  (VP9 Profile 0, rkvdec target)
bbb_720p10s_mpeg2.ts   5.3 MB  (MPEG-2 Main, hantro-vpu-dec target)
bbb_720p10s_vp8.webm   2.4 MB  (VP8, hantro-vpu-dec target)
Encode wall times on fresnel: HEVC 13s, VP9 93s, MPEG-2 6s, VP8 26s.
H.264 master is 725 MB carryover from libva-multiplanar /
fourier_attribution.
Deliverable #6 — cross-validator anchor (cross_validator_traces.md):
phase0_findings.md named chromium-fourier 149 as the
cross-validator; that package isn't installed on fresnel and
marfrit-packages isn't configured (no auto-install path tonight).
Substituted ffmpeg -hwaccel v4l2request as a better-fit
cross-validator: it's an independent V4L2 client (uses no libva at
all, lives in libavcodec/v4l2_request*.c), already on the box (stock
ffmpeg n8.1-13-gb57fbbe50c, the Kwiboo v4l2-request-n8.1 branch),
and implements all 5 codecs the campaign locked.
Headline finding: ALL 5 CODECS WORK end-to-end via the kernel direct
path on RK3399.
ffmpeg -hwaccel v4l2request -i bbb_<codec>.<ext> -frames:v 2 -f null -
H.264: exit 0
HEVC: exit 0
VP9: exit 0
MPEG-2: exit 0
VP8: exit 0
The Linux kernel + rkvdec + hantro-vpu drivers are solid for the
entire campaign codec scope. Phase 6 work scope is purely
libva-backend code — no kernel patches, no upstream Linux engagement.
Per-codec libva (iter8) vs ffmpeg-v4l2request status sweep:
H.264 libva: PASS (T4 PASS + bit-exact pixel verify) | ffmpeg-v4l2req: PASS
HEVC libva: vaCreateConfig=12 (UNSUPPORTED_PROFILE) | ffmpeg-v4l2req: PASS
→ src/h265.c is excluded in src/meson.build but src/config.c:151
enumerates HEVCMain via V4L2_PIX_FMT_HEVC_SLICE probe;
vaCreateConfig fails downstream of the case match.
VP9 libva: profile not enumerated | ffmpeg-v4l2req: PASS
→ no vp9.c in fork
MPEG-2 libva: vaCreateConfig=12 (UNSUPPORTED_PROFILE) | ffmpeg-v4l2req: PASS
→ mpeg2.c IS compiled, config.c:64-65 has the case statements, yet
vaCreateConfig rejects. Phase 2 source-read needed.
VP8 libva: profile not enumerated | ffmpeg-v4l2req: PASS
→ no vp8.c in fork
Suggested Phase 6 iteration order (subject to Phase 1 lock):
iter1: MPEG-2 — likely cheapest (config.c-level path; mpeg2.c
already compiled)
iter2: HEVC — re-enable h265.c in build, audit against rkvdec
iter3: VP8 — implement vp8.c on hantro
iter4: VP9 — implement vp9.c on rkvdec (largest control surface)
Per-codec ioctl frequency anchor (2-frame ffmpeg -hwaccel v4l2request):
ioctl                      H.264  HEVC  VP9  MPEG-2  VP8
VIDIOC_DQBUF                  45    49   40      26   49
VIDIOC_QBUF                   22    24   20      10   20
VIDIOC_CREATE_BUFS            17    17   17      12   17
VIDIOC_QUERYBUF               15    15   15      10   15
VIDIOC_S_EXT_CTRLS            13    14   11       5   10
VIDIOC_EXPBUF                 11    11   11       6   11
VIDIOC_QUERY_EXT_CTRL          0     5    0       0    0
MEDIA_IOC_REQUEST_ALLOC        4     4    4       4    4
DMA_BUF_IOCTL_SYNC             0     0    0       4    0
MEDIA_REQUEST_IOC_REINIT       0     0    0       0    3
Architectural divergence ffmpeg-v4l2request vs
libva-v4l2-request-fourier:
- ffmpeg uses VIDIOC_EXPBUF + DMA-BUF for downstream readback. Our
libva backend uses cached mmap via vaDeriveImage — the iter1
patch-0011 cache-stale bug class. Phase 4 work item consistent
with T4's finding: adding VIDIOC_EXPBUF + DMA-BUF-backed image
export to the libva backend would fix the cache-coherency issue
identified in T4's H.264 readback.
- ffmpeg uses 4 request_fds pooled. Our backend uses 16 (iter6
per-OUTPUT-slot binding). Both valid; different pool depth.
- HEVC alone needs VIDIOC_QUERY_EXT_CTRL for hevc_slice_params
dynamic-array introspection — unique among the 5 codecs.
Substrate change deferred (not a Phase 0 blocker):
chromium-fourier 149 install on fresnel is Phase 1+ work. When
done, a follow-up trace pass per codec will cross-check
ffmpeg-v4l2request and chromium contracts. For Phase 0 baseline,
ffmpeg-v4l2request is the anchor.
Phase 0 fully closed. Six deliverables landed. Phase 1 lock can
proceed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
d8a9903ef4 |
phase 0 deliverable 4: H.264 baseline trace — PASS boolean correctness
H.264 hardware decode on RK3399 / rkvdec / libva-v4l2-request-fourier
@ master tip 65969da (iter8 Phase 4) verified bit-exact correct against
software reference, when read via the cache-safe DMA-BUF GL import path.
Test method:
- mpv --hwdec=vaapi --vo=image (DMA-BUF + EGL_EXT_image_dma_buf_import
+ glReadPixels + JPEG encode — cache-coherency-safe per the iter1
patch-0011 lesson).
- Decoded 2 frames at +30s seek (mid-content bunny motion, not BBB
intro fade-in) so size + content variation is genuine.
- Compared HW JPEGs vs SW reference JPEGs (same mpv invocation with
--hwdec=no).
Result:
HW frame 1 sha256 = f623d5f7... (651,726 bytes) byte-identical
SW frame 1 sha256 = f623d5f7... (651,726 bytes) to SW reference
HW frame 2 sha256 = 7d7bc6f2... (630,433 bytes) byte-identical
SW frame 2 sha256 = 7d7bc6f2... (630,433 bytes) to SW reference
Frames 1 vs 2 differ in size — real content change captured.
Phase 0 boolean-correctness criterion for H.264: PASS.
Contract trace:
The V4L2 + media-request ioctl sequence per H.264 frame is the
canonical iter6/iter7 pattern:
S_EXT_CTRLS (CODEC_STATELESS class, request_fd=N)
QBUF CAPTURE_MPLANE index=K
QBUF OUTPUT_MPLANE index=K (compressed slice)
MEDIA_REQUEST_IOC_QUEUE (request_fd=N)
MEDIA_REQUEST_IOC_REINIT (request_fd=N) ← per-OUTPUT-slot reuse
DQBUF OUTPUT_MPLANE index=K
DQBUF CAPTURE_MPLANE index=K
REINIT-before-DQBUF works because the kernel completes decode in
~0.6 ms (request → COMPLETE state), and mainline
media_request_ioctl_reinit accepts both IDLE and COMPLETE. iter7
cap_pool instantiates 24 slots cleanly: "v4l2-request: cap_pool_init:
24 slots ready" in mpv stdout.
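The REINIT ordering argument can be modelled as a small state machine. This is a toy model under assumptions, not kernel code: it keeps only three of the kernel's request states, and the sketch_* names are hypothetical. It captures the one rule the paragraph above relies on, namely that reinit succeeds from IDLE or COMPLETE and is refused while the request is still queued.

```c
#include <assert.h>
#include <errno.h>

/* Reduced set of request states; the kernel has more. */
enum sketch_req_state {
    SKETCH_REQ_IDLE,
    SKETCH_REQ_QUEUED,
    SKETCH_REQ_COMPLETE
};

struct sketch_request {
    enum sketch_req_state state;
};

/* Models MEDIA_REQUEST_IOC_QUEUE: only an idle request can be queued. */
int sketch_request_queue(struct sketch_request *r)
{
    if (r->state != SKETCH_REQ_IDLE)
        return -EBUSY;
    r->state = SKETCH_REQ_QUEUED;
    return 0;
}

/* Models the driver finishing the decode for a queued request. */
void sketch_request_complete(struct sketch_request *r)
{
    assert(r->state == SKETCH_REQ_QUEUED);
    r->state = SKETCH_REQ_COMPLETE;
}

/* Models MEDIA_REQUEST_IOC_REINIT: accepted from IDLE or COMPLETE,
 * refused with -EBUSY while the request is still in flight. */
int sketch_request_reinit(struct sketch_request *r)
{
    if (r->state != SKETCH_REQ_IDLE && r->state != SKETCH_REQ_COMPLETE)
        return -EBUSY;
    r->state = SKETCH_REQ_IDLE;
    return 0;
}
```

In this model a reinit issued between queue and completion returns -EBUSY, which is why the observed sequence depends on the decode finishing before the REINIT call arrives.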
No EINVAL, no EBUSY, no errors observed across 5 frames. iter4's
frame-11 EINVAL bug from libva-multiplanar does not reproduce on
RK3399 in this short window (longer-run repro is Phase 1+ work).
Side finding — cache-stale readback bug present in libva-backend's
vaDeriveImage path on RK3399:
When pixels are read via the cached-mmap path (libva's vaDeriveImage
+ vaMapBuffer, used by ffmpeg -hwaccel vaapi -hwaccel_output_format
nv12), readback is corrupted in exactly the iter1 patch-0011 pattern:
size=6,220,800 bytes (correct: 2 × 1920×1080×1.5 NV12)
non-zero=544 (0.009%)
pattern: 16 consecutive non-zero bytes at every 1920-byte row stride,
rest of buffer reads as zero
diff vs SW reference: 100% of bytes differ, MAE=53.3 per byte
This is the canonical stale-cached-mmap pattern. Kernel writes real
pixels (proven by DMA-BUF GL import readback succeeding), but the
libva backend's image-export path returns a cached pointer without
the correct cache-invalidation incantation. Userspace reads stale
all-zero memory punctuated by whichever cache lines happened to fetch
post-write.
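The size and non-zero figures above can be re-derived with a few lines of arithmetic. A sketch with hypothetical helper names; it assumes 8-bit NV12 with even dimensions and an already loaded readback buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes in one 8-bit NV12 frame: a full-size luma plane plus a
 * half-size interleaved chroma plane (assumes even width/height). */
size_t sketch_nv12_bytes(size_t width, size_t height)
{
    return width * height + (width * height) / 2;
}

/* Count bytes that are not zero in a readback buffer. */
size_t sketch_count_nonzero(const uint8_t *buf, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (buf[i] != 0)
            n++;
    }
    return n;
}

/* Share of non-zero bytes, in percent. */
double sketch_nonzero_percent(size_t nonzero, size_t total)
{
    assert(total > 0);
    return 100.0 * (double)nonzero / (double)total;
}
```

Two 1920×1080 frames come to 6,220,800 bytes, and 544 non-zero bytes out of that total is roughly 0.009%, in line with the figures reported for the corrupted readback.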
Phase 4 work item: audit whether the iter1 patch-0011 cache-flush
fix is present, effective, or RK3399-routing-bypassed. Three
possibilities: (a) fix landed for RK3568 but cache topology differs
on RK3399, (b) fix is gated on something that's not true on RK3399,
or (c) RK3399 V4L2_MEMORY_MMAP page protection bypasses the flush.
Not gating Phase 0 — kernel-side decode is correct.
Phase 1+ binding cells must use the DMA-BUF GL import path for pixel
verification, not vaDeriveImage / cached-mmap. The iter1 lesson
restated: cached-mmap readback is unreliable on this hardware family.
Evidence files (under phase0_evidence/2026-05-07/h264_baseline_trace.md
and h264_baseline/):
- mpv.stdout — libva log, vaapi-copy engaged, cap_pool_init
- h264_baseline_trace.md — full writeup with re-run incantations
- mpv.strace.* (gitignored) — 19 per-thread ioctl/openat traces
- ftrace_v4l2.txt (gitignored) — kernel qbuf/dqbuf events
- merged_ioctls.tsv (gitignored) — time-sorted V4L2/MEDIA/DRM
ioctls across all threads
- *.jpg (gitignored) — HW vs SW JPEG comparison artefacts
- frames_hw_cached_readback.nv12 (gitignored) — broken nv12
readback for forensic reference
gitignore: extended extension list (jpg, png, nv12, yuv, tsv, strace*).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
60e62da666 |
phase 0 corrections: hantro RK3399 has no H.264; substrate is iter8 master
Two empirical corrections to the morning-of-2026-05-07 phase 0 lock,
based on V4L2 inventory + iter8 fork build smoke captured this evening
on fresnel (kernel 6.19.9-99-eos-arm).
Correction 1 — hantro-vpu-dec on RK3399 does not advertise H.264.
phase0_findings.md morning lock claimed both rkvdec (/dev/video3)
and hantro-vpu-dec (/dev/video5) advertise H.264. Empirical
v4l2-ctl --list-formats-out shows hantro-vpu-dec exposes only
MG2S (MPEG-2) and VP8F (VP8) — no S264. Likely carryover from
RK3568 (ohm) hantro, which does support H.264; the RK3399 hantro
kernel variant in drivers/media/platform/verisilicon/ registers
a different codec list. Fix:
- README.md hardware-target table: drop "+ H.264" from Decoder
block 2.
- README.md decode-side surface-area paragraph: note hantro is
MPEG-2 + VP8 only and that there is exactly one bind for H.264
(rkvdec).
- phase0_findings.md mechanism table: drop H.264 from /dev/video5
row; correct DT compatible to rockchip,rk3399-vpu (the actual
parent device compatible — sysfs reports rockchip,rk3399-vpu,
not rockchip,rk3399-vpu-dec which is just the v4l2 card type
string).
- phase0_findings.md "H.264 lands on both blocks" sentence:
inverted to "H.264 lands only on rkvdec".
- phase0_findings.md Open Question #2 (two-block H.264 routing):
marked RESOLVED 2026-05-07 evening (null). Single bind, no
routing decision, one test cell per codec.
Empirical evidence: phase0_evidence/2026-05-07/v4l2_inventory_findings.md
(distilled from v4l2_inventory.txt — the latter is gitignored as
raw data, regenerable via the v4l2-ctl invocation documented in
the findings file).
Correction 2 — substrate is iter8 master (65969da), not iter5-end.
phase0_findings.md morning lock framed the substrate as "iter5-end
fork." That was true on 2026-05-05 (iter5 close); between then and
the 2026-05-07 fresnel-fourier scaffold libva-multiplanar continued
through iter6 (per-OUTPUT-slot REINIT request_fd binding), iter7
(slot-leak fix, cap_pool harness, msync verify harness, OUTPUT-pool
teardown), and iter8 (perf binding cell harness, RK3566/3568 doc
fix). Building from master tip 65969da inherits all the iter6-iter8
hardening at zero cost. Fix:
- phase0_findings.md substrate paragraph: strikethrough the
"iter5-end" framing, add corrected paragraph naming master
tip 65969da and listing what iter6/7/8 added.
- phase0_findings.md top-of-doc: add an "Empirical corrections
2026-05-07 evening" callout linking to the evidence files,
so a reader spotting the locked-vs-corrected mismatch knows
where the empirical update came from.
Empirical evidence: phase0_evidence/2026-05-07/iter8_build_smoke.md
(clean build, vainfo profile enumeration, HEVC anomaly write-up).
What's preserved on purpose:
The strikethrough rendering in phase0_findings.md keeps the original
locked text visible alongside the correction — campaign convention
treats locks as historical record, not editable state. A reader
landing on the file from a deep link sees both the morning's
intent and the evening's empirical update. Git history has the
clean diff if anyone wants the original without strikethrough.
What's not changed:
The codec scope in the locked research question stays correct in
count — five codecs (H.264 + HEVC + VP9 + MPEG-2 + VP8). The
routing table changes (H.264 → rkvdec only; MPEG-2 → hantro only;
no shared block) but the boolean-correctness pass/fail criterion
per codec is unaffected. Phase 1 lock can proceed on the corrected
map without re-opening scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|