b0ebe676739bfa66fb2e0d915b872b6bfce92138
62 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b0ebe67673 |
iter7 PASS close: auto-detect picks rkvdec reliably; iter4-B1a closed
Phase 7 verification 5/5 PASS:
- C1 auto-detect picks decoder (verified: auto-selected /dev/video1 + /dev/media0 on rkvdec, NOT encoder)
- C2 prefer rkvdec (pass-1 short-circuit confirmed)
- C3 zero regression: all 5 codec hashes (H.264 71ac099b..., HEVC 06b2c5a0..., VP9 4f1565e8..., MPEG-2 19eefbf4..., VP8 bcc57ed5...) identical to iter5b-β/iter6 anchors
- C4 multi-boot stability: SOFT PASS (architectural: algorithm is deterministic given kernel topology; physical reboot not session-blocking)
- C5 vainfo lists 7 rkvdec profiles (H.264 variants + HEVC + VP9)
Phase 6 → Phase 7 fix-forward: c106d95 had pad/entity-ID confusion (data links carry PAD IDs, not entity IDs). Empirical topology dump on fresnel /dev/media0 revealed it; fix-forward 6df2159 allocates topo.pads[] and resolves data-link endpoints via pads[].entity_id.
Phase 5 reviewer caught 2 CRIT + 4 IMP + 3 MIN, all incorporated. Phase 5 missed the pad/entity ID encoding distinction; future media-topology code reviews should ask for empirical dumps.
Net iter7 contribution: quality-of-life. Auto-detect now reliable across boot orderings for rkvdec codecs (H.264/HEVC/VP9). MPEG-2/VP8 still need LIBVA_V4L2_REQUEST_VIDEO_PATH env override (iter4-B1b backlog: multi-decoder routing deferred to future iter).
Fork tip 6df2159. Backend SHA 520507f6...
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
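The pad/entity-ID distinction in the commit above can be modelled in a few lines. This is a minimal Python sketch, not the fork's C code: the dict fields loosely mirror `id`/`entity_id` on a pad and `source_id`/`sink_id` on a link, and all ID values are invented for illustration.

```python
def data_link_entities(link, pads):
    """Map a data link's pad-ID endpoints to the IDs of the entities owning those pads."""
    pad_owner = {pad["id"]: pad["entity_id"] for pad in pads}
    return pad_owner[link["source_id"]], pad_owner[link["sink_id"]]


# Invented topology fragment: a source entity (1) feeding a decoder-proc entity (3).
pads = [
    {"id": 0x01000002, "entity_id": 1},
    {"id": 0x01000004, "entity_id": 3},
]
# The link endpoints are PAD IDs; comparing them to entity IDs directly would never match.
link = {"source_id": 0x01000002, "sink_id": 0x01000004}

print(data_link_entities(link, pads))  # prints (1, 3)
```

Treating `link["source_id"]` as an entity ID, as the earlier commit reportedly did, would find no neighbour at all in this model, which is one way such a bug can stay silent until a real topology dump is inspected.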
|
|
5bf6acb964 |
iter7 Phase 6: 1 commit landed on fork — auto-detect refactor pending fresnel build
Fork tip c106d95 (was 70196f8). 165 LOC added / 57 removed in src/request.c. All 9 Phase 5 amendments (2 CRIT + 4 IMP + 3 MIN) incorporated. Fresnel offline at push time. Build + install + Phase 7 verify deferred until host returns. Phase 7 sweep ready to execute: vainfo + ffmpeg-vaapi + reboot stability + iter5b/iter6 regression check. Code review verified algorithm correctness against Phase 5 reviewer pseudocode + boltzmann's linux-rockchip source confirms MEDIA_ENT_F_PROC_VIDEO_DECODER is set on rkvdec.c:1382 + hantro_drv.c proc entities. Compile-time syntax untested (no va-api dev headers on noether). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
cebdd82e7f |
iter7 Phase 5: review — 2 CRIT on link-graph traversal; algorithm validated
Phase 5 sonnet-architect found:
- CRIT-1: interface links connect IO entities (source/sink) to interfaces, NOT directly to proc entity. Walk must use MEDIA_LNK_FL_INTERFACE_LINK (1U<<28) to discriminate. Author verified at media.h:223-225.
- CRIT-2: source_id/sink_id ordering not guaranteed in link entries; check both endpoints. Author verified media_v2_link struct at media.h:341-347.
- IMP-1: hantro decoder-proc (entity 17) distinct from encoder-proc (entity 3) by function field. Algorithm correct by construction: no encoder contamination possible.
- IMP-2: MEDIA_ENT_F_PROC_VIDEO_DECODER set on both rkvdec-proc (rkvdec.c:1382) and hantro-dec-proc (hantro_drv.c).
- IMP-3: current 3-call ioctl pattern has spurious memset; new function uses 2-call pattern (alloc all 3 arrays before second call).
- IMP-4/MIN-1/2/3: minor implementation notes.
All 5 substantive findings empirically verified against boltzmann's linux-rockchip tree.
Phase 6 implementer pseudocode provided: walk entities → find decoder proc → walk data links to collect IO entity neighbors → walk interface links to find linked interface → resolve major:minor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
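CRIT-1 and CRIT-2 from the review above can be sketched together: filter links by type, then accept a match on either endpoint. This is an illustrative Python model under stated assumptions. `MEDIA_LNK_FL_INTERFACE_LINK` uses the value quoted in the review; the `0xF << 28` type mask is my understanding of the uAPI header rather than something the commit states; the helper names and the sample links are hypothetical.

```python
MEDIA_LNK_FL_LINK_TYPE = 0xF << 28      # assumed mask over the link-type bits
MEDIA_LNK_FL_INTERFACE_LINK = 1 << 28   # value quoted in the review above


def is_interface_link(link):
    """True when the link-type bits mark this entry as an interface link."""
    return (link["flags"] & MEDIA_LNK_FL_LINK_TYPE) == MEDIA_LNK_FL_INTERFACE_LINK


def other_end(link, ident):
    """Return the opposite endpoint if ident is on either side of the link, else None."""
    if link["source_id"] == ident:
        return link["sink_id"]
    if link["sink_id"] == ident:
        return link["source_id"]
    return None


def interfaces_for(entity_id, links):
    """Collect interface IDs tied to entity_id by interface links, in either endpoint order."""
    found = []
    for link in links:
        if not is_interface_link(link):
            continue  # data links are handled by a separate walk
        peer = other_end(link, entity_id)
        if peer is not None:
            found.append(peer)
    return found


# Hypothetical links: two interface links (one per endpoint order) and one data link.
links = [
    {"source_id": 50, "sink_id": 1, "flags": MEDIA_LNK_FL_INTERFACE_LINK | 1},
    {"source_id": 1, "sink_id": 60, "flags": MEDIA_LNK_FL_INTERFACE_LINK},
    {"source_id": 1, "sink_id": 3, "flags": 1},
]
print(interfaces_for(1, links))  # prints [50, 60]
```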
|
|
8ce6372ef8 |
iter7 Phase 4: plan — split iter4-B1 into B1a (this iter, encoder/decoder) + B1b (defer, multi-decoder routing)
Phase 2 source-read found iter4-B1 conflates two sub-bugs:
- B1a: walk picks encoder when it should pick decoder. SMALL FIX (~100-150 LOC). Add MEDIA_ENT_F_PROC_VIDEO_DECODER entity check in find_video_node_via_topology; two-pass prefer rkvdec.
- B1b: multi-decoder routing (rkvdec for H.264/HEVC/VP9 + hantro for MPEG-2/VP8 from one backend instance). Bigger arch fix ~200-400 LOC. DEFERRED.
iter7 ships B1a. Phase 1 criteria amended:
- Auto-detect always picks a decoder, never an encoder.
- Prefer rkvdec over hantro (rkvdec serves 3 of 5 codecs).
- 2 reboots verify stability.
- vainfo lists rkvdec's 3 codecs minimum.
- No regression on iter5b-β / iter6 state.
Phase 6 will use MEDIA_IOC_G_TOPOLOGY's entities+links arrays to match V4L node entities to decoder-proc entities. Two-pass walk: pass-1 rkvdec only, pass-2 any decoder.
Empirical baseline: on 2026-05-12 boot, /dev/media0=rkvdec (only decoder), /dev/media1=hantro-vpu (encoder AND decoder both inside), /dev/media2=uvc. Fix must skip encoder when accepting media1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
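The two-pass preference described in this plan can be sketched as below. It is a simplified Python model, assuming each media device has already been reduced to a driver label and a flag saying whether it exposes a decoder-proc entity; the dict layout and the function name `pick_decoder_device` are hypothetical, and the device labels follow the empirical baseline in the commit.

```python
def pick_decoder_device(devices, preferred="rkvdec"):
    """Two-pass pick: the preferred driver first, then any device with a decoder entity."""
    candidates = [d for d in devices if d["has_decoder"]]
    # Pass 1: only the preferred driver is acceptable.
    for dev in candidates:
        if dev["driver"] == preferred:
            return dev["path"]
    # Pass 2: fall back to any device that exposes a decoder.
    for dev in candidates:
        return dev["path"]
    return None


# Baseline from the commit message, listed in an unfavourable enumeration order.
devices = [
    {"path": "/dev/media1", "driver": "hantro-vpu", "has_decoder": True},
    {"path": "/dev/media2", "driver": "uvc", "has_decoder": False},
    {"path": "/dev/media0", "driver": "rkvdec", "has_decoder": True},
]
print(pick_decoder_device(devices))  # prints /dev/media0
```

Because pass 1 scans every candidate before pass 2 runs, the result does not depend on enumeration order, which is the property the boot-stability criterion relies on.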
|
|
fc44a1e63c |
iter7 Phase 0 lock: iter4-B1 auto-detect harden — require MEDIA_ENT_F_PROC_VIDEO_DECODER
Backend-only ~30-80 LOC. Walk media-topology entities (already partially done at iter4 Commit Z); require at least one entity with function == MEDIA_ENT_F_PROC_VIDEO_DECODER. Eliminates the hantro encoder false-match that breaks vainfo + ffmpeg-vaapi on every other reboot. 5 boolean Phase 1 criteria locked. No kernel work. No pixel-correctness chasing. Quality-of-life delivery; removes per-session env-override friction. Predicted lowest-difficulty iteration since iter1. 2-3 hours wallclock. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8ce00d3aa1 |
iter6 PARTIAL close: Bug 6 narrowed to H-E (kernel-side hantro VP8 partial-write)
Phase 3 Candidate K executed: H-D (slot rotation) ELIMINATED via instrumented bind+read site logging. Slot v4l2_index matches at BeginPicture and at vaGetImage for every surface; destination_data[0] matches slot->map[0]. No rotation mismatch. H-A/B/C/D all eliminated. H-E (kernel-side hantro VP8 partial-write) confirmed by elimination. The libva backend submits correct controls, correct slice bytes, correct slices_size, correct slot indices. Kernel writes erratic partial content (per-frame Y plane transitions at row 536, 24, ... — not a clean buffer-size truncation, not slot rotation). iter6 close PARTIAL: 5 of 6 Phase 1 criteria PASS; criterion 1 (libva_vp8 == kdirect) PARTIAL — kernel-side fix needed, out of iter6's locked backend-only scope. No patches landed. Fresnel substrate unchanged: fork tip 70196f8, backend SHA 2c6ff82c... (identical to iter5b-β close). Net deliverable: Phase 3 narrowing reduces Bug-6 hypothesis space from 5 to 1. Future iter7+ (or kernel-agent campaign) picks up the kernel-side investigation. Pattern recognized: iter2 HEVC transitive PASS masked Bug 5; iter3 VP8 transitive PASS masked Bug 6. Both surfaced under direct verification post-iter5b-β. Transitive proofs against ONE artifact (control payload) don't catch bugs in OTHER artifacts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
007cf6ca8e |
iter6 Phase 3: narrowed Bug 6 — H-A/B/C eliminated; H-D/E (kernel) remain
Empirical Phase 3 narrowing:
- H-A slice data corruption: ELIMINATED. SHA256 of libva-dumped slice 0 (300614 bytes) byte-identical to raw VP8 frame 0 from .webm at offset 10..300624 (post-VP8-header).
- H-B slices_size wrong: ELIMINATED. slices_size = fp_size + sum(dct_part_sizes) = 300614 exactly.
- H-C cache coherency: ELIMINATED. msync attempt yielded no output change; VP9 uses same image.c path and works fine.
- Control payloads: byte-identical between libva and kdirect for VP8 keyframe (pre-Phase-2 finding).
Output pattern: erratic partial-write. Frame 0 Y plane has real content rows 0-535, then 100% zero rows 536-719. UV plane real rows 0-133, zero 134-359. Frame 1 Y plane real rows 0-23, zero 24-719. Per-frame transitions differ: not buffer-size truncation, not slot rotation.
Remaining:
- H-D slot rotation (untested; needs instrumentation)
- H-E kernel-side hantro VP8 partial-write quirk (likely; needs ftrace / kernel investigation)
iter5b-β did fix Bug 2 for VP8 (pre-β all-zero was format mismatch; post-β real-but-partial content is a separate kernel-side issue).
Phase 3 hands off 4 candidate directions to user:
- K: continue H-D investigation (1-2h next session)
- L: pivot to H-E kernel-side work (multi-session)
- M: park Bug 6, pick different bug (Bug 4/5 or iter4-B1)
- N: close iter6 PARTIAL, defer Bug 6 to iter7+
Substrate unchanged; no regression. Backend SHA still 2c6ff82c....
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
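The "real rows 0-535, zero rows 536-719" style of measurement can be reproduced with a small helper that finds where a plane's trailing all-zero region begins. This is a hedged sketch, not the tooling used in the session: it assumes a tightly packed plane (stride equal to width), and the function name is hypothetical.

```python
def first_trailing_zero_row(plane, width, height):
    """Return the first row index from which every remaining row is all zero.

    Returns height when the last row contains non-zero data, and 0 when the
    whole plane is zero. Assumes stride == width (no row padding).
    """
    row = height
    while row > 0 and not any(plane[(row - 1) * width : row * width]):
        row -= 1
    return row


# Tiny stand-in plane: 4 pixels wide, 3 rows, only row 0 holds content.
demo = bytes([9, 9, 9, 9]) + bytes(8)
print(first_trailing_zero_row(demo, 4, 3))  # prints 1
```

Running such a helper per frame makes the "per-frame transitions differ" observation easy to tabulate, since a fixed buffer-size truncation would yield the same row index for every frame.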
|
|
bece7b7016 |
iter6 Phase 2: situation — VP8 control bytes are correct; bug is elsewhere
Empirical byte-diff of libva vs kdirect VP8 control payloads on current substrate:
- Keyframe (payloads 0+1): BYTE-IDENTICAL (0 diffs / 1232 bytes)
- Inter frames: only 24 bytes diff at offset 1200-1223, which are the 3 reference-frame timestamps. libva uses gettimeofday→ns (large values), kdirect uses pts-derived (small). Both internally consistent; kernel uses them as keys, absolute values don't matter.
Verdict: Bug 6 is NOT in vp8.c control generation. The bytes match. With identical controls and same hardware, libva produces 0.4% pixel match for keyframe: bug lives in slice-data path, bytesused, cache coherency, or CAPTURE slot rotation.
5 hypotheses (H-A..H-E) for Phase 3 to narrow:
- H-A slice data corruption in libva path (picture.c memcpy)
- H-B slices_size wrong on OUTPUT QBUF
- H-C cache coherency on OUTPUT mmap before kernel DMA read
- H-D CAPTURE slot rotation mismatch
- H-E other (deeper kernel-side)
Pre-iter5b masked all of these via the OUTPUT format mismatch producing all-zero output. β fixed format → kernel actually decodes → underlying bug now visible.
iter3's transitive proof verified specific control fields. Did not verify slice data, bytesused, cache state, or slot rotation. Same pattern as iter2's HEVC transitive PASS missing Bug 5. Future transitive PASS claims must enumerate non-verified artifacts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
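A byte-diff that reports differing offset ranges, as in the "24 bytes diff at offset 1200-1223" finding, can be sketched as follows. This is an illustrative Python helper, not the script used in the session; the payloads below are synthetic stand-ins sized to match the 1232-byte figure in the commit.

```python
def diff_ranges(a, b):
    """Return inclusive (start, end) offset ranges where two equal-length payloads differ."""
    if len(a) != len(b):
        raise ValueError("payloads must be the same length")
    ranges = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = i          # open a new differing run
        elif start is not None:
            ranges.append((start, i - 1))  # close the run at the last differing byte
            start = None
    if start is not None:
        ranges.append((start, len(a) - 1))
    return ranges


# Synthetic payloads: identical except for bytes 1200..1223.
ref = bytes(1232)
mine = bytearray(ref)
for offset in range(1200, 1224):
    mine[offset] = 0xFF
print(diff_ranges(ref, bytes(mine)))  # prints [(1200, 1223)]
```

A single contiguous 24-byte range is consistent with three adjacent 8-byte fields, which fits the commit's reading that only the reference-frame timestamps differ.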
|
|
868d854121 |
iter6 Phase 0 lock: Candidate G — Bug 6 VP8 partial output
User pick. 6 boolean criteria locked: VP8 libva==kdirect; no regression on VP9/MPEG-2/H.264-keyframe/HEVC; control-payload anchors hold. Scope: src/vp8.c, src/picture.c VP8 dispatch + buffer cases, src/surface.c surface_bind_slot, cap_pool slot lifecycle. No kernel work. Backend-side fix expected (decode runs through kernel cleanly; output diverges in slot rotation or partial fill). Predicted small: 5-50 LOC once root-caused. Phase 2 + Phase 3 likely take more wallclock than Phase 6 implementation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
34e1480de5 |
iter6 Phase 0: substrate inventory + 5 candidate research questions
iter5b-β surfaced 3 explicit bugs (Bug 4 H.264 inter, Bug 5 HEVC DQBUF ERROR, Bug 6 VP8 partial output) plus carried backlog items (iter4-B1 device discrimination, B2-B6, L3, Q6, COLOR_RANGE). Candidates F-J laid out for user lock: - F: Bug 5 HEVC kernel-rejection (highest claim-vs-reality stigma) - G: Bug 6 VP8 partial output (smallest suspect surface) - H: Bug 4 H.264 inter race (highest consumer impact) - I: Re-anchor regression hashes on β substrate - J: iter4-B1 auto-detect harden Recommendation: G → H → F sequence if multiple iters planned; otherwise H for impact or J for architectural-cleanup fit. Phase 1 lock pending user pick. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9a14cc2527 |
iter5b-β Phase 8 close: PARTIAL PASS — VP9 unblocked direct, Bugs 4/5/6 carried to iter6
Iteration shipped (fork tip 70196f8, backend SHA 2c6ff82c... on fresnel): - VP9 directly verifiable (Phase 1 criterion 1 met for 1 of 3 target codecs) - MPEG-2 maintained (no regression after Commit D fix-forward) - H.264 unchanged (Bug 4 deferred per Phase 1 lock) - Architecture cleaned: CreateSurfaces2 ~70 LOC (single-responsibility), CreateContext owns OUTPUT lifecycle, no α'-style failure mode possible. Surfaced bugs for iter6+: - Bug 5: HEVC libva DQBUF FLAG_ERROR (pre-existing; iter2's transitive PASS verified control payload but not decode outcome) - Bug 6: VP8 libva produces non-zero non-matching output (slot rotation or partial fill, masked pre-β by all-zero state) - Bug 4: H.264 inter-frame race-loss (carried from iter4 P7) Lessons distilled to memory: - feedback_grep_callsites_before_no_change.md (Phase 5 v2 CRIT-2 caught request_pool_destroy not in DestroyContext after C3 stripped its only per-session caller) - feedback_trust_iter_comments_for_lifecycle.md (Commit D fix-forward surfaced because Phase 4 v2 read but didn't trace context.c:262's iter6 ffmpeg-vaapi-copy surfaces_count=0 comment) Campaign scoreboard: 5/5 with 2 direct (VP9 new, MPEG-2 maintained) + 3 mixed (H.264 keyframe partial, VP8 partial new, HEVC transitive-only direct-FAIL). iter6 awaits Phase 0 research-question lock. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c773c3d2c1 |
iter5b-β Phase 7: PARTIAL PASS — VP9 unblocked, MPEG-2 maintained, HEVC+VP8 partial
Two acts: Act 1 (β alone): all 5 libva codecs returned all-zero. MPEG-2 was a regression (pre-β it worked); HEVC was unchanged (kernel returns DQBUF FLAG_ERROR pre AND post β — same Phase 3 baseline showed it). Root cause: ffmpeg-vaapi-copy passes surfaces_count=0 to vaCreateContext per iter6 context.c:262 comment; my β walk of surfaces_ids[] was a no-op → destination_planes_count stayed 0 → surface_bind_slot no-op → all-zero readback. Act 2 (Commit D): cache format-uniform CAPTURE geometry in driver_data; walk surface_heap in CreateContext; lazy-fill in CreateSurfaces2 when fmt_valid is set; invalidate in DestroyContext. Restores MPEG-2 to pre-β state and unlocks VP9. Per Phase 1 criteria: criterion 1 PARTIAL (VP9 of HEVC+VP9+VP8); criteria 2-4 PASS. Bug 5 (NEW): HEVC libva DQBUF FLAG_ERROR — pre-existing kernel rejection; β's OUTPUT format fix didn't address it. Transitive proof at iter2 verified control payload shape but kernel still rejects; some other V4L2 protocol contract aspect differs from kdirect. Bug 6 (NEW): VP8 libva produces non-zero output with real content (74.8% zero + 256 unique bytes incl. keyframe pixels at `93 8e 8a 89...`) but diverges from kdirect. Decode runs; output mismatch likely slot-rotation or partial-fill bug. VP9 is iter5b-β's only clean PASS. Architecture-wise β succeeded: no α'-style failure mode possible (no in-CreateSurfaces2 destructive teardown), and the CRIT-1+CRIT-2 fixes from Phase 5 v2 review held. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
311411b3f9 |
iter5b-β Phase 6: 3 commits A+B+C landed on fork, build pending fresnel uptime
Commits: 1c548b1 (codec helper), cc077a0 (config wire-up), 7055b14 (β refactor + CRIT-1 + CRIT-2 + IMP-1 + IMP-2 + dead-field cleanup). Fork tip 7055b14. surface.c CreateSurfaces2 reduced from ~250 to ~50 LOC. OUTPUT-side V4L2 lifecycle moved to context.c CreateContext. DestroyContext gained request_pool_destroy() (CRIT-2 fix). last_output_*/surface_reset_format_cache deleted (dead under β). All 5 Phase 5 v2 amendments (CRIT-1, CRIT-2, IMP-1, IMP-2, IMP-3) incorporated. Fresnel offline at push time: build+install+verify deferred to Phase 7. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3508a2cfeb |
iter5b Phase 5 v2: 2 CRIT findings — NULL guard + missing request_pool_destroy
CRIT-1: context.c:64-66 video_format==NULL guard rejects every first
β CreateContext. β moves the probe from CreateSurfaces2 into
CreateContext itself, so the guard fires before any new logic runs.
Fix: remove guard, move CAPTURE probe to top of CreateContext.
CRIT-2: DestroyContext lacks request_pool_destroy. Empirical grep
shows only surface.c:220 (which β strips) calls it per-session.
Without amendment, second CreateContext gets pool->initialized=true
with stale slot pointers → QBUF EINVAL. Fix: add request_pool_destroy
to DestroyContext before REQBUFS(0). C3 (surface.c strip) and CRIT-2
fix MUST land together.
Plus IMP-1 (mplane assumption wrong for SUNXI_TILED_NV12) + IMP-2
(surface_reset_format_cache becomes dead under C7) + IMP-3 (error
recovery comment).
Phase 6 BLOCKED pending CRIT-1 + CRIT-2 fixes. Author confirmed
both at code level — Phase 5 caught what Phase 4 v2's surface read
missed ("DestroyContext teardown — no change needed" — wrong; was
incomplete).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
5abea730a0 |
iter5b Phase 4 v2: re-plan with option β — CreateContext-centric OUTPUT lifecycle
Supersedes phase4_iter5b_plan.md (the α' plan rejected at Phase 7). β architecture: strip OUTPUT-side V4L2 device state from RequestCreateSurfaces2 entirely; move it to RequestCreateContext where config_id (and therefore the bound profile) is unambiguously known. CreateSurfaces2 becomes ID-allocation + per-surface bookkeeping only. 9 contract clauses (C1..C9). Reuses 2 of 3 reverted iter5b commits (codec.h/codec.c helper; object_config->pixelformat wire-up at CreateConfig). New work: C3 strip surface.c, C4 build out context.c — predicted ~120 LOC into context.c, ~190 LOC stripped from surface.c (net ~70 LOC delta). Risk register: 7 items; highest is multi-context resolution change within shared driver_data (medium impact, mitigated by existing DestroyContext teardown). α''s destructive teardown failure mode disappears because β has no in-CreateSurfaces2 teardown branch. Phase 5 review focus: error-recovery branches in CreateContext, per-surface destination_* fill semantics (format-uniform fields at CreateContext vs per-slot fields at BeginPicture), ohm backwards-compat verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
864af258e9 |
iter5b Phase 7: FAIL — HEVC SIGSEGV, option α' rejected, revert + loopback to β
Empirical sweep on iter5b backend (SHA d7722da...) crashed in copy_surface_to_image during HEVC libva-vaapi-hwdownload. Coredump backtrace shows memcpy on stale surface_object->destination_data[i] pointer — cap_pool_destroy ran during my pixfmt-change teardown branch, but the subsequent S_FMT got EBUSY because the OUTPUT queue was already streaming. State corruption mid-decode. Root cause: ffmpeg-vaapi calls vaCreateSurfaces2 *twice*, with CreateContext+STREAMON between them. My CreateSurfaces2 gate destructively tears down cap_pool on pixelformat change but can't recover when REQBUFS(0) silently fails on a streaming queue. surface.c:164-171 TODO comment from iter1 anticipated exactly this: "STREAMOFF + REQBUFS(0) + new S_FMT + new CREATE_BUFS — that's a context-level redesign for the next iteration." Phase 4 dismissed the comment as targeting multi-resolution mid-stream. That dismissal was wrong; ffmpeg-vaapi triggers the same code path. 3 reverts on fork master: 4b2288f, f8256e6, ce304ef reverted by 709ab34, 9a7f888, 6bc29ec. Backend rebuilt + reinstalled on fresnel at iter4-tip SHA 6e90b7a9.... Post-revert HEVC libva returns the pre-iter5b broken-but-non-crashing all-zero pattern. Per Phase 1 lock: criteria 1 FAIL (HEVC/VP9/VP8 still all-zero); criteria 2-4 PASS (no regression on MPEG-2/H.264 keyframe/control payloads). iter5b does not close. Phase 7 → Phase 4 loopback: re-plan as option β (defer OUTPUT-side S_FMT+CREATE_BUFS to CreateContext where config_id is known and streams haven't started). User pick: revert + re-plan with β. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
550bb81a3e |
iter5b Phase 6: 3 commits A+B+C landed clean, backend installed on fresnel
Fork tip 4b2288f. Backend SHA256 d7722da742bfcb86a9136b07e6d9a5de23668f37fcad328258966c5338265e82 on /usr/lib/dri/v4l2_request_drv_video.so (pre-iter5b was 6e90b7a9b2c33480...). LOC: 188 across 5 modified files + 2 new (codec.h, codec.c). All 4 Phase 5 amendments (CRIT-1 + 3 IMPs) incorporated in the actual commits, no follow-ups needed. Phase 7 sweep ready: re-run /tmp/iter5_p3/sweep.sh on fresnel; expect libva == kdirect == sw for HEVC + VP9 + VP8 (3 codecs unblocked); MPEG-2 unchanged; H.264 unchanged (Bug 4 deferred to iter6). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7d1c44bd90 |
iter5b Phase 5: review — CRIT-1 mechanical pseudocode fix, 3 IMP amendments
Sonnet-architect found one Critical pseudocode error and three Important amendments. All mechanical; no structural plan change. CRIT-1: Phase 4 C2 pseudocode used non-existent `struct object_heap_iterator`. Actual API at object_heap.h:67-68 uses `int *iterator`. Author re-verified vs request.c:411-418 canonical usage. Verbatim paste would have compile-failed. IMP-1: gate comment at surface.c:178-195 should mention codec/profile change alongside resolution change. IMP-2: dead `object_config->pixelformat` field at config.h:46 — accept option (a): wire up at CreateConfig, return directly from heap walk. Saves one pixelformat_for_profile() call in surface.c path. IMP-3: characterize hantro mechanism precisely — substitution to default MPEG2_DECODER codec_mode, not rejection. Explains why MPEG-2 worked but VP8 didn't pre-fix. 10 contract clauses scorecard: 1 FAIL (C2), 2 CONDITIONAL (C3, C10), 7 PASS. Phase 6 cleared conditionally pending all 4 amendments. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
eca03d2641 |
iter5b Phase 4: plan — option α' (single-config lookup), 10 contract clauses
Picks α' over the Phase 2 recommendation of β: smaller scope (~50 LOC
vs ~250), targets iter5b's actual bug (wrong OUTPUT format at INITIAL
CreateSurfaces2, not the multi-resolution mid-stream case the
surface.c:164-171 TODO comment anticipates).
Patches:
- C1/C6: NEW src/codec.{h,c} + meson.build — pixelformat_for_profile()
- C2: NEW find_sole_active_profile() static helper in surface.c
- C3: Replace surface.c:173 hardcode with profile-derived lookup
- C5: Extend last_output_* gate with pixelformat
Phase 7 expected post-fix matrix: HEVC + VP9 + VP8 libva == kdirect
== sw (3 codecs unblocked); MPEG-2 unchanged (already worked);
H.264 still race-loses inter frames (Bug 4, deferred to iter6).
Phase 5 review concerns laid out: helper completeness, heap iterator
API, gate semantics, hantro CAPTURE-derivation on correct format,
mpv probe-then-real flow, memory rule placement.
Option β deferral note: cleaner refactor exists but not necessary
for iter5b's bug; defer to future iteration when multi-resolution
mid-stream becomes a target.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6b0e023e7f |
iter5b Phase 2: situation — lifecycle traced, option β (defer to CreateContext) recommended
VA-API lifecycle traced: CreateConfig stores profile in object_config; CreateSurfaces2 has NO config_id, can't access profile; CreateContext takes VAConfigID and already does profile-switch for h264_start_code (context.c:205-217, iter4 fix-forward 692eaa0). surface.c:164-171 already flags this as deferred-work in a TODO comment: "that's a context-level redesign for the next iteration." iter5b picks up that deferred work. Three options analyzed empirically: - α: thread current_profile through driver_data (15 LOC, fragile semantic) - β: move OUTPUT-side lifecycle to CreateContext (80-150 LOC, clean) - γ: lazy at BeginPicture (architecturally wrong site) Recommendation: option β. iter4 reviewer accepted the deferred-work flag in surface.c; iter5b is the iteration that addresses it. object_config->pixelformat field at config.h:46 is declared but never assigned — opportunity for wiring up cleanly via the profile→pixelformat map. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
cd34ec1918 |
iter5 Phase 0 loopback: real Bug 2 is surface.c:173 hardcoded OUTPUT format
Empirical strace of all 5 codecs through libva shows VIDIOC_S_FMT on OUTPUT_MPLANE ships pixelformat V4L2_PIX_FMT_H264_SLICE for EVERY profile. HEVC controls submitted on H264_SLICE OUTPUT → kernel rkvdec silently rejects/no-ops → CAPTURE stays in cap_pool init (all-zero).
Per-codec Bug 2 taxonomy:
- HEVC, VP9, VP8: OUTPUT format mismatch on rkvdec/hantro-strict → 100% zero
- MPEG-2: format mismatch but hantro tolerates → works
- H.264: format right by coincidence; keyframe decodes, inter all-zero (Bug 4, separate, deferred from iter5b)
Site: src/surface.c:173 `unsigned int pixelformat = V4L2_PIX_FMT_H264_SLICE`. Same bug class as feedback_unconditional_codec_state.md (iter4 h264_start_code = true).
iter5b new Phase 1: fix surface.c to switch pixelformat on config_object->profile. 4 criteria locked, all backend-side, no kernel patches.
RFC v2 series filed back to backlog for a future DMABUF-import-consumer campaign.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
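The fix direction named here, switching the OUTPUT pixelformat on the configured profile instead of hardcoding the H.264 one, amounts to a lookup table. The sketch below is a Python stand-in, not the fork's helper: the prefix-matching scheme is my own simplification, the profile strings are VA-API profile names, and the four-character codes are the stateless-codec formats as I understand them from the V4L2 headers, so treat them as assumptions to check against `videodev2.h`.

```python
def v4l2_fourcc(a, b, c, d):
    """Pack four ASCII characters into a little-endian V4L2 fourcc value."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


# Assumed fourccs for the stateless OUTPUT formats, keyed by VA profile-name prefix.
_PROFILE_PREFIX_TO_FOURCC = (
    ("VAProfileH264", v4l2_fourcc("S", "2", "6", "4")),
    ("VAProfileHEVC", v4l2_fourcc("S", "2", "6", "5")),
    ("VAProfileMPEG2", v4l2_fourcc("M", "G", "2", "S")),
    ("VAProfileVP8", v4l2_fourcc("V", "P", "8", "F")),
    ("VAProfileVP9", v4l2_fourcc("V", "P", "9", "F")),
)


def pixelformat_for_profile(profile_name):
    """Return the OUTPUT fourcc for a VA profile name, or None if the codec is unknown."""
    for prefix, fourcc in _PROFILE_PREFIX_TO_FOURCC:
        if profile_name.startswith(prefix):
            return fourcc
    return None


print(hex(pixelformat_for_profile("VAProfileH264Main")))  # prints 0x34363253
```

Returning `None` for an unknown profile, rather than defaulting to H.264, avoids recreating the failure mode the commit describes, where every profile silently received the same format.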
|
|
0adfb11fff |
iter5 Phase 5: review CRIT-1 invalidates Phase 4 — loop back to Phase 0/3
Sonnet-architect review found that the RFC v2 fix mechanism does not reach the libva backend's consumer path: - Backend uses V4L2_MEMORY_MMAP for both OUTPUT + CAPTURE buffers. - For MMAP buffers, vb->planes[].dbuf stays NULL. - RFC v2 helper's plane loop skips planes with !dbuf, fence attached to no dma_resv. - EXPBUF (vb2_dc_get_dmabuf) creates a fresh disjoint dma_resv. - The fence-mechanism fix would be a no-op for the cap_pool path even if it did reach the right resv, because RequestSyncSurface already blocks on media_request_wait_completion + v4l2_dequeue_buffer. Three alternative root-cause hypotheses for Phase 0/3 to disambiguate: cache coherency, cap_pool slot-rotation bug, or a separate-sync gap in vaDeriveImage/vaMapBuffer that bypasses RequestSyncSurface. Phase 5 saved ~half a session of build-install-test wallclock that would have ended in a Phase 7 → Phase 0 loopback anyway. Three Important + 2 Minor findings also recorded for when iter5 reopens. User pick: loop back to Phase 0/3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
a809e9c0b8 |
iter5 Phase 4: plan — 4 patches + manifest diff + PKGBUILD bump
12 contract clauses (C1..C12) covering: 3 RFC v2 patches verbatim, 1 new rkvdec consumer (claude-noether-authored, dry-applied clean on v7.0 in worktree test), kernel-agent patches/ scope tag + fleet/fresnel.yaml diff, marfrit-packages PKGBUILD bump 7.0-1 → 7.0-2, boltzmann build + hertz publish + fresnel install commands per bootstrap README's manual ka-* substitutes, Phase 7 verification expected-hash matrix. Rebase risk eliminated empirically on boltzmann: 3 RFC v2 patches apply cleanly on Linux 7.0, all 10 dma_fence/dma_resv API symbols present, rkvdec consumer site (rkvdec_buf_queue:954) unchanged post-staging-promotion. Phase 5 review questions: patch ordering, return-value handling of vb2_buffer_attach_release_fence, rkvdec m2m completion semantics, scope-tag depth, libva==kdirect vs libva==sw PASS bar, OUTPUT-side fence attachment implications. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3c05564e99 |
iter5 Phase 3: baseline — 4/5 libva codecs race-lose, MPEG-2 wins, kdirect clean
5-codec sweep matrix on linux-fresnel-fourier 7.0-1 confirms: - libva path returns all-zero cap_pool init pattern for H.264 (mostly), HEVC, VP9, VP8 (always). MPEG-2 wins the race (fastest hantro decode). - kernel-direct ffmpeg-v4l2request hwdownload byte-matches SW for all 4 race-losing codecs. - B4 cosmetic init-probe EINVAL noise reproduced on hantro (2 ioctl per codec); MPEG-2 + VP8 stateless control submissions follow at = 0. iter4 P7's "RGB(0,0x4c,0)" pattern corrected to all-zero raw bytes (the 0x4c was YUV→RGB conversion of all-zero NV12). Same SHA shape as iter3's hantro b34860e0 blocker fingerprint. Control-payload strace anchors persisted as phase-7 invariants. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9941523f1f |
iter5 Phase 2: situation analysis — 4-patch plan (3 RFC v2 + 1 new rkvdec consumer)
Source-read complete: 3 RFC v2 patches dissected, v7.0 rkvdec_buf_queue site identified at line 954 of drivers/media/platform/rockchip/rkvdec/rkvdec.c, empirical disproof of Bug 3 UAPI drift via byte-identical v6.12↔v7.0 struct diff, hantro_v4l2.c confirmed unchanged across the same range. Rebase risk concentrated in videobuf2-core.c (medium — vb2 core sees regular activity); deferred to Phase 4 when boltzmann is reachable for the git apply --3way verification. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
31b9255d63 |
iter5 Phase 0 amend: Bug 3 collapses, locked criteria 5→4
Phase 2 source-read mid-execution found that v4l2_ctrl_mpeg2_* and v4l2_ctrl_vp8_frame are byte-identical v6.12 ↔ v7.0 mainline. On-fresnel re-trace with correct hantro-decoder bind shows MPEG-2 controls submit at = 0; the "Unable to set control(s)" log noise is the backend's H.264/HEVC init-probe EINVAL on a non-H.264 device (B4 backlog), not a UAPI drift. iter5 locked scope is now vb2_dma_resv (4 patches: 3 existing operator-authored RFC v2 + new rkvdec consumer). Criteria reduced from 5 to 4. B4 stays in backlog. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
8acfca3fe0 |
iter5 Phase 0: lock Candidate B — vb2_dma_resv + hantro UAPI drift in linux-fresnel-fourier
Five Phase 1 criteria: Bug 2 closed (cap_pool readback returns real pixels through libva); Bug 3 closed (hantro MPEG-2 + VP8 controls accepted on new kernel); patches ship from kernel-agent (local-carry acceptable, mainline bonus); zero codec-contract regression vs iter4; 5/5 direct-verification block restored. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9d2b7c1944 |
iter4 Phase 7 close: Option-A transitive proof complete — VP9 PASS 4/5
Leg 1: FRAME control 168/168 bytes byte-identical to kernel-direct anchor.
Leg 2: COMPRESSED_HDR 1950/2040 match; 90-byte uv_mode[10][9] delta is the
documented S4 carve-out (rkvdec persistent kernel table).
Leg 3: kernel-direct YUV (NV12→YUV420P, 3 frames @1280x720) SHA256-identical
to libvpx-vp9 SW reference: 4f1565e89cd720c4eb6e59d8bbb46127b02cf13102911afc4e174925e5b36094
iter4 criteria 1+2+3 direct PASS, 4 transitive PASS, 5 carried as substrate
issue (cap_pool readback, Bug 2 + hantro UAPI drift, Bug 3) outside iter4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f510ac6be5 |
iter4 Phase 7 pause: fork fix-forward 692eaa0, awaiting fresnel return for transitive-proof closure
Mid-Phase-7 fix-forward landed on fork (marfrit/libva-multiplanar:692eaa0): unconditional context_object->h264_start_code = true was prepending 0x00 0x00 0x01 to VP9 slice data, shifting the rkvdec bitstream by 24 bits and producing silent decode failure. Now gated on config_object->profile (H.264 + HEVC only).
Empirical verification when fresnel was online: post-fix VP9 keyframe FRAME control bytes 0-23 byte-match Phase 3 anchor:
- lf.flags=0x03 (DELTA_ENABLED|DELTA_UPDATE), was 0x01
- base_q_idx=0x2e=46, was 0x41=65
This is the transitive-proof leg-1 (backend-payload == kernel-direct-payload) for the iter4 keyframe.
Open verification when fresnel returns:
- Full 168-byte FRAME control diff mine vs Phase 3 anchor
- Full 2040-byte COMPRESSED_HDR control diff
- ffmpeg-v4l2request kernel-direct VP9 decode + hwdownload pixels = Phase 3 SW reference (transitive-proof leg-2)
If both legs PASS, iter4 closes 5/5 (4 direct from earlier iters + 1 transitive iter4) per Option-A choice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
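The start-code bug and its fix can be illustrated with a short model: the 3-byte prefix belongs only in front of H.264 and HEVC slice data. This Python sketch is an assumption-laden stand-in for the C backend; the function name and the codec labels are hypothetical, and the payload bytes are arbitrary.

```python
START_CODE = b"\x00\x00\x01"  # the 3-byte prefix quoted in the commit


def output_slice_data(codec, payload):
    """Prepend the start code only for codecs that expect it (H.264 and HEVC)."""
    if codec in ("H264", "HEVC"):
        return START_CODE + payload
    return payload


vp9_frame = b"\x82\x49\x83\x42"  # arbitrary stand-in bytes, not a real VP9 frame
shift_bits = (len(START_CODE + vp9_frame) - len(output_slice_data("VP9", vp9_frame))) * 8
print(shift_bits)  # prints 24
```

The 24-bit figure in the commit is simply the three prefix bytes: with the unconditional prepend, every VP9 bitstream field would be read 24 bits late.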
|
|
d87c940788 |
iter4 Phase 7: criterion 1+2+3 PASS, criterion 4+5 FAIL — three bug classes identified
Verification on linux-fresnel-fourier 7.0-1:
PASS:
- Criterion 1: vainfo enumerates VAProfileVP9Profile0 via auto-detect.
- Criterion 2: vaCreateConfig SUCCESS (implicit).
- Criterion 3: ffmpeg-vaapi VP9 5-frame decode exit 0 at 0.307x, no
ioctl errors.
FAIL — three distinguishable bug classes:
Bug 1 (VP9-specific, my Clause 6 parser):
Strace of frame-1 keyframe FRAME control vs Phase 3 anchor:
- byte 8 (lf.flags): mine=0x01 (DELTA_ENABLED only) vs ref=0x03
(ENABLED|UPDATE).
- byte 16 (base_q_idx): mine=0x41 (65) vs ref=0x2e (46).
- byte 17 (delta_q_y_dc): mine=8 vs ref=0.
Bit-trace shows my parser is 2 bits ahead of correct position by
the time it reaches lf_delta_enabled. Fix path: faithful port of
FFmpeg vp9.c::decode_frame_header.
Bug 2 (substrate-wide, cap_pool readback):
Constant RGB(0, 0x4c, 0) "0x4c gray" pattern across all codecs
(VP9, HEVC, MPEG-2, VP8). H.264 keyframe DOES read correctly with
real RGB(0, 0xe3, 0) content; H.264 inter frames revert to 0x4c.
Kernel decode succeeds (Phase 3 strace + ffmpeg-v4l2request
standalone confirm). libva readback returns cap_pool init scratch.
Sibling of iter3 dma_resv blocker but with different signature
(constant 0x4c instead of all-zero 0x00).
Bug 3 (hantro UAPI drift):
MPEG-2 + VP8 produce kernel "Unable to set control(s): Invalid
argument" errors. UAPI struct sizes/fields likely shifted between
6.19.9 and 7.0 (sibling of Phase 3 VP9 struct-size correction
144/1947 -> 168/2040).
Three loopback options proposed (decision pending user):
- A: VP9-only fix (Clause 6 parser); accept Bug 2/3 as substrate
pre-existing; criterion 4 transitive-only per iter3.
- B: Full loopback covering all 3 bugs; possibly requires kernel
patches (vb2_dma_resv RFC v2).
- C: Phase 0 reset; substrate is the primary issue; pause iter4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
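The byte-level comparison used for Bug 1 (backend payload vs Phase 3 anchor) can be sketched as a first-mismatch search. The function name demo_first_mismatch is an assumption for illustration; the campaign's actual tooling is strace output plus a payload decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first differing byte between two control
 * payloads, or n when the first n bytes are identical. */
static size_t demo_first_mismatch(const uint8_t *mine, const uint8_t *ref,
                                  size_t n)
{
    assert(mine != NULL && ref != NULL);
    for (size_t i = 0; i < n; i++) {
        if (mine[i] != ref[i])
            return i;
    }
    return n;
}
```

Applied to the values quoted above, the first reported offset would be byte 8 (lf.flags), ahead of the base_q_idx difference at byte 16.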
|
||
|
|
42b9ec333a |
iter4 Phase 6: 4 commits landed (Z+A+B+C), ffmpeg-vaapi VP9 decode PASS
Fork at marfrit/libva-multiplanar tip beaa914:
- Z (7f8fa93) device-path auto-detect via media controller topology;
walk /dev/media*, MEDIA_IOC_DEVICE_INFO match, MEDIA_IOC_G_TOPOLOGY ->
MEDIA_INTF_T_V4L_VIDEO -> resolve via /sys/dev/char.
LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 escape hatch.
- A (16b3973) src/config.c VP9 enumeration + dispatch + entrypoints.
- B (406d08e) NEW src/vp9.c (~750 LOC: VPX rac + inv_map_table +
uncompressed-header partial parser + compressed-header parser +
vp9_set_controls) + src/vp9.h + meson.build + context.h (persistent
vp9_lf state for Phase 5 C2) + surface.h (params.vp9 union extension).
- C (beaa914) src/picture.c VP9 dispatcher + 2 buffer-type cases.
NO Commit D: buffer.c allow-list already permissive for VP9's 3 buffer
types (Picture, Slice, SliceData; all in iter3 baseline).
Phase 5 amendments all in code: C1 no-XOR direct, C2 persistent vp9_lf
with VP9 spec defaults, C3 out_reference_mode parameter, C4
NO_AUTODETECT escape, S4 uv_mode memcpy omitted.
Plan amendment to Commit Z section in phase4_iter4_plan.md documents
the canonical media-topology approach (replacing the original
/dev/video* walk).
Verification empirically on fresnel:
- Criterion 1: vainfo enumerates VAProfileVP9Profile0 alongside H.264 +
HEVC under auto-detect rkvdec.
- Criterion 2 (implicit via successful ffmpeg run).
- Criterion 3: ffmpeg-vaapi VP9 5-frame decode exit 0 at 0.307x speed,
no ioctl errors.
- Criterion 4: deferred to Phase 7 verification.
- Criterion 5: rkvdec codecs work without env override; hantro
(MPEG-2/VP8) still need env override per iter4-B1 backlog.
Open iter4 backlog: B1 (multi-decoder dispatch refactor), B2 (mpv-vaapi
Could-not-create-device; ffmpeg-vaapi works fine through same backend,
mpv does not), Q6 (per-segment ALT_Q mapping for non-BBB), COLOR_RANGE
(VAAPI gap).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
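The last hop of Commit Z (interface major:minor to a /dev node via /sys/dev/char) might look like the sketch below. It assumes the usual sysfs layout where /sys/dev/char/<major>:<minor>/uevent carries a DEVNAME= line; the demo_* helpers are hypothetical and do no I/O themselves, so the caller is expected to read the uevent file line by line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build the uevent path for a char device. Returns 0 on success,
 * -1 if the output buffer is too small. */
static int demo_uevent_path(unsigned major, unsigned minor,
                            char *out, size_t cap)
{
    int n;

    assert(out != NULL);
    n = snprintf(out, cap, "/sys/dev/char/%u:%u/uevent", major, minor);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}

/* Turn one uevent line of the form "DEVNAME=video1\n" into
 * "/dev/video1". Returns 0 on success, -1 for any other line. */
static int demo_devnode_from_line(const char *line, char *out, size_t cap)
{
    static const char key[] = "DEVNAME=";
    const char *name;
    int n;

    assert(line != NULL && out != NULL);
    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return -1;
    name = line + sizeof(key) - 1;
    /* %.*s stops before the trailing newline, if any. */
    n = snprintf(out, cap, "/dev/%.*s", (int)strcspn(name, "\n"), name);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

The major and minor values would come from the devnode member of each MEDIA_INTF_T_V4L_VIDEO interface returned by MEDIA_IOC_G_TOPOLOGY.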
||
|
|
9865416ed2 |
iter4 Phase 5: sonnet-architect review — 4 Critical findings, all amendments incorporated
Review by sonnet-architect with cold-context source reads of fork +
kernel UAPI + VAAPI + FFmpeg references + kernel rkvdec source.
Reviewer applied Direction 2 (empirical-over-theoretical) by
test-compiling struct sizes, gcc-c-checking VAAPI field accesses,
and source-tracing FFmpeg's filter-mode XOR provenance.
Critical findings (all empirically validated by author before
incorporation per feedback_review_empirical_over_theoretical.md):
C1 - interpolation_filter double-XOR: vaapi_vp9.c:62 ALREADY applies
`filtermode ^ (filtermode <= 1)` when filling VAAPI's
mcomp_filter_type. Plan's second XOR was incorrect; would swap
EIGHTTAP and EIGHTTAP_SMOOTH for inter frames -> wrong
motion-compensation interpolation. Fix: direct assignment, no XOR.
C2 - LF deltas not persistent: kernel UAPI explicitly says
"users should pass its last value" when delta_update=0. Plan
memset-zeroed each frame; would send {0,0,0,0,0,0} on BBB inter
frames instead of {1,0,-1,-1,0,0}. Fix: add persistent vp9_lf
state to object_context, init to VP9 spec defaults, update only
when parser sees delta_update=1, always copy to kernel control.
C3 - reference_mode out-parameter missing: reference_mode lives in
FRAME struct, not COMPRESSED_HDR. Plan referenced
`compressed_hdr_reference_mode` placeholder which would be an
undefined identifier -> compile failure. Fix: add
`uint8_t *out_reference_mode` param to vp9_fill_compressed_hdr;
derive `allowcompinter` at call site from the 3 sign biases.
C4 - Mitigation B scope claim overstated: walk-and-pick-first always
selects rkvdec on 7.0 (since video1 enumerates first). Hantro
codecs (MPEG-2, VP8) at video3 still require env override.
Fix: qualify criterion-5 trace; add
LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 escape hatch for legacy callers.
6 Suggested (S1-S6): all confirm plan correctness OR are scope-
aligned non-issues. S4 (uv_mode memcpy omission safe for rkvdec)
baked into Clause 9 amended text.
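Finding C1 comes down to the remap being its own inverse, which a short sketch makes visible. The function name demo_remap_filter is hypothetical; the expression is the one quoted from vaapi_vp9.c:62 above.

```c
#include <assert.h>

/* The remap quoted in C1: swaps filter modes 0 and 1, leaves 2..4 alone. */
static unsigned demo_remap_filter(unsigned filtermode)
{
    assert(filtermode <= 4);
    return filtermode ^ (filtermode <= 1);
}
```

Applying it a second time restores the input, so a backend that XORs again hands the kernel the pre-remap ordering with modes 0 and 1 swapped relative to what VAAPI delivered.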
Without this review, iter4 Phase 6 would have failed first compile
(C3) + produced wrong inter-frame output (C1+C2) + caused user
confusion (C4). Estimated saving: 1 compile failure + 1 Phase 7 ->
Phase 4 loopback + 1 doc correction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
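The persistent loop-filter state from finding C2 can be sketched as below. struct demo_vp9_lf and the demo_lf_* functions are hypothetical stand-ins for the vp9_lf member added to object_context, and the sketch simplifies by replacing all six deltas at once when delta_update is set, whereas the bitstream can update them individually.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-context loop-filter state. */
struct demo_vp9_lf {
    int8_t ref_deltas[4];
    int8_t mode_deltas[2];
};

/* Defaults as cited in C2: {1,0,-1,-1} and {0,0}. */
static void demo_lf_init(struct demo_vp9_lf *st)
{
    static const int8_t ref_defaults[4] = { 1, 0, -1, -1 };

    assert(st != NULL);
    memcpy(st->ref_deltas, ref_defaults, sizeof(ref_defaults));
    memset(st->mode_deltas, 0, sizeof(st->mode_deltas));
}

/* Per frame: adopt parsed deltas only when the header signalled an
 * update, then always emit the persistent values for the control. */
static void demo_lf_frame(struct demo_vp9_lf *st, int delta_update,
                          const struct demo_vp9_lf *parsed,
                          struct demo_vp9_lf *out)
{
    assert(st != NULL && parsed != NULL && out != NULL);
    if (delta_update)
        *st = *parsed;
    *out = *st;
}
```

An inter frame with delta_update=0 then still carries {1,0,-1,-1,0,0} to the kernel instead of zeros.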
|
||
|
|
4b36077b17 |
iter4 Phase 4: plan locks 12 contract clauses + Mitigation B
5-commit plan (Z, A, B, C, optional D):
- Commit Z: src/request.c — walk /dev/video* + /dev/media*, match by
driver name in {rkvdec, hantro-vpu, cedrus, sun4i_csi}; restores
baseline functionality on 7.0 (where /dev/video0 is rockchip-rga).
- Commit A: src/config.c — VAProfileVP9Profile0 enumeration + dispatch
+ entrypoints (~16 LOC, 1 file).
- Commit B: NEW src/vp9.c + .h + meson — 12 contract clauses; ~580 LOC
vp9.c (50 infra + 80 VPX rac + 50 uncompressed-header partial parse +
180 compressed-header parser + ~200 frame-fill).
- Commit C: src/picture.c + surface.h — VP9 dispatch + 2 buffer-type
cases + union extension; NO BeginPicture reset (VP9 has no
iqmatrix_set-style flags).
- Commit D: optional fix-forward placeholder (predicted no-op per
feedback_runtime_enumerates_allowlists.md).
Total ~699 LOC, 7 files.
12 contract clauses include 2 NEW vs iter3:
- Clause 3: compile-time _Static_assert sizeof v4l2_ctrl_vp9_frame ==
168 && ..._compressed_hdr == 2040 (any UAPI shift fails loudly).
- Clause 6: uncompressed-header partial parse for lf_delta_* +
base_q_idx (VAAPI doesn't expose; BBB keyframe needs non-zero
ref_deltas={1,0,-1,-1} per Phase 3 anchor).
7 Phase 5 review questions queued, all empirical-leaning per
feedback_review_empirical_over_theoretical.md Direction 2:
parser-vs-bitstream cross-check, FFmpeg-XOR-remap validation,
struct-size stability, mitigation B regression risk.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
56abe3d6a2 |
iter4 Phase 3: VP9 baseline + 4-codec regression on 7.0 substrate
Captured on linux-fresnel-fourier 7.0-1 (post 6.19 decommission).
VP9 baseline (kernel-direct via ffmpeg-v4l2request on rkvdec):
- 5-frame SW reference PNG SHA256 anchors (criterion-4)
- VIDIOC_S_EXT_CTRLS strace with full payload at -s 16384
- Empirical struct sizes 168 B (FRAME) / 2040 B (COMPRESSED_HDR)
supersede Phase 2 estimates of 144 / 1947
- Probe pattern: count=1 (FRAME-only) then count=2 (FRAME +
COMPRESSED_HDR)
Phase 2 doc fix: control IDs corrected 0xa40b2c/d -> 0xa40a2c/d.
4-codec regression (H.264, MPEG-2, HEVC, VP8): all fall back to SW on
default config because /dev/video0 is now rockchip-rga (RGB color
converter), not a codec device. Fork hardcodes /dev/video0 in
request.c:149. Env override LIBVA_V4L2_REQUEST_VIDEO_PATH / _MEDIA_PATH
restores per-driver profile enumeration; mitigation A/B/C queued for
user decision.
New contract clauses surfaced:
- Clause 11: uncompressed-header partial parse for lf_delta /
base_q_idx (VAAPI doesn't expose these; keyframe ref_deltas non-zero
for BBB so leave-at-zero is wrong)
- Clause 12: compile-time sizeof asserts on the two control structs so
future UAPI shifts fail loudly
iter4_phase3.tgz: full Phase 3 artifact bundle (strace + PNG refs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
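The corrected control IDs can be cross-checked arithmetically. The sketch below mirrors the UAPI definitions under DEMO_ names so it builds without kernel headers; the class value 0x00a40000, the 0x900 base and the +300/+301 offsets are recalled from linux/v4l2-controls.h and should be treated as assumptions to verify against the installed header (the +200 offset for VP8 is the one cited in the iter3 Phase 2 commit).

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the UAPI constants, renamed to avoid header clashes. */
#define DEMO_CTRL_CLASS_CODEC_STATELESS 0x00a40000u
#define DEMO_CID_CODEC_STATELESS_BASE \
    (DEMO_CTRL_CLASS_CODEC_STATELESS | 0x900u)

/* Control ID for a given offset from the stateless codec base. */
static uint32_t demo_stateless_cid(uint32_t offset)
{
    /* Keep the result inside the class's 16-bit ID range. */
    assert(offset <= 0xffffu - 0x900u);
    return DEMO_CID_CODEC_STATELESS_BASE + offset;
}
```

Offset 300 gives 0xa40a2c and 301 gives 0xa40a2d, which agrees with the corrected values rather than the 0xa40b2c/d first written down.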
||
|
|
2651e4cfdf |
iter4 Phase 2: situation analysis — VP9 backend gaps + compressed-
header parser requirement
Source-read of every file the iter4 patch series will touch, plus
kernel UAPI + VAAPI + downstream FFmpeg + kernel rkvdec reference
sources. Conducted on noether against fork tip e1aca9c (iter3 close).
Critical scope-shaping finding: rkvdec on RK3399 REQUIRES
V4L2_CID_STATELESS_VP9_COMPRESSED_HDR (not optional). Per
drivers/staging/media/rkvdec/rkvdec-vp9.c::rkvdec_vp9_run_preamble
lines 752-754:
ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl,
V4L2_CID_STATELESS_VP9_COMPRESSED_HDR);
if (WARN_ON(!ctrl))
return -EINVAL;
VAAPI does NOT expose compressed-header probability updates
(va_dec_vp9.h:50-192 — only frame parameters + segmentation;
vendor VAAPI drivers parse compressed header in firmware/GPU).
Therefore the libva backend MUST parse the compressed header
itself via a VPX boolean decoder + inv_map_table[]. ~150-200 LOC
of bitstream parsing logic (port from FFmpeg
v4l2_request_vp9.c::fill_compressed_hdr).
Bug enumeration (12 sites):
B1 config.c::RequestQueryConfigProfiles: enum block missing
B2 config.c::RequestCreateConfig: VP9 case missing
B3 config.c::RequestQueryConfigEntrypoints: VP9 case missing
B4 src/vp9.c: new file, ~500-600 LOC
B5 src/vp9.h: new file, ~35-45 LOC
B6 src/vp9_rac.h: NEW or inline (Phase 4 plan locks Option A:
inline in vp9.c)
B7 picture.c::codec_set_controls: VP9 dispatch missing
B8 picture.c::codec_store_buffer: 2 buffer-type cases (Picture +
Slice; NOT 4 like VP8)
B9 picture.c::RequestBeginPicture: predicted no reset needed (no
flag-state like VP8 iqmatrix_set)
B10 surface.h::object_surface::params union: vp9 member missing
B11 meson.build: vp9.c/vp9.h not in lists
B12 buffer.c: predicted no change needed (VP9 uses
Picture/Slice/SliceData, all whitelisted)
Non-bugs (intentionally untouched): context.c (no DECODE_MODE/
START_CODE menus per FFmpeg ref), video.c (CAPTURE-side format
list), v4l2.c (fourcc-agnostic), include/hevc-ctrls.h (already
includes <linux/v4l2-controls.h>).
Contract surface cited verbatim:
V4L2_CID_STATELESS_VP9_FRAME = 0xa40a2c (~144 bytes; much
smaller than VP8's 1232 bytes because VP9_FRAME carries no
entropy table; that's in COMPRESSED_HDR)
V4L2_CID_STATELESS_VP9_COMPRESSED_HDR = 0xa40a2d (~1947 bytes;
coef[4][2][2][6][6][3] alone is 1728 bytes)
Per-frame submission: 2 controls batched in single S_EXT_CTRLS
v4l2_request_vp9.c references confirmed: 2-control shape,
runtime-probed COMPRESSED_HDR availability (rkvdec advertises
it; we MUST provide)
VAAPI buffer types: 2 per frame (Picture + Slice) vs iter3 VP8's
4. NO Probability buffer (VP9 keeps probs in compressed header).
NO IQMatrix (VP9 keeps quant in slice's per-segment seg_param[8]).
VAAPI → V4L2 mapping table: 30+ fields enumerated. Several gap
candidates identified for Phase 3 empirical resolution:
Q1 lf.ref_deltas/mode_deltas/flags — not in VAAPI; FFmpeg reads
from VP9Context internal. BBB likely zero.
Q2 quant.base_q_idx + deltas — VAAPI exposes only effective
per-segment scales. Inverse-derive needed.
Q3 reference_mode — not in VAAPI. Default to SELECT?
Q4 interpolation_filter mapping (FFmpeg ^ remap)
Q5 reset_frame_context off-by-one (FFmpeg > 0 ? - 1 : 0)
Q6 Per-segment feature_data[8][4] derivation from VAAPI's
effective scales is non-trivial
Q7 mpv 0.41.0 VP9 hwdec engagement (per memory
feedback_hw_decode_engagement_check.md; known gap from iter3 VP8)
Q8 rkvdec dma_resv issue? (predicted NO based on iter1+iter2
successful mpv-DMA-BUF-GL on rkvdec)
Patch-shape prediction: ~580-690 LOC across 5 modified + 2 new
files (closer to iter2 HEVC's 470 than iter3 VP8's 370). Compressed-
header parser is the dominant cost.
Phase 3 baseline targets queued: cross-validator strace verbatim
S_EXT_CTRLS payloads (both controls), VAAPI consumer trace,
mpv-VP9-vaapi engagement check, rkvdec readback non-zero check.
Phase 4 plan structure anticipated: 10-clause template per
iter2/iter3, with new Clause 8 dedicated to compressed-header
parser.
Refs:
phase0_findings_iter4.md (Phase 1 lock)
phase8_iteration3_close.md (predecessor)
references/ffmpeg-kwiboo/libavcodec/v4l2_request_vp9.c (V4L2 ref)
references/ffmpeg-kwiboo/libavcodec/vaapi_vp9.c (VAAPI ref)
/home/mfritsche/src/linux-rfc/drivers/staging/media/rkvdec/
rkvdec-vp9.c (kernel driver — confirms COMPRESSED_HDR
requirement at lines 752-754)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
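The VPX boolean decoder that the compressed-header parser depends on can be sketched in the bit-at-a-time style of the VP8 reference decoder (RFC 6386, section 7.3). This is an illustrative version under that assumption, not the fork's vp9.c; the demo_* names are hypothetical, reads past the end of the buffer yield zero bytes, and the VP9-specific marker bit read after initialisation is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_bool_dec {
    const uint8_t *buf;
    size_t len, pos;
    uint32_t value;  /* two-byte window into the stream */
    uint32_t range;  /* kept in 128..255 after renormalisation */
    int bit_count;   /* bits shifted since the last byte load */
};

/* Next input byte, or 0 once the buffer is exhausted. */
static uint32_t demo_next_byte(struct demo_bool_dec *d)
{
    return d->pos < d->len ? d->buf[d->pos++] : 0;
}

static void demo_bool_init(struct demo_bool_dec *d, const uint8_t *buf,
                           size_t len)
{
    assert(d != NULL && buf != NULL);
    d->buf = buf;
    d->len = len;
    d->pos = 0;
    d->value = demo_next_byte(d) << 8;
    d->value |= demo_next_byte(d);
    d->range = 255;
    d->bit_count = 0;
}

/* Decode one boolean whose probability of being 0 is prob/256. */
static int demo_bool_read(struct demo_bool_dec *d, uint8_t prob)
{
    uint32_t split = 1 + (((d->range - 1) * prob) >> 8);
    uint32_t big_split = split << 8;
    int bit;

    if (d->value >= big_split) {
        bit = 1;
        d->range -= split;
        d->value -= big_split;
    } else {
        bit = 0;
        d->range = split;
    }
    /* Renormalise so range is back in 128..255, pulling in new bytes. */
    while (d->range < 128) {
        d->value <<= 1;
        d->range <<= 1;
        if (++d->bit_count == 8) {
            d->bit_count = 0;
            d->value |= demo_next_byte(d);
        }
    }
    return bit;
}
```

Production decoders such as FFmpeg's keep a wider window and renormalise with a lookup table; the bit-serial form here trades speed for readability.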
|
||
|
|
9a71dbf4c3 |
iter4 Phase 0 + Phase 1 lock: VP9 on rkvdec
Opens iter4 immediately after iter3 close (
|
||
|
|
d5d4beb64d |
iter3 Phase 8 close: 4/5 codecs passing, 3 new memory entries
distilled, 0 Phase 7 → Phase 4 loopbacks
iter3 = VP8 on hantro-vpu-dec via libva-v4l2-request-fourier on
RK3399 (fresnel / Pinebook Pro). Fourth codec to ship.
Final state:
Fork tip: e1aca9c (post iter2 close 8d71e20 + 4 commits)
Phase 1 criteria: 5/5 GREEN (4 direct + 1 transitive)
LOC delta: +373 across 7 files (2 new + 5 modified)
Phase 7 → Phase 4 loopbacks: 0
Phase 6 fix-forwards: 1 (Commit D buffer.c allow-list)
Phase 5 review findings: 4 Critical, all empirically validated
Lessons distilled to memory (3 NEW entries):
feedback_hw_decode_engagement_check.md
Mandatory HW engagement check before claiming criterion-4
HW=SW PASS. mpv silently falls back to SW for some codec/
backend combos. Use lsof/strace/mpv -v/ffmpeg log to verify
HW path actually engaged. Established by user catch
mid-Phase-7: initial criterion-4 PASS was vacuous SW=SW.
reference_dmabuf_resv_blocker.md
Cross-campaign blocker. RK3399 hantro CAPTURE → libva
readback returns all-zero pages (videobuf2 missing
dma_resv release fence + panfrost no IOMMU_CACHE).
Tracked at git.reauktion.de/marfrit/dmabuf-modifier-triage/
issues/2. vb2_dma_resv kernel patches in flight (RFC v2,
2026-04 linux-media). Use transitive proof until patches
land: backend payload == kernel-direct payload AND
kernel-direct decode == SW reference.
feedback_runtime_enumerates_allowlists.md
Sibling to feedback_header_deletion_check.md. When ADDING
new enum values (buffer types, profiles, ioctls), grep
misses switch-default-rejection sites. Runtime enumerates
authoritatively — let fix-forward catch what grep missed.
Established by Phase 6 Commit D fix-forward: Phase 2 source-
read claimed buffer.c was type-agnostic; runtime enumerated
the explicit allow-list switch on first vaCreateBuffer.
Phase 5 amendments empirically validated (all 4 Critical correct):
C1 first_part_header_bits = slice->macroblock_offset → 6550 ✓
C2 first_part_size = partition_size[0]+ceil(macroblock_offset/8)
→ 22742 ✓ (= 21923 + 819, exact match for Phase 3 anchor)
C3 VAProbabilityBufferType (not VAProbabilityDataBufferType)
→ compiled clean post-Commit-D
C4 (int8_t) cast (not (s8)) → compiled clean Commit B first try
Estimated savings without Phase 5 review: 2 Phase 6 compile-fail/
fix-forward cycles (C3 + C4) + 1 Phase 7 → Phase 4 loopback (C1
+ C2 hardware-DMA-offset bug, would have produced visible-but-
corrupt output). Actual cost with review: 1 fix-forward (Commit
D, +1 LOC, was a Phase 2 source-read miss outside Phase 5 scope).
Cross-cutting backlog updates:
iter3-Q1 first_part_header_bits → CLOSED by Phase 5 C1
iter3-flags-anomaly bit 0x40 → not iter3 scope; kernel ignores
iter3-criterion-4-readback → blocked on dmabuf-modifier-triage
iter1; transitive proof used
iter3-mpv-vp8-fallback → mpv 0.41.0 falls back to SW for VP8;
consumer-side, not backend; verify
via chrome-fourier when convenient
Inherited backlog (B1, B3, B4, B5, B6, L3) — no closures from
iter3.
Campaign scoreboard: 3/5 → 4/5 codecs passing.
H.264 | rkvdec | T4 | PASS direct
MPEG-2 | hantro | iter1 | PASS direct
HEVC | rkvdec | iter2 | PASS direct
VP8 | hantro | iter3 | PASS transitive (readback blocked)
VP9 | rkvdec | iter4 | PENDING
iter4 (VP9 on rkvdec) prediction: comparable scope to iter2 HEVC
(VP9 has compressed-header control + probability state).
~400-500 LOC, 3-4 commits + 1 fix-forward. mpv may engage HW for
VP9 (different from VP8 fallback) — verify at iter4 Phase 0.
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase2_iter3_situation.md (situation analysis)
phase3_iter3_baseline.md (verbatim payload anchors)
phase4_iter3_plan.md (10 contract clauses + Phase 5 amendments)
phase5_iter3_review.md (4 Critical, all validated correct)
phase7_iter3_verification.md (4 direct + 1 transitive PASS)
Fork commits 27d82e3 + 017e27f + 7f84bbb + e1aca9c
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
afb9b1450f |
iter3 Phase 7: verification — 4 direct PASS, 1 transitive PASS
Phase 1 5-criterion verification on iter3 backend (fork tip e1aca9c).
4 direct PASS + 1 transitive PASS. Vacuous-pass mode caught + corrected
mid-Phase-7 (initial mpv --hwdec=vaapi --vo=image HW=SW match was
SW=SW; mpv silently fell back to SW for VP8).
Criterion results:
1. vainfo enumerates VAProfileVP8Version0_3 PASS (direct)
2. vaCreateConfig SUCCESS PASS (direct, implied)
3. ffmpeg-vaapi VP8 5-frame decode exit 0 PASS (direct)
4. HW=SW byte-identical via DMA-BUF GL PASS (transitive)
5. 3-codec regression (H.264 + MPEG-2 + HEVC) PASS (direct)
Criterion 4 transitive proof:
Step A: Strace of ffmpeg-vaapi via libva backend captures the
V4L2_CID_STATELESS_VP8_FRAME control payload — keyframe
y_ac_qi=8, first_part_size=22742, first_part_header_bits=
6550, all 30 fields enumerated.
Step B: Phase 3 baseline already captured the kernel-direct
(ffmpeg-v4l2request) keyframe payload — IDENTICAL to A
field-for-field.
Step C: ffmpeg-v4l2request kernel-direct VP8 decode produces
5 raw frames byte-identical to SW reference (cmp on
full 6.7 MB vp8_kerneldirect.yuv vs vp8_sw5.yuv = silent
BYTE-IDENTICAL).
Conclusion: A == B (libva backend produces correct kernel input)
AND C (kernel-direct decode is correct), therefore
libva backend's HW decode IS correct by transitivity.
Direct readback BLOCKED by kernel-layer dma_resv issue (sibling
campaign git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/2):
- ffmpeg-vaapi -hwaccel_output_format vaapi -vf hwdownload
returns all-zero pages (SHA b34860e0... = SHA of all-zero
1382400-byte block) for ALL 5 frames.
- Same all-zero from -hwaccel_output_format nv12 + auto-DL.
- mpv --hwdec=vaapi-copy returns Y=128 gray (uninitialized).
- Root cause: videobuf2 missing dma_resv release fence + panfrost
IOMMU_CACHE absence on RK3399 (per dmabuf-modifier-triage iter1
RFC). vb2_dma_resv kernel patches in flight (linux-media RFC v2,
2026-04). When patches land, direct verification re-runnable.
Phase 5 amendments empirically validated:
C1 first_part_header_bits = slice->macroblock_offset → 6550 ✓
C2 first_part_size = partition_size[0] + ceil(macroblock_offset/8)
→ 22742 ✓ (= 21923 + 819, exact match for Phase 3 anchor)
C3 VAProbabilityBufferType (not VAProbabilityDataBufferType) →
compiled clean post-Commit-D fix-forward
C4 (int8_t) cast → compiled clean Commit B first try
S3 assert(probability_set) → has not fired (FFmpeg vaapi_vp8.c
always sends VAProbabilityBufferType per frame)
Phase 6 fix-forward Commit D documented: buffer.c had an explicit
allow-list switch (Phase 2 source-read missed it). Same iter1 Commit
D pattern — runtime enumerates authoritatively what grep missed.
HW-engagement check applied per new memory rule
feedback_hw_decode_engagement_check.md (established this session):
- mpv-vaapi VP8: SILENT FALLBACK to SW. mpv-side, not backend
issue. ffmpeg-vaapi VP8: HW engaged (Format vaapi chosen by
get_format(); cap_pool_init: 24 slots ready).
- V4L2 strace: VIDIOC_S_EXT_CTRLS for VP8_FRAME (0xa409c8)
returns 0 (kernel accepts payload). CAPTURE buffer indexes
advance through distinct slots per decode.
Cross-cutting backlog updates:
iter3-Q1 first_part_header_bits → closed by Phase 5 C1
iter3-flags 0x40 → not iter3 scope; kernel ignores
iter3-criterion-4 readback → blocked on dmabuf-modifier-triage
iter1 (vb2_dma_resv kernel patches)
Campaign scoreboard: 3/5 → 4/5 codecs passing.
Memory entries added:
feedback_hw_decode_engagement_check.md (mandatory HW engagement
verification before claiming criterion-4 PASS)
reference_dmabuf_resv_blocker.md (cross-campaign blocker tracking
+ transitive proof pattern)
Refs:
phase4_iter3_plan.md (10 contract clauses + Phase 5 amendments)
phase5_iter3_review.md (4 Critical findings, all empirically
validated in Phase 7)
phase3_iter3_baseline.md (verbatim payload anchors used in
transitive proof Step B)
git.reauktion.de/marfrit/dmabuf-modifier-triage/issues/2
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
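The blocked-readback signatures reported in this commit (all-zero pages, uniform Y=128 gray) are both single-value fills, so a cheap guard can flag them before any hash comparison. The helper below is a hypothetical heuristic, not part of the backend; a uniform frame is only treated as suspicious because real decoded content is not expected to be one repeated byte across every plane.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 when every byte of the buffer equals the first byte,
 * 0 otherwise (including for an empty buffer). */
static int demo_is_constant_fill(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 1; i < len; i++) {
        if (buf[i] != buf[0])
            return 0;
    }
    return len > 0;
}
```

Run over a 1382400-byte NV12 frame (1280x720), a return of 1 suggests the readback never saw decoded pixels and the transitive proof is the only usable evidence.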
|
||
|
|
656596aa6b |
iter3 Phase 5: sonnet review — 4 Critical findings, 4 amendments
Second-model review by sonnet-architect found 4 Critical bugs in
Phase 4 plan, all verified empirically by author before incorporation
per memory feedback_review_empirical_over_theoretical Direction 2.
Amendments applied in-place to phase4_iter3_plan.md +
phase2_iter3_situation.md.
Critical findings:
C1 first_part_header_bits = 0 was claimed cosmetic; actually
UNSAFE. hantro_g1_vp8_dec.c:260 + rockchip_vpu2_hw_vp8_dec.c:372
both read this field unconditionally to compute the macroblock
DMA offset. Setting 0 would place hardware at wrong DMA offset
for ALL macroblock data → garbage decode.
Fix: frame.first_part_header_bits = slice->macroblock_offset
(verified by source identity — vaapi_vp8.c:204 and
v4l2_request_vp8.c:83 use byte-identical formulas).
C2 first_part_size = slice->partition_size[0] was wrong; VAAPI's
partition_size[0] is the REMAINING bytes after parsing
(vaapi_vp8.c:209 confirms; va_dec_vp8.h:193-196 spec confirms).
Kernel needs the TOTAL control partition size.
Fix: frame.first_part_size = slice->partition_size[0] +
((macroblock_offset + 7) / 8)
Phase 3 keyframe numerics confirm: 21923 + 819 = 22742 ✓.
C3 VAProbabilityDataBufferType does not exist as a buffer-type
enum; it's the struct name. The actual enum constant is
VAProbabilityBufferType (= 13 per va.h:2058). Switch case
using the wrong identifier would have failed Phase 6 compile.
Fix: replace globally in phase2 + phase4 docs.
C4 (s8) cast undefined in userspace. Kernel has 's8' typedef in
linux/types.h (kernel-internal). UAPI exposes '__s8' (double-
underscore). Userspace portable cast is int8_t from <stdint.h>.
Fix: replace (s8) with (int8_t) in Clauses 6+7.
Suggested:
S3 Clause 8 comment was factually wrong: hantro_vp8.c::
hantro_vp8_prob_update reads coeff_probs unconditionally;
there is NO default-table fallback. If probability_set==false,
decode produces garbage. Practical risk low (FFmpeg vaapi_vp8.c
always sends VAProbabilityBufferType per frame), but corrected
comment + added assert(probability_set) runtime guard for
immediate Phase 6 surfacing.
Plus 5 minor S/Q items documented; non-blocking for iter3.
Author's 7 review questions all answered directly in the review:
Q1 quantization derivation: correct for typical content
Q2 first_part_header_bits=0 safety: UNSAFE → C1
Q3 num_dct_parts off-by-one: confirmed correct
Q4 field availability: 2 compile failures found (C3 + C4)
Q5 quant_update[s] semantics: signed delta confirmed
Q6 SHOW_FRAME unconditional: safe for BBB scope
Q7 buffer order independence: confirmed
Estimated saving: 1 Phase 6 → Phase 4 loopback + 2 Phase 6 fix-
forward commits. Review pass is the right path forward per memory
rule "Reviews are never skippable" — empty-review value =
empirical-verification value, regardless of finding count.
Refs:
phase4_iter3_plan.md (amended in-place; Phase 5 amendments
section appended)
phase2_iter3_situation.md (amended C3 globally)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
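The C2 formula is easy to check against the Phase 3 keyframe numbers. A minimal sketch, with demo_first_part_size as a hypothetical helper name and the two inputs standing in for slice->partition_size[0] and slice->macroblock_offset:

```c
#include <assert.h>
#include <stdint.h>

/* Total first-partition size: the bytes VAAPI reports as remaining
 * plus the already-parsed header bits rounded up to whole bytes. */
static uint32_t demo_first_part_size(uint32_t partition_size0,
                                     uint32_t macroblock_offset_bits)
{
    assert(macroblock_offset_bits <= UINT32_MAX - 7);
    return partition_size0 + (macroblock_offset_bits + 7) / 8;
}
```

For the keyframe, 6550 bits round up to 819 bytes, and 21923 + 819 gives the 22742 seen in the kernel-direct payload.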
|
||
|
|
2918dda2e0 |
iter3 Phase 4: plan — 10 contract clauses, ~308-LOC patch, 3 commits
Locks the iter3 patch shape against Phase 3 verbatim cross-validator
payload + Phase 2 contract surface. 10 contract clauses cite kernel
UAPI + VAAPI + FFmpeg ref + Phase 3 byte anchors throughout.
Patch shape (mirrors iter1 ABCD pattern):
Commit A: src/config.c — enumeration block + CreateConfig case +
QueryConfigEntrypoints case (3 sites, +16 LOC, 1 file).
After: vainfo lists VP8Version0_3.
Commit B: NEW src/vp8.c (~200 LOC) + NEW src/vp8.h (~40 LOC) +
meson.build sources/headers entries (+2). 3 files
(2 new + 1 modified).
After: vp8.o compiles standalone.
Commit C: src/picture.c — codec_set_controls dispatch +
codec_store_buffer 4 buffer-type cases + outer
VAProbabilityDataBufferType case + BeginPicture
per-frame reset (4 sites, +40 LOC) + src/surface.h
params.vp8 union member (+10 LOC). 2 files modified.
After: end-to-end VP8 decode through libva backend.
Total: ~308 LOC, 6 files (2 new + 4 modified), 3 commits.
Contract clauses summary:
1. Submission shape — single VIDIOC_S_EXT_CTRLS, count=1, ctrl_class=
V4L2_CTRL_CLASS_CODEC_STATELESS (0xf010000), id=0xa409c8,
size=1232 bytes
2. Local struct alloc + zero-init (memset clears all padding)
3. Frame geometry + version + per-frame scalars (off-by-one
num_dct_parts = num_of_partitions - 1)
4. DPB timestamp resolution (3 refs: last/golden/alt; 0-sentinel
when SURFACE() returns NULL — mirrors iter1 mpeg2.c pattern)
5. Loop filter mapping (6 fields + 3 flag bits)
6. Quantization base + delta derivation (segment 0 = base via
iqmatrix[0][0]; deltas = iqmatrix[0][N+1] - iqmatrix[0][0]
signed; per-segment quant_update[1..3] only when segmentation
enabled)
7. Segment fields (segment_probs direct copy; flags assembled +
DELTA_VALUE_MODE set unconditionally per FFmpeg pattern)
8. Entropy table mapping — 3 VAAPI sources (Picture: y_mode +
uv_mode + mv_probs; ProbabilityData: coeff_probs[4][8][3][11]
direct memcpy; IQMatrix: quant)
9. Coder state + first-partition fields + flags (6 mainline-
documented bits only; bit 0x40 + EXPERIMENTAL NOT replicated
vs ffmpeg-v4l2-request-git anomaly; first_part_header_bits=0
fallback documented as known fidelity gap)
10. Final batched submission via v4l2_set_controls
Phase 5 review questions queued (7 items): quantization derivation
correctness, per-segment quant_update semantics,
first_part_header_bits=0 safety, probability buffer ordering,
endianness, struct size sizeof correctness, field-availability
test-compile per memory
feedback_review_empirical_over_theoretical Direction 2.
Cross-cutting backlog deferred (B1, B3, B4, B5, B6, L3 inherited;
iter3-Q1 first_part_header_bits + iter3-flags 0x40 anomaly NEW).
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase2_iter3_situation.md (Phase 2 contract surface)
phase3_iter3_baseline.md (Phase 3 verbatim payload anchors)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
fd3fce86a6 |
iter3 Phase 3: baselines — VP8 cross-validator + 3-codec regression
+ SW reference
Captured on fresnel 2026-05-08 across two suspend cycles (laptop
dropped twice mid-run, captures preserved on /tmp/iter3_phase3).
All Phase 3 deliverables green.
Substrate verification:
backend SHA256: 9e27...6258 (matches iter2 close)
3-codec regression block: ALL 6 reference hashes match byte-for-
byte vs iter1+iter2 (H.264 +30s, MPEG-2 +02s, HEVC +02s on rkvdec/
hantro). Substrate has not regressed; criterion-5 anchor solid.
Cross-validator anchor (ffmpeg-v4l2request VP8 strace):
- VIDIOC_S_EXT_CTRLS, count=1,
ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS, id=0xa409c8,
size=1232 bytes
- struct size CORRECTED: v4l2_ctrl_vp8_frame = 1232 bytes (NOT
400 as one might assume; entropy.coeff_probs[4][8][3][11] alone
is 1056 bytes)
- keyframe (frame 1) verbatim payload captured: y_ac_qi=8,
last/golden/alt ts all 0, flags=0x0d (KEY|SHOW|NOSKIP),
y_mode_probs=[145,156,163,128] (matches FFmpeg keyframe const)
- inter frame verbatim payload captured: y_ac_qi=122, all DPB
timestamps non-zero, flags=0x66 (anomaly: bit 0x40 not in
mainline UAPI; vendor-patched ffmpeg-v4l2-request-git;
kernel hantro_vp8.c only inspects KEY_FRAME bit, ignores
bit 0x40)
VP8 SW pixel-verify reference (criterion-4 anchor):
vp8_sw_001.jpg: e43757a40e5d71ad176455c0fda14c2cbf9351b702188fc8ad584d789db2c984
vp8_sw_002.jpg: a86bf885e588257731ff6cf8d2ccc5756be550e85220eee1c3e6ea8c0c78e97a
Frame 1 != Frame 2 (real motion). These are the Phase 7 byte-
compare HW-vs-SW targets.
Open-question resolution (5 of 6 answered empirically):
Q1 first_part_header_bits — varies per frame (key=6550, inter
ranges 86..254); VAAPI doesn't expose. Phase 4 fallback:
leave 0 and check kernel behavior at Phase 7 byte-compare.
Phase 5 review will flag as known fidelity gap.
Q2 num_dct_parts vs VAAPI num_of_partitions — confirmed off-by-
one: kernel = VAAPI - 1 (BBB has VAAPI=2, kernel=1).
Q3 DPB timestamp 0-sentinel — confirmed: keyframe writes all
three timestamps as 0; iter3 mirrors iter1 mpeg2.c pattern.
Q4 SHOW_FRAME default — set on every captured frame (BBB has no
alt-ref invisible). Force unconditional in libva backend.
Q5 lf.flags FILTER_TYPE_SIMPLE — not set; BBB normal loop filter.
Direct mapping from VAAPI filter_type=0.
Q6 First-frame DPB sentinel — confirmed Q3; no self-reference
fallback needed (different from iter1 mpeg2.c).
V4L2 binding cells this boot:
rkvdec : /dev/video3 + /dev/media1
hantro-vpu-dec: /dev/video5 + /dev/media2
Capture artefacts on fresnel /tmp/iter3_phase3/ preserved for
Phase 7 re-run:
vp8_strace.* (19 files, multi-thread)
decode_vp8.py (payload decoder)
vp8_sw_00{1,2}.jpg (criterion-4)
{h264,mpeg2,hevc}_hw_00{1,2}.jpg (criterion-5)
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase2_iter3_situation.md (Phase 2 contract surface)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
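The flags anomaly noted above (0x66 carrying bit 0x40) can be isolated by masking against the documented bits. The DEMO_ constants mirror the V4L2_VP8_FRAME_FLAG_* values as recalled from the mainline UAPI header and are an assumption to confirm locally; demo_unknown_flags is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors of the mainline V4L2_VP8_FRAME_FLAG_* bits (assumed values). */
#define DEMO_VP8_FLAG_KEY_FRAME        0x01ull
#define DEMO_VP8_FLAG_EXPERIMENTAL     0x02ull
#define DEMO_VP8_FLAG_SHOW_FRAME       0x04ull
#define DEMO_VP8_FLAG_MB_NO_SKIP_COEFF 0x08ull
#define DEMO_VP8_FLAG_SIGN_BIAS_GOLDEN 0x10ull
#define DEMO_VP8_FLAG_SIGN_BIAS_ALT    0x20ull
#define DEMO_VP8_FLAG_KNOWN_MASK       0x3full

static_assert(DEMO_VP8_FLAG_KNOWN_MASK ==
              (DEMO_VP8_FLAG_KEY_FRAME | DEMO_VP8_FLAG_EXPERIMENTAL |
               DEMO_VP8_FLAG_SHOW_FRAME | DEMO_VP8_FLAG_MB_NO_SKIP_COEFF |
               DEMO_VP8_FLAG_SIGN_BIAS_GOLDEN | DEMO_VP8_FLAG_SIGN_BIAS_ALT),
              "known mask must cover exactly the six documented bits");

/* Return the bits of flags that are not part of the documented set. */
static uint64_t demo_unknown_flags(uint64_t flags)
{
    return flags & ~DEMO_VP8_FLAG_KNOWN_MASK;
}
```

The keyframe value 0x0d decomposes cleanly into KEY|SHOW|NOSKIP, while 0x66 leaves 0x40 as the residue that the captured vendor-patched FFmpeg sets.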
|
||
|
|
898544a29c |
iter3 Phase 2: situation analysis — VP8 backend gaps + contract surface
Source-read of every file the iter3 patch series will touch, plus the
kernel UAPI + VAAPI + downstream FFmpeg + kernel hantro reference
sources. Conducted on noether against fork tip 8d71e20 (iter2 Phase 6
commit B); fresnel.vpn was unreachable so Phase 3 baseline empirical
capture defers until laptop reachable.
Bug enumeration (10 sites the patch series must touch):
B1 config.c::RequestQueryConfigProfiles: enumeration block missing
B2 config.c::RequestCreateConfig: VP8 case label missing
B3 config.c::RequestQueryConfigEntrypoints: VP8 case missing
B4 src/vp8.c: new file, ~160-220 LOC
B5 src/vp8.h: new file, ~35-45 LOC
B6 picture.c::codec_set_controls: VP8 dispatch missing
B7 picture.c::codec_store_buffer: 4 buffer-type cases +
VAProbabilityDataBufferType outer case missing
B8 picture.c::RequestBeginPicture: per-frame reset additions
B9 surface.h::object_surface::params union: vp8 member missing
B10 meson.build: vp8.c/vp8.h not in lists
Non-bugs (intentionally untouched):
- context.c (no DECODE_MODE/START_CODE menus for VP8)
- video.c (CAPTURE-side format list; VP8 is OUTPUT-side)
- v4l2.c (fourcc-agnostic helpers)
- buffer.c (buffer registry is type-agnostic)
- include/hevc-ctrls.h (already includes <linux/v4l2-controls.h>
which holds V4L2_CID_STATELESS_VP8_FRAME)
Contract surface cited verbatim:
- V4L2_CID_STATELESS_VP8_FRAME = V4L2_CID_CODEC_STATELESS_BASE+200
= 0x00a409c8 (matches Phase 0 V4L2 inventory)
- struct v4l2_ctrl_vp8_frame at <linux/v4l2-controls.h>:1929-1958
+ 5 sub-structs (segment, lf, quant, entropy, coder_state) at
1785-1888
- VAAPI VAPictureParameterBufferVP8 + VASliceParameterBufferVP8 +
VAProbabilityDataBufferVP8 + VAIQMatrixBufferVP8 at
references/libva/va/va_dec_vp8.h
- FFmpeg v4l2_request_vp8.c reference: single batched S_EXT_CTRLS
at end_frame, count=1, no init-time menus
- Kernel hantro_vp8.c::hantro_vp8_prob_update reads 9 fields from
hdr (skip/intra/last/gf probs, segment_probs, entropy.{y,uv,mv,
coeff}_probs)
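The control-ID arithmetic cited above can be re-derived with a short sketch. The two constants mirror the values defined in <linux/v4l2-controls.h> and are reproduced locally so the snippet compiles without kernel headers; the SKETCH_ names and the helper are hypothetical, not part of the fork.

```c
#include <stdint.h>

/* Local copies of the kernel UAPI values (assumption: they match
 * V4L2_CTRL_CLASS_CODEC_STATELESS and V4L2_CID_CODEC_STATELESS_BASE). */
#define SKETCH_CTRL_CLASS_CODEC_STATELESS 0x00a40000u
#define SKETCH_CID_CODEC_STATELESS_BASE \
	(SKETCH_CTRL_CLASS_CODEC_STATELESS | 0x900u)

/* Hypothetical helper: control ID at a decimal offset from the base. */
static inline uint32_t sketch_stateless_cid(uint32_t offset)
{
	return SKETCH_CID_CODEC_STATELESS_BASE + offset;
}
```

With offset 200 the helper yields the VP8 frame control ID quoted above; offset 400 yields the first HEVC control ID (0xa40a90) that appears in the iter2 commits.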
VAAPI → V4L2 mapping table: 30 fields enumerated. Open questions for
Phase 3 baseline (6 items: first_part_header_bits derivation,
num_dct_parts off-by-one, DPB timestamp 0-sentinel handling, show_frame
default, lf.flags FILTER_TYPE_SIMPLE bit, first-frame DPB sentinel).
Patch-shape prediction: ~260-340 LOC across 6 modified + 2 new
files. Medium-sized iter — between iter1's 120 LOC (3 modified +
1 deleted) and iter2's 470 LOC (5 modified). The new file dominates.
Phase 3 baseline targets queued: cross-validator strace verbatim
S_EXT_CTRLS payload capture, VAAPI consumer trace, mpv-SW reference
JPEG capture for criterion 4 byte-compare anchor.
Phase 4 plan structure anticipated: 10-clause template per iter2.
Refs:
phase0_findings_iter3.md (Phase 1 lock)
phase8_iteration2_close.md (predecessor close)
src/mpeg2.c (iter1 single-codec template; iter3 will mirror shape)
src/h265.c (iter2 dispatcher pattern; iter3 takes structure cues)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
ea2413e957 |
iter3 Phase 0 + Phase 1 lock: VP8 on hantro-vpu-dec
Opens iter3 of the fresnel-fourier campaign immediately after iter2
close (
|
||
|
|
df787a6cc2 |
iter2 Phase 8 close: 3/5 codecs passing, lesson L1 extended (BOTH directions)
Iteration 2 closes with all 5 Phase 1 boolean-correctness criteria
green. Third codec passes — campaign scoreboard 2/5 → 3/5 (H.264
in T4, MPEG-2 in iter1, HEVC in iter2). Loop terminates per
feedback_dev_process.md Phase 8.
Notable: ZERO Phase 7 → Phase 4 loopbacks needed. Phase 5 review
caught all 3 would-be loopback triggers in advance (data_byte_offset
rename, dpb.rps→index-arrays semantics, pic_order_cnt_val rename).
This is the dev-process ideal: review catches bugs before
implementation lands; verification confirms contract.
What landed:
Code (libva-v4l2-request-fourier master 229d6d1 → 8d71e20):
cca539d iter2 Phase 6 commit A: config.c break for HEVCMain case
8d71e20 iter2 Phase 6 commit B: rewrite h265.c against new V4L2
stateless HEVC API (6 files, 463 ins, 236 del)
Both authored as Claude (noether) per feedback_gitea_as_claude_noether.md.
Campaign docs (fresnel-fourier):
|
||
|
|
05b4bd56ec |
iter2 Phase 7: verification — all 5 criteria GREEN, third codec PASS
Phase 7 verification of iter2 HEVC fix executed against fork tip
8d71e20 (libva-v4l2-request-fourier master = post-iter2-Commit-B).
Verbatim raw output captured to phase0_evidence/2026-05-08/
iter2_phase7/. All five Phase 1 criteria green; bonus byte-compare
confirms structural match against Baseline B with two minor field-
value divergences (informational SPS fields VAAPI doesn't expose;
non-blocking per Criterion 4 byte-identical pixel pass).
Phase 1 → Phase 7 scoreboard:
Criterion 1 (vainfo VAProfileHEVCMain enum): PASS
rkvdec bind: H.264 (5 profiles) + HEVCMain — same as Baseline.
Criterion 2 (vaCreateConfig SUCCESS for HEVCMain): PASS
Pre-iter2: VA_STATUS_ERROR_UNSUPPORTED_PROFILE (12)
Post-iter2: VA_STATUS_SUCCESS (verified verbatim libva trace)
Criterion 3 (ffmpeg-direct HEVC engages backend, exit 0): PASS
5 frames decoded clean, cap_pool_init: 24 slots ready,
no Failed-to-create lines, no S_EXT_CTRLS EINVAL.
Criterion 4 (DMA-BUF GL HEVC HW=SW byte-identical at +02s): PASS
HW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
SW frame 1: 47a5f3850df5d8c732767a227830c2272ff78402a7b6adeea329e29838808be5
HW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
SW frame 2: a467b3bc9d7b6374b6786ecfac46932d6c7bb932ab11d311edaa233d7863e656
Frames 1 vs 2 hash-differ (real motion).
Criterion 5 (iter1 MPEG-2 + T4 H.264 reference hashes): PASS
H.264 +30s HW1: f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9 (T4 ref MATCH)
H.264 +30s HW2: 7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8 (T4 ref MATCH)
MPEG-2 +02s HW1: 6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092 (iter1 ref MATCH)
MPEG-2 +02s HW2: ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de (iter1 ref MATCH)
Bonus byte-compare against Phase 3 Baseline B verbatim:
count=5, which=V4L2_CTRL_WHICH_REQUEST_VAL=0xf010000 (controls in V4L2_CTRL_CLASS_CODEC_STATELESS):
SPS id=0xa40a90 size=40 (matches Baseline B)
PPS id=0xa40a91 size=64 (matches)
SLICE_PARAMS id=0xa40a92 size=280 (1 slice × sizeof(slice_params))
SCALING_MATRIX id=0xa40a93 size=1000 (matches sizeof(scaling_matrix);
Phase 4 plan typo'd 1296; actual
struct sums to 1000 = 96+384+384+
128+6+2)
DECODE_PARAMS id=0xa40a94 size=328 (matches)
All return = 0 (kernel accepts every batched call).
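The 1000-byte SCALING_MATRIX figure can be checked at compile time with a local mirror of the struct. This is a sketch only: the field layout below is an assumption chosen to match the 96+384+384+128+6+2 sum quoted above, and the sketch_ name is hypothetical. The fork itself uses sizeof() on the real kernel struct.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mirror of struct v4l2_ctrl_hevc_scaling_matrix; all members are
 * byte arrays, so no padding is expected. */
struct sketch_hevc_scaling_matrix {
	uint8_t scaling_list_4x4[6][16];       /*  96 bytes */
	uint8_t scaling_list_8x8[6][64];       /* 384 bytes */
	uint8_t scaling_list_16x16[6][64];     /* 384 bytes */
	uint8_t scaling_list_32x32[2][64];     /* 128 bytes */
	uint8_t scaling_list_dc_coef_16x16[6]; /*   6 bytes */
	uint8_t scaling_list_dc_coef_32x32[2]; /*   2 bytes */
};

/* Fails the build if the layout ever stops summing to 1000. */
static_assert(sizeof(struct sketch_hevc_scaling_matrix) == 1000,
	      "scaling matrix payload should be 1000 bytes");
```

A check of this kind in the plan body would have flagged the 1296 typo before Phase 7.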
SPS field-value divergences vs Baseline B (FFmpeg-v4l2request):
sps_max_num_reorder_pics: post-fix=0 baseline=2 DIVERGE
sps_max_latency_increase_plus1: post-fix=0 baseline=4 DIVERGE
All other SPS fields match (pic_width=1280, pic_height=720,
bit_depth=0, flags=0x180=SAO|STRONG_INTRA_SMOOTHING).
PPS flags also diverge slightly (bit 12 ENTROPY_CODING_SYNC_ENABLED:
post-fix unset, baseline set). Other PPS fields match.
Cause: VAAPI's VAPictureParameterBufferHEVC doesn't expose
sps_max_num_reorder_pics, sps_max_latency_increase_plus1, or
always-truthful entropy_coding_sync. FFmpeg parses these from
bitstream directly. Operational impact NIL (Criterion 4 byte-
identical pixel pass — kernel decoded correctly with these fields
defaulted to 0). Phase 8 polish backlog candidate (low priority):
add SPS bitstream parsing to extract these fields when VAAPI
doesn't supply them.
Phase 7 → Phase 8: clean transition, no loopback.
Notable Phase 7 observations for Phase 8 memory:
1. Phase 5 review value confirmed: 3 Critical findings (C1
data_byte_offset rename, C2 dpb.rps→index-arrays semantics,
C3 pic_order_cnt_val rename) caught at Phase 5 — prevented
Phase 6 compile failures + at least 1-2 Phase 7→Phase 4
loopback cycles. Per memory
feedback_review_empirical_over_theoretical.md: every
Critical/Should-fix verified empirically before responding.
Lesson held.
2. One Phase 5 amendment was empirically wrong: S1 suggested
uniform_spacing_flag exists in VAAPI; gcc test-compile rejected.
Both PPS bits 19+20 left zero (VAAPI exposes neither).
Documented inline. Lesson: even reviewer-cited field mappings
warrant empirical verification.
3. Phase 4 plan typo: claimed sizeof(scaling_matrix) = 1296;
empirical size is 1000. Code uses sizeof() so produces correct
bytes. Plan body amendment-by-side-channel; not blocking.
4. VAAPI↔V4L2 field-fidelity gaps surfaced: 2 SPS fields +
possibly 1 PPS bit not exposed by VAAPI. Operational nil;
Phase 8 polish-backlog candidate.
5. mpv --hwdec=vaapi engages HEVC cleanly (no MPEG-2-style
filtering). Confirms Phase 5 Q3 — VAPictureParameterBufferType
sent per-frame for HEVC; latent B3 bug masked same as MPEG-2.
6. BBB HEVC fixture is 1 slice per frame (slice_params size=280
= 1 × sizeof). Multi-slice path in iter2 is coded but
untested by binding cell.
Campaign scoreboard: 2/5 → 3/5 codecs passing
(H.264 in T4, MPEG-2 in iter1, HEVC in iter2). iter2 advances
to Phase 8.
Refs:
../libva-v4l2-request-fourier@8d71e20 (the fork tip verified)
phase4_iter2_plan.md (10 contract clauses; SCALING_MATRIX size
typo noted)
phase5_iter2_review.md (3 Critical + 4 Should-fix amendments
all incorporated; S1 partially empirically
incorrect: VAAPI doesn't expose
uniform_spacing_flag)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9eae068f11 |
iter2 Phase 5: sonnet review — 3 critical UAPI errors caught, 7 amendments
Phase 5 review run via Plan subagent with model: sonnet per
feedback_dev_process.md Phase 5 discipline. 13 findings: 3 Critical
+ 4 Should-fix + 3 Question + 3 Nit. Reviewer's bottom-line: medium
confidence (vs iter1's medium-high) — lower because the plan had
3 concrete-and-wrong claims about kernel UAPI struct fields that
would have caused compile errors or silent semantic bugs in Phase 6.
Per memory feedback_review_empirical_over_theoretical.md: every
Critical and Should-fix finding was VERIFIED against fresnel's
kernel UAPI before responding. No source-read rebuttals attempted.
Critical resolutions:
C1 (data_byte_offset, not data_bit_offset):
Plan Clause 4 said new API "still requires bit_size +
data_bit_offset, this logic is preserved." Empirical: struct has
data_byte_offset (u32 byte count). FFmpeg uses straight byte
offset, no bit search. Plan amendment: drop bit-search at
h265.c:196-209; replace with byte-offset assignment.
ACCEPTED.
C2 (dpb.rps GONE, pic_order_cnt_val rename, poc_st_curr_*
arrays hold DPB indices):
Plan Clause 6 said "DPB extraction migrates verbatim." Empirical:
- dpb_entry has flags (only LONG_TERM_REFERENCE bit), no .rps
- pic_order_cnt_val (singular s32) replaces pic_order_cnt[0]
- poc_st_curr_before[16]/_after[16]/_lt_curr[16] are u8 DPB
INDICES, not POC values; populate via FFmpeg
get_ref_pic_index() pattern (search dpb[] by timestamp,
return index)
Plan amendment: replace "verbatim migration" claim with explicit
re-spec: classify VAAPI ReferenceFrames into ST_CURR_BEFORE/
AFTER/LT_CURR lists, assign DPB indices, populate arrays with
indices.
ACCEPTED.
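The index-lookup pattern named in C2 (search dpb[] by timestamp, return index) can be sketched as follows. The struct and function names are hypothetical and the entry is trimmed to two fields; returning -1 on a miss is a choice made for this sketch and is not claimed to match FFmpeg's behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Trimmed, hypothetical stand-in for a DPB entry. */
struct sketch_dpb_entry {
	uint64_t timestamp;
	int32_t pic_order_cnt_val;
};

/* Linear search of the DPB by timestamp. Returns the entry's index, which
 * is what the poc_st_curr_* / poc_lt_curr arrays hold, or -1 if absent. */
static inline int sketch_ref_pic_index(const struct sketch_dpb_entry *dpb,
				       size_t num_entries, uint64_t timestamp)
{
	for (size_t i = 0; i < num_entries; i++) {
		if (dpb[i].timestamp == timestamp)
			return (int)i;
	}
	return -1;
}
```

The caller would classify each VAAPI reference frame into the before/after/long-term list first, then store the returned index (not the POC value) in the matching array.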
C3 (union-aliasing reasoning wrong, claim still right):
Same anti-pattern as iter1 review C1. Plan said reset is benign
because RenderPicture per-buffer copies overwrite byte 17764.
Empirical: byte 17764 lands in num_slices region; non-HEVC
profiles never read that location. Reset is benign because
non-aliasing, NOT because of overwriting. Wording amended.
ACCEPTED.
Should-fix resolutions:
S1 (PPS flags 19+20 missing): empirical confirms
V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT (1ULL<<19)
V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING (1ULL<<20)
Plan amended to add both. ACCEPTED.
S2 (3 PPS scalars missing): empirical PPS struct dump confirms
pic_parameter_set_id, num_ref_idx_l0_default_active_minus1,
num_ref_idx_l1_default_active_minus1 all present in modern
struct. Plan amended to populate. ACCEPTED.
S3 (SCALING_MATRIX content divergence FFmpeg vs libva):
FFmpeg sends memset-zero when no scaling list in stream
(BBB has no scaling_list — SPS flags=SAO|STRONG_INTRA only).
Plan said "populate spec defaults when iqmatrix_set==false."
Phase 6 implementer choice; document in commit which path
taken. Phase 7 byte-compare validates. ACCEPTED as choice
rather than mandate.
S4 (FFmpeg function name wrong cite):
Plan cited ff_v4l2_request_query_control_default_value;
actual is ff_v4l2_request_query_control. Cosmetic fix.
ACCEPTED.
Question resolutions:
Q1 (object_heap allocator size handling): VERIFIED safe.
request.c:142-143 uses sizeof(struct object_surface). Adding
slices[64] auto-picks-up the larger size.
Q2 (slice_segment_addr field): VERIFIED present in struct.
Plan amended Clause 4: populate from VAAPI
slice->slice_segment_address. Single-slice BBB safe with
implicit zero; multi-slice would corrupt without this field.
Q3 (VAPictureParameterBufferType per-frame send for HEVC):
Deferred to Phase 7 LIBVA_TRACE capture. iter1+T4 patterns
suggest yes, worth grepping at verification time.
Nits N1+N2+N3: array size [16] not [8]; image-output
directory naming cosmetic; BeginPicture cleanup deferred.
Plan amendments consolidated:
1. Clause 4: data_byte_offset; drop bit-search; add
slice_segment_addr population (C1 + Q2)
2. Clause 6: explicit DPB classification + index-array logic;
pic_order_cnt_val rename; drop dpb.rps (C2)
3. Clause 3: 2 PPS flags + 3 scalars (S1, S2)
4. Clause 5: function name fix (S4); SCALING_MATRIX divergence
deferred to Phase 6 implementer (S3)
5. Clause 10: union-aliasing reasoning corrected (C3)
6. Clause 6: V4L2_HEVC_DPB_ENTRIES_NUM_MAX=16 macro reference (N1)
7. Phase 7 harness: rename png_* → image_* dirs (N2)
Plan re-locks with these amendments. Phase 6 proceeds.
Per global ~/.claude/CLAUDE.md rule: Phase 5 reviews never
skippable. iter2's review was the right path forward — caught
3 concrete UAPI errors (data_bit_offset → data_byte_offset rename;
dpb.rps field gone; pic_order_cnt struct shape) that would have
been Phase 6 compile failures or silent Phase 7 byte-compare
divergences requiring loopback. Outside-look value substantial.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
348736eb63 |
iter2 Phase 4: plan — 10 contract clauses, ~400-line h265.c rewrite
Phase 4 plan for iter2 HEVC fix. Structured per the
feedback_dev_process.md Phase 6 contract-before-code worked example
(0012-h264-omit-scaling-matrix-frame-based.patch shape): contract
clauses with citations first, then code changes mapping 1:1 to
clauses.
10 contract clauses cited from authoritative sources:
Clause 1 — Per-frame batched VIDIOC_S_EXT_CTRLS, count=5
Authority: linux/v4l2-controls.h:2090-2300 (8 HEVC stateless CIDs)
Reference impl: FFmpeg libavcodec/v4l2_request_hevc.c:505-565
(v4l2_request_hevc_queue_decode)
Empirical anchor: Phase 3 Baseline B verbatim payload
Clause 2 — v4l2_ctrl_hevc_sps layout (40 bytes)
Authority: linux/v4l2-controls.h:2096+ (struct + 9 SPS_FLAG_* bits)
Field-by-field VAAPI source mapping table; existing
h265_fill_sps logic preserved, just routed to flags bitmask
Phase 3 Baseline B BBB SPS bytes: flags=SAO|STRONG_INTRA_SMOOTHING
Clause 3 — v4l2_ctrl_hevc_pps layout (64 bytes, 19 flags)
Authority: linux/v4l2-controls.h:2126-2150
Field source: VAPictureParameterBufferHEVC + slice (for
dependent_slice_segment_flag)
Clause 4 — v4l2_ctrl_hevc_slice_params (variable; dynamic-array)
Authority: kernel exposes 0xa40a92 elems=1 dims=[600] dynamic-array
Submission shape: size = sizeof(slice_params) * num_slices_in_frame
Reference impl: FFmpeg v4l2_request_hevc.c:540-547
BEHAVIORAL CHANGE: per-slice accumulation in codec_store_buffer
(replace overwrite with append-to-array)
DPB MOVES OUT of slice_params to DECODE_PARAMS (Clause 6)
Clause 5 — v4l2_ctrl_hevc_scaling_matrix (size M; conditional)
Conditional on kernel availability (probed via VIDIOC_QUERY_EXT_CTRL
at init), NOT on bitstream flag (Phase 3 baseline corrects Phase 2
assumption)
Spec defaults from ISO/IEC 23008-2 Table 4-1 when iqmatrix_set==false
PROTOCOL: transcribe defaults from Phase 3 Baseline B verbatim
SCALING_MATRIX bytes, NOT from spec recall (per
memory feedback_review_empirical_over_theoretical.md)
Clause 6 — v4l2_ctrl_hevc_decode_params layout (328 bytes)
NEW in modern API (didn't exist in staging-era)
Contains: DPB array (16 entries), POC, num_active_dpb_entries,
num_poc_st_curr_before/after, num_poc_lt_curr,
poc_st_curr_before[8], etc.
Source: existing h265_fill_slice_params lines 269-315 logic
preserved, routed to new struct
Clause 7 — Device-wide DECODE_MODE + START_CODE menus
Set once at init via v4l2_set_controls(...request_fd=-1, 2 ctrls)
rkvdec accepts: FRAME_BASED + ANNEX_B (only options per kernel menu
constraints, Phase 0 v4l2_inventory)
Default location: extend src/context.c:142-155 device-init block
Clause 8 — config.c HEVCMain case must break;
Authority: C semantics; iter1 Bug 1 pattern verbatim
Empirical anchor: Phase 3 Baseline D scratch confirmed
Clause 9 — picture.c::codec_set_controls HEVCMain dispatch
Authority: existing MPEG-2 dispatch pattern at picture.c:186-191
Replace explicit Fourier-local: HEVC stripped reject with
h265_set_controls call
Clause 10 — Per-slice accumulation in codec_store_buffer
HEVC slice_params dynamic-array source = per-RenderPicture appends
BeginPicture resets num_slices=0; codec_store_buffer appends each
VASliceParameterBufferType to slices[N] array
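The reset-then-append behavior of Clause 10 can be sketched as below. All names are hypothetical and the slice struct is reduced to one field; only the slices[64] capacity and the sizeof(slice_params) * num_slices submission size are taken from the plan.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_SLICES 64

/* Reduced stand-in for one HEVC slice_params element. */
struct sketch_slice {
	unsigned int slice_segment_addr;
};

struct sketch_h265_params {
	struct sketch_slice slices[SKETCH_MAX_SLICES];
	size_t num_slices;
};

/* BeginPicture side: start each frame with an empty slice array. */
static inline void sketch_begin_picture(struct sketch_h265_params *p)
{
	p->num_slices = 0;
}

/* Store side: append instead of overwriting; refuse when full. */
static inline bool sketch_store_slice(struct sketch_h265_params *p,
				      const struct sketch_slice *s)
{
	if (p->num_slices >= SKETCH_MAX_SLICES)
		return false;
	p->slices[p->num_slices++] = *s;
	return true;
}

/* Byte size of the dynamic-array payload for the accumulated slices. */
static inline size_t sketch_slice_payload_size(const struct sketch_h265_params *p)
{
	return p->num_slices * sizeof(p->slices[0]);
}
```

The bounds check is the part the plan leaves implicit: a frame with more than 64 slices needs an explicit failure path rather than a write past the array.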
Diff scope (8 files):
src/config.c — 5-line break addition (Clause 8)
src/picture.c — HEVCMain dispatch (Clause 9) + per-slice
accumulation (Clause 10) + BeginPicture
num_slices reset, ~25 lines
src/surface.h — extend params.h265 with slices[64] +
num_slices, ~17 KB extra per surface union
src/h265.c — full rewrite ~400 lines (Clauses 2-7)
src/h265.h — re-enable
src/meson.build — uncomment h265.c + h265.h
src/context.c — extend device-init for HEVC DECODE_MODE +
START_CODE
include/hevc-ctrls.h — leave as-is (9-line shim, lower-risk path
per iter1 Phase 5 Nit 6 deferral)
Phase 6 implementation order (2 logical commits + optional fix-forward):
A: src/config.c HEVCMain break only (substrate fix in isolation;
Phase 3 Baseline D already verified collateral safe)
B: h265.c rewrite + picture.c dispatch + slice_params accumulation +
meson re-enable + surface.h extension + context.c device-init
C: optional fix-forward if Phase 7 surfaces a regression
Phase 7 verification harness (full Bash incantations in plan body):
Criterion 1: vainfo lists VAProfileHEVCMain on rkvdec
Criterion 2: vaCreateConfig(VAProfileHEVCMain) = SUCCESS via libva trace
Criterion 3: ffmpeg -hwaccel vaapi exit 0, no Failed-to-create
Criterion 4: mpv --hwdec=vaapi --vo=image at +02s; HW=SW byte-identical
(DMA-BUF GL cache-coherency-safe path per memory
feedback_rockchip_pixel_verify_path.md)
Criterion 5: iter1 MPEG-2 + T4 H.264 reference hashes still match
Bonus: byte-compare post-fix S_EXT_CTRLS payload vs Baseline B
Pre-identified Phase 7 → Phase 4 loopback triggers:
1. S_EXT_CTRLS EINVAL post-fix → check struct sizes (pahole),
reserved zeroing, SCALING_MATRIX size encoding
2. HW pixel hash mismatch → DPB ordering, slice_params bit_offset,
SPS/PPS flags bit positions, SCALING_MATRIX values
3. mpv --hwdec=vaapi filters HEVC out → fall-forward to ffmpeg
-vf hwdownload (less likely; vaapi engaged MPEG-2 in iter1)
4. iter1/T4 regression → verify diffs scoped right
5. Slice_params dynamic-array submission shape rejected → cross-
validator size encoding anchor
6. SCALING_MATRIX availability detection wrong → defensive
QUERY_EXT_CTRL probe in h265_init_device_controls
7. Latent bug B3 hits HEVC differently than MPEG-2 → byte 240 in
h265.picture; ffmpeg-vaapi sends VAPictureParameterBufferType
per frame so masking holds
Out-of-scope (LOCKED): VP9/VP8; HEVC Main 10 / Main Still Picture /
range ext / tile-wavefront; perf metrics; long-duration stress;
SLICE_BASED decode mode (rkvdec FRAME_BASED only); Phase 4 cross-
cutting backlog (B1 device-discovery, B3 BeginPicture profile-aware,
B4 context.c log suppression, B5 vbv_buffer_size, L3 vaDeriveImage
cache-stale); chromium-fourier 149 install; upstream engagement;
hevc-ctrls.h deletion (Phase 5 Nit 6 lower-risk path continues).
Predicted Phase 8 close: 4-6 commits on the fork (vs iter1's 4).
Iter2 ~3x larger codebase delta than iter1 (mpeg2.c rewrite was
~120 lines; h265.c rewrite is ~400 lines).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
d35a247948 |
iter2 Phase 3: baselines — substrate verified post-upgrade, HEVC anchor captured
Phase 3 baselines for iter2 HEVC. Substrate-update verification
ran first (post pacman -Syu rolling upgrade), then iter2-specific
HEVC cross-validator anchor + Bug 1 scratch.
Pre-Phase-3 substrate event: pacman -Syu landed 71 packages.
The "scheduled for linux-7" upgrade was headers-only —
linux-eos-arm-headers 6.19.9-99 → 7.0.3-1, but linux-eos-arm
kernel binary stayed at 6.19.9-99 (EOS-ARM repo hasn't
published the matching 7.x kernel yet). Userland refreshed:
qt6-base epoch bump, libdrm 2.4.131 → 2.4.133, chromium
147 → 148, KDE 26.04.1 batch, mkinitcpio 41-3, etc. OC DTB
intact (sha256 unchanged). mfritsche Plasma session active
throughout, no SDDM regression on this kernel boot.
eos-reboot-recommended marker installed; reboot deferred.
Baseline A (substrate validation post-upgrade):
T4 H.264 +30s and iter1 MPEG-2 +02s reference hashes all
8 match exactly:
H.264 HW1=SW1=f623d5f7a41697f67dd227275c6f1b21ffc257f65626d32fde8229357f8764c9
H.264 HW2=SW2=7d7bc6f2146dda8b2d223bba622c4b9fbe9674181ff1e02afe286b620342e0a8
MPEG-2 HW1=SW1=6e7873030dbf0403c67f35dd106ebef3c7909a0fd12433b82ad758e7fee9f092
MPEG-2 HW2=SW2=ccc7ce08810d4a96e9ba7a19f4f95bbf6cc861bda9337604b5c668ad52bef7de
Userland upgrade did not regress kernel-side decode or
DMA-BUF GL readback.
Baseline B (HEVC cross-validator verbatim contract anchor):
ffmpeg -hwaccel v4l2request decoded bbb_720p10s_hevc.mp4
-frames:v 5 cleanly. Per-frame submission shape:
VIDIOC_S_EXT_CTRLS, ctrl_class=V4L2_CTRL_CLASS_CODEC_STATELESS,
count=5
0xa40a90 SPS size=40
0xa40a91 PPS size=64
0xa40a92 SLICE_PARAMS size=N (dynamic-array)
0xa40a93 SCALING_MATRIX size=M
0xa40a94 DECODE_PARAMS size=328
Plus init device-wide:
0xa40a95 DECODE_MODE (menu, set once)
0xa40a96 START_CODE (menu, set once)
Key Phase 2 amendments from Phase 3 evidence:
- Per-frame batch is 5 controls (not "up to 6" — BBB
doesn't trigger ENTRY_POINT_OFFSETS / EXT_SPS_*).
- SCALING_MATRIX is sent unconditionally for BBB. FFmpeg
gates on ctx->has_scaling_matrix from kernel
VIDIOC_QUERY_EXT_CTRL at init, NOT on per-frame
bitstream flags. Phase 4 plan amends: query kernel for
SCALING_MATRIX availability at init, submit if available.
SPS payload field-decoded (40 bytes verbatim from BBB
fixture): 1280x720, 8-bit, 4:2:0, no PCM, flags = SAO |
STRONG_INTRA_SMOOTHING. PPS + DECODE_PARAMS + SLICE_PARAMS +
SCALING_MATRIX payloads captured for Phase 4 transcription.
Baseline C (slice-count probe): deferred. ffprobe confirms
1 video stream HEVC Main 1280x720 24fps 10s. Per-frame
slice-count not directly extracted; assume 1 slice/frame for
x265 ultrafast preset until Phase 6 verifies. Kernel
advertises slice_params dynamic-array max 600 entries
(phase0 v4l2_inventory), so multi-slice frames are supported
by the contract.
Baseline D (Bug 1 scratch test, collateral safety):
Applied Bug 1 (config.c break for HEVCMain) on throwaway
branch; h265.c stayed disabled. Built + installed.
H.264 HW frames @ +30s: f623d5f7..., 7d7bc6f2... (match T4)
MPEG-2 HW frames @ +02s: 6e7873030dbf..., ccc7ce08810d...
(match iter1)
Bug 1 in isolation does not regress H.264 or MPEG-2.
HEVC behavior with Bug 1 only:
libva trace: vaCreateConfig SUCCESS for VAProfileHEVCMain
ffmpeg: Task finished with error code: -5 (Input/output error)
Decode fails downstream because picture.c:204-206 still has
the explicit case VAProfileHEVCMain: return UNSUPPORTED_PROFILE
reject (Bug 2). Confirms Phase 2 prediction; Bug 2 fix
requires h265_set_controls to exist (Bug 3-6: enable +
rewrite). Bug 2 lands together with the h265.c rewrite in
Commit B (analogous to iter1 Commit B).
Scratch state cleaned: git checkout + rebuild + reinstall
master backend. H.264 + MPEG-2 still pass. Back to Baseline-A-
equivalent state.
Phase 4 plan inputs updated:
- Per-frame batch: 5 controls (not "up to 6")
- SCALING_MATRIX: unconditional iff kernel advertises (init
QUERY_EXT_CTRL probe), not bitstream-conditional
- SLICE_PARAMS: dynamic-array (max 600 elems per kernel UAPI)
- DECODE_MODE + START_CODE: 2 device-wide menus at init
- Phase 7 harness anchors on mpv-vaapi-vo=image (DMA-BUF GL
cache-coherency-safe path per
feedback_rockchip_pixel_verify_path.md)
- Phase 7 bonus: byte-compare post-fix S_EXT_CTRLS payload
against Baseline B (per
feedback_review_empirical_over_theoretical.md; empirical wins)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
b3ba157cb4 |
iter2 Phase 2: situation analysis — six bugs in HEVC path
Phase 2 source-read of the HEVC path post-iter1-close (fork master
229d6d1). Six bugs identified, all in libva backend; kernel + driver
path proven for HEVC in Phase 0 cross-validator sweep.
Substrate timing caveat: Phase 2 conducted against fresnel kernel
6.19.9-99. Operator-scheduled rolling pacman -Syyuu to linux-7
imminent. Phase 2 source-read findings are kernel-agnostic (fork
code + UAPI + FFmpeg reference); they carry forward across the
kernel jump unchanged. Phase 3 baselines will run on linux-7.
Bug 1 — src/config.c:64-69 HEVCMain falls through to default,
returns VA_STATUS_ERROR_UNSUPPORTED_PROFILE. Verbatim match for
iter1 Bug 1 pattern; fix is 3-line break addition.
Bug 2 — src/picture.c:204-206 explicit
case VAProfileHEVCMain: return UNSUPPORTED_PROFILE
with stale comment "Fourier-local: HEVC stripped, no HW support
on RK3566." (RK3566 is ohm context; fresnel is RK3399 where
rkvdec DOES support HEVC.) Fix: replace explicit reject with
dispatch to h265_set_controls() (mirrors MPEG-2 dispatch at
picture.c:186-191).
Bug 3 — src/h265.c uses staging-era CIDs:
V4L2_CID_MPEG_VIDEO_HEVC_PPS / _SPS / _SLICE_PARAMS
These don't exist on fresnel's 6.19 kernel headers (verified via
test-compile: gcc reports undeclared identifiers, suggests
V4L2_CID_MPEG_VIDEO_DEC_PTS as nearest match). Mainline kernel
UAPI splits HEVC stateless into 7 controls:
V4L2_CID_STATELESS_HEVC_{SPS,PPS,SLICE_PARAMS,SCALING_MATRIX,
DECODE_PARAMS,DECODE_MODE,START_CODE}
+ ENTRY_POINT_OFFSETS, EXT_SPS_ST_RPS, EXT_SPS_LT_RPS
(0xa40a90..0xa40a96 + extensions, V4L2_CID_CODEC_STATELESS_BASE
+ 400..407+).
Fix shape: rewrite h265.c against new split API. Substantially
larger than iter1's mpeg2.c rewrite (HEVC has 7 controls vs MPEG-2
3, + slice_params dynamic-array, + per-slice accumulation logic
needed).
Bug 4 — h265.c uses single-slice_params shape; new API is
dynamic-array. Fresnel rkvdec advertises:
hevc_slice_parameters 0xa40a92 elems=1 dims=[600] dynamic-array
Up to 600 slice_params entries per submission. Current
codec_store_buffer:115-135 OVERWRITES previous slice on
VASliceParameterBufferType arrival. Multi-slice frames need
APPEND-not-overwrite. FFmpeg reference v4l2_request_hevc.c:540-547
shows the pattern.
Fix shape: extend params.h265 to hold slice_params array (or
pointer+count); codec_store_buffer appends; h265_set_controls
flushes the array at end_picture as a single dynamic-array
S_EXT_CTRLS entry.
Bug 5 — h265.c missing controls: doesn't submit DECODE_PARAMS
(per-frame DPB info; new in modern API), SCALING_MATRIX (conditional
on iqmatrix_set + sps.scaling_list_enabled), DECODE_MODE+START_CODE
(device-wide menus, set once per context init).
Fix shape: add h265_fill_decode_params() (DPB ordering from VAAPI
ReferenceFrames[15] — preserve current extraction logic from
h265_fill_slice_params:269-315, route to new struct). Conditional
SCALING_MATRIX from VAIQMatrixBufferHEVC. Device-wide
DECODE_MODE+START_CODE either at first h265_set_controls call or
in extended context.c device-init block.
Bug 6 — src/meson.build comments out 'h265.c' (line 50) and
'h265.h' (line 73). Fix: uncomment both. Trivial.
Bug 7 (verify only) — include/hevc-ctrls.h is a 9-line shim that
just #include <linux/v4l2-controls.h>. Comment dates the
modernization to "linux-media 6.6+". Adds zero value; harmless.
Leave in place per iter1 Phase 5 Nit 6 lower-risk path.
Bug 8 (latent) — picture.c:287 params.h264.matrix_set=false
writes union byte 240. For HEVC: byte 240 lands inside
h265.picture (range [0..604), size 604) — different field than
MPEG-2's chroma_intra_quantiser_matrix. ffmpeg-vaapi's
per-frame VAPictureParameterBufferHEVC re-send overwrites the
corrupted byte before h265_set_controls reads. Latent for
clients that reuse a surface without re-sending picture params.
iter2+ Phase 4 cross-cutting backlog candidate; not iter2 scope.
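Bug 1 in the list above is the classic missing-break pattern. A minimal sketch follows, with hypothetical enum and function names; the real config.c switch is not reproduced here and may be shaped differently. The value 12 for the unsupported-profile status is the one reported in the iter2 Phase 7 trace.

```c
enum sketch_profile {
	SKETCH_PROFILE_H264,
	SKETCH_PROFILE_HEVC_MAIN,
	SKETCH_PROFILE_OTHER
};

enum sketch_status {
	SKETCH_STATUS_SUCCESS = 0,
	SKETCH_STATUS_UNSUPPORTED_PROFILE = 12
};

static inline enum sketch_status sketch_create_config(enum sketch_profile profile)
{
	switch (profile) {
	case SKETCH_PROFILE_H264:
		break;
	case SKETCH_PROFILE_HEVC_MAIN:
		/* The fix: without this break, control falls through to
		 * default and the profile is rejected. */
		break;
	default:
		return SKETCH_STATUS_UNSUPPORTED_PROFILE;
	}
	return SKETCH_STATUS_SUCCESS;
}
```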
Things verified NOT bugs:
- h265_fill_pps/sps/slice_params field extraction from VAAPI
structs is sound (just routes to wrong destination structs)
- NAL header parsing (data_bit_offset bit-search) is preserved
in new API — slice_params still has bit_size + data_bit_offset
- v4l2_set_controls batching API in place (used by H.264 + iter1
MPEG-2; iter2 uses same)
Substrate / kernel observation:
- Linux mainline 7.1.0-rc2 reference checkout has
drivers/staging/media/rkvdec/ with rkvdec.c, rkvdec-h264.c,
rkvdec-vp9.c — NO rkvdec_hevc.c. fresnel's HEVC support is
out-of-tree (Christian Hewitt patches per phase0_findings.md
external references). May land in stable 7.x.
- Phase 4 contract-before-code therefore can't cite kernel-side
HEVC handler source until/unless rkvdec_hevc.c lands in
mainline. UAPI doc + FFmpeg reference + Phase 3 cross-validator
bytes are the contract anchor.
Open questions tabled for Phase 3 (post-linux-7-upgrade):
1. iter1 + T4 references on linux-7 (regression check of closed
iter1 work)
2. SDDM watchpoint on linux-7
3. Cross-validator HEVC re-anchor (Baseline C equivalent for
HEVC) — verbatim payload bytes for SPS, PPS, DECODE_PARAMS,
SLICE_PARAMS array, SCALING_MATRIX
4. Pre-fix scratch test (Bug 1 + Bug 2 only, h265.c kept
commented out) — confirm collateral safe
5. Slice-count for bbb_720p10s_hevc.mp4 fixture
6. Whether linux-7 brings rkvdec_hevc.c into mainline
Predicted iter2 close shape: trivial Bugs 1+2+6 fixes + sizable
h265.c rewrite (~250-400 lines, ~3x iter1's mpeg2.c) + new
codec_store_buffer slice accumulation logic. If Phase 7 fails:
likely struct-size mismatch (run pahole), DPB ordering, or
slice_params array size encoding.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
6e8c970c1d |
iter2 Phase 0 + Phase 1 lock: HEVC Main on rkvdec
Iteration 2 of the campaign 8(+1)-phase loop opens following iter1
close (
|