Files
fresnel-fourier/phase5_iter4_review.md
T
marfrit 9865416ed2 iter4 Phase 5: sonnet-architect review — 4 Critical findings, all amendments incorporated
Review by sonnet-architect with cold-context source reads of fork +
kernel UAPI + VAAPI + FFmpeg references + kernel rkvdec source.
Reviewer applied Direction 2 (empirical-over-theoretical) by
test-compiling struct sizes, gcc-c-checking VAAPI field accesses,
and source-tracing FFmpeg's filter-mode XOR provenance.

Critical findings (all empirically validated by author before
incorporation per feedback_review_empirical_over_theoretical.md):

C1 - interpolation_filter double-XOR: vaapi_vp9.c:62 ALREADY applies
     `filtermode ^ (filtermode <= 1)` when filling VAAPI's
     mcomp_filter_type. Plan's second XOR was incorrect; would swap
     EIGHTTAP and EIGHTTAP_SMOOTH for inter frames -> wrong
     loop-filter strength. Fix: direct assignment, no XOR.

C2 - LF deltas not persistent: kernel UAPI explicitly says
     "users should pass its last value" when delta_update=0. Plan
     memset-zeroed each frame; would send {0,0,0,0,0,0} on BBB inter
     frames instead of {1,0,-1,-1,0,0}. Fix: add persistent vp9_lf
     state to object_context, init to VP9 spec defaults, update only
     when parser sees delta_update=1, always copy to kernel control.

C3 - reference_mode out-parameter missing: reference_mode lives in
     FRAME struct, not COMPRESSED_HDR. Plan referenced
     `compressed_hdr_reference_mode` placeholder which would be an
     undefined identifier -> compile failure. Fix: add
     `uint8_t *out_reference_mode` param to vp9_fill_compressed_hdr;
     derive `allowcompinter` at call site from the 3 sign biases.

C4 - Mitigation B scope claim overstated: walk-and-pick-first always
     selects rkvdec on 7.0 (since video1 enumerates first). Hantro
     codecs (MPEG-2, VP8) at video3 still require env override.
     Fix: qualify criterion-5 trace; add LIBVA_V4L2_REQUEST_NO_
     AUTODETECT=1 escape hatch for legacy callers.

6 Suggested (S1-S6): all confirm plan correctness OR are scope-
aligned non-issues. S4 (uv_mode memcpy omission safe for rkvdec)
baked into Clause 9 amended text.

Without this review, iter4 Phase 6 would have failed first compile
(C3) + produced wrong inter-frame output (C1+C2) + caused user
confusion (C4). Estimated saving: 1 compile failure + 1 Phase 7 ->
Phase 4 loopback + 1 doc correction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 05:49:13 +00:00

20 KiB
Raw Blame History

Iteration 4 — Phase 5 (review)

Second-model review of phase4_iter4_plan.md by sonnet-architect on 2026-05-10. Reviewer performed cold-context source reads of: fork files at iter3 tip (e1aca9c), kernel UAPI at /usr/include/linux/v4l2-controls.h, VAAPI header va_dec_vp9.h, FFmpeg V4L2-request reference v4l2_request_vp9.c, FFmpeg VAAPI reference vaapi_vp9.c, FFmpeg internal VP9 decoder vp9.c + vp9shared.h, and kernel sources v4l2-vp9.c + rkvdec-vp9.c. Struct sizes verified by compile (gcc). VAAPI field accesses verified by compile. All Critical findings were verified empirically before inclusion per memory feedback_review_empirical_over_theoretical.md Direction 2.

Overall plan health: The mapping logic for Clauses 15 and Clause 89's VPX rac port strategy is sound. Three correctness bugs were found: a double-XOR on interpolation_filter (C1, critical for inter frames), a missing persistent-state mechanism for loop-filter deltas across frames (C2, critical for criterion-4 on BBB frames 25), and an underspecified function signature for reference_mode extraction from the compressed-header parser (C3). All three will produce Phase 7 byte-mismatches without amendment. Commit Z has a documented but partially-incorrect scope claim (C4).


Critical findings (BLOCKING Phase 6)

C1 — interpolation_filter double-XOR: VAAPI delivers post-remap value; plan re-remaps it

Site: Clause 10, plan lines ~279280.

Issue: The plan applies mcomp_filter_type ^ (mcomp_filter_type <= 1) to convert VAAPI's mcomp_filter_type to V4L2's interpolation_filter. But the FFmpeg VAAPI consumer (vaapi_vp9.c:62) already applies this exact XOR when filling mcomp_filter_type:

// vaapi_vp9.c:62:
.mcomp_filter_type = h->h.filtermode ^ (h->h.filtermode <= 1),

The XOR swaps FFmpeg's FILTER_8TAP_SMOOTH (0) and FILTER_8TAP_REGULAR (1) to align with V4L2 enum order (V4L2_VP9_INTERP_FILTER_EIGHTTAP=0 = regular, EIGHTTAP_SMOOTH=1). VAAPI's mcomp_filter_type is therefore already in V4L2 order. The plan applying a second XOR undoes the first, reverting to FFmpeg's internal enum — the opposite of what V4L2 expects.

Evidence (empirically verified by tracing all values):

FFmpeg FilterMode VAAPI mcomp_filter_type (post-XOR) Plan applies XOR again → V4L2 expected
SMOOTH=0 1 (EIGHTTAP_SMOOTH) 1^1=0 (EIGHTTAP) 1 WRONG
REGULAR=1 0 (EIGHTTAP) 0^1=1 (EIGHTTAP_SMOOTH) 0 WRONG
SHARP=2 2 2^0=2 2 OK
BILINEAR=3 3 3^0=3 3 OK
SWITCHABLE=4 4 4^0=4 4 OK

vaapi_vp9.c verified at /home/mfritsche/src/libva-multiplanar/references/ffmpeg-kwiboo/libavcodec/vaapi_vp9.c:62. v4l2_request_vp9.c applies XOR to FFmpeg's raw filtermode at v4l2_request_vp9.c:292 — this is the operation that VAAPI already did.

The plan's own footnote ("XOR with (0 <= 1)=1 → 1 hmm") flags this discrepancy but leaves it unresolved.

The bug affects only SMOOTH and REGULAR — values 2, 3, 4 are unaffected. For keyframes, the interpolation filter is irrelevant (all-intra). The bug surfaces on BBB inter frames (frames 25 in the criterion-4 test), where EIGHTTAP_REGULAR is swapped for EIGHTTAP_SMOOTH → wrong deblocking filter selection → criterion-4 byte mismatch.

Proposed fix:

/* VAAPI mcomp_filter_type is already in V4L2 enum order.
 * vaapi_vp9.c:62 applied the XOR when filling VAAPI's mcomp_filter_type.
 * Direct assignment, no re-remap needed. */
frame.interpolation_filter = picture->pic_fields.bits.mcomp_filter_type;

Empirical validation: Author should strace a BBB frame-2 submission under FFmpeg v4l2request and read the interpolation_filter byte from the FRAME payload (offset 157 in the 168-byte struct). Then LIBVA_TRACE-log the same frame under mpv/ffmpeg-vaapi to extract mcomp_filter_type. They should be EQUAL. If the plan's XOR were applied, the byte would differ by 1 for SMOOTH/REGULAR content.


C2 — Loop-filter ref_deltas/mode_deltas not persistent across frames

Site: Clause 6; vp9_parse_uncompressed_header_lf_quant design.

Issue: The kernel UAPI comment for v4l2_vp9_loop_filter.ref_deltas explicitly states:

"If this syntax element is not present in the bitstream, users should pass its last value."

(/usr/include/linux/v4l2-controls.h:2578)

The plan zero-inits both structs with memset each frame. For frames where loop_filter_delta_update=0 (deltas not updated in the bitstream, but loop_filter_delta_enabled=1), the plan's parser correctly sets the DELTA_ENABLED flag but leaves ref_deltas and mode_deltas at zero from the memset. The kernel needs the persisted values from the most recent update.

FFmpeg handles this correctly by reading from s->s.h.lf_delta.ref[i] (persistent internal state, initialized to {1,0,-1,-1} on keyframe per vp9.c:666669 and updated when lf_delta.updated=1):

// v4l2_request_vp9.c:298-302:
for (i = 0; i < 4; i++)
    ctrl->lf.ref_deltas[i] = s->s.h.lf_delta.ref[i]; // always, even if not updated

Evidence:

  • FFmpeg vp9.c:666669 sets defaults {1,0,-1,-1,0,0} on keyframe (empirically confirmed these match Phase 3 anchor values).
  • FFmpeg vp9.c:693-700 updates from bitstream only when lf_delta.updated=1.
  • FFmpeg v4l2_request_vp9.c:298302 always copies the persistent state (not conditional on updated).
  • Phase 3 anchor frame 1 (keyframe): lf.flags=3 (DELTA_ENABLED|DELTA_UPDATE). Frames 25 very likely have DELTA_ENABLED=1 but DELTA_UPDATE=0 (stable content — no change). The plan would send {0,0,0,0,0,0} instead of {1,0,-1,-1,0,0}.
  • VP9 spec default values {1,0,-1,-1,0,0} confirmed at vp9.c:664671: reset occurs on keyframe || errorres || intraonly.

Proposed fix: Add persistent LF delta state to object_context (not object_surface — context is the right scope for VP9 session state):

// In context.h, inside object_context:
/* VP9 loop-filter delta state — persisted across frames per VP9 spec */
struct {
    int8_t ref_deltas[4];   /* init to {1,0,-1,-1} per VP9 spec 8.5.1 */
    int8_t mode_deltas[2];  /* init to {0,0} */
} vp9_lf;

Initialize in RequestCreateContext (or at first VP9 frame) to {1,0,-1,-1,0,0}. In Clause 6: after parsing the uncompressed header, if delta_enabled=1 && delta_update=1, write the new deltas into context->vp9_lf. Then always copy context->vp9_lf into frame.lf.ref_deltas/frame.lf.mode_deltas (replacing the parser's local writes).

Empirical validation: After Phase 6 build, strace BBB frame-2 submission. The ref_deltas bytes in the FRAME payload (offset 03 in v4l2_vp9_loop_filter) should be {1,0,-1,-1} matching frame-1's values. If the fix is missing, frame-2 shows {0,0,0,0} — detectable by strace.


C3 — reference_mode extraction from compressed-header parser is underspecified

Site: Clause 10, plan line ~284: frame.reference_mode = compressed_hdr_reference_mode; (state from Clause 9).

Issue: comppredmode (reference_mode) is read from the compressed header by FFmpeg's VP9 decoder (vp9.c:10221024, using the VPX range coder s->c). The plan correctly identifies Clause 9 as the source. However, Clause 9 only specifies filling struct v4l2_ctrl_vp9_compressed_hdr. The reference_mode field is in struct v4l2_ctrl_vp9_frame, not compressed_hdr. The plan's function signature for vp9_fill_compressed_hdr (Clause 9) does not show any mechanism to return this value to Clause 10.

Evidence: Verified that struct v4l2_ctrl_vp9_compressed_hdr (kernel UAPI at v4l2-controls.h:28332851) has no reference_mode or comppredmode field. The frame struct (v4l2_ctrl_vp9_frame:27412765) has __u8 reference_mode. FFmpeg's fill_compressed_hdr (v4l2_request_vp9.c:104) uses a local comppredmode variable for internal parsing only; fill_frame reads s->s.h.comppredmode (persistent decoder state) separately.

The placeholder compressed_hdr_reference_mode in the plan is a C variable name that doesn't exist in any defined scope. If the author ports fill_compressed_hdr without adding an out-parameter, compressed_hdr_reference_mode will be an undefined identifier — compile failure.

For the BBB keyframe: allowcompinter=0 (all refs have same sign bias on keyframe), so the if (allowcompinter) block is not entered and comppredmode defaults to PRED_SINGLEREF (0) from memset. No bug on keyframe. For BBB inter frames 25, allowcompinter may be true, and the correct comppredmode must be extracted.

Proposed fix: Add an out-parameter to vp9_fill_compressed_hdr:

static void vp9_fill_compressed_hdr(
    struct v4l2_ctrl_vp9_compressed_hdr *compressed_hdr,
    const uint8_t *buf, uint32_t size,
    uint8_t lossless_flag,
    bool keyframe_or_intraonly,
    bool allowcompinter,
    uint8_t *out_reference_mode)      /* NEW out-param */
{
    ...
    if (!keyframe_or_intraonly) {
        if (allowcompinter) {
            comppredmode = vp89_rac_get(&c);
            if (comppredmode)
                comppredmode += vp89_rac_get(&c);
            ...
        } else {
            comppredmode = 0; /* PRED_SINGLEREF */
        }
        *out_reference_mode = comppredmode;
    } else {
        *out_reference_mode = 0; /* PRED_SINGLEREF for keyframe */
    }
    ...
}

allowcompinter (VP9 spec: true when sign biases of the 3 active refs are not all identical) must be determined before calling the compressed-header parser. It can be derived from picture->pic_fields.bits.{last,golden,alt}_ref_frame_sign_bias values.

Empirical validation: Before Phase 6 build, confirm that the plan's vp9_fill_compressed_hdr function signature includes out_reference_mode output. Compile test without it — expect undefined variable. With it — expect clean compile. Phase 7 byte-compare frame-2 reference_mode byte (offset 160 in FRAME struct) vs strace of ffmpeg-v4l2request must match.


C4 — Commit Z 4-codec regression claim is overstated; hantro codecs remain broken without env override

Site: Commit Z description + Phase 1 criteria table (Criterion 5 row).

Issue: The plan states: "Commit Z restores baseline functionality on 7.0" and the criterion-5 trace says "Commit Z restores baseline (mitigation B)." This is partially incorrect. Commit Z's find_codec_device walks /dev/video0..15 and picks the first device whose driver name matches {rkvdec, hantro-vpu, cedrus, sun4i_csi}. On 7.0:

video0 = rockchip-rga    (skip)
video1 = rkvdec          (MATCH → returns rkvdec)
video2 = hantro-enc      (never reached)
video3 = hantro-dec      (never reached)

Result: find_codec_device always returns rkvdec (video1). H.264, HEVC, VP9 (all rkvdec) benefit. MPEG-2 and VP8 (hantro at video3) still get the wrong device — they will see 0 supported profiles because rkvdec doesn't advertise MPEG-2/VP8 pixel formats.

This is actually documented in the plan's "End-user UX gap" paragraph: "If user wants the OTHER decoder (e.g., hantro for MPEG-2/VP8), they still need env override." But the criterion-5 trace says Commit Z "restores baseline" — contradicting the UX gap note. The 4-codec regression is NOT fully fixed by Commit Z alone.

Evidence: Phase 3 baseline table shows all 4 codecs needed env override on 7.0. Commit Z only eliminates the override for rkvdec codecs. Hantro codecs still require LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 LIBVA_V4L2_REQUEST_MEDIA_PATH=/dev/media1.

Proposed fix: Amend the criterion-5 trace to accurately state: "Commit Z restores rkvdec-codec baseline (H.264, HEVC) and enables VP9 enumeration. MPEG-2 and VP8 (hantro) still require explicit env override." This is a documentation correctness fix, not a code fix. The code behavior is acceptable for iter4 scope (iter4-B1 already tracks full multi-decoder dispatch).

Additionally, the plan should add LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 as a safety knob so existing test automation that relied on /dev/video0 can opt out of the new walk:

if (getenv("LIBVA_V4L2_REQUEST_NO_AUTODETECT"))
    goto old_fallback;   /* skip walk, use /dev/video0 as before */

Empirical validation: Before merging Commit Z, verify vainfo without env override enumerates H.264/HEVC profiles (rkvdec) but NOT MPEG-2/VP8. Verify vainfo WITH LIBVA_V4L2_REQUEST_VIDEO_PATH=/dev/video3 enumerates MPEG-2/VP8. This confirms the scope is correct.


Suggested findings (NON-BLOCKING)

S1 — Compile-time assertions: struct sizes are stable on 7.0 UAPI but volatile across kernel major versions

Site: Clause 3.

The _Static_assert values 168 and 2040 were verified empirically by test-compile against /usr/include/linux/v4l2-controls.h on this system:

$ gcc -c /tmp/vp9_size_check.c && ./vp9_size_check
v4l2_ctrl_vp9_frame: 168
v4l2_ctrl_vp9_compressed_hdr: 2040

These match the Phase 3 strace exactly. The plan's assertion strategy is sound.

Historical stability note: Phase 2 estimated 144/1947 vs Phase 3 empirical 168/2040 — a delta of +24/+93 bytes between the Phase 2 reading and the live 7.0 kernel. This confirms that struct sizes CAN change between kernel versions. The _Static_assert approach is the right response: it turns any future growth into a loud build failure, as designed. No change needed.

S2 — seg.pred_probs 255-fill: plan's direct copy is already correct

Site: Clause 7.

FFmpeg vaapi_vp9.c:96100 fills segment_pred_probs with 255 for the non-temporal case before handing the VAAPI buffer to the backend. The plan's direct for (i=0;i<3;i++) frame.seg.pred_probs[i] = picture->segment_pred_probs[i] is correct — it receives already-padded values. FFmpeg v4l2_request_vp9.c:336341 does the same 255-fill but independently (for the V4L2 path). Both produce identical output. No fix needed.

S3 — reset_frame_context FFmpeg remap is correct as applied to VAAPI

Site: Clause 10.

Verified: vaapi_vp9.c:64 sets .reset_frame_context = h->h.resetctx — raw VP9 spec value (0..3), NO pre-remap. The plan's (resetctx > 0 ? resetctx - 1 : 0) correctly maps: 0→0(NONE), 1→0(NONE, spec says both 0 and 1 mean no-reset), 2→1(SPEC), 3→2(ALL). This matches V4L2 enum constants. Correct.

S4 — uv_mode memcpy omission is safe for rkvdec

Site: Clause 9 (compressed header port).

FFmpeg fill_compressed_hdr ends with memcpy(ctrl->uv_mode, s->prob.p.uv_mode, sizeof(ctrl->uv_mode)). This sets compressed_hdr.uv_mode to current probability values. The plan's port does not mention this memcpy.

Verified: v4l2_vp9_fw_update_probs (v4l2-vp9.c:14151444) does NOT call any update_uv_mode_probs function — there is no uv_mode update path from the compressed_hdr struct. rkvdec-vp9.c:280287 reads uv_mode from vp9_ctx->probability_tables (the frame_context table), not from prob_updates (the compressed_hdr ctrl). The memcpy in FFmpeg's fill_compressed_hdr fills a field that rkvdec never reads from the compressed_hdr control. Omission is safe.

S5 — lossless_flag semantics align exactly

Site: Clause 9.

VAAPI (va_dec_vp9.h:131140): LosslessFlag = base_qindex==0 && y_dc_delta_q==0 && uv_dc_delta_q==0 && uv_ac_delta_q==0. FFmpeg VP9 decoder (vp9shared.h:136): uint8_t lossless set by the same condition. FFmpeg vaapi_vp9.c:78: .lossless_flag = h->h.lossless — direct assignment, no remap. Semantics are identical. Plan's if (picture->pic_fields.bits.lossless_flag) ctrl->tx_mode = V4L2_VP9_TX_MODE_ONLY_4X4 is correct.

S6 — Per-segment ALT_Q/ALT_L mapping gap is correctly identified but needs explicit V4L2 UPDATE_DATA flag handling

Site: Clause 7.

The plan leaves V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA and V4L2_VP9_SEGMENTATION_FLAG_ABS_OR_DELTA_UPDATE zero for all frames (since VAAPI doesn't expose segmentation_update_data or absolute_vals). For BBB (segmentation disabled), these flags are irrelevant. The plan documents this as a fidelity gap. This is correct scope-management for iter4.

Minor clarification worth adding to the plan comment: the kernel only reads feature_data[] and feature_enabled[] when SEGMENTATION_FLAG_ENABLED is set. Since BBB has ENABLED=0, the zero'd feature_data arrays are never examined. Safe for iter4 scope.


Direct answers to the 7 queued questions

Q1 — Uncompressed-header parser correctness (Clause 6)

The spec parse path for BBB's keyframe to reach loop_filter_params() requires correctly parsing: frame_marker (2 bits) → profile (2 bits) → show_existing_frame (1 bit) → frame_type (1 bit) → show_frame (1 bit) → error_resilient_mode (1 bit) → frame_sync_code (3 bytes) → color_config() [variable length, profile-dependent] → frame_size() [2×16 bits] → render_and_frame_size_different (1 bit) → refresh_frame_flags (8 bits) → loop_filter_params(). The color_config() and inter-frame reference frame stuff make the path non-trivial but well-defined for profile=0 keyframe. The plan's "~50 LOC" estimate is conservative — a faithful profile-0 keyframe + profile-0 inter-frame parse is closer to 80100 LOC. The Phase 3 anchor (lf.ref_deltas={1,0,-1,-1}, lf.flags=3, quant.base_q_idx=46) is the correct empirical reference. See C2 for the critical persistence issue.

Q2 — reset_frame_context and interpolation_filter remap

reset_frame_context: correct, no issue (see S3). interpolation_filter: WRONG — double-XOR. See C1. Fix: direct assignment frame.interpolation_filter = picture->pic_fields.bits.mcomp_filter_type with no XOR.

Q3 — Compile-time size assertions stability

168/2040 are correct on 7.0 UAPI. Verified by test-compile. Assertions are appropriate and will fire loudly on kernel UAPI bump. If assertions fire, re-run Phase 3 baseline capture against the new kernel — do not just update the numbers. See S1.

Q4 — Per-segment ALT_Q mapping for non-BBB segmentation-enabled streams

VAAPI's seg_param[s].luma_ac_quant_scale is the dequantization scale y_dequant[qindex][1] (confirmed by vaapi_vp9.c:153: .luma_ac_quant_scale = h->h.segmentation.feat[i].qmul[0][1]). The kernel needs the raw Q-index delta (ALT_Q), NOT the dequant scale. The inverse mapping would require the VP9 quantization table lookup (spec table in section 8.3.1) which is non-trivial. The plan correctly identifies this as an iter4-Q6 fidelity gap. For BBB it is dead code. For segmentation-enabled streams, the mapping is fundamentally broken without the Q-table inverse. Recommend flagging in Commit B header comment. The flag V4L2_VP9_SEGMENTATION_FLAG_UPDATE_DATA also cannot be set without knowing segmentation_update_data from the uncompressed header — another VAAPI gap. All correctly documented.

Q5 — Test compile field availability

All VAAPI field accesses in Clauses 4, 5, 7, and 10 compile cleanly. Verified by gcc -c test against inline-definition of the VAAPI structs matching va_dec_vp9.h. Fields tested: frame_width, frame_height, profile, bit_depth, log2_tile_columns, log2_tile_rows, pic_fields.bits.frame_context_idx, filter_level, sharpness_level, frame_header_length_in_bytes, first_partition_size, all pic_fields.bits.* in Clauses 5 and 10, mb_segment_tree_probs[], segment_pred_probs[], seg_param[].segment_flags.fields.*. All present. Note: pic_fields.bits.frame_context_idx is 2 bits (not 8) — assignment to __u8 frame_context_idx in the V4L2 struct is safe (value fits: 0..3).

Q6 — Mitigation B regression risk

See C4 for the detailed finding. Short answer: Commit Z's walk-and-pick-first selects rkvdec for ALL codecs, not hantro. MPEG-2 and VP8 remain broken without env override. The criterion-5 regression claim is overstated. The LIBVA_V4L2_REQUEST_NO_AUTODETECT=1 safety knob is recommended as a Suggested addition (see C4).

Q7 — lossless_flag semantic alignment

Semantics are identical between VAAPI and FFmpeg. See S5. Plan's usage is correct.


Verdict

Plan needs two code amendments (C1 + C2) before Phase 6 implementation, plus one function-signature clarification (C3) and one documentation correction (C4). C1 is a one-line fix. C2 requires adding 6 bytes of persistent state to object_context and threading it through the Clause 6 parser call. C3 requires adding an out_reference_mode parameter to vp9_fill_compressed_hdr. C4 is a comment/documentation fix plus an optional safety env var. None require re-baselining or a Phase 3 re-capture.

Plan ready for Phase 6 after C1 + C2 + C3 amendments. C4 is strongly recommended but may be accepted as a known scope limitation note.