ampere-av1 Phase 3 progress: film_grain link + UPDATE_GRAIN; frame 0 bit-exact

Three structural fixes (1-3) plus two follow-ups (4-5) for AV1 with
film_grain on vpu981 (RK3588). Output is no longer empty and the decode no
longer crashes; frame 0 (IDR with apply_grain=1) is bit-exact vs kdirect.
Inter frames still diverge.

Fix 1 (surface.h, surface.c): add a linked_decode_surface_id field to
object_surface, initialized to VA_INVALID_SURFACE. When an AV1 picture has
apply_grain=1, VAAPI's VADecPictureParameterBufferAV1 carries a
current_display_picture distinct from current_frame. ffmpeg-vaapi calls
vaBeginPicture on current_frame (the decode surface, which gets a slot
bound) but vaGetImage on current_display_picture (the display surface,
which has no slot), causing a NULL dereference in copy_surface_to_image.

Fix 2 (av1.c): in av1_set_controls, when cur_frame != cur_display, set
display_surface->linked_decode_surface_id = current_frame. This establishes
the back-link so the display surface can borrow the decode surface's data.
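The back-link from fixes 1 and 2 can be sketched roughly as follows. This is a simplified illustration, not the driver code: struct sketch_surface, SKETCH_INVALID_SURFACE and both helper functions are hypothetical stand-ins for object_surface, VA_INVALID_SURFACE and the logic in surface.c / av1_set_controls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VA_INVALID_SURFACE from <va/va.h>; the value is assumed. */
#define SKETCH_INVALID_SURFACE 0xffffffffu

/* Hypothetical, trimmed-down object_surface; the real struct has more fields. */
struct sketch_surface {
	uint32_t id;
	void *current_slot;                /* bound by BeginPicture, else NULL */
	uint32_t linked_decode_surface_id; /* back-link, or SKETCH_INVALID_SURFACE */
};

/* Fix 1: a freshly created surface starts without a back-link. */
static void sketch_surface_init(struct sketch_surface *s, uint32_t id)
{
	assert(s != NULL);
	s->id = id;
	s->current_slot = NULL;
	s->linked_decode_surface_id = SKETCH_INVALID_SURFACE;
}

/* Fix 2: remember the decode surface on the display surface when they differ. */
static void sketch_link_display(struct sketch_surface *display,
				uint32_t current_frame,
				uint32_t current_display_picture)
{
	if (display != NULL && current_frame != current_display_picture)
		display->linked_decode_surface_id = current_frame;
}
```

When both IDs are equal (no grain applied), the link stays at the invalid value and the existing slot-based path is used unchanged.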

Fix 3 (image.c, copy_surface_to_image): when the slot is NULL and the
surface has linked_decode_surface_id set, look up the decode surface and
mirror its destination_data[], destination_sizes[] and
destination_planes_count. The NULL guard with its diagnostic log is retained.

Fix 4 (av1.c, fill_film_grain): when apply_grain=1, also set
V4L2_AV1_FILM_GRAIN_FLAG_UPDATE_GRAIN. Confirmed by strace diff: kdirect
sends flags=0x0B (APPLY|UPDATE|...), while libva was sending 0x09 (APPLY
but no UPDATE). Without UPDATE the kernel tries to reuse grain parameters
from film_grain_params_ref_idx=0, which is never populated. This change was
reverted earlier because UPDATE seemed to trigger a SEGV, but that SEGV was
the newly unmasked NULL-slot dereference; with fixes 1-3 in place UPDATE is
safe.
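The flag arithmetic behind 0x09 vs 0x0B can be sketched as below. The SKETCH_FG_* constants are local mirrors of the V4L2_AV1_FILM_GRAIN_FLAG_* bits with values assumed from memory of linux/v4l2-controls.h, and reading the remaining 0x8 bit as the overlap flag is likewise an assumption; both should be checked against the header. sketch_film_grain_flags is a hypothetical helper, not fill_film_grain itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values; verify against linux/v4l2-controls.h. */
#define SKETCH_FG_APPLY_GRAIN  0x1u
#define SKETCH_FG_UPDATE_GRAIN 0x2u
#define SKETCH_FG_OVERLAP      0x8u

/* Build the film grain flags byte the way fix 4 describes. */
static uint8_t sketch_film_grain_flags(bool apply_grain, bool overlap)
{
	uint8_t flags = 0;

	if (apply_grain) {
		flags |= SKETCH_FG_APPLY_GRAIN;
		/* Fix 4: send fresh parameters whenever grain is applied. */
		flags |= SKETCH_FG_UPDATE_GRAIN;
	}
	if (overlap)
		flags |= SKETCH_FG_OVERLAP;

	/* UPDATE is only ever set together with APPLY in this sketch. */
	assert(!(flags & SKETCH_FG_UPDATE_GRAIN) ||
	       (flags & SKETCH_FG_APPLY_GRAIN));
	return flags;
}
```

Under these assumptions, apply_grain with overlap yields 0x0B, and clearing the UPDATE bit gives back the 0x09 that libva was sending.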

Fix 5 (av1.c, reference_frame_ts plumbing): when a referenced surface has
timestamp=0 and linked_decode_surface_id set, follow the link to the decode
surface, which carries the real timestamp. We never QBUF display surfaces
on the OUTPUT queue, so their own timestamp stays zero.
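The link-following lookup in fix 5 might look roughly like this. It is a sketch under simplifying assumptions: surfaces live in a flat array indexed by surface ID, and struct sketch_ref_surface, SKETCH_INVALID_SURFACE and sketch_reference_ts are hypothetical names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VA_INVALID_SURFACE; the value is assumed. */
#define SKETCH_INVALID_SURFACE 0xffffffffu

/* Hypothetical minimal surface record for this illustration only. */
struct sketch_ref_surface {
	uint64_t timestamp;                /* 0 if never queued on OUTPUT */
	uint32_t linked_decode_surface_id; /* back-link, or invalid */
};

/* Return the timestamp to use for a reference, following the back-link
 * when the surface itself was never queued. */
static uint64_t sketch_reference_ts(const struct sketch_ref_surface *table,
				    size_t count, uint32_t id)
{
	const struct sketch_ref_surface *s;

	assert(table != NULL);
	if (id >= count)
		return 0;

	s = &table[id];
	if (s->timestamp == 0 &&
	    s->linked_decode_surface_id != SKETCH_INVALID_SURFACE &&
	    s->linked_decode_surface_id < count)
		return table[s->linked_decode_surface_id].timestamp;

	return s->timestamp;
}
```

The sketch follows only one level of indirection, which matches the description above: a display surface links to exactly one decode surface.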

Also: BeginPicture diagnostic log, surface_unbind_slot diagnostic log, and
v4l2.c error_idx diagnostic (kept from earlier; useful for the ongoing
investigation).

Verification on ampere:
  test_av1.ivf (208x208, 2 frames, no grain): bit-exact PASS, sha
    029ee72c214b37c1 (unchanged, no regression)
  av1_larger.ivf (352x288, 10 frames, film_grain alternates):
    frame 0 (key, apply_grain=1): PASS bit-exact vs kdirect
    frame 4: PASS bit-exact
    frames 1,2,3,5,6,7,8,9: DIFFER

Frame 0 PASS shows that the SEQUENCE + FRAME + TILE_GROUP_ENTRY +
FILM_GRAIN mapping is correct for IDR. Frame 4 PASS is unexplained but
encouraging.

Inter-frame divergence (frame 1 onward) suggests that reference handling
for inter prediction is still off: either order_hints[] (still zero, since
VAAPI doesn't expose it per reference), grain-applied vs pre-grain DPB
semantics, or ref_frame_idx pointing into the wrong surface space.

Next investigation: per-frame strace diff of the controls payload between
libva and kdirect, to spot remaining field mis-mappings on inter frames.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -216,7 +216,50 @@ static VAStatus copy_surface_to_image (struct request_data *driver_data,
 		}
 	}
 
+	/*
+	 * AV1 film_grain: when this surface is the display surface of a
+	 * decode (current_display_picture != current_frame with apply_grain=1),
+	 * its slot is NULL because BeginPicture only fired on the decode
+	 * surface. Follow the back-link set in av1_set_controls and borrow
+	 * the decode surface's destination_data + sizes for the copy.
+	 */
+	if (surface_object->current_slot == NULL &&
+	    surface_object->linked_decode_surface_id != VA_INVALID_SURFACE) {
+		struct object_surface *decode_surface =
+			SURFACE(driver_data,
+				surface_object->linked_decode_surface_id);
+		if (decode_surface != NULL &&
+		    decode_surface->current_slot != NULL) {
+			/* Mirror the fields we read below. The surface heap
+			 * pointer is stable for the surface's lifetime; we
+			 * only need destination_data + destination_sizes +
+			 * destination_planes_count from it. */
+			surface_object->destination_planes_count =
+				decode_surface->destination_planes_count;
+			for (i = 0; i < decode_surface->destination_planes_count; i++) {
+				surface_object->destination_data[i] =
+					decode_surface->destination_data[i];
+				surface_object->destination_sizes[i] =
+					decode_surface->destination_sizes[i];
+			}
+		}
+	}
+
 	for (i = 0; i < surface_object->destination_planes_count; i++) {
+		/* AV1 Phase 3 diag: surface NULL-deref hunt. */
+		if (buffer_object->data == NULL ||
+		    surface_object->destination_data[i] == NULL) {
+			request_log("copy_surface_to_image NULL i=%u "
+				    "buf_data=%p dest_data=%p dest_size=%u "
+				    "planes=%u slot=%p linked=0x%x\n",
+				    i, (void *)buffer_object->data,
+				    (void *)surface_object->destination_data[i],
+				    surface_object->destination_sizes[i],
+				    surface_object->destination_planes_count,
+				    (void *)surface_object->current_slot,
+				    surface_object->linked_decode_surface_id);
+			return VA_STATUS_ERROR_OPERATION_FAILED;
+		}
 #ifdef __arm__
 		if (!video_format_is_linear(driver_data->video_format))
 			tiled_to_planar(surface_object->destination_data[i],