kernel: claim src/dst at device_run, not at buf_done (fixes panic from #7) #8
Symptom
Hard reboot (kernel panic; no persistent journal, no trace recoverable) observed on higgs (Pi CM5, 6.18.29+rpt-rpi-2712) during the first `mpv --hwdec=vaapi-copy` playback against the freshly deployed `r28+g79256dc` stack (daedalus-v4l2#7). No cooked trace available, but the failure mode is reproducible by code reading.
Cause
The split src / dst lifecycle introduced by #7 calls `v4l2_m2m_job_finish` on `SRC_CONSUMED` even when the corresponding dst_buf is still parked, waiting for a future `HAS_PIXELS`. `job_finish` moves the m2m_ctx back to IDLE, the scheduler dispatches the next `device_run`, which calls `v4l2_m2m_next_dst_buf`, which returns the head of the CAPTURE ready-queue. That head is still our parked dst_buf, because we never removed it.

Two inflight entries now reference the same `vb2_buffer`. When the later `HAS_PIXELS` arrives for the original cookie, `v4l2_m2m_dst_buf_remove_by_buf` calls `list_del` on a `list_head` no longer linked to the rdy_queue, smashing the next/prev pointers of whatever ELSE was at those addresses → kernel panic.

Fix
Take both src and dst off `m2m_ctx`'s rdy_queue at `device_run`, as soon as `v4l2_m2m_next_*_buf` has peeked them and all early-exit validation has passed. After that the daemon owns both halves exclusively via the inflight item; the m2m scheduler can't re-issue them on the next `device_run`.

Completion path drops the redundant `_remove_by_buf` calls: the list is already detached, so `buf_done` alone is correct.

Matches the `drivers/media/platform/amphion/vdec.c` / `venc.c` pattern (NXP), which also claims at `device_run` for the same reason: amphion's encode path parks output buffers across multiple frames waiting for the codec to finish, structurally the same as our H.264 B-frame DPB parking.

`fail_buf_error` learns about a new `claimed` flag and skips the `v4l2_m2m_*_buf_remove` calls when the buffers have already been removed by-buf at `device_run`.

Wire protocol
Unchanged: `DAEDALUS_PROTO_VERSION` stays at 1, so the daemon binary doesn't need to be rebuilt for this fix. Only the kernel module changes. apt-side this is still a daedalus-v4l2-dkms-only bump.

Verified
`r28+g79256dc`. Closes the panic regression from #7.
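
For reviewers, the aliasing described under Cause and the claim-at-`device_run` change described under Fix can be modelled in user space. This is only a sketch, not the driver code: `model_buf`, `rdy_peek`, `rdy_claim` and `runs_alias` are hypothetical names, and the tiny list implementation merely imitates the kernel's `list_head`. `rdy_peek` stands in for `v4l2_m2m_next_dst_buf` (returns the head, leaves it queued); `rdy_claim` stands in for peeking and then removing at `device_run`.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Unlink the entry and point it back at itself. */
static void list_del_init(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Hypothetical stand-in for a CAPTURE buffer sitting on the ready queue. */
struct model_buf { struct list_head node; int index; };

#define buf_of(ptr) \
    ((struct model_buf *)((char *)(ptr) - offsetof(struct model_buf, node)))

/* Peek: return the queue head but leave it linked. */
static struct model_buf *rdy_peek(struct list_head *q)
{
    return list_empty(q) ? NULL : buf_of(q->next);
}

/* Claim: peek, then unlink immediately so the next run cannot see it. */
static struct model_buf *rdy_claim(struct list_head *q)
{
    struct model_buf *b = rdy_peek(q);

    if (b)
        list_del_init(&b->node);
    return b;
}

/* Returns 1 if two back-to-back runs were handed the same buffer, else 0. */
int runs_alias(int claim_at_run)
{
    struct list_head q;
    struct model_buf a = { .index = 0 }, b = { .index = 1 };
    struct model_buf *first, *second;

    list_init(&q);
    list_add_tail(&a.node, &q);
    list_add_tail(&b.node, &q);

    /* First run takes a dst buffer, which then stays parked. */
    first = claim_at_run ? rdy_claim(&q) : rdy_peek(&q);
    assert(first != NULL);

    /* The job finishes early; the next run asks for a dst buffer again. */
    second = claim_at_run ? rdy_claim(&q) : rdy_peek(&q);
    assert(second != NULL);

    return first == second;
}
```

In this model `runs_alias(0)` returns 1 (peek-only hands the parked buffer to the second run as well), while `runs_alias(1)` returns 0 (the claimed buffer is off the queue, so the second run gets the next one).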