Hard reboot observed on higgs (Pi CM5) during the first mpv vaapi-copy
playback against the freshly-deployed r28+g79256dc stack — kernel
panic, no persistent journal, no recoverable trace. Bug introduced
by the daedalus-v4l2#6 reorder fix (#7).
Cause
-----
The new completion path runs `v4l2_m2m_job_finish` on SRC_CONSUMED
even when the dst_buf is still parked (waiting for a future
HAS_PIXELS). job_finish moves the m2m_ctx back to IDLE, the
scheduler dispatches the next device_run — which calls
`v4l2_m2m_next_dst_buf`, which returns the head of the CAPTURE
ready-queue, which is STILL the parked dst_buf because we never
removed it. Two inflight entries now reference the same vb2_buffer;
the later HAS_PIXELS triggers `v4l2_m2m_dst_buf_remove_by_buf` on a
vb2_buffer whose list_head is no longer linked to that queue, and
`list_del()` smashes the next/prev pointers of whatever ELSE was at
those addresses.
Fix
---
Take both src and dst off `m2m_ctx`'s rdy_queue at device_run — as
soon as `v4l2_m2m_next_*_buf` has peeked them and all early-exit
validation has passed. After that, the daemon owns both halves
exclusively via the inflight item; the m2m scheduler can't re-issue
them on the next device_run. Completion path drops the redundant
`_remove_by_buf` calls — list is already detached, so `buf_done`
alone is correct.
Matches the amphion `vdec.c`/`venc.c` pattern (which also claims at
device_run for the same reason: amphion's encode pipeline parks
output buffers across multiple frames waiting for the codec to
finish, structurally the same as our H.264 B-frame DPB parking).
`fail_buf_error` learns about the new `claimed` flag and skips the
`v4l2_m2m_*_buf_remove` calls when the buffers have already been
removed by-buf at device_run.
Verified
--------
Builds clean against 6.18.29+rpt-rpi-2712. Field test pending —
deploy via marfrit-packages bump in lock-step with the daemon
(which doesn't need to change for this fix; PROTO_VERSION stays at 1).