Per Phase 4 plan PR #14 + kerneldoc audit (Task #19). Six call sites
deferred per-RX-frame mac80211 dispatch to a tasklet via
ieee80211_rx_irqsafe(); replace them with ieee80211_rx_ni(), the
synchronous process-context API, which does its own
local_bh_disable()/local_bh_enable() wrap.

Why _ni and not _list:
The Phase 4 plan originally targeted ieee80211_rx_list() for batch
delivery. Mining mt76 mainline (the only driver using _list)
showed the canonical pattern requires threading a struct list_head
through the per-frame call chain. bes2600's WSM dispatcher
(wsm_handle_rx -> bes2600_rx_cb / wsm.c beacon path) sits between
the bh thread's SDIO read and the mac80211 hand-off; threading a
list_head through the dispatcher is a non-trivial refactor.
ieee80211_rx_ni() is the simpler drop-in: no list management, still
removes the tasklet hop. The per-call local_bh_disable() cost is
trivial next to the saved tasklet schedule. A future refactor can
revisit _list if measurements warrant.

Sites converted:
- ap.c:96 (bes2600_sta_add link-id rx_queue drain on AP-mode
  STA add). Was inside spin_lock_bh(&ps_state_lock);
  refactored to splice the queue into a local sk_buff_head
  with skb_queue_splice_init() under the lock, then deliver
  after unlock. _ni runs the synchronous mac80211 RX path
  inline, so delivering under the lock would hold it across
  mac80211 dispatch.
- sta.c:1487 (deauth-frame inject in inactivity-event handler).
  Not under any lock; direct conversion.
- txrx.c:1960 (early-data + pm_unsupported branch from Patch E).
- txrx.c:1967 (early-data + LINK_SOFT-not-set branch).
- txrx.c:1971 (normal RX path in bes2600_rx_cb).
- wsm.c:2415 (beacon delivery in scan-complete WSM handler).
  Beacon SKB ownership is preserved by the existing
  skb_copy(beacon, GFP_ATOMIC) -> beacon_bkp pattern;
  no lifecycle change needed.
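Illustration of the ap.c:96 change (a minimal user-space sketch, not
the driver code): the queue is spliced into a local list while the
lock is held, and frames are delivered only after unlock. struct
frame, struct frame_queue, deliver() and drain_rx_queue() are
hypothetical stand-ins for struct sk_buff, struct sk_buff_head,
ieee80211_rx_ni() and the drain in bes2600_sta_add; the lock is
modeled as a plain flag so the sketch stays single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct frame {
	struct frame *next;
	int id;
};

/* Hypothetical stand-in for struct sk_buff_head: a singly linked FIFO. */
struct frame_queue {
	struct frame *head;
	struct frame *tail;
};

static bool lock_held;      /* models ps_state_lock as a flag (assumption) */
static int delivered_count; /* number of frames handed to deliver() */

static void queue_init(struct frame_queue *q)
{
	q->head = NULL;
	q->tail = NULL;
}

static void queue_tail(struct frame_queue *q, struct frame *f)
{
	f->next = NULL;
	if (q->tail)
		q->tail->next = f;
	else
		q->head = f;
	q->tail = f;
}

/*
 * Append everything in src to dst and leave src empty. With an empty
 * dst, as used below, the result matches skb_queue_splice_init().
 */
static void queue_splice_init(struct frame_queue *src, struct frame_queue *dst)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	queue_init(src);
}

/* Stand-in for ieee80211_rx_ni(): synchronous, must not run under the lock. */
static void deliver(struct frame *f)
{
	assert(!lock_held);
	(void)f;
	delivered_count++;
}

/* Splice under the lock, deliver after unlock. Returns frames delivered. */
static int drain_rx_queue(struct frame_queue *rx_queue)
{
	struct frame_queue local;
	struct frame *f;
	int n = 0;

	queue_init(&local);

	lock_held = true;                    /* spin_lock_bh() in the driver */
	queue_splice_init(rx_queue, &local); /* O(1) work under the lock */
	lock_held = false;                   /* spin_unlock_bh() in the driver */

	while ((f = local.head) != NULL) {
		local.head = f->next;
		deliver(f);
		n++;
	}
	return n;
}
```

The assert in deliver() fires if a frame is ever handed off while the
lock is held, which is the condition the refactor avoids.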

Mixing constraint (kerneldoc include/net/mac80211.h:5399-5430):
ieee80211_rx_ni() cannot mix with ieee80211_rx_irqsafe() for a
single hardware. All six sites are converted atomically in this
commit, so there is no mixed state.

Build verified clean on ohm sandbox: srcversion 619A51E61BF5479AAC146E6.

Predicted Phase 7 delta: +5-15% over v3+D+E baseline (2.35 MB/s mean
on v3 alone; D+E single-rep was 3.22 MB/s). A modest improvement is
expected from removing the per-RX-frame tasklet schedule. Even a
smaller delta would still be a net win for upstream cleanliness: the
kernel.org submission story benefits from not using _irqsafe from
process context.