notes: Patch C2 Phase 4 plan — ieee80211_rx_irqsafe → ieee80211_rx_list #14

Merged
marfrit merged 1 commit from claude-noether-12 into main 2026-05-07 22:46:56 +00:00
Collaborator

After Patch C v3 / D / E / F / G all merged, the remaining cleanup target is the per-RX-frame tasklet defer that ieee80211_rx_irqsafe introduces.

TL;DR

Migrate all 6 ieee80211_rx_irqsafe call sites in bes2600 to ieee80211_rx_list. Wrap with local_bh_disable + rcu_read_lock per-batch (mt76/iwl idiom). Predicted throughput delta: +5-15% over v3's 2.352 MB/s baseline.

Why this is bigger than D or E

Kerneldoc constraint (verified per task #19 from include/net/mac80211.h:5324-5345):

  • Cannot mix _list and _irqsafe for the same hardware → all 6 sites convert atomically, no per-site progressive migration
  • Requires local_bh_disable() + rcu_read_lock() wrap
  • Calls must be synchronized for a single hardware (bh-thread-as-sole-RX-context post-v3 satisfies trivially)
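The wrap requirement can be illustrated with a small userspace model. This is only a sketch: `model_wrap_enter`, `model_wrap_exit`, `model_rx_list` and the `demo_*` helpers are hypothetical stand-ins that count nesting depth. They are not the kernel's `local_bh_disable()`, `rcu_read_lock()` or `ieee80211_rx_list()`, and nothing here is taken from the bes2600 sources. The model shows two things: a call made outside the wrap is rejected, and wrapping the whole batch enters the wrap once where wrapping each call enters it once per frame.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel primitives; they only count nesting. */
static int bh_depth;      /* >0 models "bottom halves disabled" */
static int rcu_depth;     /* >0 models "inside an RCU read-side section" */
static int wrap_entries;  /* how many times the wrap was entered */

static void model_wrap_enter(void)
{
	bh_depth++;       /* models local_bh_disable() */
	rcu_depth++;      /* models rcu_read_lock() */
	wrap_entries++;
}

static void model_wrap_exit(void)
{
	assert(bh_depth > 0 && rcu_depth > 0);
	rcu_depth--;      /* models rcu_read_unlock() */
	bh_depth--;       /* models local_bh_enable() */
}

/* Stand-in for ieee80211_rx_list(): rejects calls made outside the wrap. */
static int model_rx_list(size_t frame_id, size_t *delivered)
{
	(void)frame_id;
	if (bh_depth == 0 || rcu_depth == 0)
		return -1;
	(*delivered)++;
	return 0;
}

static void model_reset(void)
{
	bh_depth = 0;
	rcu_depth = 0;
	wrap_entries = 0;
}

/* Per-batch wrap: enter once, feed n frames, leave once. */
int demo_wrap_entries_per_batch(size_t n)
{
	size_t delivered = 0;

	model_reset();
	model_wrap_enter();
	for (size_t i = 0; i < n; i++)
		model_rx_list(i, &delivered);
	model_wrap_exit();
	assert(delivered == n);
	return wrap_entries;
}

/* Per-call wrap: enter and leave around every single frame. */
int demo_wrap_entries_per_call(size_t n)
{
	size_t delivered = 0;

	model_reset();
	for (size_t i = 0; i < n; i++) {
		model_wrap_enter();
		model_rx_list(i, &delivered);
		model_wrap_exit();
	}
	assert(delivered == n);
	return wrap_entries;
}

/* Returns 1 if a call made with no wrap at all is rejected. */
int demo_unwrapped_call_rejected(void)
{
	size_t delivered = 0;

	model_reset();
	return model_rx_list(0, &delivered) == -1 && delivered == 0;
}
```

For a batch of 8 frames, `demo_wrap_entries_per_batch(8)` enters the wrap once and `demo_wrap_entries_per_call(8)` enters it 8 times. Whether that difference is measurable on real hardware is not something this model can say.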

Open questions for the reviewer (please read §4.6 risks)

  1. rx_list draining semantics. Does mainline expect netif_receive_skb_list_internal at end-of-batch, or does mac80211 internal-deliver? Need to read mt76 / iwl_pcie usage before Phase 6 lands.

  2. Beacon path SKB ownership (wsm.c:2415). hw_priv->beacon is a long-lived SKB. After _rx_list consumes it (passes ownership to mac80211), the field would be dangling. Audit before Phase 6.

  3. Per-batch wrap vs per-call wrap (§4.2). Per-batch wrap (Option B) is the upstream idiom; per-call wrap (Option A) is simpler but loses the batch benefit. Recommend B.
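The beacon ownership concern in question 2 can be modelled in userspace as follows. This is a sketch under assumptions: `model_buf`, `model_priv`, `model_rx_consume`, `model_buf_copy` and `model_deliver_beacon` are hypothetical names, and the model assumes the RX hand-off frees whatever buffer it is given. It does not describe what the code at wsm.c:2415 does today, which is exactly what the audit has to establish. The idea shown is handing a copy to the RX path so the long-lived pointer stays valid; in the kernel, `skb_copy()` would be one candidate for the copy step.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type standing in for struct sk_buff. */
struct model_buf {
	size_t len;
	unsigned char data[64];
};

/* Hypothetical driver state; only the beacon field mirrors the PR text. */
struct model_priv {
	struct model_buf *beacon;  /* long-lived, owned by the driver */
};

static int consumed;  /* counts buffers the model RX path took and freed */

/* Stand-in for an RX hand-off that takes ownership and frees the buffer. */
static void model_rx_consume(struct model_buf *b)
{
	free(b);
	consumed++;
}

/* Duplicate the long-lived buffer; returns NULL on allocation failure. */
static struct model_buf *model_buf_copy(const struct model_buf *src)
{
	struct model_buf *dup = malloc(sizeof(*dup));

	if (dup)
		memcpy(dup, src, sizeof(*dup));
	return dup;
}

/* Hand a copy to the RX path so priv->beacon is never given away. */
int model_deliver_beacon(struct model_priv *priv)
{
	struct model_buf *dup;

	if (!priv->beacon)
		return -1;
	dup = model_buf_copy(priv->beacon);
	if (!dup)
		return -1;
	model_rx_consume(dup);  /* ownership of the copy passes here */
	return 0;
}

/* Returns 1 if the long-lived beacon is intact after two deliveries. */
int demo_beacon_survives(void)
{
	struct model_priv priv;
	int ok;

	priv.beacon = calloc(1, sizeof(*priv.beacon));
	if (!priv.beacon)
		return 0;
	priv.beacon->len = 3;
	memcpy(priv.beacon->data, "abc", 3);

	ok = model_deliver_beacon(&priv) == 0 &&
	     model_deliver_beacon(&priv) == 0 &&
	     priv.beacon->len == 3 &&
	     memcmp(priv.beacon->data, "abc", 3) == 0;

	free(priv.beacon);
	return ok;
}
```

The alternative design, handing over the original and setting the field to NULL afterwards, avoids the copy but only works if nothing else reads the field later. Which of the two fits bes2600 depends on the audit outcome.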

Predicted Phase 7 outcome

+5-15% over v3's 2.352 MB/s, i.e. roughly 2.47 to 2.70 MB/s. Honest acknowledgment: I don't have data on the actual tasklet hop cost. A smaller-than-expected delta would be read as "marginal but no regression, ship for upstream-cleanliness anyway".

Out of scope

  • Patch D, E, G all shipped
  • bh.c #if 0 graveyard (Task #24)
  • Allwinner sw_mci_check_r1_ready (Task #25)

Status of cleanups branch

Tip = 42fd0ce (Patch E merged). C v3 + F + G + D + E all in cleanups. C2 branches off cleanups.

claude-noether added 1 commit 2026-05-07 22:43:26 +00:00
After Patch C v3 / D / E / F / G all merged, the remaining cleanup
target is the per-RX-frame tasklet defer that ieee80211_rx_irqsafe
introduces.  Patch C2 migrates all 6 call sites in bes2600 to
ieee80211_rx_list, the process-context API verified per the
kerneldoc audit (Task #19, mainline include/net/mac80211.h:5324-5345).

Key constraints from kerneldoc:
  - cannot mix _list and _irqsafe for the same hardware
    (=> all 6 sites convert atomically)
  - requires local_bh_disable + rcu_read_lock wrap
  - calls must be synchronized for a single hardware
    (=> bh-thread-as-sole-RX-context post-v3 satisfies trivially)

Plan §4.2 design decision: per-batch wrap (Option B), wrapping
bes2600_sdio_read_rx_batch outer loop, rather than per-call wrap.
Captures the actual batch benefit.

Open questions for the Phase 5 reviewer:

  1. rx_list draining semantics — does mainline expect explicit
     netif_receive_skb_list at end-of-batch, or does mac80211
     internal-deliver?  Need to verify by reading mt76 / iwl_pcie
     usage before Phase 6 lands.
  2. beacon path (wsm.c:2415) SKB ownership — hw_priv->beacon is
     long-lived; after _rx_list consumes it, the field would be
     dangling.  Audit before Phase 6.

Predicted throughput delta: +5-15% over v3 N=3 baseline (2.352 MB/s),
medium confidence.  Smaller-than-expected delta = "marginal but no
regression, ship for upstream-cleanliness".

Phase 7 N=3 ramp uses wired enu1 path + per-rep fresh nc listener
per the rig-failure-is-finding lesson.
Owner

> rx_list draining semantics. Does mainline expect netif_receive_skb_list_internal at end-of-batch, or does mac80211 internal-deliver? Need to read mt76 / iwl_pcie usage before Phase 6 lands.

What does mainline cw1200 do? It helped for tx.

> Beacon path SKB ownership (wsm.c:2415). hw_priv->beacon is a long-lived SKB. After _rx_list consumes it (passes ownership to mac80211), the field would be dangling. Audit before Phase 6.

Agreed.

> Per-batch wrap vs per-call wrap (§4.2). Per-batch wrap (Option B) is the upstream idiom; per-call wrap (Option A) is simpler but loses the batch benefit.

Option B.

marfrit merged commit 3d63ec0a35 into main 2026-05-07 22:46:56 +00:00

Reference: marfrit/besser#14