Compare commits

6 Commits

Author SHA1 Message Date
test0r 0ec58c0ad5 bes2600: Patch C2 — replace ieee80211_rx_irqsafe with ieee80211_rx_ni
Per Phase 4 plan PR #14 + kerneldoc audit (Task #19).  Six call sites
handed each RX frame to mac80211 through ieee80211_rx_irqsafe(), which
defers dispatch to a tasklet; replace them with ieee80211_rx_ni(), the
synchronous process-context API, which does its own local_bh_disable
wrap.

Why _ni and not _list:

  Phase 4 plan originally targeted ieee80211_rx_list for batch
  delivery.  Mining mt76 mainline (the only driver using _list)
  showed the canonical pattern requires threading a struct list_head
  through the per-frame call chain.  bes2600's WSM dispatcher
  (wsm_handle_rx -> bes2600_rx_cb / wsm.c beacon path) sits between
  the bh thread's SDIO read and the mac80211 hand-off; threading a
  list_head through the dispatcher is a non-trivial refactor.
  ieee80211_rx_ni() is the simpler drop-in: no list management, still
  removes the tasklet hop.  Per-call local_bh_disable cost is trivial
  vs the saved tasklet schedule.  Future refactor can revisit _list
  if measurements warrant.

Sites converted:

  - ap.c:96       (bes2600_sta_add link-id rx_queue drain on AP-mode
                   STA add).  Was inside spin_lock_bh(&ps_state_lock);
                   refactored to splice the queue under the lock then
                   deliver after unlock — _ni runs the synchronous
                   mac80211 RX path inline, would otherwise hold the
                   lock across mac80211 dispatch.  splice via
                   skb_queue_splice_init into a local sk_buff_head.
  - sta.c:1487    (deauth-frame inject in inactivity-event handler).
                   Not under any lock; direct conversion.
  - txrx.c:1960   (early-data + pm_unsupported branch from Patch E).
  - txrx.c:1967   (early-data + LINK_SOFT-not-set branch).
  - txrx.c:1971   (normal RX path in bes2600_rx_cb).
  - wsm.c:2415    (beacon delivery in scan-complete WSM handler).
                   beacon SKB ownership is preserved by the existing
                   skb_copy(beacon, GFP_ATOMIC) -> beacon_bkp pattern;
                   no lifecycle change needed.

Mixing constraint (kerneldoc include/net/mac80211.h:5399-5430):
calls to ieee80211_rx_ni() and ieee80211_rx_irqsafe() must not be mixed
for a single hardware device.  All six sites are converted in this one
commit, so no mixed state ever exists.

Build verified clean on ohm sandbox: srcversion 619A51E61BF5479AAC146E6.

Predicted Phase 7 delta: +5-15% over v3+D+E baseline (2.35 MB/s mean
on v3 alone; D+E single-rep was 3.22 MB/s).  Modest improvement
expected from removing the tasklet schedule per RX frame.  Smaller
deltas would still be a net win for upstream-cleanliness — the
kernel.org submission story benefits from not using _irqsafe from
process context.
2026-05-08 06:40:00 +02:00
marfrit 42fd0ceab6 bes2600: Patch E — skip ps_state_lock when PSM-known-disabled (#8) 2026-05-07 22:31:45 +00:00
test0r 4be43770fd bes2600: Patch E — skip ps_state_lock when PSM-known-disabled
Per the Opus structural critique (PR #8 §2.4) and Sonnet review item 5.
The per-RX-frame early-data path takes ps_state_lock to double-check
whether a link entry transitioned to BES2600_LINK_SOFT (AP-side
power-save state machine, soft-link transition).

When c7 has latched pm_unsupported = true (firmware does not honor
PSM, see feedback_bes2600_firmware_no_psm memory), the AP power-save
state machine is dead and link entries never transition to LINK_SOFT.
The per-frame spin_lock_bh + double-check is wasted work.

This patch gates the lock acquisition on !pm_unsupported.  When the
latch is on (the steady state on the production-shipped bes2600
firmware), early_data RX frames bypass the spin_lock_bh and go
directly to ieee80211_rx_irqsafe.

If a future firmware drop fixes PSM, c7 self-clears pm_unsupported on
the first real PM_INDICATION and the locked path resumes.

Scope is narrower than Sonnet originally framed: only the per-RX-frame
hot path (txrx.c:1945-1951 in cleanups+G+D) is touched.  Other
ps_state_lock sites in txrx.c (lines 657, 1256, 1420, 1528) are TX
submission / multicast-start / link-id paths, not per-frame RX, and
not on the Bug #5 hot path.  Leave those alone.

Build verified: srcversion B5922B4933590F33207EE97 on ohm sandbox.
2026-05-08 00:22:14 +02:00
marfrit 3dbabf3092 bes2600: Patch D — atomicize ba_lock counters, drop the spinlock (#7) 2026-05-07 22:19:53 +00:00
test0r 44b296647b bes2600: Patch D — atomicize ba_lock counters, drop the spinlock
The block-ack policy uses 4 int counters (ba_acc, ba_cnt, ba_acc_rx,
ba_cnt_rx) bumped per data frame in the TX and RX hot paths under
spin_lock_bh(&hw_priv->ba_lock).  The lock was the heaviest per-frame
synchronization cost remaining after Patch C v3 (which fixed the
sdio_rx_work relay).  Per the Opus structural critique (PR #8), this
pattern matches mac80211 driver convention for per-frame statistics:
atomic_t suffices, no lock needed.

Field-by-field changes in struct bes2600_common:
  ba_acc, ba_cnt, ba_acc_rx, ba_cnt_rx: int -> atomic_t
  ba_armed:                              new atomic_t (timer-arm flag)
  ba_ena:                                bool -> atomic_t
  ba_lock:                               removed (spinlock_t deleted)
  ba_hist:                               int (single-writer = ba_timer)

Producer hot path (txrx.c TX submit + RX receive):
  - atomic_add for the byte accumulator
  - atomic_inc for the frame counter
  - atomic_cmpxchg(&ba_armed, 0, 1) to claim the once-per-window
    mod_timer arm — at most ONE producer succeeds; race-free
  - no spin_lock_bh

Consumer paths (sta.c bes2600_ba_timer, sta.c disconnect-reset, sta.c
bes2600_ba_work, debug.c debugfs reader):
  - atomic_read snapshots all 4 counters into locals; the threshold
    predicate (acc/cnt >= THLD) tolerates approximate snapshots — the
    timer fires periodically, a single misclassification just delays
    the policy update by one tick
  - atomic_set zeroes the counters at end of timer-callback window;
    racing producer increments after the snapshot are lost (acceptable
    for stats; same approximation the original lock allowed under
    contention)
  - atomic_set(&ba_armed, 0) re-enables the next window's arm

Possible follow-up: ba_hist remains a plain int because only the
single ba_timer callback writes it; if it ever gains a second writer,
it would need the same atomic_t conversion.

This patch follows the cw1200-mainline-idiom established by Patch C v3
(structural fix, not bandaid).  The cw1200 reference doesn't have a
similar lock to compare; bes2600 inherited this from a later
Bestechnic addition rather than the upstream tree.
2026-05-08 00:17:46 +02:00
marfrit 25c0ed8c57 bes2600: Patch G — restore SPDX + ST-Ericsson attribution chain (#6) 2026-05-07 22:11:14 +00:00
7 changed files with 126 additions and 68 deletions
+13 -2
@@ -62,8 +62,11 @@ int bes2600_sta_add(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 struct bes2600_vif *priv = cw12xx_get_vif_from_ieee80211(vif);
 struct bes2600_link_entry *entry;
 struct sk_buff *skb;
+struct sk_buff_head local_drain;
 struct bes2600_common *hw_priv = hw->priv;
+__skb_queue_head_init(&local_drain);
 #ifdef P2P_MULTIVIF
 WARN_ON(priv->if_id == CW12XX_GENERIC_IF_ID);
 #endif
@@ -92,9 +95,17 @@ int bes2600_sta_add(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
 IEEE80211_WMM_IE_STA_QOSINFO_AC_MASK)
 priv->sta_asleep_mask |= BIT(sta_priv->link_id);
 entry->status = BES2600_LINK_HARD;
-while ((skb = skb_dequeue(&entry->rx_queue)))
-ieee80211_rx_irqsafe(priv->hw, skb);
+/*
+* Patch C2: splice the rx_queue out under the lock then deliver
+* after unlock. ieee80211_rx_ni() runs the mac80211 RX path
+* synchronously (formerly ieee80211_rx_irqsafe deferred to a
+* tasklet); calling it from inside spin_lock_bh would hold the
+* lock across mac80211's full RX dispatch.
+*/
+skb_queue_splice_init(&entry->rx_queue, &local_drain);
 spin_unlock_bh(&priv->ps_state_lock);
+while ((skb = __skb_dequeue(&local_drain)))
+ieee80211_rx_ni(priv->hw, skb);
 #ifdef AP_AGGREGATE_FW_FIX
 hw_priv->connected_sta_cnt++;
 if(hw_priv->connected_sta_cnt>1) {
+17 -9
@@ -353,15 +353,23 @@ struct bes2600_common {
 * Keeping in common structure for the time being. Will be moved to VIFF
 * after the mechanism is clear */
 u8 ba_tid_mask;
-int ba_acc; /*TODO: Same as above */
-int ba_cnt; /*TODO: Same as above */
-int ba_cnt_rx; /*TODO: Same as above */
-int ba_acc_rx; /*TODO: Same as above */
-int ba_hist; /*TODO: Same as above */
-struct timer_list ba_timer;/*TODO: Same as above */
-spinlock_t ba_lock; /*TODO: Same as above */
-bool ba_ena; /*TODO: Same as above */
-struct work_struct ba_work; /*TODO: Same as above */
+/*
+* Patch D: ba_lock removed. Per-frame TX/RX hot-path bumped these
+* counters under spin_lock_bh; the lock did not protect any
+* compound invariant that atomic ops can't satisfy. Counters are
+* now atomic_t; ba_armed gates the once-per-window mod_timer
+* arm via cmpxchg so concurrent TX/RX at a fresh window each
+* try to claim the arm and exactly one succeeds.
+*/
+atomic_t ba_acc;
+atomic_t ba_cnt;
+atomic_t ba_cnt_rx;
+atomic_t ba_acc_rx;
+atomic_t ba_armed;
+int ba_hist;
+struct timer_list ba_timer;
+atomic_t ba_ena;
+struct work_struct ba_work;
 bool is_BT_Present;
 bool is_go_thru_go_neg;
 u8 conf_listen_interval;
+8 -5
@@ -110,17 +110,20 @@ static int bes2600_status_show_common(struct seq_file *seq, void *v)
 int ba_cnt, ba_acc, ba_cnt_rx, ba_acc_rx, ba_avg = 0, ba_avg_rx = 0;
 bool ba_ena;
-spin_lock_bh(&hw_priv->ba_lock);
-ba_cnt = hw_priv->debug->ba_cnt;
-ba_acc = hw_priv->debug->ba_acc;
+/*
+* Patch D: ba_lock removed. hw_priv->debug->ba_* are written only
+* by the timer callback (single writer); reading without a lock is
+* fine for stats. ba_ena is atomic_t.
+*/
+ba_cnt = hw_priv->debug->ba_cnt;
+ba_acc = hw_priv->debug->ba_acc;
 ba_cnt_rx = hw_priv->debug->ba_cnt_rx;
 ba_acc_rx = hw_priv->debug->ba_acc_rx;
-ba_ena = hw_priv->ba_ena;
+ba_ena = !!atomic_read(&hw_priv->ba_ena);
 if (ba_cnt)
 ba_avg = ba_acc / ba_cnt;
 if (ba_cnt_rx)
 ba_avg_rx = ba_acc_rx / ba_cnt_rx;
-spin_unlock_bh(&hw_priv->ba_lock);
 seq_puts(seq, "BES2600 Wireless LAN driver status\n");
 seq_printf(seq, "Hardware: %d.%d\n",
+1 -1
@@ -496,7 +496,7 @@ static struct ieee80211_hw *bes2600_init_common(size_t hw_priv_data_len)
 INIT_LIST_HEAD(&hw_priv->event_queue);
 INIT_WORK(&hw_priv->event_handler, bes2600_event_handler);
 INIT_WORK(&hw_priv->ba_work, bes2600_ba_work);
-spin_lock_init(&hw_priv->ba_lock);
+/* Patch D: ba_lock removed; ba_acc/ba_cnt/etc are atomic_t. */
 timer_setup(&hw_priv->ba_timer, bes2600_ba_timer, 0);
 if (unlikely(bes2600_queue_stats_init(&hw_priv->tx_queue_stats,
+47 -32
@@ -1484,7 +1484,7 @@ void bes2600_event_handler(struct work_struct *work)
 IEEE80211_STYPE_DEAUTH | IEEE80211_FCTL_TODS);
 deauth->u.deauth.reason_code = WLAN_REASON_DEAUTH_LEAVING;
 deauth->seq_ctrl = 0;
-ieee80211_rx_irqsafe(priv->hw, skb);
+ieee80211_rx_ni(priv->hw, skb);
 bes_devel(" Inactivity Deauth Frame sent for MAC SA %pM \t and DA %pM\n", deauth->sa, deauth->da);
 queue_work(priv->hw_priv->workqueue, &priv->set_tim_work);
 break;
@@ -2342,14 +2342,19 @@ void bes2600_join_work(struct work_struct *work)
 //WARN_ON(wsm_reset(hw_priv, &reset, priv->if_id));
 WARN_ON(wsm_set_block_ack_policy(hw_priv,
 0, hw_priv->ba_tid_mask, priv->if_id));
-spin_lock_bh(&hw_priv->ba_lock);
-hw_priv->ba_ena = false;
-hw_priv->ba_cnt = 0;
-hw_priv->ba_acc = 0;
+/*
+* Patch D: ba_lock removed. Disconnect-reset clears the
+* counters and the arm flag; producers racing here cannot
+* cause harm — at worst they re-arm the timer and bump
+* counters that will be cleared on the next timer tick.
+*/
+atomic_set(&hw_priv->ba_ena, 0);
+atomic_set(&hw_priv->ba_cnt, 0);
+atomic_set(&hw_priv->ba_acc, 0);
 hw_priv->ba_hist = 0;
-hw_priv->ba_cnt_rx = 0;
-hw_priv->ba_acc_rx = 0;
-spin_unlock_bh(&hw_priv->ba_lock);
+atomic_set(&hw_priv->ba_cnt_rx, 0);
+atomic_set(&hw_priv->ba_acc_rx, 0);
+atomic_set(&hw_priv->ba_armed, 0);
 mgmt_policy.protectedMgmtEnable = 0;
 mgmt_policy.unprotectedMgmtFramesAllowed = 1;
@@ -2629,10 +2634,11 @@ void bes2600_ba_work(struct work_struct *work)
 return;*/
 bes_devel("BA work****\n");
-spin_lock_bh(&hw_priv->ba_lock);
-// tx_ba_tid_mask = hw_priv->ba_ena ? hw_priv->ba_tid_mask : 0;
+/*
+* Patch D: ba_lock removed. ba_tid_mask is u8 set once at init
+* (main.c); reading it without a lock is fine.
+*/
 tx_ba_tid_mask = hw_priv->ba_tid_mask;
-spin_unlock_bh(&hw_priv->ba_lock);
 wsm_lock_tx(hw_priv);
@@ -2645,37 +2651,49 @@ void bes2600_ba_work(struct work_struct *work)
 void bes2600_ba_timer(struct timer_list *t)
 {
 bool ba_ena;
+int cnt, acc, cnt_rx, acc_rx;
 struct bes2600_common *hw_priv = from_timer(hw_priv, t, ba_timer);
-spin_lock_bh(&hw_priv->ba_lock);
-bes2600_debug_ba(hw_priv, hw_priv->ba_cnt, hw_priv->ba_acc,
-hw_priv->ba_cnt_rx, hw_priv->ba_acc_rx);
+/*
+* Patch D: ba_lock removed. Snapshot atomic counters into locals
+* for the predicate evaluation; producers may race incrementing
+* after the snapshot but the resulting decision is approximate
+* which the policy already tolerates (next timer tick re-evaluates).
+*/
+cnt = atomic_read(&hw_priv->ba_cnt);
+acc = atomic_read(&hw_priv->ba_acc);
+cnt_rx = atomic_read(&hw_priv->ba_cnt_rx);
+acc_rx = atomic_read(&hw_priv->ba_acc_rx);
+bes2600_debug_ba(hw_priv, cnt, acc, cnt_rx, acc_rx);
 if (atomic_read(&hw_priv->scan.in_progress)) {
-hw_priv->ba_cnt = 0;
-hw_priv->ba_acc = 0;
-hw_priv->ba_cnt_rx = 0;
-hw_priv->ba_acc_rx = 0;
-goto skip_statistic_update;
+atomic_set(&hw_priv->ba_cnt, 0);
+atomic_set(&hw_priv->ba_acc, 0);
+atomic_set(&hw_priv->ba_cnt_rx, 0);
+atomic_set(&hw_priv->ba_acc_rx, 0);
+atomic_set(&hw_priv->ba_armed, 0);
+return;
 }
-if (hw_priv->ba_cnt >= BES2600_BLOCK_ACK_CNT &&
-(hw_priv->ba_acc / hw_priv->ba_cnt >= BES2600_BLOCK_ACK_THLD ||
-(hw_priv->ba_cnt_rx >= BES2600_BLOCK_ACK_CNT &&
-hw_priv->ba_acc_rx / hw_priv->ba_cnt_rx >=
+if (cnt >= BES2600_BLOCK_ACK_CNT &&
+(acc / cnt >= BES2600_BLOCK_ACK_THLD ||
+(cnt_rx >= BES2600_BLOCK_ACK_CNT &&
+acc_rx / cnt_rx >=
 BES2600_BLOCK_ACK_THLD)))
 ba_ena = true;
 else
 ba_ena = false;
-hw_priv->ba_cnt = 0;
-hw_priv->ba_acc = 0;
-hw_priv->ba_cnt_rx = 0;
-hw_priv->ba_acc_rx = 0;
+atomic_set(&hw_priv->ba_cnt, 0);
+atomic_set(&hw_priv->ba_acc, 0);
+atomic_set(&hw_priv->ba_cnt_rx, 0);
+atomic_set(&hw_priv->ba_acc_rx, 0);
+atomic_set(&hw_priv->ba_armed, 0);
-if (ba_ena != hw_priv->ba_ena) {
+if (ba_ena != !!atomic_read(&hw_priv->ba_ena)) {
 if (ba_ena || ++hw_priv->ba_hist >= BES2600_BLOCK_ACK_HIST) {
-hw_priv->ba_ena = ba_ena;
+atomic_set(&hw_priv->ba_ena, ba_ena ? 1 : 0);
 hw_priv->ba_hist = 0;
 #if 0
 bes_devel("[STA] %s block ACK:\n",
@@ -2685,9 +2703,6 @@ void bes2600_ba_timer(struct timer_list *t)
 }
 } else if (hw_priv->ba_hist)
 --hw_priv->ba_hist;
-skip_statistic_update:
-spin_unlock_bh(&hw_priv->ba_lock);
 }
 int bes2600_vif_setup(struct bes2600_vif *priv)
+39 -18
@@ -995,14 +995,18 @@ bes2600_tx_h_ba_stat(struct bes2600_vif *priv,
 if (!ieee80211_is_data(t->hdr->frame_control))
 return;
-spin_lock_bh(&hw_priv->ba_lock);
-hw_priv->ba_acc += t->skb->len - t->hdrlen;
-if (!(hw_priv->ba_cnt_rx || hw_priv->ba_cnt)) {
+/*
+* Patch D: lock-free hot-path BA accounting. atomic_inc + atomic_add
+* each per-frame; the once-per-window timer-arm uses cmpxchg on
+* ba_armed so concurrent TX/RX can't both try to set the timer and
+* we don't need cross-counter coherency on the ba_cnt/ba_cnt_rx pair.
+*/
+atomic_add(t->skb->len - t->hdrlen, &hw_priv->ba_acc);
+atomic_inc(&hw_priv->ba_cnt);
+if (atomic_cmpxchg(&hw_priv->ba_armed, 0, 1) == 0) {
 mod_timer(&hw_priv->ba_timer,
 jiffies + BES2600_BLOCK_ACK_INTERVAL);
 }
-hw_priv->ba_cnt++;
-spin_unlock_bh(&hw_priv->ba_lock);
 }
 static int
@@ -1629,14 +1633,13 @@ bes2600_rx_h_ba_stat(struct bes2600_vif *priv,
 if (!priv->setbssparams_done)
 return;
-spin_lock_bh(&hw_priv->ba_lock);
-hw_priv->ba_acc_rx += skb_len - hdrlen;
-if (!(hw_priv->ba_cnt_rx || hw_priv->ba_cnt)) {
+/* Patch D: lock-free hot-path BA accounting; see TX side comment. */
+atomic_add(skb_len - hdrlen, &hw_priv->ba_acc_rx);
+atomic_inc(&hw_priv->ba_cnt_rx);
+if (atomic_cmpxchg(&hw_priv->ba_armed, 0, 1) == 0) {
 mod_timer(&hw_priv->ba_timer,
 jiffies + BES2600_BLOCK_ACK_INTERVAL);
 }
-hw_priv->ba_cnt_rx++;
-spin_unlock_bh(&hw_priv->ba_lock);
 }
 void bes2600_rx_cb(struct bes2600_vif *priv,
@@ -1939,15 +1942,33 @@ void bes2600_rx_cb(struct bes2600_vif *priv,
 if (unlikely(bes2600_itp_rxed(hw_priv, skb)))
 consume_skb(skb);
 else if (unlikely(early_data)) {
-spin_lock_bh(&priv->ps_state_lock);
-/* Double-check status with lock held */
-if (entry->status == BES2600_LINK_SOFT)
-skb_queue_tail(&entry->rx_queue, skb);
-else
-ieee80211_rx_irqsafe(priv->hw, skb);
-spin_unlock_bh(&priv->ps_state_lock);
+/*
+* Patch E: when c7 has latched pm_unsupported (firmware
+* doesn't honour PSM, see feedback_bes2600_firmware_no_psm),
+* AP-side power-save state machine is dead and link entries
+* never transition to BES2600_LINK_SOFT. The double-check
+* branch under ps_state_lock is unreachable in that case,
+* so skip the per-frame lock acquisition entirely and
+* deliver to mac80211 directly.
+*
+* On firmware that does honour PSM (the latch self-clears
+* if a real PM_INDICATION ever arrives; see c7), this
+* predicate flips back to false and the original locked
+* path is taken.
+*/
+if (hw_priv->bes_power.pm_unsupported) {
+ieee80211_rx_ni(priv->hw, skb);
+} else {
+spin_lock_bh(&priv->ps_state_lock);
+/* Double-check status with lock held */
+if (entry->status == BES2600_LINK_SOFT)
+skb_queue_tail(&entry->rx_queue, skb);
+else
+ieee80211_rx_ni(priv->hw, skb);
+spin_unlock_bh(&priv->ps_state_lock);
+}
 } else {
-ieee80211_rx_irqsafe(priv->hw, skb);
+ieee80211_rx_ni(priv->hw, skb);
 }
 *skb_p = NULL;
+1 -1
@@ -2412,7 +2412,7 @@ int wsm_handle_rx(struct bes2600_common *hw_priv, int id,
 if (!hw_priv->beacon_bkp)
 hw_priv->beacon_bkp = \
 skb_copy(hw_priv->beacon, GFP_ATOMIC);
-ieee80211_rx_irqsafe(hw_priv->hw, hw_priv->beacon);
+ieee80211_rx_ni(hw_priv->hw, hw_priv->beacon);
 hw_priv->beacon = hw_priv->beacon_bkp;
 hw_priv->beacon_bkp = NULL;