From 3d833f8ccf31895a2ce7bf4fd4ef839e653b29bb Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Thu, 21 May 2026 09:25:12 +0200
Subject: [PATCH] bes2600: reset firmware state on wsm_join_confirm failure
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When wsm_join_confirm() returns status != WSM_STATUS_SUCCESS (ret 1),
the driver clears its bookkeeping but does not reset the firmware
interface, leaving it in an intermediate post-rejection state. A rapid
second JOIN attempt (e.g. wpa_supplicant retrying after the
PREV_AUTH_NOT_VALID deauth that mac80211 emits to clean up) hits an
inconsistent firmware context, causing bes2600_sdio_read_rx_batch to
return an SDIO error, which cascades into wifi_force_close:

  wsm_join_confirm ret 1
  deauthenticating from by local choice (Reason: 2=PREV_AUTH_NOT_VALID)
  [~10 min later]
  bes2600_sdio_read_rx_batch sdio read error
  WARNING: at bes2600_tx_loop_set_enable / bes2600_chrdev_wifi_force_close

Two additions to the failure path in bes2600_join_work():

1. wsm_reset (WSM_REQ_ID_RESET, 0x000A) with reset_statistics=false.
   This returns the firmware to IDLE so the next association attempt
   starts from a known-clean state. bes2600_unjoin_work() performs the
   same reset, but gates it on join_status != PASSIVE; after a failed
   JOIN join_status stays PASSIVE, so that path never fires. Call
   wsm_reset directly here instead.

   Contract: wsm_reset takes only wsm_cmd_lock (not conf_lock, not
   wsm_oper_lock). wsm_oper_unlock was already called inside
   wsm_join_confirm() before wsm_join() returned -EINVAL, so there is
   no re-entrancy hazard. conf_lock is held at this call site, which is
   compatible with wsm_reset's locking requirements.

2. queue_work(workqueue, &priv->unjoin_work) instead of direct
   wsm_unlock_tx(). Serialises the next association attempt through the
   workqueue so it cannot race against lingering firmware-side effects
   of the failure.
   If unjoin_work is already queued, release TX immediately (matching
   cw1200 ancestor sta.c:1344 comment "Tx lock still held, unjoin will
   clear it.").

Ancestor reference: drivers/net/wireless/st/cw1200/sta.c, function
cw1200_join_work(), lines 1339-1344. cw1200 queues unjoin_work on join
failure for the same reason. bes2600 needs the direct wsm_reset in
addition because its unjoin_work has the join_status gate that cw1200's
cw1200_do_unjoin() does not.

Signed-off-by: Claude (noether)
---
 bes2600/sta.c | 47 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 43 insertions(+), 4 deletions(-)

diff --git a/bes2600/sta.c b/bes2600/sta.c
index 476d875..bf86835 100644
--- a/bes2600/sta.c
+++ b/bes2600/sta.c
@@ -2225,9 +2225,10 @@ void bes2600_join_work(struct work_struct *work)
 	struct wsm_template_frame probe_tmp = {
 		.frame_type = WSM_FRAME_TYPE_PROBE_REQUEST,
 	};
-	/*struct wsm_reset reset = {
-		.reset_statistics = true,
-	};*/
+	struct wsm_reset join_fail_reset = {
+		.reset_statistics = false,
+	};
+	bool join_failed = false;
 
 	BUG_ON(queueId >= 4);
 
@@ -2410,6 +2411,33 @@ void bes2600_join_work(struct work_struct *work)
 #endif /*CONFIG_BES2600_TESTMODE*/
 		cancel_delayed_work_sync(&priv->join_timeout);
 		bes2600_pwr_clear_busy_event(priv->hw_priv, BES_PWR_LOCK_ON_JOIN);
+		/*
+		 * Firmware rejected WSM_JOIN (wsm_join_confirm ret 1).
+		 * Issue wsm_reset so the firmware returns to a clean
+		 * IDLE state before the next association attempt.
+		 *
+		 * Without this reset the firmware sits in an
+		 * intermediate post-reject state. A rapid second
+		 * JOIN (e.g. wpa_supplicant retrying after the
+		 * PREV_AUTH_NOT_VALID deauth that follows) hits an
+		 * inconsistent firmware context, causing
+		 * bes2600_sdio_read_rx_batch to return SDIO error
+		 * which cascades into wifi_force_close.
+		 *
+		 * cw1200 ancestor (drivers/net/wireless/st/cw1200/
+		 * sta.c:1339) queues unjoin_work on join failure for
+		 * the same reason; bes2600_unjoin_work gates its
+		 * wsm_reset on join_status != PASSIVE, so after a
+		 * failed JOIN (join_status stays PASSIVE) that path
+		 * never fires. Call wsm_reset directly here instead.
+		 *
+		 * Contract: wsm_reset takes only wsm_cmd_lock; safe
+		 * to call while conf_lock is held. wsm_oper_unlock
+		 * was already called in wsm_join_confirm() before
+		 * wsm_join() returned the error.
+		 */
+		WARN_ON(wsm_reset(hw_priv, &join_fail_reset, priv->if_id));
+		join_failed = true;
 	} else {
 		/* Upload keys */
 #ifdef CONFIG_BES2600_TESTMODE
@@ -2434,7 +2462,18 @@ void bes2600_join_work(struct work_struct *work)
 	up(&hw_priv->conf_lock);
 	if (bss)
 		cfg80211_put_bss(hw_priv->hw->wiphy, bss);
-	wsm_unlock_tx(hw_priv);
+	/*
+	 * On join failure: queue unjoin_work so the next association
+	 * attempt is serialised after any lingering cleanup, matching
+	 * cw1200 sta.c:1344 "Tx lock still held, unjoin will clear it."
+	 * If unjoin_work is already queued, release TX immediately.
+	 */
+	if (join_failed) {
+		if (!queue_work(hw_priv->workqueue, &priv->unjoin_work))
+			wsm_unlock_tx(hw_priv);
+	} else {
+		wsm_unlock_tx(hw_priv);
+	}
 }
 
 void bes2600_join_timeout(struct work_struct *work)