From db4ea70fb5dae1b2ab9c06dd91f1d7b2b9dcf09c Mon Sep 17 00:00:00 2001
From: Markus Fritsche
Date: Tue, 28 Apr 2026 14:32:18 +0200
Subject: [PATCH] bes2600: widen scan-defer backoff to 30s and decay count on quiet

The scan-defer logic added in the previous patch ("bes2600: defer scan
and soften WARN on firmware reject") used a 10-second backoff window and
never cleared reject_count outside of a successful scan. Field testing
on a PineTab2 (linux-pinetab2 6.19.10-danctnix1) shows two distinct
mac80211 scan-retry cadences in practice:

 * Idle background scans every ~5 minutes when associated -- well
   outside any plausible backoff, the defer guard correctly falls
   through to a real WSM scan attempt.

 * Roam-evaluation bursts triggered when mac80211 wants to find a
   candidate AP for handover (signal degradation, beacon loss,
   locally-generated DEAUTH_LEAVING reason=3). Cadence is ~12 s, and
   one boot reproduced 14 such rejected scans in 3 minutes during a
   single burst, none of which engaged the defer guard because every
   retry landed just outside the 10 s window.

Two-line behaviour change to fix that:

 1. BES2600_SCAN_BACKOFF_JIFFIES grows from 10*HZ to 30*HZ, so a
    12 s-cadence burst stays inside the window across consecutive
    rejects and the third reject in the burst trips the threshold
    guard. The 5 min idle case is still naturally past the window and
    is unaffected.

 2. bes2600_scan_should_defer() resets reject_count to 0 when
    time_after(jiffies, backoff_until). Without this, reject_count
    accumulated indefinitely across the slow-cadence rejects, so an
    isolated reject after long quiet would have tripped the threshold
    the moment it arrived. After the change, count is latched only
    inside an active burst and decays cleanly when the burst ends.

Net effect on a roam burst:

 * t=0   reject #1 (count 1, backoff_until = t0 + 30s)
 * t=12  reject #2 (count 2, backoff_until = t1 + 30s)
 * t=24  reject #3 (count 3, threshold met, next scan deferred)
 * t=36  defer fires, no WSM round-trip, reject not sent
 * ...   defers continue until the firmware-policy state clears
 * scan succeeds -> reject_count = 0, normal cadence resumes

WSM 0x0007 confirm rejections in a burst drop from ~14 to ~3 (just the
scans needed to reach the threshold). wpa_supplicant's reason=3
locally-generated disconnects driven by exhausted roam candidates during
the same burst window also drop.

No new state, no new symbols, no change to mac80211-facing semantics:
the deferred scan still completes via the existing fail: path with
status=-EBUSY, the same response a real firmware-busy would produce.

Signed-off-by: Markus Fritsche
---
 bes2600/scan.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/bes2600/scan.c b/bes2600/scan.c
index faa1c90..ad5033b 100644
--- a/bes2600/scan.c
+++ b/bes2600/scan.c
@@ -22,9 +22,17 @@
  * After this many consecutive WSM scan rejections from firmware, stop
  * issuing new scans for BES2600_SCAN_BACKOFF_JIFFIES and let the state
  * that's rejecting them (coex window, firmware-internal busy) clear.
+ *
+ * The backoff has to be at least as long as the natural mac80211 scan-
+ * retry cadence, otherwise the next attempt lands outside the window
+ * and bypasses the defer guard. Observed in the wild on PineTab2:
+ * roam-evaluation bursts at ~12 s cadence, idle background scans at
+ * ~5 min cadence. 30 s catches the burst and leaves the slow case
+ * alone (the firmware-policy state has had minutes to clear by then
+ * anyway).
  */
 #define BES2600_SCAN_REJECT_THRESHOLD	3
-#define BES2600_SCAN_BACKOFF_JIFFIES	(10 * HZ)
+#define BES2600_SCAN_BACKOFF_JIFFIES	(30 * HZ)
 
 static void bes2600_scan_restart_delayed(struct bes2600_vif *priv);
 
@@ -40,7 +48,9 @@ static void bes2600_scan_restart_delayed(struct bes2600_vif *priv);
  * 2. We already saw >= BES2600_SCAN_REJECT_THRESHOLD consecutive
  *    rejections on recent scan attempts and the backoff window has
  *    not yet elapsed. Whatever was rejecting them is likely still
- *    rejecting them; give it time.
+ *    rejecting them; give it time. If the backoff has elapsed without
+ *    a fresh reject refreshing it, the burst is over and we reset the
+ *    count so an isolated reject doesn't immediately re-trip.
  *
  * Returns true if the caller should abandon the scan iteration.
  */
@@ -51,6 +61,9 @@ static bool bes2600_scan_should_defer(struct bes2600_common *hw_priv)
 		return true;
 #endif
 
+	if (time_after(jiffies, hw_priv->scan.backoff_until))
+		hw_priv->scan.reject_count = 0;
+
 	if (hw_priv->scan.reject_count >= BES2600_SCAN_REJECT_THRESHOLD &&
 	    time_before(jiffies, hw_priv->scan.backoff_until))
 		return true;
-- 
2.53.0
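
Reviewer note, not part of the patch: the first steps of the roam-burst
timeline in the commit message can be checked with a small userspace
model. This is only a sketch under stated assumptions. The names
scan_model, model_should_defer, model_on_reject and model_burst are
hypothetical and do not exist in the driver. Time is counted in whole
seconds rather than jiffies, plain comparisons stand in for
time_after()/time_before(), the firmware is assumed to reject every scan
that reaches it, and a deferred attempt is assumed to leave
backoff_until untouched.

```c
#include <stdbool.h>

#define REJECT_THRESHOLD 3 /* mirrors BES2600_SCAN_REJECT_THRESHOLD */

/* Hypothetical stand-in for the scan fields of struct bes2600_common. */
struct scan_model {
	unsigned int reject_count;
	long backoff_until; /* seconds here, jiffies in the driver */
};

/* Patched defer check: decay the count once the window has elapsed. */
static bool model_should_defer(struct scan_model *s, long now)
{
	if (now > s->backoff_until)
		s->reject_count = 0;

	return s->reject_count >= REJECT_THRESHOLD && now < s->backoff_until;
}

/* Assumed reject handling: bump the count and restart the window. */
static void model_on_reject(struct scan_model *s, long now, long backoff)
{
	s->reject_count++;
	s->backoff_until = now + backoff;
}

/*
 * Replay `attempts` scan requests spaced `cadence` seconds apart and
 * return how many reach the (always rejecting) firmware.
 */
static int model_burst(long cadence, long backoff, int attempts)
{
	struct scan_model s = { 0, 0 };
	int sent = 0;

	for (int i = 0; i < attempts; i++) {
		long now = (long)i * cadence;

		if (model_should_defer(&s, now))
			continue; /* deferred: no WSM round-trip */

		sent++;
		model_on_reject(&s, now, backoff);
	}
	return sent;
}
```

For the first five attempts of a 12 s burst (t=0 to t=48),
model_burst(12, 10, 5) returns 5 because every retry lands outside the
10 s window, while model_burst(12, 30, 5) returns 3 because the attempts
at t=36 and t=48 are deferred. Results for longer bursts depend on the
assumption that a deferred attempt does not refresh backoff_until, which
the patch text does not settle.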