Files
kernel-agent/patches/driver/bes2600/scan-defer-backoff-tune-danctnix/0001-bes2600-widen-scan-defer-backoff-30s-and-decay-on-quiet.patch
T
claude-noether b04c8cd501 patches/driver/bes2600/*-danctnix + arch/arm64/scs-...: rebased on danctnix baseline (#29 redo)
PR #33's per-series mirrors were generated against the bes2600-dkms
cleanups branch (rooted at fe73571) without rebasing onto the
v7.0-danctnix1 kernel baseline. Result: per-commit diffs carried
stale baseline context (e.g. from_timer rather than the new
timer_container_of API), so the cumulative no longer applied cleanly
to ohm's actual base. pkgrel=6 build #1 failed with 'Hunk #3 FAILED'
in Patch D's sta.c.

Fix: in marfrit/bes2600-dkms, create danctnix-sync branch
(fe73571 + drop-in replace bes2600/ with v7.0-danctnix1's
drivers/staging/bes2600/), rebase cleanups onto it as
cleanups-rebased-on-danctnix, manually resolve the resulting conflicts
keeping each commit's intent + the new baseline context, rebase
Patch H accordingly. Format-patch and re-route to the same series-dir
names as PR #33.

Conflict resolution notes:
- 'remove userspace /dev/bes2600 character device interface' commit:
  the chardev wrapper was removed but two utility funcs that danctnix's
  bes2600_btuart.c depends on (bes2600_chrdev_is_bus_error,
  bes2600_chrdev_switch_subsys_glb) were re-added with EXPORT_SYMBOL_GPL.
  bes2600_switch_bt re-added as static (file-local, called only from
  bes2600_chrdev_switch_subsys_glb).
- Patch D (atomicize ba_lock): re-resolved bes2600_ba_timer's
  timer_container_of() vs from_timer() to keep the new API.
- SCS Makefile @@ hunk counts corrected from -9,6 +9,10 to -9,6 +9,11
  (the original was actually wrong; build-via-fuzz was masking it).

Cumulative b2sum: ka-promote ohm now emits
  eb179c03f35a4dbaec2e40036f0033ef04985bb6b14ab22419d68e5caaa5874f...
  (279 554 bytes, 32 patches resolved).

pkgrel=6 built from this manifest + installed on ohm 2026-05-19 ~23:39.
Functional verification: bes2600 + bes2600_btuart both load, Pattern A
0 over fresh boot, wlan0 associates to newton. srcversion
1A919EED0E6DC2478559B17 differs from pkgrel=5's BEB625FA... — the
reconstruction is functionally equivalent (5 GHz working, no
firmware/driver race conditions) but NOT byte-equivalent (the chardev
utility re-add chose different formatting than the original danctnix
code). Byte-equivalence is not a goal; per-series traceability and
working hardware are.

Closes (proper this time): #29.
Refs: #28, #30, #33 (the half-working attempt), #31, #32.
2026-05-19 23:44:29 +02:00

110 lines
4.8 KiB
Diff

From 179c2e0bf852734631acfc56b2478775215cc5f6 Mon Sep 17 00:00:00 2001
From: Markus Fritsche <fritsche.markus@gmail.com>
Date: Tue, 28 Apr 2026 14:32:18 +0200
Subject: [PATCH 14/29] bes2600: widen scan-defer backoff to 30s and decay
count on quiet
The scan-defer logic added in the previous patch ("bes2600: defer
scan and soften WARN on firmware reject") used a 10-second backoff
window and never cleared reject_count outside of a successful scan.
Field testing on a PineTab2 (linux-pinetab2 6.19.10-danctnix1) shows
two distinct mac80211 scan-retry cadences in practice:
* Idle background scans every ~5 minutes when associated -- well
outside any plausible backoff, the defer guard correctly falls
through to a real WSM scan attempt.
* Roam-evaluation bursts triggered when mac80211 wants to find a
candidate AP for handover (signal degradation, beacon loss,
locally-generated DEAUTH_LEAVING reason=3). Cadence is ~12 s, and
one boot reproduced 14 such rejected scans in 3 minutes during a
single burst, none of which engaged the defer guard because every
retry landed just outside the 10 s window.
Two-line behaviour change to fix that:
1. BES2600_SCAN_BACKOFF_JIFFIES grows from 10*HZ to 30*HZ, so a
12 s-cadence burst stays inside the window across consecutive
rejects and the third reject in the burst trips the threshold
guard. The 5 min idle case is still naturally past the window
and is unaffected.
2. bes2600_scan_should_defer() resets reject_count to 0 when
time_after(jiffies, backoff_until). Without this, reject_count
accumulated indefinitely across the slow-cadence rejects, so an
isolated reject after long quiet would have tripped the
threshold the moment it arrived. After the change, count is
latched only inside an active burst and decays cleanly when the
burst ends.
Net effect on a roam burst:
* t=0 reject #1 (count 1, backoff_until = t0 + 30s)
* t=12 reject #2 (count 2, backoff_until = t1 + 30s)
* t=24 reject #3 (count 3, threshold met, next scan deferred)
* t=36 defer fires, no WSM round-trip, reject not sent
* ... defers continue until the firmware-policy state clears
* scan succeeds -> reject_count = 0, normal cadence resumes
WSM 0x0007 confirm rejections in a burst drop from ~14 to ~3 (just
the scans needed to reach the threshold). wpa_supplicant's reason=3
locally-generated disconnects driven by exhausted roam candidates
during the same burst window also drop.
No new state, no new symbols, no change to mac80211-facing semantics:
the deferred scan still completes via the existing fail: path with
status=-EBUSY, the same response a real firmware-busy would produce.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
---
bes2600/scan.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/bes2600/scan.c b/drivers/staging/bes2600/scan.c
index 5f6af3b..b944adc 100644
--- a/drivers/staging/bes2600/scan.c
+++ b/drivers/staging/bes2600/scan.c
@@ -22,9 +22,17 @@
* After this many consecutive WSM scan rejections from firmware, stop
* issuing new scans for BES2600_SCAN_BACKOFF_JIFFIES and let the state
* that's rejecting them (coex window, firmware-internal busy) clear.
+ *
+ * The backoff has to be at least as long as the natural mac80211 scan-
+ * retry cadence, otherwise the next attempt lands outside the window
+ * and bypasses the defer guard. Observed in the wild on PineTab2:
+ * roam-evaluation bursts at ~12 s cadence, idle background scans at
+ * ~5 min cadence. 30 s catches the burst and leaves the slow case
+ * alone (the firmware-policy state has had minutes to clear by then
+ * anyway).
*/
#define BES2600_SCAN_REJECT_THRESHOLD 3
-#define BES2600_SCAN_BACKOFF_JIFFIES (10 * HZ)
+#define BES2600_SCAN_BACKOFF_JIFFIES (30 * HZ)
static void bes2600_scan_restart_delayed(struct bes2600_vif *priv);
@@ -40,7 +48,9 @@ static void bes2600_scan_restart_delayed(struct bes2600_vif *priv);
* 2. We already saw >= BES2600_SCAN_REJECT_THRESHOLD consecutive
* rejections on recent scan attempts and the backoff window has
* not yet elapsed. Whatever was rejecting them is likely still
- * rejecting them; give it time.
+ * rejecting them; give it time. If the backoff has elapsed without
+ * a fresh reject refreshing it, the burst is over and we reset the
+ * count so an isolated reject doesn't immediately re-trip.
*
* Returns true if the caller should abandon the scan iteration.
*/
@@ -51,6 +61,9 @@ static bool bes2600_scan_should_defer(struct bes2600_common *hw_priv)
return true;
#endif
+ if (time_after(jiffies, hw_priv->scan.backoff_until))
+ hw_priv->scan.reject_count = 0;
+
if (hw_priv->scan.reject_count >= BES2600_SCAN_REJECT_THRESHOLD &&
time_before(jiffies, hw_priv->scan.backoff_until))
return true;
--
2.54.0