3d833f8ccf31895a2ce7bf4fd4ef839e653b29bb
8 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
3d833f8ccf |
bes2600: reset firmware state on wsm_join_confirm failure
When wsm_join_confirm() returns status != WSM_STATUS_SUCCESS (ret 1),
the driver cleared its bookkeeping but did not reset the firmware
interface, leaving it in an intermediate post-rejection state. A rapid
second JOIN attempt (e.g. wpa_supplicant retrying after the
PREV_AUTH_NOT_VALID deauth that mac80211 emits to clean up) hits an
inconsistent firmware context, causing bes2600_sdio_read_rx_batch to
return an SDIO error, which cascades into wifi_force_close:
  wsm_join_confirm ret 1
  deauthenticating from <bssid> by local choice (Reason: 2=PREV_AUTH_NOT_VALID)
  [~10 min later]
  bes2600_sdio_read_rx_batch sdio read error
  WARNING: at bes2600_tx_loop_set_enable / bes2600_chrdev_wifi_force_close
Two additions to the failure path in bes2600_join_work():
1. wsm_reset (WSM_REQ_ID_RESET, 0x000A) with reset_statistics=false.
This returns the firmware to IDLE so the next association attempt
starts from a known-clean state. bes2600_unjoin_work() performs the
same reset, but gates it on join_status != PASSIVE; after a failed
JOIN join_status stays PASSIVE, so that path never fires. Call
wsm_reset directly here instead.
Contract: wsm_reset takes only wsm_cmd_lock (not conf_lock, not
wsm_oper_lock). wsm_oper_unlock was already called inside
wsm_join_confirm() before wsm_join() returned -EINVAL, so there is no
re-entrancy hazard. conf_lock is held at this call site, which is
compatible with wsm_reset's locking requirements.
2. queue_work(workqueue, &priv->unjoin_work) instead of direct
wsm_unlock_tx(). This serialises the next association attempt through
the workqueue so it cannot race against lingering firmware-side
effects of the failure. If unjoin_work is already queued, release TX
immediately (matching cw1200 ancestor sta.c:1344 comment "Tx lock
still held, unjoin will clear it.").
Ancestor reference: drivers/net/wireless/st/cw1200/sta.c, function
cw1200_join_work(), lines 1339-1344. cw1200 queues unjoin_work on join
failure for the same reason. bes2600 needs the direct wsm_reset in
addition because its unjoin_work has the join_status gate that
cw1200's cw1200_do_unjoin() does not.
Signed-off-by: Claude (noether) <claude@reauktion.de>
|
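The control flow described in this message can be illustrated with a small userspace model. This is only a sketch: every type and function name below (model_priv, model_queue_unjoin, model_unjoin_work, model_join_failed) is a hypothetical stand-in, not driver code, and the firmware is reduced to a single "idle" flag.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; not the real bes2600 structures. */
enum model_join_status { MODEL_PASSIVE, MODEL_JOINED };

struct model_priv {
	enum model_join_status join_status;
	bool fw_idle;        /* firmware is in a known-clean IDLE state */
	bool unjoin_queued;  /* unjoin_work is pending */
	int tx_lock;         /* TX lock depth held by the join path */
};

/* Stand-in for queue_work(): false if the work was already pending. */
bool model_queue_unjoin(struct model_priv *p)
{
	if (p->unjoin_queued)
		return false;
	p->unjoin_queued = true;
	return true;
}

/* Stand-in for unjoin_work: the reset is gated on join_status. */
void model_unjoin_work(struct model_priv *p)
{
	if (p->join_status != MODEL_PASSIVE) {
		p->fw_idle = true;              /* models wsm_reset() */
		p->join_status = MODEL_PASSIVE;
	}
	p->unjoin_queued = false;
	p->tx_lock--;                       /* unjoin releases TX */
}

/* Failure path after a rejected JOIN, following the two additions. */
void model_join_failed(struct model_priv *p)
{
	p->fw_idle = true;                  /* 1. direct reset, no gate */
	if (!model_queue_unjoin(p))         /* 2. defer the TX release */
		p->tx_lock--;                   /* already queued: release now */
}
```

In this model, running model_unjoin_work() alone on a PASSIVE interface leaves fw_idle false, which is the gap the direct reset closes.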
||
|
|
447240cbe8 |
bes2600: Patch C2 — replace ieee80211_rx_irqsafe with ieee80211_rx_ni
Per Phase 4 plan PR #14 + kerneldoc audit (Task #19). Six call sites
deferred per-RX-frame mac80211 dispatch via tasklet; replace them with
ieee80211_rx_ni(), the synchronous process-context API, which does its
own local_bh_disable wrap.
Why _ni and not _list: the Phase 4 plan originally targeted
ieee80211_rx_list for batch delivery. Mining mt76 mainline (the only
driver using _list) showed the canonical pattern requires threading a
struct list_head through the per-frame call chain. bes2600's WSM
dispatcher (wsm_handle_rx -> bes2600_rx_cb / wsm.c beacon path) sits
between the bh thread's SDIO read and the mac80211 hand-off; threading
a list_head through the dispatcher is a non-trivial refactor.
ieee80211_rx_ni() is the simpler drop-in: no list management, still
removes the tasklet hop. Per-call local_bh_disable cost is trivial vs
the saved tasklet schedule. A future refactor can revisit _list if
measurements warrant.
Sites converted:
- ap.c:96 (bes2600_sta_add link-id rx_queue drain on AP-mode STA add).
Was inside spin_lock_bh(&ps_state_lock); refactored to splice the
queue under the lock, then deliver after unlock, because _ni runs the
synchronous mac80211 RX path inline and would otherwise hold the lock
across mac80211 dispatch. The splice uses skb_queue_splice_init into a
local sk_buff_head.
- sta.c:1487 (deauth-frame inject in inactivity-event handler). Not
under any lock; direct conversion.
- txrx.c:1960 (early-data + pm_unsupported branch from Patch E).
- txrx.c:1967 (early-data + LINK_SOFT-not-set branch).
- txrx.c:1971 (normal RX path in bes2600_rx_cb).
- wsm.c:2415 (beacon delivery in scan-complete WSM handler). Beacon
SKB ownership is preserved by the existing
skb_copy(beacon, GFP_ATOMIC) -> beacon_bkp pattern; no lifecycle
change needed.
Mixing constraint (kerneldoc include/net/mac80211.h:5399-5430):
ieee80211_rx_ni() cannot mix with ieee80211_rx_irqsafe() for a single
hardware. All 6 sites convert atomically; no mixed state.
Build verified clean on ohm sandbox: srcversion
619A51E61BF5479AAC146E6.
Predicted Phase 7 delta: +5-15% over v3+D+E baseline (2.35 MB/s mean
on v3 alone; D+E single-rep was 3.22 MB/s). Modest improvement
expected from removing the tasklet schedule per RX frame. Smaller
deltas would still be a net win for upstream-cleanliness: the
kernel.org submission story benefits from not using _irqsafe from
process context.
|
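The ap.c change (splice under the lock, deliver after unlock) follows a general pattern that can be shown outside the kernel. The sketch below is a single-threaded userspace model under stated assumptions: frame_queue, queue_splice_init, deliver and drain_rx_queue are hypothetical names, the "lock" is a plain flag standing in for ps_state_lock, and deliver() stands in for the mac80211 hand-off.

```c
#include <stdbool.h>
#include <stddef.h>

struct frame { int id; struct frame *next; };

/* Hypothetical queue; "locked" models holding the queue's spinlock. */
struct frame_queue { struct frame *head, *tail; bool locked; };

struct delivery_stats { int delivered; int delivered_under_lock; };

void queue_push(struct frame_queue *q, struct frame *f)
{
	f->next = NULL;
	if (q->tail)
		q->tail->next = f;
	else
		q->head = f;
	q->tail = f;
}

/* Move every frame from src to the tail of dst in O(1); src ends empty. */
void queue_splice_init(struct frame_queue *src, struct frame_queue *dst)
{
	if (!src->head)
		return;
	if (dst->tail)
		dst->tail->next = src->head;
	else
		dst->head = src->head;
	dst->tail = src->tail;
	src->head = src->tail = NULL;
}

/* Stand-in for the synchronous RX hand-off; records the lock state. */
void deliver(const struct frame_queue *shared, const struct frame *f,
	     struct delivery_stats *stats)
{
	(void)f;
	stats->delivered++;
	if (shared->locked)
		stats->delivered_under_lock++;
}

void drain_rx_queue(struct frame_queue *shared, struct delivery_stats *stats)
{
	struct frame_queue local = { NULL, NULL, false };
	struct frame *f, *next;

	shared->locked = true;              /* lock: only the splice runs here */
	queue_splice_init(shared, &local);
	shared->locked = false;             /* unlock before any delivery */

	for (f = local.head; f; f = next) {
		next = f->next;                 /* read before handing the frame off */
		deliver(shared, f, stats);
	}
}
```

The lock hold time becomes constant regardless of queue depth, and the delivery callback is free to take other locks without nesting under this one.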
||
|
|
93f2aab656 |
bes2600: Patch D — atomicize ba_lock counters, drop the spinlock
The block-ack policy uses 4 int counters (ba_acc, ba_cnt, ba_acc_rx,
ba_cnt_rx) bumped per data frame in the TX and RX hot paths under
spin_lock_bh(&hw_priv->ba_lock). The lock was the heaviest per-frame
synchronization cost remaining after Patch C v3 (which fixed the
sdio_rx_work relay). Per the Opus structural critique (PR #8), this
pattern matches mac80211 driver convention for per-frame statistics:
atomic_t suffices, no lock needed.
Field-by-field changes in struct bes2600_common:
- ba_acc, ba_cnt, ba_acc_rx, ba_cnt_rx: int -> atomic_t
- ba_armed: new atomic_t (timer-arm flag)
- ba_ena: bool -> atomic_t
- ba_lock: removed (spinlock_t deleted)
- ba_hist: int (single-writer = ba_timer)
Producer hot path (txrx.c TX submit + RX receive):
- atomic_add for the byte accumulator
- atomic_inc for the frame counter
- atomic_cmpxchg(&ba_armed, 0, 1) to claim the once-per-window
mod_timer arm; at most ONE producer succeeds, so the arm is race-free
- no spin_lock_bh
Consumer paths (sta.c bes2600_ba_timer, sta.c disconnect-reset, sta.c
bes2600_ba_work, debug.c debugfs reader):
- atomic_read snapshots all 4 counters into locals; the threshold
predicate (acc/cnt >= THLD) tolerates approximate snapshots, since the
timer fires periodically and a single misclassification just delays
the policy update by one tick
- atomic_set zeroes the counters at end of timer-callback window;
racing producer increments after the snapshot are lost (acceptable for
stats; same approximation the original lock allowed under contention)
- atomic_set(&ba_armed, 0) re-enables the next window's arm
Followup-amenable simplification: ba_hist remains int because only the
single ba_timer callback writes it; multiple writers would need to
upgrade it too.
This patch follows the cw1200-mainline-idiom established by Patch C v3
(structural fix, not bandaid). The cw1200 reference doesn't have a
similar lock to compare; bes2600 inherited this from a later
Bestechnic addition rather than the upstream tree.
|
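The producer/consumer split described in this message maps onto C11 atomics in userspace. The following is a minimal sketch, not the driver code: it uses <stdatomic.h> in place of the kernel's atomic_t API, models only one counter pair, and the names ba_stats, ba_account_frame and ba_window_expired are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reduced model: one byte/frame counter pair plus arm flag. */
struct ba_stats {
	atomic_int acc;    /* bytes seen in the current window */
	atomic_int cnt;    /* frames seen in the current window */
	atomic_int armed;  /* 1 once a producer has claimed the timer arm */
};

void ba_stats_init(struct ba_stats *s)
{
	atomic_init(&s->acc, 0);
	atomic_init(&s->cnt, 0);
	atomic_init(&s->armed, 0);
}

/*
 * Producer hot path. Returns true for exactly one caller per window:
 * the one whose compare-exchange flips armed from 0 to 1. That caller
 * would arm the timer (mod_timer in the kernel).
 */
bool ba_account_frame(struct ba_stats *s, int len)
{
	int expected = 0;

	atomic_fetch_add(&s->acc, len);
	atomic_fetch_add(&s->cnt, 1);
	return atomic_compare_exchange_strong(&s->armed, &expected, 1);
}

/*
 * Consumer (timer callback). Snapshots the counters, evaluates the
 * average-frame-size predicate, then resets for the next window.
 * Increments racing with the reset may be lost; the policy tolerates it.
 */
bool ba_window_expired(struct ba_stats *s, int thld)
{
	int acc = atomic_load(&s->acc);
	int cnt = atomic_load(&s->cnt);
	bool enable = cnt > 0 && acc / cnt >= thld;

	atomic_store(&s->acc, 0);
	atomic_store(&s->cnt, 0);
	atomic_store(&s->armed, 0);     /* allow the next window's arm */
	return enable;
}
```

The compare-exchange is what replaces the lock for the arm decision: the counters themselves only need to be individually atomic, while the one-shot claim needs a read-modify-write.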
||
|
|
a02f8b7629 |
bes2600: Patch G — restore SPDX identifiers + ST-Ericsson attribution
The bes2600 driver is a fork of the upstream cw1200 driver
(drivers/net/wireless/st/cw1200/, ST-Ericsson, Dmitry Tarnyagin
2010-2011). The fork's file headers have three GPL-compliance issues:
1. NO SPDX-License-Identifier on any of 48 source files (cw1200
mainline has them on all 25). kernel.org-mandated since 2017.
2. Original "Copyright (c) 2010, ST-Ericsson" lines stripped from
all files inherited from cw1200, replaced with
"Copyright (c) 2010, Bestechnic" — factually impossible
(Bestechnic did not author the 2010 work) and a GPL-2.0 §1
attribution-preservation violation.
3. The "GPL version 2 as published by the Free Software Foundation"
boilerplate paragraph is redundant alongside SPDX and is the
legacy form modern kernel sources have replaced.
This patch corrects all three for the 48 .c/.h files in bes2600/:
- Adds `// SPDX-License-Identifier: GPL-2.0-only` (or `/* ... */`
for headers) as line 1 of every file.
- Restores `Copyright (c) 2010, ST-Ericsson` + `Author: Dmitry
Tarnyagin <dmitry.tarnyagin@lockless.no>` as the FIRST copyright
chain entry on all 22 files derived from cw1200 (bh.{c,h},
debug.{c,h}, fwio.{c,h}, hwio.{c,h}, main.c, pm.{c,h},
queue.{c,h}, scan.{c,h}, sta.{c,h}, txrx.{c,h}, wsm.{c,h}).
- Keeps `Copyright (c) 2022, Bestechnic (Beijing) Co., Ltd.` as
the SECOND chain entry where Bestechnic genuinely contributed.
- Notes "Derived from cw1200_sdio.c" + ST-Ericsson copyright on
bes2600_sdio.c (heavy derivation, not a literal rename).
- Notes "Replaces hwbus.h from cw1200/" + ST-Ericsson copyright
on sbus.h.
- Preserves the prism54/islsm authorship chain on main.c and
bes2600.h (Michael Wu 2006 + Jean-Baptiste Note 2004-2006).
- Drops the GPL-2.0 boilerplate paragraph in favour of SPDX.
No code changes — only file-header comment blocks. Module build is
unaffected (verified by header-only diff scope).
This is a prerequisite for any kernel.org submission attempt. The
existing MODULE_LICENSE("GPL") + MODULE_AUTHOR(Tarnyagin@stericsson.com)
declarations were already present and are unchanged here; the
mismatch between MODULE_AUTHOR and the (since-corrected) per-file
copyrights is now resolved.
|
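Assembled from the bullet points above, the header of a .c file derived from cw1200 would look roughly as follows. This is a sketch of the layout only; the exact comment wording and any per-file description line in the actual patch are not shown in this message and are left out here.

```c
// SPDX-License-Identifier: GPL-2.0-only
/*
 * Copyright (c) 2010, ST-Ericsson
 * Author: Dmitry Tarnyagin <dmitry.tarnyagin@lockless.no>
 *
 * Copyright (c) 2022, Bestechnic (Beijing) Co., Ltd.
 */
```

For header files the first line would use the `/* SPDX-License-Identifier: GPL-2.0-only */` comment form instead, as noted in the list.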
||
|
|
a7e232738d |
bes2600: bus_reset on connection-loss storm to dodge assoc-comeback blackhole
When mac80211 declares connection loss against this AP (typically driven
by inactivity-deauth or beacon-loss), the userspace reauth that follows
sometimes enters a long blackhole: the AP responds to auth with success
but defers assoc with the 802.11w "assoc comeback" timer; ohm retries
faster than the comeback grants permission; the AP eventually fires an
unprotected deauth-reason-6 ("Class 2 frame received from non-
authenticated station"), and recovery only completes via cross-SSID or
cross-channel fallback. Receipts: ~86 s blackhole observed in the
phase-7 rep on 2026-05-07 02:42, with three subsequent BSSIDs returning
assoc comeback timeouts before reason-9 (STA_REQ_ASSOC_WITHOUT_AUTH)
fired. Documented in marfrit/besser:notes/phase4-2026-05-07.md.
When N=3 driver-side connection_loss decisions fire within a 60 s window
on the same vif, skip the ieee80211_connection_loss() path and trigger
the c5.2-introduced bes2600_chrdev_do_bus_reset() instead. The bus
reset removes and re-probes the chip; userspace re-associates with a
fresh chip state, dodging the AP's comeback-timer rejection cycle.
Predicted Phase 7 delta vs current baseline:
- api_connection_loss rate: unchanged (we don't address the trigger)
- conditional probability of >5 s blackhole given event: <= 30 %
- worst-case recovery: 86 s -> < 10 s
Contract pin: bes2600_chrdev_do_bus_reset(sbus_ops, sbus_priv) at
bes2600/bes_chardev.c:455, introduced by c5.2. The function is async-
returning: sbus_ops->bus_reset() schedules an SDIO rescan; the helper
waits up to 3 s for the remove() callback to clear sbus_priv, then
returns. Per-vif state is gone after this point, so the recover work
lives on bes2600_common (hw_priv) and uses the global bes2600_cdev for
the bus_reset call rather than dereferencing per-vif state.
Threshold (3 / 60 s) is well above the steady-state per-vif
connection_loss rate observed in the patch-A phase-7 rep (0.86/h under
sustained load), so a true storm is required to trip it.
Files touched:
- bes2600/bes2600.h: 3 counter fields on struct bes2600_vif, 1
work_struct on struct bes2600_common, 3 prototypes
- bes2600/sta.c: 3 helpers + storm-account hook in
bes2600_connection_loss_work + storm-init in bes2600_vif_setup +
cancel_work_sync in the hw_priv shutdown path; #include bes_chardev.h
was already pulled in by an earlier c-stack patch
- bes2600/main.c: INIT_WORK alongside other hw_priv work_structs
- bes2600/debug.c: ConnectionLossStormRecoveries seq_printf in the
per-vif status seq_file output
The cw1200/cw1260 ancestor has no equivalent; this is a clean
addition. checkpatch.pl --no-tree --strict: clean (0/0/0).
Signed-off-by: Claude (noether) <claude@reauktion.de>
|
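The 3-in-60-s storm check can be sketched as a fixed-window counter. The message does not spell out how the patch tracks its window, so the struct layout, the names storm_window and storm_account, and the choice of a fixed (rather than sliding) window are all assumptions for illustration; time is passed in as plain seconds instead of jiffies.

```c
#include <stdbool.h>

#define STORM_THRESHOLD 3    /* events needed to trip (from the message) */
#define STORM_WINDOW_S  60   /* window length in seconds (from the message) */

/* Hypothetical per-interface state. */
struct storm_window {
	long window_start;       /* time of the first event in this window */
	int count;               /* events seen in this window */
	int recoveries;          /* how many times the storm path fired */
};

/*
 * Account one connection-loss decision at time now_s. Returns true when
 * the caller should take the storm path (bus reset) instead of the
 * normal connection-loss path.
 */
bool storm_account(struct storm_window *w, long now_s)
{
	if (w->count == 0 || now_s - w->window_start >= STORM_WINDOW_S) {
		w->window_start = now_s;    /* open a fresh window */
		w->count = 1;
	} else {
		w->count++;
	}

	if (w->count < STORM_THRESHOLD)
		return false;

	w->count = 0;                   /* start over after a recovery */
	w->recoveries++;
	return true;
}
```

A fixed window is cheap (three integers) but can miss a burst that straddles a window boundary; whether that trade-off matches the patch is not stated in the message.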
||
|
|
3b4239ad2b |
bes2600: pre-empt AP-deauth-6 with mac80211 reassoc on decrypt-fail storm
When the BES2600 firmware reports WSM_STATUS_DECRYPTFAILURE for a burst
of received frames (typically because the host's PTK or GTK has fallen
out of sync with the AP), the AP eventually concludes that the STA is
not authenticated and emits an unprotected deauth-reason-6 ("Class 2
frame received from non-authenticated station"). On the deployed
pinetab2 + bes2600 stack this AP-initiated deauth has been observed to
leave the link blackholed for up to 109 s before userspace finds a
different SSID/channel to recover on. (Receipts at
https://git.reauktion.de/marfrit/besser, notes/phase5-2026-05-06.md.)
Add a sliding-window counter on each bes2600_vif: when 5 decrypt
failures fire within 5 s, schedule a worker that calls
ieee80211_connection_loss(vif). mac80211 then performs immediate
disassociation; userspace (NetworkManager / wpa_supplicant) reconnects
with fresh keys before the AP gets a chance to fire its unprotected
deauth.
Predicted Phase 7 delta vs the unpatched baseline:
- decrypt-burst rate: unchanged (this does not address root cause)
- AP-deauth-6 rate: <= 0.2 of baseline
- conditional probability of >5s blackhole given a burst:
100% -> <= 10%
- worst-case recovery time: 109s -> <5s
Contract pin: ieee80211_connection_loss() per
include/net/mac80211.h: "may also be called if the connection needs to
be terminated for some other reason... will cause immediate change to
disassociated state, without connection recovery attempts." Userspace
recovery is the existing NM/wpa_supplicant path. The worker context
satisfies the implicit process-context expectation.
Files touched:
- bes2600/bes2600.h: 4 new fields on struct bes2600_vif + 2 prototypes
- bes2600/txrx.c: new helpers + the call site at the existing
WSM_STATUS_DECRYPTFAILURE log point (the unconditional "goto drop"
branch in bes2600_rx_cb)
- bes2600/sta.c: bes2600_decrypt_storm_init() in bes2600_vif_setup;
cancel_work_sync() in bes2600_remove_interface, alongside the
existing per-vif cancel_*_work_sync block. Safe under the kernel
cancel_work_sync contract: the work_struct is INIT_WORK'd in setup,
so the call is valid; it blocks until any in-flight handler returns,
ensuring no use-after-free of priv when mac80211 frees the vif; and
it is idempotent (subsequent calls just return false).
- bes2600/debug.c: DecryptStormRecoveries seq_printf in the per-vif
status seq_file output
Threshold (5/5s) is set well above the steady-state per-vif decrypt-
fail rate observed in measurement (~1/min even under sustained 1 MB/s
load), so a true storm is required to trip it. The cw1200/cw1260
ancestor has no equivalent storm-recovery; this is a clean addition.
checkpatch.pl --no-tree --strict: clean (0/0/0).
Signed-off-by: Claude (noether) <claude@reauktion.de>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
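One way to implement a sliding-window trigger like the 5-in-5-s rule is a small ring of recent timestamps: a storm is declared when the oldest of the last five failures is no more than 5 s old. The sketch below is a userspace illustration under that assumption; decrypt_storm, decrypt_storm_account and the field names are hypothetical and are not taken from the patch, and time is passed in as milliseconds instead of jiffies.

```c
#include <stdbool.h>

#define STORM_EVENTS    5       /* failures needed to trip (from the message) */
#define STORM_WINDOW_MS 5000UL  /* window length (from the message) */

/* Hypothetical per-interface state. */
struct decrypt_storm {
	unsigned long stamps[STORM_EVENTS]; /* times of the last N failures */
	unsigned int next;                  /* slot that will be overwritten next */
	unsigned int filled;                /* valid slots, saturates at N */
	unsigned int recoveries;            /* times the recovery worker was scheduled */
};

/*
 * Account one decrypt failure at time now_ms. Returns true when the
 * caller should schedule the recovery worker.
 */
bool decrypt_storm_account(struct decrypt_storm *s, unsigned long now_ms)
{
	unsigned long oldest;

	s->stamps[s->next] = now_ms;
	s->next = (s->next + 1) % STORM_EVENTS;
	if (s->filled < STORM_EVENTS)
		s->filled++;
	if (s->filled < STORM_EVENTS)
		return false;               /* fewer than N failures recorded */

	/* After the advance, next points at the oldest of the last N. */
	oldest = s->stamps[s->next];
	if (now_ms - oldest > STORM_WINDOW_MS)
		return false;               /* spread too thin to be a storm */

	s->filled = 0;                  /* forget history after a recovery */
	s->next = 0;
	s->recoveries++;
	return true;
}
```

With failures arriving every 2 s the check never fires, while five failures 100 ms apart trip it on the fifth.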
||
|
|
e0d752aae9 | sync bes2600/ to v7.0-danctnix1 baseline (rebasing reference) | ||
|
|
ba20341e70 |
Upload
Source: https://github.com/cringeops/bes2600
Source: https://github.com/cringeops/bes2600/pull/14
Source: https://github.com/cringeops/bes2600/pull/17
Source: https://github.com/cringeops/bes2600/pull/20
|