9 Commits

Author SHA1 Message Date
test0r cdb6bd07d3 bes2600: reset firmware state on wsm_join_confirm failure
When wsm_join_confirm() returns status != WSM_STATUS_SUCCESS (ret 1),
the driver cleared its bookkeeping but did not reset the firmware
interface, leaving it in an intermediate post-rejection state.  A rapid
second JOIN attempt (e.g. wpa_supplicant retrying after the
PREV_AUTH_NOT_VALID deauth that mac80211 emits to clean up) hits an
inconsistent firmware context, causing bes2600_sdio_read_rx_batch to
return SDIO error which cascades into wifi_force_close:

  wsm_join_confirm ret 1
  deauthenticating from <bssid> by local choice (Reason: 2=PREV_AUTH_NOT_VALID)
  [~10 min later]
  bes2600_sdio_read_rx_batch sdio read error
  WARNING: at bes2600_tx_loop_set_enable / bes2600_chrdev_wifi_force_close

Two additions to the failure path in bes2600_join_work():

1. wsm_reset (WSM_REQ_ID_RESET, 0x000A) with reset_statistics=false.
   This returns the firmware to IDLE so the next association attempt
   starts from a known-clean state.  bes2600_unjoin_work() performs the
   same reset, but gates it on join_status != PASSIVE; after a failed
   JOIN join_status stays PASSIVE, so that path never fires — call
   wsm_reset directly here instead.

   Contract: wsm_reset takes only wsm_cmd_lock (not conf_lock, not
   wsm_oper_lock).  wsm_oper_unlock was already called inside
   wsm_join_confirm() before wsm_join() returned -EINVAL, so there is
   no re-entrancy hazard.  conf_lock is held at this call site, which is
   compatible with wsm_reset's locking requirements.

2. queue_work(workqueue, &priv->unjoin_work) instead of direct
   wsm_unlock_tx().  Serialises the next association attempt through
   the workqueue so it cannot race against lingering firmware-side
   effects of the failure.  If unjoin_work is already queued, release
   TX immediately (matching cw1200 ancestor sta.c:1344 comment "Tx lock
   still held, unjoin will clear it.").

Ancestor reference: drivers/net/wireless/st/cw1200/sta.c, function
cw1200_join_work(), lines 1339-1344.  cw1200 queues unjoin_work on join
failure for the same reason.  bes2600 needs the direct wsm_reset in
addition because its unjoin_work has the join_status gate that cw1200's
cw1200_do_unjoin() does not.

Signed-off-by: Claude (noether) <claude@reauktion.de>
2026-05-21 10:43:42 +02:00
test0r 0ec58c0ad5 bes2600: Patch C2 — replace ieee80211_rx_irqsafe with ieee80211_rx_ni
Per Phase 4 plan PR #14 + kerneldoc audit (Task #19).  Six call sites
deferred per-RX-frame mac80211 dispatch via tasklet; replace with the
synchronous-from-process-context API ieee80211_rx_ni() which does its
own local_bh_disable wrap.

Why _ni and not _list:

  Phase 4 plan originally targeted ieee80211_rx_list for batch
  delivery.  Mining mt76 mainline (the only driver using _list)
  showed the canonical pattern requires threading a struct list_head
  through the per-frame call chain.  bes2600s WSM dispatcher
  (wsm_handle_rx -> bes2600_rx_cb / wsm.c beacon path) sits between
  the bh threads SDIO read and the mac80211 hand-off; threading a
  list_head through the dispatcher is a non-trivial refactor.
  ieee80211_rx_ni() is the simpler drop-in: no list management, still
  removes the tasklet hop.  Per-call local_bh_disable cost is trivial
  vs the saved tasklet schedule.  Future refactor can revisit _list
  if measurements warrant.

Sites converted:

  - ap.c:96       (bes2600_sta_add link-id rx_queue drain on AP-mode
                   STA add).  Was inside spin_lock_bh(&ps_state_lock);
                   refactored to splice the queue under the lock then
                   deliver after unlock — _ni runs the synchronous
                   mac80211 RX path inline, would otherwise hold the
                   lock across mac80211 dispatch.  splice via
                   skb_queue_splice_init into a local sk_buff_head.
  - sta.c:1487    (deauth-frame inject in inactivity-event handler).
                   Not under any lock; direct conversion.
  - txrx.c:1960   (early-data + pm_unsupported branch from Patch E).
  - txrx.c:1967   (early-data + LINK_SOFT-not-set branch).
  - txrx.c:1971   (normal RX path in bes2600_rx_cb).
  - wsm.c:2415    (beacon delivery in scan-complete WSM handler).
                   beacon SKB ownership is preserved by the existing
                   skb_copy(beacon, GFP_ATOMIC) -> beacon_bkp pattern;
                   no lifecycle change needed.

Mixing constraint (kerneldoc include/net/mac80211.h:5399-5430):
ieee80211_rx_ni() cannot mix with ieee80211_rx_irqsafe() for a
single hardware.  All 6 sites convert atomically; no mixed state.

Build verified clean on ohm sandbox: srcversion 619A51E61BF5479AAC146E6.

Predicted Phase 7 delta: +5-15% over v3+D+E baseline (2.35 MB/s mean
on v3 alone; D+E single-rep was 3.22 MB/s).  Modest improvement
expected from removing the tasklet schedule per RX frame.  Smaller
deltas would still be a net win for upstream-cleanliness — the
kernel.org submission story benefits from not using _irqsafe from
process context.
2026-05-08 06:40:00 +02:00
marfrit 3dbabf3092 bes2600: Patch D — atomicize ba_lock counters, drop the spinlock (#7) 2026-05-07 22:19:53 +00:00
test0r 44b296647b bes2600: Patch D — atomicize ba_lock counters, drop the spinlock
The block-ack policy uses 4 int counters (ba_acc, ba_cnt, ba_acc_rx,
ba_cnt_rx) bumped per data frame in the TX and RX hot paths under
spin_lock_bh(&hw_priv->ba_lock).  The lock was the heaviest per-frame
synchronization cost remaining after Patch C v3 (which fixed the
sdio_rx_work relay).  Per the Opus structural critique (PR #8), this
pattern matches mac80211 driver convention for per-frame statistics:
atomic_t suffices, no lock needed.

Field-by-field changes in struct bes2600_common:
  ba_acc, ba_cnt, ba_acc_rx, ba_cnt_rx: int -> atomic_t
  ba_armed:                              new atomic_t (timer-arm flag)
  ba_ena:                                bool -> atomic_t
  ba_lock:                               removed (spinlock_t deleted)
  ba_hist:                               int (single-writer = ba_timer)

Producer hot path (txrx.c TX submit + RX receive):
  - atomic_add for the byte accumulator
  - atomic_inc for the frame counter
  - atomic_cmpxchg(&ba_armed, 0, 1) to claim the once-per-window
    mod_timer arm — at most ONE producer succeeds; race-free
  - no spin_lock_bh

Consumer paths (sta.c bes2600_ba_timer, sta.c disconnect-reset, sta.c
bes2600_ba_work, debug.c debugfs reader):
  - atomic_read snapshots all 4 counters into locals; the threshold
    predicate (acc/cnt >= THLD) tolerates approximate snapshots — the
    timer fires periodically, a single misclassification just delays
    the policy update by one tick
  - atomic_set zeroes the counters at end of timer-callback window;
    racing producer increments after the snapshot are lost (acceptable
    for stats; same approximation the original lock allowed under
    contention)
  - atomic_set(&ba_armed, 0) re-enables the next window's arm

Followup-amenable simplification: ba_hist remains int because only
the single ba_timer callback writes it; multiple writers would need
to upgrade it too.

This patch follows the cw1200-mainline-idiom established by Patch C v3
(structural fix, not bandaid).  The cw1200 reference doesn't have a
similar lock to compare; bes2600 inherited this from a later
Bestechnic addition rather than the upstream tree.
2026-05-08 00:17:46 +02:00
test0r 8dd79199f8 bes2600: Patch G — restore SPDX identifiers + ST-Ericsson attribution
The bes2600 driver is a fork of the upstream cw1200 driver
(drivers/net/wireless/st/cw1200/, ST-Ericsson, Dmitry Tarnyagin
2010-2011).  The fork's file headers have three GPL-compliance issues:

  1. NO SPDX-License-Identifier on any of 48 source files (cw1200
     mainline has them on all 25).  kernel.org-mandated since 2017.

  2. Original "Copyright (c) 2010, ST-Ericsson" lines stripped from
     all files inherited from cw1200, replaced with
     "Copyright (c) 2010, Bestechnic" — factually impossible
     (Bestechnic did not author the 2010 work) and a GPL-2.0 §1
     attribution-preservation violation.

  3. The "GPL version 2 as published by the Free Software Foundation"
     boilerplate paragraph is redundant alongside SPDX and is the
     legacy form modern kernel sources have replaced.

This patch corrects all three for the 48 .c/.h files in bes2600/:

  - Adds `// SPDX-License-Identifier: GPL-2.0-only` (or `/* ... */`
    for headers) as line 1 of every file.
  - Restores `Copyright (c) 2010, ST-Ericsson` + `Author: Dmitry
    Tarnyagin <dmitry.tarnyagin@lockless.no>` as the FIRST copyright
    chain entry on all 22 files derived from cw1200 (bh.{c,h},
    debug.{c,h}, fwio.{c,h}, hwio.{c,h}, main.c, pm.{c,h},
    queue.{c,h}, scan.{c,h}, sta.{c,h}, txrx.{c,h}, wsm.{c,h}).
  - Keeps `Copyright (c) 2022, Bestechnic (Beijing) Co., Ltd.` as
    the SECOND chain entry where Bestechnic genuinely contributed.
  - Notes "Derived from cw1200_sdio.c" + ST-Ericsson copyright on
    bes2600_sdio.c (heavy derivation, not a literal rename).
  - Notes "Replaces hwbus.h from cw1200/" + ST-Ericsson copyright
    on sbus.h.
  - Preserves the prism54/islsm authorship chain on main.c and
    bes2600.h (Michael Wu 2006 + Jean-Baptiste Note 2004-2006).
  - Drops the GPL-2.0 boilerplate paragraph in favour of SPDX.

No code changes — only file-header comment blocks.  Module build is
unaffected (verified by header-only diff scope).

This is a prerequisite for any kernel.org submission attempt.  The
existing MODULE_LICENSE("GPL") + MODULE_AUTHOR(Tarnyagin@stericsson.com)
declarations were already present and are unchanged here; the
mismatch between MODULE_AUTHOR and the (since-corrected) per-file
copyrights is now resolved.
2026-05-08 00:03:50 +02:00
claude-noether f2cf586f89 bes2600: bus_reset on connection-loss storm to dodge assoc-comeback blackhole
When mac80211 declares connection loss against this AP (typically driven
by inactivity-deauth or beacon-loss), the userspace reauth that follows
sometimes enters a long blackhole: the AP responds to auth with success
but defers assoc with the 802.11v "assoc comeback" timer; ohm retries
faster than the comeback grants permission; the AP eventually fires an
unprotected deauth-reason-6 ("Class 2 frame received from non-
authenticated station"), and recovery only completes via cross-SSID or
cross-channel fallback. Receipts: ~86 s blackhole observed in the
phase-7 rep on 2026-05-07 02:42, with three subsequent BSSIDs returning
assoc comeback timeouts before reason-9 (STA_REQ_ASSOC_WITHOUT_AUTH)
fired. Documented in marfrit/besser:notes/phase4-2026-05-07.md.

When N=3 driver-side connection_loss decisions fire within a 60 s window
on the same vif, skip the ieee80211_connection_loss() path and trigger
the c5.2-introduced bes2600_chrdev_do_bus_reset() instead. The bus
reset removes and re-probes the chip; userspace re-associates with a
fresh chip state, dodging the AP's comeback-timer rejection cycle.

Predicted Phase 7 delta vs current baseline:
- api_connection_loss rate: unchanged (we don't address the trigger)
- conditional probability of >5 s blackhole given event: <= 30 %
- worst-case recovery: 86 s -> < 10 s

Contract pin: bes2600_chrdev_do_bus_reset(sbus_ops, sbus_priv) at
bes2600/bes_chardev.c:455, introduced by c5.2. The function is async-
returning: sbus_ops->bus_reset() schedules an SDIO rescan; the helper
waits up to 3 s for the remove() callback to clear sbus_priv, then
returns. Per-vif state is gone after this point, so the recover work
lives on bes2600_common (hw_priv) and uses the global bes2600_cdev for
the bus_reset call rather than dereferencing per-vif state.

Threshold (3 / 60 s) is well above the steady-state per-vif
connection_loss rate observed in the patch-A phase-7 rep (0.86/h under
sustained load), so a true storm is required to trip it.

Files touched:
- bes2600/bes2600.h: 3 counter fields on struct bes2600_vif, 1
  work_struct on struct bes2600_common, 3 prototypes
- bes2600/sta.c: 3 helpers + storm-account hook in
  bes2600_connection_loss_work + storm-init in bes2600_vif_setup +
  cancel_work_sync in the hw_priv shutdown path; #include bes_chardev.h
  was already pulled in by an earlier c-stack patch
- bes2600/main.c: INIT_WORK alongside other hw_priv work_structs
- bes2600/debug.c: ConnectionLossStormRecoveries seq_printf in the
  per-vif status seq_file output

The cw1200/cw1260 ancestor has no equivalent; this is a clean
addition. checkpatch.pl --no-tree --strict: clean (0/0/0).

Signed-off-by: Claude (noether) <claude@reauktion.de>
2026-05-07 12:06:46 +02:00
claude-noether d0f14e3ba7 bes2600: pre-empt AP-deauth-6 with mac80211 reassoc on decrypt-fail storm
When the BES2600 firmware reports WSM_STATUS_DECRYPTFAILURE for a burst
of received frames (typically because the host's PTK or GTK has fallen
out of sync with the AP), the AP eventually concludes that the STA is
not authenticated and emits an unprotected deauth-reason-6 ("Class 2
frame received from non-authenticated station"). On the deployed
pinetab2 + bes2600 stack this AP-initiated deauth has been observed to
leave the link blackholed for up to 109 s before userspace finds a
different SSID/channel to recover on. (Receipts at
https://git.reauktion.de/marfrit/besser, notes/phase5-2026-05-06.md.)

Add a sliding-window counter on each bes2600_vif: when 5 decrypt
failures fire within 5 s, schedule a worker that calls
ieee80211_connection_loss(vif). mac80211 then performs immediate
disassociation; userspace (NetworkManager / wpa_supplicant) reconnects
with fresh keys before the AP gets a chance to fire its unprotected
deauth.

Predicted Phase 7 delta vs the unpatched baseline:
- decrypt-burst rate: unchanged (this does not address root cause)
- AP-deauth-6 rate: <= 0.2 of baseline
- conditional probability of >5s blackhole given a burst:
  100% -> <= 10%
- worst-case recovery time: 109s -> <5s

Contract pin: ieee80211_connection_loss() per
include/net/mac80211.h: "may also be called if the connection needs to
be terminated for some other reason... will cause immediate change to
disassociated state, without connection recovery attempts." Userspace
recovery is the existing NM/wpa_supplicant path. The worker context
satisfies the implicit process-context expectation.

Files touched:
- bes2600/bes2600.h: 4 new fields on struct bes2600_vif + 2 prototypes
- bes2600/txrx.c: new helpers + the call site at the existing
  WSM_STATUS_DECRYPTFAILURE log point (the unconditional "goto drop"
  branch in bes2600_rx_cb)
- bes2600/sta.c: bes2600_decrypt_storm_init() in bes2600_vif_setup;
  cancel_work_sync() in bes2600_remove_interface, alongside the
  existing per-vif cancel_*_work_sync block. Safe under the kernel
  cancel_work_sync contract: the work_struct is INIT_WORK'd in setup,
  so the call is valid; it blocks until any in-flight handler returns,
  ensuring no use-after-free of priv when mac80211 frees the vif; and
  it is idempotent (subsequent calls just return false).
- bes2600/debug.c: DecryptStormRecoveries seq_printf in the per-vif
  status seq_file output

Threshold (5/5s) is set well above the steady-state per-vif decrypt-
fail rate observed in measurement (~1/min even under sustained 1 MB/s
load), so a true storm is required to trip it. The cw1200/cw1260
ancestor has no equivalent storm-recovery; this is a clean addition.

checkpatch.pl --no-tree --strict: clean (0/0/0).

Signed-off-by: Claude (noether) <claude@reauktion.de>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 11:21:51 +02:00
test0r 9012b74eea bes2600: enable CONFIG_BES2600_TESTMODE by default + fix bit-rotted testmode plumbing
The driver implements a mac80211 testmode_cmd operation that dispatches
to a set of vendor commands (GET_TX_POWER_LEVEL, GET_TX_POWER_RANGE,
SET_SNAP_FRAME, TSM_STATS, GET_ROAM_DELAY, GET_STREAM, etc) plus the
BES2600 RF-test path (bes2600_vendor_rf_cmd → firmware
patch_wifi_testMode). The testmode handlers and the .testmode_cmd
binding in struct ieee80211_ops are conditionally compiled under
CONFIG_BES2600_TESTMODE, which previously defaulted to n.

Flip the Makefile default from n to y so wifi_testmode_cmd.o is
included in the build and the .testmode_cmd op is populated. On the
PineTab2 target kernel (linux-pinetab2 6.19.10-danctnix1, built with
CONFIG_NL80211_TESTMODE=y) this exposes the BES2600 RF-test surface
through the standard nl80211 testmode interface ('iw phy0 ...').

This also makes visible two classes of bit-rot that had accumulated
while nobody was building with CONFIG_BES2600_TESTMODE=y:

1. sta.c contains ~41 calls to bes2600_info() / bes2600_err() /
   bes2600_warn() / bes2600_dbg() / bes2600_err_with_cond() - a
   legacy log-macro family carrying a BES2600_DBG_* subsystem-id
   first argument. Neither the macros nor any of the BES2600_DBG_*
   constants are defined anywhere in the tree. The same call pattern
   appears under #if defined(BES2600_DETECTION_LOGIC) in hwio.c and
   under CONFIG_BES2600_ITP in itp.c, both normally disabled.

   Add minimal shim macros to bes_log.h that rewire the calls onto
   the existing bes_info() / bes_err() / bes_warn() / bes_devel()
   family (ignoring the subsystem id). Define BES2600_DBG_SBUS,
   BES2600_DBG_DOWNLOAD, BES2600_DBG_ITP and BES2600_DBG_TEST_MODE
   as 0 constants for documentation / grep.

2. bes2600_start_stop_tsm(), bes2600_get_tsm_params(), and
   bes2600_get_roam_delay() are declared in sta.c with external
   linkage but have no prototype in any header. All callers live in
   sta.c (inside bes2600_testmode_cmd). With CONFIG_BES2600_TESTMODE
   off the compiler never sees them; with it on gcc
   -Werror=missing-prototypes breaks the build.

   Mark the three functions static. (Keeping them file-local also
   matches their actual usage.)

Both changes are strictly scoped to make CONFIG_BES2600_TESTMODE=y
buildable; no behavioural change when the flag is off.

Tested-on: PineTab2 (BES2600WM + RK3566) running linux-pinetab2
6.19.10-danctnix1-1 with CONFIG_NL80211_TESTMODE=y. Module builds
cleanly, nl80211 testmode interface reachable via 'iw phy0 ...' from
userspace.

Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
2026-04-24 08:55:10 +02:00
Julian ba20341e70 Upload
Source: https://github.com/cringeops/bes2600
Source: https://github.com/cringeops/bes2600/pull/14
Source: https://github.com/cringeops/bes2600/pull/17
Source: https://github.com/cringeops/bes2600/pull/20
2025-09-17 16:35:45 +02:00