Commit Graph

6 Commits

Author SHA1 Message Date
claude-noether 06fab77745 bes2600: bus_reset on connection-loss storm to dodge assoc-comeback blackhole
When mac80211 declares connection loss against this AP (typically driven
by inactivity-deauth or beacon-loss), the userspace reauth that follows
sometimes enters a long blackhole: the AP responds to auth with success
but defers assoc with the 802.11v "assoc comeback" timer; ohm retries
faster than the comeback grants permission; the AP eventually fires an
unprotected deauth-reason-6 ("Class 2 frame received from non-
authenticated station"), and recovery only completes via cross-SSID or
cross-channel fallback. Receipts: ~86 s blackhole observed in the
phase-7 rep on 2026-05-07 02:42, with three subsequent BSSIDs returning
assoc comeback timeouts before reason-9 (STA_REQ_ASSOC_WITHOUT_AUTH)
fired. Documented in marfrit/besser:notes/phase4-2026-05-07.md.

When N=3 driver-side connection_loss decisions fire within a 60 s window
on the same vif, skip the ieee80211_connection_loss() path and trigger
the c5.2-introduced bes2600_chrdev_do_bus_reset() instead. The bus
reset removes and re-probes the chip; userspace re-associates with a
fresh chip state, dodging the AP's comeback-timer rejection cycle.

Predicted Phase 7 delta vs current baseline:
- api_connection_loss rate: unchanged (we don't address the trigger)
- conditional probability of >5 s blackhole given event: <= 30 %
- worst-case recovery: 86 s -> < 10 s

Contract pin: bes2600_chrdev_do_bus_reset(sbus_ops, sbus_priv) at
bes2600/bes_chardev.c:455, introduced by c5.2. The function is async-
returning: sbus_ops->bus_reset() schedules an SDIO rescan; the helper
waits up to 3 s for the remove() callback to clear sbus_priv, then
returns. Per-vif state is gone after this point, so the recover work
lives on bes2600_common (hw_priv) and uses the global bes2600_cdev for
the bus_reset call rather than dereferencing per-vif state.

Threshold (3 / 60 s) is well above the steady-state per-vif
connection_loss rate observed in the patch-A phase-7 rep (0.86/h under
sustained load), so a true storm is required to trip it.

Files touched:
- bes2600/bes2600.h: 3 counter fields on struct bes2600_vif, 1
  work_struct on struct bes2600_common, 3 prototypes
- bes2600/sta.c: 3 helpers + storm-account hook in
  bes2600_connection_loss_work + storm-init in bes2600_vif_setup +
  cancel_work_sync in the hw_priv shutdown path; #include bes_chardev.h
  was already pulled in by an earlier c-stack patch
- bes2600/main.c: INIT_WORK alongside other hw_priv work_structs
- bes2600/debug.c: ConnectionLossStormRecoveries seq_printf in the
  per-vif status seq_file output

The cw1200/cw1260 ancestor has no equivalent; this is a clean
addition. checkpatch.pl --no-tree --strict: clean (0/0/0).

Signed-off-by: Claude (noether) <claude@reauktion.de>
2026-05-19 09:06:07 +02:00
test0r 22b799f5a2 bes2600: recover wedged firmware via mmc_hw_reset on link break
When the LMAC active monitor detects 'link break between lmac and host'
(the hw_buf_used==pending watchdog in bes2600_bh_lmac_active_monitor),
bes2600_chrdev_wifi_force_close(hw_priv, true) is invoked to tear the
device down and prepare for a fresh probe. On the wifi_force_close_work
side this calls bes2600_chrdev_do_system_close() which dispatches
sbus_ops->power_switch(0).

On PineTab2 (RK3566 + BES2600WM over SDIO) this recovery path is a
no-op:

  * bes2600_sdio_power_down() writes a SYSTEM_CLOSE host-int message,
    clears MMC_CAP_NONREMOVABLE, and schedules sdio_scan_work, which is
    the literal one-line stub bes_warn("...this function does
    nothing\n").
  * bes2600_sdio_on() (the eventual power_switch(1) counterpart)
    toggles pdata->powerup, which is NULL on PineTab2 because the
    wifi-reset GPIO is owned by sdio_pwrseq, not the bes2600 device
    tree node (see arch/arm64/boot/dts/rockchip/rk3566-pinetab2.dtsi:
    'The reset pin is claimed by sdio_mmcseq, It is better to move it
    to U-Boot so the OS can use it.').

Net result: the chip is never reset. The function drivers are not
removed (the SDIO core has no signal that the card is gone), the
firmware stays wedged, and a subsequent rmmod bes2600 leaves the SDIO
function in a half-torn-down state. modprobe bes2600 then fails with
'probe with driver bes2600_wlan failed with error -123' (-ENOMEDIUM)
on both functions (:1 wifi, :2 BT-companion) until a full system
reboot.

Observed on PineTab2 (linux-pinetab2 6.19.10-danctnix1-1) after ~150
minutes of background-scan rejects (wsm_generic_confirm 0x0007,
[SCAN] Scan failed (-22)) accumulating until the LMAC stopped
acknowledging TX buffers (hw_buf_used:24 pending:24). Reproducible
under sustained scan pressure.

Add a sbus operation bus_reset() that the recovery path can call when
power_switch() has no effective chip-reset signal of its own. Provide
an SDIO implementation that calls mmc_hw_reset(self->func->card),
which on a multi-function SDIO card (PineTab2 binds func 1 for WLAN
and func 2 for the BT-companion path) takes the remove-and-rescan
path: mmc_sdio_hw_reset() marks the card removed and schedules
mmc_rescan, which tears down the bound function drivers and re-detects
the card on the next sweep, in turn reinvoking bes2600_sdio_probe().
With a single function probed it instead invokes mmc_power_cycle()
directly, which on PineTab2 toggles the wifi-reset GPIO via
sdio_pwrseq.

Add bes2600_chrdev_do_bus_reset() as the chrdev-side helper. It
invokes the bus op and then waits on probe_done_wq for the SDIO
remove() callback to clear sbus_priv, mirroring the wait pattern
already used by bes2600_chrdev_do_system_close() so that a subsequent
bes2600_switch_wifi(true) sees a clean state and can wait on the
fresh probe.

Wire it into bes2600_chrdev_wifi_force_close_work(): when halt_dev is
set (the hard-exception path used by both
bes2600_bh_lmac_active_monitor and bes2600_bh_mcu_active_monitor) and
the underlying bus implements bus_reset, take the new recovery path;
otherwise fall back to the legacy power_switch(0) sequence so this
patch is a no-op on USB or any other future bus that does not provide
bus_reset.

mmc_hw_reset() is exported by the MMC core and is the canonical
recovery primitive; calling it without holding the SDIO host claim is
correct because the multi-func remove-and-rescan path acquires the
host claim via the mmc workqueue, and the single-func mmc_power_cycle
path does not require the host claim.

No DT change is required: this works against the existing PineTab2
DTS, where the wifi-reset GPIO and the optional sdio_pwrkey GPIO (on
v2.0 boards) are both already configured as MMC pwrseq resets.

Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
2026-05-19 09:06:07 +02:00
test0r 0768e11da6 bes2600: drop BES2600_WRITE_DPD_TO_FILE kernel_*() file paths
bes_chardev.c carried three functions gated behind the
BES2600_WRITE_DPD_TO_FILE Kconfig/make-flag (default off):

  - bes2600_chrdev_write_dpd_data_to_file()
      filp_open(O_CREAT | O_TRUNC | O_RDWR) + kernel_write()
      writing a raw DPD calibration blob back to
      BES2600_DPD_PATH (default /data/cfg/bes2600_dpd.bin, an
      Android-AOSP path).

  - bes2600_chrdev_read_and_check_dpd_data()
      filp_open(O_RDONLY) + kernel_read() reading the DPD blob
      from either BES2600_DPD_GOLDEN_PATH (/data/cfg/…) or
      BES2600_DEFAULT_DPD_PATH (/lib/firmware/bes2600_dpd.bin),
      followed by a CRC/version sanity check.

  - bes2600_chrdev_dpd_is_vaild() (sic), the CRC/version helper
    used only by the read path.

Plus the bes_cdev.no_dpd field, its module_param, and two
intrusion sites in bes2600_chrdev_get_dpd_data() and
bes2600_chrdev_update_dpd_data() that invoke the above.

The Makefile defaults BES2600_WRITE_DPD_TO_FILE=n, so in a stock
build all of this is dead code. It is still a standing upstream
blocker for exactly the same reasons as the factory-txt write
path removed in the preceding patch:

  - filp_open() + kernel_read()/kernel_write() bypass the
    firmware-class abstraction and LSM-governed access control
    that apply to /lib/firmware/.
  - The write target /data/cfg/ is an Android AOSP convention
    that does not exist on a Linux distribution and cannot be
    created by the kernel anyway.
  - A runtime DPD re-calibration is intended to reduce TX EVM
    after temperature or aging drift; persisting the result via
    kernel_write() is fundamentally a userspace concern (debugfs
    dump + userspace tool is the expected route).

Remove the entire #ifdef BES2600_WRITE_DPD_TO_FILE block from
bes_chardev.c (including the inner #ifdef inside
bes2600_chrdev_read_and_check_dpd_data() guarding a
DPD_BIN_FILE_SIZE size check that only applied to the read-back-
its-own-write case), the no_dpd field and module_param, and the
two invocation sites. Drop the Kconfig/make-flag and the three
associated PATH macros from the Makefile. Net: -155 lines, no
remaining filp_open/kernel_read/kernel_write anywhere in
bes_chardev.c.

The in-memory DPD state path is unchanged: bes2600_chrdev_get_dpd_
buffer() still allocates a kmalloc'd buffer used by the firmware-
download path, bes2600_chrdev_update_dpd_data() still validates
the buffer's CRC and transitions bes2600_cdev.wait_state on
success, and bes2600_chrdev_free_dpd_data() still releases the
buffer on unload. Only the file-I/O side-channel is removed.

Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
2026-05-19 09:06:07 +02:00
test0r cd5f85e104 bes2600: remove userspace /dev/bes2600 character device interface
bes_chardev.c implemented a custom character device at /dev/bes2600 with
its own parser and command-dispatch table, exposing operations such as
'wifi on|off', 'bt on|off', 'change_fw_type <n>', 'bt_wakeup',
'bt_sleep', and 'wakeup_read_flag'. None of these surfaces are used by
the in-tree driver - every kernel call site consumes the internal state
accessors (bes2600_chrdev_is_signal_mode, bes2600_chrdev_get_fw_type,
etc) directly, not through the cdev.

The cdev interface is a standing upstream blocker for two reasons:

  1. Drivers under drivers/staging/ and drivers/net/wireless/ are
     expected to expose tuning via the firmware/nl80211/debugfs
     infrastructure rather than a private /dev node with an ad-hoc
     parser.

  2. The cdev handlers keep a global bes_cdev singleton alive whose
     ->cdev, ->dev_id, ->class and ->device pointers exist only to be
     torn down; they add no functionality that nl80211 or rfkill do
     not already provide (wifi/bt on-off, module_param for fw_type).

Remove the userspace interface:

  - open / read / write / release file_operations handlers and the
    bes2600_chardev_fops instance
  - bes2600_op_* command handlers and bes2600_op_map_tab dispatcher
  - bes2600_get_cmd_and_ifname / bes2600_recyle_cmd_and_ifname_mem
    string helpers
  - bes2600_load_uevent (its only caller was
    bes2600_chrdev_wifi_force_close_work informing userspace of a
    state it already gates via rfkill; that snprintf +
    kobject_uevent_env block is gone too, the kernel-side
    halt_device + switch_wifi(0) + chrdev_check_system_close
    sequence remains)
  - alloc_chrdev_region / cdev_init / cdev_add / class_create /
    device_create in bes2600_chrdev_init plus the fail1/fail2/fail3
    unwind labels
  - cdev_del / unregister_chrdev_region / device_destroy /
    class_destroy in bes2600_chrdev_free
  - cdev/dev_id/major/minor/class/device fields in struct bes_cdev

What remains (unchanged behaviour):

  - fw_type module parameter - the primary user-facing knob for
    signal/no-signal/BT mode switch
  - All in-kernel bes2600_chrdev_* accessor functions called from
    bes2600_sdio.c, bes_pwr.c, sta.c, bh.c, main.c, wsm.c, and
    wifi_testmode_cmd.c (13 call sites)
  - bes2600_chrdev_init / bes2600_chrdev_free as state-init / teardown
    for the remaining bes_cdev state (waitqueues, workqueues, flags)
  - DPD management (bes2600_chrdev_get_dpd_buffer / update / free)
  - wifi_force_close worker, system-close logic, bus-probe state
    machine

Tested-on: PineTab2 (BES2600WM + RK3566) running linux-pinetab2
6.19.10-danctnix1-1. Driver continues to associate and pass traffic;
no kernel messages related to the cdev absence. Users that previously
wrote to /dev/bes2600 should switch to the fw_type module parameter
or (future patch c4) nl80211 testmode commands.

Follow-ups:

  - c3.1: thread struct device * through bes2600_chrdev_is_signal_mode
    and friends so the global bes2600_cdev singleton can be dropped
    and the accessors scale to multi-device scenarios.
  - c4:   enable CONFIG_BES2600_TESTMODE and route nl80211 testmode
    commands to the firmware's patch_wifi_testMode entry.

Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
2026-05-19 09:06:07 +02:00
test0r e0d752aae9 sync bes2600/ to v7.0-danctnix1 baseline (rebasing reference) 2026-05-19 09:04:33 +02:00
Julian ba20341e70 Upload
Source: https://github.com/cringeops/bes2600
Source: https://github.com/cringeops/bes2600/pull/14
Source: https://github.com/cringeops/bes2600/pull/17
Source: https://github.com/cringeops/bes2600/pull/20
2025-09-17 16:35:45 +02:00