bes2600/pm-detect-firmware-unsupported
21 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
f12e870025 |
bes2600: self-detect when firmware does not honor PSM and skip the cycle
The c6 series fixed several host-side bookkeeping bugs around PSM
transitions, but didn't address the underlying contract: this chip's
firmware (BES2600 with the Bestechnic Dec 2023 build that ships on
PineTab2 and most danctnix images) silently drops every WSM_set_pm
request without emitting the corresponding PM_INDICATION. The driver's
own power_down_work delayed work calls bes2600_pwr_enter_lp_mode every
~10s; without firmware acknowledgment each call burns 5s on
wait_for_completion_timeout(pm_enter_cmpl, 5*HZ) and produces a
recurring three-line cascade in dmesg:
bes2600_pwr_enter_lp_mode, wait pm ind timeout
bes2600_sdio_active failed, subsys:0
bes2600_pwr_device_exit_lp_mode, active mcu fail
Confirmed by tripwire instrumentation on PineTab2 (linux-pinetab2
6.19.10-danctnix1, ohm) running the c5+c6 stack: zero
wsm_set_pm_indication() invocations across an entire boot, while
bes2600_pwr_enter_lp_mode timed out repeatedly, and
bes2600_sdio_active() consistently saw BES_SLAVE_STATUS_REG_ID return
0x2f (every "ready" bit set except MCU_WAKEUP_READY (bit 4) - the
firmware reports "I'm awake, there's nothing to wake from").
This patch makes the driver self-heal:
* struct bes2600_pwr_t gains pm_unsupported (bool) and
pm_consecutive_timeouts (unsigned int). Both initialised to
0/false.
* bes2600_pwr_enter_lp_mode early-returns -EOPNOTSUPP when
pm_unsupported is set. Skips the per-VIF set_pm round-trip and
the wait_for_completion entirely.
* On the cmpxchg-success branch of the timeout path, we increment
pm_consecutive_timeouts. When it crosses
BES2600_PM_UNSUPPORTED_THRESHOLD (3, ~15s of trying), we latch
pm_unsupported = true and force chip_pm_state = ACTIVE so that
bes2600_pwr_device_exit_lp_mode's c6.2 skip branch covers the
wake side (no gpio_wake / sbus_active / WSM_set_operational_mode
reissue past the first one).
* bes2600_pwr_notify_ps_changed resets pm_consecutive_timeouts to 0
on any incoming PM indication, and clears pm_unsupported if it
was previously latched. So a firmware update that fixes PM_IND
delivery automatically re-enables PSM transitions without a
driver rebuild.
mac80211's PSM requests via bes2600_set_pm() still flow to the
firmware unchanged; they just don't have host-side timeouts so they
remain silent regardless of firmware acknowledgment. Power
consumption goes up if the firmware actually CAN do PSM (we'd be
keeping the chip awake unnecessarily), but on a chip where the
counter trips this trade-off is forced anyway: the chip stayed awake
under the broken cascade as well, just with constant SDIO churn.
Net effect on dmesg: after ~15s of boot, the three-line cascade stops
firing entirely. The firmware-side wedge is observed once per boot
(captured by the pm_unsupported latch) instead of per-cycle.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
822a5f1bab |
bes2600: short-circuit wake handshake when chip is confirmed ACTIVE
The previous patch ("bes2600: gate PM indication completion on pending
request and track chip state") added enum bes2600_chip_pm_state and the
chip_pm_state field tracking what the host has *seen the firmware
confirm*. This patch makes the wake side use it.
Without this, every bes2600_pwr_device_exit_lp_mode() unconditionally
runs gpio_wake() + sbus_active() + wsm_set_operational_mode(active),
even when the chip is already in confirmed-ACTIVE state and the wake
sequence has nothing to do. The visible failure mode on PineTab2:
bes2600_pwr_enter_lp_mode, wait pm ind timeout
repeat set gpio_wake_flag, sub_sys:0
bes2600_sdio_active failed, subsys:0
bes2600_pwr_device_exit_lp_mode, active mcu fail
cycling every ~9 s, ~22 cycles in 10 minutes. Three pieces:
1. enter_lp_mode timed out (firmware indication lost). With c6.1,
chip_pm_state is now UNKNOWN.
2. lock_device fires exit_lp_mode.
3. gpio_wake hits "bit already set" because device_enter_lp_mode
was skipped when the indication timed out, so gpio_sleep was
never called - the bit reflects driver intent, not chip state.
gpio_wake silently no-ops (no GPIO edge), bit stays set.
4. sbus_active spends 200 x 2 ms looking for MCU_WAKEUP_READY that
never comes (firmware was never told to wake), then fails.
5. Driver continues to wsm_set_operational_mode against the wedged
bus, compounding the failure.
This patch's three moves:
* bes2600_pwr_device_exit_lp_mode() reads chip_pm_state at entry.
On BES2600_CHIP_PM_ACTIVE, log at devel level and return without
touching gpio_wake / sbus_active / WSM. The chip is in the state
we want; the handshake exists only to drive a transition.
* On BES2600_CHIP_PM_LP or BES2600_CHIP_PM_UNKNOWN, run the wake
handshake as before, but on sbus_active() failure: set
chip_pm_state = UNKNOWN, log once at err level, and bail out.
Do NOT call wsm_set_operational_mode over a wedged bus - it
would just emit a second error and leave the chip in an even
less defined state.
* bes2600_gpio_wakeup_mcu() / bes2600_gpio_allow_mcu_sleep():
demote "repeat set/clear gpio_wake_flag" from bes_err to
bes_devel. Multi-subsystem wake-hold (e.g. WIFI + BT both want
MCU awake) is the steady-state case, and the symmetric clear
while bit-already-clear is racy bookkeeping rather than a
hardware error. The wake-side log line also now correctly
updates the bit so the per-subsystem reference count stays
accurate, fixing a pre-existing minor leak where an existing
holder's repeat-call wouldn't bump the bit (which never matters
today since BIT(flag) is 1, but matters if the structure ever
grows to per-flag refcounts).
Net effect on the cycle:
* If chip is genuinely ACTIVE (chip_pm_state == ACTIVE), wake skips
cleanly. Storm goes silent.
* If chip is genuinely LP, behaviour is unchanged.
* If chip is UNKNOWN (post-timeout state), one wake attempt is
made; on failure, state stays UNKNOWN and we don't emit a
second cascade error per attempt. Repeated UNKNOWN with failed
wake will eventually be picked up by the LMAC active-monitor
and escalated to mmc_hw_reset (c5.2).
No new locks, no new state. Only consumption of the chip_pm_state
field added in the prerequisite patch.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
c57c77e446 |
bes2600: gate PM indication completion on pending request and track chip state
When mac80211 toggles PSM on the BES2600, the host sends WSM set_pm
and waits up to 5 s on bes_power.pm_enter_cmpl for a firmware-side
PM-changed indication confirming the transition. Three sequenced
flaws make the wait-and-confirm racy and leave host/chip bookkeeping
desynced when anything misfires:
1) bes2600_pwr_notify_ps_changed() unconditionally fires
complete(pm_enter_cmpl) for any non-active psmode. It does not
check whether a host-initiated set_pm is actually pending. A
spontaneous indication (firmware-internal coex move,
idle-driven aging) primes the completion, and the next host-
driven enter_lp_mode sees a false success on its first
wait_for_completion_timeout.
2) The wait/reinit ordering in bes2600_pwr_enter_lp_mode is
status = wait_for_completion_timeout(...);
atomic_set(pm_set_in_process, 0);
reinit_completion(...);
If an indication arrives between wait_for_completion_timeout
returning with status==1 and reinit_completion, the next
enter_lp_mode iteration's wait can also see false success. The
reinit must happen *before* we start the new request, not
after handling the previous one.
3) On wait_pm_ind timeout, the driver returns -ETIMEDOUT and walks
away. It does not record that the firmware's actual PM state
is no longer known to the host. Subsequent wake paths
(gpio_wake / sbus_active) assume the chip is still active and
hit deterministic SDIO failures when the firmware has
transitioned anyway.
This patch is the safe-prerequisite half of a wider fix:
* bes_pwr.h gains enum bes2600_chip_pm_state {ACTIVE, LP, UNKNOWN}
and bes_power.chip_pm_state. Its job is to track what the host
has *seen the firmware confirm*, not what the host has
requested. Initialised to ACTIVE in bes2600_pwr_init().
* bes2600_pwr_notify_ps_changed() unconditionally updates
chip_pm_state on every indication, but only fires
complete(pm_enter_cmpl) when atomic_cmpxchg(pm_set_in_process,
1, 0) succeeds. A spontaneous indication can no longer prime a
waiter that will only set up its request afterwards.
* bes2600_pwr_enter_lp_mode() now reinit_completion()s before
setting pm_set_in_process and sending wsm_set_pm. After a
timeout, it cmpxchgs pm_set_in_process back to 0 (so a late
indication cannot prime the next iteration) and on the win-
cmpxchg branch records chip_pm_state=UNKNOWN.
A follow-up patch consumes chip_pm_state on the wake side
(bes2600_pwr_device_exit_lp_mode + bes2600_gpio_wakeup_mcu) to fix
the deterministic "active mcu fail" cycle this state-record
enables a fix for. Splitting the work this way keeps the lock-free
race fix small and reviewable on its own.
No new locks, no behaviour change on the success path. Only the
recovery path (timeout + spontaneous indication) gains correctness.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
db4ea70fb5 |
bes2600: widen scan-defer backoff to 30s and decay count on quiet
The scan-defer logic added in the previous patch ("bes2600: defer
scan and soften WARN on firmware reject") used a 10-second backoff
window and never cleared reject_count outside of a successful scan.
Field testing on a PineTab2 (linux-pinetab2 6.19.10-danctnix1) shows
two distinct mac80211 scan-retry cadences in practice:
* Idle background scans every ~5 minutes when associated -- well
outside any plausible backoff, the defer guard correctly falls
through to a real WSM scan attempt.
* Roam-evaluation bursts triggered when mac80211 wants to find a
candidate AP for handover (signal degradation, beacon loss,
locally-generated DEAUTH_LEAVING reason=3). Cadence is ~12 s, and
one boot reproduced 14 such rejected scans in 3 minutes during a
single burst, none of which engaged the defer guard because every
retry landed just outside the 10 s window.
Two-line behaviour change to fix that:
1. BES2600_SCAN_BACKOFF_JIFFIES grows from 10*HZ to 30*HZ, so a
12 s-cadence burst stays inside the window across consecutive
rejects and the third reject in the burst trips the threshold
guard. The 5 min idle case is still naturally past the window
and is unaffected.
2. bes2600_scan_should_defer() resets reject_count to 0 when
time_after(jiffies, backoff_until). Without this, reject_count
accumulated indefinitely across the slow-cadence rejects, so an
isolated reject after long quiet would have tripped the
threshold the moment it arrived. After the change, count is
latched only inside an active burst and decays cleanly when the
burst ends.
Net effect on a roam burst:
* t=0 reject #1 (count 1, backoff_until = t0 + 30s)
* t=12 reject #2 (count 2, backoff_until = t1 + 30s)
* t=24 reject #3 (count 3, threshold met, next scan deferred)
* t=36 defer fires, no WSM round-trip, reject not sent
* ... defers continue until the firmware-policy state clears
* scan succeeds -> reject_count = 0, normal cadence resumes
WSM 0x0007 confirm rejections in a burst drop from ~14 to ~3 (just
the scans needed to reach the threshold). wpa_supplicant's reason=3
locally-generated disconnects driven by exhausted roam candidates
during the same burst window also drop.
No new state, no new symbols, no change to mac80211-facing semantics:
the deferred scan still completes via the existing fail: path with
status=-EBUSY, the same response a real firmware-busy would produce.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
aff632ea64 |
bes2600: defer scan and soften WARN on firmware reject
On a BES2600-based PineTab2, mac80211's background-scan cadence
(about every 30 s when associated) triggers a two-step WARN splat
pattern, visible in dmesg roughly 30 times per 10 min of regular
WiFi use:
wsm_generic_confirm ret 2
WARNING: at wsm_handle_rx+0x8a4/0xf30 [bes2600]
... full stack trace ...
ieee80211 phy0: wsm_generic_confirm failed for request 0x0007.
WARNING: at bes2600_scan_work+0x5d4/0x810 [bes2600]
... full stack trace ...
ieee80211 phy0: [SCAN] Scan failed (-22).
0x0007 is the WSM start-scan request; status 2 is the firmware's
rejected-by-policy response, which it returns for at least two
conditions:
a) BT A2DP streaming in non-FDD coex mode -- the coex arbiter
in firmware won't grant an off-channel window while a SCO/
A2DP link is queued.
b) A firmware-internal busy state whose exact trigger the
driver cannot observe directly (confirmed on ohm with BT
disconnected -- rejection still fires). Likely transient
firmware-PM transitions.
Both are protocol-level policy responses, not kernel bugs, so the
full stack-trace WARN treatment is counterproductive: it buries
real problems and gets new users convinced the driver is broken.
Three-part fix:
1. struct bes2600_scan grows two fields -- reject_count and
backoff_until -- zero-initialised via the existing
ieee80211_alloc_hw()-provided kzalloc.
2. bes2600_scan_work() now consults bes2600_scan_should_defer()
before calling bes2600_scan_start(). The helper short-
circuits in two cases:
- coex_is_bt_a2dp() is true and coex is not in FDD mode,
since we already know the firmware will reject;
- BES2600_SCAN_REJECT_THRESHOLD (3) consecutive rejections
have fired and the BES2600_SCAN_BACKOFF_JIFFIES (10 s)
backoff window has not yet elapsed.
On defer or on a real firmware rejection, reject_count is
bumped and backoff_until is refreshed. A successful scan
clears reject_count.
3. The WARN_ON(hw_priv->scan.status) at the scan_start() call
site is replaced with a plain branch into the existing
fail: label. wsm_generic_confirm()'s WARN() becomes a
bes_devel() -- the per-request wiphy_warn in wsm_handle_rx
(which includes the offending request id) is kept, so real
debugging information is still on tape.
Net behaviour:
- Expected rejections no longer produce stack traces. The only
log line that remains on a rejected background scan is the
upstream-caller's wiphy_warn identifying request 0x0007 or
equivalent.
- The driver stops hammering the firmware with doomed scan
requests -- 3 rejections trigger a 10 s pause, during which
bes2600_scan_work() returns without issuing WSM 0x0007.
- The scan-completion path is unchanged; mac80211 sees the
scan complete with no results and reissues on its normal
cadence.
- Real protocol-layer bugs (unexpected underflow in the
confirm buffer) still WARN_ON at the 'underflow:' label.
Verified on ohm (PineTab2, linux-pinetab2 6.19.10-danctnix1-1):
WARN splat count dropped from 32 to 0 per 10 min uptime. WiFi
stays associated. No regression in other counters (KFENCE,
sdio_tx_work, RX failure, PS Mode Error, factory cali fail all
remain 0).
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
f31c57adf7 |
debian/copyright: drop obsolete FSF street address
The 'You should have received a copy ... write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA' paragraph flags the lintian tag 'old-fsf-address-in-copyright-file'. Debian prefers either no address at all or an https://www.gnu.org/licenses/ reference; in this file /usr/share/common-licenses/LGPL-2.1 is already cited a few lines below, so the address is redundant. Replace with the gnu.org URL per current FSF boilerplate. Pre-existing text, no change to the licence terms. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com> |
||
|
|
8855718511 |
bes2600: demote 'wait pm ind timeout' from bes_err to bes_devel
bes2600_pwr_enter_lp_mode() logs 'wait pm ind timeout' at bes_err
level every time wait_for_completion_timeout() on the firmware's
PM-change indication returns 0. The preceding patch ('bes2600:
gate device LP-mode entry on successful per-VIF firmware
handshake') already handles this case correctly: the per-VIF
timeouts counter is incremented, the function returns
-ETIMEDOUT, and the device-side LP transition is skipped -- the
cascade into sdio_tx_work splats and [RX] Receive failure
messages is prevented.
The timeout itself is benign steady-state noise on the PineTab2
(BES2600WM). Firmware occasionally misses the 5 s PM-change
deadline when mac80211 flips power-save rapidly during
association or roaming; observed rate on a quiet, associated
ohm is roughly 3-10 events per 10 min of uptime, with no
user-visible effect. Keeping it at bes_err() level (== KERN_ERR,
priority 3) floods dmesg with what is already a handled
condition and makes real SDIO / PM errors harder to spot.
Demote to bes_devel() (== KERN_DEBUG gated on the driver's debug
flag). The gate in the caller is unchanged, so the downstream
suppression behaviour introduced by the earlier patch remains.
Real pathologies -- bes_err("set operation mode fail") on the
same path, and the timeouts != 0 / -ETIMEDOUT return consumed
by callers -- still surface at bes_err() / return-value level.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
ebb5c57988 |
bes2600: drop orphan DATA_DUMP_OBSERVE and access_file() file I/O
Two dead-in-default-build file-I/O sites remain in the driver
after the factory and chardev kernel_*() removals in the preceding
patches:
- bes_fw.c DATA_DUMP_OBSERVE: four #ifdef DATA_DUMP_OBSERVE
blocks built around the firmware-download path that open
/lib/firmware/bes2002_fw_write.bin via filp_open(O_CREAT |
O_RDWR), then log every transmitted firmware chunk via
vfs_write() inside a get_fs()/set_fs(KERNEL_DS) wrapper. The
controlling #define at bes_fw.c line 128 is commented out
('//#define DATA_DUMP_OBSERVE'), so none of this is ever
compiled in a stock build.
- main.c access_file(): a helper gated on
GET_MAC_ADDR_METHOD == 2 || == 3 (default 4) using the same
get_fs()/set_fs()/vfs_read()/vfs_write() pattern. No caller
in the tree references it -- it was orphaned when the methods
that consumed it were refactored out.
Both sites are unbuildable on modern kernels anyway: get_fs() /
set_fs() were removed from arm64 and the generic uaccess path in
the v5.10 era, and the legacy vfs_read() / vfs_write() variants
that took userspace-typed buffers went with them. The in-kernel
replacements would be kernel_read() / kernel_write(), which this
series is explicitly removing from the driver.
Remove both blocks, the commented-out '//#define DATA_DUMP_OBSERVE'
line, and the access_file() definition and its #if gate. No
behaviour change in any default or non-default build, because
nothing compiled or linked in the first place. After this patch
the driver contains zero filp_open / kernel_read / kernel_write /
vfs_read / vfs_write references -- a precondition for a
drivers/staging/bes2600/ linux-wireless RFC.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
ef24cdb891 |
bes2600: drop BES2600_WRITE_DPD_TO_FILE kernel_*() file paths
bes_chardev.c carried three functions gated behind the
BES2600_WRITE_DPD_TO_FILE Kconfig/make-flag (default off):
- bes2600_chrdev_write_dpd_data_to_file()
filp_open(O_CREAT | O_TRUNC | O_RDWR) + kernel_write()
writing a raw DPD calibration blob back to
BES2600_DPD_PATH (default /data/cfg/bes2600_dpd.bin, an
Android-AOSP path).
- bes2600_chrdev_read_and_check_dpd_data()
filp_open(O_RDONLY) + kernel_read() reading the DPD blob
from either BES2600_DPD_GOLDEN_PATH (/data/cfg/…) or
BES2600_DEFAULT_DPD_PATH (/lib/firmware/bes2600_dpd.bin),
followed by a CRC/version sanity check.
- bes2600_chrdev_dpd_is_vaild() (sic), the CRC/version helper
used only by the read path.
Plus the bes_cdev.no_dpd field, its module_param, and two
intrusion sites in bes2600_chrdev_get_dpd_data() and
bes2600_chrdev_update_dpd_data() that invoke the above.
The Makefile defaults BES2600_WRITE_DPD_TO_FILE=n, so in a stock
build all of this is dead code. It is still a standing upstream
blocker for exactly the same reasons as the factory-txt write
path removed in the preceding patch:
- filp_open() + kernel_read()/kernel_write() bypass the
firmware-class abstraction and LSM-governed access control
that apply to /lib/firmware/.
- The write target /data/cfg/ is an Android AOSP convention
that does not exist on a Linux distribution and cannot be
created by the kernel anyway.
- A runtime DPD re-calibration is intended to reduce TX EVM
after temperature or aging drift; persisting the result via
kernel_write() is fundamentally a userspace concern (debugfs
dump + userspace tool is the expected route).
Remove the entire #ifdef BES2600_WRITE_DPD_TO_FILE block from
bes_chardev.c (including the inner #ifdef inside
bes2600_chrdev_read_and_check_dpd_data() guarding a
DPD_BIN_FILE_SIZE size check that only applied to the read-back-
its-own-write case), the no_dpd field and module_param, and the
two invocation sites. Drop the Kconfig/make-flag and the three
associated PATH macros from the Makefile. Net: -155 lines, no
remaining filp_open/kernel_read/kernel_write anywhere in
bes_chardev.c.
The in-memory DPD state path is unchanged: bes2600_chrdev_get_dpd_
buffer() still allocates a kmalloc'd buffer used by the firmware-
download path, bes2600_chrdev_update_dpd_data() still validates
the buffer's CRC and transitions bes2600_cdev.wait_state on
success, and bes2600_chrdev_free_dpd_data() still releases the
buffer on unload. Only the file-I/O side-channel is removed.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
64eae76f4e |
bes2600: drop kernel_write() persistence from factory cali save
Following the conversion of the factory-calibration READ path to request_firmware() (earlier in this series), the factory-calibration WRITE path in factory_section_write_file() was still using filp_open(O_CREAT | O_TRUNC | O_RDWR) + kernel_write() to persist updated calibration data back to FACTORY_PATH (default /lib/firmware/bes2600/bes2600_factory.txt). Writing to files under /lib/firmware/ from kernel code is a standing upstream blocker for staging and for drivers/net/wireless/ submission generally: - filp_open()/kernel_write() bypass the firmware-class abstraction, the LSM framework, and user/group/mode enforcement that governs the firmware search paths. They have been repeatedly called out in staging-prep reviews. - The kernel runs with capabilities that userspace does not (CAP_ DAC_OVERRIDE effectively); quietly rewriting firmware blobs that userspace owns is a surprise contract. - A module unload / reboot immediately after the write races the writeback and can leave a truncated calibration file on disk. Remove factory_section_write_file() and its two call sites in bes2600_wifi_cali_table_save(). The in-memory factory_save_p remains authoritative for the duration of the session: the WSM command handlers that triggered this path (power-cali-table, freq-cali, efuse-flag, power-cali-flag) already update the live struct factory_t, and reads served from file_buffer pick up the rebuilt serialised form immediately. On the next probe the firmware-class file is re-read read-only via request_firmware(), as set up by the earlier patch. If cross-reboot persistence of runtime-updated calibration becomes a requirement, the expected route is a userspace-visible dump interface -- a read-only debugfs file exporting the serialised blob, or an nl80211 vendor command -- that lets userspace copy the values to a chosen location under its own privileges. Such a facility can land as a follow-up without touching the core driver write path again. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com> |
||
|
|
315986ea27 |
bes2600: bounce SDIO TX buffers to avoid DMA OOB read
The SDIO TX path rounds the DMA transfer length up to the host's current block size and hands that length to dma_map_sg() via sg_set_buf(&sg[scatters], tx_buffer->buf, align) in sdio_tx_work(). tx_buffer->buf typically aliases into an skb linear head whose allocated size matches tx_buffer->len, not the block-aligned align. The DMA engine (swiotlb / dw_mci IDMAC) therefore reads up to one block past the end of the skb. On a PineTab2 with KFENCE enabled this fires as: BUG: KFENCE: out-of-bounds read in __pi_memcpy_generic Out-of-bounds read at ... (704B right of kfence-#...): __pi_memcpy_generic swiotlb_tbl_map_single swiotlb_map dma_direct_map_sg __dma_map_sg_attrs dma_map_sg_attrs dw_mci_pre_dma_transfer __dw_mci_start_request ... bes_sdio_memcpy_to_io_helper+0x18c/0x288 [bes2600] sdio_tx_work+0x2b4/0x4a0 [bes2600] allocated by ... pskb_expand_head / validate_xmit_skb / tcp_* In addition to being undefined behavior, the padding bytes (which come from whatever memory follows the skb) are transmitted to the peer, leaking kernel memory on the air. Allocate a driver-owned DMA-page bounce buffer sized to MAX_SDIO_TRANSFER_LEN and use it as the scatter-gather backing for sdio_tx_work. Each TX buffer is copied into its bounce slot and the tail (align - tx_buffer->len bytes) is zeroed. This mirrors the existing bounce pattern already used by bes2600_sdio_memcpy_toio() via single_gathered_buffer; a separate allocation is used for the TX path because single_gathered_buffer is only serialised via sdio_claim_host and sdio_tx_work accumulates scatter entries before claiming the bus. Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com> |
||
|
|
9012b74eea |
bes2600: enable CONFIG_BES2600_TESTMODE by default + fix bit-rotted testmode plumbing
The driver implements a mac80211 testmode_cmd operation that dispatches
to a set of vendor commands (GET_TX_POWER_LEVEL, GET_TX_POWER_RANGE,
SET_SNAP_FRAME, TSM_STATS, GET_ROAM_DELAY, GET_STREAM, etc) plus the
BES2600 RF-test path (bes2600_vendor_rf_cmd → firmware
patch_wifi_testMode). The testmode handlers and the .testmode_cmd
binding in struct ieee80211_ops are conditionally compiled under
CONFIG_BES2600_TESTMODE, which previously defaulted to n.
Flip the Makefile default from n to y so wifi_testmode_cmd.o is
included in the build and the .testmode_cmd op is populated. On the
PineTab2 target kernel (linux-pinetab2 6.19.10-danctnix1, built with
CONFIG_NL80211_TESTMODE=y) this exposes the BES2600 RF-test surface
through the standard nl80211 testmode interface ('iw phy0 ...').
This also makes visible two classes of bit-rot that had accumulated
while nobody was building with CONFIG_BES2600_TESTMODE=y:
1. sta.c contains ~41 calls to bes2600_info() / bes2600_err() /
bes2600_warn() / bes2600_dbg() / bes2600_err_with_cond() - a
legacy log-macro family carrying a BES2600_DBG_* subsystem-id
first argument. Neither the macros nor any of the BES2600_DBG_*
constants are defined anywhere in the tree. The same call pattern
appears under #if defined(BES2600_DETECTION_LOGIC) in hwio.c and
under CONFIG_BES2600_ITP in itp.c, both normally disabled.
Add minimal shim macros to bes_log.h that rewire the calls onto
the existing bes_info() / bes_err() / bes_warn() / bes_devel()
family (ignoring the subsystem id). Define BES2600_DBG_SBUS,
BES2600_DBG_DOWNLOAD, BES2600_DBG_ITP and BES2600_DBG_TEST_MODE
as 0 constants for documentation / grep.
2. bes2600_start_stop_tsm(), bes2600_get_tsm_params(), and
bes2600_get_roam_delay() are declared in sta.c with external
linkage but have no prototype in any header. All callers live in
sta.c (inside bes2600_testmode_cmd). With CONFIG_BES2600_TESTMODE
off the compiler never sees them; with it on gcc
-Werror=missing-prototypes breaks the build.
Mark the three functions static. (Keeping them file-local also
matches their actual usage.)
Both changes are strictly scoped to make CONFIG_BES2600_TESTMODE=y
buildable; no behavioural change when the flag is off.
Tested-on: PineTab2 (BES2600WM + RK3566) running linux-pinetab2
6.19.10-danctnix1-1 with CONFIG_NL80211_TESTMODE=y. Module builds
cleanly, nl80211 testmode interface reachable via 'iw phy0 ...' from
userspace.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
8539460bf1 |
bes2600: remove userspace /dev/bes2600 character device interface
bes_chardev.c implemented a custom character device at /dev/bes2600 with
its own parser and command-dispatch table, exposing operations such as
'wifi on|off', 'bt on|off', 'change_fw_type <n>', 'bt_wakeup',
'bt_sleep', and 'wakeup_read_flag'. None of these surfaces are used by
the in-tree driver - every kernel call site consumes the internal state
accessors (bes2600_chrdev_is_signal_mode, bes2600_chrdev_get_fw_type,
etc) directly, not through the cdev.
The cdev interface is a standing upstream blocker for two reasons:
1. Drivers under drivers/staging/ and drivers/net/wireless/ are
expected to expose tuning via the firmware/nl80211/debugfs
infrastructure rather than a private /dev node with an ad-hoc
parser.
2. The cdev handlers keep a global bes_cdev singleton alive whose
->cdev, ->dev_id, ->class and ->device pointers exist only to be
torn down; they add no functionality that nl80211 or rfkill do
not already provide (wifi/bt on-off, module_param for fw_type).
Remove the userspace interface:
- open / read / write / release file_operations handlers and the
bes2600_chardev_fops instance
- bes2600_op_* command handlers and bes2600_op_map_tab dispatcher
- bes2600_get_cmd_and_ifname / bes2600_recyle_cmd_and_ifname_mem
string helpers
- bes2600_load_uevent (its only caller was
bes2600_chrdev_wifi_force_close_work informing userspace of a
state it already gates via rfkill; that snprintf +
kobject_uevent_env block is gone too, the kernel-side
halt_device + switch_wifi(0) + chrdev_check_system_close
sequence remains)
- alloc_chrdev_region / cdev_init / cdev_add / class_create /
device_create in bes2600_chrdev_init plus the fail1/fail2/fail3
unwind labels
- cdev_del / unregister_chrdev_region / device_destroy /
class_destroy in bes2600_chrdev_free
- cdev/dev_id/major/minor/class/device fields in struct bes_cdev
What remains (unchanged behaviour):
- fw_type module parameter - the primary user-facing knob for
signal/no-signal/BT mode switch
- All in-kernel bes2600_chrdev_* accessor functions called from
bes2600_sdio.c, bes_pwr.c, sta.c, bh.c, main.c, wsm.c, and
wifi_testmode_cmd.c (13 call sites)
- bes2600_chrdev_init / bes2600_chrdev_free as state-init / teardown
for the remaining bes_cdev state (waitqueues, workqueues, flags)
- DPD management (bes2600_chrdev_get_dpd_buffer / update / free)
- wifi_force_close worker, system-close logic, bus-probe state
machine
Tested-on: PineTab2 (BES2600WM + RK3566) running linux-pinetab2
6.19.10-danctnix1-1. Driver continues to associate and pass traffic;
no kernel messages related to the cdev absence. Users that previously
wrote to /dev/bes2600 should switch to the fw_type module parameter
or (future patch c4) nl80211 testmode commands.
Follow-ups:
- c3.1: thread struct device * through bes2600_chrdev_is_signal_mode
and friends so the global bes2600_cdev singleton can be dropped
and the accessors scale to multi-device scenarios.
- c4: enable CONFIG_BES2600_TESTMODE and route nl80211 testmode
commands to the firmware's patch_wifi_testMode entry.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
19feb8181a |
bes2600: gate device LP-mode entry on successful per-VIF firmware handshake
bes2600_pwr_enter_lp_mode() drives the transition to low-power for each
associated STA VIF: it pushes wsm_set_pm(), waits up to 5 seconds on
pm_enter_cmpl for the firmware to acknowledge, then unconditionally
calls bes2600_pwr_device_enter_lp_mode() to drop the device end of the
bus.
Two bugs:
1. A failed wsm_set_pm() only logs an error, then still falls into
wait_for_completion_timeout() on a completion the firmware will
never post (the set-mode command never reached it). The loop
therefore always blocks the full 5 s, logs a second error, and
proceeds.
2. A genuine wait-timeout (firmware received the set-mode command but
never posted the indication) also only logs a warning. The code
then drops to bes2600_pwr_device_enter_lp_mode(), handing the
device subsystem an inconsistent view of mac-layer state.
On PineTab2 (BES2600WM + RK3566) the second bug is the recurring
root-cause of the 'bes2600_pwr_enter_lp_mode, wait pm ind timeout'
message flooding dmesg every 5-10 s when the interface is associated
and idle. Sending the device to LP in that state cascades into the
SDIO TX path as the 'bes_sdio_memcpy_to_io_helper / sdio_tx_work'
WARN splat.
Fix:
- Add a 'timeouts' counter; bump it on both failure paths.
- Skip the wait_for_completion entirely when wsm_set_pm() failed
(there is no completion to wait for).
- Only call bes2600_pwr_device_enter_lp_mode() when every per-VIF
handshake reached firmware-ACKed completion; otherwise return
-ETIMEDOUT and leave the device in its current power state.
Tested-on: PineTab2 running linux-pinetab2 6.19.10-danctnix1-1.
Post-patch the handshake still fails on this particular firmware
revision (separate root-cause investigation outside this patch), but
the driver now returns -ETIMEDOUT cleanly instead of flooding dmesg
and destabilising the SDIO path.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
20d349e2b5 |
bes2600: thread struct device * through factory request_firmware() call
Follow-up to \"bes2600: use request_firmware() for factory.txt read\".
That patch switched the factory calibration read path from filp_open()
+ kernel_read() to request_firmware(), but passed dev=NULL to
request_firmware() because factory_section_read_file() did not have a
struct device * in scope. The resulting logs carry the
'(NULL device *):' prefix and do not propagate a udev association.
Add a module-local static struct device * used as the firmware-class
load context, plus a small exported setter:
static struct device *bes2600_factory_dev;
void bes2600_factory_set_dev(struct device *dev);
Wire bes2600_factory_set_dev(&func->dev) from bes2600_sdio_probe(),
right after bes2600_platform_data_init() so the platform layer has
already had a chance to use the same struct device for its own
initialization.
factory_section_read_file() now passes bes2600_factory_dev (instead
of NULL) to request_firmware(). When the factory read happens before
probe (not currently the case on PineTab2) the pointer is still NULL
and request_firmware() accepts that; no regression.
No API changes to bes2600_get_factory_cali_data() callers. The
char *path parameter remains (it is the firmware-class name fed
straight to request_firmware()).
Tested-on: PineTab2 (BES2600WM + RK3566) running linux-pinetab2
6.19.10-danctnix1-1. Driver probes, factory data is read, and any
post-c5 factory diagnostics now carry the SDIO device identity
instead of '(NULL device *)'.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
98c6e363f0 |
bes2600: default STANDARD_FACTORY_EFUSE_FLAG off for PineTab2 factory.txt format
The shipped factory calibration file bes2600_factory.txt on PineTab2
(danctnix linux-firmware 0.3.5_2023.0209) contains 30 calibration
fields: head (3), iq/xtal (3), 2.4G power 11n (5), 5G power 11n (15),
bt (4). The file terminates with '%%\n' directly after edr_power.
When STANDARD_FACTORY_EFUSE_FLAG is defined at compile time the driver
assembles STANDARD_FACTORY with an extra select_efuse_flag section
appended and expects 31 sscanf matches (FACTORY_MEMBER_NUM=31):
__STANDARD_FACTORY + \"##select_efuse_flag\\nselect_efuse:%hx\\n\"
+ \"%%%%\\n\"
The PineTab2 factory.txt has no select_efuse_flag section, so sscanf
stops after field 30 and factory_parse() returns -1 with:
bes2600_factory.txt parse fail
read and check bes2600/bes2600_factory.txt error
factory cali data get failed.
This was latent until the preceding patch (use request_firmware() for
factory.txt read) fixed the path bug that masked the parse failure.
Default STANDARD_FACTORY_EFUSE_FLAG to n. The flag remains overridable
at build time (make STANDARD_FACTORY_EFUSE_FLAG=y ...) for chips /
firmware packages that do ship the select_efuse_flag section.
Also: the wsm_save_factory_txt_to_mcu() prototype in wsm.h was
inconsistently wrapped in a conditional that keyed on
STANDARD_FACTORY_EFUSE_FLAG, but the function definition in wsm.c and
the call site in sta.c are ungated. With the flag now defaulting to
n, the gcc -Werror=missing-prototypes flag breaks the build. Drop the
conditional wrapper around the prototype — the function exists and is
used regardless of the factory-parse flag.
Tested-on: PineTab2 (BES2600WM + RK3566) running linux-pinetab2
6.19.10-danctnix1-1. With the flag defaulted off, factory_parse()
succeeds on the shipped factory.txt, factory_cali_data is populated,
and dmesg no longer shows the parse-fail / read-and-check-error /
factory-cali-data-get-failed sequence.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
|
||
|
|
b76c9904f8 |
bes2600: use request_firmware() for factory.txt read
The BES2600 factory calibration file (bes2600_factory.txt) was being read via filp_open() + kernel_read() from a hard-coded absolute path baked in at compile time via the FACTORY_PATH Makefile macro (default: /lib/firmware/bes2600_factory.txt). This had several problems: 1. Path mismatch - linux-firmware-style packaging (and danctnix 0.2-5 device-pine64-pinetab2) ships the file at /lib/firmware/bes2600/bes2600_factory.txt, not /lib/firmware/. The driver logged '(NULL device *): read and check /lib/firmware/bes2600_factory.txt error' on every boot on PineTab2 running linux-pinetab2 6.19.10-danctnix1-1. 2. Direct filesystem access via filp_open() / kernel_read() from a driver is an anti-pattern that upstream rejects: drivers should use request_firmware() to get binary data from userspace-managed firmware directories. request_firmware() natively searches the firmware_class path list (typically /lib/firmware + derivatives), associates the load with a uevent, and respects the firmware-loading infrastructure. 3. The (NULL device *) prefix in error messages indicated the absence of proper device-context logging. While this patch does not yet thread struct device through, the upstream path uses request_firmware() which works with dev=NULL and is the building block for a follow-up patch that adds per-chip device context. Repoint the FACTORY_PATH default to the firmware-class name (bes2600/bes2600_factory.txt) - request_firmware() prepends /lib/firmware/ from the configured search paths. The macro remains overridable at build time for non-standard deployments. Rewrite factory_section_read_file() to: * Call request_firmware(&fw, path, NULL). * Size-check fw->size against FACTORY_MAX_SIZE. * memcpy the data into the caller's buffer. * Always call release_firmware() on exit. The file write path (factory_section_write_file + kernel_write) is left unchanged in this patch; it is the subject of a follow-up patch that removes kernel_write and moves any remaining userspace-visible factory configuration to a standard kernel-userspace boundary (debugfs or nl80211 testmode). No caller signature changes. No Makefile flag drops. Bisectable. Tested-on: PineTab2 (BES2600WM + RK3566) running linux-pinetab2 6.19.10-danctnix1-1, deployed via /lib/modules/<ver>/extra/. Verified post-reboot: original 'read and check /lib/firmware/bes2600_factory.txt error' is gone; request_firmware reads the file successfully (a separate factory_parse() bug, previously masked by the read failure, is now exposed and tracked separately). Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com> |
||
|
|
fe73571183 |
d/control: Fix packagename of fw dependency
Signed-off-by: Manuel Traut <manut@mecka.net> |
||
|
|
624fa34bf8 | Depend on firmware | ||
|
|
70f1551c94 | WIP: Fix autopkgtest | ||
|
|
ba20341e70 |
Upload
Source: https://github.com/cringeops/bes2600 Source: https://github.com/cringeops/bes2600/pull/14 Source: https://github.com/cringeops/bes2600/pull/17 Source: https://github.com/cringeops/bes2600/pull/20 |