5 GHz unusable on bes2600: firmware rejects start-scan, APs invisible #19

Closed
opened 2026-05-16 22:17:11 +00:00 by marfrit · 1 comment
Owner

Summary

bes2600 on PineTab2 (ohm) cannot attach to or scan 5 GHz APs because the firmware rejects WSM start-scan requests for the 5 GHz band, even though the chip/driver/regdomain all support it. The 056a71a "defer scan and soften WARN on firmware reject" patch covers the log noise but doesn't address the underlying reject — APs stay invisible, so we never get the chance to roam to 5 GHz regardless of signal strength.

This is the structural reason real-world TCP-over-WiFi on ohm tops out at ~5 MB/s (40 Mbit/s) at MCS 7 HT20 single-stream 2.4 GHz, with a ~16% TX retry rate. The chip can do much better on 5 GHz HT40 if it could scan there.

What does work

iw phy enumerates Band 2 with 37 channels spanning 5170–5875 MHz, HT20/HT40 caps (0x87e), MCS 0–7 single-stream. No VHT (so no 802.11ac headroom even if we got there). Regdomain on ohm is DE / DFS-ETSI, which permits all UNII-1/2/2e/3 bands at 20–26 dBm.

So in cfg80211/mac80211 terms, 5 GHz is a fully-registered band.
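
The band registration can be re-checked on any unit by counting the 5 GHz frequency lines that iw reports. This is a minimal sketch, not a tool from the BESser tree: the helper name count_5ghz_channels is made up, and the pattern assumes frequency lines of the form "* 5180 MHz [36] (20.0 dBm)" (newer iw releases print "5180.0 MHz"). Channels that iw marks as disabled are skipped.

```shell
# Count usable 5 GHz channels in `iw phy` output read from stdin.
# count_5ghz_channels is a hypothetical helper name. The regex assumes
# lines like "* 5180 MHz [36] (20.0 dBm)" or "* 5180.0 MHz [36] ...".
count_5ghz_channels() {
    awk '/\* 5[0-9][0-9][0-9](\.[0-9]+)? MHz/ && !/disabled/ { n++ }
         END { print n + 0 }'
}
```

Usage: iw phy | count_5ghz_channels. A non-zero count only shows that cfg80211 has the band registered; it says nothing about whether the firmware will accept a scan there.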

What doesn't work

iw dev wlan0 scan (with or without freq filter) returns zero BSSIDs, not even the currently-associated newton on 2.4 GHz. dmesg pattern during the scan:

ieee80211 phy0: bes2600_scan_complete_cb status: 0
ieee80211 phy0: [SCAN] Scan completed.
ieee80211 phy0: wsm_generic_confirm failed for request 0x0007.
ieee80211 phy0: [SCAN] Scan failed (-22).

WSM request 0x0007 = start-scan. Firmware returns status 2 ("rejected by policy"); driver maps to -22 (EINVAL). After rejection, the per-scan results are dropped — so even results from the 2.4 GHz portion of a multi-band scan don't make it to userspace.
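
To gauge how often the firmware rejects start-scan on a given boot, the reject line can be counted straight from the kernel log. A minimal sketch, assuming the log text matches the dmesg excerpt above; count_scan_rejects is a made-up helper name.

```shell
# Count WSM start-scan (0x0007) rejects in kernel log text read from stdin.
# count_scan_rejects is a hypothetical helper name, not part of any tool.
count_scan_rejects() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so mask the status for callers running under `set -e`.
    grep -c 'wsm_generic_confirm failed for request 0x0007' || true
}
```

Usage: dmesg | count_scan_rejects (reading dmesg may need root, depending on kernel.dmesg_restrict).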

Why we believe this is firmware-side

Patch 056a71a ("bes2600: defer scan and soften WARN on firmware reject", merged into danctnix-7.0.6) was written around this exact pattern. Its commit message identifies two known reject triggers:

  1. BT A2DP active in non-FDD coex mode — coex arbiter in firmware refuses to grant an off-channel window while a SCO/A2DP link is queued.
  2. Firmware-internal busy state during PM transitions — observed even with BT disconnected. Trigger not observable from the driver side.

The patch softens the WARN logging and adds a 10s backoff to stop console-flooding. It does not persuade the firmware to actually perform the scan.

How this surfaces operationally

  • ohm sees only whatever 2.4 GHz BSSID it last associated with.
  • nmcli connection down/up cycles re-associate to the same 2.4 GHz BSSID because that's all the scan ever returns.
  • Moving ohm next to the router doesn't help — same BSSID, same MCS 7, same scan-reject behavior.
  • Real-world LAN-direct TCP test (verified 2026-05-17, ohm next to router, signal -44 dBm): 1 GB blob in 3:35 = 4.99 MB/s = 40 Mbit/s, 16.25% TX retry rate. Pre-reconnect at -57 dBm was 3.12 MB/s. The reconnect helps; 5 GHz attachment would help a lot more.
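
The throughput figures in the last bullet can be reproduced with a short calculation. This is a sketch under one assumption that the issue does not state: that the "1 GB blob" was 2^30 bytes. The helper name throughput is made up.

```shell
# Convert a transfer of BYTES in SECONDS to MB/s and Mbit/s (decimal units).
# throughput is a hypothetical helper name.
throughput() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        printf "%.2f MB/s, %.1f Mbit/s\n", b / s / 1000000, b * 8 / s / 1000000
    }'
}

# Assumption: the blob was 2^30 bytes; 3:35 is 215 seconds.
throughput 1073741824 215   # prints "4.99 MB/s, 40.0 Mbit/s"
```

With 10^9 bytes instead, the same call gives 4.65 MB/s, so the quoted 4.99 MB/s is consistent with a 1 GiB file.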

Comparison to peer-driver reports

danctnix maintainer Danct12 has shown iperf3 screenshots of ~80 Mbit/s sustained on the same chip. The most likely explanation is that they are on a 5 GHz AP with HT40, which bes2600 hardware supports but which this scan-reject behavior blocks for ohm. The patch gap between the two driver trees (the BESser stack carries more patches than Danct's 7.0.6 branch) doesn't explain the difference; the radio attachment does.

Workarounds (if 5 GHz is actually needed)

  • Direct-attach to a known 5 GHz BSSID via nmcli, bypassing the broken scan:
    nmcli connection add con-name newton-5g ifname wlan0 type wifi \
          ssid newton 802-11-wireless.band a 802-11-wireless.bssid <5g-mac>
    
    Requires the router to expose the 5 GHz radio as a separate SSID (or at least a knowable BSSID).
  • Combined 2.4/5 GHz SSID setups won't help — driver can't pick the band without scanning it first.
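
After bringing up the direct-attach profile, it is worth confirming which band the link actually landed on. A minimal sketch, assuming that iw dev wlan0 link prints a "freq: <MHz>" line while associated; link_band is a made-up helper name.

```shell
# Classify the band of the current association from `iw dev <if> link`
# output on stdin. link_band is a hypothetical helper name. It assumes a
# "freq: <MHz>" line is present while associated.
link_band() {
    awk '$1 == "freq:" {
        f = $2 + 0
        if (f >= 2400 && f < 2500)      print "2.4GHz"
        else if (f >= 5000 && f < 6000) print "5GHz"
        else                            print "other"
        found = 1
        exit
    }
    END { if (!found) print "not-associated" }'
}
```

Usage: nmcli connection up newton-5g && iw dev wlan0 link | link_band. The profile name newton-5g is the one created by the workaround command above.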

Proposed real fixes (in increasing order of depth)

  1. Smarter scan-deferral: instead of just rate-limiting, gate scan attempts on observable firmware-state signals (PM state, BT activity). Today's 056a71a uses a fixed 10s backoff window and a coex-A2DP check, but the "firmware-internal busy" case isn't detected. Needs more instrumentation.
  2. BT-coex tuning: investigate whether the FDD coex mode can be entered earlier or held longer, opening the off-channel window. The firmware has bt_drv_config_coex_mode and BTC patch hooks (per reference_bes2600_firmware_first_pass) — driver may not be configuring them optimally.
  3. Separate band-scan trigger: split scans into per-band chunks driver-side and tolerate per-band rejects so a 2.4-rejected scan doesn't poison the 5 GHz results. Needs to match how mac80211 issues multi-band scans.

All three are firmware-state-aware behaviour changes. None are trivial — they need a longer instrumentation pass on the bes2600 firmware/coex interaction before patches are scoped.
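
As a first instrumentation step for fix 3, per-band behaviour can be probed from userspace before touching the driver. This is only a diagnostic sketch, not the proposed driver change: it issues one frequency-limited scan per band and reports which ones fail. SCAN_CMD and scan_band are made-up names, and the frequency lists in the usage note are examples.

```shell
# Userspace approximation of per-band scanning. SCAN_CMD is a hypothetical
# override hook so the logic can be exercised without a radio.
SCAN_CMD="${SCAN_CMD:-iw dev wlan0 scan}"

# scan_band LABEL FREQ... : scan only the listed centre frequencies (MHz)
# and report whether the request succeeded, without aborting on failure.
scan_band() {
    label=$1
    shift
    # $SCAN_CMD is deliberately unquoted so "iw dev wlan0 scan" splits into words.
    if $SCAN_CMD freq "$@" >/dev/null 2>&1; then
        echo "$label: ok"
    else
        echo "$label: failed"
    fi
}
```

Usage (as root): scan_band 2.4GHz 2412 2437 2462; scan_band 5GHz 5180 5200 5220 5240. A "failed" line covers any non-zero exit from iw, so it should be cross-checked against dmesg for the 0x0007 reject before being attributed to the firmware.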

References

  • Memory: reference_bes2600_5ghz_scan_reject (verbose copy of these findings)
  • Memory: reference_bes2600_firmware_no_psm (sibling firmware-policy gotcha)
  • Memory: reference_bes2600_firmware_first_pass (coex/BTC symbol leaks for the depth-2 BT-coex fix)
  • Patch 056a71a ("bes2600: defer scan and soften WARN on firmware reject") in danctnix-7.0.6
  • Verified 2026-05-17 on ohm running BESser-c5x stack (kernel 7.0.0-pinetab2-danctnix-besser, srcversion 1B3B3ED0…)

Out of scope / parking lot

Not blocking the 5 GHz workstream on this: ohm-as-deployed is fine on 2.4 GHz for the typical workload. This issue exists to (a) record the structural cause so we stop attributing the 2.4-only behaviour to chip limitations, and (b) document what depth of firmware/coex work would actually unlock 5 GHz attachment when someone wants it.

Author
Owner

Closing as duplicate of #1, which I missed when filing this. The 2026-05-03 issue already documents the same WSM 0x0007 scan-reject pattern (plus secondary JOIN / RX / PREV_AUTH_NOT_VALID patterns this issue doesn't cover).

Folded the new findings from today (5 GHz capability proof via iw phy, perf measurement, depth-of-fix taxonomy, workarounds) into a comment on #1 instead of duplicating them here.

Also, a correction: this issue's framing ("APs invisible, can't roam to 5 GHz") was too strong. Per #1's OP, ohm was actually associated on 5 GHz ch.40 on 2026-05-03. The real pattern is intermittent scan-reject causing roam blindness and eventual drift toward whichever band happens to be reliable at the moment (today: 2.4 GHz). Chip, driver, and regdomain all support 5 GHz; firmware behavior is what limits us, but not absolutely.

See #1 for the canonical thread.

Reference: marfrit/besser#19