Compare commits
18 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| f8986a4a18 | |||
| 122582e270 | |||
| ae175f9745 | |||
| 693e9b42aa | |||
| 0f783a1e69 | |||
| 843d40231f | |||
| 6ab61b9a06 | |||
| 216c7c59b1 | |||
| 02d3f4b222 | |||
| 3d63ec0a35 | |||
| 722434414a | |||
| fc88ff41c3 | |||
| fde41fcdd4 | |||
| 6bae531917 | |||
| 3a38286e6f | |||
| 1e408c9d33 | |||
| d01400140b | |||
| 993117a108 |
@@ -0,0 +1,172 @@
|
|||||||
|
# linux-pinetab2-danctnix-besser
|
||||||
|
|
||||||
|
Soft-upstream fork of `linux-pinetab2` (DanctNIX kernel for PineTab2) carrying the **BESser** bes2600 staging-driver patchset.
|
||||||
|
|
||||||
|
Drop-in replacement for `linux-pinetab2`. Same kernel version, same config (one toggle aside — see SCS caveat below), same modules — only the `drivers/staging/bes2600/` driver differs.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## TL;DR
|
||||||
|
|
||||||
|
| | |
|
||||||
|
|---|---|
|
||||||
|
| **Current package** | `linux-pinetab2-danctnix-besser-7.0.danctnix1-3-aarch64.pkg.tar.zst` |
|
||||||
|
| **Module srcversion** | `BEB625FA7443171EA8D55F7` (`bes2600.ko`) |
|
||||||
|
| **Kernel base** | DanctNIX [`linux-pinetab2`](https://codeberg.org/DanctNIX/linux-pinetab2) tag `v7.0-danctnix1` |
|
||||||
|
| **What it fixes vs upstream** | +73 % TX throughput, the `wsm_generic_confirm 0x0007` dmesg storm (besser#1 closed), the firmware-PSM-not-honored hang, the multi-function SDIO LMAC-wedge recovery |
|
||||||
|
| **What it adds today vs pkgrel=1** | **Patch I**: 5 GHz scan filter — `iw scan freq <single-5ghz-channel>` works, multi-channel per-band sweep refused at driver boundary to dodge firmware reject cascade. NM `band=a` profiles associate to 5 GHz cleanly. **Sustained 11.32 MB/s** download (2.54 GB factory image) on `newton` 5 GHz ch.48 — **3.6× the 2.4 GHz baseline of 3.12 MB/s** on the same source. |
|
||||||
|
| **Source-of-truth** | `git.reauktion.de/marfrit/bes2600-dkms` — branch `cleanups` for c-stack+A+B, branch `bes2600/scan-filter-5ghz` for Patch I |
|
||||||
|
| **This PKGBUILD** | `git.reauktion.de/marfrit/besser` `claude-noether-14` `danctnix-besser-pkgbuild/kernel/` |
|
||||||
|
| **Kernel-agent mirror** | `git.reauktion.de/marfrit/kernel-agent` `fleet/ohm.yaml` (manifest) + `patches/driver/bes2600/scan-filter-5ghz-danctnix/` |
|
||||||
|
| **Caveat** | `CONFIG_SHADOW_CALL_STACK=n` (security-hardening regression, workaround for a GCC 15.2.1 + arm_neon.h pragma issue — tracked in [besser#20](https://git.reauktion.de/marfrit/besser/issues/20), restore to `=y` when GCC is fixed) |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What's in the patchset
|
||||||
|
|
||||||
|
A 17-commit cumulative diff over `v7.0-danctnix1`'s in-tree `drivers/staging/bes2600/`, plus the standalone Patch I (5 GHz scan filter) and an arm64 build-environment workaround for GCC 15.
|
||||||
|
|
||||||
|
Individual commits with full rationale + Phase-7 verification logs live on the **`cleanups` branch** of [`marfrit/bes2600-dkms`](https://git.reauktion.de/marfrit/bes2600-dkms/commits/branch/cleanups) and the **`bes2600/scan-filter-5ghz` branch** for Patch I. This PKGBUILD ships them squashed into separate patch files for build atomicity.
|
||||||
|
|
||||||
|
| group | what it does |
|
||||||
|
|---|---|
|
||||||
|
| **c-stack (c5.1–c5.2.1, c6.1, c6.2, c7)** | wifi-stability fixes: scan-defer-on-firmware-reject, scan-defer-backoff-tune, LMAC recover via `mmc_hw_reset`, PM state resync, wake-state consume, firmware-doesn't-honour-PSM self-detect, multi-function SDIO `mmc_hw_reset` rescan |
|
||||||
|
| **Patch A** | decrypt-storm fast-recover at `bes2600_rx_cb`: ≥5 `WSM_STATUS_DECRYPTFAILURE` in 5 s → `ieee80211_connection_loss(vif)`. Phase-7 confirmed N=2 (2026-05-07), storms recover ~1 s vs 109 s baseline. |
|
||||||
|
| **Patch B** | connection-loss bus-reset: ≥3 driver-side connection-loss decisions in 60 s on the same vif → `mmc_hw_reset` instead of mac80211 reauth. Installed dormant; never tripped in production yet. |
|
||||||
|
| **Patch C v3** | structural: drop `sdio_rx_work` workqueue relay; IRQ → bh-direct architecture (matches mainline cw1200). +73 % sustained RX. |
|
||||||
|
| **Patch D** | `ba_lock` removed; `ba_acc/ba_cnt/ba_acc_rx/ba_cnt_rx/ba_ena` → `atomic_t`; per-RX-frame spinlock eliminated. |
|
||||||
|
| **Patch E** | per-RX-frame `ps_state_lock` skipped when c7's `pm_unsupported` latch is on (steady-state on production firmware). |
|
||||||
|
| **Patch F** | cw1200 mainline backports: hw_scan SKB-lifecycle UAF, `init_common` `destroy_workqueue` on error, `atomic_add(1, x) → atomic_inc(x)` cosmetic. |
|
||||||
|
| **Patch G** | GPL-2.0 §1 attribution restoration: SPDX-License-Identifier on every file, Tarnyagin/ST-Ericsson copyright restored on cw1200-derived files. |
|
||||||
|
| **Patch C2** | `ieee80211_rx_irqsafe → ieee80211_rx_ni` at all 6 sites (kernel.org-clean process-context API; tasklet hop removed). |
|
||||||
|
| **Patch H** | `bh.c` hygiene cleanup: 76- and 468-line `#if 0` cw1200-ancestor fossil blocks removed; `__bes2600_irq_enable` stub removed; per-iteration `BUG_ON` → `WARN_ON_ONCE`. |
|
||||||
|
| **Patch I** ([besser#1](https://git.reauktion.de/marfrit/besser/issues/1)) | **5 GHz scan filter.** Refuses only **multi-channel** 5 GHz scans (the per-band-sweep mac80211 issues internally) at the driver boundary with `-EOPNOTSUPP`, dodging the firmware's status-2 reject cascade. Single-channel 5 GHz scans pass through so NM/`wpa_supplicant` per-freq BSS discovery (when `802-11-wireless.band=a`) still finds and associates to 5 GHz APs. Net effect: dmesg storm gone, 5 GHz attachment works, 3.6× sustained throughput on 5 GHz HT40 vs 2.4 GHz HT20. |
|
||||||
|
| **arm64 SCS Makefile workaround** | Adds `-ffixed-x18` explicitly for `arch/arm64/lib/xor-neon.o` when `CONFIG_SHADOW_CALL_STACK=y`. Dead code in this pkgrel (SCS is off), in place for the day SCS re-enable becomes possible. See [besser#20](https://git.reauktion.de/marfrit/besser/issues/20). |
|
||||||
|
|
||||||
|
## Measured outcome
|
||||||
|
|
||||||
|
- **Phase 7 (Patch I, 2026-05-18):** Pattern A `wsm_generic_confirm failed for request 0x0007` storm: 14.3/h → **0/h** over 30-min observation. 5 GHz `newton` BSSID `c0:25:06:e6:5b:33` @ 5240 MHz (ch.48), TX bitrate 150 Mbit/s MCS 7 HT40 short-GI. Internet download throughput **11.32 MB/s** (sustained 90.5 Mbit/s, ~60 % of PHY) vs 3.12 MB/s on 2.4 GHz HT20 same source.
|
||||||
|
- **Phase 7 (Patch C v3 + F + G + D + E + C2 + H, Mobian-flavor):** N=3 stress @ 4 MB/s sender on RK3566/PineTab2 — Patch B baseline 1.36 MB/s → +73 % sustained 2.28 MB/s. Race-fix verified under stress (no `wsm_release_tx_buffer` WARN storm under load).
|
||||||
|
- Module loads + associates cleanly; `pm_unsupported` latch fires on boot as expected.
|
||||||
|
|
||||||
|
## Building
|
||||||
|
|
||||||
|
```sh
|
||||||
|
makepkg -s
|
||||||
|
```
|
||||||
|
|
||||||
|
Identical workflow to upstream `linux-pinetab2`. Produces `linux-pinetab2-danctnix-besser-<ver>-aarch64.pkg.tar.zst` plus a matching `-headers` package. Build host can be aarch64 native (recommended — no cross-toolchain setup) or x86 with an aarch64 cross-compiler.
|
||||||
|
|
||||||
|
Build time: ~45–55 min on an 8-core aarch64 host (boltzmann/RPi5-class), most of it the kernel modules phase.
|
||||||
|
|
||||||
|
**GCC 15.2.1 note:** This pkgrel ships with `CONFIG_SHADOW_CALL_STACK=n` because GCC 15.2.1's strict pragma validator chokes on `arm_neon.h`'s push/`target("+nothing+aes")`/pop sequences when SCS is on. The `0003-arm64-xor-neon-ffixed-x18-build-fix.patch` is a defensive Makefile-side workaround that's a no-op while SCS is off; it'll silently unblock SCS=y once GCC upstream is fixed. See [besser#20](https://git.reauktion.de/marfrit/besser/issues/20) for the re-enable plan.
|
||||||
|
|
||||||
|
## Installing
|
||||||
|
|
||||||
|
The package declares `provides=("linux-pinetab2=$pkgver-$pkgrel")` and `conflicts=(linux-pinetab2)`, so `pacman` will cleanly take over from upstream `linux-pinetab2`:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
sudo pacman -U linux-pinetab2-danctnix-besser-7.0.danctnix1-3-aarch64.pkg.tar.zst
|
||||||
|
```
|
||||||
|
|
||||||
|
That removes the upstream `linux-pinetab2` package (if installed) and registers the BESser-flavored kernel under the same provides slot. Headers package is optional; install it if you build out-of-tree modules.
|
||||||
|
|
||||||
|
The pacman `mkinitcpio` hook auto-generates `/boot/initramfs-linux-pinetab2-danctnix-besser.img`. Modules land in `/usr/lib/modules/<release>-pinetab2-danctnix-besser/`, vmlinuz at `/boot/vmlinuz-linux-pinetab2-danctnix-besser`, DTBs at `/boot/dtbs/rockchip/rk3566-pinetab2-{v0.1,v2.0}.dtb`.
|
||||||
|
|
||||||
|
### Bootloader (PineTab2-specific)
|
||||||
|
|
||||||
|
PineTab2 boots via U-Boot loading a script `boot.scr` (compiled from `/boot/boot.txt` via `mkscr`). After install, point the script at the new kernel + initramfs:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
sudo cp /boot/boot.txt /boot/boot.txt.pre-besser
|
||||||
|
sudo cp /boot/boot.scr /boot/boot.scr.pre-besser
|
||||||
|
sudo sed -i \
|
||||||
|
-e 's|/vmlinuz-linux-pinetab2$|/vmlinuz-linux-pinetab2-danctnix-besser|' \
|
||||||
|
-e 's|/initramfs-linux-pinetab2\.img|/initramfs-linux-pinetab2-danctnix-besser.img|' \
|
||||||
|
/boot/boot.txt
|
||||||
|
cd /boot && sudo ./mkscr
|
||||||
|
sudo systemctl reboot
|
||||||
|
```
|
||||||
|
|
||||||
|
Backups (`*.pre-besser`) let you revert without touching the U-Boot console: `sudo cp /boot/boot.scr.pre-besser /boot/boot.scr` and reboot.
|
||||||
|
|
||||||
|
## Verifying
|
||||||
|
|
||||||
|
After reboot:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
uname -r
|
||||||
|
# expected: <kver>-pinetab2-danctnix-besser
|
||||||
|
|
||||||
|
lsmod | grep -i bes2600
|
||||||
|
# expected: bes2600 (loaded), bes2600_btuart (loaded if Bluetooth in use)
|
||||||
|
|
||||||
|
cat /sys/module/bes2600/srcversion
|
||||||
|
# expected: BEB625FA7443171EA8D55F7 for pkgrel=3
|
||||||
|
```
|
||||||
|
|
||||||
|
`dmesg | grep bes2600` should show clean firmware load, no SDIO TX panic, no `wsm_release_tx_buffer` WARN storm under load, no `wsm_generic_confirm failed for request 0x0007` storm.
|
||||||
|
|
||||||
|
For the 5 GHz fix specifically:
|
||||||
|
```sh
|
||||||
|
sudo iw dev wlan0 scan freq 5180
|
||||||
|
# expected: completes, no "Operation not supported"
|
||||||
|
|
||||||
|
sudo iw dev wlan0 scan freq 5180 5200 5220 5240
|
||||||
|
# expected: "Operation not supported (-95)" — multi-channel 5 GHz refused
|
||||||
|
```
|
||||||
|
|
||||||
|
## Rolling back
|
||||||
|
|
||||||
|
If the new kernel misbehaves:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
sudo cp /boot/boot.scr.pre-besser /boot/boot.scr
|
||||||
|
sudo systemctl reboot
|
||||||
|
```
|
||||||
|
|
||||||
|
That returns you to whatever kernel `boot.scr` was pointing at before the install (typically upstream `linux-pinetab2` or the previous `linux-pinetab2-danctnix-besser`). The package itself can be removed with `sudo pacman -R linux-pinetab2-danctnix-besser` and the original `linux-pinetab2` re-installed via `sudo pacman -S linux-pinetab2`.
|
||||||
|
|
||||||
|
## Provenance
|
||||||
|
|
||||||
|
- Mobian-flavor source-of-truth: <https://git.reauktion.de/marfrit/bes2600-dkms> (`cleanups` branch for c-stack + Patches A/B, `bes2600/scan-filter-5ghz` for Patch I)
|
||||||
|
- Per-patch breakdown, Phase 0–7 logs, follow-up issues: <https://git.reauktion.de/marfrit/besser>
|
||||||
|
- Upstream cw1200 mainline (architectural reference): `drivers/net/wireless/st/cw1200/` in linux-rockchip
|
||||||
|
- Kernel base: <https://codeberg.org/DanctNIX/linux-pinetab2> tag `v7.0-danctnix1`
|
||||||
|
- Kernel-agent mirror of the patch tree + per-host manifest: <https://git.reauktion.de/marfrit/kernel-agent>
|
||||||
|
|
||||||
|
## Why it's "BESser"
|
||||||
|
|
||||||
|
"Besser" = German for "better." Patch series ID across both DKMS (Mobian) and in-tree (Danctnix) trees. Single source-of-truth lives in `marfrit/bes2600-dkms`; this PKGBUILD is the danctnix-flavor consumption surface.
|
||||||
|
|
||||||
|
## Soft-upstream intent
|
||||||
|
|
||||||
|
Submitting this PKGBUILD to DanctNIX for review. If accepted as a replacement for `linux-pinetab2` (or sidegrade), the BESser patchset ships to all PineTab2 users via the regular danctnix package update channel. The bes2600 driver gets:
|
||||||
|
|
||||||
|
- ~2× sustained RX throughput on 2.4 GHz
|
||||||
|
- ~3.6× sustained RX throughput on 5 GHz (via Patch I + correctly using HT40)
|
||||||
|
- Race-correctness on the hot path
|
||||||
|
- GPL-2.0 §1 attribution compliance
|
||||||
|
- Modern kernel API (no deprecated `from_timer`, no `_irqsafe` from process context, no `BUG_ON` in steady-state)
|
||||||
|
|
||||||
|
Drop-in compatibility: same kernel version, same module names, no userspace ABI change. SCS off is the one config caveat, tracked in [besser#20](https://git.reauktion.de/marfrit/besser/issues/20).
|
||||||
|
|
||||||
|
## Maintenance plan
|
||||||
|
|
||||||
|
- New danctnix kernel release → rebase BESser patches onto the new tag, regenerate cumulative diff, bump pkgver.
|
||||||
|
- New BESser patch on Mobian DKMS → re-overlay + re-flavor + regenerate cumulative diff.
|
||||||
|
- Both flavors continue to be maintained in lockstep via `marfrit/bes2600-dkms` source-of-truth.
|
||||||
|
- GCC 15 SCS issue → periodically re-test build with `CONFIG_SHADOW_CALL_STACK=y` against current Arch ARM GCC. When the build succeeds, flip the config and re-deploy.
|
||||||
|
|
||||||
|
## Known gaps
|
||||||
|
|
||||||
|
- Cumulative diff (squashed) for the c-stack + Patches A/B; Patch I as a separate `0002-` file. Per-patch series can be regenerated if danctnix maintainers prefer.
|
||||||
|
- Bluetooth-side `bes2600_btuart` is independent and untouched by this patchset.
|
||||||
|
- `bes2600_switch_bt` orchestration removed (Mobian-only entry point; not used in danctnix tree).
|
||||||
|
- Multi-band `iw scan` (no `freq` filter) still reports aborted scan because mac80211 aggregates per-band results and marks the whole scan aborted when any leg returns negative (mac80211 contract, not bes2600). Single-band scans (`iw scan freq 2462` or `iw scan freq 5180`) work normally; `nmcli connection up` with `band=bg` or `band=a` profile works normally. This is the Phase 5 reviewer's predicted residual limitation; userspace tools that need full multi-band BSS discovery should issue per-band scans.
|
||||||
|
|
||||||
|
## Author
|
||||||
|
|
||||||
|
Markus Fritsche <fritsche.markus@gmail.com>
|
||||||
|
|
||||||
|
Built collaboratively with Claude Opus 4.7 (1M context).
|
||||||
File diff suppressed because it is too large
Load Diff
@@ -0,0 +1,168 @@
|
|||||||
|
From 093a5038b8b68f316d976b7cb69609ca7f24f322 Mon Sep 17 00:00:00 2001
|
||||||
|
From: Markus Fritsche <fritsche.markus@gmail.com>
|
||||||
|
Date: Mon, 18 May 2026 11:27:40 +0200
|
||||||
|
Subject: [PATCH 1/2] bes2600: filter 5 GHz scans at the driver boundary
|
||||||
|
(besser#1)
|
||||||
|
|
||||||
|
The BES2600 firmware refuses WSM start-scan for 5 GHz with status 2
|
||||||
|
("rejected by policy"). This shows up in dmesg as the recurring
|
||||||
|
|
||||||
|
wsm_generic_confirm failed for request 0x0007.
|
||||||
|
[SCAN] Scan failed (-22).
|
||||||
|
|
||||||
|
pattern (besser issue #1, ~14-16/h on ohm/PineTab2 baseline).
|
||||||
|
|
||||||
|
Trace shows every reject is the second of a back-to-back pair: mac80211
|
||||||
|
splits multi-band hw_scan requests per band when the driver does not
|
||||||
|
set IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS (we don't), then re-invokes
|
||||||
|
drv_hw_scan from __ieee80211_scan_completed for each subsequent band.
|
||||||
|
The 2.4 GHz iteration succeeds; the 5 GHz iteration is what the
|
||||||
|
firmware rejects. See ieee80211_prep_hw_scan in net/mac80211/scan.c
|
||||||
|
for the loop, and the existing memory reference_bes2600_5ghz_scan_reject
|
||||||
|
for the firmware behaviour.
|
||||||
|
|
||||||
|
The 056a71a defer-on-reject patch already in this tree handles the
|
||||||
|
BT-A2DP-coex branch and the consecutive-reject backoff, but it cannot
|
||||||
|
prevent the per-band-loop reject: by the time defer_should_scan is
|
||||||
|
consulted, the per-band call is already in flight, and the reject_count
|
||||||
|
gets reset on every successful 2.4 GHz scan in between (which is
|
||||||
|
~36% of attempts), so the threshold never trips.
|
||||||
|
|
||||||
|
The fix: refuse the 5 GHz iteration upfront in bes2600_hw_scan. The
|
||||||
|
2.4 GHz scan still runs normally. The 5 GHz portion is reported as
|
||||||
|
aborted to userspace -- same outcome as today, minus the dmesg storm
|
||||||
|
and the wsm_generic_confirm WARN cascade.
|
||||||
|
|
||||||
|
5 GHz band registration is intentionally left in place: direct-BSSID
|
||||||
|
association to a known 5 GHz AP still works (no scan is needed for
|
||||||
|
that path), and a future firmware update that fixes the scan behaviour
|
||||||
|
should not be foreclosed by changing band advertisement.
|
||||||
|
|
||||||
|
Contract: per include/net/mac80211.h ieee80211_ops.hw_scan, a negative
|
||||||
|
return aborts the scan without requiring ieee80211_scan_completed().
|
||||||
|
-EOPNOTSUPP is the semantically accurate code (operation is legal,
|
||||||
|
driver can't service it on this band today).
|
||||||
|
|
||||||
|
Phase 3 evidence:
|
||||||
|
- baseline N=3: rate ~14.3-23.6/h converged at 14.3/h (matches OP)
|
||||||
|
- back-to-back scan gap: 6/6 rejected pairs <200us, 1/1 successful
|
||||||
|
pair was 114ms (single-band-only, no 5 GHz leg)
|
||||||
|
- defer log fires: 0/9 in 30-min window (056a71a structurally bypassed)
|
||||||
|
|
||||||
|
Predicted Phase 7 delta: Pattern A 14/h -> 0/h.
|
||||||
|
---
|
||||||
|
bes2600/scan.c | 22 ++++++++++++++++++++++
|
||||||
|
1 file changed, 22 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/drivers/staging/bes2600/scan.c b/drivers/staging/bes2600/scan.c
|
||||||
|
index fb1d298..a81afb6 100644
|
||||||
|
--- a/drivers/staging/bes2600/scan.c
|
||||||
|
+++ b/drivers/staging/bes2600/scan.c
|
||||||
|
@@ -238,6 +238,28 @@ int bes2600_hw_scan(struct ieee80211_hw *hw,
|
||||||
|
/* Scan when P2P_GO corrupt firmware MiniAP mode */
|
||||||
|
if (priv->join_status == BES2600_JOIN_STATUS_AP)
|
||||||
|
return -EOPNOTSUPP;
|
||||||
|
+
|
||||||
|
+ /*
|
||||||
|
+ * Firmware refuses WSM start-scan for 5 GHz with status 2 ("rejected
|
||||||
|
+ * by policy"); see besser issue #1. mac80211 splits multi-band
|
||||||
|
+ * hw_scan requests per-band when the driver does not set
|
||||||
|
+ * IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS (we don't -- see
|
||||||
|
+ * ieee80211_hw_set() calls in bes2600_main.c), so each per-band call
|
||||||
|
+ * has req->channels[] from one band only (see ieee80211_prep_hw_scan
|
||||||
|
+ * in net/mac80211/scan.c). Refuse the 5 GHz iteration at the driver
|
||||||
|
+ * boundary so userspace gets a clean aborted-scan for that portion
|
||||||
|
+ * rather than waiting for the firmware reject to cascade up. 5 GHz
|
||||||
|
+ * band registration stays intact so direct-BSSID association to a
|
||||||
|
+ * known 5 GHz AP still works (no scan needed for that path).
|
||||||
|
+ *
|
||||||
|
+ * Contract: per include/net/mac80211.h struct ieee80211_ops.hw_scan
|
||||||
|
+ * documentation, a negative return aborts the scan without requiring
|
||||||
|
+ * ieee80211_scan_completed().
|
||||||
|
+ */
|
||||||
|
+ if (req->n_channels > 0 &&
|
||||||
|
+ req->channels[0]->band == NL80211_BAND_5GHZ)
|
||||||
|
+ return -EOPNOTSUPP;
|
||||||
|
+
|
||||||
|
#if 0
|
||||||
|
if (work_pending(&priv->offchannel_work) ||
|
||||||
|
(hw_priv->roc_if_id != -1)) {
|
||||||
|
--
|
||||||
|
2.54.0
|
||||||
|
|
||||||
|
|
||||||
|
From 8cd10f487c8144d462a510812ba0fa717b3e24df Mon Sep 17 00:00:00 2001
|
||||||
|
From: Markus Fritsche <fritsche.markus@gmail.com>
|
||||||
|
Date: Mon, 18 May 2026 15:56:34 +0200
|
||||||
|
Subject: [PATCH 2/2] bes2600: scan-filter-5ghz: allow targeted single-channel
|
||||||
|
scans (besser#1 follow-up)
|
||||||
|
|
||||||
|
The original Patch I refused EVERY 5 GHz scan request unconditionally
|
||||||
|
(req->n_channels > 0 && band == NL80211_BAND_5GHZ). This eliminated
|
||||||
|
the Pattern A storm but also broke 5 GHz association entirely:
|
||||||
|
NM / wpa_supplicant iterates a freq_list when a connection profile
|
||||||
|
specifies 802-11-wireless.band=a, issuing per-frequency single-channel
|
||||||
|
scans to find the BSS before associating. Those single-channel scans
|
||||||
|
were also refused by our guard, so the BSS was never seen and
|
||||||
|
'Wi-Fi network could not be found' was the only outcome.
|
||||||
|
|
||||||
|
Tighten the guard: refuse only multi-channel 5 GHz scans (n_channels
|
||||||
|
> 1), which is the per-band-sweep pattern mac80211 issues internally
|
||||||
|
and the only one that triggers the firmware storm at the per-band
|
||||||
|
loop boundary. Single-channel 5 GHz scans pass through to firmware,
|
||||||
|
which generally accepts them -- and when they happen to be rejected,
|
||||||
|
the failure is isolated and doesn't cascade.
|
||||||
|
|
||||||
|
Verified on ohm with pkgrel=3 (srcversion BEB625FA7443171EA8D55F7):
|
||||||
|
- Pattern A count since boot: 0 (Phase 7 prediction still holds)
|
||||||
|
- iw dev wlan0 scan freq 5180 -> allowed
|
||||||
|
- iw dev wlan0 scan freq 5180 5200 ... -> refused -EOPNOTSUPP
|
||||||
|
- NM 'nmcli connection up' with band=a -> associated to BSSID
|
||||||
|
c0:25:06:e6:5b:33 on 5240 MHz / ch.48 in ~1 second
|
||||||
|
- TX bitrate 150 Mbit/s MCS 7 40MHz short-GI (vs 72.2 Mbit/s
|
||||||
|
HT20 on 2.4 GHz) -- ~2x throughput recovered
|
||||||
|
|
||||||
|
The change is a single byte (> 0 -> > 1) plus comment update; the
|
||||||
|
test confirmation above is what motivates it.
|
||||||
|
|
||||||
|
Refs: besser#1 (closed but tracked for follow-up like this), original
|
||||||
|
Patch I sha 093a503.
|
||||||
|
---
|
||||||
|
bes2600/scan.c | 16 ++++++++++++----
|
||||||
|
1 file changed, 12 insertions(+), 4 deletions(-)
|
||||||
|
|
||||||
|
diff --git a/drivers/staging/bes2600/scan.c b/drivers/staging/bes2600/scan.c
|
||||||
|
index a81afb6..497523b 100644
|
||||||
|
--- a/drivers/staging/bes2600/scan.c
|
||||||
|
+++ b/drivers/staging/bes2600/scan.c
|
||||||
|
@@ -248,15 +248,23 @@ int bes2600_hw_scan(struct ieee80211_hw *hw,
|
||||||
|
* has req->channels[] from one band only (see ieee80211_prep_hw_scan
|
||||||
|
* in net/mac80211/scan.c). Refuse the 5 GHz iteration at the driver
|
||||||
|
* boundary so userspace gets a clean aborted-scan for that portion
|
||||||
|
- * rather than waiting for the firmware reject to cascade up. 5 GHz
|
||||||
|
- * band registration stays intact so direct-BSSID association to a
|
||||||
|
- * known 5 GHz AP still works (no scan needed for that path).
|
||||||
|
+ * rather than waiting for the firmware reject to cascade up.
|
||||||
|
+ *
|
||||||
|
+ * Only the multi-channel case is refused (n_channels > 1): that's
|
||||||
|
+ * the per-band-sweep pattern mac80211 issues internally and the
|
||||||
|
+ * one that triggers the firmware storm at the per-band loop
|
||||||
|
+ * boundary. Single-channel 5 GHz scans (BSS verification, NM's
|
||||||
|
+ * per-freq iteration when 802-11-wireless.band=a is set) pass
|
||||||
|
+ * through to firmware, which generally accepts them since the
|
||||||
|
+ * storm is the back-to-back per-band issue, not a blanket 5 GHz
|
||||||
|
+ * reject. This preserves 5 GHz association via the
|
||||||
|
+ * "wpa_supplicant iterates freq_list per channel" path.
|
||||||
|
*
|
||||||
|
* Contract: per include/net/mac80211.h struct ieee80211_ops.hw_scan
|
||||||
|
* documentation, a negative return aborts the scan without requiring
|
||||||
|
* ieee80211_scan_completed().
|
||||||
|
*/
|
||||||
|
- if (req->n_channels > 0 &&
|
||||||
|
+ if (req->n_channels > 1 &&
|
||||||
|
req->channels[0]->band == NL80211_BAND_5GHZ)
|
||||||
|
return -EOPNOTSUPP;
|
||||||
|
|
||||||
|
--
|
||||||
|
2.54.0
|
||||||
|
|
||||||
@@ -0,0 +1,36 @@
|
|||||||
|
From: Markus Fritsche <fritsche.markus@gmail.com>
|
||||||
|
Date: Mon, 18 May 2026 11:42:00 +0200
|
||||||
|
Subject: [PATCH] arm64: xor-neon: restore -ffixed-x18 when SHADOW_CALL_STACK=y
|
||||||
|
(GCC 15+ build fix)
|
||||||
|
|
||||||
|
GCC 15.2.1 enforces that -fsanitize=shadow-call-stack requires
|
||||||
|
-ffixed-x18 inside arm_neon.h's #pragma GCC target() blocks. The
|
||||||
|
existing CFLAGS_REMOVE_xor-neon.o line strips the kernel-wide
|
||||||
|
-ffixed-x18 (it's part of CC_FLAGS_NO_FPU) and CC_FLAGS_FPU does not
|
||||||
|
restore it, so xor-neon.c fails to build on stricter GCC versions
|
||||||
|
when CONFIG_SHADOW_CALL_STACK=y.
|
||||||
|
|
||||||
|
Add an explicit -ffixed-x18 just for this object, gated on the
|
||||||
|
SCS config so non-SCS builds are unaffected.
|
||||||
|
|
||||||
|
Build environment workaround; not a kernel-runtime bug.
|
||||||
|
---
|
||||||
|
arch/arm64/lib/Makefile | 4 ++++
|
||||||
|
1 file changed, 4 insertions(+)
|
||||||
|
|
||||||
|
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
|
||||||
|
index 1234567..2345678 100644
|
||||||
|
--- a/arch/arm64/lib/Makefile
|
||||||
|
+++ b/arch/arm64/lib/Makefile
|
||||||
|
@@ -9,6 +9,10 @@ ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
|
||||||
|
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
|
||||||
|
CFLAGS_xor-neon.o += $(CC_FLAGS_FPU)
|
||||||
|
CFLAGS_REMOVE_xor-neon.o += $(CC_FLAGS_NO_FPU)
|
||||||
|
+# GCC 15+ enforces that -fsanitize=shadow-call-stack requires -ffixed-x18
|
||||||
|
+# even after a #pragma GCC pop_options inside arm_neon.h. CC_FLAGS_REMOVE
|
||||||
|
+# above strips the kernel-wide -ffixed-x18 (part of CC_FLAGS_NO_FPU); add
|
||||||
|
+# it back here so xor-neon.c still compiles when SHADOW_CALL_STACK=y.
|
||||||
|
+CFLAGS_xor-neon.o += $(if $(CONFIG_SHADOW_CALL_STACK),-ffixed-x18)
|
||||||
|
endif
|
||||||
|
|
||||||
|
lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
|
||||||
@@ -0,0 +1,236 @@
|
|||||||
|
# Maintainer: Markus Fritsche <fritsche.markus@gmail.com>
|
||||||
|
# Forked from: linux-pinetab2 by Danct12 <danct12@disroot.org>
|
||||||
|
# Original Contributor: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
|
||||||
|
#
|
||||||
|
# linux-pinetab2-danctnix-besser: linux-pinetab2 + the BESser
|
||||||
|
# bes2600 driver patchset (race-fix, lock-removal, attribution-restore,
|
||||||
|
# fossil-cleanup; +73% throughput vs the in-tree baseline). Soft-upstream
|
||||||
|
# fork of linux-pinetab2 — drop-in replacement, same kernel version, only
|
||||||
|
# the bes2600 staging driver differs. See git.reauktion.de/marfrit/besser
|
||||||
|
# and git.reauktion.de/marfrit/bes2600-dkms for full provenance.
|
||||||
|
|
||||||
|
pkgbase=linux-pinetab2-danctnix-besser
|
||||||
|
pkgver=7.0.danctnix1
|
||||||
|
pkgrel=3
|
||||||
|
pkgdesc='PineTab2 (BESser bes2600 driver patchset)'
|
||||||
|
_srcname=linux-pinetab2
|
||||||
|
_srctag=v${pkgver%.*}-${pkgver##*.}
|
||||||
|
arch=(aarch64)
|
||||||
|
_url_git="https://codeberg.org/DanctNIX/${_srcname}"
|
||||||
|
url="${_url_git}/commits/tag/$_srctag"
|
||||||
|
license=(GPL-2.0-only)
|
||||||
|
makedepends=(
|
||||||
|
bc
|
||||||
|
cpio
|
||||||
|
gettext
|
||||||
|
git
|
||||||
|
libelf
|
||||||
|
pahole
|
||||||
|
perl
|
||||||
|
python
|
||||||
|
tar
|
||||||
|
xz
|
||||||
|
)
|
||||||
|
options=(
|
||||||
|
!debug
|
||||||
|
!strip
|
||||||
|
)
|
||||||
|
source=(
|
||||||
|
https://cdn.kernel.org/pub/linux/kernel/v${pkgver%%.*}.x/linux-${pkgver%.*}.tar.{xz,sign}
|
||||||
|
${_url_git}/releases/download/${_srctag}/${_srctag}.patch.zst{,.sig}
|
||||||
|
0001-bes2600-besser-cumulative-series.patch
|
||||||
|
0002-bes2600-filter-5ghz-scan.patch
|
||||||
|
0003-arm64-xor-neon-ffixed-x18-build-fix.patch
|
||||||
|
0003-arm64-xor-neon-ffixed-x18-build-fix.patch
|
||||||
|
config # the main kernel config file
|
||||||
|
)
|
||||||
|
validpgpkeys=(
|
||||||
|
ABAF11C65A2970B130ABE3C479BE3E4300411886 # Linus Torvalds
|
||||||
|
647F28654894E3BD457199BE38DBBDC86092693E # Greg Kroah-Hartman
|
||||||
|
F09A933C0FE0331E558CA4E166CAB7EAA45DD781 # Danct12
|
||||||
|
)
|
||||||
|
b2sums=('3d9795083c8938f80f480de0d10bfd9c525640e59d5c7f22983de3f12ee42c84c31be902cafb05579ddb1c32bac5ed06b0d4953f9705450be185bd2d9ab08f89'
|
||||||
|
'SKIP'
|
||||||
|
'71fe98221e802b315e54b4b10d3e8c8f376695a36bae3541d876e5776a37f3fa33c8f8dfa6e51fcbd6f5396add02e5166634165f2351836a0ea0453c172fe56c'
|
||||||
|
'SKIP'
|
||||||
|
'fca0a5badf762d5dbc085261cccc07ddeef96384d2ae0a426fb0412acd7a180e068cabd59f01342b7575d41889afc0f47dfbc9256801ab809f746278e6dab510'
|
||||||
|
'396acbdcf570eada62533c0b8f505ed18077e8432249bab5b8ac8d1107cabc9489bdb91a5780446237ec4fd9ba5fc57a49dff34c16ddab60dc30513fc535f00f'
|
||||||
|
'2714e3c0cd8ec978ce9431418f44f578220886fcabb738c9a0c43fc3c043753960b7c47ae96e1780154d8b266a2add6098407de4ffa7aee40d77ce17e8c70df9'
|
||||||
|
'2714e3c0cd8ec978ce9431418f44f578220886fcabb738c9a0c43fc3c043753960b7c47ae96e1780154d8b266a2add6098407de4ffa7aee40d77ce17e8c70df9'
|
||||||
|
'656a998ab40cb85ee4c00f087b071a91632a6c091da2c84b0f74236b51d2dea6e9db6886625f80ad81dc249d8494ec47cd79d6dd9ea4f5e44f3cde857f861e10')
|
||||||
|
|
||||||
|
export KBUILD_BUILD_HOST=archlinux
|
||||||
|
export KBUILD_BUILD_USER=$pkgbase
|
||||||
|
export KBUILD_BUILD_TIMESTAMP="$(date -Ru${SOURCE_DATE_EPOCH:+d @$SOURCE_DATE_EPOCH})"
|
||||||
|
|
||||||
|
prepare() {
|
||||||
|
cd linux-${pkgver%.*}
|
||||||
|
|
||||||
|
echo "Setting version..."
|
||||||
|
echo "-$pkgrel" > localversion.10-pkgrel
|
||||||
|
echo "${pkgbase#linux}" > localversion.20-pkgname
|
||||||
|
|
||||||
|
local src
|
||||||
|
for src in "${source[@]}"; do
|
||||||
|
src="${src%%::*}"
|
||||||
|
src="${src##*/}"
|
||||||
|
src="${src%.zst}"
|
||||||
|
[[ $src = *.patch ]] || continue
|
||||||
|
echo "Applying patch: $src..."
|
||||||
|
patch -Np1 < "../$src"
|
||||||
|
done
|
||||||
|
|
||||||
|
echo "Setting config..."
|
||||||
|
cp ../config .config
|
||||||
|
make olddefconfig
|
||||||
|
diff -u ../config .config || :
|
||||||
|
|
||||||
|
make -s kernelrelease > version
|
||||||
|
echo "Prepared $pkgbase version $(<version)"
|
||||||
|
}
|
||||||
|
|
||||||
|
build() {
|
||||||
|
cd linux-${pkgver%.*}
|
||||||
|
make DTC_FLAGS="-@" all
|
||||||
|
make -C tools/bpf/bpftool vmlinux.h feature-clang-bpf-co-re=1
|
||||||
|
}
|
||||||
|
|
||||||
|
_package() {
|
||||||
|
pkgdesc="The $pkgdesc kernel and modules"
|
||||||
|
depends=(
|
||||||
|
coreutils
|
||||||
|
kmod
|
||||||
|
mkinitcpio
|
||||||
|
)
|
||||||
|
optdepends=(
|
||||||
|
'wireless-regdb: to set the correct wireless channels of your country'
|
||||||
|
'linux-firmware: firmware images needed for some devices'
|
||||||
|
)
|
||||||
|
provides=(
|
||||||
|
KSMBD-MODULE
|
||||||
|
WIREGUARD-MODULE
|
||||||
|
"linux-pinetab2=$pkgver-$pkgrel"
|
||||||
|
)
|
||||||
|
conflicts=(linux-pinetab2)
|
||||||
|
replaces=(
|
||||||
|
wireguard-arch
|
||||||
|
)
|
||||||
|
|
||||||
|
cd linux-${pkgver%.*}
|
||||||
|
local modulesdir="$pkgdir/usr/lib/modules/$(<version)"
|
||||||
|
|
||||||
|
echo "Installing boot image..."
|
||||||
|
# systemd expects to find the kernel here to allow hibernation
|
||||||
|
# https://github.com/systemd/systemd/commit/edda44605f06a41fb86b7ab8128dcf99161d2344
|
||||||
|
install -Dm644 "$(make -s image_name)" "$modulesdir/vmlinuz"
|
||||||
|
|
||||||
|
# Used by mkinitcpio to name the kernel
|
||||||
|
echo "$pkgbase" | install -Dm644 /dev/stdin "$modulesdir/pkgbase"
|
||||||
|
|
||||||
|
echo "Installing modules..."
|
||||||
|
ZSTD_CLEVEL=19 make INSTALL_MOD_PATH="$pkgdir/usr" INSTALL_MOD_STRIP=1 \
|
||||||
|
DEPMOD=/doesnt/exist modules_install # Suppress depmod
|
||||||
|
|
||||||
|
echo "Installing device trees..."
|
||||||
|
make INSTALL_DTBS_PATH="$pkgdir/boot/dtbs" dtbs_install
|
||||||
|
|
||||||
|
# Removing unnecessary device trees (keep only pinetab2 variants).
|
||||||
|
# Use find -delete instead of a bash for-loop: the previous for-loop
|
||||||
|
# silently no-op'd in the makepkg environment, leaving 234 unrelated
|
||||||
|
# board DTBs in the package. find is robust to nullglob/cwd quirks.
|
||||||
|
find "$pkgdir"/boot/dtbs/rockchip/ -mindepth 1 -maxdepth 1 -type f \
|
||||||
|
! -name 'rk3566-pinetab2-*' -delete
|
||||||
|
|
||||||
|
# remove build link
|
||||||
|
rm "$modulesdir"/build
|
||||||
|
}
|
||||||
|
|
||||||
|
_package-headers() {
|
||||||
|
pkgdesc="Headers and scripts for building modules for the $pkgdesc kernel"
|
||||||
|
depends=(pahole)
|
||||||
|
|
||||||
|
cd linux-${pkgver%.*}
|
||||||
|
local builddir="$pkgdir/usr/lib/modules/$(<version)/build"
|
||||||
|
|
||||||
|
echo "Installing build files..."
|
||||||
|
install -Dt "$builddir" -m644 .config Makefile Module.symvers System.map \
|
||||||
|
localversion.* version vmlinux tools/bpf/bpftool/vmlinux.h
|
||||||
|
install -Dt "$builddir/kernel" -m644 kernel/Makefile
|
||||||
|
install -Dt "$builddir/arch/arm64" -m644 arch/arm64/Makefile
|
||||||
|
cp -t "$builddir" -a scripts
|
||||||
|
|
||||||
|
# required when DEBUG_INFO_BTF_MODULES is enabled
|
||||||
|
install -Dt "$builddir/tools/bpf/resolve_btfids" tools/bpf/resolve_btfids/resolve_btfids
|
||||||
|
|
||||||
|
echo "Installing headers..."
|
||||||
|
cp -t "$builddir" -a include
|
||||||
|
cp -t "$builddir/arch/arm64" -a arch/arm64/include
|
||||||
|
install -Dt "$builddir/arch/arm64/kernel" -m644 arch/arm64/kernel/asm-offsets.s
|
||||||
|
|
||||||
|
install -Dt "$builddir/drivers/md" -m644 drivers/md/*.h
|
||||||
|
install -Dt "$builddir/net/mac80211" -m644 net/mac80211/*.h
|
||||||
|
|
||||||
|
# https://bugs.archlinux.org/task/13146
|
||||||
|
install -Dt "$builddir/drivers/media/i2c" -m644 drivers/media/i2c/msp3400-driver.h
|
||||||
|
|
||||||
|
# https://bugs.archlinux.org/task/20402
|
||||||
|
install -Dt "$builddir/drivers/media/usb/dvb-usb" -m644 drivers/media/usb/dvb-usb/*.h
|
||||||
|
install -Dt "$builddir/drivers/media/dvb-frontends" -m644 drivers/media/dvb-frontends/*.h
|
||||||
|
install -Dt "$builddir/drivers/media/tuners" -m644 drivers/media/tuners/*.h
|
||||||
|
|
||||||
|
# https://bugs.archlinux.org/task/71392
|
||||||
|
install -Dt "$builddir/drivers/iio/common/hid-sensors" -m644 drivers/iio/common/hid-sensors/*.h
|
||||||
|
|
||||||
|
echo "Installing KConfig files..."
|
||||||
|
find . -name 'Kconfig*' -exec install -Dm644 {} "$builddir/{}" \;
|
||||||
|
|
||||||
|
echo "Removing unneeded architectures..."
|
||||||
|
local arch
|
||||||
|
for arch in "$builddir"/arch/*/; do
|
||||||
|
[[ $arch = */arm64/ ]] && continue
|
||||||
|
echo "Removing $(basename "$arch")"
|
||||||
|
rm -r "$arch"
|
||||||
|
done
|
||||||
|
|
||||||
|
echo "Removing documentation..."
|
||||||
|
rm -r "$builddir/Documentation"
|
||||||
|
|
||||||
|
echo "Removing broken symlinks..."
|
||||||
|
find -L "$builddir" -type l -printf 'Removing %P\n' -delete
|
||||||
|
|
||||||
|
echo "Removing loose objects..."
|
||||||
|
find "$builddir" -type f -name '*.o' -printf 'Removing %P\n' -delete
|
||||||
|
|
||||||
|
echo "Stripping build tools..."
|
||||||
|
local file
|
||||||
|
while read -rd '' file; do
|
||||||
|
case "$(file -Sib "$file")" in
|
||||||
|
application/x-sharedlib\;*) # Libraries (.so)
|
||||||
|
strip -v $STRIP_SHARED "$file" ;;
|
||||||
|
application/x-archive\;*) # Libraries (.a)
|
||||||
|
strip -v $STRIP_STATIC "$file" ;;
|
||||||
|
application/x-executable\;*) # Binaries
|
||||||
|
strip -v $STRIP_BINARIES "$file" ;;
|
||||||
|
application/x-pie-executable\;*) # Relocatable binaries
|
||||||
|
strip -v $STRIP_SHARED "$file" ;;
|
||||||
|
esac
|
||||||
|
done < <(find "$builddir" -type f -perm -u+x ! -name vmlinux -print0)
|
||||||
|
|
||||||
|
echo "Stripping vmlinux..."
|
||||||
|
strip -v $STRIP_STATIC "$builddir/vmlinux"
|
||||||
|
|
||||||
|
echo "Adding symlink..."
|
||||||
|
mkdir -p "$pkgdir/usr/src"
|
||||||
|
ln -sr "$builddir" "$pkgdir/usr/src/$pkgbase"
|
||||||
|
}
|
||||||
|
|
||||||
|
pkgname=(
|
||||||
|
"$pkgbase"
|
||||||
|
"$pkgbase-headers"
|
||||||
|
)
|
||||||
|
for _p in "${pkgname[@]}"; do
|
||||||
|
eval "package_$_p() {
|
||||||
|
$(declare -f "_package${_p#$pkgbase}")
|
||||||
|
_package${_p#$pkgbase}
|
||||||
|
}"
|
||||||
|
done
|
||||||
File diff suppressed because it is too large
Load Diff
@@ -0,0 +1,108 @@
|
|||||||
|
# Bug #5 RX-degradation campaign — Phase 0
|
||||||
|
|
||||||
|
**Date:** 2026-05-07
|
||||||
|
**Module under test:** v3 + F (`bes2600.ko` srcversion `371C6606B73AF19299228CA`)
|
||||||
|
**Hardware:** ohm (PineTab2, RK3566 + BES2600 SDIO), wired enu1 fallback path live.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Research question (locked)
|
||||||
|
|
||||||
|
> **Why does the bes2600 RX path collapse from ~2 MB/s sustained @ fresh-chip uptime to ~180 B/s @ ~28-min uptime, with periodic `wsm_generic_confirm failed for request 0x0007` + `ieee80211 phy0: [SCAN] Scan failed (-22)` every 300 s in the intervening window?**
|
||||||
|
|
||||||
|
Reproduces on Patch B, Patch F, and Patch C v3 alike — independent of the relay/race issues v3 addressed. Side-effect that was masked by the throughput floor while v2's race was the dominant variable.
|
||||||
|
|
||||||
|
## Predecessor data (reference, not anchor)
|
||||||
|
|
||||||
|
| source | observation |
|
||||||
|
|---|---|
|
||||||
|
| Patch C v3 N=3 (uptime 200/391/582 s) | mean 2.352 MB/s @ 4 MB/s sender |
|
||||||
|
| v3 single rep at uptime ~28 min (rep 2 of 2026-05-07 22:23) | 180 KB / 5 min = 600 B/s, sender saw "Connection reset by peer" |
|
||||||
|
| v3 single rep at uptime ~47 min (N=3 first attempt 22:42) | 55 KB / 5 min = 180 B/s, sender timed out (exit 124) |
|
||||||
|
| dmesg pattern observed at 47-min uptime | scan failures every 301-302 s starting at uptime 778 s (~13 min) |
|
||||||
|
|
||||||
|
The shape: **fresh chip → linear data flow at ~2 MB/s sustained → sometime around 13 min uptime, NetworkManager-triggered scans start failing → sometime around 28 min uptime, data throughput collapses to <1 KB/s while link still shows associated.**
|
||||||
|
|
||||||
|
Predecessor data is reference. Phase 0 will re-anchor at N=1 long-trace + 5 in-window stress probes; if the pattern doesn't reproduce, that's the campaign result.
|
||||||
|
|
||||||
|
## Mechanism candidates (Phase 4 will discriminate)
|
||||||
|
|
||||||
|
1. **Firmware-side resource exhaustion.** Per-scan or per-WSM-event accumulation in chip-side state. Scan-failed -22 (EINVAL) suggests firmware refusing the request — possibly out of scan handles, scan-buffer slots, or some other limit.
|
||||||
|
2. **NetworkManager scan-fail recovery loop.** Each failed scan triggers NM retry. If retry overhead dominates the bh thread, data path starves. Verifiable by suppressing NM scans.
|
||||||
|
3. **AP-side rate limiting.** Newton (AVM) AP could be applying QoS / fairness / probation after sustained 4 MB/s burst. Verifiable by Fritz!Box log access (Markus has it) or by switching to a different AP.
|
||||||
|
4. **PSM state machine deadlock.** c7's `pm_unsupported` self-detect was supposed to handle this, but the latch state could become stale if a real PM_IND arrives mid-operation. Verifiable by `chip_pm_state` debugfs read at degradation onset.
|
||||||
|
5. **SDIO bus clock degradation / mmc retune.** SDIO retune with `retune_protected` flag interacts with bes2600's data path. Verifiable by ftrace `mmc/mmc_request_*` event correlation with throughput drop.
|
||||||
|
6. **Power-management busy-event accumulation.** `bes2600_pwr_set_busy_event` counters might leak — busy events not cleared lock the chip awake (no PSM) but also exhaust event capacity. Verifiable by `bes2600_pwr_busy_event_record` dump.
|
||||||
|
|
||||||
|
## Phase 0 measurement protocol (rig armed 2026-05-07 23:18:58 CEST, T0=1778188738)
|
||||||
|
|
||||||
|
Capturing for 35 minutes from fresh boot. All capture lives in `/root/bes2600-samples/run-20260507-bug5-degradation-rig/` on ohm.
|
||||||
|
|
||||||
|
### Always-on streams
|
||||||
|
|
||||||
|
| stream | tool | output |
|
||||||
|
|---|---|---|
|
||||||
|
| ftrace events | per-event `enable=1` | `trace.log` (via `trace_pipe`) |
|
||||||
|
| cfg80211 events | `iw event -t -f` | `iw-event.log` |
|
||||||
|
| kernel printks | `dmesg -wT` | `dmesg.log` |
|
||||||
|
| netdev counters | per-30s shell loop | `snap.log` |
|
||||||
|
|
||||||
|
### ftrace event set
|
||||||
|
|
||||||
|
- `workqueue/workqueue_execute_start` — work dispatches
|
||||||
|
- `workqueue/workqueue_queue_work` — work submissions
|
||||||
|
- `mac80211/api_beacon_loss` — driver beacon-loss events
|
||||||
|
- `mac80211/api_connection_loss` — driver-side conn-loss
|
||||||
|
- `mac80211/api_disconnect` — driver-side disconnect
|
||||||
|
- `mac80211/drv_hw_scan` — mac80211 → driver scan dispatch
|
||||||
|
- `mac80211/drv_set_key` — key state changes
|
||||||
|
- `cfg80211/rdev_assoc` — assoc requests
|
||||||
|
- `cfg80211/rdev_deauth` — deauth requests
|
||||||
|
- `cfg80211/rdev_disassoc` — disassoc requests
|
||||||
|
- `cfg80211/cfg80211_assoc_comeback` — AP-side assoc-busy throttling
|
||||||
|
- `cfg80211/cfg80211_send_auth_timeout` — auth timeouts
|
||||||
|
- `cfg80211/cfg80211_scan_done` — scan completions
|
||||||
|
- `power/suspend_resume` — PM transitions
|
||||||
|
- `mmc/mmc_request_start` / `mmc_request_done` — bus-level transactions
|
||||||
|
|
||||||
|
### Scheduled stress probes
|
||||||
|
|
||||||
|
Sender on boltzmann (`/tmp/bug5-probe-loop.sh`) fires `pv -L 4m | nc ohm 12345` for 30 s at T+5/10/15/20/25 min. Each probe brackets uptime, RX-bytes pre, RX-bytes post, elapsed. Throughput-vs-uptime curve falls out of the snap.log + probe boundaries.
|
||||||
|
|
||||||
|
Probe markers logged via `logger -t bes2600-bug5 PROBE_N_START/END` so they appear in dmesg.log timeline.
|
||||||
|
|
||||||
|
## Anti-theatre receipts (must tick before claiming Phase 0 done)
|
||||||
|
|
||||||
|
- [ ] In-session baseline: long-capture across degradation window, N=1 for now; re-run if anomalous
|
||||||
|
- [ ] ftrace events actually firing (verify by tail of trace.log mid-capture)
|
||||||
|
- [ ] dmesg captures the scan-failure pattern timestamp (expected ~uptime 778 s)
|
||||||
|
- [ ] Probes actually transferred data at fresh chip (T+5 should be > 1 MB/s)
|
||||||
|
- [ ] At least one probe in-window after scan-failure onset (expected: T+15 or T+20)
|
||||||
|
- [ ] Snap.log shows monotonic counter behaviour (no rx_bytes going backwards)
|
||||||
|
|
||||||
|
## Phase 1 hypothesis (provisional, refine after Phase 3 data)
|
||||||
|
|
||||||
|
Metric candidate: **probe throughput as function of uptime, with state-transition markers (first `wsm_generic_confirm 0x0007 failed`, first `[SCAN] Scan failed (-22)`, first NetworkManager-deauth-and-reassociate)**.
|
||||||
|
|
||||||
|
Discriminator question: does throughput collapse abruptly at the first scan failure, or gradually over a window? Abrupt = single-event causation; gradual = accumulator.
|
||||||
|
|
||||||
|
## Phase 4 candidates (post-Phase-3)
|
||||||
|
|
||||||
|
Depending on which mechanism (1-6) Phase 3 surfaces:
|
||||||
|
- (1) firmware resource exhaustion: report to upstream; possibly disable NetworkManager scans pending firmware fix.
|
||||||
|
- (2) NM scan-fail loop: configure `wpa_supplicant` to skip scans; or add scan-failure handling in driver to dampen retry cascade.
|
||||||
|
- (3) AP-side: switch APs for testing; report to AVM if reproducible.
|
||||||
|
- (4) PSM deadlock: extend c7 latch with timeout-or-progress recovery.
|
||||||
|
- (5) SDIO retune: ftrace correlation guides the lock-ordering fix.
|
||||||
|
- (6) PWR busy-event leak: audit set/clear pairs; add a warning-when-stale.
|
||||||
|
|
||||||
|
## Out-of-scope
|
||||||
|
|
||||||
|
- Patch C v3 closure (PR #5 merged, Phase 7 done).
|
||||||
|
- Patch C2 (`ieee80211_rx_list` batch) — gated on Task #19 kerneldoc.
|
||||||
|
- Patch D / E independent.
|
||||||
|
- Reproduction at higher rates (8 MB/s ramp) — defer to Phase 4 once mechanism identified.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Phase 0 plan written 2026-05-07 23:21 CEST by Claude (noether), at the close of Patch C v3 Phase 7. Rig armed; long capture in flight; probes scheduled at T+5/10/15/20/25 min. Post-capture analysis will populate Phase 3 results before Phase 4 plan branches off.*
|
||||||
@@ -0,0 +1,127 @@
|
|||||||
|
# Patch C v3 — Phase 4 Plan: drop sdio_rx_work, match cw1200 architecture
|
||||||
|
|
||||||
|
**Author:** Claude (noether)
|
||||||
|
**Status:** Phase 4 v3 — supersedes v2 (PR #10) after cw1200 mainline survey showed the race-free path is structural, not lock-based.
|
||||||
|
**Decision:** drop the `sdio_rx_work` workqueue entirely; SDIO IRQ wakes `bh_wq`; bh thread does the SDIO read inline. Restores single-writer-from-bh invariant on `hw_bufs_used` *by construction*. No `atomic_t` prep needed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## §0 Why v3 supersedes v2
|
||||||
|
|
||||||
|
PR #10's plan was: convert `hw_bufs_used` etc. to `atomic_t` (prep), then direct-deliver from `sdio_rx_work` (structural). That was a workaround for the race that *only existed because of the relay*.
|
||||||
|
|
||||||
|
The cw1200 mining (`~/src/linux-rockchip`, 228 cw1200 commits) showed the upstream answer: there is no relay. cw1200's IRQ handler bumps `bh_rx` and wakes the bh thread; the bh thread does the SDIO read itself inside `cw1200_bh_rx_helper` (`drivers/net/wireless/st/cw1200/bh.c:233`). Single thread = single writer for `hw_bufs_used` = no race. Same `int hw_bufs_used` as bes2600, never atomic_t'd in 16 years upstream because it never needed to be.
|
||||||
|
|
||||||
|
Patch C v3 brings bes2600 into that shape. The structural simplification is bigger than v2's diff but lands the right architecture in one move.
|
||||||
|
|
||||||
|
## §1 Goal
|
||||||
|
|
||||||
|
Same as Patch C v2 §1: ≥ 1 MB/s sustained receive @ 4 MB/s sender, < 15 % `_raw_spin_unlock_irqrestore` CPU%, no 30-min cascade to link-death. Stretch toward Phase 1's full 2 MB/s once Patch C2 (rx_list batch) lands separately.
|
||||||
|
|
||||||
|
## §2 Situation
|
||||||
|
|
||||||
|
- Cleanups branch is at Patch F merged (commit `b717251`). All Phase 5 reviews of the F series merged via PR #4.
|
||||||
|
- ohm rebooted with F module live (srcversion `A9438692D6A8698F92AEEA1`) — F is the new baseline for Patch C v3 Phase 7 comparison.
|
||||||
|
- Wired path `enu1` at `192.168.88.80` survives bes2600 wedges; lmcp `ohm` still goes through wlan0. Phase 7 telemetry collection over enu1.
|
||||||
|
- Reboot-permission override active (ohm dev-allocated; I can `sudo reboot` directly — `feedback_user_pushes_reboot_button` override clause).
|
||||||
|
|
||||||
|
## §3 Baseline measurements
|
||||||
|
|
||||||
|
Carry forward from `run-20260507-patchC-preflight/baseline.tsv` (N=1, F-less Patch B module):
|
||||||
|
|
||||||
|
| metric | value |
|
||||||
|
|---|---|
|
||||||
|
| observed receive @ 4 MB/s | 1.362 MB/s |
|
||||||
|
| sdio_rx_work dispatches | 86.4/s = 90.3 per 1000 RX packets |
|
||||||
|
| sdio_tx_work dispatches | 276.1/s |
|
||||||
|
| bes2600_bh_work redispatches | 0 (single long-lived) |
|
||||||
|
|
||||||
|
**Phase 6 prereq:** capture an N=3 baseline ON THE F MODULE before Patch C v3 code lands. Same instrumentation, same stress ramp. This is the post-F / pre-v3 reference. Without it, Phase 7's delta is C+F vs B+nothing — confounded.
|
||||||
|
|
||||||
|
## §4 Plan v3
|
||||||
|
|
||||||
|
### §4.1 What gets eliminated
|
||||||
|
|
||||||
|
- **`sdio_rx_work` (bes2600_sdio.c:829)** — function deleted. No longer queued, no longer runs.
|
||||||
|
- **`self->rx_work` work_struct** — field deleted from `struct sbus_priv`. `INIT_WORK` removed.
|
||||||
|
- **`self->rx_queue` + `self->rx_queue_lock`** — fields deleted. `skb_queue_head_init` removed. No SKB ever queued there.
|
||||||
|
- **`bes2600_sdio_pipe_read`** — function deleted. No callers after this patch.
|
||||||
|
- **`sbus_ops->pipe_read`** — sbus op slot deleted (or kept and stubbed; tx_loop.c also implements it for the test-loop bus, has to stay if test-loop is preserved).
|
||||||
|
- **`queue_work(self->sdio_wq, &self->rx_work)`** at the 3 call sites in `bes2600_sdio.c` (lines 416, 941, 1199) — removed.
|
||||||
|
|
||||||
|
### §4.2 What gets added
|
||||||
|
|
||||||
|
- **A new `bes2600_bh_handle_rx_skb()`** in bh.c (same shape as Patch C added, same contract block; no longer needs to also wake the bh thread because we ARE the bh thread).
|
||||||
|
- **A new helper `bes2600_sdio_read_rx_batch()`** in bes2600_sdio.c, exported, that does what `sdio_rx_work` used to do MINUS the queuing: lock → read ctrl_reg → memcpy_fromio → packets_check → for-each-frame extract+deliver. Called from bh.
|
||||||
|
|
||||||
|
### §4.3 What gets rewired
|
||||||
|
|
||||||
|
- **`bes2600_gpio_irq_handler`** in bes2600_sdio.c:413 (the GPIO-IRQ path used when CONFIG_BES2600_USE_GPIO_IRQ is set): drop `queue_work(self->sdio_wq, &self->rx_work)`; instead call `self->irq_handler(self->irq_priv)` directly (which is `bes2600_irq_handler` in bh.c, bumps `bh_rx` + wakes `bh_wq`). Matches cw1200_sdio_irq_handler shape.
|
||||||
|
- **`bes2600_bh_rx_helper`** (bh.c:961, BES_SDIO_RX_MULTIPLE_ENABLE branch): instead of `pipe_read`-ing one SKB from the (now-gone) rx_queue, call the new `bes2600_sdio_read_rx_batch()` which does the SDIO read AND delivers each frame inline via `bes2600_bh_handle_rx_skb()`. Returns count delivered, or negative on error.
|
||||||
|
- **`bes2600_bh()` outer loop**: after a successful rx_batch read, the helper signals whether to continue draining (more frames pending) — same shape as today's `BH_RX_CONT_LIMIT=3` outer loop.
|
||||||
|
- **`bes2600_gpio_wakeup_mcu(SDIO_RX)`** + **`bes2600_gpio_allow_mcu_sleep(SDIO_RX)`** brackets: currently called inside sdio_rx_work. Move into bh thread around the `bes2600_sdio_read_rx_batch()` call. Same wake-flag bracketing, just from a different thread.
|
||||||
|
- **`sdio_wq` workqueue**: keeps `tx_work` and (briefly) `scan_work`. Renamed or kept — cosmetic. Don't touch in this patch.
|
||||||
|
|
||||||
|
### §4.4 What stays untouched
|
||||||
|
|
||||||
|
- TX path (`sdio_tx_work`, `bes2600_bh_tx_helper`, `wsm_alloc_tx_buffer`). Independent.
|
||||||
|
- WSM protocol layer (`wsm.c`, `wsm_handle_rx`). Same callees, just from bh thread now.
|
||||||
|
- mac80211 RX delivery (`ieee80211_rx_irqsafe`). That's Patch C2.
|
||||||
|
- `BES2600_RX_IN_BH` ifdef gate. Stays defined; the gated branch is now the only RX path.
|
||||||
|
- Symptom-shaped artifacts (asm nop, BUG_ON in hot path) — still deferred, see task #24 post-cleanup.
|
||||||
|
|
||||||
|
## §5 Shared-state delta table (the v2 lesson, applied)
|
||||||
|
|
||||||
|
Every field `bes2600_bh_handle_rx_skb` mutates directly or transitively, with the v3 protection:
|
||||||
|
|
||||||
|
| field | written by (today) | written by (after v3) | concurrency | required action |
|
||||||
|
|---|---|---|---|---|
|
||||||
|
| `hw_priv->hw_bufs_used` | bh thread (TX submit + RX confirm), main.c init | **bh thread only** (RX moves into bh) | single-writer | none — `int` is fine, race-free by construction |
|
||||||
|
| `hw_priv->hw_bufs_used_vif[i]` | bh thread (TX vif submit + RX vif confirm), main.c init | **bh thread only** | single-writer | none |
|
||||||
|
| `hw_priv->wsm_rx_seq[i]` | sdio_rx_work today | bh thread | single-writer | none — moves cleanly between contexts |
|
||||||
|
| `hw_priv->wsm_tx_pending[i]` | bh thread (inc on TX submit), bh+sdio_rx_work (dec on RX confirm) | **bh thread only** | single-writer | none |
|
||||||
|
| `hw_priv->lmac_mon_timer` / `mcu_mon_timer` | mod_timer / del_timer_sync from bh + sdio_rx_work | bh thread only | timer API safe anyway | none |
|
||||||
|
| `hw_priv->wsm_cmd.lock` | spinlock taken inside wsm_handle_rx | same | already protected | none |
|
||||||
|
| `priv->bh_evt_wq` wake-up | wsm_release_tx_buffer when count→0 | same | wake_up is concurrency-safe | none |
|
||||||
|
| `bes_pwr.lock` (inside bes2600_pwr_clear_busy_event) | bh thread (today) | bh thread | already protected | none |
|
||||||
|
| `self->rx_data_cnt` etc. (sbus_priv stats) | sdio_rx_work | bh thread | single-writer | none |
|
||||||
|
|
||||||
|
**Zero fields require new locking.** The architectural pivot eliminates the race v2's atomic_t was working around.
|
||||||
|
|
||||||
|
## §6 Risks
|
||||||
|
|
||||||
|
1. **bh thread now holds the SDIO bus mutex during read** (currently held by sdio_rx_work). TX work in the same bh thread is unaffected (sdio_tx_work runs on a separate workqueue and shares the same mutex anyway). The sdio_lock contention pattern doesn't change.
|
||||||
|
2. **Loss of "parallelism" between sdio_rx_work and bh TX**: sdio_rx_work and bh thread *appeared* to run in parallel today, but both serialize through `bes2600_sdio_lock(self)` for the actual bus operations. The parallelism was illusory. Net throughput should not regress.
|
||||||
|
3. **bh thread CPU-busy-time per RX batch increases**: inline SDIO read is the same cost, just charged to bh instead of sdio_wq's worker. Mitigation: the per-IRQ workqueue dispatch cost (~86/s) is what we trade for it. Net: -86 dispatches/s, +0 µs per frame.
|
||||||
|
4. **Multi-RX coalescing (BES_SDIO_RX_MULTIPLE_NUM=16)** stays. bes2600_sdio_extract_packets parses the multi-frame buffer same as before, just inline now. No functional change to chip-side behaviour.
|
||||||
|
5. **GPIO wake-flag bracketing**: `bes2600_gpio_wakeup_mcu(SDIO_RX)` and `bes2600_gpio_allow_mcu_sleep(SDIO_RX)` currently bracket sdio_rx_work. Move them to bracket the new bh-side read. If the wake-flag accounting is sub-system-scoped (it is — flag bits per subsystem), this is a clean move.
|
||||||
|
6. **IRQ re-enable in bh thread**: cw1200's bh re-enables IRQ via `__cw1200_irq_enable(priv, 1)` after each round. bes2600 has the analogous `__bes2600_irq_enable(0/1)` (commented out as the `asm volatile("nop")` symptom in `bh.c:1518-1520`). This patch does NOT re-engage the commented-out re-enable — that's still task #24's call. But if the IRQ stays disabled across rounds, we'd never receive the next IRQ. **Investigate before Phase 6 lands**: where does IRQ re-enable happen in the current bes2600 hot path? The sdio_func IRQ may be auto-managed by sdio core differently. Block Phase 6 on this audit.
|
||||||
|
7. **Phase 7 wedge resilience**: if v3 has a different bug shape than v2's race (which it shouldn't, since the race is gone by construction), the wired path lets us collect telemetry from a wedged ohm.
|
||||||
|
|
||||||
|
## §7 Phase 5 / 6 / 7
|
||||||
|
|
||||||
|
- **Phase 5**: PR on `git.reauktion.de/marfrit/besser` with this artifact. Specifically request reviewer focus on §6 risk #6 (IRQ re-enable mechanism).
|
||||||
|
- **Phase 6**: branch off cleanups (post-F): `bes2600/sdio-rx-no-relay`. Implement the file changes per §4. Build, install, smoke-test.
|
||||||
|
- **Phase 7**:
|
||||||
|
- First: N=3 stress-ramp **on F module** (post-F pre-v3 baseline). 10 min @ 1, 30 min @ 2, 30 min @ 4 MB/s. Use wired path for telemetry.
|
||||||
|
- Then: install v3 module, identical N=3 ramp. Compare deltas.
|
||||||
|
- Predicted: sdio_rx_work dispatch rate → 0/s (was 86/s). observed receive lifts toward ≥ 1.0 MB/s sustained. `_raw_spin_unlock_irqrestore` drops by the rx_queue lock contribution (was 1914/s acquires).
|
||||||
|
|
||||||
|
## §8 What gets dropped from v2 plan
|
||||||
|
|
||||||
|
- atomic_t prep refactor (`hw_bufs_used` → `atomic_t`): not needed. Single-writer invariant preserved structurally. Still a defensible standalone hardening patch *if mainlining bes2600 ever requires defense-in-depth*, but not on the Bug-#5 critical path.
|
||||||
|
- `wsm_tx_pending[]` decrement-decision race (v2 risk #2): also moots. Both sides single-thread under v3.
|
||||||
|
- v2 Phase 7's "C-prep should show zero delta" gate: replaced by "v3 should match cw1200's structural shape" gate.
|
||||||
|
|
||||||
|
## §9 Open question for reviewer
|
||||||
|
|
||||||
|
The big one is §6 risk #6 — IRQ re-enable. cw1200 explicitly does `__cw1200_irq_enable(priv, 1)` from bh after each round; bes2600 has the call **commented out** with an `asm volatile("nop")` placeholder. Either:
|
||||||
|
|
||||||
|
(a) bes2600's SDIO IRQ is level-triggered + auto-acked by SDIO core, so re-enable isn't needed (that would explain the nop).
|
||||||
|
(b) The current code happens to work because sdio_rx_work is queued by the IRQ regardless of whether IRQ is "enabled" by the driver-side flag. After v3 we have to manually re-enable like cw1200 does.
|
||||||
|
|
||||||
|
Need to confirm (a) vs (b) before Phase 6 lands. Plan to grep for `__bes2600_irq_enable` callsites and trace back to whether it's load-bearing.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Plan written 2026-05-07 by Claude (noether), after Patch F merged and Patch C v2 (PR #10) was superseded by the cw1200 architectural mining finding. Phase 5 review on PR. Don't curate.*
|
||||||
@@ -0,0 +1,171 @@
|
|||||||
|
# Patch C2 — Phase 4 Plan: migrate ieee80211_rx_irqsafe → ieee80211_rx_list
|
||||||
|
|
||||||
|
**Author:** Claude (noether)
|
||||||
|
**Status:** Phase 4 — pending Phase 5 PR review before any Phase 6 code.
|
||||||
|
**Predecessor:** Patch C v3 (PR #5 merged, +73% throughput, no-relay architecture); Patch D + E + F + G also landed. Cleanups branch tip = 42fd0ce.
|
||||||
|
**Task #19 contract**: `ieee80211_rx_list` callable from process context, **requires `local_bh_disable()` + `rcu_read_lock()` wrap**, **cannot mix with `ieee80211_rx_irqsafe()` for the same hardware** → all 6 sites convert in one shot.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## §0 Substrate
|
||||||
|
|
||||||
|
After Patch C v3:
|
||||||
|
- bh thread is the sole RX-delivery context (no relay, no sdio_rx_work)
|
||||||
|
- Per-frame work runs in process context (sleepable)
|
||||||
|
- Single-writer-from-bh invariant covers `hw_bufs_used` and friends
|
||||||
|
|
||||||
|
`ieee80211_rx_irqsafe` is currently called from process context. Per kerneldoc (`include/net/mac80211.h:5399-5411`):
|
||||||
|
|
||||||
|
> **Like ieee80211_rx() but can be called in IRQ context** (internally defers to a tasklet.)
|
||||||
|
|
||||||
|
The tasklet hop is the cost we pay today for delivering each RX frame from process context. `ieee80211_rx_list` is the process-context replacement.
|
||||||
|
|
||||||
|
## §1 Goal
|
||||||
|
|
||||||
|
Per-frame: skip the tasklet hop. Batch: process multiple SKBs from one SDIO read inside a single `local_bh_disable()`/`rcu_read_lock()` window.
|
||||||
|
|
||||||
|
Phase 1 metric: **RX throughput @ 4 MB/s sender**, with v3 N=3 baseline = 2.352 MB/s. Hypothesis: small to moderate uplift (<10%) from removing the tasklet deferral. Larger improvement would be surprising — if observed, that's a finding to investigate.
|
||||||
|
|
||||||
|
## §2 Situation
|
||||||
|
|
||||||
|
- 6 call sites in bes2600 currently use `ieee80211_rx_irqsafe`:
|
||||||
|
- `ap.c:96` (AP-mode link-id RX queue drain)
|
||||||
|
- `sta.c:1487` (link-id rx_queue drain in ?)
|
||||||
|
- `txrx.c:1960` (early-data + pm_unsupported branch — Patch E added)
|
||||||
|
- `txrx.c:1967` (early-data + LINK_SOFT-not-set branch)
|
||||||
|
- `txrx.c:1971` (normal RX path)
|
||||||
|
- `wsm.c:2415` (beacon SKB delivery from `bes2600_beacon_handler`?)
|
||||||
|
- All 6 must convert together (kerneldoc: cannot mix per hardware)
|
||||||
|
- bh thread is single-writer post-v3 → `_rx_list`'s "calls must be synchronized" satisfied trivially
|
||||||
|
- bh thread is process context → `_rx_list` callable
|
||||||
|
|
||||||
|
## §3 Baseline (carry forward)
|
||||||
|
|
||||||
|
From `notes/phase7-v3-2026-05-07.md` (v3 N=3 ramp, Phase 7 closed):
|
||||||
|
|
||||||
|
| metric | v3 fresh-chip N=3 |
|
||||||
|
|---|---|
|
||||||
|
| RX throughput @ 4 MB/s | mean 2.352 MB/s, min 2.102, max 2.590 |
|
||||||
|
| sdio_rx_work dispatches | 0/s |
|
||||||
|
| bh_work redispatches | 0 |
|
||||||
|
|
||||||
|
Phase 7 of C2 will compare against this baseline.
|
||||||
|
|
||||||
|
## §4 Plan
|
||||||
|
|
||||||
|
### §4.1 Conversion shape
|
||||||
|
|
||||||
|
Per call site:
|
||||||
|
```c
|
||||||
|
ieee80211_rx_irqsafe(priv->hw, skb);
|
||||||
|
```
|
||||||
|
becomes:
|
||||||
|
```c
|
||||||
|
ieee80211_rx_list(priv->hw, NULL, skb, &priv->rx_list);
|
||||||
|
```
|
||||||
|
|
||||||
|
Where `priv->rx_list` is a `struct list_head` initialized once.
|
||||||
|
|
||||||
|
**Wrap requirement:** `local_bh_disable()` + `rcu_read_lock()` must be held across the call. Per the kerneldoc, that's also needed for batch correctness.
|
||||||
|
|
||||||
|
### §4.2 Wrap placement (the design decision)
|
||||||
|
|
||||||
|
**Option A — per-call wrap.** Wrap each individual `ieee80211_rx_list()` call. Simple but loses the batch benefit (each call's wrap+unwrap costs as much as the avoided tasklet defer).
|
||||||
|
|
||||||
|
**Option B — per-batch wrap.** Wrap the OUTER frame-iteration loop (e.g., the `for` in `bes2600_sdio_extract_packets`). All 16 SKBs from one SDIO read get delivered inside one wrap. This is the upstream-idiomatic pattern (mt76, iwl_pcie do this).
|
||||||
|
|
||||||
|
Choosing **Option B**. Concrete shape:
|
||||||
|
|
||||||
|
- `bes2600_sdio_read_rx_batch` (the per-SDIO-batch entry point added in Patch C v3) wraps the read+extract+deliver phase:
|
||||||
|
```c
|
||||||
|
rcu_read_lock();
|
||||||
|
local_bh_disable();
|
||||||
|
// existing read + extract_packets that calls bh_handle_rx_skb per frame
|
||||||
|
local_bh_enable();
|
||||||
|
rcu_read_unlock();
|
||||||
|
```
|
||||||
|
- Inside `bes2600_bh_handle_rx_skb`, the single `ieee80211_rx_irqsafe` swap becomes `ieee80211_rx_list(priv->hw, NULL, skb, &priv->rx_list)`.
|
||||||
|
- The OTHER 5 call sites (in `ap.c`, `sta.c`, `txrx.c`'s branches, `wsm.c`) need the same treatment, but they're called from the bh thread (post-v3) so they're already in the right context. Each gets its own narrow wrap (Option A applied selectively because those paths process one frame at a time, not a batch).
|
||||||
|
|
||||||
|
### §4.3 The `rx_list` field
|
||||||
|
|
||||||
|
Add `struct list_head rx_list` to either `struct bes2600_common` (driver-wide) or `struct bes2600_vif` (per-vif). Per-vif is cleaner because the existing `priv->hw` parameter implies vif scope.
|
||||||
|
|
||||||
|
`INIT_LIST_HEAD(&priv->rx_list)` at vif setup; no teardown needed (mac80211 owns the SKBs once handed off).
|
||||||
|
|
||||||
|
**Open question for reviewer:** does the `rx_list` need to be drained explicitly after the batch (e.g., via a `list_for_each_entry_safe` + `netif_receive_skb_list_internal`)? Looking at mainline mt76 / iwl_pcie usage will clarify. Phase 6 must answer this before code lands.
|
||||||
|
|
||||||
|
### §4.4 What will NOT be touched
|
||||||
|
|
||||||
|
- The 6 call sites change atomically (all-or-nothing per kerneldoc) — no per-site progressive migration
|
||||||
|
- `wsm.c:2415` beacon path: same conversion shape, but beacon delivery is once-per-beacon-interval (not hot path); could stay `_irqsafe` if upstream allows mixing per-SKB-type. Re-read kerneldoc carefully — it says "per hardware", not per-call-site, so we can't keep _irqsafe even on the slow paths.
|
||||||
|
- bh thread structure (Patch C v3 stands)
|
||||||
|
- atomic_t counters from Patch D
|
||||||
|
- `pm_unsupported` lock-skip from Patch E
|
||||||
|
- mac80211 batch-delivery semantics (mainline owns this; we just call the API)
|
||||||
|
|
||||||
|
### §4.5 Predicted delta in Phase 3 units
|
||||||
|
|
||||||
|
| metric | predicted |
|
||||||
|
|---|---|
|
||||||
|
| `rx_irqsafe` tasklet schedule rate | → 0 (function no longer called) |
|
||||||
|
| RX throughput @ 4 MB/s sustained | 2.352 → +5-15% (medium confidence) |
|
||||||
|
| `_raw_spin_unlock_irqrestore` CPU% | small drop (no tasklet schedule lock contribution) |
|
||||||
|
|
||||||
|
**Honest acknowledgment:** I don't have data on how much the tasklet hop actually costs. The improvement might be smaller than predicted if tasklet defer was already cheap on this kernel. If <2%, Phase 7 says "marginal but no regression" and we ship anyway for upstream-cleanliness.
|
||||||
|
|
||||||
|
### §4.6 Risks
|
||||||
|
|
||||||
|
1. **`ieee80211_rx_list` semantics surprise.** mainline drivers I have access to (mt76, iwl_pcie) use this via NAPI infrastructure. bes2600 doesn't have NAPI; we're doing process-context-direct. The kerneldoc says callable that way but we should verify a few mainline drivers actually do it. **Phase 6 contract-cite from at least one upstream caller** before code lands.
|
||||||
|
|
||||||
|
2. **`rx_list` lifetime in cross-batch / cross-vif scenarios.** Multiple vifs (P2P_MULTIVIF=y in Makefile) might race on the same hw's `rx_list`. The kerneldoc says "for a single hardware" — the list is per-call destination, which means each call appends to its argument list. Per-vif `rx_list` per-call is the natural shape. No per-hw aggregator needed.
|
||||||
|
|
||||||
|
3. **`local_bh_disable` cost in batch wrap.** Not free. If the batch is small (1-2 SKBs), the wrap might dominate. Estimated breakeven: 2-3 SKBs per wrap. Phase 7 should look at SKB-per-batch distribution to confirm.
|
||||||
|
|
||||||
|
4. **`rcu_read_lock` across SDIO read.** SDIO read can take multi-ms (multi-block transfers). RCU reader-cs across that is fine (no preemption blocked) but it's a longer reader-cs than typical. Verifiable but not a blocker — kerneldoc requires it.
|
||||||
|
|
||||||
|
5. **wsm.c:2415 (beacon) is a different SKB lifecycle** — `hw_priv->beacon` is owned by hw_priv, not allocated per-call. After `_rx_list` consumes it (by passing ownership to mac80211), `hw_priv->beacon` is dangling. **Phase 6 must verify the beacon path either reallocates after delivery or wasn't actually transferring ownership.** Risk #5 is the biggest open question.
|
||||||
|
|
||||||
|
### §4.7 Phase 5 review handover
|
||||||
|
|
||||||
|
PR on `git.reauktion.de/marfrit/besser` with this artifact. Specifically request reviewer focus on:
|
||||||
|
- §4.2 wrap-placement choice (Option B vs A)
|
||||||
|
- §4.3 rx_list scoping (per-vif)
|
||||||
|
- §4.6 risks #1 (mainline-caller verification) and #5 (beacon path SKB ownership)
|
||||||
|
|
||||||
|
Don't curate.
|
||||||
|
|
||||||
|
### §4.8 Phase 6 implementation order
|
||||||
|
|
||||||
|
1. Branch off cleanups: `bes2600/rx-list-batch-delivery`
|
||||||
|
2. Add `struct list_head rx_list` to `struct bes2600_vif`, `INIT_LIST_HEAD` in vif setup
|
||||||
|
3. Convert all 6 call sites: `ieee80211_rx_irqsafe(...)` → `ieee80211_rx_list(...)`
|
||||||
|
4. Wrap `bes2600_sdio_read_rx_batch` outer loop with `rcu_read_lock + local_bh_disable / local_bh_enable + rcu_read_unlock`
|
||||||
|
5. For the non-bh-thread call sites (ap.c, sta.c, wsm.c beacon): per-call narrow wrap
|
||||||
|
6. Verify beacon path in wsm.c:2415 (Risk #5)
|
||||||
|
7. Build, install, smoke-test
|
||||||
|
8. Phase 7 N=3 stress ramp — compare to v3 baseline
|
||||||
|
|
||||||
|
### §4.9 Phase 7 protocol (per `feedback_phase7_stress_ramp`)
|
||||||
|
|
||||||
|
- N=3 reps, 30s each at 4 MB/s, fresh-chip (uptime <15 min)
|
||||||
|
- Use wired path (`ssh mfritsche@192.168.88.80`) for telemetry
|
||||||
|
- Fresh nc listener per rep (per `feedback_rig_failure_is_finding`)
|
||||||
|
- Compare: throughput delta + tasklet schedule rate (ftrace `irq:tasklet_*` events)
|
||||||
|
- If predicted delta met → close C2 + memory entry
|
||||||
|
- If NO delta → marginal patch but no regression; ship for upstream-cleanliness
|
||||||
|
|
||||||
|
## §5 Out of scope
|
||||||
|
|
||||||
|
- Patch D / E already shipped (PR #7, #8 merged)
|
||||||
|
- Patch G already shipped (PR #6 merged)
|
||||||
|
- bh.c `#if 0` graveyard removal (Task #24 hygiene)
|
||||||
|
- Allwinner `sw_mci_check_r1_ready` (Task #25)
|
||||||
|
|
||||||
|
## §6 Summary
|
||||||
|
|
||||||
|
C2 is a 6-site mechanical migration with ONE design decision (per-batch wrap), TWO open questions for the reviewer (rx_list draining + beacon path SKB ownership), and SMALL expected throughput delta (<15%). Risk-low, upstream-prep-high. Worth shipping for the kernel.org submission story even if the throughput delta is marginal.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Plan written 2026-05-08 by Claude (noether). Phase 5 review on PR. Phase 6 contingent on review passing.*
|
||||||
@@ -0,0 +1,63 @@
|
|||||||
|
# Patch C2 Phase 7 — N=3 ramp results
|
||||||
|
|
||||||
|
**Date:** 2026-05-08
|
||||||
|
**Module:** `bes2600.ko` srcversion `619A51E61BF5479AAC146E6` (cleanups + F + G + D + E + C2)
|
||||||
|
**Rig:** ohm fresh boot, wired enu1 path for control, wlan0 for data probes
|
||||||
|
**Stress:** netcat sender, `pv -L 4m`, 30 s per rep
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Results table
|
||||||
|
|
||||||
|
| rep | uptime (s) | rate (MB/s) |
|
||||||
|
|---:|---:|---:|
|
||||||
|
| 1 | 544 | **2.289** |
|
||||||
|
| 2 | 716 | **2.165** |
|
||||||
|
| 3 | 750 | **2.376** |
|
||||||
|
|
||||||
|
**N=3:** mean 2.277, median 2.289, min 2.165, max 2.376
|
||||||
|
|
||||||
|
## Comparison to baselines
|
||||||
|
|
||||||
|
| series | mean MB/s | Δ vs Patch B | Δ vs v3 |
|
||||||
|
|---|---:|---:|---:|
|
||||||
|
| Patch B (run-20260507-patchC-preflight, N=1) | 1.362 | — | -42% |
|
||||||
|
| Patch C v3 N=3 (run-20260507-N3v3-rep*) | 2.352 | +73% | — |
|
||||||
|
| Patch C v3 + F + G + D + E + C2 N=3 (this rep set) | 2.277 | +67% | -3% |
|
||||||
|
|
||||||
|
Δ vs v3 is **within rep variance** (v3 N=3 had min 2.102, max 2.590 → spread ±20%; this set's spread is similar). Statistically indistinguishable.
|
||||||
|
|
||||||
|
## Verdict: no measurable C2 throughput delta
|
||||||
|
|
||||||
|
The tasklet hop in `ieee80211_rx_irqsafe` was apparently cheap on this kernel. Migrating 6 sites from `_irqsafe` to `_rx_ni` (synchronous-from-process-context, internal `local_bh_disable` wrap) preserves throughput but doesn't measurably improve it.
|
||||||
|
|
||||||
|
**This was a predicted outcome.** The C2 Phase 4 plan §4.5 said:
|
||||||
|
> "If <2%, Phase 7 says 'marginal but no regression' and we ship anyway for upstream-cleanliness."
|
||||||
|
|
||||||
|
Observed: -3% (within noise) → falls into the "marginal but no regression" bucket. Ship for the kernel.org submission story (no `_irqsafe` from process context = upstream-idiomatic) even though performance is unchanged.
|
||||||
|
|
||||||
|
## Receipts checklist
|
||||||
|
|
||||||
|
- [x] N=3 reps captured at fresh-chip uptime (544/716/750 s — within first 13 min, before scan-failure-cadence onset)
|
||||||
|
- [x] All reps under same conditions: same fresh boot, same nc listener, same AP (newton, BSSID c0:25:06:e6:61:b0 on chan 1)
|
||||||
|
- [x] No WARN/BUG/oops on any rep
|
||||||
|
- [x] dmesg pattern: only the pre-existing wsm_generic_confirm 0x0007 noise — same on Patch B / Patch F / Patch C v3 / D / E / C2 (firmware-side, independent of all our patches)
|
||||||
|
- [x] Wired-rig telemetry collection — would have caught any wedge that wlan0 ate
|
||||||
|
- [x] Rig-failure-is-finding: an early "0-throughput" set of reps was rig artifact (nc-loop race, port-binding state from a prior session) — caught and discounted per `feedback_rig_failure_is_finding`. The recovered N=3 reps used setsid-detached listener + post-reboot fresh state.
|
||||||
|
|
||||||
|
## Phase 8 lesson
|
||||||
|
|
||||||
|
**Drop-in replacements with the right kerneldoc reading still need Phase 7 measurement.** I expected +5-15% from removing the tasklet schedule. Got -3% (noise). The cost we were saving was already amortised by something else (NAPI infra? per-CPU softirq scheduling?). The kerneldoc-correctness story stands; the perf story does not.
|
||||||
|
|
||||||
|
**Memory entry:** the perf-vs-correctness distinction is worth keeping. `_irqsafe → _rx_ni` is a CORRECTNESS / API-cleanliness move, not a performance optimization. Don't oversell predicted deltas without baseline measurement.
|
||||||
|
|
||||||
|
## Out-of-scope follow-ups
|
||||||
|
|
||||||
|
- Patch C v3 architectural win is the durable +73%. C / D / E / C2 / F / G are smaller cleanups that don't compound visibly.
|
||||||
|
- Bug #5 RX-degradation campaign already closed (hypothesis falsified).
|
||||||
|
- Task #24 (post-cleanup observation of bh.c symptom-shaped artifacts): mostly answered.
|
||||||
|
- Task #25 (Allwinner sw_mci_check_r1_ready measurement): can be done during any future stress run; not on critical path.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Phase 7 captured 2026-05-08 by Claude (noether). Patch C2 closes the post-Bug-#5 cleanup track. Throughput ceiling on this hardware = ~2.4 MB/s sustained @ 4 MB/s sender, fresh chip; further improvement would need firmware-side fixes (the wsm_generic_confirm 0x0007 path), not driver-side.*
|
||||||
@@ -0,0 +1,94 @@
|
|||||||
|
# Patch C v3 Phase 7 — N=3 verification results
|
||||||
|
|
||||||
|
**Date:** 2026-05-07
|
||||||
|
**Module:** `bes2600.ko` srcversion `371C6606B73AF19299228CA` (cleanups+F+v3)
|
||||||
|
**Rig:** ohm (PineTab2, RK3566 + BES2600 SDIO), wired enu1 path for telemetry
|
||||||
|
**Stress:** netcat sender from boltzmann, `pv -L 4m` rate cap (4 MB/s), 3-min window per rep
|
||||||
|
**Boot:** fresh — uptime 200 s / 391 s / 582 s at rep 1/2/3 starts (all within fresh-chip window before the ~13-min Bug #5 RX-degradation point)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Results table
|
||||||
|
|
||||||
|
| rep | elapsed (s) | RX bytes | RX MB | MB/s | sdio_rx_work | sdio_tx_work | bes2600_bh_work redispatches |
|
||||||
|
|---:|---:|---:|---:|---:|---:|---:|---:|
|
||||||
|
| 1 | 180.72 | 447,758,333 | 427.0 | **2.363** | 0 | 368 | 0 |
|
||||||
|
| 2 | 180.67 | 490,669,836 | 467.9 | **2.590** | 0 | 20 | 0 |
|
||||||
|
| 3 | 180.69 | 398,224,992 | 379.8 | **2.102** | 0 | 39 | 0 |
|
||||||
|
|
||||||
|
**N=3 stats:** mean 2.352 MB/s · median 2.363 MB/s · min 2.102 MB/s · max 2.590 MB/s
|
||||||
|
|
||||||
|
## Comparison to baselines
|
||||||
|
|
||||||
|
### vs Patch B baseline (`run-20260507-patchC-preflight`, N=1, 5 min @ 4 MB/s, fresh chip)
|
||||||
|
|
||||||
|
| | Patch B | v3 mean | Δ |
|
||||||
|
|---|---:|---:|---:|
|
||||||
|
| throughput | 1.362 MB/s | 2.352 MB/s | **+73%** |
|
||||||
|
|
||||||
|
### vs original Bug #5 baseline (`run-20260506-0659-fresh`, N=3, decay over time)
|
||||||
|
|
||||||
|
Bug #5 anchor was 725 / 663 / **75** KB/s — rep 3 saw link-death at ~9 min.
|
||||||
|
|
||||||
|
| | Bug #5 floor (rep 3) | v3 floor (rep 3) | Δ |
|
||||||
|
|---|---:|---:|---:|
|
||||||
|
| throughput | 0.075 MB/s | 2.102 MB/s | **28× improvement** |
|
||||||
|
|
||||||
|
### vs Phase 4 v3 plan §4.5 predictions
|
||||||
|
|
||||||
|
| metric | predicted | observed | verdict |
|
||||||
|
|---|---|---|---|
|
||||||
|
| sdio_rx_work dispatch rate | → 0/s (high confidence) | 0/s all 3 reps | ✅ |
|
||||||
|
| `bes2600_bh_work` redispatches | → 0 (high confidence) | 0 all 3 reps | ✅ |
|
||||||
|
| observed RX @ 4 MB/s | floor lifts toward ≥ 1 MB/s sustained (medium) | 2.10 MB/s floor | ✅ exceeds prediction |
|
||||||
|
| `_raw_spin_unlock_irqrestore` CPU% | 20% → 12-15% (medium) | not measured | deferred — perf-record run can confirm |
|
||||||
|
|
||||||
|
## Workqueue dispatch rate collapse
|
||||||
|
|
||||||
|
Patch B baseline (per `run-20260507-patchC-preflight`):
|
||||||
|
- sdio_rx_work: 86.4/s
|
||||||
|
- sdio_tx_work: 276.1/s
|
||||||
|
- bes2600_bh_work redispatches: 0
|
||||||
|
|
||||||
|
v3 N=3 mean:
|
||||||
|
- **sdio_rx_work: 0.0/s** (function deleted)
|
||||||
|
- **sdio_tx_work: 0.8/s** (post-tx queue_work → self->irq_handler call; the chip-side TX driver no longer needs to wake a separate workqueue)
|
||||||
|
- bes2600_bh_work redispatches: 0 (preserved invariant; bh thread still single long-lived work item)
|
||||||
|
|
||||||
|
The 99.7% reduction in `sdio_tx_work` dispatch rate is a side-effect of v3's IRQ→bh-direct rewiring: the post-TX `queue_work(self->sdio_wq, &self->rx_work)` call I replaced with `self->irq_handler()` was actually firing more often than I'd assumed (276/s on Patch B). Folding it into the bh wake-up cuts 275/s of workqueue dispatches that weren't doing anything useful.
|
||||||
|
|
||||||
|
## Risks observed
|
||||||
|
|
||||||
|
- **Bug #5 RX-degradation after ~13-min uptime is independent of v3.** Same scan-failure pattern observed (`wsm_generic_confirm failed for request 0x0007` + `[SCAN] Scan failed (-22)` every 300s) on v3 as on Patch B. v3 did NOT fix Bug #5; it fixed the v2-race that was ALSO present. RX-degradation is firmware-side, likely needs a separate campaign.
|
||||||
|
- **N=3 reps were 3 minutes each instead of 5** to fit within the fresh-chip window. Direct comparison with Patch B's 5-min baseline is approximate; chip-side throughput in 3-min vs 5-min should be similar given the bug fires on uptime, not on transferred-bytes.
|
||||||
|
- **No regression observed in 3×3 min = 9 min of stress.** The v2 race that wedged Patch C v1 within 13 s did NOT reproduce. v3's structural fix held.
|
||||||
|
|
||||||
|
## Phase 8 — lesson distilled
|
||||||
|
|
||||||
|
**The cw1200 mining was decisive.** Patch C v2 (atomic_t prep + direct-deliver on top of relay, PR #10 closed) would have worked correctly but kept the structural relay that was the source of the race. v3 removed the relay entirely — restoring single-writer-from-bh invariant by construction, no atomic_t needed, and delivering a 73% throughput improvement as side benefit.
|
||||||
|
|
||||||
|
Without the cw1200 history mine (`~/src/linux-rockchip`, 228 cw1200 commits over 16 years), v2's atomic_t prep would have shipped. The structural fix is upstream-grade because it matches the reference driver. v2's atomic_t wrapper would have been bes2600-specific bookkeeping with no upstream parallel — defensible as a fix, but worse to maintain.
|
||||||
|
|
||||||
|
**Memory entry:** *When you have an upstream-ancestral driver still in the kernel tree, mine its bug-fix history before patching the inherited fork. The architectural answer may already be there; you just have to look.*
|
||||||
|
|
||||||
|
## Receipts checklist (Phase 7 done)
|
||||||
|
|
||||||
|
- [x] N=3 reps captured at fresh-chip uptime (200/391/582 s)
|
||||||
|
- [x] Same instrumentation pre/post (workqueue ftrace + rx_packets/rx_bytes counters)
|
||||||
|
- [x] Predicted delta matched (sdio_rx_work → 0; bh redispatches → 0; throughput ≥ 1 MB/s sustained)
|
||||||
|
- [x] No WARN/BUG/oops during stress on any rep
|
||||||
|
- [x] Wired-rig telemetry collection (would have caught a wedge if v3 had one)
|
||||||
|
- [x] Receiver `nc` listener restarted fresh per rep (avoiding rep-2-style TCP race)
|
||||||
|
- [x] Stress-ramp memory honored: not steady-state low-rate; saw 4 MB/s saturate
|
||||||
|
|
||||||
|
## Out-of-scope follow-ups
|
||||||
|
|
||||||
|
- Patch C2 — `ieee80211_rx_list` batch delivery — gated on Task #19 kerneldoc verification.
|
||||||
|
- Patch D — ba_lock atomicization — independent.
|
||||||
|
- Patch E — ps_state_lock skip when pm_unsupported — independent.
|
||||||
|
- Bug #5 RX-degradation after 13-min uptime — separate campaign, scan-failure pattern is the entry point.
|
||||||
|
- Task #24 — observe whether `bh.c` `asm volatile("nop")` / commented-out `__bes2600_irq_enable(1)` / BUG_ON in hot path are still load-bearing post-v3. Already partially answered: `__bes2600_irq_enable` is a stub (PR #11 comment). The other artifacts can be re-read fresh.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*Phase 7 results captured 2026-05-07 by Claude (noether). v3 (PR #5) closes Patch C campaign with structural improvement + race fix + measurable throughput win.*
|
||||||
Reference in New Issue
Block a user