notes: Bug #5 RX-degradation campaign — Phase 0 plan + research question

After Patch C v3 closed (PR #5 merged, Phase 7 N=3 verified at +73% throughput vs Patch B baseline), the post-13-min RX-degradation pattern remains. Reproduces on Patch B, F, and v3 alike, independent of the relay/race issues v3 addressed. Side-effect that was masked by the throughput floor while v2's race was the dominant variable.

Research question (locked): Why does the bes2600 RX path collapse from ~2 MB/s sustained @ fresh-chip uptime to ~180 B/s @ ~28-min uptime, with periodic wsm_generic_confirm failed for request 0x0007 + ieee80211 phy0: [SCAN] Scan failed (-22) every 300 s in the intervening window?

Phase 0 protocol:
- long-capture rig armed on ohm at uptime 0 (fresh boot 23:13 CEST)
- ftrace events: workqueue, mac80211, cfg80211, mmc, sdhci, power
- iw event (cfg80211 reason codes), dmesg follow, per-30s netdev counter snap, 5 stress probes at T+5/10/15/20/25 min

Phase 0 will:
- re-anchor the predecessor data via the long capture (in-session N=1; re-run if anomalous)
- characterize state transitions (first scan-fail, first throughput drop) via cfg80211/mac80211 ftrace + iw event correlation
- feed Phase 1 metric formulation

Mechanism candidates (Phase 4 will discriminate):
1. Firmware-side resource exhaustion (per-scan accumulator)
2. NetworkManager scan-fail recovery loop competing with data
3. AP-side rate limiting / fairness probation
4. PSM state machine deadlock (c7 latch stale)
5. SDIO bus retune interaction
6. Power-management busy-event accumulator leak

Out of scope: Patch C2/D/E, higher-rate ramp, reproducing on different APs. Independent campaign from Patch C closure.
Merge pull request 'notes: Patch C v3 Phase 4 plan — drop sdio_rx_work, match cw1200' (#11) from claude-noether-9 into main
2026-05-07 23:23:31 +02:00 · 2026-05-07 19:41:44 +00:00 · 2026-05-07 21:36:15 +02:00 · 2026-05-07 18:56:12 +00:00 · 2026-05-07 20:50:39 +02:00 · 2026-05-07 17:21:37 +00:00
3 changed files with 371 additions and 0 deletions
@@ -0,0 +1,108 @@
+# Bug #5 RX-degradation campaign — Phase 0
+
+**Date:** 2026-05-07
+**Module under test:** v3 + F (`bes2600.ko` srcversion `371C6606B73AF19299228CA`)
+**Hardware:** ohm (PineTab2, RK3566 + BES2600 SDIO), wired enu1 fallback path live.
+
+---
+
+## Research question (locked)
+
+> **Why does the bes2600 RX path collapse from ~2 MB/s sustained @ fresh-chip uptime to ~180 B/s @ ~28-min uptime, with periodic `wsm_generic_confirm failed for request 0x0007` + `ieee80211 phy0: [SCAN] Scan failed (-22)` every 300 s in the intervening window?**
+
+The pattern reproduces on Patch B, Patch F, and Patch C v3 alike, so it is independent of the relay/race issues v3 addressed. It is a side effect that was masked by the throughput floor while v2's race was the dominant variable.
+
+## Predecessor data (reference, not anchor)
+
+| source | observation |
+|---|---|
+| Patch C v3 N=3 (uptime 200/391/582 s) | mean 2.352 MB/s @ 4 MB/s sender |
+| v3 single rep at uptime ~28 min (rep 2 of 2026-05-07 22:23) | 180 KB / 5 min = 600 B/s, sender saw "Connection reset by peer" |
+| v3 single rep at uptime ~47 min (N=3 first attempt 22:42) | 55 KB / 5 min = 180 B/s, sender timed out (exit 124) |
+| dmesg pattern observed at 47-min uptime | scan failures every 301-302 s starting at uptime 778 s (~13 min) |
+
+The shape: **fresh chip → linear data flow at ~2 MB/s sustained → sometime around 13 min uptime, NetworkManager-triggered scans start failing → sometime around 28 min uptime, data throughput collapses to <1 KB/s while link still shows associated.**
+
+Predecessor data is reference. Phase 0 will re-anchor at N=1 long-trace + 5 in-window stress probes; if the pattern doesn't reproduce, that's the campaign result.
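The per-row rates in the table follow from dividing the bytes received by the 5-minute window. A minimal sketch of that arithmetic, assuming decimal kilobytes (1 KB = 1000 bytes), which is what the 600 B/s figure implies; the function name is illustrative and not part of the rig:

```python
def bytes_per_second(kilobytes: float, minutes: float) -> float:
    """Average receive rate in B/s, assuming 1 KB = 1000 bytes."""
    return kilobytes * 1000 / (minutes * 60)

rate_28min = bytes_per_second(180, 5)   # single rep at ~28 min uptime
rate_47min = bytes_per_second(55, 5)    # single rep at ~47 min uptime
print(f"{rate_28min:.0f} B/s, {rate_47min:.0f} B/s")
```

The second row works out to about 183 B/s, which the table rounds to 180 B/s.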
+
+## Mechanism candidates (Phase 4 will discriminate)
+
+1. **Firmware-side resource exhaustion.** Per-scan or per-WSM-event accumulation in chip-side state. Scan-failed -22 (EINVAL) suggests firmware refusing the request — possibly out of scan handles, scan-buffer slots, or some other limit.
+2. **NetworkManager scan-fail recovery loop.** Each failed scan triggers NM retry. If retry overhead dominates the bh thread, data path starves. Verifiable by suppressing NM scans.
+3. **AP-side rate limiting.** Newton (AVM) AP could be applying QoS / fairness / probation after sustained 4 MB/s burst. Verifiable by Fritz!Box log access (Markus has it) or by switching to a different AP.
+4. **PSM state machine deadlock.** c7's `pm_unsupported` self-detect was supposed to handle this, but the latch state could become stale if a real PM_IND arrives mid-operation. Verifiable by `chip_pm_state` debugfs read at degradation onset.
+5. **SDIO bus clock degradation / mmc retune.** SDIO retune with `retune_protected` flag interacts with bes2600's data path. Verifiable by ftrace `mmc/mmc_request_*` event correlation with throughput drop.
+6. **Power-management busy-event accumulation.** `bes2600_pwr_set_busy_event` counters might leak — busy events not cleared lock the chip awake (no PSM) but also exhaust event capacity. Verifiable by `bes2600_pwr_busy_event_record` dump.
+
+## Phase 0 measurement protocol (rig armed 2026-05-07 23:18:58 CEST, T0=1778188738)
+
+Capturing for 35 minutes from fresh boot. All capture output lives in `/root/bes2600-samples/run-20260507-bug5-degradation-rig/` on ohm.
+
+### Always-on streams
+
+| stream | tool | output |
+|---|---|---|
+| ftrace events | per-event `enable=1` | `trace.log` (via `trace_pipe`) |
+| cfg80211 events | `iw event -t -f` | `iw-event.log` |
+| kernel printks | `dmesg -wT` | `dmesg.log` |
+| netdev counters | per-30s shell loop | `snap.log` |
+
+### ftrace event set
+
+- `workqueue/workqueue_execute_start` — work dispatches
+- `workqueue/workqueue_queue_work` — work submissions
+- `mac80211/api_beacon_loss` — driver beacon-loss events
+- `mac80211/api_connection_loss` — driver-side conn-loss
+- `mac80211/api_disconnect` — driver-side disconnect
+- `mac80211/drv_hw_scan` — mac80211 → driver scan dispatch
+- `mac80211/drv_set_key` — key state changes
+- `cfg80211/rdev_assoc` — assoc requests
+- `cfg80211/rdev_deauth` — deauth requests
+- `cfg80211/rdev_disassoc` — disassoc requests
+- `cfg80211/cfg80211_assoc_comeback` — AP-side assoc-busy throttling
+- `cfg80211/cfg80211_send_auth_timeout` — auth timeouts
+- `cfg80211/cfg80211_scan_done` — scan completions
+- `power/suspend_resume` — PM transitions
+- `mmc/mmc_request_start` / `mmc_request_done` — bus-level transactions
+
+### Scheduled stress probes
+
+Sender on boltzmann (`/tmp/bug5-probe-loop.sh`) fires `pv -L 4m | nc ohm 12345` for 30 s at T+5/10/15/20/25 min. Each probe brackets uptime, RX-bytes pre, RX-bytes post, elapsed. Throughput-vs-uptime curve falls out of the snap.log + probe boundaries.
+
+Probe markers logged via `logger -t bes2600-bug5 PROBE_N_START/END` so they appear in dmesg.log timeline.
+
+## Anti-theatre receipts (must tick before claiming Phase 0 done)
+
+- [ ] In-session baseline: long-capture across degradation window, N=1 for now; re-run if anomalous
+- [ ] ftrace events actually firing (verify by tail of trace.log mid-capture)
+- [ ] dmesg captures the scan-failure pattern timestamp (expected ~uptime 778 s)
+- [ ] Probes actually transferred data at fresh chip (T+5 should be > 1 MB/s)
+- [ ] At least one probe in-window after scan-failure onset (expected: T+15 or T+20)
+- [ ] Snap.log shows monotonic counter behaviour (no rx_bytes going backwards)
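Two of the receipts above (fresh-chip throughput and monotonic counters) reduce to simple checks on the 30-second counter snapshots. A hedged sketch follows; the `(uptime_s, rx_bytes)` tuple layout and the sample values are assumptions for illustration, since the actual `snap.log` format is not specified here:

```python
from typing import List, Tuple

Snapshot = Tuple[float, int]  # (uptime in seconds, cumulative rx_bytes); assumed layout

def is_monotonic(snaps: List[Snapshot]) -> bool:
    """True if rx_bytes never decreases between consecutive snapshots."""
    return all(b[1] >= a[1] for a, b in zip(snaps, snaps[1:]))

def interval_rates(snaps: List[Snapshot]) -> List[Tuple[float, float]]:
    """Per-interval throughput as (interval end uptime, bytes per second)."""
    return [
        (b[0], (b[1] - a[1]) / (b[0] - a[0]))
        for a, b in zip(snaps, snaps[1:])
        if b[0] > a[0]
    ]

# Synthetic snapshots (not measured data): 30 s at 2 MB/s, then a near-idle interval.
sample = [(300.0, 1_000_000), (330.0, 61_000_000), (360.0, 61_003_000)]
rates = interval_rates(sample)
```

A counter that goes backwards would more likely indicate an interface reset or a wrapped counter than real traffic, so a failed `is_monotonic` check is worth resolving before trusting any rate derived from the same log.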
+
+## Phase 1 hypothesis (provisional, refine after Phase 3 data)
+
+Metric candidate: **probe throughput as function of uptime, with state-transition markers (first `wsm_generic_confirm 0x0007 failed`, first `[SCAN] Scan failed (-22)`, first NetworkManager-deauth-and-reassociate)**.
+
+Discriminator question: does throughput collapse abruptly at the first scan failure, or gradually over a window? Abrupt = single-event causation; gradual = accumulator.
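One way to make the abrupt-versus-gradual question operational is to ask how much of the total drop happens in a single probe-to-probe step. The sketch below is an illustration only; the 0.8 threshold and the function name are assumptions, not part of the Phase 1 metric:

```python
from typing import List

def classify_collapse(throughputs: List[float], abrupt_share: float = 0.8) -> str:
    """Classify a probe series as 'abrupt', 'gradual', or 'no-collapse'.

    throughputs: per-probe throughput in uptime order (e.g. T+5 .. T+25).
    abrupt_share: assumed fraction of the total drop that one step must
                  cover for the collapse to count as abrupt.
    """
    total_drop = throughputs[0] - min(throughputs)
    if total_drop <= 0:
        return "no-collapse"
    largest_step = max(a - b for a, b in zip(throughputs, throughputs[1:]))
    return "abrupt" if largest_step / total_drop >= abrupt_share else "gradual"
```

With probes five minutes apart, a collapse that completes inside one gap cannot be told apart from a true step, so the 30-second counter snapshots would be needed for finer resolution.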
+
+## Phase 4 candidates (post-Phase-3)
+
+Depending on which mechanism (1-6) Phase 3 surfaces:
+- (1) firmware resource exhaustion: report to upstream; possibly disable NetworkManager scans pending firmware fix.
+- (2) NM scan-fail loop: configure `wpa_supplicant` to skip scans; or add scan-failure handling in driver to dampen retry cascade.
+- (3) AP-side: switch APs for testing; report to AVM if reproducible.
+- (4) PSM deadlock: extend c7 latch with timeout-or-progress recovery.
+- (5) SDIO retune: ftrace correlation guides the lock-ordering fix.
+- (6) PWR busy-event leak: audit set/clear pairs; add a warning-when-stale.
+
+## Out-of-scope
+
+- Patch C v3 closure (PR #5 merged, Phase 7 done).
+- Patch C2 (`ieee80211_rx_list` batch) — gated on Task #19 kerneldoc.
+- Patch D / E independent.
+- Reproduction at higher rates (8 MB/s ramp) — defer to Phase 4 once mechanism identified.
+
+---
+
+*Phase 0 plan written 2026-05-07 23:21 CEST by Claude (noether), at the close of Patch C v3 Phase 7.  Rig armed; long capture in flight; probes scheduled at T+5/10/15/20/25 min.  Post-capture analysis will populate Phase 3 results before Phase 4 plan branches off.*
@@ -0,0 +1,136 @@
+# Patch C v2 — Phase 4 Plan: atomic_t prep + direct-deliver
+
+**Author:** Claude (noether)
+**Status:** Phase 4 v2 — Phase 7 of Patch C (notes/patch-c-phase4-plan-2026-05-07.md, PR #9 merged) failed with a thread-safety race; this is the redesign.
+**Decision:** Option B from PR #3 close-out comment — `atomic_t` prep refactor first, direct-deliver on top.
+
+---
+
+## §0 What just happened (Phase 7 of Patch C)
+
+Reproduced verbatim from boot -1 of ohm 2026-05-07 20:18:10 CEST, ~13 s into a 4 MB/s nc stress:
+
+```
+WARNING: at wsm_release_tx_buffer+0x84/0xa0 [bes2600], CPU#0: kworker/0:3H/3912
+Workqueue: bes_sdio sdio_rx_work [bes2600]
+pc : wsm_release_tx_buffer+0x84/0xa0 [bes2600]
+lr : bes2600_bh_handle_rx_skb+0x134/0x370 [bes2600]
+sdio_rx_work+0x2a8/0x540 [bes2600]
+bes2600_wlan: wsm_release_tx_buffer failed: -1
+```
+
+Storm continued; chip wedged; ohm fell off the WiFi (wlan0).  Patch C module preserved at `/var/tmp/bes2600.patchC-broken.ko` for forensics.  Patch B rolled back, currently on disk on ohm.  Lesson saved as `feedback_phase6_contract_threadsafety` memory.
+
+## §1 Why it failed
+
+`wsm_release_tx_buffer()` (bh.c:222–243) does **unlocked** read–modify–write on `hw_priv->hw_bufs_used`.  Pre-Patch-C invariant was single-writer = BH thread; the lock that mattered was structural, not annotated.  Patch C's direct-deliver moved one writer (RX-confirm decrement) into `sdio_rx_work` workqueue context.  BH thread + sdio_rx_work race on the int counter; underflow below zero, WARN, return -1, bookkeeping corrupt, TX wedges.
+
+Phase 6 contract block correctly cited `wsm_handle_rx`'s sleepability and held-lock invariants — but stopped at the called function's signature.  It did not enumerate `hw_bufs_used` as shared state mutated by the callee.  That's the gap.
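The failure mode can be reproduced on paper with a deterministic interleaving of two unlocked read-modify-write sequences. The following is a toy model, not driver code: the class and function names are invented for illustration, and the real `wsm_release_tx_buffer()` does more than adjust a counter.

```python
class Counter:
    """Stand-in for hw_priv->hw_bufs_used: a plain int with no locking."""
    def __init__(self, value: int) -> None:
        self.value = value

def interleaved_alloc_and_release(counter: Counter) -> None:
    """BH thread allocates (+1) while sdio_rx_work releases (-1), unlocked."""
    bh_seen = counter.value        # BH thread loads the counter
    rx_seen = counter.value        # sdio_rx_work loads the same value
    counter.value = bh_seen + 1    # BH thread stores its increment
    counter.value = rx_seen - 1    # sdio_rx_work overwrites it: increment lost

bufs = Counter(1)                  # one TX buffer in flight
interleaved_alloc_and_release(bufs)
after_race = bufs.value            # correct result is 1 (1 + 1 - 1); the model yields 0
bufs.value -= 1                    # confirm for the buffer that is still in flight
underflowed = bufs.value < 0       # the condition the driver WARNs on
```

The lost increment is invisible at the moment it happens; the underflow only surfaces on a later, perfectly legitimate release, which matches a WARN appearing some seconds into the stress run rather than on the first frame.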
+
+## §2 Shared-state delta table (the thing missing from Patch C)
+
+Every field that `bes2600_bh_handle_rx_skb` mutates either directly or transitively, with current protection and required action:
+
+| field | declared at | written by (today) | written by (after Patch C v2) | current protection | action needed |
+|---|---|---|---|---|---|
+| `hw_priv->hw_bufs_used` | bes2600.h | `wsm_alloc_tx_buffer` (bh thread, TX submit), `wsm_release_tx_buffer` (bh thread, RX confirm), `main.c:543` (init) | + `wsm_release_tx_buffer` from sdio_rx_work | single-writer = BH thread (structural) | **convert to `atomic_t`** |
+| `hw_priv->hw_bufs_used_vif[i]` | bes2600.h | `wsm_release_vif_tx_buffer` (bh thread), `bh.c:1271` (vif TX submit), init | + `wsm_release_vif_tx_buffer` from sdio_rx_work | single-writer = BH thread | **convert to `atomic_t [N]`** |
+| `hw_priv->wsm_rx_seq[i]` | bes2600.h | bh thread RX | sdio_rx_work only | single-writer = BH/sdio_rx context (was BH, now is sdio_rx_work, but still **one writer**) | OK — single writer |
+| `hw_priv->wsm_tx_pending[i]` | bes2600.h | `bes2600_bh_inc_pending_count` (TX submit, BH thread), `bes2600_bh_dec_pending_count` (RX confirm) | dec moves to sdio_rx_work; inc stays BH | single-writer = BH | **also needs `atomic_t`** |
+| `hw_priv->lmac_mon_timer` / `mcu_mon_timer` | bes2600.h | mod_timer / del_timer_sync from BH | ditto from sdio_rx_work | timer API is internally locked | OK — `mod_timer` is concurrency-safe |
+| `hw_priv->wsm_cmd.lock` (taken inside wsm_handle_rx) | wsm_buf | bh thread (today) | sdio_rx_work | spinlock | OK — already protected |
+| `hw_priv->vif_lock` (taken inside wsm_handle_rx for some paths) | per vif | bh thread today | sdio_rx_work | spinlock | OK |
+| `priv->bh_evt_wq` wake-up | bes2600.h | wsm_release_tx_buffer when count hits 0 | ditto from sdio_rx_work | wake_up is concurrency-safe | OK |
+| `bes2600_pwr_clear_busy_event` (called inside release) | bes_pwr | bh thread | sdio_rx_work | internal locking via `bes_power.lock` | OK |
+| `hw_priv->buf_released` | bes2600.h | only `wsm_release_buffer_to_fw` (MCAST_FWDING ifdef, AP-only) | unchanged — BH only | single-writer = BH | OK — not on Patch C v2 hot path |
+
+**Three fields require atomic_t conversion:** `hw_bufs_used`, `hw_bufs_used_vif[]`, `wsm_tx_pending[]`.  Everything else is already concurrency-safe or moves cleanly to single-writer-in-sdio_rx_work.
+
+## §3 Read-site survey (the rest of the work — atomic_read swaps)
+
+`grep -hE "hw_bufs_used\b|hw_bufs_used_vif\b" *.c *.h | wc -l` = **57 references** across the source tree:
+
+- 5 writers (above)
+- 52 readers, converted mechanically to `atomic_read()`.  Distribution (the per-file counts below cover 30 of the 52; the remainder is not itemized here):
+  - `bh.c`: 22 read sites (most in the bh main loop, BUG_ON gates, idle / suspend predicates)
+  - `sta.c`: 3 sites (PM idle check at sta.c:1231–1253)
+  - `bes2600_sdio.c`: 1 site (PM idle check at line 958)
+  - `main.c`: 2 sites (init zero, teardown wait)
+  - `debug.c`: 1 site (debugfs stats)
+  - `itp.c`: 1 site (test mode)
+
+`wsm_tx_pending[i]` site count is smaller — ~6 references, all in bh.c and the timer monitors.  Same mechanical conversion.
+
+## §4 Plan v2 — two-step
+
+**Patch C-prep** (NFC, lands first):
+
+- Convert `hw_bufs_used` from `int` → `atomic_t`.
+- Convert `hw_bufs_used_vif[CW12XX_MAX_VIFS]` from `int[]` → `atomic_t[]`.
+- Convert `wsm_tx_pending[2]` from `int[]` → `atomic_t[]`.
+- Update writers:
+  - `wsm_alloc_tx_buffer`: `atomic_inc(&hw_priv->hw_bufs_used)`.
+  - `wsm_release_tx_buffer`: rewrite with `atomic_fetch_sub_release(count, &hw_priv->hw_bufs_used)` — returns prior value.  Re-derive the "tx restart" predicate (`prior >= numInpChBufs - 1`) and the "wake bh_evt_wq + clear busy" predicate (`prior - count == 0`) from that.  WARN if `prior - count < 0`.
+  - `wsm_release_vif_tx_buffer`: same pattern on the array element.
+  - `bes2600_bh_inc/dec_pending_count`: use `atomic_inc` and `atomic_dec_return` (need post-decrement value to decide whether to del_timer).
+- Update all 52+6 read sites: mechanical `atomic_read()` swap.
+- `main.c:543` init: `atomic_set(&hw_priv->hw_bufs_used_vif[i], 0)`.
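The predicate derivation in the `wsm_release_tx_buffer` bullet can be checked with a small model that takes the prior value a fetch-and-subtract would return. This is a Python sketch of the arithmetic only; `num_inp_ch_bufs` stands in for `numInpChBufs`, and the result field names are assumptions:

```python
from typing import NamedTuple

class ReleaseResult(NamedTuple):
    new_value: int     # counter value after the subtraction
    tx_restart: bool   # prior >= numInpChBufs - 1
    wake_bh: bool      # prior - count == 0: wake bh_evt_wq and clear busy
    underflow: bool    # prior - count < 0: the WARN case

def release(prior: int, count: int, num_inp_ch_bufs: int) -> ReleaseResult:
    """Derive every decision from the prior value, as fetch-and-sub returns it."""
    new_value = prior - count
    return ReleaseResult(
        new_value=new_value,
        tx_restart=prior >= num_inp_ch_bufs - 1,
        wake_bh=new_value == 0,
        underflow=new_value < 0,
    )
```

Because all three decisions come from a single atomic return value, no second read of the counter is needed, and so no window opens in which another context could change it between the subtraction and the decision.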
+
+**Patch C-prep does NOT change behaviour.**  Same atomic ordering (`_release` / `_acquire` chosen to match the implicit memory ordering the BH-only path had).  Phase 7 of C-prep alone should show **identical** numbers to pre-patch baseline (`run-20260507-patchC-preflight`): 1.36 MB/s, 86.4 sdio_rx_work/sec, 90.3 dispatches per 1000 RX pkts, 0 bh_work redispatches.  If Phase 7 of C-prep shows a delta, the atomic ordering is wrong and we loop back here, not to C v2.
+
+**Patch C v2** (the actual structural change, lands on top of C-prep):
+
+- Identical to Patch C as merged in PR #3 (since closed): direct-deliver from `bes2600_sdio_extract_packets` into `bes2600_bh_handle_rx_skb`, no `rx_queue` indirection, no bh wake-up for RX.
+- The contract block in `bh.c::bes2600_bh_handle_rx_skb` is **expanded** to include the shared-state delta table from §2 of this plan, with explicit citations.
+- Same minimum-diff scope as Patch C: keep `rx_queue`, `pipe_read`, `bh_rx_helper` for clean bisection; remove in a follow-up hygiene patch.
+
+## §5 What will NOT be touched (deferred or out of scope)
+
+- mac80211-side `ieee80211_rx_irqsafe` → `ieee80211_rx_list` migration: that's Patch C2, gated on Task #19 kerneldoc verification.
+- The `#if 0` graveyard in bh.c, the `asm volatile("nop")` placeholder, the BUG_ON in steady-state hot path: still symptom-shaped per `feedback_dont_patch_downstream_artifacts`.  Re-evaluate at Task #24 after C v2 / D / E land.
+- `ba_lock` (Patch D) and `ps_state_lock` (Patch E): independent.
+
+## §6 Risk list (per Phase 6 contract-thread-safety memory)
+
+1. **C-prep memory ordering**: I've chosen `atomic_fetch_sub_release` for `wsm_release_tx_buffer` to mirror the implicit BH-thread ordering (release before subsequent atomic ops on `bh_evt_wq` / `bes_power`).  If the BH thread or other readers expect `_acquire` semantics on the value, we get reordering bugs that are hard to reproduce.  **Mitigation:** pair with `_acquire` reads where the read-then-decision pattern is critical (e.g., the bh main loop's `if (!hw_priv->hw_bufs_used)` idle predicate).  Cite the kerneldoc reference for `atomic_fetch_sub_release` in the commit message.
+
+2. **`wsm_tx_pending[]` decrement-side timer interaction**: `bes2600_bh_dec_pending_count` does `if (--hw_priv->wsm_tx_pending[idx] == 0) del_timer_sync(timer); else mod_timer(timer, ...)`.  After atomic_t conversion: `if (atomic_dec_return(&hw_priv->wsm_tx_pending[idx]) == 0) ...`.  But *another* thread could `atomic_inc` between our dec and the timer call, racing the del_timer.  `del_timer_sync` is internally safe (it can be called concurrently with `mod_timer`), but the **decision** "whether to delete vs mod" is racy.  **Mitigation:** even after atomic conversion, this function still needs to be called from a single context.  Verify `inc/dec_pending_count` callers — if both sides only fire from BH and sdio_rx_work and never overlap on the same idx, we're fine; if not, this needs a lock.
+
+3. **`hw_bufs_used_vif[]` array vs `wsm_alloc_tx_buffer`**: vif counter increment lives at bh.c:1271, called from bh thread TX-submit path.  Decrement (`wsm_release_vif_tx_buffer`) called from RX-confirm.  After Patch C v2 the decrement is in sdio_rx_work — same race shape as the global counter.  Already covered by the atomic_t array conversion.
+
+4. **PM idle predicate at sta.c:1239**: reads `hw_priv->hw_bufs_used_vif[priv->if_id]` to decide can-sleep.  Currently racy (was already reading BH-mutated state from a non-BH PM context).  Atomic conversion makes the read coherent.  PM context's read-then-decide is still fundamentally a snapshot — no change in semantics, just no torn-read.
+
+5. **Reboot / module-unload teardown** (`main.c:840`): `wait_event_timeout(... !hw_priv->hw_bufs_used ...)`.  Becomes `... !atomic_read(...)`.  No semantic change — the wait_event macro re-evaluates the predicate on each wake.
+
+6. **Phase 7 rig: Patch C v2 still wedges chip if I missed anything**: now mitigated by ohm's new wired interface (enu1, 192.168.88.80) — survives bes2600 wedges, lets us collect dmesg / ftrace / journalctl from a wedged ohm without reboot.  See `reference_ohm_wired_iface` memory.
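Risk 2 is easiest to see as an explicit interleaving. The toy model below is an illustration under assumptions: `timer_armed` stands in for the monitor timer state, the increment side is assumed to arm the timer, and the step order is one hypothetical schedule rather than an observed trace.

```python
class PendingState:
    """Toy stand-in for wsm_tx_pending[idx] plus its monitor timer."""
    def __init__(self, pending: int) -> None:
        self.pending = pending
        self.timer_armed = pending > 0

def racy_schedule(state: PendingState) -> None:
    # Context A (RX confirm): the atomic decrement returns 0, so A decides to delete.
    state.pending -= 1
    a_saw_zero = state.pending == 0
    # Context B (TX submit) runs before A acts: increment and arm the timer.
    state.pending += 1
    state.timer_armed = True
    # Context A resumes with its stale decision and deletes the timer.
    if a_saw_zero:
        state.timer_armed = False

state = PendingState(pending=1)
racy_schedule(state)
stale_decision = state.pending > 0 and not state.timer_armed
```

Each individual step is atomic, yet the model ends with one request pending and no monitor timer running, which is why the mitigation asks for a single calling context (or a lock) rather than relying on the atomic conversion alone.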
+
+## §7 Phase 5 review handover
+
+PR on git.reauktion.de/marfrit/besser, this file as the artifact (per `feedback_phase5_surface_is_pr`).  Specifically request reviewer focus on §2 shared-state delta table — that's the part that should have caught Patch C's bug.  Don't curate.
+
+## §8 Phase 6 implementation order
+
+1. Branch off `cleanups` on bes2600-dkms-mobian: `bes2600/atomic-tx-buf-counters` (= Patch C-prep).
+2. Mechanical refactor: `int hw_bufs_used` → `atomic_t hw_bufs_used`, all reads → `atomic_read`, all writes → atomic ops.  Same for vif array and tx_pending array.  No other changes.
+3. Build, install, smoke-test.  Phase 7 of C-prep.  Should be a no-op delta.
+4. PR + Phase 5 review + merge.
+5. Branch off C-prep: `bes2600/sdio-rx-direct-deliver-v2` (= Patch C v2).
+6. Re-apply the Patch C delta (3 files: bh.h, bh.c, bes2600_sdio.c — same edits as PR #3).
+7. Build, install, Phase 7 N=3 stress ramp.
+8. PR + Phase 5 review + merge.
+
+## §9 Phase 7 v2 protocol (per `feedback_phase7_stress_ramp` + wired-rig)
+
+1. Pre-C-prep baseline rep N=3 (re-anchor, since current N=1 baseline is from `run-20260507-patchC-preflight`).
+2. Apply C-prep, N=3.  Compare to pre.  Expect: zero meaningful delta.  If non-zero → memory-ordering bug, loop back to §4 atomic-ordering choice.
+3. Apply C v2, N=3.  Compare to C-prep baseline.  Expect the delta predicted in §4.5 of the original Patch C plan (rx_queue lock acquires → 0, observed RX throughput lifts toward ≥ 1 MB/s sustained @ 4 MB/s).
+4. **All Phase 7 stress runs use the wired path (`ssh mfritsche@192.168.88.80`) for telemetry collection.**  When the chip wedges (it shouldn't this time, but planning for it), wlan0 stops responding but enu1 stays alive.  Collect dmesg / ftrace / journalctl over enu1 BEFORE rebooting.  This is the data we lost in Patch C boot -1 because wlan0 was the only path.
+5. N=3 reps per phase per `feedback_phase7_stress_ramp`.  Don't accept N=1 as verification.
+
+## §10 Closeout
+
+If C-prep + C v2 both pass Phase 7: proceed to D (ba_lock atomicization), E (ps_state_lock skip).  Markus's "we're not on the clock" applies — sequencing per bisection clarity, not delivery deadline.
+
+---
+
+*Plan written 2026-05-07 by Claude (noether), in response to Patch C Phase 7 failure.  Phase 5 review = PR comments on this artifact at git.reauktion.de/marfrit/besser.  Don't curate the shared-state delta table for the reviewer — that's the part the previous round's reviewer should have caught me on.*
@@ -0,0 +1,127 @@
+# Patch C v3 — Phase 4 Plan: drop sdio_rx_work, match cw1200 architecture
+
+**Author:** Claude (noether)
+**Status:** Phase 4 v3 — supersedes v2 (PR #10) after cw1200 mainline survey showed the race-free path is structural, not lock-based.
+**Decision:** drop the `sdio_rx_work` workqueue entirely; SDIO IRQ wakes `bh_wq`; bh thread does the SDIO read inline.  Restores single-writer-from-bh invariant on `hw_bufs_used` *by construction*.  No `atomic_t` prep needed.
+
+---
+
+## §0 Why v3 supersedes v2
+
+PR #10's plan was: convert `hw_bufs_used` etc. to `atomic_t` (prep), then direct-deliver from `sdio_rx_work` (structural).  That was a workaround for the race that *only existed because of the relay*.
+
+The cw1200 mining (`~/src/linux-rockchip`, 228 cw1200 commits) showed the upstream answer: there is no relay.  cw1200's IRQ handler bumps `bh_rx` and wakes the bh thread; the bh thread does the SDIO read itself inside `cw1200_bh_rx_helper` (`drivers/net/wireless/st/cw1200/bh.c:233`).  Single thread = single writer for `hw_bufs_used` = no race.  Same `int hw_bufs_used` as bes2600, never converted to `atomic_t` since cw1200 entered mainline in 2013, because it never needed to be.
+
+Patch C v3 brings bes2600 into that shape.  The structural simplification is bigger than v2's diff but lands the right architecture in one move.
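The cw1200 shape described above can be modelled in userspace: the interrupt side only counts and wakes, and one thread does all the work and all the counter writes. The sketch uses Python threading primitives as stand-ins for the kernel wait queue; the class and method names are hypothetical, and the SDIO read plus confirm handling is reduced to a single decrement.

```python
import threading

class BhModel:
    def __init__(self, in_flight: int) -> None:
        self.hw_bufs_used = in_flight         # plain int: only bh_thread writes it
        self.bh_rx = 0                        # pending RX rounds, bumped by the IRQ side
        self.cond = threading.Condition()     # stands in for bh_wq
        self.stopping = False

    def irq_handler(self) -> None:
        """IRQ side: no bus read, no counter update; just count and wake."""
        with self.cond:
            self.bh_rx += 1
            self.cond.notify()

    def bh_thread(self) -> None:
        """Single worker: takes pending rounds, then reads and confirms inline."""
        while True:
            with self.cond:
                while self.bh_rx == 0 and not self.stopping:
                    self.cond.wait()
                if self.bh_rx == 0:
                    return                    # stopping and nothing left to drain
                rounds, self.bh_rx = self.bh_rx, 0
            for _ in range(rounds):
                self.hw_bufs_used -= 1        # RX confirm releases one TX buffer

def run_model(confirms: int) -> int:
    """Fire `confirms` interrupts at a fresh model and return the final counter."""
    model = BhModel(in_flight=confirms)
    worker = threading.Thread(target=model.bh_thread)
    worker.start()
    for _ in range(confirms):
        model.irq_handler()
    with model.cond:
        model.stopping = True
        model.cond.notify()
    worker.join()
    return model.hw_bufs_used
```

The only state shared between the two sides is `bh_rx`, which is protected by the condition's lock; `hw_bufs_used` needs no protection because exactly one thread ever touches it, so `run_model(n)` ends at zero for any `n`.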
+
+## §1 Goal
+
+Same as Patch C v2 §1: ≥ 1 MB/s sustained receive @ 4 MB/s sender, < 15 % `_raw_spin_unlock_irqrestore` CPU%, no 30-min cascade to link-death.  Stretch toward Phase 1's full 2 MB/s once Patch C2 (rx_list batch) lands separately.
+
+## §2 Situation
+
+- Cleanups branch is at Patch F merged (commit `b717251`).  All Phase 5 reviews of the F series merged via PR #4.
+- ohm rebooted with F module live (srcversion `A9438692D6A8698F92AEEA1`) — F is the new baseline for Patch C v3 Phase 7 comparison.
+- Wired path `enu1` at `192.168.88.80` survives bes2600 wedges; lmcp `ohm` still goes through wlan0.  Phase 7 telemetry collection over enu1.
+- Reboot-permission override active (ohm dev-allocated; I can `sudo reboot` directly — `feedback_user_pushes_reboot_button` override clause).
+
+## §3 Baseline measurements
+
+Carry forward from `run-20260507-patchC-preflight/baseline.tsv` (N=1, F-less Patch B module):
+
+| metric | value |
+|---|---|
+| observed receive @ 4 MB/s | 1.362 MB/s |
+| sdio_rx_work dispatches | 86.4/s = 90.3 per 1000 RX packets |
+| sdio_tx_work dispatches | 276.1/s |
+| bes2600_bh_work redispatches | 0 (single long-lived) |
+
+**Phase 6 prereq:** capture an N=3 baseline ON THE F MODULE before Patch C v3 code lands.  Same instrumentation, same stress ramp.  This is the post-F / pre-v3 reference.  Without it, Phase 7's delta is C+F vs B+nothing — confounded.
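The two dispatch figures in the table imply an RX packet rate, which in turn gives a rough bytes-per-packet check against the observed throughput. A small worked calculation, assuming decimal megabytes and that all three figures cover the same interval; the derived numbers are estimates, not measurements:

```python
dispatches_per_s = 86.4          # sdio_rx_work dispatches per second
dispatches_per_1000_pkts = 90.3  # dispatches per 1000 RX packets
observed_mb_per_s = 1.362        # observed receive rate (assumed 1 MB = 1e6 bytes)

rx_pkts_per_s = dispatches_per_s / dispatches_per_1000_pkts * 1000
pkts_per_dispatch = 1000 / dispatches_per_1000_pkts
bytes_per_pkt = observed_mb_per_s * 1e6 / rx_pkts_per_s

print(f"{rx_pkts_per_s:.0f} pkt/s, {pkts_per_dispatch:.1f} pkt/dispatch, {bytes_per_pkt:.0f} B/pkt")
```

This works out to roughly 957 packets/s, about 11 packets per dispatch, and about 1.4 KB per packet. If the F-module baseline yields very different derived values, that would suggest the instrumentation changed between runs rather than the driver.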
+
+## §4 Plan v3
+
+### §4.1 What gets eliminated
+
+- **`sdio_rx_work` (bes2600_sdio.c:829)** — function deleted.  No longer queued, no longer runs.
+- **`self->rx_work` work_struct** — field deleted from `struct sbus_priv`.  `INIT_WORK` removed.
+- **`self->rx_queue` + `self->rx_queue_lock`** — fields deleted.  `skb_queue_head_init` removed.  No SKB ever queued there.
+- **`bes2600_sdio_pipe_read`** — function deleted.  No callers after this patch.
+- **`sbus_ops->pipe_read`** — sbus op slot deleted (or kept and stubbed; tx_loop.c also implements it for the test-loop bus, has to stay if test-loop is preserved).
+- **`queue_work(self->sdio_wq, &self->rx_work)`** at the 3 call sites in `bes2600_sdio.c` (lines 416, 941, 1199) — removed.
+
+### §4.2 What gets added
+
+- **A new `bes2600_bh_handle_rx_skb()`** in bh.c (same shape as Patch C added, same contract block; no longer needs to also wake the bh thread because we ARE the bh thread).
+- **A new helper `bes2600_sdio_read_rx_batch()`** in bes2600_sdio.c, exported, that does what `sdio_rx_work` used to do MINUS the queuing: lock → read ctrl_reg → memcpy_fromio → packets_check → for-each-frame extract+deliver.  Called from bh.
+
+### §4.3 What gets rewired
+
+- **`bes2600_gpio_irq_handler`** in bes2600_sdio.c:413 (the GPIO-IRQ path used when CONFIG_BES2600_USE_GPIO_IRQ is set):  drop `queue_work(self->sdio_wq, &self->rx_work)`; instead call `self->irq_handler(self->irq_priv)` directly (which is `bes2600_irq_handler` in bh.c, bumps `bh_rx` + wakes `bh_wq`).  Matches cw1200_sdio_irq_handler shape.
+- **`bes2600_bh_rx_helper`** (bh.c:961, BES_SDIO_RX_MULTIPLE_ENABLE branch): instead of `pipe_read`-ing one SKB from the (now-gone) rx_queue, call the new `bes2600_sdio_read_rx_batch()` which does the SDIO read AND delivers each frame inline via `bes2600_bh_handle_rx_skb()`.  Returns count delivered, or negative on error.
+- **`bes2600_bh()` outer loop**:  after a successful rx_batch read, the helper signals whether to continue draining (more frames pending) — same shape as today's `BH_RX_CONT_LIMIT=3` outer loop.
+- **`bes2600_gpio_wakeup_mcu(SDIO_RX)`** + **`bes2600_gpio_allow_mcu_sleep(SDIO_RX)`** brackets:  currently called inside sdio_rx_work.  Move into bh thread around the `bes2600_sdio_read_rx_batch()` call.  Same wake-flag bracketing, just from a different thread.
+- **`sdio_wq` workqueue**:  keeps `tx_work` and (briefly) `scan_work`.  Renamed or kept — cosmetic.  Don't touch in this patch.
+
+### §4.4 What stays untouched
+
+- TX path (`sdio_tx_work`, `bes2600_bh_tx_helper`, `wsm_alloc_tx_buffer`).  Independent.
+- WSM protocol layer (`wsm.c`, `wsm_handle_rx`).  Same callees, just from bh thread now.
+- mac80211 RX delivery (`ieee80211_rx_irqsafe`).  That's Patch C2.
+- `BES2600_RX_IN_BH` ifdef gate.  Stays defined; the gated branch is now the only RX path.
+- Symptom-shaped artifacts (asm nop, BUG_ON in hot path) — still deferred, see task #24 post-cleanup.
+
+## §5 Shared-state delta table (the v2 lesson, applied)
+
+Every field `bes2600_bh_handle_rx_skb` mutates directly or transitively, with the v3 protection:
+
+| field | written by (today) | written by (after v3) | concurrency | required action |
+|---|---|---|---|---|
+| `hw_priv->hw_bufs_used` | bh thread (TX submit + RX confirm), main.c init | **bh thread only** (RX moves into bh) | single-writer | none — `int` is fine, race-free by construction |
+| `hw_priv->hw_bufs_used_vif[i]` | bh thread (TX vif submit + RX vif confirm), main.c init | **bh thread only** | single-writer | none |
+| `hw_priv->wsm_rx_seq[i]` | sdio_rx_work today | bh thread | single-writer | none — moves cleanly between contexts |
+| `hw_priv->wsm_tx_pending[i]` | bh thread (inc on TX submit), bh+sdio_rx_work (dec on RX confirm) | **bh thread only** | single-writer | none |
+| `hw_priv->lmac_mon_timer` / `mcu_mon_timer` | mod_timer / del_timer_sync from bh + sdio_rx_work | bh thread only | timer API safe anyway | none |
+| `hw_priv->wsm_cmd.lock` | spinlock taken inside wsm_handle_rx | same | already protected | none |
+| `priv->bh_evt_wq` wake-up | wsm_release_tx_buffer when count→0 | same | wake_up is concurrency-safe | none |
+| `bes_pwr.lock` (inside bes2600_pwr_clear_busy_event) | bh thread (today) | bh thread | already protected | none |
+| `self->rx_data_cnt` etc. (sbus_priv stats) | sdio_rx_work | bh thread | single-writer | none |
+
+**Zero fields require new locking.** The architectural pivot eliminates the race v2's atomic_t was working around.
+
+## §6 Risks
+
+1. **bh thread now holds the SDIO bus mutex during read** (currently held by sdio_rx_work).  TX work in the same bh thread is unaffected (sdio_tx_work runs on a separate workqueue and shares the same mutex anyway).  The sdio_lock contention pattern doesn't change.
+2. **Loss of "parallelism" between sdio_rx_work and bh TX**:  sdio_rx_work and bh thread *appeared* to run in parallel today, but both serialize through `bes2600_sdio_lock(self)` for the actual bus operations.  The parallelism was illusory.  Net throughput should not regress.
+3. **bh thread CPU-busy-time per RX batch increases**:  inline SDIO read is the same cost, just charged to bh instead of sdio_wq's worker.  Mitigation: the per-IRQ workqueue dispatch cost (~86/s) is what we trade for it.  Net: -86 dispatches/s, +0 µs per frame.
+4. **Multi-RX coalescing (BES_SDIO_RX_MULTIPLE_NUM=16)** stays.  bes2600_sdio_extract_packets parses the multi-frame buffer same as before, just inline now.  No functional change to chip-side behaviour.
+5. **GPIO wake-flag bracketing**:  `bes2600_gpio_wakeup_mcu(SDIO_RX)` and `bes2600_gpio_allow_mcu_sleep(SDIO_RX)` currently bracket sdio_rx_work.  Move them to bracket the new bh-side read.  If the wake-flag accounting is sub-system-scoped (it is — flag bits per subsystem), this is a clean move.
+6. **IRQ re-enable in bh thread**:  cw1200's bh re-enables IRQ via `__cw1200_irq_enable(priv, 1)` after each round.  bes2600 has the analogous `__bes2600_irq_enable(0/1)` (commented out as the `asm volatile("nop")` symptom in `bh.c:1518-1520`).  This patch does NOT re-engage the commented-out re-enable — that's still task #24's call.  But if the IRQ stays disabled across rounds, we'd never receive the next IRQ.  **Investigate before Phase 6 lands**: where does IRQ re-enable happen in the current bes2600 hot path?  The sdio_func IRQ may be auto-managed by sdio core differently.  Block Phase 6 on this audit.
+7. **Phase 7 wedge resilience**:  if v3 has a different bug shape than v2's race (which it shouldn't, since the race is gone by construction), the wired path lets us collect telemetry from a wedged ohm.
+
+## §7 Phase 5 / 6 / 7
+
+- **Phase 5**: PR on `git.reauktion.de/marfrit/besser` with this artifact.  Specifically request reviewer focus on §6 risk #6 (IRQ re-enable mechanism).
+- **Phase 6**: branch off cleanups (post-F): `bes2600/sdio-rx-no-relay`.  Implement the file changes per §4.  Build, install, smoke-test.
+- **Phase 7**:
+  - First: N=3 stress-ramp **on F module** (post-F pre-v3 baseline).  10 min @ 1, 30 min @ 2, 30 min @ 4 MB/s.  Use wired path for telemetry.
+  - Then: install v3 module, identical N=3 ramp.  Compare deltas.
+  - Predicted: sdio_rx_work dispatch rate → 0/s (was 86/s).  Observed receive lifts toward ≥ 1.0 MB/s sustained.  `_raw_spin_unlock_irqrestore` drops by the rx_queue lock contribution (was 1914/s acquires).
+
+## §8 What gets dropped from v2 plan
+
+- atomic_t prep refactor (`hw_bufs_used` → `atomic_t`): not needed.  Single-writer invariant preserved structurally.  Still a defensible standalone hardening patch *if mainlining bes2600 ever requires defense-in-depth*, but not on the Bug-#5 critical path.
+- `wsm_tx_pending[]` decrement-decision race (v2 risk #2): also mooted.  Both sides run on the bh thread under v3.
+- v2 Phase 7's "C-prep should show zero delta" gate: replaced by "v3 should match cw1200's structural shape" gate.
+
+## §9 Open question for reviewer
+
+The big one is §6 risk #6 — IRQ re-enable. cw1200 explicitly does `__cw1200_irq_enable(priv, 1)` from bh after each round; bes2600 has the call **commented out** with an `asm volatile("nop")` placeholder.  Either:
+
+(a) bes2600's SDIO IRQ is level-triggered + auto-acked by SDIO core, so re-enable isn't needed (that would explain the nop).
+(b) The current code happens to work because sdio_rx_work is queued by the IRQ regardless of whether the IRQ is "enabled" by the driver-side flag.  In that case, after v3 we would have to re-enable manually, as cw1200 does.
+
+Need to confirm (a) vs (b) before Phase 6 lands.  Plan to grep for `__bes2600_irq_enable` callsites and trace back to whether it's load-bearing.
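The callsite audit can start with a scan like the one below. It is a minimal sketch: `find_callsites` is a hypothetical helper, the `SAMPLE` text is made up for illustration and is NOT the real contents of `bh.c`, and the comment detection only covers `//` and single-line `/* ... */` comments.

```python
import re


def find_callsites(source: str, symbol: str):
    """Return (line_no, is_commented, text) for each line that calls `symbol`.

    Function definitions match too, since they also look like `symbol(`.
    Multi-line block comments are not recognised; that needs a real C lexer.
    """
    hits = []
    call = re.compile(r"\b" + re.escape(symbol) + r"\s*\(")
    for no, line in enumerate(source.splitlines(), start=1):
        match = call.search(line)
        if not match:
            continue
        before = line[: match.start()]
        commented = "//" in before or ("/*" in before and "*/" not in before)
        hits.append((no, commented, line.strip()))
    return hits


# Made-up excerpt for illustration; NOT the real contents of bh.c.
SAMPLE = '''\
done:
    asm volatile("nop");
    // __bes2600_irq_enable(1);
static int other(void) { return __bes2600_irq_enable(0); }
'''

for no, commented, text in find_callsites(SAMPLE, "__bes2600_irq_enable"):
    print(no, "commented" if commented else "live", text)
```

Running it over every `.c` file in the driver tree and keeping only the "live" hits gives the list of callsites to trace by hand. A tree with no live `__bes2600_irq_enable(1)` call would be consistent with (a) but does not prove it; the SDIO core's IRQ handling still has to be read.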
+
+---
+
+*Plan written 2026-05-07 by Claude (noether), after Patch F merged and Patch C v2 (PR #10) was superseded by the cw1200 architectural mining finding.  Phase 5 review on PR.  Don't curate.*
Author	SHA1	Message	Date
claude-noether	6bae531917	notes: Bug #5 RX-degradation campaign — Phase 0 plan + research question After Patch C v3 closed (PR #5 merged, Phase 7 N=3 verified at +73% throughput vs Patch B baseline), the post-13-min RX-degradation pattern remains. Reproduces on Patch B, F, and v3 alike — independent of the relay/race issues v3 addressed. Side-effect that was masked by the throughput floor while v2's race was the dominant variable. Research question (locked): Why does the bes2600 RX path collapse from ~2 MB/s sustained @ fresh-chip uptime to ~180 B/s @ ~28-min uptime, with periodic wsm_generic_confirm failed for request 0x0007 + ieee80211 phy0: [SCAN] Scan failed (-22) every 300 s in the intervening window? Phase 0 protocol: - long-capture rig armed on ohm at uptime 0 (fresh boot 23:13 CEST) - ftrace events: workqueue, mac80211, cfg80211, mmc, sdhci, power - iw event (cfg80211 reason codes), dmesg follow, per-30s netdev counter snap, 5 stress probes at T+5/10/15/20/25 min Phase 0 will: - re-anchor the predecessor data via the long capture (in-session N=1; re-run if anomalous) - characterize state transitions (first scan-fail, first throughput drop) via cfg80211/mac80211 ftrace + iw event correlation - feed Phase 1 metric formulation Mechanism candidates (Phase 4 will discriminate): 1. Firmware-side resource exhaustion (per-scan accumulator) 2. NetworkManager scan-fail recovery loop competing with data 3. AP-side rate limiting / fairness probation 4. PSM state machine deadlock (c7 latch stale) 5. SDIO bus retune interaction 6. Power-management busy-event accumulator leak Out of scope: Patch C2/D/E, higher-rate ramp, reproducing on different APs. Independent campaign from Patch C closure.	2026-05-07 23:23:31 +02:00
marfrit	1e408c9d33	Merge pull request 'notes: Patch C v3 Phase 4 plan — drop sdio_rx_work, match cw1200' (#11) from claude-noether-9 into main Reviewed-on: #11	2026-05-07 19:41:44 +00:00
claude-noether	d01400140b	notes: Patch C v3 Phase 4 plan — drop sdio_rx_work, match cw1200 Supersedes v2 (PR #10). cw1200 mining (~/src/linux-rockchip, 228 cw1200 commits) confirmed: upstream cw1200 has no sdio_rx_work workqueue at all. IRQ handler bumps bh_rx + wakes bh_wq; bh thread does the SDIO read inline via cw1200_bh_rx_helper. Single thread = single writer for hw_bufs_used = no race by construction. Same int hw_bufs_used as bes2600, never atomic_t'd in 16 years upstream. v3 brings bes2600 into that shape: - delete sdio_rx_work, self->rx_work, self->rx_queue, self->rx_queue_lock, bes2600_sdio_pipe_read - GPIO IRQ handler calls self->irq_handler directly (matches cw1200_sdio_irq_handler shape) - bes2600_bh_rx_helper's BES_SDIO_RX_MULTIPLE_ENABLE branch replaced with inline SDIO read + extract_packets + per-skb delivery via new bes2600_bh_handle_rx_skb() - GPIO wake-flag bracketing moves into bh thread §5 shared-state delta table (the v2 lesson, applied): zero fields require new locking. hw_bufs_used / hw_bufs_used_vif / wsm_tx_pending all stay single-writer-from-bh. v2's atomic_t prep is mooted. §6 risk #6 is the open question for reviewer: bes2600's __bes2600_irq_enable(1) call is commented out in the BH-loop done: label with an asm volatile("nop") in its place. Either SDIO IRQ is auto-managed (so commenting out is fine) or the current code relies on sdio_rx_work being queued regardless of driver-side IRQ flag. Block Phase 6 on this audit. Patch F (PR #4 merged) is the new baseline. v3 will branch off F-merged cleanups. Phase 7 N=3 stress ramp uses wired enu1 path (192.168.88.80) for wedge-resilient telemetry collection.	2026-05-07 21:36:15 +02:00
marfrit	993117a108	Merge pull request 'notes: Patch C v2 Phase 4 plan — atomic_t prep + direct-deliver (re-after-failure)' (#10) from claude-noether-8 into main Question - you said earlier, the driver is a search-and-replace CW12xx driver. Did the CW12xx evolve since this "fork"? If so, are there lessons that can be learned from the CW12xx driver in its nowadays state? Reviewed-on: #10	2026-05-07 18:56:12 +00:00
claude-noether	0b63ca3c24	notes: Patch C v2 Phase 4 plan — atomic_t prep + direct-deliver Phase 7 of Patch C (PR #9 → bes2600-dkms PR #3 → boot -1 of ohm 20:18:10) failed with a thread-safety race: wsm_release_tx_buffer's unlocked R-M-W on hw_bufs_used races against wsm_alloc_tx_buffer in the bh thread when Patch C moved the RX-confirm decrement into sdio_rx_work. WARN storm at +13s under stress, chip wedges, host off-network. Phase 6 contract analysis cited wsm_handle_rx's sleepability and held-lock invariants but stopped at the function signature. Did not enumerate hw_bufs_used as shared state mutated by the callee. Lesson saved as feedback_phase6_contract_threadsafety memory. Phase 4 v2 designs around that gap. Two-step: 1. Patch C-prep: NFC refactor — convert hw_bufs_used, hw_bufs_used_vif[], wsm_tx_pending[] from int / int[] to atomic_t / atomic_t[]. Use atomic_fetch_sub_release in wsm_release_tx_buffer (returns prior value for the >= numInpChBufs - 1 predicate). Mechanical atomic_read swap at ~58 read sites. Lands first; Phase 7 should show zero delta from baseline. 2. Patch C v2: re-apply the sdio_rx_work direct-deliver on top of C-prep. Identical structural change to the closed PR #3, but now the racing counter is safe. Contract block in bes2600_bh_handle_rx_skb expanded to include the shared-state delta table. Plan §2 is the shared-state delta table — every field bes2600_bh_handle_rx_skb mutates directly or transitively, with current protection and required action. 3 fields need atomic_t, the rest are already concurrency-safe or stay single-writer. Plan §6 lists 6 risks including memory-ordering choices, the inc/dec_pending_count timer-decision race, and the new wired-rig fallback (enu1 192.168.88.80) that survives bes2600 wedges so Phase 7 can capture dmesg / ftrace from a wedged ohm without reboot. PR superseded #3 closed with full verdict comment. Phase B rolled back on ohm at /lib/modules/.../extra/bes2600.ko. 
Markus's reboot button to land Patch B again before C-prep work begins.	2026-05-07 20:50:39 +02:00
marfrit	4666e03254	Merge pull request 'notes: Patch C Phase 4 plan (item 1 only — collapse sdio_rx_work into BH)' (#9) from claude-noether-7 into main Reviewed-on: #9	2026-05-07 17:21:37 +00:00