notes: Bug #5 root cause refined — workqueue-per-SDIO-transaction is the floor

Follow-up ftrace measurement (post-reboot, 3-min 4 MB/s capture):
- workqueue_execute_start: 5,643/sec  ← dominates
- wsm_cmd_send: only 13/sec (host-to-chip command path NOT the hotspot)
- lock contention: 50/sec (modest)
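
As a rough sanity check on these rates, the arithmetic below converts the dispatch rate into an average interval between dispatches and relates it to the two much smaller rates. It uses only the three figures listed above; the variable names are illustrative and not taken from the driver.

```python
# Rates copied from the ftrace measurement above (events per second).
workqueue_dispatches_per_sec = 5643
wsm_cmd_send_per_sec = 13
lock_contention_per_sec = 50

# Average wall-clock interval between workqueue dispatches, in microseconds.
dispatch_interval_us = 1_000_000 / workqueue_dispatches_per_sec

# Workqueue dispatches per WSM command and per lock-contention event.
dispatches_per_wsm_cmd = workqueue_dispatches_per_sec / wsm_cmd_send_per_sec
dispatches_per_contention = workqueue_dispatches_per_sec / lock_contention_per_sec

print(f"{dispatch_interval_us:.0f} us between dispatches")
print(f"{dispatches_per_wsm_cmd:.0f} dispatches per wsm_cmd_send")
print(f"{dispatches_per_contention:.0f} dispatches per lock contention")
```

On average a work item starts roughly every 177 µs, and there are several hundred dispatches for every WSM command, which is consistent with the command path not being the hotspot.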

The throughput floor is set by per-SDIO-transaction workqueue dispatch
overhead. Surgical patches B5-1/B5-2/B5-3 from the prior Phase 4 plan
all targeted the wrong layer; deferring those until an architectural
restructuring map is produced.

Promoting the Sonnet architect review from "backlog" to
"blocking on Bug #5" — the next step is a restructuring assessment,
not another patch.
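
One restructuring direction named in the notes changed by this commit is amortising SDIO transactions across fewer work items. The sketch below only illustrates how the dispatch rate would scale with a batch size. It assumes one workqueue dispatch per SDIO transaction today (suggested by the measurement, not proven), and the batch sizes and the function name `work_items_per_sec` are hypothetical.

```python
import math


def work_items_per_sec(transactions_per_sec: float, batch_size: int) -> int:
    """Work items per second if each item handles up to batch_size transactions."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(transactions_per_sec / batch_size)


# Measured workqueue_execute_start rate from the capture above.
measured_dispatch_rate = 5643

# Hypothetical batch sizes; 1 reproduces the current per-transaction dispatch.
for batch in (1, 4, 16):
    print(batch, work_items_per_sec(measured_dispatch_rate, batch))
```

This says nothing about the throughput actually gained; it only shows how quickly the number of dispatches falls once more than one transaction is handled per work item.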
2026-05-07 17:31:31 +02:00
parent 928268f477
commit 594f73c6b4
+36 -14
@@ -82,25 +82,47 @@ without board power-cycle").
**Status**: task c3 (indirectly, via bes_chardev removal which currently
gates the signal/nosignal mode switch path).
-## Backlog — full architect review of bes2600 driver code quality
+## Architect review — now BUG-#5-blocking (was backlog)
-The Phase 0 perf trace for Bug #5 exposes a "when in doubt, add a lock"
-pattern in the BH path (~20 % CPU in `_raw_spin_unlock_irqrestore` even
-during healthy throughput). Markus has flagged this for a separate
-architect-review pass: have Claude Sonnet (or equivalent reviewer) do a
-top-to-bottom code-quality review of the bes2600 sources we have on
-boltzmann (`~/src/besser/bes2600-dkms-mobian/bes2600/`), looking for:
+The Phase 0 perf trace for Bug #5 first exposed a "when in doubt, add a
+lock" pattern (~20 % CPU in `_raw_spin_unlock_irqrestore`). The
+follow-up ftrace measurement (2026-05-07 17:00) refined the root cause
+to an architectural problem: **the bes2600 driver dispatches every
+SDIO transaction through the kernel workqueue**. Numbers from a 3-min
+4 MB/s ohm capture (post-reboot, srcversion `1B3B3ED0`):
-- needless lock proliferation
-- BH / workqueue dispatch shape
-- error-handling coverage
-- dead code / leftover-from-cw1200 cruft
+```
+wsm_cmd_send: 13/sec (host-to-chip command rate, surprisingly low)
+bes2600_rx_cb: 611/sec
+bes2600_bh_wakeup: 267/sec
+lock contention_begin: 50/sec
+workqueue_execute_start: 5,643/sec ← DOMINATES; matches the mmc
+transaction rate from earlier perf
+```
+5.6 k workqueue dispatches per second is the throughput floor — not a
+specific lock, not WSM-command rate, not decrypt-state. A surgical fix
+to any single function won't move the floor; the architecture needs
+to be restructured to amortise SDIO transactions across fewer work-
+items (or move SDIO RX out of the workqueue entirely).
+This is where the **Claude Sonnet architect review** belongs: a
+top-to-bottom assessment of `~/src/besser/bes2600-dkms-mobian/bes2600/`
+focused on:
+- the workqueue dispatch shape (most actionable)
+- needless lock proliferation (the original signal)
+- BH / RX scheduling boundaries
+- error-handling coverage and dead-code from the cw1200 ancestor
+- API contract violations relative to mainline mac80211
-Output: ranked list of cleanup targets that would make later patch series
-land more cleanly. Not blocking on Bug #5 — independent track.
+Output: ranked list of restructuring targets, with predicted-delta
+estimates against the Phase 1 metric (≥ 2 MB/s sustained @ 4 MB/s cap,
+< 10 % CPU in lock-cycling, no link cascade in 30 min).
-**Status**: backlog. Schedule when Bug #5's measurement pass finishes.
+**Status**: now blocking on Bug #5 (was independent track). Surgical
+patches B5-1, B5-2, B5-3 from the original Phase 4 candidate list are
+all DEFERRED until the architect review's restructuring map is in.
## Bug #5 — RX path degrades under attempted-throughput pressure