Phase 7 of Patch C (PR #9 → bes2600-dkms PR #3 → boot -1 of ohm 20:18:10) failed with a thread-safety race: wsm_release_tx_buffer's unlocked R-M-W on hw_bufs_used races against wsm_alloc_tx_buffer in the bh thread when Patch C moved the RX-confirm decrement into sdio_rx_work. WARN storm at +13s under stress, chip wedges, host off-network. Phase 6 contract analysis cited wsm_handle_rx's sleepability and held-lock invariants but stopped at the function signature. Did not enumerate hw_bufs_used as shared state mutated by the callee. Lesson saved as feedback_phase6_contract_threadsafety memory. Phase 4 v2 designs around that gap. Two-step: 1. Patch C-prep: NFC refactor — convert hw_bufs_used, hw_bufs_used_vif[], wsm_tx_pending[] from int / int[] to atomic_t / atomic_t[]. Use atomic_fetch_sub_release in wsm_release_tx_buffer (returns prior value for the >= numInpChBufs - 1 predicate). Mechanical atomic_read swap at ~58 read sites. Lands first; Phase 7 should show zero delta from baseline. 2. Patch C v2: re-apply the sdio_rx_work direct-deliver on top of C-prep. Identical structural change to the closed PR #3, but now the racing counter is safe. Contract block in bes2600_bh_handle_rx_skb expanded to include the shared-state delta table. Plan §2 is the shared-state delta table — every field bes2600_bh_handle_rx_skb mutates directly or transitively, with current protection and required action. 3 fields need atomic_t, the rest are already concurrency-safe or stay single-writer. Plan §6 lists 6 risks including memory-ordering choices, the inc/dec_pending_count timer-decision race, and the new wired-rig fallback (enu1 192.168.88.80) that survives bes2600 wedges so Phase 7 can capture dmesg / ftrace from a wedged ohm without reboot. PR superseded #3 closed with full verdict comment. Phase B rolled back on ohm at /lib/modules/.../extra/bes2600.ko. Markus's reboot button to land Patch B again before C-prep work begins.
12 KiB
Patch C v2 — Phase 4 Plan: atomic_t prep + direct-deliver
Author: Claude (noether)
Status: Phase 4 v2 — Phase 7 of Patch C (notes/patch-c-phase4-plan-2026-05-07.md, PR #9 merged) failed with a thread-safety race; this is the redesign.
Decision: Option B from PR #3 close-out comment — atomic_t prep refactor first, direct-deliver on top.
§0 What just happened (Phase 7 of Patch C)
Reproduced verbatim from boot -1 of ohm 2026-05-07 20:18:10 CEST, ~13 s into a 4 MB/s nc stress:
WARNING: at wsm_release_tx_buffer+0x84/0xa0 [bes2600], CPU#0: kworker/0:3H/3912
Workqueue: bes_sdio sdio_rx_work [bes2600]
pc : wsm_release_tx_buffer+0x84/0xa0 [bes2600]
lr : bes2600_bh_handle_rx_skb+0x134/0x370 [bes2600]
sdio_rx_work+0x2a8/0x540 [bes2600]
bes2600_wlan: wsm_release_tx_buffer failed: -1
Storm continued; chip wedged; ohm fell off the WiFi (wlan0). Patch C module preserved at /var/tmp/bes2600.patchC-broken.ko for forensics. Patch B rolled back, currently on disk on ohm. Lesson saved as feedback_phase6_contract_threadsafety memory.
§1 Why it failed
wsm_release_tx_buffer() (bh.c:222–243) does unlocked read–modify–write on hw_priv->hw_bufs_used. Pre-Patch-C invariant was single-writer = BH thread; the lock that mattered was structural, not annotated. Patch C's direct-deliver moved one writer (RX-confirm decrement) into sdio_rx_work workqueue context. BH thread + sdio_rx_work race on the int counter; underflow below zero, WARN, return -1, bookkeeping corrupt, TX wedges.
Phase 6 contract block correctly cited wsm_handle_rx's sleepability and held-lock invariants — but stopped at the called function's signature. It did not enumerate hw_bufs_used as shared state mutated by the callee. That's the gap.
§2 Shared-state delta table (the thing missing from Patch C)
Every field that bes2600_bh_handle_rx_skb mutates either directly or transitively, with current protection and required action:
| field | declared at | written by (today) | written by (after Patch C v2) | current protection | action needed |
|---|---|---|---|---|---|
hw_priv->hw_bufs_used |
bes2600.h | wsm_alloc_tx_buffer (bh thread, TX submit), wsm_release_tx_buffer (bh thread, RX confirm), main.c:543 (init) |
+ wsm_release_tx_buffer from sdio_rx_work |
single-writer = BH thread (structural) | convert to atomic_t |
hw_priv->hw_bufs_used_vif[i] |
bes2600.h | wsm_release_vif_tx_buffer (bh thread), bh.c:1271 (vif TX submit), init |
+ wsm_release_vif_tx_buffer from sdio_rx_work |
single-writer = BH thread | convert to atomic_t [N] |
hw_priv->wsm_rx_seq[i] |
bes2600.h | bh thread RX | sdio_rx_work only | single-writer = BH/sdio_rx context (was BH, now is sdio_rx_work, but still one writer) | OK — single writer |
hw_priv->wsm_tx_pending[i] |
bes2600.h | bes2600_bh_inc_pending_count (TX submit, BH thread), bes2600_bh_dec_pending_count (RX confirm) |
dec moves to sdio_rx_work; inc stays BH | single-writer = BH | also needs atomic_t |
hw_priv->lmac_mon_timer / mcu_mon_timer |
bes2600.h | mod_timer / del_timer_sync from BH | ditto from sdio_rx_work | timer API is internally locked | OK — mod_timer is concurrency-safe |
hw_priv->wsm_cmd.lock (taken inside wsm_handle_rx) |
wsm_buf | bh thread (today) | sdio_rx_work | spinlock | OK — already protected |
hw_priv->vif_lock (taken inside wsm_handle_rx for some paths) |
per vif | bh thread today | sdio_rx_work | spinlock | OK |
priv->bh_evt_wq wake-up |
bes2600.h | wsm_release_tx_buffer when count hits 0 | ditto from sdio_rx_work | wake_up is concurrency-safe | OK |
bes2600_pwr_clear_busy_event (called inside release) |
bes_pwr | bh thread | sdio_rx_work | internal locking via bes_power.lock |
OK |
hw_priv->buf_released |
bes2600.h | only wsm_release_buffer_to_fw (MCAST_FWDING ifdef, AP-only) |
unchanged — BH only | single-writer = BH | OK — not on Patch C v2 hot path |
Three fields require atomic_t conversion: hw_bufs_used, hw_bufs_used_vif[], wsm_tx_pending[]. Everything else is already concurrency-safe or moves cleanly to single-writer-in-sdio_rx_work.
§3 Read-site survey (the rest of the work — atomic_read swaps)
grep -hE "hw_bufs_used\b|hw_bufs_used_vif\b" *.c *.h | wc -l = 57 references across the source tree:
- 5 writers (above)
- 52 readers — converted mechanically to
atomic_read(). Distribution:bh.c: 22 read sites (most in the bh main loop, BUG_ON gates, idle / suspend predicates)sta.c: 3 sites (PM idle check at sta.c:1231–1253)bes2600_sdio.c: 1 site (PM idle check at line 958)main.c: 2 sites (init zero, teardown wait)debug.c: 1 site (debugfs stats)itp.c: 1 site (test mode)
wsm_tx_pending[i] site count is smaller — ~6 references, all in bh.c and the timer monitors. Same mechanical conversion.
§4 Plan v2 — two-step
Patch C-prep (NFC, lands first):
- Convert
hw_bufs_usedfromint→atomic_t. - Convert
hw_bufs_used_vif[CW12XX_MAX_VIFS]fromint[]→atomic_t[]. - Convert
wsm_tx_pending[2]fromint[]→atomic_t[]. - Update writers:
wsm_alloc_tx_buffer:atomic_inc(&hw_priv->hw_bufs_used).wsm_release_tx_buffer: rewrite withatomic_fetch_sub_release(count, &hw_priv->hw_bufs_used)— returns prior value. Re-derive the "tx restart" predicate (prior >= numInpChBufs - 1) and the "wake bh_evt_wq + clear busy" predicate (prior - count == 0) from that. WARN ifprior - count < 0.wsm_release_vif_tx_buffer: same pattern on the array element.bes2600_bh_inc/dec_pending_count: useatomic_incandatomic_dec_return(need post-decrement value to decide whether to del_timer).
- Update all 52+6 read sites: mechanical
atomic_read()swap. main.c:543init:atomic_set(&hw_priv->hw_bufs_used_vif[i], 0).
Patch C-prep does NOT change behaviour. Same atomic ordering (_release / _acquire chosen to match the implicit memory ordering the BH-only path had). Phase 7 of C-prep alone should show identical numbers to pre-patch baseline (run-20260507-patchC-preflight): 1.36 MB/s, 86.4 sdio_rx_work/sec, 90.3 dispatches per 1000 RX pkts, 0 bh_work redispatches. If Phase 7 of C-prep shows a delta, the atomic ordering is wrong and we loop back here, not to C v2.
Patch C v2 (the actual structural change, lands on top of C-prep):
- Identical to Patch C as merged in PR #3 (since closed): direct-deliver from
bes2600_sdio_extract_packetsintobes2600_bh_handle_rx_skb, norx_queueindirection, no bh wake-up for RX. - The contract block in
bh.c::bes2600_bh_handle_rx_skbis expanded to include the shared-state delta table from §2 of this plan, with explicit citations. - Same minimum-diff scope as Patch C: keep
rx_queue,pipe_read,bh_rx_helperfor clean bisection; remove in a follow-up hygiene patch.
§5 What will NOT be touched (deferred or out of scope)
- mac80211-side
ieee80211_rx_irqsafe→ieee80211_rx_listmigration: that's Patch C2, gated on Task #19 kerneldoc verification. - The
#if 0graveyard in bh.c, theasm volatile("nop")placeholder, the BUG_ON in steady-state hot path: still symptom-shaped perfeedback_dont_patch_downstream_artifacts. Re-evaluate at Task #24 after C v2 / D / E land. ba_lock(Patch D) andps_state_lock(Patch E): independent.
§6 Risk list (per Phase 6 contract-thread-safety memory)
-
C-prep memory ordering: I've chosen
atomic_fetch_sub_releaseforwsm_release_tx_bufferto mirror the implicit BH-thread ordering (release before subsequent atomic ops onbh_evt_wq/bes_power). If the BH thread or other readers expect_acquiresemantics on the value, we get reordering bugs that are hard to reproduce. Mitigation: pair with_acquirereads where the read-then-decision pattern is critical (e.g., the bh main loop'sif (!hw_priv->hw_bufs_used)idle predicate). Cite the kerneldoc reference foratomic_fetch_sub_releasein the commit message. -
wsm_tx_pending[]decrement-side timer interaction:bes2600_bh_dec_pending_countdoesif (--hw_priv->wsm_tx_pending[idx] == 0) del_timer_sync(timer); else mod_timer(timer, ...). After atomic_t conversion:if (atomic_dec_return(&hw_priv->wsm_tx_pending[idx]) == 0) .... But another thread couldatomic_incbetween our dec and the timer call, racing the del_timer.del_timer_syncis internally safe (it can be called concurrently withmod_timer), but the decision "whether to delete vs mod" is racy. Mitigation: even after atomic conversion, this function still needs to be called from a single context. Verifyinc/dec_pending_countcallers — if both sides only fire from BH and sdio_rx_work and never overlap on the same idx, we're fine; if not, this needs a lock. -
hw_bufs_used_vif[]array vswsm_alloc_tx_buffer: vif counter increment lives at bh.c:1271, called from bh thread TX-submit path. Decrement (wsm_release_vif_tx_buffer) called from RX-confirm. After Patch C v2 the decrement is in sdio_rx_work — same race shape as the global counter. Already covered by the atomic_t array conversion. -
PM idle predicate at sta.c:1239: reads
hw_priv->hw_bufs_used_vif[priv->if_id]to decide can-sleep. Currently racy (was already reading BH-mutated state from a non-BH PM context). Atomic conversion makes the read coherent. PM context's read-then-decide is still fundamentally a snapshot — no change in semantics, just no torn-read. -
Reboot / module-unload teardown (
main.c:840):wait_event_timeout(... !hw_priv->hw_bufs_used ...). Becomes... !atomic_read(...). No semantic change — the wait_event macro re-evaluates the predicate on each wake. -
Phase 7 rig: Patch C v2 still wedges chip if I missed anything: now mitigated by ohm's new wired interface (enu1, 192.168.88.80) — survives bes2600 wedges, lets us collect dmesg / ftrace / journalctl from a wedged ohm without reboot. See
reference_ohm_wired_ifacememory.
§7 Phase 5 review handover
PR on git.reauktion.de/marfrit/besser, this file as the artifact (per feedback_phase5_surface_is_pr). Specifically request reviewer focus on §2 shared-state delta table — that's the part that should have caught Patch C's bug. Don't curate.
§8 Phase 6 implementation order
- Branch off
cleanupson bes2600-dkms-mobian:bes2600/atomic-tx-buf-counters(= Patch C-prep). - Mechanical refactor:
int hw_bufs_used→atomic_t hw_bufs_used, all reads →atomic_read, all writes → atomic ops. Same for vif array and tx_pending array. No other changes. - Build, install, smoke-test. Phase 7 of C-prep. Should be a no-op delta.
- PR + Phase 5 review + merge.
- Branch off C-prep:
bes2600/sdio-rx-direct-deliver-v2(= Patch C v2). - Re-apply the Patch C delta (3 files: bh.h, bh.c, bes2600_sdio.c — same edits as PR #3).
- Build, install, Phase 7 N=3 stress ramp.
- PR + Phase 5 review + merge.
§9 Phase 7 v2 protocol (per feedback_phase7_stress_ramp + wired-rig)
- Pre-C-prep baseline rep N=3 (re-anchor, since current N=1 baseline is from
run-20260507-patchC-preflight). - Apply C-prep, N=3. Compare to pre. Expect: zero meaningful delta. If non-zero → memory-ordering bug, loop back to §4 atomic-ordering choice.
- Apply C v2, N=3. Compare to C-prep baseline. Expect: §4.5 of original Patch C plan's predicted delta (rx_queue lock acquires → 0, observed RX KB/s lifts toward ≥1 MB/s sustained @ 4MB/s).
- All Phase 7 stress runs use the wired path (
ssh mfritsche@192.168.88.80) for telemetry collection. When the chip wedges (it shouldn't this time, but planning for it), wlan0 stops responding but enu1 stays alive. Collect dmesg / ftrace / journalctl over enu1 BEFORE rebooting. This is the data we lost in Patch C boot -1 because wlan0 was the only path. - N=3 reps per phase per
feedback_phase7_stress_ramp. Don't accept N=1 as verification.
§10 Closeout
If C-prep + C v2 both pass Phase 7: proceed to D (ba_lock atomicization), E (ps_state_lock skip). Markus's "we're not on the clock" applies — sequencing per bisection clarity, not delivery deadline.
Plan written 2026-05-07 by Claude (noether), in response to Patch C Phase 7 failure. Phase 5 review = PR comments on this artifact at git.reauktion.de/marfrit/besser. Don't curate the shared-state delta table for the reviewer — that's the part the previous round's reviewer should have caught me on.