Files
besser/notes/patch-c-v2-phase4-plan-2026-05-07.md
claude-noether 0b63ca3c24 notes: Patch C v2 Phase 4 plan — atomic_t prep + direct-deliver
Phase 7 of Patch C (PR #9 → bes2600-dkms PR #3 → boot -1 of ohm
20:18:10) failed with a thread-safety race: wsm_release_tx_buffer's
unlocked R-M-W on hw_bufs_used races against wsm_alloc_tx_buffer in
the bh thread when Patch C moved the RX-confirm decrement into
sdio_rx_work.  WARN storm at +13s under stress, chip wedges, host
off-network.

Phase 6 contract analysis cited wsm_handle_rx's sleepability and
held-lock invariants but stopped at the function signature.  Did not
enumerate hw_bufs_used as shared state mutated by the callee.  Lesson
saved as feedback_phase6_contract_threadsafety memory.

Phase 4 v2 designs around that gap.  Two-step:

1. Patch C-prep: NFC refactor — convert hw_bufs_used,
   hw_bufs_used_vif[], wsm_tx_pending[] from int / int[] to atomic_t /
   atomic_t[].  Use atomic_fetch_sub_release in wsm_release_tx_buffer
   (returns prior value for the >= numInpChBufs - 1 predicate).
   Mechanical atomic_read swap at ~58 read sites.  Lands first;
   Phase 7 should show zero delta from baseline.

2. Patch C v2: re-apply the sdio_rx_work direct-deliver on top of
   C-prep.  Identical structural change to the closed PR #3, but now
   the racing counter is safe.  Contract block in
   bes2600_bh_handle_rx_skb expanded to include the shared-state
   delta table.

Plan §2 is the shared-state delta table — every field
bes2600_bh_handle_rx_skb mutates directly or transitively, with
current protection and required action.  3 fields need atomic_t,
the rest are already concurrency-safe or stay single-writer.

Plan §6 lists 6 risks including memory-ordering choices, the
inc/dec_pending_count timer-decision race, and the new wired-rig
fallback (enu1 192.168.88.80) that survives bes2600 wedges so Phase 7
can capture dmesg / ftrace from a wedged ohm without reboot.

PR superseded #3 closed with full verdict comment.  Phase B rolled
back on ohm at /lib/modules/.../extra/bes2600.ko.  Markus's reboot
button to land Patch B again before C-prep work begins.
2026-05-07 20:50:39 +02:00

12 KiB
Raw Permalink Blame History

Patch C v2 — Phase 4 Plan: atomic_t prep + direct-deliver

Author: Claude (noether) Status: Phase 4 v2 — Phase 7 of Patch C (notes/patch-c-phase4-plan-2026-05-07.md, PR #9 merged) failed with a thread-safety race; this is the redesign. Decision: Option B from PR #3 close-out comment — atomic_t prep refactor first, direct-deliver on top.


§0 What just happened (Phase 7 of Patch C)

Reproduced verbatim from boot -1 of ohm 2026-05-07 20:18:10 CEST, ~13 s into a 4 MB/s nc stress:

WARNING: at wsm_release_tx_buffer+0x84/0xa0 [bes2600], CPU#0: kworker/0:3H/3912
Workqueue: bes_sdio sdio_rx_work [bes2600]
pc : wsm_release_tx_buffer+0x84/0xa0 [bes2600]
lr : bes2600_bh_handle_rx_skb+0x134/0x370 [bes2600]
sdio_rx_work+0x2a8/0x540 [bes2600]
bes2600_wlan: wsm_release_tx_buffer failed: -1

Storm continued; chip wedged; ohm fell off the WiFi (wlan0). Patch C module preserved at /var/tmp/bes2600.patchC-broken.ko for forensics. Patch B rolled back, currently on disk on ohm. Lesson saved as feedback_phase6_contract_threadsafety memory.

§1 Why it failed

wsm_release_tx_buffer() (bh.c:222243) does unlocked readmodifywrite on hw_priv->hw_bufs_used. Pre-Patch-C invariant was single-writer = BH thread; the lock that mattered was structural, not annotated. Patch C's direct-deliver moved one writer (RX-confirm decrement) into sdio_rx_work workqueue context. BH thread + sdio_rx_work race on the int counter; underflow below zero, WARN, return -1, bookkeeping corrupt, TX wedges.

Phase 6 contract block correctly cited wsm_handle_rx's sleepability and held-lock invariants — but stopped at the called function's signature. It did not enumerate hw_bufs_used as shared state mutated by the callee. That's the gap.

§2 Shared-state delta table (the thing missing from Patch C)

Every field that bes2600_bh_handle_rx_skb mutates either directly or transitively, with current protection and required action:

field declared at written by (today) written by (after Patch C v2) current protection action needed
hw_priv->hw_bufs_used bes2600.h wsm_alloc_tx_buffer (bh thread, TX submit), wsm_release_tx_buffer (bh thread, RX confirm), main.c:543 (init) + wsm_release_tx_buffer from sdio_rx_work single-writer = BH thread (structural) convert to atomic_t
hw_priv->hw_bufs_used_vif[i] bes2600.h wsm_release_vif_tx_buffer (bh thread), bh.c:1271 (vif TX submit), init + wsm_release_vif_tx_buffer from sdio_rx_work single-writer = BH thread convert to atomic_t [N]
hw_priv->wsm_rx_seq[i] bes2600.h bh thread RX sdio_rx_work only single-writer = BH/sdio_rx context (was BH, now is sdio_rx_work, but still one writer) OK — single writer
hw_priv->wsm_tx_pending[i] bes2600.h bes2600_bh_inc_pending_count (TX submit, BH thread), bes2600_bh_dec_pending_count (RX confirm) dec moves to sdio_rx_work; inc stays BH single-writer = BH also needs atomic_t
hw_priv->lmac_mon_timer / mcu_mon_timer bes2600.h mod_timer / del_timer_sync from BH ditto from sdio_rx_work timer API is internally locked OK — mod_timer is concurrency-safe
hw_priv->wsm_cmd.lock (taken inside wsm_handle_rx) wsm_buf bh thread (today) sdio_rx_work spinlock OK — already protected
hw_priv->vif_lock (taken inside wsm_handle_rx for some paths) per vif bh thread today sdio_rx_work spinlock OK
priv->bh_evt_wq wake-up bes2600.h wsm_release_tx_buffer when count hits 0 ditto from sdio_rx_work wake_up is concurrency-safe OK
bes2600_pwr_clear_busy_event (called inside release) bes_pwr bh thread sdio_rx_work internal locking via bes_power.lock OK
hw_priv->buf_released bes2600.h only wsm_release_buffer_to_fw (MCAST_FWDING ifdef, AP-only) unchanged — BH only single-writer = BH OK — not on Patch C v2 hot path

Three fields require atomic_t conversion: hw_bufs_used, hw_bufs_used_vif[], wsm_tx_pending[]. Everything else is already concurrency-safe or moves cleanly to single-writer-in-sdio_rx_work.

§3 Read-site survey (the rest of the work — atomic_read swaps)

grep -hE "hw_bufs_used\b|hw_bufs_used_vif\b" *.c *.h | wc -l = 57 references across the source tree:

  • 5 writers (above)
  • 52 readers — converted mechanically to atomic_read(). Distribution:
    • bh.c: 22 read sites (most in the bh main loop, BUG_ON gates, idle / suspend predicates)
    • sta.c: 3 sites (PM idle check at sta.c:12311253)
    • bes2600_sdio.c: 1 site (PM idle check at line 958)
    • main.c: 2 sites (init zero, teardown wait)
    • debug.c: 1 site (debugfs stats)
    • itp.c: 1 site (test mode)

wsm_tx_pending[i] site count is smaller — ~6 references, all in bh.c and the timer monitors. Same mechanical conversion.

§4 Plan v2 — two-step

Patch C-prep (NFC, lands first):

  • Convert hw_bufs_used from intatomic_t.
  • Convert hw_bufs_used_vif[CW12XX_MAX_VIFS] from int[]atomic_t[].
  • Convert wsm_tx_pending[2] from int[]atomic_t[].
  • Update writers:
    • wsm_alloc_tx_buffer: atomic_inc(&hw_priv->hw_bufs_used).
    • wsm_release_tx_buffer: rewrite with atomic_fetch_sub_release(count, &hw_priv->hw_bufs_used) — returns prior value. Re-derive the "tx restart" predicate (prior >= numInpChBufs - 1) and the "wake bh_evt_wq + clear busy" predicate (prior - count == 0) from that. WARN if prior - count < 0.
    • wsm_release_vif_tx_buffer: same pattern on the array element.
    • bes2600_bh_inc/dec_pending_count: use atomic_inc and atomic_dec_return (need post-decrement value to decide whether to del_timer).
  • Update all 52+6 read sites: mechanical atomic_read() swap.
  • main.c:543 init: atomic_set(&hw_priv->hw_bufs_used_vif[i], 0).

Patch C-prep does NOT change behaviour. Same atomic ordering (_release / _acquire chosen to match the implicit memory ordering the BH-only path had). Phase 7 of C-prep alone should show identical numbers to pre-patch baseline (run-20260507-patchC-preflight): 1.36 MB/s, 86.4 sdio_rx_work/sec, 90.3 dispatches per 1000 RX pkts, 0 bh_work redispatches. If Phase 7 of C-prep shows a delta, the atomic ordering is wrong and we loop back here, not to C v2.

Patch C v2 (the actual structural change, lands on top of C-prep):

  • Identical to Patch C as merged in PR #3 (since closed): direct-deliver from bes2600_sdio_extract_packets into bes2600_bh_handle_rx_skb, no rx_queue indirection, no bh wake-up for RX.
  • The contract block in bh.c::bes2600_bh_handle_rx_skb is expanded to include the shared-state delta table from §2 of this plan, with explicit citations.
  • Same minimum-diff scope as Patch C: keep rx_queue, pipe_read, bh_rx_helper for clean bisection; remove in a follow-up hygiene patch.

§5 What will NOT be touched (deferred or out of scope)

  • mac80211-side ieee80211_rx_irqsafeieee80211_rx_list migration: that's Patch C2, gated on Task #19 kerneldoc verification.
  • The #if 0 graveyard in bh.c, the asm volatile("nop") placeholder, the BUG_ON in steady-state hot path: still symptom-shaped per feedback_dont_patch_downstream_artifacts. Re-evaluate at Task #24 after C v2 / D / E land.
  • ba_lock (Patch D) and ps_state_lock (Patch E): independent.

§6 Risk list (per Phase 6 contract-thread-safety memory)

  1. C-prep memory ordering: I've chosen atomic_fetch_sub_release for wsm_release_tx_buffer to mirror the implicit BH-thread ordering (release before subsequent atomic ops on bh_evt_wq / bes_power). If the BH thread or other readers expect _acquire semantics on the value, we get reordering bugs that are hard to reproduce. Mitigation: pair with _acquire reads where the read-then-decision pattern is critical (e.g., the bh main loop's if (!hw_priv->hw_bufs_used) idle predicate). Cite the kerneldoc reference for atomic_fetch_sub_release in the commit message.

  2. wsm_tx_pending[] decrement-side timer interaction: bes2600_bh_dec_pending_count does if (--hw_priv->wsm_tx_pending[idx] == 0) del_timer_sync(timer); else mod_timer(timer, ...). After atomic_t conversion: if (atomic_dec_return(&hw_priv->wsm_tx_pending[idx]) == 0) .... But another thread could atomic_inc between our dec and the timer call, racing the del_timer. del_timer_sync is internally safe (it can be called concurrently with mod_timer), but the decision "whether to delete vs mod" is racy. Mitigation: even after atomic conversion, this function still needs to be called from a single context. Verify inc/dec_pending_count callers — if both sides only fire from BH and sdio_rx_work and never overlap on the same idx, we're fine; if not, this needs a lock.

  3. hw_bufs_used_vif[] array vs wsm_alloc_tx_buffer: vif counter increment lives at bh.c:1271, called from bh thread TX-submit path. Decrement (wsm_release_vif_tx_buffer) called from RX-confirm. After Patch C v2 the decrement is in sdio_rx_work — same race shape as the global counter. Already covered by the atomic_t array conversion.

  4. PM idle predicate at sta.c:1239: reads hw_priv->hw_bufs_used_vif[priv->if_id] to decide can-sleep. Currently racy (was already reading BH-mutated state from a non-BH PM context). Atomic conversion makes the read coherent. PM context's read-then-decide is still fundamentally a snapshot — no change in semantics, just no torn-read.

  5. Reboot / module-unload teardown (main.c:840): wait_event_timeout(... !hw_priv->hw_bufs_used ...). Becomes ... !atomic_read(...). No semantic change — the wait_event macro re-evaluates the predicate on each wake.

  6. Phase 7 rig: Patch C v2 still wedges chip if I missed anything: now mitigated by ohm's new wired interface (enu1, 192.168.88.80) — survives bes2600 wedges, lets us collect dmesg / ftrace / journalctl from a wedged ohm without reboot. See reference_ohm_wired_iface memory.

§7 Phase 5 review handover

PR on git.reauktion.de/marfrit/besser, this file as the artifact (per feedback_phase5_surface_is_pr). Specifically request reviewer focus on §2 shared-state delta table — that's the part that should have caught Patch C's bug. Don't curate.

§8 Phase 6 implementation order

  1. Branch off cleanups on bes2600-dkms-mobian: bes2600/atomic-tx-buf-counters (= Patch C-prep).
  2. Mechanical refactor: int hw_bufs_usedatomic_t hw_bufs_used, all reads → atomic_read, all writes → atomic ops. Same for vif array and tx_pending array. No other changes.
  3. Build, install, smoke-test. Phase 7 of C-prep. Should be a no-op delta.
  4. PR + Phase 5 review + merge.
  5. Branch off C-prep: bes2600/sdio-rx-direct-deliver-v2 (= Patch C v2).
  6. Re-apply the Patch C delta (3 files: bh.h, bh.c, bes2600_sdio.c — same edits as PR #3).
  7. Build, install, Phase 7 N=3 stress ramp.
  8. PR + Phase 5 review + merge.

§9 Phase 7 v2 protocol (per feedback_phase7_stress_ramp + wired-rig)

  1. Pre-C-prep baseline rep N=3 (re-anchor, since current N=1 baseline is from run-20260507-patchC-preflight).
  2. Apply C-prep, N=3. Compare to pre. Expect: zero meaningful delta. If non-zero → memory-ordering bug, loop back to §4 atomic-ordering choice.
  3. Apply C v2, N=3. Compare to C-prep baseline. Expect: §4.5 of original Patch C plan's predicted delta (rx_queue lock acquires → 0, observed RX KB/s lifts toward ≥1 MB/s sustained @ 4MB/s).
  4. All Phase 7 stress runs use the wired path (ssh mfritsche@192.168.88.80) for telemetry collection. When the chip wedges (it shouldn't this time, but planning for it), wlan0 stops responding but enu1 stays alive. Collect dmesg / ftrace / journalctl over enu1 BEFORE rebooting. This is the data we lost in Patch C boot -1 because wlan0 was the only path.
  5. N=3 reps per phase per feedback_phase7_stress_ramp. Don't accept N=1 as verification.

§10 Closeout

If C-prep + C v2 both pass Phase 7: proceed to D (ba_lock atomicization), E (ps_state_lock skip). Markus's "we're not on the clock" applies — sequencing per bisection clarity, not delivery deadline.


Plan written 2026-05-07 by Claude (noether), in response to Patch C Phase 7 failure. Phase 5 review = PR comments on this artifact at git.reauktion.de/marfrit/besser. Don't curate the shared-state delta table for the reviewer — that's the part the previous round's reviewer should have caught me on.