Files
besser/notes/patch-c-v2-phase4-plan-2026-05-07.md
T
claude-noether 0b63ca3c24 notes: Patch C v2 Phase 4 plan — atomic_t prep + direct-deliver
Phase 7 of Patch C (PR #9 → bes2600-dkms PR #3 → boot -1 of ohm
20:18:10) failed with a thread-safety race: wsm_release_tx_buffer's
unlocked R-M-W on hw_bufs_used races against wsm_alloc_tx_buffer in
the bh thread when Patch C moved the RX-confirm decrement into
sdio_rx_work.  WARN storm at +13s under stress, chip wedges, host
off-network.

Phase 6 contract analysis cited wsm_handle_rx's sleepability and
held-lock invariants but stopped at the function signature.  Did not
enumerate hw_bufs_used as shared state mutated by the callee.  Lesson
saved as feedback_phase6_contract_threadsafety memory.

Phase 4 v2 designs around that gap.  Two-step:

1. Patch C-prep: NFC refactor — convert hw_bufs_used,
   hw_bufs_used_vif[], wsm_tx_pending[] from int / int[] to atomic_t /
   atomic_t[].  Use atomic_fetch_sub_release in wsm_release_tx_buffer
   (returns prior value for the >= numInpChBufs - 1 predicate).
   Mechanical atomic_read swap at ~58 read sites.  Lands first;
   Phase 7 should show zero delta from baseline.

2. Patch C v2: re-apply the sdio_rx_work direct-deliver on top of
   C-prep.  Identical structural change to the closed PR #3, but now
   the racing counter is safe.  Contract block in
   bes2600_bh_handle_rx_skb expanded to include the shared-state
   delta table.

Plan §2 is the shared-state delta table — every field
bes2600_bh_handle_rx_skb mutates directly or transitively, with
current protection and required action.  3 fields need atomic_t,
the rest are already concurrency-safe or stay single-writer.

Plan §6 lists 6 risks including memory-ordering choices, the
inc/dec_pending_count timer-decision race, and the new wired-rig
fallback (enu1 192.168.88.80) that survives bes2600 wedges so Phase 7
can capture dmesg / ftrace from a wedged ohm without reboot.

PR superseded #3 closed with full verdict comment.  Phase B rolled
back on ohm at /lib/modules/.../extra/bes2600.ko.  Markus's reboot
button to land Patch B again before C-prep work begins.
2026-05-07 20:50:39 +02:00

137 lines
12 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Patch C v2 — Phase 4 Plan: atomic_t prep + direct-deliver
**Author:** Claude (noether)
**Status:** Phase 4 v2 — Phase 7 of Patch C (notes/patch-c-phase4-plan-2026-05-07.md, PR #9 merged) failed with a thread-safety race; this is the redesign.
**Decision:** Option B from PR #3 close-out comment — `atomic_t` prep refactor first, direct-deliver on top.
---
## §0 What just happened (Phase 7 of Patch C)
Reproduced verbatim from boot -1 of ohm 2026-05-07 20:18:10 CEST, ~13 s into a 4 MB/s nc stress:
```
WARNING: at wsm_release_tx_buffer+0x84/0xa0 [bes2600], CPU#0: kworker/0:3H/3912
Workqueue: bes_sdio sdio_rx_work [bes2600]
pc : wsm_release_tx_buffer+0x84/0xa0 [bes2600]
lr : bes2600_bh_handle_rx_skb+0x134/0x370 [bes2600]
sdio_rx_work+0x2a8/0x540 [bes2600]
bes2600_wlan: wsm_release_tx_buffer failed: -1
```
Storm continued; chip wedged; ohm fell off the WiFi (wlan0). Patch C module preserved at `/var/tmp/bes2600.patchC-broken.ko` for forensics. Patch B rolled back, currently on disk on ohm. Lesson saved as `feedback_phase6_contract_threadsafety` memory.
## §1 Why it failed
`wsm_release_tx_buffer()` (bh.c:222243) does **unlocked** readmodifywrite on `hw_priv->hw_bufs_used`. Pre-Patch-C invariant was single-writer = BH thread; the lock that mattered was structural, not annotated. Patch C's direct-deliver moved one writer (RX-confirm decrement) into `sdio_rx_work` workqueue context. BH thread + sdio_rx_work race on the int counter; underflow below zero, WARN, return -1, bookkeeping corrupt, TX wedges.
Phase 6 contract block correctly cited `wsm_handle_rx`'s sleepability and held-lock invariants — but stopped at the called function's signature. It did not enumerate `hw_bufs_used` as shared state mutated by the callee. That's the gap.
## §2 Shared-state delta table (the thing missing from Patch C)
Every field that `bes2600_bh_handle_rx_skb` mutates either directly or transitively, with current protection and required action:
| field | declared at | written by (today) | written by (after Patch C v2) | current protection | action needed |
|---|---|---|---|---|---|
| `hw_priv->hw_bufs_used` | bes2600.h | `wsm_alloc_tx_buffer` (bh thread, TX submit), `wsm_release_tx_buffer` (bh thread, RX confirm), `main.c:543` (init) | + `wsm_release_tx_buffer` from sdio_rx_work | single-writer = BH thread (structural) | **convert to `atomic_t`** |
| `hw_priv->hw_bufs_used_vif[i]` | bes2600.h | `wsm_release_vif_tx_buffer` (bh thread), `bh.c:1271` (vif TX submit), init | + `wsm_release_vif_tx_buffer` from sdio_rx_work | single-writer = BH thread | **convert to `atomic_t [N]`** |
| `hw_priv->wsm_rx_seq[i]` | bes2600.h | bh thread RX | sdio_rx_work only | single-writer = BH/sdio_rx context (was BH, now is sdio_rx_work, but still **one writer**) | OK — single writer |
| `hw_priv->wsm_tx_pending[i]` | bes2600.h | `bes2600_bh_inc_pending_count` (TX submit, BH thread), `bes2600_bh_dec_pending_count` (RX confirm) | dec moves to sdio_rx_work; inc stays BH | single-writer = BH | **also needs `atomic_t`** |
| `hw_priv->lmac_mon_timer` / `mcu_mon_timer` | bes2600.h | mod_timer / del_timer_sync from BH | ditto from sdio_rx_work | timer API is internally locked | OK — `mod_timer` is concurrency-safe |
| `hw_priv->wsm_cmd.lock` (taken inside wsm_handle_rx) | wsm_buf | bh thread (today) | sdio_rx_work | spinlock | OK — already protected |
| `hw_priv->vif_lock` (taken inside wsm_handle_rx for some paths) | per vif | bh thread today | sdio_rx_work | spinlock | OK |
| `priv->bh_evt_wq` wake-up | bes2600.h | wsm_release_tx_buffer when count hits 0 | ditto from sdio_rx_work | wake_up is concurrency-safe | OK |
| `bes2600_pwr_clear_busy_event` (called inside release) | bes_pwr | bh thread | sdio_rx_work | internal locking via `bes_power.lock` | OK |
| `hw_priv->buf_released` | bes2600.h | only `wsm_release_buffer_to_fw` (MCAST_FWDING ifdef, AP-only) | unchanged — BH only | single-writer = BH | OK — not on Patch C v2 hot path |
**Three fields require atomic_t conversion:** `hw_bufs_used`, `hw_bufs_used_vif[]`, `wsm_tx_pending[]`. Everything else is already concurrency-safe or moves cleanly to single-writer-in-sdio_rx_work.
## §3 Read-site survey (the rest of the work — atomic_read swaps)
`grep -hE "hw_bufs_used\b|hw_bufs_used_vif\b" *.c *.h | wc -l` = **57 references** across the source tree:
- 5 writers (above)
- 52 readers — converted mechanically to `atomic_read()`. Distribution:
- `bh.c`: 22 read sites (most in the bh main loop, BUG_ON gates, idle / suspend predicates)
- `sta.c`: 3 sites (PM idle check at sta.c:12311253)
- `bes2600_sdio.c`: 1 site (PM idle check at line 958)
- `main.c`: 2 sites (init zero, teardown wait)
- `debug.c`: 1 site (debugfs stats)
- `itp.c`: 1 site (test mode)
`wsm_tx_pending[i]` site count is smaller — ~6 references, all in bh.c and the timer monitors. Same mechanical conversion.
## §4 Plan v2 — two-step
**Patch C-prep** (NFC, lands first):
- Convert `hw_bufs_used` from `int``atomic_t`.
- Convert `hw_bufs_used_vif[CW12XX_MAX_VIFS]` from `int[]``atomic_t[]`.
- Convert `wsm_tx_pending[2]` from `int[]``atomic_t[]`.
- Update writers:
- `wsm_alloc_tx_buffer`: `atomic_inc(&hw_priv->hw_bufs_used)`.
- `wsm_release_tx_buffer`: rewrite with `atomic_fetch_sub_release(count, &hw_priv->hw_bufs_used)` — returns prior value. Re-derive the "tx restart" predicate (`prior >= numInpChBufs - 1`) and the "wake bh_evt_wq + clear busy" predicate (`prior - count == 0`) from that. WARN if `prior - count < 0`.
- `wsm_release_vif_tx_buffer`: same pattern on the array element.
- `bes2600_bh_inc/dec_pending_count`: use `atomic_inc` and `atomic_dec_return` (need post-decrement value to decide whether to del_timer).
- Update all 52+6 read sites: mechanical `atomic_read()` swap.
- `main.c:543` init: `atomic_set(&hw_priv->hw_bufs_used_vif[i], 0)`.
**Patch C-prep does NOT change behaviour.** Same atomic ordering (`_release` / `_acquire` chosen to match the implicit memory ordering the BH-only path had). Phase 7 of C-prep alone should show **identical** numbers to pre-patch baseline (`run-20260507-patchC-preflight`): 1.36 MB/s, 86.4 sdio_rx_work/sec, 90.3 dispatches per 1000 RX pkts, 0 bh_work redispatches. If Phase 7 of C-prep shows a delta, the atomic ordering is wrong and we loop back here, not to C v2.
**Patch C v2** (the actual structural change, lands on top of C-prep):
- Identical to Patch C as merged in PR #3 (since closed): direct-deliver from `bes2600_sdio_extract_packets` into `bes2600_bh_handle_rx_skb`, no `rx_queue` indirection, no bh wake-up for RX.
- The contract block in `bh.c::bes2600_bh_handle_rx_skb` is **expanded** to include the shared-state delta table from §2 of this plan, with explicit citations.
- Same minimum-diff scope as Patch C: keep `rx_queue`, `pipe_read`, `bh_rx_helper` for clean bisection; remove in a follow-up hygiene patch.
## §5 What will NOT be touched (deferred or out of scope)
- mac80211-side `ieee80211_rx_irqsafe``ieee80211_rx_list` migration: that's Patch C2, gated on Task #19 kerneldoc verification.
- The `#if 0` graveyard in bh.c, the `asm volatile("nop")` placeholder, the BUG_ON in steady-state hot path: still symptom-shaped per `feedback_dont_patch_downstream_artifacts`. Re-evaluate at Task #24 after C v2 / D / E land.
- `ba_lock` (Patch D) and `ps_state_lock` (Patch E): independent.
## §6 Risk list (per Phase 6 contract-thread-safety memory)
1. **C-prep memory ordering**: I've chosen `atomic_fetch_sub_release` for `wsm_release_tx_buffer` to mirror the implicit BH-thread ordering (release before subsequent atomic ops on `bh_evt_wq` / `bes_power`). If the BH thread or other readers expect `_acquire` semantics on the value, we get reordering bugs that are hard to reproduce. **Mitigation:** pair with `_acquire` reads where the read-then-decision pattern is critical (e.g., the bh main loop's `if (!hw_priv->hw_bufs_used)` idle predicate). Cite the kerneldoc reference for `atomic_fetch_sub_release` in the commit message.
2. **`wsm_tx_pending[]` decrement-side timer interaction**: `bes2600_bh_dec_pending_count` does `if (--hw_priv->wsm_tx_pending[idx] == 0) del_timer_sync(timer); else mod_timer(timer, ...)`. After atomic_t conversion: `if (atomic_dec_return(&hw_priv->wsm_tx_pending[idx]) == 0) ...`. But *another* thread could `atomic_inc` between our dec and the timer call, racing the del_timer. `del_timer_sync` is internally safe (it can be called concurrently with `mod_timer`), but the **decision** "whether to delete vs mod" is racy. **Mitigation:** even after atomic conversion, this function still needs to be called from a single context. Verify `inc/dec_pending_count` callers — if both sides only fire from BH and sdio_rx_work and never overlap on the same idx, we're fine; if not, this needs a lock.
3. **`hw_bufs_used_vif[]` array vs `wsm_alloc_tx_buffer`**: vif counter increment lives at bh.c:1271, called from bh thread TX-submit path. Decrement (`wsm_release_vif_tx_buffer`) called from RX-confirm. After Patch C v2 the decrement is in sdio_rx_work — same race shape as the global counter. Already covered by the atomic_t array conversion.
4. **PM idle predicate at sta.c:1239**: reads `hw_priv->hw_bufs_used_vif[priv->if_id]` to decide can-sleep. Currently racy (was already reading BH-mutated state from a non-BH PM context). Atomic conversion makes the read coherent. PM context's read-then-decide is still fundamentally a snapshot — no change in semantics, just no torn-read.
5. **Reboot / module-unload teardown** (`main.c:840`): `wait_event_timeout(... !hw_priv->hw_bufs_used ...)`. Becomes `... !atomic_read(...)`. No semantic change — the wait_event macro re-evaluates the predicate on each wake.
6. **Phase 7 rig: Patch C v2 still wedges chip if I missed anything**: now mitigated by ohm's new wired interface (enu1, 192.168.88.80) — survives bes2600 wedges, lets us collect dmesg / ftrace / journalctl from a wedged ohm without reboot. See `reference_ohm_wired_iface` memory.
## §7 Phase 5 review handover
PR on git.reauktion.de/marfrit/besser, this file as the artifact (per `feedback_phase5_surface_is_pr`). Specifically request reviewer focus on §2 shared-state delta table — that's the part that should have caught Patch C's bug. Don't curate.
## §8 Phase 6 implementation order
1. Branch off `cleanups` on bes2600-dkms-mobian: `bes2600/atomic-tx-buf-counters` (= Patch C-prep).
2. Mechanical refactor: `int hw_bufs_used``atomic_t hw_bufs_used`, all reads → `atomic_read`, all writes → atomic ops. Same for vif array and tx_pending array. No other changes.
3. Build, install, smoke-test. Phase 7 of C-prep. Should be a no-op delta.
4. PR + Phase 5 review + merge.
5. Branch off C-prep: `bes2600/sdio-rx-direct-deliver-v2` (= Patch C v2).
6. Re-apply the Patch C delta (3 files: bh.h, bh.c, bes2600_sdio.c — same edits as PR #3).
7. Build, install, Phase 7 N=3 stress ramp.
8. PR + Phase 5 review + merge.
## §9 Phase 7 v2 protocol (per `feedback_phase7_stress_ramp` + wired-rig)
1. Pre-C-prep baseline rep N=3 (re-anchor, since current N=1 baseline is from `run-20260507-patchC-preflight`).
2. Apply C-prep, N=3. Compare to pre. Expect: zero meaningful delta. If non-zero → memory-ordering bug, loop back to §4 atomic-ordering choice.
3. Apply C v2, N=3. Compare to C-prep baseline. Expect: §4.5 of original Patch C plan's predicted delta (rx_queue lock acquires → 0, observed RX KB/s lifts toward ≥1 MB/s sustained @ 4MB/s).
4. **All Phase 7 stress runs use the wired path (`ssh mfritsche@192.168.88.80`) for telemetry collection.** When the chip wedges (it shouldn't this time, but planning for it), wlan0 stops responding but enu1 stays alive. Collect dmesg / ftrace / journalctl over enu1 BEFORE rebooting. This is the data we lost in Patch C boot -1 because wlan0 was the only path.
5. N=3 reps per phase per `feedback_phase7_stress_ramp`. Don't accept N=1 as verification.
## §10 Closeout
If C-prep + C v2 both pass Phase 7: proceed to D (ba_lock atomicization), E (ps_state_lock skip). Markus's "we're not on the clock" applies — sequencing per bisection clarity, not delivery deadline.
---
*Plan written 2026-05-07 by Claude (noether), in response to Patch C Phase 7 failure. Phase 5 review = PR comments on this artifact at git.reauktion.de/marfrit/besser. Don't curate the shared-state delta table for the reviewer — that's the part the previous round's reviewer should have caught me on.*