bes2600_join_work calls ieee80211_bss_get_elem without rcu_read_lock — suspicious RCU usage at net/wireless/util.c:1078 #23

Open
opened 2026-05-20 17:24:55 +00:00 by marfrit · 0 comments

Symptom

With CONFIG_PROVE_LOCKING=y + CONFIG_LOCKDEP=y + CONFIG_DEBUG_LOCK_ALLOC=y enabled, bes2600_join_work triggers a WARNING: suspicious RCU usage early in boot (~t=43s on ohm) the first time it tries to look up the joining BSS:

WARNING: suspicious RCU usage
7.0.0-danctnix1-5-pinetab2-danctnix-besser-lockdep #1 Tainted: G         C
net/wireless/util.c:1078 suspicious rcu_dereference_check() usage!

rcu_scheduler_active = 2, debug_locks = 1
2 locks held by kworker/u16:3/54:
 #0: ((wq_completion)bes2600_wq){+.+.}-{0:0}, at: process_one_work
 #1: ((work_completion)(&priv->join_work)){+.+.}-{0:0}, at: process_one_work

stack backtrace:
  lockdep_rcu_suspicious+0x170/0x200
  ieee80211_bss_get_elem+0x94/0xa0 [cfg80211]
  bes2600_join_work+0xf0/0x4e0 [bes2600]
  process_one_work+0x238/0x7d0
  worker_thread+0x1a8/0x358
  kthread+0x130/0x148
  ret_from_fork+0x10/0x20

Root cause

ieee80211_bss_get_elem() is an RCU-protected accessor: it does an rcu_dereference() internally and expects the caller to be inside an RCU read-side critical section, normally by holding rcu_read_lock(). bes2600_join_work calls it at bes2600_join_work+0xf0 without first taking rcu_read_lock(), so PROVE_LOCKING's rcu_dereference_check() flags it. The "locks held" list in the splat agrees: it shows only the two workqueue lockdep maps and no RCU read lock.

Source: bes2600/sta.c — look for the ieee80211_bss_get_elem(bss, ...) call site inside bes2600_join_work (likely around the SSID/IE extraction from the cfg80211_bss * that mac80211 passed in).
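
To make the check concrete, here is a minimal userspace model of what lockdep is complaining about. It is only a sketch: the nesting counter, the warning counter, the stripped-down structs and the helper model_bss_ies() are all assumptions made for illustration, not the cfg80211 or RCU implementation. rcu_read_lock(), rcu_read_unlock() and rcu_read_lock_held() are real kernel names, stubbed here so the file compiles outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the RCU read-side primitives (not the kernel ones). */
static int rcu_nesting;   /* depth of nested read-side critical sections */
static int rcu_warnings;  /* counts modelled "suspicious RCU usage" splats */

static void rcu_read_lock(void)      { rcu_nesting++; }
static void rcu_read_unlock(void)    { rcu_nesting--; }
static bool rcu_read_lock_held(void) { return rcu_nesting > 0; }

/* Heavily simplified mocks of the cfg80211 types. */
struct cfg80211_bss_ies { size_t len; };
struct cfg80211_bss { struct cfg80211_bss_ies *ies; };

/*
 * Hypothetical model of the checked dereference inside the accessor:
 * the pointer is still returned, but a warning is recorded when the
 * caller is not inside a read-side critical section.
 */
static struct cfg80211_bss_ies *model_bss_ies(struct cfg80211_bss *bss)
{
	assert(bss != NULL);
	if (!rcu_read_lock_held())
		rcu_warnings++;
	return bss->ies;
}
```

Calling model_bss_ies() bare, as bes2600_join_work effectively does, bumps rcu_warnings; the same call between rcu_read_lock() and rcu_read_unlock() does not.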

Provenance

Pre-exists the besser patch stack — bes2600_join_work is upstream c5x code, not added by any of the besser fixes (Patches A–H, c5.1–c7, etc.). Only surfaced now because Phase 7 of besser#18 ran with PROVE_LOCKING enabled, which is normally off in production builds.

Fix

Wrap the ieee80211_bss_get_elem() call and every subsequent dereference of its returned pointer in rcu_read_lock() / rcu_read_unlock(). The returned IE pointer is RCU-protected and only valid inside the read-side critical section, so inspect the surrounding code to confirm that whatever it needs from the IE is either copied out before rcu_read_unlock() or fully consumed before then.

The pattern in upstream cw1200, which performs the same lookup, should be a clean reference. Per feedback_mine_upstream_ancestor, check drivers/net/wireless/st/cw1200/sta.c in mainline first.
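
The copy-out shape described above could look roughly like the sketch below. This is not the actual bes2600 code and has not been checked against sta.c: the helper name copy_ssid_under_rcu() is hypothetical, the choice of the SSID element is a guess based on the call site description, and the RCU primitives and cfg80211 types are userspace mocks so the example compiles on its own. In the kernel, bss->ies is an RCU pointer to a struct cfg80211_bss_ies rather than a flat buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WLAN_EID_SSID          0
#define IEEE80211_MAX_SSID_LEN 32

/* Userspace stand-ins for the RCU read-side primitives. */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Same layout as the kernel's struct element: id, length, payload. */
struct element { uint8_t id; uint8_t datalen; uint8_t data[]; };

/* Mock: a flat IE buffer instead of the kernel's RCU-managed ies pointer. */
struct cfg80211_bss { const uint8_t *ies; size_t ies_len; };

/* Mock accessor: walks the TLVs and insists on a read-side section. */
static const struct element *
ieee80211_bss_get_elem(struct cfg80211_bss *bss, uint8_t id)
{
	size_t pos = 0;

	assert(rcu_depth > 0); /* stands in for the lockdep check */
	while (pos + 2 <= bss->ies_len) {
		const struct element *e =
			(const struct element *)(bss->ies + pos);

		if (pos + 2 + e->datalen > bss->ies_len)
			break; /* truncated element, stop parsing */
		if (e->id == id)
			return e;
		pos += 2 + (size_t)e->datalen;
	}
	return NULL;
}

/*
 * Hypothetical helper: look up the SSID element and copy its payload
 * into a caller-owned buffer before leaving the read-side section.
 * Returns the number of bytes copied, or 0 if absent or too large.
 */
static size_t copy_ssid_under_rcu(struct cfg80211_bss *bss,
				  uint8_t *ssid, size_t cap)
{
	const struct element *elem;
	size_t len = 0;

	rcu_read_lock();
	elem = ieee80211_bss_get_elem(bss, WLAN_EID_SSID);
	if (elem && elem->datalen <= cap) {
		len = elem->datalen;
		memcpy(ssid, elem->data, len);
	}
	rcu_read_unlock(); /* elem must not be touched past this point */

	return len;
}
```

After the helper returns, the caller works only with its private copy, so nothing RCU-protected escapes the critical section. If the real call site needs more than one element, the same section can cover all lookups as long as it does not sleep.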

Severity

Low runtime impact in production (kernels without PROVE_LOCKING just dereference without the check), but a real correctness bug: cfg80211 can replace the BSS IE buffer under RCU and free the old one after a grace period, so a reader outside a read-side critical section can end up dereferencing stale or already-freed IE data. Worth fixing for both lockdep cleanliness and actual correctness.

Repro

  • Build linux-pinetab2-danctnix-besser with CONFIG_PROVE_LOCKING=y + CONFIG_LOCKDEP=y + CONFIG_DEBUG_LOCK_ALLOC=y (i.e. the lockdep sibling kernel used for besser#18 Phase 7).
  • Boot, let wpa_supplicant attempt to associate to any WPA-secured SSID.
  • sudo dmesg | grep -B1 -A30 'WARNING: suspicious RCU usage' will show the splat within ~30s of wlan0 coming up.

Observed 2026-05-20 on ohm, running 7.0.0-danctnix1-5-pinetab2-danctnix-besser-lockdep.

Reference: marfrit/besser#23