bes2600: gate PM indication completion on pending request and track chip state
When mac80211 toggles PSM on the BES2600, the host sends WSM set_pm
and waits up to 5 s on bes_power.pm_enter_cmpl for a firmware-side
PM-changed indication confirming the transition. Three sequenced
flaws make the wait-and-confirm racy and leave host/chip bookkeeping
desynced when anything misfires:
1) bes2600_pwr_notify_ps_changed() unconditionally fires
complete(pm_enter_cmpl) for any non-active psmode. It does not
check whether a host-initiated set_pm is actually pending. A
spontaneous indication (firmware-internal coex move,
idle-driven aging) primes the completion, and the next host-
driven enter_lp_mode sees a false success on its first
wait_for_completion_timeout.
2) The wait/reinit ordering in bes2600_pwr_enter_lp_mode is
status = wait_for_completion_timeout(...);
atomic_set(pm_set_in_process, 0);
reinit_completion(...);
If an indication arrives between wait_for_completion_timeout
returning with status==1 and reinit_completion, the next
enter_lp_mode iteration's wait can also see false success. The
reinit must happen *before* we start the new request, not
after handling the previous one.
3) On wait_pm_ind timeout, the driver returns -ETIMEDOUT and walks
away. It does not record that the firmware's actual PM state
is no longer known to the host. Subsequent wake paths
(gpio_wake / sbus_active) assume the chip is still active and
hit deterministic SDIO failures when the firmware has
transitioned anyway.
This patch is the safe-prerequisite half of a wider fix:
* bes_pwr.h gains enum bes2600_chip_pm_state {ACTIVE, LP, UNKNOWN}
and bes_power.chip_pm_state. Its job is to track what the host
has *seen the firmware confirm*, not what the host has
requested. Initialised to ACTIVE in bes2600_pwr_init().
* bes2600_pwr_notify_ps_changed() unconditionally updates
chip_pm_state on every indication, but only fires
complete(pm_enter_cmpl) when atomic_cmpxchg(pm_set_in_process,
1, 0) succeeds. A spontaneous indication can no longer prime a
waiter that will only set up its request afterwards.
* bes2600_pwr_enter_lp_mode() now reinit_completion()s before
setting pm_set_in_process and sending wsm_set_pm. After a
timeout, it cmpxchgs pm_set_in_process back to 0 (so a late
indication cannot prime the next iteration) and on the win-
cmpxchg branch records chip_pm_state=UNKNOWN.
A follow-up patch consumes chip_pm_state on the wake side
(bes2600_pwr_device_exit_lp_mode + bes2600_gpio_wakeup_mcu) to fix
the deterministic "active mcu fail" cycle this state-record
enables a fix for. Splitting the work this way keeps the lock-free
race fix small and reviewable on its own.
No new locks, no behaviour change on the success path. Only the
recovery path (timeout + spontaneous indication) gains correctness.
Signed-off-by: Markus Fritsche <fritsche.markus@gmail.com>
This commit is contained in:
@@ -64,6 +64,20 @@ enum power_down_state
|
||||
POWER_DOWN_STATE_UNLOCKED,
|
||||
};
|
||||
|
||||
/*
|
||||
* Confirmed PM state of the firmware-side chip. Tracks what the host
|
||||
* has *seen* the firmware acknowledge, not what the host has
|
||||
* requested. UNKNOWN means a host-initiated transition timed out
|
||||
* before the firmware indication arrived; the next wake path should
|
||||
* treat it as "we don't know" and probe before issuing GPIO/SDIO
|
||||
* wakeup ops.
|
||||
*/
|
||||
enum bes2600_chip_pm_state {
|
||||
BES2600_CHIP_PM_ACTIVE = 0,
|
||||
BES2600_CHIP_PM_LP,
|
||||
BES2600_CHIP_PM_UNKNOWN,
|
||||
};
|
||||
|
||||
typedef void (*bes_pwr_enter_lp_cb)(struct bes2600_common *hw_priv);
|
||||
typedef void (*bes_pwr_exit_lp_cb)(struct bes2600_common *hw_priv);
|
||||
|
||||
@@ -106,6 +120,7 @@ struct bes2600_pwr_t
|
||||
bool ap_lp_bad;
|
||||
struct bes2600_pwr_event_t pwr_events[BES2600_DELAY_EVENT_NUM];
|
||||
atomic_t pm_set_in_process;
|
||||
atomic_t chip_pm_state;
|
||||
};
|
||||
|
||||
#ifdef CONFIG_BES2600_WOWLAN
|
||||
|
||||
Reference in New Issue
Block a user