Compare commits

...

2 Commits

Author SHA1 Message Date
test0r 98ca36e6b7 fleet/ohm: pkgrel=6 reproducible from manifest (cumulative-pkgrel6-danctnix)
Replace the pkgrel=3-era cumulative-c5x-danctnix include with
cumulative-pkgrel6-danctnix — a single squashed diff representing
the bes2600 driver source state on ohm as of 2026-05-21.

Also drop:
- arch/arm64/scs-arm-neon-build-fix/ (removed in pkgrel=4)
- driver/bes2600/queue-pending-record-lock-bh-danctnix/ (in cumulative)
- driver/bes2600/tx-sdio-dma-oob-danctnix/ (in cumulative)
- driver/bes2600/join-confirm-reset-danctnix/ (in cumulative)

Resulting manifest: just two includes:
  driver/bes2600/cumulative-pkgrel6-danctnix/
  driver/bes2600/scan-filter-5ghz-danctnix/

Verified: ka-promote ohm produces a 136KB cumulative.patch that,
when applied to a fresh v7.0-danctnix1 staging tree, yields
drivers/staging/bes2600 source bit-identical to the pkgrel=6 build
on boltzmann. The only diff is build artifacts (.o, .cmd, .mod, etc.).

Kernel-agent can now generate the ohm-live source state without
reaching into the besser repository.

Closes ka#29 (per-series reconstruction tracking issue) by
delivering the deterministic-rebuild capability the original
per-series mirrors were meant to provide.

Signed-off-by: Claude (noether) <claude@reauktion.de>
2026-05-21 13:05:48 +02:00
test0r 3d15c5367d fleet/ohm: pkgrel=6 — per-series converged with tx-sdio-dma-oob + join-confirm-reset
Two additions to fleet/ohm.yaml's includes for the bes2600 driver scope:

1. driver/bes2600/tx-sdio-dma-oob-danctnix/ — already on disk from
   ka#17 but not previously included. The cumulative-c5x-danctnix
   shipped in pkgrel=3 did NOT have this fix; pkgrel=4 per-series
   regressed because the staging-prep series was excluded. KFENCE
   caught the OOB during pkgrel=4 soak; pkgrel=5 included it.

2. driver/bes2600/join-confirm-reset-danctnix/ — NEW scope.
   cw1200 ancestor port (sta.c:1339-1344) with bes2600-specific
   PASSIVE-gate compensation in bes2600_unjoin_work. Closes
   besser#25. Verified pkgrel=6 srcversion 0E16463F: cascade gone,
   periodic ~600ms latency jitter also gone (same root cause).

Status note: per-series reconstruction is now converged. The
cumulative-c5x-danctnix entry is left as historical fallback;
ka#29's blocker (per-series mirrors not applying cleanly) was
resolved by manually reconstructing the per-series in
marfrit/bes2600-dkms bes2600/join-confirm-failure-reset (top
commit 3d833f8).

Build still hand-managed via boltzmann:~/src/besser/marfrit-besser/
danctnix-besser-pkgbuild/kernel/PKGBUILD; ka-promote / ka-build
template rendering still pending per the original TODOs.

Signed-off-by: Claude (noether) <claude@reauktion.de>
2026-05-21 12:23:47 +02:00
5 changed files with 3982 additions and 12 deletions
+22 -12
View File
@@ -1,6 +1,6 @@
# kernel-agent manifest for ohm (PineTab2 / Rockchip RK3566 + BES2600 SDIO WiFi/BT)
#
# Status: scaffolding from 2026-05-16. Patches/scopes are mirrored;
# Status: scaffolding from 2026-05-16; per-series patchset converged 2026-05-21 (pkgrel=6). Patches/scopes are mirrored;
# the build pipeline (cumulative-patch generation, makepkg invocation,
# sign+publish) still relies on the hand-managed flow in
# boltzmann:~/src/besser/marfrit-besser/danctnix-besser-pkgbuild/kernel/.
@@ -32,6 +32,13 @@ baseline:
# mixed-prefix headers (a/drivers/staging/bes2600/... b/bes2600/...).
# They do NOT apply cleanly against the linux-pinetab2 baseline.
#
# 2026-05-21 update: per-series reconstruction (besser#22) completed
# 2026-05-21; pkgrel=6 (srcversion 0E16463F) on ohm soak-passed with
# the bounce-buffer + join-confirm-reset additions. The per-series
# manifest below is the authoritative set; cumulative-c5x-danctnix
# remains as historical fallback only.
#
# Until the per-series mirrors are reconstructed (kernel-agent followup
# issue), the bes2600 driver scope is satisfied by a single-file
# cumulative captured from the working hand-managed
@@ -39,19 +46,22 @@ baseline:
# patches/driver/bes2600/cumulative-c5x-danctnix/README.md). This is
# the c5x stack as it shipped in pkgrel=3 on 2026-05-18.
includes:
# bes2600 driver (c5x stack as shipped in pkgrel=3) — single-file
# interim cumulative; per-series reconstruction tracked separately.
- driver/bes2600/cumulative-c5x-danctnix/
# bes2600 driver pkgrel=6 cumulative: 22 commits squashed, equivalent
# to marfrit/bes2600-dkms bes2600/join-confirm-failure-reset (top
# commit 3d833f8) overlaid on v7.0-danctnix1 staging tree. Produces
# srcversion 0E16463FA8D85F4704DE93F — bit-identical to the kernel
# running on ohm as of 2026-05-21.
#
# Includes c5.x stack, Patches A/B/F1-3/C/G/D/E/C2/H, besser#18
# (pending_record_lock SOFTIRQ-safe), bus_reset EXPORT_SYMBOL_GPL
# (danctnix btuart bridge), tx-sdio-dma-oob (KFENCE bounce-buffer),
# and besser#25 (wsm_join_confirm reset).
#
# Replaces the pkgrel=3 era cumulative-c5x-danctnix/, which is kept
# on disk for historical reference but no longer applied.
- driver/bes2600/cumulative-pkgrel6-danctnix/
# close besser#1 — refuse multi-channel 5 GHz scans at driver boundary.
- driver/bes2600/scan-filter-5ghz-danctnix/
# GCC 15.2.1 build-fix for arm_neon.h + SHADOW_CALL_STACK interaction.
# Runtime no-op as long as the config has CONFIG_SHADOW_CALL_STACK=n
# (current ohm setting). Kept in the manifest for the day SCS gets
# re-enabled. See reference_arm64_scs_arm_neon_gcc15 memory.
- arch/arm64/scs-arm-neon-build-fix/
# close besser#18 — pending_record_lock SOFTIRQ-safe -> -unsafe inversion.
# Mirror of marfrit/bes2600-dkms#11 (d95453c). 5-site spin_lock -> _bh.
- driver/bes2600/queue-pending-record-lock-bh-danctnix/
# Explicitly NOT included (decision logged):
# - debian-copyright-fsf-address: Debian packaging metadata, not kernel
@@ -0,0 +1,48 @@
# bes2600/cumulative-pkgrel6-danctnix
Single-file cumulative diff representing the bes2600 driver source state
that produces srcversion `0E16463FA8D85F4704DE93F` (pkgrel=6 on ohm,
soak-verified 2026-05-21).
## Equivalent commit chain
Squash of 22 commits from `marfrit/bes2600-dkms` branch
`bes2600/join-confirm-failure-reset` (top commit `3d833f8`):
| # | Commit | Patch |
|---|---|---|
| 1-7 | 4fec8b2..3942404 | c5.x scan-defer / firmware-recovery stack |
| 8-13 | 91640bd..73191b7 | Patch A, B, F1-3, C v3 |
| 14 | a02f8b7 | Patch G (SPDX restore) |
| 15-16| 93f2aab, dd01be0 | Patch D, E |
| 17 | 447240c | Patch C2 |
| 18 | dc13f5d | Patch H (bh.c hygiene) |
| 19 | f469448 | besser#18 pending_record_lock SOFTIRQ-safe |
| 20 | 0792ba4 | bus_reset EXPORT_SYMBOL_GPL (danctnix bridge) |
| 21 | 49d9b77 | bounce SDIO TX buffers (DMA OOB / KFENCE fix) |
| 22 | 3d833f8 | wsm_join_confirm reset (besser#25) |
## Why a cumulative
These 22 commits are the converged per-series; while they exist as
individual scope dirs in `marfrit/bes2600-dkms`, several have
context-overlap rebase conflicts that make per-scope inclusion in
kernel-agent fragile (cf. ka#29 / besser#22 reconstruction debacle).
Shipping the cumulative as one file in kernel-agent guarantees the
applied source state on `v7.0-danctnix1` is bit-identical to the
pkgrel=6 build on ohm, without dragging the besser-repo branch state
into kernel-agent's resolution path.
## Apply order
This patch is the **base** for the bes2600 driver scope. The remaining
non-bes2600 patch (`scan-filter-5ghz-danctnix` for besser#1) layers on
top via the apply order in `fleet/ohm.yaml`.
## Provenance
Generated by `git diff e0d752a..bes2600/join-confirm-failure-reset --
bes2600/` against `marfrit/bes2600-dkms`, then path-rewritten
`bes2600/``drivers/staging/bes2600/`. The baseline `e0d752a`
corresponds to the v7.0-danctnix1 bes2600 staging tree.
@@ -0,0 +1,131 @@
From 3d833f8ccf31895a2ce7bf4fd4ef839e653b29bb Mon Sep 17 00:00:00 2001
From: Markus Fritsche <fritsche.markus@gmail.com>
Date: Thu, 21 May 2026 09:25:12 +0200
Subject: [PATCH 22/22] bes2600: reset firmware state on wsm_join_confirm
failure
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When wsm_join_confirm() returns status != WSM_STATUS_SUCCESS (ret 1),
the driver cleared its bookkeeping but did not reset the firmware
interface, leaving it in an intermediate post-rejection state. A rapid
second JOIN attempt (e.g. wpa_supplicant retrying after the
PREV_AUTH_NOT_VALID deauth that mac80211 emits to clean up) hits an
inconsistent firmware context, causing bes2600_sdio_read_rx_batch to
return SDIO error which cascades into wifi_force_close:
wsm_join_confirm ret 1
deauthenticating from <bssid> by local choice (Reason: 2=PREV_AUTH_NOT_VALID)
[~10 min later]
bes2600_sdio_read_rx_batch sdio read error
WARNING: at bes2600_tx_loop_set_enable / bes2600_chrdev_wifi_force_close
Two additions to the failure path in bes2600_join_work():
1. wsm_reset (WSM_REQ_ID_RESET, 0x000A) with reset_statistics=false.
This returns the firmware to IDLE so the next association attempt
starts from a known-clean state. bes2600_unjoin_work() performs the
same reset, but gates it on join_status != PASSIVE; after a failed
JOIN join_status stays PASSIVE, so that path never fires — call
wsm_reset directly here instead.
Contract: wsm_reset takes only wsm_cmd_lock (not conf_lock, not
wsm_oper_lock). wsm_oper_unlock was already called inside
wsm_join_confirm() before wsm_join() returned -EINVAL, so there is
no re-entrancy hazard. conf_lock is held at this call site, which is
compatible with wsm_reset's locking requirements.
2. queue_work(workqueue, &priv->unjoin_work) instead of direct
wsm_unlock_tx(). Serialises the next association attempt through
the workqueue so it cannot race against lingering firmware-side
effects of the failure. If unjoin_work is already queued, release
TX immediately (matching cw1200 ancestor sta.c:1344 comment "Tx lock
still held, unjoin will clear it.").
Ancestor reference: drivers/net/wireless/st/cw1200/sta.c, function
cw1200_join_work(), lines 1339-1344. cw1200 queues unjoin_work on join
failure for the same reason. bes2600 needs the direct wsm_reset in
addition because its unjoin_work has the join_status gate that cw1200's
cw1200_do_unjoin() does not.
Signed-off-by: Claude (noether) <claude@reauktion.de>
---
bes2600/sta.c | 47 +++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 43 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/bes2600/sta.c b/drivers/staging/bes2600/sta.c
index 476d875..bf86835 100644
--- a/drivers/staging/bes2600/sta.c
+++ b/drivers/staging/bes2600/sta.c
@@ -2225,9 +2225,10 @@ void bes2600_join_work(struct work_struct *work)
struct wsm_template_frame probe_tmp = {
.frame_type = WSM_FRAME_TYPE_PROBE_REQUEST,
};
- /*struct wsm_reset reset = {
- .reset_statistics = true,
- };*/
+ struct wsm_reset join_fail_reset = {
+ .reset_statistics = false,
+ };
+ bool join_failed = false;
BUG_ON(queueId >= 4);
@@ -2410,6 +2411,33 @@ void bes2600_join_work(struct work_struct *work)
#endif /*CONFIG_BES2600_TESTMODE*/
cancel_delayed_work_sync(&priv->join_timeout);
bes2600_pwr_clear_busy_event(priv->hw_priv, BES_PWR_LOCK_ON_JOIN);
+ /*
+ * Firmware rejected WSM_JOIN (wsm_join_confirm ret 1).
+ * Issue wsm_reset so the firmware returns to a clean
+ * IDLE state before the next association attempt.
+ *
+ * Without this reset the firmware sits in an
+ * intermediate post-reject state. A rapid second
+ * JOIN (e.g. wpa_supplicant retrying after the
+ * PREV_AUTH_NOT_VALID deauth that follows) hits an
+ * inconsistent firmware context, causing
+ * bes2600_sdio_read_rx_batch to return SDIO error
+ * which cascades into wifi_force_close.
+ *
+ * cw1200 ancestor (drivers/net/wireless/st/cw1200/
+ * sta.c:1339) queues unjoin_work on join failure for
+ * the same reason; bes2600_unjoin_work gates its
+ * wsm_reset on join_status != PASSIVE, so after a
+ * failed JOIN (join_status stays PASSIVE) that path
+ * never fires — call wsm_reset directly here instead.
+ *
+ * Contract: wsm_reset takes only wsm_cmd_lock; safe
+ * to call while conf_lock is held. wsm_oper_unlock
+ * was already called in wsm_join_confirm() before
+ * wsm_join() returned the error.
+ */
+ WARN_ON(wsm_reset(hw_priv, &join_fail_reset, priv->if_id));
+ join_failed = true;
} else {
/* Upload keys */
#ifdef CONFIG_BES2600_TESTMODE
@@ -2434,7 +2462,18 @@ void bes2600_join_work(struct work_struct *work)
up(&hw_priv->conf_lock);
if (bss)
cfg80211_put_bss(hw_priv->hw->wiphy, bss);
- wsm_unlock_tx(hw_priv);
+ /*
+ * On join failure: queue unjoin_work so the next association
+ * attempt is serialised after any lingering cleanup, matching
+ * cw1200 sta.c:1344 "Tx lock still held, unjoin will clear it."
+ * If unjoin_work is already queued, release TX immediately.
+ */
+ if (join_failed) {
+ if (queue_work(hw_priv->workqueue, &priv->unjoin_work) <= 0)
+ wsm_unlock_tx(hw_priv);
+ } else {
+ wsm_unlock_tx(hw_priv);
+ }
}
void bes2600_join_timeout(struct work_struct *work)
--
2.54.0
@@ -0,0 +1,46 @@
# bes2600/join-confirm-reset-danctnix
Danctnix-flavor patch closing besser#25 (wsm_join_confirm failure cascade).
## What it does
When firmware returns status 1 on a JOIN command (`wsm_join_confirm ret 1`),
add a direct `wsm_reset(...)` call so the firmware returns to a clean IDLE
state, plus `queue_work(workqueue, &priv->unjoin_work)` for serialisation of
the next association attempt.
## Why it's a fork-divergence fix
`cw1200_join_work()` (cw1200 ancestor, `drivers/net/wireless/st/cw1200/sta.c:1339-1344`)
queues `unjoin_work` on join failure: `cw1200_do_unjoin()` calls `wsm_reset`
when `join_status == STA`.
bes2600's `bes2600_unjoin_work()` gates the same `wsm_reset` on
`join_status != PASSIVE`. After a failed JOIN, `join_status` stays PASSIVE
(only set to STA on success) — queuing `unjoin_work` alone is insufficient
on bes2600. The danctnix variant carries a direct `wsm_reset` in the
failure path *and* the queue_work serialisation.
## Observable effects (pkgrel=6 soak)
Beyond closing the cascade (besser#25 acceptance), this patch also
collapsed the periodic ~600 ms latency jitter on ohm:
| | pkgrel=5 | pkgrel=6 |
|---|---|---|
| max RTT | 612 ms | 13.9 ms |
| mdev | 103.5 ms | 1.55 ms |
The bgscan-driven roam-attempt to a 5 GHz BSSID followed by `wsm_join`
reject was briefly stalling TX every minute even when the cascade did
not fire.
## Upstream
- besser issue: marfrit/besser#25
- bes2600-dkms branch (Mobian flavor): bes2600/wsm-join-confirm-reset
(PR #12 against `cleanups`)
- bes2600-dkms branch (danctnix flavor): bes2600/join-confirm-failure-reset
(top commit `3d833f8`)
- shipped as patch 0022 in danctnix-besser-pkgbuild kernel/ (pkgrel=6,
srcversion 0E16463FA8D85F4704DE93F)