besser/notes/phase7-c2-2026-05-08.md
claude-noether 02d3f4b222 notes: Patch C2 Phase 7 — N=3 ramp, no measurable throughput delta
| rep | uptime | MB/s |
|----:|-------:|-----:|
|   1 |   544s | 2.289|
|   2 |   716s | 2.165|
|   3 |   750s | 2.376|

N=3 mean: 2.277 MB/s.  vs Patch C v3 N=3 (2.352 MB/s): -3% (within
rep variance).  vs Patch B baseline (1.362 MB/s): +67%.

C2 was predicted in §4.5 of the Phase 4 plan as a possible
"<2% delta" outcome -> "ship for upstream-cleanliness anyway".
Observed -3% -> within noise -> ship.  The tasklet hop in
ieee80211_rx_irqsafe was apparently cheap on this kernel.

Phase 8 lesson: _irqsafe -> _rx_ni is a CORRECTNESS / kernel.org-
submission move, not a performance optimization.  Don't oversell
predicted throughput deltas without prior measurement.

Patch C v3 architectural win remains the durable +73%; D / E / C2 /
F / G are smaller cleanups that don't compound visibly above noise.

Throughput ceiling on this hardware: ~2.4 MB/s sustained @ 4 MB/s
sender, fresh chip.  Further improvement needs firmware-side fixes
(wsm_generic_confirm 0x0007 path), not driver-side.
2026-05-08 07:43:33 +02:00


Patch C2 Phase 7 — N=3 ramp results

Date: 2026-05-08
Module: bes2600.ko srcversion 619A51E61BF5479AAC146E6 (cleanups + F + G + D + E + C2)
Rig: ohm fresh boot, wired enu1 path for control, wlan0 for data probes
Stress: netcat sender, pv -L 4m, 30 s per rep


Results table

| rep | uptime (s) | rate (MB/s) |
|----:|-----------:|------------:|
|   1 |        544 |       2.289 |
|   2 |        716 |       2.165 |
|   3 |        750 |       2.376 |

N=3: mean 2.277, median 2.289, min 2.165, max 2.376
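
The summary line can be re-derived from the three per-rep rates. The following is a minimal Python sketch, assuming the rates are taken exactly as printed in the results table; the variable names are illustrative and not part of the test rig.

```python
from statistics import mean, median

# Per-rep rates in MB/s, copied from the results table in this note.
rates = [2.289, 2.165, 2.376]

# Summary statistics, with the mean rounded to the table's 3 decimals.
summary = {
    "n": len(rates),
    "mean": round(mean(rates), 3),
    "median": median(rates),
    "min": min(rates),
    "max": max(rates),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

With only three reps the median is simply the middle sample, so it is reported alongside the mean as a sanity check rather than as a robust estimator.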

Comparison to baselines

| series | mean MB/s | Δ vs Patch B | Δ vs v3 |
|:-------|----------:|-------------:|--------:|
| Patch B (run-20260507-patchC-preflight, N=1) | 1.362 | | -42% |
| Patch C v3 N=3 (run-20260507-N3v3-rep*) | 2.352 | +73% | |
| Patch C v3 + F + G + D + E + C2 N=3 (this rep set) | 2.277 | +67% | -3% |

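Δ vs v3 is within rep variance: v3 N=3 had min 2.102 and max 2.590, a min-to-max range of about 21% of its mean (roughly ±10%), while this set's range is about 9% of its mean. The 3% gap between the two means sits inside either spread, so at N=3 the two series are indistinguishable.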
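
The deltas in the comparison table and the spread of each rep set can be checked from the reported means and extremes. This is a hedged Python sketch: the helper names `pct_delta` and `range_pct` are hypothetical, and "spread" is taken here to mean the min-to-max range as a percentage of the mean, which is an assumption about how the figure was intended.

```python
def pct_delta(new: float, base: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def range_pct(lo: float, hi: float, mean: float) -> float:
    """Min-to-max range expressed as a percentage of the mean (assumed definition)."""
    return (hi - lo) / mean * 100.0


# Values copied from this note, all in MB/s.
patch_b_mean = 1.362                              # N=1
v3_mean, v3_min, v3_max = 2.352, 2.102, 2.590     # Patch C v3, N=3
c2_mean, c2_min, c2_max = 2.277, 2.165, 2.376     # this rep set, N=3

deltas = {
    "v3 vs Patch B": pct_delta(v3_mean, patch_b_mean),
    "C2 set vs Patch B": pct_delta(c2_mean, patch_b_mean),
    "C2 set vs v3": pct_delta(c2_mean, v3_mean),
    "Patch B vs v3": pct_delta(patch_b_mean, v3_mean),
}
for name, value in deltas.items():
    print(f"{name}: {value:+.0f}%")

v3_range = range_pct(v3_min, v3_max, v3_mean)
c2_range = range_pct(c2_min, c2_max, c2_mean)
print(f"v3 range: {v3_range:.1f}% of mean, C2 set range: {c2_range:.1f}% of mean")

# The C2 mean lies inside the interval spanned by the v3 reps.
c2_inside_v3 = v3_min <= c2_mean <= v3_max
print(f"C2 mean within v3 min..max: {c2_inside_v3}")
```

This is a range check, not a significance test. With three reps per series, it only shows that the C2 mean falls inside the interval already spanned by the v3 reps.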

Verdict: no measurable C2 throughput delta

The tasklet hop in ieee80211_rx_irqsafe was apparently cheap on this kernel. Migrating 6 call sites from ieee80211_rx_irqsafe to ieee80211_rx_ni (synchronous from process context, with an internal local_bh_disable wrap) preserves throughput but does not measurably improve it.

This was a predicted outcome. The C2 Phase 4 plan §4.5 said:

"If <2%, Phase 7 says 'marginal but no regression' and we ship anyway for upstream-cleanliness."

Observed: -3% (within noise) → falls into the "marginal but no regression" bucket. Ship for the kernel.org submission story (no _irqsafe from process context = upstream-idiomatic) even though performance is unchanged.

Receipts checklist

  • N=3 reps captured at fresh-chip uptime (544/716/750 s — within first 13 min, before scan-failure-cadence onset)
  • All reps under same conditions: same fresh boot, same nc listener, same AP (newton, BSSID c0:25:06:e6:61:b0 on chan 1)
  • No WARN/BUG/oops on any rep
  • dmesg pattern: only the pre-existing wsm_generic_confirm 0x0007 noise — same on Patch B / Patch F / Patch C v3 / D / E / C2 (firmware-side, independent of all our patches)
  • Wired-rig telemetry collection — would have caught any wedge that wlan0 ate
  • Rig-failure-is-finding: an early "0-throughput" set of reps was rig artifact (nc-loop race, port-binding state from a prior session) — caught and discounted per feedback_rig_failure_is_finding. The recovered N=3 reps used setsid-detached listener + post-reboot fresh state.

Phase 8 lesson

Drop-in replacements with the right kerneldoc reading still need Phase 7 measurement. I expected +5-15% from removing the tasklet schedule. Got -3% (noise). The cost we were saving was already amortised by something else (NAPI infra? per-CPU softirq scheduling?). The kerneldoc-correctness story stands; the perf story does not.

Memory entry: the perf-vs-correctness distinction is worth keeping. _irqsafe → _rx_ni is a CORRECTNESS / API-cleanliness move, not a performance optimization. Don't oversell predicted deltas without baseline measurement.

Out-of-scope follow-ups

  • Patch C v3 architectural win is the durable +73%. D / E / C2 / F / G are smaller cleanups that don't compound visibly above noise.
  • Bug #5 RX-degradation campaign already closed (hypothesis falsified).
  • Task #24 (post-cleanup observation of bh.c symptom-shaped artifacts): mostly answered.
  • Task #25 (Allwinner sw_mci_check_r1_ready measurement): can be done during any future stress run; not on critical path.

Phase 7 captured 2026-05-08 by Claude (noether). Patch C2 closes the post-Bug-#5 cleanup track. Throughput ceiling on this hardware = ~2.4 MB/s sustained @ 4 MB/s sender, fresh chip; further improvement would need firmware-side fixes (the wsm_generic_confirm 0x0007 path), not driver-side.