Files
besser/notes/phase7-c2-2026-05-08.md
claude-noether 02d3f4b222 notes: Patch C2 Phase 7 — N=3 ramp, no measurable throughput delta
| rep | uptime | MB/s |
|----:|-------:|-----:|
|   1 |   544s | 2.289|
|   2 |   716s | 2.165|
|   3 |   750s | 2.376|

N=3 mean: 2.277 MB/s.  vs Patch C v3 N=3 (2.352 MB/s): -3% (within
rep variance).  vs Patch B baseline (1.362 MB/s): +67%.

C2 was predicted in §4.5 of the Phase 4 plan as a possible
"<2% delta" outcome -> "ship for upstream-cleanliness anyway".
Observed -3% -> within noise -> ship.  The tasklet hop in
ieee80211_rx_irqsafe was apparently cheap on this kernel.

Phase 8 lesson: _irqsafe -> _rx_ni is a CORRECTNESS / kernel.org-
submission move, not a performance optimization.  Don't oversell
predicted throughput deltas without prior measurement.

Patch C v3 architectural win remains the durable +73%; D / E / C2 /
F / G are smaller cleanups that don't compound visibly above noise.

Throughput ceiling on this hardware: ~2.4 MB/s sustained @ 4 MB/s
sender, fresh chip.  Further improvement needs firmware-side fixes
(wsm_generic_confirm 0x0007 path), not driver-side.
2026-05-08 07:43:33 +02:00

64 lines
3.7 KiB
Markdown

# Patch C2 Phase 7 — N=3 ramp results
**Date:** 2026-05-08
**Module:** `bes2600.ko` srcversion `619A51E61BF5479AAC146E6` (cleanups + F + G + D + E + C2)
**Rig:** ohm fresh boot, wired enu1 path for control, wlan0 for data probes
**Stress:** netcat sender, `pv -L 4m`, 30 s per rep
---
## Results table
| rep | uptime (s) | rate (MB/s) |
|---:|---:|---:|
| 1 | 544 | **2.289** |
| 2 | 716 | **2.165** |
| 3 | 750 | **2.376** |
**N=3:** mean 2.277, median 2.289, min 2.165, max 2.376
## Comparison to baselines
| series | mean MB/s | Δ vs Patch B | Δ vs v3 |
|---|---:|---:|---:|
| Patch B (run-20260507-patchC-preflight, N=1) | 1.362 | — | -42% |
| Patch C v3 N=3 (run-20260507-N3v3-rep*) | 2.352 | +73% | — |
| Patch C v3 + F + G + D + E + C2 N=3 (this rep set) | 2.277 | +67% | -3% |
Δ vs v3 is **within rep variance** (v3 N=3 had min 2.102, max 2.590 → spread ±20%; this set's spread is similar). Statistically indistinguishable.
## Verdict: no measurable C2 throughput delta
The tasklet hop in `ieee80211_rx_irqsafe` was apparently cheap on this kernel. Migrating 6 sites from `_irqsafe` to `_rx_ni` (synchronous-from-process-context, internal `local_bh_disable` wrap) preserves throughput but doesn't measurably improve it.
**This was a predicted outcome.** The C2 Phase 4 plan §4.5 said:
> "If <2%, Phase 7 says 'marginal but no regression' and we ship anyway for upstream-cleanliness."
Observed: -3% (within noise) → falls into the "marginal but no regression" bucket. Ship for the kernel.org submission story (no `_irqsafe` from process context = upstream-idiomatic) even though performance is unchanged.
## Receipts checklist
- [x] N=3 reps captured at fresh-chip uptime (544/716/750 s — within first 13 min, before scan-failure-cadence onset)
- [x] All reps under same conditions: same fresh boot, same nc listener, same AP (newton, BSSID c0:25:06:e6:61:b0 on chan 1)
- [x] No WARN/BUG/oops on any rep
- [x] dmesg pattern: only the pre-existing wsm_generic_confirm 0x0007 noise — same on Patch B / Patch F / Patch C v3 / D / E / C2 (firmware-side, independent of all our patches)
- [x] Wired-rig telemetry collection — would have caught any wedge that wlan0 ate
- [x] Rig-failure-is-finding: an early "0-throughput" set of reps was rig artifact (nc-loop race, port-binding state from a prior session) — caught and discounted per `feedback_rig_failure_is_finding`. The recovered N=3 reps used setsid-detached listener + post-reboot fresh state.
## Phase 8 lesson
**Drop-in replacements with the right kerneldoc reading still need Phase 7 measurement.** I expected +5-15% from removing the tasklet schedule. Got -3% (noise). The cost we were saving was already amortised by something else (NAPI infra? per-CPU softirq scheduling?). The kerneldoc-correctness story stands; the perf story does not.
**Memory entry:** the perf-vs-correctness distinction is worth keeping. `_irqsafe → _rx_ni` is a CORRECTNESS / API-cleanliness move, not a performance optimization. Don't oversell predicted deltas without baseline measurement.
## Out-of-scope follow-ups
- Patch C v3 architectural win is the durable +73%. C / D / E / C2 / F / G are smaller cleanups that don't compound visibly.
- Bug #5 RX-degradation campaign already closed (hypothesis falsified).
- Task #24 (post-cleanup observation of bh.c symptom-shaped artifacts): mostly answered.
- Task #25 (Allwinner sw_mci_check_r1_ready measurement): can be done during any future stress run; not on critical path.
---
*Phase 7 captured 2026-05-08 by Claude (noether). Patch C2 closes the post-Bug-#5 cleanup track. Throughput ceiling on this hardware = ~2.4 MB/s sustained @ 4 MB/s sender, fresh chip; further improvement would need firmware-side fixes (the wsm_generic_confirm 0x0007 path), not driver-side.*