From 458ad36f8b01f9d85cc4b2038a7b250d049f6dc9 Mon Sep 17 00:00:00 2001
From: "Claude (noether)"
Date: Thu, 7 May 2026 13:56:36 +0200
Subject: [PATCH] =?UTF-8?q?notes:=20backlog=20Bug=20#5=20=E2=80=94=20RX=20?=
 =?UTF-8?q?path=20degrades=20under=20throughput=20pressure?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Observed 2026-05-07: bumping the netcat sender from 1 MB/s to 4 MB/s
DECREASED ohm's observed RX rate (1015 KB/s → 563 KB/s) and degraded the
link (signal -57 → -67 dBm, MCS 4 → 3). Chip can't sustain near-
link-rate RX even though theoretical capacity is ~8 MB/s.

Hypothesis: driver/firmware lock contention or busy-wait on the RX SDIO
path. Plausibly explains the original Phase-0 observation that YouTube
DASH chunks drop ~10 frames per chunk fetch — chunk fetch is a brief
near-line-rate burst that this bug would be triggered by.
---
 notes/observed-bugs.md | 43 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/notes/observed-bugs.md b/notes/observed-bugs.md
index 21bf03621..7c547f423 100644
--- a/notes/observed-bugs.md
+++ b/notes/observed-bugs.md
@@ -82,6 +82,49 @@ without board power-cycle").
 **Status**: task c3 (indirectly, via bes_chardev removal which currently
 gates the signal/nosignal mode switch path).
 
+## Bug #5 — RX path degrades under attempted-throughput pressure
+
+**Suspect file**: bes2600 RX path (`txrx.c bes2600_rx_cb`, `bh.c bes2600_bh_work`,
+SDIO RX scheduling) — pinpoint pending.
+
+**Symptom (observed 2026-05-07 13:43, srcversion `1B3B3ED0` = c-stack +
+Patch A + Patch B, ohm @ -57 dBm 2.4GHz ch11 5b:32, idle save for the
+netcat load):**
+
+```
+sender cap 1 MB/s → ohm receives 1015 KB/s, signal -57 dBm, RX MCS 4
+sender cap 4 MB/s → ohm receives 563 KB/s, signal -67 dBm, RX MCS 3
+                    (Send-Q on boltzmann backed up to 1.16 MB)
+```
+
+Pushing the sender-side cap from 1 MB/s to 4 MB/s **decreased** observed
+throughput at the receiver and degraded the link metrics. Signal dropped
+~10 dB and the chip downshifted MCS, suggesting the chip can't sustain
+the higher RX rate even with the link physically capable of more (link
+bitrate 65 Mb/s = ~8 MB/s theoretical).
+
+**Hypothesis (Markus, 2026-05-07): driver/firmware locks itself to death
+under busy reads** — possibly a busy-wait loop or lock contention on the
+RX SDIO path that prevents draining at line rate. Plausible reason it
+didn't surface for the c-stack tasks: those operated at typical
+browse-rate traffic, well below the saturation threshold this bug needs
+to fire.
+
+**May explain**: original Phase-0 observation that **YouTube DASH chunks
+drop ~10 frames per chunk fetch** on hardware-decoder playback. A chunk
+fetch is a brief burst at near-link-rate; if the driver throttles itself
+down during high-RX, the player buffer underruns for the duration of
+the fetch.
+
+**How to drill (when prioritized)**:
+- Capture trace_pipe with `mmc:*` and `sdio*` events enabled during a
+  controlled rate-ramp (e.g., pv -L 500K, 1M, 2M, 4M each for 60 s).
+- Watch `/proc/sys/kernel/sched_*` and the `bes2600_bh_work` kworker for
+  CPU saturation.
+- `perf top -p $(pgrep -f bes_sdio)` during 4 MB/s load.
+
+**Status**: backlog. No patch yet.
+
 ## Bug #4 — scan_complete_cb constant loop
 
 **File**: `scan.c:883-909` — `bes2600_scan_complete_cb()`.