diff --git a/notes/observed-bugs.md b/notes/observed-bugs.md
index 21bf03621..7c547f423 100644
--- a/notes/observed-bugs.md
+++ b/notes/observed-bugs.md
@@ -82,6 +82,49 @@ without board power-cycle").
 **Status**: task c3 (indirectly, via bes_chardev removal which currently
 gates the signal/nosignal mode switch path).
 
+## Bug #5 — RX path degrades under attempted-throughput pressure
+
+**Suspect file**: bes2600 RX path (`txrx.c bes2600_rx_cb`, `bh.c bes2600_bh_work`,
+SDIO RX scheduling) — pinpoint pending.
+
+**Symptom (observed 2026-05-07 13:43, srcversion `1B3B3ED0` = c-stack +
+Patch A + Patch B, ohm @ -57 dBm 2.4GHz ch11 5b:32, idle save for the
+netcat load):**
+
+```
+sender cap 1 MB/s → ohm receives 1015 KB/s, signal -57 dBm, RX MCS 4
+sender cap 4 MB/s → ohm receives  563 KB/s, signal -67 dBm, RX MCS 3
+                    (Send-Q on boltzmann backed up to 1.16 MB)
+```
+
+Pushing the sender-side cap from 1 MB/s to 4 MB/s **decreased** observed
+throughput at the receiver and degraded the link metrics. Signal dropped
+~10 dB and the chip downshifted MCS, suggesting the chip can't sustain
+the higher RX rate even with the link physically capable of more (link
+bitrate 65 Mb/s = ~8 MB/s theoretical).
+
+**Hypothesis (Markus, 2026-05-07): driver/firmware locks itself to death
+under busy reads** — possibly a busy-wait loop or lock contention on the
+RX SDIO path that prevents draining at line rate. Plausible reason it
+didn't surface for the c-stack tasks: those operated at typical
+browse-rate traffic, well below the saturation threshold this bug needs
+to fire.
+
+**May explain**: original Phase-0 observation that **YouTube DASH chunks
+drop ~10 frames per chunk fetch** on hardware-decoder playback. A chunk
+fetch is a brief burst at near-link-rate; if the driver throttles itself
+down during high-RX, the player buffer underruns for the duration of
+the fetch.
+
+**How to drill (when prioritized)**:
+- Capture trace_pipe with `mmc:*` and `sdio*` events enabled during a
+  controlled rate-ramp (e.g., pv -L 500K, 1M, 2M, 4M each for 60 s).
+- Watch `/proc/sys/kernel/sched_*` and the `bes2600_bh_work` kworker for
+  CPU saturation.
+- `perf top -p $(pgrep -f bes_sdio)` during 4 MB/s load.
+
+**Status**: backlog. No patch yet.
+
 ## Bug #4 — scan_complete_cb constant loop
 
 **File**: `scan.c:883-909` — `bes2600_scan_complete_cb()`.