rk3588-ddr-analysis/POLL_SITE_MAP.md
marfrit 4166f81768 regs + POLL_SITE_MAP: TRM §2.4.3 register names for low-offset polls
Sibling went back into the TRM and found §2.4.3 'Registers Summary
For DDRPHY' which I'd missed — it names almost every PHY PUB register
we'd been calling 'RE guess':

  +0x110 = DDRPHY_CAL_RD_VWML0     (Read Valid Window Margin Left Code 0)
  +0x120 = DDRPHY_CAL_RD_VWMR0     (Read Valid Window Margin Right Code 0)
  +0x160 = DDRPHY_CAL_CON5         (Calibration Control 5: wrtrn_cyc_mode/en/th)
  +0x684 = DDRPHY_PRBS_CON0        (PRBS Training Control — was 'CalBusy')
  +0xa24 = DDRPHY_SCHD_TRAIN_CON0  (MASTER training scheduler; full bit map
                                    in the TRM — every training type + per-rank)
  +0xb88 = DDRPHY_DQSDUTY_CON2     (DQS rise-duty monitor — was 'UctShadow')

SCHD_TRAIN_CON0 is the master — the blob selects a training type via
its enable bits and polls bit[1] phy_train_done. Four of our 16 poll
sites are almost certainly polling this bit across different training
stages.

Still reserved in TRM: +0x118, +0x154, +0x184 — training-engine
private FSMs. Only dynamic tracing can name these.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:59:34 +02:00

# Poll-site → register map (RK3588 DDR v1.19)
Each of the 16 timeout-less poll sites in the v1.19 stock conservative
blob, decoded against the RK3588 TRM Part 2 (Ch. 2, DMC) where
possible. Sites without TRM coverage are Synopsys DWC PUB registers —
not republished by Rockchip; names ending in `(RE)` are our educated
guesses from the code context.
Site index comes from `patch_timeouts_v3.py` (ascending-offset order
after `find_poll_loops()`).
## Early cluster (sites 0-7): 0x07b78..0x07f08
| # | branch @ | body | load | addr (symbolic) | register | src |
|---|---------|------|------|-----------------|----------|-----|
| 0 | 0x07b78 | 2 | `ldr w1, [x0+0x114]` on x0=PHY+0x20000 | PHY + 0x20114 | `PHY_TRAIN_INTERLOCK_114` (RE) | — |
| 1 | 0x07ba4 | 2 | `ldr w1, [x26+0xb88]` where x26=PHY+0x10000 | PHY + 0x10b88 | `PHY_SHADOW_BB8` (RE) | — |
| 2 | 0x07c8c | 3 | `ldr w0, [x1+0x14]` where x1=DDRCTL+0x10000 | **DDRCTL + 0x10014** | `DDRCTL_PWRCTL`? (TBD — 0x14 offset in uMCTL2 is typically PWRCTL or STAT) | TRM (partial) |
| 3 | 0x07ca8 | 2 | `ldr w1, [x0+0x514]` where x0=DDRCTL+0x10000 | **DDRCTL + 0x10514** | **DDRCTL_DFISTAT** `dfi_init_complete` | **TRM Part 2 Ch.2** |
| 4 | 0x07cd4 | 3 | `ldr w0, [x1+0x14]` same pattern as site 2 | DDRCTL + 0x10014 | same as #2 | TRM (partial) |
| 5 | 0x07ce8 | 3 | same +0x14 load, different mask | DDRCTL + 0x10014 | same as #2 | TRM (partial) |
| 6 | 0x07d0c | 3 | `ldr w0, [x26+0xb88]` where x26=PHY+0x10000 | PHY + 0x10b88 | `PHY_SHADOW_BB8` (RE) — same reg as site 1, different mask | — |
| 7 | 0x07f08 | 3 | `ldr w1, [x0+0x14]` same DDRCTL family | DDRCTL + 0x10014 | same as #2 | TRM (partial) |
## Mid cluster (sites 8-10): 0x09124..0x0aaf8
| # | branch @ | body | register | src |
|---|---------|------|----------|-----|
| 8 | 0x09124 | 3 | DDRCTL + (via x27) — needs further context trace | — |
| 9 | 0x0aa84 | 3 | DDRCTL + (via x24) — ditto | — |
| 10 | 0x0aaf8 | 3 | abs `0xff000024` per decoder — **SRAM mirror of a GRF?** non-obvious | — |
**Site 10 is unusual** — absolute `0xff000024` is in the SRAM_BOOT
region, not a controller or PHY block. Possibly a BL2 handoff word
the blob waits on before continuing. Worth its own trace.
## Late cluster (sites 11-15): 0x0d154..0x0d378
| # | branch @ | body | load → addr | register | src |
|---|---------|------|-------------|----------|-----|
| 11 | 0x0d154 | 3 | `ldr w5, [x0+0x14]` where x0=PHY+0x10000 → **PHY + 0x10014**, test `&0x7 == 1` | `PHY_STATE_014` (RE) — wait for state 1 | — |
| 12 | 0x0d340 | 2 | `ldr w1, [x0+0x118]` where x0=PHY+0x8000 → PHY + 0x8118 | `PHY_STAT_A_118` (RE) — train_phy_block | — |
| 13 | 0x0d34c | 2 | `ldr w1, [x0+0x120]` → PHY + 0x8120 | `PHY_STAT_B_120` (RE) | — |
| 14 | 0x0d364 | 2 | `ldr w1, [x0+0x184]` → PHY + 0x8184 | `PHY_HANDSHAKE_184` (RE, ack assert) | — |
| 15 | 0x0d378 | 2 | `ldr w1, [x0+0x184]` → PHY + 0x8184 | `PHY_HANDSHAKE_184` (RE, ack deassert) | — |
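
The site numbering used in the three tables can be cross-checked with a few lines of Python. This is a minimal sketch, assuming only what the intro states (site index = rank in ascending branch-offset order); it does not reproduce `patch_timeouts_v3.py` or `find_poll_loops()`, and the helper names `site_index` and `cluster_of` are hypothetical.

```python
# Branch offsets copied from the early, mid, and late cluster tables.
BRANCH_OFFSETS = [
    0x07b78, 0x07ba4, 0x07c8c, 0x07ca8, 0x07cd4, 0x07ce8, 0x07d0c, 0x07f08,
    0x09124, 0x0aa84, 0x0aaf8,
    0x0d154, 0x0d340, 0x0d34c, 0x0d364, 0x0d378,
]


def site_index(branch_offset, offsets=BRANCH_OFFSETS):
    """Return the site number: the rank of the branch in ascending order."""
    return sorted(offsets).index(branch_offset)


def cluster_of(site):
    """Map a site number to the cluster heading used in this file."""
    if 0 <= site <= 7:
        return "early"
    if 8 <= site <= 10:
        return "mid"
    if 11 <= site <= 15:
        return "late"
    raise ValueError(f"site {site} is outside 0..15")


if __name__ == "__main__":
    for off in BRANCH_OFFSETS:
        n = site_index(off)
        print(f"site {n:2d} @ {off:#07x} ({cluster_of(n)})")
```

Running it prints one line per site, which makes it easy to spot a table row whose number no longer matches its offset after a blob update.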
## Summary by coverage
- **TRM-documented (vendor-canonical names):** 1 site (site 3 — DDRCTL_DFISTAT).
- **TRM-documented family (0x14 offset in uMCTL2 space, exact register TBD):** 4 sites (2, 4, 5, 7).
- **DWC PUB / Innosilicon PHY — undocumented, RE names only:** 11 sites.
## Known tensions
1. **Site 3 tests DFISTAT bits[2:1] (mask 0x6), not bit[0].** Generic
uMCTL2 DFISTAT has only bit[0] defined (`dfi_init_complete`); bits
1+ are reserved. RK3588's blob treating bits[2:1] as meaningful
suggests Rockchip extended the DFISTAT register with vendor-specific
bits. Worth checking TRM bit tables for DFISTAT directly.
2. **Sites 2/4/5/7 all poll DDRCTL + 0x10014** with different bit
masks (&0x7==1, &0x7==3, &0x30==0x20, &0x7==3). At offset +0x14 in
uMCTL2 is `STAT` (Operating Mode Status Register) per generic DWC
docs — `operating_mode[2:0]` field encodes: 0=Init, 1=Normal, 2=Power-down,
3=Self-refresh, 5=Deep-power-down, 6=Deep-power-down init. RK3588
probably follows this convention — these polls wait for the
controller to enter specific operating modes.
3. **Site 10 at absolute `0xff000024`** is suspect. That region is
SRAM_BOOT in our emulator map. Possibly a BL2 handshake word. If
so, patching this site to "bounded retry" is safe — worst case
we skip one BL2 handoff. Should flag this separately.
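
The mask tests in tension 2 can be modelled on the host to see which operating mode each poll would accept. This is a sketch under two assumptions: that +0x14 really is `STAT` with the generic `operating_mode[2:0]` encoding quoted above (still unconfirmed against the TRM), and that the four mask tests are listed in the same order as sites 2/4/5/7. The names `OPERATING_MODES`, `SITE_CONDITIONS`, `operating_mode`, and `poll_satisfied` are hypothetical.

```python
# Encoding quoted in tension 2 (generic DWC convention, not yet TRM-verified).
OPERATING_MODES = {
    0: "Init",
    1: "Normal",
    2: "Power-down",
    3: "Self-refresh",
    5: "Deep-power-down",
    6: "Deep-power-down init",
}

# site -> (mask, expected value); ASSUMES the masks are listed in site order.
SITE_CONDITIONS = {
    2: (0x07, 0x01),
    4: (0x07, 0x03),
    5: (0x30, 0x20),
    7: (0x07, 0x03),
}


def operating_mode(stat):
    """Decode operating_mode[2:0] from a raw 32-bit STAT value."""
    return OPERATING_MODES.get(stat & 0x7, "unknown")


def poll_satisfied(site, stat):
    """True if the poll loop at `site` would exit for this STAT value."""
    mask, expect = SITE_CONDITIONS[site]
    return (stat & mask) == expect


if __name__ == "__main__":
    for raw in (0x00, 0x01, 0x03, 0x23):
        exits = [s for s in SITE_CONDITIONS if poll_satisfied(s, raw)]
        print(f"STAT={raw:#04x} mode={operating_mode(raw)} exits sites {exits}")
```

If the assumptions hold, site 2 waits for Normal and sites 4 and 7 wait for Self-refresh; site 5 tests bits[5:4], which fall outside `operating_mode` and still need a name from the TRM.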
## Action items
- Extract DFISTAT bit-field description from TRM Part 2 to confirm/deny
the RK3588 vendor extension hypothesis for site 3.
- Extract STAT (+0x14) bit-field description from TRM to confirm/deny
the "operating_mode" mapping for sites 2/4/5/7.
- Special-case site 10 in the bisection plan — it's not a normal PHY
poll and may need different treatment.
## Update 2026-04-15 evening — TRM §2.4.3 mined
Sibling research went back into the TRM and found **§2.4.3 Registers
Summary For DDRPHY** which I'd missed. That section names almost every
low-offset poll we'd labelled `(RE)`:
| old guess | TRM actual | confidence |
|---|---|---|
| PHY + 0x110 `(RE)`, `F000F000` trigger | **`DDRPHY_CAL_RD_VWML0`** (Read Valid Window Margin Left Code 0) | TRM HIGH |
| PHY + 0x120 `(RE)` — step-complete bit | **`DDRPHY_CAL_RD_VWMR0`** (Read Valid Window Margin Right Code 0) | TRM HIGH |
| PHY + 0x160 `(RE)` — CFG_B 0x30003 | **`DDRPHY_CAL_CON5`** (Calibration Control 5: wrtrn_cyc_mode / wrtrn_cyc_en / wrtrn_cyc_th) | TRM HIGH |
| PHY + 0x684 "CalBusy" | **`DDRPHY_PRBS_CON0`** — PRBS training control (our CalBusy guess was wrong) | TRM HIGH |
| PHY + 0xa24 "DFI ready" | **`DDRPHY_SCHD_TRAIN_CON0`** — master training scheduler, full bit layout documented | **TRM extremely high** |
| PHY + 0xb88 "shadow" | **`DDRPHY_DQSDUTY_CON2`** — DQS rise-duty monitor (DCM/DCA debug) | TRM HIGH |
| PHY + 0x118, 0x154, 0x184 | TRM gaps (reserved) — still RE. Likely per-slice shadow / handshake FSMs | RE |
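
The renaming table above is small enough to keep as a lookup in the analysis scripts. The following is a minimal sketch: the names and offsets are taken from the table, while the function names and the `low_offset` helper are hypothetical. In particular, masking an address with `0xFFF` to get the block-relative offset is an assumption about how the PHY sub-blocks are laid out, not something the TRM excerpt confirms.

```python
# Offsets and names as listed in the TRM §2.4.3 table above.
TRM_NAMES = {
    0x110: "DDRPHY_CAL_RD_VWML0",
    0x120: "DDRPHY_CAL_RD_VWMR0",
    0x160: "DDRPHY_CAL_CON5",
    0x684: "DDRPHY_PRBS_CON0",
    0xA24: "DDRPHY_SCHD_TRAIN_CON0",
    0xB88: "DDRPHY_DQSDUTY_CON2",
}

# Offsets the TRM leaves reserved; these keep their RE names.
TRM_RESERVED = {0x118, 0x154, 0x184}


def low_offset(phy_relative_addr):
    """ASSUMPTION: the register offset is the low 12 bits of the address."""
    return phy_relative_addr & 0xFFF


def name_for(offset):
    """Return (name or None, source) for a block-relative PHY offset."""
    if offset in TRM_NAMES:
        return TRM_NAMES[offset], "TRM"
    if offset in TRM_RESERVED:
        return None, "reserved"
    return None, "unlisted"


if __name__ == "__main__":
    # Symbolic addresses from poll sites 1, 13, and 14.
    for addr in (0x10B88, 0x8120, 0x8184):
        print(f"PHY + {addr:#x} -> {name_for(low_offset(addr))}")
```

An offset that comes back as `unlisted` (for example +0x114 from site 0) is simply not covered by the table here; it may or may not be named elsewhere in §2.4.3.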
### Big picture
- **SCHD_TRAIN_CON0 @ +0xa24 is the master controller.** Writing to
it with the right bit combination selects a training type (CBT,
WrLvl, GT, Rd, Wr), enables per-rank bits, and sets the `phy_train_en`
bit. Polling bit[1] `phy_train_done` tells you when it's finished. **Four
of our poll sites are almost certainly polling this bit**, though only
three candidates are identified so far (8, 9, possibly 11).
- The `0x30003` / `0x30000` pattern in d328 writes to `CAL_CON5`, not
`SCHD_TRAIN_CON0` — so it configures write-training cycle mode, not
the master scheduler. Initial sibling hypothesis of "DVFS gate
training" was incorrect on second reading.
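
Since the stock loops have no timeout, it helps to model what a bounded version of the `phy_train_done` poll looks like. This is a host-side sketch only: it is not the blob's AArch64 code and not what `patch_timeouts_v3.py` emits. The bit position comes from the bullet above; `poll_bit`, `fake_reg`, and the iteration budget are hypothetical.

```python
PHY_TRAIN_DONE = 1 << 1  # bit[1] of SCHD_TRAIN_CON0, per the text above


def poll_bit(read_reg, mask, expect, max_iters=1000):
    """Poll until (read_reg() & mask) == expect or the budget runs out.

    Returns (ok, reads_performed).
    """
    for i in range(1, max_iters + 1):
        if (read_reg() & mask) == expect:
            return True, i
    return False, max_iters


def fake_reg(values):
    """Simulated register: yields `values` in order, then repeats the last."""
    it = iter(values)
    last = [0]

    def read():
        last[0] = next(it, last[0])
        return last[0]

    return read


if __name__ == "__main__":
    # Training completes on the third read.
    print(poll_bit(fake_reg([0x0, 0x0, 0x2]), PHY_TRAIN_DONE, PHY_TRAIN_DONE))
    # Training never completes: the loop gives up instead of hanging.
    print(poll_bit(fake_reg([0x0]), PHY_TRAIN_DONE, PHY_TRAIN_DONE, max_iters=5))
```

Returning the read count alongside the success flag makes it possible to log how close each site came to its budget, which is useful when choosing per-site limits during bisection.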
### Training sequence now mapped to poll sites
Per TRM §2.6.x "Training Procedure":
1. Power ramp / PLL lock / clock config
2. DFI init: `DDRCTL_DFIMISC.dfi_init_start=1`, poll `DDRCTL_DFISTAT.dfi_init_complete` (**site 3 is here**)
3. ZQ calibration: `ZQ_CON0.zq_manual_str`, poll `ZQ_CON1.zq_done`
4. MDLL lock
5. Scheduler enable (`LP_CON0.ctrl_scheduler_en`)
6. Training loop, all via `SCHD_TRAIN_CON0`:
   - CBT (Command Bus Training)
   - Write Leveling
   - Gate Training
   - Read DQ Training → results land in `CAL_RD_VWMC0/VWML0/VWMR0`
   - Write DQ Training → results in `CAL_WR_VWMC0/VWML0/VWMR0`
   - PRBS Training (LPDDR5 high-speed)
   - DQS Duty-cycle monitoring
Our 16 poll sites are scattered across steps 2, 5 (scheduler-en state),
and the seven training sub-steps in step 6.
### Open items
- **+0x118 / +0x154 / +0x184 remain TRM-reserved** — training engine
private FSMs. No way to name these from docs; dynamic tracing on
real hardware (with UART or JTAG) is the remaining path.
- CSDN LPDDR5 series (DDR Study blog) turned out to cover JEDEC-layer
protocol only; no register-level help. Useful background for training
phase names but not for our RE.