RK3588 DDR init blob reverse engineering

- Ghidra decompilation of v1.02-v1.19 blobs (118 functions)
- 53 functions renamed, 79 MMIO registers mapped to TRM
- 45 timeout-less poll loops identified and patched
- Production patcher (patch_prod.py) and QEMU emulator
- Comprehensive analysis, frequency tables, community research

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 13:06:47 +02:00
commit 816848a474
23 changed files with 84690 additions and 0 deletions
@@ -0,0 +1,191 @@
# RK3588 DDR Init Blob Analysis
## Overview
- **Blob:** `rk3588_ddr_lp4_2112MHz_lp5_2400MHz_v1.19.bin` (76,704 bytes)
- **Architecture:** AArch64 (64-bit ARM), not Cortex-M0 as initially assumed
- **Functions:** 118 decompiled, 17,308 assembly instructions
- **Execution context:** Runs on A76/A55 cores during early boot (TPL stage, before BL31)
## Key Findings
### 1. Fast vs Conservative: Only 14 bytes differ
The "fast" (2112/2400 MHz) and "conservative" (1848/2112 MHz) blobs contain
identical code. Of the 14 differing bytes, only **6 bytes are timing data**
(three 16-bit values):
| Offset | Fast (2112/2400) | Conservative (1848/2112) | Purpose |
|--------|-----------------|-------------------------|---------|
| 0x11b8c | 0x0840 | 0x0738 | LP4 frequency param (repeated at 0x11bc0) |
| 0x11bf4 | 0x6960 | 0x6840 | LP5 frequency param |
The remaining 8 byte differences are in the ASCII version string at 0x10d83.
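
A comparison like the one above can be reproduced with a few lines of Python. This is a minimal sketch, not the tooling used for this analysis: `diff_runs` is a hypothetical helper that reports each contiguous run of differing bytes as `(offset, length)`, assuming both blobs are the same size.

```python
def diff_runs(a: bytes, b: bytes) -> list:
    """Return (start_offset, length) for each contiguous run of differing bytes."""
    if len(a) != len(b):
        raise ValueError("blobs must be the same size for an offset-wise diff")
    runs = []
    start = None  # offset where the current run of differences began
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:  # a run that extends to the end of the blob
        runs.append((start, len(a) - start))
    return runs
```

Reading both `.bin` files with `open(path, "rb").read()` and passing them to `diff_runs` should list the timing offsets and the version-string run if the figures above hold.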
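### 2. MMIO Register Regions (79 unique registers accessed)
The regions below account for 52 of the 79 addresses; the remainder are
presumably the 0x0001xxxx and 0x001Fxxxx references discussed under
"Potential Issues", which are not real MMIO.
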
| Region | Count | Hardware Block |
|--------|-------|---------------|
| 0xFD58xxxx | 1 | GRF (General Register Files) |
| 0xFD59xxxx | 1 | DDR GRF |
| 0xFD5Fxxxx | 27 | Bus GRF (main DDR config) |
| 0xFD8Cxxxx | 4 | PMU/CRU (clock/power) |
| 0xFE01xxxx | 4 | MSCH (Memory Scheduler) |
| 0xFE03xxxx | 1 | Firewall DDR |
| 0xFE05xxxx | 9 | SGRF (Security) |
| 0xFECCxxxx | 4 | DDR Controller |
| 0xFF00xxxx | 1 | SRAM/Boot ROM |
### 3. Potential Issues in Decompiled Code
- **Missing data section:** Offsets 0x0001xxxx and 0x001fxxxx are relative
to the blob load address, not absolute MMIO. Ghidra treats them as MMIO
which is incorrect — they are data tables within the binary.
- **Timing loops:** Several `do {} while` patterns poll hardware registers
without timeout, which could hang if hardware doesn't respond.
- **Security registers:** The blob manipulates SGRF (0xFE05xxxx) to grant
DDR access — this is the firewall configuration for memory regions.
### 4. Recompilation Status
Direct recompilation of Ghidra's C output is not possible because:
- Ghidra uses synthetic types (`undefined8`, `undefined4`)
- Data section references are treated as absolute addresses
- Inline data tables (timing params) need to be separated
- The original compiler (likely ARM's armclang) produces different code patterns
Assembly-level comparison is the correct approach — the disassembly from
Ghidra exactly matches the original blob's machine code.
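
The first obstacle in the list above, Ghidra's synthetic types, is mechanical to address. The sketch below rewrites `undefinedN` to fixed-width types. Mapping to *unsigned* stdint types is an assumption (Ghidra's `undefined` types carry no signedness), and `rewrite_types` is a hypothetical helper, not part of the files in this commit. The other obstacles are not solved by this.

```python
import re

# Assumed mapping: Ghidra's undefinedN is an N-byte value of unknown type;
# plain `undefined` is a single byte.
_WIDTH_TO_STDINT = {None: "uint8_t", "1": "uint8_t", "2": "uint16_t",
                    "4": "uint32_t", "8": "uint64_t"}

_UNDEFINED_RE = re.compile(r"\bundefined([1248])?\b")

def rewrite_types(c_source: str) -> str:
    """Replace Ghidra's undefined/undefinedN type names with stdint names."""
    return _UNDEFINED_RE.sub(lambda m: _WIDTH_TO_STDINT[m.group(1)], c_source)
```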
## Files
- `ddr_decompiled.c` — Decompiled C (fast blob, 118 functions)
- `ddr_conservative_decompiled.c` — Decompiled C (conservative blob)
- `ddr_diff.txt` — Diff between the two
- `ddr_fast_asm.s` — Full disassembly (fast)
- `ddr_conservative_asm.s` — Full disassembly (conservative)
- `rk3588_ddr.h` — Register definitions header
- `rk3588_regs_auto.h` — Auto-extracted MMIO register map
## Conclusion
The DDR init blobs are essentially a **single codebase with parameterized
timing tables**. To create a custom frequency configuration, only 6 bytes
of timing data need to be modified. The code itself handles DDR PHY
training, calibration, and memory controller initialization for all 4
channels of the RK3588.
## Detailed Register Analysis (TRM-verified)
### MMIO Regions Accessed by DDR Init Blob
| Address Range | Block | Registers | Purpose |
|--------------|-------|-----------|---------|
| 0xFD588xxx | PMU1_GRF | 1 | DDR training status |
| 0xFD598xxx | DDR_GRF_CH2 | 1 | Channel 2 config |
| 0xFD5F4xxx | BUS_GRF | 2 | Bus fabric base config |
| 0xFD5F8xxx | BUS_GRF | 25 | DDR bus interconnect, AXI routing, QoS |
| 0xFD8C8xxx | SCRU | 4 | DDR PLL (DPLL) clock gate/reset/config |
| 0xFE010xxx | DDRC_CH0 | 4 | UMCTL2 controller (offsets 0xF0-0xFC) |
| 0xFE030xxx | FW_DDR | 1 | Firewall access control |
| 0xFE050xxx | SGRF | 9 | Security - DDR region access permissions |
| 0xFECC0xxx | Unknown | 4 | Possibly DDR scramble/ECC |
| 0xFF000xxx | SRAM | 1 | Boot mailbox/flag |
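
The table above can be turned into a small lookup for annotating addresses found in the decompiled output. This is a sketch under one assumption: each `xxx` suffix is read as a 4 KiB window, so the low 12 bits are masked off. `classify` is a hypothetical helper name.

```python
from typing import Optional

# 4 KiB window base -> block name, copied from the table above.
REGION_BLOCKS = {
    0xFD588000: "PMU1_GRF",
    0xFD598000: "DDR_GRF_CH2",
    0xFD5F4000: "BUS_GRF",
    0xFD5F8000: "BUS_GRF",
    0xFD8C8000: "SCRU",
    0xFE010000: "DDRC_CH0",
    0xFE030000: "FW_DDR",
    0xFE050000: "SGRF",
    0xFECC0000: "Unknown",
    0xFF000000: "SRAM",
}

def classify(addr: int) -> Optional[str]:
    """Return the block name for an address, or None if it is outside the table."""
    return REGION_BLOCKS.get(addr & 0xFFFFF000)
```

For example, `classify(0xFE0500E0)` yields `"SGRF"`, while an address such as `0x001FE000` returns `None` because it is not in the MMIO table.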
### Potential Bugs / Concerns
1. **No timeout on hardware polls:** FUN_000000e4 polls `_DAT_fe0500e0`
(SGRF status) in a tight loop with no timeout. If SGRF doesn't respond,
the system hangs permanently during boot.
2. **Single-channel DDRC access:** Only CH0 registers (0xFE01xxxx) are
accessed directly. Channels 1-3 are likely configured via the dense
BUS_GRF register block (0xFD5F8xxx) which may broadcast to all channels.
3. **Firewall opened wide:** `_DAT_fe030040 |= 0xffff` in FUN_000000e4
opens all DDR firewall masters — this grants full DDR access to all
bus masters during init, which is expected but never re-restricted.
4. **0x001FE000 region:** This 0x001FExxxx area (5 registers) is likely
a shared memory mailbox used to communicate between the DDR blob and
BL31/TF-A. Not actual MMIO — it's SRAM at a fixed offset.
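
Concern 1 can be illustrated with a bounded poll. The sketch below models the logic in Python only; the real fix has to be made in the blob's AArch64 code. The names `read_reg`, `mask`, `expected` and `max_iters` are hypothetical: the analysis does not say which bits FUN_000000e4 waits for, nor how `patch_prod.py` bounds the loop.

```python
from typing import Callable

def poll_until(read_reg: Callable[[], int], mask: int, expected: int,
               max_iters: int) -> bool:
    """Poll a register until (value & mask) == expected.

    Returns True on success and False once max_iters reads have failed,
    instead of spinning forever like an unbounded do {} while loop.
    """
    for _ in range(max_iters):
        if (read_reg() & mask) == expected:
            return True
    return False
```

A caller can then choose how to fail (log, retry at a lower frequency, reset) rather than hanging silently during boot.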
### Binary Comparison: Fast vs Conservative
Both blobs share identical code (118 functions, 17,308 instructions).
Only the timing data table differs:
```
Offset 0x11B8C: LP4 freq parameter
Fast: 0x0840 (2112 MHz)
Conservative: 0x0738 (1848 MHz)
Offset 0x11BF4: LP5 freq parameter
Fast: 0x6960 (2400 MHz)
Conservative: 0x6840 (2112 MHz)
```
These values appear in the data section at the end of the blob and are
loaded by the frequency setup function (likely FUN_000009fc or nearby).
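
The values above decode directly: in all four cases the low 12 bits are the frequency in MHz. The LP5 values additionally carry `0x6` in the top nibble, whose meaning is not established by this analysis. The sketch below splits a parameter into those two parts; `decode_freq_param` is a hypothetical helper name.

```python
def decode_freq_param(raw: int) -> tuple:
    """Split a 16-bit frequency parameter into (top_nibble, mhz).

    mhz is the low 12 bits; the top nibble is returned unchanged because
    its meaning is unknown (0x0 for the LP4 values, 0x6 for the LP5 values).
    """
    return raw >> 12, raw & 0x0FFF

assert decode_freq_param(0x0840) == (0x0, 2112)  # LP4 fast
assert decode_freq_param(0x0738) == (0x0, 1848)  # LP4 conservative
assert decode_freq_param(0x6960) == (0x6, 2400)  # LP5 fast
assert decode_freq_param(0x6840) == (0x6, 2112)  # LP5 conservative
```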
### Recompilation Assessment
The decompiled C cannot be directly recompiled because:
1. Ghidra's `undefined*` types need mapping to stdint types
2. Internal data references (0x0001xxxx) are blob-relative, not absolute
3. The blob is position-dependent — loaded at a fixed address by BL2
4. String/data tables are interleaved with code in the original binary
However, **assembly-level patching is straightforward**. Comparing the fast and
conservative blobs shows that 6 bytes of timing data account for the entire
frequency difference between them. A tool could:
1. Parse the blob header
2. Locate the timing table (at known offset 0x11B8C)
3. Patch frequency values
4. Recalculate any checksums (if present — not confirmed)
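
Steps 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the contents of `patch_prod.py`: the offsets are the v1.19 ones reported here, values are assumed to be stored as little-endian 16-bit words, the LP5 top nibble is preserved because its meaning is unknown, and header parsing and checksum handling (steps 1 and 4) are left out.

```python
import struct

# Offsets reported for the v1.19 blob; other versions will differ.
LP4_OFFSETS = (0x11B8C, 0x11BC0)  # the LP4 value is stored twice
LP5_OFFSET = 0x11BF4

def patch_freq(blob: bytes, lp4_mhz: int, lp5_mhz: int) -> bytes:
    """Return a copy of blob with the LP4/LP5 frequency parameters replaced."""
    for mhz in (lp4_mhz, lp5_mhz):
        if not 0 < mhz <= 0x0FFF:
            raise ValueError("frequency must fit in 12 bits")
    if len(blob) < LP5_OFFSET + 2:
        raise ValueError("blob too small for the v1.19 offsets")
    out = bytearray(blob)
    for off in LP4_OFFSETS:
        struct.pack_into("<H", out, off, lp4_mhz)  # assumed little-endian
    (old_lp5,) = struct.unpack_from("<H", out, LP5_OFFSET)
    # Keep the unexplained top nibble (0x6 in the stock blobs) untouched.
    struct.pack_into("<H", out, LP5_OFFSET, (old_lp5 & 0xF000) | lp5_mhz)
    return bytes(out)
```

Patching the fast blob with `patch_freq(blob, 1848, 2112)` should reproduce the conservative timing values, which gives a convenient sanity check before trying untested frequencies.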
### Files Added
- `rk3588_ddr.h` — Complete RK3588 DDR memory map header (TRM-verified)
- `rk3588_regs_annotated.h` — All 79 MMIO registers with block annotations
- `ddr_fast_asm.s` / `ddr_conservative_asm.s` — Full disassembly listings
- `ddr_diff.txt` — Diff between fast and conservative decompiled output
## Blob Version Comparison (All Revisions)
### Size & Complexity Evolution
| Version | Size | Functions | BL calls | LP5 MHz | Change from prev |
|---------|------|-----------|----------|---------|-----------------|
| v1.02 | 42 KB | ~80 | 258 | 2736 | Earliest available |
| v1.03 | 45 KB | ~85 | 294 | 2736 | +2.9 KB code |
| v1.04 | 49 KB | ~90 | 326 | 2736 | +4.1 KB code |
| v1.07 | 60 KB | ~95 | 425 | 2736 | +11 KB major rewrite |
| v1.08 | 64 KB | ~98 | 468 | 2736 | +4 KB code |
| v1.09 | 71 KB | 101 | 484 | 2736 | +6 KB code |
| v1.10 | 70 KB | 101 | 484 | 2736 | -0.6 KB refactor |
| v1.11 | 72 KB | 102 | 493 | 2736 | +1.6 KB |
| v1.12 | 73 KB | 105 | 513 | 2736 | +1.2 KB |
| v1.14 | 73 KB | 105 | 519 | 2736 | +0.2 KB |
| v1.15 | 73 KB | 105 | 521 | 2736 | +0.2 KB (last 2736) |
| v1.16 | 75 KB | 105 | 528 | **2400** | +2.2 KB + freq downgrade |
| v1.17 | 73 KB | 104 | 516 | 2400 | -2 KB refactor |
| v1.18 | 75 KB | 106 | 544 | 2400 | +1.9 KB |
| v1.19 | 77 KB | 118 | 560 | 2400 | +1.4 KB (current) |
### Code Changes Between Key Versions
**Every version contains real code changes, not just timing adjustments.**
| Transition | Identical funcs | Changed | New | Assessment |
|-----------|----------------|---------|-----|------------|
| v1.09 → v1.15 | 40 | 61 | 65 | Major: PHY training, ODT updates |
| v1.15 → v1.16 | 64 | 41 | 40 | Major: 2736→2400 + code changes |
| v1.16 → v1.19 | 31 | 73 | 82 | Major: new functions, expanded training |
### Conclusion
The DDR blobs are under **active development**: every compared transition shows
substantial code changes, not just parameter tweaks. The blob grew from 42 KB
(v1.02) to 77 KB (v1.19), nearly doubling in size, while the function count
rose from roughly 80 to 118.
The v1.15→v1.16 transition (2736→2400 MHz) was **not just a frequency change**:
the comparison above lists 41 changed functions alongside the frequency
downgrade, which suggests Rockchip found bugs or instability at 2736 MHz and
rewrote parts of the training algorithm.
**Implication for using old blobs:** Running v1.15 (2736 MHz) means missing
all bug fixes from v1.16-v1.19. The safest approach for higher frequencies
is to use the current v1.19 blob with **rkddr** to patch the frequency
parameter, getting both the latest code and custom timing.