Ship the new simulation & verification stack under simulation/:
- mmio_regions.py — address → region classifier (DDRCTL, DDRPHY,
OTP, SRAM, …). Shared by every other tool so trace output is
scannable without memorising the memory map.
- sim_tripwire.py — Bin-style per-access capture. Records
(seq, insn_tick, pc, addr, size, rw, val, region, fn_name) per
MMIO access. PCResolver bisects the vendor funs table parsed
from ddr_conservative_asm.s.
- tripwire_diff.py — PC-bucketed difflib.SequenceMatcher diff of
two tripwire CSVs. Buckets by fn_name so bitflip-induced control
flow divergence doesn't cascade noise.
- training_sim.py — DDR training simulator with --mode pass and
--mode bitflip (flip first N reads per training status, exercise
retry paths). BITFLIP_ONLY env var narrows to a single addr for
the sweep.
- bitflip_sweep.py — Flip each of 23 training-status addresses
one-at-a-time and tabulate retry convergence. Surfaces which
function(s) react to a transient fault by writing different
downstream register values.
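
The bitflip mode described above (corrupt the first N reads of a training-status address, then let later reads see the true value) can be modelled without an emulator. The following is a minimal sketch, not the code in training_sim.py: the class name, the default bit position, and the `only` parameter (standing in for the BITFLIP_ONLY narrowing) are assumptions made for illustration.

```python
from collections import defaultdict


class BitflipReadModel:
    """Flip one bit in the first `n_flips` reads of each watched address."""

    def __init__(self, watched, n_flips=1, bit=0, only=None):
        # `only` narrows the fault to a single address (hypothetical
        # stand-in for the BITFLIP_ONLY environment variable).
        self.watched = {only} if only is not None else set(watched)
        self.n_flips = n_flips
        self.mask = 1 << bit
        self.reads_seen = defaultdict(int)

    def on_read(self, addr, true_value):
        """Return the value the emulated code should observe for this read."""
        if addr not in self.watched:
            return true_value
        self.reads_seen[addr] += 1
        if self.reads_seen[addr] <= self.n_flips:
            return true_value ^ self.mask  # transient fault
        return true_value  # retries see the real value


if __name__ == "__main__":
    status_addr = 0xFE0C0A24  # example address only
    model = BitflipReadModel([status_addr], n_flips=2)
    print([hex(model.on_read(status_addr, 0x02)) for _ in range(4)])
```

Counting reads per address rather than globally keeps each fault independent, which is what a one-address-at-a-time sweep needs.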
Plus:
- mmio_diff.py updated: region-tagged divergence output,
--show-regions histogram, --tripwire-out-{vendor,rebuilt} CSV
capture, --capture-stack-writes for stack-allocated buffer diffs.
- debug_probes/tp_slot_{probe,writes}.py — ad-hoc Unicorn probes
for chasing a single-slot divergence in an SRAM buffer. Kept as
reference examples of how to extend the tripwire toolchain.
The stack found 6 silicon-hostile bugs in the rebuilt blob that
mmio_diff's write-sequence gate was structurally blind to, including
three ld-unresolved-symbol NULL derefs (case-mismatched externs,
missing DATA_SYMS) and one C-early-return-skips-shared-tail bug
where vendor's asm fell through to the tail via `b` after a `ret`.
RK3588 DDR Init Blob — Reverse Engineering Project
## Prerequisites
Patching (any platform)
Emulation (x86_64 only)
Decompilation (x86_64 only)
Cross-compilation tools (optional)
Decompilation, analysis, patching, and pre-silicon simulation of the closed-source Rockchip RK3588 DDR initialization binary blobs.
The project has three layers:
- Static RE: Ghidra-exported decompiled C + disassembly + annotated register map (`ddr_annotated.c`, `rk3588_ddr.h`).
- Patch + flash: `patch_prod.py` rewrites specific poll loops in the vendor blob to work around known hangs. Validated under Unicorn via `blob_emu.py` before flash.
- Matching-decomp rebuild + simulation: the goal is a buildable working DDR blob (not bit-identical reproduction). Per-function C ports are spliced into the vendor blob; `mmio_diff.py` gates behavioral equivalence by MMIO write sequence. The `simulation/` subdir adds read-side tripwire capture, PC-bucketed diff, and per-address bitflip fault injection for retry-path validation.

"Markus' insistence on simulation before flashing paid off. Big time. Again." (2026-04-21) Tripwire + PC-bucket diff caught three silent NULL-derefs hidden behind `mmio_diff=3173/3173` green. `ld --unresolved-symbols=ignore-all` had zeroed undefined DATA_SYMS externs; silicon would have bricked.
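
To illustrate the PC-bucketed diff idea, here is a minimal sketch using `difflib.SequenceMatcher`. It is not the code in `simulation/tripwire_diff.py`; the function names are hypothetical, and the CSV column names (`fn_name`, `rw`, `addr`, `val`) are assumed to match the tripwire record fields.

```python
import csv
import io
from collections import defaultdict
from difflib import SequenceMatcher


def load_rows(csv_text):
    """Parse tripwire CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def bucket_by_fn(rows):
    """Group (rw, addr, val) tuples by fn_name, preserving access order."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row["fn_name"]].append((row["rw"], row["addr"], row["val"]))
    return buckets


def bucketed_diff(vendor_rows, rebuilt_rows):
    """Yield (fn_name, tag, vendor_slice, rebuilt_slice) for each difference."""
    a = bucket_by_fn(vendor_rows)
    b = bucket_by_fn(rebuilt_rows)
    for fn in sorted(set(a) | set(b)):
        va, rb = a.get(fn, []), b.get(fn, [])
        matcher = SequenceMatcher(None, va, rb, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag != "equal":
                yield fn, tag, va[i1:i2], rb[j1:j2]


if __name__ == "__main__":
    vendor = "fn_name,rw,addr,val\nfn_a,W,0xfe010000,0x1\nfn_b,W,0xfe010004,0x5\n"
    rebuilt = "fn_name,rw,addr,val\nfn_a,W,0xfe010000,0x1\nfn_b,W,0xfe010004,0x7\n"
    for hit in bucketed_diff(load_rows(vendor), load_rows(rebuilt)):
        print(hit)
```

Because each function's accesses are compared separately, an extra or missing call in one function shows up only in that function's bucket instead of shifting every later line of a flat diff.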
Quick Start
# Apply production patch to current blob
python3 patch_prod.py /path/to/rk3588_ddr_lp4_2112MHz_lp5_2400MHz_v1.19.bin output.bin
# Run the Unicorn-based emulation test (x86_64 only)
python3 /opt/work/emu_test.py
# Build the C emulator (on x86 oppenheimer container)
gcc -O2 -o ddr_emu ddr_emu2.c -lunicorn -lm
./ddr_emu blob.bin 50000
Files
Decompiled Sources
| File | Description |
|---|---|
| `ddr_decompiled.c` | Raw Ghidra decompilation (fast blob, 118 functions) |
| `ddr_conservative_decompiled.c` | Raw decompilation (conservative blob) |
| `ddr_annotated.c` | Human-readable annotated source (53 named functions, 79 named registers) |
| `ddr_diff.txt` | Diff between fast and conservative (only 12 lines!) |
| `ddr_fast_asm.s` | Full AArch64 disassembly (17,308 lines) |
| `ddr_conservative_asm.s` | Conservative disassembly |
Headers & Register Maps
| File | Description |
|---|---|
| `rk3588_ddr.h` | Complete RK3588 DDR memory map (TRM Part 2 verified) |
| `rk3588_regs_annotated.h` | All 79 MMIO registers with hardware block annotations |
Patchers
| File | Description |
|---|---|
| `patch_prod.py` | Production patcher: NOPs 40 non-critical polls, keeps 5 training loops |
| `patch_timeouts.py` | Aggressive patcher: NOPs all 16 B.cond polls (analysis only) |
Patched Blobs
| File | Patches | Use |
|---|---|---|
| `rk3588_ddr_v1.19_prod.bin` | 40 NOPs, 5 kept | Production-ready |
| `rk3588_ddr_v1.19_patched_v2.bin` | 45 NOPs (all) | Analysis/QEMU testing |
Analysis & Research
| File | Description |
|---|---|
| `ANALYSIS.md` | Full technical analysis with register maps and version comparison |
| `BUG_ANALYSIS.md` | Bug report, optimization opportunities, training explainer |
| `DDR_FREQUENCY_TABLE.md` | All LPDDR5 frequencies from 2112-3200 MHz |
| `COMMUNITY_RESEARCH.md` | 40+ sources on DDR training, Rockchip issues, community OC |
Emulation
| File | Description |
|---|---|
| `ddr_emu2.c` | Unicorn-based C emulator with MMIO stubs |
| `blob_emu.py` | Python Unicorn harness: runs a blob, stubs MMIO, captures UART TX |
| `mmio_diff.py` | Runs vendor + rebuilt, diffs MMIO write sequences; region-tagged output |
| Ghidra project | On oppenheimer (CT131): /opt/work/ghidra_project/ |
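
As a rough illustration of what a write-sequence gate checks, the sketch below compares two ordered lists of `(addr, value)` writes and reports the first mismatch. It is not the logic of `mmio_diff.py`; the function name and the tuple layout are assumptions.

```python
from itertools import zip_longest


def first_write_divergence(vendor_writes, rebuilt_writes):
    """Return (index, vendor_write, rebuilt_write) at the first mismatch.

    Returns None when both sequences are identical. A missing entry on
    either side is reported as None in that position.
    """
    pairs = zip_longest(vendor_writes, rebuilt_writes)
    for index, (vendor, rebuilt) in enumerate(pairs):
        if vendor != rebuilt:
            return index, vendor, rebuilt
    return None


if __name__ == "__main__":
    vendor = [(0xFE010000, 0x1), (0xFE010004, 0x5)]
    rebuilt = [(0xFE010000, 0x1), (0xFE010004, 0x7)]
    print(first_write_divergence(vendor, rebuilt))
```

A gate of this shape only sees writes, so two blobs that read different addresses but happen to write the same values still compare equal. That gap is what the read-side tripwire capture is meant to cover.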
Simulation & Verification Stack (simulation/)
| File | Description |
|---|---|
| `simulation/mmio_regions.py` | Address → region classifier (DDRCTL, DDRPHY, OTP, SRAM, …) |
| `simulation/sim_tripwire.py` | Bin-style per-access capture with PC → fn name resolver |
| `simulation/tripwire_diff.py` | PC-bucketed SequenceMatcher diff of two tripwire CSVs |
| `simulation/training_sim.py` | DDR-training simulator: pass and bitflip-first-pass modes |
| `simulation/bitflip_sweep.py` | Flip each training-status addr one-at-a-time, report retry convergence |
| `simulation/README.md` | Synopsis + usage for the above |
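
A PC → function-name resolver of the kind listed for `sim_tripwire.py` can be built on a sorted table of function start addresses and a binary search. The sketch below is an assumption about the general technique, not the project's `PCResolver`; the table entries in the demo are made up apart from the `fn_5540` name.

```python
import bisect


class PCResolver:
    """Map a PC to the function whose start address is nearest at or below it."""

    def __init__(self, functions):
        # `functions` is an iterable of (start_address, name) pairs,
        # e.g. parsed from a disassembly listing.
        table = sorted(functions)
        self.starts = [start for start, _ in table]
        self.names = [name for _, name in table]

    def resolve(self, pc):
        """Return the containing function name, or None if pc precedes all."""
        index = bisect.bisect_right(self.starts, pc) - 1
        return self.names[index] if index >= 0 else None


if __name__ == "__main__":
    resolver = PCResolver([(0x5540, "fn_5540"), (0x1000, "fn_1000")])
    print(resolver.resolve(0x5544))
```

This version knows only start addresses, so any PC past the last function start is attributed to the last function; a table that also carries end addresses could return None there instead.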
Debug Probes (debug_probes/)
| File | Description |
|---|---|
| `debug_probes/tp_slot_probe.py` | Snapshot the tp timing buffer at fn_5540's read site |
| `debug_probes/tp_slot_writes.py` | List every write to a specific tp slot, both vendor and rebuilt |
Ghidra Export Scripts
| File | Description |
|---|---|
| `ExportDecompiled.java` | Exports all functions as decompiled C |
| `ExportAsm.java` | Exports full disassembly listing |
QEMU Emulation Approach
Why QEMU Alone Doesn't Work
The DDR blob runs at EL3 (secure world) and accesses hardware-specific MMIO
registers. Standard QEMU virt machine doesn't model RK3588 hardware, so:
- All MMIO reads return 0 (unmapped memory)
- System register writes (MSR VBAR_EL3, etc.) cause exceptions
- The blob gets stuck on the very first register check
Solution: Unicorn Engine
We use the Unicorn CPU emulator (libunicorn) which provides:
- AArch64 instruction emulation without OS/machine model
- Memory mapping API to create MMIO stub regions
- Exception hooks to skip privileged instructions (MSR/MRS)
- Code hooks for instruction counting and timeouts
Emulation Setup
Memory Map:
0x00000000 - 0x0001FFFF Blob code + data (128KB)
0x00100000 - 0x0010FFFF Stack (64KB)
0x001F0000 - 0x001FFFFF SRAM mailbox
0xFD580000 - 0xFD59FFFF GRF (pre-seeded with 0)
0xFD5F0000 - 0xFD5FFFFF BUS_GRF
0xFD8C0000 - 0xFD8CFFFF SCRU
0xFE010000 - 0xFE02FFFF DDRC
0xFE030000 - 0xFE03FFFF FW_DDR
0xFE050000 - 0xFE05FFFF SGRF (pre-seeded: STATUS=0, CON21=1)
0xFE0C0000 - 0xFE0FFFFF DDRPHY (pre-seeded: DfiStatus=2, CalBusy=0)
0xFECC0000 - 0xFECCFFFF DDR_SCRAMBLE
0xFF000000 - 0xFF0FFFFF SRAM_BOOT
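
The memory map above translates directly into an address → region lookup. The sketch below copies the ranges from that map; the short labels `BLOB`, `STACK`, and `SRAM_MAILBOX`, the `UNKNOWN` fallback, and the function name are assumptions, and this is not the code in `simulation/mmio_regions.py`.

```python
import bisect

# (start, end_inclusive, name), sorted by start address.
REGIONS = [
    (0x00000000, 0x0001FFFF, "BLOB"),
    (0x00100000, 0x0010FFFF, "STACK"),
    (0x001F0000, 0x001FFFFF, "SRAM_MAILBOX"),
    (0xFD580000, 0xFD59FFFF, "GRF"),
    (0xFD5F0000, 0xFD5FFFFF, "BUS_GRF"),
    (0xFD8C0000, 0xFD8CFFFF, "SCRU"),
    (0xFE010000, 0xFE02FFFF, "DDRC"),
    (0xFE030000, 0xFE03FFFF, "FW_DDR"),
    (0xFE050000, 0xFE05FFFF, "SGRF"),
    (0xFE0C0000, 0xFE0FFFFF, "DDRPHY"),
    (0xFECC0000, 0xFECCFFFF, "DDR_SCRAMBLE"),
    (0xFF000000, 0xFF0FFFFF, "SRAM_BOOT"),
]
_STARTS = [start for start, _, _ in REGIONS]


def classify(addr):
    """Return the region name containing addr, or "UNKNOWN"."""
    index = bisect.bisect_right(_STARTS, addr) - 1
    if index >= 0 and addr <= REGIONS[index][1]:
        return REGIONS[index][2]
    return "UNKNOWN"


if __name__ == "__main__":
    for address in (0xFE0C0A24, 0xFE0500E0, 0xFE040000):
        print(hex(address), classify(address))
```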
Pre-seeded MMIO Values
Training-critical registers are pre-seeded with "ready" values:
- `SGRF_DDR_STATUS` (0xFE0500E0) = 0 (ready)
- `SGRF_DDR_CON21` (0xFE050054) = 1 (done)
- `DfiStatus` (PHY+0xA24) = 0x02 (DFI ready)
- `CalBusy` (PHY+0x684) = 0x00 (not busy)
- `MicroContMuxSel` (PHY+0x10090) = 0 (available)
- `MicroReset` (PHY+0x10080) = 0 (reset complete)
- `UctWriteProtShadow` (PHY+0x10514) = 0 (training done)
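
The pre-seeded values can be captured as a table that backs a simple MMIO stub. This is a hedged sketch rather than the harness code: it assumes `PHY+offset` is relative to the DDRPHY base 0xFE0C0000 from the memory map, that registers are 32 bits wide, and that unseeded addresses read as 0. The `MmioStub` class is hypothetical.

```python
PHY_BASE = 0xFE0C0000  # assumed: DDRPHY base from the memory map

PRESEED = {
    0xFE0500E0: 0x0,           # SGRF_DDR_STATUS: ready
    0xFE050054: 0x1,           # SGRF_DDR_CON21: done
    PHY_BASE + 0x00A24: 0x02,  # DfiStatus: DFI ready
    PHY_BASE + 0x00684: 0x00,  # CalBusy: not busy
    PHY_BASE + 0x10090: 0x0,   # MicroContMuxSel: available
    PHY_BASE + 0x10080: 0x0,   # MicroReset: reset complete
    PHY_BASE + 0x10514: 0x0,   # UctWriteProtShadow: training done
}


class MmioStub:
    """Dictionary-backed register file with pre-seeded 'ready' values."""

    def __init__(self, preseed):
        self.mem = dict(preseed)

    def read(self, addr):
        # Unseeded registers read as 0 (an assumption of this sketch).
        return self.mem.get(addr, 0)

    def write(self, addr, value):
        self.mem[addr] = value & 0xFFFFFFFF  # keep 32 bits


if __name__ == "__main__":
    stub = MmioStub(PRESEED)
    print(hex(stub.read(PHY_BASE + 0xA24)))
```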
Exception Handling
The `hook_intr` callback skips MSR/MRS/cache instructions by advancing the PC by 4 bytes (one AArch64 instruction). This allows the blob to execute through privileged setup code without implementing full EL3 register emulation.
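
The skip itself is just a PC adjustment. The sketch below models it on a plain dictionary of registers and adds an optional guard that only skips words in the A64 system-instruction encoding class (which covers MSR, MRS, SYS-based cache maintenance, barriers, and hints). The guard and both function names are assumptions, not a description of what `hook_intr` actually checks.

```python
SYSTEM_CLASS_MASK = 0xFFC00000
SYSTEM_CLASS_BITS = 0xD5000000  # bits [31:22] == 0b1101010100


def is_system_insn(insn):
    """True if the 32-bit word is in the A64 system-instruction class."""
    return (insn & SYSTEM_CLASS_MASK) == SYSTEM_CLASS_BITS


def skip_if_system(regs, insn):
    """Advance regs['pc'] past a system instruction; return True if skipped."""
    if not is_system_insn(insn):
        return False  # leave unrelated exceptions to the caller
    regs["pc"] = (regs["pc"] + 4) & 0xFFFFFFFFFFFFFFFF
    return True


if __name__ == "__main__":
    regs = {"pc": 0x100}
    print(skip_if_system(regs, 0xD5380000), hex(regs["pc"]))  # mrs x0, midr_el1
```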
Results
| Blob | Instructions | Final PC | Behavior |
|---|---|---|---|
| Original | 500K (limit) | 0x10350 | Stuck in TBZ poll loop |
| Patched (all NOP) | 500K (limit) | 0x09124 | Progressed into PHY training |
| Production patch | Similar to original for training loops | varies | Training polls preserved |
The original blob hangs at 0x10350, a `TBZ` on bit 1 that branches back 4 bytes while polling a PHY register. The patched blob passes through all 45 poll points and reaches deep PHY training code at 0x09124, where it waits for actual training completion (which requires real hardware feedback).
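
For readers unfamiliar with the encoding, a poll loop of this shape can be recognised by decoding the TBZ/TBNZ word and checking for a negative branch offset. The decoder below follows the A64 test-and-branch encoding; the function names and the example word are illustrative assumptions, not bytes taken from the blob.

```python
def decode_tbz(insn):
    """Decode TBZ/TBNZ into (mnemonic, reg, bit, byte_offset), else None."""
    if (insn & 0x7E000000) != 0x36000000:
        return None
    mnemonic = "tbnz" if (insn >> 24) & 1 else "tbz"
    # Tested bit number is b5:b40 (bit 31 and bits 23:19).
    bit = (((insn >> 31) & 1) << 5) | ((insn >> 19) & 0x1F)
    imm14 = (insn >> 5) & 0x3FFF
    if imm14 & 0x2000:  # sign-extend the 14-bit word offset
        imm14 -= 0x4000
    return mnemonic, insn & 0x1F, bit, imm14 * 4


def is_backward_poll(insn):
    """True for a TBZ/TBNZ that branches to an earlier address."""
    decoded = decode_tbz(insn)
    return decoded is not None and decoded[3] < 0


if __name__ == "__main__":
    word = 0x360FFFE0  # hand-assembled: tbz w0, #1, .-4
    print(decode_tbz(word), is_backward_poll(word))
```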
Production Patch Policy
| Poll Type | Action | Reason |
|---|---|---|
| SGRF status | NOP | Hardware ready at check time |
| Firewall | NOP | Synchronous write |
| PLL lock | NOP | Already locked by calling code |
| BUS_GRF | NOP | Configuration, not status |
| PHY DfiStatus | KEEP | Active training wait |
| PHY CalBusy | KEEP | ZQ calibration in progress |
| PHY MicroReset | KEEP | Firmware startup |
| PHY UctWriteProt | KEEP | Training completion |
| PHY MicroContMux | KEEP | Firmware mailbox |
| Unknown | NOP | Prevent hangs (conservative) |
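
The policy table can be read as a lookup from poll type to action, applied to a list of poll sites. The sketch below is an illustration under assumptions: it treats each poll site as one 4-byte instruction at a known file offset and overwrites it with the AArch64 NOP encoding (0xD503201F). Which instruction `patch_prod.py` actually rewrites, and how it locates poll sites, is not shown here; `apply_policy` and the offsets in the demo are hypothetical.

```python
NOP = (0xD503201F).to_bytes(4, "little")  # AArch64 NOP, little-endian

POLICY = {
    "SGRF status": "NOP",
    "Firewall": "NOP",
    "PLL lock": "NOP",
    "BUS_GRF": "NOP",
    "PHY DfiStatus": "KEEP",
    "PHY CalBusy": "KEEP",
    "PHY MicroReset": "KEEP",
    "PHY UctWriteProt": "KEEP",
    "PHY MicroContMux": "KEEP",
    "Unknown": "NOP",
}


def apply_policy(blob, polls):
    """Return (patched_bytes, nop_count, kept_count).

    `polls` is an iterable of (file_offset, poll_type) pairs.
    Unrecognised poll types fall back to the "Unknown" row.
    """
    out = bytearray(blob)
    nopped = kept = 0
    for offset, poll_type in polls:
        action = POLICY.get(poll_type, POLICY["Unknown"])
        if action == "KEEP":
            kept += 1
            continue
        if offset % 4 or offset + 4 > len(out):
            raise ValueError(f"bad patch offset {offset:#x}")
        out[offset:offset + 4] = NOP
        nopped += 1
    return bytes(out), nopped, kept


if __name__ == "__main__":
    demo = bytes(16)  # placeholder data, not a real blob
    patched, n, k = apply_policy(demo, [(4, "PLL lock"), (8, "PHY CalBusy")])
    print(patched.hex(), n, k)
```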
How to Use on Real Hardware
WARNING: Flashing a patched DDR blob can brick your board. Recovery requires maskrom mode. Only proceed if you understand the risks.
# 1. Backup current blob
dd if=/dev/mmcblk0 of=backup_idb.bin bs=512 count=8192
# 2. Patch
python3 patch_prod.py original_blob.bin patched_blob.bin
# 3. Flash (use rkdeveloptool in maskrom, or rkddr tool)
# See https://github.com/hbiyik/rkddr for safe in-place patching