test0r 46155bbe91 simulation: tripwire + PC-bucketed diff + bitflip sweep
Ship the new simulation & verification stack under simulation/:

- mmio_regions.py — address → region classifier (DDRCTL, DDRPHY,
  OTP, SRAM, …). Shared by every other tool so trace output is
  scannable without memorising the memory map.
- sim_tripwire.py — Bin-style per-access capture. Records
  (seq, insn_tick, pc, addr, size, rw, val, region, fn_name) per
  MMIO access. PCResolver bisects the vendor funs table parsed
  from ddr_conservative_asm.s.
- tripwire_diff.py — PC-bucketed difflib.SequenceMatcher diff of
  two tripwire CSVs. Buckets by fn_name so bitflip-induced control
  flow divergence doesn't cascade noise.
- training_sim.py — DDR training simulator with --mode pass and
  --mode bitflip (flip first N reads per training status, exercise
  retry paths). BITFLIP_ONLY env var narrows to a single addr for
  the sweep.
- bitflip_sweep.py — Flip each of 23 training-status addresses
  one-at-a-time and tabulate retry convergence. Surfaces which
  function(s) react to a transient fault by writing different
  downstream register values.

Plus:

- mmio_diff.py updated: region-tagged divergence output,
  --show-regions histogram, --tripwire-out-{vendor,rebuilt} CSV
  capture, --capture-stack-writes for stack-allocated buffer diffs.
- debug_probes/tp_slot_{probe,writes}.py — ad-hoc Unicorn probes
  for chasing a single-slot divergence in an SRAM buffer. Kept as
  reference examples of how to extend the tripwire toolchain.

The stack found 6 silicon-hostile bugs in the rebuilt blob that
mmio_diff's write-sequence gate was structurally blind to, including
three ld-unresolved-symbol NULL derefs (case-mismatched externs,
missing DATA_SYMS) and one C-early-return-skips-shared-tail bug
where vendor's asm fell through to the tail via `b` after a `ret`.
2026-04-22 05:55:28 +02:00


RK3588 DDR Init Blob — Reverse Engineering Project

Prerequisites

Patching (any platform)

Emulation (x86_64 only)

Decompilation (x86_64 only)

Cross-compilation tools (optional)

Decompilation, analysis, patching, and pre-silicon simulation of the closed-source Rockchip RK3588 DDR initialization binary blobs.

The project has three layers:

  1. Static RE: Ghidra-exported decompiled C, disassembly, and an annotated register map (ddr_annotated.c, rk3588_ddr.h).
  2. Patch + flash: patch_prod.py rewrites specific poll loops in the vendor blob to work around known hangs. Patches are validated under Unicorn via blob_emu.py before flashing.
  3. Matching-decomp rebuild + simulation: the goal is a buildable, working DDR blob, not a bit-identical reproduction. Per-function C ports are spliced into the vendor blob; mmio_diff.py gates behavioral equivalence on the MMIO write sequence. The simulation/ subdir adds read-side tripwire capture, PC-bucketed diff, and per-address bitflip fault injection for retry-path validation.
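
The bucketing idea behind the PC-bucketed diff can be sketched with the standard library alone. This is not the code in simulation/tripwire_diff.py: the row layout (fn_name, addr, rw, val) and the names bucket_by_fn and bucketed_diff are assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def bucket_by_fn(rows):
    """Group (fn_name, addr, rw, val) rows by function name, keeping order."""
    buckets = defaultdict(list)
    for fn_name, addr, rw, val in rows:
        buckets[fn_name].append((addr, rw, val))
    return buckets


def bucketed_diff(vendor_rows, rebuilt_rows):
    """Diff each function's access sequence separately.

    Returns {fn_name: [non-equal opcodes]} so that a divergence in one
    function does not shift the alignment of every later function.
    """
    vendor = bucket_by_fn(vendor_rows)
    rebuilt = bucket_by_fn(rebuilt_rows)
    report = {}
    for fn in sorted(set(vendor) | set(rebuilt)):
        matcher = SequenceMatcher(None, vendor.get(fn, []), rebuilt.get(fn, []),
                                  autojunk=False)
        changes = [op for op in matcher.get_opcodes() if op[0] != "equal"]
        if changes:
            report[fn] = changes
    return report
```

Each opcode is the usual difflib tuple (tag, i1, i2, j1, j2), indexing into that function's vendor and rebuilt access lists.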

"Markus' insistence on simulation before flashing paid off. Big time. Again." — 2026-04-21. Tripwire + PC-bucket diff caught three silent NULL-derefs hidden behind mmio_diff=3173/3173 green. ld --unresolved-symbols=ignore-all had zeroed undefined DATA_SYMS externs; silicon would have bricked.

Quick Start

# Apply production patch to current blob
python3 patch_prod.py /path/to/rk3588_ddr_lp4_2112MHz_lp5_2400MHz_v1.19.bin output.bin

# Run the Unicorn emulation test (x86_64 only)
python3 /opt/work/emu_test.py

# Build the C emulator (on x86 oppenheimer container)
gcc -O2 -o ddr_emu ddr_emu2.c -lunicorn -lm
./ddr_emu blob.bin 50000

Files

Decompiled Sources

File Description
ddr_decompiled.c Raw Ghidra decompilation (fast blob, 118 functions)
ddr_conservative_decompiled.c Raw decompilation (conservative blob)
ddr_annotated.c Human-readable annotated source (53 named functions, 79 named registers)
ddr_diff.txt Diff between fast and conservative (only 12 lines!)
ddr_fast_asm.s Full AArch64 disassembly (17,308 lines)
ddr_conservative_asm.s Conservative disassembly

Headers & Register Maps

File Description
rk3588_ddr.h Complete RK3588 DDR memory map (TRM Part 2 verified)
rk3588_regs_annotated.h All 79 MMIO registers with hardware block annotations

Patchers

File Description
patch_prod.py Production patcher — NOPs 40 non-critical polls, keeps 5 training loops
patch_timeouts.py Aggressive patcher — NOPs all 16 B.cond polls (analysis only)

Patched Blobs

| File | Patches | Use |
| --- | --- | --- |
| rk3588_ddr_v1.19_prod.bin | 40 NOPs, 5 kept | Production-ready |
| rk3588_ddr_v1.19_patched_v2.bin | 45 NOPs (all) | Analysis/QEMU testing |

Analysis & Research

File Description
ANALYSIS.md Full technical analysis with register maps and version comparison
BUG_ANALYSIS.md Bug report, optimization opportunities, training explainer
DDR_FREQUENCY_TABLE.md All LPDDR5 frequencies from 2112 to 3200 MHz
COMMUNITY_RESEARCH.md 40+ sources on DDR training, Rockchip issues, community OC

Emulation

File Description
ddr_emu2.c Unicorn-based C emulator with MMIO stubs
blob_emu.py Python Unicorn harness — runs a blob, stubs MMIO, captures UART TX
mmio_diff.py Runs vendor + rebuilt, diffs MMIO write sequences; region-tagged output
Ghidra project On oppenheimer (CT131): /opt/work/ghidra_project/

Simulation & Verification Stack (simulation/)

File Description
simulation/mmio_regions.py Address → region classifier (DDRCTL, DDRPHY, OTP, SRAM, …)
simulation/sim_tripwire.py Bin-style per-access capture with PC → fn name resolver
simulation/tripwire_diff.py PC-bucketed SequenceMatcher diff of two tripwire CSVs
simulation/training_sim.py DDR-training simulator: pass and bitflip-first-pass modes
simulation/bitflip_sweep.py Flip each training-status addr one-at-a-time, report retry convergence
simulation/README.md Synopsis + usage for the above
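
As a rough illustration of the fault-injection idea, the sketch below flips one bit on the first N reads of a single address and then returns the clean value, so a retry can converge. It is an assumption-laden stand-in, not the logic in training_sim.py or bitflip_sweep.py: the class name BitflipReadStub and its parameters are hypothetical.

```python
class BitflipReadStub:
    """Serve MMIO reads from a dict, corrupting the first reads of one address."""

    def __init__(self, seeded, target_addr, flips=1, bit=0):
        self.seeded = dict(seeded)      # addr -> clean value
        self.target_addr = target_addr  # the single address to corrupt
        self.remaining = flips          # how many reads are still to be flipped
        self.bit = bit                  # which bit to invert

    def read(self, addr):
        val = self.seeded.get(addr, 0)  # unseeded addresses read as 0
        if addr == self.target_addr and self.remaining > 0:
            self.remaining -= 1
            return val ^ (1 << self.bit)
        return val
```

A sweep would construct one stub per training-status address, run the blob against it, and record whether the retry path reaches the same final state.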

Debug Probes (debug_probes/)

File Description
debug_probes/tp_slot_probe.py Snapshot the tp timing buffer at fn_5540's read site
debug_probes/tp_slot_writes.py List every write to a specific tp slot, both vendor and rebuilt

Ghidra Export Scripts

File Description
ExportDecompiled.java Exports all functions as decompiled C
ExportAsm.java Exports full disassembly listing

QEMU Emulation Approach

Why QEMU Alone Doesn't Work

The DDR blob runs at EL3 (secure world) and accesses hardware-specific MMIO registers. The standard QEMU virt machine does not model RK3588 hardware, so:

  • All MMIO reads return 0 (unmapped memory)
  • System register writes (MSR VBAR_EL3, etc.) cause exceptions
  • The blob gets stuck on the very first register check

Solution: Unicorn Engine

We use the Unicorn CPU emulator (libunicorn), which provides:

  • AArch64 instruction emulation without OS/machine model
  • Memory mapping API to create MMIO stub regions
  • Exception hooks to skip privileged instructions (MSR/MRS)
  • Code hooks for instruction counting and timeouts

Emulation Setup

Memory Map:
  0x00000000 - 0x0001FFFF  Blob code + data (128KB)
  0x00100000 - 0x0010FFFF  Stack (64KB)
  0x001F0000 - 0x001FFFFF  SRAM mailbox
  0xFD580000 - 0xFD59FFFF  GRF (pre-seeded with 0)
  0xFD5F0000 - 0xFD5FFFFF  BUS_GRF
  0xFD8C0000 - 0xFD8CFFFF  SCRU
  0xFE010000 - 0xFE02FFFF  DDRC
  0xFE030000 - 0xFE03FFFF  FW_DDR
  0xFE050000 - 0xFE05FFFF  SGRF (pre-seeded: STATUS=0, CON21=1)
  0xFE0C0000 - 0xFE0FFFFF  DDRPHY (pre-seeded: DfiStatus=2, CalBusy=0)
  0xFECC0000 - 0xFECCFFFF  DDR_SCRAMBLE
  0xFF000000 - 0xFF0FFFFF  SRAM_BOOT
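
The memory map above lends itself to a small address-to-region lookup, in the spirit of what mmio_regions.py is described as doing. The sketch below is an independent illustration built only from the ranges listed here; the region labels for the first three rows and the names REGIONS and classify are assumptions.

```python
from bisect import bisect_right

# (start, end inclusive, label), sorted by start address.
REGIONS = [
    (0x00000000, 0x0001FFFF, "BLOB"),
    (0x00100000, 0x0010FFFF, "STACK"),
    (0x001F0000, 0x001FFFFF, "SRAM_MAILBOX"),
    (0xFD580000, 0xFD59FFFF, "GRF"),
    (0xFD5F0000, 0xFD5FFFFF, "BUS_GRF"),
    (0xFD8C0000, 0xFD8CFFFF, "SCRU"),
    (0xFE010000, 0xFE02FFFF, "DDRC"),
    (0xFE030000, 0xFE03FFFF, "FW_DDR"),
    (0xFE050000, 0xFE05FFFF, "SGRF"),
    (0xFE0C0000, 0xFE0FFFFF, "DDRPHY"),
    (0xFECC0000, 0xFECCFFFF, "DDR_SCRAMBLE"),
    (0xFF000000, 0xFF0FFFFF, "SRAM_BOOT"),
]
_STARTS = [start for start, _end, _name in REGIONS]


def classify(addr):
    """Return the label of the region containing addr, or "UNMAPPED"."""
    i = bisect_right(_STARTS, addr) - 1  # last region starting at or below addr
    if i >= 0:
        _start, end, name = REGIONS[i]
        if addr <= end:
            return name
    return "UNMAPPED"
```

For example, classify(0xFE0500E0) returns "SGRF", and an address in a gap between regions, such as 0x00020000, returns "UNMAPPED".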

Pre-seeded MMIO Values

Training-critical registers are pre-seeded with "ready" values:

  • SGRF_DDR_STATUS (0xFE0500E0) = 0 (ready)
  • SGRF_DDR_CON21 (0xFE050054) = 1 (done)
  • DfiStatus (PHY+0xA24) = 0x02 (DFI ready)
  • CalBusy (PHY+0x684) = 0x00 (not busy)
  • MicroContMuxSel (PHY+0x10090) = 0 (available)
  • MicroReset (PHY+0x10080) = 0 (reset complete)
  • UctWriteProtShadow (PHY+0x10514) = 0 (training done)
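
The list above can be expressed as a seed table. This sketch assumes the PHY+offset values are relative to the DDRPHY region start in the memory map (0xFE0C0000); the names PHY_BASE, PRESEED, and seed_memory are hypothetical, and the actual harness may seed its stubs differently.

```python
PHY_BASE = 0xFE0C0000  # assumption: "PHY+off" is relative to the DDRPHY region start

# Absolute address -> value returned on read before any write.
PRESEED = {
    0xFE0500E0: 0x00,          # SGRF_DDR_STATUS: ready
    0xFE050054: 0x01,          # SGRF_DDR_CON21: done
    PHY_BASE + 0xA24: 0x02,    # DfiStatus: DFI ready
    PHY_BASE + 0x684: 0x00,    # CalBusy: not busy
    PHY_BASE + 0x10090: 0x00,  # MicroContMuxSel: available
    PHY_BASE + 0x10080: 0x00,  # MicroReset: reset complete
    PHY_BASE + 0x10514: 0x00,  # UctWriteProtShadow: training done
}


def seed_memory(write32, preseed=PRESEED):
    """Call write32(addr, value) for every entry; return the number seeded."""
    for addr, val in sorted(preseed.items()):
        write32(addr, val)
    return len(preseed)
```

With a dict standing in for memory, seed_memory(mem.__setitem__) fills it in one call. In a Unicorn harness, write32 could instead wrap Uc.mem_write with the value packed as four little-endian bytes.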

Exception Handling

The hook_intr callback skips MSR, MRS, and cache-maintenance instructions by advancing the PC by 4 bytes. This lets the blob execute through privileged setup code without a full EL3 register model.
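
One way to decide whether a faulting instruction is safe to skip is to test for the A64 system-instruction class (bits 31:22 equal to 0b1101010100), which covers MSR, MRS, SYS cache maintenance, hints, and barriers. The sketch below models only that decode and the PC advance on a plain byte buffer; it is not the project's hook_intr, and the names is_system_insn and skip_if_privileged are assumptions.

```python
SYSTEM_MASK = 0xFFC00000  # selects bits 31:22
SYSTEM_BITS = 0xD5000000  # 0b1101010100 in bits 31:22


def is_system_insn(insn):
    """True for A64 system-class instructions (MSR, MRS, SYS, hints, barriers)."""
    return (insn & SYSTEM_MASK) == SYSTEM_BITS


def skip_if_privileged(pc, code, base=0):
    """Return pc + 4 if the instruction at pc is system-class, else pc unchanged.

    code is the raw little-endian image and base is its load address.
    """
    off = pc - base
    insn = int.from_bytes(code[off:off + 4], "little")
    return pc + 4 if is_system_insn(insn) else pc
```

For instance, MSR VBAR_EL3, x0 encodes as 0xD51EC000 and is skipped, while an ordinary ADD such as 0x91000400 is left alone.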

Results

| Blob | Instructions | Final PC | Behavior |
| --- | --- | --- | --- |
| Original | 500K (limit) | 0x10350 | Stuck in TBZ poll loop |
| Patched (all NOP) | 500K (limit) | 0x09124 | Progressed into PHY training |
| Production patch | Similar to original for training loops | varies | Training polls preserved |

The original blob hangs at 0x10350, a TBZ on bit 1 that branches back 4 bytes while waiting on a PHY register. The patched blob passes through all 45 poll points and reaches PHY training code at 0x09124, where it waits for actual training completion, which requires real hardware feedback.
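
To make the "TBZ bit 1, -4" shape concrete, the sketch below decodes an A64 TBZ word into its tested bit, branch offset, and register, and flags backward branches as poll-loop candidates. The function names are hypothetical, and the example word 0x360FFFE0 is constructed for illustration rather than read from the blob.

```python
def decode_tbz(insn):
    """Decode an A64 TBZ word into (bit, byte_offset, rt), or None if not TBZ."""
    if (insn >> 24) & 0x7F != 0x36:  # bits 30:24 == 0b0110110 selects TBZ
        return None
    bit = (((insn >> 31) & 1) << 5) | ((insn >> 19) & 0x1F)  # b5:b40
    imm14 = (insn >> 5) & 0x3FFF
    if imm14 & 0x2000:               # sign-extend the 14-bit immediate
        imm14 -= 0x4000
    return bit, imm14 * 4, insn & 0x1F


def is_backward_poll(insn):
    """True if insn is a TBZ that branches backwards, the typical poll shape."""
    decoded = decode_tbz(insn)
    return decoded is not None and decoded[1] < 0
```

Here decode_tbz(0x360FFFE0) yields (1, -4, 0): test bit 1 of register 0 and branch back 4 bytes if it is clear.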

Production Patch Policy

| Poll Type | Action | Reason |
| --- | --- | --- |
| SGRF status | NOP | Hardware ready at check time |
| Firewall | NOP | Synchronous write |
| PLL lock | NOP | Already locked by calling code |
| BUS_GRF | NOP | Configuration, not status |
| PHY DfiStatus | KEEP | Active training wait |
| PHY CalBusy | KEEP | ZQ calibration in progress |
| PHY MicroReset | KEEP | Firmware startup |
| PHY UctWriteProt | KEEP | Training completion |
| PHY MicroContMux | KEEP | Firmware mailbox |
| Unknown | NOP | Prevent hangs (conservative) |
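
The policy above amounts to a lookup with a NOP default. The sketch below applies it to a byte image by overwriting the 4-byte instruction at each poll offset with the A64 NOP (0xD503201F). It assumes each poll is removed by replacing a single instruction, which may not match what patch_prod.py does; POLICY and apply_policy are hypothetical names.

```python
A64_NOP = bytes.fromhex("1f2003d5")  # 0xD503201F stored little-endian

POLICY = {
    "SGRF status": "NOP",
    "Firewall": "NOP",
    "PLL lock": "NOP",
    "BUS_GRF": "NOP",
    "PHY DfiStatus": "KEEP",
    "PHY CalBusy": "KEEP",
    "PHY MicroReset": "KEEP",
    "PHY UctWriteProt": "KEEP",
    "PHY MicroContMux": "KEEP",
}


def apply_policy(blob, polls):
    """Return (patched_bytes, nop_count) for polls given as (offset, poll_type).

    Poll types missing from POLICY are treated as "Unknown" and NOPed.
    """
    patched = bytearray(blob)
    nopped = 0
    for offset, poll_type in polls:
        if POLICY.get(poll_type, "NOP") == "NOP":
            patched[offset:offset + 4] = A64_NOP
            nopped += 1
    return bytes(patched), nopped
```

The input is copied before patching, so the original image stays available for diffing against the result.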

How to Use on Real Hardware

WARNING: Flashing a patched DDR blob can brick your board. Recovery requires maskrom mode. Only proceed if you understand the risks.

# 1. Backup current blob
dd if=/dev/mmcblk0 of=backup_idb.bin bs=512 count=8192

# 2. Patch
python3 patch_prod.py original_blob.bin patched_blob.bin

# 3. Flash (use rkdeveloptool in maskrom, or rkddr tool)
# See https://github.com/hbiyik/rkddr for safe in-place patching