# RK3588 DDR TPL: Simulation & Verification Stack

A set of Unicorn-based tools for pre-silicon simulation, behavioral diffing, and fault injection of Rockchip RK3588 DDR TPL blobs (vendor or rebuilt). Built to hunt silicon-corruption bugs that `mmio_diff.py`'s write-sequence comparison cannot see: NULL dereferences, read-side divergences, and retry-path diffs.

## Synopsis

| Tool | Summary |
|---|---|
| `mmio_regions.py` | Address → region classifier (`DDRCTL`, `DDRPHY`, `OTP`, `SRAM`, …) |
| `sim_tripwire.py` | Bin-style per-access capture (PC, tick, addr, region, resolved fn name) |
| `tripwire_diff.py` | PC-bucketed `SequenceMatcher` diff of two tripwire CSVs |
| `training_sim.py` | DDR-training simulator with `pass` and `bitflip` modes (`bitflip` flips the first reads of each training-status address) |
| `bitflip_sweep.py` | Flips each training-status address one at a time and reports retry convergence |

The simulator **does not** need silicon. It runs vendor or rebuilt TPL blobs under Unicorn with an MMIO stub that returns "pass" values for all training-status polls, captures every access, and lets you diff runs behaviorally.
## Quick start

Assuming your TPL blob is at `../rk3588_ddr_v1.19_prod.bin` (a copy of the vendor blob shipped at SPI offset `0x8000` on boards with RKBIN v1.19) and the rebuilt blob is at `/tmp/rebuilt.bin`:

```bash
# Run once in "pass" mode and capture tripwire to CSV
python3 training_sim.py ../rk3588_ddr_v1.19_prod.bin \
    --mode pass --tripwire-out /tmp/tw-pass.csv

# Run again with the first read of every training status flipped
python3 training_sim.py ../rk3588_ddr_v1.19_prod.bin \
    --mode bitflip --flip-count 1 --flip-mask 0xFFFFFFFF \
    --tripwire-out /tmp/tw-flip.csv

# Diff the two runs by function bucket
python3 tripwire_diff.py /tmp/tw-pass.csv /tmp/tw-flip.csv

# Sweep every training-status address one-at-a-time and tabulate
# whether the retry loop reconverges cleanly
python3 bitflip_sweep.py ../rk3588_ddr_v1.19_prod.bin
```

For vendor-vs-rebuilt verification (needs `mmio_diff.py` in the parent dir):

```bash
python3 ../mmio_diff.py --ignore-pc \
    ../rk3588_ddr_v1.19_prod.bin /tmp/rebuilt.bin \
    --tripwire-out-vendor /tmp/tw-v.csv \
    --tripwire-out-rebuilt /tmp/tw-r.csv \
    --show-regions
python3 tripwire_diff.py /tmp/tw-v.csv /tmp/tw-r.csv
```

## Architecture

### `mmio_regions.py`: address classifier

Pure lookup table. `classify(addr)` returns a short tag for each RK3588 peripheral window. Used by every other tool so trace output is scannable without memorising the memory map.

Region tags: `DDRCTL`, `DDRCTL:SW` (STAT/PWRCTL/SWCTL/SWSTAT), `DDRCTL:MR` (mode-register ops), `DDRPHY`, `DDRPHY:TR` (training status offsets `0x080/090/0B4/3CC/514/684/A24`), `DDR_CRU`, `DDR_MEM`, `SRAM`, `PMU_SRAM`, `GRF`, `BUS_GRF`, `SGRF`, `CRU`, `SCRU`, `PMU`, `FW_DDR`, `OTP`, `UART`, `STACK`, `OTHER`.
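The lookup described above can be sketched as a sorted window table searched with `bisect`. This is only an illustration of the idea, not the contents of `mmio_regions.py`: the `STACK` range and the `DDRPHY:TR` offsets come from this README, while the `DDRPHY` and `OTP` bases and sizes, the half-open range convention, and the `WINDOWS` / `TRAINING_OFFSETS` names are placeholders chosen for the example.

```python
from bisect import bisect_right

# Training-status offsets as listed under the DDRPHY:TR tag in this README.
TRAINING_OFFSETS = {0x080, 0x090, 0x0B4, 0x3CC, 0x514, 0x684, 0xA24}

# (start, end_exclusive, tag), sorted by start address.
# Only the STACK range is taken from this README; the other two windows use
# PLACEHOLDER addresses and are NOT the real RK3588 memory map.
WINDOWS = [
    (0x0040_0000, 0x0050_0000, "STACK"),
    (0x1000_0000, 0x1001_0000, "DDRPHY"),  # hypothetical base and size
    (0x2000_0000, 0x2001_0000, "OTP"),     # hypothetical base and size
]
_STARTS = [start for start, _end, _tag in WINDOWS]


def classify(addr):
    """Return the region tag for addr, or 'OTHER' if no window contains it."""
    i = bisect_right(_STARTS, addr) - 1   # last window starting at or below addr
    if i < 0:
        return "OTHER"
    start, end, tag = WINDOWS[i]
    if addr >= end:
        return "OTHER"
    # Sub-tag PHY training-status registers by their offset within the window.
    if tag == "DDRPHY" and (addr - start) in TRAINING_OFFSETS:
        return "DDRPHY:TR"
    return tag
```

Keeping the table sorted makes each lookup a binary search, and the sub-tag check shows how one window can yield both a coarse tag and a finer one.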
### `sim_tripwire.py`: per-access capture

`Capture` class with `rd(pc, addr, size, val, tick)` and `wr(...)` that record one row per access:

    (seq_idx, insn_tick, pc, addr, size, rw, val, region, fn_name)

`fn_name` comes from `PCResolver`, which bisects the vendor funs table parsed from `../ddr_conservative_asm.s` (115 `FUN_xxxx @ offset` headers; Ghidra export). Set the `RK_DDR_ASM` env var to override the default asm path.

`emit_csv(path)` writes the capture out; `load_csv(path)` reloads it.

Both `training_sim.py` and `mmio_diff.py` (in the parent dir) accept a tripwire capture object and record into it.

### `tripwire_diff.py`: PC-bucketed diff

For each unique `fn_name` in either capture, collect its records, key them by `(region, addr, rw, val, size)`, and diff the two key sequences with `difflib.SequenceMatcher`. `quick_ratio()` short-circuits buckets that share almost nothing.

Outputs three tiers:

- **OK**: byte-identical key sequences (suppressed unless `--show-identical`).
- **minor-diff**: ratio ≥ `--suspect-threshold` (default 0.9).
- **SUSPECT**: ratio below threshold, printed first with the raw edit script.

Why PC-bucket and not index-by-index? Under bitflip mode the control flow diverges at the flip point, which destroys index alignment. Grouping by function localises divergences so one buggy bucket doesn't cascade noise into unrelated ones.

### `training_sim.py`: DDR training simulator

Two modes:

- `--mode pass`: every training-status read returns its "done/OK/trained" stub value every time. Equivalent to `mmio_diff.py`'s base harness.
- `--mode bitflip --flip-count N --flip-mask MASK`: the first `N` reads of each training-status address return `stub_value ^ mask` (default mask `0xFFFFFFFF` → "not done"). Subsequent reads revert to the stub value. Exercises the retry / error-recovery paths.

Training-status addresses are defined inside `is_training_status()`. To restrict flipping to a single address, set the `BITFLIP_ONLY=0xADDR` env var (used by `bitflip_sweep.py`).

Every run prints a region-tagged access histogram and a UART TX dump.
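The bitflip read behavior described above can be sketched as a small per-address read counter. This is an illustrative model under stated assumptions, not the code in `training_sim.py`: the `BitflipStub` class, its `read()` method, and the example address are hypothetical, and masking the result to 32 bits assumes 32-bit status registers.

```python
from collections import defaultdict


class BitflipStub:
    """Hypothetical model of the bitflip read stub (not the real implementation)."""

    def __init__(self, stub_values, flip_count=1, flip_mask=0xFFFFFFFF):
        self.stub_values = stub_values   # addr -> "pass" value for that register
        self.flip_count = flip_count     # number of initial reads to corrupt
        self.flip_mask = flip_mask       # XOR mask applied to those reads
        self.reads = defaultdict(int)    # addr -> reads seen so far

    def read(self, addr):
        value = self.stub_values[addr]
        seen = self.reads[addr]
        self.reads[addr] = seen + 1
        if seen < self.flip_count:
            # First N reads: return the flipped value (assumes 32-bit registers).
            return (value ^ self.flip_mask) & 0xFFFFFFFF
        # Later reads revert to the normal "pass" value.
        return value


# Usage: with flip_count=2 the third read of an address is the first clean one.
stub = BitflipStub({0x1000: 0x1}, flip_count=2)  # 0x1000 is a made-up address
observed = [stub.read(0x1000) for _ in range(3)]
```

Counting reads per address rather than globally is what lets each training-status register fail independently before it reverts.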
### `bitflip_sweep.py`: per-address retry convergence

Flips each training-status register one at a time and summarises:

- how many records diverged from the pass-mode baseline
- whether any MMIO write value changed (= retry path took a different branch)
- which function(s) wrote the divergent values

Output is a single table row per address. A clean (zero) "write_divergence" column means retry paths converge deterministically. A non-zero count names the function whose retry wrote a different register value, which is often vendor-intended retry behavior and sometimes a port bug.

Currently sweeps 23 addresses (7 DDRPHY training + 4 DDRCTL status × 4 channels).

## Record shape + diff bucketing (for tool authors)

Per-access record fields:

    seq     monotonic index within the capture
    tick    Unicorn instruction count at the access
    pc      access-site PC (absolute)
    addr    MMIO/stack/SRAM address
    size    1/2/4/8
    rw      'rd' or 'wr'
    val     value read or written (hex)
    region  mmio_regions.classify(addr) tag
    fn      PCResolver result: FUN_xxxxxxxx from the funs table

Diff key inside each fn bucket: `(region, addr, rw, val, size)`. The key explicitly excludes `pc` (codegen register allocation shifts individual load/store PCs within a function without changing behavior), `seq`, and `tick` (these drift with any upstream path difference).

## Known limitations

- The Unicorn simulator exits early on sustained same-PC loops (>10 000 iterations) to avoid deadlocks. Real silicon polling that would eventually succeed is modelled via the stub returning the success value; if your use case needs a different success-delay profile, edit `stub_value` / `is_training_status`.
- `sim_tripwire.PCResolver` attributes every PC to the *largest FUN_-entry address ≤ PC*. Unported code paths still resolve to a reasonable fn_name. Ports not in the `// ============ FUN_xxxx @` convention won't match.
- `mmio_diff.py`'s `--capture-stack-writes` flag catches writes to Unicorn's scratch stack `0x00400000..0x00500000`, but the vendor firmware sometimes uses SRAM-resident scratch buffers (e.g. the `tp` timing buffer at `0xff0164f8`) instead of the call stack. For those, add a dedicated hook in the probe (see `../debug_probes/tp_slot_writes.py` for an example).

## Dependencies

- Python 3.8+
- `unicorn-engine` (AArch64 CPU emulator)
- `difflib` (stdlib)

```bash
pip install unicorn
```

## License

GPL-2.0-or-later, matching the port candidates' SPDX headers.