test0r 46155bbe91 simulation: tripwire + PC-bucketed diff + bitflip sweep

Ship the new simulation & verification stack under simulation/:

- mmio_regions.py — address → region classifier (DDRCTL, DDRPHY,
  OTP, SRAM, …). Shared by every other tool so trace output is
  scannable without memorising the memory map.
- sim_tripwire.py — Bin-style per-access capture. Records
  (seq, insn_tick, pc, addr, size, rw, val, region, fn_name) per
  MMIO access. PCResolver bisects the vendor funs table parsed
  from ddr_conservative_asm.s.
- tripwire_diff.py — PC-bucketed difflib.SequenceMatcher diff of
  two tripwire CSVs. Buckets by fn_name so bitflip-induced control
  flow divergence doesn't cascade noise.
- training_sim.py — DDR training simulator with --mode pass and
  --mode bitflip (flip first N reads per training status, exercise
  retry paths). BITFLIP_ONLY env var narrows to a single addr for
  the sweep.
- bitflip_sweep.py — Flip each of 23 training-status addresses
  one-at-a-time and tabulate retry convergence. Surfaces which
  function(s) react to a transient fault by writing different
  downstream register values.

Plus:

- mmio_diff.py updated: region-tagged divergence output,
  --show-regions histogram, --tripwire-out-{vendor,rebuilt} CSV
  capture, --capture-stack-writes for stack-allocated buffer diffs.
- debug_probes/tp_slot_{probe,writes}.py — ad-hoc Unicorn probes
  for chasing a single-slot divergence in an SRAM buffer. Kept as
  reference examples of how to extend the tripwire toolchain.

The stack found 6 silicon-hostile bugs in the rebuilt blob that
mmio_diff's write-sequence gate was structurally blind to, including
three ld-unresolved-symbol NULL derefs (case-mismatched externs,
missing DATA_SYMS) and one C-early-return-skips-shared-tail bug
where vendor's asm fell through to the tail via `b` after a `ret`.

2026-04-22 05:55:28 +02:00

7.3 KiB


RK3588 DDR TPL — Simulation & Verification Stack

A set of Unicorn-based tools for pre-silicon simulation, behavioral diffing, and fault-injection of Rockchip RK3588 DDR TPL blobs (vendor or rebuilt).

Built to hunt silicon-corruption bugs that mmio_diff.py's write-sequence comparison cannot see — NULL derefs, read-side divergences, retry-path diffs.

Synopsis

| Tool | One-line summary |
| --- | --- |
| `mmio_regions.py` | Address → region classifier (`DDRCTL`, `DDRPHY`, `OTP`, `SRAM`, …) |
| `sim_tripwire.py` | Bin-style per-access capture (PC, tick, addr, region, resolved fn name) |
| `tripwire_diff.py` | PC-bucketed `SequenceMatcher` diff of two tripwire CSVs |
| `training_sim.py` | DDR-training simulator with `pass` and `bitflip` modes |
| `bitflip_sweep.py` | Flip each training-status address one at a time, report retry convergence |

The simulator does not need silicon. It runs vendor or rebuilt TPL blobs under Unicorn with an MMIO stub that returns "pass" values for all training-status polls, captures every access, and lets you diff runs behaviorally.

Quick start

Assuming your TPL blob is at ../rk3588_ddr_v1.19_prod.bin (a copy of the vendor blob shipped at SPI offset 0x8000 on boards with RKBIN v1.19) and the rebuilt blob at /tmp/rebuilt.bin:

# Run once in "pass" mode and capture tripwire to CSV
python3 training_sim.py ../rk3588_ddr_v1.19_prod.bin \
        --mode pass --tripwire-out /tmp/tw-pass.csv

# Run again with the first read of every training status flipped
python3 training_sim.py ../rk3588_ddr_v1.19_prod.bin \
        --mode bitflip --flip-count 1 --flip-mask 0xFFFFFFFF \
        --tripwire-out /tmp/tw-flip.csv

# Diff the two runs by function bucket
python3 tripwire_diff.py /tmp/tw-pass.csv /tmp/tw-flip.csv

# Sweep every training-status address one-at-a-time and tabulate
# whether the retry loop reconverges cleanly
python3 bitflip_sweep.py ../rk3588_ddr_v1.19_prod.bin

For vendor-vs-rebuilt verification (needs ../mmio_diff.py in the parent dir):

python3 ../mmio_diff.py --ignore-pc \
        ../rk3588_ddr_v1.19_prod.bin /tmp/rebuilt.bin \
        --tripwire-out-vendor  /tmp/tw-v.csv \
        --tripwire-out-rebuilt /tmp/tw-r.csv \
        --show-regions

python3 tripwire_diff.py /tmp/tw-v.csv /tmp/tw-r.csv

Architecture

`mmio_regions.py` — address classifier

Pure lookup table. classify(addr) returns a short tag for each RK3588 peripheral window. Used by every other tool so trace output is scannable without memorising the memory map.

Region tags: DDRCTL, DDRCTL:SW (STAT/PWRCTL/SWCTL/SWSTAT), DDRCTL:MR (mode-register ops), DDRPHY, DDRPHY:TR (training status offsets 0x080/090/0B4/3CC/514/684/A24), DDR_CRU, DDR_MEM, SRAM, PMU_SRAM, GRF, BUS_GRF, SGRF, CRU, SCRU, PMU, FW_DDR, OTP, UART, STACK, OTHER.
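A minimal sketch of what such a lookup table could look like, assuming sorted, non-overlapping windows searched with `bisect`. Only the `STACK` window (0x00400000..0x00500000) is taken from this README; the `SRAM` bounds are placeholders chosen to contain the tp timing buffer at 0xff0164f8, and the real table in `mmio_regions.py` may be organised differently.

```python
from bisect import bisect_right

# (start, end_exclusive, tag), sorted by start address.
# STACK matches the Unicorn scratch stack range quoted in this README.
# The SRAM bounds are HYPOTHETICAL placeholders, not the real memory map.
_REGIONS = sorted([
    (0x00400000, 0x00500000, "STACK"),
    (0xFF000000, 0xFF100000, "SRAM"),
])
_STARTS = [start for start, _end, _tag in _REGIONS]


def classify(addr: int) -> str:
    """Return the tag of the window containing addr, or OTHER."""
    i = bisect_right(_STARTS, addr) - 1  # last window starting at or below addr
    if i >= 0:
        _start, end, tag = _REGIONS[i]
        if addr < end:
            return tag
    return "OTHER"
```

Sub-tags such as `DDRPHY:TR` would need a second, offset-based check inside the matched window; that step is left out here.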

`sim_tripwire.py` — per-access capture

Capture class with rd(pc, addr, size, val, tick) and wr(...) that record one row per access:

(seq_idx, insn_tick, pc, addr, size, rw, val, region, fn_name)

fn_name comes from PCResolver, which bisects the vendor funs table parsed from ../ddr_conservative_asm.s (115 FUN_xxxx @ offset headers; Ghidra export). Set RK_DDR_ASM env var to override the default asm path.
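The bisect lookup can be sketched as follows. The header regex is an assumption based on the `// ============ FUN_xxxx @` convention mentioned under Known limitations; the exact header text in the Ghidra export, and the `base` parameter, are hypothetical.

```python
import re
from bisect import bisect_right

# ASSUMED header shape: "// ============ FUN_00001234 @ 0x1234".
_HEADER = re.compile(r"^//\s*=+\s*(FUN_[0-9A-Fa-f]+)\s*@\s*(0x[0-9A-Fa-f]+)")


class PCResolver:
    """Map a PC to the function whose entry is the largest address <= PC."""

    def __init__(self, asm_text: str, base: int = 0):
        entries = []
        for line in asm_text.splitlines():
            m = _HEADER.match(line)
            if m:
                entries.append((base + int(m.group(2), 16), m.group(1)))
        entries.sort()
        self._addrs = [addr for addr, _name in entries]
        self._names = [name for _addr, name in entries]

    def resolve(self, pc: int) -> str:
        i = bisect_right(self._addrs, pc) - 1
        # PCs below the first entry have no owning function.
        return self._names[i] if i >= 0 else "?"
```

Because the lookup only knows entry addresses, a PC that falls in padding or data after a function is still attributed to that function.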

emit_csv(path) writes the capture to a CSV file; load_csv(path) re-hydrates it. Both training_sim.py and mmio_diff.py (in the parent dir) accept a tripwire capture object and record into it.
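A rough sketch of the capture object and its CSV round trip, assuming the row order given above. The constructor arguments, the CSV column names, and `load_csv` being a module-level function are illustrative guesses, not the actual `sim_tripwire.py` interface.

```python
import csv

# Column names are ASSUMED; the row order follows the tuple shown above.
FIELDS = ("seq", "tick", "pc", "addr", "size", "rw", "val", "region", "fn")


class Capture:
    def __init__(self, classify, resolve):
        self.rows = []
        self._classify = classify  # addr -> region tag
        self._resolve = resolve    # pc -> fn_name

    def _add(self, rw, pc, addr, size, val, tick):
        self.rows.append((len(self.rows), tick, pc, addr, size, rw, val,
                          self._classify(addr), self._resolve(pc)))

    def rd(self, pc, addr, size, val, tick):
        self._add("rd", pc, addr, size, val, tick)

    def wr(self, pc, addr, size, val, tick):
        self._add("wr", pc, addr, size, val, tick)

    def emit_csv(self, path):
        with open(path, "w", newline="") as f:
            w = csv.writer(f)
            w.writerow(FIELDS)
            for seq, tick, pc, addr, size, rw, val, region, fn in self.rows:
                # Addresses and values are written as hex for readability.
                w.writerow([seq, tick, hex(pc), hex(addr), size, rw,
                            hex(val), region, fn])


def load_csv(path):
    """Re-hydrate rows written by Capture.emit_csv."""
    with open(path, newline="") as f:
        return [(int(r["seq"]), int(r["tick"]), int(r["pc"], 16),
                 int(r["addr"], 16), int(r["size"]), r["rw"],
                 int(r["val"], 16), r["region"], r["fn"])
                for r in csv.DictReader(f)]
```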

`tripwire_diff.py` — PC-bucketed diff

For each unique fn_name in either capture, collect its records, key them by (region, addr, rw, val, size), and diff the two key sequences with difflib.SequenceMatcher. quick_ratio() short-circuits buckets that share almost nothing.

Outputs three tiers:

OK: byte-identical key sequences (suppressed unless --show-identical).
minor-diff: ratio ≥ --suspect-threshold (default 0.9).
SUSPECT: ratio below threshold, printed first with the raw edit script.

Why PC-bucket and not index-by-index? Under bitflip mode the control flow diverges at the flip point, which destroys index alignment. Grouping by function localises divergences so one buggy bucket doesn't cascade noise into unrelated ones.
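The bucketing and tiering described in this section can be sketched as below. Records are modelled as plain dicts, and the `floor` cut-off used for the `quick_ratio()` short-circuit is a made-up parameter; the real tool's record type and short-circuit rule may differ.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def diff_key(rec):
    # pc, seq and tick are deliberately left out of the key.
    return (rec["region"], rec["addr"], rec["rw"], rec["val"], rec["size"])


def bucket(records):
    out = defaultdict(list)
    for rec in records:
        out[rec["fn"]].append(diff_key(rec))
    return out


def classify_buckets(a_records, b_records, suspect_threshold=0.9, floor=0.1):
    """Return {fn_name: (tier, ratio)} for every function seen in either run."""
    a, b = bucket(a_records), bucket(b_records)
    result = {}
    for fn in sorted(set(a) | set(b)):
        ka, kb = a.get(fn, []), b.get(fn, [])
        if ka == kb:
            result[fn] = ("OK", 1.0)
            continue
        sm = SequenceMatcher(None, ka, kb, autojunk=False)
        upper = sm.quick_ratio()  # cheap upper bound on ratio()
        # HYPOTHETICAL short-circuit: skip the expensive ratio() when the
        # buckets clearly share almost nothing.
        ratio = upper if upper < floor else sm.ratio()
        tier = "minor-diff" if ratio >= suspect_threshold else "SUSPECT"
        result[fn] = (tier, ratio)
    return result
```

Two records that differ only in pc land on the same key, so a rebuilt function whose loads moved by a few instructions still compares as OK.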

`training_sim.py` — DDR training simulator

Two modes:

--mode pass: every training-status read returns its "done/OK/trained" stub value every time. Equivalent to mmio_diff's base harness.
--mode bitflip --flip-count N --flip-mask MASK: the first N reads of each training-status address return stub_value ^ mask (default mask 0xFFFFFFFF → "not done"). Subsequent reads revert to the stub value. Exercises the retry / error-recovery paths.

Training-status addresses are defined inside is_training_status(); override single-address via the BITFLIP_ONLY=0xADDR env var (used by bitflip_sweep.py).
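As an illustration of the bitflip semantics, here is a small read stub that flips the first N reads per address and honours a single-address override. `BitflipStub` and `only_from_env` are hypothetical names; `stub_value` and `is_training_status` are passed in as callables rather than imported, since their real signatures are not shown here.

```python
import os
from collections import defaultdict


def only_from_env(env=os.environ):
    """Parse BITFLIP_ONLY=0xADDR; return None when the variable is unset."""
    raw = env.get("BITFLIP_ONLY")
    return int(raw, 16) if raw else None


class BitflipStub:
    def __init__(self, stub_value, is_training_status,
                 flip_count=1, flip_mask=0xFFFFFFFF, only=None):
        self._stub_value = stub_value
        self._is_status = is_training_status
        self._flip_count = flip_count
        self._flip_mask = flip_mask
        self._only = only
        self._reads = defaultdict(int)  # per-address read counter

    def read(self, addr):
        val = self._stub_value(addr)
        if not self._is_status(addr):
            return val
        if self._only is not None and addr != self._only:
            return val
        n = self._reads[addr]
        self._reads[addr] = n + 1
        if n < self._flip_count:
            # Assumes 32-bit registers, hence the mask on the result.
            return (val ^ self._flip_mask) & 0xFFFFFFFF
        return val
```

Counting reads per address, rather than globally, is what makes "the first N reads of each training-status address" independent of the order in which the firmware polls them.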

Every run prints a region-tagged access histogram and a UART TX dump.

`bitflip_sweep.py` — per-address retry convergence

Flips each training-status register one-at-a-time and summarises:

how many records diverged from the pass-mode baseline
whether any MMIO write value changed (= retry path took a different branch)
which function(s) wrote the divergent values

Output is a single table row per address. A clean "write_divergence" column means retry paths converge deterministically. A non-zero count is reported together with the function(s) whose retry wrote a different register value; this is often vendor-intended retry behavior and sometimes a port bug.

Currently sweeps 23 addresses: 7 DDRPHY training-status offsets plus 4 DDRCTL status registers × 4 channels (7 + 16).
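One way the per-address summary could be computed from a baseline capture and a flipped capture is sketched below. The definitions are assumptions: "diverged" is taken as the multiset difference of (fn, addr, rw, val) records, and "write_divergence" as the number of distinct write triples that appear only in the flipped run. `summarise` is a hypothetical helper, not a function from `bitflip_sweep.py`.

```python
from collections import Counter


def summarise(baseline, flipped):
    """Compare two lists of record dicts with keys fn, addr, rw, val."""
    def counts(records):
        return Counter((r["fn"], r["addr"], r["rw"], r["val"]) for r in records)

    def writes(records):
        return {(r["fn"], r["addr"], r["val"])
                for r in records if r["rw"] == "wr"}

    base_c, flip_c = counts(baseline), counts(flipped)
    # Records present in one run but not the other, counted with multiplicity.
    delta = (base_c - flip_c) + (flip_c - base_c)
    # Write triples the baseline never produced.
    new_writes = writes(flipped) - writes(baseline)
    return {
        "diverged": sum(delta.values()),
        "write_divergence": len(new_writes),
        "functions": sorted({fn for fn, _addr, _val in new_writes}),
    }
```

Under these definitions, a retry that only re-reads a status register raises "diverged" but leaves "write_divergence" at zero.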

Record shape + diff bucketing (for tool authors)

Per-access record fields:

seq     monotonic index within the capture
tick    Unicorn instruction count at the access
pc      access-site PC (absolute)
addr    MMIO/stack/SRAM address
size    1/2/4/8
rw      'rd' or 'wr'
val     value read or written (hex)
region  mmio_regions.classify(addr) tag
fn      PCResolver result: FUN_xxxxxxxx from the funs table

Diff key inside each fn bucket: (region, addr, rw, val, size). The key explicitly excludes pc (codegen register allocation shifts individual load/store PCs within a function without changing behavior) as well as seq and tick (both drift with any upstream path difference).

Known limitations

The Unicorn simulator exits early on sustained same-PC loops (>10 000 iterations) so that a poll that never succeeds cannot hang the run. Real silicon polling that would eventually succeed is modelled via the stub returning the success value; if your use case needs a different success-delay profile, edit stub_value / is_training_status.
sim_tripwire.PCResolver attributes every PC to the largest FUN_-entry address ≤ PC. Unported code paths still resolve to a reasonable fn_name. Ports not in the // ============ FUN_xxxx @ convention won't match.
mmio_diff.py's --capture-stack-writes flag catches writes to Unicorn's scratch stack 0x00400000..0x00500000, but the vendor firmware sometimes uses SRAM-resident scratch buffers (e.g. the tp timing buffer at 0xff0164f8) instead of the call stack. For those, add a dedicated hook in the probe (see ../debug_probes/tp_slot_writes.py for an example).
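For the first limitation, the same-PC early exit could be implemented with a small counter like the one below, called from a memory or code hook. This is a sketch under the assumption that "same-PC loop" means consecutive hook hits from one PC; `SamePcGuard` is a hypothetical name and the simulator's actual detection logic may differ.

```python
class SamePcGuard:
    """Detect a sustained run of hook hits coming from a single PC."""

    def __init__(self, limit=10_000):
        self.limit = limit
        self._last_pc = None
        self._run = 0

    def hit(self, pc):
        """Record one hit; return True once the run exceeds the limit."""
        if pc == self._last_pc:
            self._run += 1
        else:
            self._last_pc = pc
            self._run = 1
        return self._run > self.limit
```

Inside a Unicorn hook callback, a True result would typically be followed by a call to `uc.emu_stop()` to end emulation.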

Dependencies

Python 3.8+
unicorn-engine (AArch64 CPU emulator)
difflib (stdlib)

pip install unicorn

License

GPL-2.0-or-later, matching the port candidates' SPDX headers.
