test0r 46155bbe91 simulation: tripwire + PC-bucketed diff + bitflip sweep
Ship the new simulation & verification stack under simulation/:

- mmio_regions.py — address → region classifier (DDRCTL, DDRPHY,
  OTP, SRAM, …). Shared by every other tool so trace output is
  scannable without memorising the memory map.
- sim_tripwire.py — Bin-style per-access capture. Records
  (seq, insn_tick, pc, addr, size, rw, val, region, fn_name) per
  MMIO access. PCResolver bisects the vendor funs table parsed
  from ddr_conservative_asm.s.
- tripwire_diff.py — PC-bucketed difflib.SequenceMatcher diff of
  two tripwire CSVs. Buckets by fn_name so bitflip-induced control
  flow divergence doesn't cascade noise.
- training_sim.py — DDR training simulator with --mode pass and
  --mode bitflip (flip first N reads per training status, exercise
  retry paths). BITFLIP_ONLY env var narrows to a single addr for
  the sweep.
- bitflip_sweep.py — Flip each of 23 training-status addresses
  one-at-a-time and tabulate retry convergence. Surfaces which
  function(s) react to a transient fault by writing different
  downstream register values.

Plus:

- mmio_diff.py updated: region-tagged divergence output,
  --show-regions histogram, --tripwire-out-{vendor,rebuilt} CSV
  capture, --capture-stack-writes for stack-allocated buffer diffs.
- debug_probes/tp_slot_{probe,writes}.py — ad-hoc Unicorn probes
  for chasing a single-slot divergence in an SRAM buffer. Kept as
  reference examples of how to extend the tripwire toolchain.

The stack found 6 silicon-hostile bugs in the rebuilt blob that
mmio_diff's write-sequence gate was structurally blind to, including
three ld-unresolved-symbol NULL derefs (case-mismatched externs,
missing DATA_SYMS) and one C-early-return-skips-shared-tail bug
where vendor's asm fell through to the tail via `b` after a `ret`.
2026-04-22 05:55:28 +02:00

# RK3588 DDR TPL — Simulation & Verification Stack
A set of Unicorn-based tools for pre-silicon simulation, behavioral
diffing, and fault-injection of Rockchip RK3588 DDR TPL blobs (vendor
or rebuilt).
Built to hunt silicon-corruption bugs that `mmio_diff.py`'s
write-sequence comparison cannot see — NULL derefs, read-side
divergences, retry-path diffs.
## Synopsis
| Tool | One-line |
|---|---|
| `mmio_regions.py` | Address → region classifier (`DDRCTL`, `DDRPHY`, `OTP`, `SRAM`, …) |
| `sim_tripwire.py` | Bin-style per-access capture (PC, tick, addr, region, resolved fn name) |
| `tripwire_diff.py` | PC-bucketed `SequenceMatcher` diff of two tripwire CSVs |
| `training_sim.py` | DDR-training simulator with `pass` and `bitflip` modes |
| `bitflip_sweep.py` | Flip each training-status address one at a time, report retry convergence |
The simulator **DOES NOT** need silicon. It runs vendor or rebuilt TPL
blobs under Unicorn with an MMIO stub that returns "pass" values for
all training-status polls, captures every access, and lets you diff
runs behaviorally.
## Quick start
Assuming your TPL blob is at `../rk3588_ddr_v1.19_prod.bin` (a copy of
the vendor blob shipped at SPI offset `0x8000` on boards with RKBIN
v1.19) and the rebuilt blob at `/tmp/rebuilt.bin`:
```bash
# Run once in "pass" mode and capture tripwire to CSV
python3 training_sim.py ../rk3588_ddr_v1.19_prod.bin \
--mode pass --tripwire-out /tmp/tw-pass.csv
# Run again with the first read of every training status flipped
python3 training_sim.py ../rk3588_ddr_v1.19_prod.bin \
--mode bitflip --flip-count 1 --flip-mask 0xFFFFFFFF \
--tripwire-out /tmp/tw-flip.csv
# Diff the two runs by function bucket
python3 tripwire_diff.py /tmp/tw-pass.csv /tmp/tw-flip.csv
# Sweep every training-status address one-at-a-time and tabulate
# whether the retry loop reconverges cleanly
python3 bitflip_sweep.py ../rk3588_ddr_v1.19_prod.bin
```
For vendor-vs-rebuilt verification (needs `../mmio_diff.py` in the
parent dir):
```bash
python3 ../mmio_diff.py --ignore-pc \
../rk3588_ddr_v1.19_prod.bin /tmp/rebuilt.bin \
--tripwire-out-vendor /tmp/tw-v.csv \
--tripwire-out-rebuilt /tmp/tw-r.csv \
--show-regions
python3 tripwire_diff.py /tmp/tw-v.csv /tmp/tw-r.csv
```
## Architecture
### `mmio_regions.py` — address classifier
Pure lookup table. `classify(addr)` returns a short tag for each
RK3588 peripheral window. Used by every other tool so trace output is
scannable without memorising the memory map.
Region tags: `DDRCTL`, `DDRCTL:SW` (STAT/PWRCTL/SWCTL/SWSTAT),
`DDRCTL:MR` (mode-register ops), `DDRPHY`, `DDRPHY:TR` (training
status offsets `0x080/090/0B4/3CC/514/684/A24`), `DDR_CRU`, `DDR_MEM`,
`SRAM`, `PMU_SRAM`, `GRF`, `BUS_GRF`, `SGRF`, `CRU`, `SCRU`, `PMU`,
`FW_DDR`, `OTP`, `UART`, `STACK`, `OTHER`.
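
A minimal sketch of how such a range classifier can be written with `bisect`. Only the `STACK` window (`0x00400000..0x00500000`) is taken from this README; the `SRAM` bounds are an assumption chosen so that the `tp` buffer address mentioned under Known limitations falls inside it. The real table in `mmio_regions.py` may be laid out differently.

```python
from bisect import bisect_right

# (start, end_exclusive, tag), sorted by start address.
# STACK bounds come from this README; SRAM bounds are ASSUMED for illustration.
REGIONS = sorted([
    (0x00400000, 0x00500000, "STACK"),
    (0xFF000000, 0xFF100000, "SRAM"),
])
_STARTS = [start for start, _end, _tag in REGIONS]


def classify(addr: int) -> str:
    """Return the tag of the window containing addr, or OTHER."""
    i = bisect_right(_STARTS, addr) - 1  # last window starting at or below addr
    if i >= 0:
        _start, end, tag = REGIONS[i]
        if addr < end:
            return tag
    return "OTHER"


if __name__ == "__main__":
    for a in (0x00400010, 0xFF0164F8, 0x00001000):
        print(f"{a:#010x} -> {classify(a)}")
```

Keeping the table sorted by start address makes each lookup a binary search, which matters when every MMIO access in a long trace is classified.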
### `sim_tripwire.py` — per-access capture
`Capture` class with `rd(pc, addr, size, val, tick)` and `wr(...)`
that record one row per access:

    (seq_idx, insn_tick, pc, addr, size, rw, val, region, fn_name)

`fn_name` comes from `PCResolver`, which bisects the vendor funs
table parsed from `../ddr_conservative_asm.s` (115 `FUN_xxxx @ offset`
headers; Ghidra export). Set `RK_DDR_ASM` env var to override the
default asm path.
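
The lookup itself can be sketched as follows. `PCResolverSketch` is a hypothetical stand-in, not the class in `sim_tripwire.py`; it takes ready-made `(entry_address, name)` pairs instead of parsing the asm file, since the exact header format is not reproduced here.

```python
from bisect import bisect_right


class PCResolverSketch:
    """Map a PC to the function whose entry address is the largest one <= PC."""

    def __init__(self, funs):
        # funs: iterable of (entry_address, name) pairs (hypothetical input shape)
        table = sorted(funs)
        self._entries = [addr for addr, _name in table]
        self._names = [name for _addr, name in table]

    def resolve(self, pc):
        i = bisect_right(self._entries, pc) - 1
        if i < 0:
            return None  # PC lies below the first known function entry
        return self._names[i]


if __name__ == "__main__":
    r = PCResolverSketch([(0x1000, "FUN_00001000"), (0x1040, "FUN_00001040")])
    print(r.resolve(0x103C))  # attributed to FUN_00001000
    print(r.resolve(0x1040))  # attributed to FUN_00001040
```

Note that this scheme never checks a function's end address, so any PC past the last entry is attributed to the last function.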
`emit_csv(path)` writes the capture to CSV; `load_csv(path)` reloads it. Both
`training_sim.py` and `mmio_diff.py` (in parent dir) accept a
tripwire capture object and record into it.
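
For tool authors, a rough sketch of a capture object with a CSV round-trip. `CaptureSketch`, the column names, and the choice to store `pc`/`addr`/`val` as hex strings are assumptions for illustration; the real `Capture` class may name and encode its columns differently.

```python
import csv

FIELDS = ("seq", "tick", "pc", "addr", "size", "rw", "val", "region", "fn_name")
INT_FIELDS = ("seq", "tick", "pc", "addr", "size", "val")
HEX_FIELDS = ("pc", "addr", "val")  # written as 0x... for readability


class CaptureSketch:
    def __init__(self, classify=lambda addr: "OTHER", resolve=lambda pc: "?"):
        self.classify = classify  # address -> region tag
        self.resolve = resolve    # pc -> function name
        self.rows = []

    def _record(self, rw, pc, addr, size, val, tick):
        self.rows.append({
            "seq": len(self.rows), "tick": tick, "pc": pc, "addr": addr,
            "size": size, "rw": rw, "val": val,
            "region": self.classify(addr), "fn_name": self.resolve(pc),
        })

    def rd(self, pc, addr, size, val, tick):
        self._record("rd", pc, addr, size, val, tick)

    def wr(self, pc, addr, size, val, tick):
        self._record("wr", pc, addr, size, val, tick)

    def emit_csv(self, path):
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            writer.writeheader()
            for row in self.rows:
                out = dict(row)
                for name in HEX_FIELDS:
                    out[name] = hex(row[name])
                writer.writerow(out)

    @classmethod
    def load_csv(cls, path):
        cap = cls()
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                for name in INT_FIELDS:
                    row[name] = int(row[name], 0)  # base 0 accepts 0x... and decimal
                cap.rows.append(row)
        return cap
```

Passing `classify` and `resolve` in as callables keeps the capture independent of the memory map and the function table, so the same object can serve both the simulator and the differ.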
### `tripwire_diff.py` — PC-bucketed diff
For each unique `fn_name` in either capture, collect records, key
them by `(region, addr, rw, val, size)`, and diff via
`difflib.SequenceMatcher`. `quick_ratio()` short-circuits buckets that
share almost nothing.
Outputs three tiers:
- **OK**: byte-identical key sequences (suppressed unless
`--show-identical`).
- **minor-diff**: ratio ≥ `--suspect-threshold` (default 0.9).
- **SUSPECT**: ratio below threshold, printed first with the raw
edit script.
Why PC-bucket and not index-by-index? Under bitflip mode the control
flow diverges at the flip point, which destroys index alignment.
Grouping by function localises divergences so one buggy bucket
doesn't cascade noise into unrelated ones.
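
The bucketing and tiering described above can be sketched like this. The function names and the dict-shaped records are assumptions, not the actual `tripwire_diff.py` internals; only `difflib.SequenceMatcher`, the key tuple, and the 0.9 default come from this README.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def diff_key(rec):
    # pc, seq and tick are deliberately left out of the key
    return (rec["region"], rec["addr"], rec["rw"], rec["val"], rec["size"])


def bucket(records):
    """Group diff keys by function name, preserving access order."""
    out = defaultdict(list)
    for rec in records:
        out[rec["fn_name"]].append(diff_key(rec))
    return out


def bucketed_diff(a_records, b_records, suspect_threshold=0.9):
    a, b = bucket(a_records), bucket(b_records)
    results = {}
    for fn in sorted(set(a) | set(b)):
        ka, kb = a.get(fn, []), b.get(fn, [])
        if ka == kb:
            results[fn] = ("OK", 1.0)
            continue
        sm = SequenceMatcher(None, ka, kb, autojunk=False)
        # quick_ratio() is an upper bound on ratio(), so a low value
        # settles the tier without running the full alignment.
        upper = sm.quick_ratio()
        if upper < suspect_threshold:
            results[fn] = ("SUSPECT", upper)
            continue
        ratio = sm.ratio()
        tier = "minor-diff" if ratio >= suspect_threshold else "SUSPECT"
        results[fn] = (tier, ratio)
    return results
```

For a `SUSPECT` bucket, `sm.get_opcodes()` on the same matcher yields the edit script that would be printed.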
### `training_sim.py` — DDR training simulator
Two modes:
- `--mode pass`: every training-status read returns its
  "done/OK/trained" stub value every time. Equivalent to `mmio_diff`'s
  base harness.
- `--mode bitflip --flip-count N --flip-mask MASK`: the first `N`
  reads of each training-status address return `stub_value ^ mask`
  (default mask `0xFFFFFFFF` → "not done"). Subsequent reads revert.
  Exercises the retry / error-recovery paths.
Training-status addresses are defined inside `is_training_status()`;
override single-address via the `BITFLIP_ONLY=0xADDR` env var
(used by `bitflip_sweep.py`).
Every run prints a region-tagged access histogram and a UART TX dump.
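
The bitflip read behavior can be modelled in a few lines. `BitflipStub` and its `only` parameter (standing in for `BITFLIP_ONLY`) are hypothetical names; the real logic lives in `training_sim.py` and is wired into the Unicorn MMIO hooks, which this sketch leaves out.

```python
from collections import Counter


class BitflipStub:
    """Return stub_value ^ mask for the first flip_count reads of each address."""

    def __init__(self, stub_values, flip_count=1, flip_mask=0xFFFFFFFF, only=None):
        self.stub_values = dict(stub_values)  # addr -> "pass" value
        self.flip_count = flip_count
        self.flip_mask = flip_mask
        self.only = only                      # restrict flipping to one address
        self.reads = Counter()                # per-address read counter

    def read(self, addr):
        value = self.stub_values[addr]
        if self.only is not None and addr != self.only:
            return value
        n = self.reads[addr]
        self.reads[addr] += 1
        if n < self.flip_count:
            return (value ^ self.flip_mask) & 0xFFFFFFFF
        return value


if __name__ == "__main__":
    # 0x1000 is an arbitrary placeholder address, not a real status register.
    stub = BitflipStub({0x1000: 0x1}, flip_count=2)
    print([hex(stub.read(0x1000)) for _ in range(3)])
```

Counting reads per address, not globally, is what lets every status register see its own transient fault in a single run.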
### `bitflip_sweep.py` — per-address retry convergence
Flips each training-status register one-at-a-time and summarises:
- how many records diverged from the pass-mode baseline
- whether any MMIO write value changed (= retry path took a
different branch)
- which function(s) wrote the divergent values
Output is a single table row per address. A zero in the
"write_divergence" column means the retry paths converge
deterministically. A non-zero count is reported together with the
function whose retry wrote a different register value. That is often
vendor-intended retry behavior and sometimes a port bug.
Currently sweeps 23 addresses: 7 DDRPHY training-status offsets plus
4 DDRCTL status registers on each of 4 channels (7 + 4 × 4).
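
One way to compute a write-divergence count is a multiset comparison of writes between the baseline and the flipped capture. This is an assumed definition for illustration; `bitflip_sweep.py` may count divergence differently, and the record layout here is hypothetical.

```python
from collections import Counter


def write_divergence(baseline, flipped):
    """Return (count, functions) for writes present in one run but not the other."""

    def writes(records):
        return Counter(
            (r["fn_name"], r["addr"], r["val"])
            for r in records
            if r["rw"] == "wr"
        )

    base, flip = writes(baseline), writes(flipped)
    # Counter subtraction keeps only positive counts, so the sum of both
    # directions is the symmetric difference of the two multisets.
    changed = (flip - base) + (base - flip)
    fns = sorted({fn for fn, _addr, _val in changed})
    return sum(changed.values()), fns
```

Under this definition a retry that merely re-reads a status register leaves the count at zero, while a retry that repeats or alters a register write shows up against the function that issued it.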
## Record shape + diff bucketing (for tool authors)
Per-access record fields:

    seq     monotonic index within the capture
    tick    Unicorn instruction count at the access
    pc      access-site PC (absolute)
    addr    MMIO/stack/SRAM address
    size    1/2/4/8
    rw      'rd' or 'wr'
    val     value read or written (hex)
    region  mmio_regions.classify(addr) tag
    fn      PCResolver result: FUN_xxxxxxxx from the funs table

Diff key inside each fn bucket: `(region, addr, rw, val, size)`.
The key explicitly excludes `pc` (codegen reg-alloc shifts individual
load/store PCs within a function without changing behavior), `seq`,
and `tick` (these drift with any upstream path difference).
## Known limitations
- The Unicorn simulator exits early on sustained same-PC loops
(>10 000 iterations) to avoid deadlocks. Real silicon polling that
would eventually succeed is modelled via the stub returning the
success value; if your use case needs a different success-delay
profile, edit `stub_value` / `is_training_status`.
- `sim_tripwire.PCResolver` attributes every PC to the *largest
FUN_-entry address ≤ PC*. Unported code paths still resolve to a
reasonable fn_name. Ports not in the `// ============ FUN_xxxx @`
convention won't match.
- `mmio_diff.py`'s `--capture-stack-writes` flag catches writes to
Unicorn's scratch stack `0x00400000..0x00500000` — but the vendor
firmware sometimes uses SRAM-resident scratch buffers (e.g. the
  `tp` timing buffer at `0xff0164f8`) instead of the call stack. For
  those, add a dedicated hook in the probe (see
  `../debug_probes/tp_slot_writes.py` for an example).
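
The early-exit guard from the first limitation can be approximated as a counter over consecutive accesses from the same PC. `SamePcGuard` is a hypothetical name, and counting per MMIO access (not per executed instruction) is an assumption about how the simulator detects a stuck polling loop.

```python
class SamePcGuard:
    """Signal when more than `limit` consecutive accesses share one PC."""

    def __init__(self, limit=10_000):
        self.limit = limit
        self.last_pc = None
        self.count = 0

    def step(self, pc) -> bool:
        """Record one access; return True when emulation should stop."""
        if pc == self.last_pc:
            self.count += 1
        else:
            self.last_pc = pc
            self.count = 1
        return self.count > self.limit


if __name__ == "__main__":
    guard = SamePcGuard(limit=3)
    print([guard.step(0x100) for _ in range(4)])  # trips on the fourth access
```

A hook would call `step()` on each access and stop emulation once it returns `True`.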
## Dependencies
- Python 3.8+
- Unicorn Engine Python bindings (the `unicorn` package on PyPI; AArch64 CPU emulator)
- `difflib` (stdlib)
```bash
pip install unicorn
```
## License
GPL-2.0-or-later, matching the port candidates' SPDX headers.