rk3588-ddr-analysis/benchmark/gdb_debug/README.md
marfrit 00d655187a benchmark/: three-way RE-tool comparison + first real C-lift
Three small functions extracted from the v1.19 conservative blob with
ground-truth C and per-tool (Ghidra / retdec / decomp.me) docs:
  01_memset        — byte memset, 28 B
  02_memcpy32      — word-aligned memcpy, 36 B
  03_magic_memset  — magic check + tail-call to memset, 40 B
  04_train_phy_block — first real poll-site function (104 B, 26 insts),
                       contains poll sites 12-15

Results in RESULTS.md:
  - Ghidra: A on all four. Auto-decompile is close to final.
  - retdec: A on #3, F on #1 and #2 (no register-arg inference on raw),
    C on #4 (mistakes & 0xF0000000 for < 0x10000000).

GRIND_LOG.md (in 04_train_phy_block/) records the matching-decomp
iteration: 116-byte candidate.c at -Os vs vendor 104 bytes = 89.7%
size match on first real iteration. Remaining gap is GCC's choice of
`cmp w, w_const; b.ls` over vendor's `tst w, #imm; b.eq` for the
mask tests.

gdb_debug/ holds a native-aarch64 GDB single-stepper for the three
benchmark functions — boltzmann smoke test passed (memset:
buf[10] 0x00→0xab).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 07:26:23 +02:00


# gdb_debug — single-step the benchmark functions under GDB
Wraps each of `01_memset` / `02_memcpy32` / `03_magic_memset` in a
C harness, copies the raw bytes into an RWX buffer, and calls through
a function pointer. GDB attached to the harness lets you step every
machine instruction of the real blob code — **no QEMU needed because
boltzmann (and ampere, ohm, hertz) are natively aarch64.**
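
The copy-and-call step described above can be sketched roughly as follows, assuming a Linux host and GCC or Clang. `load_blob`, `run_stub`, and `stub_code` are hypothetical names, and the two-instruction stub (`mov w0, #42 ; ret`) is a stand-in for illustration, not bytes from the vendor blob; the real harness in this directory may be organised differently.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Copy raw machine code into a fresh RWX mapping; returns NULL on failure. */
static void *load_blob(const void *code, size_t len)
{
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;
    memcpy(buf, code, len);
    /* GCC/Clang builtin: make the copied bytes visible to instruction
     * fetch. Required on aarch64, effectively a no-op on x86. */
    __builtin___clear_cache((char *)buf, (char *)buf + len);
    return buf;
}

/* Hypothetical stand-in blob: `mov w0, #42 ; ret` in A64 encoding. */
static const uint32_t stub_code[] = { 0x52800540u, 0xd65f03c0u };

/* Call the copied bytes through a function pointer (aarch64 hosts only). */
static int run_stub(void)
{
    void *buf = load_blob(stub_code, sizeof stub_code);
    if (buf == NULL)
        return -1;
#if defined(__aarch64__)
    int (*fn)(void) = (int (*)(void))buf;
    return fn();
#else
    return -1; /* A64 bytes cannot execute natively on this host */
#endif
}
```

On an aarch64 host, a breakpoint on `run_stub` followed by `stepi` walks from the indirect call into the copied bytes, which is the same pattern the harness relies on for the real functions.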
## Build
```
make # builds ./gdb_debug.elf natively on aarch64
```
Cross-build recipe (if you ever want to run on x86 oppenheimer via
qemu-user) lives in the Makefile; replace `gcc` with
`aarch64-linux-gnu-gcc` and `ld` with `aarch64-linux-gnu-ld`, and launch
under `qemu-aarch64-static -g 1234 ./gdb_debug.elf 1` with
`gdb-multiarch` attaching to `:1234`.
## Run under GDB
```
gdb ./gdb_debug.elf
(gdb) set pagination off
(gdb) layout split # TUI: source + asm panes (use `layout regs` for a register pane)
(gdb) break call_func # the dispatcher — one breakpoint catches all three
(gdb) run 1 # 1=memset 2=memcpy32 3=magic_memset
(gdb) stepi # one machine instruction
(gdb) info reg # full register dump
(gdb) x/8i $pc # peek 8 upcoming instructions
(gdb) display/i $pc # auto-show next instruction on every stop
(gdb) x/16bx $x0 # hex-dump 16 bytes from what X0 points at
```
## What to look for
### Function 1 (memset)
After `MOV X3, #0`, each iteration runs `CMP X2, X3` → `B.NE` → `STRB W1, [X0, X3]` →
`ADD X3, X3, #1` → back. Watch `$x3` advance and inspect `x/16bx $x0` to see
the buffer filling with `0xAB`.
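
For orientation while stepping, the loop above reads roughly like the C below. This is a hand-written sketch inferred from the instruction sequence, not the ground-truth C stored with the benchmark; the name `blob_memset` and the choice to return `dst` (X0 is never modified in the loop) are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C reading of the byte-fill loop. */
static void *blob_memset(void *dst, int c, size_t n)
{
    uint8_t *p = dst;
    for (size_t i = 0; i != n; i++)   /* MOV X3, #0 ; CMP X2, X3 ; B.NE */
        p[i] = (uint8_t)c;            /* STRB W1, [X0, X3] ; ADD X3, X3, #1 */
    return dst;                       /* X0 is left untouched */
}
```

With `c = 0xAB`, the index `i` plays the role of `$x3`, so each `stepi` round trip through the loop corresponds to one more `0xAB` byte in the `x/16bx $x0` dump.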
### Function 2 (memcpy32)
First instruction is the alignment mask: `AND X2, X2, #0xfffffffc`.
Set a watchpoint on `$x2` (`watch $x2`) to catch the mask, then step the loop to watch
4-byte transfers: `LDR W4, [X1, X3]` ; `STR W4, [X0, X3]` ; `ADD X3, X3, #4`.
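
A rough C equivalent of the mask-then-copy sequence, useful for predicting what the loop should do before stepping it. This is a sketch inferred from the instructions quoted above; the name `blob_memcpy32` and the exact loop condition are assumptions, and the ground-truth C in the benchmark directory may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C reading of the word-wise copy. */
static void blob_memcpy32(void *dst, const void *src, size_t n)
{
    n &= 0xfffffffcu;                            /* AND X2, X2, #0xfffffffc */
    for (size_t i = 0; i != n; i += 4) {         /* ADD X3, X3, #4 */
        uint32_t w;
        memcpy(&w, (const uint8_t *)src + i, 4); /* LDR W4, [X1, X3] */
        memcpy((uint8_t *)dst + i, &w, 4);       /* STR W4, [X0, X3] */
    }
}
```

Because the mask rounds the length down to a multiple of 4, a request for 7 bytes copies only the first 4. That makes an odd length a convenient test value: `$x2` visibly changes when the `AND` executes.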
### Function 3 (magic_memset)
Will **SIGSEGV** on `LDR W2, [X0, #4]` because `X0 = 0x1fe000` is unmapped
in user mode. That crash **is** the verification — it proves the function
really does target that absolute address. To execute the full path, add
before `call_func`:
```c
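/* Needs <stdint.h>, <stdio.h>, <stdlib.h>, and <sys/mman.h>. */
void *page = mmap((void *)0x1fe000, 4096, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
if (page == MAP_FAILED) { perror("mmap 0x1fe000"); exit(1); }
*(volatile uint32_t *)0x1fe004 = 0x54410001;  /* value that lets the magic check pass */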
```
Then the magic check passes and GDB steps into the tail-call to memset.
## Why this scaffold beats `ddr_emu2` for verifying trampolines
`ddr_emu2` dies at PC=0x10a80 because the emulator can't model an MMIO
register, which leaves a blind spot in our verification. Native GDB on an
aarch64 host runs the *actual* CPU with full instruction fidelity; the limit
becomes "can we fake the MMIO responses?" rather than "does the emulator know
this instruction?". Compute-only code (functions 1 and 2) needs zero prep.
For MMIO-touching code, `mmap(MAP_FIXED)` plus a signal-handler stub can
serve as a synthetic PHY. **That's the path to single-stepping a patched
trampoline through the real ISA with fake hardware replies**, which is
exactly what the next round of v3fb verification would need.
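
One possible shape for the `mmap(MAP_FIXED)` + signal-handler idea, as a minimal sketch and not something that exists in this directory: a SIGSEGV handler maps the faulting page on demand and seeds the accessed word with a canned reply, so the faulting load restarts and succeeds. The names `fake_mmio_install`, `on_segv`, and `fake_reply` are hypothetical. The sketch only models a static reply; registers whose value must change between polls would need more than this. Calling `mmap` from a signal handler is not guaranteed by POSIX, though it is a plain system call on Linux.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile uint32_t fake_reply;  /* value every faked register reads as */
static uintptr_t page_size;           /* cached so the handler avoids sysconf */

/* Map the page that faulted, then plant the canned reply at the address. */
static void on_segv(int sig, siginfo_t *si, void *uctx)
{
    (void)sig; (void)uctx;
    uintptr_t addr = (uintptr_t)si->si_addr;
    void *base = (void *)(addr & ~(page_size - 1));
    void *p = mmap(base, page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        _exit(1);                     /* e.g. a genuine NULL dereference */
    *(volatile uint32_t *)(addr & ~(uintptr_t)3) = fake_reply;
    /* Returning restarts the faulting instruction, which now succeeds. */
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int fake_mmio_install(uint32_t reply)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    fake_reply = reply;
    page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Under GDB the signal stops the debugger before the handler runs, so a session using this approach would likely want `handle SIGSEGV nostop noprint pass` first.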