benchmark/: three-way RE-tool comparison + first real C-lift

Three small functions and one real poll-site function, extracted from
the v1.19 conservative blob with ground-truth C and per-tool
(Ghidra / retdec / decomp.me) docs:
  01_memset        — byte memset, 28 B
  02_memcpy32      — word-aligned memcpy, 36 B
  03_magic_memset  — magic check + tail-call to memset, 40 B
  04_train_phy_block — first real poll-site function (104 B, 26 insts),
                       contains poll sites 12-15

Results in RESULTS.md:
  - Ghidra: A on all four. Auto-decompile is close to final.
  - retdec: A on #3, F on #1 and #2 (no register-arg inference on raw
    binary input), C on #4 (mistakes the `& 0xF0000000` mask test for a
    `< 0x10000000` compare).

GRIND_LOG.md (in 04_train_phy_block/) records the matching-decomp
iteration: 116-byte candidate.c at -Os vs vendor 104 bytes = 89.7%
size match on first real iteration. Remaining gap is GCC's choice of
`cmp w, w_const; b.ls` over vendor's `tst w, #imm; b.eq` for the
mask tests.
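
Why the compiler is free to pick either encoding can be sketched in C.
This is a minimal illustration, not code from candidate.c: the mask
0xF0000000 is borrowed from the retdec note above as an assumption, and
the function names are hypothetical. For an unsigned 32-bit value, a
test of the top nibble and an unsigned compare against 0x0FFFFFFF give
the same answer, so `tst`/`b.eq` and `cmp`/`b.ls` are interchangeable
for this particular mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed mask, taken from the retdec note for illustration only; the
 * immediates actually used in 04_train_phy_block may differ. */
#define HYPOTHETICAL_MASK 0xF0000000u

/* Mask form: corresponds to the vendor's `tst w, #imm; b.eq` shape. */
static bool top_nibble_clear_mask(uint32_t x)
{
    return (x & HYPOTHETICAL_MASK) == 0;
}

/* Compare form: corresponds to GCC's `cmp w, w_const; b.ls` shape
 * (unsigned lower-or-same against 0x0FFFFFFF). */
static bool top_nibble_clear_cmp(uint32_t x)
{
    return x <= 0x0FFFFFFFu;
}

/* Returns true if both forms agree on a boundary-heavy set of inputs. */
static bool forms_agree(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x0FFFFFFFu, 0x10000000u,
        0x10000001u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu,
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (top_nibble_clear_mask(samples[i]) !=
            top_nibble_clear_cmp(samples[i]))
            return false;
    }
    return true;
}
```

The equivalence only holds because the mask covers a contiguous run of
top bits; a mask with holes has no single-compare counterpart.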

gdb_debug/ holds a native-aarch64 GDB single-stepper for the three
benchmark functions — boltzmann smoke test passed (memset:
buf[10] 0x00→0xab).
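
The shape of that smoke check can be sketched as follows. This is an
assumption-laden stand-in, not harness.c: `ref_memset` is a
hypothetical plain-C byte loop with a libc-style signature, whereas the
real harness steps the extracted vendor bytes under GDB. Only the
observed effect (buf[10] going from 0x00 to 0xab) comes from the note
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ground-truth C of 01_memset: a plain
 * byte loop. The signature is assumed, not taken from the blob. */
static void *ref_memset(void *dst, int value, size_t len)
{
    uint8_t *p = dst;
    while (len--)
        *p++ = (uint8_t)value;
    return dst;
}

/* Mirrors the reported check: buf[10] starts at 0x00 and must read
 * 0xab after the call. Returns 1 on success, 0 on failure. */
static int smoke_memset(void)
{
    uint8_t buf[16] = {0};
    if (buf[10] != 0x00)
        return 0;
    ref_memset(buf, 0xab, sizeof buf);
    return buf[10] == 0xab;
}
```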

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 07:26:23 +02:00
parent 694be88964
commit 00d655187a
32 changed files with 1113 additions and 0 deletions
@@ -0,0 +1,26 @@
BENCH := $(abspath ..)
.PHONY: all clean
all: gdb_debug.elf
# Wrap each benchmark function's raw bytes into an .o with predictable
# symbols _binary_func_NN_bin_{start,end}, regardless of the cwd-dependent
# symbol names that `ld -b binary` generates.
define WRAP_BIN
$1.o: $(BENCH)/$2/func.bin
	cp $$< $1.bin
	ld -r -b binary -o $$@.raw $1.bin
	rm -f $1.bin
	objcopy $$$$(nm $$@.raw | awk '/_func_bin_start$$$$/{printf " --redefine-sym %s=_binary_$1_bin_start",$$$$3} /_func_bin_end$$$$/{printf " --redefine-sym %s=_binary_$1_bin_end",$$$$3}') $$@.raw $$@
	rm -f $$@.raw
endef
$(eval $(call WRAP_BIN,func_01,01_memset))
$(eval $(call WRAP_BIN,func_02,02_memcpy32))
$(eval $(call WRAP_BIN,func_03,03_magic_memset))
gdb_debug.elf: harness.c func_01.o func_02.o func_03.o
	gcc -O0 -g -Wall -o $@ $^
clean:
	rm -f gdb_debug.elf func_*.o *.bin
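
On the C side, the predictable symbols produced by WRAP_BIN would
typically be consumed as a pair of byte-array externs. The sketch below
is a guess at that pattern, not an excerpt from harness.c: `blob_size`,
`stand_in_blob`, and `stand_in_size` are hypothetical names, and a local
28-byte array (the size quoted for 01_memset) stands in for the linked
object so the snippet compiles on its own.

```c
#include <stddef.h>

/* In the real build these would be provided by func_01.o:
 *   extern const unsigned char _binary_func_01_bin_start[];
 *   extern const unsigned char _binary_func_01_bin_end[];
 * A local array stands in here so the sketch links without it. */
static const unsigned char stand_in_blob[28] = {0};

/* Size in bytes of a wrapped blob, given its start and end symbols. */
static size_t blob_size(const unsigned char *start,
                        const unsigned char *end)
{
    return (size_t)(end - start);
}

/* Applies blob_size to the stand-in, as one would to the real pair. */
static size_t stand_in_size(void)
{
    return blob_size(stand_in_blob,
                     stand_in_blob + sizeof stand_in_blob);
}
```

Taking the difference of the two symbols avoids relying on the
`_size` symbol, which the Makefile above does not rename.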