00d655187a
Three small functions extracted from the v1.19 conservative blob with
ground-truth C and per-tool (Ghidra / retdec / decomp.me) docs, plus
the first real poll-site function:
01_memset — byte memset, 28 B
02_memcpy32 — word-aligned memcpy, 36 B
03_magic_memset — magic check + tail-call to memset, 40 B
04_train_phy_block — first real poll-site function (104 B, 26 insts),
contains poll sites 12-15
Results in RESULTS.md:
- Ghidra: A on all four. Auto-decompile is close to final.
- retdec: A on #3, F on #1 and #2 (no register-argument inference on
the raw blob), C on #4 (mistakes `& 0xF0000000` for `< 0x10000000`).
GRIND_LOG.md (in 04_train_phy_block/) records the matching-decomp
iteration: 116-byte candidate.c at -Os vs vendor 104 bytes = 89.7%
size match on first real iteration. Remaining gap is GCC's choice of
`cmp w, w_const; b.ls` over vendor's `tst w, #imm; b.eq` for the
mask tests.
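For reference, the two spellings of such a mask test agree on every
unsigned 32-bit value, which is why a compiler is free to pick either.
A minimal sketch, assuming the mask is the 0xF0000000 from the retdec
note above; the function names are hypothetical and not taken from the
blob:

```c
#include <assert.h>
#include <stdint.h>

/* Mask form: the shape the vendor code is reported to use (tst + b.eq). */
static int top_nibble_clear_mask(uint32_t w)
{
    return (w & 0xF0000000u) == 0;
}

/* Compare form: the shape GCC is reported to pick (cmp + b.ls). */
static int top_nibble_clear_cmp(uint32_t w)
{
    return w <= 0x0FFFFFFFu; /* same as w < 0x10000000 */
}

/* Spot-check the boundary values; both forms must agree on each one. */
static void check_mask_equivalence(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x0FFFFFFFu, 0x10000000u, 0x80000000u, 0xFFFFFFFFu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(top_nibble_clear_mask(samples[i]) ==
               top_nibble_clear_cmp(samples[i]));
}
```

Because the results are identical, closing the remaining size gap is
about steering instruction selection, not about fixing logic.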
gdb_debug/ holds a native-aarch64 GDB single-stepper for the three
benchmark functions — boltzmann smoke test passed (memset:
buf[10] 0x00→0xab).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1.5 KiB
decomp.me recipe — 01_memset
Create a scratch
Open https://decomp.me/ (or your self-hosted instance at http://192.168.88.64 if available). Click New scratch.
- Platform / Compiler: `gcc 12.x aarch64-linux-gnu` (or whatever aarch64-gcc is offered; the exact version doesn't matter much for this size).
- Compiler flags: `-O2 -ffreestanding -nostdlib`
- Diff label: `memset_byte`
Target asm
Paste the following into the "Target asm" box:
memset_byte:
    mov x3, #0x0
.Lloop:
    cmp x2, x3
    b.ne .Lbody
    ret
.Lbody:
    strb w1, [x0, x3]
    add x3, x3, #0x1
    b .Lloop
Context (headers/decls)
#include <stddef.h>
#include <stdint.h>
Source
Paste the ground-truth C from reference.c (or write your own first
and iterate).
Expected workflow
- First compile: the scorer usually reports a high similarity (>= 80%) if the compiler picks the same `while (i != n)` pattern.
- Fine-tune: try `i++` vs `i+=1`, try `while` vs `for`, try `uint8_t *` cast placement. Each can yield a different register-allocation order, which the scorer rewards or punishes.
- A perfect match is possible if you hit the exact code shape GCC chose.
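The fine-tuning step can be sketched as two candidate spellings of the same loop. This is a minimal sketch, not the contents of reference.c: the signature is an assumption read off the target asm (x0 = destination, w1 = fill byte, x2 = count, x0 left untouched), and the names `memset_byte_a`, `memset_byte_b` and `check_variants` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Variant A: while loop, i++, cast hoisted into a local pointer. */
void *memset_byte_a(void *dst, int value, size_t n)
{
    uint8_t *p = (uint8_t *)dst;
    size_t i = 0;
    while (i != n) {
        p[i] = (uint8_t)value;
        i++;
    }
    return dst;
}

/* Variant B: for loop, i += 1, cast applied at the store. */
void *memset_byte_b(void *dst, int value, size_t n)
{
    for (size_t i = 0; i != n; i += 1)
        ((uint8_t *)dst)[i] = (uint8_t)value;
    return dst;
}

/* Host-side sanity check: both variants fill exactly n bytes. */
void check_variants(void)
{
    uint8_t a[16] = {0}, b[16] = {0};
    memset_byte_a(a, 0xab, 12);
    memset_byte_b(b, 0xab, 12);
    for (size_t i = 0; i < 16; i++) {
        assert(a[i] == b[i]);
        assert(a[i] == (i < 12 ? 0xab : 0x00));
    }
}
```

To try one in a scratch, rename it to `memset_byte` and drop the check helper. If GCC at `-O2` replaces the loop with a call to `memset`, adding `-fno-tree-loop-distribute-patterns` may keep the loop inline; that flag is a suggestion to experiment with, not part of the recipe above.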
Benchmark notes
- decomp.me's strength is the compile-and-diff feedback loop — every edit immediately shows the byte-delta against the target.
- Weakness for this target: the real blob was likely built with a different compiler (ARMCC / Keil / vendor LLVM?), so GCC may never match exactly even with perfect C. A similarity of about 90% is the realistic ceiling.