benchmark/: three-way RE-tool comparison + first real C-lift

Three small functions and one real poll-site function extracted from the
v1.19 conservative blob, each with ground-truth C and per-tool
(Ghidra / retdec / decomp.me) docs:

  01_memset          — byte memset, 28 B
  02_memcpy32        — word-aligned memcpy, 36 B
  03_magic_memset    — magic check + tail-call to memset, 40 B
  04_train_phy_block — first real poll-site function (104 B, 26 insts),
                       contains poll sites 12-15

Results in RESULTS.md:
  - Ghidra: A on all four. Auto-decompile is close to final.
  - retdec: A on #3, F on #1 and #2 (no register-arg inference on raw),
    C on #4 (mistakes `& 0xF0000000` for `< 0x10000000`).

GRIND_LOG.md (in 04_train_phy_block/) records the matching-decomp
iteration: 116-byte candidate.c at -Os vs vendor 104 bytes = 89.7%
size match on first real iteration. Remaining gap is GCC's choice of
`cmp w, w_const; b.ls` over vendor's `tst w, #imm; b.eq` for the
mask tests.

gdb_debug/ holds a native-aarch64 GDB single-stepper for the three
small benchmark functions. The smoke test on boltzmann passed (memset:
buf[10] 0x00→0xab).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,58 @@
# decomp.me recipe — 01_memset

## Create a scratch

Open https://decomp.me/ (or your self-hosted instance at
http://192.168.88.64 if available). Click **New scratch**.

- **Platform / Compiler:** `gcc 12.x aarch64-linux-gnu` (or whatever
  aarch64-gcc is offered — exact version doesn't matter much for this
  size).
- **Compiler flags:** `-O2 -ffreestanding -nostdlib`
- **Diff label:** `memset_byte`

## Target asm

Paste the following into the **"Target asm"** box:

```asm
memset_byte:
    mov     x3, #0x0
.Lloop:
    cmp     x2, x3
    b.ne    .Lbody
    ret
.Lbody:
    strb    w1, [x0, x3]
    add     x3, x3, #0x1
    b       .Lloop
```

## Context (headers/decls)

```c
#include <stddef.h>
#include <stdint.h>
```

## Source

Paste the ground-truth C from `reference.c` (or write your own first
and iterate).

## Expected workflow

- First compile: scorer usually reports a high similarity (>= 80%) if
  the compiler picks the same `while (i != n)` pattern.
- Fine-tune: try `i++` vs `i+=1`, try `while` vs `for`, try `uint8_t *`
  cast placement. Each can change instruction order and register
  allocation, which the scorer rewards or punishes.
- Perfect match possible if you hit the exact code shape GCC chose.
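
The fine-tuning step above can be sketched as two candidate sources that differ only in loop shape and cast placement. Both are hypothetical starting points, not known matches; the names `memset_byte_while` and `memset_byte_for` are illustrative, and on decomp.me each would be pasted on its own and renamed to `memset_byte`.

```c
#include <stddef.h>
#include <stdint.h>

/* Candidate A: pre-test while loop with an inequality test, the shape
 * the target's cmp / b.ne pair suggests. */
void memset_byte_while(void *buf, uint8_t val, size_t len) {
    size_t i = 0;
    while (i != len) {
        ((uint8_t *)buf)[i] = val;
        i++;
    }
}

/* Candidate B: for loop with a less-than test and the cast hoisted out
 * of the loop. Same behaviour, but the compiler may pick a different
 * branch condition or register order. */
void memset_byte_for(void *buf, uint8_t val, size_t len) {
    uint8_t *p = (uint8_t *)buf;
    for (size_t i = 0; i < len; i += 1) {
        p[i] = val;
    }
}
```

One thing to watch for: at `-O2`, recent GCC versions may recognise either loop and replace it with a call to `memset`, even with `-ffreestanding`. If the scratch shows a branch to `memset` instead of a loop, adding `-fno-tree-loop-distribute-patterns` to the compiler flags is worth trying.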

## Benchmark notes

- decomp.me's strength is the **compile-and-diff** feedback loop: every
  edit immediately shows the assembly diff and score against the target.
- Weakness for this target: the real blob was likely built with a
  different compiler (ARMCC / Keil / vendor LLVM?). GCC may never match
  exactly even with perfect C. A similarity of about 90% is the realistic
  ceiling.
Binary file not shown.
@@ -0,0 +1,14 @@

01_memset/func.bin:     file format binary


Disassembly of section .data:

0000000000000aac <.data>:
 aac:   d2800003    mov   x3, #0x0          // #0
 ab0:   eb03005f    cmp   x2, x3
 ab4:   54000041    b.ne  0xabc             // b.any
 ab8:   d65f03c0    ret
 abc:   38236801    strb  w1, [x0, x3]
 ac0:   91000463    add   x3, x3, #0x1
 ac4:   17fffffb    b     0xab0
@@ -0,0 +1,41 @@
# Ghidra recipe — 01_memset

## Load

**File → Import File…** → `func.bin`.

In the import dialog:
- **Format:** Raw Binary
- **Language:** AArch64:LE:64:v8A
- **Base Address:** `0x0aac`. Branches are PC-relative and the code at
  0xaac has no absolute-address references of its own, so the base
  address does not change the decompilation. Setting it keeps the
  displayed addresses aligned with the blob offsets, which matters for
  readability.
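
To see why the base address only shifts the displayed targets, the two branches in this slice can be decoded by hand. The sketch below assumes the standard A64 encodings for `B.cond` (19-bit immediate in bits 23:5) and `B` (26-bit immediate in bits 25:0), both counted in 4-byte words relative to the branch's own address. The helper names are illustrative and not part of any tool.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value`. */
static int64_t sign_extend(uint32_t value, unsigned bits) {
    uint32_t sign = 1u << (bits - 1);
    return (int64_t)(value ^ sign) - (int64_t)sign;
}

/* Target of an A64 B.cond instruction located at address `pc`. */
uint64_t bcond_target(uint64_t pc, uint32_t insn) {
    uint32_t imm19 = (insn >> 5) & 0x7FFFFu;
    return pc + (uint64_t)(sign_extend(imm19, 19) * 4);
}

/* Target of an A64 unconditional B instruction located at address `pc`. */
uint64_t b_target(uint64_t pc, uint32_t insn) {
    uint32_t imm26 = insn & 0x3FFFFFFu;
    return pc + (uint64_t)(sign_extend(imm26, 26) * 4);
}
```

With the base address set to 0x0aac, `bcond_target(0xab4, 0x54000041)` gives 0xabc and `b_target(0xac4, 0x17fffffb)` gives 0xab0, matching the objdump listing. Loading the same bytes at base 0 puts the branches at 0x8 and 0x18 and their targets at 0x10 and 0x4: the same control flow, shifted by 0xaac.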

After import, click **Yes** on the "Analyze now?" prompt; default
analyzers are fine.

## What to look for in Ghidra's decompiler output

- Function automatically detected at 0xaac (the file starts there). If
  auto-analysis leaves the bytes undefined, place the cursor at 0xaac and
  press `D` (disassemble), then `F` (create function).
- Decompiler should produce something like:
  ```c
  void FUN_00000aac(long param_1, byte param_2, long param_3) {
      long local_10 = 0;
      while (local_10 != param_3) {
          *(byte *)(param_1 + local_10) = param_2;
          local_10++;
      }
  }
  ```
- Idiomatic match rate: high for this pattern; Ghidra's decompiler
  recognises the pre-test loop well.
- Ghidra types: `byte` (uint8_t), `long` (the 64-bit register), not
  directly `uint8_t` / `size_t`. Manual retyping is usually needed.

## Benchmark notes

- Time to understandable output: a few seconds (auto-analysis).
- Manual cleanup: rename `FUN_00000aac` → `memset_byte`; retype
  `param_1` to `void *`, `param_2` to `uint8_t`, `param_3` to `size_t`.
- Limits: the load address only affects Ghidra's decompiler output when
  the code jumps to targets beyond the slice, which is irrelevant here.
@@ -0,0 +1,24 @@
/* Ground-truth C for FUN_00000aac @ blob offset 0xaac (28 bytes / 7 insts).
 *
 * Pattern:   byte-wise memset with a simple counting loop.
 * Signature: void memset_byte(void *buf, uint8_t val, size_t len);
 *
 * AArch64 ABI: X0 = buf, W1 = val (low byte), X2 = len
 * Scratch:     X3 = index i
 *
 * Notes the decompiler should ideally recover:
 *  - This is unambiguously "memset" semantics; bonus points for naming it so.
 *  - The loop structure is pre-test (cmp before body) — tools should emit
 *    `while (i != len)` or `for (; i < len; ...)`.
 *  - W1 is truncated to a byte by the STRB; decompiler should mark val as u8.
 */
#include <stddef.h>
#include <stdint.h>

void memset_byte(void *buf, uint8_t val, size_t len) {
    size_t i = 0;
    while (i != len) {
        ((uint8_t *)buf)[i] = val;
        i++;
    }
}
@@ -0,0 +1,38 @@
//
// This file was generated by the Retargetable Decompiler
// Website: https://retdec.com
//

#include <stdint.h>

// ------------------- Function Prototypes --------------------

int64_t entry_point(void);

// ------------------------ Functions -------------------------

// Address range: 0xaac - 0xac8
int64_t entry_point(void) {
    // 0xaac
    int64_t result; // 0xaac
    if (result == 0) {
        // 0xab8
        return result;
    }
    int64_t v1 = 0; // 0xac0
    *(char *)(v1 + result) = (char)result;
    v1++;
    while (result != v1) {
        // 0xabc
        *(char *)(v1 + result) = (char)result;
        v1++;
    }
    // 0xab8
    return result;
}

// --------------------- Meta-Information ---------------------

// Detected compiler/packer: starforce (3.x)
// Detected functions: 1

@@ -0,0 +1,38 @@
# retdec recipe — 01_memset

retdec runs fully automated — hand it the binary, ask for C.

## Invocation (on the decompme container at pve4, or wherever retdec lives)

```
retdec --mode raw --arch arm --endian little --bit-size 64 \
    --raw-entry-point 0x0aac \
    --raw-section-vma 0x0aac \
    func.bin -o retdec.c
```

The flags:
- `--mode raw` — input is a flat binary, no PE/ELF headers.
- `--arch arm --endian little --bit-size 64` — AArch64 LE.
- `--raw-entry-point 0x0aac` — tell retdec where execution starts.
- `--raw-section-vma 0x0aac` — load the binary at address 0x0aac so
  branch targets resolve correctly.

Output goes to `retdec.c`. retdec also emits a `.ll` (LLVM IR) and a
`.dsm` (disassembly) alongside; all three are useful for comparison.

## What to expect

retdec is the least "smart" of the three tools. For a raw 28-byte blob
with no headers, it will:
- Detect the function at 0x0aac.
- Produce a C function named `function_aac`, `entry_point`, or similar.
- Often insert pseudo-intrinsics like `__asm_mov(x3, 0)` for instructions
  it doesn't fold into C. For this tiny loop it usually manages clean C.

## Benchmark notes

- Strength: zero-touch, scriptable, good for bulk processing.
- Weakness: no interactive refinement, so you get what you get. Type
  inference is conservative (`int32_t *` instead of `void *` / `uint8_t *`).
- Often emits control flow as `goto` rather than structured loops.
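
For comparison with the structured `while` loop in the ground truth, this is the kind of `goto`-shaped C the last bullet describes. It is a hand-written illustration that mirrors the seven instructions one to one, not captured retdec output; the function name `memset_byte_goto` and the labels are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hand-written goto rendering of the loop at 0xaac..0xac4. */
void memset_byte_goto(void *buf, uint8_t val, size_t len) {
    size_t i = 0;                   /* aac: mov  x3, #0       */
loop:
    if (i != len) goto body;        /* ab0: cmp, ab4: b.ne    */
    return;                         /* ab8: ret               */
body:
    ((uint8_t *)buf)[i] = val;      /* abc: strb w1, [x0, x3] */
    i = i + 1;                      /* ac0: add  x3, x3, #1   */
    goto loop;                      /* ac4: b    0xab0        */
}
```

Collapsing `loop` and `body` into `while (i != len) { ... }` recovers the ground-truth shape. That restructuring is the step a goto-heavy decompile leaves to the reader.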