00d655187a
Three small functions extracted from the v1.19 conservative blob with
ground-truth C and per-tool (Ghidra / retdec / decomp.me) docs:
01_memset — byte memset, 28 B
02_memcpy32 — word-aligned memcpy, 36 B
03_magic_memset — magic check + tail-call to memset, 40 B
04_train_phy_block — first real poll-site function (104 B, 26 insts),
contains poll sites 12-15
Results in RESULTS.md:
- Ghidra: A on all four. Auto-decompile is close to final.
- retdec: A on #3, F on #1 and #2 (no register-arg inference on raw),
C on #4 (mistakes & 0xF0000000 for < 0x10000000).
GRIND_LOG.md (in 04_train_phy_block/) records the matching-decomp
iteration: 116-byte candidate.c at -Os vs vendor 104 bytes = 89.7%
size match on first real iteration. Remaining gap is GCC's choice of
`cmp w, w_const; b.ls` over vendor's `tst w, #imm; b.eq` for the
mask tests.
gdb_debug/ holds a native-aarch64 GDB single-stepper for the three
benchmark functions — boltzmann smoke test passed (memset:
buf[10] 0x00→0xab).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
172 lines
6.2 KiB
Markdown
# Three-way RE-tool benchmark: results

Three AArch64 functions from the RK3588 DDR v1.19 conservative blob,
decompiled by **Ghidra 11.3** (interactive, auto-analysis) and **retdec
v5.0** (fully automated, `--mode raw`). **decomp.me** is a
matching-decompilation comparator rather than a decompiler, so it's
benchmarked on a different axis (time-to-match).

## Function 01: `memset_byte` (28 bytes, 7 insts)

**Ground truth:** `void memset_byte(void *buf, uint8_t val, size_t len)`,
a byte-wise pre-test counting loop.

### Ghidra output
```c
void FUN_00000aac(long param_1, undefined1 param_2, long param_3) {
  long lVar1;
  for (lVar1 = 0; param_3 != lVar1; lVar1 = lVar1 + 1) {
    *(undefined1 *)(param_1 + lVar1) = param_2;
  }
}
```
**Grade: A.** Semantics perfect. Types are `long`/`undefined1` instead of
`void*`/`uint8_t`/`size_t`, but each is one click to retype. For-loop
shape matches the canonical idiom exactly.

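As a rough sketch of what the retyping step leaves you with, the Ghidra
output above might be cleaned up as follows. The signature is taken from
the ground truth; the local names `p` and `i` and the `assert` are our
own additions for illustration and are not present in the blob.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-retyped form of Ghidra's FUN_00000aac. The signature follows the
 * ground truth; the local names p and i are assumptions of this sketch. */
void memset_byte(void *buf, uint8_t val, size_t len)
{
    uint8_t *p = buf;

    /* Sanity check added for this sketch only; the blob has no such check. */
    assert(p != NULL || len == 0);

    /* Pre-test loop: the body is skipped entirely when len == 0. */
    for (size_t i = 0; i != len; i++) {
        p[i] = val;
    }
}
```

Calling `memset_byte(buf, 0xab, 11)` on a zeroed buffer sets `buf[0]`
through `buf[10]` and leaves `buf[11]` untouched.
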
### retdec output
```c
int64_t entry_point(void) {
    int64_t result;
    if (result == 0) return result;
    int64_t v1 = 0;
    *(char *)(v1 + result) = (char)result;
    ...
}
```
**Grade: F.** **No function arguments inferred**: retdec treats X0/W1/X2
as uninitialised locals, so the whole function signature is wrong. The
loop body stores through an address and a value that both derive from the
same bogus `result`. This is retdec's biggest weakness on raw binaries:
without ELF symbol or DWARF hints, it can't tell which registers are
live-in parameters.

### decomp.me workflow
N/A as a decompiler. As a matching-decomp tool: paste the asm plus your
candidate C, then iterate on the C until the compiled output
byte-matches. For memset this reaches 90%+ similarity in a handful of
edits with GCC (an exact match is unlikely because the original blob
used a different compiler).

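For a crude local check between decomp.me iterations, a positional byte
comparison is enough to see whether an edit moved things in the right
direction. The helper below is hypothetical and is not how decomp.me
computes its score; it also underestimates similarity whenever one
inserted instruction shifts everything after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of decomp.me: percentage of positions at
 * which two code buffers hold the same byte, relative to the longer one. */
unsigned byte_match_percent(const uint8_t *target, size_t target_len,
                            const uint8_t *cand, size_t cand_len)
{
    size_t shorter = target_len < cand_len ? target_len : cand_len;
    size_t longer  = target_len < cand_len ? cand_len : target_len;
    size_t same = 0;

    assert(target != NULL && cand != NULL);
    if (longer == 0) {
        return 100;  /* two empty buffers match trivially */
    }

    for (size_t i = 0; i < shorter; i++) {
        if (target[i] == cand[i]) {
            same++;
        }
    }
    /* Extra bytes in the longer buffer count as mismatches. */
    return (unsigned)(same * 100 / longer);
}
```

Dividing by the longer length means a candidate that is correct but
carries extra trailing instructions still scores below 100.
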
---

## Function 02: `memcpy32` (36 bytes, 9 insts)

**Ground truth:** word-aligned memcpy, `len &= ~3`, 4-byte stride.

### Ghidra output
```c
void FUN_00001200(long param_1, long param_2, ulong param_3) {
  ulong uVar1;
  for (uVar1 = 0; uVar1 != (param_3 & 0xfffffffc); uVar1 = uVar1 + 4) {
    *(undefined4 *)(param_1 + uVar1) = *(undefined4 *)(param_2 + uVar1);
  }
}
```
**Grade: A.** Semantics perfect. The `& 0xfffffffc` mask is in the
correct position, the 4-byte stride is there, the `undefined4` (u32) copy
is there. Again: only type annotations need manual cleanup.

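A retyped version of the Ghidra output could look roughly like the sketch
below. The name `memcpy32` comes from the heading; the parameter names
and their order (destination first, as `param_1` is the one stored to)
are assumptions, as is widening the mask to `size_t` to follow the
ground-truth `len &= ~3` rather than Ghidra's printed `0xfffffffc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of a retyped FUN_00001200. Parameter names are assumptions. */
void memcpy32(void *dst, const void *src, size_t len)
{
    uint32_t *d = dst;
    const uint32_t *s = src;

    /* Sanity check added for this sketch only. */
    assert((d != NULL && s != NULL) || len < 4);

    len &= ~(size_t)3;                  /* drop the 0-3 trailing bytes */
    for (size_t off = 0; off != len; off += 4) {
        d[off / 4] = s[off / 4];        /* one 32-bit load/store per step */
    }
}
```

With `len = 10` the mask leaves 8, so exactly two words are copied and
the third destination word is left alone.
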
### retdec output
```c
int64_t entry_point(void) {
    int64_t v1 = result & 0xfffffffc;
    if (v1 == 0) return result;
    int64_t v2 = 0;
    int64_t v3 = v2 + 4;
    while (v3 != v1) { v2 = v3; v3 = v2 + 4; }
    return result;
}
```
**Grade: F.** Same no-arguments failure mode. The memory-copy statements
are completely absent: retdec failed to emit the LDR/STR pair as a
dereference. You get an infinite-looking counter increment and nothing
else. Unusable.

---

## Function 03: `magic_memset` (40 bytes, 10 insts)

**Ground truth:** `if (*(u32*)0x1fe004 == 0x54410001) memset(0x1fe000, 0, 0x32c);`

### Ghidra output
```c
void FUN_00000da4(void) {
  if (_DAT_001fe004 == 0x54410001) {
    FUN_00000aac(0x1fe000, 0, 0x32c);
    return;
  }
  return;
}
```
**Grade: A+.** Perfect. `_DAT_001fe004` is Ghidra's auto-named data
symbol for the absolute address. The tail-call `B 0xaac` was correctly
turned into a regular call-and-return, preserving the calling
convention. The MOVZ+MOVK composed immediate `0x54410001` was collapsed
into a single literal.

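To exercise this logic off-target, one option is a variant that takes the
region base as a parameter instead of hard-coding `0x1fe000`. The sketch
below is such a hypothetical, host-testable rewrite: the name
`magic_memset_at`, the macro names, and the use of libc `memset` in place
of the blob's own `FUN_00000aac` are all assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAGIC_OFFSET 0x4u          /* 0x1fe004 - 0x1fe000 */
#define MAGIC_VALUE  0x54410001u
#define REGION_SIZE  0x32cu

/* Hypothetical host-testable variant: the real function takes no
 * arguments and hard-codes base = 0x1fe000. */
void magic_memset_at(uint8_t *base)
{
    uint32_t magic;

    assert(base != NULL);
    memcpy(&magic, base + MAGIC_OFFSET, sizeof magic);  /* 32-bit load */
    if (magic == MAGIC_VALUE) {
        /* libc memset stands in for the blob's FUN_00000aac. */
        memset(base, 0, REGION_SIZE);
    }
}
```

On the target, the equivalent call would pass `(uint8_t *)0x1fe000`; on a
host, any buffer of at least `0x32c` bytes works.
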
### retdec output
```c
int64_t entry_point(void) {
    if (*(int32_t *)0x1fe004 == 0x54410001) {
        return unknown_aac(0x1fe000, 0, 812);
    }
    return 0x1fe000;
}
```
**Grade: B+.** **Noticeably better than retdec's output for #1 and #2**:
- Absolute-address dereference correctly parsed.
- `MOVZ W1, #1 ; MOVK W1, #0x5441, LSL #16` collapsed to
  `0x54410001` ✓
- Tail-call to 0xaac correctly recognised as a call to
  `unknown_aac(0x1fe000, 0, 812)`. It even got the arity right by
  observing X0/W1/X2 being set up just before the branch.
- **Weakness:** returns `0x1fe000` in the fall-through branch. This is
  spurious: the function returns `void`, and retdec fabricated a return
  value. It also prints `812` in decimal instead of `0x32c` in hex.

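The two constants in the bullets above can be checked by hand. The sketch
below models the MOVZ/MOVK pair on a 32-bit register and confirms the
decimal/hex equivalence; the function names are our own and hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models `MOVZ Wd, #lo` followed by `MOVK Wd, #hi, LSL #16`. */
uint32_t movz_movk_w(uint16_t lo, uint16_t hi)
{
    uint32_t w = lo;                               /* MOVZ: low half set, rest zeroed */
    w = (w & 0x0000ffffu) | ((uint32_t)hi << 16);  /* MOVK: replace bits 31:16 only */
    return w;
}

/* Checks the constants quoted for function #3. */
void check_function03_constants(void)
{
    assert(movz_movk_w(1, 0x5441) == 0x54410001u);
    assert(812 == 0x32c);   /* retdec's decimal vs the ground-truth hex */
}
```
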
---

## Takeaways

| dimension | Ghidra | retdec | decomp.me |
|---|---|---|---|
| Argument inference from raw binary | **yes** (intra-proc analysis) | **no** | n/a |
| Absolute-address data refs | auto-named `_DAT_xxxxx` | raw cast `*(int32_t *)0x…` | n/a |
| MOVZ+MOVK literal reconstruction | collapses | collapses | n/a |
| Tail-call recognition | yes | yes | n/a |
| Control-flow structure | clean structured loops | mix of `while` + `goto` | n/a |
| Type inference | `long`/`undefined4` placeholders | cautious `int64_t` fallback | n/a |
| Zero-touch automation | no (interactive) | **yes** | no (interactive) |
| Matching-decomp workflow | no | no | **yes** |

### When each tool wins

- **Ghidra** is the default daily driver. Auto-analysis output is already
  close to final for simple functions; mostly you rename params and
  retype placeholders.
- **retdec** shines when the target has **absolute-address data refs,
  call tables, or embedded constants** (function #3). It falls over on
  anything where register-passed parameters need inference from
  surrounding context (functions #1 and #2). Fine for bulk batch
  processing of a repo full of functions you don't yet know whether you
  care about, but verify each output.
- **decomp.me** doesn't compete with the others; it's the **"did my
  rewrite compile to the same bytes?"** tool. Complementary: take
  Ghidra's output, paste the C into decomp.me, iterate until the
  compiled asm matches the blob's bytes. That's how you'd produce a
  maintainable C re-implementation.

### Practical recipe for our DDR-blob work

1. **Start with Ghidra's decompiler output** (already done in
   `ddr_annotated.c`). Retype params, rename variables. ~2–4h per
   non-trivial function.
2. **Feed the cleaned C into decomp.me** with the original function's
   bytes as target asm. Iterate until byte-match (or asymptotic
   similarity). ~1–2h per function.
3. **retdec** is useful only for functions with lots of absolute-address
   refs we want a second opinion on, which is rare in the poll-loop
   patches.

For a production C re-implementation of the 20 poll sites, Ghidra →
decomp.me is the correct pipeline. Skip retdec for those.