# Three-way RE-tool benchmark — results

Three AArch64 functions from the RK3588 DDR v1.19 conservative blob, decompiled by **Ghidra 11.3** (interactive, auto-analysis) and **retdec v5.0** (fully automated, `--mode raw`). **decomp.me** is a matching-decompilation comparator rather than a decompiler, so it's benchmarked on a different axis (time-to-match).

## Function 01 — `memset_byte` (28 bytes, 7 insts)

**Ground truth:** `void memset_byte(void *buf, uint8_t val, size_t len)`, a byte-wise pre-test counting loop.

### Ghidra output

```c
void FUN_00000aac(long param_1, undefined1 param_2, long param_3)
{
  long lVar1;

  for (lVar1 = 0; param_3 != lVar1; lVar1 = lVar1 + 1) {
    *(undefined1 *)(param_1 + lVar1) = param_2;
  }
}
```

**Grade: A.** Semantics perfect. Types are `long`/`undefined1` instead of `void*`/`uint8_t`/`size_t`, one click to retype each. For-loop shape matches the canonical idiom exactly.

### retdec output

```c
int64_t entry_point(void) {
    int64_t result;
    if (result == 0) return result;
    int64_t v1 = 0;
    *(char *)(v1 + result) = (char)result;
    ...
}
```

**Grade: F.** **No function arguments inferred**: retdec treats X0/W1/X2 as uninitialised locals. The whole function signature is wrong, and the store in the loop body uses `result` as both the address base and the stored value, so it writes the wrong data to the wrong place. This is retdec's biggest weakness on raw binaries: without ELF symbol or DWARF hints, it can't tell which registers are live-in parameters.

### decomp.me workflow

N/A as a decompiler. As a matching-decomp tool: paste the asm plus your candidate C, then iterate on wording until the compiled output byte-matches. For memset this reaches 90%+ similarity in a handful of edits with GCC (an exact match is unlikely because the original blob was built with a different compiler).

---

## Function 02 — `memcpy32` (36 bytes, 9 insts)

**Ground truth:** word-aligned memcpy, `len &= ~3`, 4-byte stride.
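
For orientation, here is a minimal C sketch of what that ground truth could look like. Only the name `memcpy32`, the mask and the stride come from this write-up; the parameter names, the `void *`/`size_t` types and the `0xfffffffc` mask width (chosen to mirror the constant in Ghidra's listing below) are assumptions, and both pointers are assumed to be 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reconstruction of Function 02; not the vendor's source.
 * Parameter names and types are assumptions made for illustration. */
void memcpy32(void *dst, const void *src, size_t len)
{
    uint32_t *d = dst;            /* assumed 4-byte aligned */
    const uint32_t *s = src;      /* assumed 4-byte aligned */

    /* Round the byte count down to a whole number of words. The 32-bit
     * mask mirrors the constant in Ghidra's output and also clears the
     * upper 32 bits of len. */
    len &= 0xfffffffcu;

    /* Pre-test loop with a 4-byte stride: one 32-bit load and one
     * 32-bit store per iteration. */
    for (size_t off = 0; off != len; off += 4) {
        d[off / 4] = s[off / 4];
    }
}
```

With a length of 10 the mask leaves 8, so two words are copied and the trailing 2 bytes are left untouched.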

### Ghidra output

```c
void FUN_00001200(long param_1, long param_2, ulong param_3)
{
  ulong uVar1;

  for (uVar1 = 0; uVar1 != (param_3 & 0xfffffffc); uVar1 = uVar1 + 4) {
    *(undefined4 *)(param_1 + uVar1) = *(undefined4 *)(param_2 + uVar1);
  }
}
```

**Grade: A.** Semantics perfect. The `& 0xfffffffc` mask is in the correct position, the 4-byte stride is there, the `undefined4` (u32) copy is there. Again: only type annotations need manual cleanup.

### retdec output

```c
int64_t entry_point(void) {
    int64_t v1 = result & 0xfffffffc;
    if (v1 == 0) return result;
    int64_t v2 = 0;
    int64_t v3 = v2 + 4;
    while (v3 != v1) {
        v2 = v3;
        v3 = v2 + 4;
    }
    return result;
}
```

**Grade: F.** Same no-arguments failure mode. The memory-copy statements are completely absent: retdec failed to emit the LDR/STR pair as a dereference. You get an infinite-looking counter increment and nothing else. Unusable.

---

## Function 03 — `magic_memset` (40 bytes, 9 insts)

**Ground truth:** `if (*(u32*)0x1fe004 == 0x54410001) memset(0x1fe000, 0, 0x32c);`

### Ghidra output

```c
void FUN_00000da4(void)
{
  if (_DAT_001fe004 == 0x54410001) {
    FUN_00000aac(0x1fe000, 0, 0x32c);
    return;
  }
  return;
}
```

**Grade: A+.** Perfect. `_DAT_001fe004` is Ghidra's auto-named data symbol for the absolute address. The tail-call `B 0xaac` was correctly turned into a regular call-and-return, preserving the calling convention. The MOVZ+MOVK composed immediate `0x54410001` was collapsed into a single literal.

### retdec output

```c
int64_t entry_point(void) {
    if (*(int32_t *)0x1fe004 == 0x54410001) {
        return unknown_aac(0x1fe000, 0, 812);
    }
    return 0x1fe000;
}
```

**Grade: B+.** **Noticeably better than retdec's output for #1 and #2**:

- Absolute-address dereference correctly parsed.
- `MOVZ W1, #1 ; MOVK W1, #0x5441, LSL #16` collapsed to `0x54410001` ✓
- Tail-call to 0xaac correctly recognised as a call to `unknown_aac(0x1fe000, 0, 812)`. It even got the arity right by observing X0/W1/X2 being set up just before the branch.
- **Weakness:** returns `0x1fe000` in the fall-through branch. This is spurious: the function returns `void` and retdec fabricated a return value. It also prints `812` in decimal instead of `0x32c` in hex.

---

## Takeaways

| dimension | Ghidra | retdec | decomp.me |
|---|---|---|---|
| Argument inference from raw binary | **yes** (intra-proc analysis) | **no** | n/a |
| Absolute-address data refs | auto-named `_DAT_xxxxx` | raw cast `*(int32_t *)0x…` | n/a |
| MOVZ+MOVK literal reconstruction | collapses | collapses | n/a |
| Tail-call recognition | yes | yes | n/a |
| Control-flow structure | clean structured loops | mix of `while` + `goto` | n/a |
| Type inference | `long`/`undefined4` placeholders | cautious `int64_t` fallback | n/a |
| Zero-touch automation | no (interactive) | **yes** | no (interactive) |
| Matching-decomp workflow | no | no | **yes** |

### When each tool wins

- **Ghidra** is the default daily driver. Auto-analysis output is already close to final for simple functions; mostly you rename params and retype placeholders.
- **retdec** shines when the target has **absolute-address data refs, call tables, or embedded constants** (function #3). It falls over on anything where register-passed parameters need inference from surrounding context (functions #1 and #2). It is fine for bulk batch processing of a repo full of functions whose signatures you don't yet know you care about, but verify each output.
- **decomp.me** doesn't compete with the others; it's the **"did my rewrite compile to the same bytes?"** tool. It is complementary: take Ghidra's output, paste the C into decomp.me, and iterate until the compiled asm matches the blob's bytes. That's how you'd produce a maintainable C re-implementation.

### Practical recipe for our DDR-blob work

1. **Start with Ghidra's decompiler output** (already done in `ddr_annotated.c`). Retype params, rename variables. ~2–4h per non-trivial function.
2. **Feed the cleaned C into decomp.me** with the original function's bytes as target asm.
   Iterate until byte-match (or until the similarity score plateaus). ~1–2h per function.
3. **retdec** is useful only for functions with lots of absolute-address refs we want a second opinion on, which is rare in the poll-loop patches.

For a production C re-implementation of the 20 poll sites, Ghidra → decomp.me is the correct pipeline. Skip retdec for those.
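
To illustrate step 1 of the recipe, here is a rough sketch of how Functions 01 and 03 might look after retyping and renaming, as a first candidate to paste into decomp.me. It is not expected to byte-match as written. The macro names and the `magic_memset_at` wrapper are hypothetical: the wrapper takes the region base as a parameter so the logic can be exercised on a host, whereas on the target the base would be the absolute address `0x1fe000`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the values come from the Function 03 ground truth. */
#define MAGIC_OFFSET 0x4u                        /* 0x1fe004 - 0x1fe000 */
#define MAGIC_VALUE  ((0x5441u << 16) | 0x0001u) /* MOVZ #1 + MOVK #0x5441, LSL #16 */
#define REGION_SIZE  0x32cu                      /* 812 bytes */

/* Function 01 with Ghidra's long/undefined1 placeholders retyped. */
void memset_byte(void *buf, uint8_t val, size_t len)
{
    uint8_t *p = buf;

    for (size_t i = 0; i != len; i++) {
        p[i] = val;
    }
}

/* Function 03 with the base address passed in (an assumption made so the
 * sketch runs on a host). memcpy is used for the 32-bit read to avoid
 * alignment and aliasing problems in host buffers. */
void magic_memset_at(void *region)
{
    uint32_t magic;

    memcpy(&magic, (const uint8_t *)region + MAGIC_OFFSET, sizeof magic);
    if (magic == MAGIC_VALUE) {
        memset_byte(region, 0, REGION_SIZE);
    }
}
```

Composing `MAGIC_VALUE` from its two halves mirrors the MOVZ+MOVK pair that both decompilers collapsed into `0x54410001`.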