benchmark/: three-way RE-tool comparison + first real C-lift

Four functions extracted from the v1.19 conservative blob (three small
functions plus the first real poll-site function), with ground-truth C
and per-tool (Ghidra / retdec / decomp.me) docs:

  01_memset          - byte memset, 28 B
  02_memcpy32        - word-aligned memcpy, 36 B
  03_magic_memset    - magic check + tail-call to memset, 40 B
  04_train_phy_block - first real poll-site function (104 B, 26 insts),
                       contains poll sites 12-15

Results in RESULTS.md:

  - Ghidra: A on all four. Auto-decompile is close to final.
  - retdec: A on #3, F on #1 and #2 (no register-arg inference on raw
    binaries), C on #4 (mistakes `& 0xF0000000` for `< 0x10000000`).

GRIND_LOG.md (in 04_train_phy_block/) records the matching-decomp
iteration: 116-byte candidate.c at -Os vs vendor 104 bytes = 89.7%
size match on first real iteration. Remaining gap is GCC's choice of
`cmp w, w_const; b.ls` over vendor's `tst w, #imm; b.eq` for the
mask tests.

gdb_debug/ holds a native-aarch64 GDB single-stepper for the three
small benchmark functions; boltzmann smoke test passed (memset:
buf[10] 0x00→0xab).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
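
The `tst`/`cmp` gap and the retdec misreading mentioned in the message come down to one identity. The sketch below is a minimal illustration, assuming the mask in question is the `0xF0000000` from the retdec note (the message does not say which immediates poll sites 12-15 actually use); all three function names are hypothetical. For an unsigned 32-bit value, "top nibble is zero" is the same predicate as an unsigned compare against `0x0FFFFFFF`, which is why GCC is free to emit `cmp; b.ls`. The identity breaks if the compare is performed on a signed type, which is one plausible but unconfirmed way a `< 0x10000000` rendering can be wrong.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask form, as in `tst w, #imm; b.eq`. The 0xF0000000 mask is an assumption. */
static bool top_nibble_clear_mask(uint32_t w) {
    return (w & 0xF0000000u) == 0;
}

/* Unsigned compare form, as in `cmp w, w_const; b.ls` with w_const = 0x0FFFFFFF. */
static bool top_nibble_clear_cmp(uint32_t w) {
    return w <= 0x0FFFFFFFu;
}

/* Signed compare: NOT equivalent, because it is true whenever bit 31 is set. */
static bool top_nibble_clear_signed(int32_t w) {
    return w < 0x10000000;
}
```

The first two functions agree on every input, so a size difference between them is purely an instruction-selection matter; the third agrees only while bit 31 is clear.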
@@ -0,0 +1,16 @@

02_memcpy32/func.bin:     file format binary


Disassembly of section .data:

0000000000001200 <.data>:
    1200:   927e7442    and     x2, x2, #0xfffffffc
    1204:   d2800003    mov     x3, #0x0                    // #0
    1208:   eb02007f    cmp     x3, x2
    120c:   54000041    b.ne    0x1214  // b.any
    1210:   d65f03c0    ret
    1214:   b8636824    ldr     w4, [x1, x3]
    1218:   b8236804    str     w4, [x0, x3]
    121c:   91001063    add     x3, x3, #0x4
    1220:   17fffffa    b       0x1208
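
The immediate on the first instruction is worth decoding by hand, because a 64-bit `and` with `#0xfffffffc` is not the same mask as `~3` on a 64-bit value. The sketch below follows the A64 bitmask-immediate decoding for the 64-bit (`sf = 1`) form only. `decode_logical_imm64` is a hypothetical helper name, not part of the benchmark, and reserved encodings are reported as 0 rather than diagnosed.

```c
#include <stdint.h>

/* Decode the bitmask immediate of a 64-bit A64 logical-immediate instruction
 * (AND/ORR/EOR/ANDS with sf = 1). Returns 0 for reserved encodings; 0 is never
 * a valid bitmask immediate, so it works as a sentinel in this sketch. */
static uint64_t decode_logical_imm64(uint32_t insn) {
    unsigned n    = (insn >> 22) & 0x1;
    unsigned immr = (insn >> 16) & 0x3f;
    unsigned imms = (insn >> 10) & 0x3f;

    /* Element size comes from the highest set bit of N:NOT(imms). */
    unsigned combined = (n << 6) | (~imms & 0x3f);
    int len = 6;
    while (len >= 0 && !((combined >> len) & 1)) len--;
    if (len < 1) return 0;

    unsigned esize  = 1u << len;
    unsigned levels = esize - 1;
    unsigned s = imms & levels;   /* run length minus one */
    unsigned r = immr & levels;   /* right-rotate amount  */
    if (s == levels) return 0;

    uint64_t emask = (esize == 64) ? ~0ULL : ((1ULL << esize) - 1);
    uint64_t elem  = (1ULL << (s + 1)) - 1;               /* s + 1 low ones */
    if (r != 0) elem = ((elem >> r) | (elem << (esize - r))) & emask;

    /* Replicate the element across all 64 bits. */
    uint64_t imm = 0;
    for (unsigned i = 0; i < 64; i += esize) imm |= elem << i;
    return imm;
}
```

Applied to the word at 0x1200, `decode_logical_imm64(0x927e7442)` gives `0x00000000FFFFFFFC`: N = 1, imms = 29, immr = 62, that is 30 ones rotated right by 62. The instruction therefore clears bits 32-63 of X2 as well as the low two bits, which matters when judging whether a tool's `len & ~3` rendering is exact.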
@@ -0,0 +1,29 @@
/* Ground-truth C for FUN_00001200 @ blob offset 0x1200 (36 bytes / 9 insts).
 *
 * Pattern: word-aligned memcpy; length rounded down to word multiple.
 * Signature: void memcpy32(uint32_t *dst, const uint32_t *src, size_t len_bytes);
 *
 * AArch64 ABI: X0 = dst, X1 = src, X2 = len (in bytes, rounded down to 4)
 * Scratch: X3 = byte index i, W4 = word register for transfer
 *
 * Notes the decompiler should ideally recover:
 * - `AND x2, x2, #0xFFFFFFFC` is `len &= 0xFFFFFFFC`: clears bits 0-1 and 32-63.
 *   (Tools often render as `len & 0xFFFFFFFC` or `len & ~3`; `~3` keeps 32-63.)
 * - Inner loop reads/writes 4 bytes at a time; tools should recognise
 *   uint32_t pointers, or at least `*(u32*)(x0+i) = *(u32*)(x1+i)`.
 * - Addressing is byte-indexed with a step of 4; some tools may emit
 *   `for (i = 0; i < len; i += 4)` in bytes, others may normalise into
 *   an index-based word loop.
 */
#include <stddef.h>
#include <stdint.h>

void memcpy32(uint32_t *dst, const uint32_t *src, size_t len_bytes) {
    len_bytes &= 0xFFFFFFFCu;  /* round down to 4; 32-bit mask, as in the blob */
    size_t i = 0;
    while (i != len_bytes) {
        *(uint32_t *)((uint8_t *)dst + i) =
            *(const uint32_t *)((const uint8_t *)src + i);
        i += 4;
    }
}
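
The last note in the header comment describes two loop shapes a decompiler might emit for this function. As a rough sketch of what "equivalent" means for grading, the two hypothetical helpers below (`copy_byte_indexed` and `copy_word_indexed`, neither of which is in the benchmark) implement the byte-indexed and word-indexed forms. Both use the literal `0xFFFFFFFC` immediate from the disassembly as the rounding mask, which is an assumption about how closely the C should track the blob.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-indexed form: i counts bytes and steps by 4, mirroring X3 in the listing. */
static void copy_byte_indexed(uint32_t *dst, const uint32_t *src, size_t len_bytes) {
    len_bytes &= 0xFFFFFFFCu;
    for (size_t i = 0; i < len_bytes; i += 4) {
        *(uint32_t *)((uint8_t *)dst + i) =
            *(const uint32_t *)((const uint8_t *)src + i);
    }
}

/* Word-indexed form: the same copy with the index normalised to whole words. */
static void copy_word_indexed(uint32_t *dst, const uint32_t *src, size_t len_bytes) {
    size_t n_words = (len_bytes & 0xFFFFFFFCu) / 4;
    for (size_t i = 0; i < n_words; i++) {
        dst[i] = src[i];
    }
}
```

With a length of 10 bytes, both helpers copy exactly two words and leave the third destination word untouched, so either shape is an acceptable recovery of the loop. Output that drops the load/store pair is not, because the copy itself is gone.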
@@ -0,0 +1,38 @@
//
// This file was generated by the Retargetable Decompiler
// Website: https://retdec.com
//

#include <stdint.h>

// ------------------- Function Prototypes --------------------

int64_t entry_point(void);

// ------------------------ Functions -------------------------

// Address range: 0x1200 - 0x1224
int64_t entry_point(void) {
    // 0x1200
    int64_t result; // 0x1200
    int64_t v1 = result & 0xfffffffc; // 0x1200
    if (v1 == 0) {
        // 0x1210
        return result;
    }
    int64_t v2 = 0;
    int64_t v3 = v2 + 4; // 0x121c
    while (v3 != v1) {
        // 0x1214
        v2 = v3;
        v3 = v2 + 4;
    }
    // 0x1210
    return result;
}

// --------------------- Meta-Information ---------------------

// Detected compiler/packer: starforce (3.x)
// Detected functions: 1
