marfrit 00d655187a benchmark/: three-way RE-tool comparison + first real C-lift
Three small functions plus one real poll-site function, extracted from
the v1.19 conservative blob, each with ground-truth C and per-tool
(Ghidra / retdec / decomp.me) docs:
  01_memset        — byte memset, 28 B
  02_memcpy32      — word-aligned memcpy, 36 B
  03_magic_memset  — magic check + tail-call to memset, 40 B
  04_train_phy_block — first real poll-site function (104 B, 26 insts),
                       contains poll sites 12-15

Results in RESULTS.md:
  - Ghidra: A on all four. Auto-decompile is close to final.
  - retdec: A on #3, F on #1 and #2 (no register-arg inference on raw),
    C on #4 (emits `< 0x10000000` where the ground truth is
    `& 0xF0000000`).

GRIND_LOG.md (in 04_train_phy_block/) records the matching-decomp
iteration: 116-byte candidate.c at -Os vs vendor 104 bytes = 89.7%
size match on first real iteration. Remaining gap is GCC's choice of
`cmp w, w_const; b.ls` over vendor's `tst w, #imm; b.eq` for the
mask tests.
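
For the top-nibble status mask the two instruction forms test the same
predicate on an unsigned 32-bit value, so the remaining gap is one of
encoding and not of behaviour. A minimal sketch of that equivalence and
of the size-match arithmetic follows. The function names are
hypothetical (not from the repo), and the compare constant 0x0FFFFFFF
is an assumption inferred from the `b.ls` condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Vendor form: tst w, #0xF0000000 ; b.eq  (true when the masked bits are all zero) */
static int top_nibble_clear_tst(uint32_t w) {
    return (w & 0xF0000000u) == 0;
}

/* GCC -Os form: cmp w, w_const ; b.ls  (unsigned lower-or-same).
 * Assumption: w_const holds 0x0FFFFFFF. */
static int top_nibble_clear_cmp(uint32_t w) {
    return w <= 0x0FFFFFFFu;
}

/* Returns 1 if both forms agree on a set of boundary values, 0 otherwise. */
static int forms_agree(void) {
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x0FFFFFFFu, 0x10000000u,
        0x7FFFFFFFu, 0x80000000u, 0xF0000000u, 0xFFFFFFFFu,
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (top_nibble_clear_tst(samples[i]) != top_nibble_clear_cmp(samples[i]))
            return 0;
    }
    return 1;
}

/* Size match in tenths of a percent, rounded to nearest:
 * vendor 104 B against candidate 116 B gives 897, i.e. 89.7%. */
static unsigned size_match_tenths(unsigned vendor_bytes, unsigned candidate_bytes) {
    return (vendor_bytes * 1000u + candidate_bytes / 2u) / candidate_bytes;
}
```

The equivalence only holds because the mask covers a contiguous run of
the most significant bits; a low-bit mask such as 0x3 has no
single-compare equivalent.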

gdb_debug/ holds a native-aarch64 GDB single-stepper for the three
benchmark functions — boltzmann smoke test passed (memset:
buf[10] 0x00→0xab).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 07:26:23 +02:00


/* Ground-truth C for FUN_0000d328 @ blob offset 0xd328 (104 bytes / 26 insts).
 *
 * **The first real poll-site function we lift to C.**
 * Contains 4 of our 16 timeout-less polls (sites 12, 13, 14, 15).
 *
 * Pattern: PHY-block training step. Poke a control register, wait on
 * two status registers, apply two intermediate values with a
 * handshake on a state register, ack the event.
 *
 * Signature: void train_phy_block(struct phy_ctx *ctx);
 *            (X0 = ctx, returns void)
 *
 * Layout:
 *   ctx (X0)        opaque per-rank/per-channel context
 *   ctx->block      64-bit pointer to a PHY block base, stored at ctx + 0xb8
 *   block + 0x8000  addressed sub-block (likely "Master" bank in DWC PUB)
 *
 * Registers of the sub-block (offsets relative to block + 0x8000):
 *   +0x110 CTL        write 0xF000F000 to start, 0xF0000000 to clear
 *   +0x118 STAT_A     bits[31:28] non-zero = step A done
 *   +0x120 STAT_B     bits[31:28] non-zero = step B done
 *   +0x154 CFG_A      write training value
 *   +0x160 CFG_B      write training value
 *   +0x184 HANDSHAKE  bits[1:0] go non-zero, then back to zero, to ack writes
 *
 * The 4 polls (in order):
 *   site 12 (B.EQ): STAT_A bits[31:28] non-zero?
 *   site 13 (B.EQ): STAT_B bits[31:28] non-zero?
 *   site 14 (B.EQ): HANDSHAKE bits[1:0] non-zero? (ack of step-1 writes)
 *   site 15 (B.NE): HANDSHAKE bits[1:0] zero?     (ack of step-2 write)
 */
#include <stdint.h>

struct phy_ctx {
    uint8_t  pad[0xB8];
    uint8_t *block;     /* base pointer used at +0xB8 in struct */
    /* ... rest of struct unknown */
};

#define PHY_CTL          0x110
#define PHY_STAT_A       0x118
#define PHY_STAT_B       0x120
#define PHY_CFG_A        0x154
#define PHY_CFG_B        0x160
#define PHY_HANDSHAKE    0x184

#define PHY_CTL_GO       0xF000F000U
#define PHY_CTL_CLR      0xF0000000U
#define PHY_STAT_DONE    0xF0000000U
#define PHY_CFG_VAL_RUN  0x00030003U
#define PHY_CFG_VAL_END  0x00030000U
#define PHY_HS_BUSY      0x3U

static inline uint32_t mmio_r(volatile uint8_t *base, unsigned off) {
    return *(volatile uint32_t *)(base + off);
}

static inline void mmio_w(volatile uint8_t *base, unsigned off, uint32_t v) {
    *(volatile uint32_t *)(base + off) = v;
}

void train_phy_block(struct phy_ctx *ctx) {
    volatile uint8_t *phy = (volatile uint8_t *)(ctx->block + 0x8000);

    mmio_w(phy, PHY_CTL, PHY_CTL_GO);

    /* site 12: wait for step A complete */
    while ((mmio_r(phy, PHY_STAT_A) & PHY_STAT_DONE) == 0)
        ;

    /* site 13: wait for step B complete */
    while ((mmio_r(phy, PHY_STAT_B) & PHY_STAT_DONE) == 0)
        ;

    mmio_w(phy, PHY_CFG_B, PHY_CFG_VAL_RUN);
    mmio_w(phy, PHY_CFG_A, PHY_CFG_VAL_RUN);

    /* site 14: wait for handshake to assert */
    while ((mmio_r(phy, PHY_HANDSHAKE) & PHY_HS_BUSY) == 0)
        ;

    mmio_w(phy, PHY_CFG_A, PHY_CFG_VAL_END);

    /* site 15: wait for handshake to deassert */
    while ((mmio_r(phy, PHY_HANDSHAKE) & PHY_HS_BUSY) != 0)
        ;

    mmio_w(phy, PHY_CFG_B, PHY_CFG_VAL_END);
    mmio_w(phy, PHY_CTL, PHY_CTL_CLR);
}
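
All four loops in train_phy_block spin forever if the hardware never
responds, which is what "timeout-less" means here. As an illustration
only, a bounded variant of one poll could look like the sketch below.
The helper name `poll_mask`, its parameters, and the iteration budget
are hypothetical; nothing like it exists in the vendor blob, and using
it would change the code size and break the byte-match.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the lifted function.
 * Reads the 32-bit register at base + off until ((value & mask) != 0)
 * equals want_set, or until max_iters reads have been spent.
 * Returns 0 on success, -1 on timeout. */
static int poll_mask(volatile uint8_t *base, unsigned off, uint32_t mask,
                     int want_set, unsigned long max_iters) {
    while (max_iters--) {
        uint32_t v = *(volatile uint32_t *)(base + off);
        if (((v & mask) != 0) == (want_set != 0))
            return 0;   /* condition met */
    }
    return -1;          /* budget exhausted */
}
```

With this shape, site 12 would read
`poll_mask(phy, 0x118, 0xF0000000U, 1, budget)` and site 15
`poll_mask(phy, 0x184, 0x3U, 0, budget)`, where `budget` is a value the
caller has to choose for the target clock.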