benchmark: AI-Ghidra landscape + case-4 harness (synthetic PHY)
- benchmark/ai_ghidra/SETUP.md documents the GhidrAssist 1.5.0 install at
  /opt/ghidra/Ghidra/Extensions/GhidrAssist/ on oppenheimer (CT131), with
  dirac endpoints (Hermes-2-Pro 8B @ :8080, Qwen-coder 1.5B @ :8081)
  already reachable + tested. Final enable+config is UI-only; two clicks
  on next Ghidra launch.
- gdb_debug/harness.c extended with case 4 = train_phy_block running
  under a synthetic PHY at 0x40000000. Static MMIO shim satisfies polls
  1-3; poll 4 needs a dynamic state machine (next session, via SIGBUS
  handler or ptrace), as documented in the README.

Vendor tree investigation: Rockchip's own sdram_rk3588.c /
sdram_rk3568.c are STUBS (return -1). No free function names from
there. Path forward: mine the vendor kernel's rockchip_dmc.c (devfreq
DDR scaling driver) for register-offset naming hints at runtime-call
level.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,72 @@
# AI-Ghidra on oppenheimer (2026-04-15)

GhidrAssist is wired to dirac's local LLMs. Finishing the setup takes two
clicks on the next launch, because Ghidra's plugin enable/config step is
UI-only.

## What's installed

- **GhidrAssist 1.5.0** (jtang613 fork; actual source is
  `github.com/symgraph/GhidrAssist`) at
  `/opt/ghidra/Ghidra/Extensions/GhidrAssist/` on oppenheimer (CT131).
- Built for Ghidra 11.4.3; our install is 11.3. It should load
  (API-compatible), but if "Extension failed to load" appears, upgrade
  Ghidra to 11.4.3+.

## Dirac LLM endpoints (OpenAI v1 compatible)

| port | model | size | role |
|------|-------|------|------|
| 8080 | `Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf` | ~4.8 GB | main: tool calling, agentic rename, function-calling mode |
| 8081 | `qwen2.5-coder-1.5b-instruct-q4_k_m.gguf` | ~1 GB | fast: quick renames, short comments |

Base URLs:
- `http://192.168.88.194:8080/v1/` (Hermes)
- `http://192.168.88.194:8081/v1/` (Qwen-Coder)

Both serve the OpenAI-compatible routes `/v1/models` and
`/v1/chat/completions`. Tested from oppenheimer: reachable, responses sane.

## Finish config (UI, next launch)

1. Open Ghidra, open any project.
2. CodeBrowser → **File → Configure → Miscellaneous** → tick
   **Enable GhidrAssist**.
3. Reopen CodeBrowser; the GhidrAssist tab appears.
4. Settings:
   - **API type:** OpenAI compatible
   - **API base URL:** `http://192.168.88.194:8080/v1`
   - **Model name:** `Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf`
   - **API key:** any non-empty string (llama.cpp ignores it)
   - **System prompt** (recommended):
     ```
     You are a reverse engineering assistant. When asked for a rename,
     return ONLY the new name; do not use tool calls unless the
     conversation is explicitly in function-calling mode.
     ```
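
To sanity-check these settings outside Ghidra, it can help to build the same kind of request body by hand. The sketch below assumes the standard OpenAI chat-completions fields (`model`, `messages` with `role` and `content`); the helper name `build_chat_body` is hypothetical, and this is not necessarily the exact payload GhidrAssist sends.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes a chat-completions JSON body into out. Returns the body length,
 * or -1 if it does not fit in cap bytes. No JSON escaping is done, so the
 * inputs must not contain quotes, backslashes, or control characters. */
int build_chat_body(char *out, size_t cap, const char *model,
                    const char *system_prompt, const char *user_msg)
{
    int n = snprintf(out, cap,
        "{\"model\":\"%s\",\"messages\":["
        "{\"role\":\"system\",\"content\":\"%s\"},"
        "{\"role\":\"user\",\"content\":\"%s\"}]}",
        model, system_prompt, user_msg);
    if (n < 0 || (size_t)n >= cap)
        return -1;              /* encoding error or truncated output */
    return n;
}
```

The resulting string can be written to a file and posted with `curl -H 'Content-Type: application/json' -d @body.json http://192.168.88.194:8080/v1/chat/completions`.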

## Tested calls

- Qwen-coder 1.5B, memset-shaped function → returned `set_char` (close,
  not ideal).
- Hermes 8B w/o system prompt → invokes the `rename_function` tool call
  immediately: perfect for GhidrAssist's agentic rename mode, awkward
  for plain Q&A.

## Use patterns

- **Agentic rename pass over our 118 DDR functions** (Hermes + tool
  calls auto-invoked).
- **Quick second opinion on a decompiler output**: Hermes or Qwen,
  non-agentic.
- **Bulk scripted pass**: Ghidra headless + GhidrAssist's JSON API
  (see `ghidra_scripts/README.txt` in the extension dir).

## Cold-start bookkeeping

llama-server cold-loads in ~15 s. The first call after dirac wakes returns
HTTP 503 `{"error":"Loading model"}`; retry after ~15 s.
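
A scripted client can wrap that retry in a small loop. The following is a minimal sketch, assuming the caller supplies a probe callback that performs one request and returns its HTTP status; `wait_until_loaded`, `probe_fn`, and the `fake_server` stand-in are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <unistd.h>

/* Probe callback: performs one request and returns its HTTP status. */
typedef int (*probe_fn)(void *state);

/* Calls probe until it stops returning 503, sleeping delay_s seconds between
 * attempts. Returns the number of attempts used, or -1 if all max_attempts
 * attempts saw 503. Any non-503 status ends the wait. */
int wait_until_loaded(probe_fn probe, void *state,
                      int max_attempts, unsigned delay_s)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (probe(state) != 503)
            return attempt;
        if (attempt < max_attempts)
            sleep(delay_s);
    }
    return -1;
}

/* Stand-in for a real HTTP probe: answers 503 while "loading", then 200. */
struct fake_server { int loads_remaining; };

int fake_probe(void *state)
{
    struct fake_server *s = state;
    if (s->loads_remaining > 0) {
        s->loads_remaining--;
        return 503;
    }
    return 200;
}
```

With the ~15 s load time noted above, something like `wait_until_loaded(probe, state, 3, 15)` would cover one cold start with margin; the real probe would issue a GET against `/v1/models`.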

## Offline fallback

Clevo (980M, 4 GB VRAM) can host Qwen-coder 1.5B when dirac is down.
Hermes 8B (~4.8 GB) won't fit in 4 GB.
@@ -18,8 +18,9 @@ endef
 $(eval $(call WRAP_BIN,func_01,01_memset))
 $(eval $(call WRAP_BIN,func_02,02_memcpy32))
 $(eval $(call WRAP_BIN,func_03,03_magic_memset))
+$(eval $(call WRAP_BIN,func_04,04_train_phy_block))

-gdb_debug.elf: harness.c func_01.o func_02.o func_03.o
+gdb_debug.elf: harness.c func_01.o func_02.o func_03.o func_04.o
 	gcc -O0 -g -Wall -o $@ $^

 clean:
Binary file not shown.
Binary file not shown.
@@ -1,10 +1,21 @@
-/* Generic harness for single-stepping one of the benchmark functions under GDB.
+/* Generic harness for single-stepping benchmark functions under GDB.
   * Copies the raw bytes of funcNN.bin into an RWX buffer and calls through
   * a function pointer. GDB stepi from the call site drops you right into the
   * target function's first instruction. No QEMU needed — boltzmann is aarch64.
   *
- * Build: run `make` in this dir (native aarch64 only, for now).
- * Run:   ./gdb_debug.elf {1|2|3}   (1=memset 2=memcpy32 3=magic_memset)
+ * Build: run `make` in this dir.
+ * Run:
+ *     ./gdb_debug.elf 1        — memset
+ *     ./gdb_debug.elf 2        — memcpy32
+ *     ./gdb_debug.elf 3        — magic_memset (will SIGSEGV unless 0x1fe000 is mapped)
+ *     ./gdb_debug.elf 4        — train_phy_block; mmaps a synthetic PHY block at
+ *                                FAKE_PHY_BASE pre-populated with "training-done"
+ *                                responses so all 4 polls exit on first iteration.
+ *     ./gdb_debug.elf 4 stuck
+ *                              — train_phy_block but with MMIO left at zero so the
+ *                                polls would loop forever (interrupt with Ctrl+C).
+ *                                Useful for confirming v3fb trampolines time out
+ *                                cleanly when applied to a patched func_04.bin.
   *
   * Under GDB: see README.md.
   */
@@ -17,10 +28,12 @@
 extern uint8_t _binary_func_01_bin_start[], _binary_func_01_bin_end[];
 extern uint8_t _binary_func_02_bin_start[], _binary_func_02_bin_end[];
 extern uint8_t _binary_func_03_bin_start[], _binary_func_03_bin_end[];
+extern uint8_t _binary_func_04_bin_start[], _binary_func_04_bin_end[];

 typedef void (*f1_t)(void *, uint8_t, uint64_t);
 typedef void (*f2_t)(uint32_t *, const uint32_t *, uint64_t);
 typedef void (*f3_t)(void);
+typedef void (*f4_t)(uint64_t /* ctx pointer */);

 static void *rwx_copy(const void *src, size_t len) {
     void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
@@ -31,8 +44,54 @@ static void *rwx_copy(const void *src, size_t len) {
     return p;
 }

+/* For function 4 (train_phy_block) we need a synthetic PHY block.
+ * The function does:  base = *(u64 *)(ctx + 0xb8);  base += 0x8000;  ...
+ * So we need (a) a ctx struct with a valid base pointer at +0xb8,
+ *            (b) a PHY block at base + 0x8000 with the right register layout.
+ *
+ * We pick FAKE_PHY_BASE so PHY block (at +0x8000) lands somewhere mappable.
+ * 0x40000000 is well outside libc / stack / heap on aarch64 Linux.
+ */
+#define FAKE_PHY_BASE  0x40000000UL   /* requested via mmap MAP_FIXED */
+#define FAKE_PHY_LEN   0x10000        /* 64 KiB = enough for [+0x8000..+0x8200] */
+
+#define PHY_CTL_OFF        0x110
+#define PHY_STAT_A_OFF     0x118
+#define PHY_STAT_B_OFF     0x120
+#define PHY_CFG_A_OFF      0x154
+#define PHY_CFG_B_OFF      0x160
+#define PHY_HANDSHAKE_OFF  0x184
+
+struct phy_ctx {
+    uint8_t  pad[0xB8];
+    uint64_t base;          /* lives at offset 0xB8 */
+};
+
+static struct phy_ctx ctx;
+
+static void prep_synthetic_phy(int let_polls_pass) {
+    void *m = mmap((void *)FAKE_PHY_BASE, FAKE_PHY_LEN,
+                   PROT_READ | PROT_WRITE,
+                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
+    if (m == MAP_FAILED) { perror("mmap synthetic PHY"); exit(1); }
+    memset(m, 0, FAKE_PHY_LEN);
+
+    ctx.base = FAKE_PHY_BASE;   /* ctx->base read by func at +0xB8 */
+    volatile uint8_t *phy = (volatile uint8_t *)(FAKE_PHY_BASE + 0x8000);
+
+    if (let_polls_pass) {
+        /* Pre-populate the registers the function polls so each LDR
+         * sees a "done" value on the first iteration. */
+        *(volatile uint32_t *)(phy + PHY_STAT_A_OFF)    = 0xF0000001U;  /* bits[31:28] non-zero */
+        *(volatile uint32_t *)(phy + PHY_STAT_B_OFF)    = 0xF0000001U;
+        *(volatile uint32_t *)(phy + PHY_HANDSHAKE_OFF) = 0x00000003U;  /* bits[1:0] non-zero */
+    }
+    printf("synthetic PHY mapped at 0x%lx, polls = %s\n",
+           (unsigned long)m, let_polls_pass ? "PASS" : "STUCK (will loop)");
+}
+
 static void __attribute__((noinline))
-call_func(void (*fn)(void), int which) {
+call_func(void (*fn)(void), int which, int variant) {
     switch (which) {
     case 1: {
         char buf[64] = {0};
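
The hunk above relies on `base` sitting at exactly `ctx + 0xB8` and on the PHY block starting 0x8000 bytes past it. A compile-time check makes that assumption explicit. This is a minimal sketch, not part of the commit: `PHY_BLOCK_OFF` and `phy_reg_addr` are hypothetical names, and the struct is repeated so the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_PHY_BASE   0x40000000UL
#define PHY_BLOCK_OFF   0x8000UL      /* hypothetical name for the +0x8000 step */
#define PHY_STAT_A_OFF  0x118

struct phy_ctx {
    uint8_t  pad[0xB8];
    uint64_t base;
};

/* Fails the build if padding or alignment ever moves base off +0xB8. */
_Static_assert(offsetof(struct phy_ctx, base) == 0xB8,
               "train_phy_block reads the base pointer at ctx+0xB8");

/* Address of one PHY register: *(u64 *)(ctx + 0xb8) + 0x8000 + off. */
static inline uintptr_t phy_reg_addr(const struct phy_ctx *c, uint32_t off)
{
    return (uintptr_t)c->base + PHY_BLOCK_OFF + off;
}
```

With `base` set to `FAKE_PHY_BASE`, `phy_reg_addr(&c, PHY_STAT_A_OFF)` evaluates to `0x40008118`, which is a convenient address to watch from GDB.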
@@ -52,12 +111,25 @@ call_func(void (*fn)(void), int which) {
         printf("calling magic_memset — SIGSEGVs on LDR of 0x1fe004 in user mode.\n");
         ((f3_t)fn)();
         break;
+    case 4: {
+        prep_synthetic_phy(variant);
+        printf("calling train_phy_block(ctx)\n");
+        ((f4_t)fn)((uint64_t)&ctx);
+        printf("train_phy_block returned successfully.\n");
+        volatile uint8_t *phy = (volatile uint8_t *)(FAKE_PHY_BASE + 0x8000);
+        printf("post: CTL=0x%08x CFG_A=0x%08x CFG_B=0x%08x\n",
+               *(volatile uint32_t *)(phy + PHY_CTL_OFF),
+               *(volatile uint32_t *)(phy + PHY_CFG_A_OFF),
+               *(volatile uint32_t *)(phy + PHY_CFG_B_OFF));
+        break;
+    }
     }
 }

 int main(int argc, char **argv) {
-    if (argc != 2) { fprintf(stderr, "usage: %s {1|2|3}\n", argv[0]); return 2; }
+    if (argc < 2) { fprintf(stderr, "usage: %s {1|2|3|4} [stuck]\n", argv[0]); return 2; }
     int which = atoi(argv[1]);
+    int variant = (argc >= 3 && strcmp(argv[2], "stuck") == 0) ? 0 : 1;
     void (*fn)(void);
     switch (which) {
     case 1: fn = rwx_copy(_binary_func_01_bin_start,
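
The harness comments describe the poll exit conditions as "bits[31:28] non-zero" for the status registers and "bits[1:0] non-zero" for the handshake, and the `stuck` variant leaves every register at zero. The sketch below illustrates that behaviour with a bounded loop. It is not the disassembled `train_phy_block` and not the v3fb trampoline; `poll_bits_bounded` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Reads *reg until (*reg & mask) is non-zero or max_iters reads have been
 * made. Returns the number of reads used on success, or -1 on timeout. */
int poll_bits_bounded(const volatile uint32_t *reg, uint32_t mask,
                      int max_iters)
{
    for (int i = 1; i <= max_iters; i++) {
        if ((*reg & mask) != 0)
            return i;
    }
    return -1;
}
```

Against the pre-populated values (`0xF0000001` with mask `0xF0000000`, `0x00000003` with mask `0x3`) the loop exits on the first read. Against a zeroed register it runs to `max_iters` and reports a timeout, which is the behaviour the `stuck` variant is meant to help confirm.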
@@ -66,9 +138,11 @@ int main(int argc, char **argv) {
                           _binary_func_02_bin_end - _binary_func_02_bin_start); break;
     case 3: fn = rwx_copy(_binary_func_03_bin_start,
                           _binary_func_03_bin_end - _binary_func_03_bin_start); break;
+    case 4: fn = rwx_copy(_binary_func_04_bin_start,
+                          _binary_func_04_bin_end - _binary_func_04_bin_start); break;
     default: fprintf(stderr, "unknown index %d\n", which); return 2;
     }
     printf("function %d loaded at %p\n", which, fn);
-    call_func(fn, which);
+    call_func(fn, which, variant);
     return 0;
 }