rk3588-ddr-analysis/benchmark/ai_ghidra/SETUP.md

# AI-Ghidra on oppenheimer (2026-04-15)

GhidrAssist wired to dirac's local LLMs. Two-click finish on next
launch because Ghidra's plugin enable/config step is UI-only.

## What's installed

- **GhidrAssist 1.5.0** (jtang613 fork — actual
  `github.com/symgraph/GhidrAssist`) at
  `/opt/ghidra/Ghidra/Extensions/GhidrAssist/` on oppenheimer (CT131).
- Built for Ghidra 11.4.3; our install is 11.3. Should load
  (API-compatible) but if "Extension failed to load" appears, upgrade
  Ghidra to 11.4.3+.

## Dirac LLM endpoints (OpenAI v1 compatible)

| port | model | size | role |
|------|-------|------|------|
| 8080 | `Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf` | ~4.8 GB | main — tool-calling, agentic rename, function-calling mode |
| 8081 | `qwen2.5-coder-1.5b-instruct-q4_k_m.gguf` | ~1 GB | fast — quick renames, short comments |

Base URLs:
- `http://192.168.88.194:8080/v1/` (Hermes)
- `http://192.168.88.194:8081/v1/` (Qwen-Coder)

Fully OpenAI-compatible: `/v1/models`, `/v1/chat/completions`. Tested
from oppenheimer; reachable, responses sane.

## Finish config (UI, next launch)

1. Open Ghidra, open any project.
2. CodeBrowser → **File → Configure → Miscellaneous** → tick
   **Enable GhidrAssist**.
3. Reopen CodeBrowser; GhidrAssist tab appears.
4. Settings:
   - **API type:** OpenAI compatible
   - **API base URL:** `http://192.168.88.194:8080/v1`
   - **Model name:** `Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf`
   - **API key:** any non-empty string (llama.cpp ignores it)
   - **System prompt** (recommended):
     ```
     You are a reverse engineering assistant. When asked for a rename,
     return ONLY the new name — do not use tool calls unless the
     conversation is explicitly in function-calling mode.
     ```

## Tested calls

- Qwen-coder 1.5B, memset-shaped function → returned `set_char` (close,
  not ideal).
- Hermes 8B w/o system prompt → invokes `rename_function` tool call
  immediately — perfect for GhidrAssist's agentic rename mode, awkward
  for plain Q&A.

## Use patterns

- **Agentic rename pass over our 118 DDR functions** (Hermes + tool
  calls auto-invoked).
- **Quick second opinion on a decompiler output** — Hermes or Qwen,
  non-agentic.
- **Bulk scripted pass** — Ghidra headless + GhidrAssist's JSON API
  (see `ghidra_scripts/README.txt` in the extension dir).

## Cold-start bookkeeping

llama-server cold-loads in ~15 s. First call after dirac wake returns
HTTP 503 `{"error":"Loading model"}` — retry after ~15 s.

## Offline fallback

Clevo (980M, 4 GB VRAM) can host Qwen-coder 1.5B when dirac is down.
Hermes 8B won't fit in 4 GB.