rosenblatt/TODO.md
marfrit 24adc74812 Rosenblatt: project scaffold for RK3588 NPU on mainline
Codename: Frank Rosenblatt — Mark I Perceptron 1958, the first
hardware neural network.  This project lights up the RK3588 NPU on
mainline Linux so the OSS world finally owns the silicon-side of
inference on that chip.

Phase-1 scope: small LLM running CPU + NPU mix on boltzmann (Rock 5
ITX+).  Backend: llama.cpp with a new rknpu ggml backend offloading
INT8 GEMM (attention + FFN matmuls) to the NPU's tile-MAC array while
leaving dequant / RoPE / softmax / sampling / embedding on A76 NEON.

Target model: qwen2.5-1.5B-instruct Q4_K_M GGUF.

Scaffold layout: README.md (frame + 9+1-phase plan), TODO.md (rolling
punch-list), docs/{npu-mainline-status,architecture}.md, kernel/ for
DT bindings + driver tweaks, userspace/{npu-probe,llm-runtime}/,
fleet/boltzmann.yaml.

Next: Phase-1 substrate audit — fill the TBDs in docs/npu-mainline-status.md
with the actual state of Tomeu Vizoso's rknpu / DRM-accel work on
the boltzmann-running kernel.
2026-05-19 11:57:48 +00:00

# TODO — Rosenblatt
Rolling punch-list. Older items at bottom (move done → DONE.md when noisy).

---
## Phase 1 — substrate audit
- [ ] On boltzmann: `uname -r` → record in `fleet/boltzmann.yaml:kernel.running_version`
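- [ ] `find /lib/modules/"$(uname -r)" -path '*accel*' -name '*.ko*' 2>/dev/null`: check if the accel framework is built as a module (`*.ko*` also matches compressed modules; a built-in driver will not show up here)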
- [ ] `ls /dev/accel/ /dev/dri/` — what's exposed?
- [ ] `lsmod | grep -iE 'rknpu|rocket|accel'`: what's loaded? (`rocket` is the module name of the mainline accel driver)
- [ ] `dmesg | grep -iE 'rknpu|rocket|npu|accel'` since boot: driver bringup log
- [ ] Tomeu's rknpu series: find on lore.kernel.org/dri-devel, capture latest
      patch-set version + state (merged / in-review / dropped) → fill table in
      `docs/npu-mainline-status.md`
- [ ] Check `drivers/accel/` in current torvalds tree: list in-tree
      accelerators, confirm rknpu's mainline state
- [ ] Check DT bindings: `Documentation/devicetree/bindings/npu/rockchip,*.yaml`
- [ ] Inspect `arch/arm64/boot/dts/rockchip/rk3588.dtsi` for `npu` node
- [ ] If a userspace shim exists (rkneural?), capture repo URL + try
      hello-world against the running kernel
- [ ] Spec-extract from BSP vendor `rockchip-npu` source: register map,
      DMA descriptor format, irq handling. No code lift; spec only.
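The on-device checks in this list could be bundled into one read-only script so the audit is repeatable after each kernel update. This is a minimal sketch, assuming a POSIX shell on boltzmann. The function name `probe_npu` and every output key other than `kernel.running_version` are hypothetical, and the match pattern (including `rocket`) is an assumption about which driver names are worth catching.

```shell
#!/bin/sh
# Sketch: gather the Phase 1 on-device facts in one pass.
# Read-only; on a machine without an NPU the sections simply come back empty.

PATTERN='rknpu|rocket|npu|accel'   # assumption: driver names worth matching

probe_npu() {
    echo "kernel.running_version: $(uname -r)"

    echo "device_nodes:"
    # Either directory may be absent if no accel/DRM driver is bound.
    for dir in /dev/accel /dev/dri; do
        [ -d "$dir" ] || continue
        for node in "$dir"/*; do
            [ -e "$node" ] && echo "  - $node"
        done
    done

    echo "loaded_modules:"
    # Skip the lsmod header line, then match module names case-insensitively.
    if command -v lsmod >/dev/null 2>&1; then
        lsmod 2>/dev/null |
            awk -v pat="$PATTERN" 'NR > 1 && tolower($1) ~ pat { print "  - " $1 }'
    fi

    echo "dmesg_matches:"
    # dmesg may need root when kernel.dmesg_restrict=1; tolerate failure.
    dmesg 2>/dev/null | grep -iE "$PATTERN" | sed 's/^/  | /'
    return 0
}

# Usage (hypothetical file name): probe_npu > npu-probe-report.txt
```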

Phase exit criteria: `docs/npu-mainline-status.md` table fully populated, and a
clear answer to: do we drive the NPU via the accel uAPI, or write our own MMIO driver?

---
## Phase 2 — formulate
- [ ] List llama.cpp ops by wallclock %, profiling qwen2.5-1.5B-instruct Q4_K_M on CPU
      (use llama.cpp's built-in perf-timer or perf record)
- [ ] Pick the exact INT8 matmul tile size the NPU prefers (read from BSP source)
- [ ] Spec out the smallest backend interface: which ops we MUST handle,
      and which the framework falls back to CPU for
- [ ] Write `docs/op-coverage.md`
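Once per-op timings exist, ranking them by share of wallclock is a small aggregation. This is a sketch, assuming the profiler output has already been reduced to two whitespace-separated columns (op name, microseconds). That input format and the function name `rank_ops` are hypothetical, not llama.cpp's native output.

```shell
#!/bin/sh
# Sketch: rank ops by their share of total time.
# Input (assumed format): "<op_name> <microseconds>" per line; an op may repeat.
rank_ops() {
    LC_ALL=C awk '
        NF >= 2 { t[$1] += $2; total += $2 }   # sum time per op and overall
        END {
            if (total == 0) exit
            for (op in t)
                printf "%6.2f%%  %s\n", 100 * t[op] / total, op
        }
    ' "$@" | LC_ALL=C sort -rn                 # largest share first
}

# Usage (hypothetical input file): rank_ops op-times.txt
```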
---
## Phase 3 — analyze
- [ ] RKNPU2 SDK: trace through `librknnrt.so` user-API → kernel ioctl shapes
      (objdump + strings; no actual reverse-engineering of the vendor blob, just
      the syscall surface)
- [ ] Tomeu's accel uAPI: read driver source, understand:
    - submit-job ioctl shape
    - dmabuf import path
    - fence-wait mechanism
    - error reporting
- [ ] BSP vendor `rockchip-npu` source: register layout, DMA descriptor
      struct, irq handling sequence
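When mapping ioctl shapes, a raw request number already carries the transfer direction and the argument struct's size. This sketch decodes one, assuming the generic Linux `_IOC` bit layout from `include/uapi/asm-generic/ioctl.h` (the layout arm64 uses). The function name `decode_ioctl` is hypothetical, and the sample value is an arbitrary illustration rather than a number taken from either driver.

```shell
#!/bin/sh
# Sketch: split a Linux ioctl request number into its _IOC fields.
# Generic layout: bits 31-30 dir | bits 29-16 size | bits 15-8 type | bits 7-0 nr
decode_ioctl() {
    req=$(( $1 ))                      # accepts decimal or 0x-prefixed hex
    dir=$(( (req >> 30) & 0x3 ))
    size=$(( (req >> 16) & 0x3fff ))   # size of the argument struct in bytes
    typ=$(( (req >> 8) & 0xff ))       # subsystem "magic" character
    nr=$(( req & 0xff ))               # command number within the subsystem
    case $dir in
        0) dname=none ;;
        1) dname=write ;;              # userspace -> kernel
        2) dname=read ;;               # kernel -> userspace
        3) dname='read|write' ;;
    esac
    printf 'dir=%s type=0x%02x nr=0x%02x size=%d\n' "$dname" "$typ" "$nr" "$size"
}

# Usage: decode_ioctl 0xC0406400
```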
---
## Phase 4 — baseline
- [ ] Build vanilla llama.cpp on boltzmann (upstream `master` branch, no local patches)
- [ ] Pull qwen2.5-1.5b-instruct Q4_K_M GGUF
- [ ] `llama-bench -m <path-to-qwen2.5-1.5b-gguf> -p 512 -n 128` × 3 runs (`-m` takes the GGUF file path)
- [ ] Capture JSON to `benchmarks/$(date +%F)_boltzmann_qwen1.5b_cpu_baseline.json`
- [ ] Record into `fleet/boltzmann.yaml:baseline_measurement`
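The baseline steps above could be wrapped so every run lands in a consistently named file. This is a sketch, assuming `llama-bench` is on `PATH`. The `MODEL` path is a placeholder, the function names are hypothetical, and `-r 3` (three repetitions inside one invocation) is only one reading of "× 3 runs".

```shell
#!/bin/sh
# Sketch: run the CPU baseline and store the JSON under benchmarks/.
# MODEL is a placeholder path (assumption); point it at the downloaded GGUF.
MODEL="${MODEL:-models/qwen2.5-1.5b-instruct-q4_k_m.gguf}"

# Build the output file name used in this TODO for a given date (YYYY-MM-DD).
baseline_path() {
    echo "benchmarks/${1}_boltzmann_qwen1.5b_cpu_baseline.json"
}

run_baseline() {
    out="$(baseline_path "$(date +%F)")"
    mkdir -p "$(dirname "$out")" || return 1
    # -r 3: three repetitions per test; -o json: machine-readable output.
    llama-bench -m "$MODEL" -p 512 -n 128 -r 3 -o json > "$out" || return 1
    echo "wrote $out"
}

# Usage: MODEL=/path/to/model.gguf run_baseline
```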
---
## Cross-phase / standing items
- [ ] Mirror Tomeu's WIP branch into a local clone for kernel hacking
- [ ] Set up serial console on boltzmann for kernel-panic recovery (Quark
      umbrella; check current state)
- [ ] Add `project_rosenblatt.md` to claude-memory once Phase 1 closes (so
      future sessions don't re-discover the campaign)
- [ ] Decide repo home: marfrit/rosenblatt on git.reauktion.de (probably yes,
      after Phase-1 substrate is captured and the README isn't embarrassing)