The Phase-1 audit closes with a substantively different picture from
the one the original scaffold's TBDs assumed:
- Tomeu Vizoso's RK3588 NPU work merged in Linux 6.18 (Nov 2025) under
codename `rocket` (NOT `rknpu`). All references updated.
- Boltzmann's `linux-rk3588-marfrit-A1` (7.0.0-rc3-ARCH+) already ships
`drivers/accel/rocket/rocket.ko` as a built-but-not-loaded module.
- DT bindings + per-core nodes (`npu@fdab0000`, `npu@fdac0000`,
`npu@fdad0000`; compatible `rockchip,rk3588-rknn-core`) have been in
mainline since 6.18 but ship `status = "disabled"`, so the Phase-2
unblock is a board-level enable, not a driver port.
- Mesa 25.3 ships Rocket Gallium + Teflon TFLite delegate as the
authoritative userspace reference for the uAPI shape.
- Op coverage today is conv-centric (MobileNet-class); transformer
matmul needs either the conv-1×1 shoehorn (RKNPU2 BSP precedent) or
rocket op-set additions. Flagged as a load-bearing Phase-2 risk.
- IOMMU v1.0 hazard: the 32 GB host needs `mem=4G` on the kernel
command line or local `rockchip,rk3568-iommu-v1` discriminator patches
before the first NPU job, to avoid DMA-window faults.
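
For reference, the conv-1×1 shoehorn named in the op-coverage bullet
rests on a simple identity: a matmul `X @ W` equals a 1×1 convolution
whose input channels are the columns of `X` and whose filters are the
columns of `W`. The sketch below is plain Python on nested lists and
only illustrates the reshaping; the function names are hypothetical
and nothing here reflects how rocket or the RKNPU2 BSP actually lays
out tensors.

```python
# Sketch only: Y = X @ W expressed as a 1x1 convolution.
# X: (M, K) activations -> feature map with K channels, height M, width 1
# W: (K, N) weights     -> N filters, each spanning K input channels

def matmul(x, w):
    """Reference matmul on nested lists: (M, K) @ (K, N) -> (M, N)."""
    k, n = len(w), len(w[0])
    return [[sum(row[p] * w[p][j] for p in range(k)) for j in range(n)]
            for row in x]

def conv1x1(fmap, filters):
    """1x1 convolution, stride 1, no padding.

    fmap:    [C_in][H][W]
    filters: [C_out][C_in]  (the 1x1 spatial dims are dropped)
    returns: [C_out][H][W]
    """
    c_in, h, wd = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [[[sum(f[c] * fmap[c][y][col] for c in range(c_in))
              for col in range(wd)]
             for y in range(h)]
            for f in filters]

def matmul_via_conv1x1(x, w):
    """Compute X @ W by reshaping both operands and calling conv1x1."""
    m, k, n = len(x), len(w), len(w[0])
    # Rows of X become spatial positions; columns become input channels.
    fmap = [[[x[i][c]] for i in range(m)] for c in range(k)]    # [K][M][1]
    # Each column of W becomes one 1x1 filter over the K channels.
    filters = [[w[c][j] for c in range(k)] for j in range(n)]   # [N][K]
    out = conv1x1(fmap, filters)                                # [N][M][1]
    # Undo the layout: output channel j, position i -> Y[i][j].
    return [[out[j][i][0] for j in range(n)] for i in range(m)]
```

The reshaping is pure layout work, so whether it is cheap on the real
hardware depends on tensor-format constraints that this sketch does
not model.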
Files:
- docs/npu-mainline-status.md: full audit table with upstream pointers
(kernel.org / Mesa docs / dri-devel patch URLs / Tomeu's "we are in
mainline" blog post).
- docs/phases.md: per-phase log entry for Phase-1 closeout.
- docs/op-coverage.md: matmul-vs-conv-vs-rocket-op-set framing.
- fleet/boltzmann.yaml: audited kernel + npu_driver + dt_npu_nodes
state.
- kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso: overlay to
flip the three rknn-core nodes to "okay" (+ matching mmu nodes),
carries the IOMMU-mitigation warning inline.
- kernel/README.md: kernel-agent scope wiring + anticipated local
carry patches.
- README.md: phase-status table + "rknpu → rocket" rename note.
- TODO.md: Phase-2 unblock concrete steps + standing
upstream-watch items.
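
A quick way to confirm whether the overlay took effect is to read the
`status` property of each NPU node from the live device tree. The
helper below is a minimal sketch: the function names are hypothetical,
and it assumes the usual `/proc/device-tree` layout and searches for
`npu@*` directories rather than hardcoding the parent bus path, which
was not verified here.

```python
from pathlib import Path

def node_status(node_dir):
    """Return a DT node's status string.

    A node without a `status` property counts as enabled, so report
    "okay" in that case. Property files are NUL-terminated strings.
    """
    status = Path(node_dir) / "status"
    if not status.exists():
        return "okay"
    return status.read_bytes().rstrip(b"\x00").decode("ascii")

def npu_core_status(dt_root="/proc/device-tree"):
    """Map every `npu@*` node found under dt_root to its status."""
    root = Path(dt_root)
    return {
        node.name: node_status(node)
        for node in sorted(root.rglob("npu@*"))
        if node.is_dir()
    }

if __name__ == "__main__":
    for name, status in npu_core_status().items():
        print(f"{name}: {status}")
```

With the overlay applied, each of the three core nodes would be
expected to report `okay`; the matching mmu nodes need a separate
check, since their node names are not covered by the `npu@*` pattern.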
Codename: Frank Rosenblatt — Mark I Perceptron 1958, the first
hardware neural network. This project lights up the RK3588 NPU on
mainline Linux so the OSS world finally owns the silicon-side of
inference on that chip.
Phase-1 scope: a small LLM running on a CPU + NPU mix on boltzmann (Rock 5
ITX+). Backend: llama.cpp with a new rknpu ggml backend offloading
INT8 GEMM (attention + FFN matmuls) to the NPU's tile-MAC array while
leaving dequant / RoPE / softmax / sampling / embedding on A76 NEON.
Target model: qwen2.5-1.5B-instruct Q4_K_M GGUF.
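
The intended split can be illustrated with a toy INT8 GEMM: the
integer multiply-accumulate is the part that would be offloaded, while
quantization and the final rescale stay on the CPU. This is a sketch
under simplifying assumptions, not the backend design: it uses
symmetric per-tensor scales rather than the block-wise Q4_K_M layout,
and all function names are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: returns (int8 matrix, scale)."""
    peak = max((abs(v) for row in values for v in row), default=0.0)
    scale = peak / 127.0 if peak else 1.0
    q = [[max(-127, min(127, round(v / scale))) for v in row]
         for row in values]
    return q, scale

def int8_gemm(a_q, b_q):
    """Integer-only GEMM with wide accumulators.

    This is the step a MAC array would run; Python ints stand in for
    the int32 accumulators.
    """
    k, n = len(b_q), len(b_q[0])
    return [[sum(row[p] * b_q[p][j] for p in range(k)) for j in range(n)]
            for row in a_q]

def dequantize(acc, scale_a, scale_b):
    """CPU-side step: rescale integer accumulators back to floats."""
    return [[v * scale_a * scale_b for v in row] for row in acc]

def quantized_matmul(a, b):
    """Quantize on the CPU, multiply in integers, dequantize on the CPU."""
    a_q, scale_a = quantize_int8(a)
    b_q, scale_b = quantize_int8(b)
    return dequantize(int8_gemm(a_q, b_q), scale_a, scale_b)
```

The result only approximates the float matmul; the error depends on
the value range of each tensor, which is one reason real formats use
per-block rather than per-tensor scales.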
Scaffold layout: README.md (frame + 9+1-phase plan), TODO.md (rolling
punch-list), docs/{npu-mainline-status,architecture}.md, kernel/ for
DT bindings + driver tweaks, userspace/{npu-probe,llm-runtime}/,
fleet/boltzmann.yaml.
Next: Phase-1 substrate audit — fill the TBDs in docs/npu-mainline-status.md
with the actual state of Tomeu Vizoso's rknpu / DRM-accel work on
the boltzmann-running kernel.