24adc74812
Codename: Frank Rosenblatt, whose Mark I Perceptron (1958) was one of
the first hardware neural networks. This project lights up the RK3588
NPU on mainline Linux so the OSS world finally owns the silicon side of
inference on that chip.

Phase-1 scope: a small LLM running on a CPU + NPU mix on boltzmann
(Rock 5 ITX+). Backend: llama.cpp with a new rknpu ggml backend that
offloads INT8 GEMM (attention + FFN matmuls) to the NPU's tile-MAC array
and leaves dequant / RoPE / softmax / sampling / embedding on A76 NEON.
Target model: qwen2.5-1.5B-instruct Q4_K_M GGUF.
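
The CPU/NPU split described above can be sketched in a few lines. This is only an illustration of where the offload boundary sits, not the rknpu backend's interface: the function names and the symmetric per-tensor quantization scheme are assumptions made for the example, and the real backend would work on ggml tensors rather than Python lists.

```python
# Hypothetical sketch of the Phase-1 offload boundary.
# All names and the per-tensor symmetric scheme are assumptions.

def quantize_int8(values):
    """CPU side: symmetric per-tensor quantization to the range [-127, 127]."""
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / 127.0 if peak > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def gemm_int8(a, b, m, k, n):
    """Offload candidate: row-major A (m x k) times row-major B (k x n).

    Inputs are INT8 codes; products are summed in a wide integer
    accumulator, which is what an INT8 MAC array does in hardware.
    """
    out = [0] * (m * n)
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(k):
                acc += a[i * k + p] * b[p * n + j]
            out[i * n + j] = acc
    return out


def dequantize(acc, scale_a, scale_b):
    """CPU side: map integer accumulators back to floats."""
    return [v * scale_a * scale_b for v in acc]
```

Only `gemm_int8` would move to the NPU in this picture; `quantize_int8` and `dequantize` stand in for the work the plan keeps on the A76 cores.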

Scaffold layout: README.md (frame + 9+1-phase plan), TODO.md (rolling
punch-list), docs/{npu-mainline-status,architecture}.md, kernel/ for
DT bindings + driver tweaks, userspace/{npu-probe,llm-runtime}/,
fleet/boltzmann.yaml.

Next: Phase-1 substrate audit. Fill the TBDs in
docs/npu-mainline-status.md with the actual state of Tomeu Vizoso's
rknpu / DRM-accel work on the kernel boltzmann is running.
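
A first pass at the audit could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the `"npu"` substring filter is a guess at how a driver module might be named, and the only path taken from the kernel itself is `/dev/accel/accel*`, where the DRM accel subsystem exposes compute-accelerator nodes. It records facts for a human to interpret; it does not decide which driver is in use.

```python
import glob
import platform


def loaded_modules(proc_modules="/proc/modules"):
    """Return names of loaded kernel modules, or [] if the file is unreadable."""
    try:
        with open(proc_modules, encoding="utf-8") as fh:
            return [line.split()[0] for line in fh if line.strip()]
    except OSError:
        return []


def audit_snapshot(proc_modules="/proc/modules",
                   accel_glob="/dev/accel/accel*",
                   name_hint="npu"):
    """Collect raw facts for the TBD fields; interpretation stays manual.

    name_hint is an assumed substring, not a known module name.
    """
    modules = loaded_modules(proc_modules)
    return {
        "running_version": platform.release(),  # same string as `uname -r`
        "accel_nodes": sorted(glob.glob(accel_glob)),
        "npu_like_modules": sorted(
            m for m in modules if name_hint in m.lower()
        ),
    }


if __name__ == "__main__":
    for key, value in audit_snapshot().items():
        print(f"{key}: {value}")
```

A driver built into the kernel image does not appear in `/proc/modules`, so an empty `npu_like_modules` list is not proof that no NPU driver is present.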
56 lines
2.2 KiB
YAML
# rosenblatt fleet manifest — boltzmann (Rock 5 ITX+, RK3588)
#
# Phase-1 audit host. Always-on, 32 GB DDR4, NVMe rootfs. NPU silicon
# present + accessible via Rockchip-BSP vendor module today; mainline
# path TBD (see docs/npu-mainline-status.md).

host: boltzmann
arch: arm64
soc: rockchip/rk3588
board: rock-5-itx-plus
distro: archlinuxarm          # ALARM aarch64; boltzmann is the umbrella RK3588 host
role: primary-development     # not yet primary-target (laptop targets land later)

hardware:
  cpu: 4×Cortex-A76 (2.4 GHz) + 4×Cortex-A55 (1.8 GHz)
  ram: 32 GB DDR4-2666
  storage: NVMe (rootfs) + microSD (recovery)
  npu:
    cores: 3
    tops_int8_per_core: 2     # ~2 TOPS INT8 per core, 6 TOPS aggregate (theoretical peak)
    local_sram_per_core_mib: 2
    power_domain: pd_npu

# Phase-1 audit fills these (pending boltzmann inspection)
kernel:
  running_version: TBD        # uname -r snapshot at audit time
  source: TBD                 # mainline torvalds / mmind-rockchip / custom
  npu_driver: TBD             # vendor rockchip-npu / mainline rknpu / none

userspace:
  rknn_vendor_runtime_installed: false   # commitment: stay mainline-clean
  llama_cpp_installed: TBD               # via marfrit-packages or built-from-source

baseline_measurement:
  pending: true
  target: |
    llama.cpp pure-CPU tok/s on qwen2.5-1.5b-instruct-q4_k_m.gguf,
    3 runs, median wallclock. Use llama-bench from llama.cpp/build/bin.
  ground_truth_file: benchmarks/2026-XX-XX_boltzmann_qwen1.5b_cpu_baseline.json

bringup_sequence:
  1: substrate audit (docs/npu-mainline-status.md table filled)
  2: npu-probe runs successfully (open device → 64×64 INT8 matmul → bit-match CPU ref)
  3: llama.cpp pure-CPU baseline captured
  4: rknpu ggml backend skeleton compiles
  5: first llama.cpp matmul offload working on a single layer
  6: full forward pass via NPU for one decode step
  7: tok/s vs baseline measured

backup_host: ampere           # CoolPi GenBook — port-validation target. Phase-2+ scope.

reverse_dependencies:
  - Quark (boltzmann UEFI) — must stay bootable across kernel-rev experiments
  - Neutron (boltzmann kernel build) — provides the kernel we tweak for rknpu
  - Volta (boltzmann umbrella) — Rosenblatt is the third Volta-child after Quark + Neutron
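
The "3 runs, median wallclock" rule in baseline_measurement can be turned into a small helper that builds the record destined for the ground-truth file. This is a sketch under assumptions: the function name and the record's field names are invented for illustration, and the wallclock values are expected to be collected by hand or by a wrapper around llama-bench, whose own output format is not parsed here.

```python
import json
import statistics


def summarize_baseline(host, model, n_tokens, wall_seconds):
    """Reduce three timed runs to one median-based tok/s figure.

    Field names are hypothetical; only the 3-run median rule comes
    from the manifest.
    """
    if len(wall_seconds) != 3:
        raise ValueError("the manifest asks for exactly 3 runs")
    if n_tokens <= 0 or any(t <= 0 for t in wall_seconds):
        raise ValueError("token count and wallclock times must be positive")
    median_s = statistics.median(wall_seconds)
    return {
        "host": host,
        "model": model,
        "n_tokens": n_tokens,
        "wall_seconds": list(wall_seconds),
        "median_wall_seconds": median_s,
        "tok_per_s": n_tokens / median_s,
    }


def to_json(record):
    """Serialize with stable key order so benchmark files diff cleanly."""
    return json.dumps(record, indent=2, sort_keys=True)
```

Taking the median of the wallclock times and deriving tok/s from it keeps a single slow outlier run from skewing the baseline, which matters when step 7 compares NPU-offloaded throughput against this number.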