Phase-1 audit closes with a substantively different picture than the original scaffold's TBDs:

- Tomeu Vizoso's RK3588 NPU work merged in Linux 6.18 (Nov 2025) under codename `rocket` (NOT `rknpu`). All references updated.
- Boltzmann's `linux-rk3588-marfrit-A1` (7.0.0-rc3-ARCH+) already ships `drivers/accel/rocket/rocket.ko` as a built-but-not-loaded module.
- DT bindings + per-core nodes (`npu@fdab/c/d_0000`, compatible `rockchip,rk3588-rknn-core`) in mainline since 6.18 but ship `status = "disabled"`; board enable is the Phase-2 unblock, not a driver port.
- Mesa 25.3 ships Rocket Gallium + Teflon TFLite delegate as the authoritative userspace reference for the uAPI shape.
- Op coverage today is conv-centric (MobileNet-class); transformer matmul needs the conv-1×1 shoehorn (RKNPU2 BSP precedent) or rocket op-set additions. Surfaced as Phase-2-load-bearing risk.
- IOMMU v1.0 hazard: 32 GB host needs `mem=4G` or local `rockchip,rk3568-iommu-v1` discriminator patches before the first NPU job, to avoid DMA-window faults.

Files:

- docs/npu-mainline-status.md: full audit table with upstream pointers (kernel.org / Mesa docs / dri-devel patch URLs / Tomeu's "we are in mainline" blog post).
- docs/phases.md: per-phase log entry for Phase-1 closeout.
- docs/op-coverage.md: matmul-vs-conv-vs-rocket-op-set framing.
- fleet/boltzmann.yaml: audited kernel + npu_driver + dt_npu_nodes state.
- kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso: overlay to flip the three rknn-core nodes to "okay" (+ matching mmu nodes), carries the IOMMU-mitigation warning inline.
- kernel/README.md: kernel-agent scope wiring + anticipated local carry patches.
- README.md: phase-status table + "rknpu → rocket" rename note.
- TODO.md: Phase-2 unblock concrete steps + standing upstream-watch items.
phases — Rosenblatt per-phase log
One entry per phase as it closes. Stores the findings (what we learned that future-us shouldn't have to rediscover) and the next gate (what Phase N+1 needs from us). Lives alongside the campaign-level README; this file is the durable journal.
Phase 0 — bootstrap
Closed: 2026-05-19
Deliverable: repo scaffold (README, TODO, docs/, kernel/, userspace/,
fleet/, benchmarks/), one initial commit
`Rosenblatt: project scaffold for RK3588 NPU on mainline`.
Phase 1 — substrate audit
Closed: 2026-05-19
Deliverable: docs/npu-mainline-status.md table fully populated;
fleet/boltzmann.yaml kernel/NPU-driver block filled with live data;
clear answer to the "accel uAPI vs. own MMIO driver" question.
Findings
- `rocket` driver is in mainline: Tomeu Vizoso's NPU work merged to torvalds in Linux 6.18 (Nov 2025) as `drivers/accel/rocket/`, Kconfig `DRM_ACCEL_ROCKET`. Driver author + history kept the same shape, but the upstream name is `rocket`, not `rknpu`; searching for `rknpu` in mainline misses everything.
- Boltzmann already ships the driver: `linux-rk3588-marfrit-A1` (7.0.0-rc3-ARCH+) is post-6.18 and contains `rocket.ko` at `/lib/modules/.../drivers/accel/rocket/rocket.ko`, marked `intree: Y`. Module aliases to `rockchip,rk3588-rknn-core`, matching the DT compatibles on the box.
- DT nodes present but disabled: `rk3588-base.dtsi` defines `rknn_core_0/1/2` at `0xfdab/c/d_0000` with compat `rockchip,rk3588-rknn-core`; all three boot with `status = "disabled"`. No board file enables them. Per-core IOMMUs `rknn_mmu_0/1/2` at `0xfdab9/aca/ada_000` also disabled.
- Userspace is Mesa Rocket Gallium + Teflon: shipped in Mesa 25.3. `src/gallium/drivers/rocket/` in mesa3d main is the authoritative reference for regcmd construction. `rkt_regcmd.c` is ~700 lines, single-conv emit path. No matmul-specific code.
- Kernel is a thin shim: `drivers/accel/rocket/rocket_drv.c` exposes a single facade `/dev/accel/accel0` for all probed RKNN cores. uAPI is 4 ioctls (CREATE_BO, SUBMIT, PREP_BO, FINI_BO). `rocket_job_hw_submit()` powers the NPU, points `PC_BASE_ADDRESS` at the regcmd, sets `PC_REGISTER_AMOUNTS`, pulls `PC_OPERATION_ENABLE.OP_EN = 1`. Everything else is the userspace-built regcmd buffer.
- NPU sub-blocks identified from `rocket_registers.h` interrupt masks: PC (program controller), CNA (conv neural-net accel; FEATURE / WEIGHT / CSC channels), CORE (MAC array), DPU (data processing unit, requant), PPU (post-processing, not exercised by Mesa today). Each per-core block has 2 parallel channels for double-buffering.
- Matmul-as-conv-1×1 is the only viable path: confirmed by reading Mesa's emit path. INT8 matmul `Y[M,N] = X[M,K] @ W[K,N]` maps cleanly to a conv with width=height=1 kernel, `DATAIN_CHANNEL=K`, `WEIGHT_KERNELS=N`. The vendor RKNPU2 stack does the same shoehorn.
- IOMMU v1.0 hazard surfaced from `linux-rockchip` thread (Midgy BALON / Simon Xue, 2026-04-03). The NPU IOMMU is v1.0 IP bound to generic `rockchip,rk3568-iommu`, so it is driven via the v2.0 code path. v1.0 can't allocate its DTE above 4 GB. Boltzmann has 32 GB. Naive enable will silently fault. Discriminator-compat patch series planned but not landed in mainline master as of 2026-05-19 (verified via cgit on `drivers/iommu/rockchip-iommu.c`).
- Vendor stack is off-limits: `librknnrt.so` is a closed binary blob under restrictive Rockchip license. BSP `rockchip-linux/kernel` `drivers/rknpu/` source is permitted as a spec-extraction reference only.
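
The matmul-as-conv-1×1 finding can be checked on the host before touching any regcmd. The following is a minimal pure-Python sketch, not the Mesa or RKNPU2 encoding: it treats the `M` rows of `X` as `M` pixels of an `M`×1 feature map (that spatial layout is an assumption made here for illustration), uses `K` input channels and `N` 1×1 kernels, and accumulates in plain Python integers without modelling INT8 requantisation. The function names are hypothetical.

```python
def matmul(x, w):
    """Reference Y[M,N] = X[M,K] @ W[K,N] on nested lists of ints."""
    m, k, n = len(x), len(w), len(w[0])
    return [[sum(x[i][c] * w[c][j] for c in range(k)) for j in range(n)]
            for i in range(m)]


def conv1x1(feature, kernels):
    """1x1 convolution, stride 1, no padding.

    feature: [H][W][C_in]   input feature map, channels last
    kernels: [C_out][C_in]  one 1x1 kernel per output channel
    returns: [H][W][C_out]
    """
    return [[[sum(px[c] * kern[c] for c in range(len(kern))) for kern in kernels]
             for px in row]
            for row in feature]


def matmul_via_conv1x1(x, w):
    """Express X[M,K] @ W[K,N] as a 1x1 conv: K input channels, N kernels."""
    k, n = len(w), len(w[0])
    # Assumed layout: each row of X becomes one pixel with K channels.
    feature = [[row] for row in x]                              # [M][1][K]
    # Kernel j carries column j of W, i.e. the kernels are W transposed.
    kernels = [[w[c][j] for c in range(k)] for j in range(n)]   # [N][K]
    out = conv1x1(feature, kernels)                             # [M][1][N]
    return [out[i][0] for i in range(len(x))]


X = [[1, -2, 3], [4, 5, -6]]        # M=2, K=3
W = [[7, 8], [-9, 10], [11, 12]]    # K=3, N=2
assert matmul_via_conv1x1(X, W) == matmul(X, W)
```

The only data movement the mapping needs is the transpose of `W` into per-output-channel kernels; the input is reinterpreted in place.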
Phase exit decision
Drive via the rocket DRM-accel uAPI. Writing our own MMIO
driver would mean re-implementing IOMMU integration, power-domain
sequencing, and fence/sched plumbing that's already in-tree and
production-validated by Mesa Teflon consumers. The Phase-2 unblock
list is short: DT enable + IOMMU mitigation + modprobe rocket.
Phase outflow → TODO
Captured in TODO.md "Phase-2 unblock" section. Highlights:
- Apply `kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso` (or equivalent board-DTS patch) to boltzmann.
- Mitigate IOMMU v1.0 hazard before first NPU job: `mem=4G` boot or local discriminator-compat carry.
- `modprobe rocket`, confirm `/dev/accel/accel0`, no IOMMU faults.
- Read `rkt_regcmd.c`, `rkt_ml.c`, `rkt_task.c`, `rkt_coefs.c` from Mesa for the conv-1×1 matmul encoding details (op-coverage.md has the first cut).
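
Part of the unblock list above can be turned into a host-side preflight. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the standard `/proc/meminfo` and `/proc/modules` text formats and that visible RAM above 4 GiB is a usable proxy for "the `mem=4G` mitigation is not active". It cannot tell whether a discriminator-compat carry is applied, and it does not scan the kernel log for IOMMU faults.

```python
import os

FOUR_GIB_KIB = 4 * 1024 * 1024  # 4 GiB in KiB, the unit /proc/meminfo reports


def mem_total_kib(meminfo_text):
    """Return the MemTotal value (KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def module_loaded(modules_text, name="rocket"):
    """True if `name` is the first field of a /proc/modules-style line."""
    return any(line.split()[0] == name
               for line in modules_text.splitlines() if line.strip())


def preflight(meminfo_text, modules_text, accel_node_exists):
    """Return a list of warnings; an empty list means these checks passed."""
    problems = []
    if mem_total_kib(meminfo_text) > FOUR_GIB_KIB:
        problems.append("RAM above 4 GiB is visible: confirm the "
                        "discriminator-compat carry or boot with mem=4G")
    if not module_loaded(modules_text):
        problems.append("rocket module is not loaded")
    if not accel_node_exists:
        problems.append("/dev/accel/accel0 is missing")
    return problems


def check_host():
    """Run the preflight against the live system (Linux only)."""
    with open("/proc/meminfo") as f:
        meminfo = f.read()
    with open("/proc/modules") as f:
        modules = f.read()
    return preflight(meminfo, modules, os.path.exists("/dev/accel/accel0"))
```

Calling `check_host()` on boltzmann after `modprobe rocket` would return the outstanding warnings. Keeping the parsing separate from the file reads lets the logic be exercised on any machine.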
Memory persisted
- `project_rosenblatt_overview.md`
- `project_rocket_upstream_state.md` (note: name is `rocket`, not `rknpu`)
- `project_iommu_v1_hazard.md`
Phase 2 — formulate (open)
Status: open as of 2026-05-19. See TODO.md and
docs/op-coverage.md for current state of the formulation.