marfrit c9a3f5c600 Rosenblatt Phase-1 closeout: rocket-driver substrate inventory
Phase-1 audit closes with a substantively different picture than the
original scaffold's TBDs:

- Tomeu Vizoso's RK3588 NPU work merged in Linux 6.18 (Nov 2025) under
  codename `rocket` (NOT `rknpu`).  All references updated.
- Boltzmann's `linux-rk3588-marfrit-A1` (7.0.0-rc3-ARCH+) already ships
  `drivers/accel/rocket/rocket.ko` as a built-but-not-loaded module.
- DT bindings + per-core nodes (`npu@fdab/c/d_0000`,
  compatible `rockchip,rk3588-rknn-core`) in mainline since 6.18 but
  ship `status = "disabled"` — board enable is the Phase-2 unblock,
  not a driver port.
- Mesa 25.3 ships Rocket Gallium + Teflon TFLite delegate as the
  authoritative userspace reference for the uAPI shape.
- Op coverage today is conv-centric (MobileNet-class); transformer
  matmul needs the conv-1×1 shoehorn (RKNPU2 BSP precedent) or rocket
  op-set additions.  Surfaced as Phase-2-load-bearing risk.
- IOMMU v1.0 hazard: 32 GB host needs `mem=4G` or local
  `rockchip,rk3568-iommu-v1` discriminator patches before the first
  NPU job, to avoid DMA-window faults.

Files:
- docs/npu-mainline-status.md: full audit table with upstream pointers
  (kernel.org / Mesa docs / dri-devel patch URLs / Tomeu's "we are in
  mainline" blog post).
- docs/phases.md: per-phase log entry for Phase-1 closeout.
- docs/op-coverage.md: matmul-vs-conv-vs-rocket-op-set framing.
- fleet/boltzmann.yaml: audited kernel + npu_driver + dt_npu_nodes
  state.
- kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso: overlay to
  flip the three rknn-core nodes to "okay" (+ matching mmu nodes),
  carries the IOMMU-mitigation warning inline.
- kernel/README.md: kernel-agent scope wiring + anticipated local
  carry patches.
- README.md: phase-status table + "rknpu → rocket" rename note.
- TODO.md: Phase-2 unblock concrete steps + standing
  upstream-watch items.
2026-05-19 12:41:31 +00:00


phases — Rosenblatt per-phase log

One entry per phase as it closes. Stores the findings (what we learned that future-us shouldn't have to rediscover) and the next gate (what Phase N+1 needs from us). Lives alongside the campaign-level README; this file is the durable journal.


Phase 0 — bootstrap

Closed: 2026-05-19. Deliverable: repo scaffold (README, TODO, docs/, kernel/, userspace/, fleet/, benchmarks/) and one initial commit, "Rosenblatt: project scaffold for RK3588 NPU on mainline".


Phase 1 — substrate audit

Closed: 2026-05-19. Deliverable: docs/npu-mainline-status.md table fully populated; fleet/boltzmann.yaml kernel/NPU-driver block filled with live data; a clear answer to the "accel uAPI vs. own MMIO driver" question.

Findings

  1. rocket driver is in mainline — Tomeu Vizoso's NPU work merged to torvalds in Linux 6.18 (Nov 2025) as drivers/accel/rocket/, Kconfig DRM_ACCEL_ROCKET. Driver author + history kept the same shape, but the upstream name is rocket, not rknpu — searching for rknpu in mainline misses everything.
  2. Boltzmann already ships the driver: linux-rk3588-marfrit-A1 (7.0.0-rc3-ARCH+) is post-6.18 and contains rocket.ko at /lib/modules/.../drivers/accel/rocket/rocket.ko, marked intree: Y. The module aliases to rockchip,rk3588-rknn-core, matching the DT compatibles on the box.
  3. DT nodes present but disabled: rk3588-base.dtsi defines rknn_core_0/1/2 at 0xfdab0000, 0xfdac0000, and 0xfdad0000 with compatible rockchip,rk3588-rknn-core; all three boot with status = "disabled". No board file enables them. The per-core IOMMUs rknn_mmu_0/1/2 at 0xfdab9000, 0xfdaca000, and 0xfdada000 are also disabled.
  4. Userspace is Mesa Rocket Gallium + Teflon — shipped in Mesa 25.3. src/gallium/drivers/rocket/ in mesa3d main is the authoritative reference for regcmd construction. rkt_regcmd.c is ~700 lines, single-conv emit path. No matmul-specific code.
  5. Kernel is a thin shim: drivers/accel/rocket/rocket_drv.c exposes a single facade, /dev/accel/accel0, for all probed RKNN cores. The uAPI is 4 ioctls (CREATE_BO, SUBMIT, PREP_BO, FINI_BO). rocket_job_hw_submit() powers the NPU, points PC_BASE_ADDRESS at the regcmd, sets PC_REGISTER_AMOUNTS, and sets PC_OPERATION_ENABLE.OP_EN = 1. Everything else is the userspace-built regcmd buffer.
  6. NPU sub-blocks identified from rocket_registers.h interrupt masks: PC (program controller), CNA (conv neural-net accel; FEATURE / WEIGHT / CSC channels), CORE (MAC array), DPU (data processing unit, requant), PPU (post-processing — not exercised by Mesa today). Each per-core block has 2 parallel channels for double-buffering.
  7. Matmul-as-conv-1×1 is the only viable path — confirmed by reading Mesa's emit path. INT8 matmul Y[M,N] = X[M,K] @ W[K,N] maps cleanly to a conv with width=height=1 kernel, DATAIN_CHANNEL=K, WEIGHT_KERNELS=N. The vendor RKNPU2 stack does the same shoehorn.
  8. IOMMU v1.0 hazard surfaced from linux-rockchip thread (Midgy BALON / Simon Xue, 2026-04-03). The NPU IOMMU is v1.0 IP bound to generic rockchip,rk3568-iommu — driven via the v2.0 code path. v1.0 can't allocate its DTE above 4 GB. Boltzmann has 32 GB. Naive enable will silently fault. Discriminator-compat patch series planned but not landed in mainline master as of 2026-05-19 (verified via cgit on drivers/iommu/rockchip-iommu.c).
  9. Vendor stack is off-limits: librknnrt.so is a closed binary blob under a restrictive Rockchip license. The BSP source in rockchip-linux/kernel drivers/rknpu/ is permitted as a spec-extraction reference only.
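
The conv-1×1 mapping in finding 7 can be checked numerically without touching the NPU. The sketch below is plain Python on nested lists, not Mesa or rocket code. It treats each of the M rows of X as one pixel of an M×1 feature map with K input channels, and each of the N columns of W as one 1×1 kernel. The spatial layout of M and the function names are assumptions made for illustration; the source only fixes DATAIN_CHANNEL=K and WEIGHT_KERNELS=N. INT8 quantization and requant are left out.

```python
def matmul(x, w):
    """Reference Y[M,N] = X[M,K] @ W[K,N] on nested lists."""
    k, n = len(w), len(w[0])
    return [[sum(row[c] * w[c][j] for c in range(k)) for j in range(n)]
            for row in x]


def conv1x1(feature, kernels):
    """1x1 convolution: feature is [H][W][C_in], kernels is [C_out][C_in].

    Returns [H][W][C_out]. With a 1x1 kernel each output pixel is just a
    dot product over the input channels of the same pixel.
    """
    return [[[sum(px[c] * kern[c] for c in range(len(kern))) for kern in kernels]
             for px in line]
            for line in feature]


def matmul_as_conv1x1(x, w):
    """Express X @ W as a 1x1 conv (hypothetical layout: H=M, W=1, C_in=K)."""
    k, n = len(w), len(w[0])
    feature = [[row] for row in x]                      # M pixels, K channels each
    kernels = [[w[c][j] for c in range(k)] for j in range(n)]  # N kernels of K weights
    out = conv1x1(feature, kernels)                     # shape [M][1][N]
    return [line[0] for line in out]                    # drop the width-1 axis


if __name__ == "__main__":
    x = [[1, 2, 3], [4, 5, 6]]          # M=2, K=3
    w = [[1, 0], [0, 1], [2, -1]]       # K=3, N=2
    print(matmul_as_conv1x1(x, w) == matmul(x, w))
```

The kernel list is W transposed, which is why the number of kernels equals N and each kernel carries K weights.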

Phase exit decision

Drive via the rocket DRM-accel uAPI. Writing our own MMIO driver would mean re-implementing IOMMU integration, power-domain sequencing, and fence/sched plumbing that's already in-tree and production-validated by Mesa Teflon consumers. The Phase-2 unblock list is short: DT enable + IOMMU mitigation + modprobe rocket.
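
Whether the IOMMU mitigation is still needed on a given boot comes down to whether any RAM sits above the 4 GB boundary. A minimal sketch of that check, assuming the usual /proc/iomem layout (hex "start-end : name" lines, child resources indented) and run as root, since unprivileged reads show zeroed addresses. The function name is hypothetical, and the check is only a coarse indicator: it shows where RAM is mapped, not where the IOMMU driver will place its DTE.

```python
FOUR_GIB = 4 * 1024**3


def ram_ranges_above_4g(iomem_text):
    """Return (start, end) for top-level 'System RAM' ranges reaching 4 GiB or above."""
    hits = []
    for line in iomem_text.splitlines():
        if line.startswith(" "):
            continue  # indented lines are child resources of a parent range
        span, sep, name = line.partition(" : ")
        if not sep or name.strip() != "System RAM":
            continue
        start_s, _, end_s = span.strip().partition("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if end >= FOUR_GIB:
            hits.append((start, end))
    return hits


if __name__ == "__main__":
    with open("/proc/iomem") as f:
        high = ram_ranges_above_4g(f.read())
    if high:
        print("RAM above 4 GiB present; IOMMU v1.0 mitigation still required:")
        for start, end in high:
            print(f"  {start:#x}-{end:#x}")
    else:
        print("No System RAM above 4 GiB reported.")
```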

Phase outflow → TODO

Captured in TODO.md "Phase-2 unblock" section. Highlights:

  • Apply kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso (or equivalent board-DTS patch) to boltzmann.
  • Mitigate IOMMU v1.0 hazard before first NPU job: mem=4G boot or local discriminator-compat carry.
  • modprobe rocket, confirm /dev/accel/accel0, no IOMMU faults.
  • Read rkt_regcmd.c, rkt_ml.c, rkt_task.c, rkt_coefs.c from Mesa for the conv-1×1 matmul encoding details (op-coverage.md has the first cut).
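
The first two unblock steps can be verified from userspace once the overlay is applied. The sketch below walks the live device tree and reports the status of every node whose name starts with npu@, the node naming used for the three cores. The function name and the default root /sys/firmware/devicetree/base are assumptions for illustration; a node without a status property counts as enabled, per the Devicetree specification.

```python
import os


def npu_node_status(dt_root="/sys/firmware/devicetree/base"):
    """Map each npu@... node name under dt_root to its status string."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(dt_root):
        name = os.path.basename(dirpath)
        if not name.startswith("npu@"):
            continue
        status = "okay"  # a missing status property means the node is enabled
        if "status" in filenames:
            with open(os.path.join(dirpath, "status"), "rb") as f:
                # DT string properties are NUL-terminated
                status = f.read().rstrip(b"\0").decode()
        result[name] = status
    return result


if __name__ == "__main__":
    for node, status in sorted(npu_node_status().items()):
        print(f"{node}: {status}")
    # After modprobe rocket, the accel node named in the findings should exist.
    print("accel0 present:", os.path.exists("/dev/accel/accel0"))
```

Before the overlay is applied, all three cores are expected to report "disabled"; afterwards they should report "okay".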

Memory persisted

  • project_rosenblatt_overview.md
  • project_rocket_upstream_state.md (note: name is rocket, not rknpu)
  • project_iommu_v1_hazard.md

Phase 2 — formulate (open)

Status: open as of 2026-05-19. See TODO.md and docs/op-coverage.md for current state of the formulation.