Rosenblatt Phase-1 closeout: rocket-driver substrate inventory
Phase-1 audit closes with a substantively different picture than the original scaffold's TBDs:

- Tomeu Vizoso's RK3588 NPU work merged in Linux 6.18 (Nov 2025) under codename `rocket` (NOT `rknpu`). All references updated.
- Boltzmann's `linux-rk3588-marfrit-A1` (7.0.0-rc3-ARCH+) already ships `drivers/accel/rocket/rocket.ko` as a built-but-not-loaded module.
- DT bindings + per-core nodes (`npu@fdab/c/d_0000`, compatible `rockchip,rk3588-rknn-core`) in mainline since 6.18 but ship `status = "disabled"` — board enable is the Phase-2 unblock, not a driver port.
- Mesa 25.3 ships Rocket Gallium + Teflon TFLite delegate as the authoritative userspace reference for the uAPI shape.
- Op coverage today is conv-centric (MobileNet-class); transformer matmul needs the conv-1×1 shoehorn (RKNPU2 BSP precedent) or rocket op-set additions. Surfaced as Phase-2-load-bearing risk.
- IOMMU v1.0 hazard: 32 GB host needs `mem=4G` or local `rockchip,rk3568-iommu-v1` discriminator patches before the first NPU job, to avoid DMA-window faults.

Files:

- docs/npu-mainline-status.md: full audit table with upstream pointers (kernel.org / Mesa docs / dri-devel patch URLs / Tomeu's "we are in mainline" blog post).
- docs/phases.md: per-phase log entry for Phase-1 closeout.
- docs/op-coverage.md: matmul-vs-conv-vs-rocket-op-set framing.
- fleet/boltzmann.yaml: audited kernel + npu_driver + dt_npu_nodes state.
- kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso: overlay to flip the three rknn-core nodes to "okay" (+ matching mmu nodes), carries the IOMMU-mitigation warning inline.
- kernel/README.md: kernel-agent scope wiring + anticipated local carry patches.
- README.md: phase-status table + "rknpu → rocket" rename note.
- TODO.md: Phase-2 unblock concrete steps + standing upstream-watch items.
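For readers unfamiliar with the "conv-1×1 shoehorn" named in the op-coverage bullet, the following is a minimal sketch of the arithmetic identity it relies on: a matmul of an M×K activation matrix by a K×N weight matrix gives the same numbers as a 1×1 convolution over a feature map with M pixels, K input channels and N output channels. The function names and the 1×M feature-map layout are assumptions chosen for illustration; nothing here reflects how rocket or the RKNPU2 BSP actually lay out tensors.

```python
def matmul(x, w):
    """Plain matmul. x is M rows of K values, w is K rows of N values."""
    k_dim, n_dim = len(w), len(w[0])
    return [[sum(row[k] * w[k][n] for k in range(k_dim)) for n in range(n_dim)]
            for row in x]


def conv1x1(fmap, kernels):
    """1x1 convolution. fmap[h][w][c_in], kernels[c_out][c_in] -> out[h][w][c_out]."""
    return [[[sum(px[c] * ker[c] for c in range(len(ker))) for ker in kernels]
             for px in row]
            for row in fmap]


def matmul_via_conv1x1(x, w):
    """Express x @ w as a 1x1 convolution (illustrative layout, H=1, W=M, C=K)."""
    # Each row of x becomes one pixel with K channels.
    fmap = [x]
    # Each column of w becomes one 1x1 kernel, i.e. one output channel.
    kernels = [list(col) for col in zip(*w)]
    # Drop the H=1 axis again to get an M x N result.
    return conv1x1(fmap, kernels)[0]
```

Whether the NPU pipeline accepts the resulting shapes efficiently (channel alignment, tile sizes, quantization) is a separate question that this sketch does not address.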
@@ -64,12 +64,57 @@ clear answer to "do we drive via accel uAPI or write our own MMIO driver."
 
 ---
 
-## Cross-phase / standing items
+## Phase-2 unblock — prerequisites (NEW, Phase-1 outflow)
 
-- [ ] Mirror Tomeu's WIP branch into a local clone for kernel hacking
-- [ ] Set up serial console on boltzmann for kernel-panic recovery (Quark
-      umbrella; check current state)
-- [ ] Add `project_rosenblatt.md` to claude-memory once Phase 1 closes (so
-      future sessions don't re-discover the campaign)
-- [ ] Decide repo home: marfrit/rosenblatt on git.reauktion.de (probably yes,
-      after Phase-1 substrate is captured and the README isn't embarrassing)
+Phase-1 audit (2026-05-19) reframed Phase-2 from "design rknpu backend
+interface" to a concrete bringup sequence. See
+`docs/npu-mainline-status.md` for full context.
+
+- [ ] Patch boltzmann board DTS / overlay to flip
+      `npu@fdab0000`, `npu@fdac0000`, `npu@fdad0000` from
+      `status = "disabled"` → `"okay"`. Rebuild DTB.
+- [ ] **Mitigate IOMMU v1.0 hazard before first NPU job** (32 GB host).
+      Pick one:
+      - (A) Boot with `mem=4G` for first-bringup validation, OR
+      - (B) Carry local patches: Simon Xue per-device-ops
+        (`<https://lore.kernel.org/all/20260310105303.128859-1-xxm@rock-chips.com/>`)
+        + Midgy `rockchip,rk3568-iommu-v1` discriminator compat
+        + DT update for `rknpu_mmu` to the new compat.
+- [ ] `modprobe rocket` and confirm `/dev/accel/accel0..2` appear, no
+      probe errors in dmesg, IOMMU faults absent.
+- [ ] Read `drivers/accel/rocket/rocket_job.c` + Mesa Rocket Gallium to
+      determine submit-job uAPI capabilities — specifically whether
+      we can express a transformer matmul as a tile/op the NPU pipeline
+      accepts, or whether we need additional op coverage upstream.
+- [ ] Decide matmul strategy (Phase-2 deliverable):
+      conv-1×1 shoehorn / extend rocket op set / thinner submit shim.
+
+## Standing items — track upstream
+
+- [ ] Watch `drivers/iommu/rockchip-iommu.c` for the discriminator
+      `rockchip,rk3568-iommu-v1` compat to land; drop local patch (B)
+      when it does.
+- [ ] Watch `linux-rockchip` for the next iteration of Midgy / Simon's
+      thread (last visible activity 2026-04-03).
+- [ ] Watch `drivers/accel/rocket/` for matmul / GEMM op additions.
+
+## Cross-phase / standing items (older)
+
+- [ ] Mirror Tomeu's branch — superseded: code is now in-tree.
+      Keep `git.kernel.org/.../torvalds/linux.git` checkout pinned to
+      the boltzmann kernel rev for in-tree reading.
+- [ ] Set up serial console on boltzmann for kernel-panic recovery
+      (Quark umbrella; check current state) — **becomes load-bearing
+      once we start poking IOMMU code.**
+- [x] Add `project_rosenblatt_overview.md` + `project_rocket_upstream_state.md`
+      to claude-memory — done 2026-05-19.
+- [ ] Decide repo home: marfrit/rosenblatt on git.reauktion.de
+      (probably yes, after Phase-1 substrate is captured).
+- [ ] **Resolve board-name discrepancy.** README and
+      `fleet/boltzmann.yaml` say boltzmann is a "Rock 5 ITX+" /
+      `rock-5-itx-plus`; the running DT reports
+      `model = "Radxa ROCK 5 ITX"`, `compatible = "radxa,rock-5-itx"`.
+      Confirm physical board model (Radxa sells both SKUs) and
+      either correct the README + manifest, or note that we boot
+      the plain-ITX DT on ITX+ hardware (likely fine; ITX+ is mostly
+      a connectivity-refresh, same SoC + same NPU silicon).
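The `modprobe rocket` checklist item in the hunk above (three `/dev/accel/accelN` nodes present, no probe errors, no IOMMU faults in dmesg) could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the log filter is a loose heuristic (any line mentioning `rocket` or `iommu` followed by `fault`, `error` or `fail`), not a list of the messages the drivers actually print.

```python
import os
import re


def missing_accel_nodes(dev_dir="/dev/accel", expected=3):
    """Return the accelN names (N in 0..expected-1) that are absent from dev_dir."""
    try:
        present = set(os.listdir(dev_dir))
    except FileNotFoundError:
        # Directory does not exist at all, so every node is missing.
        present = set()
    names = [f"accel{i}" for i in range(expected)]
    return [name for name in names if name not in present]


# Heuristic only: the exact wording of rocket / rockchip-iommu messages
# is an assumption and should be checked against a real boot log.
_SUSPECT = re.compile(r"(rocket|iommu).*(fault|error|fail)", re.IGNORECASE)


def suspicious_lines(dmesg_text):
    """Return kernel-log lines that look like rocket or IOMMU problems."""
    return [line for line in dmesg_text.splitlines() if _SUSPECT.search(line)]
```

On the board, `missing_accel_nodes()` should return an empty list after a clean probe, and `suspicious_lines()` can be fed the captured output of `dmesg`; any non-empty result is a prompt to read the log by hand rather than a diagnosis.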