marfrit c9a3f5c600 Rosenblatt Phase-1 closeout: rocket-driver substrate inventory
Phase-1 audit closes with a substantively different picture than the
original scaffold's TBDs:

- Tomeu Vizoso's RK3588 NPU work merged in Linux 6.18 (Nov 2025) under
  codename `rocket` (NOT `rknpu`).  All references updated.
- Boltzmann's `linux-rk3588-marfrit-A1` (7.0.0-rc3-ARCH+) already ships
  `drivers/accel/rocket/rocket.ko` as a built-but-not-loaded module.
- DT bindings + per-core nodes (`npu@fdab/c/d_0000`,
  compatible `rockchip,rk3588-rknn-core`) in mainline since 6.18 but
  ship `status = "disabled"` — board enable is the Phase-2 unblock,
  not a driver port.
- Mesa 25.3 ships Rocket Gallium + Teflon TFLite delegate as the
  authoritative userspace reference for the uAPI shape.
- Op coverage today is conv-centric (MobileNet-class); transformer
  matmul needs the conv-1×1 shoehorn (RKNPU2 BSP precedent) or rocket
  op-set additions.  Surfaced as Phase-2-load-bearing risk.
- IOMMU v1.0 hazard: 32 GB host needs `mem=4G` or local
  `rockchip,rk3568-iommu-v1` discriminator patches before the first
  NPU job, to avoid DMA-window faults.

Files:
- docs/npu-mainline-status.md: full audit table with upstream pointers
  (kernel.org / Mesa docs / dri-devel patch URLs / Tomeu's "we are in
  mainline" blog post).
- docs/phases.md: per-phase log entry for Phase-1 closeout.
- docs/op-coverage.md: matmul-vs-conv-vs-rocket-op-set framing.
- fleet/boltzmann.yaml: audited kernel + npu_driver + dt_npu_nodes
  state.
- kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso: overlay to
  flip the three rknn-core nodes to "okay" (+ matching mmu nodes),
  carries the IOMMU-mitigation warning inline.
- kernel/README.md: kernel-agent scope wiring + anticipated local
  carry patches.
- README.md: phase-status table + "rknpu → rocket" rename note.
- TODO.md: Phase-2 unblock concrete steps + standing
  upstream-watch items.
2026-05-19 12:41:31 +00:00

# phases — Rosenblatt per-phase log
One entry per phase as it closes. Stores the *findings* (what we
learned that future-us shouldn't have to rediscover) and the *next
gate* (what Phase N+1 needs from us). Lives alongside the
campaign-level README; this file is the durable journal.

---
## Phase 0 — bootstrap
**Closed:** 2026-05-19
**Deliverable:** repo scaffold (README, TODO, docs/, kernel/, userspace/,
fleet/, benchmarks/), one initial commit `Rosenblatt: project scaffold
for RK3588 NPU on mainline`.

---
## Phase 1 — substrate audit
**Closed:** 2026-05-19
**Deliverable:** `docs/npu-mainline-status.md` table fully populated;
`fleet/boltzmann.yaml` kernel/NPU-driver block filled with live data;
clear answer to the "accel uAPI vs. own MMIO driver" question.
### Findings
1. **`rocket` driver is in mainline:** Tomeu Vizoso's NPU work merged
to torvalds in Linux 6.18 (Nov 2025) as `drivers/accel/rocket/`,
Kconfig `DRM_ACCEL_ROCKET`. Driver author + history kept the same
shape, but the **upstream name is `rocket`, not `rknpu`**:
searching for `rknpu` in mainline misses everything.
2. **Boltzmann already ships the driver:** `linux-rk3588-marfrit-A1`
(7.0.0-rc3-ARCH+) is post-6.18 and contains `rocket.ko` at
`/lib/modules/.../drivers/accel/rocket/rocket.ko`, marked
`intree: Y`. The module aliases to `rockchip,rk3588-rknn-core`,
matching the DT compatibles on the box.
3. **DT nodes present but disabled:** `rk3588-base.dtsi` defines
`rknn_core_0/1/2` at `0xfdab/c/d_0000` with compatible
`rockchip,rk3588-rknn-core`; all three boot with `status =
"disabled"`. No board file enables them. The per-core IOMMUs
`rknn_mmu_0/1/2` at `0xfdab9/aca/ada_000` are also disabled.
4. **Userspace is Mesa Rocket Gallium + Teflon** — shipped in Mesa
25.3. `src/gallium/drivers/rocket/` in mesa3d main is the
authoritative reference for regcmd construction. `rkt_regcmd.c`
is ~700 lines, single-conv emit path. No matmul-specific code.
5. **Kernel is a thin shim:** `drivers/accel/rocket/rocket_drv.c`
exposes a single facade `/dev/accel/accel0` for all probed RKNN
cores. uAPI is 4 ioctls (CREATE_BO, SUBMIT, PREP_BO, FINI_BO).
`rocket_job_hw_submit()` powers the NPU, points
`PC_BASE_ADDRESS` at the regcmd, sets `PC_REGISTER_AMOUNTS`, and
sets `PC_OPERATION_ENABLE.OP_EN = 1`. Everything else is the
userspace-built regcmd buffer.
6. **NPU sub-blocks** identified from `rocket_registers.h` interrupt
masks: **PC** (program controller), **CNA** (conv neural-net
accel; FEATURE / WEIGHT / CSC channels), **CORE** (MAC array),
**DPU** (data processing unit, requant), **PPU**
(post-processing — not exercised by Mesa today). Each per-core
block has 2 parallel channels for double-buffering.
7. **Matmul-as-conv-1×1 is the only viable path** — confirmed by
reading Mesa's emit path. INT8 matmul `Y[M,N] = X[M,K] @ W[K,N]`
maps cleanly to a conv with width=height=1 kernel,
`DATAIN_CHANNEL=K`, `WEIGHT_KERNELS=N`. The vendor RKNPU2 stack
does the same shoehorn.
8. **IOMMU v1.0 hazard surfaced from `linux-rockchip` thread**
(Midgy BALON / Simon Xue, 2026-04-03). The NPU IOMMU is v1.0 IP
bound to the generic `rockchip,rk3568-iommu` compatible, so it is
driven via the v2.0 code path. v1.0 can't allocate its DTE above
4 GB, and Boltzmann has 32 GB, so a naive enable will silently
fault. A discriminator-compat patch series is planned but **not
landed in mainline master as of 2026-05-19** (verified via cgit on
`drivers/iommu/rockchip-iommu.c`).
9. **Vendor stack is off-limits:** `librknnrt.so` is a closed
binary blob under a restrictive Rockchip license. BSP
`rockchip-linux/kernel` `drivers/rknpu/` source is permitted as
a spec-extraction reference only.
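
The matmul-as-conv-1×1 mapping in finding 7 can be checked on paper with a small pure-Python sketch. It is not Mesa's emit path and does not touch the hardware. The function names are hypothetical, and placing the M rows of `X` along the spatial width (H = 1, W = M) is an assumption made for illustration; only the roles of K (input channels) and N (kernel count) come from the finding.

```python
# Sketch: Y[M,N] = X[M,K] @ W[K,N] expressed as a 1x1 convolution.
# Layouts and names are illustrative assumptions, not Mesa's or the NPU's.

def matmul(x, w):
    """Reference matmul on nested lists: x is M x K, w is K x N."""
    m, k, n = len(x), len(w), len(w[0])
    return [[sum(x[i][c] * w[c][j] for c in range(k)) for j in range(n)]
            for i in range(m)]


def conv2d_1x1(feature, kernels):
    """1x1 convolution, stride 1, no padding.

    feature: [H][W][C_in]   input feature map
    kernels: [C_out][C_in]  one 1x1 kernel per output channel
    returns: [H][W][C_out]
    """
    return [[[sum(pixel[c] * kern[c] for c in range(len(kern)))
              for kern in kernels]
             for pixel in row]
            for row in feature]


def matmul_as_conv(x, w):
    """Rows of X become pixels; columns of W become 1x1 kernels."""
    k, n = len(w), len(w[0])
    feature = [x]  # H = 1, W = M, C_in = K (the DATAIN_CHANNEL role)
    # N kernels of K weights each (the WEIGHT_KERNELS role).
    kernels = [[w[c][j] for c in range(k)] for j in range(n)]
    return conv2d_1x1(feature, kernels)[0]  # [M][N]


x = [[1, -2, 3], [4, 5, -6]]   # M = 2, K = 3
w = [[1, 0], [2, -1], [0, 3]]  # K = 3, N = 2
assert matmul_as_conv(x, w) == matmul(x, w)
```

The sketch ignores quantization, requant in the DPU, and any hardware alignment constraints on channel counts; those belong to the regcmd encoding work.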
### Phase exit decision
**Drive via the `rocket` DRM-accel uAPI.** Writing our own MMIO
driver would mean re-implementing IOMMU integration, power-domain
sequencing, and fence/sched plumbing that's already in-tree and
production-validated by Mesa Teflon consumers. The Phase-2 unblock
list is short: DT enable + IOMMU mitigation + `modprobe rocket`.
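
To confirm the DT-enable step actually took effect, one option is to read the live device tree. The helper below is a minimal sketch with a hypothetical name: it assumes the NPU core nodes appear as directories named `npu@…` somewhere under `/proc/device-tree`, and it treats a missing `status` property as "okay", which is the devicetree default.

```python
import os


def npu_node_status(dt_root="/proc/device-tree", prefix="npu@"):
    """Return {node_name: status} for every node directory starting with prefix.

    The node-name prefix is an assumption based on the npu@... unit names;
    adjust it if the running tree names the nodes differently.
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(dt_root):
        name = os.path.basename(dirpath)
        if not name.startswith(prefix):
            continue
        status = "okay"  # absent status property means the node is enabled
        if "status" in filenames:
            with open(os.path.join(dirpath, "status"), "rb") as fh:
                # String properties are stored NUL-terminated.
                status = fh.read().rstrip(b"\x00").decode("ascii", "replace")
        found[name] = status
    return found
```

Calling `npu_node_status()` on boltzmann before the overlay is applied should report all three cores as `disabled`, given finding 3; after the overlay they should read `okay`.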
### Phase outflow → TODO
Captured in `TODO.md` "Phase-2 unblock" section. Highlights:
- Apply `kernel/dt-overlays/rk3588-rosenblatt-npu-enable.dtso` (or
equivalent board-DTS patch) to boltzmann.
- Mitigate IOMMU v1.0 hazard before first NPU job: `mem=4G` boot or
local discriminator-compat carry.
- `modprobe rocket`, confirm `/dev/accel/accel0`, no IOMMU faults.
- Read `rkt_regcmd.c`, `rkt_ml.c`, `rkt_task.c`, `rkt_coefs.c` from
Mesa for the conv-1×1 matmul encoding details (op-coverage.md
has the first cut).
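
Before the first `modprobe rocket`, a quick check that the `mem=4G` mitigation is really in effect can save a debugging session. The following is a heuristic sketch only, with hypothetical function names: it inspects text in the format of `/proc/cmdline` and `/proc/meminfo`, and it cannot detect a local discriminator-compat carry patch, so a `True` result means "no `mem=` cap seen", not "the kernel is unpatched".

```python
import re

FOUR_GIB = 4 << 30
_SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def mem_cap_bytes(cmdline_text):
    """Return the first mem= limit on a kernel command line in bytes, or None."""
    for arg in cmdline_text.split():
        match = re.fullmatch(r"mem=(\d+)([KMG]?)", arg, re.IGNORECASE)
        if match:
            return int(match.group(1)) * _SUFFIX[match.group(2).upper()]
    return None


def mem_total_bytes(meminfo_text):
    """Extract MemTotal from /proc/meminfo text and convert kB to bytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found in meminfo text")
    return int(match.group(1)) * 1024


def hazard_unmitigated(cmdline_text, meminfo_text):
    """Heuristic: True if no mem= cap at or below 4 GiB and RAM exceeds 4 GiB."""
    cap = mem_cap_bytes(cmdline_text)
    if cap is not None and cap <= FOUR_GIB:
        return False
    return mem_total_bytes(meminfo_text) > FOUR_GIB
```

On the target this would be fed `open("/proc/cmdline").read()` and `open("/proc/meminfo").read()`.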
### Memory persisted
- `project_rosenblatt_overview.md`
- `project_rocket_upstream_state.md` (note: name is `rocket`, not
`rknpu`)
- `project_iommu_v1_hazard.md`
---
## Phase 2 — formulate (open)
**Status:** open as of 2026-05-19. See `TODO.md` and
`docs/op-coverage.md` for current state of the formulation.