[GH-ISSUE #15393] Add Rockchip NPU support #71904

Open
opened 2026-05-05 02:55:15 -05:00 by GiteaMirror · 3 comments

Originally created by @MattimaxForce on GitHub (Apr 7, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15393

I am writing to strongly suggest the integration of Rockchip NPU support into Ollama.
The heavy lifting for this integration has already been done by the community. You can look at the rkllama project (github.com), which successfully implements NPU acceleration for the RK3588 and RK3576 SoCs using the RKNN toolkit.
Since the logic and the implementation are already available and proven to work, integrating this into the official Ollama build would be a massive win for the ARM edge-computing community. This would allow users of devices like the Orange Pi 5 and Rock 5B to run LLMs with high efficiency without relying on pure CPU inference.
Please consider merging or adapting the work from rkllama to bring native NPU support to the official Ollama releases.
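For context on what gating such support might look like: on device-tree platforms, the SoC can be identified from the NUL-separated compatible strings in `/proc/device-tree/compatible` (e.g. `rockchip,rk3588`). Below is a minimal Go sketch of that check; the compatible strings and the idea of using them as a gate are assumptions based on mainline device trees, not anything taken from rkllama or Ollama:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// rknpuSoCs lists device-tree compatible strings for SoCs with an
// RKNN-capable NPU. These values follow mainline Rockchip device
// trees; treat them as assumptions, not an exhaustive list.
var rknpuSoCs = []string{"rockchip,rk3588", "rockchip,rk3576"}

// hasRockchipNPU reports whether the running board's device tree
// advertises a supported Rockchip SoC. On DT-based systems,
// /proc/device-tree/compatible is a NUL-separated list of strings.
func hasRockchipNPU() bool {
	data, err := os.ReadFile("/proc/device-tree/compatible")
	if err != nil {
		return false // not a device-tree platform
	}
	for _, c := range strings.Split(string(data), "\x00") {
		for _, soc := range rknpuSoCs {
			if c == soc {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println("Rockchip NPU candidate:", hasRockchipNPU())
}
```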

GiteaMirror added the feature request label 2026-05-05 02:55:15 -05:00

@rickb006 commented on GitHub (Apr 9, 2026):

I agree. This NPU framework is worth implementing. Hundreds of thousands of units have shipped, and that's not counting the two most recent product-line additions.


@k8ieone commented on GitHub (Apr 9, 2026):

Ideally, the new generic Linux NPU support (the kernel's DRM accel subsystem) could be used instead of Rockchip-specific blobs.

This Frigate discussion outlines everything: https://github.com/blakeblackshear/frigate/discussions/18311#discussion-8347782

From my understanding, this should also make other NPUs (like the AMD XDNA) work because the implementations should be generic and transparent.

https://dri.freedesktop.org/docs/drm/accel/index.html
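For what it's worth, the accel subsystem described there exposes accelerators as char devices under `/dev/accel/` (accel0, accel1, ...), so generic discovery could be a simple directory scan. A hedged Go sketch; the `/sys/class/accel/.../device/driver` lookup assumes the usual sysfs class layout and is not verified against every kernel:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// listAccelDevices enumerates DRM accel nodes (/dev/accel/accelN) and,
// where available, resolves each device's kernel driver name via the
// /sys/class/accel symlinks. The sysfs layout used here mirrors other
// DRM device classes and is an assumption, not a guarantee.
func listAccelDevices() {
	nodes, err := filepath.Glob("/dev/accel/accel*")
	if err != nil || len(nodes) == 0 {
		fmt.Println("no accel devices found")
		return
	}
	for _, node := range nodes {
		name := filepath.Base(node)
		// e.g. /sys/class/accel/accel0/device/driver -> .../drivers/<name>
		drv, err := os.Readlink(filepath.Join("/sys/class/accel", name, "device/driver"))
		if err != nil {
			fmt.Println(name, "(driver unknown)")
			continue
		}
		fmt.Println(name, "->", filepath.Base(drv))
	}
}

func main() {
	listAccelDevices()
}
```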


@crazyquark commented on GitHub (Apr 14, 2026):

It should be possible to use rk-llama.cpp as a backend, but I have not investigated.
