[GH-ISSUE #11247] Add Vulkan GPU Backend for AMD/Intel Support #53922

Closed
opened 2026-04-29 04:57:27 -05:00 by GiteaMirror · 4 comments

Originally created by @linuxkernel94 on GitHub (Jun 30, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11247

What is the issue?

Hi team,

Currently, GPU support in Ollama is effectively limited to CUDA (NVIDIA), Apple Metal, and a narrow slice of AMD cards via ROCm. That leaves out most users with AMD and Intel GPUs, even though most of these GPUs support Vulkan 1.2+, which llama.cpp now supports.

Other projects like GPT4All already support Vulkan out of the box, enabling real GPU acceleration on AMD/Intel hardware.

Why this matters:
ROCm support is extremely limited — most AMD GPUs (e.g. RX 6600M, 7900, 7600) don't work.

Vulkan works cross-platform (Windows, Linux, macOS), and is already functional in llama.cpp.

Not having Vulkan support means many users are stuck using CPU-only inference, even with capable GPUs (a quick way to verify this is sketched below).
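As a sanity check for that CPU-only fallback: `ollama ps` reports where a loaded model was placed. A minimal sketch, assuming a stock install and any pulled model (the model name here is illustrative):

```shell
# Load any pulled model, then inspect placement; models stay resident for a
# few minutes after a run. On hardware with no usable GPU backend the
# PROCESSOR column reads "100% CPU" instead of "100% GPU".
ollama run llama3 "hello" >/dev/null
ollama ps
```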

Suggested solution:
Add a build of Ollama with LLAMA_VULKAN=1 enabled.

Automatically use Vulkan if CUDA is not available.

Allow setting backend preference via config or environment variable (a build sketch follows below).
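A minimal sketch of what the suggested build could look like, using upstream llama.cpp's Vulkan switch (current trees spell it GGML_VULKAN=ON; LLAMA_VULKAN=1 is the older name). The OLLAMA_BACKEND variable on the last line is hypothetical, illustrating the proposed preference override rather than an existing setting:

```shell
# Build llama.cpp with the Vulkan backend enabled (needs the Vulkan SDK/headers).
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Smoke test: offload all layers to the Vulkan device (model path illustrative).
./build/bin/llama-cli -m ./models/model.gguf -ngl 99 -p "hello"

# Hypothetical Ollama-side preference override, as proposed above:
OLLAMA_BACKEND=vulkan ollama serve
```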

This would allow many more users to benefit from GPU acceleration without complex driver setup or hardware limitations.

Thanks again for your amazing work.

Relevant log output


OS

Linux

GPU

AMD

CPU

AMD

Ollama version

No response

GiteaMirror added the bug label 2026-04-29 04:57:27 -05:00

@rick-github commented on GitHub (Jun 30, 2025):

#2033
https://github.com/ollama/ollama/pull/9650


@KhazAkar commented on GitHub (Jul 2, 2025):

The newest PR hasn't been touched since May 15, and it still has a build bug affecting the Windows target.


@Stogas commented on GitHub (Jul 4, 2025):

Correction: [ROCm fully supports desktop RDNA2, RDNA3 and RDNA4](https://rocm.docs.amd.com/en/latest/reference/gpu-arch-specs.html), not just the models you've listed.

So basically, the last three generations of consumer desktop AMD GPUs are supported and work (I myself use a 7800xt).


@linuxkernel94 commented on GitHub (Jul 4, 2025):

Thanks for the correction — you're absolutely right.

To clarify: ROCm does work on many desktop RDNA2, RDNA3, and RDNA4 GPUs, and that’s great.
What I meant to highlight is that ROCm support is still lacking or unreliable for many laptop/mobile GPUs, like the RX 6600M, which is what I personally use.
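For near-supported RDNA2 cards like the RX 6600M (gfx1032), Ollama's GPU documentation describes forcing ROCm to treat the card as a supported target; a hedged sketch, noting the override value depends on the actual chip:

```shell
# Workaround sketch: present a gfx1032 part (e.g. RX 6600M) to ROCm as the
# officially supported gfx1030 target. Helps many RDNA2 GPUs; not guaranteed.
HSA_OVERRIDE_GFX_VERSION=10.3.0 ollama serve
```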

Also worth noting: Intel GPUs (especially recent ones like Arc or Xe) have no ROCm support at all, and are completely left out of GPU acceleration in Ollama right now — even though they support Vulkan 1.2+ and work with llama.cpp.

That’s why Vulkan support would be such a valuable addition — it could enable GPU acceleration across a much wider range of hardware, including AMD laptops and Intel GPUs, without relying on proprietary or incomplete drivers.
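One way to check that premise, i.e. that a given AMD or Intel GPU actually exposes Vulkan 1.2+, is the `vulkaninfo` tool from vulkan-tools / the Vulkan SDK. A minimal sketch:

```shell
# List detected Vulkan devices and the API version each reports; anything
# at 1.2 or above meets the floor cited in this issue.
vulkaninfo --summary | grep -iE 'deviceName|apiVersion'
```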

Thanks again for the discussion!


Reference: github-starred/ollama#53922