[GH-ISSUE #2797] Please consider supporting Intel GPU ARC A770 (16G) #48203

Closed
opened 2026-04-28 07:09:48 -05:00 by GiteaMirror · 3 comments

Originally created by @HelloMorningStar on GitHub (Feb 28, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2797

Originally assigned to: @dhiltgen on GitHub.

Here is a demo of ARC A770 running llama2:

https://www.reddit.com/r/LocalLLaMA/comments/1b0c6u8/llama_2_inference_with_pytorch_on_intel_arc/

The Intel Arc A770 is a powerful graphics card that is well-suited for a variety of tasks, including machine learning. It has 16GB of GDDR6 memory, a 256-bit memory interface, and a boost clock of 2.1 GHz. It also supports ray tracing and XeSS, which can improve performance in games and other applications.

The llama2 demo shows how the Intel Arc A770 can be used to accelerate machine learning inference tasks. The demo uses PyTorch, a popular machine learning framework, to run a llama2 model on the Intel Arc A770. The results show that the Intel Arc A770 can achieve significant performance gains over CPUs and other GPUs.
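For reference, the linked demo relies on Intel's XPU backend for PyTorch. A minimal setup sketch is below; the package name and wheel index URL follow Intel's installation docs at the time and should be treated as assumptions, since exact wheel locations change between releases:

```shell
# Sketch: install Intel's PyTorch extension (XPU backend) and verify
# that the Arc GPU is visible. Package/index names are assumptions
# based on Intel's docs and may differ by release.
pip install torch intel-extension-for-pytorch \
    --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

# Should print True on a machine with a supported Arc GPU and drivers installed.
python -c "import torch, intel_extension_for_pytorch; print(torch.xpu.is_available())"
```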

GiteaMirror added the intel label 2026-04-28 07:09:48 -05:00

@renyuneyun commented on GitHub (Apr 12, 2024):

Maybe #2033 could solve this?
According to [this article](https://blog.nomic.ai/posts/gpt4all-gpu-inference-with-vulkan), Vulkan has universal acceleration support for lots of GPUs, including Intel's (integrated or discrete) GPUs.

But actually I'm also curious about the status of the new neural engines that Intel introduced in the Ultra series of chips (and the similar hardware AMD has introduced in its chips). That's not a GPU, and Vulkan cannot support it, I believe? Not sure what tools can unify support for that.


@DurianyDoriana commented on GitHub (Apr 15, 2024):

> Maybe #2033 could solve this?
>
> But actually I'm also curious about the status of the new neural engines that Intel introduced in the Ultra series of chips (and the similar hardware AMD has introduced in its chips). That's not a GPU, and Vulkan cannot support it, I believe? Not sure what tools can unify support for that.
Yes, Vulkan works great in Llama.cpp, GPT4All, and other ready-made programs such as Jan.ai on Intel iGPUs and dGPUs.

Intel also supports 50+ LLM models and LangChain through IPEX-LLM. In fact, Ollama is mentioned on the IPEX-LLM GitHub page:
https://github.com/intel-analytics/ipex-llm
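For anyone who wants to try this today, IPEX-LLM ships an Ollama-compatible build. Roughly, per its quickstart (the script names and environment variables below are taken from the IPEX-LLM docs and may change between releases, so treat them as assumptions):

```shell
# Sketch of running Ollama on an Intel GPU via IPEX-LLM (Linux).
# Names below follow the IPEX-LLM quickstart and are assumptions here.
source /opt/intel/oneapi/setvars.sh   # load the oneAPI runtime for SYCL
init-ollama                           # symlink IPEX-LLM's ollama build into the cwd
export OLLAMA_NUM_GPU=999             # offload all model layers to the GPU
export ZES_ENABLE_SYSMAN=1            # expose GPU memory info to the runtime
./ollama serve
```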

Regarding your second question, their NPUs are currently only supported through OpenVINO on Linux and Windows, and there's "Limited initial support for DirectML".
https://github.com/openvinotoolkit/openvino/tree/master/src/plugins/intel_npu
https://downloadmirror.intel.com/820222/NPU_Win_Release_Notes_v2267.pdf


@dhiltgen commented on GitHub (Apr 15, 2024):

Dup of #1590


Reference: github-starred/ollama#48203