[PR #10816] [CLOSED] Add discover for Intel GPU and support OneApi SYCL #18646

Closed
opened 2026-04-16 06:42:04 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10816
Author: @chnxq
Created: 5/22/2025
Status: Closed

Base: main ← Head: chnxq/sycl-discover


📝 Commits (10+)

  • 6c4f99c Add support Intel OneApi GPU.--draft
  • d5ecf05 Add Readme
  • 88e9e75 Merge branch 'main' into chnxq/add-oneapi
  • 81f41cd sync llama.cpp/ggml/sycl lib
  • 669d872 Merge branch 'main' into chnxq/add-oneapi
  • 553586a Merge branch 'main' into chnxq/add-oneapi
  • 69b6690 add readme & merge main branch
  • 71382f8 merge main
  • 5be3ff5 I don't know why after adding the AVX-VNNI instruction set, the Intel compiler cannot correctly recognize it. Temporarily roll back.
  • 66d2809 Merge branch 'main' into chnxq/add-oneapi

📊 Changes

62 files changed (+22134 additions, -577 deletions)


📝 CMakeLists.txt (+1 -0)
📝 README.md (+63 -551)
📝 discover/gpu.go (+123 -24)
📝 discover/gpu_info.h (+1 -0)
➕ discover/gpu_info_sycl.c (+97 -0)
➕ discover/gpu_info_sycl.h (+29 -0)
📝 discover/gpu_windows.go (+7 -0)
➕ llama/README-Intel-OneApi.md (+75 -0)
📝 llm/server.go (+32 -0)
📝 ml/backend/ggml/ggml/.rsync-filter (+1 -0)
📝 ml/backend/ggml/ggml/src/CMakeLists.txt (+2 -2)
➕ ml/backend/ggml/ggml/src/ggml-sycl/CMakeLists.txt (+189 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/backend.hpp (+37 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/binbcast.cpp (+239 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/binbcast.hpp (+39 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/common.cpp (+83 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/common.hpp (+493 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/concat.cpp (+197 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/concat.hpp (+20 -0)
➕ ml/backend/ggml/ggml/src/ggml-sycl/conv.cpp (+100 -0)

...and 42 more files

📄 Description

I wrote a test based on the ollama/main branch that uses SYCL to discover Intel GPU devices instead of the Level Zero library. It can obtain the GPU memory size for Ollama.

Can it be integrated into Ollama?
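
The core of the approach is an ordinary SYCL device query. Below is a minimal sketch of that kind of query (illustrative only, assuming the oneAPI DPC++ compiler; this is not the PR's actual gpu_info_sycl.c code):

// Enumerate Intel GPUs via SYCL and read each device's global memory size,
// which is the value the discovery code reports back to Ollama.
// Build with: icpx -fsycl sycl_probe.cpp
#include <sycl/sycl.hpp>
#include <cstdio>

int main() {
    for (const auto &dev : sycl::device::get_devices(sycl::info::device_type::gpu)) {
        const auto name = dev.get_info<sycl::info::device::name>();
        const auto memBytes = dev.get_info<sycl::info::device::global_mem_size>();
        std::printf("%s: %llu MiB global memory\n", name.c_str(),
                    static_cast<unsigned long long>(memBytes / (1024 * 1024)));
    }
    return 0;
}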

For Intel OneAPI developers

This document records the process of merging ggml-sycl from llama.cpp (https://github.com/ggml-org/llama.cpp) to support Intel GPUs.

Only tested on Windows with an Intel integrated graphics card.

A portable package is available at https://github.com/chnxq/ollama/releases

Development configuration

Prerequisites

Install Intel oneAPI from the oneAPI Base Toolkit: https://www.intel.com/content/www/us/en/developer/articles/system-requirements/oneapi-base-toolkit/2025.html#inpage-nav-1-1

For more detail, see the llama.cpp SYCL backend document: https://github.com/ggml-org/llama.cpp/blob/master/docs/backend/SYCL.md

The examples below assume the default install location, C:\Program Files (x86)\Intel\oneAPI.

Compile the CPU & GPU dynamic libraries

In Windows PowerShell, set up the environment variables:

cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'
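
Once the environment is active, you can sanity-check that SYCL sees your GPU with sycl-ls, a utility that ships with the oneAPI Base Toolkit; it should list the Intel GPU among the available devices:

sycl-ls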

Build the libraries, from the Ollama root directory:

cmake -B build -G "Ninja" -DGGML_SYCL=ON -DGGML_SYCL_TARGET=INTEL -DGGML_CPU_ALL_VARIANTS=ON -DGGML_BACKEND_DL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icx -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -j 

Build the Go source:

go build -o ollama.exe

Run

From the Ollama root directory:

cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'

set OLLAMA_INTEL_GPU=true
set OLLAMA_INTEL_IF_TYPE=SYCL
set OLLAMA_NUM_GPU=999
set SYCL_CACHE_PERSISTENT=1
set OLLAMA_LIBRARY_PATH=./build/lib/ollama
# run ollama server
.\ollama.exe serve

Or use a batch script, ollama-intel-gpu.bat:

set OLLAMA_INTEL_GPU=true
set OLLAMA_INTEL_IF_TYPE=SYCL
set OLLAMA_NUM_GPU=64
set SYCL_CACHE_PERSISTENT=1
set OLLAMA_LIBRARY_PATH=./build/lib/ollama
set ONEAPI_ROOT=C:\Program Files (x86)\Intel\oneAPI
set PATH=%PATH%;%ONEAPI_ROOT%\2025.1\bin;./build/lib/ollama;

.\ollama.exe serve

Notes:

  1. set OLLAMA_NUM_GPU=xxx
    xxx must be set manually, according to how many model layers fit in video memory. For example, my T140 has 16 GB of shared video memory, so I set it to 64.
  2. The following two environment variables are required when using SYCL to discover Intel GPUs (see the sketch after the test commands below):
    set OLLAMA_INTEL_GPU=true
    set OLLAMA_INTEL_IF_TYPE=SYCL (OLLAMA_INTEL_IF_TYPE is used in both the Go and C code of Ollama and llama.cpp, and has the same name as the build parameter)
  3. For pure CPU inference, a known bug requires temporarily deleting the ggml_sycl library.
# run ollama test client 1
.\ollama.exe run deepseek-r1:1.5b --verbose
# run ollama test client 2
.\ollama.exe run qwen3:4b-fp16 --verbose
# run ollama test client 3
.\ollama.exe run gemma3:12b --verbose
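
For reference, here is a hypothetical sketch of how discovery can gate on the two environment variables from note 2. The control flow below is invented for illustration; the PR's real logic is split across discover/gpu.go and discover/gpu_info_sycl.c.

// Hypothetical illustration: enable SYCL enumeration only when the same
// environment variables the run scripts above set are present.
#include <sycl/sycl.hpp>
#include <cstdio>
#include <cstdlib>
#include <cstring>

static bool env_equals(const char *name, const char *want) {
    const char *val = std::getenv(name);
    return val != nullptr && std::strcmp(val, want) == 0;
}

int main() {
    if (!env_equals("OLLAMA_INTEL_GPU", "true") ||
        !env_equals("OLLAMA_INTEL_IF_TYPE", "SYCL")) {
        std::puts("SYCL discovery disabled; other backends handle the GPU");
        return 0;
    }
    const auto gpus = sycl::device::get_devices(sycl::info::device_type::gpu);
    std::printf("SYCL discovery enabled: %zu GPU device(s) found\n", gpus.size());
    return 0;
}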

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Reference: github-starred/ollama#18646