Mirror of https://github.com/ollama/ollama.git (synced 2026-05-07 00:22:43 -05:00)
Open · opened 2026-04-12 10:32:24 -05:00 by GiteaMirror · 72 comments
Originally created by @taep96 on GitHub (Dec 18, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1590
Originally assigned to: @dhiltgen on GitHub.
@6543 commented on GitHub (Dec 19, 2023):
also looking forward ;)
PS: Intel's IPEX doesn't look widely supported, though that would be nice,
so a fallback option would be to target the Vulkan API 🤔
@technovangelist commented on GitHub (Dec 19, 2023):
Hi, thanks so much for submitting your issue. At the moment we do not support inference using Intel's GPUs. I'll leave this issue open to track adding Intel support in the future.
@itlackey commented on GitHub (Jan 12, 2024):
+1 for IPEX support
Would it be possible to include oneAPI to support this? OpenCL is currently not working well with Intel GPUs. Vulkan may also be a decent option.
@Leo512bit commented on GitHub (Feb 4, 2024):
It looks like llama.cpp now supports SYCL for Intel GPUs. Is Arc support now possible?
https://github.com/ggerganov/llama.cpp/pull/2690
@uxdesignerhector commented on GitHub (Feb 4, 2024):
The last Automatic1111 update, 1.7.0, included IPEX and initial support for Intel Arc GPUs on Windows; maybe someone could have a look and see what they have done to make it possible. I know this is for Windows only, but it shows that the integration is possible, and on Linux it should be easier since Windows support came later.
I'm aware that WSL may be a different beast entirely; I remember having a lot of trouble installing Automatic1111 and accessing my Intel Arc GPU due to some limitation with the memory and privileges hardcoded into WSL.
@felipeagc commented on GitHub (Feb 12, 2024):
Hey everyone, I made some progress on adding Intel Arc support to ollama: #2458
@ghost commented on GitHub (Feb 13, 2024):
Thank you @felipeagc
@tannisroot commented on GitHub (Apr 24, 2024):
Support for SYCL/Intel GPUs would be quite interesting because:
It is a very popular choice for home servers, since it has very good transcoding compatibility with Jellyfin and is also supported by Frigate for ML workloads.
With 6 GB of VRAM, it should be capable of running competent small models like llama3, which in combination with Home Assistant can be used to power a completely local voice assistant and destroy the likes of Alexa and Google Assistant comprehension-wise.
@Kamryx commented on GitHub (Apr 25, 2024):
Extremely eager to have support for Arc GPUs. I have an A380 sitting idle in my home server, ready to be put to use. As the above commenter said, it's probably the best price/performance GPU for this workload.
I have an ultra-layman, loose understanding of all this stuff, but have I correctly surmised that llama.cpp essentially already has Arc support, and it just needs to be implemented/merged into Ollama? And if that's the case, are we probably in the final stretch?
@asknight1980 commented on GitHub (May 5, 2024):
I too have an A380 sitting idle in my R520 anxiously waiting for Ollama to recognize it. Thank you all for the progress you have contributed to this.
@kozuch commented on GitHub (Jun 6, 2024):
Is this now done with the merge of https://github.com/ollama/ollama/pull/3278, which was released in v0.1.40?
@dhiltgen commented on GitHub (Jun 6, 2024):
@kozuch not quite. It's close.
If you build locally from source, it should work, but we haven't integrated it into our official builds yet.
@uxdesignerhector commented on GitHub (Jun 7, 2024):
@dhiltgen do you know if this will work on WSL or Windows or only Linux?
@dhiltgen commented on GitHub (Jun 7, 2024):
The Linux build is already covered in #4876 and my goal is to enable Windows as well. This doc implies WSL2 should work.
@marcoleder commented on GitHub (Jun 11, 2024):
Looking forward to it! Let me know once it is available for Windows :)
@kozuch commented on GitHub (Jun 12, 2024):
Aren't the releases branched off main? Why did the https://github.com/ollama/ollama/pull/3278 change show up in the https://github.com/ollama/ollama/compare/v0.1.39...v0.1.40 changelist then?
@WeihanLi commented on GitHub (Jun 12, 2024):
Is there a release schedule for this?
@asknight1980 commented on GitHub (Jun 14, 2024):
How can I build it to enable Intel Arc?
@dhiltgen commented on GitHub (Jun 19, 2024):
Unfortunately users have reported crashing in the Intel GPU management library on some windows systems, so we've had to disable it temporarily until we figure out what's causing the crash. You can re-enable it by setting OLLAMA_INTEL_GPU=1
We don't have docs explaining how to build since it's not reliable yet. You can take a look at the gen_linux.sh and gen_windows.ps1 scripts here for some inspiration on the required tools.
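For anyone trying this, a minimal sketch of re-enabling the Intel GPU path at runtime, assuming the default Linux install with the standard ollama systemd service (the service name and override mechanism are the usual installer defaults, not taken from this thread):

```bash
# One-off: start the server with the Intel GPU path re-enabled
OLLAMA_INTEL_GPU=1 ollama serve

# Or persist it for the packaged systemd service via a drop-in override:
#   sudo systemctl edit ollama.service
#     [Service]
#     Environment="OLLAMA_INTEL_GPU=1"
#   sudo systemctl restart ollama.service
```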
@dhiltgen commented on GitHub (Jun 19, 2024):
Quick update - the crash is fixed on main now, but we'll keep it behind the env var I mentioned above until we get #4876 merged and the resulting binaries validated on linux and windows with Arc GPUs.
@ConnorMeng commented on GitHub (Jun 20, 2024):
Sorry if it isn't appropriate to ask this here, but when do you think this will reach the Docker image, and when might there be some documentation for it as well?
@YumingChang02 commented on GitHub (Jul 5, 2024):
Is there any way to manually or automatically detect the integrated GPU's memory size? It seems the iGPU is detected as a oneAPI compute device,
but its memory size is not detected correctly.
Note this is what I see using an Arc A380.
I am guessing this is what prevents the iGPU from working?
@asknight1980 commented on GitHub (Jul 5, 2024):
Are you able to do any inference at all on the Arc A380? I am showing it loading the model into GPU memory on my A380, but the processing is still happening on the CPU while the GPU sits idle.
Jul 05 18:25:13 cyka-b ollama[578885]: 2024/07/05 18:25:13 routes.go:1064: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_G>
Jul 05 18:25:17 cyka-b ollama[578885]: time=2024-07-05T18:25:17.512-05:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=oneapi compute="" driver=0.0 name="Inte>
NAME ID SIZE PROCESSOR UNTIL
tinyllama:latest 2644915ede35 827 MB 100% GPU 4 minutes from now
@MordragT commented on GitHub (Jul 7, 2024):
Is there any way to make Ollama find the Neo driver's libigdrcl.so library for OpenCL? On my setup Ollama always returns:
And then a bit later:
I reproduced the error with llama.cpp, and it seems that if llama.cpp can only find the Level Zero device and not the OpenCL one, it will throw the exception.
@Yueming-Yan commented on GitHub (Jul 11, 2024):
Looking forward :)
Intel(R) Iris(R) Xe Graphics
Appending some useful links:
https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md
@TheSpaceGod commented on GitHub (Jul 18, 2024):
Out of curiosity, what is holding up this PR (https://github.com/ollama/ollama/pull/4876) making it to main? It looks like it's passing all the relevant PR tests.
I think this would be a real game changer for all the people running small LLM models via Docker on Intel NUC-style computers like myself.
@tannisroot commented on GitHub (Jul 18, 2024):
The Windows driver for Intel is crashing with Ollama.
Honestly, as a Linux user it's a little bit annoying; I imagine the majority of people who want to use Ollama with an Intel GPU plan to do so in their Linux box.
It's also not guaranteed Intel will fix it any time soon. I remember another open source project, DXVK, encountered major crashing bugs exclusive to the Windows Intel driver, and it took years for things to get fixed AFAIK (if they are even fully fixed).
@lirc571 commented on GitHub (Jul 18, 2024):
Some work is being done at #5593 and on the llama.cpp side by Intel people. Looks like they are actively working on it!
@tannisroot commented on GitHub (Jul 19, 2024):
Oh then that is very good news!
@MarkWard0110 commented on GitHub (Jul 22, 2024):
Does this include support for integrated GPUs? For example, the Intel Core i9-14900K has an integrated GPU. When I enable the feature (OLLAMA_INTEL_GPU=1) on Ubuntu Server 22.04 it crashes.
I am curious to know if there are dependencies that need to be installed for this to work.
@TheSpaceGod commented on GitHub (Aug 7, 2024):
Any idea how this ticket will affect this effort? Is IPEX or Vulkan the better route for Intel GPUs?
https://github.com/ollama/ollama/issues/2033
@asknight1980 commented on GitHub (Aug 9, 2024):
It seems as if IPEX is the only way at this time, and that's only if you use Ubuntu 22.04; it doesn't work at all for me on 24.04. Intel has released their own guidance here: https://www.intel.com/content/www/us/en/content-details/826081/running-ollama-with-open-webui-on-intel-hardware-platform.html Only follow this guide if you can babysit the system it is installed on at every reboot, because the guide does not enable any automatic service startup the way Ollama and Open WebUI include/intend by default. Very clunky, if you can even get it to work. I'm almost at the point of discarding my Intel GPUs in favor of AMD/NVIDIA because those simply work with far less hassle.
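For anyone going the same route, a rough sketch of wrapping the ipex-llm build in a systemd unit so it does start at boot; the binary path is a placeholder and any environment variables Intel's guide requires would need to be added, so treat this as a starting point rather than a tested recipe:

```bash
# Hypothetical unit; point ExecStart at wherever the ipex-llm ollama binary actually lives
sudo tee /etc/systemd/system/ollama-ipex.service >/dev/null <<'EOF'
[Unit]
Description=Ollama (ipex-llm build)
After=network-online.target

[Service]
# Placeholder path; replace with the real binary and any env vars from Intel's guide
ExecStart=/opt/ipex-llm/ollama serve
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now ollama-ipex.service
```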
@TheSpaceGod commented on GitHub (Aug 9, 2024):
Based on this chart, nearly all of the Intel GPUs/iGPUs should support Vulkan 1.2+: https://www.intel.com/content/www/us/en/support/articles/000005524/graphics.html
I am really hoping this is an easier implementation route, because I agree, IPEX seems pretty hard to use in its current form. Even if Vulkan is slower than IPEX, some Intel GPU support will be better than nothing.
@celesrenata commented on GitHub (Aug 13, 2024):
I am also running into the 0-memory-available iGPU issue, borrowing configs from @MordragT, on NixOS with Intel 185H chips.
@slyoldfox commented on GitHub (Aug 14, 2024):
I was seeing exactly the same issue
@celesrenata commented on GitHub (Aug 19, 2024):
I am trying another route: I have built SR-IOV support for my Arc iGPU and tested it successfully in Kubernetes with Plex. Once RAM arrives today, I will attempt to see if I can run oneAPI/IPEX-LLM from KubeVirt VMs to hand off to Ollama. My attempt yesterday showed that it offloaded to the CPU, but I had no RAM left. I'll try to update this thread if I have any success.
@sambartik commented on GitHub (Aug 21, 2024):
Hi, I am on Ubuntu 24.04 LTS and got it working by using their container image
intelanalytics/ipex-llm-inference-cpp-xpu:latest (more info here). The only caveat was that the container was not able to detect my GPU. Digging deeper, I found that the kernel that came with Ubuntu, 6.8.0-40-generic, was causing the issue. The workaround until it gets fixed was to set these environment variables:
After that I got my GPU detected. Also, because it is Docker, I don't need to worry about service startup, as it is handled by Docker.
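For reference, a rough sketch of how such a container is typically launched; the device mapping, port, and volume below are assumptions, and the kernel-workaround variables mentioned above are deliberately not reproduced here:

```bash
# Pass the Intel GPU through and expose the default Ollama port; the serve command
# inside the container follows the ipex-llm quickstart docs.
docker run -it --rm \
  --device /dev/dri \
  -p 11434:11434 \
  -v ollama-models:/root/.ollama \
  intelanalytics/ipex-llm-inference-cpp-xpu:latest \
  bash
```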
@Xyz00777 commented on GitHub (Sep 3, 2024):
I'm a little bit confused at the moment: what is the current state? I have Ollama running natively on my Debian Linux VM with a passed-through Arc 770, without a container. I have the environment option for Intel GPUs enabled and I can see the card with lspci,
but when I start the Ollama service it says that it can't find any GPU.
@tannisroot commented on GitHub (Sep 3, 2024):
How did you compile Ollama?
@Xyz00777 commented on GitHub (Sep 3, 2024):
I didn't compile it, I just downloaded it with Ansible. That's what I had done:
@tannisroot commented on GitHub (Sep 3, 2024):
The release version of Ollama is not compiled with oneAPI (Intel) support.
You need to fetch the repo, install the Level Zero drivers and intel-basekit (info on Intel's website), activate the runtime, and then compile with certain env vars enabled.
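Roughly, the flow looks like the sketch below, assuming the pre-0.4 source tree and the oneAPI Base Toolkit installed under /opt/intel/oneapi; exact steps vary by version, so treat it as an outline rather than official build docs:

```bash
# Prerequisites: Intel GPU / Level Zero drivers and intel-basekit installed
source /opt/intel/oneapi/setvars.sh   # activate the oneAPI runtime

git clone https://github.com/ollama/ollama.git
cd ollama
go generate ./...                     # builds the GPU runners; picks up oneAPI if the runtime is active
go build .

OLLAMA_INTEL_GPU=1 ./ollama serve     # the Intel path was gated behind this env var at the time
```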
@Xyz00777 commented on GitHub (Sep 3, 2024):
So I have to wait until an Intel-compatible Linux package is available for download, or I have to compile it myself with the packages you mentioned installed? And would I still need those packages once an Intel-compatible downloadable package is released? (Sorry, I'm a beginner in this area and have no experience.)
@tannisroot commented on GitHub (Sep 3, 2024):
I believe Ollama does plan to provide an Intel-supporting package at some point in the near future.
Meanwhile you can try building it on your own. If you need help with that, @ me on the official Ollama Discord; I'll be glad to assist you during European daytime hours!
@xiangyang-95 commented on GitHub (Oct 4, 2024):
It would be great if we could download, extract, and run Ollama on an Intel GPU directly; the example would be something like the sketch below.
I am willing to contribute this feature if needed.
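Something like this hypothetical flow; the archive name and URL are placeholders for illustration, not a real release asset:

```bash
# Hypothetical flow: download a prebuilt Intel-enabled bundle, extract, and run
curl -LO https://example.com/ollama-linux-amd64-oneapi.tgz
tar -xzf ollama-linux-amd64-oneapi.tgz
OLLAMA_INTEL_GPU=1 ./ollama serve
```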
@semidark commented on GitHub (Oct 10, 2024):
With the help of @tannisroot, I successfully compiled Ollama with Intel GPU support from source.
The process was quite straightforward, and everything went smoothly. I had high hopes since I've been running llama.cpp standalone with my iGPU for the past few weeks. However, when I ran Ollama, it detected my iGPU, but the integrated llama.cpp server did not use it.
I suspect this is related to Ollama's handling of the unified memory on the iGPU, as mentioned by @dhiltgen in this comment.
Here is some output where Ollama reports that the memory size is 0 Bytes:
To investigate, I ran the ollama_llama_server directly without using Ollama, and it seemed to recognize my iGPU and unified RAM as expected:
So, how can I get Ollama to recognize the unified memory on my iGPU? Could we consider a quick fix to the GPU identification code, perhaps forcing Ollama to work with unified memory when the ZES_ENABLE_SYSMAN=1 environment variable is set?
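The combination being asked about would look something like this; whether it actually makes the unified memory visible is exactly the open question, so treat it as an experiment rather than a fix:

```bash
# Experiment: surface GPU memory via the sysman interface and force the Intel path on
export ZES_ENABLE_SYSMAN=1
export OLLAMA_INTEL_GPU=1
./ollama serve
```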
@Gunnarr970 commented on GitHub (Oct 11, 2024):
Here is an ipex-llm beta. It allows Ollama to work on a very old "HD Graphics 630" using SYCL.
@celesrenata commented on GitHub (Oct 13, 2024):
I did in the end have success with my little project.
https://github.com/celesrenata/nixos-k3s-configs
Specifically with Ubuntu KubeVirt VMs. So if you want to borrow from my work, I suggest looking into https://github.com/celesrenata/nixos-k3s-configs/blob/main/kubevirt/ipex-1x/bootstrap-ipex-fleet.sh, which works with Ubuntu 24.04 LTS.
@WoutvanderAa commented on GitHub (Nov 6, 2024):
Do the Arc cards already work? I have an Intel Arc A380 in my Unraid server at the moment and I would love to use it for Ollama.
@yurhett commented on GitHub (Nov 10, 2024):
Hi @dhiltgen,
Thank you for your hard work and dedication to improving ollama. I've reviewed the changes introduced in the 0.4 update and noticed that a significant portion of the codebase has been restructured, and the build system has transitioned to using make. Consequently, support for Intel GPUs has been excluded in this update.
However, it's worth noting that upstream llama.cpp has now officially added support for Intel GPUs. Considering this development, I would like to inquire if there are plans to integrate Intel GPU support into future releases of ollama.
Thank you for your time and consideration.
@pepijndevos commented on GitHub (Nov 10, 2024):
It seems indeed that 0.4 just does not build Intel Arc support using the method suggested above. Is there another method?
For now it seems git checkout v0.3.14 will get you... somewhere, but I'm currently still playing whack-a-mole with compiler errors. The reason I'm trying to build from source is that the ipex-llm bundled version appears broken:
https://github.com/intel-analytics/ipex-llm/issues/12374
Update: I built from source, result:
@yurhett commented on GitHub (Nov 11, 2024):
Thanks! That's a big discovery. I will try the 0.3 version on Windows to verify its correctness. I hope someone can guide this issue back on track.
Update:
Given the current situation, I would like to know if the collaborators are willing to continue supporting Intel GPU in ollama. The current state is quite problematic, and clarity on this matter would help the community determine the next steps.
@peremenov commented on GitHub (Nov 11, 2024):
Hello!
I managed to run an official IPEX Docker image with Ollama. My system specs are: AMD Ryzen 5 5600, 128 GB of RAM, an Intel Arc A380, and Ubuntu 24.04 LTS. There are issues I faced during the experiments that I didn't manage to resolve: Ollama only managed to work with 1 layer of the model offloaded to the GPU, and the logs don't show anything meaningful (probably due to the lower-tier GPU). Also, older models work fine, but the newer ones not so much. I think it happens because Ollama has evolved over time, and there is an older version in the IPEX Docker image.
Any insights or suggestions regarding these issues would be appreciated.
Here is a docker-compose file which I used to run the container.
Thank you
@marcin-kruszynski commented on GitHub (Nov 19, 2024):
@peremenov
Thanks for the YAML, it's working very well with my Meteor Lake Arc iGPU.
Unfortunately, ipex-llm uses Ollama version 0.3.6, which will not run some newer models (e.g. llama3.2-vision).
The trick with layers is probably setting OLLAMA_NUM_GPU to 999.
I found this in the ipex-llm docs:
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md
@peremenov commented on GitHub (Nov 20, 2024):
Hey, @marcin-kruszynski
Thank you for the response. Yes, I'm aware of the OLLAMA_NUM_GPU setting. I tried different values, but OLLAMA_NUM_GPU=1 is the only value with which I managed to get stable performance. OLLAMA_NUM_GPU=2 works OK, but crashes sometimes. OLLAMA_NUM_GPU=999 crashes every time, even on small models that should fit in VRAM. I don't know, maybe it's somehow specific to my configuration. You are totally right about the older Ollama version used in the IPEX image; it can't run llama3.2.
I'm really looking forward to seeing https://github.com/ollama/ollama/pull/5059 working in future releases, because as I understand it the authors of Ollama aren't planning to add support for SYCL, oneAPI, or anything like that.
@Kamryx commented on GitHub (Dec 10, 2024):
Hey everyone, just wanted to check in again: how are we looking on this now, both at present and in the near future? Again, my understanding is unfortunately pretty limited, but from what I've gathered Arc support was here and then got removed in the 0.4 update?
I’ve seen there’s another fork that aims to be a comprehensive and easy to install Arc focused Ollama instance, but it’d be really nice to just rely on the main Ollama project and not have to juggle or flip between different Ollama builds on my system especially if I change GPU vendors. I don’t actually even know if the aforementioned fork is working right now either.
But I'm sure most of us are aware of the new Battlemage GPUs and... yeah, they're yet again even more compelling than Arc was before. 16 GB A770s are $230 right now too, with memory bandwidth that beats most of NVIDIA's 40 series. So I'm pretty antsy. I could use llama.cpp (I think?), but the Ollama ecosystem is so awesome and I would love to stick with it.
@Leo512bit commented on GitHub (Dec 10, 2024):
Can you link to the fork? I'd like to take a look at it. Thanks.
@Kamryx commented on GitHub (Dec 10, 2024):
Yeah, this here. It claims to require Ubuntu too; I'm running a different Linux flavor. That may just be a recommendation, idk.
https://github.com/mattcurf/ollama-intel-gpu
@Leo512bit
Edit: looking again, it seems to just specify Ubuntu for Arc kernel support, so maybe that's not a problem.
@DocMAX commented on GitHub (Dec 15, 2024):
This is what I get with Ollama 0.5.1. Does it mean they are supported? When running a model they are not used, only the NVIDIA card is.
@pauleseifert commented on GitHub (Dec 16, 2024):
That's the same problem I have, @peremenov. I haven't figured out the cause, but I opened an issue at intel-analytics/ipex-llm#12513. OLLAMA_NUM_GPU values lower than the number of layers in the chosen model unfortunately mean that inference is offloaded to the CPU.
@DocMAX What settings did you use?
@DocMAX commented on GitHub (Dec 16, 2024):
No special settings, just the ollama package from Arch Linux. I also installed the Intel oneAPI libraries, of course.
@vladislavdonchev commented on GitHub (Dec 21, 2024):
@Kamryx
Oh man, I'm losing my mind here... Which WSL kernel version did you manage to get this working with?
I tried a couple, and even though clinfo lists the A770 GPUs, dmesg shows errors trying to load the driver... Cards are working perfectly fine on the Windows host.
I even listed an issue on ipex-llm:
https://github.com/intel-analytics/ipex-llm/issues/12592
@DocMAX commented on GitHub (Jan 5, 2025):
OK, from what I understand I can only use the IPEX "bundled" Ollama with Intel Arc cards. It worked with the right libraries installed (Arch Linux). But from what I understand, I CAN'T run multiple GPU brands alongside Intel at the moment, right? We still need an official "ipex-runner".
@uxdesignerhector commented on GitHub (Jan 23, 2025):
I was able to run this using WSL2, which means full Windows compatibility. I just had to disable my integrated GPU in Windows Device Manager, otherwise I would encounter the following error when running ./ollama run qwen2.5-coder:
Error: llama runner process has terminated: exit status 2
Could the Arc 770 be the cheapest AI card on the market right now? This thing is very fast; it is ageing like fine wine.
@wbste commented on GitHub (Jan 24, 2025):
It takes a few steps to set up, but the ipex version of Ollama has been impressive. https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md
@charlescng commented on GitHub (Jan 24, 2025):
I can run the image from https://github.com/mattcurf/ollama-intel-gpu stably on Unraid 7.0.0 (kernel 6.6 with the i915 driver) with an Arc A380, with the following environment variables passed in:
The Arc A380 only has 6 GB of VRAM, but the llama3.1:8b model runs on it. I get ~12 response tokens per second with that model, and around 60 tokens per second with llama3.2:1b.
@baoduy commented on GitHub (Feb 24, 2025):
Looking forward to Intel Arc GPUs being supported natively soon.
Currently I'm using the workaround here: https://github.com/intel/ipex-llm/blob/main/docs/mddocs/Quickstart/ollama_quickstart.md, but I would still prefer native support from the Ollama team.
@huichuno commented on GitHub (May 10, 2025):
PyTorch 2.7 delivers significant functionality and performance enhancements on Intel GPU architectures to streamline AI workflows: https://pytorch.org/blog/pytorch-2-7-intel-gpus/
@MaoJianwei commented on GitHub (Jul 14, 2025):
Can Ollama use an Intel integrated GPU to speed up inference? E.g. the Intel UHD Graphics 630 of an i5-10400.
@ericcurtin commented on GitHub (Oct 13, 2025):
We added Vulkan support to Docker Model Runner, so it covers this feature:
https://www.docker.com/blog/docker-model-runner-vulkan-gpu-support/
We've also put effort into putting all our code in one central place to make it easier for people to contribute. Please star, fork, and contribute.
https://github.com/docker/model-runner
We have Vulkan support. You can pull models from Docker Hub, Hugging Face, or any other OCI registry, and you can also push models to Docker Hub or any other OCI registry.
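A brief sketch, assuming the Docker Model Runner CLI plugin is installed; the model name is just an example from Docker Hub's ai/ namespace:

```bash
# Pull a model from an OCI registry and run a one-shot prompt against the local runner
docker model pull ai/llama3.2
docker model run ai/llama3.2 "Hello"
```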
@Xyz00777 commented on GitHub (Nov 9, 2025):
Heyho, what is the state of development? Sadly it has now been open for nearly two years :(
@MaoJianwei commented on GitHub (Nov 10, 2025):
https://github.com/ggml-org/llama.cpp/issues/1956
@Xyz00777 commented on GitHub (Nov 12, 2025):
It's a workaround to use llama.cpp instead of Ollama, but not a solution, and the same goes for the experimental Vulkan support... :/
At least as far as I understood, Ollama is not at the current state of llama.cpp(?), and because of that it's not working in Ollama.