mirror of https://github.com/ollama/ollama.git
Open · opened 2026-04-12 13:47:51 -05:00 by GiteaMirror · 59 comments
Originally created by @ivanbrash on GitHub (Jun 20, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5186
Originally assigned to: @dhiltgen on GitHub.
Hello! I want to buy a Lenovo Xiaoxin 14 AI laptop with an AMD Ryzen 7 8845H for my birthday, and I will install Artix Linux on it. Will you add AMD Ryzen NPU support to Ollama on Linux and Windows? The AMD Ryzen NPU driver for Linux is already available on GitHub:
https://github.com/amd/xdna-driver.git
Sorry for my bad English, please!
@billtown commented on GitHub (Jun 20, 2024):
I have an AMD Ryzen 7 7840U w/ Radeon 780M Graphics and recently got inference working on the iGPU.
On Linux, the ROCm support works for me. I have to set HSA_OVERRIDE_GFX_VERSION; people seem to have varying luck with the version value. Not sure if this helps at all.

podman run -d --name ollama --replace --pull=always --restart=always -p 0.0.0.0:11434:11434 -v ollama:/root/.ollama --stop-signal=SIGKILL --device /dev/dri --device /dev/kfd -e HSA_OVERRIDE_GFX_VERSION=11.0.2 -e HSA_ENABLE_SDMA=0 docker.io/ollama/ollama:rocm

@jasalt commented on GitHub (Jun 23, 2024):
There were some recent patches to llamafile and llama.cpp, linked here, that add the ability to use more RAM than what is dedicated to the iGPU (HIP_UMA): https://github.com/ROCm/ROCm/discussions/2631#discussioncomment-9849190. Looks promising.
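For anyone who wants to try the HIP_UMA path jasalt mentions: llama.cpp exposes it as a build-time option. A minimal sketch, assuming a mid-2024 tree where the flag is spelled GGML_HIP_UMA (earlier trees spell it LLAMA_HIP_UMA, so check your checkout):

```sh
# Build llama.cpp with the HIP backend plus unified memory, letting the iGPU
# allocate from system RAM instead of only the VRAM slice carved out in BIOS.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_HIPBLAS=ON -DGGML_HIP_UMA=ON
cmake --build build --config Release -j
```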
@coreybutler commented on GitHub (Aug 14, 2024):
Running an AMD Ryzen 9 8945HS here. Would love to see support for this.
@2018wzh commented on GitHub (Aug 16, 2024):
Running an AMD AI 9 370HX here, same as above. Hoping to see support.
@grigio commented on GitHub (Aug 23, 2024):
Here is some news, but Linux support seems lacking:
https://community.amd.com/t5/ai/get-a-powerful-ai-assistant-with-document-chat-accelerated-by/ba-p/704092
https://lmstudio.ai/ryzenai
@henry2man commented on GitHub (Sep 5, 2024):
@billtown What's the performance of your setup? I've recently purchased a Ryzen 9 8945HS + 64 GB RAM mini PC for some Docker + VM use and (hopefully) some lightweight LLM workloads with Ollama.
PS: I'm not an expert on Ollama internals, but I have enough experience to help with testing on my own hardware in order to make this request a reality.
@grigio commented on GitHub (Sep 5, 2024):
Can you share how many tokens/s you get with llama3.1-Q4_K_M or similar?
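For anyone sharing numbers: the stats quoted in the replies below come straight from Ollama's CLI, which prints them when a prompt is run with --verbose. A minimal way to produce comparable figures (the model tag is just an example):

```sh
# --verbose makes `ollama run` print total/load/prompt-eval/eval durations
# plus the prompt-eval and eval rates in tokens/s after the response.
ollama pull llama3.1:8b
ollama run llama3.1:8b --verbose "Who is Bill Gates?"
```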
@fan123450 commented on GitHub (Sep 11, 2024):
Running an AMD 8845HS here, same as above. Hoping to see support for both the GPU and the NPU.
@billtown commented on GitHub (Sep 11, 2024):
total duration: 22.204829879s
load duration: 16.99589ms
prompt eval count: 1411 token(s)
prompt eval duration: 625.952ms
prompt eval rate: 2254.17 tokens/s
eval count: 269 token(s)
eval duration: 20.76486s
eval rate: 12.95 tokens/s  (after building some context)
llama3:8b 365c0bd3c000 6.7 GB 100% GPU
radeontop at least shows VRAM, shaders, and pipes hitting 100% when running. I have 16 GB allocated in the BIOS:
0.80G / 0.80G Memory Clock 100.00%
2.13G / 2.70G Shader Clock 78.81%
Graphics pipe 99.17%
Shader Interpolator 92.50%
Clip Rectangle 100.00%
These are what come alive in radeontop. And then a single CPU thread (ollama) hits 100%.
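To reproduce this check on similar hardware, two terminals are enough (model tag as used above):

```sh
# Terminal 1: watch the iGPU's shader and memory activity live.
sudo radeontop

# Terminal 2: run a prompt, then ask Ollama where the model is resident.
ollama run llama3:8b "hello" >/dev/null
ollama ps   # PROCESSOR should read "100% GPU" when offload is working
```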
@fan123450 commented on GitHub (Sep 12, 2024):
Great! Is there a reference with detailed implementation steps? If there is, I will be very grateful!
@evansrrr commented on GitHub (Sep 24, 2024):
Running a Lenovo Xiaoxin Pro 16 with an R7-8845H as the processor; same as above. Hope to see AMD NPU support enabled soon!
@robfuscator commented on GitHub (Oct 22, 2024):
We'll have to wait at least until February before this is even possible on Linux using a mainline kernel:
https://www.phoronix.com/news/AMD-XDNA-Linux-Driver-v4
@ivanbrash commented on GitHub (Nov 29, 2024):
I bought a Honor MagicBook X14 Pro with a Ryzen 7 7840HS and installed Gentoo with KDE on it. So far I have not tried to install Ollama on it, since there is no NPU support yet. But when it appears, I will definitely install it.
@ToeiRei commented on GitHub (Nov 29, 2024):
I did play around with AI accelerators a bit, and my Framework has the same CPU as your MagicBook. The TOPS value was disappointing, to put it mildly. Don't get your hopes up: 25 TOPS max across different applications. It's a blast for image recognition, OCR, and the like, but falls flat on LLM tasks.
@JiapengLi commented on GitHub (Dec 20, 2024):
Here is my test result:
The performance is not as good as expected.
@JiapengLi commented on GitHub (Dec 20, 2024):
Related topics:
#3004
@Pekkari commented on GitHub (Dec 20, 2024):
@JiapengLi I don't think that is using your NPU in any way. The amd-xdna driver will most likely land in Linux 6.14; then you need the userspace libraries from AMD to talk to it (like ROCm for AMD GPUs, or CUDA for NVIDIA), and then Ollama needs code to call those libraries, which is the reason this issue exists. I'm no Ollama maintainer, though; they may know more details about the parts I mentioned.
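Following that breakdown, the kernel-driver layer of the stack is easy to check on a 6.14+ kernel; a quick sketch (NPUs enumerate under the DRM accel subsystem, so the node lives under /dev/accel rather than /dev/dri):

```sh
# Is the amdxdna driver built for this kernel, and is it loaded?
modinfo amdxdna | head -n 3
lsmod | grep amdxdna

# Did it create an accel device node for the NPU?
ls -l /dev/accel/
```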
@grigio commented on GitHub (Dec 21, 2024):
@JiapengLi I think Linux 6.14 should improve the situation; keep us updated.
https://www.phoronix.com/news/Ryzen-AI-NPU6-Linux-6.14
@sinchichou commented on GitHub (Dec 26, 2024):
So I'm trying to get an LLM running on the AMD NPU.
But it looks like it needs Visual Studio 2022 Community, CMake, and Anaconda or Miniconda.
And the libraries all need the Ryzen AI SW stack or something similar.
The ONNX Runtime supported list does not show the AMD NPU.
Maybe I can try DirectML or ROCm.
I'll try that later.
@GreyXor commented on GitHub (Mar 18, 2025):
Yes, I confirm that I can run the amdxdna driver on my 6.14 kernel.
@grigio commented on GitHub (Mar 18, 2025):
@GreyXor do you see improvements in tokens/sec over CPU or Vulkan?
@GreyXor commented on GitHub (Mar 18, 2025):
I mean, amdxdna is loaded and working, but I don't have an app that can actually run inference on it. Want me to try something? I would be happy to run some benchmarks.
amdxdna has been some kind of vaporware since mid-2023. At least now the driver is working, but nothing uses it. I asked AMD for some docs here: https://github.com/AMD-AIG-AIMA/Instella/issues/1 and we still have to wait for support here: https://github.com/ggml-org/llama.cpp/issues/1499
@wishx commented on GitHub (Mar 24, 2025):
The 6.14 kernel has been released and is widely available now. Just letting folks know.
https://www.phoronix.com/news/Linux-6.14
@evansrrr commented on GitHub (Apr 10, 2025):
Yeah, and I saw that the Linux platform has taken a leap in AI tech, e.g. with clear gains in running sd-webui-aki and the like.
@DocMAX commented on GitHub (Apr 14, 2025):
I'm going to buy a laptop for local LLM use. Can you recommend a good CPU for this? I prefer Lenovo's ThinkPad line. Any experiences? Should I wait for the next generation?
@XenoAmess commented on GitHub (Apr 14, 2025):
Well, I hate Apple, but... just buy a MacBook (IMO).
@DocMAX commented on GitHub (Apr 14, 2025):
I hate Apple too, so that is not an option...
@Bush-cat commented on GitHub (Apr 14, 2025):
You'd want a Ryzen AI Max CPU.
@DocMAX commented on GitHub (Apr 14, 2025):
How does it perform with ROCm and Ollama (tokens/s)? I can't find any benchmark comparison list anywhere.
@XenoAmess commented on GitHub (Apr 14, 2025):
I have an AMD CPU with an NPU.
I have an Ubuntu system with a 6.14 Linux kernel.
I have no way to run any LLM backend with them.
So thanks for your "hell" for AI, AMD.
Oh, maybe I should say the word "help", but I don't think they deserve it.
@Bush-cat commented on GitHub (Apr 14, 2025):
It's the fastest iGPU, as it has very fast 4-channel memory.
You can only get faster with a big discrete GPU, but those have less VRAM, so you're limited in the size of the models you can run.
@Bush-cat commented on GitHub (Apr 14, 2025):
Welp, it was always more about marketing for Windows users and features, I guess; the first ads for the NPUs were for applications like MS Teams...
Also, don't expect much from the 15 TOPS of your Ryzen 8000 NPU (or the 10 TOPS of my Ryzen 7000); Copilot+ PCs require at least a 50 TOPS NPU to do anything.
I saw a person benchmarking the Ryzen 8000 NPU, and it took several minutes to finish an output with the tiniest Llama model.
@XenoAmess commented on GitHub (Apr 14, 2025):
The tiniest Llama model we've seen is 0.5B. I can't quite believe it would take minutes to handle requests at 0.5B, but...
Well, let's wait for AMD engineers to make it usable in another 2 years. Maybe there is still hope they can achieve it by then?
@Pekkari commented on GitHub (Apr 14, 2025):
The marketing info around suggests it may run using either LM Studio or vLLM; needless to say, on the Linux side one should always expect to tinker a bit to get those things working.
@bonswouar commented on GitHub (Apr 14, 2025):
Please, guys, try it out instead of speculating: https://github.com/ollama/ollama/pull/6282
Not sure if it correctly uses the NPU, but it's working!
That "several minutes" claim is probably fake news; I have an 8845HS, and the few models I've tried (8B to 15B) run pretty well (though of course it depends what you compare it to).
Definitely not "several minutes" for the "tiniest llama model", though.
@DocMAX commented on GitHub (Apr 14, 2025):
AMD 5800U APU with ROCm: Llama3.1 8B. Question: "Who is Bill Gates":
total duration: 1m7.077394148s
load duration: 22.877264ms
prompt eval count: 51 token(s)
prompt eval duration: 8.593335ms
prompt eval rate: 5934.83 tokens/s
eval count: 435 token(s)
eval duration: 1m7.044265099s
eval rate: 6.49 tokens/s
@Bush-cat commented on GitHub (Apr 14, 2025):
I saw Ryzen 7000 users get 2-10 tokens per second with an optimized Llama 3.1 8B model:
https://www.reddit.com/r/LocalLLaMA/comments/1d9m0z3/running_llama_3_on_the_npu_of_a_firstgeneration/
And with models smaller than 3B, the quality of the output is really bad.
@Bush-cat commented on GitHub (Apr 14, 2025):
You probably used the iGPU and not the NPU; I was only talking about the NPU, which is much slower than using the full iGPU.
@DocMAX commented on GitHub (Apr 14, 2025):
Can anyone benchmark an AMD HX 375 for me, please? I really wonder how fast it is. With all the hype around the AMD AI processors, I expect around 20 tok/s with Llama 3.1 8B.
@androidacy-user commented on GitHub (Jun 17, 2025):
People in this issue thread seem to be getting the GPU and NPU mixed up. Ollama can be forced to run on the iGPU, but it seems to completely lack support for the (much more efficient) NPU on these chipsets.
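"Forced to run on the iGPU" here means the ROCm override trick from earlier in the thread; a minimal native (non-container) sketch, noting that the right GFX version is hardware-specific and 11.0.2 is just the value billtown used above:

```sh
# Make ROCm treat the iGPU as a supported gfx target, then start the server.
export HSA_OVERRIDE_GFX_VERSION=11.0.2
ollama serve
```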
@reneleonhardt commented on GitHub (Jun 19, 2025):
The NPU seems comparable with 50 TOPS, but a lot of unified RAM always helps of course 😅
https://www.techpowerup.com/334223/amds-ryzen-ai-max-395-delivers-up-to-12x-ai-llm-performance-compared-to-intels-lunar-lake
https://en.wikipedia.org/wiki/List_of_AMD_Ryzen_processors#Ryzen_AI_300_series
It looks like NPU support in Ollama would be amazing to run LLMs even on notebooks ❤
@regulad commented on GitHub (Jun 22, 2025):
I have a notebook with the Ryzen AI Max+ 395 at my disposal. I was able to get iGPU inference to work in rootless podman with the following command, but still no NPU inference in sight.
@DocMAX If you're interested in my speed, here is a prompt from the 27B-parameter Gemma 3.
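The command itself isn't quoted above; a hypothetical rootless-podman invocation in the spirit of billtown's earlier one (the GFX override value for the Strix Halo iGPU is a guess, not regulad's actual setting) might look like:

```sh
# Hypothetical reconstruction, not regulad's actual command. The override
# value for the Ryzen AI Max+ 395 iGPU (gfx1151) is assumed, not confirmed.
podman run -d --name ollama -p 11434:11434 -v ollama:/root/.ollama \
  --device /dev/dri --device /dev/kfd \
  -e HSA_OVERRIDE_GFX_VERSION=11.5.1 \
  docker.io/ollama/ollama:rocm
```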
@padthaitofuhot commented on GitHub (Jul 27, 2025):
This please.
I have an AMD Ryzen AI 7 PRO 360 w/ Radeon 880M in this ThinkPad. It's not a very powerful NPU, but it would be super keen to run a tiny model on it for quick local autocomplete or for embedding vectors for RAG.
@androidacy-user commented on GitHub (Jul 27, 2025):
50 TOPS is plenty for a smaller or quantized model, or a larger one if you're willing to deal with slower inference times.
@muety commented on GitHub (Jul 27, 2025):
Would love to see how it performs on reasonably large models (like ~21B or so)!
@jcubic commented on GitHub (Jul 27, 2025):
The AMD NPU is supported by the mainline Linux kernel since 6.14, released in March 2025.
I wanted to buy a laptop with this NPU, and it would be great to be able to use bigger models with Ollama.
@gururise commented on GitHub (Sep 1, 2025):
NPU support can speed things up significantly. There are two other projects that support inference on AMD NPUs and show significant performance improvements over iGPU- or CPU-only setups:
@ha-pf-tickerer commented on GitHub (Sep 1, 2025):
Gaia and FastFlowLM are great projects that support the AMD Ryzen AI processors, but support in Ollama would be really, really great.
Our use case is a dedicated local AI mini PC, to be used by the kids as a better Google/Alexa search, and probably most of the time to give Home Assistant a local conversation agent using https://www.home-assistant.io/integrations/ollama/
This would allow an "it's too hot in here!" voice prompt to Home Assistant, letting Home Assistant correctly understand that the user "bob" sitting in the living room is not happy with the temperature, and that the Home Assistant server should lower the temperature in the room using the AC or the thermostats, based on the controls Home Assistant already has.
I promised this functionality to my SO in exchange for hanging the house full of Zigbee sensors and keeping seriously expensive AMD Ryzen AI mini boxes in the house :-)
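For context, the Home Assistant integration linked above is essentially a client for Ollama's HTTP API, so the conversation-agent setup boils down to pointing it at a box that answers this endpoint (model tag is an example):

```sh
# Minimal request against Ollama's chat endpoint, the same HTTP API the
# Home Assistant Ollama integration drives.
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1:8b",
  "messages": [{"role": "user", "content": "It is too hot in here!"}],
  "stream": false
}'
```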
@z0xca commented on GitHub (Dec 23, 2025):
Running an AMD 8845HS here too, same as above. Hoping to see NPU support.
@alerque commented on GitHub (Dec 23, 2025):
How is this affected by the merge of #13196?
@Pekkari commented on GitHub (Dec 24, 2025):
Not affected at all. That merge is about iGPU support, not the NPU, and from what I know the NPU in the 8845HS is not worth supporting, since the extra capacity it would provide in a hybrid setup (GPU + NPU) is not really a deal maker.
@bonswouar commented on GitHub (Dec 24, 2025):
Isn't the NPU supposed to be more energy efficient than the GPU, though?
The 8845HS being a laptop CPU, I'd say it could be a huge deal maker if it helps run models on battery.
But if it's not more energy efficient, and doesn't noticeably improve performance in a hybrid setup, then I really don't see the point, yeah.
@Pekkari commented on GitHub (Dec 24, 2025):
Don't kill the messenger; I'm just relaying what I heard from AMD. I'd love to see the support come anyway, since I bought the hardware for the NPU and suddenly ended up in the same situation :|
@alerque commented on GitHub (Dec 24, 2025):
Fair enough. I'm still figuring out what is what here.
Partly out of personal curiosity, and partly because I'm an Arch Linux packager looking over the ROCm-related packages wondering if there is anything we are missing out on that I could help fix... My personal hardware is an integrated AMD Ryzen AI 9 HX 370 w/ Radeon 890M, which I assume does have an NPU and would benefit from this requested support, correct? And also an AMD Ryzen 5 3600 6-core processor with a discrete Radeon RX 5500 graphics card, for which I assume there is no NPU, correct?
Is there somewhere that has commands to actually ferret this out, or a good table showing where AMD has NPUs at all and by what they are/are not supported?
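On the "commands to ferret this out" question, a sketch under the assumption that XDNA NPUs enumerate as signal-processing PCI devices (they do on Phoenix-class parts; newer parts may label differently):

```sh
# Look for the NPU on the PCI bus and for the amdxdna driver binding to it.
lspci | grep -i -e 'signal processing' -e 'ipu' -e 'npu'
dmesg | grep -i amdxdna
ls /dev/accel/ 2>/dev/null || echo "no accel nodes: no supported NPU driver loaded"
```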
@Pekkari commented on GitHub (Dec 24, 2025):
I fail to remember exactly, but I read something about Strix Point support coming, which I think is your hardware. However, I think it was GPU support in ROCm, so chances are you may still be in the safe zone. The 8845HS is prior to Strix Point, and after it comes Strix Halo, which was the first intended to be supported; but a community push made support for Strix Point happen in ROCm as well.
@z0xca commented on GitHub (Dec 24, 2025):
The Wikipedia "List of AMD Ryzen processors" shows which CPUs have an NPU and which do not.
@GreyXor commented on GitHub (Feb 25, 2026):
If anyone is interested, I wrote a little guide on using the NPU with FastFlowLM: https://community.frame.work/t/guide-use-npu-xdna2-with-arch-linux-and-fastflowlm/80879
@poplk commented on GitHub (Mar 22, 2026):
Hi, I am thinking of buying an AMD Ryzen AI Max+ 395. Does anybody own one? I have some questions.
@alerque commented on GitHub (Mar 22, 2026):
@poplk This is an issue report on a piece of software, and it is followed by people who want to be notified about updates to that issue. It is not an open-topic forum or a hardware buyer's guide. Please don't spam the issue tracker.