mirror of
https://github.com/ollama/ollama.git
synced 2026-05-22 13:42:25 -05:00
[GH-ISSUE #2453] Add support for older AMD GPU gfx803, gfx802, gfx805 (e.g. Radeon RX 580, FirePro W7100) #1434
Open
opened 2026-04-12 11:18:32 -05:00 by GiteaMirror
·
220 comments
Originally created by @dhiltgen on GitHub (Feb 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2453
Originally assigned to: @dhiltgen on GitHub.
Officially, ROCm no longer supports these cards, but it looks like other projects have found workarounds. Let's explore whether that's possible. Best case, support is built into our binaries. The fallback, if that's not feasible, is to document how to build a working local binary from source with the appropriate older ROCm library and AMD drivers installed on your system.
@dhiltgen commented on GitHub (Feb 12, 2024):
One interesting observation: I managed to get my `gfx803` card not to crash with the invalid free by uninstalling the rocm libs on the host and copying the exact libs from the build container over. However, when running models on the card, the responses were gibberish, so clearly it's more than just library dependencies and will require compile-time changes.
@Todd-Fulton commented on GitHub (Feb 20, 2024):
I'm trying to get this working on an RX 580.
With the 6.0.0-2 rocm packages on arch, I was getting `free(): invalid pointer` from clinfo (maybe a related issue).
In the logs after sending a "prompt" (not sure of the lingo?).
I noticed in the rocblas cmake file that they removed support for gfx803 for the 6.0.X builds, so I downgraded to the 5.7.1 packages and rebuilt ollama using the PKGBUILD from #2473
Then when I sent the prompt I get this error:
The assertion is coming from `libstdc++` here, so maybe if I change the `PKGBUILD` to build a different version of ollama, that might get fixed. I'll try that next.
Not sure how much help I can be here, but I can test things out if needed.
This is the full output in the logs:
@Todd-Fulton commented on GitHub (Feb 20, 2024):
I ended up disabling `_GLIBCXX_ASSERTIONS` in `/etc/makepkg.conf` and I am starting to get some responses, but they are gibberish, at least sometimes. I think the problem is in `llama.cpp`, perhaps some sort of UB in the use of `std::discrete_distribution` that was triggering the assert. This is the only place I could find it being used. And a discussion which seems to resemble what's going on.
This is where `libstdc++` was asserting in `c++/13.2.1/bits/random.tcc` on line 2665:
So it seems like the sum should be greater than 0, idk what the implications are, but that seems to be one of the preconditions of using this type which `llama.cpp` is violating. May have some impact on the maths involved (which I am totally oblivious to).
I tried this: `ollama run codellama "Write me a function that outputs the fibonacci sequence in C."` and it just output a bunch of `##############################` forever until I `ctrl-c`
Running the llama2 model:
I don't know if it's just messing with me, or if the bug is random.
Next try using codellama example:
@wilkensgomes commented on GitHub (Feb 20, 2024):
@Todd-Fulton Same error here. Do you know how to fix this?
@Todd-Fulton commented on GitHub (Feb 21, 2024):
@wilkensgomes
For the error `rocBLAS error: Cannot read /opt/rocm/lib/rocblas/library/TensileLibrary.dat: Illegal seek for GPU arch : gfx803`
I downgraded to the 5.7.1 rocm packages using downgrade on Arch Linux and then added them to Ignore at the end of the installation so that they don't get upgraded to 6.X packages.
For the error:
`Feb 19 19:43:16 tokyo ollama[130295]: /usr/lib64/gcc/x86_64-pc-linux-gnu/13.2.1/../../../../include/c++/13.2.1/bits/random.tcc:2665: void std::discrete_distribution<>::param_type::_M_initialize() [_IntType = int]: Assertion '__sum > 0' failed.`
I turned off `_GLIBCXX_ASSERTIONS` when building ollama, in `/etc/makepkg.conf`
There might be a better way to disable this in the PKGBUILD file just for building ollama/llama.cpp, but I haven't bothered with it, and just disabled the assertions globally.
Reading over the discussion for the second error, the gibberish happens after disabling the asserts. Since the initialize method for `std::discrete_distribution<>` requires that the sum of the probabilities is greater than 0, this makes sense. AFAIK it doesn't make sense for a probability to be negative, or NaN, or all 0, which are the cases I can think of that would trigger the assertion after summing the probabilities.
So as far as I can tell the gibberish is a result of certain models and small input prompts, as said in the conversation. Somewhere between the model and the calculation of the probabilities, either some of them are negative, all are zero, or there is a NaN in there. For example, if for some reason a probability is a result of dividing a float by 0.0, `p = x / y where y is 0.0`, then `p = NaN`, and then when `llama.cpp` calls `llama_sample_token()` and `std::discrete_distribution` calls `std::accumulate`, the result will be `NaN`. I can only imagine how that would mess up the LLM when trying to figure out the next word to use. At least this is as far as my understanding goes.
Apart from some of the smaller models and small input prompts that produce gibberish, everything has been working for me since yesterday. I'm not even sure if the gibberish is particular to polaris gpus. I spent a few hours using llama2:13b as a Dungeon Master yesterday, was mind blowing.
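The NaN propagation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in `llama.cpp` or `libstdc++`: `softmax` and `weights_are_valid` are hypothetical helpers, and the NaN is injected directly with `math.nan` because Python raises `ZeroDivisionError` on float division by zero (in IEEE arithmetic, `0.0 / 0.0` is NaN while a nonzero value divided by `0.0` is infinity). It shows that a single NaN logit turns every probability into NaN, and that a `sum > 0` check like the one in the assertion then fails, because any comparison with NaN is false.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats (hypothetical helper)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weights_are_valid(probs):
    """Mimic a '__sum > 0' style precondition; NaN > 0 evaluates to False."""
    return sum(probs) > 0


good = softmax([1.0, 2.0, 3.0])
bad = softmax([1.0, math.nan, 3.0])  # one NaN logit poisons the whole row

print("good valid:", weights_are_valid(good))
print("bad valid:", weights_are_valid(bad))
print("bad all NaN:", all(math.isnan(p) for p in bad))
```

With assertions disabled, nothing stops a sampler from drawing from such NaN weights, which is consistent with the gibberish output reported in this thread.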
@Todd-Fulton commented on GitHub (Feb 26, 2024):
I'm still getting familiar with these code bases, but I did some print debugging in `llama_sample_softmax` and `llama_sample_token` and sure enough, there are nans everywhere on short prompts. It's fairly reproducible on my end.
I built both ollama and llama.cpp from their respective main branches, but took out the check for AMD version > 9 in ollama.
In file `llama.cpp`, with the logging that I put in.
I'll do my best to track down where the nans are coming from. It might be the gpu, which I have little experience in. I might try building rocm6.x from source if I can find an option to enable gfx803 support in the cmake files, and then build against that in case it's a bug in rocm 5.7.1 that I have installed.
Short prompt, nans, nans everywhere:
A little bit longer prompt, the calculations look right here:
More detailed logs:
llama.cpp.good.log
llama.cpp.nan.log
@ianlacerda commented on GitHub (Feb 27, 2024):
Is it not possible to create a docker image that supports gfx803? It would be easier than doing trial and error. Two weeks ago I was trying to install Ollama for my RX580 and I was only able to use the CPU due to conflicting dependencies on Arch Linux and Ubuntu 22.04.
@Todd-Fulton commented on GitHub (Mar 1, 2024):
This issue on llama.cpp seems to be the same bug.
I'm currently going through the ROCm stack and building it from source using the main branches and trying to find out if I can reintroduce rx580 "support" with patches if needed. I will put up a script and patches if I'm successful in that and it solves the problem. We could create a docker image from that script, or just use the script to create binary packages, or PKGBUILDs if it comes to that. Various parts of the stack still seem to "support" gfx803 (rx580), while others seem to have at least officially dropped it, like rocBLAS (though it might still work if I just patch up the build scripts).
I don't think this is a bug in ollama, but further down the stack. For example, clr introduced a `free(): invalid pointer` bug somewhere between the 6.0.0 (unreleased) and 6.0.2 tags, which was the reason I downgraded to 5.7.1. So it's a matter of finding which commit introduced that bug.
As for the gibberish, I think that's a result of `nans` coming from somewhere. It seems to be specific to gfx803, otherwise a lot more users would be reporting it, and that bug also occurs in rocm 5.7.1.
It might be worth trying even older versions of rocm than 5.7.1 if ollama and llama.cpp are still compatible with those, at least in the meantime. Adding support for older gpus without requiring a rocm downgrade doesn't seem possible if rocm isn't going to support older gpus in the first place. Users would still have to install older versions, or someone would have to re-implement that functionality.
If the gibberish is coming from CLBlast, then that narrows it down and rocm support for older gpus is just a side issue. I think users will either have to work on support in the open source, or just use older packages.
@nphalem commented on GitHub (Mar 21, 2024):
Any progress on this... ROCm successfully detects my gfx803 and it should work but ollama is blocking the card :/
@wreckdump commented on GitHub (Mar 26, 2024):
Could this also be applied to gfx804?
@eorisis commented on GitHub (Mar 30, 2024):
Support for Radeon RX 580/590 (I have a 590) would be super nice. I tried the Ollama 0.1.30 update and it is not possible yet.
@siavashmohammady66 commented on GitHub (Apr 2, 2024):
Please add support for older GPUs like the RX 580, as llama.cpp already supports those GPUs.
@6b6279 commented on GitHub (Apr 22, 2024):
@Todd-Fulton That's a regression with ROCm versions 6.0.* (see https://github.com/rocm-arch/rocm-arch/issues/981). Downgrading to 5.7.1 will enable support for, e.g., Polaris cards again.
@manuelpaulo commented on GitHub (Apr 25, 2024):
True, using CLBlast.
@DerRehberg commented on GitHub (Apr 26, 2024):
@6b6279 Can you give me detailed Instructions how to downgrade to 5.7.1 on Arch? I got an Rx 580
@6b6279 commented on GitHub (Apr 26, 2024):
@DerRehberg Try `downgrade rocm-opencl-runtime` and choose 5.7.1 as the target version. Don't forget to add the package in IgnorePkg to pin that version until you manually update.
(downgrade is available on the AUR: https://aur.archlinux.org/packages/downgrade)
ollama won't use the GPU regardless, but it'll enable support for, e.g., the RX 580, while using darktable.
@DerRehberg commented on GitHub (Apr 26, 2024):
@6b6279 And now give me detailed instruction how to run Stable Diffusion on an RX 580
@6b6279 commented on GitHub (Apr 26, 2024):
@DerRehberg No idea. I use rocm only for image processing.
@janstadt commented on GitHub (May 15, 2024):
Is there any update to this? I have a 580 and would like to use it in addition to another gpu.
@jiriks74 commented on GitHub (May 15, 2024):
Hello. I'm a user of a Radeon RX 580 8GB and the statement that
is not entirely true. While it is not officially supported anymore, you don't really need any workarounds to make ROCm work with these GPUs. I've been using OpenCL through ROCm for quite some time in Blender without any issues at all. All I needed to do is set an environment variable: `ROC_ENABLE_PRE_VEGA=1` and the GPU just worked.
I've tried doing so with Ollama, but it seems that it disables the GPU manually as unsupported even if ROCm is able to run on it.
From ArchWiki
@Darin755 commented on GitHub (Jun 15, 2024):
I think this is really exciting, as an RX580 on eBay is much more affordable. If support could be added, it should be possible to build an AI machine for under 200 USD.
@jidaojiuyou commented on GitHub (Jun 27, 2024):
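On Xianyu, an RX580/RX590 costs 200-300 yuan. It would be great if ollama could support the RX580. Alternatively, some way for users to build ollama themselves would also work.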
@mgielissen commented on GitHub (Jul 4, 2024):
For example: GPT4All 3.0 (Vulkan support) works great with RX580 (8GB). Tested with Windows 10 and latest AMD driver.
@manuelpaulo commented on GitHub (Jul 8, 2024):
Also tested GPT4All 3.0 on an old PC with an RX580 (8GB) and it works nicely on Windows 10, with ROC_ENABLE_PRE_VEGA env variable set to 1, and HSA_OVERRIDE_GFX_VERSION set to "10.3.0". AMD HIP SDK version 5.5.1 only, not 5.7.1.
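As a minimal sketch of the setup reported above, the two environment variables can be applied to a copy of the current environment before launching a GPU application. The helper name `pre_vega_env` is hypothetical, and whether a given program honors these variables is an assumption that depends on the ROCm/HIP version in use; other comments in this thread report that Ollama still rejects the card even when ROCm itself runs on it.

```python
import os


def pre_vega_env(base=None, gfx_override=None):
    """Return a copy of the environment with the pre-Vega ROCm switch set.

    base: mapping to copy (defaults to os.environ).
    gfx_override: optional value for HSA_OVERRIDE_GFX_VERSION, e.g. "10.3.0".
    """
    env = dict(os.environ if base is None else base)
    env["ROC_ENABLE_PRE_VEGA"] = "1"
    if gfx_override is not None:
        env["HSA_OVERRIDE_GFX_VERSION"] = gfx_override
    return env


if __name__ == "__main__":
    env = pre_vega_env(gfx_override="10.3.0")
    for name in ("ROC_ENABLE_PRE_VEGA", "HSA_OVERRIDE_GFX_VERSION"):
        print(f"{name}={env[name]}")
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` so that only the launched process sees the override, rather than changing the variables system-wide.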
@yourchanges commented on GitHub (Jul 18, 2024):
for ollama running on gfx803 on windows, just check https://github.com/likelovewant/ollama-for-amd https://github.com/likelovewant/ollama-for-amd/wiki
Tested ollama 0.2.5/0.2.7 with an RX 570 (4GB memory) on 64-bit Windows 10. It runs `ollama run phi3` well, but sometimes it failed with
see https://github.com/likelovewant/ollama-for-amd/issues/8#issuecomment-2238714971
@KazeLiu commented on GitHub (Jul 25, 2024):
I hope support for the RX580 can be provided.
@yourchanges commented on GitHub (Jul 26, 2024):
@KazeLiu rx580 === rx 570 , they are both gfx803.
@KazeLiu commented on GitHub (Jul 26, 2024):
@yourchanges
Thank you. This is my first time installing a large model, and I still don't quite understand the process. My environment is an offline internal network. My current process is as follows: first I go to the ollama-for-amd project and download ollama-windows-amd64.7z and OllamaSetup.exe from version 0.2.8. After transferring them to the internal network, I install OllamaSetup, then extract ollama-windows-amd64.7z and replace the files in the Ollama folder. Next, I go to the ROCmLibs-for-gfx1103-AMD780M-APU project, download rocm.gfx803.optic.vega10.logic.hip.sdk.6.1.2.7z from v0.6.1.2, and use it to replace the files in the rocm folder within the Ollama folder. Then I run glm4, which I downloaded before going offline. When I check using ollama ps, the CPU is still at 100%. I want to know which step I got wrong.
@yourchanges commented on GitHub (Jul 26, 2024):
@KazeLiu You should check the logs of `ollama serve`. The key info is:
library=rocm compute=gfx803 driver=5.2 name="Radeon RX 570 Series" total="4.0 GiB" available="3.9 GiB"
If you get something like this, your LLM runs on your GPU; otherwise it does not.
You can carefully follow https://github.com/likelovewant/ollama-for-amd/issues/8 . Maybe there is something wrong with your driver or HIP SDK.
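The log check described above can be sketched as a small parser. This is an illustration only: the sample line is the one quoted in this comment, while the helper names `parse_gpu_fields` and `runs_on_rocm` and the assumption that the fields always appear as space-separated `key=value` pairs (with optional double quotes) are hypothetical and may not match every Ollama version.

```python
import re


def parse_gpu_fields(line):
    """Extract key=value pairs; values may be bare or double-quoted."""
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    return {key: value.strip('"') for key, value in pairs}


def runs_on_rocm(line):
    """True if the line reports the rocm library and a gfx compute target."""
    fields = parse_gpu_fields(line)
    return fields.get("library") == "rocm" and fields.get("compute", "").startswith("gfx")


sample = (
    'library=rocm compute=gfx803 driver=5.2 name="Radeon RX 570 Series" '
    'total="4.0 GiB" available="3.9 GiB"'
)

fields = parse_gpu_fields(sample)
print(fields["name"], fields["compute"], runs_on_rocm(sample))
```

Running each line of the saved server log through `runs_on_rocm` gives a quick yes/no answer on whether the GPU was picked up, without reading the whole log by hand.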
@jeromechungmf commented on GitHub (Aug 6, 2024):
Maybe try https://github.com/YellowRoseCx/koboldcpp-rocm/releases/tag/v1.71.1.yr0-ROCm ; it works fine for me. My GPU is an AMD RX580 and my OS is AlmaLinux 9. If ollama does not work, try koboldcpp-rocm instead. Before installing koboldcpp-rocm, the ROCm 5.7.1 environment must be ready. Follow the video https://www.youtube.com/watch?v=ljXdEih2GOI&ab_channel=LinuxMadeEZ step by step to get the ROCm 5.7.1 environment ready. Good luck.
@zuli12-dev commented on GitHub (Aug 10, 2024):
I came across this as I am in the same situation with an old RX580, without any budget to get something new and unable to get it running, neither ollama (only CPU) nor ComfyUI. I'm trying koboldcpp, but not all models are supported there...
Very sad, please let me know if you need any information, debug output etc...
Hope you can implement it some day 👍
cheers
@jeromechungmf commented on GitHub (Aug 16, 2024):
Hi
Maybe try https://discuss.linuxcontainers.org/t/llama-cpp-and-ollama-servers-plugins-for-vs-code-vs-codium-and-intellij-ai/19744 ; you can recompile the ollama code for the RX580.
With https://blog.lyric.im/p/using-llamacpp-to-run-llama-2-using-amd-radeon-rx-6900-for-gpu-acceleration you can recompile the llama.cpp code for the RX580.
You are right, koboldcpp does not support all models and even has stability issues.
Maybe llama.cpp or ollama is the better choice.
The repo https://github.com/robertrosenbusch/gfx803_rocm61_pt24/tree/main?tab=readme-ov-file can build an environment image that runs on the RX580 GPU. In my experience, Stable Diffusion and ComfyUI can both run in this environment.
Good luck.
PS: if you must use the RX580, please remember to install ROCm 5.7; the gfx803 dat file is in there.
Jerome
@KhazAkar commented on GitHub (Aug 17, 2024):
As alternative, maybe pushing Vulkan support would be reasonable? It's here:
https://github.com/ollama/ollama/issues/2033
@jeromechungmf commented on GitHub (Aug 17, 2024):
LM Studio supports Vulkan, but still has stability issues. I have previous experience using LM Studio. After running several LLM models on LM Studio, it suddenly stopped working properly, and Koboldcpp also started experiencing issues.
So, I am currently rebuilding with Docker and ROCm, reinstalling and compiling Ollama in this environment so that Ollama can run LLM models in Docker with the RX580. Since running Stable Diffusion in the ROCm environment seems very stable (I've generated dozens of images over an entire day without any issues), I'm confident this setup will work well.
Perhaps these experiences might be useful as a reference.
Good Luck.
Jerome
@KhazAkar commented on GitHub (Aug 18, 2024):
@jeromechungmf have you tried the alternative to LM Studio called gpt4all? AFAIK they have the best Vulkan support for LLM acceleration. I'm using an RX 5500M and it does wonders, whereas ROCm installation for this GPU on Ubuntu 24.04 is a nightmare or a hackfest, or both.
@jeromechungmf commented on GitHub (Aug 18, 2024):
Finally, I found a Docker image that supports the RX580 for Ollama: https://hub.docker.com/r/bergutman/ollama-rocm .
If you still use an RX580, docker pull the bergutman/ollama-rocm image, follow its overview, and use docker-compose to build a container from the image. You can use the 'docker logs containerID' command to see that Ollama detects the RX580's 8 GB of VRAM.
I will likely continue testing the RX580 with Ollama to run various large language models.
Hope these experiences might be useful as a reference.
Good Luck
Jerome
@jeromechungmf commented on GitHub (Aug 19, 2024):
After trying to download the llama3 model with bergutman/ollama-rocm, I get this error: template: :7:3: executing "" at <.response>: can't evaluate field response in type struct { first bool; system string; prompt string } #4057 ; the Ollama version is too old.
It's very sad.
Trying llama.cpp on Docker or LM Studio on Ubuntu 22.04 is my next step.
Jerome
@Tamila-2017 commented on GitHub (Sep 13, 2024):
Hello guys,
I am very interested in AI models for Ollama for text translation.
And I even managed to run Gemma 2 2B and Llama 3.1 8B on N100 CPU and Debian 12 OS.
I liked their capabilities very much.
Unfortunately, these models are not very fast on the N100.
So I want to speed them up with a graphics card.
I can only afford an old SAPPHIRE NITRO RX480 8 GB graphics card.
But unfortunately Ollama refuses to work with it.
Yes, I know there is still no official support for this graphics card: https://ollama.com/blog/amd-preview
There are a lot of discussions and attempts to implement this support on the web, for example: 1, 2, 3
And maybe some real solution for RX580 has already been found?
Tell me about it, please!
@mnccouk commented on GitHub (Sep 15, 2024):
I have some success running Ollama using the 5.7.1 Rocm libraries with Radeon RX 580 GPU - Ubuntu 22.04.4 LTS, from within a docker container. Only tried so far with the llama3.1 model but seems to work ok.
The process I went through was just to focus and fix errors I encountered while trying to get something working, so this is by no means a proper fix for this issue.
The changes are based on the main Ollama branch from 6th Sept 2024. It would be interesting if somebody else could try it to see if it works for them too.
Repo with the changes - https://github.com/mnccouk/ollama/tree/rx580_gpu
See the top of the readme for some basic instructions to build. I only focused on the docker build, so make sure to install Rocm libraries on the docker host machine first (5.7.1).
@jeromechungmf commented on GitHub (Sep 15, 2024):
Hello @Tamila-2017
I have been trying to run AI large language models (LLMs) on my PC using an RX580 GPU. I've tried two main approaches so far:
My Experience:
Linux with Vulkan:
I installed Ubuntu 24.04 or 22.04, configured it to use the Vulkan drivers, and then installed LMStudio. The goal is to make this setup work with my RX580 GPU for optimal LLM performance. However, installing the Vulkan drivers directly from AMD's website (for Ubuntu 22.04) proved unsuccessful. Instead, I found success by downloading the corresponding Vulkan driver package from https://vulkan.lunarg.com/sdk/home#linux
This allowed LMStudio to detect my RX580 GPU and use it for LLM calculations.
My Installation:
Currently, I am running Vulkan SDK Release 1.3.290 on Ubuntu 22.04.
Recommendations
Thorough Driver Research: Before installing anything, make sure you download the correct Vulkan driver based on your Ubuntu version from https://vulkan.lunarg.com/sdk/home#linux
Windows with Vulkan:
In a Windows environment, just download the AMD driver installer and follow the GUI step by step to finish the GPU driver install. The Vulkan driver for Windows is included. Once the AMD driver is installed, you can download LMStudio and run it. LMStudio then shows the GPU information with 8.0 GB of VRAM.
This is my setup for running AI language models using an RX580 graphics card. Please take a look and see if you can provide any guidance or suggestions.
Good luck.
Jerome
@Tamila-2017 commented on GitHub (Sep 15, 2024):
Hi guys,
I am very grateful to all of you for your detailed account of how to solve my problem with the RX580.
Now I have a great chance to solve this problem and use the RX580 to communicate with AI.
Can you please tell me what the speed increase is when working with AI on the RX580 compared to the N100 CPU?
Is it 10x, 100x, or maybe 1000x? :-)
@KhazAkar commented on GitHub (Sep 15, 2024):
You can all try the ollama-for-amd fork. I'm using it myself with my RX5500M card and ROCm 6.2. It works wonders.
@Tamila-2017 commented on GitHub (Sep 15, 2024):
KhazAkar,
Thank you, this is also very interesting information. Could you tell us more about this?
For example, what exactly is the name of this fork, where can I get it and how to use it.
And most importantly: you are using the RX 5500M. Will this fork work with the RX580?
@Tamila-2017 commented on GitHub (Sep 16, 2024):
jeromechungmf
I chose your advice first:
And it ended in failure. What I did, in detail:
wget -qO- https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc
sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-noble.list https://packages.lunarg.com/vulkan/lunarg-vulkan-noble.list
sudo apt update
sudo apt install vulkan-sdk
sudo add-apt-repository universe
sudo apt install libfuse2t64
sudo apt install libnss3
sudo apt install libasound2t64
But this file does not exist!
Next, I came to understand that a graphical environment is required to run LM Studio, and I did not like that at all, because I want to use AI in a console environment, without X.
That's why I stopped this experiment, and now I want to try KhazAkar's advice.
But unfortunately he's not giving any details yet :-(
@mnccouk commented on GitHub (Sep 16, 2024):
Here's my performance figures
CPU - AMD Ryzen 5 3400G with Radeon Vega Graphics (16GB RAM)
"eval_count":342,"eval_duration":66228944000 = 5.1639 tokens a second
GPU - Radeon Rx580 8GB vRAM
"eval_count":308,"eval_duration":11708779000} = 26.305 tokens a second
Same question asked in both cases:
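For anyone wanting to reproduce the tokens-per-second figures, here is a minimal sketch of the arithmetic. It assumes that eval_duration is reported in nanoseconds, which is consistent with the numbers quoted above; the function name is only illustrative.

```python
# Sketch: convert the timing fields quoted above into tokens per second.
# Assumption: eval_duration is in nanoseconds.

NANOSECONDS_PER_SECOND = 1_000_000_000


def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Return generated tokens per second for one response."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count * NANOSECONDS_PER_SECOND / eval_duration_ns


# Figures from the CPU and GPU runs in this comment.
cpu_rate = tokens_per_second(342, 66_228_944_000)
gpu_rate = tokens_per_second(308, 11_708_779_000)
speedup = gpu_rate / cpu_rate

print(f"CPU: {cpu_rate:.2f} tokens/s")
print(f"GPU: {gpu_rate:.2f} tokens/s")
print(f"GPU is about {speedup:.1f}x faster than CPU")
```

Dividing the two rates gives the relative speedup, which is a fairer comparison than raw durations because the two runs produced different numbers of tokens.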
@Tamila-2017 commented on GitHub (Sep 17, 2024):
mnccouk,
thank you so much for the convincing test! :-)
It turns out that the speed gain is small, about 5 times.
At the same time, the TDP of:
AMD Ryzen 5 3400G = 65 Watts
Radeon Vega 16 = 75 Watts.
AMD RX580 = 185 Watts.
@jeromechungmf commented on GitHub (Sep 17, 2024):
Hi Tamila-2017
Sorry, I should show my install process step by step:
1. Download the Ubuntu 22.04 Desktop ISO file; my file name: ubuntu-22.04.4-desktop-amd64.iso .
2. Write the ISO file to a bootable USB disk. On Linux you can use the dd command, e.g.: sudo dd bs=4M if=/path/to/ubuntu-22.04.4-desktop-amd64.iso of=/dev/sdX status=progress oflag=sync
3. Boot from the USB drive and install Ubuntu 22.04 on your computer. You can choose the minimal option and follow the GUI guide step by step.
4. Open Firefox and paste https://vulkan.lunarg.com/sdk/home#linux to browse the Vulkan page. In the Linux section, click the Ubuntu Packages tab, then scroll down until you see the Release 1.3.290 link and click it. You will see Ubuntu 22.04 (Jammy Jellyfish) and Ubuntu 24.04 (Noble Numbat). In my case, I clicked the Ubuntu 22.04 (Jammy Jellyfish) item. It shows commands like the ones below:
wget -qO- https://packages.lunarg.com/lunarg-signing-key-pub.asc | sudo tee /etc/apt/trusted.gpg.d/lunarg.asc
sudo wget -qO /etc/apt/sources.list.d/lunarg-vulkan-1.3.290-jammy.list https://packages.lunarg.com/vulkan/1.3.290/lunarg-vulkan-1.3.290-jammy.list
sudo apt update
sudo apt install vulkan-sdk
5.open the terminal follow the command and execute.finish to install vulkan driver in my ubuntu 22.04 desktop enviroment.
6.reboot
7. Download LM Studio and follow https://www.tecmint.com/lm-studio-run-llms-linux/ to run it.
In my experience, it should work.
PS: using xrdp to log in to Ubuntu can cause privilege issues for LM Studio. If you want to log in to Ubuntu remotely to run LM Studio, something like RealVNC Viewer is a better option, as it avoids the permission problems and allows LM Studio to detect the RX580 GPU.
Good luck.
Jerome
@Tamila-2017 commented on GitHub (Sep 17, 2024):
jeromechungmf,
Thank you for your attention and detailed description of the installation.
And I'm sorry that I still have little experience with Linux.
So, I'm starting to install the recommended ubuntu-22.04.4-desktop-amd64
@KhazAkar commented on GitHub (Sep 17, 2024):
For Vulkan, you don't need to downgrade to 22.04. 24.04 is also fine, even better. Ollama does not yet have Vulkan support, but will gain it. For Vulkan in the CLI, you can either build Ollama with Vulkan support from the PR in this repo or build llama.cpp from the ggerganov repository with Vulkan support enabled. If you need a GUI, then GPT4All is the best GUI app available, or alternatively LM Studio, but since I prefer open source, I'd pick GPT4All. In general I think we should move such discussion somewhere else and focus here on trying to get Ollama running with older AMD GPUs. So my proposal to try the Vulkan PR in Ollama matches that best.
@Tamila-2017 commented on GitHub (Sep 17, 2024):
So, I successfully installed Ubuntu-22.04.4-desktop-amd64 and followed steps "1-4".
However, unfortunately, action "5" is not executed, because there is no such command: execute.finish
.............
KhazAkar, oh, I'm glad you're here :-)
However, your message is vague and therefore not entirely clear
Could you elaborate on your recommendations, step by step?
@mnccouk commented on GitHub (Sep 17, 2024):
@Tamila-2017
I happen to have a real-time power meter on the server. The results for power consumption:-
CPU while processing - "Why is the sky blue?"
Duration = 66.7 seconds with power consumption = 104 Watts
= 0.001926889 kWh
= 1.92 Wh
GPU while processing - "Why is the sky blue?"
Duration = 11.7 seconds with power consumption = 230 Watts
= 0.001661111 kWh
= 1.66 Wh
Server power while GPU\CPU idle is 40 Watts
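As a cross-check on the energy figures, here is a small sketch of the conversion from average power and duration to watt-hours. The helper names are hypothetical, and it treats the quoted wattage as a constant average over the whole run, which is an assumption.

```python
# Sketch: energy used by one run, from average power and duration.
# Assumption: the quoted wattage is a steady average for the whole run.

SECONDS_PER_HOUR = 3600.0


def energy_wh(power_watts: float, duration_seconds: float) -> float:
    """Energy in watt-hours for a run at constant average power."""
    return power_watts * duration_seconds / SECONDS_PER_HOUR


def seconds_for_energy(energy_wh_value: float, power_watts: float) -> float:
    """Duration in seconds that would consume the given energy at that power."""
    return energy_wh_value * SECONDS_PER_HOUR / power_watts


cpu_energy = energy_wh(104, 66.7)
gpu_energy = energy_wh(230, 11.7)
gpu_implied_seconds = seconds_for_energy(1.661111, 230)

print(f"CPU run: {cpu_energy:.3f} Wh")
print(f"GPU run: {gpu_energy:.3f} Wh")
print(f"1.661111 Wh at 230 W corresponds to {gpu_implied_seconds:.1f} s")
```

The CPU result matches the figure quoted above. For the GPU, 230 W over 11.7 s comes out lower than the quoted 1.66 Wh, so that figure may have been taken over a longer window than the 11.7 s shown; that is only a guess, since the thread does not say.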
@Tamila-2017 commented on GitHub (Sep 17, 2024):
mnccouk,
Measurements are very useful, thank you
@Tamila-2017 commented on GitHub (Sep 17, 2024):
jeromechungmf,
I don't understand the command execute.finish, which does not exist.
Therefore, I used this instruction - https://www.tecmint.com/lm-studio-run-llms-linux/
and I managed to launch LM Studio.
Unfortunately, LM Studio also doesn't see the RX580 :-(
Although Ubuntu and nvtop see it perfectly.
@Tamila-2017 commented on GitHub (Sep 17, 2024):
KhazAkar
I don't need the GUI, I'm strongly against it because I need to work with Ollama in the console.
I completely agree with you! But for some reason you're stubbornly not sharing the details of this success of yours -
even though I've asked you about it before.
Why don't you tell me about it? Is it your big secret, or was it just a bad joke?
Unlike you, jeromechungmf detailed his experience and I tried to use it, although I'm not thrilled with the graphical environment, because I need a console server option that works with Ollama.
@KhazAkar commented on GitHub (Sep 17, 2024):
@Tamila-2017 it's not a joke. Fork is here: https://github.com/likelovewant/ollama-for-amd
But I can't assure you of anything with older card than I currently have. This one does not use vulkan but ROCm and I don't know if your card is supported by ROCm 6.x.
For Vulkan usage, if you're feeling brave, you can try building Ollama with vulkan support by cloning repository and applying this PR:
https://github.com/ollama/ollama/pull/5059
In both cases, building Ollama by hand is inevitable.
@KhazAkar commented on GitHub (Sep 17, 2024):
Plus, @Tamila-2017 - the only info I've found in this fork is in a language foreign to me.
https://github.com/likelovewant/ollama-for-amd/issues/1
You can translate it if you want.
And it would be a great idea for you to join the Ollama Discord server :)
@jeromechungmf commented on GitHub (Sep 18, 2024):
Hi Tamila-2017
I've attached my screenshot; it shows the Vulkan driver install commands being executed. Maybe you can refer to it.
Good luck.
Jerome
@jeromechungmf commented on GitHub (Sep 18, 2024):
Hi Tamila-2017
I've attached the LMStudio configuration screenshot. In the Developer tab, click LM Runtimes; there you can check that llama.cpp Vulkan is v0.0.4. Then click the Settings tab and you can see VRAM 8 GB.
Jerome
@bellmancity commented on GitHub (Sep 19, 2024):
I tried LM Studio using your method (Vulkan + LM Studio AppImage) on my Linux Mint 22 with an RX580. It just works right away. My RX580 is a "refurbished" 16GB card from AliExpress.
10.81 tok/sec 531 tokens 0.91s to first token
Hopefully Ollama will work with the RX580 soon.
@Tamila-2017 commented on GitHub (Sep 19, 2024):
Hello all,
Thanks all for the tips, but unfortunately, I can't handle these software tricks on my own.
If you want to help me remotely, I will give you access to my AI computer, and you can configure the RX580 personally.
@Tamila-2017 commented on GitHub (Sep 19, 2024):
KhazAkar
I tried to register on Discord - https://discord.com
But it first demanded my phone number and then told me that the this number was incorrect.
Discord is stupid; it doesn't know about phone numbers.
@mnccouk commented on GitHub (Sep 19, 2024):
I've added the docker image I'm using with the rx580 with Ollama to docker hub, Hopefully it might prove to be useful to someone.
https://hub.docker.com/r/mnccouk/ollama-gpu-rx580
@bellmancity commented on GitHub (Sep 21, 2024):
I have tried your docker image and it runs successfully.
Eval rate = 15.25 tokens/s. Very impressive compared to LM Studio above.
RX580 2048 with 16GB VRAM
Ryzen 5 4650G with 16GB Memory.
Thank you for your effort.
@mnccouk commented on GitHub (Sep 21, 2024):
@bellmancity, Awesome! Glad it worked for you.
@Tamila-2017 commented on GitHub (Sep 21, 2024):
I can't install ROCm 5.7.0 because there are obscure errors when compiling.
mnccouk,
please give the exact download link of the distro you used here
@eliot-akira commented on GitHub (Sep 21, 2024):
Maybe this link will help:
From https://github.com/mnccouk/ollama/tree/rx580_gpu?tab=readme-ov-file#linux-with-rx580-radeon-gpu
@Tamila-2017 commented on GitHub (Sep 21, 2024):
eliot-akira,
Thanks, these links have been known to me for a long time.
But I am not aware of the exact distribution that mnccouk uses.
I'm waiting for the exact link from mnccouk to download the distro, because a lot depends on the distro.
@KhazAkar commented on GitHub (Sep 21, 2024):
Alternatively, just make use of the Vulkan driver, either in Ollama from the PR in this repo, or in llama.cpp from https://github.com/ggerganov/llama.cpp built from source....
@Tamila-2017 commented on GitHub (Sep 21, 2024):
I've tried using Vulkan before, but nothing good came out of it.
So I want to use the Docker image that mnccouk created; it worked for him and also for bellmancity.
@KhazAkar commented on GitHub (Sep 21, 2024):
Something feels off to me here. I'm not talking about GUI apps like LM Studio, which somebody here suggested.
Plus, I think the fastest way would be to use Vulkan-based inference, even if it means building Ollama from source, which is well documented in the docs here.
ROCm 5.7.x is quite... old. 5.7.1 is shipped with Ubuntu 24.04 and Debian 12, I think, under a slightly different name; search for hip.
Current versions of Ollama work with 6.0.x at least, and I don't know if they work right with the 5.7.x series. The screenshot on top says otherwise, but the RX 580, apart from having plenty of VRAM, ended its life after the cryptomining boom. The best bet for using it for AI is to push for the Vulkan implementation for Ollama to be merged, available as a PR here:
https://github.com/ollama/ollama/pull/5059
Either way, signing out. This thread got quite derailed IMHO and 3/4 of it should be on official discord server.
@Tamila-2017 commented on GitHub (Sep 21, 2024):
KhazAkar ,
You write a lot, and it is interesting.
Unfortunately, your thoughts do not lead to a final result, because they are based only on guesses about what will probably work, without practical verification.
The only one who has had real success here without using a graphical desktop environment and Vulkan is the modest mnccouk.
I'm waiting for him to give me the exact link to download the distro he uses.
Once again: not the name of the distro, but the EXACT LINK to download.
@mnccouk commented on GitHub (Sep 21, 2024):
@Tamila-2017, I installed Ubuntu 20.04 then did a distro upgrade to Ubuntu 22.04.4. Only reason I didn't install 22.04 directly was due to usb boot issue using 22.04 installation media, but that's another story.
Direct link (https://releases.ubuntu.com/22.04.4/) to 22.04.4 doesn't work for some reason. However, there is a link for 22.04.5, https://releases.ubuntu.com/jammy/ubuntu-22.04.5-desktop-amd64.iso.
Then install the Rocm libraries @eliot-akira provided the link to (above).
Sorry can't provide 22.04.4 working direct link but can't see why 22.04.5 shouldn't work for you.
Obviously, also make sure to install docker, this can be done using the packages that come with Ubuntu 22.04.x
@SirDubbins commented on GitHub (Sep 22, 2024):
This worked for me using an RX480
@Tamila-2017 commented on GitHub (Sep 22, 2024):
Dear mnccouk,
Here is where you can download any version of Ubuntu 22.04.4 -
https://old-releases.ubuntu.com/releases/jammy/
But there are 2 options here:
Can I use the version without Xorg and GNOME?
I.e. ubuntu-22.04.4-live-server-amd64.iso ?
@mnccouk commented on GitHub (Sep 22, 2024):
Ah nice, glad you found the link.
I installed the GUI version on my server as I have a monitor connected to it and wanted to use GUI tools to do stuff. If you have a monitor plugged into your server maybe this is a good option to go with in the beginning.
I'm sure ubuntu-22.04.4-live-server-amd64.iso should also work fine but can't validate that personally.
Maybe get it working first with the desktop version, then once you know all the steps switch to the server version. Of course, the choice is yours.
@SirDubbins commented on GitHub (Sep 22, 2024):
Has anyone been successful using multiple GPUs? I was trying to add a couple 8gb RX470 but I can't seem to get them to show up in rocminfo.
@Tamila-2017 commented on GitHub (Sep 22, 2024):
This nightmarishly unfinished manual -
https://rocm.docs.amd.com/en/docs-5.7.0/deploy/linux/os-native/install.html
is not an instruction, but a mockery of common sense.
This is the third time I have installed Ubuntu 22.04.4 Server "Jammy" and then followed these instructions, remembering to replace 5.7.0 with 5.7.1, but it fails every time:
I don't understand why this is the case :-(
Dear mnccouk,
could you do me the courtesy of creating a linear bash script instead of this nightmarish instruction, which I could execute without having to think about unnecessary complications to get a working ROCm 5.7.1 ?
@mnccouk commented on GitHub (Sep 22, 2024):
Just been through the docs, and back through my notes. This should get you close if not up and running. Haven't tested but hopefully this should get you on the right track
@Tamila-2017 commented on GitHub (Sep 22, 2024):
Wow! Thank you so much you dear mnccouk, for your quick and creative help!
I'm immediately proceeding with the 4th install of ubuntu-22.04.4-live-server-amd64 :-)
@kth8 commented on GitHub (Sep 22, 2024):
I did a fresh install of Ubuntu 22.04.5, then updated and rebooted, but I am getting stuck at the sudo apt install amdgpu-dkms part, with this error from that make.log: https://termbin.com/hjob
@Tamila-2017 commented on GitHub (Sep 23, 2024):
Dear mnccouk,
Your magical script executed without errors; I only had to split it into two parts at the sudo reboot.
However, I got the same result as before, in which I don't see a graphics card, only a Celeron CPU:
What could be the reason for this failure?
@SirDubbins commented on GitHub (Sep 23, 2024):
I think i encountered a similar problem. I wound up deleting the 6.8.0-45 generic kernel and am using the older 5.19.0-32 generic kernel instead.
@Tamila-2017 commented on GitHub (Sep 23, 2024):
SirDubbins,
I used to get this error all the time too.
But when I used the magical script by mnccouk, I didn't notice it, because the lines on the screen were scrolling by very fast.
Maybe it didn't exist?
@mnccouk commented on GitHub (Sep 23, 2024):
@Tamila-2017
Probably best to run that script step by step, one command at a time. After each command, validate that you have no errors; apologies for the confusion. You may have had a similar issue to @kth8 but not noticed it, due to the lack of error trapping when running all the commands at once.
@SirDubbins proposed a fix: try again after downgrading your kernel, or reinstall Ubuntu again (sorry!).
Looking at the doc - https://rocm.docs.amd.com/en/docs-5.7.1/release/gpu_os_support.html - Ubuntu 22.04.3 has been validated with kernel 6.2. Try that version(22.04.3) of Ubuntu instead, also don't install any updates as part of the Ubuntu installation.
I went from an earlier install of Ubuntu 20.04 and then upgraded through to Ubuntu 22.04.4; the whole process wasn't smooth for me either. It's quite possible the kernel drivers were compiled against an earlier kernel version.
Just for info - I'm actually on kernel version 6.8.0-40-generic now.
Sorry, but you may have to target an install of Ubuntu 22.04.3 or try downgrading your kernel, though that may introduce other issues. I'd opt for the fresh install again if you have the luxury of being able to.
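To make the kernel comparison concrete, here is a rough sketch that flags a kernel release newer than the 6.2 kernel mentioned above as validated for ROCm 5.7.1. The helper names are hypothetical, and a newer kernel is only a hint rather than proof of a problem, since 6.8.0-40-generic is reported as working in this same comment.

```python
# Sketch: compare a kernel release string with the validated kernel version.
# Assumption: (6, 2) is the validated kernel, as quoted from the ROCm 5.7.1
# support page above. A newer kernel is a warning sign only, not a verdict.
import re
from typing import Tuple

VALIDATED_KERNEL = (6, 2)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '6.8.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_newer_than_validated(
    release: str, validated: Tuple[int, int] = VALIDATED_KERNEL
) -> bool:
    """True when the release is newer than the validated kernel."""
    return parse_kernel_release(release) > validated


# Kernel releases mentioned in this thread.
for release in ("5.19.0-32-generic", "6.8.0-40-generic", "6.8.0-45-generic"):
    if is_newer_than_validated(release):
        note = "newer than the validated kernel"
    else:
        note = "at or below the validated kernel"
    print(f"{release}: {note}")
```

On a running system, the string returned by platform.release() (the same value uname -r prints) can be passed to the same function.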
@Tamila-2017 commented on GitHub (Sep 23, 2024):
Dear mnccouk,
I've reinstalled again ubuntu-22.04.4-live-server-amd64.iso
Then, carefully step by step, I executed each command of your magic script separately, watching for mistakes.
There were no mistakes.
However, I got the same result without the GPU -
What do I do next?
@mnccouk commented on GitHub (Sep 23, 2024):
What's the output of
@Tamila-2017 commented on GitHub (Sep 23, 2024):
@mnccouk commented on GitHub (Sep 23, 2024):
How about?
@mnccouk commented on GitHub (Sep 23, 2024):
And...
@Tamila-2017 commented on GitHub (Sep 23, 2024):
@mnccouk commented on GitHub (Sep 23, 2024):
I'm running out of ideas here, but I don't like the look of:
I think it may be to do with the PCIe spec of your hardware - see this link for a related topic - https://www.reddit.com/r/ROCm/comments/ba6tvq/atomics_on_my_hardware/
@Tamila-2017 commented on GitHub (Sep 23, 2024):
It's incredible! You are doing well, but I am not, even though I follow your instructions exactly
So I offer you remote access via ssh, and you personally can install this cranky ROCm 5.7.1
PS. I used to be able to mine Bitcoin, Litecoin and Ethereum Classic on the same motherboard MSI H61MU-E35 (B3) and AMD RX580 without any problems.
I don't understand why they refuse to work together with ROCm :-o
@mnccouk commented on GitHub (Sep 23, 2024):
I can't be certain of this, but from what I'm reading ROCm requires the PCIe 3.0 AtomicOp feature. Looking at the spec of your CPU - https://www.intel.com/content/www/us/en/products/sku/71073/intel-celeron-processor-g1620-2m-cache-2-70-ghz/specifications.html
it looks like it supports PCIe revision 2.0, which unfortunately is not compatible with ROCm. The motherboard also seems to be PCIe 2.x.
I'm sorry to say I don't think ROCm will work for you with that CPU\motherboard\GPU combination. I may be wrong, but that's what I think. Open to any other thoughts.
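One way to look for atomics support is to inspect the verbose PCI listing. The sketch below parses text in the style of sudo lspci -vvv and collects the AtomicOpsCap and AtomicOpsCtl flags per device. The exact output format is an assumption based on recent pciutils versions, the sample listing is made up for illustration, and the function name is hypothetical.

```python
# Sketch: collect PCIe AtomicOps flags from `lspci -vvv` style text.
# Assumptions: device blocks start with an unindented line, and the flags
# appear on indented "AtomicOpsCap:" / "AtomicOpsCtl:" lines as name+ / name-.
import re
from typing import Dict

FLAG = re.compile(r"(\w+)([+-])(?=\s|$)")
ATOMIC_PREFIXES = ("AtomicOpsCap:", "AtomicOpsCtl:")


def parse_atomic_ops(lspci_text: str) -> Dict[str, Dict[str, bool]]:
    """Map each PCI address to its atomic-ops flags (True means '+')."""
    report: Dict[str, Dict[str, bool]] = {}
    device = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            device = line.split(" ", 1)[0]  # e.g. "01:00.0"
            continue
        stripped = line.strip()
        if device is None or not stripped.startswith(ATOMIC_PREFIXES):
            continue
        _, _, flags = stripped.partition(":")
        for name, sign in FLAG.findall(flags):
            report.setdefault(device, {})[name] = sign == "+"
    return report


# Hypothetical listing: a root port without atomics and a GPU that offers them.
SAMPLE = "\n".join([
    "00:01.0 PCI bridge: Example root port",
    "\tCapabilities: [a0] Express (v2) Root Port",
    "\t\t AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-",
    "01:00.0 VGA compatible controller: Example GPU",
    "\tCapabilities: [58] Express (v2) Legacy Endpoint",
    "\t\t AtomicOpsCap: 32bit+ 64bit+ 128bitCAS-",
    "\t\t AtomicOpsCtl: ReqEn-",
])

for address, flags in parse_atomic_ops(SAMPLE).items():
    print(address, flags)
```

In this made-up sample the GPU advertises 32-bit and 64-bit atomics but the root port does not route them, which is the kind of mismatch discussed above. Whether a given board behaves this way can only be confirmed on the real hardware.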
@Tamila-2017 commented on GitHub (Sep 23, 2024):
Dear mnccouk 💓
You have no need to apologize :-) On the contrary, I am very grateful to you for your courtesy and your great work you have done to analyze my situation and create a magic script to build ROCm.
I think you are right about my outdated hardware.
I will consider buying a more modern CPU and Motherboard, and I would be grateful if you could tell me the system requirements for them.
@mnccouk commented on GitHub (Sep 24, 2024):
@Tamila-2017
I can only forward the motherboard and CPU I'm using as I know that would work:
Motherboard - B450M PRO-VDH MAX - https://www.msi.com/Motherboard/B450M-PRO-VDH-MAX/support
CPU - AMD Ryzen 5 3400G with Radeon Vega Graphics
And of course the RX580
There is a discussion here - https://github.com/ROCm/ROCm/issues/237 about a similar issue to what you have, except in this instance it's just one of the PCIe ports that is failing when attempting to use two graphics cards, even though it has PCIe 3.0 specification.
Maybe others would share what hardware(motherboard\cpu combination) they have successfully used with ROCm to give you some alternatives?
@Tamila-2017 commented on GitHub (Sep 24, 2024):
Ok. Guys, how do you measure speed in tokens/second? I want to measure the speed of CPU N100.
@bellmancity commented on GitHub (Sep 24, 2024):
@Tamila-2017 commented on GitHub (Sep 24, 2024):
So I need to run some model to measure the speed?
I ran the Llama 3.1 model and executed your command, but it did not report the speed:
@mnccouk commented on GitHub (Sep 24, 2024):
Then you make a prompt:-
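The same measurement can also be taken programmatically. This is a minimal sketch that assumes an Ollama server listening on the default local address and uses the non-streaming generate endpoint, whose JSON reply carries the eval_count and eval_duration fields quoted earlier in this thread. The function names are hypothetical, and eval_duration is assumed to be in nanoseconds.

```python
# Sketch: ask a local Ollama server one question and compute tokens/second.
# Assumptions: server at the default address below; eval_duration in ns.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # adjust if yours differs


def build_payload(model: str, prompt: str) -> bytes:
    """JSON body for a single, non-streaming generate request."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def eval_rate(response: dict) -> float:
    """Tokens per second from the timing fields of a generate response."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def measure(model: str, prompt: str, url: str = OLLAMA_URL) -> float:
    """Send the prompt to the server and return the evaluation rate."""
    request = urllib.request.Request(
        url,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return eval_rate(json.load(reply))
```

With the server running and the model already pulled, calling measure("llama3.1", "Why is the sky blue?") returns the rate for that one answer. Repeating the call a few times and averaging gives a steadier number, because the first request also includes model load time in its total duration.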

@Tamila-2017 commented on GitHub (Sep 24, 2024):
Surprisingly, I got a different answer:
And the performance of Llama 3.1 on CPU N100 turned out like this:
@Tamila-2017 commented on GitHub (Sep 24, 2024):
The performance of Llama 3.1 on an Intel i3-4330 3.50GHz CPU:
@Tamila-2017 commented on GitHub (Sep 24, 2024):
Dear mnccouk,
You have achieved a performance of ~26 tokens/sec.
Let me clarify: is this the total performance of the AMD Ryzen 5 3400G + Radeon Vega Graphics + RX580?
Or is it the performance of the RX580 alone?
@AustinPowers1935 commented on GitHub (Sep 25, 2024):
@AustinPowers1935 commented on GitHub (Sep 25, 2024):
@kth8 @Tamila-2017 ^^^
@kth8 commented on GitHub (Sep 25, 2024):
After doing a new install of Ubuntu 22.04, I followed the guide except step 2 (installing amdgpu-dkms), but after rebooting and running rocminfo it seems my GPU is not discovered.
Here are my system logs: https://termbin.com/svjq
I am using RX 470.
@mnccouk commented on GitHub (Sep 25, 2024):
This is just with 1 RX580 + CPU. The CPU is still utilised to some degree, but the GPU is doing all the heavy lifting.
As far as I'm aware, the Radeon Vega Graphics embedded in the CPU is not utilised.
@mnccouk commented on GitHub (Sep 25, 2024):
@kth8
looks like you have similar problem to @Tamila-2017
from your log
@kth8 commented on GitHub (Sep 25, 2024):
Do you know what that means? Is there something wrong with my motherboard or graphics card? Do I need to change a BIOS setting or set a kernel parameter or use a different kernel or something else?
@mnccouk commented on GitHub (Sep 25, 2024):
I believe it's to do with PCI specification - see chat from earlier - https://github.com/ollama/ollama/issues/2453#issuecomment-2369372398
@kth8 commented on GitHub (Sep 25, 2024):
If this is a hardware issue then I guess there is no way to solve this?
@mnccouk commented on GitHub (Sep 25, 2024):
Only by upgrading the motherboard and CPU, I think. (I'm no expert, I only deduced this from what I've read, so I'm open to other thoughts.)
Your logs list your system with detected chipset - https://www.intel.com/content/www/us/en/products/sku/66416/intel-c216-chipset/specifications.html
https://www.intel.com/content/www/us/en/products/sku/65693/intel-core-i33220-processor-3m-cache-3-30-ghz/specifications.html
Both of these are PCIe v2.0.
@Tamila-2017 commented on GitHub (Sep 26, 2024):
Guys, let me express my humble opinion, perhaps erroneous.
Yes, outdated hardware can cause ROCm to fail when it runs.
But it should not affect building ROCm, i.e. ordinary compilation, imho.
@Tamila-2017 commented on GitHub (Sep 27, 2024):
About Atomics and Motherboards: https://github.com/ROCm/ROCm/issues/1146#issuecomment-758624560
Very sad information.... It turns out that ROCm is a capricious and unreliable project :-(
@Tamila-2017 commented on GitHub (Sep 29, 2024):
My Test NVIDIA Graphics Cards Performance:
Of course, modern NVIDIA graphics cards work fast.
But modern AMD cards also have good performance
AMD's problem lies in a completely different, non-caring attitude: the disgusting support for video cards.
As a result, the ROCm project is in a terribly moody state. It requires special motherboards with an undocumented Atomics option for the PCI bus, which their manufacturers usually do not report in their specifications. Therefore, it is impossible to guess in advance whether any motherboard will work or not.
The Ollama project also doesn't give a damn about old AMD graphics cards, paying attention only to new models.
As a result, I completely wasted a lot of time trying to get ROCm to work on the RX580, but I didn't get any of it.
I am completely disappointed in AMD and ROCm for it.
Therefore, I stop any attempts to work with the terrible AMD platform and completely switch to NVIDIA, which does not have these problems and everything is configured in a few minutes and works stably.
Good luck to you, guys! 💓
@T-Shilov commented on GitHub (Oct 8, 2024):
Hello everyone,
I have read this topic carefully, and I want to buy a PRIME X370-PRO motherboard.
Do you think it will be able to work with two AMD RX580 graphics cards?
@Darin755 commented on GitHub (Oct 8, 2024):
It should work; the big question is what kind of performance you will get. That will depend heavily on the CPU and motherboard. For optimal performance you want two x16 slots, but the board you have is either a single x16 or dual x8. That might be fine for your needs, but it is something you should keep in mind.
If you want something that can handle two cards you want a server board with a server CPU which can be pretty expensive. The other option would be to just build two systems. Either way this issue is not really the place for this so I would recommend that you open up a new discussion.
@T-Shilov commented on GitHub (Oct 8, 2024):
@Darin755,
Thank you for your important observation!
So, the PRIME X370-PRO is not suitable for two RX580s :-(
I will need another motherboard, and I would appreciate it if you could tell me its model.
I'm sorry, but is AI performance dependent on the CPU?
After all, graphics cards are used here, which provide the main performance of AI.
Take a look at this, please
Regarding the question of the other thread. The use of RX580 is closely related to the type of motherboard, and it was already discussed above. So I also have the right to discuss here the choice of motherboard for the RX580, Isn't that right?
However, I took your advice into account and asked my question here as well.
@T-Shilov commented on GitHub (Oct 10, 2024):
mnccouk,
I used Ubuntu 24.04 and your handy script: https://github.com/ollama/ollama/issues/2453#issuecomment-2366982275
However, when executing this script, several errors occurred and the script stopped:
Then I used an older Ubuntu 22.04.3 LTS, but other errors occurred and the script stopped again:
Can you please advise me on how this unfortunate situation can be remedied?
@T-Shilov commented on GitHub (Oct 11, 2024):
This topic was created on February 12. It's been about 240 days already!
It contains a lot of questions and various tips on using AMD RX580.
For example, there was created a handy docker for using Ollama.
However, I studied this topic carefully, and realized that the main problem here is a failed installation of ROCm 5.7.1.
After that I tried using the most promising tips for installing ROCm 5.7.1, but unfortunately none of them work.
Well, can someone please take the trouble to summarize and create a how-to for a RELIABLE installation of ROCm 5.7.1? :-)
@mnccouk commented on GitHub (Oct 11, 2024):
Regarding your error:
Check inside your build log -
/var/lib/dkms/amdgpu/6.2.4-1664922.22.04/build/make.log
See if that gives you some extra information on why the build failed.
I noticed that you are using 24.04 of ubuntu, ROCm 5.7.1 drivers are relatively old so there's a chance that something might not be compatible between the two.
@T-Shilov commented on GitHub (Oct 11, 2024):
mnccouk, thank you for the quick response.
But since my attempts were unsuccessful, in the meantime I have already switched to the option of using Ubuntu 20.04.
A little later I will tell you what happened with it.
@T-Shilov commented on GitHub (Oct 12, 2024):
So, since in Ubuntu 22.04 I was not able to overcome this vicious error -
I had to switch to Ubuntu 20.04.
I modified your script a bit using this documentation
It turned out to be 2 parts:
and
The obtained results are in these attachments:
Install-ROCm-571-Ubuntu-20.04-Part-1.pdf
Install-ROCm-571-Ubuntu-20.04-Part-2.pdf
I marked the detected errors in red color.
mnccouk, please take a look at them. My concern is why the RX580 is not detected.
Although when I run Ollama it reports:
>>> AMD GPU ready.
Start-Ollama.pdf
@mnccouk commented on GitHub (Oct 12, 2024):
I've built a docker image that you can try, see - https://github.com/ollama/ollama/issues/2453#issuecomment-2362217923
This was built to detect the older rx580 card, with the prerequisite of making sure the ROCm driver module is already installed on the host and working - This is the step you've been working on.
Also after a fresh reboot, execute the following command:-
sudo dmesg | grep -e amdgpu -e drm
Look through the output for any errors, if the amdgpu module looks to be loaded ok, try Ollama from the docker image.
@T-Shilov commented on GitHub (Oct 12, 2024):
Thanks for the advice, only I didn't understand the necessary sequence of steps I should follow.
Do I need to first uninstall my futile attempts to install ROCm 5.7.1 on Ubuntu 20.04, and then install your docker on a clean Ubuntu?
Or do these actions have to be performed in some other sequence?
Could you please explain in more detail the right sequence of my actions, step by step?
@mnccouk commented on GitHub (Oct 12, 2024):
The sequence is:-
First, get the ROCm drivers installed and working on the host; this is the
step you've been working on. It's still not verified if what you have done
up to now is working.
Seeing the output of the dmesg command will help verify this.
If the drivers are installed and working then move to the next step.
Follow the link already provided for some instructions on how to start the
container.
@T-Shilov commented on GitHub (Oct 12, 2024):
Thanks for the tip. Please, here is the output of your command:
Is this normal or not? I'm very worried about it.
@T-Shilov commented on GitHub (Oct 12, 2024):
Next, I followed these steps (they may be wrong):
@mnccouk commented on GitHub (Oct 12, 2024):
The docker container is just using the CPU there. Referring back to the
output of the dmesg command it looks like there is an issue with the ROCm
drivers using the PCI bus
This message from your log - kfd kfd: amdgpu: skipped device 1002:67df,
PCI rejects atomics 730<0
This is probably why your device is not being detected.
There is some discussion regarding this already in this thread, but I'm
afraid I don't know of a software solution around this issue. It seems as
if the ROCm driver is very particular about the PCI spec required.
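
The dmesg check described above can be scripted. What follows is a minimal sketch, assuming the kernel log has already been captured as text (for example from `sudo dmesg`). The function name `find_skipped_gpus` is hypothetical, and the pattern is modelled only on the single log line quoted in this thread, so other kernel versions may word the message differently. The first line of the sample text is made up for illustration.

```python
import re

# Pattern modelled on the one line quoted above:
#   kfd kfd: amdgpu: skipped device 1002:67df, PCI rejects atomics 730<0
SKIPPED = re.compile(
    r"kfd kfd: amdgpu: skipped device ([0-9a-f]{4}):([0-9a-f]{4}), (.+)"
)


def find_skipped_gpus(dmesg_text):
    """Return (vendor_id, device_id, reason) for each device KFD skipped."""
    return [m.groups() for m in SKIPPED.finditer(dmesg_text)]


# Illustrative sample; only the second line comes from this thread.
sample = (
    "[    3.1] example line unrelated to KFD\n"
    "[    3.4] kfd kfd: amdgpu: skipped device 1002:67df, "
    "PCI rejects atomics 730<0\n"
)

for vendor, device, reason in find_skipped_gpus(sample):
    print(f"{vendor}:{device} skipped: {reason}")
```

An empty result does not prove the GPU is usable; it only means this particular message was not found in the text that was scanned.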
@T-Shilov commented on GitHub (Oct 13, 2024):
Yes, I've already realized that ROCm is a very demanding and capricious thing.
I don't understand why it is complaining about my S1200KP server motherboard, since I think it has a PCIe 3.0 bus, and my CPU is a Xeon E3 1245 v2:
@T-Shilov commented on GitHub (Oct 13, 2024):
Moreover, CPU-Z confirms that on this motherboard the slot is running in the mode PCIe x16 3.0 @ x16 1.1 -
@T-Shilov commented on GitHub (Oct 21, 2024):
mnccouk, you wrote this here:
Unfortunately, "should see details" is very vague and unclear.
Please show the full output of the rocminfo command for RX580.
It is very important.
@mnccouk commented on GitHub (Oct 21, 2024):
In the output of the command you should see the GPU detailed in the
response if it has been correctly identified by the driver.
See
https://rocm.docs.amd.com/projects/rocminfo/en/latest/how-to/use-rocminfo.html
for an example; it's the GPU section that you're looking for, not the CPU one.
@T-Shilov commented on GitHub (Oct 21, 2024):
Thank you for the answer. I got this result, please tell me, is it correct or not?
@T-Shilov commented on GitHub (Oct 22, 2024):
Hello mnccouk and other friends,
I have read this topic carefully and have done a lot of work through numerous trials and errors to create a reliable installation procedure for the capricious ROCm 5.7.1 for the RX580.
I note that it is impossible to get the desired result on Ubuntu 22, 24 and Alma 9.2, because this malicious error invariably occurs every time you compile -
The desired result is possible only on Ubuntu 20.04.
As a result, I created a two-part script that creates a workable ROCm 5.7.1 without much mental strain :-)
I'm giving this script to our community, which is having difficulty installing ROCm 5.7.1.
Before running the script, you need to create a new user: ai
Install_ROCm_5.7.1.zip
@T-Shilov commented on GitHub (Oct 23, 2024):
mnccouk,
unfortunately I don't really understand how to use your instructions for mnccouk/ollama-gpu-rx580 correctly.
It is very brief, and therefore not very clear.
For example, I still can't figure out if it needs to use the sudo command, or if I can do without it.
Also, there are often conflicts between volumes and other confusion.
So I have to delete conflicting volumes frequently.
There is also a problem of delimiting access of several users to one LLM.
Therefore I really don't like docker!
mnccouk, could you tell me how to use ollama with RX580 without using docker?
It would be much clearer than using an extra entity like docker.
@mnccouk commented on GitHub (Oct 23, 2024):
Sorry you're having difficulty getting this up and running, I feel this is
not the place to be discussing this as it's veering off topic. However, in
the interest of getting you up and running here are a few pointers:-
Should you use sudo or not?
This depends on how docker has been set up - see
https://docs.docker.com/engine/install/linux-postinstall/ to give
permission to your user(that runs docker), so that you don't have to use
sudo. Sudo'ing is still ok though to start your docker container if the
post install steps have not been carried out.
I'm not sure why you have conflicts between volumes. The -v
ollama:/root/.ollama option mounts the Docker named volume "ollama" into the
container as the /root/.ollama directory.
The docker container starts with the ollama serve command, which
means the API is exposed, in the case of the docker run command, on
${docker_host_ip}:11434 -
https://github.com/ollama/ollama/blob/main/docs/api.md
I've only focused on building a docker image, so I can't help at this stage
with the request to build without utilisation of docker. My forked source
is here - https://github.com/mnccouk/ollama/tree/rx580_gpu if you would
like to try, I believe there is another build script in the codebase to
achieve this - ./scripts/build_linux.sh but have not tried this myself.
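
For the API point above, here is a minimal sketch of what a client request to the exposed port could look like. It assumes the container is reachable on 127.0.0.1 and that the model name `llama3.1:8b` has already been pulled. The helper `build_generate_request` is a hypothetical name. The sketch only builds the request; the sending part is commented out because it needs a running server.

```python
import json
import urllib.request


def build_generate_request(host, model, prompt):
    """Build (but do not send) a POST to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"http://{host}:11434/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Host and model are assumptions for illustration.
req = build_generate_request("127.0.0.1", "llama3.1:8b", "Why is the sky blue?")
print(req.get_method(), req.full_url)

# Sending requires the container to be running, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

With `"stream": False` the server returns one JSON object instead of a stream of partial responses, which keeps the client code short.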
@T-Shilov commented on GitHub (Oct 23, 2024):
mnccouk,
thanks for the advice, I understand you.
I agree with you that this topic is not the place to discuss docker, it is better to discuss it on discord.
But unfortunately, I can't register on discord because I don't have an actual invitation, and discord rejects my phones for unknown reasons.
So, if mysterious problems arise using docker, could you maybe package your project as an AppImage?
It consists of only one file, and it is much easier and more convenient to work with it.
By the way, I think Debian is much better and more stable compared to Ubuntu.
That's why I redesigned my scripts for Debian 11.
They are not perfect yet, but Ollama is already working with them.
And whereas on Ubuntu Llama 3.1 first thinks after the question "Why is the sky blue?", with a pause of 5-10 seconds or more, on Debian it answers this question without a pause, instantly.
But that was only the first time; after that the answers to other questions came slower and slower - why?
It gives the impression that Llama is getting tired of being asked questions.
The same "fatigue" of Llama is also observed on Ubuntu.
This sluggish Llama behavior is very annoying. Maybe Docker is to blame for it?
Here is the result in Debian 11:
@T-Shilov commented on GitHub (Oct 24, 2024):
I will tell you more about the incomprehensible behavior of Llama 3.1 in your Docker.
On the 1st question "Why is the sky blue?" Llama answered without pause, instantly and showed a speed of 27 tokens/sec.
On the 2nd exactly the same question, Llama thought for a long time, and showed a speed of 12 tokens / sec.
On the 3rd exactly the same question, Llama thought for 4 minutes and showed a speed of 12 tokens/sec.
Why?? I don't understand why this is happening, but this is not normal, and it is impossible to use an Ollama
that is this unpredictable and thinks for so long about a repeated simple question.
PS. When Ollama thinks, the RX580 consumes 105 watts
When Ollama responds, the RX580 consumes 155 watts.
@mnccouk commented on GitHub (Oct 24, 2024):
Do you clear the conversation context between prompting the same question?
If not, processing subsequent questions also processes the previous
conversational context, which includes the responses from previous
questions.
This might be why each prompt is taking longer.
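
The effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it counts whitespace-separated words instead of real tokens, uses a fixed made-up answer length, and ignores any prompt caching the server may do. The function name `prompt_sizes` is hypothetical.

```python
def prompt_sizes(question, answer_words, turns, clear_each_turn=False):
    """Rough words-per-prompt for a question repeated in one chat session."""
    history = []  # words carried over from earlier turns
    sizes = []
    for _ in range(turns):
        prompt = history + question.split()
        sizes.append(len(prompt))
        if clear_each_turn:
            history = []
        else:
            # The reply becomes part of what is processed on the next turn.
            history = prompt + ["word"] * answer_words
    return sizes


q = "Why is the sky blue?"
print(prompt_sizes(q, answer_words=300, turns=3))  # [5, 310, 615]
print(prompt_sizes(q, answer_words=300, turns=3, clear_each_turn=True))  # [5, 5, 5]
```

Even with an identical short question, the amount of text to process grows every turn unless the context is cleared.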
@T-Shilov commented on GitHub (Oct 24, 2024):
No, I didn't clear it, as I was unaware of the need to.
After all, I asked only three identical short questions, and already on the 2nd question there was a problem with Ollama slowing down.
So even in this simple situation I need to clear the context?
I didn't know about this because there is no such problem in ChatGPT.
OK, how exactly is the context of the conversation cleared?
Upd. I found a command that clears the conversational context- this is Ctrl-R (probably).
Now Llama 3.1 began to respond faster, but not instantly, as in the 1st time.
The speed has now become small - 19-22 tokens/sec - why?
@T-Shilov commented on GitHub (Oct 29, 2024):
Hello,
I want to join the LLM community in Discord, but registration by phone is rejected.
Can someone share an invite for Discord?
@kth8 commented on GitHub (Nov 22, 2024):
@mnccouk I managed to get my old RX 470 with i3-3220 CPU to work, not with ROCm but with Vulkan after following this guide: https://www.jeffgeerling.com/blog/2024/llms-accelerated-egpu-on-raspberry-pi-5
Using Llama-3.2-3B-Instruct-Q4_K_M my RX 470 managed to get about 20 token/s, which is half the speed of the RX 6500 XT Jeff got in his benchmark. I also saw there is a PR open to add Vulkan support to Ollama that will make running on older hardware easier: #5059
@Tamila-2017 commented on GitHub (Dec 12, 2024):
Hi all,
I now use an NVIDIA graphics card, which is easy to configure and provides a good speed of 90 tokens per second.
But I still have the RX580, and I'm still interested in using it.
I remember that someone here in this topic promised that full support for it in ROCm would appear soon and that it would also be easy to use.
Tell me please, has this promise already been fulfilled?
@zoumath19 commented on GitHub (Dec 13, 2024):
Waiting on this too, will be immense
@Tamila-2017 commented on GitHub (Dec 13, 2024):
Ok, can you be more specific - 1 month, or 1 year, or more?
@mattiasghodsian commented on GitHub (Jan 28, 2025):
I would also appreciate obtaining support for the AMD RX 580 GPUs, as I have a couple available.
@takitakitanana commented on GitHub (Jan 28, 2025):
AMD RX 580 GPUs support would be awesome
@PhoenixIO commented on GitHub (Jan 30, 2025):
RX580 is quite common, and it would be decent to have official support for it.
@PhoenixIO commented on GitHub (Jan 30, 2025):
If anyone is wondering, after some research, I found a step-by-step solution to make this work on older GPUs: https://github.com/likelovewant/ollama-for-amd
@kth8 commented on GitHub (Feb 2, 2025):
I made a Docker image to easily run Llama on my RX 470 (gfx803) using llama.cpp's Vulkan https://github.com/kth8/llama-server-vulkan
@ChunkyPanda03 commented on GitHub (Feb 3, 2025):
I really like how llama.cpp has Vulkan built in; I wish that Ollama supported Vulkan as well. I think less effort should be put into ROCm support. I say this because you would be able to target more GPUs, as AMD just drops support for cards after 3 years. I have attempted and failed to get Ollama working with the ROCm libraries, but unless someone forks and maintains an older branch for the Polaris 10 GPUs, these patches we are doing will not work with newer versions of the Linux kernel. As it is, ROCm 5.1 does not install on Linux kernel 6.1.0-30: the DKMS module does not compile.
@Tamila-2017 commented on GitHub (Feb 3, 2025):
ChunkyPanda03
And what follows from this? Do you have a ready-made solution?
@siavashmohammady66 commented on GitHub (Feb 3, 2025):
For targeting multiple platforms supporting OpenCL is more important than supporting ROCM
@gl2007 commented on GitHub (Feb 4, 2025):
There is a fork of ollama for vulkan: ollama-vulkan but the problem is that you have to build it yourself. I am going to try it myself :)
@Tamila-2017 commented on GitHub (Feb 16, 2025):
Dear mnccouk,
Your docker for RX580 is working very fine so far 👍
And now I have a question: is it possible to use RX580 and ROCm 5.7.1 without using your docker?
@gl2007 commented on GitHub (Feb 16, 2025):
see my reply to your feature request post. Got it to work without docker. There is also this other repo which I didn't test yet:
@Tamila-2017 commented on GitHub (Feb 16, 2025):
Dear gl2007,
Thank you for your feedback. Please specify where this answer of yours is located for working without a docker?
@gl2007 commented on GitHub (Feb 17, 2025):
See this releases page.
@Tamila-2017 commented on GitHub (Feb 17, 2025):
Thanks for the link. But I'm not using Windows, I'm using Debian. So I need to compile Source code (tar.gz) ?
@gl2007 commented on GitHub (Feb 17, 2025):
Ahh, ok; but it might not be as difficult as it sounds. Under scripts folder, you can find a ...linux.sh which can build it for you. Windows is typically more painful :)
@Tamila-2017 commented on GitHub (Feb 17, 2025):
Ok, I ran this script called install.sh
It was successfully completed:
Unfortunately, the GPU RX580 does not work because the speed is only 4 tokens/sec.
This means that only the CPU is running:
@Tamila-2017 commented on GitHub (Feb 17, 2025):
The previous experiment was on Ubuntu 20.04.
Now I've tried Debian 12.
The software was installed flawlessly again:
But unfortunately, the RX580 doesn't work here either :-(
@robertrosenbusch commented on GitHub (Feb 22, 2025):
First of all, thanks @likelovewant for your great work.
@Tamila-2017: if you can manage without any further problems, take a look at my ollama/pytorch repos for gfx803. They summarize the following steps for Ollama on RX5X0.
My steps to use Ollama v0.5.12 on gfx803/Linux were:
discover/gpu.go: from "9" to number 8
CMakePresets.json: AMDGPU_TARGETS gfx803, like "AMDGPU_TARGETS": "gfx803;
CMakeLists.txt: like "^gfx(803|900|9[...]
@siavashmohammady66 commented on GitHub (Feb 24, 2025):
Thank you a lot @robertrosenbusch
Could you explain each step in more detail?
Thank you
@robertrosenbusch commented on GitHub (Feb 25, 2025):
Of course :D The short version will be to take a look on my documentation for gfx803-Ollama Dockerfile and/or on my install Instructions for Ollama :D
Hint: It's not necessary to use any specific ROCm installation/libraries on your host system, because you pass through
--device=/dev/kfd --device=/dev/dri
from the kernel to the ROCm 6.3 Docker image. You only need a kernel version where both of these devices are available. And of course the docker container is independent of the host Linux version used.... And you need a lot of time to download and compile Ollama for gfx803 ^.
ln -s /opt/rocm-6.3.0 /opt/rocm/install.sh -ida gfx803
sed -i 's/var RocmComputeMajorMin = "9"/var RocmComputeMajorMin = "8"/' discover/gpu.go
CMakePresets.json and CMakeLists.txt
sed -i 's/"gfx900;gfx940;gfx941;gfx942;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-" /"gfx803;gfx900;gfx940;gfx941;gfx942;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-" /g' CMakePresets.json
sed -i 's/"list(FILTER AMDGPU_TARGETS INCLUDE REGEX "^gfx(900|94[012]|101[02]|1030|110[012])$")"/"list(FILTER AMDGPU_TARGETS INCLUDE REGEX "^gfx(803|900|94[012]|101[02]|1030|110[012])$")"/g' CMakeLists.txt
cmake -B build -DAMDGPU_TARGETS=gfx803 && cmake --build build
go generate ./... && go build .
./ollama serve& into the background. You should get an output similar to this
ollama run llama3.1:8b
You should get an output similar to this
But all these steps are done for you by my Dockerfile_rocm63_ollama :D
Benchmark I took last month:
ROCm-6.3.0 Ollama v0.5.4 Benchmark on RX570 vs CPU Ryzen7 3700x
@Tamila-2017 commented on GitHub (Feb 25, 2025):
robertrosenbusch, but you're using a very complicated method.
What advantages does it have compared to this simple method?
And here the obtained speed is demonstrated.
@robertrosenbusch commented on GitHub (Feb 25, 2025):
@Tamila-2017 : its so funny to talk with an AI-Robot ;P Whats your conclusion to simplyfy the 5-Steps on my Dockerfile to use gfx803 on a similar ubuntu?
@Tamila-2017 commented on GitHub (Feb 25, 2025):
Robertrosenbusch , I'm sorry, but unfortunately I don't understand the meaning of your question.
Could you ask your question in more detail?
@sanchez314c commented on GitHub (Feb 27, 2025):
@robertrosenbusch have you got this to work on bare-metal or just in Docker?
@robertrosenbusch commented on GitHub (Feb 27, 2025):
"Just" in Docker, because you are independent of whatever ROCm version, Linux flavour or Linux version (Ubuntu, PopOS, CentOS, Arch etc.) you use on your bare metal. And of course it's well documented, with the "official" AMD ROCm Docker container on the last available ROCm version 6.3 and the last Ollama version v0.5.12 :P And of course the handling is much easier.
I have no clue why anyone would use it on bare metal. There is no performance impact, because Docker is a "simple" process isolation with well-known documentation and a very good tool/userland. You need only six/6 (!) steps to run Ollama v0.5.12 with gfx803 on your Linux (insert_flavor/_insert_flavor_version) :P From zero to a fully working Ollama on gfx803.
But hey, feel free to compile and install it on your bare-metal. To install gfx803 ROCm-SoftwareStack on well known working Baremetal i am out.
Beware: apart from the hassle of getting a fully functional gfx803 ROCm on your bare metal, you have to change these files in the official Ollama Git:
CMakeLists.txt, CMakePresets.json and discover/gpu.go
Or you are using these Ollama Fork, because there is a small change in gpu.go on Line 74 included ^.^
@sanchez314c commented on GitHub (Feb 28, 2025):
@robertrosenbusch why would anyone want to use it on Bare-metal instead of Docker? Why, NOT. Anything run at a bare-metal level is going to be better.
I was asking because I have yet to be able to get a full version to compile. I keep crashing and arriving at CPU related errors for MAXVINNI and I don't even know where they're coming from because I'm not specifying them.
@Tamila-2017 commented on GitHub (Feb 28, 2025):
You are absolutely right! Which is why I don't like Docker. It's an unnecessary layer of complexity.
@robertrosenbusch commented on GitHub (Feb 28, 2025):
@sanchez314c: Sorry for being harsh. Let's focus on the Ollama source code and why, for months now, you cannot use gfx803 without changes/patches to the Ollama source code, regardless of whether you use it on bare metal or in Docker.
gpu.go
CMakePresets.json
CMakeLists.txt
If you have a fully working bare-metal/Docker ROCm environment for gfx803... check out the latest Ollama release via Git, change the three files, recompile Ollama and be happy :P
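
The three-file change can also be expressed as plain string replacements, which may be easier to reason about than sed quoting. This is a sketch, not the method used in the Dockerfile: the function names are hypothetical, the search strings are taken from the sed commands quoted earlier in this thread (Ollama v0.5.12 era), and the file contents of other Ollama versions may differ, so the replacements may not match there.

```python
def patch_gpu_go(text):
    """Lower the minimum ROCm compute major version from 9 to 8."""
    return text.replace(
        'var RocmComputeMajorMin = "9"', 'var RocmComputeMajorMin = "8"'
    )


def patch_cmake_presets(text):
    """Prepend gfx803 to the AMDGPU target list, unless already present."""
    if '"gfx803;' in text:
        return text
    return text.replace('"gfx900;gfx940;', '"gfx803;gfx900;gfx940;')


def patch_cmake_lists(text):
    """Let gfx803 through the AMDGPU_TARGETS filter regex."""
    if "^gfx(803|" in text:
        return text
    return text.replace("^gfx(900|", "^gfx(803|900|")


# Tiny stand-ins for the real file contents (illustrative only).
print(patch_gpu_go('var RocmComputeMajorMin = "9"'))
print(patch_cmake_presets('"AMDGPU_TARGETS": "gfx900;gfx940;gfx941"'))
print(patch_cmake_lists('REGEX "^gfx(900|94[012]|101[02]|1030|110[012])$"'))
```

Each function returns the text unchanged when its search string is missing, so it is worth comparing input and output before rebuilding.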
@sanchez314c commented on GitHub (Mar 6, 2025):
Can anyone confirm a successful compile on bare-metal and include what version of ROCm they are using? I'm still hitting walls trying to get this going and would really appreciate any guidance/help. I know I'm not the only one with legacy hardware (RX580 and Tesla K80) and I'm sure there are a lot of people out there who would appreciate this. In the midst of trying to figure this out, my objective is to build a scripted install that does everything as well as detect system aspects. I have most of that working, I just can't get the compiles working. There are NO CLEAR instructions -- ANYWHERE -- including here, and none of the GitHub repos like Ollama-for-AMD include clear, easy instructions either. Everything is somewhat cryptic. And for someone who doesn't know how to compile and is learning, this is extremely frustrating and difficult. I don't understand why you guys don't just add these legacy support provisions to source.
@dariosusman commented on GitHub (Mar 7, 2025):
I'm not entirely sure yet, but I seem to have been able to get this running on a bare-metal
https://github.com/likelovewant/ollama-for-amd/issues/62#issuecomment-2705481206
@sanchez314c commented on GitHub (Mar 9, 2025):
@robertrosenbusch
⠏ time=2025-03-09T05:58:51.134Z level=ERROR source=sched.go:456 msg="error loading llama server" error="llama runner process has terminated: exit status 2"
[GIN] 2025/03/09 - 05:58:51 | 500 | 1.051664302s | 127.0.0.1 | POST "/api/generate"
Error: llama runner process has terminated: exit status 2
root@02574468dc8f:/ollama# time=2025-03-09T05:58:56.135Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.001018716 model=/root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc
time=2025-03-09T05:58:56.385Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.251154112 model=/root/.ollama/models/blobs/sha256-aabd4debf0c8f08881923f2c25fc0fdeed24435271c2b3e92c4af36704040dbc
time=2
Cannot get it to work.
@sanchez314c commented on GitHub (Mar 12, 2025):
@robertrosenbusch
heathen-admin@LLMServer:~/ollama$ ./ollama run tinyllama
[GIN] 2025/03/11 - 21:02:00 | 200 | 47.412µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/03/11 - 21:02:00 | 200 | 5.869556ms | 127.0.0.1 | POST "/api/show"
time=2025-03-11T21:02:00.108-04:00 level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.key_length default=64
time=2025-03-11T21:02:00.108-04:00 level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.value_length default=64
time=2025-03-11T21:02:00.108-04:00 level=INFO source=sched.go:715 msg="new model will fit in available VRAM in single GPU, loading" model=/home/heathen-admin/.ollama/models/blobs/sha256-2af3b81862c6be03c769683af18efdadb2c33f60ff32ab6f83e42c043d6c7816 gpu=0 parallel=4 available=7603056640 required="1.7 GiB"
time=2025-03-11T21:02:00.109-04:00 level=INFO source=server.go:97 msg="system memory" total="125.5 GiB" free="119.7 GiB" free_swap="8.0 GiB"
time=2025-03-11T21:02:00.109-04:00 level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.key_length default=64
time=2025-03-11T21:02:00.109-04:00 level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.value_length default=64
time=2025-03-11T21:02:00.109-04:00 level=INFO source=server.go:130 msg=offload library=rocm layers.requested=-1 layers.model=23 layers.offload=23 layers.split="" memory.available="[7.1 GiB]" memory.gpu_overhead="0 B" memory.required.full="1.7 GiB" memory.required.partial="1.7 GiB" memory.required.kv="176.0 MiB" memory.required.allocations="[1.7 GiB]" memory.weights.total="696.1 MiB" memory.weights.repeating="644.8 MiB" memory.weights.nonrepeating="51.3 MiB" memory.graph.full="544.0 MiB" memory.graph.partial="546.3 MiB"
time=2025-03-11T21:02:00.109-04:00 level=INFO source=server.go:380 msg="starting llama server" cmd="/home/heathen-admin/ollama/ollama runner --model /home/heathen-admin/.ollama/models/blobs/sha256-2af3b81862c6be03c769683af18efdadb2c33f60ff32ab6f83e42c043d6c7816 --ctx-size 8192 --batch-size 512 --n-gpu-layers 23 --threads 10 --parallel 4 --port 43799"
time=2025-03-11T21:02:00.109-04:00 level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2025-03-11T21:02:00.110-04:00 level=INFO source=server.go:557 msg="waiting for llama runner to start responding"
time=2025-03-11T21:02:00.110-04:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server error"
time=2025-03-11T21:02:00.121-04:00 level=INFO source=runner.go:932 msg="starting go runner"
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 2 CUDA devices:
Device 0: Tesla K80, compute capability 3.7, VMM: yes
Device 1: Tesla K80, compute capability 3.7, VMM: yes
load_backend: loaded CUDA backend from /home/heathen-admin/ollama/build/lib/ollama/libggml-cuda.so
⠇ ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon RX 580 Series, compute capability 8.0, VMM: no
load_backend: loaded ROCm backend from /home/heathen-admin/ollama/build/lib/ollama/libggml-hip.so
load_backend: loaded CPU backend from /home/heathen-admin/ollama/build/lib/ollama/libggml-cpu-skylakex.so
time=2025-03-11T21:02:00.951-04:00 level=INFO source=runner.go:935 msg=system info="CPU : LLAMAFILE = 1 | CPU : LLAMAFILE = 1 | CUDA : ARCHS = 370 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | ROCm : PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | AVX512 = 1 | LLAMAFILE = 1 | cgo(gcc)" threads=10
time=2025-03-11T21:02:00.951-04:00 level=INFO source=runner.go:993 msg="Server listening on 127.0.0.1:43799"
⠏ llama_load_model_from_file: using device CUDA0 (Tesla K80) - 11354 MiB free
⠏ time=2025-03-11T21:02:01.114-04:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server loading model"
llama_load_model_from_file: using device CUDA1 (Tesla K80) - 11354 MiB free
llama_load_model_from_file: using device ROCm0 (Radeon RX 580 Series) - 8148 MiB free
llama_model_loader: loaded meta data with 23 key-value pairs and 201 tensors from /home/heathen-admin/.ollama/models/blobs/sha256-2af3b81862c6be03c769683af18efdadb2c33f60ff32ab6f83e42c043d6c7816 (version GGUF V3 (latest))
llama_model_loader: loaded meta data with 23 key-value pairs and 201 tensors from /home/heathen-admin/.ollama/models/blobs/sha256-2af3b81862c6be03c769683af18efdadb2c33f60ff32ab6f83e42c043d6c7816 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = llama
llama_model_loader: - kv 1: general.name str = TinyLlama
llama_model_loader: - kv 2: llama.context_length u32 = 2048
llama_model_loader: - kv 3: llama.embedding_length u32 = 2048
llama_model_loader: - kv 4: llama.block_count u32 = 22
llama_model_loader: - kv 5: llama.feed_forward_length u32 = 5632
llama_model_loader: - kv 6: llama.rope.dimension_count u32 = 64
llama_model_loader: - kv 7: llama.attention.head_count u32 = 32
llama_model_loader: - kv 8: llama.attention.head_count_kv u32 = 4
llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 10: llama.rope.freq_base f32 = 10000.000000
llama_model_loader: - kv 11: general.file_type u32 = 2
llama_model_loader: - kv 12: tokenizer.ggml.model str = llama
llama_model_loader: - kv 13: tokenizer.ggml.tokens arr[str,32000] = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv 14: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv 15: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv 16: tokenizer.ggml.merges arr[str,61249] = ["▁ t", "e r", "i n", "▁ a", "e n...
llama_model_loader: - kv 17: tokenizer.ggml.bos_token_id u32 = 1
llama_model_loader: - kv 18: tokenizer.ggml.eos_token_id u32 = 2
llama_model_loader: - kv 19: tokenizer.ggml.unknown_token_id u32 = 0
llama_model_loader: - kv 20: tokenizer.ggml.padding_token_id u32 = 2
llama_model_loader: - kv 21: tokenizer.chat_template str = {% for message in messages %}\n{% if m...
llama_model_loader: - kv 22: general.quantization_version u32 = 2
llama_model_loader: - type f32: 45 tensors
llama_model_loader: - type q4_0: 155 tensors
llama_model_loader: - type q6_K: 1 tensors
llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
llm_load_vocab: special tokens cache size = 3
llm_load_vocab: token to piece cache size = 0.1684 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = llama
llm_load_print_meta: vocab type = SPM
llm_load_print_meta: n_vocab = 32000
llm_load_print_meta: n_merges = 0
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 2048
llm_load_print_meta: n_embd = 2048
llm_load_print_meta: n_layer = 22
llm_load_print_meta: n_head = 32
llm_load_print_meta: n_head_kv = 4
llm_load_print_meta: n_rot = 64
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 64
llm_load_print_meta: n_embd_head_v = 64
llm_load_print_meta: n_gqa = 8
llm_load_print_meta: n_embd_k_gqa = 256
llm_load_print_meta: n_embd_v_gqa = 256
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-05
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 5632
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 0
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 2048
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: ssm_dt_b_c_rms = 0
llm_load_print_meta: model type = 1B
llm_load_print_meta: model ftype = Q4_0
llm_load_print_meta: model params = 1.10 B
llm_load_print_meta: model size = 606.53 MiB (4.63 BPW)
llm_load_print_meta: general.name = TinyLlama
llm_load_print_meta: BOS token = 1 '<s>'
llm_load_print_meta: EOS token = 2 '</s>'
llm_load_print_meta: UNK token = 0 '<unk>'
llm_load_print_meta: PAD token = 2 '</s>'
llm_load_print_meta: LF token = 13 '<0x0A>'
llm_load_print_meta: EOG token = 2 '</s>'
llm_load_print_meta: max token length = 48
⠙ llm_load_tensors: offloading 22 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 23/23 layers to GPU
llm_load_tensors: ROCm0 model buffer size = 169.48 MiB
llm_load_tensors: CUDA0 model buffer size = 212.77 MiB
llm_load_tensors: CUDA1 model buffer size = 189.12 MiB
llm_load_tensors: CPU_Mapped model buffer size = 35.16 MiB
SIGSEGV: segmentation violation
PC=0x729e387ac935 m=3 sigcode=1 addr=0x18
signal arrived during cgo execution
goroutine 50 gp=0xc000105340 m=3 mp=0xc00009ce08 [syscall]:
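
One detail in the log above is that the model weights were spread over the ROCm device and both CUDA devices before the crash. Whether that mixed split is related to the SIGSEGV is not established here. A minimal sketch for extracting the per-device buffer sizes from such a log follows; the function name `buffer_sizes` is hypothetical, and the pattern only covers the `llm_load_tensors` line format shown in this paste.

```python
import re

BUFFER = re.compile(
    r"llm_load_tensors:\s+(\S+) model buffer size =\s+([\d.]+) MiB"
)


def buffer_sizes(log_text):
    """Map each buffer name (e.g. ROCm0, CUDA0) to its size in MiB."""
    return {name: float(size) for name, size in BUFFER.findall(log_text)}


# The four lines below are copied from the log in this thread.
log = """\
llm_load_tensors: ROCm0 model buffer size = 169.48 MiB
llm_load_tensors: CUDA0 model buffer size = 212.77 MiB
llm_load_tensors: CUDA1 model buffer size = 189.12 MiB
llm_load_tensors: CPU_Mapped model buffer size = 35.16 MiB
"""

sizes = buffer_sizes(log)
# Strip the trailing device index to get the backend family.
gpu_backends = sorted(
    {re.sub(r"\d+$", "", name) for name in sizes if name != "CPU_Mapped"}
)
print(sizes)
print("GPU backends in use:", gpu_backends)
```

Seeing more than one GPU backend in the result indicates the runner loaded both the CUDA and the ROCm libraries for the same model.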
@chboishabba commented on GitHub (Mar 15, 2025):
the segv is due to rocm, and that is why the docker image exists. i think possibly some ops are not supported, e.g. fp16; i am investigating currently
https://github.com/lamikr/rocm_sdk_builder/issues/228
@mon-jai commented on GitHub (Mar 17, 2025):
This issue might be resolved if #9650 is merged.
Ollama with Vulkan runs perfectly on my Radeon RX 5700, even though it isn't supported by ROCm. Just run the installer and you are good to go, no manual file swapping required.
For now, we can use the binaries compiled by @McBane87: https://github.com/whyvl/ollama-vulkan/issues/7#issue-2828064858
@chboishabba commented on GitHub (Mar 19, 2025):
@mon-jai rocm_sdk_builder enables ROCm support on these cards, so it should only require a working ROCm installation. I believe robertrosenbusch's Docker image should provide most of what is required, as it already provides vLLM for gfx803. Still testing on my end.
https://github.com/robertrosenbusch/gfx803_rocm/issues/6
https://github.com/lamikr/rocm_sdk_builder/issues/173
@robertrosenbusch commented on GitHub (Mar 21, 2025):
@chboishabba @mon-jai :
Hi guys, girls, and everyone in between and beyond. Be careful with each other.
@chboishabba: you didn't answer my questions in my Git repo's issue about WhisperX on ROCm/gfx803.
@mon-jai : both work on Ollama 0.6.2 with an RX570 (Polaris/gfx803) and ROCm 6.3. Take a look at the Dockerfile I published.
./ollama run tinyllama
[GIN] 2025/03/21 - 23:08:55 | 200 | 32.127µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/03/21 - 23:08:55 | 200 | 12.315771ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/03/21 - 23:08:55 | 200 | 5.944312ms | 127.0.0.1 | POST "/api/generate"
cell in our body where it's used to perform chemical reactions. The absorbed amount of this nutrient can be expressed as a percentage of the total amount
consumed (usually in terms of weight).
compound is essential for the organism to function properly, then adsortion may occur. However, most nutrients that are not absorbed into the bloodstream
but are still used by the body can be metabolized and utilized as energy or stored in tissues or organs for later use.
In summary, absorption occurs when a compound is completely absorbed from the food we eat, while adsortion occurs when a substance is not fully absorbed
into the bloodstream by an organism. Both processes are important for nutrition and can affect our physical and mental health in various ways.
[GIN] 2025/03/21 - 23:14:41 | 200 | 5.40056913s | 127.0.0.1 | POST "/api/chat"
total duration: 5.400487346s
load duration: 9.143939ms
prompt eval count: 681 token(s)
prompt eval duration: 8.594151ms
prompt eval rate: 79239.94 tokens/s
eval count: 347 token(s)
eval duration: 5.37478095s
eval rate: 64.56 tokens/s
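The two rates in the stats above are simply token counts divided by durations. A small Python sketch reproduces them from the numbers shown; the function name `tokens_per_second` is hypothetical, and the inputs are copied from the output above.

```python
def tokens_per_second(token_count, duration_seconds):
    """Throughput as tokens divided by wall-clock seconds."""
    return token_count / duration_seconds


# Values copied from the stats above.
prompt_rate = tokens_per_second(681, 8.594151e-3)  # 8.594151 ms
eval_rate = tokens_per_second(347, 5.37478095)     # 5.37478095 s

print(f"prompt eval rate: {prompt_rate:.2f} tokens/s")
print(f"eval rate:        {eval_rate:.2f} tokens/s")
```

The prompt eval duration is only about 8.6 ms, so the prompt rate is very sensitive to timing noise. The eval rate (about 64.56 tokens/s here) is the more useful figure when comparing cards.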
@chboishabba commented on GitHub (Mar 24, 2025):
Yes sorry haven't had moment
@lsunay commented on GitHub (Apr 8, 2025):
Setting Up Ollama with AMD Radeon RX 580 GPU Support Using Docker Containers
Hey everyone! I'm super excited to share with you how I set up Ollama with AMD Radeon RX 580 GPU support using Docker containers. A big shoutout to everyone whose posts and guides I've learned from along the way - your help has been invaluable!
In this post, I'll guide you through the process step by step. We'll use the following images:
docker.io/rocm/dev-ubuntu-22.04:5.7.1-complete - This image contains the ROCM 5.7.1 libraries, which support the RX 580 GPU.
mnccouk/ollama-gpu-rx580:latest - This image contains Ollama, which is compatible with ROCM 5.7.1.
Step 1: Stop the Existing Ollama Container
If you have an existing Ollama container running on your CPU, stop it to avoid conflicts:
Step 2: Start the ROCM Host Container
First, we'll start the ROCM host container, which will provide the necessary ROCM 5.7.1 libraries:
Inside the rocm_host container, verify that the GPU is recognized by running:
Step 3: Start the Ollama GPU Container
Next, start the Ollama GPU container using the volume from the existing Ollama container (if you want to use the existing models) and the ROCM host container:
Step 4: Start a Chat Session with Ollama
Once the container is running, you can start a chat session using the llama3.1 model:
Step 5: Managing the Containers
To stop the running Ollama GPU container:
To start the Ollama GPU container again:
Step 6: Monitoring GPU Usage
To monitor GPU usage on the host machine, you can use the rocm-smi command:
To monitor GPU usage inside the rocm_host container, first enter the container:
Then, run the rocm-smi command inside the container:
Additional Notes:
rocm-smi command is correctly installed and accessible in the rocm_host container.
System Information:
By following these steps, you should be able to set up and run Ollama with AMD Radeon RX 580 GPU support using Docker containers. Happy computing, and thank you again to everyone who helped me along the way!
@robertrosenbusch commented on GitHub (Apr 9, 2025):
@lsunay: That's cool!
Just some hints for using GFX803 with Ollama in a Docker container. It could save you some lifetime :P
@chboishabba commented on GitHub (Apr 10, 2025):
Propose possibly using pre built updated binaries from https://github.com/lamikr/rocm_sdk_builder
@MicahBird commented on GitHub (Apr 10, 2025):
@lsunay Thank you so much for the brief guide! I can confirm that it also works on a Radeon RX 570 with 4GB of VRAM. Testing llama3.2:1b I was able to get 72.94 tokens/s!
@mon-jai commented on GitHub (Apr 10, 2025):
@chboishabba There isn't any Windows build yet. 🥲
@chboishabba commented on GitHub (Apr 11, 2025):
@dhiltgen
https://github.com/lamikr/rocm_sdk_builder/issues/231 ?
@lsunay commented on GitHub (Apr 11, 2025):
Feedback on Linux Kernel Version and ROCm SegFaults
Hello,
Here is the Linux kernel version I am using:
uname -r
6.1.0-32-amd64
I checked if my kernel version is among the working versions listed for ROCm 6.3 with Ollama and PyTorch. Since the provided list is for ROCm 6.3, and I am using ROCm 5.7, I just wanted to share my kernel version for reference. No further action is needed from my side.
If you have any recommendations or information regarding this version, please share. Thank you!
@robertrosenbusch commented on GitHub (Apr 11, 2025):
@lsunay: Hi and welcome back!
The problem with the Docker setup is not the ROCm version used (5.7.1 or 6.3), nor Ollama, GFX803, or your distro. The problem is in kernel versions 6.12 and 6.13 (fixed in 6.14), in how they handle the two kernel devices /dev/dri and /dev/kfd on your host system *grmpf*
A freshly installed distro in April 2025 (Debian 13, Arch, Fedora 41) with your Docker HowTo will end in the same segfault. That is what I expected, and I am sorry :P
But it's the weekend, and I will test your HowTo (which I am really thankful for) on current Arch Linux with the shipped kernels 6.13 and 6.12 to verify. I am just curious whether it will work fine -·^
So please hold the line :D I will look into it and give some feedback.
Robert Rosenbusch
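To make the kernel rule from this comment easy to check, here is a minimal Python sketch that classifies a `uname -r` string. It encodes only what the thread reports (6.12 and 6.13 affected, 6.14 fixed) and is not an official compatibility list. The function names are hypothetical, and the `6.13.2-arch1-1` sample string is invented for illustration.

```python
import re

# Kernel series reported in this thread to segfault when /dev/kfd and
# /dev/dri are used from Docker; 6.14 is reported as fixed.
AFFECTED_SERIES = {(6, 12), (6, 13)}


def kernel_series(release):
    """Parse (major, minor) from a `uname -r` string such as '6.1.0-32-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(release):
    """True if the release belongs to a series the thread reports as broken."""
    return kernel_series(release) in AFFECTED_SERIES


# The first and last strings appear in this thread; the middle one is made up.
for release in ("6.1.0-32-amd64", "6.13.2-arch1-1", "6.14.0-63.fc42.x86_64"):
    status = "affected" if is_affected(release) else "not in the reported range"
    print(f"{release}: {status}")
```

On a live system, passing `platform.release()` to `is_affected` checks the running kernel.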
@lsunay commented on GitHub (Apr 11, 2025):
Subject: Feedback on amdgpu Error with Kernel 6.1.0-32-amd64
Hey Robert,
Thanks for the detailed info and updates on the kernel issues with ROCm! I’ve been following your posts and wanted to share an issue I’m facing on my setup, hoping to get some feedback if you’ve come across something similar. 😊
I’m running into an error in my logs: amdgpu: init_user_pages: Failed to get user pages: -1. It shows up multiple times, especially during GPU-intensive tasks like model execution. I know you mentioned that the main problem with SegFaults is tied to Kernel 6.12 and 6.13 (fixed in 6.14), but I’m on a different version and still seeing issues, so I thought I’d share my setup for reference.
Here’s what I’ve got:
uname -r)
lspci | grep -i vga: 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Polaris 20 XL [Radeon RX 580 2048SP] (rev ef))
amdgpu and related modules are loaded (confirmed with lsmod | grep amdgpu)
journalctl -p 3 -xb): Repeated entries of amdgpu: init_user_pages: Failed to get user pages: -1 during workload.
I’ve checked dmesg and other logs, and there’s no direct SegFault reported yet, but I’m wondering if this amdgpu error could lead to crashes or if it’s a sign of a deeper compatibility issue, even on Kernel 6.1. From what I understand, this error points to the driver failing to access user space memory pages, maybe due to memory management or kernel-driver mismatch.
I really appreciate your research on kernel versions and the heads-up on 6.14 fixing the SegFaults for 6.12/6.13. Since I’m on 6.1, I’m curious if this could still be related or if it’s something else (like IOMMU settings or ROCm version; I’m on 5.7, by the way). I’m planning to dig deeper into logs and maybe test a kernel update if needed.
If you’ve seen this amdgpu error before or have any tips for Kernel 6.1 with an RX 580, I’d be super grateful for your input. No rush, just wanted to share this and hold the line as you mentioned! :D Looking forward to any updates from your tests on Arch with 6.12/6.13. Thanks again for the great info and support!
Cheers,
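To judge how often the error above occurs during a workload, counting the matching lines in saved `journalctl -p 3 -xb` output is enough. A minimal Python sketch follows; the function name and the sample journal lines (timestamps, host name) are hypothetical, and only the error string itself comes from the post.

```python
ERROR = "amdgpu: init_user_pages: Failed to get user pages: -1"


def count_user_page_errors(journal_text):
    """Count journal lines containing the amdgpu user-pages error."""
    return sum(1 for line in journal_text.splitlines() if ERROR in line)


# Made-up journal excerpt for illustration only.
sample = """\
Apr 11 10:00:01 host kernel: amdgpu: init_user_pages: Failed to get user pages: -1
Apr 11 10:00:02 host kernel: some unrelated error
Apr 11 10:00:05 host kernel: amdgpu: init_user_pages: Failed to get user pages: -1
"""
print(count_user_page_errors(sample))
```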
@robertrosenbusch commented on GitHub (Apr 12, 2025):
@lsunay: Hi and welcome back. Thanks for your additional Information
My short feedback:
/dev/dri and /dev/kfd: independent of Ollama and excluding Ubuntu. It's just a bug in kernel versions 6.12 and 6.13.
amdgpu: init_user_pages: Failed to get user pages: -1, independent of the kernel version you are using.
I am not able to look deep inside what mnccouk/ollama-gpu-rx580 is doing in its Docker image, because it is definitely not well documented and over 7 months old... like a lot of things about GFX803 *grmpf*, it's a little frustrating.
These error logs are independent of which kernel/ROCm/Ollama version you use.
However, on ROCm 6.3 and Ollama 0.6.5 there is the same problem, but only once, when you load a new model/LLM in Ollama.
So far my own researchs.
Cheers, Robert Rosenbusch
@robertrosenbusch commented on GitHub (Apr 13, 2025):
@lsunay : Hi ·^ It seems AMD published ROCm v6.4.0 last week. Over the next few weeks I will check when AMD publishes their ROCm-PyTorch v6.4 Docker container. As far as I can tell from a first look at the release notes, it shouldn't be a big problem to recompile ROCm v6.4 and Ollama on gfx803.
Cheers, Robert Rosenbusch
@lsunay commented on GitHub (Apr 13, 2025):
Hi Robert,
Thanks for the update! 😊 It's great to hear that AMD has released ROCm v6.4.0 last week. I'm really looking forward to seeing how it performs with the ROCm-PyTorch v6.4 Docker container once it's available. From your initial look at the Release Notes, it sounds promising that recompiling ROCm v6.4 and Ollama on gfx803 shouldn't be a major issue.
I'd love to be more proactive and try compiling your Docker files if this turns out to be successful. However, at the moment, I'm a bit hesitant to make changes to my current setup as it's working fine, and I don't want to risk breaking anything. Additionally, my SSD drives with the models are pretty much full, so I don't have much space to experiment right now.
On a related note, I've noticed that the Ollama version in the mnccouk/ollama-gpu-rx580:latest image is outdated, and because of this, it can't run newer models like gemma3:4b. This issue actually makes me even more eager to try out newly created Docker containers that could support the latest models and updates.
I'll definitely keep an eye on your progress and updates over the next few weeks. Please do share any findings or new containers when you get a chance to work on them. I'm eager to test things out once I free up some space and feel confident about making changes. Thanks again for your efforts and for keeping us in the loop!
Note: This response was refined with the assistance of grok-3-beta, though the core message remains unchanged. Adding: You should have noticed before, in fact, I'm not much of a talker 😊
Cheers,
Levent Sunay
@robertrosenbusch commented on GitHub (Apr 14, 2025):
@lsunay
Told you so, it's outdated :P Here are my benchmarks of different LLMs based on the ROCm 6.3/Ollama 0.6.5 Docker container:
benchmark_ollama0.6.5_rocm63.txt
UPDATE: I was a little greedy to know... ROCm 6.4 works fine on GFX803/Ollama 0.6.5
@vpereira commented on GitHub (May 6, 2025):
With llama.cpp + Vulkan it is working well:
@mon-jai commented on GitHub (May 6, 2025):
@vpereira Sounds promising! Any idea how to use it with Open WebUI?
@vpereira commented on GitHub (May 6, 2025):
sure, i followed https://docs.openwebui.com/getting-started/quick-start/starting-with-llama-cpp/#step-3-serve-the-model-with-llamacpp
I just added a systemd service file and added --host 0.0.0.0 to the example, and it is working flawlessly. I just spend my time watching nvtop 🤖
@robertrosenbusch commented on GitHub (May 19, 2025):
@vpereira: Benchmark with ROCm 6.4.0 on RX570 with llama.cpp
llama.cpp# /llama.cpp/build/bin/llama-bench -m gemma-3-1b-it-UD-IQ1_S.gguf
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
Device 0: Radeon RX 570 Series, gfx803 (0x803), VMM: no, Wave Size: 64
Cheers, Robert Rosenbusch
@siavashmohammady66 commented on GitHub (May 19, 2025):
Fantastic
Could you explain how to do it?
@robertrosenbusch commented on GitHub (May 19, 2025):
Hi @siavashmohammady66 and welcome back!
What should I explain?! :P There is a "little" Git repo for all this AI-GFX803 stuff. But be aware: you should be able to use Docker and have a lot of storage space and time to recompile ^.^
However, if you wanna use llama.cpp only its not a big deal.
Cheers, Robert Rosenbusch
@dakshcs commented on GitHub (Jun 27, 2025):
hello smoothbrain non-ai webdev here, is Polaris 10/20/30 support unofficial atp or will it be integrated into ollama soon?
I have a Polaris 30XT GPU and would love to use it!
@chboishabba commented on GitHub (Jun 28, 2025):
generally polaris are not supported for GPGPU, however this is the purpose of Robert's repo.
@robertrosenbusch commented on GitHub (Jul 10, 2025):
@chboishabba @phoenix277yt : I did it to fix it and make it easier :P ROCm 6.4.1 and Ollama 0.9.5 on gfx803 (we are in 2025)
docker pull robertrosenbusch/rocm6_gfx803_ollama:6.4.1_0.9.5
docker run -it --device=/dev/kfd --device=/dev/dri --group-add=video --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -p 8080:8080 -p 11434:11434 --name rocm64_ollama_095 robertrosenbusch/rocm6_gfx803_ollama:6.4.1_0.9.5 bash
*badüm* Depending on your ISP/hardware, you will need some time for the download.
You want to download a specific LLM? Let's go!
Enter the container:
docker exec -ti rocm64_ollama_095 bash
And then download the one you need, like this one:
./ollama run llama3.2:1b
At the end of the day, you get it at http://YOURLOCALIP:8080
@phoenix277yt : Feel free to ask and give some hints. There is a small but beautiful gfx803 community out there ^.^
Cheers, Robert Rosenbusch
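For anyone scripting the container start, the `docker run` line from the post above can be assembled as an argument list, which keeps each flag visible and commentable. This is a minimal Python sketch: the flags, image tag and container name are copied from the post, the function name `docker_run_args` is hypothetical, and the notes on individual flags are general Docker/ROCm background rather than statements from the author.

```python
import shlex

IMAGE = "robertrosenbusch/rocm6_gfx803_ollama:6.4.1_0.9.5"


def docker_run_args(name="rocm64_ollama_095", image=IMAGE):
    """Build the `docker run` argument list from the post, one flag per entry."""
    return [
        "docker", "run", "-it",
        "--device=/dev/kfd",        # ROCm compute device node
        "--device=/dev/dri",        # GPU render nodes
        "--group-add=video",        # group that owns the GPU devices
        "--ipc=host",
        "--cap-add=SYS_PTRACE",
        "--security-opt", "seccomp=unconfined",
        "-p", "8080:8080",          # web interface mentioned in the post
        "-p", "11434:11434",        # Ollama's default API port
        "--name", name,
        image, "bash",
    ]


# Print the command as a single shell-quoted line for copy and paste.
print(shlex.join(docker_run_args()))
```

Passing the list to `subprocess.run(docker_run_args(), check=True)` would start the container without going through a shell.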
@chboishabba commented on GitHub (Jul 11, 2025):
hooray! once I can fix my kernel install I will absolutely try! I can try right now on 6.14.0-rt3-arch1-1-rt but GPT helped me bork /boot/ every time I update and I need to fix that lol
I'm in Australia.. I know what you mean about badüm but I've got to presume you're German if you feel the need for the umlaut on onomatopoeia lol
Thanks very much brother, I'll let you know results. Best wishes.
@tristan-k commented on GitHub (Jul 12, 2025):
Here is a benchmark comparing both Ollama Docker ROCm versions on an AMD Radeon Pro WX 5100.
System
Ollama 0.9.0
Ollama 0.9.5
@tristan-k commented on GitHub (Jul 12, 2025):
Here is another quick benchmark for ollama-vulkan. It seems like Vulkan gives about 5 more t/s for the AMD Radeon Pro WX 5100.
@robertrosenbusch commented on GitHub (Jul 12, 2025):
@tristan-k : Hi and welcome back! thanks a lot for your benchmarks!
Does the Radeon Pro WX 5100 support 16 GB of VRAM? I am interested because in your original post you benchmarked against a 12b model ^.^
I did some benchmarks from 0.6 to 0.9 there: same hardware, same ROCm stack, different Ollama versions. My main point was: take a look at "Total time".
And sorry, to my mind "./obench" is nice, but not a very careful way to benchmark anything in the real world.
Cheers, Robert Rosenbusch
@tristan-k commented on GitHub (Jul 12, 2025):
No it doesn't. I initially benchmarked with the wrong model gemma3:12b and redid the bench with a model gemma3:4b which fits into the VRAM. My bad.
It's not exactly an apples-to-apples comparison. I will try to run benchmark.py in the ollama-vulkan docker image later and post the results here.
@chboishabba commented on GitHub (Jul 13, 2025):
Awesome benchmarks, cheers. I'm pretty blown away how much faster the deepseek eval is on the newer ollama versions. just having another look, it seems like the llama2:7b eval actually went backwards over apprx the same period... maybe sampling bias, as the larger model stayed roughly the same.
@tristan-k commented on GitHub (Jul 13, 2025):
As promised here are the benchmark results.
Ollama (0.9.3) Vulkan
Radeon Pro WX 5100 Vulkan to ROCm Comparison
@robertrosenbusch commented on GitHub (Jul 13, 2025):
@tristan-k : first of all, welcome back, and grab some popcorn and a cold soda please.
It's really suspicious.. I guess the Radeon Pro WX 5100 is more similar to the RX580 than to my RX570, on which I took all my benchmarks on ROCm 6.4.
And by the way, you should be really, really happy to be using the right kernel version 6.14.0-63.fc42.x86_64 for all the ROCm stuff. Don't change it until AMD, maybe with ROCm 6.5 or 7.0, is able to work with a kernel version higher than 6.11. I guess @chboishabba knows what I mean :P And as far as I know, @chboishabba, the answer from the AMD ROCm dev team was: use a kernel version we support.
@tristan-k: feel free to introduce it and give some hints ·^
At the end of the day: gfx803 is still working on Ollama (0.6/0.7/0.8/0.9). Right? With ROCm 6.4.X and Vulkan 1.4.313. I guess there are more differences between the Ollama versions than between the different ROCm/Vulkan versions used on gfx803. But in my Git repo I don't only support Ollama :D ComfyUI is my first priority.
My Benchmark Sample? Ollama v0.9.0
Average stats:
Cheers, Robert Rosenbusch
@robertrosenbusch commented on GitHub (Jul 13, 2025):
@tristan-k @chboishabba : AMD just published the official ROCm 7.0 *badüüm*, or: I told you so. Maybe I am too late to this GFX803 ROCm 7.0 party :P *laughing*
Cheers, Robert Rosenbusch
@chboishabba commented on GitHub (Jul 14, 2025):
Can confirm, don't go updating until compatibility is confirmed unless you
are interested in bisecting kernel commits.
@robertrosenbusch commented on GitHub (Aug 3, 2025):
... ollama 0.9.0 --> deepseek-r1:32b performance benchmark, 20 XEON E5 cores vs 4x RX470/8G (gfx803), in a video (top: 20 XEON E5 cores; bottom: 4x RX470/8G).
Cheers, Robert Rosenbusch
@ericcurtin commented on GitHub (Oct 13, 2025):
We added Vulkan support to docker model runner, so we cover this feature:
https://www.docker.com/blog/docker-model-runner-vulkan-gpu-support/
We've also put effort to putting all our code in one central place to make it easier for people to contribute. Please star, fork and contribute.
https://github.com/docker/model-runner
We have vulkan support. You can pull models from Docker Hub, Huggingface or any other OCI registry. You can also push models to Docker Hub or any other OCI registry.
@exbanny58-alt commented on GitHub (Nov 19, 2025):
Ollama on AMD RX 580: Working Solution
Problem: Ollama doesn't detect the RX 580 and uses only the CPU.
Solution:
Kill all running ollama processes.
Window 1, in PowerShell:
$env:OLLAMA_VULKAN="1"
$env:OLLAMA_GPU_LAYERS="99"
ollama serve
Window 2:
ollama run model-name
Permanent fix:
Create a system environment variable OLLAMA_VULKAN=1
Or use a .bat file with these settings
Result:
✅ GPU at 100% load instead of the CPU
✅ Instant responses
✅ Models up to 7B parameters in tests
Verification: in the logs, look for "Radeon RX 580 Series" and library=Vulkan
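The same steps can be scripted so that the variables apply only to the server process. This is a minimal Python sketch that assumes `ollama` is on the PATH; the function names are hypothetical. `OLLAMA_GPU_LAYERS` is copied from the post as-is, and whether Ollama actually reads it is not verified here.

```python
import os


def vulkan_env(base=None):
    """Copy an environment mapping and add the two variables from the post."""
    env = dict(os.environ if base is None else base)
    env["OLLAMA_VULKAN"] = "1"
    env["OLLAMA_GPU_LAYERS"] = "99"  # from the post; not verified to have an effect
    return env


def gpu_in_use(log_text, gpu_name="Radeon RX 580 Series"):
    """Apply the post's check: the GPU name and library=Vulkan both appear."""
    return gpu_name in log_text and "library=Vulkan" in log_text


env = vulkan_env({"PATH": "/usr/bin"})
print({key: env[key] for key in ("OLLAMA_VULKAN", "OLLAMA_GPU_LAYERS")})
```

Starting the server with `subprocess.run(["ollama", "serve"], env=vulkan_env())` corresponds to the first PowerShell window. Feeding the captured server log to `gpu_in_use` automates the verification step.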
@aptac01 commented on GitHub (Dec 12, 2025):
Using exbanny58-alt's method, I sped up inference on an ASUS RX 570 8GB. For convenience, I made a batch file (file.bat) with the following contents:
@echo off
powershell -Command "$env:OLLAMA_VULKAN='1'; $env:OLLAMA_GPU_LAYERS='99'; ollama serve"
When you run it, Ollama starts in console mode. To get a GUI, you can install a browser extension such as "ollama ui".
@Lendangame commented on GitHub (Mar 7, 2026):
Doesn't work
@chboishabba commented on GitHub (Mar 10, 2026):
https://github.com/advanced-lvl-up/Rx470-Vega10-Rx580-gfx803-gfx900-fix-AMD-GPU/issues/10