Mirror of https://github.com/ollama/ollama.git
synced 2026-05-06 08:02:14 -05:00
[GH-ISSUE #6364] docker container can't detect Nvidia GPU - intermittent "cuda driver library failed to get device context 801" #66032
Open · opened 2026-05-03 23:41:33 -05:00 by GiteaMirror · 37 comments
Originally created by @fahadshery on GitHub (Aug 14, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6364
Originally assigned to: @dhiltgen on GitHub.
What is the issue?
OS: Ubuntu 24.04 LTS
GPU: Nvidia Tesla P40 (24G)
I installed ollama without docker and it was able to utilise my gpu without any issues.
I then deployed ollama using the following docker compose file:
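The author's exact file wasn't captured by the mirror; a minimal sketch of a compose file of this kind, assuming the official `ollama/ollama` image and Compose's `deploy.resources` GPU reservation syntax:

```yaml
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama        # persist pulled models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all            # expose every GPU to the container
              capabilities: [gpu]

volumes:
  ollama:
```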
When I exec into the container and run `nvidia-smi`, it executes successfully from within the ollama docker container, but the logs show that it can't detect my GPU. Not sure why?
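For reference, the in-container check being described is along these lines (a sketch; the container name is a placeholder):

```sh
# Run nvidia-smi from inside the running ollama container
docker exec -it ollama nvidia-smi
```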
OS
Linux, Docker
GPU
Nvidia
CPU
Intel
Ollama version
0.3.5
@fahadshery commented on GitHub (Aug 14, 2024):
@rick-github commented on GitHub (Aug 14, 2024):
What's the output of nvidia-smi outside of the container?
@rick-github commented on GitHub (Aug 14, 2024):
If you (temporarily) install ollama as a service (`curl -fsSL https://ollama.com/install.sh | sh`), can it access the GPU? I see that you've already done that.
@fahadshery commented on GitHub (Aug 14, 2024):
@fahadshery commented on GitHub (Aug 14, 2024):
I am using a vGPU, which is a datacenter-grade GPU. I changed and tried different Nvidia profiles, but no luck. Here is more info on the GPU:
@fahadshery commented on GitHub (Aug 14, 2024):
Yes, already tried and it works beautifully. But I need it running in docker so that it's easier to deploy other services with it, like stable-diffusion, open-webui, whisper, searxng, libretranslate, etc.
@rick-github commented on GitHub (Aug 14, 2024):
nvidia-smi outside of the container shows an ollama runner using the GPU. Is that running inside the container or is the ollama-as-a-service still running?
@fahadshery commented on GitHub (Aug 14, 2024):
There is no ollama service running in this VM. It's a fresh VM and I deploy everything using `ansible` so that I don't mess things up. So I am assuming that it's got to be from inside the container. But as the logs show, the container fails to recognise that there is a GPU available to it.
@fahadshery commented on GitHub (Aug 14, 2024):
This is what I am trying to build: https://technotim.live/posts/ai-stack-tutorial
@rick-github commented on GitHub (Aug 14, 2024):
What do the following show:
`pstree -ls 3576`
`ps wwp3576`
@fahadshery commented on GitHub (Aug 15, 2024):
ok, I don't know what happened. (I didn't make any change other than downloading a different model i.e. llama3.1:8b)
I ran it and the GPU utilisation went up to 85%.
then I did the process check and here are the results:
is that normal? I was expecting a 90% GPU utilisation
@fahadshery commented on GitHub (Aug 15, 2024):
@rick-github commented on GitHub (Aug 15, 2024):
GPU utilization will vary with the efficiency of the model and external factors like power usage, etc. I don't know how this applies to vGPUs: as they are a shared resource, presumably there will be some competition for cycles. You can get a view of possible limiting factors by looking at the performance state and throttle reasons from `nvidia-smi -q -d POWER,TEMPERATURE,PERFORMANCE`.
@fahadshery commented on GitHub (Aug 15, 2024):
How does it come up with `--n-gpu-layers 33` in the `ps wwp27163` command? How do you determine that? Or is it inherent to the underlying model to decide?
@rick-github commented on GitHub (Aug 15, 2024):
In the server log there will be lines like:
This is ollama figuring out how much (V)RAM your system has, and calculating how many layers will fit in the available VRAM and how much RAM will be needed for the non-GPU layers. You can control the number of layers that are offloaded to the GPU with the `num_gpu` option, either in the CLI (`/set parameter num_gpu xx`) or in the API (`curl localhost:11434/api/generate -d '{"model":"yy","options":{"num_gpu":xx}}'`).
@fahadshery commented on GitHub (Aug 16, 2024):
So working within the docker container is intermittent: it struggles to reload the model into the GPU once it's been offloaded. I installed using the linux shell script and it's working as expected. In docker it sometimes doesn't even see the GPU, even though the `nvidia-smi` command works fine within the container.
@rick-github commented on GitHub (Aug 16, 2024):
What's in the server logs when it fails?
@TomorrowToday commented on GitHub (Aug 26, 2024):
Are you running with the nvidia container toolkit? It's not supported yet on Ubuntu 24.04 according to their docs.
@fahadshery commented on GitHub (Aug 27, 2024):
Working fine in other containers like `Stable-Diffusion-webui`, `whisper`, etc.
@superwolfboy commented on GitHub (Aug 31, 2024):
enable "above 4G" in bios already ?
@fahadshery commented on GitHub (Aug 31, 2024):
I am running it on a Dell R720 server with an NVIDIA Tesla P40 24G GPU, so not sure if there is an option there? But as I said, all the other containers are working fine. Even the `gpu-jupyter` container is working fine!
@superwolfboy commented on GitHub (Sep 2, 2024):
My problem is the same as yours: the vGPU is not working, with almost the same log. But GPU passthrough does work, and then only one VM can use the GPU.
@dhiltgen commented on GitHub (Sep 4, 2024):
@fahadshery in your initial logs I see the following error
That error code maps to:
I would recommend working through our troubleshooting guide for NVIDIA GPUs - https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#nvidia-gpu-discovery
In particular, the uvm driver may be unloading, which would explain the intermittent behavior (works sometimes, fails other times). In our install script we mitigate this with the following code, https://github.com/ollama/ollama/blob/main/scripts/install.sh#L358-L367, which may be applicable to your host system if this turns out to be the root cause.
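For readers who want to test that hypothesis by hand, a sketch of that kind of mitigation (not the script's exact code; requires root on the host):

```sh
# If nvidia_uvm is missing here, CUDA init inside containers can fail with 801
lsmod | grep nvidia_uvm

# Reload the module; the install script automates a step along these lines
sudo modprobe -r nvidia_uvm 2>/dev/null
sudo modprobe nvidia_uvm

# Optionally keep it loaded across reboots
echo nvidia_uvm | sudo tee /etc/modules-load.d/nvidia-uvm.conf
```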
@JavierCCC commented on GitHub (Sep 12, 2024):
Check /etc/docker/daemon.json
You want to have a runtime definition related to nvidia inside it, something like this:
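The original snippet wasn't captured by the mirror; the standard definition written by `nvidia-ctk runtime configure --runtime=docker` looks like this:

```json
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```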
Then, you need to use `runtime: nvidia` inside your docker compose yaml.
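A sketch of where that goes (service name illustrative; assumes the daemon.json entry above):

```yaml
services:
  ollama:
    image: ollama/ollama
    runtime: nvidia   # must match the runtime name registered in daemon.json
```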
@mrk3786 commented on GitHub (Sep 16, 2024):
I ran into the same problem. It turned out to be the CPU type configuration in my proxmox VM: I had configured x86, and when I changed that to 'host', the issue was solved.
@vaclcer commented on GitHub (Sep 17, 2024):
Hello, reporting the same problem with error "cuda driver library failed to get device context 801":
time=2024-09-17T05:56:32.395Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 cuda_v11 cuda_v12 rocm_v60102 cpu cpu_avx]"
time=2024-09-17T05:56:32.395Z level=DEBUG source=payload.go:45 msg="Override detection logic by setting OLLAMA_LLM_LIBRARY"
time=2024-09-17T05:56:32.395Z level=DEBUG source=sched.go:105 msg="starting llm scheduler"
time=2024-09-17T05:56:32.395Z level=INFO source=gpu.go:200 msg="looking for compatible GPUs"
time=2024-09-17T05:56:32.395Z level=DEBUG source=gpu.go:86 msg="searching for GPU discovery libraries for NVIDIA"
time=2024-09-17T05:56:32.395Z level=DEBUG source=gpu.go:468 msg="Searching for GPU library" name=libcuda.so*
time=2024-09-17T05:56:32.395Z level=DEBUG source=gpu.go:491 msg="gpu library search" globs="[/usr/lib/ollama/libcuda.so* /usr/local/nvidia/lib/libcuda.so* /usr/local/nvidia/lib64/libcuda.so* /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
time=2024-09-17T05:56:32.396Z level=DEBUG source=gpu.go:525 msg="discovered GPU libraries" paths=[/usr/lib/x86_64-linux-gnu/libcuda.so.525.105.17]
CUDA driver version: 12.0
time=2024-09-17T05:56:32.404Z level=DEBUG source=gpu.go:119 msg="detected GPUs" count=1 library=/usr/lib/x86_64-linux-gnu/libcuda.so.525.105.17
time=2024-09-17T05:56:32.404Z level=INFO source=gpu.go:252 msg="error looking up nvidia GPU memory" error="cuda driver library failed to get device context 801"
time=2024-09-17T05:56:32.404Z level=DEBUG source=amd_linux.go:371 msg="amdgpu driver not detected /sys/module/amdgpu"
time=2024-09-17T05:56:32.404Z level=INFO source=gpu.go:347 msg="no compatible GPUs were discovered"
releasing cuda driver library
time=2024-09-17T05:56:32.404Z level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant=avx2 compute="" driver=0.0 name="" total="94.3 GiB" available="92.7 GiB"

nvidia-smi in the container works OK:
```
Tue Sep 17 06:03:42 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A40-48Q On | 00000000:00:05.0 Off | 0 |
| N/A N/A P8 N/A / N/A | 0MiB / 49152MiB | 0% Default |
| | | Disabled |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
```
I did go through the proposed troubleshooting, but no luck; no other errors found. What do I do now? Thanks for any help.
@phonkd commented on GitHub (Sep 19, 2024):
After changing the CPU type of my qemu VM to host, it worked.
@fahadshery commented on GitHub (Sep 20, 2024):
ollama works fine but it's intermittent. I have no issues with other containers using the GPU. Therefore, I am running ollama as a service and have no issues.
I will check the CPU type and change it to host but this might not help since we're using vGPU
@fahadshery commented on GitHub (Sep 20, 2024):
Are there no model-reloading issues?
@dhiltgen commented on GitHub (Sep 24, 2024):
@vaclcer your driver version 525.105.17 is well over a year old. Perhaps try upgrading the driver and see if this is a bug nvidia has already fixed?
@fahadshery from your logs, you're running a newer driver, but given the intermittent nature of this, it might also be worth trying to upgrade to the latest driver to see if that clears it up.
@fahadshery commented on GitHub (Sep 26, 2024):
OK, I am gonna upgrade the drivers to the latest (`550.90.05`) and try again and report back.
@dstaicova commented on GitHub (Oct 1, 2024):
Just to say, I'm getting the same error without docker:
>OLLAMA_DEBUG=1 ollama serve time=2024-10-01T18:59:04.108+03:00 level=INFO source=gpu.go:199 msg="looking for compatible GPUs" time=2024-10-01T18:59:04.109+03:00 level=DEBUG source=gpu.go:86 msg="searching for GPU discovery libraries for NVIDIA" time=2024-10-01T18:59:04.109+03:00 level=DEBUG source=gpu.go:468 msg="Searching for GPU library" name=libcuda.so* time=2024-10-01T18:59:04.109+03:00 level=DEBUG source=gpu.go:491 msg="gpu library search" globs="[/usr/local/lib/ollama/libcuda.so* /home/denijane/scripts/libcuda.so* /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]" time=2024-10-01T18:59:04.166+03:00 level=DEBUG source=gpu.go:525 msg="discovered GPU libraries" paths="[/usr/lib/libcuda.so.550.107.02 /usr/lib/libcuda.so.550.78 /usr/lib.bak/libcuda.so.550.90.07 /usr/lib32/libcuda.so.550.107.02 /usr/lib64/libcuda.so.550.107.02 /usr/lib64/libcuda.so.550.78]" cuInit err: 999 time=2024-10-01T18:59:04.173+03:00 level=WARN source=gpu.go:562 msg="unknown error initializing cuda driver library" library=/usr/lib/libcuda.so.550.107.02 error="cuda driver library init failure: 999" time=2024-10-01T18:59:04.173+03:00 level=WARN source=gpu.go:563 msg="see https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for more information" cuInit err: 803 time=2024-10-01T18:59:04.181+03:00 level=WARN source=gpu.go:558 msg="version mismatch between driver and cuda driver library - reboot or upgrade may be required" library=/usr/lib/libcuda.so.550.78 error="cuda driver library init failure: 803" cuInit err: 803 time=2024-10-01T18:59:04.193+03:00 level=WARN source=gpu.go:558 msg="version mismatch between driver and cuda driver library - reboot or upgrade may be required" library=/usr/lib.bak/libcuda.so.550.90.07 error="cuda driver library init failure: 803" library /usr/lib32/libcuda.so.550.107.02 load err: /usr/lib32/libcuda.so.550.107.02: wrong ELF class: ELFCLASS32 time=2024-10-01T18:59:04.193+03:00 level=DEBUG source=gpu.go:566 msg="skipping 32bit library" library=/usr/lib32/libcuda.so.550.107.02 cuInit err: 999 time=2024-10-01T18:59:04.202+03:00 level=WARN source=gpu.go:562 msg="unknown error initializing cuda driver library" library=/usr/lib64/libcuda.so.550.107.02 error="cuda driver library init failure: 999" time=2024-10-01T18:59:04.202+03:00 level=WARN source=gpu.go:563 msg="see https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for more information" cuInit err: 803 time=2024-10-01T18:59:04.205+03:00 level=WARN source=gpu.go:558 msg="version mismatch between driver and cuda driver library - reboot or upgrade may be required" library=/usr/lib64/libcuda.so.550.78 error="cuda driver library init failure: 803"I'm on Manjaro, with nvidia: 550.107.02, cuda: 12.4. I just installed the newest ollama and noticed it starts the models really slowly. Slower than before. The nvidia is working (I can see games in nvidia-smi), but I'm not sure when last I saw ollama to use the gpu as it was pretty busy 2-3 weeks.
@dhiltgen commented on GitHub (Oct 17, 2024):
@denijane you're getting 2 different errors from 2 different libraries we try. The 999 error is a generic "unknown error" code, which isn't super helpful; however, the other code, 803, is enlightening.
If you have already rebooted, somehow your system has gotten into an inconsistent state where the driver you're booting doesn't match the libraries installed.
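One way to check for that mismatch (a sketch; library paths vary by distro):

```sh
# Driver version the running kernel has loaded
cat /proc/driver/nvidia/version

# CUDA driver libraries on disk; leftover copies from an older driver
# (as in the log above: libcuda.so.550.78 next to .550.107.02) trigger cuInit 803
ls -l /usr/lib*/libcuda.so.*
```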
@MrHongping commented on GitHub (Oct 19, 2024):
I encountered the same problem. My GPU is a Tesla P4 and I am using PVE GPU virtualization for an Ubuntu 22 virtual machine. The nvidia-smi command shows that the virtualized GPU is working properly, but I am unable to use the GPU for acceleration even after starting up in Docker or virtual-machine environments.
@dhiltgen commented on GitHub (Nov 6, 2024):
I've posted a new PR documenting a workaround some users are seeing success with for a slightly different failure mode, but it might be helpful in these cases as well. If you are experiencing the sporadic 801, please give it a try and let us know if it resolves the problem.
#7519
@DoctorDream commented on GitHub (Feb 24, 2025):
After research, it has been found that this problem mostly occurs in virtual machines. Through numerous attempts, I've discovered that the cause of this problem (`cuda driver library failed to get device context 801`) may not be related to the graphics card itself, but rather to the CPU instruction set.

The default CPU virtualization mode in the VM doesn't have the AVX2 instruction set, which can be checked via `lscpu`.
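A quick way to run that check inside the VM (either command works on most Linux guests):

```sh
# Look for avx2 in the CPU flags exposed to the guest
lscpu | grep -i avx2
grep -m1 -o avx2 /proc/cpuinfo || echo "AVX2 not available"
```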
However, after switching to the host virtualization mode, the AVX2 instruction set appears, and at that point ollama runs normally.
@tharun571 commented on GitHub (Jan 20, 2026):
The "cuda driver library failed to get device context 801" error in Docker containers is frustrating - especially when GPU works fine outside Docker!
Common root causes for Docker GPU detection issues:
- `docker info | grep -i runtime` shows nvidia
- `--gpus all` or the runtime config in docker-compose
- `/dev/nvidia*` devices

Quick validation steps:
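The comment's original step list wasn't captured by the mirror; a sketch of the kind of validation chain described (image tag and container name are placeholders):

```sh
# 1. The daemon should list an nvidia runtime
docker info | grep -i runtime

# 2. A bare CUDA container should see the GPU
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# 3. The device nodes should exist inside the ollama container
docker exec ollama ls -l /dev/nvidia*
```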
For intermittent failures (works sometimes, fails others), it's often a race condition with device initialization.
I built an OSS diagnostic tool for these exact GPU+Docker issues: env-doctor
It checks:
It works both inside and outside containers to pinpoint where the GPU detection breaks.
Full disclosure: I'm the author. Sharing because Docker GPU issues are notoriously hard to debug, and this automates the validation chain. Hope it helps troubleshoot!