[GH-ISSUE #10216] 8*H100 server didn't use GPU to run model #6703

Closed
opened 2026-04-12 18:26:24 -05:00 by GiteaMirror · 1 comment

Originally created by @zhkchen on GitHub (Apr 10, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10216

### What is the issue?
I am trying to run Ollama on a server with 8 H100 GPUs, but Ollama cannot use the GPUs to run LLM models.

### Relevant log output
Apr 10 17:27:46 yamada-NULL systemd[1]: Stopping ollama.service - Ollama Service...
Apr 10 17:27:46 yamada-NULL systemd[1]: ollama.service: Deactivated successfully.
Apr 10 17:27:46 yamada-NULL systemd[1]: Stopped ollama.service - Ollama Service.
Apr 10 17:27:46 yamada-NULL systemd[1]: ollama.service: Consumed 2min 32.917s CPU time.
Apr 10 17:28:40 yamada-NULL systemd[1]: Started ollama.service - Ollama Service.
Apr 10 17:28:40 yamada-NULL ollama[48308]: 2025/04/10 17:28:40 routes.go:1231: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBU>
Apr 10 17:28:40 yamada-NULL ollama[48308]: time=2025-04-10T17:28:40.185+09:00 level=INFO source=images.go:458 msg="total blobs: 9"
Apr 10 17:28:40 yamada-NULL ollama[48308]: time=2025-04-10T17:28:40.185+09:00 level=INFO source=images.go:465 msg="total unused blobs removed: 0"
Apr 10 17:28:40 yamada-NULL ollama[48308]: time=2025-04-10T17:28:40.186+09:00 level=INFO source=routes.go:1298 msg="Listening on 127.0.0.1:11434 (version 0.6.5)"
Apr 10 17:28:40 yamada-NULL ollama[48308]: time=2025-04-10T17:28:40.186+09:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
Apr 10 17:29:10 yamada-NULL ollama[48308]: time=2025-04-10T17:29:10.210+09:00 level=INFO source=gpu.go:612 msg="Unable to load cudart library /usr/lib/x86_64-linux-gnu/libcuda.so.570.124.06: cuda driver library init failure: 802"
Apr 10 17:31:10 yamada-NULL ollama[48308]: time=2025-04-10T17:31:10.247+09:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
Apr 10 17:31:10 yamada-NULL ollama[48308]: time=2025-04-10T17:31:10.247+09:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="2015.4 GiB" available="1998.4 GiB"
Apr 10 17:31:10 yamada-NULL ollama[48308]: [GIN] 2025/04/10 - 17:31:10 | 200 | 47.867µs | 127.0.0.1 | HEAD "/"
Apr 10 17:31:10 yamada-NULL ollama[48308]: [GIN] 2025/04/10 - 17:31:10 | 200 | 1.307023ms | 127.0.0.1 | GET "/api/tags"

### OS
Ubuntu 24.04.2 LTS

### GPU
8*NVIDIA H100 80GB

### CPU
Intel(R) Xeon(R) Platinum 8480+ CPU @ 2.0GHz

### Ollama version
0.6.5

GiteaMirror added the bug label 2026-04-12 18:26:24 -05:00

@rick-github commented on GitHub (Apr 10, 2025):

Apr 10 17:29:10 yamada-NULL ollama[48308]: time=2025-04-10T17:29:10.210+09:00 level=INFO
 source=gpu.go:612
 msg="Unable to load cudart library /usr/lib/x86_64-linux-gnu/libcuda.so.570.124.06:
   cuda driver library init failure: 802"

From the Nvidia documentation:

cudaErrorSystemNotReady = 802
This error indicates that the system is not yet ready to start any CUDA work.
To continue using CUDA, verify the system configuration is in a valid state and
all required driver daemons are actively running. More information about this
error can be found in the system specific user guide.
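Per the error table above, 802 maps to cudaErrorSystemNotReady. On NVSwitch-based HGX machines (such as an 8x H100 server) this commonly means the NVIDIA Fabric Manager service is not running. A minimal diagnostic sketch; the service name `nvidia-fabricmanager` assumes a standard Ubuntu driver install and should be verified against your setup:

```shell
# Hedged sketch: diagnosing CUDA init failure 802 (cudaErrorSystemNotReady)
# on an NVSwitch-based 8x H100 box.

diagnose_802() {
    # Driver sanity check: does the driver stack respond at all?
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi || echo "nvidia-smi failed: driver/kernel module mismatch?"
    else
        echo "nvidia-smi not found: NVIDIA driver not installed?"
    fi
    # On HGX systems, error 802 usually means Fabric Manager is not running.
    if command -v systemctl >/dev/null 2>&1; then
        systemctl is-active nvidia-fabricmanager 2>/dev/null \
            || echo "nvidia-fabricmanager not active: try 'sudo systemctl enable --now nvidia-fabricmanager'"
    fi
}

diagnose_802
```

If Fabric Manager was the culprit, restart the ollama service afterwards so GPU discovery runs again.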

Reference: github-starred/ollama#6703