[GH-ISSUE #5453] ollama does not work on GPU #3409

Closed
opened 2026-04-12 14:02:44 -05:00 by GiteaMirror · 21 comments
Owner

Originally created by @tianfan007 on GitHub (Jul 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5453

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

I updated Ollama from version 0.1.32 to 0.1.48 and found that it no longer uses the GPU.
First, run a model (any model shows the same behavior):
ollama run gemma:latest
Then check the running processes:
ps -ef | grep ollama
I got this output:
ollama 1156949 1063915 4 11:44 ? 00:00:01 /tmp/ollama3065264722/runners/cpu_avx2/ollama_llama_server --model /usr/share/ollama/.ollama/models/blobs/sha256-3a43f93b78ec50f7c4e4dc8bd1cb3fff5a900e7d574c51a6f7495e48486e0dac --ctx-size 2048 --batch-size 512 --embedding --log-disable --parallel 1 --port 43415
and I found no change in graphics memory usage. If I run
nvidia-smi
there is no information about ollama at all.
I don't know what's wrong with it; the previous version worked well.

The GPU is an NVIDIA 3050 Ti with 4 GB of VRAM; the integrated graphics is an AMD Radeon 660M.
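As a quick aside for anyone triaging the same symptom: the runner variant in the process path shows which backend Ollama actually launched, so a minimal check along these lines isolates it (the grep pattern is illustrative):

```
# "runners/cpu_avx2" in the path means CPU fallback;
# "runners/cuda_v11" would mean the CUDA runner was picked.
ps -ef | grep ollama_llama_server | grep -o 'runners/[a-z0-9_.]*'
```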

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.1.48

GiteaMirror added the nvidia, bug labels 2026-04-12 14:02:44 -05:00
Author
Owner

@jmorganca commented on GitHub (Jul 3, 2024):

Sorry you hit this – looking into it

Author
Owner

@jmorganca commented on GitHub (Jul 3, 2024):

May I ask what nvidia-smi shows? (Driver etc, if you have the full output that would be amazing)

Author
Owner

@sieudx commented on GitHub (Jul 3, 2024):

I have the same issue. I run Ollama with Docker using this command:

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama2 ollama/ollama

Or via docker-compose:

ollama:
  image: ollama/ollama:0.1.48
  container_name: ollama
  ports:
    - "11434:11434"
  volumes:
    - ollama:/root/.ollama
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
  restart: always
  networks:
    - llm_local

But when I call it, it runs only on the CPU, not the GPU.

NVIDIA driver: 555.42.02, CUDA version: 12.5
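As a quick sanity check for this kind of setup, it may help to confirm the container can see the GPU at all before digging into Ollama itself (this assumes the NVIDIA Container Toolkit is installed; the CUDA image tag is illustrative):

```
# Check whether the running Ollama container sees the GPU
docker exec -it ollama nvidia-smi

# Or test GPU passthrough in isolation with a bare CUDA image
docker run --rm --gpus=all nvidia/cuda:12.5.0-base-ubuntu22.04 nvidia-smi
```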

Author
Owner

@tianfan007 commented on GitHub (Jul 3, 2024):

> May I ask what nvidia-smi shows? (Driver etc, if you have the full output that would be amazing)

[Screenshot 2024-07-03 12-20-30: https://github.com/ollama/ollama/assets/5567566/5fdb2b36-289c-4cc0-832d-eaf865f90d98]

Author
Owner

@jmorganca commented on GitHub (Jul 3, 2024):

Thanks! And do you happen to have the logs handy? This would help us understand if the GPU wasn't detected properly.

sudo journalctl -u ollama > logs.txt
Author
Owner

@tianfan007 commented on GitHub (Jul 3, 2024):

> Thanks! And do you happen to have the logs handy? This would help us understand if the GPU wasn't detected properly.
>
> sudo journalctl -u ollama > logs.txt

OK, the log is here: logs.txt (https://github.com/user-attachments/files/16077225/logs.txt)

Author
Owner

@jmorganca commented on GitHub (Jul 3, 2024):

Thanks @tianfan007 one more thing – would it be possible to share the debug startup logs?

OLLAMA_DEBUG=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve

(this will start Ollama in your current shell just for us to see the debug startup logs, which will print GPU discovery info). Sorry about this.

Author
Owner

@tianfan007 commented on GitHub (Jul 3, 2024):

> Thanks @tianfan007 one more thing – would it be possible to share the debug startup logs?
>
> OLLAMA_DEBUG=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve
>
> (this will start Ollama in your current shell just for us to see the debug startup logs, which will print GPU discovery info). Sorry about this.

After I ran the command, the output was:

Couldn't find '/home/tianfan/.ollama/id_ed25519'. Generating new private key.
Your new public key is: 

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIH7JHmBDQRCh1+kTEQRxLrfvDsZn3uHiKAtM+OZHfThh

2024/07/03 13:26:02 routes.go:1064: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11435 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:/home/tianfan/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-03T13:26:02.822+08:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-07-03T13:26:02.822+08:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-07-03T13:26:02.822+08:00 level=INFO source=routes.go:1111 msg="Listening on 127.0.0.1:11435 (version 0.1.48)"
time=2024-07-03T13:26:02.823+08:00 level=WARN source=assets.go:81 msg="failed to read ollama.pid" path=/tmp/ollama3065264722 error="open /tmp/ollama3065264722/ollama.pid: permission denied"
time=2024-07-03T13:26:02.823+08:00 level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama3712559296/runners
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=cpu file=build/linux/x86_64/cpu/bin/ollama_llama_server.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=cpu_avx file=build/linux/x86_64/cpu_avx/bin/ollama_llama_server.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=cpu_avx2 file=build/linux/x86_64/cpu_avx2/bin/ollama_llama_server.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcublas.so.11.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcublasLt.so.11.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/libcudart.so.11.0.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=cuda_v11 file=build/linux/x86_64/cuda_v11/bin/ollama_llama_server.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=rocm_v60101 file=build/linux/x86_64/rocm_v60101/bin/deps.txt.gz
time=2024-07-03T13:26:02.823+08:00 level=DEBUG source=payload.go:180 msg=extracting variant=rocm_v60101 file=build/linux/x86_64/rocm_v60101/bin/ollama_llama_server.gz
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama3712559296/runners/cpu/ollama_llama_server
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama3712559296/runners/cpu_avx/ollama_llama_server
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama3712559296/runners/cpu_avx2/ollama_llama_server
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama3712559296/runners/cuda_v11/ollama_llama_server
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=payload.go:71 msg="availableServers : found" file=/tmp/ollama3712559296/runners/rocm_v60101/ollama_llama_server
time=2024-07-03T13:26:05.117+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 cuda_v11 rocm_v60101 cpu cpu_avx]"
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=payload.go:45 msg="Override detection logic by setting OLLAMA_LLM_LIBRARY"
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=sched.go:94 msg="starting llm scheduler"
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=gpu.go:205 msg="Detecting GPUs"
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=gpu.go:91 msg="searching for GPU discovery libraries for NVIDIA"
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=gpu.go:435 msg="Searching for GPU library" name=libcuda.so*
time=2024-07-03T13:26:05.117+08:00 level=DEBUG source=gpu.go:454 msg="gpu library search" globs="[/home/tianfan/下载/soft_base/libcuda.so** /usr/local/cuda*/targets/*/lib/libcuda.so* /usr/lib/*-linux-gnu/nvidia/current/libcuda.so* /usr/lib/*-linux-gnu/libcuda.so* /usr/lib/wsl/lib/libcuda.so* /usr/lib/wsl/drivers/*/libcuda.so* /opt/cuda/lib*/libcuda.so* /usr/local/cuda/lib*/libcuda.so* /usr/lib*/libcuda.so* /usr/local/lib*/libcuda.so*]"
time=2024-07-03T13:26:05.127+08:00 level=DEBUG source=gpu.go:488 msg="discovered GPU libraries" paths="[/usr/lib/i386-linux-gnu/libcuda.so.535.183.01 /usr/lib/x86_64-linux-gnu/libcuda.so.535.183.01]"
library /usr/lib/i386-linux-gnu/libcuda.so.535.183.01 load err: /usr/lib/i386-linux-gnu/libcuda.so.535.183.01: wrong ELF class: ELFCLASS32
time=2024-07-03T13:26:05.127+08:00 level=DEBUG source=gpu.go:517 msg="Unable to load nvcuda" library=/usr/lib/i386-linux-gnu/libcuda.so.535.183.01 error="Unable to load /usr/lib/i386-linux-gnu/libcuda.so.535.183.01 library to query for Nvidia GPUs: /usr/lib/i386-linux-gnu/libcuda.so.535.183.01: wrong ELF class: ELFCLASS32"
cuInit err: 999
time=2024-07-03T13:26:05.135+08:00 level=DEBUG source=gpu.go:517 msg="Unable to load nvcuda" library=/usr/lib/x86_64-linux-gnu/libcuda.so.535.183.01 error="nvcuda init failure: 999"
time=2024-07-03T13:26:05.135+08:00 level=DEBUG source=gpu.go:435 msg="Searching for GPU library" name=libcudart.so*
time=2024-07-03T13:26:05.135+08:00 level=DEBUG source=gpu.go:454 msg="gpu library search" globs="[/home/tianfan/下载/soft_base/libcudart.so** /tmp/ollama3712559296/runners/cuda*/libcudart.so* /usr/local/cuda/lib64/libcudart.so* /usr/lib/x86_64-linux-gnu/nvidia/current/libcudart.so* /usr/lib/x86_64-linux-gnu/libcudart.so* /usr/lib/wsl/lib/libcudart.so* /usr/lib/wsl/drivers/*/libcudart.so* /opt/cuda/lib64/libcudart.so* /usr/local/cuda*/targets/aarch64-linux/lib/libcudart.so* /usr/lib/aarch64-linux-gnu/nvidia/current/libcudart.so* /usr/lib/aarch64-linux-gnu/libcudart.so* /usr/local/cuda/lib*/libcudart.so* /usr/lib*/libcudart.so* /usr/local/lib*/libcudart.so*]"
time=2024-07-03T13:26:05.138+08:00 level=DEBUG source=gpu.go:488 msg="discovered GPU libraries" paths=[/tmp/ollama3712559296/runners/cuda_v11/libcudart.so.11.0]
cudaSetDevice err: 999
time=2024-07-03T13:26:05.141+08:00 level=DEBUG source=gpu.go:500 msg="Unable to load cudart" library=/tmp/ollama3712559296/runners/cuda_v11/libcudart.so.11.0 error="cudart init failure: 999"
time=2024-07-03T13:26:05.141+08:00 level=WARN source=amd_linux.go:58 msg="ollama recommends running the https://www.amd.com/en/support/linux-drivers" error="amdgpu version file missing: /sys/module/amdgpu/version stat /sys/module/amdgpu/version: no such file or directory"
time=2024-07-03T13:26:05.141+08:00 level=DEBUG source=amd_linux.go:87 msg="evaluating amdgpu node /sys/class/kfd/kfd/topology/nodes/0/properties"
time=2024-07-03T13:26:05.141+08:00 level=DEBUG source=amd_linux.go:112 msg="detected CPU /sys/class/kfd/kfd/topology/nodes/0/properties"
time=2024-07-03T13:26:05.141+08:00 level=DEBUG source=amd_linux.go:87 msg="evaluating amdgpu node /sys/class/kfd/kfd/topology/nodes/1/properties"
time=2024-07-03T13:26:05.141+08:00 level=DEBUG source=amd_linux.go:202 msg="mapping amdgpu to drm sysfs nodes" amdgpu=/sys/class/kfd/kfd/topology/nodes/1/properties vendor=4098 device=5761 unique_id=0
time=2024-07-03T13:26:05.141+08:00 level=DEBUG source=amd_linux.go:236 msg=matched amdgpu=/sys/class/kfd/kfd/topology/nodes/1/properties drm=/sys/class/drm/card0/device
time=2024-07-03T13:26:05.141+08:00 level=INFO source=amd_linux.go:259 msg="unsupported Radeon iGPU detected skipping" id=0 total="512.0 MiB"
time=2024-07-03T13:26:05.141+08:00 level=INFO source=amd_linux.go:345 msg="no compatible amdgpu devices detected"
time=2024-07-03T13:26:05.141+08:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="14.8 GiB" available="8.7 GiB"

Author
Owner

@tianfan007 commented on GitHub (Jul 3, 2024):

> Thanks @tianfan007 one more thing – would it be possible to share the debug startup logs?
>
> OLLAMA_DEBUG=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve
>
> (this will start Ollama in your current shell just for us to see the debug startup logs, which will print GPU discovery info).

Sometimes, if I restart the system, it works…

[Screenshot 2024-07-03 13-44-46: https://github.com/ollama/ollama/assets/5567566/32c22e22-38fb-4170-a9f4-663b67850dbc]

It seems it does not work very reliably.

Author
Owner

@dhiltgen commented on GitHub (Jul 3, 2024):

For some reason the system NVIDIA driver library isn't able to initialize:

/usr/lib/x86_64-linux-gnu/libcuda.so.535.183.01 error="nvcuda init failure: 999"

That error code is a generic "unknown error". Most of the folks reporting this recently have been hit by quirks in the latest driver v555, but you're running 535, so it is probably something else. That said, trying some of the troubleshooting steps here (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#container-fails-to-run-on-nvidia-gpu) might yield a resolution. If that doesn't clear it, try running the server with OLLAMA_DEBUG=1 and CUDA_ERROR_LEVEL=50 and let's take a look at those logs.

sudo systemctl stop ollama
OLLAMA_DEBUG=1 CUDA_ERROR_LEVEL=50 ollama serve 2>&1 | tee server.log
Author
Owner

@tianfan007 commented on GitHub (Jul 4, 2024):

> For some reason the system NVIDIA driver library isn't able to initialize:
>
> /usr/lib/x86_64-linux-gnu/libcuda.so.535.183.01 error="nvcuda init failure: 999"
>
> That error code is a generic "unknown error". Most of the folks reporting this recently have been hit by quirks in the latest driver v555, but you're running 535, so it is probably something else. That said, trying some of the troubleshooting steps here might yield a resolution. If that doesn't clear it, try running the server with OLLAMA_DEBUG=1 and CUDA_ERROR_LEVEL=50 and let's take a look at those logs.
>
> sudo systemctl stop ollama
> OLLAMA_DEBUG=1 CUDA_ERROR_LEVEL=50 ollama serve 2>&1 | tee server.log

As I mentioned above, if I restart the OS, it works well most of the time. But when I run "ollama run ..." after using the OS for hours (editing some code, surfing the internet, playing music, and so on), Ollama no longer runs on the GPU.

Author
Owner

@tianfan007 commented on GitHub (Jul 4, 2024):

> and let's take a look at those logs.

OK, here is the log: ollama.log (https://github.com/user-attachments/files/16101665/ollama.log)

Author
Owner

@jmorganca commented on GitHub (Jul 4, 2024):

@tianfan007 is this the output from the following?

sudo systemctl stop ollama
OLLAMA_DEBUG=1 CUDA_ERROR_LEVEL=50 ollama serve 2>&1 | tee server.log

Thanks for the help 👍

Author
Owner

@tianfan007 commented on GitHub (Jul 5, 2024):

> @tianfan007 is this the output from the following?
>
> sudo systemctl stop ollama
> OLLAMA_DEBUG=1 CUDA_ERROR_LEVEL=50 ollama serve 2>&1 | tee server.log
>
> Thanks for the help 👍

Yes. I renamed the log file to "ollama.log".

Author
Owner

@dhiltgen commented on GitHub (Jul 5, 2024):

@tianfan007 what you describe sounds like the UVM driver unloading. Next time it happens, instead of rebooting, please try the steps in the troubleshooting guide (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#linux-nvidia-troubleshooting) relating to the UVM driver and see if those resolve it. If so, then re-running our install script should clear it up, or you can manually perform the steps we do (https://github.com/ollama/ollama/blob/main/scripts/install.sh#L314-L323) to get the UVM driver to stay loaded.
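For reference, a minimal sketch of that recovery sequence, assuming the standard nvidia_uvm module name and a systemd-managed Ollama service (the persistence file path is an assumption; the install script's exact steps may differ):

```
# Reload the NVIDIA UVM kernel module when Ollama falls back to CPU
sudo rmmod nvidia_uvm      # unload the (possibly wedged) module
sudo modprobe nvidia_uvm   # load it again

# Keep nvidia_uvm loaded across reboots (any *.conf file under
# /etc/modules-load.d/ works; this filename is illustrative)
echo "nvidia_uvm" | sudo tee /etc/modules-load.d/nvidia_uvm.conf

# Restart Ollama so it re-runs GPU discovery
sudo systemctl restart ollama
```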

Author
Owner

@DSI-dot-Guru commented on GitHub (Jul 9, 2024):

> @tianfan007 what you describe sounds like the UVM driver unloading. Next time it happens, instead of rebooting, please try the steps in the troubleshooting guide relating to the UVM driver and see if those resolve it. If so, then re-running our install script should clear it up, or you can manually perform the steps we do to get the UVM driver to stay loaded.

This actually turned out to be the issue for us.

We are running a Debian 12 Proxmox VM with two P40s passed through (NVIDIA-SMI 555.42.06, Driver Version 555.42.06, CUDA Version 12.5). Ollama is installed directly on Debian, not in a Docker container.

Running sudo nvidia-modprobe -u, then sudo rmmod nvidia_uvm, then sudo modprobe nvidia_uvm, and then restarting the Ollama service put the focus back on the GPUs. Prior to this we had also added:

Environment="CUDA_VISIBLE_DEVICES=GPU-44f1701b-c812-e344-a778-8ab5d4263278, GPU-a71221cb-9aa5-cefd-e8f8-3e0de5d42bfe"

to /etc/systemd/system/ollama.service, which we later changed to Environment="CUDA_VISIBLE_DEVICES=0, 1". You can also turn on debugging in the same file with Environment="OLLAMA_DEBUG=1" (see the sketch below).

A reinstall of Ollama did not work for us.

The lasting fix for us was to edit /etc/modules-load.d/modules.conf and add nvidia_uvm, then save and reboot. If you don't have a modules.conf, you can simply create a new file in that folder, such as nvidia_uvm.conf, add nvidia_uvm, then save and reboot.

Hope this helps someone.
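A hedged sketch of how those Environment lines might look as a systemd drop-in, rather than editing the unit file in place (the override path below is the one systemctl edit creates; it is shown for illustration only):

```
# /etc/systemd/system/ollama.service.d/override.conf
# (created via: sudo systemctl edit ollama)
[Service]
# Pin Ollama to the first two GPUs by index
Environment="CUDA_VISIBLE_DEVICES=0,1"
# Enable verbose GPU-discovery logging
Environment="OLLAMA_DEBUG=1"
```

After saving, sudo systemctl daemon-reload followed by sudo systemctl restart ollama applies the change.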

Author
Owner

@tianfan007 commented on GitHub (Jul 20, 2024):

> sudo nvidia-modprobe -u, then sudo rmmod nvidia_uvm, then sudo modprobe nvidia_uvm, and then restarting the Ollama service

From my observation, if Ollama works only on the CPU, I run the following commands:

sudo nvidia-modprobe -u
sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm

and restart Ollama; then Ollama runs on the GPU again.

Author
Owner

@phok007 commented on GitHub (Aug 18, 2024):

Ollama does not work... CPU only:

2024/08/18 10:04:36 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:E:\OLLAMA_MODELS OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[* http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\Users\Administrator\AppData\Local\Programs\Ollama\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-08-18T10:04:36.804+08:00 level=INFO source=images.go:782 msg="total blobs: 7"
time=2024-08-18T10:04:36.804+08:00 level=INFO source=images.go:790 msg="total unused blobs removed: 0"
time=2024-08-18T10:04:36.805+08:00 level=INFO source=routes.go:1172 msg="Listening on [::]:11434 (version 0.3.6)"
time=2024-08-18T10:04:36.806+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx cpu_avx2 cuda_v11.3 rocm_v6.1 cpu]"
time=2024-08-18T10:04:36.806+08:00 level=INFO source=gpu.go:204 msg="looking for compatible GPUs"
time=2024-08-18T10:04:36.833+08:00 level=INFO source=gpu.go:350 msg="no compatible GPUs were discovered"
time=2024-08-18T10:04:36.833+08:00 level=INFO source=types.go:105 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="255.9 GiB" available="248.8 GiB"

Author
Owner

@summerlotus513 commented on GitHub (Aug 23, 2024):

> Ollama does not work... CPU only

I have the same problem. Did you solve it?

Author
Owner

@tianfan007 commented on GitHub (Aug 27, 2024):

No, it hasn't been solved. When it occurs, the only fix is to restart the OS.


Author
Owner
@ayttop commented on GitHub (Aug 28, 2024):

https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md

Reference: github-starred/ollama#3409