[GH-ISSUE #3729] failed at cuda 12.2 with GTX1080 Ti #48806

Closed
opened 2026-04-28 09:20:42 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @MissingTwins on GitHub (Apr 18, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3729

What is the issue?

This is a freshly installed Ollama, but it failed on the first launch. CUDA 12.2 is installed.

ben@amd:~/work/ollama$ curl -fsSL https://ollama.com/install.sh | sh
>>> Downloading ollama...
####################################################################################################################### 100.0%
>>> Installing ollama to /usr/local/bin...
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> NVIDIA GPU installed.
ben@amd:~/work/ollama$ ollama mistral
Error: unknown command "mistral" for "ollama"
ben@amd:~/work/ollama$ ollama mistral^C
ben@amd:~/work/ollama$ ollama run mistral
pulling manifest 
pulling e8a35b5937a5... 100% ▕██████████████████████████████████████████████████████████████▏ 4.1 GB                         
pulling 43070e2d4e53... 100% ▕██████████████████████████████████████████████████████████████▏  11 KB                         
pulling e6836092461f... 100% ▕██████████████████████████████████████████████████████████████▏   42 B                         
pulling ed11eda7790d... 100% ▕██████████████████████████████████████████████████████████████▏   30 B                         
pulling f9b1e3196ecf... 100% ▕██████████████████████████████████████████████████████████████▏  483 B                         
verifying sha256 digest 
writing manifest 
removing any unused layers 
success 
Error: llama runner process no longer running: 1 

ben@amd:~/work/ollama$ ollama run mistral
Error: llama runner process no longer running: 1 

ben@amd:~/work/ollama$ nvidia-smi 
Thu Apr 18 15:59:20 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.07             Driver Version: 535.161.07   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1080 Ti     On  | 00000000:0A:00.0 Off |                  N/A |
|  0%   33C    P8              19W / 275W |      4MiB / 11264MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
ben@amd:~/work/ollama$ ollama -v
ollama version is 0.1.32

I have symlinked libcublas.so.11 -> libcublas.so.12, but it still fails. CUDA works fine for my other CUDA 11.x projects.

ben@amd:~/work/ollama$ ls -al  /usr/local/cuda/lib64/libcudart*
lrwxrwxrwx 1 root root      15 Aug 16  2023 /usr/local/cuda/lib64/libcudart.so -> libcudart.so.12
lrwxrwxrwx 1 root root      21 Aug 16  2023 /usr/local/cuda/lib64/libcudart.so.12 -> libcudart.so.12.2.140
-rw-r--r-- 1 root root  683360 Aug 16  2023 /usr/local/cuda/lib64/libcudart.so.12.2.140
-rw-r--r-- 1 root root 1379326 Aug 16  2023 /usr/local/cuda/lib64/libcudart_static.a

ben@amd:~/work/ollama$ ls -al  /usr/local/cuda/lib64/libcublas*
lrwxrwxrwx 1 root root        17 Aug 16  2023 /usr/local/cuda/lib64/libcublasLt.so -> libcublasLt.so.12
lrwxrwxrwx 1 root root        23 Aug 16  2023 /usr/local/cuda/lib64/libcublasLt.so.12 -> libcublasLt.so.12.2.5.6
-rw-r--r-- 1 root root 525843792 Aug 16  2023 /usr/local/cuda/lib64/libcublasLt.so.12.2.5.6
-rw-r--r-- 1 root root 770686098 Aug 16  2023 /usr/local/cuda/lib64/libcublasLt_static.a
lrwxrwxrwx 1 root root        15 Aug 16  2023 /usr/local/cuda/lib64/libcublas.so -> libcublas.so.12
lrwxrwxrwx 1 root root        15 Feb 15 19:55 /usr/local/cuda/lib64/libcublas.so.11 -> libcublas.so.12
lrwxrwxrwx 1 root root        21 Aug 16  2023 /usr/local/cuda/lib64/libcublas.so.12 -> libcublas.so.12.2.5.6
-rw-r--r-- 1 root root 106675248 Aug 16  2023 /usr/local/cuda/lib64/libcublas.so.12.2.5.6
-rw-r--r-- 1 root root 168600104 Aug 16  2023 /usr/local/cuda/lib64/libcublas_static.a


ben@amd:~/work/ollama$ ls -ld /usr/local/cuda*
lrwxrwxrwx  1 root root   22 Jan 20 23:13 /usr/local/cuda -> /etc/alternatives/cuda
lrwxrwxrwx  1 root root   25 Jan 20 23:13 /usr/local/cuda-12 -> /etc/alternatives/cuda-12
drwxr-xr-x 15 root root 4096 Jan 20 23:12 /usr/local/cuda-12.2

Here are the logs:
[   83.120309] amd systemd[1]: Started Ollama Service.
[   83.133078] amd ollama[2303]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key.
[   83.134194] amd ollama[2303]: Your new public key is:
[   83.134194] amd ollama[2303]: ssh-ed25519 Censored
[   83.134456] amd ollama[2303]: time=2024-04-18T15:55:19.875+09:00 level=INFO source=images.go:817 msg="total blobs: 0"
[   83.134513] amd ollama[2303]: time=2024-04-18T15:55:19.875+09:00 level=INFO source=images.go:824 msg="total unused blobs removed: 0"
[   83.134621] amd ollama[2303]: time=2024-04-18T15:55:19.875+09:00 level=INFO source=routes.go:1143 msg="Listening on 127.0.0.1:11434 (version 0.1.32)"
[   83.134956] amd ollama[2303]: time=2024-04-18T15:55:19.875+09:00 level=INFO source=payload.go:28 msg="extracting embedded files" dir=/tmp/ollama3603894721/runners
[   86.003279] amd ollama[2303]: time=2024-04-18T15:55:22.744+09:00 level=INFO source=payload.go:41 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11 rocm_v60002]"
[   86.003279] amd ollama[2303]: time=2024-04-18T15:55:22.744+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type"
[   86.003618] amd ollama[2303]: time=2024-04-18T15:55:22.744+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
[   86.010035] amd ollama[2303]: time=2024-04-18T15:55:22.750+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]"
[   86.070476] amd ollama[2303]: time=2024-04-18T15:55:22.811+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart"
[   86.070514] amd ollama[2303]: time=2024-04-18T15:55:22.811+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[   86.225488] amd ollama[2303]: time=2024-04-18T15:55:22.966+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1"
[  137.899701] amd ollama[2303]: [GIN] 2024/04/18 - 15:56:14 | 200 |       37.46µs |       127.0.0.1 | HEAD     "/"
[  137.900488] amd ollama[2303]: [GIN] 2024/04/18 - 15:56:14 | 404 |     116.077µs |       127.0.0.1 | POST     "/api/show"
[  140.789836] amd ollama[2303]: time=2024-04-18T15:56:17.530+09:00 level=INFO source=download.go:136 msg="downloading e8a35b5937a5 in 42 100 MB part(s)"
[  185.253719] amd ollama[2303]: time=2024-04-18T15:57:01.994+09:00 level=INFO source=download.go:136 msg="downloading 43070e2d4e53 in 1 11 KB part(s)"
[  187.174634] amd ollama[2303]: time=2024-04-18T15:57:03.915+09:00 level=INFO source=download.go:136 msg="downloading e6836092461f in 1 42 B part(s)"
[  190.115108] amd ollama[2303]: time=2024-04-18T15:57:06.855+09:00 level=INFO source=download.go:136 msg="downloading ed11eda7790d in 1 30 B part(s)"
[  192.046078] amd ollama[2303]: time=2024-04-18T15:57:08.786+09:00 level=INFO source=download.go:136 msg="downloading f9b1e3196ecf in 1 483 B part(s)"
[  195.485759] amd ollama[2303]: [GIN] 2024/04/18 - 15:57:12 | 200 | 57.585406568s |       127.0.0.1 | POST     "/api/pull"
[  195.486886] amd ollama[2303]: [GIN] 2024/04/18 - 15:57:12 | 200 |     672.555µs |       127.0.0.1 | POST     "/api/show"
[  195.487559] amd ollama[2303]: [GIN] 2024/04/18 - 15:57:12 | 200 |     198.871µs |       127.0.0.1 | POST     "/api/show"
[  195.993394] amd ollama[2303]: time=2024-04-18T15:57:12.734+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type"
[  195.993394] amd ollama[2303]: time=2024-04-18T15:57:12.734+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
[  195.997472] amd ollama[2303]: time=2024-04-18T15:57:12.738+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]"
[  195.998324] amd ollama[2303]: time=2024-04-18T15:57:12.739+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart"
[  195.998324] amd ollama[2303]: time=2024-04-18T15:57:12.739+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  196.134773] amd ollama[2303]: time=2024-04-18T15:57:12.875+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1"
[  196.191449] amd ollama[2303]: time=2024-04-18T15:57:12.932+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type"
[  196.191449] amd ollama[2303]: time=2024-04-18T15:57:12.932+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
[  196.193140] amd ollama[2303]: time=2024-04-18T15:57:12.934+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]"
[  196.193572] amd ollama[2303]: time=2024-04-18T15:57:12.934+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart"
[  196.193572] amd ollama[2303]: time=2024-04-18T15:57:12.934+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  196.268651] amd ollama[2303]: time=2024-04-18T15:57:13.009+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1"
[  196.312505] amd ollama[2303]: time=2024-04-18T15:57:13.053+09:00 level=INFO source=server.go:127 msg="offload to gpu" reallayers=33 layers=33 required="4724.5 MiB" used="4724.5 MiB" available="11009.9 MiB" kv="256.0 MiB" fulloffload="164.0 MiB" partialoffload="181.0 MiB"
[  196.312587] amd ollama[2303]: time=2024-04-18T15:57:13.053+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  196.312708] amd ollama[2303]: time=2024-04-18T15:57:13.053+09:00 level=INFO source=server.go:264 msg="starting llama server" cmd="/tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server --model /usr/share/ollama/.ollama/models/blobs/sha256-e8a35b5937a5e6d5c35d1f2a15f161e07eefe5e5bb0a3cdd42998ee79b057730 --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --port 42067"
[  196.312960] amd ollama[2303]: time=2024-04-18T15:57:13.053+09:00 level=INFO source=server.go:389 msg="waiting for llama runner to start responding"
[  196.319131] amd ollama[2303]: /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server: /usr/local/cuda/lib64/libcublas.so.11: version `libcublas.so.11' not found (required by /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server)
[  196.363803] amd ollama[2303]: time=2024-04-18T15:57:13.104+09:00 level=ERROR source=routes.go:120 msg="error loading llama server" error="llama runner process no longer running: 1 "
[  196.363845] amd ollama[2303]: [GIN] 2024/04/18 - 15:57:13 | 500 |  875.760611ms |       127.0.0.1 | POST     "/api/chat"
[  245.946877] amd ollama[2303]: [GIN] 2024/04/18 - 15:58:02 | 200 |      19.607µs |       127.0.0.1 | HEAD     "/"
[  245.947494] amd ollama[2303]: [GIN] 2024/04/18 - 15:58:02 | 200 |     375.091µs |       127.0.0.1 | POST     "/api/show"
[  245.948175] amd ollama[2303]: [GIN] 2024/04/18 - 15:58:02 | 200 |      289.03µs |       127.0.0.1 | POST     "/api/show"
[  246.447634] amd ollama[2303]: time=2024-04-18T15:58:03.188+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type"
[  246.447634] amd ollama[2303]: time=2024-04-18T15:58:03.188+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
[  246.450677] amd ollama[2303]: time=2024-04-18T15:58:03.191+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]"
[  246.451539] amd ollama[2303]: time=2024-04-18T15:58:03.192+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart"
[  246.451539] amd ollama[2303]: time=2024-04-18T15:58:03.192+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  246.595211] amd ollama[2303]: time=2024-04-18T15:58:03.336+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1"
[  246.650697] amd ollama[2303]: time=2024-04-18T15:58:03.391+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type"
[  246.650697] amd ollama[2303]: time=2024-04-18T15:58:03.391+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
[  246.652325] amd ollama[2303]: time=2024-04-18T15:58:03.393+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]"
[  246.652764] amd ollama[2303]: time=2024-04-18T15:58:03.393+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart"
[  246.652764] amd ollama[2303]: time=2024-04-18T15:58:03.393+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  246.731444] amd ollama[2303]: time=2024-04-18T15:58:03.472+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1"
[  246.781592] amd ollama[2303]: time=2024-04-18T15:58:03.522+09:00 level=INFO source=server.go:127 msg="offload to gpu" reallayers=33 layers=33 required="4724.5 MiB" used="4724.5 MiB" available="11009.9 MiB" kv="256.0 MiB" fulloffload="164.0 MiB" partialoffload="181.0 MiB"
[  246.781684] amd ollama[2303]: time=2024-04-18T15:58:03.522+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  246.781800] amd ollama[2303]: time=2024-04-18T15:58:03.522+09:00 level=INFO source=server.go:264 msg="starting llama server" cmd="/tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server --model /usr/share/ollama/.ollama/models/blobs/sha256-e8a35b5937a5e6d5c35d1f2a15f161e07eefe5e5bb0a3cdd42998ee79b057730 --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --port 36135"
[  246.782201] amd ollama[2303]: time=2024-04-18T15:58:03.523+09:00 level=INFO source=server.go:389 msg="waiting for llama runner to start responding"
[  246.783896] amd ollama[2303]: /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server: /usr/local/cuda/lib64/libcublas.so.11: version `libcublas.so.11' not found (required by /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server)
[  246.833186] amd ollama[2303]: time=2024-04-18T15:58:03.574+09:00 level=ERROR source=routes.go:120 msg="error loading llama server" error="llama runner process no longer running: 1 "
[  246.833231] amd ollama[2303]: [GIN] 2024/04/18 - 15:58:03 | 500 |  884.761421ms |       127.0.0.1 | POST     "/api/chat"
[  437.978588] amd ollama[2303]: [GIN] 2024/04/18 - 16:01:14 | 200 |      20.769µs |       127.0.0.1 | HEAD     "/"
[  437.979267] amd ollama[2303]: [GIN] 2024/04/18 - 16:01:14 | 200 |     284.743µs |       127.0.0.1 | GET      "/api/tags"
[  456.418438] amd ollama[2303]: [GIN] 2024/04/18 - 16:01:33 | 200 |      28.283µs |       127.0.0.1 | HEAD     "/"
[  458.934304] amd ollama[2303]: time=2024-04-18T16:01:35.675+09:00 level=INFO source=download.go:136 msg="downloading 170370233dd5 in 42 100 MB part(s)"
[  476.934522] amd ollama[2303]: time=2024-04-18T16:01:53.675+09:00 level=INFO source=download.go:251 msg="170370233dd5 part 8 stalled; retrying. If this persists, press ctrl-c to exit, then 'ollama pull' to find a faster connection."
[  508.847472] amd ollama[2303]: time=2024-04-18T16:02:25.588+09:00 level=INFO source=download.go:136 msg="downloading 72d6f08a42f6 in 7 100 MB part(s)"
[  518.832954] amd ollama[2303]: time=2024-04-18T16:02:35.573+09:00 level=INFO source=download.go:136 msg="downloading c43332387573 in 1 67 B part(s)"
[  520.743339] amd ollama[2303]: time=2024-04-18T16:02:37.484+09:00 level=INFO source=download.go:136 msg="downloading 7c658f9561e5 in 1 564 B part(s)"
[  524.572442] amd ollama[2303]: [GIN] 2024/04/18 - 16:02:41 | 200 |          1m8s |       127.0.0.1 | POST     "/api/pull"
[  551.538296] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:08 | 200 |      26.851µs |       127.0.0.1 | HEAD     "/"
[  551.539134] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:08 | 404 |      65.683µs |       127.0.0.1 | POST     "/api/show"
[  555.200514] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:11 | 200 |  3.661717372s |       127.0.0.1 | POST     "/api/pull"
[  555.201466] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:11 | 200 |     569.521µs |       127.0.0.1 | POST     "/api/show"
[  555.202271] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:11 | 200 |      207.34µs |       127.0.0.1 | POST     "/api/show"
[  555.448202] amd ollama[2303]: time=2024-04-18T16:03:12.189+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type"
[  555.448202] amd ollama[2303]: time=2024-04-18T16:03:12.189+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
[  555.451303] amd ollama[2303]: time=2024-04-18T16:03:12.192+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]"
[  555.452133] amd ollama[2303]: time=2024-04-18T16:03:12.193+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart"
[  555.452133] amd ollama[2303]: time=2024-04-18T16:03:12.193+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  555.576880] amd ollama[2303]: time=2024-04-18T16:03:12.317+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1"
[  555.632608] amd ollama[2303]: time=2024-04-18T16:03:12.373+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type"
[  555.632608] amd ollama[2303]: time=2024-04-18T16:03:12.373+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*"
[  555.634189] amd ollama[2303]: time=2024-04-18T16:03:12.375+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]"
[  555.634617] amd ollama[2303]: time=2024-04-18T16:03:12.375+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart"
[  555.634617] amd ollama[2303]: time=2024-04-18T16:03:12.375+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  555.727459] amd ollama[2303]: time=2024-04-18T16:03:12.468+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1"
[  555.775085] amd ollama[2303]: time=2024-04-18T16:03:12.515+09:00 level=INFO source=server.go:127 msg="offload to gpu" reallayers=33 layers=33 required="5320.0 MiB" used="5320.0 MiB" available="11009.9 MiB" kv="256.0 MiB" fulloffload="164.0 MiB" partialoffload="181.0 MiB"
[  555.775171] amd ollama[2303]: time=2024-04-18T16:03:12.516+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
[  555.775247] amd ollama[2303]: time=2024-04-18T16:03:12.516+09:00 level=INFO source=server.go:264 msg="starting llama server" cmd="/tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server --model /usr/share/ollama/.ollama/models/blobs/sha256-170370233dd5c5415250a2ecd5c71586352850729062ccef1496385647293868 --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --mmproj /usr/share/ollama/.ollama/models/blobs/sha256-72d6f08a42f656d36b356dbe0920675899a99ce21192fd66266fb7d82ed07539 --port 35909"
[  555.775609] amd ollama[2303]: time=2024-04-18T16:03:12.516+09:00 level=INFO source=server.go:389 msg="waiting for llama runner to start responding"
[  555.777326] amd ollama[2303]: /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server: /usr/local/cuda/lib64/libcublas.so.11: version `libcublas.so.11' not found (required by /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server)
[  555.826400] amd ollama[2303]: time=2024-04-18T16:03:12.567+09:00 level=ERROR source=routes.go:120 msg="error loading llama server" error="llama runner process no longer running: 1 "
[  555.826454] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:12 | 500 |   623.91401ms |       127.0.0.1 | POST     "/api/chat"
[ 2734.691422] amd ollama[2303]: [GIN] 2024/04/18 - 16:39:31 | 200 |      42.189µs |       127.0.0.1 | GET      "/api/version"
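One way to confirm the shadowing suggested by the log line above (a diagnostic sketch, not a fix): ldd shows which library file the dynamic loader would actually bind for each dependency. If the hand-made /usr/local/cuda/lib64/libcublas.so.11 symlink appears there instead of a genuine CUDA 11 library, that is the source of the load failure. Demonstrated on /bin/ls for portability:

```shell
# Resolve each shared-library dependency the way the loader would.
# Portable demo on /bin/ls; for the actual runner (path taken from the
# log, regenerated under /tmp on every service start):
#   ldd /tmp/ollama*/runners/cuda_v11/ollama_llama_server | grep cublas
ldd /bin/ls | grep libc
```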

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.1.32

"/api/chat" [ 437.978588] amd ollama[2303]: [GIN] 2024/04/18 - 16:01:14 | 200 | 20.769µs | 127.0.0.1 | HEAD "/" [ 437.979267] amd ollama[2303]: [GIN] 2024/04/18 - 16:01:14 | 200 | 284.743µs | 127.0.0.1 | GET "/api/tags" [ 456.418438] amd ollama[2303]: [GIN] 2024/04/18 - 16:01:33 | 200 | 28.283µs | 127.0.0.1 | HEAD "/" [ 458.934304] amd ollama[2303]: time=2024-04-18T16:01:35.675+09:00 level=INFO source=download.go:136 msg="downloading 170370233dd5 in 42 100 MB part(s)" [ 476.934522] amd ollama[2303]: time=2024-04-18T16:01:53.675+09:00 level=INFO source=download.go:251 msg="170370233dd5 part 8 stalled; retrying. If this persists, press ctrl-c to exit, then 'ollama pull' to find a faster connection." [ 508.847472] amd ollama[2303]: time=2024-04-18T16:02:25.588+09:00 level=INFO source=download.go:136 msg="downloading 72d6f08a42f6 in 7 100 MB part(s)" [ 518.832954] amd ollama[2303]: time=2024-04-18T16:02:35.573+09:00 level=INFO source=download.go:136 msg="downloading c43332387573 in 1 67 B part(s)" [ 520.743339] amd ollama[2303]: time=2024-04-18T16:02:37.484+09:00 level=INFO source=download.go:136 msg="downloading 7c658f9561e5 in 1 564 B part(s)" [ 524.572442] amd ollama[2303]: [GIN] 2024/04/18 - 16:02:41 | 200 | 1m8s | 127.0.0.1 | POST "/api/pull" [ 551.538296] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:08 | 200 | 26.851µs | 127.0.0.1 | HEAD "/" [ 551.539134] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:08 | 404 | 65.683µs | 127.0.0.1 | POST "/api/show" [ 555.200514] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:11 | 200 | 3.661717372s | 127.0.0.1 | POST "/api/pull" [ 555.201466] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:11 | 200 | 569.521µs | 127.0.0.1 | POST "/api/show" [ 555.202271] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:11 | 200 | 207.34µs | 127.0.0.1 | POST "/api/show" [ 555.448202] amd ollama[2303]: time=2024-04-18T16:03:12.189+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type" [ 555.448202] amd ollama[2303]: time=2024-04-18T16:03:12.189+09:00 level=INFO 
source=gpu.go:268 msg="Searching for GPU management library libcudart.so*" [ 555.451303] amd ollama[2303]: time=2024-04-18T16:03:12.192+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]" [ 555.452133] amd ollama[2303]: time=2024-04-18T16:03:12.193+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart" [ 555.452133] amd ollama[2303]: time=2024-04-18T16:03:12.193+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2" [ 555.576880] amd ollama[2303]: time=2024-04-18T16:03:12.317+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1" [ 555.632608] amd ollama[2303]: time=2024-04-18T16:03:12.373+09:00 level=INFO source=gpu.go:121 msg="Detecting GPU type" [ 555.632608] amd ollama[2303]: time=2024-04-18T16:03:12.373+09:00 level=INFO source=gpu.go:268 msg="Searching for GPU management library libcudart.so*" [ 555.634189] amd ollama[2303]: time=2024-04-18T16:03:12.375+09:00 level=INFO source=gpu.go:314 msg="Discovered GPU libraries: [/tmp/ollama3603894721/runners/cuda_v11/libcudart.so.11.0 /usr/local/cuda/lib64/libcudart.so.12.2.140]" [ 555.634617] amd ollama[2303]: time=2024-04-18T16:03:12.375+09:00 level=INFO source=gpu.go:126 msg="Nvidia GPU detected via cudart" [ 555.634617] amd ollama[2303]: time=2024-04-18T16:03:12.375+09:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2" [ 555.727459] amd ollama[2303]: time=2024-04-18T16:03:12.468+09:00 level=INFO source=gpu.go:202 msg="[cudart] CUDART CUDA Compute Capability detected: 6.1" [ 555.775085] amd ollama[2303]: time=2024-04-18T16:03:12.515+09:00 level=INFO source=server.go:127 msg="offload to gpu" reallayers=33 layers=33 required="5320.0 MiB" used="5320.0 MiB" available="11009.9 MiB" kv="256.0 MiB" fulloffload="164.0 MiB" partialoffload="181.0 MiB" [ 555.775171] amd ollama[2303]: time=2024-04-18T16:03:12.516+09:00 level=INFO 
source=cpu_common.go:11 msg="CPU has AVX2" [ 555.775247] amd ollama[2303]: time=2024-04-18T16:03:12.516+09:00 level=INFO source=server.go:264 msg="starting llama server" cmd="/tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server --model /usr/share/ollama/.ollama/models/blobs/sha256-170370233dd5c5415250a2ecd5c71586352850729062ccef1496385647293868 --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --mmproj /usr/share/ollama/.ollama/models/blobs/sha256-72d6f08a42f656d36b356dbe0920675899a99ce21192fd66266fb7d82ed07539 --port 35909" [ 555.775609] amd ollama[2303]: time=2024-04-18T16:03:12.516+09:00 level=INFO source=server.go:389 msg="waiting for llama runner to start responding" [ 555.777326] amd ollama[2303]: /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server: /usr/local/cuda/lib64/libcublas.so.11: version `libcublas.so.11' not found (required by /tmp/ollama3603894721/runners/cuda_v11/ollama_llama_server) [ 555.826400] amd ollama[2303]: time=2024-04-18T16:03:12.567+09:00 level=ERROR source=routes.go:120 msg="error loading llama server" error="llama runner process no longer running: 1 " [ 555.826454] amd ollama[2303]: [GIN] 2024/04/18 - 16:03:12 | 500 | 623.91401ms | 127.0.0.1 | POST "/api/chat" [ 2734.691422] amd ollama[2303]: [GIN] 2024/04/18 - 16:39:31 | 200 | 42.189µs | 127.0.0.1 | GET "/api/version" ``` </details> ### OS Linux ### GPU Nvidia ### CPU AMD ### Ollama version 0.1.32
GiteaMirror added the bug label 2026-04-28 09:20:42 -05:00

@remy415 commented on GitHub (Apr 18, 2024):

Hello, can you remove the link `/usr/local/cuda/lib64/libcublas.so.11 -> libcublas.so.12` and try running it again?

<!-- gh-comment-id:2063989851 -->
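For reference, a minimal shell sketch of the check being suggested (the default `CUDA_LIB` matches the path in the logs above; adjust it for other installs, and the `rm` step is left commented out because it is destructive):

```shell
#!/bin/sh
# Sketch: detect a libcublas.so.11 that is really a symlink to the CUDA 12
# library. The default path matches this system's logs; override with CUDA_LIB.
CUDA_LIB="${CUDA_LIB:-/usr/local/cuda/lib64}"

if [ -L "$CUDA_LIB/libcublas.so.11" ]; then
    # A file named for the CUDA 11 soname but pointing at libcublas.so.12
    # exports only CUDA 12 version symbols, which is what produces the
    # "version `libcublas.so.11' not found" error in the logs above.
    echo "libcublas.so.11 -> $(readlink "$CUDA_LIB/libcublas.so.11")"
    # To remove it (run manually, needs root):
    #   sudo rm "$CUDA_LIB/libcublas.so.11" && sudo ldconfig
else
    echo "no libcublas.so.11 symlink in $CUDA_LIB"
fi
```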

@MissingTwins commented on GitHub (Apr 18, 2024):

Hi, thank you for the reply.
I removed the symbolic link and tried again, and it works now.

I think I need to find another way for CUDA 11 projects that rely on this symbolic link to coexist with Ollama.
Thank you for your time.

<!-- gh-comment-id:2064824507 -->
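One possible way to let CUDA 11 projects coexist with Ollama without restoring the global symlink (a sketch, not from this thread: the `/usr/local/cuda-11.8` prefix and `my_cuda11_app` binary are hypothetical placeholders for a real CUDA 11 toolkit install and the project using it) is to scope the CUDA 11 libraries to just that process via `LD_LIBRARY_PATH`, leaving `/usr/local/cuda` untouched for Ollama:

```shell
#!/bin/sh
# Sketch: per-process library path instead of a system-wide symlink.
# /usr/local/cuda-11.8 and my_cuda11_app are placeholders; substitute your
# actual CUDA 11 install prefix and binary.
CUDA11_LIB="/usr/local/cuda-11.8/lib64"

# Prepend the CUDA 11 lib dir for this shell/process tree only, keeping any
# existing LD_LIBRARY_PATH entries behind it.
export LD_LIBRARY_PATH="$CUDA11_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
echo "$LD_LIBRARY_PATH"

# ./my_cuda11_app   # this process now resolves libcublas.so.11 from CUDA 11
```

With this approach the dynamic loader finds the real CUDA 11 `libcublas.so.11` only inside the CUDA 11 project's environment, so Ollama's bundled CUDA runners never see a mislabeled library.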
Reference: github-starred/ollama#48806