[GH-ISSUE #9069] error while loading shared libraries: libggml_cuda_v11.so: ELF load command past end of file #5906

Closed
opened 2026-04-12 17:14:34 -05:00 by GiteaMirror · 6 comments

Originally created by @xcsnx on GitHub (Feb 13, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9069

What is the issue?

environment:

Image: https://github.com/user-attachments/assets/6a6b10e2-3cf3-4ab9-86b1-face70ac96aa

Image: https://github.com/user-attachments/assets/ca3bd6b4-53d5-431d-9b01-9f1e5bef276d

Image: https://github.com/user-attachments/assets/1fa6ffd6-002d-4745-89b8-8667d958169c

ollama-linux-amd64.tgz

bug:

Image: https://github.com/user-attachments/assets/7e8217bf-0e9e-4f5b-8407-2f2fe229dec2
/opt/ollama/lib/ollama/runners/cuda_v11_avx/ollama_llama_server: error while loading shared libraries: libggml_cuda_v11.so: ELF load command past end of file

Image: https://github.com/user-attachments/assets/e8042386-a16b-45bd-b91d-72370ea2370d

How should this error be handled?

Relevant log output

Feb 13 20:50:48 aiops003 ollama[1214]: time=2025-02-13T20:50:48.943+08:00 level=INFO source=server.go:376 msg="starting llama server" cmd="/opt/ollama/lib/ollama/runners/cuda_v11_avx/ollama_
Feb 13 20:50:48 aiops003 ollama[1214]: time=2025-02-13T20:50:48.943+08:00 level=INFO source=sched.go:449 msg="loaded runners" count=1
Feb 13 20:50:48 aiops003 ollama[1214]: time=2025-02-13T20:50:48.943+08:00 level=INFO source=server.go:555 msg="waiting for llama runner to start responding"
Feb 13 20:50:48 aiops003 ollama[1214]: /opt/ollama/lib/ollama/runners/cuda_v11_avx/ollama_llama_server: error while loading shared libraries: libggml_cuda_v11.so: ELF load command past end o
Feb 13 20:50:48 aiops003 ollama[1214]: time=2025-02-13T20:50:48.944+08:00 level=INFO source=server.go:589 msg="waiting for server to become available" status="llm server error"
Feb 13 20:50:49 aiops003 ollama[1214]: time=2025-02-13T20:50:49.194+08:00 level=ERROR source=sched.go:455 msg="error loading llama server" error="llama runner process has terminated: exit st
Feb 13 20:50:49 aiops003 ollama[1214]: [GIN] 2025/02/13 - 20:50:49 | 500 |  634.140215ms |       127.0.0.1 | POST     "/api/generate"
Feb 13 20:50:54 aiops003 ollama[1214]: time=2025-02-13T20:50:54.336+08:00 level=WARN source=sched.go:646 msg="gpu VRAM usage didn't recover within timeout" seconds=5.142091008 model=/opt/oll
Feb 13 20:50:54 aiops003 ollama[1214]: time=2025-02-13T20:50:54.587+08:00 level=WARN source=sched.go:646 msg="gpu VRAM usage didn't recover within timeout" seconds=5.392626392 model=/opt/oll
Feb 13 20:50:54 aiops003 ollama[1214]: time=2025-02-13T20:50:54.837+08:00 level=WARN source=sched.go:646 msg="gpu VRAM usage didn't recover within timeout" seconds=5.642826874 model=/opt/oll
Feb 13 21:14:52 aiops003 ollama[1214]: [GIN] 2025/02/13 - 21:14:52 | 200 |      69.284µs |       127.0.0.1 | HEAD     "/"
Feb 13 21:14:52 aiops003 ollama[1214]: [GIN] 2025/02/13 - 21:14:52 | 200 |    31.41354ms |       127.0.0.1 | POST     "/api/show"
Feb 13 21:14:52 aiops003 ollama[1214]: time=2025-02-13T21:14:52.942+08:00 level=INFO source=sched.go:714 msg="new model will fit in available VRAM in single GPU, loading" model=/opt/ollama/m
Feb 13 21:14:53 aiops003 ollama[1214]: time=2025-02-13T21:14:53.090+08:00 level=INFO source=server.go:104 msg="system memory" total="62.2 GiB" free="29.9 GiB" free_swap="31.2 GiB"
Feb 13 21:14:53 aiops003 ollama[1214]: time=2025-02-13T21:14:53.091+08:00 level=INFO source=memory.go:356 msg="offload to cuda" layers.requested=-1 layers.model=29 layers.offload=29 layers.s
Feb 13 21:14:53 aiops003 ollama[1214]: time=2025-02-13T21:14:53.092+08:00 level=INFO source=server.go:376 msg="starting llama server" cmd="/opt/ollama/lib/ollama/runners/cuda_v11_avx/ollama_
Feb 13 21:14:53 aiops003 ollama[1214]: time=2025-02-13T21:14:53.092+08:00 level=INFO source=sched.go:449 msg="loaded runners" count=1
Feb 13 21:14:53 aiops003 ollama[1214]: time=2025-02-13T21:14:53.092+08:00 level=INFO source=server.go:555 msg="waiting for llama runner to start responding"
Feb 13 21:14:53 aiops003 ollama[1214]: /opt/ollama/lib/ollama/runners/cuda_v11_avx/ollama_llama_server: error while loading shared libraries: libggml_cuda_v11.so: ELF load command past end o
Feb 13 21:14:53 aiops003 ollama[1214]: time=2025-02-13T21:14:53.092+08:00 level=INFO source=server.go:589 msg="waiting for server to become available" status="llm server error"
Feb 13 21:14:53 aiops003 ollama[1214]: time=2025-02-13T21:14:53.343+08:00 level=ERROR source=sched.go:455 msg="error loading llama server" error="llama runner process has terminated: exit st
Feb 13 21:14:53 aiops003 ollama[1214]: [GIN] 2025/02/13 - 21:14:53 | 500 |   653.90216ms |       127.0.0.1 | POST     "/api/generate"
Feb 13 21:14:54 aiops003 ollama[1214]: [GIN] 2025/02/13 - 21:14:54 | 200 |      50.188µs |       127.0.0.1 | HEAD     "/"
Feb 13 21:14:54 aiops003 ollama[1214]: [GIN] 2025/02/13 - 21:14:54 | 200 |   33.526529ms |       127.0.0.1 | POST     "/api/show"
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.132+08:00 level=INFO source=sched.go:714 msg="new model will fit in available VRAM in single GPU, loading" model=/opt/ollama/m
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.310+08:00 level=INFO source=server.go:104 msg="system memory" total="62.2 GiB" free="29.9 GiB" free_swap="31.2 GiB"
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.310+08:00 level=INFO source=memory.go:356 msg="offload to cuda" layers.requested=-1 layers.model=29 layers.offload=29 layers.s
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.311+08:00 level=INFO source=server.go:376 msg="starting llama server" cmd="/opt/ollama/lib/ollama/runners/cuda_v11_avx/ollama_
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.311+08:00 level=INFO source=sched.go:449 msg="loaded runners" count=1
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.311+08:00 level=INFO source=server.go:555 msg="waiting for llama runner to start responding"
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.312+08:00 level=INFO source=server.go:589 msg="waiting for server to become available" status="llm server error"
Feb 13 21:14:55 aiops003 ollama[1214]: /opt/ollama/lib/ollama/runners/cuda_v11_avx/ollama_llama_server: error while loading shared libraries: libggml_cuda_v11.so: ELF load command past end o
Feb 13 21:14:55 aiops003 ollama[1214]: time=2025-02-13T21:14:55.562+08:00 level=ERROR source=sched.go:455 msg="error loading llama server" error="llama runner process has terminated: exit st
Feb 13 21:14:55 aiops003 ollama[1214]: [GIN] 2025/02/13 - 21:14:55 | 500 |  645.989673ms |       127.0.0.1 | POST     "/api/generate"
Feb 13 21:14:58 aiops003 ollama[1214]: time=2025-02-13T21:14:58.491+08:00 level=WARN source=sched.go:646 msg="gpu VRAM usage didn't recover within timeout" seconds=5.147951666 model=/opt/oll
Feb 13 21:14:58 aiops003 ollama[1214]: time=2025-02-13T21:14:58.741+08:00 level=WARN source=sched.go:646 msg="gpu VRAM usage didn't recover within timeout" seconds=5.397326342 model=/opt/oll
Feb 13 21:14:58 aiops003 ollama[1214]: time=2025-02-13T21:14:58.990+08:00 level=WARN source=sched.go:646 msg="gpu VRAM usage didn't recover within timeout" seconds=5.647177827 model=/opt/oll

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

ollama version is 0.5.7

GiteaMirror added the bug label 2026-04-12 17:14:34 -05:00

@rick-github commented on GitHub (Feb 13, 2025):

Did you run out of disk space while installing? Your libggml_cuda_v11.so is shorter, has the wrong mode, and hasn't had the mtime updated.

$ ls -l
total 965800
-rwxr-xr-x 1 root root 979085896 Jan 16 18:01 libggml_cuda_v11.so
-rwxr-xr-x 1 root root   9885296 Jan 16 18:03 ollama_llama_server
$ file libggml_cuda_v11.so
libggml_cuda_v11.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=de5f4b5eb1f5d6a71df36ad861957c886c945230, not stripped
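
For context, "error while loading shared libraries: ... ELF load command past end of file" means the dynamic loader found program headers pointing past the end of the file, i.e. the on-disk library is truncated. A minimal way to confirm, assuming the install paths from this thread:

# stat reports the actual on-disk size, mode, and mtime;
# readelf -l will complain if the program headers run past the end of a truncated ELF
stat -c '%s %A %y' /opt/ollama/lib/ollama/runners/cuda_v11_avx/libggml_cuda_v11.so
readelf -l /opt/ollama/lib/ollama/runners/cuda_v11_avx/libggml_cuda_v11.so > /dev/null && echo "program headers OK"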


@xcsnx commented on GitHub (Feb 13, 2025):

> Did you run out of disk space while installing? Your libggml_cuda_v11.so is shorter, has the wrong mode, and hasn't had the mtime updated.
>
> $ ls -l
> total 965800
> -rwxr-xr-x 1 root root 979085896 Jan 16 18:01 libggml_cuda_v11.so
> -rwxr-xr-x 1 root root   9885296 Jan 16 18:03 ollama_llama_server
> $ file libggml_cuda_v11.so
> libggml_cuda_v11.so: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=de5f4b5eb1f5d6a71df36ad861957c886c945230, not stripped

Thank you for your reply. My disk space is sufficient, but I don’t quite understand what you mean. I installed it manually offline using the command tar -xzf ollama-linux-amd64.tgz. For details, please refer to: https://blog.csdn.net/weixin_44988127/article/details/145021587. Could you please provide further clarification?
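
Since this was an offline install, it is also worth ruling out damage to the tarball itself during download or transfer. A quick sketch, assuming the archive is in the current directory:

# gzip -t decompresses the entire stream and reports corruption or truncation
gzip -t ollama-linux-amd64.tgz && echo "archive stream OK"
# where a published checksum is available, comparing it is a stronger check
sha256sum ollama-linux-amd64.tgz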


@rick-github commented on GitHub (Feb 13, 2025):

Your installation is broken.

Do this:

tar zvtf ollama-linux-amd64.tgz ./lib/ollama/runners/cuda_v11_avx

Do the results look like this:

drwxr-xr-x root/root         0 2025-01-16 18:03 ./lib/ollama/runners/cuda_v11_avx/
-rwxr-xr-x root/root   9885296 2025-01-16 18:03 ./lib/ollama/runners/cuda_v11_avx/ollama_llama_server
-rwxr-xr-x root/root 979085896 2025-01-16 18:01 ./lib/ollama/runners/cuda_v11_avx/libggml_cuda_v11.so

Then compare with the files in /opt/ollama/lib/ollama/runners/cuda_v11_avx/:

ls -l /opt/ollama/lib/ollama/runners/cuda_v11_avx/

Do they match the output of tar?

Recommendation: re-install.


@xcsnx commented on GitHub (Feb 13, 2025):

> Your installation is broken.
>
> Do this:
>
> tar zvtf ollama-linux-amd64.tgz ./lib/ollama/runners/cuda_v11_avx
>
> Do the results look like this:
>
> drwxr-xr-x root/root         0 2025-01-16 18:03 ./lib/ollama/runners/cuda_v11_avx/
> -rwxr-xr-x root/root   9885296 2025-01-16 18:03 ./lib/ollama/runners/cuda_v11_avx/ollama_llama_server
> -rwxr-xr-x root/root 979085896 2025-01-16 18:01 ./lib/ollama/runners/cuda_v11_avx/libggml_cuda_v11.so
>
> Then compare with the files in /opt/ollama/lib/ollama/runners/cuda_v11_avx/:
>
> ls -l /opt/ollama/lib/ollama/runners/cuda_v11_avx/
>
> Do they match the output of tar?
>
> Recommendation: re-install.

[root@aiops003 ollama]# tar zvtf ollama-linux-amd64.tgz ./lib/ollama/runners/cuda_v11_avx
drwxr-xr-x root/root         0 2025-01-17 01:03 ./lib/ollama/runners/cuda_v11_avx/
-rwxr-xr-x root/root   9885296 2025-01-17 01:03 ./lib/ollama/runners/cuda_v11_avx/ollama_llama_server
-rwxr-xr-x root/root 979085896 2025-01-17 01:01 ./lib/ollama/runners/cuda_v11_avx/libggml_cuda_v11.so

Thank you! As you said, the size of the installed libggml_cuda_v11.so is indeed different from the one in the tar package! So can I just replace the libggml_cuda_v11.so file directly?


@rick-github commented on GitHub (Feb 13, 2025):

There may be other incomplete files. Just re-install.

sudo tar --extract --verbose --file ollama-linux-amd64.tgz --directory /opt/ollama
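
After re-extracting, GNU tar's compare mode can confirm that everything on disk matches the archive. A sketch using the same paths:

# --compare (-d) checks every archive member against the file on disk and
# reports size, mode, or mtime mismatches; no output means a clean install
sudo tar --compare --gzip --file ollama-linux-amd64.tgz --directory /opt/ollama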

@xcsnx commented on GitHub (Feb 13, 2025):

> Your installation is broken.
>
> Do this:
>
> tar zvtf ollama-linux-amd64.tgz ./lib/ollama/runners/cuda_v11_avx
>
> Do the results look like this:
>
> drwxr-xr-x root/root         0 2025-01-16 18:03 ./lib/ollama/runners/cuda_v11_avx/
> -rwxr-xr-x root/root   9885296 2025-01-16 18:03 ./lib/ollama/runners/cuda_v11_avx/ollama_llama_server
> -rwxr-xr-x root/root 979085896 2025-01-16 18:01 ./lib/ollama/runners/cuda_v11_avx/libggml_cuda_v11.so
>
> Then compare with the files in /opt/ollama/lib/ollama/runners/cuda_v11_avx/:
>
> ls -l /opt/ollama/lib/ollama/runners/cuda_v11_avx/
>
> Do they match the output of tar?
>
> Recommendation: re-install.

> There may be other incomplete files. Just re-install.
>
> sudo tar --extract --verbose --file ollama-linux-amd64.tgz --directory /opt/ollama

I directly replaced the libggml_cuda_v11.so file, and Ollama is now running on the GPU. Thank you!
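
For reference, a single damaged file can be re-extracted in place by naming the member on the tar command line. A sketch, with the caveat from above that other files may also be incomplete, so a full re-extract remains the safer fix:

# pull just the one member out of the archive, overwriting the truncated copy
sudo tar -xzvf ollama-linux-amd64.tgz -C /opt/ollama ./lib/ollama/runners/cuda_v11_avx/libggml_cuda_v11.so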

Reference: github-starred/ollama#5906