[GH-ISSUE #7619] llama3.2-vision on multi gpu error #4862

Closed
opened 2026-04-12 15:51:46 -05:00 by GiteaMirror · 7 comments

Originally created by @18600709862 on GitHub (Nov 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7619

What is the issue?

With multiple GPUs:

    ollama run llama3.2-vision
    >>> The image is a book cover. Output should be in this format - <Name of the Book>: <Name of the Author>. Do not output anything else /media/root/ssd2t/data/pro/tmp/o
    ... l/new/FastChat/image.png
    Added image '/media/root/ssd2t/data/pro/tmp/ol/new/FastChat/image.png'
    Error: POST predict: Post "http://127.0.0.1:41121/completion": EOF

With a single GPU, the same command runs OK.

OS

Linux

GPU

Nvidia

CPU

No response

Ollama version

0.4.1

GiteaMirror added the nvidia and bug labels 2026-04-12 15:51:47 -05:00

@rick-github commented on GitHub (Nov 11, 2024):

https://github.com/ollama/ollama/issues/7558


@rajeshkumar-n commented on GitHub (Nov 11, 2024):

@rick-github - I was also running into a problem, even with a single/full GPU, on Ubuntu 22.04.5 LTS. The model (llama3.2-vision:90b-instruct-q4_K_M) works fine but does not use the GPUs. Can you please provide additional details of your OS and GPU configuration?

    cat /proc/driver/nvidia/version
    NVRM version: NVIDIA UNIX Open Kernel Module for x86_64 550.54.15 Release Build (dvs-builder@U16-A24-23-2) Tue Mar 5 22:15:33 UTC 2024
    GCC version: gcc version 13.1.0 (Ubuntu 13.1.0-8ubuntu1~22.04)


@rick-github commented on GitHub (Nov 11, 2024):

Server logs (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in diagnosis.


@rajeshkumar-n commented on GitHub (Nov 11, 2024):

> Server logs (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in diagnosis.

Yes, the logs suggest no compatible GPUs. I wasn't successful with different combinations of CUDA and Nvidia drivers.

    Nov 11 14:49:51 ollama-localhost systemd[1]: Started Ollama Service.
    Nov 11 14:49:51 ollama-localhost ollama[1159]: 2024/11/11 14:49:51 routes.go:1189: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/usr/share/ollama/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
    Nov 11 14:49:51 ollama-localhost ollama[1159]: time=2024-11-11T14:49:51.599Z level=INFO source=images.go:755 msg="total blobs: 11"
    Nov 11 14:49:51 ollama-localhost ollama[1159]: time=2024-11-11T14:49:51.600Z level=INFO source=images.go:762 msg="total unused blobs removed: 0"
    Nov 11 14:49:51 ollama-localhost ollama[1159]: time=2024-11-11T14:49:51.601Z level=INFO source=routes.go:1240 msg="Listening on 127.0.0.1:11434 (version 0.4.1)"
    Nov 11 14:49:51 ollama-localhost ollama[1159]: time=2024-11-11T14:49:51.603Z level=INFO source=common.go:135 msg="extracting embedded files" dir=/tmp/ollama1689759288/runners
    Nov 11 14:49:51 ollama-localhost ollama[1159]: time=2024-11-11T14:49:51.856Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12 rocm]"
    Nov 11 14:49:51 ollama-localhost ollama[1159]: time=2024-11-11T14:49:51.856Z level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
    Nov 11 14:49:51 ollama-localhost ollama[1159]: time=2024-11-11T14:49:51.912Z level=INFO source=gpu.go:620 msg="Unable to load cudart library /usr/lib/x86_64-linux-gnu/libcuda.so.550.54.15: cuda driver library init failure: 3"
    Nov 11 14:49:52 ollama-localhost ollama[1159]: time=2024-11-11T14:49:52.029Z level=INFO source=gpu.go:386 msg="no compatible GPUs were discovered"


@dhiltgen commented on GitHub (Nov 11, 2024):

@rajeshkumar-n during our call to cuInit, the CUDA driver returned error code 3:

    /**
     * This indicates that the CUDA driver has not been initialized with
     * ::cuInit() or that initialization has failed.
     */
    CUDA_ERROR_NOT_INITIALIZED                = 3,

Please give the troubleshooting steps a try and report back if none of them resolve the problem:

https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#nvidia-gpu-discovery
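
For anyone hitting the same error, a standalone probe of the same driver entry point can confirm the diagnosis outside Ollama. The sketch below is illustrative and not from the thread; the file name and build line are assumptions, and the location of cuda.h depends on your CUDA toolkit install. If it also reports CUresult 3, the failure is in the driver stack rather than in Ollama:

    /* cuinit_check.c — hypothetical standalone probe, not from the thread.
     * Build (assuming the CUDA driver library and headers are installed):
     *   gcc cuinit_check.c -I/usr/local/cuda/include -lcuda -o cuinit_check
     */
    #include <stdio.h>
    #include <cuda.h>

    int main(void) {
        /* Same entry point Ollama's GPU discovery calls;
         * a return of 3 is CUDA_ERROR_NOT_INITIALIZED. */
        CUresult rc = cuInit(0);
        if (rc != CUDA_SUCCESS) {
            const char *msg = NULL;
            cuGetErrorString(rc, &msg);  /* works even when cuInit fails */
            fprintf(stderr, "cuInit failed: %d (%s)\n", (int)rc, msg ? msg : "unknown");
            return 1;
        }
        int count = 0;
        cuDeviceGetCount(&count);
        printf("cuInit OK, %d CUDA device(s) visible\n", count);
        return 0;
    }

If the probe fails after a suspend/resume cycle or a driver upgrade, reloading the nvidia_uvm kernel module (one of the steps in the troubleshooting doc linked above) is a common fix.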


@rajeshkumar-n commented on GitHub (Nov 12, 2024):

> @rajeshkumar-n during our call to cuInit, the CUDA driver returned error code 3:
>
>     /**
>      * This indicates that the CUDA driver has not been initialized with
>      * ::cuInit() or that initialization has failed.
>      */
>     CUDA_ERROR_NOT_INITIALIZED                = 3,
>
> Please give the troubleshooting steps a try and report back if none of them resolve the problem:
>
> https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#nvidia-gpu-discovery

Submitted a new issue https://github.com/ollama/ollama/issues/7630 as this thread is for MIG.


@dhiltgen commented on GitHub (Nov 12, 2024):

I think we can close this as a dup of #7558
