[GH-ISSUE #6955] nvidia gpu discovery problem in docker container on wsl #30162

Closed
opened 2026-04-22 09:39:42 -05:00 by GiteaMirror · 4 comments

Originally created by @Paramjethwa on GitHub (Sep 25, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6955

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

I developed a chat app in Streamlit with a feature that lets the user pull a model directly and select it through a dropdown.

I have pulled several models successfully, but when I try with a large model (e.g. LLaVA) it gives me an asyncio TimeoutError.

I am using WSL2 (Ubuntu) and running the chat app through a Dockerfile.

Error log:
[ERROR FILE TIMEOUT ERROR.txt](https://github.com/user-attachments/files/17132576/ERROR.FILE.TIMEOUT.ERROR.txt)

Docker compose file:
[Docker_compose.txt](https://github.com/user-attachments/files/17132787/Docker_compose.txt)

Also, somehow I am not able to use the GPU in Docker. Any solution for that too?

Thank you!

OS

Windows, Docker, WSL2

GPU

Nvidia

CPU

Intel

Ollama version

0.3.11

GiteaMirror added the bug and needs more info labels 2026-04-22 09:39:43 -05:00

@rick-github commented on GitHub (Sep 25, 2024):

Does your `pull_ollama_model_async` function set a timeout? Because the error is `TimeoutError` and the request failed after 300 seconds (5m).

For your GPU problem, what's the result of running `nvidia-smi` inside the ollama container (`docker exec -it ollama nvidia-smi`)?
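
For context, aiohttp's default `ClientTimeout` is 300 seconds total, which lines up with the 5-minute failure. Below is a minimal sketch of a pull helper that would hit that default; the function name mirrors the one mentioned above, while the host and the `/api/pull` payload shape are assumptions:

```python
import asyncio
import aiohttp

async def pull_ollama_model_async(model: str, host: str = "http://localhost:11434"):
    # No explicit timeout is set, so aiohttp applies its default
    # ClientTimeout(total=300); a large pull (e.g. llava) that streams
    # for more than 5 minutes fails with asyncio.TimeoutError.
    async with aiohttp.ClientSession() as session:
        async with session.post(f"{host}/api/pull", json={"model": model}) as resp:
            resp.raise_for_status()
            async for line in resp.content:  # one JSON status object per line
                print(line.decode().strip())

asyncio.run(pull_ollama_model_async("llava"))
```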


@Paramjethwa commented on GitHub (Sep 25, 2024):

```
docker exec -it local_multimodal_ai-ollama-1 nvidia-smi
Wed Sep 25 18:08:33 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.02              Driver Version: 560.94         CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4050 ...    On  |   00000000:01:00.0 Off |                  N/A |
| N/A   51C    P8              1W /   80W |       0MiB /   6141MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
```

This is what it returns.


@Paramjethwa commented on GitHub (Sep 25, 2024):

> Does your `pull_ollama_model_async` function set a timeout? Because the error is `TimeoutError` and the request failed after 300 seconds (5m).

I fixed it by changing my docker-compose.yaml:

```
environment:
      - OLLAMA_LOAD_TIMEOUT=15m # increase if you get a Timeout Error
```

and increased the timeout in my utils.py:

`async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=1800)) as session:`
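
Putting the two changes together, here is a sketch of what the adjusted helper might look like. The 1800-second total is the value from the fix above; the function name, host, and `/api/pull` payload shape are illustrative:

```python
import aiohttp

# 30-minute client-side budget for large pulls, matching the fix above.
PULL_TIMEOUT = aiohttp.ClientTimeout(total=1800)

async def pull_ollama_model_async(model: str, host: str = "http://localhost:11434"):
    async with aiohttp.ClientSession(timeout=PULL_TIMEOUT) as session:
        async with session.post(f"{host}/api/pull", json={"model": model}) as resp:
            resp.raise_for_status()
            async for line in resp.content:
                # /api/pull streams one JSON status object per line.
                yield line.decode().strip()
```

A Streamlit caller can consume this with `async for status in pull_ollama_model_async("llava")` to drive a progress indicator while the download runs.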


@dhiltgen commented on GitHub (Oct 24, 2024):

It sounds like you fixed the timeout problem.

Are you still having trouble getting it to work on your GPU? If so, please share a server log so we can see why it fails to discover the GPU. Setting `-e OLLAMA_DEBUG=1` may also help to increase the amount of logs.
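
For reference, since `nvidia-smi` already works inside the container, the device itself is exposed; below is a typical compose layout for comparison, with the debug flag added. It uses Docker Compose's standard GPU reservation syntax, and the service layout is illustrative rather than taken from the attached compose file:

```yaml
services:
  ollama:
    image: ollama/ollama
    environment:
      - OLLAMA_DEBUG=1          # verbose server logs, including GPU discovery
      - OLLAMA_LOAD_TIMEOUT=15m
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```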
