[GH-ISSUE #4146] starting the Docker container gets stuck at "CPU has AVX2" #2576

Closed
opened 2026-04-12 12:55:15 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @valiantrex3rei on GitHub (May 4, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4146

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Hello everyone.

I was following the tutorial at Ollama Docker image.
After installing the NVIDIA Container Toolkit, configuring Docker to use the NVIDIA driver, and starting the container,
I tried to attach to the container, but it took forever.

I tried removing the -d to see the output,

docker run --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

and got the following:

time=2024-05-04T01:07:55.992Z level=INFO source=images.go:828 msg="total blobs: 0"
time=2024-05-04T01:07:55.993Z level=INFO source=images.go:835 msg="total unused blobs removed: 0"
time=2024-05-04T01:07:55.993Z level=INFO source=routes.go:1071 msg="Listening on [::]:11434 (version 0.1.33)"
time=2024-05-04T01:07:55.993Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama244156069/runners
time=2024-05-04T01:07:57.767Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11 rocm_v60002]"
time=2024-05-04T01:07:57.767Z level=INFO source=gpu.go:96 msg="Detecting GPUs"
time=2024-05-04T01:07:57.801Z level=INFO source=gpu.go:101 msg="detected GPUs" library=/tmp/ollama244156069/runners/cuda_v11/libcudart.so.11.0 count=1
time=2024-05-04T01:07:57.801Z level=INFO source=cpu_common.go:11 msg="CPU has AVX2"

It seems the NVIDIA driver and CUDA are recognized.
Nevertheless, I also tried removing the --gpus=all for CPU-only mode, but it still got stuck at the same spot, only without the "detected GPUs" line in the output.

Any suggestions and/or corrections would be appreciated.

system information:

         _,met$$$$$gg.           bwang@deb11bwang
      ,g$$$$$$$$$$$$$$$P.        OS: Debian 11 bullseye
    ,g$$P""       """Y$$.".      Kernel: x86_64 Linux 5.10.0-28-amd64
   ,$$P'              `$$$.      Uptime: 6d 24m
  ',$$P       ,ggs.     `$$b:    Packages: 1726
  `d$$'     ,$P"'   .    $$$     Shell: bash 5.1.4
   $$P      d$'     ,    $$P     Disk: 83G / 931G (10%)
   $$:      $$.   -    ,d$$'     CPU: 13th Gen Intel Core i9-13900K @ 32x 7.5GHz [25.0°C]
   $$\;      Y$b._   _,d$P'      GPU: NVIDIA GeForce RTX 3060
   Y$$.    `.`"Y$$$$P"'          RAM: 2138MiB / 31866MiB
   `$$b      "-.__              
    `Y$$                        
     `Y$$.                      
       `$$b.                    
         `Y$$b.                 
            `"Y$b._             
                `""""           
     
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0 Off |                  N/A |
|  0%   33C    P8    13W / 170W |     14MiB / 12053MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       882      G   /usr/lib/xorg/Xorg                 12MiB |
+-----------------------------------------------------------------------------+

OS

Linux, Docker

GPU

Nvidia

CPU

Intel

Ollama version

latest

GiteaMirror added the docker and bug labels 2026-04-12 12:55:15 -05:00
Author
Owner

@dhiltgen commented on GitHub (May 4, 2024):

What behavior did you see when you tried to run a model, for example ollama run llama3? Those server log messages look normal; the server should be waiting for client connections. To rule out networking problems on your host, you can run docker exec -it <name of ollama container> bash in another terminal and then run ollama run ... inside the container.
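Concretely, the two checks suggested above might look like the following sketch. The container name ollama matches the run command earlier in the thread; on the Ollama versions I have seen, the API's root path replies "Ollama is running" when the server is up:

```shell
# From the host: if this returns a reply, the server is up and reachable
# over the published port (the root path answers "Ollama is running").
curl http://localhost:11434/

# Or bypass host networking entirely: open a shell inside the container
# and run a model there.
docker exec -it ollama bash
# ...then, at the container prompt:
#   ollama run llama3
```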

Author
Owner

@valiantrex3rei commented on GitHub (May 4, 2024):

Thank you very much for the reply. It works like a charm!
This was definitely not a bug at all, but totally my own incompetence.

I started the container with:

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

And then tried:

docker exec -it ollama ollama run llama3

And everything works now.

My sincere apologies for wasting your time, and I hope one day another noob like me will find this helpful.


Reference: github-starred/ollama#2576