[GH-ISSUE #2184] docker swarm service create doesn't use GPU #1247

Closed
opened 2026-04-12 11:01:15 -05:00 by GiteaMirror · 9 comments

Originally created by @go-laoji on GitHub (Jan 25, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2184

Originally assigned to: @dhiltgen on GitHub.

```
docker service create \
  --name ollama \
  --mount type=bind,source=/tmp/ollama,destination=/root/.ollama \
  --constraint node.role==worker \
  --generic-resource "GPU=2" \
  --mount type=bind,source=/dev/nvidia0,target=/dev/nvidia0 \
  --mount type=bind,source=/dev/nvidiactl,target=/dev/nvidiactl \
  --replicas 1 -p 11434:11434 ollama/ollama
```

The service is created with `docker service create` in swarm mode, but when it is running it doesn't use the GPU.


@djmaze commented on GitHub (Feb 1, 2024):

That's because for swarm mode [you need to have a CUDA base image](https://github.com/ollama/ollama/pull/1644#issuecomment-1866947478), and they won't change that here.

(On another note, bind-mounting the nvidia devices is not the correct way to use gpus in swarm mode.)

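For reference, the documented way to give a swarm service a GPU is to advertise the device as a generic resource on each node rather than bind-mounting `/dev/nvidia*`. A minimal sketch, assuming an NVIDIA GPU and the NVIDIA container runtime; the UUID is a placeholder for the value reported by `nvidia-smi -a`:

```
# Sketch: advertise a GPU as a swarm generic resource (UUID is a placeholder).
#
# 1. On each GPU node, in /etc/docker/daemon.json:
#      { "node-generic-resources": ["GPU=GPU-<uuid-from-nvidia-smi>"] }
# 2. In /etc/nvidia-container-runtime/config.toml, uncomment or add:
#      swarm-resource = "DOCKER_RESOURCE_GPU"
# 3. Restart the daemon, then request the resource on the service:
sudo systemctl restart docker
docker service create \
  --name ollama \
  --generic-resource "GPU=1" \
  -p 11434:11434 ollama/ollama
```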

@dhiltgen commented on GitHub (Mar 11, 2024):

@go-laoji can you try to get the GPU working with the base Docker container runtime first, so we can isolate whether this is a swarm problem or some other configuration problem with the NVIDIA container runtime or drivers?

https://docs.docker.com/config/containers/resource_constraints/#gpu

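The linked page verifies GPU access with a throwaway container; a quick sanity check along those lines, following the Docker docs' own pattern:

```
# If this prints the GPU table, the base runtime and drivers are fine
# and the problem is isolated to swarm mode.
docker run --rm --gpus all ubuntu nvidia-smi
```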

@dhiltgen commented on GitHub (Mar 27, 2024):

If you're still having problems, please let us know.


@djmaze commented on GitHub (Jan 27, 2025):

For other people still looking for a solution: with the current Docker build, you just need to change a single line and the image works on Docker Swarm as well.

```diff
diff --git a/Dockerfile b/Dockerfile
index 47228df6..a5a20f1f 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -155,7 +155,7 @@ RUN rm -rf \
     ./dist/linux-amd64/lib/ollama/libcu*.so* \
     ./dist/linux-amd64/lib/ollama/runners/cuda*
 
-FROM --platform=linux/amd64 ubuntu:22.04 AS runtime-amd64
+FROM --platform=linux/amd64 nvidia/cuda:12.6.3-runtime-ubuntu22.04 AS runtime-amd64
 RUN apt-get update && \
     apt-get install -y ca-certificates && \
     apt-get clean && rm -rf /var/lib/apt/lists/*
```
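If you apply this patch, you then rebuild and publish the image yourself; a sketch, where the registry name is a placeholder for wherever your swarm nodes pull from:

```
# Build from the patched Dockerfile and push to a registry reachable by all nodes.
docker build -t registry.example.com/ollama-cuda:latest .
docker push registry.example.com/ollama-cuda:latest
```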

@JTHesse commented on GitHub (Apr 14, 2025):

Hi @djmaze, we are running into the same issue. Is the comment above the current workaround or are there any updates?


@djmaze commented on GitHub (Apr 14, 2025):

Yes, that's still the way.


@dhiltgen commented on GitHub (Apr 14, 2025):

Could someone who has a swarm setup with GPU support see if setting one or more of the environment variables gets the official ollama image working?

```
docker run --rm -it nvidia/cuda:12.6.3-runtime-ubuntu22.04 env
```

vs.

```
docker run --rm -it --entrypoint env ollama/ollama
```
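For anyone picking this up: the CUDA base images export NVIDIA runtime variables that a plain `ubuntu` base does not, so passing them explicitly on the service is the experiment being suggested here. A sketch, not a confirmed fix; the values mirror what the `nvidia/cuda` images set:

```
# NVIDIA_VISIBLE_DEVICES / NVIDIA_DRIVER_CAPABILITIES are read by the
# NVIDIA container runtime; the CUDA base images set them by default.
docker service create \
  --name ollama \
  --env NVIDIA_VISIBLE_DEVICES=all \
  --env NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  --generic-resource "GPU=1" \
  -p 11434:11434 ollama/ollama
```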

@djmaze commented on GitHub (Apr 15, 2025):

> Could someone who has a swarm setup with GPU support see if setting one or more of the environment variables gets the official ollama image working?

Alright, that's interesting: somehow the latest `ollama/ollama` image is working with CUDA for me without further changes. Nice! Not sure why it is working now, though.


@dhiltgen commented on GitHub (Apr 15, 2025):

It's possible we had a few releases where not all the env vars were set properly as we moved things around in the build system. I'll close this out since it sounds like it's working properly now.
