[GH-ISSUE #5385] Provide a single command for "serve + pull model", to be used in CI/CD #29126

Open
opened 2026-04-22 07:47:26 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @steren on GitHub (Jun 29, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5385

I am building a container image on top of the official ollama/ollama image, and I want to bake into this image the model I intend to serve, so that I do not have to pull it after startup. The use case is running Ollama in an autoscaled container environment.

The issue is that today, Ollama requires ollama serve before the ollama pull command can be used.

Expected

I'd expect to be able to use a very simple Dockerfile like this:

FROM ollama/ollama
RUN ollama pull gemma:2b
CMD ["serve"]

Observed

I cannot use a simple Dockerfile; I need a bash script that starts the server, waits for it to start, and only then pulls the model:

#!/bin/bash
# Wait until the server accepts connections (assumes OLLAMA_HOST is
# set to port 8080; Ollama's default port is 11434).
wait_for_ollama() {
  while ! nc -z localhost 8080; do
    sleep 1  # Wait 1 second before checking again
  done
}

# Start ollama serve in the background
ollama serve & 

# Wait for ollama serve to start listening
wait_for_ollama
echo "ollama serve is now listening on port 8080"

# Run ollama pull
ollama pull gemma:2b

# Indicate successful completion
echo "ollama pull gemma:2b completed"

I then reference this script in my Dockerfile:

FROM ollama/ollama
ADD pull.sh /
RUN bash /pull.sh
CMD ["serve"]
GiteaMirror added the feature request label 2026-04-22 07:47:26 -05:00
Author
Owner

@steren commented on GitHub (Jun 29, 2024):

CC @wietsevenema

Author
Owner

@ProjectMoon commented on GitHub (Jun 30, 2024):

I mean, this might be a dumb question, but have you tried just downloading it from outside Docker and then copying it in during the Docker build? You could also use an intermediate build container to do this, using the bash script to get the model, then copy it into the final result.

Author
Owner

@ProjectMoon commented on GitHub (Jun 30, 2024):

You can control where ollama stores its models with an environment variable, download and import the model during container build time, then recursively copy the whole directory into the final image. Then just make sure that ollama in the final image loads the models from the right directory.
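A multi-stage build along these lines could look like the following (a sketch, untested; it assumes the official image stores models under /root/.ollama, the default OLLAMA_MODELS location when running as root):

```dockerfile
# Stage 1: start the server just long enough to pull the model.
FROM ollama/ollama AS puller
RUN ollama serve & sleep 10 && ollama pull gemma:2b

# Stage 2: copy the downloaded model blobs into the final image.
FROM ollama/ollama
COPY --from=puller /root/.ollama /root/.ollama
CMD ["serve"]
```

Because the pull happens in an earlier stage, the final image carries only the model data, not the layer with the backgrounded server process.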

Author
Owner

@oded996 commented on GitHub (Jul 26, 2024):

> I mean, this might be a dumb question, but have you tried just downloading it from outside Docker and then copying it in during the Docker build? You could also use an intermediate build container to do this, using the bash script to get the model, then copy it into the final result.

Pulling during Docker build time makes the container start much faster, as it does not need to pull the model every time the server starts. This is critical for fast scaling of a production service.

Author
Owner

@oded996 commented on GitHub (Jul 29, 2024):

You can achieve this today by using this command in the Dockerfile:
RUN ollama serve & sleep 10 && ollama pull gemma:2b
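
The fixed sleep 10 can race on slow builders. A polling variant (again a sketch, untested) waits until the CLI can reach the API before pulling:

```dockerfile
RUN ollama serve & \
    until ollama list >/dev/null 2>&1; do sleep 1; done; \
    ollama pull gemma:2b
```

Here `ollama list` is used only as a readiness probe: it fails until the server is accepting requests, so the loop exits exactly when the pull can succeed.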

Author
Owner

@khteh commented on GitHub (Apr 4, 2025):

https://github.com/ollama/ollama/issues/10122


Reference: github-starred/ollama#29126