[GH-ISSUE #7074] Docker image size is over a GB larger than 0.3.10 #51001

Closed
opened 2026-04-28 17:46:56 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @codefromthecrypt on GitHub (Oct 2, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7074

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

I noticed some changelog entries about reducing Docker image size, but looking at darwin/arm64, the image is significantly larger than 0.3.10. This may also apply to other platforms. Can you say whether this is accidental or intentional?

```
ollama/ollama    0.3.12    443040bf2568    6 days ago     3.22GB
ollama/ollama    0.3.11    580b37f3291e    13 days ago    3.22GB
ollama/ollama    0.3.10    a6cc2736bd5c    3 weeks ago    1.92GB
```

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.3.12

GiteaMirror added the question, docker, linux labels 2026-04-28 17:46:56 -05:00
Author
Owner

@dhiltgen commented on GitHub (Oct 2, 2024):

This is due to a space vs. time trade-off.

On Linux, Ollama carries the LLM runner executables and their dependent libraries as compressed payloads inside the main executable and extracts them into /tmp at startup. This can take 5-20s depending on the CPU and I/O performance of the system. For long-running systems this overhead is negligible, but for containers, which are often ephemeral, the extra startup delay can be more impactful.

In 0.3.11 we adjusted our container build to carry these executables and libraries directly in the container root filesystem, ready to execute, which brings container startup under a second in most cases.

When you view image sizes in docker, it reports the uncompressed size of the filesystem. This masks the compressed binaries/libraries inside the 0.3.10 image, which are extracted immediately on startup and then occupy filesystem space in the running container anyway. If you look at the compressed sizes of the images, they're nearly identical, so download times from Docker Hub should be about the same.
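The trade-off described above can be sketched with a small, self-contained Python experiment (the payload is synthetic and the layout names are illustrative assumptions, not Ollama's actual binaries or build steps):

```python
import gzip
import random

# Stand-in for a runner binary: compressible but not trivially so.
# (Synthetic data only -- not Ollama's actual payloads.)
random.seed(0)
payload = bytes(random.randrange(64) for _ in range(200_000))

# 0.3.10-style layout: payload stored gzip-compressed inside the
# image, extracted to /tmp at startup.
stored_compressed = gzip.compress(payload)

# 0.3.11-style layout: payload stored uncompressed in the root
# filesystem, ready to execute immediately.
stored_uncompressed = payload

# `docker images` reports the uncompressed filesystem size, so the
# two layouts look very different locally:
print("on-disk:", len(stored_uncompressed), "vs", len(stored_compressed))

# Registries transfer gzip-compressed layers. Already-compressed data
# barely shrinks further, so both layouts end up nearly the same size
# on the wire:
wire_a = gzip.compress(stored_compressed)
wire_b = gzip.compress(stored_uncompressed)
print("on-wire:", len(wire_a), "vs", len(wire_b))
```

The on-disk numbers diverge while the on-wire numbers stay close, which mirrors why `docker images` shows a large gap between 0.3.10 and 0.3.12 but the Docker Hub download sizes do not.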

Author
Owner

@codefromthecrypt commented on GitHub (Oct 3, 2024):

Thanks, FWIW I'm ok with it, as I tend to use ollama directly in CI, not via docker, and tiny models.

However, there are integration tests that use ollama via docker (e.g. testcontainers), so this is now a larger download, and on slow networks, in demos, etc., it is more visible than in most places. That releases aren't daily compensates somewhat. The main thing was understanding whether this was an oversight or intentional, and it seems intentional. Thanks for the answer.

Author
Owner

@dhiltgen commented on GitHub (Oct 3, 2024):

The images are downloaded compressed, and once compressed, the sizes of these versions are very close, so you shouldn't see a significant impact on the transfer time to get the image loaded onto a system.

Author
Owner

@codefromthecrypt commented on GitHub (Oct 7, 2024):

Thanks again for clarifying the mechanics about transfer size. I'll mention that to the person who asked me.


Reference: github-starred/ollama#51001