[PR #6485] [CLOSED] Optimize container images for startup #43374

Closed
opened 2026-04-24 23:00:48 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6485
Author: @dhiltgen
Created: 8/24/2024
Status: Closed

Base: main ← Head: optimize_container


📝 Commits (2)

  • 1e486db Optimize container images for startup
  • cc142f2 Refactor payload logic and add buildx support for faster builds

📊 Changes

22 files changed (+634 additions, -504 deletions)


📝 Dockerfile (+64 -19)
📝 envconfig/config.go (+0 -48)
➖ gpu/assets.go (+0 -148)
📝 gpu/gpu.go (+3 -4)
📝 llm/generate/gen_common.sh (+17 -2)
📝 llm/generate/gen_linux.sh (+18 -1)
📝 llm/llm_darwin.go (+0 -4)
➖ llm/llm_darwin_amd64.go (+0 -11)
📝 llm/llm_linux.go (+0 -4)
📝 llm/llm_windows.go (+0 -4)
➖ llm/payload.go (+0 -233)
➕ llm/payloads_common.go (+382 -0)
➕ llm/payloads_darwin_amd64.go (+8 -0)
➕ llm/payloads_darwin_arm64.go (+8 -0)
➕ llm/payloads_linux.go (+6 -0)
➕ llm/payloads_test.go (+44 -0)
➕ llm/payloads_unused.go (+8 -0)
📝 llm/server.go (+8 -13)
📝 scripts/build_linux.sh (+3 -11)
➕ scripts/buildx_docker.sh (+38 -0)

...and 2 more files

📄 Description

This change adjusts how runner payloads are handled so that container builds can keep them extracted in the filesystem. That makes it easier to optimize the cpu/cuda vs cpu/rocm images for size, and should result in faster startup times for container images.

Looks like container startup time is down to ~100ms on a warm system.

~~The ROCm image is still massive due to the base layer containing compilers and other tools, so I'll see if I can find a way to try to slim it down a bit more, but that could come in a follow up PR.~~

The ROCm image has been updated to use a base Ubuntu image with just our libraries. The official images and packages pull in compilers as dependencies, so this seems to be the leanest setup available.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-24 23:00:48 -05:00
Reference: github-starred/ollama#43374