[PR #6547] [MERGED] Optimize container images for startup #22690

Closed
opened 2026-04-19 16:29:29 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6547
Author: @dhiltgen
Created: 8/28/2024
Status: Merged
Merged: 9/12/2024
Merged by: @dhiltgen

Base: main ← Head: optimize_container_revamp_payload


📝 Commits (6)

  • c845b84 Optimize container images for startup
  • 4d75c9b Refactor payload logic and add buildx support for faster builds
  • 9369836 Move payloads around
  • 1ea2994 Review comments
  • 63f78ef Converge to buildx based helper scripts
  • e0e6098 Use docker buildx action for release

📊 Changes

32 files changed (+860 additions, -688 deletions)

View changed files

📝 .dockerignore (+2 -0)
📝 .github/workflows/release.yaml (+180 -29)
📝 .github/workflows/test.yaml (+1 -42)
📝 .gitignore (+3 -0)
📝 Dockerfile (+70 -31)
➕ build/darwin/amd64/placeholder (+1 -0)
➕ build/darwin/arm64/placeholder (+1 -0)
➕ build/embed_darwin_amd64.go (+8 -0)
➕ build/embed_darwin_arm64.go (+8 -0)
➕ build/embed_linux.go (+6 -0)
➕ build/embed_unused.go (+8 -0)
➕ build/linux/amd64/placeholder (+1 -0)
➕ build/linux/arm64/placeholder (+1 -0)
📝 envconfig/config.go (+0 -48)
➖ gpu/assets.go (+0 -148)
📝 gpu/gpu.go (+3 -4)
📝 llm/generate/gen_common.sh (+25 -5)
📝 llm/generate/gen_darwin.sh (+8 -4)
📝 llm/generate/gen_linux.sh (+26 -9)
📝 llm/llm_darwin.go (+0 -4)

...and 12 more files

📄 Description

Replaces #6485

Move the payload handling logic to a discrete Go module so we can start laying the foundation for toggling between the C++ and Go runner implementations at build time.

This change adjusts how runner payloads are handled to support container builds where they are kept extracted in the filesystem. This makes it easier to optimize the cpu/cuda and cpu/rocm images for size, and should result in faster startup times for container images.

Container startup time looks to be down to ~100ms on a warm system.

The ROCm image now uses a base Ubuntu image plus only our libraries. The official images and packages pull in compilers as dependencies, so this seems to be the leanest setup.

Fixes #6541


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 16:29:29 -05:00

Reference: github-starred/ollama#22690