[PR #5287] [CLOSED] llama: Support both old and new runners with a toggle with release build rigging #74026

Closed
opened 2026-05-05 05:58:31 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/5287
Author: @dhiltgen
Created: 6/26/2024
Status: Closed

Base: jmorganca/llama ← Head: go_server


📝 Commits (7)

  • b4c3679 Add dist install logic to existing generate scripts
  • 72649d6 harden a few integration tests
  • 80d5e28 integration test for request context
  • c04adcd Add new runner build to CI
  • 1a8a66e fix llama->ggml generate define mixups
  • d37cc4e Unit test expose all ips on host
  • 61069ae Tidy up some debug log cruft

📊 Changes

12 files changed (+135 additions, -30 deletions)


📝 .github/workflows/release.yaml (+3 -0)
📝 .github/workflows/test.yaml (+17 -12)
📝 Dockerfile (+33 -0)
📝 envconfig/config_test.go (+2 -0)
📝 integration/context_test.go (+31 -0)
📝 integration/llm_image_test.go (+3 -3)
📝 integration/utils_test.go (+8 -2)
📝 llm/generate/gen_common.sh (+16 -0)
📝 llm/generate/gen_darwin.sh (+12 -0)
📝 llm/generate/gen_linux.sh (+5 -1)
📝 llm/generate/gen_windows.ps1 (+5 -5)
📝 llm/llm.go (+0 -7)

📄 Description

Notable bugs in the new runners uncovered via integration tests:

  • llava integration test seems to show hallucinations, so multimodal isn't quite right
  • large context test case shows the batch processing needs additional work

This bundles the new runner into the publishing flow for Linux, macOS, and Windows, so each platform can now toggle the new runners simply by setting OLLAMA_NEW_RUNNERS=1. (Note: the linux+arm containerized build does not include the new runners.)

Integrated the new Linux packaging changes from #5049.

```
% ls -lh
drwxr-xr-x 2 daniel daniel 4.0K Jul 15 00:12 cuda
-rw-r--r-- 1 daniel daniel 725M Jul 15 00:22 ollama
-rw-r--r-- 1 daniel daniel 2.4G Jul 15 00:30 ollama-linux-amd64.tgz
drwxr-xr-x 3 daniel daniel 4.0K Jul 15 00:30 rocm
% du -sh cuda rocm
895M	cuda
6.8G	rocm
% find /tmp/ollama921132520/runners -type f | xargs ls -lh
-rwxr-xr-x 1 daniel daniel 2.5M Jul 15 00:31 /tmp/ollama921132520/runners/cpu_avx2/libllama.so
-rwxr-xr-x 1 daniel daniel 1.8M Jul 15 00:31 /tmp/ollama921132520/runners/cpu_avx2/ollama_llama_server
-rwxr-xr-x 1 daniel daniel 7.2M Jul 15 00:31 /tmp/ollama921132520/runners/cpu_avx2/ollama_runner
-rwxr-xr-x 1 daniel daniel 2.5M Jul 15 00:31 /tmp/ollama921132520/runners/cpu_avx/libllama.so
-rwxr-xr-x 1 daniel daniel 1.7M Jul 15 00:31 /tmp/ollama921132520/runners/cpu_avx/ollama_llama_server
-rwxr-xr-x 1 daniel daniel 7.2M Jul 15 00:31 /tmp/ollama921132520/runners/cpu_avx/ollama_runner
-rwxr-xr-x 1 daniel daniel 2.4M Jul 15 00:31 /tmp/ollama921132520/runners/cpu/libllama.so
-rwxr-xr-x 1 daniel daniel 1.7M Jul 15 00:31 /tmp/ollama921132520/runners/cpu/ollama_llama_server
-rwxr-xr-x 1 daniel daniel 7.2M Jul 15 00:31 /tmp/ollama921132520/runners/cpu/ollama_runner
-rwxr-xr-x 1 daniel daniel 233M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v11/libggml_cuda.so
-rwxr-xr-x 1 daniel daniel 249M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v11/libllama.so
-rwxr-xr-x 1 daniel daniel 1.7M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v11/ollama_llama_server
-rwxr-xr-x 1 daniel daniel 7.3M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v11/ollama_runner
-rwxr-xr-x 1 daniel daniel 234M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v12/libggml_cuda.so
-rwxr-xr-x 1 daniel daniel 252M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v12/libllama.so
-rwxr-xr-x 1 daniel daniel 1.7M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v12/ollama_llama_server
-rwxr-xr-x 1 daniel daniel 7.3M Jul 15 00:31 /tmp/ollama921132520/runners/cuda_v12/ollama_runner
-rwxr-xr-x 1 daniel daniel 178M Jul 15 00:31 /tmp/ollama921132520/runners/rocm_v6.1/libggml_hipblas.so
-rwxr-xr-x 1 daniel daniel 181M Jul 15 00:31 /tmp/ollama921132520/runners/rocm_v6.1/libllama.so
-rwxr-xr-x 1 daniel daniel 1.7M Jul 15 00:31 /tmp/ollama921132520/runners/rocm_v6.1/ollama_llama_server
-rwxr-xr-x 1 daniel daniel 7.3M Jul 15 00:31 /tmp/ollama921132520/runners/rocm_v6.1/ollama_runner
```

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.


Reference: github-starred/ollama#74026