[PR #3725] [CLOSED] Add env override for opts.NumThread & opts.NumGPU #21807

opened 2026-04-19 15:52:47 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/3725
Author: @ghost
Created: 4/18/2024
Status: Closed

Base: main ← Head: llm-threads-gpu-layers-env-override


📝 Commits (3)

  • 9517c00 Add env override for opts.NumThread & opts.NumGPU
  • 6ddfd4d Merge branch 'main' into llm-threads-gpu-layers-env-override
  • 9940f43 Conform logging to surrounding implementation

📊 Changes

2 files changed (+35 additions, -5 deletions)

View changed files

📝 cmd/cmd.go (+7 -5)
📝 llm/server.go (+28 -0)

📄 Description

This adds the ability to limit or disable both NumThread & NumGPU, which can be useful for testing and for running multiple concurrent instances. Limiting NumThread is also valuable for reliably enforcing core affinity (e.g. with taskset) on hybrid architectures like modern Intel & ARM CPUs.

  • OLLAMA_MAX_LLM_THREADS — maximum number of LLM CPU threads (default 0: unlimited)
  • OLLAMA_MAX_GPU_LAYERS — maximum number of GPU layers (default -1: unlimited)

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 15:52:47 -05:00

Reference: github-starred/ollama#21807