[PR #8792] Added environment-controlled number of threads for ollama server ($OMP_NUM_THREADS). #11509

Open
opened 2025-11-12 16:15:49 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/8792
Author: @james-irwin
Created: 2/3/2025
Status: 🔄 Open

Base: main ← Head: add_num_threads


📝 Commits (2)

  • b568646 Addded environment-controlled number of threads for ollama server ().
  • da9f78a Merge branch 'main' into add_num_threads

📊 Changes

1 file changed (+2 additions, -1 deletions)

📝 llm/server.go (+2 -1)

📄 Description

OMP_NUM_THREADS is a common (if dated) environment variable for setting the number of worker threads an application uses at runtime. While the ollama server doesn't necessarily use OpenMP for its threading model, honoring the variable is consistent with other systems that obey OMP_NUM_THREADS while also having a native controlling environment variable (for example, those that also accept the library-specific variables MKL_NUM_THREADS and GOTO_NUM_THREADS).

Motivation: on systems with high core counts, it is often necessary to pin the server to NUMA zones (via numactl) or to a subset of heterogeneous cores within a single operating system image. Without this change, the following scenario arises:

#cores in system = N
#cores desired = M (M<<N)
$> numactl -C 0-(M-1) go run ollama serve

This launches N threads bound to M cores, which oversubscribes those cores and degrades performance.

With this change:

$> OMP_NUM_THREADS=M numactl -C 0-(M-1) go run ollama serve

This launches M threads and places them on the M assigned cores, delivering the expected performance.
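
For readers who want to see the idea in code, here is a minimal sketch of reading OMP_NUM_THREADS and falling back to a default when it is unset or invalid. It is not the actual diff to llm/server.go, which is not shown on this page: the helper name `resolveThreads` is hypothetical, and `runtime.NumCPU()` is only a stand-in for whatever default thread count the server computes.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// resolveThreads returns the thread count requested by raw (the value of
// OMP_NUM_THREADS), or fallback when raw is empty, non-numeric, or below 1.
// Hypothetical helper for illustration; not taken from llm/server.go.
func resolveThreads(raw string, fallback int) int {
	n, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil || n < 1 {
		return fallback
	}
	return n
}

func main() {
	// Assumption: runtime.NumCPU stands in for the server's own default.
	threads := resolveThreads(os.Getenv("OMP_NUM_THREADS"), runtime.NumCPU())
	fmt.Println("threads:", threads)
}
```

Run with `OMP_NUM_THREADS=8 go run main.go`, this prints `threads: 8`. Ignoring invalid or non-positive values keeps a malformed environment from starting the server with zero threads.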


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2025-11-12 16:15:49 -06:00
Reference: github-starred/ollama-ollama#11509