[PR #6264] [MERGED] Parse cpuinfo and set default threads #17337

Closed
opened 2026-04-16 05:59:54 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6264
Author: @dhiltgen
Created: 8/8/2024
Status: Merged
Merged: 10/15/2024
Merged by: @dhiltgen

Base: main ← Head: default_threads


📝 Commits (1)

  • 0580f4d Discovery CPU details for default thread selection

📊 Changes

7 files changed (+408 additions, -24 deletions)

📝 gpu/gpu.go (+5 -1)
📝 gpu/gpu_darwin.go (+21 -0)
📝 gpu/gpu_linux.go (+94 -0)
📝 gpu/gpu_windows.go (+180 -3)
➕ gpu/gpu_windows_test.go (+77 -0)
📝 gpu/types.go (+22 -2)
📝 llm/server.go (+9 -18)

📄 Description

Set the default thread count to the number of performance cores detected on the system. Without this change, the new Go server falls back to Go's runtime.NumCPU, which reports logical processors; on hyperthreaded CPUs this causes thrashing and poor CPU inference speed.

For now, the count has to be reduced to just the cores in a single socket, given current limitations in the C++ code.

Fixes #5554
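
The selection rule described above can be illustrated with a minimal sketch. This is not the code from the PR: the `CPUSocket` type, its field names, and the `defaultThreads` helper are hypothetical names chosen for illustration, and the sketch assumes per-socket core counts have already been discovered by some platform-specific step. It only shows the idea of preferring the performance cores of a single socket over `runtime.NumCPU`.

```go
package main

import (
	"fmt"
	"runtime"
)

// CPUSocket is a hypothetical description of one CPU socket.
// The field names are assumptions for this sketch, not the PR's types.
type CPUSocket struct {
	CoreCount           int // physical cores in this socket
	EfficiencyCoreCount int // physical cores that are efficiency cores
	ThreadCount         int // logical processors in this socket
}

// defaultThreads returns the number of performance cores in the first
// socket. If no usable socket information is available, it falls back to
// runtime.NumCPU, which counts logical processors.
func defaultThreads(sockets []CPUSocket) int {
	if len(sockets) > 0 {
		s := sockets[0] // only a single socket is considered
		if n := s.CoreCount - s.EfficiencyCoreCount; n > 0 {
			return n
		}
	}
	return runtime.NumCPU()
}

func main() {
	// Example: two sockets, each with 8 hyperthreaded physical cores.
	sockets := []CPUSocket{
		{CoreCount: 8, EfficiencyCoreCount: 0, ThreadCount: 16},
		{CoreCount: 8, EfficiencyCoreCount: 0, ThreadCount: 16},
	}
	fmt.Println("default threads:", defaultThreads(sockets))
}
```

With the sample data, the helper picks 8 threads rather than the 32 logical processors the machine would report, which is the difference the description attributes to the hyperthreading slowdown.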


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 05:59:54 -05:00
Reference: github-starred/ollama#17337