[PR #9824] [MERGED] conditionally enable parallel pipelines #13064

Closed
opened 2026-04-13 00:16:55 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/9824
Author: @mxyng
Created: 3/17/2025
Status: Merged
Merged: 3/17/2025
Merged by: @mxyng

Base: main ← Head: mxyng/sched


📝 Commits (1)

  • 4561fff conditionally enable parallel pipelines

📊 Changes

1 file changed (+1 addition, -1 deletion)

View changed files

📝 ml/backend/ggml/ggml.go (+1 -1)

📄 Description

Parallel pipelines enable faster scheduling at the cost of more memory, so enable them only when needed: when multiple devices are present and all layers, including the output layer, are on the GPU.
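The condition described above can be sketched as a small predicate. This is only an illustration, not the actual one-line change in `ml/backend/ggml/ggml.go`: the `placement` type, its fields, and `useParallelPipelines` are hypothetical names introduced here, and the real backend may track device and layer assignment differently.

```go
package main

import "fmt"

// placement is a hypothetical summary of where a model's layers ended up.
// None of these names come from the ollama source.
type placement struct {
	gpuCount    int  // number of GPU devices in use
	totalLayers int  // repeating layers in the model, excluding the output layer
	gpuLayers   int  // repeating layers assigned to a GPU
	outputOnGPU bool // whether the output layer is assigned to a GPU
}

// useParallelPipelines reports whether the extra memory cost of parallel
// pipelines is worth paying: more than one device is present and every
// layer, including the output layer, is on the GPU.
func useParallelPipelines(p placement) bool {
	allLayersOnGPU := p.gpuLayers == p.totalLayers && p.outputOnGPU
	return p.gpuCount > 1 && allLayersOnGPU
}

func main() {
	full := placement{gpuCount: 2, totalLayers: 32, gpuLayers: 32, outputOnGPU: true}
	partial := placement{gpuCount: 2, totalLayers: 32, gpuLayers: 20, outputOnGPU: false}
	single := placement{gpuCount: 1, totalLayers: 32, gpuLayers: 32, outputOnGPU: true}

	fmt.Println(useParallelPipelines(full))
	fmt.Println(useParallelPipelines(partial))
	fmt.Println(useParallelPipelines(single))
}
```

In a sketch like this, the boolean result would be passed wherever the scheduler is created, in place of a hard-coded value, so single-device and partially offloaded setups avoid the additional memory use.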


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:16:56 -05:00

Reference: github-starred/ollama#13064