[PR #11330] Reduce default parallelism to 1 #13510

Closed
opened 2026-04-13 00:29:10 -05:00 by GiteaMirror · 0 comments

Original Pull Request: https://github.com/ollama/ollama/pull/11330

State: closed
Merged: Yes


The current scheduler algorithm, which picks the parallelism based on available VRAM, complicates the upcoming dynamic layer memory allocation algorithm. This PR changes the default to 1, with the intent that, going forward, parallelism is set explicitly and is no longer determined dynamically. Removal of the dynamic logic will come in a follow-up.

This behavior change should be called out in the release notes.
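
For illustration only, here is a minimal sketch of what "explicit parallelism with a default of 1" could look like. It is not the code from this PR: `resolveParallel` and `defaultParallel` are hypothetical names, and reading the value from the `OLLAMA_NUM_PARALLEL` environment variable is an assumption about where the explicit setting comes from.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultParallel mirrors the new default described in this PR.
const defaultParallel = 1

// resolveParallel is a hypothetical helper, not Ollama's actual code.
// It returns the explicit setting when raw parses as a positive integer
// and falls back to defaultParallel otherwise. Available VRAM is never
// consulted, which is the point of making parallelism explicit.
func resolveParallel(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return defaultParallel
	}
	return n
}

func main() {
	// Assumption: the explicit value is supplied via OLLAMA_NUM_PARALLEL.
	fmt.Println("parallelism:", resolveParallel(os.Getenv("OLLAMA_NUM_PARALLEL")))
}
```

Under this sketch, a user who relied on the scheduler choosing a higher value automatically would now have to set it themselves, which is why the change is worth a release note.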

GiteaMirror added the pull-request label 2026-04-13 00:29:11 -05:00
Reference: github-starred/ollama#13510