[GH-ISSUE #1656] Try to use only 1 GPU if possible #26689

Closed
opened 2026-04-22 03:07:17 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @m0wer on GitHub (Dec 21, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1656

Originally assigned to: @dhiltgen on GitHub.

I have two GPUs: an RTX 3090 and an RTX 2060. `ollama` tries to split the load across both, but because communication between the cards is significantly slower than within a single GPU, this is counterproductive. It would be great to prioritize using the first GPU only, if it alone has enough VRAM to fit the model.
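The behavior being requested could be sketched as a simple placement heuristic. This is a hypothetical illustration only, not Ollama's actual scheduler code; `place_model` and its inputs are invented for the sketch:

```python
# Hypothetical placement heuristic (not Ollama's real scheduler):
# prefer a single GPU if the model fits entirely in its free VRAM,
# and only fall back to splitting across cards when it does not.
def place_model(model_bytes, free_vram):
    """free_vram maps gpu_id -> free bytes. Returns the GPUs to use."""
    # Single-GPU placement first: among GPUs that can hold the whole
    # model, pick the one with the most headroom.
    candidates = [g for g, free in free_vram.items() if free >= model_bytes]
    if candidates:
        return [max(candidates, key=free_vram.get)]
    # Otherwise spread across all GPUs, largest free VRAM first.
    return sorted(free_vram, key=free_vram.get, reverse=True)

GiB = 1 << 30
# A 12 GiB model fits on a 24 GiB card alone; a 28 GiB model must split.
print(place_model(12 * GiB, {0: 24 * GiB, 1: 6 * GiB}))  # [0]
print(place_model(28 * GiB, {0: 24 * GiB, 1: 6 * GiB}))  # [0, 1]
```

Under this kind of policy, the 3090/2060 setup above would run the model on GPU 0 alone whenever it fits, which is exactly what the benchmark numbers below favor.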

Here are the results of a small experiment:

Model: `solar:q4_0`

  • RTX 3090 alone:
    • 350 W: 81 t/s
    • 280 W: 79 t/s
  • RTX 3090 (280/350 W) and RTX 2060 (128/160 W):
    • 31.5 t/s

To run the test I just changed the value of `CUDA_VISIBLE_DEVICES` from `0,1` to `0`.
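For anyone reproducing this from a script rather than an interactive shell, the same restriction can be applied by overriding the environment before launching the server. A minimal sketch; the actual `ollama serve` call is commented out since it assumes an installed binary:

```python
# Restrict the launched process to GPU 0 via CUDA_VISIBLE_DEVICES,
# mirroring the "0,1" -> "0" change described in the experiment.
import os
import subprocess

env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")  # was "0,1"
# subprocess.run(["ollama", "serve"], env=env)  # uncomment on a host with ollama
print(env["CUDA_VISIBLE_DEVICES"])  # 0
```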

GiteaMirror added the gpu label 2026-04-22 03:07:17 -05:00
Author
Owner

@divinity76 commented on GitHub (Feb 11, 2025):

I think that's what

```
OLLAMA_SCHED_SPREAD=0 ollama serve
```

is for. I'm guessing `OLLAMA_SCHED_SPREAD` didn't exist yet when you created this issue.

Reference: github-starred/ollama#26689