[GH-ISSUE #4792] Added the ability to configure maxvram per GPU when there are multiple GPUs #49532

Closed
opened 2026-04-28 12:07:40 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @hamkido on GitHub (Jun 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4792

Originally assigned to: @dhiltgen on GitHub.

Currently you can configure MaxVRAM, but you cannot configure a per-GPU max VRAM when you have dual GPUs.
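To illustrate the request: a per-GPU limit could be expressed as a comma-separated list next to the existing single-value setting. The parsing sketch below is purely hypothetical — neither the comma-separated syntax nor the `parse_vram_limits` helper exists in Ollama; this just shows the shape of the requested behavior.

```python
def parse_vram_limits(raw: str, gpu_count: int) -> list[int]:
    """Hypothetical parser for a per-GPU VRAM limit setting.

    A single value applies the same limit to every GPU; a
    comma-separated list gives one limit per GPU (in bytes).
    This is an illustration of the feature request, not Ollama code.
    """
    parts = [int(p) for p in raw.split(",")]
    if len(parts) == 1:
        # today's behavior: one global cap shared by all GPUs
        return parts * gpu_count
    if len(parts) != gpu_count:
        raise ValueError(f"expected 1 or {gpu_count} values, got {len(parts)}")
    return parts

# e.g. cap GPU 0 at 24 GiB and GPU 1 at 8 GiB:
# limits = parse_vram_limits("25769803776,8589934592", gpu_count=2)
```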

GiteaMirror added the feature request label 2026-04-28 12:07:40 -05:00

@dhiltgen commented on GitHub (Jun 18, 2024):

With 0.1.45 we'll now be aware of VRAM skew between multiple GPUs and should allocate the layers proportionally to what is available on each GPU. While this doesn't address your feature request, hopefully you'll see better behavior on multiple GPUs with this new release.
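The proportional allocation described above can be sketched as follows. This is an illustrative example of splitting layers in proportion to each GPU's free VRAM, not the actual Ollama scheduler code; the function name and signature are invented for the sketch.

```python
def split_layers(total_layers: int, free_vram: list[int]) -> list[int]:
    """Distribute model layers across GPUs proportionally to free VRAM.

    Illustrative sketch of proportional allocation, not Ollama's
    implementation. free_vram holds the free bytes per GPU.
    """
    total_free = sum(free_vram)
    # floor-allocate each GPU's proportional share of the layers
    alloc = [total_layers * v // total_free for v in free_vram]
    # hand the rounding remainder to the GPUs with the most free VRAM
    remainder = total_layers - sum(alloc)
    order = sorted(range(len(free_vram)),
                   key=lambda i: free_vram[i], reverse=True)
    for i in order[:remainder]:
        alloc[i] += 1
    return alloc

# With skewed GPUs (24 GiB free vs 8 GiB free) and a 33-layer model,
# the larger GPU receives roughly three quarters of the layers.
```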

<!-- gh-comment-id:2177103794 -->

@dhiltgen commented on GitHub (Jul 3, 2024):

Please give the latest release a try. If you see it get the memory prediction wrong across your skewed-VRAM GPUs, share your server log and I'll reopen the issue.

<!-- gh-comment-id:2207484832 -->

Reference: github-starred/ollama#49532