[GH-ISSUE #13005] Per GPU settings #8611

Open
opened 2026-04-12 21:20:51 -05:00 by GiteaMirror · 0 comments

Originally created by @Mikec78660 on GitHub (Nov 7, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13005

It would be great if some of the settings could be configured per GPU. As an example, say you have 2 GPUs, one of which is your main desktop GPU. You want to reserve memory on it so that it can keep doing desktop GPU things, but you want the second GPU to be fully utilized for Ollama. So:

Environment="OLLAMA_GPU_OVERHEAD:CUDA0=6000000000"
OR
Environment="OLLAMA_MAX_VRAM:1=12000000000"
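
To make the proposed syntax concrete, here is a minimal sketch of how a `NAME:DEVICE=VALUE` assignment could be parsed. The per-GPU suffix is only the proposal from this issue, not an existing Ollama feature, and `parse_gpu_setting` is a hypothetical helper name.

```python
from typing import Optional, Tuple


def parse_gpu_setting(assignment: str) -> Tuple[str, Optional[str], int]:
    """Split 'NAME[:DEVICE]=BYTES' into (name, device, bytes).

    The ':DEVICE' suffix is the hypothetical per-GPU syntax proposed in
    this issue. device is None when no suffix is given, which this sketch
    treats as "applies to every GPU".
    """
    key, sep, value = assignment.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in {assignment!r}")
    name, _, device = key.partition(":")
    return name, (device or None), int(value)


# The two examples from this issue, plus a setting with no per-GPU suffix.
for line in (
    "OLLAMA_GPU_OVERHEAD:CUDA0=6000000000",
    "OLLAMA_MAX_VRAM:1=12000000000",
    "OLLAMA_GPU_OVERHEAD=0",
):
    print(parse_gpu_setting(line))
```

The device part is kept as an opaque string so that it could hold a device name, an index, or a UUID.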

Also, if you have a 3090 and a 4090 and are running a 32GB model, this would allow loading 24GB on the 4090 and 8GB on the 3090 for faster inference. Currently I can't see any way to accomplish this, as there is no "preferred GPU" option. Such an option would also be great for loading a model across GPUs in a specified order of preference:

Environment="OLLAMA_GPU_ORDER=CUDA2,CUDA3,CUDA1"

Each GPU could be identified by device number, UUID, or whatever other identifier works.
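
The 3090/4090 example above can be sketched as a greedy fill in preference order. This only illustrates the requested behavior under simplifying assumptions: it splits raw bytes rather than model layers, it uses the round numbers from this issue, and `split_by_preference` is a hypothetical helper, not how Ollama schedules today.

```python
from typing import Dict, List

GB = 10**9


def split_by_preference(
    model_bytes: int, order: List[str], capacity: Dict[str, int]
) -> Dict[str, int]:
    """Fill GPUs in the given preference order; return bytes placed per GPU."""
    placement: Dict[str, int] = {}
    remaining = model_bytes
    for device in order:
        if remaining <= 0:
            break
        # Put as much as fits on the preferred device before moving on.
        take = min(remaining, capacity[device])
        if take > 0:
            placement[device] = take
            remaining -= take
    if remaining > 0:
        raise ValueError("model does not fit on the listed GPUs")
    return placement


# 32GB model, 4090 preferred over 3090, 24GB usable on each (illustrative).
plan = split_by_preference(
    32 * GB,
    order=["4090", "3090"],
    capacity={"4090": 24 * GB, "3090": 24 * GB},
)
print(plan)
```

With a per-GPU cap such as the proposed `OLLAMA_MAX_VRAM:1=...`, the `capacity` entry for that device would simply be lowered before the split.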

GiteaMirror added the feature request label 2026-04-12 21:20:51 -05:00

Reference: github-starred/ollama#8611