[GH-ISSUE #4433] GPU layer control / prioritisation #2770

Open
opened 2026-04-12 13:05:27 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @AncientMystic on GitHub (May 14, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4433

Would it be possible to add a configuration option to ollama, similar to LM Studio's, to control GPU utilisation?

Would it also be possible to configure ollama to load only certain layers to the GPU, similar to Unsloth?
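For reference, ollama does already expose a coarse knob for this: the `num_gpu` parameter (the number of layers to offload to the GPU), which can be set in a Modelfile or per-request in the API options. A minimal sketch, assuming a base model named `llama3` (the exact model name and layer count here are illustrative):

```
FROM llama3
# Offload only the first 20 layers to the GPU; the rest run on CPU/RAM.
PARAMETER num_gpu 20
```

This gives manual control over how much of the model lands on the GPU, though not the access-driven lazy loading described below.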

Ideally, ollama could load the accessed layer and its adjacent layers, with configuration for how many adjacent layers (or how much of the model) to keep loaded at once. Unused layers could be offloaded to RAM, or not loaded at all and simply swapped in when needed, instead of loading the entire model every time.

This could perhaps be exposed as a lazy-loading mode to enable running larger models at higher performance.
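The lazy-loading scheme described above could be sketched roughly as an LRU cache over layers, where accessing one layer also prefetches its neighbours and least-recently-used layers are evicted once a residency budget is exceeded. This is purely a hypothetical illustration of the proposal, not how ollama works today; the class and parameter names are made up.

```python
from collections import OrderedDict


class LayerCache:
    """Hypothetical sketch of lazy layer loading: keep the accessed layer
    plus a window of adjacent layers resident on the GPU, evicting the
    least-recently-used layers once a residency budget is exceeded."""

    def __init__(self, num_layers: int, budget: int, window: int = 1):
        self.num_layers = num_layers
        self.budget = budget            # max layers resident at once
        self.window = window            # adjacent layers to prefetch
        self.resident = OrderedDict()   # layer index -> True, in LRU order

    def _load(self, idx: int) -> None:
        if idx in self.resident:
            self.resident.move_to_end(idx)      # mark as recently used
            return
        while len(self.resident) >= self.budget:
            self.resident.popitem(last=False)   # evict least-recently-used
        self.resident[idx] = True  # stand-in for copying weights to the GPU

    def access(self, idx: int) -> None:
        # Load the requested layer and its neighbours within the window.
        lo = max(0, idx - self.window)
        hi = min(self.num_layers, idx + self.window + 1)
        for j in range(lo, hi):
            self._load(j)
```

For example, with a budget of 4 layers and a window of 1, accessing layer 5 loads layers 4–6, and a later access to layer 0 pulls in layers 0–1 while evicting the stalest resident layer to stay under budget.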

This seems to offer a significant performance advantage, especially on lower-end hardware, for those of us without extreme setups, if it is possible within ollama.

GiteaMirror added the feature request label 2026-04-12 13:05:27 -05:00

Reference: github-starred/ollama#2770