[GH-ISSUE #1864] loading the model into GPU direct #63103

Closed
opened 2026-05-03 12:06:39 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @Mahmuod1 on GitHub (Jan 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1864

Is there any way to load the LLM model directly into GPU memory, rather than loading it into CPU memory first and then moving it to the GPU, as I see happening in my system monitor?

Author
Owner

@pdevine commented on GitHub (Jan 9, 2024):

This is essentially what Ollama does. It tries to offload as many layers of the model as possible into the GPU, and if there is not enough space, it loads the rest into system memory. Even to load the model into the GPU's memory, though, your computer has to use at least _some_ system memory to read the file and perform the copy. On a Mac, since it has Unified Memory, you don't have to copy the model through separate system memory.

Are you having problems with something in particular though? Do you have less system memory than GPU memory?
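For readers who want to influence how many layers are offloaded, Ollama exposes a `num_gpu` option (the number of layers to send to the GPU) that can be set per request on the REST API or in a Modelfile. A minimal sketch of building such a request, assuming a local Ollama server on the default port 11434 and a model named `llama2`:

```python
import json

# Request body for Ollama's /api/generate endpoint. The "num_gpu"
# option sets how many model layers are offloaded to the GPU; a large
# value asks Ollama to offload as many layers as will fit in VRAM,
# while 0 forces CPU-only inference.
payload = {
    "model": "llama2",            # assumed to be pulled already
    "prompt": "Hello",
    "options": {"num_gpu": 999},  # request maximum GPU offload
    "stream": False,
}

body = json.dumps(payload)
# To send it (requires a running Ollama server):
#   curl http://localhost:11434/api/generate -d "$BODY"
print(body)
```

Note that this only controls *where* layers end up after loading; the copy through system memory that @pdevine describes still happens on non-unified-memory machines.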

Author
Owner

@pdevine commented on GitHub (Mar 11, 2024):

Going to go ahead and close out the issue.


Reference: github-starred/ollama#63103