[GH-ISSUE #2270] When I run a local model, GPU is used, but the CPU is 100% #1304

Closed
opened 2026-04-12 11:07:37 -05:00 by GiteaMirror · 6 comments

Originally created by @thugbobby on GitHub (Jan 30, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2270

Originally assigned to: @dhiltgen on GitHub.

When I run a local model, the GPU is used, but CPU usage sits at 100% and the process eventually crashes.
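![image](https://github.com/ollama/ollama/assets/68416779/2dc6dbf4-b786-4250-9996-20915a5b5ee5)
![image](https://github.com/ollama/ollama/assets/68416779/89c31672-2d47-4cd1-ad34-9e47fb2063af)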

GiteaMirror added the gpu label 2026-04-12 11:07:37 -05:00

@mehdiataei commented on GitHub (Jan 30, 2024):

Same issue. GPU is not used at all (the memory is allocated though)


@ltomes commented on GitHub (Jan 30, 2024):

This has been brought up on this ticket as well: https://github.com/ollama/ollama/issues/1663
I have similar symptoms but using an A5000.


@penouc commented on GitHub (Feb 3, 2024):

This seems to be an issue with the new version. I tried Ollama 0.1.20 and found that CPU usage could go over 100% without crashing.
![image](https://github.com/ollama/ollama/assets/1774022/6e03b496-786c-45f1-8919-215579fc6039)


@easp commented on GitHub (Feb 3, 2024):

What model are you using?


@thugbobby commented on GitHub (Feb 8, 2024):

> What model are you using?

yi:34b-chat


@dhiltgen commented on GitHub (Mar 12, 2024):

@thugbobby from your screenshot, it looks like Python is using most of your VRAM, leaving Ollama very little room. That implies only a small number of layers are being loaded onto the GPUs, and most of the work is being done by the CPU. Either load Ollama before your Python app, or configure the Python app to use less VRAM, and you should get more layers loaded onto the GPU. From the looks of things we could probably load a few more layers, but we try to be a little conservative in our memory prediction calculations, and we're continuing to refine those.
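
To make the reasoning above concrete, here is a rough back-of-the-envelope sketch of how free VRAM limits the number of layers that can be offloaded. This is not Ollama's actual memory prediction logic; the function name, the fixed safety reserve, and all the numbers (layer size, layer count, free VRAM) are hypothetical and chosen only for illustration.

```python
def estimate_gpu_layers(free_vram_mib, layer_size_mib, total_layers, reserve_mib=512):
    """Rough count of model layers that fit in free VRAM.

    Illustrative only: real runtimes also account for the KV cache,
    scratch buffers, and per-GPU overhead. reserve_mib is an assumed
    safety margin, not a value taken from Ollama.
    """
    usable = free_vram_mib - reserve_mib
    if usable <= 0 or layer_size_mib <= 0:
        return 0
    # Never report more layers than the model actually has.
    return min(total_layers, int(usable // layer_size_mib))


# Hypothetical case 1: another process already holds most of the VRAM.
crowded = estimate_gpu_layers(free_vram_mib=3000, layer_size_mib=320, total_layers=60)

# Hypothetical case 2: the same card with the other process stopped.
roomy = estimate_gpu_layers(free_vram_mib=22000, layer_size_mib=320, total_layers=60)

print(f"layers on GPU with VRAM mostly taken: {crowded} of 60")
print(f"layers on GPU with VRAM mostly free:  {roomy} of 60")
```

With these made-up numbers, only 7 of 60 layers fit in the crowded case, so the remaining layers run on the CPU, which matches the symptom of GPU memory being allocated while the CPU stays saturated.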

If you're still having problems, please share your server log and I'll re-open the issue.
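
Before gathering logs, it may help to confirm which processes are holding VRAM. The sketch below, assuming an NVIDIA GPU with `nvidia-smi` on the PATH, queries per-process memory use and sorts it, largest first. The helper names `parse_compute_apps` and `vram_by_process` are made up for this example, and the sample text is invented data rather than output from the reporter's machine.

```python
import subprocess

# nvidia-smi query for per-process GPU memory, in MiB, without header or units.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, name, used_mib' lines into tuples sorted by memory, descending."""
    rows = []
    for line in csv_text.strip().splitlines():
        try:
            # Split the pid off the front and the memory off the back, so a
            # process name that happens to contain a comma stays intact.
            pid, rest = line.split(",", 1)
            name, used = rest.rsplit(",", 1)
            rows.append((int(pid), name.strip(), int(used)))
        except ValueError:
            continue  # skip blank or unparseable lines such as "[N/A]" values
    return sorted(rows, key=lambda row: row[2], reverse=True)


def vram_by_process():
    """Run nvidia-smi and return parsed per-process VRAM usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)


# Invented sample in the same format, so the parser can be tried without a GPU.
sample = "1234, python, 20480\n5678, /usr/bin/ollama, 2900\n"
for pid, name, used_mib in parse_compute_apps(sample):
    print(f"{used_mib:>6} MiB  pid={pid}  {name}")
```

On a machine with an NVIDIA GPU, calling `vram_by_process()` returns the live list. If a process other than Ollama tops it, stopping or shrinking that process before loading the model is the first thing to try.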

Reference: github-starred/ollama#1304