[GH-ISSUE #9046] Is it possible to force ollama to run on CPU instead of GPU? #52398

Closed
opened 2026-04-28 23:08:26 -05:00 by GiteaMirror · 4 comments

Originally created by @yilmazay74 on GitHub (Feb 12, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9046

Hi Everyone,
For benchmarking reasons, I need to force Ollama to run on CPU so I can measure its CPU performance.
However, I have not been able to do that. It keeps using the GPU, so I cannot measure CPU-only performance.
I think it would be great if there were parameters like `--cpus=4` or `--gpus=0`, similar to Docker's run flags.
Is there any possibility that this feature could be added in the near future?
And as a workaround for now, is there any way to force Ollama to run a particular model on CPU?
Thanks in advance.

Y. A.

GiteaMirror added the feature request label 2026-04-28 23:08:26 -05:00

@rick-github commented on GitHub (Feb 12, 2025):

https://github.com/ollama/ollama/issues/6950#issuecomment-2373663650


@kth8 commented on GitHub (Feb 12, 2025):

/set parameter num_gpu 0
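
The `/set` command above applies inside an interactive `ollama run` session. For a persistent CPU-only variant, the same parameter can be baked into a Modelfile (a sketch; the base model name `llama3.2` is a placeholder, substitute the model you are benchmarking):

```
# Modelfile: CPU-only variant of an existing model
FROM llama3.2
PARAMETER num_gpu 0
```

Then build and run it with `ollama create llama3.2-cpu -f Modelfile` followed by `ollama run llama3.2-cpu`.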


@BruceMacD commented on GitHub (Feb 12, 2025):

The num_gpu mentioned above should do it, you can specify this as an option to the generate/chat endpoints in the API also. Resolving this for now, but please let me know if you continue to see issues.
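
For scripted benchmarking, the option can also be passed per request. A minimal sketch of the request body for the `/api/generate` endpoint on the default port 11434 (the model name `llama3.2` and the prompt are placeholders):

```python
import json

# Setting options.num_gpu to 0 asks the server to offload zero layers
# to the GPU, i.e. run the model entirely on CPU for this request.
payload = {
    "model": "llama3.2",           # placeholder model name
    "prompt": "Why is the sky blue?",
    "stream": False,
    "options": {"num_gpu": 0},     # 0 GPU layers -> CPU-only inference
}

body = json.dumps(payload)
print(body)
# POST `body` to http://localhost:11434/api/generate while the Ollama
# server is running, e.g.:
#   curl http://localhost:11434/api/generate -d "$body"
```

The same `options` object works on the `/api/chat` endpoint.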


@yilmazay74 commented on GitHub (Feb 14, 2025):

Thank you so much. That is what I was looking for.


Reference: github-starred/ollama#52398