[GH-ISSUE #13217] Power adjustment during use #34498

Closed
opened 2026-04-22 18:06:58 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @LaoDi-Sama on GitHub (Nov 23, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13217

I've found that when using the Astr-bot service to call ollama, it only utilizes about 50W of GPU power, which is clearly not the speed I want.

When running the same model directly in the command prompt (cmd), it utilizes about 100W, and the speed is acceptable.

Is this a bug or is there a setting that can be adjusted? If so, please let me know. Thank you.


@rick-github commented on GitHub (Nov 23, 2025):

When using the Astr-bot service, what's the output of ollama ps?

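Editor's note: `ollama ps` reports how much of each loaded model resides in GPU memory versus system RAM. The same data is available programmatically from Ollama's `GET /api/ps` endpoint. The sketch below (sample payload is illustrative, not taken from this issue) shows how the GPU/CPU split shown in the PROCESSOR column can be derived from the `size` and `size_vram` fields:

```python
# Sketch: derive the GPU/CPU split (the PROCESSOR column of `ollama ps`)
# from the JSON that Ollama's GET /api/ps endpoint returns.
# The payload below is a hypothetical example, not real captured output.
import json

sample = json.loads("""
{
  "models": [
    {
      "name": "lingua-ds83:latest",
      "size": 8053063680,
      "size_vram": 8053063680
    }
  ]
}
""")

for m in sample["models"]:
    # size_vram is the portion of the model held in GPU memory.
    gpu_pct = 100 * m["size_vram"] / m["size"]
    cpu_pct = 100 - gpu_pct
    print(f'{m["name"]}: {gpu_pct:.0f}% GPU / {cpu_pct:.0f}% CPU')
```

When `size_vram` equals `size`, the model is fully GPU-resident (the "100% GPU" case reported below).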

@LaoDi-Sama commented on GitHub (Nov 23, 2025):

C:\Users\Lao_Di>ollama ps
NAME                 ID              SIZE      PROCESSOR    UNTIL
lingua-ds83:latest   f1e68986a5c4    7.5 GB    100% GPU     49 minutes from now


@LaoDi-Sama commented on GitHub (Nov 23, 2025):

It seems normal.


@rick-github commented on GitHub (Nov 23, 2025):

If the model is fully loaded on the GPU, then power usage will be a function of the prompt and token generation. There's no Ollama setting to increase utilization.

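Editor's note: one way to compare the two paths objectively is to look at the timing fields Ollama includes in the final `/api/generate` (or `/api/chat`) response rather than at instantaneous power draw, which averages in idle time between a bot's requests. A minimal sketch, using illustrative sample values (not measurements from this issue):

```python
# Sketch: tokens/sec from the timing fields in Ollama's final response.
# eval_count / eval_duration cover generation; prompt_eval_* cover prompt
# processing. Durations are reported in nanoseconds.
# The numbers below are hypothetical, for illustration only.
final_response = {
    "eval_count": 256,                # tokens generated
    "eval_duration": 4_000_000_000,   # ns spent generating
    "prompt_eval_count": 32,          # prompt tokens processed
    "prompt_eval_duration": 250_000_000,  # ns spent on prompt eval
}

gen_tps = final_response["eval_count"] / (final_response["eval_duration"] / 1e9)
prompt_tps = final_response["prompt_eval_count"] / (final_response["prompt_eval_duration"] / 1e9)
print(f"generation: {gen_tps:.1f} tok/s, prompt eval: {prompt_tps:.1f} tok/s")
```

If these per-request numbers match between AstrBot and the CLI, the lower average wattage simply reflects gaps between the bot's requests, not slower inference.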

@LaoDi-Sama commented on GitHub (Nov 23, 2025):

This model is just deepseek-r1:8b with a Modelfile.
Can I change the Modelfile to make it faster, or is there any setting in AstrBot? (like below)

[Image: https://github.com/user-attachments/assets/7818587c-5341-4da5-8631-e180665ade96]

@LaoDi-Sama commented on GitHub (Nov 23, 2025):

I don't know the meaning of "temperature".


@rick-github commented on GitHub (Nov 23, 2025):

deepseek-r1:8b is not the model shown in the ollama ps output.

Temperature is how creative the model is: 0 = less creative, 1 = more creative.

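Editor's note: to make the temperature explanation concrete, here is what the parameter does mathematically. It rescales the model's logits before sampling: low values sharpen the token distribution (more deterministic output), high values flatten it (more varied output). It has no meaningful effect on GPU power draw. A minimal sketch:

```python
# Sketch: temperature scaling of a token distribution.
# Logits are divided by the temperature before softmax, so low
# temperature concentrates probability on the top token and high
# temperature spreads it out. Example logits are made up.
import math

def softmax_with_temperature(logits, t):
    scaled = [x / t for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # nearly deterministic
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

The top token's probability is much larger at temperature 0.2 than at 2.0, which is the whole effect of the setting.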

@pdevine commented on GitHub (Nov 25, 2025):

I'm going to go ahead and close this as answered.

Reference: github-starred/ollama#34498