[GH-ISSUE #8474] Model running on 100% GPU runs on CPU #5455

Closed
opened 2026-04-12 16:41:13 -05:00 by GiteaMirror · 5 comments

Originally created by @RGFTheCoder on GitHub (Jan 18, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8474

What is the issue?

I've been using `hf.co/QuantFactory/Qwen2.5-14B-Instruct-GGUF:Q6_K` for a while and noticed a slowdown after updating to 0.5.4; the issue persists on 0.5.7. Ollama reports that the model is running 100% on GPU, but my monitoring shows CPU utilization around 50% while the GPU barely reaches 5%. (69.69.69.69 is a loopback device for a cloudflared private network.)

![Image](https://github.com/user-attachments/assets/c133a56b-15de-4c7f-b0ee-0842300c294b)

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.5.4

GiteaMirror added the bug label 2026-04-12 16:41:13 -05:00

@rick-github commented on GitHub (Jan 18, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging.

@RGFTheCoder commented on GitHub (Jan 18, 2025):

https://gist.github.com/RGFTheCoder/f70aa4251c6bfef965c5a408a2bfc4cb

@rick-github commented on GitHub (Jan 18, 2025):

```
Jan 17 19:02:23 linxdesktop ollama[91242]: time=2025-01-17T19:02:23.098-05:00 level=INFO source=routes.go:1339 msg="Dynamic LLM libraries" runners=[cpu]
```

Your installation doesn't have GPU-enabled runners. Are you on NixOS? See https://github.com/ollama/ollama/issues/8349.
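For anyone hitting the same symptom, a quick way to check which runners your build loaded is to grep the server log for the line quoted above. This is a sketch that assumes Ollama runs as a systemd service named `ollama`:

```shell
# Sketch: list the dynamic LLM runners this Ollama build advertises at startup.
# Assumes Ollama runs as the systemd service "ollama"; adjust for your setup.
journalctl -u ollama --no-pager | grep "Dynamic LLM libraries"

# A CPU-only build logs runners=[cpu]; a CUDA-capable build also lists GPU
# runners (e.g. cuda_v12), which is what allows layers to actually be offloaded.
```

If the output shows only `runners=[cpu]`, inference will run on the CPU regardless of what the scheduler reports, which matches the behavior in this issue.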

@RGFTheCoder commented on GitHub (Jan 18, 2025):

Yep, that was it, although it seems weird that `ollama ps` reports the model as 100% GPU.

@kha84 commented on GitHub (Jan 19, 2025):

The same happens on Ubuntu 22 LTS: https://github.com/ollama/ollama/issues/8485

Reference: github-starred/ollama#5455