[GH-ISSUE #10703] Wrong model size prevents GPU use #7033

Closed
opened 2026-04-12 18:56:21 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @zhfg on GitHub (May 14, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10703

What is the issue?

Issue description:
I use an AMD MI50 to run qwen3:32b, with open-webui as a frontend. For the same model, Ollama reports a different model size at different times, and when the size is wrong the model cannot use the GPU.

Some output:

# Wrong size: this time, Ollama runs on the CPU and reports the wrong model size
jacob@crawl:~$ ollama ps
NAME ID SIZE PROCESSOR UNTIL
qwen3:32b e1c9f234c6eb 53 GB 100% CPU 23 hours from now

# After restarting the Ollama service and running again, it uses the GPU and reports the correct model size
jacob@crawl:~$ ollama ps
NAME ID SIZE PROCESSOR UNTIL
qwen3:32b e1c9f234c6eb 25 GB 100% GPU 24 hours from now

Relevant log output


OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.6.8

GiteaMirror added the bug label 2026-04-12 18:56:21 -05:00
Author
Owner

@mitchelldehaven commented on GitHub (May 14, 2025):

Running into a similar issue with an RTX 6000 Blackwell and qwen3:32b-q8_0

Author
Owner

@rick-github commented on GitHub (May 14, 2025):

You have a client setting num_ctx to something around 124000, which requires a context buffer of 30G. This does not fit in the available VRAM with enough space for any layers, so the model is run in system RAM.
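As a rough sanity check of the 30G figure, the context buffer (KV cache) size can be estimated from the model architecture. The sketch below is a hedged back-of-the-envelope calculation, assuming Qwen3-32B's published shape (64 layers, 8 KV heads, head dim 128, grouped-query attention) and an fp16 K/V cache; Ollama's actual allocation includes additional buffers and may differ somewhat.

```python
def kv_cache_bytes(num_ctx, n_layers=64, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Approximate K/V cache size: two tensors (K and V) per layer,
    each num_ctx x n_kv_heads x head_dim fp16 elements."""
    return 2 * num_ctx * n_layers * n_kv_heads * head_dim * bytes_per_elem

# A client requesting num_ctx around 124000 needs ~32.5e9 bytes (~30 GiB)
# just for the context buffer:
print(f"{kv_cache_bytes(124_000) / 1e9:.1f} GB")  # 32.5 GB
# The open-webui default of 2048 needs well under 1 GB:
print(f"{kv_cache_bytes(2_048) / 1e9:.2f} GB")    # 0.54 GB
```

With a ~30 GiB context buffer on top of the ~20 GB of weights, nothing fits in the MI50's 32 GB of VRAM, which matches the fallback to 100% CPU shown in `ollama ps`.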

Author
Owner

@zhfg commented on GitHub (May 15, 2025):

@rick-github As far as I checked, num_ctx (called "Context Length (Ollama)" in open-webui) was never changed and defaults to 2048.

In any case, I will keep watching this issue and give feedback if there are any updates.

Author
Owner

@rick-github commented on GitHub (May 15, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging.

Author
Owner

@zhfg commented on GitHub (May 16, 2025):

@rick-github Thank you for your reply.
I manually set num_ctx to different values, and when I use a large value, this issue does occur.

It looks like it was my configuration problem. Closing this issue.
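For reference, the context length can also be pinned explicitly per request through Ollama's HTTP API, which avoids surprises from frontend settings. A usage sketch (assuming the default Ollama endpoint on port 11434):

```shell
# Explicitly request the default 2048-token context so the
# KV cache stays small enough to fit in VRAM:
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3:32b",
  "prompt": "hello",
  "options": { "num_ctx": 2048 }
}'
```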


Reference: github-starred/ollama#7033