[GH-ISSUE #13112] 0.12.11 much slower on gpt-oss:20b than 0.12.10 #55194

Closed
opened 2026-04-29 08:29:18 -05:00 by GiteaMirror · 5 comments

Originally created by @PThomasG on GitHub (Nov 17, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13112

What is the issue?

Running gpt-oss:20b on 0.12.10 results in an eval rate of 245 tokens/sec.
Running gpt-oss:20b on 0.12.11 results in an eval rate of 135 tokens/sec.

I reverted to 0.12.10 and it is back to normal.
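
For anyone comparing versions the same way, here is a minimal sketch of the arithmetic behind these figures. It assumes the eval rate is the generated token count divided by the eval duration, with the duration given in nanoseconds as in the `eval_count` and `eval_duration` fields of Ollama's API responses. The helper names `eval_rate` and `slowdown` are hypothetical, chosen only for this illustration.

```python
def eval_rate(eval_count: int, eval_duration_ns: int) -> float:
    """Tokens per second from a token count and a duration in nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)


def slowdown(before_tps: float, after_tps: float) -> float:
    """Fractional drop in throughput going from before_tps to after_tps."""
    return (before_tps - after_tps) / before_tps


if __name__ == "__main__":
    # Rates reported in this issue: 245 tokens/sec on 0.12.10, 135 on 0.12.11.
    print(f"{slowdown(245, 135):.1%}")  # prints 44.9%
```

By this measure, 0.12.11 loses roughly 45% of the eval throughput seen on 0.12.10.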

Relevant log output

N/A

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.12.11

GiteaMirror added the performance, bug, nvidia labels 2026-04-29 08:29:20 -05:00

@pdevine commented on GitHub (Nov 17, 2025):

@paultg2 can you post logs as well as the output from `ollama ps`?


@pdevine commented on GitHub (Nov 17, 2025):

I was able to duplicate this this morning. Still investigating.


@PThomasG commented on GitHub (Nov 18, 2025):

I do not have logs, but I can create them. As you duplicated it, do you still need the logs?

BTW: this was on an RTX 5090 Founders Edition and also on a Gigabyte Aorus RTX 5090.


@jessegross commented on GitHub (Nov 18, 2025):

We've identified the cause, so no need for logs at this point. Thanks though!


@GlisseManTV commented on GitHub (Nov 18, 2025):

Just noticed this too with qwen 3 on a multi-GPU setup. Reverted to 0.12.10.

Reference: github-starred/ollama#55194