[GH-ISSUE #10878] Ollama 0.7.1 shows lower performance than previous versions on the Qwen 3 MoE model (30b-a3b) #7148

Closed
opened 2026-04-12 19:09:02 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @krpr on GitHub (May 27, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10878

What is the issue?

![Image](https://github.com/user-attachments/assets/5958b95a-751e-4686-ad67-8ec842591e38)
Usage on version 0.7.1
![Image](https://github.com/user-attachments/assets/a1fc74dc-0247-4aca-a851-0451dacfdf86)
Usage on version 0.6.8 or 0.7.0 (sorry, I don't remember which)

With the same hardware, Ollama 0.7.1 shows lower performance than 0.6.8 or 0.7.0 on the Qwen 3 MoE model (30b-a3b).

OS: Ubuntu 24.04.2 LTS (GNU/Linux 6.8.0-60-generic x86_64)
CPU: AMD EPYC 7Y43 48-Core Processor x2
GPU: NVIDIA GeForce RTX 4090 48G x2
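
For anyone trying to reproduce the comparison, a rough tokens/s check can be run from the shell on each Ollama version. This is a sketch: it assumes the model tag is `qwen3:30b-a3b`, and relies on `ollama run --verbose` printing its timing statistics (prompt eval rate and eval rate) after the response.

```shell
# Run the same prompt on each Ollama version and compare the reported
# "eval rate" (tokens/s) printed by --verbose after the response.
ollama run qwen3:30b-a3b --verbose "Explain mixture-of-experts routing in two sentences."
```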

Relevant log output


OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.7.1

GiteaMirror added the needs more info, bug labels 2026-04-12 19:09:02 -05:00

@krpr commented on GitHub (May 27, 2025):

![Image](https://github.com/user-attachments/assets/d9ac67af-0fc0-4588-86c4-4de107160b1e)
Sometimes the response rate is even only 60% of the peak tokens/s.


@frederikhendrix commented on GitHub (May 27, 2025):

Yes I have this as well. Also, when will ollama start running audio/transcription models?


@rick-github commented on GitHub (Jun 10, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) may help in debugging.

Reference: github-starred/ollama#7148