[GH-ISSUE #10867] qwen3:30b-a3b has poor performance on WSL2 #32900

Closed
opened 2026-04-22 14:49:44 -05:00 by GiteaMirror · 7 comments
Owner

Originally created by @junzhang-bjtu on GitHub (May 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10867

What is the issue?

qwen3:30b-a3b runs at barely 1 token/s on WSL2, while it runs at about 15 tokens/s on Windows.

I am only using the CPU.

Moreover, after loading qwen3:30b-a3b, it consumed more than 40 GB of memory on WSL2, while it only consumed about 20 GB on Windows.
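One thing worth ruling out is the WSL2 VM's own memory accounting rather than Ollama itself: the VM's page cache counts against its memory, which can inflate the numbers compared to native Windows. A quick check from inside WSL2 (a sketch, assuming a standard WSL2 Ubuntu install):

```shell
# Inside WSL2: show how much memory the VM sees and how much is in use.
# WSL2's page cache counts against the VM, so "used" can look much higher
# than the same workload on native Windows.
free -h

# The VM's memory ceiling can be capped from the Windows side in
# %UserProfile%\.wslconfig (then restart WSL with `wsl --shutdown`):
#   [wsl2]
#   memory=48GB
```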

Not only qwen3:30b-a3b but also qwen3:8b performs poorly.

OS

Windows 11 24H2

WSL2 Ubuntu-24.04

GPU

Intel(R) Arc(TM) 140T

CPU

Intel(R) Core(TM) Ultra 9 285H

Memory

2 × 48 GB 5600 MT/s

Ollama version

0.7.1

GiteaMirror added the bug label 2026-04-22 14:49:44 -05:00

@rick-github commented on GitHub (May 26, 2025):

Have you made the GPU available inside WSL?

https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/configure-wsl-2-for-gpu-workflows.html


@junzhang-bjtu commented on GitHub (May 26, 2025):

> Have you made the GPU available inside WSL?
>
> https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/configure-wsl-2-for-gpu-workflows.html

I don't have a graphics card, so I can only use the CPU.
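As a quick way to settle whether WSL2 can see a GPU at all, the devices can be inspected from inside the Linux guest. This is a sketch; the device paths assume a standard WSL2 setup with GPU paravirtualization enabled:

```shell
# Check whether WSL2 exposes the GPU paravirtualization device.
# /dev/dxg is created by WSL2 when GPU sharing is enabled; /dev/dri
# render nodes appear once a working Linux GPU driver stack is installed.
if [ -e /dev/dxg ]; then
  echo "/dev/dxg present: a GPU is shared into this WSL2 instance"
else
  echo "/dev/dxg missing: no GPU is exposed to WSL2"
fi
ls /dev/dri 2>/dev/null || echo "no /dev/dri nodes (compute drivers not set up)"
```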


@scscgit commented on GitHub (May 26, 2025):

@MXS-Jun From the context it seems he meant the integrated GPU Intel(R) Arc(TM) 140T


@junzhang-bjtu commented on GitHub (May 26, 2025):

> @MXS-Jun From the context it seems he meant the integrated GPU Intel(R) Arc(TM) 140T

OK, I will try it later. Thanks.


@junzhang-bjtu commented on GitHub (May 27, 2025):

> @MXS-Jun From the context it seems he meant the integrated GPU Intel(R) Arc(TM) 140T

I have installed the following in WSL2, but it didn't help:

oneAPI: https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2025-1/base-apt.html

Intel GPU Driver: https://dgpu-docs.intel.com/driver/client/overview.html

I also found that both dense models and MoE models perform poorly.

It's strange, because Ollama 0.6.6 does not have this problem.
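When comparing the two Ollama versions, `ollama run --verbose` prints per-request statistics (including the eval rate in tokens/s) after each response, which makes the regression easy to quantify. A sketch, assuming `ollama` is on the PATH and the model has already been pulled:

```shell
# Print generation statistics (prompt eval rate and eval rate in tokens/s)
# after the response, so throughput can be compared across Ollama versions.
if command -v ollama >/dev/null 2>&1; then
  ollama run qwen3:8b --verbose "Reply with one short sentence."
else
  echo "ollama not found on PATH"
fi
```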


@junzhang-bjtu commented on GitHub (May 27, 2025):

I have switched to ollama-ipex-llm version 2.2.0, which is based on Ollama 0.6.2.

Everything is fine now.


@junzhang-bjtu commented on GitHub (Jun 6, 2025):

If you don't have a dedicated graphics card and only have Intel integrated graphics, prefer ollama-ipex-llm over the official ollama.

ipex-llm: https://github.com/intel/ipex-llm


Reference: github-starred/ollama#32900