[GH-ISSUE #10546] MoE requires loading full model in memory #53452

Closed
opened 2026-04-29 03:15:43 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @markokocic on GitHub (May 3, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10546

What is the issue?

I tried loading Qwen3 MoE 30b-a3b on my laptop. Since it’s a MoE with 3b active parameters, I expected that ollama would be able to run it with limited memory. However, I got an error message telling me it requires 20+GB to run, which is the amount of memory required to run a fully loaded dense model.

Relevant log output


OS

Windows

GPU

Other

CPU

Intel

Ollama version

No response

GiteaMirror added the bug label 2026-04-29 03:15:43 -05:00
Author
Owner

@Burnarz commented on GitHub (May 3, 2025):

Hi! That’s how a MoE model works.
Even though only 3B parameters are active for any given token during inference, all the experts need to be loaded into memory, because the router can send each token to a different subset of experts.
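
To make the point above concrete, here is a rough sketch in Python. It is not how ollama computes its memory requirement. The 4.5 bits per weight, the 128 experts, and the 8 experts selected per token are assumptions chosen for illustration, and a uniform random pick stands in for the learned router. The estimate covers weights only and ignores the KV cache and runtime overhead.

```python
import random


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def experts_touched(num_experts: int, top_k: int, num_tokens: int, seed: int = 0) -> int:
    """Count distinct experts used when every token is routed to top_k experts.

    A uniform random choice stands in for the learned router (assumption).
    """
    rng = random.Random(seed)
    touched = set()
    for _ in range(num_tokens):
        touched.update(rng.sample(range(num_experts), top_k))
    return len(touched)


# Illustrative values; the quantization level and expert counts are assumptions.
TOTAL_PARAMS = 30e9      # all experts plus shared layers
ACTIVE_PARAMS = 3e9      # parameters used for a single token
BITS_PER_WEIGHT = 4.5    # hypothetical 4-bit style quantization
NUM_EXPERTS = 128        # hypothetical expert count per MoE layer
TOP_K = 8                # hypothetical experts selected per token

if __name__ == "__main__":
    resident = weight_memory_gb(TOTAL_PARAMS, BITS_PER_WEIGHT)
    per_token = weight_memory_gb(ACTIVE_PARAMS, BITS_PER_WEIGHT)
    print(f"weights that must be resident: {resident:.1f} GB")
    print(f"weights read for one token:    {per_token:.1f} GB")
    for n in (1, 10, 100, 1000):
        used = experts_touched(NUM_EXPERTS, TOP_K, n)
        print(f"{n:>5} tokens -> {used} of {NUM_EXPERTS} experts used")
```

Under these assumptions a single token touches only a few experts, but a prompt of modest length touches nearly all of them. The 3B active parameters reduce the compute per token, not the amount of weight data that has to be available.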

Author
Owner

@markokocic commented on GitHub (May 3, 2025):

Thanks


Reference: github-starred/ollama#53452