[GH-ISSUE #11477] In a mixture-of-experts model, only load the active parameters into memory. #54090

Closed
opened 2026-04-29 05:12:11 -05:00 by GiteaMirror · 1 comment

Originally created by @qwerty108109 on GitHub (Jul 20, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11477

In a mixture-of-experts model, only load the active parameters into memory.

For example, when using `qwen3:235b-a22b`, load only the 22B active parameters into memory instead of the full 235B.
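For context, here is a minimal sketch of generic top-k expert routing, the mechanism behind the "active parameters" number. It is not taken from the Ollama or llama.cpp codebases, and all names (`moe_forward`, `N_EXPERTS`, `TOP_K`, `D_MODEL`) are illustrative. It shows why only a fraction of an MoE model's weights are read for any given token, and also why that fraction is a different subset of experts from one token to the next:

```python
# Sketch of top-k mixture-of-experts routing (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8   # hypothetical: total experts in one MoE layer
TOP_K = 2       # hypothetical: experts activated per token
D_MODEL = 16    # hypothetical hidden size

# Each expert is modeled here as a single weight matrix.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
router_w = rng.standard_normal((D_MODEL, N_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token's hidden state through its top-k experts."""
    logits = x @ router_w              # score every expert for this token
    top = np.argsort(logits)[-TOP_K:]  # keep the k highest-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()               # softmax over the chosen experts
    # Only the selected experts' weights are read here; the other
    # N_EXPERTS - TOP_K experts are untouched for this token.
    return sum(g * (x @ experts[e]) for g, e in zip(gates, top))

token = rng.standard_normal(D_MODEL)
print(moe_forward(token).shape)  # (16,)
```

Because the router's choice changes every token, essentially all experts get used over a long generation, so serving only the active parameters would mean paging expert weights in and out on demand rather than a one-time partial load.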

GiteaMirror added the feature request label 2026-04-29 05:12:11 -05:00

@rick-github commented on GitHub (Jul 21, 2025):

#10546


Reference: github-starred/ollama#54090