[GH-ISSUE #1553] customise number of experts in mixtral #47359

Open
opened 2026-04-28 03:37:08 -05:00 by GiteaMirror · 4 comments

Originally created by @scienlabs on GitHub (Dec 15, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1553

Could someone provide guidance or documentation on how to adjust the number of experts in Mixtral? I'm particularly interested in understanding whether there's a way to adjust this number dynamically based on the requirements of different tasks or scenarios.


@RafaAguilar commented on GitHub (Dec 19, 2023):

I'm not sure what Ollama uses, but for the `llama.cpp` [backend](https://github.com/ggerganov/llama.cpp/pull/4406) you can override a key in the model with:

--override-kv KEY=TYPE:VALUE
                        advanced option to override model metadata by key. may be specified multiple times.
                        types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false

For example I override them using:

--override-kv llama.expert_used_count=int:3

But I don't think this is supported by the Modelfile yet.
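For reference, the `--override-kv` argument packs three pieces of information into one string: the metadata key, the value type, and the value itself, as `KEY=TYPE:VALUE`. A minimal Python sketch of how such a string decomposes (a hypothetical helper for illustration, not llama.cpp code):

```python
def parse_override_kv(arg: str):
    """Split a llama.cpp-style KEY=TYPE:VALUE override string.

    Returns (key, typed_value). Supported types mirror the ones the
    flag documents: int, float, bool.
    """
    key, rest = arg.split("=", 1)
    type_name, raw = rest.split(":", 1)
    casts = {
        "int": int,
        "float": float,
        "bool": lambda s: s.lower() == "true",
    }
    if type_name not in casts:
        raise ValueError(f"unsupported type: {type_name}")
    return key, casts[type_name](raw)

# The override from the comment above: use 3 experts per token.
print(parse_override_kv("llama.expert_used_count=int:3"))
# ('llama.expert_used_count', 3)
```

Note that `llama.expert_used_count` only changes how many experts the router activates per token at inference time; it does not change how many experts exist in the model weights.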


@scienlabs commented on GitHub (Dec 27, 2023):

How can I do it with Ollama? Wondering if anyone can help.


@PLK2 commented on GitHub (May 8, 2024):

Figured it out yet?


@ColumbusAI commented on GitHub (Aug 2, 2024):

Any update on this?


Reference: github-starred/ollama#47359