[GH-ISSUE #4142] More Quants for command-r-plus Please? #49086

Closed
opened 2026-04-28 10:44:06 -05:00 by GiteaMirror · 3 comments

Originally created by @chigkim on GitHub (May 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4142

There are only q2_K, q4_0, q8_0, and fp16 for command-r-plus.
https://ollama.com/library/command-r-plus/tags
Can we have more quants, like the 0/1/S/M/L variants of q3, q5, and q6, as other models have?
Thanks so much!

GiteaMirror added the model label 2026-04-28 10:44:06 -05:00

@taozhiyuai commented on GitHub (May 5, 2024):

import from gguf yourself.
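
For anyone landing here, a minimal sketch of that self-import route, assuming you've already downloaded a GGUF quant of the model (the filename and local tag below are illustrative, not actual library names):

```sh
# Illustrative filename: substitute the GGUF quant you actually downloaded
# (e.g. a Q5_K_M file from Hugging Face).
cat > Modelfile <<'EOF'
FROM ./command-r-plus.Q5_K_M.gguf
EOF

# Register it with Ollama under a local tag, then run it.
ollama create command-r-plus:q5_K_M -f Modelfile
ollama run command-r-plus:q5_K_M
```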


@chigkim commented on GitHub (May 5, 2024):

I already did, even before Ollama offered it. I'm requesting this for other people.


@pdevine commented on GitHub (Sep 13, 2024):

You can actually do this pretty easily if you have the non-quantized version checked out, either from safetensors or from the Ollama model. There are instructions in the [import doc](https://github.com/ollama/ollama/blob/main/docs/import.md#quantizing-a-model).

Short answer is:

  1. Create a Modelfile with the fp16 model on the `FROM` line
  2. Run `ollama create --quantize q4_K_M test` with the appropriate quantization level
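
Concretely, a minimal sketch of those two steps (the fp16 tag and the output name `test` are illustrative; `--quantize` needs an fp16/fp32 source):

```sh
# Illustrative tag: check https://ollama.com/library/command-r-plus/tags
# for the actual fp16 tag name.
cat > Modelfile <<'EOF'
FROM command-r-plus:fp16
EOF

# Create a new local model named "test", quantized down to q4_K_M.
ollama create --quantize q4_K_M test -f Modelfile
ollama run test
```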

Command-r is a little long in the tooth, so I'm going to go ahead and close the issue.


Reference: github-starred/ollama#49086