[GH-ISSUE #3532] CohereForAI / c4ai-command-r-plus-4bit #48688

Closed · opened 2026-04-28 09:05:59 -05:00 by GiteaMirror · 9 comments

Originally created by @petunder on GitHub (Apr 8, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/3532

### What model would you like?

C4AI Command R+ is an open-weights research release of a 104-billion-parameter model with highly advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use to automate sophisticated tasks.

https://huggingface.co/CohereForAI/c4ai-command-r-plus-4bit

The model weights are split across multiple files, which makes direct uploading difficult. Additionally, the model is distributed in the .safetensors format.
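
For context, the usual path from a sharded .safetensors checkpoint to something ollama can load goes through llama.cpp's GGUF converter. A minimal sketch, assuming the full-precision c4ai-command-r-plus repo (not the bitsandbytes 4-bit one) and the llama.cpp tool names as of April 2024, which have since been renamed:

```bash
# Hedged sketch: sharded .safetensors -> GGUF -> 4-bit quant.
# Tool names match llama.cpp circa April 2024; newer trees use
# convert_hf_to_gguf.py and llama-quantize instead.
python convert-hf-to-gguf.py /path/to/c4ai-command-r-plus \
    --outfile command-r-plus-f16.gguf --outtype f16
./quantize command-r-plus-f16.gguf command-r-plus-Q4_K_M.gguf Q4_K_M
```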


@petunder commented on GitHub (Apr 8, 2024):

Command R Plus currently requires rebuilding Ollama with the llama.cpp branch from this PR: https://github.com/ggerganov/llama.cpp/pull/6491
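
For readers landing here before the merge, the general shape of a PR-pinned rebuild looked roughly like this. A minimal sketch, assuming ollama's April-2024 source layout with llama.cpp vendored as a git submodule under llm/llama.cpp (paths may differ in other versions):

```bash
# Hedged sketch: point ollama's vendored llama.cpp at PR #6491, then rebuild.
# Assumes the April-2024 layout (llm/llama.cpp submodule); adjust if yours differs.
git clone --recurse-submodules https://github.com/ollama/ollama.git
cd ollama/llm/llama.cpp
git fetch origin pull/6491/head:command-r-plus   # GitHub exposes PRs as pull/<id>/head
git checkout command-r-plus
cd ../..
go generate ./...   # rebuilds the vendored llama.cpp libraries
go build .
```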


@jason-c-kwan commented on GitHub (Apr 8, 2024):

@petunder Do instructions exist somewhere on how to build ollama with a different commit/PR of llama.cpp? I have not been able to get it to work so far.


@petunder commented on GitHub (Apr 9, 2024):

> @petunder Do instructions exist somewhere on how to build ollama with a different commit/PR of llama.cpp? I have not been able to get it to work so far.

Two hours ago @ggerganov updated the llama.cpp code, so you can now build ollama from source as usual; everything should work.
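
In other words, once the fix landed upstream, the standard build was enough. A minimal sketch of that "usual" build, assuming Go, cmake, and a C/C++ toolchain are installed, per ollama's development docs at the time:

```bash
# Hedged sketch: the standard from-source build once the llama.cpp fix was merged.
git clone --recurse-submodules https://github.com/ollama/ollama.git
cd ollama
go generate ./...   # fetches and builds the vendored llama.cpp
go build .
./ollama serve      # run the freshly built binary
```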


@taozhiyuai commented on GitHub (Apr 9, 2024):

support! I need command r plus on ollama. :_)


@jason-c-kwan commented on GitHub (Apr 9, 2024):

@petunder Building does not work for me. Doing `go build .` gives me this error:

```
# github.com/ollama/ollama/llm
llm/server.go:68:28: undefined: gpu.CheckVRAM
llm/server.go:69:14: undefined: gpu.GetGPUInfo
```

...and the executables are not made. Am I missing something obvious here or should I submit a separate issue?


@jason-c-kwan commented on GitHub (Apr 9, 2024):

Oh, I figured it out. For anyone else who comes across the same issue: the problem was that I was building inside a conda environment. Installing Go 1.22 on the base system made it work.
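
For anyone hitting the same `undefined: gpu.CheckVRAM` wall, a quick way to check which toolchain the build is actually picking up. A hedged sketch; the conda-vs-system distinction is the point, not the exact paths:

```bash
# Hedged sketch: confirm the build uses the system Go, not a conda-provided one.
which go            # a path like ~/miniconda3/envs/.../bin/go is the red flag here
go version          # Go 1.22 on the base system is what worked in this case
conda deactivate    # drop out of the env before rebuilding
go clean -cache     # discard artifacts built with the old toolchain
go generate ./... && go build .
```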


@trinque commented on GitHub (Apr 9, 2024):

https://github.com/ggerganov/llama.cpp/pull/6491 has been merged.


@jason-c-kwan commented on GitHub (Apr 9, 2024):

Just opened a new issue (#3563) because I figured this one is specifically for requesting the 4-bit version. I have not been able to import either version of the GGUFs I could find on Hugging Face, and not all the quants are available on the ollama hub yet.
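
For completeness, the manual import path being attempted here looks roughly like this. A minimal sketch, assuming a merged single-file GGUF (split shards may first need merging with llama.cpp's gguf-split tool) and a hypothetical local file name:

```bash
# Hedged sketch: import a local GGUF into ollama via a Modelfile.
# The .gguf file name below is a placeholder, not a real artifact.
cat > Modelfile <<'EOF'
FROM ./c4ai-command-r-plus.Q4_K_M.gguf
EOF
ollama create command-r-plus -f Modelfile
ollama run command-r-plus
```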


@pdevine commented on GitHub (Apr 12, 2024):

Let's cover this in #3576. The changes are already merged and should be available in `0.1.32`.
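
Once 0.1.32 shipped, the library route became the easy path. A minimal sketch, assuming the model tag used on ollama's public library:

```bash
# Hedged sketch: pulling the library build after the 0.1.32 release.
# The tag below assumes ollama's public library naming for this model.
ollama pull command-r-plus
ollama run command-r-plus "Summarize what RAG is in one sentence."
```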
