[GH-ISSUE #7223] How to add support for RWKV? #51095

Closed
opened 2026-04-28 18:19:59 -05:00 by GiteaMirror · 2 comments

Originally created by @MollySophia on GitHub (Oct 16, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7223

Hi! I would like to try to make RWKV v6 models work with ollama.
llama.cpp already supports them.

  • Currently ollama fails to load the model due to a bug in llama.cpp. Here's the fix PR: https://github.com/ggerganov/llama.cpp/pull/9907
  • Another issue is the chat template. How should a chat template be added for a new model? Specifically, how does ollama decide which template to use when loading a modelfile?

Thanks!


@rick-github commented on GitHub (Oct 16, 2024):

ollama looks for the field `tokenizer.chat_template` in a GGUF file and then searches for a match in [`template/index.json`](https://github.com/ollama/ollama/blob/main/template/index.json). If it finds a match it will use the ollama template from the [`template`](https://github.com/ollama/ollama/tree/main/template) directory. It doesn't look like [RWKV](https://huggingface.co/collections/RWKV/rwkv-v6-669cb221fe9496b3c693c8e9) publishes an official GGUF file, so you are dependent on third parties. I had a look at [Lyte/RWKV-6-World-3B-v2.1-GGUF](https://huggingface.co/Lyte/RWKV-6-World-3B-v2.1-GGUF/tree/main) and the files don't include a `chat_template`, so ollama will fall back to a default of `{{ .Prompt }}`. However, looking at RWKV's [example usage](https://huggingface.co/RWKV/v6-Finch-14B-HF/blob/main/README.md), the template looks fairly simple:

TEMPLATE="""{{ if .System }}Instruction: {{ .System }}

Input: {{ .Prompt }}

Response:
{{- else }}User: Hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: {{ .Prompt }}

Assistant:{{ end }}"""

@MollySophia commented on GitHub (Oct 16, 2024):

> ollama looks for the field `tokenizer.chat_template` in a GGUF file and then searches for a match in [`template/index.json`](https://github.com/ollama/ollama/blob/main/template/index.json). If it finds a match it will use the ollama template from the [`template`](https://github.com/ollama/ollama/tree/main/template) directory. It doesn't look like [RWKV](https://huggingface.co/collections/RWKV/rwkv-v6-669cb221fe9496b3c693c8e9) publishes an official GGUF file so you are dependent on third parties. I had a look at [Lyte/RWKV-6-World-3B-v2.1-GGUF](https://huggingface.co/Lyte/RWKV-6-World-3B-v2.1-GGUF/tree/main) and the files don't include a chat_template, so ollama will fall back to a default of `{{ .Prompt }}`.

The llama.cpp `convert_hf_to_gguf.py` script tries to parse `tokenizer_config.json` to get the `chat_template` field. However, the RWKV HF models' tokenizer_config doesn't specify one. RWKV uses a unique template anyway.
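
If someone did want the converter to embed a template, one workaround (my assumption, not something RWKV ships) would be to add a `chat_template` entry to `tokenizer_config.json` before running `convert_hf_to_gguf.py`, for example:

```python
import json
from pathlib import Path

# Sketch: inject a Jinja chat_template into tokenizer_config.json so
# that convert_hf_to_gguf.py copies it into the GGUF metadata as
# tokenizer.chat_template. The template string is an illustrative
# guess at the User:/Assistant: format above, not an official one.
cfg_path = Path("tokenizer_config.json")
cfg = json.loads(cfg_path.read_text())

cfg["chat_template"] = (
    "{% for message in messages %}"
    "{% if message['role'] == 'user' %}"
    "User: {{ message['content'] }}\n\n"
    "{% elif message['role'] == 'assistant' %}"
    "Assistant: {{ message['content'] }}\n\n"
    "{% endif %}"
    "{% endfor %}"
    "Assistant:"
)

cfg_path.write_text(json.dumps(cfg, indent=2))
```

Even then, ollama would only recognize it if a matching entry existed in `template/index.json`, so a new model family still needs a template contributed on the ollama side.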

> However, looking at RWKV's [example usage](https://huggingface.co/RWKV/v6-Finch-14B-HF/blob/main/README.md) the template looks fairly simple:

Yes. It's really simple.
