[GH-ISSUE #8547] deepseek-r1 qwen variants use a new pre-tokenizer, which is not implemented in the llama.cpp version used #31275

Closed
opened 2026-04-22 11:35:29 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @sealad886 on GitHub (Jan 23, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8547

What is the issue?

The newly supported `deepseek-r1` model variants that have `distill-qwen` in the name use a new pre-tokenizer. Support for this has been added to the latest llama.cpp (not sure if it is in the release version or just the latest commit on the main branch).

The backend llama.cpp that Ollama uses should be updated to support this, since the `default` pre-tokenizer is very different from the bespoke version.
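For anyone who wants to check whether a local GGUF was converted with the new pre-tokenizer or fell back to `default`, here is a rough sketch using the gguf-py package that ships with llama.cpp. The `tokenizer.ggml.pre` key is standard GGUF metadata; the expected value string for these models and the field-access details below are assumptions and may differ between gguf-py versions:

```python
# Sketch: inspect which pre-tokenizer a converted GGUF file declares.
# Assumes the gguf-py package bundled with llama.cpp; the parts/data
# layout of ReaderField may vary slightly across gguf-py versions.
import sys
from gguf import GGUFReader

def read_pre_tokenizer(gguf_path: str) -> str:
    reader = GGUFReader(gguf_path)
    field = reader.fields.get("tokenizer.ggml.pre")
    if field is None:
        # Older conversions omit the key; llama.cpp then falls back to
        # the generic "default" pre-tokenizer.
        return "default (key missing)"
    # For string-typed fields, the last data part holds the UTF-8 bytes.
    return bytes(field.parts[field.data[-1]]).decode("utf-8")

if __name__ == "__main__":
    print(f"tokenizer.ggml.pre = {read_pre_tokenizer(sys.argv[1])}")
```

A model that prints `default` here (or is missing the key) goes through the generic pre-tokenizer path, which is the mismatch described above.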

OS

Linux, macOS, Windows, Docker, WSL2

GPU

No response

CPU

No response

Ollama version

0.5.7

GiteaMirror added the bug label 2026-04-22 11:35:29 -05:00

Reference: github-starred/ollama#31275