[GH-ISSUE #11259] issue: Think tags not detected if opening tag is in prompt template? #54827
Originally created by @bjj on GitHub (Mar 6, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/11259
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.5.20 (latest)
Ollama Version (if applicable)
No response
Operating System
Ubuntu 24.04
Browser (if applicable)
Edge
Confirmation
README.md
Expected Behavior
After prompting QwQ-32B, the initial <think> should be recognized as thinking when it is part of the prompt template (see the end of https://huggingface.co/Qwen/QwQ-32B/blob/main/tokenizer_config.json and compare with the preview: https://huggingface.co/Qwen/QwQ-32B-Preview/blob/main/tokenizer_config.json).
Actual Behavior
Thinking starts streaming in as an answer, eventually ending in </think>, but Open WebUI didn't recognize it.
Steps to Reproduce
Get QwQ-32B (not the preview), submit any prompt (thinking is forced because it's in the chat template).
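A minimal reproduction sketch, assuming a local vLLM server on port 8000 serving Qwen/QwQ-32B (the endpoint, port, and prompt are illustrative, not taken from the report):

```python
# Reproduction sketch: query an OpenAI-compatible server whose chat
# template already appended "<think>\n" to the prompt, so the reply
# begins mid-reasoning and contains only the closing tag.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed vLLM address
    json={
        "model": "Qwen/QwQ-32B",
        "messages": [{"role": "user", "content": "What is 2 + 2?"}],
        "max_tokens": 512,
    },
    timeout=300,
)
text = resp.json()["choices"][0]["message"]["content"]

print("starts with <think>:", text.lstrip().startswith("<think>"))  # False
print("contains </think>: ", "</think>" in text)                    # True
```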
Logs & Screenshots
Additional Information
No response
@bjj commented on GitHub (Mar 6, 2025):
This might be specific to vLLM, since it is using the tokenizer_config literally.
llama-serve, at least, seems to prune it off of the prompt and let the model generate it (?)
@mindkrypted commented on GitHub (Mar 6, 2025):
@bjj
It's not specific to vLLM; I'm using TabbyAPI to serve the quantized model with exllamaV2, and it's also not being recognized properly within Open WebUI.
There's an open discussion on the HF model's page https://huggingface.co/Qwen/QwQ-32B/discussions/4
The <think> tag is provided by the inference server, but it's not displayed in Open WebUI (see the screenshot with detailed logs).
@Lzhang-hub commented on GitHub (Mar 6, 2025):
It is because the first <think> is in the chat template, so the model's first output token is not <think>, and Open WebUI cannot detect it. https://huggingface.co/Qwen/QwQ-32B/blob/main/tokenizer_config.json
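In other words, the stream carries a dangling </think>. A hypothetical client-side normalization sketch of a workaround (the helper name and re-inserted tag are illustrative, not Open WebUI's actual code):

```python
def normalize_think_tags(reply: str) -> str:
    """Re-insert a missing opening <think> tag so tag-based parsing
    can pair it with the closing tag (hypothetical helper)."""
    if "</think>" in reply and "<think>" not in reply:
        return "<think>\n" + reply
    return reply

# The failure mode from this thread: reasoning arrives with no opening tag.
print(normalize_think_tags("The sum is trivial.\n</think>\n2 + 2 = 4"))
```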

@alvarolopez commented on GitHub (Mar 7, 2025):
This is also happening with DeepSeek models (cf. https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B/blame/main/tokenizer_config.json).
The change was introduced weeks ago in 74fbf131a9.
@mindkrypted commented on GitHub (Mar 8, 2025):
With TabbyAPI, I'm able to get the "normal" <think> tag in the output after removing it from the chat template.
The end looks like this after the modification:
```jinja
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
{%- endif %}
```
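For comparison, a reconstruction of the unmodified template ending, with the opening tag baked into the generation prompt (based on the tokenizer_config.json linked above; the exact whitespace may differ):

```jinja
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n<think>\n' }}
{%- endif %}
```

Because this <think> is consumed as prompt context rather than generated, Open WebUI never sees an opening tag in the streamed completion.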