[GH-ISSUE #8706] Better <think> block rendering for DeepSeek-R1 and similar #102219

Closed
opened 2026-05-17 23:37:58 -05:00 by GiteaMirror · 23 comments
Owner

Originally created by @coder543 on GitHub (Jan 21, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8706

Originally assigned to: @tjbck on GitHub.

Is your feature request related to a problem? Please describe.
Currently, when reasoning LLMs return a <think> block as part of their output, it is rendered in the same way as regular text responses… just surrounded by the unfortunate tags. This makes it difficult to visually distinguish between the reasoning process and the final answer. Additionally, lengthy <think> blocks can clutter the interface and reduce readability for users who primarily want to focus on the final output.

Describe the solution you'd like
I propose that the <think> block be rendered distinctly from the normal response, with a visual differentiation (e.g., a shaded background, a border, or an indented box). Furthermore, the <think> block should be collapsible, allowing users to expand or hide it as needed. By default, the block could be collapsed, with an indicator to expand it for users who are interested in understanding the detailed reasoning process.

Describe alternatives you've considered

  1. Using a toggle option in the settings to enable or disable <think> block rendering entirely.
  2. Highlighting <think> blocks with simple visual markers (e.g., italics or a different font) instead of full collapsibility.

Additional context
This enhancement would make the UI more user-friendly, especially for users who want a clean response while still having the option to delve into the reasoning when needed. Here’s a simple example mockup of how it might look:

  • Collapsed <think> block: [+] Reasoning available. Click to expand.
  • Expanded <think> block: A visually distinct, bordered box containing the detailed reasoning.
Originally created by @coder543 on GitHub (Jan 21, 2025). Original GitHub issue: https://github.com/open-webui/open-webui/issues/8706 Originally assigned to: @tjbck on GitHub. **Is your feature request related to a problem? Please describe.** Currently, when reasoning LLMs return a `<think>` block as part of their output, it is rendered in the same way as regular text responses… just surrounded by the unfortunate <think> tags. This makes it difficult to visually distinguish between the reasoning process and the final answer. Additionally, lengthy `<think>` blocks can clutter the interface and reduce readability for users who primarily want to focus on the final output. **Describe the solution you'd like** I propose that the `<think>` block be rendered distinctly from the normal response, with a visual differentiation (e.g., a shaded background, a border, or an indented box). Furthermore, the `<think>` block should be collapsible, allowing users to expand or hide it as needed. By default, the block could be collapsed, with an indicator to expand it for users who are interested in understanding the detailed reasoning process. **Describe alternatives you've considered** 1. Using a toggle option in the settings to enable or disable `<think>` block rendering entirely. 2. Highlighting `<think>` blocks with simple visual markers (e.g., italics or a different font) instead of full collapsibility. **Additional context** This enhancement would make the UI more user-friendly, especially for users who want a clean response while still having the option to delve into the reasoning when needed. Here’s a simple example mockup of how it might look: - Collapsed `<think>` block: `[+] Reasoning available. Click to expand.` - Expanded `<think>` block: A visually distinct, bordered box containing the detailed reasoning.
Author
Owner

@focomfy commented on GitHub (Jan 21, 2025):

Additionally, according to the official API documentation, it appears that the chain of thought is only included in the latest response, and there is no need to send the old chain of thought to the model.

api docs

Multi-round Conversation

In each round of the conversation, the model outputs the CoT (reasoning_content) and the final answer (content). In the next round of the conversation, the CoT from previous rounds is not concatenated into the context, as illustrated in the following diagram:

Image

Please note that if the reasoning_content field is included in the sequence of input messages, the API will return a 400 error. Therefore, you should remove the reasoning_content field from the API response before making the API request, as demonstrated in the [API example](https://api-docs.deepseek.com/guides/reasoning_model#api-example).

<!-- gh-comment-id:2603824956 --> @focomfy commented on GitHub (Jan 21, 2025): Additionally, according to the official API documentation, it appears that the chain of thought is only included in the latest response, and there is no need to send the old chain of thought to the model. **[api docs](https://api-docs.deepseek.com/guides/reasoning_model)** > ## Multi-round Conversation > > In each round of the conversation, the model outputs the CoT (`reasoning_content`) and the final answer (`content`). In the next round of the conversation, the CoT from previous rounds is not concatenated into the context, as illustrated in the following diagram: > > ![Image](https://github.com/user-attachments/assets/bfa3d9b8-0022-454d-8c9d-6ffce412bba3) > > Please note that if the `reasoning_content` field is included in the sequence of input messages, the API will return a `400` error. Therefore, you should remove the `reasoning_content` field from the API response before making the API request, as demonstrated in the [[API example](https://api-docs.deepseek.com/guides/reasoning_model#api-example)](https://api-docs.deepseek.com/guides/reasoning_model#api-example).
Author
Owner

@tjbck commented on GitHub (Jan 21, 2025):

@focomfy API responses should be handled by Pipe functions here.

<!-- gh-comment-id:2603850925 --> @tjbck commented on GitHub (Jan 21, 2025): @focomfy API responses should be handled by Pipe functions here.
Author
Owner

@vladislavdonchev commented on GitHub (Jan 21, 2025):

I'm handling this in a Pipeline for now and outputting only curated data to the user. I will make that code public in a few days... In the meanwhile it is a real possibility that someone just adds UI support here as well, doesn't look too complicated to implement. :)

<!-- gh-comment-id:2604187692 --> @vladislavdonchev commented on GitHub (Jan 21, 2025): I'm handling this in a Pipeline for now and outputting only curated data to the user. I will make that code public in a few days... In the meanwhile it is a real possibility that someone just adds UI support here as well, doesn't look too complicated to implement. :)
Author
Owner

@Lyzin commented on GitHub (Jan 21, 2025):

Yes, I see that on the deepseek website, they use grayed-out markdown citations for the think process, which looks good

<!-- gh-comment-id:2604538824 --> @Lyzin commented on GitHub (Jan 21, 2025): Yes, I see that on the deepseek website, they use grayed-out markdown citations for the think process, which looks good
Author
Owner

@fireblade2534 commented on GitHub (Jan 21, 2025):

#7438 has some code that could probably be adapted (Idk why its closed)

<!-- gh-comment-id:2605897035 --> @fireblade2534 commented on GitHub (Jan 21, 2025): #7438 has some code that could probably be adapted (Idk why its closed)
Author
Owner

@roeja commented on GitHub (Jan 22, 2025):

I was testing this today and think its straight forward to add the logic by using a marked extension same as the details token. Could use some better formatting like in #7438

My example:
Closed:
Image
Open:
Image

Code Change for Example

<!-- gh-comment-id:2606119693 --> @roeja commented on GitHub (Jan 22, 2025): I was testing this today and think its straight forward to add the logic by using a marked extension same as the `details` token. Could use some better formatting like in #7438 My example: Closed: ![Image](https://github.com/user-attachments/assets/4984ccfa-988c-450e-8994-a79e13a8c7e7) Open: ![Image](https://github.com/user-attachments/assets/d432c662-bc82-49a9-b5e6-3b9c02e72155) [Code Change for Example](https://github.com/roeja/open-webui/commit/1b548c81d56c7dcdcbdc58de24c752d9f89d6f69)
Author
Owner

@lowlyocean commented on GitHub (Jan 22, 2025):

Those using "current model" as their Task model will see title generation also render as <think>..., so maybe a solution can elegantly handle that (in addition to the rendering)?

<!-- gh-comment-id:2606268462 --> @lowlyocean commented on GitHub (Jan 22, 2025): Those using "current model" as their Task model will see title generation also render as `<think>...`, so maybe a solution can elegantly handle that (in addition to the rendering)?
Author
Owner

@silentoplayz commented on GitHub (Jan 22, 2025):

Related filter function that users of Open WebUI have decided to utilize for the time being - https://github.com/AaronFeng753/Better-R1

OP posted about his function to the r/LocalLLaMA subreddit, here.

<!-- gh-comment-id:2606367644 --> @silentoplayz commented on GitHub (Jan 22, 2025): Related filter function that users of Open WebUI have decided to utilize for the time being - https://github.com/AaronFeng753/Better-R1 OP posted about his function to the [r/LocalLLaMA subreddit](https://www.reddit.com/r/LocalLLaMA/), [here](https://www.reddit.com/r/LocalLLaMA/comments/1i6b65q/better_r1_experience_in_open_webui/).
Author
Owner

@evenkeelhuang commented on GitHub (Jan 22, 2025):

Better-R1 cannot resolve the issue of automatically generated titles rendering as <think>....

<!-- gh-comment-id:2606394044 --> @evenkeelhuang commented on GitHub (Jan 22, 2025): Better-R1 cannot resolve the issue of automatically generated titles rendering as `<think>...`.
Author
Owner

@c-hoffmann commented on GitHub (Jan 22, 2025):

I would appreciate an option in admin/settings/models/specific_model to hide the thoughts from the user completely and instead just show "thinking" with an animation that hints that the thinking-process is still ongoing.

<!-- gh-comment-id:2606480176 --> @c-hoffmann commented on GitHub (Jan 22, 2025): I would appreciate an option in admin/settings/models/_specific_model_ to hide the thoughts from the user completely and instead just show "thinking" with an animation that hints that the thinking-process is still ongoing.
Author
Owner

@peter-ch commented on GitHub (Jan 22, 2025):

The DeepSeek API does output the thinking tokens, can you please render them? It's really frustrating not to be able to see any activity and wait like 5 minutes for a response. I don't know whether that's a problem or a long CoT.

From the API docs:

for chunk in response:
    if chunk.choices[0].delta.reasoning_content:
        reasoning_content += chunk.choices[0].delta.reasoning_content
    else:
        content += chunk.choices[0].delta.content
<!-- gh-comment-id:2606494641 --> @peter-ch commented on GitHub (Jan 22, 2025): The DeepSeek API does output the thinking tokens, can you please render them? It's really frustrating not to be able to see any activity and wait like 5 minutes for a response. I don't know whether that's a problem or a long CoT. From the API docs: ```python for chunk in response: if chunk.choices[0].delta.reasoning_content: reasoning_content += chunk.choices[0].delta.reasoning_content else: content += chunk.choices[0].delta.content ```
Author
Owner

@tjbck commented on GitHub (Jan 22, 2025):

Image

tag support added to dev!

<!-- gh-comment-id:2606577899 --> @tjbck commented on GitHub (Jan 22, 2025): <img width="705" alt="Image" src="https://github.com/user-attachments/assets/8fc4b903-31e0-4e52-a59f-12984b435144" /> <think> tag support added to dev!
Author
Owner

@fireblade2534 commented on GitHub (Jan 22, 2025):

@tjbck When you click the speak button it speaks the thinking tags which it should not. Also It shows thinking tags in the AI generated titles. When you highlight something and ask R1 to explain/ask, the thinking tags are shown

<!-- gh-comment-id:2607889871 --> @fireblade2534 commented on GitHub (Jan 22, 2025): @tjbck When you click the speak button it speaks the thinking tags which it should not. Also It shows thinking tags in the AI generated titles. When you highlight something and ask R1 to explain/ask, the thinking tags are shown
Author
Owner

@tjbck commented on GitHub (Jan 22, 2025):

TTS issue has been addressed in dev. As for the title generation you might want to set a separate task model in this case!

<!-- gh-comment-id:2607937099 --> @tjbck commented on GitHub (Jan 22, 2025): TTS issue has been addressed in dev. As for the title generation you might want to set a separate task model in this case!
Author
Owner

@fireblade2534 commented on GitHub (Jan 22, 2025):

Fair enough for the title. What about "When you highlight something and ask R1 to explain/ask, the thinking tags are shown"

<!-- gh-comment-id:2608019087 --> @fireblade2534 commented on GitHub (Jan 22, 2025): Fair enough for the title. What about "When you highlight something and ask R1 to explain/ask, the thinking tags are shown"
Author
Owner

@tjbck commented on GitHub (Jan 22, 2025):

Have not yet decided what to do here, but potentially will be addressed in the subsequent releases!

<!-- gh-comment-id:2608031035 --> @tjbck commented on GitHub (Jan 22, 2025): Have not yet decided what to do here, but potentially will be addressed in the subsequent releases!
Author
Owner

@lowlyocean commented on GitHub (Jan 22, 2025):

Is the Ask/Explain using the selected "Task" model, similar to title generation? It seems reasoning models are generally not a good fit for Tasks (where succinct, or "immediate", responses are expected) - especially ones that place reasoning tokens in the response - so perhaps we can offer that guidance in the Docs or Settings UI?

<!-- gh-comment-id:2608042316 --> @lowlyocean commented on GitHub (Jan 22, 2025): Is the Ask/Explain using the selected "Task" model, similar to title generation? It seems reasoning models are generally not a good fit for Tasks (where succinct, or "immediate", responses are expected) - especially ones that place reasoning tokens in the response - so perhaps we can offer that guidance in the Docs or Settings UI?
Author
Owner

@gnouts commented on GitHub (Jan 23, 2025):

As for the title generation you might want to set a separate task model in this case!

Currently my Nvidia card can barely load one R1 14b and has to unload/reload another model to do the title. The second message is then as slow as the first one, waiting to reload R1.
I've tried very small model for title, that could fit along R1, but they fail at following the prompt (either they skip emoji, or write a super long sentence or completely miss the summary).

It seems reasoning models are generally not a good fit for Tasks

I also get that but I don't see a better solution for me right now :/
Currently, for user experience, I'd rather have no title than unload/reload models twice at each chat start.

Would you reconsider opening this issue to keep track of the title generation ?

<!-- gh-comment-id:2609095689 --> @gnouts commented on GitHub (Jan 23, 2025): > As for the title generation you might want to set a separate task model in this case! Currently my Nvidia card can barely load one R1 14b and has to unload/reload another model to do the title. The second message is then as slow as the first one, waiting to reload R1. I've tried very small model for title, that could fit along R1, but they fail at following the prompt (either they skip emoji, or write a super long sentence or completely miss the summary). > It seems reasoning models are generally not a good fit for Tasks I also get that but I don't see a better solution for me right now :/ Currently, for user experience, I'd rather have no title than unload/reload models twice at each chat start. Would you reconsider opening this issue to keep track of the title generation ?
Author
Owner

@alexfromapex commented on GitHub (Jan 24, 2025):

I changed the model to llama3 for generating titles, works pretty well, but it would be nice if it didn't consider anything inside the <think></think> tags for title generation or if it was configurable

<!-- gh-comment-id:2611315060 --> @alexfromapex commented on GitHub (Jan 24, 2025): I changed the model to llama3 for generating titles, works pretty well, but it would be nice if it didn't consider anything inside the `<think></think>` tags for title generation or if it was configurable
Author
Owner

@aw632 commented on GitHub (Feb 2, 2025):

Is this done yet? It doesn't work.

<!-- gh-comment-id:2629193954 --> @aw632 commented on GitHub (Feb 2, 2025): Is this done yet? It doesn't work.
Author
Owner

@hksquinson commented on GitHub (Feb 2, 2025):

I have been trying out the reasoning models and while it works for ollama models, I can't seem to get the reasoning to show up when I am using OpenRouter or the official Deepseek API. Anyone who has worked around it?

<!-- gh-comment-id:2629274655 --> @hksquinson commented on GitHub (Feb 2, 2025): I have been trying out the reasoning models and while it works for ollama models, I can't seem to get the reasoning to show up when I am using OpenRouter or the official Deepseek API. Anyone who has worked around it?
Author
Owner

@prasoon2211 commented on GitHub (Feb 6, 2025):

Would also love to know how to show thinking with OpenRouter for Deepseek-R1

<!-- gh-comment-id:2639554985 --> @prasoon2211 commented on GitHub (Feb 6, 2025): Would also love to know how to show thinking with OpenRouter for Deepseek-R1
Author
Owner

@arty-hlr commented on GitHub (Feb 25, 2025):

@hksquinson @prasoon2211 There is this so far, we're trying to also make it work for Claude 3.7, but it works for me with Deepseek-R1.

<!-- gh-comment-id:2682443497 --> @arty-hlr commented on GitHub (Feb 25, 2025): @hksquinson @prasoon2211 There is [this](https://github.com/rmarfil3/openwebui-openrouter-reasoning-tokens) so far, we're trying to also make it work for Claude 3.7, but it works for me with Deepseek-R1.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#102219