mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #16217] feat: Allow Editing of Reasoning/Thinking Section in Chat Mode #56492
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Originally created by @OracleToes on GitHub (Aug 1, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/16217
Check Existing Issues
Problem Description
Currently, when interacting with an LLM in the Open WebUI chat interface, the model's reasoning process is displayed within a collapsible "thinking" section (using
<details>and<think>tags). While this section can be opened to view the live token stream during the reasoning process, the contents cannot be directly edited when attempting to modify a message. In edit mode, the reasoning section appears as a placeholder like<details id="__DETAIL_0__"/>. This prevents users from refining or correcting the LLM's thought process within the context of an ongoing conversation.This functionality is available in the Open WebUI playground, suggesting it's technically feasible but not implemented in the chat interface.
I am aware of #9034 and #9044 but for those running local models, this limitation is irrelevant.
Desired Solution you'd like
The ideal solution should enable direct editing of the content within the
<details>or<think>tags during message modification in the Open WebUI chat interface. This would provide users, especially those running local models, with greater control over the LLM's thought process.Allowing this editing would facilitate correction of errors or logical flaws within the reasoning steps, and enable much finer control/steering of the model.
I find this to be important because even if you're running local models, time is still a valuable resourceBeing able to stop the model after it's already generated a good chunk of reasoning is often better than regenerating the entire response.
Alternatives Considered
This could be a toggle in the Admin Panel, so that it can be disabled for those who want to restrict this function for their users.
Additional Context
When manually typing out a
<think>section and then submitting the message, the browser running the webui tab will lock up and become unresponsive.@jdwx commented on GitHub (Aug 19, 2025):
I have run into a similar issue. However, because thinking/details aren't carried forward to future generation requests, implementation of this would probably also need some way to differentiate between regenerating the whole response (including reasoning) and regenerating the message content starting after the reasoning section, reusing the existing reasoning. (Which might also be generally useful with reasoning models.)
@Classic298 commented on GitHub (Aug 19, 2025):
But how, since the reasoning is never sent back to the model. Any edits are irrelevant as the reasoning text of old responses is never sent to the model.
What would be the usecase for this? I really do not see it.
@jdwx commented on GitHub (Aug 19, 2025):
@Classic298 I am, and I believe OracleToes is also, referring to editing thinking for the current response, the one where the thinking is still part of the context, because it's all part of the same response.
What I think we're asking for is the ability to edit the thinking to:
And then have "Regenerate" pick up from there. Or, if you prefer, "Continue Response" from that point. (Although I would really, really like the option to keep the thinking and regenerate the rest of the response.)
This is obviously a simple, contrived example, but it's incredibly painful to see a large model spend 90 seconds "reasoning," get one detail backwards that tanks the response, and have no way to fix it that doesn't involve regenerating the whole response, waiting though another 90 seconds of thinking, and hoping it doesn't make the same mistake again.
I guess it technically doesn't have to be limited to the current response, but you're right that editing the thinking of any response but the current response would have no effect unless you branch the conversation there, making it "current" again. In which case it might be quite important.
Does that make sense?
@jdwx commented on GitHub (Aug 19, 2025):
Actually, I just tested this, and it does seem like if you edit the response at all and then select continue response, that the previous thinking is completely discarded and the response is continued without any reasoning at all.
Is that intentional? It makes perfect sense that the reasoning text of old responses isn't sent to the model. But the reasoning text of the current response is quite a different matter.
Losing it as well is certainly not what I would have expected, nor the behavior I would want.
@Classic298 commented on GitHub (Aug 20, 2025):
Regenerate would regenerate it - totally new, with any old generated messages obviously discarded.
Continue Response - conceptually it makes sense, but when would this be usable? In scenarios of external models (openai connectivity) this isn't possible at all.
And for local models (ollama) it would only even be feasible if you have a very slow model (like 1 token per second) and you are actually able to click the stop button in time to stop the response.
@rgaricano commented on GitHub (Aug 20, 2025):
maybe a workaround to test could be send to llm:
regenerate response but using this modified thinking: xxxxxx
When I tried this way (qwen3:8b) model responses are faster and showing in details the new thinking that I sent.
Test it, and if it work in this way maybe could be interesting some integration for modify thinking and regenerate new responses just using it.
Maybe an inconvenient could be the different use of thinking/reasoning tags by models.
@jdwx commented on GitHub (Aug 20, 2025):
I will be the first to admit I have literally never used Open WebUI with a closed model provider like OpenAI. I use it with local models, usually on llama.cpp, ik_llama.cpp, or VLLM. So it sounds like maybe there is some difference of perspective.
In the local model context,
<think>is just another token that appears in the response. So there is no problem at all with sending a chat continuation API call with<think>...</think>-enclosed content in the response being continued. I do it all the time in programs that access the API directly. I find myself missing this capability when I'm using Open WebUI.As far as I can tell, the core of this request is the ability to easily get from this conversation:
to this API submission to http://my.server:12345/v1/chat/completions:
@tom9358 commented on GitHub (Mar 3, 2026):
Do you know if this is also how the local models work when used with openwebui? i.e., that the thoughts text is included in the chat history with the next prompt?
@LIU-Yinyi commented on GitHub (Mar 6, 2026):
Hi @Classic298 and @tom9358 ,
I confirmed that the previous thinking/reasoning blocks would also feed into next round chat. This feature cannot turn off and the thinking/reasoning blocks cannot be edited under
v0.8.8. As shown in the snapshot, I have already manually edited the answer toSure, the prompt is:. But in the next round, the previous thinking/reasoning blocks of<details type="reasoning" ...>still pollutes the current chat (next round thinking mentioned it).Therefore, adding support for editing thinking/reasoning blocks is essential for researchers working on LLM security/jailbreak. Thanks for considering the feature.
@LIU-Yinyi commented on GitHub (Mar 6, 2026):
Since the response format may vary, unusual protocol may lead to the unexpected results (append the thinking/reasoning blocks to next round chat) I showed in the last thread. Enabling fully editable blocks (contain all details and contents) should help.
@Classic298 commented on GitHub (Mar 6, 2026):
This is perplexing - i was under the impression previous turns' thinking blocks should not be sent back to the API - only same-turn thinking blocks (which are needed for tool calling context and as to not interrupt the model's logic/thinking flow)
i will look into both issues
@Classic298 commented on GitHub (Mar 6, 2026):
Ok so reasoning IS BEING SENT
which is AGAINST OPENAI SPEC
But other providers like Anthropic RECOMMEND sending previous reasoning
so we have a bit of a situation here
@tjbck further decisions needed
@Classic298 commented on GitHub (Mar 6, 2026):
the current code already sends reasoning from all previous turns (via process_messages_with_output using raw=True indiscriminately
@LIU-Yinyi commented on GitHub (Mar 6, 2026):
Sound like a tricky bug :D
Good to add a switch to let users decide which convention to follow (OpenAI's or Anthropic's).
Also good to enable thinking/reasoning block editing (never leash customization).
@JiwaniZakir commented on GitHub (Mar 14, 2026):
The core problem is that the edit mode serializer is replacing
<details>/<think>blocks with<details id="__DETAIL_0__"/>placeholders instead of preserving the actual content for editing. I can fix this by modifying the message edit component to deserialize those placeholders back into their original content (or skip the placeholder substitution entirely when entering edit mode), similar to how the playground already handles it. I'll dig into the chat message component to see where the placeholder swap happens and make the thinking content editable inline.@JiwaniZakir commented on GitHub (Mar 14, 2026):
Stepping back from this one — my implementation didn't pass the project's quality gates. Unassigning myself so someone else can take a crack at it.
@a4lg commented on GitHub (Mar 18, 2026):
@Classic298
Can I ask where can we find Anthropic's recommendation about including all previous reasoning blocks?
I could find only effectively the opposite but maybe I'm missing something: