mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 03:18:23 -05:00
[GH-ISSUE #9488] Thinking doesn't show for DeepSeek R1 via API (external connection) #31055
Originally created by @drshliapa on GitHub (Feb 6, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/9488
I connected to Deepseek R1 via its API, but it only shows the final result instead of the thinking steps. Open WebUI version 0.5.10
@ImLuke954 commented on GitHub (Feb 6, 2025):
I can confirm this bug. Having the same issue.
@EricsmOOn commented on GitHub (Feb 6, 2025):
@itshen commented on GitHub (Feb 6, 2025):
ref : https://api-docs.deepseek.com/zh-cn/guides/reasoning_model
@ddowhy commented on GitHub (Feb 6, 2025):
Even for locally hosted models it doesn't show every time: sometimes it does, sometimes it doesn't. Inconsistent behaviour.
@EricSolshkov commented on GitHub (Feb 6, 2025):
This issue shows up nearly 100% of the time when using deepseek-r1 with reasoning, making open-webui incapable of working with this model.
@HeMuling commented on GitHub (Feb 6, 2025):
There is a solution that somehow fixes the problem:
Use this pipe function: https://openwebui.com/f/zgccrui/deepseek_r1 (my recommendation is to increase the timeout in the code).
For more details, see this forum thread: https://linux.do/t/topic/383183; it seems to be related to the openai standard package.
@EliEron commented on GitHub (Feb 6, 2025):
Technically it's not a bug. The issue is that R1 returns the thinking tokens separately from the content itself, in a field called reasoning_content. Since that deviates from the standard OpenAI spec that Open WebUI follows, tjbck has stated very explicitly that they will not support it natively, leaving it up to pipe functions instead.
Personally I think it would be a good idea to add this support natively, if for no other reason than to end the confusion a lot of users clearly face, but it's not up to me to decide. And it is true that you can work around it pretty easily with pipe functions. If you are using OpenRouter, you can use the function found in this GitHub repository. If you are using the DeepSeek API itself, you can use this function. They need different functions because OpenRouter has gone with a slightly different design for how it delivers thinking tokens.
In either case, once configured, the function adds a new model to your model list that acts exactly like the normal R1 endpoint but shows the thinking tokens properly.
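For context, here is a minimal sketch of why the thinking is invisible (field names per the DeepSeek reasoning-model docs linked above; the helper name extract_text is just illustrative): a streamed delta carries the reasoning in reasoning_content, a field the standard OpenAI chat schema does not define, so a spec-only client silently drops it.

```python
def extract_text(delta: dict) -> tuple[str, str]:
    """Return the (reasoning, answer) pieces of one streamed delta."""
    return delta.get("reasoning_content") or "", delta.get("content") or ""

# An R1-style chunk: reasoning present, the answer field still empty.
reasoning, answer = extract_text({"reasoning_content": "Hmm, first...", "content": None})
```

A client that only reads content (as stock Open WebUI does) would render an empty string for this chunk, which is exactly the behaviour reported in this issue.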
@crzroot commented on GitHub (Feb 7, 2025):
Yes, I also have this problem; the thinking is not shown.
@aaronps commented on GitHub (Feb 9, 2025):
I deploy models with vllm. Originally the output was OK (I could see the reasoning), but after adding the new --enable-reasoning flag in vllm there is no reasoning output in open-webui. Yes, we know about reasoning_content, but other tools already work with it nicely.
@fenghan0430 commented on GitHub (Feb 10, 2025):
The same problem arises with the Alibaba Cloud api
@djw520158 commented on GitHub (Feb 10, 2025):
Yes, I also have this problem; the thinking is not shown.
@fq393 commented on GitHub (Feb 11, 2025):
The pipeline at https://openwebui.com/f/zgccrui/deepseek_r1 should preferably be hardened to prevent the "IndexError: list index out of range" problem.
Before the modification:
choice = data.get("choices", [{}])[0]
After the modification:
choices = data.get("choices", [{}])
if not isinstance(choices, list) or len(choices) == 0:  # double safeguard
    yield self._format_error("APIError", "Empty choices in response")
    return
choice = choices[0]
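The guard can be exercised standalone; in this sketch, first_choice is a hypothetical stand-in for the surrounding pipe code, and returning None stands in for the yield self._format_error(...) path.

```python
def first_choice(data: dict):
    """Return the first element of data["choices"], or None if it is missing,
    empty, or not a list (the pipe would yield a formatted APIError instead)."""
    choices = data.get("choices", [{}])
    if not isinstance(choices, list) or len(choices) == 0:  # double safeguard
        return None
    return choices[0]
```

The original one-liner raised IndexError whenever a provider streamed a keep-alive or error chunk with an empty choices array; the guard turns that into a handled error instead of a crash.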
@peter-ch commented on GitHub (Feb 11, 2025):
Can you at least make that pipe built-in or something, why do I have to program things?
@litionls commented on GitHub (Feb 11, 2025):
Using Alibaba's API and Volcengine's API has the same problem, but running the distilled model with ollama is OK.
@AndreyYukavichin commented on GitHub (Feb 12, 2025):
While the reasoning process can be observed when running the distillation model using Ollama, it is not accessible when utilizing the API of the Alibaba Cloud Bailian platform.
@GrayXu commented on GitHub (Feb 12, 2025):
The essence of this issue is that open-webui can only handle content, while the reasoning_content field used by many API providers is not supported. I noticed that tjbck has already stated it will not support reasoning_content, which is very disappointing.
First of all, reasoning_content is clearly the better practice; the tag format is too easily misused. I have reproduced many instances where, during discussions with the model about thinking models, its output tags were incorrectly displayed. A completely isolated field is an elegant implementation strategy. Secondly, tjbck's concern is that different MaaS providers would require different adaptations, but in fact DeepSeek's design has been adopted by the various MaaS providers simultaneously; it is a de facto standard created by a leading company.
BTW, the pipe implementation mentioned earlier lacks scalability and locks down the endpoint.
@Changego commented on GitHub (Feb 13, 2025):
Indeed, when I connect to the DeepSeek provided by the public cloud, the thinking process is not displayed. This is quite frustrating.
@iskradelta commented on GitHub (Feb 13, 2025):
I'd like to give back to the community by documenting what I did to get the DeepSeek R1 thinking step, and also make the GPU machine wake-on-LAN and sleep when nobody is talking to the LLM.
open-webui is running in a Docker container on a different host than the GPU machine running vllm.
Go to Settings, Admin Settings, then the "Functions" tab (which is not very visible). See gist https://gist.github.com/iskradelta/8f1e11e32126b86a6e2a3f3026dad354
Change the MAC address in the code; the 10.88.0.1 IP should be the IP of the wake-on-lan-proxy (Docker container IP) you will run.
If you're on the same LAN and subnet you don't need a wake-on-lan-proxy; you can modify the function to send the WOL packet directly instead.
The wake-on-lan proxy is a simple Dockerfile with an entry.sh: https://gist.github.com/iskradelta/dd46eeb7671338c5ee7858ec2f4b37b9
So just docker build . -t wake-on-lan-proxy and docker run it on the same host as open-webui.
Next, on the computer where you run vllm --enable-reasoning, also add | tee brain.file to write its output to a log file.
Now run this in another shell, or put it in a systemd service.
This will suspend the GPU machine after 900 s (15 min) with no new log output from vllm.
When a user talks to the model through open-webui, the "thinking" function with wake-on-lan will wake it from suspend, and it is ready to think within 3-4 s.
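The wake path above boils down to a single UDP "magic packet"; the gists linked above wrap the same idea in a proxy container. A minimal sketch (the MAC address shown in the test is a placeholder):

```python
import socket

def build_magic_packet(mac: str) -> bytes:
    """A WOL magic packet is 6 bytes of 0xFF followed by the target MAC
    address repeated 16 times (102 bytes total)."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    return b"\xff" * 6 + mac_bytes * 16

def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local subnet (the same-LAN case;
    across subnets you need the proxy described above)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(build_magic_packet(mac), (broadcast, port))
```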
@Ronchy2000 commented on GitHub (Feb 13, 2025):
Wonderful, it works! Thanks a lot.
@fenghan0430 commented on GitHub (Feb 13, 2025):
Using open-webui's pipe feature to post-process the data returned by the API solves the missing think tag problem.
However, for some reason the pipe cannot connect to my local vllm server; I get the error "All connections failed".
@KinglyWayne commented on GitHub (Feb 13, 2025):
For those who lost the opening tag after the r1 chat template update:
The new chat template of deepseek r1 broke the opening tag. Before the official fix, I modified a pipeline from zgccrui to be compatible with backend inference services that omit the opening tag. It is currently running well for me, and you can give it a try.
https://openwebui.com/f/kinglywayne/deepseek_r1_thinkfix
All the work comes from zgccrui; I just added the check logic for the missing opening tag.
zgccrui's original function is at https://openwebui.com/f/zgccrui/deepseek_r1
@Ronchy2000 commented on GitHub (Feb 13, 2025):
I haven't tried that; I just use localhost:8080 directly and it tests fine.
BTW, has anyone tested the web search feature? I tried the Google PSE API that people online suggested, but it errors every time. Is it related to the proxy? Very confused; I'd appreciate an answer from anyone knowledgeable, thanks!
@fenghan0430 commented on GitHub (Feb 13, 2025):
With a proxy, web search works for me.
@Ronchy2000 commented on GitHub (Feb 13, 2025):
Does it need extra configuration? I'm using clash on port 7890.
@fenghan0430 commented on GitHub (Feb 14, 2025):
If you deploy with Docker, you need to use clash's TUN mode. If you use clash's HTTP proxy, look for open-webui's proxy configuration setting.
@lincyang commented on GitHub (Feb 14, 2025):
I deployed deepseek-r1:14b yesterday. With openwebui the think section displays normally, but after submitting a question there is a variable delay before it appears: sometimes 10 seconds, and tens of seconds for hard questions. When I ask via ollama directly, the think tag returns immediately. I don't know where the problem is, please advise!
I also tried the DeepSeek R1_ThinkFix pipe; it behaves the same.
@fenghan0430 commented on GitHub (Feb 14, 2025):
The deepseek api and other apis put the thinking text in reasoning_content, while the model's final result appears in content. open-webui only parses content and not reasoning_content, so the api returns reasoning_content but nothing processes it. A pipe can handle this part manually. If the pipe doesn't work for you, post your configuration and I'll try to help.
@he0119 commented on GitHub (Feb 14, 2025):
The open-webui maintainers' reason for not supporting this is as follows:
I made a simple modification; perhaps it can tide us over until openai standardizes this.
https://hub.docker.com/r/he0119/open-webui
https://github.com/he0119/open-webui/pkgs/container/open-webui
@lwdnxu commented on GitHub (Feb 18, 2025):
Same problem here; is there a solution?
@zx900930 commented on GitHub (Feb 19, 2025):
The <think> block is still not showing up. Model: bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF. Logs:
@he0119 commented on GitHub (Feb 19, 2025):
It looks like your API payload is incomplete. I don't know why.
@roksttr8616 commented on GitHub (Feb 21, 2025):
Help! Me too. Please develop a fix fast!
@tomly2019m commented on GitHub (Feb 21, 2025):
Today I also encountered this issue, but it has been resolved now. The idea was to move the text from reasoning_content into the main content field, and manually add the <think> tag during reasoning. The specific approach involves intercepting the stream returned by the third-party API, modifying its content, and then handing it over to Open WebUI for processing. I will share my modifications tomorrow; it essentially only requires adding an interception function.
@tomly2019m commented on GitHub (Feb 22, 2025):
In backend\open_webui\routers\openai.py, find the generate_chat_completion function and add the interception function below it. This approach works for Volcengine and Bailian; I haven't tried vLLM, but the idea is the same: based on the chunk content returned, paste the reasoning part into content and it will display correctly.
Below that, inside the if "text/event-stream" in r.headers.get("Content-Type", ""): branch, call the interception function and return its result.
If your API returns data in a format that differs from expectations, you can print it out within the interception function and make the corresponding adjustments.
@i-iooi-i commented on GitHub (Feb 22, 2025):
https://github.com/n4ze3m/page-assist
You can temporarily use this browser extension as a substitute, as it displays the complete thought process.
@romeoleung commented on GitHub (Feb 23, 2025):
I tried adding a phrase to the system prompt asking it to output the reasoning_content inside a tag. Here's what I put:
"In your response, please output the reasoning_content as tag."
In Chinese: "回复中,请将reasoning_content转换为tag的内容输出"
It does "work", in that it indeed gave me the CoT in the UI. But I guess it has a cost: I think it first outputs the reasoning_content in the background without showing it, and then in the real reply it wraps the thinking process in a tag and outputs it again. I didn't actually look at the real reasoning_content (I don't know how), so I guess it's producing the thinking process twice.
@cdmusic2019 commented on GitHub (Feb 24, 2025):
Actually, just use he0119's fork, or replace middleware.py with his modified version yourself.
@grea commented on GitHub (Feb 24, 2025):
Is there a concrete link?
@cdmusic2019 commented on GitHub (Feb 24, 2025):
https://github.com/he0119/open-webui
@grea @he0119
@bm2ilabs commented on GitHub (Feb 24, 2025):
I was able to make this work for openrouter
@AndrewTsao commented on GitHub (Feb 25, 2025):
Using the function approach, the Thinking process displays correctly, but I found it can no longer pull in knowledge-base content. Non-function models can.
@Hugh-yw commented on GitHub (Feb 26, 2025):
Deepseek-r1 deployed with the SGLang engine, using the latest open-webui 0.5.16; I configured the function according to the online tutorial, but it still can't display the thinking chain.
@duanhongyi commented on GitHub (Feb 26, 2025):
I solved this problem using a monkey patch, which lets me use the original image with only runtime changes.
docker-compose.yml
./openwebui/patch/main.py
@tomly2019m commented on GitHub (Feb 26, 2025):
The method I provided supports knowledge-base association; see my earlier reply for details, you only need to add one function. Screenshot:
@Hugh-yw commented on GitHub (Feb 26, 2025):
Your solution only supports the public cloud API. The deepseek r1 deployed locally through the sglang engine is incompatible
@Hugh-yw commented on GitHub (Feb 26, 2025):
This solution also works with a local deployment of deepseek-r1; tested OK.
@acejsk commented on GitHub (Feb 26, 2025):
Why not use a Filter?
Will using filters cause performance issues?
from typing import Optional

class Filter:
    ...
    def outlet(self, body: dict, user: Optional[dict] = None) -> dict:
        # Modify or analyze the response body after processing by the API.
        # This hook is the post-processor for the API and can be used to
        # rewrite the response before it is stored.
        ...
The chat history needs to be modified.
@Hugh-yw commented on GitHub (Feb 27, 2025):
@KinglyWayne How is the web search function implemented? Web search is abnormal with the current configuration: no search results found.
@KinglyWayne commented on GitHub (Feb 27, 2025):
Web search is a completely independent feature. As far as I know, if you enable web search you may need to update to the latest version, as a recent version may have broken the default search function. Anyway, this has nothing to do with this pipeline.
@he0119 commented on GitHub (Feb 28, 2025):
After v0.5.17 was released, we can use the new stream filter to do this.
Perhaps there are some specific scenarios that haven't been considered; waiting for someone more experienced to improve the script.
https://openwebui.com/f/he0119/reasoning
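For reference, a minimal sketch of what such a stream filter can look like, modeled on he0119's published function above (hedged: the exact hook semantics may differ across Open WebUI versions; this assumes the filter's stream method receives each streamed event as a dict):

```python
class Filter:
    def __init__(self):
        self.thinking = False  # are we currently inside the reasoning phase?

    def stream(self, event: dict) -> dict:
        """Rewrite each streamed event so the UI's <think> handling fires."""
        for choice in event.get("choices", []):
            delta = choice.get("delta", {})
            reasoning = delta.pop("reasoning_content", None)
            if reasoning:
                prefix = "" if self.thinking else "<think>"
                self.thinking = True
                delta["content"] = prefix + reasoning
            elif self.thinking and delta.get("content"):
                self.thinking = False
                delta["content"] = "</think>" + delta["content"]
        return event
```

Compared with the router patch discussed earlier, a filter like this needs no fork of the backend and can be toggled per model or globally from the admin UI.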
@duanhongyi commented on GitHub (Feb 28, 2025):
Best answer
@i-iooi-i commented on GitHub (Feb 28, 2025):
@he0119 I added this function and the chain of thought does show, but the answer content gets folded into it as well.
And after the answer finishes, it keeps showing the "thinking" spinner. 🙈🙈
https://github.com/user-attachments/assets/c3296495-b605-424f-9ad9-75b2f877c77c
@he0119 commented on GitHub (Feb 28, 2025):
也许有什么特别的情况没考虑到,等一个大佬完善脚本吧,我只遇到过输出有
<think>标签出错的情况🤣@i-iooi-i commented on GitHub (Feb 28, 2025):
@he0119 Okay, got it. 😂😂
@duanhongyi commented on GitHub (Feb 28, 2025):
@i-iooi-i
Try this one; the event id should be deleted once the tag closes.
@i-iooi-i commented on GitHub (Feb 28, 2025):
https://github.com/user-attachments/assets/73c42fb9-b3c7-4d3d-b64c-a86dac6413b5
It works now 👍!
Enable the function, and turn on "Global" in the function settings (without Global there seems to be no chain of thought).
Ask a question; there may be some latency (possibly DeepSeek's official servers), but after waiting a bit you can see the chain of thought and the output. 🙇
@cdmusic2019 commented on GitHub (Feb 28, 2025):
Tested and it works now. Thanks! @duanhongyi @he0119
@FFDVDGD commented on GitHub (Feb 28, 2025):
Hi, after using the script it shows the thinking process but not the result. I'm using Alibaba's API.
@duanhongyi commented on GitHub (Feb 28, 2025):
@FFDVDGD
Alibaba's return value is slightly different. I tweaked it a bit to be compatible with both modes; give it a try:
@qsqfcg commented on GitHub (Mar 12, 2025):
If I add my local vllm server directly in open-webui, there is no <think> opening tag, but there is a </think> closing tag. With the pipe, I followed this article, https://hadb.me/posts/2025/display-deepseek-r1-thinking, and asking a question just throws an error.