Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-06 19:08:59 -05:00)
[GH-ISSUE #12135] Non-streaming tool call ignored, never resolves #32009
Originally created by @Arokha on GitHub (Mar 27, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/12135
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.5.20
Ollama Version (if applicable)
n/a
Operating System
Ubuntu 22.04
Browser (if applicable)
Firefox
Confirmation
Expected Behavior
A model configured with native tool-calling functionality performs tool calls normally, and Open WebUI resolves them normally by invoking the tool functions and returning the data, regardless of the streaming-response setting.
Actual Behavior
When streaming responses are disabled, tool calls fail to resolve. The response is returned to Open WebUI but appears to simply be ignored, and the chat remains in a 'waiting for response' state forever.
Steps to Reproduce
(See my configuration below)
In a chat, set Stream Chat Response: Off
Make a chat message that will cause the (native tool-calling) LLM to invoke a tool. In my case, I'm using the example get date/time tools.
Chat spins forever. Click stop button to have it give up.
(Note: OpenAI models don't like to reply AND invoke tools in the same response. Performing this test against an Anthropic model results in a 'Sure, let me do that for you.' type response and the tool call, but again, it is dropped in the same way as an OpenAI model with streaming disabled.)
Change chat setting 'Stream Chat Response: On', and click 'Regenerate' on the LLM's response (which is just the ghost-text placeholder stuff... I forget the web dev term for that).
The tool executes, and is returned to the LLM, the LLM takes it and produces a reply to it.
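For context, the resolution step described above can be sketched as a minimal client-side loop (a hypothetical simplification, not Open WebUI's actual code): in a non-streaming chat-completions response, tool calls arrive in `choices[0].message.tool_calls`, and the client must detect them, execute the tools, and send `"tool"` role messages back to the model. Skipping this step is exactly what leaves the chat waiting for text that never comes. The tool name and response shape below are illustrative.

```python
import json

# Hypothetical local tool, standing in for the example get date/time tool.
def get_current_date() -> str:
    return "2025-03-27"

TOOLS = {"get_current_date": get_current_date}

def handle_non_streaming_response(response: dict) -> list[dict]:
    """Resolve tool calls from a non-streaming chat-completions response.

    Returns the "tool" role messages that should be sent back to the model.
    This is the step that, per this report, never happens when streaming
    is off.
    """
    message = response["choices"][0]["message"]
    results = []
    for call in message.get("tool_calls") or []:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"] or "{}")
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(TOOLS[name](**args)),
        })
    return results

# A minimal non-streaming response shaped like the OpenAI
# chat-completions API: no text content, only a tool call.
sample = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_current_date", "arguments": "{}"},
            }],
        },
        "finish_reason": "tool_calls",
    }]
}

print(handle_non_streaming_response(sample))
# → [{'role': 'tool', 'tool_call_id': 'call_1', 'content': '2025-03-27'}]
```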
Logs & Screenshots
I start a new chat with:
"Check the date and tell it to me" with streaming disabled. After my request to GPT-4o-mini, it replies with:
You can see this appears to be a normal attempt by the LLM to invoke a tool. But nothing is ever done with it. The chat remains spinning forever. The Open WebUI Docker container's only relevant log line (how can I turn up logging?) is the 200 status result from the above:
open-webui | 2025-03-27 22:29:33.990 | INFO | uvicorn.protocols.http.httptools_impl:send:476 - 2600:<ipv6 redacted> - "POST /api/chat/completions HTTP/1.1" 200 - {}
If I perform it with streaming ENabled:
Works fine.
Additional Information
I'm using litellm as a proxy server to OpenAI, Anthropic, etc.
My setting on the admin-level model config is streaming: yes.
I did my testing against both OpenAI and Anthropic models, with the only difference being whether they reply with a 'preface' message as they invoke tools (OpenAI does not, Anthropic does; this is normal behavior for their models).
I realize I don't have many interesting logs from open-webui about this... so if you can let me know how to generate more detailed logs from open-webui's docker image I'll do that and produce more logs.
@tjbck commented on GitHub (Mar 28, 2025):
Intended behaviour here, streaming must be enabled for tool calls to be invoked.
@Column01 commented on GitHub (Mar 28, 2025):
I take issue with this. Non-streaming mode is required for many backends to return tool calls, which makes native tool calling useless in that case (for example, llama.cpp's llama-server needs streaming mode off for the --jinja flag, which is required for tool calling, and the front end will give you an error if you try with it on). I opened a new issue thinking this was unintended behavior; I didn't expect it to be intended not to work:
#12154
@Column01 commented on GitHub (Mar 28, 2025):
At the very least, a message should be displayed indicating this when native mode is on with streaming mode off. I spent a few days troubleshooting because of this!
@Arokha commented on GitHub (Mar 29, 2025):
If this is the case, then it should not even pass tools to the API at all! What is the point of offering tools to the API in that case?!
@Frank-Schiro commented on GitHub (Jun 20, 2025):
Thanks, seriously. Two days of going crazy trying to debug this.
@schematical commented on GitHub (Jun 25, 2025):
How exactly is this intended behavior? Where is that documented?
If it is intended behavior, why show the toggle button UI at all if it's not going to send it?
I don't mean to be harsh but this is pretty confusing behavior.
@Column01 commented on GitHub (Jun 25, 2025):
The answer is it's intentionally not complete kek
The code to run the tool calls exists, it's just only run when a streaming response is used.