[GH-ISSUE #18121] issue: tool calling not working when streaming turned off #18499
Originally created by @mramendi on GitHub (Oct 7, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/18121
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.6.32
Ollama Version (if applicable)
No response
Operating System
RHEL 9
Browser (if applicable)
No response
Confirmation
Expected Behavior
When a model is instructed to call a tool, it calls the tool, whether streaming replies is turned on or off.
Actual Behavior
With streaming on (the default setting), native tool calling works fine. With streaming off, the model attempts to call the tool and returns the tool call message; however, Open WebUI does NOT execute the tool and instead returns to the chat.
Steps to Reproduce
Logs & Screenshots
Here's the response from the model after which I was returned to the chat.
@Classic298 commented on GitHub (Oct 7, 2025):
NOT an Open WebUI issue.
Please research first - this is not an Open WebUI issue!
Native function (tool) calling in LLMs like GPT and Gemini is designed to work with streaming enabled. The function calling process is integrated into the streaming output so that the model can incrementally emit tokens representing the tool call intent, its arguments, and then receive and incorporate tool results in real time.
Without streaming, the model would need to complete the entire processing including tool calls before sending any output, which is inefficient and leads to slower response times.
The APIs simply do not allow it - for a good reason.
Additionally, since tool calls are integrated within the token generation stream as special message parts, streaming is necessary to handle this structured interaction, including managing tool call start, arguments passing, tool call completion, and final output tokens to the user.
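For context on the claim above: when streaming IS enabled, tool calls do arrive as incremental deltas that the client must accumulate before it can execute anything. A minimal sketch of that accumulation, using synthetic chunk dicts that mimic the shape of Chat Completions streaming chunks (the chunk contents here are illustrative, not captured from a real API call):

```python
# Sketch: merging streamed tool-call deltas from a Chat Completions stream.
# The chunk dicts are synthetic, mimicking what the API sends with
# stream=True when the model decides to call a tool.
chunks = [
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "id": "call_1",
         "function": {"name": "get_weather", "arguments": ""}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '{"city": '}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": '"Paris"}'}}]}}]},
]

def accumulate_tool_calls(chunks):
    """Merge streamed deltas into complete tool-call records, keyed by index."""
    calls = {}
    for chunk in chunks:
        for delta in chunk["choices"][0]["delta"].get("tool_calls", []):
            call = calls.setdefault(
                delta["index"], {"id": None, "name": None, "arguments": ""})
            if "id" in delta:
                call["id"] = delta["id"]
            fn = delta.get("function", {})
            if "name" in fn:
                call["name"] = fn["name"]
            # Argument JSON arrives as string fragments; concatenate them.
            call["arguments"] += fn.get("arguments", "")
    return calls

calls = accumulate_tool_calls(chunks)
print(calls[0]["name"], calls[0]["arguments"])
```

Note that even in the streaming case the tool is only executed after the deltas are fully assembled, which is part of why this thread disputes whether streaming is actually required.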
Finally: This is a duplicate issue, the exact same question was raised multiple times before in issues and discussions.
PS: When turning OFF streaming, use the default function calling system provided by Open WebUI (non-native) then it can still work with streaming OFF.
@mramendi commented on GitHub (Oct 7, 2025):
Your explanation does not reflect the official OpenAI API reference. In fact, the first example in this reference (the one with
get_horoscope) does not use streaming. https://platform.openai.com/docs/guides/function-calling
@rgaricano commented on GitHub (Oct 7, 2025):
Yes, there is an issue: the tool is called and the response is OK, but it isn't shown and the chat is stuck.
@Classic298 commented on GitHub (Oct 8, 2025):
@mramendi your examples are for the responses API
@mramendi commented on GitHub (Oct 9, 2025):
@Classic298 sure, here's ChatCompletions: https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models . Streaming is not even mentioned.
Loads of examples around such as https://github.com/john-carroll-sw/chat-completions-function-calling-examples/blob/master/func_get_weather.py .
It is simply not true that function calling in ChatCompletions is only for streaming.
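To illustrate the non-streaming flow the cookbook link describes: with stream=False, the model's decision to call a tool comes back as a single complete response whose message carries a tool_calls array, and the client parses and executes it locally. A minimal sketch using a synthetic response dict that mimics the Chat Completions response shape (the tool name and implementation are hypothetical):

```python
import json

# Synthetic non-streaming Chat Completions response, mimicking the shape
# returned when the model decides to call a tool with stream=False.
response = {
    "choices": [{
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_abc",
                "type": "function",
                "function": {"name": "get_weather",
                             "arguments": '{"city": "Paris"}'},
            }],
        },
    }]
}

# Hypothetical local tool implementation.
def get_weather(city):
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

choice = response["choices"][0]
if choice["finish_reason"] == "tool_calls":
    for call in choice["message"]["tool_calls"]:
        fn = call["function"]
        # Arguments arrive as a complete JSON string in one response.
        result = TOOLS[fn["name"]](**json.loads(fn["arguments"]))
        print(result)
```

Nothing in this flow requires streaming; the only difference from the streaming case is that the arguments arrive whole instead of as deltas.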
@Classic298 commented on GitHub (Oct 9, 2025):
I researched again and it is my fault, indeed you are right. I have found a few articles that mentioned major issues if you disabled streaming in connection with native function calls, but these were obviously wrong. Reopening.
@tjbck commented on GitHub (Oct 9, 2025):
Intended. With streaming off it'll follow the existing API behaviour. Our docs should be updated for this instead to clearly mention this.
@mramendi commented on GitHub (Oct 9, 2025):
@tjbck no it does not, please refer to the openai cookbook link I already published: https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models
@rgaricano commented on GitHub (Oct 9, 2025):
@tjbck
Tim, but the chat gets stuck in this circumstance, as I tested and posted.
The tool is executed fine without streaming, but the response is never displayed and the chat hangs without any error or notification.
@tjbck commented on GitHub (Oct 9, 2025):
@mramendi the tools are called from Open WebUI, it's literally how their chat completion API behaves. I'd appreciate if you could actually try using the chat completion endpoint and see the behaviour for yourself instead of referencing a post that has nothing to do with this.
@tjbck commented on GitHub (Oct 9, 2025):
@rgaricano it's because if you turn off the stream it returns a tool call response. As per my message above, we do not "execute" the tool from our end, to follow the existing API behaviour when streaming is set to False. This IS intended behaviour and will NOT be supported.
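For readers hitting this: when the non-streaming response ends with finish_reason "tool_calls" and the server does not execute the tool, continuing the conversation is the client's job. It must append the assistant's tool-call turn plus a role "tool" result message, then request a second completion. A sketch of building that follow-up message list (the message contents and helper name are illustrative):

```python
import json

# Assistant turn from a non-streaming response that ended with
# finish_reason == "tool_calls" (contents illustrative).
assistant_msg = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}

def build_followup_messages(history, assistant_msg, tool_result):
    """Append the assistant tool-call turn and the tool result so a second
    completion request can produce the final user-visible answer."""
    call = assistant_msg["tool_calls"][0]
    return history + [
        assistant_msg,
        {"role": "tool",
         "tool_call_id": call["id"],          # must match the call id
         "content": json.dumps(tool_result)},
    ]

history = [{"role": "user", "content": "Weather in Paris?"}]
messages = build_followup_messages(history, assistant_msg, {"weather": "sunny"})
print(messages[-1]["role"])
```

If no component performs this second round trip, the conversation stalls after the tool-call response, which matches the stuck-chat symptom reported above.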
@tjbck commented on GitHub (Oct 9, 2025):
@Classic298 was half correct here. This is not an "issue" and is a matter of how stream off should be "interpreted".