mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[GH-ISSUE #14583] issue: Ollama backend improperly handles native multi-tool calls #55971
Originally created by @JRomainG on GitHub (Jun 1, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/14583
Installation Method
Docker
Open WebUI Version
v0.6.13
Ollama Version (if applicable)
0.9.0
Operating System
macOS Sequoia (15.5)
Browser (if applicable)
Firefox 139.0.1
Expected Behavior
When the Ollama API returns multiple tool calls, Open WebUI should run each tool one by one, as it does for OpenAI-compatible backends.
Actual Behavior
The Open WebUI middleware concatenates all tool calls together when using the Ollama backend, resulting in a single invalid tool call. In addition, because the resulting tool does not exist, the Open WebUI frontend loads indefinitely without ever sending a request to the tool server (which might be related to #14577).
Steps to Reproduce
devstral:latest)
Logs & Screenshots
Example of invalid tool call using the Ollama backend:

Associated Open WebUI response log:
Performing the request manually using `curl`:
{
  "model": "devstral",
  "created_at": "2025-06-01T15:55:15.518588Z",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "tool_get_current_time_post",
          "arguments": {
            "timezone": "Europe/Paris"
          }
        }
      },
      {
        "function": {
          "name": "tool_get_current_time_post",
          "arguments": {
            "timezone": "America/New_York"
          }
        }
      }
    ]
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 6700734042,
  "load_duration": 26193667,
  "prompt_eval_count": 1491,
  "prompt_eval_duration": 2590387959,
  "eval_count": 45,
  "eval_duration": 4075518458
}
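Note that neither entry in `tool_calls` above carries an `index` field. As a rough illustration of why that can matter, here is a minimal sketch, assuming a converter that falls back to a fixed index and a stream handler that merges deltas sharing an index. `convert_tool_calls`, `merge_by_index`, and the `use_position` flag are hypothetical stand-ins written for this example, not Open WebUI's actual functions.

```python
import json
from typing import Any

# Tool calls copied from the Ollama response above (no "index" field).
OLLAMA_TOOL_CALLS = [
    {"function": {"name": "tool_get_current_time_post",
                  "arguments": {"timezone": "Europe/Paris"}}},
    {"function": {"name": "tool_get_current_time_post",
                  "arguments": {"timezone": "America/New_York"}}},
]


def convert_tool_calls(tool_calls: list[dict[str, Any]],
                       use_position: bool) -> list[dict[str, Any]]:
    """Hypothetical Ollama -> OpenAI-style conversion (illustration only)."""
    converted = []
    for position, call in enumerate(tool_calls):
        function = call.get("function", {})
        # With use_position=False, every call missing "index" falls back to 0.
        fallback = position if use_position else 0
        converted.append({
            "index": call.get("index", fallback),
            "type": "function",
            "function": {
                "name": function.get("name", ""),
                # OpenAI-style deltas carry arguments as a JSON string.
                "arguments": json.dumps(function.get("arguments", {})),
            },
        })
    return converted


def merge_by_index(deltas: list[dict[str, Any]]) -> dict[int, dict[str, str]]:
    """Toy stream handler: deltas that share an index are concatenated."""
    merged: dict[int, dict[str, str]] = {}
    for delta in deltas:
        slot = merged.setdefault(delta["index"], {"name": "", "arguments": ""})
        slot["name"] += delta["function"]["name"]
        slot["arguments"] += delta["function"]["arguments"]
    return merged


collided = merge_by_index(convert_tool_calls(OLLAMA_TOOL_CALLS, use_position=False))
separate = merge_by_index(convert_tool_calls(OLLAMA_TOOL_CALLS, use_position=True))
print(len(collided), len(separate))
```

With the fixed fallback, the two calls collapse into one entry whose name is both function names glued together; with the positional fallback they stay separate. A real fix would also need to handle calls that do carry an explicit `index` that clashes with a positional one.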
Example of the same call with the same model running on an OpenAI-compatible backend (in this example, LM Studio):

Associated Open WebUI response log:
Additional Information
I believe this behavior stems from the following code snippet:
53764fe648/backend/open_webui/utils/middleware.py (L1811-L1821)
I added debug logs and compared the output with both the Ollama and LM Studio backends.
Ollama:
LM Studio:
However, I believe the bug itself is in this code instead:
53764fe648/backend/open_webui/utils/response.py (L9-L24)
From the manual `curl` request pasted above, you can see that the Ollama `tool_calls` array does not contain an `index` parameter. As such, it always defaults to 0 in `convert_ollama_tool_call_to_openai`. Because there are multiple tools, they all end up sharing the same index, which is why they are later concatenated together by `stream_body_handler`.
I believe the fix would be to change the `convert_ollama_tool_call_to_openai` function so that it increments the default value for `index` and makes sure there are no collisions.
@taylorwilsdon commented on GitHub (Jun 1, 2025):
Works fine for me. You're probably using the default 2048 max num_ctx length and it's running out of tokens while deciding whether to use the tool, causing the failure. Try increasing it to 10-15k tokens minimum. I can reproduce the issue if I set it to 2048, and it works fine at 12k.
This doesn't happen with LM Studio because you're using it as a generic OpenAI API compatible endpoint, so Open WebUI's default num_ctx is not passed to the backend and the model runs with whatever baseline max context setting is configured for it. With Ollama, you are overriding whatever is set at the model level (in `ollama show`)
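To test the context-length explanation outside Open WebUI, one option is to send the request straight to Ollama with an explicit `num_ctx`. A minimal sketch, assuming Ollama's `/api/chat` endpoint and its `options.num_ctx` field; the helper name `build_chat_payload`, the prompt text, and the 12288 value (roughly the 12k mentioned above) are choices made for this example. It only builds the JSON body and does not send it.

```python
import json
from typing import Any, Optional


def build_chat_payload(model: str,
                       messages: list[dict[str, Any]],
                       num_ctx: int = 12288,
                       tools: Optional[list[dict[str, Any]]] = None) -> dict[str, Any]:
    """Build a request body for Ollama's /api/chat with an explicit context size."""
    payload: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "stream": False,
        # Per-request override of the context window, in tokens.
        "options": {"num_ctx": num_ctx},
    }
    if tools:
        # Tool definitions are only attached when the caller provides them.
        payload["tools"] = tools
    return payload


payload = build_chat_payload(
    "devstral",
    [{"role": "user", "content": "What time is it in Paris and New York?"}],
)
print(json.dumps(payload, indent=2))
```

Running the same prompt once with `num_ctx=2048` and once with a larger value, with the tool definitions attached, would show whether the context size alone changes the `tool_calls` output.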