[GH-ISSUE #12154] issue: Native tool calling with streaming mode off does nothing with returned tool calls #32014

Closed
opened 2026-04-25 05:54:07 -05:00 by GiteaMirror · 1 comment

Originally created by @Column01 on GitHub (Mar 28, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/12154

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

dev (or v0.5.20)

Ollama Version (if applicable)

llama-server b4942 (llama.cpp cuda backend)

Operating System

Windows 10

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

When running a model locally using "Native" tool calling mode, the model should return valid tool calls (it does), and Open WebUI should execute those tool calls, running extra inference steps with their results as needed.

Actual Behavior

The backend returns valid tool calls for the prompt, but the webui never executes them. The `finish_reason` is `tool_calls`, but the frontend and backend treat the conversation turn as done when it is not: the model expects the tool calls to be run and an additional inference step to happen so it can present their results.

In this case, the model is trying to find the latitude and longitude of Toronto using the web search tool I made, but the webui never runs it. The logs attached below are for this conversation.

![Image](https://github.com/user-attachments/assets/75431c41-1377-4084-a735-c9285a0bfa0a)
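To make the expected handling concrete, here is a minimal sketch of the loop an OpenAI-compatible client is expected to run when `finish_reason` is `tool_calls`. This is illustrative only: it assumes llama-server's OpenAI-compatible endpoint on `localhost:8080`, and the `web_search` stub is a hypothetical stand-in for my actual tool; none of it is taken from the Open WebUI codebase.

```python
import json
import requests

API_URL = "http://localhost:8080/v1/chat/completions"  # assumed llama-server endpoint

def web_search(query: str) -> str:
    """Hypothetical stand-in for the real web search tool."""
    return f"search results for {query!r}"

TOOLS = {"web_search": web_search}
TOOL_SPECS = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for a query",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "What are the latitude and longitude of Toronto?"}]

while True:
    resp = requests.post(API_URL, json={
        "messages": messages,
        "tools": TOOL_SPECS,
        "stream": False,  # non-streaming, as required for tool calling with llama.cpp
    }).json()

    choice = resp["choices"][0]
    messages.append(choice["message"])

    # Only when the model stops asking for tools is the turn actually done.
    if choice["finish_reason"] != "tool_calls":
        break

    # Execute each requested tool and feed its result back as a "tool"
    # message, then loop for another inference step.
    for call in choice["message"]["tool_calls"]:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })

print(messages[-1]["content"])
```

This is the loop that never happens in the webui: it stops after the first response even though `finish_reason` says more work is pending.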

Steps to Reproduce

  1. Add an example tool (in my case, a weather function)
  2. Set `Function Calling` to `Native`, and `Stream Chat Response` to False (required for tool calling with llama.cpp)
  3. Ask the model to use the tool
  4. The model tries to, but the Web UI never handles it properly

Logs & Screenshots

[llama-server.log](https://github.com/user-attachments/files/19506734/llama-server.log)

[webUI.log](https://github.com/user-attachments/files/19506737/webUI.log)

Additional Information

I tried to reach out for assistance on Discord, posting a troubleshooting thread asking for help, but I don't think anyone there could help me with this; it seems like a bug.

https://github.com/open-webui/open-webui/blob/b03fc97e287f31ad07bda896143959bc4413f7d2/backend/open_webui/utils/middleware.py#L787-L803

This code should have something inside the `if metadata.get("function_calling") == "native":` branch, similar to the non-native mode below it, that runs the tool-calling inference steps. As far as I can tell from poking around, the native mode **never** runs the tool calls, but I could be mistaken, as I do not fully understand the codebase.
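For illustration, the missing branch might look roughly like the sketch below. To be clear, `execute_tool_call` and `run_inference` are invented names, not real functions in the codebase; the surrounding variables are assumed from the shape of the non-native path.

```python
# Hypothetical sketch only; helper names are invented, not actual codebase functions.
if metadata.get("function_calling") == "native":
    while response["choices"][0]["finish_reason"] == "tool_calls":
        message = response["choices"][0]["message"]
        form_data["messages"].append(message)

        # Run each tool the model asked for and append its result as a
        # "tool" role message, mirroring what the non-native path does.
        for tool_call in message["tool_calls"]:
            result = await execute_tool_call(tool_call, tools)  # hypothetical helper
            form_data["messages"].append({
                "role": "tool",
                "tool_call_id": tool_call["id"],
                "content": str(result),
            })

        # Re-run inference so the model can answer using the tool results.
        response = await run_inference(request, form_data)  # hypothetical helper
```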

GiteaMirror added the bug label 2026-04-25 05:54:07 -05:00

@Column01 commented on GitHub (Mar 28, 2025):

The linked issue is relevant: I have streaming mode off because it's required to be off for my backend. This seems like a silly thing to restrict, in my opinion.

![Image](https://github.com/user-attachments/assets/1f699b9a-9522-4035-8c49-bf4c620e6afc)


Reference: github-starred/open-webui#32014