issue: Native tool calling breaks TTS Read Aloud #4312

Closed
opened 2025-11-11 15:51:21 -06:00 by GiteaMirror · 1 comment
Owner

Originally created by @rvkwi on GitHub (Mar 7, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.

Installation Method

Docker

Open WebUI Version

0.5.20

Ollama Version (if applicable)

No response

Operating System

Debian 12.9

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have checked the browser console logs.
  • I have checked the Docker container logs.
  • I have listed steps to reproduce the bug in detail.

Expected Behavior

When using Native Tool calling, the Read Aloud feature should produce audio.

Actual Behavior

Read Aloud stays silent. This seems to be related to the "tool_calls" type.

Steps to Reproduce

  1. Use a model with the Native tool call option
  2. Let it use a tool
  3. Click the "Read Aloud" button

Logs & Screenshots

Not much shows up in the Docker logs, but this comes through the browser console:

```
ready ran. Refresh page to run again.
blob:https://nvv:8443/2acbdb97-47af-4c46-a13f-47830ee8d12d:1 GET blob:https://nvv:8443/2acbdb97-47af-4c46-a13f-47830ee8d12d net::ERR_REQUEST_RANGE_NOT_SATISFIABLE
fbb71073-6b57-4658-ac5d-66c2ba231920:1 Uncaught (in promise) NotSupportedError: Failed to load because no supported source was found.
VM167:21 Extractor already ran. Refresh page to run again.
```

Additional Information

There seems to be a bug in Native Tool calling mode that breaks TTS.

Simple example tool (although this particular tool isn't the cause; I've seen the same behavior across different tools):

```python
from pydantic import BaseModel, Field
from typing import Optional
from random import choice

async def event_status(__event_emitter__=None, description="", done=True):
    # Emit a status event to the chat UI if an emitter is available
    if __event_emitter__:
        await __event_emitter__(
            {
                "type": "status",
                "data": {
                    "description": description,
                    "done": done,
                },
            }
        )

class Tools:
    class Valves(BaseModel):
        show_emitter: bool = Field(
            default=True,
            description="If enabled, shows dynamic feedback emitters in chat"
        )
        show_results_in_emitter: bool = Field(
            default=False,
            description="Show results in emitter (visible to user) instead of just in return value (AI only)"
        )

    def __init__(self):
        self.valves = self.Valves()

    async def coin_flip(self, __event_emitter__=None, __user__: Optional[dict] = None) -> str:
        """
        Flips a coin and returns the result.
        """
        result = choice(["Heads", "Tails"])
        if self.valves.show_emitter:
            result_in_emitter = f"Result: {result}" if self.valves.show_results_in_emitter else ""
            await event_status(__event_emitter__, f"Flipping coin. {result_in_emitter}")
        return f"Coin landed on: {result}"
```

If I run this:

```
### USER
can you flip a coin for me?

### ASSISTANT
<details type="tool_calls" done="true" content="[{&quot;index&quot;: 0, &quot;id&quot;: &quot;call_21b20a0f-25bd-469d-b975-f66a0915cdfc&quot;, &quot;type&quot;: &quot;function&quot;, &quot;function&quot;: {&quot;name&quot;: &quot;coin_flip&quot;, &quot;arguments&quot;: &quot;{}&quot;}}]" results="[{&quot;tool_call_id&quot;: &quot;call_21b20a0f-25bd-469d-b975-f66a0915cdfc&quot;, &quot;content&quot;: &quot;Coin landed on: Tails&quot;}]">
<summary>Tool Executed</summary>

> coin_flip: Coin landed on: Tails
</details>
The coin landed on tails! What would you like to do next?
```
Now the "Read Aloud" button (using OpenAI as the TTS engine in my case) stays silent (but active) when used.

Toying around with this, it seems to be related to the type attribute.

Creating a simple filter that switches the type from "tool_calls" to "code_interpreter" fixes it:

```python
from pydantic import BaseModel, Field
from typing import Optional

class Filter:
    async def outlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Rewrite the details type so the TTS text extractor handles the block
        for message in body["messages"]:
            message["content"] = message["content"].replace(
                '<details type="tool_calls"', '<details type="code_interpreter"'
            )
        return body
```

Now TTS (Read Aloud) works just fine.
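An alternative workaround along the same lines (my own sketch; `strip_tool_call_details` is a hypothetical helper, not part of Open WebUI) would be to drop the `<details>` block from the message entirely before it reaches TTS, which is roughly what the Read Aloud extractor presumably needs to do:

```python
import re

def strip_tool_call_details(content: str) -> str:
    # Drop any <details ...>...</details> block so only the spoken text remains.
    # Non-greedy match; assumes the blocks are not nested.
    return re.sub(r"<details\b[^>]*>.*?</details>", "", content, flags=re.DOTALL).strip()

message = (
    '<details type="tool_calls" done="true">\n'
    "<summary>Tool Executed</summary>\n\n"
    "> coin_flip: Coin landed on: Tails\n"
    "</details>\n"
    "The coin landed on tails! What would you like to do next?"
)
print(strip_tool_call_details(message))  # The coin landed on tails! What would you like to do next?
```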

I'm not a TypeScript dev; I tried digging through the source, but it's out of my comfort zone and I wasn't able to find where this goes wrong.
But I wanted to report it for someone else who might know. It seems easily reproducible and related to the "type=" part of the response.

GiteaMirror added the bug label 2025-11-11 15:51:21 -06:00
Author
Owner

@tjbck commented on GitHub (Mar 7, 2025):

Good catch, fixed with ab92737a9ab21c562d5e6718e59eb76cbfe7f219

Reference: github-starred/open-webui#4312