[GH-ISSUE #20600] issue: Tool call results not decoded from HTML entities before sending to LLM #57897
Originally created by @Koumi460 on GitHub (Jan 12, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/20600
Check Existing Issues
Installation Method
Git Clone
Open WebUI Version
v0.7.2
Ollama Version (if applicable)
0.13.5
Operating System
Debian, Windows
Browser (if applicable)
N/A
Confirmation
Expected Behavior
Messages sent to the LLM should contain clean, properly formatted JSON without HTML entities:
\"",&,'should be present in the message content sent to LLMActual Behavior
In multi-turn conversations with tool calls:
the processDetails() function inserts HTML-escaped content directly into messages, i.e. &quot; entities instead of proper quotes
Steps to Reproduce
""{\\n \\"results\\": ...instead of clean JSON:"{\\n \\"results\\": ...Logs & Screenshots
Additional Information
AFAIK - When tool calls are executed in multi-turn conversations, the tool call results are stored in the conversation history database with HTML-escaped entities (e.g., &quot;, &amp;). When these messages are loaded from the database and sent back to the LLM in subsequent conversation turns, the HTML entities in tool call results are not properly decoded, causing the model to receive malformed JSON with escaped entities instead of proper quotation marks. The issue is not visible in the chat window, but it does have an impact on the model, degrading its performance, especially if the chat history is tool-call heavy. I am not confident that the fix is correct, but I have observed this bug across different deployment instances and with both Ollama and llama.cpp.
Suggested solution (?) tested working on my instance (main branch):
File: src/lib/utils/index.ts
Function: processDetails()
Line: ~875
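A minimal sketch of the kind of decoding step being suggested (the helper name and exact call site are illustrative, not the actual patch):

```typescript
// Illustrative helper: decode the HTML entities that end up in stored tool results
// before the content is re-inserted into messages sent back to the model.
// The entity list covers the characters mentioned in this issue.
const decodeHtmlEntities = (text: string): string =>
	text
		.replace(/&quot;/g, '"')
		.replace(/&#39;/g, "'")
		.replace(/&lt;/g, '<')
		.replace(/&gt;/g, '>')
		.replace(/&amp;/g, '&'); // decode &amp; last so other entities are not decoded twice

// Example: an escaped tool result as it is read back from the conversation history.
const stored = '{ &quot;results&quot;: [1, 2, 3] }';
console.log(decodeHtmlEntities(stored)); // -> { "results": [1, 2, 3] }
```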
@owui-terminator[bot] commented on GitHub (Jan 12, 2026):
🔍 Similar Issues Found
I found some existing issues that might be related to this one. Please check if any of these are duplicates or contain helpful solutions:
#19864 issue: Ollama Parameters get overriden after native tool calls
by Haervwe • Dec 10, 2025 • bug
#20595 issue: "search_web" tool executed even when "Web Search" control disabled
by SlavikCA • Jan 11, 2026 • bug
#18743 issue: Tool call results intermittently fail to display in UI when result data is large
by kjpoccia • Oct 30, 2025 • bug
💡 Tips:
This comment was generated automatically by a bot. Please react with a 👍 if this comment was helpful, or a 👎 if it was not.
@jegranado commented on GitHub (Jan 13, 2026):
same here - thank you for providing all the details!
we're using OpenAI compatible endpoints, however.
@silentoplayz commented on GitHub (Jan 18, 2026):
I am able to confirm this issue on the latest dev commit by looking at the backend logs. &quot; is spotted several times throughout tool-call-related debug logging statements.
@Bickio commented on GitHub (Jan 28, 2026):
Are we certain that tool calls should be injected into the assistant messages at all?
I'm experiencing issues where the LLM hallucinates tool call results in long conversations, and I believe it's because previous tool calls were injected into its own messages.
Early message in conversation:
Later message in conversation (hallucinates a tool call):
IMO assistant messages should be left exactly as-is (unless explicitly edited by the user), since they act as a reference for the LLM about how to respond
@Classic298 commented on GitHub (Jan 28, 2026):
@Bickio
YES, tool calls HAVE TO BE on the assistant's side.
what you're experiencing is not hallucinated tool calls. It looks like you are not on the latest version, or maybe... wait, let me check... there definitely was an issue where tool calls were incorrectly modified in the next request, i.e. quotes in tool calls became HTML-encoded
@Classic298 commented on GitHub (Jan 28, 2026):
oh yeah thats.. exactly this.. sorry i am tired and it's late
https://github.com/open-webui/open-webui/pull/20755
yeah anyways what you are experiencing is exactly because of this
@Classic298 commented on GitHub (Jan 28, 2026):
@Bickio
This is the Open AI API standard.
Tool calls MUST BE in the assistant's message
So yes we are very sure this is intended and should not be changed
No, that's just because the tool calls get incorrectly formatted, as we found out here, and that's what we want to fix with the PR
@Bickio commented on GitHub (Jan 28, 2026):
Hi @Classic298 I'm not very familiar with this codebase, but my understanding is that:
role: "assistant"assistant messagestool_callsis where the LLM requests the tool to be calledrole: "tool"messageScreenshot from the OpenAI docs to support this:
However, as you can see from my litellm logs, OpenWebUI is injecting the actual tool output into the role: "assistant" message:
@Classic298 commented on GitHub (Jan 28, 2026):
@Bickio
Ok you are switching up the conversation here.
Earlier you claimed tool calls should not be in assistant message. This is wrong, as tool calls have to be in assistant message per OpenAI standard.
Now you are talking about tool results, not tool calls.
Yes, tool results per the OpenAI standard belong in the role: "tool" message and not in the assistant message.
And yes, to be fully standard-conformant this would need to be changed, but in reality it's barely an issue, even a non-issue.
I have never had issues with tool calls, even WITHOUT the fix of PR #20755.
That's because LLMs got fine-tuned so well that they always create the tool call correctly, with the correct quotes and not the HTML-encoded ones. Only weaker LLMs, or those not fine-tuned well for tool calling, would fail here - but oh well - the PR is coming, and then the quotes will be correctly returned to the model again, which will repair tool calls for those models.
And even though the tool results being directly embedded in the assistant message is not standard-conformant, it works - it works perfectly well.
If this were an issue that prevents tool calling from properly working, someone would have flagged it long ago, or at all.
And from the screenshot you shared it looks like the AI you use is outputting html encoded chars instead of the actual char, which means the LLM is affected by the wrong formatting which PR #20755 fixes - and not by the fact that the tool results get added to the assistant message instead of the tool role.
If this needs fixing, then this would require a refactor on the frontend and backend, mostly frontend, to handle the format changes.
@Classic298 commented on GitHub (Jan 28, 2026):
If you want you can open a discussion for this - to add tool results to the tool role
But someone will have to implement it
Is it worth implementing it? Nobody has issues with the current behaviour, otherwise someone would have flagged it, and it requires significant work for... no performance gain, and it wouldn't fix a bug either.
Anyways, your tool result report should be tracked in a discussion and separated from this specific issue here, which is strictly about the wrong formatting of the quotes, which prevents weaker/less-well-fine-tuned models from correctly executing tools repeatedly.
@Bickio commented on GitHub (Jan 28, 2026):
@Classic298 I think you're misrepresenting my intention here
No, I claimed that OpenWebUI should leave assistant messages exactly as they are, which I believe to be correct. Tool calls in assistant messages are clearly fine, since they're added by the LLM and not by OpenWebUI. Apologies for not specifying "tool call outputs" at every point, clearly this caused some confusion.
The model in my screenshot was Claude Opus 4.5, the strongest tool calling model available.
However, I do agree that unescaping the HTML will have some effect on the LLM behaviour. I'm not as confident as you are that it will entirely resolve the issue, or even that the effect will be positive. I guess we'll find out.
Either way, I think pushing my point here is not productive, and if my issue is not entirely resolved by your PR I will raise a new discussion.
Thanks for your time!
@Bickio commented on GitHub (Feb 2, 2026):
In my testing, the hallucinations/performance degradation mentioned by myself and others above, are not resolved by the HTML unescaping changes. Claude models are still being confused by the injected tool output in previous messages.
@Classic298 As requested, I've created a new discussion to discuss the underlying issues and hopefully work towards a solution: https://github.com/open-webui/open-webui/discussions/21098
@Koumi460 commented on GitHub (Feb 2, 2026):
@Classic298
Thank you for creating the PR and commit and fixing this issue. Unfortunately I have not had time to run your code and test it, but I saw other people did and I'll test it when I can. But thank you.
@Bickio
I think the issue you are bringing up is separate from the issue I logged. That being said, I am also observing the same behaviour you are describing - tool call results hallucinations when the conversation reaches about 80-100k tokens on qwen3-vl:30b-a3b. I am not sure why, it is a weaker model and I am not on the latest owui version, so take it with a pinch of salt. I think opening a separate discussion was a good call.
@Classic298 commented on GitHub (Feb 2, 2026):
@Bickio
how did you test it?
I tested it with a weaker model locally, and it was affected by the HTML stuff, when the fix is applied, it then did the tool calls properly.
@Bickio commented on GitHub (Feb 2, 2026):
@Classic298
My steps were roughly:
As I've said before: What happens when you inject tool output into the assistant message, is that you're effectively telling the LLM "this is how you successfully called the tool previously". Of course if you give it enough examples it will try to call the tool "the same way it did before" or so it thinks.
I would expect some models to be more likely to adhere to their fine-tuning and resistant to context-window examples than others. Is that a sign of a "better" or more aligned model? I don't think so - IMO it is generally desirable for models to follow examples provided in context. It will also vary prompt-by-prompt. Regardless, poisoning the context with incorrect examples will always degrade model performance and increase the chance of hallucinations to some degree.
@Classic298 commented on GitHub (Feb 2, 2026):
You should not use a conversation where the LLM already hallucinated a tool call 5x in a row to test the fix.
You should get the LLM to hallucinate the tool call without the fix
and then you should deploy the fix
and then you should try again to get the LLM to hallucinate the tool call in a new conversation.
If WITH the fix the LLM still hallucinates the tool call then it didnt work - if it now works - then it works.
Using an already broken conversation (where the LLM failed previously) to test this fix is not proper testing etiquette.
That's not how you test LLM related fixes.
If a conversation is already in a state where the input will make the LLM see broken tool calls, then it will repeat the broken tool calls.
Exactly, so why are you testing it with poisoned context that could no longer appear in any future conversation, post-fix-deployment?
@Classic298 commented on GitHub (Feb 2, 2026):
Testing it on a conversation that is already broken is like testing an antibiotic on a dead patient and saying "see, the antibiotic you invented didn't work".
The broken conversation is broken. We cannot - and it doesn't make sense to - have a database migration or something to fix all broken conversations. Besides the fact that you cannot even reliably do this: what are you gonna do? Replace all escaped HTML quotes with normal quotes? What if a conversation had a legitimate HTML quote that shouldn't get replaced?
Anyways, please test - but this time on a fresh conversation. If tool calls work reliably, then the fix works.
@Koumi460 commented on GitHub (Feb 2, 2026):
I've tested the PR that @Classic298 made and it fixes the issue with quotes being incorrectly escaped. I am happy with that and confirm it is working for me.
But even in a completely new environment, on this PR, I was still able to replicate the model hallucinating tool calls after maybe 10 turns and 70k tokens with heavy tool calling. Again, I am using weaker model (qwen2-vl:30b q8 on llama.cpp), but I think it is still happening because of how the tool call results get injected into the assistant's response as @Bickio is saying.
Some models will handle this better and some worse, I guess. But it appears to me that it would be better to solve this. One side is the confusion of the model, and the other side is prompt caching - when the format of the tool response changes in the conversation history, the cache will miss and force the tool response to be re-processed once it is injected into the assistant's message. If the tool call response is very long, this could have a meaningful impact on responsiveness and potentially costs. At least that is my speculation / current understanding.
I have created a fork of the main branch and started playing around with how to fix this, eventually finding out about and getting inspiration from PR#19578. I managed to get it to a good working condition, and I will keep testing it like this to see if the issues are fixed. Looking at the calls and preliminary testing, it all looks good to me. Feel free to test it out:
Fork with fixed tool call history and parallel call handling
@Bickio commented on GitHub (Feb 2, 2026):
@Classic298 Obviously I was not using a conversation with hallucinations in context... My "5x repeat" methodology was performed by repeatedly rerunning the conversation step where the LLM first hallucinated - all previous messages contained only successful/correct tool calls
@Classic298 commented on GitHub (Feb 2, 2026):
Report back to us if this makes a meaningful difference to https://github.com/open-webui/open-webui/discussions/21098
and it would be better if you posted this comment to https://github.com/open-webui/open-webui/discussions/21098 @Koumi460
@Classic298 commented on GitHub (Feb 2, 2026):
@Bickio Can you please explain how you tested it then? From your explanation it reads like you used a conversation which contained previously faulty tool calls in the context.
Did you or did you not test it on a brand new, fresh conversation?
I can use Claude Sonnet 4.5 reliably even without the fix and it does not hallucinate tool calls.
What is your precise setup and testing setup?
@Classic298 commented on GitHub (Feb 2, 2026):
The following conversation is WITHOUT the fix in place - and I even used Sonnet's weaker/dumber brother:
@Classic298 commented on GitHub (Feb 2, 2026):
The only models I ever had issues with tool calling on were weaker ones - which then started working again after the fix was applied locally for development and testing.
Therefore I struggle to see how you can have failed tool calls with sonnet, even with the fix.
@Classic298 commented on GitHub (Feb 2, 2026):
So if Claude Opus 4.5 cannot do tool calling for you, neither without the fix, nor with the fix, then something is fundamentally wrong in your setup.
If Claude Haiku can do it, then Opus can too.
I tried to replicate ANY tool calling failure as you can see - without the fix applied - but it just worked. With Haiku.
@Bickio commented on GitHub (Feb 2, 2026):
No
If you rerun the step with the first hallucination, there are, by definition, no hallucinations in context
A completely fresh conversation? I did try with fresh conversations too, but I often just reran the conversation from the first hallucination as stated above. Why does this matter? There were no hallucinations in previous messages.
I've been pretty thorough, so you'll need to be more specific about the information you want. I can't give you access to my tools, as they retrieve sensitive data.
What are you trying to achieve with this angle? Just because I can reproduce the bug and you can't means my setup is wrong?
You have at least 3 users who are affected by, and can reproduce this bug. You've admitted that Open WebUI is not using the OpenAI API according to its spec. I've given clear theoretical arguments why Open WebUI's misuse would cause this hallucination, and shown that they occur in practice regardless of the HTML escaping red herring.
No, something is NOT wrong in my setup, something is wrong in Open WebUI, and it's not just this bug. The hostility towards community members trying to engage in good faith is baffling, and makes me very concerned for the future of this software.
@Classic298 commented on GitHub (Feb 2, 2026):
Honestly, yes exactly. That is what I am saying here.
At this point it is likely you have some reverse proxy issues, faulty setup of websocket, browser issue, anything really. If you use a middle layer like LiteLLM it may even be that it incorrectly formats the response (as it has done in the past with Gemini 3 outputs in some versions until it was fixed).
The narrative is simple: Is it reproducible with a clean, fresh, setup? Yes? Then it's a bug.
No? Then it's an issue with your specific setup, configuration, network, firewall, reverse proxy, config or anything else.
This is how it is. If we cannot reproduce it even after heavily trying (see images above) then how can we fix it?
If the other things you mentioned throughout the conversation were actually an issue, then I could reproduce the issue.
I saw users who complained for weeks that something does not work.
Only for them to eventually come back and say "woops it was the firewall sorry for the noise" or "yeah i enabled proxy buffering in nginx, that's the culprit" or "yeah it seems g2cli was faulty" or "litellm had an issue".
So often.
And if you cannot provide any way for me to reproduce it, even without the fix of the open PR applied, even on a weaker model, then absolutely, and I mean absolutely everything points to this being a you-problem.
Me trying to help troubleshoot your issue with MY free time, as an unpaid volunteer contributor who works 40+ hrs a week in a full-time job, ALSO has full-time studies to attend, AND helps improve Open WebUI, is worth a lot. You interpreting my messages as "hostility", when actually I am taking my little time trying to figure out what is wrong with your setup to help you, by literally asking you to explain what your setup is, and you responding with "No", is what is actually hostile.
Tell us! What is your setup! Configs, version, how did you connect the LLMs? Reverse proxy? Firewall? Do all tools fail? What tool fails? Does it fail in new chats? Did you clear browser cache? What browser? What extensions? Did you enable the Microsoft Edge extra security setting which is also known to cause issues? How did you configure the models inside Open WebUI? What advanced parameters? What capabilities? Do you use any custom filters which might interfere by modifying user messages or the model's output in the inlet() or outlet()? What OS do you run it on? How did you install it? You have only provided a single message so far as a screenshot; even being generous, it was only evident that Claude Opus was used and a tool was called, but nothing else was visible in these screenshots.
Yes and all of them confirmed the PR working and the tool calls to then work again fully or much better.
Only you, out of the 3 others, are claiming it is not fixed. So AGAIN it points to something being wrong in your setup when 3 others are saying it works (see PR comments)
Silentoplayz tested it and it works
Koumi tested it and it works
rbsn-cpu tested it and it works
and so did I
you are the only one for whom the fix allegedly doesn't work
Where? How? You have not explained a single spec of your setup.
@Classic298 commented on GitHub (Feb 2, 2026):
And this one single screenshot you shared is suspicious too: I do not even see a tool call here.
Even if we fix the HTML formatting, there is no tool being called. Where is the tool name? The tool name must be inside quotes and be an actual function name.
So the first &quot; followed by "Rows 1-2" is not a tool call. A tool call cannot be "Rows 1-2()"; that is not a valid tool function name.
But the fact that you also see raw \n\n's BESIDES the undecoded &quot; and some text is telling that something is wrong with your setup.
Whatever you are experiencing: nobody else is experiencing it and we cannot reproduce it and it also has little if not nothing to do with this issue and the PR.
@Bickio commented on GitHub (Feb 2, 2026):
@Classic298 it will take me some time to reply fully, as you've asked for a lot of details.
With regards to that screenshot, as I noted in my discussion post, Open WebUI only injects the tool output, the tool call itself is completely lost. My tool outputs a multi line string, which is what you're seeing there.
@Classic298 commented on GitHub (Feb 2, 2026):
@Bickio so we can get any confusion out of the way because I have an urgent question: are you using native tool calling via admin panel > settings > model > opus > advanced parameters > tool calling: native?
Because as you can see from my screenshots, the tool call does not get lost, and is injected in the assistant message (visibly).
The tool call is not lost.
And if you are using native tool calling, and the tool call IS LOST, then this is ... most interesting. Again: I cannot reproduce it - but then this would mean that whatever you are experiencing is not related to this issue and demands standalone investigation - my money is on middleware issues, LLM translation issues or perhaps reverse proxy or other network related causes.
@Bickio commented on GitHub (Feb 2, 2026):
@Classic298 yes, native tool calling is enabled.
To be clear, the tool call still appears in the UI so yes, not completely lost in that sense. What I meant is that it's not injected into the LLM call. The LLM only sees the tool outputs in previous messages. The screenshot in question is the LLM hallucinating an example of what it sees, which is just the tool output.
@Classic298 commented on GitHub (Feb 2, 2026):
@Bickio commented on GitHub (Feb 2, 2026):
Correct. "Monkey see, monkey do". Because its previous messages show itself producing the tool output, that's what it tries to do.
It outputs data in a format carefully engineered to reduce the issues you mention. The data is presented in something similar to CSV format, which is much more compact than JSON due to lack of key repetition. It has a strict limit on the volume of data (500 lines), and strategically places fresh column headers every 100 lines to ensure the LLM retains that info in its working memory. In practice, the majority of queries will produce a much smaller set of lines (1-10).
Volume of data is not the culprit here - it always works fine in the first message, with as many as 20-30 large tool calls. It's only once Open WebUI starts collapsing the previous messages that the hallucinations appear. I also use the same tool output format in other systems (e.g. mastra) without any hallucinations.
TL;DR yes the output is somewhat large, but the tool is carefully designed to account for this, and I'm well within the limits of what the LLM can handle
@Classic298 commented on GitHub (Feb 2, 2026):
@Bickio
Thanks for confirming. Though I must say, while this does point to your discussion's direction of "tool outputs should be in the tool role", you also confirmed that in your screenshot the LLM didn't even attempt a tool call?
Therefore, you could not have tested the PR in the way it was meant to be tested - and more importantly, you also weren't experiencing this issue here (#20600) but something entirely different. Hence it is good you opened the discussion. But also I think and hope we can finally conclude the PR works 🤣
@Bickio commented on GitHub (Feb 2, 2026):
@Classic298 I fear, once again we are misunderstanding each other.
I was able to confirm by looking at the LiteLLM logs, that with your PR applied, the previous messages in the conversation sent to the LLM contained correctly HTML-unescaped tool outputs. In other words, your PR works as intended.
However, it does not prevent the hallucinations, since the LLM is still being shown examples of itself producing the tool outputs. My initial hypothesis was that unescaping the HTML would actually make these poisoned examples more potent to the LLM, however that is difficult to empirically prove. Naturally, some examples will be resolved (e.g. your tests), some will not (e.g. my tests), while others that were working previously may break without us even noticing. This is the nature of non-deterministic systems.
@Classic298 commented on GitHub (Feb 2, 2026):
@Bickio
Ok then we misunderstood each other - but in a good way
PR works - but the PR was never intended to fix the issue you are experiencing
The issue you are experiencing MIGHT actually be related to the discussion you opened, but not to this issue here, and hence also not fixed by my PR
@Bickio commented on GitHub (Feb 2, 2026):
We can agree on that. I'm just not totally sure what it was meant to fix in that case. The HTML escaping is not visible to the user (except in hallucinations), and the PR hasn't resolved the "degraded LLM performance" mentioned in the original issue.
@Classic298 commented on GitHub (Feb 2, 2026):
@Bickio
On weaker LLMs, when doing a second or third turn - all turns having tool calls - the probability of the LLM attempting to do a tool call, but failing, rises without the fix.
Why?
Because the LLM gets its own messages sent back (as it has to be), but in its own prior messages the tool calls are formatted using &quot; instead of the proper " quotes. Therefore it thinks a tool call needs HTML-encoded &quot; elements instead of ", which is wrong. Then the LLM will attempt to make a tool call in the current turn which looks like &quot;get_weather&quot;, for example. This is fixed by the PR, which ensures quotes and other HTML elements are not sent back to the model in HTML-encoded form, but in their normal form as they were actually generated by the model, stored in the database and shown in the UI.
So the LLMs affected by this are mostly weaker LLMs without very strong fine-tuning (through fine-tuning, most modern LLMs have very strong function calling, even small models like gpt-oss-20b).
Any model with good fine-tuning will ignore the HTML-encoded characters in its previous responses and generate correct tool calls anyway.
But weaker models, or models with not-so-perfect fine-tuning, will see their previous answers, where they seemingly used &quot; instead of ", and then repeat that, because their fine-tuning wasn't strong enough to teach them how tool calls have to work. This is what was reported here - and this is what the PR fixes.
@Bickio commented on GitHub (Feb 2, 2026):
I wasn't aware that this was a potential failure mode - I guess it seems plausible, but wouldn't an actual malformed tool call be prevented at the LLM API level, by the tool call schemas?
Where? The original issue only includes a vague mention of "degraded performance" which could equally refer to the same tool output hallucinations I see.
@Classic298 commented on GitHub (Feb 2, 2026):
Yes and Yes.
But we do not have a tool call in this case.
The LLM just outputs &quot;get_weather&quot;. This is not a tool call.
This is just text that almost could be a tool call.
Therefore, no tool is called and no tool is executed.
And this happens because Open WebUI wrongfully sent back encoded HTML quotes instead of just normal quotes in prior messages of the same conversation - leading weaker LLMs to believe that you call tools by writing &quot;get_weather&quot; instead of "get_weather".
Fair - the original issue does not explicitly state "primarily happens with weaker or less-well-fine-tuned LLMs".
Here's what I did:
Later on, the original issue reporter also said they used qwen2-30b - clearly a smaller model, and not the latest model available from Qwen either, so besides being small, it is also potentially not well fine-tuned.
So TLDR: This issue is about degraded performance with small or not-well-fine-tuned models.
A monster like Claude Opus, and even small but well-fine-tuned models like Haiku, power through with the tool calls even if you send them corrupted prior messages with broken prior tool calls because of wrongfully encoded HTML quotes.
So through debugging, testing and focusing on fixing what was reported (we indeed should not send accidentally modified LLM turns back to the LLM in any case anyways - and to me this was the core issue I was focusing on primarily), the issue was fixed, and the reporter also later confirmed which model they used, confirming my suspicion, along with my own tests, that this primarily affects weaker models.
TLDR for the TLDR:
PS:
This issue is about tool calls, not tool responses. I should have noticed earlier that your screenshots did not even show a tool call to begin with, but the &quot; at the very beginning led me to believe the screenshot was meant to show a failed tool call - but that's not what the screenshot shows. As you said, it only shows the model hallucinating some tool call result that isn't there.
@Classic298 commented on GitHub (Feb 2, 2026):
hope that explains it
@Bickio commented on GitHub (Feb 2, 2026):
Thanks, I was not aware that you'd successfully reproduced the LLM failing to use tool calls. If there's a concrete case where the PR reduces cases of hallucinated tool use, then I agree the PR is valuable on its own.
I do find it interesting that the LLM was hallucinating tool calls in your testing rather than tool outputs. Was your small model where you reproduced the malformed tool call connected via an OpenAI compatible API, or via Ollama? If it's Ollama, I suspect that there may be a difference in how the tool injection is formatted between the two systems - perhaps the Ollama code injects the tool call as well as the output, whereas the OpenAI code only injects the output?
@Classic298 commented on GitHub (Feb 3, 2026):
Yes, I was able to reproduce faulty tool calls - the original report was exclusively about tool calls and their degraded performance
Model connected via OpenAI, but Ollama should handle tool calls just as well.
Well, hallucinated tool calls is the wrong word here, if we are being honest. The model knows the tool is available and is trying its best to call it.
Equally as much as you find it interesting that users have issues with... well, hallucinated is the wrong word here - malformed tool calls by the model due to poisoned context input, I find it interesting that you struggle with totally hallucinated model outputs.
@Bickio commented on GitHub (Feb 3, 2026):
Correct me if I'm wrong here, but they're not just malformed, the model is putting the tool call in the assistant content instead of the tool_calls array. So even if it was outputting correct JSON (no HTML escaping) the tool call wouldn't actually be processed, right?
@Classic298 commented on GitHub (Feb 3, 2026):
If we wanna be precise, yes. That's another fault the model makes here. But some models are also (seemingly) trained to output it to the normal model output, like DeepSeek V3.2 (full) - which then struggle even more if they see &quot; in their previous messages.
This was also one of the models I tested the PR on. The PR didn't fully resolve it for this model though, because DeepSeek V3.2 is also trained on DSML (DeepSeek markup language), which calls tools very differently from plain OpenAI tool calls - but it still improved tool calling performance.
@pfn commented on GitHub (Feb 3, 2026):
This bug is making tool-calling nigh unusable. My chats are unpredictably getting corrupted by mis-quoted tool output showing up. Especially when it should be in a tool element rather than output by the assistant itself. Assistant role should summarize tool results possibly, but not embed tool results directly (the LLM could embed results, but it's not something the chat host should be doing). Assistant role should only emit tool_call which the chat host links to the actual function call. Once a tool role message is received, the model decides what to do with it in the resulting assistant message, whether embed or not.
@Classic298 commented on GitHub (Feb 3, 2026):
@pfn did you test the PR? It's a very minimal one-liner fix
@pfn commented on GitHub (Feb 3, 2026):
@Classic298 does it apply cleanly to the release branch? I currently run out of the docker release image and it may or may not be convenient to apply. If so, I'll make a new local image that incorporates the PR
@Classic298 commented on GitHub (Feb 3, 2026):
@pfn https://github.com/open-webui/open-webui/pull/20755
@pfn commented on GitHub (Feb 4, 2026):
@Classic298 preliminary testing looks positive. It lasted a lot longer before eventually getting confused; for some reason, it starts embedding the JSON tool result/template into the assistant response. Not sure of the source yet, however the escaping issue is gone:
regenerating the response can make it go back to normal again, so I'm not sure where the issue lies for this. The underlying model is GLM 4.7 Flash w/ thinking enabled
@Bickio commented on GitHub (Feb 4, 2026):
This was always the expected outcome, at least to me. Glad to have confirmation from another user that the HTML escaping is indeed a red herring, and that at best #20755 just delays the tool output hallucinations on some models. On the other hand, at least the hallucinations are slightly more readable now... 😉
@pfn If you're curious about what's going on in your example, I wrote this discussion post to explain: #21098
@pfn commented on GitHub (Feb 4, 2026):
@Bickio indeed, your description of what's going on in #21098 sounds very apt. I don't know how to look at the underlying chat interaction in openwebui, so I can't confirm what you're describing, but I do completely agree that tool role messages must not be merged into assistant role messages
@Classic298 commented on GitHub (Feb 4, 2026):
How is it a red herring if now 6 users confirmed the PR helps with tool calling on subsequent responses?.......................
@pfn commented on GitHub (Feb 4, 2026):
it doesn't seem to address the underlying root cause. It just makes the &quot; disappear
@Classic298 commented on GitHub (Feb 4, 2026):
@pfn Please see what issue you are commenting on ^
@pfn commented on GitHub (Feb 4, 2026):
I am commenting on this issue, my point being that it isn't the right issue to address. The effort shouldn't be in making the HTML entities go away, it should be in making the tool and assistant role messages discrete rather than a merged assistant message. That breaks the model's context when they get merged. I applied the one-liner you mentioned, it still results in a broken chat, just without HTML entities getting in the way.
@Bickio commented on GitHub (Feb 4, 2026):
@Classic298 The only actual user facing issue here is the AI hallucination and degraded performance. The original author of this ticket made the (not unreasonable) assumption that the hallucination was caused by the HTML escaping, which we now know is not true.
@pfn commented on GitHub (Feb 4, 2026):
re this @Bickio -- I'm not reading further in the backlog of messages here, but there is some confusion in what you asked chatgpt:
I've mentioned previously that I am new to using openwebui, so I don't really know how to confirm whether this is actually what is happening, because I do not know how to see the chat stream that is sent to the LLM.
@Bickio commented on GitHub (Feb 4, 2026):
@pfn agreed, and I pointed this out previously too
@Classic298 then agreed that Open WebUI was using the API wrong, but denied that the misuse impacts the user experience, citing the lack of users reporting issues, and suggested that the issue was with my infrastructure or configuration.
It's been a frustrating journey, but I hope we're on the home stretch now
@Classic298 commented on GitHub (Feb 4, 2026):
then how come it fixes it for me and other users?
The answer is simple: because you are experiencing a different issue.
And I stand by it unless someone can give us a reproducible example.
Claude Haiku, Sonnet, Opus - all do tool calling perfectly fine even without very strict adherence to the tool role standard.
Besides, you created a separate discussion for this, where exactly that can be discussed. This issue is, as I have said again and again, limited to the scope of what was reported.
Besides, Tim said in the PR comments that a tool_output section may be introduced in dev. And yet we are discussing the wrong topic on the wrong issue, for something that might be addressed soon anyway.
@pfn @Bickio opened a discussion for exactly what you want to discuss
and @Bickio, you calling it a red herring when multiple users confirmed this fixes it for them simply proves once more that you have a different issue than what this issue is about
@Bickio commented on GitHub (Feb 4, 2026):
Fixes what exactly?
The original poster said this:
@pfn said this:
In both cases, after applying your PR:
Both of these cases match both my own original predictions and my personal experience too.
@Classic298 As far as I can tell, the only person who's reported that the PR completely fixes the hallucinations is you
I opened a separate discussion at your request, not because I thought it was a distinct issue
See above. Multiple users (including me) have confirmed that the HTML is now unescaped, yes. And also all of those users have reported that this has not fixed the hallucinations
@Bickio commented on GitHub (Feb 4, 2026):
I can't find the comment you're referring to. Would it be possible to get some clarity on what is planned?
@Classic298 commented on GitHub (Feb 4, 2026):
See there #21135
@Bickio commented on GitHub (Feb 4, 2026):
These comments?
Are you suggesting these users reproduced the hallucinations? Seems more likely to me that they're just confirming that the HTML unescaping correctly unescapes the HTML, which I already agree with.
In fact, one of those users, who appears to be another Open WebUI contributor, posted their testing methodology at the top of this issue:
Indeed, this testing methodology is enough to check the HTML unescaping, but doesn't do anything to reproduce the real issue of LLM hallucinations.
The comment in question is extremely brief:
I assumed there must have been more detail hiding somewhere, hence my request for clarity. Nevertheless, it sounds like a step generally in the right direction, so I'm hopeful we'll see some progress towards fixing this issue at last.
@Classic298 commented on GitHub (Feb 4, 2026):
I am certain that in a few days we will see a commit re: this on dev
@espen96 commented on GitHub (Feb 7, 2026):
I wanted to note that I've seen the same behavior. And at times, rarely, I've seen GLM 4.7 Flash visibly confused in its reasoning, stating that it now has a tool result that it sees after a web search, alongside user-given context.
Later, when debugging another tool that resulted in it following up with a search, I questioned it about the tool call. It was confused; for some reason it was no longer able to recognize that it had made a tool call.
It has also fallen for the problem of attempting to write a tool output itself, as it is effectively being shown a multi-shot example of how retrieval works.
Looking at search especially, it does appear that all the searching is merged together, and then sent in the user message using the normal search setup? If so, no wonder the LLM sees two sources for the same information and doesn't understand what happened. It searched. It sees the results, but it always had the results?
@jrodguez commented on GitHub (Feb 10, 2026):
I've been experiencing similar issues with gpt-oss:20b served over vLLM and connected to openwebui via an OpenAI-compatible endpoint. Similar to #20926, tool calls will work great for a bit but after a few turns, it stops generating responses. Checked backend logs and I indeed see the &quot; everywhere. Noticed the same thing happens in opencode like #20896 (but I don't route through openwebui in this instance?). Going to try out the solution reported here.
@Classic298 commented on GitHub (Feb 15, 2026):
Fixed by f2aca781c8