[GH-ISSUE #24144] issue: LLM response constantly gets "stuck" on most models for most queries with tools #58875

Closed
opened 2026-05-06 00:19:12 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @peterwwillis on GitHub (Apr 26, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24144

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.9.2

Ollama Version (if applicable)

No response

Operating System

Ubuntu 24.04

Browser (if applicable)

Firefox 150

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

Tools run, AI continues to output tokens to completion. (This doesn't happen... see below)

Actual Behavior

I use different LLM providers: OpenCode Go, OpenRouter, OpenAI, etc. I also use various models, mainly latest MiniMax, Qwen, GLM, Kimi, and some GPT.

Web search is Searxng running in a local Docker.

I'll have a chat open and enable web search and code execution (and I make sure native tool calling is selected).

Randomly, probably 25% of the time, if the model runs a web search or code execution, the model just stops outputting tokens after running those tools. I'll then send "did you get stuck?" or something and suddenly it's replying again, but of course it skips the previous tool call info.

This has been going on for months. There's no debugging information for me to tell what is going wrong. Basically OpenWebUI is randomly broken a quarter of the time.

This doesn't happen if I don't select the tools.

Steps to Reproduce

  1. Open a new chat
  2. Select a model (ex. Kimi K2.5)
  3. Enable native tool calling
  4. Select the web search and code execution default tools
  5. Search for something that does a tool call, like "Give me step by step instructions to install X on Ubuntu 24.04", or "Give me a step by step retirement plan based on X Y Z <specific financial details, to involve code execution>".

Logs & Screenshots

Conversations so far have been private. If you can provide me with a way to generate some kind of useful log, I'll make more attempts and gather the logs.
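(A minimal sketch of one way to capture more detail, assuming a standard Docker deployment: Open WebUI documents a `GLOBAL_LOG_LEVEL` environment variable for verbose backend logging, and `docker logs` can tail the container output. The container name `open-webui` is an assumption; substitute your own.)

```shell
# Restart Open WebUI with debug-level backend logging enabled
# (GLOBAL_LOG_LEVEL is a documented Open WebUI env var; container
# name and port mapping below are assumptions for illustration).
docker run -d \
  --name open-webui \
  -p 3000:8080 \
  -e GLOBAL_LOG_LEVEL=DEBUG \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main

# Then reproduce the stall and capture the container logs,
# plus the browser console (F12 -> Console / Network tabs).
docker logs -f open-webui 2>&1 | tee openwebui-debug.log
```

With debug logging on, the backend should log each tool call and the subsequent chat-completion request, which would show whether the follow-up request after the tool result is ever sent or whether it hangs waiting on the provider.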

Additional Information

This has been going on for 2 months. I really want to find an alternative to OpenWebUI so I can actually use the functionality.

GiteaMirror added the bug label 2026-05-06 00:19:12 -05:00

@belugaming commented on GitHub (Apr 26, 2026):

I have the same problem, especially when the web search tool is called multiple times: the web page freezes. Even on a Mac Studio M1 Ultra with 64 GB RAM, the UI lags severely when the LLM searches 100 websites.


@silenceroom commented on GitHub (Apr 28, 2026):

I had exactly the same problem here. I was guessing it might be related to a networking issue or something.

It seems the model just gets stuck after executing web_search for a round, or after trying to call a complicated skill. With some models, clicking the "Continue" button keeps it running, but most of the time it just repeats what it had already done and gets stuck again.


@Classic298 commented on GitHub (May 2, 2026):

I have never experienced this, and I also use OpenAI, OpenRouter, and Kimi K2.5 with native tool calling. I've used all of it and never hit this.

There's no obvious reason this should be happening. What's your precise setup? What aiohttp timeout have you configured, if any? What does your network connectivity look like, both from your browser to Open WebUI and from Open WebUI to the provider?

How can I reproduce this?

Are you maybe just silently running out of context?

Reference: github-starred/open-webui#58875