mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 11:28:35 -05:00
[GH-ISSUE #20896] issue: Generation stops after tool call when routing Ollama through WebUI (GLM-4.7-Flash in OpenCode) #34855
Originally created by @HuysArthur on GitHub (Jan 23, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/20896
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.7.2
Ollama Version (if applicable)
v0.14.3
Operating System
Linux Mint 19.1
Browser (if applicable)
No response
Confirmation
Expected Behavior
When I ask the model to do something in opencode, it calls a tool and then returns the answer I asked for. This works when I use the model directly through the Ollama API, as shown below:
user@host dir % opencode run "Create the file 'README' with the contents 'hello world', let me know if it succeeded" --model ollama/glm-4.7-flash:bf16-80k
I'll create the README file with the specified content.
| Write Users/user/dir/README
README created successfully with content "hello world".
Actual Behavior
But when I use the same model from Ollama routed through the Open WebUI API, it always stops execution after a tool call, as shown below. It correctly creates the file, but then it stops. This happens every time opencode calls a tool with this configuration, and I have to manually type "continue" after each tool call.
user@host dir % opencode run "Create the file 'README' with the contents 'hello world', let me know if it succeeded" --model webui/glm-4.7-flash:bf16-80k
| Write Users/user/dir/README
Steps to Reproduce
example opencode config:
{
  "$schema": "https://opencode.ai/config.json",
  "permission": {
    "edit": "allow",
    "bash": "ask",
    "webfetch": "ask",
    "doom_loop": "ask",
    "external_directory": "ask"
  },
  "provider": {
    "webui": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "WebUI",
      "options": {
        "baseURL": "http://{{ip_webui}}:{{port_webui}}/api/v1",
        "apiKey": "{{key}}"
      },
      "models": {
        "glm-4.7-flash:bf16-80k": {
          "name": "GLM 4.7 Flash"
        }
      }
    }
  }
}
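The failing request can also be reproduced without opencode. A hedged sketch, reusing the placeholders from the config above and the standard chat-completions path that the @ai-sdk/openai-compatible provider derives from the baseURL (the `write_file` tool definition here is hypothetical, for illustration only):

```shell
# Hypothetical reproduction against the Open WebUI OpenAI-compatible API.
# {{ip_webui}}, {{port_webui}}, {{key}} are the same placeholders as in
# the opencode config above.
curl "http://{{ip_webui}}:{{port_webui}}/api/v1/chat/completions" \
  -H "Authorization: Bearer {{key}}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7-flash:bf16-80k",
    "messages": [
      {"role": "user", "content": "Create the file README with the contents hello world"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "write_file",
        "description": "Write a file to disk (illustrative tool, not from opencode)",
        "parameters": {
          "type": "object",
          "properties": {
            "path": {"type": "string"},
            "content": {"type": "string"}
          },
          "required": ["path", "content"]
        }
      }
    }]
  }'
```

If the bug is in the proxy layer, the interesting part of the response is the first choice's finish_reason and whether the tool_calls array survives the round trip.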
Logs & Screenshots
Nothing unusual in the logs.
Additional Information
Config for the model in Open WebUI is all default: glm-4.7-flash_bf16-80k-1769159953751.json
@owui-terminator[bot] commented on GitHub (Jan 23, 2026):
🔍 Similar Issues Found
I found some existing issues that might be related to this one. Please check if any of these are duplicates or contain helpful solutions:
#19864: Ollama Parameters get overriden after native tool calls • by Haervwe • Dec 10, 2025 • bug
#17058: Response cannot be stopped after the tool is called • by EntropyYue • Aug 30, 2025 • bug, confirmed
#20775: calls to tools in gpt-oss • by chdid • Jan 18, 2026 • bug
#17729: generation does not continue after tool call if a client is not connected to the UI • by johnnyasantoss • Sep 25, 2025 • bug
#20600: Tool call results not decoded from HTML entities before sending to LLM • by Koumi460 • Jan 12, 2026 • bug
This comment was generated automatically by a bot. Please react with a 👍 if this comment was helpful, or a 👎 if it was not.
@Classic298 commented on GitHub (Jan 25, 2026):
Testing wanted https://github.com/open-webui/open-webui/pull/20933
@HuysArthur commented on GitHub (Feb 2, 2026):
I still have the same issue. I built a new Docker image from your code at Classic298:tool-call-early-stop. I tried to replicate my issue and the output still stops after a tool call.
@Classic298 commented on GitHub (Feb 2, 2026):
What am I seeing here? This does not look like Open WebUI.
How did you test my PR?
@HuysArthur commented on GitHub (Feb 2, 2026):
I route my models through OpenWebUI. So this is a test run with opencode (cli coding agent), but it communicates with the OpenWebUI API.
@Classic298 commented on GitHub (Feb 2, 2026):
This introduces additional failure points and is also outside of scope: here the conversation handling is done by opencode, not by Open WebUI anymore.
@HuysArthur commented on GitHub (Feb 2, 2026):
Okay, so this issue appears to be unrelated to PR #20933. But what is OpenWebUI doing that causes tool calling to halt through the API? Everything works as expected when I route OpenCode directly to Ollama instead. Is there something specific that OpenWebUI handles or modifies between these calls that might be causing the breakage?
@HuysArthur commented on GitHub (Feb 2, 2026):
I was able to resolve the early stopping / halting after tool calls by configuring my Ollama endpoint in Open WebUI as an OpenAI API compatible connection (instead of the native Ollama API one). Tool calling now flows correctly without manual "continue" prompts.
For me this fixes the problem described — feel free to close the issue
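The workaround above is consistent with how an OpenAI-compatible agent client decides whether to keep going. A minimal sketch, assuming opencode's provider follows the standard OpenAI chat-completions tool-calling protocol (the dicts below are illustrative, not taken from real responses): the loop continues only while finish_reason is "tool_calls", so a proxy that rewrites it to "stop" after a tool call would halt the agent exactly as reported.

```python
# Sketch of the continue/stop decision in an OpenAI-compatible agent loop.
# Assumption: the client re-calls the model only when the last choice's
# finish_reason is "tool_calls" (standard OpenAI chat-completions behavior).

def should_continue(choice: dict) -> bool:
    """Keep the agent loop alive only while the model is requesting tools."""
    return choice.get("finish_reason") == "tool_calls"

# Response forwarded correctly (e.g. straight from Ollama's
# OpenAI-compatible endpoint): the agent executes the tool and re-calls.
ok = {
    "finish_reason": "tool_calls",
    "message": {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
}

# Response with finish_reason rewritten to "stop" somewhere in the proxy:
# the agent halts even though a tool call is present in the message.
broken = {
    "finish_reason": "stop",
    "message": {"role": "assistant", "tool_calls": [{"id": "call_1"}]},
}

print(should_continue(ok))      # True  -> loop continues after the tool runs
print(should_continue(broken))  # False -> generation appears to stop
```

This would explain why the behavior depends on which connection type Open WebUI uses upstream: the two code paths may normalize the response differently before it reaches the client.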
@Classic298 commented on GitHub (Feb 2, 2026):
huh
@bannert1337 commented on GitHub (Feb 3, 2026):
I still have the issue on Open WebUI directly. After GLM-4.7-Flash runs native knowledge retrieval it ends generation.
It ends with
\n<summary>Tool Executed</summary>\n</details>
I am running the model through llama.cpp and have it connected as an OpenAI-compatible endpoint.
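The fragment quoted above suggests the tool-status markup that Open WebUI embeds in the assistant message is leaking to the client. A hedged sketch, assuming the status is wrapped in an HTML details block (the sample content below is illustrative, not from real logs): stripping that block recovers the plain model text.

```python
import re

# Assumption: Open WebUI embeds tool execution status as an HTML
# <details>...</details> block inside the assistant message content,
# matching the "<summary>Tool Executed</summary></details>" fragment
# quoted in the comment above.
DETAILS_RE = re.compile(r"<details\b[^>]*>.*?</details>", re.DOTALL)

def strip_tool_status(content: str) -> str:
    """Remove embedded tool-status blocks from an assistant message."""
    return DETAILS_RE.sub("", content).strip()

# Illustrative message as a downstream client might receive it:
content = (
    '<details type="tool_calls">\n'
    "<summary>Tool Executed</summary>\n"
    "</details>\n"
    "The file was created successfully."
)
print(strip_tool_status(content))  # -> The file was created successfully.
```

If generation truly ends right after the closing </details>, the problem is upstream of any such client-side cleanup, but the fragment at least shows which markup is crossing the API boundary.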
@JTHesse commented on GitHub (Feb 10, 2026):
I see the same issue with qwen3-coder-next.
@J-T1 commented on GitHub (Feb 19, 2026):
Issue persists with v0.8.3, whether the endpoint is configured as Ollama or OpenAI API.
@moritzderallerechte commented on GitHub (Mar 18, 2026):
I have a similar issue.
I am accessing OpenRouter through a manifold pipe and use native tool calling.
When logged in with an admin account, everything works as expected.
However when using a normal user account, the response just stops after a tool call. That looks like this:
There are no errors in the OpenWebUI logs and there are no permission settings that correlate to this behaviour.
The tool calling works fine, as the correct response from querying the knowledge base is shown in the UI, but no subsequent call is made to the model.
Does anyone have an idea why this happens? I see no reason for this and am, quite frankly, frustrated...