[GH-ISSUE #21802] issue: chat_completion_tools_handler called with empty tools_dict, causing duplicate model execution #35104

Closed
opened 2026-04-25 09:18:29 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @ashkenazzio on GitHub (Feb 23, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/21802

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.8.5 (commit 1ac3dd4)

Ollama Version (if applicable)

No response

Operating System

NixOS

Browser (if applicable)

Zen Browser

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

Description

When a model has no tools configured (no MCP servers, no builtin tools, no function tools), process_chat_payload() still calls chat_completion_tools_handler() with an empty tools_dict. This sends a non-streaming request to the same model asking it to choose tools — effectively running the model twice for every user message.

For fast models such as GPT-4o or Claude Haiku, this adds a few seconds of latency that may go unnoticed. For agent-style backends exposed as OpenAI-compatible endpoints (which can take minutes to process), this is catastrophic — the agent runs its entire pipeline twice, with the first run's output silently discarded.

Actual Behavior

Root Cause

In backend/open_webui/utils/middleware.py, line ~2525:

```python
if tools_dict:
    if metadata.get("params", {}).get("function_calling") == "native":
        metadata["tools"] = tools_dict
        form_data["tools"] = [
            {"type": "function", "function": tool.get("spec", {})}
            for tool in tools_dict.values()
        ]

else:
    # If the function calling is not native, then call the tools function calling handler
    try:
        form_data, flags = await chat_completion_tools_handler(
            request, form_data, extra_params, user, models, tools_dict
        )
        sources.extend(flags.get("sources", []))
    except Exception as e:
        log.exception(e)
```
The else is at the same indentation as if tools_dict:, so it fires when tools_dict is empty (falsy) — not when function calling is non-native. The comment says "If the function calling is not native" but the code does the opposite: it runs when there are no tools at all.

Inside chat_completion_tools_handler(), the function builds a tool-calling prompt (with an empty tools list), sends it as a non-streaming generate_chat_completion() call to the model, waits for the full response, parses it for tool calls (finds none), and returns. The actual user request hasn't even been sent yet.
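A minimal, self-contained sketch of the control-flow pitfall (hypothetical `fake_model_call` and `dispatch_buggy` stand-ins for illustration only, not Open WebUI code): with the `else` paired to `if tools_dict:`, an empty dict is falsy, so the tool-selection round-trip fires precisely when there are no tools.

```python
# Hypothetical stand-ins to illustrate the reported control flow; not Open WebUI code.
model_calls = 0

def fake_model_call():
    """Stand-in for the extra non-streaming generate_chat_completion() round-trip."""
    global model_calls
    model_calls += 1

def dispatch_buggy(tools_dict, native):
    # Mirrors the reported structure: the else pairs with `if tools_dict:`.
    if tools_dict:
        if native:
            pass  # would attach tool specs to the payload
    else:
        fake_model_call()  # fires when tools_dict is empty, i.e. no tools at all

dispatch_buggy({}, native=False)  # model with no tools configured
print(model_calls)  # → 1: one extra model call before the real request is even sent
```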

Steps to Reproduce

  1. Connect any OpenAI-compatible backend (e.g., a custom FastAPI server)
  2. Ensure the model has no tools, no MCP servers, no functions configured
  3. Send a chat message to that model
  4. Observe in the backend logs that the model receives two requests: first a non-streaming tool-selection request, then the actual streaming chat request
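
To make step 4 observable without a slow agent backend, a throwaway stub can log every request it receives. This is a hypothetical stdlib-only sketch (the reporter's actual backend was a custom FastAPI server); `StubHandler` and `request_log` are names invented here for illustration.

```python
# Hypothetical stdlib-only OpenAI-compatible stub that records incoming requests.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

request_log = []  # one entry per request the "model" receives

class StubHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.loads(body or b"{}")
        request_log.append({"path": self.path, "stream": payload.get("stream", False)})
        reply = json.dumps({
            "id": "stub", "object": "chat.completion",
            "choices": [{"index": 0, "finish_reason": "stop",
                         "message": {"role": "assistant", "content": "ok"}}],
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        pass  # silence default per-request stderr logging

server = HTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Example: one non-streaming request, shaped like the tool-selection round-trip.
req = urllib.request.Request(
    f"http://127.0.0.1:{port}/v1/chat/completions",
    data=json.dumps({"model": "stub", "stream": False,
                     "messages": [{"role": "user", "content": "hi"}]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    json.load(resp)

print(request_log)  # with the bug present, each chat message adds TWO entries here
```

Pointing Open WebUI's OpenAI API base URL at this stub and sending one chat message should, with the bug present, append two entries to `request_log`: first a non-streaming tool-selection request, then the real streaming one.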

Logs & Screenshots

Suggested Fix

Guard the else branch so chat_completion_tools_handler is only called when tools_dict is non-empty:

```python
if tools_dict:
    if metadata.get("params", {}).get("function_calling") == "native":
        metadata["tools"] = tools_dict
        form_data["tools"] = [
            {"type": "function", "function": tool.get("spec", {})}
            for tool in tools_dict.values()
        ]
    else:
        try:
            form_data, flags = await chat_completion_tools_handler(
                request, form_data, extra_params, user, models, tools_dict
            )
            sources.extend(flags.get("sources", []))
        except Exception as e:
            log.exception(e)
```

This moves the else inside the if tools_dict: block, so it correctly means "tools exist but function calling is not native" rather than "no tools exist."
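The corrected semantics can be sketched the same way (hypothetical `fake_tools_handler` and `dispatch_fixed` stand-ins for illustration, not Open WebUI code): with the `else` nested one level deeper, the handler runs only when tools exist and function calling is non-native.

```python
# Hypothetical stand-ins to illustrate the corrected control flow; not Open WebUI code.
handler_calls = 0

def fake_tools_handler():
    """Stand-in for chat_completion_tools_handler()."""
    global handler_calls
    handler_calls += 1

def dispatch_fixed(tools_dict, native):
    # The else now pairs with the inner native check, not with `if tools_dict:`.
    if tools_dict:
        if native:
            pass  # would attach tool specs to the payload instead
        else:
            fake_tools_handler()  # only when tools exist AND calling is non-native

dispatch_fixed({}, native=False)                    # no tools: no extra round-trip
dispatch_fixed({"t": {"spec": {}}}, native=True)    # native: specs attached, no handler
dispatch_fixed({"t": {"spec": {}}}, native=False)   # non-native with tools: handler runs
print(handler_calls)  # → 1
```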

Note: Commit 8f0658e ("fix: payload tools handling") restructured this area but may not have addressed this specific case. Our testing confirms the bug is present in v0.8.5.

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 09:18:29 -05:00

@tjbck commented on GitHub (Feb 23, 2026):

Investigating.


@tjbck commented on GitHub (Feb 23, 2026):

Addressed in dev.


@blakkd commented on GitHub (Feb 26, 2026):

Was driving me mad! I thought my config was corrupted. Nice to see the cause was identified 👍


@ashkenazzio commented on GitHub (Feb 26, 2026):

> Was driving me mad! I thought my config was corrupted. Nice to see the cause was identified 👍

Yeah, same. I was chasing a ghost bug in my code for 3 hours or so 🥲

Reference: github-starred/open-webui#35104