[GH-ISSUE #19656] issue: Tool call response tokens are duplicated, causing 2x token consumption #18948

Closed
opened 2026-04-20 01:14:05 -05:00 by GiteaMirror · 10 comments

Originally created by @FujinoXiao on GitHub (Dec 1, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/19656

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Git Clone

Open WebUI Version

v0.6.40

Ollama Version (if applicable)

No response

Operating System

Ubuntu 22.04

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

The tool response should only be included once in the context. For example, if a tool returns content that should consume ~10,000 tokens, only ~10,000 tokens should be used.

Actual Behavior

The tool response content is duplicated in the context, causing 2x token consumption. When a tool returns content worth ~10,000 tokens, it actually consumes ~20,000 tokens because the response appears twice in the message history.

Steps to Reproduce

  1. Go to Open WebUI, navigate to Workspace -> Tools -> Create a new tool

  2. Add the following test tool code:

from pydantic import Field

class Tools:
    def __init__(self):
        pass

    def repeat_test(
        self,
        text: str = Field("test", description="String to repeat"),
        count: int = Field(10000, description="Number of repetitions"),
    ) -> str:
        """
        Repeat a string for testing purposes.
        """
        return text * count
  3. Enable this tool in a chat

  4. Ask the model to call this tool (e.g., "Please use the repeat_test tool with count=10000")

  5. Check the token usage

  6. The word "test" is approximately 1 token, so repeating it 10,000 times should consume ~10,000 tokens plus a small amount for the user's question and model's response (maybe ~11,000 tokens total; see the token-counting sketch after this list)

  7. However, the actual token consumption shows ~22,000+ tokens, indicating the tool response is being duplicated

  8. This issue scales linearly: if you set count=20000, the expected usage is ~20,000 tokens, but actual usage is ~40,000+ tokens. The tool response is consistently duplicated regardless of size.
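
To sanity-check the expected number independently of the provider's usage report, here is a minimal sketch using the tiktoken package (an assumption on my part; the backend model may use a different tokenizer, so treat the count as a ballpark):

import tiktoken

# cl100k_base is only a stand-in for whatever tokenizer the model uses
enc = tiktoken.get_encoding("cl100k_base")
payload = "test" * 10000
print(len(enc.encode(payload)))  # rough token count of the tool response alone

# If the usage reported by the API is roughly double this number plus a
# small prompt overhead, the tool response is very likely sent twice.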

Logs & Screenshots

Token consumption comparison:

Test 1: repeat_test with count=10000

from pydantic import Field

class Tools:
    def __init__(self):
        pass

    def repeat_test(
        self,
        text: str = Field("test", description="String to repeat"),
        count: int = Field(10000, description="Number of repetitions"),
    ) -> str:
        """
        Repeat a string for testing purposes.
        """
        return text * count
  • Expected: ~10,000 tokens (tool response) + ~1,000 tokens (question + reply) ≈ 11,000 tokens
  • Actual: ~20,000+ tokens
Image: https://github.com/user-attachments/assets/4ecd17b7-66d4-4f0d-8f75-7d9cbdb423f5
Image: https://github.com/user-attachments/assets/e0b7b6a7-74eb-44dd-9a0f-cb345581aa67

Test 2: repeat_test with count=20000

from pydantic import Field

class Tools:
    def __init__(self):
        pass

    def repeat_test(
        self,
        text: str = Field("test", description="String to repeat"),
        count: int = Field(20000, description="Number of repetitions"),
    ) -> str:
        """
        Repeat a string for testing purposes.
        """
        return text * count
  • Expected: ~20,000 tokens (tool response) + ~1,000 tokens (question + reply) ≈ 21,000 tokens
  • Actual: ~40,000+ tokens

Conclusion: Tool response tokens are consistently doubled, confirming duplication issue.

Image: https://github.com/user-attachments/assets/c190511b-8fa2-4790-ac3a-16ca405d8163
Image: https://github.com/user-attachments/assets/da15fedf-4c06-4f5a-a54e-cc5ee579f02c

Additional Information

No response

GiteaMirror added the confirmed issue and bug labels 2026-04-20 01:14:05 -05:00

@owui-terminator[bot] commented on GitHub (Dec 1, 2025):

🔍 Similar Issues Found

I found some existing issues that might be related to this one. Please check if any of these are duplicates or contain helpful solutions:

  1. #19390 issue: tools will double the token cost
    by qq3829596922 • Nov 23, 2025 • bug

  2. #19169 issue: System Prompt Duplication During Agentic Tool Calls Leading to Token Waste and Write-Cache Overprice
    by alexis-dioxycle • Nov 13, 2025 • bug

  3. #17058 issue: Response cannot be stopped after the tool is called
    by EntropyYue • Aug 30, 2025 • bug

  4. #17678 issue: OAuth token after some time missing in tool calls
    by koflerm • Sep 23, 2025 • bug

  5. #16721 issue: Continuous Repetition of Tool-Use Responses
    by ziozzang • Aug 19, 2025 • bug

  6. #17047 issue: Multiple tool calls cause repetitive text output
    by pairwiserr • Aug 29, 2025 • bug

  7. #15690 issue: tool calls fail when model makes multiple tool calls in one response
    by Master-Pr0grammer • Jul 13, 2025 • bug

  8. #16138 issue: tools names are doubled when calling
    by latel • Jul 30, 2025 • bug

  9. #19509 issue: User overview page calls /api/v1/users multiple times
    by luke-wren • Nov 26, 2025 • bug

  10. #12829 issue: Using "tools" causes the API to be called twice.
    by KingPollux • Apr 14, 2025 • bug


💡 Tips:

  • If this is a duplicate, please consider closing this issue and adding any additional details to the existing one
  • If you found a solution in any of these issues, please share it here to help others

This comment was generated automatically by a bot. Please react with a 👍 if this comment was helpful, or a 👎 if it was not.


@FujinoXiao commented on GitHub (Dec 1, 2025):

This is a submission with detailed reproduction steps, test code, and token consumption data to properly document the bug, and it is the first time the issue has been reported with this level of detail.


@tjbck commented on GitHub (Dec 1, 2025):

@silentoplayz confirmation wanted here!


@silentoplayz commented on GitHub (Dec 1, 2025):

> @silentoplayz confirmation wanted here!

I tested this issue with an external GroqCloud model, and it failed to call the tool provided by @qq3829596922 in the issue post... at least up until the point where I told the model in the next query to "FIX IT FOR ME". It likely obliged (but hit a token limit), which translated into a 20k+ token API call. I think it would have been roughly 10k tokens instead if the tool call response tokens weren't being duplicated, which ultimately confirms the reported 2x token consumption suspicion.

Image: https://github.com/user-attachments/assets/f5694558-b91c-44ad-a240-d6fa8f0887e1

I had Kimi K2 Instruct fix the tool and I believe this confirms the issue further:
Image: https://github.com/user-attachments/assets/1db67767-6a94-4cd5-92be-74d2e11a886d

Tool code:

import os
import requests
from datetime import datetime

# from pydantic import Field   # not needed here


class Tools:
    def __init__(self):
        pass

    def repeat_test(
        self,
        text: str = "test",  # default string
        count: int = 10000,  # default repetitions
    ) -> str:
        """Repeat a string for testing purposes."""
        return text * count

My local models are failing to use the tool successfully.
Image: https://github.com/user-attachments/assets/26e0d09d-487b-44d7-9ae9-c755ce0728b2

Edit: I lowered the repeat amount to 1k (x10 reduction) and the model called the tool fine.

Image: https://github.com/user-attachments/assets/a514a1c5-8faa-4ab4-aca6-df6fa5f9227b

@rgaricano commented on GitHub (Dec 1, 2025):

The problem occurs because tool results are being added to the message history twice: once during initial tool processing and again when converting content blocks back to messages for subsequent LLM calls.

First - In chat_completion_tools_handler, tool results are immediately added to the message history after processing, here:
https://github.com/open-webui/open-webui/blob/140605e660b8186a7d5c79fb3be6ffb147a2f498/backend/open_webui/utils/middleware.py#L496-L499

& Second - In process_chat_response, when processing streaming responses, the content blocks (which include tool results) are converted back to messages and added to form_data["messages"], here:
https://github.com/open-webui/open-webui/blob/140605e660b8186a7d5c79fb3be6ffb147a2f498/backend/open_webui/utils/middleware.py#L2993-L2998

This should be fixable by simply removing the first addition (lines L496-L499).
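
For illustration, here is a minimal hypothetical sketch of the double-append pattern described above (all names are simplified stand-ins, not the actual middleware internals):

def tools_handler(messages, tool_result):
    # First addition: the tool result is appended right after execution
    # (the lines linked above).
    messages.append({"role": "tool", "content": tool_result})
    return messages

def process_response(messages, content_blocks):
    # Second addition: content blocks, which already carry the tool
    # result, are converted back into messages for the follow-up call.
    for block in content_blocks:
        messages.append({"role": block["role"], "content": block["content"]})
    return messages

# If content_blocks also contains tool_result, the follow-up request now
# holds it twice, hence the ~2x token consumption. Dropping the first
# append, as suggested above, would leave a single copy.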


@Ithanil commented on GitHub (Dec 1, 2025):

I'm not saying there is nothing to see here; this is actually the second report of this issue (see https://github.com/open-webui/open-webui/issues/19390).

But I think that to actually understand the issue, we first need to look at the requests received by the LLM backend. If I use the test tool (with low default reps) with native tool calling, I get the following request on the second turn (i.e., after the tool was called):

{
    "stream": true,
    "model": "Qwen3 Coder 30B",
    "messages": [
        {
            "role": "user",
            "content": "Use the test tool and report the result."
        },
        {
            "role": "assistant",
            "content": "",
            "tool_calls": [
                {
                    "id": "call_baca6daf6087405c89c019cd",
                    "function": {
                        "arguments": "{\"text\": \"Hello, World!\", \"count\": 3}",
                        "name": "repeat_test"
                    },
                    "type": "function",
                    "index": 0
                }
            ]
        },
        {
            "role": "tool",
            "tool_call_id": "call_baca6daf6087405c89c019cd",
            "content": "Hello, World!Hello, World!Hello, World!"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "repeat_test",
                "description": "\n        Repeat a string for testing purposes.\n        ",
                "parameters": {
                    "properties": {
                        "text": {
                            "default": "test",
                            "description": "String to repeat",
                            "type": "string"
                        },
                        "count": {
                            "default": 1,
                            "description": "Number of repetitions",
                            "type": "integer"
                        }
                    },
                    "type": "object"
                }
            }
        }
    ]
}

In my opinion, this is exactly what I would expect, and I see "Hello, World!Hello, World!Hello, World!" only once. Could you please look at the actual requests and point out where the apparent doubling occurs, for example with a quick check like the sketch below?
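
Assuming you can capture the outgoing request body (e.g., via a logging proxy in front of the backend), a low-effort check is to count how often the tool output string occurs across all message contents. A minimal sketch for an OpenAI-style JSON body:

import json

def tool_output_occurrences(body: str, tool_output: str) -> int:
    """Count occurrences of the tool output across all message contents."""
    total = 0
    for m in json.loads(body).get("messages", []):
        content = m.get("content")
        if isinstance(content, str):
            total += content.count(tool_output)
    return total

# A result of 2 for a single tool call would confirm the duplication.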

EDIT:
What I find actually strange, though, is how the previous tool results are included in further rounds of chat:

    "messages": [
        {
            "role": "user",
            "content": "Use the test tool and report the result."
        },
        {
            "role": "assistant",
            "content": "\"&quot;Hello, World!Hello, World!Hello, World!&quot;\"\nThe result of using the test tool is: \"Hello, World!Hello, World!Hello, World!\"."
        },
        {
            "role": "user",
            "content": "Please report the result again, without calling the tool."
        }
    ],

@rgaricano commented on GitHub (Dec 1, 2025):

The issue only occurs with non-native function calling; native function calling doesn't have this duplication.


@Ithanil commented on GitHub (Dec 1, 2025):

> The issue only occurs with non-native function calling; native function calling doesn't have this duplication.

Oh OK, thanks for clarifying. That wasn't clear to me.

In the case of non-native calling, the messages look as follows, confirming the issue: the tool result appears once inside the <context> block and then again appended as "Tool test_tool/repeat_test Output: testtesttest":

    "messages": [
        {
            "role": "user",
            "content": "### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).\n\n### Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**\n- Do not cite if the <source> tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* \"According to the study, the proposed method increases efficiency by 20% [1].\"\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.\n\n<context>\n<source id=\"1\" name=\"test_tool/repeat_test\">testtesttest</source>\n</context>\n\nUse the test tool with count=3 and report the result.\n\nTool `test_tool/repeat_test` Output: testtesttest"
        }
    ],
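
For clarity, here is a hypothetical reconstruction of how the non-native path apparently ends up with the result twice (the template is condensed from the captured message above; function and variable names are mine, not the real code):

def build_prompt(user_query: str, tool_name: str, tool_output: str) -> str:
    # First copy: the tool result is embedded as a cited source.
    context = (
        '<context>\n'
        f'<source id="1" name="{tool_name}">{tool_output}</source>\n'
        '</context>'
    )
    prompt = f"{context}\n\n{user_query}"
    # Second copy: the same output is appended again in plain text.
    prompt += f"\n\nTool `{tool_name}` Output: {tool_output}"
    return prompt

# Emitting the tool output only once, either inside <context> or as the
# plain "Tool ... Output:" line but not both, would halve the token cost.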

@tjbck commented on GitHub (Dec 1, 2025):

52ccab8fc0d18be5562c44b7414d6ceb1d1b1b01


@FujinoXiao commented on GitHub (Dec 2, 2025):

52ccab8

Why so fast? I just wanted to fix the bug myself to become a contributor.

Reference: github-starred/open-webui#18948