[GH-ISSUE #19169] issue: System Prompt Duplication During Agentic Tool Calls Leading to Token Waste and Write-Cache Overprice #34324

Closed
opened 2026-04-25 08:15:43 -05:00 by GiteaMirror · 3 comments

Originally created by @alexis-dioxycle on GitHub (Nov 13, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/19169

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.26

Ollama Version (if applicable)

No response

Operating System

macOS and Ubuntu 22.04

Browser (if applicable)

Any (issue is server-side)

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When a model engages in agentic behavior with multiple sequential tool calls, the system prompt should remain singular and consistent throughout all tool invocations. Each tool call should see the system prompt only once in the messages array.

Actual Behavior

During agentic behavior with multiple tool calls, the system prompt gets duplicated with each subsequent tool call:

  • Tool call 1: System prompt appears 1 time ✓
  • Tool call 2: System prompt appears 2 times ✗
  • Tool call 3: System prompt appears 3 times ✗
  • Tool call 4: System prompt appears 4 times ✗
  • Tool call 5: System prompt appears 5 times ✗

This duplication:

  1. Significantly increases API costs (especially with Anthropic's prompt caching, where duplicated prompts break cache-prefix efficiency). The few hundred dollars lost are what made us realize the issue.
  2. Wastes context tokens very quickly if the system prompt is large.
  3. Reduces the available context window for the actual conversation.
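The overhead compounds quadratically with the length of the agentic loop, which is also why prefix-based prompt caching stops matching: each request now starts with a different number of system-prompt copies, so the cached prefix never lines up. A rough back-of-the-envelope sketch (the token figures are illustrative assumptions, not measurements from this setup):

```python
def extra_prompt_tokens(system_tokens: int, num_tool_calls: int) -> int:
    """Redundant input tokens caused by the duplication bug.

    Call k carries (k - 1) extra copies of the system prompt, so over N
    sequential tool calls the overhead is system_tokens * N * (N - 1) / 2,
    i.e. it grows quadratically with the length of the agentic loop.
    """
    return system_tokens * num_tool_calls * (num_tool_calls - 1) // 2

# Hypothetical example: a 2,000-token system prompt over 10 tool calls.
print(extra_prompt_tokens(2000, 10))  # 90000 redundant input tokens
```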

Steps to Reproduce

I made a step-by-step guide that should be easy to follow: it only requires creating a tool that returns the system prompt and then prompting the model to act agentically:



Prerequisites

  1. Start with Open WebUI v0.6.26 running in Docker
  2. Have access to a model that supports tool calling (e.g., Claude Sonnet 4, GPT-4, or any Anthropic/OpenAI model)
  3. Ensure you have admin access to create custom tools

Step-by-Step Reproduction

1. Set Up System Prompt

  • Navigate to Admin Panel > Settings > Models
  • Select your model (e.g., Claude Sonnet 4)
  • In the System Prompt field, enter:
    You are a model that does a lot of tool call. 
    
  • Under Advanced Params, enable native function calling (Native Tool Call)
  • Save the configuration

2. Create the System Prompt Inspector Tool

Navigate to Workspace > Tools > + (Create New Tool)

Paste the following complete tool code:

"""
title: System Prompt Inspector
author: Test User
version: 1.0.0
description: A tool to inspect and display the system prompt and message structure
"""

from pydantic import BaseModel


class Tools:
    def __init__(self):
        pass

    class Valves(BaseModel):
        pass

    class UserValves(BaseModel):
        pass

    def get_system_prompt(self, __event_emitter__=None) -> str:
        """
        Retrieves and displays the current system prompt(s) from the conversation.
        Use this to check how many times the system prompt appears.
        
        :return: A formatted string showing all system prompts and their count
        """
        
        # Get the messages from the event emitter
        if __event_emitter__ is None:
            return "Error: No event emitter available"
        
        # Access the messages array
        try:
            messages = __event_emitter__.get("messages", [])
            
            if not messages:
                return "No messages found in conversation"
            
            # Count system prompts
            system_prompts = []
            system_count = 0
            
            for idx, msg in enumerate(messages):
                if msg.get("role") == "system":
                    system_count += 1
                    content = msg.get("content", "")
                    system_prompts.append({
                        "index": idx,
                        "content": content[:200] + "..." if len(content) > 200 else content
                    })
            
            # Build result
            result = f"**System Prompt Analysis**\n\n"
            result += f"Total system prompts found: **{system_count}**\n\n"
            
            if system_count == 0:
                result += "No system prompts found.\n"
            elif system_count == 1:
                result += "✓ Correct: Only one system prompt (as expected)\n\n"
                result += f"**System Prompt Content:**\n```\n{system_prompts[0]['content']}\n```"
            else:
                result += f"✗ **ISSUE DETECTED**: {system_count} duplicate system prompts found!\n\n"
                for i, prompt in enumerate(system_prompts, 1):
                    result += f"**System Prompt #{i}** (message index {prompt['index']}):\n"
                    result += f"```\n{prompt['content']}\n```\n\n"
            
            # Show total message count
            result += f"\n---\n**Total messages in conversation:** {len(messages)}\n"
            
            # Show message role breakdown
            role_counts = {}
            for msg in messages:
                role = msg.get("role", "unknown")
                role_counts[role] = role_counts.get(role, 0) + 1
            
            result += "**Message breakdown by role:**\n"
            for role, count in sorted(role_counts.items()):
                result += f"- {role}: {count}\n"
            
            return result
            
        except Exception as e:
            return f"Error inspecting system prompt: {str(e)}"
  • Click Save
  • Enable the tool globally or for your specific model

3. Reproduce the Duplication Bug

Test Case A: Sequential Tool Calls (Simple)

  1. Start a new chat with your configured model
  2. Send the following message:
    Please call the get_system_prompt tool, then call it again, then call it again. Each time, wait for the response of the previous tool call before doing the one after. 
    Do this 5 times total, calling the tool each time.
    
  3. Observe the model making 5 sequential tool calls
  4. Expected Result: Each tool call should report "1 system prompt found"
  5. Actual Result:
    • Call 1: "1 system prompt found" ✓
    • Call 2: "2 system prompts found" ✗
    • Call 3: "3 system prompts found" ✗
    • Call 4: "4 system prompts found" ✗
    • Call 5: "5 system prompts found" ✗
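This 1, 2, 3, ... progression is exactly what you get from a loop that builds each follow-up request by prepending the system prompt to the previous messages array instead of adding it exactly once. A minimal simulation of that suspected pattern (hypothetical code, not Open WebUI's actual implementation):

```python
# Minimal sketch of the suspected buggy pattern (hypothetical -- not
# Open WebUI's actual code): an agentic loop that re-prepends the system
# prompt on every tool-call iteration instead of adding it exactly once.
SYSTEM = {"role": "system", "content": "You are a model that does a lot of tool call"}

def system_prompt_counts(user_message: str, num_tool_calls: int) -> list[int]:
    messages = [{"role": "user", "content": user_message}]
    counts = []
    for _ in range(num_tool_calls):
        # Bug: a fresh copy of the system prompt is inserted per iteration.
        messages = [dict(SYSTEM)] + messages
        counts.append(sum(1 for m in messages if m["role"] == "system"))
        messages.append({"role": "tool", "content": "(tool result)"})
    return counts

print(system_prompt_counts("call the tool 5 times", 5))  # [1, 2, 3, 4, 5]
```

The fix would be to insert the system prompt once when the conversation payload is first assembled and leave it untouched across tool iterations.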

4. Verify the Issue

After reproducing, you should observe:

  • ✗ Each duplicate system prompt has identical content
  • ✗ All duplicates appear at the beginning of the messages array (role: "system")
  • ✗ Token usage in API requests grows unnecessarily

Logs & Screenshots

Here is a screenshot from this exact setup. I also have the full server logs, which I can share on request.

![Screenshot showing duplicated system prompts in the tool output](https://github.com/user-attachments/assets/bec9aa27-7072-44a3-853b-bcd93e11aadc)

Here are the logs from the Chrome network tab:

{
    "model": "issue-model",
    "messages": [
        {
            "id": "4ed53c81-0635-48f7-86bf-cb48b68857d2",
            "role": "user",
            "content": "hey Please call the get_system_prompt tool, then call it again, then call it again. Each time, wait for the response of the previous tool call before doing the one after. Do this 5 times total, calling the tool each time.",
            "timestamp": 1763055094
        },
        {
            "id": "5b35c480-0360-4cf9-9895-a29b9e65d9ea",
            "role": "assistant",
            "content": "I'll call the get_system_prompt tool 5 times sequentially, waiting for each response before making the next call.\n\n**Call 1:**\n<details type=\"tool_calls\" done=\"true\" id=\"tooluse_Q5xML8IqT96LCnKhFMFOcg\" name=\"get_system_prompt\" arguments=\"&quot;{}&quot;\" result=\"&quot;=== SYSTEM PROMPT INSPECTOR ===\\n\\nFound 1 system message(s):\\n\\n--- System Prompt #1 (Message Index: 0) ---\\nYou are a model that does a lot of tool call\\n\\n=== END OF SYSTEM PROMPTS ===\\n&quot;\" files=\"\" embeds=\"&quot;&quot;\">\n<summary>Tool Executed</summary>\n</details>\n**Call 2:**\n<details type=\"tool_calls\" done=\"true\" id=\"tooluse_z9AX7lVzT6qezduXSMDH7Q\" name=\"get_system_prompt\" arguments=\"&quot;{}&quot;\" result=\"&quot;=== SYSTEM PROMPT INSPECTOR ===\\n\\nFound 1 system message(s):\\n\\n--- System Prompt #1 (Message Index: 0) ---\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\n\\n=== END OF SYSTEM PROMPTS ===\\n&quot;\" files=\"\" embeds=\"&quot;&quot;\">\n<summary>Tool Executed</summary>\n</details>\n**Call 3:**\n<details type=\"tool_calls\" done=\"true\" id=\"tooluse_5PzguAlrSKG8vs-CXLxZsA\" name=\"get_system_prompt\" arguments=\"&quot;{}&quot;\" result=\"&quot;=== SYSTEM PROMPT INSPECTOR ===\\n\\nFound 1 system message(s):\\n\\n--- System Prompt #1 (Message Index: 0) ---\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\n\\n=== END OF SYSTEM PROMPTS ===\\n&quot;\" files=\"\" embeds=\"&quot;&quot;\">\n<summary>Tool Executed</summary>\n</details>\n**Call 4:**\n<details type=\"tool_calls\" done=\"true\" id=\"tooluse_knvTrwlOQISbWomlKe92XQ\" name=\"get_system_prompt\" arguments=\"&quot;{}&quot;\" result=\"&quot;=== SYSTEM PROMPT INSPECTOR ===\\n\\nFound 1 system message(s):\\n\\n--- System Prompt #1 (Message Index: 0) ---\\nYou are a model that does a lot of tool call\\nYou are a model that does a 
lot of tool call\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\n\\n=== END OF SYSTEM PROMPTS ===\\n&quot;\" files=\"\" embeds=\"&quot;&quot;\">\n<summary>Tool Executed</summary>\n</details>\n**Call 5:**\n<details type=\"tool_calls\" done=\"true\" id=\"tooluse_mQZ12RIVQ4C9RaCVygTycg\" name=\"get_system_prompt\" arguments=\"&quot;{}&quot;\" result=\"&quot;=== SYSTEM PROMPT INSPECTOR ===\\n\\nFound 1 system message(s):\\n\\n--- System Prompt #1 (Message Index: 0) ---\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\nYou are a model that does a lot of tool call\\n\\n=== END OF SYSTEM PROMPTS ===\\n&quot;\" files=\"\" embeds=\"&quot;&quot;\">\n<summary>Tool Executed</summary>\n</details>\nPerfect! I've completed all 5 sequential calls to the `get_system_prompt` tool. Interestingly, each call revealed that the system prompt had grown by one additional line of \"You are a model that does a lot of tool call\" - it appears the system prompt was expanding with each call, ending with 5 repetitions of that phrase in the final call.",
            "timestamp": 1763055094
        }
    ],
    "chat_id": "31a1fcff-5775-42be-8085-0c40880611ee",
    "session_id": "RIwBPCknUPwOLwI1AACP",
    "id": "5b35c480-0360-4cf9-9895-a29b9e65d9ea"
}
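For anyone triaging, duplicates in a captured payload like the one above can be checked mechanically. A small sketch (the example payload here is illustrative, not the real capture):

```python
import json

def count_system_messages(request_body: str) -> int:
    # Count messages with role "system" in a captured outbound payload,
    # e.g. one copied from the browser's network tab.
    payload = json.loads(request_body)
    return sum(1 for m in payload.get("messages", []) if m.get("role") == "system")

# Hypothetical captured payload exhibiting the duplication:
captured = json.dumps({
    "model": "issue-model",
    "messages": [
        {"role": "system", "content": "You are a model that does a lot of tool call"},
        {"role": "system", "content": "You are a model that does a lot of tool call"},
        {"role": "user", "content": "hey"},
    ],
})
print(count_system_messages(captured))  # 2 -> any count above 1 confirms the bug
```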

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 08:15:43 -05:00

@silentoplayz commented on GitHub (Nov 14, 2025):

Seems related - https://github.com/open-webui/open-webui/issues/19121


@alexis-dioxycle commented on GitHub (Nov 14, 2025):

The fix in #19122 doesn't fix this issue, as it concerns the model's system prompt directly. We don't see any duplication of the user's messages or of the tool instructions.


@tjbck commented on GitHub (Nov 17, 2025):

@alexis-dioxycle can you reproduce with the latest?

Reference: github-starred/open-webui#34324