[GH-ISSUE #21663] issue: RAG template mutates web search tool call args #58216

Closed
opened 2026-05-05 22:34:03 -05:00 by GiteaMirror · 18 comments
Owner

Originally created by @relic664 on GitHub (Feb 20, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/21663

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.8.3

Ollama Version (if applicable)

No response

Operating System

Alpine 3.23

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

RAG templating should not leak into tool call arguments.

Actual Behavior

RAG_SYSTEM_CONTEXT controls where RAG context is injected, not whether templating occurs (it's always applied when sources exist). The problem is that it's currently injected into the user message during native tool-calling. During loops, this can cause the RAG template to be injected into subsequent tool call args (especially in the case of multiple tool calls per turn). It would be safer to inject into the system message instead.

Steps to Reproduce

  1. Start a chat with any model
  2. Induce a multi-step web search, either by providing a sufficiently complex question or by reducing the number of results returned in a web search.
  3. Examine the tool call arguments

Logs & Screenshots

Image

Additional Information

When apply_source_context_to_messages was added in 2789f6a24d, it reused the existing RAG-context mechanism for tool sources. Current behavior is that whenever there are sources (from knowledge retrieval or tool-call citations), they are wrapped in tags and RAG_TEMPLATE via rag_template, then injected back into conversation context. RAG_SYSTEM_CONTEXT only changes placement (system vs user), not whether this wrapping occurs. This causes issues in native tool-calling loops where multiple calls may happen in one turn while tool arguments are still being generated, leading to template text leaking into subsequent tool arguments.

I'm willing to draft a PR to patch this if there's interest.

Originally created by @relic664 on GitHub (Feb 20, 2026). Original GitHub issue: https://github.com/open-webui/open-webui/issues/21663 ### Check Existing Issues - [x] I have searched for any existing and/or related issues. - [x] I have searched for any existing and/or related discussions. - [x] I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!). - [x] I am using the latest version of Open WebUI. ### Installation Method Docker ### Open WebUI Version v0.8.3 ### Ollama Version (if applicable) _No response_ ### Operating System Alpine 3.23 ### Browser (if applicable) _No response_ ### Confirmation - [x] I have read and followed all instructions in `README.md`. - [x] I am using the latest version of **both** Open WebUI and Ollama. - [x] I have included the browser console logs. - [x] I have included the Docker container logs. - [x] I have **provided every relevant configuration, setting, and environment variable used in my setup.** - [x] I have clearly **listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup** (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc). - [x] I have documented **step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation**. My steps: - Start with the initial platform/version/OS and dependencies used, - Specify exact install/launch/configure commands, - List URLs visited, user input (incl. example values/emails/passwords if needed), - Describe all options and toggles enabled or changed, - Include any files or environmental changes, - Identify the expected and actual result at each stage, - Ensure any reasonably skilled user can follow and hit the same issue. ### Expected Behavior RAG templating should not leak into tool call arguments. ### Actual Behavior `RAG_SYSTEM_CONTEXT` controls where RAG context is injected, not whether templating occurs (it's always applied when sources exist). The problem is that it's currently injected into the user message during native tool-calling. During loops, this can cause the RAG template to be injected into subsequent tool call args (especially in the case of multiple tool calls per turn). It would be safer to inject into the system message instead. ### Steps to Reproduce 1. Start a chat with any model 2. Induce a multi-step web search, either by providing a sufficiently complex question or by reducing the number of results returned in a web search. 3. Examine the tool call arguments ### Logs & Screenshots <img width="984" height="311" alt="Image" src="https://github.com/user-attachments/assets/1968e0fc-5e2a-45d9-b5d1-2bb753271797" /> ### Additional Information When apply_source_context_to_messages was added in 2789f6a24d, it reused the existing RAG-context mechanism for tool sources. Current behavior is that whenever there are sources (from knowledge retrieval or tool-call citations), they are wrapped in <source> tags and RAG_TEMPLATE via rag_template, then injected back into conversation context. RAG_SYSTEM_CONTEXT only changes placement (system vs user), not whether this wrapping occurs. This causes issues in native tool-calling loops where multiple calls may happen in one turn while tool arguments are still being generated, leading to template text leaking into subsequent tool arguments. I'm willing to draft a PR to patch this if there's interest.
GiteaMirror added the bug label 2026-05-05 22:34:04 -05:00
Author
Owner

@Classic298 commented on GitHub (Feb 20, 2026):

huh. How do i reproduce this? I have never seen this before and i have been using native tool calling for search_web many many times before.

What model is this? Are you sure the model didn't just hallucinate and input the RAG system prompt into the search_web as the actualy query of the tool call?

When i try to reproduce, it just works as intended.

Image

Need ways to reproduce this. Might also just be model dependent behaviour. I have never seen this happen.

<!-- gh-comment-id:3936501604 --> @Classic298 commented on GitHub (Feb 20, 2026): huh. How do i reproduce this? I have never seen this before and i have been using native tool calling for search_web many many times before. What model is this? Are you sure the model didn't just hallucinate and input the RAG system prompt into the search_web as the actualy query of the tool call? When i try to reproduce, it just works as intended. <img width="1109" height="803" alt="Image" src="https://github.com/user-attachments/assets/7c700918-f0e7-4143-ba0a-c23d911ebc5c" /> Need ways to reproduce this. Might also just be model dependent behaviour. I have never seen this happen.
Author
Owner

@relic664 commented on GitHub (Feb 20, 2026):

This screenshot doesn't dismiss the issue at hand as the issue is multiple concurrent search_web turns. Try your query again, but set sources returned to 1 instead of 5 in your web search settings to induce multiple calls sequentially.

<!-- gh-comment-id:3936531054 --> @relic664 commented on GitHub (Feb 20, 2026): This screenshot doesn't dismiss the issue at hand as the issue is multiple concurrent `search_web` turns. Try your query again, but set sources returned to 1 instead of 5 in your web search settings to induce multiple calls sequentially.
Author
Owner

@Classic298 commented on GitHub (Feb 20, 2026):

@relic664 the screenshot is just an example. I have had models do 30 search_web tool calls back-to-back with the right deep research prompts. It simply never happened to me. Can you please answer the questions I asked? The issue might be elsewhere.

<!-- gh-comment-id:3936542831 --> @Classic298 commented on GitHub (Feb 20, 2026): @relic664 the screenshot is just an example. I have had models do 30 search_web tool calls back-to-back with the right deep research prompts. It simply never happened to me. Can you please answer the questions I asked? The issue might be elsewhere.
Author
Owner

@Classic298 commented on GitHub (Feb 20, 2026):

I will do a code investigation here just in case, but the RAG prompt ... being injected into the tool call args as you say sound impossible.

<!-- gh-comment-id:3936569009 --> @Classic298 commented on GitHub (Feb 20, 2026): I will do a code investigation here just in case, but the RAG prompt ... being injected into the tool call args as you say sound impossible.
Author
Owner

@relic664 commented on GitHub (Feb 20, 2026):

I was able to reproduce with minimax2.5

Image
<!-- gh-comment-id:3936576796 --> @relic664 commented on GitHub (Feb 20, 2026): I was able to reproduce with minimax2.5 <img width="1432" height="790" alt="Image" src="https://github.com/user-attachments/assets/20fce127-c8dd-45e1-a184-28ccd22e242e" />
Author
Owner

@Classic298 commented on GitHub (Feb 20, 2026):

Ok i have found something interesting

The rag prompt is NOT injected into the tool call, the model is indeed just stupid

but Open WebUI IS also mishandling the Rag system prompt.

I will draft up a fix for what i found.

<!-- gh-comment-id:3936598316 --> @Classic298 commented on GitHub (Feb 20, 2026): Ok i have found something interesting The rag prompt is NOT injected into the tool call, the model is indeed just stupid but Open WebUI <ins>IS</ins> also mishandling the Rag system prompt. I will draft up a fix for what i found.
Author
Owner

@relic664 commented on GitHub (Feb 20, 2026):

In my case, I did a quick test and just shoved it in the system message instead of the user message and it worked fine. It was a three line fix, FWIW

<!-- gh-comment-id:3936620171 --> @relic664 commented on GitHub (Feb 20, 2026): In my case, I did a quick test and just shoved it in the system message instead of the user message and it worked fine. It was a three line fix, FWIW
Author
Owner

@Classic298 commented on GitHub (Feb 20, 2026):

properly fixing it is a bit more than three lines.

I am drafting up a PR right now.

Currently testing it. If it works, i will submit it in a few minutes

<!-- gh-comment-id:3936650809 --> @Classic298 commented on GitHub (Feb 20, 2026): properly fixing it is a bit more than three lines. I am drafting up a PR right now. Currently testing it. If it works, i will submit it in a few minutes
Author
Owner

@Classic298 commented on GitHub (Feb 20, 2026):

ok have a fix

<!-- gh-comment-id:3936820817 --> @Classic298 commented on GitHub (Feb 20, 2026): ok have a fix
Author
Owner

@Classic298 commented on GitHub (Feb 20, 2026):

https://github.com/open-webui/open-webui/pull/21668

<!-- gh-comment-id:3936829419 --> @Classic298 commented on GitHub (Feb 20, 2026): https://github.com/open-webui/open-webui/pull/21668
Author
Owner

@Classic298 commented on GitHub (Feb 22, 2026):

Fixed in dev testing wanted

<!-- gh-comment-id:3941945802 --> @Classic298 commented on GitHub (Feb 22, 2026): Fixed in dev testing wanted
Author
Owner

@relic664 commented on GitHub (Feb 23, 2026):

I checked on git-824eeba and its still broken there for me.

<!-- gh-comment-id:3942027492 --> @relic664 commented on GitHub (Feb 23, 2026): I checked on ` git-824eeba ` and its still broken there for me.
Author
Owner

@Classic298 commented on GitHub (Feb 23, 2026):

@relic664 sorry can you tell me what commit you are on now? The gist of the docker is different to that.

Can you see if the changes are applied?

<!-- gh-comment-id:3942032673 --> @Classic298 commented on GitHub (Feb 23, 2026): @relic664 sorry can you tell me what commit you are on now? The gist of the docker is different to that. Can you see if the changes are applied?
Author
Owner

@relic664 commented on GitHub (Feb 23, 2026):

I just pulled the latest dev and it corresponds to d9fd2a3f. The issue is still cropping up for me on that image. (now docker tag git-d9fd2a3)

<!-- gh-comment-id:3942056759 --> @relic664 commented on GitHub (Feb 23, 2026): I just pulled the latest dev and it corresponds to d9fd2a3f. The issue is still cropping up for me on that image. (now docker tag `git-d9fd2a3`)
Author
Owner

@relic664 commented on GitHub (Feb 23, 2026):

Here's the diff for the patch that has been working for me, if it helps.

@@ -805,6 +807,7 @@ def apply_source_context_to_messages(
    messages: list,
    sources: list,
    user_message: str,
    force_system_context: bool = False,
) -> list:
    """
    Build source context from citation sources and apply to messages.
@@ -832,7 +835,7 @@ def apply_source_context_to_messages(
    if not context_string:
        return messages

    if RAG_SYSTEM_CONTEXT:
    if force_system_context or RAG_SYSTEM_CONTEXT:
        return add_or_update_system_message(
            rag_template(
                request.app.state.config.RAG_TEMPLATE, context_string, user_message
@@ -4201,6 +4204,7 @@ async def flush_pending_delta_data(threshold: int = 0):
                                form_data["messages"],
                                tool_call_sources,
                                user_msg,
                                force_system_context=True,
                            )
                        tool_call_sources.clear()
<!-- gh-comment-id:3942078156 --> @relic664 commented on GitHub (Feb 23, 2026): Here's the diff for the patch that has been working for me, if it helps. ``` @@ -805,6 +807,7 @@ def apply_source_context_to_messages( messages: list, sources: list, user_message: str, force_system_context: bool = False, ) -> list: """ Build source context from citation sources and apply to messages. @@ -832,7 +835,7 @@ def apply_source_context_to_messages( if not context_string: return messages if RAG_SYSTEM_CONTEXT: if force_system_context or RAG_SYSTEM_CONTEXT: return add_or_update_system_message( rag_template( request.app.state.config.RAG_TEMPLATE, context_string, user_message @@ -4201,6 +4204,7 @@ async def flush_pending_delta_data(threshold: int = 0): form_data["messages"], tool_call_sources, user_msg, force_system_context=True, ) tool_call_sources.clear() ```
Author
Owner

@Classic298 commented on GitHub (Feb 23, 2026):

Feel free to reopen if still the case with 0.8.5

make sure to include logs that prove the RAG prompt is indeed being sent multiple times to the model still

Thanks!

<!-- gh-comment-id:3943737061 --> @Classic298 commented on GitHub (Feb 23, 2026): Feel free to reopen if still the case with 0.8.5 make sure to include logs that prove the RAG prompt is indeed being sent multiple times to the model still Thanks!
Author
Owner

@relic664 commented on GitHub (Feb 23, 2026):

The issue still persists in 0.8.5 (looks like I don't have permission to reopen the issue)

Image
<!-- gh-comment-id:3944292018 --> @relic664 commented on GitHub (Feb 23, 2026): The issue still persists in 0.8.5 (looks like I don't have permission to reopen the issue) <img width="1091" height="612" alt="Image" src="https://github.com/user-attachments/assets/03ebd81e-d718-4b63-a1a7-562f99e9b8b7" />
Author
Owner

@Classic298 commented on GitHub (Feb 23, 2026):

@relic664 i meant open a new issue if still reproducible. Do ensure you include your own logging that shows that the RAG prompt is still being concatenated into multiple prompts instead of just once prompt.

Also please ensure you are indeed running the latest version, not that docker is playing a trick on you. @silentoplayz tested this yesterday with 30+ web search calls WITH the fix in place and couldn't reproduce. While you can with just 3 tool uses.

Something is fishy here

<!-- gh-comment-id:3944334758 --> @Classic298 commented on GitHub (Feb 23, 2026): @relic664 i meant open a new issue if still reproducible. Do ensure you include your own logging that shows that the RAG prompt is still being concatenated into multiple prompts instead of just once prompt. Also please ensure you are indeed running the latest version, not that docker is playing a trick on you. @silentoplayz tested this yesterday with 30+ web search calls WITH the fix in place and couldn't reproduce. While you can with just 3 tool uses. Something is fishy here
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#58216