[GH-ISSUE #22154] issue: RAG template duplication for web search tool & disabling citations ability not working #58310

Closed
opened 2026-05-05 22:52:14 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @Master-Pr0grammer on GitHub (Mar 2, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/22154

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Pip Install

Open WebUI Version

v0.8.7 (latest)

Ollama Version (if applicable)

No response

Operating System

Ubuntu 22.04

Browser (if applicable)

Chrome

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When the model decides to use the native web search tools in Open WebUI, the RAG template for citations should be pasted into the system prompt ONCE. If citations are turned off for the model, this should not happen at all.

Actual Behavior

For every single web tool call, the RAG template is pasted into the system prompt. So if the model calls the tool 5 times, the RAG prompt is pasted, fully duplicated, 5 times into the system prompt.

This can confuse the model, degrade performance, and make inference much more costly to run, since the prompt cache is invalidated and the prompt has to be fully reprocessed every time the model calls a web search tool.

Steps to Reproduce

  1. Launch Open WebUI.

  2. Give a model access to the web search tools.

  3. Ask it to perform deep research so that it invokes the web search tools several times.

  4. Look at the OpenAI API logs and see that the system prompt contains duplicates of the RAG template.

  5. Repeat the above with citations disabled for the model, and see that it still pastes the citations/RAG prompt into the system prompt, and still duplicates it.

Logs & Screenshots

Here is the system prompt after a few tool calls in the logs of my OpenAI API backend:

"messages": [
    {
      "role": "system",
      "content": "You are a helpful AI assistant.
### Task:
Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).

### Guidelines:
- If you don't know the answer, clearly state that.
- If uncertain, ask the user for clarification.
- Respond in the same language as the user's query.
- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.
- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**
- Do not cite if the <source> tag does not contain an id attribute.
- Do not use XML tags in your response.
- Ensure citations are concise and directly related to the information provided.

### Example of Citation:
If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:
* \"According to the study, the proposed method increases efficiency by 20% [1].\"

### Output:
Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.

<context>
<source id=\"1\" name=\"search_web\"></source>
<source id=\"2\" name=\"search_web\"></source>
<source id=\"3\" name=\"search_web\"></source>
<source id=\"4\" name=\"search_web\"></source>
<source id=\"5\" name=\"search_web\"></source>
</context>

### Task:
Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).

### Guidelines:
- If you don't know the answer, clearly state that.
- If uncertain, ask the user for clarification.
- Respond in the same language as the user's query.
- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.
- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**
- Do not cite if the <source> tag does not contain an id attribute.
- Do not use XML tags in your response.
- Ensure citations are concise and directly related to the information provided.

### Example of Citation:
If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:
* \"According to the study, the proposed method increases efficiency by 20% [1].\"

### Output:
Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.

<context>
<source id=\"1\" name=\"search_web\"></source>
<source id=\"2\" name=\"search_web\"></source>
<source id=\"3\" name=\"search_web\"></source>
<source id=\"4\" name=\"search_web\"></source>
<source id=\"5\" name=\"search_web\"></source>
<source id=\"6\" name=\"search_web\"></source>
<source id=\"7\" name=\"search_web\"></source>
<source id=\"8\" name=\"search_web\"></source>
<source id=\"9\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
</context>

### Task:
Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).

### Guidelines:
- If you don't know the answer, clearly state that.
- If uncertain, ask the user for clarification.
- Respond in the same language as the user's query.
- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.
- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**
- Do not cite if the <source> tag does not contain an id attribute.
- Do not use XML tags in your response.
- Ensure citations are concise and directly related to the information provided.

### Example of Citation:
If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:
* \"According to the study, the proposed method increases efficiency by 20% [1].\"

### Output:
Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.

<context>
<source id=\"1\" name=\"search_web\"></source>
<source id=\"2\" name=\"search_web\"></source>
<source id=\"3\" name=\"search_web\"></source>
<source id=\"4\" name=\"search_web\"></source>
<source id=\"5\" name=\"search_web\"></source>
<source id=\"6\" name=\"search_web\"></source>
<source id=\"7\" name=\"search_web\"></source>
<source id=\"8\" name=\"search_web\"></source>
<source id=\"9\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
<source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source>
</context>

### Task:
Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).

### Guidelines:
- If you don't know the answer, clearly state that.
- If uncertain, ask the user for clarification.
- Respond in the same language as the user's query.
- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.
- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**
- Do not cite if the <source> tag does not contain an id attribute.
- Do not use XML tags in your response.
- Ensure citations are concise and directly related to the information provided.

### Example of Citation:
If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:
* \"According to the study, the proposed method increases efficiency by 20% [1].\"

### Output:
Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.

<context>
<source id=\"1\" name=\"search_web\"></source>
<source id=\"2\" name=\"search_web\"></source>
<source id=\"3\" name=\"search_web\"></source>
<source id=\"4\" name=\"search_web\"></source>
<source id=\"5\" name=\"search_web\"></source>
<source id=\"6\" name=\"search_web\"></source>
<source id=\"7\" name=\"search_web\"></source>
<source id=\"8\" name=\"search_web\"></source>
<source id=\"9\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
<source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source>
<source id=\"11\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
<source id=\"12\" name=\"search_web\"></source>
<source id=\"13\" name=\"search_web\"></source>
<source id=\"14\" name=\"search_web\"></source>
</context>

### Task:
Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).

### Guidelines:
- If you don't know the answer, clearly state that.
- If uncertain, ask the user for clarification.
- Respond in the same language as the user's query.
- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.
- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**
- Do not cite if the <source> tag does not contain an id attribute.
- Do not use XML tags in your response.
- Ensure citations are concise and directly related to the information provided.

### Example of Citation:
If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:
* \"According to the study, the proposed method increases efficiency by 20% [1].\"

### Output:
Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.

<context>
<source id=\"1\" name=\"search_web\"></source>
<source id=\"2\" name=\"search_web\"></source>
<source id=\"3\" name=\"search_web\"></source>
<source id=\"4\" name=\"search_web\"></source>
<source id=\"5\" name=\"search_web\"></source>
<source id=\"6\" name=\"search_web\"></source>
<source id=\"7\" name=\"search_web\"></source>
<source id=\"8\" name=\"search_web\"></source>
<source id=\"9\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
<source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source>
<source id=\"11\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
<source id=\"12\" name=\"search_web\"></source>
<source id=\"13\" name=\"search_web\"></source>
<source id=\"14\" name=\"search_web\"></source>
<source id=\"15\" name=\"https://www.datacamp.com/blog/qwen3.5\"></source>
</context>

### Task:
Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).

### Guidelines:
- If you don't know the answer, clearly state that.
- If uncertain, ask the user for clarification.
- Respond in the same language as the user's query.
- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.
- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**
- Do not cite if the <source> tag does not contain an id attribute.
- Do not use XML tags in your response.
- Ensure citations are concise and directly related to the information provided.

### Example of Citation:
If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:
* \"According to the study, the proposed method increases efficiency by 20% [1].\"

### Output:
Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.

<context>
<source id=\"1\" name=\"search_web\"></source>
<source id=\"2\" name=\"search_web\"></source>
<source id=\"3\" name=\"search_web\"></source>
<source id=\"4\" name=\"search_web\"></source>
<source id=\"5\" name=\"search_web\"></source>
<source id=\"6\" name=\"search_web\"></source>
<source id=\"7\" name=\"search_web\"></source>
<source id=\"8\" name=\"search_web\"></source>
<source id=\"9\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
<source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source>
<source id=\"11\" name=\"search_web\"></source>
<source id=\"10\" name=\"search_web\"></source>
<source id=\"12\" name=\"search_web\"></source>
<source id=\"13\" name=\"search_web\"></source>
<source id=\"14\" name=\"search_web\"></source>
<source id=\"15\" name=\"https://www.datacamp.com/blog/qwen3.5\"></source>
<source id=\"16\" name=\"search_web\"></source>
<source id=\"17\" name=\"search_web\"></source>
<source id=\"6\" name=\"search_web\"></source>
<source id=\"18\" name=\"search_web\"></source>
<source id=\"12\" name=\"search_web\"></source>
</context>

Additional Information

Fixing this issue, and adding a way to prevent the system prompt from being dynamically modified like this, is extremely important because the modification drastically slows down inference: the same process in Open WebUI takes 10x as long as it does in LM Studio on the same hardware.

At minimum, turning off citations should actually turn them off for web search as well, which would enable functional prompt caching.

As a suggestion, to keep the citations feature and get better prompt caching, you could append the citation instructions to the system prompt ONCE before any chat, then show the source ID for citations in the tool output. This would allow citations to function as intended while massively speeding up inference and making it much cheaper to operate by utilizing prompt caching.
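
To make the suggestion concrete, here is a rough sketch of what I mean. It is not Open WebUI code: `CITATION_INSTRUCTIONS`, `build_system_prompt`, `SourceRegistry`, and `format_tool_output` are all hypothetical names, and the result shape (`url` / `snippet` keys) is an assumption. The system prompt is built once, and source IDs are attached to the tool output instead.

```python
# Hypothetical, shortened citation instructions (assumption, not the real RAG template).
CITATION_INSTRUCTIONS = (
    "Cite sources inline as [id] only when a <source> tag has an id attribute."
)


def build_system_prompt(base_prompt):
    """Append the citation instructions once, before the chat starts."""
    return base_prompt + "\n\n" + CITATION_INSTRUCTIONS


class SourceRegistry:
    """Hand out one stable id per URL for the whole conversation."""

    def __init__(self):
        self._ids = {}

    def id_for(self, url):
        # A URL seen before keeps its id; a new URL gets the next number.
        return self._ids.setdefault(url, len(self._ids) + 1)


def format_tool_output(registry, results):
    """Wrap each search result in a <source> tag inside the tool message."""
    return "\n".join(
        f'<source id="{registry.id_for(r["url"])}" name="{r["url"]}">{r["snippet"]}</source>'
        for r in results
    )


registry = SourceRegistry()
system_prompt = build_system_prompt("You are a helpful AI assistant.")
first_call = format_tool_output(
    registry,
    [
        {"url": "https://a.example", "snippet": "First result"},
        {"url": "https://b.example", "snippet": "Second result"},
    ],
)
second_call = format_tool_output(
    registry, [{"url": "https://a.example", "snippet": "Same page again"}]
)
print(system_prompt)
print(first_call)
print(second_call)
```

With this layout the system prompt never changes after the first request, and a page that comes back in a later search keeps the id it was given the first time.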

Originally created by @Master-Pr0grammer on GitHub (Mar 2, 2026). Original GitHub issue: https://github.com/open-webui/open-webui/issues/22154 ### Check Existing Issues - [x] I have searched for any existing and/or related issues. - [x] I have searched for any existing and/or related discussions. - [x] I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!). - [x] I am using the latest version of Open WebUI. ### Installation Method Pip Install ### Open WebUI Version v0.8.7 (latest) ### Ollama Version (if applicable) _No response_ ### Operating System Ubuntu 22.04 ### Browser (if applicable) Chrome ### Confirmation - [x] I have read and followed all instructions in `README.md`. - [x] I am using the latest version of **both** Open WebUI and Ollama. - [x] I have included the browser console logs. - [x] I have included the Docker container logs. - [x] I have **provided every relevant configuration, setting, and environment variable used in my setup.** - [x] I have clearly **listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup** (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc). - [x] I have documented **step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation**. My steps: - Start with the initial platform/version/OS and dependencies used, - Specify exact install/launch/configure commands, - List URLs visited, user input (incl. example values/emails/passwords if needed), - Describe all options and toggles enabled or changed, - Include any files or environmental changes, - Identify the expected and actual result at each stage, - Ensure any reasonably skilled user can follow and hit the same issue. 
### Expected Behavior When the model decides to use the native web search tools in open webui, the RAG template for citations should be pasted into they system prompt ONCE. and if citations are turned off for the model, this should not happen at all. ### Actual Behavior For every single web tool call, the RAG template is pasted into the system prompt. so if the model calls the tool 5 times, the RAG prompt will be pasted and completely duplicated 5 times into the system prompt. This can confuse the model, degrade performance, and also makes inference much more costly to run since the prompt cache needs to be fully reprocessed every time the model calls a web search tool. ### Steps to Reproduce 1. launch open webui 2. Give a model access to the web search tools 3. ask it to perform deep research and invoke web search tools several times 4. look at the openai API logs and see that the system prompt contains duplicates of the RAG template 5. Repeat the above but with `citations` disabled for the model, and see that it still pastes the citations/RAG prompt into the system prompt, and still duplicates it. ### Logs & Screenshots Here is the system prompt after a few tool calls in the logs of my OpenAI API backend: ``` "messages": [ { "role": "system", "content": "You are a helpful AI assistant. ### Task: Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">). ### Guidelines: - If you don't know the answer, clearly state that. - If uncertain, ask the user for clarification. - Respond in the same language as the user's query. - If the context is unreadable or of poor quality, inform the user and provide the best possible answer. - If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding. 
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.** - Do not cite if the <source> tag does not contain an id attribute. - Do not use XML tags in your response. - Ensure citations are concise and directly related to the information provided. ### Example of Citation: If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example: * \"According to the study, the proposed method increases efficiency by 20% [1].\" ### Output: Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context. <context> <source id=\"1\" name=\"search_web\"></source> <source id=\"2\" name=\"search_web\"></source> <source id=\"3\" name=\"search_web\"></source> <source id=\"4\" name=\"search_web\"></source> <source id=\"5\" name=\"search_web\"></source> </context> ### Task: Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">). ### Guidelines: - If you don't know the answer, clearly state that. - If uncertain, ask the user for clarification. - Respond in the same language as the user's query. - If the context is unreadable or of poor quality, inform the user and provide the best possible answer. - If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding. - **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.** - Do not cite if the <source> tag does not contain an id attribute. - Do not use XML tags in your response. - Ensure citations are concise and directly related to the information provided. 
### Example of Citation: If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example: * \"According to the study, the proposed method increases efficiency by 20% [1].\" ### Output: Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context. <context> <source id=\"1\" name=\"search_web\"></source> <source id=\"2\" name=\"search_web\"></source> <source id=\"3\" name=\"search_web\"></source> <source id=\"4\" name=\"search_web\"></source> <source id=\"5\" name=\"search_web\"></source> <source id=\"6\" name=\"search_web\"></source> <source id=\"7\" name=\"search_web\"></source> <source id=\"8\" name=\"search_web\"></source> <source id=\"9\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> </context> ### Task: Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">). ### Guidelines: - If you don't know the answer, clearly state that. - If uncertain, ask the user for clarification. - Respond in the same language as the user's query. - If the context is unreadable or of poor quality, inform the user and provide the best possible answer. - If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding. - **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.** - Do not cite if the <source> tag does not contain an id attribute. - Do not use XML tags in your response. - Ensure citations are concise and directly related to the information provided. 
### Example of Citation: If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example: * \"According to the study, the proposed method increases efficiency by 20% [1].\" ### Output: Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context. <context> <source id=\"1\" name=\"search_web\"></source> <source id=\"2\" name=\"search_web\"></source> <source id=\"3\" name=\"search_web\"></source> <source id=\"4\" name=\"search_web\"></source> <source id=\"5\" name=\"search_web\"></source> <source id=\"6\" name=\"search_web\"></source> <source id=\"7\" name=\"search_web\"></source> <source id=\"8\" name=\"search_web\"></source> <source id=\"9\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> <source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source> </context> ### Task: Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">). ### Guidelines: - If you don't know the answer, clearly state that. - If uncertain, ask the user for clarification. - Respond in the same language as the user's query. - If the context is unreadable or of poor quality, inform the user and provide the best possible answer. - If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding. - **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.** - Do not cite if the <source> tag does not contain an id attribute. - Do not use XML tags in your response. - Ensure citations are concise and directly related to the information provided. 
### Example of Citation: If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example: * \"According to the study, the proposed method increases efficiency by 20% [1].\" ### Output: Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context. <context> <source id=\"1\" name=\"search_web\"></source> <source id=\"2\" name=\"search_web\"></source> <source id=\"3\" name=\"search_web\"></source> <source id=\"4\" name=\"search_web\"></source> <source id=\"5\" name=\"search_web\"></source> <source id=\"6\" name=\"search_web\"></source> <source id=\"7\" name=\"search_web\"></source> <source id=\"8\" name=\"search_web\"></source> <source id=\"9\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> <source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source> <source id=\"11\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> <source id=\"12\" name=\"search_web\"></source> <source id=\"13\" name=\"search_web\"></source> <source id=\"14\" name=\"search_web\"></source> </context> ### Task: Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">). ### Guidelines: - If you don't know the answer, clearly state that. - If uncertain, ask the user for clarification. - Respond in the same language as the user's query. - If the context is unreadable or of poor quality, inform the user and provide the best possible answer. - If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding. 
- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.** - Do not cite if the <source> tag does not contain an id attribute. - Do not use XML tags in your response. - Ensure citations are concise and directly related to the information provided. ### Example of Citation: If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example: * \"According to the study, the proposed method increases efficiency by 20% [1].\" ### Output: Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context. <context> <source id=\"1\" name=\"search_web\"></source> <source id=\"2\" name=\"search_web\"></source> <source id=\"3\" name=\"search_web\"></source> <source id=\"4\" name=\"search_web\"></source> <source id=\"5\" name=\"search_web\"></source> <source id=\"6\" name=\"search_web\"></source> <source id=\"7\" name=\"search_web\"></source> <source id=\"8\" name=\"search_web\"></source> <source id=\"9\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> <source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source> <source id=\"11\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> <source id=\"12\" name=\"search_web\"></source> <source id=\"13\" name=\"search_web\"></source> <source id=\"14\" name=\"search_web\"></source> <source id=\"15\" name=\"https://www.datacamp.com/blog/qwen3.5\"></source> </context> ### Task: Respond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">). ### Guidelines: - If you don't know the answer, clearly state that. - If uncertain, ask the user for clarification. 
- Respond in the same language as the user's query. - If the context is unreadable or of poor quality, inform the user and provide the best possible answer. - If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding. - **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.** - Do not cite if the <source> tag does not contain an id attribute. - Do not use XML tags in your response. - Ensure citations are concise and directly related to the information provided. ### Example of Citation: If the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example: * \"According to the study, the proposed method increases efficiency by 20% [1].\" ### Output: Provide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context. 
<context> <source id=\"1\" name=\"search_web\"></source> <source id=\"2\" name=\"search_web\"></source> <source id=\"3\" name=\"search_web\"></source> <source id=\"4\" name=\"search_web\"></source> <source id=\"5\" name=\"search_web\"></source> <source id=\"6\" name=\"search_web\"></source> <source id=\"7\" name=\"search_web\"></source> <source id=\"8\" name=\"search_web\"></source> <source id=\"9\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> <source id=\"7\" name=\"https://artificialanalysis.ai/models/qwen3-5-35b-a3b\"></source> <source id=\"11\" name=\"search_web\"></source> <source id=\"10\" name=\"search_web\"></source> <source id=\"12\" name=\"search_web\"></source> <source id=\"13\" name=\"search_web\"></source> <source id=\"14\" name=\"search_web\"></source> <source id=\"15\" name=\"https://www.datacamp.com/blog/qwen3.5\"></source> <source id=\"16\" name=\"search_web\"></source> <source id=\"17\" name=\"search_web\"></source> <source id=\"6\" name=\"search_web\"></source> <source id=\"18\" name=\"search_web\"></source> <source id=\"12\" name=\"search_web\"></source> </context>
```

### Additional Information

Fixing this issue, and adding the ability to prevent the system prompt from being dynamically modified in this way, is extremely important because the modification drastically slows down inference: the same process takes 10x as long in Open WebUI as it does in LM Studio on the same hardware. At minimum, turning off citations should actually turn them off for web search, which would enable functional prompt caching. As a suggestion, to keep the citations feature and get better prompt caching, you could append the citation instructions to the system prompt ONCE before any chat, then show the source ID for citations in the tool output. This would allow citations to function as intended while massively speeding up inference and making it much cheaper to operate by utilizing prompt caching.
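The suggestion above could look roughly like the following sketch. It is not Open WebUI code: `CITATION_INSTRUCTIONS`, `build_system_prompt`, and `format_tool_output` are hypothetical names, and the instruction text is a placeholder rather than the real RAG template.

```python
# Hypothetical, shortened stand-in for the citation part of the RAG template.
CITATION_INSTRUCTIONS = "Cite sources inline as [id] when a tool result provides an id."


def build_system_prompt(base_prompt: str) -> str:
    """Add the citation instructions once; the result never depends on sources."""
    return f"{base_prompt}\n\n{CITATION_INSTRUCTIONS}"


def format_tool_output(results: list[dict], start_id: int = 1) -> str:
    """Carry the source ids in the tool output instead of the system prompt."""
    lines = []
    for offset, result in enumerate(results):
        lines.append(
            f'<source id="{start_id + offset}" name="{result["url"]}">'
            f'{result["snippet"]}</source>'
        )
    return "\n".join(lines)
```

With this shape the system prompt is built once per chat and every new search result is only appended to the end of the conversation, which is the property a prefix-based prompt cache would need.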
GiteaMirror added the bug label 2026-05-05 22:52:14 -05:00

@Classic298 commented on GitHub (Mar 2, 2026):

Thanks for the detailed report! I looked into this and here's what I found:

RAG Template Duplication

The duplication you're seeing happens specifically because you have the RAG_SYSTEM_CONTEXT environment variable set to True (the default is False). When this is set, the RAG template gets appended to the system message on each tool call iteration in the native function calling loop. For the user message path (the default), the message is properly restored before each re-application, so duplication doesn't happen.

There's a fix coming for this. The system message will be properly saved and restored between tool call iterations, same as we already do for the user message.
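A minimal sketch of the difference between the two behaviours described above, assuming a simplified tool call loop. The template text and the helper names (`apply_rag_template`, `run_buggy`, `run_fixed`) are illustrative only and are not taken from the Open WebUI codebase.

```python
# Hypothetical stand-in for the real RAG template; not Open WebUI's actual text.
RAG_TEMPLATE = "### Task: answer using the context.\n<context>{context}</context>"


def apply_rag_template(system_text: str, sources: list[str]) -> str:
    """Append the rendered template to a system prompt (illustrative helper)."""
    context = "".join(f'<source id="{i}"></source>' for i, _ in enumerate(sources, 1))
    return f"{system_text}\n\n{RAG_TEMPLATE.format(context=context)}"


def run_buggy(system_text: str, batches: list[list[str]]) -> str:
    """Re-apply the template to the already-modified message on every iteration."""
    sources: list[str] = []
    for batch in batches:
        sources.extend(batch)
        # The previous iteration's template is still in system_text, so it stacks up.
        system_text = apply_rag_template(system_text, sources)
    return system_text


def run_fixed(system_text: str, batches: list[list[str]]) -> str:
    """Save the original message once and rebuild from it on every iteration."""
    original = system_text
    sources: list[str] = []
    for batch in batches:
        sources.extend(batch)
        # Restore the saved message first, then apply the template exactly once.
        system_text = apply_rag_template(original, sources)
    return system_text
```

After three tool call iterations, `run_buggy` leaves three copies of the template in the system message, while `run_fixed` leaves one copy that lists all accumulated sources.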

Citations Toggle

The "citations" capability toggle in the model editor is a frontend-only display control. It hides the citation UI elements (the source bubbles and clickable markers) but doesn't suppress the backend RAG template injection. This is by design: the RAG context helps the model ground its answers in the provided sources, which is useful regardless of whether the citation markers are rendered in the UI.

If you want to fully prevent the RAG template from being injected, you would need to set the RAG_TEMPLATE config to something minimal/empty, or disable the RAG prompt from being sent altogether (I believe this is in the admin interface settings), rather than toggling citations, which again only controls frontend citation rendering, by design.

Prompt Caching

Your observation about prompt caching is valid for your setup. With RAG_SYSTEM_CONTEXT=True enabled AND the duplication bug, the system prompt keeps changing and growing on each tool call, which does invalidate the prompt cache. After the duplication fix, the system message will be stable across iterations (just updated with the accumulated sources rather than re-appended), which should help with caching.
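As a rough illustration of why a changing system prompt hurts caching, here is a toy model that assumes the inference backend reuses only the longest shared prefix between consecutive prompts. Characters stand in for tokens, and the chat format in `render` is made up for the example.

```python
import os


def shared_prefix_len(previous_prompt: str, next_prompt: str) -> int:
    """Length of the longest common prefix, a rough proxy for reusable cache."""
    return len(os.path.commonprefix([previous_prompt, next_prompt]))


def render(messages: list[tuple[str, str]]) -> str:
    """Flatten (role, content) pairs into one prompt string (toy chat format)."""
    return "".join(f"<{role}>{content}</{role}>" for role, content in messages)


system = "You are a helpful assistant."
history = [
    ("user", "What is Qwen3.5?"),
    ("assistant", "Let me search."),
    ("tool", "result 1"),
]

before = render([("system", system)] + history)

# Case A: the system message is rewritten, so the prompt diverges near the start.
changed = render(
    [("system", system + " <context>...</context>")] + history + [("tool", "result 2")]
)

# Case B: the system message is untouched and new content is only appended.
stable = render([("system", system)] + history + [("tool", "result 2")])
```

In case A the shared prefix ends inside the system message, so everything after it would have to be reprocessed. In case B the whole previous prompt is a prefix of the next one. Note that a system message that is updated with accumulated sources still changes whenever new sources arrive, so under this model it only helps on iterations where the source list stays the same.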


@Classic298 commented on GitHub (Mar 2, 2026):

testing wanted https://github.com/open-webui/open-webui/pull/22157


@Classic298 commented on GitHub (Mar 2, 2026):

https://github.com/open-webui/open-webui/commit/e0d4c3ec92aeddd3413d340f57e83a360ce7db71
Reference: github-starred/open-webui#58310