[GH-ISSUE #21780] issue: RAG template double injection #35096

Closed
opened 2026-04-25 09:17:35 -05:00 by GiteaMirror · 19 comments
Owner

Originally created by @relic664 on GitHub (Feb 23, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/21780

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.8.5

Ollama Version (if applicable)

No response

Operating System

Linux Alpine

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

RAG_TEMPLATE is being injected multiple times causing models to hallucinate and improperly call tools. This is a continuation of #21663 which still persists after v0.8.5.

Actual Behavior

RAG_TEMPLATE is injected at most once, preferably at the system message, not the user message. This is

Steps to Reproduce

  1. Use the v0.8.5 docker image
  2. Start a chat with any model
  3. Ensure web_search capabilities are enabled with native tool calling mode
  4. Set max results returned by web_search to be 1
  5. Ask a question requiring multiple turns

Logs & Screenshots

Call 1:
2026-02-23 07:11:13.545 | DEBUG | open_webui.utils.chat:generate_chat_completion:165 - generate_chat_completion: {'stream': True, 'model': 'zai-org/glm-5:thinking', 'messages': [{'role': 'system', 'content': '# Assistant\n\nYou are a concise, candid, and intellectually rigorous assistant.\nYou do not flatter the user or hedge unnecessarily. You value accuracy, clarity, and critical thinking over politeness or empathy. If the user’s idea is weak, flawed, or ambiguous, you say so directly and explain why. Avoid filler like “Great question!” or “Of course!”, or "You're absolutely right!" — go straight to substance. Maintain a professional, calm tone, but never defer or soften your reasoning. Your priority is to analyze, not appease. \n\n\n## Safety & etiquette\n- If the user says “don’t browse,” don’t browse.\n- Be transparent about uncertainty. Don’t speculate beyond evidence.\n\n## Response style\n- Start with the answer. Be concise and actionable.\n- Prioritize correctness and accuracy. Push back on user errors if present, but strive to be polite about the push back. \n- Do not praise the user for their points or assertions\n\n\n'}, {'role': 'user', 'content': '### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] only when the tag includes an explicit id attribute (e.g., ).\n\n### Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- Only include inline citations using [id] (e.g., [1], [2]) when the tag includes an id attribute.\n- Do not cite if the tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* "According to the study, the proposed method increases efficiency by 20% [1]."\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [id] only when the tag with id attribute is present in the context.\n\n\n\n\n\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at /mnt/user/appdata/app but I want to exclude /mnt/user/appdata/app/.cache how can I do so?\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at /mnt/user/appdata/app but I want to exclude /mnt/user/appdata/app/.cache how can I do so?'}, {'role': 'assistant', 'content': 'The user is asking about the Unraid App Backup and Restore plugin and wants to exclude a specific subdirectory from a backup. This is a specific technical question about Unraid configuration. Let me search for information about this plugin and how to exclude directories.', 'tool_calls': [{'id': 'call_jf21ilhw', 'type': 'function', 'function': {'name': 'search_web', 'arguments': '{"query": "Unraid App Backup Restore plugin exclude directory configuration"}'}}]}, {'role': 'tool', 'tool_call_id': 'call_jf21ilhw', 'content': '[{"title": "[Plugin] Appdata.Backup - Plugin Support - Unraid Forums", "link": "https://forums.unraid.net/topic/137710-plugin-appdatabackup/", "snippet": "Appdata.Backup Support Thread This is the support thread for appdata.backup (formerly known as ca.backup2). This plugin primary takes care of your appdata backup! It allows you to configure backup settings for each of your docker containers. Flash and VM meta backup is integrated as well! If you encounter any issues, post it here with the debug log file attached! For your beta feedback, please ..."}]'}],

Call 2:
2026-02-23 07:11:22.438 | DEBUG | open_webui.utils.chat:generate_chat_completion:165 - generate_chat_completion: {'stream': True, 'model': 'zai-org/glm-5:thinking', 'messages': [{'role': 'system', 'content': '# Assistant\n\nYou are a concise, candid, and intellectually rigorous assistant.\nYou do not flatter the user or hedge unnecessarily. You value accuracy, clarity, and critical thinking over politeness or empathy. If the user’s idea is weak, flawed, or ambiguous, you say so directly and explain why. Avoid filler like “Great question!” or “Of course!”, or "You're absolutely right!" — go straight to substance. Maintain a professional, calm tone, but never defer or soften your reasoning. Your priority is to analyze, not appease. \n\n\n## Safety & etiquette\n- If the user says “don’t browse,” don’t browse.\n- Be transparent about uncertainty. Don’t speculate beyond evidence.\n\n## Response style\n- Start with the answer. Be concise and actionable.\n- Prioritize correctness and accuracy. Push back on user errors if present, but strive to be polite about the push back. \n- Do not praise the user for their points or assertions\n\n\n'}, {'role': 'user', 'content': '### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] only when the tag includes an explicit id attribute (e.g., ).\n\n### Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- Only include inline citations using [id] (e.g., [1], [2]) when the tag includes an id attribute.\n- Do not cite if the tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* "According to the study, the proposed method increases efficiency by 20% [1]."\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [id] only when the tag with id attribute is present in the context.\n\n\n\n\n\n\n\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at /mnt/user/appdata/app but I want to exclude /mnt/user/appdata/app/.cache how can I do so?\n### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] only when the tag includes an explicit id attribute (e.g., ).\n\n### Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- Only include inline citations using [id] (e.g., [1], [2]) when the tag includes an id attribute.\n- Do not cite if the tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* "According to the study, the proposed method increases efficiency by 20% [1]."\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [id] only when the tag with id attribute is present in the context.\n\n\n\n\n\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at /mnt/user/appdata/app but I want to exclude /mnt/user/appdata/app/.cache how can I do so?\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at /mnt/user/appdata/app but I want to exclude /mnt/user/appdata/app/.cache how can I do so?'}, {'role': 'assistant', 'content': 'The user is asking about the Unraid App Backup and Restore plugin and wants to exclude a specific subdirectory from a backup. This is a specific technical question about Unraid configuration. Let me search for information about this plugin and how to exclude directories.', 'tool_calls': [{'id': 'call_jf21ilhw', 'type': 'function', 'function': {'name': 'search_web', 'arguments': '{"query": "Unraid App Backup Restore plugin exclude directory configuration"}'}}]}, {'role': 'tool', 'tool_call_id': 'call_jf21ilhw', 'content': '[{"title": "[Plugin] Appdata.Backup - Plugin Support - Unraid Forums", "link": "https://forums.unraid.net/topic/137710-plugin-appdatabackup/", "snippet": "Appdata.Backup Support Thread This is the support thread for appdata.backup (formerly known as ca.backup2). This plugin primary takes care of your appdata backup! It allows you to configure backup settings for each of your docker containers. Flash and VM meta backup is integrated as well! If you encounter any issues, post it here with the debug log file attached! For your beta feedback, please ..."}]'}, {'role': 'assistant', 'content': 'Good, I found the Unraid forums thread about the Appdata.Backup plugin. Let me fetch this page for more details on how to exclude directories.', 'tool_calls': [{'id': 'call_ksgs2j9s', 'type': 'function', 'function': {'name': 'fetch_url', 'arguments': '{"url": "https://forums.unraid.net/topic/137710-plugin-appdatabackup/"}'}}, {'id': 'call_gtfkzogf', 'type': 'function', 'function': {'name': 'search_web', 'arguments': '{"query": "### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] only when the tag includes an explicit id attribute (e.g., <source id=\"1\">).\n\n### Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- Only include inline citations using [id] (e.g., [1], [2]) when the tag includes an id attribute.\n- Do not cite if the tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* \"According to the study, the proposed method increases efficiency by 20% [1].\"\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [id] only when the tag with id attribute is present in the context.\n\n\n<source id=\"1\" name=\"search_web\">\n\n\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at /mnt/user/appdata/app but I want to exclude /mnt/user/appdata/app/.cache how can I do so?\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at /mnt/user/appdata/app but I want to exclude /mnt/user/appdata/app/.cache how can I do so?"}'}}, {'id': 'call_nix2ac48', 'type': 'function', 'function': {'name': 'calculate_timestamp', 'arguments': '{}'}}]}, {'role': 'tool', 'tool_call_id': 'call_ksgs2j9s', 'content': 'Jump to content\nView in the app\nA better way to browse. Learn more.\nLearn more ×Dismiss\nClose\nUnraid\nA full-screen app on your home screen with push notifications, badges and more.\nTo install this app on iOS and iPadOS\n 1. Tap the \n 2. Scroll the menu and tap Add to Home Screen.\n 3. Tap Add in the top-right corner.\n\n\nTo install this app on Android\n 1. Tap the 3-dot menu (⋮) in the top-right corner of the browser.\n 2. Tap Add to Home screen or Install app.\n 3. Confirm by tapping Install.\n\n\n Unraid Unleash Your Hardware \n * Sign In\n * Search\n * Menu\n\n\nMessage added by KluthR , July 1, 2025Jul 1\n## Feature freeze\nPlease read: https://forums.unraid.net/topic/137710-plugin-appdatabackup/page/77/#findComment-1564584\n * Reply to this topic\n\n\n * Page 1 of 85 \n\n\n * Page 1 of 85 \n\n\n## Join the conversation\nYou can post now and register later. If you have an account, sign in now to post with your account. Note: Your post will require moderator approval before it will be visible.\n Followers \n Go to topic listing \n * Existing user? Sign In \n * Sign Up \n\n\nSearch...\nClose\n#### Configure browser push notifications\nclose\n##### Chrome (Android)\n 1. Tap the lock icon next to the address bar.\n 2. Tap Permissions → Notifications.\n 3. Adjust your preference.\n\n\n##### Chrome (Desktop)\n 1. Click the padlock icon in the address bar.\n 2. Select Site settings.\n 3. Find Notifications and adjust your preference.\n\n\n##### Safari (iOS 16.4+)\n 1. Ensure the site is installed via Add to Home Screen.\n 2. Open Settings App → Notifications.\n 3. Find your app name and adjust your preference.\n\n\n##### Safari (macOS)\n 1. Go to Safari → Preferences.\n 2. Click the Websites tab.\n 3. Select Notifications in the sidebar.\n 4. Find this website and adjust your preference.\n\n\n##### Edge (Android)\n 1. Tap the lock icon next to the address bar.\n 2. Tap Permissions.\n 3. Find Notifications and adjust your preference.\n\n\n##### Edge (Desktop)\n 1. Click the padlock icon in the address bar.\n 2. Click Permissions for this site.\n 3. Find Notifications and adjust your preference.\n\n\n##### Firefox (Android)\n 1. Go to Settings → Site permissions.\n 2. Tap Notifications.\n 3. Find this site in the list and adjust your preference.\n\n\n##### Firefox (Desktop)\n 1. Open Firefox Settings.\n 2. Search for Notifications.\n 3. Find this site in the list and adjust your preference.\n\n\n'}, {'role': 'tool', 'tool_call_id': 'call_gtfkzogf', 'content': '[{"title": "Citations in old format will break with \"Oops! No text generated from Ollama ...", "link": "https://github.com/open-webui/open-webui/discussions/7210", "snippet": "Nov 22, 2024 ... ### Task: Respond to the user query using the provided context, incorporating inline ... only when the <source_id> tag is explicitly provided**\xa0..."}]'}, {'role': 'tool', 'tool_call_id': 'call_nix2ac48', 'content': '{"current_timestamp": 1771848682, "current_iso": "2026-02-23T12:11:22.375892+00:00", "calculated_timestamp": 1771848682, "calculated_iso": "2026-02-23T12:11:22.375892+00:00"}'}],

Additional Information

Tool calls end up processed using add_or_update_user_message:

                    # Apply source context to messages for model
                    # Use metadata_only=True to avoid duplicating content
                    # that is already in the tool result message.
                    all_tool_call_sources.extend(tool_call_sources)
                    if all_tool_call_sources and user_message:
                        # Restore original user message before re-applying to avoid recursive nesting
                        form_data["messages"] = add_or_update_user_message(
                            user_message, form_data["messages"], append=False
                        )

But the flag append=False doesn't update, it prepends:

        if append:
            message["content"] = f"{message['content']}\n{content}"
        else:
            message["content"] = f"{content}\n{message['content']}"

Result is that the template is added multiple times to the context, since this is done for each tool call.

Originally created by @relic664 on GitHub (Feb 23, 2026). Original GitHub issue: https://github.com/open-webui/open-webui/issues/21780 ### Check Existing Issues - [x] I have searched for any existing and/or related issues. - [x] I have searched for any existing and/or related discussions. - [x] I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!). - [x] I am using the latest version of Open WebUI. ### Installation Method Docker ### Open WebUI Version v0.8.5 ### Ollama Version (if applicable) _No response_ ### Operating System Linux Alpine ### Browser (if applicable) _No response_ ### Confirmation - [x] I have read and followed all instructions in `README.md`. - [x] I am using the latest version of **both** Open WebUI and Ollama. - [x] I have included the browser console logs. - [x] I have included the Docker container logs. - [x] I have **provided every relevant configuration, setting, and environment variable used in my setup.** - [x] I have clearly **listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup** (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc). - [x] I have documented **step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation**. My steps: - Start with the initial platform/version/OS and dependencies used, - Specify exact install/launch/configure commands, - List URLs visited, user input (incl. example values/emails/passwords if needed), - Describe all options and toggles enabled or changed, - Include any files or environmental changes, - Identify the expected and actual result at each stage, - Ensure any reasonably skilled user can follow and hit the same issue. ### Expected Behavior RAG_TEMPLATE is being injected multiple times causing models to hallucinate and improperly call tools. This is a continuation of #21663 which still persists after v0.8.5. ### Actual Behavior RAG_TEMPLATE is injected at most once, preferably at the system message, not the user message. This is ### Steps to Reproduce 1. Use the v0.8.5 docker image 2. Start a chat with any model 3. Ensure web_search capabilities are enabled with native tool calling mode 4. Set max results returned by web_search to be 1 5. Ask a question requiring multiple turns ### Logs & Screenshots Call 1: 2026-02-23 07:11:13.545 | DEBUG | open_webui.utils.chat:generate_chat_completion:165 - generate_chat_completion: {'stream': True, 'model': 'zai-org/glm-5:thinking', 'messages': [{'role': 'system', 'content': '# Assistant\n\nYou are a concise, candid, and intellectually rigorous assistant.\nYou do not flatter the user or hedge unnecessarily. You value accuracy, clarity, and critical thinking over politeness or empathy. If the user’s idea is weak, flawed, or ambiguous, you say so directly and explain why. Avoid filler like “Great question!” or “Of course!”, or "You\'re absolutely right!" — go straight to substance. Maintain a professional, calm tone, but never defer or soften your reasoning. Your priority is to analyze, not appease. \n\n\n## Safety & etiquette\n- If the user says “don’t browse,” don’t browse.\n- Be transparent about uncertainty. Don’t speculate beyond evidence.\n\n## Response style\n- Start with the answer. Be concise and actionable.\n- Prioritize correctness and accuracy. Push back on user errors if present, but strive to be polite about the push back. \n- Do not praise the user for their points or assertions\n\n\n'}, {'role': 'user', 'content': '### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id="1">).\n\n### Guidelines:\n- If you don\'t know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user\'s query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn\'t present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**\n- Do not cite if the <source> tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* "According to the study, the proposed method increases efficiency by 20% [1]."\n\n### Output:\nProvide a clear and direct response to the user\'s query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.\n\n<context>\n<source id="1" name="search_web"></source>\n</context>\n\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at `/mnt/user/appdata/app` but I want to exclude `/mnt/user/appdata/app/.cache` how can I do so?\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at `/mnt/user/appdata/app` but I want to exclude `/mnt/user/appdata/app/.cache` how can I do so?'}, {'role': 'assistant', 'content': '<think>The user is asking about the Unraid App Backup and Restore plugin and wants to exclude a specific subdirectory from a backup. This is a specific technical question about Unraid configuration. Let me search for information about this plugin and how to exclude directories.</think>', 'tool_calls': [{'id': 'call_jf21ilhw', 'type': 'function', 'function': {'name': 'search_web', 'arguments': '{"query": "Unraid App Backup Restore plugin exclude directory configuration"}'}}]}, {'role': 'tool', 'tool_call_id': 'call_jf21ilhw', 'content': '[{"title": "[Plugin] Appdata.Backup - Plugin Support - Unraid Forums", "link": "https://forums.unraid.net/topic/137710-plugin-appdatabackup/", "snippet": "Appdata.Backup Support Thread This is the support thread for appdata.backup (formerly known as ca.backup2). This plugin primary takes care of your appdata backup! It allows you to configure backup settings for each of your docker containers. Flash and VM meta backup is integrated as well! If you encounter any issues, post it here with the debug log file attached! For your beta feedback, please ..."}]'}], Call 2: 2026-02-23 07:11:22.438 | DEBUG | open_webui.utils.chat:generate_chat_completion:165 - generate_chat_completion: {'stream': True, 'model': 'zai-org/glm-5:thinking', 'messages': [{'role': 'system', 'content': '# Assistant\n\nYou are a concise, candid, and intellectually rigorous assistant.\nYou do not flatter the user or hedge unnecessarily. You value accuracy, clarity, and critical thinking over politeness or empathy. If the user’s idea is weak, flawed, or ambiguous, you say so directly and explain why. Avoid filler like “Great question!” or “Of course!”, or "You\'re absolutely right!" — go straight to substance. Maintain a professional, calm tone, but never defer or soften your reasoning. Your priority is to analyze, not appease. \n\n\n## Safety & etiquette\n- If the user says “don’t browse,” don’t browse.\n- Be transparent about uncertainty. Don’t speculate beyond evidence.\n\n## Response style\n- Start with the answer. Be concise and actionable.\n- Prioritize correctness and accuracy. Push back on user errors if present, but strive to be polite about the push back. \n- Do not praise the user for their points or assertions\n\n\n'}, {'role': 'user', 'content': '### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id="1">).\n\n### Guidelines:\n- If you don\'t know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user\'s query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn\'t present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**\n- Do not cite if the <source> tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* "According to the study, the proposed method increases efficiency by 20% [1]."\n\n### Output:\nProvide a clear and direct response to the user\'s query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.\n\n<context>\n<source id="1" name="search_web"></source>\n<source id="1" name="https://forums.unraid.net/topic/137710-plugin-appdatabackup/"></source>\n<source id="2" name="search_web"></source>\n</context>\n\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at `/mnt/user/appdata/app` but I want to exclude `/mnt/user/appdata/app/.cache` how can I do so?\n### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id="1">).\n\n### Guidelines:\n- If you don\'t know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user\'s query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn\'t present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**\n- Do not cite if the <source> tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* "According to the study, the proposed method increases efficiency by 20% [1]."\n\n### Output:\nProvide a clear and direct response to the user\'s query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.\n\n<context>\n<source id="1" name="search_web"></source>\n</context>\n\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at `/mnt/user/appdata/app` but I want to exclude `/mnt/user/appdata/app/.cache` how can I do so?\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at `/mnt/user/appdata/app` but I want to exclude `/mnt/user/appdata/app/.cache` how can I do so?'}, {'role': 'assistant', 'content': '<think>The user is asking about the Unraid App Backup and Restore plugin and wants to exclude a specific subdirectory from a backup. This is a specific technical question about Unraid configuration. Let me search for information about this plugin and how to exclude directories.</think>', 'tool_calls': [{'id': 'call_jf21ilhw', 'type': 'function', 'function': {'name': 'search_web', 'arguments': '{"query": "Unraid App Backup Restore plugin exclude directory configuration"}'}}]}, {'role': 'tool', 'tool_call_id': 'call_jf21ilhw', 'content': '[{"title": "[Plugin] Appdata.Backup - Plugin Support - Unraid Forums", "link": "https://forums.unraid.net/topic/137710-plugin-appdatabackup/", "snippet": "Appdata.Backup Support Thread This is the support thread for appdata.backup (formerly known as ca.backup2). This plugin primary takes care of your appdata backup! It allows you to configure backup settings for each of your docker containers. Flash and VM meta backup is integrated as well! If you encounter any issues, post it here with the debug log file attached! For your beta feedback, please ..."}]'}, {'role': 'assistant', 'content': '<think>Good, I found the Unraid forums thread about the Appdata.Backup plugin. Let me fetch this page for more details on how to exclude directories.</think>', 'tool_calls': [{'id': 'call_ksgs2j9s', 'type': 'function', 'function': {'name': 'fetch_url', 'arguments': '{"url": "https://forums.unraid.net/topic/137710-plugin-appdatabackup/"}'}}, {'id': 'call_gtfkzogf', 'type': 'function', 'function': {'name': 'search_web', 'arguments': '{"query": "### Task:\\nRespond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\\"1\\">).\\n\\n### Guidelines:\\n- If you don\'t know the answer, clearly state that.\\n- If uncertain, ask the user for clarification.\\n- Respond in the same language as the user\'s query.\\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\\n- If the answer isn\'t present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\\n- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**\\n- Do not cite if the <source> tag does not contain an id attribute.\\n- Do not use XML tags in your response.\\n- Ensure citations are concise and directly related to the information provided.\\n\\n### Example of Citation:\\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\\n* \\"According to the study, the proposed method increases efficiency by 20% [1].\\"\\n\\n### Output:\\nProvide a clear and direct response to the user\'s query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.\\n\\n<context>\\n<source id=\\"1\\" name=\\"search_web\\"></source>\\n</context>\\n\\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at `/mnt/user/appdata/app` but I want to exclude `/mnt/user/appdata/app/.cache` how can I do so?\\nIn unraid, for the app backup and restore plugin, I have an app I want to back up at `/mnt/user/appdata/app` but I want to exclude `/mnt/user/appdata/app/.cache` how can I do so?"}'}}, {'id': 'call_nix2ac48', 'type': 'function', 'function': {'name': 'calculate_timestamp', 'arguments': '{}'}}]}, {'role': 'tool', 'tool_call_id': 'call_ksgs2j9s', 'content': '[Jump to content](https://forums.unraid.net/topic/137710-plugin-appdatabackup/#ipsLayout__main)\nView in the app\nA better way to browse. **Learn more**.\nLearn more ×Dismiss\nClose\nUnraid\nA full-screen app on your home screen with push notifications, badges and more.\nTo install this app on iOS and iPadOS\n 1. Tap the \n 2. Scroll the menu and tap **Add to Home Screen**.\n 3. Tap **Add** in the top-right corner.\n\n\nTo install this app on Android\n 1. Tap the 3-dot menu (⋮) in the top-right corner of the browser.\n 2. Tap **Add to Home screen** or **Install app**.\n 3. Confirm by tapping **Install**.\n\n\n[ Unraid Unleash Your Hardware ](https://forums.unraid.net/)\n * Sign In\n * Search\n * Menu\n\n\n**Message added by KluthR** , July 1, 2025Jul 1\n## Feature freeze\nPlease read: <https://forums.unraid.net/topic/137710-plugin-appdatabackup/page/77/#findComment-1564584>\n * [Reply to this topic](https://forums.unraid.net/topic/137710-plugin-appdatabackup/#replyForm)\n\n\n * Page 1 of 85 \n\n\n * Page 1 of 85 \n\n\n## Join the conversation\nYou can post now and register later. If you have an account, [sign in now](https://forums.unraid.net/login/) to post with your account. **Note:** Your post will require moderator approval before it will be visible.\n[ Followers ](https://forums.unraid.net/login/ "Sign in to follow this")\n[ Go to topic listing ](https://forums.unraid.net/forum/61-plugin-support/ "Go to Plugin Support")\n * [ Existing user? Sign In ](https://forums.unraid.net/login/)\n * [ Sign Up ](https://account.unraid.net/?redirect_uri=https://forums.unraid.net)\n\n\nSearch...\nClose\n#### Configure browser push notifications\nclose\n##### Chrome (Android)\n 1. Tap the lock icon next to the address bar.\n 2. Tap **Permissions → Notifications**.\n 3. Adjust your preference.\n\n\n##### Chrome (Desktop)\n 1. Click the padlock icon in the address bar.\n 2. Select **Site settings**.\n 3. Find **Notifications** and adjust your preference.\n\n\n##### Safari (iOS 16.4+)\n 1. Ensure the site is installed via **Add to Home Screen**.\n 2. Open **Settings App → Notifications**.\n 3. Find your app name and adjust your preference.\n\n\n##### Safari (macOS)\n 1. Go to **Safari → Preferences**.\n 2. Click the **Websites** tab.\n 3. Select **Notifications** in the sidebar.\n 4. Find this website and adjust your preference.\n\n\n##### Edge (Android)\n 1. Tap the lock icon next to the address bar.\n 2. Tap **Permissions**.\n 3. Find **Notifications** and adjust your preference.\n\n\n##### Edge (Desktop)\n 1. Click the padlock icon in the address bar.\n 2. Click **Permissions for this site**.\n 3. Find **Notifications** and adjust your preference.\n\n\n##### Firefox (Android)\n 1. Go to **Settings → Site permissions**.\n 2. Tap **Notifications**.\n 3. Find this site in the list and adjust your preference.\n\n\n##### Firefox (Desktop)\n 1. Open Firefox Settings.\n 2. Search for **Notifications**.\n 3. Find this site in the list and adjust your preference.\n\n\n'}, {'role': 'tool', 'tool_call_id': 'call_gtfkzogf', 'content': '[{"title": "Citations in old format will break with \\"Oops! No text generated from Ollama ...", "link": "https://github.com/open-webui/open-webui/discussions/7210", "snippet": "Nov 22, 2024 ... ### Task: Respond to the user query using the provided context, incorporating inline ... only when the <source_id> tag is explicitly provided**\xa0..."}]'}, {'role': 'tool', 'tool_call_id': 'call_nix2ac48', 'content': '{"current_timestamp": 1771848682, "current_iso": "2026-02-23T12:11:22.375892+00:00", "calculated_timestamp": 1771848682, "calculated_iso": "2026-02-23T12:11:22.375892+00:00"}'}], ### Additional Information Tool calls end up processed using `add_or_update_user_message`: ```python # Apply source context to messages for model # Use metadata_only=True to avoid duplicating content # that is already in the tool result message. all_tool_call_sources.extend(tool_call_sources) if all_tool_call_sources and user_message: # Restore original user message before re-applying to avoid recursive nesting form_data["messages"] = add_or_update_user_message( user_message, form_data["messages"], append=False ) ``` But the flag `append=False` doesn't update, it prepends: ``` if append: message["content"] = f"{message['content']}\n{content}" else: message["content"] = f"{content}\n{message['content']}" ``` Result is that the template is added multiple times to the context, since this is done for each tool call.
GiteaMirror added the bug label 2026-04-25 09:17:35 -05:00
Author
Owner

@relic664 commented on GitHub (Feb 23, 2026):

My working patch has been to just force the RAG_SYSTEM_CONTEXT into the system context, instead of user, to stop the hallucination and bad tool calls, but a real fix would be to fix the behavior where add_or_update_user_message is always adding to the context, regardless of append value.

<!-- gh-comment-id:3944707124 --> @relic664 commented on GitHub (Feb 23, 2026): My working patch has been to just force the RAG_SYSTEM_CONTEXT into the system context, instead of user, to stop the hallucination and bad tool calls, but a real fix would be to fix the behavior where `add_or_update_user_message` is always adding to the context, regardless of `append` value.
Author
Owner

@tjbck commented on GitHub (Feb 23, 2026):

@relic664 could you pull the latest dev and confirm the issue has been resolved?

<!-- gh-comment-id:3946479946 --> @tjbck commented on GitHub (Feb 23, 2026): @relic664 could you pull the latest dev and confirm the issue has been resolved?
Author
Owner

@relic664 commented on GitHub (Feb 23, 2026):

I tested docker tag git-a52e6c2 (which is dev as of writing) and the multiple template injection is fixed.

However, I still have issues where models hallucinate the RAG template into tool calls. You can blame the model in that instance and it's not strictly a Open-WebUI problem, but I do think how this is handled during native tool calling warrants some thought (i.e, should the context be injected in a non-user channel like system?).

I can continue to patch my system locally (https://github.com/open-webui/open-webui/issues/21663#issuecomment-3942078156), but I'm sure others will have a similar issue. Especially considering I've observed this with larger models like GLM-5 (thinking).

<!-- gh-comment-id:3947765995 --> @relic664 commented on GitHub (Feb 23, 2026): I tested docker tag `git-a52e6c2` (which is `dev` as of writing) and the multiple template injection **is fixed**. However, I still have issues where models hallucinate the RAG template into tool calls. You can blame the model in that instance and it's not strictly a Open-WebUI problem, but I do think how this is handled during native tool calling warrants some thought (i.e, should the context be injected in a non-user channel like system?). I can continue to patch my system locally (https://github.com/open-webui/open-webui/issues/21663#issuecomment-3942078156), but I'm sure others will have a similar issue. Especially considering I've observed this with larger models like GLM-5 (thinking).
Author
Owner

@Classic298 commented on GitHub (Feb 24, 2026):

Thanks for confirming it fixed. The bug you are still experiencing might be model dependent behaviour

If you are self-hosting: check the context size

<!-- gh-comment-id:3954907664 --> @Classic298 commented on GitHub (Feb 24, 2026): Thanks for confirming it fixed. The bug you are still experiencing might be model dependent behaviour If you are self-hosting: check the context size
Author
Owner

@relic664 commented on GitHub (Feb 24, 2026):

It happens with multiple open source models, self hosted, nano gpt, openrouter, any number of providers. Its simply a byproduct of sticking the template into the user chanel -- models can and will hallucinate the template into tool calls.

If anyone else has this issue, a simple patch (https://github.com/open-webui/open-webui/issues/21663#issuecomment-3942078156) will force the template into the system channel instead, eliminating the issue while still retaining citations.

<!-- gh-comment-id:3954922737 --> @relic664 commented on GitHub (Feb 24, 2026): It happens with multiple open source models, self hosted, nano gpt, openrouter, any number of providers. Its simply a byproduct of sticking the template into the user chanel -- models can and will hallucinate the template into tool calls. If anyone else has this issue, a simple patch (https://github.com/open-webui/open-webui/issues/21663#issuecomment-3942078156) will force the template into the system channel instead, eliminating the issue while still retaining citations.
Author
Owner

@Classic298 commented on GitHub (Feb 24, 2026):

@relic664 this has never been reported in ... years despite the RAG prompt always being in the user message. Nothing changed about the prompt's placement.

I'll look into it again, but it would also help if you could tell me what models you use and with what configs. I really try to reproduce, but i can't.

<!-- gh-comment-id:3954958488 --> @Classic298 commented on GitHub (Feb 24, 2026): @relic664 this has never been reported in ... years despite the RAG prompt always being in the user message. Nothing changed about the prompt's placement. I'll look into it again, but it would also help if you could tell me what models you use and with what configs. I really try to reproduce, but i can't.
Author
Owner

@relic664 commented on GitHub (Feb 24, 2026):

I will grab some performance data for you in the coming days. And the issue is around native tool calling specifically, and the "native" mode was only added about a month ago in 0.7.0.

<!-- gh-comment-id:3955047629 --> @relic664 commented on GitHub (Feb 24, 2026): I will grab some performance data for you in the coming days. And the issue is around native tool calling specifically, and the "native" mode was only added about a month ago in 0.7.0.
Author
Owner

@Classic298 commented on GitHub (Feb 24, 2026):

Yes. I use native mode exclusively. I just don't see the rag prompt ever get injected by the model into the query

<!-- gh-comment-id:3955071660 --> @Classic298 commented on GitHub (Feb 24, 2026): Yes. I use native mode exclusively. I just don't see the rag prompt ever get injected by the model into the query
Author
Owner

@relic664 commented on GitHub (Feb 25, 2026):

Here's what I have so far. Models denoted (-) mean they completed the task without hallucinating. The patched column is where I'm instead sticking the rag template into system context instead of user context. In both cases, I modified web search to only return one result.

Prompt:

Tell me about the history of hand washing. Use five or more sources from the web, which might require multiple tool calls as the tool can only return one result at a time.

Model Calls before Hallucinate Calls before Hallucinate (Patched)
GLM 5 Thinking 3 -
GLM 5 1 -
Nemotron 3 A3B 4 -
Qwen 3.5 A17B Thinking 9 -
Qwen 3.5 A17B - -
Kimi 2.5 Thinking 6 -
GLM 4.7 Flash 8 -
Qwen 3 A22b 2507 - -
Qwen 3.5 27b Thinking 6 -
Qwen 3.5 a3b - -
Step 3.5 Flash - -
MiniMax 2.5 - -

It's worth noting that the system and user context actually do have different meaning to the model. Post-training trains models to attend to the system prompt to shape behavior, but user context is to be operated on/processed. The models obviously perform differently, with GLM 5 being the worst, likely due to differences in the post-training regime.

The fact that this hasn't been reported before is related to native tool calling and how models are trained to use tools. Typically user context/data is used by models to perform tool calls, hence it leaking into the tool call.

For example, models are fine tuned with supervised datasets that pair user context with tool call specs/args.

<!-- gh-comment-id:3956419504 --> @relic664 commented on GitHub (Feb 25, 2026): Here's what I have so far. Models denoted (-) mean they completed the task without hallucinating. The patched column is where I'm instead sticking the rag template into system context instead of user context. In both cases, I modified web search to only return one result. Prompt: > Tell me about the history of hand washing. Use five or more sources from the web, which might require multiple tool calls as the tool can only return one result at a time. | Model | Calls before Hallucinate | Calls before Hallucinate (Patched) | | --- | --- | --- | | GLM 5 Thinking | 3 | - | | GLM 5 | 1 | - | | Nemotron 3 A3B | 4 | - | | Qwen 3.5 A17B Thinking | 9 | - | | Qwen 3.5 A17B | - | - | | Kimi 2.5 Thinking | 6 | - | | GLM 4.7 Flash | 8 | - | | Qwen 3 A22b 2507 | - | - | | Qwen 3.5 27b Thinking | 6 | - | | Qwen 3.5 a3b | - | - | | Step 3.5 Flash | - | - | | MiniMax 2.5 | - | - | It's worth noting that the system and user context actually do have different meaning to the model. Post-training trains models to attend to the system prompt to shape behavior, but user context is to be operated on/processed. The models obviously perform differently, with GLM 5 being the worst, likely due to differences in the post-training regime. The fact that this hasn't been reported before is related to native tool calling and how models are trained to use tools. Typically user context/data is used by models to perform tool calls, hence it leaking into the tool call. For example, models are fine tuned with supervised datasets that pair user context with tool call specs/args.
Author
Owner

@Classic298 commented on GitHub (Feb 25, 2026):

Hmm mhm

Maybe it's worth considering moving the rag prompt to the system prompt

Thanks for these detailed investigation this is extraordinary

<!-- gh-comment-id:3957280828 --> @Classic298 commented on GitHub (Feb 25, 2026): Hmm mhm Maybe it's worth considering moving the rag prompt to the system prompt Thanks for these detailed investigation this is extraordinary
Author
Owner

@Classic298 commented on GitHub (Feb 25, 2026):

@relic664 testing wanted

https://github.com/open-webui/open-webui/pull/21855

<!-- gh-comment-id:3957753909 --> @Classic298 commented on GitHub (Feb 25, 2026): @relic664 testing wanted https://github.com/open-webui/open-webui/pull/21855
Author
Owner

@relic664 commented on GitHub (Feb 25, 2026):

@Classic298 gave it a test just now, looks good. All the models passed w/o hallucination under your PR.

<!-- gh-comment-id:3959223545 --> @relic664 commented on GitHub (Feb 25, 2026): @Classic298 gave it a test just now, looks good. All the models passed w/o hallucination under your PR.
Author
Owner

@Classic298 commented on GitHub (Feb 25, 2026):

awesomeness @tjbck

<!-- gh-comment-id:3959226697 --> @Classic298 commented on GitHub (Feb 25, 2026): awesomeness @tjbck
Author
Owner

@Classic298 commented on GitHub (Feb 25, 2026):

thanks so much for testing and producing top notch benchmark results. Rare.

<!-- gh-comment-id:3959228328 --> @Classic298 commented on GitHub (Feb 25, 2026): thanks so much for testing and producing top notch benchmark results. Rare.
Author
Owner

@relic664 commented on GitHub (Feb 27, 2026):

@Classic298 what happened with the PR? I saw that it was closed, but unclear what the resolution is/was.

<!-- gh-comment-id:3973525368 --> @relic664 commented on GitHub (Feb 27, 2026): @Classic298 what happened with the PR? I saw that it was closed, but unclear what the resolution is/was.
Author
Owner

@Classic298 commented on GitHub (Feb 27, 2026):

@relic664 moving it to system message not wanted

Why?
I am not in the know, but i suspect because it breaks the kv cache
And because at the end of the day IT IS model dependent behavior, and I also still cannot reproduce these hallucinations and have never seen them.

<!-- gh-comment-id:3974198583 --> @Classic298 commented on GitHub (Feb 27, 2026): @relic664 moving it to system message not wanted Why? I am not in the know, but i suspect because it breaks the kv cache And because at the end of the day IT IS model dependent behavior, and I also still cannot reproduce these hallucinations and have never seen them.
Author
Owner

@relic664 commented on GitHub (Feb 27, 2026):

@tjbck

Would you mind elaborating? I can see how the kv cache does break under the proposed PR and I'm happy to contribute to fixing the issue without breaking the kv cache. There's other solutions that could be proposed. Are you strictly against the template being in the system message under any circumstance?

As long as the template remains in the user channel, the possibility to hallucinate the template into a tool call remains. Yes there is some model dependency, but I think it would be best to have a model-agnostic native tool calling pipeline.

<!-- gh-comment-id:3974278729 --> @relic664 commented on GitHub (Feb 27, 2026): @tjbck Would you mind elaborating? I can see how the kv cache does break under the proposed PR and I'm happy to contribute to fixing the issue without breaking the kv cache. There's other solutions that could be proposed. Are you strictly against the template being in the system message under any circumstance? As long as the template remains in the user channel, the possibility to hallucinate the template into a tool call remains. Yes there is some model dependency, but I think it would be best to have a model-agnostic native tool calling pipeline.
Author
Owner

@tjbck commented on GitHub (Mar 8, 2026):

RAG_SYSTEM_CONTEXT=True should be used.

<!-- gh-comment-id:4020161185 --> @tjbck commented on GitHub (Mar 8, 2026): RAG_SYSTEM_CONTEXT=True should be used.
Author
Owner

@relic664 commented on GitHub (Mar 9, 2026):

Okay looking at this more, yes RAG_SYSTEM_CONTEXT=True does resolve this issue, but at the expense of KV-cache as was mentioned above.

This solves the immediate problem, but a more pragmatic approach for the future would be to split RAG_TEMPLATE up so the context goes to user, instructions to system. I will give this some thought and propose a change as it's much less straightforward than sending everything to the system context.

Edit: I made some passes at splitting the template up from the context - so the sources go into the user channel, but instructions into the system channel. While this works, I found models performed worse in terms of citing the correct source - likely due to the disconnect between the system instruction (system context) and sources (user context), even when experimenting with playing with the default rag template. Generally this approach seems to be empirically more brittle, particularly for smaller models.

So, in short, I think if models are hallucinating the guidance needs to be RAG_SYSTEM_CONTEXT=True as forcing everything into system context still retains citation accuracy while significantly reducing the chance of hallucinating the template into a tool call, at the expense of breaking the KV cache.

<!-- gh-comment-id:4023282182 --> @relic664 commented on GitHub (Mar 9, 2026): Okay looking at this more, yes `RAG_SYSTEM_CONTEXT=True` does resolve this issue, but at the expense of KV-cache as was mentioned above. This solves the immediate problem, but a more pragmatic approach for the future would be to split `RAG_TEMPLATE` up so the context goes to user, instructions to system. I will give this some thought and propose a change as it's much less straightforward than sending everything to the system context. Edit: I made some passes at splitting the template up from the context - so the sources go into the user channel, but instructions into the system channel. While this works, I found models performed worse in terms of citing the correct source - likely due to the disconnect between the system instruction (system context) and sources (user context), even when experimenting with playing with the default rag template. Generally this approach seems to be empirically more brittle, particularly for smaller models. So, in short, I think if models are hallucinating the guidance needs to be `RAG_SYSTEM_CONTEXT=True` as forcing everything into system context still retains citation accuracy while significantly reducing the chance of hallucinating the template into a tool call, at the expense of breaking the KV cache.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#35096