issue: With "bypass embedding and retrieval" attached files are still sent in a RAG system prompt not with the user message #6490

Closed
opened 2025-11-11 16:56:43 -06:00 by GiteaMirror · 15 comments

Originally created by @mramendi on GitHub (Sep 24, 2025).

Originally assigned to: @tjbck on GitHub.

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.30

Ollama Version (if applicable)

No response

Operating System

RHEL 9

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

After enabling "Bypass embedding and retrieval", any file attached to a chat message becomes part of the user prompt in the chat turn in which it is attached. The system prompt is unmodified, and OWUI adds no other role="system" messages.

Actual Behavior

After enabling "Bypass embedding and retrieval", any file attached to a chat message becomes part of a "RAG system prompt", which accumulates every file ever attached to the thread. This "RAG system prompt" is then sent as a message with the "system" role, which can affect LLM behaviour, especially when the main system prompt is a CoT (chain-of-thought) prompt.
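
To make the difference between the two shapes concrete, here is a minimal sketch. It is not Open WebUI code: the helper names, the `[name]` inlining format, and the placeholder assistant reply are assumptions made up for illustration, and only the pooled `<context>`/`<source>` layout mirrors the logged request.

```python
from typing import Dict, List

Message = Dict[str, str]


def per_turn_messages(turns: List[dict]) -> List[Message]:
    """Expected shape: each file is inlined into the user message of its own turn."""
    messages: List[Message] = []
    for turn in turns:
        # Hypothetical inlining format; the issue does not prescribe one.
        blocks = [f"[{name}]\n{text}" for name, text in turn.get("files", [])]
        blocks.append(turn["user"])
        messages.append({"role": "user", "content": "\n\n".join(blocks)})
        if "assistant" in turn:
            messages.append({"role": "assistant", "content": turn["assistant"]})
    return messages


def pooled_system_messages(turns: List[dict]) -> List[Message]:
    """Reported shape (v0.6.30): every file in the thread lands in one system message."""
    pooled = [pair for turn in turns for pair in turn.get("files", [])]
    context = "\n".join(
        f'<source id="{i}" name="{name}">{text}</source>'
        for i, (name, text) in enumerate(pooled, start=1)
    )
    messages: List[Message] = [
        {"role": "system", "content": f"<context>\n{context}\n</context>"}
    ]
    for turn in turns:
        messages.append({"role": "user", "content": turn["user"]})
        if "assistant" in turn:
            messages.append({"role": "assistant", "content": turn["assistant"]})
    return messages


turns = [
    {
        "user": "Hello! Have a test file.",
        "files": [("test1.txt", "I am test1.txt")],
        "assistant": "Got it.",  # placeholder reply, not taken from the logs
    },
    {"user": "Have another test file.", "files": [("test2.txt", "I am test2.txt")]},
]

expected = per_turn_messages(turns)
actual = pooled_system_messages(turns)
```

In the expected shape no system message is created and each file stays with its own turn; in the reported shape both files sit together in a single system message ahead of the conversation.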

Steps to Reproduce

  1. Have some way of logging the full API request (I used S3 storage in LiteLLM).
  2. In the Admin panel, go to Settings>Documents and enable "Bypass embedding and retrieval".
  3. Create two test files: test1.txt and test2.txt. Just put "I am <filename>" in the files.
  4. Start a fresh chat with a model in OpenWebUI.
  5. Attach test1.txt, send a message, see a brief popup about sources, get a response.
  6. Attach test2.txt, send a message, see a brief popup about sources, get a response.
  7. Review the log of the latest request and see that the contents of test1.txt and test2.txt are in the "RAG system prompt", together with some prose that you as the user never requested (it is RAG prose, and you enabled "Bypass embedding and retrieval" expecting to disable the built-in RAG).
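
For step 7, the check can be automated rather than read by eye. The following is a rough sketch that assumes the logged file is a JSON object with a top-level "messages" list, as the snippet in this report suggests; the function names are hypothetical. It matches only `<source>` tags that carry a `name` attribute, so the `<source id="1">` example inside the prompt prose is not counted as a file.

```python
import json
import re
from typing import List, Tuple

# Matches injected file sources such as <source id="1" name="test1.txt">.
SOURCE_NAME = re.compile(r'<source\b[^>]*\bname="([^"]+)"')


def files_by_message(request: dict) -> List[Tuple[int, str, List[str]]]:
    """Return (message index, role, file names) for each message carrying file sources."""
    found = []
    for index, message in enumerate(request.get("messages", [])):
        content = message.get("content")
        if not isinstance(content, str):
            continue  # skip non-text content
        names = SOURCE_NAME.findall(content)
        if names:
            found.append((index, message.get("role", ""), names))
    return found


def load_request(path: str) -> dict:
    """Load one logged request; the file layout is an assumption, see above."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Trimmed-down stand-in for the logged v0.6.30 request.
sample = {
    "messages": [
        {
            "role": "system",
            "content": (
                'only when the <source> tag includes an id (e.g., <source id="1">).\n'
                '<context>\n<source id="1" name="test1.txt">I am test1.txt\n</source>\n'
                '<source id="2" name="test2.txt">I am test2.txt\n</source>\n</context>'
            ),
        },
        {"role": "user", "content": "Hello! Have a test file."},
        {"role": "user", "content": "Have another test file."},
    ]
}

report = files_by_message(sample)
```

For a real log, `files_by_message(load_request(path))` gives the same kind of report; a "system" role in the result means the attachments were placed in a system message rather than in a user turn.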

Logs & Screenshots

A screenshot of my admin settings>documents: https://imgur.com/a/4h1w2gK

The first message being prepared https://imgur.com/a/rA1hWmk

The second message being sent - the screenshot caught the briefly appearing messages about sources: https://imgur.com/a/n7NC0eQ

Google Drive link for the S3 storage download for the exchange. The gpt-5-nano request is just the chat title, but the other two are the actual exchange, note exactly where the text file content is. https://drive.google.com/file/d/1MVzU4EdSUluKkmKG39Q4YOlZt4YJo3Rm/view?usp=sharing

The most relevant part of the time-12-48-17-813581_chatcmpl-ad121799-66dd-4d2e-881d-1b72a0a2c2df.json file:

"messages": [{"role": "system", "content": "###Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).\n\n###Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**\n- Do not cite if the <source> tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n*\"According to the study, the proposed method increases efficiency by 20% [1].\"\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.\n\n<context>\n<source id=\"1\" name=\"test1.txt\">I am test1.txt\n</source>\n<source id=\"2\" name=\"test2.txt\">I am test2.txt\n</source>\n</context>\n\n<user_query>\nHave another test file.\n</user_query>\n"}, {"role": "user", "content": "Hello! Have a test file."}, {"role": "assistant", "content": "Yes, there is a test file named \"test1.txt\" mentioned in the context [1]. Let me know if you need further details about it!"}, {"role": "user", "content": "Have another test file."}]

Additional Information

No response
GiteaMirror added the bug label 2025-11-11 16:56:43 -06:00

@Classic298 commented on GitHub (Sep 24, 2025):

Add to that, even if all files are in full context mode, RAG queries are still generated

@tjbck commented on GitHub (Sep 24, 2025):

@Classic298 We'll need a new issue for this. @mramendi This behaviour has been modified to use the last user message instead.

@mramendi commented on GitHub (Sep 25, 2025):

Excuse me, modified in what version? @tjbck the logs I included are all from v0.6.30 and that's the latest version according to https://github.com/open-webui/open-webui/releases

Or is this a freshly merged PR and I need to wait for v0.6.31? I checked the PRs and could not find this but maybe it's just me.

@Classic298 commented on GitHub (Sep 25, 2025):

@mramendi in dev - so in the next version

@mramendi commented on GitHub (Sep 25, 2025):

@Classic298 thanks a lot!! if possible I would prefer to see the PR - I'm trying to understand how OWUI functions under the hood - not asking to "review" just to eyeball for my own education

@Classic298 commented on GitHub (Sep 25, 2025):

Should be this: https://github.com/open-webui/open-webui/commit/f096e99059d8591e1bd03fe2a656cba5ca71556f

@mramendi commented on GitHub (Sep 26, 2025):

Updated to 0.6.31, the issue still persists.

Snippet from full request log:

"messages": [{"role": "user", "content": "### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [id] **only when the <source> tag includes an explicit id attribute** (e.g., <source id=\"1\">).\n\n### Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- **Only include inline citations using [id] (e.g., [1], [2]) when the <source> tag includes an id attribute.**\n- Do not cite if the <source> tag does not contain an id attribute.\n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in a source with a provided id attribute, the response should include the citation like in the following example:\n* \"According to the study, the proposed method increases efficiency by 20% [1].\"\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [id] only when the <source> tag with id attribute is present in the context.\n\n<context>\n<source id=\"1\" name=\"test1.txt\">I am test1.txt\n</source>\n</context>\n\n<user_query>\nhave a test file\n</user_query>\n\nhave a test file"}], "response": {"id": "chatcmpl-1d08ca24-635c-4b76-b42c-d9b0f63aa09e", "created": 1758846929, "model": "Qwen/Qwen3-235B-A22B-Instruct-2507", "object": "chat.completion", "system_fingerprint": null, "choices": [{"finish_reason": "stop", "index": 0, "message": {"content": "Yes, there is a test file named \"test1.txt\" [1].", "role": "assistant", "tool_calls": null, 
"function_call": null}}]
@Classic298 commented on GitHub (Sep 26, 2025):

Ah no.

That is the citation prompt, not the RAG/query prompt.

You always need this to be sent.

@mramendi commented on GitHub (Sep 26, 2025):

I see now. While I would prefer not to have this prompt and to have each file in the message where it was sent (not all in the last message), this does make things better. Is the citation prompt simply hardcoded?

@Classic298 commented on GitHub (Sep 26, 2025):

The citation prompt is not hardcoded. I am pretty sure you can insert your own version of the citation prompt in the admin panel and set it to just a dot or something like that to override the default prompt.

@mramendi commented on GitHub (Sep 26, 2025):

I just went through the entire admin panel and could not find the citation prompt, advice would be much appreciated.

@Classic298 commented on GitHub (Sep 26, 2025):

Admin panel > Documents > RAG-Template

@mramendi commented on GitHub (Sep 26, 2025):

@Classic298 a fresh screenshot of my Admin panel > Settings > Documents is at https://imgur.com/a/TeI8WAn and there is no RAG-Template in sight. (If I turn off "Bypass embedding and retrieval", a couple more settings appear, but still nothing like this.)

@rgaricano commented on GitHub (Sep 26, 2025):

Scroll down, it is below.

Image: https://github.com/user-attachments/assets/a70c6673-a386-4c63-a9f3-c98f735eaf06

@mramendi commented on GitHub (Sep 26, 2025):

This setting was not present in the Chrome Web App but surfaced when I switched to the web page. Also, the setting is accessible only when "Bypass Embeddings and Retrieval" is disabled, but it is used even when "Bypass Embeddings and Retrieval" is enabled. Thanks for the pointers, I do think I'll be able to fix my workflow for now.
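
As a rough illustration of the workaround discussed above (replacing the default RAG-Template with something minimal), here is a sketch of how a context-only template could be filled in. The `{{CONTEXT}}` and `{{QUERY}}` placeholder names and the `render_template` helper are assumptions for illustration; this thread does not confirm how Open WebUI substitutes its template, so check the default template text in your own instance before relying on them.

```python
import re

# Assumption: template slots are written as {{CONTEXT}} and {{QUERY}}.
PLACEHOLDER = re.compile(r"\{\{(CONTEXT|QUERY)\}\}")

# A template that passes the files through without any citation instructions.
CONTEXT_ONLY_TEMPLATE = "<context>\n{{CONTEXT}}\n</context>"


def render_template(template: str, context: str, query: str) -> str:
    """Fill both slots in a single pass so inserted text is never re-scanned."""
    values = {"CONTEXT": context, "QUERY": query}
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


context = '<source id="1" name="test1.txt">I am test1.txt\n</source>'
rendered = render_template(CONTEXT_ONLY_TEMPLATE, context, "have a test file")
print(rendered)
```

The template above leaves out the query slot on purpose: in the 0.6.31 request log earlier in this thread, the user's text appears again after the rendered template, so repeating it inside the template looks unnecessary.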

Reference: github-starred/open-webui#6490