[GH-ISSUE #17926] issue: Generate query and launch retrieval even with Bypass Embedding and Retrieval enabled #18441

Closed
opened 2026-04-20 00:39:45 -05:00 by GiteaMirror · 8 comments

Originally created by @Zijie-wang1125 on GitHub (Sep 30, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/17926

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Git Clone

Open WebUI Version

0.6.31

Ollama Version (if applicable)

No response

Operating System

macOS Sonoma

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When Bypass Embedding and Retrieval is enabled, no retrieval query should be generated, and no retrieval indicators or status updates should be displayed in the interface.

Actual Behavior

A query is still generated, and a retrieval status is displayed (even though no retrieval is actually running).

Steps to Reproduce

  1. Turn on Bypass Embedding and Retrieval in the settings.
  2. Upload a file or attach a knowledge base.
  3. Ask anything.

Logs & Screenshots

Image: https://github.com/user-attachments/assets/3b85e938-dccd-4fde-86cd-3e76c1c9b819

Additional Information

No response

GiteaMirror added the bug label 2026-04-20 00:39:45 -05:00

@Classic298 commented on GitHub (Sep 30, 2025):

Please update to the latest version; it no longer generates RAG queries if all files are in full context mode.


@Zijie-wang1125 commented on GitHub (Oct 1, 2025):

That is true; however, the issue is still there. Here are the two cases:
1. If 'Bypass Embedding and Retrieval' is on, all files should supposedly be in full context mode by default; however, I still have to manually toggle 'Using Focused Retrieval' after uploading every file. Otherwise, it will still generate queries.
2. If 'Bypass Embedding and Retrieval' is off, enabling 'Using Focused Retrieval' (the full context mode) should completely bypass RAG for the current file. However, the backend continues to embed the file, leading to a significant increase in upload time and wasted embedding tokens.


@Classic298 commented on GitHub (Oct 1, 2025):

> If 'Bypass Embedding and Retrieval' is off, enabling 'Using Focused Retrieval' (the full context mode)

No! Using Focused Retrieval means using RAG.

Turn the file to full context mode.


@Zijie-wang1125 commented on GitHub (Oct 1, 2025):

Thank you for clarifying my misunderstanding regarding 'Using Focused Retrieval.' However, as far as I know, the only way to enable full context mode in Open WebUI is by toggling 'Bypass Embedding and Retrieval' in the admin panel under Settings > Documents. This enables full context mode globally, not just for specific files. If I'm not mistaken, the problem is that even after toggling 'Bypass Embedding and Retrieval,' RAG queries are still being generated. I can disable query generation in the admin panel as well, but the RAG status will still be displayed in the interface.

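Image: https://github.com/user-attachments/assets/27da535b-141d-4e28-93bd-660773cb0cf3
Image: https://github.com/user-attachments/assets/b36b9b64-5ccc-4052-bd52-4b070cb7c63d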

@rgaricano commented on GitHub (Oct 1, 2025):

Yes, you are right.
That query is generated because Retrieval Query Generation is enabled. You can disable it in Admin Settings > Interface, but it should be "autodisabled" if bypass...


@Zijie-wang1125 commented on GitHub (Oct 1, 2025):

Yes, I know that, and in fact I have disabled it. As you can see from my screenshot, no new query is generated, but the querying status is still displayed. And, just as you said, it should be 'autodisabled' as a matter of good practice.


@Classic298 commented on GitHub (Oct 1, 2025):

> the only way to enable full context mode in Open WebUI is by toggling 'Bypass Embedding and Retrieval' in the admin panel

You can also upload a file, click on the file, and enable full context mode.

> the problem is that even after toggling 'Bypass Embedding and Retrieval,' RAG queries are still being generated

Good catch, that is a separate problem.
The bug fix in 0.6.32 only covered full context mode set inside the chat, not full context mode set in the admin panel.
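
For illustration, here is a minimal sketch of the gating this thread asks for. It is not Open WebUI's actual code: the `RetrievalSettings` class, the function names, and the per-file `"context"` key are all hypothetical stand-ins for the admin toggles and the per-file full context switch discussed above. The idea is that both query generation and the retrieval status indicator depend on one check that honors the global bypass as well as the per-file mode.

```python
from dataclasses import dataclass


@dataclass
class RetrievalSettings:
    # Hypothetical stand-ins for the two admin toggles discussed in this thread.
    bypass_embedding_and_retrieval: bool  # admin panel, Settings > Documents
    retrieval_query_generation: bool      # Retrieval Query Generation toggle


def needs_retrieval(files: list[dict], settings: RetrievalSettings) -> bool:
    """Return True only if at least one attached file would go through RAG."""
    if not files:
        return False
    if settings.bypass_embedding_and_retrieval:
        # Global full context mode: nothing is retrieved for any file.
        return False
    # Per-file full context mode; the "context" key is an assumed field name.
    return any(f.get("context") != "full" for f in files)


def should_generate_queries(files: list[dict], settings: RetrievalSettings) -> bool:
    """Query generation is pointless unless retrieval will actually run."""
    return settings.retrieval_query_generation and needs_retrieval(files, settings)


def should_show_retrieval_status(files: list[dict], settings: RetrievalSettings) -> bool:
    """The status indicator follows retrieval itself, not query generation."""
    return needs_retrieval(files, settings)
```

Under these assumptions, turning on the global bypass switches off both query generation and the status indicator without the admin having to disable Retrieval Query Generation separately.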


@rgaricano commented on GitHub (Oct 1, 2025):

> Yes i know that, and in fact, i have disabled that. As you can see from my screen shot, ...

No, I can't see it in your screenshots. I'm referring to:

Image: https://github.com/user-attachments/assets/e897d131-aafc-4da4-8ecb-40007952f3fa

Reference: github-starred/open-webui#18441