[PR #5378] [MERGED] fix: RAG and Web Search + RAG enhancements #8471

Closed
opened 2025-11-11 17:57:29 -06:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/5378
Author: @thiswillbeyourgithub
Created: 9/12/2024
Status: Merged
Merged: 9/13/2024
Merged by: @tjbck

Base: devHead: fix_RAG_and_web


📝 Commits (8)

  • 53f03f6 fix: log exception when issues of collection querying
  • ed2a1e7 enh: use non hybrid search as fallback if hybrid search failed
  • 209e246 fix: much improved RAG template
  • adf2678 logs: crash if rag_template would be wrong
  • 9661fee fix: handle case where [query] happens in the RAG context
  • b4ad645 fix: add check that the context for RAG is not empty if the threshold is 0
  • e872f5d log: added a debug log if detecting a potential prompt injection attack
  • 65d5545 added a few type hints

📊 Changes

3 files changed (+73 additions, -30 deletions)

View changed files

📝 backend/open_webui/apps/rag/utils.py (+52 -20)
📝 backend/open_webui/config.py (+16 -10)
📝 backend/open_webui/main.py (+5 -0)

📄 Description

  • fix: log exception when issues of collection querying
  • enh: use non hybrid search as fallback if hybrid search failed
  • fix: much improved RAG template
  • logs: crash if rag_template would be wrong
  • fix: handle case where [query] happens in the RAG context
  • fix: add check that the context for RAG is not empty if the threshold is 0
  • log: added a debug log if detecting a potential prompt injection attack
  • added a few type hints

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.

  • Linked the related discussions below + the fix is quite urgent IMO

Before submitting, make sure you've checked the following:

  • [ X ] Target branch: Please verify that the pull request targets the dev branch.
  • [ X ] Description: Provide a concise description of the changes made in this pull request.
  • [ X ] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: Have you updated relevant documentation Open WebUI Docs, or other documentation sources?
  • [ X ] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • [ X ] Testing: Have you written and run sufficient tests for validating the changes?
  • [ X ] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • [ X ] Prefix: To cleary categorize this pull request, prefix the pull request title, using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

Body

I wrote that first as a discussion message then it got out of hand so I did as best as I could to adhere to your format

Disclaimer

I'm extremely happy with Open-WebUI and very enthusiastic about its future and the future of pipelines. Hence, I was quite disappointed by some aspects of the code. I try to explain all of this below but want to clarify that I don't judge any of the maintainers: I can't even begin to imagine what you must have on your plate with a repo of 40k stars. But I still think I found some practices that will most certainly lose you more time than you could gain by coding quickly. I am totally open for discussion on any of the points I raised, I could totally be wrong. And part of all that might be attributed to misunderstandings about what RAG is supposed to do for the user, in which case understanding it here would help clarify the documentation. I will still continue using Open-WebUI and provide accounts for all my community, but want to voice my concerns. To reiterate: my current mindset is the one of a mostly happy user, concerned about some decisions and practices, wanting to do sort of a soft wake up call while remaining respectful of the well being of the maintainers.

Context

RAG and web search were broken for at least weeks. Using langfuse I found out that the [context] in the RAG template was replaced by an empty string. I also found out that nothing in the code stops that even though I see no reason not to do that.

The issue

  • I've located my issue: it happens only when using hybrid search.
    • I was certain I had tried disabling it, but as I said, the lack of checks / logs makes the debugging process tedious and harder.
  • The issue is in backend/open_webui/apps/rag/utils.py, in query_collection_with_hybrid_search
    • The query happens in a try block, but the exception is ignored.
    • In my case the exception was Torch not compiled with CUDA enabled. I don't know why this happened exactly but I's not the purpose of this PR.
  • As a direct consequence: the context was never filled.
  • The RAG template replacement happens without checking if that's alright.
  • And finally the RAG template does not even tell the LLM to warn the user if the context seems broken or empty.
    • It even specifically tells the LLM to avoid mentioning it's using the context. This means that if you ask an open ended question to the LLM, hoping it would use the document, then it will completely hallucinate instead of being grounded by the document. This also makes it harder to get the LLM to confirm that the context was valid because it avoids mentioning it.
  • if "[query]" is present in the context, the template replacement will be wrong as it will replace it too. I used uuid (standard python lib) to generate a seed in that case.
  • There's still an issue with injection that will happen if <context> or </context> is present inside the context. I don't know exactly how you want to deal with that. I added a warning anyway. This is NOT to take lightly as pipelines will end up exposing more and more APIs and attackers will always have the edge.

Other remarks

  • At least the function "get_rag_context" is not typed (arguments nor outputs)
  • In a general way, at least for the rag utils file : only 3 functions out of 14 have an output type, making it all the more harder for the community to dive on specific issue like I'm doing right now.
    • If you want to add very fast optional type checking I can strongly recommend beartype. In one of my project I've been using it by decorating pretty much all my functions with a beartype decorator that either crash or warns in case of inconsistencies (depending on an env variable). It's been a huge headache saver and I think you should try it. Here's my code as an example. Without the env parameter, the decorator directly returns the function so there is no overhead. If you set the env parameter then the overhead is minimal, thanks to beartype.
  • As I modified the RAG template, it has to be modified in the docs repository too.
  • When debugging, I simply added asserts and expected it to trigger red banners, but was surprised to see that it didn't. I strongly suggest adding an env variable or a UI parameter to allow at least admin accounts to see debugging banners instead of only showing fatal errors. I would not be surprised if there were more bugs that users didn't even know about.

Commits in this PR

  • Properly log exceptions occurring during querying (both hybrid and not)
  • If all collections failed to be queried in hybrid search: fallback to using non hybrid search
  • Improved the RAG template, especially important is the mention that if the information is not coming from the context, the LLM should say so.
  • Added sanity checks for the RAG template replacement.
  • Fixes cases where "[query]" is present in the context
  • If the relevancy factor is 0, the context should never be empty. I added a check for this.
  • Added a prompt injection warning of "" and "" are present in the context. This is NOT to take lightly as pipelines will end up exposing more and more APIs and attackers will always have the edge.
  • Added a few type hints.

What this PR fixes

It probably fixes too:

  • #5002 and #5022 by @qiulang
  • #5291 by @tkg61
  • #3363 by @hugobox with @changchiyou and @Rom0067 and @anonymousmaharaj and @devingfx

It might also fix

  • #3518 by @Robinsane
  • #4775 by @foxworld306 and #4779 with @GridexX
  • #3868 with @gaetanquentin and @BennisonDevadoss
  • #176 of @bhaidar and @dmvieira

Closing remarks

I'll be blunt but honest: I think that the relative lack of asserts, the fact that exceptions are not shown to the admin, the relative lack of type hints, the apparent lack of test coverage, will progressively increase the burden of developing Open-WebUI as well as make it harder for the community to provide helpful issues, as well as step in and help. I think RAG might have been silently broken for weeks, but maybe many other things are broken without us knowing about it.

Open-WebUI and Pipelines are ambitious, and you need all the help modern developpement practices can bring you!

I now have reasons to doubt that the rest of the codebase is properly checking things, or that anything that breaks is logged.

I think it would be awesome if you added my suggestion regarding showing errors to the admin, then progressively added more asserts as you naturally refactor the code here and there. This might be the most efficient way to go about this as I understand that increasing the test coverage for example requires a significant effort. Also beartype typechecking during the developpement phase or as an opt in for users might be very helpful. (But do use their latest release candidate instead of the latest release.)

Then again, thank you again for making Open-WebUI, it's a great improvement for me and my community. Have a nice day.

PS: While I have your attention, could you share your thoughts about #5323 ? I explained in there that the default RAG model was trained on tweets and silently clips after a few tokens but Open-WebUI's default is still to use that with a 1500 chunk size.


Changelog Entry

Description

RAG and web search were broken for me and I think many others. The fix was to add a fallback to the non hybrid search. While I was at it I saw numerous things I decided to patch, all still related to this issue.

Added

  • If all collections failed to be queried in hybrid search: fallback to using non hybrid search
  • Added sanity checks for the RAG template replacement.
  • Added a few type hints.

Fixed

  • Properly log exceptions occurring during querying (both hybrid and not)
  • Improved the RAG template, especially important is the mention that if the information is not coming from the context, the LLM should say so.
  • Fixes cases where "[query]" is present in the context
  • If the relevancy factor is 0, the context should never be empty. I added a check for this.

Security

  • Added a prompt injection warning of "" and "" are present in the context. This is NOT to take lightly as pipelines will end up exposing more and more APIs and attackers will always have the edge.

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/open-webui/open-webui/pull/5378 **Author:** [@thiswillbeyourgithub](https://github.com/thiswillbeyourgithub) **Created:** 9/12/2024 **Status:** ✅ Merged **Merged:** 9/13/2024 **Merged by:** [@tjbck](https://github.com/tjbck) **Base:** `dev` ← **Head:** `fix_RAG_and_web` --- ### 📝 Commits (8) - [`53f03f6`](https://github.com/open-webui/open-webui/commit/53f03f65561f74927382b115d5eee5ed826015e7) fix: log exception when issues of collection querying - [`ed2a1e7`](https://github.com/open-webui/open-webui/commit/ed2a1e7db960fa6aa4c1c298272164c6f48f5bc0) enh: use non hybrid search as fallback if hybrid search failed - [`209e246`](https://github.com/open-webui/open-webui/commit/209e246e6f5fce408421eb30001c1886885aee5a) fix: much improved RAG template - [`adf2678`](https://github.com/open-webui/open-webui/commit/adf26789b8c1c9c256f22aef98f64449d8f684a5) logs: crash if rag_template would be wrong - [`9661fee`](https://github.com/open-webui/open-webui/commit/9661fee55428b81c3f65bc2863efbcd70c7e403a) fix: handle case where [query] happens in the RAG context - [`b4ad645`](https://github.com/open-webui/open-webui/commit/b4ad64586aceeae45dd5cc4178f43b4b6c00fb77) fix: add check that the context for RAG is not empty if the threshold is 0 - [`e872f5d`](https://github.com/open-webui/open-webui/commit/e872f5dc78e5585b205e378bab87b26de5ff59e2) log: added a debug log if detecting a potential prompt injection attack - [`65d5545`](https://github.com/open-webui/open-webui/commit/65d5545cf0cd85c1b4a5e55266289b42879c4673) added a few type hints ### 📊 Changes **3 files changed** (+73 additions, -30 deletions) <details> <summary>View changed files</summary> 📝 `backend/open_webui/apps/rag/utils.py` (+52 -20) 📝 `backend/open_webui/config.py` (+16 -10) 📝 `backend/open_webui/main.py` (+5 -0) </details> ### 📄 Description - **fix: log exception when issues of collection querying** - **enh: use non hybrid search as fallback if hybrid search failed** - **fix: much improved RAG template** - **logs: crash if rag_template would be wrong** - **fix: handle case where [query] happens in the RAG context** - **fix: add check that the context for RAG is not empty if the threshold is 0** - **log: added a debug log if detecting a potential prompt injection attack** - **added a few type hints** # Pull Request Checklist ### Note to first-time contributors: Please open a discussion post in [Discussions](https://github.com/open-webui/open-webui/discussions) and describe your changes before submitting a pull request. - Linked the related discussions below + the fix is quite urgent IMO **Before submitting, make sure you've checked the following:** - [ X ] **Target branch:** Please verify that the pull request targets the `dev` branch. - [ X ] **Description:** Provide a concise description of the changes made in this pull request. - [ X ] **Changelog:** Ensure a changelog entry following the format of [Keep a Changelog](https://keepachangelog.com/) is added at the bottom of the PR description. - [ ] **Documentation:** Have you updated relevant documentation [Open WebUI Docs](https://github.com/open-webui/docs), or other documentation sources? - [ X ] **Dependencies:** Are there any new dependencies? Have you updated the dependency versions in the documentation? - [ X ] **Testing:** Have you written and run sufficient tests for validating the changes? - [ X ] **Code review:** Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards? - [ X ] **Prefix:** To cleary categorize this pull request, prefix the pull request title, using one of the following: - **BREAKING CHANGE**: Significant changes that may affect compatibility - **build**: Changes that affect the build system or external dependencies - **ci**: Changes to our continuous integration processes or workflows - **chore**: Refactor, cleanup, or other non-functional code changes - **docs**: Documentation update or addition - **feat**: Introduces a new feature or enhancement to the codebase - **fix**: Bug fix or error correction - **i18n**: Internationalization or localization changes - **perf**: Performance improvement - **refactor**: Code restructuring for better maintainability, readability, or scalability - **style**: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.) - **test**: Adding missing tests or correcting existing tests - **WIP**: Work in progress, a temporary label for incomplete or ongoing work --- # Body *I wrote that first as a discussion message then it got out of hand so I did as best as I could to adhere to your format* ## Disclaimer I'm extremely happy with Open-WebUI and very enthusiastic about its future and the future of pipelines. Hence, I was quite disappointed by some aspects of the code. I try to explain all of this below but want to clarify that I don't judge any of the maintainers: I can't even begin to imagine what you must have on your plate with a repo of 40k stars. But I still think I found some practices that will most certainly lose you more time than you could gain by coding quickly. I am totally open for discussion on any of the points I raised, I could totally be wrong. And part of all that might be attributed to misunderstandings about what RAG is supposed to do for the user, in which case understanding it here would help clarify the documentation. I will still continue using Open-WebUI and provide accounts for all my community, but want to voice my concerns. To reiterate: my current mindset is the one of a mostly happy user, concerned about some decisions and practices, wanting to do sort of a soft wake up call while remaining respectful of the well being of the maintainers. ## Context RAG and web search were broken for at least weeks. Using langfuse I found out that the `[context]` in the RAG template was replaced by an empty string. I also found out that nothing in the code stops that even though I see no reason not to do that. ## The issue - I've located my issue: it happens only when using hybrid search. - I was certain I had tried disabling it, but as I said, the lack of checks / logs makes the debugging process tedious and harder. - The issue is in `backend/open_webui/apps/rag/utils.py`, in `query_collection_with_hybrid_search` - The query happens in a try block, but the exception is ignored. - In my case the exception was `Torch not compiled with CUDA enabled`. I don't know why this happened exactly but I's not the purpose of this PR. - As a direct consequence: the context was never filled. - The RAG template replacement happens without checking if that's alright. - And finally the RAG template does not even tell the LLM to warn the user if the context seems broken or empty. - It even specifically tells the LLM to avoid mentioning it's using the context. This means that if you ask an open ended question to the LLM, hoping it would use the document, then it will completely hallucinate instead of being grounded by the document. This also makes it harder to get the LLM to confirm that the context was valid because it avoids mentioning it. ## Additional related issues in this PR - if "[query]" is present in the context, the template replacement will be wrong as it will replace it too. I used uuid (standard python lib) to generate a seed in that case. - There's still an issue with injection that will happen if `<context>` or `</context>` is present inside the context. I don't know exactly how you want to deal with that. I added a warning anyway. This is NOT to take lightly as pipelines will end up exposing more and more APIs and attackers will always have the edge. ## Other remarks - At least the function "get_rag_context" is not typed (arguments nor outputs) - In a general way, at least for the rag utils file : only 3 functions out of 14 have an output type, making it all the more harder for the community to dive on specific issue like I'm doing right now. - If you want to add very fast optional type checking I can strongly recommend [beartype](https://github.com/beartype/beartype). In one of my project I've been using it by decorating pretty much all my functions with a beartype decorator that either crash or warns in case of inconsistencies (depending on an env variable). It's been a huge headache saver and I think you should try it. [Here's my code as an example](https://github.com/thiswillbeyourgithub/WDoc/blob/main/WDoc/utils/typechecker.py). Without the env parameter, the decorator directly returns the function so there is **no** overhead. If you set the env parameter then the overhead is minimal, thanks to beartype. - As I modified the RAG template, it has to be modified in the docs repository too. - When debugging, I simply added asserts and expected it to trigger red banners, but was surprised to see that it didn't. I strongly suggest adding an env variable or a UI parameter to allow at least admin accounts to see debugging banners instead of only showing fatal errors. I would not be surprised if there were more bugs that users didn't even know about. ## Commits in this PR - Properly log exceptions occurring during querying (both hybrid and not) - If all collections failed to be queried in hybrid search: fallback to using non hybrid search - Improved the RAG template, especially important is the mention that if the information is not coming from the context, the LLM should say so. - Added sanity checks for the RAG template replacement. - Fixes cases where "[query]" is present in the context - If the relevancy factor is 0, the context should never be empty. I added a check for this. - Added a prompt injection warning of "<context>" and "</context>" are present in the context. This is NOT to take lightly as pipelines will end up exposing more and more APIs and attackers will always have the edge. - Added a few type hints. ### What this PR fixes - My issue at #5034 - The ensuing discussion at #5050 ### It probably fixes too: - #5002 and #5022 by @qiulang - #5291 by @tkg61 - #3363 by @hugobox with @changchiyou and @Rom0067 and @anonymousmaharaj and @devingfx ### It might also fix - #3518 by @Robinsane - #4775 by @foxworld306 and #4779 with @GridexX - #3868 with @gaetanquentin and @BennisonDevadoss - #176 of @bhaidar and @dmvieira ## Closing remarks I'll be blunt but honest: I think that the relative lack of asserts, the fact that exceptions are not shown to the admin, the relative lack of type hints, the apparent lack of test coverage, will progressively increase the burden of developing Open-WebUI as well as make it harder for the community to provide helpful issues, as well as step in and help. I think RAG might have been silently broken for weeks, but maybe many other things are broken without us knowing about it. Open-WebUI and Pipelines are ambitious, and you need all the help modern developpement practices can bring you! I now have reasons to doubt that the rest of the codebase is properly checking things, or that anything that breaks is logged. I think it would be awesome if you added my suggestion regarding showing errors to the admin, then progressively added more asserts as you naturally refactor the code here and there. This might be the most efficient way to go about this as I understand that increasing the test coverage for example requires a significant effort. Also beartype typechecking during the developpement phase or as an opt in for users might be very helpful. (But do use their latest release candidate instead of the latest release.) Then again, thank you again for making Open-WebUI, it's a great improvement for me and my community. Have a nice day. PS: While I have your attention, could you share your thoughts about #5323 ? I explained in there that the default RAG model was trained on tweets and silently clips after a few tokens but Open-WebUI's default is still to use that with a 1500 chunk size. --- # Changelog Entry ### Description RAG and web search were broken for me and I think many others. The fix was to add a fallback to the non hybrid search. While I was at it I saw numerous things I decided to patch, all still related to this issue. ### Added - If all collections failed to be queried in hybrid search: fallback to using non hybrid search - Added sanity checks for the RAG template replacement. - Added a few type hints. ### Fixed - Properly log exceptions occurring during querying (both hybrid and not) - Improved the RAG template, especially important is the mention that if the information is not coming from the context, the LLM should say so. - Fixes cases where "[query]" is present in the context - If the relevancy factor is 0, the context should never be empty. I added a check for this. ### Security - Added a prompt injection warning of "<context>" and "</context>" are present in the context. This is NOT to take lightly as pipelines will end up exposing more and more APIs and attackers will always have the edge. --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
GiteaMirror added the pull-request label 2025-11-11 17:57:29 -06:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#8471