[PR #22678] [CLOSED] fix: optimize file handling for full context mode to reduce chat latency #65666

Closed
opened 2026-05-06 11:34:12 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/22678
Author: @a86582751
Created: 3/14/2026
Status: Closed

Base: dev ← Head: fix-full-context-bypass-rag


📝 Commits (10+)

📊 Changes

1 file changed (+100 additions, -33 deletions)

📝 backend/open_webui/utils/middleware.py (+100 -33)

📄 Description

Pull Request Checklist

Before submitting, make sure you've checked the following:

  • [x] Target branch: Verify that the pull request targets the dev branch. PRs targeting main will be immediately closed.
  • [x] Description: Provide a concise description of the changes made in this pull request down below.
  • [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • [ ] Documentation: Add docs in Open WebUI Docs Repository. (Not needed for this bug fix)
  • [x] Dependencies: No new dependencies. Uses existing Files model.
  • [x] Testing: Performed manual tests to verify the implemented fix works as intended AND does not break any other functionality. Tested with DOCX, PDF, and TXT files.
  • [x] Agentic AI Code: This PR was written with AI assistance but has gone through human review and manual testing.
  • [x] Code review: Performed self-review of code.
  • [x] Title Prefix: Using fix: prefix for bug fix.

Changelog Entry

Description

Fixes significant delays when sending messages with large files attached in 'Full Context' mode. Even with 'Full Context' selected for uploaded files, the system still generated RAG retrieval queries, serialized large file content into the frontend's JSON payload, and emitted unnecessary status updates, which made chatting with documents noticeably slow.

Fixed

  • Fixed 'Full Context' mode still triggering RAG processing and causing delays
  • Fixed frontend JSON serialization of large file content causing significant delays
  • Fixed unnecessary status emissions ('retrieving', 'sources_retrieved') in full context mode

Changed

  • Modified chat_completion_files_handler in backend/open_webui/utils/middleware.py
  • When all files are in 'Full Context' mode (or global bypass is enabled):
    • Skip RAG query generation entirely
    • Skip status emission for retrieval
    • Load file content from database on backend
    • Directly inject content into user message
    • Frontend only passes file IDs (avoiding large JSON serialization)
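
The condition described above can be sketched as a small predicate. This is a minimal illustration, not the code from the PR: the `context == "full"` marker on each file item and the function name `should_bypass_rag` are assumptions made for the example.

```python
from typing import Any


def should_bypass_rag(files: list[dict[str, Any]], bypass_embedding: bool = False) -> bool:
    """Return True when retrieval can be skipped for this request.

    Hypothetical helper: assumes each file item is a dict that carries
    a "context" key set to "full" when the user chose 'Full Context'.
    """
    if not files:
        # Nothing attached, so there is nothing to inject or retrieve.
        return False
    if bypass_embedding:
        # Global bypass: every attached file is treated as full context.
        return True
    # Otherwise every single file must be marked as full context.
    return all(item.get("context") == "full" for item in files)
```

A request that mixes full-context files with regular ones returns False here, so it would still go through the normal retrieval path.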

Additional Information

Root Cause:
The original code checked all_full_context but still called get_sources_from_items with full_context=True, which triggered the entire RAG pipeline including query generation and status emissions. Additionally, the frontend was serializing large file content in JSON, causing significant delays.
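
To make the serialization cost concrete, the sketch below compares the size of a JSON payload that embeds file content with one that carries only file metadata. The payload shape (`files`, `id`, `name`, `content`) is assumed for illustration and is not taken from the Open WebUI request format.

```python
import json


def payload_size(files: list[dict]) -> int:
    """Size in bytes of a JSON request body that carries the given file items."""
    return len(json.dumps({"files": files}).encode("utf-8"))


# Roughly 1.2 MB of extracted text standing in for a large document.
large_text = "lorem ipsum " * 100_000

# Hypothetical payload shapes: one embeds the content, one sends only the ID.
with_content = [{"id": "file-1", "name": "report.docx", "content": large_text}]
ids_only = [{"id": "file-1", "name": "report.docx"}]

print("with content:", payload_size(with_content), "bytes")
print("ids only:", payload_size(ids_only), "bytes")
```

The ID-only payload stays at a few dozen bytes regardless of document size, which is why loading the content on the backend avoids the cost.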

Solution:
When all_full_context is True OR bypass_embedding is True:

  1. Skip query generation
  2. Skip status emissions
  3. Load file content directly from database (backend)
  4. Inject content into user message
  5. Return early to avoid RAG processing
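
The five steps above can be sketched as follows. This is a simplified stand-in under stated assumptions, not the PR's implementation: `handle_files`, `inject_full_context`, `load_content`, and `run_rag` are hypothetical names, the `<file>` wrapper is an arbitrary choice, and message content is assumed to be a plain string.

```python
from typing import Any, Callable, Optional

Message = dict[str, Any]


def inject_full_context(
    messages: list[Message],
    file_ids: list[str],
    load_content: Callable[[str], Optional[str]],
) -> list[Message]:
    """Return a copy of messages with file contents prepended to the last user message."""
    blocks = []
    for file_id in file_ids:
        content = load_content(file_id)  # stand-in for a backend database lookup
        if content:
            blocks.append(f'<file id="{file_id}">\n{content}\n</file>')
    if not blocks:
        return messages

    updated = [dict(m) for m in messages]  # shallow copies; the input is not mutated
    for message in reversed(updated):
        if message.get("role") == "user":
            message["content"] = "\n\n".join(blocks) + "\n\n" + message.get("content", "")
            break
    return updated


def handle_files(
    messages: list[Message],
    files: list[dict[str, Any]],
    bypass_embedding: bool,
    load_content: Callable[[str], Optional[str]],
    run_rag: Callable[[list[Message], list[dict[str, Any]]], tuple[list[Message], list[Any]]],
) -> tuple[list[Message], list[Any]]:
    """Return (messages, sources); take the fast path when retrieval is not needed."""
    all_full_context = bool(files) and all(f.get("context") == "full" for f in files)
    if files and (all_full_context or bypass_embedding):
        # Fast path: no query generation, no status events, no retrieval.
        file_ids = [f["id"] for f in files]
        return inject_full_context(messages, file_ids, load_content), []
    # Normal path: hand off to the regular RAG pipeline.
    return run_rag(messages, files)
```

Passing `load_content` and `run_rag` in as callables keeps the sketch self-contained; in practice the lookup would go through whatever file model the backend already uses.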

Testing Results:

  • [x] Tested with DOCX files - works correctly
  • [x] Tested with PDF files - works correctly
  • [x] Tested with TXT/MD files - works correctly
  • [x] Verified normal RAG mode still works
  • [x] Verified 'Full Context' toggle functions correctly
  • [x] Delay reduced from ~36 seconds to milliseconds

Files Changed:

  • backend/open_webui/utils/middleware.py (+39 lines, -19 lines)

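Contributor License Agreement

  • [x] By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA) (https://github.com/open-webui/open-webui/blob/main/CONTRIBUTOR_LICENSE_AGREEMENT), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.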


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Commits (10+), as listed on the original PR:

  • fe6783c (https://github.com/open-webui/open-webui/commit/fe6783c16699911c7be17392596d579333fb110c) Merge pull request #19030 from open-webui/dev
  • fc05e0a (https://github.com/open-webui/open-webui/commit/fc05e0a6c5d39da60b603b4d520f800d6e36f748) Merge pull request #19405 from open-webui/dev
  • e3faec6 (https://github.com/open-webui/open-webui/commit/e3faec62c58e3a83d89aa3df539feacefa125e0c) Merge pull request #19416 from open-webui/dev
  • 9899293 (https://github.com/open-webui/open-webui/commit/9899293f050ad50ae12024cbebee7e018acd851e) Merge pull request #19448 from open-webui/dev
  • 140605e (https://github.com/open-webui/open-webui/commit/140605e660b8186a7d5c79fb3be6ffb147a2f498) Merge pull request #19462 from open-webui/dev
  • 6f1486f (https://github.com/open-webui/open-webui/commit/6f1486ffd0cb288d0e21f41845361924e0d742b3) Merge pull request #19466 from open-webui/dev
  • d95f533 (https://github.com/open-webui/open-webui/commit/d95f533214e3fe5beb5e41ec1f349940bc4c7043) Merge pull request #19729 from open-webui/dev
  • a727153 (https://github.com/open-webui/open-webui/commit/a7271532f8a38da46785afcaa7e65f9a45e7d753) 0.6.43 (#20093)
  • 6adde20 (https://github.com/open-webui/open-webui/commit/6adde203cd292a9e3af9c64a2ae36b603fed096a) Merge pull request #20394 from open-webui/dev
  • f9b0534 (https://github.com/open-webui/open-webui/commit/f9b0534e0c442631d1cb7205169588b9b6204179) Merge pull request #20522 from open-webui/dev

GiteaMirror added the pull-request label 2026-05-06 11:34:12 -05:00
Reference: github-starred/open-webui#65666