[PR #18525] [CLOSED] fix: enable query chunking to prevent embedding errors on long inputs #40451

opened 2026-04-25 12:55:58 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/18525
Author: @acwoo97
Created: 10/23/2025
Status: Closed

Base: dev ← Head: fix/split-query


📝 Commits (1)

  • ed0c66a fix: enable query chunking to prevent embedding errors on long inputs

📊 Changes

2 files changed (+87 additions, -68 deletions)

📝 backend/open_webui/routers/retrieval.py (+85 -68)
📝 backend/open_webui/utils/middleware.py (+2 -0)

📄 Description

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.

Before submitting, make sure you've checked the following:

  • [x] Target branch: Verify that the pull request targets the dev branch. Not targeting the dev branch may lead to immediate closure of the PR.
  • [x] Description: Provide a concise description of the changes made in this pull request.
  • [x] Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • [ ] Documentation: If necessary, update relevant documentation (Open WebUI Docs) like environment variables, the tutorials, or other documentation sources.
  • [ ] Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • [x] Testing: Perform manual tests to verify the implemented fix/feature works as intended AND does not break any other functionality. Take this as an opportunity to make screenshots of the feature/fix and include it in the PR description.
  • [x] Agentic AI Code: Confirm this Pull Request is not written by any AI Agent or has at least gone through additional human review and manual testing. If any AI Agent is the co-author of this PR, it may lead to immediate closure of the PR.
  • [x] Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • [x] Title Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • feat: Introduces a new feature or enhancement to the codebase

Changelog Entry

Description

  • Implemented unified text-splitting logic so that both uploaded documents and user chat queries are chunked before being embedded.
  • This prevents embedding model errors caused by long queries that exceed the model's maximum token limit.
  • Improves stability and consistency across both document ingestion and query-time embedding processes.

Added

  • split_text_content(): a new helper function that applies configurable chunking logic (character, token, markdown_header) to both documents and raw text queries.
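
The helper described above could look roughly like the sketch below. This is not the code from the PR: the signature, the `_window` helper, and the internals are assumptions made for illustration. Only the function name, the three mode names, and the chunk size and overlap settings come from the description. In particular, the `token` mode here counts whitespace-separated words as a stand-in for real tokenizer tokens.

```python
import re
from typing import List


def _window(units: List[str], size: int, overlap: int, joiner: str) -> List[str]:
    """Hypothetical helper: slide a window of `size` units, stepping by `size - overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(units), step):
        chunks.append(joiner.join(units[start:start + size]))
        if start + size >= len(units):
            break  # the last window already reached the end of the input
    return chunks


def split_text_content(text: str, splitter: str = "character",
                       chunk_size: int = 1000, chunk_overlap: int = 100) -> List[str]:
    """Assumed signature: split `text` into chunks using the chosen mode."""
    if not text:
        return []
    if splitter == "character":
        return _window(list(text), chunk_size, chunk_overlap, "")
    if splitter == "token":
        # Assumption: whitespace-separated words approximate tokens.
        return _window(text.split(), chunk_size, chunk_overlap, " ")
    if splitter == "markdown_header":
        # Cut before every markdown header line, then size-limit each section.
        sections = [s for s in re.split(r"(?m)^(?=#{1,6} )", text) if s.strip()]
        chunks: List[str] = []
        for section in sections:
            chunks.extend(_window(list(section), chunk_size, chunk_overlap, ""))
        return chunks
    raise ValueError(f"unknown splitter: {splitter!r}")
```

Because the function takes plain text, the same call can serve both the document ingestion path and the query path, which is the point of the refactor described in this changelog.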

Changed

  • Refactored save_docs_to_vector_db():
    • Replaced inline document splitting logic with a call to split_text_content().
    • Simplified code structure and reduced duplication.
  • Updated query processing to call split_text_content(), so that long user queries are split into valid chunks before embedding.

Deprecated

  • N/A

Removed

  • Removed redundant text splitting code previously duplicated inside save_docs_to_vector_db().

Fixed

  • Fixed runtime embedding errors that occurred when user queries exceeded the embedding model’s maximum token length during chat completions.

Security

  • N/A

Breaking Changes

  • None. Existing functionality remains compatible with previous versions.

Additional Information

  • Motivation:
    • Previously, only uploaded documents were chunked during ingestion, while chat queries were sent as a single string for embedding.
    • Long queries could exceed the embedding model's input limit, causing embedding failures.
    • Now, both data paths (document + query) share a unified splitting mechanism.
  • Related Configuration:
    • Uses request.app.state.config for TEXT_SPLITTER, CHUNK_SIZE, and CHUNK_OVERLAP.
  • Manual Testing Performed:
    • Verified document upload still performs proper chunking before vector storage.
    • Tested long chat queries (>4000 tokens) to confirm chunking occurs and embedding completes successfully.
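
The failure mode and the fix in the motivation above can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `fake_embed` stands in for a real embedding model and measures its limit in characters for simplicity, `chunk` is a simple character splitter, and a `SimpleNamespace` stands in for `request.app.state.config`. The description does not say how the per-chunk embeddings are used afterwards, so the sketch just returns one vector per chunk.

```python
from types import SimpleNamespace
from typing import List

MAX_INPUT_CHARS = 50  # hypothetical model limit (characters, not tokens)


def fake_embed(text: str) -> List[float]:
    """Stand-in embedding function that rejects overly long input."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError("input exceeds the model's maximum length")
    return [float(len(text))]


def chunk(text: str, size: int, overlap: int) -> List[str]:
    """Character chunks of at most `size`, consecutive chunks sharing `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed_query(query: str, config: SimpleNamespace) -> List[List[float]]:
    """Split the query with the configured sizes, then embed each chunk."""
    chunks = chunk(query, config.CHUNK_SIZE, config.CHUNK_OVERLAP)
    return [fake_embed(c) for c in chunks]


# Stand-in for request.app.state.config; the values are illustrative only.
config = SimpleNamespace(TEXT_SPLITTER="character", CHUNK_SIZE=40, CHUNK_OVERLAP=5)
long_query = "x" * 120

try:
    fake_embed(long_query)  # unchunked query: rejected by the model
    unchunked_ok = True
except ValueError:
    unchunked_ok = False

vectors = embed_query(long_query, config)  # chunked query: every piece fits
```

The chunk size has to be at or below the model's limit for this to help, so in practice `CHUNK_SIZE` would need to be chosen with the embedding model's maximum input length in mind.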

Screenshots or Videos

(N/A – backend logic change only)

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-25 12:55:58 -05:00

Reference: github-starred/open-webui#40451