mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-08 12:58:11 -05:00
[PR #18525] [CLOSED] fix: enable query chunking to prevent embedding errors on long inputs #40451
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/18525
Author: @acwoo97
Created: 10/23/2025
Status: ❌ Closed
Base: dev ← Head: fix/split-query
📝 Commits (1)
ed0c66a fix: enable query chunking to prevent embedding errors on long inputs
📊 Changes
2 files changed (+87 additions, -68 deletions)
📝 backend/open_webui/routers/retrieval.py (+85 -68)
📝 backend/open_webui/utils/middleware.py (+2 -0)
📄 Description
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions and describe your changes before submitting a pull request.
Before submitting, make sure you've checked the following:
dev branch. Not targeting the dev branch may lead to immediate closure of the PR.
Changelog Entry
Description
Added
split_text_content() function: a new helper method to apply configurable chunking logic (character, token, markdown_header) to both documents and raw text queries.
Changed
save_docs_to_vector_db(): split_text_content().
split_text_content(), ensuring long user queries are split into valid chunks before embedding.
Deprecated
Removed
save_docs_to_vector_db().
Fixed
Security
Breaking Changes
Additional Information
request.app.state.config for TEXT_SPLITTER, CHUNK_SIZE, and CHUNK_OVERLAP.
Screenshots or Videos
(N/A – backend logic change only)
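For readers unfamiliar with the character splitting mode mentioned in the changelog, the idea can be sketched roughly as follows. This is only an illustration, not the code from this PR: the function name split_query_by_chars and its defaults are hypothetical, and the real helper reads TEXT_SPLITTER, CHUNK_SIZE, and CHUNK_OVERLAP from request.app.state.config and also supports the token and markdown_header modes, which are not shown here.

```python
from typing import List


def split_query_by_chars(
    text: str, chunk_size: int = 1000, chunk_overlap: int = 100
) -> List[str]:
    """Hypothetical sketch of fixed-size character chunking with overlap.

    Not the PR's split_text_content(); names and defaults are assumptions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []
    if len(text) <= chunk_size:
        # Short queries pass through unchanged as a single chunk.
        return [text]

    # Each chunk starts chunk_size - chunk_overlap characters after the last.
    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            # The current chunk already reaches the end of the text.
            break
    return chunks


if __name__ == "__main__":
    # A 10-character query with 4-character chunks and 1 character of overlap.
    print(split_query_by_chars("abcdefghij", chunk_size=4, chunk_overlap=1))
```

Each returned chunk stays within chunk_size characters, so a long query can be embedded piece by piece instead of exceeding the embedding model's input limit in a single call.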
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.