mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 02:48:13 -05:00
[PR #21524] [CLOSED] fix: prevent double chunking after markdown header splitting #41751
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/21524
Author: @Baireinhold
Created: 2/17/2026
Status: ❌ Closed
Base: dev ← Head: fix/markdown-double-chunking
📝 Commits (1)
d7dc58d fix: prevent double chunking after markdown header splitting
📊 Changes
1 file changed (+7 additions, -3 deletions)
backend/open_webui/routers/retrieval.py (+7 -3)
📄 Description
Pull Request Checklist
Target branch: dev
Changelog Entry
Description
Prevent double chunking after MarkdownHeaderTextSplitter has run.
Fixed
When ENABLE_MARKDOWN_HEADER_TEXT_SPLITTER is enabled, markdown-split chunks fall through unconditionally into the TEXT_SPLITTER branch, which re-splits them via RecursiveCharacterTextSplitter or TokenTextSplitter and destroys the semantic boundaries established by header splitting. This change adds a markdown_split_done flag that skips the secondary splitter when markdown splitting has already been applied.
Additional Information
Changed file: backend/open_webui/routers/retrieval.py
No behavior change when ENABLE_MARKDOWN_HEADER_TEXT_SPLITTER is disabled (the flag stays False, so the splitters run as before).
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
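The control flow described under "Fixed" can be illustrated with a minimal sketch. This is not the code from retrieval.py: only the markdown_split_done flag name comes from the description above. The functions split_by_markdown_headers, split_by_size and chunk_document, and the chunk_size parameter, are hypothetical stand-ins written in plain Python so the example runs without the real splitter classes.

```python
import re
from typing import List


def split_by_markdown_headers(text: str) -> List[str]:
    """Hypothetical stand-in for header splitting: one chunk per header section."""
    # Split at the start of every line that begins with a markdown header.
    parts = re.split(r"(?m)^(?=#{1,6} )", text)
    return [part.strip() for part in parts if part.strip()]


def split_by_size(chunks: List[str], chunk_size: int) -> List[str]:
    """Hypothetical stand-in for a character/token splitter: fixed-size slices."""
    pieces: List[str] = []
    for chunk in chunks:
        pieces.extend(
            chunk[i : i + chunk_size] for i in range(0, len(chunk), chunk_size)
        )
    return pieces


def chunk_document(
    text: str, enable_markdown_header_splitter: bool, chunk_size: int = 40
) -> List[str]:
    chunks = [text]
    markdown_split_done = False  # flag name taken from the PR description

    if enable_markdown_header_splitter:
        chunks = split_by_markdown_headers(text)
        markdown_split_done = True

    # Without this guard, header-based chunks would be re-split by size,
    # cutting across the section boundaries found above.
    if not markdown_split_done:
        chunks = split_by_size(chunks, chunk_size)

    return chunks
```

With the flag set, each header section stays intact even when it is longer than chunk_size. With header splitting disabled, the flag stays False and the size-based splitter runs as before.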
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.