mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 10:58:17 -05:00
[PR #21523] [CLOSED] fix: preserve header metadata in markdown splitter #26120
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/21523
Author: @Baireinhold
Created: 2/17/2026
Status: ❌ Closed
Base: dev ← Head: fix/markdown-header-metadata
📝 Commits (1)
79b1f26 fix: preserve header metadata in markdown splitter
📊 Changes
1 file changed (+1 additions, -1 deletions)
📝 backend/open_webui/routers/retrieval.py (+1 -1)
📄 Description
Pull Request Checklist
Target branch: dev
Changelog Entry
Description
MarkdownHeaderTextSplitter
Fixed
MarkdownHeaderTextSplitter.split_text() returns chunks whose metadata contains the header hierarchy (e.g. {"Header 1": "Chapter 1", "Header 2": "1.1 Background"}), but only the parent document's metadata was preserved via metadata={**doc.metadata}, discarding split_chunk.metadata. Changed to metadata={**doc.metadata, **split_chunk.metadata} to merge both.
Additional Information
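The effect of the one-line change can be illustrated with a minimal sketch in plain Python. This is not the code from retrieval.py: the Doc class is a hypothetical stand-in for the document objects involved, and the "source": "guide.md" entry is an assumed example of parent metadata. Only the two metadata expressions are taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a document object with text and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


# Parent document before header-based splitting (metadata is an assumed example).
doc = Doc("# Chapter 1\n## 1.1 Background\nSome text.", {"source": "guide.md"})

# One chunk shaped like the splitter output described above:
# the header hierarchy is carried in the chunk's own metadata.
split_chunk = Doc(
    "Some text.",
    {"Header 1": "Chapter 1", "Header 2": "1.1 Background"},
)

# Before the fix: only the parent's metadata is copied, the headers are lost.
before = Doc(split_chunk.page_content, {**doc.metadata})

# After the fix: both dicts are merged; on a key clash the chunk's value wins
# because it is unpacked last.
after = Doc(split_chunk.page_content, {**doc.metadata, **split_chunk.metadata})

print(before.metadata)
print(after.metadata)
```

Because the chunk metadata is unpacked last, a chunk key with the same name as a parent key would override the parent's value; with header keys such as "Header 1" a clash seems unlikely, but that is an assumption about the parent metadata.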
backend/open_webui/routers/retrieval.py
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.