Mirror of https://github.com/open-webui/open-webui.git (synced 2026-05-07 11:28:35 -05:00)
[PR #21977] fix: include file metadata in knowledge base context sent to LLM #42002
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/21977
Author: @kjpoccia
Created: 2/28/2026
Status: 🔄 Open
Base: dev ← Head: feat/filename-fileid-kb-search
📝 Commits (1)
38e438aadd filename and fileid to returned sources from kb search
📊 Changes
1 file changed (+4 additions, -0 deletions)
📝 backend/open_webui/utils/middleware.py (+4 -0)
📄 Description
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.
This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.
Before submitting, make sure you've checked the following:
... dev branch. PRs targeting main will be immediately closed.
... dev to ensure no unrelated commits (e.g. from main) are included. Push updates to the existing PR branch instead of closing and reopening.
Changelog Entry
file_id and file_name when formatted for LLM context, preventing document identity loss and aligning with query_knowledge_files behavior.
Description
This PR includes file_name and file_id in the formatted context, aligning behavior with the native query_knowledge_files tool.
Added
file_id and file_name in knowledge base context formatting.
Changed
Deprecated
Removed
Fixed
Security
Breaking Changes
Additional Information
Screenshots or Videos
In the example below, we have a knowledge base (KB) of meeting minutes. When we ask for details about a certain topic, the search returns chunks from various files, but the model can't tell that the chunks come from different files. To the model, they all come from the same source.


The screenshot below shows the model's performance after the fix:

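To make the effect of the change concrete, here is a minimal sketch of attaching file_id and file_name to each chunk when building the context string. It is not the diff from this PR: the function name format_sources, the "text" and "metadata" keys, and the <source ...> tag layout are all assumptions chosen for illustration.

```python
from html import escape


def format_sources(sources):
    """Render retrieved chunks as <source> tags (hypothetical layout).

    Each item in `sources` is assumed to be a dict with a "text" key and an
    optional "metadata" dict that may carry "file_id" and "file_name".
    """
    parts = []
    for idx, src in enumerate(sources, start=1):
        meta = src.get("metadata") or {}
        attrs = [f'id="{idx}"']
        # Only emit the attributes that are actually present, so chunks
        # without file metadata keep the plain <source id="..."> form.
        for key in ("file_id", "file_name"):
            value = meta.get(key)
            if value:
                attrs.append(f'{key}="{escape(str(value), quote=True)}"')
        parts.append(f"<source {' '.join(attrs)}>{src['text']}</source>")
    return "\n".join(parts)


# Illustrative data only: two chunks, one with file metadata and one without.
chunks = [
    {
        "text": "Budget approved.",
        "metadata": {"file_id": "f1", "file_name": "2026-01 minutes.md"},
    },
    {"text": "Budget deferred.", "metadata": {}},
]
print(format_sources(chunks))
```

With the metadata present, two chunks that would otherwise look identical in origin carry distinct file_id and file_name attributes, which is what lets the model attribute statements to the right document.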
Contributor License Agreement
By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.