mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-22 14:13:08 -05:00
Configurable options for different file formats for RAG #2738
Originally created by @dsjath on GitHub (Nov 21, 2024).
Feature Request
Is your feature request related to a problem? Please describe.
I am frustrated that I cannot distinguish between file types that should be vectorized automatically and file types that should not.
I want to be able to upload large csv/excel files without having them embedded right away; text formats such as txt/word/pdf, on the other hand, should still be embedded right away.
I am using pipelines to customize the experience, so that I can, for example, parse csv/excel files with my own code and embed only certain parts of the file inside the pipeline, while txt files are embedded as usual.
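To illustrate the kind of selective parsing described above, here is a minimal sketch using only the Python standard library. It is not the Open WebUI pipelines API; the function name `rows_to_chunks` and the column names are hypothetical, and the embedding step itself is left out.

```python
import csv
import io
from typing import Iterable, List


def rows_to_chunks(csv_text: str, columns: Iterable[str]) -> List[str]:
    """Turn selected columns of a CSV into short text chunks for embedding.

    Hypothetical helper: only the columns named in `columns` are kept,
    and empty cells are skipped.
    """
    wanted = list(columns)
    reader = csv.DictReader(io.StringIO(csv_text))
    chunks: List[str] = []
    for row in reader:
        # Keep "name: value" pairs for the requested, non-empty cells only.
        parts = [f"{name}: {row[name]}" for name in wanted if row.get(name)]
        if parts:
            chunks.append("; ".join(parts))
    return chunks


if __name__ == "__main__":
    sample = "id,title,notes\n1,Alpha,first\n2,Beta,\n"
    for chunk in rows_to_chunks(sample, ["title", "notes"]):
        print(chunk)
```

Only the resulting chunks would be passed on for embedding, so a large spreadsheet never has to be vectorized as a whole.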
Describe the solution you'd like
A YAML config file that allows customizing how uploads of different file types are handled.
E.g.
disallow upload
allow upload but block rag
allow upload and use rag
file size limits for different file types
allow different max number of files for different formats (which needs to be less than the overall maximum)
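The options listed above could be modelled roughly as follows. This is only a sketch of the requested behaviour, not existing Open WebUI code: every name here (`Action`, `FileTypePolicy`, `POLICIES`, `OVERALL_MAX_FILES`, `decide`) and every limit value is an assumption made for illustration, and the policy table is written as a Python dict where the proposal would load it from YAML.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Dict, Optional


class Action(Enum):
    DISALLOW = "disallow"              # reject the upload
    UPLOAD_ONLY = "upload_only"        # store the file but skip RAG embedding
    UPLOAD_AND_RAG = "upload_and_rag"  # store the file and embed it right away


@dataclass(frozen=True)
class FileTypePolicy:
    action: Action
    max_size_mb: Optional[float] = None  # None: no per-type size limit
    max_files: Optional[int] = None      # None: only the overall limit applies


# Hypothetical values; the proposal would read these from a YAML file.
OVERALL_MAX_FILES = 10
DEFAULT_POLICY = FileTypePolicy(Action.UPLOAD_AND_RAG)
POLICIES: Dict[str, FileTypePolicy] = {
    ".csv": FileTypePolicy(Action.UPLOAD_ONLY, max_size_mb=200, max_files=2),
    ".xlsx": FileTypePolicy(Action.UPLOAD_ONLY, max_size_mb=200, max_files=2),
    ".txt": FileTypePolicy(Action.UPLOAD_AND_RAG, max_size_mb=20),
    ".pdf": FileTypePolicy(Action.UPLOAD_AND_RAG, max_size_mb=50),
    ".exe": FileTypePolicy(Action.DISALLOW),
}


def decide(filename: str, size_bytes: int, already_uploaded: Dict[str, int]) -> Action:
    """Return what to do with one upload.

    `already_uploaded` maps a lowercase extension to the number of files
    of that type uploaded so far.
    """
    ext = Path(filename).suffix.lower()
    policy = POLICIES.get(ext, DEFAULT_POLICY)

    if policy.action is Action.DISALLOW:
        return Action.DISALLOW
    # The overall maximum always applies, whatever the per-type limit says.
    if sum(already_uploaded.values()) >= OVERALL_MAX_FILES:
        return Action.DISALLOW
    if policy.max_files is not None and already_uploaded.get(ext, 0) >= policy.max_files:
        return Action.DISALLOW
    if policy.max_size_mb is not None and size_bytes > policy.max_size_mb * 1024 * 1024:
        return Action.DISALLOW
    return policy.action


if __name__ == "__main__":
    print(decide("report.csv", 50 * 1024 * 1024, {}))
    print(decide("notes.txt", 1024, {}))
```

With a table like this, a large csv file is accepted but left unembedded for a pipeline to process, while a txt file goes through the normal RAG path.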
Describe alternatives you've considered
I have tried disallowing uploads entirely, but then I have to build awkward API integrations to access the files from a folder instead, which is not a friendly user experience.