mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 11:28:35 -05:00
[PR #7830] [CLOSED] feat: Batch Processing for Large-Scale Document Import #37746
📋 Pull Request Information
Original PR: https://github.com/open-webui/open-webui/pull/7830
Author: @gabriel-ecegi
Created: 12/13/2024
Status: ❌ Closed
Base: dev ← Head: dev
📝 Commits (2)
f2e2b59 Add batching
440894f Fix process/files/batch
📊 Changes
2 files changed (+182 additions, -14 deletions)
📝 backend/open_webui/apps/retrieval/main.py (+97 -10)
📝 backend/open_webui/apps/webui/routers/knowledge.py (+85 -4)
📄 Description
Pull Request: Add Batch Processing Support for Document Import
Changelog Entry
Description
Added batch processing capability to significantly improve performance when importing large volumes of documents. This enhancement is particularly valuable for enterprise integrations (like Confluence imports) where thousands of documents need to be processed simultaneously.
Added
BatchProcessFilesForm
BatchProcessFilesResult
BatchProcessFilesResponse
/process/files/batch for batch document processing
Changed
Fixed
Performance
Additional Information
This enhancement addresses the performance bottleneck when importing large document sets. Instead of processing files one by one, we can now handle hundreds of files in a single operation, making integrations with enterprise systems more practical.
The implementation includes error handling that allows partial success: if some files fail to process, the successful ones are still added to the knowledge base, and each failure is reported individually.
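The partial-success behavior described above can be illustrated with a minimal sketch. It is not the code from this PR: the class names, field names, and the `process_one` callback are hypothetical stand-ins, and plain dataclasses are used in place of whatever models the PR actually defines. The sketch only shows the general pattern of catching a failure per file so that one bad document does not abort the rest of the batch.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class BatchFileResult:
    """Outcome for one file (hypothetical shape, not the PR's actual model)."""
    file_id: str
    status: str  # "completed" or "failed"
    error: Optional[str] = None


@dataclass
class BatchResponse:
    """Aggregate outcome for a whole batch (hypothetical shape)."""
    results: list[BatchFileResult] = field(default_factory=list)
    errors: list[BatchFileResult] = field(default_factory=list)


def process_batch(
    files: dict[str, str],
    process_one: Callable[[str, str], None],
) -> BatchResponse:
    """Process every file, recording failures instead of stopping at the first one."""
    response = BatchResponse()
    for file_id, content in files.items():
        try:
            process_one(file_id, content)
        except Exception as exc:  # keep going; report the failure for this file only
            response.errors.append(BatchFileResult(file_id, "failed", str(exc)))
        else:
            response.results.append(BatchFileResult(file_id, "completed"))
    return response


def index_document(file_id: str, content: str) -> None:
    """Stand-in for the real per-file work (assumed: rejects empty content)."""
    if not content.strip():
        raise ValueError("empty content")


if __name__ == "__main__":
    outcome = process_batch({"a": "hello", "b": "   ", "c": "world"}, index_document)
    print([r.file_id for r in outcome.results])
    print([(e.file_id, e.error) for e in outcome.errors])
```

In this sketch the caller receives both lists in one response, so an importer can retry only the failed file IDs rather than resubmitting the whole batch.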
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.