[PR #18666] feat: Update knowledge files sync #24876

Open
opened 2026-04-20 05:37:54 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/18666
Author: @Stoyan-Zlatev
Created: 10/27/2025
Status: 🔄 Open

Base: dev ← Head: feature/knowledge-sync-button


📝 Commits (10+)

  • d3acc2b Update Knowledge directory sync process
  • 5ec7450 Update sync confirmation text
  • 950a859 Update endpoint name
  • 2c5bec6 Revert endpoint name
  • 735619f Reformat long log line
  • 60a8a6e Reformat log line using black/pre-commit
  • 3e576ae Merge branch 'dev' into feature/knowledge-sync-button
  • aaef753 Reformat
  • 40f38da Update translation.json
  • b0a16eb refac: update spacing in UserMenu dropdown items

📊 Changes

4 files changed (+428 additions, -32 deletions)

📝 backend/open_webui/routers/knowledge.py (+61 -0)
➕ backend/open_webui/utils/knowledge_sync.py (+221 -0)
📝 src/lib/apis/knowledge/index.ts (+42 -2)
📝 src/lib/components/workspace/Knowledge/KnowledgeBase.svelte (+104 -30)

📄 Description

Pull Request Checklist

Before submitting, make sure you've checked the following:

  • Target branch: Verify that the pull request targets the dev branch. Not targeting the dev branch may lead to immediate closure of the PR.
  • Description: Provide a concise description of the changes made in this pull request.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: If necessary, update relevant documentation Open WebUI Docs like environment variables, the tutorials, or other documentation sources.
  • Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • Testing: Perform manual tests to verify the implemented fix/feature works as intended AND does not break any other functionality. Take this as an opportunity to make screenshots of the feature/fix and include it in the PR description.
  • Agentic AI Code: Confirm this Pull Request is not written by any AI Agent or has at least gone through additional human review and manual testing. If any AI Agent is the co-author of this PR, it may lead to immediate closure of the PR.
  • Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • Title Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • BREAKING CHANGE: Significant changes that may affect compatibility
    • build: Changes that affect the build system or external dependencies
    • ci: Changes to our continuous integration processes or workflows
    • chore: Refactor, cleanup, or other non-functional code changes
    • docs: Documentation update or addition
    • feat: Introduces a new feature or enhancement to the codebase
    • fix: Bug fix or error correction
    • i18n: Internationalization or localization changes
    • perf: Performance improvement
    • refactor: Code restructuring for better maintainability, readability, or scalability
    • style: Changes that do not affect the meaning of the code (white space, formatting, missing semi-colons, etc.)
    • test: Adding missing tests or correcting existing tests
    • WIP: Work in progress, a temporary label for incomplete or ongoing work

Changelog Entry

Description

  • Introduced a new api/v1/knowledge/{id}/file/sync endpoint to enable efficient, hash-based file synchronization for knowledge collections.
  • This change improves performance by avoiding redundant file uploads when syncing directories, only replacing files that have actually changed.
  • The original api/v1/knowledge/{id}/reset endpoint remains available for backward compatibility to prevent breaking existing tools or scripts.

Added

  • New endpoint: POST /api/v1/knowledge/{id}/file/sync
    • Syncs individual files to a knowledge base with hash-based comparison.
    • Logic:
      • If a file with the same name exists and hashes match → skip upload.
      • If a file with the same name exists and hashes differ → replace with new file.
      • If no same-named file exists → upload new file.
  • Integrated hash comparison logic to determine whether a file requires re-uploading.
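
The three-way rule above can be sketched as a small pure function. This is only an illustration of the described logic, not the code from this PR: the function name `decide_sync_action`, its parameters, and the shape of the `existing` mapping are hypothetical.

```python
from typing import Dict, Optional


def decide_sync_action(
    filename: str,
    new_hash: str,
    existing: Dict[str, Optional[str]],
) -> str:
    """Return "skip", "replace", or "upload" for one incoming file.

    `existing` (hypothetical shape) maps the names of files already in the
    knowledge base to their stored hash, or None when no hash was recorded.
    """
    if filename not in existing:
        # No same-named file exists: upload the new file.
        return "upload"
    if existing[filename] == new_hash:
        # Same name and matching hashes: nothing to do.
        return "skip"
    # Same name but different (or missing) stored hash: replace the file.
    return "replace"
```

In this sketch a stored file without a recorded hash is treated as changed and replaced; how the PR handles that case is not stated in the description.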

Changed

  • Updated the "Sync" button behavior to use the new /file/sync endpoint instead of /reset for more efficient updates.
  • Improved sync logic to handle selective updates rather than full re-uploads.
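
To make the difference between a selective update and a full re-upload concrete, here is a minimal sketch that plans a directory sync and counts how many files would actually be transferred. The helper names (`sha256_hex`, `plan_directory_sync`) are hypothetical, and SHA-256 is an assumption: the description does not say which hash algorithm the endpoint uses.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex digest of the file content (SHA-256 is an assumed algorithm)."""
    return hashlib.sha256(data).hexdigest()


def plan_directory_sync(
    local_files: Dict[str, bytes],
    remote_hashes: Dict[str, str],
) -> Dict[str, List[str]]:
    """Group local files by the action a hash-based sync would take."""
    plan: Dict[str, List[str]] = {"skip": [], "replace": [], "upload": []}
    for name in sorted(local_files):
        digest = sha256_hex(local_files[name])
        if name not in remote_hashes:
            plan["upload"].append(name)
        elif remote_hashes[name] == digest:
            plan["skip"].append(name)
        else:
            plan["replace"].append(name)
    return plan


# Illustrative data: one unchanged file, one edited file, one new file.
local = {"a.md": b"alpha", "b.md": b"beta v2", "c.md": b"gamma"}
remote = {"a.md": sha256_hex(b"alpha"), "b.md": sha256_hex(b"beta v1")}

plan = plan_directory_sync(local, remote)
transfers = len(plan["replace"]) + len(plan["upload"])
print(plan)
print(f"{transfers} of {len(local)} files transferred")
```

A reset-style sync would re-upload all three files here, while the hash-based plan transfers only the edited and the new one. The sketch leaves files that exist only on the remote side untouched; whether the PR removes them is not stated in the description.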

Deprecated

  • The api/v1/knowledge/{id}/reset endpoint is now considered legacy for sync operations.
    • It remains available for backward compatibility but is no longer used by the sync button.

Removed

  • N/A

Fixed

  • Prevented unnecessary deletions and re-uploads during directory synchronization.

Security

  • N/A

Breaking Changes

  • BREAKING CHANGE: None. The previous /reset endpoint remains functional for any existing integrations.

Additional Information

  • This enhancement significantly improves sync efficiency for large knowledge collections where only a subset of files have changed.
  • Related to prior work introducing a hash field in file metadata (api/v1/knowledge/{id}): https://github.com/open-webui/open-webui/pull/18284
  • This change aligns with the proposal for improving the sync function through hash-based verification: https://github.com/open-webui/open-webui/discussions/18323

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
GiteaMirror added the pull-request label 2026-04-20 05:37:54 -05:00
Reference: github-starred/open-webui#24876