[PR #22723] [CLOSED] FIX: merge tool citations and RAG sources instead of overwriting #49878

Closed
opened 2026-04-30 02:17:02 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/22723
Author: @IllimarR
Created: 3/16/2026
Status: Closed

Base: devHead: fix/merge-sources


📝 Commits (1)

  • a1ce262 fix: merge tool citations and RAG sources instead of overwriting

📊 Changes

2 files changed (+17 additions, -6 deletions)

View changed files

📝 backend/open_webui/models/chats.py (+11 -4)
📝 src/lib/components/chat/Chat.svelte (+6 -2)

📄 Description

Pull Request Checklist

Before submitting, make sure you've checked the following:

  • Target branch: Verify that the pull request targets the dev branch. PRs targeting main will be immediately closed.
  • Description: Provide a concise description of the changes made in this pull request down below.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: Add docs in Open WebUI Docs Repository. Document user-facing behavior, environment variables, public APIs/interfaces, or deployment steps.
  • Dependencies: Are there any new or upgraded dependencies? If so, explain why, update the changelog/docs, and include any compatibility notes. Actually run the code/function that uses updated library to ensure it doesn't crash.
  • Testing: Perform manual tests to verify the implemented fix/feature works as intended AND does not break any other functionality. Include reproducible steps to demonstrate the issue before the fix. Test edge cases (URL encoding, HTML entities, types). Take this as an opportunity to make screenshots of the feature/fix and include them in the PR description.
  • Agentic AI Code: Confirm this Pull Request is not written by any AI Agent or has at least gone through additional human review AND manual testing. If any AI Agent is the co-author of this PR, it may lead to immediate closure of the PR.
  • Code review: Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards?
  • Design & Architecture: Prefer smart defaults over adding new settings; use local state for ephemeral UI logic. Open a Discussion for major architectural or UX changes.
  • Git Hygiene: Keep PRs atomic (one logical change). Clean up commits and rebase on dev to ensure no unrelated commits (e.g. from main) are included. Push updates to the existing PR branch instead of closing and reopening.
  • Title Prefix: To clearly categorize this pull request, prefix the pull request title using one of the following:
    • fix: Bug fix or error correction

Changelog Entry

Description

When tools emit citations via __event_emitter__ and the chat also has RAG/Knowledge files attached, both frontend and backend now merge the two sources lists instead of one silently replacing the other.

Root cause: There are two independent systems that write sources to the assistant message, and they conflict:

  1. Tool citations (Path A): During tool execution, the tool emits type: "citation" events. The WebSocket handler appends these to message.sources and saves to DB. The frontend receives source/citation events and appends to message.sources.

  2. RAG/Knowledge sources (Path B): After tool execution, chat_completion_files_handler collects RAG sources from attached knowledge files. These are emitted as chat:completion events and saved to DB via upsert_message.

Frontend overwrite (Chat.svelte): The chat:completion handler checks if (sources && !message?.sources) — but by this point tool citations from Path A have already set message.sources, so RAG sources are silently skipped.

Backend overwrite (chats.py): upsert_message does a shallow dict merge ({**existing, **message}), so the Path B sources key completely replaces the Path A tool citations. On page reload, only Path B sources appear.

Result: Users never see both tool citations AND RAG sources together — during a live session only tool citations are visible, and after page reload only RAG sources are visible.

How to reproduce

  1. Create a Workspace Knowledge collection with one or more files
  2. Attach it to a model or chat
  3. Use a tool that emits citations via __event_emitter__ (e.g., a web search or custom search tool)
  4. Ask a question that triggers both the tool and references the attached knowledge
  5. Observe that only one set of sources appears in the citations panel
  6. Reload the page — observe that the other set now appears instead

Added

  • N/A — no new features

Changed

  • Frontend (Chat.svelte): Changed the chat:completion source handler from skipping RAG sources when tool citations exist to concatenating both lists
  • Backend (chats.py): Changed upsert_message to merge sources lists (with deduplication) instead of shallow-overwriting

Deprecated

  • N/A

Removed

  • N/A

Fixed

  • Tool-emitted citations and RAG/Knowledge sources now coexist in the citations panel instead of one silently overwriting the other

Security

  • N/A

Breaking Changes

  • N/A — this is a purely additive fix. If no tool citations exist, behavior is identical to before.

Additional Information

  • No new dependencies
  • The backend merge uses not in deduplication to prevent duplicate sources when the same source appears in both paths
  • The backend loop for key in ("sources",) is written for extensibility — other list-type fields can be added to the tuple if needed in the future
  • Risk is low: if no tool citations exist (!message?.sources), both frontend and backend behavior is identical to before

Screenshots or Videos

  • N/A (behavior fix in the citations panel — sources that were previously missing now appear alongside existing ones)

Contributor License Agreement


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/open-webui/open-webui/pull/22723 **Author:** [@IllimarR](https://github.com/IllimarR) **Created:** 3/16/2026 **Status:** ❌ Closed **Base:** `dev` ← **Head:** `fix/merge-sources` --- ### 📝 Commits (1) - [`a1ce262`](https://github.com/open-webui/open-webui/commit/a1ce2625a348aad4946f7a9d9be77c2aec297620) fix: merge tool citations and RAG sources instead of overwriting ### 📊 Changes **2 files changed** (+17 additions, -6 deletions) <details> <summary>View changed files</summary> 📝 `backend/open_webui/models/chats.py` (+11 -4) 📝 `src/lib/components/chat/Chat.svelte` (+6 -2) </details> ### 📄 Description <!-- ⚠️ CRITICAL CHECKS FOR CONTRIBUTORS (READ, DON'T DELETE) ⚠️ 1. Target the `dev` branch. PRs targeting `main` will be automatically closed. 2. Do NOT delete the CLA section at the bottom. It is required for the bot to accept your PR. --> # Pull Request Checklist **Before submitting, make sure you've checked the following:** - [x] **Target branch:** Verify that the pull request targets the `dev` branch. **PRs targeting `main` will be immediately closed.** - [x] **Description:** Provide a concise description of the changes made in this pull request down below. - [x] **Changelog:** Ensure a changelog entry following the format of [Keep a Changelog](https://keepachangelog.com/) is added at the bottom of the PR description. - [ ] **Documentation:** Add docs in [Open WebUI Docs Repository](https://github.com/open-webui/docs). Document user-facing behavior, environment variables, public APIs/interfaces, or deployment steps. - [x] **Dependencies:** Are there any new or upgraded dependencies? If so, explain why, update the changelog/docs, and include any compatibility notes. Actually run the code/function that uses updated library to ensure it doesn't crash. - [x] **Testing:** Perform manual tests to **verify the implemented fix/feature works as intended AND does not break any other functionality**. Include reproducible steps to demonstrate the issue before the fix. Test edge cases (URL encoding, HTML entities, types). Take this as an opportunity to **make screenshots of the feature/fix and include them in the PR description**. - [x] **Agentic AI Code:** Confirm this Pull Request is **not written by any AI Agent** or has at least **gone through additional human review AND manual testing**. If any AI Agent is the co-author of this PR, it may lead to immediate closure of the PR. - [x] **Code review:** Have you performed a self-review of your code, addressing any coding standard issues and ensuring adherence to the project's coding standards? - [x] **Design & Architecture:** Prefer smart defaults over adding new settings; use local state for ephemeral UI logic. Open a Discussion for major architectural or UX changes. - [x] **Git Hygiene:** Keep PRs atomic (one logical change). Clean up commits and rebase on `dev` to ensure no unrelated commits (e.g. from `main`) are included. Push updates to the existing PR branch instead of closing and reopening. - [x] **Title Prefix:** To clearly categorize this pull request, prefix the pull request title using one of the following: - **fix**: Bug fix or error correction # Changelog Entry ### Description When tools emit citations via `__event_emitter__` and the chat also has RAG/Knowledge files attached, both frontend and backend now merge the two sources lists instead of one silently replacing the other. **Root cause:** There are two independent systems that write `sources` to the assistant message, and they conflict: 1. **Tool citations (Path A):** During tool execution, the tool emits `type: "citation"` events. The WebSocket handler appends these to `message.sources` and saves to DB. The frontend receives `source`/`citation` events and appends to `message.sources`. 2. **RAG/Knowledge sources (Path B):** After tool execution, `chat_completion_files_handler` collects RAG sources from attached knowledge files. These are emitted as `chat:completion` events and saved to DB via `upsert_message`. **Frontend overwrite** (`Chat.svelte`): The `chat:completion` handler checks `if (sources && !message?.sources)` — but by this point tool citations from Path A have already set `message.sources`, so RAG sources are silently skipped. **Backend overwrite** (`chats.py`): `upsert_message` does a shallow dict merge (`{**existing, **message}`), so the Path B `sources` key completely replaces the Path A tool citations. On page reload, only Path B sources appear. **Result:** Users never see both tool citations AND RAG sources together — during a live session only tool citations are visible, and after page reload only RAG sources are visible. ### How to reproduce 1. Create a Workspace Knowledge collection with one or more files 2. Attach it to a model or chat 3. Use a tool that emits citations via `__event_emitter__` (e.g., a web search or custom search tool) 4. Ask a question that triggers both the tool and references the attached knowledge 5. Observe that only one set of sources appears in the citations panel 6. Reload the page — observe that the other set now appears instead ### Added - N/A — no new features ### Changed - **Frontend (`Chat.svelte`):** Changed the `chat:completion` source handler from skipping RAG sources when tool citations exist to concatenating both lists - **Backend (`chats.py`):** Changed `upsert_message` to merge `sources` lists (with deduplication) instead of shallow-overwriting ### Deprecated - N/A ### Removed - N/A ### Fixed - Tool-emitted citations and RAG/Knowledge sources now coexist in the citations panel instead of one silently overwriting the other ### Security - N/A ### Breaking Changes - N/A — this is a purely additive fix. If no tool citations exist, behavior is identical to before. --- ### Additional Information - No new dependencies - The backend merge uses `not in` deduplication to prevent duplicate sources when the same source appears in both paths - The backend loop `for key in ("sources",)` is written for extensibility — other list-type fields can be added to the tuple if needed in the future - Risk is low: if no tool citations exist (`!message?.sources`), both frontend and backend behavior is identical to before ### Screenshots or Videos - N/A (behavior fix in the citations panel — sources that were previously missing now appear alongside existing ones) ### Contributor License Agreement <!-- 🚨 DO NOT DELETE THE TEXT BELOW 🚨 Keep the "Contributor License Agreement" confirmation text intact. Deleting it will trigger the CLA-Bot to INVALIDATE your PR. Your PR will NOT be reviewed or merged until you check the box below confirming that you have read and agree to the terms of the CLA. --> - [x] By submitting this pull request, I confirm that I have read and fully agree to the [Contributor License Agreement (CLA)](https://github.com/open-webui/open-webui/blob/main/CONTRIBUTOR_LICENSE_AGREEMENT), and I am providing my contributions under its terms. --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
GiteaMirror added the pull-request label 2026-04-30 02:17:02 -05:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#49878