github-starred/open-webui

mirror of https://github.com/open-webui/open-webui.git synced 2026-05-06 10:58:17 -05:00

[GH-ISSUE #20435] Chunk-level citation issue: Different citation markers show identical retrieved chunks in RAG answers #34713

Closed

opened 2026-04-25 08:48:57 -05:00 by GiteaMirror · 1 comment

GiteaMirror commented

2026-04-25 08:48:57 -05:00

Originally created by @Leonurus-free on GitHub (Jan 7, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/20435

Check Existing Issues

I have searched for any existing and/or related issues.
I have searched for any existing and/or related discussions.
I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.6.43

Ollama Version (if applicable)

No response

Operating System

Ubuntu 22.04

Browser (if applicable)

Chrome 133

Confirmation

I have read and followed all instructions in README.md.
I am using the latest version of both Open WebUI and Ollama.
I have included the browser console logs.
I have included the Docker container logs.
I have provided every relevant configuration, setting, and environment variable used in my setup.
I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
Start with the initial platform/version/OS and dependencies used,
Specify exact install/launch/configure commands,
List URLs visited, user input (incl. example values/emails/passwords if needed),
Describe all options and toggles enabled or changed,
Include any files or environmental changes,
Identify the expected and actual result at each stage,
Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

Problem Description

I am currently using Open WebUI to build a self-hosted RAG knowledge base. I noticed an issue with how citations are displayed in the generated answers.

When the answer contains multiple sentences with separate citation markers pointing to the same document, clicking on each citation shows the exact same set of chunks from that document.
This makes it appear as if every cited sentence is supported by the same evidence, even though in reality each sentence should correspond to different chunks with different relevance scores.

For example (as shown in the screenshot):

- The answer contains three sentences, each annotated with a citation to the same document: “人工智能大模型综述及展望.pdf” (roughly, "A Survey and Outlook on Large AI Models").
- However, when clicking the three citation markers individually, they all display the same retrieved chunks instead of the specific chunk relevant to that sentence.

Why this is a problem

1. It creates the impression that the citations are not trustworthy, because:
   - Different statements appear to be backed by identical evidence.
2. Users who want to verify the source of a specific sentence must:
   - Manually search through all returned chunks
   - Without knowing which chunk actually supports which sentence
3. This significantly reduces usability, especially when documents are long and contain many chunks.
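
To make the manual search described above concrete, here is a minimal sketch of ranking the returned chunks against one answer sentence by word overlap. It is only an illustration: the function names, the sample chunks, and the use of Jaccard similarity are assumptions made for this example, not anything Open WebUI does internally.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately crude stand-in for real embeddings."""
    return set(re.findall(r"\w+", text.lower()))


def rank_chunks_for_sentence(sentence: str, chunks: list[str]) -> list[tuple[int, float]]:
    """Return (chunk_index, jaccard_score) pairs, best match first."""
    sent_tokens = tokenize(sentence)
    scored = []
    for i, chunk in enumerate(chunks):
        chunk_tokens = tokenize(chunk)
        union = sent_tokens | chunk_tokens
        score = len(sent_tokens & chunk_tokens) / len(union) if union else 0.0
        scored.append((i, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical chunks standing in for what a citation popup would list.
chunks = [
    "Large language models are trained on massive text corpora.",
    "Retrieval-augmented generation grounds answers in external documents.",
    "Evaluation benchmarks measure reasoning and factual accuracy.",
]
ranking = rank_chunks_for_sentence(
    "RAG grounds generated answers in retrieved documents.", chunks
)
best_index = ranking[0][0]  # index of the chunk that best matches the sentence
```

With per-sentence citations, this kind of lookup would not be needed, because each marker would already point at its supporting chunk.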

Expected Behavior

Each citation marker should correspond to the specific chunk(s) used to support that particular sentence.

Impact

This citation behavior causes practical usability issues:

When users want to verify the source of a specific statement in the answer, they must manually search through all retrieved chunks of the cited document.
This significantly increases cognitive load and hurts the user experience.

The problem becomes more severe when documents are long or split into many chunks.

Questions

I would like to ask:

1. Is this behavior a design or implementation limitation of Open WebUI?
2. Are there any configuration options (e.g. Retriever / Citation / Prompt / Pipeline settings) that enable fine-grained, per-answer or per-paragraph citations?
3. Are there any existing workarounds, PRs, or plugins in the community that address this issue?

Actual Behavior

Current behavior:

Answer A
Citation: Document X (containing multiple chunks). Every citation marker shows the same chunk content.

Steps to Reproduce

1. Deploy Open WebUI and enable the RAG / Knowledge Base feature.
2. Create a knowledge base and upload one or more long documents (which are automatically split into multiple chunks).
3. Ensure the documents are successfully indexed and retrievable.
4. Ask a question that:
   - Requires information from multiple chunks within the same document, or
   - Produces an answer consisting of multiple statements or paragraphs.
5. Observe the generated answer and the displayed citations / sources.

Observed Result

The citation section only lists the document-level source.
All retrieved chunks from the same document are grouped together.
There is no clear mapping between individual statements in the answer and their corresponding source chunks.
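
The observed result can be modeled with a small sketch. It assumes, purely for illustration, that retrieved chunks are collapsed into one citation entry per source document; the data layout, the `group_by_document` helper, and the `survey.pdf` filename are hypothetical and are not taken from Open WebUI's code.

```python
from collections import defaultdict

# Hypothetical retrieval results: three chunks from one PDF, each with its own score.
retrieved = [
    {"source": "survey.pdf", "chunk_id": 0, "score": 0.91, "text": "Chunk about model scaling."},
    {"source": "survey.pdf", "chunk_id": 1, "score": 0.84, "text": "Chunk about training data."},
    {"source": "survey.pdf", "chunk_id": 2, "score": 0.77, "text": "Chunk about evaluation."},
]


def group_by_document(results):
    """Collapse chunk-level results into one citation entry per source document."""
    grouped = defaultdict(list)
    for item in results:
        grouped[item["source"]].append(item["text"])
    return dict(grouped)


citations = group_by_document(retrieved)

# Each of the three markers in the answer refers to the document, not to a chunk,
# so resolving any marker yields the same full list of chunks.
answer_markers = ["survey.pdf", "survey.pdf", "survey.pdf"]
shown = [citations[source] for source in answer_markers]
```

Under this assumption, the per-chunk scores and ids are lost at grouping time, which would explain why every marker opens identical content.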

Expected Result

Each paragraph or key statement in the answer should be:
- Associated with a specific chunk used during generation, and
- Clearly referenced in the citation section (ideally inline or per-paragraph).
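
One way to picture the expected result is a chunk-level citation index, sketched below. It assumes each retrieved chunk receives its own citation number and that the model emits inline `[n]` markers; the `sources` layout, the `resolve_citations` helper, and the sample answer are hypothetical and only illustrate the requested mapping.

```python
import re

# Hypothetical chunk-level sources: every retrieved chunk gets its own citation number.
sources = {
    1: {"document": "survey.pdf", "chunk_id": 0, "text": "Chunk about model scaling."},
    2: {"document": "survey.pdf", "chunk_id": 1, "text": "Chunk about training data."},
    3: {"document": "survey.pdf", "chunk_id": 2, "text": "Chunk about evaluation."},
}

answer = (
    "Larger models tend to perform better [1]. "
    "Data quality matters as much as quantity [2]. "
    "Benchmarks remain an imperfect measure [3]."
)


def resolve_citations(answer_text, chunk_sources):
    """Map each sentence to the chunk(s) named by its inline [n] markers."""
    resolved = []
    # Split after each period that is followed by whitespace.
    for sentence in re.split(r"(?<=\.)\s+", answer_text.strip()):
        ids = [int(n) for n in re.findall(r"\[(\d+)\]", sentence)]
        chunks = [chunk_sources[i] for i in ids if i in chunk_sources]
        resolved.append((sentence, chunks))
    return resolved


resolved = resolve_citations(answer, sources)
```

Here the three sentences resolve to three different chunks of the same document, which is the per-statement mapping the report asks for.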

Logs & Screenshots

Screenshots attached to the original issue:
- https://github.com/user-attachments/assets/74ca8201-0ccd-4a25-be49-a5f5e469faaf
- https://github.com/user-attachments/assets/32fc6ff7-be38-43bc-abb8-6ffe756fa88a
- https://github.com/user-attachments/assets/85f11ac4-d544-4d87-9b6f-7caaf524c11e

Additional Information

No response

GiteaMirror added the bug label 2026-04-25 08:48:57 -05:00

GiteaMirror closed this issue

2026-04-25 08:48:59 -05:00

GiteaMirror commented

2026-04-25 08:49:00 -05:00

@frenzybiscuit commented on GitHub (Jan 7, 2026):

I believe the relevance score is only relevant if you are using hybrid search (a reranker) as you can set it to specifically exclude results that are below a certain score.

And since hybrid search is busted in .43, you'll need to either run the dev image or wait for .44 where it's fixed...
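
The score cutoff mentioned in this comment can be sketched as follows. This is a minimal illustration that assumes each reranked result carries a numeric `score`; the field names, the `filter_by_relevance` helper, and the threshold value are hypothetical and do not reflect Open WebUI's actual implementation or settings.

```python
def filter_by_relevance(results, threshold):
    """Keep results whose reranker score meets the threshold, best first."""
    kept = [r for r in results if r["score"] >= threshold]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


# Hypothetical reranker output for three chunks of one document.
reranked = [
    {"chunk_id": 0, "score": 0.82},
    {"chunk_id": 1, "score": 0.35},
    {"chunk_id": 2, "score": 0.10},
]
relevant = filter_by_relevance(reranked, threshold=0.3)
```

A cutoff like this trims low-scoring chunks from the citation list, but on its own it does not map individual sentences to individual chunks.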
