[GH-ISSUE #24076] issue: RAG template: duplicate {{CONTEXT}} / [context] placeholders inject retrieved context multiple times without warning #58843

Closed
opened 2026-05-06 00:16:34 -05:00 by GiteaMirror · 1 comment

Originally created by @KingsleyOWO on GitHub (Apr 24, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24076

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.9.0

Ollama Version (if applicable)

No response

Operating System

Ubuntu 24.04

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When a RAG template contains more than one context placeholder, Open WebUI should warn that the retrieved context will be injected once per placeholder.

Acceptable fixes could include either:

  1. a backend warning from rag_template() when the template contains more than one context placeholder (context_count > 1), or
  2. a UI-side warning/lint when saving a custom RAG template with duplicate [context] / {{CONTEXT}} placeholders.

The warning should not block saving and should not change existing behavior, because some users may intentionally duplicate context placeholders.
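As a rough illustration of the save-time lint option, the sketch below finds every context placeholder and returns advisory messages without rejecting the template. The names `find_context_placeholders` and `lint_rag_template` and the message wording are hypothetical, not part of Open WebUI, and the real UI lint would presumably live in the frontend rather than in Python.

```python
from __future__ import annotations

import re

# Matches either placeholder form literally and case-sensitively,
# the same way a plain str.replace() would.
PLACEHOLDER_RE = re.compile(r"\[context\]|\{\{CONTEXT\}\}")


def find_context_placeholders(template: str) -> list[tuple[int, str]]:
    """Return (line_number, placeholder) for every occurrence, 1-indexed."""
    hits = []
    for line_no, line in enumerate(template.splitlines(), start=1):
        for match in PLACEHOLDER_RE.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits


def lint_rag_template(template: str) -> list[str]:
    """Hypothetical non-blocking lint: returns messages, never raises."""
    hits = find_context_placeholders(template)
    if len(hits) <= 1:
        return []
    where = ", ".join(f"line {line_no}" for line_no, _ in hits)
    return [
        f"Template contains {len(hits)} context placeholders ({where}); "
        f"retrieved context will be injected {len(hits)} times."
    ]
```

Because the function only returns messages, the caller can still save the template unchanged, which keeps intentional duplication working.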

Actual Behavior

A template with duplicate context placeholders injects the full retrieved context multiple times silently.

There is already a debug warning when no context placeholder exists:

if '[context]' not in template and '{{CONTEXT}}' not in template:
    log.debug("WARNING: The RAG template does not contain the '[context]' or '{{CONTEXT}}' placeholder.")

However, there is no corresponding warning when more than one context placeholder exists. The missing-placeholder case and duplicate-placeholder case are both easy-to-miss RAG-template failure modes, but only the missing-placeholder case currently emits a signal.
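A backend-side check covering both failure modes might look roughly like the sketch below. It reuses the debug message quoted above for the missing-placeholder case; the function name `check_context_placeholders`, the logger name, the log level, and the wording of the duplicate warning are assumptions made for illustration, not the actual Open WebUI code.

```python
import logging

log = logging.getLogger("rag_template_sketch")  # hypothetical logger name


def check_context_placeholders(template: str) -> int:
    """Log a signal for zero or multiple placeholders; never alters the template."""
    context_count = template.count("[context]") + template.count("{{CONTEXT}}")
    if context_count == 0:
        # Same message as the existing missing-placeholder check.
        log.debug(
            "WARNING: The RAG template does not contain the "
            "'[context]' or '{{CONTEXT}}' placeholder."
        )
    elif context_count > 1:
        # Hypothetical wording for the duplicate case requested in this issue.
        log.warning(
            "The RAG template contains %d context placeholders; "
            "retrieved context will be injected %d times.",
            context_count,
            context_count,
        )
    return context_count
```

Counting both forms together matters because a template that mixes one `[context]` with one `{{CONTEXT}}` duplicates the context just as two identical placeholders do.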

Steps to Reproduce

  1. Go to Admin Settings → Documents → RAG Template.
  2. Save a custom RAG template that contains {{CONTEXT}} twice:
````text
### Task
Answer using the provided context.

<context>
{{CONTEXT}}
</context>

Later in the same template:

```xml
<context>
{{CONTEXT}}
</context>
```
````
  3. Run any RAG-enabled chat turn.
  4. Capture the rendered prompt reaching the model, for example by using a local proxy, enabling debug instrumentation, or temporarily logging the output of rag_template().
  5. Observe that the retrieved context is inserted once for each placeholder occurrence.
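The final observation step can be checked mechanically once the rendered prompt has been captured. The helper below is a hypothetical sketch: it assumes you have both the captured prompt and the exact retrieved context string (or a distinctive chunk of it) available as Python strings.

```python
def count_context_copies(rendered_prompt: str, context: str) -> int:
    """Count non-overlapping occurrences of the context in the prompt."""
    if not context:
        # str.count("") would return len(prompt) + 1, which is meaningless here.
        return 0
    return rendered_prompt.count(context)


captured = "A\nexample retrieved chunk\nB\nexample retrieved chunk\nC"
copies = count_context_copies(captured, "example retrieved chunk")
```

A result greater than 1 indicates that the context was injected more than once for that turn.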

Minimal substitution-only reproducer:

template = "A {{CONTEXT}} B {{CONTEXT}}"
context = "CTX"

rendered = template.replace("{{CONTEXT}}", context)

assert rendered == "A CTX B CTX"

The Open WebUI code path does the same kind of global replacement for both [context] and {{CONTEXT}}.
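One consequence worth spelling out is that the two placeholder forms add up. The sketch below assumes the substitution consists of the two global replacements quoted under Logs & Screenshots; `render_context` is a hypothetical stand-in, not the full rag_template() function.

```python
def render_context(template: str, context: str) -> str:
    """Apply both global replacements, one placeholder form after the other."""
    template = template.replace("[context]", context)
    template = template.replace("{{CONTEXT}}", context)
    return template


# One placeholder of each form still yields two copies of the context.
mixed = render_context("A [context] B {{CONTEXT}}", "CTX")
assert mixed == "A CTX B CTX"
```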

Logs & Screenshots

No browser-console error is expected for this issue. This is a rendered-prompt/template-expansion behavior, not a frontend crash.

Browser console:

  • No relevant frontend error observed.

Container / backend logs:

  • No exception observed when the duplicate-placeholder RAG template is used.
  • The issue is that there is also no warning/log when the template contains more than one context placeholder.
  • The duplicate injection is only visible if the final rendered prompt is inspected or token-counted.

Minimal rendered-output evidence:

Template:

A
{{CONTEXT}}
B
{{CONTEXT}}
C

Context:

<source id="1">example retrieved chunk</source>

Rendered result:

A
<source id="1">example retrieved chunk</source>
B
<source id="1">example retrieved chunk</source>
C

This is consistent with the current rag_template() substitution path, which uses:

template = template.replace('[context]', context)
template = template.replace('{{CONTEXT}}', context)

without a count argument.
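For readers unfamiliar with the optional third argument of Python's `str.replace`, the short sketch below contrasts the two behaviors. It is included only to illustrate what "without a count argument" means, not to propose changing the substitution semantics.

```python
template = "A {{CONTEXT}} B {{CONTEXT}}"

# No count argument: every occurrence is replaced (the behavior described here).
all_replaced = template.replace("{{CONTEXT}}", "CTX")

# count=1: only the first occurrence is replaced, the second stays literal.
first_only = template.replace("{{CONTEXT}}", "CTX", 1)

assert all_replaced == "A CTX B CTX"
assert first_only == "A CTX B {{CONTEXT}}"
```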

Token-accounting reproduction from my environment:

| Metric | Template with 1× {{CONTEXT}} | Template with 2× {{CONTEXT}} | Delta | One context copy |
| --- | ---: | ---: | ---: | ---: |
| chars | 4,985 | 8,687 | +3,702 | 3,551 |
| Qwen tokenizer | 3,024 | 5,843 | +2,819 | 2,710 |
| tiktoken o200k_base | 3,450 | 6,701 | +3,251 | 3,141 |

The rendered prompt therefore contains one extra full copy of the retrieved context. The current UI/logs do not make this visible.
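The arithmetic behind the reported figures can be reproduced with a few lines. The numbers are copied from the measurements above; the `summarize` helper and its field names are hypothetical and exist only for this illustration.

```python
# (one placeholder, two placeholders, size of one context copy)
measurements = {
    "chars": (4985, 8687, 3551),
    "Qwen tokenizer": (3024, 5843, 2710),
    "tiktoken o200k_base": (3450, 6701, 3141),
}


def summarize(one: int, two: int, context_copy: int) -> dict:
    delta = two - one
    return {
        "delta": delta,
        # Growth not explained by the extra context copy itself.
        "overhead_beyond_context": delta - context_copy,
        # Prompt growth relative to the single-placeholder prompt.
        "relative_increase": delta / one,
    }


summary = {name: summarize(*values) for name, values in measurements.items()}
```

With these figures the delta exceeds one context copy by 151 characters (109 and 110 tokens for the two tokenizers). That remainder is presumably the non-context text of the second placeholder block, although this attribution is an inference rather than something measured in the report.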

Additional Information

Related but not the same as existing RAG duplication reports:

  • #19098 discusses prompt/context duplication when a RAG template is used with tool results.
  • #21167 discusses user-message duplication for RAG queries.
  • #21726 discusses RAG duplication when view_knowledge_file is used.

This issue is narrower: a custom RAG template that itself contains duplicate {{CONTEXT}} / [context] placeholders causes the retrieved context to be injected once per placeholder.

I am not asking to change substitution semantics. Intentional duplicate placeholders should continue to work. I am only requesting warning/linting so that accidental duplication is visible.

Anecdotal observation, not part of the bug claim:

In one internal Traditional Chinese HR/policy RAG deployment, the duplicate-context template appeared to produce better answers than the default single-context template in a small manual comparison. In particular, it handled multi-clause policy distinctions more reliably.

However, I am not claiming this proves duplicate injection is generally beneficial. The comparison was not controlled enough: the custom template also differed in language, structure, citation guidance, and XML/code-fence formatting; the sample size was only n=2; and only one model family was tested.

This observation is included only to explain why a user might unknowingly keep a duplicate-context template in production: it may appear to improve answer quality while silently increasing prompt tokens. A warning would let users make that tradeoff intentionally.

GiteaMirror added the bug label 2026-05-06 00:16:34 -05:00

@tjbck commented on GitHub (Apr 24, 2026):

Intended behaviour, duplication notice message added to dev.


Reference: github-starred/open-webui#58843