[GH-ISSUE #23057] issue: Files uploaded with "Use entire document" force context refresh every reply #19875

Closed
opened 2026-04-20 02:24:21 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @frenzybiscuit on GitHub (Mar 26, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23057

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker Cuda

Open WebUI Version

0.8.11

Ollama Version (if applicable)

N/A

Operating System

Linux

Browser (if applicable)

Firefox

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

Does not reprocess context every reply

Actual Behavior

Reprocesses context every reply

Steps to Reproduce

If you take a large script (say 50k context) and shove it into the system prompt of a model, it works fine. Context doesn't reprocess every reply.

If you take that same script and upload it into OWUI and use "Entire Document" it will force refresh the context every reply. This leads to very long wait times between replies. It's quite frustrating.

I have verified this happens on both vllm and llamacpp (ik_llamacpp).

This happens all the time with SillyTavern when you use lorebooks, but only if you inject the new context at a depth that causes reprocessing. I'm assuming the same thing happens here.

Logs & Screenshots

Image Image Image

Additional Information

No response

Originally created by @frenzybiscuit on GitHub (Mar 26, 2026). Original GitHub issue: https://github.com/open-webui/open-webui/issues/23057 ### Check Existing Issues - [x] I have searched for any existing and/or related issues. - [x] I have searched for any existing and/or related discussions. - [x] I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!). - [x] I am using the latest version of Open WebUI. ### Installation Method Docker Cuda ### Open WebUI Version 0.8.11 ### Ollama Version (if applicable) N/A ### Operating System Linux ### Browser (if applicable) Firefox ### Confirmation - [x] I have read and followed all instructions in `README.md`. - [x] I am using the latest version of **both** Open WebUI and Ollama. - [x] I have included the browser console logs. - [x] I have included the Docker container logs. - [x] I have **provided every relevant configuration, setting, and environment variable used in my setup.** - [x] I have clearly **listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup** (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc). - [x] I have documented **step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation**. My steps: - Start with the initial platform/version/OS and dependencies used, - Specify exact install/launch/configure commands, - List URLs visited, user input (incl. example values/emails/passwords if needed), - Describe all options and toggles enabled or changed, - Include any files or environmental changes, - Identify the expected and actual result at each stage, - Ensure any reasonably skilled user can follow and hit the same issue. ### Expected Behavior Does not reprocess context every reply ### Actual Behavior Reprocesses context every reply ### Steps to Reproduce If you take a large script (say 50k context) and **shove it into the system prompt of a model, it works fine.** Context doesn't reprocess every reply. If you take that same script and upload it into OWUI and use "Entire Document" i**t will force refresh the context every reply.** This leads to very long wait times between replies. It's quite frustrating. I have verified this happens on both vllm and llamacpp (ik_llamacpp). This happens all the time with SillyTavern when you use lorebooks, but only if you inject the new context at a depth that causes reprocessing. I'm assuming the same thing happens here. ### Logs & Screenshots <img width="1096" height="469" alt="Image" src="https://github.com/user-attachments/assets/2041ec45-53eb-49e2-b057-e73122b9e8d8" /> <img width="1012" height="700" alt="Image" src="https://github.com/user-attachments/assets/0044fd99-2260-414a-a3c9-f161dd3ec6d9" /> <img width="994" height="127" alt="Image" src="https://github.com/user-attachments/assets/03c4924b-9e3a-4224-97fa-4df80f9bb1a8" /> ### Additional Information _No response_
GiteaMirror added the bug label 2026-04-20 02:24:21 -05:00
Author
Owner

@adhusch commented on GitHub (Mar 27, 2026):

Hi @frenzybiscuit ,

did you try setting RAG_SYSTEM_CONTEXT=True ?

Best

<!-- gh-comment-id:4142189319 --> @adhusch commented on GitHub (Mar 27, 2026): Hi @frenzybiscuit , did you try setting RAG_SYSTEM_CONTEXT=True ? Best
Author
Owner

@frenzybiscuit commented on GitHub (Mar 27, 2026):

Hi @frenzybiscuit ,

did you try setting RAG_SYSTEM_CONTEXT=True ?

Best

No, but I'm not using rag. I'm using the entire document in full context according to OWUI (entire document).

Would setting RAG_SYSTEM_CONTEXT=True fix this?

<!-- gh-comment-id:4143577263 --> @frenzybiscuit commented on GitHub (Mar 27, 2026): > Hi [@frenzybiscuit](https://github.com/frenzybiscuit) , > > did you try setting RAG_SYSTEM_CONTEXT=True ? > > Best No, but I'm not using rag. I'm using the entire document in full context according to OWUI (entire document). Would setting RAG_SYSTEM_CONTEXT=True fix this?
Author
Owner

@frenzybiscuit commented on GitHub (Mar 27, 2026):

I mean yes I use rag, but that's not what this option does..

<!-- gh-comment-id:4143578500 --> @frenzybiscuit commented on GitHub (Mar 27, 2026): I mean yes I use rag, but that's not what this option does..
Author
Owner

@frenzybiscuit commented on GitHub (Mar 27, 2026):

Will try and let you know.

<!-- gh-comment-id:4143581525 --> @frenzybiscuit commented on GitHub (Mar 27, 2026): Will try and let you know.
Author
Owner

@frenzybiscuit commented on GitHub (Mar 27, 2026):

That does fix it, yes...

Is full context injected into the prompt like regular rag?

Well, that solved my issue.

<!-- gh-comment-id:4143621677 --> @frenzybiscuit commented on GitHub (Mar 27, 2026): That does fix it, yes... Is full context injected into the prompt like regular rag? Well, that solved my issue.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/open-webui#19875