[GH-ISSUE #17034] feat: Current RAG and Memory Behaviour breaks caching of LLM Providers and should be more flexible #33675

Closed
opened 2026-04-25 07:34:22 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @Podden on GitHub (Aug 29, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/17034

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

I'm currently developing my own Anthropic Pipe and want to use prompt caching. I noticed that the current behaviour of appending RAG content and memory into the system prompt is suboptimal, as it completely invalidates the prompt cache whenever the system prompt changes.

Desired Solution you'd like

Multiple system messages with IDs (Model_System_Prompt, User_System_Prompt, User_Memory, RAG) would be nice so I can handle them differently. I would also like to clearly differentiate between RAG from knowledge or notes and RAG from files that I uploaded to the current conversation. Anthropic puts documents right beside the message to which they belong, which I find much more intuitive than putting everything into the system prompt. How is the model supposed to know which knowledge is relevant now and which is several messages old?

Currently I have to regex for "User Context: " to find the memory and use a custom RAG template with tags at the start and end, which is not optimal.
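The workaround described above could look roughly like this. The `<rag_context>` tags are assumed to come from a custom RAG template and are purely illustrative, not an Open WebUI default:

```python
import re

# Recover the injected sections from Open WebUI's merged system prompt.
MEMORY_RE = re.compile(r"User Context:\s*(.*)", re.DOTALL)
RAG_RE = re.compile(r"<rag_context>(.*?)</rag_context>", re.DOTALL)

def split_system_prompt(system: str) -> dict:
    """Split a merged system prompt into base / memory / RAG parts."""
    rag = RAG_RE.search(system)
    memory = MEMORY_RE.search(system)
    # Strip both injected sections to recover the original system prompt.
    base = RAG_RE.sub("", MEMORY_RE.sub("", system)).strip()
    return {
        "base": base,
        "memory": memory.group(1).strip() if memory else None,
        "rag": rag.group(1).strip() if rag else None,
    }
```

This is exactly the kind of brittle string surgery that separate, tagged system blocks would make unnecessary.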

Alternatives Considered

No response

Additional Context

No response
