[GH-ISSUE #19594] feat: Persistent "Invisible" Metadata Object within Chat History for Context Management / OR new DB Table for storing these #34465

Open
opened 2026-04-25 08:28:08 -05:00 by GiteaMirror · 2 comments

Originally created by @Classic298 on GitHub (Nov 29, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/19594

Check Existing Issues

  • I have searched for all existing open AND closed issues and discussions for similar requests. I have found none that is comparable to my request.

Verify Feature Scope

  • I have read through and understood the scope definition for feature requests in the Issues section. I believe my feature request meets the definition and belongs in the Issues section instead of the Discussions.

Problem Description

I am trying to implement a "Context Compaction" feature similar to the update recently released by Anthropic for Claude.ai (where earlier context is summarized to prevent hitting token limits and enhance context management).

Currently, implementing this in Open WebUI via Filters is technically possible but architecturally flawed and unscalable for the following reasons:

  • Complexity & Overhead: It requires setting up a standalone external database (SQLite/VectorDB) just to store summaries and cross-reference them with Chat IDs.
  • Scalability: In an enterprise environment (e.g., 2000 users with 100 chats each), managing 200,000 external references creates a massive performance bottleneck.
  • Data Consistency (The "Orphan" Problem): If a user deletes a chat in Open WebUI, there is no automatic propagation to the external database. This requires complex cron jobs (e.g., "garbage collection" scripts) to fetch existing IDs and purge stale data.
  • Concurrency & Race Conditions: If a user has the same chat open in two browser tabs/devices and sends messages simultaneously, an external database solution struggles to maintain the "ground truth" of the conversation flow without complex locking mechanisms.
  • Performance: External database lookups block the request flow.
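To illustrate the overhead described above, here is a minimal sketch of the sidecar database a Filter is forced to maintain today. The table schema and the `purge_orphans` helper are hypothetical, but they show the "garbage collection" burden: Open WebUI never notifies the external store when a chat is deleted, so a cron job must periodically fetch the live chat IDs and purge everything else.

```python
import sqlite3

# Hypothetical sidecar database keyed by Open WebUI chat IDs.
# Open WebUI cannot enforce referential integrity against it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE summaries (chat_id TEXT PRIMARY KEY, summary TEXT)")

def purge_orphans(conn, live_chat_ids):
    """The 'garbage collection' a cron job must run: delete summaries
    whose chat was deleted in Open WebUI without any notification."""
    placeholders = ",".join("?" for _ in live_chat_ids)
    conn.execute(
        f"DELETE FROM summaries WHERE chat_id NOT IN ({placeholders})",
        list(live_chat_ids),
    )

conn.execute("INSERT INTO summaries VALUES ('chat-a', 's1'), ('chat-b', 's2')")
purge_orphans(conn, {"chat-a"})  # 'chat-b' was deleted in the UI
print([r[0] for r in conn.execute("SELECT chat_id FROM summaries")])  # → ['chat-a']
```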

The lack of a native place to store "hidden" state data inside the chat object prevents the creation of robust, self-contained, and scalable context management plugins.

Desired Solution you'd like

I request the addition of a persistent, "invisible" object/field within the standard Chat History structure (e.g., a metadata or hidden_context key inside the JSON object of the chat).

Key Requirements:

  • Persistence: This object must be saved alongside the chat messages in the main Open WebUI database.
  • Invisibility: The contents of this object should not be rendered in the UI chat bubble stream.
  • Accessibility via Filters: The Filter system must be able to read and write to this object during the request pipeline.
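The requirements above can be pictured as a small extension of the chat's JSON. Note that `hidden_context` is the field name *proposed* in this request, not an existing Open WebUI field, and the keys inside it are illustrative only:

```python
# Hypothetical shape of a chat object with the proposed field.
chat = {
    "id": "chat-123",
    "title": "Long-running conversation",
    "messages": [
        {"role": "user", "content": "..."},
        {"role": "assistant", "content": "..."},
    ],
    # Persisted with the chat in the main DB, never rendered in the UI:
    "hidden_context": {
        "summary": "Earlier, the user asked about X and we agreed on Y.",
        "summarized_up_to": 42,  # index of the last message covered
    },
}

# The UI would render only chat["messages"]; Filters would read and
# write chat["hidden_context"] during the request pipeline.
print("hidden_context" in chat)  # → True
```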

Proposed Workflow with this feature:

  1. User sends a message.
  2. Filter intercepts request -> checks chat.hidden_context.
  3. Filter sees context is full -> triggers a "Summary Task" (e.g., via a small model like GPT-4o-mini/Flash-Lite).
  4. Filter replaces old message history with the summary + recent messages for the API call.
  5. Filter updates chat.hidden_context with the new summary state and saves it back to the Chat Object.
  6. Filter sends the modified request to the AI: old messages are replaced with the summary, while the newest 5-10 messages are kept intact as-is.
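The workflow above could be sketched as a single Filter `inlet` step. This is only an illustration under the assumption that the request body carries the chat's messages plus the proposed `hidden_context` field; the `summarize()` helper and the message-count threshold (a stand-in for a real token count) are hypothetical:

```python
MAX_MESSAGES = 12   # hypothetical threshold standing in for a token limit
KEEP_RECENT = 5     # newest messages kept verbatim

def summarize(messages):
    """Hypothetical stand-in for a call to a small model
    (e.g. GPT-4o-mini / Flash-Lite) that condenses older messages."""
    return f"Summary of {len(messages)} earlier messages."

def inlet(body):
    """Sketch of the proposed Filter pipeline step (steps 2-6)."""
    messages = body["messages"]
    hidden = body.setdefault("hidden_context", {})  # step 2: check state
    if len(messages) > MAX_MESSAGES:                # step 3: context full
        old, recent = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
        hidden["summary"] = summarize(old)          # step 5: update state
        # steps 4/6: summary replaces old history for the API call
        body["messages"] = [
            {"role": "system", "content": hidden["summary"]}
        ] + recent
    return body

body = {"messages": [{"role": "user", "content": f"m{i}"} for i in range(20)]}
out = inlet(body)
print(len(out["messages"]))  # → 6 (1 summary + 5 recent)
```

Because Open WebUI would persist `hidden_context` with the chat, step 5 needs no external writes at all.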

Benefit: This makes the chat self-contained. If the chat is deleted, the summary data is deleted with it. If the chat is exported, the intelligence travels with it. No external databases, no race conditions, no orphaned data. The concurrency issue described in the Problem Description is also resolved: whichever of the two devices' requests is processed last simply becomes the final saved state of the chat (last-write-wins).

Alternatives Considered

As detailed in the problem description, using a Filter with a complex external setup was considered and rejected. It introduces a single point of failure and massive storage overhead, violates the principle of keeping chat data atomic, and is effectively impossible to implement correctly once concurrent requests to the same chat are taken into account. It requires handling synchronization logic (ACID/CAP theorem constraints) that should not be the responsibility of a simple context filter.

Additional Context

Anthropic's implementation of context compaction.

(Screenshot: Anthropic's context compaction UI in Claude.ai)

Technical Note on Performance: Currently, very long chats in Open WebUI can be slow to load. While adding this field adds data, it allows us to prevent the visible message history from growing infinitely, potentially speeding up rendering if the UI only loads the visible messages and keeps the hidden_context in the background.

However, this request assumes that the backend structure can handle slightly larger JSON objects per chat. A general refactor of how large chat objects are loaded might be required in tandem to ensure the UI remains snappy.


@Classic298 commented on GitHub (Dec 21, 2025):

Related: https://github.com/open-webui/open-webui/discussions/19279


@elacy commented on GitHub (Mar 7, 2026):

I have another use case for this feature. I'm trying to run RPGs through this and being able to store game state in the conversation message means if something goes wrong I can just delete the message and the game state is back to where it was at the last message.


Reference: github-starred/open-webui#34465