mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-22 06:02:06 -05:00
feat: Support memory retrieval reranking for improved context personalization #6001
Originally created by @longzanxi on GitHub (Aug 8, 2025).
Check Existing Issues
Problem Description
Open WebUI's user memory retrieval currently supports only vector similarity search and has no reranking step. As a result, some long-term user preferences or factual memories that are highly relevant to the current conversation may not be surfaced, or may be overshadowed by less relevant results. This reduces context accuracy and weakens the chatbot's personalization.
Desired Solution you'd like
I propose adding an optional reranking step to the memory retrieval pipeline, similar to what is available for knowledge base RAG retrieval. Specifically:
Implementation suggestion: In memory_handler.py / chat_memory_handler.py, apply reranking after VECTOR_DB_CLIENT.search, reusing the process from RAG retrieval.
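To make the suggestion concrete, here is a minimal sketch of a rerank step applied to the candidates returned by a vector search. It does not use Open WebUI's actual code: `MemoryHit`, `token_overlap_score`, and `rerank_memories` are hypothetical names, and the token-overlap scorer is only a stand-in for whatever reranking model the RAG pipeline is configured with.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MemoryHit:
    """Hypothetical container for one memory returned by the vector search."""
    text: str
    vector_score: float  # similarity reported by the vector search stage


def token_overlap_score(query: str, document: str) -> float:
    """Stand-in relevance scorer: fraction of query tokens present in the document.

    A real setup would call a reranking model here instead.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank_memories(
    query: str,
    hits: List[MemoryHit],
    score_fn: Callable[[str, str], float] = token_overlap_score,
    top_n: int = 3,
) -> List[MemoryHit]:
    """Re-order vector-search candidates by score_fn and keep the best top_n."""
    scored = [(score_fn(query, hit.text), hit) for hit in hits]
    # The sort is stable, so ties keep their original vector-search order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in scored[:top_n]]


if __name__ == "__main__":
    # Candidates as they might come back from the vector search (assumed data).
    candidates = [
        MemoryHit("User lives in Berlin", 0.82),
        MemoryHit("User prefers tea over coffee", 0.80),
        MemoryHit("User is allergic to peanuts", 0.79),
    ]
    for hit in rerank_memories("which tea does the user like", candidates, top_n=2):
        print(hit.text)
```

The idea is to fetch more candidates from the vector search than are finally needed, then let the second-stage scorer decide which ones reach the prompt. Keeping `score_fn` as a parameter means the same function could wrap the reranker already used for knowledge base retrieval, if that is exposed in a reusable form.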
Alternatives Considered
Additional Context
Related issue: RAG Hybrid Search still broken (#15915)
No public discussions or issues currently address reranking for user memory. This feature would improve context accuracy and long-term personalization for advanced users and complex scenarios.
If more technical details are needed or collaboration is welcome, I'm happy to provide further input.
@onestardao commented on GitHub (Aug 9, 2025):
This is essentially the vectorstore ranking drift problem — after retrieval, the ranking phase can drift away from the most relevant context, especially in long-term memory personalization.
We’ve documented this as No.7 in our AI failure Problem Map, along with reproducible cases and tested fixes. If you want the write-up and implementation pattern, I can share it.
@longzanxi commented on GitHub (Aug 9, 2025):
@onestardao Thanks! Yes, I’d really appreciate it if you could share the Problem Map No.7 write-up and the implementation pattern (repro cases, metrics, fixes).
@onestardao commented on GitHub (Aug 9, 2025):
Yes — this is exactly the vectorstore ranking drift problem we documented as No.7 in the WFGY Problem Map.
The write-up includes reproducible cases, metrics, and the exact fix pattern.
Full details & implementation steps:
https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md
@i-iooi-i commented on GitHub (Sep 21, 2025):
Despite storing numerous memories, I've observed that the AI consistently accesses only a small, fixed subset, with the rest remaining unreachable. I suspect this limitation might stem from the embedding model. The current memory feature feels quite underdeveloped, and I eagerly anticipate its full implementation. For now, I am relying on system prompts to manage critical information.