[GH-ISSUE #8177] Optimizing the RAG #15028

Closed
opened 2026-04-19 21:18:56 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @Schwenn2002 on GitHub (Dec 28, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/8177

I would like to be able to search the RAG documents in several stages, so that I can choose a smaller context window for the LLM and use less VRAM.

Let's assume there is a context window of 10,000 tokens and a chunk in the RAG has 500 tokens.

Ideally, the first stage would be a RAG search with e.g. Top K = 80, then a reranking of those 80 chunks, and finally returning the best 20 chunks to the LLM (20 × 500 tokens then fits in the context window).

Is that possible?
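The two-stage flow described above (broad Top-K retrieval, then reranking down to a small Top-N) can be sketched as plain Python. This is a minimal illustration, not Open WebUI's implementation: the chunk store, embeddings, and the keyword-overlap "reranker" are all stand-ins (a real setup would use a cross-encoder reranking model).

```python
# Hypothetical two-stage retrieval sketch. Stage 1: cheap vector search
# returns TOP_K candidates. Stage 2: a reranker rescores them and only
# the best TOP_N chunks enter the LLM context. All names/data illustrative.
import math

TOP_K = 80   # candidates from the first-stage vector search
TOP_N = 20   # chunks that actually reach the LLM (20 x 500 tokens)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def vector_search(query_emb, chunks, k=TOP_K):
    """Stage 1: embedding similarity over the whole chunk store."""
    scored = [(cosine(query_emb, c["emb"]), c) for c in chunks]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [c for _, c in scored[:k]]

def rerank(query, candidates, n=TOP_N):
    """Stage 2: stand-in for a cross-encoder; here just term overlap."""
    q_terms = set(query.lower().split())
    def score(c):
        return len(q_terms & set(c["text"].lower().split()))
    return sorted(candidates, key=score, reverse=True)[:n]
```

Because the reranker only ever sees the 80 retrieved candidates, its cost is bounded regardless of store size, and the LLM prompt is capped at TOP_N × chunk-size tokens.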
