[GH-ISSUE #23995] issue: RAG Chunk Min Size Target option doesn't work #35675

Closed
opened 2026-04-25 09:51:57 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @xNissX233 on GitHub (Apr 22, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23995

Open WebUI Version

0.9.1

Operating System

Arch Linux

Browser (if applicable)

Desktop App

Expected Behavior

When setting the Chunk Min Size Target option to 100, for example, each chunk of context given to the model from the Knowledge files should be at least that size.

Actual Behavior

Setting Chunk Min Size Target to a high value still results in the model getting tiny pieces of information from each file.

Steps to Reproduce

  1. Set Chunk Min Size Target to 1000
  2. Load a text-based file that contains short lines (for example, a subtitle file) into Knowledge
  3. Ask the model a question related to the subtitle
  4. The context RAG extracts from the file is the short line, even though it is smaller than 1000 characters
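The behavior in the steps above can be illustrated with a minimal sketch (hypothetical splitter, not Open WebUI's actual chunking code): a splitter that breaks on blank lines and only re-splits oversized pieces lets every short subtitle-style block through untouched, so all chunks end up far below any size target.

```python
# Minimal sketch of how paragraph-based splitting produces tiny chunks
# from a subtitle-like file. Hypothetical code, NOT Open WebUI's
# actual implementation.

def split_on_blank_lines(text: str, chunk_size: int = 1000) -> list[str]:
    """Split on blank lines; only re-split pieces that exceed chunk_size."""
    pieces = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    for piece in pieces:
        if len(piece) <= chunk_size:
            chunks.append(piece)  # short pieces pass through untouched
        else:
            chunks.extend(
                piece[i:i + chunk_size] for i in range(0, len(piece), chunk_size)
            )
    return chunks

subtitle = (
    "1\n00:00:01 --> 00:00:03\nHello there.\n\n"
    "2\n00:00:04 --> 00:00:06\nContinue."
)
chunks = split_on_blank_lines(subtitle, chunk_size=1000)
print([len(c) for c in chunks])  # every chunk is far below the 1000-char target
```

Without a merge step, nothing in this splitter ever pushes chunks up toward a minimum size, which is why a "min size target" needs a separate merging pass.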

Logs & Screenshots

<img width="2317" height="283" alt="Image" src="https://github.com/user-attachments/assets/af0e06eb-8e4e-4139-b128-c5d54f53e1ae" />
GiteaMirror added the bug label 2026-04-25 09:51:57 -05:00

@Classic298 commented on GitHub (Apr 22, 2026):

As explained in the docs, this config doesn't ENSURE that all small markdown header sections get merged. It tries to, but it can't always succeed.

There are edge cases, such as when you reach the end of a file and one small chunk remains that can't be merged into any other chunk. In that case the chunk HAS to be created even though it's small.
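The edge case described above can be sketched as a greedy pass that merges adjacent chunks until they reach a minimum target (hypothetical code, not Open WebUI's actual implementation): whatever is left in the buffer at the end of the file has nothing following it to merge with, so it must be emitted below the target.

```python
# Sketch of a greedy merge pass toward a minimum chunk size target.
# Hypothetical code, not Open WebUI's implementation. Note the edge
# case: a leftover at the end of the file is emitted even though it
# is below min_size.

def merge_small_chunks(chunks: list[str], min_size: int) -> list[str]:
    merged: list[str] = []
    buffer = ""
    for chunk in chunks:
        buffer = f"{buffer}\n{chunk}" if buffer else chunk
        if len(buffer) >= min_size:
            merged.append(buffer)
            buffer = ""
    if buffer:
        # Trailing leftover: no later chunk exists to merge it into.
        merged.append(buffer)
    return merged

result = merge_small_chunks(["a" * 60, "b" * 50, "c" * 5], min_size=100)
print([len(c) for c in result])  # the last chunk stays below the target
```

A real implementation might instead fold the leftover backward into the previous chunk, but that can overshoot a maximum chunk size, so some small trailing chunks are unavoidable either way.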

<!-- gh-comment-id:4296116065 -->

@xNissX233 commented on GitHub (Apr 22, 2026):

That is definitely not my case; I observe the same behavior with all the documents I try. But you can close the issue if you see fit.
In my case, I tell the model "continue" every now and then, then save the conversation to Knowledge as context. With RAG, if I then say "continue", the context the model gets from the conversation's txt file is nothing more than a few "continue" lines, instead of the actual longer messages.

<!-- gh-comment-id:4296121525 -->

@Classic298 commented on GitHub (Apr 22, 2026):

What do you mean by "continue"?

<!-- gh-comment-id:4296142650 -->

@xNissX233 commented on GitHub (Apr 22, 2026):

I send a message whose content is just:
"Continue"
so that the model continues its last message.
That "Continue" is then saved as part of the conversation and is later used, wrongly, by RAG as context.
So for example:
Me: I like birds
AI: (gets useful context about birds) Birds are cool
Me: Continue
AI: (gets a bunch of "continue" lines as context and gives an answer without any useful context)

<!-- gh-comment-id:4296158569 -->

@Classic298 commented on GitHub (Apr 22, 2026):

Aha, but that has absolutely nothing to do with the min chunk size config. That's just RAG working as designed: on every query it fetches new information.

Try switching to native tool calling instead. The model will then decide on its own when to invoke RAG.

<!-- gh-comment-id:4296178475 -->

@xNissX233 commented on GitHub (Apr 22, 2026):

Isn't there an option to control this behavior without native tool mode, then? Native tool mode is a bit too expensive for me, since it injects a lot of tokens each time it's called and the model takes a long time to think and answer.

<!-- gh-comment-id:4296188864 -->

@Classic298 commented on GitHub (Apr 22, 2026):

@xNissX233 you can disable all the tool categories you don't want the model to have.

E.g., web search and such, and only give it the knowledge tools.

It's not many tokens, maybe a few hundred, which is just the prompts of the 5/6 knowledge tools there are.

You have to pay that price either way:

either the task model gets all the tools each time and decides whether to do RAG, or your chat model uses the tools natively.

<!-- gh-comment-id:4296212859 -->

@xNissX233 commented on GitHub (Apr 22, 2026):

Ok. It would still be nice to have an option to set an actual minimum number of characters/tokens for each piece of context provided by RAG, though, since native mode sometimes doesn't retrieve anything if the model doesn't decide to.
Thank you.

<!-- gh-comment-id:4296225888 -->
Reference: github-starred/open-webui#35675