[GH-ISSUE #12059] feat: Add an embedding option that specifies document separator characters #16455

Closed
opened 2026-04-19 22:22:14 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @fgfg54321 on GitHub (Mar 26, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/12059

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

Add an embedding option that specifies document separator characters

Desired Solution you'd like

In retrieval.py, the separators option of RecursiveCharacterTextSplitter could be made configurable, either through an environment variable or through the admin (backend) settings. This would make document segmentation more flexible.

For now, I modify the code manually, like this:
separators_str = "\n\n\n\n,.,。"
# Split on commas only; calling strip() here would discard whitespace-only separators such as "\n\n\n\n"
separators = [s for s in separators_str.split(",") if s]

text_splitter = RecursiveCharacterTextSplitter(
    separators=separators,  # New separator parameter
    chunk_size=request.app.state.config.CHUNK_SIZE,
    chunk_overlap=request.app.state.config.CHUNK_OVERLAP,
    add_start_index=True,
)
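As a rough illustration of the environment-variable route, the separator list could be read from a JSON array, which avoids the ambiguity of using a comma as both delimiter and separator and lets escapes like \n survive intact. This is only a sketch: the variable name TEXT_SPLITTER_SEPARATORS and the helper load_separators are hypothetical and are not existing Open WebUI settings. The fallback list is meant to mirror the default separators of LangChain's RecursiveCharacterTextSplitter.

```python
import json
import os

# Hypothetical variable name, not an existing Open WebUI setting.
ENV_VAR = "TEXT_SPLITTER_SEPARATORS"

# Intended to match RecursiveCharacterTextSplitter's default separators.
DEFAULT_SEPARATORS = ["\n\n", "\n", " ", ""]


def load_separators(env=None):
    """Return the separator list stored as a JSON array in ENV_VAR.

    Falls back to DEFAULT_SEPARATORS when the variable is unset, blank,
    not valid JSON, or not a non-empty array of strings.
    """
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR, "")
    if not raw.strip():
        return list(DEFAULT_SEPARATORS)
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return list(DEFAULT_SEPARATORS)
    if (
        not isinstance(parsed, list)
        or not parsed
        or not all(isinstance(s, str) for s in parsed)
    ):
        return list(DEFAULT_SEPARATORS)
    return parsed
```

With something like TEXT_SPLITTER_SEPARATORS='["\n\n\n\n", ".", "。"]' exported, the returned list could then be passed as separators= to RecursiveCharacterTextSplitter in place of the hard-coded string.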

Alternatives Considered

No response

Additional Context

No response

Reference: github-starred/open-webui#16455