[GH-ISSUE #14164] feat: retrieve RAG context chunks within /api/chat/completions #17162

Closed
opened 2026-04-19 22:54:20 -05:00 by GiteaMirror · 2 comments

Originally created by @vaclcer on GitHub (May 22, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/14164

Check Existing Issues

  • I have searched the existing issues and discussions.

Problem Description

Hi, this is a recurring topic, see discussion 13304: https://github.com/open-webui/open-webui/discussions/13304

Judging by the reactions, many people would appreciate being able to retrieve, via the API, the context chunks used for the final generation step in RAG. I believe this should be pretty much standard in any production system.

Context chunks are useful not only as references, but also for response evaluations based on RAGAS, DeepEval, or similar frameworks.
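
For instance, with the chunks in hand, an evaluation could look roughly like the sketch below. This assumes the ragas 0.1-style evaluate() API with an LLM judge already configured; the question, answer, and chunk texts are placeholders:

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

records = {
    "question": ["What does the handbook say about VPN access?"],
    "answer": ["VPN access is requested through the IT portal ..."],
    # These per-question context chunks are exactly what the API does not return today:
    "contexts": [["chunk 1 text ...", "chunk 2 text ..."]],
}

result = evaluate(Dataset.from_dict(records), metrics=[faithfulness, answer_relevancy])
print(result)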

There was a rejected PR addressing this, 13378: https://github.com/open-webui/open-webui/pull/13378

This might already be possible with streaming responses enabled, but streaming output is not trivial for many of us to process, I presume.

Thank you for the consideration!

Desired Solution you'd like

I would think retrieving the RAG context/chunks could be made configurable via the /api/chat/completions payload object, similar to the following:

payload = {
    'model': model,
    'messages': [{'role': 'user', 'content': query}],
    'files': [{'type': 'collection', 'id': collection_id}],
    'retrieve_context': True,
}
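
For illustration, a fuller sketch of how a client might use this follows. It is hypothetical: the retrieve_context flag and the returned context field are the proposed behaviour, not an existing part of the API, and the server URL, API key, model name, and collection id are placeholders:

import requests

OPEN_WEBUI_URL = "http://localhost:3000"  # placeholder: your Open WebUI instance
API_KEY = "sk-..."                        # placeholder: an Open WebUI API key

payload = {
    "model": "my-rag-model",  # placeholder model name
    "messages": [{"role": "user", "content": "What does the handbook say about VPN access?"}],
    "files": [{"type": "collection", "id": "my-collection-id"}],  # placeholder knowledge collection
    "retrieve_context": True,  # proposed flag, not an existing API field
}

response = requests.post(
    f"{OPEN_WEBUI_URL}/api/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
)
response.raise_for_status()
data = response.json()

# The generated answer (OpenAI-compatible, non-streaming response shape).
print(data["choices"][0]["message"]["content"])

# Proposed behaviour: the chunks used for generation come back alongside the answer,
# e.g. under an illustrative field such as "context", ready to be logged or fed into
# an evaluation pipeline. The field name here is illustrative only.
for chunk in data.get("context", []):
    print(chunk)

A plain non-streaming call like this would keep the chunks easy to consume, without having to parse them out of streamed events.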

Alternatives Considered

No response

Additional Context

No response


@daschuchmann commented on GitHub (May 22, 2025):

This would be so nice to have, making the api finally capable to return the knowledge


@tjbck commented on GitHub (May 26, 2025):

Addressed with 2d5b82df8c.
