mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #1200] feat: pure text ingestion API for RAG #51057
Originally created by @icsy7867 on GitHub (Mar 18, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1200
Originally assigned to: @tjbck on GitHub.
Is your feature request related to a problem? Please describe.
Straight text ingestion is fast. In most of my cases I am trying to ingest certain policy pages or Confluence pages. Confluence has an API that returns the HTML of a page, which I can then easily parse and send as plain text. Currently I am sending a few hundred Confluence pages in a couple of minutes.
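As an illustration of the "parse and send as plain text" step, here is a minimal sketch that reduces a page's HTML to plain text using only the Python standard library. The names `TextExtractor` and `html_to_text` are hypothetical, and the sketch assumes the HTML has already been fetched; it does not call the Confluence API itself.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping script and style content."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        # Join the collected fragments with single spaces.
        return " ".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML string as one plain-text string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Joining fragments with a space keeps inline markup such as bold text from splitting a sentence across lines; a real pipeline might prefer to preserve paragraph breaks instead.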
Describe the solution you'd like
It would be nice to have a REST API endpoint to ingest straight text. It should accept a document title and the text body. Ideally the title could also be a URL to a website location.
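To make the request concrete, here is a rough sketch of what a client call to such an endpoint might look like. Everything specific in it is an assumption for illustration only: the path `/hypothetical/ingest/text`, the JSON field names `title` and `text`, and the bearer-token header are not taken from Open WebUI. The function only builds the request and does not send it.

```python
import json
import urllib.request


def build_ingest_request(base_url, title, text, token=None):
    """Build (but do not send) a POST request carrying a title and a text body.

    The endpoint path and the JSON field names are hypothetical placeholders,
    not an actual Open WebUI API.
    """
    if not title.strip():
        raise ValueError("title must not be empty")
    if not text.strip():
        raise ValueError("text must not be empty")

    # The title may be a plain name or a URL pointing at the source page.
    payload = {"title": title, "text": text}
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumed authentication scheme, for illustration only.
        headers["Authorization"] = f"Bearer {token}"

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/hypothetical/ingest/text",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; with a shape like this, a few hundred pages could be pushed in a simple loop without writing intermediate files.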
Describe alternatives you've considered
I tried saving the documents as text files first and then ingesting them, but the extra overhead is significant.
Additional context
While I need a text ingestion API, it might be nice to have a file ingestion API as well.
@tjbck commented on GitHub (Mar 24, 2024):
Added to our dev branch! You can check out the RAG API endpoints here:
http://[Your Open WebUI]/rag/api/v1/docs. Keep us updated and let us know whether the current implementation fits your use case. If not, feel free to elaborate on your workflow so we can better accommodate it. Thanks!
@longfei-zhang commented on GitHub (Mar 4, 2025):
@tjbck I can't find the RAG API endpoints at http://[Your Open WebUI]/rag/api/v1/docs
It shows me 404: Not Found
The log from the Open WebUI pod: