mirror of
https://github.com/open-webui/open-webui.git
synced 2026-03-10 15:54:15 -05:00
feat: Allow clickable URLS as sources for documents. #4366
Originally created by @icsy7867 on GitHub (Mar 10, 2025).
Check Existing Issues
Problem Description
Not sure where to start or the best place to ask this.
For my use case, I would love to make the source ID of a document a clickable URL instead of a generic source ID.
I have written scrapers and wrappers that take my company's documents and knowledge sources and push them to an API that loads them into a Qdrant database. It was actually pretty easy to write. I simply used the Confluence API to return the HTML of every document in a space, and then iteratively pushed each one to the Qdrant DB via an API.
When I named the document, I was able (with some modifications) to use a web URL as the name, which, when a document matched, produced a clickable link instead of something like "something.txt". With many people's documents, information, and other items stored in various web databases and sources (like Confluence), it would be nice to reference these articles via their source links instead of a file name.
Desired Solution you'd like
There might be a better solution, but with the previous tool I made a few simple edits...
When using the API to push a document into a RAG database, I took the source URL of the document I was uploading and URL-encoded it, so that the special characters and slashes would not interfere with the JSON formatting.
When displaying the source name, I simply URL-decoded the source/file name. This worked well in the tool, because if you URL-decode something that is not URL-encoded, it just returns the same string.
The link should be clickable. When using an AI model as an information hub, it would be nice if the model could reference the original source via URL, since, depending on the user-provided context, the information may or may not be completely accurate.
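The encode-on-upload, decode-on-display idea above can be sketched roughly as follows. This is only an illustration: the function names `encodeSourceName` and `decodeSourceName` and the example URL are hypothetical and are not taken from the Open WebUI codebase. One caveat worth noting is that the built-in `decodeURIComponent` throws on a stray `%` (for example a file literally named `100%.txt`), so the sketch falls back to the raw string in that case.

```typescript
// Hypothetical helpers for illustration; not part of Open WebUI.

// Encode a source URL so it can be stored as a document name.
function encodeSourceName(sourceUrl: string): string {
  return encodeURIComponent(sourceUrl);
}

// Decode a stored name for display. Names that were never encoded
// come back unchanged; names with a malformed percent sequence make
// decodeURIComponent throw, so the raw name is returned instead.
function decodeSourceName(storedName: string): string {
  try {
    return decodeURIComponent(storedName);
  } catch {
    return storedName;
  }
}

// Example round trip with a made-up Confluence-style URL.
const exampleUrl = "https://wiki.example.com/display/DOCS/Setup Guide";
const storedName = encodeSourceName(exampleUrl);
console.log(storedName);
console.log(decodeSourceName(storedName) === exampleUrl);
```

Plain file names such as `something.txt` pass through `decodeSourceName` untouched, which is what makes it safe to apply the decode step to every source name.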
Alternatives Considered
I am playing around with the API and formatting to see what is possible.
Additional Context
No response
@icsy7867 commented on GitHub (Mar 10, 2025):
For my initial API test, URL encoding the filename works:
I think I could just edit the javascript here:
d7bfa395b0/src/lib/components/chat/Messages/Citations.svelte (L120)
and/or
d7bfa395b0/src/lib/components/chat/Messages/Citations.svelte (L160)
If I can just use JavaScript's decodeURIComponent, maybe they would work? I can try to build the container.
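As a rough sketch of what the display side could do with a decoded name, the helper below decides whether a citation should render as a link or as plain text. The name `toCitationLabel` and the `CitationLabel` shape are hypothetical, and this is not the code in `Citations.svelte`. Restricting links to `http:` and `https:` is an extra assumption made here so that a document name like `javascript:...` is never turned into a clickable link.

```typescript
// Hypothetical shape and helper for illustration; not Open WebUI code.
interface CitationLabel {
  label: string;       // text shown to the user
  href: string | null; // link target, or null for plain text
}

function toCitationLabel(storedName: string): CitationLabel {
  // Decode for display; keep the raw name if it is not valid
  // percent-encoding.
  let label = storedName;
  try {
    label = decodeURIComponent(storedName);
  } catch {
    // Malformed percent sequence: fall through with the raw name.
  }

  // Only absolute http(s) URLs become links; new URL() throws for
  // anything that is not an absolute URL, such as "something.txt".
  try {
    const url = new URL(label);
    if (url.protocol === "http:" || url.protocol === "https:") {
      return { label, href: url.href };
    }
  } catch {
    // Not a URL: render as plain text.
  }
  return { label, href: null };
}
```

A component could then render an anchor tag when `href` is non-null and the bare `label` otherwise.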
And this one...
d7bfa395b0/src/lib/components/chat/Messages/CitationsModal.svelte (L101)
Trying this out and building the container :D
Neat! Editing the last document worked for the modal! But the ID on the search did not change.
Whoops! Forgot one...
d7bfa395b0/src/lib/components/chat/Messages/Citations.svelte (L197)
This worked!
Now the only remaining place is here:
EDIT Found it...
d7bfa395b0/src/lib/components/chat/MessageInput/Commands/Knowledge.svelte (L213)
I believe the last piece is here:
d7bfa395b0/src/lib/components/common/FileItem.svelte (L85)
But I have to switch gears and will try tonight :D
EDIT
It appears that was the correct file! After selecting the file, it appears correctly now:
@linuxrrze commented on GitHub (Mar 11, 2025):
This made my day! Thank you for adding this feature!
I was also trying to commit external web scraping data into open-webui and found no way to make the citations show usable URLs.
@icsy7867 commented on GitHub (Mar 12, 2025):
Whoops... one more location. I can create another pull request, but it will most likely be tomorrow.
When a document is used as a reference, its name also needs to be decoded.
b03fc97e28/src/lib/components/chat/Messages/Markdown/Source.svelte (L47)