[GH-ISSUE #7732] Serious problem with PDF files
Originally created by @mazierovictor on GitHub (Dec 9, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/7732
Installation Method
I am using a Docker installation.
Environment
Open WebUI Version: v0.4.8
Operating System: Ubuntu 24.04
Expected Behavior:
When attaching a PDF file directly to the chat, I need the file to be processed in its entirety. In other words, once the upload completes, the system should extract all of the text from the PDF, store it as a single string, and wait for the user's command. The LLM would then handle that command using everything extracted from the file (see the sketch after the example below).
Example:
I have a general-purpose model and I need it to summarize a PDF of a scientific article. So I upload the file, ask it to summarize the article from beginning to end, and after processing I get my summarized text.
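For clarity, here is a minimal sketch of the flow I am describing, assuming the pypdf library and an OpenAI-compatible chat endpoint; the endpoint URL and model name are placeholders, not Open WebUI internals:

```python
# Sketch of the expected "whole file" flow (placeholder endpoint/model names).
import requests
from pypdf import PdfReader

def extract_full_text(pdf_path: str) -> str:
    """Extract the text of every page so nothing is dropped."""
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)

def summarize(pdf_path: str, command: str) -> str:
    """Send the user's command plus the entire extracted text in one prompt."""
    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",  # placeholder URL
        json={
            "model": "my-general-model",  # placeholder model name
            "messages": [
                {"role": "user",
                 "content": f"{command}\n\n{extract_full_text(pdf_path)}"},
            ],
        },
        timeout=120,
    )
    return resp.json()["choices"][0]["message"]["content"]

print(summarize("article.pdf", "Summarize this article from beginning to end."))
```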
Actual Behavior:
What actually happens is that when I upload a PDF file, Open WebUI treats it as RAG input, so if I ask it to summarize the file, it does not summarize it in its entirety, which breaks any attempt to interact with a whole file.
After all, if I really wanted RAG, I would use the knowledge base for that.
Bug Summary:
The bug is essentially a confusion in how files are processed. If I want to build, for example, a ChatPDF-style template with Open WebUI, I can't: when I upload a file, the system extracts its information incompletely, treating it as if it were meant for RAG and segmenting it into chunks. This differs from the expected chat behavior, which would be to process the PDF, extract its content, and send everything extracted to the LLM so it can answer the user's request. The toy example below shows why the RAG path truncates summaries.
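This is not Open WebUI's actual code, just a toy illustration (assuming a standard chunk-and-retrieve RAG pipeline) of why retrieval-style handling cannot summarize a whole document: most chunks are discarded before the LLM ever sees them.

```python
# Toy chunk-and-retrieve pipeline (NOT Open WebUI's real implementation).
def chunk(text: str, size: int = 1000) -> list[str]:
    """Split the document into fixed-size chunks, as RAG pipelines do."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def retrieve_top_k(chunks: list[str], query: str, k: int = 4) -> list[str]:
    """Stand-in for embedding similarity search: keep only the k chunks
    that look most relevant to the query; everything else is dropped."""
    score = lambda c: sum(word in c for word in query.lower().split())
    return sorted(chunks, key=score, reverse=True)[:k]

article = "Introduction ... Methods ... Results ... Conclusion " * 500
context = retrieve_top_k(chunk(article), "summarize this article")
# The prompt is built from `context` only, i.e. at most k chunks, so a
# request to "summarize from beginning to end" can cover only a fraction
# of the file, no matter how good the model is.
```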
Reproduction Details
Steps to Reproduce:
1. Register a model without any extra configuration or system prompt.
2. In the chat, upload a PDF file of any scientific article.
3. Ask for that article to be summarized.
4. Wait for the result.