mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #1069] bug: [RAG] v0.1.109 Breaks PDF upload #12322
Originally created by @justinh-rahb on GitHub (Mar 7, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1069
Bug Report
Description
Bug Summary:
After updating Open WebUI using Docker, the PDF functionality began to produce an error indicating that the rapidocr-onnxruntime package is not found, even though the package appears to be installed according to the Docker build logs.
Steps to Reproduce:
Expected Behavior:
PDFs should process without any issues as they did prior to the update.
Actual Behavior:
An error message is displayed suggesting the rapidocr-onnxruntime package is not installed, contradicting the Docker build logs where the package shows as successfully installed.
Environment
Reproduction Details
Confirmation:
rapidocr-onnxruntime package directly on the container console, which indicates the package is already installed.
Logs and Screenshots
Docker Container Logs:
Installation Method
Installed via Docker.
Additional Information
The error appeared post-update, suggesting a possible issue with the Docker image's dependency management or a change in the way dependencies are recognized. This could be due to a version mismatch or an environment path issue within the container.
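One way to tell these two causes apart is to check, inside the container, whether the distribution is registered with pip and whether its module actually imports. The sketch below is only an illustration, not code from Open WebUI: the helper name diagnose is hypothetical, and the import name rapidocr_onnxruntime is an assumption about how the rapidocr-onnxruntime package exposes its module. If the distribution has a version but the import fails, the captured exception text may point at a missing system library or path problem rather than a missing package.

```python
import importlib
import importlib.metadata


def diagnose(dist_name: str, module_name: str) -> dict:
    """Report whether a pip distribution is installed and whether its module imports.

    `diagnose` is a hypothetical helper written for this illustration.
    """
    report = {"dist_version": None, "import_ok": False, "import_error": None}

    # Step 1: is the distribution known to the package metadata (what pip sees)?
    try:
        report["dist_version"] = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        pass

    # Step 2: does the module actually import in this interpreter?
    # A broad except is deliberate: a failed import can surface as ImportError,
    # OSError (e.g. a shared library that cannot be loaded), or something else.
    try:
        importlib.import_module(module_name)
        report["import_ok"] = True
    except Exception as exc:
        report["import_error"] = f"{type(exc).__name__}: {exc}"

    return report


if __name__ == "__main__":
    # Assumed names: the pip distribution and its import name.
    print(diagnose("rapidocr-onnxruntime", "rapidocr_onnxruntime"))
```

Running this with the same Python interpreter the backend uses matters, since a package installed for a different interpreter or virtual environment would show up as missing in both steps.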
Note
If this bug report does not meet the required standards or lacks necessary information, it may delay the resolution of the issue. Please ensure that all sections are completed in accordance with the guidelines provided in the README.md and troubleshooting.md documents. Thank you for your cooperation.
@justinh-rahb commented on GitHub (Mar 7, 2024):
Tagging @jannikstdl
@tjbck commented on GitHub (Mar 7, 2024):
Found this: https://github.com/langchain-ai/langchain/issues/15576
We should probably remove the OCR feature on docker :/
@tjbck commented on GitHub (Mar 7, 2024):
Reverted the change for now, we might want to investigate more on how we can get this to work on docker env.
@justinh-rahb commented on GitHub (Mar 7, 2024):
@tjbck do you want to keep this issue open for an eventual fix or open a new one for that?
@tjbck commented on GitHub (Mar 7, 2024):
let's keep it open!
@jannikstdl commented on GitHub (Mar 7, 2024):
Sorry, just woke up.
Weird that this worked in my local environment but not with the Docker image. So basically, some tools needed to run this were not installed on our base image?
@tjbck commented on GitHub (Mar 7, 2024):
@jannikstdl no worries! fixed with the latest release :)
@jcrosasm-IA commented on GitHub (Jul 7, 2024):
Hello, it seems there is still a similar problem with the latest Open WebUI version in the Docker image. When I load a PDF file from the chat window, it apparently cannot be read; I assume this is because the document was not vectorized. Both the local Llama 3 models and GPT through the API respond that "they cannot access the content or information of the document" (these are for RAG use). What am I doing wrong? In previous versions I didn't have this problem. I hope you will consider this query. Thank you.