[GH-ISSUE #1069] bug: [RAG] v0.1.109 Breaks PDF upload #12322

Closed
opened 2026-04-19 19:13:33 -05:00 by GiteaMirror · 8 comments

Originally created by @justinh-rahb on GitHub (Mar 7, 2024).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/1069

Bug Report

Description

Bug Summary:
After updating Open WebUI using Docker, the PDF functionality began to produce an error indicating that the rapidocr-onnxruntime package is not found, even though the package appears to be installed according to the Docker build logs.

Steps to Reproduce:

  1. Run the Docker update command for Open WebUI:
    docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui
    
  2. Attempt to process a PDF within the Open WebUI environment.

Expected Behavior:
PDFs should process without any issues as they did prior to the update.

Actual Behavior:
An error message is displayed suggesting the rapidocr-onnxruntime package is not installed, contradicting the Docker build logs where the package shows as successfully installed.

Environment

  • Operating System: Mac
  • Browser (if applicable): Chrome

Reproduction Details

Confirmation:

  • [*] I have confirmed that I am on the latest version of both Open WebUI and any relevant dependencies.
  • [ ] I have included the browser console logs if available and relevant.
  • [*] I have included the Docker container logs.
  • [*] I have attempted to install the rapidocr-onnxruntime package directly on the container console, which indicates the package is already installed.

Logs and Screenshots

Docker Container Logs:

INFO:     142.115.145.66:0 - "GET /litellm/api/model/info HTTP/1.1" 200 OK
INFO:     142.115.145.66:0 - "GET /ollama/api/version HTTP/1.1" 200 OK
INFO:     142.115.145.66:0 - "GET /api/version/updates HTTP/1.1" 200 OK
application/pdf
`rapidocr-onnxruntime` package not found, please install it with `pip install rapidocr-onnxruntime`
INFO:     142.115.145.66:0 - "POST /rag/api/v1/doc HTTP/1.1" 400 Bad Request
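
Screenshots:

  • Screenshot 2024-03-06 at 6 49 42 PM: https://github.com/open-webui/open-webui/assets/52832301/60a9c62b-a83c-418d-a7a0-52678c3db3e9
  • Screenshot 2024-03-06 at 6 53 42 PM: https://github.com/open-webui/open-webui/assets/52832301/e42baf76-329d-4c0a-95fb-3cd95a91ccdb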

Installation Method

Installed via Docker.

Additional Information

The error appeared post-update, suggesting a possible issue with the Docker image's dependency management or a change in the way dependencies are recognized. This could be due to a version mismatch or an environment path issue within the container.
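
To tell these two possibilities apart, a small diagnostic can be run with the same Python interpreter that serves the app inside the container. This is only a minimal sketch, not taken from the Open WebUI code base: the helper name `diagnose` is hypothetical, and it assumes the distribution `rapidocr-onnxruntime` is imported under the module name `rapidocr_onnxruntime`. It reports separately whether the package metadata is visible and whether the import actually succeeds, because a failure inside a transitive dependency can also surface as an ImportError and be reported as "package not found".

```python
import importlib
import importlib.metadata
import sys


def diagnose(dist_name: str, module_name: str) -> dict:
    """Report whether a distribution is installed and whether its module imports."""
    report = {
        "python": sys.executable,      # which interpreter is answering
        "installed_version": None,     # stays None if metadata is not found
        "import_error": None,          # stays None if the import succeeds
    }

    # Is the distribution visible to *this* interpreter's package metadata?
    try:
        report["installed_version"] = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        pass

    # Does the import succeed? Keep the original exception text, since a
    # failure in a transitive dependency also surfaces during import.
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # broad on purpose: this is a diagnostic
        report["import_error"] = f"{type(exc).__name__}: {exc}"

    return report


if __name__ == "__main__":
    # Assumed names: distribution "rapidocr-onnxruntime",
    # import module "rapidocr_onnxruntime".
    result = diagnose("rapidocr-onnxruntime", "rapidocr_onnxruntime")
    for key, value in result.items():
        print(f"{key}: {value}")
```

If `installed_version` is set but `import_error` is not empty, the package is present and the failure lies elsewhere (for example in one of its dependencies); if both are empty of a version, the interpreter in use is probably not the one the package was installed into.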

Note

If this bug report does not meet the required standards or lacks necessary information, it may delay the resolution of the issue. Please ensure that all sections are completed in accordance with the guidelines provided in the README.md and troubleshooting.md documents. Thank you for your cooperation.


@justinh-rahb commented on GitHub (Mar 7, 2024):

Tagging @jannikstdl


@tjbck commented on GitHub (Mar 7, 2024):

Found this: https://github.com/langchain-ai/langchain/issues/15576

We should probably remove the OCR feature on docker :/


@tjbck commented on GitHub (Mar 7, 2024):

Reverted the change for now. We might want to investigate further how we can get this to work in the Docker environment.


@justinh-rahb commented on GitHub (Mar 7, 2024):

@tjbck do you want to keep this issue open for an eventual fix or open a new one for that?


@tjbck commented on GitHub (Mar 7, 2024):

let's keep it open!


@jannikstdl commented on GitHub (Mar 7, 2024):

Sorry, just woke up.

Weird that this worked in my local environment but not with the Docker image. So basically, some tools needed to run this were not installed on our base image?


@tjbck commented on GitHub (Mar 7, 2024):

@jannikstdl no worries! fixed with the latest release :)


@jcrosasm-IA commented on GitHub (Jul 7, 2024):

Hello, it seems there is still a similar problem with the latest Open WebUI version in the Docker image. When I load a PDF file from the chat window, it apparently cannot be read; I assume this is because the document was not vectorized. Both the local Llama 3 models and GPT through the API respond that "they cannot access the content or information of the document" (these are for RAG use). What am I doing wrong? In previous versions I didn't have this problem. I hope you will consider this query. Thank you.


Reference: github-starred/open-webui#12322