issue: WebLoader unable to handle non-html links #5861

Closed
opened 2025-11-11 16:36:15 -06:00 by GiteaMirror · 1 comment

Originally created by @ifgreulich on GitHub (Jul 24, 2025).

Check Existing Issues

  • I have searched the existing issues and discussions.
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.6.18

Ollama Version (if applicable)

0.9.6 but not relevant

Operating System

Ubuntu 24.04

Browser (if applicable)

not relevant

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
      • Start with the initial platform/version/OS and dependencies used,
      • Specify exact install/launch/configure commands,
      • List URLs visited, user input (incl. example values/emails/passwords if needed),
      • Describe all options and toggles enabled or changed,
      • Include any files or environmental changes,
      • Identify the expected and actual result at each stage,
      • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When Web Search is configured with Google PSE, Open WebUI should be able to handle every link the search engine returns in its result list, even when a link does not point to an HTML document. For example, if a link points to a PDF document, the document should be downloaded and processed (e.g. with Apache Tika), and its content used as if it were the content of an HTML page.
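
To illustrate the behavior described above, here is a minimal sketch of the dispatch step only: deciding, from the response's Content-Type header and the URL, whether a link should go to the HTML loader or to a document extractor. The names `pick_handler` and `HANDLERS` are hypothetical and are not part of Open WebUI; the actual download and the extraction step (for instance through Apache Tika) are deliberately left out.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from MIME type to a handler name.
# This is an illustration, not Open WebUI's actual loader API.
HANDLERS = {
    "text/html": "html",
    "application/xhtml+xml": "html",
    "application/pdf": "document",
}


def pick_handler(url: str, content_type: Optional[str]) -> str:
    """Return "html", "document", or "skip" for a search-result link."""
    if content_type:
        # Drop parameters such as "; charset=utf-8" and normalise case.
        mime = content_type.split(";", 1)[0].strip().lower()
        if mime in HANDLERS:
            return HANDLERS[mime]

    # Fall back to the file extension when the header is missing
    # or names a type we do not recognise.
    path = urlparse(url).path.lower()
    if path.endswith(".pdf"):
        return "document"
    if path.endswith((".html", ".htm")) or not content_type:
        # With no header at all, assume an ordinary web page.
        return "html"
    return "skip"
```

With a step like this, a PDF link would be routed to document extraction instead of being treated as an HTML page, and unsupported types could be skipped rather than failing the whole search.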

Actual Behavior

Currently, search-result links that point to a PDF document (at least PDF; other formats were not tested) fail to load: the request runs into a timeout (see attached logs) and an error message appears in the web UI.

Steps to Reproduce

  1. Configure the web loader, the search engine, and the AI model (Google- or Ollama-based).
  2. Configure the browser to prefer German as its language. (This only makes search results that link to PDF documents more likely.)
  3. Prompt the model with “Explain integral calculus to me”, switch on Web Search using the button below the prompt, and run the request.
  4. Wait for the answer or the error message to appear.

Logs & Screenshots

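Open-Webui.log: https://github.com/user-attachments/files/21413861/Open-Webui.log

Screenshots:
https://github.com/user-attachments/assets/2bc7ce6d-91b3-4585-bf3c-23687b2998be
https://github.com/user-attachments/assets/7e8f7c27-5555-426d-b647-a94a25b003ee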

Additional Information

No response

GiteaMirror added the bug label 2025-11-11 16:36:15 -06:00

@tjbck commented on GitHub (Jul 24, 2025):

This has nothing to do with webloader and has to do with your reverse proxy timeout config.


Reference: github-starred/open-webui#5861