[GH-ISSUE #19734] issue: Firecrawl web loader times out after 3 seconds (ignoring environment variables) #57640

Closed
opened 2026-05-05 21:17:04 -05:00 by GiteaMirror · 9 comments

Originally created by @Sorkai on GitHub (Dec 4, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/19734

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.6.41

Ollama Version (if applicable)

No response

Operating System

Windows 11 Version 25H2 (Build 26200.7309)

Browser (if applicable)

Microsoft Edge 142.0.3595.94

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

Web Loader (Firecrawl): The firecrawl loader should respect the environment variable timeouts (e.g., RAG_WEB_SEARCH_TIMEOUT) or have a reasonable default > 3 seconds.

Actual Behavior

When using the firecrawl Web Loader Engine, scrape jobs consistently fail with a TimeoutError after exactly 3 seconds. This timeout duration is too short for most real-world web pages. I have attempted to increase the timeout using environment variables (e.g., RAG_WEB_SEARCH_TIMEOUT=60), but the 3-second limit persists, suggesting it might be hardcoded or the configuration is not being applied to the Firecrawl client.

Steps to Reproduce

  1. Go to Settings -> Documents (or Web Search).
  2. Set Web Loader Engine to firecrawl.
  3. Trigger a web search or add a URL to documents.
  4. The operation fails with a TimeoutError after about 3 seconds.

Logs & Screenshots

File "/app/backend/open_webui/retrieval/web/utils.py", line 250, in lazy_load
    result = firecrawl.batch_scrape(
File "/usr/local/lib/python3.11/site-packages/firecrawl/v2/methods/batch.py", line 272, in wait_for_batch_completion
    raise TimeoutError(f"Batch scrape job {job_id} did not complete within {timeout} seconds")
TimeoutError: Batch scrape job 019ae795... did not complete within 3 seconds

1Panel-ollama-webui-9dLT-20251204122843.log (https://github.com/user-attachments/files/23923086/1Panel-ollama-webui-9dLT-20251204122843.log)

Additional Information

I attempted to set RAG_WEB_SEARCH_TIMEOUT=60 in environment variables, but the loader persisted with a 3-second timeout.
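
For illustration only, the expected behavior could look roughly like the sketch below: a timeout read from an environment variable, with a fallback default well above 3 seconds. The helper name `read_timeout_seconds` and the 60-second default are hypothetical, this is not Open WebUI's actual code, and whether Open WebUI reads `RAG_WEB_SEARCH_TIMEOUT` at all is not confirmed in this report.

```python
import os


def read_timeout_seconds(name: str, default: float = 60.0) -> float:
    """Return a positive timeout from the environment, else the default.

    Hypothetical helper: the name and the default are assumptions
    made for this example, not part of Open WebUI.
    """
    raw = os.environ.get(name, "").strip()
    try:
        value = float(raw)
    except ValueError:
        # Unset, empty, or non-numeric values fall back to the default.
        return default
    # Zero, negative, and NaN values are rejected as well.
    return value if value > 0 else default


if __name__ == "__main__":
    os.environ["RAG_WEB_SEARCH_TIMEOUT"] = "60"
    print(read_timeout_seconds("RAG_WEB_SEARCH_TIMEOUT"))
```

The point of the fallback is that a missing or malformed variable should never silently produce a very short timeout.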

GiteaMirror added the bug label 2026-05-05 21:17:04 -05:00

@owui-terminator[bot] commented on GitHub (Dec 4, 2025):

🔍 Similar Issues Found

I found some existing issues that might be related to this one. Please check if any of these are duplicates or contain helpful solutions:

  1. #13169 issue: Unable to get self-hosted Firecrawl Web Loader Engine to work
    by MikeNatC • Apr 23, 2025 • bug

  2. #14746 issue: Bypass Web Loader in Web Search not working
    by williamgateszhao • Jun 07, 2025 • bug

  3. #19085 issue: Chat UI loads forever instead of showing error
    by TamKej • Nov 10, 2025 • bug

  4. #15375 issue: When my Open-WebUI generation exits directly without interruption midway, re-entering the dialog box will result in a loop on the loading screen.
    by Frost2002 • Jun 28, 2025 • bug

  5. #16747 issue: API Timeout after 100 seconds with Long-Running Tools (e.g., Web Search)
    by ldpg-dev • Aug 20, 2025 • bug

  6. #11528 issue: "Loading" state persists when using minicpm-v in OpenWebUI
    by ckappgit • Mar 11, 2025 • bug

  7. #16900 issue: Long running MCP tool calls result in timeout
    by dionny • Aug 25, 2025 • bug

  8. #18973 issue: cannot import name 'Firecrawl' from 'firecrawl'
    by gwolf2u • Nov 06, 2025 • bug

  9. #16974 issue: chat loads forever without user feedback on error
    by johnnyasantoss • Aug 27, 2025 • bug

  10. #19438 issue: Icon loading regression
    by JoelShepard • Nov 24, 2025 • bug


💡 Tips:

  • If this is a duplicate, please consider closing this issue and adding any additional details to the existing one
  • If you found a solution in any of these issues, please share it here to help others

This comment was generated automatically by a bot. Please react with a 👍 if this comment was helpful, or a 👎 if it was not.


@borisboc commented on GitHub (Dec 7, 2025):

Hello. Indeed, I don't see such a timeout variable for Firecrawl. However, there is one for Playwright (see PLAYWRIGHT_TIMEOUT). My PR #19804 adds a timeout for the default safe_web loader; it will be called SAFE_WEBLOADER_TIMEOUT.


@Sorkai commented on GitHub (Dec 8, 2025):

Thank you for your response. I saw your PR and greatly appreciate your fixes. Looking forward to the code being merged and deployed.


@borisboc commented on GitHub (Dec 8, 2025):

> Thank you for your response. I saw your PR and greatly appreciate your fixes. Looking forward to the code being merged and deployed.

It is now in the dev branch: b02397e460ec56b4b74146508eeab2a3ba13950e


@Classic298 commented on GitHub (Dec 11, 2025):

@borisboc can this be closed?


@borisboc commented on GitHub (Dec 11, 2025):

> @borisboc can this be closed?

Sorry: the merged changes were made for SafeWebBaseLoader (meaning WEB_LOADER_ENGINE.value == "safe_web") but NOT for SafeFireCrawlLoader (meaning WEB_LOADER_ENGINE.value == "firecrawl"). So I guess the issue cannot be closed. An alternative for @Sorkai is to use "safe_web" rather than "firecrawl", since safe_web now has a timeout. But, aside from the timeout, the two web loaders are fairly different, and I am not sure @Sorkai is OK with using "safe_web" rather than "firecrawl".
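
To summarise the state described in this thread at that point, here is a rough sketch of which engine has a timeout variable. The variable names come from the comments above; the table itself, the function name `timeout_env_for`, and the engine key "playwright" are assumptions made for this example and do not reflect how Open WebUI is actually structured.

```python
from typing import Optional

# Engine name -> timeout environment variable, as described in this thread.
# None means no configurable timeout had been reported for that engine yet.
# The "playwright" key is an assumed engine name for illustration.
TIMEOUT_ENV_BY_ENGINE: dict[str, Optional[str]] = {
    "safe_web": "SAFE_WEBLOADER_TIMEOUT",
    "playwright": "PLAYWRIGHT_TIMEOUT",
    "firecrawl": None,
}


def timeout_env_for(engine: str) -> Optional[str]:
    """Return the timeout variable for an engine, or None if it has none."""
    if engine not in TIMEOUT_ENV_BY_ENGINE:
        raise ValueError(f"unknown web loader engine: {engine!r}")
    return TIMEOUT_ENV_BY_ENGINE[engine]


if __name__ == "__main__":
    for name in TIMEOUT_ENV_BY_ENGINE:
        print(name, "->", timeout_env_for(name))
```

Seen this way, switching to "safe_web" only works around the missing entry for "firecrawl"; it does not fix it.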


@Classic298 commented on GitHub (Dec 14, 2025):

PR welcome


@borisboc commented on GitHub (Dec 15, 2025):

Hello all. This should be fixed in PR #19964 (commit 5f6baf0a866d7fa6338027470cc2883d67122988). Please note, @Sorkai, that the timeout is in milliseconds, not seconds. Formerly, in the main code, the timeout was 3*count (count being the number of different requests to send, 3 by default), so it defaulted to 9 ms. However, in my small tests (searching for the price of the NVIDIA DGX Spark), even a 1 ms timeout was "working".
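
The arithmetic in this comment can be written out as a small sketch. Both function names are hypothetical, and the claim that the value is read as milliseconds is taken from the comment above, not verified here (the traceback in the original report words the same limit in seconds).

```python
def legacy_timeout(count: int = 3) -> int:
    """Value the old code passed, per the comment above: 3 * count."""
    return 3 * count


def seconds_to_milliseconds(seconds: float) -> int:
    """Convert a timeout in seconds to whole milliseconds."""
    return round(seconds * 1000)


if __name__ == "__main__":
    # With the default count of 3 the old value is 9, read as 9 ms.
    print(legacy_timeout())
    # A 60-second budget expressed in the unit the comment describes.
    print(seconds_to_milliseconds(60))
```

Naming the unit in the conversion function makes it harder to pass a value meant as seconds where milliseconds are expected.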


@silentoplayz commented on GitHub (Jan 18, 2026):

I'm pretty sure this can be closed with https://github.com/open-webui/open-webui/commit/89ad1c68d1aadf849960b5e202aa4651096b05f5.

Reference: github-starred/open-webui#57640