mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-07 11:28:35 -05:00
[GH-ISSUE #17845] issue: web search too slow / generating embeddings for 10.000+ chunks #18416
Originally created by @tfriedel on GitHub (Sep 28, 2025).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/17845
Check Existing Issues
Installation Method
Docker
Open WebUI Version
0.6.31
Ollama Version (if applicable)
No response
Operating System
Ubuntu 24.04
Browser (if applicable)
Chrome 140.0.7339.208
Confirmation
Expected Behavior
The response should take a relatively short amount of time, say <1 min. If a web page is long, unnecessary content such as JavaScript code, base64 encodings, etc. needs to be stripped so only text is used. If it's still too long, it needs to be shortened or rejected.
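A minimal sketch of the kind of cleanup described here, using only the Python standard library. This is not Open WebUI's actual loader code: `clean_page`, `MAX_CHARS`, and the 200-character threshold for base64-like runs are hypothetical names and values chosen for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical cap on characters kept per page (not an Open WebUI setting).
MAX_CHARS = 20_000


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script>, <style> and <noscript> contents."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


# Assumption: a run of 200+ base64-alphabet characters is an encoded blob, not prose.
_B64_RUN = re.compile(r"[A-Za-z0-9+/=]{200,}")


def clean_page(html, max_chars=MAX_CHARS):
    """Return visible text with scripts, styles and base64-like runs removed, truncated."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)
    text = _B64_RUN.sub(" ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text[:max_chars]
```

Truncating by characters is the simplest option. A real implementation might prefer to cut at a sentence or chunk boundary, or reject the page outright.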
Actual Behavior
The response took 8 minutes, with 15971 embeddings created.
Steps to Reproduce
Probably not easy to reproduce, but the web search that triggered this behaviour was ('google_pse', ['HELM Sycon competitor biostimulants comparison 2025', 'biostimulant product comparison amino acid phosphite seaweed extracts row crops', 'Sycon HELM independent trial results vs competitor biologicals corn soybean'])
Logs & Screenshots
Additional Information
Apparently all of the top 3 results were long PDFs; two of them failed to fetch, and the last one triggered this long embedding creation. There should be some way to exclude PDFs or very long documents and focus on shorter ones, or a fast way to extract the relevant info.
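The exclusion idea could look roughly like the sketch below, which decides before embedding whether a fetched result is worth processing. All names and limits (`should_embed`, `MAX_CHUNKS`, `CHUNK_SIZE`) are hypothetical and are not existing Open WebUI options. The chunk estimate assumes fixed-size character chunks with no overlap.

```python
from urllib.parse import urlparse

# Hypothetical budget values, chosen only for illustration.
MAX_CHUNKS = 500   # reject documents that would produce more chunks than this
CHUNK_SIZE = 1000  # assumed characters per chunk, ignoring overlap


def looks_like_pdf(url, content_type=""):
    """Guess whether a result is a PDF from its URL path or Content-Type header."""
    path = urlparse(url).path.lower()
    return path.endswith(".pdf") or content_type.lower().startswith("application/pdf")


def estimated_chunks(text, chunk_size=CHUNK_SIZE):
    """Ceiling of len(text) / chunk_size."""
    return -(-len(text) // chunk_size)


def should_embed(url, text, content_type="", max_chunks=MAX_CHUNKS):
    """Skip PDFs and any document whose estimated chunk count exceeds the budget."""
    if looks_like_pdf(url, content_type):
        return False
    return estimated_chunks(text) <= max_chunks
```

With these example limits, a document that would need 15971 chunks would be rejected up front instead of being embedded.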
@Classic298 commented on GitHub (Sep 28, 2025):
damn sounds like you hit a really large website 😁
yeah maybe there should be a limit to this, or truncating websites or similar.