Ollama embedding model Web Search function error #4024

Closed
opened 2025-11-11 15:44:41 -06:00 by GiteaMirror · 4 comments
Owner

Originally created by @so898 on GitHub (Feb 21, 2025).

Bug Report

Installation Method

Docker

Environment

  • Open WebUI Version: v0.5.16

  • Ollama (if applicable): v0.5.11

  • Operating System: macOS 15.2

  • Browser (if applicable): Chrome 131.0.6778.265

Confirmation:

  • [x] I have read and followed all the instructions provided in the README.md.
  • [x] I am on the latest version of both Open WebUI and Ollama.
  • [ ] I have included the browser console logs.
  • [x] I have included the Docker container logs.
  • [ ] I have provided the exact steps to reproduce the bug in the "Steps to Reproduce" section below.

Expected Behavior:

Web Search function with Ollama embedding model works.

Actual Behavior:

It does not work; the fetched webpages cannot be embedded.

Description

Bug Summary:

Something is wrong with the embedding step: the Ollama embedding request apparently cannot be sent, and the web search fails with 'NoneType' object is not iterable.

Reproduction Details

Steps to Reproduce:

Set up Ollama as the embedding engine in the document configuration.

Do a web search with an Ollama model.

Logs and Screenshots

Browser Console Logs:

None

Docker Container Logs:

INFO  [open_webui.routers.retrieval] Using token text splitter: cl100k_base
INFO  [open_webui.routers.retrieval] adding to collection web-search-e06df04cbea6beda1a8e296235f893edc5f5dc4a1c2490991709
ERROR [open_webui.routers.retrieval] 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/app/backend/open_webui/routers/retrieval.py", line 834, in save_docs_to_vector_db
    embeddings = embedding_function(
                 ^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/retrieval/utils.py", line 339, in <lambda>
    return lambda query, user=None: generate_multiple(query, user, func)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/retrieval/utils.py", line 332, in generate_multiple
    embeddings.extend(
TypeError: 'NoneType' object is not iterable
ERROR [open_webui.routers.retrieval] 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/app/backend/open_webui/routers/retrieval.py", line 1385, in process_web_search
    await run_in_threadpool(
  File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 37, in run_in_threadpool
    return await anyio.to_thread.run_sync(func)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2461, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 962, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/routers/retrieval.py", line 856, in save_docs_to_vector_db
    raise e
  File "/app/backend/open_webui/routers/retrieval.py", line 834, in save_docs_to_vector_db
    embeddings = embedding_function(
                 ^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/retrieval/utils.py", line 339, in <lambda>
    return lambda query, user=None: generate_multiple(query, user, func)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/retrieval/utils.py", line 332, in generate_multiple
    embeddings.extend(
TypeError: 'NoneType' object is not iterable
ERROR [open_webui.utils.middleware] 400: [ERROR: 'NoneType' object is not iterable]
Traceback (most recent call last):
  File "/app/backend/open_webui/routers/retrieval.py", line 1385, in process_web_search
    await run_in_threadpool(
  File "/usr/local/lib/python3.11/site-packages/starlette/concurrency.py", line 37, in run_in_threadpool
    return await anyio.to_thread.run_sync(func)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2461, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 962, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/routers/retrieval.py", line 856, in save_docs_to_vector_db
    raise e
  File "/app/backend/open_webui/routers/retrieval.py", line 834, in save_docs_to_vector_db
    embeddings = embedding_function(
                 ^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/retrieval/utils.py", line 339, in <lambda>
    return lambda query, user=None: generate_multiple(query, user, func)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/retrieval/utils.py", line 332, in generate_multiple
    embeddings.extend(
TypeError: 'NoneType' object is not iterable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/backend/open_webui/utils/middleware.py", line 340, in chat_web_search_handler
    results = await process_web_search(
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/backend/open_webui/routers/retrieval.py", line 1402, in process_web_search
    raise HTTPException(
fastapi.exceptions.HTTPException: 400: [ERROR: 'NoneType' object is not iterable]
DEBUG [open_webui.utils.middleware] tool_ids=None
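The traceback shows that `generate_multiple` in `retrieval/utils.py` calls `embeddings.extend(...)` on the return value of the per-batch embedding call, which is `None` when the Ollama request fails, producing the opaque `TypeError` instead of the real connection error. The following is a hypothetical sketch of a more defensive batch loop; the function name mirrors the traceback, but this is not the actual Open WebUI code:

```python
def generate_multiple(query, user, func, batch_size=1):
    """Embed a list of texts in batches, failing loudly if a batch returns None."""
    if isinstance(query, list):
        embeddings = []
        for i in range(0, len(query), batch_size):
            batch = func(query[i : i + batch_size], user=user)
            if batch is None:
                # Raise a descriptive error instead of letting
                # embeddings.extend(None) produce 'NoneType' object is not iterable
                raise RuntimeError(
                    f"embedding backend returned no result for batch starting at index {i}"
                )
            embeddings.extend(batch)
        return embeddings
    return func(query, user=user)
```

With this shape of check, a failed Ollama request would surface as a clear "backend returned no result" error rather than the `TypeError` seen in the logs above.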

Screenshots/Screen Recordings (if applicable):

None

Additional Information

This bug happens with v0.5.16. The previous version had a different problem with the web search function (it repeated the search when an error occurred).

However, with the new version web search is completely unusable.

Note

None.


@rgaricano commented on GitHub (Feb 21, 2025):

If you change to Ollama embedding, you need to specify a model for embedding
(and if the new model maps embeddings into the vector space differently than the previous one, you need to reset the database and re-embed all documents).

Yes, this is a handicap for production deployments;
maybe in the future it will be possible to use different vector spaces together, or to migrate between them.


@so898 commented on GitHub (Feb 21, 2025):

> If you change to Ollama embedding, you need to specify a model for embedding (and if the new model maps embeddings into the vector space differently than the previous one, you need to reset the database and re-embed all documents).
>
> Yes, this is a handicap for production deployments; maybe in the future it will be possible to use different vector spaces together, or to migrate between them.

Yes, and I have configured RAG with the Ollama embedding model from the beginning.

Here is my configuration for the RAG part:

"rag":
    {
        "pdf_extract_images": true,
        "youtube_loader_language":
        [
            "en"
        ],
        "youtube_loader_proxy_url": "",
        "enable_web_loader_ssl_verification": true,
        "web":
        {
            "search":
            {
                "enable": true,
                "engine": "searxng",
                "searxng_query_url": "http://192.168.0.146:4000/search?q=<query>",
                "google_pse_api_key": "",
                "google_pse_engine_id": "",
                "brave_search_api_key": "",
                "mojeek_search_api_key": "",
                "serpstack_api_key": "",
                "serpstack_https": true,
                "serper_api_key": "",
                "serply_api_key": "",
                "tavily_api_key": "",
                "searchapi_api_key": "",
                "searchapi_engine": "",
                "jina_api_key": "",
                "bing_search_v7_endpoint": "https://api.bing.microsoft.com/v7.0/search",
                "bing_search_v7_subscription_key": "",
                "result_count": 5,
                "concurrent_requests": 5,
                "kagi_search_api_key": "",
                "exa_api_key": "",
                "bocha_search_api_key": "",
                "domain":
                {
                    "filter_list":
                    []
                },
                "serpapi_api_key": "",
                "serpapi_engine": "",
                "full_context": false,
                "trust_env": null
            }
        },
        "template": "### Task:\nRespond to the user query using the provided context, incorporating inline citations in the format [source_id] **only when the <source_id> tag is explicitly provided** in the context.\n\n### Guidelines:\n- If you don't know the answer, clearly state that.\n- If uncertain, ask the user for clarification.\n- Respond in the same language as the user's query.\n- If the context is unreadable or of poor quality, inform the user and provide the best possible answer.\n- If the answer isn't present in the context but you possess the knowledge, explain this to the user and provide the answer using your own understanding.\n- **Only include inline citations using [source_id] when a <source_id> tag is explicitly provided in the context.**  \n- Do not cite if the <source_id> tag is not provided in the context.  \n- Do not use XML tags in your response.\n- Ensure citations are concise and directly related to the information provided.\n\n### Example of Citation:\nIf the user asks about a specific topic and the information is found in \"whitepaper.pdf\" with a provided <source_id>, the response should include the citation like so:  \n* \"According to the study, the proposed method increases efficiency by 20% [whitepaper.pdf].\"\nIf no <source_id> is present, the response should omit the citation.\n\n### Output:\nProvide a clear and direct response to the user's query, including inline citations in the format [source_id] only when the <source_id> tag is present in the context.\n\n<context>\n{{CONTEXT}}\n</context>\n\n<user_query>\n{{QUERY}}\n</user_query>\n",
        "top_k": 5,
        "relevance_threshold": 0.5,
        "enable_hybrid_search": false,
        "embedding_engine": "ollama",
        "embedding_model": "nomic-embed-text:latest",
        "openai_api_base_url": "https://api.openai.com/v1",
        "openai_api_key": "",
        "ollama":
        {
            "url": "http://192.168.0.55:11434",
            "key": ""
        },
        "embedding_batch_size": 1,
        "reranking_model": "",
        "file":
        {
            "max_size": null,
            "max_count": null
        },
        "CONTENT_EXTRACTION_ENGINE": "",
        "tika_server_url": "http://tika:9998",
        "text_splitter": "token",
        "chunk_size": 2048,
        "chunk_overlap": 128,
        "full_context": false
    }
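Given this config, one way to rule out connectivity problems is to query the configured Ollama server directly from inside the Open WebUI container. The following is a minimal sketch using only the standard library; it assumes Ollama's `/api/embeddings` endpoint and reuses the URL and model from the config above:

```python
import json
import urllib.request

OLLAMA_URL = "http://192.168.0.55:11434"  # value from the config above

def build_embed_request(text, model="nomic-embed-text:latest", base_url=OLLAMA_URL):
    """Build (but do not send) a request against Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode()
    return urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def embed_one(text, **kwargs):
    """Send the request and return the embedding vector, raising on HTTP errors."""
    with urllib.request.urlopen(build_embed_request(text, **kwargs), timeout=10) as resp:
        return json.loads(resp.read())["embedding"]
```

If `embed_one("hello")` raises a connection or HTTP error here, the failure is in reaching Ollama rather than in Open WebUI's web search pipeline.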

@sir3mat commented on GitHub (Feb 21, 2025):

Same with OpenAI:

open_webui.routers.retrieval] adding to collection file-4a723aa4-3146-4b0b-89cb-a4593b5f7d24
Invalid URL 'http:/172.18.21.137:80/embeddings': No host supplied
2025-02-21T15:59:42.292578158Z ERROR [open_webui.routers.retrieval] 'NoneType' object is not iterable
2025-02-21T15:59:42.292589868Z Traceback (most recent call last):
  File "/app/backend/open_webui/routers/retrieval.py", line 834, in save_docs_to_vector_db
    embeddings = embedding_function(
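The "Invalid URL 'http:/172.18.21.137:80/embeddings': No host supplied" line in this log points at a base URL that lost one slash after the scheme, which would make the embedding call fail and trigger the same `'NoneType' object is not iterable` traceback. A quick way to see how Python parses such a URL:

```python
from urllib.parse import urlparse

# A base URL with a single slash after the scheme parses with an empty host,
# which is exactly the "No host supplied" error the HTTP client reports.
bad = urlparse("http:/172.18.21.137:80/embeddings")
good = urlparse("http://172.18.21.137:80/embeddings")
print(repr(bad.netloc))   # → '' (no host)
print(repr(good.netloc))  # → '172.18.21.137:80'
```

So in this case the fix is likely correcting the configured OpenAI-compatible base URL rather than anything in the embedding code itself.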


@tjbck commented on GitHub (Feb 21, 2025):

Unable to reproduce with the following settings; I'd recommend you double-check that everything is configured correctly!

https://github.com/user-attachments/assets/40f06e65-0c0e-4d5f-afda-5a0eb6b57939

https://github.com/user-attachments/assets/e05dd137-7c02-4b6f-8f89-106d3e2c29d5

Reference: github-starred/open-webui#4024