[GH-ISSUE #22181] issue: v0.8.8 - RAG not triggered for custom model with attached Knowledge Base #35181

Closed
opened 2026-04-25 09:25:12 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @CallSohail on GitHub (Mar 3, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/22181

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

v0.8.8-cuda

Ollama Version (if applicable)

0.15.6

Operating System

Debian 12

Browser (if applicable)

Chrome (latest)

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When a Knowledge Base is permanently attached to a custom model via Model Settings (Workspace → Models → Edit → Knowledge), asking questions should automatically trigger the full RAG pipeline with hybrid search.

This should work the same way as manually attaching the KB using # in the chat area.

Actual Behavior

RAG is NOT triggered when KB is attached via Model Settings.
RAG IS triggered when same KB is attached via # in chat.

Same model, same KB, same question - different behavior depending on HOW the KB is attached.

Steps to Reproduce

  1. OpenWebUI v0.8.8-cuda, PostgreSQL 16

  2. Create Knowledge Base:

    • Workspace → Knowledge → Create "Test KB"
    • Upload PDF documents
    • Verify processing complete
  3. Create Custom Model:

    • Workspace → Models → New Model
    • Name: "TestModel"
    • Base Model: any model (e.g., gpt-oss-20b)
    • Knowledge: Attach "Test KB" (Collection)
    • Save
  4. Test 1 - KB attached via Model Settings:

    • Start new chat with "TestModel"
    • Ask: "Question about document content"
    • Check logs: docker logs open-webui 2>&1 | grep -E "hybrid_search|query_doc"
    • Result: NO hybrid_search logs, NO query_doc logs
  5. Test 2 - KB attached via # in chat:

    • Same model "TestModel"
    • In chat, type # and select same "Test KB"
    • Ask same question
    • Check logs
    • Result: hybrid_search triggered, documents found with scores

Logs & Screenshots

Test 1 - KB attached via Model Settings (NOT WORKING):

Logs show only basic API calls, NO retrieval:

POST /api/v1/chats/new HTTP/1.1" 200
POST /api/chat/completions HTTP/1.1" 200
POST /api/chat/completed HTTP/1.1" 200

No query_doc, no hybrid_search - RAG completely skipped.

Image

Test 2 - KB attached via # in chat (WORKING):

Logs show full hybrid search pipeline:

Starting hybrid search for 3 queries in 1 collections...
query_doc_with_hybrid_search:result [[{
  'name': 'Document1.pdf',
  'score': 0.9941
}, {
  'name': 'Document2.pdf', 
  'score': 0.5976
}...]]

Hybrid search triggered, reranker applied, scores calculated.

Image

Additional Information

Configuration:

  • Embedding: snowflake-arctic-embed2:latest (Ollama)
  • Hybrid Search: enabled
  • Reranker: BAAI/bge-reranker-v2-m3
  • Database: PostgreSQL 16

The KB is correctly attached in Model Settings (visible in UI).
Same KB, same model, same question - only difference is attachment method.

Attaching via Model Settings should auto-inject RAG context like # does.
This appears to be a regression or missing feature in v0.8.8.

GiteaMirror added the bug label 2026-04-25 09:25:12 -05:00
Author
Owner

@druellan commented on GitHub (Mar 3, 2026):

I've had the same issue since 0.8.x.

I tested with a new model and a new knowledge base, all of them public or sharing the same access level. The KB search never triggers.
In my case I'm not using hybrid search, no reranker, Qdrant as the vector store, but I am using PostgreSQL 17.

I can't reproduce it in a fresh install with SQLite, so my current hypothesis is a mismatch in the permissions system, or something not correctly migrated in the PostgreSQL database.

Author
Owner

@Classic298 commented on GitHub (Mar 3, 2026):

@CallSohail in your first screenshot, the model didn't even use the builtin tools to query the knowledge base.

@druellan native or non-native tool calling? I cannot reproduce this. Steps to reproduce?

Author
Owner

@Classic298 commented on GitHub (Mar 3, 2026):

how can i reproduce this?

Author
Owner

@Classic298 commented on GitHub (Mar 3, 2026):

@CallSohail A few questions to narrow this down:

  1. Is native function calling enabled for this model?
  2. Can you share the logs with GLOBAL_LOG_LEVEL=DEBUG? The default log level hides the relevant code paths. Set this env var, reproduce the issue, and share the full output.
  3. Are you using PostgreSQL or SQLite?

@druellan Same questions — could you also share DEBUG-level logs when reproducing? Your observation that it works on SQLite but not PostgreSQL is very helpful.

The code path for model-attached KBs (non-native FC) does add the KB items to the retrieval pipeline and should trigger hybrid search. But several error handlers in that path swallow exceptions silently at non-DEBUG log levels, so we need DEBUG logs to see where the flow breaks.
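For Docker setups, capturing the DEBUG logs might look like this (container name, volume, port mapping, and image tag are illustrative — adapt them to your own `docker run`/Compose setup):

```shell
# Recreate the container with debug logging enabled (names and flags are illustrative)
docker run -d --name open-webui \
  -e GLOBAL_LOG_LEVEL=DEBUG \
  -v open-webui:/app/backend/data \
  -p 3000:8080 \
  ghcr.io/open-webui/open-webui:cuda

# Reproduce the issue, then pull the retrieval-related lines
docker logs open-webui 2>&1 | grep -E "hybrid_search|query_doc|process_chat_payload"
```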

Author
Owner

@druellan commented on GitHub (Mar 4, 2026):

Hey @Classic298, sorry for the vague report; still working on it, but I wanted to chime in.
I can't reproduce it locally on a fresh OWUI installation, so there are still no steps to reproduce, but I'm going to see if I can gather some logs from production. I suspect that, besides PostgreSQL, another factor tying both cases together is that we both upgraded from early versions.

Author
Owner

@CallSohail commented on GitHub (Mar 4, 2026):

@Classic298

1. Native function calling: No, it's disabled. When I enable it, the model does call query_knowledge_bases builtin tool but it only uses simple vector search — hybrid search and reranker are bypassed.

2. DEBUG logs with GLOBAL_LOG_LEVEL=DEBUG:

2026-03-04 08:53:24.653 | DEBUG | open_webui.utils.middleware:process_chat_payload:2072 - form_data: {'stream': True, 'model': 'eve-evebot', 'messages': [{'role': 'user', 'content': 'Comment configurer le client mail de mon smartphone ?'}], 'features': {'voice': False, 'image_generation': False, 'code_interpreter': False, 'web_search': False, 'memory': True}, 'variables': {...}, 'metadata': {'user_id': '...', 'chat_id': '7e89b86b-3894-4cd2-a117-4b171d844d4a', 'filter_ids': [], 'tool_ids': None, 'tool_servers': [], 'files': None, 'model': {'id': 'eve-evebot', 'name': 'Eve Evebot', 'base_model_id': 'openai/gpt-oss-20b', ...}}}

As you can see: tool_ids: None, files: None, no hybrid_search logs, no query_collection logs anywhere. The KB attached via Model Settings is not being picked up by the retrieval pipeline. Same question with #KBName in chat triggers full hybrid search with scores.

3. Database: PostgreSQL with pgvector.

Author
Owner

@CallSohail commented on GitHub (Mar 4, 2026):

@Classic298

Root Cause Diagnosis: KB via Model Settings Not Injected

1. Where Model Settings Are Loaded Before process_chat_payload

In backend/open_webui/main.py, the chat_completion endpoint (line ~1679) does the following before calling process_chat_payload:

# main.py ~line 1698-1699
model = request.app.state.MODELS[model_id]       # full model dict from cache
model_info = Models.get_model_by_id(model_id)    # DB ORM object

The request.app.state.MODELS dict is populated by get_all_models() in utils/models.py, which calls custom_model.model_dump() and assigns the result to model["info"]. Because ModelMeta uses model_config = ConfigDict(extra="allow"), the knowledge list (set via Workspace → Models → Edit → Knowledge) is preserved in model["info"]["meta"]["knowledge"]. The model object itself is correctly populated — knowledge IS in model["info"]["meta"]["knowledge"].

Then, at main.py ~line 1769-1779, metadata is built with:

metadata = {
    ...
    "filter_ids": form_data.pop("filter_ids", []),
    "tool_ids": form_data.get("tool_ids", None),   # .get, NOT .pop
    "tool_servers": form_data.pop("tool_servers", None),
    "files": form_data.get("files", None),          # .get, NOT .pop — this is the snapshot
    ...
}
form_data["metadata"] = metadata

At this point, if the user sent no files in the request, metadata["files"] is None. This is what your DEBUG log at line 2072 is showing — it is the metadata snapshot taken at request entry time, before KB injection. The 'files': None in the log is expected and does not indicate a bug by itself.
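A tiny self-contained sketch (with a hypothetical form_data, not the actual request payload) of why that snapshot shows 'files': None:

```python
# Hypothetical form_data illustrating the snapshot semantics:
# metadata["files"] is captured at request entry, BEFORE any server-side KB injection.
form_data = {"model": "TestModel", "messages": [], "files": None}  # frontend sent no files

metadata = {
    "tool_ids": form_data.get("tool_ids", None),  # .get: key absent -> None, dict untouched
    "files": form_data.get("files", None),        # .get: snapshot of the pre-injection value
}
form_data["metadata"] = metadata

print(metadata["files"])           # None — exactly what the DEBUG log at line 2072 shows

form_data["files"] = ["kb-chunk"]  # later KB injection mutates form_data...
print(metadata["files"])           # ...but the earlier snapshot still reads None
```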


2. The Code Path That Should Inject Model-Attached KBs

Inside process_chat_payload (middleware.py), the KB injection for Default mode happens at lines 2179–2222:

# middleware.py line 2179-2222
# Model "Knowledge" handling
user_message = get_last_user_message(form_data["messages"])
model_knowledge = model.get("info", {}).get("meta", {}).get("knowledge", False)

if (
    model_knowledge
    and metadata.get("params", {}).get("function_calling") != "native"   # GUARD #1
):
    ...
    knowledge_files = []
    for item in model_knowledge:
        if item.get("collection_name"):
            knowledge_files.append({"id": item.get("collection_name"), "name": item.get("name"), "legacy": True})
        elif item.get("collection_names"):
            knowledge_files.append({"name": ..., "type": "collection", "collection_names": ..., "legacy": True})
        else:
            knowledge_files.append(item)

    files = form_data.get("files", [])   # ← POTENTIAL BUG HERE (see §3, Failure Mode B)
    files.extend(knowledge_files)
    form_data["files"] = files           # KB injected as files

Then, further down at lines 2300–2378, the files are extracted and placed into metadata:

# middleware.py line 2300-2378
tool_ids = form_data.pop("tool_ids", None)
terminal_id = form_data.pop("terminal_id", None)
files = form_data.pop("files", None)    # picks up KB-injected files
...
metadata = {
    **metadata,
    "tool_ids": tool_ids,
    "terminal_id": terminal_id,
    "files": files,                      # KB files now in metadata
}
form_data["metadata"] = metadata

Finally, chat_completion_files_handler is called at line 2641:

# middleware.py line 2639-2646
if file_context_enabled:
    try:
        form_data, flags = await chat_completion_files_handler(
            request, form_data, extra_params, user
        )
        sources.extend(flags.get("sources", []))
    except Exception as e:
        log.exception(e)

And chat_completion_files_handler reads from body.get("metadata", {}).get("files", None) at line 1819 — that is, it reads metadata["files"] which should now contain the KB.

The injection is architecturally correct in Default mode — the flow is well-designed. The failures are caused by the issues below.


3. Why #KBName Works But Model Settings KB Fails

#KBName in chat → The frontend Chat.svelte appends a file reference to the files array when the user types #KBName and selects a KB. This is sent as part of form_data["files"] in the request body. When metadata["files"] is constructed in main.py, it already contains the KB reference, so even before the middleware injection logic, the KB is in the pipeline. chat_completion_files_handler sees it immediately.

Model Settings KB → No file reference is sent by the frontend in form_data["files"]. Instead, the KB lives only in model["info"]["meta"]["knowledge"]. The injection must happen server-side in the middleware block at lines 2179–2222. This path has two failure modes:

Failure Mode A — Native Function Calling (function_calling == "native"):

# middleware.py line 2183-2185
if (
    model_knowledge
    and metadata.get("params", {}).get("function_calling") != "native"   # SKIPPED if native!
):

If the model or system has function_calling = "native" set (in metadata["params"]), the entire block is bypassed. In native mode, get_builtin_tools() in tools.py instead adds query_knowledge_files as a tool for the model to call autonomously (lines 431–438 of tools.py). The model must then decide to call it — it does not receive context-injected chunks. If the model doesn't call the tool, or if the tool parameters don't match, the KB is never retrieved. This is a fundamental behavioral difference.

Failure Mode B — None Coercion (the Python Gotcha):

At middleware.py line 2220:

files = form_data.get("files", [])

Python's dict.get(key, default) returns default only when the key is absent. If the frontend (or a prior filter) explicitly sets form_data["files"] = None, this returns None, not []. Then:

files.extend(knowledge_files)   # AttributeError: 'NoneType' has no attribute 'extend'

This exception propagates up unchecked — there is no try/except around this block.
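The gotcha is easy to demonstrate in isolation (a minimal standalone sketch, not Open WebUI code):

```python
# dict.get(key, default) only falls back to the default when the key is ABSENT.
# A key that is present with value None returns None, not the default.
form_data = {"files": None}  # frontend (or a prior filter) explicitly set files to null

files = form_data.get("files", [])
assert files is None  # the [] default was NOT used

try:
    files.extend([{"id": "test-kb", "type": "collection"}])
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'extend'

# None-safe variant: falls back for both a missing key and an explicit None
files = form_data.get("files") or []
files.extend([{"id": "test-kb", "type": "collection"}])
assert files == [{"id": "test-kb", "type": "collection"}]
```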


4. Silent Error Handlers Swallowing Exceptions

The exception from Failure Mode B propagates to process_chat in main.py (line 1882):

# main.py line 1882-1883
except Exception as e:
    log.debug(f"Error processing chat payload: {e}")   # ← DEBUG LEVEL ONLY
    if metadata.get("chat_id") and metadata.get("message_id"):
        try:
            ...
            await event_emitter({"type": "chat:message:error", "data": {"error": {"content": str(e)}}})
        except:
            pass   # ← SWALLOWED AGAIN

log.debug(...) is invisible at INFO, WARNING, or ERROR log levels. Unless you run with GLOBAL_LOG_LEVEL=DEBUG, this exception is completely invisible in the server logs. The user sees an error in the chat UI (from the chat:message:error event), but there is nothing in the server log to trace it back to the KB injection code.
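This invisibility is standard Python logging behavior: a record below the logger's effective level is simply dropped before it reaches any handler. A minimal sketch:

```python
import io
import logging

# Route log output to a string buffer at a typical default level (INFO)
stream = io.StringIO()
logging.basicConfig(stream=stream, level=logging.INFO, force=True)
log = logging.getLogger("open_webui.demo")

log.debug("Error processing chat payload: ...")  # dropped: DEBUG < INFO
log.error("Error processing chat payload: ...")  # emitted

output = stream.getvalue()
print("DEBUG" in output)  # False: the debug record never reached the handler
print("ERROR" in output)  # True
```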

Additionally, at line 2645–2646 in chat_completion_files_handler:

    except Exception as e:
        log.exception(e)   # This one does show, but only if chat_completion_files_handler is reached

5. Complete Execution Order Summary

main.py chat_completion():
  → line 1698: model = MODELS[model_id]          # model["info"]["meta"]["knowledge"] populated
  → line 1779: metadata["files"] = None           # no files from user request
  → line 1843: process_chat() called
      → process_chat_payload():
          → line 2072: log.debug(form_data)        # FILES=None HERE — EXPECTED, pre-injection
          → line 2159-2177: Folder project files merged
          → line 2181: model_knowledge = model["info"]["meta"]["knowledge"]
          → line 2183: if model_knowledge AND NOT native FC:
              → line 2220: files = form_data.get("files", [])  ← POSSIBLE None bug
              → line 2221: files.extend(knowledge_files)
              → line 2222: form_data["files"] = files
          → line 2228: process_pipeline_inlet_filter()  ← could clear form_data["files"]?
          → line 2240: process_filter_functions()       ← could clear form_data["files"]?
          → line 2302: files = form_data.pop("files", None)  ← captures KB files
          → line 2372-2378: metadata["files"] = files         ← KB in metadata
          → line 2617-2623: if native FC → add tools to form_data["tools"] (skip files handler!)
          → line 2639-2646: chat_completion_files_handler()   ← reads metadata["files"]
              → line 1819: if files := body.get("metadata",{}).get("files", None):
                  → RAG retrieval runs here

6. Specific Fixes

Fix 1 — None-safe files initialization (middleware.py line 2220):

# Before (buggy when form_data["files"] is None):
files = form_data.get("files", [])

# After (safe):
files = form_data.get("files") or []

This uses a truthiness check, so both a missing key and an explicit None value fall back to [].

Fix 2 — Also fix the folder handling at line 2176 for consistency:

# Before:
*form_data.get("files", []),
# After:
*(form_data.get("files") or []),

Fix 3 — Elevate the error log in main.py line 1883:

# Before (invisible at default log level):
log.debug(f"Error processing chat payload: {e}")

# After:
log.exception(f"Error processing chat payload: {e}")
# or at minimum:
log.error(f"Error processing chat payload: {e}", exc_info=True)

Fix 4 — Native FC path needs explicit context injection for model-attached KBs. In tools.py lines 431–438, when the model has attached knowledge and query_knowledge_files is the designated tool, the current assumption is that the model will call it. For reliability, consider adding the knowledge files to form_data["files"] even in native FC mode, so the content is available via add_file_context() at line 2597. Alternatively, document this behavior gap clearly.


7. Verifying model["info"]["meta"]["knowledge"] Reaches Middleware

To confirm the model object is the problem vs. the injection logic, add this temporary log immediately before the guard:

# middleware.py after line 2181
log.warning(f"[KB DEBUG] model_id={form_data.get('model')}, "
            f"model_knowledge={model.get('info',{}).get('meta',{}).get('knowledge')}, "
            f"function_calling={metadata.get('params',{}).get('function_calling')}")

This will immediately tell you whether:

  • model_knowledge is falsy (model object missing knowledge in request.app.state.MODELS)
  • function_calling == "native" is silently bypassing injection
  • The injection runs but fails at None.extend()

Help from AI :)

<!-- gh-comment-id:3996410192 --> @CallSohail commented on GitHub (Mar 4, 2026): @Classic298 ## Root Cause Diagnosis: KB via Model Settings Not Injected ### 1. Where Model Settings Are Loaded Before `process_chat_payload` In `backend/open_webui/main.py`, the `chat_completion` endpoint (line ~1679) does the following before calling `process_chat_payload`: ```python # main.py ~line 1698-1699 model = request.app.state.MODELS[model_id] # full model dict from cache model_info = Models.get_model_by_id(model_id) # DB ORM object ``` The `request.app.state.MODELS` dict is populated by `get_all_models()` in `utils/models.py`, which calls `custom_model.model_dump()` and assigns the result to `model["info"]`. Because `ModelMeta` uses `model_config = ConfigDict(extra="allow")`, the `knowledge` list (set via Workspace → Models → Edit → Knowledge) is preserved in `model["info"]["meta"]["knowledge"]`. **The model object itself is correctly populated — knowledge IS in `model["info"]["meta"]["knowledge"]`.** Then, at main.py ~line 1769-1779, metadata is built with: ```python metadata = { ... "filter_ids": form_data.pop("filter_ids", []), "tool_ids": form_data.get("tool_ids", None), # .get, NOT .pop "tool_servers": form_data.pop("tool_servers", None), "files": form_data.get("files", None), # .get, NOT .pop — this is the snapshot ... } form_data["metadata"] = metadata ``` At this point, if the user sent no files in the request, `metadata["files"]` is `None`. **This is what your DEBUG log at line 2072 is showing — it is the metadata snapshot taken at request entry time, before KB injection.** The `'files': None` in the log is expected and does not indicate a bug by itself. --- ### 2. 
The Code Path That Should Inject Model-Attached KBs Inside `process_chat_payload` (middleware.py), the KB injection for **Default mode** happens at lines 2179–2222: ```python # middleware.py line 2179-2222 # Model "Knowledge" handling user_message = get_last_user_message(form_data["messages"]) model_knowledge = model.get("info", {}).get("meta", {}).get("knowledge", False) if ( model_knowledge and metadata.get("params", {}).get("function_calling") != "native" # GUARD #1 ): ... knowledge_files = [] for item in model_knowledge: if item.get("collection_name"): knowledge_files.append({"id": item.get("collection_name"), "name": item.get("name"), "legacy": True}) elif item.get("collection_names"): knowledge_files.append({"name": ..., "type": "collection", "collection_names": ..., "legacy": True}) else: knowledge_files.append(item) files = form_data.get("files", []) # ← POTENTIAL BUG HERE (see §4) files.extend(knowledge_files) form_data["files"] = files # KB injected as files ``` Then, further down at lines 2300–2378, the `files` are extracted and placed into metadata: ```python # middleware.py line 2300-2378 tool_ids = form_data.pop("tool_ids", None) terminal_id = form_data.pop("terminal_id", None) files = form_data.pop("files", None) # picks up KB-injected files ... metadata = { **metadata, "tool_ids": tool_ids, "terminal_id": terminal_id, "files": files, # KB files now in metadata } form_data["metadata"] = metadata ``` Finally, `chat_completion_files_handler` is called at line 2641: ```python # middleware.py line 2639-2646 if file_context_enabled: try: form_data, flags = await chat_completion_files_handler( request, form_data, extra_params, user ) sources.extend(flags.get("sources", [])) except Exception as e: log.exception(e) ``` And `chat_completion_files_handler` reads from `body.get("metadata", {}).get("files", None)` at line 1819 — that is, it reads `metadata["files"]` which should now contain the KB. 
**The injection is architecturally correct in Default mode** — the flow is well-designed. The failures are caused by the issues below. --- ### 3. Why `#KBName` Works But Model Settings KB Fails **`#KBName` in chat** → The frontend `Chat.svelte` appends a file reference to the `files` array when the user types `#KBName` and selects a KB. This is sent as part of `form_data["files"]` in the request body. When `metadata["files"]` is constructed in main.py, it already contains the KB reference, so even before the middleware injection logic, the KB is in the pipeline. `chat_completion_files_handler` sees it immediately. **Model Settings KB** → No file reference is sent by the frontend in `form_data["files"]`. Instead, the KB lives only in `model["info"]["meta"]["knowledge"]`. The injection must happen server-side in the middleware block at lines 2179–2222. **This path has two failure modes:** **Failure Mode A — Native Function Calling (`function_calling == "native"`):** ```python # middleware.py line 2183-2185 if ( model_knowledge and metadata.get("params", {}).get("function_calling") != "native" # SKIPPED if native! ): ``` If the model or system has `function_calling = "native"` set (in `metadata["params"]`), the entire block is bypassed. In native mode, `get_builtin_tools()` in `tools.py` instead adds `query_knowledge_files` as a tool for the model to call autonomously (lines 431–438 of tools.py). The model must then decide to call it — it does not receive context-injected chunks. **If the model doesn't call the tool, or if the tool parameters don't match, the KB is never retrieved.** This is a fundamental behavioral difference. **Failure Mode B — `None` Coercion (the Python Gotcha):** At middleware.py line 2220: ```python files = form_data.get("files", []) ``` Python's `dict.get(key, default)` returns `default` only when the key is **absent**. If the frontend (or a prior filter) explicitly sets `form_data["files"] = None`, this returns `None`, not `[]`. 
Then:

```python
files.extend(knowledge_files)  # AttributeError: 'NoneType' object has no attribute 'extend'
```

This exception propagates up unchecked — there is no `try/except` around this block.

---

### 4. Silent Error Handlers Swallowing Exceptions

The exception from Failure Mode B propagates to `process_chat` in main.py (line 1882):

```python
# main.py lines 1882-1883
except Exception as e:
    log.debug(f"Error processing chat payload: {e}")  # ← DEBUG LEVEL ONLY
    if metadata.get("chat_id") and metadata.get("message_id"):
        try:
            ...
            await event_emitter({"type": "chat:message:error", "data": {"error": {"content": str(e)}}})
        except:
            pass  # ← SWALLOWED AGAIN
```

**`log.debug(...)` is invisible at INFO, WARNING, or ERROR log levels.** Unless you run with `LOG_LEVEL=DEBUG`, this exception never appears in the server logs. The user sees an error in the chat UI (from the `chat:message:error` event), but there is nothing in the server log to trace it back to the KB injection code.

Additionally, at lines 2645–2646 in `chat_completion_files_handler`:

```python
except Exception as e:
    log.exception(e)  # This one does show, but only if chat_completion_files_handler is reached
```

---

### 5. Complete Execution Order Summary

```
main.py chat_completion():
  → line 1698: model = MODELS[model_id]          # model["info"]["meta"]["knowledge"] populated
  → line 1779: metadata["files"] = None          # no files from user request
  → line 1843: process_chat() called

→ process_chat_payload():
  → line 2072: log.debug(form_data)              # FILES=None HERE — EXPECTED, pre-injection
  → lines 2159-2177: Folder project files merged
  → line 2181: model_knowledge = model["info"]["meta"]["knowledge"]
  → line 2183: if model_knowledge AND NOT native FC:
      → line 2220: files = form_data.get("files", [])    ← POSSIBLE None bug
      → line 2221: files.extend(knowledge_files)
      → line 2222: form_data["files"] = knowledge_files
  → line 2228: process_pipeline_inlet_filter()   ← could clear form_data["files"]?
  → line 2240: process_filter_functions()        ← could clear form_data["files"]?
  → line 2302: files = form_data.pop("files", None)      ← captures KB files
  → lines 2372-2378: metadata["files"] = files           ← KB in metadata
  → lines 2617-2623: if native FC → add tools to form_data["tools"] (skip files handler!)
  → lines 2639-2646: chat_completion_files_handler()     ← reads metadata["files"]
      → line 1819: if files := body.get("metadata",{}).get("files", None):
      → RAG retrieval runs here
```

---

### 6. Specific Fixes

**Fix 1 — None-safe `files` initialization (middleware.py line 2220):**

```python
# Before (buggy when form_data["files"] is None):
files = form_data.get("files", [])

# After (safe):
files = form_data.get("files") or []
```

This uses a truthiness check, so both a missing key and a `None` value yield `[]`.

**Fix 2 — Also fix the folder handling at line 2176 for consistency:**

```python
# Before:
*form_data.get("files", []),

# After:
*(form_data.get("files") or []),
```

**Fix 3 — Elevate the error log in main.py line 1883:**

```python
# Before (invisible at default log level):
log.debug(f"Error processing chat payload: {e}")

# After:
log.exception(f"Error processing chat payload: {e}")
# or at minimum:
log.error(f"Error processing chat payload: {e}", exc_info=True)
```

**Fix 4 — Native FC path needs explicit context injection for model-attached KBs.** In `tools.py` lines 431–438, when the model has attached knowledge and `query_knowledge_files` is the designated tool, the current assumption is that the model will call it. For reliability, consider adding the knowledge files to `form_data["files"]` even in native FC mode, so the content is available via `add_file_context()` at line 2597. Alternatively, document this behavior gap clearly.

---

### 7. Verifying `model["info"]["meta"]["knowledge"]` Reaches Middleware

To confirm whether the problem is the model object or the injection logic, add this temporary log immediately before the guard:

```python
# middleware.py, after line 2181
log.warning(
    f"[KB DEBUG] model_id={form_data.get('model')}, "
    f"model_knowledge={model.get('info', {}).get('meta', {}).get('knowledge')}, "
    f"function_calling={metadata.get('params', {}).get('function_calling')}"
)
```

This will immediately tell you whether:

- `model_knowledge` is falsy (model object missing knowledge in `request.app.state.MODELS`)
- `function_calling == "native"` is silently bypassing injection
- The injection runs but fails at `None.extend()`

Help from AI :)
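To illustrate the point behind Fix 3, here is a self-contained sketch (a toy logger wired to an in-memory stream, assuming the typical INFO default; not Open WebUI's actual logging setup) showing why `log.debug` hides the failure while `log.exception` surfaces it with a traceback:

```python
import io
import logging

# Toy logger set to INFO, like a typical production server default.
stream = io.StringIO()
log = logging.getLogger("kb_demo")
log.addHandler(logging.StreamHandler(stream))
log.setLevel(logging.INFO)
log.propagate = False

try:
    files = None
    files.extend(["kb_file"])  # the Failure Mode B crash
except Exception as e:
    log.debug(f"Error processing chat payload: {e}")      # dropped: DEBUG < INFO
    log.exception(f"Error processing chat payload: {e}")  # kept, traceback included

output = stream.getvalue()
assert output.count("Error processing chat payload") == 1  # the debug record never arrived
assert "Traceback" in output and "AttributeError" in output
```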

@Classic298 commented on GitHub (Mar 4, 2026):

None of this is the "root cause" as you claimed, and a lot of the "gotchas" you described are intended behaviour.

Of course the native tool calling should bypass the classical automatic forced RAG injection. That's how it is supposed to work.
I will hide your comment for this reason. Thanks for the additional logs; I will research this issue again later today. I would appreciate further logs and steps to reproduce, because otherwise I will not be able to reproduce, log, debug, or fix the issue.

I need steps to reproduce and logs.

Thanks

Reference: github-starred/open-webui#35181