[GH-ISSUE #22505] issue: Multiple system messages break chat template parsing with Qwen3.5 GGUF models #58391

Closed
opened 2026-05-05 23:05:28 -05:00 by GiteaMirror · 4 comments

Originally created by @GlisseManTV on GitHub (Mar 9, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/22505

Check Existing Issues

  • I have searched for any existing and/or related issues.
  • I have searched for any existing and/or related discussions.
  • I have also searched in the CLOSED issues AND CLOSED discussions and found no related items (your issue might already be addressed on the development branch!).
  • I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.8.10

Ollama Version (if applicable)

0.17.7 and llama.cpp b8249 CUDA

Operating System

TrueNAS / Debian

Browser (if applicable)

No response

Confirmation

  • I have read and followed all instructions in README.md.
  • I am using the latest version of both Open WebUI and Ollama.
  • I have included the browser console logs.
  • I have included the Docker container logs.
  • I have provided every relevant configuration, setting, and environment variable used in my setup.
  • I have clearly listed every relevant configuration, custom setting, environment variable, and command-line option that influences my setup (such as Docker Compose overrides, .env values, browser settings, authentication configurations, etc).
  • I have documented step-by-step reproduction instructions that are precise, sequential, and leave nothing to interpretation. My steps:
  • Start with the initial platform/version/OS and dependencies used,
  • Specify exact install/launch/configure commands,
  • List URLs visited, user input (incl. example values/emails/passwords if needed),
  • Describe all options and toggles enabled or changed,
  • Include any files or environmental changes,
  • Identify the expected and actual result at each stage,
  • Ensure any reasonably skilled user can follow and hit the same issue.

Expected Behavior

When a User System Prompt is configured in Open WebUI and file/RAG system context injection is enabled, Open WebUI should ensure that the generated request sent to the backend contains a valid message structure.

Specifically, the system should either:

  • merge multiple system prompts into a single system message, or
  • ensure that the generated chat message structure remains compatible with models that enforce strict chat templates.

The request sent to the backend should look like this:

{
  "messages": [
    {
      "role": "system",
      "content": "[SYSTEM CONTEXT - Files Available] ...\n\nUser system prompt..."
    },
    {
      "role": "user",
      "content": "Hello"
    }
  ]
}

This ensures compatibility with models that validate the chat structure, such as Qwen models with strict Jinja chat templates.
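For illustration, here is a minimal Python sketch of such a merge. The function name and the "\n\n" join separator are placeholders, not Open WebUI's actual implementation:

def merge_leading_system_messages(messages):
    """Collapse a leading run of system messages into a single message."""
    parts = []
    idx = 0
    while idx < len(messages) and messages[idx]["role"] == "system":
        parts.append(messages[idx]["content"])
        idx += 1
    if len(parts) <= 1:
        return messages  # zero or one system message: nothing to merge
    merged = {"role": "system", "content": "\n\n".join(parts)}
    return [merged] + messages[idx:]

messages = [
    {"role": "system", "content": "[SYSTEM CONTEXT - Files Available] ..."},
    {"role": "system", "content": "User system prompt"},
    {"role": "user", "content": "Hello"},
]
print(merge_leading_system_messages(messages))
# [{'role': 'system', 'content': '[SYSTEM CONTEXT - Files Available] ...\n\nUser system prompt'},
#  {'role': 'user', 'content': 'Hello'}]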

Actual Behavior

When both User System Prompt and System Context injection (for files/RAG) are enabled, Open WebUI generates multiple consecutive system messages at the beginning of the conversation.

Example request generated by Open WebUI:

{
  "messages": [
    {
      "role": "system",
      "content": "[SYSTEM CONTEXT - Files Available] ..."
    },
    {
      "role": "system",
      "content": "User system prompt"
    },
    {
      "role": "user",
      "content": "<attached_files> ..."
    }
  ]
}

This structure causes models with strict chat templates (e.g. Qwen models) to fail during template parsing.

The backend returns the following error:

Unable to generate parser for this template.
Automatic parser generation failed:

Jinja Exception: System message must be at the beginning.

As a result:

  • the request fails with HTTP 400
  • file context / RAG features stop working
  • the model appears unable to read attached documents
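For context, here is a minimal jinja2 sketch that mimics this kind of strict template check. It is an illustrative stand-in, not the actual Qwen chat template; raise_exception is registered by hand to emulate the helper that chat-template engines such as llama.cpp and HF transformers expose to templates:

from jinja2 import Environment

def raise_exception(message):
    # Stand-in for the raise_exception() helper provided by chat-template
    # engines; llama.cpp surfaces it as "Jinja Exception: <message>".
    raise ValueError(f"Jinja Exception: {message}")

env = Environment()
env.globals["raise_exception"] = raise_exception

# Illustrative strict template: any system message after position 0 is rejected.
template = env.from_string(
    "{% for m in messages %}"
    "{% if m['role'] == 'system' and not loop.first %}"
    "{{ raise_exception('System message must be at the beginning.') }}"
    "{% endif %}"
    "<|{{ m['role'] }}|>{{ m['content'] }}\n"
    "{% endfor %}"
)

messages = [
    {"role": "system", "content": "[SYSTEM CONTEXT - Files Available] ..."},
    {"role": "system", "content": "User system prompt"},  # second system message
    {"role": "user", "content": "Hello"},
]
template.render(messages=messages)
# ValueError: Jinja Exception: System message must be at the beginning.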

Steps to Reproduce

  1. Start Open WebUI with a backend model server (in my case using llama.cpp).
  2. Load a model that uses a strict chat template (example: Qwen3.5 GGUF models).
  3. In Open WebUI settings, configure a User System Prompt.
  4. Enable file context / document tools (RAG).
  5. Upload a file to the chat.
  6. Send a message referencing the uploaded file.

Open WebUI will generate a request containing:

  • one system message for the file system context
  • one system message for the user system prompt

Example message structure:

system (file context)
system (user system prompt)
user
assistant

The backend then returns a template parsing error.

Removing the User System Prompt immediately resolves the issue.
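The failure can also be reproduced without Open WebUI by sending the same message structure directly to llama.cpp's OpenAI-compatible endpoint. The URL, port, and model name below are assumptions for illustration; adjust them to your setup:

import requests

payload = {
    "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",  # assumption: name of the served model
    "messages": [
        {"role": "system", "content": "[SYSTEM CONTEXT - Files Available] ..."},
        {"role": "system", "content": "User system prompt"},
        {"role": "user", "content": "Hello"},
    ],
}
# llama-server's default listen address; change if yours differs.
r = requests.post("http://localhost:8080/v1/chat/completions", json=payload)
print(r.status_code)  # expected: 400 with a strict template, per this report
print(r.text)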

Logs & Screenshots

| open_webui.utils.chat:generate_chat_completion:165 - generate_chat_completion: {'stream': True, 'model': 'aiserver1.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf', 'messages': [{'role': 'system', 'content': "[SYSTEM CONTEXT - Files Available]\nThe following files are available in this conversation:\nFiles count: 1\nFiles list: publishexe.bat\n\nFile metadata for tools:\n{'files': [{'id': '001f50e5-5eef-42c0-be85-31dea02e8c16', 'name': 'publishexe.bat'}]}\n[END SYSTEM CONTEXT]\n\nYou can now call the appropriate tools to process these files."}, {'role': 'system', 'content': "Je m'appelle Gaetan.\nGlisseManTV/MCPO-File-Generation-Tool"}, {'role': 'user', 'content': '<attached_files>\n<file type="file" url="001f50e5-5eef-42c0-be85-31dea02e8c16" content_type="application/octet-stream" name="publishexe.bat"/>\n</attached_files>\n\nsalut ca va?'}, {'role': 'assistant', 'content':

Additional Information

Environment:

  • Open WebUI: 0.8.10
  • Backend: llama.cpp b8249
  • Model: Qwen3.5 GGUF (https://huggingface.co/unsloth/Qwen3.5-35B-A3B-GGUF/tree/main)
  • Quantization: Q4_K_XL

Workaround:

Removing the User System Prompt prevents the issue.

Possible improvement:

Open WebUI could merge multiple system prompts into a single system message before sending the request to the backend (as in the sketch under Expected Behavior) to avoid incompatibility with strict chat templates.

GiteaMirror added the bug label 2026-05-05 23:05:28 -05:00

@GlisseManTV commented on GitHub (Mar 9, 2026):

It also occurs with a file plus a chat system prompt, and likewise with a file plus a folder system prompt.

A user system prompt combined with a chat system prompt does not trigger the issue.


@tjbck commented on GitHub (Mar 24, 2026):

Addressed in dev.


@moritzderallerechte commented on GitHub (Apr 9, 2026):

May I ask why an <attached_files>...</attached_files> block is added to the user prompt in the first place?
The files are already included in the content field of the user message, so why not put the filename and other metadata there as well?
The url field can also easily be misunderstood by some LLM providers, as I have learned, since there is no way to actually use that url to retrieve file content when doing cloud inference.
I had models try to resolve that url, which resulted in internal server errors.
Could you add a way to customize that prompt injection? I really don't like this unconfigurable, hardcoded style. @tjbck


@dylanmorris1231ho-spec commented on GitHub (Apr 9, 2026):

That's a fair point.

The <attached_files> block was introduced to give models a consistent structure for attachments, but you're right that in some setups it can be redundant, since the file content is already included in the user message.

The issue with models trying to resolve the url field is also a good observation. Making this formatting configurable instead of hardcoded would probably be the better approach.

We'll consider adding a way to customize the attachment prompt formatting.

