[PR #23715] [CLOSED] fix: exclude empty assistant placeholder from Notes chat payload #50378

Closed
opened 2026-04-30 03:04:33 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/23715
Author: @Classic298
Created: 4/14/2026
Status: Closed

Base: dev ← Head: claude/sync-dev-branches-LP3qO


📝 Commits (1)

  • 1008658 fix: exclude empty assistant placeholder from Notes chat payload

📊 Changes

1 file changed (+1 additions, -1 deletions)

View changed files

📝 src/lib/components/notes/NoteEditor/Chat.svelte (+1 -1)

📄 Description

The Notes AI Chat pushes an empty assistant message to the local messages array so it can stream delta content into it. That placeholder was also being sent in the chat/completions request, which llama.cpp interprets as an assistant response prefill. For models with thinking enabled in their chat template (Qwen3, gpt-oss, etc.), llama.cpp rejects the combination with: "Assistant response prefill is incompatible with enable_thinking." (400).

Filter the streaming placeholder out of the outgoing payload so the conversation ends with the user turn, matching the OpenAI spec and avoiding the prefill interpretation. The placeholder is still used locally for streaming UI updates.

Fixes open-webui/open-webui#23703

Contributor License Agreement

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-30 03:04:33 -05:00
Reference: github-starred/open-webui#50378