[PR #23715] [CLOSED] fix: exclude empty assistant placeholder from Notes chat payload #50378

Closed
opened 2026-04-30 03:04:33 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/open-webui/open-webui/pull/23715
Author: @Classic298
Created: 4/14/2026
Status: Closed

Base: dev ← Head: claude/sync-dev-branches-LP3qO


📝 Commits (1)

  • 1008658 fix: exclude empty assistant placeholder from Notes chat payload

📊 Changes

1 file changed (+1 additions, -1 deletions)

View changed files

📝 src/lib/components/notes/NoteEditor/Chat.svelte (+1 -1)

📄 Description

The Notes AI Chat pushes an empty assistant message to the local messages array so it can stream delta content into it. That placeholder was also being sent in the chat/completions request, which llama.cpp interprets as an assistant response prefill. For models with thinking enabled in their chat template (Qwen3, gpt-oss, etc.), llama.cpp rejects the combination with: "Assistant response prefill is incompatible with enable_thinking." (400).

Filter the streaming placeholder out of the outgoing payload so the conversation ends with the user turn, matching the OpenAI spec and avoiding the prefill interpretation. The placeholder is still used locally for streaming UI updates.

Fixes open-webui/open-webui#23703

Contributor License Agreement

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-30 03:04:33 -05:00
Reference: github-starred/open-webui#50378