mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 19:08:59 -05:00
[GH-ISSUE #23703] issue: Notes feature not compatible with llama.cpp, enable_thinking is always injected? #58716
Originally created by @TomTheWise on GitHub (Apr 14, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/23703
Check Existing Issues
Installation Method
Git Clone
Open WebUI Version
8.12
Ollama Version (if applicable)
No response
Operating System
Debian 13; llama.cpp on latest build
Browser (if applicable)
Edge latest
Confirmation
Expected Behavior
Actual Behavior
When using llama.cpp directly as the provider, ANY model - even mature ones like gemma3 or gpt-oss:20b, without ANY additional settings - works flawlessly in normal chat conversations.
But as soon as you try it within the Notes feature it stops working. The browser IMMEDIATELY gets a 400 Bad Request error, and Open WebUI shows nothing in its logs. The llama.cpp logs, however, show that the API call was apparently made as an "Assistant response prefill" while the Open WebUI request also sent "enable_thinking".
Steps to Reproduce
Try the same model with Ollama, following the same steps - and it will work.
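For illustration only, a request shape that triggers this llama.cpp rejection might look like the sketch below. The message contents and model name are made up; `chat_template_kwargs`/`enable_thinking` are the llama.cpp server's template options, while the trailing assistant message is what llama.cpp interprets as a "response prefill":

```python
# Hypothetical payload for POST /v1/chat/completions against a llama.cpp
# server. The combination of (a) a trailing assistant message and (b) a
# thinking-enabled template is what produces the 400 seen in the logs.
payload = {
    "model": "gemma3",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Summarize my note."},
        # Final message has role "assistant" => treated as a prefill.
        {"role": "assistant", "content": "Sure, here is"},
    ],
    # llama.cpp-specific knob; thinking may also come from the model's
    # own chat template defaults rather than this explicit flag.
    "chat_template_kwargs": {"enable_thinking": True},
}

def is_prefill_request(p: dict) -> bool:
    """True when the final message is an assistant turn (a prefill)."""
    msgs = p.get("messages", [])
    return bool(msgs) and msgs[-1].get("role") == "assistant"

# Both conditions hold here, so llama.cpp would reject this request.
conflict = (
    is_prefill_request(payload)
    and payload.get("chat_template_kwargs", {}).get("enable_thinking", False)
)
```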
Logs & Screenshots
llama.cpp log:
srv operator(): got exception: {"error":{"code":400,"message":"Assistant response prefill is incompatible with enable_thinking.","type":"invalid_request_error"}}
srv log_server_r: done request: POST /v1/chat/completions IP 400
Browser log:
manifest.json:1 Manifest: Enctype should be set to either application/x-www-form-urlencoded or multipart/form-data. It currently defaults to application/x-www-form-urlencoded
index.js:1733 [tiptap warn]: Duplicate extension names found: ['codeBlock', 'bulletList', 'listItem', 'listKeymap', 'orderedList']. This can lead to issues.
hf @ index.js:1733
index.js:1733 [tiptap warn]: Duplicate extension names found: ['codeBlock', 'bulletList', 'listItem', 'listKeymap', 'orderedList']. This can lead to issues.
hf @ index.js:1733
fetcher.js:76 POST https://OWUI-FQDN/api/chat/completions 400 (Bad Request)
window.fetch @ fetcher.js:76
m @ index.ts:341
we @ Chat.svelte:182
await in we
ve @ Chat.svelte:307
await in ve
mt @ MessageInput.svelte:547
keydown @ MessageInput.svelte:933
(anonymous) @ index-client.js:178
keydown @ RichTextInput.svelte:1088
(anonymous) @ index.js:3122
someProp @ index.js:5594
hl @ index.js:3120
t.dom.addEventListener.t.input.eventHandlers. @ index.js:3089
Additional Information
If I understand correctly, OWUI always sets enable_thinking - I have no clue why - which would also explain why you can't disable reasoning in the model's Advanced Params in OWUI?
@Classic298 commented on GitHub (Apr 14, 2026):
is this still reproducible in dev?
@TomTheWise commented on GitHub (Apr 14, 2026):
Sorry, I currently have no way to check dev - only in a few hours.
But if it's already fixed in dev, that's great! :)
@Classic298 commented on GitHub (Apr 14, 2026):
I analyzed your issue.
OWUI does not inject enable_thinking. I searched the entire codebase - zero occurrences of enable_thinking, chat_template_kwargs, prefill, or continue_final_message. The enable_thinking flag comes from the model's chat template inside llama.cpp (Qwen3, gpt-oss, etc. default to thinking enabled), OR you toggled/set it as an advanced parameter for the model.
The error in your case is a misreading of the llama.cpp message: the phrase "Assistant response prefill is incompatible with enable_thinking" is llama.cpp's way of saying "you gave me a trailing assistant message AND the template wants a thinking block - those two states can't coexist."
This is what actually needs a fix
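As a hypothetical sketch of such a fix (not the actual Open WebUI change - function and parameter names here are invented for illustration): a client-side guard could reconcile the two conflicting states before the request reaches llama.cpp, by either dropping the trailing assistant message or disabling thinking:

```python
def reconcile_prefill_and_thinking(payload: dict, prefer_prefill: bool = True) -> dict:
    """Hypothetical guard: if the request both ends with an assistant
    message (a prefill) and enables thinking via chat_template_kwargs,
    drop one of the two so llama.cpp accepts the request."""
    msgs = payload.get("messages", [])
    has_prefill = bool(msgs) and msgs[-1].get("role") == "assistant"
    kwargs = payload.get("chat_template_kwargs", {})
    if has_prefill and kwargs.get("enable_thinking"):
        if prefer_prefill:
            # Keep the prefill; turn thinking off for this request.
            kwargs["enable_thinking"] = False
        else:
            # Keep thinking; drop the trailing assistant message.
            payload["messages"] = msgs[:-1]
    return payload
```

Either resolution avoids the 400; which one is correct depends on whether the Notes feature actually needs the assistant prefill or the model's reasoning output.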
@Classic298 commented on GitHub (Apr 14, 2026):
https://github.com/open-webui/open-webui/pull/23715
@Classic298 commented on GitHub (Apr 14, 2026):
fd93bd3414