[GH-ISSUE #24341] feat: thinking_budget_tokens for llama.cpp #58938

Closed
opened 2026-05-06 00:30:29 -05:00 by GiteaMirror · 12 comments
Owner

Originally created by @alkeryn on GitHub (May 4, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24341

Check Existing Issues

  • I have searched for all existing open AND closed issues and discussions for similar requests. I have found none that is comparable to my request.

Verify Feature Scope

  • I have read through and understood the scope definition for feature requests in the Issues section. I believe my feature request meets the definition and belongs in the Issues section instead of the Discussions.

Problem Description

With llama.cpp it's possible to disable thinking through the API using `thinking_budget_tokens: 0`; there may also be other methods that work with it.
Currently, however, there is no convenient built-in way to disable thinking for llama.cpp.
It can be done per chat by adding `thinking_budget_tokens: 0` as a custom field in the chat settings, but there is no way to do it globally.
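As a sketch of what the request described above might look like: the following builds a chat-completion body with the extra field set, assuming llama.cpp's OpenAI-compatible server accepts `thinking_budget_tokens` as a top-level field (the model name here is hypothetical).

```python
import json

# Sketch of a chat-completion request body that disables thinking,
# assuming llama.cpp accepts thinking_budget_tokens as a top-level
# extra field, as described in this issue.
payload = {
    "model": "qwen3",  # hypothetical model name
    "messages": [{"role": "user", "content": "Hello"}],
    "thinking_budget_tokens": 0,  # 0 = disable thinking entirely
}

body = json.dumps(payload)
print(body)
```

The point of the feature request is that this field currently has to be injected per chat by hand, rather than once globally.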

Desired Solution you'd like

Either add an option for llama.cpp in the settings, or allow custom fields not only in the chat settings but globally.

Alternatives Considered

No response

Additional Context

No response


@owui-terminator[bot] commented on GitHub (May 4, 2026):

🔍 Similar Issues Found

I found some existing issues that might be related. Please check if any of these are duplicates or contain helpful solutions:

  1. #23703 issue: Notes feature not compatible with llama.cpp, enable_thinking is always injected?
    by TomTheWise · bug

  2. #8688 feat: built-in cpu-only llama cpp integration
    by tjbck · enhancement, help wanted

  3. #17428 issue: Support Think Parsing with llama.cpp + GPT-OSS
    by AbdullahMPrograms · bug

  4. #17350 issue: Llama.cpp server timing metrics not parsed correctly
    by ITankForCAD · bug

  5. #16251 issue: When using llama.cpp as backend, pressing stop doesn't stop token generation
    by OracleToes · bug


💡 If this is a duplicate, consider closing it and adding details to the existing issue.

This comment was generated automatically. React with 👍 if helpful, 👎 if not.


@Classic298 commented on GitHub (May 4, 2026):

Filter?


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 filter?


@Classic298 commented on GitHub (May 4, 2026):

This can be easily done with a filter; did you look into it? https://docs.openwebui.com/features/extensibility/plugin/functions/filter

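For reference, a minimal sketch of such a filter function, assuming the `inlet()` hook described in the Open WebUI filter docs linked above; the parameter name follows the llama.cpp usage from this issue.

```python
# Minimal sketch of an Open WebUI filter that injects
# thinking_budget_tokens: 0 into every outgoing request body.
# Assumes the inlet() hook from the filter plugin docs; not an
# official implementation.
class Filter:
    def inlet(self, body: dict) -> dict:
        # inlet() runs before the request reaches the model backend;
        # only add the field if the user hasn't already set it.
        body.setdefault("thinking_budget_tokens", 0)
        return body


if __name__ == "__main__":
    f = Filter()
    print(f.inlet({"messages": []}))
```

Using `setdefault` means a per-chat override (e.g. a nonzero budget set in the chat's custom settings) would still take precedence over the filter's global default.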

@alkeryn commented on GitHub (May 4, 2026):

@Classic298 fair enough, though I do feel like it'd make sense as a built-in feature. We can already add custom args in the current chat, so why couldn't we add them globally here?

Image

@Classic298 commented on GitHub (May 4, 2026):

@alkeryn you can; scroll further down and you can add custom parameters. Open WebUI cannot add dozens and dozens more parameters for every model- or inference-engine-dependent setting; that's why you can add custom parameters at the very end as well.


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 you can, but only for a single chat, i.e.:

Image

but the "add new custom setting" option doesn't show up in the global config here:

Image

@Classic298 commented on GitHub (May 4, 2026):

@alkeryn go to Admin Panel > Models and add it to the model there, if you wish, under the advanced parameters.


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 sure, but per-model is annoying; it would be nice to have a global setting, or at least one per connection / provider.


@Classic298 commented on GitHub (May 4, 2026):

@alkeryn Admin Panel > Settings > Models > top right: Settings. This should also do it.


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 I don't see an option for it, weird.
Image


@Classic298 commented on GitHub (May 4, 2026):

Hmm, please open a feature request specifically for adding a custom param in this modal you opened here, so that the same custom param can be added to all models. Thanks!


Reference: github-starred/open-webui#58938