[GH-ISSUE #24341] feat: thinking_budget_tokens for llama.cpp #58938

Closed
opened 2026-05-06 00:30:29 -05:00 by GiteaMirror · 12 comments
Owner

Originally created by @alkeryn on GitHub (May 4, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/24341

Check Existing Issues

  • I have searched for all existing open AND closed issues and discussions for similar requests. I have found none that is comparable to my request.

Verify Feature Scope

  • I have read through and understood the scope definition for feature requests in the Issues section. I believe my feature request meets the definition and belongs in the Issues section instead of the Discussions.

Problem Description

With llama.cpp it's possible to disable thinking through the API using `thinking_budget_tokens: 0`; there may also be other methods that work with it.
Currently, however, there is no convenient built-in way to disable thinking for llama.cpp.
It can be done per chat by adding `thinking_budget_tokens: 0` as a custom field in the chat settings, but there is no way to do it globally.
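As a sketch of what the request described above might look like: the following builds a chat-completion body with the extra field set, assuming llama.cpp's OpenAI-compatible server accepts `thinking_budget_tokens` as a top-level field (the model name here is hypothetical).

```python
import json

# Sketch of a chat-completion request body that disables thinking,
# assuming llama.cpp accepts thinking_budget_tokens as a top-level
# extra field, as described in this issue.
payload = {
    "model": "qwen3",  # hypothetical model name
    "messages": [{"role": "user", "content": "Hello"}],
    "thinking_budget_tokens": 0,  # 0 = disable thinking entirely
}

body = json.dumps(payload)
print(body)
```

The point of the feature request is that this field currently has to be injected per chat by hand, rather than once globally.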

Desired Solution you'd like

Either add an option for llama.cpp in the settings, or allow custom fields not only in the chat settings but globally.

Alternatives Considered

No response

Additional Context

No response


@owui-terminator[bot] commented on GitHub (May 4, 2026):

🔍 Similar Issues Found

I found some existing issues that might be related. Please check if any of these are duplicates or contain helpful solutions:

  1. #23703 issue: Notes feature not compatible with llama.cpp, enable_thinking is always injected?
    by TomTheWise · bug

  2. #8688 feat: built-in cpu-only llama cpp integration
    by tjbck · enhancement, help wanted

  3. #17428 issue: Support Think Parsing with llama.cpp + GPT-OSS
    by AbdullahMPrograms · bug

  4. #17350 issue: Llama.cpp server timing metrics not parsed correctly
    by ITankForCAD · bug

  5. #16251 issue: When using llama.cpp as backend, pressing stop doesn't stop token generation
    by OracleToes · bug


💡 If this is a duplicate, consider closing it and adding details to the existing issue.

This comment was generated automatically. React with 👍 if helpful, 👎 if not.


@Classic298 commented on GitHub (May 4, 2026):

Filter?


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 filter?


@Classic298 commented on GitHub (May 4, 2026):

This can be easily done with a filter; did you look into it? https://docs.openwebui.com/features/extensibility/plugin/functions/filter

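For reference, a minimal sketch of such a filter function, assuming the `inlet()` hook described in the Open WebUI filter docs linked above; the parameter name follows the llama.cpp usage from this issue.

```python
# Minimal sketch of an Open WebUI filter that injects
# thinking_budget_tokens: 0 into every outgoing request body.
# Assumes the inlet() hook from the filter plugin docs; not an
# official implementation.
class Filter:
    def inlet(self, body: dict) -> dict:
        # inlet() runs before the request reaches the model backend;
        # only add the field if the user hasn't already set it.
        body.setdefault("thinking_budget_tokens", 0)
        return body


if __name__ == "__main__":
    f = Filter()
    print(f.inlet({"messages": []}))
```

Using `setdefault` means a per-chat override (e.g. a nonzero budget set in the chat's custom settings) would still take precedence over the filter's global default.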

@alkeryn commented on GitHub (May 4, 2026):

@Classic298 fair enough, though I do feel like it'd make sense as a built-in feature. We can already add custom args in the current chat, so why couldn't we add them globally here?

Image

@Classic298 commented on GitHub (May 4, 2026):

@alkeryn you can; scroll further down and you can add custom parameters. Open WebUI cannot add dozens and dozens more parameters for every model- or inference-engine-dependent setting; that's why you can add custom parameters at the very end as well.


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 you can, but only for a single chat, i.e.:

Image

but the "add new custom setting" option doesn't show up in the global config here:

Image

@Classic298 commented on GitHub (May 4, 2026):

@alkeryn go to Admin Panel > Models and add it to the model there, if you wish, under the advanced parameters.


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 sure, but per-model is annoying; it would be nice to have a global setting, or at least one per connection / provider.


@Classic298 commented on GitHub (May 4, 2026):

@alkeryn Admin Panel > Settings > Models > top right: Settings. This should also do it.


@alkeryn commented on GitHub (May 4, 2026):

@Classic298 I don't see an option for it, weird.
Image


@Classic298 commented on GitHub (May 4, 2026):

Hmm, please open a feature request specifically for adding a custom param in this modal you opened here, so that the same custom param can be added to all models. Thanks!


Reference: github-starred/open-webui#58938