mirror of
https://github.com/open-webui/open-webui.git
synced 2026-05-06 02:48:13 -05:00
[GH-ISSUE #22687] issue: nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 thinking tag doesnt work #58455
Originally created by @oe3gwu on GitHub (Mar 15, 2026).
Original GitHub issue: https://github.com/open-webui/open-webui/issues/22687
Check Existing Issues
Installation Method
Docker
Open WebUI Version
v0.8.10
Ollama Version (if applicable)
N/A (using vLLM)
Operating System
Nvidia branded Ubuntu
Browser (if applicable)
Firefox
Confirmation
Expected Behavior
When chatting with the model, the reasoning should appear as a separate thinking block, not as part of the message text.
Actual Behavior
When chatting with nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4, everything works fine, but it clutters the chat with its reasoning.
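For illustration only, here is a minimal sketch of the separation the report is asking for: pulling tagged reasoning out of a raw completion so that only the answer is shown as message text. It assumes the model wraps its reasoning in `<think>...</think>` tags. The names `split_reasoning` and `ParsedReply` are hypothetical, and this is not the code used by Open WebUI, vLLM, or the NVIDIA parser.

```python
import re
from typing import NamedTuple


class ParsedReply(NamedTuple):
    reasoning: str  # text found inside the thinking tags
    content: str    # text meant to be shown as the answer


# Assumption: reasoning is delimited by <think> and </think>.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> ParsedReply:
    """Separate tagged reasoning from the visible answer text."""
    reasoning_parts = [part.strip() for part in _THINK_BLOCK.findall(raw)]
    content = _THINK_BLOCK.sub("", raw).strip()
    return ParsedReply("\n".join(reasoning_parts), content)


reply = split_reasoning("<think>The user greets me.</think>Hello!")
print(reply.reasoning)
print(reply.content)
```

When the serving layer does not perform this split, or the tags differ from what the front end expects, the whole string arrives as ordinary content, which would match the behavior described above.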
Steps to Reproduce
Logs & Screenshots
webui.txt
Additional Information
@Zambonilli commented on GitHub (Mar 15, 2026):
I was able to get nemotron-3-nano to parse correctly after downloading the NVIDIA reasoning parser and then setting the plugin and parser options.
Here is the URL to their reasoning parser: https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4/resolve/main/nano_v3_reasoning_parser.py
Then add these two flags to your vllm command:
@Classic298 commented on GitHub (Mar 15, 2026):
Thanks, @Zambonilli.
Not an Open WebUI issue, then.