[GH-ISSUE #10439] Granite 3.3: thinking does not work if using system prompt #53375

Open
opened 2026-04-29 02:47:57 -05:00 by GiteaMirror · 1 comment

Originally created by @sebdotv on GitHub (Apr 28, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10439

What is the issue?

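Thinking works only when no system prompt is specified. As soon as a system message is added to the request, the thinking output disappears.

Note: the commands below use httpie (https://httpie.io/) syntax.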

http POST localhost:11434/api/chat model=granite3.3:8b stream:=false \
  'messages[]:={"role": "control", "content": "thinking"}' \
  'messages[]:={"role": "user", "content": "hello"}'

Response: contains <think>...</think> <response>...</response>, as expected.

http POST localhost:11434/api/chat model=granite3.3:8b stream:=false \
  'messages[]:={"role": "system", "content": "You are a helpful AI agent."}' \
  'messages[]:={"role": "control", "content": "thinking"}' \
  'messages[]:={"role": "user", "content": "hello"}'

Response: contains neither <think> nor <response> tags.
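For anyone without httpie, the two requests above can be approximated with the Python standard library. This is only a sketch: it assumes a local Ollama server on the default port with `granite3.3:8b` pulled, and the helper names (`build_payload`, `has_thinking`, `chat`, `run_comparison`) are made up for illustration. The reply is read from `message.content` of the non-streaming response.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # same endpoint as the httpie calls
MODEL = "granite3.3:8b"


def build_payload(system_prompt=None):
    """Build the chat request body, optionally with a leading system message."""
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    # The "control" message is what switches thinking on in the report above.
    messages.append({"role": "control", "content": "thinking"})
    messages.append({"role": "user", "content": "hello"})
    return {"model": MODEL, "stream": False, "messages": messages}


def has_thinking(content):
    """Return True if the reply contains both <think> and <response> tags."""
    return "<think>" in content and "<response>" in content


def chat(payload):
    """POST the payload to the local Ollama server and return the reply text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


def run_comparison():
    """Send both variants and report whether thinking tags show up (needs a running server)."""
    cases = [
        ("no system prompt", None),
        ("with system prompt", "You are a helpful AI agent."),
    ]
    for label, system_prompt in cases:
        reply = chat(build_payload(system_prompt))
        print(f"{label}: thinking tags present = {has_thinking(reply)}")
```

Calling `run_comparison()` against a running server should print one line per variant. If the behavior matches this report, only the variant without a system prompt would show the tags.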

Template version: 3da071a01bbe

According to the logs, the thinking-specific part was not added to the computed system prompt (expected: Respond to every user query in a comprehensive and detailed way. [...] Write your thoughts between <think></think> and write your response between <response></response> for each user query.)

Relevant log output

level=DEBUG source=routes.go:1523 msg="chat request" images=0 prompt="<|start_of_role|>system<|end_of_role|>You are a helpful AI agent.<|end_of_text|>\n<|start_of_role|>user<|end_of_role|>hello<|end_of_text|>\n<|start_of_role|>assistant<|end_of_role|>"

OS

Linux

GPU

Nvidia

CPU

No response

Ollama version

0.6.6

GiteaMirror added the bug label 2026-04-29 02:47:57 -05:00

@sebdotv commented on GitHub (Apr 28, 2025):

I believe this is because the thinking part (line 98 of https://ollama.com/library/granite3.3/blobs/3da071a01bbe) is only added if the system prompt is empty (line 70):

{{- if eq $system "" }}

Instead, it should be appended to the end of the system prompt, even when that prompt is not empty.
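The difference between the current and the suggested behavior can be modeled in a few lines of Python. This is a simplified model, not the actual Go template: `THINKING_INSTRUCTION` is a placeholder for the real instruction text, both function names are hypothetical, and joining with a newline is an assumption about how the two parts would be separated.

```python
# Placeholder for the thinking instruction in the template; the real text is not reproduced here.
THINKING_INSTRUCTION = "<thinking instruction>"


def system_prompt_current(system, thinking):
    """Model of the reported behavior: the instruction is only used when system is empty."""
    if system == "":
        return THINKING_INSTRUCTION if thinking else ""
    # A non-empty system prompt is passed through unchanged, so thinking is lost.
    return system


def system_prompt_proposed(system, thinking):
    """Model of the suggested fix: append the instruction even to a non-empty system prompt."""
    if not thinking:
        return system
    if system == "":
        return THINKING_INSTRUCTION
    # Assumed separator: a single newline between the user's prompt and the instruction.
    return system + "\n" + THINKING_INSTRUCTION


if __name__ == "__main__":
    user_system = "You are a helpful AI agent."
    print(repr(system_prompt_current(user_system, thinking=True)))
    print(repr(system_prompt_proposed(user_system, thinking=True)))
```

In this model the two functions agree whenever the system prompt is empty or thinking is off, and differ only in the case from this issue: a non-empty system prompt combined with the thinking control message.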


Reference: github-starred/ollama#53375