[GH-ISSUE #8517] Missing tool support for DeepSeek-R1 Distillates based on Qwen #67546

Closed
opened 2026-05-04 10:45:03 -05:00 by GiteaMirror · 67 comments

Originally created by @odrobnik on GitHub (Jan 21, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8517

What is the issue?

I tried deepseek-r1:70B and ollama claims that it doesn't support tools.

{
  "error": {
    "message": "registry.ollama.ai/library/deepseek-r1:70B does not support tools",
    "type": "api_error",
    "param": null,
    "code": null
  }
}
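
This is reproducible with any /api/chat request that includes a tools array. A minimal Go sketch of such a request, assuming the default local Ollama port (the tool definition itself is a throwaway):

package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

func main() {
	// Any /api/chat request that carries a "tools" array triggers the
	// error when the model's template has no tool-call section.
	// Host/port are the Ollama defaults; the tool is a throwaway.
	body := `{"model":"deepseek-r1:70b","stream":false,
	  "messages":[{"role":"user","content":"hi"}],
	  "tools":[{"type":"function","function":{"name":"noop","description":"does nothing"}}]}`
	resp, err := http.Post("http://localhost:11434/api/chat", "application/json", strings.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out)) // expect: "... does not support tools"
}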

Looks to me like the template you have is missing the rules for tools.

The current Ollama template:

{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<|User|>{{ .Content }}
{{- else if eq .Role "assistant" }}<|Assistant|>{{ .Content }}{{- if not $last }}<|end▁of▁sentence|>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<|Assistant|>{{- end }}
{{- end }}

The template from https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-70B-GGUF includes the tool-call handling:

{% if not add_generation_prompt is defined %}
    {% set add_generation_prompt = false %}
{% endif %}
{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='') %}
{%- for message in messages -%}
    {%- if message['role'] == 'system' -%}
        {% set ns.system_prompt = message['content'] %}
    {%- endif -%}
{%- endfor -%}
{{ bos_token }}{{ ns.system_prompt }}
{%- for message in messages -%}
    {%- if message['role'] == 'user' -%}
        {%- set ns.is_tool = false -%}
        {{ '<|User|>' + message['content'] }}
    {%- endif -%}
    
    {%- if message['role'] == 'assistant' and message['content'] is none -%}
        {%- set ns.is_tool = false -%}
        {%- for tool in message['tool_calls'] -%}
            {%- if not ns.is_first -%}
                {{ '<|Assistant|><|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>' }}
                {%- set ns.is_first = true -%}
            {%- else -%}
                {{ '\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>' }}
                {{ '<|tool▁calls▁end|><|end▁of▁sentence|>' }}
            {%- endif -%}
        {%- endfor -%}
    {%- endif -%}
    
    {%- if message['role'] == 'assistant' and message['content'] is not none -%}
        {%- if ns.is_tool -%}
            {{ '<|tool▁outputs▁end|>' + message['content'] + '<|end▁of▁sentence|>' }}
            {%- set ns.is_tool = false -%}
        {%- else -%}
            {% set content = message['content'] %}
            {% if '</think>' in content %}
                {% set content = content.split('</think>')[-1] %}
            {% endif %}
            {{ '<|Assistant|>' + content + '<|end▁of▁sentence|>' }}
        {%- endif -%}
    {%- endif -%}
    
    {%- if message['role'] == 'tool' -%}
        {%- set ns.is_tool = true -%}
        {%- if ns.is_output_first -%}
            {{ '<|tool▁outputs▁begin|><|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>' }}
            {%- set ns.is_output_first = false -%}
        {%- else -%}
            {{ '\n<|tool▁output▁begin|>' + message['content'] + '<|tool▁output▁end|>' }}
        {%- endif -%}
    {%- endif -%}
{%- endfor -%}

{% if ns.is_tool %}
    {{ '<|tool▁outputs▁end|>' }}
{% endif %}

{% if add_generation_prompt and not ns.is_tool %}
    {{ '<|Assistant|>' }}
{% endif %}

OS

macOS

GPU

Apple

CPU

No response

Ollama version

0.5.7

GiteaMirror added the bug label 2026-05-04 10:45:03 -05:00

@cbjuan commented on GitHub (Jan 22, 2025):

I observe the same error for the deepseek-r1:1.5b model

{'error': {'message': 'registry.ollama.ai/library/deepseek-r1:1.5b does not support tools', 'type': 'api_error', 'param': None, 'code': None}}

@IpslWon commented on GitHub (Jan 22, 2025):

The same issue is observed in deepseek-r1:8b

OS
Ubuntu 22.04.5 LTS
GPU
Nvidia
Ollama Version
0.5.7


@arraylabs commented on GitHub (Jan 22, 2025):

And deepseek-r1:14b
Windows 11
Nvidia latest drivers
Latest Ollama


@oceanapplications commented on GitHub (Jan 23, 2025):

Same on deepseek-r1:8b on MacOS.


@majian159 commented on GitHub (Jan 23, 2025):

It seems that DeepSeek's reasoning models, including R1, currently do not support function calling.
However, DeepSeek-V3 does support function calling.
For more details, refer to: https://api-docs.deepseek.com/guides/reasoning_model#api-parameters


@odrobnik commented on GitHub (Jan 23, 2025):

@majian159 That might be true, but the model that Ollama calls deepseek-r1 (https://ollama.com/library/deepseek-r1) isn't the same; those are distilled versions, where they took the 600k reasoning traces and fine-tuned Qwen 2.5 and Llama 3 with them.

Qwen 2.5 supports tool calls very well, and so should the DeepSeek-R1 distillates based on it.

I've spent many hours experimenting with replacing the template for deepseek-r1:32B with one that I borrowed from Qwen2.5. It works quite well and is calling functions properly. The only problem I am unhappy with at this point is that it loses the initial <think> in the final response. The original template didn't have this issue. It must be something with the whitespace handling in Go templates, which I haven't figured out yet.

Original template:

{{- if .System }}{{ .System }}{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1}}
{{- if eq .Role "user" }}<|User|>{{ .Content }}
{{- else if eq .Role "assistant" }}<|Assistant|>{{ .Content }}{{- if not $last }}<|end▁of▁sentence|>{{- end }}
{{- end }}
{{- if and $last (ne .Role "assistant") }}<|Assistant|>{{- end }}
{{- end }}

This shows the <think> and </think> tokens.

My best attempt so far:

{{- if .Messages }}
{{- if or .System .Tools }}system
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>

For each function call, return a JSON object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end }}<|end▁of▁sentence|>
{{ end }}

{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|begin▁of▁sentence|>user
{{ .Content }}<|end▁of▁sentence|>

{{- else if eq .Role "assistant" }}<|begin▁of▁sentence|>assistant
{{- if .Content }}
{{ .Content }}
{{- end }}
{{- if .ToolCalls }}
<tool_call>
{{- range .ToolCalls }}
{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{- end }}
</tool_call>
{{- end }}
{{- if not $last }}<|end▁of▁sentence|>{{ end }}

{{- else if eq .Role "tool" }}<|begin▁of▁sentence|>user
<tool_response>
{{ .Content }}
</tool_response><|end▁of▁sentence|>
{{- end }}

{{- if and (ne .Role "assistant") $last }}<|begin▁of▁sentence|>assistant
{{- end }}

{{- end }}

{{- else }}
{{- if .System }}<|begin▁of▁sentence|>system
{{ .System }}<|end▁of▁sentence|>
{{- end }}

{{- if .Prompt }}<|begin▁of▁sentence|>user
{{ .Prompt }}<|end▁of▁sentence|>
{{- end }}

<|begin▁of▁sentence|>assistant
{{ .Response }}
{{- if .Response }}<|end▁of▁sentence|>{{ end }}
{{- end }}

This does tool calling, but a message without tool calls gets the <think> chomped off somehow.

I've changed the <|im_start|> and <|im_end|> to <|begin▁of▁sentence|> and <|end▁of▁sentence|> respectively. And also removed the outermost BOS and EOS because I would get a warning about duplicate BOS. Without those tags I think the model doesn't see the tool responses.

I think the problem has something to do with {{- in Go templates, which removes whitespace and maybe also a leading tag. But I am a noob at these things.
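
For reference, a minimal standalone Go sketch (standard library only, nothing Ollama-specific) of what the trim marker does: it eats adjacent whitespace, but not non-whitespace text such as a literal <think> token.

package main

import (
	"os"
	"text/template"
)

func main() {
	// Without a trim marker, the newline after <think> survives.
	plain := template.Must(template.New("plain").Parse("<think>\n{{ .Content }}"))
	// "{{- " trims all whitespace immediately before the action (spaces,
	// tabs, newlines), but it never removes non-whitespace text.
	trimmed := template.Must(template.New("trimmed").Parse("<think>\n{{- .Content }}"))

	data := map[string]string{"Content": "hello"}
	_ = plain.Execute(os.Stdout, data) // prints "<think>\nhello"
	os.Stdout.WriteString("\n---\n")
	_ = trimmed.Execute(os.Stdout, data) // prints "<think>hello"
}

So the trim marker alone should not be able to remove a literal <think> that the model emits; something else has to be dropping it.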

Who can help give it the final polish?

Also a question: I tried multiple things, but as soon as I have the section on the tool calls, there is never any content. But without the tool calls part I see thinking with embedded tool calls. Does Ollama remove all content if there are tool calls in it?


@nikito commented on GitHub (Jan 23, 2025):

I thought I had read somewhere that the distillation process breaks function calling, or at least has a negative impact on it?


@odrobnik commented on GitHub (Jan 23, 2025):

> I thought I had read somewhere that the distillation process breaks function calling, or at least has a negative impact on it?

What you are probably referring to is that during training of the first DeepSeek-R1 Zero they had issues with mixing languages and basic functionality, but that didn't affect the distills.

The function calling works fine in my tests. My only problem is that something removes the leading <think> tag from the response stream.


@odrobnik commented on GitHub (Jan 23, 2025):

I found that the problem is in Ollama itself: as soon as you add tool support to the template, the leading <think> tag gets cut off: #8552


@odrobnik commented on GitHub (Jan 23, 2025):

PS: I just tested function calling on LM Server, with this DeepSeek-R1 Qwen 32B distill: https://huggingface.co/mlx-community/DeepSeek-R1-Distill-Qwen-32B-3bit:


The model always creates thinking tokens, but they get discarded for tool calls. Nevertheless, they come through in the final message. I'm showing them in a collapsible section.

Of course I can hack it such that I add a leading <think> if I see a </think> in the content, but that's definitely an ollama bug.
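
In the meantime, a minimal client-side sketch of that hack (the helper name restoreThink is made up for illustration; this patches the symptom, not the underlying bug):

package main

import (
	"fmt"
	"strings"
)

// restoreThink re-adds a leading <think> tag when a reply contains a
// closing </think> without a matching opener. Purely a client-side
// band-aid; the name is made up for this sketch.
func restoreThink(content string) string {
	if strings.Contains(content, "</think>") && !strings.Contains(content, "<think>") {
		return "<think>" + content
	}
	return content
}

func main() {
	fmt.Println(restoreThink("reasoning...</think>final answer"))
}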


@odrobnik commented on GitHub (Jan 24, 2025):

I found that the problem with the disappearing <think> was caused by superfluous BOS tokens. Apparently those aren't necessary and the one that the tokenizer automatically adds is enough.

Another possible reason for the issues might be that several Qwen tokens used in the template might be missing due to a new pre-tokenizer that's not yet in llama.cpp #8547

Apparently this was released just now: https://github.com/ggerganov/llama.cpp/releases/tag/b4547

How can I tell which llama.cpp version is used by a particular ollama version? @rick-github


@rick-github commented on GitHub (Jan 24, 2025):

Latest vendor commit is in the header: https://github.com/ollama/ollama/blob/main/llama/llama-cpp.h


@oceanapplications commented on GitHub (Jan 27, 2025):

Built latest Ollama from main and still the same no tools error.


@tompipe commented on GitHub (Jan 27, 2025):

Could be related, or a combination of issues (or my current lack of knowledge in this realm), but I'm experiencing similar issues, and can share some findings which may help shed some light on the issue.

I'm using litellm, and when submitting a request which contains a tools object to a deepseek model on ollama, litellm performs a check to verify whether the model supports function calling. It does this (https://github.com/BerriAI/litellm/blob/6bafdbc546a0c081b686e4044f4f552acc9ce41f/litellm/llms/ollama/completion/transformation.py#L180) by retrieving the model info and checking whether "the 'template' field in the ollama_model_info contains a 'tools' or 'function' key".
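
A rough Go sketch of that kind of capability check against Ollama's /api/show endpoint (assumptions: Ollama on the default port, and a substring heuristic that only approximates litellm's actual logic):

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Ask Ollama for the model's template, then apply the same kind of
	// substring heuristic litellm is described as using. The heuristic
	// here is an approximation, not litellm's exact code.
	resp, err := http.Post("http://localhost:11434/api/show", "application/json",
		strings.NewReader(`{"model":"deepseek-r1:8b"}`))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var info struct {
		Template string `json:"template"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&info); err != nil {
		panic(err)
	}

	tmpl := strings.ToLower(info.Template)
	supportsTools := strings.Contains(tmpl, "tools") || strings.Contains(tmpl, "function")
	fmt.Println("supports tools:", supportsTools)
}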

If litellm thinks the model doesn't support function calling (i.e. no tools or function keywords in the template), then within the map_openai_params function in ollama_chat.py (https://github.com/BerriAI/litellm/blob/6bafdbc546a0c081b686e4044f4f552acc9ce41f/litellm/llms/ollama_chat.py#L159-L194) it sets up some logic so the function_call_prompt (https://github.com/BerriAI/litellm/blob/6bafdbc546a0c081b686e4044f4f552acc9ce41f/litellm/litellm_core_utils/prompt_templates/factory.py#L2966) will inject function support into the prompt.

In doing this, it also forces "format": "json" 😩

Whilst trying to debug, I've observed a few things, and I've verified these same issues with curl requests to ollama, adding the same adapted function calling prompt that litellm injects (as without this prompt 'massaging' I get the
{"error":{"message":"hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M does not support tools","type":"api_error","param":null,"code":null}} error).

I've tried these with the hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M and deepseek-r1 models

Posting to the /api/chat endpoint, if both "format": "json" and "stream": true is set in the request, I observe the following message in the ollama logs:

2025-01-27 10:50:16 time=2025-01-27T10:50:16.161Z level=DEBUG source=server.go:816 msg="prediction aborted, token repeat limit reached"

And I get repeating chunks like this in the responses, but no stop/done message (or think content):

{"model":"hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M","created_at":"2025-01-27T10:50:15.606899556Z","message":{"role":"assistant","content":" \n\n"},"done":false}
{"model":"hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M","created_at":"2025-01-27T10:50:15.627168908Z","message":{"role":"assistant","content":" \n\n"},"done":false}
{"model":"hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M","created_at":"2025-01-27T10:50:15.646847462Z","message":{"role":"assistant","content":" \n\n"},"done":false}
{"model":"hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M","created_at":"2025-01-27T10:50:15.667172833Z","message":{"role":"assistant","content":" \n\n"},"done":false}
{"model":"hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M","created_at":"2025-01-27T10:50:15.700766745Z","message":{"role":"assistant","content":" \n\n"},"done":false}
<repeated about 25 times in total>

If I set "stream": false, I get the same error in the logs, the following response, and no stop/done message (or think content):

{"model":"hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q4_K_M","created_at":"2025-01-27T10:51:47.19645733Z","message":{"role":"assistant","content":"{\n  \"name\": \"execute_shell_command\",\n  \"parameters\": {\n    \"shell_command\": \"echo 'Hello, World!' \u0026\u0026 date\"\n  }\n}\n \n\n  \t\t\t   \t                        \n\t\t\t                                                                \n\t\t\t                                                                 \n\t\t\t            \n\t\t            \n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t  \n\t\t\t\t\t\t\n\n\t\t\t\t\n\n\t\t\t\n\n\n\n  \t\t\t\t\t\t\t\n\t\t\n"},"done":false}

Without "format": "json" (streaming or not), the response completes with think content and what appears to be a correctly formatted function call in the message, along with a stop/done message, but "done_reason" is always "stop" and not "function_call".

Posting to the /v1/chat/completions OpenAI-compatible endpoint, "format": "json" and "stream": true work together here, and I don't see the superfluous \n \t tokens in any messages. Again, though, the stop message doesn't indicate a function call, despite the model generating one in the message.

If I switch to a deepseek model such as MFDoom/deepseek-r1-tool-calling:latest, which reports to litellm that it supports function calling (and therefore litellm doesn't inject into the prompt, and the "tools" object remains in the request), I don't get the does not support tools error from ollama, and the model generates a response. But I see similar issues when posting to the /api/chat endpoint when "format": "json":

2025-01-27 11:38:03 time=2025-01-27T11:38:03.501Z level=DEBUG source=server.go:816 msg="prediction aborted, token repeat limit reached"

And I never seem to get a done/end response, nor a correct "finish_reason": "tool_calls" message.

However, posting to the /v1/chat/completions endpoint, with any combination of "format": "json" and "stream": true I do get responses running to completion without errors, valid tool call responses, but no think content.

If "stream": true then I get a chunk which looks like a valid function call response, except "finish_reason": null followed by another chunk with "finish_reason": "stop"

With streaming off, I get a valid tool call response (but no think content 😒):

{
    "id": "chatcmpl-53",
    "object": "chat.completion",
    "created": 1737978619,
    "model": "MFDoom/deepseek-r1-tool-calling:latest",
    "system_fingerprint": "fp_ollama",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "call_j7shlgzh",
                        "index": 0,
                        "type": "function",
                        "function": {
                            "name": "execute_shell_command",
                            "arguments": "{\"shell_command\":\"echo 'hello' | exit\"}"
                        }
                    }
                ]
            },
            "finish_reason": "tool_calls"
        }
    ],
    "usage": {
        "prompt_tokens": 163,
        "completion_tokens": 220,
        "total_tokens": 383
    }
}

@rick-github commented on GitHub (Jan 27, 2025):

> Built latest Ollama from main and still the same no tools error.

Tool calling is a function of the model template, not the ollama binary.

> it sets up some logic so the function_call_prompt will inject function support into the prompt.

This is a common way to get a non-tool using model to use tools: https://github.com/ollama/ollama/issues/6061

> In doing this, it also forces "format": "json" 😩

It does this because, if it doesn't, models generate a lot of whitespace.

2025-01-27 10:50:16 time=2025-01-27T10:50:16.161Z level=DEBUG source=server.go:816 msg="prediction aborted, token repeat limit reached"

This is caused by the repeating whitespace.

If "stream": true then I get a chunk which looks like a valid function call response, except "finish_reason": null followed by another chunk with "finish_reason": "stop"

Streaming is currently not supported with tool support (https://github.com/ollama/ollama/issues/7886).

> With streaming off, I get a valid tool call response (but no think content 😒):

You can't get a tool response and a text response in the same completion: https://github.com/ollama/ollama/issues/8337


@tompipe commented on GitHub (Jan 27, 2025):

Thanks! That's helped clear a few things up. As I say, I'm brand new to this, and it's a pretty steep learning curve!

> > In doing this, it also forces "format": "json" 😩
>
> It does this because if it doesn't models generate a lot of whitespace.

Yeah, my concern here was that allegedly DeepSeek doesn't support JSON output, so forcing it on when the model might not support it seemed a bit odd. And I seemed to get more whitespace issues with it forced on than with it off.


And I just found it strange that ollama's OpenAI-compatible endpoint doesn't have the same issues as /api/chat when requesting JSON output.


@rick-github commented on GitHub (Jan 27, 2025):

I answered this in a different issue, but it's probably of interest to the folks subscribed to this thread.

A tool-enabled deepseek-r1 does "thinking", so in theory it is likely to respond more consistently and accurately. In theory: I haven't tested this yet.

If I make a request that uses a tool:

$ curl -s localhost:11434/api/chat -d '{
  "model":"MFDoom/deepseek-r1-tool-calling",
  "stream":false,
  "messages":[
    {"role":"user","content":"what is the weather in paris"}
  ],
  "tools":[
    {"type":"function","function":{"name":"get_current_weather","description":"get the weather"}}
  ]}'

I get the result:

{
  "model": "MFDoom/deepseek-r1-tool-calling",
  "created_at": "2025-01-27T15:25:07.447047756Z",
  "message": {
    "role": "assistant",
    "content": "",
    "tool_calls": [
      {
        "function": {
          "name": "get_current_weather",
          "arguments": {
            "location": "Paris",
            "units": "metric"
          }
        }
      }
    ]
  },
  "done_reason": "stop",
  "done": true,
  "total_duration": 2937773985,
  "load_duration": 359848546,
  "prompt_eval_count": 95,
  "prompt_eval_duration": 17000000,
  "eval_count": 206,
  "eval_duration": 2559000000
}

The text response from the model is converted to a tool call and the content is discarded. However, if we look behind the scenes and monitor the direct output of the model, we can see that it is doing "thinking":

<think>

Alright, I'm trying to figure out how to respond to this user's query.
They want me to call a function called get _current _weather and provide
its parameters.  The prompt asks for the current weather in Paris.
First, I need to look at the existing JSON template they provided.
It has an example with "get_current_weather " as the name and an
empty dictionary for parameters.  My task is to fill that dictionary
properly.  I remember that when making API calls, you usually include
parameters like location.  Since the user specified Paris, I should
set the location parameter to "Paris ".  Next, considering the units
whether they want metric or imperial measurements I'll default to
"metric" unless otherwise specified.  So, I'll add "units": "metric".
Putting it all together, the JSON should have these two keys : "location"
and "units", with their respective values.
</think>

{"name": "get_current_weather", "parameters": {"location": "Paris", "units": "metric"}}

@odrobnik commented on GitHub (Jan 27, 2025):

There are three main issues:

  1. The current template is missing the tool-calling support parts you can see e.g. in the one for Qwen2.5.

  2. There is some part of ollama that looks for tool calls in the original output and, if there are any, discards the other text content. Ideally we would still get the generated thinking in the content. I am not certain about this, but whenever the template got tool calls from the model, the content was empty.

  3. I semi-successfully added tool calling to the template, but it seems a bit unreliable. I think this is because the special tokens for the tool calls are missing, due to the tokenizer not implementing them.


@DevinduSamarasinghe commented on GitHub (Jan 30, 2025):

https://ollama.com/MFDoom/deepseek-r1-tool-calling:8b

The above model supports tool calling.


@niltonvasques commented on GitHub (Jan 31, 2025):

I tried it here on Langflow, but it doesn't call the tool correctly; at least it is not raising errors saying that the model has no tool-calling features.


@rick-github commented on GitHub (Jan 31, 2025):

I believe that some frameworks hide the fact that some models don't do tools, and use the old insert-tool-in-system-prompt method to implement function calling.
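
For illustration, a minimal Go sketch of that insert-tool-in-system-prompt pattern (the injected wording here is made up; each framework phrases it differently):

package main

import (
	"encoding/json"
	"fmt"
)

// buildSystemPrompt injects tool definitions into a plain system prompt so
// that a model without native tool support can still emit tool calls as
// JSON. The prompt wording is illustrative only.
func buildSystemPrompt(tools []map[string]any) string {
	spec, _ := json.Marshal(tools)
	return "You may call one of the following tools when needed:\n" +
		string(spec) +
		"\nTo call a tool, respond ONLY with JSON of the form " +
		`{"name": "<function>", "parameters": {...}}.`
}

func main() {
	tools := []map[string]any{
		{"type": "function", "function": map[string]any{
			"name":        "get_weather",
			"description": "Get current temperature for a given location.",
		}},
	}
	fmt.Println(buildSystemPrompt(tools))
}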


@odrobnik commented on GitHub (Jan 31, 2025):

> https://ollama.com/MFDoom/deepseek-r1-tool-calling:8b
>
> The above model supports tool calling.

I tested it; there are several issues:

  1. For the first couple of tries it didn't even call any tools.

  2. There is a leading </think> in the output.

"</think>\n\n**Step 1:** The user wants to know whose birthday is next between Oliver and Sylvia.\n\n**Step 2:** I\'ll use the `birthdate` tool to find both of their birthdates.\n\n**Step 3:** After getting both dates, I\'ll apply the `date_difference` tool to determine how many days each has left until their birthday this year.\n\n**Step 4:** Compare the two numbers to see who has a closer birthday in terms of days remaining.\n</think>\n\n**Final Answer:**\n\nSylvia\'s birthday is next this year."

  3. The 14B version does call some tools, but it doesn't come to the right conclusions in my test case involving multiple reasoning steps.

  4. Even with the 32B version I get results like this:

[{"index":0,"message":{"role":"assistant","content":"\u003c/think\u003e\n\n{\n \"name\": \"date_ difference\",\n \"parameters\": {\n  \"first Date\": get_current_date (),\n  \"last_Date\": birthdate ({user: \"Oliver\"})\n }\n}"},"finish_reason":"stop"}],"usage":{"prompt_tokens":471,"completion_tokens":42,"total_tokens":513}}

At other times there are spaces in the names of functions. Very odd and unreliable behavior.


@tompipe commented on GitHub (Feb 2, 2025):

So I gave it a full test using the OpenAI examples (https://platform.openai.com/docs/guides/function-calling) - hopefully this adds further context.

Request 1
curl -s localhost:33821/v1/chat/completions -d '{
    "model": "MFDoom/deepseek-r1-tool-calling",
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in Paris today?"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current temperature for a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City and country e.g. Bogotá, Colombia"
                        }
                    },
                    "required": [
                        "location"
                    ],
                    "additionalProperties": false
                },
                "strict": true
            }
        }
    ]
}'

Ollama log entry

2025-02-02 16:28:39 time=2025-02-02T16:28:39.269Z level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="\r\nWhen using a tool, format as:\r\n{\"name\": \"function_name\", \"parameters\": {\"param1\": \"value1\"}}\r\nThe following tools are available when needed for specific tasks:\r\n[{\"type\":\"function\",\"function\":{\"name\":\"get_weather\",\"description\":\"Get current temperature for a given location.\",\"parameters\":{\"type\":\"object\",\"required\":[\"location\"],\"properties\":{\"location\":{\"type\":\"string\",\"description\":\"City and country e.g. Bogotá, Colombia\"}}}}}]<|User|>What is the weather like in Paris today?\r\nGiven the tools, please respond with a JSON object for a function call with its proper arguments that best answers the given prompt.\r\nRespond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}. Do not use variables.<|Assistant|>\r\n\r\n"

Response

{
    "id": "chatcmpl-295",
    "object": "chat.completion",
    "created": 1738513724,
    "model": "MFDoom/deepseek-r1-tool-calling",
    "system_fingerprint": "fp_ollama",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "call_y4oi4to8",
                        "index": 0,
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": "{\"location\":\"Paris, France\"}"
                        }
                    }
                ]
            },
            "finish_reason": "tool_calls"
        }
    ],
    "usage": {
        "prompt_tokens": 160,
        "completion_tokens": 263,
        "total_tokens": 423
    }
}

I then submitted the follow-up request, appending the tool-call message and the output of the tool message to the original request.

Request 2
curl -s localhost:33821/v1/chat/completions -d '{
    "model": "MFDoom/deepseek-r1-tool-calling",
    "messages": [
        {
            "role": "user",
            "content": "What is the weather like in Paris today?"
        },
        {
            "role": "assistant",
            "content": "",
            "tool_calls": [
                {
                    "id": "call_0r878gkj",
                    "index": 0,
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "arguments": "{\"location\":\"Paris, France\"}"
                    }
                }
            ]
        },
        {
            "role": "tool",
            "tool_call_id": "call_0r878gkj",
            "content": "The current temperature in Paris is 14°C (57.2°F)."
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current temperature for a given location.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City and country e.g. Bogotá, Colombia"
                        }
                    },
                    "required": [
                        "location"
                    ],
                    "additionalProperties": false
                },
                "strict": true
            }
        }
    ]
}'

Ollama log entry

2025-02-02 16:30:50 time=2025-02-02T16:30:50.573Z level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="\r\nWhen using a tool, format as:\r\n{\"name\": \"function_name\", \"parameters\": {\"param1\": \"value1\"}}\r\nThe following tools are available when needed for specific tasks:\r\n[{\"type\":\"function\",\"function\":{\"name\":\"get_weather\",\"description\":\"Get current temperature for a given location.\",\"parameters\":{\"type\":\"object\",\"required\":[\"location\"],\"properties\":{\"location\":{\"type\":\"string\",\"description\":\"City and country e.g. Bogotá, Colombia\"}}}}}]<|User|>What is the weather like in Paris today?\r\n<|Assistant|>\r\n<|end▁of▁sentence|>\r\n<|tool▁outputs▁begin|>\r\n<|tool▁output▁begin|>\r\nThe current temperature in Paris is 14°C (57.2°F).\r\n<|tool▁output▁end|>\r\n<|tool▁outputs▁end|><|Assistant|>\r\n\r\n"

Response

{
    "id": "chatcmpl-65",
    "object": "chat.completion",
    "created": 1738513851,
    "model": "MFDoom/deepseek-r1-tool-calling",
    "system_fingerprint": "fp_ollama",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "call_2idcm56l",
                        "index": 0,
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": "{\"location\":\"Paris\"}"
                        }
                    }
                ]
            },
            "finish_reason": "tool_calls"
        }
    ],
    "usage": {
        "prompt_tokens": 174,
        "completion_tokens": 23,
        "total_tokens": 197
    }
}

Which seems like it just wants to call the tool again. But if the follow-up request is submitted without the tools object:

Response 3

Ollama log entry

2025-02-02 16:33:52 time=2025-02-02T16:33:52.068Z level=DEBUG source=routes.go:1470 msg="chat request" images=0 prompt="<|User|>What is the weather like in Paris today?\r\n<|Assistant|>\r\n<|end▁of▁sentence|>\r\n<|tool▁outputs▁begin|>\r\n<|tool▁output▁begin|>\r\nThe current temperature in Paris is 14°C (57.2°F).\r\n<|tool▁output▁end|>\r\n<|tool▁outputs▁end|><|Assistant|>\r\n\r\n"
{
    "id": "chatcmpl-75",
    "object": "chat.completion",
    "created": 1738514032,
    "model": "MFDoom/deepseek-r1-tool-calling",
    "system_fingerprint": "fp_ollama",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "</think>\n\nThe current weather conditions in Paris are as follows: the temperature is **14°C** (57.2°F)."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 78,
        "completion_tokens": 28,
        "total_tokens": 106
    }
}

Then it does digest the tool output, but as @odrobnik observed, it has a leading </think> along with some other occasional weirdness, like:

  "message": {
      "role": "assistant",
      "content": "</think>\n\n<|tool▁outputs●begin|>\n<|tool●outputs|>\n<|tool output starts here|>\n\nThe current weather in Paris is [insert live weather description].\n\n[insert real-time data such as temperature, precipitation chances, wind speed, and conditions like sunny, cloudy, rainy.]\n\n<|tool output ends here|>\n<|tool●outputs|>\n<|tool outputs●end |>\n\nPlease note: The information provided should be checked through a reliable weather service for accuracy."
  }

Other notes

  • Ensuring "additionalProperties": false and "strict": true were present in the request prevented it from 'making up' function arguments like time=today or calling non-existent functions like get_weather_today
  • Re-submitting the same request appears not to use the given tool output if it's already been consumed; it'll respond with something like "I can't get current weather, consider consulting a reliable weather service like AccuWeather...". But resubmitting the request with an edited messages[1].tool_calls[0].id and messages[2].tool_call_id seems to work (e.g. from call_0r878gkj to call_0r878gkk). Is there some sort of caching or a mechanism ensuring tool calls can only be used once?
  • At some stage during this testing, I did get a valid think output along with it digesting the results of a tool call; the message was something like "Okay, so I'm looking at this problem where the user asked to get the weather in Paris. They provided a chat history showing previous interactions with me. I see that in earlier interactions that ...." and it correctly spat out the current weather after its reasoning. But unfortunately I haven't been able to reproduce it again.

I'll keep digging

@odrobnik commented on GitHub (Feb 2, 2025):

I am pretty sure that ollama doesn't know how to assign tool call responses to the proper tool calls if there is more than one.

If there is just one it often works, but with multiple responses it won't work reliably, because the id field doesn't get put into the template, so the model only sees the pure response text.

To prove this, you can add the function name and parameters in front of the response; the LLM will then know which answer belongs to which call. I think this might be an ollama issue in general, because LM Studio seems to have no issue with multiple tool responses.
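
For illustration, a minimal Go sketch of that workaround (the "tool" message shape matches the OpenAI-style payloads above; the prefix format is my own invention, not an ollama API):

// Sketch of the workaround described above (my own illustration): prefix
// each tool result with the call that produced it, so the model can match
// answers to calls even though the template drops the tool_call_id.
package main

import (
	"encoding/json"
	"fmt"
)

// result pairs a finished tool call with the text its tool returned.
type result struct {
	Name string // function that was called
	Args string // raw JSON arguments it was called with
	Text string // what the tool returned
}

func main() {
	results := []result{
		{"get_weather", `{"location":"Paris, France"}`, "The current temperature in Paris is 14°C (57.2°F)."},
		{"get_weather", `{"location":"Berlin, Germany"}`, "The current temperature in Berlin is 9°C (48.2°F)."},
	}

	// Build the "tool" messages for the follow-up request, prefixing each
	// result with the originating call so the model can disambiguate them.
	var messages []map[string]string
	for _, r := range results {
		messages = append(messages, map[string]string{
			"role":    "tool",
			"content": fmt.Sprintf("%s(%s) returned: %s", r.Name, r.Args, r.Text),
		})
	}
	out, _ := json.MarshalIndent(messages, "", "  ")
	fmt.Println(string(out))
}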

@tompipe commented on GitHub (Feb 4, 2025):

Ok, so a little more digging, and I think I've uncovered one of the possible/related issues. This one I think is the cause of the occasional multiple/duplicate tool calls, and perhaps also of the sporadic random function names in the tool_calls array (followed by a valid tool call).

If (as I understand it) the ollama code for parsing the tool calls (https://github.com/ollama/ollama/blob/f9d2d8913554d78b1cae47c5eaa9cbbd0ea79273/server/model.go#L132) receives the full response, including the content between the <think> tags, then suppose the model returned:

<think>
Okay, I need to figure out how to respond to the user's question about the weather in Paris today using the tools provided. The available tool is "get_weather," which requires a location parameter. 

First, the user's query is pretty straightforward: they just want the current temperature and maybe some additional info for Paris, France. Since the tool specifically asks for the location as a string that includes both the city and country, I should make sure to use "Paris, France" as the location.

I should structure my response according to the required JSON format. That means calling the function with the appropriate parameters. So it would be {"name": "get_weather", "parameters":{"location":"Paris, France"}}.

Once I send this request, the tool will process it and return a JSON object containing the temperature data for that location today. After receiving the response, I'll format it into an explanation so the user understands how to interpret the results.
</think>

{"name": "get_weather", "parameters":{"location":"Paris, France"}}

Ollama's parseObjects function will try to parse the content (including anything inside the think output) for any valid JSON. If this maps to a valid function call with a name and arguments, then it appears that multiple/duplicate tool call entries will be spat out, something like:

{
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": "{\"location\":\"Paris, France\"}"
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": "{\"location\":\"Paris, France\"}"
            }
        }
    ]
}

This may also explain a few of the responses where I've seen it want to call non-existent functions like get_weather_today.

If, during the thinking, the model considered an ideal function (such as "get_weather_today") or evaluated other tools before settling on the correct one to call, these considerations would still be output in ollama's response, and could/would end up being called by the client.

Slightly concerning given the contrived scenario where a model is provided CRUD tools, and it considers each one, e.g:

<think>
... OK, I won't call {\"name\": \"delete_record\", \"parameters\":{\"id\":\"1\"}} because the documentation says that will delete the record. I think the right tool call would be {\"name\": \"update_record\", \"parameters\":{\"id\":\"1\"}}
</think>
{\"name\": \"update_record\", \"parameters\":{\"id\":\"1\"}}

I've set up an example go playground (https://go.dev/play/p/qK4vKE3Z6r9) using the code lifted from ollama, as a quick 'proof of concept', if anyone wants to confirm my findings.

I'm sure this isn't just going to affect the DeepSeek/Qwen distillates, so it probably wants to be hacked off into a separate issue.

@odrobnik commented on GitHub (Feb 4, 2025):

I believe that the tool call parsing should ignore the thinking part. Ideally the thinking would be put into the reasoning_content property, as the original DeepSeek API does.
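
For illustration, a sketch of that split (my own assumption of the shape; the reasoning_content field name mirrors DeepSeek's hosted API, not anything ollama currently emits):

// Sketch only: separate a raw completion into reasoning_content and
// content, mirroring the field DeepSeek's own API exposes.
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type message struct {
	Role             string `json:"role"`
	Content          string `json:"content"`
	ReasoningContent string `json:"reasoning_content,omitempty"`
}

// splitThink separates reasoning from the answer. It tolerates both a full
// <think>...</think> block and the truncated leading </think> seen in the
// logs above.
func splitThink(raw string) message {
	m := message{Role: "assistant", Content: strings.TrimSpace(raw)}
	if i := strings.Index(raw, "</think>"); i >= 0 {
		m.ReasoningContent = strings.TrimSpace(strings.TrimPrefix(raw[:i], "<think>"))
		m.Content = strings.TrimSpace(raw[i+len("</think>"):])
	}
	return m
}

func main() {
	raw := "<think>\nThe tool already returned 14°C for Paris, so I can answer directly.\n</think>\n\nThe current temperature in Paris is 14°C (57.2°F)."
	out, _ := json.MarshalIndent(splitThink(raw), "", "  ")
	fmt.Println(string(out))
}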

@rick-github commented on GitHub (Feb 4, 2025):

You're expending a lot of effort to get the distillate to use tools. Why not just use the original base model?

@tompipe commented on GitHub (Feb 4, 2025):

I believe that the tool call parsing should ignore the thinking part.

As far as I can tell, parseToolCalls (https://github.com/ollama/ollama/blob/65b7ecac7bd4346fae8f49764b0d6d2eb8de39ae/server/routes.go#L1505) is being passed the full content, which I believe still includes the think tags, though I'm not familiar with Go or the ollama pipeline, so I can't be certain. But it definitely aligns with my findings/testing.

Ideally the thinking would be put into the reasoning_content property, as the original DeepSeek API does.

Absolutely. This would be great

@tompipe commented on GitHub (Feb 4, 2025):

You're expending a lot of effort to get the distillate to use tools. Why not just use the original base model?

Ha, I would if I could download moar vramz 😄

@rick-github commented on GitHub (Feb 4, 2025):

Not deepseek-r1:671b, qwen2.5:7b or llama3.1:8b.

@tompipe commented on GitHub (Feb 4, 2025):

Not deepseek-r1:671b, qwen2.5:7b or llama3.1:8b.

Ah I see. Well if something doesn't work as it should, I tend to get an itch to understand why. And this one got me scratching 🤣

@wangjiyang commented on GitHub (Feb 5, 2025):

https://ollama.com/MFDoom/deepseek-r1-tool-calling:8b
The above model supports tool calling.

I tested it; there are several issues:

  1. for the first couple of tries it didn't even call any tools.
  2. There is a leading </think> in the output.
"</think>\n\n**Step 1:** The user wants to know whose birthday is next between Oliver and Sylvia.\n\n**Step 2:** I\'ll use the `birthdate` tool to find both of their birthdates.\n\n**Step 3:** After getting both dates, I\'ll apply the `date_difference` tool to determine how many days each has left until their birthday this year.\n\n**Step 4:** Compare the two numbers to see who has a closer birthday in terms of days remaining.\n</think>\n\n**Final Answer:**\n\nSylvia\'s birthday is next this year."

The 14B version does call some tools, but it doesn't come to the right conclusions in my test case involving multiple reasoning steps.

Even with the 32B version I get results like this:

[{"index":0,"message":{"role":"assistant","content":"\u003c/think\u003e\n\n{\n \"name\": \"date_ difference\",\n \"parameters\": {\n  \"first Date\": get_current_date (),\n  \"last_Date\": birthdate ({user: \"Oliver\"})\n }\n}"},"finish_reason":"stop"}],"usage":{"prompt_tokens":471,"completion_tokens":42,"total_tokens":513}}

At other times there are spaces in the names of functions. Very odd and unreliable behavior.

I also noticed some extra spaces are generated after "." or "/"; e.g., stdio.h became "stdio. h". This happened even in normal chat generation without tool support. After some investigation, it is very probably caused by the tokenizer.

@mozophe commented on GitHub (Feb 11, 2025):

I think this issue is partly related to https://github.com/ollama/ollama/issues/8982.

Also, there are a few tool supported versions of DeepSeek-R1-Qwen2.5 available in ollama library: https://ollama.com/search?c=tools&q=deepseek

Could be a good idea to either test those models or use their templates.

@robwilkes commented on GitHub (Feb 12, 2025):

This may not help, however Groq supports R1 distill models for tool calling, so it definitely seems possible to do.

https://console.groq.com/docs/tool-use

@xiaoming2624 commented on GitHub (Feb 14, 2025):

hi,
Also with deepseek-r1:14b
Ubuntu 20.04.6 LTS
Latest Ollama
webui-dify
search-duckduckgo or searxng

[ollama] Error: API request failed with status code 400:
{"error":"registry.ollama.ai/library/deepseek-r1:14b does not support tools"}

(screenshot: https://github.com/user-attachments/assets/9aafd1b7-18dc-4632-877b-663e671cb2cb)

@codex-horizon commented on GitHub (Feb 26, 2025):

I observed the same error with the deepseek-r1:1.5b model:

{'error': {'message': 'registry.ollama.ai/library/deepseek-r1:1.5b does not support tools', 'type': 'api_error', 'param': None, 'code': None}}

java.lang.RuntimeException: [400] Bad Request - {"error":"registry.ollama.ai/library/deepseek-r1:70b does not support tools"}

@f2bo commented on GitHub (Feb 26, 2025):

@odrobnik Have you seen this issue https://github.com/ggml-org/llama.cpp/issues/11861#issuecomment-2660718488? It seems to be the same and apparently has been fixed, though I'm not sure when the fix will be taken by ollama.

@zoeshawwang commented on GitHub (Mar 1, 2025):

same here

@jesusmogollon commented on GitHub (Mar 3, 2025):

+1

@vishalmakwana111 commented on GitHub (Mar 6, 2025):

+1

@blackhawkee commented on GitHub (Mar 9, 2025):

same issue +1

@liupums commented on GitHub (Mar 12, 2025):

figured out and confirmed that qwen2.5 supports tools

https://ollama.com/library/qwen2.5

$ curl -s localhost:11434/api/chat -d '{ "model":"qwen2.5", "stream":false, "messages":[ {"role":"user","content":"what is the weather in paris"} ], "tools":[ {"type":"function","function":{"name":"get_current_weather","description":"get the weather"}} ]}'

{"model":"qwen2.5","created_at":"2025-03-12T01:25:02.5036295Z","message":{"role":"assistant","content":"","tool_calls":[{"function":{"name":"get_current_weather","arguments":{"query":"Paris"}}}]},"done_reason":"stop","done":true,"total_duration":4318202200,"load_duration":2173890400,"prompt_eval_count":147,"prompt_eval_duration":770000000,"eval_count":21,"eval_duration":1111000000}

@crystal-coding-time commented on GitHub (Apr 27, 2025):

Same issue +1

@maccman commented on GitHub (May 20, 2025):

Same issue

@JunHyeokYoo commented on GitHub (May 20, 2025):

+1

@YuSheng1223 commented on GitHub (May 23, 2025):

same issue

@udaykumarbpatel commented on GitHub (May 24, 2025):

+1

@wuhongsheng commented on GitHub (May 26, 2025):

+1

@anishcorratech commented on GitHub (Jun 7, 2025):

same issue,

ollama version is 0.9.0

2025/06/07 13:40:42 INFO Model loaded provider=ollama model=deepseek-r1:1.5b
2025/06/07 13:40:42 INFO Initializing server... name=filesystem
2025/06/07 13:40:42 INFO Initializing server... name=sqlite
2025/06/07 13:40:44 INFO Server connected name=filesystem
2025/06/07 13:40:44 INFO Server connected name=sqlite
2025/06/07 13:40:44 INFO Tools loaded server=filesystem count=11
2025/06/07 13:40:44 INFO Tools loaded server=sqlite count=6

  You: hi
2025/06/07 13:40:47 INFO Shutting down MCP servers...
2025/06/07 13:40:47 INFO Server closed name=filesystem
2025/06/07 13:40:47 INFO Server closed name=sqlite
Error: registry.ollama.ai/library/deepseek-r1:1.5b does not support tools

@ctcanbol commented on GitHub (Jun 23, 2025):

Are there any plans to fix this?

@Jayian1890 commented on GitHub (Jun 26, 2025):

Since January with no fix is kinda absurd, ngl

@qwerty108109 commented on GitHub (Jul 3, 2025):

I do not think there's any question here: the model itself definitely supports tool calling, but Ollama lacks tool support for this model.
This is definitely a bug that has been encountered by more than a few people.

@qwerty108109 commented on GitHub (Jul 3, 2025):

I can confirm this issue has not been fixed on the latest version of Ollama v0.9.5.

@tko commented on GitHub (Jul 3, 2025):

might help: https://github.com/ollama/ollama/pull/11273

@ctcanbol commented on GitHub (Jul 3, 2025):

I just switched to vLLM. Plus, better tokens per sec. Not fixing this bug for such a popular model for more than a month is unacceptable.

@qwerty108109 commented on GitHub (Jul 3, 2025):

Ollama is a completely volunteer project to my knowledge, whereas vLLM has money and institutional backers.
@ctcanbol

@Jayian1890 commented on GitHub (Jul 4, 2025):

Ollama is a completely volunteer project to my knowledge. Where VLLM has money and institutional backers. @ctcanbol

I can submit a pull request if it’s worth the effort. Don’t want to waste the time if it won’t be merged.

@qwerty108109 commented on GitHub (Jul 4, 2025):

Just to caveat this: I do not speak on behalf of the Ollama project, but to my knowledge Ollama is constantly looking to add new contributors and does have very high quality control standards. Here is a link to the developer guide if you're interested in contributing: https://github.com/ollama/ollama/blob/main/docs/development.md
@Jayian1890

@ver007 commented on GitHub (Jul 7, 2025):

I just switched to vLLM. Plus, better tokens per sec. Not fixing this bug for such a popular model for more then a month is unacceptable.

Does vLLM + DeepSeek tool calling work OK?

@ver007 commented on GitHub (Jul 7, 2025):

Ollama is a completely volunteer project to my knowledge. Where VLLM has money and institutional backers. @ctcanbol

You are delaying the solution because the model originated from China. This is regional discrimination.

@mcr-ksh commented on GitHub (Jul 26, 2025):

+1

registry.ollama.ai/library/deepseek-r1:7b does not support tools

Also for the 7b model. A pity.

@tomaszkiewicz commented on GitHub (Jul 26, 2025):

https://docs.boundaryml.com/home - this is NOT a solution to the problem, but the project is VERY interesting in the way it handles prompting and tooling. Moreover, it doesn't require native tool calling support, so it works with ANY model; I tried it with many models on Ollama, including DeepSeek and Qwen.

@ParthSareen commented on GitHub (Aug 3, 2025):

Hey folks! Sorry, I didn't see this issue earlier, but we have been working for a while to try and get tool calling working on the distills.

There are a couple of issues which make it difficult to support right now.

  1. The distills do not output the right token sequence, compared to the full-sized model, for tool calling. Our parser needs the prefix to match correctly in order to parse the tool call. We need this information to be accurate as we have to distinguish between thinking, chatting, and tool calling.

  2. Custom instructions can get shifted out. This is kind of fixable just by putting them in a system prompt, but it's inconsistent and they can potentially be shifted out on long prompts - I have some fixes for this coming in the near future.

The only realistic way to get tool calling on these models is to early-execute into a constrained output. That comes with its own challenges, like output quality, but I do intend to scope this in.

Sorry that you guys are running into this, but we are trying to improve this process. As for workarounds, I'd recommend modifying the template yourself with a custom prompt. YMMV, which is why we haven't done it yet. But worth a shot. Going to close this out for now.
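
For anyone attempting the prompt workaround in the meantime, a hedged sketch: the endpoint and request fields follow the documented /api/chat API, but the prompt wording, model tag, and tool name are illustrative, not a recipe from the ollama team.

// Hedged sketch of the prompt-only workaround: describe the tool in a
// system message, ask for a bare JSON call, and parse it client-side.
// Because no "tools" field is sent, the capability check that produces
// the "does not support tools" error should never fire.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type chatMsg struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatReq struct {
	Model    string    `json:"model"`
	Stream   bool      `json:"stream"`
	Messages []chatMsg `json:"messages"`
}

type chatResp struct {
	Message chatMsg `json:"message"`
}

func main() {
	// Illustrative system prompt and tool; adapt to your own tools.
	system := `You may call the tool get_weather(location: string).
To call it, reply with ONLY this JSON: {"name":"get_weather","parameters":{"location":"..."}}`

	body, _ := json.Marshal(chatReq{
		Model:  "deepseek-r1:14b",
		Stream: false,
		Messages: []chatMsg{
			{Role: "system", Content: system},
			{Role: "user", Content: "What is the weather like in Paris today?"},
		},
	})
	resp, err := http.Post("http://localhost:11434/api/chat", "application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out chatResp
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	// Strip any <think> block (see the earlier sketches) before parsing
	// the JSON call out of the content yourself.
	fmt.Println(out.Message.Content)
}

The trade-off is that the client owns parsing and validation, and output quality will vary by model size, which matches the YMMV caveat above.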

@antarasi commented on GitHub (Dec 3, 2025):

If you're not going to support tool calling for deepseek-r1, please consider removing the "tools" tag from the model page:
https://ollama.com/library/deepseek-r1

It's confusing users that this model supports tool calling.

@ver007 commented on GitHub (Dec 4, 2025):

If you're not going to support tool calling for deepseek-r1, please consider removing the "tools" tag from the model page: https://ollama.com/library/deepseek-r1

It's confusing users that this model supports tool calling.

The developers of this project harbored racial prejudice and deliberately selectively supported certain features of the model.

@rafaelcapucho commented on GitHub (Mar 6, 2026):

If you're not going to support tool calling for deepseek-r1, please consider removing the "tools" tag from the model page: https://ollama.com/library/deepseek-r1

It's confusing users that this model supports tool calling.

You're right. I just lost many hours trying to make it work.

@Amamax commented on GitHub (Mar 30, 2026):

The models they provide support the displayed tags, just not the custom distillates you can download from somewhere else... But yeah, there is really a lack of info about how the capability discovery process works.

Reference: github-starred/ollama#67546