[GH-ISSUE #10981] Ollama v0.9.0 breaks with Phi4-mini tools declaration #32998

Closed
opened 2026-04-22 15:04:46 -05:00 by GiteaMirror · 18 comments

Originally created by @JpEncausse on GitHub (Jun 5, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10981

What is the issue?

Declaring a tool with Phi4-mini triggers a parsing error from Ollama. It worked with previous versions of Ollama.

```
{
  "name": "LLM_Tool_RAG",
  "description": "Recherches hybrides (texte + vecteur) dans la base Orama",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "object",
        "properties": {
          "term": { "type": "string" },
          "vector": {
            "type": "object",
            "properties": {
              "value": { "type": "string" }
            },
            "required": ["value"]
          }
        },
        "required": ["term", "vector"]
      }
    },
    "required": ["query"]
  }
}
```

Relevant log output

```shell
HTTP 500 - Internal Server Error
{"error":"unexpected end of JSON input"}
http://localhost:11434/api/chat
```
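
For reference, a request that exercises this declaration against the endpoint in the log would look roughly like the sketch below. It is a sketch, not the exact call used here: the user message is a placeholder, and the `tools` wrapper is simply the standard `/api/chat` request shape with the function schema from the report dropped in.

```shell
# Placeholder prompt; the function schema below is copied verbatim from the report.
curl http://localhost:11434/api/chat -d '{
  "model": "phi4-mini",
  "stream": false,
  "messages": [{ "role": "user", "content": "Search the Orama base for AI" }],
  "tools": [{
    "type": "function",
    "function": {
      "name": "LLM_Tool_RAG",
      "description": "Recherches hybrides (texte + vecteur) dans la base Orama",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {
            "type": "object",
            "properties": {
              "term": { "type": "string" },
              "vector": {
                "type": "object",
                "properties": { "value": { "type": "string" } },
                "required": ["value"]
              }
            },
            "required": ["term", "vector"]
          }
        },
        "required": ["query"]
      }
    }
  }]
}'
```

On 0.9.0 with the stock phi4-mini template, a request of this shape is what returns the HTTP 500 shown above.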

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.9.0

GiteaMirror added the bug label 2026-04-22 15:04:46 -05:00

@rick-github commented on GitHub (Jun 5, 2025):

This seems to be due to the malformed template in the phi4-mini model; see #9437.

Using a modified template results in tool calls being generated, although in my experience, phi4-mini is a poor tool user.

```console
$ ./ollama-tool.py --model phi4-mini:3.8b --system 'You are a digital assistant who is responsible for helping the user with tasks using the provided tools.  For tools calls, return a JSON structure of the form `{"name", function, "arguments", json-args-list}`' --tools get_datetime --prompt "What is the the time?"
Traceback (most recent call last):
  File "/home/rick/docker/aitoolkit/./ollama-tool.py", line 518, in <module>
    messages = chat(messages, p)
               ^^^^^^^^^^^^^^^^^
  File "/home/rick/docker/aitoolkit/./ollama-tool.py", line 464, in chat
    response = ollama.chat(model=args.model, options=options, messages=messages, tools=tools)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rick/miniconda3/lib/python3.12/site-packages/ollama/_client.py", line 342, in chat
    return self._request(
           ^^^^^^^^^^^^^^
  File "/home/rick/miniconda3/lib/python3.12/site-packages/ollama/_client.py", line 180, in _request
    return cls(**self._request_raw(*args, **kwargs).json())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rick/miniconda3/lib/python3.12/site-packages/ollama/_client.py", line 124, in _request_raw
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: unexpected end of JSON input (status code: 500)


$ ./ollama-tool.py --model phi4-mini:3.8b-newtooltemplate --system 'You are a digital assistant who is responsible for helping the user with tasks using the provided tools.  For tools calls, return a JSON structure of the form `{"name", function, "arguments", json-args-list}`' --tools get_datetime --prompt "What is the the time?"
calling get_datetime({'timezone_name': ''})
The current time provided is in the format you requested:

- Fulldate (Full Date and Time): Thursday, June 05, 2025 17:35
- Just Date: Thursday, June 05, 2025
- Just Time: 17:35

Please note that this date seems to be set far into the future from today's actual knowledge cutoff in early 2023. If you're looking for an accurate and current time check based on your local timezone or a specific location's timezone you mentioned earlier like `America/New_York`, please let me know, so I can correct it accordingly!
```

@pavelai commented on GitHub (Jun 5, 2025):

@rick-github All mini models are poor tool users (compared to big models) as of now, but they're good enough for testing and research purposes, and they should still be expected to work. This has been broken for a pretty long time, even though it's marked as fixed, which is highly confusing to me. How can users update a local model to get the fixed template?


@rick-github commented on GitHub (Jun 5, 2025):

> How can users update a local model to get the fixed template?

By modifying the template as shown in #9437. See [here](https://github.com/ollama/ollama/blob/main/docs/modelfile.md) for more details on Modelfiles.
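
For reference, the usual local workflow is roughly the following sketch; `phi4-mini-fixed` is just an example name for the rebuilt model.

```shell
# Dump the current Modelfile of the local model.
ollama show --modelfile phi4-mini > Modelfile
# Edit the TEMPLATE section as described in #9437, then build a new local model from it.
ollama create phi4-mini-fixed -f Modelfile
# Use the new name in requests.
ollama run phi4-mini-fixed
```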


@pavelai commented on GitHub (Jun 6, 2025):

@rick-github Thanks! But I thought it would be a command like `ollama pull` with some flag to update only the Modelfile. How could it be fixed for all fresh installs of phi4-mini? Is that possible?


@rick-github commented on GitHub (Jun 6, 2025):

If the model is updated, `ollama pull` will just download the modified template. However, published models rarely get updates, so the quickest way to fix this is by doing it locally as described.


@pavelai commented on GitHub (Jun 10, 2025):

I followed this step https://github.com/ollama/ollama/issues/9437#issuecomment-2692298950 and created a new Modelfile, but it didn't work for me. The model still produces text answers and wraps tool calls in markdown blocks. I tried modifying the system message to instruct the model to use tools, but with no success. Am I missing something?


@rick-github commented on GitHub (Jun 10, 2025):

What is the content of your current Modelfile?


@pavelai commented on GitHub (Jun 11, 2025):

Here is the output of `ollama show`:

```
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM phi4-mini-fixed:latest

FROM /Users/user/.ollama/models/blobs/sha256-3c168af1dea0a414299c7d9077e100ac763370e5a98b3c53801a958a47f0a5db
TEMPLATE """{{- if or .System .Tools }}<|system|>{{ if .System }}{{ .System }}{{ end }}
{{- if .Tools }}{{ if not .System }}You are a helpful assistant with some tools.{{ end }}<|tool|>{{- range .Tools }} {{ .Function }} {{ end }}<|/tool|><|end|>
{{- end }}
{{- end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if ne .Role "system" }}<|{{ .Role }}|>{{ .Content }}
{{- if .ToolCalls }}<|tool_call|>[{{ range .ToolCalls }}{"name":"{{ .Function.Name }}","arguments":{{ .Function.Arguments }}}{{ end }}]<|/tool_call|>
{{- end }}
{{- if not $last }}<|end|>
{{- end }}
{{- if and (ne .Role "assistant") $last }}<|end|><|assistant|>{{ end }}
{{- end }}
{{- end }}"""
LICENSE """
"""

I've truncated the LICENSE value here as unnecessary.


@rick-github commented on GitHub (Jun 11, 2025):

Add a system prompt:

```diff
--- Modelfile.10981	2025-06-11 16:27:02.441335935 +0200
+++ Modelfile	2025-06-11 16:32:51.973823919 +0200
@@ -3,6 +3,11 @@
 # FROM phi4-mini-fixed:latest
 
 FROM /Users/user/.ollama/models/blobs/sha256-3c168af1dea0a414299c7d9077e100ac763370e5a98b3c53801a958a47f0a5db
+SYSTEM """
+You are a digital assistant who is responsible for helping the user with tasks using the provided tools.
+When creating a tool call, use a structure of the form:
+  <|tool_call|>[{"name", function, "arguments", json-args-list}]<|/tool_call|>
+"""
 TEMPLATE """{{- if or .System .Tools }}<|system|>{{ if .System }}{{ .System }}{{ end }}
 {{- if .Tools }}{{ if not .System }}You are a helpful assistant with some tools.{{ end }}<|tool|>{{- range .Tools }} {{ .Function }} {{ end }}<|/tool|><|end|>
 {{- end }}
```

This improves the tool usage, but it's still pretty bad. [qwen3:4b](https://ollama.com/library/qwen3:4b) and [llama3.2:3b](https://ollama.com/library/llama3.2:3b) are of comparable size and perform much better at tool calling.


@jtremesay-sereema commented on GitHub (Jun 12, 2025):

Shouldn't it be `[{"name": function, "arguments": json-args-list}]`? (colon instead of comma between key and value)


@rick-github commented on GitHub (Jun 12, 2025):

Yes, you are correct.


@pavelai commented on GitHub (Jun 12, 2025):

Thanks, it started working better with the provided Modelfile settings.

But tool call generation is very unstable. I'm not sure whether it's the model or Ollama itself. The problem is that after the first reply the quality of generation decreases and a lot of garbage output starts to appear, with or without the tool call. Example:

For the prompt "What time is now?", it generates a valid tool call:

[{"name": "get_date_time", "arguments": {}}]

And a message:

```
<|/tool_call|><|assistant|>I will now initiate a tool call using an object with specified properties.

[{"name": "get_date_time", "arguments": {}}]
```

I'm using streaming generation. Maybe that's the reason.
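
One way to isolate that variable is to send the same request with and without streaming and compare where the stray tokens show up. This is only a sketch: `get_date_time` with an empty parameter object is an assumption based on the output above, and `phi4-mini-fixed` is the rebuilt model name from the Modelfile shown earlier.

```shell
# With "stream": false the reply is a single JSON object and any tool call should land in
# message.tool_calls; with "stream": true the reply arrives as NDJSON chunks. If the
# <|/tool_call|><|assistant|> garbage only appears in one mode, streaming is the variable.
for STREAM in false true; do
  curl -s http://localhost:11434/api/chat -d '{
    "model": "phi4-mini-fixed",
    "stream": '"$STREAM"',
    "messages": [{ "role": "user", "content": "What time is now?" }],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_date_time",
        "description": "Returns the current date and time",
        "parameters": { "type": "object", "properties": {} }
      }
    }]
  }'
  echo
done
```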


@rick-github commented on GitHub (Jun 12, 2025):

Perhaps it hasn't been mentioned before, but phi4-mini is not a great tool user.


@pavelai commented on GitHub (Jun 12, 2025):

After some more testing I've found that the model starts to respond with tool calls even when no tools are provided.

Example:

```
user: Give me an example of using word "AI"
assistant: "[{"name": "search", "function": "querySearch", "arguments": {"keywords": ["AI"]}}]"
```

It can be worked around by switching between the Phi4 models with and without the fix. I think this could be noted somewhere in the documentation or the model's description.


@rick-github commented on GitHub (Jun 12, 2025):

Or try a model that's better at tool use. I've heard that phi4 is not great in that respect.


@pavelai commented on GitHub (Jun 12, 2025):

Using bigger models could solve the issue, but I think a lot of developers are trying to find ways to use mini models, e.g. for research purposes, which is important for making models more useful on mobile devices and in embedded software.


@rick-github commented on GitHub (Jun 12, 2025):

[qwen3:4b](https://ollama.com/library/qwen3:4b) and [llama3.2:3b](https://ollama.com/library/llama3.2:3b) are of comparable size and perform much better at tool calling.


@pavelai commented on GitHub (Jun 13, 2025):

Thanks for the advice. I've checked qwen3:4b and it works impressively well! So, yes, now it seems more like a Phi4-specific issue with tool calling and garbage output.
