[GH-ISSUE #10552] JSON SyntaxError with tool_calls on llama3.2 (and Phi4-mini) #69003

Open
opened 2026-05-04 16:44:15 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @JpEncausse on GitHub (May 3, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10552

Originally assigned to: @ParthSareen on GitHub.

What is the issue?

Sending a POST request to http://localhost:11434/api/chat with the following body.
LLM: llama3.2:3b. I have other issues with Phi4-mini, but not the same errors.

{
  "model": "llama3.2:latest",
  "format": "json",
  "options": {  "num_ctx": 8192,  "temperature": 0.95  },
  "tools": [ {
      "type": "function",
      "function": {
        "name": "LLM_Tool_Search",
        "description": "Perform a query to an online web search engine and return top results. Usefull to get live, accurate data.",
        "parameters": {
          "type": "object",
          "properties": {
            "query": {
              "type": "string",
              "description": "the query to search"
            }
          },
          "required": [ "query" ]
        }
      }
   }],
  "messages": [
    {
      "role": "system",
      "content": "Current date is :03/05/2025 22:31:22\n# ROLE\nYour are an AI assistant you answer general question\nReply in JSON"
    },
    {
      "role": "user",
      "content": "Quel est la météo demain 04/05/2025 à Versailles ? Cherche en ligne avec tes Tools."
    }
  ]
}

But Ollama returns a broken result:

{
    "model": "llama3.2:latest",
    "created_at": "2025-05-03T20:31:26.3976631Z",
    "message": {
      "role": "assistant",
      "content": "",
      "tool_calls": [
        {
          "function": {
            "name": "LLM_Tool_Search",
            "arguments": {
              "query": "mété Versailles 04/05/2025"
            }
          }
        }
      ]
    },
    "done": false
}
{
    "model": "llama3.2:latest",
    "created_at": "2025-05-03T20:31:26.4051015Z",
    "message": {
      "role": "assistant",
      "content": ""
    },
    "done_reason": "stop",
    "done": true,
    "total_duration": 3580039000,
    "load_duration": 2958931400,
    "prompt_eval_count": 288,
    "prompt_eval_duration": 336476500,
    "eval_count": 35,
    "eval_duration": 280906200
}

As you can see, the result is broken: it returns Obj1 Obj2, but it should return [ Obj1 , Obj2 ] or only Obj1. I think the correct response would be Obj1 with some fields of Obj2. Also, to be compatible with OpenAI, done_reason should have a value like tool_calls.
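
For readers hitting the same SyntaxError: the two objects look like the chunks /api/chat emits when streaming, which a single JSON parse call cannot read. The following is a minimal client-side sketch, assuming that is what is happening here. The helper names `parse_concatenated_json` and `merge_chat_chunks` are hypothetical, and the merged shape, including rewriting `done_reason` to `tool_calls`, is this sketch's own choice rather than anything Ollama returns.

```python
import json


def parse_concatenated_json(text):
    """Split a string holding several back-to-back JSON objects into a list."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


def merge_chat_chunks(chunks):
    """Fold streamed chat chunks into one dict (hypothetical merged shape)."""
    content = ""
    tool_calls = []
    final = {}
    for chunk in chunks:
        message = chunk.get("message", {})
        content += message.get("content", "")
        tool_calls.extend(message.get("tool_calls", []))
        if chunk.get("done"):
            final = chunk  # the last chunk carries the timing/eval fields
    merged = {k: v for k, v in final.items() if k != "message"}
    merged["message"] = {"role": "assistant", "content": content}
    if tool_calls:
        merged["message"]["tool_calls"] = tool_calls
        # Assumption: mimic the OpenAI-style value requested in this issue.
        merged["done_reason"] = "tool_calls"
    return merged
```

Calling `merge_chat_chunks(parse_concatenated_json(body))` on the raw response body gives roughly the "Obj1 with some fields of Obj2" result described above. If the two objects are indeed streaming chunks, adding `"stream": false` to the request body should also make /api/chat return a single JSON object.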

Relevant log output


OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.6.6

GiteaMirror added the bug label 2026-05-04 16:44:15 -05:00
Author
Owner

@jmorganca commented on GitHub (May 3, 2025):

@ParthSareen PTAL when you have a chance 😊

Author
Owner

@JpEncausse commented on GitHub (May 4, 2025):

I found a workaround, but it's not satisfying. Here is my prompt.

Using this prompt, Llama 3.2 ALWAYS produces a correctly written tool_calls. But when there are multiple tools, the choice of tool is almost random. Phi4-mini, on the other hand, is shy, and I have to insist a lot to get anything. Follow-up conversation is not very accurate.

I'm trying to figure out 2 things:

  • How to write a generic prompt that uses tools when it matters
  • How to get a valid tool call, both in JSON structure AND in parameters (names, ...)

I could narrow each bot to 1 LLM and 1 tool, focused on 1 task, and build an agentic architecture, but either way:

  • they need a tool to talk to each other
  • they need to decide whether or not to use their tool

# ROLE
Your are an AI assistant you answer general question

# FINAL OUTPUT FORMAT
====================

Must be a valid **JSON object** and strictly follow the template.
No additional explanations or comments.
{
    "text"       : "", // MANDATORY, Answer of the assistant. Must be a valid Markdown String.
    "speech"     : "", // MANDATORY, Short plain text version of the text to speak aloud. No emoji.
    "data"       : {}, // OPTIONAL.  Additional data. Must be a valid JSON object
    "tool_calls": [{   // OPTIONAL.  Use only if a tool is needed to answer. Must be an array of valid JSON objects.
        "function": {
            "name": "",     // Exact name of a declared tool in the system prompt. Case-sensitive.
            "arguments": {} // Valid JSON object. Must match the tool's expected input structure. Do NOT stringify.
        }
    }],
}


# TOOLS (Optional)

Only call tools that are explicitly needed. If none apply, return a direct answer.
If the answer can be generated using your internal knowledge, DO NOT call any tool.
Never guess tool names. You MUST use the exact name of the tool as declared.
Always match spelling exactly (e.g., "LLM_Tool_Search").

Examples:
- Q: "Who is Napoleon?" => DO NOT use any tool. Answer directly.
- Q: "What's the weather in Paris?" => Use "LLM_Tool_Search"
- Q: "Explain what you see in this image: [url]" => Use "LLM_Tool_INTERNAL_Image"


# TEXT CONTENT
====================

## GOALS
You try to provide concise answers unless the user asks for details.

## WORKFLOW
When a question is asked, you must first assess if the provided information is sufficient.
If additional details are needed, you must politely ask for clarifications.
Do not hallucinate. Only answer with verified information. **If the answer is unavailable or uncertain, say so explicitly**.

## STYLE
You are in a chat interface. 
Your answers must be friendly but professional. 
You answer **in the language of the question** (or in French).

## Scope :
You politely decline to answer to question outside of your skill set.

# FINAL CHECK
====================

You must ONLY output JSON object. DO NOT explain, summarize, comment, or add any other text than the JSON.
All values must be properly quoted, and keys must use double quotes.
If the format is not respected exactly, the response will be considered invalid.
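
Since the prompt cannot guarantee a valid tool call, one option is to check each reply on the client before acting on it. The following is a minimal sketch, assuming the reply follows the template above and that `tools` is the same list sent in the request's "tools" field. `validate_reply` is a hypothetical helper, and it only checks tool names, that "arguments" is an object, and the "required" parameter names, not argument types.

```python
import json


def validate_reply(raw_reply, tools):
    """Return a list of problems in a model reply; an empty list means it passed."""
    try:
        reply = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(reply, dict):
        return ["reply is not a JSON object"]

    problems = []
    # "text" and "speech" are marked MANDATORY in the template.
    for key in ("text", "speech"):
        if not isinstance(reply.get(key), str):
            problems.append(f'missing or non-string "{key}"')

    declared = {t["function"]["name"]: t["function"] for t in tools}
    calls = reply.get("tool_calls", [])
    if not isinstance(calls, list):
        return problems + ['"tool_calls" is not an array']

    for i, call in enumerate(calls):
        function = call.get("function") if isinstance(call, dict) else None
        if not isinstance(function, dict):
            problems.append(f'tool_calls[{i}]: missing "function" object')
            continue
        name = function.get("name")
        args = function.get("arguments")
        # Names are compared exactly, so a case mismatch counts as unknown.
        if not isinstance(name, str) or name not in declared:
            problems.append(f"tool_calls[{i}]: unknown tool {name!r}")
            continue
        if not isinstance(args, dict):
            problems.append(f"tool_calls[{i}]: arguments must be an object")
            continue
        required = declared[name].get("parameters", {}).get("required", [])
        for param in required:
            if param not in args:
                problems.append(f"tool_calls[{i}]: missing argument {param!r}")
    return problems
```

A non-empty result could be sent back to the model as a retry message, or used to drop the call, depending on how strict the bot needs to be.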
Reference: github-starred/ollama#69003