[GH-ISSUE #6713] Talking to Mistral-Nemo via OpenAI tool calling - fails #4229

Open
opened 2026-04-12 15:09:52 -05:00 by GiteaMirror · 10 comments
Owner

Originally created by @ChristianWeyer on GitHub (Sep 9, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6713

Originally assigned to: @ParthSareen on GitHub.

What is the issue?

With this curl command:

curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"mistral-nemo:12b-instruct-2407-fp16",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}' | json_pp

we should be able to execute an OpenAI-compatible tool call against mistral-nemo.

But I get this result:

{
   "choices" : [
      {
         "finish_reason" : "stop",
         "index" : 0,
         "message" : {
            "content" : "Glad to help! In which unit would you like the temperature?",
            "role" : "assistant"
         }
      }
   ],
   "created" : 1725905432,
   "id" : "chatcmpl-677",
   "model" : "mistral-nemo:12b-instruct-2407-fp16",
   "object" : "chat.completion",
   "system_fingerprint" : "fp_ollama",
   "usage" : {
      "completion_tokens" : 15,
      "prompt_tokens" : 95,
      "total_tokens" : 110
   }
}

Is there a missing config or something similar?
Thanks.
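For reference, here is a sketch of the same request body assembled in Python, so its structure can be sanity-checked before sending. This is not part of the original report; the helper name `build_request` is hypothetical.

```python
import json

# Tool schema copied from the curl command above.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]


def build_request(model: str, user_content: str) -> dict:
    """Body for POST /v1/chat/completions (OpenAI-compatible endpoint)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
        "tools": TOOLS,
        "tool_choice": "auto",
    }


print(json.dumps(build_request("mistral-nemo:12b-instruct-2407-fp16",
                               "What is the weather like in Boston?"), indent=2))
```

With the OpenAI Python SDK, the same body could then be sent via `client.chat.completions.create(**build_request(...))`, where `client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")` (Ollama requires an `api_key` argument but ignores its value).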

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.3.9

GiteaMirror added the bug and needs more info labels 2026-04-12 15:09:52 -05:00

@ChristianWeyer commented on GitHub (Sep 9, 2024):

Here is the [server.log](https://github.com/user-attachments/files/16934986/server.log) for the call.


@rick-github commented on GitHub (Sep 9, 2024):

It seems like this model is not as ready to do tool calls as some other models. If you run your query enough times, it will eventually return a tool call, but it's pretty infrequent. I was able to get it to respond with a tool call more readily by adding a system message:

--- openai.tc.orig      2024-09-09 21:24:34.604400267 +0200
+++ openai.tc   2024-09-09 21:37:25.673342526 +0200
@@ -4,6 +4,10 @@
 -H "Content-Type: application/json" \
 -d '{"model":"mistral-nemo:12b-instruct-2407-fp16",
   "messages": [
+    { 
+      "role": "system", 
+      "content": "You are a helpful AI with tool calling capabilities."
+    }, 
     {
       "role": "user",
       "content": "What is the weather like in Boston?"

This could probably be added to the template, or a SYSTEM prompt could be added to the Modelfile, so that clients don't need to.
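A minimal client-side sketch of that workaround (a hypothetical helper, not part of Ollama) would prepend the system message only when the caller hasn't supplied one:

```python
def ensure_system_message(messages, content="You are a helpful AI with tool calling capabilities."):
    """Prepend a system message unless the caller already supplied one.

    `messages` is an OpenAI-style list of {"role", "content"} dicts.
    Returns a new list; the input is not mutated.
    """
    if any(m.get("role") == "system" for m in messages):
        return messages
    return [{"role": "system", "content": content}] + messages
```

Clients could run their message list through this before calling `/v1/chat/completions`, until the template or Modelfile carries the prompt itself.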


@ChristianWeyer commented on GitHub (Sep 10, 2024):

Thanks @rick-github! What do we need to do to improve this and change the template for the weights in the Ollama model catalog?


@ChristianWeyer commented on GitHub (Sep 10, 2024):

> It seems like this model is not as ready to do tool calls as some other models. If you run your query enough times, it will eventually return a tool call, but it's pretty infrequent. I was able to get it to respond more readily with a tool call by adding a system message:
>
> --- openai.tc.orig      2024-09-09 21:24:34.604400267 +0200
> +++ openai.tc   2024-09-09 21:37:25.673342526 +0200
> @@ -4,6 +4,10 @@
>  -H "Content-Type: application/json" \
>  -d '{"model":"mistral-nemo:12b-instruct-2407-fp16",
>    "messages": [
> +    {
> +      "role": "system",
> +      "content": "You are a helpful AI with tool calling capabilities."
> +    },
>      {
>        "role": "user",
>        "content": "What is the weather like in Boston?"
>
> This could probably be added to the template or add a SYSTEM prompt to the Modelfile so that clients don't need to.

This gives me the correct answer only about 70% of the time, which is not what we need :-)

Do you have another idea? @rick-github


@ChristianWeyer commented on GitHub (Sep 10, 2024):

This is more reliable for me:

curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"mistral-nemo:12b-instruct-2407-fp16",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful AI with tool calling capabilities. Watch out for the \"tools\" element in the request."
    },
    {
      "role": "user",
      "content": "What is the weather like in Boston?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "temperature": 0
}' | json_pp
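Since the model still only emits a tool call part of the time, another client-side option is to retry until the response actually contains one. A hypothetical sketch, where `request_fn` stands in for whatever issues the request (curl subprocess, OpenAI SDK, etc.):

```python
def call_with_tool_retry(request_fn, max_attempts=3):
    """Retry until the chat-completion response contains a tool call.

    `request_fn` is any zero-argument callable returning a parsed
    /v1/chat/completions response dict. Returns the first response whose
    message carries `tool_calls`, or the last response after exhausting
    `max_attempts`.
    """
    resp = None
    for _ in range(max_attempts):
        resp = request_fn()
        message = resp["choices"][0]["message"]
        if message.get("tool_calls"):
            return resp
    return resp
```

The caller still has to handle the case where no attempt produced a tool call; with a model this unreliable at tool use, a cap on attempts keeps latency bounded.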


@yingding commented on GitHub (Oct 22, 2024):

It now seems to work with Ollama 0.3.14 on a Mac with Apple silicon, even without the system prompt, using mistral-nemo:12b.
I previously got the same error with Ollama 0.3.13 via the OpenAI Python SDK (openai==1.52.0). It seems to be fine now.


@dhiltgen commented on GitHub (Nov 6, 2024):

@ChristianWeyer can you try upgrading to the latest version and see if that resolves it for you as well?


@ChristianWeyer commented on GitHub (Nov 6, 2024):

Latest Ollama, with the model freshly pulled, @dhiltgen.
Result, using the curl command from my original post above:

❯ curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"mistral-nemo:12b-instruct-2407-fp16",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather like in Boston?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}' | json_pp
{
   "choices" : [
      {
         "finish_reason" : "stop",
         "index" : 0,
         "message" : {
            "content" : "Certainly! Do you want to check the current weather conditions in Celsius or Fahrenheit?",
            "role" : "assistant"
         }
      }
   ],
   "created" : 1730885472,
   "id" : "chatcmpl-27",
   "model" : "mistral-nemo:12b-instruct-2407-fp16",
   "object" : "chat.completion",
   "system_fingerprint" : "fp_ollama",
   "usage" : {
      "completion_tokens" : 20,
      "prompt_tokens" : 95,
      "total_tokens" : 115
   }
}

@ParthSareen commented on GitHub (Jan 15, 2025):

Hey @ChristianWeyer are you still running into this?


@ChristianWeyer commented on GitHub (Jan 16, 2025):

I moved on... sorry.

Reference: github-starred/ollama#4229