[GH-ISSUE #14958] Tool calls silently drop with large system prompts (~1600+ tokens) #56131

Closed
opened 2026-04-29 10:18:14 -05:00 by GiteaMirror · 5 comments
Owner

Originally created by @cicoyle on GitHub (Mar 19, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14958

What is the issue?

I'm using:

$ ollama --version
ollama version is 0.17.6

OS: macOS (Apple M2 Max, 32GB)

Models tested: mistral-small3.1:24b, qwq:32b, qwen2.5:32b

When using the OpenAI-compatible /v1/chat/completions endpoint with tool_choice: "required" and a large system prompt (~1600+ tokens), Ollama generates completion tokens, but returns empty content with no tool_calls in the response. The same request with a shorter system prompt works correctly.

Repro: Works (~570 prompt tokens, short system prompt):

  curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{
      "model": "mistral-small3.1:24b",
      "messages": [{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is the weather?"}],
      "tools": [{"type":"function","function":{"name":"get_weather","description":"Get current weather","parameters":{"type":"object","properties":{"location":{"type":"string"}}}}}],
      "tool_choice": "required"
  }'

Note: I generalized my prompt to a weather example for this report.

Fails: Same endpoint and tool definitions, but with a system prompt expanded to ~1600 tokens containing detailed multi-step agent instructions. The response returns "content":"", "finish_reason":"stop" with NO tool_calls field, despite "completion_tokens":31 proving the model generated output.

What I'm seeing:

  • The model generates 31 completion tokens, but they are not captured as tool_calls
  • This happens across ALL tested models (mistral, qwen, qwq), meaning it's not model-specific
  • Works perfectly with short prompts + same tools
  • num_ctx=4096 (default); the prompt is 1632 tokens, leaving ~2464 tokens for generation, which should be sufficient

I expect:
Tool calls should be returned in the response regardless of system prompt length, as long as the prompt fits within the context window.
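The context-budget arithmetic above can be checked with a minimal sketch. It assumes the simplest possible model, where the generation budget is just `num_ctx` minus the prompt token count; `remaining_context` is a hypothetical helper, not an Ollama API, and it says nothing about how Ollama actually truncates or shifts context.

```python
def remaining_context(num_ctx: int, prompt_tokens: int) -> int:
    """Tokens left for generation once the prompt is loaded.

    Assumption: budget = num_ctx - prompt_tokens (no truncation or shifting).
    """
    if prompt_tokens > num_ctx:
        raise ValueError("prompt does not fit in the context window")
    return num_ctx - prompt_tokens


# Numbers taken from the failing request in this report.
budget = remaining_context(num_ctx=4096, prompt_tokens=1632)
print(budget)        # 2464
print(budget >= 31)  # True: the 31 generated tokens fit comfortably
```

Under that assumption, running out of context does not explain the empty response.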

Relevant log output

sanitized logs:

ollama logs

  time=2026-03-19T09:43:38.139-05:00 level=DEBUG source=server.go:1536 msg="completion request" images=0 prompt=7064 format=""                                                                                                                     
  time=2026-03-19T09:43:38.157-05:00 level=DEBUG source=cache.go:151 msg="loading cache slot" id=0 cache=0 prompt=1632 used=0 remaining=1632                                                                                                       
  [GIN] 2026/03/19 - 09:43:47 | 200 | 18.676361417s | ::1 | POST "/v1/chat/completions" 


Raw response captured via a proxy:

  {"id":"chatcmpl-638","object":"chat.completion","created":1773931427,"model":"mistral-small3.1:24b","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":""},"finish_reason":"stop"}],"usage":{"prompt_tokens":1632,"completion_tokens":31,"total_tokens":1663}}


Note: prompt_tokens:1632, completion_tokens:31 (tokens generated but not returned as tool_calls), content:"", no tool_calls field.
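The symptom described in the note (tokens generated, but neither `content` nor `tool_calls` returned) can be detected mechanically on the client side. The following is a rough sketch; `is_silent_tool_drop` is a hypothetical helper name, and the "working" response is a hand-trimmed illustration of a successful reply rather than a captured one.

```python
import json


def is_silent_tool_drop(response: dict) -> bool:
    """True when tokens were generated but no content or tool_calls came back."""
    generated = response.get("usage", {}).get("completion_tokens", 0)
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        if generated > 0 and not message.get("content") and not message.get("tool_calls"):
            return True
    return False


# The failing response from this report, copied from the proxy capture.
failing = json.loads(
    '{"id":"chatcmpl-638","object":"chat.completion","created":1773931427,'
    '"model":"mistral-small3.1:24b","system_fingerprint":"fp_ollama",'
    '"choices":[{"index":0,"message":{"role":"assistant","content":""},'
    '"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":1632,"completion_tokens":31,"total_tokens":1663}}'
)

# Hypothetical, trimmed example of a response that does carry a tool call.
working = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": "",
            "tool_calls": [{"type": "function",
                            "function": {"name": "get_weather", "arguments": "{}"}}],
        },
        "finish_reason": "tool_calls",
    }],
    "usage": {"completion_tokens": 18},
}

print(is_silent_tool_drop(failing))  # True
print(is_silent_tool_drop(working))  # False
```

A check like this is only a client-side guard for retrying or logging; it does not explain where the generated tokens went.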

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.17.6

GiteaMirror added the bug label 2026-04-29 10:18:14 -05:00

@rick-github commented on GitHub (Mar 19, 2026):

What's the output of ollama ps after running the large prompt?

$ curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{
      "model": "mistral-small3.1:24b",
      "messages": [
        {"role":"system","content":"'"$(yes You are a helpful assistant. | head -264 | tr \\n ' ')"'"},
        {"role":"user","content":"What is the weather in Zurich?"}
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string"
                }
              }
            }
          }
        }
      ],
      "tool_choice": "required"
  }' | jq

{
  "id": "chatcmpl-426",
  "object": "chat.completion",
  "created": 1773935171,
  "model": "mistral-small3.1:24b",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_wzzfmfr9",
            "index": 0,
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\":\"Zurich\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 1635,
    "completion_tokens": 18,
    "total_tokens": 1653
  }
}
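For anyone who prefers to build the padded request outside the shell, here is a roughly equivalent sketch in Python. `build_padded_request` is a hypothetical helper; it repeats the same sentence as the `yes ... | head -264` pipeline above, except that it omits the trailing space that `tr` leaves behind, so the exact token count may differ slightly.

```python
import json


def build_padded_request(repeats: int) -> dict:
    """Build a chat request whose system prompt repeats one sentence `repeats` times."""
    sentence = "You are a helpful assistant."
    return {
        "model": "mistral-small3.1:24b",
        "messages": [
            {"role": "system", "content": " ".join([sentence] * repeats)},
            {"role": "user", "content": "What is the weather in Zurich?"},
        ],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather",
                "parameters": {
                    "type": "object",
                    "properties": {"location": {"type": "string"}},
                },
            },
        }],
        "tool_choice": "required",
    }


payload = build_padded_request(264)
body = json.dumps(payload)  # save to a file and send with: curl ... -d @file
print(len(payload["messages"][0]["content"].split()))  # 1320 words
```

Varying `repeats` gives a quick way to change the prompt length while keeping the tools and user message fixed.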

@cicoyle commented on GitHub (Mar 19, 2026):

$ ollama ps
NAME                    ID              SIZE     PROCESSOR    CONTEXT    UNTIL              
mistral-small3.1:24b    b9aaf0c2586a    16 GB    100% GPU     4096       4 minutes from now    

I think the bug is in the tool call output parser. When the system prompt plus tools exceed some threshold (~1500 tokens, I think), the parser fails to capture the generated tool call tokens and returns an empty response. The model appears to generate the right output (31 tokens is typical for a tool call), but Ollama's response serialization drops it.

I created a repro curl that consistently fails for me:

echo "=== SHORT (should work) ===" && curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @/tmp/ollama_req_short.json | jq . && echo "=== LONG (should fail)
   ===" && curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @/tmp/ollama_req_1.json | jq .

=== SHORT (should work) ===
{
  "id": "chatcmpl-226",
  "object": "chat.completion",
  "created": 1773952138,
  "model": "mistral-small3.1:24b",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_kweb0u1j",
            "index": 0,
            "type": "function",
            "function": {
              "name": "GetTicketsAtRisk",
              "arguments": "{}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 850,
    "completion_tokens": 16,
    "total_tokens": 866
  }
}
=== LONG (should fail)
   ===
{
  "id": "chatcmpl-116",
  "object": "chat.completion",
  "created": 1773952147,
  "model": "mistral-small3.1:24b",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": ""
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1467,
    "completion_tokens": 31,
    "total_tokens": 1498
  }
}

Where the files are defined as follows:

cat /tmp/ollama_req_short.json
{"model": "mistral-small3.1:24b", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Identify tickets that are at risk."}], "tools": [{"type": "function", "function": {"name": "SendEscalationEmail", "description": "\nSend an email to the Escalation Team notifying them of a delayed ticket\ndue to resource constraints. Use this when a ticket has insufficient staff\nand no available technicians are expected soon.\n", "parameters": {"description": "Input schema for sending escalation notification.", "properties": {"requester_name": {"description": "Requester name", "type": "string"}, "requester_id": {"description": "Requester ID", "type": "string"}, "ticket_id": {"description": "Ticket identifier", "type": "string"}, "skill_needed": {"description": "Skill that is needed", "type": "string"}, "deadline": {"description": "Original SLA deadline", "type": "string"}}, "required": ["requester_name", "requester_id", "ticket_id", "skill_needed", "deadline"], "type": "object"}}}, {"type": "function", "function": {"name": "SendTeamLeadAlert", "description": "\nSend an alert to the team lead about critical issues requiring attention.\nUse this for resource gap escalations, when additional staff are needed, or\nwhen task assignments need priority adjustment.\n", "parameters": {"description": "Input schema for sending alert to team lead.", "properties": {"alert_type": {"description": "Type of alert: 'RESOURCE_GAP', 'PRIORITY_ESCALATION', 'STAFF_NEEDED'", "type": "string"}, "details": {"description": "Detailed description of the issue", "type": "string"}, "ids": {"description": "Comma-separated list of affected IDs", "type": "string"}}, "required": ["alert_type", "ids", "details"], "type": "object"}}}, {"type": "function", "function": {"name": "GetOpenTicketsCount", "description": "Get the total count of all open tickets that have not yet been resolved (stage < 90).", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRisk", "description": "Get all tickets at risk of missing their SLA. Returns tickets with deadline of tomorrow or older that are not yet resolved ordered by date and priority.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRiskCount", "description": "Get the count of tickets at risk of missing their SLA. Returns the number of tickets with deadline of tomorrow or older that are not yet resolved.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGaps", "description": "Get tickets that have resource gaps and are at risk. Returns tickets where required skills exceed available staff ordered by deadline.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGapsCount", "description": "Get the count of tickets that have resource gaps and are at risk. Returns the number of tickets where required skills exceed available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaff", "description": "Get all available staff members that have not yet been assigned and that have the needed skills.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaffCount", "description": "Get the count of unique tickets that have gaps and have available staff that can cover them.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTickets", "description": "Get tickets that have resource gaps and no available staff. These tickets cannot be resolved on time.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTicketsCount", "description": "Get the count of unique tickets that have resource gaps and no available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetRequesterContact", "description": "Get requester contact information for a ticket.", "parameters": {"properties": {"ticket_id": {}}, "required": ["ticket_id"], "type": "object"}}}], "tool_choice": "required", "temperature": 1}

and

cat /tmp/ollama_req_1.json
{"model": "mistral-small3.1:24b", "messages": [{"role": "system", "content": "# Today's date is: March 19, 2026\n\nName:\n- Your name is IT_Support_Coordinator_Agent.\n\nRole:\n- You are an IT Helpdesk Operations Coordinator.\n\nGoal:\n- Your goal is to manage helpdesk operations to ensure all support tickets with resource constraints that are at risk of missing their SLA deadline are identified, and relevant actions prioritized. If tickets are unable to be resolved because there are no available technicians, the user can ask to notify the Escalation Team. If the ticket can be resolved by an available technician, the user can ask to notify the team lead to prioritize the required skills and assign them.\n\nPrimary Instructions:\n- Only perform the specific tasks requested. Do NOT send notifications unless explicitly asked.\n- WHEN ASKED to identify at-risk tickets: Call 'get-tickets-at-risk-count' and 'get-open-tickets-count'. Report counts with a chart. Do NOT call 'get-tickets-at-risk' unless asked for the full list.\n- WHEN ASKED to list at-risk tickets: Call 'get-tickets-at-risk'. Report ticket ID, requester name, and due date.\n- WHEN ASKED to identify tickets with resource gaps: Call 'get-tickets-with-gaps-count' and 'get-tickets-at-risk-count'. Report counts with a chart showing 'Resource Gap' vs 'Other At Risk'.\n- WHEN ASKED to identify which gap tickets can be covered by available staff: Call 'get-available-staff-count' and 'get-uncoverable-tickets-count'. Report counts with a chart. Do NOT notify anyone.\n- WHEN ASKED to list gap tickets with available staff: Call 'get-available-staff'.\n- WHEN ASKED to identify gap tickets that cannot be covered: Call 'get-uncoverable-tickets-count' and 'get-available-staff-count'. Report with a chart. Do NOT notify anyone.\n- WHEN ASKED to list uncoverable gap tickets: Call 'get-uncoverable-tickets'.\n- WHEN ASKED to notify the Escalation Team: Use 'get-requester-contact' then 'send-escalation-email' with requester name, requester ID, ticket ID, skill needed, and deadline.\n- WHEN ASKED to notify the Team Lead: Use 'send-team-lead-alert' with alert type 'RESOURCE_GAP', affected ticket IDs, and details.\n- WHEN ASKED to process all gap tickets end-to-end: Call 'get-available-staff'. If no staff exists, notify the Escalation Team. If staff exists, notify the Team Lead.\n- Tool names use lowercase kebab-case. Tickets and incidents are interchangeable terms.\n- When reporting counts, include a chart on its own line using this format: <<<{\"title\":\"Title\",\"data\":{\"Label1\":count1,\"Label2\":count2}}>>> Use integers from tool results only. For at-risk queries show 'At Risk' vs 'Not At Risk'."}, {"role": "user", "content": "Identify tickets that are at risk."}], "tools": [{"type": "function", "function": {"name": "SendEscalationEmail", "description": "\nSend an email to the Escalation Team notifying them of a delayed ticket\ndue to resource constraints. Use this when a ticket has insufficient staff\nand no available technicians are expected soon.\n", "parameters": {"description": "Input schema for sending escalation notification.", "properties": {"requester_name": {"description": "Requester name", "type": "string"}, "requester_id": {"description": "Requester ID", "type": "string"}, "ticket_id": {"description": "Ticket identifier", "type": "string"}, "skill_needed": {"description": "Skill that is needed", "type": "string"}, "deadline": {"description": "Original SLA deadline", "type": "string"}}, "required": ["requester_name", "requester_id", "ticket_id", "skill_needed", "deadline"], "type": "object"}}}, {"type": "function", "function": {"name": "SendTeamLeadAlert", "description": "\nSend an alert to the team lead about critical issues requiring attention.\nUse this for resource gap escalations, when additional staff are needed, or\nwhen task assignments need priority adjustment.\n", "parameters": {"description": "Input schema for sending alert to team lead.", "properties": {"alert_type": {"description": "Type of alert: 'RESOURCE_GAP', 'PRIORITY_ESCALATION', 'STAFF_NEEDED'", "type": "string"}, "details": {"description": "Detailed description of the issue", "type": "string"}, "ids": {"description": "Comma-separated list of affected IDs", "type": "string"}}, "required": ["alert_type", "ids", "details"], "type": "object"}}}, {"type": "function", "function": {"name": "GetOpenTicketsCount", "description": "Get the total count of all open tickets that have not yet been resolved (stage < 90).", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRisk", "description": "Get all tickets at risk of missing their SLA. Returns tickets with deadline of tomorrow or older that are not yet resolved ordered by date and priority.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRiskCount", "description": "Get the count of tickets at risk of missing their SLA. Returns the number of tickets with deadline of tomorrow or older that are not yet resolved.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGaps", "description": "Get tickets that have resource gaps and are at risk. Returns tickets where required skills exceed available staff ordered by deadline.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGapsCount", "description": "Get the count of tickets that have resource gaps and are at risk. Returns the number of tickets where required skills exceed available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaff", "description": "Get all available staff members that have not yet been assigned and that have the needed skills.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaffCount", "description": "Get the count of unique tickets that have gaps and have available staff that can cover them.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTickets", "description": "Get tickets that have resource gaps and no available staff. These tickets cannot be resolved on time.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTicketsCount", "description": "Get the count of unique tickets that have resource gaps and no available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetRequesterContact", "description": "Get requester contact information for a ticket.", "parameters": {"properties": {"ticket_id": {}}, "required": ["ticket_id"], "type": "object"}}}], "tool_choice": "required", "temperature": 1}

I noticed that when I replace only the system prompt with "You are a helpful assistant." (same tools, same tool_choice, same user message), it returns tool_calls correctly at 926 tokens. The bug triggers with longer, structured system prompts of roughly 1500+ tokens combined with 12 tool definitions.
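If the failure really were a pure function of prompt length, the threshold between the working and failing sizes could be narrowed by bisection. The sketch below assumes exactly that monotonic behaviour, which is unconfirmed: the 1635-token request earlier in this thread did return a tool call, so length alone may not be the trigger. `first_failing_length` and `fake_drops` are hypothetical names; in practice `drops` would send a request of the given size and check the response.

```python
from typing import Callable


def first_failing_length(lo: int, hi: int, drops: Callable[[int], bool]) -> int:
    """Smallest n in (lo, hi] for which drops(n) is True.

    Assumptions: drops(lo) is False, drops(hi) is True, and drops is
    monotonic in n (once it starts failing, it keeps failing).
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if drops(mid):
            hi = mid  # failure reproduced: threshold is at or below mid
        else:
            lo = mid  # still works: threshold is above mid
    return hi


# Stand-in predicate for illustration only; a real one would call the API.
def fake_drops(prompt_tokens: int) -> bool:
    return prompt_tokens >= 1500


print(first_failing_length(926, 1632, fake_drops))  # 1500
```

If repeated runs of the real predicate disagree at the same size, the monotonic assumption does not hold and something other than length (for example, the prompt's content) is worth investigating.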

<!-- gh-comment-id:4093120188 --> @cicoyle commented on GitHub (Mar 19, 2026): ``` $ ollama ps NAME ID SIZE PROCESSOR CONTEXT UNTIL mistral-small3.1:24b b9aaf0c2586a 16 GB 100% GPU 4096 4 minutes from now ``` I think the bug has something to do with the tool call output parser. When the system prompt + tools exceed some threshold (~1500 tokens I think), the parser fails to capture the generated tool call tokens and returns an empty resp. The model is generating the right output (31 tokens = typical tool call), but Ollama's resp serialization drops it. I created a repo curl that consistently fails for me: ``` echo "=== SHORT (should work) ===" && curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @/tmp/ollama_req_short.json | jq . && echo "=== LONG (should fail) ===" && curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @/tmp/ollama_req_1.json | jq . === SHORT (should work) === { "id": "chatcmpl-226", "object": "chat.completion", "created": 1773952138, "model": "mistral-small3.1:24b", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "", "tool_calls": [ { "id": "call_kweb0u1j", "index": 0, "type": "function", "function": { "name": "GetTicketsAtRisk", "arguments": "{}" } } ] }, "finish_reason": "tool_calls" } ], "usage": { "prompt_tokens": 850, "completion_tokens": 16, "total_tokens": 866 } } === LONG (should fail) === { "id": "chatcmpl-116", "object": "chat.completion", "created": 1773952147, "model": "mistral-small3.1:24b", "system_fingerprint": "fp_ollama", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 1467, "completion_tokens": 31, "total_tokens": 1498 } } ``` Where the files are defined as follows: ``` cat /tmp/ollama_req_short.json {"model": "mistral-small3.1:24b", "messages": [{"role": "system", "content": "You are a helpful assistant."}, 
{"role": "user", "content": "Identify tickets that are at risk."}], "tools": [{"type": "function", "function": {"name": "SendEscalationEmail", "description": "\nSend an email to the Escalation Team notifying them of a delayed ticket\ndue to resource constraints. Use this when a ticket has insufficient staff\nand no available technicians are expected soon.\n", "parameters": {"description": "Input schema for sending escalation notification.", "properties": {"requester_name": {"description": "Requester name", "type": "string"}, "requester_id": {"description": "Requester ID", "type": "string"}, "ticket_id": {"description": "Ticket identifier", "type": "string"}, "skill_needed": {"description": "Skill that is needed", "type": "string"}, "deadline": {"description": "Original SLA deadline", "type": "string"}}, "required": ["requester_name", "requester_id", "ticket_id", "skill_needed", "deadline"], "type": "object"}}}, {"type": "function", "function": {"name": "SendTeamLeadAlert", "description": "\nSend an alert to the team lead about critical issues requiring attention.\nUse this for resource gap escalations, when additional staff are needed, or\nwhen task assignments need priority adjustment.\n", "parameters": {"description": "Input schema for sending alert to team lead.", "properties": {"alert_type": {"description": "Type of alert: 'RESOURCE_GAP', 'PRIORITY_ESCALATION', 'STAFF_NEEDED'", "type": "string"}, "details": {"description": "Detailed description of the issue", "type": "string"}, "ids": {"description": "Comma-separated list of affected IDs", "type": "string"}}, "required": ["alert_type", "ids", "details"], "type": "object"}}}, {"type": "function", "function": {"name": "GetOpenTicketsCount", "description": "Get the total count of all open tickets that have not yet been resolved (stage < 90).", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRisk", "description": "Get all tickets at risk of missing their 
SLA. Returns tickets with deadline of tomorrow or older that are not yet resolved ordered by date and priority.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRiskCount", "description": "Get the count of tickets at risk of missing their SLA. Returns the number of tickets with deadline of tomorrow or older that are not yet resolved.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGaps", "description": "Get tickets that have resource gaps and are at risk. Returns tickets where required skills exceed available staff ordered by deadline.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGapsCount", "description": "Get the count of tickets that have resource gaps and are at risk. Returns the number of tickets where required skills exceed available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaff", "description": "Get all available staff members that have not yet been assigned and that have the needed skills.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaffCount", "description": "Get the count of unique tickets that have gaps and have available staff that can cover them.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTickets", "description": "Get tickets that have resource gaps and no available staff. 
These tickets cannot be resolved on time.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTicketsCount", "description": "Get the count of unique tickets that have resource gaps and no available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetRequesterContact", "description": "Get requester contact information for a ticket.", "parameters": {"properties": {"ticket_id": {}}, "required": ["ticket_id"], "type": "object"}}}], "tool_choice": "required", "temperature": 1}% ``` && ``` cat /tmp/ollama_req_1.json {"model": "mistral-small3.1:24b", "messages": [{"role": "system", "content": "# Today's date is: March 19, 2026\n\nName:\n- Your name is IT_Support_Coordinator_Agent.\n\nRole:\n- You are an IT Helpdesk Operations Coordinator.\n\nGoal:\n- Your goal is to manage helpdesk operations to ensure all support tickets with resource constraints that are at risk of missing their SLA deadline are identified, and relevant actions prioritized. If tickets are unable to be resolved because there are no available technicians, the user can ask to notify the Escalation Team. If the ticket can be resolved by an available technician, the user can ask to notify the team lead to prioritize the required skills and assign them.\n\nPrimary Instructions:\n- Only perform the specific tasks requested. Do NOT send notifications unless explicitly asked.\n- WHEN ASKED to identify at-risk tickets: Call 'get-tickets-at-risk-count' and 'get-open-tickets-count'. Report counts with a chart. Do NOT call 'get-tickets-at-risk' unless asked for the full list.\n- WHEN ASKED to list at-risk tickets: Call 'get-tickets-at-risk'. Report ticket ID, requester name, and due date.\n- WHEN ASKED to identify tickets with resource gaps: Call 'get-tickets-with-gaps-count' and 'get-tickets-at-risk-count'. 
Report counts with a chart showing 'Resource Gap' vs 'Other At Risk'.\n- WHEN ASKED to identify which gap tickets can be covered by available staff: Call 'get-available-staff-count' and 'get-uncoverable-tickets-count'. Report counts with a chart. Do NOT notify anyone.\n- WHEN ASKED to list gap tickets with available staff: Call 'get-available-staff'.\n- WHEN ASKED to identify gap tickets that cannot be covered: Call 'get-uncoverable-tickets-count' and 'get-available-staff-count'. Report with a chart. Do NOT notify anyone.\n- WHEN ASKED to list uncoverable gap tickets: Call 'get-uncoverable-tickets'.\n- WHEN ASKED to notify the Escalation Team: Use 'get-requester-contact' then 'send-escalation-email' with requester name, requester ID, ticket ID, skill needed, and deadline.\n- WHEN ASKED to notify the Team Lead: Use 'send-team-lead-alert' with alert type 'RESOURCE_GAP', affected ticket IDs, and details.\n- WHEN ASKED to process all gap tickets end-to-end: Call 'get-available-staff'. If no staff exists, notify the Escalation Team. If staff exists, notify the Team Lead.\n- Tool names use lowercase kebab-case. Tickets and incidents are interchangeable terms.\n- When reporting counts, include a chart on its own line using this format: <<<{\"title\":\"Title\",\"data\":{\"Label1\":count1,\"Label2\":count2}}>>> Use integers from tool results only. For at-risk queries show 'At Risk' vs 'Not At Risk'."}, {"role": "user", "content": "Identify tickets that are at risk."}], "tools": [{"type": "function", "function": {"name": "SendEscalationEmail", "description": "\nSend an email to the Escalation Team notifying them of a delayed ticket\ndue to resource constraints. 
Use this when a ticket has insufficient staff\nand no available technicians are expected soon.\n", "parameters": {"description": "Input schema for sending escalation notification.", "properties": {"requester_name": {"description": "Requester name", "type": "string"}, "requester_id": {"description": "Requester ID", "type": "string"}, "ticket_id": {"description": "Ticket identifier", "type": "string"}, "skill_needed": {"description": "Skill that is needed", "type": "string"}, "deadline": {"description": "Original SLA deadline", "type": "string"}}, "required": ["requester_name", "requester_id", "ticket_id", "skill_needed", "deadline"], "type": "object"}}}, {"type": "function", "function": {"name": "SendTeamLeadAlert", "description": "\nSend an alert to the team lead about critical issues requiring attention.\nUse this for resource gap escalations, when additional staff are needed, or\nwhen task assignments need priority adjustment.\n", "parameters": {"description": "Input schema for sending alert to team lead.", "properties": {"alert_type": {"description": "Type of alert: 'RESOURCE_GAP', 'PRIORITY_ESCALATION', 'STAFF_NEEDED'", "type": "string"}, "details": {"description": "Detailed description of the issue", "type": "string"}, "ids": {"description": "Comma-separated list of affected IDs", "type": "string"}}, "required": ["alert_type", "ids", "details"], "type": "object"}}}, {"type": "function", "function": {"name": "GetOpenTicketsCount", "description": "Get the total count of all open tickets that have not yet been resolved (stage < 90).", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRisk", "description": "Get all tickets at risk of missing their SLA. 
Returns tickets with deadline of tomorrow or older that are not yet resolved ordered by date and priority.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsAtRiskCount", "description": "Get the count of tickets at risk of missing their SLA. Returns the number of tickets with deadline of tomorrow or older that are not yet resolved.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGaps", "description": "Get tickets that have resource gaps and are at risk. Returns tickets where required skills exceed available staff ordered by deadline.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetTicketsWithGapsCount", "description": "Get the count of tickets that have resource gaps and are at risk. Returns the number of tickets where required skills exceed available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaff", "description": "Get all available staff members that have not yet been assigned and that have the needed skills.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetAvailableStaffCount", "description": "Get the count of unique tickets that have gaps and have available staff that can cover them.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTickets", "description": "Get tickets that have resource gaps and no available staff. 
These tickets cannot be resolved on time.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetUncoverableTicketsCount", "description": "Get the count of unique tickets that have resource gaps and no available staff.", "parameters": {"properties": {}, "type": "object"}}}, {"type": "function", "function": {"name": "GetRequesterContact", "description": "Get requester contact information for a ticket.", "parameters": {"properties": {"ticket_id": {}}, "required": ["ticket_id"], "type": "object"}}}], "tool_choice": "required", "temperature": 1}% ``` I noticed, when replacing only the system prompt with "You are a helpful assistant." (same tools, same tool_choice, same user message) it returns tool_calls correctly at 926 tokens. The bug triggers with longer structured system prompts around ~1500+ tokens combined with 12 tool definitions.

@cicoyle commented on GitHub (Mar 19, 2026):

Please note: I just reran the example scenario above with the latest Ollama version, 0.18.2, and see the same result:

echo "=== SHORT (should work) ===" && curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @/tmp/ollama_req_short.json | jq . && echo "=== LONG (should fail) ===" && curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @/tmp/ollama_req_1.json | jq .

=== SHORT (should work) ===
{
  "id": "chatcmpl-493",
  "object": "chat.completion",
  "created": 1773952967,
  "model": "mistral-small3.1:24b",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_jcw40vr4",
            "index": 0,
            "type": "function",
            "function": {
              "name": "GetTicketsAtRisk",
              "arguments": "{}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 850,
    "completion_tokens": 16,
    "total_tokens": 866
  }
}
=== LONG (should fail) ===
{
  "id": "chatcmpl-593",
  "object": "chat.completion",
  "created": 1773952976,
  "model": "mistral-small3.1:24b",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": ""
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1467,
    "completion_tokens": 31,
    "total_tokens": 1498
  }
}

ollama --version
ollama version is 0.18.2
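
The failing response above has a recognizable shape: a non-zero `completion_tokens` count, empty `content`, and no `tool_calls` field. A minimal client-side sketch for flagging that shape might look like the following. The function name `is_silent_tool_drop` is hypothetical, and the two dictionaries are trimmed copies of the responses shown in this comment; this is not part of Ollama or any client library.

```python
def is_silent_tool_drop(response: dict) -> bool:
    """Return True when tokens were generated but no content or tool call came back."""
    usage = response.get("usage", {})
    generated = usage.get("completion_tokens", 0) > 0
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        has_content = bool(message.get("content"))
        has_tool_calls = bool(message.get("tool_calls"))
        if generated and not has_content and not has_tool_calls:
            return True
    return False


# Trimmed copy of the LONG response from this comment.
long_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": ""}, "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 1467, "completion_tokens": 31, "total_tokens": 1498},
}

# Trimmed copy of the SHORT response from this comment.
short_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "id": "call_jcw40vr4",
                        "index": 0,
                        "type": "function",
                        "function": {"name": "GetTicketsAtRisk", "arguments": "{}"},
                    }
                ],
            },
            "finish_reason": "tool_calls",
        }
    ],
    "usage": {"prompt_tokens": 850, "completion_tokens": 16, "total_tokens": 866},
}

print(is_silent_tool_drop(long_response))   # True
print(is_silent_tool_drop(short_response))  # False
```

A check like this only detects the symptom; it says nothing about why the generated tokens were not surfaced.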


@rick-github commented on GitHub (Mar 19, 2026):

The system prompt in the long request tells the model to call 'get-tickets-at-risk-count'. There is no function by that name. If I change the get-... names to the corresponding Get... functions, then the model emits tool calls.

$ curl -s http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d @/tmp/ollama_req_1.json | jq .
{
  "id": "chatcmpl-77",
  "object": "chat.completion",
  "created": 1773954230,
  "model": "mistral-small3.1:24b",
  "system_fingerprint": "fp_ollama",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
          {
            "id": "call_z7f00f4g",
            "index": 0,
            "type": "function",
            "function": {
              "name": "GetTicketsAtRiskCount",
              "arguments": "{}"
            }
          },
          {
            "id": "call_4yb4u38h",
            "index": 1,
            "type": "function",
            "function": {
              "name": "GetOpenTicketsCount",
              "arguments": "{}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 1462,
    "completion_tokens": 29,
    "total_tokens": 1491
  }
}
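
The mismatch described in this comment can be caught before a request is sent by comparing the names the system prompt mentions against the names in the `tools` array. The sketch below is one rough way to do that. It assumes the prompt wraps tool names in single quotes, as the long request does; the function name `unknown_tool_references` is hypothetical, and the heuristic will also flag other quoted identifiers such as 'RESOURCE_GAP'.

```python
import re


def unknown_tool_references(system_prompt: str, tools: list) -> list:
    """List single-quoted identifiers in the prompt that match no registered tool name."""
    registered = {tool["function"]["name"] for tool in tools}
    # Assumption: tool names appear in single quotes and contain no spaces.
    quoted = re.findall(r"'([A-Za-z][A-Za-z0-9_-]*)'", system_prompt)
    unknown = []
    for name in quoted:
        if name not in registered and name not in unknown:
            unknown.append(name)
    return unknown


# Excerpt of the system prompt and two of the tool definitions from the long request.
prompt_excerpt = (
    "WHEN ASKED to identify at-risk tickets: Call 'get-tickets-at-risk-count' "
    "and 'get-open-tickets-count'. Report counts with a chart."
)
tools = [
    {"type": "function", "function": {"name": "GetOpenTicketsCount"}},
    {"type": "function", "function": {"name": "GetTicketsAtRiskCount"}},
]

print(unknown_tool_references(prompt_excerpt, tools))
# ['get-tickets-at-risk-count', 'get-open-tickets-count']
```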

@cicoyle commented on GitHub (Mar 19, 2026):

Hi @rick-github, thanks for identifying this; you were right! The root cause was a tool-name mismatch between my system prompt instructions (kebab-case, like get-orders-at-risk-count) and the actual tool definitions, which are registered in PascalCase (GetOrdersAtRiskCount). The model saw "call get-orders-at-risk-count" in the prompt, but no tool by that name existed.

This wasn't an Ollama bug after all. That said, it might be worth considering whether Ollama could

  • emit a warning or error when tool_choice: "required" is set but the model references a function name that doesn't match any registered tool.
  • apply fuzzy/case-insensitive matching on tool names (e.g., treat get-orders-at-risk-count and GetOrdersAtRiskCount as the same tool) so users don't fall into this same trap.

Either would have saved me debugging time. I can open a separate feature request for that, linking to this issue. Closing this issue. Thanks again for the quick reply :)
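
For illustration, the fuzzy-matching idea could be sketched on the client side as below. This is only an assumption about how such matching might work, not existing Ollama behavior; `normalize_tool_name` and `resolve_tool` are hypothetical helpers. Names are compared after lowercasing and dropping separators, and an ambiguous match is rejected rather than guessed.

```python
import re
from typing import Optional


def normalize_tool_name(name: str) -> str:
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def resolve_tool(requested: str, registered: list) -> Optional[str]:
    """Return the single registered name equivalent to `requested`, or None."""
    index = {}
    for name in registered:
        index.setdefault(normalize_tool_name(name), []).append(name)
    matches = index.get(normalize_tool_name(requested), [])
    # Refuse to pick when zero or several registered names collapse to the same key.
    return matches[0] if len(matches) == 1 else None


registered = ["GetOrdersAtRiskCount", "GetOpenOrdersCount"]
print(resolve_tool("get-orders-at-risk-count", registered))  # GetOrdersAtRiskCount
print(resolve_tool("get-weather", registered))               # None
```

Dropping separators is deliberately lossy, which is why the sketch returns `None` when two registered tools normalize to the same key.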


Reference: github-starred/ollama#56131