[GH-ISSUE #15465] fix: recover truncated tool call JSON when max_tokens cuts output #35645

Open
opened 2026-04-22 20:18:34 -05:00 by GiteaMirror · 1 comment

Originally created by @Rih0z on GitHub (Apr 9, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15465

Problem

When a model generates a large tool call (e.g., write_file with HTML content), the JSON arguments can be truncated if the output hits the token limit. The tool call is silently dropped instead of returning partial arguments.

This causes missing tool calls when:

  • write_file produces large HTML/code content
  • Qwen3/Gemma4 models hit their output token limit mid-JSON

Proposed Fix

Add recovery for truncated tool call JSON so that at least the successfully parsed key-value pairs are returned rather than dropping the entire call.
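One way to read the proposal is sketched below in Go. This is a minimal illustration under stated assumptions, not the code in `tools/tools.go` or in the referenced PRs: `recoverPartialArgs` is a hypothetical helper name, and it assumes the arguments arrive as a single top-level JSON object. It walks the object with the standard `encoding/json` token API and keeps every key-value pair that decoded completely before the truncation point.

```go
package main

import (
	"encoding/json"
	"strings"
)

// recoverPartialArgs is a hypothetical helper (not part of Ollama).
// It returns the top-level key-value pairs of a JSON object that were
// fully decoded before the input was cut off. A value that is itself
// truncated (for example a half-written string) is discarded.
func recoverPartialArgs(raw string) map[string]any {
	args := map[string]any{}
	dec := json.NewDecoder(strings.NewReader(raw))

	// The arguments are assumed to start with an opening '{'.
	tok, err := dec.Token()
	if err != nil {
		return args
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return args
	}

	// Read "key": value pairs until the object ends or decoding fails.
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			break // truncated inside or before a key
		}
		key, ok := keyTok.(string)
		if !ok {
			break // malformed input: object keys must be strings
		}
		var val any
		if err := dec.Decode(&val); err != nil {
			break // truncated inside this value; keep earlier pairs
		}
		args[key] = val
	}
	return args
}
```

With the truncated input `{"path": "index.html", "content": "<html><bo`, this sketch returns only the `path` pair, because the `content` string never closes. Whether a partially written string value should also be salvaged is a design choice the issue leaves open.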

Reproduction

curl http://localhost:11434/api/chat -d '{
  "model": "qwen3:8b",
  "messages": [{"role": "user", "content": "Write a complete HTML dashboard page"}],
  "tools": [{"type": "function", "function": {"name": "write_file", "parameters": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}}}}],
  "options": {"num_predict": 2048}
}'

Result: the tool call JSON is truncated, and the response contains no tool_calls.

Note: PRs #14835/#14915 fix this for model-specific parsers (Qwen3, etc.). This fix targets the generic tools/tools.go parser used when no model-specific parser is available.


@PureBlissAK commented on GitHub (Apr 18, 2026):

🤖 Automated Triage & Analysis Report

Issue: #15465
Analyzed: 2026-04-18T18:21:27.435681

Analysis

  • Type: unknown
  • Severity: medium
  • Components: unknown

Implementation Plan

  • Effort: medium
  • Steps:

This issue has been triaged and marked for implementation.


Reference: github-starred/ollama#35645