[GH-ISSUE #15529] Claude Code CLI outputs raw JSON instead of executing tool calls when using local models. #56436

Closed
opened 2026-04-29 10:49:22 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @davidal07 on GitHub (Apr 12, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15529

What is the issue?

When using the Claude Code CLI connected to a local Ollama model, the local models (such as qwen2.5-coder) successfully generate the correct JSON structure for tool calling (e.g., the Write tool). However, instead of executing the tool, Claude Code prints the raw JSON directly to the terminal.

Interestingly, when using Ollama's cloud models (via the same session and permissions), the tools execute perfectly. This suggests an issue with how the local Ollama API is wrapping the response—likely returning the JSON as standard text instead of the expected Anthropic tool_use stop reason format.

OS: Windows 11
Ollama version: 0.20.5
Claude Code version: v2.1.104
Local Models tested: qwen2.5-coder, gemma4:e4b, qwen3.5

Steps to Reproduce:

  1. Open a terminal.
  2. Launch Claude Code pointing to the local model.
  3. Prompt the model to perform a file operation (e.g., "/init" to create the CLAUDE.md file).
  4. Observe the terminal output.

Expected Behavior:
The Ollama API should wrap the model's JSON tool call in the correct Anthropic format (stop_reason: "tool_use"), allowing Claude Code to intercept it and execute the file system operation natively.
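For reference, an Anthropic-style Messages API response carries the call in a `tool_use` content block rather than in assistant text (field values here are illustrative, not taken from an actual response):

```json
{
  "stop_reason": "tool_use",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_01...",
      "name": "Write",
      "input": {"file_path": "/repo/CLAUDE.md", "content": "# CLAUDE.md ..."}
    }
  ]
}
```

Claude Code executes the tool only when the call arrives in this structured form; a `{"name": ..., "arguments": ...}` object inside a plain text block is rendered as-is.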

Actual Behavior:
The CLI receives the JSON as a standard chat text payload and prints it to the user.

Relevant log output

```shell
❯ /init
● {"name": "Write", "arguments": {"file_path": "/repo/CLAUDE.md", "content": "# CLAUDE.md\n\nThis file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.\n\n## Commands\n- Build: Run make build\n- Lint: Run make lint\n- Run Tests: Run make test\n- Run Single Test: Run go test ./path/to/test/file -run TestFunctionName\n\n## Architecture\nThe codebase follows a modular design with the following components:\n\n- Service Layer: Contains business logic and handles requests.\n- Repository Layer: Manages data persistence and interactions with databases or APIs.\n- Domain Layer: Defines the core entities and rules of the application.\n- API Layer: Exposes endpoints for external systems to interact with the service layer.\n\n## Important Files\n- Cursor Rules: Located in .cursor/rules/ directory.\n- Copilot Instructions: Located in .github/copilot-instructions.md file.\n\n## Setup\nBefore running any commands, ensure that the necessary dependencies are installed:\n\nsh\ngo mod download\n\n\nFor more detailed instructions, refer to the README.md and Cursor rules."}}
✻ Brewed for 41s

❯ hello
● {"name":"WebSearch","arguments":{"query":"current weather in [city]"}}
```
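The leaked payloads above can be recognized mechanically. A minimal sketch (helper name hypothetical, not part of any Ollama or Claude Code API) of detecting a tool call that arrived in the text channel instead of a structured field:

```python
import json

def looks_like_raw_tool_call(text: str) -> bool:
    """Heuristically detect a tool call that leaked into assistant text.

    A well-formed response carries the call in a structured field (e.g. an
    Anthropic-style `tool_use` content block). If the assistant text itself
    parses as {"name": ..., "arguments": ...}, the call was never promoted
    out of plain text.
    """
    try:
        payload = json.loads(text.strip())
    except (json.JSONDecodeError, ValueError):
        return False
    return (
        isinstance(payload, dict)
        and isinstance(payload.get("name"), str)
        and isinstance(payload.get("arguments"), dict)
    )

# The second log line above is exactly this shape:
leaked = '{"name":"WebSearch","arguments":{"query":"current weather in [city]"}}'
print(looks_like_raw_tool_call(leaked))   # True
print(looks_like_raw_tool_call("hello"))  # False
```

Both log lines in this report pass this check, which supports the reading that the model produced a valid call but it was delivered as ordinary chat text.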

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.20.5

GiteaMirror added the bug label 2026-04-29 10:49:22 -05:00
Author
Owner

@rick-github commented on GitHub (Apr 12, 2026):

qwen2.5-coder is not a good tool user. Use a better model. Also check that the context size is [appropriate](https://docs.ollama.com/integrations/claude-code#manual-setup:~:text=Note%3A%20Claude%20Code%20requires%20a%20large%20context%20window.%20We%20recommend%20at%20least%2064k%20tokens.%20See%20the%20context%20length%20documentation%20for%20how%20to%20adjust%20context%20length%20in%20Ollama).
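The linked note recommends a context window of at least 64k tokens for Claude Code. One way to raise it in Ollama is to derive a larger-context variant via a Modelfile (model name and value illustrative):

```
FROM qwen2.5-coder
PARAMETER num_ctx 65536
```

Building it with `ollama create qwen2.5-coder-64k -f Modelfile` and pointing Claude Code at the new tag applies the larger window; newer Ollama releases also support setting the `OLLAMA_CONTEXT_LENGTH` environment variable on the server.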
Author
Owner

@ParthSareen commented on GitHub (Apr 13, 2026):

Often when you see this kind of issue it's as Rick mentioned - context length or model not following the correct tool calling format. We have fancier tool calling repairs for newer models. Take a look at what's new and popular here: https://ollama.com/search


Reference: github-starred/ollama#56436