[PR #10415] [MERGED] tools: refactor tool call parsing and enable streaming #39109

Closed
opened 2026-04-22 23:45:26 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10415
Author: @ParthSareen
Created: 4/25/2025
Status: Merged
Merged: 5/23/2025
Merged by: @ParthSareen

Base: main ← Head: tool-parsing


📝 Commits (10+)

  • ecfd212 model: support tools streaming and improve parsing
  • f184e14 server/routes: catch when JSON tool was used
  • 8603690 jsonv2 decoder
  • f29d88a wip
  • 95abc54 add new parser, tests, and templates
  • ec66d26 checkpoint for new parser
  • a76f0ad checkpoint - cleanup still left, functionality setup
  • 1936798 checkpoint
  • fed72ae renaming and splitting stuff up
  • cef2c86 tools package and utils

📊 Changes

27 files changed (+1868 additions, -340 deletions)

View changed files

📝 server/model.go (+0 -124)
➖ server/model_test.go (+0 -179)
📝 server/routes.go (+31 -37)
📝 tools/testdata/command-r-plus.gotmpl (+0 -0)
📝 tools/testdata/command-r-plus.out (+0 -0)
📝 tools/testdata/firefunction.gotmpl (+0 -0)
📝 tools/testdata/firefunction.out (+0 -0)
📝 tools/testdata/llama3-groq-tool-use.gotmpl (+0 -0)
📝 tools/testdata/llama3-groq-tool-use.out (+0 -0)
➕ tools/testdata/llama3.2.gotmpl (+44 -0)
➕ tools/testdata/llama3.2.out (+24 -0)
📝 tools/testdata/messages.json (+0 -0)
📝 tools/testdata/mistral.gotmpl (+0 -0)
📝 tools/testdata/mistral.out (+0 -0)
📝 tools/testdata/nemotron.gotmpl (+0 -0)
📝 tools/testdata/nemotron.out (+0 -0)
➕ tools/testdata/qwen2.5.gotmpl (+51 -0)
➕ tools/testdata/qwen2.5.out (+31 -0)
➕ tools/testdata/qwen3.gotmpl (+50 -0)
➕ tools/testdata/qwen3.out (+31 -0)

...and 7 more files

📄 Description

Demo

Simple tool calling

https://github.com/user-attachments/assets/fe523ef1-904a-4ab2-aaab-d43bec1c35e6

Search tool calling

https://github.com/user-attachments/assets/a2fd71b8-67c3-4c87-9daa-cdc3a23fc783

Incremental Parsing

  • Enables streaming and, eventually, extension to other types of tools (e.g. Python)
  • Dynamically finds the tool special token to use as a prefix check
  • Still allows JSON tool parsing as a fallback if the response starts with a JSON-parsable type
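
The points above can be illustrated with a minimal Go sketch. It is not the parser added by this PR: the `ToolCall` and `Parser` types, the `Add` and `flush` methods, and the `<tool_call>` token used in the usage note are all hypothetical names chosen for illustration, and the sketch handles only a single tool call per response.

```go
package main

import (
	"encoding/json"
	"strings"
)

// ToolCall is a simplified, hypothetical shape for a parsed tool call.
type ToolCall struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// Parser buffers streamed chunks until it can tell whether the response
// is a tool call. Prefix is the tool special token (assumed example:
// "<tool_call>"); leave it empty to rely on the JSON fallback alone.
type Parser struct {
	Prefix      string
	buf         strings.Builder
	passthrough bool // set once the response is known to be plain text
}

// Add consumes one chunk. It returns a tool call once a complete JSON
// object has arrived, or text to forward to the client. If both results
// are zero values, the parser is waiting for more input.
func (p *Parser) Add(chunk string) (*ToolCall, string) {
	if p.passthrough {
		return nil, chunk
	}
	p.buf.WriteString(chunk)
	s := strings.TrimSpace(p.buf.String())

	switch {
	case s == "":
		return nil, ""
	case len(s) < len(p.Prefix) && strings.HasPrefix(p.Prefix, s):
		return nil, "" // could still grow into the prefix
	case p.Prefix != "" && strings.HasPrefix(s, p.Prefix):
		s = strings.TrimPrefix(s, p.Prefix)
	case strings.HasPrefix(s, "{"):
		// JSON fallback: the response starts with a JSON object.
	default:
		return nil, p.flush()
	}

	var call ToolCall
	if err := json.NewDecoder(strings.NewReader(s)).Decode(&call); err != nil {
		return nil, "" // treated as incomplete JSON; keep buffering
	}
	if call.Name == "" {
		return nil, p.flush() // valid JSON, but not a tool call
	}
	p.buf.Reset()
	return &call, ""
}

// flush switches to passthrough mode and returns everything buffered so far.
func (p *Parser) flush() string {
	p.passthrough = true
	out := p.buf.String()
	p.buf.Reset()
	return out
}
```

Feeding the chunks `<tool_`, `call>{"name":"get_weather",` and `"arguments":{"city":"Paris"}}` to a `Parser{Prefix: "<tool_call>"}` yields nothing for the first two calls and a `ToolCall` on the third. A response that starts with ordinary text is forwarded as soon as it can no longer match the prefix, which is what makes streaming possible.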

Breaking Changes

This PR also warrants a change to the qwen2.5-coder template due to incremental JSON parsing and focusing on tool prefixes.

There is a possibility that other models will break. This can happen in two cases:

  1. Model has a tool prefix defined but does not output the prefix before making a tool call.
  2. Model does not output a JSON tool call right away if it does not have a prefix. Previously we'd greedily parse over the accumulated content and get all JSON.

I'd say these are model-specific problems that should be addressed through training or templating.
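
As a rough illustration of the two cases, here is a hedged Go sketch of a prefix-first check. The `detectable` helper and the `<tool_call>` token in the usage note are hypothetical and are not taken from the ollama codebase; the sketch only assumes, per the description above, that a tool call is recognised when the response starts with the tool prefix or with JSON.

```go
package main

import "strings"

// detectable reports whether a response would be recognised as a tool call
// under a prefix-first check. Hypothetical helper for illustration only.
func detectable(prefix, output string) bool {
	out := strings.TrimSpace(output)
	if prefix != "" && strings.HasPrefix(out, prefix) {
		return true
	}
	// The JSON fallback applies only when the response starts with JSON;
	// JSON that appears later in the text is no longer picked up.
	return strings.HasPrefix(out, "{") || strings.HasPrefix(out, "[")
}
```

Under these assumptions, `detectable("<tool_call>", "I'll look that up: {\"name\":\"search\"}")` is false (case 1: prefix defined but not emitted) and `detectable("", "Let me check. {\"name\":\"search\"}")` is also false (case 2: JSON does not come first), while a response that leads with the prefix or with the JSON itself is still detected.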

Closes: https://github.com/ollama/ollama/issues/7014, https://github.com/ollama/ollama/issues/7886, https://github.com/ollama/ollama/issues/9632, https://github.com/ollama/ollama-python/issues/463, https://github.com/ollama/ollama/issues/10712

Follow ups:

  • Template updates for qwen and llama4
  • Remove leading spaces in general
  • Consider setting done to true
  • Python function calling: https://github.com/ollama/ollama/pull/10453
  • Use full prefix instead of single token for tools
  • "Pipelining": send JSON sooner when a response contains multiple tool calls

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 23:45:27 -05:00
Reference: github-starred/ollama#39109