[PR #15858] server: preserve chat metrics across buffered tool chunks #62028

Open
opened 2026-04-29 16:59:05 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15858
Author: @imwyvern
Created: 4/28/2026
Status: 🔄 Open

Base: main ← Head: clawoss/fix/15850-token-count-tool-call


📝 Commits (1)

  • 1817ccf server: preserve chat metrics across buffered tool chunks

📊 Changes

2 files changed (+102 additions, -0 deletions)

View changed files

📝 server/routes.go (+18 -0)
📝 server/routes_generate_test.go (+84 -0)

📄 Description

Summary

  • preserve observed chat metrics across streamed chunks before they are emitted
  • keep prompt_eval_count/prompt duration available on the final tool-call response when earlier chunks were buffered by the tool parser
  • add a regression test covering buffered tool-call chunks with prompt metrics only on the first chunk

Fixes #15850.
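The idea behind the fix can be sketched as follows: fold the metrics observed on each streamed chunk into an accumulator, so that values reported only on early chunks (which the tool parser buffers rather than emits) still appear on the final tool-call response. This is a minimal illustration, not the actual patch; the `Metrics` and `Chunk` types and the `accumulate` helper are hypothetical stand-ins for the real server types.

```go
package main

import "fmt"

// Metrics mirrors the prompt/eval counters attached to streamed chat
// chunks. Field names are illustrative, not the exact ollama types.
type Metrics struct {
	PromptEvalCount    int
	PromptEvalDuration int64 // nanoseconds
	EvalCount          int
	EvalDuration       int64 // nanoseconds
}

// Chunk is a hypothetical streamed completion chunk.
type Chunk struct {
	Content string
	Done    bool
	Metrics Metrics
}

// accumulate folds non-zero metrics from an observed chunk into acc, so
// prompt metrics seen only on early (buffered) chunks survive to the end.
func accumulate(acc *Metrics, m Metrics) {
	if m.PromptEvalCount > 0 {
		acc.PromptEvalCount = m.PromptEvalCount
	}
	if m.PromptEvalDuration > 0 {
		acc.PromptEvalDuration = m.PromptEvalDuration
	}
	acc.EvalCount += m.EvalCount
	acc.EvalDuration += m.EvalDuration
}

func main() {
	// Prompt metrics arrive only on the first chunk; the final
	// tool-call chunk carries none of them.
	chunks := []Chunk{
		{Metrics: Metrics{PromptEvalCount: 12, PromptEvalDuration: 5e6}},
		{Content: "partial", Metrics: Metrics{EvalCount: 3, EvalDuration: 1e6}},
		{Done: true, Metrics: Metrics{EvalCount: 4, EvalDuration: 2e6}},
	}

	var acc Metrics
	for _, c := range chunks {
		accumulate(&acc, c.Metrics)
		if c.Done {
			// The final response still reports the prompt metrics
			// that were only present on the buffered first chunk.
			fmt.Println(acc.PromptEvalCount, acc.EvalCount)
		}
	}
}
```

Without the accumulator, building the final response solely from the last chunk's metrics would report a zero `prompt_eval_count`, which is the symptom described in #15850.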

Tests

  • go test ./server -run 'TestGenerateChat/messages_with_tools_preserves_prompt_metrics_from_buffered_chunks' -count=1
  • go test ./server -run TestGenerateChat -count=1

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 16:59:05 -05:00

Reference: github-starred/ollama#62028