[PR #14080] [MERGED] ollamarunner: Fix off by one error with numPredict #14503

opened 2026-04-13 00:56:07 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14080
Author: @jessegross
Created: 2/5/2026
Status: Merged
Merged: 2/5/2026
Merged by: @jessegross

Base: main ← Head: jessegross/num_predict


📝 Commits (1)

  • 2375641 ollamarunner: Fix off by one error with numPredict

📊 Changes

2 files changed (+53 additions, -8 deletions)


📝 integration/basic_test.go (+44 -0)
📝 runner/ollamarunner/runner.go (+9 -8)

📄 Description

When numPredict is set, the user receives one token fewer than the requested limit, and the stats incorrectly report the number of returned tokens as the limit. When numPredict is not set, the token count is reported correctly.

This happens because numPredict is checked when setting up the next batch, but hitting the limit also terminates the current batch, cancelling its final token. Instead, it is better to check the limit as each token is actually predicted.
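A minimal sketch (not the actual runner code; function names are illustrative) of why testing the limit at batch setup drops a token, while testing it as tokens are predicted does not:

```go
package main

import "fmt"

// checkAtBatchSetup models the buggy pattern: the limit is tested when
// the next batch is set up, which also cancels the batch that would
// have emitted the final token. Stats count numPredict tokens, but
// only numPredict-1 are actually returned.
func checkAtBatchSetup(numPredict int) (counted, returned int) {
	for counted < numPredict {
		counted++ // token accounted for at batch setup
		if counted >= numPredict {
			break // limit hit: current batch terminated, token never emitted
		}
		returned++ // token actually sent to the user
	}
	return counted, returned
}

// checkAtPrediction models the fix: the limit is tested as each token
// is actually predicted, so the token that reaches the limit is still
// returned and the stats match what the user received.
func checkAtPrediction(numPredict int) (counted, returned int) {
	for {
		returned++ // token predicted and sent
		counted++
		if counted >= numPredict {
			return counted, returned
		}
	}
}

func main() {
	c, r := checkAtBatchSetup(10)
	fmt.Println(c, r) // 10 9: stats say 10, user got 9

	c, r = checkAtPrediction(10)
	fmt.Println(c, r) // 10 10: stats and output agree
}
```

The real runner batches multiple sequences at once, but the off-by-one reduces to the same ordering question: count-then-check at setup versus check after each prediction.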


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:56:08 -05:00

Reference: github-starred/ollama#14503