[PR #12105] Add support for the 'ignore_eos' parameter. #60405

Open · opened 2026-04-29 15:21:35 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/12105
Author: @DoubleRedX
Created: 8/28/2025
Status: 🔄 Open

Base: main ← Head: main


📝 Commits (2)

  • a2d5e2b Add support for the 'ignore eos' parameter, which can be used in conjunction with 'num_predict' to specify the number of output tokens of the model.
  • 55b6b0d Modify test about openai.

📊 Changes

9 files changed (+29 additions, -1 deletions)


📝 api/types.go (+2 -0)
📝 llama/llama.cpp/common/sampling.cpp (+4 -0)
📝 llama/llama.cpp/common/sampling.h (+2 -0)
📝 llama/llama.go (+6 -0)
📝 llama/sampling_ext.cpp (+5 -0)
📝 llama/sampling_ext.h (+2 -0)
📝 openai/openai.go (+3 -0)
📝 openai/openai_test.go (+3 -0)
📝 runner/llamarunner/runner.go (+2 -1)

📄 Description

Add support for the 'ignore_eos' parameter, which can be used in conjunction with 'num_predict' to specify the exact number of output tokens the model generates.

When using tools to run performance tests against Ollama's OpenAI-compatible API, it is necessary to pin the number of output tokens. However, Ollama's 'num_predict' parameter only sets an upper bound on generation; the model can still stop early by emitting an end-of-sequence (EOS) token. The 'ignore_eos' parameter makes the model keep generating past EOS, so combining it with 'num_predict' yields exactly the requested number of tokens.
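For reference, a minimal sketch of how the option could be exercised for benchmarking against Ollama's native /api/generate endpoint. The 'ignore_eos' option name is assumed from this PR and is not part of upstream Ollama; 'num_predict' and the 'eval_count' response field are existing API features.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Pin the output length for benchmarking: num_predict caps generation
	// at 128 tokens, and ignore_eos (option name assumed from this PR)
	// keeps the model sampling past end-of-sequence tokens, so the two
	// together should produce exactly 128 tokens.
	body, _ := json.Marshal(map[string]any{
		"model":  "llama3",
		"prompt": "Write a story about a robot.",
		"stream": false,
		"options": map[string]any{
			"num_predict": 128,
			"ignore_eos":  true,
		},
	})

	resp, err := http.Post("http://localhost:11434/api/generate",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// eval_count reports how many tokens the model generated; with
	// ignore_eos set it should match num_predict.
	var out struct {
		EvalCount int `json:"eval_count"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	fmt.Println("generated tokens:", out.EvalCount)
}
```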


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 15:21:35 -05:00

Reference: github-starred/ollama#60405