[PR #13641] server: compute numKeep to protect system prompts during context shift #19579

Open
opened 2026-04-16 07:11:06 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/13641
Author: @ParthSareen
Created: 1/7/2026
Status: 🔄 Open

Base: main ← Head: parth/fix-context-chopping


📝 Commits (1)

  • d65ccbc server: compute numKeep to protect system prompts during context shift

📊 Changes

3 files changed (+47 additions, -10 deletions)


📝 server/prompt.go (+37 -6)
📝 server/prompt_test.go (+1 -1)
📝 server/routes.go (+9 -3)

📄 Description

Previously, NumKeep defaulted to 4, causing system prompts to be truncated when the context window filled up and a shift operation occurred. This resulted in models losing their persona/instructions during long conversations.
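
For background, a context shift protects only the first NumKeep tokens of the prompt and evicts a chunk of what follows to make room. A rough Go illustration of that behavior, assuming the common policy of discarding the older half of the unprotected tokens (shiftContext and token are hypothetical names, not the runner's actual code):

```go
// Rough sketch of a context shift: the first numKeep tokens survive;
// the oldest half of everything after them is discarded.
type token int

func shiftContext(ctx []token, numKeep int) []token {
	kept := ctx[:numKeep]
	rest := ctx[numKeep:]
	// With the old default numKeep=4, any system prompt longer than
	// four tokens lived in rest and could be evicted here.
	out := make([]token, 0, len(kept)+len(rest)-len(rest)/2)
	out = append(out, kept...)
	out = append(out, rest[len(rest)/2:]...)
	return out
}
```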

Changes:

  • chatPrompt() now returns numKeep, the token count of the system messages plus tools
  • ChatHandler and GenerateHandler set opts.NumKeep from the computed value
  • Error if system + tools exceeds NumCtx-100, since that leaves too little room for the conversation
  • Cap numKeep at NumCtx-200 so at least 200 tokens remain for generation (see the sketch below)
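
A minimal Go sketch of the bounds logic the last two bullets describe; computeNumKeep and its arguments are hypothetical names for illustration, not the actual code in server/prompt.go:

```go
package main

import "fmt"

// computeNumKeep (hypothetical) applies the two bounds described above:
// reject prompts whose fixed prefix crowds out the conversation, and
// cap numKeep so generation always has room after a shift.
func computeNumKeep(systemToolTokens, numCtx int) (int, error) {
	// Error if system messages + tools leave fewer than 100 tokens
	// of context for the rest of the conversation.
	if systemToolTokens > numCtx-100 {
		return 0, fmt.Errorf("system prompt and tools use %d of %d context tokens, leaving too little room for conversation", systemToolTokens, numCtx)
	}
	// Cap numKeep at NumCtx-200 so at least 200 tokens are always
	// available for generation.
	numKeep := systemToolTokens
	if numKeep > numCtx-200 {
		numKeep = numCtx - 200
	}
	return numKeep, nil
}
```

Under this scheme the handlers would assign the returned value to opts.NumKeep before starting completion, replacing the old hardcoded default of 4.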

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 07:11:06 -05:00

Reference: github-starred/ollama#19579