[PR #7624] [MERGED] runner.go: Make KV entry accounting more robust #59168

Closed
opened 2026-04-29 14:04:21 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/7624
Author: @jessegross
Created: 11/12/2024
Status: Merged
Merged: 11/12/2024
Merged by: @jessegross

Base: main ← Head: jessegross/kv_shift


📝 Commits (1)

  • b039afd runner.go: Make KV entry accounting more robust

📊 Changes

2 files changed (+58 additions, -56 deletions)

📝 llama/runner/cache.go (+33 -12)
📝 llama/runner/runner.go (+25 -44)

📄 Description

The accounting for KV cache shifting was carried over from the old runner, but it no longer fits naturally with the new runner. Several invariants should hold true but are difficult to reason about, and at least one bug report implies that they are not holding.

This change reduces the number of implicit assumptions and is more forgiving of unexpected situations. It also improves the choice of which input tokens are kept when truncation occurs.

Bug #7545


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-29 14:04:21 -05:00
Reference: github-starred/ollama#59168