[PR #9433] [MERGED] runner: clear cache when shift is not possible #12960

Closed
opened 2026-04-13 00:13:53 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/9433
Author: @BruceMacD
Created: 3/1/2025
Status: Merged
Merged: 3/31/2025
Merged by: @BruceMacD

Base: main ← Head: brucemacd/ctx-shift-err


📝 Commits (2)

  • 9272e09 runner: clear cache when shift is not possible
  • 8ac3b75 PR feedback

📊 Changes

6 files changed (+179 additions, -13 deletions)


📝 llama/llama.go (+4 -0)
📝 runner/llamarunner/cache.go (+44 -9)
📝 runner/llamarunner/runner.go (+9 -1)
📝 runner/ollamarunner/cache.go (+22 -2)
📝 runner/ollamarunner/cache_test.go (+91 -0)
📝 runner/ollamarunner/runner.go (+9 -1)

📄 Description

Clear the KV cache when the model does not support the shift operation. This adds a KvCacheCanShift() check to handle models that cannot perform cache shifts. For those models the runner falls back to a full cache clear while preserving the logical token history, so behavior stays as expected when the context window fills up.
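
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `toyCache`, `shiftOrClear`, and the `canShift` field are hypothetical names standing in for the runner's cache and the result of the KvCacheCanShift() check, and tokens are modeled as plain integers.

```go
package main

import "fmt"

// toyCache is a hypothetical stand-in for one sequence's KV cache slot.
type toyCache struct {
	canShift bool  // assumed to mirror the result of a KvCacheCanShift()-style check
	history  []int // logical token history for the sequence
	cached   int   // number of history tokens currently held in the KV cache
}

// shiftOrClear makes room by dropping numDiscard tokens that follow the
// first numKeep tokens. It returns the tokens the caller must evaluate
// again, which is nil when the cache could be shifted in place.
func (c *toyCache) shiftOrClear(numKeep, numDiscard int) ([]int, error) {
	if numKeep < 0 || numDiscard < 0 || numKeep+numDiscard > len(c.history) {
		return nil, fmt.Errorf("invalid discard range: keep=%d discard=%d len=%d",
			numKeep, numDiscard, len(c.history))
	}

	// The logical history always loses the discarded span, whichever path runs.
	kept := append([]int{}, c.history[:numKeep]...)
	kept = append(kept, c.history[numKeep+numDiscard:]...)
	c.history = kept

	if c.canShift {
		// Shift path: the surviving entries stay cached at new positions.
		c.cached = len(kept)
		return nil, nil
	}

	// Fallback path: drop every cached entry, but hand the preserved
	// history back so the caller can reprocess it from scratch.
	c.cached = 0
	return append([]int{}, kept...), nil
}

func main() {
	c := &toyCache{canShift: false, history: []int{1, 2, 3, 4, 5, 6}, cached: 6}
	reprocess, err := c.shiftOrClear(2, 2)
	fmt.Println(reprocess, c.cached, err)
}
```

In this sketch the cost of the fallback is that every surviving token is evaluated again, whereas the shift path keeps the cached entries; the logical history seen by the caller is the same in both cases.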

Fixes: https://github.com/ollama/ollama/issues/5975
Fixes: https://github.com/ollama/ollama/issues/8074
Fixes: https://github.com/ollama/ollama/issues/8571
Fixes: https://github.com/ollama/ollama/issues/8599
Fixes: https://github.com/ollama/ollama/issues/8602
Fixes: https://github.com/ollama/ollama/issues/8614
Fixes: https://github.com/ollama/ollama/issues/8924
Fixes: https://github.com/ollama/ollama/issues/9010
Fixes: https://github.com/ollama/ollama/issues/9047
Fixes: https://github.com/ollama/ollama/issues/9064
Fixes: https://github.com/ollama/ollama/issues/9105
Fixes: https://github.com/ollama/ollama/issues/9171
Fixes: https://github.com/ollama/ollama/issues/9248
Fixes: https://github.com/ollama/ollama/issues/9410


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-13 00:13:53 -05:00
Reference: github-starred/ollama#12960