[PR #10055] [MERGED] kvcache: add check for values that fall out of sliding window cache #39004

Closed
opened 2026-04-22 23:38:53 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10055
Author: @jmorganca
Created: 3/30/2025
Status: Merged
Merged: 4/2/2025
Merged by: @jessegross

Base: main ← Head: jmorganca/cache


📝 Commits (1)

  • 100cd90 (https://github.com/ollama/ollama/commit/100cd90a19bf4d8a6a0a56209aed71b9ddf58cee) kvcache: Add check for values that fall out of sliding window cache

📊 Changes

7 files changed (+131 additions, -2 deletions)

📝 kvcache/cache.go (+5 -0)
📝 kvcache/causal.go (+36 -2)
📝 kvcache/causal_test.go (+71 -0)
📝 kvcache/encoder.go (+4 -0)
📝 kvcache/wrapper.go (+10 -0)
📝 runner/ollamarunner/cache.go (+4 -0)
📝 runner/ollamarunner/cache_test.go (+1 -0)

📄 Description

This is an attempt at fixing an issue where cached positions would "fall out" of the sliding window attention when the sequence became too long (i.e. longer than the sliding window size of 512, 1024, etc.).

Now, for cache slots where the cache no longer contains the position determined by the common prefix (because the window has shifted beyond that position), the common prefix will be considered to be 0.
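
The rule described above can be illustrated with a minimal Go sketch. This is not the code from the PR: the type `slidingWindow`, its fields, and the functions `canResume` and `usablePrefix` are hypothetical names chosen for illustration, and the sketch assumes the cache only needs to track the newest stored position and the window size.

```go
package main

import "fmt"

// slidingWindow is a hypothetical stand-in for one cache slot that uses
// sliding window attention. It is NOT the type used in the PR.
type slidingWindow struct {
	windowSize int32 // size of the attention window, e.g. 512 or 1024
	lastPos    int32 // newest position stored in the cache; -1 if empty
}

// windowStart returns the oldest position still inside the window that
// ends at pos, clamped at 0.
func windowStart(pos, size int32) int32 {
	if pos < size {
		return 0
	}
	return pos - size
}

// canResume reports whether processing can restart at pos. That is only
// possible if the window ending at pos does not reach back past the
// oldest position the cache still retains.
func (w slidingWindow) canResume(pos int32) bool {
	if w.lastPos < 0 {
		return false // nothing cached for this slot
	}
	return windowStart(pos, w.windowSize) >= windowStart(w.lastPos, w.windowSize)
}

// usablePrefix returns the common prefix length that can be reused, or 0
// when the window has shifted beyond that position.
func usablePrefix(w slidingWindow, commonPrefix int32) int32 {
	if !w.canResume(commonPrefix) {
		return 0
	}
	return commonPrefix
}

func main() {
	long := slidingWindow{windowSize: 512, lastPos: 2000}
	fmt.Println(usablePrefix(long, 100))  // 0: position 100 fell out of the window
	fmt.Println(usablePrefix(long, 2000)) // 2000: still covered by the cache

	short := slidingWindow{windowSize: 512, lastPos: 300}
	fmt.Println(usablePrefix(short, 100)) // 100: sequence never exceeded the window
}
```

In this sketch, treating the prefix as 0 means the whole prompt is reprocessed for that slot, which trades some recomputation for correctness once the window has moved on.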


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
