[PR #15058] [MERGED] mlxrunner: schedule periodic snapshots during prefill #61691

opened 2026-04-29 16:44:09 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15058
Author: @jessegross
Created: 3/25/2026
Status: Merged
Merged: 3/26/2026
Merged by: @jessegross

Base: main ← Head: jessegross/snapshots


📝 Commits (3)

  • 74c4a72 mlxrunner: improve eviction and LRU tracking
  • 015546f mlxrunner: schedule periodic snapshots during prefill
  • 17b114f mlxrunner: combine setStateRaw and setStateDetached into setState

📊 Changes

4 files changed (+192 additions, -121 deletions)


📝 x/mlxrunner/cache.go (+81 -52)
📝 x/mlxrunner/cache/recurrent.go (+12 -24)
📝 x/mlxrunner/cache_test.go (+77 -36)
📝 x/mlxrunner/pipeline.go (+22 -9)

📄 Description

Add periodic snapshots every 8k tokens and near the end of the prompt so that long prompts can be partially restored and thinking/generation can be retried without full reprocessing.
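
As a rough illustration of the schedule described above (not the code from this PR), the snapshot positions for a prompt could be computed as follows. The function name `snapshotPoints`, the reading of "8k" as 8192 tokens, and the `tailMargin` of 64 tokens used for "near the end of the prompt" are all assumptions made for this sketch; the actual values and logic live in `x/mlxrunner/cache.go` and `x/mlxrunner/pipeline.go`.

```go
package main

import "fmt"

// Assumed values for illustration only; the PR text says "every 8k tokens"
// and "near the end of the prompt" without giving exact numbers.
const (
	snapshotInterval = 8192 // assumed meaning of "8k"
	tailMargin       = 64   // hypothetical distance from the prompt end
)

// snapshotPoints returns, in ascending order, the token offsets at which a
// snapshot would be taken while prefilling a prompt of promptLen tokens:
// one at every multiple of interval, plus one margin tokens before the end
// unless a periodic snapshot already falls at or after that position.
func snapshotPoints(promptLen, interval, margin int) []int {
	var points []int
	if promptLen <= 0 || interval <= 0 {
		return points
	}
	// Periodic snapshots strictly inside the prompt.
	for p := interval; p < promptLen; p += interval {
		points = append(points, p)
	}
	// Final snapshot near the end of the prompt.
	tail := promptLen - margin
	if tail > 0 && (len(points) == 0 || points[len(points)-1] < tail) {
		points = append(points, tail)
	}
	return points
}

func main() {
	// A 20,000-token prompt gets two periodic snapshots and one near the end.
	fmt.Println(snapshotPoints(20000, snapshotInterval, tailMargin))
	// prints "[8192 16384 19936]"
}
```

With a schedule like this, a retry that shares the first 17,000 tokens of the prompt could resume from the snapshot at 16,384 rather than reprocessing from the start, and a retry of generation alone could resume from the snapshot near the end.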


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Reference: github-starred/ollama#61691