[PR #14985] [MERGED] mlxrunner: support partial match on pure transformer caches #14953

Closed
opened 2026-04-13 01:06:43 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14985
Author: @jessegross
Created: 3/20/2026
Status: Merged
Merged: 3/24/2026
Merged by: @jessegross

Base: main ← Head: jessegross/partial


📝 Commits (3)

  • 00b5e60 mlxrunner: support partial match on pure transformer caches
  • 325b3c1 mlxrunner: show time since last used in cache dump tree
  • f0375b5 mlxrunner: panic on double unpin

📊 Changes

6 files changed (+149 additions, -112 deletions)

📝 x/mlxrunner/cache.go (+25 -29)
📝 x/mlxrunner/cache/cache.go (+21 -7)
📝 x/mlxrunner/cache/recurrent.go (+6 -16)
📝 x/mlxrunner/cache/recurrent_test.go (+15 -19)
📝 x/mlxrunner/cache_test.go (+78 -40)
📝 x/mlxrunner/mlx/array.go (+4 -1)

📄 Description

Previously, a partial match within a node's edge would truncate the path to the parent snapshot, effectively making all cache types behave as recurrent caches. Caches with only transformer layers can rewind to an arbitrary boundary, so this change restores that capability to improve cache hits.
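
To illustrate the behavior described above, here is a minimal sketch in Go. It is not the code from this PR: `commonPrefixLen`, `reusableTokens`, and the `canRewind` flag are hypothetical names chosen for illustration, and the real cache works with snapshots and MLX arrays rather than plain token slices. The sketch assumes a prefix tree in which each edge stores the tokens that follow a parent snapshot.

```go
package main

import "fmt"

// commonPrefixLen returns how many leading tokens a and b share.
func commonPrefixLen(a, b []int32) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

// reusableTokens reports how many prompt tokens can be served from cache.
// All names here are hypothetical, for illustration only.
//   parentLen: tokens already covered by the parent snapshot
//   edge:      tokens stored on the edge below that snapshot
//   rest:      prompt tokens that follow the parent snapshot
//   canRewind: true when every layer is a transformer (KV) cache,
//              so the cache can be cut at any token boundary
func reusableTokens(parentLen int, edge, rest []int32, canRewind bool) int {
	matched := commonPrefixLen(edge, rest)
	if matched == len(edge) || canRewind {
		// Full edge match, or a partial match that can be rewound to.
		return parentLen + matched
	}
	// Recurrent state cannot be rewound mid-edge: fall back to the snapshot.
	return parentLen
}

func main() {
	edge := []int32{10, 11, 12, 13}
	rest := []int32{10, 11, 99} // diverges after two tokens

	fmt.Println("transformer-only:", reusableTokens(4, edge, rest, true))  // 6
	fmt.Println("recurrent:", reusableTokens(4, edge, rest, false)) // 4
}
```

In this sketch the old behavior corresponds to always passing `canRewind = false`, which discards the two matching tokens on the edge even when the cache could have kept them.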


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Reference: github-starred/ollama#14953