[PR #15125] kvcache: wire turboquant cache into new engine path #20312

Open
opened 2026-04-16 07:32:58 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15125
Author: @Dankguy17
Created: 3/29/2026
Status: 🔄 Open

Base: main ← Head: kvcache/turboquant-rotation-compression


📝 Commits (4)

  • 9d19dc3 kvcache: add TurboQuant rotation-enhanced KV cache compression
  • 64f5037 Merge branch 'ollama:main' into kvcache/turboquant-rotation-compression
  • 8195786 Merge branch 'ollama:main' into kvcache/turboquant-rotation-compression
  • aafb65a kvcache: wire turboquant cache into new engine path

📊 Changes

16 files changed (+1031 additions, -2 deletions)

📝 fs/ggml/ggml.go (+5 -1)
➕ kvcache/turboquant.go (+216 -0)
➕ kvcache/turboquant_test.go (+118 -0)
📝 llama/llama.go (+3 -0)
📝 ml/backend.go (+5 -0)
📝 ml/backend/ggml/ggml.go (+7 -0)
📝 model/model.go (+5 -0)
📝 model/model_test.go (+33 -0)
📝 runner/ollamarunner/cache.go (+27 -1)
➕ turboquant/codebook.go (+90 -0)
➕ turboquant/codebook_test.go (+104 -0)
➕ turboquant/qjl.go (+32 -0)
➕ turboquant/qjl_test.go (+76 -0)
➕ turboquant/rotation.go (+129 -0)
➕ turboquant/rotation_test.go (+154 -0)
➕ turboquant/turboquant.go (+27 -0)

📄 Description

Follow-up to #15090. Includes TurboQuant cache wiring + CheckpointCache passthrough fixes. Commit: aafb65a9.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-16 07:32:58 -05:00
Reference: github-starred/ollama#20312