[PR #12494] [MERGED] llm: Support KV cache quantization with gpt-oss #45094

Closed
opened 2026-04-25 00:46:51 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/12494
Author: @jessegross
Created: 10/3/2025
Status: Merged
Merged: 10/3/2025
Merged by: @jessegross

Base: main ← Head: jessegross/gptkv


📝 Commits (1)

  • 5147b35 llm: Support KV cache quantization with gpt-oss

📊 Changes

1 file changed (+0 additions, -5 deletions)


📝 fs/ggml/ggml.go (+0 -5)

📄 Description

With the new version of GGML in #12245, KV cache quantization no longer causes a fallback to CPU.
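The PR body does not include the diff, but the change summary (five lines removed from `fs/ggml/ggml.go`) suggests it dropped a guard that previously disabled KV cache quantization for gpt-oss. Below is a hedged sketch of that kind of per-architecture check; the function and parameter names are hypothetical, not Ollama's actual API:

```go
package main

import "fmt"

// supportsQuantizedKVCache sketches the sort of guard PR #12494 removed.
// Before the GGML update in #12245, a quantized KV cache with gpt-oss
// forced a fallback, so the check returned false for that architecture.
// All names here are illustrative.
func supportsQuantizedKVCache(arch string, ggmlHasQuantKernels bool) bool {
	// Old behavior: gpt-oss was special-cased and fell back when the
	// backend lacked kernels for a quantized cache.
	if arch == "gptoss" && !ggmlHasQuantKernels {
		return false
	}
	// New behavior: with updated GGML, no architecture-specific fallback.
	return true
}

func main() {
	fmt.Println(supportsQuantizedKVCache("gptoss", true))  // new GGML: supported
	fmt.Println(supportsQuantizedKVCache("gptoss", false)) // old GGML: fallback
}
```

In practice, KV cache quantization in Ollama is configured through the documented `OLLAMA_KV_CACHE_TYPE` setting (e.g. `q8_0`), which also requires flash attention to be enabled.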


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-25 00:46:51 -05:00
Reference: github-starred/ollama#45094