[PR #10806] [MERGED] server: improve tensor quantization fallback logic #12108

Closed
opened 2025-11-12 16:29:00 -06:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10806
Author: @BruceMacD
Created: 5/21/2025
Status: Merged
Merged: 5/22/2025
Merged by: @BruceMacD

Base: main ← Head: brucemacd/quant-fallback


📝 Commits (1)

  • 29518f0 (https://github.com/ollama/ollama/commit/29518f03729d9dd51e2437490137355f2fa40713) server: improve tensor quantization fallback logic

📊 Changes

1 file changed (+22 additions, -6 deletions)


📝 server/quantization.go (+22 -6)

📄 Description

Fall back to alternative quantization types when a tensor's dimensions aren't divisible by the block size required by the originally requested quantization type. If the retried quantization types also fail, the system ultimately falls back to F16 (half-precision floating point), which has a block size of 1 and can handle any tensor dimension.

resolves #10729
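
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in `server/quantization.go`: the names `tensorType`, `blockSize`, `fallback`, and `chooseType` are hypothetical, the alternative-type table is an assumed example, and only the row length is checked for divisibility. The block sizes themselves (256 for K-quants, 32 for the legacy quants, 1 for F16) follow GGML's conventions.

```go
package main

import "fmt"

// tensorType is a hypothetical stand-in for the quantization type enum.
type tensorType string

const (
	typeF16 tensorType = "F16"
	typeQ50 tensorType = "Q5_0"
	typeQ80 tensorType = "Q8_0"
	typeQ4K tensorType = "Q4_K"
	typeQ6K tensorType = "Q6_K"
)

// blockSize is the number of elements packed into one quantization block.
var blockSize = map[tensorType]uint64{
	typeF16: 1,
	typeQ50: 32,
	typeQ80: 32,
	typeQ4K: 256,
	typeQ6K: 256,
}

// fallback maps a type to one alternative with a smaller block size.
// The pairs here are assumptions chosen for illustration.
var fallback = map[tensorType]tensorType{
	typeQ4K: typeQ50,
	typeQ6K: typeQ80,
}

// fits reports whether rowLen splits evenly into blocks of type t.
// Unknown types never fit, which also avoids a division by zero.
func fits(t tensorType, rowLen uint64) bool {
	bs, ok := blockSize[t]
	return ok && rowLen%bs == 0
}

// chooseType returns the requested type if the row length is divisible by
// its block size, then tries the alternative, and finally returns F16.
func chooseType(want tensorType, rowLen uint64) tensorType {
	if fits(want, rowLen) {
		return want
	}
	if alt, ok := fallback[want]; ok && fits(alt, rowLen) {
		return alt
	}
	return typeF16 // block size 1 divides every row length
}

func main() {
	for _, n := range []uint64{4096, 2080, 1001} {
		fmt.Printf("row length %d: Q4_K -> %s\n", n, chooseType(typeQ4K, n))
	}
}
```

With these assumed tables, a row of 4096 elements keeps Q4_K, a row of 2080 (divisible by 32 but not by 256) drops to Q5_0, and a row of 1001 ends up as F16.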


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
