[PR #6538] [MERGED] throw an error when encountering unsupported tensor sizes #38018

Closed
opened 2026-04-22 22:41:44 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/6538
Author: @pdevine
Created: 8/28/2024
Status: Merged
Merged: 8/28/2024
Merged by: @pdevine

Base: main ← Head: pdevine/convert-error


📝 Commits (2)

  • 11e5a51 throw an error when encountering unsupport tensor sizes
  • c0302b0 feed the linter

📊 Changes

2 files changed (+106 additions, -0 deletions)


📝 convert/convert_test.go (+101 -0)
📝 convert/reader_safetensors.go (+5 -0)

📄 Description

The bitsandbytes package creates an 8-bit quantized version of a model, which is unsupported by the llama.cpp back end. It does this by creating two tensors for each layer, which look like:

```
model.layers.0.mlp.down_proj.weight dtype=I8 shape=[4096, 14336]
model.layers.0.mlp.down_proj.weight_format dtype=U8 shape=[]
```

This change checks whether any tensor has an empty shape and returns an error; currently the server panics instead.
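
As a rough illustration of that kind of guard (a minimal sketch only; the type and function names below are hypothetical, not the actual code in convert/reader_safetensors.go):

```go
package convert

import "fmt"

// safetensorEntry is a stand-in for what the reader parses out of a
// safetensors header: a tensor name, dtype, and shape.
type safetensorEntry struct {
	name  string
	dtype string
	shape []uint64
}

// validateShape rejects zero-dimensional tensors such as the bitsandbytes
// "weight_format" entries shown above, so conversion fails with a clear
// error instead of panicking later.
func validateShape(t safetensorEntry) error {
	if len(t.shape) == 0 {
		return fmt.Errorf("can't convert tensor %q (dtype %s): unsupported shape", t.name, t.dtype)
	}
	return nil
}
```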

We already support quantizing directly from a Safetensors model, so users should use the --quantize flag with ollama create instead of relying on bitsandbytes.
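
For example, a hypothetical invocation (the model name, Modelfile path, and quantization type below are placeholders):

```
ollama create my-model -f Modelfile --quantize q8_0
```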

Fixes #6357


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.
