[GH-ISSUE #15925] Quantization bugs #72203

Open
opened 2026-05-05 03:37:42 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @levicki on GitHub (May 1, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15925

What is the issue?

I tried creating a model from safetensors with quantization.

I accidentally typed Q5_K_M instead of Q4_K_M.

ollama.exe proceeded to:

  1. Import all model files into blobs folder
  2. Merge layers to single GGUF (58.3 GB)
  3. Fail the quantization only at the very end, because Q5_K_M is not supported
  4. Leave all the files it created in blobs folder

TL;DR — It wasted a total of 116.6 GB of disk space and NVMe writes, plus about 5 minutes of CPU churning and RAM paging, only to tell me "I can't do that Dave" and leave the mess behind.

And why?

Because someone wrote the code that:

  • does not validate command-line arguments up front
  • does not use a temp folder or delete-on-close semantics for files that should not persist
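
Up-front validation is cheap. A minimal sketch of what that could look like, in Python (the set of supported quantization types below is illustrative only, not Ollama's actual list):

```python
# NOTE: this set is illustrative; it is NOT Ollama's actual supported list.
SUPPORTED_QUANTIZATIONS = {"Q4_0", "Q4_1", "Q8_0", "Q4_K_S", "Q4_K_M"}

def validate_quantization(name: str) -> str:
    """Reject unknown quantization types before any expensive work starts."""
    canonical = name.strip().upper()
    if canonical not in SUPPORTED_QUANTIZATIONS:
        supported = ", ".join(sorted(SUPPORTED_QUANTIZATIONS))
        raise ValueError(
            f"unsupported quantization {name!r}; supported types: {supported}"
        )
    return canonical
```

Called before any blob is written, this would have turned 5 minutes of churn and 116.6 GB of writes into an instant error message.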

Finally, there's no option to do the simplest housekeeping on blobs folder:

  • Enumerate manifests
  • Build a list of hashes
  • Enumerate blobs to find files
  • Remove all files found in manifests from the list
  • Present the remaining files and their sizes to user
  • Offer the user to delete them

Claude 4.6 (free version) wrote this workflow in Python for me in less than 10 seconds; I see no good reason not to offer it as a command-line option (for example, ollama.exe gc).
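
The steps above can be sketched roughly as follows. The directory layout and manifest field names here are assumptions based on Ollama's default models folder (manifests referencing blobs by a "sha256:<hex>" digest, blob files named "sha256-<hex>"); adjust if yours differs:

```python
import json
from pathlib import Path

def referenced_digests(manifests_dir: Path) -> set[str]:
    """Collect every blob digest referenced by any manifest file."""
    digests = set()
    for manifest in manifests_dir.rglob("*"):
        if not manifest.is_file():
            continue
        data = json.loads(manifest.read_text())
        for layer in data.get("layers", []) + [data.get("config", {})]:
            digest = layer.get("digest", "")
            if digest:
                # manifests use "sha256:<hex>"; blob files use "sha256-<hex>"
                digests.add(digest.replace(":", "-"))
    return digests

def orphaned_blobs(models_dir: Path) -> list[tuple[Path, int]]:
    """Return (path, size) for every blob no manifest references."""
    referenced = referenced_digests(models_dir / "manifests")
    return [
        (blob, blob.stat().st_size)
        for blob in sorted((models_dir / "blobs").iterdir())
        if blob.name not in referenced
    ]

def gc(models_dir: Path, delete: bool = False) -> int:
    """Print orphans, optionally delete them, return bytes reclaimable."""
    total = 0
    for path, size in orphaned_blobs(models_dir):
        print(f"{size / 2**30:8.2f} GiB  {path.name}")
        total += size
        if delete:
            path.unlink()
    return total
```

Run gc(models_dir) first to see what would be reclaimed, then gc(models_dir, delete=True) after reviewing the list.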

Relevant log output


OS

Windows 10

GPU

RTX 5090

CPU

Xeon w5-2455X

Ollama version

0.22.1

GiteaMirror added the bug label 2026-05-05 03:37:42 -05:00
Reference: github-starred/ollama#72203