[PR #15927] fix(server/create): validate quantize up-front before importing blobs #77656

opened 2026-05-05 10:19:59 -05:00 by GiteaMirror · 0 comments

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/15927
Author: @SAY-5
Created: 5/2/2026
Status: 🔄 Open

Base: main ← Head: fix/validate-quantization-upfront-15925


📝 Commits (1)

  • fc4009e fix(server/create): validate quantize up-front before importing blobs

📊 Changes

1 file changed (+9 additions, -0 deletions)

View changed files

📝 server/create.go (+9 -0)

📄 Description

Closes #15925.

CreateHandler accepted any string as the quantization argument and deferred validation until quantizeLayer -> ggml.ParseFileType ran at the end of the create flow, after all source files had already been imported into blobs/ and merged into a single GGUF. A user who typed Q5_K_M instead of Q4_K_M therefore lost ~116GB of disk writes and several minutes of CPU before seeing unsupported quantization type Q5_K_M, with the orphan blobs left behind.

Patch

Run the same ggml.ParseFileType check next to the existing r.Model / r.Files validations in CreateHandler so a typo returns HTTP 400 before any I/O. The downstream quantizeLayer call still runs the same parse, so the success path is unchanged.
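A minimal, self-contained sketch of the early-validation idea (not the actual diff): `parseFileType` here is a hypothetical stand-in for `ggml.ParseFileType`, and the supported-type set is illustrative, taken from the error message below. The point is that `validateQuantize` can be called next to the other request validations, before any blob I/O starts.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-in for ggml.ParseFileType; the real function lives in
// ollama's ggml package and supports more types than listed here.
var supported = map[string]bool{
	"F32": true, "F16": true, "Q4_K_S": true, "Q4_K_M": true, "Q8_0": true,
}

func parseFileType(s string) (string, error) {
	t := strings.ToUpper(s)
	if !supported[t] {
		return "", fmt.Errorf("unsupported quantization type %s", s)
	}
	return t, nil
}

// validateQuantize mirrors the up-front check the PR adds to CreateHandler:
// reject a bad quantize string immediately, so the handler can return
// HTTP 400 before importing any source files into blobs/.
func validateQuantize(quantize string) error {
	if quantize == "" {
		return nil // no quantization requested; nothing to validate
	}
	_, err := parseFileType(quantize)
	return err
}

func main() {
	fmt.Println(validateQuantize("Q4_K_M")) // valid: prints <nil>
	fmt.Println(validateQuantize("Q5_K_M")) // typo: prints the error
}
```

Because the downstream `quantizeLayer` path repeats the same parse, duplicating the check up front only changes when a bad input fails, not what the success path does.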

Verification

go vet ./server/
go test ./server/ -run TestCreate

Both run clean.

Reproduction (after the fix):

$ curl -X POST http://localhost:11434/api/create -d '{"model":"my-q5", "from":"safetensors-path", "quantize":"Q5_K_M"}'
{"error":"unsupported quantization type Q5_K_M - supported types are F32, F16, Q4_K_S, Q4_K_M, Q8_0"}

No blobs imported, no GGUF merge attempted.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.


Reference: github-starred/ollama#77656