[PR #3682] [MERGED] quantize any fp16/fp32 model #11248

Closed
opened 2026-04-12 23:25:28 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/3682
Author: @mxyng
Created: 4/16/2024
Status: Merged
Merged: 5/7/2024
Merged by: @mxyng

Base: main ← Head: mxyng/quantize-all-the-things


📝 Commits (9)

  • 9685c34 quantize any fp16/fp32 model
  • a7248f6 update tests
  • 01811c1 comments
  • 7ffe457 rebase
  • 4d0d0fa no iterator
  • d245460 only quantize language models
  • f5e8b20 s/DisplayLongest/String/
  • 6694be5 convert/llama: use WriteSeeker
  • b2f00aa close zip files

📊 Changes

14 files changed (+624 additions, -589 deletions)


📝 convert/convert.go (+2 -1)
📝 convert/gemma.go (+2 -13)
📝 convert/llama.go (+2 -16)
📝 convert/mistral.go (+2 -13)
📝 convert/mixtral.go (+3 -14)
📝 integration/utils_test.go (+1 -1)
➕ llm/filetype.go (+140 -0)
📝 llm/ggml.go (+18 -77)
📝 llm/llm.go (+4 -52)
📝 server/images.go (+139 -328)
📝 server/layer.go (+29 -44)
➕ server/model.go (+261 -0)
📝 server/routes.go (+1 -6)
📝 server/routes_test.go (+20 -24)

📄 Description

  • FROM /path/to/{safetensors,pytorch}
  • FROM /path/to/fp{16,32}.bin
  • FROM model:fp{16,32}
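
The three FROM forms above are the model sources a Modelfile can point at once this PR lands: a local safetensors/pytorch directory, a single fp16/fp32 GGUF-style file, or an existing unquantized model tag. A minimal Modelfile sketch (paths and tag are the placeholder patterns from the description, not real locations):

```
# Import from a local safetensors or pytorch checkpoint directory
FROM /path/to/safetensors

# ...or from a single fp16/fp32 model file
FROM /path/to/fp16.bin

# ...or from an already-pulled fp16/fp32 model tag
FROM model:fp16
```

The quantization itself is requested at create time, not in the Modelfile; ollama's /api/create endpoint accepts a `quantize` field naming the target quantization type for an fp16/fp32 source.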

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-12 23:25:28 -05:00

Reference: github-starred/ollama#11248