[PR #10099] [MERGED] add mistral-small #23687

Closed
opened 2026-04-19 17:09:05 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/10099
Author: @mxyng
Created: 4/3/2025
Status: Merged
Merged: 4/3/2025
Merged by: @mxyng

Base: `main` ← Head: `jmorganca/mistral3`


📝 Commits (1)

  • 3a9c714 model: support for mistral-small in the ollama runner

📊 Changes

27 files changed (+1116 additions, -350 deletions)

View changed files

📝 convert/convert.go (+3 -1)
convert/convert_mistral.go (+190 -0)
📝 convert/reader.go (+1 -4)
📝 fs/ggml/ggml.go (+5 -2)
📝 kvcache/causal_test.go (+18 -8)
📝 llama/llama.cpp/src/llama-arch.cpp (+17 -0)
📝 llama/llama.cpp/src/llama-arch.h (+1 -0)
📝 llama/llama.cpp/src/llama-model.cpp (+3 -0)
📝 llama/llama.cpp/src/llama-quant.cpp (+2 -7)
📝 llama/patches/0021-add-model-quantizations.patch (+81 -21)
llama/patches/0022-metal-add-op_neg.patch (+75 -0)
📝 ml/backend.go (+9 -1)
📝 ml/backend/ggml/ggml.go (+56 -0)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal-embed.metal (+7 -0)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.m (+15 -0)
📝 ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.metal (+7 -0)
📝 model/models/gemma3/model_text.go (+11 -11)
model/models/mistral3/imageproc.go (+56 -0)
model/models/mistral3/model.go (+189 -0)
model/models/mistral3/model_text.go (+177 -0)

...and 7 more files

📄 Description

This change implements the [Mistral Small 3.1](https://mistral.ai/news/mistral-small-3-1) multimodal model for the Ollama engine. A compatible model can be found [here](https://ollama.com/mike/mistral-small:latest).

Specifically, this implements:

  • Pixtral vision encoder model/models/mistral3/model_vision.go
  • Mistral text decoder model/models/mistral3/model_text.go
  • Mistral 3.1 multimodal projector model/models/mistral3/model.go

The Mistral text decoder shares many characteristics with the Llama text decoder and may be merged into that implementation at a later time. Similarly, the Pixtral vision encoder may be split out into a separate package, e.g. model/models/pixtral, at a later time.
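As a rough sketch of how the three components listed above fit together, the flow is image patches → vision encoder → multimodal projector → embeddings consumed by the text decoder. All names and shapes below are hypothetical illustrations, not the PR's actual API:

```go
package main

import "fmt"

// Tensor is a toy stand-in for a real tensor: rows are tokens or image
// patches, columns are hidden dimensions.
type Tensor [][]float32

// encodeVision stands in for the Pixtral vision encoder: it would turn
// image patches into feature vectors (identity here for illustration).
func encodeVision(patches Tensor) Tensor { return patches }

// project stands in for the multimodal projector: it maps vision
// features (width 4 here) into the text model's embedding width (2),
// using a fixed illustrative linear map.
func project(features Tensor) Tensor {
	out := make(Tensor, len(features))
	for i, f := range features {
		out[i] = []float32{f[0] + f[1], f[2] + f[3]}
	}
	return out
}

func main() {
	// Two image patches flow through encoder and projector, yielding
	// embeddings the text decoder could consume alongside text tokens.
	patches := Tensor{{1, 2, 3, 4}, {5, 6, 7, 8}}
	embeds := project(encodeVision(patches))
	fmt.Println(len(embeds), len(embeds[0])) // 2 2
}
```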

A few notes:

  • Pixtral's 2D rope is calculated explicitly rather than using ggml_rope_multi
  • SiLU is used for Pixtral's activation rather than GELU. While GELU is what Hugging Face transformers uses, it produces worse results than SiLU here; vLLM also [uses](https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/pixtral.py#L648) SiLU
  • Add a Metal OP_NEG kernel, which allows the vision graph to be computed entirely on the GPU, reducing graph splits

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-19 17:09:05 -05:00

Reference: github-starred/ollama#23687