[GH-ISSUE #11340] Add SmolLM3 model #33239

Closed
opened 2026-04-22 15:43:53 -05:00 by GiteaMirror · 15 comments

Originally created by @Dalibor-P on GitHub (Jul 9, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11340

Blog: https://huggingface.co/blog/smollm3
Base model: https://hf.co/HuggingFaceTB/SmolLM3-3B-Base
Instruct and reasoning model: https://hf.co/HuggingFaceTB/SmolLM3-3B

GiteaMirror added the model label 2026-04-22 15:43:53 -05:00

@rick-github commented on GitHub (Jul 9, 2025):

https://github.com/ggml-org/llama.cpp/pull/14581


@MohamedAliRashad commented on GitHub (Jul 10, 2025):

What is keeping this issue from being resolved?


@chigkim commented on GitHub (Jul 14, 2025):

It seems like the Ollama team is getting tired of bringing in new models?

SmolLM3, Reka Flash 3, GLM-4, etc. are pretty decent releases, but Ollama doesn't have them in its main library.


@nake89 commented on GitHub (Jul 17, 2025):

This was merged over a week ago: https://github.com/ggml-org/llama.cpp/pull/14581
What's the hold up?


@iamregret17 commented on GitHub (Jul 17, 2025):

Brotha, we need more upvotes for this issue.


@Wladastic commented on GitHub (Jul 25, 2025):

As far as I can tell, it seems to be running well on llama.cpp already. Maybe we need to wait for yet another llama.cpp pull.


@Sasikuttan2163 commented on GitHub (Jul 26, 2025):

Hasn't it been 2 months since the last one?


@devashishraj commented on GitHub (Aug 6, 2025):

> This was merged over a week ago: ggml-org/llama.cpp#14581 What's the hold up?

It's an open-source program; there's no hold-up. People take a look at the debug log from Ollama and attempt a patch to support the required model.


@mrs83 commented on GitHub (Aug 18, 2025):

What's the ETA for this? I'm planning to use Ollama as the main OpenAI-compatible endpoint for a local app, and SmolLM3 is a perfect fit. I understand the hype around gpt-oss, but ignoring other great models seems like a missed opportunity.

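For context (not from the thread): Ollama serves an OpenAI-compatible API under `/v1`, so a local app can point any OpenAI client at it once a model is available. A minimal sketch, assuming the server runs on the default port and the model has been pulled under the hypothetical tag `smollm3`:

```console
# Chat completion via Ollama's OpenAI-compatible endpoint
$ curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "smollm3",
      "messages": [{"role": "user", "content": "hello"}]
    }'
```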

@Dalibor-P commented on GitHub (Aug 18, 2025):

With 50 upvotes, it's clear that people want this model. In addition, the SmolLM3 Hugging Face repo (https://huggingface.co/HuggingFaceTB/SmolLM3-3B) shows that the model has been downloaded 600k times, which is not insignificant. It is clearly a popular model.

To the Ollama team: please add support for the model.


@rick-github commented on GitHub (Aug 18, 2025):

```
FROM hf.co/unsloth/SmolLM3-3B-128K-GGUF
TEMPLATE "
{{- $lastUserIdx := -1 }}
{{- range $i, $_ := .Messages }}
{{- if eq .Role "user" }}{{- $lastUserIdx = $i }}{{ end }}
{{- end -}}
<|im_start|>system
## Metadata

Knowledge Cutoff Date: June 2025
Today Date: {{ currentDate }}
Reasoning Mode: {{ if $.IsThinkSet }}{{ if $.Think }}/think{{ else }}/no_think{{ end }}{{ else }}/think{{ end }}

{{ if .System }}
## Custom Instructions

{{ .System }}


{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{- else if eq .Role "assistant" }}<|im_start|>assistant
{{- if (and $.IsThinkSet (and .Thinking (or $last (gt $i $lastUserIdx)))) -}}
<think>{{ .Thinking }}</think>
{{- end }}
{{ .Content }}
{{- end }}
{{ if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{- if and $.IsThinkSet (not $.Think) -}}
<think>

</think>

{{ end }}
{{ end }}
{{- end -}}
"
```

```console
$ ollama -v
ollama version is 0.11.5-rc2
$ ollama run smollm3:3b-q4_K_M hello --think=true
Thinking...
Okay, the user just said "hello". I need to respond appropriately. Since it's a greeting, I should reply 
with a friendly and welcoming message. Let me keep it simple and open-ended to encourage them to ask 
questions or share what they need help with. Maybe something like, "Hello! How can I assist you today?" 
That should work. I don't want to assume their request yet; I just want to be helpful and inviting.
...done thinking.

Hello! How can I assist you today? Whether you have a question, need help with something specific, or 
just want to chat, feel free to let me know.
```
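To reproduce this setup (a sketch; the steps are not spelled out in the comment): save the `FROM`/`TEMPLATE` stanza above to a file named `Modelfile`, then register it with `ollama create`. The tag name is arbitrary:

```console
# Build the model from the Modelfile, then run it with thinking enabled
$ ollama create smollm3:3b-q4_K_M -f Modelfile
$ ollama run smollm3:3b-q4_K_M hello --think=true
```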

@Spatchy commented on GitHub (Aug 18, 2025):

For those who were confused by @rick-github's message: SmolLM3 now works as of `0.11.5-rc2`. I've confirmed it myself in Docker: `image: ollama/ollama:0.11.5-rc2`.

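For reference (an assumption about the setup, not stated in the comment): running that image follows Ollama's documented Docker invocation, pinned to the release-candidate tag:

```console
# CPU-only invocation; persists models in the "ollama" volume
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:0.11.5-rc2
```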

@Cerrix commented on GitHub (Aug 21, 2025):

@rick-github thank you so much for sharing your template! I'm testing it, and when I use the command `ollama run smollm3:3b-q4_K_M hello --think=true`, it works perfectly.

However, when I run the model in interactive mode, I encounter this strange behavior:

```bash
ollama run smollm3:3b-q4_K_M
>>> /set think
Set 'think' mode.
>>> hello
HelloHello! How can I help you today?  # Note the duplicated "Hello" here and no thinking at all
```

Do you know why the response starts with "HelloHello" in interactive mode? And why no thinking output is provided, apart from this duplication?


@rick-github commented on GitHub (Aug 21, 2025):

It seems to be a function of the client. The exact same prompt is sent to the ollama server, but the client sometimes interprets the response differently. For example, here's a session where the first response includes thinking and the second doesn't, and the second has the double `Hello`:

```console
$ ollama run smollm3:3b-q4_K_M
>>> /set think
Set 'think' mode.
>>> hello
Thinking...
Okay, the user just said "hello". I need to respond appropriately. Let me start by acknowledging their greeting.

First, a simple greeting in return would be polite. Maybe say "Hello!" to match their tone.

Then, offer assistance since they might have a question or need help with something. Use an emoji to keep it friendly and approachable. Something like a smiley face next to the text.

Keep the response concise so it's not too long. Let them know I'm here to help if they have any questions or need information. Make sure to mention various topics I can assist with, like explaining things, providing answers to specific queries, etc.

End with an emoji to maintain a friendly and welcoming atmosphere. Maybe use another smiley face or a thumbs-up sign.

Check for clarity and ensure the response is clear and not too technical. Avoid any jargon that might confuse them. Keep it simple and straightforward.
...done thinking.

Hello!  How can I assist you today? Whether you need help with anything from explaining concepts to finding information, just let me know what's on your mind!

>>> hello
HelloHello! It seems like we've been greeted twice. Would you like to ask a question or do something specific now?
```

I'll see if I can figure out what's going on.

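One way to take the CLI client out of the loop (a sketch, not from the thread; assumes the native `/api/chat` endpoint with the `think` field that shipped alongside Ollama's thinking support):

```console
# Query the server directly; thinking is returned in message.thinking
$ curl http://localhost:11434/api/chat -d '{
    "model": "smollm3:3b-q4_K_M",
    "messages": [{"role": "user", "content": "hello"}],
    "think": true,
    "stream": false
  }'
```

If the raw JSON already contains the doubled `Hello`, the problem is server-side; otherwise it is the client's rendering.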

@rick-github commented on GitHub (Sep 15, 2025):

The double emit has been fixed with https://github.com/ollama/ollama/pull/12021 (0.11.7).

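To pick up the fix (a generic note, not from the comment): upgrade to 0.11.7 or later, e.g. on Linux via the official install script, then confirm the version:

```console
# Reinstall/upgrade Ollama, then verify the running version
$ curl -fsSL https://ollama.com/install.sh | sh
$ ollama -v
```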