[GH-ISSUE #8624] Deepseek 80% size reduction #5582

Closed
opened 2026-04-12 16:51:02 -05:00 by GiteaMirror · 8 comments

Originally created by @gileneusz on GitHub (Jan 28, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8624

New quants done by unsloth.ai:

| MoE Bits | Disk Size | Type | Quality | Link | Down_proj |
|----------|-----------|---------|---------|------|-----------|
| 1.58-bit | 131GB | IQ1_S | Fair | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_S) | 2.06/1.56bit |
| 1.73-bit | 158GB | IQ1_M | Good | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ1_M) | 2.06bit |
| 2.22-bit | 183GB | IQ2_XXS | Better | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-IQ2_XXS) | 2.5/2.06bit |
| 2.51-bit | 212GB | Q2_K_XL | Best | [Link](https://huggingface.co/unsloth/DeepSeek-R1-GGUF/tree/main/DeepSeek-R1-UD-Q2_K_XL) | 3.5/2.5bit |

Please consider adding them.

https://unsloth.ai/blog/deepseekr1-dynamic

thanks!
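
As a rough, unofficial sanity check before committing to one of these downloads: the model needs approximately its disk size in combined system RAM plus VRAM, with extra headroom for context/KV cache. A minimal sketch for Linux with an NVIDIA GPU (not part of the original request):

```console
# Compare available memory against the quant's disk size (e.g. 131GB for IQ1_S).
free -g | awk '/^Mem:/ {print "System RAM (GB): " $2}'
nvidia-smi --query-gpu=memory.total --format=csv,noheader
```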

GiteaMirror added the model label 2026-04-12 16:51:02 -05:00

@rick-github commented on GitHub (Jan 28, 2025):

```console
$ ollama run --verbose deepseek-r1:iq1_s hello
<think>

</think>

Hello! How can I assist you today? 😊

total duration:       22m5.288908462s
load duration:        20m26.355671608s
prompt eval count:    4 token(s)
prompt eval duration: 44.306s
prompt eval rate:     0.09 tokens/s
eval count:           16 token(s)
eval duration:        54.608s
eval rate:            0.29 tokens/s
```

```console
$ ollama ps
NAME                 ID              SIZE      PROCESSOR         UNTIL
deepseek-r1:iq1_s    29fac10adcf9    183 GB    94%/6% CPU/GPU    Forever
```
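
For context on the 94%/6% split above: only the layers that fit in VRAM are offloaded to the GPU and everything else runs on CPU, which is why loading and evaluation are this slow. With more VRAM available, the offload can be nudged via the num_gpu parameter; a minimal sketch (the value 8 is purely illustrative and depends entirely on your hardware):

```console
# Set the number of layers offloaded to the GPU for this session.
$ ollama run deepseek-r1:iq1_s
>>> /set parameter num_gpu 8
>>> hello
```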

@duttaoindril commented on GitHub (Jan 31, 2025):

Hmm, I get this error:

```
ollama run --verbose deepseek-r1:iq1_s
pulling manifest
Error: pull model manifest: file does not exist
```

@rick-github commented on GitHub (Jan 31, 2025):

I built the model based on the instructions on the linked page; it is not officially in the ollama library yet. However, a user has uploaded it to their repo:

```
ollama pull SIGJNF/deepseek-r1-671b-1.58bit
```
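
If pulling from that repo, the long name can optionally be copied to a shorter local tag afterwards (the target tag here is just an example):

```console
ollama pull SIGJNF/deepseek-r1-671b-1.58bit
ollama cp SIGJNF/deepseek-r1-671b-1.58bit deepseek-r1:iq1_s
```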

@Itsanewday commented on GitHub (Feb 3, 2025):

> I built the model based on the instructions on the linked page; it is not officially in the ollama library yet. However, a user has uploaded it to their repo:
>
> ```
> ollama pull SIGJNF/deepseek-r1-671b-1.58bit
> ```

Dear rick,
Can you share step-by-step instructions to build the model?
Thanks!


@rick-github commented on GitHub (Feb 3, 2025):

```console
# Download the IQ1_S shards from Hugging Face
mkdir ds
cd ds
huggingface-cli download --local-dir=. --include="*UD-IQ1_S*" unsloth/DeepSeek-R1-GGUF
cd DeepSeek-R1-UD-IQ1_S

# Merge the split GGUF shards into a single file with llama.cpp's gguf-split tool
docker run --gpus all --rm -it -v .:/workdir --workdir /workdir --user $(id -u):$(id -g) --entrypoint /app/llama-gguf-split ghcr.io/ggerganov/llama.cpp:full-cuda--b1-75af08c --merge DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf DeepSeek-R1-UD-IQ1_S.gguf

# Reuse the template and parameters from the library deepseek-r1 model,
# point FROM at the merged GGUF, then create the local model
ollama pull deepseek-r1
ollama show --modelfile deepseek-r1 | grep -v FROM > Modelfile
echo FROM DeepSeek-R1-UD-IQ1_S.gguf >> Modelfile
ollama create deepseek-r1:iq1_s
```
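
For anyone following along, a couple of optional sanity checks around the last step, assuming the file layout above (the `-f Modelfile` flag is redundant when the file is literally named Modelfile, but shown here for clarity):

```console
tail -n 1 Modelfile                      # expect: FROM DeepSeek-R1-UD-IQ1_S.gguf
ollama create deepseek-r1:iq1_s -f Modelfile
ollama show deepseek-r1:iq1_s            # confirm architecture, parameters and quantization
```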

@Itsanewday commented on GitHub (Feb 4, 2025):

thank you!


@Itsanewday commented on GitHub (Feb 7, 2025):

> ```
> mkdir ds
> cd ds
> huggingface-cli download --local-dir=. --include="*UD-IQ1_S*" unsloth/DeepSeek-R1-GGUF
> cd DeepSeek-R1-UD-IQ1_S
> docker run --gpus all --rm -it -v .:/workdir --workdir /workdir --user $(id -u):$(id -g) --entrypoint /app/llama-gguf-split ghcr.io/ggerganov/llama.cpp:full-cuda--b1-75af08c --merge DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf DeepSeek-R1-UD-IQ1_S.gguf
> ollama pull deepseek-r1
> ollama show --modelfile deepseek-r1 | grep -v FROM > Modelfile
> echo FROM DeepSeek-R1-UD-IQ1_S.gguf >> Modelfile
> ollama create deepseek-r1:iq1_s
> ```

Dear rick,
I have completed this step: `docker run --gpus all --rm -it -v .:/workdir --workdir /workdir --user $(id -u):$(id -g) --entrypoint /app/llama-gguf-split ghcr.io/ggerganov/llama.cpp:full-cuda--b1-75af08c --merge DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf DeepSeek-R1-UD-IQ1_S.gguf`.
Now I would like to use this model with ollama under Docker Compose; what should I do?


@rick-github commented on GitHub (Feb 7, 2025):

> Now I would like to use this model with ollama under Docker Compose; what should I do?

```yaml
services:
  ollama:
    image: ollama/ollama
    ports:
      - 11434:11434
    volumes:
      - /path/to/models:/root/.ollama
```
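
A minimal sketch of how the custom model might then be created inside that container, assuming the Modelfile and merged GGUF from the earlier steps are copied into the mounted /path/to/models directory and the compose service is named ollama as above (paths and names are illustrative, not from the original thread):

```console
# The FROM path in the Modelfile must resolve inside the container,
# e.g. FROM /root/.ollama/DeepSeek-R1-UD-IQ1_S.gguf
cp DeepSeek-R1-UD-IQ1_S.gguf Modelfile /path/to/models/
docker compose up -d
docker compose exec ollama ollama create deepseek-r1:iq1_s -f /root/.ollama/Modelfile
docker compose exec ollama ollama run deepseek-r1:iq1_s "hello"
```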
Reference: github-starred/ollama#5582