[GH-ISSUE #7816] I import an IQ4_XS model but get an IQ1_M #67057

Closed
opened 2026-05-04 09:23:31 -05:00 by GiteaMirror · 13 comments

Originally created by @CberYellowstone on GitHub (Nov 24, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7816

What is the issue?

As the title says, I imported a custom GGUF model quantized as IQ4_XS, but `ollama show` displays it as IQ1_M. Is this behavior expected? I saw in previous issues that support for IQ4_XS has been added, so this confuses me.
![image](https://github.com/user-attachments/assets/aab46156-2b61-4e36-a24f-c512ea3b7ce1)
![image](https://github.com/user-attachments/assets/aef1b4a8-4e46-414f-9c0f-1547a905b510)

[the gguf file](https://huggingface.co/SakuraLLM/Sakura-14B-Qwen2.5-v1.0-GGUF/blob/main/sakura-14b-qwen2.5-v1.0-iq4xs.gguf)

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.4.4

GiteaMirror added the bug, ollama.com labels 2026-05-04 09:23:33 -05:00

@rick-github commented on GitHub (Nov 24, 2024):

How was the GGUF file created?


@CberYellowstone commented on GitHub (Nov 24, 2024):

> How was the GGUF file created?

Since I am not the model creator, I don't know the details. I have opened an [issue](https://github.com/SakuraLLM/SakuraLLM/issues/124) (in Chinese) in the model creator's GitHub repository to ask about it.
By the way, before raising this issue, I asked in the Discord channel, and a user told me that all models quantized as IQ4_XS seem to be displayed as IQ1_M:

![image](https://github.com/user-attachments/assets/d98c926f-af73-4cdb-aa4a-4880c63a70fc)

So I think this is not an isolated case.


@CberYellowstone commented on GitHub (Nov 24, 2024):

Oh, I forgot to mention: this GGUF file was downloaded directly from Hugging Face; the link is in my earlier comment.
I've already verified that it runs normally. Only the displayed quantization level is incorrect; everything else is fine.
The hash also matches the one on Hugging Face.


@CberYellowstone commented on GitHub (Nov 24, 2024):

> How was the GGUF file created?

Just got a reply: the GGUF file was generated using llama.cpp.


@rick-github commented on GitHub (Nov 24, 2024):

Looks like the filetype definitions in ollama have strayed from those in llama.cpp:

```console
$ sdiff -b <(sed -ne 's/.*LLAMA_FTYPE_MOSTLY_\([^ ]*\).*= \([0-9]*\),.*/ \2 \1/p' llama/llama.h) <(sed -ne 's/^[ \t]*fileType\([^ \t]*\).*$/\1/p' llm/filetype.go | tail +2 | cat -n | sed -e 's/[ \t][ \t]*/ /g')
 1 F16								 1 F16
 2 Q4_0								 2 Q4_0
 3 Q4_1								 3 Q4_1
 4 Q4_1_SOME_F16					      |	 4 Q4_1_F16
 5 Q4_2								 5 Q4_2
 6 Q4_3								 6 Q4_3
 7 Q8_0								 7 Q8_0
 8 Q5_0								 8 Q5_0
 9 Q5_1								 9 Q5_1
 10 Q2_K							 10 Q2_K
 11 Q3_K_S							 11 Q3_K_S
 12 Q3_K_M							 12 Q3_K_M
 13 Q3_K_L							 13 Q3_K_L
 14 Q4_K_S							 14 Q4_K_S
 15 Q4_K_M							 15 Q4_K_M
 16 Q5_K_S							 16 Q5_K_S
 17 Q5_K_M							 17 Q5_K_M
 18 Q6_K							 18 Q6_K
 19 IQ2_XXS							 19 IQ2_XXS
 20 IQ2_XS							 20 IQ2_XS
 21 Q2_K_S							 21 Q2_K_S
 22 IQ3_XS							 22 IQ3_XS
 23 IQ3_XXS							 23 IQ3_XXS
 24 IQ1_S							 24 IQ1_S
 25 IQ4_NL							 25 IQ4_NL
 26 IQ3_S							 26 IQ3_S
 27 IQ3_M						      |	 27 IQ2_S
 28 IQ2_S						      |	 28 IQ4_XS
 29 IQ2_M							 29 IQ2_M
 30 IQ4_XS						      |	 30 IQ1_M
 31 IQ1_M						      |	 31 BF16
 32 BF16						      |	 32 Unknown
 33 Q4_0_4_4						      <
 34 Q4_0_4_8						      <
 35 Q4_0_8_8						      <
 36 TQ1_0						      <
 37 TQ2_0						      <
```

ollama is missing `fileTypeIQ3_M`, and `fileTypeIQ4_XS` should come after `fileTypeIQ2_M`.
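The effect of the drifted table can be illustrated with a short Python sketch. The mappings below are transcribed from the sdiff output above (this is an illustration of the lookup mismatch, not ollama's actual code):

```python
# general.file_type values 26..32 and their names, transcribed from the sdiff
# output above. LLAMA_CPP follows llama.cpp's llama.h (authoritative);
# OLLAMA follows the order in ollama's llm/filetype.go, which is missing
# IQ3_M and has IQ4_XS in the wrong position, so several values resolve
# to the wrong name.
LLAMA_CPP = {26: "IQ3_S", 27: "IQ3_M", 28: "IQ2_S", 29: "IQ2_M",
             30: "IQ4_XS", 31: "IQ1_M", 32: "BF16"}
OLLAMA = {26: "IQ3_S", 27: "IQ2_S", 28: "IQ4_XS", 29: "IQ2_M",
          30: "IQ1_M", 31: "BF16", 32: "Unknown"}

ftype = 30  # the file_type value llama.cpp stores in an IQ4_XS GGUF
print(LLAMA_CPP[ftype], "->", OLLAMA[ftype])  # IQ4_XS -> IQ1_M
```

So a GGUF whose header says IQ4_XS (value 30) is reported by ollama as IQ1_M, which matches the screenshots. Adding the missing entry and restoring llama.cpp's order lines the two tables up again.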


@rick-github commented on GitHub (Nov 24, 2024):

Note this is purely presentational: the model you've imported is still IQ4_XS; only the display of the model quantization was incorrect.


@CberYellowstone commented on GitHub (Dec 21, 2024):

@jmorganca @rick-github
This issue still has residual effects: the quantization levels of models uploaded to ollama.com are not displayed correctly.


@rick-github commented on GitHub (Dec 21, 2024):

Do you have an example?


@CberYellowstone commented on GitHub (Dec 21, 2024):

> Do you have an example?

https://ollama.com/CBYellowstone/sakura-v1.0

![image](https://github.com/user-attachments/assets/f5379f02-66a1-41a4-8365-f6c0adc94c14)

The correct display should be IQ4_XS.


@rick-github commented on GitHub (Dec 21, 2024):

The website code is not part of the ollama project repo, so somebody else will have to fix it. Have you tried re-pushing the model to see if it changes?


@CberYellowstone commented on GitHub (Dec 21, 2024):

> The website code is not part of the ollama project repo, so somebody else will have to fix it. Have you tried re-pushing the model to see if it changes?

Okay, I understand the current situation. I will try to re-upload later.


@CberYellowstone commented on GitHub (Dec 23, 2024):

> The website code is not part of the ollama project repo, so somebody else will have to fix it. Have you tried re-pushing the model to see if it changes?

This doesn't work; the problem still exists. Should I open a new issue somewhere?


@rick-github commented on GitHub (Dec 23, 2024):

Open a new ticket in this tracker.

Reference: github-starred/ollama#67057