[GH-ISSUE #7268] fail to run ollama run hf-mirror.com/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2-GGUF:Q8 #4618

Closed
opened 2026-04-12 15:32:03 -05:00 by GiteaMirror · 7 comments
Owner

Originally created by @taozhiyuai on GitHub (Oct 19, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/7268

What is the issue?

taozhiyu@Mac ~ % ollama run hf-mirror.com/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2-GGUF:Q8
pulling manifest
Error: pull model manifest: 400: The specified tag is not a valid quantization scheme. Please use another tag or "latest"
taozhiyu@Mac ~ % ollama run hf-mirror.com/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2-GGUF:lastest
pulling manifest
Error: pull model manifest: 400: The specified tag is not a valid quantization scheme. Please use another tag or "latest"

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

ollama version is 0.3.13

GiteaMirror added the bug label 2026-04-12 15:32:03 -05:00

@kth8 commented on GitHub (Oct 19, 2024):

`ollama run hf.co/bartowski/Llama-3.1-8B-Lexi-Uncensored-V2-GGUF:Q8_0` works, so perhaps you should open an issue in Orenguteng's Hugging Face repo asking them to use a standard quantization naming scheme.
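
For reference, the general pattern Ollama accepts when pulling a GGUF model directly from Hugging Face appears to be the following; the username, repository, and tag here are only illustrative, and the tag has to match a standard quantization name (e.g. Q8_0, Q4_K_M) present in the repo, or be omitted / set to "latest":

ollama run hf.co/{username}/{repository}
ollama run hf.co/{username}/{repository}:Q8_0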


@taozhiyuai commented on GitHub (Oct 20, 2024):

Got it.


@taozhiyuai commented on GitHub (Oct 22, 2024):

@jmorganca would it be possible for the Ollama app to ignore the format of the GGUF filename? The user would provide the URL of the GGUF file and the quantization type in the `ollama run` command.

It is impossible to ask every GGUF author to rename their files to be compatible with Ollama.
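
One possible workaround, sketched here on the assumption that the GGUF file has already been downloaded locally (the file name and model name below are placeholders, not from this thread), is to import the file through a Modelfile instead of pulling by tag:

# Modelfile (placeholder path to the downloaded GGUF file)
FROM ./Llama-3.1-8B-Lexi-Uncensored-V2.Q8_0.gguf

Then build and run it as a local model:

ollama create lexi-uncensored-q8 -f Modelfile
ollama run lexi-uncensored-q8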


@pdevine commented on GitHub (Oct 23, 2024):

Hi @taozhiyuai, I think you could probably ask Hugging Face to change that? There isn't anything we can change.

I'll go ahead and close the issue.


@Mushoz commented on GitHub (Oct 25, 2024):

I am getting the same error with this command:

`ollama run hf.co/bartowski/Replete-LLM-V2.5-Qwen-32b-GGUF:IQ4_NL`

The IQ4_NL quant does exist in the repo, though, and is a valid, standard quant option: https://huggingface.co/bartowski/Replete-LLM-V2.5-Qwen-32b-GGUF

Is there any way I can fix this? And should I open a new ticket since this one is already closed? @pdevine


@pdevine commented on GitHub (Oct 25, 2024):

@Mushoz I think this is still an HF issue though, no?


@Mushoz commented on GitHub (Oct 25, 2024):

Not sure, is it? I have opened an issue about it here: https://github.com/ollama/ollama/issues/7365

Someone else is having the same issue. IQ4_NL is a quant that should be supported, but it doesn't work. All other quants work fine. What exactly is wrong on HF's side? @pdevine
