[GH-ISSUE #8728] ollama create: Error: supplied file was not in GGUF format #5662

GiteaMirror · 2026-04-12T16:57:31-05:00

GiteaMirror commented

2026-04-12 16:57:31 -05:00

Originally created by @qits-kkruse on GitHub (Jan 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8728

What is the issue?

I get this error message running ollama create:

ollama create DeepSeek-R1-671b-q1.58.model
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
copying file sha256:e7e4039387ae974f2130df2909b21335624a1bf421b256191ccd6a280a9db0fa 100%
copying file sha256:285d51ea4f63810cddbc950e114a3f19c58990862cad7135443f0da6c8b9d17f 100%
parsing GGUF
Error: supplied file was not in GGUF format

I downloaded the model files like so:

from huggingface_hub import snapshot_download
snapshot_download(
  repo_id = "unsloth/DeepSeek-R1-GGUF",
  local_dir = "DeepSeek-R1-GGUF",
  allow_patterns = ["*UD-IQ1_S*"],
)

Then I merged the three files into one, like so:

llama-gguf-split --merge DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
  DeepSeek-R1-671b-q1.58.gguf

This is my short and sweet model file:

FROM DeepSeek-R1-671b-q1.58.gguf

But using ollama create I get this:

parsing GGUF
Error: supplied file was not in GGUF format

llama-cli confirms that the file is GGUF version 3, and it can run inference with it as well:

llama-cli --model DeepSeek-R1-671b-q1.58.gguf

register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (AMD EPYC 9174F 16-Core Processor)
build: 4604 (5783575c) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2) for x86_64-redhat-linux (debug)
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_loader: loaded meta data with 52 key-value pairs and 1025 tensors from DeepSeek-R1-671b-q1.58.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = deepseek2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = DeepSeek R1 BF16
llama_model_loader: - kv   3:                       general.quantized_by str              = Unsloth
llama_model_loader: - kv   4:                         general.size_label str              = 256x20B
llama_model_loader: - kv   5:                           general.repo_url str              = https://huggingface.co/unsloth
llama_model_loader: - kv   6:                      deepseek2.block_count u32              = 61
llama_model_loader: - kv   7:                   deepseek2.context_length u32              = 163840
llama_model_loader: - kv   8:                 deepseek2.embedding_length u32              = 7168
llama_model_loader: - kv   9:              deepseek2.feed_forward_length u32              = 18432
llama_model_loader: - kv  10:             deepseek2.attention.head_count u32              = 128
llama_model_loader: - kv  11:          deepseek2.attention.head_count_kv u32              = 128
llama_model_loader: - kv  12:                   deepseek2.rope.freq_base f32              = 10000,000000
llama_model_loader: - kv  13: deepseek2.attention.layer_norm_rms_epsilon f32              = 0,000001
llama_model_loader: - kv  14:                deepseek2.expert_used_count u32              = 8
llama_model_loader: - kv  15:        deepseek2.leading_dense_block_count u32              = 3
llama_model_loader: - kv  16:                       deepseek2.vocab_size u32              = 129280
llama_model_loader: - kv  17:            deepseek2.attention.q_lora_rank u32              = 1536
llama_model_loader: - kv  18:           deepseek2.attention.kv_lora_rank u32              = 512
llama_model_loader: - kv  19:             deepseek2.attention.key_length u32              = 192
llama_model_loader: - kv  20:           deepseek2.attention.value_length u32              = 128
llama_model_loader: - kv  21:       deepseek2.expert_feed_forward_length u32              = 2048
llama_model_loader: - kv  22:                     deepseek2.expert_count u32              = 256
llama_model_loader: - kv  23:              deepseek2.expert_shared_count u32              = 1
llama_model_loader: - kv  24:             deepseek2.expert_weights_scale f32              = 2,500000
llama_model_loader: - kv  25:              deepseek2.expert_weights_norm bool             = true
llama_model_loader: - kv  26:               deepseek2.expert_gating_func u32              = 2
llama_model_loader: - kv  27:             deepseek2.rope.dimension_count u32              = 64
llama_model_loader: - kv  28:                deepseek2.rope.scaling.type str              = yarn
llama_model_loader: - kv  29:              deepseek2.rope.scaling.factor f32              = 40,000000
llama_model_loader: - kv  30: deepseek2.rope.scaling.original_context_length u32              = 4096
llama_model_loader: - kv  31: deepseek2.rope.scaling.yarn_log_multiplier f32              = 0,100000
llama_model_loader: - kv  32:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  33:                         tokenizer.ggml.pre str              = deepseek-v3
llama_model_loader: - kv  34:                      tokenizer.ggml.tokens arr[str,129280]  = ["<｜begin▁of▁sentence｜>", "<▒...
llama_model_loader: - kv  35:                  tokenizer.ggml.token_type arr[i32,129280]  = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  36:                      tokenizer.ggml.merges arr[str,127741]  = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
llama_model_loader: - kv  37:                tokenizer.ggml.bos_token_id u32              = 0
llama_model_loader: - kv  38:                tokenizer.ggml.eos_token_id u32              = 1
llama_model_loader: - kv  39:            tokenizer.ggml.padding_token_id u32              = 128815
llama_model_loader: - kv  40:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  41:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  42:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
llama_model_loader: - kv  43:               general.quantization_version u32              = 2
llama_model_loader: - kv  44:                          general.file_type u32              = 24
llama_model_loader: - kv  45:                      quantize.imatrix.file str              = DeepSeek-R1.imatrix
llama_model_loader: - kv  46:                   quantize.imatrix.dataset str              = /training_data/calibration_datav3.txt
llama_model_loader: - kv  47:             quantize.imatrix.entries_count i32              = 720
llama_model_loader: - kv  48:              quantize.imatrix.chunks_count i32              = 124
llama_model_loader: - kv  49:                                   split.no u16              = 0
llama_model_loader: - kv  50:                        split.tensors.count i32              = 1025
llama_model_loader: - kv  51:                                split.count u16              = 0
llama_model_loader: - type  f32:  361 tensors
llama_model_loader: - type q4_K:  190 tensors
llama_model_loader: - type q5_K:  116 tensors
llama_model_loader: - type q6_K:  184 tensors
llama_model_loader: - type iq2_xxs:    6 tensors
llama_model_loader: - type iq1_s:  168 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = IQ1_S - 1.5625 bpw
print_info: file size   = 130,60 GiB (1,67 BPW)
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 819
load: token to piece cache size = 0,8223 MB
print_info: arch             = deepseek2
print_info: vocab_only       = 0
print_info: n_ctx_train      = 163840
print_info: n_embd           = 7168
print_info: n_layer          = 61
print_info: n_head           = 128
print_info: n_head_kv        = 128
print_info: n_rot            = 64
print_info: n_swa            = 0
print_info: n_embd_head_k    = 192
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 1
print_info: n_embd_k_gqa     = 24576
print_info: n_embd_v_gqa     = 16384
print_info: f_norm_eps       = 0,0e+00
print_info: f_norm_rms_eps   = 1,0e-06
print_info: f_clamp_kqv      = 0,0e+00
print_info: f_max_alibi_bias = 0,0e+00
print_info: f_logit_scale    = 0,0e+00
print_info: n_ff             = 18432
print_info: n_expert         = 256
print_info: n_expert_used    = 8
print_info: causal attn      = 1
print_info: pooling type     = 0
print_info: rope type        = 0
print_info: rope scaling     = yarn
print_info: freq_base_train  = 10000,0
print_info: freq_scale_train = 0,025
print_info: n_ctx_orig_yarn  = 4096
print_info: rope_finetuned   = unknown
print_info: ssm_d_conv       = 0
print_info: ssm_d_inner      = 0
print_info: ssm_d_state      = 0
print_info: ssm_dt_rank      = 0
print_info: ssm_dt_b_c_rms   = 0
print_info: model type       = 671B
print_info: model params     = 671,03 B
print_info: general.name     = DeepSeek R1 BF16
print_info: n_layer_dense_lead   = 3
print_info: n_lora_q             = 1536
print_info: n_lora_kv            = 512
print_info: n_ff_exp             = 2048
print_info: n_expert_shared      = 1
print_info: expert_weights_scale = 2,5
print_info: expert_weights_norm  = 1
print_info: expert_gating_func   = sigmoid
print_info: rope_yarn_log_mul    = 0,1000
print_info: vocab type       = BPE
print_info: n_vocab          = 129280
print_info: n_merges         = 127741
print_info: BOS token        = 0 '<｜begin▁of▁sentence｜>'
print_info: EOS token        = 1 '<｜end▁of▁sentence｜>'
print_info: EOT token        = 1 '<｜end▁of▁sentence｜>'
print_info: PAD token        = 128815 '<｜PAD▁TOKEN｜>'
print_info: LF token         = 201 'Ċ'
print_info: FIM PRE token    = 128801 '<｜fim▁begin｜>'
print_info: FIM SUF token    = 128800 '<｜fim▁hole｜>'
print_info: FIM MID token    = 128802 '<｜fim▁end｜>'
print_info: EOG token        = 1 '<｜end▁of▁sentence｜>'
print_info: max token length = 256
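
As an independent check of the file itself, the header can be read directly. This is a minimal sketch, assuming only the documented GGUF layout (a 4-byte magic `GGUF` followed by a little-endian uint32 version); `read_gguf_header` is a hypothetical helper name, not part of llama.cpp or Ollama:

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_header(path):
    """Return (is_gguf, version) based on the first 8 bytes of `path`."""
    with Path(path).open("rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return False, None
    # The version is a little-endian uint32 stored right after the magic.
    (version,) = struct.unpack("<I", header[4:8])
    return True, version
```

Calling `read_gguf_header("DeepSeek-R1-671b-q1.58.gguf")` only inspects the first 8 bytes, so it stays fast even on a 131G file. It says nothing about whether the rest of the file is intact.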

OS

Linux

GPU

No response

CPU

AMD

Ollama version

0.5.7


GiteaMirror added the bug label 2026-04-12 16:57:31 -05:00

GiteaMirror closed this issue

2026-04-12 16:57:32 -05:00

GiteaMirror commented

2026-04-12 16:57:32 -05:00

@qits-kkruse commented on GitHub (Jan 31, 2025):

Basically I was following these instructions until it said "and then you add the model to ollama.":

https://unsloth.ai/blog/deepseekr1-dynamic


GiteaMirror commented

2026-04-12 16:57:33 -05:00

@rick-github commented on GitHub (Jan 31, 2025):

Your ollama create command copies three files, not one. Are you sure the Modelfile is correct?


GiteaMirror commented

2026-04-12 16:57:33 -05:00

@rick-github commented on GitHub (Jan 31, 2025):

I suspect that your "short and sweet modelfile" is called DeepSeek-R1-671b-q1.58.model and contains FROM DeepSeek-R1-671b-q1.58.gguf. However, your ollama create command is not correctly formed. What your command does is create a model called DeepSeek-R1-671b-q1.58.model, using the contents of the current directory. If your model file is not called Modelfile, you need to explicitly pass it as an argument:

ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58
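
When scripting this step, one way to avoid the pitfall is to always pass the Modelfile explicitly. The following is a rough sketch under that assumption; `build_create_command` and `create_model` are hypothetical helper names, and the only CLI form used is the `ollama create -f <modelfile> <name>` invocation shown above:

```python
import subprocess


def build_create_command(model_name, modelfile):
    """Build an `ollama create` argument list that always names the Modelfile."""
    return ["ollama", "create", "-f", str(modelfile), model_name]


def create_model(model_name, modelfile):
    """Run the command; raises CalledProcessError if `ollama create` fails."""
    subprocess.run(build_create_command(model_name, modelfile), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_create_command(
        "DeepSeek-R1-671b-q1.58", "DeepSeek-R1-671b-q1.58.model")))
```

Keeping the argument list in its own function makes it easy to inspect the command before anything is executed.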


GiteaMirror commented

2026-04-12 16:57:34 -05:00

@qits-kkruse commented on GitHub (Jan 31, 2025):

> Your ollama create command copies three files, not one. Are you sure the Modelfile is correct?

Pretty positive, yes. The three blobs that are created are one very large one (the GGUF file) and two smaller ones (some kind of metadata). The first blob is almost as large as the source GGUF file:

[root@foo01 deepseek-r1-1.58bit]# ollama create DeepSeek-R1-671b-q1.58.model
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
copying file sha256:e7e4039387ae974f2130df2909b21335624a1bf421b256191ccd6a280a9db0fa 100%
copying file sha256:285d51ea4f63810cddbc950e114a3f19c58990862cad7135443f0da6c8b9d17f 100%
parsing GGUF
Error: supplied file was not in GGUF format

This is the directory listing:

[root@foo01 deepseek-r1-1.58bit]# ls -lh
total 131G
-rw-r--r--.  1 kkadmin kkadmin 131G 31. Jan 12:55 DeepSeek-R1-671b-q1.58.gguf
-rw-r--r--.  1 kkadmin kkadmin   33 31. Jan 16:35 DeepSeek-R1-671b-q1.58.model

And the model file:

[root@foo01 deepseek-r1-1.58bit]# cat DeepSeek-R1-671b-q1.58.model
FROM DeepSeek-R1-671b-q1.58.gguf


GiteaMirror commented

2026-04-12 16:57:34 -05:00

@qits-kkruse commented on GitHub (Jan 31, 2025):

> I suspect that your "short and sweet modelfile" is called DeepSeek-R1-671b-q1.58.model and contains FROM DeepSeek-R1-671b-q1.58.gguf. However, your ollama create command is not correctly formed. What your command does is create a model called DeepSeek-R1-671b-q1.58.model, using the contents of the current directory. If your model file is not called Modelfile, you need to explicitly pass it as an argument:
> ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58

Cheers, I'll do that from now on. But the result is the same:

[root@foo01 deepseek-r1-1.58bit]# ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
copying file sha256:e7e4039387ae974f2130df2909b21335624a1bf421b256191ccd6a280a9db0fa 100%
copying file sha256:285d51ea4f63810cddbc950e114a3f19c58990862cad7135443f0da6c8b9d17f 100%
parsing GGUF
Error: supplied file was not in GGUF format

The directory listing:

[root@foo01 deepseek-r1-1.58bit]# ls -lh
total 131G
-rw-r--r--. 1 kkadmin kkadmin 131G 31. Jan 12:55 DeepSeek-R1-671b-q1.58.gguf
-rw-r--r--. 1 kkadmin kkadmin 33 31. Jan 16:35 DeepSeek-R1-671b-q1.58.model

And the model file:

[root@foo01 deepseek-r1-1.58bit]# cat DeepSeek-R1-671b-q1.58.model
FROM DeepSeek-R1-671b-q1.58.gguf
[root@foo01 deepseek-r1-1.58bit]#


GiteaMirror commented

2026-04-12 16:57:35 -05:00

@rick-github commented on GitHub (Jan 31, 2025):

$ cat Modelfile
FROM DeepSeek-R1-UD-IQ1_S.gguf

$ ollama create DeepSeek-R1-671b-q1.58.model
gathering model components 
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100% 
parsing GGUF 
using existing layer sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 
writing manifest 
success

I speculate that you have a model called DeepSeek-R1-671b-q1.58.gguf (i.e., ollama list will show it), so when the create command runs, it treats the FROM line in the Modelfile as a reference to that existing model and uses all of the files associated with it rather than the single GGUF in your current directory. Try this:

ollama rm DeepSeek-R1-671b-q1.58.gguf
ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58

If that fails, try this:

ollama rm DeepSeek-R1-671b-q1.58.gguf
echo FROM ./DeepSeek-R1-671b-q1.58.gguf > Modelfile
ollama create DeepSeek-R1-671b-q1.58
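
Before re-running `ollama create`, it may help to confirm what the FROM line actually points at. The sketch below is only a local sanity check and does not reproduce Ollama's own resolution rules; `from_target` and `resolve_local` are hypothetical helper names, and the assumption is that a relative FROM path is meant relative to the Modelfile's directory:

```python
from pathlib import Path


def from_target(modelfile):
    """Return the argument of the first FROM line in `modelfile`, or None."""
    for line in Path(modelfile).read_text().splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].upper() == "FROM":
            return parts[1].strip()
    return None


def resolve_local(modelfile):
    """Return the FROM target as a Path if that file exists on disk, else None."""
    target = from_target(modelfile)
    if target is None:
        return None
    candidate = Path(target)
    if not candidate.is_absolute():
        # Assumption: relative paths are relative to the Modelfile's directory.
        candidate = Path(modelfile).resolve().parent / candidate
    return candidate if candidate.is_file() else None
```

If `resolve_local("Modelfile")` returns None, the FROM target is not a file next to the Modelfile, which is a hint that it would have to be interpreted as a model name instead.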


GiteaMirror commented

2026-04-12 16:57:35 -05:00

@qits-kkruse commented on GitHub (Jan 31, 2025):

ollama list didn't show it. However, I removed the three blobs (sha256-...) manually, renamed the file DeepSeek-R1-671b-q1.58.model to Modelfile, and ran it again. It worked.

[root@foo01 deepseek-r1-1.58bit]# cat Modelfile
FROM DeepSeek-R1-671b-q1.58.gguf
[root@foo01 deepseek-r1-1.58bit]# ollama create DeepSeek-R1-671b-q1.58
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
parsing GGUF
using existing layer sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6
writing manifest
success
[root@foo01 deepseek-r1-1.58bit]#

Thanks mate, you saved my weekend and my marriage, and I'll possibly rename my second born after you.


GiteaMirror commented

2026-04-12 16:57:36 -05:00

@rick-github commented on GitHub (Jan 31, 2025):

You probably know this, but you'll need a TEMPLATE and some PARAMETERS for best function: https://github.com/ollama/ollama/issues/8571#issuecomment-2622118005


GiteaMirror referenced this issue

2026-04-12 23:41:12 -05:00

[PR #5662] [MERGED] fix system prompt #11873


Reference: github-starred/ollama#5662