[GH-ISSUE #8728] ollama create: Error: supplied file was not in GGUF format #5662

Closed
opened 2026-04-12 16:57:31 -05:00 by GiteaMirror · 8 comments

Originally created by @qits-kkruse on GitHub (Jan 31, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/8728

What is the issue?

I get this error message running ollama create:

ollama create DeepSeek-R1-671b-q1.58.model
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
copying file sha256:e7e4039387ae974f2130df2909b21335624a1bf421b256191ccd6a280a9db0fa 100%
copying file sha256:285d51ea4f63810cddbc950e114a3f19c58990862cad7135443f0da6c8b9d17f 100%
parsing GGUF
Error: supplied file was not in GGUF format

I downloaded the model files like so:

from huggingface_hub import snapshot_download
snapshot_download(
  repo_id = "unsloth/DeepSeek-R1-GGUF",
  local_dir = "DeepSeek-R1-GGUF",
  allow_patterns = ["*UD-IQ1_S*"],
)

Then I merged the three files into one, like so:

llama-gguf-split --merge DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf \
  DeepSeek-R1-671b-q1.58.gguf
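Before merging, it can be worth confirming that every shard actually finished downloading. The sketch below is only an illustration: the helper name `missing_shards` is hypothetical, and the `-00001-of-00003.gguf` naming pattern is inferred from the single filename shown above, so other repositories may name their shards differently.

```python
import re
from pathlib import Path

# Assumed shard naming, inferred from "DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf".
SHARD_RE = re.compile(r"^(?P<stem>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def missing_shards(directory):
    """Return the sorted names of expected shards that are absent from `directory`.

    An empty list means every shard announced by the file names is present.
    """
    names = {p.name for p in Path(directory).glob("*.gguf")}
    expected = set()
    for name in names:
        match = SHARD_RE.match(name)
        if match is None:
            continue  # not a split file, ignore it
        total = int(match["total"])
        for index in range(1, total + 1):
            expected.add(f"{match['stem']}-{index:05d}-of-{total:05d}.gguf")
    return sorted(expected - names)
```

Calling `missing_shards("DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S")` before running `llama-gguf-split --merge` would list any shard that still needs downloading.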

This is my short and sweet model file:

FROM DeepSeek-R1-671b-q1.58.gguf

But using ollama create I get this:

parsing GGUF
Error: supplied file was not in GGUF format

llama-cli confirms that the file is GGUF version 3, and it can run inference with it as well:

llama-cli --model DeepSeek-R1-671b-q1.58.gguf

register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (AMD EPYC 9174F 16-Core Processor)
build: 4604 (5783575c) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2) for x86_64-redhat-linux (debug)
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_loader: loaded meta data with 52 key-value pairs and 1025 tensors from DeepSeek-R1-671b-q1.58.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = deepseek2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = DeepSeek R1 BF16
llama_model_loader: - kv   3:                       general.quantized_by str              = Unsloth
llama_model_loader: - kv   4:                         general.size_label str              = 256x20B
llama_model_loader: - kv   5:                           general.repo_url str              = https://huggingface.co/unsloth
llama_model_loader: - kv   6:                      deepseek2.block_count u32              = 61
llama_model_loader: - kv   7:                   deepseek2.context_length u32              = 163840
llama_model_loader: - kv   8:                 deepseek2.embedding_length u32              = 7168
llama_model_loader: - kv   9:              deepseek2.feed_forward_length u32              = 18432
llama_model_loader: - kv  10:             deepseek2.attention.head_count u32              = 128
llama_model_loader: - kv  11:          deepseek2.attention.head_count_kv u32              = 128
llama_model_loader: - kv  12:                   deepseek2.rope.freq_base f32              = 10000,000000
llama_model_loader: - kv  13: deepseek2.attention.layer_norm_rms_epsilon f32              = 0,000001
llama_model_loader: - kv  14:                deepseek2.expert_used_count u32              = 8
llama_model_loader: - kv  15:        deepseek2.leading_dense_block_count u32              = 3
llama_model_loader: - kv  16:                       deepseek2.vocab_size u32              = 129280
llama_model_loader: - kv  17:            deepseek2.attention.q_lora_rank u32              = 1536
llama_model_loader: - kv  18:           deepseek2.attention.kv_lora_rank u32              = 512
llama_model_loader: - kv  19:             deepseek2.attention.key_length u32              = 192
llama_model_loader: - kv  20:           deepseek2.attention.value_length u32              = 128
llama_model_loader: - kv  21:       deepseek2.expert_feed_forward_length u32              = 2048
llama_model_loader: - kv  22:                     deepseek2.expert_count u32              = 256
llama_model_loader: - kv  23:              deepseek2.expert_shared_count u32              = 1
llama_model_loader: - kv  24:             deepseek2.expert_weights_scale f32              = 2,500000
llama_model_loader: - kv  25:              deepseek2.expert_weights_norm bool             = true
llama_model_loader: - kv  26:               deepseek2.expert_gating_func u32              = 2
llama_model_loader: - kv  27:             deepseek2.rope.dimension_count u32              = 64
llama_model_loader: - kv  28:                deepseek2.rope.scaling.type str              = yarn
llama_model_loader: - kv  29:              deepseek2.rope.scaling.factor f32              = 40,000000
llama_model_loader: - kv  30: deepseek2.rope.scaling.original_context_length u32              = 4096
llama_model_loader: - kv  31: deepseek2.rope.scaling.yarn_log_multiplier f32              = 0,100000
llama_model_loader: - kv  32:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  33:                         tokenizer.ggml.pre str              = deepseek-v3
llama_model_loader: - kv  34:                      tokenizer.ggml.tokens arr[str,129280]  = ["<|begin▁of▁sentence|>", "<▒...
llama_model_loader: - kv  35:                  tokenizer.ggml.token_type arr[i32,129280]  = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  36:                      tokenizer.ggml.merges arr[str,127741]  = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
llama_model_loader: - kv  37:                tokenizer.ggml.bos_token_id u32              = 0
llama_model_loader: - kv  38:                tokenizer.ggml.eos_token_id u32              = 1
llama_model_loader: - kv  39:            tokenizer.ggml.padding_token_id u32              = 128815
llama_model_loader: - kv  40:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  41:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  42:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
llama_model_loader: - kv  43:               general.quantization_version u32              = 2
llama_model_loader: - kv  44:                          general.file_type u32              = 24
llama_model_loader: - kv  45:                      quantize.imatrix.file str              = DeepSeek-R1.imatrix
llama_model_loader: - kv  46:                   quantize.imatrix.dataset str              = /training_data/calibration_datav3.txt
llama_model_loader: - kv  47:             quantize.imatrix.entries_count i32              = 720
llama_model_loader: - kv  48:              quantize.imatrix.chunks_count i32              = 124
llama_model_loader: - kv  49:                                   split.no u16              = 0
llama_model_loader: - kv  50:                        split.tensors.count i32              = 1025
llama_model_loader: - kv  51:                                split.count u16              = 0
llama_model_loader: - type  f32:  361 tensors
llama_model_loader: - type q4_K:  190 tensors
llama_model_loader: - type q5_K:  116 tensors
llama_model_loader: - type q6_K:  184 tensors
llama_model_loader: - type iq2_xxs:    6 tensors
llama_model_loader: - type iq1_s:  168 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = IQ1_S - 1.5625 bpw
print_info: file size   = 130,60 GiB (1,67 BPW)
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 819
load: token to piece cache size = 0,8223 MB
print_info: arch             = deepseek2
print_info: vocab_only       = 0
print_info: n_ctx_train      = 163840
print_info: n_embd           = 7168
print_info: n_layer          = 61
print_info: n_head           = 128
print_info: n_head_kv        = 128
print_info: n_rot            = 64
print_info: n_swa            = 0
print_info: n_embd_head_k    = 192
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 1
print_info: n_embd_k_gqa     = 24576
print_info: n_embd_v_gqa     = 16384
print_info: f_norm_eps       = 0,0e+00
print_info: f_norm_rms_eps   = 1,0e-06
print_info: f_clamp_kqv      = 0,0e+00
print_info: f_max_alibi_bias = 0,0e+00
print_info: f_logit_scale    = 0,0e+00
print_info: n_ff             = 18432
print_info: n_expert         = 256
print_info: n_expert_used    = 8
print_info: causal attn      = 1
print_info: pooling type     = 0
print_info: rope type        = 0
print_info: rope scaling     = yarn
print_info: freq_base_train  = 10000,0
print_info: freq_scale_train = 0,025
print_info: n_ctx_orig_yarn  = 4096
print_info: rope_finetuned   = unknown
print_info: ssm_d_conv       = 0
print_info: ssm_d_inner      = 0
print_info: ssm_d_state      = 0
print_info: ssm_dt_rank      = 0
print_info: ssm_dt_b_c_rms   = 0
print_info: model type       = 671B
print_info: model params     = 671,03 B
print_info: general.name     = DeepSeek R1 BF16
print_info: n_layer_dense_lead   = 3
print_info: n_lora_q             = 1536
print_info: n_lora_kv            = 512
print_info: n_ff_exp             = 2048
print_info: n_expert_shared      = 1
print_info: expert_weights_scale = 2,5
print_info: expert_weights_norm  = 1
print_info: expert_gating_func   = sigmoid
print_info: rope_yarn_log_mul    = 0,1000
print_info: vocab type       = BPE
print_info: n_vocab          = 129280
print_info: n_merges         = 127741
print_info: BOS token        = 0 '<|begin▁of▁sentence|>'
print_info: EOS token        = 1 '<|end▁of▁sentence|>'
print_info: EOT token        = 1 '<|end▁of▁sentence|>'
print_info: PAD token        = 128815 '<|PAD▁TOKEN|>'
print_info: LF token         = 201 'Ċ'
print_info: FIM PRE token    = 128801 '<|fim▁begin|>'
print_info: FIM SUF token    = 128800 '<|fim▁hole|>'
print_info: FIM MID token    = 128802 '<|fim▁end|>'
print_info: EOG token        = 1 '<|end▁of▁sentence|>'
print_info: max token length = 256
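The same check can be made without loading the model. A GGUF file begins with the four ASCII bytes `GGUF` followed by a little-endian 32-bit version number, so a few lines of Python are enough to see whether a given file passes that first test. This is a minimal sketch; the function name `read_gguf_version` is made up for illustration and it says nothing about how ollama itself parses the file.

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_version(path):
    """Return the GGUF version number, or None if the magic bytes do not match."""
    with open(path, "rb") as fh:
        header = fh.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The version is stored as a little-endian uint32 right after the magic.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

Running it over every file that `ollama create` reports copying would show which of them, if any, is not a GGUF file; a plain-text Modelfile, for example, returns `None`.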

OS

Linux

GPU

No response

CPU

AMD

Ollama version

0.5.7

GiteaMirror added the bug label 2026-04-12 16:57:31 -05:00

@qits-kkruse commented on GitHub (Jan 31, 2025):

Basically I was following these instructions until it said "and then you add the model to ollama.":

https://unsloth.ai/blog/deepseekr1-dynamic


@rick-github commented on GitHub (Jan 31, 2025):

Your ollama create command copies three files, not one. Are you sure the Modelfile is correct?


@rick-github commented on GitHub (Jan 31, 2025):

I suspect that your "short and sweet modelfile" is called DeepSeek-R1-671b-q1.58.model and contains FROM DeepSeek-R1-671b-q1.58.gguf. However, your ollama create command is not correctly formed. What your command does is create a model called DeepSeek-R1-671b-q1.58.model, using the contents of the current directory. If your model file is not called Modelfile, you need to explicitly pass it as an argument:

ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58
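When the command is assembled by a script, the distinction between the model name and the Modelfile path is easy to keep straight by building the argument list explicitly. The sketch below only constructs the command line; the helper name `ollama_create_argv` is hypothetical, and the only flag it relies on is the `-f` option shown above.

```python
import shlex


def ollama_create_argv(model_name, modelfile=None):
    """Build the argv for `ollama create`, passing -f only for a non-default Modelfile."""
    argv = ["ollama", "create"]
    if modelfile is not None:
        # Needed whenever the model file is not literally named "Modelfile".
        argv += ["-f", str(modelfile)]
    argv.append(model_name)
    return argv


# Print the command so it can be reviewed before it is run.
print(shlex.join(ollama_create_argv("DeepSeek-R1-671b-q1.58",
                                    "DeepSeek-R1-671b-q1.58.model")))
```

The list can be handed to `subprocess.run(argv, check=True)` once it looks right.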

@qits-kkruse commented on GitHub (Jan 31, 2025):

> Your ollama create command copies three files, not one. Are you sure the Modelfile is correct?

Pretty positive, yes. The three blobs that are created are one very large one (the gguf file) and two smaller ones (some kind of metadata). The first blob is almost as large as the source gguf file:

[root@foo01 deepseek-r1-1.58bit]# ollama create DeepSeek-R1-671b-q1.58.model
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
copying file sha256:e7e4039387ae974f2130df2909b21335624a1bf421b256191ccd6a280a9db0fa 100%
copying file sha256:285d51ea4f63810cddbc950e114a3f19c58990862cad7135443f0da6c8b9d17f 100%
parsing GGUF
Error: supplied file was not in GGUF format

This is the directory listing:

[root@foo01 deepseek-r1-1.58bit]# ls -lh
insgesamt 131G
-rw-r--r--.  1 kkadmin kkadmin 131G 31. Jan 12:55 DeepSeek-R1-671b-q1.58.gguf
-rw-r--r--.  1 kkadmin kkadmin   33 31. Jan 16:35 DeepSeek-R1-671b-q1.58.model

And the model file:

[root@foo01 deepseek-r1-1.58bit]# cat DeepSeek-R1-671b-q1.58.model
FROM DeepSeek-R1-671b-q1.58.gguf

@qits-kkruse commented on GitHub (Jan 31, 2025):

> I suspect that your "short and sweet modelfile" is called DeepSeek-R1-671b-q1.58.model and contains FROM DeepSeek-R1-671b-q1.58.gguf. However, your ollama create command is not correctly formed. What your command does is create a model called DeepSeek-R1-671b-q1.58.model, using the contents of the current directory. If your model file is not called Modelfile, you need to explicitly pass it as an argument:
>
> ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58

Cheers, I'll do that from now on. But the result is the same:

[root@foo01 deepseek-r1-1.58bit]# ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
copying file sha256:e7e4039387ae974f2130df2909b21335624a1bf421b256191ccd6a280a9db0fa 100%
copying file sha256:285d51ea4f63810cddbc950e114a3f19c58990862cad7135443f0da6c8b9d17f 100%
parsing GGUF
Error: supplied file was not in GGUF format

The directory listing:

[root@foo01 deepseek-r1-1.58bit]# ls -lh
insgesamt 131G
-rw-r--r--. 1 kkadmin kkadmin 131G 31. Jan 12:55 DeepSeek-R1-671b-q1.58.gguf
-rw-r--r--. 1 kkadmin kkadmin 33 31. Jan 16:35 DeepSeek-R1-671b-q1.58.model

And the model file:

[root@foo01 deepseek-r1-1.58bit]# cat DeepSeek-R1-671b-q1.58.model
FROM DeepSeek-R1-671b-q1.58.gguf
[root@foo01 deepseek-r1-1.58bit]#

@rick-github commented on GitHub (Jan 31, 2025):

$ cat Modelfile
FROM DeepSeek-R1-UD-IQ1_S.gguf
$ ollama create DeepSeek-R1-671b-q1.58.model
gathering model components 
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100% 
parsing GGUF 
using existing layer sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 
writing manifest 
success

I speculate that you have a model called DeepSeek-R1-671b-q1.58.gguf (i.e., ollama list will show it), and now when the create command runs, it treats the contents of the modelfile as a reference to the already existing model and uses all of the files associated with that rather than the single GGUF in your current directory. Try this:

ollama rm DeepSeek-R1-671b-q1.58.gguf
ollama create -f DeepSeek-R1-671b-q1.58.model DeepSeek-R1-671b-q1.58

If that fails, try this:

ollama rm DeepSeek-R1-671b-q1.58.gguf
echo FROM ./DeepSeek-R1-671b-q1.58.gguf > Modelfile
ollama create DeepSeek-R1-671b-q1.58
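The second variant can also be scripted. The sketch below writes a file literally named `Modelfile` next to the GGUF and uses an explicit `./` prefix in the `FROM` line, as in the `echo` command above. The helper name `write_modelfile` is hypothetical, and the existence check is an extra safeguard that is not part of the suggestion itself.

```python
from pathlib import Path


def write_modelfile(directory, gguf_name):
    """Write `Modelfile` in `directory` with a FROM line pointing at ./<gguf_name>."""
    directory = Path(directory)
    gguf_path = directory / gguf_name
    if not gguf_path.is_file():
        # Fail early rather than let the create step discover a bad path.
        raise FileNotFoundError(gguf_path)
    modelfile = directory / "Modelfile"
    modelfile.write_text(f"FROM ./{gguf_name}\n", encoding="utf-8")
    return modelfile
```

After `write_modelfile(".", "DeepSeek-R1-671b-q1.58.gguf")`, running `ollama create DeepSeek-R1-671b-q1.58` from the same directory picks up the default Modelfile without needing `-f`.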

@qits-kkruse commented on GitHub (Jan 31, 2025):

ollama list didn't show it. However, I removed the three blobs (sha256-...) manually, renamed DeepSeek-R1-671b-q1.58.model to Modelfile, and ran it again. It worked.

[root@foo01 deepseek-r1-1.58bit]# cat Modelfile
FROM DeepSeek-R1-671b-q1.58.gguf
[root@foo01 deepseek-r1-1.58bit]# ollama create DeepSeek-R1-671b-q1.58
gathering model components
copying file sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6 100%
parsing GGUF
using existing layer sha256:a542caee8df72af41ad48d75b94adacb5fbc61856930460bd599d835400fb3b6
writing manifest
success
[root@foo01 deepseek-r1-1.58bit]#

Thanks mate, you saved my weekend and my marriage, and I'll possibly rename my second born after you.


@rick-github commented on GitHub (Jan 31, 2025):

You probably know this, but you'll need a TEMPLATE and some PARAMETERS for best function: https://github.com/ollama/ollama/issues/8571#issuecomment-2622118005

Reference: github-starred/ollama#5662