[GH-ISSUE #13177] deepseek-ocr :: Not able to run this model #70772

Closed
opened 2026-05-04 22:56:45 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @amey6992 on GitHub (Nov 20, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13177

What is the issue?

Issue with deepseek-ocr

  • I pulled the model.
  • When I run it, I get an error:
  • Error: 500 Internal Server Error: unable to load model: /home/.ollama/models/blobs/sha256-3a18673ff291a1d8de94d490877127899356d33a18028d5f3945bf245c11b02c
Image (screenshot of the error): https://github.com/user-attachments/assets/e16dcd79-215a-46ec-8058-fedc6a3fd09f

Free memory is about 160 GB; another model (llama3.2) worked fine.

Relevant log output

llama_model_loader: loaded meta data with 33 key-value pairs and 631 tensors from /home/.ollama/models/blobs/sha256-3a18673ff291a1d8de94d490877127899356d33a18028d5f3945bf245c11b02c (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:           deepseekocr.attention.head_count u32              = 10
llama_model_loader: - kv   1:        deepseekocr.attention.head_count_kv u32              = 10
llama_model_loader: - kv   2:                    deepseekocr.block_count u32              = 12
llama_model_loader: - kv   3:                 deepseekocr.context_length u32              = 8192
llama_model_loader: - kv   4:               deepseekocr.embedding_length u32              = 1280
llama_model_loader: - kv   5:                   deepseekocr.expert_count u32              = 64
llama_model_loader: - kv   6:              deepseekocr.expert_used_count u32              = 6
llama_model_loader: - kv   7:            deepseekocr.feed_forward_length u32              = 6848
llama_model_loader: - kv   8:                       general.architecture str              = deepseekocr
llama_model_loader: - kv   9:                          general.file_type u32              = 1
llama_model_loader: - kv  10:               general.quantization_version u32              = 2
llama_model_loader: - kv  11:      deepseekocr.leading_dense_block_count u32              = 1
llama_model_loader: - kv  12:                deepseekocr.sam.block_count u32              = 12
llama_model_loader: - kv  13:           deepseekocr.sam.embedding_length u32              = 768
llama_model_loader: - kv  14:   deepseekocr.sam.global_attention_indexes arr[i32,4]       = [2, 5, 8, 11]
llama_model_loader: - kv  15:                 deepseekocr.sam.head_count u32              = 12
llama_model_loader: - kv  16:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  17:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  18:           tokenizer.ggml.add_padding_token bool             = false
llama_model_loader: - kv  19:                tokenizer.ggml.bos_token_id u32              = 0
llama_model_loader: - kv  20:                tokenizer.ggml.eos_token_id u32              = 1
llama_model_loader: - kv  21:                      tokenizer.ggml.merges arr[str,127741]  = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
llama_model_loader: - kv  22:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  23:            tokenizer.ggml.padding_token_id u32              = 2
llama_model_loader: - kv  24:                         tokenizer.ggml.pre str              = default
llama_model_loader: - kv  25:                      tokenizer.ggml.scores arr[f32,129280]  = [0.000000, 1.000000, 2.000000, 3.0000...
llama_model_loader: - kv  26:                  tokenizer.ggml.token_type arr[i32,129280]  = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  27:                      tokenizer.ggml.tokens arr[str,129280]  = ["<|begin▁of▁sentence|>", "<�...
llama_model_loader: - kv  28:             deepseekocr.vision.block_count u32              = 24
llama_model_loader: - kv  29:        deepseekocr.vision.embedding_length u32              = 1024
llama_model_loader: - kv  30:              deepseekocr.vision.head_count u32              = 16
llama_model_loader: - kv  31:              deepseekocr.vision.image_size u32              = 224
llama_model_loader: - kv  32:              deepseekocr.vision.patch_size u32              = 14
llama_model_loader: - type  f32:  360 tensors
llama_model_loader: - type  f16:  151 tensors
llama_model_loader: - type bf16:  120 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = F16
print_info: file size   = 6.22 GiB (16.02 BPW) 
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'deepseekocr'
llama_model_load_from_file_impl: failed to load model
time=2025-11-21T01:23:08.526+05:30 level=INFO source=sched.go:425 msg="NewLlamaServer failed" model=/home/.ollama/models/blobs/sha256-3a18673ff291a1d8de94d490877127899356d33a18028d5f3945bf245c11b02c error="unable to load model: /home/plus91/.ollama/models/blobs/sha256-3a18673ff291a1d8de94d490877127899356d33a18028d5f3945bf245c11b02c"
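The decisive line in the log above is the `unknown model architecture: 'deepseekocr'` error, not the generic 500 from the API. A minimal sketch for surfacing it when triaging a saved server log (the file path and inline sample are illustrative, not the real log location):

```shell
# Recreate a tiny sample of the server log (illustrative path/content):
cat > /tmp/ollama-server.log <<'EOF'
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'deepseekocr'
llama_model_load_from_file_impl: failed to load model
EOF

# Pull out the decisive failure line:
grep "unknown model architecture" /tmp/ollama-server.log
```

An "unknown model architecture" error generally means the runner shipped with the installed version has no loader registered for that architecture string, which points at a version mismatch rather than a corrupt blob or memory problem.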

OS

Linux

GPU

No response

CPU

Intel

Ollama version

0.12.11

GiteaMirror added the bug label 2026-05-04 22:56:45 -05:00
Author
Owner

@rick-github commented on GitHub (Nov 20, 2025):

DeepSeek-OCR requires [Ollama v0.13.0](https://github.com/ollama/ollama/releases) or later.
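The reported version (0.12.11) predates that minimum, which explains the failure. A minimal sketch of the comparison, assuming a `sort` that supports `-V` version ordering (GNU coreutils or BSD); the hard-coded `current` stands in for the output of `ollama --version`:

```shell
# Is the installed Ollama new enough for deepseek-ocr?
# 0.13.0 is the minimum per the comment above.
required="0.13.0"
current="0.12.11"   # normally parsed from: ollama --version

# sort -V orders version strings numerically; if the required version
# sorts first, the current one is newer (or equal).
lowest=$(printf '%s\n' "$required" "$current" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
  echo "version OK"
else
  echo "upgrade needed"   # prints this for 0.12.11
fi
```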

Author
Owner

@amey6992 commented on GitHub (Nov 20, 2025):

Yes, just got this from the docs


Reference: github-starred/ollama#70772