[GH-ISSUE #4907] Cannot run qwen2 7B, 1.5b #65137

Closed
opened 2026-05-03 19:50:45 -05:00 by GiteaMirror · 8 comments
Owner

Originally created by @SAXN-SYNX on GitHub (Jun 7, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4907

The following error is shown while running it:

```
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.name str              = Qwen2-7B-Instruct
llama_model_loader: - kv   2:                          qwen2.block_count u32              = 28
llama_model_loader: - kv   3:                       qwen2.context_length u32              = 32768
llama_model_loader: - kv   4:                     qwen2.embedding_length u32              = 3584
llama_model_loader: - kv   5:                  qwen2.feed_forward_length u32              = 18944
llama_model_loader: - kv   6:                 qwen2.attention.head_count u32              = 28
llama_model_loader: - kv   7:              qwen2.attention.head_count_kv u32              = 4
llama_model_loader: - kv   8:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv   9:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000001
llama_model_loader: - kv  10:                          general.file_type u32              = 2
llama_model_loader: - kv  11:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  12:                         tokenizer.ggml.pre str              = qwen2
llama_model_loader: - kv  13:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  14:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  15:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  16:                tokenizer.ggml.eos_token_id u32              = 151645
llama_model_loader: - kv  17:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  18:                tokenizer.ggml.bos_token_id u32              = 151643
llama_model_loader: - kv  19:                    tokenizer.chat_template str              = {% for message in messages %}{% if lo...
llama_model_loader: - kv  20:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  141 tensors
llama_model_loader: - type q4_0:  197 tensors
llama_model_loader: - type q6_K:    1 tensors
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'qwen2'
llama_load_model_from_file: exception loading model
terminate called after throwing an instance of 'std::runtime_error'
  what():  error loading model vocabulary: unknown pre-tokenizer type: 'qwen2'
```
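
The failure comes from the vocabulary loader: it reads the GGUF metadata key `tokenizer.ggml.pre` (here `qwen2`) and looks it up in a table of pre-tokenizer types built into the runner; a runner built before qwen2 support has no such entry and throws. A minimal Python sketch of that lookup — illustrative only, the table contents and function names are assumptions, not llama.cpp's actual code:

```python
# Illustrative sketch of the vocabulary loader's pre-tokenizer lookup.
# The table contents here are hypothetical; llama.cpp keeps the real list in C++.
OLD_RUNNER_PRETOKENIZERS = {"default", "llama3", "falcon", "gpt-2"}   # before qwen2 support
NEW_RUNNER_PRETOKENIZERS = OLD_RUNNER_PRETOKENIZERS | {"qwen2"}       # after the fix

def load_vocab(pre_type: str, known: set[str]) -> str:
    """Mimic the loader: accept a known tokenizer.ggml.pre value, else fail."""
    if pre_type not in known:
        raise RuntimeError(
            f"error loading model vocabulary: unknown pre-tokenizer type: '{pre_type}'"
        )
    return pre_type
```

An old table raises the exact error shown in the log for `qwen2`; a table with the entry loads it fine, which is why the fix is shipped as a newer runner rather than a model change.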

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.1.34

GiteaMirror added the bug label 2026-05-03 19:50:45 -05:00
Author
Owner

@include commented on GitHub (Jun 7, 2024):

+1.

Author
Owner

@boxabirds commented on GitHub (Jun 7, 2024):

Lots of people experiencing this in the Discord.

Happens for me with Ollama 0.1.41 as well.

Author
Owner

@informaticker commented on GitHub (Jun 7, 2024):

> Ollama version
> 0.1.34

Try upgrading first, fixed the issue for me.

Author
Owner

@boxabirds commented on GitHub (Jun 7, 2024):

I get garbage output with 41.

![image](https://github.com/ollama/ollama/assets/147305/523ea64c-16cb-4670-a590-63871204b723)

Author
Owner

@iplayfast commented on GitHub (Jun 7, 2024):

I tried running deepseek-v2 (which it pulled and installed), and after that qwen2 started behaving. (Could be a clue?)

Author
Owner

@ghost commented on GitHub (Jun 7, 2024):

Write Python code to sum the numbers from 1 to 100.

qwen2:1.5b-instruct 3:12 PM
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG

![image](https://github.com/ollama/ollama/assets/144032090/7d02fc9d-c100-4dd8-8c13-621f6dba271e)

ollama -v
ollama version is 0.1.41

Open WebUI version
v0.2.5

Author
Owner

@jmorganca commented on GitHub (Jun 7, 2024):

Hi all, sorry about this. It is fixed in 0.1.42: https://github.com/ollama/ollama/releases/tag/v0.1.42
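
Since the fix landed in a specific release, a numeric version comparison is the reliable way to tell whether an install has it. A sketch (assuming only that the fix release is 0.1.42, per the comment above):

```python
# Sketch: decide whether an Ollama version string includes the qwen2
# pre-tokenizer fix, which shipped in release 0.1.42.

def parse_version(v: str) -> tuple[int, ...]:
    """'v0.1.42' -> (0, 1, 42); numeric tuples compare correctly."""
    return tuple(int(part) for part in v.strip().lstrip("v").split("."))

def has_qwen2_fix(installed: str, fixed_in: str = "0.1.42") -> bool:
    return parse_version(installed) >= parse_version(fixed_in)
```

Plain string comparison would get this wrong ("0.1.9" sorts after "0.1.42" lexically), hence the integer tuples.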

Author
Owner

@liuaifu commented on GitHub (Jun 9, 2024):

@jmorganca

```
Microsoft Windows [Version 10.0.22631.3593]
(c) Microsoft Corporation. All rights reserved.

C:\Users\laf163>ollama -v
ollama version is 0.1.42

C:\Users\laf163>ollama run qwen2:7b
>>> Why is the sky blue?
#&%$#7456893210

>>> Send a message (/? for help)
```

NVIDIA GeForce RTX 3050 Laptop GPU

Driver version: 31.0.15.5161
Driver date: 2024/2/15
DirectX version: 12 (FL 12.1)
Physical location: PCI bus 1, device 0, function 0

Utilization: 0%
Dedicated GPU memory: 0.0/4.0 GB
Shared GPU memory: 0.0/7.7 GB
GPU memory: 0.0/11.7 GB

Reference: github-starred/ollama#65137