[GH-ISSUE #14443] Successfully created Qwen 3.5 Q8 GGUF model, but it fails to run #55893

Closed
opened 2026-04-29 09:54:20 -05:00 by GiteaMirror · 8 comments

Originally created by @taozhiyuai on GitHub (Feb 26, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14443

What is the issue?

taozhiyu@Mac ~ % ollama list
NAME                           ID              SIZE     MODIFIED           
qwen35-35b-a3b:q8_0            40c3699330db    36 GB    About a minute ago 

taozhiyu@Mac ~ % ollama run qwen35-35b-a3b:q8_0
Error: 500 Internal Server Error: unable to load model: /Users/taozhiyu/.ollama/models/blobs/sha256-42fcd4b5080b3db03410736f1339dfdf491afacf859df83b4cc8ae1faea48a7f

taozhiyu@Mac ~ % ollama --version
ollama version is 0.17.1-rc2

taozhiyu@TAOZHIYUs-MacBook-Pro ~ % ollama run qwen35-35b-a3b:q8_0
Error: 500 Internal Server Error: unable to load model: /Users/taozhiyu/.ollama/models/blobs/sha256-42fcd4b5080b3db03410736f1339dfdf491afacf859df83b4cc8ae1faea48a7f

taozhiyu@TAOZHIYUs-MacBook-Pro ~ % ollama --version
ollama version is 0.17.4

Relevant log output


OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.17.1-rc2

GiteaMirror added the bug label 2026-04-29 09:54:20 -05:00

@chidugit commented on GitHub (Feb 26, 2026):

Same issue here!


@battmanux commented on GitHub (Feb 26, 2026):

Same here on Ubuntu 24.04 with an RTX 5090. The Ollama server logs report: qwen35 not supported / qwen35moe not supported.


@kerta1n commented on GitHub (Feb 27, 2026):

Same here on Debian Trixie, RTX3090 with the model hf.co/unsloth/Qwen3.5-27B-GGUF:Q8_0


@ka-admin commented on GitHub (Feb 27, 2026):

Feb 27 09:59:02 ollama[1985]: llama_model_loader: - type q8_0: 518 tensors
Feb 27 09:59:02 ollama[1985]: print_info: file format = GGUF V3 (latest)
Feb 27 09:59:02 ollama[1985]: print_info: file type = Q8_0
Feb 27 09:59:02 ollama[1985]: print_info: file size = 120.94 GiB (8.51 BPW)
Feb 27 09:59:02 ollama[1985]: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35moe'
Feb 27 09:59:02 ollama[1985]: llama_model_load_from_file_impl: failed to load model

Ollama 0.17.4
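
The log above shows the loader rejecting the architecture name stored in the file's metadata. As a rough illustration, the following Python sketch reads the `general.architecture` key straight from a GGUF header, so the declared architecture can be checked without starting the server. It is a sketch under assumptions, not Ollama's or llama.cpp's own code: it assumes a little-endian GGUF file of version 2 or later, and the function name `gguf_architecture` and its helpers are hypothetical.

```python
import struct

# Byte widths of the fixed-size GGUF metadata value types (type id -> size).
_FIXED = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9  # variable-size value types


def _read_u32(f):
    return struct.unpack("<I", f.read(4))[0]


def _read_u64(f):
    return struct.unpack("<Q", f.read(8))[0]


def _read_str(f):
    # A GGUF string is a uint64 length followed by that many UTF-8 bytes.
    length = _read_u64(f)
    return f.read(length).decode("utf-8")


def _skip_value(f, vtype):
    # Advance the file position past one metadata value of the given type.
    if vtype in _FIXED:
        f.seek(_FIXED[vtype], 1)
    elif vtype == _STRING:
        f.seek(_read_u64(f), 1)
    elif vtype == _ARRAY:
        etype = _read_u32(f)
        count = _read_u64(f)
        if etype in _FIXED:
            f.seek(_FIXED[etype] * count, 1)
        else:
            for _ in range(count):
                _skip_value(f, etype)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_architecture(path):
    """Return the general.architecture string, or None if the key is absent.

    Assumes a little-endian GGUF file, version 2 or later (hypothetical helper).
    """
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version = _read_u32(f)
        if version < 2:
            raise ValueError("GGUF v1 uses a different header layout")
        _read_u64(f)  # tensor count, not needed here
        kv_count = _read_u64(f)
        for _ in range(kv_count):
            key = _read_str(f)
            vtype = _read_u32(f)
            if key == "general.architecture" and vtype == _STRING:
                return _read_str(f)
            _skip_value(f, vtype)
    return None
```

Calling `gguf_architecture` on the blob path from the error message would show which architecture string the file declares, provided the blob is a plain GGUF file. Whether the installed Ollama build can load that architecture still has to be checked against the server logs.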


@GitUsers1234 commented on GitHub (Feb 27, 2026):

Same issue with Ollama 0.17.4, even though it claims to support it. Quite confusing.


@iWangJiaxiang commented on GitHub (Feb 27, 2026):

> Feb 27 09:59:02 ollama[1985]: llama_model_loader: - type q8_0: 518 tensors
> Feb 27 09:59:02 ollama[1985]: print_info: file format = GGUF V3 (latest)
> Feb 27 09:59:02 ollama[1985]: print_info: file type = Q8_0
> Feb 27 09:59:02 ollama[1985]: print_info: file size = 120.94 GiB (8.51 BPW)
> Feb 27 09:59:02 ollama[1985]: llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'qwen35moe'
> Feb 27 09:59:02 ollama[1985]: llama_model_load_from_file_impl: failed to load model
>
> Ollama 0.17.4

Same issue when using a model from Hugging Face.


@YJesus commented on GitHub (Feb 27, 2026):

https://github.com/ollama/ollama/issues/14419#issuecomment-3959159035


@tgiraud2007 commented on GitHub (Feb 27, 2026):

Same here

Reference: github-starred/ollama#55893