[GH-ISSUE #10298] Support for GLM-Z1-32B, GLM-Z1-Rumination-32B #53275

Closed
opened 2026-04-29 02:26:32 -05:00 by GiteaMirror · 11 comments
Owner

Originally created by @sunisstar on GitHub (Apr 16, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10298

As title, gogogo!

GiteaMirror added the model label 2026-04-29 02:26:32 -05:00

@rick-github commented on GitHub (Apr 16, 2025):

#10269


@lssac commented on GitHub (Apr 17, 2025):

+1


@Circadian234 commented on GitHub (Apr 18, 2025):

These models are amazing for 32B! Please support


@mikestut commented on GitHub (Apr 18, 2025):

These models are amazing for 32B! Please support +1


@yujiaYY commented on GitHub (Apr 20, 2025):

These models are amazing for 32B! Please support


@Hunter6324 commented on GitHub (Apr 24, 2025):

go


@sunisstar commented on GitHub (Apr 24, 2025):

Good news: xinference has supported the Z1 series since v1.5.0, but I haven't tested it yet.


@songsh commented on GitHub (Apr 28, 2025):

https://huggingface.co/bartowski/THUDM_GLM-Z1-32B-0414-GGUF/blob/main/THUDM_GLM-Z1-32B-0414-Q5_K_M.gguf
bartowski has released the Z1-32B model, but Ollama doesn't support it.


@songsh commented on GitHub (Apr 28, 2025):

> Good news: xinference has supported the Z1 series since v1.5.0, but I haven't tested it yet.

I couldn't find Z1 support in xinference.


@mikestut commented on GitHub (Apr 28, 2025):

The model at https://ollama.com/JollyLlama/GLM-4-32B-0414-Q4_K_M has been extensively validated through testing and offers excellent performance as a GLM4-32B implementation. I strongly recommend downloading this optimized Q4_K_M quantized version for evaluation, particularly for its improved inference speed while maintaining high accuracy. Please make sure your Ollama version is upgraded to v0.6.6+ before deployment, as this model requires tensor operations available only in the latest runtime. The combination of efficient memory usage and robust natural language processing makes it a good choice for production-grade applications requiring large-scale language modeling.
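The upgrade-then-run advice above could be wrapped in a small pre-flight script. This is only a sketch, not an official Ollama recipe: the version-string parsing is an assumption about the `ollama --version` output format.

```shell
# Hypothetical pre-flight check: verify the installed Ollama is at least
# v0.6.6 before running the Q4_K_M model recommended above.
required="0.6.6"
current="$(ollama --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)"

# sort -V sorts version strings numerically; if the required version sorts
# first (or is equal), the installed version is new enough.
if [ -n "$current" ] && \
   [ "$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n1)" = "$required" ]; then
    ollama run JollyLlama/GLM-4-32B-0414-Q4_K_M
else
    echo "Ollama ${current:-not found} is older than $required; upgrade first." >&2
fi
```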


@limingchina commented on GitHub (May 11, 2025):

> https://huggingface.co/bartowski/THUDM_GLM-Z1-32B-0414-GGUF/blob/main/THUDM_GLM-Z1-32B-0414-Q5_K_M.gguf bartowski has released the Z1-32B model, but Ollama doesn't support it.

Tried this in LM Studio, using the IQ_XS quant version from this page (https://huggingface.co/bartowski/THUDM_GLM-4-32B-0414-GGUF). It doesn't work; the output is complete garbage, like this:
/=::0")%H6'-'224&D16=-!0/935-A=7*(F713;>EF+(189-:B?&(E=?C!?#!F>)28?D"8148./"(A!4%"8+H#>3CFE!D<A&'>A)-2(5";5;(8:4E3C#"(:0>38"&(#7F3B-E68!$$>%;A4H.**C#-107/+2<B9D69157;'0'A%-A.>')4%8E1%6>

Reference: github-starred/ollama#53275