[GH-ISSUE #13566] Model Request GLM-4.7 #70992

Closed
opened 2026-05-04 23:41:26 -05:00 by GiteaMirror · 1 comment

Originally created by @mlzxdzl on GitHub (Dec 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13566

zai.org released a new model called GLM-4.7, which performs well at coding. Would you support it?

https://huggingface.co/zai-org/GLM-4.7

GiteaMirror added the model label 2026-05-04 23:41:26 -05:00

@rick-github commented on GitHub (Dec 26, 2025):

It's currently available as a cloud model: https://ollama.com/library/glm-4.7.
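
For reference, cloud models run through an ollama.com account. Assuming the tag shown on the library page above, the invocation would look something like this (a sketch based on how other ollama cloud models are launched, not taken from the original thread):

```console
$ # assumes the model tag matches the library page; inference runs on ollama.com
$ ollama signin
$ ollama run glm-4.7 hello
```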

If the goal is to run it locally, unsloth has a heavily quantized version available that is simple to run with ollama:

```console
$ ollama -v
ollama version is 0.13.4
$ ollama run hf.co/unsloth/GLM-4.7-GGUF:UD-TQ1_0 hello
Hello! I'm GLM, a large language model trained by Z.ai. How can I assist you today? Please feel free 
to ask any questions or share your thoughts in Chinese (Simplified), and I'll do my best to provide 
helpful responses.

For example, you could ask me about:
- Writing tasks, content creation, or editing
- Analyzing data or solving logical problems
- Learning and studying assistance
- Creative writing or role-playing scenarios

What would you like to discuss?
$ ollama ps
NAME                                   ID              SIZE     PROCESSOR    CONTEXT    UNTIL   
hf.co/unsloth/GLM-4.7-GGUF:UD-TQ1_0    676fc848373f    88 GB    100% GPU     4096       Forever
```
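
The `CONTEXT 4096` column above is ollama's default context window. For coding work it can be raised per session; a sketch, assuming a build where `num_ctx` is settable from the interactive REPL:

```console
$ ollama run hf.co/unsloth/GLM-4.7-GGUF:UD-TQ1_0
>>> /set parameter num_ctx 16384
Set parameter 'num_ctx' to '16384'
```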

Less quantized versions are available, but the model shards need to be merged using tools from [llama.cpp](https://github.com/ggml-org/llama.cpp). The template from unsloth is pretty simple, so for tools and thinking it needs to be [replaced](https://github.com/ollama/ollama/issues/11563#issuecomment-3218169593):

```console
$ hf download --local-dir . unsloth/GLM-4.7-GGUF --include Q4_K_M/*
$ cd Q4_K_M
$ llama-gguf-split --merge GLM-4.7-Q4_K_M-00001-of-00005.gguf GLM-4.7-Q4_K_M.gguf
$ echo FROM GLM-4.7-Q4_K_M.gguf > Modelfile
$ cat Modelfile.glm-tools | grep -v FROM >> Modelfile
$ ollama create glm-4.7:358a32b-q4_K_M
```
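
One way to sanity-check that the tools/thinking template (rather than unsloth's minimal one) made it into the created model is `ollama show`:

```console
$ ollama show glm-4.7:358a32b-q4_K_M --template
```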
```console
$ ollama-run.py glm-4.7:358a32b-q4_K_M what is the time? --tool get_datetime --context 4096 
Thinking...
The user is asking for the current time. I have access to a get_datetime function that can return the
current date and time. The function requires a timezone_name parameter, which is optional according
to the description - if not supplied, it should use 'UTC'. 

Since the user didn't specify a timezone, I'll use UTC as default.
...done thinking

calling get_datetime({'timezone_name': 'UTC'})
Thinking...
The function returned the current date and time in UTC format. The time is 08:09 (8:09 AM) on
Friday, December 26, 2025. I should provide this information to the user in a clear way.

...done thinking
The current time is **08:09 UTC** on Friday, December 26, 2025.

If you'd like the time for a specific timezone, please let me know which one you prefer!

$ ollama ps
NAME                                 ID              SIZE      PROCESSOR    CONTEXT    UNTIL   
glm-4.7:358a32b-q4_K_M               1f92f066437a    226 GB    100% GPU     4096       Forever    
```
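
`ollama-run.py` appears to be the commenter's own wrapper script. The equivalent request against ollama's standard `/api/chat` endpoint, with an illustrative `get_datetime` tool schema matching the transcript above, would look roughly like:

```console
$ # the tool schema below is illustrative; wire it to your own get_datetime implementation
$ curl http://localhost:11434/api/chat -d '{
  "model": "glm-4.7:358a32b-q4_K_M",
  "messages": [{"role": "user", "content": "what is the time?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_datetime",
      "description": "Get the current date and time",
      "parameters": {
        "type": "object",
        "properties": {
          "timezone_name": {
            "type": "string",
            "description": "IANA timezone name; defaults to UTC if omitted"
          }
        }
      }
    }
  }],
  "stream": false
}'
```

If the model decides to call the tool, the reply's `message.tool_calls` field carries the call; the result goes back as a `"role": "tool"` message and the conversation continues.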