[GH-ISSUE #10408] GLM-4-0414 uses the wrong template #68898

Open
opened 2026-05-04 15:36:41 -05:00 by GiteaMirror · 0 comments

Originally created by @matteoserva on GitHub (Apr 25, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10408

What is the issue?

The GLM-4-0414 model uses the wrong chat template, causing degraded output quality.
Here is the relevant pull request in llama.cpp: https://github.com/ggml-org/llama.cpp/pull/13099

The reason is that the code enters this branch:
https://github.com/ollama/ollama/blob/e9e5f61c45e13f9b87be985ae735de8c217e9915/llama/llama.cpp/src/llama-chat.cpp#L122-L123

Instead of entering this one:
https://github.com/ollama/ollama/blob/e9e5f61c45e13f9b87be985ae735de8c217e9915/llama/llama.cpp/src/llama-chat.cpp#L153-L154

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug label 2026-05-04 15:36:41 -05:00