[PR #14936] model/parsers: fix GLM-4.6 parser returning 500 on truncated tool call #77221

Open
opened 2026-05-05 09:54:08 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/14936
Author: @alvinttang
Created: 3/18/2026
Status: 🔄 Open

Base: main ← Head: fix/glm46-truncated-tool-call


📝 Commits (1)

  • 654535f model/parsers: fix GLM-4.6 tool call parser returning 500 on truncated output

📊 Changes

2 files changed (+124 additions, -2 deletions)

View changed files

📝 model/parsers/glm46.go (+17 -2)
📝 model/parsers/glm46_test.go (+107 -0)

📄 Description

Summary

The GLM-4.6 tool call parser (glm46.go) returns a hard error when tool call parsing fails (e.g. truncated output from hitting num_predict or context window limits). This error bubbles up as HTTP 500 to the client, with no usable response.

This is the same class of bug that was fixed for the Qwen3/Qwen3Coder parsers in #14835, but the fix was not applied to the GLM-4.6 parser.

Before: Truncated tool call → parseGLM46ToolCall() fails → Add() returns error → server returns HTTP 500
After: Truncated tool call → parseGLM46ToolCall() fails → raw content returned as regular text → client gets usable response with done_reason: "length"
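The before/after flow can be sketched as follows. This is a minimal illustration, not the actual ollama code: toolCall, parseToolCall, and addChunk are hypothetical stand-ins for the parser's real types and for parseGLM46ToolCall() / Add(), assuming a simple open/close tag format.

```go
package main

import (
	"fmt"
	"strings"
)

// toolCall is a hypothetical stand-in for the parser's tool-call type.
type toolCall struct {
	Name string
}

// parseToolCall is a hypothetical stand-in for parseGLM46ToolCall: it
// expects a complete <tool_call>...</tool_call> block and fails on
// truncated input.
func parseToolCall(raw string) (toolCall, error) {
	const openTag, closeTag = "<tool_call>", "</tool_call>"
	if !strings.HasPrefix(raw, openTag) || !strings.HasSuffix(raw, closeTag) {
		return toolCall{}, fmt.Errorf("truncated or malformed tool call")
	}
	body := strings.TrimSuffix(strings.TrimPrefix(raw, openTag), closeTag)
	return toolCall{Name: strings.TrimSpace(body)}, nil
}

// addChunk mirrors the fixed behavior: on parse failure, return the raw
// content as regular text instead of propagating a hard error.
func addChunk(raw string) (content string, calls []toolCall) {
	tc, err := parseToolCall(raw)
	if err != nil {
		return raw, nil // fallback: surface raw text, no HTTP 500
	}
	return "", []toolCall{tc}
}

func main() {
	// Complete tool call: parsed normally into a structured call.
	c, calls := addChunk("<tool_call>get_weather</tool_call>")
	fmt.Printf("content=%q calls=%d\n", c, len(calls)) // content="" calls=1

	// Truncated tool call (e.g. num_predict hit): raw text comes back.
	c, calls = addChunk("<tool_call>get_wea")
	fmt.Printf("content=%q calls=%d\n", c, len(calls)) // content="<tool_call>get_wea" calls=0
}
```

Before the fix, the second case would return an error from Add(); after it, the client still sees the raw text and the usual done_reason: "length".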

Changes

  • model/parsers/glm46.go: When parseGLM46ToolCall() fails, fall back to returning the raw content as regular text instead of returning an error. When done=true, drain any remaining buffered content held back by the streaming parser.
  • model/parsers/glm46_test.go: Added 5 tests covering truncated tool calls (with and without streaming), invalid XML fallback, valid-after-invalid recovery, and content drain on done.
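The drain-on-done behavior above can be sketched like this. Again a hypothetical illustration, not the real parser: a streaming parser holds back any trailing bytes that might be the start of a tool-call tag, and on done=true it must flush that buffer rather than silently drop it.

```go
package main

import (
	"fmt"
	"strings"
)

const openTag = "<tool_call>"

// streamParser is a hypothetical sketch of the buffering behavior: it
// holds back any suffix of the stream that could be the start of the
// "<tool_call>" open tag.
type streamParser struct {
	buf string
}

// Add returns content that is safe to emit now. A possible partial tag
// stays buffered; when done=true the buffer is drained, since a partial
// tag can never complete.
func (p *streamParser) Add(chunk string, done bool) string {
	p.buf += chunk
	if done {
		out := p.buf
		p.buf = ""
		return out
	}
	// Hold back the longest suffix that is a prefix of the open tag.
	for i := len(openTag); i > 0; i-- {
		if strings.HasSuffix(p.buf, openTag[:i]) {
			out := p.buf[:len(p.buf)-i]
			p.buf = p.buf[len(p.buf)-i:]
			return out
		}
	}
	out := p.buf
	p.buf = ""
	return out
}

func main() {
	p := &streamParser{}
	fmt.Printf("%q\n", p.Add("hello <tool", false)) // "<tool" is held back
	fmt.Printf("%q\n", p.Add("", true))             // drained on done
}
```

Without the drain step, the held-back "<tool" would be lost when the stream ends mid-tag, which is what the new content-drain test guards against.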

Test plan

  • [x] All new tests pass (go test ./model/parsers/ -run TestGLM46Parser -v)
  • [x] All existing parser tests still pass (go test ./model/parsers/ -v)
  • [ ] Manual test (pending): send a chat request to a GLM-4.6 model with tools enabled and a low num_predict that causes tool call truncation — should return 200 with content instead of 500
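For the pending manual test, a request body along these lines should reproduce the truncation. This is a sketch, not a recipe from the PR: the model tag, tool definition, and the exact num_predict value are assumptions; the payload targets ollama's POST /api/chat endpoint.

```python
import json

# Hypothetical request body for POST /api/chat: tools enabled plus a low
# num_predict so the model's tool call gets cut off mid-generation.
payload = {
    "model": "glm-4.6",  # assumed model tag
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "options": {"num_predict": 8},  # low cap to force truncation
    "stream": False,
}

print(json.dumps(payload, indent=2))
```

With the fix, the response should be HTTP 200 with the truncated text in message.content and done_reason: "length", rather than an HTTP 500.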

Related: #14570 (same bug in Qwen3 parser, fixed by #14835)


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-05-05 09:54:08 -05:00

Reference: github-starred/ollama#77221