[GH-ISSUE #15363] The qwen3.5 35b model has a bug that causes infinite thinking when using the network. #71890

Open
opened 2026-05-05 02:52:09 -05:00 by GiteaMirror · 4 comments
Owner

Originally created by @sdqq1234 on GitHub (Apr 6, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15363

What is the issue?

I'm using Ollama version 0.20.2 to load a Q4 quantization model of QW3.5 35B. When I connect to the internet and provide a image to query its content, I encounter an infinite "thinking" loop where the content keeps looping without ending.

Relevant log output


OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.20.2

Originally created by @sdqq1234 on GitHub (Apr 6, 2026). Original GitHub issue: https://github.com/ollama/ollama/issues/15363 ### What is the issue? I'm using Ollama version 0.20.2 to load a Q4 quantization model of QW3.5 35B. When I connect to the internet and provide a image to query its content, I encounter an infinite "thinking" loop where the content keeps looping without ending. ### Relevant log output ```shell ``` ### OS Windows ### GPU Nvidia ### CPU AMD ### Ollama version 0.20.2
GiteaMirror added the bug label 2026-05-05 02:52:10 -05:00
Author
Owner

@rick-github commented on GitHub (Apr 6, 2026):

Server logs will aid in debugging.

<!-- gh-comment-id:4191742721 --> @rick-github commented on GitHub (Apr 6, 2026): [Server logs](https://docs.ollama.com/troubleshooting) will aid in debugging.
Author
Owner

@sdqq1234 commented on GitHub (Apr 6, 2026):

server.log
server-1.log
server-2.log
server-3.log
server-4.log
server-5.log
upgrade.log

I don't know which one is correct, so I uploaded all the server log files.

<!-- gh-comment-id:4192486302 --> @sdqq1234 commented on GitHub (Apr 6, 2026): [server.log](https://github.com/user-attachments/files/26509231/server.log) [server-1.log](https://github.com/user-attachments/files/26509234/server-1.log) [server-2.log](https://github.com/user-attachments/files/26509233/server-2.log) [server-3.log](https://github.com/user-attachments/files/26509232/server-3.log) [server-4.log](https://github.com/user-attachments/files/26509237/server-4.log) [server-5.log](https://github.com/user-attachments/files/26509236/server-5.log) [upgrade.log](https://github.com/user-attachments/files/26509235/upgrade.log) I don't know which one is correct, so I uploaded all the server log files.
Author
Owner

@pdevine commented on GitHub (Apr 6, 2026):

Which model are you using exactly? If you are using the coding model for non-coding tasks it can get into a loop because the presence_penalty isn't set.

<!-- gh-comment-id:4194277935 --> @pdevine commented on GitHub (Apr 6, 2026): Which model are you using exactly? If you are using the coding model for non-coding tasks it can get into a loop because the presence_penalty isn't set.
Author
Owner

@andrenaP commented on GitHub (Apr 13, 2026):

Having the same issue using qwen3.5:latest for code generation. I checked it with presence_penalty set to 0 and presence_penalty set to 1.5 it will not exit the thinking mode no matter what presence_penalty I set. If I turn thinking mode off I will get out a good result.

This happens only if I provide a lot of context to the model. It works just fine on simple questions using thinking.

<!-- gh-comment-id:4235408948 --> @andrenaP commented on GitHub (Apr 13, 2026): Having the same issue using qwen3.5:latest for code generation. I checked it with presence_penalty set to 0 and presence_penalty set to 1.5 it will not exit the thinking mode no matter what presence_penalty I set. If I turn thinking mode off I will get out a good result. This happens only if I provide a lot of context to the model. It works just fine on simple questions using thinking.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#71890