[GH-ISSUE #6937] error reading llm response:An existing connection was forcibly closed by the remote host. #30153

Closed
opened 2026-04-22 09:38:40 -05:00 by GiteaMirror · 15 comments

Originally created by @yaosd99 on GitHub (Sep 24, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6937

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Screenshot: https://github.com/user-attachments/assets/07b1633d-7fe1-49c6-8267-d0432331f143 (bug-help)

Cannot import images. Please help.

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.3.11

GiteaMirror added the nvidia, bug, windows labels 2026-04-22 09:38:40 -05:00

@srikary12 commented on GitHub (Sep 24, 2024):

I'll take a look at it.


@dhiltgen commented on GitHub (Sep 25, 2024):

This model loads correctly for me on an Nvidia GPU on Windows. Can you share your server log so we can see what may be going wrong?

https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md
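For anyone else hitting this, here is a minimal sketch for grabbing the tail of that log on Windows, assuming the default location documented in troubleshooting.md (%LOCALAPPDATA%\Ollama\server.log):

```
import os
from pathlib import Path

# Default server log location on a standard Windows install of Ollama;
# adjust the path if you installed it elsewhere.
log_path = Path(os.environ["LOCALAPPDATA"]) / "Ollama" / "server.log"

# The last lines are usually where a runner crash shows up.
lines = log_path.read_text(encoding="utf-8", errors="replace").splitlines()
print("\n".join(lines[-50:]))
```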


@yaosd99 commented on GitHub (Sep 25, 2024):

> This model loads correctly for me on an Nvidia GPU on Windows. Can you share your server log so we can see what may be going wrong?
>
> https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md

This is a reproduced log file, please refer to it~

server.log: https://github.com/user-attachments/files/17128762/server.log


@srikary12 commented on GitHub (Sep 25, 2024):

The logs look clear and the model loads correctly for me. Can you restart Ollama and check if the issue still persists?


@yaosd99 commented on GitHub (Sep 25, 2024):

> The logs look clear and the model loads correctly for me. Can you restart Ollama and check if the issue still persists?

Yes, I tried various methods, including restarting and rebuilding the environment, but still encountered the aforementioned issue.


@dhiltgen commented on GitHub (Sep 25, 2024):

The attached logs don't seem to show a crash. I actually see two `200` responses after the load, implying things worked correctly. @yaosd99 can you clarify? Was this log capturing the exchange where the client received the error `error reading llm response:An existing connection was forcibly closed by the remote host.`?


@yaosd99 commented on GitHub (Sep 27, 2024):

> The attached logs don't seem to show a crash. I actually see two `200` responses after the load, implying things worked correctly. @yaosd99 can you clarify? Was this log capturing the exchange where the client received the error `error reading llm response:An existing connection was forcibly closed by the remote host.`?

After testing, server.log does not contain the error displayed by cmd, and the five-digit port number "xxxxx" in the error "127.0.0.1:xxxxx" changes every time.


@dhiltgen commented on GitHub (Sep 27, 2024):

Ollama spawns a child process on a random port number for every model load. I've tried a few different approaches but haven't managed to reproduce this crash, so I'm still not sure why it crashes on your system. Setting OLLAMA_DEBUG=1 might yield some more logging that's helpful. You could also try to force it back to use the CUDA v11 runner instead of v12 by setting OLLAMA_LLM_LIBRARY=cuda_v11

Do other models work correctly?
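As a minimal sketch, both suggestions can be applied when launching the server yourself (this assumes ollama is on PATH and no other instance is already running):

```
import os
import subprocess

# Copy the current environment and add the suggested debug settings.
env = os.environ.copy()
env["OLLAMA_DEBUG"] = "1"               # more verbose server-side logging
env["OLLAMA_LLM_LIBRARY"] = "cuda_v11"  # force the CUDA v11 runner instead of v12

# Start the server with the settings applied, then reproduce the error
# and check server.log for the extra output.
subprocess.run(["ollama", "serve"], env=env)
```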


@steveseguin commented on GitHub (Oct 25, 2024):

I'm having the same error. It's probably just me being stupid, but things work if I don't include an image.

Windows 11. Titan RTX. Ollama 0.4.0-rc5 using Llama 3.2-vision:latest

An 8-KB square image uploaded with the message "describe" fails with the same error as the OP; it works if I don't include an image.
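For anyone trying to reproduce this, a minimal sketch of the failing request against the local REST API; the image path is a stand-in, and the model tag assumes the standard llama3.2-vision:latest:

```
import base64

import requests

# "test.png" is a stand-in; any small square image matches the report.
with open("test.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2-vision:latest",  # assumed tag for Llama 3.2-vision
        "prompt": "describe",
        "images": [image_b64],  # /api/generate takes base64-encoded images
        "stream": False,
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```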

Screenshots: https://github.com/user-attachments/assets/962f5eb5-64d9-40d2-afef-6254e6134750 and https://github.com/user-attachments/assets/6431f9f3-3691-4949-be01-e2bf6578335d

server.log shows this:

```
Exception 0xc0000005 0x0 0x241ea9da000 0x7ff97aff2edb
PC=0x7ff97aff2edb
signal arrived during external code execution
```

server.log: https://github.com/user-attachments/files/17517383/serverlog.log

Hope it helps.


@jessegross commented on GitHub (Oct 30, 2024):

@steveseguin That is a different issue - yours is the same as #7362

That one has been fixed in current main if you build from source or in the next RC.


@yaosd99 commented on GitHub (Nov 2, 2024):

> Ollama spawns a child process on a random port number for every model load. I've tried a few different approaches but haven't managed to reproduce this crash, so I'm still not sure why it crashes on your system. Setting OLLAMA_DEBUG=1 might yield some more logging that's helpful. You could also try to force it back to use the CUDA v11 runner instead of v12 by setting OLLAMA_LLM_LIBRARY=cuda_v11
>
> Do other models work correctly?

After my testing, the earlier bug was due to the model's inability to recognize some of the images, which prevented it from returning inference results to Ollama. It's not Ollama's problem; it's a problem with the LLM.


@HappyManWorld commented on GitHub (Nov 17, 2024):

I have the same problem, but only with the LLM "deepseek-coder-v2":
'error reading llm response: read tcp 127.0.0.1:59935->127.0.0.1:59933: wsarecv: An existing connection was forcibly closed by the remote host.'

If I send a small query, it works perfectly. But when I send a query like "make unit tests for PHP code....", the system works for 4-5 minutes and then I see this error.

Other models work correctly. I tried more than ten.

Ollama version=0.4.2

Screenshot: https://github.com/user-attachments/assets/4d292613-18ba-47e5-830c-9f9d7d9e67d1

In console mode (it looks like it stops partway through the answer):
Screenshot: https://github.com/user-attachments/assets/f3fb3ba0-764e-416f-b8af-bd5fdf4f934b


@dhiltgen commented on GitHub (Nov 18, 2024):

@HappyManRus can you check your server log? There are a number of existing issues tracking deepseek crashes. Depending on what the crash was, there may be a workaround you can try until we get it fixed.


@HappyManWorld commented on GitHub (Nov 19, 2024):

> @HappyManRus can you check your server log? There are a number of existing issues tracking deepseek crashes. Depending on what the crash was, there may be a workaround you can try until we get it fixed.

Hello.
@dhiltgen, I hope it helps.

server_copy.log: https://github.com/user-attachments/files/17820237/server_copy.log


@jessegross commented on GitHub (Nov 20, 2024):

@HappyManRus Your issue is the same as #5975 - there is a workaround there.

I'm going to go ahead and close this issue as it seems that the original issue has been fixed. For other issues, please follow up in the linked bugs if needed.

Reference: github-starred/ollama#30153