[GH-ISSUE #10348] Issue began after version 0.5.12, Error: POST predict: Post ... wsarecv: An existing connection was forcibly closed by the remote host. #68853

Closed
opened 2026-05-04 15:24:48 -05:00 by GiteaMirror · 6 comments

Originally created by @katmandoo212 on GitHub (Apr 20, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10348

[server-2.log](https://github.com/user-attachments/files/19824862/server-2.log)

You can see in the server log that the port number is way off; it should be 11434. If I revert to 0.5.12, port 11434 is used, but after that version, at least on Windows 10, it attempts to use a different port on every execution.

Originally posted by @katmandoo212 in [#9986](https://github.com/ollama/ollama/issues/9986#issuecomment-2817106369)

@rick-github commented on GitHub (Apr 20, 2025):

```
time=2025-04-20T05:53:35.864-04:00 level=ERROR source=routes.go:478 msg="embedding generation failed" error="do embedding request: Post \"http://127.0.0.1:62279/embedding\": context canceled"
[GIN] 2025/04/20 - 05:53:35 | 500 |   38.2550735s |       127.0.0.1 | POST     "/api/embed"
```

The port number is for communicating with the runner; it's not related to the port number of the server (11434). The problem you have is that the context was cancelled and the server returned a 500. The reason for the cancellation is not clear; try adding `OLLAMA_DEBUG=1` to the server environment for a more detailed log.
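
For reference, a minimal sketch of hitting the embedding endpoint on the server's own port (the model name below is just a placeholder for whatever embedding-capable model is pulled locally):

```python
# Minimal sketch: exercise /api/embed on the server's listen port (11434 by
# default). The runner port seen in server.log (e.g. 62279) is internal and
# chosen dynamically; clients never talk to it directly.
import json
import urllib.error
import urllib.request

body = json.dumps({"model": "nomic-embed-text", "input": "hello world"}).encode()
req = urllib.request.Request(
    "http://127.0.0.1:11434/api/embed",
    data=body,
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req, timeout=120) as resp:
        payload = json.loads(resp.read())
        print(resp.status, list(payload.keys()))
except urllib.error.HTTPError as e:
    # A 500 here corresponds to the "embedding generation failed" line in server.log.
    print("server returned", e.code, e.read().decode())
```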

@katmandoo212 commented on GitHub (Apr 21, 2025):

Here are my logs after installing 0.6.5 and running with `OLLAMA_DEBUG=1`.

[app.log](https://github.com/user-attachments/files/19829028/app.log)
[server.log](https://github.com/user-attachments/files/19829027/server.log)

@katmandoo212 commented on GitHub (Apr 21, 2025):

These are the logs from 0.5.12, running with `OLLAMA_DEBUG=1`.

[app.log](https://github.com/user-attachments/files/19829105/app.log)
[server.log](https://github.com/user-attachments/files/19829106/server.log)

@rick-github commented on GitHub (Apr 21, 2025):

```
D:\home\blt\github\llama.cpp\ggml\src\ggml.c:1729: GGML_ASSERT(tensor->op == GGML_OP_UNARY) failed
```

#9509
Also:

```
time=2025-04-20T23:47:29.455-04:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v11\\cudart64_110.dll: your nvidia driver is too old or missing.  If you have a CUDA GPU please upgrade to run ollama"
time=2025-04-20T23:47:29.464-04:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: your nvidia driver is too old or missing.  If you have a CUDA GPU please upgrade to run ollama"
```

No GPUs are detected; is that expected?

@katmandoo212 commented on GitHub (Apr 21, 2025):

Yes. I am using an Intel i5 CPU only.

@katmandoo212 commented on GitHub (Apr 22, 2025):

This issue is resolved on my machine. Here is the breakdown of what resolved it. Thanks so much for your kind help, rick-github!

The suggestion to set `OLLAMA_DEBUG` and examine the log files did the trick! server.log had a line, close to the runner termination, that referenced a llama.cpp file that was not part of the Ollama installation but came from a llama.cpp source tree I had cloned. It stood out because it was on my D drive rather than my C drive.

I had cloned and built llama.cpp from source, and had also cloned and built Ollama from source. For some reason the Ollama build referenced the llama.cpp source tree I had cloned; I have no idea why. I resolved the issue by uninstalling my locally built Ollama, moving the llama.cpp source code to another folder (to try to hide it from Ollama), and then downloading the Windows Ollama setup executable for 0.6.5 from the Ollama GitHub. With that, Ollama 0.6.5 runs.
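
A quick way to sanity-check which binary is actually being picked up, sketched below (it assumes the `ollama` CLI is on PATH); this sort of check helps spot a locally built copy shadowing the official install:

```python
# Sketch (assumes the `ollama` CLI is on PATH): report which ollama binary
# would run and the version it prints, to spot a locally built copy shadowing
# the official install.
import shutil
import subprocess

path = shutil.which("ollama")
print("ollama resolves to:", path)
if path:
    # `ollama --version` prints the installed version; exact output format
    # may vary between releases.
    out = subprocess.run([path, "--version"], capture_output=True, text=True)
    print((out.stdout or out.stderr).strip())
```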

The issue I need to figure out now is this: I am working on an application in which I want to use both Ollama and llama.cpp, to compare speeds and to use the llama.cpp RPC server running on a Raspberry Pi cluster. I am only doing this as an experiment, so I expect to run into problems and dead ends.
