[GH-ISSUE #11157] ollama: Error: POST predict: Post "http://127.0.0.1:43799/completion": EOF #7359

Closed
opened 2026-04-12 19:24:51 -05:00 by GiteaMirror · 4 comments

Originally created by @nPrevail on GitHub (Jun 21, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11157

What is the issue?

When you run `ollama run <model>` and enter any prompt, the run fails with the following error:

Error: POST predict: Post "http://127.0.0.1:43799/completion": EOF

Relevant log output

Error: POST predict: Post "http://127.0.0.1:43799/completion": EOF

Server log:
https://pastebin.com/G6whWmkZ

server_debug_ollama.txt: https://github.com/user-attachments/files/20958784/server_debug_ollama.txt

OS

Linux

GPU

AMD

CPU

AMD

Ollama version

0.7.0

GiteaMirror added the bug label 2026-04-12 19:24:51 -05:00

@rick-github commented on GitHub (Jun 21, 2025):

Server logs (see https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md#how-to-troubleshoot-issues) will aid in debugging.
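On a Linux install where ollama runs as a systemd service (the reporter's machine turns out to be NixOS), the server logs can typically be collected like this. This is a sketch per the ollama troubleshooting docs; the unit name `ollama` is an assumption and may differ on your distro:

```shell
# Dump the most recent server log lines to a file for attaching to the issue.
journalctl -u ollama --no-pager -n 500 > server_debug_ollama.txt

# The client only prints a generic EOF, so narrow the server log to the
# lines that describe the actual backend failure:
grep -iE 'error|rocm|cuda' server_debug_ollama.txt
```

Setting `OLLAMA_DEBUG=1` in the service environment and restarting the service produces more detailed logs if the default output is not enough.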


@jaypeche commented on GitHub (Jun 22, 2025):

Same problem here. Aligning the versions of nvidia-cuda-toolkit and nvidia-drivers (using the same ~amd64 keywording) and recompiling the ollama ebuild from source with the same CUDA compiler fixed it for me.

After that, LLMs such as mistral-nemo run correctly, with full GPU/CPU utilization.

Feedback for AMD ROCm would be interesting; I don't have a compatible AMD machine to test with. :)


@nPrevail commented on GitHub (Jun 28, 2025):

I've added the server log:
https://pastebin.com/G6whWmkZ

@rick-github


@rick-github commented on GitHub (Jun 28, 2025):

Jun 27 21:28:08 nixos ollama[1266]: ggml_cuda_compute_forward: RMS_NORM failed
Jun 27 21:28:08 nixos ollama[1266]: ROCm error: no kernel image is available for execution on the device
Jun 27 21:28:08 nixos ollama[1266]:   current device: 0, in function ggml_cuda_compute_forward at /build/source/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:2362
Jun 27 21:28:08 nixos ollama[1266]:   err
Jun 27 21:28:08 nixos ollama[1266]: /build/source/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:75: ROCm error
Jun 27 21:28:08 nixos ollama[1266]: Memory critical error by agent node-0 (Agent handle: 0x2e408890) on address 0x7fa97c200000. Reason: Memory in use.

Looks like the same as #11123
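The "no kernel image is available for execution on the device" message generally means the ROCm runtime has no precompiled kernels for the GPU's gfx target. A commonly documented workaround is to point ROCm at kernels built for a nearby supported target via `HSA_OVERRIDE_GFX_VERSION`. A sketch; the value `10.3.0` is only an example for RDNA2-class cards and must be chosen to match a supported target close to what your GPU actually reports:

```shell
# Check which gfx target the GPU reports (rocminfo ships with ROCm):
rocminfo | grep -i gfx

# Tell ROCm to use kernels built for a nearby supported target,
# e.g. gfx1030 -> "10.3.0" (example value, adjust for your GPU):
HSA_OVERRIDE_GFX_VERSION=10.3.0 ollama serve
```

If the override does not help, the GPU may simply not be supported by the ROCm build that ollama ships, in which case forcing CPU inference or a source build against a matching ROCm is the usual fallback.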

Reference: github-starred/ollama#7359