[GH-ISSUE #13307] failed with status code 500: llama runner process has terminated: exit status 2 #55302

Closed
opened 2026-04-29 08:48:16 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @Riperjame on GitHub (Dec 3, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13307

What is the issue?

After successfully downloading and pulling the deepseek-r1:14b model, running it immediately gave the error: Ollama call failed with status code 500: llama runner process has terminated: exit status 2. I then ran the deepseek-r1:8b model and found CPU usage at about 50% and GPU usage at 98%, which had never happened before.
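
For anyone reproducing this, the CPU/GPU split of a loaded model can be checked with `ollama ps`; a minimal sketch (the exact column layout may vary by Ollama version):

```shell
# List currently loaded models; the PROCESSOR column shows the split,
# e.g. "100% GPU" or "48%/52% CPU/GPU" when layers spill to system RAM.
ollama ps
```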

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug label 2026-04-29 08:48:16 -05:00
Author
Owner

@rick-github commented on GitHub (Dec 3, 2025):

[Server log](https://docs.ollama.com/troubleshooting) will help in debugging.

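For reference, the linked troubleshooting page describes where the server log lives; a rough sketch of how to pull it, assuming a default install (paths depend on platform and install method):

```shell
# Linux (systemd service): tail the recent server log.
journalctl -u ollama --no-pager | tail -n 100

# macOS: the server log is written under ~/.ollama/logs.
tail -n 100 ~/.ollama/logs/server.log

# Windows (PowerShell): the log lives under %LOCALAPPDATA%\Ollama.
# Get-Content "$env:LOCALAPPDATA\Ollama\server.log" -Tail 100
```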
Author
Owner

@Riperjame commented on GitHub (Dec 4, 2025):

Resolved.
Here is what I encountered: after pulling the deepseek-r1:14b model, running it would immediately throw the error failed with status code 500: llama runner process has terminated: exit status 2. After pulling the deepseek-r1:8b model, however, it ran with CPU ~50% and GPU ~98%, which is clearly abnormal and had never occurred before.
In the Ollama crash logs, I found the core cause of the error was graph_reserve: failed to allocate compute buffers plus a memory access violation (exception 0xc0000005). The issue stemmed from a context length set too high combined with a conflicting GPU memory allocation policy: OLLAMA_CONTEXT_LENGTH was defaulting to 262144, which is too large, whereas the 14B model only supports 131072 and 16 GB of GPU memory cannot handle it.
So I set the environment variable OLLAMA_CONTEXT_LENGTH=8192 and restarted Ollama, which solved the issue.
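
For anyone applying the same fix, a sketch of setting the variable, assuming a systemd service on Linux or the desktop app on Windows (restart Ollama after either change):

```shell
# Linux (systemd): add the variable to the service unit, then restart:
#   [Service]
#   Environment="OLLAMA_CONTEXT_LENGTH=8192"
sudo systemctl edit ollama
sudo systemctl restart ollama

# Windows (cmd): persist the variable for the current user, then quit
# and relaunch the Ollama app so it picks up the new value.
# setx OLLAMA_CONTEXT_LENGTH 8192
```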

> [Server log](https://docs.ollama.com/troubleshooting) will help in debugging.

Author
Owner

@metal3d commented on GitHub (Dec 4, 2025):

> OLLAMA_CONTEXT_LENGTH=8192

It didn't fix the problem for me

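If the environment variable alone doesn't help, the context length can also be capped per request through the num_ctx option of the Ollama API; a minimal sketch against a local server (the model name is only an example):

```shell
# Request a completion with the context window capped at 8192 tokens.
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:14b",
  "prompt": "Hello",
  "options": { "num_ctx": 8192 }
}'
```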
Reference: github-starred/ollama#55302