[GH-ISSUE #13119] Persistent 500 Internal Server Error trying to operate deepseek-r1:70b on Windows #55198

Closed
opened 2026-04-29 08:29:41 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @LexxM3 on GitHub (Nov 17, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13119

What is the issue?

Ollama consistently fails to run deepseek-r1:70b with:

Error: 500 Internal Server Error: llama runner process has terminated: exit status 2

  • other models work fine (e.g. codellama:70b and gpt-oss:120b)

  • Ollama version 0.12.11 (latest as of posting date 2025-11-17)

  • below is an example of an apparently successful pull; I attempted it at least 3 times using different approaches, running ollama rm after each one; testing included Ollama and machine restarts

~$ ollama pull deepseek-r1:70b
pulling manifest
pulling 4cd576d9aa16: 100% ▕████████▏  42 GB
pulling c5ad996bda6e: 100% ▕████████▏  556 B
pulling 6e4c38e1172f: 100% ▕████████▏ 1.1 KB
pulling f4d24e9138dd: 100% ▕████████▏  148 B
pulling 5e9a45d7d8b9: 100% ▕████████▏  488 B
verifying sha256 digest
writing manifest
success
  • machine is a 10890XE with 160 GB DRAM and dual RTX 3090s with NVLink; Windows 10, latest (2025-11-06) NVIDIA game drivers

  • happy to provide logs, but not sure what would be relevant

Relevant log output


OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.12.11

Originally created by @LexxM3 on GitHub (Nov 17, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/13119 ### What is the issue? Ollama consistently refuses to operate `deepseek-r1:70b` with a > Error: 500 Internal Server Error: llama runner process has terminated: exit status 2 - other models work fine (e.g. `codellama:70b` and `gpt-oss:120b`) - Ollama version 0.12.11 (latest as of posting date 2025-11-17) - below is an example of apparently successful pull; attempted at least 3 times using different approaches, `ollama rm` after each one; testing included Ollama and machine restarts ``` ~$ ollama pull deepseek-r1:70b pulling manifest pulling 4cd576d9aa16: 100% ▕████████▏ 42 GB pulling c5ad996bda6e: 100% ▕████████▏ 556 B pulling 6e4c38e1172f: 100% ▕████████▏ 1.1 KB pulling f4d24e9138dd: 100% ▕████████▏ 148 B pulling 5e9a45d7d8b9: 100% ▕████████▏ 488 B verifying sha256 digest writing manifest success ``` - machine is 10890XE with 160GB DRAM and dual RTX3090 with NVlink; Windows 10, latest (2025-11-06) NVIDIA game drivers - happy to provide logs, but not sure what would be relevant ### Relevant log output ```shell ``` ### OS Windows ### GPU Nvidia ### CPU Intel ### Ollama version 0.12.11
GiteaMirror added the bug label 2026-04-29 08:29:41 -05:00
Author
Owner

@LexxM3 commented on GitHub (Nov 17, 2025):

Just tested deepseek-r1:32b and deepseek-r1:14b -- both fail with the same error. deepseek-r1:8b, deepseek-r1:7b, and the cloud deepseek-v3.1:671b-cloud work.

Author
Owner

@rick-github commented on GitHub (Nov 17, 2025):

The server log (https://docs.ollama.com/troubleshooting) will help in debugging.

Author
Owner

@LexxM3 commented on GitHub (Nov 18, 2025):

Ok, figured it out. Mostly my fault, but I don't have everything fully explained and, at the least, it would be nice to have better error messages than the generic 500/exit status 2. I'll document the notes here for others and then close this issue.

  • core issue is I had excessive context configured for my hardware and that model

    • with dual 3090 (48GB VRAM) and deepseek-r1:70b model, maximum working context is 8K
    • it's 32K max for deepseek-r1:32b, and so on
    • I somehow had it set to 256K when it was failing
  • interestingly, gpt-oss:120b allows even a 256K context -- not sure I understand that, since it's a much larger model; maybe there is some specialized handling by Ollama that allows it (I am just ramping up on Ollama, so I don't know all the history and nuances yet)

  • I do specifically recall setting context to 128K, but it ended up being 256K somehow

    • interestingly, I recall that 128K was the maximum and that's what I set
    • somehow it changed to max 256K and then changed my setting to that max
    • I could be imagining it, but I don't think so; if not, it's a UI issue, though not a serious one for a 0.x.x package
  • the server log was useful, but it took careful reading to spot the relevant details among everything piled together
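The "excessive context" diagnosis above can be sanity-checked with back-of-envelope KV-cache arithmetic. The architecture numbers below are assumptions for illustration (a Llama-3.3-70B-style layout, which deepseek-r1:70b is distilled onto: 80 layers, 8 KV heads via grouped-query attention, head dimension 128, fp16 cache); the real values can be checked with `ollama show`.

```python
def kv_cache_gib(num_ctx, layers=80, kv_heads=8, head_dim=128, bytes_per=2):
    """GiB of K+V cache needed for a context of num_ctx tokens.

    Assumed shape: per token, each layer stores a K and a V vector of
    kv_heads * head_dim elements at bytes_per bytes each.
    """
    per_token = 2 * layers * kv_heads * head_dim * bytes_per  # K and V
    return num_ctx * per_token / 2**30

print(f"{kv_cache_gib(8192):.1f} GiB at 8K")      # -> 2.5 GiB at 8K
print(f"{kv_cache_gib(262144):.1f} GiB at 256K")  # -> 80.0 GiB at 256K
```

Under these assumptions, an 8K context costs about 2.5 GiB of cache on top of the ~42 GB of weights, which roughly fits in 48 GB of VRAM, while 256K would need about 80 GiB for the cache alone -- consistent with the runner crashing at 256K and working at 8K.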

Closing issue and thanks.
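For others hitting this: besides a global setting, the context window can be capped per request via the documented `num_ctx` option on Ollama's /api/generate endpoint. A minimal sketch, with the 8K value from the resolution above (the prompt string is a placeholder):

```python
import json

# Build a /api/generate request body that caps the context window at 8K,
# the largest value that fit the reporter's dual-3090 setup for this model.
payload = {
    "model": "deepseek-r1:70b",
    "prompt": "Why is the sky blue?",
    "options": {"num_ctx": 8192},
}
body = json.dumps(payload)

# To actually send it (requires a running Ollama server on the default port):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# resp = urllib.request.urlopen(req)
```

In an interactive `ollama run` session the equivalent is `/set parameter num_ctx 8192`.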
