[GH-ISSUE #15063] Running DeepSeek R1 model become slow #35427

Closed
opened 2026-04-22 19:55:46 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @constantin-ungureanu-github on GitHub (Mar 25, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15063

What is the issue?

With some previous ollama releases I could run deepseek-r1:70b (42 GB) fast with GPU support.
However, with newer versions of ollama (noticed starting with 0.18.0), the same model on the same hardware is now very slow.
I see that DRAM is heavily used, on the same processor, but VRAM is also loaded.
The combined VRAM of two 5090s is 64 GB, so the model should fit into VRAM, no? At least it did previously (I recall version 0.16.x ran in GPU/VRAM; I can't recall about 0.17.x), and DeepSeek R1 was fast.

Other models, e.g. qwen3-coder-next (51 GB), do still run in GPU/VRAM. This seems like a regression.
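One quick way to confirm whether ollama has split the model between CPU and GPU is to inspect the loaded model's placement. A minimal sketch, assuming a recent ollama CLI where `ollama ps` reports a PROCESSOR column:

```shell
# While the model is loaded, show placement; the PROCESSOR column
# reports the split, e.g. "100% GPU" vs something like "40%/60% CPU/GPU".
ollama ps
```

If the output shows a partial CPU split, the slowdown is consistent with layers being offloaded to system RAM rather than fitting in VRAM.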

Relevant log output
(No log output was provided.)
OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.18.x

GiteaMirror added the bug label 2026-04-22 19:55:46 -05:00

@rick-github commented on GitHub (Mar 25, 2026):

[Server logs](https://docs.ollama.com/troubleshooting) will aid in debugging.
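For reference, on a systemd-based Linux install the server logs can typically be collected like this (a sketch assuming the default `ollama` systemd unit; adjust the unit name if ollama was installed differently):

```shell
# Dump the ollama service journal to a file for attaching to the issue.
journalctl -u ollama --no-pager > ollama-server.log
```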


@constantin-ungureanu-github commented on GitHub (Apr 13, 2026):

I solved this issue by using vllm.


Reference: github-starred/ollama#35427