[GH-ISSUE #15049] Why does running deepseek-r1:32b require over 300 GB of memory? #71719

Closed
opened 2026-05-05 02:24:15 -05:00 by GiteaMirror · 1 comment

Originally created by @yangy996 on GitHub (Mar 25, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15049

What is the issue?

curl http://localhost:11434/api/generate -d '{
"model": "deepseek-r1:32b",
"prompt": "Hello, how are you?",
"stream": false
}'
{"error":"model requires more system memory (338.1 GiB) than is available (328.0 GiB)"}
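As background (not part of the original report): for a 32B-class model, a memory estimate in the hundreds of GiB usually cannot come from the weights alone; the KV cache, which grows linearly with the configured context length (`num_ctx`) and with the number of parallel sequences, is the typical driver. A rough back-of-envelope sketch, using assumed Qwen2.5-32B-like dimensions (layer count, KV heads, head size, and an fp16 cache are all illustrative assumptions, not values from this report):

```shell
# KV-cache estimate: bytes per token = 2 (K+V) * layers * kv_heads * head_dim * bytes/elem.
# All dimensions below are assumptions for illustration, not values from this report.
layers=64 kv_heads=8 head_dim=128 bytes_per_elem=2   # fp16 cache
per_token=$((2 * layers * kv_heads * head_dim * bytes_per_elem))
ctx=131072                                           # hypothetical large num_ctx
total=$((per_token * ctx))
echo "KV cache: ${per_token} bytes/token -> $((total / 1073741824)) GiB at num_ctx=${ctx}"
```

At this per-token cost, a large `num_ctx` (further multiplied by `OLLAMA_NUM_PARALLEL`) can push the scheduler's estimate well past physical memory, so checking the effective context length (e.g. via `ollama show deepseek-r1:32b` or the `/api/show` endpoint) is a reasonable first step.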

Relevant log output

● ollama.service - Ollama Service
Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2026-03-24 17:47:32 CST; 15h ago
Main PID: 2816880 (ollama)
Tasks: 24 (limit: 618634)
Memory: 1.1G
CGroup: /system.slice/ollama.service
└─2816880 /usr/local/bin/ollama serve

Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=DEBUG source=server.go:976 msg="available gpu" id=GPU-fdfb4273-9c05-65a3-6882-21542da92ff6 library=CUDA "a>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=DEBUG source=server.go:976 msg="available gpu" id=GPU-e42673e2-1cc5-b890-6193-de099240bffe library=CUDA "a>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=WARN source=server.go:1044 msg="model request too large for system" requested="338.1 GiB" available="337.2>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.821+08:00 level=INFO source=sched.go:516 msg="Load failed" model=/usr/share/ollama/.ollama/models/blobs/sha256-6150cb38231>
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.853+08:00 level=INFO source=runner.go:965 msg="starting go runner"
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.853+08:00 level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.873+08:00 level=DEBUG source=server.go:1830 msg="stopping llama server" pid=325701
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.873+08:00 level=DEBUG source=server.go:1836 msg="waiting for llama server to exit" pid=325701
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: time=2026-03-25T09:28:05.882+08:00 level=DEBUG source=server.go:1840 msg="llama server stopped" pid=325701
Mar 25 09:28:05 sunway-SYS-420GP-TNR ollama[2816880]: [GIN] 2026/03/25 - 09:28:05 | 500 | 1.744758145s | 127.0.0.1 | POST "/api/generate"


OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.18.0

GiteaMirror added the bug label 2026-05-05 02:24:15 -05:00

@rick-github commented on GitHub (Mar 25, 2026):

[Server logs](https://docs.ollama.com/troubleshooting) will aid in debugging.

Reference: github-starred/ollama#71719