[GH-ISSUE #10505] ollama ps NOT EQUAL to nvidia-smi: actual VRAM usage does not match ollama's calculation #32672

Closed
opened 2026-04-22 14:23:16 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @timothy-WangS on GitHub (Apr 30, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/10505

What is the issue?

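When running a model with ollama, I found that part of the model was loaded onto the CPU. I traced it back to this problem: the VRAM requirement reported by the ollama ps command does not match the VRAM actually in use.

Environment:
ubuntu-22.04 (wsl2 on win server 2022)
ollama 0.6.6
qwen2.5:14b
OLLAMA_NUM_PARALLEL=8

GPUs:
2080Ti 22G * 2

I have confirmed that ollama uses both GPUs and correctly detects their VRAM size. After adjusting the configuration so that the model runs 100% on GPU, I found that the VRAM is not fully used. The ollama settings for reserved VRAM, allowed VRAM, and so on are all left at their defaults, and the number of GPU layers is set to 999 (load everything onto the GPU).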

Image

Image

Image

Relevant log output


OS

No response

GPU

No response

CPU

No response

Ollama version

No response

GiteaMirror added the bug label 2026-04-22 14:23:16 -05:00
Author
Owner

@rick-github commented on GitHub (Apr 30, 2025):

https://github.com/ollama/ollama/issues/10041#issuecomment-2816399723

Author
Owner

@jmorganca commented on GitHub (Apr 30, 2025):

Thanks @rick-github, will close this. @timothy-WangS the mismatch is usually due to `ollama ps` showing _reserved_ memory that may be needed as the context window grows.
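
To give a feel for how much memory a context window can reserve, here is a rough back-of-the-envelope sketch of KV cache size. It is not ollama's actual estimator, which also accounts for weights and other buffers. The architecture numbers (48 layers, 8 KV heads, head dimension 128), the 2048-token context, and the 16-bit cache are assumptions for illustration; only the 8 parallel slots come from the `OLLAMA_NUM_PARALLEL=8` setting in the report, and the idea that each slot gets its own full context is likewise an assumption.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim,
                   context_tokens, parallel_slots, bytes_per_element=2):
    """Estimate KV cache size in bytes for a transformer with grouped-query attention."""
    # Each token stores one key and one value vector (hence the factor 2)
    # per layer and per KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_element
    # Assumption: every parallel slot reserves a full context window up front.
    return per_token * context_tokens * parallel_slots


# Hypothetical figures for a 14B-class model; not read from ollama or the model file.
reserved = kv_cache_bytes(
    n_layers=48,
    n_kv_heads=8,
    head_dim=128,
    context_tokens=2048,   # assumed per-request context length
    parallel_slots=8,      # OLLAMA_NUM_PARALLEL=8 from the report
)
print(f"{reserved / 2**30:.2f} GiB")  # prints "3.00 GiB"
```

Under these assumptions the cache is reserved in full when the model loads, while only part of it may be touched until requests actually fill their contexts, which is one way a reserved figure and an observed figure could diverge.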


Reference: github-starred/ollama#32672