[GH-ISSUE #8181] Does Ollama prioritize the use of shared GPU memory? #51734

Closed
opened 2026-04-28 20:49:38 -05:00 by GiteaMirror · 3 comments

Originally created by @mydreamworldpolly on GitHub (Dec 20, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/8181

What is the issue?

I installed Ollama 0.5.1 and tried the new settings
OLLAMA_FLASH_ATTENTION=1
OLLAMA_KV_CACHE_TYPE=q8_0
with Qwen2.5-7b Q4 and a long context of 130000. VRAM usage did decrease as expected, but Ollama still occupies shared GPU memory instead of fully using the freed-up VRAM, which degrades performance.
[Image: 155617]
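As a rough illustration of why `OLLAMA_KV_CACHE_TYPE=q8_0` lowers VRAM use at this context length, here is a back-of-the-envelope sketch of the KV cache size. The layer count, KV head count, and head dimension are assumptions for a Qwen2.5-7B-class model and are not stated in this issue; the q8_0 cost per element assumes blocks of 32 int8 values plus one 2-byte scale. Real usage also includes model weights and compute buffers, so treat the numbers as an estimate only.

```python
# Rough KV-cache size estimate. All model dimensions below are ASSUMPTIONS
# for a Qwen2.5-7B-class model; they do not come from the issue itself.

def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_element):
    """Bytes needed to store keys and values for every layer at n_ctx tokens."""
    elements = 2 * n_layers * n_kv_heads * head_dim * n_ctx  # 2 = keys + values
    return elements * bytes_per_element

N_CTX = 130_000   # context length reported in the issue
N_LAYERS = 28     # assumed
N_KV_HEADS = 4    # assumed (grouped-query attention)
HEAD_DIM = 128    # assumed

F16_BYTES = 2.0        # default f16 cache: 2 bytes per element
Q8_0_BYTES = 34 / 32   # assumed q8_0 block: 32 int8 values + 2-byte scale

f16_size = kv_cache_bytes(N_CTX, N_LAYERS, N_KV_HEADS, HEAD_DIM, F16_BYTES)
q8_size = kv_cache_bytes(N_CTX, N_LAYERS, N_KV_HEADS, HEAD_DIM, Q8_0_BYTES)

if __name__ == "__main__":
    gib = 1024 ** 3
    print(f"f16  KV cache: {f16_size / gib:.2f} GiB")
    print(f"q8_0 KV cache: {q8_size / gib:.2f} GiB")
```

Under these assumptions the quantized cache is a little over half the size of the f16 one. The estimate says nothing about whether the driver places the remainder in dedicated or shared GPU memory.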

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.5.1

GiteaMirror added the bug label 2026-04-28 20:49:38 -05:00

@rick-github commented on GitHub (Dec 20, 2024):

https://github.com/ollama/ollama/issues/6160


@mydreamworldpolly commented on GitHub (Dec 20, 2024):

> #6160

solved. Thank you!


@mydreamworldpolly commented on GitHub (Dec 20, 2024):

closed


Reference: github-starred/ollama#51734