[GH-ISSUE #4392] Use GTT memory in case of iGPUs to run the model efficiently. #28504

Closed
opened 2026-04-22 06:44:05 -05:00 by GiteaMirror · 1 comment

Originally created by @CoolnsX on GitHub (May 13, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4392

Models running in system memory on the CPU work perfectly fine.

But when using integrated GPUs, which have limited VRAM capped by the vendor, the model crashes with a "low vram memory" error.

These GPUs have a feature called GTT memory on Linux, and Shared GPU Memory on Windows, which they can use whenever their dedicated VRAM is nearly full.
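For illustration, here is a minimal sketch (not ollama's actual code) of how GTT headroom could be detected on Linux: the amdgpu driver exposes `mem_info_gtt_total` / `mem_info_gtt_used` alongside the VRAM counters in sysfs, so a loader could count free GTT toward an iGPU's memory budget instead of failing outright. The `card0` index is an assumption; a real implementation would enumerate devices.

```go
// Hedged sketch: read amdgpu VRAM and GTT counters from sysfs and compute
// the combined headroom an iGPU could use before reporting "low vram memory".
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readSysfsUint64 reads a single integer value (in bytes) from a sysfs file.
func readSysfsUint64(path string) (uint64, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseUint(strings.TrimSpace(string(data)), 10, 64)
}

func main() {
	const dev = "/sys/class/drm/card0/device" // assumed device path

	gttTotal, err := readSysfsUint64(dev + "/mem_info_gtt_total")
	if err != nil {
		fmt.Println("GTT info not available:", err)
		return
	}
	gttUsed, _ := readSysfsUint64(dev + "/mem_info_gtt_used")
	vramTotal, _ := readSysfsUint64(dev + "/mem_info_vram_total")
	vramUsed, _ := readSysfsUint64(dev + "/mem_info_vram_used")

	fmt.Printf("VRAM: %d / %d MiB used\n", vramUsed>>20, vramTotal>>20)
	fmt.Printf("GTT:  %d / %d MiB used\n", gttUsed>>20, gttTotal>>20)

	// A loader could count free GTT toward the budget for an iGPU,
	// rather than rejecting the model when VRAM alone is too small.
	headroom := (vramTotal - vramUsed) + (gttTotal - gttUsed)
	fmt.Printf("Combined headroom: %d MiB\n", headroom>>20)
}
```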

GiteaMirror added the feature request label 2026-04-22 06:44:05 -05:00

@robertvazan commented on GitHub (Nov 2, 2024):

See #6282


Reference: github-starred/ollama#28504