[GH-ISSUE #11747] GPT-OSS:20b offloaded partially to CPU despite enough VRAM #33543

Closed
opened 2026-04-22 16:23:01 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @MarkEScheidker on GitHub (Aug 6, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11747

What is the issue?

I have a 16 GB NVIDIA GPU, but Ollama doesn't use all of the available VRAM unless I manually set the number of GPU layers (num_gpu) in Open WebUI.

Default settings:

![Default settings](https://github.com/user-attachments/assets/9b371b2b-308b-43f1-8b45-d08ad799724b)

Open WebUI num_gpu (ollama) set to 256:

![num_gpu set to 256](https://github.com/user-attachments/assets/bdc39144-2806-40b3-93c1-bc38482e3d6f)
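
For reference, the same override can go straight to the Ollama API instead of through Open WebUI; a minimal sketch, assuming the model is tagged `gpt-oss:20b` and the server is on the default port:

```shell
# Ask Ollama to offload up to 256 layers to the GPU for this request.
curl http://localhost:11434/api/generate -d '{
  "model": "gpt-oss:20b",
  "prompt": "hello",
  "stream": false,
  "options": { "num_gpu": 256 }
}'
```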

I've set these environment variables:
OLLAMA_FLASH_ATTENTION=1
OLLAMA_NUM_PARALLEL=1
GGML_USE_MMAP=1
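
For completeness, this is roughly how those variables reach the container in my Docker setup; the container and volume names follow the standard Ollama Docker instructions and are assumptions here:

```shell
# Sketch: standard Ollama container with the environment variables above.
docker run -d --gpus=all \
  -e OLLAMA_FLASH_ATTENTION=1 \
  -e OLLAMA_NUM_PARALLEL=1 \
  -e GGML_USE_MMAP=1 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama ollama/ollama
```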

Is there a reason the model isn't fully loaded into VRAM by default?
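
The CPU/GPU split is easy to confirm; for example (container name `ollama` is an assumption):

```shell
# The PROCESSOR column reports the split, e.g. "25%/75% CPU/GPU".
docker exec -it ollama ollama ps

# Cross-check actual VRAM usage on the host.
nvidia-smi
```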

Relevant log output


OS

WSL2, Docker

GPU

Nvidia

CPU

Intel

Ollama version

0.11.2

GiteaMirror added the bug label 2026-04-22 16:23:01 -05:00
Author
Owner

@jmorganca commented on GitHub (Aug 6, 2025):

Hi there. This should be fixed in 0.11.3 – let me know if you're still seeing issues!
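
For anyone on the Docker image, upgrading is a pull and recreate; a sketch, assuming the release is also published as a version tag and using the standard container/volume names:

```shell
# Pull the fixed release and recreate the container.
docker pull ollama/ollama:0.11.3
docker stop ollama && docker rm ollama
docker run -d --gpus=all -v ollama:/root/.ollama \
  -p 11434:11434 --name ollama ollama/ollama:0.11.3
```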

Author
Owner

@MarkEScheidker commented on GitHub (Aug 6, 2025):

Yes, it did. The `latest` tag for the Docker image had only just been updated, so I wasn't actually running the latest version.
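
A quick way to verify which build a running container actually has (container name `ollama` again assumed):

```shell
# Print the server version from inside the container.
docker exec -it ollama ollama -v

# Check which digest the local "latest" tag currently resolves to.
docker images --digests ollama/ollama
```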
