[GH-ISSUE #5498] Ollama OpenAI compatibility fails on GPU? #3440

Closed
opened 2026-04-12 14:06:23 -05:00 by GiteaMirror · 7 comments
Owner

Originally created by @rhastie on GitHub (Jul 5, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/5498

What is the issue?

We have seen instances where when we use the OpenAI API compatibility layer Ollama fails to utilise our NVIDIA GPU. When we re-run the test using the Ollama generate API it does use the GPU.

Is this a configuration consideration or potentially a bug.

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.1.48

Originally created by @rhastie on GitHub (Jul 5, 2024). Original GitHub issue: https://github.com/ollama/ollama/issues/5498 ### What is the issue? We have seen instances where when we use the OpenAI API compatibility layer Ollama fails to utilise our NVIDIA GPU. When we re-run the test using the Ollama generate API it does use the GPU. Is this a configuration consideration or potentially a bug. ### OS Linux ### GPU Nvidia ### CPU Intel ### Ollama version 0.1.48
GiteaMirror added the needs more infobug labels 2026-04-12 14:06:23 -05:00
Author
Owner

@jmorganca commented on GitHub (Jul 5, 2024):

This would be a bug - it should use the GPU the same way. May I ask what Nvidia hardware you may be running on? Sorry about this.

<!-- gh-comment-id:2211158665 --> @jmorganca commented on GitHub (Jul 5, 2024): This would be a bug - it should use the GPU the same way. May I ask what Nvidia hardware you may be running on? Sorry about this.
Author
Owner

@rhastie commented on GitHub (Jul 5, 2024):

Currently we are using a L40 NVIDIA GPU with 48Gb memory. It's part of the ADA generation so pretty new. Spec is here: https://www.nvidia.com/en-gb/data-center/l40/

We have been monitoring the GPU metrics under nvidia-smi and the utilization is zero when using the OpenAI API.

<!-- gh-comment-id:2211169221 --> @rhastie commented on GitHub (Jul 5, 2024): Currently we are using a L40 NVIDIA GPU with 48Gb memory. It's part of the ADA generation so pretty new. Spec is here: https://www.nvidia.com/en-gb/data-center/l40/ We have been monitoring the GPU metrics under nvidia-smi and the utilization is zero when using the OpenAI API.
Author
Owner

@Moonlight1220 commented on GitHub (Jul 8, 2024):

Considering that your GPU was made with running LLM's in mind it's most likely a bug of some kind, may I ask if you are using OpenWebUI (Formerly known as OllamaWebUI)

<!-- gh-comment-id:2215199625 --> @Moonlight1220 commented on GitHub (Jul 8, 2024): Considering that your GPU was made with running LLM's in mind it's most likely a bug of some kind, may I ask if you are using OpenWebUI (Formerly known as OllamaWebUI)
Author
Owner

@rhastie commented on GitHub (Jul 8, 2024):

Not currently but we can easily do that if you feel it would help.

Currently we are directly monitoring the GPU metrics and trying the respective APIs directly. At its simplest we are using "curl" to hit the two different API versions in a loop.

<!-- gh-comment-id:2215375278 --> @rhastie commented on GitHub (Jul 8, 2024): Not currently but we can easily do that if you feel it would help. Currently we are directly monitoring the GPU metrics and trying the respective APIs directly. At its simplest we are using "curl" to hit the two different API versions in a loop.
Author
Owner

@Moonlight1220 commented on GitHub (Jul 9, 2024):

Unsure if it will help becouse i have never seen this issue before however its worth a try, are you running Ollama localy or on a server?

<!-- gh-comment-id:2218800270 --> @Moonlight1220 commented on GitHub (Jul 9, 2024): Unsure if it will help becouse i have never seen this issue before however its worth a try, are you running Ollama localy or on a server?
Author
Owner

@royjhan commented on GitHub (Jul 31, 2024):

@rhastie are you still experiencing this issue?

<!-- gh-comment-id:2261050240 --> @royjhan commented on GitHub (Jul 31, 2024): @rhastie are you still experiencing this issue?
Author
Owner

@dhiltgen commented on GitHub (Oct 24, 2024):

If you're still seeing this, please upgrade to the latest version and if that doesn't clear it up, please share a server log and I'll reopen so we can further investigate.

<!-- gh-comment-id:2434111913 --> @dhiltgen commented on GitHub (Oct 24, 2024): If you're still seeing this, please upgrade to the latest version and if that doesn't clear it up, please share a server log and I'll reopen so we can further investigate.
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#3440