[GH-ISSUE #4975] Is RTX 4070 and not RTX 4070 Ti supported - ambiguous documentation #28904

Closed
opened 2026-04-22 07:27:20 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @thinkrapido on GitHub (Jun 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4975

What is the issue?

Hello,

my prompts to the Ollama model codellama:34b-code-q6_K are taking very long to process,
and the CPU monitor shows many CPU cores involved while an answer is being computed.
What am I doing wrong? Is this a bug, or do I have to live with it?
I expect answers within about a second.

The documentation at NVIDIA (https://www.nvidia.com/de-de/geforce/graphics-cards/40-series/rtx-4070-family/) says that it is CUDA-enabled, but on https://github.com/ollama/ollama/blob/main/docs/gpu.md it is not listed under compute capability 8.9.

I'm using a Linux system with the latest CUDA libraries installed.

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

ollama version is 0.1.24

GiteaMirror added the bug label 2026-04-22 07:27:20 -05:00
Author
Owner

@d-kleine commented on GitHub (Jun 12, 2024):

I am using an RTX 3080 Ti, which is not in the list either, but CUDA works with Ollama for me. I think Ollama's GPU documentation is just outdated (for Ollama, only the part about the compute capability is relevant). So most likely not an Ollama issue.

Did you install cuDNN on your system? Which versions are you using?

  • Nvidia graphics drivers
  • CUDA
  • cuDNN

Try updating to the newest NVIDIA graphics driver, CUDA, and cuDNN. I did that just yesterday, and it works perfectly fine for me.
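A quick way to answer the version questions above (editor's sketch, not from the thread): `nvidia-smi` ships with the NVIDIA driver and `nvcc` with the CUDA toolkit, so each reports its own version. The fallback messages are placeholders for systems where the tools are missing.

```shell
# Report the driver and CUDA toolkit versions d-kleine asks about.
# Fallback echo lines fire when a tool is not on PATH (assumption: sh-compatible shell).
command -v nvidia-smi >/dev/null 2>&1 \
  && nvidia-smi --query-gpu=driver_version,name,memory.total --format=csv,noheader \
  || echo "nvidia-smi not found (driver not installed?)"
command -v nvcc >/dev/null 2>&1 \
  && nvcc --version | tail -n 1 \
  || echo "nvcc not found (CUDA toolkit not installed?)"
```

Either branch always prints a line, so the script is safe to run on machines without an NVIDIA GPU.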

Author
Owner

@pdevine commented on GitHub (Jun 14, 2024):

Both are supported. I've updated the docs with #5036. @thinkrapido, the problem is that you're loading a model that is too big for your GPU. You can use `ollama ps` to see what percentage of the model is loaded onto the CPU vs. the GPU.
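A back-of-the-envelope check makes the "model too big" point concrete (editor's numbers, not from the thread: q6_K averages roughly 6.5625 bits per weight, and the non-Ti RTX 4070 has 12 GB of VRAM):

```python
# Hedged estimate of weight storage for a quantized model; the KV cache
# and runtime overhead add to this, so real usage is even higher.
def approx_model_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB for a quantized model."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

needed_gb = approx_model_gb(34, 6.5625)   # codellama:34b-code-q6_K
rtx_4070_vram_gb = 12.0                   # RTX 4070 (non-Ti), assumed card
print(f"weights ~{needed_gb:.1f} GB vs {rtx_4070_vram_gb:.0f} GB VRAM")
# → weights ~27.9 GB vs 12 GB VRAM
```

Since only a fraction of the ~28 GB of weights fits in 12 GB of VRAM, Ollama offloads the remaining layers to the CPU, which explains both the slow responses and the high CPU usage reported above.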

Reference: github-starred/ollama#28904