[GH-ISSUE #12036] nvcc fatal : Unsupported gpu architecture 'compute_35' (fixed with workaround) #54506

Closed
opened 2026-04-29 06:13:08 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @abcbarryn on GitHub (Aug 22, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12036

What is the issue?

When compiling Ollama initially on my system I got the error message: nvcc fatal : Unsupported gpu architecture 'compute_35'
The problem appears to be that the cmake build scripts detected my older GPU and tried to build the Ollama CUDA extensions to be compatible with it, but this was not possible, at least with the initial versions of gcc and nvcc that I was using.

Relevant log output

[ 56%] Building CUDA object ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
nvcc fatal   : Unsupported gpu architecture 'compute_35'
gmake[2]: *** [ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/build.make:80: ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:773: ml/backend/ggml/ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/all] Error 2
gmake: *** [Makefile:136: all] Error 2

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

Current

GiteaMirror added the build, bug, nvidia, linux labels 2026-04-29 06:13:10 -05:00
Author
Owner

@abcbarryn commented on GitHub (Aug 22, 2025):

I did get it to compile on my system by using nvcc 11.8 and gcc 11.
I think, though, that if it can't support the installed GPU it should simply issue a warning and compile a binary that supports newer GPUs, rather than aborting the compile with an error. Something like this:
A compute_35 GPU is installed; it will NOT be supported by the current version of NVCC.
Also, I had issues with it trying to use nvcc version 12.9 even though I had set `alternatives --config cuda` to 11.8. I had to symlink the CUDA 12.9 folder to the CUDA 11.8 folder; even though I set the environment variable `export CMAKE_CUDA_COMPILER=/usr/local/cuda-11.8/bin/nvcc`, it was ignored.
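(Editor's note: the export being ignored is expected CMake behavior. `CMAKE_CUDA_COMPILER` is a cache variable, not an environment variable, so it must be passed with `-D` on the first configure of a clean build directory; the environment variable CMake does honor for this purpose is `CUDACXX`. A minimal sketch, assuming a fresh `build` directory:)

```shell
# CMAKE_CUDA_COMPILER is a CMake cache variable: pass it with -D on the
# first configure of a clean build directory, not via export.
cmake -B build -DCMAKE_CUDA_COMPILER=/usr/local/cuda-11.8/bin/nvcc

# Alternatively, CMake reads the CUDACXX environment variable on the
# first configure to pick the CUDA compiler.
CUDACXX=/usr/local/cuda-11.8/bin/nvcc cmake -B build
```

Either form should make the `alternatives`/symlink workaround unnecessary, provided the build directory has not already been configured with another compiler.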

Author
Owner

@rick-github commented on GitHub (Aug 22, 2025):

CUDA 11 is currently not a [supported](https://github.com/ollama/ollama/blob/main/docs/gpu.md#nvidia) platform. It may be supported again in the future (https://github.com/ollama/ollama/pull/12000).

Author
Owner

@abcbarryn commented on GitHub (Aug 22, 2025):

Well, with my hardware installed and driver loaded, I couldn't build from source, even with CUDA 12.9. The only way I could get Ollama to compile and build with CUDA support was with CUDA 11.8! It saw my GPU and kept trying to target compute_35, but CUDA 12.9 has dropped support for compute_35. I couldn't figure out how to tell the build process not to target compute_35.
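(Editor's note: CMake's standard `CMAKE_CUDA_ARCHITECTURES` variable overrides an auto-detected architecture list in many projects, so something like the following may stop nvcc 12.9 from being asked to target compute_35. This is a sketch, untested against this tree; whether it takes effect depends on how Ollama's CMakeLists sets the variable.)

```shell
# Restrict the CUDA build to architectures that nvcc 12.9 still supports,
# overriding any detected list that includes compute_35.
cmake -B build -DCMAKE_CUDA_ARCHITECTURES="50;61;70;75;80;86"
cmake --build build
```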

Author
Owner

@dhiltgen commented on GitHub (Nov 6, 2025):

At this point, the ggml-cuda code has evolved to a point where the lack of capabilities on these older GPUs will cause models not to work. We're close to enabling Vulkan which may be able to operate on these older GPUs. Since you're OK building from source, I'd give Vulkan a try as it is already enabled for local builds, and see if you are able to get that working instead of CUDA.

Author
Owner

@abcbarryn commented on GitHub (Nov 6, 2025):

The following models work fine with CUDA 11.8 and Ollama 0.11.2:
Gemma3
Gemma3:12b
Deepseek-R1

Author
Owner

@abcbarryn commented on GitHub (Nov 16, 2025):

Patch file for Ollama 0.11.2 to enable compute 3.5. You must build using CUDA 11.8 and GCC version 11. Note: some newer models (gpt-oss) may not work on Ollama 0.11.2. Also, you may need to install Nvidia driver version 470, which requires kernel version 6.2.12 (or earlier) and GCC version 7 to install/link the kernel driver.

--- discover/gpu.go     2025-07-20 20:07:27.794281648 -0400
+++ discover/gpu.go     2025-08-22 20:12:11.442609544 -0400
@@ -67,6 +67,6 @@
 // (string values used to allow ldflags overrides at build time)
 var (
-       CudaComputeMajorMin = "5"
-       CudaComputeMinorMin = "0"
+       CudaComputeMajorMin = "3"
+       CudaComputeMinorMin = "5"
 )
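(Editor's note: the comment in `discover/gpu.go` says these strings exist specifically "to allow ldflags overrides at build time", so the same effect may be achievable without patching the source by overriding the variables at link time. A sketch; the package import path is assumed from the file's location in the tree:)

```shell
# Override the minimum CUDA compute capability at build time instead of
# patching discover/gpu.go (package path assumed from the file location).
go build -ldflags \
  "-X github.com/ollama/ollama/discover.CudaComputeMajorMin=3 \
   -X github.com/ollama/ollama/discover.CudaComputeMinorMin=5" .
```

This keeps the working tree clean across upgrades, at the cost of having to remember the flags on every build.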


Reference: github-starred/ollama#54506