[PR #2528] [CLOSED] Explicitly disable AVX2 on GPU builds #36795

Closed
opened 2026-04-22 21:26:00 -05:00 by GiteaMirror · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/ollama/ollama/pull/2528
Author: @dhiltgen
Created: 2/15/2024
Status: Closed

Base: windows-preview ← Head: fix_avx


📝 Commits (1)

  • db2a9ad Explicitly disable AVX2 on GPU builds

📊 Changes

1 file changed (+1 addition, -1 deletion)


📝 llm/generate/gen_windows.ps1 (+1 -1)

📄 Description

Even though we weren't setting LLAMA_AVX2 to on, somewhere in the CMake configuration it was getting toggled on. By explicitly setting it to off, we get /arch:AVX as intended.

Fixes #2527
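
The actual change is one line in llm/generate/gen_windows.ps1 (a PowerShell script); the sketch below is an illustrative shell equivalent, with hypothetical variable names, of how passing the flag explicitly removes any ambiguity about its default:

```shell
#!/bin/sh
# Hypothetical sketch: assemble the CMake defines for the GPU build.
# Variable names are illustrative, not the script's real ones.
CMAKE_DEFS="-DBUILD_SHARED_LIBS=on -DLLAMA_NATIVE=off"

# The fix: set LLAMA_AVX2=off explicitly instead of leaving it unset,
# so nothing in the CMake config can toggle it on for CUDA builds.
CMAKE_DEFS="$CMAKE_DEFS -DLLAMA_AVX=on -DLLAMA_AVX2=off"

echo "generating config with: cmake -S ../llama.cpp $CMAKE_DEFS"
```

With the flag pinned off, MSVC is invoked with /arch:AVX rather than /arch:AVX2, matching the generated config shown below.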

Input:

generating config with: cmake -S ../llama.cpp -B ../llama.cpp/build/windows/amd64/cuda_v11.3 -DBUILD_SHARED_LIBS=on -DLLAMA_NATIVE=off -A x64 -DCMAKE_VERBOSE_MAKEFILE=on -DLLAMA_SERVER_VERBOSE=on -DLLAMA_CUBLAS=ON -DLLAMA_AVX=on -DLLAMA_AVX2=off -DCUDAToolkit_INCLUDE_DIR=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include -DCMAKE_CUDA_ARCHITECTURES=50;52;61;70;75;80

Example compile (note the correct /arch:AVX flag):

  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64\CL.exe /c /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" /Zi /W3 /WX- /diagnostics:column /O2 /Ob1 /D WIN32 /D _WINDOWS /D NDEBUG /D GGML_USE_CUBLAS /D GGML_CUDA_DMMV_X=32 /D GGML_CUDA_MMV_Y=1 /D K_QUANTS_PER_ITERATION=2 /D GGML_CUDA_PEER_MAX_BATCH_SIZE=128 /D _CRT_SECURE_NO_WARNINGS /D _XOPEN_SOURCE=600 /D "CMAKE_INTDIR=\"RelWithDebInfo\"" /D _MBCS /Gm- /EHsc /MD /GS /arch:AVX /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /Fo"build_info.dir\RelWithDebInfo\\" /Fd"build_info.dir\RelWithDebInfo\build_info.pdb" /external:W3 /Gd /TP /errorReport:queue "C:\Users\danie\code\ollama\llm\llama.cpp\common\build-info.cpp"

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

GiteaMirror added the pull-request label 2026-04-22 21:26:00 -05:00

Reference: github-starred/ollama#36795