[GH-ISSUE #2527] Windows GPU libraries compiled with AVX2 instead of AVX #47992

Closed
opened 2026-04-28 06:20:42 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @dhiltgen on GitHub (Feb 15, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/2527

Even though we're setting:

```
generating config with: cmake -S ../llama.cpp -B ../llama.cpp/build/windows/amd64/cuda_v11.3 -DBUILD_SHARED_LIBS=on -DLLAMA_NATIVE=off -A x64 -DCMAKE_VERBOSE_MAKEFILE=on -DLLAMA_SERVER_VERBOSE=on -DLLAMA_CUBLAS=ON -DLLAMA_AVX=on -DCMAKE_CUDA_ARCHITECTURES=50;52;61;70;75;80
```

The actual compile lines look like this:

```
ClCompile:
  C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64\CL.exe /c /I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\include" /Zi /W3 /WX- /diagnostics:column /O2 /Ob1 /D WIN32 /D _WINDOWS /D NDEBUG /D GGML_USE_CUBLAS /D GGML_CUDA_DMMV_X=32 /D GGML_CUDA_MMV_Y=1 /D K_QUANTS_PER_ITERATION=2 /D GGML_CUDA_PEER_MAX_BATCH_SIZE=128 /D _CRT_SECURE_NO_WARNINGS /D _XOPEN_SOURCE=600 /D "CMAKE_INTDIR=\"RelWithDebInfo\"" /D _MBCS /Gm- /EHsc /MD /GS /arch:AVX2 /fp:precise /Zc:wchar_t /Zc:forScope /Zc:inline /GR /Fo"build_info.dir\RelWithDebInfo\\" /Fd"build_info.dir\RelWithDebInfo\build_info.pdb" /external:W3 /Gd /TP /errorReport:queue "C:\Users\danie\code\ollama\llm\llama.cpp\common\build-info.cpp"
```

The `/arch:AVX2` flag shouldn't be there: with `-DLLAMA_NATIVE=off` and only `-DLLAMA_AVX=on` requested, MSVC should be invoked with `/arch:AVX`, so the resulting GPU libraries can run on CPUs that support AVX but not AVX2.
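For context, a plausible mechanism (a paraphrased sketch, not the exact upstream CMake code): llama.cpp's `CMakeLists.txt` historically declared both AVX options with `ON` defaults, so a configure line that enables `LLAMA_AVX` but never explicitly disables `LLAMA_AVX2` can still take the AVX2 branch on MSVC:

```cmake
# Hedged sketch of llama.cpp's MSVC arch-flag selection (paraphrased).
# Both options defaulting to ON means -DLLAMA_AVX=on alone does NOT
# turn AVX2 off, and MSVC still receives /arch:AVX2.
option(LLAMA_AVX  "llama: enable AVX"  ON)
option(LLAMA_AVX2 "llama: enable AVX2" ON)

if (MSVC)
    if (LLAMA_AVX2)
        add_compile_options($<$<COMPILE_LANGUAGE:CXX>:/arch:AVX2>)
    elseif (LLAMA_AVX)
        add_compile_options($<$<COMPILE_LANGUAGE:CXX>:/arch:AVX>)
    endif()
endif()
```

If that is what is happening here, the build-script fix would be to pass `-DLLAMA_AVX2=off` explicitly when generating the AVX-only configuration, rather than relying on `-DLLAMA_AVX=on` to imply it.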
