[GH-ISSUE #9191] make ollama of windows amd64 version smaller #52501

Open
opened 2026-04-28 23:31:06 -05:00 by GiteaMirror · 1 comment

Originally created by @zmldndx on GitHub (Feb 18, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9191

The size of ggml-cuda.dll is more than twice that of llama.cpp's equivalent — can we make ollama-windows-amd64.zip much smaller?
Our tool is based on the ollama command-line exe, and we want users to be able to download ollama automatically. But it's too big on Windows, so many people abandon the download.

GiteaMirror added the feature request label 2026-04-28 23:31:06 -05:00

@mxyng commented on GitHub (Feb 18, 2025):

The release builds of ollama for Linux and Windows target a wide set of CUDA and ROCm architectures. These account for a large proportion of the artifact's size.

I have some ideas on how to shrink the download size while maintaining broad compatibility, but there's currently no bandwidth to work on this.

For the time being, if file sizes are an issue, you can always build from source, which builds only the architectures you specify.
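The build-from-source route mentioned above can be sketched roughly as follows. This is a hedged sketch, not official instructions: `CMAKE_CUDA_ARCHITECTURES` is the standard CMake variable for limiting CUDA targets, but the exact build steps and supported flags for ollama may differ, so check the project's developer documentation before relying on this.

```shell
# Sketch of a source build limited to a single GPU architecture
# (assumption: a typical CMake + Go build layout; verify against
# ollama's own developer docs).

git clone https://github.com/ollama/ollama.git
cd ollama

# Restrict the CUDA build to one compute capability, e.g. 8.6
# (RTX 30-series), instead of the full set shipped in releases.
cmake -B build -DCMAKE_CUDA_ARCHITECTURES="86"
cmake --build build

# Build the ollama binary itself.
go build .
```

Narrowing the architecture list means the resulting ggml-cuda library carries kernels for only your GPU, which is where most of the size savings come from.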

Reference: github-starred/ollama#52501