[GH-ISSUE #1733] Feature request: improve install.sh and release binaries for CPU instructions #26748

Closed
opened 2026-04-22 03:16:14 -05:00 by GiteaMirror · 1 comment

Originally created by @oafish on GitHub (Dec 28, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1733

Would it be feasible to precompile multiple binaries for AVX, AVX2, AVX-512, and OpenBLAS, just like https://github.com/ggerganov/llama.cpp/releases?
install.sh could then detect not only the CPU architecture but also grep /proc/cpuinfo for CPU features, and download the most suitable binary.
I hope this would be an elegant solution for https://github.com/jmorganca/ollama/issues/644.
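The feature detection described above can be sketched as a small POSIX-shell snippet. This is an illustrative sketch only, not part of ollama's actual install.sh: it assumes a Linux host with /proc/cpuinfo, and the variant names (avx512, avx2, avx, baseline) are hypothetical labels for the per-feature release binaries the request proposes.

```shell
#!/bin/sh
# Hypothetical sketch: pick a CPU-feature variant by grepping /proc/cpuinfo.
# Falls back to "baseline" when the file is missing or no AVX flag is present.
detect_cpu_variant() {
    flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null || true)
    case "$flags" in
        *avx512f*) echo "avx512" ;;   # AVX-512 foundation instructions
        *avx2*)    echo "avx2" ;;     # AVX2
        *avx*)     echo "avx" ;;      # original AVX
        *)         echo "baseline" ;; # no AVX support detected
    esac
}

variant=$(detect_cpu_variant)
echo "Selected CPU variant: $variant"
# An installer could then fetch e.g. "ollama-linux-amd64-${variant}"
# (illustrative naming scheme, not a real release asset).
```

The flag strings (avx, avx2, avx512f) match what the Linux kernel reports in the `flags` line of /proc/cpuinfo on x86-64.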


@technovangelist commented on GitHub (Dec 28, 2023):

Hi @oafish, thanks so much for submitting the issue. AVX was introduced just about 10 years ago. It introduces extra functionality to make some linear algebra functions easier to process, and that's super important with Large Language Models. That said, there is an issue that is looking at this (#1279) which suggests that the CPU instructions should be determined at runtime. Seems to be a pretty good match to what you are asking here in this issue.

So I am going to close this one, but definitely track #1279 to keep an eye on where we are with this. Does that make sense? If you feel there is something covered here that #1279 doesn't capture, reopen this one, but it may be better just to add a comment on that issue.

Thanks for being part of this great community.


Reference: github-starred/ollama#26748