[GH-ISSUE #732] I could not run ollama as standalone install or via docker on Ubuntu 22.04.3 LTS with nvidia GPU RTX 3060 12GB #46851

Closed
opened 2026-04-28 00:52:39 -05:00 by GiteaMirror · 8 comments

Originally created by @pexus on GitHub (Oct 8, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/732

My system processor is: AMD® FX(tm)-8350 eight-core processor × 8.
I have 32 GB of main memory.

I tried installing ollama both as a standalone install and via docker. Every time I run any model I get the following error.

Error: failed to start a llama runner

Looking at the status of the ollama service, I get the following. I have tried reinstalling multiple times and installing the NVIDIA CUDA toolkit.

systemctl status ollama
○ ollama.service - Ollama Service
Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: enabled)
Active: inactive (dead) since Sat 2023-10-07 21:23:04 MDT; 14min ago
Process: 1324 ExecStart=/usr/local/bin/ollama serve (code=exited, status=0/SUCCESS)
Main PID: 1324 (code=exited, status=0/SUCCESS)
CPU: 17.482s

Oct 07 17:42:39 ubuntu-ai ollama[1324]: 2023/10/07 17:42:39 llama.go:323: llama runner exited with error: signal: illegal instruction (core dumped)
Oct 07 17:43:47 ubuntu-ai ollama[1324]: [GIN] 2023/10/07 - 17:43:47 | 200 | 25.577µs | 127.0.0.1 | HEAD "/"
Oct 07 17:43:47 ubuntu-ai ollama[1324]: [GIN] 2023/10/07 - 17:43:47 | 200 | 405.285µs | 127.0.0.1 | GET "/api/tags"
Oct 07 17:44:04 ubuntu-ai ollama[1324]: [GIN] 2023/10/07 - 17:44:04 | 200 | 45.925µs | 127.0.0.1 | HEAD "/"
Oct 07 17:44:38 ubuntu-ai ollama[1324]: 2023/10/07 17:44:38 llama.go:330: error starting llama runner: llama runner did not start within alloted ti>
Oct 07 17:44:38 ubuntu-ai ollama[1324]: [GIN] 2023/10/07 - 17:44:38 | 500 | 4m4s | 127.0.0.1 | POST "/api/genera
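
A fuller view of the runner error than `systemctl status` shows can be obtained by following the service journal (standard systemd tooling; shown as a sketch, not output captured from this machine):

# follow the ollama unit's journal live to see the full runner error
journalctl -u ollama -f
# or dump everything the unit logged since the last boot
journalctl -u ollama -b --no-pager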

GiteaMirror added the bug label 2026-04-28 00:52:39 -05:00

@jjeantw commented on GitHub (Oct 8, 2023):

AMD FX 8350 has AVX support, but no AVX2 support.


@pexus commented on GitHub (Oct 8, 2023):

Thanks

lscpu does list AVX support, but not AVX2, as you mentioned.
So is there any workaround, or am I out of luck running ollama with this CPU?
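
For reference, a quick way to confirm which AVX variants a CPU reports (standard Linux tooling; a sketch, not taken from the original thread):

# list the AVX-related feature flags the kernel reports for this CPU
grep -o 'avx[a-z0-9_]*' /proc/cpuinfo | sort -u
# an FX-8350 is expected to print only "avx" (no "avx2")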


@lasseedfast commented on GitHub (Oct 21, 2023):

I had the exact same problem and error, and still have it with Ollama, but I was able to install and run llama.cpp with `CMAKE_ARGS="-DLLAMA_CUBLAS=1 -DLLAMA_AVX2=OFF -DLLAMA_F16C=OFF -DLLAMA_FMA=OFF" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir`. I know too little to say whether any of this is transferable to this problem, but at least I can run models on my old processor now.

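For anyone trying the same workaround, the one-liner breaks down roughly as follows (a sketch; the CMake option names are taken from the comment above, and their effect on the wheel build is assumed rather than verified here):

# rebuild the llama-cpp-python wheel from source (FORCE_CMAKE=1) with
# cuBLAS enabled and the AVX2/F16C/FMA code paths disabled for older CPUs
CMAKE_ARGS="-DLLAMA_CUBLAS=1 -DLLAMA_AVX2=OFF -DLLAMA_F16C=OFF -DLLAMA_FMA=OFF" \
FORCE_CMAKE=1 \
pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir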

@pexus commented on GitHub (Oct 21, 2023):

Thanks for the response. What platform did you use the command on? Is it Mac?
I tried to do the above, but it didn't work. I am on Ubuntu though.
I was wondering if I have to clone the repo and rebuild, passing the arguments you provided in the makefile. I will give this a shot and see if it works for me.
Thanks again.


@pexus commented on GitHub (Oct 21, 2023):

I tried git cloning and rebuilding it, but I get the same error. The ollama debug output still shows the following on a CPU which does not have AVX2 support:

{"timestamp":1697922947,"level":"INFO","function":"main","line":1192,"message":"system info","n_threads":4,"total_threads":8,"system_info":"AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | "}

I believe I need to disable this in my local build, but I am not sure how to do it, or which CMake file I need to edit when doing the build. Is there another way to propagate the flags you mentioned when doing the build using go?

go generate ./...
go build .

I don't quite understand the command you provided in the previous response:

CMAKE_ARGS="-DLLAMA_CUBLAS=1 -DLLAMA_AVX2=OFF -DLLAMA_F16C=OFF -DLLAMA_FMA=OFF" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir

Is this a build command? If so, it does not match the build instructions in the git repo, which use go to build it.

Any suggestions appreciated.
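
One thing that might help narrow it down: the same options can also be passed to a plain llama.cpp CMake build, independent of Ollama's go-based build (a sketch; the option names are taken from the command quoted above, and the clone path is illustrative):

# standalone llama.cpp build with cuBLAS enabled and AVX2/F16C/FMA disabled
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DLLAMA_CUBLAS=1 -DLLAMA_AVX2=OFF -DLLAMA_F16C=OFF -DLLAMA_FMA=OFF
cmake --build build --config Release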


@lasseedfast commented on GitHub (Oct 24, 2023):

I might have been unclear when writing, sorry for that. What I meant is that I used the `CMAKE_ARGS` to install the llama-cpp-python package, not to install Ollama. But since I had the same error code using llama-cpp-python and Ollama, I thought it could be a lead.

Actually I later installed Ollama by setting the CMake environment to not use AVX2 (can't remember how) and then built Ollama from source, but it was slow, so I must have done something wrong...


@pexus commented on GitHub (Oct 26, 2023):

Thanks for clarifying.
I tried rebuilding Ollama, but could not figure out an easy way to set the flag. I thought maybe the build would dynamically figure this out based on the processor during make, but it does not. It would be good to have the runtime figure this out automatically rather than setting it statically at build time, if that is the issue. Anyway, for now I can't run this. I am using other LLM runtimes like Text Generation Web UI. Hopefully the maintainers will fix this, since I see it is marked as a bug.


@pexus commented on GitHub (Oct 28, 2023):

I downloaded the latest version, v0.1.7 (Oct 27, 2023). This issue seems to have been resolved. I can load Mistral and it works fine.
So I am closing this thread.

Reference: github-starred/ollama#46851