[GH-ISSUE #1938] ollama --version 0.1.20 not working #63152

Closed
opened 2026-05-03 12:18:03 -05:00 by GiteaMirror · 14 comments
Owner

Originally created by @PhilipAmadasun on GitHub (Jan 11, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/1938

Originally assigned to: @jmorganca on GitHub.

Our Ollama no longer works after upgrading to version `0.1.20`. Every command, for instance:

```
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {
      "role": "user",
      "content": "why is the sky blue?"
    }
  ]
}'
```

just gets stuck and doesn't run. What's going on? I believe this is the latest version; is it not stable? Do we have to downgrade Ollama, and if so, how do we go about doing that?
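
When a request hangs like this, it helps to first confirm whether the whole server is wedged or only the model load; a minimal sketch (assuming the default port), with a hard timeout so a hang fails fast instead of blocking the terminal:

```
# Probe a lightweight endpoint first: /api/tags only lists local models
# and should answer quickly even if a model load is stuck.
curl -sS --max-time 10 http://localhost:11434/api/tags

# Re-run the chat request with a timeout instead of waiting forever.
curl -sS --max-time 60 http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [{"role": "user", "content": "why is the sky blue?"}]
}'
```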

GiteaMirror added the bug label 2026-05-03 12:18:03 -05:00

@jmorganca commented on GitHub (Jan 12, 2024):

Hi @PhilipAmadasun - I'm sorry it's hanging for you. You definitely shouldn't need to downgrade – `0.1.20` was focused on stability around CUDA, although there's still a bit more work to do on it. To help me track it down:

  • Is this on macOS or Linux?
  • If Linux, what kind of GPU?
  • Do you have the logs handy? `journalctl --no-pager -u ollama` on Linux and `cat ~/.ollama/logs/server.log` on macOS

Thanks so much
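
For reference, the two log commands side by side, plus the follow variants that are handy for watching the failure happen live:

```
# Linux (systemd service): dump the full journal, or follow it live
journalctl --no-pager -u ollama
journalctl -u ollama -f

# macOS: print the server log, or follow it live
cat ~/.ollama/logs/server.log
tail -f ~/.ollama/logs/server.log
```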

@PhilipAmadasun commented on GitHub (Jan 12, 2024):

Here are the logs:
[ollama_logs.txt](https://github.com/jmorganca/ollama/files/13923196/ollama_logs.txt)

@PhilipAmadasun commented on GitHub (Jan 12, 2024):

@jmorganca We're using Linux (Ubuntu 22.04). These are the GPU specs:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:03:00.0 Off |                    0 |
| N/A   30C    P0    27W / 250W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  On   | 00000000:82:00.0 Off |                    0 |
| N/A   32C    P0    26W / 250W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
```

@Rust-Ninja-Sabi commented on GitHub (Jan 14, 2024):

I think it's the same bug:

```
> ollama run mixtral
zsh: illegal hardware instruction  ollama run mixtral
```

@Rust-Ninja-Sabi commented on GitHub (Jan 14, 2024):

Is it possible to download an older version?

@PhilipAmadasun commented on GitHub (Jan 14, 2024):

@Rust-Ninja-Sabi Yes, use this command:

```
curl https://ollama.ai/install.sh | sed 's#https://ollama.ai/download#https://github.com/jmorganca/ollama/releases/download/v0.1.17#' | sh
```
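
The same trick works for any release: the install script only needs its download URL rewritten to point at a pinned tag. A sketch with the tag factored into a variable (`OLLAMA_VERSION` is just an illustrative name):

```
# Pin the installer to a specific release tag; double quotes are needed
# so the shell expands ${OLLAMA_VERSION} inside the sed expression.
OLLAMA_VERSION=v0.1.17
curl https://ollama.ai/install.sh \
  | sed "s#https://ollama.ai/download#https://github.com/jmorganca/ollama/releases/download/${OLLAMA_VERSION}#" \
  | sh
```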

@Rust-Ninja-Sabi commented on GitHub (Jan 17, 2024):

Thanks. This script does not run on macOS.
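
On macOS the rough equivalent is to download a pinned release asset by hand; a sketch, assuming the release ships the usual `Ollama-darwin.zip` app bundle (check the release page for the exact asset name):

```
# Fetch a pinned macOS build straight from the GitHub release (asset
# name assumed), then unpack and move Ollama.app into /Applications.
curl -L -o Ollama-darwin.zip \
  https://github.com/jmorganca/ollama/releases/download/v0.1.17/Ollama-darwin.zip
unzip Ollama-darwin.zip
```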

@dhiltgen commented on GitHub (Jan 26, 2024):

Here's the excerpt from the log where it went bad.

```
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: llama_apply_lora_from_file_internal: applying lora adapter from '/usr/share/ollama/.ollama/models/blobs/sha256:f4e82fc0919ab5e92b0bf8230154a96cd6c0462a7583b39af0ab6f4d1c8d3521' - please wait ...
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: llama_apply_lora_from_file_internal: bad file magic
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: llama_init_from_gpt_params: error: failed to apply lora adapter
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: Lazy loading /tmp/ollama2924267924/cuda/libext_server.so library
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: Lazy loading /tmp/ollama2924267924/cuda/libext_server.so library
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: {"timestamp":1705015928,"level":"ERROR","function":"load_model","line":581,"message":"unable to load model","model":"/usr/share/ollama/.ollama/models/blobs/sha256:e8a35b5937a5e6d5c35d1f2a15f161e07eefe5e5bb0a3cdd42998ee79b057730"}
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: 2024/01/11 23:32:08 llm.go:129: Failed to load dynamic library cuda - falling back to CPU mode error loading model /usr/share/ollama/.ollama/models/blobs/sha256:e8a35b5937a5e6d5c35d1f2a15f161e07eefe5e5bb0a3cdd42998ee79b057
Jan 11 23:32:08 arnold.ailab.internal ollama[340593]: 2024/01/11 23:32:08 ext_server_common.go:85: concurrent llm servers not yet supported, waiting for prior server to complete
Jan 12 18:53:33 arnold.ailab.internal systemd[1]: Stopping Ollama Service...
```

I can't speak to the lora adapter load problem, but that failure cascaded into another bug where we didn't unlock a lock, and that led to `concurrent llm servers not yet supported, waiting for prior server to complete`, which was fixed a week ago. Upgrading to 0.1.22 will resolve the lock bug, but you might want to re-pull your models in case something got corrupted on your filesystem.
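
For the re-pull, something like this per affected model (using the model name from the original report):

```
# Remove the possibly corrupted local copy, then fetch a fresh one.
ollama rm llama2
ollama pull llama2
```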

@dhiltgen commented on GitHub (Jan 26, 2024):

@Rust-Ninja-Sabi your problem is unrelated to this issue. You are most likely trying to run Ollama under Rosetta on an ARM Mac, which until recently wasn't supported (resulting in an "illegal instruction" error). If you upgrade, it will work, but you should run Ollama as a native ARM app and you'll get much better performance.
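
A quick way to check whether a shell (and anything launched from it) is running under Rosetta is the `proc_translated` sysctl:

```
# Prints 1 if the current process is translated (Rosetta), 0 if native
# ARM; the key doesn't exist on Intel Macs, so an error there means x86.
sysctl -n sysctl.proc_translated

# The architecture the shell reports: arm64 native, x86_64 under Rosetta.
uname -m
```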

@Rust-Ninja-Sabi commented on GitHub (Jan 27, 2024):

@dhiltgen Hello Daniel,
Thanks for your message. I installed Ollama again (version 0.1.22) and now it works. I installed it from the Ollama homepage; I hope it is the native version.

@dhiltgen commented on GitHub (Jan 27, 2024):

@Rust-Ninja-Sabi we compile as a "Mach-O universal binary", so a single executable contains both x86 and ARM variants and macOS will pick the right one based on your configuration. Running under Rosetta will work now (where it used to crash), but with a significant performance penalty.
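
You can confirm what the binary contains, and force a particular slice, with standard macOS tools; a sketch (the install path is assumed, adjust to wherever your `ollama` executable lives):

```
# Inspect the slices baked into the universal binary.
file /usr/local/bin/ollama           # reports "Mach-O universal binary"
lipo -archs /usr/local/bin/ollama    # e.g. "x86_64 arm64"

# Run a specific slice explicitly, e.g. to compare behavior:
arch -arm64 ollama --version
arch -x86_64 ollama --version
```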

@dhiltgen commented on GitHub (Jan 27, 2024):

@PhilipAmadasun please let us know if 0.1.22 resolves your problem

@Rust-Ninja-Sabi commented on GitHub (Jan 27, 2024):

Thanks. It is working.

@PhilipAmadasun commented on GitHub (Feb 16, 2024):

@dhiltgen @jmorganca All's good! Sorry for the late response.

Reference: github-starred/ollama#63152