[GH-ISSUE #1703] Error: llama runner process has terminated. when running dolphin-mixtral #26722

Closed
opened 2026-04-22 03:11:27 -05:00 by GiteaMirror · 9 comments

Originally created by @G-only1 on GitHub (Dec 25, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/1703

When I run `ollama run dolphin-mixtral` it gives the error `Error: llama runner process has terminated.`


@ruucat commented on GitHub (Dec 26, 2023):

Same error with `llava:latest`


@BruceMacD commented on GitHub (Dec 27, 2023):

It sounds like you may be running out of memory while loading the model.

Is there any more info in the logs you could share here?

https://github.com/jmorganca/ollama/blob/main/docs/troubleshooting.md


@jocot commented on GitHub (Dec 28, 2023):

I had the same error; I managed to fix it by setting the `num_ctx` parameter to 16384.

"The base model has 32k context, I finetuned it with 16k"
https://huggingface.co/TheBloke/dolphin-2.6-mixtral-8x7b-GGUF

Note: this problem also exists with the `orca2:13b-q8_0` model; setting `num_ctx` to 8192 or 16384 gets it to work.
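[Editor's note] The `num_ctx` fix above can also be baked into a model variant with a Modelfile, so it persists across sessions. A hedged sketch; the variant name `dolphin-mixtral-16k` is just an example:

```shell
# Sketch: bake num_ctx into a model variant via a Modelfile so the fix
# does not have to be re-applied every session.
cat > Modelfile <<'EOF'
FROM dolphin-mixtral
PARAMETER num_ctx 16384
EOF
# Then build and run the variant (requires Ollama to be installed):
#   ollama create dolphin-mixtral-16k -f Modelfile
#   ollama run dolphin-mixtral-16k
```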


@williamsun-hha commented on GitHub (Dec 31, 2023):

@jocot Thank you very much for your help. Can you please share how you set the `num_ctx` parameter to 16384 in Ollama? I don't see an option for it on the command line. Thank you very much in advance!


@BruceMacD commented on GitHub (Jan 2, 2024):

@williamsun-hha
First, make sure you're on a version of Ollama that supports the `/set parameter` option; the easiest way to do that is to be on the latest release:
https://github.com/jmorganca/ollama/blob/main/docs/faq.md#how-can-i-upgrade-ollama

Then run the set command like this:

```
ollama run dolphin-mixtral

>>> /set parameter num_ctx 16384
Set parameter 'num_ctx' to '16384'

... continue your session now with the context set
```
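[Editor's note] For anyone who can't reach the interactive prompt at all, `num_ctx` can also be passed per-request through the REST API's `options` field. A hedged sketch; the model name and prompt are placeholders:

```shell
# Sketch: pass num_ctx per request through the API's "options" field,
# which avoids the interactive /set step entirely.
payload='{"model": "dolphin-mixtral", "prompt": "Why is the sky blue?", "options": {"num_ctx": 16384}}'
# Send it to the local Ollama server (assumes the default port 11434):
#   curl http://localhost:11434/api/generate -d "$payload"
```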

@vojtabohm commented on GitHub (Jan 2, 2024):

I have the same issue as well. I cannot run the `/set` command since the `run` command fails before I get a chance to do anything else.

Running on M1 Max 32GB. Getting status 5, which means running out of memory.

```
warning: current allocated size is greater than the recommended max working set size
ggml_metal_graph_compute: command buffer 4 failed with status 5
```

Is there a way I could limit the allocated memory? I would like to get it to run even if it's very slow to generate tokens.
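[Editor's note] One knob that may help here: Ollama exposes a `num_gpu` option (the number of layers offloaded to the GPU), and setting it to 0 should keep the whole model on the CPU, trading speed for not exhausting the Metal working set. A hedged sketch via the API's `options` field; the model name and prompt are placeholders:

```shell
# Sketch: set num_gpu to 0 so no layers are offloaded to the GPU; slower,
# but it should avoid the Metal out-of-memory failure.
payload='{"model": "dolphin-mixtral", "prompt": "Hello", "options": {"num_gpu": 0}}'
# Send it to the local Ollama server (assumes the default port 11434):
#   curl http://localhost:11434/api/generate -d "$payload"
```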


@marko911 commented on GitHub (Jan 3, 2024):

I'm also on an M1 Pro 32GB and I get `Error: ollama runner process has terminated` when trying to run `notux`


@G-only1 commented on GitHub (Jan 3, 2024):

> It sounds like you may be running out of memory while loading the model.
>
> Is there any more info in the logs you could share here?
>
> https://github.com/jmorganca/ollama/blob/main/docs/troubleshooting.md

```
Dec 24 23:31:10 dolphin-virtual-machine systemd[1]: Started Ollama Service.
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key.
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: Your new public key is:
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFUoGHH4jz95Lzc1ptUUoARs1WXd319YwK54bLfxdX1v
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 images.go:737: total blobs: 0
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 images.go:744: total unused blobs removed: 0
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 routes.go:895: Listening on 127.0.0.1:11434 (version 0.1.17)
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 routes.go:915: warning: gpu support may not be enabled, check that you have installed GPU drivers: nvidia-smi command failed
Dec 24 23:31:31 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:31:31 | 200 | 129.465µs | 127.0.0.1 | HEAD "/"
Dec 24 23:31:31 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:31:31 | 404 | 358.46µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:31:33 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:33 download.go:123: downloading 122f07acc1ea in 64 413 MB part(s)
Dec 24 23:41:31 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:31 download.go:123: downloading a47b02e00552 in 1 106 B part(s)
Dec 24 23:41:33 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:33 download.go:123: downloading 9640c2212a51 in 1 41 B part(s)
Dec 24 23:41:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:35 download.go:123: downloading 4b0d050e562d in 1 98 B part(s)
Dec 24 23:41:38 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:38 download.go:123: downloading 4836176c502f in 1 484 B part(s)
Dec 24 23:45:01 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:01 | 200 | 13m30s | 127.0.0.1 | POST "/api/pull"
Dec 24 23:45:01 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:01 | 200 | 1.507254ms | 127.0.0.1 | POST "/api/show"
Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:403: skipping accelerated runner because num_gpu=0
Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:436: starting llama runner
Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:494: waiting for llama runner to start responding
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:451: signal: illegal instruction (core dumped)
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:459: error starting llama runner: llama runner process has terminated
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:525: llama runner stopped successfully
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:04 | 500 | 2.189316673s | 127.0.0.1 | POST "/api/generate"
Dec 24 23:46:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:46:23 | 200 | 37.038µs | 127.0.0.1 | HEAD "/"
Dec 24 23:46:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:46:23 | 200 | 1.908081ms | 127.0.0.1 | GET "/api/tags"
Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 32.273µs | 127.0.0.1 | HEAD "/"
Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 893.862µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 1.024683ms | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:403: skipping accelerated runner because num_gpu=0
Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:436: starting llama runner
Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:494: waiting for llama runner to start responding
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:451: signal: illegal instruction (core dumped)
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:459: error starting llama runner: llama runner process has terminated
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:525: llama runner stopped successfully
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:23 | 500 | 1.309327305s | 127.0.0.1 | POST "/api/generate"
Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 36.988µs | 127.0.0.1 | HEAD "/"
Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 871.764µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 855.621µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:403: skipping accelerated runner because num_gpu=0
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:436: starting llama runner
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:494: waiting for llama runner to start responding
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:451: signal: illegal instruction (core dumped)
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:459: error starting llama runner: llama runner process has terminated
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:525: llama runner stopped successfully
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:35 | 500 | 1.332430527s | 127.0.0.1 | POST "/api/generate"
Dec 24 23:48:54 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:54 | 200 | 41.983µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:14 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:14 | 200 | 37.102µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:14 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:14 | 200 | 816.071µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:49:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:23 | 200 | 35.097µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:23 | 200 | 855.153µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:49:39 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:39 | 200 | 43.59µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:39 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:39 | 200 | 989.712µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:49:57 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:57 | 200 | 37.255µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:57 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:57 | 200 | 793.545µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:50:44 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:50:44 | 200 | 61.603µs | 127.0.0.1 | GET "/"
```

<!-- gh-comment-id:1875395442 --> @G-only1 commented on GitHub (Jan 3, 2024): > It sounds like you may be running out of memory while loading the model. > > Is there any more info in the logs you could share here? > > https://github.com/jmorganca/ollama/blob/main/docs/troubleshooting.md `Dec 24 23:31:10 dolphin-virtual-machine systemd[1]: Started Ollama Service. Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key. Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: Your new public key is: Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFUoGHH4jz95Lzc1ptUUoARs1WXd319YwK54bLfxdX1v Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 images.go:737: total blobs: 0 Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 images.go:744: total unused blobs removed: 0 Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 routes.go:895: Listening on 127.0.0.1:11434 (version 0.1.17) Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 routes.go:915: warning: gpu support may not be enabled, check that you have installed GPU drivers: nvidia-smi command failed Dec 24 23:31:31 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:31:31 | 200 | 129.465µs | 127.0.0.1 | HEAD "/" Dec 24 23:31:31 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:31:31 | 404 | 358.46µs | 127.0.0.1 | POST "/api/show" Dec 24 23:31:33 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:33 download.go:123: downloading 122f07acc1ea in 64 413 MB part(s) Dec 24 23:41:31 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:31 download.go:123: downloading a47b02e00552 in 1 106 B part(s) Dec 24 23:41:33 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:33 download.go:123: downloading 9640c2212a51 in 1 41 B part(s) Dec 24 23:41:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 
23:41:35 download.go:123: downloading 4b0d050e562d in 1 98 B part(s) Dec 24 23:41:38 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:38 download.go:123: downloading 4836176c502f in 1 484 B part(s) Dec 24 23:45:01 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:01 | 200 | 13m30s | 127.0.0.1 | POST "/api/pull" Dec 24 23:45:01 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:01 | 200 | 1.507254ms | 127.0.0.1 | POST "/api/show" Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:403: skipping accelerated runner because num_gpu=0 Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:436: starting llama runner Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:494: waiting for llama runner to start responding Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:451: signal: illegal instruction (core dumped) Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:459: error starting llama runner: llama runner process has terminated Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:525: llama runner stopped successfully Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:04 | 500 | 2.189316673s | 127.0.0.1 | POST "/api/generate" Dec 24 23:46:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:46:23 | 200 | 37.038µs | 127.0.0.1 | HEAD "/" Dec 24 23:46:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:46:23 | 200 | 1.908081ms | 127.0.0.1 | GET "/api/tags" Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 32.273µs | 127.0.0.1 | HEAD "/" Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 893.862µs | 127.0.0.1 | POST "/api/show" Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 1.024683ms | 
127.0.0.1 | POST "/api/show" Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:403: skipping accelerated runner because num_gpu=0 Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:436: starting llama runner Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:494: waiting for llama runner to start responding Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:451: signal: illegal instruction (core dumped) Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:459: error starting llama runner: llama runner process has terminated Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:525: llama runner stopped successfully Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:23 | 500 | 1.309327305s | 127.0.0.1 | POST "/api/generate" Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 36.988µs | 127.0.0.1 | HEAD "/" Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 871.764µs | 127.0.0.1 | POST "/api/show" Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 855.621µs | 127.0.0.1 | POST "/api/show" Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:403: skipping accelerated runner because num_gpu=0 Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:436: starting llama runner Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:494: waiting for llama runner to start responding Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:451: signal: illegal instruction (core dumped) Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:459: error starting llama runner: llama runner process has terminated Dec 24 
23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:525: llama runner stopped successfully Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:35 | 500 | 1.332430527s | 127.0.0.1 | POST "/api/generate" Dec 24 23:48:54 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:54 | 200 | 41.983µs | 127.0.0.1 | HEAD "/" Dec 24 23:49:14 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:14 | 200 | 37.102µs | 127.0.0.1 | HEAD "/" Dec 24 23:49:14 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:14 | 200 | 816.071µs | 127.0.0.1 | POST "/api/show" Dec 24 23:49:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:23 | 200 | 35.097µs | 127.0.0.1 | HEAD "/" Dec 24 23:49:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:23 | 200 | 855.153µs | 127.0.0.1 | POST "/api/show" Dec 24 23:49:39 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:39 | 200 | 43.59µs | 127.0.0.1 | HEAD "/" Dec 24 23:49:39 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:39 | 200 | 989.712µs | 127.0.0.1 | POST "/api/show" Dec 24 23:49:57 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:57 | 200 | 37.255µs | 127.0.0.1 | HEAD "/" Dec 24 23:49:57 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:57 | 200 | 793.545µs | 127.0.0.1 | POST "/api/show" Dec 24 23:50:44 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:50:44 | 200 | 61.603µs | 127.0.0.1 | GET "/"Dec 24 23:31:10 dolphin-virtual-machine systemd[1]: Started Ollama Service. Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key. 
Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: Your new public key is: Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFUoGHH4jz95Lzc1ptUUoARs1WXd319YwK54bLfxdX1v Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 images.go:737: total blobs: 0 Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 images.go:744: total unused blobs removed: 0 Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 routes.go:895: Listening on 127.0.0.1:11434 (version 0.1.17) Dec 24 23:31:10 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:10 routes.go:915: warning: gpu support may not be enabled, check that you have installed GPU drivers: nvidia-smi command failed Dec 24 23:31:31 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:31:31 | 200 | 129.465µs | 127.0.0.1 | HEAD "/" Dec 24 23:31:31 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:31:31 | 404 | 358.46µs | 127.0.0.1 | POST "/api/show" Dec 24 23:31:33 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:31:33 download.go:123: downloading 122f07acc1ea in 64 413 MB part(s) Dec 24 23:41:31 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:31 download.go:123: downloading a47b02e00552 in 1 106 B part(s) Dec 24 23:41:33 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:33 download.go:123: downloading 9640c2212a51 in 1 41 B part(s) Dec 24 23:41:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:35 download.go:123: downloading 4b0d050e562d in 1 98 B part(s) Dec 24 23:41:38 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:41:38 download.go:123: downloading 4836176c502f in 1 484 B part(s) Dec 24 23:45:01 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:01 | 200 | 13m30s | 127.0.0.1 | POST "/api/pull" Dec 24 23:45:01 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:01 | 200 | 1.507254ms | 127.0.0.1 | POST "/api/show" Dec 24 23:45:03 dolphin-virtual-machine 
```
ollama[3974]: 2023/12/24 23:45:03 llama.go:403: skipping accelerated runner because num_gpu=0
Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:436: starting llama runner
Dec 24 23:45:03 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:03 llama.go:494: waiting for llama runner to start responding
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:451: signal: illegal instruction (core dumped)
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:459: error starting llama runner: llama runner process has terminated
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:45:04 llama.go:525: llama runner stopped successfully
Dec 24 23:45:04 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:45:04 | 500 | 2.189316673s | 127.0.0.1 | POST "/api/generate"
Dec 24 23:46:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:46:23 | 200 | 37.038µs | 127.0.0.1 | HEAD "/"
Dec 24 23:46:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:46:23 | 200 | 1.908081ms | 127.0.0.1 | GET "/api/tags"
Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 32.273µs | 127.0.0.1 | HEAD "/"
Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 893.862µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:21 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:21 | 200 | 1.024683ms | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:403: skipping accelerated runner because num_gpu=0
Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:436: starting llama runner
Dec 24 23:48:22 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:22 llama.go:494: waiting for llama runner to start responding
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:451: signal: illegal instruction (core dumped)
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:459: error starting llama runner: llama runner process has terminated
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:23 llama.go:525: llama runner stopped successfully
Dec 24 23:48:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:23 | 500 | 1.309327305s | 127.0.0.1 | POST "/api/generate"
Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 36.988µs | 127.0.0.1 | HEAD "/"
Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 871.764µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:34 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:34 | 200 | 855.621µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:403: skipping accelerated runner because num_gpu=0
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:436: starting llama runner
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:494: waiting for llama runner to start responding
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:451: signal: illegal instruction (core dumped)
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:459: error starting llama runner: llama runner process has terminated
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: 2023/12/24 23:48:35 llama.go:525: llama runner stopped successfully
Dec 24 23:48:35 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:35 | 500 | 1.332430527s | 127.0.0.1 | POST "/api/generate"
Dec 24 23:48:54 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:48:54 | 200 | 41.983µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:14 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:14 | 200 | 37.102µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:14 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:14 | 200 | 816.071µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:49:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:23 | 200 | 35.097µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:23 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:23 | 200 | 855.153µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:49:39 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:39 | 200 | 43.59µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:39 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:39 | 200 | 989.712µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:49:57 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:57 | 200 | 37.255µs | 127.0.0.1 | HEAD "/"
Dec 24 23:49:57 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:49:57 | 200 | 793.545µs | 127.0.0.1 | POST "/api/show"
Dec 24 23:50:44 dolphin-virtual-machine ollama[3974]: [GIN] 2023/12/24 - 23:50:44 | 200 | 61.603µs | 127.0.0.1 | GET "/"
```
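The repeated `signal: illegal instruction (core dumped)` entries in the log above usually point at a CPU-feature mismatch rather than memory pressure: a runner binary built with vector extensions such as AVX/AVX2 crashes this way when the (virtual) CPU does not expose them, which is common in VM setups like this `dolphin-virtual-machine` host. A quick way to check on Linux (a sketch; the exact flag names vary by CPU):

```shell
# List which AVX-family flags the kernel sees on the first CPU.
# If nothing is printed from /proc/cpuinfo, the VM's virtual CPU
# is probably not advertising AVX at all.
flags=$(grep -m1 -o 'avx[0-9a-z_]*' /proc/cpuinfo 2>/dev/null | sort -u)
echo "${flags:-no AVX flags visible}"
```

If no AVX flags are visible, enabling host CPU passthrough in the hypervisor settings is typically the fix.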
Author
Owner

@jocot commented on GitHub (Jan 7, 2024):

> I have the same issue as well. I cannot run the set command since the run command fails before I get a chance to do anything else.
>
> Running on M1 Max 32GB. Getting status 5, which means running out of memory.
>
> warning: current allocated size is greater than the recommended max working set size
> ggml_metal_graph_compute: command buffer 4 failed with status 5
>
> Is there a way I could limit the allocated memory? I would like to get it to run even if it's very slow to generate tokens.

Yes, you can adjust the num_ctx setting, which involves editing a JSON configuration file.

> I have the same issue as well. I cannot run the set command since the run command fails before I get a chance to do anything else.
>
> Running on M1 Max 32GB. Getting status 5, which means running out of memory.
>
> warning: current allocated size is greater than the recommended max working set size
> ggml_metal_graph_compute: command buffer 4 failed with status 5
>
> Is there a way I could limit the allocated memory? I would like to get it to run even if it's very slow to generate tokens.

@vojtabohm @marko911

I've written a Python script to edit the num_ctx setting (for cases where you cannot start the model at all).

It should work on a Mac, but I don't have a Mac to test it on.

Instructions are at the link:

https://gist.github.com/jocot/27fd0c6b370d5afd317b2a5602f4dea4

If it still won't run after changing num_ctx, then it's most likely due to insufficient RAM.
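An alternative that avoids editing files by hand is to bake the smaller context window into a new model tag with a Modelfile. A sketch, with example names:

```
FROM dolphin-mixtral
PARAMETER num_ctx 16384
```

Saving this as `Modelfile` and running `ollama create dolphin-mixtral-16k -f Modelfile` followed by `ollama run dolphin-mixtral-16k` gives a model that always loads with the reduced context.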

Let me know if any questions or issues.

I hope that helps!
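For models that do load but then run out of memory mid-generation, num_ctx can also be passed per request through the `options` field of Ollama's REST API, with no file edits at all. A minimal sketch of the request payload (the model name and value are examples; it assumes the documented `/api/generate` endpoint of a local Ollama server):

```python
import json

# Build a /api/generate request that caps the context window at 16384,
# overriding the model's default num_ctx for this request only.
payload = {
    "model": "dolphin-mixtral",
    "prompt": "Hello",
    "options": {"num_ctx": 16384},
    "stream": False,
}

body = json.dumps(payload).encode("utf-8")
print(body.decode("utf-8"))

# To actually send it, a running Ollama server is required, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:11434/api/generate", data=body,
#     headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```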

<!-- gh-comment-id:1879904080 --> @jocot commented on GitHub (Jan 7, 2024): > I have the same issue as well. I cannot run the `set` command since the `run` command fails before I get a chance to do anything else. > > Running on M1 Max 32GB. Getting status 5, which means running out of memory. > > ``` > warning: current allocated size is greater than the recommended max working set size > ggml_metal_graph_compute: command buffer 4 failed with status 5 > ``` > > Is there a way I could limit the allocated memory? I would like to get it to run even if it's very slow to generate tokens. Yes you can adjust the num_ctx setting which involves editing a json configuration file. > I have the same issue as well. I cannot run the `set` command since the `run` command fails before I get a chance to do anything else. > > Running on M1 Max 32GB. Getting status 5, which means running out of memory. > > ``` > warning: current allocated size is greater than the recommended max working set size > ggml_metal_graph_compute: command buffer 4 failed with status 5 > ``` > > Is there a way I could limit the allocated memory? I would like to get it to run even if it's very slow to generate tokens. @vojtabohm @marko911 I've written a python script to edit the num_ctx setting (if you cannot start the model at all) It *should* work on a Mac but I don't have a Mac to test it on. Instructions are on the link: [https://gist.github.com/jocot/27fd0c6b370d5afd317b2a5602f4dea4](url) If it still won't run after changing num_ctx, then it's most likely due to insufficient RAM. Let me know if any questions or issues. I hope that helps!

Reference: github-starred/ollama#26722