[GH-ISSUE #630] Error: failed to start a llama runner #26042

Closed
opened 2026-04-22 01:55:27 -05:00 by GiteaMirror · 30 comments
Owner

Originally created by @azhang on GitHub (Sep 28, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/630

Originally assigned to: @BruceMacD on GitHub.

When I run

ollama run mistral

it downloads properly but then fails to run, with the following error:

Error: failed to start a llama runner

I'm running this on my Intel MacBook Pro with 64 GB of RAM.

GiteaMirror added the bug label 2026-04-22 01:55:27 -05:00

@BruceMacD commented on GitHub (Sep 28, 2023):

This is the error that arises when the llama.cpp runner that runs the models fails to start. I'm going to get this tested on an Intel Mac, but in the meantime just wanted to verify that you're running the Ollama executable from a release (rather than building from source).


@Gru2000 commented on GitHub (Oct 4, 2023):

I'm having this same issue, but running on Ubuntu 22 with NVIDIA GPUs.
![Screenshot 2023-10-03 at 7 23 36 PM](https://github.com/jmorganca/ollama/assets/130062505/b345fa74-dd78-4ce0-8209-6a797ef20ee0)


@IgorRidanovic commented on GitHub (Oct 4, 2023):

Same issue here. I installed Ollama via https://ollama.ai/install.sh on Ubuntu 22.04 running under WSL2. ollama run orca-mini gives me this odd-looking prompt: ⠙
I type something, wait five or so minutes, and get Error: failed to start a llama runner.


@BruceMacD commented on GitHub (Oct 5, 2023):

Sounds like the models are failing to load. Do you see any additional logs in ~/.ollama/logs/server.log?


@zhougsoft commented on GitHub (Oct 6, 2023):

Also getting this issue for ollama run orca-mini on native Ubuntu 22.04 running on an old ThinkPad T430:
![image](https://github.com/jmorganca/ollama/assets/90252209/e91df7d0-4552-4749-8f32-e7ec95cdd9df)

I checked for ~/.ollama/logs/server.log but I don't actually have an ~/.ollama dir. I did my install with the curl https://ollama.ai/install.sh | sh command from the site, if that helps any.

EDIT:
I added the ~/.ollama dir manually and ran ollama serve. A key pair was generated and placed in the ~/.ollama dir, so the directory seems to be recognized, but there are still no logs after running ollama run orca-mini again.


@brownsnow commented on GitHub (Oct 6, 2023):

Same error!


@BruceMacD commented on GitHub (Oct 6, 2023):

Hi everyone, thanks for all the reports.

This is a generic error that is returned when the llama runner fails to start. Most likely (as in the old ThinkPad case) the system doesn't have enough resources to load the model.

You can see the actual error by checking the Ollama server logs:

  • Mac: server logs are available in ~/.ollama/logs/server.log
  • Linux: if installed using the installer, the server logs can be seen with journalctl -u ollama; otherwise check ~/.ollama/logs/server.log

I'm working on improving this now so that the actual error is relayed.
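For reference, a quick way to pull just the runner-failure lines out of those logs (using the paths and service unit named above; adjust if your install differs) might look like:

```shell
# Show recent llama-runner messages from the Ollama server logs.

# Mac (and Linux installs without systemd): the log file under ~/.ollama
if [ -f "$HOME/.ollama/logs/server.log" ]; then
    grep -i "llama runner" "$HOME/.ollama/logs/server.log" | tail -n 20
fi

# Linux installs that registered the systemd service:
if command -v journalctl >/dev/null 2>&1; then
    journalctl -u ollama --no-pager | grep -i "llama runner" | tail -n 20
fi
```

The line just before "llama runner exited with error" is usually the one that names the real cause (out of memory, illegal instruction, and so on).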


@brownsnow commented on GitHub (Oct 6, 2023):

luser:~ $ sudo journalctl -u ollama
[sudo] password for luser:
Oct 06 13:38:58 CX600 systemd[1]: Started Ollama Service.
Oct 06 13:38:58 CX600 ollama[1992]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key.
Oct 06 13:38:58 CX600 ollama[1992]: Your new public key is:
Oct 06 13:38:58 CX600 ollama[1992]:
Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 images.go:996: total blobs: 0
Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 images.go:1003: total unused blobs removed: 0
Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 routes.go:572: Listening on 127.0.0.1:11434
Oct 06 13:41:53 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:41:53 | 200 | 36.185µs | 127.0.0.1 | HEAD "/"
Oct 06 13:41:53 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:41:53 | 200 | 206.837µs | 127.0.0.1 | GET "/api/>
Oct 06 13:41:56 CX600 ollama[1992]: 2023/10/06 13:41:56 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:36 CX600 ollama[1992]: 2023/10/06 13:47:36 download.go:235: success getting sha256:8daa9615cce30c259a9555b1c>
Oct 06 13:47:38 CX600 ollama[1992]: 2023/10/06 13:47:38 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:38 CX600 ollama[1992]: 2023/10/06 13:47:38 download.go:235: success getting sha256:8c17c2ebb0ea011be9981cc39>
Oct 06 13:47:39 CX600 ollama[1992]: 2023/10/06 13:47:39 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:39 CX600 ollama[1992]: 2023/10/06 13:47:39 download.go:235: success getting sha256:7c23fb36d80141c4ab8cdbb61>
Oct 06 13:47:40 CX600 ollama[1992]: 2023/10/06 13:47:40 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:40 CX600 ollama[1992]: 2023/10/06 13:47:40 download.go:235: success getting sha256:bec56154823a9d2956cf28f6c>
Oct 06 13:47:41 CX600 ollama[1992]: 2023/10/06 13:47:41 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:42 CX600 ollama[1992]: 2023/10/06 13:47:42 download.go:235: success getting sha256:e35ab70a78c78ebbbc4d2e2ea>
Oct 06 13:47:42 CX600 ollama[1992]: 2023/10/06 13:47:42 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:43 CX600 ollama[1992]: 2023/10/06 13:47:43 download.go:235: success getting sha256:09fe89200c09e3fa8b36e77da>
Oct 06 13:48:00 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:48:00 | 200 | 6m7s | 127.0.0.1 | POST "/api/>
Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:239: 4096 MiB VRAM available, loading up to 36 GPU layers
Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:313: starting llama runner
Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:349: waiting for llama runner to start responding
Oct 06 13:48:02 CX600 ollama[1992]: 2023/10/06 13:48:02 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:313: starting llama runner
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:349: waiting for llama runner to start responding
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:52:00 CX600 ollama[1992]: 2023/10/06 13:52:00 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:52:00 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:00 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
Oct 06 13:52:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:13 | 200 | 20.552µs | 127.0.0.1 | HEAD "/"
Oct 06 13:52:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:13 | 200 | 242.456µs | 127.0.0.1 | GET "/api/>
Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:239: 4096 MiB VRAM available, loading up to 36 GPU layers
Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:313: starting llama runner
Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:349: waiting for llama runner to start responding
Oct 06 13:52:14 CX600 ollama[1992]: 2023/10/06 13:52:14 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:313: starting llama runner
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:349: waiting for llama runner to start responding
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:56:13 CX600 ollama[1992]: 2023/10/06 13:56:13 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:56:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:56:13 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
Oct 06 14:37:57 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:37:57 | 200 | 19.934µs | 127.0.0.1 | HEAD "/"
Oct 06 14:37:59 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:37:57 | 200 | 252.629µs | 127.0.0.1 | GET "/api/>
Oct 06 14:54:42 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:54:42 | 200 | 19.907µs | 127.0.0.1 | HEAD "/"
Oct 06 14:54:42 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:54:42 | 200 | 229.742µs | 127.0.0.1 | GET "/api/>
Oct 06 14:54:45 CX600 ollama[1992]: 2023/10/06 14:54:45 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:00:53 CX600 ollama[1992]: 2023/10/06 15:00:53 download.go:235: success getting sha256:3230a638a2da7f51833ddf0f5>
Oct 06 15:00:55 CX600 ollama[1992]: 2023/10/06 15:00:55 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:00:55 CX600 ollama[1992]: 2023/10/06 15:00:55 download.go:235: success getting sha256:d5311aab7c4cecbb387fbb06d>
Oct 06 15:00:56 CX600 ollama[1992]: 2023/10/06 15:00:56 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:00:57 CX600 ollama[1992]: 2023/10/06 15:00:57 download.go:235: success getting sha256:1e836a895a40bb08d7f5d3209>
Oct 06 15:01:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:01:16 | 200 | 6m33s | 127.0.0.1 | POST "/api/>
Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:239: 4096 MiB VRAM available, loading up to 32 GPU layers
Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:313: starting llama runner
Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:349: waiting for llama runner to start responding
Oct 06 15:01:20 CX600 ollama[1992]: 2023/10/06 15:01:20 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:313: starting llama runner
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:349: waiting for llama runner to start responding
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:05:18 CX600 ollama[1992]: 2023/10/06 15:05:18 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:05:18 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:05:18 | 500 | 4m1s | 127.0.0.1 | POST "/api/>
Oct 06 15:12:03 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:12:03 | 200 | 19.418µs | 127.0.0.1 | HEAD "/"
Oct 06 15:12:03 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:12:03 | 200 | 268.109µs | 127.0.0.1 | GET "/api/>
Oct 06 15:12:05 CX600 ollama[1992]: 2023/10/06 15:12:05 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:14:56 CX600 ollama[1992]: 2023/10/06 15:14:56 download.go:235: success getting sha256:e84705205f71dd55be7b24a77>
Oct 06 15:14:58 CX600 ollama[1992]: 2023/10/06 15:14:58 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:14:58 CX600 ollama[1992]: 2023/10/06 15:14:58 download.go:235: success getting sha256:e7214e2f1a0f5ed0ed67c3db9>
Oct 06 15:14:59 CX600 ollama[1992]: 2023/10/06 15:14:59 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:14:59 CX600 ollama[1992]: 2023/10/06 15:14:59 download.go:235: success getting sha256:93ca9b3d83dc541f11062c0b9>
Oct 06 15:15:00 CX600 ollama[1992]: 2023/10/06 15:15:00 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:15:00 CX600 ollama[1992]: 2023/10/06 15:15:00 download.go:235: success getting sha256:65009e4e7fee047467033b69d>
Oct 06 15:15:09 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:15:09 | 200 | 3m5s | 127.0.0.1 | POST "/api/>
Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:239: 4096 MiB VRAM available, loading up to 57 GPU layers
Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:313: starting llama runner
Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:349: waiting for llama runner to start responding
Oct 06 15:15:10 CX600 ollama[1992]: 2023/10/06 15:15:10 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:313: starting llama runner
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:349: waiting for llama runner to start responding
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:19:09 CX600 ollama[1992]: 2023/10/06 15:19:09 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:19:09 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:19:09 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
Oct 06 15:54:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:54:16 | 200 | 21.369µs | 127.0.0.1 | HEAD "/"
Oct 06 15:54:18 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:54:16 | 200 | 345.576µs | 127.0.0.1 | GET "/api/>
Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:239: 4096 MiB VRAM available, loading up to 57 GPU layers
Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:313: starting llama runner
Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:349: waiting for llama runner to start responding
Oct 06 15:54:20 CX600 ollama[1992]: 2023/10/06 15:54:19 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:313: starting llama runner
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:349: waiting for llama runner to start responding
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:58:16 CX600 ollama[1992]: 2023/10/06 15:58:16 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:58:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:58:16 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
luser:~ $


@BruceMacD commented on GitHub (Oct 6, 2023):

@brownsnow the illegal instruction error looks like the root of the problem in your case.

What CPU architecture are you using? You can check this by running uname -a. Make sure you're running the appropriate version of Ollama (the install script should have picked the correct version automatically).
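A sketch of that check on Linux (uname for the architecture, plus /proc/cpuinfo for the AVX-family SIMD flags the prebuilt runner may assume; /proc/cpuinfo is Linux-specific):

```shell
# Report the machine architecture and any AVX-family CPU flags.
uname -m                                           # e.g. x86_64 or aarch64
grep -o 'avx[0-9a-z_]*' /proc/cpuinfo | sort -u    # empty output means no AVX at all
```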


@simonedoria commented on GitHub (Oct 7, 2023):

> @brownsnow the illegal instruction error looks like the root of the problem in your case.
>
> What CPU architecture are you using? You can check this by running uname -a. Make sure you're running the appropriate version of Ollama (the install script should have picked the correct version automatically).

Hi, I have the same problem as @brownsnow; here is my uname -a:

Linux ns357104 5.4.0-125-generic #141-Ubuntu SMP Wed Aug 10 13:42:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

CPU: 8-core
RAM: 32 GB

Trying to run mistral.

Thanks!


@brownsnow commented on GitHub (Oct 7, 2023):

luser:~ $ uname -a
Linux CX600 6.5.5-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 23 Sep 2023 22:55:13 +0000 x86_64 GNU/Linux
luser:~ $


@adrian5 commented on GitHub (Oct 7, 2023):

Same issue as @brownsnow, same architecture and kernel. This is my first foray into using machine-learning models of any kind, hence my limited knowledge of all the components Ollama so conveniently wraps.

I tried with mistral and orca-mini. Here is some metadata from the blobs/ directory that I believe belongs to Mistral:

{
  "model_format": "gguf",
  "model_family": "llama",
  "model_type": "7B",
  "file_type": "Q4_0",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:6ae28029995007a3ee8d0b8556d50f3b59b831074cf19c84de87acf51fb54054",
      "sha256:fede2d8d6c1f404b1db73b1cd26f7d5455ff2deeb737b5e2b339339dce2969d4"
    ]
  },
  "architecture": "amd64",
  "os": "linux"
}

Edit: Never mind, it's very likely #644 for me. My CPU definitely doesn't support AVX2. I recall that kept me from trying in the past, but I forgot about that detail.

Author
Owner

@9cat commented on GitHub (Oct 12, 2023):

Same issue; my old i3 CPU has AVX but it still fails:

```
$ grep avx /proc/cpuinfo
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx f16c lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts md_clear flush_l1d
```


@jneumann-dev commented on GitHub (Oct 12, 2023):

@9cat I'm getting the same problem, and also only have AVX support. What I'm piecing together is that you have to build ollama from source so it only uses instruction sets your processor supports; the release build is, let's say, overly optimistic about what kind of hardware you're running. According to #644, a fix with compile-time checks for full processor compatibility has already been implemented, so in theory compiling ollama from source should make this problem go away. TL;DR: apparently you need to compile from source. Will try and report back later today.

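The build-from-source advice above hinges on first knowing which SIMD extensions your CPU actually advertises. A minimal POSIX-shell sketch of that check, assuming Linux with a readable `/proc/cpuinfo` (`check_ext` is a hypothetical helper name, not part of ollama):

```shell
#!/bin/sh
# check_ext: report whether a given CPU-flag name appears in a flags string.
check_ext() {  # $1 = space-separated flags, $2 = extension name
  case " $1 " in
    *" $2 "*) echo "$2: yes" ;;
    *)        echo "$2: no"  ;;
  esac
}

# On Linux, read the real flags line and test the extensions llama.cpp
# release builds commonly assume.
if [ -r /proc/cpuinfo ]; then
  flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
  for ext in avx avx2 avx512f; do
    check_ext "$flags" "$ext"
  done
fi
```

If `avx2` comes back `no` while the prebuilt binary was compiled with AVX2, a crash at runner startup is the expected outcome, which matches the reports in this thread.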

@BruceMacD commented on GitHub (Oct 12, 2023):

Hi all, just merged a change that will relay the actual error starting the llama runner to the client. What happened with this issue is that there are a few different problems here, but it was not clear what the root cause was in each case due to the unhelpful error message being returned.

A summary of the issues I see here:

  • Older CPUs that do not support the instruction set required by llama.cpp (which we use to run the models)
  • Unsupported CPU instruction sets (AVX-only CPUs: try building from source, see the [development doc](https://github.com/jmorganca/ollama/blob/main/docs/development.md) for reference; I haven't tested this though)
  • Loading models on machines that cannot run them adequately and the runner times out while loading

Please feel free to open more issues if I've missed something here, and thanks for all the reports.


@drhumlen commented on GitHub (Oct 18, 2023):

My problem was that I was continuously pulling and building from source, but I hadn't updated the dependencies in a while. For me this fixed it:

```
brew install cmake
brew install go
go generate ./...
go build .
```

And then I could finally ./ollama serve and ./ollama run llama2 like normal 😄 (Macbook with M1 pro, and 16gb ram)


@mzm008 commented on GitHub (Dec 21, 2023):

Me too; it cannot run: it waits a long time and then errors out. My system info is:
`Linux VM-0-4-ubuntu 5.4.0-139-generic #156-Ubuntu SMP Fri Jan 20 17:27:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux`


@Abdullah-shamito commented on GitHub (Dec 26, 2023):

Hello everyone, I've faced this issue and solved it by:

  1. removing the model
  2. updating ollama using `curl https://ollama.ai/install.sh | sh`
  3. reinstalling the model

BTW, I used `ollama run mixtral:8x7b-instruct-v0.1-q5_0` from https://ollama.ai/library/mixtral/tags


@BruceMacD commented on GitHub (Dec 27, 2023):

@Abdullah-shamito
In your case it's probably failing to load the model into memory before it times out. Mixtral is a larger model, so this may happen.

There will be some improvements to this in the next release and you won't see the timeout anymore.


@UmutAlihan commented on GitHub (May 19, 2024):

I am also having this error:

ollama version: 0.1.38

```
~$ ollama run llama3:70b-instruct-q8_0
Error: timed out waiting for llama runner to start:
```

@UmutAlihan commented on GitHub (May 27, 2024):

> @Abdullah-shamito In your case of probably failing to load the model into memory before it times out. Mixtral is a larger model so this may happen.
>
> There will be some improvements to this in the next release and you won't see the timeout anymore.

When are you planning to release the new version? This issue is really killing the vibe. At least please release an RC version.


@Nantris commented on GitHub (Jun 21, 2024):

Is there some way to increase the timeout? I still see the problem with `ollama@0.1.44`. I'm trying to load an enormous model (which may well fail to load) but it's failing without even trying to load it due to this issue. I don't think this issue should be closed.


@UmutAlihan commented on GitHub (Jun 21, 2024):

> Is there some way to increase the timeout? I still see the problem with `ollama@0.1.44`. I'm trying to load an enormous model (which may well fail to load) but it's failing without even trying to load it due to this issue. I don't think this issue should be closed.

you can check this workaround: https://github.com/ollama/ollama/issues/4131#issuecomment-2174891150


@Nantris commented on GitHub (Jun 21, 2024):

Thanks for pointing that out @UmutAlihan.

I hope this might be added to the base project or as a CLI flag @BruceMacD? Building from source isn't viable for us.


@UmutAlihan commented on GitHub (Jun 29, 2024):

> Thanks for pointing that out @UmutAlihan.
>
> I hope this might be added to the base project or as a CLI flag @BruceMacD? Building from source isn't viable for us.

You're very welcome. I am also sad that, in this fast-paced release cycle, this is for some reason being ignored.

I was not expecting such quality software to have a hardcoded timeout value... It would be fine if contributors recognized it, parameterized it, and released that, yet the timeout still remains hardcoded in the base project. Am I missing something? Is there some reason it stays hardcoded?


@Nantris commented on GitHub (Jun 29, 2024):

@UmutAlihan I can't find the relevant comment, but this is resolved, and I was able to load the 236-billion-parameter version of Deepseek-Coder-v2 as a result. It still failed early when I lacked sufficient RAM (e.g. when I had too many other applications open).

The new logic in 0.1.45 is to bail out only if the load makes no progress within the timeout window, rather than if it fails to fully load in that time.

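The progress-based behavior described above can be sketched as a small shell watchdog. This is a hypothetical illustration of the idea, not Ollama's actual (Go) implementation: the timer resets whenever the watched file grows, and the watchdog gives up only after a genuine stall.

```shell
#!/bin/sh
# watch_progress: succeed-by-stalling watchdog. Polls a file once per second
# and reports only after its size has stayed unchanged for $2 seconds in a
# row; any growth resets the stall counter, mimicking a progress-based
# timeout instead of a fixed wall-clock one.
watch_progress() {  # $1 = file to watch, $2 = stall limit in seconds
  last_size=-1
  stalled=0
  while [ "$stalled" -lt "$2" ]; do
    size=$(wc -c < "$1")
    if [ "$size" -ne "$last_size" ]; then
      stalled=0          # progress seen: reset the timer
      last_size=$size
    else
      stalled=$((stalled + 1))
    fi
    sleep 1
  done
  echo "no progress for $2 seconds"
}
```

With this scheme a huge model that loads slowly but steadily never trips the watchdog, while a genuinely stuck load still fails fast, which is exactly the trade-off the 0.1.45 change aimed for.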

@ichinippa128 commented on GitHub (Jul 12, 2024):

Depending on the model, the timeout still occurs in v0.2.1.
Even with the changed load process, I don't think the timeout period should be fixed at 5 minutes.
I would like to be able to change the timeout period to accommodate various models and GPU environments.

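For readers hitting this today: later Ollama releases expose the load timeout as a server environment variable. A hedged sketch, assuming your version honors `OLLAMA_LOAD_TIMEOUT` (it was added well after most of this thread; verify against the docs for your release):

```shell
# Raise the model-load timeout for the server process started in this shell.
# Assumption: the running Ollama build reads OLLAMA_LOAD_TIMEOUT (default 5m).
export OLLAMA_LOAD_TIMEOUT=15m
echo "$OLLAMA_LOAD_TIMEOUT"
```

Under the Linux systemd service you would instead add an `Environment="OLLAMA_LOAD_TIMEOUT=15m"` line to a drop-in created with `systemctl edit ollama.service`, then restart the service.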

@UmutAlihan commented on GitHub (Jul 12, 2024):

> Depending on the model, the timeout also occurs in v0.2.1.
> Even if you change the load process, I don't think the timeout period should be fixed at 5 minutes.
> I would like to be able to change the timeout period to accommodate various models and GPU environments.

Yes, exactly, this has been the issue for some time :/


@BruceMacD commented on GitHub (Jul 12, 2024):

Hey all, thanks for bringing up this timeout. It seems like there is an issue in one of the recent releases with loading larger models in some cases, possibly caused by memory mapping. It should be refreshing the timeout as the model is loaded, not a fixed timeout. This means the loading is getting stuck somewhere. I'll update when we have it figured out.


@jvel07 commented on GitHub (Aug 28, 2025):

@BruceMacD did you figure it out?

Reference: github-starred/ollama#26042