[GH-ISSUE #630] Error: failed to start a llama runner #281

New Issue

GiteaMirror · 2026-04-12T09:49:17-05:00

GiteaMirror commented

2026-04-12 09:49:17 -05:00

Originally created by @azhang on GitHub (Sep 28, 2023).
Original GitHub issue: https://github.com/ollama/ollama/issues/630

Originally assigned to: @BruceMacD on GitHub.

When I run

ollama run mistral

it downloads properly but then fails to run it, with the following error:

Error: failed to start a llama runner

I'm running this on my intel mbp with 64g ram

Originally created by @azhang on GitHub (Sep 28, 2023). Original GitHub issue: https://github.com/ollama/ollama/issues/630 Originally assigned to: @BruceMacD on GitHub. When I run ollama run mistral it downloads properly but then fails to run it, with the following error: Error: failed to start a llama runner I'm running this on my intel mbp with 64g ram

GiteaMirror added the bug label 2026-04-12 09:49:17 -05:00

GiteaMirror closed this issue

2026-04-12 09:49:17 -05:00

GiteaMirror commented

2026-04-12 09:49:18 -05:00

@BruceMacD commented on GitHub (Sep 28, 2023):

This is the error that arises when the llama.cpp runner that runs the models fails to start. I'm going to get this tested on an Intel Mac, but in the meantime just wanted to verify that you're running the Ollama executable from a release (rather than building from source).

@BruceMacD commented on GitHub (Sep 28, 2023): This is the error that arises when the llama.cpp runner that runs the models fails to start. I'm going to get this tested on an Intel Mac, but in the meantime just wanted to verify that you're running the Ollama executable from a release (rather than building from source).

GiteaMirror commented

2026-04-12 09:49:18 -05:00

@Gru2000 commented on GitHub (Oct 4, 2023):

I'm having this same issue but running on Ubuntu 22 with Nvidia GPUS

@Gru2000 commented on GitHub (Oct 4, 2023): I'm having this same issue but running on Ubuntu 22 with Nvidia GPUS <img width="786" alt="Screenshot 2023-10-03 at 7 23 36 PM" src="https://github.com/jmorganca/ollama/assets/130062505/b345fa74-dd78-4ce0-8209-6a797ef20ee0">

GiteaMirror commented

2026-04-12 09:49:18 -05:00

@IgorRidanovic commented on GitHub (Oct 4, 2023):

Same issue here. Running Ollama from https://ollama.ai/install.sh. Ubuntu 22.04 running on WSL2. ollama run orca-mini gives me this odd looking prompt: ⠙
I type something, wait for 5 or so minutes and get Error: failed to start a llama runner

@IgorRidanovic commented on GitHub (Oct 4, 2023): Same issue here. Running Ollama from `https://ollama.ai/install.sh`. Ubuntu 22.04 running on WSL2.` ollama run orca-mini` gives me this odd looking prompt: ⠙ I type something, wait for 5 or so minutes and get `Error: failed to start a llama runner`

GiteaMirror commented

2026-04-12 09:49:19 -05:00

@BruceMacD commented on GitHub (Oct 5, 2023):

Sounds like the models are failing to load. Do you see any additional logs in ~/.ollama/logs/server.log?

@BruceMacD commented on GitHub (Oct 5, 2023): Sounds like the models are failing to load. Do you see any additional logs in `~/.ollama/logs/server.log`?

GiteaMirror commented

2026-04-12 09:49:19 -05:00

@zhougsoft commented on GitHub (Oct 6, 2023):

Also getting the issue here for ollama run orca-mini on native Ubuntu 22.04 running on an old Thinkpad T430:

I checked the for the ~/.ollama/logs/server.log but I don't actually have an ~/.ollama dir. I did my install with the curl https://ollama.ai/install.sh | sh cmd from the site if that helps any.

EDIT:
I added the ~/.ollama dir manually and ran ollama serve. A key pair was generated and placed in the ~/.ollama dir so it seems to be recognising, but still no logs after running ollama run orca-mini again.

@zhougsoft commented on GitHub (Oct 6, 2023): Also getting the issue here for `ollama run orca-mini` on native Ubuntu 22.04 running on an old Thinkpad T430: ![image](https://github.com/jmorganca/ollama/assets/90252209/e91df7d0-4552-4749-8f32-e7ec95cdd9df) I checked the for the `~/.ollama/logs/server.log` but I don't actually have an `~/.ollama` dir. I did my install with the `curl https://ollama.ai/install.sh | sh` cmd from the site if that helps any. EDIT: I added the `~/.ollama` dir manually and ran `ollama serve`. A key pair was generated and placed in the `~/.ollama` dir so it seems to be recognising, but still no logs after running `ollama run orca-mini` again.

GiteaMirror commented

2026-04-12 09:49:19 -05:00

@brownsnow commented on GitHub (Oct 6, 2023):

Same error!

@brownsnow commented on GitHub (Oct 6, 2023): Same error!

GiteaMirror commented

2026-04-12 09:49:20 -05:00

@BruceMacD commented on GitHub (Oct 6, 2023):

Hi everyone, thanks for all the reports.

This is a generic error that gets returned when the llama runner fails to start. Most likely (like in the old thinkpad case) the system doesn't have enough resources to load the model.

You can see the actual error by checking the Ollama server:

Mac: server logs are available in ~/.ollama/logs/server.log
Linux: if installed using the installer the server logs can be seen with journalctl -u ollama, otherwise check the ~/.ollama/logs/server.log

Making this better now to relay the actual error.

@BruceMacD commented on GitHub (Oct 6, 2023): Hi everyone, thanks for all the reports. This is a generic error that gets returned when the llama runner fails to start. Most likely (like in the old thinkpad case) the system doesn't have enough resources to load the model. You can see the actual error by checking the Ollama server: - **Mac:** server logs are available in `~/.ollama/logs/server.log` - **Linux:** if installed using the installer the server logs can be seen with `journalctl -u ollama`, otherwise check the `~/.ollama/logs/server.log` Making this better now to relay the actual error.

GiteaMirror commented

2026-04-12 09:49:20 -05:00

@brownsnow commented on GitHub (Oct 6, 2023):

luser:~ $ sudo journalctl -u ollama
[sudo] password for luser:
Oct 06 13:38:58 CX600 systemd[1]: Started Ollama Service.
Oct 06 13:38:58 CX600 ollama[1992]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key.
Oct 06 13:38:58 CX600 ollama[1992]: Your new public key is:
Oct 06 13:38:58 CX600 ollama[1992]:
Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 images.go:996: total blobs: 0
Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 images.go:1003: total unused blobs removed: 0
Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 routes.go:572: Listening on 127.0.0.1:11434
Oct 06 13:41:53 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:41:53 | 200 | 36.185µs | 127.0.0.1 | HEAD "/"
Oct 06 13:41:53 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:41:53 | 200 | 206.837µs | 127.0.0.1 | GET "/api/>
Oct 06 13:41:56 CX600 ollama[1992]: 2023/10/06 13:41:56 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:36 CX600 ollama[1992]: 2023/10/06 13:47:36 download.go:235: success getting sha256:8daa9615cce30c259a9555b1c>
Oct 06 13:47:38 CX600 ollama[1992]: 2023/10/06 13:47:38 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:38 CX600 ollama[1992]: 2023/10/06 13:47:38 download.go:235: success getting sha256:8c17c2ebb0ea011be9981cc39>
Oct 06 13:47:39 CX600 ollama[1992]: 2023/10/06 13:47:39 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:39 CX600 ollama[1992]: 2023/10/06 13:47:39 download.go:235: success getting sha256:7c23fb36d80141c4ab8cdbb61>
Oct 06 13:47:40 CX600 ollama[1992]: 2023/10/06 13:47:40 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:40 CX600 ollama[1992]: 2023/10/06 13:47:40 download.go:235: success getting sha256:bec56154823a9d2956cf28f6c>
Oct 06 13:47:41 CX600 ollama[1992]: 2023/10/06 13:47:41 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:42 CX600 ollama[1992]: 2023/10/06 13:47:42 download.go:235: success getting sha256:e35ab70a78c78ebbbc4d2e2ea>
Oct 06 13:47:42 CX600 ollama[1992]: 2023/10/06 13:47:42 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 13:47:43 CX600 ollama[1992]: 2023/10/06 13:47:43 download.go:235: success getting sha256:09fe89200c09e3fa8b36e77da>
Oct 06 13:48:00 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:48:00 | 200 | 6m7s | 127.0.0.1 | POST "/api/>
Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:239: 4096 MiB VRAM available, loading up to 36 GPU layers
Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:313: starting llama runner
Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:349: waiting for llama runner to start responding
Oct 06 13:48:02 CX600 ollama[1992]: 2023/10/06 13:48:02 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:313: starting llama runner
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:349: waiting for llama runner to start responding
Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:52:00 CX600 ollama[1992]: 2023/10/06 13:52:00 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:52:00 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:00 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
Oct 06 13:52:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:13 | 200 | 20.552µs | 127.0.0.1 | HEAD "/"
Oct 06 13:52:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:13 | 200 | 242.456µs | 127.0.0.1 | GET "/api/>
Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:239: 4096 MiB VRAM available, loading up to 36 GPU layers
Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:313: starting llama runner
Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:349: waiting for llama runner to start responding
Oct 06 13:52:14 CX600 ollama[1992]: 2023/10/06 13:52:14 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:313: starting llama runner
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:349: waiting for llama runner to start responding
Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 13:56:13 CX600 ollama[1992]: 2023/10/06 13:56:13 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 13:56:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:56:13 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
Oct 06 14:37:57 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:37:57 | 200 | 19.934µs | 127.0.0.1 | HEAD "/"
Oct 06 14:37:59 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:37:57 | 200 | 252.629µs | 127.0.0.1 | GET "/api/>
Oct 06 14:54:42 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:54:42 | 200 | 19.907µs | 127.0.0.1 | HEAD "/"
Oct 06 14:54:42 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:54:42 | 200 | 229.742µs | 127.0.0.1 | GET "/api/>
Oct 06 14:54:45 CX600 ollama[1992]: 2023/10/06 14:54:45 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:00:53 CX600 ollama[1992]: 2023/10/06 15:00:53 download.go:235: success getting sha256:3230a638a2da7f51833ddf0f5>
Oct 06 15:00:55 CX600 ollama[1992]: 2023/10/06 15:00:55 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:00:55 CX600 ollama[1992]: 2023/10/06 15:00:55 download.go:235: success getting sha256:d5311aab7c4cecbb387fbb06d>
Oct 06 15:00:56 CX600 ollama[1992]: 2023/10/06 15:00:56 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:00:57 CX600 ollama[1992]: 2023/10/06 15:00:57 download.go:235: success getting sha256:1e836a895a40bb08d7f5d3209>
Oct 06 15:01:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:01:16 | 200 | 6m33s | 127.0.0.1 | POST "/api/>
Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:239: 4096 MiB VRAM available, loading up to 32 GPU layers
Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:313: starting llama runner
Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:349: waiting for llama runner to start responding
Oct 06 15:01:20 CX600 ollama[1992]: 2023/10/06 15:01:20 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:313: starting llama runner
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:349: waiting for llama runner to start responding
Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:05:18 CX600 ollama[1992]: 2023/10/06 15:05:18 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:05:18 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:05:18 | 500 | 4m1s | 127.0.0.1 | POST "/api/>
Oct 06 15:12:03 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:12:03 | 200 | 19.418µs | 127.0.0.1 | HEAD "/"
Oct 06 15:12:03 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:12:03 | 200 | 268.109µs | 127.0.0.1 | GET "/api/>
Oct 06 15:12:05 CX600 ollama[1992]: 2023/10/06 15:12:05 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:14:56 CX600 ollama[1992]: 2023/10/06 15:14:56 download.go:235: success getting sha256:e84705205f71dd55be7b24a77>
Oct 06 15:14:58 CX600 ollama[1992]: 2023/10/06 15:14:58 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:14:58 CX600 ollama[1992]: 2023/10/06 15:14:58 download.go:235: success getting sha256:e7214e2f1a0f5ed0ed67c3db9>
Oct 06 15:14:59 CX600 ollama[1992]: 2023/10/06 15:14:59 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:14:59 CX600 ollama[1992]: 2023/10/06 15:14:59 download.go:235: success getting sha256:93ca9b3d83dc541f11062c0b9>
Oct 06 15:15:00 CX600 ollama[1992]: 2023/10/06 15:15:00 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b>
Oct 06 15:15:00 CX600 ollama[1992]: 2023/10/06 15:15:00 download.go:235: success getting sha256:65009e4e7fee047467033b69d>
Oct 06 15:15:09 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:15:09 | 200 | 3m5s | 127.0.0.1 | POST "/api/>
Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:239: 4096 MiB VRAM available, loading up to 57 GPU layers
Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:313: starting llama runner
Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:349: waiting for llama runner to start responding
Oct 06 15:15:10 CX600 ollama[1992]: 2023/10/06 15:15:10 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:313: starting llama runner
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:349: waiting for llama runner to start responding
Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:19:09 CX600 ollama[1992]: 2023/10/06 15:19:09 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:19:09 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:19:09 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
Oct 06 15:54:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:54:16 | 200 | 21.369µs | 127.0.0.1 | HEAD "/"
Oct 06 15:54:18 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:54:16 | 200 | 345.576µs | 127.0.0.1 | GET "/api/>
Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:239: 4096 MiB VRAM available, loading up to 57 GPU layers
Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:313: starting llama runner
Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:349: waiting for llama runner to start responding
Oct 06 15:54:20 CX600 ollama[1992]: 2023/10/06 15:54:19 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:313: starting llama runner
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:349: waiting for llama runner to start responding
Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:323: llama runner exited with error: signal: illegal ins>
Oct 06 15:58:16 CX600 ollama[1992]: 2023/10/06 15:58:16 llama.go:330: error starting llama runner: llama runner did not s>
Oct 06 15:58:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:58:16 | 500 | 4m0s | 127.0.0.1 | POST "/api/>
luser:~ $

@brownsnow commented on GitHub (Oct 6, 2023): luser:~ $ sudo journalctl -u ollama [sudo] password for luser: Oct 06 13:38:58 CX600 systemd[1]: Started Ollama Service. Oct 06 13:38:58 CX600 ollama[1992]: Couldn't find '/usr/share/ollama/.ollama/id_ed25519'. Generating new private key. Oct 06 13:38:58 CX600 ollama[1992]: Your new public key is: Oct 06 13:38:58 CX600 ollama[1992]: Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 images.go:996: total blobs: 0 Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 images.go:1003: total unused blobs removed: 0 Oct 06 13:38:58 CX600 ollama[1992]: 2023/10/06 13:38:58 routes.go:572: Listening on 127.0.0.1:11434 Oct 06 13:41:53 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:41:53 | 200 | 36.185µs | 127.0.0.1 | HEAD "/" Oct 06 13:41:53 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:41:53 | 200 | 206.837µs | 127.0.0.1 | GET "/api/> Oct 06 13:41:56 CX600 ollama[1992]: 2023/10/06 13:41:56 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 13:47:36 CX600 ollama[1992]: 2023/10/06 13:47:36 download.go:235: success getting sha256:8daa9615cce30c259a9555b1c> Oct 06 13:47:38 CX600 ollama[1992]: 2023/10/06 13:47:38 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 13:47:38 CX600 ollama[1992]: 2023/10/06 13:47:38 download.go:235: success getting sha256:8c17c2ebb0ea011be9981cc39> Oct 06 13:47:39 CX600 ollama[1992]: 2023/10/06 13:47:39 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 13:47:39 CX600 ollama[1992]: 2023/10/06 13:47:39 download.go:235: success getting sha256:7c23fb36d80141c4ab8cdbb61> Oct 06 13:47:40 CX600 ollama[1992]: 2023/10/06 13:47:40 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 13:47:40 CX600 ollama[1992]: 2023/10/06 13:47:40 download.go:235: success getting sha256:bec56154823a9d2956cf28f6c> Oct 06 13:47:41 CX600 ollama[1992]: 2023/10/06 13:47:41 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 13:47:42 CX600 ollama[1992]: 2023/10/06 13:47:42 download.go:235: success getting sha256:e35ab70a78c78ebbbc4d2e2ea> Oct 06 13:47:42 CX600 ollama[1992]: 2023/10/06 13:47:42 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 13:47:43 CX600 ollama[1992]: 2023/10/06 13:47:43 download.go:235: success getting sha256:09fe89200c09e3fa8b36e77da> Oct 06 13:48:00 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:48:00 | 200 | 6m7s | 127.0.0.1 | POST "/api/> Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:239: 4096 MiB VRAM available, loading up to 36 GPU layers Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:313: starting llama runner Oct 06 13:48:00 CX600 ollama[1992]: 2023/10/06 13:48:00 llama.go:349: waiting for llama runner to start responding Oct 06 13:48:02 CX600 ollama[1992]: 2023/10/06 13:48:02 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:313: starting llama runner Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:349: waiting for llama runner to start responding Oct 06 13:50:00 CX600 ollama[1992]: 2023/10/06 13:50:00 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 13:52:00 CX600 ollama[1992]: 2023/10/06 13:52:00 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 13:52:00 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:00 | 500 | 4m0s | 127.0.0.1 | POST "/api/> Oct 06 13:52:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:13 | 200 | 20.552µs | 127.0.0.1 | HEAD "/" Oct 06 13:52:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:52:13 | 200 | 242.456µs | 127.0.0.1 | GET "/api/> Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:239: 4096 MiB VRAM available, loading up to 36 GPU layers Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:313: starting llama runner Oct 06 13:52:13 CX600 ollama[1992]: 2023/10/06 13:52:13 llama.go:349: waiting for llama runner to start responding Oct 06 13:52:14 CX600 ollama[1992]: 2023/10/06 13:52:14 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:313: starting llama runner Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:349: waiting for llama runner to start responding Oct 06 13:54:13 CX600 ollama[1992]: 2023/10/06 13:54:13 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 13:56:13 CX600 ollama[1992]: 2023/10/06 13:56:13 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 13:56:13 CX600 ollama[1992]: [GIN] 2023/10/06 - 13:56:13 | 500 | 4m0s | 127.0.0.1 | POST "/api/> Oct 06 14:37:57 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:37:57 | 200 | 19.934µs | 127.0.0.1 | HEAD "/" Oct 06 14:37:59 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:37:57 | 200 | 252.629µs | 127.0.0.1 | GET "/api/> Oct 06 14:54:42 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:54:42 | 200 | 19.907µs | 127.0.0.1 | HEAD "/" Oct 06 14:54:42 CX600 ollama[1992]: [GIN] 2023/10/06 - 14:54:42 | 200 | 229.742µs | 127.0.0.1 | GET "/api/> Oct 06 14:54:45 CX600 ollama[1992]: 2023/10/06 14:54:45 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 15:00:53 CX600 ollama[1992]: 2023/10/06 15:00:53 download.go:235: success getting sha256:3230a638a2da7f51833ddf0f5> Oct 06 15:00:55 CX600 ollama[1992]: 2023/10/06 15:00:55 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 15:00:55 CX600 ollama[1992]: 2023/10/06 15:00:55 download.go:235: success getting sha256:d5311aab7c4cecbb387fbb06d> Oct 06 15:00:56 CX600 ollama[1992]: 2023/10/06 15:00:56 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 15:00:57 CX600 ollama[1992]: 2023/10/06 15:00:57 download.go:235: success getting sha256:1e836a895a40bb08d7f5d3209> Oct 06 15:01:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:01:16 | 200 | 6m33s | 127.0.0.1 | POST "/api/> Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:239: 4096 MiB VRAM available, loading up to 32 GPU layers Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:313: starting llama runner Oct 06 15:01:18 CX600 ollama[1992]: 2023/10/06 15:01:18 llama.go:349: waiting for llama runner to start responding Oct 06 15:01:20 CX600 ollama[1992]: 2023/10/06 15:01:20 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:313: starting llama runner Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:349: waiting for llama runner to start responding Oct 06 15:03:18 CX600 ollama[1992]: 2023/10/06 15:03:18 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 15:05:18 CX600 ollama[1992]: 2023/10/06 15:05:18 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 15:05:18 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:05:18 | 500 | 4m1s | 127.0.0.1 | POST "/api/> Oct 06 15:12:03 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:12:03 | 200 | 19.418µs | 127.0.0.1 | HEAD "/" Oct 06 15:12:03 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:12:03 | 200 | 268.109µs | 127.0.0.1 | GET "/api/> Oct 06 15:12:05 CX600 ollama[1992]: 2023/10/06 15:12:05 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 15:14:56 CX600 ollama[1992]: 2023/10/06 15:14:56 download.go:235: success getting sha256:e84705205f71dd55be7b24a77> Oct 06 15:14:58 CX600 ollama[1992]: 2023/10/06 15:14:58 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 15:14:58 CX600 ollama[1992]: 2023/10/06 15:14:58 download.go:235: success getting sha256:e7214e2f1a0f5ed0ed67c3db9> Oct 06 15:14:59 CX600 ollama[1992]: 2023/10/06 15:14:59 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 15:14:59 CX600 ollama[1992]: 2023/10/06 15:14:59 download.go:235: success getting sha256:93ca9b3d83dc541f11062c0b9> Oct 06 15:15:00 CX600 ollama[1992]: 2023/10/06 15:15:00 images.go:1495: redirected to: https://dd20bb891979d25aebc8bec07b> Oct 06 15:15:00 CX600 ollama[1992]: 2023/10/06 15:15:00 download.go:235: success getting sha256:65009e4e7fee047467033b69d> Oct 06 15:15:09 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:15:09 | 200 | 3m5s | 127.0.0.1 | POST "/api/> Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:239: 4096 MiB VRAM available, loading up to 57 GPU layers Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:313: starting llama runner Oct 06 15:15:09 CX600 ollama[1992]: 2023/10/06 15:15:09 llama.go:349: waiting for llama runner to start responding Oct 06 15:15:10 CX600 ollama[1992]: 2023/10/06 15:15:10 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:313: starting llama runner Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:349: waiting for llama runner to start responding Oct 06 15:17:09 CX600 ollama[1992]: 2023/10/06 15:17:09 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 15:19:09 CX600 ollama[1992]: 2023/10/06 15:19:09 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 15:19:09 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:19:09 | 500 | 4m0s | 127.0.0.1 | POST "/api/> Oct 06 15:54:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:54:16 | 200 | 21.369µs | 127.0.0.1 | HEAD "/" Oct 06 15:54:18 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:54:16 | 200 | 345.576µs | 127.0.0.1 | GET "/api/> Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:239: 4096 MiB VRAM available, loading up to 57 GPU layers Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:313: starting llama runner Oct 06 15:54:18 CX600 ollama[1992]: 2023/10/06 15:54:16 llama.go:349: waiting for llama runner to start responding Oct 06 15:54:20 CX600 ollama[1992]: 2023/10/06 15:54:19 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:313: starting llama runner Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:349: waiting for llama runner to start responding Oct 06 15:56:16 CX600 ollama[1992]: 2023/10/06 15:56:16 llama.go:323: llama runner exited with error: signal: illegal ins> Oct 06 15:58:16 CX600 ollama[1992]: 2023/10/06 15:58:16 llama.go:330: error starting llama runner: llama runner did not s> Oct 06 15:58:16 CX600 ollama[1992]: [GIN] 2023/10/06 - 15:58:16 | 500 | 4m0s | 127.0.0.1 | POST "/api/> luser:~ $

GiteaMirror commented

2026-04-12 09:49:20 -05:00

@BruceMacD commented on GitHub (Oct 6, 2023):

@brownsnow the illegal instruction error looks like the root of the problem in your case.

What CPU architecture are you using? You can check this by running uname -a. Make sure you're running the appropriate version of Ollama (the install script should have picked the correct version automatically).

@BruceMacD commented on GitHub (Oct 6, 2023): @brownsnow the `illegal instruction` error looks like the root of the problem in your case. What CPU architecture are you using? You can check this by running `uname -a`. Make sure you're running the appropriate version of Ollama (the install script should have picked the correct version automatically).

GiteaMirror commented

2026-04-12 09:49:21 -05:00

@simonedoria commented on GitHub (Oct 7, 2023):

@brownsnow the illegal instruction error looks like the root of the problem in your case.

What CPU architecture are you using? You can check this by running uname -a. Make sure you're running the appropriate version of Ollama (the install script should have picked the correct version automatically).

Hi, I have the same problem as @brownsnow , here is my uname -a : Linux ns357104 5.4.0-125-generic #141-Ubuntu SMP Wed Aug 10 13:42:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

CPU 8-Core
RAM 32gb

Trying to run mistral.

Thanks!

@simonedoria commented on GitHub (Oct 7, 2023): > @brownsnow the `illegal instruction` error looks like the root of the problem in your case. > > What CPU architecture are you using? You can check this by running `uname -a`. Make sure you're running the appropriate version of Ollama (the install script should have picked the correct version automatically). Hi, I have the same problem as @brownsnow , here is my **uname -a** : Linux ns357104 5.4.0-125-generic #141-Ubuntu SMP Wed Aug 10 13:42:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux CPU 8-Core RAM 32gb Trying to run mistral. Thanks!

GiteaMirror commented

2026-04-12 09:49:21 -05:00

@brownsnow commented on GitHub (Oct 7, 2023):

luser:~ $ uname -a
Linux CX600 6.5.5-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 23 Sep 2023 22:55:13 +0000 x86_64 GNU/Linux
luser:~ $

@brownsnow commented on GitHub (Oct 7, 2023): luser:~ $ uname -a Linux CX600 6.5.5-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 23 Sep 2023 22:55:13 +0000 x86_64 GNU/Linux luser:~ $

GiteaMirror commented

2026-04-12 09:49:22 -05:00

@adrian5 commented on GitHub (Oct 7, 2023):

Same issue as @brownsnow, same architecture and kernel. This is my first foray into using machine models of any kind, hence limited knowledge about all the components ollama so conveniently wraps.

I tried with mistral and orca-mini. Here is some metadata from the blobs/ directory that belongs to Mistral I believe:

{
  "model_format": "gguf",
  "model_family": "llama",
  "model_type": "7B",
  "file_type": "Q4_0",
  "rootfs": {
    "type": "layers",
    "diff_ids": [
      "sha256:6ae28029995007a3ee8d0b8556d50f3b59b831074cf19c84de87acf51fb54054",
      "sha256:fede2d8d6c1f404b1db73b1cd26f7d5455ff2deeb737b5e2b339339dce2969d4"
    ]
  },
  "architecture": "amd64",
  "os": "linux"
}

Edit: Nevermind, it's very likely #644 for me. My CPU definitely doesn't support AVX2. I recall that keeping me from trying in the past but forgot about that detail.

@adrian5 commented on GitHub (Oct 7, 2023): Same issue as @brownsnow, same architecture and kernel. This is my first foray into using machine models of any kind, hence limited knowledge about all the components ollama so conveniently wraps. I tried with `mistral` and `orca-mini`. Here is some metadata from the `blobs/` directory that belongs to Mistral I believe: ```json { "model_format": "gguf", "model_family": "llama", "model_type": "7B", "file_type": "Q4_0", "rootfs": { "type": "layers", "diff_ids": [ "sha256:6ae28029995007a3ee8d0b8556d50f3b59b831074cf19c84de87acf51fb54054", "sha256:fede2d8d6c1f404b1db73b1cd26f7d5455ff2deeb737b5e2b339339dce2969d4" ] }, "architecture": "amd64", "os": "linux" } ``` --- Edit: Nevermind, it's very likely #644 for me. My CPU definitely doesn't support AVX2. I recall that keeping me from trying in the past but forgot about that detail.

GiteaMirror commented

2026-04-12 09:49:22 -05:00

@9cat commented on GitHub (Oct 12, 2023):

same issue , my old i3 cpu has the AVX still fail

grep avx /proc/cpuinfo
nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx f16c lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadowvnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts md_clear flush_l1d

@9cat commented on GitHub (Oct 12, 2023): same issue , my old i3 cpu has the AVX still fail grep avx /proc/cpuinfo nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave **_avx_** f16c lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadowvnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts md_clear flush_l1d

GiteaMirror commented

2026-04-12 09:49:23 -05:00

@jneumann-dev commented on GitHub (Oct 12, 2023):

@9cat getting same problem, also only have avx support, what I'm piecing together is that you have to build ollama from source to only use instruction sets your processor supports. The release build is, let's say, overly optimistic about what kind of hardware you're using. According to #644 a fix with compile-time checks for full compatibility with the processor has already been implemented, so in theory if you can compile ollama from source this problem should go away. TL;DR apparently need to compile from source. Will try and report back later today.

@jneumann-dev commented on GitHub (Oct 12, 2023): @9cat getting same problem, also only have avx support, what I'm piecing together is that you have to build ollama from source to only use instruction sets your processor supports. The release build is, let's say, overly optimistic about what kind of hardware you're using. According to #644 a fix with compile-time checks for full compatibility with the processor has already been implemented, so in theory if you can compile ollama from source this problem should go away. TL;DR apparently need to compile from source. Will try and report back later today.

GiteaMirror commented

2026-04-12 09:49:23 -05:00

@BruceMacD commented on GitHub (Oct 12, 2023):

Hi all, just merged a change that will relay the actual error starting the llama runner to the client. What happened with this issue is that there a few different problems here, but it was not clear what the root was in each case due to the bad error message being returned.

A summary of the issues I see here:

Older CPUs that do not support the instruction set required by llama.cpp (which we use to run the models)
Unsupported CPU architectures (AVX, try building from source, see the development doc for reference, I haven't tested this though)
Loading models on machines that cannot run them adequately and the runner times out while loading

Please feel free to open more issues if I've missed something here, and thanks for all the reports.

@BruceMacD commented on GitHub (Oct 12, 2023): Hi all, just merged a change that will relay the actual error starting the llama runner to the client. What happened with this issue is that there a few different problems here, but it was not clear what the root was in each case due to the bad error message being returned. A summary of the issues I see here: - Older CPUs that do not support the instruction set required by llama.cpp (which we use to run the models) - Unsupported CPU architectures (AVX, try building from source, see the [development doc](https://github.com/jmorganca/ollama/blob/main/docs/development.md) for reference, I haven't tested this though) - Loading models on machines that cannot run them adequately and the runner times out while loading Please feel free to open more issues if I've missed something here, and thanks for all the reports.

GiteaMirror commented

2026-04-12 09:49:23 -05:00

@drhumlen commented on GitHub (Oct 18, 2023):

My problem was that I was continuously pulling and building from source, but I hadn't updated the dependencies in a while. For me this fixed it:

brew install cmake
brew install go
go generate ./...
go build .

And then I could finally ./ollama serve and ./ollama run llama2 like normal 😄 (Macbook with M1 pro, and 16gb ram)

@drhumlen commented on GitHub (Oct 18, 2023): My problem was that I was continuously pulling and building from source, but I hadn't updated the dependencies in a while. For me this fixed it: ``` brew install cmake brew install go go generate ./... go build . ``` And then I could finally `./ollama serve` and `./ollama run llama2` like normal 😄 (Macbook with M1 pro, and 16gb ram)

GiteaMirror commented

2026-04-12 09:49:24 -05:00

@mzm008 commented on GitHub (Dec 21, 2023):

me too, can not run. long time to waiting and error
my cpu info is Linux VM-0-4-ubuntu 5.4.0-139-generic #156-Ubuntu SMP Fri Jan 20 17:27:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

@mzm008 commented on GitHub (Dec 21, 2023): me too, can not run. long time to waiting and error my cpu info is Linux VM-0-4-ubuntu 5.4.0-139-generic #156-Ubuntu SMP Fri Jan 20 17:27:18 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

GiteaMirror commented

2026-04-12 09:49:24 -05:00

@Abdullah-shamito commented on GitHub (Dec 26, 2023):

hello everyone, i'v faced this issue and solve it by:
remove the model
and then update ollama using curl https://ollama.ai/install.sh | sh
install the model
btw i used ( ollama run mixtral:8x7b-instruct-v0.1-q5_0) from https://ollama.ai/library/mixtral/tags

@Abdullah-shamito commented on GitHub (Dec 26, 2023): hello everyone, i'v faced this issue and solve it by: remove the model and then update ollama using curl https://ollama.ai/install.sh | sh install the model btw i used ( ollama run mixtral:8x7b-instruct-v0.1-q5_0) from https://ollama.ai/library/mixtral/tags

GiteaMirror commented

2026-04-12 09:49:25 -05:00

@BruceMacD commented on GitHub (Dec 27, 2023):

@Abdullah-shamito
In your case of probably failing to load the model into memory before it times out. Mixtral is a larger model so this may happen.

There will be some improvements to this in the next release and you won't see the timeout anymore.

@BruceMacD commented on GitHub (Dec 27, 2023): @Abdullah-shamito In your case of probably failing to load the model into memory before it times out. Mixtral is a larger model so this may happen. There will be some improvements to this in the next release and you won't see the timeout anymore.

GiteaMirror commented

2026-04-12 09:49:25 -05:00

@UmutAlihan commented on GitHub (May 19, 2024):

I am also having this error:

ollama version: 0.1.38

Error: timed out waiting for llama runner to start:```

@UmutAlihan commented on GitHub (May 19, 2024): I am also having this error: ollama version: 0.1.38 ```~$ ollama run llama3:70b-instruct-q8_0 Error: timed out waiting for llama runner to start:```

GiteaMirror commented

2026-04-12 09:49:25 -05:00

@UmutAlihan commented on GitHub (May 27, 2024):

@Abdullah-shamito In your case of probably failing to load the model into memory before it times out. Mixtral is a larger model so this may happen.

There will be some improvements to this in the next release and you won't see the timeout anymore.

when are you planning to release the new version?? This issue is really killing the vibe. At least please release a rc version

@UmutAlihan commented on GitHub (May 27, 2024): > @Abdullah-shamito In your case of probably failing to load the model into memory before it times out. Mixtral is a larger model so this may happen. > > There will be some improvements to this in the next release and you won't see the timeout anymore. when are you planning to release the new version?? This issue is really killing the vibe. At least please release a rc version

GiteaMirror commented

2026-04-12 09:49:26 -05:00

@Nantris commented on GitHub (Jun 21, 2024):

Is there some way to increase the timeout? I still see the problem with ollama@0.1.44. I'm trying to load an enormous model (which may well fail to load) but it's failing without even trying to load it due to this issue. I don't think this issue should be closed.

@Nantris commented on GitHub (Jun 21, 2024): Is there some way to increase the timeout? I still see the problem with `ollama@0.1.44`. I'm trying to load an enormous model (which may well fail to load) but it's failing without even trying to load it due to this issue. I don't think this issue should be closed.

GiteaMirror commented

2026-04-12 09:49:27 -05:00

@UmutAlihan commented on GitHub (Jun 21, 2024):

Is there some way to increase the timeout? I still see the problem with ollama@0.1.44. I'm trying to load an enormous model (which may well fail to load) but it's failing without even trying to load it due to this issue. I don't think this issue should be closed.

you can check this workaround: https://github.com/ollama/ollama/issues/4131#issuecomment-2174891150

@UmutAlihan commented on GitHub (Jun 21, 2024): > Is there some way to increase the timeout? I still see the problem with `ollama@0.1.44`. I'm trying to load an enormous model (which may well fail to load) but it's failing without even trying to load it due to this issue. I don't think this issue should be closed. you can check this workaround: https://github.com/ollama/ollama/issues/4131#issuecomment-2174891150

GiteaMirror commented

2026-04-12 09:49:27 -05:00

@Nantris commented on GitHub (Jun 21, 2024):

Thanks for pointing that out @UmutAlihan.

I hope this might be added to the base project or as a CLI flag @BruceMacD? Building from source isn't viable for us.

@Nantris commented on GitHub (Jun 21, 2024): Thanks for pointing that out @UmutAlihan. I hope this might be added to the base project or as a CLI flag @BruceMacD? Building from source isn't viable for us.

GiteaMirror commented

2026-04-12 09:49:27 -05:00

@UmutAlihan commented on GitHub (Jun 29, 2024):

Thanks for pointing that out @UmutAlihan.

I hope this might be added to the base project or as a CLI flag @BruceMacD? Building from source isn't viable for us.

Your very welcome. I am also sad that in the fast paced release cycle for some reason this is being ignored.

I was not expecting for such quality software has hardcoded timeout value... Even though this is fine if contributors realize it, parameterize it and release. However this still stays in the base project. Am I missing something? Is there some reason for the timeout value stays hardcoded?

@UmutAlihan commented on GitHub (Jun 29, 2024): > Thanks for pointing that out @UmutAlihan. > > I hope this might be added to the base project or as a CLI flag @BruceMacD? Building from source isn't viable for us. Your very welcome. I am also sad that in the fast paced release cycle for some reason this is being ignored. I was not expecting for such quality software has hardcoded timeout value... Even though this is fine if contributors realize it, parameterize it and release. However this still stays in the base project. Am I missing something? Is there some reason for the timeout value stays hardcoded?

GiteaMirror commented

2026-04-12 09:49:28 -05:00

@Nantris commented on GitHub (Jun 29, 2024):

@UmutAlihan I can't find the relevant comment, but this is resolved and I was able to load the 236 billion parameter version of Deepseek-Coder-v2 as a result. It still did fail early when I lacked sufficient RAM (eg had too many other applications open.)

The new logic in 0.1.45 is to only bail out if the load makes no progress in the timeout time, rather than if it fails to fully load in that time.

@Nantris commented on GitHub (Jun 29, 2024): @UmutAlihan I can't find the relevant comment, but this is resolved and I was able to load the 236 billion parameter version of Deepseek-Coder-v2 as a result. It still did fail early when I lacked sufficient RAM (eg had too many other applications open.) The new logic in 0.1.45 is to only bail out if the load makes no progress in the timeout time, rather than if it fails to fully load in that time.

GiteaMirror commented

2026-04-12 09:49:28 -05:00

@ichinippa128 commented on GitHub (Jul 12, 2024):

Depending on the model, the timeout also occurs in v0.2.1.
Even if you change the load process, I don't think the timeout period should be fixed at 5 minutes.
I would like to be able to change the timeout period to accommodate various models and GPU environments.

@ichinippa128 commented on GitHub (Jul 12, 2024): Depending on the model, the timeout also occurs in v0.2.1. Even if you change the load process, I don't think the timeout period should be fixed at 5 minutes. I would like to be able to change the timeout period to accommodate various models and GPU environments.

GiteaMirror commented

2026-04-12 09:49:29 -05:00

@UmutAlihan commented on GitHub (Jul 12, 2024):

Depending on the model, the timeout also occurs in v0.2.1.
Even if you change the load process, I don't think the timeout period should be fixed at 5 minutes.
I would like to be able to change the timeout period to accommodate various models and GPU environments.

yes exactly thisnis the issue for some time :/

@UmutAlihan commented on GitHub (Jul 12, 2024): > Depending on the model, the timeout also occurs in v0.2.1. > Even if you change the load process, I don't think the timeout period should be fixed at 5 minutes. > I would like to be able to change the timeout period to accommodate various models and GPU environments. yes exactly thisnis the issue for some time :/

GiteaMirror commented

2026-04-12 09:49:29 -05:00

@BruceMacD commented on GitHub (Jul 12, 2024):

Hey all, thanks for bringing up this timeout. It seems like there is an issue in one of the recent releases with loading larger models in some cases, possibly caused by memory mapping. It should be refreshing the timeout as the model is loaded, not a fixed timeout. This means the loading is getting stuck somewhere. I'll update when we have it figured out.

@BruceMacD commented on GitHub (Jul 12, 2024): Hey all, thanks for bringing up this timeout. It seems like there is an issue in one of the recent releases with loading larger models in some cases, possibly caused by memory mapping. It should be refreshing the timeout as the model is loaded, not a fixed timeout. This means the loading is getting stuck somewhere. I'll update when we have it figured out.

GiteaMirror commented

2026-04-12 09:49:30 -05:00

@jvel07 commented on GitHub (Aug 28, 2025):

@BruceMacD did you figure out?

@jvel07 commented on GitHub (Aug 28, 2025): @BruceMacD did you figure out?

GiteaMirror referenced this issue

2026-04-12 22:51:35 -05:00

[PR #319] [CLOSED] RFC: optional generate header to not stream response #10097

GiteaMirror referenced this issue

2026-04-16 04:57:04 -05:00

[PR #319] [CLOSED] RFC: optional generate header to not stream response #15368

GiteaMirror referenced this issue

2026-04-19 15:09:45 -05:00

[PR #319] [CLOSED] RFC: optional generate header to not stream response #20637

GiteaMirror referenced this issue

2026-04-22 01:43:48 -05:00

[GH-ISSUE #281] Consider a non streaming api for `/api/generate` #25884

GiteaMirror referenced this issue

2026-04-22 01:44:15 -05:00

[GH-ISSUE #294] Streaming responses should have `Content-Type` set to `application/x-ndjson ` #25891

GiteaMirror referenced this issue

2026-04-22 01:44:16 -05:00

[GH-ISSUE #294] Streaming responses should have `Content-Type` set to `application/x-ndjson ` #25891

GiteaMirror referenced this issue

2026-04-22 01:45:06 -05:00

[GH-ISSUE #318] add a generate option for max response length #25901

GiteaMirror referenced this issue

2026-04-22 20:42:36 -05:00

[PR #319] [CLOSED] RFC: optional generate header to not stream response #35970

GiteaMirror referenced this issue

2026-04-24 21:14:50 -05:00

[PR #319] [CLOSED] RFC: optional generate header to not stream response #41345

GiteaMirror referenced this issue

2026-04-27 23:16:04 -05:00

[GH-ISSUE #281] Consider a non streaming api for `/api/generate` #46633

GiteaMirror referenced this issue

2026-04-27 23:17:28 -05:00

[GH-ISSUE #294] Streaming responses should have `Content-Type` set to `application/x-ndjson ` #46640

GiteaMirror referenced this issue

2026-04-27 23:17:37 -05:00

[GH-ISSUE #294] Streaming responses should have `Content-Type` set to `application/x-ndjson ` #46640

GiteaMirror referenced this issue

2026-04-27 23:21:09 -05:00

[GH-ISSUE #318] add a generate option for max response length #46650

GiteaMirror referenced this issue

2026-04-29 11:22:06 -05:00

[PR #319] [CLOSED] RFC: optional generate header to not stream response #56794

GiteaMirror referenced this issue

2026-05-03 07:43:01 -05:00

[GH-ISSUE #281] Consider a non streaming api for `/api/generate` #62160

GiteaMirror referenced this issue

2026-05-03 07:44:01 -05:00

[GH-ISSUE #294] Streaming responses should have `Content-Type` set to `application/x-ndjson ` #62167

GiteaMirror referenced this issue

2026-05-03 07:44:03 -05:00

[GH-ISSUE #294] Streaming responses should have `Content-Type` set to `application/x-ndjson ` #62167

GiteaMirror referenced this issue

2026-05-03 07:45:39 -05:00

[GH-ISSUE #318] add a generate option for max response length #62177

GiteaMirror referenced this issue

2026-05-05 03:54:46 -05:00

[PR #319] [CLOSED] RFC: optional generate header to not stream response #72391

Sign in to join this conversation.

Branches Tags

main

hoyyeva/anthropic-local-image-path

dhiltgen/ci

dhiltgen/llama-runner

parth-remove-claude-desktop-launch

hoyyeva/anthropic-reference-images-path

parth-anthropic-reference-images-path

brucemacd/download-before-remove

hoyyeva/editor-config-repair

parth-mlx-decode-checkpoints

parth-launch-codex-app

hoyyeva/fix-codex-model-metadata-warning

hoyyeva/qwen

parth/hide-claude-desktop-till-release

hoyyeva/opencode-image-modality

parth-add-claude-code-autoinstall

release_v0.22.0

pdevine/manifest-list

codex/fix-codex-model-metadata-warning

pdevine/addressable-manifest

brucemacd/launch-fetch-reccomended

jmorganca/llama-compat

launch-copilot-cli

hoyyeva/opencode-thinking

release_v0.20.7

parth-auto-save-backup

parth-test

jmorganca/gemma4-audio-replacements

fix-manifest-digest-on-pull

hoyyeva/vscode-improve

brucemacd/install-server-wait

parth/update-claude-docs

brucemac/start-ap-install

pdevine/mlx-update

pdevine/qwen35_vision

drifkin/api-show-fallback

mintlify/image-generation-1773352582

hoyyeva/server-context-length-local-config

jmorganca/faster-reptition-penalties

jmorganca/convert-nemotron

parth-pi-thinking

pdevine/sampling-penalties

jmorganca/fix-create-quantization-memory

dongchen/resumable_transfer_fix

pdevine/sampling-cache-error

jessegross/mlx-usage

hoyyeva/openclaw-config

hoyyeva/app-html

pdevine/qwen3next

brucemacd/sign-sh-install

brucemacd/tui-update

brucemacd/usage-api

jmorganca/launch-empty

fix-app-dist-embed

mxyng/mlx-compile

mxyng/mlx-quant

mxyng/mlx-glm4.7

mxyng/mlx

brucemacd/simplify-model-picker

jmorganca/qwen3-concurrent

fix-glm-4.7-flash-mla-config

drifkin/qwen3-coder-opening-tag

brucemacd/usage-cli

fix-cuda12-fattn-shmem

ollama-imagegen-docs

parth/fix-multiline-inputs

brucemacd/config-docs

mxyng/model-files

mxyng/simple-execute

fix-imagegen-ollama-models

mxyng/async-upload

jmorganca/lazy-no-dtype-changes

imagegen-auto-detect-create

parth/decrease-concurrent-download-hf

fix-mlx-quantize-init

jmorganca/x-cleanup

usage

imagegen-readme

jmorganca/glm-image

mlx-gpu-cd

jmorganca/imagegen-modelfile

parth/agent-skills

parth/agent-allowlist

parth/signed-in-offline

parth/agents

parth/fix-context-chopping

improve-cloud-flow

parth/add-models-websearch

parth/prompt-renderer-mcp

jmorganca/native-settings

jmorganca/download-stream-hash

jmorganca/client2-rebased

brucemacd/oai-chat-req-multipart

jessegross/multi_chunk_reserve

grace/additional-omit-empty

grace/mistral-3-large

mxyng/tokenizer2

mxyng/tokenizer

jessegross/flash

hoyyeva/windows-nacked-app

mxyng/cleanup-attention

grace/deepseek-parser

hoyyeva/remember-unsent-prompt

parth/add-lfs-pointer-error-conversion

parth/olmo2-test2

hoyyeva/ollama-launchagent-plist

nicole/olmo-model

parth/olmo-test

mxyng/remove-embedded

parth/render-template

jmorganca/intellect-3

parth/remove-prealloc-linter

jmorganca/cmd-eval

nicole/nomic-embed-text-fix

mxyng/lint-2

hoyyeva/add-gemini-3-pro-preview

hoyyeva/load-model-list

mxyng/expand-path

mxyng/environ-2

hoyyeva/deeplink-json-encoding

parth/improve-tool-calling-tests

hoyyeva/conversation

hoyyeva/assistant-edit-response

hoyyeva/thinking

origin/brucemacd/invalid-char-i-err

parth/improve-tool-calling

jmorganca/required-omitempty

grace/qwen3-vl-tests

mxyng/iter-client

parth/docs-readme

nicole/embed-test

pdevine/integration-benchstat

parth/remove-generate-cmd

parth/add-toolcall-id

mxyng/server-tests

jmorganca/glm-4.6

jmorganca/gin-h-compat

drifkin/stable-tool-args

pdevine/qwen3-more-thinking

parth/add-websearch-client

nicole/websearch_local

jmorganca/qwen3-coder-updates

grace/deepseek-v3-migration-tests

mxyng/fix-create

jmorganca/cloud-errors

pdevine/parser-tidy

revert-12233-parth/simplify-entrypoints-runner

parth/enable-so-gpt-oss

brucemacd/qwen3vl

jmorganca/readme-simplify

parth/gpt-oss-structured-outputs

revert-12039-jmorganca/tools-braces

mxyng/embeddings

mxyng/gguf

mxyng/benchmark

mxyng/types-null

parth/move-parsing

mxyng/gemma2

jmorganca/docs

mxyng/16-bit

mxyng/create-stdin

pdevine/authorizedkeys

mxyng/quant

parth/opt-in-error-context-window

brucemacd/cache-models

brucemacd/runner-completion

jmorganca/llama-update-6

brucemacd/benchmark-list

brucemacd/partial-read-caps

parth/deepseek-r1-tools

mxyng/omit-array

parth/tool-prefix-temp

brucemacd/runner-test

jmorganca/qwen25vl

brucemacd/model-forward-test-ext

parth/python-function-parsing

jmorganca/cuda-compression-none

drifkin/num-parallel

drifkin/chat-truncation-fix

jmorganca/sync

parth/python-tools-calling

drifkin/array-head-count

brucemacd/create-no-loop

parth/server-enable-content-stream-with-tools

qwen25omni

mxyng/v3

brucemacd/ropeconfig

jmorganca/silence-tokenizer

parth/sample-so-test

parth/sampling-structured-outputs

brucemacd/doc-go-engine

parth/constrained-sampling-json

jmorganca/mistral-wip

brucemacd/mistral-small-convert

parth/sample-unmarshal-json-for-params

brucemacd/jomorganca/mistral

pdevine/bfloat16

jmorganca/mistral

brucemacd/mistral

pdevine/logging

parth/sample-correctness-fix

parth/sample-fix-sorting

jmorgan/sample-fix-sorting-extras

jmorganca/temp-0-images

brucemacd/parallel-embed-models

brucemacd/shim-grammar

jmorganca/fix-gguf-error

bmizerany/nameswork

jmorganca/faster-releases

bmizerany/validatenames

brucemacd/err-no-vocab

brucemacd/rope-config

brucemacd/err-hint

brucemacd/qwen2_5

brucemacd/logprobs

brucemacd/new_runner_graph_bench

progress-flicker

brucemacd/forward-test

brucemacd/go_qwen2

pdevine/gemma2

jmorganca/add-missing-symlink-eval

mxyng/next-debug

parth/set-context-size-openai

brucemacd/next-bpe-bench

brucemacd/next-bpe-test

brucemacd/new_runner_e2e

brucemacd/new_runner_qwen2

pdevine/convert-cohere2

brucemacd/convert-cli

parth/log-probs

mxyng/next-mlx

mxyng/cmd-history

parth/templating

parth/tokenize-detokenize

brucemacd/check-key-register

bmizerany/grammar

jmorganca/vendor-081b29bd

mxyng/func-checks

jmorganca/fix-null-format

parth/fix-default-to-warn-json

jmorganca/qwen2vl

jmorganca/no-concat

parth/cmd-cleanup-SO

brucemacd/check-key-register-structured-err

parth/openai-stream-usage

parth/fix-referencing-so

stream-tools-stop

jmorganca/degin-1

brucemacd/install-path-clean

brucemacd/push-name-validation

brucemacd/browser-key-register

jmorganca/openai-fix-first-message

jmorganca/fix-proxy

jessegross/sample

parth/disallow-streaming-tools

dhiltgen/remove_submodule

jmorganca/ga

jmorganca/mllama

pdevine/newlines

pdevine/geems-2b

jmorganca/llama-bump

mxyng/modelname-7

mxyng/gin-slog

mxyng/modelname-6

jyan/convert-prog

jyan/quant5

paligemma-support

pdevine/import-docs

jmorganca/openai-context

jyan/paligemma

jyan/p2

jyan/palitest

bmizerany/embedspeedup

jmorganca/llama-vit

brucemacd/allow-ollama

royh/ep-methods

royh/whisper

mxyng/api-models

mxyng/fix-memory

jyan/q4_4/8

jyan/ollama-v

royh/stream-tools

roy-embed-parallel

bmizerany/hrm

revert-5963-revert-5924-mxyng/llama3.1-rope

royh/embed-viz

jyan/local2

jyan/auth

jyan/local

jyan/parse-temp

jmorganca/template-mistral

jyan/reord-g

royh-openai-suffixdocs

royh-imgembed

royh-embed-parallel

jyan/quant4

royh-precision

jyan/progress

pdevine/fix-template

jyan/quant3

pdevine/ggla

mxyng/update-registry-domain

jmorganca/ggml-static

mxyng/create-context

jyan/v0.146

mxyng/layers-from-files

build_dist

bmizerany/noseek

royh-ls

royh-name

timeout

mxyng/server-timestamp

bmizerany/nosillyggufslurps

royh-params

jmorganca/llama-cpp-7c26775

royh-openai-delete

royh-show-rigid

jmorganca/enable-fa

jmorganca/no-error-template

jyan/format

royh-testdelete

bmizerany/fastverify

language_support

pdevine/ps-glitches

brucemacd/tokenize

bruce/iq-quants

bmizerany/filepathwithcoloninhost

mxyng/split-bin

bmizerany/client-registry

jmorganca/if-none-match

native

jmorganca/native

jmorganca/batch-embeddings

jmorganca/initcmake

jmorganca/mm

pdevine/showggmlinfo

modenameenforcealphanum

bmizerany/modenameenforcealphanum

jmorganca/done-reason

jmorganca/llama-cpp-8960fe8

ollama.com

bmizerany/filepathnobuild

bmizerany/types/model/defaultfix

rmdisplaylong

nogogen

bmizerany/x

modelfile-readme

bmizerany/replacecolon

jmorganca/limit

jmorganca/execstack

jmorganca/replace-assets

mxyng/tune-concurrency

jmorganca/testing

whitespace-detection

jmorganca/options

upgrade-all

scratch

cuda-search

mattw/airenamer

mattw/allmodelsonhuggingface

mattw/quantcontext

mattw/whatneedstorun

brucemacd/llama-mem-calc

mattw/faq-context

mattw/communitylinks

mattw/noprune

mattw/python-functioncalling

rename

mxyng/install

pulse

remove-first

editor

mattw/selfqueryingretrieval

cgo

mattw/howtoquant

api

matt/streamingapi

format-config

mxyng/extra-args

shell

update-nous-hermes

cp-model

upload-progress

fix-unknown-model

fix-model-names

delete-fix

insecure-registry

ls

deletemodels

progressbar

readme-updates

license-layers

skip-list

list-models

modelpath

matt/examplemodelfiles

distribution

go-opts

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: github-starred/ollama#281