[GH-ISSUE #6568] Error: llama runner process has terminated: exit status 127 #50646

Closed
opened 2026-04-28 16:44:16 -05:00 by GiteaMirror · 8 comments

Originally created by @MiloDev123 on GitHub (Aug 30, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/6568

What is the issue?

Error: llama runner process has terminated: exit status 127

![imagen](https://github.com/user-attachments/assets/daa21bda-ed28-441b-b985-89d61789b7e3)

Running Ollama in an Ubuntu container with root inside Termux in an Oculus Quest 2.

OS

Linux

GPU

Other

CPU

Other

Ollama version

0.3.8

GiteaMirror added the bug label 2026-04-28 16:44:16 -05:00

@MiloDev123 commented on GitHub (Aug 30, 2024):

```
# ollama serve
2024/08/30 17:52:01 routes.go:1125: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR: OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-08-30T17:52:01.227+02:00 level=INFO source=images.go:753 msg="total blobs: 5"
time=2024-08-30T17:52:01.230+02:00 level=INFO source=images.go:760 msg="total unused blobs removed: 0"
time=2024-08-30T17:52:01.234+02:00 level=INFO source=routes.go:1172 msg="Listening on 127.0.0.1:11434 (version 0.3.8)"
time=2024-08-30T17:52:01.519+02:00 level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama1991716437/runners
time=2024-08-30T17:52:17.038+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cuda_v11 cuda_v12]"
time=2024-08-30T17:52:17.041+02:00 level=INFO source=gpu.go:200 msg="looking for compatible GPUs"
time=2024-08-30T17:52:17.075+02:00 level=INFO source=gpu.go:347 msg="no compatible GPUs were discovered"
time=2024-08-30T17:52:17.075+02:00 level=INFO source=types.go:107 msg="inference compute" id=0 library=cpu variant="no vector extensions" compute="" driver=0.0 name="" total="5.7 GiB" available="3.1 GiB"
[GIN] 2024/08/30 - 17:52:45 | 200 |      77.344µs |       127.0.0.1 | HEAD     "/"
[GIN] 2024/08/30 - 17:52:45 | 200 |   88.723125ms |       127.0.0.1 | POST     "/api/show"
time=2024-08-30T17:52:45.532+02:00 level=INFO source=memory.go:309 msg="offload to cpu" layers.requested=-1 layers.model=27 layers.offload=0 layers.split="" memory.available="[3.1 GiB]" memory.required.full="3.3 GiB" memory.required.partial="0 B" memory.required.kv="832.0 MiB" memory.required.allocations="[2.9 GiB]" memory.weights.total="1.9 GiB" memory.weights.repeating="1.4 GiB" memory.weights.nonrepeating="461.4 MiB" memory.graph.full="504.5 MiB" memory.graph.partial="965.9 MiB"
time=2024-08-30T17:52:45.550+02:00 level=INFO source=server.go:391 msg="starting llama server" cmd="/tmp/ollama1991716437/runners/cpu/ollama_llama_server --model /root/.ollama/models/blobs/sha256-7462734796d67c40ecec2ca98eddf970e171dbb6b370e43fd633ee75b69abe1b --ctx-size 8192 --batch-size 512 --embedding --log-disable --no-mmap --numa distribute --parallel 4 --port 35927"
time=2024-08-30T17:52:45.554+02:00 level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2024-08-30T17:52:45.554+02:00 level=INFO source=server.go:591 msg="waiting for llama runner to start responding"
time=2024-08-30T17:52:45.555+02:00 level=INFO source=server.go:625 msg="waiting for server to become available" status="llm server error"
/tmp/ollama1991716437/runners/cpu/ollama_llama_server: error while loading shared libraries: libllama.so: cannot open shared object file: No such file or directory
time=2024-08-30T17:52:45.806+02:00 level=ERROR source=sched.go:456 msg="error loading llama server" error="llama runner process has terminated: exit status 127"
[GIN] 2024/08/30 - 17:52:45 | 500 |  520.509323ms |       127.0.0.1 | POST     "/api/chat"
```
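
An exit status of 127 conventionally means the program, or a shared library it depends on, could not be found, which matches the libllama.so error above. A minimal diagnostic sketch, assuming the runner path printed in this log (the /tmp/ollama* directory is recreated on each start, so the exact name will differ):

```
# Show which shared libraries the runner resolves and which are missing:
ldd /tmp/ollama1991716437/runners/cpu/ollama_llama_server

# Check whether libllama.so was extracted anywhere under the runners dir:
find /tmp/ollama1991716437 -name 'libllama.so'

# If the library exists but is not on the loader's search path, pointing
# LD_LIBRARY_PATH at it is a possible workaround (an assumption, not a
# confirmed fix for this issue):
LD_LIBRARY_PATH=/tmp/ollama1991716437/runners/cpu ollama serve
```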

@Hcaziah commented on GitHub (Aug 30, 2024):

Rolling back to 0.3.6 fixes this for now; see #6541.


@MikeLP commented on GitHub (Aug 30, 2024):

I have the same issue with some large models like llama 400b and dbrx on AMD ROCm 6.2. Rolled back to 0.3.6 as well.

My hardware:
CPU: AMD Ryzen Threadripper PRO 7965WX 24-Cores
GPU 1: AMD Instinct MI100 [Discrete]
GPU 2: AMD Instinct MI100 [Discrete]
GPU 3: AMD Radeon RX 6900 XT [Discrete]
GPU 4: AMD Radeon VII [Discrete]
VRAM: 96GiB
RAM: 128 GiB
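
Since the setups above mix several GPU generations, one way to narrow down a ROCm-side failure is to expose a single device at a time; both ROCR_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES appear in the server config log above. A sketch (the device indices are assumptions; check `rocm-smi` for the actual ordering):

```
# Run the server against one GPU at a time to see whether the failure
# follows a particular card (index 0 assumed to be the first MI100):
ROCR_VISIBLE_DEVICES=0 ollama serve

# Then repeat with the remaining indices:
ROCR_VISIBLE_DEVICES=1 ollama serve
```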


@khuezy commented on GitHub (Aug 30, 2024):

Friendly reminder to lock down versions, because things like this happen all the time.
`RUN curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.3.6 sh`
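
The same pinned install works outside a Dockerfile; a quick sketch, including a check that the pin took effect:

```
# Install a specific release (OLLAMA_VERSION is honored by the official
# install script, as in the RUN line above):
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.3.6 sh

# Confirm which version actually ended up on the machine:
ollama --version
```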


@pdevine commented on GitHub (Aug 30, 2024):

Dupe of #6541


@alex2romanov commented on GitHub (Nov 14, 2024):

![Screenshot 2024-11-15 at 07 41 24](https://github.com/user-attachments/assets/34055116-f2f0-4d4a-a916-9f3d7a1971b1)

I have the same issue; downgrading to 0.3.6.

![Screenshot 2024-11-15 at 07 40 53](https://github.com/user-attachments/assets/eba7a243-96aa-41dc-b4c8-b3a3df67edc3)


@KHRMNKY commented on GitHub (Jan 28, 2025):

> Friendly reminder to lock down versions, because things like this happen all the time. `RUN curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.3.6 sh`

When executing the command, I encountered an error:

![Image](https://github.com/user-attachments/assets/6c09abcd-e28f-4877-9639-f4fb0fcc782e)


@ErfanFathi commented on GitHub (May 21, 2025):

This command works for me: `systemctl restart ollama.service`
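
If a restart alone doesn't clear it, the service logs usually show the underlying runner error; a sketch assuming the systemd unit the Linux install script sets up:

```
sudo systemctl restart ollama.service
# Follow the service logs while reproducing the failure:
sudo journalctl -u ollama.service -f
```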

Reference: github-starred/ollama#50646