[GH-ISSUE #4795] Error: llama runner process has terminated: exit status 0xc000001d #65061

Closed
opened 2026-05-03 19:40:03 -05:00 by GiteaMirror · 3 comments

Originally created by @Ecthellin203 on GitHub (Jun 3, 2024).
Original GitHub issue: https://github.com/ollama/ollama/issues/4795

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Running llama3:8b fails with:

Error: llama runner process has terminated: exit status 0xc000001d
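
For reference, exit status 0xC000001D is the Windows NTSTATUS code STATUS_ILLEGAL_INSTRUCTION: the runner process tried to execute a CPU instruction the host does not support, which typically points at a binary built for newer vector extensions (e.g. AVX/AVX2) than the CPU provides. One way to get more detail on such a crash, assuming Windows recorded it as an Application Error (Event ID 1000), is to query the event log from PowerShell:

# Event ID 1000 ("Application Error") entries name the faulting module and
# exception code for a crashed process such as ollama_llama_server.exe.
Get-WinEvent -FilterHashtable @{ LogName = 'Application'; Id = 1000 } -MaxEvents 50 |
    Where-Object { $_.Message -match 'ollama_llama_server' } |
    Select-Object TimeCreated, Message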

2024/06/03 15:40:13 routes.go:1007: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST: OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS: OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:*] OLLAMA_RUNNERS_DIR:C:\\Users\\ecthe\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_TMPDIR:]"
time=2024-06-03T15:40:13.471+08:00 level=INFO source=images.go:729 msg="total blobs: 0"
time=2024-06-03T15:40:13.471+08:00 level=INFO source=images.go:736 msg="total unused blobs removed: 0"
time=2024-06-03T15:40:13.472+08:00 level=INFO source=routes.go:1053 msg="Listening on 127.0.0.1:11434 (version 0.1.41)"
time=2024-06-03T15:40:13.472+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [rocm_v5.7 cpu cpu_avx cpu_avx2 cuda_v11.3]"
time=2024-06-03T15:40:18.670+08:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=rocm compute=gfx1100 driver=0.0 name="AMD Radeon RX 7900 XTX" total="24.0 GiB" available="23.9 GiB"
2024/06/03 15:43:44 routes.go:1007: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST: OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS: OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:*] OLLAMA_RUNNERS_DIR:C:\\Users\\ecthe\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_TMPDIR:]"
time=2024-06-03T15:43:44.069+08:00 level=INFO source=images.go:729 msg="total blobs: 0"
time=2024-06-03T15:43:44.070+08:00 level=INFO source=images.go:736 msg="total unused blobs removed: 0"
time=2024-06-03T15:43:44.070+08:00 level=INFO source=routes.go:1053 msg="Listening on 127.0.0.1:11434 (version 0.1.41)"
time=2024-06-03T15:43:44.071+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11.3 rocm_v5.7]"
time=2024-06-03T15:43:45.378+08:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=rocm compute=gfx1100 driver=0.0 name="AMD Radeon RX 7900 XTX" total="24.0 GiB" available="23.9 GiB"
2024/06/03 15:46:24 routes.go:1007: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST: OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS: OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:*] OLLAMA_RUNNERS_DIR:C:\\Users\\ecthe\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_TMPDIR:]"
time=2024-06-03T15:46:24.031+08:00 level=INFO source=images.go:729 msg="total blobs: 0"
time=2024-06-03T15:46:24.034+08:00 level=INFO source=images.go:736 msg="total unused blobs removed: 0"
time=2024-06-03T15:46:24.035+08:00 level=INFO source=routes.go:1053 msg="Listening on 127.0.0.1:11434 (version 0.1.41)"
time=2024-06-03T15:46:24.035+08:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [rocm_v5.7 cpu cpu_avx cpu_avx2 cuda_v11.3]"
time=2024-06-03T15:46:24.820+08:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=rocm compute=gfx1100 driver=0.0 name="AMD Radeon RX 7900 XTX" total="24.0 GiB" available="23.9 GiB"
[GIN] 2024/06/03 - 15:46:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2024/06/03 - 15:47:01 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2024/06/03 - 15:47:01 | 404 |       527.2µs |       127.0.0.1 | POST     "/api/show"

time=2024-06-03T15:58:14.081+08:00 level=INFO source=download.go:136 msg="downloading 6a0746a1ec1a in 47 100 MB part(s)"
time=2024-06-03T15:58:31.081+08:00 level=INFO source=download.go:251 msg="6a0746a1ec1a part 33 stalled; retrying. If this persists, press ctrl-c to exit, then 'ollama pull' to find a faster connection."
time=2024-06-03T15:58:31.081+08:00 level=INFO source=download.go:251 msg="6a0746a1ec1a part 7 stalled; retrying. If this persists, press ctrl-c to exit, then 'ollama pull' to find a faster connection."
time=2024-06-03T16:07:06.001+08:00 level=INFO source=download.go:178 msg="6a0746a1ec1a part 38 attempt 0 failed: unexpected EOF, retrying in 1s"
time=2024-06-03T16:09:06.642+08:00 level=INFO source=download.go:136 msg="downloading 4fa551d4f938 in 1 12 KB part(s)"
time=2024-06-03T16:09:09.722+08:00 level=INFO source=download.go:136 msg="downloading 8ab4849b038c in 1 254 B part(s)"
time=2024-06-03T16:09:12.829+08:00 level=INFO source=download.go:136 msg="downloading 577073ffcc6c in 1 110 B part(s)"
time=2024-06-03T16:09:15.947+08:00 level=INFO source=download.go:136 msg="downloading 3f8eb4da87fa in 1 485 B part(s)"
[GIN] 2024/06/03 - 16:09:50 | 200 |        11m38s |       127.0.0.1 | POST     "/api/pull"
[GIN] 2024/06/03 - 16:09:50 | 200 |      1.6429ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2024/06/03 - 16:09:50 | 200 |       1.558ms |       127.0.0.1 | POST     "/api/show"
time=2024-06-03T16:09:55.333+08:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=33 memory.available="23.9 GiB" memory.required.full="5.0 GiB" memory.required.partial="5.0 GiB" memory.required.kv="256.0 MiB" memory.weights.total="4.1 GiB" memory.weights.repeating="3.7 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="164.0 MiB" memory.graph.partial="677.5 MiB"
time=2024-06-03T16:09:55.334+08:00 level=INFO source=memory.go:133 msg="offload to gpu" layers.requested=-1 layers.real=33 memory.available="23.9 GiB" memory.required.full="5.0 GiB" memory.required.partial="5.0 GiB" memory.required.kv="256.0 MiB" memory.weights.total="4.1 GiB" memory.weights.repeating="3.7 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="164.0 MiB" memory.graph.partial="677.5 MiB"
time=2024-06-03T16:09:55.339+08:00 level=INFO source=server.go:341 msg="starting llama server" cmd="C:\\Users\\ecthe\\AppData\\Local\\Programs\\Ollama\\ollama_runners\\rocm_v5.7\\ollama_llama_server.exe --model D:\\Models\\blobs\\sha256-6a0746a1ec1aef3e7ec53868f220ff6e389f6f8ef87a01d77c96807de94ca2aa --ctx-size 2048 --batch-size 512 --embedding --log-disable --n-gpu-layers 33 --parallel 1 --port 50779"
time=2024-06-03T16:09:55.542+08:00 level=INFO source=sched.go:338 msg="loaded runners" count=1
time=2024-06-03T16:09:55.542+08:00 level=INFO source=server.go:529 msg="waiting for llama runner to start responding"
time=2024-06-03T16:09:55.542+08:00 level=INFO source=server.go:567 msg="waiting for server to become available" status="llm server error"
time=2024-06-03T16:10:05.789+08:00 level=ERROR source=sched.go:344 msg="error loading llama server" error="llama runner process has terminated: exit status 0xc000001d "
[GIN] 2024/06/03 - 16:10:05 | 500 |     15.41134s |       127.0.0.1 | POST     "/api/chat"

server.log: https://github.com/user-attachments/files/15530934/server.log
app.log: https://github.com/user-attachments/files/15530937/app.log

OS

Windows

GPU

AMD

CPU

Intel

Ollama version

0.1.41

GiteaMirror added the amd, bug, windows labels 2026-05-03 19:40:04 -05:00

@Ecthellin203 commented on GitHub (Jun 4, 2024):

And I have another question: is AVX2 a minimum requirement? In fact my hardware platform is really old (E5 v2).
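
For context: the Xeon E5 v2 parts are Ivy Bridge, which supports AVX but not AVX2 (AVX2 first shipped with Haswell). A quick way to confirm what the CPU reports, assuming PowerShell 7+ (its .NET runtime exposes the hardware-intrinsics API):

# Query the .NET hardware-intrinsics API for the host CPU's vector support.
# On an Ivy Bridge Xeon E5 v2 this should print True for AVX, False for AVX2.
[System.Runtime.Intrinsics.X86.Avx]::IsSupported
[System.Runtime.Intrinsics.X86.Avx2]::IsSupported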


@dhiltgen commented on GitHub (Jun 18, 2024):

I'm not sure exactly why it's failing to start the subprocess. Let's try a different approach to see if we can get more detail on what's going wrong. Please upgrade to the latest version, quit the tray app, then in a PowerShell terminal run:

$env:OLLAMA_DEBUG="1"
ollama serve  2>&1 | % ToString | Tee-Object server.log

Then, in another terminal, try to run the same model you already downloaded. Assuming it crashes, share that server.log.

> And I have another question: is AVX2 a minimum requirement? In fact my hardware platform is really old (E5 v2).

Our GPU runners are compiled with AVX; for CPU runners we detect whether the CPU has no vector support, AVX, or AVX2, and auto-select the best variant available.
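
One hedged way to narrow this down, assuming the OLLAMA_LLM_LIBRARY override (visible in the server-config lines above) behaves as described in the Ollama troubleshooting docs, is to force one of the runners listed in the "Dynamic LLM libraries" log line and see whether the crash follows the ROCm runner or the CPU feature level:

# Force a specific runner instead of letting Ollama auto-select one; the name
# must match an entry from the "Dynamic LLM libraries [...]" server log line.
$env:OLLAMA_LLM_LIBRARY = "cpu_avx"   # AVX-only CPU runner; avoid cpu_avx2 on pre-Haswell CPUs
ollama serve

Then, in a second terminal, ollama run llama3:8b reproduces the request; if the AVX-only CPU runner loads where rocm_v5.7 crashed, the GPU runner's instruction-set requirements are the likely culprit.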


@dhiltgen commented on GitHub (Jul 3, 2024):

If you're still having trouble, please follow the instructions above and I'll reopen the issue.

Reference: github-starred/ollama#65061