[GH-ISSUE #11209] Repeated crashes (exit status 2) on Windows with version 0.9.3 #69441

Closed
opened 2026-05-04 18:06:21 -05:00 by GiteaMirror · 11 comments
Owner

Originally created by @yuval-ngtnuma on GitHub (Jun 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11209

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

I just upgraded Ollama to 0.9.3 on my Windows 10 Pro PC (22H2, OS build 19045.5965). The PC has two NVIDIA RTX 2080 Ti GPUs (driver version 560.94). I am not sure which version I was running before, but I had last upgraded within the past two months.

Since the upgrade, I have been unable to run any model. I get the error:

>ollama run llama3.2:1b
Error: llama runner process has terminated: exit status 2

The most relevant part of the log looks like this:

signal arrived during external code execution

runtime.cgocall(0x7ff69cb6c470, 0xc0005875a0)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/cgocall.go:167 +0x3e fp=0xc000587578 sp=0xc000587510 pc=0x7ff69be82dbe
github.com/ollama/ollama/ml/backend/ggml/ggml/src._Cfunc_ggml_backend_load_all_from_path(0x143865dd0e0)
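For reference, here is how I captured the verbose log below. This is a sketch based on Ollama's Windows troubleshooting docs; the `OLLAMA_DEBUG` variable and the `%LOCALAPPDATA%\Ollama\server.log` path are my understanding of the current layout and may differ on other setups.

```shell
REM From cmd.exe. First quit Ollama from the system tray, then
REM start the server manually with debug logging enabled:
set OLLAMA_DEBUG=1
ollama serve

REM In a second terminal, reproduce the crash:
ollama run llama3.2:1b

REM Inspect the server log written under %LOCALAPPDATA%:
type "%LOCALAPPDATA%\Ollama\server.log"
```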

Relevant log output

[GIN] 2025/06/26 - 21:24:30 | 200 |     88.4522ms |       127.0.0.1 | POST     "/api/show"
time=2025-06-26T21:24:30.324+03:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45 gpu=GPU-fdf1946b-5cd3-50d4-863c-6686a64f17dd parallel=2 available=10597957632 required="2.5 GiB"
time=2025-06-26T21:24:30.371+03:00 level=INFO source=server.go:135 msg="system memory" total="63.7 GiB" free="48.8 GiB" free_swap="55.6 GiB"
time=2025-06-26T21:24:30.371+03:00 level=INFO source=server.go:175 msg=offload library=cuda layers.requested=-1 layers.model=17 layers.offload=17 layers.split="" memory.available="[9.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="2.5 GiB" memory.required.partial="2.5 GiB" memory.required.kv="256.0 MiB" memory.required.allocations="[2.5 GiB]" memory.weights.total="1.2 GiB" memory.weights.repeating="986.2 MiB" memory.weights.nonrepeating="266.2 MiB" memory.graph.full="544.0 MiB" memory.graph.partial="554.3 MiB"
llama_model_loader: loaded meta data with 30 key-value pairs and 147 tensors from C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Llama 3.2 1B Instruct
llama_model_loader: - kv   3:                           general.finetune str              = Instruct
llama_model_loader: - kv   4:                           general.basename str              = Llama-3.2
llama_model_loader: - kv   5:                         general.size_label str              = 1B
llama_model_loader: - kv   6:                               general.tags arr[str,6]       = ["facebook", "meta", "pytorch", "llam...
llama_model_loader: - kv   7:                          general.languages arr[str,8]       = ["en", "de", "fr", "it", "pt", "hi", ...
llama_model_loader: - kv   8:                          llama.block_count u32              = 16
llama_model_loader: - kv   9:                       llama.context_length u32              = 131072
llama_model_loader: - kv  10:                     llama.embedding_length u32              = 2048
llama_model_loader: - kv  11:                  llama.feed_forward_length u32              = 8192
llama_model_loader: - kv  12:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv  13:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  14:                       llama.rope.freq_base f32              = 500000.000000
llama_model_loader: - kv  15:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  16:                 llama.attention.key_length u32              = 64
llama_model_loader: - kv  17:               llama.attention.value_length u32              = 64
llama_model_loader: - kv  18:                          general.file_type u32              = 7
llama_model_loader: - kv  19:                           llama.vocab_size u32              = 128256
llama_model_loader: - kv  20:                 llama.rope.dimension_count u32              = 64
llama_model_loader: - kv  21:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  22:                         tokenizer.ggml.pre str              = llama-bpe
llama_model_loader: - kv  23:                      tokenizer.ggml.tokens arr[str,128256]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  24:                  tokenizer.ggml.token_type arr[i32,128256]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  25:                      tokenizer.ggml.merges arr[str,280147]  = ["Ä  Ä ", "Ä  Ä Ä Ä ", "Ä Ä  Ä Ä ", "...
llama_model_loader: - kv  26:                tokenizer.ggml.bos_token_id u32              = 128000
llama_model_loader: - kv  27:                tokenizer.ggml.eos_token_id u32              = 128009
llama_model_loader: - kv  28:                    tokenizer.chat_template str              = {{- bos_token }}\n{%- if custom_tools ...
llama_model_loader: - kv  29:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   34 tensors
llama_model_loader: - type q8_0:  113 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q8_0
print_info: file size   = 1.22 GiB (8.50 BPW) 
load: special tokens cache size = 256
load: token to piece cache size = 0.7999 MB
print_info: arch             = llama
print_info: vocab_only       = 1
print_info: model type       = ?B
print_info: model params     = 1.24 B
print_info: general.name     = Llama 3.2 1B Instruct
print_info: vocab type       = BPE
print_info: n_vocab          = 128256
print_info: n_merges         = 280147
print_info: BOS token        = 128000 '<|begin_of_text|>'
print_info: EOS token        = 128009 '<|eot_id|>'
print_info: EOT token        = 128009 '<|eot_id|>'
print_info: EOM token        = 128008 '<|eom_id|>'
print_info: LF token         = 198 'ÄŠ'
print_info: EOG token        = 128008 '<|eom_id|>'
print_info: EOG token        = 128009 '<|eot_id|>'
print_info: max token length = 256
llama_model_load: vocab only - skipping tensors
time=2025-06-26T21:24:30.634+03:00 level=INFO source=server.go:438 msg="starting llama server" cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --model C:\\Users\\user\\.ollama\\models\\blobs\\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45 --ctx-size 8192 --batch-size 512 --n-gpu-layers 17 --threads 6 --no-mmap --parallel 2 --port 58897"
time=2025-06-26T21:24:30.635+03:00 level=INFO source=sched.go:483 msg="loaded runners" count=1
time=2025-06-26T21:24:30.637+03:00 level=INFO source=server.go:598 msg="waiting for llama runner to start responding"
time=2025-06-26T21:24:30.637+03:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server error"
time=2025-06-26T21:24:30.674+03:00 level=INFO source=runner.go:815 msg="starting go runner"
Exception 0xc0000005 0x0 0x0 0x7ffc305a2590
PC=0x7ffc305a2590
signal arrived during external code execution

runtime.cgocall(0x7ff69cb6c470, 0xc0005875a0)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/cgocall.go:167 +0x3e fp=0xc000587578 sp=0xc000587510 pc=0x7ff69be82dbe
github.com/ollama/ollama/ml/backend/ggml/ggml/src._Cfunc_ggml_backend_load_all_from_path(0x143865dd0e0)
	_cgo_gotypes.go:199 +0x45 fp=0xc0005875a0 sp=0xc000587578 pc=0x7ff69c245c05
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.func1.1({0xc0003dce00, 0x37})
	C:/a/ollama/ollama/ml/backend/ggml/ggml/src/ggml.go:97 +0xf5 fp=0xc000587638 sp=0xc0005875a0 pc=0x7ff69c245635
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.func1()
	C:/a/ollama/ollama/ml/backend/ggml/ggml/src/ggml.go:98 +0x4e5 fp=0xc0005878b0 sp=0xc000587638 pc=0x7ff69c245485
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.OnceFunc.func2()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/oncefunc.go:27 +0x62 fp=0xc0005878f8 sp=0xc0005878b0 pc=0x7ff69c244ec2
sync.(*Once).doSlow(0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/once.go:78 +0xab fp=0xc000587950 sp=0xc0005878f8 pc=0x7ff69be9a58b
sync.(*Once).Do(0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/once.go:69 +0x19 fp=0xc000587970 sp=0xc000587950 pc=0x7ff69be9a4b9
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.OnceFunc.func3()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/oncefunc.go:32 +0x2d fp=0xc0005879a0 sp=0xc000587970 pc=0x7ff69c244e2d
github.com/ollama/ollama/llama.BackendInit()
	C:/a/ollama/ollama/llama/llama.go:60 +0x16 fp=0xc0005879b0 sp=0xc0005879a0 pc=0x7ff69c249c16
github.com/ollama/ollama/runner/llamarunner.Execute({0xc0000a4020, 0xf, 0x1e})
	C:/a/ollama/ollama/runner/llamarunner/runner.go:817 +0x63e fp=0xc000587d08 sp=0xc0005879b0 pc=0x7ff69c30621e
github.com/ollama/ollama/runner.Execute({0xc0000a4010?, 0x0?, 0x0?})
	C:/a/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc000587d30 sp=0xc000587d08 pc=0x7ff69c38b9d4
github.com/ollama/ollama/cmd.NewCLI.func2(0xc0000a3200?, {0x7ff69d0663a4?, 0x4?, 0x7ff69d0663a8?})
	C:/a/ollama/ollama/cmd/cmd.go:1529 +0x45 fp=0xc000587d58 sp=0xc000587d30 pc=0x7ff69caea4c5
github.com/spf13/cobra.(*Command).execute(0xc0004ccf08, {0xc0005bc690, 0xf, 0xf})
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc000587e78 sp=0xc000587d58 pc=0x7ff69c00cb7c
github.com/spf13/cobra.(*Command).ExecuteC(0xc00061c908)
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000587f30 sp=0xc000587e78 pc=0x7ff69c00d3c5
github.com/spf13/cobra.(*Command).Execute(...)
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
	C:/a/ollama/ollama/main.go:12 +0x4d fp=0xc000587f50 sp=0xc000587f30 pc=0x7ff69caeaf4d
runtime.main()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:283 +0x27d fp=0xc000587fe0 sp=0xc000587f50 pc=0x7ff69be54f1d
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000587fe8 sp=0xc000587fe0 pc=0x7ff69be8db01

goroutine 2 gp=0xc0000028c0 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00006ffa8 sp=0xc00006ff88 pc=0x7ff69be8630e
runtime.goparkunlock(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:441
runtime.forcegchelper()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:348 +0xb8 fp=0xc00006ffe0 sp=0xc00006ffa8 pc=0x7ff69be55238
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x7ff69be8db01
created by runtime.init.7 in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:336 +0x1a

goroutine 3 gp=0xc000002c40 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000071f80 sp=0xc000071f60 pc=0x7ff69be8630e
runtime.goparkunlock(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:441
runtime.bgsweep(0xc00007e000)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgcsweep.go:316 +0xdf fp=0xc000071fc8 sp=0xc000071f80 pc=0x7ff69be3dfff
runtime.gcenable.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:204 +0x25 fp=0xc000071fe0 sp=0xc000071fc8 pc=0x7ff69be323c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000071fe8 sp=0xc000071fe0 pc=0x7ff69be8db01
created by runtime.gcenable in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000002e00 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x7ff69d22de80?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x7ff69be8630e
runtime.goparkunlock(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x7ff69dbbbc00)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x7ff69be3ba49
runtime.bgscavenge(0xc00007e000)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x7ff69be3bfd9
runtime.gcenable.gowrap2()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:205 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x7ff69be32365
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x7ff69be8db01
created by runtime.gcenable in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:205 +0xa5

goroutine 5 gp=0xc000003340 m=nil [finalizer wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000087e30 sp=0xc000087e10 pc=0x7ff69be8630e
runtime.runfinq()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mfinal.go:196 +0x107 fp=0xc000087fe0 sp=0xc000087e30 pc=0x7ff69be31347
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000087fe8 sp=0xc000087fe0 pc=0x7ff69be8db01
created by runtime.createfing in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mfinal.go:166 +0x3d

goroutine 6 gp=0xc000003dc0 m=nil [chan receive]:
runtime.gopark(0xc00016f720?, 0xc0004b8018?, 0x60?, 0x3f?, 0x7ff69bf7b0e8?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000073f18 sp=0xc000073ef8 pc=0x7ff69be8630e
runtime.chanrecv(0xc00003c460, 0x0, 0x1)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/chan.go:664 +0x445 fp=0xc000073f90 sp=0xc000073f18 pc=0x7ff69be22d85
runtime.chanrecv1(0x7ff69be55080?, 0xc000073f76?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/chan.go:506 +0x12 fp=0xc000073fb8 sp=0xc000073f90 pc=0x7ff69be22912
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1799 +0x2f fp=0xc000073fe0 sp=0xc000073fb8 pc=0x7ff69be355ef
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000073fe8 sp=0xc000073fe0 pc=0x7ff69be8db01
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1794 +0x85

goroutine 7 gp=0xc0003e8380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000081f38 sp=0xc000081f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000081fc8 sp=0xc000081f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000081fe0 sp=0xc000081fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 8 gp=0xc0003e8540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000083fc8 sp=0xc000083f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 18 gp=0xc0004861c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000491f38 sp=0xc000491f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000491fc8 sp=0xc000491f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000491fe0 sp=0xc000491fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000491fe8 sp=0xc000491fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 19 gp=0xc000486380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000493f38 sp=0xc000493f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000493fc8 sp=0xc000493f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000493fe0 sp=0xc000493fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000493fe8 sp=0xc000493fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 34 gp=0xc000206000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00048df38 sp=0xc00048df18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048dfc8 sp=0xc00048df38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00048dfe0 sp=0xc00048dfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048dfe8 sp=0xc00048dfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 35 gp=0xc0002061c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00048ff38 sp=0xc00048ff18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048ffc8 sp=0xc00048ff38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00048ffe0 sp=0xc00048ffc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048ffe8 sp=0xc00048ffe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 36 gp=0xc000206380 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00020df38 sp=0xc00020df18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00020dfc8 sp=0xc00020df38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00020dfe0 sp=0xc00020dfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00020dfe8 sp=0xc00020dfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 20 gp=0xc000486540 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000209f38 sp=0xc000209f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000209fc8 sp=0xc000209f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000209fe0 sp=0xc000209fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000209fe8 sp=0xc000209fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 9 gp=0xc0003e8700 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00047bf38 sp=0xc00047bf18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00047bfc8 sp=0xc00047bf38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00047bfe0 sp=0xc00047bfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00047bfe8 sp=0xc00047bfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 10 gp=0xc0003e88c0 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00047df38 sp=0xc00047df18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00047dfc8 sp=0xc00047df38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00047dfe0 sp=0xc00047dfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00047dfe8 sp=0xc00047dfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 11 gp=0xc0003e8a80 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000477f38 sp=0xc000477f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000477fc8 sp=0xc000477f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000477fe0 sp=0xc000477fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000477fe8 sp=0xc000477fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 21 gp=0xc000486700 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00020bf38 sp=0xc00020bf18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00020bfc8 sp=0xc00020bf38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00020bfe0 sp=0xc00020bfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00020bfe8 sp=0xc00020bfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105
rax     0x0
rbx     0xb90cff430
rcx     0x7ffb8b9af0f8
rdx     0x0
rdi     0x7ff69dc59a08
rsi     0x7ffb8b9af0f0
rbp     0xb90cfee80
rsp     0xb90cfed00
r8      0xb90cff2f8
r9      0x1
r10     0x0
r11     0x246
r12     0x143865b26e8
r13     0xb90cff411
r14     0x0
r15     0x8003
rip     0x7ffc305a2590
rflags  0x10286
cs      0x33
fs      0x53
gs      0x2b
time=2025-06-26T21:24:31.139+03:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: exit status 2"
[GIN] 2025/06/26 - 21:24:31 | 500 |    955.2704ms |       127.0.0.1 | POST     "/api/generate"
time=2025-06-26T21:24:36.187+03:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.0477552 runner.size="2.5 GiB" runner.vram="2.5 GiB" runner.parallel=2 runner.pid=13372 runner.model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45
time=2025-06-26T21:24:36.437+03:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.2977378999999996 runner.size="2.5 GiB" runner.vram="2.5 GiB" runner.parallel=2 runner.pid=13372 runner.model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45
time=2025-06-26T21:24:36.687+03:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.5478760000000005 runner.size="2.5 GiB" runner.vram="2.5 GiB" runner.parallel=2 runner.pid=13372 runner.model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45

OS

Windows

GPU

Nvidia

CPU

Intel

Ollama version

0.9.3

Originally created by @yuval-ngtnuma on GitHub (Jun 26, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/11209 Originally assigned to: @dhiltgen on GitHub. ### What is the issue? I just upgraded ollama on my Windows 10 Pro (22H2, OS build 19045.5965) PC to 0.9.3. The PC has two NVidia 2080 Ti GPUs (driver version 560.94). I am not sure what version it was before, but I previously upgraded within the past two months. Since the upgrade, I am unable to run *any* model. I get the error: ``` >ollama run llama3.2:1b Error: llama runner process has terminated: exit status 2 ``` The most relevant log message looks like this one: ``` signal arrived during external code execution runtime.cgocall(0x7ff69cb6c470, 0xc0005875a0) C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/cgocall.go:167 +0x3e fp=0xc000587578 sp=0xc000587510 pc=0x7ff69be82dbe github.com/ollama/ollama/ml/backend/ggml/ggml/src._Cfunc_ggml_backend_load_all_from_path(0x143865dd0e0) ``` ### Relevant log output ```shell [GIN] 2025/06/26 - 21:24:30 | 200 | 88.4522ms | 127.0.0.1 | POST "/api/show" time=2025-06-26T21:24:30.324+03:00 level=INFO source=sched.go:788 msg="new model will fit in available VRAM in single GPU, loading" model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45 gpu=GPU-fdf1946b-5cd3-50d4-863c-6686a64f17dd parallel=2 available=10597957632 required="2.5 GiB" time=2025-06-26T21:24:30.371+03:00 level=INFO source=server.go:135 msg="system memory" total="63.7 GiB" free="48.8 GiB" free_swap="55.6 GiB" time=2025-06-26T21:24:30.371+03:00 level=INFO source=server.go:175 msg=offload library=cuda layers.requested=-1 layers.model=17 layers.offload=17 layers.split="" memory.available="[9.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="2.5 GiB" memory.required.partial="2.5 GiB" memory.required.kv="256.0 MiB" memory.required.allocations="[2.5 GiB]" memory.weights.total="1.2 GiB" memory.weights.repeating="986.2 MiB" 
memory.weights.nonrepeating="266.2 MiB" memory.graph.full="544.0 MiB" memory.graph.partial="554.3 MiB" llama_model_loader: loaded meta data with 30 key-value pairs and 147 tensors from C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45 (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. llama_model_loader: - kv 0: general.architecture str = llama llama_model_loader: - kv 1: general.type str = model llama_model_loader: - kv 2: general.name str = Llama 3.2 1B Instruct llama_model_loader: - kv 3: general.finetune str = Instruct llama_model_loader: - kv 4: general.basename str = Llama-3.2 llama_model_loader: - kv 5: general.size_label str = 1B llama_model_loader: - kv 6: general.tags arr[str,6] = ["facebook", "meta", "pytorch", "llam... llama_model_loader: - kv 7: general.languages arr[str,8] = ["en", "de", "fr", "it", "pt", "hi", ... llama_model_loader: - kv 8: llama.block_count u32 = 16 llama_model_loader: - kv 9: llama.context_length u32 = 131072 llama_model_loader: - kv 10: llama.embedding_length u32 = 2048 llama_model_loader: - kv 11: llama.feed_forward_length u32 = 8192 llama_model_loader: - kv 12: llama.attention.head_count u32 = 32 llama_model_loader: - kv 13: llama.attention.head_count_kv u32 = 8 llama_model_loader: - kv 14: llama.rope.freq_base f32 = 500000.000000 llama_model_loader: - kv 15: llama.attention.layer_norm_rms_epsilon f32 = 0.000010 llama_model_loader: - kv 16: llama.attention.key_length u32 = 64 llama_model_loader: - kv 17: llama.attention.value_length u32 = 64 llama_model_loader: - kv 18: general.file_type u32 = 7 llama_model_loader: - kv 19: llama.vocab_size u32 = 128256 llama_model_loader: - kv 20: llama.rope.dimension_count u32 = 64 llama_model_loader: - kv 21: tokenizer.ggml.model str = gpt2 llama_model_loader: - kv 22: tokenizer.ggml.pre str = llama-bpe llama_model_loader: - kv 23: tokenizer.ggml.tokens 
arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ... llama_model_loader: - kv 24: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... llama_model_loader: - kv 25: tokenizer.ggml.merges arr[str,280147] = ["Ä  Ä ", "Ä  Ä Ä Ä ", "Ä Ä  Ä Ä ", "... llama_model_loader: - kv 26: tokenizer.ggml.bos_token_id u32 = 128000 llama_model_loader: - kv 27: tokenizer.ggml.eos_token_id u32 = 128009 llama_model_loader: - kv 28: tokenizer.chat_template str = {{- bos_token }}\n{%- if custom_tools ... llama_model_loader: - kv 29: general.quantization_version u32 = 2 llama_model_loader: - type f32: 34 tensors llama_model_loader: - type q8_0: 113 tensors print_info: file format = GGUF V3 (latest) print_info: file type = Q8_0 print_info: file size = 1.22 GiB (8.50 BPW) load: special tokens cache size = 256 load: token to piece cache size = 0.7999 MB print_info: arch = llama print_info: vocab_only = 1 print_info: model type = ?B print_info: model params = 1.24 B print_info: general.name = Llama 3.2 1B Instruct print_info: vocab type = BPE print_info: n_vocab = 128256 print_info: n_merges = 280147 print_info: BOS token = 128000 '<|begin_of_text|>' print_info: EOS token = 128009 '<|eot_id|>' print_info: EOT token = 128009 '<|eot_id|>' print_info: EOM token = 128008 '<|eom_id|>' print_info: LF token = 198 'ÄŠ' print_info: EOG token = 128008 '<|eom_id|>' print_info: EOG token = 128009 '<|eot_id|>' print_info: max token length = 256 llama_model_load: vocab only - skipping tensors time=2025-06-26T21:24:30.634+03:00 level=INFO source=server.go:438 msg="starting llama server" cmd="C:\\Users\\user\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --model C:\\Users\\user\\.ollama\\models\\blobs\\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45 --ctx-size 8192 --batch-size 512 --n-gpu-layers 17 --threads 6 --no-mmap --parallel 2 --port 58897" time=2025-06-26T21:24:30.635+03:00 level=INFO source=sched.go:483 msg="loaded runners" count=1 
time=2025-06-26T21:24:30.637+03:00 level=INFO source=server.go:598 msg="waiting for llama runner to start responding"
time=2025-06-26T21:24:30.637+03:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server error"
time=2025-06-26T21:24:30.674+03:00 level=INFO source=runner.go:815 msg="starting go runner"
Exception 0xc0000005 0x0 0x0 0x7ffc305a2590
PC=0x7ffc305a2590
signal arrived during external code execution

runtime.cgocall(0x7ff69cb6c470, 0xc0005875a0)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/cgocall.go:167 +0x3e fp=0xc000587578 sp=0xc000587510 pc=0x7ff69be82dbe
github.com/ollama/ollama/ml/backend/ggml/ggml/src._Cfunc_ggml_backend_load_all_from_path(0x143865dd0e0)
	_cgo_gotypes.go:199 +0x45 fp=0xc0005875a0 sp=0xc000587578 pc=0x7ff69c245c05
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.func1.1({0xc0003dce00, 0x37})
	C:/a/ollama/ollama/ml/backend/ggml/ggml/src/ggml.go:97 +0xf5 fp=0xc000587638 sp=0xc0005875a0 pc=0x7ff69c245635
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.func1()
	C:/a/ollama/ollama/ml/backend/ggml/ggml/src/ggml.go:98 +0x4e5 fp=0xc0005878b0 sp=0xc000587638 pc=0x7ff69c245485
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.OnceFunc.func2()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/oncefunc.go:27 +0x62 fp=0xc0005878f8 sp=0xc0005878b0 pc=0x7ff69c244ec2
sync.(*Once).doSlow(0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/once.go:78 +0xab fp=0xc000587950 sp=0xc0005878f8 pc=0x7ff69be9a58b
sync.(*Once).Do(0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/once.go:69 +0x19 fp=0xc000587970 sp=0xc000587950 pc=0x7ff69be9a4b9
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.OnceFunc.func3()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/sync/oncefunc.go:32 +0x2d fp=0xc0005879a0 sp=0xc000587970 pc=0x7ff69c244e2d
github.com/ollama/ollama/llama.BackendInit()
	C:/a/ollama/ollama/llama/llama.go:60 +0x16 fp=0xc0005879b0 sp=0xc0005879a0 pc=0x7ff69c249c16
github.com/ollama/ollama/runner/llamarunner.Execute({0xc0000a4020, 0xf, 0x1e})
	C:/a/ollama/ollama/runner/llamarunner/runner.go:817 +0x63e fp=0xc000587d08 sp=0xc0005879b0 pc=0x7ff69c30621e
github.com/ollama/ollama/runner.Execute({0xc0000a4010?, 0x0?, 0x0?})
	C:/a/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc000587d30 sp=0xc000587d08 pc=0x7ff69c38b9d4
github.com/ollama/ollama/cmd.NewCLI.func2(0xc0000a3200?, {0x7ff69d0663a4?, 0x4?, 0x7ff69d0663a8?})
	C:/a/ollama/ollama/cmd/cmd.go:1529 +0x45 fp=0xc000587d58 sp=0xc000587d30 pc=0x7ff69caea4c5
github.com/spf13/cobra.(*Command).execute(0xc0004ccf08, {0xc0005bc690, 0xf, 0xf})
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc000587e78 sp=0xc000587d58 pc=0x7ff69c00cb7c
github.com/spf13/cobra.(*Command).ExecuteC(0xc00061c908)
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000587f30 sp=0xc000587e78 pc=0x7ff69c00d3c5
github.com/spf13/cobra.(*Command).Execute(...)
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	C:/Users/runneradmin/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
	C:/a/ollama/ollama/main.go:12 +0x4d fp=0xc000587f50 sp=0xc000587f30 pc=0x7ff69caeaf4d
runtime.main()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:283 +0x27d fp=0xc000587fe0 sp=0xc000587f50 pc=0x7ff69be54f1d
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000587fe8 sp=0xc000587fe0 pc=0x7ff69be8db01

goroutine 2 gp=0xc0000028c0 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00006ffa8 sp=0xc00006ff88 pc=0x7ff69be8630e
runtime.goparkunlock(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:441
runtime.forcegchelper()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:348 +0xb8 fp=0xc00006ffe0 sp=0xc00006ffa8 pc=0x7ff69be55238
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x7ff69be8db01
created by runtime.init.7 in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:336 +0x1a

goroutine 3 gp=0xc000002c40 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000071f80 sp=0xc000071f60 pc=0x7ff69be8630e
runtime.goparkunlock(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:441
runtime.bgsweep(0xc00007e000)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgcsweep.go:316 +0xdf fp=0xc000071fc8 sp=0xc000071f80 pc=0x7ff69be3dfff
runtime.gcenable.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:204 +0x25 fp=0xc000071fe0 sp=0xc000071fc8 pc=0x7ff69be323c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000071fe8 sp=0xc000071fe0 pc=0x7ff69be8db01
created by runtime.gcenable in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000002e00 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x7ff69d22de80?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x7ff69be8630e
runtime.goparkunlock(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x7ff69dbbbc00)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x7ff69be3ba49
runtime.bgscavenge(0xc00007e000)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x7ff69be3bfd9
runtime.gcenable.gowrap2()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:205 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x7ff69be32365
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x7ff69be8db01
created by runtime.gcenable in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:205 +0xa5

goroutine 5 gp=0xc000003340 m=nil [finalizer wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000087e30 sp=0xc000087e10 pc=0x7ff69be8630e
runtime.runfinq()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mfinal.go:196 +0x107 fp=0xc000087fe0 sp=0xc000087e30 pc=0x7ff69be31347
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000087fe8 sp=0xc000087fe0 pc=0x7ff69be8db01
created by runtime.createfing in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mfinal.go:166 +0x3d

goroutine 6 gp=0xc000003dc0 m=nil [chan receive]:
runtime.gopark(0xc00016f720?, 0xc0004b8018?, 0x60?, 0x3f?, 0x7ff69bf7b0e8?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000073f18 sp=0xc000073ef8 pc=0x7ff69be8630e
runtime.chanrecv(0xc00003c460, 0x0, 0x1)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/chan.go:664 +0x445 fp=0xc000073f90 sp=0xc000073f18 pc=0x7ff69be22d85
runtime.chanrecv1(0x7ff69be55080?, 0xc000073f76?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/chan.go:506 +0x12 fp=0xc000073fb8 sp=0xc000073f90 pc=0x7ff69be22912
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1799 +0x2f fp=0xc000073fe0 sp=0xc000073fb8 pc=0x7ff69be355ef
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000073fe8 sp=0xc000073fe0 pc=0x7ff69be8db01
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1794 +0x85

goroutine 7 gp=0xc0003e8380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000081f38 sp=0xc000081f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000081fc8 sp=0xc000081f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000081fe0 sp=0xc000081fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 8 gp=0xc0003e8540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000083fc8 sp=0xc000083f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 18 gp=0xc0004861c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000491f38 sp=0xc000491f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000491fc8 sp=0xc000491f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000491fe0 sp=0xc000491fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000491fe8 sp=0xc000491fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 19 gp=0xc000486380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000493f38 sp=0xc000493f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000493fc8 sp=0xc000493f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000493fe0 sp=0xc000493fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000493fe8 sp=0xc000493fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 34 gp=0xc000206000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00048df38 sp=0xc00048df18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048dfc8 sp=0xc00048df38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00048dfe0 sp=0xc00048dfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048dfe8 sp=0xc00048dfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 35 gp=0xc0002061c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00048ff38 sp=0xc00048ff18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00048ffc8 sp=0xc00048ff38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00048ffe0 sp=0xc00048ffc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00048ffe8 sp=0xc00048ffe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 36 gp=0xc000206380 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00020df38 sp=0xc00020df18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00020dfc8 sp=0xc00020df38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00020dfe0 sp=0xc00020dfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00020dfe8 sp=0xc00020dfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 20 gp=0xc000486540 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000209f38 sp=0xc000209f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000209fc8 sp=0xc000209f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000209fe0 sp=0xc000209fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000209fe8 sp=0xc000209fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 9 gp=0xc0003e8700 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00047bf38 sp=0xc00047bf18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00047bfc8 sp=0xc00047bf38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00047bfe0 sp=0xc00047bfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00047bfe8 sp=0xc00047bfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 10 gp=0xc0003e88c0 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00047df38 sp=0xc00047df18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00047dfc8 sp=0xc00047df38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00047dfe0 sp=0xc00047dfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00047dfe8 sp=0xc00047dfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 11 gp=0xc0003e8a80 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc000477f38 sp=0xc000477f18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc000477fc8 sp=0xc000477f38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc000477fe0 sp=0xc000477fc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000477fe8 sp=0xc000477fe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

goroutine 21 gp=0xc000486700 m=nil [GC worker (idle)]:
runtime.gopark(0x4324e2e908?, 0x0?, 0x0?, 0x0?, 0x0?)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/proc.go:435 +0xce fp=0xc00020bf38 sp=0xc00020bf18 pc=0x7ff69be8630e
runtime.gcBgMarkWorker(0xc00003da40)
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1423 +0xe9 fp=0xc00020bfc8 sp=0xc00020bf38 pc=0x7ff69be348e9
runtime.gcBgMarkStartWorkers.gowrap1()
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x25 fp=0xc00020bfe0 sp=0xc00020bfc8 pc=0x7ff69be347c5
runtime.goexit({})
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00020bfe8 sp=0xc00020bfe0 pc=0x7ff69be8db01
created by runtime.gcBgMarkStartWorkers in goroutine 1
	C:/hostedtoolcache/windows/go/1.24.0/x64/src/runtime/mgc.go:1339 +0x105

rax     0x0
rbx     0xb90cff430
rcx     0x7ffb8b9af0f8
rdx     0x0
rdi     0x7ff69dc59a08
rsi     0x7ffb8b9af0f0
rbp     0xb90cfee80
rsp     0xb90cfed00
r8      0xb90cff2f8
r9      0x1
r10     0x0
r11     0x246
r12     0x143865b26e8
r13     0xb90cff411
r14     0x0
r15     0x8003
rip     0x7ffc305a2590
rflags  0x10286
cs      0x33
fs      0x53
gs      0x2b
time=2025-06-26T21:24:31.139+03:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: exit status 2"
[GIN] 2025/06/26 - 21:24:31 | 500 |
955.2704ms |       127.0.0.1 | POST     "/api/generate"
time=2025-06-26T21:24:36.187+03:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.0477552 runner.size="2.5 GiB" runner.vram="2.5 GiB" runner.parallel=2 runner.pid=13372 runner.model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45
time=2025-06-26T21:24:36.437+03:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.2977378999999996 runner.size="2.5 GiB" runner.vram="2.5 GiB" runner.parallel=2 runner.pid=13372 runner.model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45
time=2025-06-26T21:24:36.687+03:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.5478760000000005 runner.size="2.5 GiB" runner.vram="2.5 GiB" runner.parallel=2 runner.pid=13372 runner.model=C:\Users\user\.ollama\models\blobs\sha256-74701a8c35f6c8d9a4b91f3f3497643001d63e0c7a84e085bed452548fa88d45
```

### OS

Windows

### GPU

Nvidia

### CPU

Intel

### Ollama version

0.9.3
GiteaMirror added the bug and windows labels 2026-05-04 18:06:22 -05:00

@yuval-ngtnuma commented on GitHub (Jun 26, 2025):

I can confirm that the problem is with 0.9.3: after downgrading, 0.9.2 works fine.

@dhiltgen commented on GitHub (Jun 26, 2025):

This might be a dependency problem.

To test that theory, can you install the MSVC C++ runtime package and let us know whether 0.9.3 then works correctly?

- https://aka.ms/vs/17/release/vc_redist.x64.exe

@jpdr-apps commented on GitHub (Jun 26, 2025):

I installed the C++ libraries and the problem was solved. Moreover, of my two laptops, the one that already had the libraries installed never showed this issue. Both are now running Ollama as usual.

@dhiltgen commented on GitHub (Jun 26, 2025):

Sorry about that! We'll get this fixed in the next version. Until then, if anyone hits this, manually installing the C++ runtime will resolve the crash.

https://aka.ms/vs/17/release/vc_redist.x64.exe

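Before downloading the installer, a quick way to confirm the missing-dependency theory is to check whether the MSVC runtime DLLs are already in System32. A minimal sketch; the DLL list is an assumption about what the ggml backend links against, not taken from ollama's source:

```python
import os

# DLLs shipped by the MSVC 14.x (vc_redist) runtime. Assumed set -- adjust it
# if your crash or a dependency walker names other libraries.
REQUIRED_DLLS = {"vcruntime140.dll", "vcruntime140_1.dll", "msvcp140.dll"}

def missing_runtime_dlls(dll_names):
    """Return the required runtime DLLs absent from an iterable of file names
    (comparison is case-insensitive, as on Windows)."""
    present = {name.lower() for name in dll_names}
    return sorted(REQUIRED_DLLS - present)

if __name__ == "__main__":
    # Only meaningful on Windows; guarded so the script is a no-op elsewhere.
    system32 = os.path.join(os.environ.get("SystemRoot", r"C:\Windows"), "System32")
    names = os.listdir(system32) if os.path.isdir(system32) else []
    print("missing:", missing_runtime_dlls(names))
```

If anything is reported missing, installing the redistributable from the link above should resolve the crash.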

@mayflyfy commented on GitHub (Jun 28, 2025):

I have the same problem.

Windows 10, NVIDIA GPU, Intel CPU.

@bit-yuyuyu commented on GitHub (Jun 30, 2025):

I have the same problem. Ubuntu 20.04, 4× NVIDIA A6000, Intel CPU.
The log:
Jun 30 14:42:59 serer-A6000 ollama[3752607]: print_info: max token length = 256
Jun 30 14:42:59 serer-A6000 ollama[3752607]: load_tensors: loading model tensors, this can take a while... (mmap = true)
Jun 30 14:43:00 serer-A6000 ollama[3752607]: alloc_tensor_range: failed to initialize tensor output.weight
Jun 30 14:43:00 serer-A6000 ollama[3752607]: llama_model_load: error loading model: unable to allocate CUDA0 buffer
Jun 30 14:43:00 serer-A6000 ollama[3752607]: llama_model_load_from_file_impl: failed to load model
Jun 30 14:43:00 serer-A6000 ollama[3752607]: panic: unable to load model: /usr/share/ollama/.ollama/models/blobs/sha256-a8cc1361f3145dc01f6d77c6c82c9116b9ffe3c97b34716fe20418455876c40e
Jun 30 14:43:00 serer-A6000 ollama[3752607]: goroutine 257 [running]:
Jun 30 14:43:00 serer-A6000 ollama[3752607]: github.com/ollama/ollama/runner/llamarunner.(*Server).loadModel(0xc0007ec360, {0x29, 0x0, 0x1, {0x0, 0x0, 0x0}, 0xc00078d690, 0x0}, {0x7ffe897f>
Jun 30 14:43:00 serer-A6000 ollama[3752607]: github.com/ollama/ollama/runner/llamarunner/runner.go:751 +0x395
Jun 30 14:43:00 serer-A6000 ollama[3752607]: created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
Jun 30 14:43:00 serer-A6000 ollama[3752607]: github.com/ollama/ollama/runner/llamarunner/runner.go:848 +0xb57
Jun 30 14:43:00 serer-A6000 ollama[3752607]: time=2025-06-30T14:43:00.569+08:00 level=ERROR source=server.go:464 msg="llama runner terminated" error="exit status 2"
Jun 30 14:43:00 serer-A6000 ollama[3752607]: time=2025-06-30T14:43:00.597+08:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated:>
Jun 30 14:43:00 serer-A6000 ollama[3752607]: [GIN] 2025/06/30 - 14:43:00 | 500 | 3.184085189s | 127.0.0.1 | POST "/api/generate"
Jun 30 14:43:05 serer-A6000 ollama[3752607]: time=2025-06-30T14:43:05.930+08:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.332562432 runner>
Jun 30 14:43:06 serer-A6000 ollama[3752607]: time=2025-06-30T14:43:06.270+08:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=5.67258538 runner.>
Jun 30 14:43:06 serer-A6000 ollama[3752607]: time=2025-06-30T14:43:06.609+08:00 level=WARN source=sched.go:687 msg="gpu VRAM usage didn't recover within timeout" seconds=6.011434446 runner>
Jun 30 14:44:19 serer-A6000 systemd[1]: Stopping Ollama Service...
Jun 30 14:44:19 serer-A6000 systemd[1]: ollama.service: Succeeded.
Jun 30 14:44:19 serer-A6000 systemd[1]: Stopped Ollama Service.
Jun 30 14:45:31 serer-A6000 systemd[1]: Started Ollama Service.
Jun 30 14:45:31 serer-A6000 ollama[3804822]: time=2025-06-30T14:45:31.989+08:00 level=INFO source=routes.go:1235 msg="server config" env="map[CUDA_VISIBLE_DEVICES:1,2,3 GPU_DEVICE_ORDINAL:>
Jun 30 14:45:31 serer-A6000 ollama[3804822]: time=2025-06-30T14:45:31.990+08:00 level=INFO source=images.go:476 msg="total blobs: 5"
Jun 30 14:45:31 serer-A6000 ollama[3804822]: time=2025-06-30T14:45:31.991+08:00 level=INFO source=images.go:483 msg="total unused blobs removed: 0"
Jun 30 14:45:31 serer-A6000 ollama[3804822]: time=2025-06-30T14:45:31.991+08:00 level=INFO source=routes.go:1288 msg="Listening on [::]:11434 (version 0.9.3)"
Jun 30 14:45:31 serer-A6000 ollama[3804822]: time=2025-06-30T14:45:31.991+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
Jun 30 14:45:32 serer-A6000 ollama[3804822]: time=2025-06-30T14:45:32.467+08:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-e540b747-7f9d-2783-4c15-368667c9c829 library=c>

(base) serer@serer-A6000:~$ ollama run qwen3:14b
Error: llama runner process has terminated: error loading model: unable to allocate CUDA0 buffer


@wiskewu commented on GitHub (Jul 1, 2025):

> Sorry about that! We'll get this fixed in the next version. Until then, if anyone hits this, manually installing the C++ runtime will resolve the crash.
>
> https://aka.ms/vs/17/release/vc_redist.x64.exe

It works; this solved my problem.

@SDwarfs commented on GitHub (Jul 1, 2025):

> Sorry about that! We'll get this fixed in the next version. Until then, if anyone hits this, manually installing the C++ runtime will resolve the crash.
>
> https://aka.ms/vs/17/release/vc_redist.x64.exe

This also fixed the issue for me on Windows 10 Pro with an RTX 3060. Thank you.

@dhiltgen commented on GitHub (Jul 1, 2025):

Fixed in 0.9.4: we now include the redistributable in the Windows installer, and conditionally install it if it is not already detected on the system.
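The conditional install described above can be sketched roughly as follows. This is an illustration, not ollama's actual installer logic: it assumes Microsoft's documented detection registry key `HKLM\SOFTWARE\Microsoft\VisualStudio\14.0\VC\Runtimes\x64` (with values such as `Installed` and `Bld`), and `min_build` is a hypothetical minimum-version threshold:

```python
# Illustrative sketch of a "conditionally install vc_redist" decision, based on
# the documented detection key HKLM\SOFTWARE\Microsoft\VisualStudio\14.0\VC\Runtimes\x64.
# The min_build threshold is an arbitrary example, not a value from ollama.
def redist_install_needed(reg_values, min_build=30000):
    """reg_values: dict of that key's values, e.g. {"Installed": 1, "Bld": 32532};
    None or an empty dict means the key was absent.
    Returns True when vc_redist.x64.exe should be run."""
    if not reg_values:
        return True  # key absent: runtime was never installed
    if reg_values.get("Installed") != 1:
        return True  # key exists but the runtime is not marked installed
    return reg_values.get("Bld", 0) < min_build  # installed, but too old
```

An installer would read the key (e.g. via `winreg` on Windows) and run the redistributable with its standard silent switches only when this returns True.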

@sunhy0316 commented on GitHub (Aug 25, 2025):

Is vc_redist unnecessary in the newest version with statically linked VC libraries? I canceled the vc_redist installation (no admin rights) during 0.11.6 setup, but the application works fine regardless.


@dhiltgen commented on GitHub (Aug 25, 2025):

We've reduced the dependency, but the ROCm libraries still depend on vc_redist.

Reference: github-starred/ollama#69441