[GH-ISSUE #11686] Issue with gpt-oss:20b on MacBook M1 Pro #7732

Closed
opened 2026-04-12 19:51:29 -05:00 by GiteaMirror · 1 comment

Originally created by @ben73 on GitHub (Aug 5, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11686

What is the issue?

> ollama run gpt-oss:20b


>>> hi
Error: model runner has unexpectedly stopped, this may be due to resource limitations or an internal error, check ollama server logs for details

Relevant log output

time=2025-08-05T23:55:48.303+04:00 level=INFO source=sched.go:786 msg="new model will fit in available VRAM in single GPU, loading" model=/Users/ben/.ollama/models/blobs/sha256-b112e727c6f18875636c56a779790a590d705aec9e1c0eb5a97d51fc2a778583 gpu=0 parallel=1 available=22906503168 required="13.8 GiB"
time=2025-08-05T23:55:48.303+04:00 level=INFO source=server.go:135 msg="system memory" total="32.0 GiB" free="18.3 GiB" free_swap="0 B"
time=2025-08-05T23:55:48.303+04:00 level=INFO source=server.go:175 msg=offload library=metal layers.requested=-1 layers.model=25 layers.offload=25 layers.split="" memory.available="[21.3 GiB]" memory.gpu_overhead="0 B" memory.required.full="13.8 GiB" memory.required.partial="13.8 GiB" memory.required.kv="150.0 MiB" memory.required.allocations="[13.8 GiB]" memory.weights.total="11.7 GiB" memory.weights.repeating="10.7 GiB" memory.weights.nonrepeating="1.1 GiB" memory.graph.full="1.0 GiB" memory.graph.partial="1.0 GiB"
time=2025-08-05T23:55:48.303+04:00 level=INFO source=server.go:218 msg="enabling flash attention"
time=2025-08-05T23:55:48.338+04:00 level=INFO source=server.go:438 msg="starting llama server" cmd="/opt/homebrew/Cellar/ollama/HEAD-fa7776f/bin/ollama runner --ollama-engine --model /Users/ben/.ollama/models/blobs/sha256-b112e727c6f18875636c56a779790a590d705aec9e1c0eb5a97d51fc2a778583 --ctx-size 8192 --batch-size 512 --n-gpu-layers 25 --threads 6 --flash-attn --kv-cache-type q8_0 --parallel 1 --port 60940"
time=2025-08-05T23:55:48.340+04:00 level=INFO source=sched.go:481 msg="loaded runners" count=1
time=2025-08-05T23:55:48.340+04:00 level=INFO source=server.go:598 msg="waiting for llama runner to start responding"
time=2025-08-05T23:55:48.340+04:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server not responding"
time=2025-08-05T23:55:48.353+04:00 level=INFO source=runner.go:925 msg="starting ollama engine"
time=2025-08-05T23:55:48.353+04:00 level=INFO source=runner.go:983 msg="Server listening on 127.0.0.1:60940"
time=2025-08-05T23:55:48.387+04:00 level=INFO source=ggml.go:92 msg="" architecture=gptoss file_type=MXFP4 name="" description="" num_tensors=315 num_key_values=30
time=2025-08-05T23:55:48.387+04:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 Metal.0.BF16=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 compiler=cgo(clang)
time=2025-08-05T23:55:48.592+04:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server loading model"
time=2025-08-05T23:55:48.884+04:00 level=INFO source=ggml.go:367 msg="offloading 24 repeating layers to GPU"
time=2025-08-05T23:55:48.884+04:00 level=INFO source=ggml.go:373 msg="offloading output layer to GPU"
time=2025-08-05T23:55:48.884+04:00 level=INFO source=ggml.go:378 msg="offloaded 25/25 layers to GPU"
time=2025-08-05T23:55:48.884+04:00 level=INFO source=ggml.go:381 msg="model weights" buffer=Metal size="11.7 GiB"
time=2025-08-05T23:55:48.884+04:00 level=INFO source=ggml.go:381 msg="model weights" buffer=CPU size="1.1 GiB"
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M1 Pro
ggml_metal_load_library: using embedded metal library
ggml_metal_init: GPU name:   Apple M1 Pro
ggml_metal_init: GPU family: MTLGPUFamilyApple7  (1007)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction   = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets    = true
ggml_metal_init: has bfloat            = true
ggml_metal_init: use bfloat            = true
ggml_metal_init: hasUnifiedMemory      = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 22906.50 MB
time=2025-08-05T23:55:49.432+04:00 level=INFO source=ggml.go:672 msg="compute graph" backend=Metal buffer_type=Metal size="2.1 GiB"
time=2025-08-05T23:55:49.432+04:00 level=INFO source=ggml.go:672 msg="compute graph" backend=BLAS buffer_type=CPU size="5.6 MiB"
time=2025-08-05T23:55:49.432+04:00 level=INFO source=ggml.go:672 msg="compute graph" backend=CPU buffer_type=CPU size="0 B"
time=2025-08-05T23:55:54.113+04:00 level=INFO source=server.go:637 msg="llama runner started in 5.77 seconds"
[GIN] 2025/08/05 - 23:55:54 | 200 |  5.916203708s |       127.0.0.1 | POST     "/api/generate"
ggml_metal_get_buffer: error: tensor 'leaf_7 (view)' buffer is nil
ggml_metal_get_buffer: error: tensor 'leaf_7 (view) (copy of  (view) (permuted))' buffer is nil
ggml-metal.m:4848: GGML_ASSERT(ne0 % ggml_blck_size(dst->type) == 0) failed
SIGABRT: abort
PC=0x1997dd388 m=7 sigcode=0
signal arrived during cgo execution

goroutine 11 gp=0x14000582700 m=7 mp=0x14000580008 [syscall]:
runtime.cgocall(0x1037c5d68, 0x14000593a38)
	runtime/cgocall.go:167 +0x44 fp=0x140005939f0 sp=0x140005939b0 pc=0x102cb7914
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_sched_graph_compute_async(0x1531f3c00, 0x1508e5890)
	_cgo_gotypes.go:876 +0x34 fp=0x14000593a30 sp=0x140005939f0 pc=0x103084334
github.com/ollama/ollama/ml/backend/ggml.(*Context).Compute.func1(...)
	github.com/ollama/ollama/ml/backend/ggml/ggml.go:631
github.com/ollama/ollama/ml/backend/ggml.(*Context).Compute(0x14000402000, {0x140004b2410, 0x1, 0x0?})
	github.com/ollama/ollama/ml/backend/ggml/ggml.go:631 +0x90 fp=0x14000593ae0 sp=0x14000593a30 pc=0x10308c400
github.com/ollama/ollama/model.Forward({0x103e1e0d0, 0x14000402000}, {0x103e14870, 0x140000d76b0}, {0x140019fe400, 0x44, 0x80}, {{0x103e28e48, 0x1400199e7b0}, {0x0, ...}, ...})
	github.com/ollama/ollama/model/model.go:305 +0x1f4 fp=0x14000593bd0 sp=0x14000593ae0 pc=0x103097fa4
github.com/ollama/ollama/runner/ollamarunner.(*Server).processBatch(0x14000145440)
	github.com/ollama/ollama/runner/ollamarunner/runner.go:480 +0x3f0 fp=0x14000593f80 sp=0x14000593bd0 pc=0x103114770
github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0x14000145440, {0x103e15d50, 0x14000136c80})
	github.com/ollama/ollama/runner/ollamarunner/runner.go:362 +0x54 fp=0x14000593fa0 sp=0x14000593f80 pc=0x103114344
github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap2()
	github.com/ollama/ollama/runner/ollamarunner/runner.go:960 +0x30 fp=0x14000593fd0 sp=0x14000593fa0 pc=0x103118af0
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x14000593fd0 sp=0x14000593fd0 pc=0x102cc31f4
created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1
	github.com/ollama/ollama/runner/ollamarunner/runner.go:960 +0x898

goroutine 1 gp=0x140000021c0 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x140005955e0 sp=0x140005955c0 pc=0x102cbae38
runtime.netpollblock(0x14000595678?, 0x2d3f710?, 0x1?)
	runtime/netpoll.go:575 +0x158 fp=0x14000595620 sp=0x140005955e0 pc=0x102c80a28
internal/poll.runtime_pollWait(0x14b53c440, 0x72)
	runtime/netpoll.go:351 +0xa0 fp=0x14000595650 sp=0x14000595620 pc=0x102cb9ff0
internal/poll.(*pollDesc).wait(0x1400012b780?, 0x102d41978?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x14000595680 sp=0x14000595650 pc=0x102d3af28
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0x1400012b780)
	internal/poll/fd_unix.go:620 +0x24c fp=0x14000595730 sp=0x14000595680 pc=0x102d3f7fc
net.(*netFD).accept(0x1400012b780)
	net/fd_unix.go:172 +0x28 fp=0x140005957f0 sp=0x14000595730 pc=0x102daeb38
net.(*TCPListener).accept(0x140001337c0)
	net/tcpsock_posix.go:159 +0x24 fp=0x14000595840 sp=0x140005957f0 pc=0x102dc2d94
net.(*TCPListener).Accept(0x140001337c0)
	net/tcpsock.go:380 +0x2c fp=0x14000595880 sp=0x14000595840 pc=0x102dc1d7c
net/http.(*onceCloseListener).Accept(0x140000ea480?)
	<autogenerated>:1 +0x30 fp=0x140005958a0 sp=0x14000595880 pc=0x102f9d860
net/http.(*Server).Serve(0x140001f1500, {0x103e138a8, 0x140001337c0})
	net/http/server.go:3424 +0x290 fp=0x140005959d0 sp=0x140005958a0 pc=0x102f76f00
github.com/ollama/ollama/runner/ollamarunner.Execute({0x14000032170, 0x11, 0x11})
	github.com/ollama/ollama/runner/ollamarunner/runner.go:984 +0xb78 fp=0x14000595ce0 sp=0x140005959d0 pc=0x103118888
github.com/ollama/ollama/runner.Execute({0x14000032150?, 0x0?, 0x0?})
	github.com/ollama/ollama/runner/runner.go:20 +0x120 fp=0x14000595d10 sp=0x14000595ce0 pc=0x103119130
github.com/ollama/ollama/cmd.NewCLI.func2(0x140001f1200?, {0x103992511?, 0x4?, 0x103992515?})
	github.com/ollama/ollama/cmd/cmd.go:1583 +0x54 fp=0x14000595d40 sp=0x14000595d10 pc=0x103777344
github.com/spf13/cobra.(*Command).execute(0x1400046b208, {0x14000145200, 0x12, 0x12})
	github.com/spf13/cobra@v1.7.0/command.go:940 +0x680 fp=0x14000595e60 sp=0x14000595d40 pc=0x102e1d330
github.com/spf13/cobra.(*Command).ExecuteC(0x14000140308)
	github.com/spf13/cobra@v1.7.0/command.go:1068 +0x320 fp=0x14000595f20 sp=0x14000595e60 pc=0x102e1da80
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
	github.com/ollama/ollama/main.go:12 +0x54 fp=0x14000595f40 sp=0x14000595f20 pc=0x103777e94
runtime.main()
	runtime/proc.go:283 +0x284 fp=0x14000595fd0 sp=0x14000595f40 pc=0x102c875d4
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x14000595fd0 sp=0x14000595fd0 pc=0x102cc31f4

goroutine 2 gp=0x14000002c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x14000070f90 sp=0x14000070f70 pc=0x102cbae38
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.forcegchelper()
	runtime/proc.go:348 +0xb8 fp=0x14000070fd0 sp=0x14000070f90 pc=0x102c87928
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x14000070fd0 sp=0x14000070fd0 pc=0x102cc31f4
created by runtime.init.7 in goroutine 1
	runtime/proc.go:336 +0x24

goroutine 3 gp=0x14000003180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x14000071760 sp=0x14000071740 pc=0x102cbae38
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.bgsweep(0x1400009c000)
	runtime/mgcsweep.go:316 +0x108 fp=0x140000717b0 sp=0x14000071760 pc=0x102c72998
runtime.gcenable.gowrap1()
	runtime/mgc.go:204 +0x28 fp=0x140000717d0 sp=0x140000717b0 pc=0x102c66798
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x140000717d0 sp=0x140000717d0 pc=0x102cc31f4
created by runtime.gcenable in goroutine 1
	runtime/mgc.go:204 +0x6c

goroutine 4 gp=0x14000003340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x103b48cf8?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x14000071f60 sp=0x14000071f40 pc=0x102cbae38
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.(*scavengerState).park(0x1046c1f80)
	runtime/mgcscavenge.go:425 +0x5c fp=0x14000071f90 sp=0x14000071f60 pc=0x102c7042c
runtime.bgscavenge(0x1400009c000)
	runtime/mgcscavenge.go:658 +0xac fp=0x14000071fb0 sp=0x14000071f90 pc=0x102c709cc
runtime.gcenable.gowrap2()
	runtime/mgc.go:205 +0x28 fp=0x14000071fd0 sp=0x14000071fb0 pc=0x102c66738
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x14000071fd0 sp=0x14000071fd0 pc=0x102cc31f4
created by runtime.gcenable in goroutine 1
	runtime/mgc.go:205 +0xac

goroutine 5 gp=0x14000003c00 m=nil [finalizer wait]:
runtime.gopark(0x18000705c8?, 0x1000000000000?, 0xf8?, 0x5?, 0x102fa01cc?)
	runtime/proc.go:435 +0xc8 fp=0x14000070590 sp=0x14000070570 pc=0x102cbae38
runtime.runfinq()
	runtime/mfinal.go:196 +0x108 fp=0x140000707d0 sp=0x14000070590 pc=0x102c65798
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x140000707d0 sp=0x140000707d0 pc=0x102cc31f4
created by runtime.createfing in goroutine 1
	runtime/mfinal.go:166 +0x80

goroutine 6 gp=0x140001dc700 m=nil [chan receive]:
runtime.gopark(0x1400022b400?, 0x14001998150?, 0x48?, 0x27?, 0x102d82d08?)
	runtime/proc.go:435 +0xc8 fp=0x140000726f0 sp=0x140000726d0 pc=0x102cbae38
runtime.chanrecv(0x14000040380, 0x0, 0x1)
	runtime/chan.go:664 +0x42c fp=0x14000072770 sp=0x140000726f0 pc=0x102c57a6c
runtime.chanrecv1(0x0?, 0x0?)
	runtime/chan.go:506 +0x14 fp=0x140000727a0 sp=0x14000072770 pc=0x102c57604
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	runtime/mgc.go:1797
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	runtime/mgc.go:1800 +0x3c fp=0x140000727d0 sp=0x140000727a0 pc=0x102c699bc
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x140000727d0 sp=0x140000727d0 pc=0x102cc31f4
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	runtime/mgc.go:1795 +0x78

goroutine 7 gp=0x140001dcfc0 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a94ffa2ad5?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x14000072f10 sp=0x14000072ef0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x14000072fb0 sp=0x14000072f10 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x14000072fd0 sp=0x14000072fb0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x14000072fd0 sp=0x14000072fd0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 18 gp=0x14000504000 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a94fe099bf?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x1400006c710 sp=0x1400006c6f0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x1400006c7b0 sp=0x1400006c710 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x1400006c7d0 sp=0x1400006c7b0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x1400006c7d0 sp=0x1400006c7d0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 34 gp=0x14000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a94fe0bedb?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x1400011a710 sp=0x1400011a6f0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x1400011a7b0 sp=0x1400011a710 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x1400011a7d0 sp=0x1400011a7b0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x1400011a7d0 sp=0x1400011a7d0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 8 gp=0x140001dd180 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a95010e67e?, 0x3?, 0x13?, 0x54?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x14000073710 sp=0x140000736f0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x140000737b0 sp=0x14000073710 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x140000737d0 sp=0x140000737b0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x140000737d0 sp=0x140000737d0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 9 gp=0x140001dd340 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a94fe0a0e9?, 0x1?, 0x23?, 0xe8?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x14000073f10 sp=0x14000073ef0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x14000073fb0 sp=0x14000073f10 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x14000073fd0 sp=0x14000073fb0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x14000073fd0 sp=0x14000073fd0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 19 gp=0x140005041c0 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a94fe2d26f?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x1400006cf10 sp=0x1400006cef0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x1400006cfb0 sp=0x1400006cf10 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x1400006cfd0 sp=0x1400006cfb0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x1400006cfd0 sp=0x1400006cfd0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 20 gp=0x14000504380 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a9501a7a27?, 0x3?, 0x1a?, 0xc1?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x1400006d710 sp=0x1400006d6f0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x1400006d7b0 sp=0x1400006d710 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x1400006d7d0 sp=0x1400006d7b0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x1400006d7d0 sp=0x1400006d7d0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 35 gp=0x14000102540 m=nil [GC worker (idle)]:
runtime.gopark(0x13a6a94fe0a5cb?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xc8 fp=0x1400011af10 sp=0x1400011aef0 pc=0x102cbae38
runtime.gcBgMarkWorker(0x14000041960)
	runtime/mgc.go:1423 +0xdc fp=0x1400011afb0 sp=0x1400011af10 pc=0x102c68c2c
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x28 fp=0x1400011afd0 sp=0x1400011afb0 pc=0x102c68b18
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x1400011afd0 sp=0x1400011afd0 pc=0x102cc31f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x140

goroutine 12 gp=0x14000102a80 m=nil [select]:
runtime.gopark(0x14000047a50?, 0x2?, 0x78?, 0x77?, 0x1400004781c?)
	runtime/proc.go:435 +0xc8 fp=0x14000047650 sp=0x14000047630 pc=0x102cbae38
runtime.selectgo(0x14000047a50, 0x14000047818, 0x44?, 0x0, 0x0?, 0x1)
	runtime/select.go:351 +0x6c4 fp=0x14000047780 sp=0x14000047650 pc=0x102c9ac14
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0x14000145440, {0x103e13a88, 0x14001b2e380}, 0x14000362780)
	github.com/ollama/ollama/runner/ollamarunner/runner.go:680 +0x9d8 fp=0x14000047aa0 sp=0x14000047780 pc=0x103116458
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x103e13a88?, 0x14001b2e380?}, 0x14000597b28?)
	<autogenerated>:1 +0x40 fp=0x14000047ad0 sp=0x14000047aa0 pc=0x103118f30
net/http.HandlerFunc.ServeHTTP(0x1400015c3c0?, {0x103e13a88?, 0x14001b2e380?}, 0x14000597b10?)
	net/http/server.go:2294 +0x38 fp=0x14000047b00 sp=0x14000047ad0 pc=0x102f73928
net/http.(*ServeMux).ServeHTTP(0x10?, {0x103e13a88, 0x14001b2e380}, 0x14000362780)
	net/http/server.go:2822 +0x1b4 fp=0x14000047b50 sp=0x14000047b00 pc=0x102f754b4
net/http.serverHandler.ServeHTTP({0x103e100b0?}, {0x103e13a88?, 0x14001b2e380?}, 0x1?)
	net/http/server.go:3301 +0xbc fp=0x14000047b80 sp=0x14000047b50 pc=0x102f9123c
net/http.(*conn).serve(0x140000ea480, {0x103e15d18, 0x14000514de0})
	net/http/server.go:2102 +0x52c fp=0x14000047fa0 sp=0x14000047b80 pc=0x102f720cc
net/http.(*Server).Serve.gowrap3()
	net/http/server.go:3454 +0x30 fp=0x14000047fd0 sp=0x14000047fa0 pc=0x102f77290
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x14000047fd0 sp=0x14000047fd0 pc=0x102cc31f4
created by net/http.(*Server).Serve in goroutine 1
	net/http/server.go:3454 +0x3d8

goroutine 416 gp=0x14001ae6700 m=nil [IO wait]:
runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102cdecd0?)
	runtime/proc.go:435 +0xc8 fp=0x1400042e580 sp=0x1400042e560 pc=0x102cbae38
runtime.netpollblock(0x0?, 0x0?, 0x0?)
	runtime/netpoll.go:575 +0x158 fp=0x1400042e5c0 sp=0x1400042e580 pc=0x102c80a28
internal/poll.runtime_pollWait(0x14b53c328, 0x72)
	runtime/netpoll.go:351 +0xa0 fp=0x1400042e5f0 sp=0x1400042e5c0 pc=0x102cb9ff0
internal/poll.(*pollDesc).wait(0x1400012a180?, 0x1400047e7f1?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x1400042e620 sp=0x1400042e5f0 pc=0x102d3af28
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x1400012a180, {0x1400047e7f1, 0x1, 0x1})
	internal/poll/fd_unix.go:165 +0x1fc fp=0x1400042e6c0 sp=0x1400042e620 pc=0x102d3c1dc
net.(*netFD).Read(0x1400012a180, {0x1400047e7f1?, 0x1400042e758?, 0x102f6cb44?})
	net/fd_posix.go:55 +0x28 fp=0x1400042e710 sp=0x1400042e6c0 pc=0x102dad108
net.(*conn).Read(0x14000074430, {0x1400047e7f1?, 0x100000001?, 0x100000100000001?})
	net/net.go:194 +0x34 fp=0x1400042e760 sp=0x1400042e710 pc=0x102db9fd4
net/http.(*connReader).backgroundRead(0x1400047e7e0)
	net/http/server.go:690 +0x40 fp=0x1400042e7b0 sp=0x1400042e760 pc=0x102f6ca40
net/http.(*connReader).startBackgroundRead.gowrap2()
	net/http/server.go:686 +0x28 fp=0x1400042e7d0 sp=0x1400042e7b0 pc=0x102f6c928
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x1400042e7d0 sp=0x1400042e7d0 pc=0x102cc31f4
created by net/http.(*connReader).startBackgroundRead in goroutine 12
	net/http/server.go:686 +0xc4

r0      0x0
r1      0x0
r2      0x0
r3      0x0
r4      0x103b63e2f
r5      0x1701ee960
r6      0x64656c6961662029
r7      0x14000074040
r8      0xa94ee6108823ab88
r9      0xa94ee611f83d5b88
r10     0x2
r11     0x10000000000
r12     0xfffffffd
r13     0x0
r14     0x0
r15     0x0
r16     0x148
r17     0x208975fa8
r18     0x0
r19     0x6
r20     0x1503
r21     0x1701ef0e0
r22     0x205317e68
r23     0x1509264c0
r24     0x1445000
r25     0x151f60d90
r26     0x40
r27     0x8
r28     0x150926350
r29     0x1701ee8c0
lr      0x19981688c
sp      0x1701ee8a0
pc      0x1997dd388
fault   0x1997dd388
time=2025-08-05T23:55:57.496+04:00 level=ERROR source=server.go:807 msg="post predict" error="Post \"http://127.0.0.1:60940/completion\": EOF"
[GIN] 2025/08/05 - 23:55:57 | 200 |  568.232458ms |       127.0.0.1 | POST     "/api/chat"
time=2025-08-05T23:55:57.496+04:00 level=ERROR source=server.go:464 msg="llama runner terminated" error="exit status 2"
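
For readers triaging the log above: the failing assertion `GGML_ASSERT(ne0 % ggml_blck_size(dst->type) == 0)` checks that the destination tensor's first dimension is a whole multiple of the block size of its type. The sketch below restates that check in Python. It is not ggml code. The block sizes (32 for `q8_0`, 1 for `f16`) match ggml's definitions as far as I know, while the `ne0` values and the helper name `row_fits_blocks` are hypothetical and are not taken from this log.

```python
# Illustrative restatement of the divisibility check, not ggml code.
# Block sizes: q8_0 packs 32 values per block; f16 is unblocked (1).
BLOCK_SIZES = {"f16": 1, "q8_0": 32}


def row_fits_blocks(ne0: int, dst_type: str) -> bool:
    """Return True when a row of ne0 elements splits into whole blocks."""
    return ne0 % BLOCK_SIZES[dst_type] == 0


# Hypothetical row lengths, chosen only to show both outcomes.
print(row_fits_blocks(64, "q8_0"))  # 64 is a multiple of 32
print(row_fits_blocks(48, "q8_0"))  # 48 is not, so the assert would fire
```

The runner in this log was started with `--kv-cache-type q8_0`, so a quantized destination type is plausibly involved, but the log alone does not show which tensor dimension failed the check.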

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

HEAD-fa7776f

runtime/proc.go:435 +0xc8 fp=0x14000073710 sp=0x140000736f0 pc=0x102cbae38 runtime.gcBgMarkWorker(0x14000041960) runtime/mgc.go:1423 +0xdc fp=0x140000737b0 sp=0x14000073710 pc=0x102c68c2c runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x28 fp=0x140000737d0 sp=0x140000737b0 pc=0x102c68b18 runtime.goexit({}) runtime/asm_arm64.s:1223 +0x4 fp=0x140000737d0 sp=0x140000737d0 pc=0x102cc31f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x140 goroutine 9 gp=0x140001dd340 m=nil [GC worker (idle)]: runtime.gopark(0x13a6a94fe0a0e9?, 0x1?, 0x23?, 0xe8?, 0x0?) runtime/proc.go:435 +0xc8 fp=0x14000073f10 sp=0x14000073ef0 pc=0x102cbae38 runtime.gcBgMarkWorker(0x14000041960) runtime/mgc.go:1423 +0xdc fp=0x14000073fb0 sp=0x14000073f10 pc=0x102c68c2c runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x28 fp=0x14000073fd0 sp=0x14000073fb0 pc=0x102c68b18 runtime.goexit({}) runtime/asm_arm64.s:1223 +0x4 fp=0x14000073fd0 sp=0x14000073fd0 pc=0x102cc31f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x140 goroutine 19 gp=0x140005041c0 m=nil [GC worker (idle)]: runtime.gopark(0x13a6a94fe2d26f?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xc8 fp=0x1400006cf10 sp=0x1400006cef0 pc=0x102cbae38 runtime.gcBgMarkWorker(0x14000041960) runtime/mgc.go:1423 +0xdc fp=0x1400006cfb0 sp=0x1400006cf10 pc=0x102c68c2c runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x28 fp=0x1400006cfd0 sp=0x1400006cfb0 pc=0x102c68b18 runtime.goexit({}) runtime/asm_arm64.s:1223 +0x4 fp=0x1400006cfd0 sp=0x1400006cfd0 pc=0x102cc31f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x140 goroutine 20 gp=0x14000504380 m=nil [GC worker (idle)]: runtime.gopark(0x13a6a9501a7a27?, 0x3?, 0x1a?, 0xc1?, 0x0?) 
runtime/proc.go:435 +0xc8 fp=0x1400006d710 sp=0x1400006d6f0 pc=0x102cbae38 runtime.gcBgMarkWorker(0x14000041960) runtime/mgc.go:1423 +0xdc fp=0x1400006d7b0 sp=0x1400006d710 pc=0x102c68c2c runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x28 fp=0x1400006d7d0 sp=0x1400006d7b0 pc=0x102c68b18 runtime.goexit({}) runtime/asm_arm64.s:1223 +0x4 fp=0x1400006d7d0 sp=0x1400006d7d0 pc=0x102cc31f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x140 goroutine 35 gp=0x14000102540 m=nil [GC worker (idle)]: runtime.gopark(0x13a6a94fe0a5cb?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xc8 fp=0x1400011af10 sp=0x1400011aef0 pc=0x102cbae38 runtime.gcBgMarkWorker(0x14000041960) runtime/mgc.go:1423 +0xdc fp=0x1400011afb0 sp=0x1400011af10 pc=0x102c68c2c runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x28 fp=0x1400011afd0 sp=0x1400011afb0 pc=0x102c68b18 runtime.goexit({}) runtime/asm_arm64.s:1223 +0x4 fp=0x1400011afd0 sp=0x1400011afd0 pc=0x102cc31f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x140 goroutine 12 gp=0x14000102a80 m=nil [select]: runtime.gopark(0x14000047a50?, 0x2?, 0x78?, 0x77?, 0x1400004781c?) runtime/proc.go:435 +0xc8 fp=0x14000047650 sp=0x14000047630 pc=0x102cbae38 runtime.selectgo(0x14000047a50, 0x14000047818, 0x44?, 0x0, 0x0?, 0x1) runtime/select.go:351 +0x6c4 fp=0x14000047780 sp=0x14000047650 pc=0x102c9ac14 github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0x14000145440, {0x103e13a88, 0x14001b2e380}, 0x14000362780) github.com/ollama/ollama/runner/ollamarunner/runner.go:680 +0x9d8 fp=0x14000047aa0 sp=0x14000047780 pc=0x103116458 github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x103e13a88?, 0x14001b2e380?}, 0x14000597b28?) <autogenerated>:1 +0x40 fp=0x14000047ad0 sp=0x14000047aa0 pc=0x103118f30 net/http.HandlerFunc.ServeHTTP(0x1400015c3c0?, {0x103e13a88?, 0x14001b2e380?}, 0x14000597b10?) 
net/http/server.go:2294 +0x38 fp=0x14000047b00 sp=0x14000047ad0 pc=0x102f73928 net/http.(*ServeMux).ServeHTTP(0x10?, {0x103e13a88, 0x14001b2e380}, 0x14000362780) net/http/server.go:2822 +0x1b4 fp=0x14000047b50 sp=0x14000047b00 pc=0x102f754b4 net/http.serverHandler.ServeHTTP({0x103e100b0?}, {0x103e13a88?, 0x14001b2e380?}, 0x1?) net/http/server.go:3301 +0xbc fp=0x14000047b80 sp=0x14000047b50 pc=0x102f9123c net/http.(*conn).serve(0x140000ea480, {0x103e15d18, 0x14000514de0}) net/http/server.go:2102 +0x52c fp=0x14000047fa0 sp=0x14000047b80 pc=0x102f720cc net/http.(*Server).Serve.gowrap3() net/http/server.go:3454 +0x30 fp=0x14000047fd0 sp=0x14000047fa0 pc=0x102f77290 runtime.goexit({}) runtime/asm_arm64.s:1223 +0x4 fp=0x14000047fd0 sp=0x14000047fd0 pc=0x102cc31f4 created by net/http.(*Server).Serve in goroutine 1 net/http/server.go:3454 +0x3d8 goroutine 416 gp=0x14001ae6700 m=nil [IO wait]: runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102cdecd0?) runtime/proc.go:435 +0xc8 fp=0x1400042e580 sp=0x1400042e560 pc=0x102cbae38 runtime.netpollblock(0x0?, 0x0?, 0x0?) runtime/netpoll.go:575 +0x158 fp=0x1400042e5c0 sp=0x1400042e580 pc=0x102c80a28 internal/poll.runtime_pollWait(0x14b53c328, 0x72) runtime/netpoll.go:351 +0xa0 fp=0x1400042e5f0 sp=0x1400042e5c0 pc=0x102cb9ff0 internal/poll.(*pollDesc).wait(0x1400012a180?, 0x1400047e7f1?, 0x0) internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x1400042e620 sp=0x1400042e5f0 pc=0x102d3af28 internal/poll.(*pollDesc).waitRead(...) 
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x1400012a180, {0x1400047e7f1, 0x1, 0x1})
	internal/poll/fd_unix.go:165 +0x1fc fp=0x1400042e6c0 sp=0x1400042e620 pc=0x102d3c1dc
net.(*netFD).Read(0x1400012a180, {0x1400047e7f1?, 0x1400042e758?, 0x102f6cb44?})
	net/fd_posix.go:55 +0x28 fp=0x1400042e710 sp=0x1400042e6c0 pc=0x102dad108
net.(*conn).Read(0x14000074430, {0x1400047e7f1?, 0x100000001?, 0x100000100000001?})
	net/net.go:194 +0x34 fp=0x1400042e760 sp=0x1400042e710 pc=0x102db9fd4
net/http.(*connReader).backgroundRead(0x1400047e7e0)
	net/http/server.go:690 +0x40 fp=0x1400042e7b0 sp=0x1400042e760 pc=0x102f6ca40
net/http.(*connReader).startBackgroundRead.gowrap2()
	net/http/server.go:686 +0x28 fp=0x1400042e7d0 sp=0x1400042e7b0 pc=0x102f6c928
runtime.goexit({})
	runtime/asm_arm64.s:1223 +0x4 fp=0x1400042e7d0 sp=0x1400042e7d0 pc=0x102cc31f4
created by net/http.(*connReader).startBackgroundRead in goroutine 12
	net/http/server.go:686 +0xc4

r0      0x0
r1      0x0
r2      0x0
r3      0x0
r4      0x103b63e2f
r5      0x1701ee960
r6      0x64656c6961662029
r7      0x14000074040
r8      0xa94ee6108823ab88
r9      0xa94ee611f83d5b88
r10     0x2
r11     0x10000000000
r12     0xfffffffd
r13     0x0
r14     0x0
r15     0x0
r16     0x148
r17     0x208975fa8
r18     0x0
r19     0x6
r20     0x1503
r21     0x1701ef0e0
r22     0x205317e68
r23     0x1509264c0
r24     0x1445000
r25     0x151f60d90
r26     0x40
r27     0x8
r28     0x150926350
r29     0x1701ee8c0
lr      0x19981688c
sp      0x1701ee8a0
pc      0x1997dd388
fault   0x1997dd388
time=2025-08-05T23:55:57.496+04:00 level=ERROR source=server.go:807 msg="post predict" error="Post \"http://127.0.0.1:60940/completion\": EOF"
[GIN] 2025/08/05 - 23:55:57 | 200 | 568.232458ms | 127.0.0.1 | POST "/api/chat"
time=2025-08-05T23:55:57.496+04:00 level=ERROR source=server.go:464 msg="llama runner terminated" error="exit status 2"
```

### OS

macOS

### GPU

Apple

### CPU

Apple

### Ollama version

HEAD-fa7776f

GiteaMirror added the bug label 2026-04-12 19:51:29 -05:00

@jessegross commented on GitHub (Aug 5, 2025):

Likely the same as #11671, which is fixed in `main`.
