[GH-ISSUE #11795] segfault on aarch64-darwin for 0.11.0+ (including 0.11.4) #7823

Closed
opened 2026-04-12 19:59:42 -05:00 by GiteaMirror · 6 comments

Originally created by @prusnak on GitHub (Aug 7, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11795

What is the issue?

error:

$ go test github.com/ollama/ollama/ml/backend/ggml

time=2025-08-08T00:07:10.123+02:00 level=INFO source=ggml.go:92 msg="" architecture=test file_type=unknown name="" description="" num_tensors=1 num_key_values=3
time=2025-08-08T00:07:10.123+02:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 Metal.0.BF16=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 compiler=cgo(clang)
time=2025-08-08T00:07:10.150+02:00 level=INFO source=ggml.go:365 msg="offloading 1 repeating layers to GPU"
time=2025-08-08T00:07:10.150+02:00 level=INFO source=ggml.go:369 msg="offloading output layer to CPU"
time=2025-08-08T00:07:10.150+02:00 level=INFO source=ggml.go:376 msg="offloaded 1/2 layers to GPU"
time=2025-08-08T00:07:10.150+02:00 level=INFO source=ggml.go:379 msg="model weights" buffer=Metal size="32 B"
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M4
ggml_metal_load_library: using embedded metal library
ggml_metal_init: GPU name:   Apple M4
ggml_metal_init: GPU family: MTLGPUFamilyApple9  (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction   = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets    = false
ggml_metal_init: has bfloat            = true
ggml_metal_init: use bfloat            = true
ggml_metal_init: hasUnifiedMemory      = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 22906.50 MB
-[MTLComputePipelineDescriptorInternal setComputeFunction:withType:]:800: failed assertion `computeFunction must not be nil.'
SIGABRT: abort
PC=0x19fd2e388 m=0 sigcode=0
signal arrived during cgo execution

goroutine 4 gp=0x14000003dc0 m=0 mp=0x105332c00 [syscall]:
runtime.cgocall(0x104ffe044, 0x14000033248)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/cgocall.go:167 +0x44 fp=0x14000033210 sp=0x140000331d0 pc=0x104ef8b64
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_dev_init(0x105358628, 0x0)
	_cgo_gotypes.go:755 +0x34 fp=0x14000033240 sp=0x14000033210 pc=0x104feb9b4
github.com/ollama/ollama/ml/backend/ggml.New.func25(...)
	/Users/stick/work/ollama/ollama/ml/backend/ggml/ggml.go:397
github.com/ollama/ollama/ml/backend/ggml.New({0x1400001e370, 0x46}, {0x0, 0x0, 0x1, {0x0, 0x0, 0x0}, 0x0})
	/Users/stick/work/ollama/ollama/ml/backend/ggml/ggml.go:397 +0x1f1c fp=0x14000033ce0 sp=0x14000033240 pc=0x104feee1c
github.com/ollama/ollama/ml/backend/ggml.setup({0x105196f20, 0x14000003c00})
	/Users/stick/work/ollama/ollama/ml/backend/ggml/ggml_test.go:39 +0x3fc fp=0x14000033f10 sp=0x14000033ce0 pc=0x104fe58ec
github.com/ollama/ollama/ml/backend/ggml.TestMXFP4Ops(0x14000003c00)
	/Users/stick/work/ollama/ollama/ml/backend/ggml/mxfp4_test.go:46 +0x2c fp=0x14000033f60 sp=0x14000033f10 pc=0x104fe5c4c
testing.tRunner(0x14000003c00, 0x105191a50)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:1792 +0xe4 fp=0x14000033fb0 sp=0x14000033f60 pc=0x104f6e494
testing.(*T).Run.gowrap1()
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:1851 +0x2c fp=0x14000033fd0 sp=0x14000033fb0 pc=0x104f6f31c
runtime.goexit({})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000033fd0 sp=0x14000033fd0 pc=0x104f02d84
created by testing.(*T).Run in goroutine 1
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:1851 +0x374

goroutine 1 gp=0x140000021c0 m=nil [chan receive]:
runtime.gopark(0x104f03c80?, 0x1400006f938?, 0xa?, 0x4c?, 0x1055280d0?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:435 +0xc8 fp=0x1400006f8f0 sp=0x1400006f8d0 pc=0x104efb398
runtime.chanrecv(0x14000026380, 0x1400006f9f7, 0x1)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/chan.go:664 +0x42c fp=0x1400006f970 sp=0x1400006f8f0 pc=0x104e9653c
runtime.chanrecv1(0x105331ec0?, 0x1051993c0?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/chan.go:506 +0x14 fp=0x1400006f9a0 sp=0x1400006f970 pc=0x104e96104
testing.(*T).Run(0x14000003a40, {0x1050c7917?, 0x1400006fab8?}, 0x105191a50)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:1859 +0x388 fp=0x1400006fa80 sp=0x1400006f9a0 pc=0x104f6f1f8
testing.runTests.func1(0x14000003a40)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:2279 +0x40 fp=0x1400006fac0 sp=0x1400006fa80 pc=0x104f71130
testing.tRunner(0x14000003a40, 0x1400006fbe8)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:1792 +0xe4 fp=0x1400006fb10 sp=0x1400006fac0 pc=0x104f6e494
testing.runTests(0x1400000e288, {0x1052c0920, 0x3, 0x3}, {0x105528170?, 0x10551c108?, 0x1053320c0?})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:2277 +0x3ec fp=0x1400006fc10 sp=0x1400006fb10 pc=0x104f7104c
testing.(*M).Run(0x14000108280)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/testing/testing.go:2142 +0x588 fp=0x1400006fe40 sp=0x1400006fc10 pc=0x104f6fda8
github.com/ollama/ollama/ml/backend/ggml.TestMain(0x14000108280)
	/Users/stick/work/ollama/ollama/ml/backend/ggml/ggml_test.go:18 +0x180 fp=0x1400006fea0 sp=0x1400006fe40 pc=0x104fe54c0
main.main()
	_testmain.go:51 +0x98 fp=0x1400006ff40 sp=0x1400006fea0 pc=0x104ff3948
runtime.main()
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:283 +0x284 fp=0x1400006ffd0 sp=0x1400006ff40 pc=0x104ec74d4
runtime.goexit({})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400006ffd0 sp=0x1400006ffd0 pc=0x104f02d84

goroutine 18 gp=0x14000092380 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:435 +0xc8 fp=0x14000058790 sp=0x14000058770 pc=0x104efb398
runtime.goparkunlock(...)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:441
runtime.forcegchelper()
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:348 +0xb8 fp=0x140000587d0 sp=0x14000058790 pc=0x104ec7828
runtime.goexit({})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000587d0 sp=0x140000587d0 pc=0x104f02d84
created by runtime.init.7 in goroutine 1
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:336 +0x24

goroutine 19 gp=0x14000092540 m=nil [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:435 +0xc8 fp=0x14000058f60 sp=0x14000058f40 pc=0x104efb398
runtime.goparkunlock(...)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:441
runtime.bgsweep(0x140000a2000)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgcsweep.go:276 +0xa0 fp=0x14000058fb0 sp=0x14000058f60 pc=0x104eb08b0
runtime.gcenable.gowrap1()
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgc.go:204 +0x28 fp=0x14000058fd0 sp=0x14000058fb0 pc=0x104ea4728
runtime.goexit({})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000058fd0 sp=0x14000058fd0 pc=0x104f02d84
created by runtime.gcenable in goroutine 1
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgc.go:204 +0x6c

goroutine 20 gp=0x14000092700 m=nil [GC scavenge wait]:
runtime.gopark(0x140000a2000?, 0x105109f48?, 0x1?, 0x0?, 0x14000092700?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:435 +0xc8 fp=0x14000059760 sp=0x14000059740 pc=0x104efb398
runtime.goparkunlock(...)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x105332140)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgcscavenge.go:425 +0x5c fp=0x14000059790 sp=0x14000059760 pc=0x104eae3bc
runtime.bgscavenge(0x140000a2000)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgcscavenge.go:653 +0x44 fp=0x140000597b0 sp=0x14000059790 pc=0x104eae8f4
runtime.gcenable.gowrap2()
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgc.go:205 +0x28 fp=0x140000597d0 sp=0x140000597b0 pc=0x104ea46c8
runtime.goexit({})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000597d0 sp=0x140000597d0 pc=0x104f02d84
created by runtime.gcenable in goroutine 1
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgc.go:205 +0xac

goroutine 2 gp=0x140000036c0 m=nil [finalizer wait]:
runtime.gopark(0x1400005c5b8?, 0x104efbee4?, 0x1?, 0xc5?, 0x104f1e424?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:435 +0xc8 fp=0x1400005c590 sp=0x1400005c570 pc=0x104efb398
runtime.runfinq()
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mfinal.go:196 +0x108 fp=0x1400005c7d0 sp=0x1400005c590 pc=0x104ea3728
runtime.goexit({})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400005c7d0 sp=0x1400005c7d0 pc=0x104f02d84
created by runtime.createfing in goroutine 1
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mfinal.go:166 +0x80

goroutine 3 gp=0x14000003880 m=nil [chan receive]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/proc.go:435 +0xc8 fp=0x1400005cef0 sp=0x1400005ced0 pc=0x104efb398
runtime.chanrecv(0x14000026150, 0x0, 0x1)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/chan.go:664 +0x42c fp=0x1400005cf70 sp=0x1400005cef0 pc=0x104e9653c
runtime.chanrecv1(0x0?, 0x0?)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/chan.go:506 +0x14 fp=0x1400005cfa0 sp=0x1400005cf70 pc=0x104e96104
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgc.go:1797
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgc.go:1800 +0x3c fp=0x1400005cfd0 sp=0x1400005cfa0 pc=0x104ea794c
runtime.goexit({})
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400005cfd0 sp=0x1400005cfd0 pc=0x104f02d84
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/nix/store/kw1vd98s15vj700m3gx2x2xca2z477i3-go-1.24.5/share/go/src/runtime/mgc.go:1795 +0x78

r0      0x0
r1      0x0
r2      0x0
r3      0x0
r4      0x73
r5      0x2e
r6      0x0
r7      0x0
r8      0x41a05b4d8bf90233
r9      0x41a05b4f8636a2f3
r10     0xa
r11     0x0
r12     0x38
r13     0x116015920
r14     0x10000020dd146e9
r15     0x20dd146e8
r16     0x148
r17     0x20ed51558
r18     0x0
r19     0x6
r20     0x103
r21     0x20dcfa1a0
r22     0x10d460000
r23     0xffffffffffffffff
r24     0x20b6ef000
r25     0x207866000
r26     0x1ab33c793
r27     0x125f343c0
r28     0x1051144cd
r29     0x16af74d40
lr      0x19fd6788c
sp      0x16af74d20
pc      0x19fd2e388
fault   0x19fd2e388
FAIL	github.com/ollama/ollama/ml/backend/ggml	0.211s
FAIL

OS

macOS 15.6 (24G84)

GPU

Apple M4

Ollama version

0.11.0 and 0.11.4

GiteaMirror added the bug label 2026-04-12 19:59:42 -05:00

@prusnak commented on GitHub (Aug 7, 2025):

this fixes the issue for me:

diff --git a/ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.m b/ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.m
index d8e05a21..ec692366 100644
--- a/ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.m
+++ b/ml/backend/ggml/ggml/src/ggml-metal/ggml-metal.m
@@ -1067,14 +1067,19 @@ @implementation GGMLMetalClass
         if (supported) { \
             struct ggml_metal_kernel * kernel = &ctx->kernels[e]; \
             id<MTLFunction> metal_function = [metal_library newFunctionWithName:@"kernel_"#name]; \
-            kernel->pipeline = [device newComputePipelineStateWithFunction:metal_function error:&error]; \
-            GGML_LOG_DEBUG("%s: loaded %-40s %16p | th_max = %4d | th_width = %4d\n", __func__, "kernel_"#name, (void *) kernel->pipeline, \
-                    (int) kernel->pipeline.maxTotalThreadsPerThreadgroup, \
-                    (int) kernel->pipeline.threadExecutionWidth); \
-            [metal_function release]; \
-            if (error) { \
-                GGML_LOG_ERROR("%s: error: load pipeline error: %s\n", __func__, [[error description] UTF8String]); \
-                return NULL; \
+            if (metal_function == nil) { \
+                GGML_LOG_WARN("%s: function %-40s not found in Metal library, skipping\n", __func__, "kernel_"#name); \
+                kernel->pipeline = nil; \
+            } else { \
+                kernel->pipeline = [device newComputePipelineStateWithFunction:metal_function error:&error]; \
+                GGML_LOG_DEBUG("%s: loaded %-40s %16p | th_max = %4d | th_width = %4d\n", __func__, "kernel_"#name, (void *) kernel->pipeline, \
+                        (int) kernel->pipeline.maxTotalThreadsPerThreadgroup, \
+                        (int) kernel->pipeline.threadExecutionWidth); \
+                [metal_function release]; \
+                if (error) { \
+                    GGML_LOG_ERROR("%s: error: load pipeline error: %s\n", __func__, [[error description] UTF8String]); \
+                    return NULL; \
+                } \
             } \
         } else { \
             GGML_LOG_WARN("%s: skipping %-40s (not supported)\n", __func__, "kernel_"#name); \
@@ -3020,6 +3025,11 @@ static bool ggml_metal_encode_node(
                         default: GGML_ABORT("MUL MAT-MAT not implemented");
                     }
 
+                    if (pipeline == nil) {
+                        GGML_LOG_ERROR("%s: pipeline not available for type %s, falling back to CPU\n", __func__, ggml_type_name(src0t));
+                        return false;
+                    }
+
                     ggml_metal_kargs_mul_mm args = {
                         /*.ne00 =*/ ne00,
                         /*.ne02 =*/ ne02,
@@ -3227,6 +3237,10 @@ static bool ggml_metal_encode_node(
                                 nsg = N_SG_MXFP4;
                                 nr0 = N_R0_MXFP4;
                                 pipeline = ctx->kernels[GGML_METAL_KERNEL_TYPE_MUL_MV_MXFP4_F32].pipeline;
+                                if (pipeline == nil) {
+                                    GGML_LOG_ERROR("%s: MXFP4 MUL_MV pipeline not available, falling back to CPU\n", __func__);
+                                    return false;
+                                }
                             } break;
                         default:
                             {
@@ -3416,6 +3430,11 @@ static bool ggml_metal_encode_node(
                             default: GGML_ABORT("MUL_MAT_ID not implemented");
                         }
 
+                        if (pipeline == nil) {
+                            GGML_LOG_ERROR("%s: pipeline not available for MUL_MAT_ID type %s, falling back to CPU\n", __func__, ggml_type_name(src0t));
+                            return false;
+                        }
+
                         ggml_metal_kargs_mul_mm_id args = {
                             /*.ne00  =*/ ne00,
                             /*.ne02  =*/ ne02,
@@ -3629,6 +3648,10 @@ static bool ggml_metal_encode_node(
                                 nsg = N_SG_MXFP4;
                                 nr0 = N_R0_MXFP4;
                                 pipeline = ctx->kernels[GGML_METAL_KERNEL_TYPE_MUL_MV_ID_MXFP4_F32].pipeline;
+                                if (pipeline == nil) {
+                                    GGML_LOG_ERROR("%s: MXFP4 MUL_MV_ID pipeline not available, falling back to CPU\n", __func__);
+                                    return false;
+                                }
                             } break;
                         default:
                             {

@jmorganca commented on GitHub (Aug 7, 2025):

@prusnak which model is this? seems that it's only 2 layers:

offloaded 1/2 layers

@prusnak commented on GitHub (Aug 7, 2025):

> @prusnak which model is this? seems that it's only 2 layers:

the above is output from

go test github.com/ollama/ollama/ml/backend/ggml

but i see the same failure when i run for example

ollama run gemma3:12b

but this time it shows "offloaded 49/49 layers to GPU"


@anicolao commented on GitHub (Aug 10, 2025):

@prusnak's patch fixes the original error, but when trying to run gpt-oss the server crashes with

1:50:30.736-04:00 level=INFO source=server.go:637 msg="llama runner started in 2.76 seconds"
[GIN] 2025/08/09 - 21:50:30 | 200 |   2.85683825s |       127.0.0.1 | POST     "/api/generate"
ggml_metal_encode_node: pipeline not available for type bf16, falling back to CPU
SIGSEGV: segmentation violation
PC=0x1522b6f38 m=19 sigcode=2 addr=0x50
...

@myyra commented on GitHub (Aug 10, 2025):

I ran into this issue with Ollama from nixpkgs. Can't replicate it outside of Nix, though.


@prusnak commented on GitHub (Aug 10, 2025):

We fixed the Nixpkgs issue by using apple-sdk_15 - https://github.com/NixOS/nixpkgs/pull/432470

Reference: github-starred/ollama#7823