[GH-ISSUE #11682] Running gpt-oss with OLLAMA_FLASH_ATTENTION=1 and OLLAMA_KV_CACHE_TYPE=q8_0 will crash the server #7729

Closed
opened 2026-04-12 19:51:26 -05:00 by GiteaMirror · 0 comments

Originally created by @i0ntempest on GitHub (Aug 5, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11682

What is the issue?

When both OLLAMA_FLASH_ATTENTION=1 and OLLAMA_KV_CACHE_TYPE=q8_0 are set and gpt-oss 20b is run, the server crashes.
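
The assertion that aborts the runner in the log output, `GGML_ASSERT(ne0 % ggml_blck_size(dst->type) == 0)`, requires the row length of the destination tensor to be a whole number of quantization blocks. The sketch below only illustrates that divisibility rule. It is not code from ollama or ggml: the function name and the table are hypothetical, and the block sizes (32 elements for `q8_0`, 1 for `f16`) are my understanding of ggml's definitions. The log does not show the actual `ne0` value, so the numbers used here are purely illustrative.

```python
# Hypothetical illustration of the check behind the logged GGML_ASSERT.
# Assumption: elements per quantization block, as I understand ggml's types.
BLOCK_SIZE = {"f16": 1, "q8_0": 32}


def row_fits_blocks(ne0: int, cache_type: str) -> bool:
    """Return True when a row of ne0 elements splits into whole blocks.

    Mirrors the logged condition: ne0 % ggml_blck_size(type) == 0.
    """
    return ne0 % BLOCK_SIZE[cache_type] == 0


if __name__ == "__main__":
    # Illustrative row lengths only; not taken from the crash.
    for ne0 in (64, 80):
        for cache_type in ("f16", "q8_0"):
            print(ne0, cache_type, row_fits_blocks(ne0, cache_type))
```

Under these assumptions, an unquantized `f16` cache accepts any row length, while a `q8_0` cache rejects a row whose length is not a multiple of 32. That would be consistent with the crash appearing only once `OLLAMA_KV_CACHE_TYPE=q8_0` is set.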

Relevant log output

time=2025-08-06T05:12:23.881+10:00 level=INFO source=routes.go:1297 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:15m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Volumes/WD SN850X/Ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_MMAP:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2025-08-06T05:12:23.884+10:00 level=INFO source=images.go:477 msg="total blobs: 83"
time=2025-08-06T05:12:23.885+10:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2025-08-06T05:12:23.886+10:00 level=INFO source=routes.go:1350 msg="Listening on 127.0.0.1:11434 (version 0.11.0)"
time=2025-08-06T05:12:23.916+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="48.0 GiB" available="48.0 GiB"
[GIN] 2025/08/06 - 05:12:33 | 200 |      30.458µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/08/06 - 05:12:33 | 200 |   51.567959ms |       127.0.0.1 | POST     "/api/show"
time=2025-08-06T05:12:33.823+10:00 level=INFO source=sched.go:786 msg="new model will fit in available VRAM in single GPU, loading" model="/Volumes/WD SN850X/Ollama/models/blobs/sha256-b112e727c6f18875636c56a779790a590d705aec9e1c0eb5a97d51fc2a778583" gpu=0 parallel=1 available=51539607552 required="15.0 GiB"
time=2025-08-06T05:12:33.823+10:00 level=INFO source=server.go:135 msg="system memory" total="64.0 GiB" free="35.0 GiB" free_swap="0 B"
time=2025-08-06T05:12:33.824+10:00 level=INFO source=server.go:175 msg=offload library=metal layers.requested=-1 layers.model=25 layers.offload=25 layers.split="" memory.available="[48.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="15.0 GiB" memory.required.partial="15.0 GiB" memory.required.kv="300.0 MiB" memory.required.allocations="[15.0 GiB]" memory.weights.total="11.7 GiB" memory.weights.repeating="10.7 GiB" memory.weights.nonrepeating="1.1 GiB" memory.graph.full="2.0 GiB" memory.graph.partial="2.0 GiB"
time=2025-08-06T05:12:33.824+10:00 level=INFO source=server.go:218 msg="enabling flash attention"
time=2025-08-06T05:12:33.824+10:00 level=WARN source=server.go:226 msg="kv cache type not supported by model" type=""
time=2025-08-06T05:12:33.852+10:00 level=INFO source=server.go:439 msg="starting llama server" cmd="/opt/local/bin/ollama runner --ollama-engine --model /Volumes/WD SN850X/Ollama/models/blobs/sha256-b112e727c6f18875636c56a779790a590d705aec9e1c0eb5a97d51fc2a778583 --ctx-size 8192 --batch-size 512 --n-gpu-layers 25 --threads 12 --flash-attn --parallel 1 --port 51079"
time=2025-08-06T05:12:33.853+10:00 level=INFO source=sched.go:481 msg="loaded runners" count=1
time=2025-08-06T05:12:33.853+10:00 level=INFO source=server.go:599 msg="waiting for llama runner to start responding"
time=2025-08-06T05:12:33.853+10:00 level=INFO source=server.go:633 msg="waiting for server to become available" status="llm server not responding"
time=2025-08-06T05:12:33.863+10:00 level=INFO source=runner.go:925 msg="starting ollama engine"
time=2025-08-06T05:12:33.863+10:00 level=INFO source=runner.go:983 msg="Server listening on 127.0.0.1:51079"
time=2025-08-06T05:12:33.889+10:00 level=INFO source=ggml.go:92 msg="" architecture=gptoss file_type=MXFP4 name="" description="" num_tensors=315 num_key_values=30
time=2025-08-06T05:12:33.953+10:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 Metal.0.BF16=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 compiler=cgo(clang)
time=2025-08-06T05:12:34.104+10:00 level=INFO source=server.go:633 msg="waiting for server to become available" status="llm server loading model"
time=2025-08-06T05:12:34.318+10:00 level=INFO source=ggml.go:367 msg="offloading 24 repeating layers to GPU"
time=2025-08-06T05:12:34.318+10:00 level=INFO source=ggml.go:373 msg="offloading output layer to GPU"
time=2025-08-06T05:12:34.318+10:00 level=INFO source=ggml.go:378 msg="offloaded 25/25 layers to GPU"
time=2025-08-06T05:12:34.318+10:00 level=INFO source=ggml.go:381 msg="model weights" buffer=CPU size="1.1 GiB"
time=2025-08-06T05:12:34.318+10:00 level=INFO source=ggml.go:381 msg="model weights" buffer=Metal size="11.7 GiB"
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M4 Max
ggml_metal_load_library: using embedded metal library
ggml_metal_init: GPU name:   Apple M4 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple9  (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction   = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets    = true
ggml_metal_init: has bfloat            = true
ggml_metal_init: use bfloat            = true
ggml_metal_init: hasUnifiedMemory      = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 51539.61 MB
time=2025-08-06T05:12:34.427+10:00 level=INFO source=ggml.go:672 msg="compute graph" backend=Metal buffer_type=Metal size="2.1 GiB"
time=2025-08-06T05:12:34.427+10:00 level=INFO source=ggml.go:672 msg="compute graph" backend=BLAS buffer_type=CPU size="5.6 MiB"
time=2025-08-06T05:12:34.427+10:00 level=INFO source=ggml.go:672 msg="compute graph" backend=CPU buffer_type=CPU size="0 B"
time=2025-08-06T05:12:36.611+10:00 level=INFO source=server.go:638 msg="llama runner started in 2.76 seconds"
[GIN] 2025/08/06 - 05:12:36 | 200 |  2.864798959s |       127.0.0.1 | POST     "/api/generate"
[GIN] 2025/08/06 - 05:12:44 | 200 |  5.652470834s |       127.0.0.1 | POST     "/api/chat"
^C  Mac-Studio  Admin  
 ~  ^C
  Mac-Studio  Admin  
 ~  130  sudo -u ollama /opt/local/libexec/ollama/ollama-wrapper.sh 
time=2025-08-06T05:13:06.712+10:00 level=INFO source=routes.go:1297 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:15m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Volumes/WD SN850X/Ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_MMAP:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2025-08-06T05:13:06.714+10:00 level=INFO source=images.go:477 msg="total blobs: 83"
time=2025-08-06T05:13:06.714+10:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2025-08-06T05:13:06.715+10:00 level=INFO source=routes.go:1350 msg="Listening on 127.0.0.1:11434 (version 0.11.0)"
time=2025-08-06T05:13:06.734+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="48.0 GiB" available="48.0 GiB"
[GIN] 2025/08/06 - 05:13:08 | 200 |      42.417µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/08/06 - 05:13:08 | 200 |   58.487708ms |       127.0.0.1 | POST     "/api/show"
time=2025-08-06T05:13:08.787+10:00 level=INFO source=sched.go:786 msg="new model will fit in available VRAM in single GPU, loading" model="/Volumes/WD SN850X/Ollama/models/blobs/sha256-b112e727c6f18875636c56a779790a590d705aec9e1c0eb5a97d51fc2a778583" gpu=0 parallel=1 available=51539607552 required="13.8 GiB"
time=2025-08-06T05:13:08.787+10:00 level=INFO source=server.go:135 msg="system memory" total="64.0 GiB" free="35.0 GiB" free_swap="0 B"
time=2025-08-06T05:13:08.788+10:00 level=INFO source=server.go:175 msg=offload library=metal layers.requested=-1 layers.model=25 layers.offload=25 layers.split="" memory.available="[48.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="13.8 GiB" memory.required.partial="13.8 GiB" memory.required.kv="150.0 MiB" memory.required.allocations="[13.8 GiB]" memory.weights.total="11.7 GiB" memory.weights.repeating="10.7 GiB" memory.weights.nonrepeating="1.1 GiB" memory.graph.full="1.0 GiB" memory.graph.partial="1.0 GiB"
time=2025-08-06T05:13:08.788+10:00 level=INFO source=server.go:218 msg="enabling flash attention"
time=2025-08-06T05:13:08.815+10:00 level=INFO source=server.go:439 msg="starting llama server" cmd="/opt/local/bin/ollama runner --ollama-engine --model /Volumes/WD SN850X/Ollama/models/blobs/sha256-b112e727c6f18875636c56a779790a590d705aec9e1c0eb5a97d51fc2a778583 --ctx-size 8192 --batch-size 512 --n-gpu-layers 25 --threads 12 --flash-attn --kv-cache-type q8_0 --parallel 1 --port 51140"
time=2025-08-06T05:13:08.817+10:00 level=INFO source=sched.go:481 msg="loaded runners" count=1
time=2025-08-06T05:13:08.817+10:00 level=INFO source=server.go:599 msg="waiting for llama runner to start responding"
time=2025-08-06T05:13:08.817+10:00 level=INFO source=server.go:633 msg="waiting for server to become available" status="llm server not responding"
time=2025-08-06T05:13:08.827+10:00 level=INFO source=runner.go:925 msg="starting ollama engine"
time=2025-08-06T05:13:08.827+10:00 level=INFO source=runner.go:983 msg="Server listening on 127.0.0.1:51140"
time=2025-08-06T05:13:08.853+10:00 level=INFO source=ggml.go:92 msg="" architecture=gptoss file_type=MXFP4 name="" description="" num_tensors=315 num_key_values=30
time=2025-08-06T05:13:08.913+10:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 Metal.0.BF16=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 compiler=cgo(clang)
time=2025-08-06T05:13:09.068+10:00 level=INFO source=server.go:633 msg="waiting for server to become available" status="llm server loading model"
time=2025-08-06T05:13:09.269+10:00 level=INFO source=ggml.go:367 msg="offloading 24 repeating layers to GPU"
time=2025-08-06T05:13:09.269+10:00 level=INFO source=ggml.go:373 msg="offloading output layer to GPU"
time=2025-08-06T05:13:09.270+10:00 level=INFO source=ggml.go:378 msg="offloaded 25/25 layers to GPU"
time=2025-08-06T05:13:09.270+10:00 level=INFO source=ggml.go:381 msg="model weights" buffer=Metal size="11.7 GiB"
time=2025-08-06T05:13:09.270+10:00 level=INFO source=ggml.go:381 msg="model weights" buffer=CPU size="1.1 GiB"
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M4 Max
ggml_metal_load_library: using embedded metal library
ggml_metal_init: GPU name:   Apple M4 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple9  (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction   = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets    = true
ggml_metal_init: has bfloat            = true
ggml_metal_init: use bfloat            = true
ggml_metal_init: hasUnifiedMemory      = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 51539.61 MB
time=2025-08-06T05:13:09.361+10:00 level=INFO source=ggml.go:672 msg="compute graph" backend=Metal buffer_type=Metal size="2.1 GiB"
time=2025-08-06T05:13:09.361+10:00 level=INFO source=ggml.go:672 msg="compute graph" backend=BLAS buffer_type=CPU size="5.6 MiB"
time=2025-08-06T05:13:09.361+10:00 level=INFO source=ggml.go:672 msg="compute graph" backend=CPU buffer_type=CPU size="0 B"
time=2025-08-06T05:13:11.328+10:00 level=INFO source=server.go:638 msg="llama runner started in 2.51 seconds"
[GIN] 2025/08/06 - 05:13:11 | 200 |  2.619066917s |       127.0.0.1 | POST     "/api/generate"
ggml_metal_get_buffer: error: tensor 'leaf_7 (view)' buffer is nil
ggml_metal_get_buffer: error: tensor 'leaf_7 (view) (copy of  (view) (permuted))' buffer is nil
ggml-metal.m:4848: GGML_ASSERT(ne0 % ggml_blck_size(dst->type) == 0) failed
SIGABRT: abort
PC=0x192c71388 m=18 sigcode=0
signal arrived during cgo execution

goroutine 6 gp=0x14000103a40 m=18 mp=0x14000580808 [syscall]:
runtime.cgocall(0x101003758, 0x1400014da38)
	/opt/local/lib/go/src/runtime/cgocall.go:167 +0x44 fp=0x1400014d9f0 sp=0x1400014d9b0 pc=0x1004f4d34
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_sched_graph_compute_async(0x12741ca00, 0x15cf35890)
	_cgo_gotypes.go:876 +0x34 fp=0x1400014da30 sp=0x1400014d9f0 pc=0x1008c1864
github.com/ollama/ollama/ml/backend/ggml.(*Context).Compute.func1(...)
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/ml/backend/ggml/ggml.go:631
github.com/ollama/ollama/ml/backend/ggml.(*Context).Compute(0x14000554000, {0x14000040540, 0x1, 0x0?})
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/ml/backend/ggml/ggml.go:631 +0x90 fp=0x1400014dae0 sp=0x1400014da30 pc=0x1008c9930
github.com/ollama/ollama/model.Forward({0x10165a0d0, 0x14000554000}, {0x101650870, 0x140001d56b0}, {0x140000b2400, 0x51, 0x80}, {{0x101664e48, 0x1400151e4f8}, {0x0, ...}, ...})
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/model/model.go:305 +0x1f4 fp=0x1400014dbd0 sp=0x1400014dae0 pc=0x1008d54d4
github.com/ollama/ollama/runner/ollamarunner.(*Server).processBatch(0x140001fe5a0)
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/ollamarunner/runner.go:480 +0x3f0 fp=0x1400014df80 sp=0x1400014dbd0 pc=0x100951d20
github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0x140001fe5a0, {0x101651d50, 0x1400051f4f0})
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/ollamarunner/runner.go:362 +0x54 fp=0x1400014dfa0 sp=0x1400014df80 pc=0x1009518f4
github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap2()
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/ollamarunner/runner.go:960 +0x30 fp=0x1400014dfd0 sp=0x1400014dfa0 pc=0x1009560a0
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400014dfd0 sp=0x1400014dfd0 pc=0x100500614
created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/ollamarunner/runner.go:960 +0x898

goroutine 1 gp=0x14000002380 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400014f5e0 sp=0x1400014f5c0 pc=0x1004f8258
runtime.netpollblock(0x1400014f678?, 0x57cb30?, 0x1?)
	/opt/local/lib/go/src/runtime/netpoll.go:575 +0x158 fp=0x1400014f620 sp=0x1400014f5e0 pc=0x1004bde48
internal/poll.runtime_pollWait(0x126e31f90, 0x72)
	/opt/local/lib/go/src/runtime/netpoll.go:351 +0xa0 fp=0x1400014f650 sp=0x1400014f620 pc=0x1004f7410
internal/poll.(*pollDesc).wait(0x14000551c00?, 0x10057ed98?, 0x0)
	/opt/local/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x1400014f680 sp=0x1400014f650 pc=0x100578348
internal/poll.(*pollDesc).waitRead(...)
	/opt/local/lib/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0x14000551c00)
	/opt/local/lib/go/src/internal/poll/fd_unix.go:620 +0x24c fp=0x1400014f730 sp=0x1400014f680 pc=0x10057cc1c
net.(*netFD).accept(0x14000551c00)
	/opt/local/lib/go/src/net/fd_unix.go:172 +0x28 fp=0x1400014f7f0 sp=0x1400014f730 pc=0x1005ebf58
net.(*TCPListener).accept(0x14000555880)
	/opt/local/lib/go/src/net/tcpsock_posix.go:159 +0x24 fp=0x1400014f840 sp=0x1400014f7f0 pc=0x1006001b4
net.(*TCPListener).Accept(0x14000555880)
	/opt/local/lib/go/src/net/tcpsock.go:380 +0x2c fp=0x1400014f880 sp=0x1400014f840 pc=0x1005ff19c
net/http.(*onceCloseListener).Accept(0x140004121b0?)
	<autogenerated>:1 +0x30 fp=0x1400014f8a0 sp=0x1400014f880 pc=0x1007dac80
net/http.(*Server).Serve(0x14000506100, {0x10164f8a8, 0x14000555880})
	/opt/local/lib/go/src/net/http/server.go:3424 +0x290 fp=0x1400014f9d0 sp=0x1400014f8a0 pc=0x1007b4320
github.com/ollama/ollama/runner/ollamarunner.Execute({0x140001b6030, 0x11, 0x11})
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/ollamarunner/runner.go:984 +0xb78 fp=0x1400014fce0 sp=0x1400014f9d0 pc=0x100955e38
github.com/ollama/ollama/runner.Execute({0x140001b6010?, 0x0?, 0x0?})
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/runner.go:20 +0x120 fp=0x1400014fd10 sp=0x1400014fce0 pc=0x1009566e0
github.com/ollama/ollama/cmd.NewCLI.func2(0x14000277400?, {0x1011cff11?, 0x4?, 0x1011cff15?})
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/cmd/cmd.go:1583 +0x54 fp=0x1400014fd40 sp=0x1400014fd10 pc=0x100fb4d34
github.com/spf13/cobra.(*Command).execute(0x140000fef08, {0x140001fe120, 0x12, 0x12})
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x680 fp=0x1400014fe60 sp=0x1400014fd40 pc=0x10065a750
github.com/spf13/cobra.(*Command).ExecuteC(0x140000ce908)
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x320 fp=0x1400014ff20 sp=0x1400014fe60 pc=0x10065aea0
github.com/spf13/cobra.(*Command).Execute(...)
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/main.go:12 +0x54 fp=0x1400014ff40 sp=0x1400014ff20 pc=0x100fb5884
runtime.main()
	/opt/local/lib/go/src/runtime/proc.go:283 +0x284 fp=0x1400014ffd0 sp=0x1400014ff40 pc=0x1004c49f4
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400014ffd0 sp=0x1400014ffd0 pc=0x100500614

goroutine 2 gp=0x14000002e00 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007cf90 sp=0x1400007cf70 pc=0x1004f8258
runtime.goparkunlock(...)
	/opt/local/lib/go/src/runtime/proc.go:441
runtime.forcegchelper()
	/opt/local/lib/go/src/runtime/proc.go:348 +0xb8 fp=0x1400007cfd0 sp=0x1400007cf90 pc=0x1004c4d48
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007cfd0 sp=0x1400007cfd0 pc=0x100500614
created by runtime.init.7 in goroutine 1
	/opt/local/lib/go/src/runtime/proc.go:336 +0x24

goroutine 3 gp=0x140000036c0 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007d760 sp=0x1400007d740 pc=0x1004f8258
runtime.goparkunlock(...)
	/opt/local/lib/go/src/runtime/proc.go:441
runtime.bgsweep(0x140000a8000)
	/opt/local/lib/go/src/runtime/mgcsweep.go:316 +0x108 fp=0x1400007d7b0 sp=0x1400007d760 pc=0x1004afdb8
runtime.gcenable.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:204 +0x28 fp=0x1400007d7d0 sp=0x1400007d7b0 pc=0x1004a3bb8
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007d7d0 sp=0x1400007d7d0 pc=0x100500614
created by runtime.gcenable in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:204 +0x6c

goroutine 4 gp=0x14000003880 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x101386758?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007df60 sp=0x1400007df40 pc=0x1004f8258
runtime.goparkunlock(...)
	/opt/local/lib/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x101f11fc0)
	/opt/local/lib/go/src/runtime/mgcscavenge.go:425 +0x5c fp=0x1400007df90 sp=0x1400007df60 pc=0x1004ad84c
runtime.bgscavenge(0x140000a8000)
	/opt/local/lib/go/src/runtime/mgcscavenge.go:658 +0xac fp=0x1400007dfb0 sp=0x1400007df90 pc=0x1004addec
runtime.gcenable.gowrap2()
	/opt/local/lib/go/src/runtime/mgc.go:205 +0x28 fp=0x1400007dfd0 sp=0x1400007dfb0 pc=0x1004a3b58
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007dfd0 sp=0x1400007dfd0 pc=0x100500614
created by runtime.gcenable in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:205 +0xac

goroutine 18 gp=0x140001828c0 m=nil [finalizer wait]:
runtime.gopark(0x180007c5c8?, 0x1000000000000?, 0xf8?, 0xc5?, 0x1007dd5ec?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007c590 sp=0x1400007c570 pc=0x1004f8258
runtime.runfinq()
	/opt/local/lib/go/src/runtime/mfinal.go:196 +0x108 fp=0x1400007c7d0 sp=0x1400007c590 pc=0x1004a2bb8
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007c7d0 sp=0x1400007c7d0 pc=0x100500614
created by runtime.createfing in goroutine 1
	/opt/local/lib/go/src/runtime/mfinal.go:166 +0x80

goroutine 19 gp=0x14000183340 m=nil [chan receive]:
runtime.gopark(0x140002a9680?, 0x14001524030?, 0x48?, 0x87?, 0x1005c0128?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x140000786f0 sp=0x140000786d0 pc=0x1004f8258
runtime.chanrecv(0x14000192310, 0x0, 0x1)
	/opt/local/lib/go/src/runtime/chan.go:664 +0x42c fp=0x14000078770 sp=0x140000786f0 pc=0x100494e8c
runtime.chanrecv1(0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/chan.go:506 +0x14 fp=0x140000787a0 sp=0x14000078770 pc=0x100494a24
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	/opt/local/lib/go/src/runtime/mgc.go:1797
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1800 +0x3c fp=0x140000787d0 sp=0x140000787a0 pc=0x1004a6ddc
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000787d0 sp=0x140000787d0 pc=0x100500614
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1795 +0x78

goroutine 20 gp=0x140001836c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000078f10 sp=0x14000078ef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x14000078fb0 sp=0x14000078f10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x14000078fd0 sp=0x14000078fb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000078fd0 sp=0x14000078fd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 34 gp=0x14000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011a710 sp=0x1400011a6f0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011a7b0 sp=0x1400011a710 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011a7d0 sp=0x1400011a7b0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011a7d0 sp=0x1400011a7d0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 50 gp=0x14000504000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000116710 sp=0x140001166f0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x140001167b0 sp=0x14000116710 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x140001167d0 sp=0x140001167b0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140001167d0 sp=0x140001167d0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 21 gp=0x14000183880 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc814b062?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000079710 sp=0x140000796f0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x140000797b0 sp=0x14000079710 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x140000797d0 sp=0x140000797b0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000797d0 sp=0x140000797d0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 51 gp=0x140005041c0 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80e2b46?, 0x3?, 0x44?, 0xe3?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000116f10 sp=0x14000116ef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x14000116fb0 sp=0x14000116f10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x14000116fd0 sp=0x14000116fb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000116fd0 sp=0x14000116fd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 35 gp=0x14000102540 m=nil [GC worker (idle)]:
runtime.gopark(0x101f42800?, 0x1?, 0x8?, 0x20?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011af10 sp=0x1400011aef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011afb0 sp=0x1400011af10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011afd0 sp=0x1400011afb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011afd0 sp=0x1400011afd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 22 gp=0x14000183a40 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80c3a0a?, 0x0?, 0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000079f10 sp=0x14000079ef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x14000079fb0 sp=0x14000079f10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x14000079fd0 sp=0x14000079fb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000079fd0 sp=0x14000079fd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 52 gp=0x14000504380 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80e2e0a?, 0x3?, 0x64?, 0xfa?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000117710 sp=0x140001176f0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x140001177b0 sp=0x14000117710 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x140001177d0 sp=0x140001177b0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140001177d0 sp=0x140001177d0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 53 gp=0x14000504540 m=nil [GC worker (idle)]:
runtime.gopark(0x101f42800?, 0x1?, 0x49?, 0x3f?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000117f10 sp=0x14000117ef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x14000117fb0 sp=0x14000117f10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x14000117fd0 sp=0x14000117fb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000117fd0 sp=0x14000117fd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 54 gp=0x14000504700 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80e1935?, 0x1?, 0xe3?, 0xe?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000118710 sp=0x140001186f0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x140001187b0 sp=0x14000118710 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x140001187d0 sp=0x140001187b0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140001187d0 sp=0x140001187d0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 36 gp=0x14000102700 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc813c655?, 0x3?, 0x3d?, 0x33?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400061ef10 sp=0x1400061eef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400061efb0 sp=0x1400061ef10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400061efd0 sp=0x1400061efb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400061efd0 sp=0x1400061efd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 37 gp=0x140001028c0 m=nil [GC worker (idle)]:
runtime.gopark(0x101f42800?, 0x1?, 0xe8?, 0x1c?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011bf10 sp=0x1400011bef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011bfb0 sp=0x1400011bf10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011bfd0 sp=0x1400011bfb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011bfd0 sp=0x1400011bfd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 38 gp=0x14000102a80 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80af0f0?, 0x3?, 0xdf?, 0x37?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011c710 sp=0x1400011c6f0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011c7b0 sp=0x1400011c710 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011c7d0 sp=0x1400011c7b0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011c7d0 sp=0x1400011c7d0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 23 gp=0x14000183c00 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80cb1d5?, 0x1?, 0xa6?, 0x38?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007a710 sp=0x1400007a6f0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400007a7b0 sp=0x1400007a710 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400007a7d0 sp=0x1400007a7b0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007a7d0 sp=0x1400007a7d0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 39 gp=0x14000102c40 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80c3f93?, 0x3?, 0xb0?, 0xa1?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011cf10 sp=0x1400011cef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011cfb0 sp=0x1400011cf10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011cfd0 sp=0x1400011cfb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011cfd0 sp=0x1400011cfd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 55 gp=0x140005048c0 m=nil [GC worker (idle)]:
runtime.gopark(0x17f6cc80e1647?, 0x1?, 0xf9?, 0x86?, 0x0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000118f10 sp=0x14000118ef0 pc=0x1004f8258
runtime.gcBgMarkWorker(0x14000193730)
	/opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x14000118fb0 sp=0x14000118f10 pc=0x1004a604c
runtime.gcBgMarkStartWorkers.gowrap1()
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x14000118fd0 sp=0x14000118fb0 pc=0x1004a5f38
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000118fd0 sp=0x14000118fd0 pc=0x100500614
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/opt/local/lib/go/src/runtime/mgc.go:1339 +0x140

goroutine 444 gp=0x14000103880 m=nil [IO wait]:
runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x10051c0f0?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007b580 sp=0x1400007b560 pc=0x1004f8258
runtime.netpollblock(0x0?, 0x0?, 0x0?)
	/opt/local/lib/go/src/runtime/netpoll.go:575 +0x158 fp=0x1400007b5c0 sp=0x1400007b580 pc=0x1004bde48
internal/poll.runtime_pollWait(0x126e31e78, 0x72)
	/opt/local/lib/go/src/runtime/netpoll.go:351 +0xa0 fp=0x1400007b5f0 sp=0x1400007b5c0 pc=0x1004f7410
internal/poll.(*pollDesc).wait(0x14000250b80?, 0x140001f53f1?, 0x0)
	/opt/local/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x1400007b620 sp=0x1400007b5f0 pc=0x100578348
internal/poll.(*pollDesc).waitRead(...)
	/opt/local/lib/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x14000250b80, {0x140001f53f1, 0x1, 0x1})
	/opt/local/lib/go/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x1400007b6c0 sp=0x1400007b620 pc=0x1005795fc
net.(*netFD).Read(0x14000250b80, {0x140001f53f1?, 0x1400007b758?, 0x1007a9f64?})
	/opt/local/lib/go/src/net/fd_posix.go:55 +0x28 fp=0x1400007b710 sp=0x1400007b6c0 pc=0x1005ea528
net.(*conn).Read(0x1400019e430, {0x140001f53f1?, 0x1400007b798?, 0x1009560a0?})
	/opt/local/lib/go/src/net/net.go:194 +0x34 fp=0x1400007b760 sp=0x1400007b710 pc=0x1005f73f4
net/http.(*connReader).backgroundRead(0x140001f53e0)
	/opt/local/lib/go/src/net/http/server.go:690 +0x40 fp=0x1400007b7b0 sp=0x1400007b760 pc=0x1007a9e60
net/http.(*connReader).startBackgroundRead.gowrap2()
	/opt/local/lib/go/src/net/http/server.go:686 +0x28 fp=0x1400007b7d0 sp=0x1400007b7b0 pc=0x1007a9d48
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007b7d0 sp=0x1400007b7d0 pc=0x100500614
created by net/http.(*connReader).startBackgroundRead in goroutine 7
	/opt/local/lib/go/src/net/http/server.go:686 +0xc4

goroutine 7 gp=0x14000103c00 m=nil [select]:
runtime.gopark(0x14000045a50?, 0x2?, 0x78?, 0x57?, 0x1400004581c?)
	/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000045650 sp=0x14000045630 pc=0x1004f8258
runtime.selectgo(0x14000045a50, 0x14000045818, 0x51?, 0x0, 0x0?, 0x1)
	/opt/local/lib/go/src/runtime/select.go:351 +0x6c4 fp=0x14000045780 sp=0x14000045650 pc=0x1004d8034
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0x140001fe5a0, {0x10164fa88, 0x140015262a0}, 0x140001b6b40)
	/opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/ollamarunner/runner.go:680 +0x9d8 fp=0x14000045aa0 sp=0x14000045780 pc=0x100953a08
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x10164fa88?, 0x140015262a0?}, 0x14000151b28?)
	<autogenerated>:1 +0x40 fp=0x14000045ad0 sp=0x14000045aa0 pc=0x1009564e0
net/http.HandlerFunc.ServeHTTP(0x140000d2840?, {0x10164fa88?, 0x140015262a0?}, 0x14000151b10?)
	/opt/local/lib/go/src/net/http/server.go:2294 +0x38 fp=0x14000045b00 sp=0x14000045ad0 pc=0x1007b0d48
net/http.(*ServeMux).ServeHTTP(0x10?, {0x10164fa88, 0x140015262a0}, 0x140001b6b40)
	/opt/local/lib/go/src/net/http/server.go:2822 +0x1b4 fp=0x14000045b50 sp=0x14000045b00 pc=0x1007b28d4
net/http.serverHandler.ServeHTTP({0x10164c0b0?}, {0x10164fa88?, 0x140015262a0?}, 0x1?)
	/opt/local/lib/go/src/net/http/server.go:3301 +0xbc fp=0x14000045b80 sp=0x14000045b50 pc=0x1007ce65c
net/http.(*conn).serve(0x140004121b0, {0x101651d18, 0x140002b8090})
	/opt/local/lib/go/src/net/http/server.go:2102 +0x52c fp=0x14000045fa0 sp=0x14000045b80 pc=0x1007af4ec
net/http.(*Server).Serve.gowrap3()
	/opt/local/lib/go/src/net/http/server.go:3454 +0x30 fp=0x14000045fd0 sp=0x14000045fa0 pc=0x1007b46b0
runtime.goexit({})
	/opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000045fd0 sp=0x14000045fd0 pc=0x100500614
created by net/http.(*Server).Serve in goroutine 1
	/opt/local/lib/go/src/net/http/server.go:3454 +0x3d8

r0      0x0
r1      0x0
r2      0x0
r3      0x0
r4      0x1013a190a
r5      0x6be67a960
r6      0x64656c6961662029
r7      0x1400019e038
r8      0x948eb72c3c2817aa
r9      0x948eb72a824fa7aa
r10     0x2
r11     0x10000000000
r12     0xfffffffd
r13     0x0
r14     0x0
r15     0x0
r16     0x148
r17     0x201e09fa8
r18     0x0
r19     0x6
r20     0x4203
r21     0x6be67b0e0
r22     0x1fe7abe68
r23     0x15cf764c0
r24     0x1445000
r25     0x12788d300
r26     0x40
r27     0x8
r28     0x15cf76350
r29     0x6be67a8c0
lr      0x192caa88c
sp      0x6be67a8a0
pc      0x192c71388
fault   0x192c71388
time=2025-08-06T05:13:12.472+10:00 level=ERROR source=server.go:808 msg="post predict" error="Post \"http://127.0.0.1:51140/completion\": EOF"
time=2025-08-06T05:13:12.472+10:00 level=ERROR source=server.go:465 msg="llama runner terminated" error="exit status 2"
[GIN] 2025/08/06 - 05:13:12 | 200 |  242.328083ms |       127.0.0.1 | POST     "/api/chat"

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.11.0
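
For context, the runner aborts on `GGML_ASSERT(ne0 % ggml_blck_size(dst->type) == 0)` in `ggml-metal.m`. That check requires the destination row length `ne0` to be a whole multiple of the quantization block size. The sketch below only illustrates that arithmetic. It is not Ollama or ggml code: `blockSize`, `rowFitsBlocks`, and the sample row lengths are hypothetical, and the log does not say which `ne0` value actually triggered the failure. The 32-element block size for `q8_0` and `q4_0` matches ggml's block definitions.

```go
package main

import "fmt"

// blockSize maps a few ggml tensor type names to elements per block.
// q8_0 and q4_0 pack 32 elements per block; f16 is unquantized (block of 1).
// This table is a hypothetical stand-in for ggml_blck_size().
var blockSize = map[string]int{
	"f16":  1,
	"q8_0": 32,
	"q4_0": 32,
}

// rowFitsBlocks mirrors the failed assertion:
//   ne0 % ggml_blck_size(type) == 0
// It reports false for unknown types rather than guessing a block size.
func rowFitsBlocks(ne0 int, typ string) bool {
	bs, ok := blockSize[typ]
	if !ok {
		return false
	}
	return ne0%bs == 0
}

func main() {
	// Illustrative row lengths only; these are not taken from the crash log.
	for _, ne0 := range []int{64, 80} {
		for _, typ := range []string{"f16", "q8_0"} {
			fmt.Printf("ne0=%d type=%s fits=%v\n", ne0, typ, rowFitsBlocks(ne0, typ))
		}
	}
}
```

This is consistent with the report that the crash appears only once `OLLAMA_KV_CACHE_TYPE=q8_0` is set: an unquantized cache has a block size of 1, so the check can never fail there.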

/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000117f10 sp=0x14000117ef0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x14000117fb0 sp=0x14000117f10 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x14000117fd0 sp=0x14000117fb0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000117fd0 sp=0x14000117fd0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 54 gp=0x14000504700 m=nil [GC worker (idle)]: runtime.gopark(0x17f6cc80e1935?, 0x1?, 0xe3?, 0xe?, 0x0?) /opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000118710 sp=0x140001186f0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x140001187b0 sp=0x14000118710 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x140001187d0 sp=0x140001187b0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140001187d0 sp=0x140001187d0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 36 gp=0x14000102700 m=nil [GC worker (idle)]: runtime.gopark(0x17f6cc813c655?, 0x3?, 0x3d?, 0x33?, 0x0?) 
/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400061ef10 sp=0x1400061eef0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400061efb0 sp=0x1400061ef10 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400061efd0 sp=0x1400061efb0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400061efd0 sp=0x1400061efd0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 37 gp=0x140001028c0 m=nil [GC worker (idle)]: runtime.gopark(0x101f42800?, 0x1?, 0xe8?, 0x1c?, 0x0?) /opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011bf10 sp=0x1400011bef0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011bfb0 sp=0x1400011bf10 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011bfd0 sp=0x1400011bfb0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011bfd0 sp=0x1400011bfd0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 38 gp=0x14000102a80 m=nil [GC worker (idle)]: runtime.gopark(0x17f6cc80af0f0?, 0x3?, 0xdf?, 0x37?, 0x0?) 
/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011c710 sp=0x1400011c6f0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011c7b0 sp=0x1400011c710 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011c7d0 sp=0x1400011c7b0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011c7d0 sp=0x1400011c7d0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 23 gp=0x14000183c00 m=nil [GC worker (idle)]: runtime.gopark(0x17f6cc80cb1d5?, 0x1?, 0xa6?, 0x38?, 0x0?) /opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007a710 sp=0x1400007a6f0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400007a7b0 sp=0x1400007a710 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400007a7d0 sp=0x1400007a7b0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007a7d0 sp=0x1400007a7d0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 39 gp=0x14000102c40 m=nil [GC worker (idle)]: runtime.gopark(0x17f6cc80c3f93?, 0x3?, 0xb0?, 0xa1?, 0x0?) 
/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400011cf10 sp=0x1400011cef0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x1400011cfb0 sp=0x1400011cf10 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x1400011cfd0 sp=0x1400011cfb0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400011cfd0 sp=0x1400011cfd0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 55 gp=0x140005048c0 m=nil [GC worker (idle)]: runtime.gopark(0x17f6cc80e1647?, 0x1?, 0xf9?, 0x86?, 0x0?) /opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000118f10 sp=0x14000118ef0 pc=0x1004f8258 runtime.gcBgMarkWorker(0x14000193730) /opt/local/lib/go/src/runtime/mgc.go:1423 +0xdc fp=0x14000118fb0 sp=0x14000118f10 pc=0x1004a604c runtime.gcBgMarkStartWorkers.gowrap1() /opt/local/lib/go/src/runtime/mgc.go:1339 +0x28 fp=0x14000118fd0 sp=0x14000118fb0 pc=0x1004a5f38 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000118fd0 sp=0x14000118fd0 pc=0x100500614 created by runtime.gcBgMarkStartWorkers in goroutine 1 /opt/local/lib/go/src/runtime/mgc.go:1339 +0x140 goroutine 444 gp=0x14000103880 m=nil [IO wait]: runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x10051c0f0?) /opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x1400007b580 sp=0x1400007b560 pc=0x1004f8258 runtime.netpollblock(0x0?, 0x0?, 0x0?) 
/opt/local/lib/go/src/runtime/netpoll.go:575 +0x158 fp=0x1400007b5c0 sp=0x1400007b580 pc=0x1004bde48 internal/poll.runtime_pollWait(0x126e31e78, 0x72) /opt/local/lib/go/src/runtime/netpoll.go:351 +0xa0 fp=0x1400007b5f0 sp=0x1400007b5c0 pc=0x1004f7410 internal/poll.(*pollDesc).wait(0x14000250b80?, 0x140001f53f1?, 0x0) /opt/local/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x1400007b620 sp=0x1400007b5f0 pc=0x100578348 internal/poll.(*pollDesc).waitRead(...) /opt/local/lib/go/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0x14000250b80, {0x140001f53f1, 0x1, 0x1}) /opt/local/lib/go/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x1400007b6c0 sp=0x1400007b620 pc=0x1005795fc net.(*netFD).Read(0x14000250b80, {0x140001f53f1?, 0x1400007b758?, 0x1007a9f64?}) /opt/local/lib/go/src/net/fd_posix.go:55 +0x28 fp=0x1400007b710 sp=0x1400007b6c0 pc=0x1005ea528 net.(*conn).Read(0x1400019e430, {0x140001f53f1?, 0x1400007b798?, 0x1009560a0?}) /opt/local/lib/go/src/net/net.go:194 +0x34 fp=0x1400007b760 sp=0x1400007b710 pc=0x1005f73f4 net/http.(*connReader).backgroundRead(0x140001f53e0) /opt/local/lib/go/src/net/http/server.go:690 +0x40 fp=0x1400007b7b0 sp=0x1400007b760 pc=0x1007a9e60 net/http.(*connReader).startBackgroundRead.gowrap2() /opt/local/lib/go/src/net/http/server.go:686 +0x28 fp=0x1400007b7d0 sp=0x1400007b7b0 pc=0x1007a9d48 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400007b7d0 sp=0x1400007b7d0 pc=0x100500614 created by net/http.(*connReader).startBackgroundRead in goroutine 7 /opt/local/lib/go/src/net/http/server.go:686 +0xc4 goroutine 7 gp=0x14000103c00 m=nil [select]: runtime.gopark(0x14000045a50?, 0x2?, 0x78?, 0x57?, 0x1400004581c?) 
/opt/local/lib/go/src/runtime/proc.go:435 +0xc8 fp=0x14000045650 sp=0x14000045630 pc=0x1004f8258 runtime.selectgo(0x14000045a50, 0x14000045818, 0x51?, 0x0, 0x0?, 0x1) /opt/local/lib/go/src/runtime/select.go:351 +0x6c4 fp=0x14000045780 sp=0x14000045650 pc=0x1004d8034 github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0x140001fe5a0, {0x10164fa88, 0x140015262a0}, 0x140001b6b40) /opt/local/var/macports/build/ollama-67ec51d2/work/gopath/src/github.com/ollama/ollama/runner/ollamarunner/runner.go:680 +0x9d8 fp=0x14000045aa0 sp=0x14000045780 pc=0x100953a08 github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x10164fa88?, 0x140015262a0?}, 0x14000151b28?) <autogenerated>:1 +0x40 fp=0x14000045ad0 sp=0x14000045aa0 pc=0x1009564e0 net/http.HandlerFunc.ServeHTTP(0x140000d2840?, {0x10164fa88?, 0x140015262a0?}, 0x14000151b10?) /opt/local/lib/go/src/net/http/server.go:2294 +0x38 fp=0x14000045b00 sp=0x14000045ad0 pc=0x1007b0d48 net/http.(*ServeMux).ServeHTTP(0x10?, {0x10164fa88, 0x140015262a0}, 0x140001b6b40) /opt/local/lib/go/src/net/http/server.go:2822 +0x1b4 fp=0x14000045b50 sp=0x14000045b00 pc=0x1007b28d4 net/http.serverHandler.ServeHTTP({0x10164c0b0?}, {0x10164fa88?, 0x140015262a0?}, 0x1?) 
/opt/local/lib/go/src/net/http/server.go:3301 +0xbc fp=0x14000045b80 sp=0x14000045b50 pc=0x1007ce65c net/http.(*conn).serve(0x140004121b0, {0x101651d18, 0x140002b8090}) /opt/local/lib/go/src/net/http/server.go:2102 +0x52c fp=0x14000045fa0 sp=0x14000045b80 pc=0x1007af4ec net/http.(*Server).Serve.gowrap3() /opt/local/lib/go/src/net/http/server.go:3454 +0x30 fp=0x14000045fd0 sp=0x14000045fa0 pc=0x1007b46b0 runtime.goexit({}) /opt/local/lib/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000045fd0 sp=0x14000045fd0 pc=0x100500614 created by net/http.(*Server).Serve in goroutine 1 /opt/local/lib/go/src/net/http/server.go:3454 +0x3d8
r0 0x0 r1 0x0 r2 0x0 r3 0x0 r4 0x1013a190a r5 0x6be67a960 r6 0x64656c6961662029 r7 0x1400019e038 r8 0x948eb72c3c2817aa r9 0x948eb72a824fa7aa r10 0x2 r11 0x10000000000 r12 0xfffffffd r13 0x0 r14 0x0 r15 0x0 r16 0x148 r17 0x201e09fa8 r18 0x0 r19 0x6 r20 0x4203 r21 0x6be67b0e0 r22 0x1fe7abe68 r23 0x15cf764c0 r24 0x1445000 r25 0x12788d300 r26 0x40 r27 0x8 r28 0x15cf76350 r29 0x6be67a8c0 lr 0x192caa88c sp 0x6be67a8a0 pc 0x192c71388 fault 0x192c71388
time=2025-08-06T05:13:12.472+10:00 level=ERROR source=server.go:808 msg="post predict" error="Post \"http://127.0.0.1:51140/completion\": EOF"
time=2025-08-06T05:13:12.472+10:00 level=ERROR source=server.go:465 msg="llama runner terminated" error="exit status 2"
[GIN] 2025/08/06 - 05:13:12 | 200 | 242.328083ms | 127.0.0.1 | POST "/api/chat"
```

### OS

macOS

### GPU

Apple

### CPU

Apple

### Ollama version

0.11.0

GiteaMirror added the bug label 2026-04-12 19:51:26 -05:00
Reference: github-starred/ollama#7729