[GH-ISSUE #11951] Ollama crashing when using OpenAI API #7935

Closed
opened 2026-04-12 20:05:59 -05:00 by GiteaMirror · 4 comments

Originally created by @SleepyYui on GitHub (Aug 18, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11951

Originally assigned to: @ParthSareen on GitHub.

What is the issue?

When I connect via the OpenAI API routes and send a request to the chat endpoint, the server crashes.

Model used: hf.co/unsloth/gemma-3-270m-it-GGUF:F16
Base URL: http://localhost:11434/v1

I am using the openai Python module.

Running the same model with ollama run in the CLI works perfectly fine, as does the desktop app.
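
The report does not include the request that was sent. The crashing goroutine in the log below is inside llama.SchemaToGrammar, which suggests (but does not prove) that the request carried a JSON schema for structured output. The following is a minimal sketch of such a request body, not the reporter's actual code. The base URL and model name come from the report; the schema, the helper names build_chat_request and send, and the assumption that response_format with type json_schema is what reached the server are all hypothetical. It uses only the standard library so the payload is visible without the openai module.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"            # from the report
MODEL = "hf.co/unsloth/gemma-3-270m-it-GGUF:F16"  # from the report

# Hypothetical schema: the report does not show the schema that was actually sent.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer"],
}


def build_chat_request(prompt, schema=None):
    """Build the JSON body for POST {BASE_URL}/chat/completions.

    With schema=None this is a plain chat request. With a schema, the body
    carries an OpenAI-style response_format of type json_schema (assumed to
    be what triggers schema-to-grammar conversion on the server).
    """
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    if schema is not None:
        body["response_format"] = {
            "type": "json_schema",
            "json_schema": {"name": "example", "schema": schema},
        }
    return body


def send(body):
    """POST the body to the local server and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling send(build_chat_request("hi")) and then send(build_chat_request("hi", EXAMPLE_SCHEMA)) against a running server would show whether only the schema variant takes the server down, which would narrow the problem to the structured-output path rather than the OpenAI routes in general.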

Relevant log output

username@COMPUTER ~ % ollama serve
time=2025-08-18T10:11:23.184+02:00 level=INFO source=routes.go:1304 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/matthias/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2025-08-18T10:11:23.186+02:00 level=INFO source=images.go:477 msg="total blobs: 38"
time=2025-08-18T10:11:23.187+02:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2025-08-18T10:11:23.187+02:00 level=INFO source=routes.go:1357 msg="Listening on 127.0.0.1:11434 (version 0.11.4)"
time=2025-08-18T10:11:23.236+02:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="16.0 GiB" available="16.0 GiB"
time=2025-08-18T10:11:23.236+02:00 level=INFO source=routes.go:1398 msg="entering low vram mode" "total vram"="16.0 GiB" threshold="20.0 GiB"
time=2025-08-18T10:11:29.927+02:00 level=INFO source=sched.go:786 msg="new model will fit in available VRAM in single GPU, loading" model=/Users/matthias/.ollama/models/blobs/sha256-140cb395a3a3dcdb3de66e44c60d7265286c9d49f951c1899ec0d8c1f16b2feb gpu=0 parallel=1 available=17179885568 required="1.5 GiB"
time=2025-08-18T10:11:29.927+02:00 level=INFO source=server.go:135 msg="system memory" total="24.0 GiB" free="5.1 GiB" free_swap="0 B"
time=2025-08-18T10:11:29.927+02:00 level=INFO source=server.go:175 msg=offload library=metal layers.requested=-1 layers.model=19 layers.offload=19 layers.split="" memory.available="[16.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="1.5 GiB" memory.required.partial="1.5 GiB" memory.required.kv="27.0 MiB" memory.required.allocations="[1.5 GiB]" memory.weights.total="511.5 MiB" memory.weights.repeating="191.5 MiB" memory.weights.nonrepeating="320.0 MiB" memory.graph.full="513.2 MiB" memory.graph.partial="513.2 MiB"
time=2025-08-18T10:11:29.950+02:00 level=INFO source=server.go:438 msg="starting llama server" cmd="/Applications/Ollama.app/Contents/Resources/ollama runner --ollama-engine --model /Users/matthias/.ollama/models/blobs/sha256-140cb395a3a3dcdb3de66e44c60d7265286c9d49f951c1899ec0d8c1f16b2feb --ctx-size 4096 --batch-size 512 --n-gpu-layers 19 --threads 4 --parallel 1 --port 53562"
time=2025-08-18T10:11:29.952+02:00 level=INFO source=sched.go:481 msg="loaded runners" count=1
time=2025-08-18T10:11:29.953+02:00 level=INFO source=server.go:598 msg="waiting for llama runner to start responding"
time=2025-08-18T10:11:29.953+02:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server not responding"
time=2025-08-18T10:11:29.961+02:00 level=INFO source=runner.go:925 msg="starting ollama engine"
time=2025-08-18T10:11:29.961+02:00 level=INFO source=runner.go:983 msg="Server listening on 127.0.0.1:53562"
time=2025-08-18T10:11:29.980+02:00 level=INFO source=ggml.go:92 msg="" architecture=gemma3 file_type=F16 name=Gemma-3-270M-It description="" num_tensors=236 num_key_values=42
time=2025-08-18T10:11:29.983+02:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 Metal.0.BF16=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 compiler=cgo(clang)
time=2025-08-18T10:11:30.027+02:00 level=INFO source=ggml.go:365 msg="offloading 18 repeating layers to GPU"
time=2025-08-18T10:11:30.027+02:00 level=INFO source=ggml.go:371 msg="offloading output layer to GPU"
time=2025-08-18T10:11:30.027+02:00 level=INFO source=ggml.go:376 msg="offloaded 19/19 layers to GPU"
time=2025-08-18T10:11:30.027+02:00 level=INFO source=ggml.go:379 msg="model weights" buffer=CPU size="320.0 MiB"
time=2025-08-18T10:11:30.027+02:00 level=INFO source=ggml.go:379 msg="model weights" buffer=Metal size="511.5 MiB"
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M3
ggml_metal_load_library: using embedded metal library
ggml_metal_init: GPU name:   Apple M3
ggml_metal_init: GPU family: MTLGPUFamilyApple9  (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction   = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets    = false
ggml_metal_init: has bfloat            = true
ggml_metal_init: use bfloat            = true
ggml_metal_init: hasUnifiedMemory      = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 17179.89 MB
time=2025-08-18T10:11:30.055+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=Metal buffer_type=Metal size="48.5 MiB"
time=2025-08-18T10:11:30.055+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=BLAS buffer_type=CPU size="1.2 MiB"
time=2025-08-18T10:11:30.055+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=CPU buffer_type=CPU size="0 B"
time=2025-08-18T10:11:30.205+02:00 level=INFO source=server.go:637 msg="llama runner started in 0.25 seconds"
SIGSEGV: segmentation violation
PC=0x10320b98c m=11 sigcode=2 addr=0xfffffffffffffff8
signal arrived during cgo execution

goroutine 66 gp=0x14000582c40 m=11 mp=0x140002a2808 [syscall]:
runtime.cgocall(0x103169ec4, 0x1400008f928)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/cgocall.go:167 +0x44 fp=0x1400008f8f0 sp=0x1400008f8b0 pc=0x102659ef4
github.com/ollama/ollama/llama._Cfunc_schema_to_grammar(0x12c010000, 0x14000414000, 0x8000)
	_cgo_gotypes.go:970 +0x34 fp=0x1400008f920 sp=0x1400008f8f0 pc=0x1029b4634
github.com/ollama/ollama/llama.SchemaToGrammar({0x140002daa80?, 0xa37, 0x1400029a050?})
	/Users/runner/work/ollama/ollama/llama/llama.go:587 +0xc0 fp=0x1400008f9c0 sp=0x1400008f920 pc=0x1029b7b20
github.com/ollama/ollama/llm.(*llmServer).Completion(0x140005c0c00, {0x103815f10, 0x1400029a050}, {{0x1400037a000, 0x1b2e}, {0x140002daa80, 0xa37, 0xa80}, {0x0, 0x0, ...}, ...}, ...)
	/Users/runner/work/ollama/ollama/llm/server.go:753 +0x27c fp=0x1400008fe20 sp=0x1400008f9c0 pc=0x102a4837c
github.com/ollama/ollama/server.(*Server).ChatHandler.func1()
	/Users/runner/work/ollama/ollama/server/routes.go:1638 +0x274 fp=0x1400008ffd0 sp=0x1400008fe20 pc=0x1030f6a04
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400008ffd0 sp=0x1400008ffd0 pc=0x102665174
created by github.com/ollama/ollama/server.(*Server).ChatHandler in goroutine 40
	/Users/runner/work/ollama/ollama/server/routes.go:1635 +0x130c

goroutine 1 gp=0x140000021c0 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x1042aa010?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000207660 sp=0x14000207640 pc=0x10265d418
runtime.netpollblock(0x140000496f8?, 0x26e1030?, 0x1?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:575 +0x158 fp=0x140002076a0 sp=0x14000207660 pc=0x1026229d8
internal/poll.runtime_pollWait(0x12af967d0, 0x72)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:351 +0xa0 fp=0x140002076d0 sp=0x140002076a0 pc=0x10265c5d0
internal/poll.(*pollDesc).wait(0x140005ab600?, 0x3800000038?, 0x0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x14000207700 sp=0x140002076d0 pc=0x1026dc848
internal/poll.(*pollDesc).waitRead(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0x140005ab600)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_unix.go:620 +0x24c fp=0x140002077b0 sp=0x14000207700 pc=0x1026e111c
net.(*netFD).accept(0x140005ab600)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/fd_unix.go:172 +0x28 fp=0x14000207870 sp=0x140002077b0 pc=0x102750388
net.(*TCPListener).accept(0x14000596200)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/tcpsock_posix.go:159 +0x24 fp=0x140002078c0 sp=0x14000207870 pc=0x1027645e4
net.(*TCPListener).Accept(0x14000596200)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/tcpsock.go:380 +0x2c fp=0x14000207900 sp=0x140002078c0 pc=0x1027635cc
net/http.(*onceCloseListener).Accept(0x1400013fef0?)
	<autogenerated>:1 +0x30 fp=0x14000207920 sp=0x14000207900 pc=0x10293e8e0
net/http.(*Server).Serve(0x140000f0700, {0x103813a68, 0x14000596200})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:3424 +0x290 fp=0x14000207a50 sp=0x14000207920 pc=0x102918020
github.com/ollama/ollama/server.Serve({0x103813a68, 0x14000596200})
	/Users/runner/work/ollama/ollama/server/routes.go:1401 +0x79c fp=0x14000207d00 sp=0x14000207a50 pc=0x1030f3eec
github.com/ollama/ollama/cmd.RunServer(0x140001c9400?, {0x104100680?, 0x4?, 0x103393958?})
	/Users/runner/work/ollama/ollama/cmd/cmd.go:1337 +0x44 fp=0x14000207d40 sp=0x14000207d00 pc=0x103111684
github.com/spf13/cobra.(*Command).execute(0x140005c5808, {0x104100680, 0x0, 0x0})
	/Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x648 fp=0x14000207e60 sp=0x14000207d40 pc=0x1027be928
github.com/spf13/cobra.(*Command).ExecuteC(0x140005c4908)
	/Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x320 fp=0x14000207f20 sp=0x14000207e60 pc=0x1027bf070
github.com/spf13/cobra.(*Command).Execute(...)
	/Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	/Users/runner/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
	/Users/runner/work/ollama/ollama/main.go:12 +0x54 fp=0x14000207f40 sp=0x14000207f20 pc=0x103119a94
runtime.main()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:283 +0x284 fp=0x14000207fd0 sp=0x14000207f40 pc=0x102629544
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000207fd0 sp=0x14000207fd0 pc=0x102665174

goroutine 2 gp=0x14000002c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000072f90 sp=0x14000072f70 pc=0x10265d418
runtime.goparkunlock(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:441
runtime.forcegchelper()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:348 +0xb8 fp=0x14000072fd0 sp=0x14000072f90 pc=0x102629898
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000072fd0 sp=0x14000072fd0 pc=0x102665174
created by runtime.init.7 in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:336 +0x24

goroutine 3 gp=0x14000003500 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000073760 sp=0x14000073740 pc=0x10265d418
runtime.goparkunlock(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:441
runtime.bgsweep(0x1400007c000)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgcsweep.go:316 +0x108 fp=0x140000737b0 sp=0x14000073760 pc=0x102614978
runtime.gcenable.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:204 +0x28 fp=0x140000737d0 sp=0x140000737b0 pc=0x102608778
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000737d0 sp=0x140000737d0 pc=0x102665174
created by runtime.gcenable in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:204 +0x6c

goroutine 4 gp=0x140000036c0 m=nil [GC scavenge wait]:
runtime.gopark(0x371102?, 0x6553f100?, 0x0?, 0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000073f60 sp=0x14000073f40 pc=0x10265d418
runtime.goparkunlock(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x1040d2000)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgcscavenge.go:425 +0x5c fp=0x14000073f90 sp=0x14000073f60 pc=0x10261240c
runtime.bgscavenge(0x1400007c000)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgcscavenge.go:658 +0xac fp=0x14000073fb0 sp=0x14000073f90 pc=0x1026129ac
runtime.gcenable.gowrap2()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:205 +0x28 fp=0x14000073fd0 sp=0x14000073fb0 pc=0x102608718
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000073fd0 sp=0x14000073fd0 pc=0x102665174
created by runtime.gcenable in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:205 +0xac

goroutine 5 gp=0x14000003c00 m=nil [finalizer wait]:
runtime.gopark(0x18000725c8?, 0x1000000000000?, 0xf8?, 0x25?, 0x10294124c?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000072590 sp=0x14000072570 pc=0x10265d418
runtime.runfinq()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mfinal.go:196 +0x108 fp=0x140000727d0 sp=0x14000072590 pc=0x102607778
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000727d0 sp=0x140000727d0 pc=0x102665174
created by runtime.createfing in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mfinal.go:166 +0x80

goroutine 6 gp=0x140001b4700 m=nil [chan receive]:
runtime.gopark(0x140001fb540?, 0x14000510438?, 0x48?, 0x47?, 0x102724558?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000746f0 sp=0x140000746d0 pc=0x10265d418
runtime.chanrecv(0x14000042380, 0x0, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:664 +0x42c fp=0x14000074770 sp=0x140000746f0 pc=0x1025f9a1c
runtime.chanrecv1(0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:506 +0x14 fp=0x140000747a0 sp=0x14000074770 pc=0x1025f95b4
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1799 +0x3c fp=0x140000747d0 sp=0x140000747a0 pc=0x10260b99c
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000747d0 sp=0x140000747d0 pc=0x102665174
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1794 +0x78

goroutine 7 gp=0x140001b4e00 m=nil [GC worker (idle)]:
runtime.gopark(0x7336aaf3b537?, 0x3?, 0x3b?, 0x83?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000074f10 sp=0x14000074ef0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x14000074fb0 sp=0x14000074f10 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x14000074fd0 sp=0x14000074fb0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000074fd0 sp=0x14000074fd0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 8 gp=0x140001b4fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x7336aaf903c7?, 0x3?, 0x8e?, 0x1a?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000075710 sp=0x140000756f0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x140000757b0 sp=0x14000075710 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x140000757d0 sp=0x140000757b0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000757d0 sp=0x140000757d0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 18 gp=0x14000286000 m=nil [GC worker (idle)]:
runtime.gopark(0x7336aaf88f91?, 0x1?, 0x78?, 0x9c?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000088f10 sp=0x14000088ef0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x14000088fb0 sp=0x14000088f10 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x14000088fd0 sp=0x14000088fb0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000088fd0 sp=0x14000088fd0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 34 gp=0x140000aa380 m=nil [GC worker (idle)]:
runtime.gopark(0x7336aaf94394?, 0x3?, 0xbb?, 0xec?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c0710 sp=0x140000c06f0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x140000c07b0 sp=0x140000c0710 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x140000c07d0 sp=0x140000c07b0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c07d0 sp=0x140000c07d0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 9 gp=0x140001b5180 m=nil [GC worker (idle)]:
runtime.gopark(0x7336aaf30545?, 0x3?, 0x68?, 0x9a?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000075f10 sp=0x14000075ef0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x14000075fb0 sp=0x14000075f10 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x14000075fd0 sp=0x14000075fb0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000075fd0 sp=0x14000075fd0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 19 gp=0x140002861c0 m=nil [GC worker (idle)]:
runtime.gopark(0x7336aaf80efc?, 0x3?, 0x3a?, 0x5f?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x1400006ef10 sp=0x1400006eef0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x1400006efb0 sp=0x1400006ef10 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x1400006efd0 sp=0x1400006efb0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400006efd0 sp=0x1400006efd0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 35 gp=0x140000aa540 m=nil [GC worker (idle)]:
runtime.gopark(0x7336aaf1cb25?, 0x3?, 0x30?, 0xe5?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c0f10 sp=0x140000c0ef0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x140000c0fb0 sp=0x140000c0f10 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x140000c0fd0 sp=0x140000c0fb0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c0fd0 sp=0x140000c0fd0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 10 gp=0x140001b5340 m=nil [GC worker (idle)]:
runtime.gopark(0x7336a5f22647?, 0x3?, 0x5d?, 0xba?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000bc710 sp=0x140000bc6f0 pc=0x10265d418
runtime.gcBgMarkWorker(0x140000437a0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x140000bc7b0 sp=0x140000bc710 pc=0x10260ac0c
runtime.gcBgMarkStartWorkers.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x140000bc7d0 sp=0x140000bc7b0 pc=0x10260aaf8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000bc7d0 sp=0x140000bc7d0 pc=0x102665174
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140

goroutine 11 gp=0x14000287180 m=nil [select, locked to thread]:
runtime.gopark(0x140000c3fa0?, 0x2?, 0xa8?, 0x3e?, 0x140000c3f90?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c3e20 sp=0x140000c3e00 pc=0x10265d418
runtime.selectgo(0x140000c3fa0, 0x140000c3f8c, 0x0?, 0x0, 0x0?, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x140000c3f50 sp=0x140000c3e20 pc=0x10263cbb4
runtime.ensureSigM.func1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/signal_unix.go:1085 +0x160 fp=0x140000c3fd0 sp=0x140000c3f50 pc=0x1026579b0
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c3fd0 sp=0x140000c3fd0 pc=0x102665174
created by runtime.ensureSigM in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/signal_unix.go:1068 +0xd8

goroutine 36 gp=0x140000aa700 m=4 mp=0x14000079808 [syscall]:
runtime.sigNoteSleep(0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/os_darwin.go:133 +0x20 fp=0x14000071790 sp=0x14000071750 pc=0x102623a70
os/signal.signal_recv()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/sigqueue.go:149 +0x2c fp=0x140000717b0 sp=0x14000071790 pc=0x10265f99c
os/signal.loop()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/signal/signal_unix.go:23 +0x1c fp=0x140000717d0 sp=0x140000717b0 pc=0x102940dfc
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000717d0 sp=0x140000717d0 pc=0x102665174
created by os/signal.Notify.func1.1 in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/signal/signal.go:152 +0x28

goroutine 37 gp=0x140000aa8c0 m=nil [chan receive]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000071ee0 sp=0x14000071ec0 pc=0x10265d418
runtime.chanrecv(0x140003c3f10, 0x0, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:664 +0x42c fp=0x14000071f60 sp=0x14000071ee0 pc=0x1025f9a1c
runtime.chanrecv1(0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:506 +0x14 fp=0x14000071f90 sp=0x14000071f60 pc=0x1025f95b4
github.com/ollama/ollama/server.Serve.func1()
	/Users/runner/work/ollama/ollama/server/routes.go:1374 +0x44 fp=0x14000071fd0 sp=0x14000071f90 pc=0x1030f3fa4
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000071fd0 sp=0x14000071fd0 pc=0x102665174
created by github.com/ollama/ollama/server.Serve in goroutine 1
	/Users/runner/work/ollama/ollama/server/routes.go:1373 +0x544

goroutine 38 gp=0x140000aaa80 m=nil [select]:
runtime.gopark(0x14000049f30?, 0x3?, 0x0?, 0x0?, 0x14000049ca2?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x1400020db10 sp=0x1400020daf0 pc=0x10265d418
runtime.selectgo(0x1400020df30, 0x14000049c9c, 0x14000132540?, 0x0, 0x1033ab6fd?, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x1400020dc40 sp=0x1400020db10 pc=0x10263cbb4
github.com/ollama/ollama/server.(*Scheduler).processPending(0x1400046d560, {0x103815f10, 0x1400059a410})
	/Users/runner/work/ollama/ollama/server/sched.go:118 +0x9c fp=0x1400020dfa0 sp=0x1400020dc40 pc=0x1030f7dfc
github.com/ollama/ollama/server.(*Scheduler).Run.func1()
	/Users/runner/work/ollama/ollama/server/sched.go:108 +0x28 fp=0x1400020dfd0 sp=0x1400020dfa0 pc=0x1030f7d48
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400020dfd0 sp=0x1400020dfd0 pc=0x102665174
created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1
	/Users/runner/work/ollama/ollama/server/sched.go:107 +0xc0

goroutine 39 gp=0x140000aac40 m=nil [select]:
runtime.gopark(0x140000caf40?, 0x3?, 0x88?, 0xab?, 0x140000cacba?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000cab30 sp=0x140000cab10 pc=0x10265d418
runtime.selectgo(0x140000caf40, 0x140000cacb4, 0x0?, 0x0, 0x0?, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x140000cac60 sp=0x140000cab30 pc=0x10263cbb4
github.com/ollama/ollama/server.(*Scheduler).processCompleted(0x1400046d560, {0x103815f10, 0x1400059a410})
	/Users/runner/work/ollama/ollama/server/sched.go:318 +0xa8 fp=0x140000cafa0 sp=0x140000cac60 pc=0x1030f8f48
github.com/ollama/ollama/server.(*Scheduler).Run.func2()
	/Users/runner/work/ollama/ollama/server/sched.go:112 +0x28 fp=0x140000cafd0 sp=0x140000cafa0 pc=0x1030f7d08
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000cafd0 sp=0x140000cafd0 pc=0x102665174
created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1
	/Users/runner/work/ollama/ollama/server/sched.go:111 +0x118

goroutine 40 gp=0x140000aae00 m=nil [chan receive]:
runtime.gopark(0x100048ca8?, 0x12adcd3d8?, 0x8?, 0xc1?, 0x1042aa330?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000048c00 sp=0x14000048be0 pc=0x10265d418
runtime.chanrecv(0x140005d6460, 0x140000491c8, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:664 +0x42c fp=0x14000048c80 sp=0x14000048c00 pc=0x1025f9a1c
runtime.chanrecv2(0x140002b9050?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:511 +0x14 fp=0x14000048cb0 sp=0x14000048c80 pc=0x1025f95d4
github.com/ollama/ollama/server.(*Server).ChatHandler(0x1400068c400, 0x140001c8700)
	/Users/runner/work/ollama/ollama/server/routes.go:1728 +0x192c fp=0x14000049510 sp=0x14000048cb0 pc=0x1030f61fc
github.com/ollama/ollama/server.(*Server).ChatHandler-fm(0x140002b3801?)
	<autogenerated>:1 +0x30 fp=0x14000049530 sp=0x14000049510 pc=0x103108730
github.com/gin-gonic/gin.(*Context).Next(0x140001c8700)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185 +0x3c fp=0x14000049550 sp=0x14000049530 pc=0x102c59c9c
github.com/ollama/ollama/server.(*Server).GenerateRoutes.ChatMiddleware.func6(0x140001c8700)
	/Users/runner/work/ollama/ollama/openai/openai.go:1063 +0x1e0 fp=0x140000496b0 sp=0x14000049550 pc=0x1031023f0
github.com/gin-gonic/gin.(*Context).Next(0x140001c8700)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185 +0x3c fp=0x140000496d0 sp=0x140000496b0 pc=0x102c59c9c
github.com/ollama/ollama/server.(*Server).GenerateRoutes.allowedHostsMiddleware.func5(0x140001c8700)
	/Users/runner/work/ollama/ollama/server/routes.go:1210 +0x160 fp=0x14000049730 sp=0x140000496d0 pc=0x1030f36f0
github.com/gin-gonic/gin.(*Context).Next(...)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185
github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1(0x140001c8700)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/recovery.go:102 +0x78 fp=0x14000049780 sp=0x14000049730 pc=0x102c66788
github.com/gin-gonic/gin.(*Context).Next(...)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185
github.com/gin-gonic/gin.LoggerWithConfig.func1(0x140001c8700)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/logger.go:249 +0xb4 fp=0x14000049940 sp=0x14000049780 pc=0x102c65b34
github.com/gin-gonic/gin.(*Context).Next(...)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185
github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0x14000112000, 0x140001c8700)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/gin.go:633 +0x6c4 fp=0x14000049ac0 sp=0x14000049940 pc=0x102c64fd4
github.com/gin-gonic/gin.(*Engine).ServeHTTP(0x14000112000, {0x103813c48, 0x14000174460}, 0x1400029e140)
	/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/gin.go:589 +0x170 fp=0x14000049b00 sp=0x14000049ac0 pc=0x102c647a0
net/http.(*ServeMux).ServeHTTP(0x10?, {0x103813c48, 0x14000174460}, 0x1400029e140)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:2822 +0x1b4 fp=0x14000049b50 sp=0x14000049b00 pc=0x1029165d4
net/http.serverHandler.ServeHTTP({0x103810270?}, {0x103813c48?, 0x14000174460?}, 0x6?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:3301 +0xbc fp=0x14000049b80 sp=0x14000049b50 pc=0x1029322bc
net/http.(*conn).serve(0x1400013fef0, {0x103815ed8, 0x14000598630})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:2102 +0x52c fp=0x14000049fa0 sp=0x14000049b80 pc=0x1029131ec
net/http.(*Server).Serve.gowrap3()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:3454 +0x30 fp=0x14000049fd0 sp=0x14000049fa0 pc=0x1029183b0
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000049fd0 sp=0x14000049fd0 pc=0x102665174
created by net/http.(*Server).Serve in goroutine 1
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:3454 +0x3d8

goroutine 41 gp=0x140000aafc0 m=nil [IO wait]:
runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102680c50?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c1d80 sp=0x140000c1d60 pc=0x10265d418
runtime.netpollblock(0x0?, 0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:575 +0x158 fp=0x140000c1dc0 sp=0x140000c1d80 pc=0x1026229d8
internal/poll.runtime_pollWait(0x12af966b8, 0x72)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:351 +0xa0 fp=0x140000c1df0 sp=0x140000c1dc0 pc=0x10265c5d0
internal/poll.(*pollDesc).wait(0x140005aa100?, 0x140005988e1?, 0x0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x140000c1e20 sp=0x140000c1df0 pc=0x1026dc848
internal/poll.(*pollDesc).waitRead(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x140005aa100, {0x140005988e1, 0x1, 0x1})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x140000c1ec0 sp=0x140000c1e20 pc=0x1026ddafc
net.(*netFD).Read(0x140005aa100, {0x140005988e1?, 0x0?, 0x0?})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/fd_posix.go:55 +0x28 fp=0x140000c1f10 sp=0x140000c1ec0 pc=0x10274e958
net.(*conn).Read(0x14000294008, {0x140005988e1?, 0x0?, 0x0?})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/net.go:194 +0x34 fp=0x140000c1f60 sp=0x140000c1f10 pc=0x10275b824
net/http.(*connReader).backgroundRead(0x140005988d0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:690 +0x40 fp=0x140000c1fb0 sp=0x140000c1f60 pc=0x10290db60
net/http.(*connReader).startBackgroundRead.gowrap2()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:686 +0x28 fp=0x140000c1fd0 sp=0x140000c1fb0 pc=0x10290da48
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c1fd0 sp=0x140000c1fd0 pc=0x102665174
created by net/http.(*connReader).startBackgroundRead in goroutine 40
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:686 +0xc4

goroutine 48 gp=0x140000ab500 m=nil [select]:
runtime.gopark(0x1400061ff38?, 0x2?, 0xa8?, 0xfd?, 0x1400061fee4?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x1400061fd70 sp=0x1400061fd50 pc=0x10265d418
runtime.selectgo(0x1400061ff38, 0x1400061fee0, 0x14000050040?, 0x0, 0x140001f6060?, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x1400061fea0 sp=0x1400061fd70 pc=0x10263cbb4
net/http.(*persistConn).writeLoop(0x140005da000)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:2590 +0x9c fp=0x1400061ffb0 sp=0x1400061fea0 pc=0x10292dcbc
net/http.(*Transport).dialConn.gowrap3()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1945 +0x28 fp=0x1400061ffd0 sp=0x1400061ffb0 pc=0x10292ae78
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400061ffd0 sp=0x1400061ffd0 pc=0x102665174
created by net/http.(*Transport).dialConn in goroutine 20
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1945 +0x120c

goroutine 14 gp=0x140000abc00 m=nil [IO wait]:
runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102680c50?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c5be0 sp=0x140000c5bc0 pc=0x10265d418
runtime.netpollblock(0x0?, 0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:575 +0x158 fp=0x140000c5c20 sp=0x140000c5be0 pc=0x1026229d8
internal/poll.runtime_pollWait(0x12af965a0, 0x72)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:351 +0xa0 fp=0x140000c5c50 sp=0x140000c5c20 pc=0x10265c5d0
internal/poll.(*pollDesc).wait(0x140003749c0?, 0x14001208000?, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x140000c5c80 sp=0x140000c5c50 pc=0x1026dc848
internal/poll.(*pollDesc).waitRead(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x140003749c0, {0x14001208000, 0x8000, 0x8000})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x140000c5d20 sp=0x140000c5c80 pc=0x1026ddafc
os.(*File).read(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file_posix.go:29
os.(*File).Read(0x14000294038, {0x14001208000?, 0x182?, 0x8000?})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file.go:124 +0x6c fp=0x140000c5d60 sp=0x140000c5d20 pc=0x1026e6f7c
io.copyBuffer({0x10380e1c0, 0x1400000e8a0}, {0x10380bee0, 0x140005ac000}, {0x0, 0x0, 0x0})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:429 +0x18c fp=0x140000c5de0 sp=0x140000c5d60 pc=0x1026d264c
io.Copy(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:388
os.genericWriteTo(0x1?, {0x10380e1c0, 0x1400000e8a0})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file.go:275 +0x58 fp=0x140000c5e30 sp=0x140000c5de0 pc=0x1026e77e8
os.(*File).WriteTo(0x103fad4d0?, {0x10380e1c0?, 0x1400000e8a0?})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file.go:253 +0x60 fp=0x140000c5e60 sp=0x140000c5e30 pc=0x1026e7720
io.copyBuffer({0x10380e1c0, 0x1400000e8a0}, {0x10380bbc8, 0x14000294038}, {0x0, 0x0, 0x0})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:411 +0x98 fp=0x140000c5ee0 sp=0x140000c5e60 pc=0x1026d2558
io.Copy(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:388
os/exec.(*Cmd).writerDescriptor.func1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:596 +0x44 fp=0x140000c5f40 sp=0x140000c5ee0 pc=0x1029cd584
os/exec.(*Cmd).Start.func2(0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:749 +0x34 fp=0x140000c5fb0 sp=0x140000c5f40 pc=0x1029cdfb4
os/exec.(*Cmd).Start.gowrap1()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:761 +0x30 fp=0x140000c5fd0 sp=0x140000c5fb0 pc=0x1029cdf40
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c5fd0 sp=0x140000c5fd0 pc=0x102665174
created by os/exec.(*Cmd).Start in goroutine 38
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:748 +0x76c

goroutine 15 gp=0x140000abdc0 m=9 mp=0x14000580008 [syscall]:
syscall.syscall6(0x90010100000050?, 0x12ae29a38?, 0x10429c5c0?, 0x90?, 0x14000580008?, 0x140002b0000?, 0x140000c1608?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/sys_darwin.go:60 +0x50 fp=0x140000c15a0 sp=0x140000c14e0 pc=0x102660d50
syscall.wait4(0x140000c1638?, 0x1026e6674?, 0x90?, 0x1037cebc0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/syscall/zsyscall_darwin_arm64.go:44 +0x4c fp=0x140000c1600 sp=0x140000c15a0 pc=0x10267ccfc
syscall.Wait4(0x140000c1668?, 0x140000c166c, 0x140000c1678?, 0x1025f8dc4?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/syscall/syscall_bsd.go:144 +0x28 fp=0x140000c1640 sp=0x140000c1600 pc=0x10267a7c8
os.(*Process).pidWait.func1(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec_unix.go:68
os.ignoringEINTR2[...](...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file_posix.go:261
os.(*Process).pidWait(0x140005bc700)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec_unix.go:67 +0xa4 fp=0x140000c16a0 sp=0x140000c1640 pc=0x1026e66c4
os.(*Process).wait(0x140000c1738?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec_unix.go:30 +0x24 fp=0x140000c16c0 sp=0x140000c16a0 pc=0x1026e65c4
os.(*Process).Wait(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec.go:358
os/exec.(*Cmd).Wait(0x140001ff380)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:922 +0x38 fp=0x140000c1730 sp=0x140000c16c0 pc=0x1029ce588
github.com/ollama/ollama/llm.NewLlamaServer.func1()
	/Users/runner/work/ollama/ollama/llm/server.go:461 +0x2c fp=0x140000c17d0 sp=0x140000c1730 pc=0x102a46ccc
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c17d0 sp=0x140000c17d0 pc=0x102665174
created by github.com/ollama/ollama/llm.NewLlamaServer in goroutine 38
	/Users/runner/work/ollama/ollama/llm/server.go:460 +0x2df8

goroutine 47 gp=0x14000502540 m=nil [IO wait]:
runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102680c50?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c7ac0 sp=0x140000c7aa0 pc=0x10265d418
runtime.netpollblock(0x0?, 0x0?, 0x0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:575 +0x158 fp=0x140000c7b00 sp=0x140000c7ac0 pc=0x1026229d8
internal/poll.runtime_pollWait(0x12af96488, 0x72)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:351 +0xa0 fp=0x140000c7b30 sp=0x140000c7b00 pc=0x10265c5d0
internal/poll.(*pollDesc).wait(0x140005d4080?, 0x14000614000?, 0x0)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x140000c7b60 sp=0x140000c7b30 pc=0x1026dc848
internal/poll.(*pollDesc).waitRead(...)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x140005d4080, {0x14000614000, 0x1000, 0x1000})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x140000c7c00 sp=0x140000c7b60 pc=0x1026ddafc
net.(*netFD).Read(0x140005d4080, {0x14000614000?, 0x80?, 0x8?})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/fd_posix.go:55 +0x28 fp=0x140000c7c50 sp=0x140000c7c00 pc=0x10274e958
net.(*conn).Read(0x14000612000, {0x14000614000?, 0x0?, 0x140000c7d28?})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/net.go:194 +0x34 fp=0x140000c7ca0 sp=0x140000c7c50 pc=0x10275b824
net/http.(*persistConn).Read(0x140005da000, {0x14000614000?, 0x10380bae8?, 0x103faa910?})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:2122 +0x4c fp=0x140000c7d00 sp=0x140000c7ca0 pc=0x10292b8fc
bufio.(*Reader).fill(0x1400046c060)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/bufio/bufio.go:113 +0xf8 fp=0x140000c7d40 sp=0x140000c7d00 pc=0x1027712b8
bufio.(*Reader).Peek(0x1400046c060, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/bufio/bufio.go:152 +0x60 fp=0x140000c7d60 sp=0x140000c7d40 pc=0x102771420
net/http.(*persistConn).readLoop(0x140005da000)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:2275 +0x12c fp=0x140000c7fb0 sp=0x140000c7d60 pc=0x10292c5ac
net/http.(*Transport).dialConn.gowrap2()
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1944 +0x28 fp=0x140000c7fd0 sp=0x140000c7fb0 pc=0x10292aed8
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c7fd0 sp=0x140000c7fd0 pc=0x102665174
created by net/http.(*Transport).dialConn in goroutine 20
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1944 +0x11c4

goroutine 22 gp=0x14000582380 m=nil [chan receive]:
runtime.gopark(0x3?, 0x14000502540?, 0x0?, 0x0?, 0x140005d62a0?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000bd6b0 sp=0x140000bd690 pc=0x10265d418
runtime.chanrecv(0x140000d6230, 0x0, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:664 +0x42c fp=0x140000bd730 sp=0x140000bd6b0 pc=0x1025f9a1c
runtime.chanrecv1(0x0?, 0x100010000?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:506 +0x14 fp=0x140000bd760 sp=0x140000bd730 pc=0x1025f95b4
github.com/ollama/ollama/server.(*Scheduler).load.func1.1()
	/Users/runner/work/ollama/ollama/server/sched.go:500 +0x44 fp=0x140000bd7d0 sp=0x140000bd760 pc=0x1030fb184
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000bd7d0 sp=0x140000bd7d0 pc=0x102665174
created by github.com/ollama/ollama/server.(*Scheduler).load.func1 in goroutine 16
	/Users/runner/work/ollama/ollama/server/sched.go:499 +0x2fc

r0      0x17287c608
r1      0xffffffffffffffe0
r2      0xc0040c7dfbd
r3      0x1
r4      0x4
r5      0xe9875915
r6      0x600001202280
r7      0x6000012020e0
r8      0x17287c280
r9      0x17287c610
r10     0x1
r11     0x17287c650
r12     0x17
r13     0x10320bcbc
r14     0x0
r15     0x7fb
r16     0xe9875915
r17     0x114
r18     0x0
r19     0x17287c280
r20     0x10358a1a6
r21     0x17287c310
r22     0x600003119408
r23     0x17287ecd8
r24     0x17287c470
r25     0x103569720
r26     0x17287ed78
r27     0x600001c1b3b1
r28     0x17287c2f0
r29     0x17287c1a0
lr      0x10320bcd0
sp      0x17287c160
pc      0x10320b98c
fault   0xfffffffffffffff8
username@COMPUTER ~ %


The log file contains less output:

time=2025-08-18T10:10:27.256+02:00 level=INFO source=routes.go:1304 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/matthias/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2025-08-18T10:10:27.258+02:00 level=INFO source=images.go:477 msg="total blobs: 38"
time=2025-08-18T10:10:27.259+02:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2025-08-18T10:10:27.259+02:00 level=INFO source=routes.go:1357 msg="Listening on [::]:11434 (version 0.11.4)"
time=2025-08-18T10:10:27.372+02:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="16.0 GiB" available="16.0 GiB"
time=2025-08-18T10:10:27.372+02:00 level=INFO source=routes.go:1398 msg="entering low vram mode" "total vram"="16.0 GiB" threshold="20.0 GiB"
[GIN] 2025/08/18 - 10:10:31 | 404 |     986.791µs |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/08/18 - 10:10:31 | 200 |    3.604083ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/08/18 - 10:10:32 | 404 |    2.616208ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/08/18 - 10:10:34 | 404 |    1.746916ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/08/18 - 10:10:37 | 200 |   54.450084ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/08/18 - 10:10:39 | 200 |    1.598291ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/08/18 - 10:10:39 | 200 |   53.516334ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/08/18 - 10:10:39 | 200 |   35.498708ms |       127.0.0.1 | POST     "/api/show"
time=2025-08-18T10:10:39.629+02:00 level=INFO source=sched.go:786 msg="new model will fit in available VRAM in single GPU, loading" model=/Users/matthias/.ollama/models/blobs/sha256-140cb395a3a3dcdb3de66e44c60d7265286c9d49f951c1899ec0d8c1f16b2feb gpu=0 parallel=1 available=17179885568 required="1.5 GiB"
time=2025-08-18T10:10:39.629+02:00 level=INFO source=server.go:135 msg="system memory" total="24.0 GiB" free="4.0 GiB" free_swap="0 B"
time=2025-08-18T10:10:39.629+02:00 level=INFO source=server.go:175 msg=offload library=metal layers.requested=-1 layers.model=19 layers.offload=19 layers.split="" memory.available="[16.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="1.5 GiB" memory.required.partial="1.5 GiB" memory.required.kv="27.0 MiB" memory.required.allocations="[1.5 GiB]" memory.weights.total="511.5 MiB" memory.weights.repeating="191.5 MiB" memory.weights.nonrepeating="320.0 MiB" memory.graph.full="513.2 MiB" memory.graph.partial="513.2 MiB"
time=2025-08-18T10:10:39.650+02:00 level=INFO source=server.go:438 msg="starting llama server" cmd="/Applications/Ollama.app/Contents/Resources/ollama runner --ollama-engine --model /Users/matthias/.ollama/models/blobs/sha256-140cb395a3a3dcdb3de66e44c60d7265286c9d49f951c1899ec0d8c1f16b2feb --ctx-size 4096 --batch-size 512 --n-gpu-layers 19 --threads 4 --parallel 1 --port 53339"
time=2025-08-18T10:10:39.651+02:00 level=INFO source=sched.go:481 msg="loaded runners" count=1
time=2025-08-18T10:10:39.652+02:00 level=INFO source=server.go:598 msg="waiting for llama runner to start responding"
time=2025-08-18T10:10:39.652+02:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server not responding"
time=2025-08-18T10:10:39.660+02:00 level=INFO source=runner.go:925 msg="starting ollama engine"
time=2025-08-18T10:10:39.661+02:00 level=INFO source=runner.go:983 msg="Server listening on 127.0.0.1:53339"
time=2025-08-18T10:10:39.679+02:00 level=INFO source=ggml.go:92 msg="" architecture=gemma3 file_type=F16 name=Gemma-3-270M-It description="" num_tensors=236 num_key_values=42
time=2025-08-18T10:10:39.681+02:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 Metal.0.BF16=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 compiler=cgo(clang)
time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:365 msg="offloading 18 repeating layers to GPU"
time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:371 msg="offloading output layer to GPU"
time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:376 msg="offloaded 19/19 layers to GPU"
time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:379 msg="model weights" buffer=CPU size="320.0 MiB"
time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:379 msg="model weights" buffer=Metal size="511.5 MiB"
ggml_metal_init: allocating
ggml_metal_init: picking default device: Apple M3
ggml_metal_load_library: using embedded metal library
time=2025-08-18T10:10:39.903+02:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server loading model"
ggml_metal_init: GPU name:   Apple M3
ggml_metal_init: GPU family: MTLGPUFamilyApple9  (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction   = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets    = false
ggml_metal_init: has bfloat            = true
ggml_metal_init: use bfloat            = true
ggml_metal_init: hasUnifiedMemory      = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 17179.89 MB
time=2025-08-18T10:10:52.217+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=Metal buffer_type=Metal size="48.5 MiB"
time=2025-08-18T10:10:52.217+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=BLAS buffer_type=CPU size="1.2 MiB"
time=2025-08-18T10:10:52.217+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=CPU buffer_type=CPU size="0 B"
time=2025-08-18T10:10:52.970+02:00 level=INFO source=server.go:637 msg="llama runner started in 13.32 seconds"
[GIN] 2025/08/18 - 10:10:53 | 200 | 13.657210417s |       127.0.0.1 | POST     "/api/chat"

OS

macOS

GPU

Apple

CPU

Apple

Ollama version

0.11.4
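
Possible reproduction (sketch)

The client code was not included in the report. The crashing goroutine in the trace is inside `llama.SchemaToGrammar`, called from `llmServer.Completion`, which suggests the request carried a JSON schema for structured output. Below is a minimal sketch of such a request against the OpenAI-compatible route, using only the Python standard library. The base URL and model name come from the report; the prompt, the schema, and the helper names `build_chat_request` and `send` are hypothetical and chosen only for illustration. The real request may have looked different.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"            # from the report
MODEL = "hf.co/unsloth/gemma-3-270m-it-GGUF:F16"  # from the report

# Hypothetical schema, chosen only for illustration.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer"],
}


def build_chat_request(prompt, schema):
    """Build an OpenAI-style chat completion body that asks for schema-constrained output."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "reply", "schema": schema},
        },
    }


def send(body, base_url=BASE_URL):
    """POST the body to the chat completions route and return the decoded JSON reply."""
    req = urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires a running `ollama serve`; on an affected build this may crash the server.
    print(send(build_chat_request("Say hello.", EXAMPLE_SCHEMA)))
```

If the server survives the same request with `response_format` removed, that would point at the schema-to-grammar conversion rather than at the OpenAI-compatible route itself.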

/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000088f10 sp=0x14000088ef0 pc=0x10265d418 runtime.gcBgMarkWorker(0x140000437a0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x14000088fb0 sp=0x14000088f10 pc=0x10260ac0c runtime.gcBgMarkStartWorkers.gowrap1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x14000088fd0 sp=0x14000088fb0 pc=0x10260aaf8 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000088fd0 sp=0x14000088fd0 pc=0x102665174 created by runtime.gcBgMarkStartWorkers in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140 goroutine 34 gp=0x140000aa380 m=nil [GC worker (idle)]: runtime.gopark(0x7336aaf94394?, 0x3?, 0xbb?, 0xec?, 0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c0710 sp=0x140000c06f0 pc=0x10265d418 runtime.gcBgMarkWorker(0x140000437a0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x140000c07b0 sp=0x140000c0710 pc=0x10260ac0c runtime.gcBgMarkStartWorkers.gowrap1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x140000c07d0 sp=0x140000c07b0 pc=0x10260aaf8 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c07d0 sp=0x140000c07d0 pc=0x102665174 created by runtime.gcBgMarkStartWorkers in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140 goroutine 9 gp=0x140001b5180 m=nil [GC worker (idle)]: runtime.gopark(0x7336aaf30545?, 0x3?, 0x68?, 0x9a?, 0x0?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000075f10 sp=0x14000075ef0 pc=0x10265d418 runtime.gcBgMarkWorker(0x140000437a0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x14000075fb0 sp=0x14000075f10 pc=0x10260ac0c runtime.gcBgMarkStartWorkers.gowrap1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x14000075fd0 sp=0x14000075fb0 pc=0x10260aaf8 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000075fd0 sp=0x14000075fd0 pc=0x102665174 created by runtime.gcBgMarkStartWorkers in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140 goroutine 19 gp=0x140002861c0 m=nil [GC worker (idle)]: runtime.gopark(0x7336aaf80efc?, 0x3?, 0x3a?, 0x5f?, 0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x1400006ef10 sp=0x1400006eef0 pc=0x10265d418 runtime.gcBgMarkWorker(0x140000437a0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x1400006efb0 sp=0x1400006ef10 pc=0x10260ac0c runtime.gcBgMarkStartWorkers.gowrap1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x1400006efd0 sp=0x1400006efb0 pc=0x10260aaf8 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400006efd0 sp=0x1400006efd0 pc=0x102665174 created by runtime.gcBgMarkStartWorkers in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140 goroutine 35 gp=0x140000aa540 m=nil [GC worker (idle)]: runtime.gopark(0x7336aaf1cb25?, 0x3?, 0x30?, 0xe5?, 0x0?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c0f10 sp=0x140000c0ef0 pc=0x10265d418 runtime.gcBgMarkWorker(0x140000437a0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x140000c0fb0 sp=0x140000c0f10 pc=0x10260ac0c runtime.gcBgMarkStartWorkers.gowrap1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x140000c0fd0 sp=0x140000c0fb0 pc=0x10260aaf8 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c0fd0 sp=0x140000c0fd0 pc=0x102665174 created by runtime.gcBgMarkStartWorkers in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140 goroutine 10 gp=0x140001b5340 m=nil [GC worker (idle)]: runtime.gopark(0x7336a5f22647?, 0x3?, 0x5d?, 0xba?, 0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000bc710 sp=0x140000bc6f0 pc=0x10265d418 runtime.gcBgMarkWorker(0x140000437a0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1423 +0xdc fp=0x140000bc7b0 sp=0x140000bc710 pc=0x10260ac0c runtime.gcBgMarkStartWorkers.gowrap1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x28 fp=0x140000bc7d0 sp=0x140000bc7b0 pc=0x10260aaf8 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000bc7d0 sp=0x140000bc7d0 pc=0x102665174 created by runtime.gcBgMarkStartWorkers in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/mgc.go:1339 +0x140 goroutine 11 gp=0x14000287180 m=nil [select, locked to thread]: runtime.gopark(0x140000c3fa0?, 0x2?, 0xa8?, 0x3e?, 0x140000c3f90?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c3e20 sp=0x140000c3e00 pc=0x10265d418 runtime.selectgo(0x140000c3fa0, 0x140000c3f8c, 0x0?, 0x0, 0x0?, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x140000c3f50 sp=0x140000c3e20 pc=0x10263cbb4 runtime.ensureSigM.func1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/signal_unix.go:1085 +0x160 fp=0x140000c3fd0 sp=0x140000c3f50 pc=0x1026579b0 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c3fd0 sp=0x140000c3fd0 pc=0x102665174 created by runtime.ensureSigM in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/signal_unix.go:1068 +0xd8 goroutine 36 gp=0x140000aa700 m=4 mp=0x14000079808 [syscall]: runtime.sigNoteSleep(0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/os_darwin.go:133 +0x20 fp=0x14000071790 sp=0x14000071750 pc=0x102623a70 os/signal.signal_recv() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/sigqueue.go:149 +0x2c fp=0x140000717b0 sp=0x14000071790 pc=0x10265f99c os/signal.loop() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/signal/signal_unix.go:23 +0x1c fp=0x140000717d0 sp=0x140000717b0 pc=0x102940dfc runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000717d0 sp=0x140000717d0 pc=0x102665174 created by os/signal.Notify.func1.1 in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/signal/signal.go:152 +0x28 goroutine 37 gp=0x140000aa8c0 m=nil [chan receive]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000071ee0 sp=0x14000071ec0 pc=0x10265d418 runtime.chanrecv(0x140003c3f10, 0x0, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:664 +0x42c fp=0x14000071f60 sp=0x14000071ee0 pc=0x1025f9a1c runtime.chanrecv1(0x0?, 0x0?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:506 +0x14 fp=0x14000071f90 sp=0x14000071f60 pc=0x1025f95b4 github.com/ollama/ollama/server.Serve.func1() /Users/runner/work/ollama/ollama/server/routes.go:1374 +0x44 fp=0x14000071fd0 sp=0x14000071f90 pc=0x1030f3fa4 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000071fd0 sp=0x14000071fd0 pc=0x102665174 created by github.com/ollama/ollama/server.Serve in goroutine 1 /Users/runner/work/ollama/ollama/server/routes.go:1373 +0x544 goroutine 38 gp=0x140000aaa80 m=nil [select]: runtime.gopark(0x14000049f30?, 0x3?, 0x0?, 0x0?, 0x14000049ca2?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x1400020db10 sp=0x1400020daf0 pc=0x10265d418 runtime.selectgo(0x1400020df30, 0x14000049c9c, 0x14000132540?, 0x0, 0x1033ab6fd?, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x1400020dc40 sp=0x1400020db10 pc=0x10263cbb4 github.com/ollama/ollama/server.(*Scheduler).processPending(0x1400046d560, {0x103815f10, 0x1400059a410}) /Users/runner/work/ollama/ollama/server/sched.go:118 +0x9c fp=0x1400020dfa0 sp=0x1400020dc40 pc=0x1030f7dfc github.com/ollama/ollama/server.(*Scheduler).Run.func1() /Users/runner/work/ollama/ollama/server/sched.go:108 +0x28 fp=0x1400020dfd0 sp=0x1400020dfa0 pc=0x1030f7d48 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400020dfd0 sp=0x1400020dfd0 pc=0x102665174 created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1 /Users/runner/work/ollama/ollama/server/sched.go:107 +0xc0 goroutine 39 gp=0x140000aac40 m=nil [select]: runtime.gopark(0x140000caf40?, 0x3?, 0x88?, 0xab?, 0x140000cacba?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000cab30 sp=0x140000cab10 pc=0x10265d418 runtime.selectgo(0x140000caf40, 0x140000cacb4, 0x0?, 0x0, 0x0?, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x140000cac60 sp=0x140000cab30 pc=0x10263cbb4 github.com/ollama/ollama/server.(*Scheduler).processCompleted(0x1400046d560, {0x103815f10, 0x1400059a410}) /Users/runner/work/ollama/ollama/server/sched.go:318 +0xa8 fp=0x140000cafa0 sp=0x140000cac60 pc=0x1030f8f48 github.com/ollama/ollama/server.(*Scheduler).Run.func2() /Users/runner/work/ollama/ollama/server/sched.go:112 +0x28 fp=0x140000cafd0 sp=0x140000cafa0 pc=0x1030f7d08 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000cafd0 sp=0x140000cafd0 pc=0x102665174 created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1 /Users/runner/work/ollama/ollama/server/sched.go:111 +0x118 goroutine 40 gp=0x140000aae00 m=nil [chan receive]: runtime.gopark(0x100048ca8?, 0x12adcd3d8?, 0x8?, 0xc1?, 0x1042aa330?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x14000048c00 sp=0x14000048be0 pc=0x10265d418 runtime.chanrecv(0x140005d6460, 0x140000491c8, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:664 +0x42c fp=0x14000048c80 sp=0x14000048c00 pc=0x1025f9a1c runtime.chanrecv2(0x140002b9050?, 0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:511 +0x14 fp=0x14000048cb0 sp=0x14000048c80 pc=0x1025f95d4 github.com/ollama/ollama/server.(*Server).ChatHandler(0x1400068c400, 0x140001c8700) /Users/runner/work/ollama/ollama/server/routes.go:1728 +0x192c fp=0x14000049510 sp=0x14000048cb0 pc=0x1030f61fc github.com/ollama/ollama/server.(*Server).ChatHandler-fm(0x140002b3801?) 
<autogenerated>:1 +0x30 fp=0x14000049530 sp=0x14000049510 pc=0x103108730 github.com/gin-gonic/gin.(*Context).Next(0x140001c8700) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185 +0x3c fp=0x14000049550 sp=0x14000049530 pc=0x102c59c9c github.com/ollama/ollama/server.(*Server).GenerateRoutes.ChatMiddleware.func6(0x140001c8700) /Users/runner/work/ollama/ollama/openai/openai.go:1063 +0x1e0 fp=0x140000496b0 sp=0x14000049550 pc=0x1031023f0 github.com/gin-gonic/gin.(*Context).Next(0x140001c8700) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185 +0x3c fp=0x140000496d0 sp=0x140000496b0 pc=0x102c59c9c github.com/ollama/ollama/server.(*Server).GenerateRoutes.allowedHostsMiddleware.func5(0x140001c8700) /Users/runner/work/ollama/ollama/server/routes.go:1210 +0x160 fp=0x14000049730 sp=0x140000496d0 pc=0x1030f36f0 github.com/gin-gonic/gin.(*Context).Next(...) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185 github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1(0x140001c8700) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/recovery.go:102 +0x78 fp=0x14000049780 sp=0x14000049730 pc=0x102c66788 github.com/gin-gonic/gin.(*Context).Next(...) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185 github.com/gin-gonic/gin.LoggerWithConfig.func1(0x140001c8700) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/logger.go:249 +0xb4 fp=0x14000049940 sp=0x14000049780 pc=0x102c65b34 github.com/gin-gonic/gin.(*Context).Next(...) 
/Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/context.go:185 github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0x14000112000, 0x140001c8700) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/gin.go:633 +0x6c4 fp=0x14000049ac0 sp=0x14000049940 pc=0x102c64fd4 github.com/gin-gonic/gin.(*Engine).ServeHTTP(0x14000112000, {0x103813c48, 0x14000174460}, 0x1400029e140) /Users/runner/go/pkg/mod/github.com/gin-gonic/gin@v1.10.0/gin.go:589 +0x170 fp=0x14000049b00 sp=0x14000049ac0 pc=0x102c647a0 net/http.(*ServeMux).ServeHTTP(0x10?, {0x103813c48, 0x14000174460}, 0x1400029e140) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:2822 +0x1b4 fp=0x14000049b50 sp=0x14000049b00 pc=0x1029165d4 net/http.serverHandler.ServeHTTP({0x103810270?}, {0x103813c48?, 0x14000174460?}, 0x6?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:3301 +0xbc fp=0x14000049b80 sp=0x14000049b50 pc=0x1029322bc net/http.(*conn).serve(0x1400013fef0, {0x103815ed8, 0x14000598630}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:2102 +0x52c fp=0x14000049fa0 sp=0x14000049b80 pc=0x1029131ec net/http.(*Server).Serve.gowrap3() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:3454 +0x30 fp=0x14000049fd0 sp=0x14000049fa0 pc=0x1029183b0 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x14000049fd0 sp=0x14000049fd0 pc=0x102665174 created by net/http.(*Server).Serve in goroutine 1 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:3454 +0x3d8 goroutine 41 gp=0x140000aafc0 m=nil [IO wait]: runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102680c50?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c1d80 sp=0x140000c1d60 pc=0x10265d418 runtime.netpollblock(0x0?, 0x0?, 0x0?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:575 +0x158 fp=0x140000c1dc0 sp=0x140000c1d80 pc=0x1026229d8 internal/poll.runtime_pollWait(0x12af966b8, 0x72) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:351 +0xa0 fp=0x140000c1df0 sp=0x140000c1dc0 pc=0x10265c5d0 internal/poll.(*pollDesc).wait(0x140005aa100?, 0x140005988e1?, 0x0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x140000c1e20 sp=0x140000c1df0 pc=0x1026dc848 internal/poll.(*pollDesc).waitRead(...) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0x140005aa100, {0x140005988e1, 0x1, 0x1}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x140000c1ec0 sp=0x140000c1e20 pc=0x1026ddafc net.(*netFD).Read(0x140005aa100, {0x140005988e1?, 0x0?, 0x0?}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/fd_posix.go:55 +0x28 fp=0x140000c1f10 sp=0x140000c1ec0 pc=0x10274e958 net.(*conn).Read(0x14000294008, {0x140005988e1?, 0x0?, 0x0?}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/net.go:194 +0x34 fp=0x140000c1f60 sp=0x140000c1f10 pc=0x10275b824 net/http.(*connReader).backgroundRead(0x140005988d0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:690 +0x40 fp=0x140000c1fb0 sp=0x140000c1f60 pc=0x10290db60 net/http.(*connReader).startBackgroundRead.gowrap2() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:686 +0x28 fp=0x140000c1fd0 sp=0x140000c1fb0 pc=0x10290da48 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c1fd0 sp=0x140000c1fd0 pc=0x102665174 created by net/http.(*connReader).startBackgroundRead in goroutine 40 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/server.go:686 +0xc4 goroutine 48 gp=0x140000ab500 m=nil [select]: runtime.gopark(0x1400061ff38?, 0x2?, 0xa8?, 0xfd?, 0x1400061fee4?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x1400061fd70 sp=0x1400061fd50 pc=0x10265d418 runtime.selectgo(0x1400061ff38, 0x1400061fee0, 0x14000050040?, 0x0, 0x140001f6060?, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/select.go:351 +0x6c4 fp=0x1400061fea0 sp=0x1400061fd70 pc=0x10263cbb4 net/http.(*persistConn).writeLoop(0x140005da000) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:2590 +0x9c fp=0x1400061ffb0 sp=0x1400061fea0 pc=0x10292dcbc net/http.(*Transport).dialConn.gowrap3() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1945 +0x28 fp=0x1400061ffd0 sp=0x1400061ffb0 pc=0x10292ae78 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x1400061ffd0 sp=0x1400061ffd0 pc=0x102665174 created by net/http.(*Transport).dialConn in goroutine 20 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1945 +0x120c goroutine 14 gp=0x140000abc00 m=nil [IO wait]: runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102680c50?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c5be0 sp=0x140000c5bc0 pc=0x10265d418 runtime.netpollblock(0x0?, 0x0?, 0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:575 +0x158 fp=0x140000c5c20 sp=0x140000c5be0 pc=0x1026229d8 internal/poll.runtime_pollWait(0x12af965a0, 0x72) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:351 +0xa0 fp=0x140000c5c50 sp=0x140000c5c20 pc=0x10265c5d0 internal/poll.(*pollDesc).wait(0x140003749c0?, 0x14001208000?, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x140000c5c80 sp=0x140000c5c50 pc=0x1026dc848 internal/poll.(*pollDesc).waitRead(...) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0x140003749c0, {0x14001208000, 0x8000, 0x8000}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x140000c5d20 sp=0x140000c5c80 pc=0x1026ddafc os.(*File).read(...) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file_posix.go:29 os.(*File).Read(0x14000294038, {0x14001208000?, 0x182?, 0x8000?}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file.go:124 +0x6c fp=0x140000c5d60 sp=0x140000c5d20 pc=0x1026e6f7c io.copyBuffer({0x10380e1c0, 0x1400000e8a0}, {0x10380bee0, 0x140005ac000}, {0x0, 0x0, 0x0}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:429 +0x18c fp=0x140000c5de0 sp=0x140000c5d60 pc=0x1026d264c io.Copy(...) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:388 os.genericWriteTo(0x1?, {0x10380e1c0, 0x1400000e8a0}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file.go:275 +0x58 fp=0x140000c5e30 sp=0x140000c5de0 pc=0x1026e77e8 os.(*File).WriteTo(0x103fad4d0?, {0x10380e1c0?, 0x1400000e8a0?}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file.go:253 +0x60 fp=0x140000c5e60 sp=0x140000c5e30 pc=0x1026e7720 io.copyBuffer({0x10380e1c0, 0x1400000e8a0}, {0x10380bbc8, 0x14000294038}, {0x0, 0x0, 0x0}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:411 +0x98 fp=0x140000c5ee0 sp=0x140000c5e60 pc=0x1026d2558 io.Copy(...) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/io/io.go:388 os/exec.(*Cmd).writerDescriptor.func1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:596 +0x44 fp=0x140000c5f40 sp=0x140000c5ee0 pc=0x1029cd584 os/exec.(*Cmd).Start.func2(0x0?) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:749 +0x34 fp=0x140000c5fb0 sp=0x140000c5f40 pc=0x1029cdfb4 os/exec.(*Cmd).Start.gowrap1() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:761 +0x30 fp=0x140000c5fd0 sp=0x140000c5fb0 pc=0x1029cdf40 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c5fd0 sp=0x140000c5fd0 pc=0x102665174 created by os/exec.(*Cmd).Start in goroutine 38 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:748 +0x76c goroutine 15 gp=0x140000abdc0 m=9 mp=0x14000580008 [syscall]: syscall.syscall6(0x90010100000050?, 0x12ae29a38?, 0x10429c5c0?, 0x90?, 0x14000580008?, 0x140002b0000?, 0x140000c1608?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/sys_darwin.go:60 +0x50 fp=0x140000c15a0 sp=0x140000c14e0 pc=0x102660d50 syscall.wait4(0x140000c1638?, 0x1026e6674?, 0x90?, 0x1037cebc0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/syscall/zsyscall_darwin_arm64.go:44 +0x4c fp=0x140000c1600 sp=0x140000c15a0 pc=0x10267ccfc syscall.Wait4(0x140000c1668?, 0x140000c166c, 0x140000c1678?, 0x1025f8dc4?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/syscall/syscall_bsd.go:144 +0x28 fp=0x140000c1640 sp=0x140000c1600 pc=0x10267a7c8 os.(*Process).pidWait.func1(...) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec_unix.go:68 os.ignoringEINTR2[...](...) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/file_posix.go:261 os.(*Process).pidWait(0x140005bc700) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec_unix.go:67 +0xa4 fp=0x140000c16a0 sp=0x140000c1640 pc=0x1026e66c4 os.(*Process).wait(0x140000c1738?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec_unix.go:30 +0x24 fp=0x140000c16c0 sp=0x140000c16a0 pc=0x1026e65c4 os.(*Process).Wait(...) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec.go:358 os/exec.(*Cmd).Wait(0x140001ff380) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/os/exec/exec.go:922 +0x38 fp=0x140000c1730 sp=0x140000c16c0 pc=0x1029ce588 github.com/ollama/ollama/llm.NewLlamaServer.func1() /Users/runner/work/ollama/ollama/llm/server.go:461 +0x2c fp=0x140000c17d0 sp=0x140000c1730 pc=0x102a46ccc runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c17d0 sp=0x140000c17d0 pc=0x102665174 created by github.com/ollama/ollama/llm.NewLlamaServer in goroutine 38 /Users/runner/work/ollama/ollama/llm/server.go:460 +0x2df8 goroutine 47 gp=0x14000502540 m=nil [IO wait]: runtime.gopark(0xffffffffffffffff?, 0xffffffffffffffff?, 0x23?, 0x0?, 0x102680c50?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000c7ac0 sp=0x140000c7aa0 pc=0x10265d418 runtime.netpollblock(0x0?, 0x0?, 0x0?) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:575 +0x158 fp=0x140000c7b00 sp=0x140000c7ac0 pc=0x1026229d8 internal/poll.runtime_pollWait(0x12af96488, 0x72) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/netpoll.go:351 +0xa0 fp=0x140000c7b30 sp=0x140000c7b00 pc=0x10265c5d0 internal/poll.(*pollDesc).wait(0x140005d4080?, 0x14000614000?, 0x0) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x140000c7b60 sp=0x140000c7b30 pc=0x1026dc848 internal/poll.(*pollDesc).waitRead(...) 
/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0x140005d4080, {0x14000614000, 0x1000, 0x1000}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x140000c7c00 sp=0x140000c7b60 pc=0x1026ddafc net.(*netFD).Read(0x140005d4080, {0x14000614000?, 0x80?, 0x8?}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/fd_posix.go:55 +0x28 fp=0x140000c7c50 sp=0x140000c7c00 pc=0x10274e958 net.(*conn).Read(0x14000612000, {0x14000614000?, 0x0?, 0x140000c7d28?}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/net.go:194 +0x34 fp=0x140000c7ca0 sp=0x140000c7c50 pc=0x10275b824 net/http.(*persistConn).Read(0x140005da000, {0x14000614000?, 0x10380bae8?, 0x103faa910?}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:2122 +0x4c fp=0x140000c7d00 sp=0x140000c7ca0 pc=0x10292b8fc bufio.(*Reader).fill(0x1400046c060) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/bufio/bufio.go:113 +0xf8 fp=0x140000c7d40 sp=0x140000c7d00 pc=0x1027712b8 bufio.(*Reader).Peek(0x1400046c060, 0x1) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/bufio/bufio.go:152 +0x60 fp=0x140000c7d60 sp=0x140000c7d40 pc=0x102771420 net/http.(*persistConn).readLoop(0x140005da000) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:2275 +0x12c fp=0x140000c7fb0 sp=0x140000c7d60 pc=0x10292c5ac net/http.(*Transport).dialConn.gowrap2() /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1944 +0x28 fp=0x140000c7fd0 sp=0x140000c7fb0 pc=0x10292aed8 runtime.goexit({}) /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000c7fd0 sp=0x140000c7fd0 pc=0x102665174 created by net/http.(*Transport).dialConn in goroutine 20 /Users/runner/hostedtoolcache/go/1.24.0/arm64/src/net/http/transport.go:1944 +0x11c4 goroutine 22 gp=0x14000582380 m=nil [chan receive]: runtime.gopark(0x3?, 0x14000502540?, 0x0?, 0x0?, 0x140005d62a0?) 
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/proc.go:435 +0xc8 fp=0x140000bd6b0 sp=0x140000bd690 pc=0x10265d418
runtime.chanrecv(0x140000d6230, 0x0, 0x1)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:664 +0x42c fp=0x140000bd730 sp=0x140000bd6b0 pc=0x1025f9a1c
runtime.chanrecv1(0x0?, 0x100010000?)
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/chan.go:506 +0x14 fp=0x140000bd760 sp=0x140000bd730 pc=0x1025f95b4
github.com/ollama/ollama/server.(*Scheduler).load.func1.1()
	/Users/runner/work/ollama/ollama/server/sched.go:500 +0x44 fp=0x140000bd7d0 sp=0x140000bd760 pc=0x1030fb184
runtime.goexit({})
	/Users/runner/hostedtoolcache/go/1.24.0/arm64/src/runtime/asm_arm64.s:1223 +0x4 fp=0x140000bd7d0 sp=0x140000bd7d0 pc=0x102665174
created by github.com/ollama/ollama/server.(*Scheduler).load.func1 in goroutine 16
	/Users/runner/work/ollama/ollama/server/sched.go:499 +0x2fc

r0 0x17287c608
r1 0xffffffffffffffe0
r2 0xc0040c7dfbd
r3 0x1
r4 0x4
r5 0xe9875915
r6 0x600001202280
r7 0x6000012020e0
r8 0x17287c280
r9 0x17287c610
r10 0x1
r11 0x17287c650
r12 0x17
r13 0x10320bcbc
r14 0x0
r15 0x7fb
r16 0xe9875915
r17 0x114
r18 0x0
r19 0x17287c280
r20 0x10358a1a6
r21 0x17287c310
r22 0x600003119408
r23 0x17287ecd8
r24 0x17287c470
r25 0x103569720
r26 0x17287ed78
r27 0x600001c1b3b1
r28 0x17287c2f0
r29 0x17287c1a0
lr 0x10320bcd0
sp 0x17287c160
pc 0x10320b98c
fault 0xfffffffffffffff8
username@COMPUTER ~ %

The logfile contains less:

time=2025-08-18T10:10:27.256+02:00 level=INFO source=routes.go:1304 msg="server config" env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/matthias/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2025-08-18T10:10:27.258+02:00 level=INFO source=images.go:477 msg="total blobs: 38"
time=2025-08-18T10:10:27.259+02:00 level=INFO source=images.go:484 msg="total unused blobs removed: 0"
time=2025-08-18T10:10:27.259+02:00 level=INFO source=routes.go:1357 msg="Listening on [::]:11434 (version 0.11.4)"
time=2025-08-18T10:10:27.372+02:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="16.0 GiB" available="16.0 GiB"
time=2025-08-18T10:10:27.372+02:00 level=INFO source=routes.go:1398 msg="entering low vram mode" "total vram"="16.0 GiB" threshold="20.0 GiB"
[GIN] 2025/08/18 - 10:10:31 | 404 | 986.791µs | 127.0.0.1 | POST "/api/show"
[GIN] 2025/08/18 - 10:10:31 | 200 | 3.604083ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2025/08/18 - 10:10:32 | 404 | 2.616208ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/08/18 - 10:10:34 | 404 | 1.746916ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/08/18 - 10:10:37 | 200 | 54.450084ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/08/18 - 10:10:39 | 200 | 1.598291ms | 127.0.0.1 | GET "/api/tags"
[GIN] 2025/08/18 - 10:10:39 | 200 | 53.516334ms | 127.0.0.1 | POST "/api/show"
[GIN] 2025/08/18 - 10:10:39 | 200 | 35.498708ms | 127.0.0.1 | POST "/api/show"
time=2025-08-18T10:10:39.629+02:00 level=INFO source=sched.go:786 msg="new model will fit in available VRAM in single GPU, loading" model=/Users/matthias/.ollama/models/blobs/sha256-140cb395a3a3dcdb3de66e44c60d7265286c9d49f951c1899ec0d8c1f16b2feb gpu=0 parallel=1 available=17179885568 required="1.5 GiB"
time=2025-08-18T10:10:39.629+02:00 level=INFO source=server.go:135 msg="system memory" total="24.0 GiB" free="4.0 GiB" free_swap="0 B"
time=2025-08-18T10:10:39.629+02:00 level=INFO source=server.go:175 msg=offload library=metal layers.requested=-1 layers.model=19 layers.offload=19 layers.split="" memory.available="[16.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="1.5 GiB" memory.required.partial="1.5 GiB" memory.required.kv="27.0 MiB" memory.required.allocations="[1.5 GiB]" memory.weights.total="511.5 MiB" memory.weights.repeating="191.5 MiB" memory.weights.nonrepeating="320.0 MiB" memory.graph.full="513.2 MiB" memory.graph.partial="513.2 MiB"
time=2025-08-18T10:10:39.650+02:00 level=INFO source=server.go:438 msg="starting llama server" cmd="/Applications/Ollama.app/Contents/Resources/ollama runner --ollama-engine --model /Users/matthias/.ollama/models/blobs/sha256-140cb395a3a3dcdb3de66e44c60d7265286c9d49f951c1899ec0d8c1f16b2feb --ctx-size 4096 --batch-size 512 --n-gpu-layers 19 --threads 4 --parallel 1 --port 53339"
time=2025-08-18T10:10:39.651+02:00 level=INFO source=sched.go:481 msg="loaded runners" count=1
time=2025-08-18T10:10:39.652+02:00 level=INFO source=server.go:598 msg="waiting for llama runner to start responding"
time=2025-08-18T10:10:39.652+02:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server not responding"
time=2025-08-18T10:10:39.660+02:00 level=INFO source=runner.go:925 msg="starting ollama engine"
time=2025-08-18T10:10:39.661+02:00 level=INFO source=runner.go:983 msg="Server listening on 127.0.0.1:53339"
time=2025-08-18T10:10:39.679+02:00 level=INFO source=ggml.go:92 msg="" architecture=gemma3 file_type=F16 name=Gemma-3-270M-It description="" num_tensors=236 num_key_values=42
time=2025-08-18T10:10:39.681+02:00 level=INFO source=ggml.go:104 msg=system Metal.0.EMBED_LIBRARY=1 Metal.0.BF16=1 CPU.0.ARM_FMA=1 CPU.0.FP16_VA=1 CPU.0.DOTPROD=1 CPU.0.LLAMAFILE=1 CPU.0.ACCELERATE=1 compiler=cgo(clang)
time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:365 msg="offloading 18 repeating layers to GPU" time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:371 msg="offloading output layer to GPU" time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:376 msg="offloaded 19/19 layers to GPU" time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:379 msg="model weights" buffer=CPU size="320.0 MiB" time=2025-08-18T10:10:39.755+02:00 level=INFO source=ggml.go:379 msg="model weights" buffer=Metal size="511.5 MiB" ggml_metal_init: allocating ggml_metal_init: picking default device: Apple M3 ggml_metal_load_library: using embedded metal library time=2025-08-18T10:10:39.903+02:00 level=INFO source=server.go:632 msg="waiting for server to become available" status="llm server loading model" ggml_metal_init: GPU name: Apple M3 ggml_metal_init: GPU family: MTLGPUFamilyApple9 (1009) ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003) ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001) ggml_metal_init: simdgroup reduction = true ggml_metal_init: simdgroup matrix mul. = true ggml_metal_init: has residency sets = false ggml_metal_init: has bfloat = true ggml_metal_init: use bfloat = true ggml_metal_init: hasUnifiedMemory = true ggml_metal_init: recommendedMaxWorkingSetSize = 17179.89 MB time=2025-08-18T10:10:52.217+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=Metal buffer_type=Metal size="48.5 MiB" time=2025-08-18T10:10:52.217+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=BLAS buffer_type=CPU size="1.2 MiB" time=2025-08-18T10:10:52.217+02:00 level=INFO source=ggml.go:668 msg="compute graph" backend=CPU buffer_type=CPU size="0 B" time=2025-08-18T10:10:52.970+02:00 level=INFO source=server.go:637 msg="llama runner started in 13.32 seconds" [GIN] 2025/08/18 - 10:10:53 | 200 | 13.657210417s | 127.0.0.1 | POST "/api/chat" ``` ### OS macOS ### GPU Apple ### CPU Apple ### Ollama version 0.11.4
GiteaMirror added the bug, needs more info, macos labels 2026-04-12 20:05:59 -05:00

@ParthSareen commented on GitHub (Aug 19, 2025):

Could I see the request you're making? Are you using structured outputs?


@pdevine commented on GitHub (Aug 19, 2025):

Also, can you try `gemma3:270m-it-fp16` (the official ollama version)? That worked for me (w/o structured outputs).

```
curl http://localhost:11434/v1/chat/completions -d '{"model": "gemma3:270m-it-fp16", "messages": [{"role": "user", "content": "hello there"}]}'
{"id":"chatcmpl-785","object":"chat.completion","created":1755645431,"model":"gemma3:270m-it-fp16","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"Hello! How can I help you today?\n"},"finish_reason":"stop"}],"usage":{"prompt_tokens":11,"completion_tokens":11,"total_tokens":22}}
```

@SleepyYui commented on GitHub (Aug 21, 2025):

> Also, can you try `gemma3:270m-it-fp16` (the official ollama version). That worked for me (w/o structured outputs).
>
> ```
> curl http://localhost:11434/v1/chat/completions -d '{"model": "gemma3:270m-it-fp16", "messages": [{"role": "user", "content": "hello there"}]}'
> {"id":"chatcmpl-785","object":"chat.completion","created":1755645431,"model":"gemma3:270m-it-fp16","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"Hello! How can I help you today?\n"},"finish_reason":"stop"}],"usage":{"prompt_tokens":11,"completion_tokens":11,"total_tokens":22}}
> ```

```
curl http://localhost:11434/v1/chat/completions -d '{"model": "hf.co/unsloth/gemma-3-270m-it-GGUF:F16", "messages": [{"role": "user", "content": "hello there"}]}'
{"id":"chatcmpl-816","object":"chat.completion","created":1755768454,"model":"hf.co/unsloth/gemma-3-270m-it-GGUF:F16","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":"Hello! How can I help you today?\n"},"finish_reason":"stop"}],"usage":{"prompt_tokens":11,"completion_tokens":11,"total_tokens":22}}
```

This works, so the problem is probably the JSON format I want it to output. llama.cpp had similar problems because it didn't like the regex.

I'll provide my request ASAP. I no longer have the original JSON that caused it, but I will search for it.
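
For anyone trying to narrow this down in the meantime, here is a minimal sketch of a comparison between a plain request and one that asks for schema-constrained JSON. It rests on assumptions: the original failing request was never posted, so `EXAMPLE_SCHEMA` and the helper names `build_payload`, `build_request`, and `send` are hypothetical, and the `response_format` shape follows the OpenAI chat-completions convention rather than anything confirmed in this thread. Only the base URL and model name come from the report.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434/v1"  # base URL from the report
MODEL = "hf.co/unsloth/gemma-3-270m-it-GGUF:F16"  # model from the report

# Hypothetical schema for illustration only; the reporter's real schema is unknown.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {"greeting": {"type": "string"}},
    "required": ["greeting"],
}


def build_payload(prompt, schema=None):
    """Build a chat-completions body, optionally requesting schema-constrained JSON."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    if schema is not None:
        # OpenAI-style structured-output field (assumed to be the trigger, not confirmed).
        payload["response_format"] = {
            "type": "json_schema",
            "json_schema": {"name": "example", "schema": schema},
        }
    return payload


def build_request(payload):
    """Wrap the payload in a POST request; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(payload):
    """Send the request to a running Ollama server and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(payload)) as resp:
        return json.load(resp)
```

Calling `send(build_payload("hello there"))` and then `send(build_payload("hello there", EXAMPLE_SCHEMA))` against a running server, while watching the `ollama serve` log, would show whether the schema field is what makes the difference.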


@pdevine commented on GitHub (Sep 19, 2025):

Going to go and close this as stale. We can reopen if it's still an issue.

Reference: github-starred/ollama#7935