[GH-ISSUE #14092] 0.15.5-rc4 run qwen3:4b-instruct-2507-q4_K_M error #9200

Closed
opened 2026-04-12 22:03:08 -05:00 by GiteaMirror · 2 comments

Originally created by @luhuaei on GitHub (Feb 5, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/14092

What is the issue?

Error: 500 Internal Server Error: model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details

> env | grep OLLAMA
OLLAMA_ORIGINS=*
OLLAMA_HOST=0.0.0.0
OLLAMA_LLM_LIBRARY=cuda
OLLAMA_NUM_PARALLEL=4
OLLAMA_DEBUG=1
OLLAMA_FLASH_ATTENTION=1
OLLAMA_KV_CACHE_TYPE=q8_0

Hardware: Jetson AGX Orin 64GB dev kit

> env | grep CUDA
CMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc
CUDA_BIN_PATH=/usr/local/cuda/bin
CUDAARCHS=87
CUDACXX=/usr/local/cuda/bin/nvcc
CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
CUDA_NVCC_EXECUTABLE=/usr/local/cuda/bin/nvcc
CUDA_HOME=/usr/local/cuda
CUDA_ARCHITECTURES=87

Relevant log output

time=2026-02-05T09:40:30.709Z level=INFO source=server.go:430 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --model /root/.ollama/models/blobs/sha256-85e4a5b7b8ef0e48af0e8658f5aaab9c2324c76c1641493f4d1e25fce54b18b9 --port 45139"
time=2026-02-05T09:40:30.709Z level=DEBUG source=server.go:431 msg=subprocess PATH=/opt/venv/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin OLLAMA_NUM_PARALLEL=4 OLLAMA_ORIGINS=* OLLAMA_KV_CACHE_TYPE=q8_0 OLLAMA_HOST=0.0.0.0 OLLAMA_FLASH_ATTENTION=1 OLLAMA_LLM_LIBRARY=cuda OLLAMA_DEBUG=1 CUDA_HOME=/usr/local/cuda CUDA_ARCHITECTURES=87 CUDA_NVCC_EXECUTABLE=/usr/local/cuda/bin/nvcc CUDA_BIN_PATH=/usr/local/cuda/bin CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/cuda/compat:/usr/local/cuda/lib64: OLLAMA_LIBRARY_PATH=/usr/local/lib/ollama
time=2026-02-05T09:40:30.709Z level=INFO source=sched.go:463 msg="system memory" total="60.0 GiB" free="57.2 GiB" free_swap="0 B"
time=2026-02-05T09:40:30.709Z level=INFO source=sched.go:470 msg="gpu memory" id=GPU-678cd885-b8ec-5922-848b-a80d8aadc8b5 library=CUDA available="39.9 GiB" free="40.4 GiB" minimum="457.0 MiB" overhead="0 B"
time=2026-02-05T09:40:30.709Z level=INFO source=server.go:756 msg="loading model" "model layers"=37 requested=-1
time=2026-02-05T09:40:30.723Z level=INFO source=runner.go:1410 msg="starting ollama engine"
time=2026-02-05T09:40:30.727Z level=INFO source=runner.go:1445 msg="Server listening on 127.0.0.1:45139"
time=2026-02-05T09:40:30.732Z level=INFO source=runner.go:1283 msg=load request="{Operation:fit LoraPath:[] Parallel:4 BatchSize:512 FlashAttention:Enabled KvSize:1048576 KvCacheType:q8_0 NumThreads:12 GPULayers:37[ID:GPU-678cd885-b8ec-5922-848b-a80d8aadc8b5 Layers:37(0..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-02-05T09:40:30.765Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=general.alignment default=32
time=2026-02-05T09:40:30.765Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=general.description default=""
time=2026-02-05T09:40:30.765Z level=INFO source=ggml.go:136 msg="" architecture=qwen3 file_type=Q4_K_M name="Qwen3 4B Instruct 2507" description="" num_tensors=398 num_key_values=33
time=2026-02-05T09:40:30.765Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/local/lib/ollama
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: Orin, compute capability 8.7, VMM: yes, ID: GPU-678cd885-b8ec-5922-848b-a80d8aadc8b5
load_backend: loaded CUDA backend from /usr/local/lib/ollama/libggml-cuda.so
load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu.so
time=2026-02-05T09:40:30.805Z level=INFO source=ggml.go:104 msg=system CPU.0.NEON=1 CPU.0.ARM_FMA=1 CPU.0.LLAMAFILE=1 CUDA.0.ARCHS=870 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 CPU.1.NEON=1 CPU.1.ARM_FMA=1 CPU.1.LLAMAFILE=1 compiler=cgo(gcc)
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=qwen3.pooling_type default=0
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=qwen3.rope.scaling.type default=""
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=qwen3.rope.scaling.factor default=1
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=qwen3.rope.scaling.original_context_length default=0
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=qwen3.expert_count default=0
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=qwen3.expert_used_count default=0
time=2026-02-05T09:40:30.810Z level=DEBUG source=ggml.go:300 msg="key with type not found" key=qwen3.norm_top_k_prob default=true
/workspace/ollama-0.15.5-rc4/ml/backend/ggml/ggml/src/ggml-cuda/cpy.cu:396: GGML_ASSERT(ggml_nbytes(src0) <= INT_MAX) failed
[New LWP 596]
[New LWP 595]
[New LWP 594]
[New LWP 593]
[New LWP 592]
[New LWP 591]
[New LWP 590]
[New LWP 589]
[New LWP 588]
[New LWP 587]
[New LWP 586]
[New LWP 585]
[New LWP 584]
[New LWP 583]
[New LWP 582]
warning: could not find '.gnu_debugaltlink' file for /usr/lib/aarch64-linux-gnu/libtcmalloc.so.4
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/aarch64-linux-gnu/libthread_db.so.1".
0x0000000000426600 in ?? ()
#0  0x0000000000426600 in ?? ()
#1  0x0000000000000004 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
[Inferior 1 (process 581) detached]
SIGABRT: abort
PC=0xffff8e447608 m=9 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 41 gp=0x40001dd880 m=9 mp=0x4000580808 [syscall]:
runtime.cgocall(0x10a5060, 0x4000c6d0a8)
	/workspace/go/src/runtime/cgocall.go:167 +0x44 fp=0x4000c6d060 sp=0x4000c6d020 pc=0x48bdf4
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_sched_reserve(0x3a069080, 0x4a8a5e50)
	_cgo_gotypes.go:1012 +0x34 fp=0x4000c6d0a0 sp=0x4000c6d060 pc=0x8afd54
github.com/ollama/ollama/ml/backend/ggml.(*Context).Reserve.func2(...)
	/workspace/ollama-0.15.5-rc4/ml/backend/ggml/ggml.go:850
github.com/ollama/ollama/ml/backend/ggml.(*Context).Reserve(0x4000684040)
	/workspace/ollama-0.15.5-rc4/ml/backend/ggml/ggml.go:850 +0xe0 fp=0x4000c6d330 sp=0x4000c6d0a0 pc=0x8ba500
github.com/ollama/ollama/runner/ollamarunner.(*Server).reserveWorstCaseGraph(0x4000226d20, 0x1)
	/workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1168 +0x834 fp=0x4000c6d660 sp=0x4000c6d330 pc=0x982cd4
github.com/ollama/ollama/runner/ollamarunner.(*Server).allocModel(0x4000226d20, {0xffffe23e6919?, 0x0?}, {0x0, 0xc, {0x4000684b80, 0x1, 0x1}, 0x1}, {0x0?, ...}, ...)
	/workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1231 +0x2e4 fp=0x4000c6d710 sp=0x4000c6d660 pc=0x9833a4
github.com/ollama/ollama/runner/ollamarunner.(*Server).load(0x4000226d20, {0x17a0970, 0x40000ffea0}, 0x4000242140)
	/workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1310 +0x460 fp=0x4000c6daa0 sp=0x4000c6d710 pc=0x983cb0
github.com/ollama/ollama/runner/ollamarunner.(*Server).load-fm({0x17a0970?, 0x40000ffea0?}, 0x4000163b28?)
	<autogenerated>:1 +0x40 fp=0x4000c6dad0 sp=0x4000c6daa0 pc=0x985b60
net/http.HandlerFunc.ServeHTTP(0x40004952c0?, {0x17a0970?, 0x40000ffea0?}, 0x4000163b10?)
	/workspace/go/src/net/http/server.go:2294 +0x38 fp=0x4000c6db00 sp=0x4000c6dad0 pc=0x7487d8
net/http.(*ServeMux).ServeHTTP(0x10?, {0x17a0970, 0x40000ffea0}, 0x4000242140)
	/workspace/go/src/net/http/server.go:2822 +0x1b4 fp=0x4000c6db50 sp=0x4000c6db00 pc=0x74a364
net/http.serverHandler.ServeHTTP({0x179cd30?}, {0x17a0970?, 0x40000ffea0?}, 0x1?)
	/workspace/go/src/net/http/server.go:3301 +0xbc fp=0x4000c6db80 sp=0x4000c6db50 pc=0x76604c
net/http.(*conn).serve(0x400012e480, {0x17a2e98, 0x4000708cc0})
	/workspace/go/src/net/http/server.go:2102 +0x52c fp=0x4000c6dfa0 sp=0x4000c6db80 pc=0x746f7c
net/http.(*Server).Serve.gowrap3()
	/workspace/go/src/net/http/server.go:3454 +0x30 fp=0x4000c6dfd0 sp=0x4000c6dfa0 pc=0x74c140
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000c6dfd0 sp=0x4000c6dfd0 pc=0x4971f4
created by net/http.(*Server).Serve in goroutine 1
	/workspace/go/src/net/http/server.go:3454 +0x3d8

goroutine 1 gp=0x40000021c0 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000c6f720 sp=0x4000c6f700 pc=0x48f308
runtime.netpollblock(0x7000000000?, 0x6?, 0x0?)
	/workspace/go/src/runtime/netpoll.go:575 +0x158 fp=0x4000c6f760 sp=0x4000c6f720 pc=0x4542c8
internal/poll.runtime_pollWait(0xffff470e6f30, 0x72)
	/workspace/go/src/runtime/netpoll.go:351 +0xa0 fp=0x4000c6f790 sp=0x4000c6f760 pc=0x48e4c0
internal/poll.(*pollDesc).wait(0x4000627a00?, 0x517538?, 0x0)
	/workspace/go/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x4000c6f7c0 sp=0x4000c6f790 pc=0x510ad8
internal/poll.(*pollDesc).waitRead(...)
	/workspace/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0x4000627a00)
	/workspace/go/src/internal/poll/fd_unix.go:620 +0x24c fp=0x4000c6f870 sp=0x4000c6f7c0 pc=0x5153ac
net.(*netFD).accept(0x4000627a00)
	/workspace/go/src/net/fd_unix.go:172 +0x28 fp=0x4000c6f930 sp=0x4000c6f870 pc=0x583eb8
net.(*TCPListener).accept(0x4000684980)
	/workspace/go/src/net/tcpsock_posix.go:159 +0x24 fp=0x4000c6f980 sp=0x4000c6f930 pc=0x599354
net.(*TCPListener).Accept(0x4000684980)
	/workspace/go/src/net/tcpsock.go:380 +0x2c fp=0x4000c6f9c0 sp=0x4000c6f980 pc=0x5982ec
net/http.(*onceCloseListener).Accept(0x400012e480?)
	<autogenerated>:1 +0x30 fp=0x4000c6f9e0 sp=0x4000c6f9c0 pc=0x772670
net/http.(*Server).Serve(0x40001f1800, {0x17a0760, 0x4000684980})
	/workspace/go/src/net/http/server.go:3424 +0x290 fp=0x4000c6fb10 sp=0x4000c6f9e0 pc=0x74bdb0
github.com/ollama/ollama/runner/ollamarunner.Execute({0x40000320a0, 0x4, 0x4})
	/workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1446 +0x7fc fp=0x4000c6fce0 sp=0x4000c6fb10 pc=0x98558c
github.com/ollama/ollama/runner.Execute({0x4000032080?, 0x0?, 0x0?})
	/workspace/ollama-0.15.5-rc4/runner/runner.go:28 +0x1a4 fp=0x4000c6fd10 sp=0x4000c6fce0 pc=0x98c944
github.com/ollama/ollama/cmd.NewCLI.func3(0x40001f1600?, {0x15a3954?, 0x4?, 0x15a3958?})
	/workspace/ollama-0.15.5-rc4/cmd/cmd.go:1979 +0x54 fp=0x4000c6fd40 sp=0x4000c6fd10 pc=0x1055284
github.com/spf13/cobra.(*Command).execute(0x4000131508, {0x40004cba40, 0x5, 0x5})
	/root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x648 fp=0x4000c6fe60 sp=0x4000c6fd40 pc=0x5f3bb8
github.com/spf13/cobra.(*Command).ExecuteC(0x40006a0908)
	/root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x320 fp=0x4000c6ff20 sp=0x4000c6fe60 pc=0x5f4300
github.com/spf13/cobra.(*Command).Execute(...)
	/root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	/root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
	/workspace/ollama-0.15.5-rc4/main.go:12 +0x54 fp=0x4000c6ff40 sp=0x4000c6ff20 pc=0x1055dd4
runtime.main()
	/workspace/go/src/runtime/proc.go:283 +0x284 fp=0x4000c6ffd0 sp=0x4000c6ff40 pc=0x45b674
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000c6ffd0 sp=0x4000c6ffd0 pc=0x4971f4

goroutine 2 gp=0x4000002c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000070f90 sp=0x4000070f70 pc=0x48f308
runtime.goparkunlock(...)
	/workspace/go/src/runtime/proc.go:441
runtime.forcegchelper()
	/workspace/go/src/runtime/proc.go:348 +0xb8 fp=0x4000070fd0 sp=0x4000070f90 pc=0x45b9c8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000070fd0 sp=0x4000070fd0 pc=0x4971f4
created by runtime.init.7 in goroutine 1
	/workspace/go/src/runtime/proc.go:336 +0x24

goroutine 3 gp=0x4000003180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000071760 sp=0x4000071740 pc=0x48f308
runtime.goparkunlock(...)
	/workspace/go/src/runtime/proc.go:441
runtime.bgsweep(0x400009c000)
	/workspace/go/src/runtime/mgcsweep.go:316 +0x108 fp=0x40000717b0 sp=0x4000071760 pc=0x4461f8
runtime.gcenable.gowrap1()
	/workspace/go/src/runtime/mgc.go:204 +0x28 fp=0x40000717d0 sp=0x40000717b0 pc=0x43a028
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000717d0 sp=0x40000717d0 pc=0x4971f4
created by runtime.gcenable in goroutine 1
	/workspace/go/src/runtime/mgc.go:204 +0x6c

goroutine 4 gp=0x4000003340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x178c0c0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000071f60 sp=0x4000071f40 pc=0x48f308
runtime.goparkunlock(...)
	/workspace/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x2181fe0)
	/workspace/go/src/runtime/mgcscavenge.go:425 +0x5c fp=0x4000071f90 sp=0x4000071f60 pc=0x443cbc
runtime.bgscavenge(0x400009c000)
	/workspace/go/src/runtime/mgcscavenge.go:658 +0xac fp=0x4000071fb0 sp=0x4000071f90 pc=0x44423c
runtime.gcenable.gowrap2()
	/workspace/go/src/runtime/mgc.go:205 +0x28 fp=0x4000071fd0 sp=0x4000071fb0 pc=0x439fc8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000071fd0 sp=0x4000071fd0 pc=0x4971f4
created by runtime.gcenable in goroutine 1
	/workspace/go/src/runtime/mgc.go:205 +0xac

goroutine 5 gp=0x4000003c00 m=nil [finalizer wait]:
runtime.gopark(0x18000001b8?, 0x1000000000000?, 0xf8?, 0x5?, 0x7750bc?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000070590 sp=0x4000070570 pc=0x48f308
runtime.runfinq()
	/workspace/go/src/runtime/mfinal.go:196 +0x108 fp=0x40000707d0 sp=0x4000070590 pc=0x439028
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000707d0 sp=0x40000707d0 pc=0x4971f4
created by runtime.createfing in goroutine 1
	/workspace/go/src/runtime/mfinal.go:166 +0x80

goroutine 6 gp=0x40001dc700 m=nil [chan receive]:
runtime.gopark(0x40002295e0?, 0x4003188048?, 0x48?, 0x27?, 0x55c478?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x40000726f0 sp=0x40000726d0 pc=0x48f308
runtime.chanrecv(0x4000036380, 0x0, 0x1)
	/workspace/go/src/runtime/chan.go:664 +0x42c fp=0x4000072770 sp=0x40000726f0 pc=0x42b08c
runtime.chanrecv1(0x0?, 0x0?)
	/workspace/go/src/runtime/chan.go:506 +0x14 fp=0x40000727a0 sp=0x4000072770 pc=0x42ac24
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	/workspace/go/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	/workspace/go/src/runtime/mgc.go:1799 +0x3c fp=0x40000727d0 sp=0x40000727a0 pc=0x43d24c
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000727d0 sp=0x40000727d0 pc=0x4971f4
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	/workspace/go/src/runtime/mgc.go:1794 +0x78

goroutine 7 gp=0x40001dce00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000072f10 sp=0x4000072ef0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x4000072fb0 sp=0x4000072f10 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x4000072fd0 sp=0x4000072fb0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000072fd0 sp=0x4000072fd0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 18 gp=0x4000504000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400006c710 sp=0x400006c6f0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400006c7b0 sp=0x400006c710 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400006c7d0 sp=0x400006c7b0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400006c7d0 sp=0x400006c7d0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 34 gp=0x4000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011a710 sp=0x400011a6f0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011a7b0 sp=0x400011a710 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011a7d0 sp=0x400011a7b0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011a7d0 sp=0x400011a7d0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 8 gp=0x40001dcfc0 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358f8652c5?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000073710 sp=0x40000736f0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x40000737b0 sp=0x4000073710 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x40000737d0 sp=0x40000737b0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000737d0 sp=0x40000737d0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 19 gp=0x40005041c0 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358f86d604?, 0x3?, 0x1a?, 0x49?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400006cf10 sp=0x400006cef0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400006cfb0 sp=0x400006cf10 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400006cfd0 sp=0x400006cfb0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400006cfd0 sp=0x400006cfd0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 35 gp=0x4000102540 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358f868405?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011af10 sp=0x400011aef0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011afb0 sp=0x400011af10 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011afd0 sp=0x400011afb0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011afd0 sp=0x400011afd0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 36 gp=0x4000102700 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358efa70d5?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011b710 sp=0x400011b6f0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011b7b0 sp=0x400011b710 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011b7d0 sp=0x400011b7b0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011b7d0 sp=0x400011b7d0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 37 gp=0x40001028c0 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358f8653a5?, 0x1?, 0x43?, 0x8b?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011bf10 sp=0x400011bef0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011bfb0 sp=0x400011bf10 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011bfd0 sp=0x400011bfb0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011bfd0 sp=0x400011bfd0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 38 gp=0x4000102a80 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358f863e25?, 0x3?, 0x59?, 0x38?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011c710 sp=0x400011c6f0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011c7b0 sp=0x400011c710 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011c7d0 sp=0x400011c7b0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011c7d0 sp=0x400011c7d0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 39 gp=0x4000102c40 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358f865045?, 0x1?, 0x9f?, 0x69?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011cf10 sp=0x400011cef0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011cfb0 sp=0x400011cf10 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011cfd0 sp=0x400011cfb0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011cfd0 sp=0x400011cfd0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 20 gp=0x4000504380 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358ef5675e?, 0x3?, 0x99?, 0x1f?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400006d710 sp=0x400006d6f0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400006d7b0 sp=0x400006d710 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400006d7d0 sp=0x400006d7b0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400006d7d0 sp=0x400006d7d0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 9 gp=0x40001dd180 m=nil [GC worker (idle)]:
runtime.gopark(0x6c5358f865045?, 0x3?, 0x75?, 0xa8?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000073f10 sp=0x4000073ef0 pc=0x48f308
runtime.gcBgMarkWorker(0x40000377a0)
	/workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x4000073fb0 sp=0x4000073f10 pc=0x43c4bc
runtime.gcBgMarkStartWorkers.gowrap1()
	/workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x4000073fd0 sp=0x4000073fb0 pc=0x43c3a8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000073fd0 sp=0x4000073fd0 pc=0x4971f4
created by runtime.gcBgMarkStartWorkers in goroutine 1
	/workspace/go/src/runtime/mgc.go:1339 +0x140

goroutine 40 gp=0x40001dd6c0 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x2193d60?, 0x0?, 0x60?, 0xe0?, 0x0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000083a90 sp=0x4000083a70 pc=0x48f308
runtime.goparkunlock(...)
	/workspace/go/src/runtime/proc.go:441
runtime.semacquire1(0x4000226dd8, 0x0, 0x1, 0x0, 0x18)
	/workspace/go/src/runtime/sema.go:188 +0x204 fp=0x4000083ae0 sp=0x4000083a90 pc=0x46fb14
sync.runtime_SemacquireWaitGroup(0x0?)
	/workspace/go/src/runtime/sema.go:110 +0x2c fp=0x4000083b20 sp=0x4000083ae0 pc=0x490cbc
sync.(*WaitGroup).Wait(0x4000226dd0)
	/workspace/go/src/sync/waitgroup.go:118 +0x70 fp=0x4000083b40 sp=0x4000083b20 pc=0x4a28d0
github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0x4000226d20, {0x17a2ed0, 0x40004cbae0})
	/workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:441 +0x38 fp=0x4000083fa0 sp=0x4000083b40 pc=0x97d958
github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap1()
	/workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1423 +0x30 fp=0x4000083fd0 sp=0x4000083fa0 pc=0x9857b0
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000083fd0 sp=0x4000083fd0 pc=0x4971f4
created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1
	/workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1423 +0x448

goroutine 43 gp=0x40001dda40 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0xc8?, 0xd5?, 0x42f6d0?)
	/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011d580 sp=0x400011d560 pc=0x48f308
runtime.netpollblock(0x0?, 0xffffffff?, 0xff?)
	/workspace/go/src/runtime/netpoll.go:575 +0x158 fp=0x400011d5c0 sp=0x400011d580 pc=0x4542c8
internal/poll.runtime_pollWait(0xffff470e6e18, 0x72)
	/workspace/go/src/runtime/netpoll.go:351 +0xa0 fp=0x400011d5f0 sp=0x400011d5c0 pc=0x48e4c0
internal/poll.(*pollDesc).wait(0x4000627a80?, 0x4000708dc1?, 0x0)
	/workspace/go/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x400011d620 sp=0x400011d5f0 pc=0x510ad8
internal/poll.(*pollDesc).waitRead(...)
	/workspace/go/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x4000627a80, {0x4000708dc1, 0x1, 0x1})
	/workspace/go/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x400011d6c0 sp=0x400011d620 pc=0x511d8c
net.(*netFD).Read(0x4000627a80, {0x4000708dc1?, 0x400011d758?, 0x7419f4?})
	/workspace/go/src/net/fd_posix.go:55 +0x28 fp=0x400011d710 sp=0x400011d6c0 pc=0x582488
net.(*conn).Read(0x4000074758, {0x4000708dc1?, 0x0?, 0x0?})
	/workspace/go/src/net/net.go:194 +0x34 fp=0x400011d760 sp=0x400011d710 pc=0x58fbc4
net/http.(*connReader).backgroundRead(0x4000708db0)
	/workspace/go/src/net/http/server.go:690 +0x40 fp=0x400011d7b0 sp=0x400011d760 pc=0x7418f0
net/http.(*connReader).startBackgroundRead.gowrap2()
	/workspace/go/src/net/http/server.go:686 +0x28 fp=0x400011d7d0 sp=0x400011d7b0 pc=0x7417d8
runtime.goexit({})
	/workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011d7d0 sp=0x400011d7d0 pc=0x4971f4
created by net/http.(*connReader).startBackgroundRead in goroutine 41
	/workspace/go/src/net/http/server.go:686 +0xc4

r0      0x0
r1      0x24d
r2      0x6
r3      0xffff43f6e0c0
r4      0xffff8eb5eb50
r5      0x1
r6      0x20
r7      0xffff43f6c7f0
r8      0x83
r9      0x0
r10     0xa
r11     0x101010101010101
r12     0x2
r13     0x0
r14     0x7251e0a150f1d3a
r15     0x3a574268
r16     0x1
r17     0xffff8e3e7d0c
r18     0xffff8eb5eb50
r19     0x24d
r20     0xffff43f6e0c0
r21     0x6
r22     0xffff43f6c928
r23     0x0
r24     0xffff43f6d578
r25     0xffff43f6d550
r26     0x3b334470
r27     0x3a16fb00
r28     0x3a574268
r29     0xffff43f6c790
lr      0xffff8e4475f4
sp      0xffff43f6c780
pc      0xffff8e447608
fault   0x0
time=2026-02-05T09:40:32.850Z level=ERROR source=server.go:1204 msg="do load request" error="Post \"http://127.0.0.1:45139/load\": EOF"
time=2026-02-05T09:40:32.851Z level=ERROR source=server.go:1204 msg="do load request" error="Post \"http://127.0.0.1:45139/load\": dial tcp 127.0.0.1:45139: connect: connection refused"
time=2026-02-05T09:40:32.851Z level=INFO source=sched.go:490 msg="Load failed" model=/root/.ollama/models/blobs/sha256-85e4a5b7b8ef0e48af0e8658f5aaab9c2324c76c1641493f4d1e25fce54b18b9 error="model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details"
time=2026-02-05T09:40:32.851Z level=DEBUG source=server.go:1829 msg="stopping llama server" pid=581
time=2026-02-05T09:40:32.851Z level=DEBUG source=server.go:1835 msg="waiting for llama server to exit" pid=581
time=2026-02-05T09:40:32.852Z level=ERROR source=server.go:303 msg="llama runner terminated" error="exit status 2"
time=2026-02-05T09:40:32.852Z level=DEBUG source=server.go:1839 msg="llama server stopped" pid=581
time=2026-02-05T09:40:32.852Z level=DEBUG source=sched.go:241 msg="new model fits with existing models, loading"
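
The failed assertion, `GGML_ASSERT(ggml_nbytes(src0) <= INT_MAX)`, means a copy source was larger than 2^31 - 1 bytes, and the load request in the log asks for `KvSize:1048576`. The sketch below is only a rough size estimate, not a diagnosis. `KV_SIZE` comes from the log; `ROW_ELEMS` (8 KV heads of 128 elements each) is an assumption about the model and does not appear in the log, and the log does not show which tensor or element type reached the copy. `BLOCK_LAYOUT` and `tensor_nbytes` are names made up for this illustration.

```python
INT_MAX = 2**31 - 1  # limit checked by GGML_ASSERT(ggml_nbytes(src0) <= INT_MAX)

# (bytes per block, elements per block) for a few ggml tensor types.
BLOCK_LAYOUT = {
    "f32": (4, 1),
    "f16": (2, 1),
    "q8_0": (34, 32),  # 32 int8 values plus one f16 scale per block
}


def tensor_nbytes(n_elements: int, dtype: str) -> int:
    """Return the storage size in bytes of n_elements of the given type."""
    block_bytes, block_elems = BLOCK_LAYOUT[dtype]
    if n_elements % block_elems != 0:
        raise ValueError("element count must be a multiple of the block size")
    return n_elements // block_elems * block_bytes


KV_SIZE = 1_048_576  # KvSize from the load request in the log
ROW_ELEMS = 8 * 128  # ASSUMPTION: 8 KV heads x 128 head dim, not taken from the log

for dtype in BLOCK_LAYOUT:
    nbytes = tensor_nbytes(KV_SIZE * ROW_ELEMS, dtype)
    print(f"{dtype:5s} {nbytes:>13,d} bytes  exceeds INT_MAX: {nbytes > INT_MAX}")
```

Under these assumed dimensions, a single f16 tensor of that shape is exactly 2^31 bytes, one byte over the limit, while the same shape stored as q8_0 stays below it. A smaller `KvSize` would shrink every figure proportionally.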

OS

Linux

GPU

Other

CPU

Other

Ollama version

0.15.5-rc4

/workspace/go/src/net/http/server.go:2294 +0x38 fp=0x4000c6db00 sp=0x4000c6dad0 pc=0x7487d8 net/http.(*ServeMux).ServeHTTP(0x10?, {0x17a0970, 0x40000ffea0}, 0x4000242140) /workspace/go/src/net/http/server.go:2822 +0x1b4 fp=0x4000c6db50 sp=0x4000c6db00 pc=0x74a364 net/http.serverHandler.ServeHTTP({0x179cd30?}, {0x17a0970?, 0x40000ffea0?}, 0x1?) /workspace/go/src/net/http/server.go:3301 +0xbc fp=0x4000c6db80 sp=0x4000c6db50 pc=0x76604c net/http.(*conn).serve(0x400012e480, {0x17a2e98, 0x4000708cc0}) /workspace/go/src/net/http/server.go:2102 +0x52c fp=0x4000c6dfa0 sp=0x4000c6db80 pc=0x746f7c net/http.(*Server).Serve.gowrap3() /workspace/go/src/net/http/server.go:3454 +0x30 fp=0x4000c6dfd0 sp=0x4000c6dfa0 pc=0x74c140 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000c6dfd0 sp=0x4000c6dfd0 pc=0x4971f4 created by net/http.(*Server).Serve in goroutine 1 /workspace/go/src/net/http/server.go:3454 +0x3d8 goroutine 1 gp=0x40000021c0 m=nil [IO wait]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000c6f720 sp=0x4000c6f700 pc=0x48f308 runtime.netpollblock(0x7000000000?, 0x6?, 0x0?) /workspace/go/src/runtime/netpoll.go:575 +0x158 fp=0x4000c6f760 sp=0x4000c6f720 pc=0x4542c8 internal/poll.runtime_pollWait(0xffff470e6f30, 0x72) /workspace/go/src/runtime/netpoll.go:351 +0xa0 fp=0x4000c6f790 sp=0x4000c6f760 pc=0x48e4c0 internal/poll.(*pollDesc).wait(0x4000627a00?, 0x517538?, 0x0) /workspace/go/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x4000c6f7c0 sp=0x4000c6f790 pc=0x510ad8 internal/poll.(*pollDesc).waitRead(...) 
/workspace/go/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Accept(0x4000627a00) /workspace/go/src/internal/poll/fd_unix.go:620 +0x24c fp=0x4000c6f870 sp=0x4000c6f7c0 pc=0x5153ac net.(*netFD).accept(0x4000627a00) /workspace/go/src/net/fd_unix.go:172 +0x28 fp=0x4000c6f930 sp=0x4000c6f870 pc=0x583eb8 net.(*TCPListener).accept(0x4000684980) /workspace/go/src/net/tcpsock_posix.go:159 +0x24 fp=0x4000c6f980 sp=0x4000c6f930 pc=0x599354 net.(*TCPListener).Accept(0x4000684980) /workspace/go/src/net/tcpsock.go:380 +0x2c fp=0x4000c6f9c0 sp=0x4000c6f980 pc=0x5982ec net/http.(*onceCloseListener).Accept(0x400012e480?) <autogenerated>:1 +0x30 fp=0x4000c6f9e0 sp=0x4000c6f9c0 pc=0x772670 net/http.(*Server).Serve(0x40001f1800, {0x17a0760, 0x4000684980}) /workspace/go/src/net/http/server.go:3424 +0x290 fp=0x4000c6fb10 sp=0x4000c6f9e0 pc=0x74bdb0 github.com/ollama/ollama/runner/ollamarunner.Execute({0x40000320a0, 0x4, 0x4}) /workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1446 +0x7fc fp=0x4000c6fce0 sp=0x4000c6fb10 pc=0x98558c github.com/ollama/ollama/runner.Execute({0x4000032080?, 0x0?, 0x0?}) /workspace/ollama-0.15.5-rc4/runner/runner.go:28 +0x1a4 fp=0x4000c6fd10 sp=0x4000c6fce0 pc=0x98c944 github.com/ollama/ollama/cmd.NewCLI.func3(0x40001f1600?, {0x15a3954?, 0x4?, 0x15a3958?}) /workspace/ollama-0.15.5-rc4/cmd/cmd.go:1979 +0x54 fp=0x4000c6fd40 sp=0x4000c6fd10 pc=0x1055284 github.com/spf13/cobra.(*Command).execute(0x4000131508, {0x40004cba40, 0x5, 0x5}) /root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x648 fp=0x4000c6fe60 sp=0x4000c6fd40 pc=0x5f3bb8 github.com/spf13/cobra.(*Command).ExecuteC(0x40006a0908) /root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x320 fp=0x4000c6ff20 sp=0x4000c6fe60 pc=0x5f4300 github.com/spf13/cobra.(*Command).Execute(...) /root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992 github.com/spf13/cobra.(*Command).ExecuteContext(...) 
/root/go/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985 main.main() /workspace/ollama-0.15.5-rc4/main.go:12 +0x54 fp=0x4000c6ff40 sp=0x4000c6ff20 pc=0x1055dd4 runtime.main() /workspace/go/src/runtime/proc.go:283 +0x284 fp=0x4000c6ffd0 sp=0x4000c6ff40 pc=0x45b674 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000c6ffd0 sp=0x4000c6ffd0 pc=0x4971f4 goroutine 2 gp=0x4000002c40 m=nil [force gc (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000070f90 sp=0x4000070f70 pc=0x48f308 runtime.goparkunlock(...) /workspace/go/src/runtime/proc.go:441 runtime.forcegchelper() /workspace/go/src/runtime/proc.go:348 +0xb8 fp=0x4000070fd0 sp=0x4000070f90 pc=0x45b9c8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000070fd0 sp=0x4000070fd0 pc=0x4971f4 created by runtime.init.7 in goroutine 1 /workspace/go/src/runtime/proc.go:336 +0x24 goroutine 3 gp=0x4000003180 m=nil [GC sweep wait]: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000071760 sp=0x4000071740 pc=0x48f308 runtime.goparkunlock(...) /workspace/go/src/runtime/proc.go:441 runtime.bgsweep(0x400009c000) /workspace/go/src/runtime/mgcsweep.go:316 +0x108 fp=0x40000717b0 sp=0x4000071760 pc=0x4461f8 runtime.gcenable.gowrap1() /workspace/go/src/runtime/mgc.go:204 +0x28 fp=0x40000717d0 sp=0x40000717b0 pc=0x43a028 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000717d0 sp=0x40000717d0 pc=0x4971f4 created by runtime.gcenable in goroutine 1 /workspace/go/src/runtime/mgc.go:204 +0x6c goroutine 4 gp=0x4000003340 m=nil [GC scavenge wait]: runtime.gopark(0x10000?, 0x178c0c0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000071f60 sp=0x4000071f40 pc=0x48f308 runtime.goparkunlock(...) 
/workspace/go/src/runtime/proc.go:441 runtime.(*scavengerState).park(0x2181fe0) /workspace/go/src/runtime/mgcscavenge.go:425 +0x5c fp=0x4000071f90 sp=0x4000071f60 pc=0x443cbc runtime.bgscavenge(0x400009c000) /workspace/go/src/runtime/mgcscavenge.go:658 +0xac fp=0x4000071fb0 sp=0x4000071f90 pc=0x44423c runtime.gcenable.gowrap2() /workspace/go/src/runtime/mgc.go:205 +0x28 fp=0x4000071fd0 sp=0x4000071fb0 pc=0x439fc8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000071fd0 sp=0x4000071fd0 pc=0x4971f4 created by runtime.gcenable in goroutine 1 /workspace/go/src/runtime/mgc.go:205 +0xac goroutine 5 gp=0x4000003c00 m=nil [finalizer wait]: runtime.gopark(0x18000001b8?, 0x1000000000000?, 0xf8?, 0x5?, 0x7750bc?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000070590 sp=0x4000070570 pc=0x48f308 runtime.runfinq() /workspace/go/src/runtime/mfinal.go:196 +0x108 fp=0x40000707d0 sp=0x4000070590 pc=0x439028 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000707d0 sp=0x40000707d0 pc=0x4971f4 created by runtime.createfing in goroutine 1 /workspace/go/src/runtime/mfinal.go:166 +0x80 goroutine 6 gp=0x40001dc700 m=nil [chan receive]: runtime.gopark(0x40002295e0?, 0x4003188048?, 0x48?, 0x27?, 0x55c478?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x40000726f0 sp=0x40000726d0 pc=0x48f308 runtime.chanrecv(0x4000036380, 0x0, 0x1) /workspace/go/src/runtime/chan.go:664 +0x42c fp=0x4000072770 sp=0x40000726f0 pc=0x42b08c runtime.chanrecv1(0x0?, 0x0?) /workspace/go/src/runtime/chan.go:506 +0x14 fp=0x40000727a0 sp=0x4000072770 pc=0x42ac24 runtime.unique_runtime_registerUniqueMapCleanup.func2(...) 
/workspace/go/src/runtime/mgc.go:1796 runtime.unique_runtime_registerUniqueMapCleanup.gowrap1() /workspace/go/src/runtime/mgc.go:1799 +0x3c fp=0x40000727d0 sp=0x40000727a0 pc=0x43d24c runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000727d0 sp=0x40000727d0 pc=0x4971f4 created by unique.runtime_registerUniqueMapCleanup in goroutine 1 /workspace/go/src/runtime/mgc.go:1794 +0x78 goroutine 7 gp=0x40001dce00 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000072f10 sp=0x4000072ef0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x4000072fb0 sp=0x4000072f10 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x4000072fd0 sp=0x4000072fb0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000072fd0 sp=0x4000072fd0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 18 gp=0x4000504000 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400006c710 sp=0x400006c6f0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400006c7b0 sp=0x400006c710 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400006c7d0 sp=0x400006c7b0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400006c7d0 sp=0x400006c7d0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 34 gp=0x4000102380 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011a710 sp=0x400011a6f0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011a7b0 sp=0x400011a710 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011a7d0 sp=0x400011a7b0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011a7d0 sp=0x400011a7d0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 8 gp=0x40001dcfc0 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358f8652c5?, 0x0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000073710 sp=0x40000736f0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x40000737b0 sp=0x4000073710 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x40000737d0 sp=0x40000737b0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000737d0 sp=0x40000737d0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 19 gp=0x40005041c0 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358f86d604?, 0x3?, 0x1a?, 0x49?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400006cf10 sp=0x400006cef0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400006cfb0 sp=0x400006cf10 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400006cfd0 sp=0x400006cfb0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400006cfd0 sp=0x400006cfd0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 35 gp=0x4000102540 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358f868405?, 0x0?, 0x0?, 0x0?, 0x0?) 
/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011af10 sp=0x400011aef0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011afb0 sp=0x400011af10 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011afd0 sp=0x400011afb0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011afd0 sp=0x400011afd0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 36 gp=0x4000102700 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358efa70d5?, 0x0?, 0x0?, 0x0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011b710 sp=0x400011b6f0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011b7b0 sp=0x400011b710 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011b7d0 sp=0x400011b7b0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011b7d0 sp=0x400011b7d0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 37 gp=0x40001028c0 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358f8653a5?, 0x1?, 0x43?, 0x8b?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011bf10 sp=0x400011bef0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011bfb0 sp=0x400011bf10 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011bfd0 sp=0x400011bfb0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011bfd0 sp=0x400011bfd0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 38 gp=0x4000102a80 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358f863e25?, 0x3?, 0x59?, 0x38?, 0x0?) 
/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011c710 sp=0x400011c6f0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011c7b0 sp=0x400011c710 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011c7d0 sp=0x400011c7b0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011c7d0 sp=0x400011c7d0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 39 gp=0x4000102c40 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358f865045?, 0x1?, 0x9f?, 0x69?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011cf10 sp=0x400011cef0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400011cfb0 sp=0x400011cf10 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400011cfd0 sp=0x400011cfb0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011cfd0 sp=0x400011cfd0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 20 gp=0x4000504380 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358ef5675e?, 0x3?, 0x99?, 0x1f?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400006d710 sp=0x400006d6f0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x400006d7b0 sp=0x400006d710 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x400006d7d0 sp=0x400006d7b0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400006d7d0 sp=0x400006d7d0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 9 gp=0x40001dd180 m=nil [GC worker (idle)]: runtime.gopark(0x6c5358f865045?, 0x3?, 0x75?, 0xa8?, 0x0?) 
/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000073f10 sp=0x4000073ef0 pc=0x48f308 runtime.gcBgMarkWorker(0x40000377a0) /workspace/go/src/runtime/mgc.go:1423 +0xdc fp=0x4000073fb0 sp=0x4000073f10 pc=0x43c4bc runtime.gcBgMarkStartWorkers.gowrap1() /workspace/go/src/runtime/mgc.go:1339 +0x28 fp=0x4000073fd0 sp=0x4000073fb0 pc=0x43c3a8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000073fd0 sp=0x4000073fd0 pc=0x4971f4 created by runtime.gcBgMarkStartWorkers in goroutine 1 /workspace/go/src/runtime/mgc.go:1339 +0x140 goroutine 40 gp=0x40001dd6c0 m=nil [sync.WaitGroup.Wait]: runtime.gopark(0x2193d60?, 0x0?, 0x60?, 0xe0?, 0x0?) /workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x4000083a90 sp=0x4000083a70 pc=0x48f308 runtime.goparkunlock(...) /workspace/go/src/runtime/proc.go:441 runtime.semacquire1(0x4000226dd8, 0x0, 0x1, 0x0, 0x18) /workspace/go/src/runtime/sema.go:188 +0x204 fp=0x4000083ae0 sp=0x4000083a90 pc=0x46fb14 sync.runtime_SemacquireWaitGroup(0x0?) /workspace/go/src/runtime/sema.go:110 +0x2c fp=0x4000083b20 sp=0x4000083ae0 pc=0x490cbc sync.(*WaitGroup).Wait(0x4000226dd0) /workspace/go/src/sync/waitgroup.go:118 +0x70 fp=0x4000083b40 sp=0x4000083b20 pc=0x4a28d0 github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0x4000226d20, {0x17a2ed0, 0x40004cbae0}) /workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:441 +0x38 fp=0x4000083fa0 sp=0x4000083b40 pc=0x97d958 github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap1() /workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1423 +0x30 fp=0x4000083fd0 sp=0x4000083fa0 pc=0x9857b0 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x4000083fd0 sp=0x4000083fd0 pc=0x4971f4 created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1 /workspace/ollama-0.15.5-rc4/runner/ollamarunner/runner.go:1423 +0x448 goroutine 43 gp=0x40001dda40 m=nil [IO wait]: runtime.gopark(0x0?, 0x0?, 0xc8?, 0xd5?, 0x42f6d0?) 
/workspace/go/src/runtime/proc.go:435 +0xc8 fp=0x400011d580 sp=0x400011d560 pc=0x48f308 runtime.netpollblock(0x0?, 0xffffffff?, 0xff?) /workspace/go/src/runtime/netpoll.go:575 +0x158 fp=0x400011d5c0 sp=0x400011d580 pc=0x4542c8 internal/poll.runtime_pollWait(0xffff470e6e18, 0x72) /workspace/go/src/runtime/netpoll.go:351 +0xa0 fp=0x400011d5f0 sp=0x400011d5c0 pc=0x48e4c0 internal/poll.(*pollDesc).wait(0x4000627a80?, 0x4000708dc1?, 0x0) /workspace/go/src/internal/poll/fd_poll_runtime.go:84 +0x28 fp=0x400011d620 sp=0x400011d5f0 pc=0x510ad8 internal/poll.(*pollDesc).waitRead(...) /workspace/go/src/internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Read(0x4000627a80, {0x4000708dc1, 0x1, 0x1}) /workspace/go/src/internal/poll/fd_unix.go:165 +0x1fc fp=0x400011d6c0 sp=0x400011d620 pc=0x511d8c net.(*netFD).Read(0x4000627a80, {0x4000708dc1?, 0x400011d758?, 0x7419f4?}) /workspace/go/src/net/fd_posix.go:55 +0x28 fp=0x400011d710 sp=0x400011d6c0 pc=0x582488 net.(*conn).Read(0x4000074758, {0x4000708dc1?, 0x0?, 0x0?}) /workspace/go/src/net/net.go:194 +0x34 fp=0x400011d760 sp=0x400011d710 pc=0x58fbc4 net/http.(*connReader).backgroundRead(0x4000708db0) /workspace/go/src/net/http/server.go:690 +0x40 fp=0x400011d7b0 sp=0x400011d760 pc=0x7418f0 net/http.(*connReader).startBackgroundRead.gowrap2() /workspace/go/src/net/http/server.go:686 +0x28 fp=0x400011d7d0 sp=0x400011d7b0 pc=0x7417d8 runtime.goexit({}) /workspace/go/src/runtime/asm_arm64.s:1223 +0x4 fp=0x400011d7d0 sp=0x400011d7d0 pc=0x4971f4 created by net/http.(*connReader).startBackgroundRead in goroutine 41 /workspace/go/src/net/http/server.go:686 +0xc4 r0 0x0 r1 0x24d r2 0x6 r3 0xffff43f6e0c0 r4 0xffff8eb5eb50 r5 0x1 r6 0x20 r7 0xffff43f6c7f0 r8 0x83 r9 0x0 r10 0xa r11 0x101010101010101 r12 0x2 r13 0x0 r14 0x7251e0a150f1d3a r15 0x3a574268 r16 0x1 r17 0xffff8e3e7d0c r18 0xffff8eb5eb50 r19 0x24d r20 0xffff43f6e0c0 r21 0x6 r22 0xffff43f6c928 r23 0x0 r24 0xffff43f6d578 r25 0xffff43f6d550 r26 0x3b334470 r27 0x3a16fb00 r28 
0x3a574268 r29 0xffff43f6c790 lr 0xffff8e4475f4 sp 0xffff43f6c780 pc 0xffff8e447608 fault 0x0 time=2026-02-05T09:40:32.850Z level=ERROR source=server.go:1204 msg="do load request" error="Post \"http://127.0.0.1:45139/load\": EOF" time=2026-02-05T09:40:32.851Z level=ERROR source=server.go:1204 msg="do load request" error="Post \"http://127.0.0.1:45139/load\": dial tcp 127.0.0.1:45139: connect: connection refused" time=2026-02-05T09:40:32.851Z level=INFO source=sched.go:490 msg="Load failed" model=/root/.ollama/models/blobs/sha256-85e4a5b7b8ef0e48af0e8658f5aaab9c2324c76c1641493f4d1e25fce54b18b9 error="model failed to load, this may be due to resource limitations or an internal error, check ollama server logs for details" time=2026-02-05T09:40:32.851Z level=DEBUG source=server.go:1829 msg="stopping llama server" pid=581 time=2026-02-05T09:40:32.851Z level=DEBUG source=server.go:1835 msg="waiting for llama server to exit" pid=581 time=2026-02-05T09:40:32.852Z level=ERROR source=server.go:303 msg="llama runner terminated" error="exit status 2" time=2026-02-05T09:40:32.852Z level=DEBUG source=server.go:1839 msg="llama server stopped" pid=581 time=2026-02-05T09:40:32.852Z level=DEBUG source=sched.go:241 msg="new model fits with existing models, loading" ``` ### OS Linux ### GPU Other ### CPU Other ### Ollama version 0.15.5-rc4
GiteaMirror added the bug label 2026-04-12 22:03:08 -05:00

@rick-github commented on GitHub (Feb 5, 2026):

The -rc releases have new default context logic that doesn't account for `OLLAMA_NUM_PARALLEL`. If you haven't set `OLLAMA_CONTEXT_LENGTH` then the default context and `OLLAMA_NUM_PARALLEL=4` may be consuming all of your VRAM and causing an issue. Setting `OLLAMA_CONTEXT_LENGTH=4096` in the server environment will return to the previous behaviour until the issue is resolved in an ollama release.
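For anyone hitting the same crash, here is a rough sketch of the arithmetic behind this explanation and of the suggested workaround. The `KvSize` and `Parallel` values are copied from the load request in the log; the assumption that `KvSize` equals the per-slot context multiplied by the parallel count is mine and is not confirmed in this thread. The variable names `kv_size`, `parallel`, `per_slot`, and `pinned_total` are purely illustrative.

```shell
#!/bin/sh
# Values copied from the "load request" line in the reported log.
kv_size=1048576   # KvSize
parallel=4        # Parallel (set via OLLAMA_NUM_PARALLEL=4)

# Context per parallel slot implied by the log.
# Assumption: KvSize = context length * parallel slots.
per_slot=$((kv_size / parallel))
echo "context per slot in the failing run: ${per_slot}"

# Workaround suggested above: pin the context length in the server environment.
export OLLAMA_CONTEXT_LENGTH=4096

# Total KV size one would expect afterwards, under the same assumption.
pinned_total=$((OLLAMA_CONTEXT_LENGTH * parallel))
echo "expected KvSize with the workaround: ${pinned_total}"

# The server must be restarted from an environment that contains the variable
# so that the runner subprocess inherits it.
```

After restarting the server, the `KvSize` field in the next load request line of the log should show whether the pinned value took effect.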


@luhuaei commented on GitHub (Feb 6, 2026):

Thank you for the explanation

Reference: github-starred/ollama#9200