[GH-ISSUE #13981] Ollama无法利用Intel UHD Graphics 630和NVIDIA GeForce GT 720一起运行llm #9144

Closed
opened 2026-04-12 21:59:46 -05:00 by GiteaMirror · 1 comment
Owner

Originally created by @gitcbz on GitHub (Jan 30, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/13981

What is the issue?

C:\Users\~~~>ollama run qwen3:4b
>>> 你好
Error: 500 Internal Server Error: model runner has unexpectedly stopped, this may be due to resource limitations or an internal error, check ollama server logs for details

我的Gpu:Intel UHD Graphics 630 和 NVIDIA GeForce GT 720

Relevant log output

time=2026-01-30T21:52:44.701+08:00 level=INFO source=routes.go:1614 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:D:\\Program Files (x86)\\Ollama\\Models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:true OLLAMA_VULKAN:true ROCR_VISIBLE_DEVICES:]"
time=2026-01-30T21:52:44.719+08:00 level=INFO source=images.go:499 msg="total blobs: 14"
time=2026-01-30T21:52:44.721+08:00 level=INFO source=images.go:506 msg="total unused blobs removed: 0"
time=2026-01-30T21:52:44.722+08:00 level=INFO source=routes.go:1667 msg="Listening on [::]:11434 (version 0.14.2)"
time=2026-01-30T21:52:44.724+08:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-01-30T21:52:44.783+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52144"
time=2026-01-30T21:52:44.948+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52151"
time=2026-01-30T21:52:45.088+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52157"
time=2026-01-30T21:52:45.197+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52164"
time=2026-01-30T21:52:46.766+08:00 level=INFO source=types.go:42 msg="inference compute" id=337ff6b7-f904-f3b4-084d-941807ea7d6c filter_id="" library=Vulkan compute=0.0 name=Vulkan1 description="NVIDIA GeForce GT 720" libdirs=ollama,vulkan driver=0.0 pci_id=0000:01:00.0 type=discrete total="2.0 GiB" available="2.0 GiB"
time=2026-01-30T21:52:46.766+08:00 level=INFO source=types.go:42 msg="inference compute" id=8680923e-0000-0000-0000-000000000000 filter_id="" library=Vulkan compute=0.0 name=Vulkan0 description="Intel(R) UHD Graphics 630" libdirs=ollama,vulkan driver=0.0 pci_id="" type=iGPU total="8.1 GiB" available="7.2 GiB"
time=2026-01-30T21:52:46.766+08:00 level=INFO source=routes.go:1708 msg="entering low vram mode" "total vram"="10.0 GiB" threshold="20.0 GiB"
[GIN] 2026/01/30 - 21:52:52 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2026/01/30 - 21:52:53 | 200 |    105.6466ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2026/01/30 - 21:52:53 | 200 |    102.1436ms |       127.0.0.1 | POST     "/api/show"
time=2026-01-30T21:52:53.283+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 62316"
time=2026-01-30T21:52:54.760+08:00 level=INFO source=cpu_windows.go:148 msg=packages count=1
time=2026-01-30T21:52:54.760+08:00 level=INFO source=cpu_windows.go:195 msg="" package=0 cores=6 efficiency=0 threads=12
time=2026-01-30T21:52:54.836+08:00 level=INFO source=server.go:245 msg="enabling flash attention"
time=2026-01-30T21:52:54.837+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model D:\\Program Files (x86)\\Ollama\\Models\\blobs\\sha256-3e4cb14174460404e7a233e531675303b2fbf7749c02f91864fe311ab6344e4f --port 62327"
time=2026-01-30T21:52:54.837+08:00 level=INFO source=sched.go:452 msg="system memory" total="15.9 GiB" free="8.1 GiB" free_swap="15.5 GiB"
time=2026-01-30T21:52:54.837+08:00 level=INFO source=sched.go:459 msg="gpu memory" id=8680923e-0000-0000-0000-000000000000 library=Vulkan available="6.8 GiB" free="7.3 GiB" minimum="457.0 MiB" overhead="0 B"
time=2026-01-30T21:52:54.837+08:00 level=INFO source=sched.go:459 msg="gpu memory" id=337ff6b7-f904-f3b4-084d-941807ea7d6c library=Vulkan available="1.5 GiB" free="2.0 GiB" minimum="457.0 MiB" overhead="0 B"
time=2026-01-30T21:52:54.837+08:00 level=INFO source=server.go:755 msg="loading model" "model layers"=37 requested=-1
time=2026-01-30T21:52:54.881+08:00 level=INFO source=runner.go:1405 msg="starting ollama engine"
time=2026-01-30T21:52:54.896+08:00 level=INFO source=runner.go:1440 msg="Server listening on 127.0.0.1:62327"
time=2026-01-30T21:52:54.908+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:8680923e-0000-0000-0000-000000000000 Layers:37(0..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-01-30T21:52:54.936+08:00 level=INFO source=ggml.go:136 msg="" architecture=qwen3 file_type=Q4_K_M name="Qwen3 4B Thinking 2507" description="" num_tensors=398 num_key_values=33
load_backend: loaded CPU backend from C:\Users\Administrator\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = Intel(R) UHD Graphics 630 (Intel Corporation) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
ggml_vulkan: 1 = NVIDIA GeForce GT 720 (NVIDIA) | uma: 0 | fp16: 0 | bf16: 0 | warp size: 32 | shared memory: 49152 | int dot: 0 | matrix cores: none
load_backend: loaded Vulkan backend from C:\Users\Administrator\AppData\Local\Programs\Ollama\lib\ollama\vulkan\ggml-vulkan.dll
time=2026-01-30T21:52:55.230+08:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(clang)
ggml_backend_vk_get_device_memory called: uuid 8680923e-0000-0000-0000-000000000000
ggml_backend_vk_get_device_memory called: luid 0x0000000000014131
ggml_dxgi_pdh_init called
DXGI + PDH Initialized. Getting GPU free memory info
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB
Integrated GPU (Intel(R) UHD Graphics 630) with LUID 0x0000000000014131 detected. Shared Total: 8517677056.00 bytes (7.93 GB), Shared Usage: 861556736.00 bytes (0.80 GB), Dedicated Total: 134217728.00 bytes (0.12 GB), Dedicated Usage: 0.00 bytes (0.00 GB)
ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 7790338048 total: 8651894784
ggml_backend_vk_get_device_memory called: uuid 337ff6b7-f904-f3b4-084d-941807ea7d6c
ggml_backend_vk_get_device_memory called: luid 0x0000000000014689
ggml_dxgi_pdh_init called
DXGI + PDH Initialized. Getting GPU free memory info
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB
Discrete GPU (NVIDIA GeForce GT 720) with LUID 0x0000000000014689 detected. Dedicated Total: 2104819712.00 bytes (1.96 GB), Dedicated Usage: 9396224.00 bytes (0.01 GB)
ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 2095423488 total: 2104819712
time=2026-01-30T21:52:55.848+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:337ff6b7-f904-f3b4-084d-941807ea7d6c Layers:7(0..6) ID:8680923e-0000-0000-0000-000000000000 Layers:30(7..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
ggml_backend_vk_get_device_memory called: uuid 8680923e-0000-0000-0000-000000000000
ggml_backend_vk_get_device_memory called: luid 0x0000000000014131
ggml_dxgi_pdh_init called
DXGI + PDH Initialized. Getting GPU free memory info
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB
Integrated GPU (Intel(R) UHD Graphics 630) with LUID 0x0000000000014131 detected. Shared Total: 8517677056.00 bytes (7.93 GB), Shared Usage: 861773824.00 bytes (0.80 GB), Dedicated Total: 134217728.00 bytes (0.12 GB), Dedicated Usage: 0.00 bytes (0.00 GB)
ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 7790120960 total: 8651894784
ggml_backend_vk_get_device_memory called: uuid 337ff6b7-f904-f3b4-084d-941807ea7d6c
ggml_backend_vk_get_device_memory called: luid 0x0000000000014689
ggml_dxgi_pdh_init called
DXGI + PDH Initialized. Getting GPU free memory info
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB
Discrete GPU (NVIDIA GeForce GT 720) with LUID 0x0000000000014689 detected. Dedicated Total: 2104819712.00 bytes (1.96 GB), Dedicated Usage: 9396224.00 bytes (0.01 GB)
ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 2095423488 total: 2104819712
time=2026-01-30T21:52:56.161+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:337ff6b7-f904-f3b4-084d-941807ea7d6c Layers:7(0..6) ID:8680923e-0000-0000-0000-000000000000 Layers:30(7..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
ggml_backend_vk_get_device_memory called: uuid 8680923e-0000-0000-0000-000000000000
ggml_backend_vk_get_device_memory called: luid 0x0000000000014131
ggml_dxgi_pdh_init called
DXGI + PDH Initialized. Getting GPU free memory info
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB
Integrated GPU (Intel(R) UHD Graphics 630) with LUID 0x0000000000014131 detected. Shared Total: 8517677056.00 bytes (7.93 GB), Shared Usage: 861569024.00 bytes (0.80 GB), Dedicated Total: 134217728.00 bytes (0.12 GB), Dedicated Usage: 0.00 bytes (0.00 GB)
ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 7790325760 total: 8651894784
ggml_backend_vk_get_device_memory called: uuid 337ff6b7-f904-f3b4-084d-941807ea7d6c
ggml_backend_vk_get_device_memory called: luid 0x0000000000014689
ggml_dxgi_pdh_init called
DXGI + PDH Initialized. Getting GPU free memory info
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB
[DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB
Discrete GPU (NVIDIA GeForce GT 720) with LUID 0x0000000000014689 detected. Dedicated Total: 2104819712.00 bytes (1.96 GB), Dedicated Usage: 9396224.00 bytes (0.01 GB)
ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 2095423488 total: 2104819712
time=2026-01-30T21:52:57.062+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:337ff6b7-f904-f3b4-084d-941807ea7d6c Layers:7(0..6) ID:8680923e-0000-0000-0000-000000000000 Layers:30(7..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=ggml.go:482 msg="offloading 36 repeating layers to GPU"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=ggml.go:489 msg="offloading output layer to GPU"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=ggml.go:494 msg="offloaded 37/37 layers to GPU"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:240 msg="model weights" device=Vulkan0 size="1.9 GiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:240 msg="model weights" device=Vulkan1 size="400.1 MiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:245 msg="model weights" device=CPU size="304.3 MiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:251 msg="kv cache" device=Vulkan0 size="464.0 MiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:251 msg="kv cache" device=Vulkan1 size="112.0 MiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:262 msg="compute graph" device=Vulkan0 size="71.0 MiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:262 msg="compute graph" device=Vulkan1 size="79.0 MiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:267 msg="compute graph" device=CPU size="5.0 MiB"
time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:272 msg="total memory" size="3.3 GiB"
time=2026-01-30T21:52:57.064+08:00 level=INFO source=sched.go:526 msg="loaded runners" count=1
time=2026-01-30T21:52:57.064+08:00 level=INFO source=server.go:1347 msg="waiting for llama runner to start responding"
time=2026-01-30T21:52:57.064+08:00 level=INFO source=server.go:1381 msg="waiting for server to become available" status="llm server loading model"
time=2026-01-30T21:53:04.078+08:00 level=INFO source=server.go:1385 msg="llama runner started in 9.24 seconds"
[GIN] 2026/01/30 - 21:53:04 | 200 |   10.8871466s |       127.0.0.1 | POST     "/api/generate"
Exception 0xe06d7363 0x19930520 0xf227ff3e0 0x7ffcbcb67f7a
PC=0x7ffcbcb67f7a
signal arrived during external code execution

runtime.cgocall(0x7ff6af9695e0, 0xc000567aa0)
        runtime/cgocall.go:167 +0x3e fp=0xc000567a78 sp=0xc000567a10 pc=0x7ff6aebb243e
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_sched_graph_compute_async(0x180fec00010, 0x1817a884ed0)
        _cgo_gotypes.go:961 +0x50 fp=0xc000567aa0 sp=0xc000567a78 pc=0x7ff6af047a70
github.com/ollama/ollama/ml/backend/ggml.(*Context).ComputeWithNotify.func2(...)
        github.com/ollama/ollama/ml/backend/ggml/ggml.go:825
github.com/ollama/ollama/ml/backend/ggml.(*Context).ComputeWithNotify(0xc001f7c380, 0xc001e60de0?, {0xc001f70ae0, 0x1, 0x2?})
        github.com/ollama/ollama/ml/backend/ggml/ggml.go:825 +0x1b5 fp=0xc000567b78 sp=0xc000567aa0 pc=0x7ff6af056135
github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc000516000, {0x0, {0x7ff6b01a8b20, 0xc001f7c380}, {0x7ff6b01b6208, 0xc0004dac60}, {0xc000452e00, 0xb, 0x10}, {{0x7ff6b01b6208, ...}, ...}, ...})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:723 +0x876 fp=0xc000567ef0 sp=0xc000567b78 pc=0x7ff6af130096
github.com/ollama/ollama/runner/ollamarunner.(*Server).run.gowrap1()
        github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x58 fp=0xc000567fe0 sp=0xc000567ef0 pc=0x7ff6af12dc98
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000567fe8 sp=0xc000567fe0 pc=0x7ff6aebbd8e1
created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 50
        github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x2cd

goroutine 1 gp=0xc0000021c0 m=nil [IO wait, 1 minutes]:
runtime.gopark(0x7ff6aebbf0e0?, 0x7ff6b0b78540?, 0x20?, 0x80?, 0xc0005180cc?)
        runtime/proc.go:435 +0xce fp=0xc000231648 sp=0xc000231628 pc=0x7ff6aebb598e
runtime.netpollblock(0x1a8?, 0xaeb50406?, 0xf6?)
        runtime/netpoll.go:575 +0xf7 fp=0xc000231680 sp=0xc000231648 pc=0x7ff6aeb7bdf7
internal/poll.runtime_pollWait(0x180f68a4cb0, 0x72)
        runtime/netpoll.go:351 +0x85 fp=0xc0002316a0 sp=0xc000231680 pc=0x7ff6aebb4b25
internal/poll.(*pollDesc).wait(0x7ff6aec4a7b3?, 0x0?, 0x0)
        internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0002316c8 sp=0xc0002316a0 pc=0x7ff6aec4bda7
internal/poll.execIO(0xc000518020, 0xc000231770)
        internal/poll/fd_windows.go:177 +0x105 fp=0xc000231740 sp=0xc0002316c8 pc=0x7ff6aec4d205
internal/poll.(*FD).acceptOne(0xc000518008, 0x204, {0xc0005201e0?, 0xc0002317d0?, 0x7ff6aebbb8f7?}, 0xc000231810?)
        internal/poll/fd_windows.go:946 +0x65 fp=0xc0002317a0 sp=0xc000231740 pc=0x7ff6aec51785
internal/poll.(*FD).Accept(0xc000518008, 0xc000231950)
        internal/poll/fd_windows.go:980 +0x1b6 fp=0xc000231858 sp=0xc0002317a0 pc=0x7ff6aec51ab6
net.(*netFD).accept(0xc000518008)
        net/fd_windows.go:182 +0x4b fp=0xc000231970 sp=0xc000231858 pc=0x7ff6aecc326b
net.(*TCPListener).accept(0xc000602080)
        net/tcpsock_posix.go:159 +0x1b fp=0xc0002319c0 sp=0xc000231970 pc=0x7ff6aecd981b
net.(*TCPListener).Accept(0xc000602080)
        net/tcpsock.go:380 +0x30 fp=0xc0002319f0 sp=0xc0002319c0 pc=0x7ff6aecd85d0
net/http.(*onceCloseListener).Accept(0xc0005b4000?)
        <autogenerated>:1 +0x24 fp=0xc000231a08 sp=0xc0002319f0 pc=0x7ff6aeef1bc4
net/http.(*Server).Serve(0xc00051c000, {0x7ff6b019a930, 0xc000602080})
        net/http/server.go:3424 +0x30c fp=0xc000231b38 sp=0xc000231a08 pc=0x7ff6aeec948c
github.com/ollama/ollama/runner/ollamarunner.Execute({0xc0000a2030, 0x4, 0x5})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:1441 +0x94e fp=0xc000231d08 sp=0xc000231b38 pc=0x7ff6af136e8e
github.com/ollama/ollama/runner.Execute({0xc0000a2010?, 0x0?, 0x0?})
        github.com/ollama/ollama/runner/runner.go:28 +0x130 fp=0xc000231d30 sp=0xc000231d08 pc=0x7ff6af1377f0
github.com/ollama/ollama/cmd.NewCLI.func2(0xc000143300?, {0x7ff6aff9f302?, 0x4?, 0x7ff6aff9f306?})
        github.com/ollama/ollama/cmd/cmd.go:1881 +0x45 fp=0xc000231d58 sp=0xc000231d30 pc=0x7ff6af8fbf25
github.com/spf13/cobra.(*Command).execute(0xc000219508, {0xc0003507d0, 0x5, 0x5})
        github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc000231e78 sp=0xc000231d58 pc=0x7ff6aed3e41c
github.com/spf13/cobra.(*Command).ExecuteC(0xc0000b4908)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000231f30 sp=0xc000231e78 pc=0x7ff6aed3ec65
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
        github.com/ollama/ollama/main.go:12 +0x4d fp=0xc000231f50 sp=0xc000231f30 pc=0x7ff6af8fca0d
runtime.main()
        runtime/proc.go:283 +0x27d fp=0xc000231fe0 sp=0xc000231f50 pc=0x7ff6aeb84ddd
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000231fe8 sp=0xc000231fe0 pc=0x7ff6aebbd8e1

goroutine 2 gp=0xc0000028c0 m=nil [force gc (idle), 1 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc00006ffa8 sp=0xc00006ff88 pc=0x7ff6aebb598e
runtime.goparkunlock(...)
        runtime/proc.go:441
runtime.forcegchelper()
        runtime/proc.go:348 +0xb8 fp=0xc00006ffe0 sp=0xc00006ffa8 pc=0x7ff6aeb850f8
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x7ff6aebbd8e1
created by runtime.init.7 in goroutine 1
        runtime/proc.go:336 +0x1a

goroutine 3 gp=0xc000002c40 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000071f80 sp=0xc000071f60 pc=0x7ff6aebb598e
runtime.goparkunlock(...)
        runtime/proc.go:441
runtime.bgsweep(0xc00007e000)
        runtime/mgcsweep.go:316 +0xdf fp=0xc000071fc8 sp=0xc000071f80 pc=0x7ff6aeb6debf
runtime.gcenable.gowrap1()
        runtime/mgc.go:204 +0x25 fp=0xc000071fe0 sp=0xc000071fc8 pc=0x7ff6aeb62285
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000071fe8 sp=0xc000071fe0 pc=0x7ff6aebbd8e1
created by runtime.gcenable in goroutine 1
        runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000002e00 m=nil [GC scavenge wait]:
runtime.gopark(0xf724c?, 0xd34bc?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x7ff6aebb598e
runtime.goparkunlock(...)
        runtime/proc.go:441
runtime.(*scavengerState).park(0x7ff6b0b9f920)
        runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x7ff6aeb6b909
runtime.bgscavenge(0xc00007e000)
        runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x7ff6aeb6be99
runtime.gcenable.gowrap2()
        runtime/mgc.go:205 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x7ff6aeb62225
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x7ff6aebbd8e1
created by runtime.gcenable in goroutine 1
        runtime/mgc.go:205 +0xa5

goroutine 5 gp=0xc000003340 m=nil [finalizer wait, 1 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000087e30 sp=0xc000087e10 pc=0x7ff6aebb598e
runtime.runfinq()
        runtime/mfinal.go:196 +0x107 fp=0xc000087fe0 sp=0xc000087e30 pc=0x7ff6aeb61207
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000087fe8 sp=0xc000087fe0 pc=0x7ff6aebbd8e1
created by runtime.createfing in goroutine 1
        runtime/mfinal.go:166 +0x3d

goroutine 6 gp=0xc000003dc0 m=nil [chan receive]:
runtime.gopark(0xc00017d0e0?, 0xc000dcc030?, 0x60?, 0x3f?, 0x7ff6aecabf68?)
        runtime/proc.go:435 +0xce fp=0xc000073f18 sp=0xc000073ef8 pc=0x7ff6aebb598e
runtime.chanrecv(0xc00003c380, 0x0, 0x1)
        runtime/chan.go:664 +0x445 fp=0xc000073f90 sp=0xc000073f18 pc=0x7ff6aeb52d45
runtime.chanrecv1(0x7ff6aeb84f40?, 0xc000073f76?)
        runtime/chan.go:506 +0x12 fp=0xc000073fb8 sp=0xc000073f90 pc=0x7ff6aeb528d2
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
        runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
        runtime/mgc.go:1799 +0x2f fp=0xc000073fe0 sp=0xc000073fb8 pc=0x7ff6aeb654af
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000073fe8 sp=0xc000073fe0 pc=0x7ff6aebbd8e1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
        runtime/mgc.go:1794 +0x85

goroutine 7 gp=0xc0003ee540 m=nil [GC worker (idle), 1 minutes]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000081f38 sp=0xc000081f18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc000081fc8 sp=0xc000081f38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc000081fe0 sp=0xc000081fc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 18 gp=0xc0004861c0 m=nil [GC worker (idle)]:
runtime.gopark(0x480e487a53c?, 0x3?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000491f38 sp=0xc000491f18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc000491fc8 sp=0xc000491f38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc000491fe0 sp=0xc000491fc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000491fe8 sp=0xc000491fe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 19 gp=0xc000486380 m=nil [GC worker (idle)]:
runtime.gopark(0x480e4a928ec?, 0x3?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000493f38 sp=0xc000493f18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc000493fc8 sp=0xc000493f38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc000493fe0 sp=0xc000493fc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000493fe8 sp=0xc000493fe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 34 gp=0xc000206000 m=nil [GC worker (idle), 1 minutes]:
runtime.gopark(0x480e487a53c?, 0x1?, 0x68?, 0xf5?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc00048df38 sp=0xc00048df18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc00048dfc8 sp=0xc00048df38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc00048dfe0 sp=0xc00048dfc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00048dfe8 sp=0xc00048dfe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 8 gp=0xc0003ee700 m=nil [GC worker (idle), 1 minutes]:
runtime.gopark(0x480e487a53c?, 0x3?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc000083fc8 sp=0xc000083f38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 35 gp=0xc0002061c0 m=nil [GC worker (idle)]:
runtime.gopark(0x480e4a928ec?, 0x1?, 0xd4?, 0xaf?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc00048ff38 sp=0xc00048ff18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc00048ffc8 sp=0xc00048ff38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc00048ffe0 sp=0xc00048ffc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00048ffe8 sp=0xc00048ffe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 9 gp=0xc0003ee8c0 m=nil [GC worker (idle)]:
runtime.gopark(0x480e487a53c?, 0x1?, 0x8?, 0xb?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc00047bf38 sp=0xc00047bf18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc00047bfc8 sp=0xc00047bf38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc00047bfe0 sp=0xc00047bfc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00047bfe8 sp=0xc00047bfe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 36 gp=0xc000206380 m=nil [GC worker (idle), 1 minutes]:
runtime.gopark(0x46934493f80?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000477f38 sp=0xc000477f18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc000477fc8 sp=0xc000477f38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc000477fe0 sp=0xc000477fc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000477fe8 sp=0xc000477fe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 20 gp=0xc000486540 m=nil [GC worker (idle)]:
runtime.gopark(0x480e4a928ec?, 0x1?, 0x14?, 0xfd?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc00049bf38 sp=0xc00049bf18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc00049bfc8 sp=0xc00049bf38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc00049bfe0 sp=0xc00049bfc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00049bfe8 sp=0xc00049bfe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 10 gp=0xc0003eea80 m=nil [GC worker (idle)]:
runtime.gopark(0x480e4a928ec?, 0x1?, 0x1c?, 0x8b?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc00047df38 sp=0xc00047df18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc00047dfc8 sp=0xc00047df38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc00047dfe0 sp=0xc00047dfc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00047dfe8 sp=0xc00047dfe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 37 gp=0xc000206540 m=nil [GC worker (idle)]:
runtime.gopark(0x480e487a53c?, 0x3?, 0x94?, 0x77?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000479f38 sp=0xc000479f18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc000479fc8 sp=0xc000479f38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc000479fe0 sp=0xc000479fc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000479fe8 sp=0xc000479fe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 38 gp=0xc000206700 m=nil [GC worker (idle)]:
runtime.gopark(0x7ff6b0bee5e0?, 0x1?, 0xb0?, 0x83?, 0x0?)
        runtime/proc.go:435 +0xce fp=0xc000497f38 sp=0xc000497f18 pc=0x7ff6aebb598e
runtime.gcBgMarkWorker(0xc00003d7a0)
        runtime/mgc.go:1423 +0xe9 fp=0xc000497fc8 sp=0xc000497f38 pc=0x7ff6aeb647a9
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1339 +0x25 fp=0xc000497fe0 sp=0xc000497fc8 pc=0x7ff6aeb64685
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000497fe8 sp=0xc000497fe0 pc=0x7ff6aebbd8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1339 +0x105

goroutine 50 gp=0xc0004868c0 m=nil [chan receive]:
runtime.gopark(0x30?, 0x7ff6aff09900?, 0x1?, 0x0?, 0xc000049798?)
        runtime/proc.go:435 +0xce fp=0xc000049750 sp=0xc000049730 pc=0x7ff6aebb598e
runtime.chanrecv(0xc00003c1c0, 0x0, 0x1)
        runtime/chan.go:664 +0x445 fp=0xc0000497c8 sp=0xc000049750 pc=0x7ff6aeb52d45
runtime.chanrecv1(0x7ff6affddfd2?, 0x29?)
        runtime/chan.go:506 +0x12 fp=0xc0000497f0 sp=0xc0000497c8 pc=0x7ff6aeb528d2
github.com/ollama/ollama/runner/ollamarunner.(*Server).forwardBatch(_, {0x1, {0x7ff6b01a8b20, 0xc0004bc000}, {0x7ff6b01b6208, 0xc001f3d068}, {0xc000076238, 0x1, 0x1}, {{0x7ff6b01b6208, ...}, ...}, ...})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:475 +0xfa fp=0xc000049b58 sp=0xc0000497f0 pc=0x7ff6af12ddba
github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0xc000516000, {0x7ff6b019cf20, 0xc0000d2af0})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:452 +0x18c fp=0xc000049fb8 sp=0xc000049b58 pc=0x7ff6af12da6c
github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap1()
        github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x28 fp=0xc000049fe0 sp=0xc000049fb8 pc=0x7ff6af137108
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000049fe8 sp=0xc000049fe0 pc=0x7ff6aebbd8e1
created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1
        github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x4c9

goroutine 27 gp=0xc000486a80 m=nil [select]:
runtime.gopark(0xc000d49a08?, 0x2?, 0xf3?, 0x91?, 0xc000d4986c?)
        runtime/proc.go:435 +0xce fp=0xc000d49698 sp=0xc000d49678 pc=0x7ff6aebb598e
runtime.selectgo(0xc000d49a08, 0xc000d49868, 0xb?, 0x0, 0x1?, 0x1)
        runtime/select.go:351 +0x837 fp=0xc000d497d0 sp=0xc000d49698 pc=0x7ff6aeb96437
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0xc000516000, {0x7ff6b019aae0, 0xc0005b20e0}, 0xc00047e280)
        github.com/ollama/ollama/runner/ollamarunner/runner.go:950 +0xc4e fp=0xc000d49ac0 sp=0xc000d497d0 pc=0x7ff6af13218e
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x7ff6b019aae0?, 0xc0005b20e0?}, 0xc000d49b40?)
        <autogenerated>:1 +0x36 fp=0xc000d49af0 sp=0xc000d49ac0 pc=0x7ff6af1375f6
net/http.HandlerFunc.ServeHTTP(0xc0005900c0?, {0x7ff6b019aae0?, 0xc0005b20e0?}, 0xc000d49b60?)
        net/http/server.go:2294 +0x29 fp=0xc000d49b18 sp=0xc000d49af0 pc=0x7ff6aeec5ac9
net/http.(*ServeMux).ServeHTTP(0x7ff6aeb5b785?, {0x7ff6b019aae0, 0xc0005b20e0}, 0xc00047e280)
        net/http/server.go:2822 +0x1c4 fp=0xc000d49b68 sp=0xc000d49b18 pc=0x7ff6aeec79c4
net/http.serverHandler.ServeHTTP({0x7ff6b0196ff0?}, {0x7ff6b019aae0?, 0xc0005b20e0?}, 0x1?)
        net/http/server.go:3301 +0x8e fp=0xc000d49b98 sp=0xc000d49b68 pc=0x7ff6aeee544e
net/http.(*conn).serve(0xc0005b4000, {0x7ff6b019cee8, 0xc0001c9080})
        net/http/server.go:2102 +0x625 fp=0xc000d49fb8 sp=0xc000d49b98 pc=0x7ff6aeec3fc5
net/http.(*Server).Serve.gowrap3()
        net/http/server.go:3454 +0x28 fp=0xc000d49fe0 sp=0xc000d49fb8 pc=0x7ff6aeec9888
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000d49fe8 sp=0xc000d49fe0 pc=0x7ff6aebbd8e1
created by net/http.(*Server).Serve in goroutine 1
        net/http/server.go:3454 +0x485

goroutine 28 gp=0xc000207500 m=nil [IO wait, 1 minutes]:
runtime.gopark(0x0?, 0xc0005187a0?, 0x48?, 0x88?, 0xc00051884c?)
        runtime/proc.go:435 +0xce fp=0xc00056bd58 sp=0xc00056bd38 pc=0x7ff6aebb598e
runtime.netpollblock(0x214?, 0xaeb50406?, 0xf6?)
        runtime/netpoll.go:575 +0xf7 fp=0xc00056bd90 sp=0xc00056bd58 pc=0x7ff6aeb7bdf7
internal/poll.runtime_pollWait(0x180f68a4b98, 0x72)
        runtime/netpoll.go:351 +0x85 fp=0xc00056bdb0 sp=0xc00056bd90 pc=0x7ff6aebb4b25
internal/poll.(*pollDesc).wait(0x204?, 0x72?, 0x0)
        internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00056bdd8 sp=0xc00056bdb0 pc=0x7ff6aec4bda7
internal/poll.execIO(0xc0005187a0, 0x7ff6b001b1e0)
        internal/poll/fd_windows.go:177 +0x105 fp=0xc00056be50 sp=0xc00056bdd8 pc=0x7ff6aec4d205
internal/poll.(*FD).Read(0xc000518788, {0xc001f6f0c1, 0x1, 0x1})
        internal/poll/fd_windows.go:438 +0x29b fp=0xc00056bef0 sp=0xc00056be50 pc=0x7ff6aec4dedb
net.(*netFD).Read(0xc000518788, {0xc001f6f0c1?, 0xc000602158?, 0xc00056bf70?})
        net/fd_posix.go:55 +0x25 fp=0xc00056bf38 sp=0xc00056bef0 pc=0x7ff6aecc1145
net.(*conn).Read(0xc0004b8018, {0xc001f6f0c1?, 0xc000d58000?, 0x7ff6aef38380?})
        net/net.go:194 +0x45 fp=0xc00056bf80 sp=0xc00056bf38 pc=0x7ff6aecd0865
net/http.(*connReader).backgroundRead(0xc001f6f0b0)
        net/http/server.go:690 +0x37 fp=0xc00056bfc8 sp=0xc00056bf80 pc=0x7ff6aeebde97
net/http.(*connReader).startBackgroundRead.gowrap2()
        net/http/server.go:686 +0x25 fp=0xc00056bfe0 sp=0xc00056bfc8 pc=0x7ff6aeebddc5
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00056bfe8 sp=0xc00056bfe0 pc=0x7ff6aebbd8e1
created by net/http.(*connReader).startBackgroundRead in goroutine 27
        net/http/server.go:686 +0xb6

goroutine 104 gp=0xc000586540 m=nil [chan receive]:
runtime.gopark(0x30?, 0x7ff6aff09900?, 0x1?, 0x9d?, 0xc000571b20?)
        runtime/proc.go:435 +0xce fp=0xc000571ad8 sp=0xc000571ab8 pc=0x7ff6aebb598e
runtime.chanrecv(0xc00058e1c0, 0x0, 0x1)
        runtime/chan.go:664 +0x445 fp=0xc000571b50 sp=0xc000571ad8 pc=0x7ff6aeb52d45
runtime.chanrecv1(0x7ff6affe1717?, 0x2c?)
        runtime/chan.go:506 +0x12 fp=0xc000571b78 sp=0xc000571b50 pc=0x7ff6aeb528d2
github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc000516000, {0x1, {0x7ff6b01a8b20, 0xc0004bc000}, {0x7ff6b01b6208, 0xc001f3d068}, {0xc000076238, 0x1, 0x1}, {{0x7ff6b01b6208, ...}, ...}, ...})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:651 +0x185 fp=0xc000571ef0 sp=0xc000571b78 pc=0x7ff6af12f9a5
github.com/ollama/ollama/runner/ollamarunner.(*Server).run.gowrap1()
        github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x58 fp=0xc000571fe0 sp=0xc000571ef0 pc=0x7ff6af12dc98
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000571fe8 sp=0xc000571fe0 pc=0x7ff6aebbd8e1
created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 50
        github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x2cd
rax     0x0
rbx     0xf227ff368
rcx     0x0
rdx     0x0
rdi     0xe06d7363
rsi     0x1
rbp     0x4
rsp     0xf227ff240
r8      0x0
r9      0x0
r10     0x0
r11     0x0
r12     0x0
r13     0x180fec00010
r14     0x40000000
r15     0x0
rip     0x7ffcbcb67f7a
rflags  0x202
cs      0x33
fs      0x53
gs      0x2b
time=2026-01-30T21:54:40.409+08:00 level=ERROR source=server.go:1592 msg="post predict" error="Post \"http://127.0.0.1:62327/completion\": read tcp 127.0.0.1:61472->127.0.0.1:62327: wsarecv: An existing connection was forcibly closed by the remote host."
[GIN] 2026/01/30 - 21:54:40 | 500 |    2.6094727s |       127.0.0.1 | POST     "/api/chat"

OS

Windows

GPU

No response

CPU

Intel

Ollama version

0.14.2

Originally created by @gitcbz on GitHub (Jan 30, 2026). Original GitHub issue: https://github.com/ollama/ollama/issues/13981 ### What is the issue? C:\\Users\\~~~>ollama run qwen3:4b \>\>\> 你好 Error: 500 Internal Server Error: model runner has unexpectedly stopped, this may be due to resource limitations or an internal error, check ollama server logs for details 我的Gpu:Intel UHD Graphics 630 和 NVIDIA GeForce GT 720 ### Relevant log output ```shell time=2026-01-30T21:52:44.701+08:00 level=INFO source=routes.go:1614 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:D:\\Program Files (x86)\\Ollama\\Models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:true OLLAMA_VULKAN:true ROCR_VISIBLE_DEVICES:]" time=2026-01-30T21:52:44.719+08:00 level=INFO source=images.go:499 msg="total blobs: 14" time=2026-01-30T21:52:44.721+08:00 level=INFO source=images.go:506 msg="total unused blobs removed: 0" time=2026-01-30T21:52:44.722+08:00 level=INFO source=routes.go:1667 msg="Listening on [::]:11434 (version 0.14.2)" time=2026-01-30T21:52:44.724+08:00 level=INFO source=runner.go:67 msg="discovering available GPUs..." time=2026-01-30T21:52:44.783+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52144" time=2026-01-30T21:52:44.948+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52151" time=2026-01-30T21:52:45.088+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52157" time=2026-01-30T21:52:45.197+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 52164" time=2026-01-30T21:52:46.766+08:00 level=INFO source=types.go:42 msg="inference compute" id=337ff6b7-f904-f3b4-084d-941807ea7d6c filter_id="" library=Vulkan compute=0.0 name=Vulkan1 description="NVIDIA GeForce GT 720" libdirs=ollama,vulkan driver=0.0 pci_id=0000:01:00.0 type=discrete total="2.0 GiB" available="2.0 GiB" time=2026-01-30T21:52:46.766+08:00 level=INFO source=types.go:42 msg="inference compute" id=8680923e-0000-0000-0000-000000000000 filter_id="" library=Vulkan compute=0.0 name=Vulkan0 description="Intel(R) UHD Graphics 630" libdirs=ollama,vulkan driver=0.0 pci_id="" type=iGPU total="8.1 GiB" available="7.2 GiB" time=2026-01-30T21:52:46.766+08:00 level=INFO source=routes.go:1708 msg="entering low vram mode" "total vram"="10.0 GiB" threshold="20.0 GiB" [GIN] 2026/01/30 - 21:52:52 | 200 | 0s | 127.0.0.1 | HEAD "/" [GIN] 2026/01/30 - 21:52:53 | 200 | 105.6466ms | 127.0.0.1 | POST "/api/show" [GIN] 2026/01/30 - 21:52:53 | 200 | 102.1436ms | 127.0.0.1 | POST "/api/show" time=2026-01-30T21:52:53.283+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 62316" time=2026-01-30T21:52:54.760+08:00 level=INFO source=cpu_windows.go:148 msg=packages count=1 time=2026-01-30T21:52:54.760+08:00 level=INFO source=cpu_windows.go:195 msg="" package=0 cores=6 efficiency=0 threads=12 time=2026-01-30T21:52:54.836+08:00 level=INFO source=server.go:245 msg="enabling flash attention" time=2026-01-30T21:52:54.837+08:00 level=INFO source=server.go:429 msg="starting runner" cmd="C:\\Users\\Administrator\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model D:\\Program Files (x86)\\Ollama\\Models\\blobs\\sha256-3e4cb14174460404e7a233e531675303b2fbf7749c02f91864fe311ab6344e4f --port 62327" time=2026-01-30T21:52:54.837+08:00 level=INFO source=sched.go:452 msg="system memory" total="15.9 GiB" free="8.1 GiB" free_swap="15.5 GiB" time=2026-01-30T21:52:54.837+08:00 level=INFO source=sched.go:459 msg="gpu memory" id=8680923e-0000-0000-0000-000000000000 library=Vulkan available="6.8 GiB" free="7.3 GiB" minimum="457.0 MiB" overhead="0 B" time=2026-01-30T21:52:54.837+08:00 level=INFO source=sched.go:459 msg="gpu memory" id=337ff6b7-f904-f3b4-084d-941807ea7d6c library=Vulkan available="1.5 GiB" free="2.0 GiB" minimum="457.0 MiB" overhead="0 B" time=2026-01-30T21:52:54.837+08:00 level=INFO source=server.go:755 msg="loading model" "model layers"=37 requested=-1 time=2026-01-30T21:52:54.881+08:00 level=INFO source=runner.go:1405 msg="starting ollama engine" time=2026-01-30T21:52:54.896+08:00 level=INFO source=runner.go:1440 msg="Server listening on 127.0.0.1:62327" time=2026-01-30T21:52:54.908+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:8680923e-0000-0000-0000-000000000000 Layers:37(0..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}" time=2026-01-30T21:52:54.936+08:00 level=INFO source=ggml.go:136 msg="" architecture=qwen3 file_type=Q4_K_M name="Qwen3 4B Thinking 2507" description="" num_tensors=398 num_key_values=33 load_backend: loaded CPU backend from C:\Users\Administrator\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll ggml_vulkan: Found 2 Vulkan devices: ggml_vulkan: 0 = Intel(R) UHD Graphics 630 (Intel Corporation) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none ggml_vulkan: 1 = NVIDIA GeForce GT 720 (NVIDIA) | uma: 0 | fp16: 0 | bf16: 0 | warp size: 32 | shared memory: 49152 | int dot: 0 | matrix cores: none load_backend: loaded Vulkan backend from C:\Users\Administrator\AppData\Local\Programs\Ollama\lib\ollama\vulkan\ggml-vulkan.dll time=2026-01-30T21:52:55.230+08:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 compiler=cgo(clang) ggml_backend_vk_get_device_memory called: uuid 8680923e-0000-0000-0000-000000000000 ggml_backend_vk_get_device_memory called: luid 0x0000000000014131 ggml_dxgi_pdh_init called DXGI + PDH Initialized. Getting GPU free memory info [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB Integrated GPU (Intel(R) UHD Graphics 630) with LUID 0x0000000000014131 detected. Shared Total: 8517677056.00 bytes (7.93 GB), Shared Usage: 861556736.00 bytes (0.80 GB), Dedicated Total: 134217728.00 bytes (0.12 GB), Dedicated Usage: 0.00 bytes (0.00 GB) ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 7790338048 total: 8651894784 ggml_backend_vk_get_device_memory called: uuid 337ff6b7-f904-f3b4-084d-941807ea7d6c ggml_backend_vk_get_device_memory called: luid 0x0000000000014689 ggml_dxgi_pdh_init called DXGI + PDH Initialized. Getting GPU free memory info [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB Discrete GPU (NVIDIA GeForce GT 720) with LUID 0x0000000000014689 detected. Dedicated Total: 2104819712.00 bytes (1.96 GB), Dedicated Usage: 9396224.00 bytes (0.01 GB) ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 2095423488 total: 2104819712 time=2026-01-30T21:52:55.848+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:fit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:337ff6b7-f904-f3b4-084d-941807ea7d6c Layers:7(0..6) ID:8680923e-0000-0000-0000-000000000000 Layers:30(7..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}" ggml_backend_vk_get_device_memory called: uuid 8680923e-0000-0000-0000-000000000000 ggml_backend_vk_get_device_memory called: luid 0x0000000000014131 ggml_dxgi_pdh_init called DXGI + PDH Initialized. Getting GPU free memory info [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB Integrated GPU (Intel(R) UHD Graphics 630) with LUID 0x0000000000014131 detected. Shared Total: 8517677056.00 bytes (7.93 GB), Shared Usage: 861773824.00 bytes (0.80 GB), Dedicated Total: 134217728.00 bytes (0.12 GB), Dedicated Usage: 0.00 bytes (0.00 GB) ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 7790120960 total: 8651894784 ggml_backend_vk_get_device_memory called: uuid 337ff6b7-f904-f3b4-084d-941807ea7d6c ggml_backend_vk_get_device_memory called: luid 0x0000000000014689 ggml_dxgi_pdh_init called DXGI + PDH Initialized. Getting GPU free memory info [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB Discrete GPU (NVIDIA GeForce GT 720) with LUID 0x0000000000014689 detected. Dedicated Total: 2104819712.00 bytes (1.96 GB), Dedicated Usage: 9396224.00 bytes (0.01 GB) ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 2095423488 total: 2104819712 time=2026-01-30T21:52:56.161+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:alloc LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:337ff6b7-f904-f3b4-084d-941807ea7d6c Layers:7(0..6) ID:8680923e-0000-0000-0000-000000000000 Layers:30(7..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}" ggml_backend_vk_get_device_memory called: uuid 8680923e-0000-0000-0000-000000000000 ggml_backend_vk_get_device_memory called: luid 0x0000000000014131 ggml_dxgi_pdh_init called DXGI + PDH Initialized. Getting GPU free memory info [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB Integrated GPU (Intel(R) UHD Graphics 630) with LUID 0x0000000000014131 detected. Shared Total: 8517677056.00 bytes (7.93 GB), Shared Usage: 861569024.00 bytes (0.80 GB), Dedicated Total: 134217728.00 bytes (0.12 GB), Dedicated Usage: 0.00 bytes (0.00 GB) ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 7790325760 total: 8651894784 ggml_backend_vk_get_device_memory called: uuid 337ff6b7-f904-f3b4-084d-941807ea7d6c ggml_backend_vk_get_device_memory called: luid 0x0000000000014689 ggml_dxgi_pdh_init called DXGI + PDH Initialized. Getting GPU free memory info [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000014131, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: NVIDIA GeForce GT 720, LUID: 0x0000000000014689, Dedicated: 1.96 GB, Shared: 7.93 GB [DXGI] Adapter Description: Intel(R) UHD Graphics 630, LUID: 0x0000000000023866, Dedicated: 0.12 GB, Shared: 7.93 GB [DXGI] Adapter Description: Microsoft Basic Render Driver, LUID: 0x0000000000014658, Dedicated: 0.00 GB, Shared: 7.93 GB Discrete GPU (NVIDIA GeForce GT 720) with LUID 0x0000000000014689 detected. Dedicated Total: 2104819712.00 bytes (1.96 GB), Dedicated Usage: 9396224.00 bytes (0.01 GB) ggml_backend_vk_get_device_memory utilizing DXGI + PDH memory reporting free: 2095423488 total: 2104819712 time=2026-01-30T21:52:57.062+08:00 level=INFO source=runner.go:1278 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:Enabled KvSize:4096 KvCacheType: NumThreads:6 GPULayers:37[ID:337ff6b7-f904-f3b4-084d-941807ea7d6c Layers:7(0..6) ID:8680923e-0000-0000-0000-000000000000 Layers:30(7..36)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:false}" time=2026-01-30T21:52:57.063+08:00 level=INFO source=ggml.go:482 msg="offloading 36 repeating layers to GPU" time=2026-01-30T21:52:57.063+08:00 level=INFO source=ggml.go:489 msg="offloading output layer to GPU" time=2026-01-30T21:52:57.063+08:00 level=INFO source=ggml.go:494 msg="offloaded 37/37 layers to GPU" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:240 msg="model weights" device=Vulkan0 size="1.9 GiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:240 msg="model weights" device=Vulkan1 size="400.1 MiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:245 msg="model weights" device=CPU size="304.3 MiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:251 msg="kv cache" device=Vulkan0 size="464.0 MiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:251 msg="kv cache" device=Vulkan1 size="112.0 MiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:262 msg="compute graph" device=Vulkan0 size="71.0 MiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:262 msg="compute graph" device=Vulkan1 size="79.0 MiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:267 msg="compute graph" device=CPU size="5.0 MiB" time=2026-01-30T21:52:57.063+08:00 level=INFO source=device.go:272 msg="total memory" size="3.3 GiB" time=2026-01-30T21:52:57.064+08:00 level=INFO source=sched.go:526 msg="loaded runners" count=1 time=2026-01-30T21:52:57.064+08:00 level=INFO source=server.go:1347 msg="waiting for llama runner to start responding" time=2026-01-30T21:52:57.064+08:00 level=INFO source=server.go:1381 msg="waiting for server to become available" status="llm server loading model" time=2026-01-30T21:53:04.078+08:00 level=INFO source=server.go:1385 msg="llama runner started in 9.24 seconds" [GIN] 2026/01/30 - 21:53:04 | 200 | 10.8871466s | 127.0.0.1 | POST "/api/generate" Exception 0xe06d7363 0x19930520 0xf227ff3e0 0x7ffcbcb67f7a PC=0x7ffcbcb67f7a signal arrived during external code execution runtime.cgocall(0x7ff6af9695e0, 0xc000567aa0) runtime/cgocall.go:167 +0x3e fp=0xc000567a78 sp=0xc000567a10 pc=0x7ff6aebb243e github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_backend_sched_graph_compute_async(0x180fec00010, 0x1817a884ed0) _cgo_gotypes.go:961 +0x50 fp=0xc000567aa0 sp=0xc000567a78 pc=0x7ff6af047a70 github.com/ollama/ollama/ml/backend/ggml.(*Context).ComputeWithNotify.func2(...) github.com/ollama/ollama/ml/backend/ggml/ggml.go:825 github.com/ollama/ollama/ml/backend/ggml.(*Context).ComputeWithNotify(0xc001f7c380, 0xc001e60de0?, {0xc001f70ae0, 0x1, 0x2?}) github.com/ollama/ollama/ml/backend/ggml/ggml.go:825 +0x1b5 fp=0xc000567b78 sp=0xc000567aa0 pc=0x7ff6af056135 github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc000516000, {0x0, {0x7ff6b01a8b20, 0xc001f7c380}, {0x7ff6b01b6208, 0xc0004dac60}, {0xc000452e00, 0xb, 0x10}, {{0x7ff6b01b6208, ...}, ...}, ...}) github.com/ollama/ollama/runner/ollamarunner/runner.go:723 +0x876 fp=0xc000567ef0 sp=0xc000567b78 pc=0x7ff6af130096 github.com/ollama/ollama/runner/ollamarunner.(*Server).run.gowrap1() github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x58 fp=0xc000567fe0 sp=0xc000567ef0 pc=0x7ff6af12dc98 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000567fe8 sp=0xc000567fe0 pc=0x7ff6aebbd8e1 created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 50 github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x2cd goroutine 1 gp=0xc0000021c0 m=nil [IO wait, 1 minutes]: runtime.gopark(0x7ff6aebbf0e0?, 0x7ff6b0b78540?, 0x20?, 0x80?, 0xc0005180cc?) runtime/proc.go:435 +0xce fp=0xc000231648 sp=0xc000231628 pc=0x7ff6aebb598e runtime.netpollblock(0x1a8?, 0xaeb50406?, 0xf6?) runtime/netpoll.go:575 +0xf7 fp=0xc000231680 sp=0xc000231648 pc=0x7ff6aeb7bdf7 internal/poll.runtime_pollWait(0x180f68a4cb0, 0x72) runtime/netpoll.go:351 +0x85 fp=0xc0002316a0 sp=0xc000231680 pc=0x7ff6aebb4b25 internal/poll.(*pollDesc).wait(0x7ff6aec4a7b3?, 0x0?, 0x0) internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0002316c8 sp=0xc0002316a0 pc=0x7ff6aec4bda7 internal/poll.execIO(0xc000518020, 0xc000231770) internal/poll/fd_windows.go:177 +0x105 fp=0xc000231740 sp=0xc0002316c8 pc=0x7ff6aec4d205 internal/poll.(*FD).acceptOne(0xc000518008, 0x204, {0xc0005201e0?, 0xc0002317d0?, 0x7ff6aebbb8f7?}, 0xc000231810?) internal/poll/fd_windows.go:946 +0x65 fp=0xc0002317a0 sp=0xc000231740 pc=0x7ff6aec51785 internal/poll.(*FD).Accept(0xc000518008, 0xc000231950) internal/poll/fd_windows.go:980 +0x1b6 fp=0xc000231858 sp=0xc0002317a0 pc=0x7ff6aec51ab6 net.(*netFD).accept(0xc000518008) net/fd_windows.go:182 +0x4b fp=0xc000231970 sp=0xc000231858 pc=0x7ff6aecc326b net.(*TCPListener).accept(0xc000602080) net/tcpsock_posix.go:159 +0x1b fp=0xc0002319c0 sp=0xc000231970 pc=0x7ff6aecd981b net.(*TCPListener).Accept(0xc000602080) net/tcpsock.go:380 +0x30 fp=0xc0002319f0 sp=0xc0002319c0 pc=0x7ff6aecd85d0 net/http.(*onceCloseListener).Accept(0xc0005b4000?) <autogenerated>:1 +0x24 fp=0xc000231a08 sp=0xc0002319f0 pc=0x7ff6aeef1bc4 net/http.(*Server).Serve(0xc00051c000, {0x7ff6b019a930, 0xc000602080}) net/http/server.go:3424 +0x30c fp=0xc000231b38 sp=0xc000231a08 pc=0x7ff6aeec948c github.com/ollama/ollama/runner/ollamarunner.Execute({0xc0000a2030, 0x4, 0x5}) github.com/ollama/ollama/runner/ollamarunner/runner.go:1441 +0x94e fp=0xc000231d08 sp=0xc000231b38 pc=0x7ff6af136e8e github.com/ollama/ollama/runner.Execute({0xc0000a2010?, 0x0?, 0x0?}) github.com/ollama/ollama/runner/runner.go:28 +0x130 fp=0xc000231d30 sp=0xc000231d08 pc=0x7ff6af1377f0 github.com/ollama/ollama/cmd.NewCLI.func2(0xc000143300?, {0x7ff6aff9f302?, 0x4?, 0x7ff6aff9f306?}) github.com/ollama/ollama/cmd/cmd.go:1881 +0x45 fp=0xc000231d58 sp=0xc000231d30 pc=0x7ff6af8fbf25 github.com/spf13/cobra.(*Command).execute(0xc000219508, {0xc0003507d0, 0x5, 0x5}) github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc000231e78 sp=0xc000231d58 pc=0x7ff6aed3e41c github.com/spf13/cobra.(*Command).ExecuteC(0xc0000b4908) github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000231f30 sp=0xc000231e78 pc=0x7ff6aed3ec65 github.com/spf13/cobra.(*Command).Execute(...) github.com/spf13/cobra@v1.7.0/command.go:992 github.com/spf13/cobra.(*Command).ExecuteContext(...) github.com/spf13/cobra@v1.7.0/command.go:985 main.main() github.com/ollama/ollama/main.go:12 +0x4d fp=0xc000231f50 sp=0xc000231f30 pc=0x7ff6af8fca0d runtime.main() runtime/proc.go:283 +0x27d fp=0xc000231fe0 sp=0xc000231f50 pc=0x7ff6aeb84ddd runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000231fe8 sp=0xc000231fe0 pc=0x7ff6aebbd8e1 goroutine 2 gp=0xc0000028c0 m=nil [force gc (idle), 1 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00006ffa8 sp=0xc00006ff88 pc=0x7ff6aebb598e runtime.goparkunlock(...) runtime/proc.go:441 runtime.forcegchelper() runtime/proc.go:348 +0xb8 fp=0xc00006ffe0 sp=0xc00006ffa8 pc=0x7ff6aeb850f8 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x7ff6aebbd8e1 created by runtime.init.7 in goroutine 1 runtime/proc.go:336 +0x1a goroutine 3 gp=0xc000002c40 m=nil [GC sweep wait]: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000071f80 sp=0xc000071f60 pc=0x7ff6aebb598e runtime.goparkunlock(...) runtime/proc.go:441 runtime.bgsweep(0xc00007e000) runtime/mgcsweep.go:316 +0xdf fp=0xc000071fc8 sp=0xc000071f80 pc=0x7ff6aeb6debf runtime.gcenable.gowrap1() runtime/mgc.go:204 +0x25 fp=0xc000071fe0 sp=0xc000071fc8 pc=0x7ff6aeb62285 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000071fe8 sp=0xc000071fe0 pc=0x7ff6aebbd8e1 created by runtime.gcenable in goroutine 1 runtime/mgc.go:204 +0x66 goroutine 4 gp=0xc000002e00 m=nil [GC scavenge wait]: runtime.gopark(0xf724c?, 0xd34bc?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000085f78 sp=0xc000085f58 pc=0x7ff6aebb598e runtime.goparkunlock(...) runtime/proc.go:441 runtime.(*scavengerState).park(0x7ff6b0b9f920) runtime/mgcscavenge.go:425 +0x49 fp=0xc000085fa8 sp=0xc000085f78 pc=0x7ff6aeb6b909 runtime.bgscavenge(0xc00007e000) runtime/mgcscavenge.go:658 +0x59 fp=0xc000085fc8 sp=0xc000085fa8 pc=0x7ff6aeb6be99 runtime.gcenable.gowrap2() runtime/mgc.go:205 +0x25 fp=0xc000085fe0 sp=0xc000085fc8 pc=0x7ff6aeb62225 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000085fe8 sp=0xc000085fe0 pc=0x7ff6aebbd8e1 created by runtime.gcenable in goroutine 1 runtime/mgc.go:205 +0xa5 goroutine 5 gp=0xc000003340 m=nil [finalizer wait, 1 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000087e30 sp=0xc000087e10 pc=0x7ff6aebb598e runtime.runfinq() runtime/mfinal.go:196 +0x107 fp=0xc000087fe0 sp=0xc000087e30 pc=0x7ff6aeb61207 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000087fe8 sp=0xc000087fe0 pc=0x7ff6aebbd8e1 created by runtime.createfing in goroutine 1 runtime/mfinal.go:166 +0x3d goroutine 6 gp=0xc000003dc0 m=nil [chan receive]: runtime.gopark(0xc00017d0e0?, 0xc000dcc030?, 0x60?, 0x3f?, 0x7ff6aecabf68?) runtime/proc.go:435 +0xce fp=0xc000073f18 sp=0xc000073ef8 pc=0x7ff6aebb598e runtime.chanrecv(0xc00003c380, 0x0, 0x1) runtime/chan.go:664 +0x445 fp=0xc000073f90 sp=0xc000073f18 pc=0x7ff6aeb52d45 runtime.chanrecv1(0x7ff6aeb84f40?, 0xc000073f76?) runtime/chan.go:506 +0x12 fp=0xc000073fb8 sp=0xc000073f90 pc=0x7ff6aeb528d2 runtime.unique_runtime_registerUniqueMapCleanup.func2(...) runtime/mgc.go:1796 runtime.unique_runtime_registerUniqueMapCleanup.gowrap1() runtime/mgc.go:1799 +0x2f fp=0xc000073fe0 sp=0xc000073fb8 pc=0x7ff6aeb654af runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000073fe8 sp=0xc000073fe0 pc=0x7ff6aebbd8e1 created by unique.runtime_registerUniqueMapCleanup in goroutine 1 runtime/mgc.go:1794 +0x85 goroutine 7 gp=0xc0003ee540 m=nil [GC worker (idle), 1 minutes]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000081f38 sp=0xc000081f18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc000081fc8 sp=0xc000081f38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000081fe0 sp=0xc000081fc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000081fe8 sp=0xc000081fe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 18 gp=0xc0004861c0 m=nil [GC worker (idle)]: runtime.gopark(0x480e487a53c?, 0x3?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000491f38 sp=0xc000491f18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc000491fc8 sp=0xc000491f38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000491fe0 sp=0xc000491fc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000491fe8 sp=0xc000491fe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 19 gp=0xc000486380 m=nil [GC worker (idle)]: runtime.gopark(0x480e4a928ec?, 0x3?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000493f38 sp=0xc000493f18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc000493fc8 sp=0xc000493f38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000493fe0 sp=0xc000493fc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000493fe8 sp=0xc000493fe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 34 gp=0xc000206000 m=nil [GC worker (idle), 1 minutes]: runtime.gopark(0x480e487a53c?, 0x1?, 0x68?, 0xf5?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00048df38 sp=0xc00048df18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc00048dfc8 sp=0xc00048df38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00048dfe0 sp=0xc00048dfc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00048dfe8 sp=0xc00048dfe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 8 gp=0xc0003ee700 m=nil [GC worker (idle), 1 minutes]: runtime.gopark(0x480e487a53c?, 0x3?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc000083fc8 sp=0xc000083f38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 35 gp=0xc0002061c0 m=nil [GC worker (idle)]: runtime.gopark(0x480e4a928ec?, 0x1?, 0xd4?, 0xaf?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00048ff38 sp=0xc00048ff18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc00048ffc8 sp=0xc00048ff38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00048ffe0 sp=0xc00048ffc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00048ffe8 sp=0xc00048ffe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 9 gp=0xc0003ee8c0 m=nil [GC worker (idle)]: runtime.gopark(0x480e487a53c?, 0x1?, 0x8?, 0xb?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00047bf38 sp=0xc00047bf18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc00047bfc8 sp=0xc00047bf38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00047bfe0 sp=0xc00047bfc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00047bfe8 sp=0xc00047bfe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 36 gp=0xc000206380 m=nil [GC worker (idle), 1 minutes]: runtime.gopark(0x46934493f80?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000477f38 sp=0xc000477f18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc000477fc8 sp=0xc000477f38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000477fe0 sp=0xc000477fc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000477fe8 sp=0xc000477fe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 20 gp=0xc000486540 m=nil [GC worker (idle)]: runtime.gopark(0x480e4a928ec?, 0x1?, 0x14?, 0xfd?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00049bf38 sp=0xc00049bf18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc00049bfc8 sp=0xc00049bf38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00049bfe0 sp=0xc00049bfc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00049bfe8 sp=0xc00049bfe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 10 gp=0xc0003eea80 m=nil [GC worker (idle)]: runtime.gopark(0x480e4a928ec?, 0x1?, 0x1c?, 0x8b?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00047df38 sp=0xc00047df18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc00047dfc8 sp=0xc00047df38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00047dfe0 sp=0xc00047dfc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00047dfe8 sp=0xc00047dfe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 37 gp=0xc000206540 m=nil [GC worker (idle)]: runtime.gopark(0x480e487a53c?, 0x3?, 0x94?, 0x77?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000479f38 sp=0xc000479f18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc000479fc8 sp=0xc000479f38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000479fe0 sp=0xc000479fc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000479fe8 sp=0xc000479fe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 38 gp=0xc000206700 m=nil [GC worker (idle)]: runtime.gopark(0x7ff6b0bee5e0?, 0x1?, 0xb0?, 0x83?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000497f38 sp=0xc000497f18 pc=0x7ff6aebb598e runtime.gcBgMarkWorker(0xc00003d7a0) runtime/mgc.go:1423 +0xe9 fp=0xc000497fc8 sp=0xc000497f38 pc=0x7ff6aeb647a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000497fe0 sp=0xc000497fc8 pc=0x7ff6aeb64685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000497fe8 sp=0xc000497fe0 pc=0x7ff6aebbd8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 50 gp=0xc0004868c0 m=nil [chan receive]: runtime.gopark(0x30?, 0x7ff6aff09900?, 0x1?, 0x0?, 0xc000049798?) runtime/proc.go:435 +0xce fp=0xc000049750 sp=0xc000049730 pc=0x7ff6aebb598e runtime.chanrecv(0xc00003c1c0, 0x0, 0x1) runtime/chan.go:664 +0x445 fp=0xc0000497c8 sp=0xc000049750 pc=0x7ff6aeb52d45 runtime.chanrecv1(0x7ff6affddfd2?, 0x29?) runtime/chan.go:506 +0x12 fp=0xc0000497f0 sp=0xc0000497c8 pc=0x7ff6aeb528d2 github.com/ollama/ollama/runner/ollamarunner.(*Server).forwardBatch(_, {0x1, {0x7ff6b01a8b20, 0xc0004bc000}, {0x7ff6b01b6208, 0xc001f3d068}, {0xc000076238, 0x1, 0x1}, {{0x7ff6b01b6208, ...}, ...}, ...}) github.com/ollama/ollama/runner/ollamarunner/runner.go:475 +0xfa fp=0xc000049b58 sp=0xc0000497f0 pc=0x7ff6af12ddba github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0xc000516000, {0x7ff6b019cf20, 0xc0000d2af0}) github.com/ollama/ollama/runner/ollamarunner/runner.go:452 +0x18c fp=0xc000049fb8 sp=0xc000049b58 pc=0x7ff6af12da6c github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap1() github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x28 fp=0xc000049fe0 sp=0xc000049fb8 pc=0x7ff6af137108 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000049fe8 sp=0xc000049fe0 pc=0x7ff6aebbd8e1 created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1 github.com/ollama/ollama/runner/ollamarunner/runner.go:1418 +0x4c9 goroutine 27 gp=0xc000486a80 m=nil [select]: runtime.gopark(0xc000d49a08?, 0x2?, 0xf3?, 0x91?, 0xc000d4986c?) runtime/proc.go:435 +0xce fp=0xc000d49698 sp=0xc000d49678 pc=0x7ff6aebb598e runtime.selectgo(0xc000d49a08, 0xc000d49868, 0xb?, 0x0, 0x1?, 0x1) runtime/select.go:351 +0x837 fp=0xc000d497d0 sp=0xc000d49698 pc=0x7ff6aeb96437 github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0xc000516000, {0x7ff6b019aae0, 0xc0005b20e0}, 0xc00047e280) github.com/ollama/ollama/runner/ollamarunner/runner.go:950 +0xc4e fp=0xc000d49ac0 sp=0xc000d497d0 pc=0x7ff6af13218e github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x7ff6b019aae0?, 0xc0005b20e0?}, 0xc000d49b40?) <autogenerated>:1 +0x36 fp=0xc000d49af0 sp=0xc000d49ac0 pc=0x7ff6af1375f6 net/http.HandlerFunc.ServeHTTP(0xc0005900c0?, {0x7ff6b019aae0?, 0xc0005b20e0?}, 0xc000d49b60?) net/http/server.go:2294 +0x29 fp=0xc000d49b18 sp=0xc000d49af0 pc=0x7ff6aeec5ac9 net/http.(*ServeMux).ServeHTTP(0x7ff6aeb5b785?, {0x7ff6b019aae0, 0xc0005b20e0}, 0xc00047e280) net/http/server.go:2822 +0x1c4 fp=0xc000d49b68 sp=0xc000d49b18 pc=0x7ff6aeec79c4 net/http.serverHandler.ServeHTTP({0x7ff6b0196ff0?}, {0x7ff6b019aae0?, 0xc0005b20e0?}, 0x1?) net/http/server.go:3301 +0x8e fp=0xc000d49b98 sp=0xc000d49b68 pc=0x7ff6aeee544e net/http.(*conn).serve(0xc0005b4000, {0x7ff6b019cee8, 0xc0001c9080}) net/http/server.go:2102 +0x625 fp=0xc000d49fb8 sp=0xc000d49b98 pc=0x7ff6aeec3fc5 net/http.(*Server).Serve.gowrap3() net/http/server.go:3454 +0x28 fp=0xc000d49fe0 sp=0xc000d49fb8 pc=0x7ff6aeec9888 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000d49fe8 sp=0xc000d49fe0 pc=0x7ff6aebbd8e1 created by net/http.(*Server).Serve in goroutine 1 net/http/server.go:3454 +0x485 goroutine 28 gp=0xc000207500 m=nil [IO wait, 1 minutes]: runtime.gopark(0x0?, 0xc0005187a0?, 0x48?, 0x88?, 0xc00051884c?) runtime/proc.go:435 +0xce fp=0xc00056bd58 sp=0xc00056bd38 pc=0x7ff6aebb598e runtime.netpollblock(0x214?, 0xaeb50406?, 0xf6?) runtime/netpoll.go:575 +0xf7 fp=0xc00056bd90 sp=0xc00056bd58 pc=0x7ff6aeb7bdf7 internal/poll.runtime_pollWait(0x180f68a4b98, 0x72) runtime/netpoll.go:351 +0x85 fp=0xc00056bdb0 sp=0xc00056bd90 pc=0x7ff6aebb4b25 internal/poll.(*pollDesc).wait(0x204?, 0x72?, 0x0) internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00056bdd8 sp=0xc00056bdb0 pc=0x7ff6aec4bda7 internal/poll.execIO(0xc0005187a0, 0x7ff6b001b1e0) internal/poll/fd_windows.go:177 +0x105 fp=0xc00056be50 sp=0xc00056bdd8 pc=0x7ff6aec4d205 internal/poll.(*FD).Read(0xc000518788, {0xc001f6f0c1, 0x1, 0x1}) internal/poll/fd_windows.go:438 +0x29b fp=0xc00056bef0 sp=0xc00056be50 pc=0x7ff6aec4dedb net.(*netFD).Read(0xc000518788, {0xc001f6f0c1?, 0xc000602158?, 0xc00056bf70?}) net/fd_posix.go:55 +0x25 fp=0xc00056bf38 sp=0xc00056bef0 pc=0x7ff6aecc1145 net.(*conn).Read(0xc0004b8018, {0xc001f6f0c1?, 0xc000d58000?, 0x7ff6aef38380?}) net/net.go:194 +0x45 fp=0xc00056bf80 sp=0xc00056bf38 pc=0x7ff6aecd0865 net/http.(*connReader).backgroundRead(0xc001f6f0b0) net/http/server.go:690 +0x37 fp=0xc00056bfc8 sp=0xc00056bf80 pc=0x7ff6aeebde97 net/http.(*connReader).startBackgroundRead.gowrap2() net/http/server.go:686 +0x25 fp=0xc00056bfe0 sp=0xc00056bfc8 pc=0x7ff6aeebddc5 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00056bfe8 sp=0xc00056bfe0 pc=0x7ff6aebbd8e1 created by net/http.(*connReader).startBackgroundRead in goroutine 27 net/http/server.go:686 +0xb6 goroutine 104 gp=0xc000586540 m=nil [chan receive]: runtime.gopark(0x30?, 0x7ff6aff09900?, 0x1?, 0x9d?, 0xc000571b20?) runtime/proc.go:435 +0xce fp=0xc000571ad8 sp=0xc000571ab8 pc=0x7ff6aebb598e runtime.chanrecv(0xc00058e1c0, 0x0, 0x1) runtime/chan.go:664 +0x445 fp=0xc000571b50 sp=0xc000571ad8 pc=0x7ff6aeb52d45 runtime.chanrecv1(0x7ff6affe1717?, 0x2c?) runtime/chan.go:506 +0x12 fp=0xc000571b78 sp=0xc000571b50 pc=0x7ff6aeb528d2 github.com/ollama/ollama/runner/ollamarunner.(*Server).computeBatch(0xc000516000, {0x1, {0x7ff6b01a8b20, 0xc0004bc000}, {0x7ff6b01b6208, 0xc001f3d068}, {0xc000076238, 0x1, 0x1}, {{0x7ff6b01b6208, ...}, ...}, ...}) github.com/ollama/ollama/runner/ollamarunner/runner.go:651 +0x185 fp=0xc000571ef0 sp=0xc000571b78 pc=0x7ff6af12f9a5 github.com/ollama/ollama/runner/ollamarunner.(*Server).run.gowrap1() github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x58 fp=0xc000571fe0 sp=0xc000571ef0 pc=0x7ff6af12dc98 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000571fe8 sp=0xc000571fe0 pc=0x7ff6aebbd8e1 created by github.com/ollama/ollama/runner/ollamarunner.(*Server).run in goroutine 50 github.com/ollama/ollama/runner/ollamarunner/runner.go:458 +0x2cd rax 0x0 rbx 0xf227ff368 rcx 0x0 rdx 0x0 rdi 0xe06d7363 rsi 0x1 rbp 0x4 rsp 0xf227ff240 r8 0x0 r9 0x0 r10 0x0 r11 0x0 r12 0x0 r13 0x180fec00010 r14 0x40000000 r15 0x0 rip 0x7ffcbcb67f7a rflags 0x202 cs 0x33 fs 0x53 gs 0x2b time=2026-01-30T21:54:40.409+08:00 level=ERROR source=server.go:1592 msg="post predict" error="Post \"http://127.0.0.1:62327/completion\": read tcp 127.0.0.1:61472->127.0.0.1:62327: wsarecv: An existing connection was forcibly closed by the remote host." [GIN] 2026/01/30 - 21:54:40 | 500 | 2.6094727s | 127.0.0.1 | POST "/api/chat" ``` ### OS Windows ### GPU _No response_ ### CPU Intel ### Ollama version 0.14.2
GiteaMirror added the bug label 2026-04-12 21:59:46 -05:00
Author
Owner

@taozebra commented on GitHub (Feb 7, 2026):

把显卡禁用了,用CPU跑可能现实点,UHD630、GT720太老了

<!-- gh-comment-id:3863334090 --> @taozebra commented on GitHub (Feb 7, 2026): 把显卡禁用了,用CPU跑可能现实点,UHD630、GT720太老了
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: github-starred/ollama#9144