[GH-ISSUE #11957] Hang when switching model. #7939

Closed
opened 2026-04-12 20:06:15 -05:00 by GiteaMirror · 0 comments
Owner

Originally created by @mcr-ksh on GitHub (Aug 18, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/11957

What is the issue?

When a model is loaded with the following environment variables set:

OLLAMA_MODELS=O:\models
OLLAMA_MAX_LOADED_MODELS=1
OLLAMA_HOST=0.0.0.0:11434
OLLAMA_NOHISTORY=1
OLLAMA_NEW_ESTIMATES=1
OLLAMA_LOAD_TIMEOUT=30m
OLLAMA_DEBUG=0
OLLAMA_NEW_ENGINE=1
OLLAMA_KEEP_ALIVE=24h
OLLAMA_FLASH_ATTENTION=1
OLLAMA_KV_CACHE_TYPE=q4_0
GGML_CUDA_FORCE_MMQ=1
GGML_CUDA_FORCE_CUBLAS=1
GGML_CUDA_ENABLE_UNIFIED_MEMORY=1

Relevant log output

time=2025-08-18T20:26:13.308Z level=INFO source=server.go:166 msg="enabling new memory estimates"
time=2025-08-18T20:26:13.318Z level=INFO source=server.go:211 msg="enabling flash attention"
time=2025-08-18T20:26:13.325Z level=INFO source=server.go:383 msg="starting runner" cmd="C:\\Users\\ksh.IRONSOFTWARE\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --model O:\\models\\blobs\\sha256-87048bcd55216712ef14c11c2c303728463207b165bf18440b9b84b07ec00f87 --port 57184"
time=2025-08-18T20:26:13.329Z level=INFO source=server.go:657 msg="loading model" "model layers"=33 requested=-1
time=2025-08-18T20:26:13.338Z level=INFO source=server.go:663 msg="system memory" total="48.0 GiB" free="38.8 GiB" free_swap="41.0 GiB"
time=2025-08-18T20:26:13.338Z level=INFO source=server.go:667 msg="gpu memory" id=GPU-620f3fa0-47a8-a31e-eff8-c87476f589db available="9.4 GiB" free="9.9 GiB" minimum="457.0 MiB" overhead="0 B"
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x7ff79e225157]

goroutine 33 [running]:
github.com/ollama/ollama/llm.(*ollamaServer).Load.func1()
	C:/a/ollama/ollama/llm/server.go:654 +0x77
github.com/ollama/ollama/llm.(*ollamaServer).Load(0xc002a80380, {0x7ff79f208f70, 0xc0000fc050}, {0xc000164240, 0x1, 0x1}, 0x0)
	C:/a/ollama/ollama/llm/server.go:684 +0x80e
github.com/ollama/ollama/server.(*Scheduler).load(0xc000307c00, 0xc0011840d0, 0xc003502330, {0xc000164240, 0x1, 0x1}, 0x0)
	C:/a/ollama/ollama/server/sched.go:435 +0x73a
github.com/ollama/ollama/server.(*Scheduler).processPending(0xc000307c00, {0x7ff79f208f70, 0xc00054f270})
	C:/a/ollama/ollama/server/sched.go:209 +0xab9
github.com/ollama/ollama/server.(*Scheduler).Run.func1()
	C:/a/ollama/ollama/server/sched.go:123 +0x1f
created by github.com/ollama/ollama/server.(*Scheduler).Run in goroutine 1
	C:/a/ollama/ollama/server/sched.go:122 +0xb1
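The trace shows the panic originating in a closure inside `(*ollamaServer).Load` (server.go:654) rather than in `Load` itself. A common shape for this class of bug is a deferred or backgrounded closure that dereferences a struct field which is still nil because an earlier step of the load path failed or was skipped. The sketch below reproduces that failure mode with hypothetical types — it is an illustration of the panic mechanism, not Ollama's actual code:

```go
package main

import "fmt"

// memoryEstimate stands in for state that is only populated when
// estimation succeeds. All names here are hypothetical.
type memoryEstimate struct {
	layers int
}

type server struct {
	mem *memoryEstimate // nil until estimation has run
}

func (s *server) load() (err error) {
	// Outermost defer: convert a panic from a later defer into an error,
	// mirroring how a recovered runtime panic would surface.
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	// A deferred cleanup/reporting closure, registered while s.mem may
	// still be nil. If load exits before estimation populates s.mem,
	// this dereference panics with "invalid memory address or nil
	// pointer dereference" — the same runtime error as in the trace.
	defer func() {
		_ = s.mem.layers
	}()
	return nil
}

func main() {
	s := &server{} // estimation never ran, so s.mem is nil
	fmt.Println(s.load())
}
```

Guarding the closure with a nil check (or populating the field before registering the defer) is the usual fix for this pattern.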

OS

Linux

GPU

Nvidia

CPU

AMD

Ollama version

0.11.5-rc2

GiteaMirror added the bug label 2026-04-12 20:06:15 -05:00

Reference: github-starred/ollama#7939