[GH-ISSUE #12942] Ollama suddenly became unresponsive, reinstall did not help #70640

Closed
opened 2026-05-04 22:21:57 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @cwiokpl on GitHub (Nov 4, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12942

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Hello,

I am running a Windows 10 VM on a Debian 13 host with GPU passthrough. I set up the CUDA Toolkit and Ollama 2 days ago and everything was working fine. I had multiple models installed, including qwen3-vl:30b. The issue started when I decided to download qwen3-vl:30b-a3b-instruct-q4_K_M with `ollama run qwen3-vl:30b-a3b-instruct-q4_K_M`; however, by accident I ran `ollama run qwen3-vl:30b-a3b-thinking-q4_K_M`, which is actually the same model as "qwen3-vl:30b" - the sha1 matches. After that, no matter what I do, I cannot get it to work. I am not 100% sure my actions are related to the issue; it is just what I noticed.
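
A quick way to check whether two tags really resolve to the same model is the ID column of `ollama list`, which shows each model's digest prefix - matching IDs mean one shared blob. A minimal sketch (assuming `ollama` is on PATH; guarded so it degrades gracefully if not):

```shell
# Compare the ID (digest) column for the two qwen3-vl tags:
# identical IDs mean both tags point at the same underlying blob.
if command -v ollama >/dev/null 2>&1; then
  ollama list | grep "qwen3-vl" || echo "no qwen3-vl tags found"
else
  echo "ollama not on PATH"
fi
```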

When I open the Ollama app it gets stuck at the "writing" animation; when I do the same from the CLI with `ollama run`, it freezes at a new line. `ollama ps` shows an empty list, which I understand to mean the model is not loaded.

Example output when I run `ollama run qwen3:4b`:

C:\Users\User>ollama run qwen3:4b
⠴
It will freeze like this. I see some CPU and GPU load, but GPU memory usage stays near zero.

I reinstalled Ollama (manually deleted all remaining files after uninstalling) and reinstalled the CUDA Toolkit. Due to the nature of the setup (GPU passthrough + Looking Glass), I'd rather avoid a complete Windows reinstall.
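
Since `ollama ps` comes back empty while the CLI hangs, probing the HTTP API directly (bypassing the app and CLI) may help tell a UI-side hang from a server-side one. A minimal sketch, assuming the default 127.0.0.1:11434 bind (which the server log below confirms) and an arbitrary 5-second timeout:

```shell
# Hit the server's version endpoint directly. If this times out too,
# the hang is in `ollama serve` itself rather than the desktop app.
if command -v curl >/dev/null 2>&1; then
  curl --max-time 5 -s http://127.0.0.1:11434/api/version \
    && echo "server responded" \
    || echo "no response within 5s (server-side hang or not listening)"
else
  echo "curl not installed"
fi
```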

OS
Windows 10 in VM

GPU
NVIDIA RTX 4060 Ti

CPU
AMD

Ollama version

0.12.9

Relevant log output

**App.log:**
time=2025-11-04T07:52:40.889Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T07:52:40.890Z level=INFO source=app.go:231 msg="initialized tools registry" tool_count=0
time=2025-11-04T07:52:40.894Z level=INFO source=app.go:246 msg="starting ollama server"
time=2025-11-04T07:52:41.171Z level=INFO source=app.go:275 msg="starting ui server" port=51753
time=2025-11-04T07:52:41.805Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=0s request_id=1762242761805731900 version=0.12.9
time=2025-11-04T07:52:41.807Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=498.5µs request_id=1762242761806733000 version=0.12.9
time=2025-11-04T07:52:41.807Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=0s request_id=1762242761807231500 version=0.12.9
time=2025-11-04T07:52:41.831Z level=WARN source=ui.go:1601 msg="failed to show model details" error="Post \"http://127.0.0.1:11434/api/show\": dial tcp 127.0.0.1:11434: connectex: No connection could be made because the target machine actively refused it." model=qwen3:4b
time=2025-11-04T07:52:41.831Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=326.4µs request_id=1762242761831404200 version=0.12.9
time=2025-11-04T07:52:41.837Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761837847500 version=0.12.9
time=2025-11-04T07:52:41.851Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=497.4µs request_id=1762242761850732700 version=0.12.9
time=2025-11-04T07:52:41.864Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761864755600 version=0.12.9
time=2025-11-04T07:52:41.877Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=415.4µs request_id=1762242761877313100 version=0.12.9
time=2025-11-04T07:52:41.890Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=209.3µs request_id=1762242761890395300 version=0.12.9
time=2025-11-04T07:52:41.892Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/me http.pattern="GET /api/v1/me" http.status=200 http.d=113.6345ms request_id=1762242761778593200 version=0.12.9
time=2025-11-04T07:52:41.903Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=499.2µs request_id=1762242761903228400 version=0.12.9
time=2025-11-04T07:52:41.916Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=139.1µs request_id=1762242761916225900 version=0.12.9
time=2025-11-04T07:52:41.928Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=139.7µs request_id=1762242761928585800 version=0.12.9
time=2025-11-04T07:52:41.940Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761940723400 version=0.12.9
time=2025-11-04T07:52:41.954Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=335µs request_id=1762242761954222500 version=0.12.9
time=2025-11-04T07:52:41.968Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=499.6µs request_id=1762242761967721800 version=0.12.9
time=2025-11-04T07:52:41.982Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=498.9µs request_id=1762242761981721800 version=0.12.9
time=2025-11-04T07:52:41.994Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761994719600 version=0.12.9
time=2025-11-04T07:52:42.334Z level=ERROR source=ui.go:1618 msg="failed to get inference compute" error="timeout scanning server log for inference compute details"
time=2025-11-04T07:52:42.334Z level=ERROR source=ui.go:168 msg=site.serveHTTP error="failed to get inference compute: timeout scanning server log for inference compute details" http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=500 http.d=528.4653ms request_id=1762242761805731900 version=0.12.9
time=2025-11-04T07:52:43.841Z level=ERROR source=ui.go:1618 msg="failed to get inference compute" error="timeout scanning server log for inference compute details"
time=2025-11-04T07:52:43.841Z level=ERROR source=ui.go:168 msg=site.serveHTTP error="failed to get inference compute: timeout scanning server log for inference compute details" http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=500 http.d=502.453ms request_id=1762242763339151000 version=0.12.9
time=2025-11-04T07:52:44.173Z level=INFO source=updater.go:252 msg="beginning update checker" interval=1h0m0s
time=2025-11-04T07:52:44.874Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=2.8663136s request_id=1762242762008219900 version=0.12.9
time=2025-11-04T07:52:44.899Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=3.0929845s request_id=1762242761805731900 version=0.12.9
time=2025-11-04T07:52:44.986Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=2.1465903s request_id=1762242762839948100 version=0.12.9
time=2025-11-04T07:52:45.100Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/model/upstream http.pattern="POST /api/v1/model/upstream" http.status=200 http.d=196.3387ms request_id=1762242764903683900 version=0.12.9
time=2025-11-04T07:52:45.859Z level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA Variant: Compute:8.9 Driver:13.0 Name:CUDA0 VRAM:16.0 GiB}"
time=2025-11-04T07:52:45.859Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=200 http.d=13.0829ms request_id=1762242765846136400 version=0.12.9
time=2025-11-04T07:52:59.339Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T07:52:59.342Z level=INFO source=eventloop.go:329 msg="sent focus request to existing instance"
time=2025-11-04T07:52:59.342Z level=INFO source=app_windows.go:79 msg="existing instance found, exiting"
time=2025-11-04T07:53:02.866Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T07:53:03.058Z level=INFO source=eventloop.go:329 msg="sent focus request to existing instance"
time=2025-11-04T07:53:03.058Z level=INFO source=app_windows.go:79 msg="existing instance found, exiting"
time=2025-11-04T07:53:03.375Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=148.8µs request_id=1762242783375185400 version=0.12.9
time=2025-11-04T07:53:03.376Z level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA Variant: Compute:8.9 Driver:13.0 Name:CUDA0 VRAM:16.0 GiB}"
time=2025-11-04T07:53:03.376Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=200 http.d=0s request_id=1762242783376834300 version=0.12.9
time=2025-11-04T07:53:03.377Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=503.5µs request_id=1762242783376834300 version=0.12.9
time=2025-11-04T07:53:03.377Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=496.2µs request_id=1762242783377337800 version=0.12.9
time=2025-11-04T07:53:03.378Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=2.5002ms request_id=1762242783376333000 version=0.12.9
time=2025-11-04T07:53:03.448Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=44.9987ms request_id=1762242783403330300 version=0.12.9
time=2025-11-04T07:53:03.761Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/me http.pattern="GET /api/v1/me" http.status=200 http.d=415.6881ms request_id=1762242783345619600 version=0.12.9
time=2025-11-04T07:53:03.908Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/model/upstream http.pattern="POST /api/v1/model/upstream" http.status=200 http.d=505.539ms request_id=1762242783402830800 version=0.12.9
time=2025-11-04T07:53:07.025Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=498µs request_id=1762242787024596900 version=0.12.9
time=2025-11-04T07:53:07.035Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=0s request_id=1762242787035097700 version=0.12.9
time=2025-11-04T07:53:07.038Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=2.4991ms request_id=1762242787035596200 version=0.12.9
time=2025-11-04T07:53:37.064Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.2205ms request_id=1762242817063647200 version=0.12.9
time=2025-11-04T07:54:07.081Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.8894ms request_id=1762242847079696800 version=0.12.9
time=2025-11-04T07:54:37.097Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5039ms request_id=1762242877095746800 version=0.12.9
time=2025-11-04T07:55:07.108Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.4987ms request_id=1762242907107297600 version=0.12.9
time=2025-11-04T07:55:37.112Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=848.1µs request_id=1762242937111999200 version=0.12.9
time=2025-11-04T07:56:07.121Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5062ms request_id=1762242967119897100 version=0.12.9
time=2025-11-04T07:56:37.130Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5708ms request_id=1762242997128444800 version=0.12.9
time=2025-11-04T07:57:07.138Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.3395ms request_id=1762243027137492600 version=0.12.9
time=2025-11-04T07:57:37.145Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.0001ms request_id=1762243057144539300 version=0.12.9
time=2025-11-04T07:58:07.157Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5011ms request_id=1762243087155586700 version=0.12.9
time=2025-11-04T07:58:37.163Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=675.2µs request_id=1762243117162962600 version=0.12.9
time=2025-11-04T07:59:07.169Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=999.2µs request_id=1762243147168179400 version=0.12.9
time=2025-11-04T07:59:37.179Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=999µs request_id=1762243177178225200 version=0.12.9
time=2025-11-04T08:00:06.820Z level=ERROR source=ui.go:1179 msg="chat stream error" error="Post \"http://127.0.0.1:11434/api/chat\": context canceled"
time=2025-11-04T08:00:06.820Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/chat/new http.pattern="POST /api/v1/chat/{id}" http.status=200 http.d=6m59.8000615s request_id=1762242787020233200 version=0.12.9
time=2025-11-04T08:00:16.016Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T08:00:16.238Z level=INFO source=eventloop.go:329 msg="sent focus request to existing instance"
time=2025-11-04T08:00:16.238Z level=INFO source=app_windows.go:79 msg="existing instance found, exiting"
time=2025-11-04T08:00:16.571Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=500.4µs request_id=1762243216570659100 version=0.12.9
time=2025-11-04T08:00:16.573Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=78.2µs request_id=1762243216573081100 version=0.12.9
time=2025-11-04T08:00:16.576Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=6.0012ms request_id=1762243216570159700 version=0.12.9
time=2025-11-04T08:00:16.576Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=3.9996ms request_id=1762243216572161300 version=0.12.9
time=2025-11-04T08:00:16.594Z level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA Variant: Compute:8.9 Driver:13.0 Name:CUDA0 VRAM:16.0 GiB}"
time=2025-11-04T08:00:16.594Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=200 http.d=23.4993ms request_id=1762243216571159500 version=0.12.9
time=2025-11-04T08:00:16.646Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=53.4994ms request_id=1762243216592657900 version=0.12.9
time=2025-11-04T08:00:16.784Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/me http.pattern="GET /api/v1/me" http.status=200 http.d=243.7951ms request_id=1762243216540382800 version=0.12.9
time=2025-11-04T08:00:16.865Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/model/upstream http.pattern="POST /api/v1/model/upstream" http.status=200 http.d=272.8287ms request_id=1762243216592657900 version=0.12.9
time=2025-11-04T08:00:19.478Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=500µs request_id=1762243219478468900 version=0.12.9
time=2025-11-04T08:00:19.486Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=0s request_id=1762243219486623800 version=0.12.9
time=2025-11-04T08:00:19.500Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=13.9069ms request_id=1762243219486623800 version=0.12.9
time=2025-11-04T08:00:49.506Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.3714ms request_id=1762243249505513500 version=0.12.9

**Server.log:**
time=2025-11-04T07:52:41.998Z level=INFO source=routes.go:1524 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\User\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-11-04T07:52:42.008Z level=INFO source=images.go:522 msg="total blobs: 5"
time=2025-11-04T07:52:42.008Z level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-11-04T07:52:42.009Z level=INFO source=routes.go:1577 msg="Listening on 127.0.0.1:11434 (version 0.12.9)"
time=2025-11-04T07:52:42.010Z level=INFO source=runner.go:76 msg="discovering available GPUs..."
time=2025-11-04T07:52:42.015Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51799"
time=2025-11-04T07:52:44.164Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51806"
time=2025-11-04T07:52:44.471Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51812"
time=2025-11-04T07:52:44.695Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51817"
time=2025-11-04T07:52:44.695Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51818"
time=2025-11-04T07:52:44.873Z level=INFO source=types.go:42 msg="inference compute" id=GPU-bb474679-35c8-b903-05b8-51d955e673c2 filtered_id="" library=CUDA compute=8.9 name=CUDA0 description="NVIDIA GeForce RTX 4060 Ti" libdirs=ollama,cuda_v13 driver=13.0 pci_id=0000:07:00.0 type=discrete total="16.0 GiB" available="15.4 GiB"
time=2025-11-04T07:52:44.873Z level=INFO source=routes.go:1618 msg="entering low vram mode" "total vram"="16.0 GiB" threshold="20.0 GiB"
[GIN] 2025/11/04 - 07:52:44 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:52:44 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:52:44 | 200 |     24.1829ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:52:44 | 200 |    111.9967ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 07:53:03 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:03 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:03 | 200 |      1.4954ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:53:03 | 200 |     43.0009ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 07:53:07 | 200 |       500.5µs |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:07 | 200 |       502.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:53:07 | 200 |     55.0018ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 07:53:07 | 200 |     37.9946ms |       127.0.0.1 | POST     "/api/show"
time=2025-11-04T07:53:07.198Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51861"
[GIN] 2025/11/04 - 07:53:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:37 | 200 |       850.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:54:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:54:07 | 200 |       505.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:54:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:54:37 | 200 |       504.6µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:55:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:55:07 | 200 |       498.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:55:22 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2025/11/04 - 07:55:22 | 200 |       505.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:55:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:55:37 | 200 |       498.6µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:56:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:56:07 | 200 |       499.7µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:56:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:56:37 | 200 |       997.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:57:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:57:07 | 200 |      1.0126ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:57:37 | 200 |       500.2µs |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:57:37 | 200 |       499.9µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:58:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:58:07 | 200 |      1.0005ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:58:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:58:37 | 200 |         506µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:59:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:59:07 | 200 |       999.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:59:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:59:37 | 200 |       495.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:00:16 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:16 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:16 | 200 |       998.9µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:00:16 | 200 |     50.9962ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 08:00:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:19 | 200 |         499µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:00:19 | 200 |     54.9984ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 08:00:19 | 200 |      39.997ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 08:00:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:49 | 200 |         501µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:01:19 | 200 |       499.2µs |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:01:19 | 200 |       979.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:01:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:01:49 | 200 |       505.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:02:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:02:19 | 200 |       463.7µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:02:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:02:49 | 200 |       500.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:03:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:03:19 | 200 |       999.6µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:03:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:03:49 | 200 |       972.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:04:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:04:19 | 200 |       1.001ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:04:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:04:49 | 200 |       501.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:05:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:05:19 | 200 |       499.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:05:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:05:49 | 200 |       503.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:06:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:06:19 | 200 |      1.0011ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:06:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:06:49 | 200 |       1.498ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:07:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:07:19 | 200 |       998.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:07:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:07:49 | 200 |       1.004ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:08:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:08:19 | 200 |      1.5048ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:08:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:08:49 | 200 |      1.0002ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:09:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:09:19 | 200 |       998.4µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:09:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:09:49 | 200 |       475.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:10:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:10:19 | 200 |       504.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:10:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:10:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:10:49 | 200 |       504.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:11:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:11:19 | 200 |       500.8µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:11:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:11:49 | 200 |       500.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:12:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:12:19 | 200 |       505.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:12:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:12:49 | 200 |         507µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:13:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:13:19 | 200 |       923.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:13:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:13:49 | 200 |           1ms |       127.0.0.1 | GET      "/api/tags"

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.12.9

GiteaMirror added the bug, nvidia, windows labels 2026-05-04 22:21:58 -05:00
@dhiltgen commented on GitHub (Nov 4, 2025):

I'm not sure what's going wrong, but let's try running with more verbose logging and isolating the server to understand the problem. Quit the desktop app from the system tray, and in a PowerShell terminal, run

```powershell
$env:OLLAMA_DEBUG="2"
ollama serve 2>&1 | % ToString | tee-object serve.log
```

Then in another powershell terminal, send a simple prompt:

```powershell
ollama run qwen3:4b hello
```

Then share the results.

When it becomes unresponsive, is Ollama consuming a full CPU core in Task Manager, or does everything look idle?
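A side note on the `"timeout scanning server log for inference compute details"` errors in App.log above: the desktop app evidently tails the server log waiting for a GPU-discovery line and gives up after a deadline. A simplified, hypothetical illustration of that kind of scan (this is not Ollama's actual code; the regex is built only from the `msg=Matched "inference compute"` lines visible in this log):

```python
import re
import time

# Matches lines like:
# time=... level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA ...}"
COMPUTE_RE = re.compile(r'msg=Matched "inference compute"="(?P<details>[^"]+)"')

def scan_for_compute(lines, deadline_s=0.5):
    """Return the inference-compute details string, or None if the deadline passes first."""
    start = time.monotonic()
    for line in lines:
        if time.monotonic() - start > deadline_s:
            return None  # analogous to the "timeout scanning server log" error above
        m = COMPUTE_RE.search(line)
        if m:
            return m.group("details")
    return None
```

In the log above the scan times out twice at startup and only succeeds about four seconds later, once the runner has finished GPU discovery, which fits a slow or stalled startup path rather than a missing GPU.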

@cwiokpl commented on GitHub (Nov 5, 2025):

Thanks. My updates:
Yesterday I shut down the other VMs on the machine, increased the VM's RAM allocation from 32 GB to 64 GB, and changed the CPU setup (QEMU/KVM in virt-manager) from 1 socket, 2 cores, 2 threads to 3 sockets, 2 cores, 2 threads, and the issue was magically gone.

Today I returned to the previous settings and the behaviour went back to what was described above: unresponsive. Attached is the serve.log with additional debugging.

Right now, Ollama is frozen on `ollama run qwen3:4b hello`, but in Task Manager I see it using 25% of the CPU, while the GPU is mostly idle. It does not look as if any logical processor is used more than the others. `ollama ps` shows nothing.

[serve.log](https://github.com/user-attachments/files/23354313/serve.log)
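For reference, the two vCPU topologies described above correspond roughly to the following libvirt domain XML fragments (a sketch; `host-passthrough` and the `<vcpu>` counts are assumptions, as the actual domain definition isn't shown):

```xml
<!-- Topology that reproduced the hang: 1 socket x 2 cores x 2 threads (4 vCPUs) -->
<vcpu placement="static">4</vcpu>
<cpu mode="host-passthrough">
  <topology sockets="1" cores="2" threads="2"/>
</cpu>

<!-- Topology that avoided it: 3 sockets x 2 cores x 2 threads (12 vCPUs) -->
<vcpu placement="static">12</vcpu>
<cpu mode="host-passthrough">
  <topology sockets="3" cores="2" threads="2"/>
</cpu>
```

If the hang really is topology-sensitive, narrowing it down (e.g. 1 socket with more cores vs. more sockets with the same total vCPU count) would help isolate which dimension triggers it.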

@dhiltgen commented on GitHub (Nov 5, 2025):

> changed CPU setup (QEMU/KVM in Virt Manager) from 1 Socket, 2 Cores and 2 Threads to 3 Sockets, 2 Cores and 2 Threads and the issue magically was gone.

Thanks for that insight! My suspicion is that the hang is in the Windows-specific CPU lookup code, which seems to align with your findings.


Reference: github-starred/ollama#70640