[GH-ISSUE #12942] Ollama suddenly became unresponsive, reinstall did not help #70640

Closed
opened 2026-05-04 22:21:57 -05:00 by GiteaMirror · 3 comments
Owner

Originally created by @cwiokpl on GitHub (Nov 4, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12942

Originally assigned to: @dhiltgen on GitHub.

What is the issue?

Hello,

I am running a Windows 10 VM on a Debian 13 host with GPU passthrough. I set up the CUDA Toolkit and Ollama 2 days ago and everything was working fine. I had multiple models installed, including qwen3-vl:30b. The issue started when I decided to download qwen3-vl:30b-a3b-instruct-q4_K_M with `ollama run qwen3-vl:30b-a3b-instruct-q4_K_M`; however, by accident I ran `ollama run qwen3-vl:30b-a3b-thinking-q4_K_M`, which is actually the same model as "qwen3-vl:30b" - the sha1 matches. After that, no matter what I do, I cannot get it to work. I am not 100% sure my actions are related to the issue; it is just what I noticed.
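
A quick way to check whether two tags really resolve to the same model is the ID column of `ollama list`, which shows each model's digest prefix - matching IDs mean one shared blob. A minimal sketch (assuming `ollama` is on PATH; guarded so it degrades gracefully if not):

```shell
# Compare the ID (digest) column for the two qwen3-vl tags:
# identical IDs mean both tags point at the same underlying blob.
if command -v ollama >/dev/null 2>&1; then
  ollama list | grep "qwen3-vl" || echo "no qwen3-vl tags found"
else
  echo "ollama not on PATH"
fi
```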

When I open the Ollama app it gets stuck at the "writing" animation; when I do the same from the CLI with `ollama run`, it freezes at a new line. `ollama ps` shows an empty list, which I understand to mean the model is not loaded.

Example output when I run `ollama run qwen3:4b`:

C:\Users\User>ollama run qwen3:4b
⠴
It will freeze like this. I see some CPU and GPU load, but GPU memory usage stays near zero.

I reinstalled Ollama (manually deleted all remaining files after uninstalling) and reinstalled the CUDA Toolkit. Due to the nature of the setup (GPU passthrough + Looking Glass), I'd rather avoid a complete Windows reinstall.
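
Since `ollama ps` comes back empty while the CLI hangs, probing the HTTP API directly (bypassing the app and CLI) may help tell a UI-side hang from a server-side one. A minimal sketch, assuming the default 127.0.0.1:11434 bind (which the server log below confirms) and an arbitrary 5-second timeout:

```shell
# Hit the server's version endpoint directly. If this times out too,
# the hang is in `ollama serve` itself rather than the desktop app.
if command -v curl >/dev/null 2>&1; then
  curl --max-time 5 -s http://127.0.0.1:11434/api/version \
    && echo "server responded" \
    || echo "no response within 5s (server-side hang or not listening)"
else
  echo "curl not installed"
fi
```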

OS
Windows 10 in VM

GPU
NVIDIA RTX 4060 Ti

CPU
AMD

Ollama version

0.12.9

Relevant log output

**App.log:**
time=2025-11-04T07:52:40.889Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T07:52:40.890Z level=INFO source=app.go:231 msg="initialized tools registry" tool_count=0
time=2025-11-04T07:52:40.894Z level=INFO source=app.go:246 msg="starting ollama server"
time=2025-11-04T07:52:41.171Z level=INFO source=app.go:275 msg="starting ui server" port=51753
time=2025-11-04T07:52:41.805Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=0s request_id=1762242761805731900 version=0.12.9
time=2025-11-04T07:52:41.807Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=498.5µs request_id=1762242761806733000 version=0.12.9
time=2025-11-04T07:52:41.807Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=0s request_id=1762242761807231500 version=0.12.9
time=2025-11-04T07:52:41.831Z level=WARN source=ui.go:1601 msg="failed to show model details" error="Post \"http://127.0.0.1:11434/api/show\": dial tcp 127.0.0.1:11434: connectex: No connection could be made because the target machine actively refused it." model=qwen3:4b
time=2025-11-04T07:52:41.831Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=326.4µs request_id=1762242761831404200 version=0.12.9
time=2025-11-04T07:52:41.837Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761837847500 version=0.12.9
time=2025-11-04T07:52:41.851Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=497.4µs request_id=1762242761850732700 version=0.12.9
time=2025-11-04T07:52:41.864Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761864755600 version=0.12.9
time=2025-11-04T07:52:41.877Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=415.4µs request_id=1762242761877313100 version=0.12.9
time=2025-11-04T07:52:41.890Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=209.3µs request_id=1762242761890395300 version=0.12.9
time=2025-11-04T07:52:41.892Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/me http.pattern="GET /api/v1/me" http.status=200 http.d=113.6345ms request_id=1762242761778593200 version=0.12.9
time=2025-11-04T07:52:41.903Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=499.2µs request_id=1762242761903228400 version=0.12.9
time=2025-11-04T07:52:41.916Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=139.1µs request_id=1762242761916225900 version=0.12.9
time=2025-11-04T07:52:41.928Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=139.7µs request_id=1762242761928585800 version=0.12.9
time=2025-11-04T07:52:41.940Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761940723400 version=0.12.9
time=2025-11-04T07:52:41.954Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=335µs request_id=1762242761954222500 version=0.12.9
time=2025-11-04T07:52:41.968Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=499.6µs request_id=1762242761967721800 version=0.12.9
time=2025-11-04T07:52:41.982Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=498.9µs request_id=1762242761981721800 version=0.12.9
time=2025-11-04T07:52:41.994Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=0s request_id=1762242761994719600 version=0.12.9
time=2025-11-04T07:52:42.334Z level=ERROR source=ui.go:1618 msg="failed to get inference compute" error="timeout scanning server log for inference compute details"
time=2025-11-04T07:52:42.334Z level=ERROR source=ui.go:168 msg=site.serveHTTP error="failed to get inference compute: timeout scanning server log for inference compute details" http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=500 http.d=528.4653ms request_id=1762242761805731900 version=0.12.9
time=2025-11-04T07:52:43.841Z level=ERROR source=ui.go:1618 msg="failed to get inference compute" error="timeout scanning server log for inference compute details"
time=2025-11-04T07:52:43.841Z level=ERROR source=ui.go:168 msg=site.serveHTTP error="failed to get inference compute: timeout scanning server log for inference compute details" http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=500 http.d=502.453ms request_id=1762242763339151000 version=0.12.9
time=2025-11-04T07:52:44.173Z level=INFO source=updater.go:252 msg="beginning update checker" interval=1h0m0s
time=2025-11-04T07:52:44.874Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=2.8663136s request_id=1762242762008219900 version=0.12.9
time=2025-11-04T07:52:44.899Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=3.0929845s request_id=1762242761805731900 version=0.12.9
time=2025-11-04T07:52:44.986Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=2.1465903s request_id=1762242762839948100 version=0.12.9
time=2025-11-04T07:52:45.100Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/model/upstream http.pattern="POST /api/v1/model/upstream" http.status=200 http.d=196.3387ms request_id=1762242764903683900 version=0.12.9
time=2025-11-04T07:52:45.859Z level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA Variant: Compute:8.9 Driver:13.0 Name:CUDA0 VRAM:16.0 GiB}"
time=2025-11-04T07:52:45.859Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=200 http.d=13.0829ms request_id=1762242765846136400 version=0.12.9
time=2025-11-04T07:52:59.339Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T07:52:59.342Z level=INFO source=eventloop.go:329 msg="sent focus request to existing instance"
time=2025-11-04T07:52:59.342Z level=INFO source=app_windows.go:79 msg="existing instance found, exiting"
time=2025-11-04T07:53:02.866Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T07:53:03.058Z level=INFO source=eventloop.go:329 msg="sent focus request to existing instance"
time=2025-11-04T07:53:03.058Z level=INFO source=app_windows.go:79 msg="existing instance found, exiting"
time=2025-11-04T07:53:03.375Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=148.8µs request_id=1762242783375185400 version=0.12.9
time=2025-11-04T07:53:03.376Z level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA Variant: Compute:8.9 Driver:13.0 Name:CUDA0 VRAM:16.0 GiB}"
time=2025-11-04T07:53:03.376Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=200 http.d=0s request_id=1762242783376834300 version=0.12.9
time=2025-11-04T07:53:03.377Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=503.5µs request_id=1762242783376834300 version=0.12.9
time=2025-11-04T07:53:03.377Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=496.2µs request_id=1762242783377337800 version=0.12.9
time=2025-11-04T07:53:03.378Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=2.5002ms request_id=1762242783376333000 version=0.12.9
time=2025-11-04T07:53:03.448Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=44.9987ms request_id=1762242783403330300 version=0.12.9
time=2025-11-04T07:53:03.761Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/me http.pattern="GET /api/v1/me" http.status=200 http.d=415.6881ms request_id=1762242783345619600 version=0.12.9
time=2025-11-04T07:53:03.908Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/model/upstream http.pattern="POST /api/v1/model/upstream" http.status=200 http.d=505.539ms request_id=1762242783402830800 version=0.12.9
time=2025-11-04T07:53:07.025Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=498µs request_id=1762242787024596900 version=0.12.9
time=2025-11-04T07:53:07.035Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=0s request_id=1762242787035097700 version=0.12.9
time=2025-11-04T07:53:07.038Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=2.4991ms request_id=1762242787035596200 version=0.12.9
time=2025-11-04T07:53:37.064Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.2205ms request_id=1762242817063647200 version=0.12.9
time=2025-11-04T07:54:07.081Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.8894ms request_id=1762242847079696800 version=0.12.9
time=2025-11-04T07:54:37.097Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5039ms request_id=1762242877095746800 version=0.12.9
time=2025-11-04T07:55:07.108Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.4987ms request_id=1762242907107297600 version=0.12.9
time=2025-11-04T07:55:37.112Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=848.1µs request_id=1762242937111999200 version=0.12.9
time=2025-11-04T07:56:07.121Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5062ms request_id=1762242967119897100 version=0.12.9
time=2025-11-04T07:56:37.130Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5708ms request_id=1762242997128444800 version=0.12.9
time=2025-11-04T07:57:07.138Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.3395ms request_id=1762243027137492600 version=0.12.9
time=2025-11-04T07:57:37.145Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.0001ms request_id=1762243057144539300 version=0.12.9
time=2025-11-04T07:58:07.157Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.5011ms request_id=1762243087155586700 version=0.12.9
time=2025-11-04T07:58:37.163Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=675.2µs request_id=1762243117162962600 version=0.12.9
time=2025-11-04T07:59:07.169Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=999.2µs request_id=1762243147168179400 version=0.12.9
time=2025-11-04T07:59:37.179Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=999µs request_id=1762243177178225200 version=0.12.9
time=2025-11-04T08:00:06.820Z level=ERROR source=ui.go:1179 msg="chat stream error" error="Post \"http://127.0.0.1:11434/api/chat\": context canceled"
time=2025-11-04T08:00:06.820Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/chat/new http.pattern="POST /api/v1/chat/{id}" http.status=200 http.d=6m59.8000615s request_id=1762242787020233200 version=0.12.9
time=2025-11-04T08:00:16.016Z level=INFO source=app_windows.go:270 msg="starting Ollama" app=C:\Users\User\AppData\Local\Programs\Ollama version=0.12.9 OS=Windows/10.0.19043
time=2025-11-04T08:00:16.238Z level=INFO source=eventloop.go:329 msg="sent focus request to existing instance"
time=2025-11-04T08:00:16.238Z level=INFO source=app_windows.go:79 msg="existing instance found, exiting"
time=2025-11-04T08:00:16.571Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=500.4µs request_id=1762243216570659100 version=0.12.9
time=2025-11-04T08:00:16.573Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=78.2µs request_id=1762243216573081100 version=0.12.9
time=2025-11-04T08:00:16.576Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=6.0012ms request_id=1762243216570159700 version=0.12.9
time=2025-11-04T08:00:16.576Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/health http.pattern="GET /api/v1/health" http.status=200 http.d=3.9996ms request_id=1762243216572161300 version=0.12.9
time=2025-11-04T08:00:16.594Z level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA Variant: Compute:8.9 Driver:13.0 Name:CUDA0 VRAM:16.0 GiB}"
time=2025-11-04T08:00:16.594Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/inference-compute http.pattern="GET /api/v1/inference-compute" http.status=200 http.d=23.4993ms request_id=1762243216571159500 version=0.12.9
time=2025-11-04T08:00:16.646Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/model/qwen3:4b/capabilities http.pattern="GET /api/v1/model/{model}/capabilities" http.status=200 http.d=53.4994ms request_id=1762243216592657900 version=0.12.9
time=2025-11-04T08:00:16.784Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/me http.pattern="GET /api/v1/me" http.status=200 http.d=243.7951ms request_id=1762243216540382800 version=0.12.9
time=2025-11-04T08:00:16.865Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=POST http.path=/api/v1/model/upstream http.pattern="POST /api/v1/model/upstream" http.status=200 http.d=272.8287ms request_id=1762243216592657900 version=0.12.9
time=2025-11-04T08:00:19.478Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/chats http.pattern="GET /api/v1/chats" http.status=200 http.d=500µs request_id=1762243219478468900 version=0.12.9
time=2025-11-04T08:00:19.486Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/settings http.pattern="GET /api/v1/settings" http.status=200 http.d=0s request_id=1762243219486623800 version=0.12.9
time=2025-11-04T08:00:19.500Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=13.9069ms request_id=1762243219486623800 version=0.12.9
time=2025-11-04T08:00:49.506Z level=INFO source=ui.go:168 msg=site.serveHTTP http.method=GET http.path=/api/v1/models http.pattern="GET /api/v1/models" http.status=200 http.d=1.3714ms request_id=1762243249505513500 version=0.12.9

**Server.log:**
time=2025-11-04T07:52:41.998Z level=INFO source=routes.go:1524 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\User\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-11-04T07:52:42.008Z level=INFO source=images.go:522 msg="total blobs: 5"
time=2025-11-04T07:52:42.008Z level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-11-04T07:52:42.009Z level=INFO source=routes.go:1577 msg="Listening on 127.0.0.1:11434 (version 0.12.9)"
time=2025-11-04T07:52:42.010Z level=INFO source=runner.go:76 msg="discovering available GPUs..."
time=2025-11-04T07:52:42.015Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51799"
time=2025-11-04T07:52:44.164Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51806"
time=2025-11-04T07:52:44.471Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51812"
time=2025-11-04T07:52:44.695Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51817"
time=2025-11-04T07:52:44.695Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51818"
time=2025-11-04T07:52:44.873Z level=INFO source=types.go:42 msg="inference compute" id=GPU-bb474679-35c8-b903-05b8-51d955e673c2 filtered_id="" library=CUDA compute=8.9 name=CUDA0 description="NVIDIA GeForce RTX 4060 Ti" libdirs=ollama,cuda_v13 driver=13.0 pci_id=0000:07:00.0 type=discrete total="16.0 GiB" available="15.4 GiB"
time=2025-11-04T07:52:44.873Z level=INFO source=routes.go:1618 msg="entering low vram mode" "total vram"="16.0 GiB" threshold="20.0 GiB"
[GIN] 2025/11/04 - 07:52:44 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:52:44 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:52:44 | 200 |     24.1829ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:52:44 | 200 |    111.9967ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 07:53:03 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:03 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:03 | 200 |      1.4954ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:53:03 | 200 |     43.0009ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 07:53:07 | 200 |       500.5µs |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:07 | 200 |       502.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:53:07 | 200 |     55.0018ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 07:53:07 | 200 |     37.9946ms |       127.0.0.1 | POST     "/api/show"
time=2025-11-04T07:53:07.198Z level=INFO source=server.go:400 msg="starting runner" cmd="C:\\Users\\User\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 51861"
[GIN] 2025/11/04 - 07:53:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:53:37 | 200 |       850.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:54:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:54:07 | 200 |       505.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:54:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:54:37 | 200 |       504.6µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:55:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:55:07 | 200 |       498.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:55:22 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2025/11/04 - 07:55:22 | 200 |       505.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:55:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:55:37 | 200 |       498.6µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:56:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:56:07 | 200 |       499.7µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:56:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:56:37 | 200 |       997.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:57:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:57:07 | 200 |      1.0126ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:57:37 | 200 |       500.2µs |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:57:37 | 200 |       499.9µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:58:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:58:07 | 200 |      1.0005ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:58:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:58:37 | 200 |         506µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:59:07 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:59:07 | 200 |       999.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 07:59:37 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 07:59:37 | 200 |       495.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:00:16 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:16 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:16 | 200 |       998.9µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:00:16 | 200 |     50.9962ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 08:00:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:19 | 200 |         499µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:00:19 | 200 |     54.9984ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 08:00:19 | 200 |      39.997ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/04 - 08:00:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:00:49 | 200 |         501µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:01:19 | 200 |       499.2µs |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:01:19 | 200 |       979.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:01:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:01:49 | 200 |       505.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:02:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:02:19 | 200 |       463.7µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:02:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:02:49 | 200 |       500.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:03:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:03:19 | 200 |       999.6µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:03:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:03:49 | 200 |       972.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:04:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:04:19 | 200 |       1.001ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:04:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:04:49 | 200 |       501.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:05:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:05:19 | 200 |       499.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:05:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:05:49 | 200 |       503.5µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:06:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:06:19 | 200 |      1.0011ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:06:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:06:49 | 200 |       1.498ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:07:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:07:19 | 200 |       998.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:07:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:07:49 | 200 |       1.004ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:08:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:08:19 | 200 |      1.5048ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:08:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:08:49 | 200 |      1.0002ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:09:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:09:19 | 200 |       998.4µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:09:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:09:49 | 200 |       475.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:10:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:10:19 | 200 |       504.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:10:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:10:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:10:49 | 200 |       504.2µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:11:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:11:19 | 200 |       500.8µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:11:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:11:49 | 200 |       500.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:12:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:12:19 | 200 |       505.3µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:12:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:12:49 | 200 |         507µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:13:19 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:13:19 | 200 |       923.1µs |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/04 - 08:13:49 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/04 - 08:13:49 | 200 |           1ms |       127.0.0.1 | GET      "/api/tags"

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.12.9

GiteaMirror added the bug, nvidia, windows labels 2026-05-04 22:21:58 -05:00
@dhiltgen commented on GitHub (Nov 4, 2025):

I'm not sure what's going wrong, but let's try running with more verbose logging and isolating the server to understand the problem. Quit the desktop app from the system tray, and in a PowerShell terminal, run

```powershell
$env:OLLAMA_DEBUG="2"
ollama serve 2>&1 | % ToString | tee-object serve.log
```

Then in another powershell terminal, send a simple prompt:

```powershell
ollama run qwen3:4b hello
```

Then share the results.

When it becomes unresponsive, is Ollama consuming a full CPU core in Task Manager, or does everything look idle?
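A side note on the `"timeout scanning server log for inference compute details"` errors in App.log above: the desktop app evidently tails the server log waiting for a GPU-discovery line and gives up after a deadline. A simplified, hypothetical illustration of that kind of scan (this is not Ollama's actual code; the regex is built only from the `msg=Matched "inference compute"` lines visible in this log):

```python
import re
import time

# Matches lines like:
# time=... level=INFO source=server.go:343 msg=Matched "inference compute"="{Library:CUDA ...}"
COMPUTE_RE = re.compile(r'msg=Matched "inference compute"="(?P<details>[^"]+)"')

def scan_for_compute(lines, deadline_s=0.5):
    """Return the inference-compute details string, or None if the deadline passes first."""
    start = time.monotonic()
    for line in lines:
        if time.monotonic() - start > deadline_s:
            return None  # analogous to the "timeout scanning server log" error above
        m = COMPUTE_RE.search(line)
        if m:
            return m.group("details")
    return None
```

In the log above the scan times out twice at startup and only succeeds about four seconds later, once the runner has finished GPU discovery, which fits a slow or stalled startup path rather than a missing GPU.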

@cwiokpl commented on GitHub (Nov 5, 2025):

Thanks. My updates:
Yesterday I shut down the other VMs on the machine, increased the VM's RAM allocation from 32 GB to 64 GB, and changed the CPU setup (QEMU/KVM in virt-manager) from 1 socket, 2 cores, 2 threads to 3 sockets, 2 cores, 2 threads, and the issue was magically gone.

Today I returned to the previous settings and the behaviour went back to what was described above: unresponsive. Attached is the serve.log with additional debugging.

Right now, Ollama is frozen on `ollama run qwen3:4b hello`, but in Task Manager I see it using 25% of the CPU, while the GPU is mostly idle. It does not look as if any logical processor is used more than the others. `ollama ps` shows nothing.

[serve.log](https://github.com/user-attachments/files/23354313/serve.log)
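For reference, the two vCPU topologies described above correspond roughly to the following libvirt domain XML fragments (a sketch; `host-passthrough` and the `<vcpu>` counts are assumptions, as the actual domain definition isn't shown):

```xml
<!-- Topology that reproduced the hang: 1 socket x 2 cores x 2 threads (4 vCPUs) -->
<vcpu placement="static">4</vcpu>
<cpu mode="host-passthrough">
  <topology sockets="1" cores="2" threads="2"/>
</cpu>

<!-- Topology that avoided it: 3 sockets x 2 cores x 2 threads (12 vCPUs) -->
<vcpu placement="static">12</vcpu>
<cpu mode="host-passthrough">
  <topology sockets="3" cores="2" threads="2"/>
</cpu>
```

If the hang really is topology-sensitive, narrowing it down (e.g. 1 socket with more cores vs. more sockets with the same total vCPU count) would help isolate which dimension triggers it.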

@dhiltgen commented on GitHub (Nov 5, 2025):

> changed CPU setup (QEMU/KVM in Virt Manager) from 1 Socket, 2 Cores and 2 Threads to 3 Sockets, 2 Cores and 2 Threads and the issue magically was gone.

Thanks for that insight! My suspicion is that the hang is in the Windows-specific CPU lookup code, which seems to align with your findings.


Reference: github-starred/ollama#70640