[GH-ISSUE #9357] Intermittent hanging/stalling and "llama runner process no longer running" error on CPU with specific models (Windows 11, Ryzen 5800X, RTX 3060) #68166

GiteaMirror · 2026-05-04T12:42:34-05:00

GiteaMirror commented

2026-05-04 12:42:34 -05:00

Originally created by @bluespork on GitHub (Feb 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9357

What is the issue?

I am experiencing an intermittent issue with Ollama on Windows 11 where models running on the CPU either hang/stall mid-response or produce the "llama runner process no longer running" error. This issue does not occur when running on the GPU (Nvidia RTX 3060). I have done extensive troubleshooting (detailed below) and believe this to be a bug within Ollama or llama.cpp.

Expected Behavior:

Ollama should consistently run models on the CPU when the CUDA_VISIBLE_DEVICES="" environment variable is set, without hanging, stalling, or producing the "llama runner process no longer running" error.

Actual Behavior:

With Ollama version 0.1.32 (portable, server run separately), the mistral:latest model fails with "llama runner process no longer running" during the loading phase, before even getting to a prompt.
With Ollama version 0.5.4 (installed via official installer), the mistral:latest model consistently fails with "llama runner process no longer running" immediately upon starting, when forced to run on CPU.
With Ollama version 0.5.12 (installed via official installer), the deepscaler:1.5b-preview-q4_K_M model sometimes runs to completion on CPU. However, it frequently hangs or stalls mid-response. If I type anything into the prompt after it hangs, it will sometimes continue and complete the response. Other times, it produces the "llama runner process no longer running" error.
Running on the GPU (without setting CUDA_VISIBLE_DEVICES) works flawlessly with all tested models.

Steps to Reproduce:

System: Windows 11, Ryzen 7 5800X, 32GB RAM, Nvidia RTX 3060.
Ollama Version: 0.1.32 (portable), 0.5.4 (installed), and 0.5.12 (installed).
Environment Variable: Ensure CUDA_VISIBLE_DEVICES="" is set. This has been tested both as a system-wide variable and set per-session in Command Prompt. I verified with echo %CUDA_VISIBLE_DEVICES% that the variable is unset when it should be, and set to an empty string when I set it. The system-wide environment variable was deleted as well.
Model(s): mistral:latest (consistently fails on versions 0.5.4 and 0.1.32), deepscaler:1.5b-preview-q4_K_M (intermittently hangs/stalls or fails on version 0.5.12).
Command: ollama run <model_name> --verbose "list 10 animals who can weight over 1000 pounds" (or any other prompt).
Observe:
- With mistral:latest on version 0.5.4, multiple ollama_llama_server.exe processes appear in Task Manager and then disappear, leading to the error.
- With mistral:latest on version 0.1.32, the "llama runner process no longer running" error appears.
- With deepscaler:1.5b-preview-q4_K_M on version 0.5.12, the response may stop mid-sentence. Typing anything in the prompt may cause it to continue, or may result in the "llama runner process no longer running" error.
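The reproduction steps above can be scripted so that the variable is guaranteed to reach the server process. The following is a minimal sketch, not part of the original report: the helper names `cpu_only_env`, `start_server`, and `run_prompt` are hypothetical, and it assumes `ollama` is on PATH and that no other Ollama server (for example the tray app) is already running. It also assumes that GPU discovery happens in the `ollama serve` process rather than in the `ollama run` client, so the variable is applied to the server.

```python
import os
import subprocess


def cpu_only_env(base=None, value=""):
    """Copy an environment and override CUDA_VISIBLE_DEVICES.

    The issue uses an empty string; a later comment uses "-1".
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = value
    return env


def start_server(env):
    """Start `ollama serve` with the given environment (hypothetical helper)."""
    return subprocess.Popen(["ollama", "serve"], env=env)


def run_prompt(model, prompt, env, timeout=600):
    """Run the command from the report; a timeout turns a hang into an exception."""
    return subprocess.run(
        ["ollama", "run", model, "--verbose", prompt],
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

A run would call `start_server(cpu_only_env())`, wait for the server to listen, and then call `run_prompt("mistral:latest", "list 10 animals who can weight over 1000 pounds", cpu_only_env())`. A `subprocess.TimeoutExpired` exception would correspond to the hang described above, and a non-zero return code to the crash.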

Troubleshooting Steps Taken:

Confirmed issue occurs with multiple models (mistral:latest, deepscaler:1.5b-preview-q4_K_M), but not all models.
Confirmed issue is specific to CPU execution; GPU execution works perfectly.
Verified CUDA_VISIBLE_DEVICES="" is correctly set and that Ollama is indeed using the CPU (via Task Manager).
Tested with a significantly older Ollama version (0.1.32) to rule out recent regressions – the issue also occurs there, but at a different stage (during model loading).
Restarted Ollama and rebooted the computer multiple times.
Confirmed AVX and AVX2 support on the CPU using CPU-Z.
Ruled out common software conflicts (VPN, firewall).
Tested with different thread counts using OLLAMA_NUM_THREAD (1, 4, 8) – no effect.

System Information:

Ollama Version: 0.5.4, 0.5.12, and 0.1.32
Operating System: Windows 11
CPU: AMD Ryzen 7 5800X
RAM: 32GB
GPU: Nvidia RTX 3060 12GB
Models Affected: mistral:latest, deepscaler:1.5b-preview-q4_K_M

Conclusion:

This issue appears to be a bug related to how specific models are handled on the CPU by Ollama (or llama.cpp) on this particular system configuration. The intermittent hanging/stalling with deepscaler:1.5b-preview-q4_K_M on version 0.5.12, and the consistent failure with mistral:latest on versions 0.5.4 and 0.1.32, suggest a low-level incompatibility or instability. The different failure modes (hanging vs. crashing) are important clues.

Relevant log output

OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.5.12

GiteaMirror added the bug label 2026-05-04 12:42:34 -05:00

GiteaMirror closed this issue

2026-05-04 12:42:48 -05:00

GiteaMirror commented

2026-05-04 12:42:49 -05:00

@rick-github commented on GitHub (Feb 26, 2025):

Server logs (https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md) may aid in debugging.

OLLAMA_NUM_THREAD is not an Ollama configuration variable.
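Since OLLAMA_NUM_THREAD has no effect, the thread-count tests in the report did not actually change anything. A minimal sketch of one way to vary the thread count instead, assuming the `num_thread` and `num_gpu` request options of the Ollama REST API and the default address 127.0.0.1:11434; the function names are hypothetical, and setting `num_gpu` to 0 is assumed here to keep all layers on the CPU:

```python
import json
import urllib.request


def build_generate_request(model, prompt, num_thread, host="http://127.0.0.1:11434"):
    """Build a POST request for /api/generate with per-request options."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Assumption: num_thread sets the CPU thread count and
        # num_gpu=0 offloads no layers to the GPU.
        "options": {"num_thread": num_thread, "num_gpu": 0},
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(request, timeout=600):
    """Send the request to a running server and return the generated text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Looping `generate(build_generate_request("mistral:latest", prompt, n))` over n = 1, 4, 8 would repeat the thread-count test from the report with a setting the server is expected to honor.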

GiteaMirror commented

2026-05-04 12:42:50 -05:00

@bluespork commented on GitHub (Feb 27, 2025):

UPDATE:

I have made a significant discovery that isolates the problem. I physically removed the NVIDIA RTX 3060 GPU from my system and replaced it with an older AMD Radeon R5 430 card that is not supported by Ollama for GPU acceleration.

With the AMD card installed, Ollama runs perfectly on the CPU, without any of the hanging or "llama runner process no longer running" errors. This holds for all models, including mistral:latest. I did not need to set CUDA_VISIBLE_DEVICES="" with the AMD card, as Ollama correctly detected that it could not use the GPU.

This strongly indicates that the issue is specific to the interaction between Ollama/llama.cpp and either the NVIDIA RTX 3060 hardware or, more likely, the NVIDIA drivers on Windows 11. It is not a general CPU incompatibility, nor is it a fundamental problem with Ollama's ability to run on the CPU.

GiteaMirror commented

2026-05-04 12:42:52 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

Yes, I have an RTX 3060 in a very similar system and see the same issue with CUDA_VISIBLE_DEVICES:-1, which is how I run CPU-only.
It seems this crashes Ollama (although you would not notice, since the model continues the response quickly and without problems). It also resets the logs and leaves orphaned processes.
Versions: 0.5.12 and 0.5.13.

GiteaMirror commented

2026-05-04 12:42:54 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

2025/03/04 21:30:05 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:30:05.861+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:30:05.870+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13-yontracks)"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.878+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.887+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:30:05.895+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
time=2025-03-04T21:30:05.905+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:30:05.914+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
time=2025-03-04T21:30:05.920+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.924+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.927+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.930+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:30:05.930+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-04T21:30:05.931+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="9.6 GiB"
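The log above is easier to read once the key=value lines are split into fields. A minimal sketch, assuming only the line format visible in this log; the function names are hypothetical, and the first line of the log (the `routes.go` config dump) uses a different layout, so it yields no `level` field and is skipped by the summary:

```python
import re

# key=value pairs; a value is either a double-quoted string
# (with backslash escapes) or a run of non-space characters.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_log_line(line):
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, raw in _PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]  # drop the surrounding quotes, keep escapes as-is
        fields[key] = raw
    return fields


def gpu_discovery_summary(lines):
    """Return the INFO messages emitted by gpu.go, in order."""
    summary = []
    for fields in map(parse_log_line, lines):
        if fields.get("level") == "INFO" and fields.get("source", "").startswith("gpu.go"):
            summary.append(fields["msg"])
    return summary
```

Applied to this log, the summary would reduce the output to the three gpu.go INFO messages: the search for compatible GPUs, no NVIDIA devices detected by nvcuda.dll, and no compatible GPUs discovered.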


GiteaMirror commented

2026-05-04 12:42:56 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

2025/03/04 21:31:37 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:31:37.689+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:31:37.696+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:31:37.701+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13-yontracks)"
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:31:37.701+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:31:37.701+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:31:37.701+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:31:37.703+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:31:37.715+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:31:37.718+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:31:37.718+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:31:37.719+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
time=2025-03-04T21:31:37.729+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:31:37.729+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:31:37.729+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:31:37.738+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
time=2025-03-04T21:31:37.742+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:31:37.747+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:31:37.753+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:31:37.754+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:31:37.754+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-04T21:31:37.754+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="26.5 GiB"
[GIN] 2025/03/04 - 21:31:41 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2025/03/04 - 21:31:41 | 200 |     12.8616ms |       127.0.0.1 | POST     "/api/show"
time=2025-03-04T21:31:41.159+10:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="31.9 GiB" before.free="26.5 GiB" before.free_swap="26.8 GiB" now.total="31.9 GiB" now.free="26.4 GiB" now.free_swap="26.8 GiB"
time=2025-03-04T21:31:41.159+10:00 level=DEBUG source=sched.go:182 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=3 gpu_count=1
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=sched.go:212 msg="cpu mode with first model, loading"
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="31.9 GiB" before.free="26.4 GiB" before.free_swap="26.8 GiB" now.total="31.9 GiB" now.free="26.4 GiB" now.free_swap="26.8 GiB"
time=2025-03-04T21:31:41.167+10:00 level=INFO source=server.go:97 msg="system memory" total="31.9 GiB" free="26.4 GiB" free_swap="26.8 GiB"
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=memory.go:108 msg=evaluating library=cpu gpu_count=1 available="[26.4 GiB]"
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="31.9 GiB" before.free="26.4 GiB" before.free_swap="26.8 GiB" now.total="31.9 GiB" now.free="26.4 GiB" now.free_swap="26.8 GiB"
time=2025-03-04T21:31:41.167+10:00 level=WARN source=ggml.go:136 msg="key not found" key=llama.attention.key_length default=128
time=2025-03-04T21:31:41.167+10:00 level=WARN source=ggml.go:136 msg="key not found" key=llama.attention.value_length default=128
time=2025-03-04T21:31:41.167+10:00 level=INFO source=server.go:130 msg=offload library=cpu layers.requested=-1 layers.model=33 layers.offload=0 layers.split="" memory.available="[26.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="10.1 GiB" memory.required.partial="0 B" memory.required.kv="4.0 GiB" memory.required.allocations="[10.1 GiB]" memory.weights.total="7.7 GiB" memory.weights.repeating="7.6 GiB" memory.weights.nonrepeating="105.0 MiB" memory.graph.full="2.1 GiB" memory.graph.partial="2.2 GiB"
time=2025-03-04T21:31:41.168+10:00 level=WARN source=server.go:170 msg="flash attention enabled but not supported by gpu"
time=2025-03-04T21:31:41.168+10:00 level=WARN source=server.go:193 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-03-04T21:31:41.168+10:00 level=DEBUG source=server.go:259 msg="compatible gpu libraries" compatible=[]
time=2025-03-04T21:31:41.175+10:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-03-04T21:31:41.175+10:00 level=INFO source=server.go:380 msg="starting llama server" cmd="C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --model C:\\Users\\clint\\.ollama\\models\\blobs\\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 --ctx-size 32768 --batch-size 512 --verbose --threads 6 --no-mmap --parallel 4 --port 54417"
time=2025-03-04T21:31:41.175+10:00 level=DEBUG source=server.go:398 msg=subprocess environment="[CUDA_PATH_V12_8=C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8 CUDA_VISIBLE_DEVICES=-1 HIP_PATH=C:\\Program Files\\AMD\\ROCm\\6.1\\ HIP_PATH_61=C:\\Program Files\\AMD\\ROCm\\6.1\\ PATH=C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp;C:\\Python313\\Scripts\\;C:\\Python313\\;C:\\Python312\\;C:\\WINDOWS\\system32;C:\\WINDOWS;C:\\WINDOWS\\System32\\Wbem;C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\;C:\\ProgramData\\chocolatey\\bin;C:\\Program Files\\Microsoft VS Code\\bin;C:\\Program Files\\Git\\cmd;C:\\Program Files\\CMake\\bin;C:\\msys64\\mingw64\\bin;C:\\msys64\\usr\\bin;C:\\msys64\\ucrt64\\bin;C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts;C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\;C:\\Program Files (x86)\\Inno Setup 6;C:\\Program Files\\Go\\bin;C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64;C:\\Program Files\\nodejs\\;C:\\Program Files\\dotnet\\;C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 
2025.1.0\\;C:\\Users\\clint\\ninja-1.12.1;C:\\Users\\clint\\ccache-4.10.2-windows-x86_64;C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin;C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe;c:\\msys64\\ucrt64\\bin;c:\\msys64\\usr\\bin;C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts;C:\\Users\\clint\\go\\bin;C:\\Users\\clint\\.dotnet\\tools;C:\\Users\\clint\\AppData\\Roaming\\npm;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama]"
time=2025-03-04T21:31:41.178+10:00 level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2025-03-04T21:31:41.178+10:00 level=INFO source=server.go:557 msg="waiting for llama runner to start responding"
time=2025-03-04T21:31:41.178+10:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server error"
time=2025-03-04T21:31:41.201+10:00 level=INFO source=runner.go:931 msg="starting go runner"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Python313\Scripts
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Python313
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Python312
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS\system32
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS\System32\Wbem
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS\System32\WindowsPowerShell\v1.0
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\ProgramData\chocolatey\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\Microsoft VS Code\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\Git\\cmd"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\CMake\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\msys64\mingw64\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\msys64\usr\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\msys64\ucrt64\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Roaming\Python\Python312\Scripts
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files (x86)\\Inno Setup 6"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\Go\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\nodejs"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\dotnet"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\ninja-1.12.1
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\ccache-4.10.2-windows-x86_64
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Microsoft\WindowsApps
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:84 msg="ggml backend load all from path" path=C:\Users\clint\AppData\Local\Programs\Ollama
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Programs\Ollama\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Microsoft\WinGet\Packages\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=c:\msys64\ucrt64\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=c:\msys64\usr\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Roaming\Python\Python312\Scripts
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\go\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\.dotnet\tools
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Roaming\npm
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:84 msg="ggml backend load all from path" path=C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 35
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 16
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-03-04T21:31:41.237+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Programs\Ollama\lib\cuda
time=2025-03-04T21:31:41.237+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Programs\Ollama\lib\rocm
time=2025-03-04T21:31:41.237+10:00 level=INFO source=runner.go:934 msg=system info="CPU : SSE3 = 1 | LLAMAFILE = 1 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | cgo(gcc)" threads=6
time=2025-03-04T21:31:41.238+10:00 level=INFO source=runner.go:992 msg="Server listening on 127.0.0.1:54417"
llama_model_loader: loaded meta data with 25 key-value pairs and 291 tensors from C:\Users\clint\.ollama\models\blobs\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = Mistral-7B-Instruct-v0.3
llama_model_loader: - kv   2:                          llama.block_count u32              = 32
llama_model_loader: - kv   3:                       llama.context_length u32              = 32768
llama_model_loader: - kv   4:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv   7:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv   8:                       llama.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  10:                          general.file_type u32              = 2
llama_model_loader: - kv  11:                           llama.vocab_size u32              = 32768
llama_model_loader: - kv  12:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  13:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  14:                         tokenizer.ggml.pre str              = default
llama_model_loader: - kv  15:                      tokenizer.ggml.tokens arr[str,32768]   = ["<unk>", "<s>", "</s>", "[INST]", "[...
llama_model_loader: - kv  16:                      tokenizer.ggml.scores arr[f32,32768]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,32768]   = [2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
llama_model_loader: - kv  18:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  19:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  20:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  21:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  22:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  23:                    tokenizer.chat_template str              = {{ bos_token }}{% for message in mess...
llama_model_loader: - kv  24:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   65 tensors
llama_model_loader: - type q4_0:  225 tensors
llama_model_loader: - type q6_K:    1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_0
print_info: file size   = 3.83 GiB (4.54 BPW) 
init_tokenizer: initializing tokenizer for type 1
load: control token:    468 '[control_466]' is not marked as EOG
load: control token:    464 '[control_462]' is not marked as EOG
load: control token:    727 '[control_725]' is not marked as EOG
load: control token:    343 '[control_341]' is not marked as EOG
load: control token:    603 '[control_601]' is not marked as EOG
load: control token:    332 '[control_330]' is not marked as EOG
load: control token:     34 '[control_32]' is not marked as EOG
load: control token:    412 '[control_410]' is not marked as EOG
load: control token:    675 '[control_673]' is not marked as EOG
load: control token:    177 '[control_175]' is not marked as EOG
load: control token:    434 '[control_432]' is not marked as EOG
... (the "control token ... is not marked as EOG" lines repeat)
...
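
With `OLLAMA_DEBUG` enabled the server log gets long, so it can help to filter it down to the warning and error lines before reading further. The following is a minimal sketch that assumes the `time=… level=… source=… msg=…` key/value layout seen in the server lines of these logs. The names `parse_line` and `warnings` are hypothetical helpers, not part of Ollama, and escaped characters inside quoted values are left untouched.

```python
import re

# key=value pairs; a value is either a double-quoted string or a run of non-space chars
PAIR = re.compile(r'([\w.]+)=("(?:[^"\\]|\\.)*"|\S*)')

def parse_line(line: str) -> dict[str, str]:
    """Split one server log line into its key/value fields (quotes stripped)."""
    fields = {}
    for key, value in PAIR.findall(line):
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        fields[key] = value
    return fields

def warnings(lines):
    """Yield the parsed fields of every WARN or ERROR line."""
    for line in lines:
        fields = parse_line(line)
        if fields.get("level") in ("WARN", "ERROR"):
            yield fields

sample = ('time=2025-03-04T21:31:41.168+10:00 level=WARN source=server.go:170 '
          'msg="flash attention enabled but not supported by gpu"')
for entry in warnings([sample]):
    print(entry["source"], entry["msg"])
```

Lines without key/value pairs, such as the `[GIN]` request lines or the `llama_model_loader` output, simply produce no `level` field and are skipped.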


GiteaMirror commented

2026-05-04 12:42:57 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

load: control token:    653 '[control_651]' is not marked as EOG
load: control token:    694 '[control_692]' is not marked as EOG
load: control token:    316 '[control_314]' is not marked as EOG
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 771
load: token to piece cache size = 0.1731 MB
print_info: arch             = llama
print_info: vocab_only       = 1
print_info: model type       = ?B
print_info: model params     = 7.25 B
print_info: general.name     = Mistral-7B-Instruct-v0.3
print_info: vocab type       = SPM
print_info: n_vocab          = 32768
print_info: n_merges         = 0
print_info: BOS token        = 1 '<s>'
print_info: EOS token        = 2 '</s>'
print_info: UNK token        = 0 '<unk>'
print_info: LF token         = 781 '<0x0A>'
print_info: EOG token        = 2 '</s>'
print_info: max token length = 48
llama_model_load: vocab only - skipping tensors
time=2025-03-04T21:33:34.465+10:00 level=DEBUG source=routes.go:1501 msg="chat request" images=0 prompt="[INST] hello[/INST]  Hello! How can I help you today? Let me know if you have any questions or need assistance with something. I'm here to help!</s>[INST] hello[/INST] "
time=2025-03-04T21:33:34.466+10:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=0 prompt=44 used=0 remaining=44
[GIN] 2025/03/04 - 21:33:40 | 200 |    8.2533849s |       127.0.0.1 | POST     "/api/chat"
time=2025-03-04T21:33:40.901+10:00 level=DEBUG source=sched.go:467 msg="context for request finished"
time=2025-03-04T21:33:40.901+10:00 level=DEBUG source=sched.go:340 msg="runner with non-zero duration has gone idle, adding timer" modelPath=C:\Users\clint\.ollama\models\blobs\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 duration=5m0s
time=2025-03-04T21:33:40.901+10:00 level=DEBUG source=sched.go:358 msg="after processing request finished event" modelPath=C:\Users\clint\.ollama\models\blobs\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 refCount=0

It keeps on crashing, so I was able to capture a log.
It will keep crashing until there is no more memory.
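
For a rough sense of the memory involved, the KV cache size can be estimated from values reported in the log for this run: `llama.block_count = 32`, `llama.attention.head_count_kv = 8`, the `key_length`/`value_length` default of 128, and `--ctx-size 32768`. The sketch below assumes an unquantized f16 cache (2 bytes per element), which seems consistent with the warning that a quantized KV cache was requested while flash attention was disabled. The function name is hypothetical and this is not Ollama's actual estimator.

```python
def kv_cache_bytes(n_layers: int, n_ctx: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size: one K and one V vector per layer, per KV head, per token."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * n_ctx

# Values taken from the log; bytes_per_elem=2 (f16) is an assumption.
size = kv_cache_bytes(n_layers=32, n_ctx=32768, n_kv_heads=8, head_dim=128)
print(f"{size / 2**30:.1f} GiB")  # prints "4.0 GiB"
```

This matches `memory.required.kv="4.0 GiB"` in the log. The 32768-token context appears to come from `OLLAMA_CONTEXT_LENGTH:8192` multiplied by `--parallel 4`, so under this estimate lowering either one should shrink the cache proportionally.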


GiteaMirror commented

2026-05-04 12:42:58 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

Testing the official build now, lol.


GiteaMirror commented

2026-05-04 12:42:59 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

Official 0.5.13 behaves the same: it works fine at first, then crashes.

2025/03/04 21:47:51 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:47:51.364+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:47:51.368+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:47:51.372+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13)"
time=2025-03-04T21:47:51.372+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:47:51.372+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:47:51.376+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:47:51.376+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:47:51.376+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:47:51.376+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:47:51.376+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:47:51.377+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:47:51.394+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:47:51.394+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:47:51.394+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:47:51.395+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
initializing C:\WINDOWS\system32\nvcuda.dll
dlsym: cuInit - 00007FFEE2BA5F80
dlsym: cuDriverGetVersion - 00007FFEE2BA6020
dlsym: cuDeviceGetCount - 00007FFEE2BA6816
dlsym: cuDeviceGet - 00007FFEE2BA6810
dlsym: cuDeviceGetAttribute - 00007FFEE2BA6170
dlsym: cuDeviceGetUuid - 00007FFEE2BA6822
dlsym: cuDeviceGetName - 00007FFEE2BA681C
dlsym: cuCtxCreate_v3 - 00007FFEE2BA6894
dlsym: cuMemGetInfo_v2 - 00007FFEE2BA6996
dlsym: cuCtxDestroy - 00007FFEE2BA68A6
calling cuInit
cuInit err: 100
time=2025-03-04T21:47:51.406+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:47:51.406+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:47:51.406+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:47:51.414+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v11\\cudart64_110.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.420+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.424+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v11\\cudart64_110.dll: cudart init failure: 100"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.429+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.435+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:47:51.436+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:47:51.436+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
releasing nvml library
time=2025-03-04T21:47:51.437+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="18.4 GiB"

It keeps crashing the same way and leaves orphaned processes.

good luck.
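
When comparing startup logs like the one above across builds, it can help to pull out just a few fields. The following is a minimal sketch, not part of Ollama: the function name `summarize_startup_log` and the choice of fields are assumptions made for illustration, and the regular expressions are written only against the log format shown in this comment.

```python
import re

# Hypothetical helper (not an Ollama API): the patterns below are based
# solely on the log lines quoted in this thread.
def summarize_startup_log(text: str) -> dict:
    """Extract the server version, CUDA_VISIBLE_DEVICES, and compute library."""
    patterns = {
        # e.g. msg="Listening on 127.0.0.1:11434 (version 0.5.13)"
        "version": r"\(version ([^)]+)\)",
        # e.g. env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: ...]"
        "cuda_visible_devices": r"CUDA_VISIBLE_DEVICES:(\S*)",
        # e.g. msg="inference compute" id=0 library=cpu variant=""
        "compute_library": r'msg="inference compute".*?library=(\S+)',
    }
    summary = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        # None means the field was not found in the supplied text.
        summary[name] = match.group(1) if match else None
    return summary
```

Calling it on the text of a saved server log returns a small dictionary, which makes it easier to see at a glance whether two runs used the same version and device settings.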


GiteaMirror commented

2026-05-04 12:43:00 -05:00

@rick-github commented on GitHub (Mar 4, 2025):

calling cuInit
cuInit err: 100

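[cudaErrorNoDevice](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg3f51e3575c2178246db0a94a430e0038e942e4cbbd2bef6e92e293253f055613:~:text=proper%20device%20architecture.-,cudaErrorNoDevice,-%3D%20100) = 100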

This indicates that no CUDA-capable devices were detected by the installed CUDA driver.

This is different to the OP's problem - you have some sort of driver mismatch. Open a new issue.
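
The lookup described above can be sketched as a small log scanner. This is an illustrative sketch only: `explain_cuda_errors` and `CUDA_ERROR_NAMES` are hypothetical names, code 100 is the one quoted in this comment, and code 101 (`cudaErrorInvalidDevice`) is added from the CUDA runtime error enum purely for contrast.

```python
import re

# 100 is the code discussed in this comment; 101 is included from the CUDA
# runtime error enum as a second example. The table is deliberately tiny.
CUDA_ERROR_NAMES = {
    100: "cudaErrorNoDevice",
    101: "cudaErrorInvalidDevice",
}

# Matches debug lines such as "cuInit err: 100" or "cudaSetDevice err: 100".
ERR_LINE = re.compile(r"\b(cuInit|cudaSetDevice) err: (\d+)")

def explain_cuda_errors(log_text: str) -> list[str]:
    """Return one note per distinct (call, code) pair found in the log."""
    notes = []
    for call, code in ERR_LINE.findall(log_text):
        name = CUDA_ERROR_NAMES.get(int(code), "unknown (not in this table)")
        note = f"{call} returned {code}: {name}"
        if note not in notes:  # the same error often repeats per library
            notes.append(note)
    return notes
```

Running it over a debug log that contains the two lines quoted above yields a single note naming `cudaErrorNoDevice` for the `cuInit` call.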


GiteaMirror commented

2026-05-04 12:43:00 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

OK, I will do that.
Should I try testing different drivers, etc.? Cheers.


GiteaMirror commented

2026-05-04 12:43:01 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

I will comment here as well, since this is a very similar issue: it does not affect all models, but it does affect mistral:latest. I also see the "llama runner process no longer running" error.

This one is a dev build with the CMake config (default, and for all), then installed via the install script.

2025/03/04 21:30:05 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:30:05.861+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:30:05.870+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13-yontracks)"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.878+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.887+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:30:05.895+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
time=2025-03-04T21:30:05.905+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:30:05.914+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
time=2025-03-04T21:30:05.920+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.924+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.927+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.930+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:30:05.930+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-04T21:30:05.931+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="9.6 GiB"

The other build is the official release. I see the official installer installs all of the GPU files, and there are small differences in behavior between the two.

GiteaMirror commented

2026-05-04 12:43:02 -05:00

@rick-github commented on GitHub (Mar 4, 2025):

None of the logs you've posted contain "llama runner process no longer running".
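One way to confirm this for a saved log is to search it for the exact message. The following is a minimal sketch, not part of Ollama: `find_runner_error` is a hypothetical helper, and the log file name in the usage note is an assumption.

```sh
# The exact message quoted in the issue title.
ERR='llama runner process no longer running'

# Hypothetical helper: print every matching line, prefixed with its line
# number, from the log file given as $1. Returns non-zero when the message
# does not appear in the file.
find_runner_error() {
  grep -n -F "$ERR" "$1"
}
```

For example, `find_runner_error server.log || echo "message not found"` prints the matching lines, or the fallback text when there are none.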

GiteaMirror commented

2026-05-04 12:43:02 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

The log resets, but when the memory fills up the error changes. I will try to get it; I am writing up the other issue.

GiteaMirror commented

2026-05-04 12:43:03 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

and 0.5.13 vs 0.5.12 vs dev etc.

GiteaMirror commented

2026-05-04 12:43:04 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

lol, this is intermittent, depending on how long you wait between prompts. The logs keep resetting, though, and it is getting slower. The error is coming soon.

GiteaMirror commented

2026-05-04 12:43:06 -05:00

@YonTracks commented on GitHub (Mar 4, 2025):

OK, well, a different error this time lol:

>>> what else
Error: Post "http://127.0.0.1:11434/api/chat": dial tcp 127.0.0.1:11434: connectex: No connection could be made because the target machine actively refused it.

This is on the official 0.5.13.

I will work out how to word this correctly. cheers

GiteaMirror commented

2026-05-04 12:43:08 -05:00

@YonTracks commented on GitHub (Mar 5, 2025):

Expected Behavior:

Ollama should consistently run models on the CPU when the CUDA_VISIBLE_DEVICES="" or CUDA_VISIBLE_DEVICES="-1" environment variable is set, just as the following correctly does:

>>> /set parameter num_gpu 0
Set parameter 'num_gpu' to '0'

cheers

GiteaMirror commented

2026-05-04 12:43:09 -05:00

@YonTracks commented on GitHub (Mar 5, 2025):

Expected Behavior:

Ollama should consistently run models on the CPU when the CUDA_VISIBLE_DEVICES="" or CUDA_VISIBLE_DEVICES="-1" environment variable is set. just as >>> /set parameter num_gpu 0 Set parameter 'num_gpu' to '0' correct does. cheers

Rather, I think only CUDA_VISIBLE_DEVICES="-1" should do this, along with /set parameter num_gpu 0; anything else should behave as the default.

GiteaMirror commented

2026-05-04 12:43:10 -05:00

@rick-github commented on GitHub (Mar 5, 2025):

@bluespork Can you try another experiment? Rather than setting CUDA_VISIBLE_DEVICES="" to keep the model off the GPU, create a copy of the model with num_gpu=0:

$ echo FROM mistral:latest > Modelfile
$ echo PARAMETER num_gpu 0 >> Modelfile
$ ollama create mistral:cpu

And then run the test:

$ ollama run --verbose mistral:cpu list 10 animals who can weigh over 1000 pounds

Based on #9496, it could be that the problem is triggered by the invalid value in CUDA_VISIBLE_DEVICES.
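As a per-request alternative to the Modelfile copy above, the same `num_gpu` setting can be passed in the `options` field of a request to the server's `/api/generate` endpoint. This is a minimal sketch, assuming the server listens on 127.0.0.1:11434 (the address in the error message earlier in this thread) and that `curl` is installed. `build_cpu_request`, `run_on_cpu` and `OLLAMA_URL` are hypothetical names used only for illustration.

```sh
# Hypothetical helper: print a JSON body for /api/generate that asks for
# zero layers on the GPU. Assumes the model name ($1) and prompt ($2)
# contain no characters that need JSON escaping.
build_cpu_request() {
  printf '{"model": "%s", "prompt": "%s", "stream": false, "options": {"num_gpu": 0}}\n' "$1" "$2"
}

# Hypothetical helper: send the request to a local server. The default
# address is an assumption; override it with OLLAMA_URL if yours differs.
run_on_cpu() {
  build_cpu_request "$1" "$2" |
    curl -s "${OLLAMA_URL:-http://127.0.0.1:11434}/api/generate" -d @-
}
```

With a server running, `run_on_cpu mistral:latest "list 10 animals who can weigh over 1000 pounds"` would issue the same test prompt without creating a separate model copy or touching CUDA_VISIBLE_DEVICES.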

GiteaMirror commented

2026-05-04 12:43:11 -05:00

@bluespork commented on GitHub (Mar 10, 2025):

Thank you for your time. The problem was the invalid value in CUDA_VISIBLE_DEVICES.

Reference: github-starred/ollama#68166