[GH-ISSUE #9357] Intermittent hanging/stalling and "llama runner process no longer running" error on CPU with specific models (Windows 11, Ryzen 5800X, RTX 3060) #31870

Closed
opened 2026-04-22 12:38:20 -05:00 by GiteaMirror · 20 comments
Owner

Originally created by @bluespork on GitHub (Feb 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9357

What is the issue?

I am experiencing an intermittent issue with Ollama on Windows 11 where models running on the CPU either hang/stall mid-response or produce the "llama runner process no longer running" error. This issue does not occur when running on the GPU (Nvidia RTX 3060). I have done extensive troubleshooting (detailed below) and believe this to be a bug within Ollama or llama.cpp.

Expected Behavior:

Ollama should consistently run models on the CPU when the CUDA_VISIBLE_DEVICES="" environment variable is set, without hanging, stalling, or producing the "llama runner process no longer running" error.

Actual Behavior:

  • With Ollama version 0.1.32 (portable, server run separately), the mistral:latest model fails with "llama runner process no longer running" during the loading phase, before even getting to a prompt.
  • With Ollama version 0.5.4 (installed via official installer), the mistral:latest model consistently fails with "llama runner process no longer running" immediately upon starting, when forced to run on CPU.
  • With Ollama version 0.5.12 (installed via official installer), the deepscaler:1.5b-preview-q4_K_M model sometimes runs to completion on CPU. However, it frequently hangs or stalls mid-response. If I type anything into the prompt after it hangs, it will sometimes continue and complete the response. Other times, it produces the "llama runner process no longer running" error.
  • Running on the GPU (without setting CUDA_VISIBLE_DEVICES) works flawlessly with all tested models.

Steps to Reproduce:

  1. System: Windows 11, Ryzen 7 5800X, 32GB RAM, Nvidia RTX 3060.
  2. Ollama Version: 0.1.32 (portable), 0.5.4 (installed), and 0.5.12 (installed).
  3. Environment Variable: Ensure CUDA_VISIBLE_DEVICES="" is set. This has been tested both as a system-wide variable and set per-session in Command Prompt. We have verified using echo %CUDA_VISIBLE_DEVICES% that the variable is correctly unset when it should be, and set to an empty string when we set it. The system-wide environment variable was deleted as well.
  4. Model(s): mistral:latest (consistently fails on versions 0.5.4 and 0.1.32), deepscaler:1.5b-preview-q4_K_M (intermittently hangs/stalls or fails on version 0.5.12).
  5. Command: ollama run <model_name> --verbose "list 10 animals who can weight over 1000 pounds" (or any other prompt).
  6. Observe:
    • With mistral:latest on version 0.5.4, multiple ollama_llama_server.exe processes appear in Task Manager and then disappear, leading to the error.
    • With mistral:latest on version 0.1.32, the "llama runner process no longer running" error appears during model loading.
    • With deepscaler:1.5b-preview-q4_K_M on version 0.5.12, the response may stop mid-sentence. Typing anything in the prompt may cause it to continue, or may result in the "llama runner process no longer running" error.
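
For reference, the steps above condense into a short Command Prompt session (a sketch; it assumes the model is already pulled, and note that cmd treats an empty assignment as clearing the variable for the session):

```shell
:: Force CPU-only by hiding the CUDA device for this session.
set CUDA_VISIBLE_DEVICES=

:: Verify: when the variable is unset/empty, cmd echoes the
:: literal text %CUDA_VISIBLE_DEVICES% back.
echo %CUDA_VISIBLE_DEVICES%

:: Run with the prompt from this report; on the affected system this
:: hangs mid-response or fails with "llama runner process no longer running".
ollama run deepscaler:1.5b-preview-q4_K_M --verbose "list 10 animals who can weight over 1000 pounds"
```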

Troubleshooting Steps Taken:

  • Confirmed issue occurs with multiple models (mistral:latest, deepscaler:1.5b-preview-q4_K_M), but not all models.
  • Confirmed issue is specific to CPU execution; GPU execution works perfectly.
  • Verified CUDA_VISIBLE_DEVICES="" is correctly set and that Ollama is indeed using the CPU (via Task Manager).
  • Tested with a significantly older Ollama version (0.1.32) to rule out recent regressions – the issue also occurs there, but at a different stage (during model loading).
  • Restarted Ollama and rebooted the computer multiple times.
  • Confirmed AVX and AVX2 support on the CPU using CPU-Z.
  • Ruled out common software conflicts (VPN, firewall).
  • Tested with different thread counts using OLLAMA_NUM_THREAD (1, 4, 8) – no effect.

System Information:

  • Ollama Version: 0.5.4, 0.5.12, and 0.1.32
  • Operating System: Windows 11
  • CPU: AMD Ryzen 7 5800X
  • RAM: 32GB
  • GPU: Nvidia RTX 3060 12GB
  • Models Affected: mistral:latest, deepscaler:1.5b-preview-q4_K_M

Conclusion:

This issue appears to be a bug related to how specific models are handled on the CPU by Ollama (or llama.cpp) on this particular system configuration. The intermittent hanging/stalling with deepscaler:1.5b-preview-q4_K_M on version 0.5.12, and the consistent failure with mistral:latest on versions 0.5.4 and 0.1.32, suggest a low-level incompatibility or instability. The different failure modes (hanging vs. crashing) are important clues.

Relevant log output


OS

Windows

GPU

Nvidia

CPU

AMD

Ollama version

0.5.12

GiteaMirror added the bug label 2026-04-22 12:38:20 -05:00
Author
Owner

@rick-github commented on GitHub (Feb 26, 2025):

[Server logs](https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md) may aid in debugging.

OLLAMA_NUM_THREAD is not an ollama configuration variable.
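
As a sketch of the supported alternative: thread count is a per-request model option (`num_thread`) in Ollama's API, not a server environment variable, and `num_gpu: 0` requests zero offloaded layers (CPU-only) without touching `CUDA_VISIBLE_DEVICES`. A hypothetical request body (not sent here; assumes a default local server on 127.0.0.1:11434):

```python
import json

# Hypothetical /api/generate request body. num_thread and num_gpu are
# documented model options in Ollama's API, set per request rather than
# through server environment variables.
payload = {
    "model": "deepscaler:1.5b-preview-q4_K_M",
    "prompt": "list 10 animals who can weight over 1000 pounds",
    "options": {
        "num_thread": 4,  # worker threads for CPU inference
        "num_gpu": 0,     # 0 layers offloaded to GPU -> CPU-only
    },
}

# To actually send it (requires a running server):
#   requests.post("http://127.0.0.1:11434/api/generate", json=payload)
body = json.dumps(payload)
print(body)
```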

Author
Owner

@bluespork commented on GitHub (Feb 27, 2025):

UPDATE:

I have made a significant discovery that isolates the problem. I physically removed the NVIDIA RTX 3060 GPU from my system and replaced it with an older AMD Radeon R5 430 card that is not supported by Ollama for GPU acceleration.

With the AMD card installed, Ollama runs perfectly on the CPU, without any of the hanging or "llama runner process no longer running" errors. This occurs with all models, including mistral:latest. I did not need to set CUDA_VISIBLE_DEVICES="" with the AMD card, as Ollama correctly detected that it could not use the GPU.

This strongly indicates that the issue is specific to the interaction between Ollama/llama.cpp and either the NVIDIA RTX 3060 hardware or, more likely, the NVIDIA drivers on Windows 11. It is not a general CPU incompatibility, nor is it a fundamental problem with Ollama's ability to run on the CPU.

Author
Owner

@YonTracks commented on GitHub (Mar 4, 2025):

Yes, I have an RTX 3060 in a very similar system and see the same issue with CUDA_VISIBLE_DEVICES=-1, which is how I force CPU-only.
It seems this crashes Ollama (though you would not know, as the model continues the response fine and quickly), and it also resets the logs and produces orphaned processes.
Seen on 0.5.12 and 0.5.13.

Author
Owner

@YonTracks commented on GitHub (Mar 4, 2025):

2025/03/04 21:30:05 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:30:05.861+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:30:05.870+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13-yontracks)"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.878+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.887+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:30:05.895+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
time=2025-03-04T21:30:05.905+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:30:05.914+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
time=2025-03-04T21:30:05.920+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.924+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.927+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.930+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:30:05.930+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-04T21:30:05.931+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="9.6 GiB"

@YonTracks commented on GitHub (Mar 4, 2025):

2025/03/04 21:31:37 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:31:37.689+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:31:37.696+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:31:37.701+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13-yontracks)"
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:31:37.701+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:31:37.701+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:31:37.701+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:31:37.701+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:31:37.703+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:31:37.715+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:31:37.718+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:31:37.718+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:31:37.719+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
time=2025-03-04T21:31:37.729+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:31:37.729+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:31:37.729+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:31:37.738+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
time=2025-03-04T21:31:37.742+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:31:37.747+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:31:37.753+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:31:37.754+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:31:37.754+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-04T21:31:37.754+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="26.5 GiB"
[GIN] 2025/03/04 - 21:31:41 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2025/03/04 - 21:31:41 | 200 |     12.8616ms |       127.0.0.1 | POST     "/api/show"
time=2025-03-04T21:31:41.159+10:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="31.9 GiB" before.free="26.5 GiB" before.free_swap="26.8 GiB" now.total="31.9 GiB" now.free="26.4 GiB" now.free_swap="26.8 GiB"
time=2025-03-04T21:31:41.159+10:00 level=DEBUG source=sched.go:182 msg="updating default concurrency" OLLAMA_MAX_LOADED_MODELS=3 gpu_count=1
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=sched.go:212 msg="cpu mode with first model, loading"
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="31.9 GiB" before.free="26.4 GiB" before.free_swap="26.8 GiB" now.total="31.9 GiB" now.free="26.4 GiB" now.free_swap="26.8 GiB"
time=2025-03-04T21:31:41.167+10:00 level=INFO source=server.go:97 msg="system memory" total="31.9 GiB" free="26.4 GiB" free_swap="26.8 GiB"
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=memory.go:108 msg=evaluating library=cpu gpu_count=1 available="[26.4 GiB]"
time=2025-03-04T21:31:41.167+10:00 level=DEBUG source=gpu.go:391 msg="updating system memory data" before.total="31.9 GiB" before.free="26.4 GiB" before.free_swap="26.8 GiB" now.total="31.9 GiB" now.free="26.4 GiB" now.free_swap="26.8 GiB"
time=2025-03-04T21:31:41.167+10:00 level=WARN source=ggml.go:136 msg="key not found" key=llama.attention.key_length default=128
time=2025-03-04T21:31:41.167+10:00 level=WARN source=ggml.go:136 msg="key not found" key=llama.attention.value_length default=128
time=2025-03-04T21:31:41.167+10:00 level=INFO source=server.go:130 msg=offload library=cpu layers.requested=-1 layers.model=33 layers.offload=0 layers.split="" memory.available="[26.4 GiB]" memory.gpu_overhead="0 B" memory.required.full="10.1 GiB" memory.required.partial="0 B" memory.required.kv="4.0 GiB" memory.required.allocations="[10.1 GiB]" memory.weights.total="7.7 GiB" memory.weights.repeating="7.6 GiB" memory.weights.nonrepeating="105.0 MiB" memory.graph.full="2.1 GiB" memory.graph.partial="2.2 GiB"
time=2025-03-04T21:31:41.168+10:00 level=WARN source=server.go:170 msg="flash attention enabled but not supported by gpu"
time=2025-03-04T21:31:41.168+10:00 level=WARN source=server.go:193 msg="quantized kv cache requested but flash attention disabled" type=q8_0
time=2025-03-04T21:31:41.168+10:00 level=DEBUG source=server.go:259 msg="compatible gpu libraries" compatible=[]
time=2025-03-04T21:31:41.175+10:00 level=DEBUG source=gpu.go:695 msg="no filter required for library cpu"
time=2025-03-04T21:31:41.175+10:00 level=INFO source=server.go:380 msg="starting llama server" cmd="C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --model C:\\Users\\clint\\.ollama\\models\\blobs\\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 --ctx-size 32768 --batch-size 512 --verbose --threads 6 --no-mmap --parallel 4 --port 54417"
time=2025-03-04T21:31:41.175+10:00 level=DEBUG source=server.go:398 msg=subprocess environment="[CUDA_PATH_V12_8=C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8 CUDA_VISIBLE_DEVICES=-1 HIP_PATH=C:\\Program Files\\AMD\\ROCm\\6.1\\ HIP_PATH_61=C:\\Program Files\\AMD\\ROCm\\6.1\\ PATH=C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp;C:\\Python313\\Scripts\\;C:\\Python313\\;C:\\Python312\\;C:\\WINDOWS\\system32;C:\\WINDOWS;C:\\WINDOWS\\System32\\Wbem;C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\;C:\\ProgramData\\chocolatey\\bin;C:\\Program Files\\Microsoft VS Code\\bin;C:\\Program Files\\Git\\cmd;C:\\Program Files\\CMake\\bin;C:\\msys64\\mingw64\\bin;C:\\msys64\\usr\\bin;C:\\msys64\\ucrt64\\bin;C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts;C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\;C:\\Program Files (x86)\\Inno Setup 6;C:\\Program Files\\Go\\bin;C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64;C:\\Program Files\\nodejs\\;C:\\Program Files\\dotnet\\;C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 
2025.1.0\\;C:\\Users\\clint\\ninja-1.12.1;C:\\Users\\clint\\ccache-4.10.2-windows-x86_64;C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin;C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe;c:\\msys64\\ucrt64\\bin;c:\\msys64\\usr\\bin;C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts;C:\\Users\\clint\\go\\bin;C:\\Users\\clint\\.dotnet\\tools;C:\\Users\\clint\\AppData\\Roaming\\npm;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm;C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama]"
time=2025-03-04T21:31:41.178+10:00 level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2025-03-04T21:31:41.178+10:00 level=INFO source=server.go:557 msg="waiting for llama runner to start responding"
time=2025-03-04T21:31:41.178+10:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server error"
time=2025-03-04T21:31:41.201+10:00 level=INFO source=runner.go:931 msg="starting go runner"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Python313\Scripts
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Python313
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Python312
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS\system32
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS\System32\Wbem
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\WINDOWS\System32\WindowsPowerShell\v1.0
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\ProgramData\chocolatey\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\Microsoft VS Code\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\Git\\cmd"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\CMake\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\msys64\mingw64\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\msys64\usr\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\msys64\ucrt64\bin
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Roaming\Python\Python312\Scripts
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files (x86)\\Inno Setup 6"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\Go\\bin"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\nodejs"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\dotnet"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path="C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0"
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\ninja-1.12.1
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\ccache-4.10.2-windows-x86_64
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Microsoft\WindowsApps
time=2025-03-04T21:31:41.202+10:00 level=DEBUG source=ggml.go:84 msg="ggml backend load all from path" path=C:\Users\clint\AppData\Local\Programs\Ollama
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Programs\Ollama\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Microsoft\WinGet\Packages\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=c:\msys64\ucrt64\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=c:\msys64\usr\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Roaming\Python\Python312\Scripts
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\go\bin
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\.dotnet\tools
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Roaming\npm
time=2025-03-04T21:31:41.207+10:00 level=DEBUG source=ggml.go:84 msg="ggml backend load all from path" path=C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-alderlake.dll score: 0
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll score: 35
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll score: 0
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-sandybridge.dll score: 16
ggml_backend_load_best: C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-skylakex.dll score: 0
load_backend: loaded CPU backend from C:\Users\clint\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-haswell.dll
time=2025-03-04T21:31:41.237+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Programs\Ollama\lib\cuda
time=2025-03-04T21:31:41.237+10:00 level=DEBUG source=ggml.go:78 msg="skipping path which is not part of ollama" path=C:\Users\clint\AppData\Local\Programs\Ollama\lib\rocm
time=2025-03-04T21:31:41.237+10:00 level=INFO source=runner.go:934 msg=system info="CPU : SSE3 = 1 | LLAMAFILE = 1 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | cgo(gcc)" threads=6
time=2025-03-04T21:31:41.238+10:00 level=INFO source=runner.go:992 msg="Server listening on 127.0.0.1:54417"
llama_model_loader: loaded meta data with 25 key-value pairs and 291 tensors from C:\Users\clint\.ollama\models\blobs\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = Mistral-7B-Instruct-v0.3
llama_model_loader: - kv   2:                          llama.block_count u32              = 32
llama_model_loader: - kv   3:                       llama.context_length u32              = 32768
llama_model_loader: - kv   4:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv   6:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv   7:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv   8:                       llama.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  10:                          general.file_type u32              = 2
llama_model_loader: - kv  11:                           llama.vocab_size u32              = 32768
llama_model_loader: - kv  12:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  13:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  14:                         tokenizer.ggml.pre str              = default
llama_model_loader: - kv  15:                      tokenizer.ggml.tokens arr[str,32768]   = ["<unk>", "<s>", "</s>", "[INST]", "[...
llama_model_loader: - kv  16:                      tokenizer.ggml.scores arr[f32,32768]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,32768]   = [2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
llama_model_loader: - kv  18:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  19:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  20:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  21:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  22:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  23:                    tokenizer.chat_template str              = {{ bos_token }}{% for message in mess...
llama_model_loader: - kv  24:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   65 tensors
llama_model_loader: - type q4_0:  225 tensors
llama_model_loader: - type q6_K:    1 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_0
print_info: file size   = 3.83 GiB (4.54 BPW) 
init_tokenizer: initializing tokenizer for type 1
load: control token:    468 '[control_466]' is not marked as EOG
load: control token:    464 '[control_462]' is not marked as EOG
load: control token:    727 '[control_725]' is not marked as EOG
load: control token:    343 '[control_341]' is not marked as EOG
load: control token:    603 '[control_601]' is not marked as EOG
load: control token:    332 '[control_330]' is not marked as EOG
load: control token:     34 '[control_32]' is not marked as EOG
load: control token:    412 '[control_410]' is not marked as EOG
load: control token:    675 '[control_673]' is not marked as EOG
load: control token:    177 '[control_175]' is not marked as EOG
load: control token:    434 '[control_432]' is not marked as EOG
...  repeats
...

@YonTracks commented on GitHub (Mar 4, 2025):

load: control token:    653 '[control_651]' is not marked as EOG
load: control token:    694 '[control_692]' is not marked as EOG
load: control token:    316 '[control_314]' is not marked as EOG
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 771
load: token to piece cache size = 0.1731 MB
print_info: arch             = llama
print_info: vocab_only       = 1
print_info: model type       = ?B
print_info: model params     = 7.25 B
print_info: general.name     = Mistral-7B-Instruct-v0.3
print_info: vocab type       = SPM
print_info: n_vocab          = 32768
print_info: n_merges         = 0
print_info: BOS token        = 1 '<s>'
print_info: EOS token        = 2 '</s>'
print_info: UNK token        = 0 '<unk>'
print_info: LF token         = 781 '<0x0A>'
print_info: EOG token        = 2 '</s>'
print_info: max token length = 48
llama_model_load: vocab only - skipping tensors
time=2025-03-04T21:33:34.465+10:00 level=DEBUG source=routes.go:1501 msg="chat request" images=0 prompt="[INST] hello[/INST]  Hello! How can I help you today? Let me know if you have any questions or need assistance with something. I'm here to help!</s>[INST] hello[/INST] "
time=2025-03-04T21:33:34.466+10:00 level=DEBUG source=cache.go:104 msg="loading cache slot" id=0 cache=0 prompt=44 used=0 remaining=44
[GIN] 2025/03/04 - 21:33:40 | 200 |    8.2533849s |       127.0.0.1 | POST     "/api/chat"
time=2025-03-04T21:33:40.901+10:00 level=DEBUG source=sched.go:467 msg="context for request finished"
time=2025-03-04T21:33:40.901+10:00 level=DEBUG source=sched.go:340 msg="runner with non-zero duration has gone idle, adding timer" modelPath=C:\Users\clint\.ollama\models\blobs\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 duration=5m0s
time=2025-03-04T21:33:40.901+10:00 level=DEBUG source=sched.go:358 msg="after processing request finished event" modelPath=C:\Users\clint\.ollama\models\blobs\sha256-ff82381e2bea77d91c1b824c7afb83f6fb73e9f7de9dda631bcdbca564aa5435 refCount=0

It keeps on crashing, so I was able to capture a log.
It will keep crashing until there is no more memory.


@YonTracks commented on GitHub (Mar 4, 2025):

testing official now lol.


@YonTracks commented on GitHub (Mar 4, 2025):

Official 0.5.13 is the same: it works fine at first, then crashes afterwards.

2025/03/04 21:47:51 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:47:51.364+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:47:51.368+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:47:51.372+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13)"
time=2025-03-04T21:47:51.372+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:47:51.372+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:47:51.376+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:47:51.376+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:47:51.376+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:47:51.376+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:47:51.376+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:47:51.377+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:47:51.394+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:47:51.394+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:47:51.394+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:47:51.395+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
initializing C:\WINDOWS\system32\nvcuda.dll
dlsym: cuInit - 00007FFEE2BA5F80
dlsym: cuDriverGetVersion - 00007FFEE2BA6020
dlsym: cuDeviceGetCount - 00007FFEE2BA6816
dlsym: cuDeviceGet - 00007FFEE2BA6810
dlsym: cuDeviceGetAttribute - 00007FFEE2BA6170
dlsym: cuDeviceGetUuid - 00007FFEE2BA6822
dlsym: cuDeviceGetName - 00007FFEE2BA681C
dlsym: cuCtxCreate_v3 - 00007FFEE2BA6894
dlsym: cuMemGetInfo_v2 - 00007FFEE2BA6996
dlsym: cuCtxDestroy - 00007FFEE2BA68A6
calling cuInit
cuInit err: 100
time=2025-03-04T21:47:51.406+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:47:51.406+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:47:51.406+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:47:51.414+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v11\\cudart64_110.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.420+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.424+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v11\\cudart64_110.dll: cudart init failure: 100"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.429+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
cudaSetDevice err: 100
time=2025-03-04T21:47:51.435+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:47:51.436+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:47:51.436+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
releasing nvml library
time=2025-03-04T21:47:51.437+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="18.4 GiB"

It keeps on crashing the same way, leaving orphaned processes.

Good luck.
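The GPU-discovery failures in the log above follow a fixed set of signatures before the server falls back to CPU. As a minimal, hedged sketch (the patterns below are simply strings visible in this thread's logs, not an official Ollama log format), a few lines of Python can pre-scan a server log for them:

```python
# Scan an Ollama server log for the GPU-discovery failure signatures that
# appear in this thread. Patterns are taken verbatim from the log lines above.
SIGNATURES = [
    "cuInit err: 100",                      # driver reports no CUDA device
    "cudart init failure: 100",             # cudart could not initialize
    "no nvidia devices detected",           # nvcuda probe found nothing
    "no compatible GPUs were discovered",   # final fallback to CPU
]

def scan_log(text: str) -> list[str]:
    """Return the signatures found in the log text, in SIGNATURES order."""
    return [sig for sig in SIGNATURES if sig in text]

sample = (
    'cuInit err: 100\n'
    'msg="Unable to load cudart library ...: cudart init failure: 100"\n'
    'msg="no compatible GPUs were discovered"\n'
)
print(scan_log(sample))
# → ['cuInit err: 100', 'cudart init failure: 100', 'no compatible GPUs were discovered']
```

If all four signatures appear, the server never saw the GPU at all, which matches the `library=cpu` line at the end of the log above.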


@rick-github commented on GitHub (Mar 4, 2025):

calling cuInit
cuInit err: 100

[cudaErrorNoDevice](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1gg3f51e3575c2178246db0a94a430e0038e942e4cbbd2bef6e92e293253f055613:~:text=proper%20device%20architecture.-,cudaErrorNoDevice,-%3D%20100) = 100

  • This indicates that no CUDA-capable devices were detected by the installed CUDA driver.

This is different from the OP's problem - you have some sort of driver mismatch. Open a new issue.
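The error-code lookup above can be captured in a tiny triage helper. This is a minimal sketch: the table only contains the codes that are certain from this thread and the CUDA runtime docs (0 = cudaSuccess, 100 = cudaErrorNoDevice); anything else is reported as unknown:

```python
# Map the CUDA error codes seen in this thread to names, so a log line like
# "cuInit err: 100" can be triaged without opening the CUDA headers.
CUDA_ERRORS = {
    0: ("cudaSuccess", "no error"),
    100: ("cudaErrorNoDevice",
          "no CUDA-capable devices were detected by the installed CUDA driver"),
}

def explain_cuda_error(code: int) -> str:
    name, desc = CUDA_ERRORS.get(code, ("unknown", "code not in this table"))
    return f"{code} = {name}: {desc}"

print(explain_cuda_error(100))
# → 100 = cudaErrorNoDevice: no CUDA-capable devices were detected by the installed CUDA driver
```

With this, `cudaSetDevice err: 100` and `cuInit err: 100` in the logs above both resolve to the same root cause: the driver sees no usable device.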


@YonTracks commented on GitHub (Mar 4, 2025):

OK, I will do that.
Should I try testing different drivers, etc.? Cheers.


@YonTracks commented on GitHub (Mar 4, 2025):

I will comment here also, since it is a very similar issue: it affects some models (including mistral:latest) but not all, and I also see the "llama runner process no longer running" error.

This one is a dev build with a default CMake config, then installed via the install script.

2025/03/04 21:30:05 routes.go:1215: INFO server config env="map[CUDA_VISIBLE_DEVICES:-1 GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:8192 OLLAMA_DEBUG:true OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE:q8_0 OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\clint\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:true ROCR_VISIBLE_DEVICES:]"
time=2025-03-04T21:30:05.861+10:00 level=INFO source=images.go:432 msg="total blobs: 133"
time=2025-03-04T21:30:05.870+10:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=routes.go:1277 msg="Listening on 127.0.0.1:11434 (version 0.5.13-yontracks)"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=sched.go:106 msg="starting llm scheduler"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-03-04T21:30:05.877+10:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=6 efficiency=0 threads=12
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:98 msg="searching for GPU discovery libraries for NVIDIA"
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvml.dll
time=2025-03-04T21:30:05.877+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvml.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvml.dll C:\\Python313\\Scripts\\nvml.dll C:\\Python313\\nvml.dll C:\\Python312\\nvml.dll C:\\WINDOWS\\system32\\nvml.dll C:\\WINDOWS\\nvml.dll C:\\WINDOWS\\System32\\Wbem\\nvml.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvml.dll C:\\ProgramData\\chocolatey\\bin\\nvml.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvml.dll C:\\Program Files\\Git\\cmd\\nvml.dll C:\\Program Files\\CMake\\bin\\nvml.dll C:\\msys64\\mingw64\\bin\\nvml.dll C:\\msys64\\usr\\bin\\nvml.dll C:\\msys64\\ucrt64\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvml.dll C:\\Program Files (x86)\\Inno Setup 6\\nvml.dll C:\\Program Files\\Go\\bin\\nvml.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvml.dll C:\\Program Files\\nodejs\\nvml.dll C:\\Program Files\\dotnet\\nvml.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvml.dll C:\\Users\\clint\\ninja-1.12.1\\nvml.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvml.dll c:\\msys64\\ucrt64\\bin\\nvml.dll c:\\msys64\\usr\\bin\\nvml.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvml.dll C:\\Users\\clint\\go\\bin\\nvml.dll C:\\Users\\clint\\.dotnet\\tools\\nvml.dll 
C:\\Users\\clint\\AppData\\Roaming\\npm\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvml.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.878+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\WINDOWS\\system32\\nvml.dll c:\\Windows\\System32\\nvml.dll]"
time=2025-03-04T21:30:05.887+10:00 level=DEBUG source=gpu.go:111 msg="nvidia-ml loaded" library=C:\WINDOWS\system32\nvml.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=nvcuda.dll
time=2025-03-04T21:30:05.893+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\nvcuda.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\nvcuda.dll C:\\Python313\\Scripts\\nvcuda.dll C:\\Python313\\nvcuda.dll C:\\Python312\\nvcuda.dll C:\\WINDOWS\\system32\\nvcuda.dll C:\\WINDOWS\\nvcuda.dll C:\\WINDOWS\\System32\\Wbem\\nvcuda.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\nvcuda.dll C:\\ProgramData\\chocolatey\\bin\\nvcuda.dll C:\\Program Files\\Microsoft VS Code\\bin\\nvcuda.dll C:\\Program Files\\Git\\cmd\\nvcuda.dll C:\\Program Files\\CMake\\bin\\nvcuda.dll C:\\msys64\\mingw64\\bin\\nvcuda.dll C:\\msys64\\usr\\bin\\nvcuda.dll C:\\msys64\\ucrt64\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\nvcuda.dll C:\\Program Files (x86)\\Inno Setup 6\\nvcuda.dll C:\\Program Files\\Go\\bin\\nvcuda.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\nvcuda.dll C:\\Program Files\\nodejs\\nvcuda.dll C:\\Program Files\\dotnet\\nvcuda.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\nvcuda.dll C:\\Users\\clint\\ninja-1.12.1\\nvcuda.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\nvcuda.dll c:\\msys64\\ucrt64\\bin\\nvcuda.dll c:\\msys64\\usr\\bin\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\nvcuda.dll 
C:\\Users\\clint\\go\\bin\\nvcuda.dll C:\\Users\\clint\\.dotnet\\tools\\nvcuda.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\nvcuda.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\nvcuda.dll c:\\windows\\system*\\nvcuda.dll]"
time=2025-03-04T21:30:05.895+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths=[C:\WINDOWS\system32\nvcuda.dll]
time=2025-03-04T21:30:05.905+10:00 level=INFO source=gpu.go:602 msg="no nvidia devices detected by library C:\\WINDOWS\\system32\\nvcuda.dll"
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:501 msg="Searching for GPU library" name=cudart64_*.dll
time=2025-03-04T21:30:05.905+10:00 level=DEBUG source=gpu.go:525 msg="gpu library search" globs="[C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_*.dll C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\libnvvp\\cudart64_*.dll C:\\Python313\\Scripts\\cudart64_*.dll C:\\Python313\\cudart64_*.dll C:\\Python312\\cudart64_*.dll C:\\WINDOWS\\system32\\cudart64_*.dll C:\\WINDOWS\\cudart64_*.dll C:\\WINDOWS\\System32\\Wbem\\cudart64_*.dll C:\\WINDOWS\\System32\\WindowsPowerShell\\v1.0\\cudart64_*.dll C:\\ProgramData\\chocolatey\\bin\\cudart64_*.dll C:\\Program Files\\Microsoft VS Code\\bin\\cudart64_*.dll C:\\Program Files\\Git\\cmd\\cudart64_*.dll C:\\Program Files\\CMake\\bin\\cudart64_*.dll C:\\msys64\\mingw64\\bin\\cudart64_*.dll C:\\msys64\\usr\\bin\\cudart64_*.dll C:\\msys64\\ucrt64\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Program Files (x86)\\Windows Kits\\10\\Windows Performance Toolkit\\cudart64_*.dll C:\\Program Files (x86)\\Inno Setup 6\\cudart64_*.dll C:\\Program Files\\Go\\bin\\cudart64_*.dll C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.42.34433\\bin\\Hostx64\\x64\\cudart64_*.dll C:\\Program Files\\nodejs\\cudart64_*.dll C:\\Program Files\\dotnet\\cudart64_*.dll C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2025.1.0\\cudart64_*.dll C:\\Users\\clint\\ninja-1.12.1\\cudart64_*.dll C:\\Users\\clint\\ccache-4.10.2-windows-x86_64\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WindowsApps\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Microsoft\\WinGet\\Packages\\gptscript-ai.gptscript_Microsoft.Winget.Source_8wekyb3d8bbwe\\cudart64_*.dll c:\\msys64\\ucrt64\\bin\\cudart64_*.dll 
c:\\msys64\\usr\\bin\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\Python\\Python312\\Scripts\\cudart64_*.dll C:\\Users\\clint\\go\\bin\\cudart64_*.dll C:\\Users\\clint\\.dotnet\\tools\\cudart64_*.dll C:\\Users\\clint\\AppData\\Roaming\\npm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\cuda\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\rocm\\cudart64_*.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v*\\cudart64_*.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v*\\bin\\cudart64_*.dll]"
time=2025-03-04T21:30:05.914+10:00 level=DEBUG source=gpu.go:558 msg="discovered GPU libraries" paths="[C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll]"
time=2025-03-04T21:30:05.920+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.924+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library C:\\Users\\clint\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v12\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.927+10:00 level=DEBUG source=gpu.go:574 msg="Unable to load cudart library c:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.8\\bin\\cudart64_12.dll: cudart init failure: 100"
time=2025-03-04T21:30:05.930+10:00 level=DEBUG source=amd_windows.go:34 msg="unable to load amdhip64_6.dll, please make sure to upgrade to the latest amd driver: The specified module could not be found."
time=2025-03-04T21:30:05.930+10:00 level=INFO source=gpu.go:377 msg="no compatible GPUs were discovered"
time=2025-03-04T21:30:05.931+10:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="31.9 GiB" available="9.6 GiB"

and the other is the official build. I see the official installer ships all the GPU files, and there are small differences in behavior between the two.


@rick-github commented on GitHub (Mar 4, 2025):

None of the logs you've posted contain "llama runner process no longer running".


@YonTracks commented on GitHub (Mar 4, 2025):

The log resets, but when memory fills up, the error changes. I will try to capture it while writing up the other issue.


@YonTracks commented on GitHub (Mar 4, 2025):

and the behavior differs between 0.5.13, 0.5.12, dev builds, etc.


@YonTracks commented on GitHub (Mar 4, 2025):

lol, this is intermittent, depending on how long you wait before prompting. The logs keep resetting, and it keeps getting slower; the error is coming soon.


@YonTracks commented on GitHub (Mar 4, 2025):

ok well a different error this time, lol:

```
>>> what else
Error: Post "http://127.0.0.1:11434/api/chat": dial tcp 127.0.0.1:11434: connectex: No connection could be made because the target machine actively refused it.
```

This was on official 0.5.13.

I will work out how to word this correct. cheers


@YonTracks commented on GitHub (Mar 5, 2025):

Expected Behavior:

Ollama should consistently run models on the CPU when the `CUDA_VISIBLE_DEVICES=""` or `CUDA_VISIBLE_DEVICES="-1"` environment variable is set, just as `/set parameter num_gpu 0` (which responds `Set parameter 'num_gpu' to '0'`) correctly does.
cheers


@YonTracks commented on GitHub (Mar 5, 2025):

> Expected Behavior:
>
> Ollama should consistently run models on the CPU when the `CUDA_VISIBLE_DEVICES=""` or `CUDA_VISIBLE_DEVICES="-1"` environment variable is set, just as `/set parameter num_gpu 0` correctly does. cheers

Rather, I think `CUDA_VISIBLE_DEVICES="-1"` only, together with the `/set parameter num_gpu 0` option; everything else should be the default.


@rick-github commented on GitHub (Mar 5, 2025):

@bluespork Can you try another experiment? Rather than setting `CUDA_VISIBLE_DEVICES=""` to keep the model off the GPU, create a copy of the model with `num_gpu=0`:

```sh
$ echo FROM mistral:latest > Modelfile
$ echo PARAMETER num_gpu 0 >> Modelfile
$ ollama create mistral:cpu
```

And then run the test:

```sh
$ ollama run --verbose mistral:cpu list 10 animals who can weigh over 1000 pounds
```

Based on #9496, it could be that the problem is triggered by the invalid value in `CUDA_VISIBLE_DEVICES`.
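The same `num_gpu 0` override can also be passed per request through the REST API's `options` field on `/api/generate`, which avoids creating a model copy. A minimal sketch, assuming a default local server on `127.0.0.1:11434` (the helper name is made up for illustration):

```python
import json

# Hypothetical helper: build an /api/generate request body that forces
# CPU-only inference by requesting zero layers offloaded to the GPU.
def cpu_only_payload(model: str, prompt: str) -> str:
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": 0},  # 0 = no GPU offload for this request
    })

payload = cpu_only_payload("mistral:latest",
                           "list 10 animals who can weigh over 1000 pounds")
print(payload)
# POST this body to http://127.0.0.1:11434/api/generate
```

This keeps the env-var-free baseline intact, so any remaining hangs would point at the runner itself rather than at `CUDA_VISIBLE_DEVICES` handling.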


@bluespork commented on GitHub (Mar 10, 2025):

Thank you for your time. The problem was the invalid value in `CUDA_VISIBLE_DEVICES`.
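For anyone landing here with the same symptom, a sketch of the two workarounds discussed in this thread (PowerShell syntax assumed; per CUDA's visible-devices semantics `-1` is an invalid ordinal that hides all devices, while the empty string is the value that triggered the failures above):

```shell
# Force CPU by hiding all GPUs from the CUDA runtime:
$env:CUDA_VISIBLE_DEVICES = "-1"

# Or leave the variable unset and force CPU per model instead, via
#   >>> /set parameter num_gpu 0
# in the interactive `ollama run` session, or `PARAMETER num_gpu 0`
# in a Modelfile.
```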

Reference: github-starred/ollama#31870