[GH-ISSUE #12999] Model fails to run on the GPU #8607

Closed
opened 2026-04-12 21:20:40 -05:00 by GiteaMirror · 4 comments

Originally created by @wll0307 on GitHub (Nov 7, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12999

What is the issue?

After starting ollama, it cannot load the GPU and runs on CPU only. The problem appeared after a failed update, while I was still running the original version.
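
A quick first check in this situation is whether the NVIDIA driver itself still sees the card; if the command below fails, the problem lies in the driver/CUDA stack rather than in Ollama:

```shell
# If the driver is healthy this prints the GPU model, driver version,
# and CUDA version; an error here means Ollama has nothing to discover.
nvidia-smi
```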

Relevant log output

(occ) root@jikimo-virtual-machine:/home/jikimo/code/2Dces# ollama serve
time=2025-11-07T16:48:08.678+08:00 level=INFO source=routes.go:1525 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-11-07T16:48:08.678+08:00 level=INFO source=images.go:522 msg="total blobs: 0"
time=2025-11-07T16:48:08.678+08:00 level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-11-07T16:48:08.678+08:00 level=INFO source=routes.go:1578 msg="Listening on 127.0.0.1:11434 (version 0.12.10)"
time=2025-11-07T16:48:08.678+08:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2025-11-07T16:48:08.679+08:00 level=INFO source=server.go:400 msg="starting runner" cmd="/usr/local/bin/ollama runner --ollama-engine --port 32769"
time=2025-11-07T16:48:08.704+08:00 level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.4 GiB" available="24.7 GiB"
time=2025-11-07T16:48:08.704+08:00 level=INFO source=routes.go:1619 msg="entering low vram mode" "total vram"="0 B" threshold="20.0 GiB"
^C(occ) root@jikimo-virtual-machine:/home/jikimo/code/2Dces# ollama -v
Warning: could not connect to a running Ollama instance
Warning: client version is 0.12.10
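
The log above shows discovery finding only the CPU device (`id=cpu library=cpu`, `"total vram"="0 B"`), i.e. the runner never located a usable CUDA backend. One plausible cause after a failed update is a partially overwritten install; on a default Linux install the bundled GPU libraries live under `/usr/local/lib/ollama` (path assumed from the standard install script), so a sketch of a sanity check might be:

```shell
# Hypothetical sanity check, assuming the default Linux install layout:
# the binary and its bundled GPU runner libraries should both exist
# and date from the same install.
ls -l /usr/local/bin/ollama
ls -l /usr/local/lib/ollama/
```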

OS

Linux

GPU

Nvidia

CPU

No response

Ollama version

0.12.10

GiteaMirror added the bug label 2026-04-12 21:20:40 -05:00

@rick-github commented on GitHub (Nov 7, 2025):

Run the following and post the output:

OLLAMA_DEBUG=2 ollama serve

What GPU is in the machine?
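
When posting the output, it helps to capture it to a file; one way (not specific to Ollama) is:

```shell
# Redirect both stdout and stderr through tee so the debug log is
# visible on screen and also saved for attaching to the issue.
OLLAMA_DEBUG=2 ollama serve 2>&1 | tee ollama-debug.log
```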


@wll0307 commented on GitHub (Nov 10, 2025):

> Run the following and post the output:
>
> OLLAMA_DEBUG=2 ollama serve
>
> What GPU is in the machine?

It's a 3090. There were no problems at all before the update; after the update failed I chose not to retry it, and from then on loading a model would only ever show the CPU.


@rick-github commented on GitHub (Nov 10, 2025):

Run the following and post the output:

OLLAMA_DEBUG=2 ollama serve

@wll0307 commented on GitHub (Nov 10, 2025):

> Run the following and post the output:
>
> OLLAMA_DEBUG=2 ollama serve

Thanks, that solved it. I updated, and the GPU not being detected was probably caused by the failed update.

Image: https://github.com/user-attachments/assets/f665e582-1586-42d9-9f22-66d310499021
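
For anyone hitting the same failed-update state, re-running the official Linux install script (which also serves as the updater) is the usual fix, as it was here:

```shell
# Re-download and re-run the official installer; on an existing
# install this replaces the binary and its bundled GPU libraries.
curl -fsSL https://ollama.com/install.sh | sh
```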