[GH-ISSUE #15321] Adding cap_perfmon to ollama breaks GPU discovery #9800

Open
opened 2026-04-12 22:40:39 -05:00 by GiteaMirror · 2 comments

Originally created by @relzp on GitHub (Apr 4, 2026).
Original GitHub issue: https://github.com/ollama/ollama/issues/15321

What is the issue?

Following the docs at https://docs.ollama.com/gpu#vulkan-gpu-support, I added the perfmon capability to the ollama executable using the command `sudo setcap cap_perfmon+ep /usr/bin/ollama` to give ollama access to the available VRAM data.

However, after restarting the ollama service, I noticed that neither of my GPUs was detected. I usually use Vulkan, but I also tested with ROCm and saw the same behaviour (GPU detected without cap_perfmon, not detected with it).
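
For anyone reproducing this, the full sequence is roughly the following; a minimal sketch assuming the default install path and the packaged systemd unit (removing the capability with `setcap -r` gets me back to the working state):

```
# Grant the capability per the docs, restart the service, and verify it stuck:
sudo setcap cap_perfmon+ep /usr/bin/ollama
sudo systemctl restart ollama
getcap /usr/bin/ollama   # prints: /usr/bin/ollama cap_perfmon=ep

# Removing the capability again restores GPU detection for me:
sudo setcap -r /usr/bin/ollama
sudo systemctl restart ollama
```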

Relevant log output

without cap_perfmon:
Started Ollama Service.
time=2026-04-04T08:08:52.131Z level=INFO source=routes.go:1744 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:INFO OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/lib/ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-04-04T08:08:52.131Z level=INFO source=routes.go:1746 msg="Ollama cloud disabled: false"
time=2026-04-04T08:08:52.132Z level=INFO source=images.go:499 msg="total blobs: 29"
time=2026-04-04T08:08:52.132Z level=INFO source=images.go:506 msg="total unused blobs removed: 0"
time=2026-04-04T08:08:52.133Z level=INFO source=routes.go:1802 msg="Listening on 127.0.0.1:11434 (version 0.20.0)"
time=2026-04-04T08:08:52.133Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-04-04T08:08:52.133Z level=INFO source=server.go:432 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 36717"
time=2026-04-04T08:08:52.268Z level=INFO source=types.go:42 msg="inference compute" id=00000000-0c00-0000-0000-000000000000 filter_id="" library=Vulkan compute=0.0 name=Vulkan0 description="AMD Radeon RX 7700 XT (RADV NAVI32)" libdirs=ollama driver=0.0 pci_id=0000:0c:00.0 type=discrete total="12.0 GiB" available="9.1 GiB"
time=2026-04-04T08:08:52.268Z level=INFO source=types.go:42 msg="inference compute" id=1736597e-c3c7-5b7d-9882-ade1b4a6ea1b filter_id="" library=Vulkan compute=0.0 name=Vulkan1 description="NVIDIA GeForce GTX 1060 6GB" libdirs=ollama driver=0.0 pci_id=0000:0d:00.0 type=discrete total="6.0 GiB" available="5.5 GiB"
time=2026-04-04T08:08:52.268Z level=INFO source=routes.go:1852 msg="vram-based default context" total_vram="18.0 GiB" default_num_ctx=4096


with cap_perfmon:
Started Ollama Service.
time=2026-04-04T08:09:23.746Z level=INFO source=routes.go:1744 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:INFO OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/lib/ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-04-04T08:09:23.746Z level=INFO source=routes.go:1746 msg="Ollama cloud disabled: false"
time=2026-04-04T08:09:23.747Z level=INFO source=images.go:499 msg="total blobs: 29"
time=2026-04-04T08:09:23.747Z level=INFO source=images.go:506 msg="total unused blobs removed: 0"
time=2026-04-04T08:09:23.748Z level=INFO source=routes.go:1802 msg="Listening on 127.0.0.1:11434 (version 0.20.0)"
time=2026-04-04T08:09:23.748Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-04-04T08:09:23.748Z level=INFO source=server.go:432 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 44129"
time=2026-04-04T08:09:23.772Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.3 GiB" available="22.1 GiB"
time=2026-04-04T08:09:23.772Z level=INFO source=routes.go:1852 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096

OS

Linux

GPU

AMD, Nvidia

CPU

AMD

Ollama version

0.20.0

GiteaMirror added the bug label 2026-04-12 22:40:39 -05:00

@be-eitel commented on GitHub (Apr 6, 2026):

`OLLAMA_VULKAN:false` is being logged at `msg=server config`, suggesting that the corresponding environment variable is not set.

Have you tried setting the OLLAMA_VULKAN environment variable to 1, as noted in the Vulkan GPU support docs (https://docs.ollama.com/gpu#vulkan-gpu-support)?
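
If your service runs under systemd (the `Started Ollama Service.` line suggests it does), one common way to set it, assuming the stock unit name, is a drop-in override:

```
# Create or open a drop-in override for the ollama unit:
sudo systemctl edit ollama
# In the editor, add:
#   [Service]
#   Environment="OLLAMA_VULKAN=1"
# Save and exit, then restart:
sudo systemctl restart ollama
```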


@relzp commented on GitHub (Apr 6, 2026):

Sorry, yeah, I do normally have it set to 1; at the time I just happened to be testing whether I needed that variable at all to use Vulkan.
I've set OLLAMA_VULKAN=1, and unfortunately it still doesn't find either GPU when cap_perfmon is set on the ollama binary.
These are the logs with OLLAMA_VULKAN=1 and OLLAMA_DEBUG=2:

Started Ollama Service.
time=2026-04-06T10:16:47.341Z level=INFO source=routes.go:1744 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:0 OLLAMA_DEBUG:DEBUG-4 OLLAMA_DEBUG_LOG_REQUESTS:false OLLAMA_EDITOR: OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/lib/ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NO_CLOUD:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:true ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2026-04-06T10:16:47.341Z level=INFO source=routes.go:1746 msg="Ollama cloud disabled: false"
time=2026-04-06T10:16:47.342Z level=INFO source=images.go:499 msg="total blobs: 41"
time=2026-04-06T10:16:47.343Z level=INFO source=images.go:506 msg="total unused blobs removed: 0"
time=2026-04-06T10:16:47.343Z level=INFO source=routes.go:1802 msg="Listening on 127.0.0.1:11434 (version 0.20.2)"
time=2026-04-06T10:16:47.343Z level=DEBUG source=sched.go:145 msg="starting llm scheduler"
time=2026-04-06T10:16:47.343Z level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2026-04-06T10:16:47.343Z level=TRACE source=runner.go:440 msg="starting runner for device discovery" libDirs=[/usr/lib/ollama] extraEnvs=map[]
time=2026-04-06T10:16:47.344Z level=INFO source=server.go:432 msg="starting runner" cmd="/usr/bin/ollama runner --ollama-engine --port 43729"
time=2026-04-06T10:16:47.344Z level=DEBUG source=server.go:433 msg=subprocess PATH=/usr/local/sbin:/usr/local/bin:/usr/bin OLLAMA_MODELS=/var/lib/ollama OLLAMA_VULKAN=1 OLLAMA_DEBUG=2 LD_LIBRARY_PATH=/usr/lib/ollama OLLAMA_LIBRARY_PATH=/usr/lib/ollama
time=2026-04-06T10:16:47.357Z level=INFO source=runner.go:1417 msg="starting ollama engine"
time=2026-04-06T10:16:47.358Z level=INFO source=runner.go:1452 msg="Server listening on 127.0.0.1:43729"
time=2026-04-06T10:16:47.365Z level=DEBUG source=gguf.go:604 msg=general.architecture type=string
time=2026-04-06T10:16:47.365Z level=DEBUG source=gguf.go:604 msg=tokenizer.ggml.model type=string
time=2026-04-06T10:16:47.365Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=general.alignment default=32
time=2026-04-06T10:16:47.365Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=general.alignment default=32
time=2026-04-06T10:16:47.365Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=general.file_type default=0
time=2026-04-06T10:16:47.365Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=general.name default=""
time=2026-04-06T10:16:47.365Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=general.description default=""
time=2026-04-06T10:16:47.365Z level=INFO source=ggml.go:136 msg="" architecture=llama file_type=unknown name="" description="" num_tensors=0 num_key_values=3
time=2026-04-06T10:16:47.365Z level=DEBUG source=ggml.go:94 msg="ggml backend load all from path" path=/usr/lib/ollama
time=2026-04-06T10:16:47.367Z level=INFO source=ggml.go:104 msg=system CPU.0.LLAMAFILE=1 compiler=cgo(gcc)
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.block_count default=0
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.pooling_type default=0
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.expert_count default=0
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.tokens default="&{size:0 values:[]}"
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.scores default="&{size:0 values:[]}"
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.token_type default="&{size:0 values:[]}"
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.merges default="&{size:0 values:[]}"
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.add_bos_token default=true
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.bos_token_id default=0
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.add_eos_token default=false
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.eos_token_id default=0
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.eos_token_ids default="&{size:0 values:[]}"
time=2026-04-06T10:16:47.367Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=tokenizer.ggml.pre default=""
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.block_count default=0
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.embedding_length default=0
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.attention.head_count default=0
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.attention.head_count_kv default=0
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.attention.key_length default=0
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.rope.dimension_count default=0
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.attention.layer_norm_rms_epsilon default=0
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.rope.freq_base default=100000
time=2026-04-06T10:16:47.368Z level=DEBUG source=ggml.go:325 msg="key with type not found" key=llama.rope.scaling.factor default=1
time=2026-04-06T10:16:47.368Z level=DEBUG source=runner.go:1392 msg="dummy model load took" duration=2.649655ms
time=2026-04-06T10:16:47.368Z level=DEBUG source=runner.go:1397 msg="gathering device infos took" duration=370ns
time=2026-04-06T10:16:47.368Z level=TRACE source=runner.go:467 msg="runner enumerated devices" OLLAMA_LIBRARY_PATH=[/usr/lib/ollama] devices=[]
time=2026-04-06T10:16:47.368Z level=DEBUG source=runner.go:437 msg="bootstrap discovery took" duration=24.876934ms OLLAMA_LIBRARY_PATH=[/usr/lib/ollama] extra_envs=map[]
time=2026-04-06T10:16:47.368Z level=DEBUG source=runner.go:124 msg="evaluating which, if any, devices to filter out" initial_count=0
time=2026-04-06T10:16:47.368Z level=TRACE source=runner.go:174 msg="supported GPU library combinations before filtering" supported=map[]
time=2026-04-06T10:16:47.368Z level=DEBUG source=runner.go:40 msg="GPU bootstrap discovery took" duration=25.032096ms
time=2026-04-06T10:16:47.368Z level=INFO source=types.go:60 msg="inference compute" id=cpu library=cpu compute="" name=cpu description=cpu libdirs=ollama driver="" pci_id="" type="" total="31.3 GiB" available="22.8 GiB"
time=2026-04-06T10:16:47.368Z level=INFO source=routes.go:1852 msg="vram-based default context" total_vram="0 B" default_num_ctx=4096

Just to make sure:

> sudo getcap /usr/bin/ollama
/usr/bin/ollama cap_perfmon=ep
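
If it helps triage, one hypothesis for the mechanism (pure speculation on my part, not confirmed here): file capabilities make the kernel mark the process for secure execution (AT_SECURE), and glibc's dynamic loader then strips LD_LIBRARY_PATH from the environment. Since the parent launches the runner with LD_LIBRARY_PATH=/usr/lib/ollama (see the msg=subprocess line above), the runner might then fail to locate the Vulkan/ROCm backend libraries and enumerate zero devices. A quick throwaway test of that stripping behaviour:

```
# Copy a harmless dynamically linked binary and give it the same capability:
cp /usr/bin/env /tmp/env-cap
sudo setcap cap_perfmon+ep /tmp/env-cap

# If secure-execution sanitizing applies, LD_LIBRARY_PATH is removed from
# the environment before main() runs, so this prints nothing:
LD_LIBRARY_PATH=/usr/lib/ollama /tmp/env-cap | grep LD_LIBRARY_PATH

rm /tmp/env-cap
```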