[GH-ISSUE #12408] CUDA error: no kernel image is available for execution on the device #34001

Closed
opened 2026-04-22 17:13:24 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @Dominiquini on GitHub (Sep 25, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12408

What is the issue?

Ollama doesn't work when I try to run it on my GPU with CUDA. If I disable CUDA (starting the service with the environment variable `CUDA_VISIBLE_DEVICES="-1"`), everything works fine!
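For reference, the workaround above can be made persistent on a systemd install with a drop-in override (a sketch, assuming the standard `ollama.service` unit name; create it with `sudo systemctl edit ollama.service`, then `sudo systemctl restart ollama.service`):

```ini
[Service]
Environment="CUDA_VISIBLE_DEVICES=-1"
```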

Everything seemed to be working fine a few weeks ago! I think the problem started when I updated CUDA to version 13.0.1, as I tested several previous versions of Ollama and the same error persists! Even versions that I know worked in the past...
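This failure mode is consistent with a compute-capability mismatch: the runner log below prints the CUDA architectures the backend was compiled for (`CUDA.0.ARCHS=750,...`), and the GTX 1060 reports compute capability 6.1. A minimal sketch of the comparison (the 6.1 → 610 arch-number mapping is an assumption based on NVIDIA's usual `sm_XX` numbering):

```shell
# Compare the GPU's architecture number against the minimum architecture the
# CUDA backend was compiled for. If the GPU is below the minimum, no kernel
# image exists for it, and any CUDA op aborts with the error in this report.
archs="750,800,860,870,880,890,900,1000,1030,1100,1200,1210"  # CUDA.0.ARCHS from the log below
gpu_arch=610                                                  # GTX 1060: compute capability 6.1
min_arch=$(printf '%s\n' "$archs" | tr ',' '\n' | sort -n | head -n1)
if [ "$gpu_arch" -lt "$min_arch" ]; then
  echo "GPU arch $gpu_arch is below minimum compiled arch $min_arch"
fi
```

If that holds, the CUDA 13 toolchain dropping pre-Turing architectures would explain why even old Ollama versions now fail after the CUDA update.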

Relevant log output

JOURNALCTL:

24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.610-03:00 level=INFO source=routes.go:1475 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/lib/ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.614-03:00 level=INFO source=images.go:518 msg="total blobs: 60"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.615-03:00 level=INFO source=images.go:525 msg="total unused blobs removed: 0"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.615-03:00 level=INFO source=routes.go:1528 msg="Listening on 127.0.0.1:11434 (version 0.12.1)"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.615-03:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.768-03:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-b5634b9b-f606-83eb-86c6-99d9398d729f library=cuda variant=v12 compute=6.1 driver=13.0 name="NVIDIA GeForce GTX 1060 6GB" total="5.9 GiB" available="4.5 GiB"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.768-03:00 level=INFO source=routes.go:1569 msg="entering low vram mode" "total vram"="5.9 GiB" threshold="20.0 GiB"
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:40 | 200 |      37.167µs |       127.0.0.1 | HEAD     "/"
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:40 | 200 |   67.780023ms |       127.0.0.1 | POST     "/api/show"
24/09/2025 22:45	ollama	llama_model_loader: loaded meta data with 29 key-value pairs and 464 tensors from /var/lib/ollama/blobs/sha256-ff1d1fc78170d787ee1201778e2dd65ea211654ca5fb7d69b5a2e7b123a50373 (version GGUF V3 (latest))
24/09/2025 22:45	ollama	llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
24/09/2025 22:45	ollama	llama_model_loader: - kv   0:                       general.architecture str              = gemma2
24/09/2025 22:45	ollama	llama_model_loader: - kv   1:                               general.name str              = gemma-2-9b-it
24/09/2025 22:45	ollama	llama_model_loader: - kv   2:                      gemma2.context_length u32              = 8192
24/09/2025 22:45	ollama	llama_model_loader: - kv   3:                    gemma2.embedding_length u32              = 3584
24/09/2025 22:45	ollama	llama_model_loader: - kv   4:                         gemma2.block_count u32              = 42
24/09/2025 22:45	ollama	llama_model_loader: - kv   5:                 gemma2.feed_forward_length u32              = 14336
24/09/2025 22:45	ollama	llama_model_loader: - kv   6:                gemma2.attention.head_count u32              = 16
24/09/2025 22:45	ollama	llama_model_loader: - kv   7:             gemma2.attention.head_count_kv u32              = 8
24/09/2025 22:45	ollama	llama_model_loader: - kv   8:    gemma2.attention.layer_norm_rms_epsilon f32              = 0.000001
24/09/2025 22:45	ollama	llama_model_loader: - kv   9:                gemma2.attention.key_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  10:              gemma2.attention.value_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  11:                          general.file_type u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  12:              gemma2.attn_logit_softcapping f32              = 50.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  13:             gemma2.final_logit_softcapping f32              = 30.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  14:            gemma2.attention.sliding_window u32              = 4096
24/09/2025 22:45	ollama	llama_model_loader: - kv  15:                       tokenizer.ggml.model str              = llama
24/09/2025 22:45	ollama	llama_model_loader: - kv  16:                         tokenizer.ggml.pre str              = default
24/09/2025 22:45	ollama	llama_model_loader: - kv  17:                      tokenizer.ggml.tokens arr[str,256000]  = ["<pad>", "<eos>", "<bos>", "<unk>", ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  18:                      tokenizer.ggml.scores arr[f32,256000]  = [0.000000, 0.000000, 0.000000, 0.0000...
24/09/2025 22:45	ollama	llama_model_loader: - kv  19:                  tokenizer.ggml.token_type arr[i32,256000]  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  20:                tokenizer.ggml.bos_token_id u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  21:                tokenizer.ggml.eos_token_id u32              = 1
24/09/2025 22:45	ollama	llama_model_loader: - kv  22:            tokenizer.ggml.unknown_token_id u32              = 3
24/09/2025 22:45	ollama	llama_model_loader: - kv  23:            tokenizer.ggml.padding_token_id u32              = 0
24/09/2025 22:45	ollama	llama_model_loader: - kv  24:               tokenizer.ggml.add_bos_token bool             = true
24/09/2025 22:45	ollama	llama_model_loader: - kv  25:               tokenizer.ggml.add_eos_token bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  26:                    tokenizer.chat_template str              = {{ bos_token }}{% if messages[0]['rol...
24/09/2025 22:45	ollama	llama_model_loader: - kv  27:            tokenizer.ggml.add_space_prefix bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  28:               general.quantization_version u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - type  f32:  169 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q4_0:  294 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q6_K:    1 tensors
24/09/2025 22:45	ollama	print_info: file format = GGUF V3 (latest)
24/09/2025 22:45	ollama	print_info: file type   = Q4_0
24/09/2025 22:45	ollama	print_info: file size   = 5.06 GiB (4.71 BPW)
24/09/2025 22:45	ollama	load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
24/09/2025 22:45	ollama	load: printing all EOG tokens:
24/09/2025 22:45	ollama	load:   - 1 ('<eos>')
24/09/2025 22:45	ollama	load:   - 107 ('<end_of_turn>')
24/09/2025 22:45	ollama	load: special tokens cache size = 108
24/09/2025 22:45	ollama	load: token to piece cache size = 1.6014 MB
24/09/2025 22:45	ollama	print_info: arch             = gemma2
24/09/2025 22:45	ollama	print_info: vocab_only       = 1
24/09/2025 22:45	ollama	print_info: model type       = ?B
24/09/2025 22:45	ollama	print_info: model params     = 9.24 B
24/09/2025 22:45	ollama	print_info: general.name     = gemma-2-9b-it
24/09/2025 22:45	ollama	print_info: vocab type       = SPM
24/09/2025 22:45	ollama	print_info: n_vocab          = 256000
24/09/2025 22:45	ollama	print_info: n_merges         = 0
24/09/2025 22:45	ollama	print_info: BOS token        = 2 '<bos>'
24/09/2025 22:45	ollama	print_info: EOS token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOT token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: UNK token        = 3 '<unk>'
24/09/2025 22:45	ollama	print_info: PAD token        = 0 '<pad>'
24/09/2025 22:45	ollama	print_info: LF token         = 227 '<0x0A>'
24/09/2025 22:45	ollama	print_info: EOG token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOG token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: max token length = 93
24/09/2025 22:45	ollama	llama_model_load: vocab only - skipping tensors
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.324-03:00 level=INFO source=server.go:399 msg="starting runner" cmd="/usr/bin/ollama runner --model /var/lib/ollama/blobs/sha256-ff1d1fc78170d787ee1201778e2dd65ea211654ca5fb7d69b5a2e7b123a50373 --port 37197"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.334-03:00 level=INFO source=runner.go:864 msg="starting go runner"
24/09/2025 22:45	ollama	ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
24/09/2025 22:45	ollama	ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
24/09/2025 22:45	ollama	ggml_cuda_init: found 1 CUDA devices:
24/09/2025 22:45	ollama	  Device 0: NVIDIA GeForce GTX 1060 6GB, compute capability 6.1, VMM: yes, ID: GPU-b5634b9b-f606-83eb-86c6-99d9398d729f
24/09/2025 22:45	ollama	load_backend: loaded CUDA backend from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45	ollama	load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-haswell.so
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.377-03:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=750,800,860,870,880,890,900,1000,1030,1100,1200,1210 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.377-03:00 level=INFO source=runner.go:900 msg="Server listening on 127.0.0.1:37197"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.395-03:00 level=INFO source=server.go:504 msg="system memory" total="31.2 GiB" free="20.8 GiB" free_swap="0 B"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.397-03:00 level=INFO source=server.go:544 msg=offload library=cuda layers.requested=-1 layers.model=43 layers.offload=20 layers.split=[20] memory.available="[4.5 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.2 GiB" memory.required.partial="4.5 GiB" memory.required.kv="1.3 GiB" memory.required.allocations="[4.5 GiB]" memory.weights.total="5.1 GiB" memory.weights.repeating="4.4 GiB" memory.weights.nonrepeating="717.8 MiB" memory.graph.full="507.0 MiB" memory.graph.partial="1.2 GiB"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.397-03:00 level=INFO source=runner.go:799 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:false KvSize:4096 KvCacheType: NumThreads:4 GPULayers:20[ID:GPU-b5634b9b-f606-83eb-86c6-99d9398d729f Layers:20(22..41)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:true}"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.397-03:00 level=INFO source=server.go:1251 msg="waiting for llama runner to start responding"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.398-03:00 level=INFO source=server.go:1285 msg="waiting for server to become available" status="llm server loading model"
24/09/2025 22:45	ollama	llama_model_load_from_file_impl: using device CUDA0 (NVIDIA GeForce GTX 1060 6GB) - 4625 MiB free
24/09/2025 22:45	ollama	llama_model_loader: loaded meta data with 29 key-value pairs and 464 tensors from /var/lib/ollama/blobs/sha256-ff1d1fc78170d787ee1201778e2dd65ea211654ca5fb7d69b5a2e7b123a50373 (version GGUF V3 (latest))
24/09/2025 22:45	ollama	llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
24/09/2025 22:45	ollama	llama_model_loader: - kv   0:                       general.architecture str              = gemma2
24/09/2025 22:45	ollama	llama_model_loader: - kv   1:                               general.name str              = gemma-2-9b-it
24/09/2025 22:45	ollama	llama_model_loader: - kv   2:                      gemma2.context_length u32              = 8192
24/09/2025 22:45	ollama	llama_model_loader: - kv   3:                    gemma2.embedding_length u32              = 3584
24/09/2025 22:45	ollama	llama_model_loader: - kv   4:                         gemma2.block_count u32              = 42
24/09/2025 22:45	ollama	llama_model_loader: - kv   5:                 gemma2.feed_forward_length u32              = 14336
24/09/2025 22:45	ollama	llama_model_loader: - kv   6:                gemma2.attention.head_count u32              = 16
24/09/2025 22:45	ollama	llama_model_loader: - kv   7:             gemma2.attention.head_count_kv u32              = 8
24/09/2025 22:45	ollama	llama_model_loader: - kv   8:    gemma2.attention.layer_norm_rms_epsilon f32              = 0.000001
24/09/2025 22:45	ollama	llama_model_loader: - kv   9:                gemma2.attention.key_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  10:              gemma2.attention.value_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  11:                          general.file_type u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  12:              gemma2.attn_logit_softcapping f32              = 50.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  13:             gemma2.final_logit_softcapping f32              = 30.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  14:            gemma2.attention.sliding_window u32              = 4096
24/09/2025 22:45	ollama	llama_model_loader: - kv  15:                       tokenizer.ggml.model str              = llama
24/09/2025 22:45	ollama	llama_model_loader: - kv  16:                         tokenizer.ggml.pre str              = default
24/09/2025 22:45	ollama	llama_model_loader: - kv  17:                      tokenizer.ggml.tokens arr[str,256000]  = ["<pad>", "<eos>", "<bos>", "<unk>", ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  18:                      tokenizer.ggml.scores arr[f32,256000]  = [0.000000, 0.000000, 0.000000, 0.0000...
24/09/2025 22:45	ollama	llama_model_loader: - kv  19:                  tokenizer.ggml.token_type arr[i32,256000]  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  20:                tokenizer.ggml.bos_token_id u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  21:                tokenizer.ggml.eos_token_id u32              = 1
24/09/2025 22:45	ollama	llama_model_loader: - kv  22:            tokenizer.ggml.unknown_token_id u32              = 3
24/09/2025 22:45	ollama	llama_model_loader: - kv  23:            tokenizer.ggml.padding_token_id u32              = 0
24/09/2025 22:45	ollama	llama_model_loader: - kv  24:               tokenizer.ggml.add_bos_token bool             = true
24/09/2025 22:45	ollama	llama_model_loader: - kv  25:               tokenizer.ggml.add_eos_token bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  26:                    tokenizer.chat_template str              = {{ bos_token }}{% if messages[0]['rol...
24/09/2025 22:45	ollama	llama_model_loader: - kv  27:            tokenizer.ggml.add_space_prefix bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  28:               general.quantization_version u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - type  f32:  169 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q4_0:  294 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q6_K:    1 tensors
24/09/2025 22:45	ollama	print_info: file format = GGUF V3 (latest)
24/09/2025 22:45	ollama	print_info: file type   = Q4_0
24/09/2025 22:45	ollama	print_info: file size   = 5.06 GiB (4.71 BPW)
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:41 | 200 |      24.284µs |       127.0.0.1 | GET      "/"
24/09/2025 22:45	ollama	load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
24/09/2025 22:45	ollama	load: printing all EOG tokens:
24/09/2025 22:45	ollama	load:   - 1 ('<eos>')
24/09/2025 22:45	ollama	load:   - 107 ('<end_of_turn>')
24/09/2025 22:45	ollama	load: special tokens cache size = 108
24/09/2025 22:45	ollama	load: token to piece cache size = 1.6014 MB
24/09/2025 22:45	ollama	print_info: arch             = gemma2
24/09/2025 22:45	ollama	print_info: vocab_only       = 0
24/09/2025 22:45	ollama	print_info: n_ctx_train      = 8192
24/09/2025 22:45	ollama	print_info: n_embd           = 3584
24/09/2025 22:45	ollama	print_info: n_layer          = 42
24/09/2025 22:45	ollama	print_info: n_head           = 16
24/09/2025 22:45	ollama	print_info: n_head_kv        = 8
24/09/2025 22:45	ollama	print_info: n_rot            = 256
24/09/2025 22:45	ollama	print_info: n_swa            = 4096
24/09/2025 22:45	ollama	print_info: is_swa_any       = 1
24/09/2025 22:45	ollama	print_info: n_embd_head_k    = 256
24/09/2025 22:45	ollama	print_info: n_embd_head_v    = 256
24/09/2025 22:45	ollama	print_info: n_gqa            = 2
24/09/2025 22:45	ollama	print_info: n_embd_k_gqa     = 2048
24/09/2025 22:45	ollama	print_info: n_embd_v_gqa     = 2048
24/09/2025 22:45	ollama	print_info: f_norm_eps       = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_norm_rms_eps   = 1.0e-06
24/09/2025 22:45	ollama	print_info: f_clamp_kqv      = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_max_alibi_bias = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_logit_scale    = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_attn_scale     = 6.2e-02
24/09/2025 22:45	ollama	print_info: n_ff             = 14336
24/09/2025 22:45	ollama	print_info: n_expert         = 0
24/09/2025 22:45	ollama	print_info: n_expert_used    = 0
24/09/2025 22:45	ollama	print_info: causal attn      = 1
24/09/2025 22:45	ollama	print_info: pooling type     = 0
24/09/2025 22:45	ollama	print_info: rope type        = 2
24/09/2025 22:45	ollama	print_info: rope scaling     = linear
24/09/2025 22:45	ollama	print_info: freq_base_train  = 10000.0
24/09/2025 22:45	ollama	print_info: freq_scale_train = 1
24/09/2025 22:45	ollama	print_info: n_ctx_orig_yarn  = 8192
24/09/2025 22:45	ollama	print_info: rope_finetuned   = unknown
24/09/2025 22:45	ollama	print_info: model type       = 9B
24/09/2025 22:45	ollama	print_info: model params     = 9.24 B
24/09/2025 22:45	ollama	print_info: general.name     = gemma-2-9b-it
24/09/2025 22:45	ollama	print_info: vocab type       = SPM
24/09/2025 22:45	ollama	print_info: n_vocab          = 256000
24/09/2025 22:45	ollama	print_info: n_merges         = 0
24/09/2025 22:45	ollama	print_info: BOS token        = 2 '<bos>'
24/09/2025 22:45	ollama	print_info: EOS token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOT token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: UNK token        = 3 '<unk>'
24/09/2025 22:45	ollama	print_info: PAD token        = 0 '<pad>'
24/09/2025 22:45	ollama	print_info: LF token         = 227 '<0x0A>'
24/09/2025 22:45	ollama	print_info: EOG token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOG token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: max token length = 93
24/09/2025 22:45	ollama	load_tensors: loading model tensors, this can take a while... (mmap = true)
24/09/2025 22:45	ollama	load_tensors: offloading 20 repeating layers to GPU
24/09/2025 22:45	ollama	load_tensors: offloaded 20/43 layers to GPU
24/09/2025 22:45	ollama	load_tensors:        CUDA0 model buffer size =  2127.34 MiB
24/09/2025 22:45	ollama	load_tensors:   CPU_Mapped model buffer size =  5185.21 MiB
24/09/2025 22:45	ollama	llama_context: constructing llama_context
24/09/2025 22:45	ollama	llama_context: n_seq_max     = 1
24/09/2025 22:45	ollama	llama_context: n_ctx         = 4096
24/09/2025 22:45	ollama	llama_context: n_ctx_per_seq = 4096
24/09/2025 22:45	ollama	llama_context: n_batch       = 512
24/09/2025 22:45	ollama	llama_context: n_ubatch      = 512
24/09/2025 22:45	ollama	llama_context: causal_attn   = 1
24/09/2025 22:45	ollama	llama_context: flash_attn    = 0
24/09/2025 22:45	ollama	llama_context: kv_unified    = false
24/09/2025 22:45	ollama	llama_context: freq_base     = 10000.0
24/09/2025 22:45	ollama	llama_context: freq_scale    = 1
24/09/2025 22:45	ollama	llama_context: n_ctx_per_seq (4096) < n_ctx_train (8192) -- the full capacity of the model will not be utilized
24/09/2025 22:45	ollama	llama_context:        CPU  output buffer size =     0.99 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified_iswa: using full-size SWA cache (ref: https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
24/09/2025 22:45	ollama	llama_kv_cache_unified_iswa: creating non-SWA KV cache, size = 4096 cells
24/09/2025 22:45	ollama	llama_kv_cache_unified:      CUDA0 KV buffer size =   320.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified:        CPU KV buffer size =   352.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified: size =  672.00 MiB (  4096 cells,  21 layers,  1/1 seqs), K (f16):  336.00 MiB, V (f16):  336.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified_iswa: creating     SWA KV cache, size = 4096 cells
24/09/2025 22:45	ollama	llama_kv_cache_unified:      CUDA0 KV buffer size =   320.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified:        CPU KV buffer size =   352.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified: size =  672.00 MiB (  4096 cells,  21 layers,  1/1 seqs), K (f16):  336.00 MiB, V (f16):  336.00 MiB
24/09/2025 22:45	ollama	llama_context:      CUDA0 compute buffer size =  1224.77 MiB
24/09/2025 22:45	ollama	llama_context:  CUDA_Host compute buffer size =    40.01 MiB
24/09/2025 22:45	ollama	llama_context: graph nodes  = 1816
24/09/2025 22:45	ollama	llama_context: graph splits = 290 (with bs=512), 3 (with bs=1)
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.653-03:00 level=INFO source=sched.go:470 msg="loaded runners" count=1
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1251 msg="waiting for llama runner to start responding"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.654-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45	ollama	ggml_cuda_compute_forward: ADD failed
24/09/2025 22:45	ollama	CUDA error: no kernel image is available for execution on the device
24/09/2025 22:45	ollama	  current device: 0, in function ggml_cuda_compute_forward at /build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:2568
24/09/2025 22:45	ollama	  err
24/09/2025 22:45	ollama	/build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:84: CUDA error
24/09/2025 22:45	ollama	[New LWP 310914]
24/09/2025 22:45	ollama	[New LWP 310913]
24/09/2025 22:45	ollama	[New LWP 310912]
24/09/2025 22:45	ollama	[New LWP 310910]
24/09/2025 22:45	ollama	[New LWP 310909]
24/09/2025 22:45	ollama	[New LWP 310908]
24/09/2025 22:45	ollama	[New LWP 310907]
24/09/2025 22:45	ollama	[New LWP 310906]
24/09/2025 22:45	ollama	[New LWP 310905]
24/09/2025 22:45	ollama	[New LWP 310904]
24/09/2025 22:45	ollama	[New LWP 310903]
24/09/2025 22:45	ollama	[New LWP 310902]
24/09/2025 22:45	ollama	[Thread debugging using libthread_db enabled]
24/09/2025 22:45	ollama	Using host libthread_db library "/usr/lib/libthread_db.so.1".
24/09/2025 22:45	ollama	0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#0  0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#1  0x00007f59d8e931ac in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#2  0x00007f59d8e931f4 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#3  0x00007f59d8f03dcf in wait4 () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#4  0x00007f599032a5bd in ggml_print_backtrace () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45	ollama	#5  0x00007f599032a763 in ggml_abort () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45	ollama	#6  0x00007f597b75c381 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45	ollama	#7  0x00007f597b76af02 in ?? () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45	ollama	#8  0x00005623b8da30ed in ?? ()
24/09/2025 22:45	ollama	#9  0x00005623b8e19c12 in ?? ()
24/09/2025 22:45	ollama	#10 0x00005623b8e1afa3 in ?? ()
24/09/2025 22:45	ollama	#11 0x00005623b8e1e84a in ?? ()
24/09/2025 22:45	ollama	#12 0x00005623b8e1f916 in ?? ()
24/09/2025 22:45	ollama	#13 0x00005623b8d5bc50 in ?? ()
24/09/2025 22:45	ollama	#14 0x00005623b80743a1 in ?? ()
24/09/2025 22:45	ollama	#15 0x0000000000000498 in ?? ()
24/09/2025 22:45	ollama	#16 0x000000c000103180 in ?? ()
24/09/2025 22:45	ollama	#17 0x00005623b807285a in ?? ()
24/09/2025 22:45	ollama	#18 0x00005623b80771e5 in ?? ()
24/09/2025 22:45	ollama	#19 0x00007fffc9fc4818 in ?? ()
24/09/2025 22:45	ollama	#20 0x00005623b80771e5 in ?? ()
24/09/2025 22:45	ollama	#21 0x00005623b9da4260 in ?? ()
24/09/2025 22:45	ollama	#22 0x00007fffc9fc48f0 in ?? ()
24/09/2025 22:45	ollama	#23 0x00005623b8072645 in ?? ()
24/09/2025 22:45	ollama	#24 0x00005623b80725d3 in ?? ()
24/09/2025 22:45	ollama	#25 0x0000000000000006 in ?? ()
24/09/2025 22:45	ollama	#26 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45	ollama	#27 0x0000000000000006 in ?? ()
24/09/2025 22:45	ollama	#28 0x0000000000000006 in ?? ()
24/09/2025 22:45	ollama	#29 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45	ollama	#30 0x00007f59d8e27675 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	Backtrace stopped: previous frame inner to this frame (corrupt stack?)
24/09/2025 22:45	ollama	[Inferior 1 (process 310900) detached]
24/09/2025 22:45	ollama	SIGABRT: abort
24/09/2025 22:45	ollama	PC=0x7f59d8e9894c m=0 sigcode=18446744073709551610
24/09/2025 22:45	ollama	signal arrived during cgo execution
24/09/2025 22:45	ollama	goroutine 11 gp=0xc000103180 m=0 mp=0x5623b9da6080 [syscall]:
24/09/2025 22:45	ollama	runtime.cgocall(0x5623b8d5bc00, 0xc000389bd8)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/cgocall.go:167 +0x4b fp=0xc000389bb0 sp=0xc000389b78 pc=0x5623b806906b
24/09/2025 22:45	ollama	github.com/ollama/ollama/llama._Cfunc_llama_decode(0x5623e40f5060, {0xf, 0x5623e40fac20, 0x0, 0x5623e41a39a0, 0x5623e6c89a70, 0x5623e6c8a280, 0x5623e6c8d610})
24/09/2025 22:45	ollama		_cgo_gotypes.go:672 +0x4a fp=0xc000389bd8 sp=0xc000389bb0 pc=0x5623b841ea6a
24/09/2025 22:45	ollama	github.com/ollama/ollama/llama.(*Context).Decode.func1(...)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/llama/llama.go:150
24/09/2025 22:45	ollama	github.com/ollama/ollama/llama.(*Context).Decode(0xc00050dd88?, 0x1?)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/llama/llama.go:150 +0xed fp=0xc000389cc0 sp=0xc000389bd8 pc=0x5623b842184d
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0002e94a0, 0xc0007120f0, 0xc00050df28)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:441 +0x209 fp=0xc000389ee8 sp=0xc000389cc0 pc=0x5623b84ec309
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0002e94a0, {0x5623b94e0570, 0xc000123a40})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:346 +0x1d5 fp=0xc000389fb8 sp=0xc000389ee8 pc=0x5623b84ebf95
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x28 fp=0xc000389fe0 sp=0xc000389fb8 pc=0x5623b84f0ce8
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000389fe8 sp=0xc000389fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x4c5
24/09/2025 22:45	ollama	goroutine 1 gp=0xc000002380 m=nil [IO wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000597790 sp=0xc000597770 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.netpollblock(0xc0005977e0?, 0xb80013a6?, 0x23?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc0005977c8 sp=0xc000597790 pc=0x5623b80301d7
24/09/2025 22:45	ollama	internal/poll.runtime_pollWait(0x7f59d94fb400, 0x72)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc0005977e8 sp=0xc0005977c8 pc=0x5623b806b6c5
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).wait(0xc0005b1400?, 0x900000036?, 0x0)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000597810 sp=0xc0005977e8 pc=0x5623b80f4707
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).waitRead(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:89
24/09/2025 22:45	ollama	internal/poll.(*FD).Accept(0xc0005b1400)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_unix.go:613 +0x28c fp=0xc0005978b8 sp=0xc000597810 pc=0x5623b80f9b2c
24/09/2025 22:45	ollama	net.(*netFD).accept(0xc0005b1400)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/fd_unix.go:161 +0x29 fp=0xc000597970 sp=0xc0005978b8 pc=0x5623b8164089
24/09/2025 22:45	ollama	net.(*TCPListener).accept(0xc00012fd80)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/tcpsock_posix.go:159 +0x1b fp=0xc0005979c0 sp=0xc000597970 pc=0x5623b81797bb
24/09/2025 22:45	ollama	net.(*TCPListener).Accept(0xc00012fd80)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/tcpsock.go:380 +0x30 fp=0xc0005979f0 sp=0xc0005979c0 pc=0x5623b8178650
24/09/2025 22:45	ollama	net/http.(*onceCloseListener).Accept(0xc0004d03f0?)
24/09/2025 22:45	ollama		<autogenerated>:1 +0x24 fp=0xc000597a08 sp=0xc0005979f0 pc=0x5623b8399ea4
24/09/2025 22:45	ollama	net/http.(*Server).Serve(0xc0001f1500, {0x5623b94ddfc8, 0xc00012fd80})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3463 +0x30c fp=0xc000597b38 sp=0xc000597a08 pc=0x5623b837188c
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034260, 0x4, 0x4})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:901 +0x8f4 fp=0xc000597d08 sp=0xc000597b38 pc=0x5623b84f0a74
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner.Execute({0xc000034250?, 0x0?, 0x0?})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/runner.go:22 +0xd4 fp=0xc000597d30 sp=0xc000597d08 pc=0x5623b8583414
24/09/2025 22:45	ollama	github.com/ollama/ollama/cmd.NewCLI.func2(0xc0001f1200?, {0x5623b8fec2d0?, 0x4?, 0x5623b8fec2d4?})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/cmd/cmd.go:1706 +0x45 fp=0xc000597d58 sp=0xc000597d30 pc=0x5623b8ced5c5
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).execute(0xc0004d3508, {0xc00012fb80, 0x4, 0x4})
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x88a fp=0xc000597e78 sp=0xc000597d58 pc=0x5623b81dd70a
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).ExecuteC(0xc0005c6f08)
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x398 fp=0xc000597f30 sp=0xc000597e78 pc=0x5623b81ddf38
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).Execute(...)
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).ExecuteContext(...)
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
24/09/2025 22:45	ollama	main.main()
24/09/2025 22:45	ollama		/build/ollama/src/ollama/main.go:12 +0x4d fp=0xc000597f50 sp=0xc000597f30 pc=0x5623b8cee08d
24/09/2025 22:45	ollama	runtime.main()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:285 +0x29d fp=0xc000597fe0 sp=0xc000597f50 pc=0x5623b8037a7d
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000597fe8 sp=0xc000597fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006cfa8 sp=0xc00006cf88 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.forcegchelper()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:373 +0xb8 fp=0xc00006cfe0 sp=0xc00006cfa8 pc=0x5623b8037db8
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006cfe8 sp=0xc00006cfe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.init.7 in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:361 +0x1a
24/09/2025 22:45	ollama	goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006d780 sp=0xc00006d760 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.bgsweep(0xc000098000)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgcsweep.go:323 +0xdf fp=0xc00006d7c8 sp=0xc00006d780 pc=0x5623b8021adf
24/09/2025 22:45	ollama	runtime.gcenable.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:212 +0x25 fp=0xc00006d7e0 sp=0xc00006d7c8 pc=0x5623b8015a65
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006d7e8 sp=0xc00006d7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcenable in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:212 +0x66
24/09/2025 22:45	ollama	goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x10000?, 0x5623b91b3748?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006df78 sp=0xc00006df58 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.(*scavengerState).park(0x5623b9da3100)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00006dfa8 sp=0xc00006df78 pc=0x5623b801f549
24/09/2025 22:45	ollama	runtime.bgscavenge(0xc000098000)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc00006dfc8 sp=0xc00006dfa8 pc=0x5623b801faf9
24/09/2025 22:45	ollama	runtime.gcenable.gowrap2()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:213 +0x25 fp=0xc00006dfe0 sp=0xc00006dfc8 pc=0x5623b8015a05
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006dfe8 sp=0xc00006dfe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcenable in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:213 +0xa5
24/09/2025 22:45	ollama	goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x5623b8046d57?, 0x5623b800d385?, 0xb8?, 0x1?, 0xc000002380?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006c620 sp=0xc00006c600 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.runFinalizers()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mfinal.go:210 +0x107 fp=0xc00006c7e0 sp=0xc00006c620 pc=0x5623b8014967
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006c7e8 sp=0xc00006c7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.createfing in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mfinal.go:172 +0x3d
24/09/2025 22:45	ollama	goroutine 6 gp=0xc0001ce8c0 m=nil [cleanup wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006e768 sp=0xc00006e748 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.(*cleanupQueue).dequeue(0x5623b9da3a60)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mcleanup.go:439 +0xc5 fp=0xc00006e7a0 sp=0xc00006e768 pc=0x5623b8011b45
24/09/2025 22:45	ollama	runtime.runCleanups()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mcleanup.go:635 +0x45 fp=0xc00006e7e0 sp=0xc00006e7a0 pc=0x5623b8012205
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.(*cleanupQueue).createGs in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mcleanup.go:589 +0xa5
24/09/2025 22:45	ollama	goroutine 7 gp=0xc0001cefc0 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ef38 sp=0xc00006ef18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006efc8 sp=0xc00006ef38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 8 gp=0xc0001cf180 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006f7c8 sp=0xc00006f738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 9 gp=0xc0001cf340 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649ae95?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ff38 sp=0xc00006ff18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006ffc8 sp=0xc00006ff38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006ffe0 sp=0xc00006ffc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 10 gp=0xc0001cf500 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649b2fb?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000068738 sp=0xc000068718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0000687c8 sp=0xc000068738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0000687e0 sp=0xc0000687c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0000687e8 sp=0xc0000687e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 18 gp=0xc000504000 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207648e360?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00050a7c8 sp=0xc00050a738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 34 gp=0xc000102380 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x320764a388d?, 0x3?, 0xab?, 0x12?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000083fc8 sp=0xc000083f38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 35 gp=0xc000102540 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649b52d?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000506f38 sp=0xc000506f18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000506fc8 sp=0xc000506f38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000506fe0 sp=0xc000506fc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000506fe8 sp=0xc000506fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 36 gp=0xc000102700 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649d167?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000507738 sp=0xc000507718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0005077c8 sp=0xc000507738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0005077e0 sp=0xc0005077c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0005077e8 sp=0xc0005077e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 12 gp=0xc000103340 m=nil [select]:
24/09/2025 22:45	ollama	runtime.gopark(0xc000047a70?, 0x2?, 0x78?, 0x77?, 0xc0000478bc?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc0000476e8 sp=0xc0000476c8 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.selectgo(0xc000047a70, 0xc0000478b8, 0xf?, 0x0, 0x1?, 0x1)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/select.go:351 +0x8c5 fp=0xc000047828 sp=0xc0000476e8 pc=0x5623b804a685
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0002e94a0, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:629 +0xb30 fp=0xc000047ab8 sp=0xc000047828 pc=0x5623b84edf10
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b38?)
24/09/2025 22:45	ollama		<autogenerated>:1 +0x36 fp=0xc000047ae8 sp=0xc000047ab8 pc=0x5623b84f10f6
24/09/2025 22:45	ollama	net/http.HandlerFunc.ServeHTTP(0xc0005caf00?, {0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b58?)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:2322 +0x29 fp=0xc000047b10 sp=0xc000047ae8 pc=0x5623b836dec9
24/09/2025 22:45	ollama	net/http.(*ServeMux).ServeHTTP(0x5623b800d385?, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:2861 +0x1c7 fp=0xc000047b60 sp=0xc000047b10 pc=0x5623b836fda7
24/09/2025 22:45	ollama	net/http.serverHandler.ServeHTTP({0x5623b94dad30?}, {0x5623b94de1a8?, 0xc0003040f0?}, 0x1?)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3340 +0x8e fp=0xc000047b90 sp=0xc000047b60 pc=0x5623b838d68e
24/09/2025 22:45	ollama	net/http.(*conn).serve(0xc0004d03f0, {0x5623b94e0538, 0xc0004c71d0})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:2109 +0x665 fp=0xc000047fb8 sp=0xc000047b90 pc=0x5623b836bfc5
24/09/2025 22:45	ollama	net/http.(*Server).Serve.gowrap3()
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3493 +0x28 fp=0xc000047fe0 sp=0xc000047fb8 pc=0x5623b8371c88
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000047fe8 sp=0xc000047fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by net/http.(*Server).Serve in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3493 +0x485
24/09/2025 22:45	ollama	goroutine 20 gp=0xc000504380 m=nil [IO wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000509dd8 sp=0xc000509db8 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.netpollblock(0x5623b8090b98?, 0xb80013a6?, 0x23?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc000509e10 sp=0xc000509dd8 pc=0x5623b80301d7
24/09/2025 22:45	ollama	internal/poll.runtime_pollWait(0x7f59d94fb200, 0x72)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc000509e30 sp=0xc000509e10 pc=0x5623b806b6c5
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).wait(0xc0005b1480?, 0xc00012fde1?, 0x0)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000509e58 sp=0xc000509e30 pc=0x5623b80f4707
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).waitRead(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:89
24/09/2025 22:45	ollama	internal/poll.(*FD).Read(0xc0005b1480, {0xc00012fde1, 0x1, 0x1})
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_unix.go:165 +0x279 fp=0xc000509ef0 sp=0xc000509e58 pc=0x5623b80f59f9
24/09/2025 22:45	ollama	net.(*netFD).Read(0xc0005b1480, {0xc00012fde1?, 0x5623b9ce5ec0?, 0xc000509f70?})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/fd_posix.go:68 +0x25 fp=0xc000509f38 sp=0xc000509ef0 pc=0x5623b81621e5
24/09/2025 22:45	ollama	net.(*conn).Read(0xc00011c3c0, {0xc00012fde1?, 0x0?, 0x0?})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/net.go:196 +0x45 fp=0xc000509f80 sp=0xc000509f38 pc=0x5623b8170205
24/09/2025 22:45	ollama	net/http.(*connReader).backgroundRead(0xc00012fdc0)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:702 +0x33 fp=0xc000509fc8 sp=0xc000509f80 pc=0x5623b8366473
24/09/2025 22:45	ollama	net/http.(*connReader).startBackgroundRead.gowrap2()
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:698 +0x25 fp=0xc000509fe0 sp=0xc000509fc8 pc=0x5623b83663a5
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000509fe8 sp=0xc000509fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by net/http.(*connReader).startBackgroundRead in goroutine 12
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:698 +0xb6
24/09/2025 22:45	ollama	rax    0x0
24/09/2025 22:45	ollama	rbx    0x4be74
24/09/2025 22:45	ollama	rcx    0x7f59d8e9894c
24/09/2025 22:45	ollama	rdx    0x6
24/09/2025 22:45	ollama	rdi    0x4be74
24/09/2025 22:45	ollama	rsi    0x4be74
24/09/2025 22:45	ollama	rbp    0x7fffc9fbfaa0
24/09/2025 22:45	ollama	rsp    0x7fffc9fbfa60
24/09/2025 22:45	ollama	r8     0x0
24/09/2025 22:45	ollama	r9     0x0
24/09/2025 22:45	ollama	r10    0x0
24/09/2025 22:45	ollama	r11    0x246
24/09/2025 22:45	ollama	r12    0x7f597bc22729
24/09/2025 22:45	ollama	r13    0x54
24/09/2025 22:45	ollama	r14    0x6
24/09/2025 22:45	ollama	r15    0x0
24/09/2025 22:45	ollama	rip    0x7f59d8e9894c
24/09/2025 22:45	ollama	rflags 0x246
24/09/2025 22:45	ollama	cs     0x33
24/09/2025 22:45	ollama	fs     0x0
24/09/2025 22:45	ollama	gs     0x0
24/09/2025 22:45	ollama	time=2025-09-24T22:45:44.225-03:00 level=ERROR source=server.go:1459 msg="post predict" error="Post \"http://127.0.0.1:37197/completion\": EOF"
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:44 | 200 |  3.564513467s |       127.0.0.1 | POST     "/api/generate"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:44.253-03:00 level=ERROR source=server.go:425 msg="llama runner terminated" error="exit status 2"

CUDA_ERR:

[23:59:10.902][140663025804992][CUDA][E] No CUDA context is current to the calling thread
[23:59:10.902][140663025804992][CUDA][E] Returning 201 (CUDA_ERROR_INVALID_CONTEXT) from cuCtxGetDevice_v2
[23:59:13.181][140663025804992][CUDA][E] Error handling fatbinary, to get more information when using CUDA Driver APIs use the CU_JIT_ERROR_LOG_BUFFER and CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES parameters
[23:59:13.182][140663025804992][CUDA][E] No available relocatable PTX entries for GPU
[23:59:13.182][140663025804992][CUDA][E] No device code available for GPU ISA 61
[23:59:13.182][140663025804992][CUDA][E] Kernel (_Z11k_bin_bcastIXadL_ZN42_INTERNAL_f88bb2be_11_binbcast_cu_6840010b6op_addEffEEfffEvPKT0_PKT1_PT2_iiiiiiiiiiiiiiiii) cannot be found in library due to compilation error, to get more information when using C
[23:59:13.182][140663025804992][CUDA][E] Returning 209 (CUDA_ERROR_NO_BINARY_FOR_GPU) from cuLibraryGetKernel
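The `No device code available for GPU ISA 61` and `CUDA_ERROR_NO_BINARY_FOR_GPU` lines suggest the CUDA backend was built without `sm_61` (compute 6.1) code for the GTX 1060. A diagnostic sketch to check both sides of that mismatch, assuming the CUDA toolkit's `cuobjdump` is on `PATH` and the backend library sits at the path the runner log reports:

```shell
# Report the GPU's compute capability (expected: 6.1 for a GTX 1060)
nvidia-smi --query-gpu=compute_cap --format=csv,noheader

# List the cubin architectures embedded in Ollama's CUDA backend.
# If no sm_61 entry appears, and --list-ptx shows no relocatable PTX
# the driver could JIT-compile either, the GPU has nothing it can
# execute -- consistent with CUDA_ERROR_NO_BINARY_FOR_GPU above.
cuobjdump --list-elf /usr/lib/ollama/libggml-cuda.so
cuobjdump --list-ptx /usr/lib/ollama/libggml-cuda.so
```

If the embedded architectures all target newer GPUs than Pascal, this build simply cannot run on compute 6.1 hardware regardless of driver or CUDA toolkit version.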

NVIDIA-SMI:

Thu Sep 25 00:19:58 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.09              Driver Version: 580.82.09      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 6GB    On  |   00000000:01:00.0  On |                  N/A |
|  0%   50C    P2             27W /  180W |    1499MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A          188994      G   /usr/bin/ksecretd                         1MiB |
|    0   N/A  N/A          189133      G   /usr/bin/kwin_wayland                    80MiB |
|    0   N/A  N/A          189269      G   /usr/bin/Xwayland                         5MiB |
|    0   N/A  N/A          189344      G   /usr/bin/ksmserver                        1MiB |
|    0   N/A  N/A          189346      G   /usr/bin/kded6                            1MiB |
|    0   N/A  N/A          189395      G   /usr/bin/plasmashell                    172MiB |
|    0   N/A  N/A          189474      G   /usr/bin/kaccess                          1MiB |
|    0   N/A  N/A          189477      G   ...it-kde-authentication-agent-1          1MiB |
|    0   N/A  N/A          189657      G   /usr/bin/kwalletd6                        1MiB |
|    0   N/A  N/A          189758      G   /usr/bin/kwalletmanager5                  1MiB |
|    0   N/A  N/A          189834      G   /opt/flemozi/flemozi                      4MiB |
|    0   N/A  N/A          189911      G   /usr/bin/kdeconnectd                      1MiB |
|    0   N/A  N/A          189917      G   /usr/bin/xwaylandvideobridge              1MiB |
|    0   N/A  N/A          189922      G   /usr/bin/yakuake                          1MiB |
|    0   N/A  N/A          189927      G   /usr/bin/qbittorrent                      1MiB |
|    0   N/A  N/A          189985      G   vicinae                                   1MiB |
|    0   N/A  N/A          190031      G   /usr/lib/DiscoverNotifier                 1MiB |
|    0   N/A  N/A          190032      G   /usr/bin/kalendarac                       1MiB |
|    0   N/A  N/A          190033      G   /usr/bin/kgpg                             1MiB |
|    0   N/A  N/A          190164      G   /usr/lib/xdg-desktop-portal-kde           1MiB |
|    0   N/A  N/A          190334      G   /usr/bin/akonadi_control                  1MiB |
|    0   N/A  N/A          190465      G   ...bin/akonadi_archivemail_agent          1MiB |
|    0   N/A  N/A          190468      G   ...konadi_followupreminder_agent          1MiB |
|    0   N/A  N/A          190469      G   /usr/bin/akonadi_google_resource          1MiB |
|    0   N/A  N/A          190473      G   .../akonadi_maildispatcher_agent          1MiB |
|    0   N/A  N/A          190474      G   .../bin/akonadi_mailfilter_agent          1MiB |
|    0   N/A  N/A          190475      G   /usr/bin/akonadi_mailmerge_agent          1MiB |
|    0   N/A  N/A          190479      G   /usr/bin/akonadi_migration_agent          1MiB |
|    0   N/A  N/A          190480      G   ...akonadi_newmailnotifier_agent          1MiB |
|    0   N/A  N/A          190481      G   /usr/bin/akonadi_sendlater_agent          1MiB |
|    0   N/A  N/A          190484      G   .../akonadi_unifiedmailbox_agent          1MiB |
|    0   N/A  N/A          193538      G   /usr/lib/baloorunner                      1MiB |
|    0   N/A  N/A          196733      G   /usr/bin/kalarm                           1MiB |
|    0   N/A  N/A          196742      G   /usr/bin/konsole                          1MiB |
|    0   N/A  N/A          214267      G   /usr/lib/firefox/firefox               1101MiB |
|    0   N/A  N/A          214759      G   ...asma-browser-integration-host          1MiB |
+-----------------------------------------------------------------------------------------+

OS

Linux (EndeavourOS)

GPU

Nvidia

CPU

Intel

Ollama version

0.12.1
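Until the CUDA-side failure is resolved, the CPU-only workaround mentioned above (starting the service with `CUDA_VISIBLE_DEVICES="-1"`) can be made persistent via a systemd drop-in. A sketch, assuming the service unit is named `ollama.service`:

```shell
# Create a drop-in override so the service starts with all GPUs hidden
sudo mkdir -p /etc/systemd/system/ollama.service.d
printf '[Service]\nEnvironment="CUDA_VISIBLE_DEVICES=-1"\n' | \
  sudo tee /etc/systemd/system/ollama.service.d/no-cuda.conf

# Reload unit files and restart the service
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

This forces the CPU path without editing the packaged unit file, and can be reverted by deleting the drop-in and restarting.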

24/09/2025 22:45 ollama llama_model_loader: - kv 0: general.architecture str = gemma2
24/09/2025 22:45 ollama llama_model_loader: - kv 1: general.name str = gemma-2-9b-it
24/09/2025 22:45 ollama llama_model_loader: - kv 2: gemma2.context_length u32 = 8192
24/09/2025 22:45 ollama llama_model_loader: - kv 3: gemma2.embedding_length u32 = 3584
24/09/2025 22:45 ollama llama_model_loader: - kv 4: gemma2.block_count u32 = 42
24/09/2025 22:45 ollama llama_model_loader: - kv 5: gemma2.feed_forward_length u32 = 14336
24/09/2025 22:45 ollama llama_model_loader: - kv 6: gemma2.attention.head_count u32 = 16
24/09/2025 22:45 ollama llama_model_loader: - kv 7: gemma2.attention.head_count_kv u32 = 8
24/09/2025 22:45 ollama llama_model_loader: - kv 8: gemma2.attention.layer_norm_rms_epsilon f32 = 0.000001
24/09/2025 22:45 ollama llama_model_loader: - kv 9: gemma2.attention.key_length u32 = 256
24/09/2025 22:45 ollama llama_model_loader: - kv 10: gemma2.attention.value_length u32 = 256
24/09/2025 22:45 ollama llama_model_loader: - kv 11: general.file_type u32 = 2
24/09/2025 22:45 ollama llama_model_loader: - kv 12: gemma2.attn_logit_softcapping f32 = 50.000000
24/09/2025 22:45 ollama llama_model_loader: - kv 13: gemma2.final_logit_softcapping f32 = 30.000000
24/09/2025 22:45 ollama llama_model_loader: - kv 14: gemma2.attention.sliding_window u32 = 4096
24/09/2025 22:45 ollama llama_model_loader: - kv 15: tokenizer.ggml.model str = llama
24/09/2025 22:45 ollama llama_model_loader: - kv 16: tokenizer.ggml.pre str = default
24/09/2025 22:45 ollama llama_model_loader: - kv 17: tokenizer.ggml.tokens arr[str,256000] = ["<pad>", "<eos>", "<bos>", "<unk>", ...
24/09/2025 22:45 ollama llama_model_loader: - kv 18: tokenizer.ggml.scores arr[f32,256000] = [0.000000, 0.000000, 0.000000, 0.0000...
24/09/2025 22:45 ollama llama_model_loader: - kv 19: tokenizer.ggml.token_type arr[i32,256000] = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
24/09/2025 22:45 ollama llama_model_loader: - kv 20: tokenizer.ggml.bos_token_id u32 = 2
24/09/2025 22:45 ollama llama_model_loader: - kv 21: tokenizer.ggml.eos_token_id u32 = 1
24/09/2025 22:45 ollama llama_model_loader: - kv 22: tokenizer.ggml.unknown_token_id u32 = 3
24/09/2025 22:45 ollama llama_model_loader: - kv 23: tokenizer.ggml.padding_token_id u32 = 0
24/09/2025 22:45 ollama llama_model_loader: - kv 24: tokenizer.ggml.add_bos_token bool = true
24/09/2025 22:45 ollama llama_model_loader: - kv 25: tokenizer.ggml.add_eos_token bool = false
24/09/2025 22:45 ollama llama_model_loader: - kv 26: tokenizer.chat_template str = {{ bos_token }}{% if messages[0]['rol...
24/09/2025 22:45 ollama llama_model_loader: - kv 27: tokenizer.ggml.add_space_prefix bool = false
24/09/2025 22:45 ollama llama_model_loader: - kv 28: general.quantization_version u32 = 2
24/09/2025 22:45 ollama llama_model_loader: - type f32: 169 tensors
24/09/2025 22:45 ollama llama_model_loader: - type q4_0: 294 tensors
24/09/2025 22:45 ollama llama_model_loader: - type q6_K: 1 tensors
24/09/2025 22:45 ollama print_info: file format = GGUF V3 (latest)
24/09/2025 22:45 ollama print_info: file type = Q4_0
24/09/2025 22:45 ollama print_info: file size = 5.06 GiB (4.71 BPW)
24/09/2025 22:45 ollama [GIN] 2025/09/24 - 22:45:41 | 200 | 24.284µs | 127.0.0.1 | GET "/"
24/09/2025 22:45 ollama load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
24/09/2025 22:45 ollama load: printing all EOG tokens:
24/09/2025 22:45 ollama load: - 1 ('<eos>')
24/09/2025 22:45 ollama load: - 107 ('<end_of_turn>')
24/09/2025 22:45 ollama load: special tokens cache size = 108
24/09/2025 22:45 ollama load: token to piece cache size = 1.6014 MB
24/09/2025 22:45 ollama print_info: arch = gemma2
24/09/2025 22:45 ollama print_info: vocab_only = 0
24/09/2025 22:45 ollama print_info: n_ctx_train = 8192
24/09/2025 22:45 ollama print_info: n_embd = 3584
24/09/2025 22:45 ollama print_info: n_layer = 42
24/09/2025 22:45 ollama print_info: n_head = 16
24/09/2025 22:45 ollama print_info: n_head_kv = 8
24/09/2025 22:45 ollama print_info: n_rot = 256
24/09/2025 22:45 ollama print_info: n_swa = 4096
24/09/2025 22:45 ollama print_info: is_swa_any = 1
24/09/2025 22:45 ollama print_info: n_embd_head_k = 256
24/09/2025 22:45 ollama print_info: n_embd_head_v = 256
24/09/2025 22:45 ollama print_info: n_gqa = 2
24/09/2025 22:45 ollama print_info: n_embd_k_gqa = 2048
24/09/2025 22:45 ollama print_info: n_embd_v_gqa = 2048
24/09/2025 22:45 ollama print_info: f_norm_eps = 0.0e+00
24/09/2025 22:45 ollama print_info: f_norm_rms_eps = 1.0e-06
24/09/2025 22:45 ollama print_info: f_clamp_kqv = 0.0e+00
24/09/2025 22:45 ollama print_info: f_max_alibi_bias = 0.0e+00
24/09/2025 22:45 ollama print_info: f_logit_scale = 0.0e+00
24/09/2025 22:45 ollama print_info: f_attn_scale = 6.2e-02
24/09/2025 22:45 ollama print_info: n_ff = 14336
24/09/2025 22:45 ollama print_info: n_expert = 0
24/09/2025 22:45 ollama print_info: n_expert_used = 0
24/09/2025 22:45 ollama print_info: causal attn = 1
24/09/2025 22:45 ollama print_info: pooling type = 0
24/09/2025 22:45 ollama print_info: rope type = 2
24/09/2025 22:45 ollama print_info: rope scaling = linear
24/09/2025 22:45 ollama print_info: freq_base_train = 10000.0
24/09/2025 22:45 ollama print_info: freq_scale_train = 1
24/09/2025 22:45 ollama print_info: n_ctx_orig_yarn = 8192
24/09/2025 22:45 ollama print_info: rope_finetuned = unknown
24/09/2025 22:45 ollama print_info: model type = 9B
24/09/2025 22:45 ollama print_info: model params = 9.24 B
24/09/2025 22:45 ollama print_info: general.name = gemma-2-9b-it
24/09/2025 22:45 ollama print_info: vocab type = SPM
24/09/2025 22:45 ollama print_info: n_vocab = 256000
24/09/2025 22:45 ollama print_info: n_merges = 0
24/09/2025 22:45 ollama print_info: BOS token = 2 '<bos>'
24/09/2025 22:45 ollama print_info: EOS token = 1 '<eos>'
24/09/2025 22:45 ollama print_info: EOT token = 107 '<end_of_turn>'
24/09/2025 22:45 ollama print_info: UNK token = 3 '<unk>'
24/09/2025 22:45 ollama print_info: PAD token = 0 '<pad>'
24/09/2025 22:45 ollama print_info: LF token = 227 '<0x0A>'
24/09/2025 22:45 ollama print_info: EOG token = 1 '<eos>'
24/09/2025 22:45 ollama print_info: EOG token = 107 '<end_of_turn>'
24/09/2025 22:45 ollama print_info: max token length = 93
24/09/2025 22:45 ollama load_tensors: loading model tensors, this can take a while... (mmap = true)
24/09/2025 22:45 ollama load_tensors: offloading 20 repeating layers to GPU
24/09/2025 22:45 ollama load_tensors: offloaded 20/43 layers to GPU
24/09/2025 22:45 ollama load_tensors: CUDA0 model buffer size = 2127.34 MiB
24/09/2025 22:45 ollama load_tensors: CPU_Mapped model buffer size = 5185.21 MiB
24/09/2025 22:45 ollama llama_context: constructing llama_context
24/09/2025 22:45 ollama llama_context: n_seq_max = 1
24/09/2025 22:45 ollama llama_context: n_ctx = 4096
24/09/2025 22:45 ollama llama_context: n_ctx_per_seq = 4096
24/09/2025 22:45 ollama llama_context: n_batch = 512
24/09/2025 22:45 ollama llama_context: n_ubatch = 512
24/09/2025 22:45 ollama llama_context: causal_attn = 1
24/09/2025 22:45 ollama llama_context: flash_attn = 0
24/09/2025 22:45 ollama llama_context: kv_unified = false
24/09/2025 22:45 ollama llama_context: freq_base = 10000.0
24/09/2025 22:45 ollama llama_context: freq_scale = 1
24/09/2025 22:45 ollama llama_context: n_ctx_per_seq (4096) < n_ctx_train (8192) -- the full capacity of the model will not be utilized
24/09/2025 22:45 ollama llama_context: CPU output buffer size = 0.99 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified_iswa: using full-size SWA cache (ref: https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
24/09/2025 22:45 ollama llama_kv_cache_unified_iswa: creating non-SWA KV cache, size = 4096 cells
24/09/2025 22:45 ollama llama_kv_cache_unified: CUDA0 KV buffer size = 320.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: CPU KV buffer size = 352.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: size = 672.00 MiB ( 4096 cells, 21 layers, 1/1 seqs), K (f16): 336.00 MiB, V (f16): 336.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified_iswa: creating SWA KV cache, size = 4096 cells
24/09/2025 22:45 ollama llama_kv_cache_unified: CUDA0 KV buffer size = 320.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: CPU KV buffer size = 352.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: size = 672.00 MiB ( 4096 cells, 21 layers, 1/1 seqs), K (f16): 336.00 MiB, V (f16): 336.00 MiB
24/09/2025 22:45 ollama llama_context: CUDA0 compute buffer size = 1224.77 MiB
24/09/2025 22:45 ollama llama_context: CUDA_Host compute buffer size = 40.01 MiB
24/09/2025 22:45 ollama llama_context: graph nodes = 1816
24/09/2025 22:45 ollama llama_context: graph splits = 290 (with bs=512), 3 (with bs=1)
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.653-03:00 level=INFO source=sched.go:470 msg="loaded runners" count=1
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1251 msg="waiting for llama runner to start responding"
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.654-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45 ollama ggml_cuda_compute_forward: ADD failed
24/09/2025 22:45 ollama CUDA error: no kernel image is available for execution on the device
24/09/2025 22:45 ollama current device: 0, in function ggml_cuda_compute_forward at /build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:2568
24/09/2025 22:45 ollama err
24/09/2025 22:45 ollama /build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:84: CUDA error
24/09/2025 22:45 ollama [New LWP 310914]
24/09/2025 22:45 ollama [New LWP 310913]
24/09/2025 22:45 ollama [New LWP 310912]
24/09/2025 22:45 ollama [New LWP 310910]
24/09/2025 22:45 ollama [New LWP 310909]
24/09/2025 22:45 ollama [New LWP 310908]
24/09/2025 22:45 ollama [New LWP 310907]
24/09/2025 22:45 ollama [New LWP 310906]
24/09/2025 22:45 ollama [New LWP 310905]
24/09/2025 22:45 ollama [New LWP 310904]
24/09/2025 22:45 ollama [New LWP 310903]
24/09/2025 22:45 ollama [New LWP 310902]
24/09/2025 22:45 ollama [Thread debugging using libthread_db enabled]
24/09/2025 22:45 ollama Using host libthread_db library "/usr/lib/libthread_db.so.1".
24/09/2025 22:45 ollama 0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #0 0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #1 0x00007f59d8e931ac in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #2 0x00007f59d8e931f4 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #3 0x00007f59d8f03dcf in wait4 () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #4 0x00007f599032a5bd in ggml_print_backtrace () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45 ollama #5 0x00007f599032a763 in ggml_abort () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45 ollama #6 0x00007f597b75c381 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45 ollama #7 0x00007f597b76af02 in ?? () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45 ollama #8 0x00005623b8da30ed in ?? ()
24/09/2025 22:45 ollama #9 0x00005623b8e19c12 in ?? ()
24/09/2025 22:45 ollama #10 0x00005623b8e1afa3 in ?? ()
24/09/2025 22:45 ollama #11 0x00005623b8e1e84a in ?? ()
24/09/2025 22:45 ollama #12 0x00005623b8e1f916 in ?? ()
24/09/2025 22:45 ollama #13 0x00005623b8d5bc50 in ?? ()
24/09/2025 22:45 ollama #14 0x00005623b80743a1 in ?? ()
24/09/2025 22:45 ollama #15 0x0000000000000498 in ?? ()
24/09/2025 22:45 ollama #16 0x000000c000103180 in ?? ()
24/09/2025 22:45 ollama #17 0x00005623b807285a in ?? ()
24/09/2025 22:45 ollama #18 0x00005623b80771e5 in ?? ()
24/09/2025 22:45 ollama #19 0x00007fffc9fc4818 in ?? ()
24/09/2025 22:45 ollama #20 0x00005623b80771e5 in ?? ()
24/09/2025 22:45 ollama #21 0x00005623b9da4260 in ?? ()
24/09/2025 22:45 ollama #22 0x00007fffc9fc48f0 in ?? ()
24/09/2025 22:45 ollama #23 0x00005623b8072645 in ?? ()
24/09/2025 22:45 ollama #24 0x00005623b80725d3 in ?? ()
24/09/2025 22:45 ollama #25 0x0000000000000006 in ?? ()
24/09/2025 22:45 ollama #26 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45 ollama #27 0x0000000000000006 in ?? ()
24/09/2025 22:45 ollama #28 0x0000000000000006 in ?? ()
24/09/2025 22:45 ollama #29 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45 ollama #30 0x00007f59d8e27675 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama Backtrace stopped: previous frame inner to this frame (corrupt stack?)
24/09/2025 22:45 ollama [Inferior 1 (process 310900) detached]
24/09/2025 22:45 ollama SIGABRT: abort
24/09/2025 22:45 ollama PC=0x7f59d8e9894c m=0 sigcode=18446744073709551610
24/09/2025 22:45 ollama signal arrived during cgo execution
24/09/2025 22:45 ollama goroutine 11 gp=0xc000103180 m=0 mp=0x5623b9da6080 [syscall]:
24/09/2025 22:45 ollama runtime.cgocall(0x5623b8d5bc00, 0xc000389bd8)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/cgocall.go:167 +0x4b fp=0xc000389bb0 sp=0xc000389b78 pc=0x5623b806906b
24/09/2025 22:45 ollama github.com/ollama/ollama/llama._Cfunc_llama_decode(0x5623e40f5060, {0xf, 0x5623e40fac20, 0x0, 0x5623e41a39a0, 0x5623e6c89a70, 0x5623e6c8a280, 0x5623e6c8d610})
24/09/2025 22:45 ollama _cgo_gotypes.go:672 +0x4a fp=0xc000389bd8 sp=0xc000389bb0 pc=0x5623b841ea6a
24/09/2025 22:45 ollama github.com/ollama/ollama/llama.(*Context).Decode.func1(...)
24/09/2025 22:45 ollama /build/ollama/src/ollama/llama/llama.go:150
24/09/2025 22:45 ollama github.com/ollama/ollama/llama.(*Context).Decode(0xc00050dd88?, 0x1?)
24/09/2025 22:45 ollama /build/ollama/src/ollama/llama/llama.go:150 +0xed fp=0xc000389cc0 sp=0xc000389bd8 pc=0x5623b842184d
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0002e94a0, 0xc0007120f0, 0xc00050df28)
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:441 +0x209 fp=0xc000389ee8 sp=0xc000389cc0 pc=0x5623b84ec309
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0002e94a0, {0x5623b94e0570, 0xc000123a40})
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:346 +0x1d5 fp=0xc000389fb8 sp=0xc000389ee8 pc=0x5623b84ebf95
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x28 fp=0xc000389fe0 sp=0xc000389fb8 pc=0x5623b84f0ce8
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000389fe8 sp=0xc000389fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x4c5
24/09/2025 22:45 ollama goroutine 1 gp=0xc000002380 m=nil [IO wait]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000597790 sp=0xc000597770 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.netpollblock(0xc0005977e0?, 0xb80013a6?, 0x23?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc0005977c8 sp=0xc000597790 pc=0x5623b80301d7
24/09/2025 22:45 ollama internal/poll.runtime_pollWait(0x7f59d94fb400, 0x72)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc0005977e8 sp=0xc0005977c8 pc=0x5623b806b6c5
24/09/2025 22:45 ollama internal/poll.(*pollDesc).wait(0xc0005b1400?, 0x900000036?, 0x0)
24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000597810 sp=0xc0005977e8 pc=0x5623b80f4707
24/09/2025 22:45 ollama internal/poll.(*pollDesc).waitRead(...)
24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:89
24/09/2025 22:45 ollama internal/poll.(*FD).Accept(0xc0005b1400)
24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_unix.go:613 +0x28c fp=0xc0005978b8 sp=0xc000597810 pc=0x5623b80f9b2c
24/09/2025 22:45 ollama net.(*netFD).accept(0xc0005b1400)
24/09/2025 22:45 ollama /usr/lib/go/src/net/fd_unix.go:161 +0x29 fp=0xc000597970 sp=0xc0005978b8 pc=0x5623b8164089
24/09/2025 22:45 ollama net.(*TCPListener).accept(0xc00012fd80)
24/09/2025 22:45 ollama /usr/lib/go/src/net/tcpsock_posix.go:159 +0x1b fp=0xc0005979c0 sp=0xc000597970 pc=0x5623b81797bb
24/09/2025 22:45 ollama net.(*TCPListener).Accept(0xc00012fd80)
24/09/2025 22:45 ollama /usr/lib/go/src/net/tcpsock.go:380 +0x30 fp=0xc0005979f0 sp=0xc0005979c0 pc=0x5623b8178650
24/09/2025 22:45 ollama net/http.(*onceCloseListener).Accept(0xc0004d03f0?)
24/09/2025 22:45 ollama <autogenerated>:1 +0x24 fp=0xc000597a08 sp=0xc0005979f0 pc=0x5623b8399ea4
24/09/2025 22:45 ollama net/http.(*Server).Serve(0xc0001f1500, {0x5623b94ddfc8, 0xc00012fd80})
24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3463 +0x30c fp=0xc000597b38 sp=0xc000597a08 pc=0x5623b837188c
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034260, 0x4, 0x4})
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:901 +0x8f4 fp=0xc000597d08 sp=0xc000597b38 pc=0x5623b84f0a74
24/09/2025 22:45 ollama github.com/ollama/ollama/runner.Execute({0xc000034250?, 0x0?, 0x0?})
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/runner.go:22 +0xd4 fp=0xc000597d30 sp=0xc000597d08 pc=0x5623b8583414
24/09/2025 22:45 ollama github.com/ollama/ollama/cmd.NewCLI.func2(0xc0001f1200?, {0x5623b8fec2d0?, 0x4?, 0x5623b8fec2d4?})
24/09/2025 22:45 ollama /build/ollama/src/ollama/cmd/cmd.go:1706 +0x45 fp=0xc000597d58 sp=0xc000597d30 pc=0x5623b8ced5c5
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).execute(0xc0004d3508, {0xc00012fb80, 0x4, 0x4})
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x88a fp=0xc000597e78 sp=0xc000597d58 pc=0x5623b81dd70a
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).ExecuteC(0xc0005c6f08)
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x398 fp=0xc000597f30 sp=0xc000597e78 pc=0x5623b81ddf38
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).Execute(...)
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).ExecuteContext(...)
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
24/09/2025 22:45 ollama main.main()
24/09/2025 22:45 ollama /build/ollama/src/ollama/main.go:12 +0x4d fp=0xc000597f50 sp=0xc000597f30 pc=0x5623b8cee08d
24/09/2025 22:45 ollama runtime.main()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:285 +0x29d fp=0xc000597fe0 sp=0xc000597f50 pc=0x5623b8037a7d
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000597fe8 sp=0xc000597fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006cfa8 sp=0xc00006cf88 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.forcegchelper()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:373 +0xb8 fp=0xc00006cfe0 sp=0xc00006cfa8 pc=0x5623b8037db8
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006cfe8 sp=0xc00006cfe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.init.7 in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:361 +0x1a
24/09/2025 22:45 ollama goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
24/09/2025 22:45 ollama runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006d780 sp=0xc00006d760 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.bgsweep(0xc000098000)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgcsweep.go:323 +0xdf fp=0xc00006d7c8 sp=0xc00006d780 pc=0x5623b8021adf
24/09/2025 22:45 ollama runtime.gcenable.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:212 +0x25 fp=0xc00006d7e0 sp=0xc00006d7c8 pc=0x5623b8015a65
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006d7e8 sp=0xc00006d7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcenable in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:212 +0x66
24/09/2025 22:45 ollama goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
24/09/2025 22:45 ollama runtime.gopark(0x10000?, 0x5623b91b3748?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006df78 sp=0xc00006df58 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.(*scavengerState).park(0x5623b9da3100)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00006dfa8 sp=0xc00006df78 pc=0x5623b801f549
24/09/2025 22:45 ollama runtime.bgscavenge(0xc000098000)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc00006dfc8 sp=0xc00006dfa8 pc=0x5623b801faf9
24/09/2025 22:45 ollama runtime.gcenable.gowrap2()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:213 +0x25 fp=0xc00006dfe0 sp=0xc00006dfc8 pc=0x5623b8015a05
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006dfe8 sp=0xc00006dfe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcenable in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:213 +0xa5
24/09/2025 22:45 ollama goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait]:
24/09/2025 22:45 ollama runtime.gopark(0x5623b8046d57?, 0x5623b800d385?, 0xb8?, 0x1?, 0xc000002380?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006c620 sp=0xc00006c600 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.runFinalizers()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mfinal.go:210 +0x107 fp=0xc00006c7e0 sp=0xc00006c620 pc=0x5623b8014967
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006c7e8 sp=0xc00006c7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.createfing in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mfinal.go:172 +0x3d
24/09/2025 22:45 ollama goroutine 6 gp=0xc0001ce8c0 m=nil [cleanup wait]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006e768 sp=0xc00006e748 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.(*cleanupQueue).dequeue(0x5623b9da3a60)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mcleanup.go:439 +0xc5 fp=0xc00006e7a0 sp=0xc00006e768 pc=0x5623b8011b45
24/09/2025 22:45 ollama runtime.runCleanups()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mcleanup.go:635 +0x45 fp=0xc00006e7e0 sp=0xc00006e7a0 pc=0x5623b8012205
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.(*cleanupQueue).createGs in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mcleanup.go:589 +0xa5
24/09/2025 22:45 ollama goroutine 7 gp=0xc0001cefc0 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ef38 sp=0xc00006ef18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006efc8 sp=0xc00006ef38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 8 gp=0xc0001cf180 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006f7c8 sp=0xc00006f738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 9 gp=0xc0001cf340 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649ae95?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ff38 sp=0xc00006ff18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006ffc8 sp=0xc00006ff38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006ffe0 sp=0xc00006ffc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 10 gp=0xc0001cf500 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649b2fb?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000068738 sp=0xc000068718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0000687c8 sp=0xc000068738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0000687e0 sp=0xc0000687c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0000687e8 sp=0xc0000687e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 18 gp=0xc000504000 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207648e360?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00050a7c8 sp=0xc00050a738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 34 gp=0xc000102380 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x320764a388d?, 0x3?, 0xab?, 0x12?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000083fc8 sp=0xc000083f38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 35 gp=0xc000102540 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649b52d?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000506f38 sp=0xc000506f18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000506fc8 sp=0xc000506f38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000506fe0 sp=0xc000506fc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000506fe8 sp=0xc000506fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 36 gp=0xc000102700 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649d167?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000507738 sp=0xc000507718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0005077c8 sp=0xc000507738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0005077e0 sp=0xc0005077c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0005077e8 sp=0xc0005077e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 12 gp=0xc000103340 m=nil [select]:
24/09/2025 22:45 ollama runtime.gopark(0xc000047a70?, 0x2?, 0x78?, 0x77?, 0xc0000478bc?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc0000476e8 sp=0xc0000476c8 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.selectgo(0xc000047a70, 0xc0000478b8, 0xf?, 0x0, 0x1?, 0x1)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/select.go:351 +0x8c5 fp=0xc000047828 sp=0xc0000476e8 pc=0x5623b804a685
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0002e94a0, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140)
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:629 +0xb30 fp=0xc000047ab8 sp=0xc000047828 pc=0x5623b84edf10
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b38?)
24/09/2025 22:45 ollama <autogenerated>:1 +0x36 fp=0xc000047ae8 sp=0xc000047ab8 pc=0x5623b84f10f6
24/09/2025 22:45 ollama net/http.HandlerFunc.ServeHTTP(0xc0005caf00?, {0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b58?)
24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:2322 +0x29 fp=0xc000047b10 sp=0xc000047ae8 pc=0x5623b836dec9 24/09/2025 22:45 ollama net/http.(*ServeMux).ServeHTTP(0x5623b800d385?, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:2861 +0x1c7 fp=0xc000047b60 sp=0xc000047b10 pc=0x5623b836fda7 24/09/2025 22:45 ollama net/http.serverHandler.ServeHTTP({0x5623b94dad30?}, {0x5623b94de1a8?, 0xc0003040f0?}, 0x1?) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3340 +0x8e fp=0xc000047b90 sp=0xc000047b60 pc=0x5623b838d68e 24/09/2025 22:45 ollama net/http.(*conn).serve(0xc0004d03f0, {0x5623b94e0538, 0xc0004c71d0}) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:2109 +0x665 fp=0xc000047fb8 sp=0xc000047b90 pc=0x5623b836bfc5 24/09/2025 22:45 ollama net/http.(*Server).Serve.gowrap3() 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3493 +0x28 fp=0xc000047fe0 sp=0xc000047fb8 pc=0x5623b8371c88 24/09/2025 22:45 ollama runtime.goexit({}) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000047fe8 sp=0xc000047fe0 pc=0x5623b8074701 24/09/2025 22:45 ollama created by net/http.(*Server).Serve in goroutine 1 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3493 +0x485 24/09/2025 22:45 ollama goroutine 20 gp=0xc000504380 m=nil [IO wait]: 24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000509dd8 sp=0xc000509db8 pc=0x5623b806c4ee 24/09/2025 22:45 ollama runtime.netpollblock(0x5623b8090b98?, 0xb80013a6?, 0x23?) 
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc000509e10 sp=0xc000509dd8 pc=0x5623b80301d7 24/09/2025 22:45 ollama internal/poll.runtime_pollWait(0x7f59d94fb200, 0x72) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc000509e30 sp=0xc000509e10 pc=0x5623b806b6c5 24/09/2025 22:45 ollama internal/poll.(*pollDesc).wait(0xc0005b1480?, 0xc00012fde1?, 0x0) 24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000509e58 sp=0xc000509e30 pc=0x5623b80f4707 24/09/2025 22:45 ollama internal/poll.(*pollDesc).waitRead(...) 24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:89 24/09/2025 22:45 ollama internal/poll.(*FD).Read(0xc0005b1480, {0xc00012fde1, 0x1, 0x1}) 24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_unix.go:165 +0x279 fp=0xc000509ef0 sp=0xc000509e58 pc=0x5623b80f59f9 24/09/2025 22:45 ollama net.(*netFD).Read(0xc0005b1480, {0xc00012fde1?, 0x5623b9ce5ec0?, 0xc000509f70?}) 24/09/2025 22:45 ollama /usr/lib/go/src/net/fd_posix.go:68 +0x25 fp=0xc000509f38 sp=0xc000509ef0 pc=0x5623b81621e5 24/09/2025 22:45 ollama net.(*conn).Read(0xc00011c3c0, {0xc00012fde1?, 0x0?, 0x0?}) 24/09/2025 22:45 ollama /usr/lib/go/src/net/net.go:196 +0x45 fp=0xc000509f80 sp=0xc000509f38 pc=0x5623b8170205 24/09/2025 22:45 ollama net/http.(*connReader).backgroundRead(0xc00012fdc0) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:702 +0x33 fp=0xc000509fc8 sp=0xc000509f80 pc=0x5623b8366473 24/09/2025 22:45 ollama net/http.(*connReader).startBackgroundRead.gowrap2() 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:698 +0x25 fp=0xc000509fe0 sp=0xc000509fc8 pc=0x5623b83663a5 24/09/2025 22:45 ollama runtime.goexit({}) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000509fe8 sp=0xc000509fe0 pc=0x5623b8074701 24/09/2025 22:45 ollama created by net/http.(*connReader).startBackgroundRead in goroutine 12 24/09/2025 22:45 ollama 
/usr/lib/go/src/net/http/server.go:698 +0xb6 24/09/2025 22:45 ollama rax 0x0 24/09/2025 22:45 ollama rbx 0x4be74 24/09/2025 22:45 ollama rcx 0x7f59d8e9894c 24/09/2025 22:45 ollama rdx 0x6 24/09/2025 22:45 ollama rdi 0x4be74 24/09/2025 22:45 ollama rsi 0x4be74 24/09/2025 22:45 ollama rbp 0x7fffc9fbfaa0 24/09/2025 22:45 ollama rsp 0x7fffc9fbfa60 24/09/2025 22:45 ollama r8 0x0 24/09/2025 22:45 ollama r9 0x0 24/09/2025 22:45 ollama r10 0x0 24/09/2025 22:45 ollama r11 0x246 24/09/2025 22:45 ollama r12 0x7f597bc22729 24/09/2025 22:45 ollama r13 0x54 24/09/2025 22:45 ollama r14 0x6 24/09/2025 22:45 ollama r15 0x0 24/09/2025 22:45 ollama rip 0x7f59d8e9894c 24/09/2025 22:45 ollama rflags 0x246 24/09/2025 22:45 ollama cs 0x33 24/09/2025 22:45 ollama fs 0x0 24/09/2025 22:45 ollama gs 0x0 24/09/2025 22:45 ollama time=2025-09-24T22:45:44.225-03:00 level=ERROR source=server.go:1459 msg="post predict" error="Post \"http://127.0.0.1:37197/completion\": EOF" 24/09/2025 22:45 ollama [GIN] 2025/09/24 - 22:45:44 | 200 | 3.564513467s | 127.0.0.1 | POST "/api/generate" 24/09/2025 22:45 ollama time=2025-09-24T22:45:44.253-03:00 level=ERROR source=server.go:425 msg="llama runner terminated" error="exit status 2" ``` #### CUDA_ERR: ```shell [23:59:10.902][140663025804992][CUDA][E] No CUDA context is current to the calling thread [23:59:10.902][140663025804992][CUDA][E] Returning 201 (CUDA_ERROR_INVALID_CONTEXT) from cuCtxGetDevice_v2 [23:59:13.181][140663025804992][CUDA][E] Error handling fatbinary, to get more information when using CUDA Driver APIs use the CU_JIT_ERROR_LOG_BUFFER and CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES parameters [23:59:13.182][140663025804992][CUDA][E] No available relocatable PTX entries for GPU [23:59:13.182][140663025804992][CUDA][E] No device code available for GPU ISA 61 [23:59:13.182][140663025804992][CUDA][E] Kernel (_Z11k_bin_bcastIXadL_ZN42_INTERNAL_f88bb2be_11_binbcast_cu_6840010b6op_addEffEEfffEvPKT0_PKT1_PT2_iiiiiiiiiiiiiiiii) cannot be found in library due 
to compilation error, to get more information when using C[23:59:13.182][140663025804992][CUDA][E] Returning 209 (CUDA_ERROR_NO_BINARY_FOR_GPU) from cuLibraryGetKernel ``` #### NVIDIA-SMI: ```shell Thu Sep 25 00:19:58 2025 +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 580.82.09 Driver Version: 580.82.09 CUDA Version: 13.0 | +-----------------------------------------+------------------------+----------------------+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |=========================================+========================+======================| | 0 NVIDIA GeForce GTX 1060 6GB On | 00000000:01:00.0 On | N/A | | 0% 50C P2 27W / 180W | 1499MiB / 6144MiB | 0% Default | | | | N/A | +-----------------------------------------+------------------------+----------------------+ +-----------------------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=========================================================================================| | 0 N/A N/A 188994 G /usr/bin/ksecretd 1MiB | | 0 N/A N/A 189133 G /usr/bin/kwin_wayland 80MiB | | 0 N/A N/A 189269 G /usr/bin/Xwayland 5MiB | | 0 N/A N/A 189344 G /usr/bin/ksmserver 1MiB | | 0 N/A N/A 189346 G /usr/bin/kded6 1MiB | | 0 N/A N/A 189395 G /usr/bin/plasmashell 172MiB | | 0 N/A N/A 189474 G /usr/bin/kaccess 1MiB | | 0 N/A N/A 189477 G ...it-kde-authentication-agent-1 1MiB | | 0 N/A N/A 189657 G /usr/bin/kwalletd6 1MiB | | 0 N/A N/A 189758 G /usr/bin/kwalletmanager5 1MiB | | 0 N/A N/A 189834 G /opt/flemozi/flemozi 4MiB | | 0 N/A N/A 189911 G /usr/bin/kdeconnectd 1MiB | | 0 N/A N/A 189917 G /usr/bin/xwaylandvideobridge 1MiB | | 0 N/A N/A 189922 G /usr/bin/yakuake 1MiB | | 0 N/A N/A 189927 G /usr/bin/qbittorrent 1MiB | | 0 N/A N/A 189985 G vicinae 1MiB | | 0 N/A N/A 
190031 G /usr/lib/DiscoverNotifier 1MiB | | 0 N/A N/A 190032 G /usr/bin/kalendarac 1MiB | | 0 N/A N/A 190033 G /usr/bin/kgpg 1MiB | | 0 N/A N/A 190164 G /usr/lib/xdg-desktop-portal-kde 1MiB | | 0 N/A N/A 190334 G /usr/bin/akonadi_control 1MiB | | 0 N/A N/A 190465 G ...bin/akonadi_archivemail_agent 1MiB | | 0 N/A N/A 190468 G ...konadi_followupreminder_agent 1MiB | | 0 N/A N/A 190469 G /usr/bin/akonadi_google_resource 1MiB | | 0 N/A N/A 190473 G .../akonadi_maildispatcher_agent 1MiB | | 0 N/A N/A 190474 G .../bin/akonadi_mailfilter_agent 1MiB | | 0 N/A N/A 190475 G /usr/bin/akonadi_mailmerge_agent 1MiB | | 0 N/A N/A 190479 G /usr/bin/akonadi_migration_agent 1MiB | | 0 N/A N/A 190480 G ...akonadi_newmailnotifier_agent 1MiB | | 0 N/A N/A 190481 G /usr/bin/akonadi_sendlater_agent 1MiB | | 0 N/A N/A 190484 G .../akonadi_unifiedmailbox_agent 1MiB | | 0 N/A N/A 193538 G /usr/lib/baloorunner 1MiB | | 0 N/A N/A 196733 G /usr/bin/kalarm 1MiB | | 0 N/A N/A 196742 G /usr/bin/konsole 1MiB | | 0 N/A N/A 214267 G /usr/lib/firefox/firefox 1101MiB | | 0 N/A N/A 214759 G ...asma-browser-integration-host 1MiB | +-----------------------------------------------------------------------------------------+ ``` ### OS Linux (EndeavourOS) ### GPU Nvidia ### CPU Intel ### Ollama version 0.12.1
GiteaMirror added the "needs more info" and "bug" labels 2026-04-22 17:13:25 -05:00

@rick-github commented on GitHub (Sep 25, 2025):

Was ollama installed from an Arch repo, or through the [official method](https://ollama.com/download)?


@Dominiquini commented on GitHub (Sep 25, 2025):

> Was ollama installed from an Arch repo, or through the official method?

I installed it from the Arch repo. I'll try installing through the official method and check whether the same error happens!
But I suspect the problem is that CUDA 13 (updated from the Arch repos) dropped support for my GPU (NVIDIA GeForce GTX 1060)!

Thanks.
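(Editor's note: the "No device code available for GPU ISA 61" line in the log above fits this theory; ISA 61 is Pascal's compute capability 6.1. Below is a minimal sketch of that check. The helper and its cutoff table are hypothetical, based on NVIDIA's release notes as I understand them, where CUDA 12 dropped Kepler and CUDA 13 dropped Maxwell/Pascal/Volta; verify the numbers against the notes for your exact toolkit version.)

```python
# Hypothetical helper: minimum compute capability (sm_XX) each CUDA toolkit
# major version still ships device code for. Assumed values, not authoritative:
#   CUDA 11 -> sm_35 (Kepler), CUDA 12 -> sm_50 (Maxwell), CUDA 13 -> sm_75 (Turing)
MIN_COMPUTE_CAP = {11: 35, 12: 50, 13: 75}

def toolkit_supports(cuda_major: int, compute_cap: int) -> bool:
    """True if the given toolkit can still emit kernels for this GPU ISA."""
    return compute_cap >= MIN_COMPUTE_CAP[cuda_major]

# GTX 1060 (Pascal) is compute capability 6.1, i.e. "GPU ISA 61" in the log.
print(toolkit_supports(12, 61))  # True: CUDA 12 still targets Pascal
print(toolkit_supports(13, 61))  # False: matches CUDA_ERROR_NO_BINARY_FOR_GPU
```

On a live system, recent drivers can report the card's compute capability with `nvidia-smi --query-gpu=compute_cap --format=csv` for comparison.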


@omnigenous commented on GitHub (Sep 25, 2025):

@Dominiquini same issue, did you find a solution for CUDA on Pascal cards on Arch?


@Dominiquini commented on GitHub (Sep 25, 2025):

> @Dominiquini same issue, did you find solution for CUDA on pascal cards on Arch?

Not yet! I opened this issue on the Arch repo: https://gitlab.archlinux.org/archlinux/packaging/packages/ollama/-/issues/26

At least I was able to run it locally, using my GPU (with CUDA), by taking the binaries from GitHub (https://github.com/ollama/ollama/releases/download/v0.12.2/ollama-linux-amd64.tgz) and using the libs from the folder `./lib/ollama/cuda_v12/`! Even though my machine doesn't have a working CUDA 12 toolkit (CUDA 13 is installed!), these libs work fine!


@derfehler commented on GitHub (Sep 29, 2025):

TL;DR: Everyone using Maxwell, Pascal, and Volta architectures is doomed. All rolling distributions that migrated to CUDA 13 have dropped support — what else could one expect from a trillion-dollar corpo?


@Dominiquini commented on GitHub (Sep 29, 2025):

I'm closing this bug as there is no action that can be taken!

Note: For Arch, I created two packages in the AUR that allow this program to be used on Pascal boards:

- `ollama-bin`
- `ollama-cuda12-bin`
Reference: github-starred/ollama#34001