[GH-ISSUE #12408] CUDA error: no kernel image is available for execution on the device #34001

Closed
opened 2026-04-22 17:13:24 -05:00 by GiteaMirror · 6 comments
Owner

Originally created by @Dominiquini on GitHub (Sep 25, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/12408

What is the issue?

Ollama doesn't work when I try to run it on my GPU with CUDA. If I disable CUDA (starting the service with the environment variable `CUDA_VISIBLE_DEVICES="-1"`), everything works fine!
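For reference, the workaround above can be made persistent on a systemd install with a drop-in override (a sketch, assuming the standard `ollama.service` unit name; create it with `sudo systemctl edit ollama.service`, then `sudo systemctl restart ollama.service`):

```ini
[Service]
Environment="CUDA_VISIBLE_DEVICES=-1"
```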

Everything seemed to be working fine a few weeks ago! I think the problem started when I updated CUDA to version 13.0.1, as I tested several previous versions of Ollama and the same error persists! Even versions that I know worked in the past...
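This failure mode is consistent with a compute-capability mismatch: the runner log below prints the CUDA architectures the backend was compiled for (`CUDA.0.ARCHS=750,...`), and the GTX 1060 reports compute capability 6.1. A minimal sketch of the comparison (the 6.1 → 610 arch-number mapping is an assumption based on NVIDIA's usual `sm_XX` numbering):

```shell
# Compare the GPU's architecture number against the minimum architecture the
# CUDA backend was compiled for. If the GPU is below the minimum, no kernel
# image exists for it, and any CUDA op aborts with the error in this report.
archs="750,800,860,870,880,890,900,1000,1030,1100,1200,1210"  # CUDA.0.ARCHS from the log below
gpu_arch=610                                                  # GTX 1060: compute capability 6.1
min_arch=$(printf '%s\n' "$archs" | tr ',' '\n' | sort -n | head -n1)
if [ "$gpu_arch" -lt "$min_arch" ]; then
  echo "GPU arch $gpu_arch is below minimum compiled arch $min_arch"
fi
```

If that holds, the CUDA 13 toolchain dropping pre-Turing architectures would explain why even old Ollama versions now fail after the CUDA update.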

Relevant log output

JOURNALCTL:

24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.610-03:00 level=INFO source=routes.go:1475 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/lib/ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.614-03:00 level=INFO source=images.go:518 msg="total blobs: 60"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.615-03:00 level=INFO source=images.go:525 msg="total unused blobs removed: 0"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.615-03:00 level=INFO source=routes.go:1528 msg="Listening on 127.0.0.1:11434 (version 0.12.1)"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.615-03:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.768-03:00 level=INFO source=types.go:131 msg="inference compute" id=GPU-b5634b9b-f606-83eb-86c6-99d9398d729f library=cuda variant=v12 compute=6.1 driver=13.0 name="NVIDIA GeForce GTX 1060 6GB" total="5.9 GiB" available="4.5 GiB"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:37.768-03:00 level=INFO source=routes.go:1569 msg="entering low vram mode" "total vram"="5.9 GiB" threshold="20.0 GiB"
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:40 | 200 |      37.167µs |       127.0.0.1 | HEAD     "/"
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:40 | 200 |   67.780023ms |       127.0.0.1 | POST     "/api/show"
24/09/2025 22:45	ollama	llama_model_loader: loaded meta data with 29 key-value pairs and 464 tensors from /var/lib/ollama/blobs/sha256-ff1d1fc78170d787ee1201778e2dd65ea211654ca5fb7d69b5a2e7b123a50373 (version GGUF V3 (latest))
24/09/2025 22:45	ollama	llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
24/09/2025 22:45	ollama	llama_model_loader: - kv   0:                       general.architecture str              = gemma2
24/09/2025 22:45	ollama	llama_model_loader: - kv   1:                               general.name str              = gemma-2-9b-it
24/09/2025 22:45	ollama	llama_model_loader: - kv   2:                      gemma2.context_length u32              = 8192
24/09/2025 22:45	ollama	llama_model_loader: - kv   3:                    gemma2.embedding_length u32              = 3584
24/09/2025 22:45	ollama	llama_model_loader: - kv   4:                         gemma2.block_count u32              = 42
24/09/2025 22:45	ollama	llama_model_loader: - kv   5:                 gemma2.feed_forward_length u32              = 14336
24/09/2025 22:45	ollama	llama_model_loader: - kv   6:                gemma2.attention.head_count u32              = 16
24/09/2025 22:45	ollama	llama_model_loader: - kv   7:             gemma2.attention.head_count_kv u32              = 8
24/09/2025 22:45	ollama	llama_model_loader: - kv   8:    gemma2.attention.layer_norm_rms_epsilon f32              = 0.000001
24/09/2025 22:45	ollama	llama_model_loader: - kv   9:                gemma2.attention.key_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  10:              gemma2.attention.value_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  11:                          general.file_type u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  12:              gemma2.attn_logit_softcapping f32              = 50.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  13:             gemma2.final_logit_softcapping f32              = 30.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  14:            gemma2.attention.sliding_window u32              = 4096
24/09/2025 22:45	ollama	llama_model_loader: - kv  15:                       tokenizer.ggml.model str              = llama
24/09/2025 22:45	ollama	llama_model_loader: - kv  16:                         tokenizer.ggml.pre str              = default
24/09/2025 22:45	ollama	llama_model_loader: - kv  17:                      tokenizer.ggml.tokens arr[str,256000]  = ["<pad>", "<eos>", "<bos>", "<unk>", ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  18:                      tokenizer.ggml.scores arr[f32,256000]  = [0.000000, 0.000000, 0.000000, 0.0000...
24/09/2025 22:45	ollama	llama_model_loader: - kv  19:                  tokenizer.ggml.token_type arr[i32,256000]  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  20:                tokenizer.ggml.bos_token_id u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  21:                tokenizer.ggml.eos_token_id u32              = 1
24/09/2025 22:45	ollama	llama_model_loader: - kv  22:            tokenizer.ggml.unknown_token_id u32              = 3
24/09/2025 22:45	ollama	llama_model_loader: - kv  23:            tokenizer.ggml.padding_token_id u32              = 0
24/09/2025 22:45	ollama	llama_model_loader: - kv  24:               tokenizer.ggml.add_bos_token bool             = true
24/09/2025 22:45	ollama	llama_model_loader: - kv  25:               tokenizer.ggml.add_eos_token bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  26:                    tokenizer.chat_template str              = {{ bos_token }}{% if messages[0]['rol...
24/09/2025 22:45	ollama	llama_model_loader: - kv  27:            tokenizer.ggml.add_space_prefix bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  28:               general.quantization_version u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - type  f32:  169 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q4_0:  294 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q6_K:    1 tensors
24/09/2025 22:45	ollama	print_info: file format = GGUF V3 (latest)
24/09/2025 22:45	ollama	print_info: file type   = Q4_0
24/09/2025 22:45	ollama	print_info: file size   = 5.06 GiB (4.71 BPW)
24/09/2025 22:45	ollama	load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
24/09/2025 22:45	ollama	load: printing all EOG tokens:
24/09/2025 22:45	ollama	load:   - 1 ('<eos>')
24/09/2025 22:45	ollama	load:   - 107 ('<end_of_turn>')
24/09/2025 22:45	ollama	load: special tokens cache size = 108
24/09/2025 22:45	ollama	load: token to piece cache size = 1.6014 MB
24/09/2025 22:45	ollama	print_info: arch             = gemma2
24/09/2025 22:45	ollama	print_info: vocab_only       = 1
24/09/2025 22:45	ollama	print_info: model type       = ?B
24/09/2025 22:45	ollama	print_info: model params     = 9.24 B
24/09/2025 22:45	ollama	print_info: general.name     = gemma-2-9b-it
24/09/2025 22:45	ollama	print_info: vocab type       = SPM
24/09/2025 22:45	ollama	print_info: n_vocab          = 256000
24/09/2025 22:45	ollama	print_info: n_merges         = 0
24/09/2025 22:45	ollama	print_info: BOS token        = 2 '<bos>'
24/09/2025 22:45	ollama	print_info: EOS token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOT token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: UNK token        = 3 '<unk>'
24/09/2025 22:45	ollama	print_info: PAD token        = 0 '<pad>'
24/09/2025 22:45	ollama	print_info: LF token         = 227 '<0x0A>'
24/09/2025 22:45	ollama	print_info: EOG token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOG token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: max token length = 93
24/09/2025 22:45	ollama	llama_model_load: vocab only - skipping tensors
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.324-03:00 level=INFO source=server.go:399 msg="starting runner" cmd="/usr/bin/ollama runner --model /var/lib/ollama/blobs/sha256-ff1d1fc78170d787ee1201778e2dd65ea211654ca5fb7d69b5a2e7b123a50373 --port 37197"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.334-03:00 level=INFO source=runner.go:864 msg="starting go runner"
24/09/2025 22:45	ollama	ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
24/09/2025 22:45	ollama	ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
24/09/2025 22:45	ollama	ggml_cuda_init: found 1 CUDA devices:
24/09/2025 22:45	ollama	  Device 0: NVIDIA GeForce GTX 1060 6GB, compute capability 6.1, VMM: yes, ID: GPU-b5634b9b-f606-83eb-86c6-99d9398d729f
24/09/2025 22:45	ollama	load_backend: loaded CUDA backend from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45	ollama	load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-haswell.so
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.377-03:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=750,800,860,870,880,890,900,1000,1030,1100,1200,1210 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.377-03:00 level=INFO source=runner.go:900 msg="Server listening on 127.0.0.1:37197"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.395-03:00 level=INFO source=server.go:504 msg="system memory" total="31.2 GiB" free="20.8 GiB" free_swap="0 B"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.397-03:00 level=INFO source=server.go:544 msg=offload library=cuda layers.requested=-1 layers.model=43 layers.offload=20 layers.split=[20] memory.available="[4.5 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.2 GiB" memory.required.partial="4.5 GiB" memory.required.kv="1.3 GiB" memory.required.allocations="[4.5 GiB]" memory.weights.total="5.1 GiB" memory.weights.repeating="4.4 GiB" memory.weights.nonrepeating="717.8 MiB" memory.graph.full="507.0 MiB" memory.graph.partial="1.2 GiB"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.397-03:00 level=INFO source=runner.go:799 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:false KvSize:4096 KvCacheType: NumThreads:4 GPULayers:20[ID:GPU-b5634b9b-f606-83eb-86c6-99d9398d729f Layers:20(22..41)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:true}"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.397-03:00 level=INFO source=server.go:1251 msg="waiting for llama runner to start responding"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:41.398-03:00 level=INFO source=server.go:1285 msg="waiting for server to become available" status="llm server loading model"
24/09/2025 22:45	ollama	llama_model_load_from_file_impl: using device CUDA0 (NVIDIA GeForce GTX 1060 6GB) - 4625 MiB free
24/09/2025 22:45	ollama	llama_model_loader: loaded meta data with 29 key-value pairs and 464 tensors from /var/lib/ollama/blobs/sha256-ff1d1fc78170d787ee1201778e2dd65ea211654ca5fb7d69b5a2e7b123a50373 (version GGUF V3 (latest))
24/09/2025 22:45	ollama	llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
24/09/2025 22:45	ollama	llama_model_loader: - kv   0:                       general.architecture str              = gemma2
24/09/2025 22:45	ollama	llama_model_loader: - kv   1:                               general.name str              = gemma-2-9b-it
24/09/2025 22:45	ollama	llama_model_loader: - kv   2:                      gemma2.context_length u32              = 8192
24/09/2025 22:45	ollama	llama_model_loader: - kv   3:                    gemma2.embedding_length u32              = 3584
24/09/2025 22:45	ollama	llama_model_loader: - kv   4:                         gemma2.block_count u32              = 42
24/09/2025 22:45	ollama	llama_model_loader: - kv   5:                 gemma2.feed_forward_length u32              = 14336
24/09/2025 22:45	ollama	llama_model_loader: - kv   6:                gemma2.attention.head_count u32              = 16
24/09/2025 22:45	ollama	llama_model_loader: - kv   7:             gemma2.attention.head_count_kv u32              = 8
24/09/2025 22:45	ollama	llama_model_loader: - kv   8:    gemma2.attention.layer_norm_rms_epsilon f32              = 0.000001
24/09/2025 22:45	ollama	llama_model_loader: - kv   9:                gemma2.attention.key_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  10:              gemma2.attention.value_length u32              = 256
24/09/2025 22:45	ollama	llama_model_loader: - kv  11:                          general.file_type u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  12:              gemma2.attn_logit_softcapping f32              = 50.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  13:             gemma2.final_logit_softcapping f32              = 30.000000
24/09/2025 22:45	ollama	llama_model_loader: - kv  14:            gemma2.attention.sliding_window u32              = 4096
24/09/2025 22:45	ollama	llama_model_loader: - kv  15:                       tokenizer.ggml.model str              = llama
24/09/2025 22:45	ollama	llama_model_loader: - kv  16:                         tokenizer.ggml.pre str              = default
24/09/2025 22:45	ollama	llama_model_loader: - kv  17:                      tokenizer.ggml.tokens arr[str,256000]  = ["<pad>", "<eos>", "<bos>", "<unk>", ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  18:                      tokenizer.ggml.scores arr[f32,256000]  = [0.000000, 0.000000, 0.000000, 0.0000...
24/09/2025 22:45	ollama	llama_model_loader: - kv  19:                  tokenizer.ggml.token_type arr[i32,256000]  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
24/09/2025 22:45	ollama	llama_model_loader: - kv  20:                tokenizer.ggml.bos_token_id u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - kv  21:                tokenizer.ggml.eos_token_id u32              = 1
24/09/2025 22:45	ollama	llama_model_loader: - kv  22:            tokenizer.ggml.unknown_token_id u32              = 3
24/09/2025 22:45	ollama	llama_model_loader: - kv  23:            tokenizer.ggml.padding_token_id u32              = 0
24/09/2025 22:45	ollama	llama_model_loader: - kv  24:               tokenizer.ggml.add_bos_token bool             = true
24/09/2025 22:45	ollama	llama_model_loader: - kv  25:               tokenizer.ggml.add_eos_token bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  26:                    tokenizer.chat_template str              = {{ bos_token }}{% if messages[0]['rol...
24/09/2025 22:45	ollama	llama_model_loader: - kv  27:            tokenizer.ggml.add_space_prefix bool             = false
24/09/2025 22:45	ollama	llama_model_loader: - kv  28:               general.quantization_version u32              = 2
24/09/2025 22:45	ollama	llama_model_loader: - type  f32:  169 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q4_0:  294 tensors
24/09/2025 22:45	ollama	llama_model_loader: - type q6_K:    1 tensors
24/09/2025 22:45	ollama	print_info: file format = GGUF V3 (latest)
24/09/2025 22:45	ollama	print_info: file type   = Q4_0
24/09/2025 22:45	ollama	print_info: file size   = 5.06 GiB (4.71 BPW)
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:41 | 200 |      24.284µs |       127.0.0.1 | GET      "/"
24/09/2025 22:45	ollama	load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
24/09/2025 22:45	ollama	load: printing all EOG tokens:
24/09/2025 22:45	ollama	load:   - 1 ('<eos>')
24/09/2025 22:45	ollama	load:   - 107 ('<end_of_turn>')
24/09/2025 22:45	ollama	load: special tokens cache size = 108
24/09/2025 22:45	ollama	load: token to piece cache size = 1.6014 MB
24/09/2025 22:45	ollama	print_info: arch             = gemma2
24/09/2025 22:45	ollama	print_info: vocab_only       = 0
24/09/2025 22:45	ollama	print_info: n_ctx_train      = 8192
24/09/2025 22:45	ollama	print_info: n_embd           = 3584
24/09/2025 22:45	ollama	print_info: n_layer          = 42
24/09/2025 22:45	ollama	print_info: n_head           = 16
24/09/2025 22:45	ollama	print_info: n_head_kv        = 8
24/09/2025 22:45	ollama	print_info: n_rot            = 256
24/09/2025 22:45	ollama	print_info: n_swa            = 4096
24/09/2025 22:45	ollama	print_info: is_swa_any       = 1
24/09/2025 22:45	ollama	print_info: n_embd_head_k    = 256
24/09/2025 22:45	ollama	print_info: n_embd_head_v    = 256
24/09/2025 22:45	ollama	print_info: n_gqa            = 2
24/09/2025 22:45	ollama	print_info: n_embd_k_gqa     = 2048
24/09/2025 22:45	ollama	print_info: n_embd_v_gqa     = 2048
24/09/2025 22:45	ollama	print_info: f_norm_eps       = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_norm_rms_eps   = 1.0e-06
24/09/2025 22:45	ollama	print_info: f_clamp_kqv      = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_max_alibi_bias = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_logit_scale    = 0.0e+00
24/09/2025 22:45	ollama	print_info: f_attn_scale     = 6.2e-02
24/09/2025 22:45	ollama	print_info: n_ff             = 14336
24/09/2025 22:45	ollama	print_info: n_expert         = 0
24/09/2025 22:45	ollama	print_info: n_expert_used    = 0
24/09/2025 22:45	ollama	print_info: causal attn      = 1
24/09/2025 22:45	ollama	print_info: pooling type     = 0
24/09/2025 22:45	ollama	print_info: rope type        = 2
24/09/2025 22:45	ollama	print_info: rope scaling     = linear
24/09/2025 22:45	ollama	print_info: freq_base_train  = 10000.0
24/09/2025 22:45	ollama	print_info: freq_scale_train = 1
24/09/2025 22:45	ollama	print_info: n_ctx_orig_yarn  = 8192
24/09/2025 22:45	ollama	print_info: rope_finetuned   = unknown
24/09/2025 22:45	ollama	print_info: model type       = 9B
24/09/2025 22:45	ollama	print_info: model params     = 9.24 B
24/09/2025 22:45	ollama	print_info: general.name     = gemma-2-9b-it
24/09/2025 22:45	ollama	print_info: vocab type       = SPM
24/09/2025 22:45	ollama	print_info: n_vocab          = 256000
24/09/2025 22:45	ollama	print_info: n_merges         = 0
24/09/2025 22:45	ollama	print_info: BOS token        = 2 '<bos>'
24/09/2025 22:45	ollama	print_info: EOS token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOT token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: UNK token        = 3 '<unk>'
24/09/2025 22:45	ollama	print_info: PAD token        = 0 '<pad>'
24/09/2025 22:45	ollama	print_info: LF token         = 227 '<0x0A>'
24/09/2025 22:45	ollama	print_info: EOG token        = 1 '<eos>'
24/09/2025 22:45	ollama	print_info: EOG token        = 107 '<end_of_turn>'
24/09/2025 22:45	ollama	print_info: max token length = 93
24/09/2025 22:45	ollama	load_tensors: loading model tensors, this can take a while... (mmap = true)
24/09/2025 22:45	ollama	load_tensors: offloading 20 repeating layers to GPU
24/09/2025 22:45	ollama	load_tensors: offloaded 20/43 layers to GPU
24/09/2025 22:45	ollama	load_tensors:        CUDA0 model buffer size =  2127.34 MiB
24/09/2025 22:45	ollama	load_tensors:   CPU_Mapped model buffer size =  5185.21 MiB
24/09/2025 22:45	ollama	llama_context: constructing llama_context
24/09/2025 22:45	ollama	llama_context: n_seq_max     = 1
24/09/2025 22:45	ollama	llama_context: n_ctx         = 4096
24/09/2025 22:45	ollama	llama_context: n_ctx_per_seq = 4096
24/09/2025 22:45	ollama	llama_context: n_batch       = 512
24/09/2025 22:45	ollama	llama_context: n_ubatch      = 512
24/09/2025 22:45	ollama	llama_context: causal_attn   = 1
24/09/2025 22:45	ollama	llama_context: flash_attn    = 0
24/09/2025 22:45	ollama	llama_context: kv_unified    = false
24/09/2025 22:45	ollama	llama_context: freq_base     = 10000.0
24/09/2025 22:45	ollama	llama_context: freq_scale    = 1
24/09/2025 22:45	ollama	llama_context: n_ctx_per_seq (4096) < n_ctx_train (8192) -- the full capacity of the model will not be utilized
24/09/2025 22:45	ollama	llama_context:        CPU  output buffer size =     0.99 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified_iswa: using full-size SWA cache (ref: https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
24/09/2025 22:45	ollama	llama_kv_cache_unified_iswa: creating non-SWA KV cache, size = 4096 cells
24/09/2025 22:45	ollama	llama_kv_cache_unified:      CUDA0 KV buffer size =   320.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified:        CPU KV buffer size =   352.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified: size =  672.00 MiB (  4096 cells,  21 layers,  1/1 seqs), K (f16):  336.00 MiB, V (f16):  336.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified_iswa: creating     SWA KV cache, size = 4096 cells
24/09/2025 22:45	ollama	llama_kv_cache_unified:      CUDA0 KV buffer size =   320.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified:        CPU KV buffer size =   352.00 MiB
24/09/2025 22:45	ollama	llama_kv_cache_unified: size =  672.00 MiB (  4096 cells,  21 layers,  1/1 seqs), K (f16):  336.00 MiB, V (f16):  336.00 MiB
24/09/2025 22:45	ollama	llama_context:      CUDA0 compute buffer size =  1224.77 MiB
24/09/2025 22:45	ollama	llama_context:  CUDA_Host compute buffer size =    40.01 MiB
24/09/2025 22:45	ollama	llama_context: graph nodes  = 1816
24/09/2025 22:45	ollama	llama_context: graph splits = 290 (with bs=512), 3 (with bs=1)
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.653-03:00 level=INFO source=sched.go:470 msg="loaded runners" count=1
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1251 msg="waiting for llama runner to start responding"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:42.654-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45	ollama	ggml_cuda_compute_forward: ADD failed
24/09/2025 22:45	ollama	CUDA error: no kernel image is available for execution on the device
24/09/2025 22:45	ollama	  current device: 0, in function ggml_cuda_compute_forward at /build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:2568
24/09/2025 22:45	ollama	  err
24/09/2025 22:45	ollama	/build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:84: CUDA error
24/09/2025 22:45	ollama	[New LWP 310914]
24/09/2025 22:45	ollama	[New LWP 310913]
24/09/2025 22:45	ollama	[New LWP 310912]
24/09/2025 22:45	ollama	[New LWP 310910]
24/09/2025 22:45	ollama	[New LWP 310909]
24/09/2025 22:45	ollama	[New LWP 310908]
24/09/2025 22:45	ollama	[New LWP 310907]
24/09/2025 22:45	ollama	[New LWP 310906]
24/09/2025 22:45	ollama	[New LWP 310905]
24/09/2025 22:45	ollama	[New LWP 310904]
24/09/2025 22:45	ollama	[New LWP 310903]
24/09/2025 22:45	ollama	[New LWP 310902]
24/09/2025 22:45	ollama	[Thread debugging using libthread_db enabled]
24/09/2025 22:45	ollama	Using host libthread_db library "/usr/lib/libthread_db.so.1".
24/09/2025 22:45	ollama	0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#0  0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#1  0x00007f59d8e931ac in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#2  0x00007f59d8e931f4 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#3  0x00007f59d8f03dcf in wait4 () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	#4  0x00007f599032a5bd in ggml_print_backtrace () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45	ollama	#5  0x00007f599032a763 in ggml_abort () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45	ollama	#6  0x00007f597b75c381 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45	ollama	#7  0x00007f597b76af02 in ?? () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45	ollama	#8  0x00005623b8da30ed in ?? ()
24/09/2025 22:45	ollama	#9  0x00005623b8e19c12 in ?? ()
24/09/2025 22:45	ollama	#10 0x00005623b8e1afa3 in ?? ()
24/09/2025 22:45	ollama	#11 0x00005623b8e1e84a in ?? ()
24/09/2025 22:45	ollama	#12 0x00005623b8e1f916 in ?? ()
24/09/2025 22:45	ollama	#13 0x00005623b8d5bc50 in ?? ()
24/09/2025 22:45	ollama	#14 0x00005623b80743a1 in ?? ()
24/09/2025 22:45	ollama	#15 0x0000000000000498 in ?? ()
24/09/2025 22:45	ollama	#16 0x000000c000103180 in ?? ()
24/09/2025 22:45	ollama	#17 0x00005623b807285a in ?? ()
24/09/2025 22:45	ollama	#18 0x00005623b80771e5 in ?? ()
24/09/2025 22:45	ollama	#19 0x00007fffc9fc4818 in ?? ()
24/09/2025 22:45	ollama	#20 0x00005623b80771e5 in ?? ()
24/09/2025 22:45	ollama	#21 0x00005623b9da4260 in ?? ()
24/09/2025 22:45	ollama	#22 0x00007fffc9fc48f0 in ?? ()
24/09/2025 22:45	ollama	#23 0x00005623b8072645 in ?? ()
24/09/2025 22:45	ollama	#24 0x00005623b80725d3 in ?? ()
24/09/2025 22:45	ollama	#25 0x0000000000000006 in ?? ()
24/09/2025 22:45	ollama	#26 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45	ollama	#27 0x0000000000000006 in ?? ()
24/09/2025 22:45	ollama	#28 0x0000000000000006 in ?? ()
24/09/2025 22:45	ollama	#29 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45	ollama	#30 0x00007f59d8e27675 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45	ollama	Backtrace stopped: previous frame inner to this frame (corrupt stack?)
24/09/2025 22:45	ollama	[Inferior 1 (process 310900) detached]
24/09/2025 22:45	ollama	SIGABRT: abort
24/09/2025 22:45	ollama	PC=0x7f59d8e9894c m=0 sigcode=18446744073709551610
24/09/2025 22:45	ollama	signal arrived during cgo execution
24/09/2025 22:45	ollama	goroutine 11 gp=0xc000103180 m=0 mp=0x5623b9da6080 [syscall]:
24/09/2025 22:45	ollama	runtime.cgocall(0x5623b8d5bc00, 0xc000389bd8)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/cgocall.go:167 +0x4b fp=0xc000389bb0 sp=0xc000389b78 pc=0x5623b806906b
24/09/2025 22:45	ollama	github.com/ollama/ollama/llama._Cfunc_llama_decode(0x5623e40f5060, {0xf, 0x5623e40fac20, 0x0, 0x5623e41a39a0, 0x5623e6c89a70, 0x5623e6c8a280, 0x5623e6c8d610})
24/09/2025 22:45	ollama		_cgo_gotypes.go:672 +0x4a fp=0xc000389bd8 sp=0xc000389bb0 pc=0x5623b841ea6a
24/09/2025 22:45	ollama	github.com/ollama/ollama/llama.(*Context).Decode.func1(...)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/llama/llama.go:150
24/09/2025 22:45	ollama	github.com/ollama/ollama/llama.(*Context).Decode(0xc00050dd88?, 0x1?)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/llama/llama.go:150 +0xed fp=0xc000389cc0 sp=0xc000389bd8 pc=0x5623b842184d
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0002e94a0, 0xc0007120f0, 0xc00050df28)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:441 +0x209 fp=0xc000389ee8 sp=0xc000389cc0 pc=0x5623b84ec309
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0002e94a0, {0x5623b94e0570, 0xc000123a40})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:346 +0x1d5 fp=0xc000389fb8 sp=0xc000389ee8 pc=0x5623b84ebf95
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x28 fp=0xc000389fe0 sp=0xc000389fb8 pc=0x5623b84f0ce8
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000389fe8 sp=0xc000389fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x4c5
24/09/2025 22:45	ollama	goroutine 1 gp=0xc000002380 m=nil [IO wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000597790 sp=0xc000597770 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.netpollblock(0xc0005977e0?, 0xb80013a6?, 0x23?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc0005977c8 sp=0xc000597790 pc=0x5623b80301d7
24/09/2025 22:45	ollama	internal/poll.runtime_pollWait(0x7f59d94fb400, 0x72)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc0005977e8 sp=0xc0005977c8 pc=0x5623b806b6c5
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).wait(0xc0005b1400?, 0x900000036?, 0x0)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000597810 sp=0xc0005977e8 pc=0x5623b80f4707
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).waitRead(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:89
24/09/2025 22:45	ollama	internal/poll.(*FD).Accept(0xc0005b1400)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_unix.go:613 +0x28c fp=0xc0005978b8 sp=0xc000597810 pc=0x5623b80f9b2c
24/09/2025 22:45	ollama	net.(*netFD).accept(0xc0005b1400)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/fd_unix.go:161 +0x29 fp=0xc000597970 sp=0xc0005978b8 pc=0x5623b8164089
24/09/2025 22:45	ollama	net.(*TCPListener).accept(0xc00012fd80)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/tcpsock_posix.go:159 +0x1b fp=0xc0005979c0 sp=0xc000597970 pc=0x5623b81797bb
24/09/2025 22:45	ollama	net.(*TCPListener).Accept(0xc00012fd80)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/tcpsock.go:380 +0x30 fp=0xc0005979f0 sp=0xc0005979c0 pc=0x5623b8178650
24/09/2025 22:45	ollama	net/http.(*onceCloseListener).Accept(0xc0004d03f0?)
24/09/2025 22:45	ollama		<autogenerated>:1 +0x24 fp=0xc000597a08 sp=0xc0005979f0 pc=0x5623b8399ea4
24/09/2025 22:45	ollama	net/http.(*Server).Serve(0xc0001f1500, {0x5623b94ddfc8, 0xc00012fd80})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3463 +0x30c fp=0xc000597b38 sp=0xc000597a08 pc=0x5623b837188c
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034260, 0x4, 0x4})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:901 +0x8f4 fp=0xc000597d08 sp=0xc000597b38 pc=0x5623b84f0a74
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner.Execute({0xc000034250?, 0x0?, 0x0?})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/runner.go:22 +0xd4 fp=0xc000597d30 sp=0xc000597d08 pc=0x5623b8583414
24/09/2025 22:45	ollama	github.com/ollama/ollama/cmd.NewCLI.func2(0xc0001f1200?, {0x5623b8fec2d0?, 0x4?, 0x5623b8fec2d4?})
24/09/2025 22:45	ollama		/build/ollama/src/ollama/cmd/cmd.go:1706 +0x45 fp=0xc000597d58 sp=0xc000597d30 pc=0x5623b8ced5c5
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).execute(0xc0004d3508, {0xc00012fb80, 0x4, 0x4})
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x88a fp=0xc000597e78 sp=0xc000597d58 pc=0x5623b81dd70a
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).ExecuteC(0xc0005c6f08)
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x398 fp=0xc000597f30 sp=0xc000597e78 pc=0x5623b81ddf38
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).Execute(...)
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
24/09/2025 22:45	ollama	github.com/spf13/cobra.(*Command).ExecuteContext(...)
24/09/2025 22:45	ollama		/build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
24/09/2025 22:45	ollama	main.main()
24/09/2025 22:45	ollama		/build/ollama/src/ollama/main.go:12 +0x4d fp=0xc000597f50 sp=0xc000597f30 pc=0x5623b8cee08d
24/09/2025 22:45	ollama	runtime.main()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:285 +0x29d fp=0xc000597fe0 sp=0xc000597f50 pc=0x5623b8037a7d
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000597fe8 sp=0xc000597fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006cfa8 sp=0xc00006cf88 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.forcegchelper()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:373 +0xb8 fp=0xc00006cfe0 sp=0xc00006cfa8 pc=0x5623b8037db8
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006cfe8 sp=0xc00006cfe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.init.7 in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:361 +0x1a
24/09/2025 22:45	ollama	goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006d780 sp=0xc00006d760 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.bgsweep(0xc000098000)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgcsweep.go:323 +0xdf fp=0xc00006d7c8 sp=0xc00006d780 pc=0x5623b8021adf
24/09/2025 22:45	ollama	runtime.gcenable.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:212 +0x25 fp=0xc00006d7e0 sp=0xc00006d7c8 pc=0x5623b8015a65
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006d7e8 sp=0xc00006d7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcenable in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:212 +0x66
24/09/2025 22:45	ollama	goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x10000?, 0x5623b91b3748?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006df78 sp=0xc00006df58 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.(*scavengerState).park(0x5623b9da3100)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00006dfa8 sp=0xc00006df78 pc=0x5623b801f549
24/09/2025 22:45	ollama	runtime.bgscavenge(0xc000098000)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc00006dfc8 sp=0xc00006dfa8 pc=0x5623b801faf9
24/09/2025 22:45	ollama	runtime.gcenable.gowrap2()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:213 +0x25 fp=0xc00006dfe0 sp=0xc00006dfc8 pc=0x5623b8015a05
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006dfe8 sp=0xc00006dfe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcenable in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:213 +0xa5
24/09/2025 22:45	ollama	goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x5623b8046d57?, 0x5623b800d385?, 0xb8?, 0x1?, 0xc000002380?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006c620 sp=0xc00006c600 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.runFinalizers()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mfinal.go:210 +0x107 fp=0xc00006c7e0 sp=0xc00006c620 pc=0x5623b8014967
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006c7e8 sp=0xc00006c7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.createfing in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mfinal.go:172 +0x3d
24/09/2025 22:45	ollama	goroutine 6 gp=0xc0001ce8c0 m=nil [cleanup wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006e768 sp=0xc00006e748 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.goparkunlock(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45	ollama	runtime.(*cleanupQueue).dequeue(0x5623b9da3a60)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mcleanup.go:439 +0xc5 fp=0xc00006e7a0 sp=0xc00006e768 pc=0x5623b8011b45
24/09/2025 22:45	ollama	runtime.runCleanups()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mcleanup.go:635 +0x45 fp=0xc00006e7e0 sp=0xc00006e7a0 pc=0x5623b8012205
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.(*cleanupQueue).createGs in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mcleanup.go:589 +0xa5
24/09/2025 22:45	ollama	goroutine 7 gp=0xc0001cefc0 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ef38 sp=0xc00006ef18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006efc8 sp=0xc00006ef38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 8 gp=0xc0001cf180 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006f7c8 sp=0xc00006f738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 9 gp=0xc0001cf340 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649ae95?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ff38 sp=0xc00006ff18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006ffc8 sp=0xc00006ff38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006ffe0 sp=0xc00006ffc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 10 gp=0xc0001cf500 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649b2fb?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000068738 sp=0xc000068718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0000687c8 sp=0xc000068738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0000687e0 sp=0xc0000687c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0000687e8 sp=0xc0000687e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 18 gp=0xc000504000 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207648e360?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00050a7c8 sp=0xc00050a738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 34 gp=0xc000102380 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x320764a388d?, 0x3?, 0xab?, 0x12?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000083fc8 sp=0xc000083f38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 35 gp=0xc000102540 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649b52d?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000506f38 sp=0xc000506f18 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000506fc8 sp=0xc000506f38 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000506fe0 sp=0xc000506fc8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000506fe8 sp=0xc000506fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 36 gp=0xc000102700 m=nil [GC worker (idle)]:
24/09/2025 22:45	ollama	runtime.gopark(0x3207649d167?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000507738 sp=0xc000507718 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0005077c8 sp=0xc000507738 pc=0x5623b801818b
24/09/2025 22:45	ollama	runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0005077e0 sp=0xc0005077c8 pc=0x5623b8018065
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0005077e8 sp=0xc0005077e0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45	ollama	goroutine 12 gp=0xc000103340 m=nil [select]:
24/09/2025 22:45	ollama	runtime.gopark(0xc000047a70?, 0x2?, 0x78?, 0x77?, 0xc0000478bc?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc0000476e8 sp=0xc0000476c8 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.selectgo(0xc000047a70, 0xc0000478b8, 0xf?, 0x0, 0x1?, 0x1)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/select.go:351 +0x8c5 fp=0xc000047828 sp=0xc0000476e8 pc=0x5623b804a685
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0002e94a0, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140)
24/09/2025 22:45	ollama		/build/ollama/src/ollama/runner/llamarunner/runner.go:629 +0xb30 fp=0xc000047ab8 sp=0xc000047828 pc=0x5623b84edf10
24/09/2025 22:45	ollama	github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b38?)
24/09/2025 22:45	ollama		<autogenerated>:1 +0x36 fp=0xc000047ae8 sp=0xc000047ab8 pc=0x5623b84f10f6
24/09/2025 22:45	ollama	net/http.HandlerFunc.ServeHTTP(0xc0005caf00?, {0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b58?)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:2322 +0x29 fp=0xc000047b10 sp=0xc000047ae8 pc=0x5623b836dec9
24/09/2025 22:45	ollama	net/http.(*ServeMux).ServeHTTP(0x5623b800d385?, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:2861 +0x1c7 fp=0xc000047b60 sp=0xc000047b10 pc=0x5623b836fda7
24/09/2025 22:45	ollama	net/http.serverHandler.ServeHTTP({0x5623b94dad30?}, {0x5623b94de1a8?, 0xc0003040f0?}, 0x1?)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3340 +0x8e fp=0xc000047b90 sp=0xc000047b60 pc=0x5623b838d68e
24/09/2025 22:45	ollama	net/http.(*conn).serve(0xc0004d03f0, {0x5623b94e0538, 0xc0004c71d0})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:2109 +0x665 fp=0xc000047fb8 sp=0xc000047b90 pc=0x5623b836bfc5
24/09/2025 22:45	ollama	net/http.(*Server).Serve.gowrap3()
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3493 +0x28 fp=0xc000047fe0 sp=0xc000047fb8 pc=0x5623b8371c88
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000047fe8 sp=0xc000047fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by net/http.(*Server).Serve in goroutine 1
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:3493 +0x485
24/09/2025 22:45	ollama	goroutine 20 gp=0xc000504380 m=nil [IO wait]:
24/09/2025 22:45	ollama	runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000509dd8 sp=0xc000509db8 pc=0x5623b806c4ee
24/09/2025 22:45	ollama	runtime.netpollblock(0x5623b8090b98?, 0xb80013a6?, 0x23?)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc000509e10 sp=0xc000509dd8 pc=0x5623b80301d7
24/09/2025 22:45	ollama	internal/poll.runtime_pollWait(0x7f59d94fb200, 0x72)
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc000509e30 sp=0xc000509e10 pc=0x5623b806b6c5
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).wait(0xc0005b1480?, 0xc00012fde1?, 0x0)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000509e58 sp=0xc000509e30 pc=0x5623b80f4707
24/09/2025 22:45	ollama	internal/poll.(*pollDesc).waitRead(...)
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_poll_runtime.go:89
24/09/2025 22:45	ollama	internal/poll.(*FD).Read(0xc0005b1480, {0xc00012fde1, 0x1, 0x1})
24/09/2025 22:45	ollama		/usr/lib/go/src/internal/poll/fd_unix.go:165 +0x279 fp=0xc000509ef0 sp=0xc000509e58 pc=0x5623b80f59f9
24/09/2025 22:45	ollama	net.(*netFD).Read(0xc0005b1480, {0xc00012fde1?, 0x5623b9ce5ec0?, 0xc000509f70?})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/fd_posix.go:68 +0x25 fp=0xc000509f38 sp=0xc000509ef0 pc=0x5623b81621e5
24/09/2025 22:45	ollama	net.(*conn).Read(0xc00011c3c0, {0xc00012fde1?, 0x0?, 0x0?})
24/09/2025 22:45	ollama		/usr/lib/go/src/net/net.go:196 +0x45 fp=0xc000509f80 sp=0xc000509f38 pc=0x5623b8170205
24/09/2025 22:45	ollama	net/http.(*connReader).backgroundRead(0xc00012fdc0)
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:702 +0x33 fp=0xc000509fc8 sp=0xc000509f80 pc=0x5623b8366473
24/09/2025 22:45	ollama	net/http.(*connReader).startBackgroundRead.gowrap2()
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:698 +0x25 fp=0xc000509fe0 sp=0xc000509fc8 pc=0x5623b83663a5
24/09/2025 22:45	ollama	runtime.goexit({})
24/09/2025 22:45	ollama		/usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000509fe8 sp=0xc000509fe0 pc=0x5623b8074701
24/09/2025 22:45	ollama	created by net/http.(*connReader).startBackgroundRead in goroutine 12
24/09/2025 22:45	ollama		/usr/lib/go/src/net/http/server.go:698 +0xb6
24/09/2025 22:45	ollama	rax    0x0
24/09/2025 22:45	ollama	rbx    0x4be74
24/09/2025 22:45	ollama	rcx    0x7f59d8e9894c
24/09/2025 22:45	ollama	rdx    0x6
24/09/2025 22:45	ollama	rdi    0x4be74
24/09/2025 22:45	ollama	rsi    0x4be74
24/09/2025 22:45	ollama	rbp    0x7fffc9fbfaa0
24/09/2025 22:45	ollama	rsp    0x7fffc9fbfa60
24/09/2025 22:45	ollama	r8     0x0
24/09/2025 22:45	ollama	r9     0x0
24/09/2025 22:45	ollama	r10    0x0
24/09/2025 22:45	ollama	r11    0x246
24/09/2025 22:45	ollama	r12    0x7f597bc22729
24/09/2025 22:45	ollama	r13    0x54
24/09/2025 22:45	ollama	r14    0x6
24/09/2025 22:45	ollama	r15    0x0
24/09/2025 22:45	ollama	rip    0x7f59d8e9894c
24/09/2025 22:45	ollama	rflags 0x246
24/09/2025 22:45	ollama	cs     0x33
24/09/2025 22:45	ollama	fs     0x0
24/09/2025 22:45	ollama	gs     0x0
24/09/2025 22:45	ollama	time=2025-09-24T22:45:44.225-03:00 level=ERROR source=server.go:1459 msg="post predict" error="Post \"http://127.0.0.1:37197/completion\": EOF"
24/09/2025 22:45	ollama	[GIN] 2025/09/24 - 22:45:44 | 200 |  3.564513467s |       127.0.0.1 | POST     "/api/generate"
24/09/2025 22:45	ollama	time=2025-09-24T22:45:44.253-03:00 level=ERROR source=server.go:425 msg="llama runner terminated" error="exit status 2"

CUDA_ERR:

[23:59:10.902][140663025804992][CUDA][E] No CUDA context is current to the calling thread
[23:59:10.902][140663025804992][CUDA][E] Returning 201 (CUDA_ERROR_INVALID_CONTEXT) from cuCtxGetDevice_v2
[23:59:13.181][140663025804992][CUDA][E] Error handling fatbinary, to get more information when using CUDA Driver APIs use the CU_JIT_ERROR_LOG_BUFFER and CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES parameters
[23:59:13.182][140663025804992][CUDA][E] No available relocatable PTX entries for GPU
[23:59:13.182][140663025804992][CUDA][E] No device code available for GPU ISA 61
[23:59:13.182][140663025804992][CUDA][E] Kernel (_Z11k_bin_bcastIXadL_ZN42_INTERNAL_f88bb2be_11_binbcast_cu_6840010b6op_addEffEEfffEvPKT0_PKT1_PT2_iiiiiiiiiiiiiiiii) cannot be found in library due to compilation error, to get more information when using C
[23:59:13.182][140663025804992][CUDA][E] Returning 209 (CUDA_ERROR_NO_BINARY_FOR_GPU) from cuLibraryGetKernel
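The `No device code available for GPU ISA 61` and `CUDA_ERROR_NO_BINARY_FOR_GPU` lines suggest the CUDA backend was built without `sm_61` (compute 6.1) code for the GTX 1060. A diagnostic sketch to check both sides of that mismatch, assuming the CUDA toolkit's `cuobjdump` is on `PATH` and the backend library sits at the path the runner log reports:

```shell
# Report the GPU's compute capability (expected: 6.1 for a GTX 1060)
nvidia-smi --query-gpu=compute_cap --format=csv,noheader

# List the cubin architectures embedded in Ollama's CUDA backend.
# If no sm_61 entry appears, and --list-ptx shows no relocatable PTX
# the driver could JIT-compile either, the GPU has nothing it can
# execute -- consistent with CUDA_ERROR_NO_BINARY_FOR_GPU above.
cuobjdump --list-elf /usr/lib/ollama/libggml-cuda.so
cuobjdump --list-ptx /usr/lib/ollama/libggml-cuda.so
```

If the embedded architectures all target newer GPUs than Pascal, this build simply cannot run on compute 6.1 hardware regardless of driver or CUDA toolkit version.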

NVIDIA-SMI:

Thu Sep 25 00:19:58 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.09              Driver Version: 580.82.09      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 6GB    On  |   00000000:01:00.0  On |                  N/A |
|  0%   50C    P2             27W /  180W |    1499MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A          188994      G   /usr/bin/ksecretd                         1MiB |
|    0   N/A  N/A          189133      G   /usr/bin/kwin_wayland                    80MiB |
|    0   N/A  N/A          189269      G   /usr/bin/Xwayland                         5MiB |
|    0   N/A  N/A          189344      G   /usr/bin/ksmserver                        1MiB |
|    0   N/A  N/A          189346      G   /usr/bin/kded6                            1MiB |
|    0   N/A  N/A          189395      G   /usr/bin/plasmashell                    172MiB |
|    0   N/A  N/A          189474      G   /usr/bin/kaccess                          1MiB |
|    0   N/A  N/A          189477      G   ...it-kde-authentication-agent-1          1MiB |
|    0   N/A  N/A          189657      G   /usr/bin/kwalletd6                        1MiB |
|    0   N/A  N/A          189758      G   /usr/bin/kwalletmanager5                  1MiB |
|    0   N/A  N/A          189834      G   /opt/flemozi/flemozi                      4MiB |
|    0   N/A  N/A          189911      G   /usr/bin/kdeconnectd                      1MiB |
|    0   N/A  N/A          189917      G   /usr/bin/xwaylandvideobridge              1MiB |
|    0   N/A  N/A          189922      G   /usr/bin/yakuake                          1MiB |
|    0   N/A  N/A          189927      G   /usr/bin/qbittorrent                      1MiB |
|    0   N/A  N/A          189985      G   vicinae                                   1MiB |
|    0   N/A  N/A          190031      G   /usr/lib/DiscoverNotifier                 1MiB |
|    0   N/A  N/A          190032      G   /usr/bin/kalendarac                       1MiB |
|    0   N/A  N/A          190033      G   /usr/bin/kgpg                             1MiB |
|    0   N/A  N/A          190164      G   /usr/lib/xdg-desktop-portal-kde           1MiB |
|    0   N/A  N/A          190334      G   /usr/bin/akonadi_control                  1MiB |
|    0   N/A  N/A          190465      G   ...bin/akonadi_archivemail_agent          1MiB |
|    0   N/A  N/A          190468      G   ...konadi_followupreminder_agent          1MiB |
|    0   N/A  N/A          190469      G   /usr/bin/akonadi_google_resource          1MiB |
|    0   N/A  N/A          190473      G   .../akonadi_maildispatcher_agent          1MiB |
|    0   N/A  N/A          190474      G   .../bin/akonadi_mailfilter_agent          1MiB |
|    0   N/A  N/A          190475      G   /usr/bin/akonadi_mailmerge_agent          1MiB |
|    0   N/A  N/A          190479      G   /usr/bin/akonadi_migration_agent          1MiB |
|    0   N/A  N/A          190480      G   ...akonadi_newmailnotifier_agent          1MiB |
|    0   N/A  N/A          190481      G   /usr/bin/akonadi_sendlater_agent          1MiB |
|    0   N/A  N/A          190484      G   .../akonadi_unifiedmailbox_agent          1MiB |
|    0   N/A  N/A          193538      G   /usr/lib/baloorunner                      1MiB |
|    0   N/A  N/A          196733      G   /usr/bin/kalarm                           1MiB |
|    0   N/A  N/A          196742      G   /usr/bin/konsole                          1MiB |
|    0   N/A  N/A          214267      G   /usr/lib/firefox/firefox               1101MiB |
|    0   N/A  N/A          214759      G   ...asma-browser-integration-host          1MiB |
+-----------------------------------------------------------------------------------------+

OS

Linux (EndeavourOS)

GPU

Nvidia

CPU

Intel

Ollama version

0.12.1
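Until the CUDA-side failure is resolved, the CPU-only workaround mentioned above (starting the service with `CUDA_VISIBLE_DEVICES="-1"`) can be made persistent via a systemd drop-in. A sketch, assuming the service unit is named `ollama.service`:

```shell
# Create a drop-in override so the service starts with all GPUs hidden
sudo mkdir -p /etc/systemd/system/ollama.service.d
printf '[Service]\nEnvironment="CUDA_VISIBLE_DEVICES=-1"\n' | \
  sudo tee /etc/systemd/system/ollama.service.d/no-cuda.conf

# Reload unit files and restart the service
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

This forces the CPU path without editing the packaged unit file, and can be reverted by deleting the drop-in and restarting.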

24/09/2025 22:45 ollama llama_model_loader: - kv 0: general.architecture str = gemma2
24/09/2025 22:45 ollama llama_model_loader: - kv 1: general.name str = gemma-2-9b-it
24/09/2025 22:45 ollama llama_model_loader: - kv 2: gemma2.context_length u32 = 8192
24/09/2025 22:45 ollama llama_model_loader: - kv 3: gemma2.embedding_length u32 = 3584
24/09/2025 22:45 ollama llama_model_loader: - kv 4: gemma2.block_count u32 = 42
24/09/2025 22:45 ollama llama_model_loader: - kv 5: gemma2.feed_forward_length u32 = 14336
24/09/2025 22:45 ollama llama_model_loader: - kv 6: gemma2.attention.head_count u32 = 16
24/09/2025 22:45 ollama llama_model_loader: - kv 7: gemma2.attention.head_count_kv u32 = 8
24/09/2025 22:45 ollama llama_model_loader: - kv 8: gemma2.attention.layer_norm_rms_epsilon f32 = 0.000001
24/09/2025 22:45 ollama llama_model_loader: - kv 9: gemma2.attention.key_length u32 = 256
24/09/2025 22:45 ollama llama_model_loader: - kv 10: gemma2.attention.value_length u32 = 256
24/09/2025 22:45 ollama llama_model_loader: - kv 11: general.file_type u32 = 2
24/09/2025 22:45 ollama llama_model_loader: - kv 12: gemma2.attn_logit_softcapping f32 = 50.000000
24/09/2025 22:45 ollama llama_model_loader: - kv 13: gemma2.final_logit_softcapping f32 = 30.000000
24/09/2025 22:45 ollama llama_model_loader: - kv 14: gemma2.attention.sliding_window u32 = 4096
24/09/2025 22:45 ollama llama_model_loader: - kv 15: tokenizer.ggml.model str = llama
24/09/2025 22:45 ollama llama_model_loader: - kv 16: tokenizer.ggml.pre str = default
24/09/2025 22:45 ollama llama_model_loader: - kv 17: tokenizer.ggml.tokens arr[str,256000] = ["<pad>", "<eos>", "<bos>", "<unk>", ...
24/09/2025 22:45 ollama llama_model_loader: - kv 18: tokenizer.ggml.scores arr[f32,256000] = [0.000000, 0.000000, 0.000000, 0.0000...
24/09/2025 22:45 ollama llama_model_loader: - kv 19: tokenizer.ggml.token_type arr[i32,256000] = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
24/09/2025 22:45 ollama llama_model_loader: - kv 20: tokenizer.ggml.bos_token_id u32 = 2
24/09/2025 22:45 ollama llama_model_loader: - kv 21: tokenizer.ggml.eos_token_id u32 = 1
24/09/2025 22:45 ollama llama_model_loader: - kv 22: tokenizer.ggml.unknown_token_id u32 = 3
24/09/2025 22:45 ollama llama_model_loader: - kv 23: tokenizer.ggml.padding_token_id u32 = 0
24/09/2025 22:45 ollama llama_model_loader: - kv 24: tokenizer.ggml.add_bos_token bool = true
24/09/2025 22:45 ollama llama_model_loader: - kv 25: tokenizer.ggml.add_eos_token bool = false
24/09/2025 22:45 ollama llama_model_loader: - kv 26: tokenizer.chat_template str = {{ bos_token }}{% if messages[0]['rol...
24/09/2025 22:45 ollama llama_model_loader: - kv 27: tokenizer.ggml.add_space_prefix bool = false
24/09/2025 22:45 ollama llama_model_loader: - kv 28: general.quantization_version u32 = 2
24/09/2025 22:45 ollama llama_model_loader: - type f32: 169 tensors
24/09/2025 22:45 ollama llama_model_loader: - type q4_0: 294 tensors
24/09/2025 22:45 ollama llama_model_loader: - type q6_K: 1 tensors
24/09/2025 22:45 ollama print_info: file format = GGUF V3 (latest)
24/09/2025 22:45 ollama print_info: file type = Q4_0
24/09/2025 22:45 ollama print_info: file size = 5.06 GiB (4.71 BPW)
24/09/2025 22:45 ollama [GIN] 2025/09/24 - 22:45:41 | 200 | 24.284µs | 127.0.0.1 | GET "/"
24/09/2025 22:45 ollama load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
24/09/2025 22:45 ollama load: printing all EOG tokens:
24/09/2025 22:45 ollama load: - 1 ('<eos>')
24/09/2025 22:45 ollama load: - 107 ('<end_of_turn>')
24/09/2025 22:45 ollama load: special tokens cache size = 108
24/09/2025 22:45 ollama load: token to piece cache size = 1.6014 MB
24/09/2025 22:45 ollama print_info: arch = gemma2
24/09/2025 22:45 ollama print_info: vocab_only = 0
24/09/2025 22:45 ollama print_info: n_ctx_train = 8192
24/09/2025 22:45 ollama print_info: n_embd = 3584
24/09/2025 22:45 ollama print_info: n_layer = 42
24/09/2025 22:45 ollama print_info: n_head = 16
24/09/2025 22:45 ollama print_info: n_head_kv = 8
24/09/2025 22:45 ollama print_info: n_rot = 256
24/09/2025 22:45 ollama print_info: n_swa = 4096
24/09/2025 22:45 ollama print_info: is_swa_any = 1
24/09/2025 22:45 ollama print_info: n_embd_head_k = 256
24/09/2025 22:45 ollama print_info: n_embd_head_v = 256
24/09/2025 22:45 ollama print_info: n_gqa = 2
24/09/2025 22:45 ollama print_info: n_embd_k_gqa = 2048
24/09/2025 22:45 ollama print_info: n_embd_v_gqa = 2048
24/09/2025 22:45 ollama print_info: f_norm_eps = 0.0e+00
24/09/2025 22:45 ollama print_info: f_norm_rms_eps = 1.0e-06
24/09/2025 22:45 ollama print_info: f_clamp_kqv = 0.0e+00
24/09/2025 22:45 ollama print_info: f_max_alibi_bias = 0.0e+00
24/09/2025 22:45 ollama print_info: f_logit_scale = 0.0e+00
24/09/2025 22:45 ollama print_info: f_attn_scale = 6.2e-02
24/09/2025 22:45 ollama print_info: n_ff = 14336
24/09/2025 22:45 ollama print_info: n_expert = 0
24/09/2025 22:45 ollama print_info: n_expert_used = 0
24/09/2025 22:45 ollama print_info: causal attn = 1
24/09/2025 22:45 ollama print_info: pooling type = 0
24/09/2025 22:45 ollama print_info: rope type = 2
24/09/2025 22:45 ollama print_info: rope scaling = linear
24/09/2025 22:45 ollama print_info: freq_base_train = 10000.0
24/09/2025 22:45 ollama print_info: freq_scale_train = 1
24/09/2025 22:45 ollama print_info: n_ctx_orig_yarn = 8192
24/09/2025 22:45 ollama print_info: rope_finetuned = unknown
24/09/2025 22:45 ollama print_info: model type = 9B
24/09/2025 22:45 ollama print_info: model params = 9.24 B
24/09/2025 22:45 ollama print_info: general.name = gemma-2-9b-it
24/09/2025 22:45 ollama print_info: vocab type = SPM
24/09/2025 22:45 ollama print_info: n_vocab = 256000
24/09/2025 22:45 ollama print_info: n_merges = 0
24/09/2025 22:45 ollama print_info: BOS token = 2 '<bos>'
24/09/2025 22:45 ollama print_info: EOS token = 1 '<eos>'
24/09/2025 22:45 ollama print_info: EOT token = 107 '<end_of_turn>'
24/09/2025 22:45 ollama print_info: UNK token = 3 '<unk>'
24/09/2025 22:45 ollama print_info: PAD token = 0 '<pad>'
24/09/2025 22:45 ollama print_info: LF token = 227 '<0x0A>'
24/09/2025 22:45 ollama print_info: EOG token = 1 '<eos>'
24/09/2025 22:45 ollama print_info: EOG token = 107 '<end_of_turn>'
24/09/2025 22:45 ollama print_info: max token length = 93
24/09/2025 22:45 ollama load_tensors: loading model tensors, this can take a while... (mmap = true)
24/09/2025 22:45 ollama load_tensors: offloading 20 repeating layers to GPU
24/09/2025 22:45 ollama load_tensors: offloaded 20/43 layers to GPU
24/09/2025 22:45 ollama load_tensors: CUDA0 model buffer size = 2127.34 MiB
24/09/2025 22:45 ollama load_tensors: CPU_Mapped model buffer size = 5185.21 MiB
24/09/2025 22:45 ollama llama_context: constructing llama_context
24/09/2025 22:45 ollama llama_context: n_seq_max = 1
24/09/2025 22:45 ollama llama_context: n_ctx = 4096
24/09/2025 22:45 ollama llama_context: n_ctx_per_seq = 4096
24/09/2025 22:45 ollama llama_context: n_batch = 512
24/09/2025 22:45 ollama llama_context: n_ubatch = 512
24/09/2025 22:45 ollama llama_context: causal_attn = 1
24/09/2025 22:45 ollama llama_context: flash_attn = 0
24/09/2025 22:45 ollama llama_context: kv_unified = false
24/09/2025 22:45 ollama llama_context: freq_base = 10000.0
24/09/2025 22:45 ollama llama_context: freq_scale = 1
24/09/2025 22:45 ollama llama_context: n_ctx_per_seq (4096) < n_ctx_train (8192) -- the full capacity of the model will not be utilized
24/09/2025 22:45 ollama llama_context: CPU output buffer size = 0.99 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified_iswa: using full-size SWA cache (ref: https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
24/09/2025 22:45 ollama llama_kv_cache_unified_iswa: creating non-SWA KV cache, size = 4096 cells
24/09/2025 22:45 ollama llama_kv_cache_unified: CUDA0 KV buffer size = 320.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: CPU KV buffer size = 352.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: size = 672.00 MiB ( 4096 cells, 21 layers, 1/1 seqs), K (f16): 336.00 MiB, V (f16): 336.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified_iswa: creating SWA KV cache, size = 4096 cells
24/09/2025 22:45 ollama llama_kv_cache_unified: CUDA0 KV buffer size = 320.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: CPU KV buffer size = 352.00 MiB
24/09/2025 22:45 ollama llama_kv_cache_unified: size = 672.00 MiB ( 4096 cells, 21 layers, 1/1 seqs), K (f16): 336.00 MiB, V (f16): 336.00 MiB
24/09/2025 22:45 ollama llama_context: CUDA0 compute buffer size = 1224.77 MiB
24/09/2025 22:45 ollama llama_context: CUDA_Host compute buffer size = 40.01 MiB
24/09/2025 22:45 ollama llama_context: graph nodes = 1816
24/09/2025 22:45 ollama llama_context: graph splits = 290 (with bs=512), 3 (with bs=1)
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.653-03:00 level=INFO source=sched.go:470 msg="loaded runners" count=1
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.653-03:00 level=INFO source=server.go:1251 msg="waiting for llama runner to start responding"
24/09/2025 22:45 ollama time=2025-09-24T22:45:42.654-03:00 level=INFO source=server.go:1289 msg="llama runner started in 1.33 seconds"
24/09/2025 22:45 ollama ggml_cuda_compute_forward: ADD failed
24/09/2025 22:45 ollama CUDA error: no kernel image is available for execution on the device
24/09/2025 22:45 ollama current device: 0, in function ggml_cuda_compute_forward at /build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:2568
24/09/2025 22:45 ollama err
24/09/2025 22:45 ollama /build/ollama/src/ollama/ml/backend/ggml/ggml/src/ggml-cuda/ggml-cuda.cu:84: CUDA error
24/09/2025 22:45 ollama [New LWP 310914]
24/09/2025 22:45 ollama [New LWP 310913]
24/09/2025 22:45 ollama [New LWP 310912]
24/09/2025 22:45 ollama [New LWP 310910]
24/09/2025 22:45 ollama [New LWP 310909]
24/09/2025 22:45 ollama [New LWP 310908]
24/09/2025 22:45 ollama [New LWP 310907]
24/09/2025 22:45 ollama [New LWP 310906]
24/09/2025 22:45 ollama [New LWP 310905]
24/09/2025 22:45 ollama [New LWP 310904]
24/09/2025 22:45 ollama [New LWP 310903]
24/09/2025 22:45 ollama [New LWP 310902]
24/09/2025 22:45 ollama [Thread debugging using libthread_db enabled]
24/09/2025 22:45 ollama Using host libthread_db library "/usr/lib/libthread_db.so.1".
24/09/2025 22:45 ollama 0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #0 0x00007f59d8e9f042 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #1 0x00007f59d8e931ac in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #2 0x00007f59d8e931f4 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #3 0x00007f59d8f03dcf in wait4 () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama #4 0x00007f599032a5bd in ggml_print_backtrace () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45 ollama #5 0x00007f599032a763 in ggml_abort () from /usr/lib/ollama/libggml-base.so
24/09/2025 22:45 ollama #6 0x00007f597b75c381 in ggml_cuda_error(char const*, char const*, char const*, int, char const*) () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45 ollama #7 0x00007f597b76af02 in ?? () from /usr/lib/ollama/libggml-cuda.so
24/09/2025 22:45 ollama #8 0x00005623b8da30ed in ?? ()
24/09/2025 22:45 ollama #9 0x00005623b8e19c12 in ?? ()
24/09/2025 22:45 ollama #10 0x00005623b8e1afa3 in ?? ()
24/09/2025 22:45 ollama #11 0x00005623b8e1e84a in ?? ()
24/09/2025 22:45 ollama #12 0x00005623b8e1f916 in ?? ()
24/09/2025 22:45 ollama #13 0x00005623b8d5bc50 in ?? ()
24/09/2025 22:45 ollama #14 0x00005623b80743a1 in ?? ()
24/09/2025 22:45 ollama #15 0x0000000000000498 in ?? ()
24/09/2025 22:45 ollama #16 0x000000c000103180 in ?? ()
24/09/2025 22:45 ollama #17 0x00005623b807285a in ?? ()
24/09/2025 22:45 ollama #18 0x00005623b80771e5 in ?? ()
24/09/2025 22:45 ollama #19 0x00007fffc9fc4818 in ?? ()
24/09/2025 22:45 ollama #20 0x00005623b80771e5 in ?? ()
24/09/2025 22:45 ollama #21 0x00005623b9da4260 in ?? ()
24/09/2025 22:45 ollama #22 0x00007fffc9fc48f0 in ?? ()
24/09/2025 22:45 ollama #23 0x00005623b8072645 in ?? ()
24/09/2025 22:45 ollama #24 0x00005623b80725d3 in ?? ()
24/09/2025 22:45 ollama #25 0x0000000000000006 in ?? ()
24/09/2025 22:45 ollama #26 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45 ollama #27 0x0000000000000006 in ?? ()
24/09/2025 22:45 ollama #28 0x0000000000000006 in ?? ()
24/09/2025 22:45 ollama #29 0x00007fffc9fc4978 in ?? ()
24/09/2025 22:45 ollama #30 0x00007f59d8e27675 in ?? () from /usr/lib/libc.so.6
24/09/2025 22:45 ollama Backtrace stopped: previous frame inner to this frame (corrupt stack?)
24/09/2025 22:45 ollama [Inferior 1 (process 310900) detached]
24/09/2025 22:45 ollama SIGABRT: abort
24/09/2025 22:45 ollama PC=0x7f59d8e9894c m=0 sigcode=18446744073709551610
24/09/2025 22:45 ollama signal arrived during cgo execution
24/09/2025 22:45 ollama goroutine 11 gp=0xc000103180 m=0 mp=0x5623b9da6080 [syscall]:
24/09/2025 22:45 ollama runtime.cgocall(0x5623b8d5bc00, 0xc000389bd8)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/cgocall.go:167 +0x4b fp=0xc000389bb0 sp=0xc000389b78 pc=0x5623b806906b
24/09/2025 22:45 ollama github.com/ollama/ollama/llama._Cfunc_llama_decode(0x5623e40f5060, {0xf, 0x5623e40fac20, 0x0, 0x5623e41a39a0, 0x5623e6c89a70, 0x5623e6c8a280, 0x5623e6c8d610})
24/09/2025 22:45 ollama _cgo_gotypes.go:672 +0x4a fp=0xc000389bd8 sp=0xc000389bb0 pc=0x5623b841ea6a
24/09/2025 22:45 ollama github.com/ollama/ollama/llama.(*Context).Decode.func1(...)
24/09/2025 22:45 ollama /build/ollama/src/ollama/llama/llama.go:150
24/09/2025 22:45 ollama github.com/ollama/ollama/llama.(*Context).Decode(0xc00050dd88?, 0x1?)
24/09/2025 22:45 ollama /build/ollama/src/ollama/llama/llama.go:150 +0xed fp=0xc000389cc0 sp=0xc000389bd8 pc=0x5623b842184d
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0002e94a0, 0xc0007120f0, 0xc00050df28)
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:441 +0x209 fp=0xc000389ee8 sp=0xc000389cc0 pc=0x5623b84ec309
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0002e94a0, {0x5623b94e0570, 0xc000123a40})
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:346 +0x1d5 fp=0xc000389fb8 sp=0xc000389ee8 pc=0x5623b84ebf95
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x28 fp=0xc000389fe0 sp=0xc000389fb8 pc=0x5623b84f0ce8
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000389fe8 sp=0xc000389fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:880 +0x4c5
24/09/2025 22:45 ollama goroutine 1 gp=0xc000002380 m=nil [IO wait]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000597790 sp=0xc000597770 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.netpollblock(0xc0005977e0?, 0xb80013a6?, 0x23?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc0005977c8 sp=0xc000597790 pc=0x5623b80301d7
24/09/2025 22:45 ollama internal/poll.runtime_pollWait(0x7f59d94fb400, 0x72)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc0005977e8 sp=0xc0005977c8 pc=0x5623b806b6c5
24/09/2025 22:45 ollama internal/poll.(*pollDesc).wait(0xc0005b1400?, 0x900000036?, 0x0)
24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000597810 sp=0xc0005977e8 pc=0x5623b80f4707
24/09/2025 22:45 ollama internal/poll.(*pollDesc).waitRead(...)
24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:89
24/09/2025 22:45 ollama internal/poll.(*FD).Accept(0xc0005b1400)
24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_unix.go:613 +0x28c fp=0xc0005978b8 sp=0xc000597810 pc=0x5623b80f9b2c
24/09/2025 22:45 ollama net.(*netFD).accept(0xc0005b1400)
24/09/2025 22:45 ollama /usr/lib/go/src/net/fd_unix.go:161 +0x29 fp=0xc000597970 sp=0xc0005978b8 pc=0x5623b8164089
24/09/2025 22:45 ollama net.(*TCPListener).accept(0xc00012fd80)
24/09/2025 22:45 ollama /usr/lib/go/src/net/tcpsock_posix.go:159 +0x1b fp=0xc0005979c0 sp=0xc000597970 pc=0x5623b81797bb
24/09/2025 22:45 ollama net.(*TCPListener).Accept(0xc00012fd80)
24/09/2025 22:45 ollama /usr/lib/go/src/net/tcpsock.go:380 +0x30 fp=0xc0005979f0 sp=0xc0005979c0 pc=0x5623b8178650
24/09/2025 22:45 ollama net/http.(*onceCloseListener).Accept(0xc0004d03f0?)
24/09/2025 22:45 ollama <autogenerated>:1 +0x24 fp=0xc000597a08 sp=0xc0005979f0 pc=0x5623b8399ea4
24/09/2025 22:45 ollama net/http.(*Server).Serve(0xc0001f1500, {0x5623b94ddfc8, 0xc00012fd80})
24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3463 +0x30c fp=0xc000597b38 sp=0xc000597a08 pc=0x5623b837188c
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034260, 0x4, 0x4})
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:901 +0x8f4 fp=0xc000597d08 sp=0xc000597b38 pc=0x5623b84f0a74
24/09/2025 22:45 ollama github.com/ollama/ollama/runner.Execute({0xc000034250?, 0x0?, 0x0?})
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/runner.go:22 +0xd4 fp=0xc000597d30 sp=0xc000597d08 pc=0x5623b8583414
24/09/2025 22:45 ollama github.com/ollama/ollama/cmd.NewCLI.func2(0xc0001f1200?, {0x5623b8fec2d0?, 0x4?, 0x5623b8fec2d4?})
24/09/2025 22:45 ollama /build/ollama/src/ollama/cmd/cmd.go:1706 +0x45 fp=0xc000597d58 sp=0xc000597d30 pc=0x5623b8ced5c5
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).execute(0xc0004d3508, {0xc00012fb80, 0x4, 0x4})
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:940 +0x88a fp=0xc000597e78 sp=0xc000597d58 pc=0x5623b81dd70a
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).ExecuteC(0xc0005c6f08)
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:1068 +0x398 fp=0xc000597f30 sp=0xc000597e78 pc=0x5623b81ddf38
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).Execute(...)
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:992
24/09/2025 22:45 ollama github.com/spf13/cobra.(*Command).ExecuteContext(...)
24/09/2025 22:45 ollama /build/ollama/src/pkg/mod/github.com/spf13/cobra@v1.7.0/command.go:985
24/09/2025 22:45 ollama main.main()
24/09/2025 22:45 ollama /build/ollama/src/ollama/main.go:12 +0x4d fp=0xc000597f50 sp=0xc000597f30 pc=0x5623b8cee08d
24/09/2025 22:45 ollama runtime.main()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:285 +0x29d fp=0xc000597fe0 sp=0xc000597f50 pc=0x5623b8037a7d
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000597fe8 sp=0xc000597fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006cfa8 sp=0xc00006cf88 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.forcegchelper()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:373 +0xb8 fp=0xc00006cfe0 sp=0xc00006cfa8 pc=0x5623b8037db8
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006cfe8 sp=0xc00006cfe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.init.7 in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:361 +0x1a
24/09/2025 22:45 ollama goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
24/09/2025 22:45 ollama runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006d780 sp=0xc00006d760 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.bgsweep(0xc000098000)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgcsweep.go:323 +0xdf fp=0xc00006d7c8 sp=0xc00006d780 pc=0x5623b8021adf
24/09/2025 22:45 ollama runtime.gcenable.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:212 +0x25 fp=0xc00006d7e0 sp=0xc00006d7c8 pc=0x5623b8015a65
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006d7e8 sp=0xc00006d7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcenable in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:212 +0x66
24/09/2025 22:45 ollama goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
24/09/2025 22:45 ollama runtime.gopark(0x10000?, 0x5623b91b3748?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006df78 sp=0xc00006df58 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.(*scavengerState).park(0x5623b9da3100)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00006dfa8 sp=0xc00006df78 pc=0x5623b801f549
24/09/2025 22:45 ollama runtime.bgscavenge(0xc000098000)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc00006dfc8 sp=0xc00006dfa8 pc=0x5623b801faf9
24/09/2025 22:45 ollama runtime.gcenable.gowrap2()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:213 +0x25 fp=0xc00006dfe0 sp=0xc00006dfc8 pc=0x5623b8015a05
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006dfe8 sp=0xc00006dfe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcenable in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:213 +0xa5
24/09/2025 22:45 ollama goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait]:
24/09/2025 22:45 ollama runtime.gopark(0x5623b8046d57?, 0x5623b800d385?, 0xb8?, 0x1?, 0xc000002380?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006c620 sp=0xc00006c600 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.runFinalizers()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mfinal.go:210 +0x107 fp=0xc00006c7e0 sp=0xc00006c620 pc=0x5623b8014967
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006c7e8 sp=0xc00006c7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.createfing in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mfinal.go:172 +0x3d
24/09/2025 22:45 ollama goroutine 6 gp=0xc0001ce8c0 m=nil [cleanup wait]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006e768 sp=0xc00006e748 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.goparkunlock(...)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:466
24/09/2025 22:45 ollama runtime.(*cleanupQueue).dequeue(0x5623b9da3a60)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mcleanup.go:439 +0xc5 fp=0xc00006e7a0 sp=0xc00006e768 pc=0x5623b8011b45
24/09/2025 22:45 ollama runtime.runCleanups()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mcleanup.go:635 +0x45 fp=0xc00006e7e0 sp=0xc00006e7a0 pc=0x5623b8012205
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.(*cleanupQueue).createGs in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mcleanup.go:589 +0xa5
24/09/2025 22:45 ollama goroutine 7 gp=0xc0001cefc0 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ef38 sp=0xc00006ef18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006efc8 sp=0xc00006ef38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 8 gp=0xc0001cf180 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006f7c8 sp=0xc00006f738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 9 gp=0xc0001cf340 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649ae95?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00006ff38 sp=0xc00006ff18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00006ffc8 sp=0xc00006ff38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00006ffe0 sp=0xc00006ffc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 10 gp=0xc0001cf500 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649b2fb?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000068738 sp=0xc000068718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0000687c8 sp=0xc000068738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0000687e0 sp=0xc0000687c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0000687e8 sp=0xc0000687e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 18 gp=0xc000504000 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207648e360?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc00050a7c8 sp=0xc00050a738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 34 gp=0xc000102380 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x320764a388d?, 0x3?, 0xab?, 0x12?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000083f38 sp=0xc000083f18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000083fc8 sp=0xc000083f38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000083fe0 sp=0xc000083fc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000083fe8 sp=0xc000083fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 35 gp=0xc000102540 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649b52d?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000506f38 sp=0xc000506f18 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc000506fc8 sp=0xc000506f38 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc000506fe0 sp=0xc000506fc8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000506fe8 sp=0xc000506fe0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 36 gp=0xc000102700 m=nil [GC worker (idle)]:
24/09/2025 22:45 ollama runtime.gopark(0x3207649d167?, 0x0?, 0x0?, 0x0?, 0x0?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000507738 sp=0xc000507718 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.gcBgMarkWorker(0xc0000a36c0)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1463 +0xeb fp=0xc0005077c8 sp=0xc000507738 pc=0x5623b801818b
24/09/2025 22:45 ollama runtime.gcBgMarkStartWorkers.gowrap1()
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x25 fp=0xc0005077e0 sp=0xc0005077c8 pc=0x5623b8018065
24/09/2025 22:45 ollama runtime.goexit({})
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc0005077e8 sp=0xc0005077e0 pc=0x5623b8074701
24/09/2025 22:45 ollama created by runtime.gcBgMarkStartWorkers in goroutine 1
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/mgc.go:1373 +0x105
24/09/2025 22:45 ollama goroutine 12 gp=0xc000103340 m=nil [select]:
24/09/2025 22:45 ollama runtime.gopark(0xc000047a70?, 0x2?, 0x78?, 0x77?, 0xc0000478bc?)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc0000476e8 sp=0xc0000476c8 pc=0x5623b806c4ee
24/09/2025 22:45 ollama runtime.selectgo(0xc000047a70, 0xc0000478b8, 0xf?, 0x0, 0x1?, 0x1)
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/select.go:351 +0x8c5 fp=0xc000047828 sp=0xc0000476e8 pc=0x5623b804a685
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0002e94a0, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140)
24/09/2025 22:45 ollama /build/ollama/src/ollama/runner/llamarunner/runner.go:629 +0xb30 fp=0xc000047ab8 sp=0xc000047828 pc=0x5623b84edf10
24/09/2025 22:45 ollama github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b38?)
24/09/2025 22:45 ollama <autogenerated>:1 +0x36 fp=0xc000047ae8 sp=0xc000047ab8 pc=0x5623b84f10f6
24/09/2025 22:45 ollama net/http.HandlerFunc.ServeHTTP(0xc0005caf00?, {0x5623b94de1a8?, 0xc0003040f0?}, 0xc000047b58?)
24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:2322 +0x29 fp=0xc000047b10 sp=0xc000047ae8 pc=0x5623b836dec9 24/09/2025 22:45 ollama net/http.(*ServeMux).ServeHTTP(0x5623b800d385?, {0x5623b94de1a8, 0xc0003040f0}, 0xc00043c140) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:2861 +0x1c7 fp=0xc000047b60 sp=0xc000047b10 pc=0x5623b836fda7 24/09/2025 22:45 ollama net/http.serverHandler.ServeHTTP({0x5623b94dad30?}, {0x5623b94de1a8?, 0xc0003040f0?}, 0x1?) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3340 +0x8e fp=0xc000047b90 sp=0xc000047b60 pc=0x5623b838d68e 24/09/2025 22:45 ollama net/http.(*conn).serve(0xc0004d03f0, {0x5623b94e0538, 0xc0004c71d0}) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:2109 +0x665 fp=0xc000047fb8 sp=0xc000047b90 pc=0x5623b836bfc5 24/09/2025 22:45 ollama net/http.(*Server).Serve.gowrap3() 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3493 +0x28 fp=0xc000047fe0 sp=0xc000047fb8 pc=0x5623b8371c88 24/09/2025 22:45 ollama runtime.goexit({}) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000047fe8 sp=0xc000047fe0 pc=0x5623b8074701 24/09/2025 22:45 ollama created by net/http.(*Server).Serve in goroutine 1 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:3493 +0x485 24/09/2025 22:45 ollama goroutine 20 gp=0xc000504380 m=nil [IO wait]: 24/09/2025 22:45 ollama runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/proc.go:460 +0xce fp=0xc000509dd8 sp=0xc000509db8 pc=0x5623b806c4ee 24/09/2025 22:45 ollama runtime.netpollblock(0x5623b8090b98?, 0xb80013a6?, 0x23?) 
24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:575 +0xf7 fp=0xc000509e10 sp=0xc000509dd8 pc=0x5623b80301d7 24/09/2025 22:45 ollama internal/poll.runtime_pollWait(0x7f59d94fb200, 0x72) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/netpoll.go:351 +0x85 fp=0xc000509e30 sp=0xc000509e10 pc=0x5623b806b6c5 24/09/2025 22:45 ollama internal/poll.(*pollDesc).wait(0xc0005b1480?, 0xc00012fde1?, 0x0) 24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000509e58 sp=0xc000509e30 pc=0x5623b80f4707 24/09/2025 22:45 ollama internal/poll.(*pollDesc).waitRead(...) 24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_poll_runtime.go:89 24/09/2025 22:45 ollama internal/poll.(*FD).Read(0xc0005b1480, {0xc00012fde1, 0x1, 0x1}) 24/09/2025 22:45 ollama /usr/lib/go/src/internal/poll/fd_unix.go:165 +0x279 fp=0xc000509ef0 sp=0xc000509e58 pc=0x5623b80f59f9 24/09/2025 22:45 ollama net.(*netFD).Read(0xc0005b1480, {0xc00012fde1?, 0x5623b9ce5ec0?, 0xc000509f70?}) 24/09/2025 22:45 ollama /usr/lib/go/src/net/fd_posix.go:68 +0x25 fp=0xc000509f38 sp=0xc000509ef0 pc=0x5623b81621e5 24/09/2025 22:45 ollama net.(*conn).Read(0xc00011c3c0, {0xc00012fde1?, 0x0?, 0x0?}) 24/09/2025 22:45 ollama /usr/lib/go/src/net/net.go:196 +0x45 fp=0xc000509f80 sp=0xc000509f38 pc=0x5623b8170205 24/09/2025 22:45 ollama net/http.(*connReader).backgroundRead(0xc00012fdc0) 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:702 +0x33 fp=0xc000509fc8 sp=0xc000509f80 pc=0x5623b8366473 24/09/2025 22:45 ollama net/http.(*connReader).startBackgroundRead.gowrap2() 24/09/2025 22:45 ollama /usr/lib/go/src/net/http/server.go:698 +0x25 fp=0xc000509fe0 sp=0xc000509fc8 pc=0x5623b83663a5 24/09/2025 22:45 ollama runtime.goexit({}) 24/09/2025 22:45 ollama /usr/lib/go/src/runtime/asm_amd64.s:1693 +0x1 fp=0xc000509fe8 sp=0xc000509fe0 pc=0x5623b8074701 24/09/2025 22:45 ollama created by net/http.(*connReader).startBackgroundRead in goroutine 12 24/09/2025 22:45 ollama 
/usr/lib/go/src/net/http/server.go:698 +0xb6 24/09/2025 22:45 ollama rax 0x0 24/09/2025 22:45 ollama rbx 0x4be74 24/09/2025 22:45 ollama rcx 0x7f59d8e9894c 24/09/2025 22:45 ollama rdx 0x6 24/09/2025 22:45 ollama rdi 0x4be74 24/09/2025 22:45 ollama rsi 0x4be74 24/09/2025 22:45 ollama rbp 0x7fffc9fbfaa0 24/09/2025 22:45 ollama rsp 0x7fffc9fbfa60 24/09/2025 22:45 ollama r8 0x0 24/09/2025 22:45 ollama r9 0x0 24/09/2025 22:45 ollama r10 0x0 24/09/2025 22:45 ollama r11 0x246 24/09/2025 22:45 ollama r12 0x7f597bc22729 24/09/2025 22:45 ollama r13 0x54 24/09/2025 22:45 ollama r14 0x6 24/09/2025 22:45 ollama r15 0x0 24/09/2025 22:45 ollama rip 0x7f59d8e9894c 24/09/2025 22:45 ollama rflags 0x246 24/09/2025 22:45 ollama cs 0x33 24/09/2025 22:45 ollama fs 0x0 24/09/2025 22:45 ollama gs 0x0 24/09/2025 22:45 ollama time=2025-09-24T22:45:44.225-03:00 level=ERROR source=server.go:1459 msg="post predict" error="Post \"http://127.0.0.1:37197/completion\": EOF" 24/09/2025 22:45 ollama [GIN] 2025/09/24 - 22:45:44 | 200 | 3.564513467s | 127.0.0.1 | POST "/api/generate" 24/09/2025 22:45 ollama time=2025-09-24T22:45:44.253-03:00 level=ERROR source=server.go:425 msg="llama runner terminated" error="exit status 2" ``` #### CUDA_ERR: ```shell [23:59:10.902][140663025804992][CUDA][E] No CUDA context is current to the calling thread [23:59:10.902][140663025804992][CUDA][E] Returning 201 (CUDA_ERROR_INVALID_CONTEXT) from cuCtxGetDevice_v2 [23:59:13.181][140663025804992][CUDA][E] Error handling fatbinary, to get more information when using CUDA Driver APIs use the CU_JIT_ERROR_LOG_BUFFER and CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES parameters [23:59:13.182][140663025804992][CUDA][E] No available relocatable PTX entries for GPU [23:59:13.182][140663025804992][CUDA][E] No device code available for GPU ISA 61 [23:59:13.182][140663025804992][CUDA][E] Kernel (_Z11k_bin_bcastIXadL_ZN42_INTERNAL_f88bb2be_11_binbcast_cu_6840010b6op_addEffEEfffEvPKT0_PKT1_PT2_iiiiiiiiiiiiiiiii) cannot be found in library due 
to compilation error, to get more information when using C[23:59:13.182][140663025804992][CUDA][E] Returning 209 (CUDA_ERROR_NO_BINARY_FOR_GPU) from cuLibraryGetKernel ``` #### NVIDIA-SMI: ```shell Thu Sep 25 00:19:58 2025 +-----------------------------------------------------------------------------------------+ | NVIDIA-SMI 580.82.09 Driver Version: 580.82.09 CUDA Version: 13.0 | +-----------------------------------------+------------------------+----------------------+ | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. | | | | MIG M. | |=========================================+========================+======================| | 0 NVIDIA GeForce GTX 1060 6GB On | 00000000:01:00.0 On | N/A | | 0% 50C P2 27W / 180W | 1499MiB / 6144MiB | 0% Default | | | | N/A | +-----------------------------------------+------------------------+----------------------+ +-----------------------------------------------------------------------------------------+ | Processes: | | GPU GI CI PID Type Process name GPU Memory | | ID ID Usage | |=========================================================================================| | 0 N/A N/A 188994 G /usr/bin/ksecretd 1MiB | | 0 N/A N/A 189133 G /usr/bin/kwin_wayland 80MiB | | 0 N/A N/A 189269 G /usr/bin/Xwayland 5MiB | | 0 N/A N/A 189344 G /usr/bin/ksmserver 1MiB | | 0 N/A N/A 189346 G /usr/bin/kded6 1MiB | | 0 N/A N/A 189395 G /usr/bin/plasmashell 172MiB | | 0 N/A N/A 189474 G /usr/bin/kaccess 1MiB | | 0 N/A N/A 189477 G ...it-kde-authentication-agent-1 1MiB | | 0 N/A N/A 189657 G /usr/bin/kwalletd6 1MiB | | 0 N/A N/A 189758 G /usr/bin/kwalletmanager5 1MiB | | 0 N/A N/A 189834 G /opt/flemozi/flemozi 4MiB | | 0 N/A N/A 189911 G /usr/bin/kdeconnectd 1MiB | | 0 N/A N/A 189917 G /usr/bin/xwaylandvideobridge 1MiB | | 0 N/A N/A 189922 G /usr/bin/yakuake 1MiB | | 0 N/A N/A 189927 G /usr/bin/qbittorrent 1MiB | | 0 N/A N/A 189985 G vicinae 1MiB | | 0 N/A N/A 
190031 G /usr/lib/DiscoverNotifier 1MiB | | 0 N/A N/A 190032 G /usr/bin/kalendarac 1MiB | | 0 N/A N/A 190033 G /usr/bin/kgpg 1MiB | | 0 N/A N/A 190164 G /usr/lib/xdg-desktop-portal-kde 1MiB | | 0 N/A N/A 190334 G /usr/bin/akonadi_control 1MiB | | 0 N/A N/A 190465 G ...bin/akonadi_archivemail_agent 1MiB | | 0 N/A N/A 190468 G ...konadi_followupreminder_agent 1MiB | | 0 N/A N/A 190469 G /usr/bin/akonadi_google_resource 1MiB | | 0 N/A N/A 190473 G .../akonadi_maildispatcher_agent 1MiB | | 0 N/A N/A 190474 G .../bin/akonadi_mailfilter_agent 1MiB | | 0 N/A N/A 190475 G /usr/bin/akonadi_mailmerge_agent 1MiB | | 0 N/A N/A 190479 G /usr/bin/akonadi_migration_agent 1MiB | | 0 N/A N/A 190480 G ...akonadi_newmailnotifier_agent 1MiB | | 0 N/A N/A 190481 G /usr/bin/akonadi_sendlater_agent 1MiB | | 0 N/A N/A 190484 G .../akonadi_unifiedmailbox_agent 1MiB | | 0 N/A N/A 193538 G /usr/lib/baloorunner 1MiB | | 0 N/A N/A 196733 G /usr/bin/kalarm 1MiB | | 0 N/A N/A 196742 G /usr/bin/konsole 1MiB | | 0 N/A N/A 214267 G /usr/lib/firefox/firefox 1101MiB | | 0 N/A N/A 214759 G ...asma-browser-integration-host 1MiB | +-----------------------------------------------------------------------------------------+ ``` ### OS Linux (EndeavourOS) ### GPU Nvidia ### CPU Intel ### Ollama version 0.12.1
GiteaMirror added the "needs more info" and "bug" labels 2026-04-22 17:13:25 -05:00

@rick-github commented on GitHub (Sep 25, 2025):

Was ollama installed from an Arch repo, or through the [official method](https://ollama.com/download)?


@Dominiquini commented on GitHub (Sep 25, 2025):

> Was ollama installed from an Arch repo, or through the official method?

I installed it from the Arch repo. I'll try installing through the official method and check whether the same error happens!
But I suspect the problem is that CUDA 13 (updated from the Arch repos) dropped support for my GPU (NVIDIA GeForce GTX 1060)!

Thanks.
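(Editor's note: the "No device code available for GPU ISA 61" line in the log above fits this theory; ISA 61 is Pascal's compute capability 6.1. Below is a minimal sketch of that check. The helper and its cutoff table are hypothetical, based on NVIDIA's release notes as I understand them, where CUDA 12 dropped Kepler and CUDA 13 dropped Maxwell/Pascal/Volta; verify the numbers against the notes for your exact toolkit version.)

```python
# Hypothetical helper: minimum compute capability (sm_XX) each CUDA toolkit
# major version still ships device code for. Assumed values, not authoritative:
#   CUDA 11 -> sm_35 (Kepler), CUDA 12 -> sm_50 (Maxwell), CUDA 13 -> sm_75 (Turing)
MIN_COMPUTE_CAP = {11: 35, 12: 50, 13: 75}

def toolkit_supports(cuda_major: int, compute_cap: int) -> bool:
    """True if the given toolkit can still emit kernels for this GPU ISA."""
    return compute_cap >= MIN_COMPUTE_CAP[cuda_major]

# GTX 1060 (Pascal) is compute capability 6.1, i.e. "GPU ISA 61" in the log.
print(toolkit_supports(12, 61))  # True: CUDA 12 still targets Pascal
print(toolkit_supports(13, 61))  # False: matches CUDA_ERROR_NO_BINARY_FOR_GPU
```

On a live system, recent drivers can report the card's compute capability with `nvidia-smi --query-gpu=compute_cap --format=csv` for comparison.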


@omnigenous commented on GitHub (Sep 25, 2025):

@Dominiquini same issue, did you find a solution for CUDA on Pascal cards on Arch?


@Dominiquini commented on GitHub (Sep 25, 2025):

> @Dominiquini same issue, did you find solution for CUDA on pascal cards on Arch?

Not yet! I opened this issue on the Arch repo: https://gitlab.archlinux.org/archlinux/packaging/packages/ollama/-/issues/26

At least I was able to run it locally, using my GPU (with CUDA), by taking the binaries from GitHub (https://github.com/ollama/ollama/releases/download/v0.12.2/ollama-linux-amd64.tgz) and using the libs from the folder `./lib/ollama/cuda_v12/`! Even though my machine doesn't have a working CUDA 12 toolkit (CUDA 13 is installed!), these libs work fine!


@derfehler commented on GitHub (Sep 29, 2025):

TL;DR: Everyone using Maxwell, Pascal, and Volta architectures is doomed. All rolling distributions that migrated to CUDA 13 have dropped support — what else could one expect from a trillion-dollar corpo?


@Dominiquini commented on GitHub (Sep 29, 2025):

I'm closing this bug as there is no action that can be taken!

Note: For Arch, I created two packages in the AUR that allow this program to be used on Pascal boards:

- `ollama-bin`
- `ollama-cuda12-bin`
Reference: github-starred/ollama#34001