[GH-ISSUE #9976] segmentation violation GGML_ASSERT(sections.v[0] > 0 || sections.v[1] > 0 || sections.v[2] > 0) #53047

Closed
opened 2026-04-29 01:45:53 -05:00 by GiteaMirror · 1 comment

Originally created by @thafer6 on GitHub (Mar 25, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9976

What is the issue?

I'm getting the following error when running inference with a qwen2vl-architecture GGUF model (Q8_0) on a CUDA GPU (Tesla T4):

Mar 25 08:53:54 **** ollama[121821]: //ml/backend/ggml/ggml/src/ggml-cuda/rope.cu:381: GGML_ASSERT(sections.v[0] > 0 || sections.v[1] > 0 || sections.v[2] > 0) failed
Mar 25 08:53:54 **** ollama[121821]: SIGSEGV: segmentation violation
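
For readers unfamiliar with the failing check, here is a minimal Python sketch of the condition quoted in the assertion message. It only mirrors the expression `sections.v[0] > 0 || sections.v[1] > 0 || sections.v[2] > 0` as it is printed in the log; the function name `rope_sections_ok` is hypothetical and is not part of ggml or ollama. The first input is the value of `qwen2vl.rope.dimension_sections` that `llama_model_loader` prints in the log output further down.

```python
def rope_sections_ok(sections):
    """Mirror of the logged assertion: at least one of the first
    three section sizes must be positive. Hypothetical helper,
    not an ollama or ggml API."""
    return sections[0] > 0 or sections[1] > 0 or sections[2] > 0


# Value of qwen2vl.rope.dimension_sections as printed by llama_model_loader.
logged_sections = [16, 24, 24, 0]

# All-zero sections are the only kind of input that fails the condition.
zero_sections = [0, 0, 0, 0]

print(rope_sections_ok(logged_sections))
print(rope_sections_ok(zero_sections))
```

The metadata value from the model file satisfies the condition, so the sections that reached the CUDA rope operation at decode time were presumably not the ones stored in the GGUF. That is an inference from the assertion text alone, not something the log states directly.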

Relevant log output

Mar 25 08:52:35: time=2025-03-25T08:52:35.197Z level=WARN source=ggml.go:149 msg="key not found" key=qwen2vl.vision.block_count default=0
Mar 25 08:52:35: time=2025-03-25T08:52:35.198Z level=WARN source=ggml.go:149 msg="key not found" key=qwen2vl.attention.key_length default=128
Mar 25 08:52:35: time=2025-03-25T08:52:35.198Z level=WARN source=ggml.go:149 msg="key not found" key=qwen2vl.attention.value_length default=128
Mar 25 08:52:35: time=2025-03-25T08:52:35.198Z level=INFO source=sched.go:715 msg="new model will fit in available VRAM in single GPU, loading" model=/usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c gpu=GPU-33f80123-4c0a-59d3-5a8d-8e90642b62b6 parallel=4 available=15545794560 required="8.6 GiB"
Mar 25 08:52:35: time=2025-03-25T08:52:35.459Z level=INFO source=server.go:105 msg="system memory" total="31.3 GiB" free="26.4 GiB" free_swap="0 B"
Mar 25 08:52:35: time=2025-03-25T08:52:35.459Z level=WARN source=ggml.go:149 msg="key not found" key=qwen2vl.vision.block_count default=0
Mar 25 08:52:35: time=2025-03-25T08:52:35.460Z level=WARN source=ggml.go:149 msg="key not found" key=qwen2vl.attention.key_length default=128
Mar 25 08:52:35: time=2025-03-25T08:52:35.460Z level=WARN source=ggml.go:149 msg="key not found" key=qwen2vl.attention.value_length default=128
Mar 25 08:52:35: time=2025-03-25T08:52:35.460Z level=INFO source=server.go:138 msg=offload library=cuda layers.requested=-1 layers.model=29 layers.offload=29 layers.split="" memory.available="[14.5 GiB]" memory.gpu_overhead="0 B" memory.required.full="8.6 GiB" memory.required.partial="8.6 GiB" memory.required.kv="448.0 MiB" memory.required.allocations="[8.6 GiB]" memory.weights.total="6.5 GiB" memory.weights.repeating="6.5 GiB" memory.weights.nonrepeating="552.2 MiB" memory.graph.full="522.7 MiB" memory.graph.partial="522.7 MiB"
Mar 25 08:52:35: llama_model_loader: loaded meta data with 35 key-value pairs and 339 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c (version GGUF V3 (latest))
Mar 25 08:52:35: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Mar 25 08:52:35: llama_model_loader: - kv   0:                       general.architecture str              = qwen2vl
Mar 25 08:52:35: llama_model_loader: - kv   1:                               general.type str              = model
Mar 25 08:52:35: llama_model_loader: - kv   2:                               general.name str              = Olmoe_Model_Hf
Mar 25 08:52:35: llama_model_loader: - kv   3:                         general.size_label str              = 7.6B
Mar 25 08:52:35: llama_model_loader: - kv   4:                            general.license str              = apache-2.0
Mar 25 08:52:35: llama_model_loader: - kv   5:                   general.base_model.count u32              = 1
Mar 25 08:52:35: llama_model_loader: - kv   6:                  general.base_model.0.name str              = Qwen2 VL 7B Instruct
Mar 25 08:52:35: llama_model_loader: - kv   7:          general.base_model.0.organization str              = Qwen
Mar 25 08:52:35: llama_model_loader: - kv   8:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen2-VL-...
Mar 25 08:52:35: llama_model_loader: - kv   9:                      general.dataset.count u32              = 1
Mar 25 08:52:35: llama_model_loader: - kv  10:                     general.dataset.0.name str              = olmOCR Mix 0225
Mar 25 08:52:35: llama_model_loader: - kv  11:                  general.dataset.0.version str              = 0225
Mar 25 08:52:35: llama_model_loader: - kv  12:             general.dataset.0.organization str              = Allenai
Mar 25 08:52:35: llama_model_loader: - kv  13:                 general.dataset.0.repo_url str              = https://huggingface.co/allenai/olmOCR...
Mar 25 08:52:35: llama_model_loader: - kv  14:                          general.languages arr[str,1]       = ["en"]
Mar 25 08:52:35: llama_model_loader: - kv  15:                        qwen2vl.block_count u32              = 28
Mar 25 08:52:35: llama_model_loader: - kv  16:                     qwen2vl.context_length u32              = 32768
Mar 25 08:52:35: llama_model_loader: - kv  17:                   qwen2vl.embedding_length u32              = 3584
Mar 25 08:52:35: llama_model_loader: - kv  18:                qwen2vl.feed_forward_length u32              = 18944
Mar 25 08:52:35: llama_model_loader: - kv  19:               qwen2vl.attention.head_count u32              = 28
Mar 25 08:52:35: llama_model_loader: - kv  20:            qwen2vl.attention.head_count_kv u32              = 4
Mar 25 08:52:35: llama_model_loader: - kv  21:                     qwen2vl.rope.freq_base f32              = 1000000.000000
Mar 25 08:52:35: llama_model_loader: - kv  22:   qwen2vl.attention.layer_norm_rms_epsilon f32              = 0.000001
Mar 25 08:52:35: llama_model_loader: - kv  23:                          general.file_type u32              = 7
Mar 25 08:52:35: llama_model_loader: - kv  24:            qwen2vl.rope.dimension_sections arr[i32,4]       = [16, 24, 24, 0]
Mar 25 08:52:35: llama_model_loader: - kv  25:                       tokenizer.ggml.model str              = gpt2
Mar 25 08:52:35: llama_model_loader: - kv  26:                         tokenizer.ggml.pre str              = qwen2
Mar 25 08:52:35: llama_model_loader: - kv  27:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
Mar 25 08:52:35: llama_model_loader: - kv  28:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Mar 25 08:52:35: llama_model_loader: - kv  29:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
Mar 25 08:52:35: llama_model_loader: - kv  30:                tokenizer.ggml.eos_token_id u32              = 151645
Mar 25 08:52:35: llama_model_loader: - kv  31:            tokenizer.ggml.padding_token_id u32              = 151643
Mar 25 08:52:35: llama_model_loader: - kv  32:                tokenizer.ggml.bos_token_id u32              = 151643
Mar 25 08:52:35: llama_model_loader: - kv  33:                    tokenizer.chat_template str              = {% set image_count = namespace(value=...
Mar 25 08:52:35: llama_model_loader: - kv  34:               general.quantization_version u32              = 2
Mar 25 08:52:35: llama_model_loader: - type  f32:  141 tensors
Mar 25 08:52:35: llama_model_loader: - type q8_0:  198 tensors
Mar 25 08:52:35: print_info: file format = GGUF V3 (latest)
Mar 25 08:52:35: print_info: file type   = Q8_0
Mar 25 08:52:35: print_info: file size   = 7.54 GiB (8.50 BPW)
Mar 25 08:52:35: load: special tokens cache size = 14
Mar 25 08:52:35: load: token to piece cache size = 0.9309 MB
Mar 25 08:52:35: print_info: arch             = qwen2vl
Mar 25 08:52:35: print_info: vocab_only       = 1
Mar 25 08:52:35: print_info: model type       = ?B
Mar 25 08:52:35: print_info: model params     = 7.62 B
Mar 25 08:52:35: print_info: general.name     = Olmoe_Model_Hf
Mar 25 08:52:35: print_info: vocab type       = BPE
Mar 25 08:52:35: print_info: n_vocab          = 152064
Mar 25 08:52:35: print_info: n_merges         = 151387
Mar 25 08:52:35: print_info: BOS token        = 151643 '<|endoftext|>'
Mar 25 08:52:35: print_info: EOS token        = 151645 '<|im_end|>'
Mar 25 08:52:35: print_info: EOT token        = 151645 '<|im_end|>'
Mar 25 08:52:35: print_info: PAD token        = 151643 '<|endoftext|>'
Mar 25 08:52:35: print_info: LF token         = 198 'Ċ'
Mar 25 08:52:35: print_info: EOG token        = 151643 '<|endoftext|>'
Mar 25 08:52:35: print_info: EOG token        = 151645 '<|im_end|>'
Mar 25 08:52:35: print_info: max token length = 256
Mar 25 08:52:35: llama_model_load: vocab only - skipping tensors
Mar 25 08:52:35: time=2025-03-25T08:52:35.712Z level=INFO source=server.go:405 msg="starting llama server" cmd="/usr/local/bin/ollama runner --model /usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c --ctx-size 8192 --batch-size 512 --n-gpu-layers 29 --threads 4 --parallel 4 --port 44713"
Mar 25 08:52:35: time=2025-03-25T08:52:35.713Z level=INFO source=sched.go:450 msg="loaded runners" count=1
Mar 25 08:52:35: time=2025-03-25T08:52:35.713Z level=INFO source=server.go:585 msg="waiting for llama runner to start responding"
Mar 25 08:52:35: time=2025-03-25T08:52:35.713Z level=INFO source=server.go:619 msg="waiting for server to become available" status="llm server error"
Mar 25 08:52:35: time=2025-03-25T08:52:35.725Z level=INFO source=runner.go:931 msg="starting go runner"
Mar 25 08:52:35: ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
Mar 25 08:52:35: ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
Mar 25 08:52:35: ggml_cuda_init: found 1 CUDA devices:
Mar 25 08:52:35:   Device 0: Tesla T4, compute capability 7.5, VMM: yes
Mar 25 08:52:35: load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v12/libggml-cuda.so
Mar 25 08:52:35: load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-skylakex.so
Mar 25 08:52:35: time=2025-03-25T08:52:35.808Z level=INFO source=ggml.go:109 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.AVX512=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 CUDA.0.ARCHS=500,600,610,700,750,800,860,870,890,900,1200 CUDA.0.USE_GRAPHS=1 CUDA.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(gcc)
Mar 25 08:52:35: time=2025-03-25T08:52:35.809Z level=INFO source=runner.go:991 msg="Server listening on 127.0.0.1:44713"
Mar 25 08:52:35: llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14825 MiB free
Mar 25 08:52:35: time=2025-03-25T08:52:35.964Z level=INFO source=server.go:619 msg="waiting for server to become available" status="llm server loading model"
Mar 25 08:52:36: llama_model_loader: loaded meta data with 35 key-value pairs and 339 tensors from /usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c (version GGUF V3 (latest))
Mar 25 08:52:36: llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Mar 25 08:52:36: llama_model_loader: - kv   0:                       general.architecture str              = qwen2vl
Mar 25 08:52:36: llama_model_loader: - kv   1:                               general.type str              = model
Mar 25 08:52:36: llama_model_loader: - kv   2:                               general.name str              = Olmoe_Model_Hf
Mar 25 08:52:36: llama_model_loader: - kv   3:                         general.size_label str              = 7.6B
Mar 25 08:52:36: llama_model_loader: - kv   4:                            general.license str              = apache-2.0
Mar 25 08:52:36: llama_model_loader: - kv   5:                   general.base_model.count u32              = 1
Mar 25 08:52:36: llama_model_loader: - kv   6:                  general.base_model.0.name str              = Qwen2 VL 7B Instruct
Mar 25 08:52:36: llama_model_loader: - kv   7:          general.base_model.0.organization str              = Qwen
Mar 25 08:52:36: llama_model_loader: - kv   8:              general.base_model.0.repo_url str              = https://huggingface.co/Qwen/Qwen2-VL-...
Mar 25 08:52:36: llama_model_loader: - kv   9:                      general.dataset.count u32              = 1
Mar 25 08:52:36: llama_model_loader: - kv  10:                     general.dataset.0.name str              = olmOCR Mix 0225
Mar 25 08:52:36: llama_model_loader: - kv  11:                  general.dataset.0.version str              = 0225
Mar 25 08:52:36: llama_model_loader: - kv  12:             general.dataset.0.organization str              = Allenai
Mar 25 08:52:36: llama_model_loader: - kv  13:                 general.dataset.0.repo_url str              = https://huggingface.co/allenai/olmOCR...
Mar 25 08:52:36: llama_model_loader: - kv  14:                          general.languages arr[str,1]       = ["en"]
Mar 25 08:52:36: llama_model_loader: - kv  15:                        qwen2vl.block_count u32              = 28
Mar 25 08:52:36: llama_model_loader: - kv  16:                     qwen2vl.context_length u32              = 32768
Mar 25 08:52:36: llama_model_loader: - kv  17:                   qwen2vl.embedding_length u32              = 3584
Mar 25 08:52:36: llama_model_loader: - kv  18:                qwen2vl.feed_forward_length u32              = 18944
Mar 25 08:52:36: llama_model_loader: - kv  19:               qwen2vl.attention.head_count u32              = 28
Mar 25 08:52:36: llama_model_loader: - kv  20:            qwen2vl.attention.head_count_kv u32              = 4
Mar 25 08:52:36: llama_model_loader: - kv  21:                     qwen2vl.rope.freq_base f32              = 1000000.000000
Mar 25 08:52:36: llama_model_loader: - kv  22:   qwen2vl.attention.layer_norm_rms_epsilon f32              = 0.000001
Mar 25 08:52:36: llama_model_loader: - kv  23:                          general.file_type u32              = 7
Mar 25 08:52:36: llama_model_loader: - kv  24:            qwen2vl.rope.dimension_sections arr[i32,4]       = [16, 24, 24, 0]
Mar 25 08:52:36: llama_model_loader: - kv  25:                       tokenizer.ggml.model str              = gpt2
Mar 25 08:52:36: llama_model_loader: - kv  26:                         tokenizer.ggml.pre str              = qwen2
Mar 25 08:52:36: llama_model_loader: - kv  27:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
Mar 25 08:52:36: llama_model_loader: - kv  28:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Mar 25 08:52:36: llama_model_loader: - kv  29:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
Mar 25 08:52:36: llama_model_loader: - kv  30:                tokenizer.ggml.eos_token_id u32              = 151645
Mar 25 08:52:36: llama_model_loader: - kv  31:            tokenizer.ggml.padding_token_id u32              = 151643
Mar 25 08:52:36: llama_model_loader: - kv  32:                tokenizer.ggml.bos_token_id u32              = 151643
Mar 25 08:52:36: llama_model_loader: - kv  33:                    tokenizer.chat_template str              = {% set image_count = namespace(value=...
Mar 25 08:52:36: llama_model_loader: - kv  34:               general.quantization_version u32              = 2
Mar 25 08:52:36: llama_model_loader: - type  f32:  141 tensors
Mar 25 08:52:36: llama_model_loader: - type q8_0:  198 tensors
Mar 25 08:52:36: print_info: file format = GGUF V3 (latest)
Mar 25 08:52:36: print_info: file type   = Q8_0
Mar 25 08:52:36: print_info: file size   = 7.54 GiB (8.50 BPW)
Mar 25 08:52:36: load: special tokens cache size = 14
Mar 25 08:52:36: load: token to piece cache size = 0.9309 MB
Mar 25 08:52:36: print_info: arch             = qwen2vl
Mar 25 08:52:36: print_info: vocab_only       = 0
Mar 25 08:52:36: print_info: n_ctx_train      = 32768
Mar 25 08:52:36: print_info: n_embd           = 3584
Mar 25 08:52:36: print_info: n_layer          = 28
Mar 25 08:52:36: print_info: n_head           = 28
Mar 25 08:52:36: print_info: n_head_kv        = 4
Mar 25 08:52:36: print_info: n_rot            = 128
Mar 25 08:52:36: print_info: n_swa            = 0
Mar 25 08:52:36: print_info: n_embd_head_k    = 128
Mar 25 08:52:36: print_info: n_embd_head_v    = 128
Mar 25 08:52:36: print_info: n_gqa            = 7
Mar 25 08:52:36: print_info: n_embd_k_gqa     = 512
Mar 25 08:52:36: print_info: n_embd_v_gqa     = 512
Mar 25 08:52:36: print_info: f_norm_eps       = 0.0e+00
Mar 25 08:52:36: print_info: f_norm_rms_eps   = 1.0e-06
Mar 25 08:52:36: print_info: f_clamp_kqv      = 0.0e+00
Mar 25 08:52:36: print_info: f_max_alibi_bias = 0.0e+00
Mar 25 08:52:36: print_info: f_logit_scale    = 0.0e+00
Mar 25 08:52:36: print_info: n_ff             = 18944
Mar 25 08:52:36: print_info: n_expert         = 0
Mar 25 08:52:36: print_info: n_expert_used    = 0
Mar 25 08:52:36: print_info: causal attn      = 1
Mar 25 08:52:36: print_info: pooling type     = 0
Mar 25 08:52:36: print_info: rope type        = 8
Mar 25 08:52:36: print_info: rope scaling     = linear
Mar 25 08:52:36: print_info: freq_base_train  = 1000000.0
Mar 25 08:52:36: print_info: freq_scale_train = 1
Mar 25 08:52:36: print_info: n_ctx_orig_yarn  = 32768
Mar 25 08:52:36: print_info: rope_finetuned   = unknown
Mar 25 08:52:36: print_info: ssm_d_conv       = 0
Mar 25 08:52:36: print_info: ssm_d_inner      = 0
Mar 25 08:52:36: print_info: ssm_d_state      = 0
Mar 25 08:52:36: print_info: ssm_dt_rank      = 0
Mar 25 08:52:36: print_info: ssm_dt_b_c_rms   = 0
Mar 25 08:52:36: print_info: model type       = 7B
Mar 25 08:52:36: print_info: model params     = 7.62 B
Mar 25 08:52:36: print_info: general.name     = Olmoe_Model_Hf
Mar 25 08:52:36: print_info: vocab type       = BPE
Mar 25 08:52:36: print_info: n_vocab          = 152064
Mar 25 08:52:36: print_info: n_merges         = 151387
Mar 25 08:52:36: print_info: BOS token        = 151643 '<|endoftext|>'
Mar 25 08:52:36: print_info: EOS token        = 151645 '<|im_end|>'
Mar 25 08:52:36: print_info: EOT token        = 151645 '<|im_end|>'
Mar 25 08:52:36: print_info: PAD token        = 151643 '<|endoftext|>'
Mar 25 08:52:36: print_info: LF token         = 198 'Ċ'
Mar 25 08:52:36: print_info: EOG token        = 151643 '<|endoftext|>'
Mar 25 08:52:36: print_info: EOG token        = 151645 '<|im_end|>'
Mar 25 08:52:36: print_info: max token length = 256
Mar 25 08:52:36: load_tensors: loading model tensors, this can take a while... (mmap = true)
Mar 25 08:52:36: load_tensors: offloading 28 repeating layers to GPU
Mar 25 08:52:36: load_tensors: offloading output layer to GPU
Mar 25 08:52:36: load_tensors: offloaded 29/29 layers to GPU
Mar 25 08:52:36: load_tensors:        CUDA0 model buffer size =  7165.44 MiB
Mar 25 08:52:36: load_tensors:   CPU_Mapped model buffer size =   552.23 MiB
Mar 25 08:52:38: llama_init_from_model: n_seq_max     = 4
Mar 25 08:52:38: llama_init_from_model: n_ctx         = 8192
Mar 25 08:52:38: llama_init_from_model: n_ctx_per_seq = 2048
Mar 25 08:52:38: llama_init_from_model: n_batch       = 2048
Mar 25 08:52:38: llama_init_from_model: n_ubatch      = 512
Mar 25 08:52:38: llama_init_from_model: flash_attn    = 0
Mar 25 08:52:38: llama_init_from_model: freq_base     = 1000000.0
Mar 25 08:52:38: llama_init_from_model: freq_scale    = 1
Mar 25 08:52:38: llama_init_from_model: n_ctx_per_seq (2048) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
Mar 25 08:52:38: llama_kv_cache_init: kv_size = 8192, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 28, can_shift = 1
Mar 25 08:52:38: llama_kv_cache_init:      CUDA0 KV buffer size =   448.00 MiB
Mar 25 08:52:38: llama_init_from_model: KV self size  =  448.00 MiB, K (f16):  224.00 MiB, V (f16):  224.00 MiB
Mar 25 08:52:38: llama_init_from_model:  CUDA_Host  output buffer size =     2.38 MiB
Mar 25 08:52:38: llama_init_from_model:      CUDA0 compute buffer size =   492.01 MiB
Mar 25 08:52:38: llama_init_from_model:  CUDA_Host compute buffer size =    23.01 MiB
Mar 25 08:52:38: llama_init_from_model: graph nodes  = 986
Mar 25 08:52:38: llama_init_from_model: graph splits = 2
Mar 25 08:52:38: time=2025-03-25T08:52:38.472Z level=INFO source=server.go:624 msg="llama runner started in 2.76 seconds"
Mar 25 08:53:54: //ml/backend/ggml/ggml/src/ggml-cuda/rope.cu:381: GGML_ASSERT(sections.v[0] > 0 || sections.v[1] > 0 || sections.v[2] > 0) failed
Mar 25 08:53:54: SIGSEGV: segmentation violation
Mar 25 08:53:54: PC=0x7fb9ae7f30d7 m=3 sigcode=1 addr=0x205003fcc
Mar 25 08:53:54: signal arrived during cgo execution
Mar 25 08:53:54: goroutine 11 gp=0xc0006028c0 m=3 mp=0xc000079008 [syscall]:
Mar 25 08:53:54: runtime.cgocall(0x56224168e100, 0xc00008fbc8)
Mar 25 08:53:54:         runtime/cgocall.go:167 +0x4b fp=0xc00008fba0 sp=0xc00008fb68 pc=0x562240a1c60b
Mar 25 08:53:54: github.com/ollama/ollama/llama._Cfunc_llama_decode(0x7fb9a88ac240, {0x1, 0x7fb9a8906620, 0x0, 0x0, 0x7fb9a8908630, 0x7fb9a890a640, 0x7fb9a88d4b80, 0x7fb9aafae7c0})
Mar 25 08:53:54:         _cgo_gotypes.go:574 +0x4a fp=0xc00008fbc8 sp=0xc00008fba0 pc=0x562240db3eea
Mar 25 08:53:54: github.com/ollama/ollama/llama.(*Context).Decode.func1(...)
Mar 25 08:53:54:         github.com/ollama/ollama/llama/llama.go:132
Mar 25 08:53:54: github.com/ollama/ollama/llama.(*Context).Decode(0x56224260d080?, 0x0?)
Mar 25 08:53:54:         github.com/ollama/ollama/llama/llama.go:132 +0xf6 fp=0xc00008fcc8 sp=0xc00008fbc8 pc=0x562240db6c96
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0004ba360, 0xc0001106c0, 0xc00008ff20)
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:435 +0x23e fp=0xc00008fee0 sp=0xc00008fcc8 pc=0x562240dc0abe
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0004ba360, {0x562241cfb940, 0xc000366230})
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:343 +0x1d5 fp=0xc00008ffb8 sp=0xc00008fee0 pc=0x562240dc0715
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:972 +0x28 fp=0xc00008ffe0 sp=0xc00008ffb8 pc=0x562240dc4fc8
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00008ffe8 sp=0xc00008ffe0 pc=0x562240a27021
Mar 25 08:53:54: created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:972 +0xcb7
Mar 25 08:53:54: goroutine 1 gp=0xc000002380 m=nil [IO wait, 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc0000495f8 sp=0xc0000495d8 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.netpollblock(0xc00051d678?, 0x409b9226?, 0x22?)
Mar 25 08:53:54:         runtime/netpoll.go:575 +0xf7 fp=0xc000049630 sp=0xc0000495f8 pc=0x5622409e46f7
Mar 25 08:53:54: internal/poll.runtime_pollWait(0x7fb9d1dd3eb0, 0x72)
Mar 25 08:53:54:         runtime/netpoll.go:351 +0x85 fp=0xc000049650 sp=0xc000049630 pc=0x562240a1eb05
Mar 25 08:53:54: internal/poll.(*pollDesc).wait(0xc000529c00?, 0x5622409c7406?, 0x0)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000049678 sp=0xc000049650 pc=0x562240aa5f87
Mar 25 08:53:54: internal/poll.(*pollDesc).waitRead(...)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:89
Mar 25 08:53:54: internal/poll.(*FD).Accept(0xc000529c00)
Mar 25 08:53:54:         internal/poll/fd_unix.go:620 +0x295 fp=0xc000049720 sp=0xc000049678 pc=0x562240aab355
Mar 25 08:53:54: net.(*netFD).accept(0xc000529c00)
Mar 25 08:53:54:         net/fd_unix.go:172 +0x29 fp=0xc0000497d8 sp=0xc000049720 pc=0x562240b1e169
Mar 25 08:53:54: net.(*TCPListener).accept(0xc000404b00)
Mar 25 08:53:54:         net/tcpsock_posix.go:159 +0x1b fp=0xc000049828 sp=0xc0000497d8 pc=0x562240b33b1b
Mar 25 08:53:54: net.(*TCPListener).Accept(0xc000404b00)
Mar 25 08:53:54:         net/tcpsock.go:380 +0x30 fp=0xc000049858 sp=0xc000049828 pc=0x562240b329d0
Mar 25 08:53:54: net/http.(*onceCloseListener).Accept(0xc0000ee000?)
Mar 25 08:53:54:         <autogenerated>:1 +0x24 fp=0xc000049870 sp=0xc000049858 pc=0x562240d4a004
Mar 25 08:53:54: net/http.(*Server).Serve(0xc000035a00, {0x562241cf9678, 0xc000404b00})
Mar 25 08:53:54:         net/http/server.go:3424 +0x30c fp=0xc0000499a0 sp=0xc000049870 pc=0x562240d218cc
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034120, 0xe, 0xe})
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:992 +0x108a fp=0xc000049d08 sp=0xc0000499a0 pc=0x562240dc4d0a
Mar 25 08:53:54: github.com/ollama/ollama/runner.Execute({0xc000034110?, 0x0?, 0x0?})
Mar 25 08:53:54:         github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc000049d30 sp=0xc000049d08 pc=0x562240eaf914
Mar 25 08:53:54: github.com/ollama/ollama/cmd.NewCLI.func2(0xc000035600?, {0x56224186b054?, 0x4?, 0x56224186b058?})
Mar 25 08:53:54:         github.com/ollama/ollama/cmd/cmd.go:1327 +0x45 fp=0xc000049d58 sp=0xc000049d30 pc=0x5622416208a5
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).execute(0xc00049d508, {0xc000523ea0, 0xe, 0xe})
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc000049e78 sp=0xc000049d58 pc=0x562240b977bc
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).ExecuteC(0xc000558c08)
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000049f30 sp=0xc000049e78 pc=0x562240b98005
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).Execute(...)
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:992
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).ExecuteContext(...)
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:985
Mar 25 08:53:54: main.main()
Mar 25 08:53:54:         github.com/ollama/ollama/main.go:12 +0x4d fp=0xc000049f50 sp=0xc000049f30 pc=0x562241620c0d
Mar 25 08:53:54: runtime.main()
Mar 25 08:53:54:         runtime/proc.go:283 +0x29d fp=0xc000049fe0 sp=0xc000049f50 pc=0x5622409ebcfd
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000049fe8 sp=0xc000049fe0 pc=0x562240a27021
Mar 25 08:53:54: goroutine 2 gp=0xc000002e00 m=nil [force gc (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000072fa8 sp=0xc000072f88 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.goparkunlock(...)
Mar 25 08:53:54:         runtime/proc.go:441
Mar 25 08:53:54: runtime.forcegchelper()
Mar 25 08:53:54:         runtime/proc.go:348 +0xb8 fp=0xc000072fe0 sp=0xc000072fa8 pc=0x5622409ec038
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000072fe8 sp=0xc000072fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.init.7 in goroutine 1
Mar 25 08:53:54:         runtime/proc.go:336 +0x1a
Mar 25 08:53:54: goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
Mar 25 08:53:54: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000073780 sp=0xc000073760 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.goparkunlock(...)
Mar 25 08:53:54:         runtime/proc.go:441
Mar 25 08:53:54: runtime.bgsweep(0xc00007e000)
Mar 25 08:53:54:         runtime/mgcsweep.go:316 +0xdf fp=0xc0000737c8 sp=0xc000073780 pc=0x5622409d685f
Mar 25 08:53:54: runtime.gcenable.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:204 +0x25 fp=0xc0000737e0 sp=0xc0000737c8 pc=0x5622409cac45
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000737e8 sp=0xc0000737e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcenable in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:204 +0x66
Mar 25 08:53:54: goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
Mar 25 08:53:54: runtime.gopark(0x10000?, 0x562241a21c70?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000073f78 sp=0xc000073f58 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.goparkunlock(...)
Mar 25 08:53:54:         runtime/proc.go:441
Mar 25 08:53:54: runtime.(*scavengerState).park(0x562242560b40)
Mar 25 08:53:54:         runtime/mgcscavenge.go:425 +0x49 fp=0xc000073fa8 sp=0xc000073f78 pc=0x5622409d42a9
Mar 25 08:53:54: runtime.bgscavenge(0xc00007e000)
Mar 25 08:53:54:         runtime/mgcscavenge.go:658 +0x59 fp=0xc000073fc8 sp=0xc000073fa8 pc=0x5622409d4839
Mar 25 08:53:54: runtime.gcenable.gowrap2()
Mar 25 08:53:54:         runtime/mgc.go:205 +0x25 fp=0xc000073fe0 sp=0xc000073fc8 pc=0x5622409cabe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000073fe8 sp=0xc000073fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcenable in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:205 +0xa5
Mar 25 08:53:54: goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait, 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000072688?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000072630 sp=0xc000072610 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.runfinq()
Mar 25 08:53:54:         runtime/mfinal.go:196 +0x107 fp=0xc0000727e0 sp=0xc000072630 pc=0x5622409c9c07
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000727e8 sp=0xc0000727e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.createfing in goroutine 1
Mar 25 08:53:54:         runtime/mfinal.go:166 +0x3d
Mar 25 08:53:54: goroutine 6 gp=0xc0001d28c0 m=nil [chan receive, 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0xc000225720?, 0xc000300018?, 0x60?, 0x47?, 0x562240b04ea8?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000074718 sp=0xc0000746f8 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.chanrecv(0xc000040380, 0x0, 0x1)
Mar 25 08:53:54:         runtime/chan.go:664 +0x445 fp=0xc000074790 sp=0xc000074718 pc=0x5622409bbe05
Mar 25 08:53:54: runtime.chanrecv1(0x0?, 0x0?)
Mar 25 08:53:54:         runtime/chan.go:506 +0x12 fp=0xc0000747b8 sp=0xc000074790 pc=0x5622409bb992
Mar 25 08:53:54: runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
Mar 25 08:53:54:         runtime/mgc.go:1796
Mar 25 08:53:54: runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1799 +0x2f fp=0xc0000747e0 sp=0xc0000747b8 pc=0x5622409cddef
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000747e8 sp=0xc0000747e0 pc=0x562240a27021
Mar 25 08:53:54: created by unique.runtime_registerUniqueMapCleanup in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1794 +0x85
Mar 25 08:53:54: goroutine 7 gp=0xc0001d3340 m=nil [GC worker (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000074f38 sp=0xc000074f18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc000074fc8 sp=0xc000074f38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc000074fe0 sp=0xc000074fc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000074fe8 sp=0xc000074fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 18 gp=0xc000102380 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000bae4c84?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006e738 sp=0xc00006e718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00006e7c8 sp=0xc00006e738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00006e7e0 sp=0xc00006e7c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 34 gp=0xc000504000 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000baeb3b6?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00050a7c8 sp=0xc00050a738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 8 gp=0xc0001d3500 m=nil [GC worker (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x180000baeb59d?, 0x3?, 0xb1?, 0x9?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000075738 sp=0xc000075718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc0000757c8 sp=0xc000075738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc0000757e0 sp=0xc0000757c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000757e8 sp=0xc0000757e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 19 gp=0xc000102540 m=nil [GC worker (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x180000bae4882?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006ef38 sp=0xc00006ef18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00006efc8 sp=0xc00006ef38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 35 gp=0xc0005041c0 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000baec65f?, 0x3?, 0xf5?, 0x1c?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00050af38 sp=0xc00050af18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00050afc8 sp=0xc00050af38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00050afe0 sp=0xc00050afc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 9 gp=0xc0001d36c0 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000baf26c9?, 0x3?, 0xb3?, 0x24?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000075f38 sp=0xc000075f18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc000075fc8 sp=0xc000075f38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc000075fe0 sp=0xc000075fc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000075fe8 sp=0xc000075fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 20 gp=0xc000102700 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000bae5fee?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00006f7c8 sp=0xc00006f738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 66 gp=0xc000602700 m=nil [IO wait]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006fdd8 sp=0xc00006fdb8 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.netpollblock(0x562240a42d78?, 0x409b9226?, 0x22?)
Mar 25 08:53:54:         runtime/netpoll.go:575 +0xf7 fp=0xc00006fe10 sp=0xc00006fdd8 pc=0x5622409e46f7
Mar 25 08:53:54: internal/poll.runtime_pollWait(0x7fb9d1dd3d98, 0x72)
Mar 25 08:53:54:         runtime/netpoll.go:351 +0x85 fp=0xc00006fe30 sp=0xc00006fe10 pc=0x562240a1eb05
Mar 25 08:53:54: internal/poll.(*pollDesc).wait(0xc000474000?, 0xc0000ec0d1?, 0x0)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00006fe58 sp=0xc00006fe30 pc=0x562240aa5f87
Mar 25 08:53:54: internal/poll.(*pollDesc).waitRead(...)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:89
Mar 25 08:53:54: internal/poll.(*FD).Read(0xc000474000, {0xc0000ec0d1, 0x1, 0x1})
Mar 25 08:53:54:         internal/poll/fd_unix.go:165 +0x27a fp=0xc00006fef0 sp=0xc00006fe58 pc=0x562240aa727a
Mar 25 08:53:54: net.(*netFD).Read(0xc000474000, {0xc0000ec0d1?, 0xc0000515d8?, 0xc00006ff70?})
Mar 25 08:53:54:         net/fd_posix.go:55 +0x25 fp=0xc00006ff38 sp=0xc00006fef0 pc=0x562240b1c1c5
Mar 25 08:53:54: net.(*conn).Read(0xc00052c090, {0xc0000ec0d1?, 0x0?, 0x0?})
Mar 25 08:53:54:         net/net.go:194 +0x45 fp=0xc00006ff80 sp=0xc00006ff38 pc=0x562240b2a585
Mar 25 08:53:54: net/http.(*connReader).backgroundRead(0xc0000ec0c0)
Mar 25 08:53:54:         net/http/server.go:690 +0x37 fp=0xc00006ffc8 sp=0xc00006ff80 pc=0x562240d162d7
Mar 25 08:53:54: net/http.(*connReader).startBackgroundRead.gowrap2()
Mar 25 08:53:54:         net/http/server.go:686 +0x25 fp=0xc00006ffe0 sp=0xc00006ffc8 pc=0x562240d16205
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x562240a27021
Mar 25 08:53:54: created by net/http.(*connReader).startBackgroundRead in goroutine 21
Mar 25 08:53:54:         net/http/server.go:686 +0xb6
Mar 25 08:53:54: goroutine 21 gp=0xc0001028c0 m=nil [select]:
Mar 25 08:53:54: runtime.gopark(0xc000143a58?, 0x2?, 0x4?, 0x0?, 0xc000143834?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000143648 sp=0xc000143628 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.selectgo(0xc000143a58, 0xc000143830, 0xc000362400?, 0x0, 0x1?, 0x1)
Mar 25 08:53:54:         runtime/select.go:351 +0x837 fp=0xc000143780 sp=0xc000143648 pc=0x5622409fe1f7
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0004ba360, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000)
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:688 +0xa25 fp=0xc000143ac0 sp=0xc000143780 pc=0x562240dc24c5
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x562241cf9858?, 0xc0000009a0?}, 0xc000125b40?)
Mar 25 08:53:54:         <autogenerated>:1 +0x36 fp=0xc000143af0 sp=0xc000143ac0 pc=0x562240dc53f6
Mar 25 08:53:54: net/http.HandlerFunc.ServeHTTP(0xc0005372c0?, {0x562241cf9858?, 0xc0000009a0?}, 0xc000125b60?)
Mar 25 08:53:54:         net/http/server.go:2294 +0x29 fp=0xc000143b18 sp=0xc000143af0 pc=0x562240d1df09
Mar 25 08:53:54: net/http.(*ServeMux).ServeHTTP(0x5622409c4125?, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000)
Mar 25 08:53:54:         net/http/server.go:2822 +0x1c4 fp=0xc000143b68 sp=0xc000143b18 pc=0x562240d1fe04
Mar 25 08:53:54: net/http.serverHandler.ServeHTTP({0x562241cf5ef0?}, {0x562241cf9858?, 0xc0000009a0?}, 0x1?)
Mar 25 08:53:54:         net/http/server.go:3301 +0x8e fp=0xc000143b98 sp=0xc000143b68 pc=0x562240d3d88e
Mar 25 08:53:54: net/http.(*conn).serve(0xc0000ee000, {0x562241cfb908, 0xc00034bf50})
Mar 25 08:53:54:         net/http/server.go:2102 +0x625 fp=0xc000143fb8 sp=0xc000143b98 pc=0x562240d1c405
Mar 25 08:53:54: net/http.(*Server).Serve.gowrap3()
Mar 25 08:53:54:         net/http/server.go:3454 +0x28 fp=0xc000143fe0 sp=0xc000143fb8 pc=0x562240d21cc8
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000143fe8 sp=0xc000143fe0 pc=0x562240a27021
Mar 25 08:53:54: created by net/http.(*Server).Serve in goroutine 1
Mar 25 08:53:54:         net/http/server.go:3454 +0x485
Mar 25 08:53:54: rax    0x205003fcc
Mar 25 08:53:54: rbx    0x7fb9a81c8cf0
Mar 25 08:53:54: rcx    0xff3
Mar 25 08:53:54: rdx    0x7fb9a8007780
Mar 25 08:53:54: rdi    0x7fb9a8007790
Mar 25 08:53:54: rsi    0x0
Mar 25 08:53:54: rbp    0x7fb9d1dca120
Mar 25 08:53:54: rsp    0x7fb9d1dca100
Mar 25 08:53:54: r8     0x4
Mar 25 08:53:54: r9     0x0
Mar 25 08:53:54: r10    0x4
Mar 25 08:53:54: r11    0x8
Mar 25 08:53:54: r12    0x7fb9b4003330
Mar 25 08:53:54: r13    0x7fb9a8007790
Mar 25 08:53:54: r14    0x0
Mar 25 08:53:54: r15    0x56225f5d94e0
Mar 25 08:53:54: rip    0x7fb9ae7f30d7
Mar 25 08:53:54: rflags 0x10297
Mar 25 08:53:54: cs     0x33
Mar 25 08:53:54: fs     0x0
Mar 25 08:53:54: gs     0x0
Mar 25 08:53:54: SIGABRT: abort
Mar 25 08:53:54: PC=0x7fba191389fc m=3 sigcode=18446744073709551610
Mar 25 08:53:54: signal arrived during cgo execution
Mar 25 08:53:54: goroutine 11 gp=0xc0006028c0 m=3 mp=0xc000079008 [syscall]:
Mar 25 08:53:54: runtime.cgocall(0x56224168e100, 0xc00008fbc8)
Mar 25 08:53:54:         runtime/cgocall.go:167 +0x4b fp=0xc00008fba0 sp=0xc00008fb68 pc=0x562240a1c60b
Mar 25 08:53:54: github.com/ollama/ollama/llama._Cfunc_llama_decode(0x7fb9a88ac240, {0x1, 0x7fb9a8906620, 0x0, 0x0, 0x7fb9a8908630, 0x7fb9a890a640, 0x7fb9a88d4b80, 0x7fb9aafae7c0})
Mar 25 08:53:54:         _cgo_gotypes.go:574 +0x4a fp=0xc00008fbc8 sp=0xc00008fba0 pc=0x562240db3eea
Mar 25 08:53:54: github.com/ollama/ollama/llama.(*Context).Decode.func1(...)
Mar 25 08:53:54:         github.com/ollama/ollama/llama/llama.go:132
Mar 25 08:53:54: github.com/ollama/ollama/llama.(*Context).Decode(0x56224260d080?, 0x0?)
Mar 25 08:53:54:         github.com/ollama/ollama/llama/llama.go:132 +0xf6 fp=0xc00008fcc8 sp=0xc00008fbc8 pc=0x562240db6c96
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0004ba360, 0xc0001106c0, 0xc00008ff20)
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:435 +0x23e fp=0xc00008fee0 sp=0xc00008fcc8 pc=0x562240dc0abe
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0004ba360, {0x562241cfb940, 0xc000366230})
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:343 +0x1d5 fp=0xc00008ffb8 sp=0xc00008fee0 pc=0x562240dc0715
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:972 +0x28 fp=0xc00008ffe0 sp=0xc00008ffb8 pc=0x562240dc4fc8
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00008ffe8 sp=0xc00008ffe0 pc=0x562240a27021
Mar 25 08:53:54: created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:972 +0xcb7
Mar 25 08:53:54: goroutine 1 gp=0xc000002380 m=nil [IO wait, 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc0000495f8 sp=0xc0000495d8 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.netpollblock(0xc00051d678?, 0x409b9226?, 0x22?)
Mar 25 08:53:54:         runtime/netpoll.go:575 +0xf7 fp=0xc000049630 sp=0xc0000495f8 pc=0x5622409e46f7
Mar 25 08:53:54: internal/poll.runtime_pollWait(0x7fb9d1dd3eb0, 0x72)
Mar 25 08:53:54:         runtime/netpoll.go:351 +0x85 fp=0xc000049650 sp=0xc000049630 pc=0x562240a1eb05
Mar 25 08:53:54: internal/poll.(*pollDesc).wait(0xc000529c00?, 0x5622409c7406?, 0x0)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000049678 sp=0xc000049650 pc=0x562240aa5f87
Mar 25 08:53:54: internal/poll.(*pollDesc).waitRead(...)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:89
Mar 25 08:53:54: internal/poll.(*FD).Accept(0xc000529c00)
Mar 25 08:53:54:         internal/poll/fd_unix.go:620 +0x295 fp=0xc000049720 sp=0xc000049678 pc=0x562240aab355
Mar 25 08:53:54: net.(*netFD).accept(0xc000529c00)
Mar 25 08:53:54:         net/fd_unix.go:172 +0x29 fp=0xc0000497d8 sp=0xc000049720 pc=0x562240b1e169
Mar 25 08:53:54: net.(*TCPListener).accept(0xc000404b00)
Mar 25 08:53:54:         net/tcpsock_posix.go:159 +0x1b fp=0xc000049828 sp=0xc0000497d8 pc=0x562240b33b1b
Mar 25 08:53:54: net.(*TCPListener).Accept(0xc000404b00)
Mar 25 08:53:54:         net/tcpsock.go:380 +0x30 fp=0xc000049858 sp=0xc000049828 pc=0x562240b329d0
Mar 25 08:53:54: net/http.(*onceCloseListener).Accept(0xc0000ee000?)
Mar 25 08:53:54:         <autogenerated>:1 +0x24 fp=0xc000049870 sp=0xc000049858 pc=0x562240d4a004
Mar 25 08:53:54: net/http.(*Server).Serve(0xc000035a00, {0x562241cf9678, 0xc000404b00})
Mar 25 08:53:54:         net/http/server.go:3424 +0x30c fp=0xc0000499a0 sp=0xc000049870 pc=0x562240d218cc
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034120, 0xe, 0xe})
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:992 +0x108a fp=0xc000049d08 sp=0xc0000499a0 pc=0x562240dc4d0a
Mar 25 08:53:54: github.com/ollama/ollama/runner.Execute({0xc000034110?, 0x0?, 0x0?})
Mar 25 08:53:54:         github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc000049d30 sp=0xc000049d08 pc=0x562240eaf914
Mar 25 08:53:54: github.com/ollama/ollama/cmd.NewCLI.func2(0xc000035600?, {0x56224186b054?, 0x4?, 0x56224186b058?})
Mar 25 08:53:54:         github.com/ollama/ollama/cmd/cmd.go:1327 +0x45 fp=0xc000049d58 sp=0xc000049d30 pc=0x5622416208a5
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).execute(0xc00049d508, {0xc000523ea0, 0xe, 0xe})
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc000049e78 sp=0xc000049d58 pc=0x562240b977bc
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).ExecuteC(0xc000558c08)
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000049f30 sp=0xc000049e78 pc=0x562240b98005
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).Execute(...)
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:992
Mar 25 08:53:54: github.com/spf13/cobra.(*Command).ExecuteContext(...)
Mar 25 08:53:54:         github.com/spf13/cobra@v1.7.0/command.go:985
Mar 25 08:53:54: main.main()
Mar 25 08:53:54:         github.com/ollama/ollama/main.go:12 +0x4d fp=0xc000049f50 sp=0xc000049f30 pc=0x562241620c0d
Mar 25 08:53:54: runtime.main()
Mar 25 08:53:54:         runtime/proc.go:283 +0x29d fp=0xc000049fe0 sp=0xc000049f50 pc=0x5622409ebcfd
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000049fe8 sp=0xc000049fe0 pc=0x562240a27021
Mar 25 08:53:54: goroutine 2 gp=0xc000002e00 m=nil [force gc (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000072fa8 sp=0xc000072f88 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.goparkunlock(...)
Mar 25 08:53:54:         runtime/proc.go:441
Mar 25 08:53:54: runtime.forcegchelper()
Mar 25 08:53:54:         runtime/proc.go:348 +0xb8 fp=0xc000072fe0 sp=0xc000072fa8 pc=0x5622409ec038
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000072fe8 sp=0xc000072fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.init.7 in goroutine 1
Mar 25 08:53:54:         runtime/proc.go:336 +0x1a
Mar 25 08:53:54: goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]:
Mar 25 08:53:54: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000073780 sp=0xc000073760 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.goparkunlock(...)
Mar 25 08:53:54:         runtime/proc.go:441
Mar 25 08:53:54: runtime.bgsweep(0xc00007e000)
Mar 25 08:53:54:         runtime/mgcsweep.go:316 +0xdf fp=0xc0000737c8 sp=0xc000073780 pc=0x5622409d685f
Mar 25 08:53:54: runtime.gcenable.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:204 +0x25 fp=0xc0000737e0 sp=0xc0000737c8 pc=0x5622409cac45
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000737e8 sp=0xc0000737e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcenable in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:204 +0x66
Mar 25 08:53:54: goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]:
Mar 25 08:53:54: runtime.gopark(0x10000?, 0x562241a21c70?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000073f78 sp=0xc000073f58 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.goparkunlock(...)
Mar 25 08:53:54:         runtime/proc.go:441
Mar 25 08:53:54: runtime.(*scavengerState).park(0x562242560b40)
Mar 25 08:53:54:         runtime/mgcscavenge.go:425 +0x49 fp=0xc000073fa8 sp=0xc000073f78 pc=0x5622409d42a9
Mar 25 08:53:54: runtime.bgscavenge(0xc00007e000)
Mar 25 08:53:54:         runtime/mgcscavenge.go:658 +0x59 fp=0xc000073fc8 sp=0xc000073fa8 pc=0x5622409d4839
Mar 25 08:53:54: runtime.gcenable.gowrap2()
Mar 25 08:53:54:         runtime/mgc.go:205 +0x25 fp=0xc000073fe0 sp=0xc000073fc8 pc=0x5622409cabe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000073fe8 sp=0xc000073fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcenable in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:205 +0xa5
Mar 25 08:53:54: goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait, 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000072688?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000072630 sp=0xc000072610 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.runfinq()
Mar 25 08:53:54:         runtime/mfinal.go:196 +0x107 fp=0xc0000727e0 sp=0xc000072630 pc=0x5622409c9c07
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000727e8 sp=0xc0000727e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.createfing in goroutine 1
Mar 25 08:53:54:         runtime/mfinal.go:166 +0x3d
Mar 25 08:53:54: goroutine 6 gp=0xc0001d28c0 m=nil [chan receive, 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0xc000225720?, 0xc000300018?, 0x60?, 0x47?, 0x562240b04ea8?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000074718 sp=0xc0000746f8 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.chanrecv(0xc000040380, 0x0, 0x1)
Mar 25 08:53:54:         runtime/chan.go:664 +0x445 fp=0xc000074790 sp=0xc000074718 pc=0x5622409bbe05
Mar 25 08:53:54: runtime.chanrecv1(0x0?, 0x0?)
Mar 25 08:53:54:         runtime/chan.go:506 +0x12 fp=0xc0000747b8 sp=0xc000074790 pc=0x5622409bb992
Mar 25 08:53:54: runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
Mar 25 08:53:54:         runtime/mgc.go:1796
Mar 25 08:53:54: runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1799 +0x2f fp=0xc0000747e0 sp=0xc0000747b8 pc=0x5622409cddef
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000747e8 sp=0xc0000747e0 pc=0x562240a27021
Mar 25 08:53:54: created by unique.runtime_registerUniqueMapCleanup in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1794 +0x85
Mar 25 08:53:54: goroutine 7 gp=0xc0001d3340 m=nil [GC worker (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000074f38 sp=0xc000074f18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc000074fc8 sp=0xc000074f38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc000074fe0 sp=0xc000074fc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000074fe8 sp=0xc000074fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 18 gp=0xc000102380 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000bae4c84?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006e738 sp=0xc00006e718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00006e7c8 sp=0xc00006e738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00006e7e0 sp=0xc00006e7c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 34 gp=0xc000504000 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000baeb3b6?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00050a7c8 sp=0xc00050a738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 8 gp=0xc0001d3500 m=nil [GC worker (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x180000baeb59d?, 0x3?, 0xb1?, 0x9?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000075738 sp=0xc000075718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc0000757c8 sp=0xc000075738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc0000757e0 sp=0xc0000757c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc0000757e8 sp=0xc0000757e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 19 gp=0xc000102540 m=nil [GC worker (idle), 1 minutes]:
Mar 25 08:53:54: runtime.gopark(0x180000bae4882?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006ef38 sp=0xc00006ef18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00006efc8 sp=0xc00006ef38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 35 gp=0xc0005041c0 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000baec65f?, 0x3?, 0xf5?, 0x1c?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00050af38 sp=0xc00050af18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00050afc8 sp=0xc00050af38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00050afe0 sp=0xc00050afc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 9 gp=0xc0001d36c0 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000baf26c9?, 0x3?, 0xb3?, 0x24?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000075f38 sp=0xc000075f18 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc000075fc8 sp=0xc000075f38 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc000075fe0 sp=0xc000075fc8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000075fe8 sp=0xc000075fe0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 20 gp=0xc000102700 m=nil [GC worker (idle)]:
Mar 25 08:53:54: runtime.gopark(0x180000bae5fee?, 0x0?, 0x0?, 0x0?, 0x0?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960)
Mar 25 08:53:54:         runtime/mgc.go:1423 +0xe9 fp=0xc00006f7c8 sp=0xc00006f738 pc=0x5622409cd109
Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1()
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x5622409ccfe5
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x562240a27021
Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1
Mar 25 08:53:54:         runtime/mgc.go:1339 +0x105
Mar 25 08:53:54: goroutine 66 gp=0xc000602700 m=nil [IO wait]:
Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc00006fdd8 sp=0xc00006fdb8 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.netpollblock(0x562240a42d78?, 0x409b9226?, 0x22?)
Mar 25 08:53:54:         runtime/netpoll.go:575 +0xf7 fp=0xc00006fe10 sp=0xc00006fdd8 pc=0x5622409e46f7
Mar 25 08:53:54: internal/poll.runtime_pollWait(0x7fb9d1dd3d98, 0x72)
Mar 25 08:53:54:         runtime/netpoll.go:351 +0x85 fp=0xc00006fe30 sp=0xc00006fe10 pc=0x562240a1eb05
Mar 25 08:53:54: internal/poll.(*pollDesc).wait(0xc000474000?, 0xc0000ec0d1?, 0x0)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00006fe58 sp=0xc00006fe30 pc=0x562240aa5f87
Mar 25 08:53:54: internal/poll.(*pollDesc).waitRead(...)
Mar 25 08:53:54:         internal/poll/fd_poll_runtime.go:89
Mar 25 08:53:54: internal/poll.(*FD).Read(0xc000474000, {0xc0000ec0d1, 0x1, 0x1})
Mar 25 08:53:54:         internal/poll/fd_unix.go:165 +0x27a fp=0xc00006fef0 sp=0xc00006fe58 pc=0x562240aa727a
Mar 25 08:53:54: net.(*netFD).Read(0xc000474000, {0xc0000ec0d1?, 0xc0000515d8?, 0xc00006ff70?})
Mar 25 08:53:54:         net/fd_posix.go:55 +0x25 fp=0xc00006ff38 sp=0xc00006fef0 pc=0x562240b1c1c5
Mar 25 08:53:54: net.(*conn).Read(0xc00052c090, {0xc0000ec0d1?, 0x0?, 0x0?})
Mar 25 08:53:54:         net/net.go:194 +0x45 fp=0xc00006ff80 sp=0xc00006ff38 pc=0x562240b2a585
Mar 25 08:53:54: net/http.(*connReader).backgroundRead(0xc0000ec0c0)
Mar 25 08:53:54:         net/http/server.go:690 +0x37 fp=0xc00006ffc8 sp=0xc00006ff80 pc=0x562240d162d7
Mar 25 08:53:54: net/http.(*connReader).startBackgroundRead.gowrap2()
Mar 25 08:53:54:         net/http/server.go:686 +0x25 fp=0xc00006ffe0 sp=0xc00006ffc8 pc=0x562240d16205
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x562240a27021
Mar 25 08:53:54: created by net/http.(*connReader).startBackgroundRead in goroutine 21
Mar 25 08:53:54:         net/http/server.go:686 +0xb6
Mar 25 08:53:54: goroutine 21 gp=0xc0001028c0 m=nil [select]:
Mar 25 08:53:54: runtime.gopark(0xc000143a58?, 0x2?, 0x4?, 0x0?, 0xc000143834?)
Mar 25 08:53:54:         runtime/proc.go:435 +0xce fp=0xc000143648 sp=0xc000143628 pc=0x562240a1f8ee
Mar 25 08:53:54: runtime.selectgo(0xc000143a58, 0xc000143830, 0xc000362400?, 0x0, 0x1?, 0x1)
Mar 25 08:53:54:         runtime/select.go:351 +0x837 fp=0xc000143780 sp=0xc000143648 pc=0x5622409fe1f7
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0004ba360, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000)
Mar 25 08:53:54:         github.com/ollama/ollama/runner/llamarunner/runner.go:688 +0xa25 fp=0xc000143ac0 sp=0xc000143780 pc=0x562240dc24c5
Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x562241cf9858?, 0xc0000009a0?}, 0xc000125b40?)
Mar 25 08:53:54:         <autogenerated>:1 +0x36 fp=0xc000143af0 sp=0xc000143ac0 pc=0x562240dc53f6
Mar 25 08:53:54: net/http.HandlerFunc.ServeHTTP(0xc0005372c0?, {0x562241cf9858?, 0xc0000009a0?}, 0xc000125b60?)
Mar 25 08:53:54:         net/http/server.go:2294 +0x29 fp=0xc000143b18 sp=0xc000143af0 pc=0x562240d1df09
Mar 25 08:53:54: net/http.(*ServeMux).ServeHTTP(0x5622409c4125?, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000)
Mar 25 08:53:54:         net/http/server.go:2822 +0x1c4 fp=0xc000143b68 sp=0xc000143b18 pc=0x562240d1fe04
Mar 25 08:53:54: net/http.serverHandler.ServeHTTP({0x562241cf5ef0?}, {0x562241cf9858?, 0xc0000009a0?}, 0x1?)
Mar 25 08:53:54:         net/http/server.go:3301 +0x8e fp=0xc000143b98 sp=0xc000143b68 pc=0x562240d3d88e
Mar 25 08:53:54: net/http.(*conn).serve(0xc0000ee000, {0x562241cfb908, 0xc00034bf50})
Mar 25 08:53:54:         net/http/server.go:2102 +0x625 fp=0xc000143fb8 sp=0xc000143b98 pc=0x562240d1c405
Mar 25 08:53:54: net/http.(*Server).Serve.gowrap3()
Mar 25 08:53:54:         net/http/server.go:3454 +0x28 fp=0xc000143fe0 sp=0xc000143fb8 pc=0x562240d21cc8
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54:         runtime/asm_amd64.s:1700 +0x1 fp=0xc000143fe8 sp=0xc000143fe0 pc=0x562240a27021
Mar 25 08:53:54: created by net/http.(*Server).Serve in goroutine 1
Mar 25 08:53:54:         net/http/server.go:3454 +0x485
Mar 25 08:53:54: rax    0x0
Mar 25 08:53:54: rbx    0x7fb9d1dcb640
Mar 25 08:53:54: rcx    0x7fba191389fc
Mar 25 08:53:54: rdx    0x6
Mar 25 08:53:54: rdi    0x33e82
Mar 25 08:53:54: rsi    0x33e84
Mar 25 08:53:54: rbp    0x33e84
Mar 25 08:53:54: rsp    0x7fb9d1dca180
Mar 25 08:53:54: r8     0x7fb9d1dca250
Mar 25 08:53:54: r9     0x7fb9d1dca220
Mar 25 08:53:54: r10    0x8
Mar 25 08:53:54: r11    0x246
Mar 25 08:53:54: r12    0x6
Mar 25 08:53:54: r13    0x16
Mar 25 08:53:54: r14    0x80
Mar 25 08:53:54: r15    0x8
Mar 25 08:53:54: rip    0x7fba191389fc
Mar 25 08:53:54: rflags 0x246
Mar 25 08:53:54: cs     0x33
Mar 25 08:53:54: fs     0x0
Mar 25 08:53:54: gs     0x0
Mar 25 08:53:54: time=2025-03-25T08:53:54.459Z level=ERROR source=server.go:449 msg="llama runner terminated" error="exit status 2"
Mar 25 08:53:54: [GIN] 2025/03/25 - 08:53:54 | 500 |         1m19s |       127.0.0.1 | POST     "/api/generate"
Mar 25 08:53:59: time=2025-03-25T08:53:59.951Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.203447183 model=/usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c
Mar 25 08:54:00: time=2025-03-25T08:54:00.202Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.454439782 model=/usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.6.1 (same error when using 0.6.2)

Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000143648 sp=0xc000143628 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.selectgo(0xc000143a58, 0xc000143830, 0xc000362400?, 0x0, 0x1?, 0x1) Mar 25 08:53:54: runtime/select.go:351 +0x837 fp=0xc000143780 sp=0xc000143648 pc=0x5622409fe1f7 Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0004ba360, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000) Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner/runner.go:688 +0xa25 fp=0xc000143ac0 sp=0xc000143780 pc=0x562240dc24c5 Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x562241cf9858?, 0xc0000009a0?}, 0xc000125b40?) Mar 25 08:53:54: <autogenerated>:1 +0x36 fp=0xc000143af0 sp=0xc000143ac0 pc=0x562240dc53f6 Mar 25 08:53:54: net/http.HandlerFunc.ServeHTTP(0xc0005372c0?, {0x562241cf9858?, 0xc0000009a0?}, 0xc000125b60?) Mar 25 08:53:54: net/http/server.go:2294 +0x29 fp=0xc000143b18 sp=0xc000143af0 pc=0x562240d1df09 Mar 25 08:53:54: net/http.(*ServeMux).ServeHTTP(0x5622409c4125?, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000) Mar 25 08:53:54: net/http/server.go:2822 +0x1c4 fp=0xc000143b68 sp=0xc000143b18 pc=0x562240d1fe04 Mar 25 08:53:54: net/http.serverHandler.ServeHTTP({0x562241cf5ef0?}, {0x562241cf9858?, 0xc0000009a0?}, 0x1?) 
Mar 25 08:53:54: net/http/server.go:3301 +0x8e fp=0xc000143b98 sp=0xc000143b68 pc=0x562240d3d88e Mar 25 08:53:54: net/http.(*conn).serve(0xc0000ee000, {0x562241cfb908, 0xc00034bf50}) Mar 25 08:53:54: net/http/server.go:2102 +0x625 fp=0xc000143fb8 sp=0xc000143b98 pc=0x562240d1c405 Mar 25 08:53:54: net/http.(*Server).Serve.gowrap3() Mar 25 08:53:54: net/http/server.go:3454 +0x28 fp=0xc000143fe0 sp=0xc000143fb8 pc=0x562240d21cc8 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc000143fe8 sp=0xc000143fe0 pc=0x562240a27021 Mar 25 08:53:54: created by net/http.(*Server).Serve in goroutine 1 Mar 25 08:53:54: net/http/server.go:3454 +0x485 Mar 25 08:53:54: rax 0x205003fcc Mar 25 08:53:54: rbx 0x7fb9a81c8cf0 Mar 25 08:53:54: rcx 0xff3 Mar 25 08:53:54: rdx 0x7fb9a8007780 Mar 25 08:53:54: rdi 0x7fb9a8007790 Mar 25 08:53:54: rsi 0x0 Mar 25 08:53:54: rbp 0x7fb9d1dca120 Mar 25 08:53:54: rsp 0x7fb9d1dca100 Mar 25 08:53:54: r8 0x4 Mar 25 08:53:54: r9 0x0 Mar 25 08:53:54: r10 0x4 Mar 25 08:53:54: r11 0x8 Mar 25 08:53:54: r12 0x7fb9b4003330 Mar 25 08:53:54: r13 0x7fb9a8007790 Mar 25 08:53:54: r14 0x0 Mar 25 08:53:54: r15 0x56225f5d94e0 Mar 25 08:53:54: rip 0x7fb9ae7f30d7 Mar 25 08:53:54: rflags 0x10297 Mar 25 08:53:54: cs 0x33 Mar 25 08:53:54: fs 0x0 Mar 25 08:53:54: gs 0x0 Mar 25 08:53:54: SIGABRT: abort Mar 25 08:53:54: PC=0x7fba191389fc m=3 sigcode=18446744073709551610 Mar 25 08:53:54: signal arrived during cgo execution Mar 25 08:53:54: goroutine 11 gp=0xc0006028c0 m=3 mp=0xc000079008 [syscall]: Mar 25 08:53:54: runtime.cgocall(0x56224168e100, 0xc00008fbc8) Mar 25 08:53:54: runtime/cgocall.go:167 +0x4b fp=0xc00008fba0 sp=0xc00008fb68 pc=0x562240a1c60b Mar 25 08:53:54: github.com/ollama/ollama/llama._Cfunc_llama_decode(0x7fb9a88ac240, {0x1, 0x7fb9a8906620, 0x0, 0x0, 0x7fb9a8908630, 0x7fb9a890a640, 0x7fb9a88d4b80, 0x7fb9aafae7c0}) Mar 25 08:53:54: _cgo_gotypes.go:574 +0x4a fp=0xc00008fbc8 sp=0xc00008fba0 pc=0x562240db3eea Mar 25 08:53:54: 
github.com/ollama/ollama/llama.(*Context).Decode.func1(...) Mar 25 08:53:54: github.com/ollama/ollama/llama/llama.go:132 Mar 25 08:53:54: github.com/ollama/ollama/llama.(*Context).Decode(0x56224260d080?, 0x0?) Mar 25 08:53:54: github.com/ollama/ollama/llama/llama.go:132 +0xf6 fp=0xc00008fcc8 sp=0xc00008fbc8 pc=0x562240db6c96 Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0004ba360, 0xc0001106c0, 0xc00008ff20) Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner/runner.go:435 +0x23e fp=0xc00008fee0 sp=0xc00008fcc8 pc=0x562240dc0abe Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0004ba360, {0x562241cfb940, 0xc000366230}) Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner/runner.go:343 +0x1d5 fp=0xc00008ffb8 sp=0xc00008fee0 pc=0x562240dc0715 Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2() Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner/runner.go:972 +0x28 fp=0xc00008ffe0 sp=0xc00008ffb8 pc=0x562240dc4fc8 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc00008ffe8 sp=0xc00008ffe0 pc=0x562240a27021 Mar 25 08:53:54: created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1 Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner/runner.go:972 +0xcb7 Mar 25 08:53:54: goroutine 1 gp=0xc000002380 m=nil [IO wait, 1 minutes]: Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc0000495f8 sp=0xc0000495d8 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.netpollblock(0xc00051d678?, 0x409b9226?, 0x22?) 
Mar 25 08:53:54: runtime/netpoll.go:575 +0xf7 fp=0xc000049630 sp=0xc0000495f8 pc=0x5622409e46f7 Mar 25 08:53:54: internal/poll.runtime_pollWait(0x7fb9d1dd3eb0, 0x72) Mar 25 08:53:54: runtime/netpoll.go:351 +0x85 fp=0xc000049650 sp=0xc000049630 pc=0x562240a1eb05 Mar 25 08:53:54: internal/poll.(*pollDesc).wait(0xc000529c00?, 0x5622409c7406?, 0x0) Mar 25 08:53:54: internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000049678 sp=0xc000049650 pc=0x562240aa5f87 Mar 25 08:53:54: internal/poll.(*pollDesc).waitRead(...) Mar 25 08:53:54: internal/poll/fd_poll_runtime.go:89 Mar 25 08:53:54: internal/poll.(*FD).Accept(0xc000529c00) Mar 25 08:53:54: internal/poll/fd_unix.go:620 +0x295 fp=0xc000049720 sp=0xc000049678 pc=0x562240aab355 Mar 25 08:53:54: net.(*netFD).accept(0xc000529c00) Mar 25 08:53:54: net/fd_unix.go:172 +0x29 fp=0xc0000497d8 sp=0xc000049720 pc=0x562240b1e169 Mar 25 08:53:54: net.(*TCPListener).accept(0xc000404b00) Mar 25 08:53:54: net/tcpsock_posix.go:159 +0x1b fp=0xc000049828 sp=0xc0000497d8 pc=0x562240b33b1b Mar 25 08:53:54: net.(*TCPListener).Accept(0xc000404b00) Mar 25 08:53:54: net/tcpsock.go:380 +0x30 fp=0xc000049858 sp=0xc000049828 pc=0x562240b329d0 Mar 25 08:53:54: net/http.(*onceCloseListener).Accept(0xc0000ee000?) 
Mar 25 08:53:54: <autogenerated>:1 +0x24 fp=0xc000049870 sp=0xc000049858 pc=0x562240d4a004 Mar 25 08:53:54: net/http.(*Server).Serve(0xc000035a00, {0x562241cf9678, 0xc000404b00}) Mar 25 08:53:54: net/http/server.go:3424 +0x30c fp=0xc0000499a0 sp=0xc000049870 pc=0x562240d218cc Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.Execute({0xc000034120, 0xe, 0xe}) Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner/runner.go:992 +0x108a fp=0xc000049d08 sp=0xc0000499a0 pc=0x562240dc4d0a Mar 25 08:53:54: github.com/ollama/ollama/runner.Execute({0xc000034110?, 0x0?, 0x0?}) Mar 25 08:53:54: github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc000049d30 sp=0xc000049d08 pc=0x562240eaf914 Mar 25 08:53:54: github.com/ollama/ollama/cmd.NewCLI.func2(0xc000035600?, {0x56224186b054?, 0x4?, 0x56224186b058?}) Mar 25 08:53:54: github.com/ollama/ollama/cmd/cmd.go:1327 +0x45 fp=0xc000049d58 sp=0xc000049d30 pc=0x5622416208a5 Mar 25 08:53:54: github.com/spf13/cobra.(*Command).execute(0xc00049d508, {0xc000523ea0, 0xe, 0xe}) Mar 25 08:53:54: github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc000049e78 sp=0xc000049d58 pc=0x562240b977bc Mar 25 08:53:54: github.com/spf13/cobra.(*Command).ExecuteC(0xc000558c08) Mar 25 08:53:54: github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000049f30 sp=0xc000049e78 pc=0x562240b98005 Mar 25 08:53:54: github.com/spf13/cobra.(*Command).Execute(...) Mar 25 08:53:54: github.com/spf13/cobra@v1.7.0/command.go:992 Mar 25 08:53:54: github.com/spf13/cobra.(*Command).ExecuteContext(...) 
Mar 25 08:53:54: github.com/spf13/cobra@v1.7.0/command.go:985 Mar 25 08:53:54: main.main() Mar 25 08:53:54: github.com/ollama/ollama/main.go:12 +0x4d fp=0xc000049f50 sp=0xc000049f30 pc=0x562241620c0d Mar 25 08:53:54: runtime.main() Mar 25 08:53:54: runtime/proc.go:283 +0x29d fp=0xc000049fe0 sp=0xc000049f50 pc=0x5622409ebcfd Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc000049fe8 sp=0xc000049fe0 pc=0x562240a27021 Mar 25 08:53:54: goroutine 2 gp=0xc000002e00 m=nil [force gc (idle), 1 minutes]: Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000072fa8 sp=0xc000072f88 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.goparkunlock(...) Mar 25 08:53:54: runtime/proc.go:441 Mar 25 08:53:54: runtime.forcegchelper() Mar 25 08:53:54: runtime/proc.go:348 +0xb8 fp=0xc000072fe0 sp=0xc000072fa8 pc=0x5622409ec038 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc000072fe8 sp=0xc000072fe0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.init.7 in goroutine 1 Mar 25 08:53:54: runtime/proc.go:336 +0x1a Mar 25 08:53:54: goroutine 3 gp=0xc000003340 m=nil [GC sweep wait]: Mar 25 08:53:54: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000073780 sp=0xc000073760 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.goparkunlock(...) 
Mar 25 08:53:54: runtime/proc.go:441 Mar 25 08:53:54: runtime.bgsweep(0xc00007e000) Mar 25 08:53:54: runtime/mgcsweep.go:316 +0xdf fp=0xc0000737c8 sp=0xc000073780 pc=0x5622409d685f Mar 25 08:53:54: runtime.gcenable.gowrap1() Mar 25 08:53:54: runtime/mgc.go:204 +0x25 fp=0xc0000737e0 sp=0xc0000737c8 pc=0x5622409cac45 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000737e8 sp=0xc0000737e0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcenable in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:204 +0x66 Mar 25 08:53:54: goroutine 4 gp=0xc000003500 m=nil [GC scavenge wait]: Mar 25 08:53:54: runtime.gopark(0x10000?, 0x562241a21c70?, 0x0?, 0x0?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000073f78 sp=0xc000073f58 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.goparkunlock(...) Mar 25 08:53:54: runtime/proc.go:441 Mar 25 08:53:54: runtime.(*scavengerState).park(0x562242560b40) Mar 25 08:53:54: runtime/mgcscavenge.go:425 +0x49 fp=0xc000073fa8 sp=0xc000073f78 pc=0x5622409d42a9 Mar 25 08:53:54: runtime.bgscavenge(0xc00007e000) Mar 25 08:53:54: runtime/mgcscavenge.go:658 +0x59 fp=0xc000073fc8 sp=0xc000073fa8 pc=0x5622409d4839 Mar 25 08:53:54: runtime.gcenable.gowrap2() Mar 25 08:53:54: runtime/mgc.go:205 +0x25 fp=0xc000073fe0 sp=0xc000073fc8 pc=0x5622409cabe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc000073fe8 sp=0xc000073fe0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcenable in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:205 +0xa5 Mar 25 08:53:54: goroutine 5 gp=0xc000003dc0 m=nil [finalizer wait, 1 minutes]: Mar 25 08:53:54: runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000072688?) 
Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000072630 sp=0xc000072610 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.runfinq() Mar 25 08:53:54: runtime/mfinal.go:196 +0x107 fp=0xc0000727e0 sp=0xc000072630 pc=0x5622409c9c07 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000727e8 sp=0xc0000727e0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.createfing in goroutine 1 Mar 25 08:53:54: runtime/mfinal.go:166 +0x3d Mar 25 08:53:54: goroutine 6 gp=0xc0001d28c0 m=nil [chan receive, 1 minutes]: Mar 25 08:53:54: runtime.gopark(0xc000225720?, 0xc000300018?, 0x60?, 0x47?, 0x562240b04ea8?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000074718 sp=0xc0000746f8 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.chanrecv(0xc000040380, 0x0, 0x1) Mar 25 08:53:54: runtime/chan.go:664 +0x445 fp=0xc000074790 sp=0xc000074718 pc=0x5622409bbe05 Mar 25 08:53:54: runtime.chanrecv1(0x0?, 0x0?) Mar 25 08:53:54: runtime/chan.go:506 +0x12 fp=0xc0000747b8 sp=0xc000074790 pc=0x5622409bb992 Mar 25 08:53:54: runtime.unique_runtime_registerUniqueMapCleanup.func2(...) Mar 25 08:53:54: runtime/mgc.go:1796 Mar 25 08:53:54: runtime.unique_runtime_registerUniqueMapCleanup.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1799 +0x2f fp=0xc0000747e0 sp=0xc0000747b8 pc=0x5622409cddef Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000747e8 sp=0xc0000747e0 pc=0x562240a27021 Mar 25 08:53:54: created by unique.runtime_registerUniqueMapCleanup in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1794 +0x85 Mar 25 08:53:54: goroutine 7 gp=0xc0001d3340 m=nil [GC worker (idle), 1 minutes]: Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000074f38 sp=0xc000074f18 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc000074fc8 sp=0xc000074f38 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc000074fe0 sp=0xc000074fc8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc000074fe8 sp=0xc000074fe0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 18 gp=0xc000102380 m=nil [GC worker (idle)]: Mar 25 08:53:54: runtime.gopark(0x180000bae4c84?, 0x0?, 0x0?, 0x0?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc00006e738 sp=0xc00006e718 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc00006e7c8 sp=0xc00006e738 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc00006e7e0 sp=0xc00006e7c8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 34 gp=0xc000504000 m=nil [GC worker (idle)]: Mar 25 08:53:54: runtime.gopark(0x180000baeb3b6?, 0x0?, 0x0?, 0x0?, 0x0?) 
Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc00050a7c8 sp=0xc00050a738 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 8 gp=0xc0001d3500 m=nil [GC worker (idle), 1 minutes]: Mar 25 08:53:54: runtime.gopark(0x180000baeb59d?, 0x3?, 0xb1?, 0x9?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000075738 sp=0xc000075718 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc0000757c8 sp=0xc000075738 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc0000757e0 sp=0xc0000757c8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc0000757e8 sp=0xc0000757e0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 19 gp=0xc000102540 m=nil [GC worker (idle), 1 minutes]: Mar 25 08:53:54: runtime.gopark(0x180000bae4882?, 0x0?, 0x0?, 0x0?, 0x0?) 
Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc00006ef38 sp=0xc00006ef18 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc00006efc8 sp=0xc00006ef38 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 35 gp=0xc0005041c0 m=nil [GC worker (idle)]: Mar 25 08:53:54: runtime.gopark(0x180000baec65f?, 0x3?, 0xf5?, 0x1c?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc00050af38 sp=0xc00050af18 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc00050afc8 sp=0xc00050af38 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc00050afe0 sp=0xc00050afc8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 9 gp=0xc0001d36c0 m=nil [GC worker (idle)]: Mar 25 08:53:54: runtime.gopark(0x180000baf26c9?, 0x3?, 0xb3?, 0x24?, 0x0?) 
Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000075f38 sp=0xc000075f18 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc000075fc8 sp=0xc000075f38 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc000075fe0 sp=0xc000075fc8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc000075fe8 sp=0xc000075fe0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 20 gp=0xc000102700 m=nil [GC worker (idle)]: Mar 25 08:53:54: runtime.gopark(0x180000bae5fee?, 0x0?, 0x0?, 0x0?, 0x0?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.gcBgMarkWorker(0xc000041960) Mar 25 08:53:54: runtime/mgc.go:1423 +0xe9 fp=0xc00006f7c8 sp=0xc00006f738 pc=0x5622409cd109 Mar 25 08:53:54: runtime.gcBgMarkStartWorkers.gowrap1() Mar 25 08:53:54: runtime/mgc.go:1339 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x5622409ccfe5 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x562240a27021 Mar 25 08:53:54: created by runtime.gcBgMarkStartWorkers in goroutine 1 Mar 25 08:53:54: runtime/mgc.go:1339 +0x105 Mar 25 08:53:54: goroutine 66 gp=0xc000602700 m=nil [IO wait]: Mar 25 08:53:54: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0xb?) Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc00006fdd8 sp=0xc00006fdb8 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.netpollblock(0x562240a42d78?, 0x409b9226?, 0x22?) 
Mar 25 08:53:54: runtime/netpoll.go:575 +0xf7 fp=0xc00006fe10 sp=0xc00006fdd8 pc=0x5622409e46f7 Mar 25 08:53:54: internal/poll.runtime_pollWait(0x7fb9d1dd3d98, 0x72) Mar 25 08:53:54: runtime/netpoll.go:351 +0x85 fp=0xc00006fe30 sp=0xc00006fe10 pc=0x562240a1eb05 Mar 25 08:53:54: internal/poll.(*pollDesc).wait(0xc000474000?, 0xc0000ec0d1?, 0x0) Mar 25 08:53:54: internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00006fe58 sp=0xc00006fe30 pc=0x562240aa5f87 Mar 25 08:53:54: internal/poll.(*pollDesc).waitRead(...) Mar 25 08:53:54: internal/poll/fd_poll_runtime.go:89 Mar 25 08:53:54: internal/poll.(*FD).Read(0xc000474000, {0xc0000ec0d1, 0x1, 0x1}) Mar 25 08:53:54: internal/poll/fd_unix.go:165 +0x27a fp=0xc00006fef0 sp=0xc00006fe58 pc=0x562240aa727a Mar 25 08:53:54: net.(*netFD).Read(0xc000474000, {0xc0000ec0d1?, 0xc0000515d8?, 0xc00006ff70?}) Mar 25 08:53:54: net/fd_posix.go:55 +0x25 fp=0xc00006ff38 sp=0xc00006fef0 pc=0x562240b1c1c5 Mar 25 08:53:54: net.(*conn).Read(0xc00052c090, {0xc0000ec0d1?, 0x0?, 0x0?}) Mar 25 08:53:54: net/net.go:194 +0x45 fp=0xc00006ff80 sp=0xc00006ff38 pc=0x562240b2a585 Mar 25 08:53:54: net/http.(*connReader).backgroundRead(0xc0000ec0c0) Mar 25 08:53:54: net/http/server.go:690 +0x37 fp=0xc00006ffc8 sp=0xc00006ff80 pc=0x562240d162d7 Mar 25 08:53:54: net/http.(*connReader).startBackgroundRead.gowrap2() Mar 25 08:53:54: net/http/server.go:686 +0x25 fp=0xc00006ffe0 sp=0xc00006ffc8 pc=0x562240d16205 Mar 25 08:53:54: runtime.goexit({}) Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc00006ffe8 sp=0xc00006ffe0 pc=0x562240a27021 Mar 25 08:53:54: created by net/http.(*connReader).startBackgroundRead in goroutine 21 Mar 25 08:53:54: net/http/server.go:686 +0xb6 Mar 25 08:53:54: goroutine 21 gp=0xc0001028c0 m=nil [select]: Mar 25 08:53:54: runtime.gopark(0xc000143a58?, 0x2?, 0x4?, 0x0?, 0xc000143834?) 
Mar 25 08:53:54: runtime/proc.go:435 +0xce fp=0xc000143648 sp=0xc000143628 pc=0x562240a1f8ee Mar 25 08:53:54: runtime.selectgo(0xc000143a58, 0xc000143830, 0xc000362400?, 0x0, 0x1?, 0x1) Mar 25 08:53:54: runtime/select.go:351 +0x837 fp=0xc000143780 sp=0xc000143648 pc=0x5622409fe1f7 Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion(0xc0004ba360, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000) Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner/runner.go:688 +0xa25 fp=0xc000143ac0 sp=0xc000143780 pc=0x562240dc24c5 Mar 25 08:53:54: github.com/ollama/ollama/runner/llamarunner.(*Server).completion-fm({0x562241cf9858?, 0xc0000009a0?}, 0xc000125b40?) Mar 25 08:53:54: <autogenerated>:1 +0x36 fp=0xc000143af0 sp=0xc000143ac0 pc=0x562240dc53f6 Mar 25 08:53:54: net/http.HandlerFunc.ServeHTTP(0xc0005372c0?, {0x562241cf9858?, 0xc0000009a0?}, 0xc000125b60?) Mar 25 08:53:54: net/http/server.go:2294 +0x29 fp=0xc000143b18 sp=0xc000143af0 pc=0x562240d1df09 Mar 25 08:53:54: net/http.(*ServeMux).ServeHTTP(0x5622409c4125?, {0x562241cf9858, 0xc0000009a0}, 0xc0004e4000) Mar 25 08:53:54: net/http/server.go:2822 +0x1c4 fp=0xc000143b68 sp=0xc000143b18 pc=0x562240d1fe04 Mar 25 08:53:54: net/http.serverHandler.ServeHTTP({0x562241cf5ef0?}, {0x562241cf9858?, 0xc0000009a0?}, 0x1?) 
Mar 25 08:53:54: net/http/server.go:3301 +0x8e fp=0xc000143b98 sp=0xc000143b68 pc=0x562240d3d88e
Mar 25 08:53:54: net/http.(*conn).serve(0xc0000ee000, {0x562241cfb908, 0xc00034bf50})
Mar 25 08:53:54: net/http/server.go:2102 +0x625 fp=0xc000143fb8 sp=0xc000143b98 pc=0x562240d1c405
Mar 25 08:53:54: net/http.(*Server).Serve.gowrap3()
Mar 25 08:53:54: net/http/server.go:3454 +0x28 fp=0xc000143fe0 sp=0xc000143fb8 pc=0x562240d21cc8
Mar 25 08:53:54: runtime.goexit({})
Mar 25 08:53:54: runtime/asm_amd64.s:1700 +0x1 fp=0xc000143fe8 sp=0xc000143fe0 pc=0x562240a27021
Mar 25 08:53:54: created by net/http.(*Server).Serve in goroutine 1
Mar 25 08:53:54: net/http/server.go:3454 +0x485
Mar 25 08:53:54: rax 0x0
Mar 25 08:53:54: rbx 0x7fb9d1dcb640
Mar 25 08:53:54: rcx 0x7fba191389fc
Mar 25 08:53:54: rdx 0x6
Mar 25 08:53:54: rdi 0x33e82
Mar 25 08:53:54: rsi 0x33e84
Mar 25 08:53:54: rbp 0x33e84
Mar 25 08:53:54: rsp 0x7fb9d1dca180
Mar 25 08:53:54: r8 0x7fb9d1dca250
Mar 25 08:53:54: r9 0x7fb9d1dca220
Mar 25 08:53:54: r10 0x8
Mar 25 08:53:54: r11 0x246
Mar 25 08:53:54: r12 0x6
Mar 25 08:53:54: r13 0x16
Mar 25 08:53:54: r14 0x80
Mar 25 08:53:54: r15 0x8
Mar 25 08:53:54: rip 0x7fba191389fc
Mar 25 08:53:54: rflags 0x246
Mar 25 08:53:54: cs 0x33
Mar 25 08:53:54: fs 0x0
Mar 25 08:53:54: gs 0x0
Mar 25 08:53:54: time=2025-03-25T08:53:54.459Z level=ERROR source=server.go:449 msg="llama runner terminated" error="exit status 2"
Mar 25 08:53:54: [GIN] 2025/03/25 - 08:53:54 | 500 | 1m19s | 127.0.0.1 | POST "/api/generate"
Mar 25 08:53:59: time=2025-03-25T08:53:59.951Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.203447183 model=/usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c
Mar 25 08:54:00: time=2025-03-25T08:54:00.202Z level=WARN source=sched.go:647 msg="gpu VRAM usage didn't recover within timeout" seconds=5.454439782 model=/usr/share/ollama/.ollama/models/blobs/sha256-490e953657a0d4298cf8420dbffe4c705e973978be355eedf5edce272061348c
```

### OS

Linux

### GPU

Nvidia

### CPU

Intel

### Ollama version

0.6.1 (same error when using 0.6.2)
GiteaMirror added the bug label 2026-04-29 01:45:53 -05:00

@rick-github commented on GitHub (Mar 25, 2025):

#8907

Reference: github-starred/ollama#53047