[GH-ISSUE #9352] Ollama segfaults #68162

Closed
opened 2026-05-04 12:40:09 -05:00 by GiteaMirror · 2 comments
Owner

Originally created by @iganev on GitHub (Feb 26, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9352

What is the issue?

Recently I noticed Ollama started hanging again, and restarting the container is the only (temporary) remedy. Currently running 0.5.12.

The instance runs in Docker with 2x RTX 4090.

The workload consists of lots of embedding requests against several different embedding models (all-minilm:33m, bge-m3, bge-large, snowflake-arctic-embed, paraphrase-multilingual, etc.) plus occasional llama3.1:8b summarization requests.

It might be important to note that the llama model is run with n_ctx 8192 instead of the default 2048, which still fits neatly on one of the GPUs with about 3 GB of headroom.
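For reference, that setting has a sizable VRAM footprint: with the scheduler's parallel=10 (visible in the log below), the effective kv_size is 8192 x 10 = 81920, which matches the 10 GiB KV buffer the runner reports. A quick sketch of the arithmetic, using the standard f16 KV-cache estimate and the values from the llm_load_print_meta lines (nothing Ollama-specific):

```python
# Rough f16 KV-cache size for llama3.1:8b at num_ctx=8192 with 10 parallel slots.
# All constants below are taken from the llm_load_print_meta output in the log.
n_layer = 32          # llama.block_count
n_embd_kv = 1024      # n_embd_k_gqa == n_embd_v_gqa (8 KV heads * 128 head dim)
kv_size = 8192 * 10   # n_ctx_per_seq * n_seq_max
bytes_f16 = 2

k_mib = kv_size * n_layer * n_embd_kv * bytes_f16 / 2**20
total_mib = 2 * k_mib  # K cache and V cache are the same size

print(k_mib, total_mib)  # 5120.0 and 10240.0 -> the "KV self size = 10240.00 MiB" line
```

So the 20.4 GiB the scheduler allocates is roughly half weights/compute buffers and half KV cache, which is why only ~3 GB is left over.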

Looking at the logs, I can see bge-large causing consistent segfaults over and over.

Furthermore, it seems like the llama model is being loaded and unloaded on every request, even when the requests are milliseconds apart; I don't know if that contributes to the issue.
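In case it's relevant: per the Ollama docs, how long a model stays resident is controlled by the server-wide OLLAMA_KEEP_ALIVE setting and by the per-request keep_alive field, so a client sending keep_alive: 0 would also explain the churn. For completeness, this is the shape of both knobs (the 30m value is just an example, not what I'm running):

```shell
# Server-wide default, set on the container:
docker run -d --gpus=all -e OLLAMA_KEEP_ALIVE=30m \
  -v ollama:/root/.ollama -p 11434:11434 ollama/ollama

# Or per request, via the keep_alive field:
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1:8b",
  "messages": [{"role": "user", "content": "hi"}],
  "keep_alive": "30m"
}'
```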

The behavior I observe is that at some point Ollama stops serving requests and hangs indefinitely. Restarting the container drops all the hanging connections, and the container then works for a while until it doesn't.

Relevant log output

[GIN] 2025/02/26 - 00:25:16 | 200 | 42.274034387s |     10.252.1.10 | POST     "/api/chat"
time=2025-02-26T00:25:17.084Z level=WARN source=types.go:512 msg="invalid option provided" option=tfs_z
time=2025-02-26T00:25:17.084Z level=WARN source=types.go:512 msg="invalid option provided" option=num_gqa
time=2025-02-26T00:25:17.568Z level=INFO source=sched.go:508 msg="updated VRAM based on existing loaded models" gpu=GPU-d419dbd5-adab-6e8b-e46b-4e45491c3e50 library=cuda total="23.6 GiB" available="21.6 GiB"
time=2025-02-26T00:25:17.568Z level=INFO source=sched.go:508 msg="updated VRAM based on existing loaded models" gpu=GPU-6da9f13b-9b65-b30a-fd59-910f358a7824 library=cuda total="23.6 GiB" available="23.3 GiB"
time=2025-02-26T00:25:17.568Z level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.key_length default=128
time=2025-02-26T00:25:17.568Z level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.value_length default=128
time=2025-02-26T00:25:17.568Z level=INFO source=sched.go:715 msg="new model will fit in available VRAM in single GPU, loading" model=/root/.ollama/models/blobs/sha256-667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29 gpu=GPU-6da9f13b-9b65-b30a-fd59-910f358a7824 parallel=10 available=24976752640 required="20.4 GiB"
time=2025-02-26T00:25:17.674Z level=INFO source=server.go:97 msg="system memory" total="125.5 GiB" free="106.5 GiB" free_swap="8.0 GiB"
time=2025-02-26T00:25:17.674Z level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.key_length default=128
time=2025-02-26T00:25:17.674Z level=WARN source=ggml.go:132 msg="key not found" key=llama.attention.value_length default=128
time=2025-02-26T00:25:17.674Z level=INFO source=server.go:130 msg=offload library=cuda layers.requested=-1 layers.model=33 layers.offload=33 layers.split="" memory.available="[23.3 GiB]" memory.gpu_overhead="0 B" memory.required.full="20.4 GiB" memory.required.partial="20.4 GiB" memory.required.kv="10.0 GiB" memory.required.allocations="[20.4 GiB]" memory.weights.total="13.9 GiB" memory.weights.repeating="13.5 GiB" memory.weights.nonrepeating="411.0 MiB" memory.graph.full="5.2 GiB" memory.graph.partial="5.5 GiB"
time=2025-02-26T00:25:17.674Z level=INFO source=server.go:380 msg="starting llama server" cmd="/usr/bin/ollama runner --model /root/.ollama/models/blobs/sha256-667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29 --ctx-size 81920 --batch-size 512 --n-gpu-layers 33 --threads 8 --parallel 10 --port 37941"
time=2025-02-26T00:25:17.674Z level=INFO source=sched.go:450 msg="loaded runners" count=3
time=2025-02-26T00:25:17.674Z level=INFO source=server.go:557 msg="waiting for llama runner to start responding"
time=2025-02-26T00:25:17.675Z level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server error"
time=2025-02-26T00:25:17.681Z level=INFO source=runner.go:932 msg="starting go runner"
[GIN] 2025/02/26 - 00:25:17 | 200 |  5.065479138s |     10.252.1.10 | POST     "/api/embed"
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
load_backend: loaded CUDA backend from /usr/lib/ollama/cuda_v12/libggml-cuda.so
load_backend: loaded CPU backend from /usr/lib/ollama/libggml-cpu-alderlake.so
time=2025-02-26T00:25:17.703Z level=INFO source=runner.go:935 msg=system info="CPU : LLAMAFILE = 1 | CPU : LLAMAFILE = 1 | CUDA : ARCHS = 600,610,620,700,720,750,800,860,870,890,900 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | cgo(gcc)" threads=8
time=2025-02-26T00:25:17.703Z level=INFO source=runner.go:993 msg="Server listening on 127.0.0.1:37941"
llama_load_model_from_file: using device CUDA0 (NVIDIA GeForce RTX 4090) - 23819 MiB free
llama_model_loader: loaded meta data with 29 key-value pairs and 292 tensors from /root/.ollama/models/blobs/sha256-667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Meta Llama 3.1 8B Instruct
llama_model_loader: - kv   3:                           general.finetune str              = Instruct
llama_model_loader: - kv   4:                           general.basename str              = Meta-Llama-3.1
llama_model_loader: - kv   5:                         general.size_label str              = 8B
llama_model_loader: - kv   6:                            general.license str              = llama3.1
llama_model_loader: - kv   7:                               general.tags arr[str,6]       = ["facebook", "meta", "pytorch", "llam...
llama_model_loader: - kv   8:                          general.languages arr[str,8]       = ["en", "de", "fr", "it", "pt", "hi", ...
llama_model_loader: - kv   9:                          llama.block_count u32              = 32
llama_model_loader: - kv  10:                       llama.context_length u32              = 131072
llama_model_loader: - kv  11:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv  12:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv  13:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv  14:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  15:                       llama.rope.freq_base f32              = 500000.000000
llama_model_loader: - kv  16:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  17:                          general.file_type u32              = 15
llama_model_loader: - kv  18:                           llama.vocab_size u32              = 128256
llama_model_loader: - kv  19:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  20:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  21:                         tokenizer.ggml.pre str              = llama-bpe
llama_model_loader: - kv  22:                      tokenizer.ggml.tokens arr[str,128256]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  23:                  tokenizer.ggml.token_type arr[i32,128256]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  24:                      tokenizer.ggml.merges arr[str,280147]  = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv  25:                tokenizer.ggml.bos_token_id u32              = 128000
llama_model_loader: - kv  26:                tokenizer.ggml.eos_token_id u32              = 128009
llama_model_loader: - kv  27:                    tokenizer.chat_template str              = {{- bos_token }}\n{%- if custom_tools ...
llama_model_loader: - kv  28:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   66 tensors
llama_model_loader: - type q4_K:  193 tensors
llama_model_loader: - type q6_K:   33 tensors
time=2025-02-26T00:25:17.926Z level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server loading model"
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.7999 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 128256
llm_load_print_meta: n_merges         = 280147
llm_load_print_meta: vocab_only       = 0
llm_load_print_meta: n_ctx_train      = 131072
llm_load_print_meta: n_embd           = 4096
llm_load_print_meta: n_layer          = 32
llm_load_print_meta: n_head           = 32
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_swa            = 0
llm_load_print_meta: n_embd_head_k    = 128
llm_load_print_meta: n_embd_head_v    = 128
llm_load_print_meta: n_gqa            = 4
llm_load_print_meta: n_embd_k_gqa     = 1024
llm_load_print_meta: n_embd_v_gqa     = 1024
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 14336
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 1
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn  = 131072
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: ssm_dt_b_c_rms   = 0
llm_load_print_meta: model type       = 8B
llm_load_print_meta: model ftype      = Q4_K - Medium
llm_load_print_meta: model params     = 8.03 B
llm_load_print_meta: model size       = 4.58 GiB (4.89 BPW) 
llm_load_print_meta: general.name     = Meta Llama 3.1 8B Instruct
llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOT token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOM token        = 128008 '<|eom_id|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_print_meta: EOG token        = 128008 '<|eom_id|>'
llm_load_print_meta: EOG token        = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: offloading 32 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 33/33 layers to GPU
llm_load_tensors:        CUDA0 model buffer size =  4403.49 MiB
llm_load_tensors:   CPU_Mapped model buffer size =   281.81 MiB
llama_new_context_with_model: n_seq_max     = 10
llama_new_context_with_model: n_ctx         = 81920
llama_new_context_with_model: n_ctx_per_seq = 8192
llama_new_context_with_model: n_batch       = 5120
llama_new_context_with_model: n_ubatch      = 512
llama_new_context_with_model: flash_attn    = 0
llama_new_context_with_model: freq_base     = 500000.0
llama_new_context_with_model: freq_scale    = 1
llama_new_context_with_model: n_ctx_per_seq (8192) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_kv_cache_init: kv_size = 81920, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 32, can_shift = 1
llama_kv_cache_init:      CUDA0 KV buffer size = 10240.00 MiB
llama_new_context_with_model: KV self size  = 10240.00 MiB, K (f16): 5120.00 MiB, V (f16): 5120.00 MiB
llama_new_context_with_model:  CUDA_Host  output buffer size =     5.05 MiB
llama_new_context_with_model:      CUDA0 compute buffer size =  5312.00 MiB
llama_new_context_with_model:  CUDA_Host compute buffer size =   168.01 MiB
llama_new_context_with_model: graph nodes  = 1030
llama_new_context_with_model: graph splits = 2
time=2025-02-26T00:25:18.930Z level=INFO source=server.go:596 msg="llama runner started in 1.26 seconds"
llama_model_loader: loaded meta data with 29 key-value pairs and 292 tensors from /root/.ollama/models/blobs/sha256-667b0c1932bc6ffc593ed1d03f895bf2dc8dc6df21db3042284a6f4416b06a29 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Meta Llama 3.1 8B Instruct
llama_model_loader: - kv   3:                           general.finetune str              = Instruct
llama_model_loader: - kv   4:                           general.basename str              = Meta-Llama-3.1
llama_model_loader: - kv   5:                         general.size_label str              = 8B
llama_model_loader: - kv   6:                            general.license str              = llama3.1
llama_model_loader: - kv   7:                               general.tags arr[str,6]       = ["facebook", "meta", "pytorch", "llam...
llama_model_loader: - kv   8:                          general.languages arr[str,8]       = ["en", "de", "fr", "it", "pt", "hi", ...
llama_model_loader: - kv   9:                          llama.block_count u32              = 32
llama_model_loader: - kv  10:                       llama.context_length u32              = 131072
llama_model_loader: - kv  11:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv  12:                  llama.feed_forward_length u32              = 14336
llama_model_loader: - kv  13:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv  14:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  15:                       llama.rope.freq_base f32              = 500000.000000
llama_model_loader: - kv  16:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  17:                          general.file_type u32              = 15
llama_model_loader: - kv  18:                           llama.vocab_size u32              = 128256
llama_model_loader: - kv  19:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  20:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  21:                         tokenizer.ggml.pre str              = llama-bpe
llama_model_loader: - kv  22:                      tokenizer.ggml.tokens arr[str,128256]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  23:                  tokenizer.ggml.token_type arr[i32,128256]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  24:                      tokenizer.ggml.merges arr[str,280147]  = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv  25:                tokenizer.ggml.bos_token_id u32              = 128000
llama_model_loader: - kv  26:                tokenizer.ggml.eos_token_id u32              = 128009
llama_model_loader: - kv  27:                    tokenizer.chat_template str              = {{- bos_token }}\n{%- if custom_tools ...
llama_model_loader: - kv  28:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   66 tensors
llama_model_loader: - type q4_K:  193 tensors
llama_model_loader: - type q6_K:   33 tensors
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.7999 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 128256
llm_load_print_meta: n_merges         = 280147
llm_load_print_meta: vocab_only       = 1
llm_load_print_meta: model type       = ?B
llm_load_print_meta: model ftype      = all F32
llm_load_print_meta: model params     = 8.03 B
llm_load_print_meta: model size       = 4.58 GiB (4.89 BPW) 
llm_load_print_meta: general.name     = Meta Llama 3.1 8B Instruct
llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOT token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOM token        = 128008 '<|eom_id|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_print_meta: EOG token        = 128008 '<|eom_id|>'
llm_load_print_meta: EOG token        = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
llama_model_load: vocab only - skipping tensors





//ml/backend/ggml/ggml/src/ggml-cpu/ggml-cpu.c:8456: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
//ml/backend/ggml/ggml/src/ggml-cpu/ggml-cpu.c:8456: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
SIGSEGV: segmentation violation
PC=0x7f27a8e24c47 m=0 sigcode=1 addr=0x206a03fb4
signal arrived during cgo execution

goroutine 29 gp=0xc000585dc0 m=0 mp=0x64b1b235c780 [syscall]:
runtime.cgocall(0x64b1b1512ce0, 0xc0000bfba0)
        runtime/cgocall.go:167 +0x4b fp=0xc0000bfb78 sp=0xc0000bfb40 pc=0x64b1b08fdacb
github.com/ollama/ollama/llama._Cfunc_llama_decode(0x7f2778ad5c10, {0x2, 0x7f2779219860, 0x0, 0x0, 0x7f277921a070, 0x7f277921a880, 0x7f277921b090, 0x7f27790752d0})
        _cgo_gotypes.go:545 +0x4f fp=0xc0000bfba0 sp=0xc0000bfb78 pc=0x64b1b0cb356f
github.com/ollama/ollama/llama.(*Context).Decode.func1(0x64b1b0cd248b?, 0x7f2778ad5c10?)
        github.com/ollama/ollama/llama/llama.go:163 +0xf5 fp=0xc0000bfc90 sp=0xc0000bfba0 pc=0x64b1b0cb6295
github.com/ollama/ollama/llama.(*Context).Decode(0xc0002fe0e0?, 0x0?)
        github.com/ollama/ollama/llama/llama.go:163 +0x13 fp=0xc0000bfcd8 sp=0xc0000bfc90 pc=0x64b1b0cb6113
github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0001eb560, 0xc0005f0000, 0xc0000bff20)
        github.com/ollama/ollama/runner/llamarunner/runner.go:435 +0x23f fp=0xc0000bfee0 sp=0xc0000bfcd8 pc=0x64b1b0cd127f
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0001eb560, {0x64b1b1b65920, 0xc000511130})
        github.com/ollama/ollama/runner/llamarunner/runner.go:343 +0x1d5 fp=0xc0000bffb8 sp=0xc0000bfee0 pc=0x64b1b0cd0cb5
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
        github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0x28 fp=0xc0000bffe0 sp=0xc0000bffb8 pc=0x64b1b0cd5b48
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bffe8 sp=0xc0000bffe0 pc=0x64b1b090c5a1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
        github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0xdb5

goroutine 1 gp=0xc0000061c0 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0001335c0 sp=0xc0001335a0 pc=0x64b1b09041ce
runtime.netpollblock(0xc00011f610?, 0xb089afe6?, 0xb1?)
        runtime/netpoll.go:575 +0xf7 fp=0xc0001335f8 sp=0xc0001335c0 pc=0x64b1b08c7e37
internal/poll.runtime_pollWait(0x7f27fb610680, 0x72)
        runtime/netpoll.go:351 +0x85 fp=0xc000133618 sp=0xc0001335f8 pc=0x64b1b09034c5
internal/poll.(*pollDesc).wait(0xc0004e2380?, 0x900000036?, 0x0)
        internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000133640 sp=0xc000133618 pc=0x64b1b098b707
internal/poll.(*pollDesc).waitRead(...)
        internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc0004e2380)
        internal/poll/fd_unix.go:620 +0x295 fp=0xc0001336e8 sp=0xc000133640 pc=0x64b1b0990ad5
net.(*netFD).accept(0xc0004e2380)
        net/fd_unix.go:172 +0x29 fp=0xc0001337a0 sp=0xc0001336e8 pc=0x64b1b09f9bc9
net.(*TCPListener).accept(0xc00062d6c0)
        net/tcpsock_posix.go:159 +0x1e fp=0xc0001337f0 sp=0xc0001337a0 pc=0x64b1b0a0f83e
net.(*TCPListener).Accept(0xc00062d6c0)
        net/tcpsock.go:372 +0x30 fp=0xc000133820 sp=0xc0001337f0 pc=0x64b1b0a0e6f0
net/http.(*onceCloseListener).Accept(0xc0005f4090?)
        <autogenerated>:1 +0x24 fp=0xc000133838 sp=0xc000133820 pc=0x64b1b0c58964
net/http.(*Server).Serve(0xc0005bd1d0, {0x64b1b1b634f8, 0xc00062d6c0})
        net/http/server.go:3330 +0x30c fp=0xc000133968 sp=0xc000133838 pc=0x64b1b0c308ec
github.com/ollama/ollama/runner/llamarunner.Execute({0xc000036220, 0xe, 0xe})
        github.com/ollama/ollama/runner/llamarunner/runner.go:994 +0x1174 fp=0xc000133d08 sp=0xc000133968 pc=0x64b1b0cd5834
github.com/ollama/ollama/runner.Execute({0xc000036210?, 0x0?, 0x0?})
        github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc000133d30 sp=0xc000133d08 pc=0x64b1b0f05c54
github.com/ollama/ollama/cmd.NewCLI.func2(0xc000037700?, {0x64b1b1700050?, 0x4?, 0x64b1b1700054?})
        github.com/ollama/ollama/cmd/cmd.go:1280 +0x45 fp=0xc000133d58 sp=0xc000133d30 pc=0x64b1b1512245
github.com/spf13/cobra.(*Command).execute(0xc0004e7b08, {0xc000645180, 0xe, 0xe})
        github.com/spf13/cobra@v1.7.0/command.go:940 +0x862 fp=0xc000133e78 sp=0xc000133d58 pc=0x64b1b0a72902
github.com/spf13/cobra.(*Command).ExecuteC(0xc0005d5b08)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000133f30 sp=0xc000133e78 pc=0x64b1b0a73145
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
        github.com/ollama/ollama/main.go:12 +0x4d fp=0xc000133f50 sp=0xc000133f30 pc=0x64b1b15125cd
runtime.main()
        runtime/proc.go:272 +0x29d fp=0xc000133fe0 sp=0xc000133f50 pc=0x64b1b08cf4dd
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000133fe8 sp=0xc000133fe0 pc=0x64b1b090c5a1

goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000aafa8 sp=0xc0000aaf88 pc=0x64b1b09041ce
runtime.goparkunlock(...)
        runtime/proc.go:430
runtime.forcegchelper()
        runtime/proc.go:337 +0xb8 fp=0xc0000aafe0 sp=0xc0000aafa8 pc=0x64b1b08cf818
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000aafe8 sp=0xc0000aafe0 pc=0x64b1b090c5a1
created by runtime.init.7 in goroutine 1
        runtime/proc.go:325 +0x1a

goroutine 3 gp=0xc000007180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000ab780 sp=0xc0000ab760 pc=0x64b1b09041ce
runtime.goparkunlock(...)
        runtime/proc.go:430
runtime.bgsweep(0xc00003e080)
        runtime/mgcsweep.go:317 +0xdf fp=0xc0000ab7c8 sp=0xc0000ab780 pc=0x64b1b08b9ebf
runtime.gcenable.gowrap1()
        runtime/mgc.go:204 +0x25 fp=0xc0000ab7e0 sp=0xc0000ab7c8 pc=0x64b1b08ae505
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ab7e8 sp=0xc0000ab7e0 pc=0x64b1b090c5a1
created by runtime.gcenable in goroutine 1
        runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x64b1b18b36f8?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000abf78 sp=0xc0000abf58 pc=0x64b1b09041ce
runtime.goparkunlock(...)
        runtime/proc.go:430
runtime.(*scavengerState).park(0x64b1b235a080)
        runtime/mgcscavenge.go:425 +0x49 fp=0xc0000abfa8 sp=0xc0000abf78 pc=0x64b1b08b7889
runtime.bgscavenge(0xc00003e080)
        runtime/mgcscavenge.go:658 +0x59 fp=0xc0000abfc8 sp=0xc0000abfa8 pc=0x64b1b08b7e19
runtime.gcenable.gowrap2()
        runtime/mgc.go:205 +0x25 fp=0xc0000abfe0 sp=0xc0000abfc8 pc=0x64b1b08ae4a5
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000abfe8 sp=0xc0000abfe0 pc=0x64b1b090c5a1
created by runtime.gcenable in goroutine 1
        runtime/mgc.go:205 +0xa5

goroutine 5 gp=0xc000007c00 m=nil [finalizer wait]:
runtime.gopark(0x0?, 0x64b1b1b52670?, 0x20?, 0xe0?, 0x1000000010?)
        runtime/proc.go:424 +0xce fp=0xc0000aa620 sp=0xc0000aa600 pc=0x64b1b09041ce
runtime.runfinq()
        runtime/mfinal.go:193 +0x107 fp=0xc0000aa7e0 sp=0xc0000aa620 pc=0x64b1b08ad587
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000aa7e8 sp=0xc0000aa7e0 pc=0x64b1b090c5a1
created by runtime.createfing in goroutine 1
        runtime/mfinal.go:163 +0x3d

goroutine 6 gp=0xc000209500 m=nil [chan receive]:
runtime.gopark(0xc0000ac760?, 0x64b1b09e1245?, 0x60?, 0xc9?, 0x64b1b1b78280?)
        runtime/proc.go:424 +0xce fp=0xc0000ac718 sp=0xc0000ac6f8 pc=0x64b1b09041ce
runtime.chanrecv(0xc0000e4310, 0x0, 0x1)
        runtime/chan.go:639 +0x41c fp=0xc0000ac790 sp=0xc0000ac718 pc=0x64b1b089dbfc
runtime.chanrecv1(0x0?, 0x0?)
        runtime/chan.go:489 +0x12 fp=0xc0000ac7b8 sp=0xc0000ac790 pc=0x64b1b089d7b2
runtime.unique_runtime_registerUniqueMapCleanup.func1(...)
        runtime/mgc.go:1781
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
        runtime/mgc.go:1784 +0x2f fp=0xc0000ac7e0 sp=0xc0000ac7b8 pc=0x64b1b08b156f
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ac7e8 sp=0xc0000ac7e0 pc=0x64b1b090c5a1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
        runtime/mgc.go:1779 +0x96

goroutine 7 gp=0xc000209880 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000acf38 sp=0xc0000acf18 pc=0x64b1b09041ce
runtime.gcBgMarkWorker(0xc0000e5730)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000acfc8 sp=0xc0000acf38 pc=0x64b1b08b0869
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000acfe0 sp=0xc0000acfc8 pc=0x64b1b08b0745
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000acfe8 sp=0xc0000acfe0 pc=0x64b1b090c5a1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 8 gp=0xc000209a40 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000ad738 sp=0xc0000ad718 pc=0x64b1b09041ce
runtime.gcBgMarkWorker(0xc0000e5730)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000ad7c8 sp=0xc0000ad738 pc=0x64b1b08b0869
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000ad7e0 sp=0xc0000ad7c8 pc=0x64b1b08b0745
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ad7e8 sp=0xc0000ad7e0 pc=0x64b1b090c5a1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 9 gp=0xc000209c00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000adf38 sp=0xc0000adf18 pc=0x64b1b09041ce
runtime.gcBgMarkWorker(0xc0000e5730)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000adfc8 sp=0xc0000adf38 pc=0x64b1b08b0869
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000adfe0 sp=0xc0000adfc8 pc=0x64b1b08b0745
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000adfe8 sp=0xc0000adfe0 pc=0x64b1b090c5a1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 10 gp=0xc000209dc0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a6738 sp=0xc0000a6718 pc=0x64b1b09041ce
runtime.gcBgMarkWorker(0xc0000e5730)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000a67c8 sp=0xc0000a6738 pc=0x64b1b08b0869
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000a67e0 sp=0xc0000a67c8 pc=0x64b1b08b0745
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a67e8 sp=0xc0000a67e0 pc=0x64b1b090c5a1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 11 gp=0xc0004a4000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a6f38 sp=0xc0000a6f18 pc=0x64b1b09041ce
runtime.gcBgMarkWorker(0xc0000e5730)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000a6fc8 sp=0xc0000a6f38 pc=0x64b1b08b0869
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000a6fe0 sp=0xc0000a6fc8 pc=0x64b1b08b0745
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a6fe8 sp=0xc0000a6fe0 pc=0x64b1b090c5a1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 12 gp=0xc0004a41c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a7738 sp=0xc0000a7718 pc=0x64b1b09041ce
runtime.gcBgMarkWorker(0xc0000e5730)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000a77c8 sp=0xc0000a7738 pc=0x64b1b08b0869
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000a77e0 sp=0xc0000a77c8 pc=0x64b1b08b0745
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a77e8 sp=0xc0000a77e0 pc=0x64b1b090c5a1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 13 gp=0xc0004a4380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a7f38 sp=0xc0000a7f18 pc=0x64b1b09041ce
runtime.gcBgMarkWorker(0xc0000e5730)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000a7fc8 sp=0xc0000a7f38 pc=0x64b1b08b0869
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000a7fe0 sp=0xc0000a7fc8 pc=0x64b1b08b0745
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a7fe8 sp=0xc0000a7fe0 pc=0x64b1b090c5a1
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

[... 25 more idle GC worker goroutines (14–72) with stacks identical to the one above omitted ...]

goroutine 74 gp=0xc000104a80 m=nil [chan receive]:
runtime.gopark(0x64b1b090a5b4?, 0xc000137898?, 0xd0?, 0x22?, 0xc000137880?)
        runtime/proc.go:424 +0xce fp=0xc000137860 sp=0xc000137840 pc=0x64b1b09041ce
runtime.chanrecv(0xc0003ac070, 0xc000137a10, 0x1)
        runtime/chan.go:639 +0x41c fp=0xc0001378d8 sp=0xc000137860 pc=0x64b1b089dbfc
runtime.chanrecv1(0xc00026c060?, 0xc00066c808?)
        runtime/chan.go:489 +0x12 fp=0xc000137900 sp=0xc0001378d8 pc=0x64b1b089d7b2
github.com/ollama/ollama/runner/llamarunner.(*Server).embeddings(0xc0001eb560, {0x64b1b1b63708, 0xc0006440e0}, 0xc0004c8140)
        github.com/ollama/ollama/runner/llamarunner/runner.go:783 +0x746 fp=0xc000137ac0 sp=0xc000137900 pc=0x64b1b0cd3c06
github.com/ollama/ollama/runner/llamarunner.(*Server).embeddings-fm({0x64b1b1b63708?, 0xc0006440e0?}, 0x64b1b0c3a6c7?)
        <autogenerated>:1 +0x36 fp=0xc000137af0 sp=0xc000137ac0 pc=0x64b1b0cd5ff6
net/http.HandlerFunc.ServeHTTP(0xc000645340?, {0x64b1b1b63708?, 0xc0006440e0?}, 0x0?)
        net/http/server.go:2220 +0x29 fp=0xc000137b18 sp=0xc000137af0 pc=0x64b1b0c2cee9
net/http.(*ServeMux).ServeHTTP(0x64b1b08a4a05?, {0x64b1b1b63708, 0xc0006440e0}, 0xc0004c8140)
        net/http/server.go:2747 +0x1ca fp=0xc000137b68 sp=0xc000137b18 pc=0x64b1b0c2edea
net/http.serverHandler.ServeHTTP({0x64b1b1b600d0?}, {0x64b1b1b63708?, 0xc0006440e0?}, 0x6?)
        net/http/server.go:3210 +0x8e fp=0xc000137b98 sp=0xc000137b68 pc=0x64b1b0c4c34e
net/http.(*conn).serve(0xc0005f4090, {0x64b1b1b658e8, 0xc0001fe8a0})
        net/http/server.go:2092 +0x5d0 fp=0xc000137fb8 sp=0xc000137b98 pc=0x64b1b0c2b890
net/http.(*Server).Serve.gowrap3()
        net/http/server.go:3360 +0x28 fp=0xc000137fe0 sp=0xc000137fb8 pc=0x64b1b0c30ce8
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000137fe8 sp=0xc000137fe0 pc=0x64b1b090c5a1
created by net/http.(*Server).Serve in goroutine 1
        net/http/server.go:3360 +0x485

goroutine 136 gp=0xc000504a80 m=nil [IO wait]:
runtime.gopark(0x64b1b08a8ee5?, 0x0?, 0xf8?, 0xb5?, 0xb?)
        runtime/proc.go:424 +0xce fp=0xc00050b5a8 sp=0xc00050b588 pc=0x64b1b09041ce
runtime.netpollblock(0x64b1b09276b8?, 0xb089afe6?, 0xb1?)
        runtime/netpoll.go:575 +0xf7 fp=0xc00050b5e0 sp=0xc00050b5a8 pc=0x64b1b08c7e37
internal/poll.runtime_pollWait(0x7f27fb610568, 0x72)
        runtime/netpoll.go:351 +0x85 fp=0xc00050b600 sp=0xc00050b5e0 pc=0x64b1b09034c5
internal/poll.(*pollDesc).wait(0xc0005f8000?, 0xc00026c521?, 0x0)
        internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00050b628 sp=0xc00050b600 pc=0x64b1b098b707
internal/poll.(*pollDesc).waitRead(...)
        internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0005f8000, {0xc00026c521, 0x1, 0x1})
        internal/poll/fd_unix.go:165 +0x27a fp=0xc00050b6c0 sp=0xc00050b628 pc=0x64b1b098c9fa
net.(*netFD).Read(0xc0005f8000, {0xc00026c521?, 0xc00050b748?, 0x64b1b0905e50?})
        net/fd_posix.go:55 +0x25 fp=0xc00050b708 sp=0xc00050b6c0 pc=0x64b1b09f7c05
net.(*conn).Read(0xc0000ae040, {0xc00026c521?, 0x0?, 0x64b1b2406480?})
        net/net.go:189 +0x45 fp=0xc00050b750 sp=0xc00050b708 pc=0x64b1b0a06205
net.(*TCPConn).Read(0xc00026c510?, {0xc00026c521?, 0x0?, 0x0?})
        <autogenerated>:1 +0x25 fp=0xc00050b780 sp=0xc00050b750 pc=0x64b1b0a19405
net/http.(*connReader).backgroundRead(0xc00026c510)
        net/http/server.go:690 +0x37 fp=0xc00050b7c8 sp=0xc00050b780 pc=0x64b1b0c26217
net/http.(*connReader).startBackgroundRead.gowrap2()
        net/http/server.go:686 +0x25 fp=0xc00050b7e0 sp=0xc00050b7c8 pc=0x64b1b0c26145
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00050b7e8 sp=0xc00050b7e0 pc=0x64b1b090c5a1
created by net/http.(*connReader).startBackgroundRead in goroutine 74
        net/http/server.go:686 +0xb6

rax    0x206a03fb4
rbx    0x7f2778170400
rcx    0xfed
rdx    0x7f2778008820
rdi    0x7f2778008830
rsi    0x0
rbp    0x7ffcfc5e3ea0
rsp    0x7ffcfc5e3e80
r8     0x0
r9     0x7f27b382c430
r10    0x0
r11    0x246
r12    0x7f26a4001360
r13    0x7f2778008830
r14    0x0
r15    0x64b1c6690f70
rip    0x7f27a8e24c47
rflags 0x10297
cs     0x33
fs     0x0
gs     0x0
SIGABRT: abort
PC=0x7f27fb82200b m=0 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 29 gp=0xc000585dc0 m=0 mp=0x64b1b235c780 [syscall]:
runtime.cgocall(0x64b1b1512ce0, 0xc0000bfba0)
        runtime/cgocall.go:167 +0x4b fp=0xc0000bfb78 sp=0xc0000bfb40 pc=0x64b1b08fdacb
github.com/ollama/ollama/llama._Cfunc_llama_decode(0x7f2778ad5c10, {0x2, 0x7f2779219860, 0x0, 0x0, 0x7f277921a070, 0x7f277921a880, 0x7f277921b090, 0x7f27790752d0})
        _cgo_gotypes.go:545 +0x4f fp=0xc0000bfba0 sp=0xc0000bfb78 pc=0x64b1b0cb356f
github.com/ollama/ollama/llama.(*Context).Decode.func1(0x64b1b0cd248b?, 0x7f2778ad5c10?)
        github.com/ollama/ollama/llama/llama.go:163 +0xf5 fp=0xc0000bfc90 sp=0xc0000bfba0 pc=0x64b1b0cb6295
github.com/ollama/ollama/llama.(*Context).Decode(0xc0002fe0e0?, 0x0?)
        github.com/ollama/ollama/llama/llama.go:163 +0x13 fp=0xc0000bfcd8 sp=0xc0000bfc90 pc=0x64b1b0cb6113
github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0001eb560, 0xc0005f0000, 0xc0000bff20)
        github.com/ollama/ollama/runner/llamarunner/runner.go:435 +0x23f fp=0xc0000bfee0 sp=0xc0000bfcd8 pc=0x64b1b0cd127f
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0001eb560, {0x64b1b1b65920, 0xc000511130})
        github.com/ollama/ollama/runner/llamarunner/runner.go:343 +0x1d5 fp=0xc0000bffb8 sp=0xc0000bfee0 pc=0x64b1b0cd0cb5
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
        github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0x28 fp=0xc0000bffe0 sp=0xc0000bffb8 pc=0x64b1b0cd5b48
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bffe8 sp=0xc0000bffe0 pc=0x64b1b090c5a1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
        github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0xdb5

etc etc etc.

OS: Docker

GPU: Nvidia

CPU: Intel

Ollama version: 0.5.12

github.com/ollama/ollama/llama.(*Context).Decode.func1(0x64b1b0cd248b?, 0x7f2778ad5c10?) github.com/ollama/ollama/llama/llama.go:163 +0xf5 fp=0xc0000bfc90 sp=0xc0000bfba0 pc=0x64b1b0cb6295 github.com/ollama/ollama/llama.(*Context).Decode(0xc0002fe0e0?, 0x0?) github.com/ollama/ollama/llama/llama.go:163 +0x13 fp=0xc0000bfcd8 sp=0xc0000bfc90 pc=0x64b1b0cb6113 github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0001eb560, 0xc0005f0000, 0xc0000bff20) github.com/ollama/ollama/runner/llamarunner/runner.go:435 +0x23f fp=0xc0000bfee0 sp=0xc0000bfcd8 pc=0x64b1b0cd127f github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0001eb560, {0x64b1b1b65920, 0xc000511130}) github.com/ollama/ollama/runner/llamarunner/runner.go:343 +0x1d5 fp=0xc0000bffb8 sp=0xc0000bfee0 pc=0x64b1b0cd0cb5 github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2() github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0x28 fp=0xc0000bffe0 sp=0xc0000bffb8 pc=0x64b1b0cd5b48 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bffe8 sp=0xc0000bffe0 pc=0x64b1b090c5a1 created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1 github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0xdb5 goroutine 1 gp=0xc0000061c0 m=nil [IO wait]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0001335c0 sp=0xc0001335a0 pc=0x64b1b09041ce runtime.netpollblock(0xc00011f610?, 0xb089afe6?, 0xb1?) runtime/netpoll.go:575 +0xf7 fp=0xc0001335f8 sp=0xc0001335c0 pc=0x64b1b08c7e37 internal/poll.runtime_pollWait(0x7f27fb610680, 0x72) runtime/netpoll.go:351 +0x85 fp=0xc000133618 sp=0xc0001335f8 pc=0x64b1b09034c5 internal/poll.(*pollDesc).wait(0xc0004e2380?, 0x900000036?, 0x0) internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000133640 sp=0xc000133618 pc=0x64b1b098b707 internal/poll.(*pollDesc).waitRead(...) 
internal/poll/fd_poll_runtime.go:89 internal/poll.(*FD).Accept(0xc0004e2380) internal/poll/fd_unix.go:620 +0x295 fp=0xc0001336e8 sp=0xc000133640 pc=0x64b1b0990ad5 net.(*netFD).accept(0xc0004e2380) net/fd_unix.go:172 +0x29 fp=0xc0001337a0 sp=0xc0001336e8 pc=0x64b1b09f9bc9 net.(*TCPListener).accept(0xc00062d6c0) net/tcpsock_posix.go:159 +0x1e fp=0xc0001337f0 sp=0xc0001337a0 pc=0x64b1b0a0f83e net.(*TCPListener).Accept(0xc00062d6c0) net/tcpsock.go:372 +0x30 fp=0xc000133820 sp=0xc0001337f0 pc=0x64b1b0a0e6f0 net/http.(*onceCloseListener).Accept(0xc0005f4090?) <autogenerated>:1 +0x24 fp=0xc000133838 sp=0xc000133820 pc=0x64b1b0c58964 net/http.(*Server).Serve(0xc0005bd1d0, {0x64b1b1b634f8, 0xc00062d6c0}) net/http/server.go:3330 +0x30c fp=0xc000133968 sp=0xc000133838 pc=0x64b1b0c308ec github.com/ollama/ollama/runner/llamarunner.Execute({0xc000036220, 0xe, 0xe}) github.com/ollama/ollama/runner/llamarunner/runner.go:994 +0x1174 fp=0xc000133d08 sp=0xc000133968 pc=0x64b1b0cd5834 github.com/ollama/ollama/runner.Execute({0xc000036210?, 0x0?, 0x0?}) github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc000133d30 sp=0xc000133d08 pc=0x64b1b0f05c54 github.com/ollama/ollama/cmd.NewCLI.func2(0xc000037700?, {0x64b1b1700050?, 0x4?, 0x64b1b1700054?}) github.com/ollama/ollama/cmd/cmd.go:1280 +0x45 fp=0xc000133d58 sp=0xc000133d30 pc=0x64b1b1512245 github.com/spf13/cobra.(*Command).execute(0xc0004e7b08, {0xc000645180, 0xe, 0xe}) github.com/spf13/cobra@v1.7.0/command.go:940 +0x862 fp=0xc000133e78 sp=0xc000133d58 pc=0x64b1b0a72902 github.com/spf13/cobra.(*Command).ExecuteC(0xc0005d5b08) github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc000133f30 sp=0xc000133e78 pc=0x64b1b0a73145 github.com/spf13/cobra.(*Command).Execute(...) github.com/spf13/cobra@v1.7.0/command.go:992 github.com/spf13/cobra.(*Command).ExecuteContext(...) 
github.com/spf13/cobra@v1.7.0/command.go:985 main.main() github.com/ollama/ollama/main.go:12 +0x4d fp=0xc000133f50 sp=0xc000133f30 pc=0x64b1b15125cd runtime.main() runtime/proc.go:272 +0x29d fp=0xc000133fe0 sp=0xc000133f50 pc=0x64b1b08cf4dd runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000133fe8 sp=0xc000133fe0 pc=0x64b1b090c5a1 goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000aafa8 sp=0xc0000aaf88 pc=0x64b1b09041ce runtime.goparkunlock(...) runtime/proc.go:430 runtime.forcegchelper() runtime/proc.go:337 +0xb8 fp=0xc0000aafe0 sp=0xc0000aafa8 pc=0x64b1b08cf818 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000aafe8 sp=0xc0000aafe0 pc=0x64b1b090c5a1 created by runtime.init.7 in goroutine 1 runtime/proc.go:325 +0x1a goroutine 3 gp=0xc000007180 m=nil [GC sweep wait]: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000ab780 sp=0xc0000ab760 pc=0x64b1b09041ce runtime.goparkunlock(...) runtime/proc.go:430 runtime.bgsweep(0xc00003e080) runtime/mgcsweep.go:317 +0xdf fp=0xc0000ab7c8 sp=0xc0000ab780 pc=0x64b1b08b9ebf runtime.gcenable.gowrap1() runtime/mgc.go:204 +0x25 fp=0xc0000ab7e0 sp=0xc0000ab7c8 pc=0x64b1b08ae505 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ab7e8 sp=0xc0000ab7e0 pc=0x64b1b090c5a1 created by runtime.gcenable in goroutine 1 runtime/mgc.go:204 +0x66 goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]: runtime.gopark(0x10000?, 0x64b1b18b36f8?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000abf78 sp=0xc0000abf58 pc=0x64b1b09041ce runtime.goparkunlock(...) 
runtime/proc.go:430 runtime.(*scavengerState).park(0x64b1b235a080) runtime/mgcscavenge.go:425 +0x49 fp=0xc0000abfa8 sp=0xc0000abf78 pc=0x64b1b08b7889 runtime.bgscavenge(0xc00003e080) runtime/mgcscavenge.go:658 +0x59 fp=0xc0000abfc8 sp=0xc0000abfa8 pc=0x64b1b08b7e19 runtime.gcenable.gowrap2() runtime/mgc.go:205 +0x25 fp=0xc0000abfe0 sp=0xc0000abfc8 pc=0x64b1b08ae4a5 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000abfe8 sp=0xc0000abfe0 pc=0x64b1b090c5a1 created by runtime.gcenable in goroutine 1 runtime/mgc.go:205 +0xa5 goroutine 5 gp=0xc000007c00 m=nil [finalizer wait]: runtime.gopark(0x0?, 0x64b1b1b52670?, 0x20?, 0xe0?, 0x1000000010?) runtime/proc.go:424 +0xce fp=0xc0000aa620 sp=0xc0000aa600 pc=0x64b1b09041ce runtime.runfinq() runtime/mfinal.go:193 +0x107 fp=0xc0000aa7e0 sp=0xc0000aa620 pc=0x64b1b08ad587 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000aa7e8 sp=0xc0000aa7e0 pc=0x64b1b090c5a1 created by runtime.createfing in goroutine 1 runtime/mfinal.go:163 +0x3d goroutine 6 gp=0xc000209500 m=nil [chan receive]: runtime.gopark(0xc0000ac760?, 0x64b1b09e1245?, 0x60?, 0xc9?, 0x64b1b1b78280?) runtime/proc.go:424 +0xce fp=0xc0000ac718 sp=0xc0000ac6f8 pc=0x64b1b09041ce runtime.chanrecv(0xc0000e4310, 0x0, 0x1) runtime/chan.go:639 +0x41c fp=0xc0000ac790 sp=0xc0000ac718 pc=0x64b1b089dbfc runtime.chanrecv1(0x0?, 0x0?) runtime/chan.go:489 +0x12 fp=0xc0000ac7b8 sp=0xc0000ac790 pc=0x64b1b089d7b2 runtime.unique_runtime_registerUniqueMapCleanup.func1(...) runtime/mgc.go:1781 runtime.unique_runtime_registerUniqueMapCleanup.gowrap1() runtime/mgc.go:1784 +0x2f fp=0xc0000ac7e0 sp=0xc0000ac7b8 pc=0x64b1b08b156f runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ac7e8 sp=0xc0000ac7e0 pc=0x64b1b090c5a1 created by unique.runtime_registerUniqueMapCleanup in goroutine 1 runtime/mgc.go:1779 +0x96 goroutine 7 gp=0xc000209880 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0000acf38 sp=0xc0000acf18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000acfc8 sp=0xc0000acf38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000acfe0 sp=0xc0000acfc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000acfe8 sp=0xc0000acfe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 8 gp=0xc000209a40 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000ad738 sp=0xc0000ad718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000ad7c8 sp=0xc0000ad738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000ad7e0 sp=0xc0000ad7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000ad7e8 sp=0xc0000ad7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 9 gp=0xc000209c00 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000adf38 sp=0xc0000adf18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000adfc8 sp=0xc0000adf38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000adfe0 sp=0xc0000adfc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000adfe8 sp=0xc0000adfe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 10 gp=0xc000209dc0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0000a6738 sp=0xc0000a6718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a67c8 sp=0xc0000a6738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a67e0 sp=0xc0000a67c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a67e8 sp=0xc0000a67e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 11 gp=0xc0004a4000 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000a6f38 sp=0xc0000a6f18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a6fc8 sp=0xc0000a6f38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a6fe0 sp=0xc0000a6fc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a6fe8 sp=0xc0000a6fe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 12 gp=0xc0004a41c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000a7738 sp=0xc0000a7718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a77c8 sp=0xc0000a7738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a77e0 sp=0xc0000a77c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a77e8 sp=0xc0000a77e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 13 gp=0xc0004a4380 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0000a7f38 sp=0xc0000a7f18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a7fc8 sp=0xc0000a7f38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a7fe0 sp=0xc0000a7fc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a7fe8 sp=0xc0000a7fe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 14 gp=0xc0004a4540 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000a8738 sp=0xc0000a8718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a87c8 sp=0xc0000a8738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a87e0 sp=0xc0000a87c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a87e8 sp=0xc0000a87e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 15 gp=0xc0004a4700 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000a8f38 sp=0xc0000a8f18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a8fc8 sp=0xc0000a8f38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a8fe0 sp=0xc0000a8fc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a8fe8 sp=0xc0000a8fe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 16 gp=0xc0004a48c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0000a9738 sp=0xc0000a9718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a97c8 sp=0xc0000a9738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a97e0 sp=0xc0000a97c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a97e8 sp=0xc0000a97e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 18 gp=0xc0004a4a80 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000a9f38 sp=0xc0000a9f18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0000a9fc8 sp=0xc0000a9f38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a9fe0 sp=0xc0000a9fc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a9fe8 sp=0xc0000a9fe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 19 gp=0xc0004a4c40 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004ac738 sp=0xc0004ac718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004ac7c8 sp=0xc0004ac738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004ac7e0 sp=0xc0004ac7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004ac7e8 sp=0xc0004ac7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 20 gp=0xc0004a4e00 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0004acf38 sp=0xc0004acf18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004acfc8 sp=0xc0004acf38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004acfe0 sp=0xc0004acfc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004acfe8 sp=0xc0004acfe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 21 gp=0xc0004a4fc0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004ad738 sp=0xc0004ad718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004ad7c8 sp=0xc0004ad738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004ad7e0 sp=0xc0004ad7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004ad7e8 sp=0xc0004ad7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 22 gp=0xc0004a5180 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004adf38 sp=0xc0004adf18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004adfc8 sp=0xc0004adf38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004adfe0 sp=0xc0004adfc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004adfe8 sp=0xc0004adfe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 34 gp=0xc000104380 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0004a8738 sp=0xc0004a8718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004a87c8 sp=0xc0004a8738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004a87e0 sp=0xc0004a87c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a87e8 sp=0xc0004a87e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 35 gp=0xc000104540 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004a8f38 sp=0xc0004a8f18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004a8fc8 sp=0xc0004a8f38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004a8fe0 sp=0xc0004a8fc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a8fe8 sp=0xc0004a8fe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 50 gp=0xc000504000 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00050a7c8 sp=0xc00050a738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 23 gp=0xc0004a5340 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0004ae738 sp=0xc0004ae718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004ae7c8 sp=0xc0004ae738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004ae7e0 sp=0xc0004ae7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004ae7e8 sp=0xc0004ae7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 36 gp=0xc000104700 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004a9738 sp=0xc0004a9718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004a97c8 sp=0xc0004a9738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004a97e0 sp=0xc0004a97c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a97e8 sp=0xc0004a97e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 51 gp=0xc0005041c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00050af38 sp=0xc00050af18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00050afc8 sp=0xc00050af38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00050afe0 sp=0xc00050afc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 24 gp=0xc0004a5880 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0004aef38 sp=0xc0004aef18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004aefc8 sp=0xc0004aef38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004aefe0 sp=0xc0004aefc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004aefe8 sp=0xc0004aefe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 25 gp=0xc0004a5a40 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004af738 sp=0xc0004af718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004af7c8 sp=0xc0004af738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004af7e0 sp=0xc0004af7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004af7e8 sp=0xc0004af7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 26 gp=0xc0004a5c00 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004aff38 sp=0xc0004aff18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0004affc8 sp=0xc0004aff38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004affe0 sp=0xc0004affc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004affe8 sp=0xc0004affe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 27 gp=0xc0004a5dc0 m=nil [GC worker (idle)]: runtime.gopark(0x4475a548d97b2?, 0x1?, 0x68?, 0x3d?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc000506738 sp=0xc000506718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc0005067c8 sp=0xc000506738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0005067e0 sp=0xc0005067c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0005067e8 sp=0xc0005067e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 66 gp=0xc000584000 m=nil [GC worker (idle)]: runtime.gopark(0x4475abc1b0b85?, 0x1?, 0x8a?, 0xa9?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00058a738 sp=0xc00058a718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00058a7c8 sp=0xc00058a738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00058a7e0 sp=0xc00058a7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058a7e8 sp=0xc00058a7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 67 gp=0xc0005841c0 m=nil [GC worker (idle)]: runtime.gopark(0x64b1b24086e0?, 0x1?, 0x12?, 0xf5?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00058af38 sp=0xc00058af18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00058afc8 sp=0xc00058af38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00058afe0 sp=0xc00058afc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058afe8 sp=0xc00058afe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 68 gp=0xc000584380 m=nil [GC worker (idle)]: runtime.gopark(0x4475abc1ad767?, 0x1?, 0x1a?, 0x6a?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc00058b738 sp=0xc00058b718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00058b7c8 sp=0xc00058b738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00058b7e0 sp=0xc00058b7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058b7e8 sp=0xc00058b7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 69 gp=0xc000584540 m=nil [GC worker (idle)]: runtime.gopark(0x64b1b24086e0?, 0x1?, 0x3a?, 0xb4?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00058bf38 sp=0xc00058bf18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00058bfc8 sp=0xc00058bf38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00058bfe0 sp=0xc00058bfc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058bfe8 sp=0xc00058bfe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 70 gp=0xc000584700 m=nil [GC worker (idle)]: runtime.gopark(0x64b1b24086e0?, 0x1?, 0x3?, 0xc0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00058c738 sp=0xc00058c718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00058c7c8 sp=0xc00058c738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00058c7e0 sp=0xc00058c7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058c7e8 sp=0xc00058c7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 71 gp=0xc0005848c0 m=nil [GC worker (idle)]: runtime.gopark(0x64b1b24086e0?, 0x1?, 0xfa?, 0x21?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc00058cf38 sp=0xc00058cf18 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00058cfc8 sp=0xc00058cf38 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00058cfe0 sp=0xc00058cfc8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058cfe8 sp=0xc00058cfe0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 72 gp=0xc000584a80 m=nil [GC worker (idle)]: runtime.gopark(0x4475abc1ad6ad?, 0x1?, 0x1f?, 0x42?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00058d738 sp=0xc00058d718 pc=0x64b1b09041ce runtime.gcBgMarkWorker(0xc0000e5730) runtime/mgc.go:1412 +0xe9 fp=0xc00058d7c8 sp=0xc00058d738 pc=0x64b1b08b0869 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00058d7e0 sp=0xc00058d7c8 pc=0x64b1b08b0745 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058d7e8 sp=0xc00058d7e0 pc=0x64b1b090c5a1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 74 gp=0xc000104a80 m=nil [chan receive]: runtime.gopark(0x64b1b090a5b4?, 0xc000137898?, 0xd0?, 0x22?, 0xc000137880?) runtime/proc.go:424 +0xce fp=0xc000137860 sp=0xc000137840 pc=0x64b1b09041ce runtime.chanrecv(0xc0003ac070, 0xc000137a10, 0x1) runtime/chan.go:639 +0x41c fp=0xc0001378d8 sp=0xc000137860 pc=0x64b1b089dbfc runtime.chanrecv1(0xc00026c060?, 0xc00066c808?) runtime/chan.go:489 +0x12 fp=0xc000137900 sp=0xc0001378d8 pc=0x64b1b089d7b2 github.com/ollama/ollama/runner/llamarunner.(*Server).embeddings(0xc0001eb560, {0x64b1b1b63708, 0xc0006440e0}, 0xc0004c8140) github.com/ollama/ollama/runner/llamarunner/runner.go:783 +0x746 fp=0xc000137ac0 sp=0xc000137900 pc=0x64b1b0cd3c06 github.com/ollama/ollama/runner/llamarunner.(*Server).embeddings-fm({0x64b1b1b63708?, 0xc0006440e0?}, 0x64b1b0c3a6c7?) 
<autogenerated>:1 +0x36 fp=0xc000137af0 sp=0xc000137ac0 pc=0x64b1b0cd5ff6
net/http.HandlerFunc.ServeHTTP(0xc000645340?, {0x64b1b1b63708?, 0xc0006440e0?}, 0x0?)
	net/http/server.go:2220 +0x29 fp=0xc000137b18 sp=0xc000137af0 pc=0x64b1b0c2cee9
net/http.(*ServeMux).ServeHTTP(0x64b1b08a4a05?, {0x64b1b1b63708, 0xc0006440e0}, 0xc0004c8140)
	net/http/server.go:2747 +0x1ca fp=0xc000137b68 sp=0xc000137b18 pc=0x64b1b0c2edea
net/http.serverHandler.ServeHTTP({0x64b1b1b600d0?}, {0x64b1b1b63708?, 0xc0006440e0?}, 0x6?)
	net/http/server.go:3210 +0x8e fp=0xc000137b98 sp=0xc000137b68 pc=0x64b1b0c4c34e
net/http.(*conn).serve(0xc0005f4090, {0x64b1b1b658e8, 0xc0001fe8a0})
	net/http/server.go:2092 +0x5d0 fp=0xc000137fb8 sp=0xc000137b98 pc=0x64b1b0c2b890
net/http.(*Server).Serve.gowrap3()
	net/http/server.go:3360 +0x28 fp=0xc000137fe0 sp=0xc000137fb8 pc=0x64b1b0c30ce8
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000137fe8 sp=0xc000137fe0 pc=0x64b1b090c5a1
created by net/http.(*Server).Serve in goroutine 1
	net/http/server.go:3360 +0x485

goroutine 136 gp=0xc000504a80 m=nil [IO wait]:
runtime.gopark(0x64b1b08a8ee5?, 0x0?, 0xf8?, 0xb5?, 0xb?)
	runtime/proc.go:424 +0xce fp=0xc00050b5a8 sp=0xc00050b588 pc=0x64b1b09041ce
runtime.netpollblock(0x64b1b09276b8?, 0xb089afe6?, 0xb1?)
	runtime/netpoll.go:575 +0xf7 fp=0xc00050b5e0 sp=0xc00050b5a8 pc=0x64b1b08c7e37
internal/poll.runtime_pollWait(0x7f27fb610568, 0x72)
	runtime/netpoll.go:351 +0x85 fp=0xc00050b600 sp=0xc00050b5e0 pc=0x64b1b09034c5
internal/poll.(*pollDesc).wait(0xc0005f8000?, 0xc00026c521?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00050b628 sp=0xc00050b600 pc=0x64b1b098b707
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0005f8000, {0xc00026c521, 0x1, 0x1})
	internal/poll/fd_unix.go:165 +0x27a fp=0xc00050b6c0 sp=0xc00050b628 pc=0x64b1b098c9fa
net.(*netFD).Read(0xc0005f8000, {0xc00026c521?, 0xc00050b748?, 0x64b1b0905e50?})
	net/fd_posix.go:55 +0x25 fp=0xc00050b708 sp=0xc00050b6c0 pc=0x64b1b09f7c05
net.(*conn).Read(0xc0000ae040, {0xc00026c521?, 0x0?, 0x64b1b2406480?})
	net/net.go:189 +0x45 fp=0xc00050b750 sp=0xc00050b708 pc=0x64b1b0a06205
net.(*TCPConn).Read(0xc00026c510?, {0xc00026c521?, 0x0?, 0x0?})
	<autogenerated>:1 +0x25 fp=0xc00050b780 sp=0xc00050b750 pc=0x64b1b0a19405
net/http.(*connReader).backgroundRead(0xc00026c510)
	net/http/server.go:690 +0x37 fp=0xc00050b7c8 sp=0xc00050b780 pc=0x64b1b0c26217
net/http.(*connReader).startBackgroundRead.gowrap2()
	net/http/server.go:686 +0x25 fp=0xc00050b7e0 sp=0xc00050b7c8 pc=0x64b1b0c26145
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00050b7e8 sp=0xc00050b7e0 pc=0x64b1b090c5a1
created by net/http.(*connReader).startBackgroundRead in goroutine 74
	net/http/server.go:686 +0xb6

rax    0x206a03fb4
rbx    0x7f2778170400
rcx    0xfed
rdx    0x7f2778008820
rdi    0x7f2778008830
rsi    0x0
rbp    0x7ffcfc5e3ea0
rsp    0x7ffcfc5e3e80
r8     0x0
r9     0x7f27b382c430
r10    0x0
r11    0x246
r12    0x7f26a4001360
r13    0x7f2778008830
r14    0x0
r15    0x64b1c6690f70
rip    0x7f27a8e24c47
rflags 0x10297
cs     0x33
fs     0x0
gs     0x0

SIGABRT: abort
PC=0x7f27fb82200b m=0 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 29 gp=0xc000585dc0 m=0 mp=0x64b1b235c780 [syscall]:
runtime.cgocall(0x64b1b1512ce0, 0xc0000bfba0)
	runtime/cgocall.go:167 +0x4b fp=0xc0000bfb78 sp=0xc0000bfb40 pc=0x64b1b08fdacb
github.com/ollama/ollama/llama._Cfunc_llama_decode(0x7f2778ad5c10, {0x2, 0x7f2779219860, 0x0, 0x0, 0x7f277921a070, 0x7f277921a880, 0x7f277921b090, 0x7f27790752d0})
	_cgo_gotypes.go:545 +0x4f fp=0xc0000bfba0 sp=0xc0000bfb78 pc=0x64b1b0cb356f
github.com/ollama/ollama/llama.(*Context).Decode.func1(0x64b1b0cd248b?, 0x7f2778ad5c10?)
	github.com/ollama/ollama/llama/llama.go:163 +0xf5 fp=0xc0000bfc90 sp=0xc0000bfba0 pc=0x64b1b0cb6295
github.com/ollama/ollama/llama.(*Context).Decode(0xc0002fe0e0?, 0x0?)
	github.com/ollama/ollama/llama/llama.go:163 +0x13 fp=0xc0000bfcd8 sp=0xc0000bfc90 pc=0x64b1b0cb6113
github.com/ollama/ollama/runner/llamarunner.(*Server).processBatch(0xc0001eb560, 0xc0005f0000, 0xc0000bff20)
	github.com/ollama/ollama/runner/llamarunner/runner.go:435 +0x23f fp=0xc0000bfee0 sp=0xc0000bfcd8 pc=0x64b1b0cd127f
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc0001eb560, {0x64b1b1b65920, 0xc000511130})
	github.com/ollama/ollama/runner/llamarunner/runner.go:343 +0x1d5 fp=0xc0000bffb8 sp=0xc0000bfee0 pc=0x64b1b0cd0cb5
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap2()
	github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0x28 fp=0xc0000bffe0 sp=0xc0000bffb8 pc=0x64b1b0cd5b48
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bffe8 sp=0xc0000bffe0 pc=0x64b1b090c5a1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
	github.com/ollama/ollama/runner/llamarunner/runner.go:973 +0xdb5

etc etc etc.
```

### OS

Docker

### GPU

Nvidia

### CPU

Intel

### Ollama version

0.5.12
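For anyone trying to reproduce this, here is a minimal sketch of the workload described in the report (many concurrent embedding requests round-robined across several models). It assumes a local Ollama on the default port 11434 and uses the `/api/embed` endpoint; the model list and request volume are illustrative, not a confirmed reproducer:

```python
# Sketch of the mixed embedding workload from this report.
# Assumes Ollama listening on localhost:11434; /api/embed is the
# embeddings endpoint. Illustrative only, not a confirmed reproducer.
import concurrent.futures
import json
import urllib.request

OLLAMA = "http://localhost:11434"
MODELS = ["all-minilm:33m", "bge-m3", "bge-large", "snowflake-arctic-embed"]


def build_embed_request(model: str, text: str) -> bytes:
    """Build the JSON body for POST /api/embed."""
    return json.dumps({"model": model, "input": text}).encode()


def embed(model: str, text: str) -> dict:
    """Send one embedding request and return the decoded response."""
    req = urllib.request.Request(
        f"{OLLAMA}/api/embed",
        data=build_embed_request(model, text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_workload(n: int = 200, workers: int = 8) -> None:
    """Fire n requests concurrently, cycling through MODELS like the
    reporter's pipeline. Raises if the server hangs or crashes."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futs = [
            pool.submit(embed, MODELS[i % len(MODELS)], f"doc {i}")
            for i in range(n)
        ]
        for f in concurrent.futures.as_completed(futs):
            f.result()

# To exercise a live server: run_workload()
```

Whether interleaving models like this is required to trigger the segfault, or a single model suffices, is exactly what the logs above leave unclear.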
GiteaMirror added the bug label 2026-05-04 12:40:09 -05:00
@iganev commented on GitHub (Feb 26, 2025):

Similar story with `snowflake-arctic-embed-l` but on a different line:

```
//ml/backend/ggml/ggml/src/ggml-cpu/ggml-cpu.c:8374: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
//ml/backend/ggml/ggml/src/ggml-cpu/ggml-cpu.c:8374: GGML_ASSERT(i01 >= 0 && i01 < ne01) failed
```

Although, looking at my pipeline, that could have been `all-minilm:33m`; thing is, the next event after this segfault is the reloading of snowflake, so..
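Since the log makes it ambiguous which model actually hit the assert, one way to isolate it is to probe each embedding model serially, one request at a time, so the server log shows exactly one model per crash. A hedged sketch (assumes a local Ollama on the default port and the `/api/embed` endpoint; `classify_result` and `probe` are hypothetical helper names):

```python
# Hypothetical isolation step: exercise each embedding model one at a
# time so the server log unambiguously shows which model segfaults.
import json
import urllib.request


def classify_result(model: str, ok: bool) -> str:
    """Label one probe result for the summary output."""
    return f"{model}: {'ok' if ok else 'FAILED'}"


def probe(model: str, base: str = "http://localhost:11434") -> bool:
    """Send a single embedding request; False if it errors or times out."""
    body = json.dumps({"model": model, "input": "probe text"}).encode()
    req = urllib.request.Request(
        f"{base}/api/embed",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            json.load(resp)
        return True
    except Exception:
        return False


def probe_all(models: list[str]) -> list[str]:
    """Probe the suspect models serially and return one line each."""
    return [classify_result(m, probe(m)) for m in models]

# Against a live server:
# print("\n".join(probe_all(
#     ["all-minilm:33m", "snowflake-arctic-embed-l", "bge-large"])))
```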
@rick-github commented on GitHub (Feb 26, 2025):

https://github.com/ollama/ollama/issues/7288#issuecomment-2591709109

Reference: github-starred/ollama#68162