[GH-ISSUE #9289] Sig11 with mlx backend #31817

Closed
opened 2026-04-22 12:35:15 -05:00 by GiteaMirror · 1 comment

Originally created by @pontscho on GitHub (Feb 22, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/9289

What is the issue?

Hi,

I started testing the new mlx backend in 0.5.12-rc1, and the runner crashes when I try to load a 4-bit quantized model with Ollama.

Steps for reproduction:

  1. OLLAMA_FLASH_ATTENTION=1 OLLAMA_NEW_ENGINE=1 OLLAMA_BACKEND=mlx ollama serve
  2. ollama run mistral-small:latest
  3. Send the prompt 'Hello?'
  4. The runner crashes

I am using Ollama on Ubuntu 22 with an E5-2680 and an RTX 3090. The log is attached. The same thing happens whether I set the backend to mlx or ggml.

By the way, thank you for doing such a great job with Ollama. I really like it, and I will be very grateful if Ollama can reach the speed of LM Studio on the same machine with mlx!

Best,
pontscho

Relevant log output

pontscho@parastable:/mnt/nvme/ollama-src/ollama$ OLLAMA_FLASH_ATTENTION=1 OLLAMA_NEW_ENGINE=1 OLLAMA_BACKEND=mlx ollama serve
2025/02/22 12:41:17 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:true OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/pontscho/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:true OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-02-22T12:41:17.527+01:00 level=INFO source=images.go:432 msg="total blobs: 19"
time=2025-02-22T12:41:17.527+01:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-02-22T12:41:17.528+01:00 level=INFO source=routes.go:1238 msg="Listening on 127.0.0.1:11434 (version 0.5.12-rc1)"
time=2025-02-22T12:41:17.528+01:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-02-22T12:41:17.873+01:00 level=INFO source=types.go:130 msg="inference compute" id=GPU-b7919a97-db72-a206-738f-d03ac7379a54 library=cuda variant=v12 compute=8.6 driver=12.6 name="NVIDIA GeForce RTX 3090" total="23.6 GiB" available="23.0 GiB"
[GIN] 2025/02/22 - 12:41:22 | 200 |      83.283µs |       127.0.0.1 | HEAD     "/"
[GIN] 2025/02/22 - 12:41:22 | 200 |   34.354062ms |       127.0.0.1 | POST     "/api/show"
time=2025-02-22T12:41:23.585+01:00 level=INFO source=sched.go:715 msg="new model will fit in available VRAM in single GPU, loading" model=/home/pontscho/.ollama/models/blobs/sha256-102a747c137683e81d431dab05d8f2158df4ab6f162f8f9019425a43d51e0e9f gpu=GPU-b7919a97-db72-a206-738f-d03ac7379a54 parallel=4 available=24697241600 required="15.6 GiB"
time=2025-02-22T12:41:23.791+01:00 level=INFO source=server.go:97 msg="system memory" total="62.6 GiB" free="51.3 GiB" free_swap="554.2 MiB"
time=2025-02-22T12:41:23.981+01:00 level=INFO source=server.go:130 msg=offload library=cuda layers.requested=-1 layers.model=41 layers.offload=41 layers.split="" memory.available="[23.0 GiB]" memory.gpu_overhead="0 B" memory.required.full="15.6 GiB" memory.required.partial="15.6 GiB" memory.required.kv="1.2 GiB" memory.required.allocations="[15.6 GiB]" memory.weights.total="13.7 GiB" memory.weights.repeating="13.2 GiB" memory.weights.nonrepeating="525.0 MiB" memory.graph.full="568.0 MiB" memory.graph.partial="801.0 MiB"
time=2025-02-22T12:41:23.981+01:00 level=INFO source=server.go:182 msg="enabling flash attention"
time=2025-02-22T12:41:23.981+01:00 level=WARN source=server.go:190 msg="kv cache type not supported by model" type=""
time=2025-02-22T12:41:23.982+01:00 level=INFO source=server.go:380 msg="starting llama server" cmd="/usr/local/bin/ollama runner --ollama-engine --model /home/pontscho/.ollama/models/blobs/sha256-102a747c137683e81d431dab05d8f2158df4ab6f162f8f9019425a43d51e0e9f --ctx-size 8192 --batch-size 512 --n-gpu-layers 41 --threads 14 --flash-attn --parallel 4 --port 46447"
time=2025-02-22T12:41:23.982+01:00 level=INFO source=sched.go:450 msg="loaded runners" count=1
time=2025-02-22T12:41:23.982+01:00 level=INFO source=server.go:557 msg="waiting for llama runner to start responding"
time=2025-02-22T12:41:23.982+01:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server error"
time=2025-02-22T12:41:24.021+01:00 level=INFO source=runner.go:885 msg="starting ollama engine"
time=2025-02-22T12:41:24.021+01:00 level=INFO source=runner.go:938 msg="Server listening on 127.0.0.1:46447"
time=2025-02-22T12:41:24.127+01:00 level=WARN source=ggml.go:132 msg="key not found" key=general.description default=""
time=2025-02-22T12:41:24.127+01:00 level=INFO source=ggml.go:93 msg="" architecture=llama file_type=Q4_K_M name="Mistral Small 24B Instruct 2501" description="" num_tensors=363 num_key_values=41
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
load_backend: loaded CUDA backend from /usr/local/lib/ollama/cuda_v12/libggml-cuda.so
load_backend: loaded CPU backend from /usr/local/lib/ollama/libggml-cpu-haswell.so
time=2025-02-22T12:41:24.233+01:00 level=INFO source=ggml.go:108 msg=cpu device.name=CPU device.description="Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz" device.kind=cpu device.free="0 B" device.total="0 B"
time=2025-02-22T12:41:24.233+01:00 level=INFO source=ggml.go:108 msg=cpu device.name=CPU device.description="Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz" device.kind=cpu device.free="0 B" device.total="0 B"
time=2025-02-22T12:41:24.234+01:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server loading model"
time=2025-02-22T12:41:24.233+01:00 level=INFO source=ggml.go:117 msg=gpu device.name=CUDA0 device.description="NVIDIA GeForce RTX 3090" device.kind=gpu device.free="23.0 GiB" device.total="23.6 GiB"
time=2025-02-22T12:41:24.685+01:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server not responding"
time=2025-02-22T12:41:24.939+01:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server loading model"
time=2025-02-22T12:41:25.642+01:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server not responding"
time=2025-02-22T12:41:25.893+01:00 level=INFO source=server.go:591 msg="waiting for server to become available" status="llm server loading model"
time=2025-02-22T12:41:28.073+01:00 level=WARN source=ggml.go:132 msg="key not found" key=tokenizer.ggml.pretokenizer default="(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\\r\\n\\p{L}\\p{N}]?\\p{L}+|\\p{N}{1,3}| ?[^\\s\\p{L}\\p{N}]+[\\r\\n]*|\\s*[\\r\\n]+|\\s+(?!\\S)|\\s+"
time=2025-02-22T12:41:28.077+01:00 level=WARN source=ggml.go:132 msg="key not found" key=llama.rope.freq_scale default=1
time=2025-02-22T12:41:28.080+01:00 level=INFO source=runner.go:816 msg=system info="CPU : LLAMAFILE = 1 | CUDA : ARCHS = 600,610,620,700,720,750,800,860,870,890,900 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | cgo(gcc)"
time=2025-02-22T12:41:28.153+01:00 level=INFO source=server.go:596 msg="llama runner started in 4.17 seconds"
[GIN] 2025/02/22 - 12:41:28 | 200 |  5.159199725s |       127.0.0.1 | POST     "/api/generate"
llama_model_loader: loaded meta data with 40 key-value pairs and 363 tensors from /home/pontscho/.ollama/models/blobs/sha256-102a747c137683e81d431dab05d8f2158df4ab6f162f8f9019425a43d51e0e9f (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Mistral Small 24B Instruct 2501
llama_model_loader: - kv   3:                            general.version str              = 2501
llama_model_loader: - kv   4:                           general.finetune str              = Instruct
llama_model_loader: - kv   5:                           general.basename str              = Mistral-Small
llama_model_loader: - kv   6:                         general.size_label str              = 24B
llama_model_loader: - kv   7:                            general.license str              = apache-2.0
llama_model_loader: - kv   8:                   general.base_model.count u32              = 1
llama_model_loader: - kv   9:                  general.base_model.0.name str              = Mistral Small Base 2501
llama_model_loader: - kv  10:               general.base_model.0.version str              = 2501
llama_model_loader: - kv  11:          general.base_model.0.organization str              = Mistralai
llama_model_loader: - kv  12:              general.base_model.0.repo_url str              = https://huggingface.co/mistralai/Mist...
llama_model_loader: - kv  13:                          general.languages arr[str,10]      = ["en", "fr", "de", "es", "it", "pt", ...
llama_model_loader: - kv  14:                          llama.block_count u32              = 40
llama_model_loader: - kv  15:                       llama.context_length u32              = 32768
llama_model_loader: - kv  16:                     llama.embedding_length u32              = 5120
llama_model_loader: - kv  17:                  llama.feed_forward_length u32              = 32768
llama_model_loader: - kv  18:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv  19:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  20:                       llama.rope.freq_base f32              = 100000000.000000
llama_model_loader: - kv  21:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  22:                 llama.attention.key_length u32              = 128
llama_model_loader: - kv  23:               llama.attention.value_length u32              = 128
llama_model_loader: - kv  24:                          general.file_type u32              = 15
llama_model_loader: - kv  25:                           llama.vocab_size u32              = 131072
llama_model_loader: - kv  26:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  27:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  28:                         tokenizer.ggml.pre str              = qwen2
llama_model_loader: - kv  29:                      tokenizer.ggml.tokens arr[str,131072]  = ["<unk>", "<s>", "</s>", "[INST]", "[...
llama_model_loader: - kv  30:                  tokenizer.ggml.token_type arr[i32,131072]  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
llama_model_loader: - kv  31:                      tokenizer.ggml.merges arr[str,269443]  = ["Ġ Ġ", "Ġ t", "e r", "i n", "Ġ �...
llama_model_loader: - kv  32:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  33:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  34:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  35:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  36:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  37:                    tokenizer.chat_template str              = {{ bos_token }}{% for message in mess...
llama_model_loader: - kv  38:            tokenizer.ggml.add_space_prefix bool             = false
llama_model_loader: - kv  39:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   81 tensors
llama_model_loader: - type q4_K:  241 tensors
llama_model_loader: - type q6_K:   41 tensors
llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
llm_load_vocab: special tokens cache size = 1000
llm_load_vocab: token to piece cache size = 0.8498 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 131072
llm_load_print_meta: n_merges         = 269443
llm_load_print_meta: vocab_only       = 1
llm_load_print_meta: model type       = ?B
llm_load_print_meta: model ftype      = all F32
llm_load_print_meta: model params     = 23.57 B
llm_load_print_meta: model size       = 13.34 GiB (4.86 BPW) 
llm_load_print_meta: general.name     = Mistral Small 24B Instruct 2501
llm_load_print_meta: BOS token        = 1 '<s>'
llm_load_print_meta: EOS token        = 2 '</s>'
llm_load_print_meta: UNK token        = 0 '<unk>'
llm_load_print_meta: LF token         = 1196 'Ä'
llm_load_print_meta: EOG token        = 2 '</s>'
llm_load_print_meta: max token length = 150
llama_model_load: vocab only - skipping tensors
ggml.c:3080: GGML_ASSERT(ggml_nelements(a) == ne0*ne1*ne2) failed
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Inappropriate ioctl for device.
No stack.
The program is not being run.
SIGABRT: abort
PC=0x7f9e660cf9fc m=131 sigcode=18446744073709551610
signal arrived during cgo execution

goroutine 47 gp=0xc0004ad880 m=131 mp=0xc015fa4308 [syscall]:
runtime.cgocall(0x556e46201710, 0xc001431960)
        runtime/cgocall.go:167 +0x4b fp=0xc001431938 sp=0xc001431900 pc=0x556e4543eaab
github.com/ollama/ollama/ml/backend/ggml._Cfunc_ggml_reshape_3d(0x7f97dc00ba60, 0x7f9d9033e900, 0xa0, 0x20, 0xa2)
        _cgo_gotypes.go:937 +0x50 fp=0xc001431960 sp=0xc001431938 pc=0x556e45859690
github.com/ollama/ollama/ml/backend/ggml.(*Tensor).Reshape.func3({0x556e466a9c50?, 0xc0013b27b0?}, 0xc001431a08?, {0xc016059e78?, 0x18?, 0x556e46530ba0?})
        github.com/ollama/ollama/ml/backend/ggml/ggml.go:527 +0xe5 fp=0xc0014319d0 sp=0xc001431960 pc=0x556e45861c85
github.com/ollama/ollama/ml/backend/ggml.(*Tensor).Reshape(0xc0004cc2e0?, {0x556e466a9c50?, 0xc0013b27b0?}, {0xc016059e78?, 0xc001999230?, 0xc001999230?})
        github.com/ollama/ollama/ml/backend/ggml/ggml.go:527 +0x157 fp=0xc001431a18 sp=0xc0014319d0 pc=0x556e45861937
github.com/ollama/ollama/model/models/llama.(*SelfAttention).Forward(0xc0004cc2c0, {0x556e466a9c50, 0xc0013b27b0}, {0x556e466b45c0, 0xc001999230}, {0x556e466b45c0, 0xc0019991f0}, {0x556e466aa888, 0xc000000300}, 0xc016014700)
        github.com/ollama/ollama/model/models/llama/model.go:72 +0x14b fp=0xc001431ad0 sp=0xc001431a18 pc=0x556e459eeccb
github.com/ollama/ollama/model/models/llama.(*Layer).Forward(0xc001431bc0, {0x556e466a9c50, 0xc0013b27b0}, {0x556e466b45c0, 0xc001999200}, {0x556e466b45c0, 0xc0019991f0}, {0x556e466aa888, 0xc000000300}, 0xc016014700)
        github.com/ollama/ollama/model/models/llama/model.go:127 +0xd3 fp=0xc001431b40 sp=0xc001431ad0 pc=0x556e459ef613
github.com/ollama/ollama/model/models/llama.(*Model).Forward(0xc000444070, {0x556e466a9c50, 0xc0013b27b0}, {{0xc03a066400, 0xa2, 0x100}, {0xc03a066800, 0xa2, 0x100}, {0xc016022800, ...}, ...})
        github.com/ollama/ollama/model/models/llama/model.go:151 +0x245 fp=0xc001431c08 sp=0xc001431b40 pc=0x556e459ef985
github.com/ollama/ollama/model.Forward({0x556e466a9c50, 0xc0013b27b0}, {0x556e466a3fa0, 0xc000444070}, {{0xc03a066400, 0xa2, 0x100}, {0xc03a066800, 0xa2, 0x100}, ...})
        github.com/ollama/ollama/model/model.go:246 +0x12f fp=0xc001431ce0 sp=0xc001431c08 pc=0x556e4588bf6f
github.com/ollama/ollama/runner/ollamarunner.(*Server).processBatch(0xc000157e00)
        github.com/ollama/ollama/runner/ollamarunner/runner.go:390 +0x41e fp=0xc001431f98 sp=0xc001431ce0 pc=0x556e45a4069e
github.com/ollama/ollama/runner/ollamarunner.(*Server).run(0xc000157e00, {0x556e466a52a0, 0xc000197ef0})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:297 +0x4e fp=0xc001431fb8 sp=0xc001431f98 pc=0x556e45a4022e
github.com/ollama/ollama/runner/ollamarunner.Execute.gowrap2()
        github.com/ollama/ollama/runner/ollamarunner/runner.go:918 +0x28 fp=0xc001431fe0 sp=0xc001431fb8 pc=0x556e45a44c68
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc001431fe8 sp=0xc001431fe0 pc=0x556e4544d581
created by github.com/ollama/ollama/runner/ollamarunner.Execute in goroutine 1
        github.com/ollama/ollama/runner/ollamarunner/runner.go:918 +0x8ac
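
A possible reading of the failed assertion, offered as a sketch rather than a confirmed diagnosis: the crashing goroutine calls ggml_reshape_3d with dimensions 0xa0, 0x20, 0xa2 (160, 32, 162). The value 160 equals llama.embedding_length / llama.attention.head_count (5120 / 32), while the model metadata reports llama.attention.key_length = 128. The Go snippet below only redoes that arithmetic. It assumes the query projection produces head_count * key_length values per token, which the log does not state, and the helper reshapeOK is a hypothetical stand-in for the GGML_ASSERT condition, not an Ollama or ggml function.

```go
package main

import "fmt"

// Values copied from the log above: GGUF metadata and the third argument
// of the failing ggml_reshape_3d call.
const (
	embeddingLength = 5120 // llama.embedding_length
	headCount       = 32   // llama.attention.head_count
	keyLength       = 128  // llama.attention.key_length
	batchTokens     = 0xa2 // third reshape dimension in the trace (162)
)

// reshapeOK is a hypothetical helper mirroring the asserted condition
// ggml_nelements(a) == ne0*ne1*ne2.
func reshapeOK(nelements, ne0, ne1, ne2 int) bool {
	return nelements == ne0*ne1*ne2
}

func main() {
	// Assumption: the query tensor holds headCount*keyLength values per token.
	qElements := headCount * keyLength * batchTokens

	// Head size derived from the embedding width; matches 0xa0 in the trace.
	derivedHeadDim := embeddingLength / headCount

	fmt.Println(derivedHeadDim, reshapeOK(qElements, derivedHeadDim, headCount, batchTokens))
	fmt.Println(keyLength, reshapeOK(qElements, keyLength, headCount, batchTokens))
}
```

Under these assumptions a 160-wide reshape asks for 829440 elements while the tensor would hold 663552, which is consistent with the assertion firing; a 128-wide reshape would match.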

goroutine 1 gp=0xc0000061c0 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004bf710 sp=0xc0004bf6f0 pc=0x556e454451ae
runtime.netpollblock(0xc033d3bf80?, 0x453dbfc6?, 0x6e?)
        runtime/netpoll.go:575 +0xf7 fp=0xc0004bf748 sp=0xc0004bf710 pc=0x556e45408e17
internal/poll.runtime_pollWait(0x7f9e1ed79680, 0x72)
        runtime/netpoll.go:351 +0x85 fp=0xc0004bf768 sp=0xc0004bf748 pc=0x556e454444a5
internal/poll.(*pollDesc).wait(0xc000157e80?, 0x900000036?, 0x0)
        internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0004bf790 sp=0xc0004bf768 pc=0x556e454cc5c7
internal/poll.(*pollDesc).waitRead(...)
        internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc000157e80)
        internal/poll/fd_unix.go:620 +0x295 fp=0xc0004bf838 sp=0xc0004bf790 pc=0x556e454d1995
net.(*netFD).accept(0xc000157e80)
        net/fd_unix.go:172 +0x29 fp=0xc0004bf8f0 sp=0xc0004bf838 pc=0x556e4553aa89
net.(*TCPListener).accept(0xc0002c4dc0)
        net/tcpsock_posix.go:159 +0x1e fp=0xc0004bf940 sp=0xc0004bf8f0 pc=0x556e455506fe
net.(*TCPListener).Accept(0xc0002c4dc0)
        net/tcpsock.go:372 +0x30 fp=0xc0004bf970 sp=0xc0004bf940 pc=0x556e4554f5b0
net/http.(*onceCloseListener).Accept(0xc033d1e090?)
        <autogenerated>:1 +0x24 fp=0xc0004bf988 sp=0xc0004bf970 pc=0x556e45799824
net/http.(*Server).Serve(0xc0001451d0, {0x556e466a2e78, 0xc0002c4dc0})
        net/http/server.go:3330 +0x30c fp=0xc0004bfab8 sp=0xc0004bf988 pc=0x556e457717ac
github.com/ollama/ollama/runner/ollamarunner.Execute({0xc000036150, 0xf, 0xf})
        github.com/ollama/ollama/runner/ollamarunner/runner.go:939 +0xc67 fp=0xc0004bfd08 sp=0xc0004bfab8 pc=0x556e45a44a07
github.com/ollama/ollama/runner.Execute({0xc000036130?, 0x0?, 0x0?})
        github.com/ollama/ollama/runner/runner.go:20 +0xc9 fp=0xc0004bfd30 sp=0xc0004bfd08 pc=0x556e45a454c9
github.com/ollama/ollama/cmd.NewCLI.func2(0xc00020d500?, {0x556e46240050?, 0x4?, 0x556e46240054?})
        github.com/ollama/ollama/cmd/cmd.go:1280 +0x45 fp=0xc0004bfd58 sp=0xc0004bfd30 pc=0x556e46051b65
github.com/spf13/cobra.(*Command).execute(0xc00017fb08, {0xc00020d700, 0x10, 0x10})
        github.com/spf13/cobra@v1.7.0/command.go:940 +0x862 fp=0xc0004bfe78 sp=0xc0004bfd58 pc=0x556e455b37c2
github.com/spf13/cobra.(*Command).ExecuteC(0xc00049db08)
        github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc0004bff30 sp=0xc0004bfe78 pc=0x556e455b4005
github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
        github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
        github.com/ollama/ollama/main.go:12 +0x4d fp=0xc0004bff50 sp=0xc0004bff30 pc=0x556e46051eed
runtime.main()
        runtime/proc.go:272 +0x29d fp=0xc0004bffe0 sp=0xc0004bff50 pc=0x556e454104bd
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004bffe8 sp=0xc0004bffe0 pc=0x556e4544d581

goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a0fa8 sp=0xc0000a0f88 pc=0x556e454451ae
runtime.goparkunlock(...)
        runtime/proc.go:430
runtime.forcegchelper()
        runtime/proc.go:337 +0xb8 fp=0xc0000a0fe0 sp=0xc0000a0fa8 pc=0x556e454107f8
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a0fe8 sp=0xc0000a0fe0 pc=0x556e4544d581
created by runtime.init.7 in goroutine 1
        runtime/proc.go:325 +0x1a

goroutine 3 gp=0xc000007180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a1780 sp=0xc0000a1760 pc=0x556e454451ae
runtime.goparkunlock(...)
        runtime/proc.go:430
runtime.bgsweep(0xc000044080)
        runtime/mgcsweep.go:317 +0xdf fp=0xc0000a17c8 sp=0xc0000a1780 pc=0x556e453fae9f
runtime.gcenable.gowrap1()
        runtime/mgc.go:204 +0x25 fp=0xc0000a17e0 sp=0xc0000a17c8 pc=0x556e453ef4e5
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a17e8 sp=0xc0000a17e0 pc=0x556e4544d581
created by runtime.gcenable in goroutine 1
        runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x556e463f3258?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a1f78 sp=0xc0000a1f58 pc=0x556e454451ae
runtime.goparkunlock(...)
        runtime/proc.go:430
runtime.(*scavengerState).park(0x556e46e99060)
        runtime/mgcscavenge.go:425 +0x49 fp=0xc0000a1fa8 sp=0xc0000a1f78 pc=0x556e453f8869
runtime.bgscavenge(0xc000044080)
        runtime/mgcscavenge.go:658 +0x59 fp=0xc0000a1fc8 sp=0xc0000a1fa8 pc=0x556e453f8df9
runtime.gcenable.gowrap2()
        runtime/mgc.go:205 +0x25 fp=0xc0000a1fe0 sp=0xc0000a1fc8 pc=0x556e453ef485
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a1fe8 sp=0xc0000a1fe0 pc=0x556e4544d581
created by runtime.gcenable in goroutine 1
        runtime/mgc.go:205 +0xa5

goroutine 5 gp=0xc000007c00 m=nil [finalizer wait]:
runtime.gopark(0xc0000a0648?, 0x556e453e59e5?, 0xb0?, 0x1?, 0xc0000061c0?)
        runtime/proc.go:424 +0xce fp=0xc0000a0620 sp=0xc0000a0600 pc=0x556e454451ae
runtime.runfinq()
        runtime/mfinal.go:193 +0x107 fp=0xc0000a07e0 sp=0xc0000a0620 pc=0x556e453ee567
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a07e8 sp=0xc0000a07e0 pc=0x556e4544d581
created by runtime.createfing in goroutine 1
        runtime/mfinal.go:163 +0x3d

goroutine 6 gp=0xc0001f9500 m=nil [chan receive]:
runtime.gopark(0xc0000a2760?, 0x556e45522105?, 0x60?, 0x49?, 0x556e466b7ae0?)
        runtime/proc.go:424 +0xce fp=0xc0000a2718 sp=0xc0000a26f8 pc=0x556e454451ae
runtime.chanrecv(0xc000046380, 0x0, 0x1)
        runtime/chan.go:639 +0x41c fp=0xc0000a2790 sp=0xc0000a2718 pc=0x556e453debdc
runtime.chanrecv1(0x0?, 0x0?)
        runtime/chan.go:489 +0x12 fp=0xc0000a27b8 sp=0xc0000a2790 pc=0x556e453de792
runtime.unique_runtime_registerUniqueMapCleanup.func1(...)
        runtime/mgc.go:1781
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
        runtime/mgc.go:1784 +0x2f fp=0xc0000a27e0 sp=0xc0000a27b8 pc=0x556e453f254f
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a27e8 sp=0xc0000a27e0 pc=0x556e4544d581
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
        runtime/mgc.go:1779 +0x96

goroutine 7 gp=0xc0001f9880 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a2f38 sp=0xc0000a2f18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000a2fc8 sp=0xc0000a2f38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000a2fe0 sp=0xc0000a2fc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a2fe8 sp=0xc0000a2fe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 18 gp=0xc000504000 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea7a39?, 0x3?, 0x24?, 0xdb?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00009c738 sp=0xc00009c718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00009c7c8 sp=0xc00009c738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00009c7e0 sp=0xc00009c7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00009c7e8 sp=0xc00009c7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 19 gp=0xc0005041c0 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x3?, 0xa1?, 0x23?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00009cf38 sp=0xc00009cf18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00009cfc8 sp=0xc00009cf38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00009cfe0 sp=0xc00009cfc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00009cfe8 sp=0xc00009cfe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 34 gp=0xc000104380 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4e691f4?, 0x3?, 0x9b?, 0x67?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011c738 sp=0xc00011c718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011c7c8 sp=0xc00011c738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011c7e0 sp=0xc00011c7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011c7e8 sp=0xc00011c7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 8 gp=0xc0001f9a40 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x3?, 0x3e?, 0x3b?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a3738 sp=0xc0000a3718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000a37c8 sp=0xc0000a3738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000a37e0 sp=0xc0000a37c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a37e8 sp=0xc0000a37e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 9 gp=0xc0001f9c00 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4e5b149?, 0x3?, 0x3d?, 0x56?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0000a3f38 sp=0xc0000a3f18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0000a3fc8 sp=0xc0000a3f38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0000a3fe0 sp=0xc0000a3fc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a3fe8 sp=0xc0000a3fe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 35 gp=0xc000104540 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea7e31?, 0x3?, 0xd4?, 0x36?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011cf38 sp=0xc00011cf18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011cfc8 sp=0xc00011cf38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011cfe0 sp=0xc00011cfc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011cfe8 sp=0xc00011cfe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 36 gp=0xc000104700 m=nil [GC worker (idle)]:
runtime.gopark(0x3437be9cb6d00?, 0x1?, 0x3d?, 0xf0?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011d738 sp=0xc00011d718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011d7c8 sp=0xc00011d738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011d7e0 sp=0xc00011d7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011d7e8 sp=0xc00011d7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 37 gp=0xc0001048c0 m=nil [GC worker (idle)]:
runtime.gopark(0x3437be9d1eb9b?, 0x1?, 0x19?, 0x8?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011df38 sp=0xc00011df18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011dfc8 sp=0xc00011df38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011dfe0 sp=0xc00011dfc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011dfe8 sp=0xc00011dfe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 38 gp=0xc000104a80 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea7c30?, 0x3?, 0x2?, 0x58?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011e738 sp=0xc00011e718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011e7c8 sp=0xc00011e738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011e7e0 sp=0xc00011e7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011e7e8 sp=0xc00011e7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 39 gp=0xc000104c40 m=nil [GC worker (idle)]:
runtime.gopark(0x3437be9d1f70d?, 0x1?, 0x6c?, 0x8a?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011ef38 sp=0xc00011ef18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011efc8 sp=0xc00011ef38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011efe0 sp=0xc00011efc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011efe8 sp=0xc00011efe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 40 gp=0xc000104e00 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0xbf?, 0x8d?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011f738 sp=0xc00011f718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011f7c8 sp=0xc00011f738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011f7e0 sp=0xc00011f7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011f7e8 sp=0xc00011f7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 41 gp=0xc000104fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0x81?, 0xca?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00011ff38 sp=0xc00011ff18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00011ffc8 sp=0xc00011ff38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00011ffe0 sp=0xc00011ffc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00011ffe8 sp=0xc00011ffe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 42 gp=0xc000105180 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0xc8?, 0x54?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc000118738 sp=0xc000118718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0001187c8 sp=0xc000118738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0001187e0 sp=0xc0001187c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0001187e8 sp=0xc0001187e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 43 gp=0xc000105340 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0xf?, 0x27?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc000118f38 sp=0xc000118f18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc000118fc8 sp=0xc000118f38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc000118fe0 sp=0xc000118fc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000118fe8 sp=0xc000118fe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 44 gp=0xc000105500 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0x22?, 0x4d?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc000119738 sp=0xc000119718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0001197c8 sp=0xc000119738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0001197e0 sp=0xc0001197c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0001197e8 sp=0xc0001197e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 45 gp=0xc0001056c0 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0x2b?, 0xdb?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc000119f38 sp=0xc000119f18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc000119fc8 sp=0xc000119f38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc000119fe0 sp=0xc000119fc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc000119fe8 sp=0xc000119fe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 10 gp=0xc0001f9dc0 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea7248?, 0x1?, 0x88?, 0x2?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004a8738 sp=0xc0004a8718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0004a87c8 sp=0xc0004a8738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0004a87e0 sp=0xc0004a87c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a87e8 sp=0xc0004a87e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 11 gp=0xc0004ac000 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4e59ee4?, 0x3?, 0x88?, 0x1a?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004a8f38 sp=0xc0004a8f18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0004a8fc8 sp=0xc0004a8f38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0004a8fe0 sp=0xc0004a8fc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a8fe8 sp=0xc0004a8fe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 12 gp=0xc0004ac1c0 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4e532f9?, 0x3?, 0xcf?, 0x20?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004a9738 sp=0xc0004a9718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0004a97c8 sp=0xc0004a9738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0004a97e0 sp=0xc0004a97c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a97e8 sp=0xc0004a97e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 13 gp=0xc0004ac380 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea748e?, 0x3?, 0x4a?, 0x76?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004a9f38 sp=0xc0004a9f18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0004a9fc8 sp=0xc0004a9f38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0004a9fe0 sp=0xc0004a9fc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a9fe8 sp=0xc0004a9fe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 14 gp=0xc0004ac540 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea7a9b?, 0x3?, 0x9d?, 0x99?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004aa738 sp=0xc0004aa718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0004aa7c8 sp=0xc0004aa738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0004aa7e0 sp=0xc0004aa7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004aa7e8 sp=0xc0004aa7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 15 gp=0xc0004ac700 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0x1e?, 0x6c?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004aaf38 sp=0xc0004aaf18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0004aafc8 sp=0xc0004aaf38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0004aafe0 sp=0xc0004aafc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004aafe8 sp=0xc0004aafe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 16 gp=0xc0004ac8c0 m=nil [GC worker (idle)]:
runtime.gopark(0x556e46f476c0?, 0x1?, 0xa8?, 0xe4?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc0004ab738 sp=0xc0004ab718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc0004ab7c8 sp=0xc0004ab738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc0004ab7e0 sp=0xc0004ab7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc0004ab7e8 sp=0xc0004ab7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 20 gp=0xc000504380 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea7441?, 0x3?, 0x42?, 0x69?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00009d738 sp=0xc00009d718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00009d7c8 sp=0xc00009d738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00009d7e0 sp=0xc00009d7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00009d7e8 sp=0xc00009d7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 21 gp=0xc000504540 m=nil [GC worker (idle)]:
runtime.gopark(0x3437be9cb691b?, 0x1?, 0xff?, 0x20?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00009df38 sp=0xc00009df18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00009dfc8 sp=0xc00009df38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00009dfe0 sp=0xc00009dfc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00009dfe8 sp=0xc00009dfe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 22 gp=0xc000504700 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4ea7f39?, 0x1?, 0x70?, 0xa1?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00009e738 sp=0xc00009e718 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00009e7c8 sp=0xc00009e738 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00009e7e0 sp=0xc00009e7c8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00009e7e8 sp=0xc00009e7e0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 23 gp=0xc0005048c0 m=nil [GC worker (idle)]:
runtime.gopark(0x3437bd4e648fd?, 0xc0005000c0?, 0x1a?, 0xa?, 0x0?)
        runtime/proc.go:424 +0xce fp=0xc00009ef38 sp=0xc00009ef18 pc=0x556e454451ae
runtime.gcBgMarkWorker(0xc0000477a0)
        runtime/mgc.go:1412 +0xe9 fp=0xc00009efc8 sp=0xc00009ef38 pc=0x556e453f1849
runtime.gcBgMarkStartWorkers.gowrap1()
        runtime/mgc.go:1328 +0x25 fp=0xc00009efe0 sp=0xc00009efc8 pc=0x556e453f1725
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00009efe8 sp=0xc00009efe0 pc=0x556e4544d581
created by runtime.gcBgMarkStartWorkers in goroutine 1
        runtime/mgc.go:1328 +0x105

goroutine 25 gp=0xc041a21500 m=nil [select]:
runtime.gopark(0xc00004da68?, 0x2?, 0x9c?, 0xe3?, 0xc00004d834?)
        runtime/proc.go:424 +0xce fp=0xc00004d690 sp=0xc00004d670 pc=0x556e454451ae
runtime.selectgo(0xc00004da68, 0xc00004d830, 0xa2?, 0x0, 0x1?, 0x1)
        runtime/select.go:335 +0x7a5 fp=0xc00004d7b8 sp=0xc00004d690 pc=0x556e454224a5
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0xc000157e00, {0x556e466a3088, 0xc016020620}, 0xc00015ab40)
        github.com/ollama/ollama/runner/ollamarunner/runner.go:653 +0x9ed fp=0xc00004dac0 sp=0xc00004d7b8 pc=0x556e45a4270d
github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x556e466a3088?, 0xc016020620?}, 0x556e4577b587?)
        <autogenerated>:1 +0x36 fp=0xc00004daf0 sp=0xc00004dac0 pc=0x556e45a44fd6
net/http.HandlerFunc.ServeHTTP(0xc0001415e0?, {0x556e466a3088?, 0xc016020620?}, 0x0?)
        net/http/server.go:2220 +0x29 fp=0xc00004db18 sp=0xc00004daf0 pc=0x556e4576dda9
net/http.(*ServeMux).ServeHTTP(0x556e453e59e5?, {0x556e466a3088, 0xc016020620}, 0xc00015ab40)
        net/http/server.go:2747 +0x1ca fp=0xc00004db68 sp=0xc00004db18 pc=0x556e4576fcaa
net/http.serverHandler.ServeHTTP({0x556e4669fa50?}, {0x556e466a3088?, 0xc016020620?}, 0x6?)
        net/http/server.go:3210 +0x8e fp=0xc00004db98 sp=0xc00004db68 pc=0x556e4578d20e
net/http.(*conn).serve(0xc033d1e090, {0x556e466a5268, 0xc00017ccf0})
        net/http/server.go:2092 +0x5d0 fp=0xc00004dfb8 sp=0xc00004db98 pc=0x556e4576c750
net/http.(*Server).Serve.gowrap3()
        net/http/server.go:3360 +0x28 fp=0xc00004dfe0 sp=0xc00004dfb8 pc=0x556e45771ba8
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc00004dfe8 sp=0xc00004dfe0 pc=0x556e4544d581
created by net/http.(*Server).Serve in goroutine 1
        net/http/server.go:3360 +0x485

goroutine 553 gp=0xc041a6ba40 m=nil [IO wait]:
runtime.gopark(0xc0013a9680?, 0x2?, 0x2?, 0x0?, 0xb?)
        runtime/proc.go:424 +0xce fp=0xc041a105a8 sp=0xc041a10588 pc=0x556e454451ae
runtime.netpollblock(0x556e45468578?, 0x453dbfc6?, 0x6e?)
        runtime/netpoll.go:575 +0xf7 fp=0xc041a105e0 sp=0xc041a105a8 pc=0x556e45408e17
internal/poll.runtime_pollWait(0x7f9e1ed79450, 0x72)
        runtime/netpoll.go:351 +0x85 fp=0xc041a10600 sp=0xc041a105e0 pc=0x556e454444a5
internal/poll.(*pollDesc).wait(0xc033d36000?, 0xc0014262b1?, 0x0)
        internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc041a10628 sp=0xc041a10600 pc=0x556e454cc5c7
internal/poll.(*pollDesc).waitRead(...)
        internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc033d36000, {0xc0014262b1, 0x1, 0x1})
        internal/poll/fd_unix.go:165 +0x27a fp=0xc041a106c0 sp=0xc041a10628 pc=0x556e454cd8ba
net.(*netFD).Read(0xc033d36000, {0xc0014262b1?, 0xc0013a9680?, 0x2?})
        net/fd_posix.go:55 +0x25 fp=0xc041a10708 sp=0xc041a106c0 pc=0x556e45538ac5
net.(*conn).Read(0xc033d38000, {0xc0014262b1?, 0x0?, 0xc041a107d0?})
        net/net.go:189 +0x45 fp=0xc041a10750 sp=0xc041a10708 pc=0x556e455470c5
net.(*TCPConn).Read(0x0?, {0xc0014262b1?, 0xc0002c4500?, 0x556e45855f00?})
        <autogenerated>:1 +0x25 fp=0xc041a10780 sp=0xc041a10750 pc=0x556e4555a2c5
net/http.(*connReader).backgroundRead(0xc0014262a0)
        net/http/server.go:690 +0x37 fp=0xc041a107c8 sp=0xc041a10780 pc=0x556e457670d7
net/http.(*connReader).startBackgroundRead.gowrap2()
        net/http/server.go:686 +0x25 fp=0xc041a107e0 sp=0xc041a107c8 pc=0x556e45767005
runtime.goexit({})
        runtime/asm_amd64.s:1700 +0x1 fp=0xc041a107e8 sp=0xc041a107e0 pc=0x556e4544d581
created by net/http.(*connReader).startBackgroundRead in goroutine 25
        net/http/server.go:686 +0xb6

rax    0x0
rbx    0x7f9804ff9640
rcx    0x7f9e660cf9fc
rdx    0x6
rdi    0x531b6
rsi    0x5323e
rbp    0x5323e
rsp    0x7f9804ff89e0
r8     0x7f9804ff8ab0
r9     0x7f9804ff8a80
r10    0x8
r11    0x246
r12    0x6
r13    0x16
r14    0x7f97dc00ba60
r15    0x3ffffffffffffff
rip    0x7f9e660cf9fc
rflags 0x246
cs     0x33
fs     0x0
gs     0x0
[GIN] 2025/02/22 - 12:41:39 | 200 |  2.388408313s |       127.0.0.1 | POST     "/api/chat"
time=2025-02-22T12:41:39.331+01:00 level=ERROR source=server.go:421 msg="llama runner terminated" error="exit status 2"
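
A note on the failed assertion: the log in this report shows `GGML_ASSERT(ggml_nelements(a) == ne0*ne1*ne2)` failing inside `ggml_reshape_3d`, called with the dimensions `0xa0, 0x20, 0xa2`. The sketch below only checks that arithmetic against the GGUF metadata printed in the same log. The size of the tensor being reshaped does not appear in the log, so the line marked ASSUMPTION (a query projection that emits `head_count * key_length` values per token) is a guess, not a confirmed diagnosis. The helper `reshape_is_valid` is hypothetical and merely mirrors the condition in the assertion.

```python
import math

# Values copied from the log in this report.
ne0, ne1, ne2 = 0xA0, 0x20, 0xA2   # dimensions passed to ggml_reshape_3d
embedding_length = 5120            # llama.embedding_length
head_count = 32                    # llama.attention.head_count
key_length = 128                   # llama.attention.key_length


def reshape_is_valid(nelements, shape):
    """Hypothetical helper: same condition as the failing GGML_ASSERT."""
    return nelements == math.prod(shape)


# Element count the reshape call asks for.
requested = ne0 * ne1 * ne2

# ASSUMPTION (not in the log): the tensor holds head_count * key_length
# values for each of the ne2 positions in the batch.
assumed_actual = head_count * key_length * ne2

print("requested head size:", ne0)
print("embedding_length // head_count:", embedding_length // head_count)
print("key_length from metadata:", key_length)
print("requested elements:", requested)
print("assumed actual elements:", assumed_actual)
print("reshape valid with ne0 from the trace:",
      reshape_is_valid(assumed_actual, (ne0, ne1, ne2)))
print("reshape valid with key_length:",
      reshape_is_valid(assumed_actual, (key_length, head_count, ne2)))
```

If the assumption holds, the first dimension in the trace (160) matches `embedding_length / head_count` rather than the `key_length` of 128 listed in the metadata, which would account for the element-count mismatch.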

OS

Linux

GPU

Nvidia

CPU

Intel

Ollama version

0.5.12rc1

runtime/proc.go:424 +0xce fp=0xc00011c738 sp=0xc00011c718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011c7c8 sp=0xc00011c738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011c7e0 sp=0xc00011c7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011c7e8 sp=0xc00011c7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 8 gp=0xc0001f9a40 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x3?, 0x3e?, 0x3b?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000a3738 sp=0xc0000a3718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0000a37c8 sp=0xc0000a3738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a37e0 sp=0xc0000a37c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a37e8 sp=0xc0000a37e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 9 gp=0xc0001f9c00 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4e5b149?, 0x3?, 0x3d?, 0x56?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0000a3f38 sp=0xc0000a3f18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0000a3fc8 sp=0xc0000a3f38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0000a3fe0 sp=0xc0000a3fc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a3fe8 sp=0xc0000a3fe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 35 gp=0xc000104540 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4ea7e31?, 0x3?, 0xd4?, 0x36?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc00011cf38 sp=0xc00011cf18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011cfc8 sp=0xc00011cf38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011cfe0 sp=0xc00011cfc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011cfe8 sp=0xc00011cfe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 36 gp=0xc000104700 m=nil [GC worker (idle)]: runtime.gopark(0x3437be9cb6d00?, 0x1?, 0x3d?, 0xf0?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00011d738 sp=0xc00011d718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011d7c8 sp=0xc00011d738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011d7e0 sp=0xc00011d7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011d7e8 sp=0xc00011d7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 37 gp=0xc0001048c0 m=nil [GC worker (idle)]: runtime.gopark(0x3437be9d1eb9b?, 0x1?, 0x19?, 0x8?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00011df38 sp=0xc00011df18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011dfc8 sp=0xc00011df38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011dfe0 sp=0xc00011dfc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011dfe8 sp=0xc00011dfe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 38 gp=0xc000104a80 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4ea7c30?, 0x3?, 0x2?, 0x58?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc00011e738 sp=0xc00011e718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011e7c8 sp=0xc00011e738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011e7e0 sp=0xc00011e7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011e7e8 sp=0xc00011e7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 39 gp=0xc000104c40 m=nil [GC worker (idle)]: runtime.gopark(0x3437be9d1f70d?, 0x1?, 0x6c?, 0x8a?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00011ef38 sp=0xc00011ef18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011efc8 sp=0xc00011ef38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011efe0 sp=0xc00011efc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011efe8 sp=0xc00011efe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 40 gp=0xc000104e00 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0xbf?, 0x8d?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00011f738 sp=0xc00011f718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011f7c8 sp=0xc00011f738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011f7e0 sp=0xc00011f7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011f7e8 sp=0xc00011f7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 41 gp=0xc000104fc0 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0x81?, 0xca?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc00011ff38 sp=0xc00011ff18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00011ffc8 sp=0xc00011ff38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00011ffe0 sp=0xc00011ffc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011ffe8 sp=0xc00011ffe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 42 gp=0xc000105180 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0xc8?, 0x54?, 0x0?) runtime/proc.go:424 +0xce fp=0xc000118738 sp=0xc000118718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0001187c8 sp=0xc000118738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0001187e0 sp=0xc0001187c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0001187e8 sp=0xc0001187e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 43 gp=0xc000105340 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0xf?, 0x27?, 0x0?) runtime/proc.go:424 +0xce fp=0xc000118f38 sp=0xc000118f18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc000118fc8 sp=0xc000118f38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc000118fe0 sp=0xc000118fc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000118fe8 sp=0xc000118fe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 44 gp=0xc000105500 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0x22?, 0x4d?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc000119738 sp=0xc000119718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0001197c8 sp=0xc000119738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0001197e0 sp=0xc0001197c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0001197e8 sp=0xc0001197e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 45 gp=0xc0001056c0 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0x2b?, 0xdb?, 0x0?) runtime/proc.go:424 +0xce fp=0xc000119f38 sp=0xc000119f18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc000119fc8 sp=0xc000119f38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc000119fe0 sp=0xc000119fc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000119fe8 sp=0xc000119fe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 10 gp=0xc0001f9dc0 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4ea7248?, 0x1?, 0x88?, 0x2?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004a8738 sp=0xc0004a8718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0004a87c8 sp=0xc0004a8738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004a87e0 sp=0xc0004a87c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a87e8 sp=0xc0004a87e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 11 gp=0xc0004ac000 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4e59ee4?, 0x3?, 0x88?, 0x1a?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0004a8f38 sp=0xc0004a8f18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0004a8fc8 sp=0xc0004a8f38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004a8fe0 sp=0xc0004a8fc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a8fe8 sp=0xc0004a8fe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 12 gp=0xc0004ac1c0 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4e532f9?, 0x3?, 0xcf?, 0x20?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004a9738 sp=0xc0004a9718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0004a97c8 sp=0xc0004a9738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004a97e0 sp=0xc0004a97c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a97e8 sp=0xc0004a97e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 13 gp=0xc0004ac380 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4ea748e?, 0x3?, 0x4a?, 0x76?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004a9f38 sp=0xc0004a9f18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0004a9fc8 sp=0xc0004a9f38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004a9fe0 sp=0xc0004a9fc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a9fe8 sp=0xc0004a9fe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 14 gp=0xc0004ac540 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4ea7a9b?, 0x3?, 0x9d?, 0x99?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc0004aa738 sp=0xc0004aa718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0004aa7c8 sp=0xc0004aa738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004aa7e0 sp=0xc0004aa7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004aa7e8 sp=0xc0004aa7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 15 gp=0xc0004ac700 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0x1e?, 0x6c?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004aaf38 sp=0xc0004aaf18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0004aafc8 sp=0xc0004aaf38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004aafe0 sp=0xc0004aafc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004aafe8 sp=0xc0004aafe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 16 gp=0xc0004ac8c0 m=nil [GC worker (idle)]: runtime.gopark(0x556e46f476c0?, 0x1?, 0xa8?, 0xe4?, 0x0?) runtime/proc.go:424 +0xce fp=0xc0004ab738 sp=0xc0004ab718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc0004ab7c8 sp=0xc0004ab738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc0004ab7e0 sp=0xc0004ab7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004ab7e8 sp=0xc0004ab7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 20 gp=0xc000504380 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4ea7441?, 0x3?, 0x42?, 0x69?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc00009d738 sp=0xc00009d718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00009d7c8 sp=0xc00009d738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00009d7e0 sp=0xc00009d7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00009d7e8 sp=0xc00009d7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 21 gp=0xc000504540 m=nil [GC worker (idle)]: runtime.gopark(0x3437be9cb691b?, 0x1?, 0xff?, 0x20?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00009df38 sp=0xc00009df18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00009dfc8 sp=0xc00009df38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00009dfe0 sp=0xc00009dfc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00009dfe8 sp=0xc00009dfe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 22 gp=0xc000504700 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4ea7f39?, 0x1?, 0x70?, 0xa1?, 0x0?) runtime/proc.go:424 +0xce fp=0xc00009e738 sp=0xc00009e718 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00009e7c8 sp=0xc00009e738 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00009e7e0 sp=0xc00009e7c8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00009e7e8 sp=0xc00009e7e0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 23 gp=0xc0005048c0 m=nil [GC worker (idle)]: runtime.gopark(0x3437bd4e648fd?, 0xc0005000c0?, 0x1a?, 0xa?, 0x0?) 
runtime/proc.go:424 +0xce fp=0xc00009ef38 sp=0xc00009ef18 pc=0x556e454451ae runtime.gcBgMarkWorker(0xc0000477a0) runtime/mgc.go:1412 +0xe9 fp=0xc00009efc8 sp=0xc00009ef38 pc=0x556e453f1849 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1328 +0x25 fp=0xc00009efe0 sp=0xc00009efc8 pc=0x556e453f1725 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00009efe8 sp=0xc00009efe0 pc=0x556e4544d581 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1328 +0x105 goroutine 25 gp=0xc041a21500 m=nil [select]: runtime.gopark(0xc00004da68?, 0x2?, 0x9c?, 0xe3?, 0xc00004d834?) runtime/proc.go:424 +0xce fp=0xc00004d690 sp=0xc00004d670 pc=0x556e454451ae runtime.selectgo(0xc00004da68, 0xc00004d830, 0xa2?, 0x0, 0x1?, 0x1) runtime/select.go:335 +0x7a5 fp=0xc00004d7b8 sp=0xc00004d690 pc=0x556e454224a5 github.com/ollama/ollama/runner/ollamarunner.(*Server).completion(0xc000157e00, {0x556e466a3088, 0xc016020620}, 0xc00015ab40) github.com/ollama/ollama/runner/ollamarunner/runner.go:653 +0x9ed fp=0xc00004dac0 sp=0xc00004d7b8 pc=0x556e45a4270d github.com/ollama/ollama/runner/ollamarunner.(*Server).completion-fm({0x556e466a3088?, 0xc016020620?}, 0x556e4577b587?) <autogenerated>:1 +0x36 fp=0xc00004daf0 sp=0xc00004dac0 pc=0x556e45a44fd6 net/http.HandlerFunc.ServeHTTP(0xc0001415e0?, {0x556e466a3088?, 0xc016020620?}, 0x0?) net/http/server.go:2220 +0x29 fp=0xc00004db18 sp=0xc00004daf0 pc=0x556e4576dda9 net/http.(*ServeMux).ServeHTTP(0x556e453e59e5?, {0x556e466a3088, 0xc016020620}, 0xc00015ab40) net/http/server.go:2747 +0x1ca fp=0xc00004db68 sp=0xc00004db18 pc=0x556e4576fcaa net/http.serverHandler.ServeHTTP({0x556e4669fa50?}, {0x556e466a3088?, 0xc016020620?}, 0x6?) 
net/http/server.go:3210 +0x8e fp=0xc00004db98 sp=0xc00004db68 pc=0x556e4578d20e net/http.(*conn).serve(0xc033d1e090, {0x556e466a5268, 0xc00017ccf0}) net/http/server.go:2092 +0x5d0 fp=0xc00004dfb8 sp=0xc00004db98 pc=0x556e4576c750 net/http.(*Server).Serve.gowrap3() net/http/server.go:3360 +0x28 fp=0xc00004dfe0 sp=0xc00004dfb8 pc=0x556e45771ba8 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00004dfe8 sp=0xc00004dfe0 pc=0x556e4544d581 created by net/http.(*Server).Serve in goroutine 1 net/http/server.go:3360 +0x485 goroutine 553 gp=0xc041a6ba40 m=nil [IO wait]: runtime.gopark(0xc0013a9680?, 0x2?, 0x2?, 0x0?, 0xb?) runtime/proc.go:424 +0xce fp=0xc041a105a8 sp=0xc041a10588 pc=0x556e454451ae runtime.netpollblock(0x556e45468578?, 0x453dbfc6?, 0x6e?) runtime/netpoll.go:575 +0xf7 fp=0xc041a105e0 sp=0xc041a105a8 pc=0x556e45408e17 internal/poll.runtime_pollWait(0x7f9e1ed79450, 0x72) runtime/netpoll.go:351 +0x85 fp=0xc041a10600 sp=0xc041a105e0 pc=0x556e454444a5 internal/poll.(*pollDesc).wait(0xc033d36000?, 0xc0014262b1?, 0x0) internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc041a10628 sp=0xc041a10600 pc=0x556e454cc5c7 internal/poll.(*pollDesc).waitRead(...) 
	internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc033d36000, {0xc0014262b1, 0x1, 0x1})
	internal/poll/fd_unix.go:165 +0x27a fp=0xc041a106c0 sp=0xc041a10628 pc=0x556e454cd8ba
net.(*netFD).Read(0xc033d36000, {0xc0014262b1?, 0xc0013a9680?, 0x2?})
	net/fd_posix.go:55 +0x25 fp=0xc041a10708 sp=0xc041a106c0 pc=0x556e45538ac5
net.(*conn).Read(0xc033d38000, {0xc0014262b1?, 0x0?, 0xc041a107d0?})
	net/net.go:189 +0x45 fp=0xc041a10750 sp=0xc041a10708 pc=0x556e455470c5
net.(*TCPConn).Read(0x0?, {0xc0014262b1?, 0xc0002c4500?, 0x556e45855f00?})
	<autogenerated>:1 +0x25 fp=0xc041a10780 sp=0xc041a10750 pc=0x556e4555a2c5
net/http.(*connReader).backgroundRead(0xc0014262a0)
	net/http/server.go:690 +0x37 fp=0xc041a107c8 sp=0xc041a10780 pc=0x556e457670d7
net/http.(*connReader).startBackgroundRead.gowrap2()
	net/http/server.go:686 +0x25 fp=0xc041a107e0 sp=0xc041a107c8 pc=0x556e45767005
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc041a107e8 sp=0xc041a107e0 pc=0x556e4544d581
created by net/http.(*connReader).startBackgroundRead in goroutine 25
	net/http/server.go:686 +0xb6

rax 0x0
rbx 0x7f9804ff9640
rcx 0x7f9e660cf9fc
rdx 0x6
rdi 0x531b6
rsi 0x5323e
rbp 0x5323e
rsp 0x7f9804ff89e0
r8 0x7f9804ff8ab0
r9 0x7f9804ff8a80
r10 0x8
r11 0x246
r12 0x6
r13 0x16
r14 0x7f97dc00ba60
r15 0x3ffffffffffffff
rip 0x7f9e660cf9fc
rflags 0x246
cs 0x33
fs 0x0
gs 0x0
[GIN] 2025/02/22 - 12:41:39 | 200 | 2.388408313s | 127.0.0.1 | POST "/api/chat"
time=2025-02-22T12:41:39.331+01:00 level=ERROR source=server.go:421 msg="llama runner terminated" error="exit status 2"
```

### OS

Linux

### GPU

Nvidia

### CPU

Intel

### Ollama version

0.5.12rc1
GiteaMirror added the bug label 2026-04-22 12:35:15 -05:00

@jessegross commented on GitHub (Feb 23, 2025):

Mistral isn't implemented in the new engine yet and the logic to auto-detect which engine to use hasn't been merged, which is one of several reasons why things are currently turned off by default. MLX also hasn't been merged.
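
A minimal sketch of what this implies for the reproduction steps, assuming the experimental engine is selected only through the `OLLAMA_NEW_ENGINE` and `OLLAMA_BACKEND` variables shown in the report (an assumption, not something confirmed in this thread). The helper name `run_default_engine` is hypothetical; it removes those variables from the environment before launching a command, so the server would start with its defaults.

```shell
# Hypothetical helper (not part of Ollama): run a command with the
# experimental engine variables from the report removed from its environment.
# Assumption: OLLAMA_NEW_ENGINE and OLLAMA_BACKEND are the only switches involved.
run_default_engine() {
  env -u OLLAMA_NEW_ENGINE -u OLLAMA_BACKEND "$@"
}

# Example usage (commented out so the snippet has no side effects):
# run_default_engine ollama serve
```

Wrapping the command rather than calling `unset` in the shell leaves the caller's own environment untouched, which makes it easy to switch back to the experimental settings for further testing.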


Reference: github-starred/ollama#31817