[GH-ISSUE #13171] Error: 500 Internal Server Error: llama runner process has terminated: exit status 2 #70768

Closed
opened 2026-05-04 22:55:50 -05:00 by GiteaMirror · 0 comments

Originally created by @MapleZJH on GitHub (Nov 20, 2025).
Original GitHub issue: https://github.com/ollama/ollama/issues/13171

What is the issue?

When I turn on Airplane mode to keep my models local, I get this error:
cmd: ollama run deepseek-r1:32b
Error: 500 Internal Server Error: llama runner process has terminated: exit status 2

I am using the deepseek-r1:32b and deepseek-r1:70b models; both have already been downloaded locally.
My Ollama settings are as follows:

  1. Expose ollama to the network: off;
  2. Airplane mode: on;
  3. system: OLLAMA_HOST: 127.0.0.1;
  4. system: OLLAMA_MODELS: D:\LLM\Models_ollama (where my models are located)
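
To rule out the CLI as a factor, the same model load can be triggered over the HTTP API on the loopback address from setting 3. The following is a minimal sketch, assuming the default port 11434 (which the log below also reports) and the `/api/generate` endpoint that appears in the log; the function names `build_generate_request` and `send` are illustrative only.

```python
import json
import urllib.error
import urllib.request


def build_generate_request(model, prompt, host="127.0.0.1", port=11434):
    """Build a non-streaming POST to /api/generate on the local server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"http://{host}:{port}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=600):
    """Send the request and return (status, body text), including on HTTP errors."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # A 500 from the server lands here; keep the body for the report.
        return err.code, err.read().decode("utf-8", "replace")


req = build_generate_request("deepseek-r1:32b", "hello")
print(req.get_method(), req.full_url)
```

Calling `send(req)` while the server is running should return the same 500 status if the failure is in the runner rather than in the `ollama run` client.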

Ollama version: 0.13.0
CPU: AMD RYZEN AI MAX+ PRO 395w
GPU: Radeon 8060s
GPU MEMORY: 96.0GB
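
Memory does not look like the limiting factor here. The 8.0 GiB KV cache figure in the log below can be reproduced from the model metadata, and the reported 28.7 GiB total is well under the 94.1 GiB the scheduler lists as available. A rough sketch of that arithmetic follows, assuming an f16 cache (2 bytes per element, since `KvCacheType` is empty in the load request) and taking `n_layer`, `n_ctx`, `n_embd_k_gqa`, and `n_embd_v_gqa` from the log; the helper name `kv_cache_bytes` is only illustrative.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layer, n_ctx, n_embd_k_gqa, n_embd_v_gqa, bytes_per_elem=2):
    """Size of the K and V caches: one (n_ctx x width) buffer each, per layer."""
    return n_layer * n_ctx * (n_embd_k_gqa + n_embd_v_gqa) * bytes_per_elem


# Values as printed in the log for deepseek-r1:32b with OLLAMA_CONTEXT_LENGTH=32768.
kv = kv_cache_bytes(n_layer=64, n_ctx=32768, n_embd_k_gqa=1024, n_embd_v_gqa=1024)
kv_gib = kv / GIB

# Weights and compute graph sizes are copied from the log, not derived here.
total_gib = 18.1 + kv_gib + 2.6

print(f"kv cache: {kv_gib:.1f} GiB")
print(f"total:    {total_gib:.1f} GiB of 94.1 GiB available")
```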

Relevant log output

time=2025-11-20T14:01:57.543+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server not responding"
time=2025-11-20T14:01:57.794+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server error"
time=2025-11-20T14:02:00.557+08:00 level=INFO source=sched.go:470 msg="Load failed" model=D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 error="llama runner process has terminated: exit status 2"
[GIN] 2025/11/20 - 14:02:00 | 500 |    49.853006s |       127.0.0.1 | POST     "/api/generate"
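
The `exit status 2` above is consistent with the Go runtime aborting after a fatal error, and the full log further down shows the runner dying with `Exception 0xc0000005 0x1 0x10 0x7ffb26419176` inside `llama_init_from_model`. In the Go runtime's Windows crash header, the four fields appear to be the exception code, the first two exception parameters, and the program counter; for an access violation the two parameters are the access type and the faulting address. Below is a small sketch that decodes the line under that assumption (the function name `decode_exception` and the short NTSTATUS table are illustrative and not part of Ollama).

```python
import re

# A few well-known NTSTATUS codes; this table is deliberately incomplete.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

# Meaning of the first parameter when the code is an access violation.
ACCESS_KINDS = {0: "read", 1: "write", 8: "execute"}

_PATTERN = re.compile(
    r"Exception (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+) (0x[0-9a-fA-F]+)"
)


def decode_exception(line):
    """Split a Go-on-Windows crash header into named fields."""
    match = _PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not an exception header: {line!r}")
    code, info0, info1, pc = (int(group, 16) for group in match.groups())
    decoded = {"code": code, "name": NTSTATUS_NAMES.get(code, "unknown"), "pc": pc}
    if code == 0xC0000005:
        decoded["access"] = ACCESS_KINDS.get(info0, "unknown")
        decoded["address"] = info1
    return decoded


result = decode_exception("Exception 0xc0000005 0x1 0x10 0x7ffb26419176")
print(result["name"], result.get("access"), hex(result.get("address", 0)))
```

For the line in this report, this decodes to a write access violation at address 0x10, which looks more like a near-null pointer dereference in native code than an out-of-memory condition.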


time=2025-11-20T13:53:41.540+08:00 level=INFO source=routes.go:1544 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:32768 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:D:\\LLM\\Models_ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES:]"
time=2025-11-20T13:53:41.566+08:00 level=INFO source=images.go:522 msg="total blobs: 11"
time=2025-11-20T13:53:41.567+08:00 level=INFO source=images.go:529 msg="total unused blobs removed: 0"
time=2025-11-20T13:53:41.568+08:00 level=INFO source=routes.go:1597 msg="Listening on [::]:11434 (version 0.13.0)"
time=2025-11-20T13:53:41.569+08:00 level=INFO source=runner.go:67 msg="discovering available GPUs..."
time=2025-11-20T13:53:41.569+08:00 level=INFO source=runner.go:102 msg="experimental Vulkan support disabled.  To enable, set OLLAMA_VULKAN=1"
time=2025-11-20T13:53:41.608+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 58346"
time=2025-11-20T13:53:43.013+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 49825"
time=2025-11-20T13:55:13.013+08:00 level=INFO source=runner.go:449 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\lib\\ollama C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v13]" extra_envs=map[] error="failed to finish discovery before timeout"
time=2025-11-20T13:55:13.031+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 57508"
time=2025-11-20T13:55:14.647+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 57577"
time=2025-11-20T13:55:17.215+08:00 level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=ROCm compute=gfx1151 name=ROCm0 description="AMD Radeon(TM) 8060S Graphics" libdirs=ollama,rocm driver=60241.51 pci_id=0000:c3:00.0 type=iGPU total="96.0 GiB" available="94.7 GiB"
[GIN] 2025/11/20 - 13:55:17 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/20 - 13:58:28 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/20 - 13:58:39 | 200 |            0s |       127.0.0.1 | GET      "/api/version"
[GIN] 2025/11/20 - 13:58:48 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2025/11/20 - 13:58:48 | 200 |      1.6596ms |       127.0.0.1 | GET      "/api/tags"
[GIN] 2025/11/20 - 14:01:10 | 200 |            0s |       127.0.0.1 | HEAD     "/"
[GIN] 2025/11/20 - 14:01:10 | 200 |     95.0704ms |       127.0.0.1 | POST     "/api/show"
[GIN] 2025/11/20 - 14:01:10 | 200 |     83.4468ms |       127.0.0.1 | POST     "/api/show"
time=2025-11-20T14:01:10.908+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 62280"
time=2025-11-20T14:01:11.828+08:00 level=INFO source=cpu_windows.go:148 msg=packages count=1
time=2025-11-20T14:01:11.829+08:00 level=INFO source=cpu_windows.go:195 msg="" package=0 cores=16 efficiency=0 threads=32
llama_model_loader: loaded meta data with 26 key-value pairs and 771 tensors from D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = DeepSeek R1 Distill Qwen 32B
llama_model_loader: - kv   3:                           general.basename str              = DeepSeek-R1-Distill-Qwen
llama_model_loader: - kv   4:                         general.size_label str              = 32B
llama_model_loader: - kv   5:                          qwen2.block_count u32              = 64
llama_model_loader: - kv   6:                       qwen2.context_length u32              = 131072
llama_model_loader: - kv   7:                     qwen2.embedding_length u32              = 5120
llama_model_loader: - kv   8:                  qwen2.feed_forward_length u32              = 27648
llama_model_loader: - kv   9:                 qwen2.attention.head_count u32              = 40
llama_model_loader: - kv  10:              qwen2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  11:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  12:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  13:                          general.file_type u32              = 15
llama_model_loader: - kv  14:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  15:                         tokenizer.ggml.pre str              = deepseek-r1-qwen
llama_model_loader: - kv  16:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  18:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  19:                tokenizer.ggml.bos_token_id u32              = 151646
llama_model_loader: - kv  20:                tokenizer.ggml.eos_token_id u32              = 151643
llama_model_loader: - kv  21:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  22:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  23:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  24:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
llama_model_loader: - kv  25:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  321 tensors
llama_model_loader: - type q4_K:  385 tensors
llama_model_loader: - type q6_K:   65 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_K - Medium
print_info: file size   = 18.48 GiB (4.85 BPW) 
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: printing all EOG tokens:
load:   - 151643 ('<|end▁of▁sentence|>')
load:   - 151662 ('<|fim_pad|>')
load:   - 151663 ('<|repo_name|>')
load:   - 151664 ('<|file_sep|>')
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch             = qwen2
print_info: vocab_only       = 1
print_info: model type       = ?B
print_info: model params     = 32.76 B
print_info: general.name     = DeepSeek R1 Distill Qwen 32B
print_info: vocab type       = BPE
print_info: n_vocab          = 152064
print_info: n_merges         = 151387
print_info: BOS token        = 151646 '<|begin▁of▁sentence|>'
print_info: EOS token        = 151643 '<|end▁of▁sentence|>'
print_info: EOT token        = 151643 '<|end▁of▁sentence|>'
print_info: PAD token        = 151643 '<|end▁of▁sentence|>'
print_info: LF token         = 198 'Ċ'
print_info: FIM PRE token    = 151659 '<|fim_prefix|>'
print_info: FIM SUF token    = 151661 '<|fim_suffix|>'
print_info: FIM MID token    = 151660 '<|fim_middle|>'
print_info: FIM PAD token    = 151662 '<|fim_pad|>'
print_info: FIM REP token    = 151663 '<|repo_name|>'
print_info: FIM SEP token    = 151664 '<|file_sep|>'
print_info: EOG token        = 151643 '<|end▁of▁sentence|>'
print_info: EOG token        = 151662 '<|fim_pad|>'
print_info: EOG token        = 151663 '<|repo_name|>'
print_info: EOG token        = 151664 '<|file_sep|>'
print_info: max token length = 256
llama_model_load: vocab only - skipping tensors
time=2025-11-20T14:01:12.167+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --model D:\\LLM\\Models_ollama\\blobs\\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 --port 62302"
time=2025-11-20T14:01:12.195+08:00 level=INFO source=sched.go:443 msg="system memory" total="31.8 GiB" free="20.9 GiB" free_swap="15.1 GiB"
time=2025-11-20T14:01:12.195+08:00 level=INFO source=sched.go:450 msg="gpu memory" id=0 library=ROCm available="94.1 GiB" free="94.6 GiB" minimum="457.0 MiB" overhead="0 B"
time=2025-11-20T14:01:12.195+08:00 level=INFO source=server.go:459 msg="loading model" "model layers"=65 requested=-1
time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:240 msg="model weights" device=ROCm0 size="18.1 GiB"
time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:251 msg="kv cache" device=ROCm0 size="8.0 GiB"
time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:262 msg="compute graph" device=ROCm0 size="2.6 GiB"
time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:272 msg="total memory" size="28.7 GiB"
time=2025-11-20T14:01:12.822+08:00 level=INFO source=runner.go:963 msg="starting go runner"
load_backend: loaded CPU backend from C:\Users\ZJH\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 ROCm devices:
  Device 0: AMD Radeon(TM) 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, ID: 0
load_backend: loaded ROCm backend from C:\Users\ZJH\AppData\Local\Programs\Ollama\lib\ollama\rocm\ggml-hip.dll
time=2025-11-20T14:01:12.927+08:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.AVX512=1 CPU.0.AVX512_VBMI=1 CPU.0.AVX512_VNNI=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 ROCm.0.NO_VMM=1 ROCm.0.NO_PEER_COPY=1 ROCm.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(clang)
time=2025-11-20T14:01:12.929+08:00 level=INFO source=runner.go:999 msg="Server listening on 127.0.0.1:62302"
time=2025-11-20T14:01:12.941+08:00 level=INFO source=runner.go:893 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:false KvSize:32768 KvCacheType: NumThreads:16 GPULayers:65[ID:0 Layers:65(0..64)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:true}"
time=2025-11-20T14:01:12.941+08:00 level=INFO source=server.go:1294 msg="waiting for llama runner to start responding"
time=2025-11-20T14:01:12.941+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server loading model"
ggml_hip_mgmt_init located ADLX version 1.3
ggml_backend_cuda_device_get_memory device 0000:c3:00.0 utilizing AMD specific memory reporting free: 101567168512 total: 103079215104
llama_model_load_from_file_impl: using device ROCm0 (AMD Radeon(TM) 8060S Graphics) (0000:c3:00.0) - 96862 MiB free
llama_model_loader: loaded meta data with 26 key-value pairs and 771 tensors from D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = qwen2
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = DeepSeek R1 Distill Qwen 32B
llama_model_loader: - kv   3:                           general.basename str              = DeepSeek-R1-Distill-Qwen
llama_model_loader: - kv   4:                         general.size_label str              = 32B
llama_model_loader: - kv   5:                          qwen2.block_count u32              = 64
llama_model_loader: - kv   6:                       qwen2.context_length u32              = 131072
llama_model_loader: - kv   7:                     qwen2.embedding_length u32              = 5120
llama_model_loader: - kv   8:                  qwen2.feed_forward_length u32              = 27648
llama_model_loader: - kv   9:                 qwen2.attention.head_count u32              = 40
llama_model_loader: - kv  10:              qwen2.attention.head_count_kv u32              = 8
llama_model_loader: - kv  11:                       qwen2.rope.freq_base f32              = 1000000.000000
llama_model_loader: - kv  12:     qwen2.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  13:                          general.file_type u32              = 15
llama_model_loader: - kv  14:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  15:                         tokenizer.ggml.pre str              = deepseek-r1-qwen
llama_model_loader: - kv  16:                      tokenizer.ggml.tokens arr[str,152064]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  17:                  tokenizer.ggml.token_type arr[i32,152064]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  18:                      tokenizer.ggml.merges arr[str,151387]  = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv  19:                tokenizer.ggml.bos_token_id u32              = 151646
llama_model_loader: - kv  20:                tokenizer.ggml.eos_token_id u32              = 151643
llama_model_loader: - kv  21:            tokenizer.ggml.padding_token_id u32              = 151643
llama_model_loader: - kv  22:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  23:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  24:                    tokenizer.chat_template str              = {% if not add_generation_prompt is de...
llama_model_loader: - kv  25:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:  321 tensors
llama_model_loader: - type q4_K:  385 tensors
llama_model_loader: - type q6_K:   65 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q4_K - Medium
print_info: file size   = 18.48 GiB (4.85 BPW) 
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: printing all EOG tokens:
load:   - 151643 ('<|end▁of▁sentence|>')
load:   - 151662 ('<|fim_pad|>')
load:   - 151663 ('<|repo_name|>')
load:   - 151664 ('<|file_sep|>')
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch             = qwen2
print_info: vocab_only       = 0
print_info: n_ctx_train      = 131072
print_info: n_embd           = 5120
print_info: n_layer          = 64
print_info: n_head           = 40
print_info: n_head_kv        = 8
print_info: n_rot            = 128
print_info: n_swa            = 0
print_info: is_swa_any       = 0
print_info: n_embd_head_k    = 128
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 5
print_info: n_embd_k_gqa     = 1024
print_info: n_embd_v_gqa     = 1024
print_info: f_norm_eps       = 0.0e+00
print_info: f_norm_rms_eps   = 1.0e-05
print_info: f_clamp_kqv      = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale    = 0.0e+00
print_info: f_attn_scale     = 0.0e+00
print_info: n_ff             = 27648
print_info: n_expert         = 0
print_info: n_expert_used    = 0
print_info: causal attn      = 1
print_info: pooling type     = -1
print_info: rope type        = 2
print_info: rope scaling     = linear
print_info: freq_base_train  = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn  = 131072
print_info: rope_finetuned   = unknown
print_info: model type       = 32B
print_info: model params     = 32.76 B
print_info: general.name     = DeepSeek R1 Distill Qwen 32B
print_info: vocab type       = BPE
print_info: n_vocab          = 152064
print_info: n_merges         = 151387
print_info: BOS token        = 151646 '<|begin▁of▁sentence|>'
print_info: EOS token        = 151643 '<|end▁of▁sentence|>'
print_info: EOT token        = 151643 '<|end▁of▁sentence|>'
print_info: PAD token        = 151643 '<|end▁of▁sentence|>'
print_info: LF token         = 198 'Ċ'
print_info: FIM PRE token    = 151659 '<|fim_prefix|>'
print_info: FIM SUF token    = 151661 '<|fim_suffix|>'
print_info: FIM MID token    = 151660 '<|fim_middle|>'
print_info: FIM PAD token    = 151662 '<|fim_pad|>'
print_info: FIM REP token    = 151663 '<|repo_name|>'
print_info: FIM SEP token    = 151664 '<|file_sep|>'
print_info: EOG token        = 151643 '<|end▁of▁sentence|>'
print_info: EOG token        = 151662 '<|fim_pad|>'
print_info: EOG token        = 151663 '<|repo_name|>'
print_info: EOG token        = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = true)
load_tensors: offloading 64 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 65/65 layers to GPU
load_tensors:        ROCm0 model buffer size = 18508.35 MiB
load_tensors:   CPU_Mapped model buffer size =   417.66 MiB
llama_context: constructing llama_context
llama_context: n_seq_max     = 1
llama_context: n_ctx         = 32768
llama_context: n_ctx_per_seq = 32768
llama_context: n_batch       = 512
llama_context: n_ubatch      = 512
llama_context: causal_attn   = 1
llama_context: flash_attn    = disabled
llama_context: kv_unified    = false
llama_context: freq_base     = 1000000.0
llama_context: freq_scale    = 1
llama_context: n_ctx_per_seq (32768) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_context:  ROCm_Host  output buffer size =     0.60 MiB
Exception 0xc0000005 0x1 0x10 0x7ffb26419176
PC=0x7ffb26419176
signal arrived during external code execution

runtime.cgocall(0x7ff6052c88e0, 0xc0003e9c00)
	runtime/cgocall.go:167 +0x3e fp=0xc0003e9bd8 sp=0xc0003e9b70 pc=0x7ff60459243e
github.com/ollama/ollama/llama._Cfunc_llama_init_from_model(0x2a6a1db4020, {0x8000, 0x200, 0x200, 0x1, 0x10, 0x10, 0xffffffff, 0xffffffff, 0xffffffff, ...})
	_cgo_gotypes.go:754 +0x54 fp=0xc0003e9c00 sp=0xc0003e9bd8 pc=0x7ff604962d34
github.com/ollama/ollama/llama.NewContextWithModel.func1(...)
	github.com/ollama/ollama/llama/llama.go:317
github.com/ollama/ollama/llama.NewContextWithModel(0xc00037f9f0, {{0x8000, 0x200, 0x200, 0x1, 0x10, 0x10, 0xffffffff, 0xffffffff, 0xffffffff, ...}})
	github.com/ollama/ollama/llama/llama.go:317 +0x158 fp=0xc0003e9da0 sp=0xc0003e9c00 pc=0x7ff6049672f8
github.com/ollama/ollama/runner/llamarunner.(*Server).loadModel(0xc00032e320, {{0xc0003c0f80, 0x1, 0x1}, 0x41, 0x0, 0x1, {0xc0003c0f78, 0x1, 0x2}, ...}, ...)
	github.com/ollama/ollama/runner/llamarunner/runner.go:845 +0x178 fp=0xc0003e9ee8 sp=0xc0003e9da0 pc=0x7ff604a216d8
github.com/ollama/ollama/runner/llamarunner.(*Server).load.gowrap2()
	github.com/ollama/ollama/runner/llamarunner/runner.go:932 +0x115 fp=0xc0003e9fe0 sp=0xc0003e9ee8 pc=0x7ff604a228f5
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0003e9fe8 sp=0xc0003e9fe0 pc=0x7ff60459d8e1
created by github.com/ollama/ollama/runner/llamarunner.(*Server).load in goroutine 55
	github.com/ollama/ollama/runner/llamarunner/runner.go:932 +0x88a

goroutine 1 gp=0xc0000021c0 m=nil [IO wait]:
runtime.gopark(0x7ff60459f0e0?, 0x7ff60640ca00?, 0x20?, 0xc0?, 0xc0003ac0cc?)
	runtime/proc.go:435 +0xce fp=0xc00058f648 sp=0xc00058f628 pc=0x7ff60459598e
runtime.netpollblock(0x3d4?, 0x4530406?, 0xf6?)
	runtime/netpoll.go:575 +0xf7 fp=0xc00058f680 sp=0xc00058f648 pc=0x7ff60455bdf7
internal/poll.runtime_pollWait(0x2a6fb7ed110, 0x72)
	runtime/netpoll.go:351 +0x85 fp=0xc00058f6a0 sp=0xc00058f680 pc=0x7ff604594b25
internal/poll.(*pollDesc).wait(0x7ff60462a693?, 0x0?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00058f6c8 sp=0xc00058f6a0 pc=0x7ff60462bc87
internal/poll.execIO(0xc0003ac020, 0xc00050f770)
	internal/poll/fd_windows.go:177 +0x105 fp=0xc00058f740 sp=0xc00058f6c8 pc=0x7ff60462d0e5
internal/poll.(*FD).acceptOne(0xc0003ac008, 0x3a0, {0xc0003a41e0?, 0xc00050f7d0?, 0x7ff604634da5?}, 0xc00050f804?)
	internal/poll/fd_windows.go:946 +0x65 fp=0xc00058f7a0 sp=0xc00058f740 pc=0x7ff604631665
internal/poll.(*FD).Accept(0xc0003ac008, 0xc00058f950)
	internal/poll/fd_windows.go:980 +0x1b6 fp=0xc00058f858 sp=0xc00058f7a0 pc=0x7ff604631996
net.(*netFD).accept(0xc0003ac008)
	net/fd_windows.go:182 +0x4b fp=0xc00058f970 sp=0xc00058f858 pc=0x7ff6046a2f0b
net.(*TCPListener).accept(0xc000308300)
	net/tcpsock_posix.go:159 +0x1b fp=0xc00058f9c0 sp=0xc00058f970 pc=0x7ff6046b8f5b
net.(*TCPListener).Accept(0xc000308300)
	net/tcpsock.go:380 +0x30 fp=0xc00058f9f0 sp=0xc00058f9c0 pc=0x7ff6046b7d10
net/http.(*onceCloseListener).Accept(0xc0003301b0?)
	<autogenerated>:1 +0x24 fp=0xc00058fa08 sp=0xc00058f9f0 pc=0x7ff6048d1184
net/http.(*Server).Serve(0xc000320700, {0x7ff605a81580, 0xc000308300})
	net/http/server.go:3424 +0x30c fp=0xc00058fb38 sp=0xc00058fa08 pc=0x7ff6048a8a4c
github.com/ollama/ollama/runner/llamarunner.Execute({0xc0000da020, 0x4, 0x6})
	github.com/ollama/ollama/runner/llamarunner/runner.go:1000 +0x8f5 fp=0xc00058fd08 sp=0xc00058fb38 pc=0x7ff604a232b5
github.com/ollama/ollama/runner.Execute({0xc0000da010?, 0x0?, 0x0?})
	github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc00058fd30 sp=0xc00058fd08 pc=0x7ff604ac9714
github.com/ollama/ollama/cmd.NewCLI.func2(0xc000320300?, {0x7ff60589ad9d?, 0x4?, 0x7ff60589ada1?})
	github.com/ollama/ollama/cmd/cmd.go:1841 +0x45 fp=0xc00058fd58 sp=0xc00058fd30 pc=0x7ff605259145
github.com/spf13/cobra.(*Command).execute(0xc000235508, {0xc0003080c0, 0x4, 0x4})
	github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc00058fe78 sp=0xc00058fd58 pc=0x7ff60471d9dc
github.com/spf13/cobra.(*Command).ExecuteC(0xc000166908)
	github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc00058ff30 sp=0xc00058fe78 pc=0x7ff60471e225
github.com/spf13/cobra.(*Command).Execute(...)
	github.com/spf13/cobra@v1.7.0/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
	github.com/spf13/cobra@v1.7.0/command.go:985
main.main()
	github.com/ollama/ollama/main.go:12 +0x4d fp=0xc00058ff50 sp=0xc00058ff30 pc=0x7ff605259c2d
runtime.main()
	runtime/proc.go:283 +0x27d fp=0xc00058ffe0 sp=0xc00058ff50 pc=0x7ff604564ddd
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00058ffe8 sp=0xc00058ffe0 pc=0x7ff60459d8e1

goroutine 2 gp=0xc0000028c0 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0000a7fa8 sp=0xc0000a7f88 pc=0x7ff60459598e
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.forcegchelper()
	runtime/proc.go:348 +0xb8 fp=0xc0000a7fe0 sp=0xc0000a7fa8 pc=0x7ff6045650f8
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a7fe8 sp=0xc0000a7fe0 pc=0x7ff60459d8e1
created by runtime.init.7 in goroutine 1
	runtime/proc.go:336 +0x1a

goroutine 3 gp=0xc000002c40 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0000a9f80 sp=0xc0000a9f60 pc=0x7ff60459598e
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.bgsweep(0xc0000b6000)
	runtime/mgcsweep.go:316 +0xdf fp=0xc0000a9fc8 sp=0xc0000a9f80 pc=0x7ff60454debf
runtime.gcenable.gowrap1()
	runtime/mgc.go:204 +0x25 fp=0xc0000a9fe0 sp=0xc0000a9fc8 pc=0x7ff604542285
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a9fe8 sp=0xc0000a9fe0 pc=0x7ff60459d8e1
created by runtime.gcenable in goroutine 1
	runtime/mgc.go:204 +0x66

goroutine 4 gp=0xc000002e00 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x7ff605a6de10?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0000bdf78 sp=0xc0000bdf58 pc=0x7ff60459598e
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.(*scavengerState).park(0x7ff6064333c0)
	runtime/mgcscavenge.go:425 +0x49 fp=0xc0000bdfa8 sp=0xc0000bdf78 pc=0x7ff60454b909
runtime.bgscavenge(0xc0000b6000)
	runtime/mgcscavenge.go:658 +0x59 fp=0xc0000bdfc8 sp=0xc0000bdfa8 pc=0x7ff60454be99
runtime.gcenable.gowrap2()
	runtime/mgc.go:205 +0x25 fp=0xc0000bdfe0 sp=0xc0000bdfc8 pc=0x7ff604542225
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bdfe8 sp=0xc0000bdfe0 pc=0x7ff60459d8e1
created by runtime.gcenable in goroutine 1
	runtime/mgc.go:205 +0xa5

goroutine 5 gp=0xc000003340 m=nil [finalizer wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0000bfe30 sp=0xc0000bfe10 pc=0x7ff60459598e
runtime.runfinq()
	runtime/mfinal.go:196 +0x107 fp=0xc0000bffe0 sp=0xc0000bfe30 pc=0x7ff604541207
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bffe8 sp=0xc0000bffe0 pc=0x7ff60459d8e1
created by runtime.createfing in goroutine 1
	runtime/mfinal.go:166 +0x3d

goroutine 6 gp=0xc000003dc0 m=nil [chan receive]:
runtime.gopark(0xc0002295e0?, 0xc000694000?, 0x60?, 0xbf?, 0x7ff60468be48?)
	runtime/proc.go:435 +0xce fp=0xc0000abf18 sp=0xc0000abef8 pc=0x7ff60459598e
runtime.chanrecv(0xc000038460, 0x0, 0x1)
	runtime/chan.go:664 +0x445 fp=0xc0000abf90 sp=0xc0000abf18 pc=0x7ff604532d45
runtime.chanrecv1(0x7ff604564f40?, 0xc0000abf76?)
	runtime/chan.go:506 +0x12 fp=0xc0000abfb8 sp=0xc0000abf90 pc=0x7ff6045328d2
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
	runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
	runtime/mgc.go:1799 +0x2f fp=0xc0000abfe0 sp=0xc0000abfb8 pc=0x7ff6045454af
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000abfe8 sp=0xc0000abfe0 pc=0x7ff60459d8e1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
	runtime/mgc.go:1794 +0x85

goroutine 7 gp=0xc0004181c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0000b9f38 sp=0xc0000b9f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc0000b9fc8 sp=0xc0000b9f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc0000b9fe0 sp=0xc0000b9fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000b9fe8 sp=0xc0000b9fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 8 gp=0xc000418380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0000bbf38 sp=0xc0000bbf18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc0000bbfc8 sp=0xc0000bbf38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc0000bbfe0 sp=0xc0000bbfc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bbfe8 sp=0xc0000bbfe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 9 gp=0xc000418540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000475f38 sp=0xc000475f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000475fc8 sp=0xc000475f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000475fe0 sp=0xc000475fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000475fe8 sp=0xc000475fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 10 gp=0xc000418700 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000477f38 sp=0xc000477f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000477fc8 sp=0xc000477f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000477fe0 sp=0xc000477fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000477fe8 sp=0xc000477fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 18 gp=0xc0001061c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000471f38 sp=0xc000471f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000471fc8 sp=0xc000471f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000471fe0 sp=0xc000471fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000471fe8 sp=0xc000471fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 34 gp=0xc000484000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00048bf38 sp=0xc00048bf18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00048bfc8 sp=0xc00048bf38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00048bfe0 sp=0xc00048bfc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00048bfe8 sp=0xc00048bfe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 11 gp=0xc0004188c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000487f38 sp=0xc000487f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000487fc8 sp=0xc000487f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000487fe0 sp=0xc000487fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000487fe8 sp=0xc000487fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 35 gp=0xc0004841c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00048df38 sp=0xc00048df18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00048dfc8 sp=0xc00048df38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00048dfe0 sp=0xc00048dfc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00048dfe8 sp=0xc00048dfe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 12 gp=0xc000418a80 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000489f38 sp=0xc000489f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000489fc8 sp=0xc000489f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000489fe0 sp=0xc000489fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000489fe8 sp=0xc000489fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 19 gp=0xc000106380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000473f38 sp=0xc000473f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000473fc8 sp=0xc000473f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000473fe0 sp=0xc000473fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000473fe8 sp=0xc000473fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 36 gp=0xc000484380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000493f38 sp=0xc000493f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000493fc8 sp=0xc000493f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000493fe0 sp=0xc000493fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000493fe8 sp=0xc000493fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 13 gp=0xc000418c40 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00048ff38 sp=0xc00048ff18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00048ffc8 sp=0xc00048ff38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00048ffe0 sp=0xc00048ffc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00048ffe8 sp=0xc00048ffe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 20 gp=0xc000106540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000115f38 sp=0xc000115f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000115fc8 sp=0xc000115f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000115fe0 sp=0xc000115fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000115fe8 sp=0xc000115fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 21 gp=0xc000106700 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000117f38 sp=0xc000117f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000117fc8 sp=0xc000117f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000117fe0 sp=0xc000117fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000117fe8 sp=0xc000117fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 37 gp=0xc000484540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000495f38 sp=0xc000495f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000495fc8 sp=0xc000495f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000495fe0 sp=0xc000495fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000495fe8 sp=0xc000495fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 14 gp=0xc000418e00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000491f38 sp=0xc000491f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000491fc8 sp=0xc000491f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000491fe0 sp=0xc000491fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000491fe8 sp=0xc000491fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 22 gp=0xc0001068c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000111f38 sp=0xc000111f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000111fc8 sp=0xc000111f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000111fe0 sp=0xc000111fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000111fe8 sp=0xc000111fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 23 gp=0xc000106a80 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000113f38 sp=0xc000113f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000113fc8 sp=0xc000113f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000113fe0 sp=0xc000113fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000113fe8 sp=0xc000113fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 38 gp=0xc000484700 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00049bf38 sp=0xc00049bf18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00049bfc8 sp=0xc00049bf38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00049bfe0 sp=0xc00049bfc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00049bfe8 sp=0xc00049bfe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 15 gp=0xc000418fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000497f38 sp=0xc000497f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000497fc8 sp=0xc000497f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000497fe0 sp=0xc000497fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000497fe8 sp=0xc000497fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 16 gp=0xc000419180 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000499f38 sp=0xc000499f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000499fc8 sp=0xc000499f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000499fe0 sp=0xc000499fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000499fe8 sp=0xc000499fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 24 gp=0xc000106c40 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00011df38 sp=0xc00011df18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00011dfc8 sp=0xc00011df38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00011dfe0 sp=0xc00011dfc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00011dfe8 sp=0xc00011dfe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 39 gp=0xc0004848c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00049df38 sp=0xc00049df18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00049dfc8 sp=0xc00049df38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00049dfe0 sp=0xc00049dfc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00049dfe8 sp=0xc00049dfe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 50 gp=0xc000419340 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000119f38 sp=0xc000119f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000119fc8 sp=0xc000119f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000119fe0 sp=0xc000119fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000119fe8 sp=0xc000119fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 51 gp=0xc000419500 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b91ba40?, 0x1?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00011bf38 sp=0xc00011bf18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00011bfc8 sp=0xc00011bf38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00011bfe0 sp=0xc00011bfc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00011bfe8 sp=0xc00011bfe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 25 gp=0xc000106e00 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b8c05b4?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00011ff38 sp=0xc00011ff18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00011ffc8 sp=0xc00011ff38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00011ffe0 sp=0xc00011ffc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00011ffe8 sp=0xc00011ffe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 40 gp=0xc000484a80 m=nil [GC worker (idle)]:
runtime.gopark(0x7ff606481fa0?, 0x1?, 0x84?, 0x52?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0004a3f38 sp=0xc0004a3f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc0004a3fc8 sp=0xc0004a3f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc0004a3fe0 sp=0xc0004a3fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a3fe8 sp=0xc0004a3fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 52 gp=0xc0004196c0 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b91ba40?, 0x1?, 0x64?, 0x1b?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc00049ff38 sp=0xc00049ff18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00049ffc8 sp=0xc00049ff38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00049ffe0 sp=0xc00049ffc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00049ffe8 sp=0xc00049ffe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 26 gp=0xc000106fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x7ff606481fa0?, 0x1?, 0x84?, 0x52?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000127f38 sp=0xc000127f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000127fc8 sp=0xc000127f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000127fe0 sp=0xc000127fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000127fe8 sp=0xc000127fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 41 gp=0xc000484c40 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b91ba40?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0004a5f38 sp=0xc0004a5f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc0004a5fc8 sp=0xc0004a5f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc0004a5fe0 sp=0xc0004a5fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a5fe8 sp=0xc0004a5fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 53 gp=0xc000419880 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b91ba40?, 0x1?, 0x84?, 0x52?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0004a1f38 sp=0xc0004a1f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc0004a1fc8 sp=0xc0004a1f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc0004a1fe0 sp=0xc0004a1fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a1fe8 sp=0xc0004a1fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 27 gp=0xc000107180 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b8c05b4?, 0x1?, 0x90?, 0x1e?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000129f38 sp=0xc000129f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000129fc8 sp=0xc000129f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000129fe0 sp=0xc000129fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000129fe8 sp=0xc000129fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 54 gp=0xc000506380 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x0?, 0x0?, 0xc0?, 0x83?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000123e20 sp=0xc000123e00 pc=0x7ff60459598e
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.semacquire1(0xc00032e340, 0x0, 0x1, 0x0, 0x18)
	runtime/sema.go:188 +0x22f fp=0xc000123e88 sp=0xc000123e20 pc=0x7ff60457750f
sync.runtime_SemacquireWaitGroup(0x0?)
	runtime/sema.go:110 +0x25 fp=0xc000123ec0 sp=0xc000123e88 pc=0x7ff604596f85
sync.(*WaitGroup).Wait(0x0?)
	sync/waitgroup.go:118 +0x48 fp=0xc000123ee8 sp=0xc000123ec0 pc=0x7ff6045ab7a8
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc00032e320, {0x7ff605a83b80, 0xc0003a60f0})
	github.com/ollama/ollama/runner/llamarunner/runner.go:359 +0x4b fp=0xc000123fb8 sp=0xc000123ee8 pc=0x7ff604a1e08b
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
	github.com/ollama/ollama/runner/llamarunner/runner.go:979 +0x28 fp=0xc000123fe0 sp=0xc000123fb8 pc=0x7ff604a23528
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000123fe8 sp=0xc000123fe0 pc=0x7ff60459d8e1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
	github.com/ollama/ollama/runner/llamarunner/runner.go:979 +0x4c5

goroutine 55 gp=0xc000506540 m=nil [IO wait]:
runtime.gopark(0x0?, 0xc0003ac2a0?, 0x48?, 0xc3?, 0xc0003ac34c?)
	runtime/proc.go:435 +0xce fp=0xc00058b8c8 sp=0xc00058b8a8 pc=0x7ff60459598e
runtime.netpollblock(0x3dc?, 0x4530406?, 0xf6?)
	runtime/netpoll.go:575 +0xf7 fp=0xc00058b900 sp=0xc00058b8c8 pc=0x7ff60455bdf7
internal/poll.runtime_pollWait(0x2a6fb7ecff8, 0x72)
	runtime/netpoll.go:351 +0x85 fp=0xc00058b920 sp=0xc00058b900 pc=0x7ff604594b25
internal/poll.(*pollDesc).wait(0x7ff60475c9b7?, 0xc00058b970?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00058b948 sp=0xc00058b920 pc=0x7ff60462bc87
internal/poll.execIO(0xc0003ac2a0, 0x7ff605912278)
	internal/poll/fd_windows.go:177 +0x105 fp=0xc00058b9c0 sp=0xc00058b948 pc=0x7ff60462d0e5
internal/poll.(*FD).Read(0xc0003ac288, {0xc0003ce000, 0x1000, 0x1000})
	internal/poll/fd_windows.go:438 +0x29b fp=0xc00058ba60 sp=0xc00058b9c0 pc=0x7ff60462ddbb
net.(*netFD).Read(0xc0003ac288, {0xc0003ce000?, 0xc00058bad0?, 0x7ff60462c145?})
	net/fd_posix.go:55 +0x25 fp=0xc00058baa8 sp=0xc00058ba60 pc=0x7ff6046a1025
net.(*conn).Read(0xc000688058, {0xc0003ce000?, 0x0?, 0x0?})
	net/net.go:194 +0x45 fp=0xc00058baf0 sp=0xc00058baa8 pc=0x7ff6046b0505
net/http.(*connReader).Read(0xc00032c630, {0xc0003ce000, 0x1000, 0x1000})
	net/http/server.go:798 +0x159 fp=0xc00058bb40 sp=0xc00058baf0 pc=0x7ff60489d8f9
bufio.(*Reader).fill(0xc0000c24e0)
	bufio/bufio.go:113 +0x103 fp=0xc00058bb78 sp=0xc00058bb40 pc=0x7ff6046c6d43
bufio.(*Reader).Peek(0xc0000c24e0, 0x4)
	bufio/bufio.go:152 +0x53 fp=0xc00058bb98 sp=0xc00058bb78 pc=0x7ff6046c6e73
net/http.(*conn).serve(0xc0003301b0, {0x7ff605a83b48, 0xc00032c540})
	net/http/server.go:2137 +0x785 fp=0xc00058bfb8 sp=0xc00058bb98 pc=0x7ff6048a36e5
net/http.(*Server).Serve.gowrap3()
	net/http/server.go:3454 +0x28 fp=0xc00058bfe0 sp=0xc00058bfb8 pc=0x7ff6048a8e48
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00058bfe8 sp=0xc00058bfe0 pc=0x7ff60459d8e1
created by net/http.(*Server).Serve in goroutine 1
	net/http/server.go:3454 +0x485
rax     0x0
rbx     0x2a6a1ff2e48
rcx     0x2a6a1ff2e48
rdx     0x2a6a1ff2e48
rdi     0x2a6a1ff2e18
rsi     0x0
rbp     0x2a6a1ff2e48
rsp     0xaf50d2e400
r8      0xfffffffd00000000
r9      0x2a6a1ff2df8
r10     0x5f
r11     0xc000012d
r12     0x0
r13     0x0
r14     0x2a6a1ff2e18
r15     0xaf50d2e5f0
rip     0x7ffb26419176
rflags  0x10246
cs      0x33
fs      0x53
gs      0x2b
time=2025-11-20T14:01:57.543+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server not responding"
time=2025-11-20T14:01:57.794+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server error"
time=2025-11-20T14:02:00.557+08:00 level=INFO source=sched.go:470 msg="Load failed" model=D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 error="llama runner process has terminated: exit status 2"
[GIN] 2025/11/20 - 14:02:00 | 500 |    49.853006s |       127.0.0.1 | POST     "/api/generate"

OS

Windows

GPU

AMD

CPU

AMD

Ollama version

0.13.0

Originally created by @MapleZJH on GitHub (Nov 20, 2025). Original GitHub issue: https://github.com/ollama/ollama/issues/13171 ### What is the issue? When I choose Airplane mode want to keep my model local, I face this problem: cmd: ollama run deepseek-r1:32b Error: 500 Internal Server Error: llama runner process has terminated: exit status 2 I using deepseek-r1:32b and deepseek-r1:70b models, they have been downloaded local already. The setting of ollama are as follows: 1. Expose ollama to the network: off; 2. Airplane mode: on; 3. system: OLLAMA_HOST: 127.0.0.1; 4. system: OLLAMA_MODEL: D:\LLM\Models_ollama (which is where my models locate) Ollama version: 0.13.0 CPU: AMD RYZEN AI MAX+ PRO 395w GPU: Radeon 8060s GPU MEMORY: 96.0GB ### Relevant log output ```shell time=2025-11-20T14:01:57.543+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server not responding" time=2025-11-20T14:01:57.794+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server error" time=2025-11-20T14:02:00.557+08:00 level=INFO source=sched.go:470 msg="Load failed" model=D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 error="llama runner process has terminated: exit status 2" [GIN] 2025/11/20 - 14:02:00 | 500 | 49.853006s | 127.0.0.1 | POST "/api/generate" time=2025-11-20T13:53:41.540+08:00 level=INFO source=routes.go:1544 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GGML_VK_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:32768 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:D:\\LLM\\Models_ollama OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false 
OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_REMOTES:[ollama.com] OLLAMA_SCHED_SPREAD:false OLLAMA_VULKAN:false ROCR_VISIBLE_DEVICES:]" time=2025-11-20T13:53:41.566+08:00 level=INFO source=images.go:522 msg="total blobs: 11" time=2025-11-20T13:53:41.567+08:00 level=INFO source=images.go:529 msg="total unused blobs removed: 0" time=2025-11-20T13:53:41.568+08:00 level=INFO source=routes.go:1597 msg="Listening on [::]:11434 (version 0.13.0)" time=2025-11-20T13:53:41.569+08:00 level=INFO source=runner.go:67 msg="discovering available GPUs..." time=2025-11-20T13:53:41.569+08:00 level=INFO source=runner.go:102 msg="experimental Vulkan support disabled. To enable, set OLLAMA_VULKAN=1" time=2025-11-20T13:53:41.608+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 58346" time=2025-11-20T13:53:43.013+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 49825" time=2025-11-20T13:55:13.013+08:00 level=INFO source=runner.go:449 msg="failure during GPU discovery" OLLAMA_LIBRARY_PATH="[C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\lib\\ollama C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\lib\\ollama\\cuda_v13]" extra_envs=map[] error="failed to finish discovery before timeout" time=2025-11-20T13:55:13.031+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 57508" time=2025-11-20T13:55:14.647+08:00 level=INFO source=server.go:392 msg="starting runner" 
cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 57577" time=2025-11-20T13:55:17.215+08:00 level=INFO source=types.go:42 msg="inference compute" id=0 filter_id=0 library=ROCm compute=gfx1151 name=ROCm0 description="AMD Radeon(TM) 8060S Graphics" libdirs=ollama,rocm driver=60241.51 pci_id=0000:c3:00.0 type=iGPU total="96.0 GiB" available="94.7 GiB" [GIN] 2025/11/20 - 13:55:17 | 200 | 0s | 127.0.0.1 | GET "/api/version" [GIN] 2025/11/20 - 13:58:28 | 200 | 0s | 127.0.0.1 | GET "/api/version" [GIN] 2025/11/20 - 13:58:39 | 200 | 0s | 127.0.0.1 | GET "/api/version" [GIN] 2025/11/20 - 13:58:48 | 200 | 0s | 127.0.0.1 | HEAD "/" [GIN] 2025/11/20 - 13:58:48 | 200 | 1.6596ms | 127.0.0.1 | GET "/api/tags" [GIN] 2025/11/20 - 14:01:10 | 200 | 0s | 127.0.0.1 | HEAD "/" [GIN] 2025/11/20 - 14:01:10 | 200 | 95.0704ms | 127.0.0.1 | POST "/api/show" [GIN] 2025/11/20 - 14:01:10 | 200 | 83.4468ms | 127.0.0.1 | POST "/api/show" time=2025-11-20T14:01:10.908+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --ollama-engine --port 62280" time=2025-11-20T14:01:11.828+08:00 level=INFO source=cpu_windows.go:148 msg=packages count=1 time=2025-11-20T14:01:11.829+08:00 level=INFO source=cpu_windows.go:195 msg="" package=0 cores=16 efficiency=0 threads=32 llama_model_loader: loaded meta data with 26 key-value pairs and 771 tensors from D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. 
llama_model_loader: - kv 0: general.architecture str = qwen2 llama_model_loader: - kv 1: general.type str = model llama_model_loader: - kv 2: general.name str = DeepSeek R1 Distill Qwen 32B llama_model_loader: - kv 3: general.basename str = DeepSeek-R1-Distill-Qwen llama_model_loader: - kv 4: general.size_label str = 32B llama_model_loader: - kv 5: qwen2.block_count u32 = 64 llama_model_loader: - kv 6: qwen2.context_length u32 = 131072 llama_model_loader: - kv 7: qwen2.embedding_length u32 = 5120 llama_model_loader: - kv 8: qwen2.feed_forward_length u32 = 27648 llama_model_loader: - kv 9: qwen2.attention.head_count u32 = 40 llama_model_loader: - kv 10: qwen2.attention.head_count_kv u32 = 8 llama_model_loader: - kv 11: qwen2.rope.freq_base f32 = 1000000.000000 llama_model_loader: - kv 12: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000010 llama_model_loader: - kv 13: general.file_type u32 = 15 llama_model_loader: - kv 14: tokenizer.ggml.model str = gpt2 llama_model_loader: - kv 15: tokenizer.ggml.pre str = deepseek-r1-qwen llama_model_loader: - kv 16: tokenizer.ggml.tokens arr[str,152064] = ["!", "\"", "#", "$", "%", "&", "'", ... llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... llama_model_loader: - kv 18: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",... llama_model_loader: - kv 19: tokenizer.ggml.bos_token_id u32 = 151646 llama_model_loader: - kv 20: tokenizer.ggml.eos_token_id u32 = 151643 llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 151643 llama_model_loader: - kv 22: tokenizer.ggml.add_bos_token bool = true llama_model_loader: - kv 23: tokenizer.ggml.add_eos_token bool = false llama_model_loader: - kv 24: tokenizer.chat_template str = {% if not add_generation_prompt is de... 
llama_model_loader: - kv 25: general.quantization_version u32 = 2 llama_model_loader: - type f32: 321 tensors llama_model_loader: - type q4_K: 385 tensors llama_model_loader: - type q6_K: 65 tensors print_info: file format = GGUF V3 (latest) print_info: file type = Q4_K - Medium print_info: file size = 18.48 GiB (4.85 BPW) load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect load: printing all EOG tokens: load: - 151643 ('<|end▁of▁sentence|>') load: - 151662 ('<|fim_pad|>') load: - 151663 ('<|repo_name|>') load: - 151664 ('<|file_sep|>') load: special tokens cache size = 22 load: token to piece cache size = 0.9310 MB print_info: arch = qwen2 print_info: vocab_only = 1 print_info: model type = ?B print_info: model params = 32.76 B print_info: general.name = DeepSeek R1 Distill Qwen 32B print_info: vocab type = BPE print_info: n_vocab = 152064 print_info: n_merges = 151387 print_info: BOS token = 151646 '<|begin▁of▁sentence|>' print_info: EOS token = 151643 '<|end▁of▁sentence|>' print_info: EOT token = 151643 '<|end▁of▁sentence|>' print_info: PAD token = 151643 '<|end▁of▁sentence|>' print_info: LF token = 198 'Ċ' print_info: FIM PRE token = 151659 '<|fim_prefix|>' print_info: FIM SUF token = 151661 '<|fim_suffix|>' print_info: FIM MID token = 151660 '<|fim_middle|>' print_info: FIM PAD token = 151662 '<|fim_pad|>' print_info: FIM REP token = 151663 '<|repo_name|>' print_info: FIM SEP token = 151664 '<|file_sep|>' print_info: EOG token = 151643 '<|end▁of▁sentence|>' print_info: EOG token = 151662 '<|fim_pad|>' print_info: EOG token = 151663 '<|repo_name|>' print_info: EOG token = 151664 '<|file_sep|>' print_info: max token length = 256 llama_model_load: vocab only - skipping tensors time=2025-11-20T14:01:12.167+08:00 level=INFO source=server.go:392 msg="starting runner" cmd="C:\\Users\\ZJH\\AppData\\Local\\Programs\\Ollama\\ollama.exe runner --model 
D:\\LLM\\Models_ollama\\blobs\\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 --port 62302" time=2025-11-20T14:01:12.195+08:00 level=INFO source=sched.go:443 msg="system memory" total="31.8 GiB" free="20.9 GiB" free_swap="15.1 GiB" time=2025-11-20T14:01:12.195+08:00 level=INFO source=sched.go:450 msg="gpu memory" id=0 library=ROCm available="94.1 GiB" free="94.6 GiB" minimum="457.0 MiB" overhead="0 B" time=2025-11-20T14:01:12.195+08:00 level=INFO source=server.go:459 msg="loading model" "model layers"=65 requested=-1 time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:240 msg="model weights" device=ROCm0 size="18.1 GiB" time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:251 msg="kv cache" device=ROCm0 size="8.0 GiB" time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:262 msg="compute graph" device=ROCm0 size="2.6 GiB" time=2025-11-20T14:01:12.196+08:00 level=INFO source=device.go:272 msg="total memory" size="28.7 GiB" time=2025-11-20T14:01:12.822+08:00 level=INFO source=runner.go:963 msg="starting go runner" load_backend: loaded CPU backend from C:\Users\ZJH\AppData\Local\Programs\Ollama\lib\ollama\ggml-cpu-icelake.dll ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no ggml_cuda_init: found 1 ROCm devices: Device 0: AMD Radeon(TM) 8060S Graphics, gfx1151 (0x1151), VMM: no, Wave Size: 32, ID: 0 load_backend: loaded ROCm backend from C:\Users\ZJH\AppData\Local\Programs\Ollama\lib\ollama\rocm\ggml-hip.dll time=2025-11-20T14:01:12.927+08:00 level=INFO source=ggml.go:104 msg=system CPU.0.SSE3=1 CPU.0.SSSE3=1 CPU.0.AVX=1 CPU.0.AVX2=1 CPU.0.F16C=1 CPU.0.FMA=1 CPU.0.BMI2=1 CPU.0.AVX512=1 CPU.0.AVX512_VBMI=1 CPU.0.AVX512_VNNI=1 CPU.0.LLAMAFILE=1 CPU.1.LLAMAFILE=1 ROCm.0.NO_VMM=1 ROCm.0.NO_PEER_COPY=1 ROCm.0.PEER_MAX_BATCH_SIZE=128 compiler=cgo(clang) time=2025-11-20T14:01:12.929+08:00 level=INFO source=runner.go:999 msg="Server listening on 127.0.0.1:62302" 
time=2025-11-20T14:01:12.941+08:00 level=INFO source=runner.go:893 msg=load request="{Operation:commit LoraPath:[] Parallel:1 BatchSize:512 FlashAttention:false KvSize:32768 KvCacheType: NumThreads:16 GPULayers:65[ID:0 Layers:65(0..64)] MultiUserCache:false ProjectorPath: MainGPU:0 UseMmap:true}" time=2025-11-20T14:01:12.941+08:00 level=INFO source=server.go:1294 msg="waiting for llama runner to start responding" time=2025-11-20T14:01:12.941+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server loading model" ggml_hip_mgmt_init located ADLX version 1.3 ggml_backend_cuda_device_get_memory device 0000:c3:00.0 utilizing AMD specific memory reporting free: 101567168512 total: 103079215104 llama_model_load_from_file_impl: using device ROCm0 (AMD Radeon(TM) 8060S Graphics) (0000:c3:00.0) - 96862 MiB free llama_model_loader: loaded meta data with 26 key-value pairs and 771 tensors from D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. 
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = DeepSeek R1 Distill Qwen 32B
llama_model_loader: - kv 3: general.basename str = DeepSeek-R1-Distill-Qwen
llama_model_loader: - kv 4: general.size_label str = 32B
llama_model_loader: - kv 5: qwen2.block_count u32 = 64
llama_model_loader: - kv 6: qwen2.context_length u32 = 131072
llama_model_loader: - kv 7: qwen2.embedding_length u32 = 5120
llama_model_loader: - kv 8: qwen2.feed_forward_length u32 = 27648
llama_model_loader: - kv 9: qwen2.attention.head_count u32 = 40
llama_model_loader: - kv 10: qwen2.attention.head_count_kv u32 = 8
llama_model_loader: - kv 11: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 12: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv 13: general.file_type u32 = 15
llama_model_loader: - kv 14: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 15: tokenizer.ggml.pre str = deepseek-r1-qwen
llama_model_loader: - kv 16: tokenizer.ggml.tokens arr[str,152064] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 18: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 19: tokenizer.ggml.bos_token_id u32 = 151646
llama_model_loader: - kv 20: tokenizer.ggml.eos_token_id u32 = 151643
llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 22: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 23: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 24: tokenizer.chat_template str = {% if not add_generation_prompt is de...
llama_model_loader: - kv 25: general.quantization_version u32 = 2
llama_model_loader: - type f32: 321 tensors
llama_model_loader: - type q4_K: 385 tensors
llama_model_loader: - type q6_K: 65 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 18.48 GiB (4.85 BPW)
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: printing all EOG tokens:
load: - 151643 ('<|end▁of▁sentence|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch = qwen2
print_info: vocab_only = 0
print_info: n_ctx_train = 131072
print_info: n_embd = 5120
print_info: n_layer = 64
print_info: n_head = 40
print_info: n_head_kv = 8
print_info: n_rot = 128
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 128
print_info: n_embd_head_v = 128
print_info: n_gqa = 5
print_info: n_embd_k_gqa = 1024
print_info: n_embd_v_gqa = 1024
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-05
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 27648
print_info: n_expert = 0
print_info: n_expert_used = 0
print_info: causal attn = 1
print_info: pooling type = -1
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 1000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 131072
print_info: rope_finetuned = unknown
print_info: model type = 32B
print_info: model params = 32.76 B
print_info: general.name = DeepSeek R1 Distill Qwen 32B
print_info: vocab type = BPE
print_info: n_vocab = 152064
print_info: n_merges = 151387
print_info: BOS token = 151646 '<|begin▁of▁sentence|>'
print_info: EOS token = 151643 '<|end▁of▁sentence|>'
print_info: EOT token = 151643 '<|end▁of▁sentence|>'
print_info: PAD token = 151643 '<|end▁of▁sentence|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|end▁of▁sentence|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = true)
load_tensors: offloading 64 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 65/65 layers to GPU
load_tensors: ROCm0 model buffer size = 18508.35 MiB
load_tensors: CPU_Mapped model buffer size = 417.66 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 32768
llama_context: n_ctx_per_seq = 32768
llama_context: n_batch = 512
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = disabled
llama_context: kv_unified = false
llama_context: freq_base = 1000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (32768) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_context: ROCm_Host output buffer size = 0.60 MiB
Exception 0xc0000005 0x1 0x10 0x7ffb26419176
PC=0x7ffb26419176
signal arrived during external code execution

runtime.cgocall(0x7ff6052c88e0, 0xc0003e9c00)
	runtime/cgocall.go:167 +0x3e fp=0xc0003e9bd8 sp=0xc0003e9b70 pc=0x7ff60459243e
github.com/ollama/ollama/llama._Cfunc_llama_init_from_model(0x2a6a1db4020, {0x8000, 0x200, 0x200, 0x1, 0x10, 0x10, 0xffffffff, 0xffffffff, 0xffffffff, ...})
	_cgo_gotypes.go:754 +0x54 fp=0xc0003e9c00 sp=0xc0003e9bd8 pc=0x7ff604962d34
github.com/ollama/ollama/llama.NewContextWithModel.func1(...)
	github.com/ollama/ollama/llama/llama.go:317
github.com/ollama/ollama/llama.NewContextWithModel(0xc00037f9f0, {{0x8000, 0x200, 0x200, 0x1, 0x10, 0x10, 0xffffffff, 0xffffffff, 0xffffffff, ...}})
	github.com/ollama/ollama/llama/llama.go:317 +0x158 fp=0xc0003e9da0 sp=0xc0003e9c00 pc=0x7ff6049672f8
github.com/ollama/ollama/runner/llamarunner.(*Server).loadModel(0xc00032e320, {{0xc0003c0f80, 0x1, 0x1}, 0x41, 0x0, 0x1, {0xc0003c0f78, 0x1, 0x2}, ...}, ...)
	github.com/ollama/ollama/runner/llamarunner/runner.go:845 +0x178 fp=0xc0003e9ee8 sp=0xc0003e9da0 pc=0x7ff604a216d8
github.com/ollama/ollama/runner/llamarunner.(*Server).load.gowrap2()
	github.com/ollama/ollama/runner/llamarunner/runner.go:932 +0x115 fp=0xc0003e9fe0 sp=0xc0003e9ee8 pc=0x7ff604a228f5
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0003e9fe8 sp=0xc0003e9fe0 pc=0x7ff60459d8e1
created by github.com/ollama/ollama/runner/llamarunner.(*Server).load in goroutine 55
	github.com/ollama/ollama/runner/llamarunner/runner.go:932 +0x88a

goroutine 1 gp=0xc0000021c0 m=nil [IO wait]:
runtime.gopark(0x7ff60459f0e0?, 0x7ff60640ca00?, 0x20?, 0xc0?, 0xc0003ac0cc?)
	runtime/proc.go:435 +0xce fp=0xc00058f648 sp=0xc00058f628 pc=0x7ff60459598e
runtime.netpollblock(0x3d4?, 0x4530406?, 0xf6?)
	runtime/netpoll.go:575 +0xf7 fp=0xc00058f680 sp=0xc00058f648 pc=0x7ff60455bdf7
internal/poll.runtime_pollWait(0x2a6fb7ed110, 0x72)
	runtime/netpoll.go:351 +0x85 fp=0xc00058f6a0 sp=0xc00058f680 pc=0x7ff604594b25
internal/poll.(*pollDesc).wait(0x7ff60462a693?, 0x0?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00058f6c8 sp=0xc00058f6a0 pc=0x7ff60462bc87
internal/poll.execIO(0xc0003ac020, 0xc00050f770)
	internal/poll/fd_windows.go:177 +0x105 fp=0xc00058f740 sp=0xc00058f6c8 pc=0x7ff60462d0e5
internal/poll.(*FD).acceptOne(0xc0003ac008, 0x3a0, {0xc0003a41e0?, 0xc00050f7d0?, 0x7ff604634da5?}, 0xc00050f804?)
internal/poll/fd_windows.go:946 +0x65 fp=0xc00058f7a0 sp=0xc00058f740 pc=0x7ff604631665 internal/poll.(*FD).Accept(0xc0003ac008, 0xc00058f950) internal/poll/fd_windows.go:980 +0x1b6 fp=0xc00058f858 sp=0xc00058f7a0 pc=0x7ff604631996 net.(*netFD).accept(0xc0003ac008) net/fd_windows.go:182 +0x4b fp=0xc00058f970 sp=0xc00058f858 pc=0x7ff6046a2f0b net.(*TCPListener).accept(0xc000308300) net/tcpsock_posix.go:159 +0x1b fp=0xc00058f9c0 sp=0xc00058f970 pc=0x7ff6046b8f5b net.(*TCPListener).Accept(0xc000308300) net/tcpsock.go:380 +0x30 fp=0xc00058f9f0 sp=0xc00058f9c0 pc=0x7ff6046b7d10 net/http.(*onceCloseListener).Accept(0xc0003301b0?) <autogenerated>:1 +0x24 fp=0xc00058fa08 sp=0xc00058f9f0 pc=0x7ff6048d1184 net/http.(*Server).Serve(0xc000320700, {0x7ff605a81580, 0xc000308300}) net/http/server.go:3424 +0x30c fp=0xc00058fb38 sp=0xc00058fa08 pc=0x7ff6048a8a4c github.com/ollama/ollama/runner/llamarunner.Execute({0xc0000da020, 0x4, 0x6}) github.com/ollama/ollama/runner/llamarunner/runner.go:1000 +0x8f5 fp=0xc00058fd08 sp=0xc00058fb38 pc=0x7ff604a232b5 github.com/ollama/ollama/runner.Execute({0xc0000da010?, 0x0?, 0x0?}) github.com/ollama/ollama/runner/runner.go:22 +0xd4 fp=0xc00058fd30 sp=0xc00058fd08 pc=0x7ff604ac9714 github.com/ollama/ollama/cmd.NewCLI.func2(0xc000320300?, {0x7ff60589ad9d?, 0x4?, 0x7ff60589ada1?}) github.com/ollama/ollama/cmd/cmd.go:1841 +0x45 fp=0xc00058fd58 sp=0xc00058fd30 pc=0x7ff605259145 github.com/spf13/cobra.(*Command).execute(0xc000235508, {0xc0003080c0, 0x4, 0x4}) github.com/spf13/cobra@v1.7.0/command.go:940 +0x85c fp=0xc00058fe78 sp=0xc00058fd58 pc=0x7ff60471d9dc github.com/spf13/cobra.(*Command).ExecuteC(0xc000166908) github.com/spf13/cobra@v1.7.0/command.go:1068 +0x3a5 fp=0xc00058ff30 sp=0xc00058fe78 pc=0x7ff60471e225 github.com/spf13/cobra.(*Command).Execute(...) github.com/spf13/cobra@v1.7.0/command.go:992 github.com/spf13/cobra.(*Command).ExecuteContext(...) 
github.com/spf13/cobra@v1.7.0/command.go:985 main.main() github.com/ollama/ollama/main.go:12 +0x4d fp=0xc00058ff50 sp=0xc00058ff30 pc=0x7ff605259c2d runtime.main() runtime/proc.go:283 +0x27d fp=0xc00058ffe0 sp=0xc00058ff50 pc=0x7ff604564ddd runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00058ffe8 sp=0xc00058ffe0 pc=0x7ff60459d8e1 goroutine 2 gp=0xc0000028c0 m=nil [force gc (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc0000a7fa8 sp=0xc0000a7f88 pc=0x7ff60459598e runtime.goparkunlock(...) runtime/proc.go:441 runtime.forcegchelper() runtime/proc.go:348 +0xb8 fp=0xc0000a7fe0 sp=0xc0000a7fa8 pc=0x7ff6045650f8 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a7fe8 sp=0xc0000a7fe0 pc=0x7ff60459d8e1 created by runtime.init.7 in goroutine 1 runtime/proc.go:336 +0x1a goroutine 3 gp=0xc000002c40 m=nil [GC sweep wait]: runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc0000a9f80 sp=0xc0000a9f60 pc=0x7ff60459598e runtime.goparkunlock(...) runtime/proc.go:441 runtime.bgsweep(0xc0000b6000) runtime/mgcsweep.go:316 +0xdf fp=0xc0000a9fc8 sp=0xc0000a9f80 pc=0x7ff60454debf runtime.gcenable.gowrap1() runtime/mgc.go:204 +0x25 fp=0xc0000a9fe0 sp=0xc0000a9fc8 pc=0x7ff604542285 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000a9fe8 sp=0xc0000a9fe0 pc=0x7ff60459d8e1 created by runtime.gcenable in goroutine 1 runtime/mgc.go:204 +0x66 goroutine 4 gp=0xc000002e00 m=nil [GC scavenge wait]: runtime.gopark(0x10000?, 0x7ff605a6de10?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc0000bdf78 sp=0xc0000bdf58 pc=0x7ff60459598e runtime.goparkunlock(...) 
runtime/proc.go:441 runtime.(*scavengerState).park(0x7ff6064333c0) runtime/mgcscavenge.go:425 +0x49 fp=0xc0000bdfa8 sp=0xc0000bdf78 pc=0x7ff60454b909 runtime.bgscavenge(0xc0000b6000) runtime/mgcscavenge.go:658 +0x59 fp=0xc0000bdfc8 sp=0xc0000bdfa8 pc=0x7ff60454be99 runtime.gcenable.gowrap2() runtime/mgc.go:205 +0x25 fp=0xc0000bdfe0 sp=0xc0000bdfc8 pc=0x7ff604542225 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bdfe8 sp=0xc0000bdfe0 pc=0x7ff60459d8e1 created by runtime.gcenable in goroutine 1 runtime/mgc.go:205 +0xa5 goroutine 5 gp=0xc000003340 m=nil [finalizer wait]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc0000bfe30 sp=0xc0000bfe10 pc=0x7ff60459598e runtime.runfinq() runtime/mfinal.go:196 +0x107 fp=0xc0000bffe0 sp=0xc0000bfe30 pc=0x7ff604541207 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bffe8 sp=0xc0000bffe0 pc=0x7ff60459d8e1 created by runtime.createfing in goroutine 1 runtime/mfinal.go:166 +0x3d goroutine 6 gp=0xc000003dc0 m=nil [chan receive]: runtime.gopark(0xc0002295e0?, 0xc000694000?, 0x60?, 0xbf?, 0x7ff60468be48?) runtime/proc.go:435 +0xce fp=0xc0000abf18 sp=0xc0000abef8 pc=0x7ff60459598e runtime.chanrecv(0xc000038460, 0x0, 0x1) runtime/chan.go:664 +0x445 fp=0xc0000abf90 sp=0xc0000abf18 pc=0x7ff604532d45 runtime.chanrecv1(0x7ff604564f40?, 0xc0000abf76?) runtime/chan.go:506 +0x12 fp=0xc0000abfb8 sp=0xc0000abf90 pc=0x7ff6045328d2 runtime.unique_runtime_registerUniqueMapCleanup.func2(...) runtime/mgc.go:1796 runtime.unique_runtime_registerUniqueMapCleanup.gowrap1() runtime/mgc.go:1799 +0x2f fp=0xc0000abfe0 sp=0xc0000abfb8 pc=0x7ff6045454af runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000abfe8 sp=0xc0000abfe0 pc=0x7ff60459d8e1 created by unique.runtime_registerUniqueMapCleanup in goroutine 1 runtime/mgc.go:1794 +0x85 goroutine 7 gp=0xc0004181c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc0000b9f38 sp=0xc0000b9f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc0000b9fc8 sp=0xc0000b9f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc0000b9fe0 sp=0xc0000b9fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000b9fe8 sp=0xc0000b9fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 8 gp=0xc000418380 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc0000bbf38 sp=0xc0000bbf18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc0000bbfc8 sp=0xc0000bbf38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc0000bbfe0 sp=0xc0000bbfc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0000bbfe8 sp=0xc0000bbfe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 9 gp=0xc000418540 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000475f38 sp=0xc000475f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000475fc8 sp=0xc000475f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000475fe0 sp=0xc000475fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000475fe8 sp=0xc000475fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 10 gp=0xc000418700 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc000477f38 sp=0xc000477f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000477fc8 sp=0xc000477f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000477fe0 sp=0xc000477fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000477fe8 sp=0xc000477fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 18 gp=0xc0001061c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000471f38 sp=0xc000471f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000471fc8 sp=0xc000471f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000471fe0 sp=0xc000471fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000471fe8 sp=0xc000471fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 34 gp=0xc000484000 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00048bf38 sp=0xc00048bf18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00048bfc8 sp=0xc00048bf38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00048bfe0 sp=0xc00048bfc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00048bfe8 sp=0xc00048bfe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 11 gp=0xc0004188c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc000487f38 sp=0xc000487f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000487fc8 sp=0xc000487f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000487fe0 sp=0xc000487fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000487fe8 sp=0xc000487fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 35 gp=0xc0004841c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00048df38 sp=0xc00048df18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00048dfc8 sp=0xc00048df38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00048dfe0 sp=0xc00048dfc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00048dfe8 sp=0xc00048dfe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 12 gp=0xc000418a80 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000489f38 sp=0xc000489f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000489fc8 sp=0xc000489f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000489fe0 sp=0xc000489fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000489fe8 sp=0xc000489fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 19 gp=0xc000106380 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc000473f38 sp=0xc000473f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000473fc8 sp=0xc000473f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000473fe0 sp=0xc000473fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000473fe8 sp=0xc000473fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 36 gp=0xc000484380 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000493f38 sp=0xc000493f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000493fc8 sp=0xc000493f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000493fe0 sp=0xc000493fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000493fe8 sp=0xc000493fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 13 gp=0xc000418c40 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00048ff38 sp=0xc00048ff18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00048ffc8 sp=0xc00048ff38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00048ffe0 sp=0xc00048ffc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00048ffe8 sp=0xc00048ffe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 20 gp=0xc000106540 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc000115f38 sp=0xc000115f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000115fc8 sp=0xc000115f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000115fe0 sp=0xc000115fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000115fe8 sp=0xc000115fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 21 gp=0xc000106700 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000117f38 sp=0xc000117f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000117fc8 sp=0xc000117f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000117fe0 sp=0xc000117fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000117fe8 sp=0xc000117fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 37 gp=0xc000484540 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000495f38 sp=0xc000495f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000495fc8 sp=0xc000495f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000495fe0 sp=0xc000495fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000495fe8 sp=0xc000495fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 14 gp=0xc000418e00 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc000491f38 sp=0xc000491f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000491fc8 sp=0xc000491f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000491fe0 sp=0xc000491fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000491fe8 sp=0xc000491fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 22 gp=0xc0001068c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000111f38 sp=0xc000111f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000111fc8 sp=0xc000111f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000111fe0 sp=0xc000111fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000111fe8 sp=0xc000111fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 23 gp=0xc000106a80 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000113f38 sp=0xc000113f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000113fc8 sp=0xc000113f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000113fe0 sp=0xc000113fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000113fe8 sp=0xc000113fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 38 gp=0xc000484700 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc00049bf38 sp=0xc00049bf18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00049bfc8 sp=0xc00049bf38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00049bfe0 sp=0xc00049bfc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00049bfe8 sp=0xc00049bfe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 15 gp=0xc000418fc0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000497f38 sp=0xc000497f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000497fc8 sp=0xc000497f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000497fe0 sp=0xc000497fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000497fe8 sp=0xc000497fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 16 gp=0xc000419180 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000499f38 sp=0xc000499f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000499fc8 sp=0xc000499f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000499fe0 sp=0xc000499fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000499fe8 sp=0xc000499fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 24 gp=0xc000106c40 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc00011df38 sp=0xc00011df18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00011dfc8 sp=0xc00011df38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00011dfe0 sp=0xc00011dfc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011dfe8 sp=0xc00011dfe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 39 gp=0xc0004848c0 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00049df38 sp=0xc00049df18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00049dfc8 sp=0xc00049df38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00049dfe0 sp=0xc00049dfc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00049dfe8 sp=0xc00049dfe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 50 gp=0xc000419340 m=nil [GC worker (idle)]: runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc000119f38 sp=0xc000119f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc000119fc8 sp=0xc000119f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc000119fe0 sp=0xc000119fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc000119fe8 sp=0xc000119fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 51 gp=0xc000419500 m=nil [GC worker (idle)]: runtime.gopark(0x4eb00b91ba40?, 0x1?, 0x0?, 0x0?, 0x0?) 
runtime/proc.go:435 +0xce fp=0xc00011bf38 sp=0xc00011bf18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00011bfc8 sp=0xc00011bf38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00011bfe0 sp=0xc00011bfc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011bfe8 sp=0xc00011bfe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 25 gp=0xc000106e00 m=nil [GC worker (idle)]: runtime.gopark(0x4eb00b8c05b4?, 0x0?, 0x0?, 0x0?, 0x0?) runtime/proc.go:435 +0xce fp=0xc00011ff38 sp=0xc00011ff18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc00011ffc8 sp=0xc00011ff38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc00011ffe0 sp=0xc00011ffc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc00011ffe8 sp=0xc00011ffe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 40 gp=0xc000484a80 m=nil [GC worker (idle)]: runtime.gopark(0x7ff606481fa0?, 0x1?, 0x84?, 0x52?, 0x0?) runtime/proc.go:435 +0xce fp=0xc0004a3f38 sp=0xc0004a3f18 pc=0x7ff60459598e runtime.gcBgMarkWorker(0xc000039880) runtime/mgc.go:1423 +0xe9 fp=0xc0004a3fc8 sp=0xc0004a3f38 pc=0x7ff6045447a9 runtime.gcBgMarkStartWorkers.gowrap1() runtime/mgc.go:1339 +0x25 fp=0xc0004a3fe0 sp=0xc0004a3fc8 pc=0x7ff604544685 runtime.goexit({}) runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a3fe8 sp=0xc0004a3fe0 pc=0x7ff60459d8e1 created by runtime.gcBgMarkStartWorkers in goroutine 1 runtime/mgc.go:1339 +0x105 goroutine 52 gp=0xc0004196c0 m=nil [GC worker (idle)]: runtime.gopark(0x4eb00b91ba40?, 0x1?, 0x64?, 0x1b?, 0x0?) 
	runtime/proc.go:435 +0xce fp=0xc00049ff38 sp=0xc00049ff18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc00049ffc8 sp=0xc00049ff38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc00049ffe0 sp=0xc00049ffc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00049ffe8 sp=0xc00049ffe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 26 gp=0xc000106fc0 m=nil [GC worker (idle)]:
runtime.gopark(0x7ff606481fa0?, 0x1?, 0x84?, 0x52?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000127f38 sp=0xc000127f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000127fc8 sp=0xc000127f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000127fe0 sp=0xc000127fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000127fe8 sp=0xc000127fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 41 gp=0xc000484c40 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b91ba40?, 0x0?, 0x0?, 0x0?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0004a5f38 sp=0xc0004a5f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc0004a5fc8 sp=0xc0004a5f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc0004a5fe0 sp=0xc0004a5fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a5fe8 sp=0xc0004a5fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 53 gp=0xc000419880 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b91ba40?, 0x1?, 0x84?, 0x52?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc0004a1f38 sp=0xc0004a1f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc0004a1fc8 sp=0xc0004a1f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc0004a1fe0 sp=0xc0004a1fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc0004a1fe8 sp=0xc0004a1fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 27 gp=0xc000107180 m=nil [GC worker (idle)]:
runtime.gopark(0x4eb00b8c05b4?, 0x1?, 0x90?, 0x1e?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000129f38 sp=0xc000129f18 pc=0x7ff60459598e
runtime.gcBgMarkWorker(0xc000039880)
	runtime/mgc.go:1423 +0xe9 fp=0xc000129fc8 sp=0xc000129f38 pc=0x7ff6045447a9
runtime.gcBgMarkStartWorkers.gowrap1()
	runtime/mgc.go:1339 +0x25 fp=0xc000129fe0 sp=0xc000129fc8 pc=0x7ff604544685
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000129fe8 sp=0xc000129fe0 pc=0x7ff60459d8e1
created by runtime.gcBgMarkStartWorkers in goroutine 1
	runtime/mgc.go:1339 +0x105

goroutine 54 gp=0xc000506380 m=nil [sync.WaitGroup.Wait]:
runtime.gopark(0x0?, 0x0?, 0xc0?, 0x83?, 0x0?)
	runtime/proc.go:435 +0xce fp=0xc000123e20 sp=0xc000123e00 pc=0x7ff60459598e
runtime.goparkunlock(...)
	runtime/proc.go:441
runtime.semacquire1(0xc00032e340, 0x0, 0x1, 0x0, 0x18)
	runtime/sema.go:188 +0x22f fp=0xc000123e88 sp=0xc000123e20 pc=0x7ff60457750f
sync.runtime_SemacquireWaitGroup(0x0?)
	runtime/sema.go:110 +0x25 fp=0xc000123ec0 sp=0xc000123e88 pc=0x7ff604596f85
sync.(*WaitGroup).Wait(0x0?)
	sync/waitgroup.go:118 +0x48 fp=0xc000123ee8 sp=0xc000123ec0 pc=0x7ff6045ab7a8
github.com/ollama/ollama/runner/llamarunner.(*Server).run(0xc00032e320, {0x7ff605a83b80, 0xc0003a60f0})
	github.com/ollama/ollama/runner/llamarunner/runner.go:359 +0x4b fp=0xc000123fb8 sp=0xc000123ee8 pc=0x7ff604a1e08b
github.com/ollama/ollama/runner/llamarunner.Execute.gowrap1()
	github.com/ollama/ollama/runner/llamarunner/runner.go:979 +0x28 fp=0xc000123fe0 sp=0xc000123fb8 pc=0x7ff604a23528
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc000123fe8 sp=0xc000123fe0 pc=0x7ff60459d8e1
created by github.com/ollama/ollama/runner/llamarunner.Execute in goroutine 1
	github.com/ollama/ollama/runner/llamarunner/runner.go:979 +0x4c5

goroutine 55 gp=0xc000506540 m=nil [IO wait]:
runtime.gopark(0x0?, 0xc0003ac2a0?, 0x48?, 0xc3?, 0xc0003ac34c?)
	runtime/proc.go:435 +0xce fp=0xc00058b8c8 sp=0xc00058b8a8 pc=0x7ff60459598e
runtime.netpollblock(0x3dc?, 0x4530406?, 0xf6?)
	runtime/netpoll.go:575 +0xf7 fp=0xc00058b900 sp=0xc00058b8c8 pc=0x7ff60455bdf7
internal/poll.runtime_pollWait(0x2a6fb7ecff8, 0x72)
	runtime/netpoll.go:351 +0x85 fp=0xc00058b920 sp=0xc00058b900 pc=0x7ff604594b25
internal/poll.(*pollDesc).wait(0x7ff60475c9b7?, 0xc00058b970?, 0x0)
	internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc00058b948 sp=0xc00058b920 pc=0x7ff60462bc87
internal/poll.execIO(0xc0003ac2a0, 0x7ff605912278)
	internal/poll/fd_windows.go:177 +0x105 fp=0xc00058b9c0 sp=0xc00058b948 pc=0x7ff60462d0e5
internal/poll.(*FD).Read(0xc0003ac288, {0xc0003ce000, 0x1000, 0x1000})
	internal/poll/fd_windows.go:438 +0x29b fp=0xc00058ba60 sp=0xc00058b9c0 pc=0x7ff60462ddbb
net.(*netFD).Read(0xc0003ac288, {0xc0003ce000?, 0xc00058bad0?, 0x7ff60462c145?})
	net/fd_posix.go:55 +0x25 fp=0xc00058baa8 sp=0xc00058ba60 pc=0x7ff6046a1025
net.(*conn).Read(0xc000688058, {0xc0003ce000?, 0x0?, 0x0?})
	net/net.go:194 +0x45 fp=0xc00058baf0 sp=0xc00058baa8 pc=0x7ff6046b0505
net/http.(*connReader).Read(0xc00032c630, {0xc0003ce000, 0x1000, 0x1000})
	net/http/server.go:798 +0x159 fp=0xc00058bb40 sp=0xc00058baf0 pc=0x7ff60489d8f9
bufio.(*Reader).fill(0xc0000c24e0)
	bufio/bufio.go:113 +0x103 fp=0xc00058bb78 sp=0xc00058bb40 pc=0x7ff6046c6d43
bufio.(*Reader).Peek(0xc0000c24e0, 0x4)
	bufio/bufio.go:152 +0x53 fp=0xc00058bb98 sp=0xc00058bb78 pc=0x7ff6046c6e73
net/http.(*conn).serve(0xc0003301b0, {0x7ff605a83b48, 0xc00032c540})
	net/http/server.go:2137 +0x785 fp=0xc00058bfb8 sp=0xc00058bb98 pc=0x7ff6048a36e5
net/http.(*Server).Serve.gowrap3()
	net/http/server.go:3454 +0x28 fp=0xc00058bfe0 sp=0xc00058bfb8 pc=0x7ff6048a8e48
runtime.goexit({})
	runtime/asm_amd64.s:1700 +0x1 fp=0xc00058bfe8 sp=0xc00058bfe0 pc=0x7ff60459d8e1
created by net/http.(*Server).Serve in goroutine 1
	net/http/server.go:3454 +0x485
rax     0x0
rbx     0x2a6a1ff2e48
rcx     0x2a6a1ff2e48
rdx     0x2a6a1ff2e48
rdi     0x2a6a1ff2e18
rsi     0x0
rbp     0x2a6a1ff2e48
rsp     0xaf50d2e400
r8      0xfffffffd00000000
r9      0x2a6a1ff2df8
r10     0x5f
r11     0xc000012d
r12     0x0
r13     0x0
r14     0x2a6a1ff2e18
r15     0xaf50d2e5f0
rip     0x7ffb26419176
rflags  0x10246
cs      0x33
fs      0x53
gs      0x2b
time=2025-11-20T14:01:57.543+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server not responding"
time=2025-11-20T14:01:57.794+08:00 level=INFO source=server.go:1328 msg="waiting for server to become available" status="llm server error"
time=2025-11-20T14:02:00.557+08:00 level=INFO source=sched.go:470 msg="Load failed" model=D:\LLM\Models_ollama\blobs\sha256-6150cb382311b69f09cc0f9a1b69fc029cbd742b66bb8ec531aa5ecf5c613e93 error="llama runner process has terminated: exit status 2"
[GIN] 2025/11/20 - 14:02:00 | 500 |    49.853006s |       127.0.0.1 | POST     "/api/generate"
```

### OS

Windows

### GPU

AMD

### CPU

AMD

### Ollama version

0.13.0
GiteaMirror added the bug label 2026-05-04 22:55:50 -05:00
Reference: github-starred/ollama#70768